Thursday, June 4, 2015

"How a group of researchers tried to use social media data and algorithms to find breaking news"

From the Nieman Journalism Lab:
Using geotagged Instagram data, CityBeat tries — often unsuccessfully or belatedly — to find breaking news.

Shortly after 9:30 a.m. on March 12, 2014, two apartment buildings in East Harlem exploded when a water main collapsed into a gas line. Eight people were killed and dozens more were injured.

Journalists rushed to the scene to cover the tragedy, but four newsrooms — The New York Times, BuzzFeed, Gothamist, The New York World — had another tool to help them cover the explosions: CityBeat, a program designed to algorithmically search geotagged social media posts to find news stories in New York City. CityBeat was built by researchers at Cornell Tech, Cornell’s applied sciences outpost in New York City, and Rutgers and was being tested by the four outlets at the time.

Social media posts about the building collapses appeared on CityBeat, but by the time there were enough posts to register in its algorithm, the news organizations themselves already knew about the explosion and had reporters and photographers on the scene.

“[The Harlem Fire] did show up, but it was half an hour later…at that point we’re not using Instagram,” one of the journalists interviewed by the researchers said in their paper on the project.

CityBeat, the participants said, was most useful in covering planned events — conferences, concerts, events, or even PR stunts, such as when a man in a bear suit was spotted walking around Manhattan. The tool was less effective for covering realtime breaking news stories.

“That’s one of the things that we talked about in the limitations and understanding the biases of the information,” Raz Schwartz, one of the study’s authors, told me. “Social media data might not be the best way to find these breaking events.”

Schwartz now works on the user experience research team at Facebook, but conducted the study as part of his postdoctoral research at Cornell along with Cornell professor Mor Naaman and Rannie Teodoro from Rutgers. The research was funded by the Brown Institute for Media Innovation at Columbia, and Schwartz presented the paper last week at a conference in Oxford, England.

Though the researchers have moved on to other topics, CityBeat is still live. The site was designed to be shown on big screens in newsrooms and has three main components. There’s the Detected Events List, a compilation of events the algorithm has discovered in the past 24 hours using Instagram data. There’s also the Event Window, which shows specific events and their location within New York. The third element is a sidebar showing statistics on the rate of tweets, popular hashtags, and more.