Social media sites like Twitter provide readily accessible sources of large-volume, high-velocity data streams, now referred to as “Big Data.” While private companies have already made great strides in leveraging these social media sources, many public organizations and government agencies could also reap significant benefits from these resources. Care must be exercised in this integration, however, as huge data sets come with their own intrinsic issues. This paper explores these advantages and hazards through a selection of experiments that demonstrate the ability of social media data to support government organizations and supplement existing programs. Our first experiment shows consistency between the geographic distribution of Twitter users and population estimates from the US Census Bureau. We then compare references to drug use on Twitter with the incidence of drug use estimated from a national survey; Twitter yields similar estimates for marijuana use but shows little correlation with cocaine use. Our final experiment illustrates how a social media community responds to law enforcement agencies during times of crisis and social unrest by examining sentiment on Twitter during the Boston Marathon Bombing and the protests in Ferguson, MO. The paper concludes with a discussion of open problems in leveraging social media for public programs, with a focus on acquiring high-quality geolocation information from these sources.
Buntain, Cody, Jennifer Golbeck, and Gary LaFree. 2015. "Powers and Problems of Integrating Social Media Data with Public Health and Safety." Presented at the Bloomberg Data for Good Exchange Conference, New York City. https://cdn1.topi.com/uploads/public_events/1250/files/1d4936adc9578ca24cc2e050202c86f5/PH_Buntain_88.pdf