“Compiling a perfect data set is like trying to catch a stream of water in your hands. Some data is inherently inaccurate, some data is not available for logistical or legal reasons, and some data simply isn’t included because it didn’t seem necessary at the point of design (Monroe, 47).”
That quote is very relevant to the project work I’ve been doing in my Social Media Mining class at Indiana. For those of you unfamiliar with the term, “Social Media Mining” is a way to take data that’s generated by consumers, process it, and turn the insights and knowledge gained from it into action. We’re taking lots of data generated by Twitter, Facebook and other forms of social media and using machine learning technologies to extract meaning. Everything I’m learning culminates in a final project.
Today I want to talk a little about what I’m doing for my final project. I wanted to do research related to a social problem. Food shortages in Venezuela have been prevalent since 2013 and have reached crisis level nationwide. Social media data plays a critical role in Venezuela, where there is government censorship. The social media mining (SMM) in this project can help give insight into how citizens express themselves across space and time.
The planned project goal is to examine reactions to food shortage events and detect spatiotemporal tweeting patterns. I will use Twitter hashtags and other relevant Spanish terms from tweets posted in Caracas, Venezuela, from December 2014 to October 2016. The expected output is a heat map of food shortage reactions within each of Caracas’ five municipalities. I chose Twitter because, according to an EModeration article, Venezuela led Latin America in the share of internet users who are active on Twitter, at 14%.
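To make the counting step behind such a heat map concrete, here is a minimal sketch in Python. It assumes each tweet has already been reduced to a record with `text`, `date` and `municipality` fields. The keyword list and the sample records are illustrative placeholders, not the project's real search terms or data.

```python
from collections import Counter
from datetime import date

# Illustrative Spanish terms only; the project's actual hashtag list is not shown here.
KEYWORDS = {"escasez", "#escasez", "comida", "harina"}

# Study window taken from the project description (end of October assumed inclusive).
STUDY_START = date(2014, 12, 1)
STUDY_END = date(2016, 10, 31)


def mentions_shortage(text):
    """Return True if any word in the tweet matches a keyword."""
    words = text.lower().split()
    return any(word.strip(".,!?¡¿") in KEYWORDS for word in words)


def count_by_municipality(tweets):
    """Count keyword-matching tweets inside the study window, per municipality."""
    counts = Counter()
    for tweet in tweets:
        in_window = STUDY_START <= tweet["date"] <= STUDY_END
        if in_window and mentions_shortage(tweet["text"]):
            counts[tweet["municipality"]] += 1
    return counts


# Made-up sample records, for demonstration only.
sample = [
    {"text": "No hay harina, pura escasez!", "date": date(2015, 3, 2), "municipality": "Libertador"},
    {"text": "Escasez de comida otra vez", "date": date(2016, 1, 15), "municipality": "Libertador"},
    {"text": "Bonito dia en el parque", "date": date(2015, 6, 1), "municipality": "Chacao"},
    {"text": "#escasez en el mercado", "date": date(2013, 5, 1), "municipality": "Sucre"},
    {"text": "Colas por comida", "date": date(2016, 10, 31), "municipality": "Baruta"},
]

counts = count_by_municipality(sample)
```

The per-municipality counts are the numbers a heat map would color by. Splitting them further by month or week would add the time dimension.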
Data will be obtained using the publicly available Twitter API, and precautions will be taken not to violate any privacy restrictions. The Python programming language will be used to get and process the data. Caracas is the planned focus of the project since it is the capital of Venezuela and has the country’s largest population at 5.5 million. According to Tobler’s first law of geography, everything is related to everything else, but near things are more related than distant things. Therefore, the project will look at comments from within the location where the food shortage event is happening.
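The idea of keeping only nearby comments can be sketched as a simple distance filter. The example below uses the standard haversine formula to measure how far a geotagged tweet is from the center of Caracas. The center coordinates are approximate and the 20 km radius is an arbitrary assumption for illustration, not a value from the project.

```python
import math

# Approximate latitude/longitude of central Caracas (assumption for illustration).
CARACAS_CENTER = (10.4806, -66.9036)
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def near_caracas(lat, lon, radius_km=20.0):
    """Return True if the point lies within radius_km of the assumed city center."""
    return haversine_km(lat, lon, *CARACAS_CENTER) <= radius_km


# A point a few kilometers east of the center passes; one over 100 km west does not.
inside = near_caracas(10.49, -66.88)
outside = near_caracas(10.16, -67.99)
```

A fixed radius is a crude stand-in for real municipal boundaries. Assigning tweets to each of the five municipalities would need boundary polygons rather than a single circle.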
After I complete the project, look for a blog post around mid-December discussing the results.