In my April 4 blog post, I talked about my first graduate school final course project, for Information Visualization at Indiana University. Although I don’t have my final grade for the project yet, I wanted to spend this week talking a little about the process behind our final visualizations.
Our group of five began our month-long work by following the principles we learned throughout the class. Like good data scientists, our goal was to add value to the data and tell a story about it. We were given YouTube comments on the film Ambition, produced by the European Space Agency. Only one of our team members was an in-class student, and we had never worked together before, so we faced the typical challenges of defining our project goals, assigning team roles and sorting through different communication styles. Two of us were also working full-time, which brought different perspectives to the team.
Unlike some of the other client projects in our class, our data set was relatively small: on the order of 1,600 YouTube comments posted between the film’s release in October 2014 and April 2016. This was also our first experience analyzing opinion data. In our Information Visualization class, we mainly focused on learning the Sci2 tool developed at Indiana for processing and visually encoding data. We performed our first analysis of the data set using this tool but found it inadequate for the type of sentiment data we had to process. Next, we scoured machine learning tools and other natural language processing methods.
As in a real-world project, there were several milestone dates along the way. We finally settled on Tableau as our main visualization tool and selected an Area Map and an Intensity Tree Map for our final visualizations. We focused on delivering a few quality visualizations rather than a large quantity of them.
We’re excited that our Intensity Tree Map (ITM) is unlike many other tree maps in that it adds a layer of visual encoding we have not seen in other research. The size of each box shows the importance/frequency of a word, and the color saturation of the box measures how “intense” the word is, meaning how much emotion it carries. A darker, more saturated color means the word is more intense. For example, the word “Magic” suggests that the commenter has stronger, more emotionally intense feelings than the word “Beautiful” does, so “Magic” has a darker, more saturated orange on the ITM than “Beautiful”. Each color saturation corresponds to a hexadecimal value.
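To make the idea of tying saturation to a hexadecimal value concrete, here is a minimal Python sketch of one way such a mapping could work. It is not how we built the ITM in Tableau; the function name, the orange hue, the saturation range and the intensity scores for “Magic” and “Beautiful” are all assumptions made up for illustration.

```python
import colorsys

# Assumed hue for "orange" (30 degrees on the color wheel); not taken from the project.
ORANGE_HUE = 30 / 360


def intensity_to_hex(intensity, min_sat=0.2, max_sat=1.0):
    """Map an intensity score in [0, 1] to an orange hex color.

    A higher intensity gives a more saturated orange. The saturation
    range (min_sat to max_sat) is an illustrative assumption.
    """
    # Clamp the score so out-of-range inputs still produce a valid color.
    clamped = max(0.0, min(1.0, intensity))
    saturation = min_sat + (max_sat - min_sat) * clamped
    # Keep hue and brightness fixed; only saturation varies with intensity.
    r, g, b = colorsys.hsv_to_rgb(ORANGE_HUE, saturation, 1.0)
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )


# Hypothetical intensity scores, chosen only to mirror the example above.
example_scores = {"Magic": 0.9, "Beautiful": 0.6}

for word, score in example_scores.items():
    print(word, intensity_to_hex(score))
```

With scores like these, “Magic” comes out as a more saturated orange than “Beautiful”, which is the relationship the ITM encodes.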
Our final work can be found on this GitHub page and the interactive visualizations can be found on this Tableau page.
I’m taking a much-needed vacation and will not be posting for the next two weeks. See you again the week of Memorial Day, and let me know what you think about the project.