Today I want to begin a five-week series about a project I’m working on this fall in Indiana University’s Data Science Master’s degree program. The project topic came out of brainstorming I did at the end of August. I knew I’d more than likely be doing at least one project in the fall, so it seemed logical to pick something I was interested in. I love teaching myself new topics and thrive on projects outside my domain expertise, so exploring the world of coffee seemed like a good start.
The research process began with finding an idea that would have a business impact and solve a real need. I started thinking about problems in coffee production, since the livelihood of 120 million people in developing countries depends on the coffee supply chain. Coffee has been one of the keys to my survival in graduate school, and apparently more than 2 billion cups of coffee are drunk each day. I originally wanted to look at relationships between the environmental variables that affect coffee production and the amount of rust. Unfortunately, there was a surprising lack of open data that used consistent measurements across the various coffee-growing regions around the world. So I tweaked my project scope to look at coffee futures prices, since coffee is often cited as one of the most heavily traded commodities.
Coffee rust leads to considerable losses worldwide for millions of farmers and affects futures prices. Coffee rust is caused by the fungus Hemileia vastatrix and is one of the main diseases of Coffea arabica worldwide. Understanding the relationship between coffee rust, production quantities and futures prices is important to anyone affected by the coffee supply chain. This research offers a quantitative framework for describing and visualizing the relationship between coffee rust, the amount of coffee produced and futures prices.
We have several research questions that we’re hoping the project will answer. Is there a link between the amount of rust on coffee plants and the amount of coffee produced and/or coffee futures prices? Can past data on rust-infected coffee plants and prices worldwide be used to answer this question?
We know of no research to date that uses past coffee futures prices and coffee rust variables with machine learning techniques to predict coffee futures prices. The project authors have no preconceived notions about the relationship between coffee rust and coffee futures prices. Our final project will tell the story of how coffee rust affects coffee futures prices.
Next week, I’ll talk about the data acquisition process of the project. Spoiler alert: Here’s a video of the mid-term presentation.