It’s that time of year again and somehow I’ve gotten caught up in the irony of waiting three weeks for the perfect “Fa La La Llama” red knit sweater decorated with multicolored sequins, tulle and all the shimmer I could get away with. As someone who until now has not owned any holiday clothing or accessories, I’m a little surprised by myself. Maybe now that I’m 90% of the way toward finishing my data science master’s degree, I’m more comfortable fully embracing the inner geek-fashion self. Something to ponder as I sip hot cocoa with nutmeg – but I digress. Feature engineering is a process ‘of transforming raw data into features that better represent the underlying problem to the predictive models, resulting in improved accuracy.’ In data science feature engineering helps meet the goal of getting the most out of your data to answer questions and predict outcomes. Today I want to talk about how feature engineering is like the ugly Christmas sweater.
Choosing the best features in data science is essential to success – as is in my opinion choosing the ugly holiday sweater with the best characteristics. In data science, choosing better features is a process that gives better results with the data. Data inputs or features have to be transformed into something the computer can understand to make the prediction. You focus on key inputs to solving the problem rather than all the features.
In the last few years, owning an ugly sweater has become more than a popular trend. Its importance in holiday traditions lay dormant for many years but no longer. Choosing a sweater that reflects your personality is a great way to break the ice at an awkward office party – a successful result in my book. I uncharacteristically waited weeks for the llama sweater because I thought it was the best of all the sweaters. The ‘features’ of a sequin birthday cake on top of the llama was irresistible. Wearing the sweater with sequins and tulle has made attending some less than enjoyable events much more enjoyable and surely I can’t be alone in this?
Much like the recently new trend of data science, the ugly sweater continues to have a larger impact than most probably first imagined. It’s now considered a party foul to wear a chic J Crew (or other name brand) ‘regular’ sweater to that family shindig or tree lighting. A fashion statement that was sorely underestimated much like the importance of feature engineering in data science. Now feature engineering and ugly sweater take center stage for hopefully years to come.
I can’t wait to see what you think of my comparison of feature engineering to the ugly holiday sweater. And please add the ‘features’ that made you chose the ugly sweater you did.