Machine Learning for Beginners, Part 3: Principal Component Analysis


Last week I looked at the Singular Value Decomposition (SVD) unsupervised machine learning technique as part of a four-part series on data science concepts for beginners. Remember that unsupervised machine learning is data driven, whereas supervised machine learning is task driven. Today we’ll stay in the dimension reduction part of unsupervised machine learning, as shown in the cheat sheet below, and talk about principal component analysis, or PCA.

In a similar manner to SVD, PCA reduces the number of dimensions for data exploration. PCA finds the directions along which the data varies the most and converts a set of possibly correlated variables into a smaller set of linearly uncorrelated variables, called principal components. These can then be used for exploration or as inputs to a predictive model.
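To make the idea of "maximize variance, then decorrelate" a bit more concrete, here is a minimal sketch of PCA using NumPy. It is one common way to compute PCA (eigendecomposition of the covariance matrix), not necessarily what any particular library does internally. The function name `pca` and the synthetic data are my own assumptions for illustration.

```python
import numpy as np

def pca(X, n_components):
    """Sketch of PCA: rows of X are samples, columns are variables."""
    # Center each variable so the covariance is measured around the mean.
    X_centered = X - X.mean(axis=0)
    # Covariance matrix between the variables (columns).
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder so the direction with the largest variance comes first.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Keep the top directions and project the centered data onto them.
    components = eigvecs[:, :n_components]
    scores = X_centered @ components
    return scores, components, eigvals

# Synthetic data (an assumption for illustration): the second column is
# strongly correlated with the first, the third is independent noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([x, 2 * x + rng.normal(scale=0.5, size=200), rng.normal(size=200)])

scores, components, eigvals = pca(X, n_components=2)
print("variance along each direction:", eigvals)
print("covariance of the new variables:\n", np.cov(scores, rowvar=False))
```

The covariance of the new variables should be (numerically) diagonal, which is what "linearly uncorrelated" means in practice.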

A great example from Francois Labelle is when we have three variables (population, average income and area) from 20 countries (Australia, Canada, US, Russia, Brazil, Mexico, Spain, France, Italy, Germany, UK, South Korea, Japan, Iran, Turkey, Thailand, Indonesia, Pakistan, India and China). We want to reduce the dimensionality to show the features in a two-dimensional rather than three-dimensional space, as shown below. In other words, we reduce the number of variables to remove redundancy.
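A rough sketch of that three-variables-to-two reduction might look like the following. The numbers are randomly generated placeholders, not real country statistics, and the variable names are assumptions for illustration. Because population, income and area are measured in very different units, this sketch standardizes each column first, then uses the SVD from last week to get the principal directions.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20  # number of rows (placeholder "countries")

# Synthetic placeholder values, NOT real data.
population = rng.lognormal(mean=4.0, sigma=1.0, size=n)
# Area is made to correlate with population on purpose, to create redundancy.
area = population * rng.lognormal(mean=0.0, sigma=0.3, size=n)
income = rng.lognormal(mean=3.0, sigma=0.5, size=n)
X = np.column_stack([population, income, area])

# Standardize: zero mean and unit variance per column, so no variable
# dominates just because of its units.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# SVD of the standardized data; rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(Z, full_matrices=False)

# Share of the total variance captured by each direction.
explained_ratio = S**2 / np.sum(S**2)

# Project the three variables down to two dimensions.
Z_2d = Z @ Vt[:2].T

print("explained variance ratio:", explained_ratio)
print("reduced shape:", Z_2d.shape)
```

If the first two ratios add up to most of the total, plotting `Z_2d` loses little information compared with the original three-dimensional view.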

Some other examples are from the scikit-learn (Python) and Michael Barton’s (R) blogs. Some of the best articles are from the University of Illinois (ppt) and PCA by Abdi. Georgia Tech Udacity has one of the best YouTube video explanations, and some great GitHub source code is from scikit-learn (Python) and Michael Barton (R). Other technology applications include image processing, pattern recognition, time series prediction, neuroscience and the Internet of Things.

Next week we’ll take a look at the Apriori unsupervised machine learning algorithm.


