Unsupervised Machine Learning for Beginners, Part 4: Apriori


This is part 4 of a five-part series describing types of unsupervised machine learning. Today we’ll be talking about the Apriori algorithm. According to Wikipedia, Apriori is an algorithm for frequent item set mining and association rule learning over transactional databases. It identifies the items that frequently occur together in a set of transactions and uses them to derive rules relating the presence of one item to the presence of other items in the same transaction.

For example, say we have the items purchased by five customers. Customer 1 purchased bread and milk. Customer 2 purchased bread, diaper, beer and eggs. Customer 3 bought milk, diaper, beer and cola. Customer 4 bought bread, milk, diaper and beer. Customer 5 bought bread, milk, diaper and cola.

Customer   Items purchased
1          Bread, milk
2          Bread, diaper, beer, eggs
3          Milk, diaper, beer, cola
4          Bread, milk, diaper, beer
5          Bread, milk, diaper, cola
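
To make the "frequent item set" idea concrete, here is a minimal sketch of the level-wise search that Apriori performs, run on the five transactions above. The function names (`support_count`, `apriori`) and the threshold of 3 transactions are my own choices for illustration, not taken from any particular library.

```python
from itertools import combinations

# The five customer baskets from the table above.
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def apriori(transactions, min_count):
    """Return {itemset: count} for all itemsets found in at least min_count transactions."""
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        # Count each candidate and keep the ones that meet the threshold.
        counts = {c: support_count(c, transactions) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: a candidate survives only if all its (k-1)-subsets are frequent.
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


# min_count=3 is an assumed threshold (3 of 5 transactions).
frequent = apriori(transactions, min_count=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With this threshold, eggs and cola drop out in the first pass, so no larger candidate containing them is ever counted. That pruning is the point of the algorithm: an item set can only be frequent if all of its subsets are frequent.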

From this data, association rules are formed. When someone buys diapers, it is associated with buying beer. When someone buys milk and bread, it is associated with buying eggs and cola. When someone buys beer and bread, it is associated with buying milk. An association does not mean that a customer will buy beer because they bought diapers; the items simply occur together in the same customer's purchases. If we set a minimum threshold (typically on support, the fraction of transactions that contain the items, and on confidence, how often the rule holds when its left-hand side is present), we keep only the association rules at or above that value.

{Diaper} → {Beer}

{Milk, Bread} → {Eggs, Cola}

{Beer, Bread} → {Milk}
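
As a rough check on the three rules above, the sketch below computes the two usual measures for each one against the five transactions: support (the fraction of transactions containing every item in the rule) and confidence (the fraction of transactions containing the left-hand side that also contain the right-hand side). The helper names `support` and `confidence` are hypothetical, and the soft-drink item is written as cola, the name used in the table.

```python
# The five customer baskets from the table earlier in the post.
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]


def count(itemset):
    """Number of transactions containing every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def support(itemset):
    """Fraction of all transactions that contain itemset."""
    return count(itemset) / len(transactions)


def confidence(lhs, rhs):
    """Of the transactions containing lhs, the fraction that also contain rhs."""
    lhs_count = count(lhs)
    if lhs_count == 0:
        return 0.0  # rule never applies; report 0 rather than dividing by zero
    return count(lhs | rhs) / lhs_count


rules = [
    ({"diaper"}, {"beer"}),
    ({"milk", "bread"}, {"eggs", "cola"}),
    ({"beer", "bread"}, {"milk"}),
]

for lhs, rhs in rules:
    print(sorted(lhs), "->", sorted(rhs),
          "support:", support(lhs | rhs),
          "confidence:", confidence(lhs, rhs))
```

On these five baskets, {Diaper} → {Beer} has support 0.6 and confidence 0.75, and {Beer, Bread} → {Milk} has support 0.2 and confidence 0.5. The milk-and-bread rule has support 0 here, because no single customer bought milk, bread, eggs and cola together, so it would need a larger data set to clear any minimum threshold.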

Other worked examples can be found on Code Review (Python) and the CRAN R Project. One of the best articles is by Inokuchi. Laurel Powell and Stanford University have some of the best YouTube video explanations, and some great GitHub source code is from Asaini (Python) and Krunal3103 (R). Applications include marketing, the internet of things, gene mapping, dairy herd management, weather forecasting and finding similar features in poisonous mushrooms.

Next week I’ll conclude the five-part unsupervised machine learning series with frequent pattern growth (FP-growth).
