Market Basket Analysis: Implementation using R. Submitted by: Yogesh Khandelwal. Guided by: Prof. Avinash Navlani
Page 1: Market basketanalysis using r

Market Basket Analysis: Implementation using R

Submitted by: Yogesh Khandelwal
Guided by: Prof. Avinash Navlani

Page 2: Market basketanalysis using r

What is Association Rule Mining?

• Affinity analysis and association rule learning encompass a broad set of analytics techniques aimed at uncovering the associations and connections between specific objects. These might be:

• visitors to your website (customers or audience),

• products in your store,

• content items on your media site

• products recommended on an e-commerce website.

Page 3: Market basketanalysis using r

What is Market Basket Analysis?

• Market basket analysis is one of the best-known applications of association rule mining.

• In a market basket analysis, you look to see if there are combinations of products that frequently co-occur in transactions.

– For example, people who buy flour and caster sugar may also tend to buy eggs (because a high proportion of them are planning to bake a cake).

Page 4: Market basketanalysis using r

Importance of performing MBA

• A retailer can use this information for:

– Store layout (put products that co-occur close to one another, to improve the customer shopping experience)

– Marketing (e.g. target customers who buy flour with offers on eggs, to encourage them to spend more on their shopping basket)

• Online retailers and publishers can use this type of analysis to:

– Inform the placement of content items on their media sites, or products in their catalogue

– Drive recommendation engines (like Amazon's "customers who bought this product also bought these products…")

– Deliver targeted marketing (e.g. emailing customers who bought specific products with offers on other products that are likely to interest them)

Page 5: Market basketanalysis using r

Some Mathematical terminology

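• Support: the fraction of transactions in the dataset that contain the itemset. For a rule X -> Y:

support(X -> Y) = P(X ∪ Y)

where P(X ∪ Y) is the probability that a transaction contains every item in both X and Y.

• Confidence: the probability that the rule holds for a new transaction that contains the items on the left-hand side:

confidence(X -> Y) = P(Y | X) = support(X -> Y) / support(X)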
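To make the two definitions concrete, here is a small worked sketch in base R. The five transactions and the helper `contains_all` are hypothetical and made up for illustration; they are not taken from the Groceries data.

```r
# Toy transactions (hypothetical data, for illustration only)
transactions <- list(
  c("flour", "sugar", "eggs"),
  c("flour", "eggs"),
  c("flour", "sugar"),
  c("milk", "eggs"),
  c("flour", "sugar", "eggs", "milk")
)

# TRUE for each transaction that contains every item in `items`
contains_all <- function(items) {
  vapply(transactions, function(basket) all(items %in% basket), logical(1))
}

x <- c("flour", "sugar")  # left-hand side X
y <- "eggs"               # right-hand side Y

# Support of the rule: share of transactions containing all items of X and Y
support_xy <- mean(contains_all(c(x, y)))

# Support of the left-hand side alone
support_x <- mean(contains_all(x))

# Confidence: P(Y | X) = support(X and Y together) / support(X)
confidence <- support_xy / support_x

print(support_xy)  # 2 of 5 transactions -> 0.4
print(confidence)  # 0.4 / 0.6 -> about 0.667
```

Two of the five baskets contain flour, sugar and eggs together, and three contain flour and sugar, so the rule {flour, sugar} -> {eggs} has support 0.4 and confidence 2/3.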

Page 6: Market basketanalysis using r

Implementation using R

• It is assumed that the reader has basic prior knowledge of association rule mining, the Apriori algorithm, etc.

• Packages we will use:

• arules

• arulesViz

Page 7: Market basketanalysis using r

Know your dataset

• Dataset used:

– Groceries (included in the arules package)

– It can also be downloaded from: http://www.salemmarafi.com/wp-content/uploads/2014/03/groceries.csv

Number of transactions in input data: 9835

Number of columns in input data: 32

Number of items in input data: 169

Page 8: Market basketanalysis using r

Let's start by loading the packages and the dataset

Top 25 most frequent products
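The code for this slide did not survive in the transcript. A minimal sketch of what the loading and plotting step might look like, assuming the arules and arulesViz packages are installed, is:

```r
# One-time install, if the packages are not available yet:
# install.packages(c("arules", "arulesViz"))

library(arules)     # transactions class, apriori(), itemFrequencyPlot()
library(arulesViz)  # plotting methods for rules (used in later steps)

# Load the Groceries transactions that ship with arules
data("Groceries")

# Dimensions: number of transactions x number of distinct items
print(dim(Groceries))

# Distribution of basket sizes (items per transaction)
print(summary(size(Groceries)))

# Bar chart of the 25 most frequent items, by absolute count
itemFrequencyPlot(Groceries, topN = 25, type = "absolute")
```

The printed dimensions should correspond to the transaction and item counts listed on the "Know your dataset" slide.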

Page 9: Market basketanalysis using r

Mine some rules!!

• We need to provide a minimum support and a minimum confidence.

Output:

If someone buys yogurt and cereals, they are 81% likely to buy whole milk too.
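The mining call itself is missing from the transcript. The sketch below shows one way to run it with `apriori()` from arules. The thresholds (support 0.001, confidence 0.8) are an assumption borrowed from the salemmarafi.com tutorial listed in the sources; the slide text does not state them.

```r
library(arules)
data("Groceries")

# Mine rules with the Apriori algorithm.
# supp and conf are assumed thresholds, not values given on the slide.
rules <- apriori(
  Groceries,
  parameter = list(supp = 0.001, conf = 0.8),
  control   = list(verbose = FALSE)
)

# Show the first five rules with support, confidence and lift
inspect(rules[1:5])

# Look up rules whose left-hand side contains both yogurt and cereals
# and whose right-hand side is whole milk
milk_rules <- subset(
  rules,
  subset = lhs %ain% c("yogurt", "cereals") & rhs %in% "whole milk"
)
inspect(milk_rules)
```

Lowering the support threshold produces more (and rarer) rules; raising the confidence threshold keeps only the most reliable ones.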

Page 10: Market basketanalysis using r

Mine some rules (cont.)

• We can get a summary report of the mined rules by calling summary(rules).

Page 11: Market basketanalysis using r

Sorting Stuff Out

The first issue we see here is that the rules are not sorted. Often we will want the most relevant rules first. Let's say we want the rules with the highest confidence first. We can easily sort the rules by confidence in descending order.
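The sorting code is not preserved in the transcript. A minimal sketch using the `sort()` method that arules provides for rules, with the same assumed thresholds as in the source tutorial (support 0.001, confidence 0.8), might be:

```r
library(arules)
data("Groceries")

# Assumed thresholds; the slides do not state them
rules <- apriori(
  Groceries,
  parameter = list(supp = 0.001, conf = 0.8),
  control   = list(verbose = FALSE)
)

# Sort rules so the highest-confidence rules come first
rules <- sort(rules, by = "confidence", decreasing = TRUE)

# Inspect the five most confident rules
inspect(head(rules, 5))
```

The `by` argument also accepts other quality measures such as "support" or "lift".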

Before Sorting

After Sorting

Page 12: Market basketanalysis using r

Redundancies!!

• Sometimes rules will repeat. Redundancy indicates that one item might be a given. As an analyst, you can elect to drop that item from the dataset. Alternatively, you can remove the redundant rules after they are generated.

• We can remove the redundant rules with the following code (it assumes the rules are already sorted, so better rules come first):

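# Entry [i, j] is TRUE when rule i is a subset of rule j
# (sparse = FALSE gives a dense logical matrix, so NA can be assigned below)
subset.matrix <- is.subset(rules, rules, sparse = FALSE)

# Blank out the diagonal and lower triangle so each pair is checked once
subset.matrix[lower.tri(subset.matrix, diag = TRUE)] <- NA

# A rule is redundant if some earlier rule is a subset of it
redundant <- colSums(subset.matrix, na.rm = TRUE) >= 1

rules.pruned <- rules[!redundant]

rules <- rules.pruned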
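As an alternative to the subset-matrix approach, arules also provides `is.redundant()`. Note that it uses a different criterion (by default, a rule is flagged when a more general rule has the same or higher confidence), so the pruned set may differ from the one produced by the code above. The thresholds below are assumed, as before.

```r
library(arules)
data("Groceries")

# Assumed thresholds; the slides do not state them
rules <- apriori(
  Groceries,
  parameter = list(supp = 0.001, conf = 0.8),
  control   = list(verbose = FALSE)
)

# Logical vector: TRUE for rules that a more general rule makes redundant
redundant <- is.redundant(rules)

# Keep only the non-redundant rules
rules_pruned <- rules[!redundant]

print(length(rules))         # number of rules before pruning
print(length(rules_pruned))  # number of rules after pruning
```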

Page 13: Market basketanalysis using r

Targeting Items

• There are two types of targets we might be interested in:

• What are customers likely to buy before buying whole milk?

• What are customers likely to buy if they purchase whole milk?
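Both questions can be expressed with the `appearance` argument of `apriori()`, which restricts where an item may appear in a rule. The sketch below is one possible way to do it; the thresholds (confidence 0.08 and 0.15, `minlen = 2`) are assumptions taken from the salemmarafi.com tutorial listed in the sources, not from the slide text.

```r
library(arules)
data("Groceries")

# Question 1: what leads to whole milk?
# Fix whole milk on the right-hand side; any item may appear on the left.
rules_before <- apriori(
  Groceries,
  parameter  = list(supp = 0.001, conf = 0.08),   # assumed thresholds
  appearance = list(default = "lhs", rhs = "whole milk"),
  control    = list(verbose = FALSE)
)
rules_before <- sort(rules_before, by = "confidence", decreasing = TRUE)
inspect(head(rules_before, 5))

# Question 2: what follows from whole milk?
# Fix whole milk on the left-hand side; any item may appear on the right.
# minlen = 2 excludes rules with an empty left-hand side.
rules_after <- apriori(
  Groceries,
  parameter  = list(supp = 0.001, conf = 0.15, minlen = 2),  # assumed thresholds
  appearance = list(default = "rhs", lhs = "whole milk"),
  control    = list(verbose = FALSE)
)
rules_after <- sort(rules_after, by = "confidence", decreasing = TRUE)
inspect(head(rules_after, 5))
```

The confidence thresholds are much lower here than in the general mining step because fixing one side of the rule leaves far fewer candidates.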

Page 14: Market basketanalysis using r

Visualization
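The plots from this slide are not preserved in the transcript. A minimal sketch of how rules can be visualized with arulesViz follows; the choice of plots, the thresholds, and the cut to the top 10 rules by lift are assumptions made for illustration. The graph method may need additional plotting packages, depending on the installed arulesViz version.

```r
library(arules)
library(arulesViz)
data("Groceries")

# Assumed thresholds; the slides do not state them
rules <- apriori(
  Groceries,
  parameter = list(supp = 0.001, conf = 0.8),
  control   = list(verbose = FALSE)
)

# Scatter plot of all rules: support vs. confidence, shaded by lift
plot(rules)

# Graph view of the 10 rules with the highest lift:
# items and rules are drawn as connected nodes
top_rules <- head(sort(rules, by = "lift"), 10)
plot(top_rules, method = "graph")
```

Graph plots become unreadable with many rules, which is why only a small, sorted subset is passed to the second call.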

Page 15: Market basketanalysis using r

What Next??

• Association rule mining is applicable to many areas, such as:

• Twitter Analysis

• Medical Analysis

• Etc.

Page 16: Market basketanalysis using r

• Thank You

• Sources:

– http://www.salemmarafi.com/code/market-basket-analysis-with-r/comment-page-1/

– http://www.analyticsvidhya.com/blog/2014/08/visualizing-market-basket-analysis/

– http://snowplowanalytics.com/analytics/catalog-analytics/market-basket-analysis-identifying-products-that-sell-well-together.html

– Rdatamining.com