In data mining, association rule learning is a popular and well-researched method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using different measures of interestingness.
For example, a rule found in the sales data of a supermarket might indicate that if a customer buys onions and potatoes together, he or she is likely to also buy hamburger meat. Such information can be used as the basis for decisions about marketing activities such as promotional pricing or product placement.
Association rule mining highlights two or more products that are closely related. It is also used to detect fraud, to find which diseases tend to occur together, and in many other settings where two or more variables are involved.
Let us take some sample input data:
Here, each transaction ID identifies one customer. Now let us examine the code:
library(arules)  # provides apriori()
Rules <- apriori(pattern, parameter = list(support = 0.03, confidence = 0.5))
Support: Supp(X) is the proportion of transactions in the data set that contain the itemset of interest, X. It is the probability that a transaction in the data set contains X.
Example: If there are 10,000 transactions in a month and only 1% of them contain X, then Supp(X) = 0.01.
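Confidence: Conf(X => Y) = Supp(X U Y) / Supp(X), where X is the left-hand side (LHS) of the rule and Y is the right-hand side (RHS). It estimates the probability that a transaction containing X also contains Y.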
The support and confidence are probability values which will be between 0 and 1.
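The two definitions above can be checked by hand on a small example. The following is a minimal sketch in base R, assuming a made-up list of five transactions; the helper names supp and conf are hypothetical and are not part of any package.

```r
# Hypothetical toy data: one character vector of items per transaction.
transactions <- list(
  c("onions", "potatoes", "hamburger"),
  c("onions", "potatoes", "hamburger"),
  c("onions", "potatoes"),
  c("milk", "bread"),
  c("potatoes", "milk")
)

# Supp(X): fraction of transactions that contain every item in `items`.
supp <- function(items, tx_list) {
  contains <- vapply(tx_list, function(tx) all(items %in% tx), logical(1))
  mean(contains)
}

# Conf(X => Y) = Supp(X U Y) / Supp(X)
conf <- function(lhs, rhs, tx_list) {
  supp(union(lhs, rhs), tx_list) / supp(lhs, tx_list)
}

lhs <- c("onions", "potatoes")
rhs <- "hamburger"

supp_lhs  <- supp(lhs, transactions)              # 3 of 5 transactions
supp_rule <- supp(union(lhs, rhs), transactions)  # 2 of 5 transactions
conf_rule <- conf(lhs, rhs, transactions)         # (2/5) / (3/5) = 2/3
```

In this toy data, two of the three customers who bought onions and potatoes also bought hamburger meat, so the confidence of the rule is 2/3.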
Constraints are the measures used to select the most useful rules from all the rules that R returns. After analyzing these values for all the rules, the best rules for WB were obtained.
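As a rough illustration of applying such constraints, the sketch below filters a small table of rules by minimum support and confidence. The rules and their metric values are invented for illustration only; the thresholds 0.03 and 0.5 are the ones used in the apriori call earlier.

```r
# Hypothetical rule metrics; in practice these come from the mining step.
rules <- data.frame(
  rule = c("{onions, potatoes} => {hamburger}",
           "{milk} => {beer}",
           "{bread} => {butter}",
           "{potatoes} => {onions}"),
  support    = c(0.05, 0.02, 0.04, 0.06),
  confidence = c(0.70, 0.80, 0.30, 0.55),
  stringsAsFactors = FALSE
)

min_support    <- 0.03
min_confidence <- 0.5

# Keep only rules that satisfy both constraints.
kept <- rules[rules$support >= min_support &
              rules$confidence >= min_confidence, ]

# Order the surviving rules by confidence, strongest first.
kept <- kept[order(-kept$confidence), ]
```

Here the second rule is dropped for low support and the third for low confidence, which leaves two rules to inspect.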
Lift: Lift(X => Y) = Supp(X U Y) / (Supp(X) x Supp(Y))
In other words, it is Conf(X => Y) / Supp(Y). Lift is also the ratio of the observed support to the support that would be expected if X and Y were independent.
If the observed support is more than the expected support, the ratio is greater than 1. When the lift between two items is high, they occur together more often than independence would predict, which may reveal a relationship that is new to the study.
For example: a high association level between milk and beer bottles.
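To make the lift formula concrete, here is a minimal base R sketch on a made-up list of five transactions. The data and the helper names supp and lift are assumptions for illustration, not output from any real data set or package.

```r
# Hypothetical toy data: one character vector of items per transaction.
transactions <- list(
  c("onions", "potatoes", "hamburger"),
  c("onions", "potatoes", "hamburger"),
  c("onions", "potatoes"),
  c("milk", "bread"),
  c("potatoes", "milk")
)

# Supp(X): fraction of transactions that contain every item in `items`.
supp <- function(items, tx_list) {
  mean(vapply(tx_list, function(tx) all(items %in% tx), logical(1)))
}

# Lift(X => Y) = Supp(X U Y) / (Supp(X) * Supp(Y))
lift <- function(lhs, rhs, tx_list) {
  supp(union(lhs, rhs), tx_list) /
    (supp(lhs, tx_list) * supp(rhs, tx_list))
}

# Observed support 0.4 vs expected 0.6 * 0.4 = 0.24, so lift is above 1.
lift_high <- lift(c("onions", "potatoes"), "hamburger", transactions)

# Observed support 0.2 vs expected 0.8 * 0.4 = 0.32, so lift is below 1.
lift_low <- lift("potatoes", "milk", transactions)
```

A lift above 1 means the items appear together more often than expected under independence, while a lift below 1 means they appear together less often.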