Minimum support and minimum confidence in Data Mining

0 votes
I would like to know whether minimum support and minimum confidence can be determined automatically when mining association rules. If so, any hint or pointer to a resource would be great.
Mar 26 in Machine Learning by Nandini
• 5,480 points
20 views

1 answer to this question.

0 votes
Yes, there are ways to set the minsup and minconf thresholds automatically.
But first, let me explain how to choose minsup and minconf by hand. There is no universal value; you have to decide based on your data.

On some data, I use 80 percent for minimum support. On other data, I use 0.05 percent. It depends entirely on the dataset. Typically, I begin with a large value and gradually lower it until I reach a value that generates enough patterns.
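The "start high, then lower" procedure can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `frequent_itemsets` is a brute-force counter meant for toy data (not Apriori or FP-Growth), and `tune_minsup`, `target_count`, `factor` and `floor` are hypothetical names I made up for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, minsup):
    """Brute-force search: return {itemset: support} for support >= minsup."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(1 for t in transactions if cset <= t) / n
            if support >= minsup:
                result[cset] = support
                found_any = True
        if not found_any:
            # No frequent itemset of this size, so no larger one can be frequent.
            break
    return result


def tune_minsup(transactions, target_count, start=0.8, factor=0.5, floor=0.0005):
    """Lower minsup step by step until at least target_count patterns are found.

    start, factor and floor are illustrative defaults, not recommended values.
    """
    minsup = start
    while True:
        patterns = frequent_itemsets(transactions, minsup)
        if len(patterns) >= target_count or minsup <= floor:
            return minsup, patterns
        minsup = max(minsup * factor, floor)


# Toy transactions (hypothetical data), each one a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

minsup, patterns = tune_minsup(transactions, target_count=5)
print(minsup, len(patterns))
```

On real data you would replace the brute-force counter with a proper mining algorithm and pick `target_count` according to how many patterns you can realistically inspect.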

It's a little easier with the minimum confidence because it represents how much you want to be able to trust the rules. I normally use a value like 60% because I don't want to use a rule that holds true less than 60% of the time. However, this also depends on the data.
When minsup is set to a higher value, the algorithm finds fewer patterns and runs faster. When minconf is set higher, there will be fewer rules, but many algorithms do not use minconf to prune the search space, so it may not run any faster. Naturally, the number of rules you want to obtain also influences how you set these parameters.
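To make the two thresholds concrete, here is a minimal sketch of how support and confidence are usually computed for a rule X -> Y: support is the fraction of transactions containing both X and Y, and confidence is that count divided by the number of transactions containing X. The helper names (`count`, `support`, `confidence`, `passes`) and the toy data are my own, chosen for illustration only.

```python
def count(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    s = frozenset(itemset)
    return sum(1 for t in transactions if s <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """How often the consequent appears when the antecedent does."""
    a = frozenset(antecedent)
    a_count = count(a, transactions)
    if a_count == 0:
        return 0.0
    return count(a | frozenset(consequent), transactions) / a_count


def passes(antecedent, consequent, transactions, minsup, minconf):
    """A rule is kept only if it meets both thresholds."""
    both = frozenset(antecedent) | frozenset(consequent)
    return (support(both, transactions) >= minsup
            and confidence(antecedent, consequent, transactions) >= minconf)


# Toy transactions (hypothetical data).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

# Rule {bread} -> {milk}: 3 of 5 transactions contain both,
# and 3 of the 4 transactions with bread also contain milk.
print(support({"bread", "milk"}, transactions))
print(confidence({"bread"}, {"milk"}, transactions))
```

Note that the confidence check only needs counts that were already collected, which is one way to see why raising minconf filters the output without necessarily shrinking the search.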

If you don't want to set minsup at all, you can use a top-k association rule mining algorithm instead. In that case, you would set, for example, k=1000, and the algorithm would find the 1000 most frequent rules that meet a given minimum confidence. I created an algorithm of this kind for association rule mining called TopKRules. The source code is available in the SPMF open-source data mining library, which includes numerous implementations of association rule and pattern mining algorithms.
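To show what the top-k formulation asks for, here is a naive Python sketch. It is not the TopKRules algorithm and does not use SPMF; it simply enumerates every rule on a tiny dataset, keeps those meeting `minconf`, and returns the `k` with the highest support. The function name `top_k_rules` and the toy data are assumptions of mine, and the exhaustive enumeration would not scale to real data.

```python
from itertools import combinations


def top_k_rules(transactions, k, minconf):
    """Return the k highest-support rules with confidence >= minconf.

    Each rule is a tuple (support, confidence, antecedent, consequent).
    Exhaustive enumeration, for illustration on small data only.
    """
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})

    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            whole = frozenset(combo)
            whole_count = count(whole)
            if whole_count == 0:
                continue  # itemset never occurs, so it yields no rules
            # Try every non-empty proper subset as the antecedent.
            for a_size in range(1, size):
                for ante in combinations(combo, a_size):
                    a = frozenset(ante)
                    conf = whole_count / count(a)
                    if conf >= minconf:
                        rules.append((whole_count / n, conf,
                                      sorted(a), sorted(whole - a)))
    # Highest support first; ties broken by confidence, then alphabetically.
    rules.sort(key=lambda r: (-r[0], -r[1], r[2], r[3]))
    return rules[:k]


# Toy transactions (hypothetical data).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

for sup, conf, ante, cons in top_k_rules(transactions, k=3, minconf=0.6):
    print(ante, "->", cons, "support", sup, "confidence", conf)
```

The point of the sketch is the interface: you choose k and minconf, and the support cut-off falls out of the data instead of being guessed in advance.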

Another way to set the minsup threshold automatically is to use a mathematical formula that computes it from the size of your data.
Other work has also tried to address the problem of setting minsup and minconf. You can find it on Google Scholar.
answered Apr 4 by Dev
• 6,000 points
