Minimum support and minimum confidence in Data Mining

Question

I would like to know whether minimum support and minimum confidence can be determined automatically when mining association rules. If so, any hint or pointer to a resource would be great.

Dev · Answer 1 · Apr 4, 2022

Yes, there is a mechanism for calculating the minsup and minconf thresholds automatically.
But first, let me explain how to select the minsup and minconf options. It is up to you to decide which ones to use based on your data.

On some data, I use 80 percent for minimum support. On other data, I use 0.05 percent. It depends entirely on the dataset. Typically, I start with a high value and gradually lower it until I reach a value that generates enough patterns.
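The "start high and lower it" routine can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the brute-force counter `frequent_itemsets`, the helper `lower_minsup_until`, the halving factor, the floor and the toy transactions are all made up for the example and are not part of any library.

```python
from itertools import combinations

def frequent_itemsets(transactions, minsup, max_size=3):
    """Brute-force count of itemsets whose relative support is >= minsup."""
    n = len(transactions)
    counts = {}
    for t in transactions:
        items = sorted(set(t))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] = counts.get(itemset, 0) + 1
    return {s: c / n for s, c in counts.items() if c / n >= minsup}

def lower_minsup_until(transactions, target, start=0.8, factor=0.5, floor=0.001):
    """Start with a high minsup and shrink it until `target` itemsets are found."""
    minsup = start
    while True:
        found = frequent_itemsets(transactions, minsup)
        if len(found) >= target or minsup <= floor:
            return minsup, found
        minsup = max(minsup * factor, floor)

# Hypothetical toy data, one list of items per transaction.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "milk", "eggs"],
]

minsup, found = lower_minsup_until(transactions, target=5)
print(minsup, len(found))
```

On real data you would replace the brute-force counter with a proper miner (Apriori, FP-Growth, etc.); the loop around it stays the same.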

It's a little easier with the minimum confidence because it represents the level of trust you want in the rules. I normally use something like 60% because I don't want to use a rule that holds less than 60% of the time. However, this also depends on the data.
When minsup is set higher, the algorithm finds fewer patterns and runs faster. When minconf is set higher, there will be fewer rules, but many algorithms do not use minconf to prune the search space, so it may not be any faster. Naturally, the number of rules you want to generate also influences how you set these parameters.
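For reference, here is a minimal Python sketch of what the two measures mean for a single rule X -> Y: support is the fraction of transactions containing both X and Y, and confidence is the fraction of transactions containing X that also contain Y. The function names and the toy transactions are assumptions made for the example.

```python
def count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions)

def support(antecedent, consequent, transactions):
    """Fraction of all transactions that contain the whole rule."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the fraction that
    also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / count(antecedent, transactions)

# Hypothetical toy data, one list of items per transaction.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "milk", "eggs"],
]

sup = support({"bread"}, {"milk"}, transactions)      # 3 of 5 transactions
conf = confidence({"bread"}, {"milk"}, transactions)  # 3 of the 4 with bread
keep = conf >= 0.6  # the rule passes a 60% minconf
```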

If you would rather not set minsup at all, you can use a top-k association rule mining algorithm instead. You would set, for example, k=1000, and the algorithm would find the 1000 most frequent rules that satisfy a given minimum confidence. I designed an algorithm of this kind called TopKRules. Its source code is available in the SPMF open-source data mining library, which includes many implementations of association rule and pattern mining algorithms.
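To make the top-k idea concrete, here is a naive Python sketch: enumerate candidate rules, keep those that meet minconf, and return the k with the highest support. This is not the TopKRules algorithm and not SPMF's API. It is a brute-force stand-in that only shows what the output looks like, and `top_k_rules`, `max_size` and the toy data are assumptions made for the example.

```python
from itertools import combinations
import heapq

def top_k_rules(transactions, k, minconf, max_size=3):
    """Return the k rules with the highest support count among those
    whose confidence is >= minconf (brute force, small data only)."""
    sets = [frozenset(t) for t in transactions]
    items = sorted(set().union(*sets))

    def count(itemset):
        return sum(itemset <= t for t in sets)

    rules = []
    for size in range(2, max_size + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            both = count(itemset)
            if both == 0:
                continue
            # Try every non-empty proper subset as the antecedent.
            for r in range(1, size):
                for ante in combinations(combo, r):
                    antecedent = frozenset(ante)
                    conf = both / count(antecedent)
                    if conf >= minconf:
                        consequent = tuple(sorted(itemset - antecedent))
                        rules.append((both, conf, ante, consequent))
    # Tuples sort by support count first, then by confidence.
    return heapq.nlargest(k, rules)

# Hypothetical toy data, one list of items per transaction.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "milk", "eggs"],
]

top = top_k_rules(transactions, k=3, minconf=0.6)
for sup_count, conf, antecedent, consequent in top:
    print(antecedent, "->", consequent, sup_count, round(conf, 2))
```

Note that no minsup appears anywhere: the only knobs are k and minconf, which is the point of the top-k formulation.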

Another way to set the minsup threshold automatically is to use a formula that derives it from the amount of data you have.
Other work has also tried to address the problem of choosing minsup and minconf. It can be found on Google Scholar.
