How to specify the prior probability for scikit-learn's Naive Bayes

Question

I'm using the scikit-learn machine learning library (Python) for a machine learning project. One of the algorithms I'm using is the Gaussian Naive Bayes implementation. One of the attributes of the GaussianNB class is the following:

class_prior_ : array, shape (n_classes,)

I want to alter the class prior manually since the data I use is very skewed and the recall of one of the classes is very important. By assigning a high prior probability to that class the recall should increase.

However, I can't figure out how to set the attribute correctly.

This is my code:

gnb = GaussianNB()
gnb.class_prior_ = [0.1, 0.9]
gnb.fit(data.XTrain, yTrain)
yPredicted = gnb.predict(data.XTest)

I figured this was the correct syntax, and that I could find out which class belongs to which position in the array by playing with the values, but the results remain unchanged. No errors were raised either.

What is the correct way of setting the attributes of the GaussianNB algorithm from scikit-learn library?

Nandini · Answer 1 · Apr 7, 2022

GaussianNB does provide a way to set prior probabilities: the 'priors' constructor parameter. The documentation describes it as follows: "priors : array-like of shape (n_classes,). Prior probabilities of the classes. If specified, the priors are not adjusted according to the data." As an example, consider the following:

from sklearn.naive_bayes import GaussianNB
# minimal dataset
X = [[1, 0], [1, 0], [0, 1]]
y = [0, 0, 1]
# use empirical prior, learned from y
gauss = GaussianNB()
# predict expects a 2D array with one row per sample
print(gauss.fit(X, y).predict([[1, 1]]))
print(gauss.class_prior_)

[0]
[0.66666667 0.33333333]

However, if you adjust the prior probabilities, you'll get a different result, which I believe is what you're looking for.

# use custom prior to make 1 more likely
gauss = GaussianNB(priors=[0.1, 0.9])
gauss.fit(X, y).predict([[1, 1]])

array([1])
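The question also asks which class belongs to which position in the array. As a minimal sketch, assuming the same toy data but with string labels ("common" and "rare" are made-up names for illustration): the entries of priors follow the order of the fitted classes_ attribute, which holds the sorted unique labels.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Same toy data as above; the string labels are hypothetical
X = np.array([[1, 0], [1, 0], [0, 1]])
y = np.array(["common", "common", "rare"])

gauss = GaussianNB(priors=[0.1, 0.9])
gauss.fit(X, y)

# classes_ holds the sorted labels; priors[i] applies to classes_[i]
prior_by_class = dict(zip(gauss.classes_, gauss.class_prior_))
print(prior_by_class)

# The point [1, 1] is equally far from both classes, so the prior decides
prediction = gauss.predict([[1, 1]])
print(prediction)
```

Checking classes_ once after fitting is more reliable than guessing the order by trial and error.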

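The code in the question has no effect because class_prior_ is a fitted attribute, not a constructor argument, as the online documentation shows. Assigning to gnb.class_prior_ before calling fit does nothing useful: fit recomputes the attribute, either from the 'priors' parameter or from the class frequencies in the training labels. You can read class_prior_ after fitting GaussianNB to check which priors were actually used.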
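To illustrate why the question's approach leaves the results unchanged, here is a small sketch, assuming the toy dataset from the example above. It assigns class_prior_ before fitting, then inspects the attribute after fit.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

X = [[1, 0], [1, 0], [0, 1]]
y = [0, 0, 1]

# Approach from the question: assign the attribute before fitting
gnb = GaussianNB()
gnb.class_prior_ = [0.1, 0.9]  # no error, but fit() recomputes this
gnb.fit(X, y)
overwritten = np.asarray(gnb.class_prior_)
print(overwritten)  # empirical class frequencies, not [0.1, 0.9]

# Passing priors to the constructor is what takes effect
gnb_custom = GaussianNB(priors=[0.1, 0.9]).fit(X, y)
custom = np.asarray(gnb_custom.class_prior_)
print(custom)
```

With no 'priors' given, the fitted attribute reflects the class frequencies in y (two samples of class 0, one of class 1), which is why the manual assignment appears to be silently ignored.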
