ValueError: Unknown label type: (array([ 7.5,  9.2,  9.2, ...,  5.8, 10. ,  9.6]),)

0 votes

Hi guys, I'm trying to use the Naive Bayes algorithm on my dataset. The dataset can be downloaded here: https://www.kaggle.com/jiashenliu/515k-hotel-reviews-data-in-europe

This is my code: 

import pandas as pd
from nltk.corpus import stopwords
from sklearn import naive_bayes
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

data = pd.read_json('/Users/rokayadarai/Desktop/Coding/DataSets/Hotel_Reviews.json')
data.head()

# stop words (a, and, the, ...) carry little meaning, so drop them
stopset = set(stopwords.words('english'))
vectorizer = TfidfVectorizer(use_idf=True, lowercase=True, strip_accents='ascii', stop_words=stopset)

# merge the two columns Negative_Review and Positive_Review into one
data['Reviews'] = data['Negative_Review'] + data['Positive_Review']

y = data.Reviewer_Score
X = vectorizer.fit_transform(data.Reviews)

# 515738 observations and 83941 unique words
print(y.shape)
print(X.shape)

# split the data: test_size=0.2 holds out 20% for testing,
# random_state=123 makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=123)

# train the naive Bayes classifier
classifier = naive_bayes.MultinomialNB()
classifier.fit(X_train, y_train)

But after running it I keep getting the error: 

ValueError: Unknown label type: (array([ 7.5,  9.2,  9.2, ...,  5.8, 10. ,  9.6]),) for the line classifier.fit(X_train, y_train)

Could somebody please help me out?

Dec 16, 2020 in Machine Learning by anonymous
• 170 points
204 views

1 answer to this question.

0 votes
Hi,

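The problem is in your labels rather than in the model. Before you fit anything, analyze the dataset: check the format and dtype of each column, and in particular of X_train and y_train. Reviewer_Score holds continuous float values (7.5, 9.2, ...), while MultinomialNB is a classifier and expects discrete class labels, so fit() rejects y_train with "Unknown label type". Either convert the scores into classes (for example by binning them) or use a regression model instead.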
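
To make the check concrete, here is a minimal sketch of one possible fix, not necessarily the only or best one. It uses a tiny made-up DataFrame in place of the real hotel data, and the bin edges 5.0 and 8.0 (giving classes 0, 1 and 2) are purely an assumption for illustration; pick thresholds that suit your own analysis.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.utils.multiclass import type_of_target

# Tiny made-up stand-in for the hotel review data (NOT the real dataset).
data = pd.DataFrame({
    "Reviews": [
        "dirty room rude staff",
        "great location friendly staff",
        "noisy street small room",
        "lovely breakfast clean room",
        "broken shower cold water",
        "perfect stay wonderful view",
    ],
    "Reviewer_Score": [3.8, 9.2, 5.0, 9.6, 4.2, 10.0],
})

X = TfidfVectorizer().fit_transform(data["Reviews"])
y = data["Reviewer_Score"].to_numpy()

# Step 1: inspect the labels. Float scores count as a regression target.
print(y.dtype)
print(type_of_target(y))

# Step 2: fitting a classifier on those floats reproduces the error.
raised = False
try:
    MultinomialNB().fit(X, y)
except ValueError as err:
    raised = True
    print("ValueError:", err)

# Step 3: bin the scores into discrete classes.
# Hypothetical edges: < 5 -> 0, 5 to < 8 -> 1, >= 8 -> 2.
y_class = np.digitize(y, bins=[5.0, 8.0])
print(type_of_target(y_class))

# Step 4: the classifier now accepts the labels.
clf = MultinomialNB().fit(X, y_class)
pred = clf.predict(X)
print(pred)
```

In your own script the same idea would apply to data.Reviewer_Score before the call to train_test_split. If you want to predict the exact score rather than a class, a regression model is the more natural choice.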
answered Dec 16, 2020 by MD
• 95,060 points
