How to run a regression algorithm on textual data (IMDB reviews)

0 votes
                 reviews              label
0   i admit the great majority of...    1
1   take a low budget inexperienced ... 0
2   everybody has seen back to th...    1
3   doris day was an icon of b...       0
4   after a series of silly fun ...     0

I have a dataframe of movie reviews with a label column (1 = positive, 0 = negative review).
I have a similar test dataset that contains only the review column. I need to build a sentiment analysis model using linear regression to predict the label column of the test dataframe.
Desired output: the test dataframe with a label column. Regression is performed on numerical data, so how do I convert the text reviews to numeric form so that I can fit the model?

Mar 26, 2022 in Machine Learning by Nandini
• 5,480 points
533 views

1 answer to this question.

0 votes
You can use either word2vec or tf-idf to convert words to vectors.

Word2vec's key benefit is that words with similar meanings will have similar encodings. This is particularly useful when trying to predict the polarity of a review. This is not possible with tf-idf because it is based solely on the occurrence of words.

Word2vec (word to vector) uses a language model to map words into a vector space. Each word is represented by a vector of real numbers, and words with similar meanings get similar representations. Because a regression needs one fixed-length row per review, the word vectors of a review are usually combined, for example by averaging them.
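As a rough illustration of turning word vectors into one vector per review, here is a minimal sketch. The 3-dimensional vectors in `toy_vectors` are made up for the example and are not from any trained model; in practice they would come from a trained word2vec model (for instance one built with gensim), usually with a few hundred dimensions. The helper names `cosine` and `review_vector` are hypothetical.

```python
from math import sqrt

# Made-up 3-dimensional vectors, for illustration only.
# A trained word2vec model would supply the real ones.
toy_vectors = {
    "great": [0.9, 0.1, 0.0],
    "wonderful": [0.8, 0.2, 0.1],
    "boring": [-0.7, 0.3, 0.1],
    "film": [0.0, 0.9, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def review_vector(text, vectors, dim=3):
    """Average the vectors of the known words; unknown words are skipped."""
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        return [0.0] * dim  # no known word: fall back to a zero vector
    return [sum(column) / len(known) for column in zip(*known)]


# Words with similar meaning sit closer together than words with opposite meaning.
print(cosine(toy_vectors["great"], toy_vectors["wonderful"]))
print(cosine(toy_vectors["great"], toy_vectors["boring"]))

# One fixed-length numeric row per review, ready to feed to a regression.
print(review_vector("great film", toy_vectors))
```

Averaging throws away word order, so it is only one simple option for combining the vectors.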

TF-IDF stands for "Term Frequency-Inverse Document Frequency". It is a method for weighting how important each word is to a document within a collection of documents: a word scores high when it occurs often in that document but rarely in the rest of the corpus. The scores for all words form a numeric vector for each document, which a regression model can be fitted on. This method is commonly used in text mining and information retrieval.
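For the tf-idf route, a minimal end-to-end sketch with pandas and scikit-learn could look like the following. The two small dataframes are made-up stand-ins for the real train and test data, the column names `reviews` and `label` follow the question, and the 0.5 cut-off for turning the regression output into a 0/1 label is an assumption.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Tiny made-up stand-ins for the train and test dataframes (not real IMDB data).
train_df = pd.DataFrame({
    "reviews": [
        "a great film with wonderful acting",
        "great story and a wonderful cast",
        "a boring film with terrible acting",
        "boring story and a terrible cast",
    ],
    "label": [1, 1, 0, 0],
})
test_df = pd.DataFrame({"reviews": ["wonderful film", "boring story"]})

# TfidfVectorizer turns each review into a numeric row of tf-idf scores;
# LinearRegression is then fitted on those rows.
model = make_pipeline(TfidfVectorizer(), LinearRegression())
model.fit(train_df["reviews"], train_df["label"])

# Linear regression returns real-valued scores, so threshold them
# (0.5 is an assumed cut-off) to obtain 0/1 labels for the test dataframe.
scores = model.predict(test_df["reviews"])
test_df["label"] = (scores >= 0.5).astype(int)
print(test_df)
```

Since the label is binary, logistic regression is the more usual choice; in this sketch that would mean swapping `LinearRegression()` for `LogisticRegression()` and dropping the manual threshold.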
answered Mar 30, 2022 by Dev
• 6,000 points
