We are all surrounded by devices that run on the internet. Previously it was just computers, but now the web is also available on handy mobiles and tablets. Technology has not only benefited businesses and made our lives easier, it has also enriched our online experience. The web has become a platform where people spend a lot of time, seek knowledge, exchange ideas and even shop!
For example, when we want to make a purchase, online or offline, what do we do first? We browse different websites and forums to see what people are saying about the product. We check out a few online stores that sell what we are looking for. We read the reviews and comments that people have written about the product and the online store. Only after going through a good number of reviews do we decide whether to make the purchase.
Many purchase decisions in the virtual world are made after going through what influential reviewers and peers have to say about the product or service. This is why companies now need to track and analyze what people are saying about them on the web. From the company’s perspective, reviews and comments are crucial, so analyzing them is something an organization cannot afford to miss.
But what are these comments and reviews collectively called?
These comments, opinions and reviews are known as “sentiment data”, and the task of identifying whether they are positive or negative is known as “sentiment data analysis” or simply “sentiment analysis”.
Sentiment analysis is one of the prominent applications of R. It provides valuable insights to marketers and organizations looking to improve productivity and optimize their brand or product.
R is one of the most comprehensive statistical analysis environments available for this purpose. It integrates the standard statistical tests, models and analyses, and provides a complete language for managing and manipulating data. The graphical capabilities of R are a particular strength, offering a fully programmable graphics language that compares well with most other statistical and graphical packages. Combining sentiment analysis with these graphical capabilities makes R a powerful tool for an organization.
There are different methods to analyze the ‘sentiment data’. Let us take a look at each of them here.
Document-level sentiment analysis
Opinions are usually subjective expressions that describe people’s sentiments, appraisals or feelings towards an entity or an event. Many blogs and forums allow people to express their opinions in the form of reviews and comments. When opinions are expressed as free-text reviews instead of a simple ‘Yes’ or ‘No’, identifying the actual emotions requires a subjective analysis of the words used in the review.
In document-level sentiment analysis, each document is assumed to focus on a single entity or event and to contain the opinion of a single opinion holder. The opinion can be classified into two simple classes, positive or negative (and possibly neutral). For example, take a product review: “I bought a new phone few days ago. It is a nice phone, though it is a little big. The touch screen is good. The voice clarity is better. I simply love the phone”. Considering the words and phrases used in the review (nice, good, better, love), the subjective opinion is said to be positive. Objective opinions are measured using a star or poll system, where 4 or 5 stars are positive and 1 or 2 stars are negative.
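The word-counting idea described above can be sketched in a few lines. This is only a minimal illustration, written in Python rather than R for brevity; the tiny `POSITIVE` and `NEGATIVE` word lists and the function name `classify_document` are assumptions made for this example, not part of any particular package.

```python
import re

# Hypothetical mini-lexicon for illustration only; real lexicons are far larger.
POSITIVE = {"nice", "good", "better", "love", "clear", "fantastic"}
NEGATIVE = {"bad", "poor", "worse", "hate", "difficult"}

def classify_document(text):
    """Label a whole review by counting positive and negative lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

review = ("I bought a new phone few days ago. It is a nice phone, though it is "
          "a little big. The touch screen is good. The voice clarity is better. "
          "I simply love the phone")
print(classify_document(review))  # prints "positive"
```

The review contains four positive words and no negative ones, so the whole document is labelled positive. A simple count like this ignores negation and context, which is one reason finer-grained levels of analysis exist.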
Sentence-level sentiment analysis
To get a more refined view of the different opinions expressed in a document about the entities, we should move to the sentence level. This level of sentiment analysis first filters out the sentences that contain no opinion, and then determines whether the opinion expressed in each remaining sentence is positive or negative.
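The two steps at the sentence level (drop sentences without an opinion, then classify the rest) might look roughly like this. It is a hedged Python sketch: the word lists and the function name `analyze_sentences` are hypothetical, and treating "contains no lexicon word" as "contains no opinion" is a simplifying assumption.

```python
import re

# Hypothetical mini-lexicon for illustration only.
POSITIVE = {"nice", "good", "better", "love"}
NEGATIVE = {"bad", "poor", "worse", "hate"}

def analyze_sentences(text):
    """Return (sentence, label) pairs, skipping sentences with no opinion word."""
    results = []
    # Split after sentence-ending punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"[a-z']+", sentence.lower())
        pos = sum(w in POSITIVE for w in words)
        neg = sum(w in NEGATIVE for w in words)
        if pos == 0 and neg == 0:
            continue  # step 1: treat the sentence as objective and filter it out
        # step 2: classify the remaining, opinionated sentence
        label = "positive" if pos > neg else "negative" if neg > pos else "mixed"
        results.append((sentence, label))
    return results

text = "I bought a new phone. The touch screen is good. The speaker is bad."
for sentence, label in analyze_sentences(text):
    print(label, "-", sentence)
```

Here the first sentence is dropped because it states a fact, while the other two are labelled positive and negative respectively.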
Aspect-based sentiment analysis
Document-level and sentence-level sentiment analysis work well when the text refers to a single entity. However, in many cases people talk about entities that have many aspects or attributes, and they have different opinions about different aspects. This often happens in product reviews and discussion forums. For example: “I am a Nokia phone lover. I like the look of the phone. The screen is big and clear. The camera is fantastic. But, there are few downsides too; the battery life is not up-to the mark and access to WhatsApp is difficult.” Classifying this review as simply positive or negative hides valuable information about the product. Therefore, aspect-based sentiment analysis focuses on recognizing all sentiment expressions within a given document and the aspects to which the opinions refer.
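One rough way to picture aspect-based analysis is to pair each aspect term with the opinion words that appear in the same clause. The sketch below is in Python and rests on several assumptions: the aspect list, the opinion terms (including the multi-word expression "not up-to the mark") and the helper names are all hypothetical, and real systems extract aspects far more carefully.

```python
import re

# Hypothetical aspect and opinion terms chosen to fit the example review.
ASPECTS = ["look", "screen", "camera", "battery", "whatsapp"]
POSITIVE = ["like", "clear", "fantastic"]
NEGATIVE = ["difficult", "not up-to the mark"]

def contains(term, clause):
    """True if the term occurs in the clause as a whole word or phrase."""
    return re.search(r"\b" + re.escape(term) + r"\b", clause) is not None

def aspect_sentiments(text):
    """Map each aspect to the polarity of the clause in which it is mentioned."""
    results = {}
    # Break the text into clauses at sentence punctuation, semicolons and "but".
    for clause in re.split(r"[.;!?]|\bbut\b", text.lower()):
        score = (sum(contains(p, clause) for p in POSITIVE)
                 - sum(contains(n, clause) for n in NEGATIVE))
        if score == 0:
            continue  # no clear opinion in this clause
        label = "positive" if score > 0 else "negative"
        for aspect in ASPECTS:
            if contains(aspect, clause):
                results[aspect] = label
    return results

review = ("I am a Nokia phone lover. I like the look of the phone. "
          "The screen is big and clear. The camera is fantastic. "
          "But, there are few downsides too; the battery life is not up-to "
          "the mark and access to WhatsApp is difficult.")
print(aspect_sentiments(review))
```

For the example review this reports the look, screen and camera as positive and the battery and WhatsApp access as negative, which is exactly the detail a single document-level label would hide.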
Comparative sentiment analysis
In many cases, users express their opinions by comparing a product with a similar product or brand. Therefore, the goal here is to identify sentences that contain comparative opinions.
For example: “I drove the Honda Civic, it does not handle better than the Skoda Superb”
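Identifying comparative sentences can be approximated with a simple pattern: a comparative word (such as "more", "less", or a word ending in "-er") followed shortly by "than". The Python sketch below is a heuristic written for illustration; the pattern and the function name `is_comparative` are assumptions, and the pattern will also match non-comparative phrases such as "other than".

```python
import re

# Heuristic: "more"/"less"/a word ending in "-er", optionally one more word, then "than".
# Known limitation: phrases like "other than" or "rather than" also match.
COMPARATIVE = re.compile(r"\b(?:more|less|\w+er)\s+(?:\w+\s+)?than\b", re.IGNORECASE)

def is_comparative(sentence):
    """True if the sentence looks like it compares two things."""
    return COMPARATIVE.search(sentence) is not None

print(is_comparative("I drove the Honda Civic, it does not handle better than the Skoda Superb"))  # prints True
print(is_comparative("The camera is fantastic"))  # prints False
```

Detection is only the first step. Deciding which of the two entities is preferred (here, the Skoda Superb, because of the negation) needs further analysis of the sentence.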
Sentiment lexicon acquisition
This sentiment analysis method relies on a sentiment lexicon: a list of words and expressions that people use to express subjective feelings, sentiments or opinions. The lexicon contains not only single words, but also phrases and idioms. In the other types of sentiment analysis, we have seen what positive and negative words are, but such words alone are not always enough. Let us take an example: “Car X is better than car Y.” This sentence does not express an opinion that either of the two cars is good or bad. Therefore, these types of sentences and documents are further analyzed using three approaches: the manual approach, the dictionary-based approach and the corpus-based approach.
Manual approach: The lexicon is compiled by hand. This is rarely feasible on its own, as it is time consuming.
Dictionary-based approach: This approach uses a dictionary such as WordNet to find synonyms and antonyms of known sentiment words to carry out the analysis.
Corpus-based approach: This is used to create a domain-specific sentiment lexicon to carry out the analysis.
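The dictionary-based approach is often described as starting from a few seed words and growing the lexicon through synonyms (same polarity) and antonyms (opposite polarity). The Python sketch below illustrates that idea under clear assumptions: the small `THESAURUS` table is a hypothetical stand-in for a real resource such as WordNet, and `expand_lexicon` is an illustrative name, not a library function.

```python
# Hypothetical stand-in for a dictionary such as WordNet:
# word -> (synonyms, antonyms)
THESAURUS = {
    "good": ({"nice", "fine"}, {"bad"}),
    "nice": ({"pleasant"}, {"nasty"}),
    "bad": ({"poor"}, {"good"}),
    "poor": ({"inferior"}, set()),
}

def expand_lexicon(positive_seeds, negative_seeds):
    """Grow positive and negative word sets from seeds via synonyms and antonyms."""
    positive, negative = set(positive_seeds), set(negative_seeds)
    frontier = [(w, True) for w in positive] + [(w, False) for w in negative]
    while frontier:
        word, is_positive = frontier.pop()
        synonyms, antonyms = THESAURUS.get(word, (set(), set()))
        # Synonyms keep the polarity of the current word; antonyms flip it.
        candidates = ([(s, is_positive) for s in synonyms]
                      + [(a, not is_positive) for a in antonyms])
        for new_word, polarity in candidates:
            if new_word in positive or new_word in negative:
                continue  # already classified
            (positive if polarity else negative).add(new_word)
            frontier.append((new_word, polarity))
    return positive, negative

positive, negative = expand_lexicon(["good"], [])
print(sorted(positive))  # prints ['fine', 'good', 'nice', 'pleasant']
print(sorted(negative))  # prints ['bad', 'inferior', 'nasty', 'poor']
```

Starting from the single seed "good", the lexicon grows to four positive and four negative words. With a real dictionary the expanded list is usually reviewed by hand, since synonym chains can drift in meaning.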
These are the different ways to analyze consumers’ sentiments and understand where the company stands in the market!