From a business perspective, there is a huge difference between plain polarity analysis and topic-based sentiment analysis (also known as aspect-based sentiment analysis).
Why is polarity analysis by itself not enough?
Polarity analysis takes into account the number of positive and negative terms that appear in a given sentence. It is useful to some extent, since it does a good job of structuring data sets.
Let’s say I have 1000 reviews of my product that I want to analyze. By using polarity I can identify that 30% are negative, 20% are neutral and 50% are positive, which is good for segmentation. But I am left with three chunks of 300, 200 and 500 reviews to go through if I want more meaningful insights than a nice-looking pie chart.
And imagine if we had 10 times that data, and in multiple languages! Then we would be dealing with some really expensive (and not very delighted) employees who have to spend all day reading through reviews.
Let’s have a look at how comments are treated when using polarity analysis by itself.
“The weather is amazing, and I love my XYZ phone.”
Clearly positive, as we have expressions like “amazing” and “love”.
“The weather is gloomy and sad, and I hate my XYZ phone.”
Clearly negative, as “gloomy”, “sad” and “hate” are objectively negative expressions.
“The weather is gloomy and sad, and I love my XYZ phone.”
Neutral. Some expressions are good, others bad.
As we add them all together we end up with something in between.
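The counting approach can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular product works: the word lists and the 0.5 threshold are invented for this example, and real systems rely on much larger lexicons or trained models.

```python
import re

# Toy lexicons, invented for illustration only.
POSITIVE = {"amazing", "love"}
NEGATIVE = {"gloomy", "sad", "hate"}


def polarity(sentence, threshold=0.5):
    """Label a sentence by the balance of positive and negative terms."""
    words = re.findall(r"[a-z]+", sentence.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    total = positives + negatives
    if total == 0:
        return "neutral"  # no sentiment-bearing terms at all
    # Score ranges from -1 (all negative) to +1 (all positive).
    score = (positives - negatives) / total
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"  # mixed terms land somewhere in between


for text in (
    "The weather is amazing, and I love my XYZ phone.",
    "The weather is gloomy and sad, and I hate my XYZ phone.",
    "The weather is gloomy and sad, and I love my XYZ phone.",
):
    print(polarity(text), "|", text)
```

Note that nothing in this function knows what each term refers to, so the opinion about the phone and the opinion about the weather are folded into a single score.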
It is here, with the neutral cases, that the limitations of a polarity-based approach become clear.
And this is where topic-based sentiment analysis really shines, and why it is becoming a standard in the text analytics industry.
Going back to our example about the weather and the phone: when we can identify exactly what the opinion is about, the true expressed opinion is not lost in a misleading overall score. Even more usefully, XYZ company can filter the analyzed data to zoom in on the opinions that are specifically about its phones. The rest of the noise, like comments about the weather, is filtered out.
How does topic-based sentiment analysis work?
To be able to match expressions that bear sentiment with their relevant topic, we need to rely on linguistic knowledge. The industry uses many methods to do this, such as POS tagging and parsing, as well as other techniques like n-grams and neural network models.
Linguistics uncovers the structure of the sentence (known as phrase structure). This is what makes the difference in topic-based analysis. Knowledge of parts of speech and grammar is used to detect the topic that the expressed opinion relates to:
- I like this camera (with “like” the topic is normally the direct object)
- This lens is great (with “be” the topic is normally the subject)
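The two rules above can be sketched in Python as follows. This is a rough illustration under strong assumptions: the regular expressions are a crude stand-in for real POS tagging and parsing, they only cover these two sentence shapes, and the lexicons and the function name `topic_sentiment` are invented for this example.

```python
import re

# Toy lexicons, invented for illustration only.
OPINION_VERBS = {"like": 1, "love": 1, "hate": -1}
OPINION_ADJECTIVES = {"great": 1, "amazing": 1, "gloomy": -1, "sad": -1}

# "I <verb> my/this/the <object>": the topic is the direct object.
VERB_PATTERN = re.compile(r"\bi (\w+) (?:my|this|the) ([\w ]+)")
# "The/This <subject> is <adj> [and <adj> ...]": the topic is the subject.
BE_PATTERN = re.compile(r"\b(?:the|this) (\w+) is (\w+(?: and \w+)*)")


def topic_sentiment(sentence):
    """Return {topic: score} for each opinion found in the sentence."""
    opinions = {}
    # Treat punctuation as a clause boundary (a simplification).
    for clause in re.split(r"[,.;!?]", sentence.lower()):
        verb_match = VERB_PATTERN.search(clause)
        if verb_match and verb_match.group(1) in OPINION_VERBS:
            topic = verb_match.group(2).strip()
            score = OPINION_VERBS[verb_match.group(1)]
            opinions[topic] = opinions.get(topic, 0) + score
        be_match = BE_PATTERN.search(clause)
        if be_match:
            adjectives = be_match.group(2).split(" and ")
            known = [a for a in adjectives if a in OPINION_ADJECTIVES]
            if known:
                topic = be_match.group(1)
                score = sum(OPINION_ADJECTIVES[a] for a in known)
                opinions[topic] = opinions.get(topic, 0) + score
    return opinions


result = topic_sentiment("The weather is gloomy and sad, and I love my XYZ phone.")
# Keep only the opinions about the phone and drop the rest as noise.
phone_only = {topic: score for topic, score in result.items() if "phone" in topic}
print(result)
print(phone_only)
```

For the mixed sentence about the weather and the phone, the two opinions are now kept apart, so the positive opinion about the phone survives the filter instead of being cancelled out by the weather.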
Well, this is all very interesting, but does it really produce measurable results?
You’d be surprised.
In academic competitions on topic-based sentiment analysis (SemEval, for example), sentiment analysis platforms are tested against data hand-tagged by humans, and each contestant’s output is scored on precision, recall, and F-score.
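For readers unfamiliar with these measures, here is a small Python sketch of how they are computed. The (topic, polarity) pairs below are made up for illustration and are not real evaluation data; the F-score shown is the common F1 variant, the harmonic mean of precision and recall.

```python
def precision_recall_f1(predicted, gold):
    """Compare system output against human hand-tagged annotations."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)  # pairs the system got right
    # Precision: share of the system's answers that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of the human annotations that the system found.
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations: what humans tagged vs. what a system produced.
gold = {
    ("phone", "positive"),
    ("weather", "negative"),
    ("camera", "positive"),
    ("lens", "positive"),
}
predicted = {
    ("phone", "positive"),
    ("weather", "negative"),
    ("camera", "negative"),  # wrong polarity, so it counts as an error
}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this made-up case the system gets two of its three answers right (precision 2/3) and finds two of the four human annotations (recall 1/2), which gives an F1 of 4/7.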
It’s quite amazing to see how sentiment analysis platforms measure up against humans!