Accurate Opinion Mining requires Deep Linguistic Analysis
Sentiment analysis of the language used in social media is hard. One extremely frequent example is the word “like”. It appears in positive comments such as “Although I do like the iPhone I prefer the Galaxy S3”, and also in negative comments such as “I didn’t like it that much”. In both cases “like” is a verb used to express preference.
However, “like” also appears very frequently with a neutral meaning, not expressing sentiment:
My life is exactly like that...
Looks like he officially introduced himself...
Nokia does not sell Lumias like Apple sells iPhones
Clearly “like” is not a verb in the examples above. Here “like” is a preposition (first example); a particle attached to another verb (second example: “looks like”); or a subordinating conjunction, which introduces a subordinate clause (third example). How many social media analysis tools make the mistake of reporting these neutral uses of “like” as positive comments? For marketeers relying on the accuracy of the numbers these tools report, it is a critical question to answer.
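To make the distinction concrete, here is a minimal Python sketch of scoring “like” only when it is used as a verb. It is an illustration, not how any particular product works: the part-of-speech tags are assigned by hand to stand in for the output of a real tagger, and the tag names, the `like_polarity` function and the three-token negation window are all assumptions made for this example.

```python
from typing import List, Tuple

# (token, part-of-speech) pairs. The tags are hand-assigned here and stand in
# for the output of a real tagger; the tag names are illustrative assumptions.
Tagged = List[Tuple[str, str]]

NEGATORS = {"not", "n't", "never"}


def like_polarity(tagged: Tagged) -> int:
    """Return +1, -1 or 0 for the word "like" in one tagged sentence.

    Only a verbal "like" carries sentiment. A negator within the three
    tokens before it flips the polarity (a deliberately crude heuristic).
    """
    for i, (token, pos) in enumerate(tagged):
        if token.lower() == "like" and pos == "VERB":
            window = tagged[max(0, i - 3):i]
            negated = any(t.lower() in NEGATORS for t, _ in window)
            return -1 if negated else 1
    return 0  # "like" is absent or is not used as a verb


positive = [("I", "PRON"), ("do", "AUX"), ("like", "VERB"),
            ("the", "DET"), ("iPhone", "PROPN")]
negative = [("I", "PRON"), ("did", "AUX"), ("n't", "PART"),
            ("like", "VERB"), ("it", "PRON")]
neutral = [("Nokia", "PROPN"), ("does", "AUX"), ("not", "PART"),
           ("sell", "VERB"), ("Lumias", "PROPN"), ("like", "SCONJ"),
           ("Apple", "PROPN"), ("sells", "VERB"), ("iPhones", "PROPN")]

print(like_polarity(positive), like_polarity(negative), like_polarity(neutral))
```

Note that the third sentence contains both “not” and “like”, so a tool that only matches keywords could report it as a negative opinion. With the tag taken into account, it is scored as neutral.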
However, there are technologies capable of correctly handling all these uses of “like”, as well as many similar problems. They are based on Deep Linguistic Analysis. In short, this involves technologies which can analyze a text from a morphological, syntactic and semantic point of view to properly determine:
- which entities and concepts appear in the text
- what relationships are established among them
- whether there is a sentiment associated with any of the entities or concepts.
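As a rough illustration of the three outputs listed above, the following Python sketch shows one way such a result could be represented. The class names, fields and the hand-built example are hypothetical and chosen only for this illustration; they do not describe the output format of any real API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entity:
    text: str
    # +1 or -1 when a sentiment is attached, None when there is none.
    sentiment: Optional[int] = None


@dataclass
class Relation:
    subject: str
    predicate: str
    obj: str


@dataclass
class Analysis:
    entities: List[Entity] = field(default_factory=list)
    relations: List[Relation] = field(default_factory=list)

    def opinions(self) -> List[Entity]:
        """Entities that have a sentiment associated with them."""
        return [e for e in self.entities if e.sentiment is not None]


# Hand-built result for "Although I do like the iPhone I prefer the Galaxy S3".
example = Analysis(
    entities=[Entity("I"), Entity("iPhone", 1), Entity("Galaxy S3", 1)],
    relations=[Relation("I", "like", "iPhone"),
               Relation("I", "prefer", "Galaxy S3")],
)

print([e.text for e in example.opinions()])
```

Keeping the sentiment on the entity rather than on the whole sentence is what lets a single comment express separate opinions about the iPhone and the Galaxy S3.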
Language ambiguities are pervasive since they occur at all levels: morphology, syntax and semantics. That’s why Deep Linguistic Analysis is needed. These techniques bring language knowledge to the analysis and obtain high accuracy as a consequence.
This is the way we do it at Bitext with NaturalOpinions and our API. NaturalOpinions performs a full morphological analysis and a syntactic analysis that determines the meaning of ambiguous words such as “like”. This level of language analysis is what sets NaturalOpinions apart and makes it particularly useful for Social Media and Big Data Analysis.