Bitext Presentation in San Francisco (October 2nd, 2013)

How Accurate is 90% Accurate? On Evaluation of Multilingual Sentiment Analysis Engines

“Nobody can go beyond 70% accuracy”. “Our tool reaches 90% accuracy”.

The statements above are meaningful and meaningless at the same time.

Meaningful because they make it clear that there is an issue here, and that 100% accuracy is beyond our wildest dreams. Meaningless because they don't provide details on how accuracy is measured and, most importantly, they don't specify the task on which that accuracy is achieved: document-level sentiment (by the way, is that useful for business purposes?), entity-level sentiment... In other words, when sentiment is assigned to a piece of text, do we know which brand (Microsoft, Apple...) or which topic (operating system, price...) it is assigned to? So maybe the question is rather: accuracy on what?

Evaluation of accuracy is a scientific task that should be performed with open methods and metrics. This raises questions such as: what is the difference between accuracy, precision and recall? How do we measure inter-tagger agreement? Are there texts that are genuinely ambiguous even for humans?
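To make the difference between these metrics concrete, here is a minimal Python sketch. The labels are toy data invented for illustration (not taken from any real corpus), the function names are our own, and Cohen's kappa is used as one common way to measure agreement between two taggers; it is an assumption that this particular statistic fits a given evaluation.

```python
from collections import Counter

def accuracy(gold, pred):
    """Share of items where the engine's label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def precision_recall(gold, pred, label):
    """Precision and recall for a single label (e.g. "neg")."""
    tp = sum(g == label and p == label for g, p in zip(gold, pred))
    predicted = sum(p == label for p in pred)  # items the engine tagged as label
    actual = sum(g == label for g in gold)     # items that truly carry label
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    return precision, recall

def cohen_kappa(tags_a, tags_b):
    """Agreement between two human taggers, corrected for chance agreement."""
    n = len(tags_a)
    observed = sum(a == b for a, b in zip(tags_a, tags_b)) / n
    count_a, count_b = Counter(tags_a), Counter(tags_b)
    labels = set(tags_a) | set(tags_b)
    expected = sum(count_a[l] * count_b[l] for l in labels) / (n * n)
    if expected == 1.0:  # both taggers used one and the same label throughout
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical toy data: 8 neutral texts, 2 negative ones.
gold = ["neu"] * 8 + ["neg"] * 2
# A hypothetical engine that labels everything as neutral.
pred = ["neu"] * 10
# A second hypothetical human tagger who disagrees with gold on two texts.
tagger_b = ["neu"] * 7 + ["neg", "neg", "neu"]

print("accuracy:", accuracy(gold, pred))
print("precision/recall for neg:", precision_recall(gold, pred, "neg"))
print("kappa between taggers:", cohen_kappa(gold, tagger_b))
```

On this toy data the engine reaches 80% accuracy while finding none of the negative texts (recall of 0 for "neg"), and the two human taggers agree on 80% of the texts but only reach a kappa of about 0.375 once chance agreement is discounted.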

In our view, there is one factor that plays a major role here: business rules, i.e. the way a company sees its space. If I say "ACME just launched a new release of its explosive tennis balls", is that a positive statement (new release) or just a neutral fact that shouldn't distract marketeers? Being able to implement these peculiarities (business rules) efficiently is key to achieving high accuracy in a way that is meaningful for the end user of the information.

We will show why linguistic approaches to sentiment analysis are better suited to efficiently respond to this challenge: integrating business rules. And we will use real corpora for sentiment evaluation and study their peculiarities.

We will refer to Seth Grimes's article "Never Trust Sentiment Accuracy Claims", a common reference for the industry.

This event will be of interest to users of Sentiment Analysis and Text Analytics Technology in these sectors:

  • Social CRM: because customer sentiment in social media is key
  • Business Intelligence: because their new challenge is integrating unstructured data
  • Contact Center: because Social Media is becoming the channel of choice for many customers
  • Big Data: because most of the data in “big data” is text
