How to improve Amazon Lex's understanding accuracy by 50%

Lex, the bot-building platform in which Amazon has invested heavily, can be drastically improved. Lex's understanding accuracy can go from 40% to over 90% thanks to the artificial training data automatically generated by Bitext. Building a well-performing bot with Lex is now possible!

Amazon Lex is a platform for building conversational interfaces. It provides Amazon Alexa with deep learning technologies for automatic speech recognition and Natural Language Understanding (NLU). Thousands of people currently use Lex without even knowing it when they interact with conversational agents deployed on messaging platforms such as Facebook and Slack. Even so, Natural Language Understanding remains one of the most significant challenges in computer science because of all the training data and infrastructure needed to build a human-like virtual assistant.


How can Lex results be improved?

We chose Lex for this test as it is one of the most popular bot-building platforms in 2018. The main goal was to prove how Lex can benefit from our technology when it comes to generating training data. For this purpose, two different tests were carried out, comparing accuracy results obtained by evaluating bots trained with manually-tagged sentences, with those trained with artificial training data.

Our technology generated sentence variations based on five different intents and the same type of slots (ACTION, OBJECT, PLACE, PERCENTAGE and TIME), all of them related to the management of the lights in a house:

  • SWITCH_ON  (switch on the lights in the living room)
  • SWITCH_OFF (switch off the lights in the living room)
  • CHANGE_COLOR (change the color of the lights to blue)
  • DIM_LIGHT (dim the living room lights to 20%)
  • PROGRAM_LIGHT (program garden lights for 21:00)
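To make the setup concrete, here is a hypothetical sketch of what tagged training sentences for these intents might look like. The intent and slot names come from the article; the sentence wording is illustrative, and the `{Slot}` placeholders follow Lex's sample-utterance syntax for marking where a slot value appears.

```python
import re

# Illustrative only: intent and slot names are from the article, the example
# sentences are hypothetical. {Slot} marks a slot in Lex sample-utterance style.
sample_utterances = {
    "SWITCH_ON":     ["switch on the {Object} in the {Place}"],
    "SWITCH_OFF":    ["switch off the {Object} in the {Place}"],
    "CHANGE_COLOR":  ["change the color of the {Object} to blue"],
    "DIM_LIGHT":     ["dim the {Place} {Object} to {Percentage}"],
    "PROGRAM_LIGHT": ["program {Place} {Object} for {Time}"],
}

def slot_names(template: str) -> list:
    """Return the slot placeholders used in one sample utterance."""
    return re.findall(r"\{(\w+)\}", template)
```

For instance, `slot_names("dim the {Place} {Object} to {Percentage}")` returns `["Place", "Object", "Percentage"]`.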


What Lex can get on its own

In the first test, Lex was trained with just 12 manually-tagged sentences. That is not enough data for a bot to understand users accurately, so the 30% accuracy in identifying slots is not surprising. In the second test, we went further and trained Lex with 50 manually-tagged sentences, that is, 10 sentences per intent. This should be a decent number of sentences per intent for training an average bot. However, a slot-filling accuracy of 42% is still not good enough for an optimal user experience.
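The article does not spell out how slot-filling accuracy is scored; a minimal sketch, assuming a predicted slot counts as correct only when both its name and value exactly match the gold annotation, could look like this:

```python
# Minimal sketch of slot-filling accuracy (assumption: exact match on both
# slot name and slot value; each list element is the slot dict for one
# evaluation sentence).
def slot_accuracy(gold: list, predicted: list) -> float:
    """Fraction of gold slots, across all test sentences, recovered exactly."""
    total = correct = 0
    for g, p in zip(gold, predicted):
        for name, value in g.items():
            total += 1
            if p.get(name) == value:
                correct += 1
    return correct / total if total else 0.0

gold = [{"Place": "living room", "Percentage": "20%"}]
pred = [{"Place": "living room", "Percentage": None}]
# one of two gold slots recovered -> 0.5
```

Calling `slot_accuracy(gold, pred)` on this pair yields 0.5, since only the `Place` slot was filled correctly.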

Here you can see a clear comparison of both training processes carried out with Amazon Lex:


When Bitext technology comes into play

We generated artificial training data based on the manually-tagged sentences from the previous tests and used it to train the Lex bot. The evaluation sentences were the same ones used before. As you can see below, the results show a substantial improvement in accuracy:


There is no doubt that the more sentences are used to train a bot, the higher its accuracy. Using artificial training data, we can drastically improve understanding accuracy while spending zero time and effort on manually generating subtle variations and tagging slots. With this technology you can achieve outstanding results from only a few seed sentences. This is the magic of automation.
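The combinatorial effect described above can be sketched with a toy template expander. Bitext's actual linguistic engine is far more sophisticated; this only illustrates how a few seed patterns multiply into many tagged variations, and all pattern and slot names here are hypothetical.

```python
from itertools import product

# Toy sketch of artificial training-data generation: expand one seed pattern
# into every combination of slot values. Not Bitext's actual method.
def expand(pattern: str, slots: dict) -> list:
    """Fill {Name} placeholders with every combination of the given values."""
    names = list(slots)
    variants = []
    for values in product(*(slots[n] for n in names)):
        sentence = pattern
        for name, value in zip(names, values):
            sentence = sentence.replace("{" + name + "}", value)
        variants.append(sentence)
    return variants

variants = expand(
    "could you {Action} the lights in the {Place}",
    {"Action": ["switch on", "turn on"], "Place": ["living room", "garden"]},
)
# 2 Action values x 2 Place values -> 4 sentences from a single seed pattern
```

With more slot values and more seed patterns, a handful of seeds quickly becomes hundreds of tagged training sentences.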

This improvement is not limited to Lex; it can be achieved with any bot-building platform. Check out the results we obtained benchmarking with Rasa. In short, our technology now makes it possible to create a bot with 90% understanding accuracy with minimal effort.

If you want to learn more about the tests we conducted, make sure to check out this benchmark where we go into the details of the process as well as the results. 

