Last week we published a complete introductory guide to chat bots. Since we have received some requests to keep digging, this week we want to go one step further and explain how linguistics can be used to train bots built with machine learning and improve their performance.
Machine learning bots have some advantages over those based on retrieval models, the main one being that they build answers from scratch. However, to be able to provide those responses the bot needs to be trained, like every tool built with machine learning.
Nowadays the training stage happens while the bot is interacting with users, and this can be very risky for a company: if the bot doesn’t understand a customer’s requests, the customer may end up abandoning the website and the purchasing process. And not only that: once a user has had a bad experience, they are very unlikely to come back.
The general belief is that the chat bot needs to know the approximately 80,000 words that a language may have, but this seems like too much. Why? Because even the most erudite people use at most 10,000 words of a language. It seems obvious, then, not to teach the bot every word in the dictionary, only the ones native speakers actually use.
Think about kids: when they are starting to learn how to speak, we don’t teach them all the words in a language, only the most common ones that will allow them to speak.
But the problem is: which ones are those? How can we tell them apart? Here we run into an obstacle: a lack of training material in colloquial language. The easy way to build this kind of material is to use data from Twitter, for example, but this is very tricky because of privacy issues.
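To make the idea of "only the most common words" a bit more concrete, here is a minimal sketch of how a reduced vocabulary could be picked by counting word frequencies in a corpus of colloquial sentences. This is only an illustration under our own assumptions: the function name `most_common_words`, the tiny example corpus and the cut-off value are all hypothetical, and a real bot would need a far larger and more representative corpus.

```python
import re
from collections import Counter


def most_common_words(sentences, max_words):
    """Return up to max_words of the most frequent words in the sentences."""
    counts = Counter()
    for sentence in sentences:
        # Very naive tokenizer (an assumption): lowercase runs of letters and apostrophes.
        counts.update(re.findall(r"[a-z']+", sentence.lower()))
    return [word for word, _ in counts.most_common(max_words)]


# Hypothetical toy corpus standing in for real colloquial training material.
corpus = [
    "I want a glass of wine",
    "I want a bag of chips",
    "I want one ticket",
]

# Keep only the few most frequent words instead of the whole dictionary.
vocabulary = most_common_words(corpus, max_words=3)
print(vocabulary)
```

With a real corpus, the cut-off would be something closer to the few thousand words people actually use rather than the three words kept here.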
A related issue is syntax: to be able to write sentences that are syntactically correct, bots need training, but again it doesn’t make sense to teach them thousands and thousands of structures.
As an example, when you go to another country you don’t need to learn “I want a glass of wine”, “I want a bag of chips” and “I want one ticket” as separate sentences. You just need to know the structure “I want a …” and then learn different words.
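The same idea can be sketched in a few lines of code: one structure with an open slot, plus a list of words to put in it. This is just a toy illustration, not how a machine learning bot works internally; the template string, the `fill` function and the list of items are hypothetical names we made up for the example.

```python
# One hypothetical structure with a single open slot.
TEMPLATE = "I want {item}"


def fill(template, items):
    """Build one sentence per item by putting it into the template's slot."""
    return [template.format(item=item) for item in items]


# Three different sentences come from one structure plus three fillers.
sentences = fill(TEMPLATE, ["a glass of wine", "a bag of chips", "one ticket"])
for sentence in sentences:
    print(sentence)
```

Learning one structure and a handful of fillers covers all three sentences, which is the saving we are after: the number of sentences grows with the vocabulary, while the number of structures to teach stays small.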
If we are able to solve these two problems, we will be able to speed up the learning process and therefore improve the results machine learning provides, not only when building a bot but for every tool.