Is it possible to speed up the training process in Deep Learning?

“Artificial Intelligence is here to stay!” You may have heard this several times over the past few years, and it is true. However, we are not talking about a Hollywood science-fiction movie; we are referring to Machine Learning and Deep Learning.

The difference between machines dominating the world and machine learning comes from narrowing the concept of “general AI” down to “narrow AI”. In the latter, the belief is that machines can perform specific tasks as well as humans, or even better. However, if we expect a machine to do this, we must teach it. The question is how. The answer: by using Machine Learning.

Machine Learning tools use an algorithm to examine large amounts of data in search of patterns, and then generate a signal that lets users recognize those patterns in their incoming data.

Machine Learning works with input features that the algorithm processes. Creating them is slow and complex, because the information each feature vector should contain is very specific. And since we do not know in advance what the final output will look like, it is difficult to decide which information should be included in the input features.
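To make the idea of hand-built input features more concrete, here is a minimal sketch. The function name `extract_features` and the four features it computes are our own assumptions for illustration; a real project would need many more, and choosing them is exactly the hard part described above.

```python
def extract_features(text):
    """Map a text to a fixed-length vector of hand-picked features.

    The features below are hypothetical examples, not a recommended set.
    """
    words = text.lower().split()
    return [
        len(words),                                          # number of words
        sum(w.strip(".,!?") == "problem" for w in words),    # mentions of "problem"
        text.count("!"),                                     # exclamation marks
        int(any(w in {"not", "never"} for w in words)),      # crude negation flag
    ]


vector = extract_features("I have a problem!")
```

Every new kind of pattern we want the algorithm to notice means going back and adding another entry to this list by hand.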

In this approach, the only way tools can improve their performance is through experience: adding more data and creating new input features. This consumes a lot of resources at a time when you need fast results to stay ahead of the market.

The evolution of machine learning into deep learning makes things easier, since hand-built feature vectors are no longer required. Deep learning models work directly with data structures that can be treated as inputs, so it is possible to speed up the process.

However, in our field, linguistics, we have detected a lack of high-quality data that can be given to machines as training material. For example, most colloquial language is not found in any dictionary or corpus; it is out there on social media, and we cannot use that information because of privacy issues.

Another problem we have detected is that people try to train machine learning tools with a full dictionary, which takes a lot of time and is not needed. A language has an average of 80,000 words, but even the most educated people use at most 10,000. So why spend time teaching something that is not needed? Because they do not know which words are the relevant ones.
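One simple way to get a feel for which words matter is to count how often each word appears in a sample of real text and check how much of that text a small vocabulary already covers. The sketch below assumes a frequency-based notion of "relevant", which is only one possible choice; the helper names `top_words` and `coverage` and the toy corpus are made up for this example.

```python
from collections import Counter


def top_words(tokens, n):
    """Return the n most frequent words in a list of tokens."""
    return [word for word, _ in Counter(tokens).most_common(n)]


def coverage(tokens, vocabulary):
    """Fraction of tokens in the text that belong to the given vocabulary."""
    vocab = set(vocabulary)
    return sum(1 for tok in tokens if tok in vocab) / len(tokens)


# Toy corpus, invented for illustration only.
corpus = ("the screen is black and the screen is broken "
          "and the phone is fine").split()

small_vocab = top_words(corpus, 3)
share = coverage(corpus, small_vocab)
```

On a real corpus, a comparison like this shows how quickly coverage grows with vocabulary size, which helps decide where to stop.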

The third issue is that neither deep learning nor machine learning teaches structures to its tools, only words. Since the tools learn from experience and from the data they are given, they will never understand structures.

Let’s take a simple example: a tech company sets up an alert so that each time a user tweets a complaint about a problem, someone from Customer Support receives an email and contacts the customer.

If we don’t train the machine to recognize structures, this workflow will send an alert when it finds a tweet saying:

“At the beginning everything worked perfectly with XXX, however after 3 weeks of usage I have a problem with the screen, I cannot see anything but black!”

This alert is correct, but the company will also receive an alert in this case:

“Besides all the reviews I’ve read about the XXX I haven’t had any type of issue or problem yet, it works perfectly”

In this case, however, the user has not had any problem with the product. The issue is that the machine is only trained to detect the word “problem”, without considering the structure of the sentence.
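The two tweets above can be run through a small sketch that contrasts plain keyword matching with a check that looks at a little context. To be clear, this is not Bitext's grammar-based technology: the second function is a crude window-based negation heuristic that we swapped in just to show why structure matters, and the word lists and window size are assumptions picked for this example.

```python
import re

# The two example tweets from the text above.
COMPLAINT = ("At the beginning everything worked perfectly with XXX, however "
             "after 3 weeks of usage I have a problem with the screen, "
             "I cannot see anything but black!")
NO_COMPLAINT = ("Besides all the reviews I've read about the XXX I haven't had "
                "any type of issue or problem yet, it works perfectly")

# Hypothetical word lists, chosen only for this illustration.
TRIGGER_WORDS = {"problem", "issue"}
NEGATORS = {"haven't", "no", "not", "never", "without"}


def tokenize(text):
    """Lowercase the text and split it into words, keeping apostrophes."""
    normalized = text.lower().replace("\u2019", "'")
    return re.findall(r"[a-z']+", normalized)


def keyword_alert(text):
    """Fire whenever a trigger word appears, ignoring sentence structure."""
    return any(tok in TRIGGER_WORDS for tok in tokenize(text))


def negation_aware_alert(text, window=8):
    """Fire only if some trigger word has no negator in the preceding window."""
    tokens = tokenize(text)
    for i, tok in enumerate(tokens):
        if tok in TRIGGER_WORDS:
            preceding = tokens[max(0, i - window):i]
            if not any(p in NEGATORS for p in preceding):
                return True
    return False
```

With these definitions, `keyword_alert` fires on both tweets, while `negation_aware_alert` fires only on the first one. A fixed window like this breaks easily on longer or more complex sentences, which is why an approach based on real syntactic structure is needed.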

At Bitext we have put our technology to work on this problem: we use our grammars, which include lexicon and syntax, to train deep learning tools and speed up the pattern recognition process.

If you want to know how this training works, more posts on the subject will follow soon! For now, get to know how this applies to chatbots built with Machine Learning.

 
