What is the difference between stemming and lemmatization?

Apr 5, 2021 4:35:03 PM / by Bitext posted in Machine Learning, NLP, Bitext, Natural Language, Text Analytics, Artificial Intelligence, Deep Learning, Chatbots, Stemming, AI, Multilanguage, Lemmatization, NLP for Core, NLP for Chatbots, Conversational AI

When we run a search, we want to find relevant results not only for the exact expression we typed in the search bar, but also for the other possible forms of the words we used. For example, it’s very likely we will want to see results containing the form “skirt” if we have typed “skirts” in the search bar.
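The “skirt” / “skirts” case can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `naive_stem`, `lemmatize`, `matches` and the tiny `LEMMAS` dictionary are hypothetical names, not part of any Bitext product or real library, and real stemmers and lemmatizers cover far more cases.

```python
# Toy illustration only: real stemmers and lemmatizers are far more complete.

# Hypothetical mini-lexicon mapping inflected forms to their lemmas.
LEMMAS = {"skirts": "skirt", "mice": "mouse", "ran": "run"}


def naive_stem(word: str) -> str:
    """Crude stand-in for a stemmer: strip a plural-looking suffix."""
    word = word.lower()
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def lemmatize(word: str) -> str:
    """Crude stand-in for a lemmatizer: look the form up in a lexicon."""
    word = word.lower()
    return LEMMAS.get(word, word)


def matches(query: str, document: str, normalize) -> bool:
    """Return True if every normalized query term occurs in the document."""
    query_terms = {normalize(term) for term in query.split()}
    document_terms = {normalize(term) for term in document.split()}
    return query_terms <= document_terms


# A query for "skirts" finds a document that only contains "skirt".
print(matches("skirts", "red skirt on sale", naive_stem))
print(matches("skirts", "red skirt on sale", lemmatize))
```

The suffix rule cannot relate irregular forms such as “mice” and “mouse”, while the lexicon lookup can, as long as the form is listed in the lexicon.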

Siri Speaking Arabic: What Is Failing?

Apr 1, 2021 2:04:30 AM / by Bitext posted in NLP for Core

Almost three years after Apple launched its well-known voice assistant Siri for the Arabic language, there is still room for improvement. Siri can currently understand more than 20 languages and dialects, but when it comes to Arabic, its abilities are not good enough to fully understand what users need. Frequent utterance errors, together with poor comprehension, are quite frustrating for Arabic speakers. What’s going wrong here?

Noisy text is realistic text

Feb 24, 2020 4:45:00 PM / by Bitext posted in API, Machine Learning, NLP, Big Data, Bitext, Deep Linguistic Analysis, Natural Language, Text Analytics, Artificial Intelligence, Deep Learning, NLG, NLU, Query Rewriting, AI, Multilanguage, NLP for Core, NLP for Chatbots, NLP for CX, "Multilingual synthetic data"

One of the flaws of the usual way of generating training data is that, when you ask somebody to create it manually, they will make an effort to write the sentences correctly, following the spelling and punctuation norms of the language. Even if some errors appear, they will be minimal, because the writers are trying to do things right, that is, to provide “orthographically right” sentences.

Linguistic Resources in +100 Languages & Variants

Feb 11, 2020 2:55:24 PM / by Bitext posted in API, Machine Learning, NLP, Big Data, Bitext, Deep Linguistic Analysis, Natural Language, Text Analytics, Artificial Intelligence, Deep Learning, NLG, Stemming, NLU, AI, Multilanguage, Language Identification, Decompounding, Lemmatization, NLP for Core, Finance, Banking

All Machine Learning (ML) engines that work with text can benefit from a solid linguistic background. If they are working in a multilingual environment, the need for a good lexicon (with forms, lemmas and attributes) is overwhelming. Even basic features such as Word Embeddings improve hugely when enriched with linguistic knowledge; if this is not usually done, it is because of a lack of linguists working for ML companies.

Has the bot revolution failed?

Dec 17, 2019 6:04:16 PM / by Bitext posted in Machine Learning, NLP, Big Data, Bitext, Deep Linguistic Analysis, Natural Language, Text Analytics, Artificial Intelligence, Deep Learning, POS tagging, AI, Multilanguage, NLP for Core, NLP for Chatbots, "Multilingual synthetic data"

The company CB Insights has recently published a document named “Lessons From The Failed Chatbot Revolution”. This ominous title reveals a hard truth: chatbots have not been the revolution we expected.

Amazon re:Invent: the Age of Data

Dec 12, 2019 7:00:00 PM / by Bitext posted in Machine Learning, NLP, Sentiment Analysis, Big Data, Bitext, Deep Linguistic Analysis, Natural Language, Text Analytics, Artificial Intelligence, Deep Learning, NLG, POS tagging, AI, Multilanguage, NLP for Core, NLP for Chatbots, "Multilingual synthetic data"

A few days ago, Amazon Web Services organized AWS re:Invent, one of the world’s biggest IT events, focusing on everything Amazon has to offer. Among the many new features announced, some were very interesting for us.
