
What is natural language processing?

Posted January 27, 2021

Let’s start with a brief hypothetical: say you want to set an alarm on your iPhone. You launch Siri and tell it to set an alarm for tomorrow. Siri responds, “For what time?”, and you specify 9 a.m. Voilà, the alarm is set.

In this short interaction, you activated a device, which heard your speech, processed it, executed an action and responded with a sentence. This entire exchange was made possible by natural language processing (NLP), the technology that underlies any machine or program’s ability to process human language. It powers familiar voice assistants like Siri and Alexa, as well as chatbots in messaging apps.

What is natural language processing?

Natural language processing is the umbrella term for a machine’s ability to recognize what is said to it, understand its meaning, determine the appropriate action and respond in a language the user will understand. NLP matters across geographies and industries, and non-English languages have an important role in the future of the technology. Training on data from diverse languages and dialects is a good way to counteract biases and improve machine learning (ML) technology overall.

Natural language processing terms to know

Natural language understanding (NLU) is a subset of natural language processing. It goes beyond basic sentence structure and attempts to understand the intended meaning of language. Human speech is peppered with nuances, subtleties, mispronunciations and colloquialisms, and natural language understanding is designed to tackle these complexities. Where natural language processing covers the whole exchange, natural language understanding deals with the narrower question of how best to handle unstructured inputs and convert them into a structured form that a machine can understand and act upon. Moving from natural language processing to natural language understanding is one of the main areas of research in the field.

Finally, natural language generation (NLG) is the process by which a machine produces text of its own. In the example above, Siri’s response, “For what time?”, is a demonstration of natural language generation.

How does natural language processing work?

Let’s return to the example of asking Siri to set an alarm. At a very basic level, these are the steps the system followed:

  • You ask Siri to set an alarm.
  • Siri converts your audio speech to text.
  • Siri converts this plain text request into commands for itself, using natural language processing to turn text into structured data.
  • Siri processes this data in a decision engine.
  • Siri responds to you by asking “For what time?” using natural language generation to turn structured data into text.
  • You specify 9 a.m., which is then handled by natural language processing and directed into the decision engine.
  • Siri sets the alarm for you.
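
The steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how Siri is actually implemented: the speech-to-text step is skipped (the input is already text), and the names `Intent`, `understand`, `decide` and `generate` are hypothetical helpers invented for this example.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Intent:
    """Structured form of a request (the output of the understanding step)."""
    action: str
    time: Optional[str] = None  # e.g. "9:00", or None if no time was given


def understand(text: str, pending: Optional[Intent] = None) -> Intent:
    """Toy language understanding: turn plain text into structured data."""
    lowered = text.lower()
    # Look for a time such as "9 a.m." or "9:30 pm".
    match = re.search(r"\b(\d{1,2})(?::(\d{2}))?\s*([ap])\.?m\b", lowered)
    time = None
    if match:
        hour = int(match.group(1)) % 12
        if match.group(3) == "p":
            hour += 12
        time = f"{hour}:{match.group(2) or '00'}"
    if "alarm" in lowered:
        return Intent(action="set_alarm", time=time)
    if pending is not None and time is not None:
        # A follow-up answer fills in the missing time of the pending request.
        return Intent(action=pending.action, time=time)
    return Intent(action="unknown", time=time)


def decide(intent: Intent, alarms: list) -> dict:
    """Toy decision engine: act on the structured data or ask for more."""
    if intent.action == "set_alarm" and intent.time is None:
        return {"type": "ask_time"}
    if intent.action == "set_alarm":
        alarms.append(intent.time)
        return {"type": "confirm", "time": intent.time}
    return {"type": "not_understood"}


def generate(result: dict) -> str:
    """Toy language generation: turn structured data back into a sentence."""
    if result["type"] == "ask_time":
        return "For what time?"
    if result["type"] == "confirm":
        return f"Your alarm is set for {result['time']}."
    return "Sorry, I didn't catch that."


alarms = []
first = understand("Set an alarm for tomorrow")
print(generate(decide(first, alarms)))   # For what time?
second = understand("9 a.m.", pending=first)
print(generate(decide(second, alarms)))  # Your alarm is set for 9:00.
```

A production assistant would replace each rule-based function with trained models, but the flow from text to structured data, to a decision, and back to text is the same.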

Data annotation for natural language processing

How are natural language processing systems built? The following are a few ways to break down and organize data so that you can train your program to improve its natural language processing.

Entity annotation: Entity annotation refers to the practice of extracting units of information from sentences or other unstructured data and giving them structure. These units can include names of people, organizations and locations, as well as other proper nouns. Entity annotation can also be used to identify numeric expressions such as times, dates, monetary amounts and percentages.
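
To make the idea concrete, here is a minimal sketch of what entity annotations can look like as data: each one records a label and the character offsets of the matching text. The label names and regular expressions are assumptions chosen for illustration, and they cover only a few numeric expressions. Names of people, organizations and locations are far harder to capture with rules, so they are typically labeled by human annotators or trained models.

```python
import re

# Hypothetical label set and patterns, for illustration only.
PATTERNS = {
    "TIME": r"\b\d{1,2}(?::\d{2})?\s*[ap]\.?m\.?",
    "MONEY": r"\$\d+(?:\.\d{2})?",
    "PERCENT": r"\b\d+(?:\.\d+)?%",
}


def annotate(text):
    """Return entity annotations as dicts with character offsets."""
    entities = []
    for label, pattern in PATTERNS.items():
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            entities.append(
                {"start": m.start(), "end": m.end(), "label": label, "text": m.group()}
            )
    # Sort by position so the annotations read left to right.
    return sorted(entities, key=lambda e: e["start"])


sentence = "The sale starts at 9 a.m. and takes 20% off the $49.99 plan."
for entity in annotate(sentence):
    print(entity["label"], repr(entity["text"]))
```

Storing offsets rather than just the matched strings lets a training pipeline map every label back to its exact position in the original sentence.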

Semantic annotation: Semantic annotation helps assess search results. Companies want to improve their search relevance so that customers can actually find their products in search engines. The problem is that most product descriptions vary greatly depending on the source and are often inaccurate. Semantic annotation helps improve search results by tagging product titles and search queries. At TELUS International, we can build datasets to help you predict which categories best fit a given product, making eCommerce processes and product classification easier, faster and more reliable.
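
As a rough sketch of how tagged product titles can feed category prediction, the example below counts which words appear under each category and picks the best-matching category for a new title. The sample data, the category names and the `train` and `predict` helpers are all hypothetical, and word overlap stands in for the statistical models a real system would use.

```python
from collections import Counter, defaultdict

# Hypothetical annotated examples: (product title, category label).
ANNOTATED = [
    ("wireless bluetooth headphones", "electronics"),
    ("usb-c charging cable", "electronics"),
    ("stainless steel frying pan", "kitchen"),
    ("ceramic coffee mug", "kitchen"),
]


def train(examples):
    """Count how often each word appears under each category."""
    counts = defaultdict(Counter)
    for title, category in examples:
        counts[category].update(title.lower().split())
    return counts


def predict(counts, title):
    """Pick the category whose annotated titles share the most words."""
    words = title.lower().split()
    scores = {cat: sum(c[w] for w in words) for cat, c in counts.items()}
    best = max(scores, key=scores.get)
    # Return None when no word in the title was seen during training.
    return best if scores[best] > 0 else None


model = train(ANNOTATED)
print(predict(model, "Bluetooth speaker with charging cable"))  # electronics
print(predict(model, "Large coffee mug"))                       # kitchen
```

The quality of predictions like these depends directly on how many titles are annotated and how consistently the labels are applied.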

Linguistic annotation: Linguistic annotation refers to the practice of assessing the subject of any given sentence. It’s a broad category, but essentially it covers anything to do with the analysis of text, whether that is sentiment analysis of social media data or using natural language processing to answer routine questions.

What is natural language processing used for?

Natural language processing can be used in a variety of cases, such as the following:

  • Voice assistants: As described above, voice assistants like Siri and Alexa are powered by natural language processing.
  • Chatbots: Since chatbots mimic real conversations, they rely heavily on natural language processing.
  • Customer service: Many companies transcribe and analyze customer call recordings. Natural language processing helps analyze this data and enables you to respond to customer needs faster.
  • Sentiment analysis: Natural language processing is used to determine the tone of a given piece of writing. This is very useful for companies looking to understand how their product or service is received on social media.
  • Healthcare: Natural language processing has huge implications for healthcare. This includes healthcare assistants that are similar to Siri but specially trained on medical terminology, as well as image classification to understand a medical scan and provide diagnosis and treatment options.

Don’t stop now that you’ve learned the basics of natural language processing — keep going. Browse our range of AI Data Solutions today.
