2023-01-20

What is NLP

Natural language is the language that humans use in everyday life, such as spoken and written language. Natural language contains ambiguities and overlaps in meaning that can be interpreted differently depending on the context, as in the following examples.

  • "dog ate a bone" and "bone dog a ate"
    • The same words appear with the same frequency in both sentences, but because of the word order only the first sentence is meaningful.
  • "Jack saw Ben with a telescope on a mountain."
    • Is it Jack or Ben with a telescope?
    • Who is on the mountain?
  • "I went to the bank."
    • The word "bank" can refer not only to a financial institution but also to a river bank.
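
The first example can be checked with a few lines of Python. This is a minimal sketch, assuming simple whitespace tokenization; the helper name `word_counts` is introduced here only for illustration. It suggests why counting words alone is not enough: the two sentences have identical word counts and differ only in word order.

```python
from collections import Counter

def word_counts(sentence: str) -> Counter:
    """Count how often each whitespace-separated word occurs."""
    return Counter(sentence.lower().split())

meaningful = "dog ate a bone"
scrambled = "bone dog a ate"

# Same words with the same frequencies ...
same_counts = word_counts(meaningful) == word_counts(scrambled)
# ... but a different word order.
same_order = meaningful.split() == scrambled.split()

print(same_counts)  # True
print(same_order)   # False
```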

Natural Language Processing (NLP) is a set of computational techniques for analyzing the ambiguous and complex language that humans use.

NLP terminology

Key terms in NLP are listed in the table below.

Term | Meaning | Example
Corpus | A set of documents | The sentences on all pages of Wikipedia
Document | A single text in the corpus | The sentences on the "word2vec" page on Wikipedia
Sentence | A sentence in a document | First sentence of the Document ("Word2vec is a group of related models that are used to produce word embeddings.")
Phrase | A phrase within a sentence | First clause of the Sentence ("Word2vec is a group of related models")
Token | A word | First word of the Phrase ("Word2vec")
Character | A single character | First character of the Token ("W")
Vocabulary | The set of unique tokens | The collection of unique Tokens that appear in a Corpus
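
To make these terms concrete, the sketch below walks from a corpus down to a single character. It is only an illustration: the two-document corpus is made up, and the helpers `split_sentences` and `tokenize` are naive assumptions (splitting on periods and whitespace), not how a production tokenizer works. Phrases are skipped because extracting them needs a parser.

```python
# A tiny, made-up corpus: each string is one document.
corpus = [
    "Word2vec is a group of related models. They produce word embeddings.",
    "NLP analyzes natural language. Natural language is ambiguous.",
]

def split_sentences(document):
    """Naive sentence splitter: break on periods (an assumption for this sketch)."""
    return [part.strip() for part in document.split(".") if part.strip()]

def tokenize(sentence):
    """Naive tokenizer: split on whitespace."""
    return sentence.split()

document = corpus[0]                      # one document from the corpus
sentence = split_sentences(document)[0]   # first sentence of the document
tokens = tokenize(sentence)               # tokens of that sentence
token = tokens[0]                         # first token
character = token[0]                      # first character of the token

# Vocabulary: the set of unique (lowercased) tokens in the whole corpus.
vocabulary = {
    tok.lower()
    for doc in corpus
    for sent in split_sentences(doc)
    for tok in tokenize(sent)
}

print(sentence)   # Word2vec is a group of related models
print(token)      # Word2vec
print(character)  # W
print(sorted(vocabulary))
```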

Process of NLP

NLP typically proceeds through four main stages:

  • Morphological analysis / lexical analysis
  • Syntax analysis
  • Semantic analysis
  • Pragmatic analysis

Morphological analysis

Morphological analysis is the process of breaking a sentence down into morphemes, the smallest units that carry meaning, and assigning information such as the part of speech to each one. This makes the meaning of each morpheme in a sentence available as data.

For example, the sentence "Jack saw Ben with a telescope on a mountain." is analyzed as follows.

Original: Jack saw Ben with a telescope on a mountain.
Morphological analysis: Jack (noun) | saw (verb) | Ben (noun) | with (preposition) | a (article) | telescope (noun) | on (preposition) | a (article) | mountain (noun)
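
A toy version of this step can be sketched with a dictionary lookup. The lexicon `POS_LEXICON` and the function `tag` are hypothetical and written by hand for this one sentence; real morphological analyzers rely on large dictionaries or trained models and must also resolve words such as "saw", which can be a noun or a verb. Here "a" is tagged as an article.

```python
# Hypothetical hand-written lexicon covering only the example sentence.
POS_LEXICON = {
    "jack": "noun",
    "ben": "noun",
    "saw": "verb",
    "with": "preposition",
    "on": "preposition",
    "a": "article",
    "telescope": "noun",
    "mountain": "noun",
}

def tag(sentence):
    """Split a sentence on whitespace and look up each token's part of speech."""
    tokens = sentence.rstrip(".").split()
    return [(tok, POS_LEXICON.get(tok.lower(), "unknown")) for tok in tokens]

tagged = tag("Jack saw Ben with a telescope on a mountain.")
print(" | ".join(f"{tok} ({pos})" for tok, pos in tagged))
```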

Syntax analysis

Syntax analysis is the process of determining the structure of a sentence based on the results of morphological analysis.

After morphological analysis of "Jack saw Ben with a telescope on a mountain," syntax analysis can produce the following candidate structures.

  • Jack saw | Ben with a telescope on a mountain
  • Jack saw | Ben with a telescope | on a mountain
  • Jack saw Ben with a telescope | on a mountain

In terms of syntax, all three structures are valid.
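
The ambiguity can also be demonstrated mechanically. The following sketch uses a CKY-style chart to count how many parse trees a small grammar allows for the sentence. The grammar (`LEXICON`, `BINARY_RULES`) is an assumption written for this example, not a grammar of English; under it, each prepositional phrase may attach to the verb or to a preceding noun, so the count comes out higher than the three groupings listed above.

```python
from collections import defaultdict

# Toy lexicon (assumption): the category of each word in the example.
LEXICON = {
    "jack": ["NP"], "ben": ["NP"],
    "saw": ["V"],
    "with": ["P"], "on": ["P"],
    "a": ["Det"],
    "telescope": ["N"], "mountain": ["N"],
}

# Toy binary rules (assumption), written as (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # prepositional phrase modifies the verb
    ("NP", "NP", "PP"),   # prepositional phrase modifies a noun
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]

def count_parses(tokens):
    """Count the parse trees rooted in S using a CKY-style chart."""
    n = len(tokens)
    # chart[(i, j)][label] = number of ways tokens[i:j] can form `label`
    chart = defaultdict(lambda: defaultdict(int))
    for i, tok in enumerate(tokens):
        for label in LEXICON[tok.lower()]:
            chart[(i, i + 1)][label] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += chart[(i, k)][left] * chart[(k, j)][right]
    return chart[(0, n)]["S"]

tokens = "Jack saw Ben with a telescope on a mountain".split()
print(count_parses(tokens))  # 5
```

A syntax analyzer on its own cannot choose among these trees; that decision is left to the later stages.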

Semantic analysis

Semantic analysis determines the relationships between words based on the results of syntax analysis. Suppose we have the following sentence.

  • Green | shining | aurora | and | stars | are | beautiful

A human reader immediately understands from this sentence that the aurora glows green. Grammatically, however, it can also be read as saying that both the aurora and the stars glow green.

By consulting a dictionary to check the relationships between words, semantic analysis finds that an aurora is often described as glowing green, whereas stars rarely are. The system can therefore conclude that, in the sentence above, only the aurora glows green.
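
The dictionary lookup described above can be sketched as follows. The knowledge base `TYPICAL_MODIFIERS` and the function `plausible_targets` are hypothetical and hand-written for this sentence; a real system would derive such associations from a lexical resource or from corpus statistics.

```python
# Hypothetical mini knowledge base: modifiers commonly used with each noun.
TYPICAL_MODIFIERS = {
    "aurora": {"green", "shining"},
    "stars": {"shining"},
}

def plausible_targets(modifier, candidate_nouns):
    """Keep only the nouns that the knowledge base associates with the modifier."""
    return [
        noun for noun in candidate_nouns
        if modifier in TYPICAL_MODIFIERS.get(noun, set())
    ]

# Syntax analysis leaves two candidates for what "green" modifies.
candidates = ["aurora", "stars"]
green_targets = plausible_targets("green", candidates)
print(green_targets)  # ['aurora']
```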

Pragmatic analysis

Pragmatic analysis is the process of analyzing the relationships between sentences by applying morphological and semantic analysis across multiple sentences. Because it requires machines to learn knowledge from many different domains, it is still a developing field.

Examples of NLP applications

NLP has the following applications:

  • Text mining
    • Social media (SNS) analysis
    • Survey analysis
  • Dialogue systems
    • Siri
    • Alexa
    • Google Home
  • Machine translation
    • DeepL
    • Google Translate
  • Search engines
    • Google
    • Yahoo
  • Spam detection
  • Document summarization


Ryusei Kakujo
