N-gram

An N-gram is a contiguous sequence of n items, typically words, taken from a text or speech sample and processed as a unit. The term is common in text analytics but is also used more broadly, where the items may be characters, syllables, phonemes, or any other kind of token, not just words.

Origin of the Term

The idea behind N-grams is usually traced to Claude Shannon, an American mathematician whose work on information theory in the late 1940s and early 1950s modeled English using the statistics of short sequences of letters and words. The approach was later popularized by Frederick Jelinek, a Czech-American researcher whose speech recognition group at IBM used N-gram models to predict the next word in a sentence from the words before it.

Applications Using N-grams

N-grams are used in fields such as natural language processing, computational linguistics, and speech recognition.

N-gram Order

N-grams can be unigrams (single items), bigrams (pairs of items), trigrams (triplets of items), or higher-order N-grams. For example:

  • unigram: “dog”
  • bigram: “dog food”
  • trigram: “dog food bowl”
  • 4-gram: “dog food bowl dish”
  • 5-gram: “dog food bowl dish table”

N-grams can be of any length. The number of items in an N-gram is called its order, so a bigram has order 2 and a 5-gram has order 5.
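To make the idea of order concrete, here is a minimal Python sketch that slides a window of size n over a list of tokens and collects every contiguous N-gram. The function name `ngrams`, the whitespace tokenization, and the sample sentence are assumptions chosen for illustration; a real pipeline would typically use a proper tokenizer.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every contiguous n-gram of order n as a tuple of tokens."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # A sequence of k tokens contains k - n + 1 n-grams (none if k < n).
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Naive whitespace tokenization, used here only to keep the example short.
tokens = "the dog ate the dog food".split()

unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

# Counting how often each bigram occurs is the basis for most n-gram statistics.
bigram_counts = Counter(bigrams)
print(bigram_counts.most_common(1))
```

Changing the second argument is all it takes to move between orders, which is why the same counting code can serve unigram, bigram, and higher-order analyses.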

Applications for N-grams

Some common applications for N-grams include:

  • Information retrieval: N-grams are often used to index and retrieve documents that contain a given sequence of items.
  • Statistical modeling: N-grams are often used in statistical models, such as probabilistic language models, which are used to predict the likelihood of a sequence of items occurring.
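  • Text generation: N-gram models can generate text by repeatedly choosing the next item based on the items that precede it.
  • Frequency analysis: Tools such as the Google Books Ngram Viewer chart how often N-grams appear in a corpus of books over time.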
  • Word embeddings: Character N-grams can be used to build word embeddings, which are vector representations of words that capture the context in which they occur. fastText, for example, represents each word through its character N-grams.
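As a rough illustration of the statistical modeling and text generation items above, the following Python sketch builds a bigram language model from a toy corpus, estimates the probability of one word following another, and samples a short sentence. The function names, the `<s>` and `</s>` boundary markers, and the two-sentence corpus are all assumptions made for this example; the estimate is a plain maximum-likelihood count ratio with no smoothing, which real systems would normally add.

```python
import random
from collections import Counter, defaultdict

def train_bigram_model(sentences):
    """Count, for each word, how often each other word follows it."""
    follower_counts = defaultdict(Counter)
    for sentence in sentences:
        # Boundary markers let the model learn how sentences start and end.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            follower_counts[prev][nxt] += 1
    return follower_counts

def bigram_probability(model, prev, nxt):
    """Unsmoothed estimate of P(nxt | prev); 0.0 if prev was never seen."""
    followers = model.get(prev)
    if not followers:
        return 0.0
    return followers[nxt] / sum(followers.values())

def generate(model, rng, max_words=10):
    """Sample words one at a time, each conditioned on the previous word."""
    word, output = "<s>", []
    for _ in range(max_words):
        followers = model.get(word)
        if not followers:
            break
        candidates = list(followers)
        weights = list(followers.values())
        word = rng.choices(candidates, weights=weights)[0]
        if word == "</s>":
            break
        output.append(word)
    return " ".join(output)

corpus = ["the dog ate the food", "the dog slept"]
model = train_bigram_model(corpus)

# "the" occurs three times as a context and is followed by "dog" twice.
print(bigram_probability(model, "the", "dog"))

# A seeded generator keeps the sampled sentence reproducible between runs.
print(generate(model, random.Random(0)))
```

A higher-order model works the same way but conditions on the previous n - 1 words instead of one, at the cost of needing far more training text to see each context often enough.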

Across all of these fields, N-grams are a simple and widely used way to capture local context in sequential data.


© 2023 VeritasNLP, All Rights Reserved. Website designed by Mohit Ranpura.