Recent Blogs.

We write regularly about the terminology and jargon you’ll hear in the text analytics industry. Join us on our blog as we make the complex simple.

LatentView

LatentView is a tool that is used to automatically analyze unstructured data, such as text documents. It can be used to extract…

June 16, 2022

Ligature

Ligature is a term used to refer to the process of combining two or more adjacent characters into a single character. This…

June 15, 2022

Consistency

Consistency is a measure of how well annotations of the same data agree with one another. For instance, if two distinct annotators identify the…

June 14, 2022

Perplexity

Perplexity is a measure of how well a probability model predicts a sample. It is often used in the text analytics industry…

June 13, 2022
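
As a rough illustration of the definition above, perplexity can be computed as the exponential of the average negative log-probability that a model assigns to each token in a sample. The sketch below assumes the per-token probabilities are already available as a list; the function name `perplexity` and its input format are hypothetical and do not come from the post.

```python
import math


def perplexity(token_probabilities):
    """Return exp(average negative log-probability) over the sample.

    Assumption: token_probabilities is a non-empty list of the
    probabilities (each > 0) a model assigned to the observed tokens.
    """
    n = len(token_probabilities)
    avg_neg_log = -sum(math.log(p) for p in token_probabilities) / n
    return math.exp(avg_neg_log)


# A model that gives every token probability 0.25 is, in effect,
# choosing uniformly among four options at each step.
print(perplexity([0.25, 0.25, 0.25, 0.25]))
```

Lower values mean the model found the sample less surprising; a model that assigns probability 1 to every token has a perplexity of 1.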

Stemming

Stemming is the process of reducing inflected (or sometimes derived) words to their stem, base or root form—generally a written word form.…

June 12, 2022

Ingestion

Ingestion is the process of acquiring unstructured text data from a variety of sources in order to be able to perform further…

June 11, 2022

Trigram

Trigram is a term used to refer to a group of three successive words. In this context, it is often used as…

June 10, 2022
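
To make the definition above concrete, here is a minimal sketch that lists the word trigrams in a sentence. It assumes words are separated by whitespace, which is a simplification; the function name `word_trigrams` is hypothetical.

```python
def word_trigrams(text):
    """Return every run of three successive whitespace-separated words."""
    words = text.split()
    # A text with fewer than three words yields an empty list.
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


for trigram in word_trigrams("we make the complex simple"):
    print(trigram)
```

A sentence of n words yields n - 2 trigrams, so the five-word example produces three.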

Natural Language Processing Tools

Natural Language Processing Tools is a term used to describe software that can automatically process and analyze large amounts of natural language…

June 9, 2022

Univariate Analysis

Univariate analysis is used to understand each piece of data within a dataset on its own. This means that each variable is…

June 8, 2022

Culling

Culling is the process of removing unhelpful or uninformative content from a document or set of documents before further processing. This may…

June 7, 2022

Geometric Distribution

The geometric distribution models how many independent trials occur before a first success, such as how many words are read before a given word first appears in a document. For example, if we…

June 4, 2022

Grapheme

Grapheme is a term used to refer to the smallest functional unit of a writing system. This may be a letter, a…