Recent Blogs.
We write regularly about the terminology and jargon you’ll hear in the text analytics industry. Join us on our blog as we make the complex simple.
LatentView
LatentView is a tool that is used to automatically analyze unstructured data, such as text documents. It can be used to extract…
June 16, 2022
Ligature
A ligature is a single glyph formed by combining two or more adjacent characters. This…
June 15, 2022
Consistency
Consistency is a measure of how well annotations of the same data agree with one another. For instance, if two different annotators identify the…
June 14, 2022
Perplexity
Perplexity is a measure of how well a probability model predicts a sample. It is often used in the text analytics industry…
June 13, 2022
Stemming
Stemming is the process of reducing inflected (or sometimes derived) words to their stem, base or root form—generally a written word form.…
June 12, 2022
Ingestion
Ingestion is the process of acquiring unstructured text data from a variety of sources in order to be able to perform further…
June 11, 2022
Trigram
Trigram is a term used to refer to a group of three successive words. In this context, it is often used as…
June 10, 2022
Natural Language Processing Tools
Natural Language Processing Tools is a term used to describe software that can automatically process and analyze large amounts of natural language…
June 9, 2022
Univariate Analysis
Univariate analysis is used to understand each piece of data within a dataset on its own. This means that each variable is…
June 8, 2022
Culling
Culling is the process of removing unhelpful or uninformative content from a document or set of documents before further processing. This may…
June 7, 2022
Geometric Distribution
The geometric distribution models the number of trials needed to reach the first success. In text analytics, it can describe how many words or documents are examined before a given word first appears. For example, if we…
June 4, 2022
Grapheme
A grapheme is the smallest functional unit of a writing system. This may be a letter, a…
June 4, 2022