Recent Blogs
We write regularly about the terminology and jargon you’ll hear in the text analytics industry. Join us on our blog as we make the complex simple.
Topic word probabilities
Topic word probabilities, also called topic-specific word probabilities, are a component of a topic model in which words are assigned probabilities according to the…
July 7, 2023
Topic mixture
The term topic mixture describes the different ways in which a topic or subject matter can be blended together.…
July 6, 2023
Topic concentration
Topic concentration is a statistical measure of the degree to which a set of documents shares common…
July 5, 2023
Tokenized document
A tokenized document is a text file that has been processed by a tokenizer, which is a software program that breaks up…
July 4, 2023
Thread (Email)
An email thread is a set of email messages with the same subject that are grouped together. It can be useful…
July 3, 2023
Theme extraction
Theme extraction is the process of identifying and extracting the main themes from a text document. This can be done automatically using…
July 2, 2023
Text-based scoring
Text-based scoring is a method of assessing the meaning of unstructured text data through the use of statistical and natural…
July 1, 2023
Text segmentation
Text segmentation is the process of dividing a text into smaller parts, or segments. The purpose of text segmentation is to make…
June 30, 2023
Text scatter plot
A text scatter plot is a graphical representation of a corpus of texts where each text is represented by a point. The…
June 29, 2023
Text analysis engine
A text analysis engine is a system that performs text analytics on a given body of text. The term “text analytics” itself…
June 28, 2023
Terminological Extraction
Terminological extraction is sometimes referred to as chunking. Chunking is a process of extracting small pieces of information from a larger piece…
June 27, 2023
Term Frequency–Inverse Document Frequency (tf-idf) matrix
The Term Frequency–Inverse Document Frequency (tf-idf) matrix stores a statistical measure used to evaluate how important a word is to a document…
June 26, 2023