Topic word probabilities
Topic word probabilities, also called topic-specific word probabilities, are the output of a topic model in which each word is assigned a probability according to the […]
Topic mixture
The term topic mixture describes the different ways in which topics or subject matter can be blended together. […]
Topic concentration
Topic concentration is a statistical measure of the degree to which a set of documents shares common […]
Tokenized document
A tokenized document is a text file that has been processed by a tokenizer, a software program that breaks up […]
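As a rough illustration of what a tokenizer does, the sketch below splits a string into lowercase word tokens with a regular expression. The function name `tokenize` and the token pattern are assumptions made for this example; real tokenizers handle punctuation, hyphens, and non-English text in many different ways.

```python
import re

def tokenize(text):
    """Lowercase the text and return its word tokens.

    Hypothetical helper: a token here is any run of ASCII letters,
    digits, or apostrophes. Everything else acts as a separator.
    """
    return re.findall(r"[a-z0-9']+", text.lower())

tokens = tokenize("The tokenizer breaks up text, doesn't it?")
print(tokens)
# ['the', 'tokenizer', 'breaks', 'up', 'text', "doesn't", 'it']
```

The resulting list of tokens is what a "tokenized document" would hold in this simplified view.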
Thread (Email)
A thread (email) is a set of email messages with the same subject that are grouped together. It can be useful […]
Theme extraction
Theme extraction is the process of identifying and extracting the main themes from a text document. This can be done automatically using […]
Text-based scoring
Text-based scoring is defined as a method of assessing the meaning of unstructured text data through the use of statistical and natural […]
Text segmentation
Text segmentation is the process of dividing a text into smaller parts, or segments. The purpose of text segmentation is to make […]
Text scatter plot
A text scatter plot is a graphical representation of a corpus of texts where each text is represented by a point. The […]
Text analysis engine
A text analysis engine is a system that performs text analytics on a given body of text. The term “text analytics” itself […]
Terminological extraction
Terminological extraction is also referred to as chunking. Chunking is a process of extracting small pieces of information from a larger piece […]
Term Frequency–Inverse Document Frequency (tf-idf) matrix
The Term Frequency–Inverse Document Frequency (tf-idf) matrix holds tf-idf scores, a statistical measure used to evaluate how important a word is to a document […]
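To make the idea of a tf-idf matrix concrete, here is a minimal sketch that builds one from already-tokenized documents. It assumes one common weighting, term frequency as count divided by document length and idf as log(N / document frequency). Other variants (smoothing, normalization) are widely used, and the function name `tfidf_matrix` is hypothetical.

```python
import math
from collections import Counter

def tfidf_matrix(docs):
    """Build a tf-idf matrix from a list of token lists.

    Returns (vocab, matrix) where matrix[i][j] is the tf-idf score of
    vocab[j] in docs[i]. Assumed weighting (one variant among several):
    tf = count / document length, idf = log(N / document frequency).
    """
    vocab = sorted({token for doc in docs for token in doc})
    n_docs = len(docs)
    # Document frequency: the number of documents containing each term.
    doc_freq = Counter(token for doc in docs for token in set(doc))
    idf = {term: math.log(n_docs / doc_freq[term]) for term in vocab}

    matrix = []
    for doc in docs:
        counts = Counter(doc)
        length = len(doc) or 1  # avoid dividing by zero for empty documents
        matrix.append([counts[term] / length * idf[term] for term in vocab])
    return vocab, matrix

vocab, matrix = tfidf_matrix([["a", "b"], ["a", "c"]])
print(vocab)
print(matrix)
```

In this toy input the term "a" appears in every document, so its idf is log(1) = 0 and it scores zero everywhere, while "b" and "c" each receive a positive score only in the document that contains them.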