Interface | Description |
---|---|
AnchorText | The anchor text is the visible, clickable text in a hyperlink. |
Corpus | A corpus is a collection of documents. |
TextTerms | The terms in a text. |

Class | Description |
---|---|
Bigram | Bigrams or digrams are groups of two words, and are very commonly used as the basis for simple statistical analysis of text. |
NGram | An n-gram is a contiguous sequence of n words from a given sequence of text. |
SimpleCorpus | An in-memory text corpus. |
SimpleText | A list-of-words representation of documents. |
Text | A minimal interface of text in the corpus. |
Trie<K,V> | A trie, also called digital tree or prefix tree, is an ordered tree data structure that is used to store a dynamic set or associative array where the keys are usually strings. |
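
The n-gram and trie descriptions above can be illustrated with a small, self-contained sketch. It does not use the classes listed in this package: `NlpSketch`, `ngrams`, and `SimpleTrie` are hypothetical names introduced only for illustration, and the trie here assumes `String` keys split into characters, which may differ from how `Trie<K,V>` is keyed. A bigram is simply the case n = 2.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; these are NOT the package's own classes.
public class NlpSketch {

    /** Returns every contiguous run of n words, in order of appearance. */
    static List<List<String>> ngrams(List<String> words, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<List<String>> result = new ArrayList<>();
        // The last valid start index is words.size() - n.
        for (int i = 0; i + n <= words.size(); i++) {
            result.add(new ArrayList<>(words.subList(i, i + n)));
        }
        return result;
    }

    /** A minimal prefix tree mapping string keys to values (assumed design). */
    static class SimpleTrie<V> {
        private final Map<Character, SimpleTrie<V>> children = new HashMap<>();
        private V value; // null when no key ends at this node

        void put(String key, V v) {
            SimpleTrie<V> node = this;
            // Walk down one character at a time, creating nodes as needed.
            for (char c : key.toCharArray()) {
                node = node.children.computeIfAbsent(c, k -> new SimpleTrie<>());
            }
            node.value = v;
        }

        V get(String key) {
            SimpleTrie<V> node = this;
            for (char c : key.toCharArray()) {
                node = node.children.get(c);
                if (node == null) {
                    return null; // no stored key has this prefix
                }
            }
            return node.value;
        }
    }

    public static void main(String[] args) {
        List<String> words = List.of("the", "quick", "brown", "fox");
        // Bigrams are n-grams with n = 2.
        System.out.println(ngrams(words, 2)); // [[the, quick], [quick, brown], [brown, fox]]

        SimpleTrie<Integer> trie = new SimpleTrie<>();
        trie.put("tea", 1);
        trie.put("ten", 2);
        System.out.println(trie.get("ten")); // 2
        System.out.println(trie.get("te"));  // null: "te" is only a shared prefix
    }
}
```

In this sketch the keys "tea" and "ten" share the nodes for "t" and "e", which is the property that makes a trie a prefix tree. Storing words rather than characters at each edge would be an equally reasonable choice for indexing n-grams.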