Interface Document
- All Superinterfaces:
AnchorText, Text
- All Known Implementing Classes:
SimpleDocument
The terms in a text.
-
Field Summary
Fields inherited from interface Text
CLUSTERING_THRESHOLD, FREQ_TERM_RATIO, MAX_NGRAM_SIZE, MIN_NGRAM_FREQ
Method Summary
Modifier and Type | Method | Description
String | id() | Returns the id of the document, which must be unique in the corpus.
int | maxtf() | Returns the maximum term frequency over all terms in the document.
int | size() | Returns the number of words.
int | tf(term) | Returns the term frequency.
 | unique() | Returns the iterator of unique words.
 | words() | Returns the iterator of the words of the document.
Methods inherited from interface AnchorText
addAnchor, getAnchor, setAnchor
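As a rough illustration of how the methods summarized above fit together, here is a minimal, self-contained sketch. It does not use the library's own types: `DocSketch` and `ListDoc` are hypothetical names, and the `String` term type and the `Iterator<String>` return types of `tf`, `unique`, and `words` are assumptions, since the summary does not show them. The inherited AnchorText methods are omitted.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in mirroring the documented methods; not the library's interface.
interface DocSketch {
    String id();
    int size();
    Iterator<String> words();   // assumed return type
    Iterator<String> unique();  // assumed return type
    int tf(String term);        // assumed parameter type
    int maxtf();
}

// Hypothetical in-memory implementation backed by a list of words.
class ListDoc implements DocSketch {
    private final String id;
    private final List<String> words;
    // Term counts, kept in order of first occurrence.
    private final Map<String, Integer> freq = new LinkedHashMap<>();
    private int maxtf = 0;

    ListDoc(String id, List<String> words) {
        this.id = id;
        this.words = new ArrayList<>(words);
        for (String w : this.words) {
            int f = freq.merge(w, 1, Integer::sum);
            if (f > maxtf) {
                maxtf = f;
            }
        }
    }

    public String id() { return id; }
    public int size() { return words.size(); }
    public Iterator<String> words() { return words.iterator(); }
    public Iterator<String> unique() { return freq.keySet().iterator(); }
    public int tf(String term) { return freq.getOrDefault(term, 0); }
    public int maxtf() { return maxtf; }
}
```

For a `ListDoc` built from the words `to be or not to be`, this sketch gives `size()` = 6, `tf("be")` = 2, `tf("cat")` = 0, and `maxtf()` = 2.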
-
Method Details
-
id
String id()
Returns the id of the document, which must be unique in the corpus.
- Returns:
- the id of the document.
-
size
int size()
Returns the number of words.
- Returns:
- the number of words.
-
words
-
unique
-
tf
Returns the term frequency.
- Parameters:
term - the term.
- Returns:
- the term frequency.
-
maxtf
int maxtf()
Returns the maximum term frequency over all terms in the document.
- Returns:
- the maximum term frequency.
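One common use of a maximum term frequency, shown here only as an illustration and not as something this interface prescribes, is to scale raw counts so that the most frequent term in a document scores 1.0. The class `TfScaling` and the helper `normalizedTf` are hypothetical names; the helper takes plain `int` values such as those returned by `tf` and `maxtf()`.

```java
// Hypothetical helper, not part of the documented interface.
final class TfScaling {
    private TfScaling() {}

    /** Scales a raw term frequency by the document's maximum term frequency. */
    static double normalizedTf(int tf, int maxtf) {
        if (maxtf <= 0) {
            return 0.0; // empty document: no term has a nonzero count
        }
        return (double) tf / maxtf;
    }
}
```

Guarding against a zero maximum avoids a division by zero for an empty document; whether an implementation of this interface can report zero is not stated in the source.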
-