public interface Corpus
Modifier and Type | Method and Description |
---|---|
`int` | `getAverageDocumentSize()` Returns the average size of documents in the corpus. |
`int` | `getBigramFrequency(Bigram bigram)` Returns the total frequency of the bigram in the corpus. |
`java.util.Iterator<Bigram>` | `getBigrams()` Returns an iterator over the bigrams in the corpus. |
`long` | `getNumBigrams()` Returns the number of bigrams in the corpus. |
`int` | `getNumDocuments()` Returns the number of documents in the corpus. |
`int` | `getNumTerms()` Returns the number of unique terms in the corpus. |
`int` | `getTermFrequency(java.lang.String term)` Returns the total frequency of the term in the corpus. |
`java.util.Iterator<java.lang.String>` | `getTerms()` Returns an iterator over the terms in the corpus. |
`java.util.Iterator<Relevance>` | `search(RelevanceRanker ranker, java.lang.String term)` Returns an iterator over the set of documents containing the given term in descending order of relevance. |
`java.util.Iterator<Relevance>` | `search(RelevanceRanker ranker, java.lang.String[] terms)` Returns an iterator over the set of documents containing (at least one of) the given terms in descending order of relevance. |
`java.util.Iterator<Text>` | `search(java.lang.String term)` Returns an iterator over the set of documents containing the given term. |
`long` | `size()` Returns the number of words in the corpus. |
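
The counting methods can be combined to derive simple statistics such as a term's relative frequency. The following is a minimal sketch under stated assumptions, not the library's implementation: `TermStats` is a hypothetical subset of `Corpus` redeclared so the example compiles on its own, `ToyCorpus` is a hypothetical in-memory implementation that splits on whitespace, and returning 0 for an unseen term is an assumption that the summary above does not confirm.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class CorpusStatsSketch {
    /** Hypothetical subset of Corpus, redeclared so this sketch is self-contained. */
    interface TermStats {
        long size();                        // number of words in the corpus
        int getNumTerms();                  // number of unique terms
        int getTermFrequency(String term);  // total frequency of the term
        Iterator<String> getTerms();        // iterator over the unique terms
    }

    /** Toy in-memory implementation (an assumption, for illustration only). */
    static final class ToyCorpus implements TermStats {
        private final Map<String, Integer> freq = new HashMap<>();
        private long size;

        ToyCorpus(String... documents) {
            for (String doc : documents) {
                // Naive tokenization: lower-case and split on whitespace.
                for (String word : doc.toLowerCase().split("\\s+")) {
                    if (word.isEmpty()) continue;
                    freq.merge(word, 1, Integer::sum);
                    size++;
                }
            }
        }

        public long size() { return size; }
        public int getNumTerms() { return freq.size(); }
        // Assumption: an unseen term has frequency 0.
        public int getTermFrequency(String term) { return freq.getOrDefault(term, 0); }
        public Iterator<String> getTerms() { return freq.keySet().iterator(); }
    }

    /** Fraction of all words in the corpus that are the given term. */
    static double relativeFrequency(TermStats corpus, String term) {
        long n = corpus.size();
        return n == 0 ? 0.0 : (double) corpus.getTermFrequency(term) / n;
    }

    public static void main(String[] args) {
        TermStats corpus = new ToyCorpus("the cat sat", "the dog sat down");
        System.out.println("words: " + corpus.size());
        System.out.println("unique terms: " + corpus.getNumTerms());
        System.out.println("relative frequency of 'the': " + relativeFrequency(corpus, "the"));
    }
}
```

Note that `size()` returns a `long` while the per-term counts are `int`, so the division is done in floating point after widening.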
long size()
int getNumDocuments()
int getNumTerms()
long getNumBigrams()
int getAverageDocumentSize()
int getTermFrequency(java.lang.String term)
int getBigramFrequency(Bigram bigram)
java.util.Iterator<java.lang.String> getTerms()
java.util.Iterator<Bigram> getBigrams()
java.util.Iterator<Text> search(java.lang.String term)
java.util.Iterator<Relevance> search(RelevanceRanker ranker, java.lang.String term)
java.util.Iterator<Relevance> search(RelevanceRanker ranker, java.lang.String[] terms)
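
The multi-term `search` overload is described as returning documents that contain at least one of the given terms, in descending order of relevance. The sketch below illustrates that contract only. `Hit` and `Scorer` are hypothetical stand-ins for `Relevance` and `RelevanceRanker`, whose real members are not shown on this page, and summing per-term scores is an assumption chosen for simplicity rather than the library's ranking method.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

final class RankedSearchSketch {
    /** Hypothetical stand-in for Relevance: a document index plus its score. */
    static final class Hit {
        final int doc;
        final double score;
        Hit(int doc, double score) { this.doc = doc; this.score = score; }
    }

    /** Hypothetical stand-in for RelevanceRanker. */
    interface Scorer {
        double score(int termFrequencyInDoc, int docLength);
    }

    // Each document is stored as its whitespace-separated, lower-cased words.
    private final List<String[]> docs = new ArrayList<>();

    void add(String text) {
        docs.add(text.toLowerCase().split("\\s+"));
    }

    /** Documents containing at least one of the terms, highest score first. */
    Iterator<Hit> search(Scorer scorer, String... terms) {
        List<Hit> hits = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            String[] words = docs.get(i);
            double total = 0.0;
            boolean matched = false;
            for (String term : terms) {
                int tf = 0;
                for (String w : words) {
                    if (w.equals(term)) tf++;
                }
                if (tf > 0) {
                    matched = true;
                    // Assumption: a document's score is the sum over matched terms.
                    total += scorer.score(tf, words.length);
                }
            }
            if (matched) hits.add(new Hit(i, total));
        }
        hits.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return hits.iterator();
    }

    public static void main(String[] args) {
        RankedSearchSketch corpus = new RankedSearchSketch();
        corpus.add("the cat sat");
        corpus.add("the dog sat down");
        corpus.add("cat cat dog");
        corpus.add("a bird flew");
        // Score each term by its length-normalized frequency in the document.
        Iterator<Hit> it = corpus.search((tf, len) -> (double) tf / len, "cat", "dog");
        while (it.hasNext()) {
            Hit h = it.next();
            System.out.println("doc " + h.doc + " score " + h.score);
        }
    }
}
```

With these four toy documents, the document that mentions both terms ranks first and the document that mentions neither is omitted from the results.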