Class SimpleDocument
java.lang.Object
smile.nlp.SimpleDocument
All Implemented Interfaces: AnchorText, Document, Text
A list-of-words representation of documents.
-
Field Summary
Fields inherited from interface Text
CLUSTERING_THRESHOLD, FREQ_TERM_RATIO, MAX_NGRAM_SIZE, MIN_NGRAM_FREQ
Constructor Summary
Constructors
| Constructor | Description |
| SimpleDocument(String id, String title, String content, String[] words) | Constructor. |
Method Summary
| Modifier and Type | Method | Description |
| | addAnchor | Adds a link label to the anchor text. |
| | content() | Returns the text content. |
| boolean | equals | |
| | getAnchor | Returns the anchor text if any. |
| int | hashCode() | |
| | id() | Returns the id of the document, which must be unique in the corpus. |
| int | maxtf() | Returns the maximum term frequency over all terms in the document. |
| | setAnchor | Sets the anchor text. |
| int | size() | Returns the number of words. |
| int | tf | Returns the term frequency. |
| | title() | Returns the title of the text, if there is one. |
| | toString() | |
| | unique() | Returns the iterator of unique words. |
| | words() | Returns the iterator of the words of the document. |
-
Constructor Details
-
SimpleDocument
-
-
Method Details
-
id
-
title
-
content
-
size
-
words
-
unique
-
tf
-
maxtf
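The quantities behind size(), unique(), tf, and maxtf() can be illustrated on a plain array of words. The sketch below is a hypothetical stand-in, not the smile.nlp implementation: the class name TermFrequencySketch, its static helper signatures, and the choice of a HashMap are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for illustration only; not part of smile.nlp. */
final class TermFrequencySketch {
    /** Counts how often each word occurs in the list of words. */
    static Map<String, Integer> counts(String[] words) {
        Map<String, Integer> freq = new HashMap<>();
        for (String w : words) {
            freq.merge(w, 1, Integer::sum); // insert 1, or add 1 to the existing count
        }
        return freq;
    }

    /** Term frequency of a single term; 0 if the term never occurs. */
    static int tf(String[] words, String term) {
        return counts(words).getOrDefault(term, 0);
    }

    /** Maximum term frequency over all terms; 0 for an empty word list. */
    static int maxtf(String[] words) {
        int max = 0;
        for (int c : counts(words).values()) {
            max = Math.max(max, c);
        }
        return max;
    }

    public static void main(String[] args) {
        String[] words = {"the", "cat", "saw", "the", "dog"};
        System.out.println(words.length);          // number of words: 5
        System.out.println(counts(words).size());  // number of unique words: 4
        System.out.println(tf(words, "the"));      // 2
        System.out.println(tf(words, "bird"));     // 0
        System.out.println(maxtf(words));          // 2
    }
}
```

Recomputing the counts on every call keeps the sketch short; a real document class would more likely build the frequency map once when it is constructed.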
-
getAnchor
Returns the anchor text, if any. Anchor text is the visible, clickable text of a hyperlink. For a document, it is the combined anchor text of all links in the corpus that point to this text.
Specified by: getAnchor in interface AnchorText
Returns: the anchor text.
-
setAnchor
Sets the anchor text. Note that the anchor is the collection of all link labels in the corpus that point to this text, so addAnchor is more appropriate in most cases.
Specified by: setAnchor in interface AnchorText
Parameters: anchor - the anchor text.
Returns: this object.
-
addAnchor
Description copied from interface: AnchorText
Adds a link label to the anchor text.
Specified by: addAnchor in interface AnchorText
Parameters: linkLabel - the link label.
Returns: this object.
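The difference between setAnchor and addAnchor can be sketched with a small stand-in class. This is a minimal illustration under stated assumptions, not the SimpleDocument implementation: the class AnchorSketch is hypothetical, and joining labels with a newline is an assumed convention. Both mutators return this object, as the documentation above states, which is what allows the calls to be chained.

```java
/** Hypothetical stand-in for illustration only; not part of smile.nlp. */
final class AnchorSketch {
    /** Accumulated anchor text; null until a label is set or added. */
    private String anchor;

    /** Returns the anchor text, or null if there is none. */
    String getAnchor() {
        return anchor;
    }

    /** Replaces the whole anchor text, discarding earlier labels. */
    AnchorSketch setAnchor(String anchor) {
        this.anchor = anchor;
        return this;
    }

    /** Appends one link label (assumption: labels are newline-separated). */
    AnchorSketch addAnchor(String linkLabel) {
        anchor = (anchor == null) ? linkLabel : anchor + "\n" + linkLabel;
        return this;
    }

    public static void main(String[] args) {
        AnchorSketch doc = new AnchorSketch()
                .addAnchor("SMILE docs")
                .addAnchor("NLP guide");
        System.out.println(doc.getAnchor()); // two lines: "SMILE docs" then "NLP guide"

        doc.setAnchor("Home page");          // overwrites both earlier labels
        System.out.println(doc.getAnchor()); // "Home page"
    }
}
```

Because many pages may link to the same document, appending each label as it is found preserves all of them, whereas a later setAnchor call silently drops what was collected before.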
-
toString
-
equals
-
hashCode
-