Package smile.nlp


Natural language processing.
  • Class
    Description
    The anchor text is the visible, clickable text in a hyperlink.
    A bigram (or digram) is a group of two consecutive words; bigrams are commonly used as the basis for simple statistical analysis of text.
    A corpus is a collection of documents.
    An n-gram is a contiguous sequence of n words from a given sequence of text.
    An in-memory text corpus.
    A list-of-words representation of documents.
    A minimal interface of text in the corpus.
    The terms in a text.
    Trie<K,V>
    A trie, also called a digital tree or prefix tree, is an ordered tree data structure used to store a dynamic set or associative array whose keys are usually strings.
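
The n-gram and bigram definitions above can be illustrated with a minimal sketch in plain Java. It does not use this package's classes: the class name `NGramSketch` and the method `ngrams` are hypothetical names chosen for the example, and the input is assumed to be already split into words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of smile.nlp.
class NGramSketch {
    /** Returns every contiguous run of n words, in order of appearance. */
    static List<List<String>> ngrams(List<String> words, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<List<String>> result = new ArrayList<>();
        // The window [i, i + n) slides one word at a time.
        for (int i = 0; i + n <= words.size(); i++) {
            result.add(new ArrayList<>(words.subList(i, i + n)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("the", "quick", "brown", "fox");
        // Bigrams are the n = 2 case.
        System.out.println(ngrams(words, 2));
        // prints [[the, quick], [quick, brown], [brown, fox]]
    }
}
```

A sequence of m words yields m - n + 1 n-grams, or none when m < n.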
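
To make the trie description concrete, here is a minimal sketch of a prefix tree keyed by strings. It is an assumption-laden illustration, not the `Trie<K,V>` class of this package: the name `TrieSketch`, its `put` and `get` methods, and the choice of one `HashMap` per node are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the smile.nlp Trie<K,V> API.
class TrieSketch<V> {
    private static final class Node<V> {
        // One child per next character of the key.
        final Map<Character, Node<V>> children = new HashMap<>();
        // null means no key ends at this node.
        V value;
    }

    private final Node<V> root = new Node<>();

    /** Associates value with key, creating nodes along the path as needed. */
    void put(String key, V value) {
        Node<V> node = root;
        for (char c : key.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node<V>());
        }
        node.value = value;
    }

    /** Returns the value stored for key, or null if the key is absent. */
    V get(String key) {
        Node<V> node = root;
        for (char c : key.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return null;
            }
        }
        return node.value;
    }

    public static void main(String[] args) {
        TrieSketch<Integer> trie = new TrieSketch<>();
        trie.put("car", 1);
        trie.put("cart", 2);
        System.out.println(trie.get("car"));  // prints 1
        System.out.println(trie.get("cart")); // prints 2
        System.out.println(trie.get("ca"));   // prints null: a prefix, not a stored key
    }
}
```

Keys that share a prefix share the nodes for that prefix, so a lookup costs time proportional to the key length rather than to the number of stored keys.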