Class SimpleParagraphSplitter

java.lang.Object
smile.nlp.tokenizer.SimpleParagraphSplitter
All Implemented Interfaces:
ParagraphSplitter

public class SimpleParagraphSplitter extends Object implements ParagraphSplitter
This is a simple paragraph splitter. Given a string, it returns an array of strings, where each element is a paragraph.

The beginning of a paragraph is indicated by:

  • the beginning of the content, that is, the paragraph is the first content in the document, or
  • exactly one blank line preceding the paragraph text
The end of a paragraph is indicated by:
  • the end of the content, that is, the paragraph is the last content in the document, or
  • one or more blank lines following the paragraph text
A blank line contains zero or more whitespace characters, such as spaces or tabs, followed by a line break.
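The rules above can be illustrated with a small, self-contained sketch. It is not Smile's implementation: the class name ParagraphRulesSketch, the regular expression, and the choice to trim each paragraph and drop empty ones are all assumptions made for illustration, and the real class may handle surrounding whitespace differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative only: a hypothetical re-statement of the rules, not Smile's code. */
final class ParagraphRulesSketch {
    // The line break that ends the paragraph text, followed by one or more
    // blank lines (each blank line: optional spaces or tabs, then a line break).
    private static final Pattern BLANK_LINES =
            Pattern.compile("\\r?\\n(?:[ \\t]*\\r?\\n)+");

    static String[] split(String text) {
        List<String> paragraphs = new ArrayList<>();
        for (String piece : BLANK_LINES.split(text)) {
            // Assumption: surrounding whitespace is trimmed and empty pieces dropped.
            String paragraph = piece.trim();
            if (!paragraph.isEmpty()) {
                paragraphs.add(paragraph);
            }
        }
        return paragraphs.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String text = "First paragraph,\nstill the first.\n\n \t\nSecond paragraph.\n";
        // With Smile on the classpath, the documented call would be:
        //   String[] paragraphs = SimpleParagraphSplitter.getInstance().split(text);
        for (String paragraph : split(text)) {
            System.out.println("[" + paragraph + "]");
        }
    }
}
```

In this sketch a single line break inside a paragraph does not end it; only a line break followed by at least one blank line does, which mirrors the end-of-paragraph rule stated above.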
  • Method Details

    • getInstance

      public static SimpleParagraphSplitter getInstance()
      Returns the singleton instance.
      Returns:
      the singleton instance.
    • split

      public String[] split(String text)
      Description copied from interface: ParagraphSplitter
      Splits the text into paragraphs.
      Specified by:
      split in interface ParagraphSplitter
      Parameters:
      text - the text to split into paragraphs.
      Returns:
      the paragraphs, one per array element.