Package smile.nlp.tokenizer
Class SimpleParagraphSplitter
java.lang.Object
smile.nlp.tokenizer.SimpleParagraphSplitter
- All Implemented Interfaces:
ParagraphSplitter
This is a simple paragraph splitter. Given a string, it returns a list of
strings, where each element is a paragraph.
The beginning of a paragraph is indicated by:
- the beginning of the content, that is, the paragraph is the first content in the document, or
- exactly one blank line preceding the paragraph text.
The end of a paragraph is indicated by:
- the end of the content, that is, the paragraph is the last content in the document, or
- one or more blank lines following the paragraph text.
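The boundary rules above can be approximated with a few lines of plain Java. This is a minimal sketch and not Smile's implementation: the class `ParagraphSplitSketch` and the method `splitParagraphs` are hypothetical names, and the sketch assumes that any run of one or more blank (empty or whitespace-only) lines separates two paragraphs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not part of Smile: approximates the documented rules.
class ParagraphSplitSketch {

    // Assumption: a run of one or more blank lines (empty, or containing only
    // spaces/tabs) ends one paragraph and begins the next.
    static String[] splitParagraphs(String text) {
        List<String> paragraphs = new ArrayList<>();
        // One line break followed by at least one whitespace-only line.
        for (String chunk : text.split("\\r?\\n(?:[ \\t]*\\r?\\n)+")) {
            String paragraph = chunk.trim();
            // Leading or trailing blank lines produce empty chunks; skip them.
            if (!paragraph.isEmpty()) {
                paragraphs.add(paragraph);
            }
        }
        return paragraphs.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String text = "First paragraph,\nsecond line.\n\nSecond paragraph.\n\n\nThird paragraph.\n";
        for (String p : splitParagraphs(text)) {
            System.out.println("[" + p + "]");
        }
    }
}
```

A single line break inside a paragraph is preserved, so the first paragraph in the example keeps both of its lines. With the real class, the equivalent call would be `SimpleParagraphSplitter.getInstance().split(text)`, whose exact handling of edge cases may differ from this sketch.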
Method Summary
Modifier and Type                 Method                Description
static SimpleParagraphSplitter    getInstance()         Returns the singleton instance.
String[]                          split(String text)    Splits the text into paragraphs.
Method Details
getInstance
Returns the singleton instance.
- Returns:
the singleton instance.
split
Description copied from interface: ParagraphSplitter
Splits the text into paragraphs.
- Specified by:
split in interface ParagraphSplitter
- Parameters:
text - the text.
- Returns:
the paragraphs.