Class SimpleNormalizer

java.lang.Object
smile.nlp.normalizer.SimpleNormalizer
All Implemented Interfaces:
Normalizer

public class SimpleNormalizer extends Object implements Normalizer
A baseline normalizer for processing Unicode text.
  • Apply Unicode normalization form NFKC.
  • Strip, trim, normalize, and compress whitespace.
  • Remove control and formatting characters.
  • Normalize dashes and double and single quotation marks.
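
The steps listed above can be approximated with the JDK alone. The following is a minimal sketch, not the SMILE implementation: the class name `NormalizerSketch`, the order of the steps, and the exact character sets chosen for dashes and quotation marks are all assumptions made for illustration. Only `java.text.Normalizer` and `java.util.regex.Pattern` from the standard library are used.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

// Hypothetical illustration only; the real SimpleNormalizer may differ in
// step order and in which characters it treats as dashes or quotes.
final class NormalizerSketch {
    // Control (Cc) and format (Cf) characters, excluding tab, line feed and
    // carriage return, which are left for the whitespace step.
    private static final Pattern CONTROL_FORMAT =
            Pattern.compile("[\\p{Cc}\\p{Cf}&&[^\\t\\n\\r]]");
    // Runs of ASCII whitespace or any Unicode separator (category Z).
    private static final Pattern WHITESPACE = Pattern.compile("[\\s\\p{Z}]+");
    // Assumed dash set: U+2010 through U+2015, plus the minus sign U+2212.
    private static final Pattern DASHES =
            Pattern.compile("[\\u2010-\\u2015\\u2212]");
    // Assumed double-quote set: curly, low-9 and angle double quotes.
    private static final Pattern DOUBLE_QUOTES =
            Pattern.compile("[\\u201C\\u201D\\u201E\\u00AB\\u00BB]");
    // Assumed single-quote set: curly and low-9 single quotes.
    private static final Pattern SINGLE_QUOTES =
            Pattern.compile("[\\u2018\\u2019\\u201A]");

    static String normalize(String text) {
        // 1. Unicode normalization form NFKC (compatibility composition).
        String s = Normalizer.normalize(text, Normalizer.Form.NFKC);
        // 2. Drop control and formatting characters.
        s = CONTROL_FORMAT.matcher(s).replaceAll("");
        // 3. Compress whitespace runs to one space and trim both ends.
        s = WHITESPACE.matcher(s).replaceAll(" ").trim();
        // 4. Map dash and quote variants to their ASCII forms.
        s = DASHES.matcher(s).replaceAll("-");
        s = DOUBLE_QUOTES.matcher(s).replaceAll("\"");
        s = SINGLE_QUOTES.matcher(s).replaceAll("'");
        return s;
    }

    public static void main(String[] args) {
        System.out.println(normalize("  \uFB01rst\u00A0 \u201Cquote\u201D \u2014 it\u2019s\tdone \n"));
    }
}
```

NFKC runs first in this sketch so that compatibility characters such as the no-break space and ligatures are already folded before the whitespace and punctuation steps see them.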

Method Details

    • getInstance

      public static SimpleNormalizer getInstance()
      Returns the singleton instance.
      Returns:
      the singleton instance.
    • normalize

      public String normalize(String text)
      Description copied from interface: Normalizer
      Normalize the given string.
      Specified by:
      normalize in interface Normalizer
      Parameters:
      text - the text to normalize.
      Returns:
      the normalized text.
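
The two methods above follow a common shape: a stateless normalizer exposed through an interface and obtained from a static accessor rather than a constructor. The sketch below mirrors that shape with stand-in types so that it compiles without the SMILE library. `TextNormalizer` and `TrimNormalizer` are hypothetical names, the private constructor and eager initialization are assumptions, and the trimming behavior is a placeholder rather than what `SimpleNormalizer` actually does.

```java
// Hypothetical stand-in for the Normalizer interface described above.
interface TextNormalizer {
    String normalize(String text);
}

// Hypothetical stand-in for a singleton implementation of that interface.
final class TrimNormalizer implements TextNormalizer {
    // Eagerly created shared instance; safe to share because it holds no state.
    private static final TrimNormalizer INSTANCE = new TrimNormalizer();

    // Assumed: construction is restricted so callers go through getInstance().
    private TrimNormalizer() {
    }

    static TrimNormalizer getInstance() {
        return INSTANCE;
    }

    @Override
    public String normalize(String text) {
        // Placeholder behavior for the sketch: trim leading and trailing whitespace.
        return text.trim();
    }

    public static void main(String[] args) {
        // Callers can depend on the interface and ignore the concrete class.
        TextNormalizer normalizer = TrimNormalizer.getInstance();
        System.out.println(normalizer.normalize("  hello  "));
    }
}
```

With the real library on the classpath, the corresponding call would presumably be `SimpleNormalizer.getInstance().normalize(text)`, using the two signatures documented above.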