Class Parquet

java.lang.Object
smile.io.Parquet

public class Parquet extends Object
Apache Parquet is a columnar storage format that supports nested data structures. It uses the record shredding and assembly algorithm described in the Dremel paper.
  • Method Details

    • read

      public static DataFrame read(Path path) throws Exception
      Reads a Parquet file.
      Parameters:
      path - the input file path.
      Returns:
      the data frame.
      Throws:
      IOException - if the file cannot be read.
      URISyntaxException - if the file path syntax is invalid.
      Exception
    • read

      public static DataFrame read(Path path, int limit) throws Exception
      Reads a limited number of records from a Parquet file.
      Parameters:
      path - the input file path.
      limit - the maximum number of records to read.
      Returns:
      the data frame.
      Throws:
      IOException - if the file cannot be read.
      Exception
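
The limited read can be wrapped in a small guard that checks its arguments before delegating. The following is a minimal sketch, not part of Smile: `ParquetPreview`, `LimitedReader`, and `preview` are hypothetical names, and the argument checks are this sketch's own policy rather than documented behavior of `read`. The reader is passed in as a function so the block compiles without Smile on the classpath.

```java
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

class ParquetPreview {
    /**
     * Hypothetical stand-in for a two-argument reader such as
     * Parquet.read(Path, int); not part of Smile.
     */
    @FunctionalInterface
    interface LimitedReader<T> {
        T read(Path path, int limit) throws Exception;
    }

    /**
     * Checks the arguments, then delegates to the reader.
     * The checks are this sketch's own policy, not documented Smile behavior.
     */
    static <T> T preview(Path path, int limit, LimitedReader<T> reader) throws Exception {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive: " + limit);
        }
        if (!Files.isRegularFile(path)) {
            throw new NoSuchFileException(path.toString());
        }
        return reader.read(path, limit);
    }
}
```

With Smile available, and assuming the `Path` parameter is `java.nio.file.Path` (the page does not say which `Path` type is meant), a call might look like `DataFrame head = ParquetPreview.preview(Path.of("data.parquet"), 10, Parquet::read);`, where `data.parquet` is a placeholder file name.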
    • read

      public static DataFrame read(String uri) throws Exception
      Reads a Parquet file.
      Parameters:
      uri - the input file URI.
      Returns:
      the data frame.
      Throws:
      IOException - if the file cannot be read.
      URISyntaxException - if the file path syntax is invalid.
      Exception
    • read

      public static DataFrame read(String uri, int limit) throws Exception
      Reads a limited number of records from a Parquet file.
      Parameters:
      uri - the input file URI.
      limit - the maximum number of records to read.
      Returns:
      the data frame.
      Throws:
      IOException - if the file cannot be read.
      Exception
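
The String overloads take a URI rather than a path object, so it can help to build and check the URI string before calling them. The following is a minimal sketch using only the JDK: `ParquetUri`, `toUri`, and `isValidUri` are hypothetical helper names, and it is an assumption (not stated on this page) that `read(String)` accepts a `file:` URI produced this way.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.file.Path;

class ParquetUri {
    /** Converts a local path to an absolute file: URI string. */
    static String toUri(Path path) {
        return path.toAbsolutePath().toUri().toString();
    }

    /** Returns true if the string parses as a URI according to java.net.URI. */
    static boolean isValidUri(String uri) {
        try {
            new URI(uri);
            return true;
        } catch (URISyntaxException e) {
            // Same exception type the read methods document for bad path syntax.
            return false;
        }
    }
}
```

For example, `ParquetUri.isValidUri("my data.parquet")` returns false because an unescaped space is not legal in a URI, while the string returned by `ParquetUri.toUri(Path.of("my data.parquet"))` has the space percent-encoded and parses cleanly.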