Package smile.data

Class SQL

java.lang.Object
smile.data.SQL
All Implemented Interfaces:
AutoCloseable

public class SQL extends Object implements AutoCloseable
An in-process SQL database management interface.
  • Constructor Details

    • SQL

      public SQL() throws SQLException
      Constructor of in-memory database.
      Throws:
      SQLException - if fail to create an in-memory database.
    • SQL

      public SQL(String path) throws SQLException
      Constructor to open or create a persistent database.
      Parameters:
      path - DuckDB file path.
      Throws:
      SQLException - if fail to open or create the persistent database.
  • Method Details

    • toString

      public String toString()
      Overrides:
      toString in class Object
    • close

      public void close() throws SQLException
      Specified by:
      close in interface AutoCloseable
      Throws:
      SQLException
    • tables

      public DataFrame tables() throws SQLException
      Returns the tables in the database.
      Returns:
      the data frame of table metadata.
      Throws:
      SQLException - if fail to query metadata.
    • describe

      public DataFrame describe(String table) throws SQLException
      Returns the columns in a table.
      Parameters:
      table - the table name.
      Returns:
      the data frame of table columns.
      Throws:
      SQLException - if fail to query metadata.
    • csv

      public SQL csv(String name, String... path) throws SQLException
      Creates an in-memory table from csv files. The file should have a header line and uses the default comma delimiter. You can read a series of files and treat them as if they were a single table. Note that this only works if the files have the same schema. The files can be specified by a list parameter, glob pattern matching syntax, or a combination of both.
      Parameters:
      name - the table name.
      path - the csv file path.
      Returns:
      this object.
      Throws:
      SQLException - if fail to read the files or create the in-memory table.
    • csv

      public SQL csv(String name, char delimiter, Map<String,String> columns, String... path) throws SQLException
      Creates an in-memory table from csv files. You can read a series of files and treat them as if they were a single table. Note that this only works if the files have the same schema. The files can be specified by a list parameter, glob pattern matching syntax, or a combination of both.
      Parameters:
      name - the table name.
      delimiter - the delimiter character that separates columns.
      columns - a map that specifies the column names and column types.
      path - a list of csv files.
      Returns:
      this object.
      Throws:
      SQLException - if fail to read the files or create the in-memory table.
    • parquet

      public SQL parquet(String name, String... path) throws SQLException
      Creates an in-memory table from parquet files. You can read a series of files and treat them as if they were a single table. Note that this only works if the files have the same schema. The files can be specified by a list parameter, glob pattern matching syntax, or a combination of both.
      Parameters:
      name - the table name.
      path - a list of csv files.
      Returns:
      this object.
      Throws:
      SQLException - if fail to read the files or create the in-memory table.
    • parquet

      public SQL parquet(String name, Map<String,String> options, String... path) throws SQLException
      Creates an in-memory table from parquet files. You can read a series of files and treat them as if they were a single table. Note that this only works if the files have the same schema. The files can be specified by a list parameter, glob pattern matching syntax, or a combination of both.
      Parameters:
      name - the table name.
      options - supported options include 'binary_as_string' - Parquet files generated by legacy writers do not correctly set the UTF8 flag for strings, causing string columns to be loaded as BLOB instead. Set this to true to load binary columns as strings. 'filename' - Whether an extra filename column should be included in the result. 'file_row_number' - Whether to include the file_row_number column. 'hive_partitioning' - Whether to interpret the path as a Hive partitioned path. 'union_by_name' - Whether the columns of multiple schemas should be unified by name, rather than by position.
      path - a list of csv files.
      Returns:
      this object.
      Throws:
      SQLException - if fail to read the files or create the in-memory table.
    • json

      public SQL json(String name, String... path) throws SQLException
      Creates an in-memory table from json files. You can read a series of files and treat them as if they were a single table. Note that this only works if the files have the same schema. The files can be specified by a list parameter, glob pattern matching syntax, or a combination of both.
      Parameters:
      name - the table name.
      path - a list of json files.
      Returns:
      this object.
      Throws:
      SQLException - if fail to read the files or create the in-memory table.
    • json

      public SQL json(String name, String format, Map<String,String> columns, String... path) throws SQLException
      Creates an in-memory table from json files. You can read a series of files and treat them as if they were a single table. Note that this only works if the files have the same schema. The files can be specified by a list parameter, glob pattern matching syntax, or a combination of both.
      Parameters:
      name - the table name.
      format - "auto", "unstructured", "newline_delimited", or "array". "auto" - Attempt to determine the format automatically. "newline_delimited" - Each line is a JSON. "array" - A JSON array of objects (pretty-printed or not). "unstructured" - If the JSON file contains JSON that is not newline-delimited or an array.
      columns - a map that specifies the column names and column types.
      path - a list of json files.
      Returns:
      this object.
      Throws:
      SQLException - if fail to read the files or create the in-memory table.
    • query

      public DataFrame query(String sql) throws SQLException
      Executes a SELECT statement.
      Parameters:
      sql - a SELECT statement.
      Returns:
      the query result.
      Throws:
      SQLException - if fail to execute the SQL query.
    • update

      public int update(String sql) throws SQLException
      Executes an INSERT, UPDATE, or DELETE statement.
      Parameters:
      sql - an INSERT, UPDATE, or DELETE statement.
      Returns:
      the number of rows affected by the SQL statement.
      Throws:
      SQLException - if fail to execute the SQL update.
    • execute

      public boolean execute(String sql) throws SQLException
      Executes an SQL statement, which may return multiple results.
      Parameters:
      sql - an SQL statement.
      Returns:
      true if the first result is a ResultSet object; false if it is an update count or there are no results.
      Throws:
      SQLException - if fail to execute the SQL query.