smile.data.SparseDataset<T>

Type Parameters: T - the target type.

All Implemented Interfaces: Iterable<SampleInstance<SparseArray,T>>, Dataset<SparseArray,T>

public class SparseDataset<T> extends SimpleDataset<SparseArray,T>

List of Lists sparse matrix format. LIL stores one list per row, where each entry stores a column index and value. Typically, these entries are kept sorted by column index for faster lookup. This format is good for incremental matrix construction.
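
To make the layout concrete, here is a minimal sketch of a list-of-lists structure with rows kept sorted by column index. It is an illustration only, not Smile's implementation; the class `LilSketch` and its `Entry`, `set`, `get`, and `find` members are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, minimal list-of-lists (LIL) sparse matrix; not Smile's code. */
final class LilSketch {
    /** One stored entry: a column index and its value. */
    private static final class Entry {
        final int col;
        final double value;

        Entry(int col, double value) {
            this.col = col;
            this.value = value;
        }
    }

    /** One list of entries per row. */
    private final List<List<Entry>> rows = new ArrayList<>();

    LilSketch(int nrow) {
        for (int i = 0; i < nrow; i++) {
            rows.add(new ArrayList<>());
        }
    }

    /** Sets entry (i, j), keeping row i sorted by column index. */
    void set(int i, int j, double value) {
        List<Entry> row = rows.get(i);
        int pos = find(row, j);
        if (pos >= 0) {
            row.set(pos, new Entry(j, value));      // overwrite the existing entry
        } else {
            row.add(-pos - 1, new Entry(j, value)); // insert at the sorted position
        }
    }

    /** Returns entry (i, j), or 0.0 if no entry is stored there. */
    double get(int i, int j) {
        List<Entry> row = rows.get(i);
        int pos = find(row, j);
        return pos >= 0 ? row.get(pos).value : 0.0;
    }

    /** Binary search by column; returns the index, or -(insertionPoint) - 1. */
    private static int find(List<Entry> row, int col) {
        int lo = 0;
        int hi = row.size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int c = row.get(mid).col;
            if (c < col) {
                lo = mid + 1;
            } else if (c > col) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -(lo + 1);
    }
}
```

Keeping each row sorted is what allows the binary search in `find`. The trade-off is that inserting into the middle of a row shifts the entries after it.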

LIL is typically used to construct the matrix. Once the matrix is constructed, it is typically converted to a format, such as Harwell-Boeing column-compressed sparse matrix format, which is more efficient for matrix operations.
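
The conversion step can be sketched as follows, assuming the rows are given as parallel arrays of column indices and values. This is a generic compressed-column construction, not the code behind `toMatrix()`; the class `CscSketch` and its fields `colPtr`, `rowIdx`, and `values` are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical compressed-column layout built from row-wise data; not Smile's code. */
final class CscSketch {
    /** Entries of column j occupy positions colPtr[j] .. colPtr[j + 1] - 1. */
    final int[] colPtr;
    /** Row index of each stored entry. */
    final int[] rowIdx;
    /** Value of each stored entry. */
    final double[] values;

    /**
     * @param cols cols[i] holds the column indices of the entries in row i
     * @param vals vals[i] holds the matching values of row i
     * @param ncol the number of columns
     */
    CscSketch(int[][] cols, double[][] vals, int ncol) {
        colPtr = new int[ncol + 1];
        int nz = 0;

        // Pass 1: count the entries in each column.
        for (int[] row : cols) {
            for (int j : row) {
                colPtr[j + 1]++;
                nz++;
            }
        }

        // A prefix sum turns the counts into column start offsets.
        for (int j = 0; j < ncol; j++) {
            colPtr[j + 1] += colPtr[j];
        }

        rowIdx = new int[nz];
        values = new double[nz];

        // Pass 2: scatter each entry into the next free slot of its column.
        int[] next = Arrays.copyOf(colPtr, ncol);
        for (int i = 0; i < cols.length; i++) {
            for (int k = 0; k < cols[i].length; k++) {
                int p = next[cols[i][k]]++;
                rowIdx[p] = i;
                values[p] = vals[i][k];
            }
        }
    }
}
```

Because rows are visited in increasing order, the row indices within each column come out sorted without an extra sorting pass.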

Constructor Summary

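| Constructor | Description |
| --- | --- |
| SparseDataset(Collection<SampleInstance<SparseArray,T>> data) | Constructs a sparse dataset from the given sample instances. |
| SparseDataset(Collection<SampleInstance<SparseArray,T>> data, int ncol) | Constructs a sparse dataset from the given sample instances with the given number of columns. |

Method Summary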

| Modifier and Type | Method | Description |
| --- | --- | --- |
| static SparseDataset<Void> | from(Path path) | Parses a sparse dataset in coordinate triple tuple list format. |
| static SparseDataset<Void> | from(Path path, int arrayIndexOrigin) | Reads a sparse dataset in coordinate triple tuple list format. |
| double | get(int i, int j) | Returns the value at entry (i, j). |
| int | ncol() | Returns the number of columns. |
| int | nrow() | Returns the number of rows. |
| int | nz() | Returns the number of nonzero entries. |
| int | nz(int j) | Returns the number of nonzero entries in column j. |
| static SparseDataset<Void> | of(SparseArray[] data) | Returns a default implementation of SparseDataset without targets. |
| static SparseDataset<Void> | of(SparseArray[] data, int ncol) | Returns a default implementation of SparseDataset without targets. |
| SparseMatrix | toMatrix() | Converts the dataset into Harwell-Boeing column-compressed sparse matrix format. |
| void | unitize() | Unitizes each row so that its L2 norm is 1. |
| void | unitize1() | Unitizes each row so that its L1 norm is 1. |

Methods inherited from class smile.data.SimpleDataset
get, iterator, size, stream, toList

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface smile.data.Dataset
apply, batch, isEmpty, toString

Methods inherited from interface java.lang.Iterable
forEach, spliterator

Constructor Details
- SparseDataset
  
  public SparseDataset(Collection<SampleInstance<SparseArray,T>> data)
  
  Constructs a sparse dataset from the given sample instances.
  
  Parameters:
  
  data - The sample instances.
- SparseDataset
  
  public SparseDataset(Collection<SampleInstance<SparseArray,T>> data, int ncol)
  
  Constructs a sparse dataset from the given sample instances with the given number of columns.
  
  Parameters:
  
  data - The sample instances.
  
  ncol - The number of columns.
Method Details
- nz
  
  public int nz()
  
  Returns the number of nonzero entries.
  
  Returns:
  
  the number of nonzero entries.
- nz
  
  public int nz(int j)
  
  Returns the number of nonzero entries in column j.
  
  Parameters:
  
  j - the column index.
  
  Returns:
  
  the number of nonzero entries in column j.
- nrow
  
  public int nrow()
  
  Returns the number of rows.
  
  Returns:
  
  the number of rows.
- ncol
  
  public int ncol()
  
  Returns the number of columns.
  
  Returns:
  
  the number of columns.
- get
  
  public double get(int i, int j)
  
  Returns the value at entry (i, j).
  
  Parameters:
  
  i - the row index.
  
  j - the column index.
  
  Returns:
  
  the cell value.
- unitize
  
  public void unitize()
  
  Unitizes each row so that its L2 norm is 1.
- unitize1
  
  public void unitize1()
  
  Unitizes each row so that its L1 norm is 1.
- toMatrix
  
  public SparseMatrix toMatrix()
  
  Converts the dataset into Harwell-Boeing column-compressed sparse matrix format.
  
  Returns:
  
  the sparse matrix.
- of
  
  public static SparseDataset<Void> of(SparseArray[] data)
  
  Returns a default implementation of SparseDataset without targets.
  
  Parameters:
  
  data - sparse arrays.
  
  Returns:
  
  the sparse dataset.
- of
  
  public static SparseDataset<Void> of(SparseArray[] data, int ncol)
  
  Returns a default implementation of SparseDataset without targets.
  
  Parameters:
  
  data - sparse arrays.
  
  ncol - the number of columns.
  
  Returns:
  
  the sparse dataset.
- from
  
  public static SparseDataset<Void> from(Path path) throws IOException, ParseException
  
  Parses a sparse dataset in coordinate triple tuple list format. A coordinate file stores a list of (row, column, value) tuples.
  
  Parameters:
  
  path - the input file path.
  
  Returns:
  
  the sparse dataset.
  
  Throws:
  
  IOException - if the file cannot be read.
  
  ParseException - if the data cannot be parsed.
- from
  
  public static SparseDataset<Void> from(Path path, int arrayIndexOrigin) throws IOException, ParseException
  
  Reads a sparse dataset in coordinate triple tuple list format. A coordinate file stores a list of (row, column, value) tuples:
  
      instanceID attributeID value
      instanceID attributeID value
      instanceID attributeID value
      ...
      instanceID attributeID value
  
  Ideally, the entries are sorted (by row index, then column index) to improve random access times. This format is good for incremental matrix construction.
  
  In addition, there may be a header line
  
      D W N // The number of rows, columns and nonzero entries.
  
  or 3 header lines
  
      D // The number of rows
      W // The number of columns
      N // The total number of nonzero entries in the dataset.
  
  Parameters:
  
  path - the input file path.
  
  arrayIndexOrigin - the starting index of arrays. By default, it is 0, as in C/C++ and Java. It can be set to 1 to parse data produced by other programming languages such as Fortran.
  
  Returns:
  
  the sparse dataset.
  
  Throws:
  
  IOException - if the stream to the file cannot be read or closed.
  
  ParseException - if an index is not an integer or the value is not a double.
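
The coordinate format described above can be read with a few lines of code. The sketch below is a simplified illustration, not the parser behind `from`. It assumes the input has no header lines, takes the lines as an in-memory list instead of a `Path`, and throws unchecked exceptions in place of `ParseException`. The class `CoordinateSketch` and its `parse` method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical reader for headerless "row column value" lines; not Smile's parser. */
final class CoordinateSketch {
    private CoordinateSketch() {
    }

    /**
     * Parses coordinate triples into one sorted column-to-value map per row.
     *
     * @param lines            the data lines, each "instanceID attributeID value"
     * @param arrayIndexOrigin the index origin used in the file, e.g. 0 or 1
     */
    static List<TreeMap<Integer, Double>> parse(List<String> lines, int arrayIndexOrigin) {
        List<TreeMap<Integer, Double>> rows = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }

            String[] tokens = trimmed.split("\\s+");
            if (tokens.length != 3) {
                throw new IllegalArgumentException("Expected 3 fields: " + line);
            }

            // Shift file indices to zero-based indices.
            int i = Integer.parseInt(tokens[0]) - arrayIndexOrigin;
            int j = Integer.parseInt(tokens[1]) - arrayIndexOrigin;
            double x = Double.parseDouble(tokens[2]);
            if (i < 0 || j < 0) {
                throw new IllegalArgumentException("Index below origin: " + line);
            }

            // Grow the row list on demand; TreeMap keeps columns sorted.
            while (rows.size() <= i) {
                rows.add(new TreeMap<>());
            }
            rows.get(i).put(j, x);
        }
        return rows;
    }
}
```

For example, parsing the lines `1 3 0.5`, `1 1 2.0`, and `3 2 1.5` with an index origin of 1 yields three rows, the second of which is empty. A `TreeMap` per row keeps entries ordered by column even when the file itself is unsorted.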
