All Classes and Interfaces

A band matrix is a sparse matrix, whose non-zero entries are confined to a diagonal band, comprising the main diagonal and zero or more diagonals on either side.

Bank32nh

Auto MPG dataset.

Bar

Bars with heights proportional to the value.

BarPlot

A barplot draws bars with heights proportional to the value.

Base

The coordinate base of Canvas.

BatchNorm1dLayer

A batch normalization layer that re-centers and normalizes the output of one layer before feeding it to another.

BatchNorm2dLayer

A batch normalization layer that re-centers and normalizes the output of one layer before feeding it to another.

BBDTree

Balanced Box-Decomposition Tree.

Bernoulli

The response variable is of Bernoulli distribution.

BernoulliDistribution

Bernoulli's distribution is a discrete probability distribution, which takes value 1 with success probability p and value 0 with failure probability q = 1 - p.

BestLocalizedWavelet

Best localized wavelets.

Beta

The beta function, also called the Euler integral of the first kind.

BetaDistribution

The beta distribution is defined on the interval [0, 1] parameterized by two positive shape parameters, typically denoted by α and β.

BFGS

The Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm is an iterative method for solving unconstrained nonlinear optimization problems.

BiconjugateGradient

The biconjugate gradient method to solve systems of linear equations.

BicubicInterpolation

Bicubic interpolation in a two-dimensional regular grid.

Bigram

Bigrams or digrams are groups of two words, and are very commonly used as the basis for simple statistical analysis of text.

Bigram

Collocations are expressions of multiple words which commonly co-occur.

BilinearInterpolation

Bilinear interpolation in a two-dimensional regular grid.

BinaryEncoder

Encodes categorical features using sparse one-hot scheme.

BinarySparseDataset<T>

Binary sparse dataset.

BinarySparseGaussianKernel

Gaussian kernel, also referred to as the RBF kernel or squared exponential kernel.

BinarySparseHyperbolicTangentKernel

The hyperbolic tangent kernel on binary sparse data.

BinarySparseLaplacianKernel

Laplacian kernel, also referred to as the exponential kernel.

BinarySparseLinearKernel

The linear dot product kernel on sparse binary arrays in int[], which are the indices of nonzero elements.

BinarySparseLinearSVM

Binary sparse linear support vector machines for classification.

BinarySparseLinearSVM

Binary sparse linear support vector machines for regression.

BinarySparseMaternKernel

The class of Matérn kernels is a generalization of the Gaussian/RBF.

BinarySparsePolynomialKernel

The polynomial kernel on binary sparse data.

BinarySparseSequenceDataset

Binary sparse sequence dataset.

BinarySparseThinPlateSplineKernel

The Thin Plate Spline kernel on binary sparse data.

Binomial

The response variable is of Binomial distribution.

BinomialDistribution

The binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p.

BinParams

To test a data point in a filter transform or a test property in conditional encoding, a predicate definition of the following forms must be specified:

BIRCH

Balanced Iterative Reducing and Clustering using Hierarchies.

BitcoinPrice

Daily Bitcoin price history from April 28, 2013 to February 20, 2018.

BitString

The standard bit string representation of the solution domain.

BKTree<K,V>

A BK-tree is a metric tree specifically adapted to discrete metric spaces.

BM25

The BM25 weighting scheme, often called Okapi weighting, after the system in which it was first implemented, was developed as a way of building a probabilistic model sensitive to term frequency and document length while not introducing too many additional parameters into the model.

BooleanType

Boolean data type.

BooleanVector

A boolean vector.

Bootstrap

The bootstrap is a general tool for assessing statistical accuracy.

BostonHousing

Boston housing dataset.

BoxPlot

A boxplot is a convenient way of graphically depicting groups of numerical data through their five-number summaries: the smallest observation (sample minimum), lower quartile (Q1), median (Q2), upper quartile (Q3), and largest observation (sample maximum).

BoxTest

Portmanteau test that several autocorrelations of a time series are jointly zero.

BoxTest.Type

The type of test.

BreakIteratorSentenceSplitter

A sentence splitter based on the java.text.BreakIterator, which supports multiple natural languages (selected by locale setting).

BreakIteratorTokenizer

A word tokenizer based on the java.text.BreakIterator, which supports multiple natural languages (selected by locale setting).

BreastCancer

Breast cancer dataset.

Bucket

A bucket is a container for points that all have the same value for hash function g (function g is a vector of k LSH functions).

Button

Action initialized JButton.

ButtonCellRenderer

The ButtonCellRenderer class provides a renderer and an editor that looks like a JButton.

ByteArrayCellRenderer

Byte array renderer in JTable.

Byte string.

Byte data type.

A byte vector.

Static methods that manage cache files.

CalHousing

California housing dataset.

Canvas

Interactive view of a mathematical plot.

CART

Classification and regression tree.

CategoricalEncoder

Categorical variable encoder.

CategoricalMeasure

Categorical data can be stored into groups or categories with the aid of names or labels.

cblas_h

cblas_h.cblas_xerbla

Variadic invoker class for:

void cblas_xerbla(blasint p, char *rout, char *form, ...)

cblas_h.dprintf

Variadic invoker class for:

extern int dprintf(int __fd, const char *restrict __fmt, ...)

cblas_h.fprintf

Variadic invoker class for:

extern int fprintf(FILE *restrict __stream, const char *restrict __format, ...)

cblas_h.fscanf

Variadic invoker class for:

extern int fscanf(FILE *restrict __stream, const char *restrict __format, ...)

cblas_h.printf

Variadic invoker class for:

extern int printf(const char *restrict __format, ...)

cblas_h.scanf

Variadic invoker class for:

extern int scanf(const char *restrict __format, ...)

cblas_h.snprintf

Variadic invoker class for:

extern int snprintf(char *restrict __s, size_t __maxlen, const char *restrict __format, ...)

cblas_h.sprintf

Variadic invoker class for:

extern int sprintf(char *restrict __s, const char *restrict __format, ...)

cblas_h.sscanf

Variadic invoker class for:

extern int sscanf(const char *restrict __s, const char *restrict __format, ...)

CentroidClustering<T,U>

Centroid-based clustering that uses the center of each cluster to group similar data points into clusters.

CharType

Char data type.

CharVector

A char vector.

ChebyshevDistance

Chebyshev distance (or Tchebychev distance), or L_∞ metric is a metric defined on a vector space where the distance between two vectors is the greatest of their differences along any coordinate dimension.
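As a minimal sketch of the definition above, the distance can be computed as the largest absolute coordinate difference. The class and method names here are hypothetical illustrations and are not part of the library's API.

```java
public class ChebyshevSketch {
    /** Returns the greatest absolute difference between x and y along any coordinate. */
    public static double chebyshev(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("Arrays have different lengths");
        }
        double max = 0.0;
        for (int i = 0; i < x.length; i++) {
            max = Math.max(max, Math.abs(x[i] - y[i]));
        }
        return max;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 5.0, 2.0};
        double[] y = {4.0, 3.0, 2.0};
        // Coordinate differences are 3, 2, and 0, so the distance is 3.
        System.out.println(chebyshev(x, y)); // prints 3.0
    }
}
```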

ChiSqTest

Pearson's chi-square test, also known as the chi-square goodness-of-fit test or chi-square test for independence.

ChiSquareDistribution

Chi-square (or chi-squared) distribution with k degrees of freedom is the distribution of a sum of the squares of k independent standard normal random variables.

Cholesky

The Cholesky decomposition of a symmetric, positive-definite matrix.

Chromosome<T>

Artificial chromosomes in genetic algorithm/programming encoding candidate solutions to an optimization problem.

clapack_h

clapack_h_1

ClassificationMetric

An abstract interface to measure the classification performance.

ClassificationMetrics

The classification validation metrics.

ClassificationModel

The classification model.

ClassificationValidation<M>

Classification model validation results.

ClassificationValidations<M>

Classification model validation results.

Classifier<T>

A classifier assigns an input object into one of a given number of categories.

Classifier.Trainer<T,M>

The classifier trainer.

ClassLabels

Map arbitrary class labels to [0, k), where k is the number of classes.
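One way such a mapping could work is sketched below: each distinct label is replaced by its rank among the sorted distinct labels. This is an assumption for illustration only; the class name, the method name, and the sorted ordering are hypothetical and may differ from what the library actually does.

```java
import java.util.Arrays;

public class ClassLabelSketch {
    /** Maps arbitrary integer labels to [0, k) by rank among the sorted distinct values. */
    public static int[] encode(int[] labels) {
        // The k distinct labels in ascending order; index i is the new label.
        int[] distinct = Arrays.stream(labels).distinct().sorted().toArray();
        int[] encoded = new int[labels.length];
        for (int i = 0; i < labels.length; i++) {
            encoded[i] = Arrays.binarySearch(distinct, labels[i]);
        }
        return encoded;
    }

    public static void main(String[] args) {
        int[] labels = {10, -1, 10, 7};
        // Distinct sorted labels are {-1, 7, 10}, so k = 3.
        System.out.println(Arrays.toString(encode(labels))); // prints [2, 0, 2, 1]
    }
}
```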

Clustering

Clustering utility functions.

Clustering.Options

Iterative clustering algorithm hyperparameters.

ClusteringMetric

An abstract interface to measure the clustering performance.

CoifletWavelet

Coiflet wavelets.

Collectors

Stream collectors for Dataset, DataFrame, and Matrix.

ColonCancer

Colon cancer dataset.

ColorCellEditor

Color editor in JTable.

ColorCellRenderer

Color renderer in JTable.

ColumnTransform

Column-wise data transformation.

CompleteLinkage

Complete linkage.

CompletionPrediction

Prompt completion prediction.

Complex

Complex number.

Complex.Array

Packed array of complex numbers for better memory efficiency.

Concat

Concatenating views.

Concept

Concept is a set of synonyms, i.e.

Config

Vega-Lite's config object lists configuration properties of a visualization for creating a consistent theme.

ConfusionMatrix

The confusion matrix of truth and predictions.

Constant

A constant value in the formula.

ContingencyTable

The contingency table.

Contour

A contour plot is a graphical technique for representing a 3-dimensional surface by plotting constant z slices, called contours, on a 2-dimensional format.

Conv2dLayer

A convolutional layer.

Conv2dNormActivation

Convolution2d-Normalization-Activation block.

Conv2dNormActivation.Options

Conv2dNormActivation configurations.

CooccurrenceKeywords

Keyword extraction from a single document using word co-occurrence statistical information.

Corpus

A corpus is a collection of documents.

CorrelationDistance

Correlation distance is defined as 1 - correlation coefficient.

CorTest

Correlation test.

Cost

Neural network cost function.

CoverTree<K,V>

Cover tree is a data structure for generic nearest neighbor search, which is especially efficient in spaces with small intrinsic dimension.

CPU

CPU dataset.

CRF

First-order linear conditional random field.

CRF.Options

CRF hyperparameters.

CRFLabeler<T>

First-order CRF sequence labeler.

CrossEntropy

Cross entropy generalizes the log loss metric to multiclass problems.

Crossover

The types of crossover operation.

CrossValidation

Cross-validation is a technique for assessing how the results of a statistical analysis will generalize to an independent data set.

CSV

Reads and writes files in variations of the Comma Separated Value (CSV) format.

CubicSplineInterpolation1D

Cubic spline interpolation.

CubicSplineInterpolation2D

Cubic spline interpolation in a two-dimensional regular grid.

CUDA

NVIDIA CUDA helper functions.

D4Wavelet

The simplest and most localized wavelet, Daubechies wavelet of 4 coefficients.

Data

The basic data model used by Vega-Lite is tabular data.

DataFrame

Two-dimensional, potentially heterogeneous tabular data.

DataFrameClassifier

Classification trait on DataFrame.

DataFrameClassifier.Trainer<M>

The classifier trainer.

DataFrameRegression

Regression trait on DataFrame.

DataFrameRegression.Trainer<M>

The regression trainer.

DataFrameTableModel

A table model for data frames with paging.

Dataset<D,T>

An immutable collection of data objects.

Dataset

A dataset consists of data and an associated target (label) and can be iterated in mini-batches.

DataType

The interface of data types.

DataType.ID

Data type ID.

DataTypes

To get a specific data type, users should use singleton objects and factory methods in this class.

Date

Date/time feature extractor.

DateCellEditor

Implements a cell editor that uses a formatted text field to edit Date values.

DateCellRenderer

Date cell renderer.

DateFeature

The date/time features.

Dates

Date and time utility functions.

DateTime data type.

Date data type.

Daubechies wavelets.

Density-Based Spatial Clustering of Applications with Noise.

DecimalType

Arbitrary-precision decimal data type.

DecisionNode

A leaf node in decision tree.

DecisionTree

Decision tree.

DecisionTree.Options

Decision tree hyperparameters.

Default

Credit card default dataset.

DefaultTableHeaderCellRenderer

A default cell renderer for a JTableHeader.

Delete

Remove a factor from the formula.

DENCLUE

DENsity CLUstering.

DENCLUE.Options

DENCLUE hyperparameters.

Dendrogram

A dendrogram is a tree diagram frequently used to illustrate the arrangement of the clusters produced by hierarchical clustering.

DenseMatrix

A dense matrix is a matrix where a large proportion of its elements are non-zero.

DensityTransform

The density transform performs one-dimensional kernel density estimation over an input data stream and generates a new data stream of samples of the estimated densities.

DeterministicAnnealing

Deterministic annealing clustering.

DeterministicAnnealing.Options

Deterministic annealing hyperparameters.

Device

The compute device on which a tensor is stored.

DeviceType

The compute device type.

Diabetes

Diabetes dataset.

Diag

The flag indicating whether a triangular matrix has unit diagonal elements.

Dictionary

A dictionary is a set of words in some natural language.

DifferentiableFunction

A differentiable function is a function whose derivative exists at each point in its domain.

DifferentiableMultivariateFunction

A differentiable function is a function whose derivative exists at each point in its domain.

DiscreteDistribution

Univariate discrete distributions.

DiscreteExponentialFamily

The purpose of this interface is mainly to define the method M that is the Maximization step in the EM algorithm.

DiscreteExponentialFamilyMixture

The finite mixture of distributions from discrete exponential family.

DiscreteMixture

The finite mixture of discrete distributions.

DiscreteMixture.Component

A component in the mixture distribution is defined by a distribution and its weight in the mixture.

DiscreteNaiveBayes

Naive Bayes classifier for document classification in NLP.

DiscreteNaiveBayes.Model

The generation models of naive Bayes classifier.

Distance<T>

An interface to calculate a distance measure between two objects.

Distribution

Probability distribution of a univariate random variable.

Div

The term of a / b expression.

Dot

The special term "." means all columns not otherwise in the formula in the context of a data frame.

DotProductKernel

Dot product kernel depends only on the dot product of x and y.

DoubleArrayCellEditor

Implements a cell editor that uses a formatted text field to edit double[] values.

DoubleArrayCellRenderer

Double array renderer in JTable.

DoubleArrayList

A resizeable, array-backed list of double primitives.

DoubleCellEditor

Implements a cell editor that uses a formatted text field to edit Double values.

DoubleConsumer

Double precision matrix element stream consumer.

DoubleFunction

The generic term of applying a double function.

DoubleHeapSelect

This class tracks the smallest values seen thus far in a stream of values.

DoubleType

Double data type.

DoubleVector

A double vector.

DropoutLayer

A dropout layer that randomly zeroes some of the elements of the input tensor with probability p during training.

DynamicTimeWarping<T>

Dynamic time warping is an algorithm for measuring similarity between two sequences which may vary in time or speed.

Edge

The connection between neurons.

EditDistance

The Edit distance between two strings is a metric for measuring the amount of difference between two sequences.

EfficientNet

EfficientNet is an image classification model family.

Eigen

Eigenvalue algorithms such as power iteration and Lanczos algorithms.

EigenRange

The option of eigenvalue range.

ElasticNet

Elastic Net regularization.

ElasticNet.Options

Elastic Net hyperparameters.

EmbeddingLayer

An embedding layer that is a simple lookup table that stores embeddings of a fixed dictionary and size.

EmpiricalDistribution

An empirical distribution function, or empirical cdf, is a cumulative probability distribution function that concentrates probability 1/n at each of the n numbers in a sample.

EnglishDictionary

A concise dictionary of common terms in English.

EnglishPOSLexicon

An English lexicon with part-of-speech tags.

EnglishPunctuations

Punctuation marks in English.

EnglishStopWords

Several sets of English stop words.

Erf

The error function.

Error

The number of errors in the population.

EuclideanDistance

Euclidean distance.

Eurodist

Distances between European cities.

EVD

Eigenvalue decomposition.

EVDJob

The option of whether to compute eigenvectors.

Exp

The contrast function when the independent components are highly super-Gaussian, or when robustness is very important.

ExponentialDistribution

An exponential distribution describes the times between events in a Poisson process, in which events occur continuously and independently at a constant average rate.

ExponentialFamily

The exponential family is a class of probability distributions sharing a certain form.

ExponentialFamilyMixture

The finite mixture of distributions from exponential family.

ExponentialVariogram

Exponential variogram.

Facet

A facet is a trellis plot (or small multiple) of a series of similar plots that displays different subsets of the same data, facilitating comparison across subsets.

FacetField

Facet field definition object.

FactorCrossing

Factor crossing.

FactorInteraction

The interaction of all the factors appearing in the term.

Fallout

Fall-out, false alarm rate, or false positive rate (FPR)

FDistribution

F-distribution arises in the testing of whether two observed samples have the same variance.

FDR

The false discovery rate (FDR) is the ratio of false positives to combined true and false positives, which is actually 1 - precision.
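The relation FDR = FP / (FP + TP) = 1 - precision can be illustrated with a small sketch. The class and method names are hypothetical, and labels are assumed to be 0 (negative) or 1 (positive); this is not the library's implementation.

```java
public class FdrSketch {
    /** Counts {true positives, false positives} among samples predicted as positive (1). */
    private static int[] counts(int[] truth, int[] prediction) {
        int tp = 0, fp = 0;
        for (int i = 0; i < truth.length; i++) {
            if (prediction[i] == 1) {
                if (truth[i] == 1) tp++; else fp++;
            }
        }
        return new int[] {tp, fp};
    }

    /** FDR = FP / (FP + TP). Returns NaN when nothing is predicted positive. */
    public static double fdr(int[] truth, int[] prediction) {
        int[] c = counts(truth, prediction);
        return (double) c[1] / (c[0] + c[1]);
    }

    /** Precision = TP / (FP + TP). Returns NaN when nothing is predicted positive. */
    public static double precision(int[] truth, int[] prediction) {
        int[] c = counts(truth, prediction);
        return (double) c[0] / (c[0] + c[1]);
    }

    public static void main(String[] args) {
        int[] truth      = {1, 0, 1, 1, 0, 1};
        int[] prediction = {1, 1, 1, 1, 0, 0};
        // Four positive predictions: three correct (TP = 3), one wrong (FP = 1).
        System.out.println(fdr(truth, prediction));       // prints 0.25
        System.out.println(precision(truth, prediction)); // prints 0.75
    }
}
```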

Feature

A feature in the formula once bound to a schema.

FeedForward

Feedforward layer in Transformer.

Field

Encoding field definition object.

Figure

A figure serves as the canvas on which plots and other elements are drawn.

FigurePane

The Swing container of a figure with toolbar.

FileChooser

File chooser with file/images preview.

FileChooser.SimpleFileFilter

A simple extension-based file filter.

FinishReason

The reasons that the chat completions finish.

Fitness<T>

A measure to evaluate the fitness of chromosomes.

FLD

Fisher's linear discriminant.

FloatArrayCellRenderer

Float array renderer in JTable.

FloatArrayFormatter

Text formatter for floating array values.

FloatArrayList

A resizeable, array-backed list of float primitives.

FloatHeapSelect

This class tracks the smallest values seen thus far in a stream of values.

FloatType

Float data type.

FloatVector

A float vector.

FontCellEditor

Font editor in JTable.

FontCellRenderer

Font renderer in JTable.

FontChooser

The FontChooser class is a Swing component for font selection with JFileChooser-like APIs.

FormatConfig

These config properties define the default number and time formats for text marks as well as axes, headers, tooltip, and legends.

Formula

The model fitting formula in a compact symbolic form.

FPGrowth

Frequent item set mining based on the FP-growth (frequent pattern growth) algorithm, which employs an extended prefix-tree (FP-tree) structure to store the database in a compressed form.

FPTree

FP-tree data structure used in FP-growth (frequent pattern growth) algorithm for frequent item set mining.

FRegression

Univariate F-statistic and p-values, which can be used as a feature selection criterion for linear regression problems.

FScore

The F-score (or F-measure) considers both the precision and the recall of the test to compute the score.

FTest

F test of the hypothesis that two independent samples come from normal distributions with the same variance, against the alternative that they come from normal distributions with different variances.

Function

An interface representing a univariate real function.

FusedMBConv

Fused-MBConv replaces the depthwise-conv3×3 and expansion-conv1×1 in MBConv with single regular conv3×3.

GAFE

Genetic algorithm based feature selection.

Gamma

The gamma, digamma, and incomplete gamma functions.

GammaDistribution

The Gamma distribution is a continuous probability distribution with a scale parameter θ and a shape parameter k.

Gaussian

Gaussian kernel, also referred to as the RBF kernel or squared exponential kernel.

GaussianDistribution

The normal distribution or Gaussian distribution is a continuous probability distribution that describes data that clusters around a mean.

GaussianKernel

Gaussian kernel, also referred to as the RBF kernel or squared exponential kernel.

GaussianMixture

Synthetic Gaussian mixture dataset.

GaussianMixture

Finite univariate Gaussian mixture.

GaussianProcessRegression<T>

Gaussian Process for Regression.

GaussianProcessRegression.Options

Gaussian process regression hyperparameters.

GaussianRadialBasis

Gaussian RBF.

GaussianVariogram

Gaussian variogram.

GELU

Gaussian Error Linear Unit activation function.

GeneticAlgorithm<T>

A genetic algorithm (GA) is a search heuristic that mimics the process of natural evolution.

GeometricDistribution

The geometric distribution is a discrete probability distribution of the number X of Bernoulli trials needed to get one success, supported on the set {1, 2, 3, …}.

GHA

Generalized Hebbian Algorithm.

GLM

Generalized linear models.

GLM.Options

GLM hyperparameters.

GloVe

Global Vectors for Word Representation.

GLU

Gated Linear Unit activation function.

GMeans

G-Means clustering algorithm, an extended K-Means which tries to automatically determine the number of clusters by normality test.

GoodTuring

Good–Turing frequency estimation.

GradientTreeBoost

Gradient boosting for classification.

GradientTreeBoost

Gradient boosting for regression.

GradientTreeBoost.Options

Gradient tree boosting hyperparameters.

GradientTreeBoost.Options

Gradient tree boosting hyperparameters.

GradientTreeBoost.TrainingStatus

Training status per tree.

GradientTreeBoost.TrainingStatus

Training status per tree.

Graph

A graph is an abstract representation of a set of objects where some pairs of the objects are connected by links.

Graph edge.

A 2D grid plot.

Group normalization.

Growing Neural Gas.

Haar wavelet.

In information theory, the Hamming distance between two strings of equal length is the number of positions for which the corresponding symbols are different.

HardShrink

Hard Shrink activation function.

Hash

The hash function for Euclidean spaces.

HashEncoder

Feature hashing, also known as the hashing trick, is a fast and space-efficient way of vectorizing features, i.e.

HashValueParzenModel

Hash value Parzen model for multi-probe hash.

Headless

Aids in creating swing components in a "headless" environment.

HeapSelect<T>

This class tracks the smallest values seen thus far in a stream of values.

Heatmap

A heat map is a graphical representation of data where the values taken by a variable in a two-dimensional map are represented as colors.

HellingerKernel

The Hellinger kernel.

Hexmap

Hexmap is a variant of heat map by replacing rectangle cells with hexagon cells.

Hexmap.Tooltip

The lambda interface to retrieve the tooltip of a cell.

HiddenLayer

A hidden layer in the neural network.

HiddenLayerBuilder

The builder of hidden layers.

HierarchicalClustering

Agglomerative Hierarchical Clustering.

Histogram

Histogram utilities.

Histogram

A histogram is a graphical display of tabulated frequencies, shown as bars.

Histogram3D

A histogram is a graphical display of tabulated frequencies, shown as bars.

HMM

First-order Hidden Markov Model.

HMMLabeler<T>

First-order Hidden Markov Model sequence labeler.

HMMPOSTagger

Part-of-speech tagging with hidden Markov model.

HyperbolicTangent

The hyperbolic tangent kernel.

HyperbolicTangentKernel

The hyperbolic tangent kernel.

HyperGeometricDistribution

The hypergeometric distribution is a discrete probability distribution that describes the number of successes in a sequence of n draws from a finite population without replacement, just as the binomial distribution describes the number of successes for draws with replacement.

Hyperparameters

Hyperparameter configuration.

Hyphen

Hyphen sequence dataset.

Hypothesis

Hypothesis test functions.

Chi-square test.

Correlation test.

F-test.

The Kolmogorov-Smirnov test (K-S test).

Hypothesis.t

t-test.

ICA

Independent Component Analysis (ICA) is a computational method for separating a multivariate signal into additive components.

ICA.Options

ICA hyperparameters.

ImageDataset

Each of these directories should contain one subdirectory for each class in the dataset.

ImageNet

ImageNet class labels.

ImageSegmentation

Image segmentation dataset.

ImputeTransform

The impute transform groups data and determines missing values of the key field within each group.

Index

Indexing a tensor.

Index

Immutable sequence used for indexing.

InformationValue

Information Value (IV) measures the predictive strength of a feature for a binary dependent variable.

Input

Static methods that return the InputStream/Reader of a file or URL.

InputLayer

An input layer in the neural network.

IntArray2D

2-dimensional array of integers.

IntArrayElementConsumer

Represents an operation that accepts an array element of integer value and returns no result.

IntArrayElementFunction

Represents a function that accepts an array element of integer value and produces a result.

IntArrayList

A resizeable, array-backed list of integer primitives.

IntDoubleHashMap

HashMap<int, double> for primitive types.

IntegerArrayCellEditor

Implements a cell editor that uses a formatted text field to edit int[] values.

IntegerArrayCellRenderer

Integer array renderer in JTable.

IntegerArrayFormatter

Text formatter for integer array values.

IntegerCellEditor

Implements a cell editor that uses a formatted text field to edit Integer values.

Intercept

The flag indicating whether the intercept should be included in the model.

InternalNode

An internal node in CART.

Interpolation

In numerical analysis, interpolation is a method of constructing new data points within the range of a discrete set of known data points.

Interpolation2D

Interpolation of 2-dimensional data.

IntervalScale

The interval scale allows for the degree of difference between items, but not the ratio between them.

IntFunction

The generic term of applying an integer function.

IntFunction

An interface representing a univariate int function.

IntHashSet

HashSet<int> for primitive types.

IntHeapSelect

This class tracks the smallest values seen thus far in a stream of values.

IntPair

A tuple of 2 integer elements.

IntSet

A set of integers.

IntType

Integer data type.

IntVector

An integer vector.

InverseMultiquadricRadialBasis

Inverse multiquadric RBF.

InvertibleColumnTransform

Invertible column-wise transformation.

InvertibleTransform

Invertible data transformation.

IQAgent

Incremental quantile estimation.

Iris

Iris flower dataset.

IsolationForest

Isolation forest is an unsupervised learning algorithm for anomaly detection that works on the principle of isolating anomalies.

IsolationForest.Options

Isolation Forest hyperparameters.

IsolationTree

Isolation tree.

Isoline

Contour contains a list of segments.

IsoMap

Isometric feature mapping.

IsoMap.Options

IsoMap hyperparameters.

IsotonicMDS

Kruskal's non-metric MDS.

IsotonicMDS.Options

Kruskal's non-metric MDS hyperparameters.

IsotonicRegressionScaling

A method to calibrate decision function value to probability.

IsotropicKernel

Isotropic kernel.

ItemSet

A set of items.

IterativeAlgorithmController<T>

A controller for iterative algorithms.

JaccardDistance<T>

The Jaccard index, also known as the Jaccard similarity coefficient, is a statistic used for comparing the similarity and diversity of sample sets.

JensenShannonDistance

The Jensen-Shannon divergence is a popular method of measuring the similarity between two probability distributions.

JSON

Reads JSON datasets.

JSON.Mode

JSON files in single-line or multi-line mode.

JTensor

A simple on-heap Tensor implementation.

KDTree<E>

A KD-tree (short for k-dimensional tree) is a space-partitioning data structure for organizing points in a k-dimensional space.

KernelDensity

Kernel density estimation is a non-parametric way of estimating the probability density function of a random variable.

KernelMachine<T>

Kernel machines.

KernelMachine<T>

The learning methods building on kernels.

KernelPCA

Kernel PCA transform.

Kin8nm

Robot arm simulation dataset.

KMeans

K-Means clustering.

KMedoids<T>

K-Medoids clustering based on randomized search (CLARANS).

KMedoidsImputer

Missing value imputation by K-Medoids clustering.

KModes

K-Modes clustering.

KNN<T>

K-nearest neighbor classifier.

KNNImputer

Missing value imputation with k-nearest neighbors.

KNNSearch<K,V>

Retrieves the top k nearest neighbors to the query.

KPCA<T>

Kernel principal component analysis.

KPCA.Options

Kernel PCA hyperparameters.

KrigingInterpolation

Kriging interpolation for the data points irregularly distributed in space.

KrigingInterpolation1D

Kriging interpolation for the data points irregularly distributed in space.

KrigingInterpolation2D

Kriging interpolation for the data points irregularly distributed in space.

KSTest

The Kolmogorov-Smirnov test (K-S test) is a form of minimum distance estimation used as a non-parametric test of equality of one-dimensional probability distributions.

Kurtosis

The kurtosis of the probability density function of a signal.

Label

Label is a single line text.

LamarckianChromosome<T>

Artificial chromosomes used in Lamarckian algorithm that is a hybrid of evolutionary computation and a local improver such as hill-climbing.

LancasterStemmer

The Paice/Husk Lancaster stemming algorithm.

LaplaceInterpolation

Laplace's interpolation to restore missing or unmeasured values on a 2-dimensional evenly spaced regular grid.

Laplacian

Laplacian kernel, also referred to as the exponential kernel.

LaplacianEigenmap

Laplacian Eigenmaps.

LaplacianEigenmap.Options

Laplacian Eigenmaps hyperparameters.

LaplacianKernel

Laplacian kernel, also referred to as the exponential kernel.

LASSO

Lasso (least absolute shrinkage and selection operator) regression.

LASSO.Options

Lasso regression hyperparameters.

LASVM<T>

LASVM is an approximate SVM solver that uses online approximation.

Layer

A layer in the neural network.

Layer

A layer in the neural network.

Layer

To superimpose one chart on top of another.

LayerBlock

A block is a combination of one or more layers.

LayerBuilder

The builder of layers.

Layout

The memory layout of a Tensor.

LDA

Linear discriminant analysis.

LeafNode

A leaf node in decision tree.

LeakyReLU

Leaky Rectified Linear Unit activation function.

LeeDistance

In coding theory, the Lee distance is a distance between two strings x₁x₂...x_n and y₁y₂...y_n of equal length n over the q-ary alphabet {0, 1, ..., q-1} of size q >= 2, defined as

Legend

Legend is a single-line text whose coordinates are proportional to the base coordinates.

Legend

Similar to axes, legends visualize scales.

LevenbergMarquardt

The Levenberg–Marquardt algorithm.

LibrasMovement

LIBRAS movement dataset.

Line

This class represents a polyline in the plot.

Line.Style

The supported styles of lines.

LinearInterpolation

Piecewise linear interpolation.

LinearKernel

The linear dot product kernel.

LinearKernelMachine

Linear kernel machine.

LinearLayer

A fully connected linear layer.

LinearModel

Linear model.

LinearSearch<K,V>

Brute force linear nearest neighbor search.

LinearSVM

Linear support vector machines for classification.

LinearSVM

Linear support vector machines for regression.

LinePlot

Line plot is a special scatter plot which connects points by straight lines.

Linkage

A measure of dissimilarity between clusters (i.e.

Llama

LLaMA model specification.

LLE

Locally Linear Embedding.

LLE.Options

LLE hyperparameters.

LoessTransform

The loess transform (for locally-estimated scatterplot smoothing) uses locally-estimated regression to produce a trend line.

LogCosh

A good general-purpose contrast function for ICA.

LogisticDistribution

The logistic distribution is a continuous probability distribution whose cumulative distribution function is the logistic function, which appears in logistic regression and feedforward neural networks.

LogisticRegression

Logistic regression.

LogisticRegression.Binomial

Binomial logistic regression.

LogisticRegression.Multinomial

Multinomial logistic regression.

LogisticRegression.Options

Logistic regression hyperparameters.

LogLoss

Log loss is an evaluation metric for binary classifiers, and it is sometimes also the optimization objective, as in logistic regression and neural networks.

LogNormalDistribution

A log-normal distribution is a probability distribution of a random variable whose logarithm is normally distributed.

LogSigmoid

Log sigmoid activation function.

LogSoftmax

Log softmax activation function.

LongArrayCellRenderer

Long array renderer in JTable.

Longley

The classic 1967 Longley dataset.

LongType

Long data type.

LongVector

A long vector.

LOOCV

Leave-one-out cross validation.

LookupData

The secondary data source of a lookup transform.

Loss

Loss functions.

Loss

Regression loss function.

Loss.Type

The type of loss.

LSH<E>

Locality-Sensitive Hashing.

The LU decomposition.

MAD

Mean absolute deviation error.

MahalanobisDistance

In statistics, Mahalanobis distance is based on correlations between variables by which different patterns can be identified and analyzed.

ManhattanDistance

Manhattan distance, also known as L₁ distance or L₁ norm, is the sum of the absolute differences of the coordinates of two points.

Mark

Mark definition object.

Matern

The class of Matérn kernels is a generalization of the Gaussian/RBF.

MaternKernel

The class of Matérn kernels is a generalization of the Gaussian/RBF.

MathEx

Extra basic numeric functions.

Matrix

Mathematical matrix interface.

MatrixTableModel

A table model for matrices with paging.

MatthewsCorrelation

Matthews correlation coefficient.

MaxAbsScaler

Scales each feature by its maximum absolute value.

Maxent

Maximum Entropy Classifier.

Maxent.Binomial

Binomial maximum entropy classifier.

Maxent.Multinomial

Multinomial maximum entropy classifier.

Maxent.Options

Maximum entropy classifier hyperparameters.

MaxPool2dLayer

A max pooling layer that reduces a tensor by combining cells, and assigning the maximum value of the input cells to the output cell.

MBConv

Mobile inverted bottleneck convolution.

MBConvConfig

EfficientNet block configuration.

MDS

Classical multidimensional scaling, also known as principal coordinates analysis.

MDS.Options

MDS hyperparameters.

Measure

Level of measurement or scale of measure is a classification that describes the nature of information within the values assigned to variables.

MEC<T>

Non-parametric Minimum Conditional Entropy Clustering.

MEC.Options

MEC hyperparameters.

MercerKernel<T>

Mercer kernel, also called covariance function in Gaussian process.

MersenneTwister

32-bit Mersenne Twister.

MersenneTwister64

64-bit Mersenne Twister.

Message

Dialog messages.

Metric

The Metric class keeps track of metric states, which enables metrics to compute values through accumulation and synchronization across multiple processes.

Metric<T>

A metric function defines a distance between elements of a set.

MinkowskiDistance

Minkowski distance of order p, or L_p-norm, is a generalization of Euclidean distance, which is the L₂-norm.
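As a rough illustration of how the order p ties the Manhattan (p = 1) and Euclidean (p = 2) cases together, here is a minimal sketch on dense arrays. The class and method names are hypothetical and do not reflect the library's MinkowskiDistance API.

```java
/** Illustrative sketch only; not the library's MinkowskiDistance class. */
final class MinkowskiSketch {
    /** Computes (sum_i |x_i - y_i|^p)^(1/p) for p >= 1. */
    static double minkowski(double[] x, double[] y, double p) {
        if (p < 1.0) {
            throw new IllegalArgumentException("p must be at least 1");
        }
        if (x.length != y.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += Math.pow(Math.abs(x[i] - y[i]), p);
        }
        return Math.pow(sum, 1.0 / p);
    }

    public static void main(String[] args) {
        double[] a = {0.0, 0.0};
        double[] b = {3.0, 4.0};
        // p = 1 gives the Manhattan distance 3 + 4; p = 2 gives the Euclidean distance.
        System.out.println(minkowski(a, b, 1.0));
        System.out.println(minkowski(a, b, 2.0));
    }
}
```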

Mixture

A finite mixture model is a probabilistic model for density estimation using a mixture distribution.

Mixture.Component

A component in the mixture distribution is defined by a distribution and its weight in the mixture.

MLP

Fully connected multilayer perceptron neural network for classification.

MLP

Fully connected multilayer perceptron neural network for regression.

MNIST

MNIST dataset.

Model

The deep learning models.

Model

The GLM model specification.

Model

Generic model interface.

ModelArgs

LLaMA model hyperparameters.

ModelSelection

Model selection criteria.

MPLSH<E>

Multi-Probe Locality-Sensitive Hashing.

MSE

Mean squared error.

Mul

The term of a * b expression.

MultiColumnSortTableHeaderCellRenderer

An extension of DefaultTableHeaderCellRenderer that paints sort icons on the header of each sorted column with varying opacity.

MultiFigurePane

Interactive view for multiple mathematical plots.

MultilayerPerceptron

Fully connected multilayer perceptron neural network.

MultiProbeHash

The hash function for data in Euclidean spaces.

MultiProbeSample

Training sample for MPLSH.

MultiquadricRadialBasis

Multiquadric RBF.

MultivariateDistribution

Probability distribution of multivariate random variable.

MultivariateExponentialFamily

The purpose of this interface is mainly to define the method M that is the Maximization step in the EM algorithm.

MultivariateExponentialFamilyMixture

The finite mixture of distributions from multivariate exponential family.

MultivariateFunction

An interface representing a multivariate real function.

MultivariateGaussianDistribution

Multivariate Gaussian distribution.

MultivariateGaussianMixture

Finite multivariate Gaussian mixture.

MultivariateMixture

The finite mixture of multivariate distributions.

MultivariateMixture.Component

A component in the mixture distribution is defined by a distribution and its weight in the mixture.

MurmurHash2

MurmurHash is a very fast, non-cryptographic hash suitable for general hash-based lookup.

MurmurHash3

MurmurHash is a very fast, non-cryptographic hash suitable for general hash-based lookup.

MutableInt

A mutable int wrapper.

MutableLSH<E>

Mutable LSH.

MutualInformation

Mutual Information for comparing clustering.

NaiveBayes

Naive Bayes classifier.

NearestNeighborGraph

The k-nearest neighbor graph builder.

NegativeBinomialDistribution

Negative binomial distribution arises as the probability distribution of the number of successes in a series of independent and identically distributed Bernoulli trials needed to get a specified (non-random) number r of failures.

Neighbor<K,V>

The immutable object encapsulates the results of nearest neighbor search.

Neighborhood

The neighborhood function for 2-dimensional lattice topology (e.g.

NeuralGas

Neural Gas soft competitive learning algorithm.

NeuralMap

NeuralMap is an efficient competitive learning algorithm inspired by growing neural gas and BIRCH.

Neuron

The neuron vertex in the growing neural gas network.

NGram

An n-gram is a contiguous sequence of n words from a given sequence of text.
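To make "contiguous sequence of n words" concrete, the following sketch slides a window of length n over an already tokenized sentence. The class and method names are hypothetical; this is not the library's NGram class, and tokenization is assumed to have happened elsewhere.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; not the library's NGram class. */
final class NGramSketch {
    /** Returns every contiguous run of n words, joined by single spaces. */
    static List<String> ngrams(String[] words, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> result = new ArrayList<>();
        // The last valid window starts at words.length - n.
        for (int i = 0; i + n <= words.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(words, i, i + n)));
        }
        return result;
    }

    public static void main(String[] args) {
        String[] words = {"to", "be", "or", "not"};
        // Bigrams: "to be", "be or", "or not"
        System.out.println(ngrams(words, 2));
    }
}
```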

NGram

An n-gram is a contiguous sequence of n words from a given sequence of text.

Node

CART tree node.

NominalNode

A node with a nominal split variable.

NominalScale

Nominal variables take on a limited number of unordered values.

NominalSplit

The data about a potential split for a leaf node.

NormalizedMutualInformation

Normalized Mutual Information (NMI) for comparing clustering.

NormalizedMutualInformation.Method

The normalization method.

Normalizer

Normalize samples individually to unit norm.

Normalizer

Normalization transforms text into a canonical form by removing unwanted variations.

Normalizer.Norm

Vector norm.

NullableBooleanVector

A nullable boolean vector.

NullableByteVector

A nullable byte vector.

NullableCharVector

A nullable char vector.

NullableDoubleVector

A nullable double vector.

NullableFloatVector

A nullable float vector.

NullableIntVector

A nullable integer vector.

NullableLongVector

A nullable long vector.

NullablePrimitiveVector

Abstract base class implementation of ValueVector interface.

NullableShortVector

A nullable short vector.

NumberCellRenderer

Number renderer in JTable.

NumberVector<T>

A number object vector.

NumericalMeasure

Numerical data, also called quantitative data.

ObjectType

Object data type.

ObjectVector<T>

A generic vector.

OCSVM<T>

One-class support vector machine.

OLS

Ordinary least squares.

OLS.Method

Computational methods to fit the model.

OLS.Options

Least squares hyperparameters.

OneVersusOne<T>

One-vs-one strategy for reducing the problem of multiclass classification to multiple binary classification problems.

OneVersusRest<T>

One-vs-rest (or one-vs-all) strategy for reducing the problem of multiclass classification to multiple binary classification problems.

Operator

The infix bifunction term.

Optimizer

Optimizer functions.

Order

Matrix layout.

OrdinalNode

A node with an ordinal split variable (real-valued or ordinal categorical value).

OrdinalScale

The ordinal type allows for rank order (1st, 2nd, 3rd, etc.) by which data can be sorted, but still does not allow for relative degree of difference between them.

OrdinalSplit

The data about a potential split for a leaf node.

OutputFunction

The output function of neural networks.

OutputLayer

The output layer in the neural network.

OutputLayerBuilder

The builder of output layers.

PageRank

PageRank is a link analysis algorithm, and it assigns a numerical weighting to each element of a hyperlinked set of documents, such as the World Wide Web, with the purpose of "measuring" its relative importance within the set.

PageTableModel

A table model that performs "paging" of its data.

PairingHeap<E>

A pairing heap is a type of heap data structure with relatively simple implementation and excellent practical amortized performance.

Palette

Color palette generator.

ParagraphSplitter

A paragraph splitter segments text into paragraphs.

Parquet

Apache Parquet is a columnar storage format that supports nested data structures.

Partitioning

Clustering partitions.

Paths

Static methods that return a Path by converting a path string or URI.

PCA

Principal component analysis.

PearsonKernel

Pearson VII universal kernel.

PenDigits

Pen-based recognition of handwritten digits dataset.

PennTreebankPOS

The Penn Treebank Tag set.

PennTreebankTokenizer

A word tokenizer that tokenizes English sentences using the conventions used by the Penn Treebank.

PerfectHash

A perfect hash of an array of strings to their index in the array.

PerfectMap<T>

Perfect hash based immutable map.

PerfectMap.Builder<T>

The builder of perfect map.

PivotTransform

The pivot transform maps unique values from a field to new aggregated fields (columns) in the output stream.

Planes2D

2D planes synthetic dataset.

PlattScaling

Platt scaling or Platt calibration is a way of transforming the outputs of a classification model into a probability distribution over classes.

Plot

The abstract base class of plots.

Point

One or more points in the plot.

Poisson

The response variable is of Poisson distribution.

PoissonDistribution

Poisson distribution expresses the probability of a number of events occurring in a fixed period of time if these events occur with a known average rate and independently of the time since the last event.

Polynomial

The polynomial kernel.

PolynomialKernel

The polynomial kernel.

PorterStemmer

Porter's stemming algorithm.

PositionalEncoding

Positional encoding in original Transformer.

POSTagger

Part-of-speech tagging (POS tagging) is the process of marking up the words in a sentence as corresponding to a particular part of speech.

PosterioriModel

Pre-computed posteriori probabilities for generating multiple probes.

PowerVariogram

Power variogram.

Precision

The precision or positive predictive value (PPV) is the ratio of true positives to combined true and false positives, which is different from sensitivity.

Precision

The precision or positive predictive value (PPV) is the ratio of true positives to combined true and false positives, which is different from sensitivity.
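The difference between precision, TP / (TP + FP), and sensitivity (recall), TP / (TP + FN), can be sketched as below. The class and method names are hypothetical and are not the library's Precision or Recall API; the label 1 is assumed to mark the positive class.

```java
/** Illustrative sketch only; not the library's Precision or Recall classes. */
final class PrecisionSketch {
    /** Returns {TP, FP, FN}, treating label 1 as the positive class. */
    private static int[] counts(int[] truth, int[] prediction) {
        if (truth.length != prediction.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        int tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < truth.length; i++) {
            if (prediction[i] == 1 && truth[i] == 1) tp++;
            else if (prediction[i] == 1) fp++;
            else if (truth[i] == 1) fn++;
        }
        return new int[]{tp, fp, fn};
    }

    /** TP / (TP + FP); NaN when nothing was predicted positive. */
    static double precision(int[] truth, int[] prediction) {
        int[] c = counts(truth, prediction);
        return (double) c[0] / (c[0] + c[1]);
    }

    /** TP / (TP + FN); NaN when there are no actual positives. */
    static double recall(int[] truth, int[] prediction) {
        int[] c = counts(truth, prediction);
        return (double) c[0] / (c[0] + c[2]);
    }

    public static void main(String[] args) {
        int[] truth      = {1, 0, 1, 1, 0, 1};
        int[] prediction = {1, 1, 1, 0, 0, 0};
        // TP = 2, FP = 1, FN = 2, so precision is 2/3 while recall is 2/4.
        System.out.println(precision(truth, prediction));
        System.out.println(recall(truth, prediction));
    }
}
```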

Preconditioner

The preconditioner matrix.

Predicate

To test a data point in a filter transform or a test property in conditional encoding, a predicate definition of the following forms must be specified:

PrH

The probability for given query object and hash function.

PrimitiveType

Primitive data type.

PrimitiveVector

Abstract base class implementation of ValueVector interface.

Printer

A printer controller object.

PriorityQueue

Priority Queue for index items.

ProbabilisticClassificationMetric

An abstract interface to measure the probabilistic classification performance.

ProbabilisticPCA

Probabilistic principal component analysis.

Probe

Probe to check for nearest neighbors.

ProductKernel<T>

The product kernel takes two kernels and combines them via k1(x, y) * k2(x, y).
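The combination k1(x, y) * k2(x, y) can be sketched with a small functional interface. The Kernel interface and the LINEAR and GAUSSIAN constants below are hypothetical stand-ins on dense arrays, not the library's MercerKernel or ProductKernel types.

```java
/** Illustrative sketch only; not the library's ProductKernel class. */
final class ProductKernelSketch {
    /** Hypothetical stand-in for a kernel on dense vectors. */
    @FunctionalInterface
    interface Kernel {
        double k(double[] x, double[] y);
    }

    /** Combines two kernels pointwise: k(x, y) = k1(x, y) * k2(x, y). */
    static Kernel product(Kernel k1, Kernel k2) {
        return (x, y) -> k1.k(x, y) * k2.k(x, y);
    }

    /** Linear dot product kernel. */
    static final Kernel LINEAR = (x, y) -> {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    };

    /** Gaussian kernel with unit length scale: exp(-||x - y||^2 / 2). */
    static final Kernel GAUSSIAN = (x, y) -> {
        double squared = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - y[i];
            squared += d * d;
        }
        return Math.exp(-0.5 * squared);
    };

    public static void main(String[] args) {
        double[] x = {1.0, 2.0};
        double[] y = {3.0, 4.0};
        // The linear kernel gives 1*3 + 2*4 = 11, so linear * linear gives 121.
        System.out.println(product(LINEAR, LINEAR).k(x, y));
        System.out.println(product(LINEAR, GAUSSIAN).k(x, y));
    }
}
```

A sum kernel would follow the same pattern with `+` in place of `*`.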

Projection

A projection is a kind of feature extraction technique that transforms data from the input space to a feature space, linearly or non-linearly.

Projection

Projection provides methods to map logical coordinates to Java2D coordinates.

Projection

The geographic projection, which will be applied to the shape path for "geoshape" marks and to the "latitude" and "longitude" channels for other marks.

Property

A component in record or a property in a Java Bean class.

ProstateCancer

Prostate cancer dataset.

Protein

Protein sequence dataset.

PrZ

The probability list of all buckets for given query object.

Puma8NH

Pumadyn dataset.

Punctuations

Punctuation marks are symbols that indicate the structure and organization of written language, as well as intonation and pauses to be observed when reading aloud.

QDA

Quadratic discriminant analysis.

QQPlot

A Q-Q plot ("Q" stands for quantile) is a probability plot, a kind of graphical method for comparing two probability distributions, by plotting their quantiles against each other.

The QR decomposition.

QuantileTransform

The quantile transform calculates empirical quantile values for an input data stream.

QuickSelect

Selection finds the k-th smallest element out of n elements.
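One common way to do selection without fully sorting is quickselect. The sketch below uses a Hoare-style partition around the middle element and then narrows the range to the side that contains index k. The class name, the 0-based convention for k, and the choice to work on a copy are assumptions made for illustration, not the library's QuickSelect API.

```java
/** Illustrative sketch only; not the library's QuickSelect class. */
final class QuickSelectSketch {
    /** Returns the k-th smallest element (k = 0 is the minimum) without modifying the input. */
    static int select(int[] data, int k) {
        if (k < 0 || k >= data.length) {
            throw new IllegalArgumentException("k out of range");
        }
        int[] a = data.clone();
        int lo = 0;
        int hi = a.length - 1;
        while (lo < hi) {
            int pivot = a[lo + (hi - lo) / 2];
            int i = lo;
            int j = hi;
            // Hoare-style partition: afterwards a[lo..j] <= pivot <= a[i..hi].
            while (i <= j) {
                while (a[i] < pivot) i++;
                while (a[j] > pivot) j--;
                if (i <= j) {
                    int tmp = a[i];
                    a[i] = a[j];
                    a[j] = tmp;
                    i++;
                    j--;
                }
            }
            if (k <= j) {
                hi = j;          // the answer lies in the left part
            } else if (k >= i) {
                lo = i;          // the answer lies in the right part
            } else {
                return a[k];     // k sits between the parts, on a pivot value
            }
        }
        return a[lo];
    }

    public static void main(String[] args) {
        int[] data = {7, 2, 9, 4, 1};
        // In sorted order the values are 1, 2, 4, 7, 9, so index 2 holds the median 4.
        System.out.println(select(data, 2));
    }
}
```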

QuickSort

Quicksort is a well-known sorting algorithm that, on average, makes O(n log n) comparisons to sort n items.

R².

RadialBasisFunction

A radial basis function (RBF) is a real-valued function whose value depends only on the distance from the origin, so that φ(x)=φ(||x||); or alternatively on the distance from some other point c, called a center, so that φ(x,c)=φ(||x-c||).

RandIndex

Rand Index.

Random

This is a high-quality random number generator intended as a replacement for the standard Java Random class.

RandomForest

Random forest for classification.

RandomForest

Random forest for regression.

RandomForest.Model

The base model.

RandomForest.Model

The base model.

RandomForest.Options

Random forest hyperparameters.

RandomForest.Options

Random forest hyperparameters.

RandomForest.TrainingStatus

Training status per tree.

RandomForest.TrainingStatus

Training status per tree.

RandomNumberGenerator

Random number generator interface.

RandomProjection

Random projection is a promising dimensionality reduction technique for learning mixtures of Gaussians.

RandomProjectionForest

A set of random projection trees.

RandomProjectionTree

Random projection trees.

RatioScale

The ratio scale allows for both difference and ratio of two values.

RBF<T>

A neuron in radial basis function network.

RBFInterpolation

Radial basis function interpolation is a popular method when the data points are irregularly distributed in space.

RBFInterpolation1D

Radial basis function interpolation is a popular method when the data points are irregularly distributed in space.

RBFInterpolation2D

Radial basis function interpolation is a popular method when the data points are irregularly distributed in space.

RBFNetwork<T>

Radial basis function networks.

RBFNetwork<T>

Radial basis function network.

RDA

Regularized discriminant analysis.

Read

Reads data from external storage systems.

Recall

Recall or true positive rate (TPR) (also called hit rate, sensitivity) is a statistical measure of the performance of a binary classification test.

Recall

In information retrieval, sensitivity is called recall.

Regex

Regular expression patterns.

Regression<T>

Regression analysis includes any techniques for modeling and analyzing the relationship between a dependent variable and one or more independent variables.

Regression.Trainer<T,M>

The regression trainer.

RegressionMetric

An abstract interface to measure the regression performance.

RegressionMetrics

The regression validation metrics.

RegressionModel

The regression model.

RegressionNode

A leaf node in regression tree.

RegressionTransform

The regression transform fits two-dimensional regression models to smooth and predict data.

RegressionTree

Regression tree.

RegressionTree.Options

Regression tree hyperparameters.

RegressionValidation<M>

Regression model validation results.

RegressionValidations<M>

Regression model validation results.

Relevance

In the context of information retrieval, relevance denotes how well a retrieved set of documents meets the information need of the user.

RelevanceRanker

An interface to provide relevance ranking algorithm.

ReLU

Rectified Linear Unit activation function.

Renderer

Renderer provides methods to draw graphical primitives in logical/mathematical coordinates.

Repeat

Repeat a View.

RidgeRegression

Ridge Regression.

RidgeRegression.Options

Ridge regression hyperparameters.

RMSE

Root mean squared error.

RMSNormLayer

Root Mean Square Layer Normalization.

RNNSearch<K,V>

Retrieves the nearest neighbors to a query in a radius.

RobustStandardizer

Robustly standardizes numeric features by subtracting the median and dividing by the interquartile range (IQR).

Role

The role of message speaker in a dialog.

RotaryPositionalEncoding

Rotary positional encoding (RoPE).

Round

The term of round function.

Row

A row in data frame.

RowIndex

DataFrame row index.

RSS

Residual sum of squares.

SammonMapping

Sammon's mapping is an iterative technique for making interpoint distances in the low-dimensional projection as close as possible to the interpoint distances in the high-dimensional object.

SammonMapping.Options

Sammon's mapping hyperparameters.

SampleBatch

A mini-batch dataset consists of data and an associated target (label).

SampleInstance<D,T>

An immutable sample instance.

Sampling

Random sampling is the selection of a subset of individuals from within a statistical population to estimate characteristics of the whole population.

SAS

Reads SAS7BDAT datasets.

Scalar

A scalar is a single number.

ScalarType

The data type of the elements stored in the tensor.

ScalarType

The data type of scalar value.

Scaler

Scales the numeric variables into the range [0, 1].

Scaler

Affine transformation y = (x - offset) / scale.
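The formula y = (x - offset) / scale can be read as follows: with offset set to the minimum and scale set to the range, values map onto [0, 1]. A minimal sketch follows; the class name, the inverse method, and the minMax factory are hypothetical and are not the library's Scaler API.

```java
/** Illustrative sketch only; not the library's Scaler class. */
final class AffineScalerSketch {
    private final double offset;
    private final double scale;

    AffineScalerSketch(double offset, double scale) {
        if (scale == 0.0) {
            throw new IllegalArgumentException("scale must be non-zero");
        }
        this.offset = offset;
        this.scale = scale;
    }

    /** Forward transformation y = (x - offset) / scale. */
    double apply(double x) {
        return (x - offset) / scale;
    }

    /** Inverse transformation x = y * scale + offset. */
    double invert(double y) {
        return y * scale + offset;
    }

    /** Builds a scaler mapping [min, max] of the data onto [0, 1]. */
    static AffineScalerSketch minMax(double[] data) {
        if (data.length == 0) {
            throw new IllegalArgumentException("data must not be empty");
        }
        double min = data[0];
        double max = data[0];
        for (double v : data) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        // Fall back to a unit scale for constant data to avoid dividing by zero.
        double range = max - min;
        return new AffineScalerSketch(min, range == 0.0 ? 1.0 : range);
    }

    public static void main(String[] args) {
        AffineScalerSketch scaler = minMax(new double[]{2.0, 4.0, 6.0});
        // (4 - 2) / (6 - 2) = 0.5
        System.out.println(scaler.apply(4.0));
    }
}
```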

ScatterPlot

The data is displayed as a collection of points.

Scene

Printable scene of mathematical plots.

Scene.PrintAction

Action to print the scene.

Scene.SaveAction

Action to save the scene to an image file.

ScreePlot

In multivariate statistics, a scree plot is a line plot of the eigenvalues of factors or principal components in an analysis.

ScrollablePanel

Customized JPanel whose width matches the width of its containing JScrollPane's viewport.

Selection

The way to select chromosomes from the population as parents to crossover.

Sensitivity

Sensitivity or true positive rate (TPR) (also called hit rate, recall) is a statistical measure of the performance of a binary classification test.

SentenceSplitter

A sentence splitter segments text into sentences (a string of words satisfying the grammatical rules of a language).

SequenceLabeler<T>

A sequence labeler assigns a class label to each position of the sequence.

SequentialBlock

A block of sequential layers.

SHAP<T>

SHAP (SHapley Additive exPlanations) is a game theoretic approach to explain the output of any machine learning model.

Shape

Abstract object that knows how to use a renderer to paint onto the canvas.

ShellSort

Shell sort is a generalization of insertion sort.

ShepardInterpolation

Shepard interpolation is a special case of normalized radial basis function interpolation if the function φ(r) goes to infinity as r → 0, and is finite for r > 0.

ShepardInterpolation1D

Shepard interpolation is a special case of normalized radial basis function interpolation if the function φ(r) goes to infinity as r → 0, and is finite for r > 0.

ShepardInterpolation2D

Shepard interpolation is a special case of normalized radial basis function interpolation if the function φ(r) goes to infinity as r → 0, and is finite for r > 0.

ShiftedGeometricDistribution

The "shifted" geometric distribution is a discrete probability distribution of the number of failures before the first success, supported on the set {0, 1, 2, 3, …}.

ShortArrayCellRenderer

Short array renderer in JTable.

ShortType

Short data type.

ShortVector

A short vector.

SIB

The Sequential Information Bottleneck algorithm.

Side

The flag if the symmetric matrix A appears on the left or right in the matrix-matrix operation.

Sigmoid

Sigmoid activation function.

SignalNoiseRatio

The signal-to-noise (S2N) ratio is a univariate feature ranking metric, which can be used as a feature selection criterion for binary classification problems.

SiLU

Sigmoid Linear Unit activation function.

SimHash<T>

SimHash is a technique for quickly estimating how similar two sets are.

SimpleCorpus

An in-memory text corpus.

SimpleDataset<D,T>

A simple implementation of Dataset that stores data in a single machine's memory.

SimpleDictionary

A simple implementation of dictionary interface.

SimpleImputer

A simple algorithm that replaces missing values with a constant value along each column.

SimpleNormalizer

A baseline normalizer for processing Unicode text.

SimpleParagraphSplitter

This is a simple paragraph splitter.

SimpleSentenceSplitter

This is a simple sentence splitter for English.

SimpleText

A list-of-words representation of documents.

SimpleTokenizer

A word tokenizer that tokenizes English sentences with some differences from TreebankWordTokenizer, notably on handling not-contractions.

SingleLinkage

Single linkage.

SmileUtilities

A collection of utility methods primarily for performing common GUI-related tasks.

SNLSH<K,V>

Locality-Sensitive Hashing for Signatures.

Softmax

Softmax activation function.

SoftShrink

Soft Shrink activation function.

SOM

Self-Organizing Map.

Sort

Sort algorithm trait that includes useful static functions such as swap and sift up/down used in many sorting algorithms.

SortField

A sort field definition for sorting data objects within a window.

SparseArray

Sparse array of double values.

SparseArray.Entry

The entry in a sparse array of double values.

SparseChebyshevDistance

SparseDataset<T>

List of Lists sparse matrix format.

SparseEncoder

Encodes numeric and categorical features into a sparse array with one-hot encoding of categorical variables.

SparseEuclideanDistance

Euclidean distance on sparse arrays.

SparseGaussianKernel

Gaussian kernel, also referred to as RBF kernel or squared exponential kernel.

SparseHyperbolicTangentKernel

The hyperbolic tangent kernel on sparse data.

SparseIntArray

Sparse array of integers.

SparseIntArray.Entry

The entry in a sparse array of integers.

SparseLaplacianKernel

Laplacian kernel, also referred to as exponential kernel.

SparseLinearKernel

The linear dot product kernel on sparse arrays.

SparseLinearSVM

Sparse linear support vector machines for classification.

SparseLinearSVM

Sparse linear support vector machines for regression.

SparseLogisticRegression

Logistic regression on sparse data.

SparseLogisticRegression.Binomial

Binomial logistic regression.

SparseLogisticRegression.Multinomial

Multinomial logistic regression.

SparseManhattanDistance

Manhattan distance, also known as L₁ distance or L₁ norm, is the sum of the (absolute) differences of their coordinates.

SparseMaternKernel

The class of Matérn kernels is a generalization of the Gaussian/RBF.

SparseMatrix

A sparse matrix is a matrix populated primarily with zeros.

SparseMatrixPlot

A graphical representation of sparse matrix data.

SparseMinkowskiDistance

Minkowski distance of order p, or L_p-norm, is a generalization of Euclidean distance, which is the L₂-norm.

SparsePolynomialKernel

The polynomial kernel on sparse data.

SparseThinPlateSplineKernel

The Thin Plate Spline kernel on sparse data.

Specificity

Specificity (SPC) or True Negative Rate is a statistical measure of the performance of a binary classification test.

SpectralClustering

Spectral Clustering.

SpectralClustering.Options

Spectral clustering hyperparameters.

SphericalVariogram

Spherical variogram.

Split

The data about a potential split for a leaf node.

SplitRule

The criterion to choose variable to split instances.

SQL

An in-process SQL database management interface.

SqueezeExcitation

Squeeze-and-Excitation block from "Squeeze-and-Excitation Networks".

StackTransform

The stack transform.

Staircase

This class represents a polyline in the plot.

StaircasePlot

Staircase plot is a special case of line plot that is most useful for displaying empirical distributions.

Standardizer

Standardizes numeric features to zero mean and unit variance.

Stemmer

A Stemmer transforms a word into its root form.

StochasticDepth

Stochastic Depth for randomly dropping residual branches of residual architectures, from "Deep Networks with Stochastic Depth".

StopWords

A set of stop words in some language.

Strings

String utility functions.

StringType

String data type.

StringVector

A string vector.

StructField

A field in a Struct data type.

StructType

Struct data type is determined by the fixed order of the fields of primitive data types in the struct.

Sub

The term of a - b expression.

SumKernel<T>

The sum kernel takes two kernels and combines them via k1(x, y) + k2(x, y).

SumSquaresRatio

The ratio of between-groups to within-groups sum of squares is a univariate feature ranking metric, which can be used as a feature selection criterion for multi-class classification problems.

SupportVector<T>

Support vector.

Surface

A surface object gives 3D information e.g.

SVD

Singular Value Decomposition.

SVDImputer

Missing value imputation with singular value decomposition.

SVDJob

The option of whether to compute singular vectors.

SVM<T>

One-class support vector machines for novelty detection.

SVM<T>

Support vector machines for classification.

SVM

Epsilon support vector regression.

SVM hyperparameters.

SVM hyperparameters.

SVM hyperparameters.

Epsilon support vector regression.

SwissRoll

Swiss roll dataset.

SymletWavelet

Symlet wavelets.

SymmMatrix

The symmetric matrix in packed storage.

SyntheticControl

Synthetic control time series.

Table

Customized JTable with optional row number header.

TableColumnSettings

Table column settings.

TableCopyPasteAdapter

TableCopyPasteAdapter enables Copy-Paste Clipboard functionality on JTables.

Tanh

Hyperbolic Tangent activation function.

TanhShrink

Hyperbolic Tangent Shrink activation function.

TaxonomicDistance

The distance between concepts in a taxonomy.

Taxonomy

A taxonomy is a tree of terms (also known as concepts) where leaves must be named but intermediary nodes can be anonymous.

TDistribution

Student's t-distribution (or simply the t-distribution) is a probability distribution that arises in the problem of estimating the mean of a normally distributed population when the sample size is small.

Tensor

A Tensor is a multidimensional array containing elements of a single data type.

Tensor

A Tensor is a multidimensional array containing elements of a single data type.

Tensor.Options

A class that encapsulates the construction axes of a tensor.

Term

An abstract term in the formula.

Terms

Predefined terms.

Text

A minimal interface of text in the corpus.

TextPlot

The scatter plot of texts.

TextTerms

The terms in a text.

TFIDF

The tf-idf weight (term frequency-inverse document frequency) is a weight often used in information retrieval and text mining.

ThinPlateRadialBasis

Thin plate RBF.

ThinPlateSpline

The Thin Plate Spline kernel.

ThinPlateSplineKernel

The Thin Plate Spline kernel.

Tiktoken

tiktoken is a fast BPE tokenizer by OpenAI.

TimeFunction

A time-dependent function.

TimeSeries

Time series utility functions.

TimeType

Time data type.

ToFloatFunction<T>

Represents a function that produces a float-valued result.

Tokenizer

Custom tokenizer for Llama 3 models.

Tokenizer

Tokenizing and encoding/decoding text.

Tokenizer

A token is a string of characters, categorized according to the rules as a symbol.

Transform

Data transformation interface.

Transform

View-level data transformations such as filter and new field calculation.

Transform

Transformation from image to tensor.

Transformer

The Transformer model.

TransformerBlock

A block in Transformer model.

Transpose

Matrix transpose operation.

TreeSHAP

SHAP of ensemble tree methods.

Trie<K,V>

A trie, also called digital tree or prefix tree, is an ordered tree data structure that is used to store a dynamic set or associative array where the keys are usually strings.

TSNE

The t-distributed stochastic neighbor embedding.

TSNE.Options

The t-SNE hyperparameters.

TTest

Student's t test.

Tuple

A tuple is an immutable ordered finite list (sequence) of elements.

Tuple2<T1,T2>

A tuple of 2 elements.

UMAP

Uniform Manifold Approximation and Projection.

UMAP.Options

The UMAP hyperparameters.

UniversalGenerator

The so-called "Universal Generator" based on multiplicative congruential method, which originally appeared in "Toward a Universal Random Number Generator" by Marsaglia, Zaman and Tsang.

UPGMALinkage

Unweighted Pair Group Method with Arithmetic mean (also known as average linkage).

UPGMCLinkage

Unweighted Pair Group Method using Centroids (also known as centroid linkage).

UPLO

The format of packed matrix storage.

USArrests

Violent crime rates by US state.

USPS

USPS handwritten text recognition dataset.

ValueVector

ValueVector interface is an abstraction that is used to store a sequence of values having the same type in an individual column of data frame.

Variable

A variable in the formula.

Variogram

In spatial statistics the theoretical variogram 2γ(x,y) is a function describing the degree of spatial dependence of a spatial random field or stochastic process Z(x).

Vector

Mathematical vector interface.

VectorQuantizer

Vector quantizer with competitive learning.

VegaLite

Vega-Lite specifications are JSON objects that describe a diverse range of interactive visualizations.

VertexVisitor

A visitor encapsulates an operation on graph vertices during graph traversal (DFS or BFS).

View

Single view specification, which describes a view that uses a single mark type to visualize the data.

ViewComposition

All view composition specifications (layer, facet, concat, and repeat) can have the resolve property for scale, axes, and legend resolution.

ViewConfig

The style of a single view visualization.

ViewLayoutComposition

All view layout composition (facet, concat, and repeat) can have the following layout properties: align, bounds, center, spacing.

VisionModel

The computer vision models.

WardLinkage

Ward's linkage.

Wavelet

A wavelet is a wave-like oscillation with an amplitude that starts out at zero, increases, and then decreases back to zero.

WaveletShrinkage

The wavelet shrinkage is a signal denoising technique based on the idea of thresholding the wavelet coefficients.

Weather

Toy weather dataset.

WeatherNominal

Toy weather dataset, of which all attributes are nominal.

WeibullDistribution

The Weibull distribution is one of the most widely used lifetime distributions in reliability engineering.

WindowTransform

The window transform performs calculations over sorted groups of data objects.

WindowTransformField

A sort field definition for sorting data objects within a window.

WinsorScaler

Scales all numeric variables into the range [0, 1].

Wireframe

A wire frame model specifies each edge of the physical object where two mathematically continuous smooth surfaces meet, or by connecting an object's constituent vertices using straight lines or curves.

Word2Vec

Word2vec is a group of related models that are used to produce word embeddings.

WordNet

Words in WordNet datasets.

WPGMALinkage

Weighted Pair Group Method with Arithmetic mean.

WPGMCLinkage

Weighted Pair Group Method using Centroids (also known as median linkage).

Write

Writes data to external storage systems.

XMeans

X-Means clustering algorithm, an extended K-Means which tries to automatically determine the number of clusters based on BIC scores.