# All Classes and Interfaces

Class

Description

A dictionary interface for abbreviations.

The term of abs function.

This class provides a skeletal implementation of the bi-function term.

Abstract base class of classifiers.

This class provides a skeletal implementation of the function term.

Abstract base class of one-dimensional interpolation methods.

Abstract tuple base class.

The accuracy is the proportion of true results (both true positives and
true negatives) in the population.
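
As a minimal sketch of this definition, accuracy can be computed by counting the predictions that match the true labels. The class and method names here are hypothetical and are not taken from the documented library.

```java
// Hypothetical helper for illustration only; not the library's API.
final class AccuracySketch {
    /** Returns the fraction of predictions that equal the true labels. */
    static double accuracy(int[] truth, int[] prediction) {
        if (truth.length != prediction.length) {
            throw new IllegalArgumentException("Arrays must have the same length");
        }
        if (truth.length == 0) {
            throw new IllegalArgumentException("Empty input");
        }
        int correct = 0;
        for (int i = 0; i < truth.length; i++) {
            // A match is either a true positive or a true negative.
            if (truth[i] == prediction[i]) correct++;
        }
        return (double) correct / truth.length;
    }
}
```

For example, three matching labels out of four give an accuracy of 0.75.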

The activation function in hidden layers.

The activation function.

AdaBoost (Adaptive Boosting) classifier with decision trees.

An adaptive average pooling that reduces a tensor by combining cells.

The term of `a + b` expression.

An adjacency list representation of a graph.

An adjacency matrix representation of a graph.

Adjusted Mutual Information (AMI) for comparing clustering.

The normalization method.

Adjusted Rand Index.

An Icon wrapper that paints the contained icon with a specified transparency.

The anchor text is the visible, clickable text in a hyperlink.

Autoregressive model.

The fitting method.

Weka ARFF (attribute relation file format) is an ASCII
text file format that is essentially a CSV file with a header that describes
the meta-data.

Association Rule Mining.

Autoregressive moving-average model.

ARPACK is a collection of Fortran77 subroutines designed to
solve large scale eigenvalue problems.

ARPACK is a collection of Fortran77 subroutines designed to
solve large scale eigenvalue problems.

Which eigenvalues of asymmetric matrix to compute.

Which eigenvalues of asymmetric matrix to compute.

Which eigenvalues of symmetric matrix to compute.

Which eigenvalues of symmetric matrix to compute.

2-dimensional array of doubles.

Array of primitive data type.

Apache Arrow is a cross-language development platform for in-memory data.

Association rule object.

The area under the curve (AUC).

AutoScope allows for predictable, deterministic resource deallocation.

The averaging strategy to aggregate binary performance metrics across
multi-classes.

An average pooling layer that reduces a tensor by combining cells,
and assigning the average value of the input cells to the output cell.

Apache Avro is a data serialization system.

This class describes an axis of a coordinate system.

Axes provide axis lines, ticks, and labels to convey how a positional range
represents a data range.

The view background of a single-view or layer specification.

A bag of randomly selected samples.

The bag-of-words feature of text used in natural language
processing and information retrieval.

A band matrix is a sparse matrix, whose non-zero entries are confined to
a diagonal band, comprising the main diagonal and zero or more diagonals
on either side.

The Cholesky decomposition of a symmetric, positive-definite matrix.

The Cholesky decomposition of a symmetric, positive-definite matrix.

The LU decomposition.

The LU decomposition.

Bars with heights proportional to the value.

A barplot draws bars with heights proportional to the value.

The coordinate base of PlotCanvas.

Base interface for immutable named vectors, which are sequences of elements supporting
random access and sequential stream operations.

A batch normalization layer that re-centers and normalizes the output
of one layer before feeding it to another.

Balanced Box-Decomposition Tree.

The response variable is of Bernoulli distribution.

Bernoulli's distribution is a discrete probability distribution, which takes
value 1 with success probability p and value 0 with failure probability
q = 1 - p.

Bidirectional Encoder Representations from Transformers (BERT).

Best localized wavelets.

The beta function, also called the Euler integral of the first kind.

The beta distribution is defined on the interval [0, 1] parameterized by
two positive shape parameters, typically denoted by α and β.

The Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm is an iterative
method for solving unconstrained nonlinear optimization problems.

Bicubic interpolation in a two-dimensional regular grid.

Big dense matrix of double precision values for more than
2 billion elements.

The Cholesky decomposition of a symmetric, positive-definite matrix.

Eigenvalue decomposition.

The LU decomposition.

The QR decomposition.

Singular Value Decomposition.

Bigrams or digrams are groups of two words, and are very commonly used
as the basis for simple statistical analysis of text.

Collocations are expressions of multiple words which commonly co-occur.

Bilinear interpolation in a two-dimensional regular grid.

Encodes categorical features using sparse one-hot scheme.

Binary sparse dataset.

Gaussian kernel, also referred to as RBF kernel or squared exponential kernel.

The hyperbolic tangent kernel on binary sparse data.

Laplacian kernel, also referred to as exponential kernel.

The linear dot product kernel on sparse binary arrays in `int[]`,
which are the indices of nonzero elements.

The class of Matérn kernels is a generalization of the Gaussian/RBF.

The polynomial kernel on binary sparse data.

The Thin Plate Spline kernel on binary sparse data.

The response variable is of Binomial distribution.

The binomial distribution is the discrete probability distribution of
the number of successes in a sequence of n independent yes/no experiments,
each of which yields success with probability p.

To test a data point in a filter transform or a test property in conditional
encoding, a predicate definition of the following forms must be specified:
- a Vega expression string, where datum can be used to refer to the current
data object.

Balanced Iterative Reducing and Clustering using Hierarchies.

The standard bit string representation of the solution domain.

A BK-tree is a metric tree specifically adapted to discrete metric spaces.

Basic Linear Algebra Subprograms.

The BM25 weighting scheme, often called Okapi weighting, after the system in
which it was first implemented, was developed as a way of building a
probabilistic model sensitive to term frequency and document length while
not introducing too many additional parameters into the model.

Boolean data type.

An immutable boolean vector.

The bootstrap is a general tool for assessing statistical accuracy.

A boxplot is a convenient way of graphically depicting groups of numerical
data through their five-number summaries: the smallest observation
(sample minimum), lower quartile (Q1), median (Q2), upper quartile (Q3),
and largest observation (sample maximum).

Portmanteau test that jointly checks whether several autocorrelations of a time series
are zero.

The type of test.

A sentence splitter based on the java.text.BreakIterator, which supports
multiple natural languages (selected by locale setting).

A word tokenizer based on the java.text.BreakIterator, which supports
multiple natural languages (selected by locale setting).

A bucket is a container for points that all have the same value for hash
function g (function g is a vector of k LSH functions).

Action initialized JButton.

The ButtonCellRenderer class provides a renderer and an editor that looks
like a JButton.

Byte array renderer in JTable.

Byte string.

Byte data type.

An immutable byte vector.

Static methods that manage cache files.

Canvas for mathematical plots.

Classification and regression tree.

Categorical variable encoder.

Categorical data can be stored into groups or categories with the aid of
names or labels.

In centroid-based clustering, clusters are represented by a central vector,
which may not necessarily be a member of the data set.

Char data type.

An immutable char vector.

Chebyshev distance (or Tchebychev distance), or $L_\infty$ metric, is a metric defined on a vector space where the distance between two vectors is the greatest of their differences along any coordinate dimension.

Pearson's chi-square test, also known as the chi-square goodness-of-fit test
or chi-square test for independence.

Chi-square (or chi-squared) distribution with k degrees of freedom is the
distribution of a sum of the squares of k independent standard normal
random variables.

Artificial chromosomes in genetic algorithm/programming encoding candidate
solutions to an optimization problem.

Clustering Large Applications based upon RANdomized Search.

An abstract interface to measure the classification performance.

The classification validation metrics.

Classification model validation results.

Classification model validation results.

A classifier assigns an input object into one of a given number of categories.

The classifier trainer.

Map arbitrary class labels to [0, k), where k is the number of classes.
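
One way to build such a mapping is sketched below. The class name and the choice to assign indices in ascending label order are assumptions made for illustration; the documented class may order labels differently.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not the library's API.
final class ClassLabelSketch {
    /** Maps each distinct label to an index in [0, k), in ascending label order. */
    static int[] encode(int[] labels) {
        // TreeMap keeps the distinct labels sorted.
        Map<Integer, Integer> index = new TreeMap<>();
        for (int label : labels) index.put(label, 0);

        // Assign consecutive indices 0, 1, ..., k-1 to the sorted labels.
        int k = 0;
        for (Map.Entry<Integer, Integer> e : index.entrySet()) e.setValue(k++);

        int[] y = new int[labels.length];
        for (int i = 0; i < labels.length; i++) y[i] = index.get(labels[i]);
        return y;
    }
}
```

With the labels `{10, -3, 10, 7}`, the distinct values -3, 7, and 10 become 0, 1, and 2.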

An abstract interface to measure the clustering performance.

Coiflet wavelets.

Color editor in JTable.

Color renderer in JTable.

Column-wise data transformation.

Complete linkage.

Complex number.

Packed array of complex numbers for better memory efficiency.

Concatenating views.

Concept is a set of synonyms, i.e.

Vega-Lite's config object lists configuration properties of
a visualization for creating a consistent theme.

The confusion matrix of truth and predictions.

A constant value in the formula.

The contingency table.

A contour plot is a graphical technique for representing a 3-dimensional
surface by plotting constant z slices, called contours, on a 2-dimensional
format.

A convolutional layer.

Convolution2d-Normalization-Activation block.

Conv2dNormActivation configurations.

Keyword extraction from a single document using word co-occurrence statistical information.

A corpus is a collection of documents.

Correlation distance is defined as 1 - correlation coefficient.
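
The definition can be sketched as follows, assuming the Pearson correlation coefficient (the entry does not say which coefficient is meant). The class and method names are hypothetical.

```java
// Hypothetical helper for illustration only; not the library's API.
final class CorrelationDistanceSketch {
    /** Returns 1 - r, where r is the Pearson correlation of x and y (assumed). */
    static double distance(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("Need two arrays of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Accumulate the centered cross-product and sums of squares.
        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return 1.0 - sxy / Math.sqrt(sxx * syy);
    }
}
```

Under this definition the distance ranges from 0 (perfect positive correlation) to 2 (perfect negative correlation). A constant input has zero variance and yields NaN in this sketch.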

Correlation test.

Neural network cost function.

Cover tree is a data structure for generic nearest neighbor search, which
is especially efficient in spaces with small intrinsic dimension.

First-order linear conditional random field.

First-order CRF sequence labeler.

Cross entropy generalizes the log loss metric to multiclass problems.

The types of crossover operation.

Cross-validation is a technique for assessing how the results of a
statistical analysis will generalize to an independent data set.

Reads and writes files in variations of the Comma Separated Value
(CSV) format.

Cubic spline interpolation.

Cubic spline interpolation in a two-dimensional regular grid.

NVIDIA CUDA helper functions.

The simplest and most localized wavelet, Daubechies wavelet of 4 coefficients.

The basic data model used by Vega-Lite is tabular data.

An immutable collection of data organized into named columns.

Stream collectors.

Classification trait on DataFrame.

The classifier trainer.

Regression trait on DataFrame.

The regression trainer.

An immutable collection of data objects.

A dataset consists of data and an associated target (label)
and can be iterated in mini-batches.

The interface of data types.

Data type ID.

To get a specific data type, users should use singleton objects
and factory methods in this class.

Date/time feature extractor.

Implements a cell editor that uses a formatted text field
to edit Date values.

Date cell renderer.

The date/time features.

DateTime data type.

Date data type.

Daubechies wavelets.

Density-Based Spatial Clustering of Applications with Noise.

Arbitrary-precision decimal data type.

A leaf node in decision tree.

Decision tree.

A default cell renderer for a JTableHeader.

DENsity CLUstering.

A dendrogram is a tree diagram frequently used to illustrate the arrangement
of the clusters produced by hierarchical clustering.

The density transform performs one-dimensional kernel density estimation
over an input data stream and generates a new data stream of samples of
the estimated densities.

Deterministic annealing clustering.

The compute device on which a tensor is stored.

The compute device type.

The flag if a triangular matrix has unit diagonal elements.

A dictionary is a set of words in some natural language.

A differentiable function is a function whose derivative exists at each point
in its domain.

A differentiable function is a function whose derivative exists at each point
in its domain.

Univariate discrete distributions.

The purpose of this interface is mainly to define the method M that is
the Maximization step in the EM algorithm.

The finite mixture of distributions from discrete exponential family.

The finite mixture of discrete distributions.

A component in the mixture distribution is defined by a distribution
and its weight in the mixture.

Naive Bayes classifier for document classification in NLP.

The generation models of naive Bayes classifier.

An interface to calculate a distance measure between two objects.

Probability distribution of univariate random variable.

The term of `a / b` expression.

Dot product kernel depends only on the dot product of x and y.

Implements a cell editor that uses a formatted text field
to edit double[] values.

Double array renderer in JTable.

A resizeable, array-backed list of double primitives.

Implements a cell editor that uses a formatted text field
to edit Double values.

Double precision matrix element stream consumer.

The generic term of applying a double function.

This class tracks the smallest values seen thus far in a stream of values.

Double data type.

An immutable double vector.

A dropout layer that randomly zeroes some of the elements of
the input tensor with probability p during training.

Dynamic time warping is an algorithm for measuring similarity between two
sequences which may vary in time or speed.

The connection between neurons.

The Edit distance between two strings is a metric for measuring the amount
of difference between two sequences.
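
As an illustration, the sketch below computes the Levenshtein variant of edit distance (unit-cost insertions, deletions, and substitutions) with a two-row dynamic program. Choosing Levenshtein is an assumption; the entry does not state which variant the documented class implements, and the names here are hypothetical.

```java
// Hypothetical helper for illustration only; not the library's API.
final class LevenshteinSketch {
    /** Returns the minimum number of single-character edits turning a into b. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into b[0..j) takes j insertions.
        for (int j = 0; j <= b.length(); j++) prev[j] = j;

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i characters of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insertion, deletion
                        prev[j - 1] + cost);                     // match or substitution
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

For example, "kitten" and "sitting" are three edits apart.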

EfficientNet is an image classification model family.

The option of eigenvalue range.

Elastic Net regularization.

An embedding layer that is a simple lookup table that stores embeddings
of a fixed dictionary and size.

An empirical distribution function, or empirical cdf, is a cumulative
probability distribution function that concentrates probability 1/n at
each of the n numbers in a sample.
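
Because each sample value carries probability 1/n, the empirical cdf at x is the fraction of sample values less than or equal to x. A minimal sketch with hypothetical names:

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not the library's API.
final class EmpiricalCdfSketch {
    private final double[] sorted;

    EmpiricalCdfSketch(double[] sample) {
        if (sample.length == 0) {
            throw new IllegalArgumentException("Empty sample");
        }
        sorted = sample.clone();
        Arrays.sort(sorted);
    }

    /** Returns the fraction of sample values less than or equal to x. */
    double cdf(double x) {
        // Binary search for the count of elements <= x.
        int lo = 0, hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] <= x) lo = mid + 1;
            else hi = mid;
        }
        return (double) lo / sorted.length;
    }
}
```

For the sample `{3, 1, 2, 2}`, the cdf steps from 0 to 0.25 at 1, to 0.75 at 2, and to 1 at 3.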

A concise dictionary of common terms in English.

An English lexicon with part-of-speech tags.

Punctuation marks in English.

Several sets of English stop words.

The error function.

The number of errors in the population.

Euclidean distance.

The option of whether to compute eigenvectors.

The contrast function when the independent components are highly
super-Gaussian, or when robustness is very important.

An exponential distribution describes the times between events in a Poisson
process, in which events occur continuously and independently at a constant
average rate.

The exponential family is a class of probability distributions sharing
a certain form.

The finite mixture of distributions from exponential family.

Exponential variogram.

A facet is a trellis plot (or small multiple) of a series of similar
plots that displays different subsets of the same data, facilitating
comparison across subsets.

Facet field definition object.

Factor crossing.

The interaction of all the factors appearing in the term.

Fall-out, false alarm rate, or false positive rate (FPR).

F-distribution arises in the testing of whether two observed samples have
the same variance.

The false discovery rate (FDR) is the ratio of false positives
to combined true and false positives, which is actually 1 - precision.
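
The relation to precision can be checked with a small sketch. The class and method names are hypothetical and not taken from the documented library.

```java
// Hypothetical helper for illustration only; not the library's API.
final class FdrSketch {
    /** FDR = FP / (TP + FP). */
    static double fdr(int truePositives, int falsePositives) {
        int predictedPositives = truePositives + falsePositives;
        if (predictedPositives == 0) {
            throw new IllegalArgumentException("No positive predictions");
        }
        return (double) falsePositives / predictedPositives;
    }

    /** Precision = TP / (TP + FP). */
    static double precision(int truePositives, int falsePositives) {
        int predictedPositives = truePositives + falsePositives;
        if (predictedPositives == 0) {
            throw new IllegalArgumentException("No positive predictions");
        }
        return (double) truePositives / predictedPositives;
    }
}
```

With 8 true positives and 2 false positives, the FDR is 0.2 and the precision is 0.8, so the two sum to 1 up to floating-point rounding.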

A feature in the formula once bound to a schema.

Encoding field definition object.

File chooser with file/image preview.

A simple extension-based file filter.

A measure to evaluate the fitness of chromosomes.

Fisher's linear discriminant.

Float array renderer in JTable.

Single precision matrix element stream consumer.

This class tracks the smallest values seen thus far in a stream of values.

Float data type.

An immutable float vector.

Font editor in JTable.

Font renderer in JTable.

The `FontChooser` class is a swing component
for font selection with `JFileChooser`-like APIs.

These config properties define the default number and time formats
for text marks as well as axes, headers, tooltip, and legends.

The model fitting formula in a compact symbolic form.

Frequent item set mining based on the FP-growth (frequent pattern growth)
algorithm, which employs an extended prefix-tree (FP-tree) structure to
store the database in a compressed form.

FP-tree data structure used in FP-growth (frequent pattern growth)
algorithm for frequent item set mining.

The F-score (or F-measure) considers both the precision and the recall of the test
to compute the score.

F test of the hypothesis that two independent samples come from normal
distributions with the same variance, against the alternative that they
come from normal distributions with different variances.

A fully connected layer with nonlinear activation function.

An interface representing a univariate real function.

Fused-MBConv replaces the depthwise-conv3×3 and expansion-conv1×1
in MBConv with single regular conv3×3.

Genetic algorithm based feature selection.

The gamma, digamma, and incomplete gamma functions.

The Gamma distribution is a continuous probability distribution with
a scale parameter θ and a shape parameter k.

Gaussian kernel, also referred to as RBF kernel or squared exponential kernel.

The normal distribution or Gaussian distribution is a continuous probability
distribution that describes data that clusters around a mean.

Gaussian kernel, also referred to as RBF kernel or squared exponential kernel.

Finite univariate Gaussian mixture.

Gaussian Process for Regression.

Gaussian RBF.

Gaussian variogram.

Gaussian Error Linear Unit activation function.

A genetic algorithm (GA) is a search heuristic that mimics the process of
natural evolution.

The geometric distribution is a discrete probability distribution of the
number X of Bernoulli trials needed to get one success, supported on the set
`{1, 2, 3, …}`.

Generalized Hebbian Algorithm.

Generalized linear models.

Global Vectors for Word Representation.

Gated Linear Unit activation function.

G-Means clustering algorithm, an extended K-Means which tries to
automatically determine the number of clusters by normality test.

Good–Turing frequency estimation.

Gradient boosting for classification.

Gradient boosting for regression.

A graph is an abstract representation of a set of objects where some pairs
of the objects are connected by links.

Graph edge.

Graphics provides methods to draw graphical primitives in logical/mathematical
coordinates.

A 2D grid plot.

Group normalization.

Growing Neural Gas.

Haar wavelet.

Static methods that return the InputStream/Reader of a HDFS/S3.

In information theory, the Hamming distance between two strings of equal
length is the number of positions for which the corresponding symbols are
different.
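
A minimal sketch of this definition for strings follows. The class and method names are hypothetical.

```java
// Hypothetical helper for illustration only; not the library's API.
final class HammingSketch {
    /** Counts the positions at which two equal-length strings differ. */
    static int distance(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("Strings must have equal length");
        }
        int d = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) d++;
        }
        return d;
    }
}
```

For example, "karolin" and "kathrin" differ at three positions.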

Hard Shrink activation function.

The hash function for Euclidean spaces.

Feature hashing, also known as the hashing trick, is a fast and
space-efficient way of vectorizing features, i.e.

Hash value Parzen model for multi-probe hash.

Aids in creating swing components in a "headless" environment.

This class tracks the smallest values seen thus far in a stream of values.

Heapsort is a comparison-based sorting algorithm, and is part of the
selection sort family.

A heat map is a graphical representation of data where the values taken by
a variable in a two-dimensional map are represented as colors.

The Hellinger kernel.

Hexmap is a variant of heat map by replacing rectangle cells with hexagon cells.

The lambda interface to retrieve the tooltip of cell.

A hidden layer in the neural network.

The builder of hidden layers.

Agglomerative Hierarchical Clustering.

Histogram utilities.

A histogram is a graphical display of tabulated frequencies, shown as bars.

A histogram is a graphical display of tabulated frequencies, shown as bars.

First-order Hidden Markov Model.

First-order Hidden Markov Model sequence labeler.

Part-of-speech tagging with hidden Markov model.

The hyperbolic tangent kernel.

The hyperbolic tangent kernel.

The hypergeometric distribution is a discrete probability distribution that
describes the number of successes in a sequence of n draws from a finite
population without replacement, just as the binomial distribution describes
the number of successes for draws with replacement.

Hyperparameter configuration.

Hypothesis test functions.

Chi-square test.

Correlation test.

F-test.

The Kolmogorov-Smirnov test (K-S test).

t-test.

Independent Component Analysis (ICA) is a computational method for separating
a multivariate signal into additive components.

Each of these directories should contain one subdirectory for each class
in the dataset.

ImageNet class labels.

Matrix base class.

Matrix base class.

The preconditioner matrix.

The preconditioner matrix.

The impute transform groups data and determines missing values of the key
field within each group.

Indexing a tensor.

A data frame with a new index instead of the default [0, n) row index.

Information Value (IV) measures the predictive strength of a feature
for a binary dependent variable.

Static methods that return the InputStream/Reader of a file or URL.

An input layer in the neural network.

2-dimensional array of integers.

A resizeable, array-backed list of integer primitives.

`HashMap<int, double>` for primitive types.

Implements a cell editor that uses a formatted text field
to edit int[] values.

Integer array renderer in JTable.

Implements a cell editor that uses a formatted text field
to edit Integer values.

Integer data type.

An internal node in CART.

In numerical analysis, interpolation is a method of constructing new data
points within the range of a discrete set of known data points.

Interpolation of 2-dimensional data.

The interval scale allows for the degree of difference between items,
but not the ratio between them.

The generic term of applying an integer function.

An interface representing a univariate int function.

`HashSet<int>` for primitive types.

This class tracks the smallest values seen thus far in a stream of values.

A tuple of 2 integer elements.

A set of integers.

An immutable integer vector.

Inverse multiquadric RBF.

Invertible column-wise transformation.

Invertible data transformation.

Incremental quantile estimation.

Isolation forest is an unsupervised learning algorithm for anomaly
detection that works on the principle of isolating anomalies.

Isolation tree.

Contour contains a list of segments.

Isometric feature mapping.

Kruskal's non-metric MDS.

A method to calibrate decision function value to probability.

Isotropic kernel.

A set of items.

The Jaccard index, also known as the Jaccard similarity coefficient, is a
statistic used for comparing the similarity and diversity of sample sets.
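
The index is the size of the intersection divided by the size of the union of the two sets. The sketch below uses hypothetical names, and returning 1 for two empty sets is a convention assumed here, not something the entry specifies.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not the library's API.
final class JaccardSketch {
    /** Returns |A ∩ B| / |A ∪ B|. */
    static <T> double index(Set<T> a, Set<T> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0; // assumed convention for two empty sets
        }
        Set<T> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        // |A ∪ B| = |A| + |B| - |A ∩ B|
        int union = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / union;
    }
}
```

For the sets {1, 2, 3} and {2, 3, 4}, two of the four distinct elements are shared, so the index is 0.5.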

The Jensen-Shannon divergence is a popular method of measuring the
similarity between two probability distributions.

Reads JSON datasets.

JSON files in single-line or multi-line mode.

A KD-tree (short for k-dimensional tree) is a space-partitioning data
structure for organizing points in a k-dimensional space.

Kernel density estimation is a non-parametric way of estimating the
probability density function of a random variable.

Kernel machines.

The learning methods building on kernels.

Kernel PCA transform.

K-Means clustering.

Missing value imputation by K-Medoids clustering.

K-Modes clustering.

K-nearest neighbor classifier.

Missing value imputation with k-nearest neighbors.

Retrieves the top k nearest neighbors to the query.

Kernel principal component analysis.

Kriging interpolation for the data points irregularly distributed in space.

Kriging interpolation for the data points irregularly distributed in space.

Kriging interpolation for the data points irregularly distributed in space.

The Kolmogorov-Smirnov test (K-S test) is a form of minimum distance
estimation used as a non-parametric test of equality of one-dimensional
probability distributions.

The kurtosis of the probability density function of a signal.

Label is a single line text.

Artificial chromosomes used in Lamarckian algorithm that is a hybrid of
evolutionary computation and a local improver such as hill-climbing.

The Paice/Husk Lancaster stemming algorithm.

The Lanczos algorithm is a direct algorithm devised by Cornelius Lanczos
that is an adaptation of power methods to find the most useful eigenvalues
and eigenvectors of an n-th order linear system with a limited number of operations, m, where m is much smaller than n.

Linear Algebra Package.

Laplace's interpolation to restore missing or unmeasured values on a 2-dimensional
evenly spaced regular grid.

Laplacian kernel, also referred to as exponential kernel.

Laplacian Eigenmap.

Laplacian kernel, also referred to as exponential kernel.

Lasso (least absolute shrinkage and selection operator) regression.

LASVM is an approximate SVM solver that uses online approximation.

A layer in the neural network.

A layer in the neural network.

To superimpose one chart on top of another.

A block is a combination of one or more layers.

The builder of layers.

The memory layout of a Tensor.

Matrix layout.

Linear discriminant analysis.

A leaf node in decision tree.

Sigmoid Linear Unit activation function.

In coding theory, the Lee distance is a distance between two strings
$x_1 x_2 \ldots x_n$ and $y_1 y_2 \ldots y_n$ of equal length n over the q-ary alphabet `{0, 1, ..., q-1}` of size `q >= 2`, defined as

Legend is a single line of text whose coordinates are
proportional to the base coordinates.

Similar to axes, legends visualize scales.

The Levenberg–Marquardt algorithm.

This class represents a poly line in the plot.

The supported styles of lines.

Piecewise linear interpolation.

The linear dot product kernel.

Linear kernel machine.

Linear model.

Brute force linear nearest neighbor search.

Line plot is a special scatter plot which connects points by straight lines.

A measure of dissimilarity between clusters (i.e.

Locally Linear Embedding.

The loess transform (for locally-estimated scatterplot smoothing) uses
locally-estimated regression to produce a trend line.

A good general-purpose contrast function for ICA.

The logistic distribution is a continuous probability distribution whose
cumulative distribution function is the logistic function, which appears
in logistic regression and feedforward neural networks.

Logistic regression.

Binomial logistic regression.

Multinomial logistic regression.

Log loss is an evaluation metric for binary classifiers, and it is sometimes
the optimization objective as well in case of logistic regression and neural
networks.

A log-normal distribution is a probability distribution of a random variable
whose logarithm is normally distributed.

Log sigmoid activation function.

Log softmax activation function.

Long array renderer in JTable.

Long data type.

An immutable long vector.

Leave-one-out cross validation.

Regression loss function.

Loss functions.

The type of loss.

Locality-Sensitive Hashing.

Mean absolute deviation error.

In statistics, Mahalanobis distance is based on correlations between
variables by which different patterns can be identified and analyzed.

Manhattan distance, also known as $L_1$ distance or $L_1$ norm, is the sum of the (absolute) differences of their coordinates.

Mark definition object.

The class of Matérn kernels is a generalization of the Gaussian/RBF.

The class of Matérn kernels is a generalization of the Gaussian/RBF.

Extra basic numeric functions.

Dense matrix.

Dense matrix of double precision values.

The Cholesky decomposition of a symmetric, positive-definite matrix.

The Cholesky decomposition of a symmetric, positive-definite matrix.

Eigenvalue decomposition.

Eigenvalue decomposition.

The LU decomposition.

The LU decomposition.

The QR decomposition.

The QR decomposition.

Singular Value Decomposition.

Singular Value Decomposition.

Matthews correlation coefficient.

Scales each feature by its maximum absolute value.
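
A minimal sketch of this scaling for a row-major data matrix follows. The names are hypothetical, and mapping an all-zero column to zeros is an assumption made to avoid division by zero.

```java
// Hypothetical helper for illustration only; not the library's API.
final class MaxAbsScalerSketch {
    /** Returns a copy of data with each column divided by its maximum absolute value. */
    static double[][] scale(double[][] data) {
        if (data.length == 0) return new double[0][];
        int p = data[0].length;

        // First pass: find the maximum absolute value of each column.
        double[] maxAbs = new double[p];
        for (double[] row : data) {
            for (int j = 0; j < p; j++) {
                maxAbs[j] = Math.max(maxAbs[j], Math.abs(row[j]));
            }
        }

        // Second pass: divide each entry by its column's scale.
        double[][] out = new double[data.length][p];
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < p; j++) {
                // Assumed convention: an all-zero column stays zero.
                out[i][j] = maxAbs[j] == 0.0 ? 0.0 : data[i][j] / maxAbs[j];
            }
        }
        return out;
    }
}
```

After scaling, every value lies in [-1, 1] and the sign of each entry is preserved.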

Maximum Entropy Classifier.

Binomial maximum entropy classifier.

Multinomial maximum entropy classifier.

A max pooling layer that reduces a tensor by combining cells,
and assigning the maximum value of the input cells to the output cell.

Mobile inverted bottleneck convolution.

EfficientNet block configuration.

Classical multidimensional scaling, also known as principal coordinates
analysis.

Level of measurement or scale of measure is a classification that
describes the nature of information within the values assigned to
variables.

Non-parametric Minimum Conditional Entropy Clustering.

Mercer kernel, also called covariance function in Gaussian process.

32-bit Mersenne Twister.

64-bit Mersenne Twister.

Dialog messages.

The class metrics keeps track of metric states, which enables them to
be able to calculate values through accumulations and synchronizations
across multiple processes.

A metric function defines a distance between elements of a set.

Minkowski distance of order p or $L_p$-norm, is a generalization of Euclidean distance that is actually $L_2$-norm.

A finite mixture model is a probabilistic model for density estimation
using a mixture distribution.

A component in the mixture distribution is defined by a distribution
and its weight in the mixture.

Fully connected multilayer perceptron neural network for classification.

Fully connected multilayer perceptron neural network for regression.

The deep learning models.

The GLM model specification.

Model selection criteria.

Multi-Probe Locality-Sensitive Hashing.

Mean squared error.

The term of `a * b` expression.

An extension of `DefaultTableHeaderCellRenderer` that paints sort icons on the
header of each sorted column with varying opacity.

Fully connected multilayer perceptron neural network.

The hash function for data in Euclidean spaces.

Training sample for MPLSH.

Multiquadric RBF.

Probability distribution of multivariate random variable.

The finite mixture of distributions from multivariate exponential family.

An interface representing a multivariate real function.

Multivariate Gaussian distribution.

Finite multivariate Gaussian mixture.

The finite mixture of multivariate distributions.

A component in the mixture distribution is defined by a distribution
and its weight in the mixture.

MurmurHash is a very fast, non-cryptographic hash suitable for general hash-based
lookup.

MurmurHash is a very fast, non-cryptographic hash suitable for general hash-based
lookup.

A mutable int wrapper.

Mutable LSH.

Mutual Information for comparing clustering.

Naive Bayes classifier.

Negative binomial distribution arises as the probability distribution of
the number of successes in a series of independent and identically distributed
Bernoulli trials needed to get a specified (non-random) number r of failures.

The immutable object encapsulates the results of nearest neighbor search.

Gaussian model of hash values of nearest neighbor.

The neighborhood function for 2-dimensional lattice topology (e.g.

Neural Gas soft competitive learning algorithm.

NeuralMap is an efficient competitive learning algorithm inspired by growing
neural gas and BIRCH.

The neuron vertex in the growing neural gas network.

An n-gram is a contiguous sequence of n words from a given sequence of text.

An n-gram is a contiguous sequence of n words from a given sequence of text.

CART tree node.

A node with a nominal split variable.

Nominal variables take on a limited number of unordered values.

The data about a potential split for a leaf node.

Normalized Mutual Information (NMI) for comparing clustering.

The normalization method.

Normalize samples individually to unit norm.

Normalization transforms text into a canonical form by removing unwanted
variations.

Vector norm.

Number renderer in JTable.

An immutable number object vector.

Numerical data, also called quantitative data.

Object data type.

One-class support vector machine.

Ordinary least squares.

One-vs-one strategy for reducing the problem of
multiclass classification to multiple binary classification problems.

One-vs-rest (or one-vs-all) strategy for reducing the problem of
multiclass classification to multiple binary classification problems.

OpenBLAS library wrapper.

The infix bifunction term.

Optimizer functions.

A node with an ordinal split variable (real-valued or ordinal categorical value).

The ordinal type allows for rank order (1st, 2nd, 3rd, etc.) by which
data can be sorted, but still does not allow for relative degree of
difference between them.

The data about a potential split for a leaf node.

The output function of neural networks.

The output layer in the neural network.

The builder of output layers.

PageRank is a link analysis algorithm, and it assigns a numerical weighting
to each element of a hyperlinked set of documents, such as the World Wide
Web, with the purpose of "measuring" its relative importance within the
set.

A table model that performs "paging" of its data.

Color palette generator.

A paragraph splitter segments text into paragraphs.

Apache Parquet is a columnar storage format that supports
nested data structures.

Partition clustering.

Static methods that return a Path by converting a path string or URI.

Principal component analysis.

Pearson VII universal kernel.

The Penn Treebank Tag set.

A word tokenizer that tokenizes English sentences using the conventions
used by the Penn Treebank.

A perfect hash of an array of strings to their index in the array.

Perfect hash based immutable map.

The builder of perfect map.

The pivot transform maps unique values from a field to new aggregated
fields (columns) in the output stream.

Platt scaling or Platt calibration is a way of transforming the outputs
of a classification model into a probability distribution over classes.

The abstract base class of plots.

PlotGrid organizes multiple plots in a grid layout.

Canvas for mathematical plots.

One or more points in the plot.

The response variable is of Poisson distribution.

Poisson distribution expresses the probability of a number of events
occurring in a fixed period of time if these events occur with a known
average rate and independently of the time since the last event.

The polynomial kernel.

The polynomial kernel.

Porter's stemming algorithm.

Positional encoding injects some information about the relative
or absolute position of the tokens in the sequence.

Part-of-speech tagging (POS tagging) is the process of marking up the words
in a sentence as corresponding to a particular part of speech.

Pre-computed posteriori probabilities for generating multiple probes.

Power variogram.

The precision or positive predictive value (PPV) is the ratio of true positives
to combined true and false positives, which is different from sensitivity.
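
As a minimal illustration of the ratio described above, the sketch below counts true and false positives from binary labels. The names `PrecisionSketch` and `precision` are hypothetical and are not this library's API; the sketch assumes labels are encoded as 0 (negative) and 1 (positive).

```java
/** Illustrative only: a hypothetical helper, not part of the library's API. */
final class PrecisionSketch {
    /** Returns TP / (TP + FP), assuming labels are 0 (negative) or 1 (positive). */
    static double precision(int[] truth, int[] prediction) {
        if (truth.length != prediction.length) {
            throw new IllegalArgumentException("Arrays must have the same length");
        }
        int tp = 0; // predicted positive and actually positive
        int fp = 0; // predicted positive but actually negative
        for (int i = 0; i < truth.length; i++) {
            if (prediction[i] == 1) {
                if (truth[i] == 1) {
                    tp++;
                } else {
                    fp++;
                }
            }
        }
        // Assumption: report NaN when nothing was predicted positive.
        return tp + fp == 0 ? Double.NaN : (double) tp / (tp + fp);
    }

    public static void main(String[] args) {
        int[] truth      = {1, 0, 1, 1, 0};
        int[] prediction = {1, 1, 1, 0, 0};
        // Two true positives and one false positive.
        System.out.println(precision(truth, prediction));
    }
}
```

Note that the missed positive at index 3 does not affect precision; it would lower sensitivity (recall) instead.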

The probability for given query object and hash function.

A printer controller object.

Priority Queue for index items.

An abstract interface to measure the probabilistic classification performance.

Probabilistic principal component analysis.

Probe to check for nearest neighbors.

The product kernel takes two kernels and combines them via k1(x, y) * k2(x, y).
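
The combination rule k1(x, y) * k2(x, y) can be sketched with plain functional interfaces. This is a sketch under stated assumptions rather than the library's implementation: `ProductKernelSketch`, `product`, `linear`, and `gaussian` are hypothetical names, and kernels are modeled as `ToDoubleBiFunction<double[], double[]>` from the Java standard library.

```java
import java.util.function.ToDoubleBiFunction;

/** Illustrative only: hypothetical names, not the library's kernel classes. */
final class ProductKernelSketch {
    /** Combines two kernels pointwise: k(x, y) = k1(x, y) * k2(x, y). */
    static ToDoubleBiFunction<double[], double[]> product(
            ToDoubleBiFunction<double[], double[]> k1,
            ToDoubleBiFunction<double[], double[]> k2) {
        return (x, y) -> k1.applyAsDouble(x, y) * k2.applyAsDouble(x, y);
    }

    /** Linear kernel: the dot product of x and y. */
    static double linear(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }

    /** Gaussian kernel with unit bandwidth: exp(-||x - y||^2 / 2). */
    static double gaussian(double[] x, double[] y) {
        double squaredDistance = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - y[i];
            squaredDistance += d * d;
        }
        return Math.exp(-0.5 * squaredDistance);
    }

    public static void main(String[] args) {
        ToDoubleBiFunction<double[], double[]> k =
                product(ProductKernelSketch::linear, ProductKernelSketch::gaussian);
        double[] x = {1.0, 2.0};
        // For identical inputs the Gaussian factor is 1, so the product equals the dot product.
        System.out.println(k.applyAsDouble(x, x));
    }
}
```

Replacing `*` with `+` in `product` gives the corresponding sum-of-kernels combination.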

A projection is a kind of feature extraction technique that transforms data
from the input space to a feature space, linearly or non-linearly.

The geographic projection, which will be applied to shape path for
"geoshape" marks and to "latitude" and "longitude" channels for other
marks.

The probability list of all buckets for given query object.

Punctuation marks are symbols that indicate the structure and organization
of written language, as well as intonation and pauses to be observed when
reading aloud.

Quadratic discriminant analysis.

A Q-Q plot ("Q" stands for quantile) is a probability plot, a kind of
graphical method for comparing two probability distributions, by
plotting their quantiles against each other.

The quantile transform calculates empirical quantile values for an input
data stream.

Selection is asking for the k-th smallest element out of n elements.

Quicksort is a well-known sorting algorithm that, on average, makes O(n log n)
comparisons to sort n items.

R^{2}.

A radial basis function (RBF) is a real-valued function whose value depends
only on the distance from the origin, so that φ(x)=φ(||x||); or
alternatively on the distance from some other point c, called a center, so
that φ(x,c)=φ(||x-c||).

Rand Index.

This is a high-quality random number generator intended as a replacement for
the standard Random class of the Java platform.

Random forest for classification.

Random forest for regression.

The base model.

The base model.

Random number generator interface.

Random projection is a promising dimensionality reduction technique for
learning mixtures of Gaussians.

The ratio scale allows for both difference and ratio of two values.

A neuron in radial basis function network.

Radial basis function interpolation is a popular method when the data points
are irregularly distributed in space.

Radial basis function networks.

Radial basis function network.

Regularized discriminant analysis.

Reads data from external storage systems.

Recall or true positive rate (TPR) (also called hit rate, sensitivity) is
a statistical measure of the performance of a binary classification test.

In information retrieval, sensitivity is called recall.

Regular expression patterns.

Regression analysis includes any techniques for modeling and analyzing
the relationship between a dependent variable and one or more independent
variables.

The regression trainer.

An abstract interface to measure the regression performance.

The regression validation metrics.

A leaf node in regression tree.

The regression transform fits two-dimensional regression models to smooth
and predict data.

Regression tree.

Regression model validation results.

Regression model validation results.

In the context of information retrieval, relevance denotes how well a
retrieved set of documents meets the information need of the user.

An interface to provide relevance ranking algorithm.

Rectified Linear Unit activation function.

Repeat a View.

Ridge Regression.

Root mean squared error.

Retrieves the nearest neighbors to a query in a radius.

Robustly standardizes numeric features by subtracting
the median and dividing by the IQR.
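
A minimal sketch of median/IQR scaling for a single feature column follows. The names `RobustStandardizeSketch`, `quantile`, and `scale` are hypothetical, and the quantile convention (linear interpolation between order statistics) is an assumption; the library may compute quartiles differently.

```java
import java.util.Arrays;

/** Illustrative only: hypothetical names, not the library's API. */
final class RobustStandardizeSketch {
    /** Quantile by linear interpolation between order statistics (one of several conventions). */
    static double quantile(double[] sorted, double p) {
        double position = p * (sorted.length - 1);
        int lower = (int) Math.floor(position);
        int upper = (int) Math.ceil(position);
        return sorted[lower] + (position - lower) * (sorted[upper] - sorted[lower]);
    }

    /** Returns (x - median) / IQR for every value; assumes a non-empty column with IQR > 0. */
    static double[] scale(double[] x) {
        double[] sorted = x.clone();
        Arrays.sort(sorted);
        double median = quantile(sorted, 0.5);
        double iqr = quantile(sorted, 0.75) - quantile(sorted, 0.25);
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            y[i] = (x[i] - median) / iqr;
        }
        return y;
    }

    public static void main(String[] args) {
        // Median is 3 and IQR is 4 - 2 = 2 under the assumed quantile convention.
        System.out.println(Arrays.toString(scale(new double[]{1, 2, 3, 4, 5})));
    }
}
```

Because the median and IQR are insensitive to extreme values, a single outlier changes the scaled values of the other points far less than it would under mean and standard deviation scaling.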

The role of message speaker in a dialog.

Root finding algorithms.

Residual sum of squares.

The Sammon's mapping is an iterative technique for making interpoint
distances in the low-dimensional projection as close as possible to the
interpoint distances in the high-dimensional object.

A mini-batch dataset consists of data and an associated target (label).

An immutable sample instance.

Random sampling is the selection of a subset of individuals
from within a statistical population to estimate characteristics of
the whole population.

Reads SAS7BDAT datasets.

The data type of the elements stored in the tensor.

Scales the numeric variables into the range [0, 1].
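
One common way to map a numeric column into [0, 1] is min-max scaling, `y = (x - min) / (max - min)`. The sketch below assumes that formula and uses hypothetical names (`MinMaxScaleSketch`, `scale`); it is not taken from the library's source.

```java
import java.util.Arrays;

/** Illustrative only: hypothetical names, assuming plain min-max scaling. */
final class MinMaxScaleSketch {
    /** Maps each value to (x - min) / (max - min); assumes the column is not constant. */
    static double[] scale(double[] x) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : x) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double range = max - min;
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            y[i] = (x[i] - min) / range;
        }
        return y;
    }

    public static void main(String[] args) {
        // The smallest value maps to 0 and the largest to 1.
        System.out.println(Arrays.toString(scale(new double[]{2, 4, 6})));
    }
}
```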

Affine transformation `y = (x - offset) / scale`.

The data is displayed as a collection of points.

In multivariate statistics, a scree plot is a line plot of the eigenvalues
of factors or principal components in an analysis.

The way to select chromosomes from the population as parents to crossover.

Sensitivity or true positive rate (TPR) (also called hit rate, recall) is a
statistical measure of the performance of a binary classification test.

SentencePiece is an unsupervised text tokenizer by Google.

A sentence splitter segments text into sentences (a string of words
satisfying the grammatical rules of a language).

A sequence labeler assigns a class label to each position of the sequence.

A block of sequential layers.

SHAP (SHapley Additive exPlanations) is a game theoretic approach to
explain the output of any machine learning model.

Abstract rendering object in a PlotCanvas.

Shell sort is a generalization of insertion sort.
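
To show how Shell sort generalizes insertion sort, the sketch below runs a gapped insertion sort with a shrinking gap; the final pass with gap 1 is ordinary insertion sort. The class name `ShellSortSketch` and the halving gap sequence are assumptions chosen for illustration, not the library's implementation.

```java
import java.util.Arrays;

/** Illustrative only: hypothetical class using the simple n/2, n/4, ..., 1 gap sequence. */
final class ShellSortSketch {
    /** Sorts the array in place in ascending order. */
    static void sort(int[] a) {
        for (int gap = a.length / 2; gap > 0; gap /= 2) {
            // Insertion sort over elements that are `gap` positions apart.
            for (int i = gap; i < a.length; i++) {
                int value = a[i];
                int j = i;
                while (j >= gap && a[j - gap] > value) {
                    a[j] = a[j - gap]; // shift the larger element one gap to the right
                    j -= gap;
                }
                a[j] = value;
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 9, 1, 5, 6};
        sort(data);
        System.out.println(Arrays.toString(data)); // prints [1, 2, 5, 5, 6, 9]
    }
}
```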

Shepard interpolation is a special case of normalized radial basis function
interpolation if the function φ(r) goes to infinity as r → 0, and is
finite for `r > 0`.

The "shifted" geometric distribution is a discrete probability distribution
of the number of failures before the first success, supported on the set
`{0, 1, 2, 3, …}`.

Short array renderer in JTable.

Short data type.

An immutable short vector.

The Sequential Information Bottleneck algorithm.

The flag if the symmetric matrix A appears on the left or right
in the matrix-matrix operation.

Sigmoid activation function.

The signal-to-noise (S2N) ratio is a univariate feature ranking metric,
which can be used as a feature selection criterion for binary classification
problems.

Sigmoid Linear Unit activation function.

SimHash is a technique for quickly estimating how similar two sets are.

An in-memory text corpus.

A simple implementation of dictionary interface.

A simple algorithm that replaces missing values with a constant value
along each column.

A baseline normalizer for processing Unicode text.

This is a simple paragraph splitter.

This is a simple sentence splitter for English.

A list-of-words representation of documents.

A word tokenizer that tokenizes English sentences with some differences from
TreebankWordTokenizer, notably on handling not-contractions.

Single linkage.

Locality-Sensitive Hashing for Signatures.

Softmax activation function.

Soft Shrink activation function.

Self-Organizing Map.

Sort algorithm trait that includes useful static functions
such as swap and sift up/down used in many sorting algorithms.

A sort field definition for sorting data objects within a window.

Sparse array of double values.

Chebyshev distance, or L_{∞} metric, is a metric defined on a vector space where the distance between two vectors is the greatest of their differences along any coordinate dimension.
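
The "greatest difference along any coordinate" rule can be sketched in a few lines. For brevity this example uses dense `double[]` arrays rather than a sparse representation, and `ChebyshevSketch` is a hypothetical name, not the library's class.

```java
/** Illustrative only: hypothetical class operating on dense arrays for simplicity. */
final class ChebyshevSketch {
    /** Returns max over i of |x[i] - y[i]|. */
    static double distance(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("Arrays must have the same length");
        }
        double max = 0.0;
        for (int i = 0; i < x.length; i++) {
            max = Math.max(max, Math.abs(x[i] - y[i]));
        }
        return max;
    }

    public static void main(String[] args) {
        // Coordinate differences are 3, 4, and 0, so the distance is 4.
        System.out.println(distance(new double[]{1, 5, 3}, new double[]{4, 1, 3}));
    }
}
```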

List of Lists sparse matrix format.

Encodes numeric and categorical features into sparse array
with one-hot encoding of categorical variables.

Euclidean distance on sparse arrays.

Gaussian kernel, also referred to as RBF kernel or squared exponential kernel.

The hyperbolic tangent kernel on sparse data.

Laplacian kernel, also referred to as exponential kernel.

The linear dot product kernel on sparse arrays.

Logistic regression on sparse data.

Binomial logistic regression.

Multinomial logistic regression.

Manhattan distance, also known as L_{1} distance or L_{1} norm, is the sum of the (absolute) differences of their coordinates.
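
On sparse data, the sum of absolute coordinate differences can be computed by merging the two lists of non-zero entries. The sketch below assumes each vector is stored as a sorted index array plus a matching value array; that layout and the name `SparseManhattanSketch` are hypothetical and not the library's sparse array type.

```java
/** Illustrative only: hypothetical sparse layout (sorted indices plus matching values). */
final class SparseManhattanSketch {
    /** Returns the sum of |x_i - y_i| over all coordinates, treating missing entries as 0. */
    static double distance(int[] xIndex, double[] xValue, int[] yIndex, double[] yValue) {
        double sum = 0.0;
        int a = 0;
        int b = 0;
        while (a < xIndex.length && b < yIndex.length) {
            if (xIndex[a] == yIndex[b]) {
                sum += Math.abs(xValue[a++] - yValue[b++]); // both vectors have this coordinate
            } else if (xIndex[a] < yIndex[b]) {
                sum += Math.abs(xValue[a++]);               // only x is non-zero here
            } else {
                sum += Math.abs(yValue[b++]);               // only y is non-zero here
            }
        }
        while (a < xIndex.length) {
            sum += Math.abs(xValue[a++]);
        }
        while (b < yIndex.length) {
            sum += Math.abs(yValue[b++]);
        }
        return sum;
    }

    public static void main(String[] args) {
        // x = {0: 1.0, 3: -2.0}, y = {3: 1.0, 5: 4.0}, so |1| + |-2 - 1| + |4| = 8.
        System.out.println(distance(new int[]{0, 3}, new double[]{1.0, -2.0},
                                    new int[]{3, 5}, new double[]{1.0, 4.0}));
    }
}
```

The merge visits each stored entry once, so the cost depends on the number of non-zero entries rather than on the full dimensionality.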

The class of Matérn kernels is a generalization of the Gaussian/RBF.

A sparse matrix is a matrix populated primarily with zeros.

A sparse matrix is a matrix populated primarily with zeros.

A graphical representation of sparse matrix data.

Minkowski distance of order p, or L_{p}-norm, is a generalization of Euclidean distance, which is the L_{2}-norm.

The polynomial kernel on sparse data.

The Thin Plate Spline kernel on sparse data.

Specificity (SPC) or True Negative Rate is a statistical measure of the
performance of a binary classification test.

Spectral Clustering.

Spherical variogram.

The data about a potential split for a leaf node.

The criterion to choose variable to split instances.

An in-process SQL database management interface.

Squeeze-and-Excitation block from "Squeeze-and-Excitation Networks".

The stack transform.

This class represents a poly line in the plot.

A staircase plot is a special case of a line plot that is most useful for
displaying an empirical distribution.

Standardizes numeric features to zero mean and unit variance.

A Stemmer transforms a word into its root form.

Stochastic Depth for randomly dropping residual branches of residual
architectures, from "Deep Networks with Stochastic Depth".

A set of stop words in some language.

String utility functions.

String data type.

An immutable string vector.

A field in a Struct data type.

Struct data type is determined by the fixed order of the fields
of primitive data types in the struct.

The term of `a - b` expression.

The sum kernel takes two kernels and combines them via k1(x, y) + k2(x, y).

The ratio of between-groups to within-groups sum of squares is a univariate
feature ranking metric, which can be used as a feature selection criterion
for multi-class classification problems.

Support vector.

A surface object gives 3D information e.g.

Missing value imputation with singular value decomposition.

The option if computing singular vectors.

One-class support vector machines for novelty detection.

Support vector machines for classification.

Epsilon support vector regression.

Epsilon support vector regression.

Symlet wavelets.

The symmetric matrix in packed storage.

The symmetric matrix in packed storage.

The LU decomposition.

The LU decomposition.

The Cholesky decomposition of a symmetric, positive-definite matrix.

The Cholesky decomposition of a symmetric, positive-definite matrix.

Customized JTable with optional row number header.

Table column settings.

TableCopyPasteAdapter enables Copy-Paste Clipboard functionality on
JTables.

Hyperbolic Tangent activation function.

Hyperbolic Tangent Shrink activation function.

The distance between concepts in a taxonomy.

A taxonomy is a tree of terms (aka concepts) where leaves
must be named but intermediary nodes can be anonymous.

Student's t-distribution (or simply the t-distribution) is a probability
distribution that arises in the problem of estimating the mean of a
normally distributed population when the sample size is small.

A Tensor is a multidimensional array containing elements of a single data type.

A class that encapsulates the construction axes of a Tensor.

An abstract term in the formula.

Predefined terms.

A minimal interface of text in the corpus.

The scatter plot of texts.

The terms in a text.

The tf-idf weight (term frequency-inverse document frequency) is a weight
often used in information retrieval and text mining.

Thin plate RBF.

The Thin Plate Spline kernel.

The Thin Plate Spline kernel.

tiktoken is a fast BPE tokenizer by OpenAI.

A time-dependent function.

Time series utility functions.

Time data type.

Represents a function that produces a float-valued result.

Custom tokenizer for Llama 3 models.

Tokenizing and encoding/decoding text.

A token is a string of characters, categorized according to the rules as a
symbol.

Data transformation interface.

View-level data transformations such as filter and new field calculation.

Transformation from image to tensor.

A transformer is a deep learning architecture based on the
multi-head attention mechanism, proposed in the 2017 paper "Attention
Is All You Need".

Transformer architecture configuration.

Matrix transpose.

SHAP of ensemble tree methods.

A trie, also called digital tree or prefix tree, is an ordered tree data
structure that is used to store a dynamic set or associative array where
the keys are usually strings.

The t-distributed stochastic neighbor embedding.

Student's t test.

A tuple is an immutable finite ordered list (sequence) of elements.

A tuple of 2 elements.

Uniform Manifold Approximation and Projection.

The so-called "Universal Generator" based on multiplicative congruential
method, which originally appeared in "Toward a Universal Random Number
Generator" by Marsaglia, Zaman and Tsang.

Unweighted Pair Group Method with Arithmetic mean (also known as average linkage).

Unweighted Pair Group Method using Centroids (also known as centroid linkage).

The format of packed matrix storage.

In spatial statistics the theoretical variogram `2γ(x,y)` is a function
describing the degree of spatial dependence of a spatial random field or
stochastic process `Z(x)`.

An immutable generic vector.

Vector quantizer with competitive learning.

Vega-Lite specifications are JSON objects that describe a diverse range
of interactive visualizations.

Single view specification, which describes a view that uses a single
mark type to visualize the data.

All view composition specifications (layer, facet, concat, and repeat)
can have the resolve property for scale, axes, and legend resolution.

The style of a single view visualization.

All view layout composition (facet, concat, and repeat) can have the
following layout properties: align, bounds, center, spacing.

The computer vision models.

A visitor encapsulates an operation on graph vertices during
graph traversal (DFS or BFS).

Ward's linkage.

A wavelet is a wave-like oscillation with an amplitude that starts out at
zero, increases, and then decreases back to zero.

The wavelet shrinkage is a signal denoising technique based on the idea of
thresholding the wavelet coefficients.

The Weibull distribution is one of the most widely used lifetime distributions
in reliability engineering.

The window transform performs calculations over sorted groups of data
objects.

A sort field definition for sorting data objects within a window.

Scales all numeric variables into the range [0, 1].

A wire frame model specifies each edge of the physical object where two
mathematically continuous smooth surfaces meet, or by connecting an
object's constituent vertices using straight lines or curves.

Word2vec is a group of related models that are used to produce word
embeddings.

Weighted Pair Group Method with Arithmetic mean.

Weighted Pair Group Method using Centroids (also known as median linkage).

Writes data to external storage systems.

X-Means clustering algorithm, an extended K-Means which tries to
automatically determine the number of clusters based on BIC scores.