# Smile — Statistical Machine Intelligence & Learning Engine

Package

Description

Anomaly detection is the identification of rare items, events,
or observations that raise suspicion by differing significantly from
the majority of the data.
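As a rough illustration of "differing significantly from the majority", the sketch below flags values whose z-score exceeds a threshold. It uses only the Java standard library and is not the package's own API. The class name `ZScoreAnomalySketch`, the method `outliers`, and the threshold parameter are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Smile.
final class ZScoreAnomalySketch {
    /** Returns the indices of values whose absolute z-score exceeds the threshold. */
    public static List<Integer> outliers(double[] values, double threshold) {
        List<Integer> flagged = new ArrayList<>();
        if (values.length == 0) return flagged;

        double mean = 0.0;
        for (double v : values) mean += v;
        mean /= values.length;

        // Population variance (divide by n); an assumption of this sketch.
        double variance = 0.0;
        for (double v : values) variance += (v - mean) * (v - mean);
        variance /= values.length;
        double sd = Math.sqrt(variance);

        // All values identical: nothing differs from the majority.
        if (sd == 0.0) return flagged;

        for (int i = 0; i < values.length; i++) {
            if (Math.abs(values[i] - mean) / sd > threshold) flagged.add(i);
        }
        return flagged;
    }

    public static void main(String[] args) {
        double[] data = {10, 11, 9, 10, 10, 50};
        System.out.println(outliers(data, 2.0));
    }
}
```

A z-score rule assumes roughly bell-shaped data, and a single extreme value inflates the standard deviation it is measured against, so this is a baseline rather than a robust detector.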

Frequent item set mining and association rule mining.

Classification and regression tree base package.

Multilayer perceptron neural network base package.

RBF network base package.

Support vector machine base package.

Classification algorithms.

Clustering analysis.

Cluster dissimilarity measures.

Data and attribute encapsulation classes.

The formula interface symbolically specifies the predictors
and the response.

Level of measurement or scale of measure.

Data transformations.

Data types.

Immutable named vectors.

Deep learning.

Activation functions.

Neural network layers.

Model validation metrics.

A tensor is a multidimensional array.
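One common way to back a multidimensional array is a flat buffer with row-major indexing. The sketch below shows that mapping in plain Java. The class `RowMajorIndexSketch` and its `offset` method are hypothetical names for this example, and the package's tensor types may store data differently.

```java
// Hypothetical helper for illustration only; not part of Smile.
final class RowMajorIndexSketch {
    /** Maps a multidimensional index to a flat offset, with the last dimension varying fastest. */
    public static int offset(int[] shape, int... index) {
        if (shape.length != index.length) {
            throw new IllegalArgumentException("index rank must match shape rank");
        }
        int off = 0;
        for (int d = 0; d < shape.length; d++) {
            if (index[d] < 0 || index[d] >= shape[d]) {
                throw new IndexOutOfBoundsException("index out of range in dimension " + d);
            }
            // Horner-style accumulation: off = off * size_d + index_d
            off = off * shape[d] + index[d];
        }
        return off;
    }

    public static void main(String[] args) {
        int[] shape = {2, 3, 4};
        System.out.println(offset(shape, 1, 2, 3));
    }
}
```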

Feature extraction.

Feature importance.

Missing value imputation.

Feature selection.

Feature transformations.

Genetic algorithm and programming.

Generalized linear models.

Error distribution models.

Graphs are mathematical structures used to model pairwise relations between
objects from a certain collection.

Hashing functions.

Hyperparameter optimization.

Independent Component Analysis (ICA).

Interpolation is the process of constructing a function that takes on
specified values at specified points.
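The simplest instance of this idea is piecewise-linear interpolation, sketched below in plain Java. The class `LinearInterpolationSketch` and its `interpolate` method are hypothetical names for this example and do not describe the package's own classes.

```java
// Hypothetical helper for illustration only; not part of Smile.
final class LinearInterpolationSketch {
    /**
     * Evaluates the piecewise-linear function through the points (x[i], y[i]).
     * Assumes x is strictly increasing. Outside [x[0], x[n-1]] the first or
     * last segment is extended linearly.
     */
    public static double interpolate(double[] x, double[] y, double t) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two points of equal length");
        }
        // Find the segment [x[i], x[i+1]] that contains t (or the nearest end segment).
        int i = 0;
        while (i < x.length - 2 && t > x[i + 1]) i++;

        double w = (t - x[i]) / (x[i + 1] - x[i]);
        return y[i] + w * (y[i + 1] - y[i]);
    }

    public static void main(String[] args) {
        double[] x = {0.0, 1.0, 2.0};
        double[] y = {0.0, 10.0, 0.0};
        System.out.println(interpolate(x, y, 0.5));
    }
}
```

By construction the result passes through every given point, which is what distinguishes interpolation from a least-squares fit.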

Variogram functions.

Interfaces to read/write a Dataset.

Large language models.

Meta Llama models.

LLM tokenization.

Manifold learning finds a low-dimensional basis for describing
high-dimensional data.

Basic mathematical functions, complex numbers, differentiable function
interfaces, random number generators, unconstrained optimization, and
primitive-type (int and double) array lists, etc.

BLAS and LAPACK interfaces.

OpenBLAS library.

Distance and metric measures.

Mercer kernels.

Matrix interface, dense and sparse (band or irregular) matrix encapsulation
classes, LU, QR, Cholesky, SVD and eigen decompositions, etc.

Single-precision (32-bit) matrix.

High-quality random number generators as a replacement for
the standard Java Random class.

Radial basis functions.

Special mathematical functions including beta, erf, and gamma.

Nearest neighbor search.

LSH internal classes.

Natural language processing.

Collocation finding algorithms.

Common dictionaries such as stop words, punctuation, common English words, etc.

Word embedding.

Keyword extraction.

Text normalization.

Part-of-speech taggers.

Term-document relevance ranking algorithms.

English word stemmer algorithms.

Sentence splitter and word tokenizer.

Mathematical and statistical plots.

Declarative data visualization.

Regression analysis.

Learning algorithms for sequence data.

Sorting algorithms.

Probability distributions and statistical hypothesis tests.

Probability distributions.

Statistical hypothesis tests.

Enhanced and additional Swing components (FileChooser, FontChooser, Table,
Button, AlphaIcon, and Printer).

Enhancements to Swing JTable and cell components.

A taxonomy is a tree of terms (concepts) where leaves
must be named but intermediary nodes can be anonymous.

Time series analysis.

Utility functions.

Model validation and selection.

Model validation metrics.

Computer vision models.

Neural network layers for computer vision tasks.

Image transformations.

Vector quantization is a lossy compression technique used in speech
and image coding.
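The lossy step can be sketched as follows: each input vector is replaced by the index of its nearest codeword, and decoding returns that codeword rather than the original vector. The class `VectorQuantizationSketch`, its methods, and the fixed codebook are assumptions for this example. How the package learns a codebook is not shown.

```java
// Hypothetical helper for illustration only; not part of Smile.
final class VectorQuantizationSketch {
    /** Returns the index of the codeword nearest to v in squared Euclidean distance. */
    public static int encode(double[][] codebook, double[] v) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int k = 0; k < codebook.length; k++) {
            double dist = 0.0;
            for (int j = 0; j < v.length; j++) {
                double diff = v[j] - codebook[k][j];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = k;
            }
        }
        return best;
    }

    /** Reconstructs an approximation of the original vector from its code. */
    public static double[] decode(double[][] codebook, int index) {
        return codebook[index].clone();
    }

    public static void main(String[] args) {
        double[][] codebook = {{0.0, 0.0}, {10.0, 10.0}};
        int code = encode(codebook, new double[]{9.0, 8.0});
        System.out.println(code + " -> " + java.util.Arrays.toString(decode(codebook, code)));
    }
}
```

Compression comes from storing one small integer per vector instead of the vector itself. The reconstruction error is the distance to the chosen codeword.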

Hebbian theory is a neuroscientific theory claiming that an increase in
synaptic efficacy arises from a presynaptic cell's repeated and persistent
stimulation of a postsynaptic cell.
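In its textbook form, the rule strengthens each weight in proportion to the product of presynaptic input and postsynaptic output. The sketch below applies one such update for a single linear unit. The class `HebbRuleSketch`, the method `update`, and the learning rate `eta` are assumptions for this example. It shows the plain Hebb rule only, not the algorithms this package implements.

```java
// Hypothetical helper for illustration only; not part of Smile.
final class HebbRuleSketch {
    /**
     * Applies one plain Hebbian update in place: w[i] += eta * x[i] * y,
     * where y = sum_i w[i] * x[i] is the unit's output before the update.
     * Returns y.
     */
    public static double update(double[] w, double[] x, double eta) {
        if (w.length != x.length) {
            throw new IllegalArgumentException("weight and input lengths must match");
        }
        double y = 0.0;
        for (int i = 0; i < w.length; i++) y += w[i] * x[i];

        // Weights grow only where the input was active together with the output.
        for (int i = 0; i < w.length; i++) w[i] += eta * x[i] * y;
        return y;
    }

    public static void main(String[] args) {
        double[] w = {0.5, 0.5};
        update(w, new double[]{1.0, 0.0}, 0.1);
        System.out.println(java.util.Arrays.toString(w));
    }
}
```

Without normalization or decay the plain rule lets weights grow without bound, which is why practical variants add a constraint.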

Discrete wavelet transform (DWT).