Smile — Statistical Machine Intelligence & Learning Engine

Packages
Package
Description
Anomaly detection is the identification of rare items, events or observations which raise suspicions by differing significantly from the majority of the data.
Frequent item set mining and association rule mining.
Classification algorithms.
Clustering analysis.
Cluster dissimilarity measures.
Compressed sensing is a signal processing technique for efficiently acquiring and reconstructing a signal by finding solutions to underdetermined linear systems.
Data and attribute encapsulation classes.
The formula interface symbolically specifies the predictors and the response.
Level of measurement or scale of measure.
Data transformations.
Data types.
Abstraction to store a sequence of values of the same type in a single column of a data frame.
Built-in datasets.
Deep learning.
Activation functions.
Neural network layers.
Model validation metrics.
A tensor is a multidimensional array.
Feature extraction.
Feature importance.
Missing value imputation.
Feature selection.
Feature transformations.
Genetic algorithm and programming.
Graphs are mathematical structures used to model pairwise relations between objects from a certain collection.
Hashing functions.
Hyperparameter optimization.
Independent Component Analysis (ICA).
Interpolation is the process of constructing a function that takes on specified values at specified points.
Variogram functions.
Interfaces to read/write a Dataset.
Type-safe enumerations and constants for BLAS, LAPACK, and ARPACK.
Large language models.
Meta Llama models.
LLM tokenization.
Manifold learning finds a low-dimensional basis for describing high-dimensional data.
Basic mathematical functions, complex numbers, differentiable function interfaces, random number generators, unconstrained optimization, and array lists of primitive types (int and double).
Distance and metric measures.
Mercer kernels.
High-quality random number generators that replace the standard Java Random class.
Radial basis functions.
Special mathematical functions including beta, erf, and gamma.
Generic model interface and base classes that may be used for both classification and regression.
Classification and regression tree base package.
Multilayer perceptron neural network base package.
RBF network base package.
Support vector machine base package.
Nearest neighbor search.
LSH internal classes.
Natural language processing.
Common dictionaries such as stop words, punctuation, and common English words.
Text normalization.
Part-of-speech taggers.
Term-document relevance ranking algorithms.
English word stemmer algorithms.
A taxonomy is a tree of terms (concepts) where leaves must be named but intermediary nodes can be anonymous.
Sentence splitter and word tokenizer.
Java API for the ONNX Runtime inference engine.
Mathematical and statistical plots.
Declarative data visualization.
Regression analysis.
The spline functions in Generalized Additive Models (GAMs).
The error distributions in Generalized Linear Models (GLMs).
Learning algorithms for sequence data.
Sorting algorithms.
Probability distributions and statistical hypothesis tests.
Probability distributions.
Statistical hypothesis tests.
Enhanced and additional Swing components (FileChooser, FontChooser, Table, Button, AlphaIcon, and Printer).
Enhancements to Swing JTable and cell components.
A tensor is a multidimensional array.
Time series analysis.
Primitive data collections, string, date and time facilities, and miscellaneous utility classes.
Mathematical functional interfaces.
The ipynb format is a plain-text JSON document schema used by Jupyter Notebook to store an interactive computing session's contents, including live code, output, Markdown text, and metadata.
LSP (Language Server Protocol) client built on top of LSP4J.
Model validation and selection.
Model validation metrics.
Computer vision models.
Neural network layers for computer vision tasks.
Image transformations.
Vector quantization is a lossy compression technique used in speech and image coding.
Hebbian theory is a neuroscientific theory claiming that an increase in synaptic efficacy arises from a presynaptic cell's repeated and persistent stimulation of a postsynaptic cell.
Discrete wavelet transform (DWT).
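
Example (illustrative only). The interpolation entry above describes constructing a function that takes on specified values at specified points. The sketch below shows the simplest case, piecewise linear interpolation, in plain Java. It does not use or reproduce Smile's own interpolation classes; the class name LinearInterpolationSketch and the method interpolate are hypothetical names chosen for this example, and the knots are assumed to be sorted in strictly increasing order.

```java
import java.util.Arrays;

/** Piecewise linear interpolation over sorted knots (hypothetical helper, not Smile's API). */
final class LinearInterpolationSketch {
    /**
     * Returns the value at t of the piecewise linear function passing
     * through the points (x[i], y[i]). Assumes x is strictly increasing.
     * Values of t outside [x[0], x[n-1]] are extrapolated from the nearest segment.
     */
    static double interpolate(double[] x, double[] y, double t) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two points of equal length");
        }
        int k = Arrays.binarySearch(x, t);
        if (k >= 0) {
            return y[k]; // t is a knot: reproduce the specified value exactly
        }
        int insertion = -k - 1; // index of the first knot greater than t
        int hi = Math.min(Math.max(insertion, 1), x.length - 1);
        int lo = hi - 1;
        double w = (t - x[lo]) / (x[hi] - x[lo]); // relative position within the segment
        return y[lo] + w * (y[hi] - y[lo]);
    }

    public static void main(String[] args) {
        double[] x = {0.0, 1.0, 2.0};
        double[] y = {0.0, 10.0, 40.0};
        System.out.println(interpolate(x, y, 1.0)); // at a knot: prints 10.0
        System.out.println(interpolate(x, y, 1.5)); // between knots: prints 25.0
    }
}
```

The defining property is the first branch: at every specified point the function returns the specified value. Between points, other schemes (splines, radial basis functions, kriging) differ only in how they fill the gaps.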