Package smile.validation
Interface Bootstrap
public interface Bootstrap
The bootstrap is a general tool for assessing statistical accuracy. The basic
idea is to randomly draw samples with replacement from the training data,
each sample the same size as the original training set. This is done many
times (say k = 100), producing k bootstrap datasets. Then we refit the model
to each of the bootstrap datasets and examine the behavior of the fits over
the k replications.
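
The sampling step described above can be sketched in plain Java. This is an illustration only, not Smile's implementation: the class `BootstrapSketch`, its methods, and the `seed` parameter are hypothetical names. The out-of-bag helper assumes that rows never drawn in a round are held out for evaluation, which is a common convention but is not stated on this page.

```java
import java.util.Arrays;
import java.util.Random;

/** Plain-Java illustration of bootstrap sampling; not Smile's implementation. */
final class BootstrapSketch {
    /**
     * Draws k bootstrap samples of size n. Each row holds n indices in [0, n),
     * drawn uniformly with replacement.
     */
    static int[][] sample(int n, int k, long seed) {
        Random rng = new Random(seed);
        int[][] bags = new int[k][n];
        for (int[] bag : bags) {
            for (int i = 0; i < n; i++) {
                bag[i] = rng.nextInt(n);
            }
        }
        return bags;
    }

    /** Returns the out-of-bag indices: those in [0, n) that never appear in the bag. */
    static int[] outOfBag(int[] bag, int n) {
        boolean[] drawn = new boolean[n];
        for (int i : bag) {
            drawn[i] = true;
        }
        int count = 0;
        for (boolean d : drawn) {
            if (!d) count++;
        }
        int[] oob = new int[count];
        int j = 0;
        for (int i = 0; i < n; i++) {
            if (!drawn[i]) oob[j++] = i;
        }
        return oob;
    }

    public static void main(String[] args) {
        int n = 10, k = 3;
        for (int[] bag : sample(n, k, 42L)) {
            System.out.println("bag = " + Arrays.toString(bag)
                + ", out-of-bag = " + Arrays.toString(outOfBag(bag, n)));
        }
    }
}
```

Because indices are drawn with replacement, some rows repeat within a bag and others are left out, so each round sees a slightly different training set.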
-
Method Summary
Modifier and Type / Method / Description

static <M extends DataFrameClassifier> ClassificationValidations<M>
classification(int k, Formula formula, DataFrame data, BiFunction<Formula, DataFrame, M> trainer)
Runs classification bootstrap validation.

static <T, M extends Classifier<T>> ClassificationValidations<M>
classification(int k, T[] x, int[] y, BiFunction<T[], int[], M> trainer)
Runs classification bootstrap validation.

static Bag[]
of(int[] category, int k)
Stratified bootstrap sampling.

static Bag[]
of(int n, int k)
Bootstrap sampling.

static <M extends DataFrameRegression> RegressionValidations<M>
regression(int k, Formula formula, DataFrame data, BiFunction<Formula, DataFrame, M> trainer)
Runs regression bootstrap validation.

static <T, M extends Regression<T>> RegressionValidations<M>
regression(int k, T[] x, double[] y, BiFunction<T[], double[], M> trainer)
Runs regression bootstrap validation.
-
Method Details
-
of
static Bag[] of(int n, int k)
Bootstrap sampling.
Parameters:
n - the number of samples.
k - the number of rounds of bootstrap.
Returns:
the samplings.
-
of
static Bag[] of(int[] category, int k)
Stratified bootstrap sampling.
Parameters:
category - the strata labels.
k - the number of rounds of bootstrap.
Returns:
the samplings.
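
One way to read "stratified" here is that each stratum is resampled separately, so every bootstrap sample keeps the per-stratum counts of the original data. The plain-Java sketch below follows that reading. It is an assumption about the technique, not a description of Smile's code, and `StratifiedBootstrapSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

/** Plain-Java illustration of stratified bootstrap sampling; not Smile's implementation. */
final class StratifiedBootstrapSketch {
    /**
     * Draws one stratified bootstrap sample: within each stratum, draws as many
     * indices (with replacement) as the stratum has members.
     */
    static int[] sample(int[] category, Random rng) {
        // Group row indices by stratum label.
        Map<Integer, List<Integer>> strata = new TreeMap<>();
        for (int i = 0; i < category.length; i++) {
            strata.computeIfAbsent(category[i], c -> new ArrayList<>()).add(i);
        }
        int[] bag = new int[category.length];
        int pos = 0;
        for (List<Integer> members : strata.values()) {
            for (int j = 0; j < members.size(); j++) {
                bag[pos++] = members.get(rng.nextInt(members.size()));
            }
        }
        return bag;
    }

    public static void main(String[] args) {
        int[] category = {0, 0, 0, 0, 1, 1, 2, 2, 2};
        int[] bag = sample(category, new Random(1L));
        int[] counts = new int[3];
        for (int i : bag) {
            counts[category[i]]++;
        }
        // Per-stratum counts match the input regardless of the seed: [4, 2, 3]
        System.out.println(Arrays.toString(counts));
    }
}
```

Resampling within strata prevents a rare class from disappearing from a bootstrap sample, which plain sampling over all rows cannot guarantee.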
-
classification
static <T, M extends Classifier<T>> ClassificationValidations<M> classification(int k, T[] x, int[] y, BiFunction<T[], int[], M> trainer)
Runs classification bootstrap validation.
Type Parameters:
T - the data type of samples.
M - the model type.
Parameters:
k - the number of rounds of bootstrap.
x - the samples.
y - the sample labels.
trainer - the lambda to train a model.
Returns:
the validation results.
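
To show how a `trainer` lambda fits into such a loop, here is a self-contained sketch that uses only the Java standard library. It is not Smile's implementation. `ToIntFunction<T>` stands in for a classifier, the `seed` parameter and the class name are hypothetical, and scoring each round on the out-of-bag rows is an assumption.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.BiFunction;
import java.util.function.ToIntFunction;

/** Plain-Java illustration of bootstrap validation for classifiers. */
final class BootstrapClassificationSketch {
    /** Returns the out-of-bag accuracy of each of the k rounds (NaN if no row was left out). */
    static <T> double[] validate(int k, T[] x, int[] y,
            BiFunction<T[], int[], ToIntFunction<T>> trainer, long seed) {
        int n = x.length;
        Random rng = new Random(seed);
        double[] accuracy = new double[k];
        for (int round = 0; round < k; round++) {
            // Draw n training rows with replacement and remember which were drawn.
            boolean[] drawn = new boolean[n];
            T[] trainX = Arrays.copyOf(x, n);
            int[] trainY = new int[n];
            for (int i = 0; i < n; i++) {
                int idx = rng.nextInt(n);
                drawn[idx] = true;
                trainX[i] = x[idx];
                trainY[i] = y[idx];
            }
            ToIntFunction<T> model = trainer.apply(trainX, trainY);
            // Score the model on the rows that were never drawn.
            int tested = 0, correct = 0;
            for (int i = 0; i < n; i++) {
                if (drawn[i]) continue;
                tested++;
                if (model.applyAsInt(x[i]) == y[i]) correct++;
            }
            accuracy[round] = tested == 0 ? Double.NaN : (double) correct / tested;
        }
        return accuracy;
    }

    public static void main(String[] args) {
        Double[] x = {0.1, 0.2, 0.3, 0.4, 1.6, 1.7, 1.8, 1.9};
        int[] y = {0, 0, 0, 0, 1, 1, 1, 1};
        // Toy trainer: ignores the training data and thresholds at 1.0.
        double[] acc = validate(5, x, y, (tx, ty) -> v -> v > 1.0 ? 1 : 0, 42L);
        System.out.println(Arrays.toString(acc));
    }
}
```

Passing the trainer as a `BiFunction` lets the loop refit a fresh model in every round without knowing anything about the learning algorithm.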
-
classification
static <M extends DataFrameClassifier> ClassificationValidations<M> classification(int k, Formula formula, DataFrame data, BiFunction<Formula, DataFrame, M> trainer)
Runs classification bootstrap validation.
Type Parameters:
M - the model type.
Parameters:
k - the number of rounds of bootstrap.
formula - the model specification.
data - the training/validation data.
trainer - the lambda to train a model.
Returns:
the validation results.
-
regression
static <T, M extends Regression<T>> RegressionValidations<M> regression(int k, T[] x, double[] y, BiFunction<T[], double[], M> trainer)
Runs regression bootstrap validation.
Type Parameters:
T - the data type of samples.
M - the model type.
Parameters:
k - the number of rounds of bootstrap.
x - the samples.
y - the response variable.
trainer - the lambda to train a model.
Returns:
the validation results.
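
For regression, the same kind of loop can report an error measure such as RMSE instead of accuracy. The sketch below again uses only the Java standard library and is not Smile's implementation. `ToDoubleFunction<T>` stands in for a regression model, `meanModel` is a hypothetical toy trainer, and evaluating on out-of-bag rows with RMSE is an assumption.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.BiFunction;
import java.util.function.ToDoubleFunction;

/** Plain-Java illustration of bootstrap validation for regression models. */
final class BootstrapRegressionSketch {
    /** Returns the out-of-bag RMSE of each of the k rounds (NaN if no row was left out). */
    static <T> double[] validate(int k, T[] x, double[] y,
            BiFunction<T[], double[], ToDoubleFunction<T>> trainer, long seed) {
        int n = x.length;
        Random rng = new Random(seed);
        double[] rmse = new double[k];
        for (int round = 0; round < k; round++) {
            // Draw n training rows with replacement and remember which were drawn.
            boolean[] drawn = new boolean[n];
            T[] trainX = Arrays.copyOf(x, n);
            double[] trainY = new double[n];
            for (int i = 0; i < n; i++) {
                int idx = rng.nextInt(n);
                drawn[idx] = true;
                trainX[i] = x[idx];
                trainY[i] = y[idx];
            }
            ToDoubleFunction<T> model = trainer.apply(trainX, trainY);
            // Accumulate squared residuals over the rows that were never drawn.
            int tested = 0;
            double squaredError = 0.0;
            for (int i = 0; i < n; i++) {
                if (drawn[i]) continue;
                double residual = y[i] - model.applyAsDouble(x[i]);
                squaredError += residual * residual;
                tested++;
            }
            rmse[round] = tested == 0 ? Double.NaN : Math.sqrt(squaredError / tested);
        }
        return rmse;
    }

    /** Toy trainer: predicts the mean training response for every input. */
    static <T> ToDoubleFunction<T> meanModel(T[] x, double[] y) {
        double mean = Arrays.stream(y).average().orElse(Double.NaN);
        return v -> mean;
    }

    public static void main(String[] args) {
        Double[] x = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
        double[] y = {1.1, 1.9, 3.2, 3.8, 5.1, 6.0};
        double[] rmse = validate(10, x, y, BootstrapRegressionSketch::meanModel, 7L);
        System.out.println(Arrays.toString(rmse));
    }
}
```

The spread of the per-round RMSE values gives a rough picture of how much the fit varies across the k replications.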
-
regression
static <M extends DataFrameRegression> RegressionValidations<M> regression(int k, Formula formula, DataFrame data, BiFunction<Formula, DataFrame, M> trainer)
Runs regression bootstrap validation.
Type Parameters:
M - the model type.
Parameters:
k - the number of rounds of bootstrap.
formula - the model specification.
data - the training/validation data.
trainer - the lambda to train a model.
Returns:
the validation results.
-