smile.classification.AbstractClassifier<double[]>

smile.classification.LogisticRegression

All Implemented Interfaces: Serializable, ToDoubleFunction<double[]>, ToIntFunction<double[]>, Classifier<double[]>

Direct Known Subclasses: LogisticRegression.Binomial, LogisticRegression.Multinomial

public abstract class LogisticRegression extends AbstractClassifier<double[]>

Logistic regression. Logistic regression (logit model) is a generalized linear model used for binomial regression. It applies maximum likelihood estimation after transforming the dependent variable into a logit variable. A logit is the natural log of the odds that the dependent variable equals a certain value (usually 1 in binary logistic models, the highest value in multinomial models). In this way, logistic regression estimates the odds of a certain event (value) occurring.
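The relationship between probability, odds, and logit described above can be sketched in a few lines. This is only an illustration: the class `LogitDemo` and its methods are hypothetical helpers, not part of Smile.

```java
/** Illustrative helpers (not part of Smile) relating probabilities, odds, and logits. */
final class LogitDemo {
    /** Odds of an event with probability p in (0, 1). */
    static double odds(double p) {
        return p / (1.0 - p);
    }

    /** Logit: the natural log of the odds. */
    static double logit(double p) {
        return Math.log(odds(p));
    }

    /** Logistic (sigmoid) function, the inverse of the logit. */
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    public static void main(String[] args) {
        double p = 0.8;
        double z = logit(p);
        // Mapping the logit back through the sigmoid recovers the probability.
        System.out.printf("odds=%.4f logit=%.4f back=%.4f%n", odds(p), z, sigmoid(z));
    }
}
```

A model that is linear in the logit therefore yields a probability once its output is passed through the sigmoid.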

Goodness-of-fit tests such as the likelihood ratio test are available as indicators of model appropriateness, as is the Wald statistic to test the significance of individual independent variables.

Logistic regression has many analogies to ordinary least squares (OLS) regression. Unlike OLS regression, however, logistic regression does not assume a linear relationship between the raw values of the independent variables and the dependent variable, does not require normally distributed variables, does not assume homoscedasticity, and in general has less stringent requirements.

Compared with linear discriminant analysis, logistic regression has several advantages:

- It is more robust: the independent variables don't have to be normally distributed or have equal variance in each group.
- It does not assume a linear relationship between the independent variables and the dependent variable.
- It can handle nonlinear effects, since one can add explicit interaction and power terms.

However, it requires much more data to achieve stable, meaningful results.
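As a rough illustration of the explicit interaction and power terms mentioned in the list above, the following sketch expands a two-feature sample before training. The class `FeatureExpansion` and the choice of terms are assumptions made for the example; Smile does not add such terms for you here.

```java
/** Hypothetical helper that adds interaction and power terms to two-feature samples. */
final class FeatureExpansion {
    /** Maps (x1, x2) to (x1, x2, x1*x2, x1^2, x2^2). */
    static double[] expand(double[] x) {
        if (x.length != 2) {
            throw new IllegalArgumentException("expected exactly 2 features");
        }
        return new double[] { x[0], x[1], x[0] * x[1], x[0] * x[0], x[1] * x[1] };
    }

    /** Expands every row of a training matrix. */
    static double[][] expandAll(double[][] data) {
        double[][] out = new double[data.length][];
        for (int i = 0; i < data.length; i++) {
            out[i] = expand(data[i]);
        }
        return out;
    }
}
```

The expanded matrix could then be passed as the `x` argument of `fit(double[][] x, int[] y)`, with the same expansion applied to samples at prediction time.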

Logistic regression also has strong connections with neural networks and maximum entropy modeling. For example, binary logistic regression is equivalent to a one-layer, single-output neural network with a logistic activation function trained under log loss. Similarly, multinomial logistic regression is equivalent to a one-layer, softmax-output neural network.
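To make the neural network analogy concrete, here is a minimal sketch of a softmax output layer and the log loss of a single sample. The class `SoftmaxDemo` is hypothetical and is not how Smile implements the multinomial model internally.

```java
/** Illustrative softmax output and log loss (not Smile's internal code). */
final class SoftmaxDemo {
    /** Numerically stable softmax over per-class linear scores. */
    static double[] softmax(double[] scores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double sum = 0.0;
        double[] p = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            // Subtracting the maximum avoids overflow in exp().
            p[i] = Math.exp(scores[i] - max);
            sum += p[i];
        }
        for (int i = 0; i < p.length; i++) {
            p[i] /= sum;
        }
        return p;
    }

    /** Log loss (negative log-likelihood) of one sample whose true class is y. */
    static double logLoss(double[] probabilities, int y) {
        return -Math.log(probabilities[y]);
    }
}
```

With two classes and scores `{z, 0}`, the first softmax output equals the logistic function of `z`, which is the binary case described above.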

Logistic regression estimation also obeys the maximum entropy principle, and thus logistic regression is sometimes called "maximum entropy modeling", and the resulting classifier the "maximum entropy classifier".

Nested Class Summary

Nested Classes

| Modifier and Type | Class | Description |
| --- | --- | --- |
| static class | LogisticRegression.Binomial | Binomial logistic regression. |
| static class | LogisticRegression.Multinomial | Multinomial logistic regression. |
| static final record | LogisticRegression.Options | Logistic regression hyperparameters. |

Nested classes/interfaces inherited from interface smile.classification.Classifier
Classifier.Trainer<T,M extends Classifier<T>>

Field Summary

Fields inherited from class smile.classification.AbstractClassifier
classes

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| LogisticRegression(int p, double L, double lambda, IntSet labels) | Constructor. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| double | AIC() | Returns the AIC score. |
| static LogisticRegression.Binomial | binomial(double[][] x, int[] y) | Fits binomial logistic regression. |
| static LogisticRegression.Binomial | binomial(double[][] x, int[] y, LogisticRegression.Options options) | Fits binomial logistic regression. |
| static LogisticRegression | fit(double[][] x, int[] y) | Fits logistic regression. |
| static LogisticRegression | fit(double[][] x, int[] y, Properties params) | Fits logistic regression. |
| static LogisticRegression | fit(double[][] x, int[] y, LogisticRegression.Options options) | Fits logistic regression. |
| double | getLearningRate() | Returns the learning rate of stochastic gradient descent. |
| double | loglikelihood() | Returns the log-likelihood of the model. |
| static LogisticRegression.Multinomial | multinomial(double[][] x, int[] y) | Fits multinomial logistic regression. |
| static LogisticRegression.Multinomial | multinomial(double[][] x, int[] y, LogisticRegression.Options options) | Fits multinomial logistic regression. |
| boolean | online() | Returns true if this is an online learner. |
| void | setLearningRate(double rate) | Sets the learning rate of stochastic gradient descent. |
| boolean | soft() | Returns true if this is a soft classifier that can estimate the posterior probabilities of classification. |

Methods inherited from class smile.classification.AbstractClassifier
classes, numClasses

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface smile.classification.Classifier
applyAsDouble, applyAsInt, predict, predict, predict, predict, predict, predict, predict, predict, score, update, update, update

Constructor Details
- LogisticRegression
  
  public LogisticRegression(int p, double L, double lambda, IntSet labels)
  
  Constructor.
  
  Parameters:
  
  p - the dimension of input data.
  
  L - the log-likelihood of the learned model.
  
  lambda - lambda > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
  
  labels - the class label encoder.
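The effect of `lambda` can be pictured as a penalty added to the negative log-likelihood. The source only says that `lambda > 0` gives a regularized estimate of the linear weights; the quadratic (L2) form of the penalty, the factor of 0.5, and leaving the bias unpenalized are assumptions made for this sketch, and `PenalizedObjective` is a hypothetical class.

```java
/** Illustrative penalized objective for binary labels (assumes an L2 penalty). */
final class PenalizedObjective {
    /**
     * Negative log-likelihood plus 0.5 * lambda * ||w||^2.
     * w has length p + 1; the last entry is the bias and is not penalized here.
     */
    static double objective(double[][] x, int[] y, double[] w, double lambda) {
        int p = w.length - 1;
        double nll = 0.0;
        for (int i = 0; i < x.length; i++) {
            double z = w[p];
            for (int j = 0; j < p; j++) {
                z += w[j] * x[i][j];
            }
            // For y in {0, 1}, the per-sample negative log-likelihood is log(1 + e^z) - y*z.
            nll += log1pExp(z) - y[i] * z;
        }
        double penalty = 0.0;
        for (int j = 0; j < p; j++) {
            penalty += w[j] * w[j];
        }
        return nll + 0.5 * lambda * penalty;
    }

    /** Stable evaluation of log(1 + e^z). */
    static double log1pExp(double z) {
        return z > 0 ? z + Math.log1p(Math.exp(-z)) : Math.log1p(Math.exp(z));
    }
}
```

Larger values of `lambda` make large weights more costly, which is one way to see why the estimate tends to generalize better in high dimensions.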
Method Details
- binomial
  
  public static LogisticRegression.Binomial binomial(double[][] x, int[] y)
  
  Fits binomial logistic regression.
  
  Parameters:
  
  x - training samples.
  
  y - training labels.
  
  Returns:
  
  the model.
- binomial
  
  public static LogisticRegression.Binomial binomial(double[][] x, int[] y, LogisticRegression.Options options)
  
  Fits binomial logistic regression.
  
  Parameters:
  
  x - training samples.
  
  y - training labels.
  
  options - the hyperparameters.
  
  Returns:
  
  the model.
- multinomial
  
  public static LogisticRegression.Multinomial multinomial(double[][] x, int[] y)
  
  Fits multinomial logistic regression.
  
  Parameters:
  
  x - training samples.
  
  y - training labels.
  
  Returns:
  
  the model.
- multinomial
  
  public static LogisticRegression.Multinomial multinomial(double[][] x, int[] y, LogisticRegression.Options options)
  
  Fits multinomial logistic regression.
  
  Parameters:
  
  x - training samples.
  
  y - training labels.
  
  options - the hyperparameters.
  
  Returns:
  
  the model.
- fit
  
  public static LogisticRegression fit(double[][] x, int[] y)
  
  Fits logistic regression.
  
  Parameters:
  
  x - training samples.
  
  y - training labels.
  
  Returns:
  
  the model.
- fit
  
  public static LogisticRegression fit(double[][] x, int[] y, LogisticRegression.Options options)
  
  Fits logistic regression.
  
  Parameters:
  
  x - training samples.
  
  y - training labels.
  
  options - the hyperparameters.
  
  Returns:
  
  the model.
- fit
  
  public static LogisticRegression fit(double[][] x, int[] y, Properties params)
  
  Fits logistic regression.
  
  Parameters:
  
  x - training samples.
  
  y - training labels.
  
  params - the hyperparameters.
  
  Returns:
  
  the model.
- soft
  
  public boolean soft()
  
  Description copied from interface: Classifier
  
  Returns true if this is a soft classifier that can estimate the posterior probabilities of classification.
  
  Returns:
  
  true if soft classifier.
- online
  
  public boolean online()
  
  Description copied from interface: Classifier
  
  Returns true if this is an online learner.
  
  Returns:
  
  true if online learner.
- setLearningRate
  
  public void setLearningRate(double rate)
  
  Sets the learning rate of stochastic gradient descent. It is good practice to adapt the learning rate to the data size. For example, it is typical to set the learning rate to eta/n, where eta is in [0.1, 0.3] and n is the size of the training data.
  
  Parameters:
  
  rate - the learning rate.
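The eta/n rule of thumb above, together with a single stochastic gradient step for the binary case, might look like the following. `SgdStepDemo` is a hypothetical class; the update shown is the textbook gradient step for the log-likelihood and is not claimed to match Smile's internal update exactly.

```java
/** Illustrative learning-rate rule and one SGD step (not Smile's internal code). */
final class SgdStepDemo {
    /** Learning rate eta / n, with eta typically in [0.1, 0.3]. */
    static double learningRate(double eta, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        return eta / n;
    }

    /**
     * One in-place gradient step on w for a sample (x, y) with y in {0, 1}.
     * w has length x.length + 1; the last entry is the bias.
     */
    static void step(double[] w, double[] x, int y, double rate) {
        int p = x.length;
        double z = w[p];
        for (int j = 0; j < p; j++) {
            z += w[j] * x[j];
        }
        // Gradient of the per-sample log-likelihood is (y - sigmoid(z)) * x.
        double err = y - 1.0 / (1.0 + Math.exp(-z));
        for (int j = 0; j < p; j++) {
            w[j] += rate * err * x[j];
        }
        w[p] += rate * err;
    }
}
```

A value computed this way could be passed to `setLearningRate(double rate)` before online updates.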
- getLearningRate
  
  public double getLearningRate()
  
  Returns the learning rate of stochastic gradient descent.
  
  Returns:
  
  the learning rate of stochastic gradient descent.
- loglikelihood
  
  public double loglikelihood()
  
  Returns the log-likelihood of the model.
  
  Returns:
  
  the log-likelihood of the model.
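For reference, the log-likelihood of a binary model is the sum of the log probabilities it assigns to the observed labels. The sketch below computes it from predicted probabilities; `LogLikelihoodDemo` is a hypothetical helper, and whether the value returned by `loglikelihood()` includes any regularization term is not stated in this page.

```java
/** Illustrative log-likelihood of binary labels given predicted probabilities. */
final class LogLikelihoodDemo {
    /**
     * @param prob prob[i] is the predicted probability that sample i has label 1
     * @param y    observed labels in {0, 1}
     */
    static double logLikelihood(double[] prob, int[] y) {
        if (prob.length != y.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < y.length; i++) {
            sum += (y[i] == 1) ? Math.log(prob[i]) : Math.log(1.0 - prob[i]);
        }
        return sum;
    }
}
```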
- AIC
  
  public double AIC()
  
  Returns the AIC score.
  
  Returns:
  
  the AIC score.
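The Akaike information criterion is conventionally defined as AIC = 2k - 2 ln(L), where k is the number of estimated parameters and ln(L) is the log-likelihood. The sketch below applies that textbook formula; `AicDemo` is a hypothetical class, and how this library counts k (for example, whether the bias is included) is an assumption left to the caller.

```java
/** Illustrative AIC computation using the textbook definition. */
final class AicDemo {
    /**
     * @param logLikelihood the log-likelihood of the fitted model
     * @param k             the number of estimated parameters (caller's choice of counting)
     */
    static double aic(double logLikelihood, int k) {
        return 2.0 * k - 2.0 * logLikelihood;
    }
}
```

Lower AIC values indicate a better trade-off between fit and model complexity when comparing models fitted to the same data.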
