Class LogisticRegression
- All Implemented Interfaces:
Serializable, ToDoubleFunction<double[]>, ToIntFunction<double[]>, Classifier<double[]>
- Direct Known Subclasses:
LogisticRegression.Binomial, LogisticRegression.Multinomial
Goodness-of-fit tests such as the likelihood ratio test are available as indicators of model appropriateness, as is the Wald statistic to test the significance of individual independent variables.
Logistic regression has many analogies to ordinary least squares (OLS) regression. Unlike OLS regression, however, logistic regression does not assume a linear relationship between the raw values of the independent variables and the dependent variable, does not require normally distributed variables, does not assume homoscedasticity, and in general has less stringent requirements.
Compared with linear discriminant analysis, logistic regression has several advantages:
- It is more robust: the independent variables do not have to be normally distributed or have equal variance in each group.
- It does not assume a linear relationship between the independent variables and dependent variable.
- It may handle nonlinear effects since one can add explicit interaction and power terms.
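The last point can be illustrated with a small feature-expansion helper. This is only a sketch and not part of the Smile API: the class `FeatureExpansion` and the method `expand` are hypothetical names, and the sketch assumes a two-column input whose rows are extended with the power terms x1^2 and x2^2 and the interaction term x1*x2 before the data is handed to a fitting method.

```java
import java.util.Arrays;

public class FeatureExpansion {
    /**
     * Hypothetical helper: maps each row [x1, x2] to
     * [x1, x2, x1^2, x2^2, x1*x2].
     */
    public static double[][] expand(double[][] x) {
        double[][] z = new double[x.length][];
        for (int i = 0; i < x.length; i++) {
            double a = x[i][0];
            double b = x[i][1];
            // Original terms, then power terms, then the interaction term.
            z[i] = new double[] {a, b, a * a, b * b, a * b};
        }
        return z;
    }

    public static void main(String[] args) {
        double[][] x = {{1.0, 2.0}, {3.0, -1.0}};
        for (double[] row : expand(x)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The model stays linear in its weights; only the inputs are transformed, so the expanded matrix can be used wherever a `double[][] x` of training samples is expected.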
Logistic regression also has strong connections with neural networks and maximum entropy modeling. For example, binary logistic regression is equivalent to a one-layer, single-output neural network with a logistic activation function trained under log loss. Similarly, multinomial logistic regression is equivalent to a one-layer, softmax-output neural network.
Logistic regression estimation also obeys the maximum entropy principle, and thus logistic regression is sometimes called "maximum entropy modeling", and the resulting classifier the "maximum entropy classifier".
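The neural-network view described above can be sketched in a few lines. The code below is illustrative only and does not reproduce the library's internals: the class name `LogisticForward`, the separate bias argument `b`, and the method names are assumptions made for this example. It shows the logistic activation, the per-sample log loss, and a softmax output layer.

```java
public class LogisticForward {
    /** Logistic (sigmoid) activation. */
    public static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** Binary model output: P(y = 1 | x) = sigmoid(w . x + b). */
    public static double binaryPosterior(double[] w, double b, double[] x) {
        double z = b;
        for (int j = 0; j < x.length; j++) {
            z += w[j] * x[j];
        }
        return sigmoid(z);
    }

    /** Log loss of one sample with predicted probability p and label y in {0, 1}. */
    public static double logLoss(double p, int y) {
        return y == 1 ? -Math.log(p) : -Math.log(1.0 - p);
    }

    /** Softmax over class scores, shifted by the maximum for numerical stability. */
    public static double[] softmax(double[] z) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : z) {
            max = Math.max(max, v);
        }
        double sum = 0.0;
        double[] p = new double[z.length];
        for (int k = 0; k < z.length; k++) {
            p[k] = Math.exp(z[k] - max);
            sum += p[k];
        }
        for (int k = 0; k < z.length; k++) {
            p[k] /= sum;
        }
        return p;
    }

    public static void main(String[] args) {
        double p = binaryPosterior(new double[] {0.5, -0.25}, 0.1, new double[] {1.0, 2.0});
        System.out.println("P(y=1|x) = " + p + ", log loss for y=1: " + logLoss(p, 1));
        System.out.println(java.util.Arrays.toString(softmax(new double[] {1.0, 2.0, 3.0})));
    }
}
```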
Nested Class Summary
Modifier and Type | Class | Description
static class | LogisticRegression.Binomial | Binomial logistic regression.
static class | LogisticRegression.Multinomial | Multinomial logistic regression.
Nested classes/interfaces inherited from interface smile.classification.Classifier
Classifier.Trainer<T, M extends Classifier<T>>
Field Summary
Fields inherited from class smile.classification.AbstractClassifier
classes
-
Constructor Summary
Constructor | Description
LogisticRegression(int p, double L, double lambda, IntSet labels) | Constructor.
Method Summary
Modifier and Type | Method | Description
double | AIC() | Returns the AIC score.
static LogisticRegression.Binomial | binomial(double[][] x, int[] y) | Fits binomial logistic regression.
static LogisticRegression.Binomial | binomial(double[][] x, int[] y, double lambda, double tol, int maxIter) | Fits binomial logistic regression.
static LogisticRegression.Binomial | binomial(double[][] x, int[] y, Properties params) | Fits binomial logistic regression.
static LogisticRegression | fit(double[][] x, int[] y) | Fits logistic regression.
static LogisticRegression | fit(double[][] x, int[] y, double lambda, double tol, int maxIter) | Fits logistic regression.
static LogisticRegression | fit(double[][] x, int[] y, Properties params) | Fits logistic regression.
double | getLearningRate() | Returns the learning rate of stochastic gradient descent.
double | loglikelihood() | Returns the log-likelihood of the model.
static LogisticRegression.Multinomial | multinomial(double[][] x, int[] y) | Fits multinomial logistic regression.
static LogisticRegression.Multinomial | multinomial(double[][] x, int[] y, double lambda, double tol, int maxIter) | Fits multinomial logistic regression.
static LogisticRegression.Multinomial | multinomial(double[][] x, int[] y, Properties params) | Fits multinomial logistic regression.
boolean | online() | Returns true if this is an online learner.
void | setLearningRate(double rate) | Sets the learning rate of stochastic gradient descent.
boolean | soft() | Returns true if this is a soft classifier that can estimate the posterior probabilities of classification.
Methods inherited from class smile.classification.AbstractClassifier
classes, numClasses
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface smile.classification.Classifier
applyAsDouble, applyAsInt, predict, predict, predict, predict, predict, predict, predict, predict, score, update, update, update
-
Constructor Details
-
LogisticRegression
public LogisticRegression(int p, double L, double lambda, IntSet labels)
Constructor.
- Parameters:
p - the dimension of input data.
L - the log-likelihood of the learned model.
lambda - lambda > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
labels - the class label encoder.
-
-
Method Details
-
binomial
public static LogisticRegression.Binomial binomial(double[][] x, int[] y)
Fits binomial logistic regression.
- Parameters:
x - training samples.
y - training labels.
- Returns:
the model.
-
binomial
public static LogisticRegression.Binomial binomial(double[][] x, int[] y, Properties params)
Fits binomial logistic regression.
- Parameters:
x - training samples.
y - training labels.
params - the hyper-parameters.
- Returns:
the model.
-
binomial
public static LogisticRegression.Binomial binomial(double[][] x, int[] y, double lambda, double tol, int maxIter)
Fits binomial logistic regression.
- Parameters:
x - training samples.
y - training labels.
lambda - lambda > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
tol - the tolerance for stopping iterations.
maxIter - the maximum number of iterations.
- Returns:
the model.
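One common reading of `lambda` is as the weight of an L2 (ridge) penalty on the linear weights. The sketch below assumes the penalized objective is the negative log-likelihood plus (lambda / 2) times the squared norm of the weights, with the intercept left unpenalized. Both the scaling and the weight layout (intercept stored last) are assumptions made for illustration, not a statement of how the library computes it; `PenalizedObjective` and its methods are hypothetical names.

```java
public class PenalizedObjective {
    /** Numerically stable log(1 + exp(z)). */
    static double log1pExp(double z) {
        return z > 0 ? z + Math.log1p(Math.exp(-z)) : Math.log1p(Math.exp(z));
    }

    /**
     * Log-likelihood of a binary logistic model with labels in {0, 1}.
     * Assumed layout: w[0..p-1] are the linear weights, w[p] is the intercept.
     */
    public static double logLikelihood(double[] w, double[][] x, int[] y) {
        int p = w.length - 1;
        double ll = 0.0;
        for (int i = 0; i < x.length; i++) {
            double z = w[p];
            for (int j = 0; j < p; j++) {
                z += w[j] * x[i][j];
            }
            // y * z - log(1 + e^z) is the per-sample log-likelihood.
            ll += y[i] * z - log1pExp(z);
        }
        return ll;
    }

    /** Assumed objective: -logLikelihood + (lambda / 2) * sum of squared linear weights. */
    public static double objective(double[] w, double[][] x, int[] y, double lambda) {
        double penalty = 0.0;
        for (int j = 0; j < w.length - 1; j++) {
            penalty += w[j] * w[j];
        }
        return -logLikelihood(w, x, y) + 0.5 * lambda * penalty;
    }

    public static void main(String[] args) {
        double[][] x = {{0.5}, {1.5}, {-1.0}};
        int[] y = {1, 1, 0};
        double[] w = {0.8, -0.1};
        System.out.println("lambda = 0:   " + objective(w, x, y, 0.0));
        System.out.println("lambda = 0.1: " + objective(w, x, y, 0.1));
    }
}
```

Under this reading, a larger `lambda` makes large weights more expensive, which pulls the estimate toward zero.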
-
multinomial
public static LogisticRegression.Multinomial multinomial(double[][] x, int[] y)
Fits multinomial logistic regression.
- Parameters:
x - training samples.
y - training labels.
- Returns:
the model.
-
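multinomial
public static LogisticRegression.Multinomial multinomial(double[][] x, int[] y, Properties params)
Fits multinomial logistic regression.
- Parameters:
x - training samples.
y - training labels.
params - the hyper-parameters.
- Returns:
the model.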
-
multinomial
public static LogisticRegression.Multinomial multinomial(double[][] x, int[] y, double lambda, double tol, int maxIter)
Fits multinomial logistic regression.
- Parameters:
x - training samples.
y - training labels.
lambda - lambda > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
tol - the tolerance for stopping iterations.
maxIter - the maximum number of iterations.
- Returns:
the model.
-
fit
public static LogisticRegression fit(double[][] x, int[] y)
Fits logistic regression.
- Parameters:
x - training samples.
y - training labels.
- Returns:
the model.
-
fit
public static LogisticRegression fit(double[][] x, int[] y, Properties params)
Fits logistic regression.
- Parameters:
x - training samples.
y - training labels.
params - the hyper-parameters.
- Returns:
the model.
-
fit
public static LogisticRegression fit(double[][] x, int[] y, double lambda, double tol, int maxIter)
Fits logistic regression.
- Parameters:
x - training samples.
y - training labels.
lambda - lambda > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
tol - the tolerance for stopping iterations.
maxIter - the maximum number of iterations.
- Returns:
the model.
-
soft
public boolean soft()
Description copied from interface: Classifier
Returns true if this is a soft classifier that can estimate the posterior probabilities of classification.
- Returns:
true if this is a soft classifier.
-
online
public boolean online()
Description copied from interface: Classifier
Returns true if this is an online learner.
- Returns:
true if this is an online learner.
-
setLearningRate
public void setLearningRate(double rate)
Sets the learning rate of stochastic gradient descent. It is good practice to adapt the learning rate to the size of the data. For example, it is typical to set the learning rate to eta/n, where eta is in [0.1, 0.3] and n is the size of the training data.
- Parameters:
rate - the learning rate.
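The eta/n rule of thumb above can be written as a small helper. This is a sketch only: `LearningRateSketch` and `learningRate` are hypothetical names, and the range check simply mirrors the [0.1, 0.3] interval quoted in the description rather than any constraint enforced by the library.

```java
public class LearningRateSketch {
    /**
     * Returns eta / n, where eta is expected to lie in [0.1, 0.3]
     * and n is the number of training samples.
     */
    public static double learningRate(double eta, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive: " + n);
        }
        if (eta < 0.1 || eta > 0.3) {
            throw new IllegalArgumentException("eta outside [0.1, 0.3]: " + eta);
        }
        return eta / n;
    }

    public static void main(String[] args) {
        int n = 1000; // size of the training data
        double rate = learningRate(0.2, n);
        System.out.println("learning rate = " + rate);
        // The value would then be passed to the model, e.g. model.setLearningRate(rate).
    }
}
```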
-
getLearningRate
public double getLearningRate()
Returns the learning rate of stochastic gradient descent.
- Returns:
the learning rate of stochastic gradient descent.
-
loglikelihood
public double loglikelihood()
Returns the log-likelihood of the model.
- Returns:
the log-likelihood of the model.
-
AIC
public double AIC()
Returns the AIC score.
- Returns:
the AIC score.
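For orientation, the Akaike information criterion is conventionally defined as AIC = 2k - 2 ln(L), where k is the number of estimated parameters and ln(L) is the maximized log-likelihood. The sketch below computes that standard formula. How this class counts k is not stated on this page, so the parameter count is passed in explicitly; `AicSketch` and `aic` are hypothetical names.

```java
public class AicSketch {
    /**
     * Standard AIC: 2k - 2 * logLikelihood.
     *
     * @param logLikelihood the maximized log-likelihood of the model
     * @param numParameters the number of estimated parameters k (supplied by the caller)
     */
    public static double aic(double logLikelihood, int numParameters) {
        return 2.0 * numParameters - 2.0 * logLikelihood;
    }

    public static void main(String[] args) {
        // Example: log-likelihood of -10 with 3 estimated parameters.
        System.out.println(aic(-10.0, 3)); // prints 26.0
    }
}
```

Lower values indicate a better trade-off between fit and model size when comparing models fitted to the same data.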
-