smile.regression

## Class RidgeRegression

• java.lang.Object
• smile.regression.RidgeRegression
• All Implemented Interfaces:
java.io.Serializable, Regression<double[]>

```
public class RidgeRegression
extends java.lang.Object
implements Regression<double[]>, java.io.Serializable
```
Ridge Regression. Coefficient estimates for multiple linear regression models rely on the independence of the model terms. When terms are correlated and the columns of the design matrix X have an approximate linear dependence, the matrix `X'X` becomes close to singular. As a result, the least-squares estimate becomes highly sensitive to random errors in the observed response `Y`, producing a large variance.

Ridge regression is one method to address these issues. In ridge regression, the matrix `X'X` is perturbed so as to make its determinant appreciably different from 0.
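
In matrix form, the ridge estimate solves the perturbed normal equations `(X'X + λI)β = X'y` in place of the least-squares equations `(X'X)β = X'y`. The sketch below is plain Java, independent of the Smile API; the design matrix, response, and `lambda` are made-up illustrative values. It solves the perturbed system for two nearly collinear predictors by Cramer's rule, which is exactly the situation where the unperturbed system is close to singular.

```java
// Plain-Java sketch of the ridge normal equations (X'X + lambda*I) beta = X'y
// for p = 2 predictors. Data and lambda are illustrative, not from Smile.
public class RidgeSketch {
    // Solves the 2x2 system A * beta = b by Cramer's rule.
    static double[] solve2x2(double[][] a, double[] b) {
        double det = a[0][0] * a[1][1] - a[0][1] * a[1][0];
        return new double[] {
            (b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det
        };
    }

    public static double[] ridge(double[][] x, double[] y, double lambda) {
        int n = x.length;
        double[][] xtx = new double[2][2];   // accumulates X'X
        double[] xty = new double[2];        // accumulates X'y
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < 2; j++) {
                xty[j] += x[i][j] * y[i];
                for (int k = 0; k < 2; k++) xtx[j][k] += x[i][j] * x[i][k];
            }
        }
        // The ridge perturbation: adding lambda to the diagonal keeps the
        // matrix well-conditioned even when the columns are nearly dependent.
        xtx[0][0] += lambda;
        xtx[1][1] += lambda;
        return solve2x2(xtx, xty);
    }

    public static void main(String[] args) {
        // Two nearly collinear columns: without the lambda*I term the
        // system is almost singular and the solution is unstable.
        double[][] x = {{1.0, 1.01}, {2.0, 1.99}, {3.0, 3.02}, {4.0, 3.98}};
        double[] y = {2.0, 4.0, 6.0, 8.0};
        double[] beta = ridge(x, y, 0.1);
        System.out.printf("beta = [%.4f, %.4f]%n", beta[0], beta[1]);
    }
}
```

With `lambda = 0.1` the weight is split nearly evenly between the two collinear columns, and the coefficient sum is shrunk slightly below the least-squares value, which is the shrinkage behavior described above.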

Ridge regression is a kind of Tikhonov regularization, which is the most commonly used method of regularization of ill-posed problems. Ridge regression shrinks the regression coefficients by imposing a penalty on their size. By allowing a small amount of bias in the estimates, more reasonable coefficients may often be obtained. Often, small amounts of bias lead to dramatic reductions in the variance of the estimated model coefficients.

Another interpretation of ridge regression is available through Bayesian estimation. In this setting, the belief that the weights should be small is encoded into a prior distribution.

The penalty term is unfair if the predictor variables are not on the same scale. Therefore, if we know that the variables are not measured in the same units, we typically scale the columns of X (to have sample variance 1) before performing ridge regression.
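
As a sketch of that preprocessing step (this helper is hypothetical, not part of the Smile API), the following plain Java scales each column of a design matrix to sample variance 1:

```java
// Hypothetical helper, not part of Smile: scales each column of x to have
// sample variance 1, as suggested before running ridge regression on
// predictors measured in different units.
public class ColumnScaler {
    public static double[][] scaleToUnitVariance(double[][] x) {
        int n = x.length, p = x[0].length;
        double[][] scaled = new double[n][p];
        for (int j = 0; j < p; j++) {
            double mean = 0.0;
            for (double[] row : x) mean += row[j];
            mean /= n;
            double ss = 0.0;
            for (double[] row : x) ss += (row[j] - mean) * (row[j] - mean);
            double sd = Math.sqrt(ss / (n - 1));   // sample standard deviation
            for (int i = 0; i < n; i++) scaled[i][j] = x[i][j] / sd;
        }
        return scaled;
    }
}
```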

When including an intercept term in the regression, we usually leave this coefficient unpenalized. Otherwise, adding a constant to every element of `y` would not simply shift the predictions by that same constant. If we center the columns of `X`, then the intercept estimate ends up being just the mean of `y`.

Ridge regression doesn’t set coefficients exactly to zero unless `λ = ∞`, in which case they’re all zero. Hence ridge regression cannot perform variable selection, and even though it performs well in terms of prediction accuracy, it does poorly in terms of offering a clear interpretation.

• ### Nested Class Summary

Nested Classes

| Modifier and Type | Class and Description |
|---|---|
| `static class` | `RidgeRegression.Trainer`<br>Trainer for ridge regression. |
• ### Constructor Summary

Constructors

| Constructor and Description |
|---|
| `RidgeRegression(double[][] x, double[] y, double lambda)`<br>Constructor. |
• ### Method Summary

All Methods

| Modifier and Type | Method and Description |
|---|---|
| `double` | `adjustedRSquared()`<br>Returns the adjusted R2 statistic. |
| `double[]` | `coefficients()`<br>Returns the (scaled) linear coefficients. |
| `int` | `df()`<br>Returns the degrees of freedom of the residual standard error. |
| `double` | `error()`<br>Returns the residual standard error. |
| `double` | `ftest()`<br>Returns the F-statistic of goodness-of-fit. |
| `double` | `intercept()`<br>Returns the (centered) intercept. |
| `double` | `predict(double[] x)`<br>Predicts the dependent variable of an instance. |
| `double` | `pvalue()`<br>Returns the p-value of the goodness-of-fit test. |
| `double[]` | `residuals()`<br>Returns the residuals, that is, the response minus the fitted values. |
| `double` | `RSquared()`<br>Returns the R2 statistic. |
| `double` | `RSS()`<br>Returns the residual sum of squares. |
| `double` | `shrinkage()`<br>Returns the shrinkage parameter. |
| `java.lang.String` | `toString()` |
| `double[][]` | `ttest()`<br>Returns the t-test of the coefficients. |
• ### Methods inherited from class java.lang.Object

`clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait`
• ### Methods inherited from interface smile.regression.Regression

`predict`
• ### Constructor Detail

• #### RidgeRegression

```
public RidgeRegression(double[][] x,
                       double[] y,
                       double lambda)
```
Constructor. Learns the ridge regression model.
Parameters:
`x` - a matrix containing the explanatory variables. NO NEED to include a constant column of 1s for bias.
`y` - the response values.
`lambda` - the shrinkage/regularization parameter. Large lambda means more shrinkage. Choosing an appropriate value of lambda is important, and also difficult.
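
Putting the constructor and `predict` together, a minimal usage sketch based only on the signatures documented on this page (the data and `lambda` value are made up, and this assumes the Smile library is on the classpath):

```java
import smile.regression.RidgeRegression;

public class RidgeExample {
    public static void main(String[] args) {
        // Toy data: x is the design matrix WITHOUT a column of 1s,
        // since the constructor handles the intercept itself.
        double[][] x = {{1.0, 2.0}, {2.0, 3.9}, {3.0, 6.1}, {4.0, 8.0}};
        double[] y = {3.0, 6.0, 9.0, 12.0};

        // lambda = 0.1 is illustrative only; in practice it is
        // typically chosen by cross-validation.
        RidgeRegression model = new RidgeRegression(x, y, 0.1);

        System.out.println("intercept    = " + model.intercept());
        System.out.println("coefficients = "
                + java.util.Arrays.toString(model.coefficients()));
        System.out.println("prediction   = "
                + model.predict(new double[]{5.0, 10.0}));
    }
}
```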
• ### Method Detail

• #### coefficients

`public double[] coefficients()`
Returns the (scaled) linear coefficients.
• #### intercept

`public double intercept()`
Returns the (centered) intercept.
• #### shrinkage

`public double shrinkage()`
Returns the shrinkage parameter.
• #### predict

`public double predict(double[] x)`
Description copied from interface: `Regression`
Predicts the dependent variable of an instance.
Specified by:
`predict` in interface `Regression<double[]>`
Parameters:
`x` - the instance.
Returns:
the predicted value of the dependent variable.
• #### ttest

`public double[][] ttest()`
Returns the t-test of the coefficients. Each row contains, in order: the coefficient, its standard error, the t-score for the hypothesis test that the coefficient is zero, and the p-value of that test. The last row corresponds to the intercept.
• #### residuals

`public double[] residuals()`
Returns the residuals, that is, the response minus the fitted values.
• #### RSS

`public double RSS()`
Returns the residual sum of squares.
• #### error

`public double error()`
Returns the residual standard error.
• #### df

`public int df()`
Returns the degrees of freedom of the residual standard error.
• #### RSquared

`public double RSquared()`
Returns the R2 statistic. In regression, the R2 coefficient of determination is a statistical measure of how well the regression line approximates the real data points. An R2 of 1.0 indicates that the regression line perfectly fits the data.

In the case of ordinary least-squares regression, R2 increases as we increase the number of variables in the model (R2 will not decrease). This illustrates a drawback to one possible use of R2, where one might try to include more variables in the model until "there is no more improvement". This leads to the alternative approach of looking at the adjusted R2.

• #### adjustedRSquared

`public double adjustedRSquared()`
Returns the adjusted R2 statistic. The adjusted R2 has almost the same interpretation as R2, but it penalizes the statistic as extra variables are included in the model.
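
For reference, the textbook formula is `adjR2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)`, where `n` is the number of observations and `p` the number of predictors. This page does not show Smile's exact implementation, so the helper below is a hedged sketch of that standard formula, not the library's code:

```java
// Standard textbook adjusted R2; not copied from Smile's source.
// adjR2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)
public class AdjustedR2 {
    public static double adjustedRSquared(double r2, int n, int p) {
        return 1.0 - (1.0 - r2) * (n - 1) / (double) (n - p - 1);
    }
}
```

Because `(n - 1) / (n - p - 1) > 1` whenever `p > 0`, the adjusted R2 is always below the plain R2, and the gap widens as more variables are added.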
• #### ftest

`public double ftest()`
Returns the F-statistic of goodness-of-fit.
• #### pvalue

`public double pvalue()`
Returns the p-value of goodness-of-fit test.
• #### toString

`public java.lang.String toString()`
Overrides:
`toString` in class `java.lang.Object`