Class LinearModel

All Implemented Interfaces:
Serializable, ToDoubleFunction<Tuple>, DataFrameRegression, Regression<Tuple>
Once a regression model has been constructed, it may be important to confirm the goodness of fit of the model and the statistical significance of the estimated parameters. Commonly used checks of goodness of fit include the R^{2} statistic, analysis of the pattern of residuals, and hypothesis testing. Statistical significance can be checked by an F-test of the overall fit, followed by t-tests of individual parameters.
Interpretations of these diagnostic tests rest heavily on the model assumptions. Although examination of the residuals can be used to invalidate a model, the results of a t-test or F-test are sometimes more difficult to interpret if the model's assumptions are violated. For example, if the error term does not have a normal distribution, then in small samples the estimated parameters will not follow normal distributions either, which complicates inference. With relatively large samples, however, a central limit theorem can be invoked such that hypothesis testing may proceed using asymptotic approximations.
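To make the diagnostics above concrete, here is a minimal plain-Java sketch (not the Smile API; the class and method names `OlsDiagnostics`, `fit`, `rss`, and `rSquared` are illustrative, and the data is made up) that fits a one-predictor least-squares line and computes the residual sum of squares and R^{2} discussed throughout this page.

```java
// Goodness-of-fit diagnostics for a simple least-squares fit y = a + b*x.
// Illustrative sketch only; not part of the Smile library.
public class OlsDiagnostics {
    // Returns {intercept, slope} of the least-squares line.
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double xbar = 0, ybar = 0;
        for (int i = 0; i < n; i++) { xbar += x[i]; ybar += y[i]; }
        xbar /= n; ybar /= n;
        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - xbar) * (y[i] - ybar);
            sxx += (x[i] - xbar) * (x[i] - xbar);
        }
        double slope = sxy / sxx;
        return new double[] {ybar - slope * xbar, slope};
    }

    // Residual sum of squares: sum of squared (response minus fitted value).
    static double rss(double[] x, double[] y, double[] beta) {
        double s = 0;
        for (int i = 0; i < x.length; i++) {
            double r = y[i] - (beta[0] + beta[1] * x[i]);
            s += r * r;
        }
        return s;
    }

    // R^2 = 1 - RSS / TSS, where TSS is the total sum of squares.
    static double rSquared(double[] x, double[] y, double[] beta) {
        double ybar = 0;
        for (double v : y) ybar += v;
        ybar /= y.length;
        double tss = 0;
        for (double v : y) tss += (v - ybar) * (v - ybar);
        return 1.0 - rss(x, y, beta) / tss;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5};
        double[] y = {2.1, 4.0, 6.2, 7.9, 10.1};   // roughly y = 2x
        double[] beta = fit(x, y);
        System.out.printf("intercept=%.3f slope=%.3f R2=%.4f%n",
                beta[0], beta[1], rSquared(x, y, beta));
    }
}
```

An R^{2} near 1 here reflects that the data lie almost exactly on a line; the same quantities underlie the `RSS()`, `RSquared()`, and `residuals()` methods documented below.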

Nested Class Summary
Nested classes/interfaces inherited from interface smile.regression.DataFrameRegression
DataFrameRegression.Trainer<M extends DataFrameRegression>

Constructor Summary
Constructor:
LinearModel(Formula formula, StructType schema, Matrix X, double[] y, double[] w, double b)
Constructor.
Method Summary
Modifier and Type / Method / Description
double adjustedRSquared(): Returns the adjusted R^{2} statistic.
double[] coefficients(): Returns the linear coefficients without the intercept.
int df(): Returns the degrees of freedom of the residual standard error.
double error(): Returns the residual standard error.
double[] fittedValues(): Returns the fitted values.
Formula formula(): Returns the model formula.
double ftest(): Returns the F-statistic of the goodness-of-fit test.
double intercept(): Returns the intercept.
boolean online(): Returns true if this is an online learner.
double predict(double[] x): Predicts the dependent variable of an instance.
double[] predict(DataFrame df): Predicts the dependent variables of a data frame.
double predict(Tuple x): Predicts the dependent variable of an instance.
double pvalue(): Returns the p-value of the goodness-of-fit test.
double[] residuals(): Returns the residuals, which are the response minus the fitted values.
double RSquared(): Returns the R^{2} statistic.
double RSS(): Returns the residual sum of squares.
StructType schema(): Returns the schema of predictors.
String toString()
double[][] ttest(): Returns the t-test of the coefficients (including the intercept).
void update(double[] x, double y): Growing window recursive least squares with lambda = 1.
void update(double[] x, double y, double lambda): Recursive least squares.
void update(DataFrame data): Online update the regression model with a new data frame.
void update(Tuple data): Online update the regression model with a new training instance.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface smile.regression.Regression
applyAsDouble, predict, predict, predict, update, update, update

Constructor Details

LinearModel
public LinearModel(Formula formula, StructType schema, Matrix X, double[] y, double[] w, double b)
Constructor.
Parameters:
formula - a symbolic description of the model to be fitted.
schema - the schema of input data.
X - the design matrix.
y - the response variable.
w - the linear weights.
b - the intercept.


Method Details

formula
Description copied from interface: DataFrameRegression
Returns the model formula.
Specified by:
formula in interface DataFrameRegression
Returns:
the model formula.

schema
Description copied from interface: DataFrameRegression
Returns the schema of predictors.
Specified by:
schema in interface DataFrameRegression
Returns:
the schema of predictors.

ttest
public double[][] ttest()
Returns the t-test of the coefficients (including the intercept). The first column is the coefficients, the second column is the standard errors of the coefficients, the third column is the t-scores of the hypothesis test that each coefficient is zero, and the fourth column is the p-values of the test. The last row corresponds to the intercept.
Returns:
the t-test of the coefficients.
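As a sketch of where the first three columns of this table come from in the one-predictor case (plain Java, not the Smile implementation; the name `SlopeTTest` is illustrative, and the p-value column is omitted because it requires the CDF of Student's t distribution):

```java
// Coefficient, standard error, and t-score for the slope of y = a + b*x.
// Illustrative sketch; the p-value column would need a t-distribution CDF.
public class SlopeTTest {
    // Returns {slope, standardError, tScore}.
    static double[] slopeTest(double[] x, double[] y) {
        int n = x.length;
        double xbar = 0, ybar = 0;
        for (int i = 0; i < n; i++) { xbar += x[i]; ybar += y[i]; }
        xbar /= n; ybar /= n;
        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - xbar) * (y[i] - ybar);
            sxx += (x[i] - xbar) * (x[i] - xbar);
        }
        double slope = sxy / sxx;
        double intercept = ybar - slope * xbar;
        double rss = 0;
        for (int i = 0; i < n; i++) {
            double r = y[i] - (intercept + slope * x[i]);
            rss += r * r;
        }
        double s2 = rss / (n - 2);        // residual variance, df = n - 2
        double se = Math.sqrt(s2 / sxx);  // standard error of the slope
        return new double[] {slope, se, slope / se};
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5};
        double[] y = {2.1, 4.0, 6.2, 7.9, 10.1};
        double[] t = slopeTest(x, y);
        System.out.printf("slope=%.3f se=%.4f t=%.1f%n", t[0], t[1], t[2]);
    }
}
```

A large t-score, as here, means the slope is many standard errors away from zero, so the null hypothesis of a zero coefficient would be rejected at any conventional level.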

coefficients
public double[] coefficients()
Returns the linear coefficients without the intercept.
Returns:
the linear coefficients without the intercept.

intercept
public double intercept()
Returns the intercept.
Returns:
the intercept.

residuals
public double[] residuals()
Returns the residuals, which are the response minus the fitted values.
Returns:
the residuals.

fittedValues
public double[] fittedValues()
Returns the fitted values.
Returns:
the fitted values.

RSS
public double RSS()
Returns the residual sum of squares.
Returns:
the residual sum of squares.

error
public double error()
Returns the residual standard error.
Returns:
the residual standard error.

df
public int df()
Returns the degrees of freedom of the residual standard error.
Returns:
the degrees of freedom of the residual standard error.

RSquared
public double RSquared()
Returns the R^{2} statistic. In regression, the R^{2} coefficient of determination is a statistical measure of how well the regression line approximates the real data points. An R^{2} of 1.0 indicates that the regression line perfectly fits the data.
In the case of ordinary least-squares regression, R^{2} increases as we add variables to the model (R^{2} will not decrease). This illustrates a drawback of one possible use of R^{2}, where one might keep adding variables until "there is no more improvement". This leads to the alternative approach of looking at the adjusted R^{2}.
 Returns:
 R^{2} statistic.
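The penalty applied by the adjusted statistic can be sketched directly from its standard formula, adjR^{2} = 1 - (1 - R^{2})(n - 1)/(n - p - 1), where n is the sample size and p the number of predictors. This is a plain-Java illustration with made-up values, not the Smile implementation:

```java
// Adjusted R^2 penalizes R^2 for the number of predictors p:
//   adjR2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)
// Illustrative sketch with made-up numbers.
public class AdjustedR2 {
    static double adjust(double r2, int n, int p) {
        return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1);
    }

    public static void main(String[] args) {
        // Same raw R^2 of 0.90 on n = 20 samples:
        System.out.println(adjust(0.90, 20, 1));   // one predictor
        System.out.println(adjust(0.90, 20, 10));  // ten predictors
    }
}
```

With the same raw R^{2}, the ten-predictor model receives a noticeably lower adjusted value, which is exactly the behavior the next section describes.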

adjustedRSquared
public double adjustedRSquared()
Returns the adjusted R^{2} statistic. The adjusted R^{2} has almost the same interpretation as R^{2}, but it penalizes the statistic as extra variables are included in the model.
Returns:
 adjusted R^{2} statistic.

ftest
public double ftest()
Returns the F-statistic of the goodness-of-fit test.
Returns:
the F-statistic of the goodness-of-fit test.

pvalue
public double pvalue()
Returns the p-value of the goodness-of-fit test.
Returns:
the p-value of the goodness-of-fit test.

predict
public double predict(double[] x)
Predicts the dependent variable of an instance.
Parameters:
x - an instance.
Returns:
the predicted value of the dependent variable.

predict
Description copied from interface: Regression
Predicts the dependent variable of an instance.
Specified by:
predict in interface Regression<Tuple>
Parameters:
x - an instance.
Returns:
the predicted value of the dependent variable.

predict
Description copied from interface: DataFrameRegression
Predicts the dependent variables of a data frame.
Specified by:
predict in interface DataFrameRegression
Parameters:
df - the data frame.
Returns:
the predicted values.

update
Online update the regression model with a new training instance.
Parameters:
data - the training data.

update
Online update the regression model with a new data frame.
Parameters:
data - the training data.

online
public boolean online()
Description copied from interface: Regression
Returns true if this is an online learner.
Specified by:
online in interface Regression<Tuple>
Returns:
true if online learner.

update
public void update(double[] x, double y)
Growing window recursive least squares with lambda = 1. RLS updates an ordinary least squares model with samples that arrive sequentially.
Parameters:
x - a training instance.
y - the response variable.

update
public void update(double[] x, double y, double lambda)
Recursive least squares. RLS updates an ordinary least squares model with samples that arrive sequentially.
In some adaptive configurations it can be useful not to give equal importance to all the historical data but to assign higher weights to the most recent data (and to forget the oldest). This may happen when the phenomenon underlying the data is non-stationary, or when we want to approximate a nonlinear dependence by a linear model that is local in time. Both situations are common in adaptive control problems.
Parameters:
x - a training instance.
y - the response variable.
lambda - the forgetting factor in (0, 1]. The smaller lambda is, the smaller the contribution of previous samples to the covariance matrix. This makes the filter more sensitive to recent samples, which means more fluctuation in the filter coefficients. The lambda = 1 case is referred to as the growing window RLS algorithm. In practice, lambda is usually chosen between 0.98 and 1.
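The recursion behind this update can be sketched as follows for a single feature plus intercept (plain Java; the class name `Rls` and the initialization constants are illustrative, not the Smile implementation, which operates on the full design matrix):

```java
// Recursive least squares with a forgetting factor, for y = w0*x + w1.
// Standard RLS recursion; illustrative sketch only.
public class Rls {
    double[] w = {0, 0};                  // coefficients {slope, intercept}
    double[][] P = {{1e6, 0}, {0, 1e6}};  // inverse covariance, large initial value

    void update(double x, double y, double lambda) {
        double[] phi = {x, 1.0};          // regressor with intercept term
        // gain k = P*phi / (lambda + phi' * P * phi)
        double[] pPhi = {
            P[0][0] * phi[0] + P[0][1] * phi[1],
            P[1][0] * phi[0] + P[1][1] * phi[1]
        };
        double denom = lambda + phi[0] * pPhi[0] + phi[1] * pPhi[1];
        double[] k = {pPhi[0] / denom, pPhi[1] / denom};
        // coefficient update driven by the prediction error
        double err = y - (w[0] * phi[0] + w[1] * phi[1]);
        w[0] += k[0] * err;
        w[1] += k[1] * err;
        // covariance update, discounted by the forgetting factor
        for (int i = 0; i < 2; i++)
            for (int j = 0; j < 2; j++)
                P[i][j] = (P[i][j] - k[i] * pPhi[j]) / lambda;
    }

    public static void main(String[] args) {
        Rls rls = new Rls();
        // Noiseless samples from y = 2x + 1; lambda = 1 is the growing window case.
        for (int x = 1; x <= 10; x++) rls.update(x, 2.0 * x + 1.0, 1.0);
        System.out.printf("slope=%.4f intercept=%.4f%n", rls.w[0], rls.w[1]);
    }
}
```

With lambda = 1 every sample is weighted equally and the coefficients converge to the ordinary least squares solution; with lambda below 1 the division of P by lambda keeps the gain from vanishing, so the filter keeps tracking recent samples.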

toString
public String toString()
