Class Optimizer
java.lang.Object
    smile.deep.Optimizer

Optimizer functions.
Method Summary
static Optimizer    Adam(...)
    Returns an Adam optimizer.
static Optimizer    Adam(Model model, double rate, double beta1, double beta2, double eps, double decay, boolean amsgrad)
    Returns an Adam optimizer.
static Optimizer    AdamW(...)
    Returns an AdamW optimizer.
static Optimizer    AdamW(Model model, double rate, double beta1, double beta2, double eps, double decay, boolean amsgrad)
    Returns an AdamW optimizer.
void    reset()
    Resets gradients.
static Optimizer    RMSprop(...)
    Returns an RMSprop optimizer.
static Optimizer    RMSprop(Model model, double rate, double alpha, double eps, double decay, double momentum, boolean centered)
    Returns an RMSprop optimizer.
void    setLearningRate(double rate)
    Sets the learning rate.
static Optimizer    SGD(...)
    Returns a stochastic gradient descent optimizer without momentum.
static Optimizer    SGD(Model model, double rate, double momentum, double decay, double dampening, boolean nesterov)
    Returns a stochastic gradient descent optimizer with momentum.
void    step()
    Updates the parameters based on the calculated gradients.
Method Details
reset
public void reset()
Resets gradients.

step
public void step()
Updates the parameters based on the calculated gradients.
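A minimal sketch of the per-iteration pattern these two methods imply: clear the gradients from the previous batch, recompute them, then apply the update. How the loss and its gradients are computed is outside this class, so the backward pass is passed in as a Runnable here, and the helper class name is illustrative, not part of the library.

    import smile.deep.Optimizer;

    class UpdateStep {
        // One parameter update: reset() clears previously accumulated gradients,
        // the supplied backward pass recomputes them, and step() applies them.
        static void update(Optimizer optimizer, Runnable backwardPass) {
            optimizer.reset();
            backwardPass.run();
            optimizer.step();
        }
    }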
setLearningRate
public void setLearningRate(double rate)
Sets the learning rate.
Parameters:
    rate - the learning rate.
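Because the rate can be changed at any time, learning-rate schedules can be built on top of this method. A minimal sketch of exponential decay between epochs, assuming an initial rate and decay factor chosen by the caller (the schedule and helper class are not provided by the library):

    import smile.deep.Optimizer;

    class ExponentialDecay {
        // Sets the rate to initialRate * gamma^epoch, e.g. 0.1 * 0.95^3 ~= 0.0857.
        static void apply(Optimizer optimizer, double initialRate, double gamma, int epoch) {
            optimizer.setLearningRate(initialRate * Math.pow(gamma, epoch));
        }
    }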
SGD
public static Optimizer SGD(Model model, double rate, double momentum, double decay, double dampening, boolean nesterov)
Returns a stochastic gradient descent optimizer with momentum.
Parameters:
    model - the model to be optimized.
    rate - the learning rate.
    momentum - the momentum factor.
    decay - the weight decay (L2 penalty).
    dampening - dampening for momentum.
    nesterov - enables Nesterov momentum.
Returns:
    the optimizer.
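Because the hyper-parameters are positional doubles, they are easy to transpose; a sketch with each argument labeled. The values are common choices, not defaults of this class, and the helper class name is illustrative.

    import smile.deep.Model;
    import smile.deep.Optimizer;

    class SgdSetup {
        static Optimizer create(Model model) {
            return Optimizer.SGD(model,
                    0.01,   // rate: learning rate
                    0.9,    // momentum: momentum factor
                    5e-4,   // decay: weight decay (L2 penalty)
                    0.0,    // dampening: dampening for momentum
                    true);  // nesterov: enable Nesterov momentum
        }
    }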
Adam
public static Optimizer Adam(Model model, double rate, double beta1, double beta2, double eps, double decay, boolean amsgrad)
Returns an Adam optimizer.
Parameters:
    model - the model to be optimized.
    rate - the learning rate.
    beta1 - the coefficient used for computing the running average of the gradient.
    beta2 - the coefficient used for computing the running average of the squared gradient.
    eps - the term added to the denominator to improve numerical stability.
    decay - the weight decay (L2 penalty).
    amsgrad - whether to use the AMSGrad variant of this algorithm from the paper "On the Convergence of Adam and Beyond".
Returns:
    the optimizer.
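A sketch using the hyper-parameter values suggested in the Adam paper; the values and the helper class name are illustrative, not defaults of this class.

    import smile.deep.Model;
    import smile.deep.Optimizer;

    class AdamSetup {
        static Optimizer create(Model model) {
            return Optimizer.Adam(model,
                    0.001,  // rate: learning rate
                    0.9,    // beta1: running average coefficient for the gradient
                    0.999,  // beta2: running average coefficient for the squared gradient
                    1e-8,   // eps: numerical stability term
                    0.0,    // decay: weight decay (L2 penalty)
                    false); // amsgrad: plain Adam rather than the AMSGrad variant
        }
    }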
AdamW
public static Optimizer AdamW(Model model, double rate, double beta1, double beta2, double eps, double decay, boolean amsgrad)
Returns an AdamW optimizer.
Parameters:
    model - the model to be optimized.
    rate - the learning rate.
    beta1 - the coefficient used for computing the running average of the gradient.
    beta2 - the coefficient used for computing the running average of the squared gradient.
    eps - the term added to the denominator to improve numerical stability.
    decay - the weight decay.
    amsgrad - whether to use the AMSGrad variant of this algorithm from the paper "On the Convergence of Adam and Beyond".
Returns:
    the optimizer.
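AdamW takes the same arguments as Adam and is typically run with a non-zero weight decay, since decoupled weight decay is its distinguishing feature. A sketch with illustrative values; the helper class name is not part of the library.

    import smile.deep.Model;
    import smile.deep.Optimizer;

    class AdamWSetup {
        static Optimizer create(Model model) {
            return Optimizer.AdamW(model,
                    0.001,  // rate: learning rate
                    0.9,    // beta1: running average coefficient for the gradient
                    0.999,  // beta2: running average coefficient for the squared gradient
                    1e-8,   // eps: numerical stability term
                    0.01,   // decay: weight decay
                    false); // amsgrad: disable the AMSGrad variant
        }
    }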
RMSprop
public static Optimizer RMSprop(Model model, double rate, double alpha, double eps, double decay, double momentum, boolean centered)
Returns an RMSprop optimizer.
Parameters:
    model - the model to be optimized.
    rate - the learning rate.
    alpha - the smoothing constant.
    eps - the term added to the denominator to improve numerical stability.
    decay - the weight decay (L2 penalty).
    momentum - the momentum factor.
    centered - if true, compute centered RMSprop, in which the gradient is normalized by an estimate of its variance.
Returns:
    the optimizer.
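A sketch of a centered RMSprop configuration with illustrative values; the helper class name is not part of the library.

    import smile.deep.Model;
    import smile.deep.Optimizer;

    class RmsPropSetup {
        static Optimizer create(Model model) {
            return Optimizer.RMSprop(model,
                    0.001,  // rate: learning rate
                    0.99,   // alpha: smoothing constant
                    1e-8,   // eps: numerical stability term
                    0.0,    // decay: weight decay (L2 penalty)
                    0.9,    // momentum: momentum factor
                    true);  // centered: normalize the gradient by an estimate of its variance
        }
    }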