Package smile.math
Interface TimeFunction
- All Superinterfaces:
Serializable
A time-dependent function. When training a neural network model,
it is often recommended to lower the learning rate as the training
progresses. Besides serving as a learning rate schedule, it may also be used
as a 1-dimensional neighborhood function, among other purposes.
-
Method Summary
Modifier and Type | Method | Description
double | apply(int t) | Returns the function value at time step t.
static TimeFunction | constant(double alpha) | Returns the constant learning rate.
static TimeFunction | cosine(double minLearningRate, double decaySteps, double maxLearningRate) | Returns the cosine annealing function.
static TimeFunction | exp(double initLearningRate, double decaySteps) | Returns the exponential decay function.
static TimeFunction | exp(double initLearningRate, double decaySteps, double endLearningRate) | Returns the exponential decay function.
static TimeFunction | exp(double initLearningRate, double decaySteps, double decayRate, boolean staircase) | Returns the exponential decay function.
static TimeFunction | inverse(double initLearningRate, double decaySteps) | Returns the inverse decay function.
static TimeFunction | inverse(double initLearningRate, double decaySteps, double decayRate) | Returns the inverse decay function.
static TimeFunction | inverse(double initLearningRate, double decaySteps, double decayRate, boolean staircase) | Returns the inverse decay function.
static TimeFunction | linear(double initLearningRate, double decaySteps, double endLearningRate) | Returns the linear learning rate decay function that starts with an initial learning rate and reaches an end learning rate in the given decay steps.
static TimeFunction | of | Parses a time function.
static TimeFunction | piecewise(int[] milestones, double[] values) | Returns the piecewise constant learning rate.
static TimeFunction | piecewise(int[] milestones, TimeFunction... schedules) | Returns the piecewise constant learning rate.
static TimeFunction | polynomial(double degree, double initLearningRate, double decaySteps, double endLearningRate) | Returns the polynomial learning rate decay function that starts with an initial learning rate and reaches an end learning rate in the given decay steps, without cycling.
static TimeFunction | polynomial(double degree, double initLearningRate, double decaySteps, double endLearningRate, boolean cycle) | Returns the polynomial learning rate decay function that starts with an initial learning rate and reaches an end learning rate in the given decay steps.
-
Method Details
-
apply
double apply(int t)
Returns the function value at time step t.
- Parameters:
t
- the discrete time step.
- Returns:
- the function value.
-
constant
static TimeFunction constant(double alpha)
Returns the constant learning rate.
- Parameters:
alpha
- the learning rate.
- Returns:
- the constant learning rate function.
-
piecewise
static TimeFunction piecewise(int[] milestones, double[] values)
Returns the piecewise constant learning rate. This can be useful for changing the learning rate value across different invocations of optimizer functions.
- Parameters:
milestones
- List of batch indices. Must be increasing.
values
- The values for each interval defined by milestones. It should have one more element than milestones.
- Returns:
- the piecewise learning rate function.
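The lookup described above can be sketched in plain Java as follows. This is a standalone illustration, not Smile's implementation: the class name `PiecewiseConstantSketch` is hypothetical, and the boundary convention (a step equal to a milestone still belongs to the earlier interval) is an assumption.

```java
/** Standalone sketch of a piecewise constant schedule; not Smile's implementation. */
final class PiecewiseConstantSketch {
    /**
     * Returns values[i], where i is the number of milestones strictly below t.
     * Assumption: a step equal to a milestone belongs to the earlier interval.
     */
    static double valueAt(int[] milestones, double[] values, int t) {
        if (values.length != milestones.length + 1) {
            throw new IllegalArgumentException("values must have one more element than milestones");
        }
        int i = 0;
        while (i < milestones.length && t > milestones[i]) {
            i++;
        }
        return values[i];
    }

    public static void main(String[] args) {
        int[] milestones = {100, 200};
        double[] values = {0.1, 0.01, 0.001};
        for (int t : new int[] {0, 100, 101, 250}) {
            System.out.println(t + " -> " + valueAt(milestones, values, t));
        }
    }
}
```

With two milestones the schedule has three intervals, which is why values needs one more element than milestones.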
-
piecewise
static TimeFunction piecewise(int[] milestones, TimeFunction... schedules)
Returns the piecewise constant learning rate. This can be useful for changing the learning rate value across different invocations of optimizer functions.
- Parameters:
milestones
- List of batch indices. Must be increasing.
schedules
- The time functions for each interval defined by milestones. It should have one more element than milestones.
- Returns:
- the piecewise learning rate function.
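Switching between schedules at milestones might look like the sketch below, written in plain Java with `java.util.function.IntToDoubleFunction` standing in for TimeFunction. The class name `PiecewiseScheduleSketch` is hypothetical. Two behaviors are assumptions rather than facts about Smile: each schedule receives the global step t, and a step equal to a milestone belongs to the earlier interval.

```java
/** Standalone sketch of switching schedules at milestones; not Smile's implementation. */
final class PiecewiseScheduleSketch {
    /**
     * Evaluates the schedule that covers step t.
     * Assumptions: schedules see the global step t, and a step equal to
     * a milestone belongs to the earlier interval.
     */
    static double valueAt(int t, int[] milestones,
                          java.util.function.IntToDoubleFunction... schedules) {
        if (schedules.length != milestones.length + 1) {
            throw new IllegalArgumentException("schedules must have one more element than milestones");
        }
        int i = 0;
        while (i < milestones.length && t > milestones[i]) {
            i++;
        }
        return schedules[i].applyAsDouble(t);
    }

    public static void main(String[] args) {
        int[] milestones = {100};
        // Linear warm-up for the first 100 steps, then inverse decay.
        for (int t : new int[] {50, 100, 300}) {
            double rate = valueAt(t, milestones,
                    step -> 0.001 * step,
                    step -> 0.1 * 100.0 / step);
            System.out.println(t + " -> " + rate);
        }
    }
}
```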
-
linear
static TimeFunction linear(double initLearningRate, double decaySteps, double endLearningRate)
Returns the linear learning rate decay function that starts with an initial learning rate and reaches an end learning rate in the given decay steps.
- Parameters:
initLearningRate
- the initial learning rate.
decaySteps
- the decay steps.
endLearningRate
- the end learning rate.
- Returns:
- the linear learning rate function.
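The page gives no formula for the linear schedule, so the following plain Java sketch uses straight-line interpolation as one plausible reading of the description. The class name `LinearDecaySketch` is hypothetical, and holding the rate at endLearningRate once t exceeds decaySteps is an assumption.

```java
/** Standalone sketch of linear learning rate decay; not Smile's implementation. */
final class LinearDecaySketch {
    /**
     * Interpolates from initLearningRate to endLearningRate over decaySteps.
     * Assumption: the rate stays at endLearningRate once t exceeds decaySteps.
     */
    static double valueAt(double initLearningRate, double decaySteps,
                          double endLearningRate, int t) {
        double fraction = Math.min(t, decaySteps) / decaySteps;
        return initLearningRate + (endLearningRate - initLearningRate) * fraction;
    }

    public static void main(String[] args) {
        for (int t : new int[] {0, 500, 1000, 2000}) {
            System.out.println(t + " -> " + valueAt(0.1, 1000, 0.01, t));
        }
    }
}
```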
-
polynomial
static TimeFunction polynomial(double degree, double initLearningRate, double decaySteps, double endLearningRate)
Returns the polynomial learning rate decay function that starts with an initial learning rate and reaches an end learning rate in the given decay steps, without cycling.
It is commonly observed that a monotonically decreasing learning rate, whose degree of change is carefully chosen, results in a better performing model.
- Parameters:
degree
- the degree of the polynomial.
initLearningRate
- the initial learning rate.
decaySteps
- the decay steps.
endLearningRate
- the end learning rate.
- Returns:
- the polynomial learning rate function.
-
polynomial
static TimeFunction polynomial(double degree, double initLearningRate, double decaySteps, double endLearningRate, boolean cycle)
Returns the polynomial learning rate decay function that starts with an initial learning rate and reaches an end learning rate in the given decay steps.
It is commonly observed that a monotonically decreasing learning rate, whose degree of change is carefully chosen, results in a better performing model.
- Parameters:
degree
- the degree of the polynomial.
initLearningRate
- the initial learning rate.
decaySteps
- the decay steps.
endLearningRate
- the end learning rate.
cycle
- the flag indicating if it should cycle beyond decaySteps.
- Returns:
- the polynomial learning rate function.
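The page does not state the polynomial formula. The sketch below assumes the widely used form (initLearningRate - endLearningRate) * (1 - t / decaySteps)^degree + endLearningRate, and assumes that cycling stretches the horizon to the next multiple of decaySteps, as some other libraries do. Both are assumptions about behavior, and the class name `PolynomialDecaySketch` is hypothetical.

```java
/** Standalone sketch of polynomial learning rate decay; not Smile's implementation. */
final class PolynomialDecaySketch {
    static double valueAt(double degree, double initLearningRate, double decaySteps,
                          double endLearningRate, boolean cycle, int t) {
        double steps = decaySteps;
        double step = t;
        if (cycle) {
            // Assumed cycling rule: extend the horizon to the next multiple
            // of decaySteps at or beyond t.
            steps = decaySteps * Math.max(1.0, Math.ceil(t / decaySteps));
        } else {
            // Without cycling, hold the end learning rate after decaySteps.
            step = Math.min(t, decaySteps);
        }
        double remaining = 1.0 - step / steps;
        return (initLearningRate - endLearningRate) * Math.pow(remaining, degree)
                + endLearningRate;
    }

    public static void main(String[] args) {
        for (int t : new int[] {0, 500, 1500}) {
            System.out.println(t + " no cycle -> " + valueAt(2.0, 0.1, 1000, 0.01, false, t));
            System.out.println(t + " cycle    -> " + valueAt(2.0, 0.1, 1000, 0.01, true, t));
        }
    }
}
```

Under these assumptions a degree of 1 reduces to linear decay, and the two variants differ only once t passes decaySteps.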
-
inverse
static TimeFunction inverse(double initLearningRate, double decaySteps)
Returns the inverse decay function:
initLearningRate * decaySteps / (t + decaySteps)
- Parameters:
initLearningRate
- the initial learning rate.
decaySteps
- the decay steps, which should be a small percentage of the number of iterations.
- Returns:
- the inverse decay function.
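To show how the formula above behaves, here is a standalone plain Java sketch (the class name `InverseTimeDecaySketch` is hypothetical, and this is not Smile's code). The rate halves when t reaches decaySteps, which is why decaySteps controls how quickly the schedule cools.

```java
/** Standalone sketch of initLearningRate * decaySteps / (t + decaySteps). */
final class InverseTimeDecaySketch {
    static double valueAt(double initLearningRate, double decaySteps, int t) {
        return initLearningRate * decaySteps / (t + decaySteps);
    }

    public static void main(String[] args) {
        // At t = decaySteps the rate is half of the initial rate.
        for (int t : new int[] {0, 100, 300}) {
            System.out.println(t + " -> " + valueAt(0.1, 100, t));
        }
    }
}
```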
-
inverse
static TimeFunction inverse(double initLearningRate, double decaySteps, double decayRate)
Returns the inverse decay function:
initLearningRate / (1 + decayRate * t / decaySteps)
- Parameters:
initLearningRate
- the initial learning rate.
decaySteps
- how often to apply decay.
decayRate
- the decay rate.
- Returns:
- the inverse decay function.
-
inverse
static TimeFunction inverse(double initLearningRate, double decaySteps, double decayRate, boolean staircase)
Returns the inverse decay function:
initLearningRate / (1 + decayRate * t / decaySteps)
- Parameters:
initLearningRate
- the initial learning rate.
decaySteps
- how often to apply decay.
decayRate
- the decay rate.
staircase
- the flag whether to apply decay in a discrete staircase, as opposed to continuous, fashion.
- Returns:
- the inverse decay function.
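The difference between continuous and staircase decay can be sketched in plain Java as below. The formula is the one given above. Implementing the staircase by taking floor(t / decaySteps) is an assumption based on common practice, and the class name `InverseRateDecaySketch` is hypothetical.

```java
/** Standalone sketch of initLearningRate / (1 + decayRate * t / decaySteps). */
final class InverseRateDecaySketch {
    static double valueAt(double initLearningRate, double decaySteps, double decayRate,
                          boolean staircase, int t) {
        double p = t / decaySteps;
        if (staircase) {
            // Assumed staircase rule: decay only at whole multiples of decaySteps.
            p = Math.floor(p);
        }
        return initLearningRate / (1.0 + decayRate * p);
    }

    public static void main(String[] args) {
        for (int t : new int[] {0, 99, 100, 150}) {
            System.out.println(t + " continuous -> " + valueAt(0.1, 100, 0.5, false, t));
            System.out.println(t + " staircase  -> " + valueAt(0.1, 100, 0.5, true, t));
        }
    }
}
```

With staircase set to false, the same sketch covers the three-argument overload.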
-
exp
static TimeFunction exp(double initLearningRate, double decaySteps)
Returns the exponential decay function:
initLearningRate * exp(-t / decaySteps)
- Parameters:
initLearningRate
- the initial learning rate.
decaySteps
- the decay steps, which should be a small percentage of the number of iterations.
- Returns:
- the exponential decay function.
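A standalone plain Java sketch of the formula above follows (the class name `ExpDecaySketch` is hypothetical, and this is not Smile's code). Here decaySteps acts as a time constant: after decaySteps steps the rate has fallen to 1/e of its initial value.

```java
/** Standalone sketch of initLearningRate * exp(-t / decaySteps). */
final class ExpDecaySketch {
    static double valueAt(double initLearningRate, double decaySteps, int t) {
        return initLearningRate * Math.exp(-t / decaySteps);
    }

    public static void main(String[] args) {
        for (int t : new int[] {0, 1000, 3000}) {
            System.out.println(t + " -> " + valueAt(0.1, 1000, t));
        }
    }
}
```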
-
exp
static TimeFunction exp(double initLearningRate, double decaySteps, double endLearningRate)
Returns the exponential decay function:
initLearningRate * pow(endLearningRate / initLearningRate, min(t, decaySteps) / decaySteps)
- Parameters:
initLearningRate
- the initial learning rate.
decaySteps
- the maximum decay steps.
endLearningRate
- the end learning rate.
- Returns:
- the exponential decay function.
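The formula above interpolates geometrically between the two rates and then holds the end rate. A standalone plain Java sketch (hypothetical class name `ExpToEndSketch`, not Smile's code) makes the clamping visible:

```java
/** Standalone sketch of exponential decay toward a fixed end learning rate. */
final class ExpToEndSketch {
    static double valueAt(double initLearningRate, double decaySteps,
                          double endLearningRate, int t) {
        // min(t, decaySteps) keeps the rate at endLearningRate after decaySteps.
        double fraction = Math.min(t, decaySteps) / decaySteps;
        return initLearningRate * Math.pow(endLearningRate / initLearningRate, fraction);
    }

    public static void main(String[] args) {
        // Halfway through, the rate is the geometric mean of the two endpoints.
        for (int t : new int[] {0, 500, 1000, 2000}) {
            System.out.println(t + " -> " + valueAt(0.1, 1000, 0.001, t));
        }
    }
}
```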
-
exp
static TimeFunction exp(double initLearningRate, double decaySteps, double decayRate, boolean staircase)
Returns the exponential decay function:
initLearningRate * pow(decayRate, t / decaySteps)
- Parameters:
initLearningRate
- the initial learning rate.
decaySteps
- how often to apply decay.
decayRate
- the decay rate.
staircase
- the flag whether to apply decay in a discrete staircase, as opposed to continuous, fashion.
- Returns:
- the exponential decay function.
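A standalone plain Java sketch of the formula above, with and without the staircase flag, is shown below. Taking floor(t / decaySteps) for the staircase case is an assumption based on common practice, and the class name `ExpRateDecaySketch` is hypothetical.

```java
/** Standalone sketch of initLearningRate * pow(decayRate, t / decaySteps). */
final class ExpRateDecaySketch {
    static double valueAt(double initLearningRate, double decaySteps, double decayRate,
                          boolean staircase, int t) {
        double p = t / decaySteps;
        if (staircase) {
            // Assumed staircase rule: the exponent only changes every decaySteps steps.
            p = Math.floor(p);
        }
        return initLearningRate * Math.pow(decayRate, p);
    }

    public static void main(String[] args) {
        for (int t : new int[] {0, 100, 150, 200}) {
            System.out.println(t + " continuous -> " + valueAt(0.1, 100, 0.5, false, t));
            System.out.println(t + " staircase  -> " + valueAt(0.1, 100, 0.5, true, t));
        }
    }
}
```

In this example a decayRate of 0.5 halves the rate every decaySteps steps.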
-
cosine
static TimeFunction cosine(double minLearningRate, double decaySteps, double maxLearningRate)
Returns the cosine annealing function. Cosine annealing starts with a large learning rate that is decreased relatively rapidly to a minimum value before being increased rapidly again. The resetting of the learning rate acts like a simulated restart of the learning process. Reusing good weights as the starting point of the restart is referred to as a "warm restart", in contrast to a "cold restart", where a new set of small random numbers may be used as the starting point.
- Parameters:
minLearningRate
- the minimum learning rate.
decaySteps
- the maximum decay steps.
maxLearningRate
- the maximum learning rate. It also serves as the initial learning rate.
- Returns:
- the cosine decay function.
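A single annealing cycle can be sketched in plain Java as follows. The formula used here, minLearningRate + 0.5 * (maxLearningRate - minLearningRate) * (1 + cos(pi * t / decaySteps)), is the standard cosine annealing expression and is an assumption about this method, as is holding the minimum rate once t exceeds decaySteps. The class name `CosineAnnealingSketch` is hypothetical.

```java
/** Standalone sketch of one cosine annealing cycle; not Smile's implementation. */
final class CosineAnnealingSketch {
    /**
     * Assumptions: standard cosine annealing formula, with the rate held at
     * minLearningRate after decaySteps (no restart is modeled here).
     */
    static double valueAt(double minLearningRate, double decaySteps,
                          double maxLearningRate, int t) {
        double fraction = Math.min(t, decaySteps) / decaySteps;
        double cosine = 0.5 * (1.0 + Math.cos(Math.PI * fraction));
        return minLearningRate + (maxLearningRate - minLearningRate) * cosine;
    }

    public static void main(String[] args) {
        // Starts at the maximum, passes the midpoint halfway, ends at the minimum.
        for (int t : new int[] {0, 500, 1000, 1500}) {
            System.out.println(t + " -> " + valueAt(0.001, 1000, 0.1, t));
        }
    }
}
```

A warm restart would reset t to 0 and replay the cycle, which this sketch leaves to the caller.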
-
of
Parses a time function.
- Parameters:
time
- the time function representation.
- Returns:
- the time function.
-