Package smile.math

Interface TimeFunction

All Superinterfaces:
Serializable

public interface TimeFunction extends Serializable
A time-dependent function. When training a neural network model, it is often recommended to lower the learning rate as the training progresses. Besides learning rate schedules, it may also be used as a 1-dimensional neighborhood function, etc.
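For example, a schedule is typically created from one of the static factory methods below and queried with apply(int) at each step of the training loop. A minimal, self-contained sketch (the decay constants are illustrative only):

    import smile.math.TimeFunction;

    public class ScheduleDemo {
        public static void main(String[] args) {
            // Exponential decay: initLearningRate * exp(-t / decaySteps),
            // per the documentation of exp(double, double) below.
            TimeFunction schedule = TimeFunction.exp(0.1, 1000);

            // Query the learning rate at a few time steps.
            for (int t : new int[] {0, 500, 1000, 5000}) {
                System.out.printf("t = %4d, learning rate = %.6f%n", t, schedule.apply(t));
            }
        }
    }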
  • Method Summary

    Modifier and Type
    Method
    Description
    double
    apply(int t)
    Returns the function value at time step t.
    static TimeFunction
    constant(double alpha)
    Returns the constant learning rate.
    static TimeFunction
    cosine(double minLearningRate, double decaySteps, double maxLearningRate)
    Returns the cosine annealing function.
    static TimeFunction
    exp(double initLearningRate, double decaySteps)
    Returns the exponential decay function.
    static TimeFunction
    exp(double initLearningRate, double decaySteps, double endLearningRate)
    Returns the exponential decay function.
    static TimeFunction
    exp(double initLearningRate, double decaySteps, double decayRate, boolean staircase)
    Returns the exponential decay function.
    static TimeFunction
    inverse(double initLearningRate, double decaySteps)
    Returns the inverse decay function.
    static TimeFunction
    inverse(double initLearningRate, double decaySteps, double decayRate)
    Returns the inverse decay function.
    static TimeFunction
    inverse(double initLearningRate, double decaySteps, double decayRate, boolean staircase)
    Returns the inverse decay function.
    static TimeFunction
    linear(double initLearningRate, double decaySteps, double endLearningRate)
    Returns the linear learning rate decay function that starts with an initial learning rate and reaches an end learning rate in the given decay steps.
    static TimeFunction
    of(String time)
    Parses a time function.
    static TimeFunction
    piecewise(int[] milestones, double[] values)
    Returns the piecewise constant learning rate.
    static TimeFunction
    piecewise(int[] milestones, TimeFunction... schedules)
    Returns the piecewise constant learning rate.
    static TimeFunction
    polynomial(double degree, double initLearningRate, double decaySteps, double endLearningRate)
    Returns the polynomial learning rate decay function that starts with an initial learning rate and reaches an end learning rate in the given decay steps, without cycling.
    static TimeFunction
    polynomial(double degree, double initLearningRate, double decaySteps, double endLearningRate, boolean cycle)
    Returns the polynomial learning rate decay function that starts with an initial learning rate and reaches an end learning rate in the given decay steps.
  • Method Details

    • apply

      double apply(int t)
      Returns the function value at time step t.
      Parameters:
      t - the discrete time step.
      Returns:
      the function value.
    • constant

      static TimeFunction constant(double alpha)
      Returns the constant learning rate.
      Parameters:
      alpha - the learning rate.
      Returns:
      the constant learning rate function.
    • piecewise

      static TimeFunction piecewise(int[] milestones, double[] values)
      Returns the piecewise constant learning rate. This can be useful for changing the learning rate value across different invocations of optimizer functions.
      Parameters:
      milestones - List of batch indices. Must be increasing.
      values - The values for each interval defined by milestones. It should have one more element than milestones.
      Returns:
      the piecewise learning rate function.
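      For illustration, a sketch of a three-interval schedule (whether a milestone step itself falls in the earlier or the later interval is not specified here):

        int[] milestones = {5000, 10000};
        double[] values = {0.1, 0.01, 0.001};
        // Roughly 0.1 before the first milestone, 0.01 between the two
        // milestones, and 0.001 after the last one.
        TimeFunction lr = TimeFunction.piecewise(milestones, values);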
    • piecewise

      static TimeFunction piecewise(int[] milestones, TimeFunction... schedules)
      Returns the piecewise constant learning rate. This can be useful for changing the learning rate value across different invocations of optimizer functions.
      Parameters:
      milestones - List of batch indices. Must be increasing.
      schedules - The time functions for each interval defined by milestones. It should have one more element than milestones.
      Returns:
      the piecewise learning rate function.
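      For illustration, a sketch that keeps the learning rate constant for a warm-up period and then switches to exponential decay (whether each segment receives the global time step or a re-based one is an implementation detail not stated here):

        TimeFunction lr = TimeFunction.piecewise(new int[] {1000},
                TimeFunction.constant(0.1),
                TimeFunction.exp(0.1, 1000));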
    • linear

      static TimeFunction linear(double initLearningRate, double decaySteps, double endLearningRate)
      Returns the linear learning rate decay function that starts with an initial learning rate and reaches an end learning rate in the given decay steps.
      Parameters:
      initLearningRate - the initial learning rate.
      decaySteps - the decay steps.
      endLearningRate - the end learning rate.
      Returns:
      the linear learning rate function.
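      For illustration, a sketch of a schedule that decays from 0.1 toward 0.01 over 10000 steps (the exact interpolation and the behavior beyond decaySteps are not specified here):

        TimeFunction lr = TimeFunction.linear(0.1, 10000, 0.01);
        double start = lr.apply(0);     // expected to be near 0.1
        double end = lr.apply(10000);   // expected to be near 0.01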
    • polynomial

      static TimeFunction polynomial(double degree, double initLearningRate, double decaySteps, double endLearningRate)
      Returns the polynomial learning rate decay function that starts with an initial learning rate and reaches an end learning rate in the given decay steps, without cycling.

      It is commonly observed that a monotonically decreasing learning rate, whose degree of change is carefully chosen, results in a better performing model.

      Parameters:
      degree - the degree of the polynomial.
      initLearningRate - the initial learning rate.
      decaySteps - the decay steps.
      endLearningRate - the end learning rate.
      Returns:
      the polynomial learning rate function.
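      For illustration, a sketch of a quadratic (degree 2) decay from 0.1 to 0.001 over 10000 steps; without cycling, the rate is expected to stay at the end value once t exceeds decaySteps:

        TimeFunction lr = TimeFunction.polynomial(2.0, 0.1, 10000, 0.001);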
    • polynomial

      static TimeFunction polynomial(double degree, double initLearningRate, double decaySteps, double endLearningRate, boolean cycle)
      Returns the polynomial learning rate decay function that starts with an initial learning rate and reaches an end learning rate in the given decay steps.

      It is commonly observed that a monotonically decreasing learning rate, whose degree of change is carefully chosen, results in a better performing model.

      Parameters:
      degree - the degree of the polynomial.
      initLearningRate - the initial learning rate.
      decaySteps - the decay steps.
      endLearningRate - the end learning rate.
      cycle - the flag indicating if it should cycle beyond decaySteps.
      Returns:
      the polynomial learning rate function.
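      For illustration, the same schedule with cycle = true; the exact cycling rule once t exceeds decaySteps is an implementation detail not documented here:

        TimeFunction lr = TimeFunction.polynomial(2.0, 0.1, 10000, 0.001, true);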
    • inverse

      static TimeFunction inverse(double initLearningRate, double decaySteps)
      Returns the inverse decay function. initLearningRate * decaySteps / (t + decaySteps).
      Parameters:
      initLearningRate - the initial learning rate.
      decaySteps - the decay steps that should be a small percentage of the number of iterations.
      Returns:
      the inverse decay function.
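      For example, with initLearningRate = 0.1 and decaySteps = 100, the formula above gives:

        TimeFunction lr = TimeFunction.inverse(0.1, 100);
        // lr.apply(0)   = 0.1 * 100 / (0 + 100)   = 0.1
        // lr.apply(100) = 0.1 * 100 / (100 + 100) = 0.05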
    • inverse

      static TimeFunction inverse(double initLearningRate, double decaySteps, double decayRate)
      Returns the inverse decay function. initLearningRate / (1 + decayRate * t / decaySteps).
      Parameters:
      initLearningRate - the initial learning rate.
      decaySteps - how often to apply decay.
      decayRate - the decay rate.
      Returns:
      the inverse decay function.
    • inverse

      static TimeFunction inverse(double initLearningRate, double decaySteps, double decayRate, boolean staircase)
      Returns the inverse decay function. initLearningRate / (1 + decayRate * t / decaySteps).
      Parameters:
      initLearningRate - the initial learning rate.
      decaySteps - how often to apply decay.
      decayRate - the decay rate.
      staircase - the flag whether to apply decay in a discrete staircase, as opposed to continuous, fashion.
      Returns:
      the inverse decay function.
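      For illustration, a sketch with decayRate = 0.5 and decaySteps = 100; with staircase = true, the quotient t / decaySteps is presumably evaluated as an integer division, so the rate stays flat within each block of 100 steps instead of decreasing continuously:

        TimeFunction lr = TimeFunction.inverse(0.1, 100, 0.5, true);
        // Continuous form (staircase = false): 0.1 / (1 + 0.5 * t / 100).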
    • exp

      static TimeFunction exp(double initLearningRate, double decaySteps)
      Returns the exponential decay function. initLearningRate * exp(-t / decaySteps).
      Parameters:
      initLearningRate - the initial learning rate.
      decaySteps - the decay steps that should be a small percentage of the number of iterations.
      Returns:
      the exponential decay function.
    • exp

      static TimeFunction exp(double initLearningRate, double decaySteps, double endLearningRate)
      Returns the exponential decay function. initLearningRate * pow(endLearningRate / initLearningRate, min(t, decaySteps) / decaySteps).
      Parameters:
      initLearningRate - the initial learning rate.
      decaySteps - the maximum decay steps.
      endLearningRate - the end learning rate.
      Returns:
      the exponential decay function.
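      For illustration, per the formula above the rate moves geometrically from the initial value at t = 0 to the end value at t = decaySteps and, because of min(t, decaySteps), stays there afterwards:

        TimeFunction lr = TimeFunction.exp(0.1, 10000, 0.01);
        // lr.apply(0)     = 0.1 * pow(0.01 / 0.1, 0) = 0.1
        // lr.apply(10000) = 0.1 * pow(0.01 / 0.1, 1) = 0.01 (and remains 0.01 for larger t)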
    • exp

      static TimeFunction exp(double initLearningRate, double decaySteps, double decayRate, boolean staircase)
      Returns the exponential decay function. initLearningRate * pow(decayRate, t / decaySteps).
      Parameters:
      initLearningRate - the initial learning rate.
      decaySteps - how often to apply decay.
      decayRate - the decay rate.
      staircase - the flag whether to apply decay in a discrete staircase, as opposed to continuous, fashion.
      Returns:
      the exponential decay function.
    • cosine

      static TimeFunction cosine(double minLearningRate, double decaySteps, double maxLearningRate)
      Returns the cosine annealing function. Cosine annealing has the effect of starting with a large learning rate that is relatively rapidly decreased to a minimum value before being increased rapidly again. The resetting of the learning rate acts like a simulated restart of the learning process, and the re-use of good weights as the starting point of the restart is referred to as a "warm restart", in contrast to a "cold restart" where a new set of small random numbers may be used as a starting point.
      Parameters:
      minLearningRate - the minimum learning rate.
      decaySteps - the maximum decay steps.
      maxLearningRate - the maximum learning rate. It also serves as the initial learning rate.
      Returns:
      the cosine decay function.
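      For illustration, a sketch of cosine annealing between 0.001 and 0.1 with a period of 1000 steps; the rate is expected to start at the maximum and anneal toward the minimum, restarting as described above (the exact cosine formula is an implementation detail not reproduced here):

        TimeFunction lr = TimeFunction.cosine(0.001, 1000, 0.1);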
    • of

      static TimeFunction of(String time)
      Parses a time function.
      Parameters:
      time - the time function representation.
      Returns:
      the time function.
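      The accepted string grammar is not documented here. A plausible usage, assuming the parser mirrors the factory method signatures on this page (verify against the implementation before relying on a particular format):

        // Assumption: an expression naming a factory method and its arguments.
        TimeFunction lr = TimeFunction.of("linear(0.1, 10000, 0.01)");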