Package smile.math

Class BFGS

java.lang.Object
smile.math.BFGS

public class BFGS extends Object
The Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm is an iterative method for solving unconstrained nonlinear optimization problems.

The BFGS method belongs to the class of quasi-Newton methods, hill-climbing optimization techniques that seek a stationary point of a (preferably twice continuously differentiable) function. For such problems, a necessary condition for optimality is that the gradient be zero. Newton's method and the BFGS method are not guaranteed to converge unless the function has a quadratic Taylor expansion near an optimum. However, BFGS has proven to perform well even on non-smooth optimization problems.

In quasi-Newton methods, the Hessian matrix of second derivatives is not computed. Instead, the Hessian matrix is approximated using updates specified by gradient evaluations (or approximate gradient evaluations). Quasi-Newton methods are generalizations of the secant method for finding the root of the first derivative in multidimensional problems. In multiple dimensions, the secant equation does not determine a unique solution, and quasi-Newton methods differ in how they constrain the solution. The BFGS method is one of the most popular members of this class; its update is shown below.
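For reference, the standard BFGS update of the inverse Hessian approximation (the textbook form, see e.g. reference 1 below; not quoted from this class's source) is

     H_{k+1} = (I − ρ_k s_k y_kᵀ) H_k (I − ρ_k y_k s_kᵀ) + ρ_k s_k s_kᵀ,    ρ_k = 1 / (y_kᵀ s_k),

where s_k = x_{k+1} − x_k and y_k = ∇f(x_{k+1}) − ∇f(x_k). This choice satisfies the secant equation H_{k+1} y_k = s_k.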

Like the original BFGS, the limited-memory BFGS (L-BFGS) uses an estimate of the inverse Hessian matrix to steer its search through variable space, but where BFGS stores a dense n × n approximation to the inverse Hessian (n being the number of variables in the problem), L-BFGS stores only a few vectors that represent the approximation implicitly. Due to its resulting linear memory requirement, the L-BFGS method is particularly well suited for optimization problems with a large number of variables (e.g., > 1000). Instead of the inverse Hessian H_k, L-BFGS maintains a history of the past m updates of the position x and gradient ∇f(x), where generally the history size m can be small (often m < 10). These updates are used to implicitly perform operations requiring the H_k-vector product, as in the two-loop recursion sketched below.
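As an illustration of how the H_k-vector product can be formed without storing H_k, the following is a minimal sketch of the standard two-loop recursion (a textbook algorithm; it is not Smile's internal implementation, and the helper names are hypothetical):

    class LBFGSSketch {
        // Compute the search direction -H_k * g via the standard L-BFGS
        // two-loop recursion. s[i] = x_{i+1} - x_i and y[i] = grad_{i+1} - grad_i
        // are the m most recent correction pairs, ordered oldest to newest.
        static double[] lbfgsDirection(double[][] s, double[][] y, double[] g) {
            int m = s.length;
            int n = g.length;
            double[] alpha = new double[m];
            double[] rho = new double[m];
            double[] q = g.clone();

            // First loop: newest pair to oldest.
            for (int i = m - 1; i >= 0; i--) {
                rho[i] = 1.0 / dot(y[i], s[i]);
                alpha[i] = rho[i] * dot(s[i], q);
                axpy(-alpha[i], y[i], q);          // q -= alpha[i] * y[i]
            }

            // Initial scaling H_k^0 = gamma * I with gamma = s'y / y'y.
            double gamma = dot(s[m - 1], y[m - 1]) / dot(y[m - 1], y[m - 1]);
            double[] r = new double[n];
            for (int j = 0; j < n; j++) r[j] = gamma * q[j];

            // Second loop: oldest pair to newest.
            for (int i = 0; i < m; i++) {
                double beta = rho[i] * dot(y[i], r);
                axpy(alpha[i] - beta, s[i], r);    // r += (alpha[i] - beta) * s[i]
            }

            for (int j = 0; j < n; j++) r[j] = -r[j];  // negate: descent direction
            return r;
        }

        static double dot(double[] a, double[] b) {
            double sum = 0.0;
            for (int i = 0; i < a.length; i++) sum += a[i] * b[i];
            return sum;
        }

        static void axpy(double a, double[] x, double[] y) {
            for (int i = 0; i < x.length; i++) y[i] += a * x[i];
        }
    }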

References

  1. Roger Fletcher. Practical methods of optimization. 2nd Edition, John Wiley & Sons, 1987.
  2. D. Liu and J. Nocedal. On the limited memory BFGS method for large scale optimization. Mathematical Programming B 45 (1989) 503-528.
  3. Richard H. Byrd, Peihuang Lu, Jorge Nocedal and Ciyou Zhu. A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing 16 (1995) 1190-1208.
  • Constructor Details

    • BFGS

      public BFGS()
  • Method Details

    • minimize

      public static double minimize(DifferentiableMultivariateFunction func, double[] x, double gtol, int maxIter)
      This method solves the unconstrained minimization problem
           min f(x),    x = (x_1, x_2, ..., x_n),
       
      using the BFGS method.
      Parameters:
      func - the function to be minimized.
      x - on initial entry this must be set by the user to the values of the initial estimate of the solution vector. On exit, it contains the values of the variables at the best point found (usually a solution).
      gtol - the convergence tolerance on zeroing the gradient.
      maxIter - the maximum number of iterations.
      Returns:
      the minimum value of the function.
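      A short usage sketch (assuming Smile's DifferentiableMultivariateFunction contract, in which f(x) returns the function value and g(x, gradient) returns the value while filling the gradient array; consult that interface's own documentation to confirm):

      import smile.math.BFGS;
      import smile.math.DifferentiableMultivariateFunction;

      public class BFGSExample {
          public static void main(String[] args) {
              // 2-D Rosenbrock function: f(x) = 100 (x_2 - x_1^2)^2 + (1 - x_1)^2,
              // whose minimum value 0 is attained at (1, 1).
              DifferentiableMultivariateFunction rosenbrock = new DifferentiableMultivariateFunction() {
                  @Override
                  public double f(double[] x) {
                      double a = x[1] - x[0] * x[0];
                      double b = 1.0 - x[0];
                      return 100.0 * a * a + b * b;
                  }

                  @Override
                  public double g(double[] x, double[] gradient) {
                      double a = x[1] - x[0] * x[0];
                      gradient[0] = -400.0 * x[0] * a - 2.0 * (1.0 - x[0]);
                      gradient[1] = 200.0 * a;
                      return f(x);
                  }
              };

              double[] x = {-1.2, 1.0};  // the standard Rosenbrock starting point
              double fmin = BFGS.minimize(rosenbrock, x, 1E-8, 500);
              // x now holds the best point found; fmin is f at that point.
          }
      }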
    • minimize

      public static double minimize(DifferentiableMultivariateFunction func, int m, double[] x, double gtol, int maxIter)
      This method solves the unconstrained minimization problem
           min f(x),    x = (x_1, x_2, ..., x_n),
       
      using the limited-memory BFGS (L-BFGS) method. The method is especially effective on problems involving a large number of variables. In a typical iteration of this method, an approximation H_k to the inverse of the Hessian is obtained by applying m BFGS updates to a diagonal matrix H_k^0, using information from the previous m steps. The user specifies the number m, which determines the amount of storage required by the routine.
      Parameters:
      func - the function to be minimized.
      m - the number of corrections used in the L-BFGS update. Values of m less than 3 are not recommended, and large values of m result in excessive computing time. The range 3 <= m <= 7 is recommended; a common choice is m = 5.
      x - on initial entry this must be set by the user to the values of the initial estimate of the solution vector. On exit, it contains the values of the variables at the best point found (usually a solution).
      gtol - the convergence tolerance on zeroing the gradient.
      maxIter - the maximum number of iterations.
      Returns:
      the minimum value of the function.
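      A hypothetical sketch of this overload on a larger problem, reusing the interface contract assumed in the previous example (to be placed in a main method alongside the same imports):

      // Separable quadratic f(x) = sum_i (x_i - (i+1))^2 in n = 1000 variables,
      // minimized with m = 5 corrections (an assumed but typical setting).
      int n = 1000;
      DifferentiableMultivariateFunction quadratic = new DifferentiableMultivariateFunction() {
          @Override
          public double f(double[] x) {
              double sum = 0.0;
              for (int i = 0; i < x.length; i++) {
                  double d = x[i] - (i + 1);
                  sum += d * d;
              }
              return sum;
          }

          @Override
          public double g(double[] x, double[] gradient) {
              for (int i = 0; i < x.length; i++) {
                  gradient[i] = 2.0 * (x[i] - (i + 1));
              }
              return f(x);
          }
      };

      double[] x = new double[n];  // start at the origin
      double fmin = BFGS.minimize(quadratic, 5, x, 1E-8, 500);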
    • minimize

      public static double minimize(DifferentiableMultivariateFunction func, int m, double[] x, double[] l, double[] u, double gtol, int maxIter)
      This method solves the bound constrained minimization problem using the L-BFGS-B method. The L-BFGS-B algorithm extends L-BFGS to handle simple box constraints on variables, that is, constraints of the form l_i ≤ x_i ≤ u_i where l_i and u_i are per-variable constant lower and upper bounds, respectively (for each x_i, either or both bounds may be omitted). The method works by identifying fixed and free variables at every step (using a simple gradient method), applying the L-BFGS method to the free variables only to obtain higher accuracy, and then repeating the process.
      Parameters:
      func - the function to be minimized.
      m - the number of corrections used in the L-BFGS update. Values of m less than 3 are not recommended, and large values of m result in excessive computing time. The range 3 <= m <= 7 is recommended; a common choice is m = 5.
      x - on initial entry this must be set by the user to the values of the initial estimate of the solution vector. On exit, it contains the values of the variables at the best point found (usually a solution).
      l - the lower bounds on the variables.
      u - the upper bounds on the variables.
      gtol - the convergence tolerance on zeroing the gradient.
      maxIter - the maximum number of iterations.
      Returns:
      the minimum value of the function.
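      A hypothetical sketch of the bound-constrained overload, continuing the quadratic snippet above: with bounds 0 ≤ x_i ≤ 10, every variable whose unconstrained optimum (i + 1) exceeds 10 should end up clamped at its upper bound.

      double[] l = new double[n];          // lower bounds: all zeros
      double[] u = new double[n];
      java.util.Arrays.fill(u, 10.0);      // upper bounds: all 10

      double[] x0 = new double[n];
      double fmin = BFGS.minimize(quadratic, 5, x0, l, u, 1E-8, 500);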