Multivariate optimisation involves finding the minimum or maximum of a function of several variables, $f(\mathbf{x})$ with $\mathbf{x} = (x_1, x_2, \ldots, x_n)$. Most real-world engineering problems involve multiple design variables, making multivariate optimisation essential.
Mathematical Formulation
For a function $f: \mathbb{R}^n \to \mathbb{R}$, where $\mathbf{x} = (x_1, x_2, \ldots, x_n)^T$, we seek:
- $\mathbf{x}^* = \arg\min_{\mathbf{x}} f(\mathbf{x})$ for minimization, or
- $\mathbf{x}^* = \arg\max_{\mathbf{x}} f(\mathbf{x})$ for maximization
Key Concepts
Gradient Vector
The gradient vector ($\nabla f$) is the vector of first partial derivatives:

$$\nabla f(\mathbf{x}) = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right)^T$$
The gradient points in the direction of steepest increase of the function.
Example: for $f(x_1, x_2) = x_1^2 + 2x_1 x_2 + 3x_2^2$, the gradient is $\nabla f = \left( 2x_1 + 2x_2,\; 2x_1 + 6x_2 \right)^T$.
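As a quick check on gradients like this, the analytic expression can be compared against a finite-difference approximation. The sketch below (NumPy) uses the illustrative function from the example above; the test point and step size are arbitrary choices.

```python
import numpy as np

def f(x):
    # Illustrative function from the example above: f(x1, x2) = x1^2 + 2*x1*x2 + 3*x2^2
    return x[0]**2 + 2 * x[0] * x[1] + 3 * x[1]**2

def grad_f(x):
    # Analytic gradient: (2*x1 + 2*x2, 2*x1 + 6*x2)
    return np.array([2 * x[0] + 2 * x[1], 2 * x[0] + 6 * x[1]])

def numerical_gradient(func, x, h=1e-6):
    # Central finite differences, one coordinate at a time
    g = np.zeros_like(x, dtype=float)
    for i in range(len(x)):
        e = np.zeros_like(x, dtype=float)
        e[i] = h
        g[i] = (func(x + e) - func(x - e)) / (2 * h)
    return g

x0 = np.array([1.0, -2.0])
print(grad_f(x0))                 # analytic: [ -2. -10.]
print(numerical_gradient(f, x0))  # finite-difference approximation, close to the analytic values
```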
Hessian Matrix
The Hessian matrix ($\mathbf{H}$) is the matrix of second partial derivatives:

$$\mathbf{H}(\mathbf{x}) = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{pmatrix}$$
The Hessian characterizes the local curvature of the function.
Example: for the same $f(x_1, x_2) = x_1^2 + 2x_1 x_2 + 3x_2^2$, the Hessian is $\mathbf{H} = \begin{pmatrix} 2 & 2 \\ 2 & 6 \end{pmatrix}$.
Note that this is a symmetric matrix.
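The Hessian can likewise be approximated by finite-differencing the gradient, which also makes the symmetry easy to verify numerically. A minimal sketch, again using the illustrative example function:

```python
import numpy as np

def grad_f(x):
    # Analytic gradient of the example f(x1, x2) = x1^2 + 2*x1*x2 + 3*x2^2
    return np.array([2 * x[0] + 2 * x[1], 2 * x[0] + 6 * x[1]])

def numerical_hessian(grad, x, h=1e-6):
    # Finite-difference each column of the Hessian from the gradient
    n = len(x)
    H = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        H[:, j] = (grad(x + e) - grad(x - e)) / (2 * h)
    return H

H = numerical_hessian(grad_f, np.array([1.0, -2.0]))
print(H)                               # approximately [[2, 2], [2, 6]]
print(np.allclose(H, H.T, atol=1e-6))  # True: the Hessian is symmetric
```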
Optimality Conditions
Necessary Conditions
For a smooth function, a necessary condition for a local minimum or maximum at a point $\mathbf{x}^*$ is:

$$\nabla f(\mathbf{x}^*) = \mathbf{0}$$
This means the gradient equals the zero vector at the optimal point.
Sufficient Conditions
To determine the nature of a stationary point $\mathbf{x}^*$ (where $\nabla f(\mathbf{x}^*) = \mathbf{0}$), examine the Hessian at that point:
- If the Hessian is positive definite (all eigenvalues positive), then $\mathbf{x}^*$ is a local minimum
- If the Hessian is negative definite (all eigenvalues negative), then $\mathbf{x}^*$ is a local maximum
- If the Hessian has both positive and negative eigenvalues, then $\mathbf{x}^*$ is a saddle point
- If the Hessian is singular (has zero eigenvalues), higher-order tests are needed
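These tests translate directly into a small eigenvalue check. The sketch below is a minimal classifier along these lines; the tolerance and the two Hessians passed in at the end are illustrative.

```python
import numpy as np

def classify_stationary_point(H, tol=1e-10):
    """Classify a stationary point from the eigenvalues of its Hessian H."""
    eigvals = np.linalg.eigvalsh(H)  # H is symmetric, so eigvalsh is appropriate
    if np.any(np.abs(eigvals) < tol):
        return "inconclusive (singular Hessian, higher-order test needed)"
    if np.all(eigvals > 0):
        return "local minimum"
    if np.all(eigvals < 0):
        return "local maximum"
    return "saddle point"

# Hessian of the example f(x1, x2) = x1^2 + 2*x1*x2 + 3*x2^2, plus an indefinite matrix
print(classify_stationary_point(np.array([[2.0, 2.0], [2.0, 6.0]])))   # local minimum
print(classify_stationary_point(np.array([[2.0, 0.0], [0.0, -1.0]])))  # saddle point
```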
Special Cases
Linear Functions
For $f(\mathbf{x}) = \mathbf{c}^T \mathbf{x} + d$:
- No extrema exist unless constrained
- Gradient is constant ($\nabla f = \mathbf{c}$)
- Hessian is the zero matrix
Quadratic Functions
For $f(\mathbf{x}) = \tfrac{1}{2}\mathbf{x}^T \mathbf{Q} \mathbf{x} + \mathbf{b}^T \mathbf{x} + c$, where $\mathbf{Q}$ is symmetric:
- If $\mathbf{Q}$ is positive definite, unique minimum at $\mathbf{x}^* = -\mathbf{Q}^{-1}\mathbf{b}$
- If $\mathbf{Q}$ is negative definite, unique maximum at $\mathbf{x}^* = -\mathbf{Q}^{-1}\mathbf{b}$
- If $\mathbf{Q}$ is indefinite, saddle point at $\mathbf{x}^* = -\mathbf{Q}^{-1}\mathbf{b}$
- If $\mathbf{Q}$ is singular, no extrema or infinitely many extrema
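In practice the stationary point of a quadratic is found by solving the linear system $\mathbf{Q}\mathbf{x}^* = -\mathbf{b}$ rather than forming $\mathbf{Q}^{-1}$ explicitly. A small sketch with an illustrative positive-definite $\mathbf{Q}$ and vector $\mathbf{b}$:

```python
import numpy as np

# Illustrative data for f(x) = 0.5 * x^T Q x + b^T x + c
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, -2.0])

# Setting the gradient Q x + b to zero gives x* = -Q^{-1} b; solve instead of inverting
x_star = np.linalg.solve(Q, -b)

print(x_star)                 # approximately [-0.4545, 0.8182]
print(np.linalg.eigvalsh(Q))  # all eigenvalues positive -> x_star is the unique minimum
```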
Numerical Methods
Gradient-Based Methods
- Steepest Descent:
  - Update: $\mathbf{x}_{k+1} = \mathbf{x}_k - \alpha_k \nabla f(\mathbf{x}_k)$
  - Simple but can be slow, especially for ill-conditioned problems
  - Step size $\alpha_k$ determined by line search (a minimal sketch follows this list)
- Conjugate Gradient:
  - Uses conjugate directions to improve convergence
  - Particularly effective for quadratic functions
  - Avoids the “zig-zag” behavior of steepest descent
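A minimal steepest-descent sketch is given below, using a simple backtracking (Armijo) line search for the step size. The example objective, tolerances, and backtracking constants are illustrative choices rather than a production implementation.

```python
import numpy as np

def steepest_descent(f, grad, x0, tol=1e-6, max_iter=1000):
    """Steepest descent with a basic backtracking (Armijo) line search."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:  # stop when the gradient is (nearly) zero
            break
        alpha = 1.0
        # Shrink the step until a sufficient decrease is achieved
        while f(x - alpha * g) > f(x) - 1e-4 * alpha * (g @ g):
            alpha *= 0.5
        x = x - alpha * g  # move in the direction of steepest descent
    return x

# Example from earlier: f(x1, x2) = x1^2 + 2*x1*x2 + 3*x2^2, minimum at the origin
f = lambda x: x[0]**2 + 2 * x[0] * x[1] + 3 * x[1]**2
grad = lambda x: np.array([2 * x[0] + 2 * x[1], 2 * x[0] + 6 * x[1]])
print(steepest_descent(f, grad, [1.0, -2.0]))  # approximately [0, 0]
```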
Hessian-Based Methods
- Newton’s Method:
  - Update: $\mathbf{x}_{k+1} = \mathbf{x}_k - \mathbf{H}(\mathbf{x}_k)^{-1} \nabla f(\mathbf{x}_k)$
  - Quadratic convergence near the optimum (see the sketch after this list)
- Quasi-Newton Methods:
  - Approximates the Hessian or its inverse
  - BFGS (Broyden-Fletcher-Goldfarb-Shanno) is the most popular
  - Avoids explicit computation of second derivatives
- Trust Region Methods:
  - Restricts the optimization step within a “trusted” region
  - Combines advantages of gradient and Hessian approaches
  - Robust for non-convex functions
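For comparison, a bare-bones Newton iteration is sketched below. It assumes the gradient and Hessian are available analytically and takes full Newton steps with no line search or trust-region safeguard; in practice, library routines such as `scipy.optimize.minimize` (e.g. its BFGS or trust-region methods) are usually preferable.

```python
import numpy as np

def newton_method(grad, hess, x0, tol=1e-10, max_iter=50):
    """Plain Newton's method: solve H(x_k) p = -grad(x_k), then x_{k+1} = x_k + p."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        p = np.linalg.solve(hess(x), -g)  # Newton step without forming the explicit inverse
        x = x + p
    return x

# For the quadratic example f(x1, x2) = x1^2 + 2*x1*x2 + 3*x2^2 a single step suffices
grad = lambda x: np.array([2 * x[0] + 2 * x[1], 2 * x[0] + 6 * x[1]])
hess = lambda x: np.array([[2.0, 2.0], [2.0, 6.0]])
print(newton_method(grad, hess, [1.0, -2.0]))  # [0. 0.]
```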
Challenges in Multivariate Optimisation
Scaling
When different variables have vastly different scales, optimization algorithms may perform poorly. Solutions include:
- Variable transformation
- Preconditioning
- Normalized coordinates
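One way to apply normalized coordinates is to rescale each variable by a characteristic magnitude before calling the optimizer. The sketch below does this with `scipy.optimize.minimize`; the badly scaled objective and the scale factors are made up for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    # Badly scaled objective: x1 is of order 1, x2 of order 1e4
    return (x[0] - 1.0)**2 + 1.0e-8 * (x[1] - 2.0e4)**2

# Optimize in normalized coordinates z, where x = s * z with characteristic scales s
s = np.array([1.0, 1.0e4])
f_scaled = lambda z: f(s * z)

res = minimize(f_scaled, x0=np.zeros(2))
print(s * res.x)  # solution mapped back to the original variables, roughly [1, 20000]
```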
Ill-Conditioning
When the Hessian has a high condition number (ratio of largest to smallest eigenvalue), convergence can be slow. This occurs in:
- Problems with highly interdependent variables
- Problems with vastly different sensitivities to different variables
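The condition number is easy to inspect directly, for example with NumPy; the two Hessians below are made up to contrast a well-conditioned and an ill-conditioned problem.

```python
import numpy as np

# Hessians of two quadratic bowls: well conditioned vs. ill conditioned
H_good = np.array([[2.0, 0.0], [0.0, 3.0]])
H_bad = np.array([[1.0, 0.0], [0.0, 1.0e6]])

print(np.linalg.cond(H_good))  # 1.5
print(np.linalg.cond(H_bad))   # 1e6: steepest descent will zig-zag and converge slowly
```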
Multiple Local Optima
Unlike in the convex case, general nonlinear functions may have multiple local optima, making it difficult to find the global optimum. Strategies include:
- Multiple random starts
- Global optimization methods
- Problem reformulation
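A common baseline among these strategies is the multi-start approach: run a local optimizer from many random starting points and keep the best result. The sketch below applies it to the Rastrigin test function (chosen here purely as an illustrative multimodal example) via `scipy.optimize.minimize`.

```python
import numpy as np
from scipy.optimize import minimize

def rastrigin(x):
    # Standard multimodal test function: many local minima, global minimum f(0) = 0
    x = np.asarray(x)
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

rng = np.random.default_rng(42)
best = None
for _ in range(50):
    x0 = rng.uniform(-5.12, 5.12, size=2)  # random start in the usual search box
    res = minimize(rastrigin, x0)          # gradient-based local search from this start
    if best is None or res.fun < best.fun:
        best = res

# The best of the local minima found; more restarts raise the chance of hitting the global one
print(best.x, best.fun)
```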
Applications
Multivariate optimization appears in countless engineering applications:
- Structural design (multiple dimensional parameters)
- Control system tuning (multiple control parameters)
- Machine learning (model parameters)
- Financial portfolio optimization (multiple assets)
- Chemical process optimization (multiple process variables)