Multivariate optimisation involves finding the minimum or maximum of a function of several variables, $f(\mathbf{x})$ with $\mathbf{x} = (x_1, x_2, \ldots, x_n)$. Most real-world engineering problems involve multiple design variables, making multivariate optimisation essential.
Mathematical Formulation
For a function $f: \mathbb{R}^n \to \mathbb{R}$, where $\mathbf{x} = (x_1, x_2, \ldots, x_n)^T$, we seek:
- $\mathbf{x}^* = \arg\min_{\mathbf{x}} f(\mathbf{x})$ for minimization, or
- $\mathbf{x}^* = \arg\max_{\mathbf{x}} f(\mathbf{x})$ for maximization
Key Concepts
Gradient Vector
The gradient vector ($\nabla f$) is the vector of first partial derivatives:

$$\nabla f(\mathbf{x}) = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right)^T$$
The gradient points in the direction of steepest increase of the function.
Example: for $f(x_1, x_2) = x_1^2 + 2x_1 x_2 + 3x_2^2$, the gradient is $\nabla f = \left( 2x_1 + 2x_2,\; 2x_1 + 6x_2 \right)^T$.
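As a quick check on gradients like this, the analytic expression can be compared against a finite-difference approximation. The sketch below (NumPy) uses the illustrative function from the example above; the test point and step size are arbitrary choices.

```python
import numpy as np

def f(x):
    # Illustrative function from the example above: f(x1, x2) = x1^2 + 2*x1*x2 + 3*x2^2
    return x[0]**2 + 2 * x[0] * x[1] + 3 * x[1]**2

def grad_f(x):
    # Analytic gradient: (2*x1 + 2*x2, 2*x1 + 6*x2)
    return np.array([2 * x[0] + 2 * x[1], 2 * x[0] + 6 * x[1]])

def numerical_gradient(func, x, h=1e-6):
    # Central finite differences, one coordinate at a time
    g = np.zeros_like(x, dtype=float)
    for i in range(len(x)):
        e = np.zeros_like(x, dtype=float)
        e[i] = h
        g[i] = (func(x + e) - func(x - e)) / (2 * h)
    return g

x0 = np.array([1.0, -2.0])
print(grad_f(x0))                 # analytic: [ -2. -10.]
print(numerical_gradient(f, x0))  # finite-difference approximation, close to the analytic values
```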
Hessian Matrix
The Hessian matrix ($\mathbf{H}$) is the matrix of second partial derivatives:

$$\mathbf{H}(\mathbf{x}) = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{pmatrix}$$
The Hessian characterizes the local curvature of the function.
Example: for the same $f(x_1, x_2) = x_1^2 + 2x_1 x_2 + 3x_2^2$, the Hessian is $\mathbf{H} = \begin{pmatrix} 2 & 2 \\ 2 & 6 \end{pmatrix}$.
Note that this is a symmetric matrix.
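The Hessian can likewise be approximated by finite-differencing the gradient, which also makes the symmetry easy to verify numerically. A minimal sketch, again using the illustrative example function:

```python
import numpy as np

def grad_f(x):
    # Analytic gradient of the example f(x1, x2) = x1^2 + 2*x1*x2 + 3*x2^2
    return np.array([2 * x[0] + 2 * x[1], 2 * x[0] + 6 * x[1]])

def numerical_hessian(grad, x, h=1e-6):
    # Finite-difference each column of the Hessian from the gradient
    n = len(x)
    H = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        H[:, j] = (grad(x + e) - grad(x - e)) / (2 * h)
    return H

H = numerical_hessian(grad_f, np.array([1.0, -2.0]))
print(H)                               # approximately [[2, 2], [2, 6]]
print(np.allclose(H, H.T, atol=1e-6))  # True: the Hessian is symmetric
```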
Optimality Conditions
Necessary Conditions
For a smooth function, a necessary condition for a local minimum or maximum at a point $\mathbf{x}^*$ is:

$$\nabla f(\mathbf{x}^*) = \mathbf{0}$$
This means the gradient equals the zero vector at the optimal point.
Sufficient Conditions
To determine the nature of a stationary point $\mathbf{x}^*$ (where $\nabla f(\mathbf{x}^*) = \mathbf{0}$), examine the Hessian at that point:
- If the Hessian is positive definite (all eigenvalues positive), then $\mathbf{x}^*$ is a local minimum
- If the Hessian is negative definite (all eigenvalues negative), then $\mathbf{x}^*$ is a local maximum
- If the Hessian has both positive and negative eigenvalues, then $\mathbf{x}^*$ is a saddle point
- If the Hessian is singular (has zero eigenvalues), higher-order tests are needed
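These tests translate directly into a small eigenvalue check. The sketch below is a minimal classifier along these lines; the tolerance and the two Hessians passed in at the end are illustrative.

```python
import numpy as np

def classify_stationary_point(H, tol=1e-10):
    """Classify a stationary point from the eigenvalues of its Hessian H."""
    eigvals = np.linalg.eigvalsh(H)  # H is symmetric, so eigvalsh is appropriate
    if np.any(np.abs(eigvals) < tol):
        return "inconclusive (singular Hessian, higher-order test needed)"
    if np.all(eigvals > 0):
        return "local minimum"
    if np.all(eigvals < 0):
        return "local maximum"
    return "saddle point"

# Hessian of the example f(x1, x2) = x1^2 + 2*x1*x2 + 3*x2^2, plus an indefinite matrix
print(classify_stationary_point(np.array([[2.0, 2.0], [2.0, 6.0]])))   # local minimum
print(classify_stationary_point(np.array([[2.0, 0.0], [0.0, -1.0]])))  # saddle point
```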
Special Cases
Linear Functions
For $f(\mathbf{x}) = \mathbf{c}^T \mathbf{x} + d$:
- No extrema exist unless constrained
- Gradient is constant ($\nabla f = \mathbf{c}$)
- Hessian is the zero matrix
Quadratic Functions
For $f(\mathbf{x}) = \tfrac{1}{2}\mathbf{x}^T \mathbf{Q} \mathbf{x} + \mathbf{b}^T \mathbf{x} + c$, where $\mathbf{Q}$ is symmetric:
- If $\mathbf{Q}$ is positive definite, unique minimum at $\mathbf{x}^* = -\mathbf{Q}^{-1}\mathbf{b}$
- If $\mathbf{Q}$ is negative definite, unique maximum at $\mathbf{x}^* = -\mathbf{Q}^{-1}\mathbf{b}$
- If $\mathbf{Q}$ is indefinite, saddle point at $\mathbf{x}^* = -\mathbf{Q}^{-1}\mathbf{b}$
- If $\mathbf{Q}$ is singular, no extrema or infinitely many extrema
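In practice the stationary point of a quadratic is found by solving the linear system $\mathbf{Q}\mathbf{x}^* = -\mathbf{b}$ rather than forming $\mathbf{Q}^{-1}$ explicitly. A small sketch with an illustrative positive-definite $\mathbf{Q}$ and vector $\mathbf{b}$:

```python
import numpy as np

# Illustrative data for f(x) = 0.5 * x^T Q x + b^T x + c
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, -2.0])

# Setting the gradient Q x + b to zero gives x* = -Q^{-1} b; solve instead of inverting
x_star = np.linalg.solve(Q, -b)

print(x_star)                 # approximately [-0.4545, 0.8182]
print(np.linalg.eigvalsh(Q))  # all eigenvalues positive -> x_star is the unique minimum
```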
Numerical Methods
Gradient-Based Methods
- Steepest Descent:
  - Update: $\mathbf{x}_{k+1} = \mathbf{x}_k - \alpha_k \nabla f(\mathbf{x}_k)$
  - Simple but can be slow, especially for ill-conditioned problems
  - Step size $\alpha_k$ determined by line search (a minimal sketch follows this list)
- Conjugate Gradient:
  - Uses conjugate directions to improve convergence
  - Particularly effective for quadratic functions
  - Avoids the “zig-zag” behavior of steepest descent
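A minimal steepest-descent sketch is given below, using a simple backtracking (Armijo) line search for the step size. The example objective, tolerances, and backtracking constants are illustrative choices rather than a production implementation.

```python
import numpy as np

def steepest_descent(f, grad, x0, tol=1e-6, max_iter=1000):
    """Steepest descent with a basic backtracking (Armijo) line search."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:  # stop when the gradient is (nearly) zero
            break
        alpha = 1.0
        # Shrink the step until a sufficient decrease is achieved
        while f(x - alpha * g) > f(x) - 1e-4 * alpha * (g @ g):
            alpha *= 0.5
        x = x - alpha * g  # move in the direction of steepest descent
    return x

# Example from earlier: f(x1, x2) = x1^2 + 2*x1*x2 + 3*x2^2, minimum at the origin
f = lambda x: x[0]**2 + 2 * x[0] * x[1] + 3 * x[1]**2
grad = lambda x: np.array([2 * x[0] + 2 * x[1], 2 * x[0] + 6 * x[1]])
print(steepest_descent(f, grad, [1.0, -2.0]))  # approximately [0, 0]
```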
Hessian-Based Methods
- Newton’s Method:
  - Update: $\mathbf{x}_{k+1} = \mathbf{x}_k - \mathbf{H}(\mathbf{x}_k)^{-1} \nabla f(\mathbf{x}_k)$
  - Quadratic convergence near the optimum (see the sketch after this list)
- Quasi-Newton Methods:
  - Approximates the Hessian or its inverse
  - BFGS (Broyden-Fletcher-Goldfarb-Shanno) is the most popular
  - Avoids explicit computation of second derivatives
- Trust Region Methods:
  - Restricts the optimization step within a “trusted” region
  - Combines advantages of gradient and Hessian approaches
  - Robust for non-convex functions
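For comparison, a bare-bones Newton iteration is sketched below. It assumes the gradient and Hessian are available analytically and takes full Newton steps with no line search or trust-region safeguard; in practice, library routines such as `scipy.optimize.minimize` (e.g. its BFGS or trust-region methods) are usually preferable.

```python
import numpy as np

def newton_method(grad, hess, x0, tol=1e-10, max_iter=50):
    """Plain Newton's method: solve H(x_k) p = -grad(x_k), then x_{k+1} = x_k + p."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        p = np.linalg.solve(hess(x), -g)  # Newton step without forming the explicit inverse
        x = x + p
    return x

# For the quadratic example f(x1, x2) = x1^2 + 2*x1*x2 + 3*x2^2 a single step suffices
grad = lambda x: np.array([2 * x[0] + 2 * x[1], 2 * x[0] + 6 * x[1]])
hess = lambda x: np.array([[2.0, 2.0], [2.0, 6.0]])
print(newton_method(grad, hess, [1.0, -2.0]))  # [0. 0.]
```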
Challenges in Multivariate Optimisation
Scaling
When different variables have vastly different scales, optimization algorithms may perform poorly. Solutions include:
- Variable transformation
- Preconditioning
- Normalized coordinates
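One way to apply normalized coordinates is to rescale each variable by a characteristic magnitude before calling the optimizer. The sketch below does this with `scipy.optimize.minimize`; the badly scaled objective and the scale factors are made up for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    # Badly scaled objective: x1 is of order 1, x2 of order 1e4
    return (x[0] - 1.0)**2 + 1.0e-8 * (x[1] - 2.0e4)**2

# Optimize in normalized coordinates z, where x = s * z with characteristic scales s
s = np.array([1.0, 1.0e4])
f_scaled = lambda z: f(s * z)

res = minimize(f_scaled, x0=np.zeros(2))
print(s * res.x)  # solution mapped back to the original variables, roughly [1, 20000]
```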
Ill-Conditioning
When the Hessian has a high condition number (ratio of largest to smallest eigenvalue), convergence can be slow. This occurs in:
- Problems with highly interdependent variables
- Problems with vastly different sensitivities to different variables
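The condition number is easy to inspect directly, for example with NumPy; the two Hessians below are made up to contrast a well-conditioned and an ill-conditioned problem.

```python
import numpy as np

# Hessians of two quadratic bowls: well conditioned vs. ill conditioned
H_good = np.array([[2.0, 0.0], [0.0, 3.0]])
H_bad = np.array([[1.0, 0.0], [0.0, 1.0e6]])

print(np.linalg.cond(H_good))  # 1.5
print(np.linalg.cond(H_bad))   # 1e6: steepest descent will zig-zag and converge slowly
```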
Multiple Local Optima
Unlike in the convex case, general nonlinear functions may have multiple local optima, making it difficult to find the global optimum. Strategies include:
- Multiple random starts
- Global optimization methods
- Problem reformulation
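A common baseline among these strategies is the multi-start approach: run a local optimizer from many random starting points and keep the best result. The sketch below applies it to the Rastrigin test function (chosen here purely as an illustrative multimodal example) via `scipy.optimize.minimize`.

```python
import numpy as np
from scipy.optimize import minimize

def rastrigin(x):
    # Standard multimodal test function: many local minima, global minimum f(0) = 0
    x = np.asarray(x)
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

rng = np.random.default_rng(42)
best = None
for _ in range(50):
    x0 = rng.uniform(-5.12, 5.12, size=2)  # random start in the usual search box
    res = minimize(rastrigin, x0)          # gradient-based local search from this start
    if best is None or res.fun < best.fun:
        best = res

# The best of the local minima found; more restarts raise the chance of hitting the global one
print(best.x, best.fun)
```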
Applications
Multivariate optimization appears in countless engineering applications:
- Structural design (multiple dimensional parameters)
- Control system tuning (multiple control parameters)
- Machine learning (model parameters)
- Financial portfolio optimization (multiple assets)
- Chemical process optimization (multiple process variables)