← Previous: Numerical Differentiation | Back to Index | Next: Root Finding →
Optimization is the process of finding the parameter set that minimizes (or maximizes) an objective function. The Numerics library provides a comprehensive suite of local and global optimization algorithms for both unconstrained and constrained problems. These methods are essential for parameter estimation, model calibration, machine learning, and engineering design optimization.
| Method | Type | Best For | Requires Derivatives |
|---|---|---|---|
| Local Methods | |||
| BFGS | Local | Smooth, differentiable functions | No (numerical) |
| Nelder-Mead | Local | General purpose, robust | No |
| Powell | Local | Smooth functions without derivatives | No |
| Golden Section | Local | 1D problems | No |
| Brent Search | Local | 1D problems, smooth functions | No |
| Gradient Descent | Local | Large-scale, smooth problems | No (numerical) |
| ADAM | Local | Machine learning, stochastic | No (numerical) |
| Global Methods | |||
| Differential Evolution | Global | Multimodal, robust | No |
| Particle Swarm | Global | Multimodal, fast convergence | No |
| Shuffled Complex Evolution | Global | Hydrological calibration | No |
| Simulated Annealing | Global | Discrete/continuous, multimodal | No |
| Multi-Start | Global | Combines local search with random starts | No |
| MLSL | Global | Smooth multimodal functions | Requires local solver |
| Constrained | |||
| Augmented Lagrangian | Constrained | Equality and inequality constraints | No |
An optimization problem seeks to find:

$$\min_{\mathbf{x} \in \mathbb{R}^n} f(\mathbf{x})$$

subject to:

$$\mathbf{x}_L \leq \mathbf{x} \leq \mathbf{x}_U$$

where $f: \mathbb{R}^n \to \mathbb{R}$ is the objective function, $\mathbf{x}$ is the vector of $n$ decision parameters, and $\mathbf{x}_L$, $\mathbf{x}_U$ are the lower and upper bounds of the search space.

For maximization problems, use the Maximize() method, which minimizes $-f(\mathbf{x})$.
All optimizers in Numerics inherit from the Optimizer base class and share a common interface.
- `MaxIterations`: Maximum iterations allowed (default = 10,000)
- `MaxFunctionEvaluations`: Maximum function evaluations (default = int.MaxValue)
- `AbsoluteTolerance`: Absolute convergence tolerance (default = 1E-8)
- `RelativeTolerance`: Relative convergence tolerance (default = 1E-8)
- `ReportFailure`: Throw exception on failure (default = true)
- `RecordTraces`: Save optimization trace (default = true)
- `ComputeHessian`: Compute Hessian at solution (default = true)
- `BestParameterSet`: Optimal solution (Values, Fitness)
- `Iterations`: Number of iterations performed
- `FunctionEvaluations`: Number of function evaluations
- `Status`: Optimization status (Success, Failure, etc.)
- `ParameterSetTrace`: Full trace of parameter evaluations
- `Hessian`: Numerically differentiated Hessian matrix (if computed)
- `Minimize()`: Minimize the objective function
- `Maximize()`: Maximize the objective function (minimizes $-f(\mathbf{x})$)
Local optimization methods find the nearest local minimum from a starting point. They are fast and efficient for smooth, unimodal functions but may get trapped in local minima for multimodal problems.
BFGS is a quasi-Newton method that builds an approximation to the Hessian matrix using gradient information [1]. It's one of the most effective methods for smooth, unconstrained optimization.
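The update can be stated compactly. With step $\mathbf{s}_k = \mathbf{x}_{k+1} - \mathbf{x}_k$ and gradient change $\mathbf{y}_k = \nabla f_{k+1} - \nabla f_k$ (the gradients are obtained numerically in this implementation), the standard BFGS update of the inverse-Hessian approximation $H_k$ is [1]:

$$H_{k+1} = \left(I - \rho_k \mathbf{s}_k \mathbf{y}_k^T\right) H_k \left(I - \rho_k \mathbf{y}_k \mathbf{s}_k^T\right) + \rho_k \mathbf{s}_k \mathbf{s}_k^T, \qquad \rho_k = \frac{1}{\mathbf{y}_k^T \mathbf{s}_k}$$

The next search direction is then $-H_{k+1} \nabla f_{k+1}$, followed by a line search along it.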
```csharp
using Numerics.Mathematics.Optimization;

// Rosenbrock function: f(x,y) = (1-x)² + 100(y-x²)²
double Rosenbrock(double[] x)
{
    return Math.Pow(1 - x[0], 2) + 100 * Math.Pow(x[1] - x[0] * x[0], 2);
}

// Initial guess and bounds
var initial = new double[] { -1.2, 1.0 };
var lower = new double[] { -2, -2 };
var upper = new double[] { 2, 2 };

// Create and configure the optimizer
var bfgs = new BFGS(Rosenbrock, 2, initial, lower, upper);
bfgs.RelativeTolerance = 1e-8;
bfgs.MaxIterations = 1000;

// Minimize
bfgs.Minimize();

// Results
Console.WriteLine($"Optimal solution: x = {bfgs.BestParameterSet.Values[0]:F6}, y = {bfgs.BestParameterSet.Values[1]:F6}");
Console.WriteLine($"Minimum value: {bfgs.BestParameterSet.Fitness:F10}");
Console.WriteLine($"Iterations: {bfgs.Iterations}");
Console.WriteLine($"Function evaluations: {bfgs.FunctionEvaluations}");
Console.WriteLine($"Status: {bfgs.Status}");
```

Advantages: Fast convergence, memory efficient, works well for most smooth problems.

Disadvantages: Can fail on non-smooth functions, requires a bounded search space.
The Nelder-Mead method is a direct search algorithm that uses a simplex (geometric figure with n+1 vertices in n dimensions) to search the parameter space [2]. It's robust and doesn't require derivatives.
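At each iteration the worst vertex $\mathbf{x}_w$ is replaced using the centroid $\bar{\mathbf{x}}$ of the remaining vertices. With the standard coefficients, the candidate moves are:

$$\text{reflect: } \bar{\mathbf{x}} + (\bar{\mathbf{x}} - \mathbf{x}_w), \qquad \text{expand: } \bar{\mathbf{x}} + 2(\bar{\mathbf{x}} - \mathbf{x}_w), \qquad \text{contract: } \bar{\mathbf{x}} \pm \tfrac{1}{2}(\bar{\mathbf{x}} - \mathbf{x}_w)$$

If none of these improves on the worst vertex, the whole simplex shrinks toward its best vertex by a factor of $\tfrac{1}{2}$.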
```csharp
var nm = new NelderMead(Rosenbrock, 2, initial, lower, upper);
nm.RelativeTolerance = 1e-6;
nm.Minimize();

Console.WriteLine($"Optimal solution: [{nm.BestParameterSet.Values[0]:F6}, {nm.BestParameterSet.Values[1]:F6}]");
Console.WriteLine($"Minimum value: {nm.BestParameterSet.Fitness:F10}");
```

Advantages: Very robust, doesn't require derivatives, handles non-smooth functions.

Disadvantages: Slower convergence than gradient-based methods, can stagnate.
Powell's method is a conjugate direction algorithm that doesn't require derivatives [1]. It performs successive line searches along conjugate directions.
```csharp
var powell = new Powell(Rosenbrock, 2, initial, lower, upper);
powell.RelativeTolerance = 1e-8;
powell.Minimize();

Console.WriteLine($"Solution: [{powell.BestParameterSet.Values[0]:F6}, {powell.BestParameterSet.Values[1]:F6}]");
```

Advantages: No derivatives required, good for smooth functions.

Disadvantages: Can be slow in high dimensions, sensitive to scaling.
Simple gradient-based optimization with line search:
```csharp
var gd = new GradientDescent(Rosenbrock, 2, initial, lower, upper);
gd.Alpha = 0.001; // Step size (learning rate)
gd.Minimize();
```

Adaptive Moment Estimation (ADAM), popular in machine learning applications [3]:
```csharp
var adam = new ADAM(Rosenbrock, 2, initial, lower, upper);
adam.Alpha = 0.001;
adam.Beta1 = 0.9;   // First moment decay
adam.Beta2 = 0.999; // Second moment decay
adam.Minimize();
```

Advantages: Adaptive learning rates, works well for noisy objectives.

Best for: Machine learning, stochastic optimization problems.
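For reference, ADAM maintains exponentially decaying averages of the gradient and squared gradient [3]. With gradient $g_t$ at step $t$:

$$m_t = \beta_1 m_{t-1} + (1-\beta_1) g_t, \qquad v_t = \beta_2 v_{t-1} + (1-\beta_2) g_t^2$$

$$\hat{m}_t = \frac{m_t}{1-\beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1-\beta_2^t}, \qquad \theta_t = \theta_{t-1} - \alpha \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}$$

The Alpha, Beta1, and Beta2 properties correspond to $\alpha$, $\beta_1$, and $\beta_2$; $\epsilon$ is a small constant that prevents division by zero.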
For single-parameter optimization, specialized 1D methods are more efficient:
Uses the golden ratio to successively narrow a bracketing interval around the minimum, shrinking the interval by a factor of $\varphi - 1 \approx 0.618$ per iteration:
```csharp
Func<double, double> f1d = x => Math.Pow(x - 2, 2) + 3;

var golden = new GoldenSection(f1d, 0.0, 5.0);
golden.Minimize();

Console.WriteLine($"Minimum at x = {golden.BestParameterSet.Values[0]:F6}");
```

Combines golden section search with parabolic interpolation for faster convergence:
```csharp
var brent = new BrentSearch(f1d, 0.0, 5.0);
brent.Minimize();
```

Global optimization methods are designed to find the global minimum across the entire search space, avoiding local minima. They typically require more function evaluations but are more robust for multimodal problems.
Differential Evolution (DE) is a population-based evolutionary algorithm that is very robust for continuous optimization [4]. For each member $\mathbf{x}_i$ of the population, a trial vector is generated in three steps:

Mutation (DE/rand/1): Three distinct population members $\mathbf{x}_{r_0}$, $\mathbf{x}_{r_1}$, and $\mathbf{x}_{r_2}$ (all different from $\mathbf{x}_i$) are combined into a mutant vector:

$$\mathbf{v}_i = \mathbf{x}_{r_0} + F\left(\mathbf{x}_{r_1} - \mathbf{x}_{r_2}\right)$$

where $F$ is the mutation scaling factor (the `Mutation` property).

Crossover (binomial): The trial vector $\mathbf{u}_i$ inherits each component from the mutant with probability $CR$ (the `CrossoverProbability`), and from the target otherwise; at least one component is always taken from the mutant:

$$u_{i,j} = \begin{cases} v_{i,j} & \text{if } r_j \leq CR \text{ or } j = j_{rand} \\ x_{i,j} & \text{otherwise} \end{cases}$$

where $r_j \sim U(0,1)$ and $j_{rand}$ is a randomly chosen component index.

Selection (greedy): The trial vector replaces the target only if it has equal or better fitness:

$$\mathbf{x}_i \leftarrow \begin{cases} \mathbf{u}_i & \text{if } f(\mathbf{u}_i) \leq f(\mathbf{x}_i) \\ \mathbf{x}_i & \text{otherwise} \end{cases}$$

The algorithm converges when the standard deviation of fitness values across the population falls below the tolerance.
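The three steps above can be sketched in a short, library-independent way. This is illustrative Python, not the Numerics implementation; it runs one DE/rand/1/bin search on the sphere function:

```python
# Illustrative sketch of DE/rand/1 with binomial crossover and greedy selection.
import random

def de_minimize(f, lower, upper, pop_size=30, F=0.75, CR=0.9, generations=200, seed=1):
    rng = random.Random(seed)
    dim = len(lower)
    pop = [[rng.uniform(lower[j], upper[j]) for j in range(dim)] for _ in range(pop_size)]
    fit = [f(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: v = x_r0 + F * (x_r1 - x_r2), with r0, r1, r2 distinct and != i
            r0, r1, r2 = rng.sample([k for k in range(pop_size) if k != i], 3)
            v = [pop[r0][j] + F * (pop[r1][j] - pop[r2][j]) for j in range(dim)]
            # Binomial crossover: at least one component (j_rand) comes from the mutant
            j_rand = rng.randrange(dim)
            u = [v[j] if (rng.random() < CR or j == j_rand) else pop[i][j] for j in range(dim)]
            # Clamp the trial vector to the search bounds
            u = [min(max(u[j], lower[j]), upper[j]) for j in range(dim)]
            # Greedy selection: keep the trial only if it is at least as good
            fu = f(u)
            if fu <= fit[i]:
                pop[i], fit[i] = u, fu
    best = min(range(pop_size), key=lambda k: fit[k])
    return pop[best], fit[best]

sphere = lambda x: sum(v * v for v in x)
best_x, best_f = de_minimize(sphere, [-5.0, -5.0], [5.0, 5.0])
```

On this smooth test function the population collapses onto the origin well within 200 generations; the same loop applies unchanged to multimodal objectives such as Rastrigin.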
```csharp
using Numerics.Mathematics.Optimization;

// Rastrigin function (highly multimodal)
double Rastrigin(double[] x)
{
    double sum = 10 * x.Length;
    for (int i = 0; i < x.Length; i++)
    {
        sum += x[i] * x[i] - 10 * Math.Cos(2 * Math.PI * x[i]);
    }
    return sum;
}

var lower = new double[] { -5.12, -5.12 };
var upper = new double[] { 5.12, 5.12 };

var de = new DifferentialEvolution(Rastrigin, 2, lower, upper);
de.PopulationSize = 50;
de.CrossoverProbability = 0.9;
de.Mutation = 0.75;
de.MaxIterations = 1000;
de.Minimize();

Console.WriteLine($"Global minimum: [{de.BestParameterSet.Values[0]:F6}, {de.BestParameterSet.Values[1]:F6}]");
Console.WriteLine($"Function value: {de.BestParameterSet.Fitness:F10}");
```

Advantages: Very robust, handles discontinuities, good for difficult problems.
Parameters:
- `PopulationSize`: Number of candidate solutions (default = 10 × dimensions)
- `CrossoverProbability`: Probability of crossover (default = 0.9)
- `Mutation`: Scaling factor for mutation (default = 0.75)
PSO simulates the social behavior of bird flocking or fish schooling [5]. Each particle $i$ updates its velocity and position according to:

$$\mathbf{v}_i \leftarrow w\,\mathbf{v}_i + c_1 r_1 (\mathbf{p}_i - \mathbf{x}_i) + c_2 r_2 (\mathbf{g} - \mathbf{x}_i), \qquad \mathbf{x}_i \leftarrow \mathbf{x}_i + \mathbf{v}_i$$

where:

- $w$ is the inertia weight, which decreases linearly from $w_{\max} = 0.9$ to $w_{\min} = 0.4$ over the optimization, balancing exploration (high $w$) and exploitation (low $w$)
- $c_1 = c_2 = 2.05$ are the cognitive and social acceleration coefficients
- $r_1, r_2 \sim U(0,1)$ are independent random numbers (per component, per particle)
- $\mathbf{p}_i$ is particle $i$'s personal best position (best position it has visited)
- $\mathbf{g}$ is the global best position (best position any particle has visited)
The three terms represent momentum (continue in the current direction), cognitive pull (return toward personal best), and social pull (move toward the swarm's best).
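These update rules can be sketched library-independently. This is illustrative Python, not the Numerics implementation; it uses the constriction-equivalent coefficients $w = 0.729$, $c_1 = c_2 = 1.49445$ (a common parameterization with guaranteed-stable dynamics) rather than the library's defaults described above:

```python
# Illustrative PSO sketch: momentum + cognitive pull + social pull per component.
import random

def pso_minimize(f, lower, upper, n_particles=40, iters=300, seed=1):
    rng = random.Random(seed)
    dim = len(lower)
    w, c1, c2 = 0.729, 1.49445, 1.49445
    x = [[rng.uniform(lower[j], upper[j]) for j in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [xi[:] for xi in x]            # personal best positions
    pfit = [f(xi) for xi in x]
    gi = min(range(n_particles), key=lambda k: pfit[k])
    gbest, gfit = pbest[gi][:], pfit[gi]   # global best position
    for _ in range(iters):
        for i in range(n_particles):
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                v[i][j] = (w * v[i][j]
                           + c1 * r1 * (pbest[i][j] - x[i][j])
                           + c2 * r2 * (gbest[j] - x[i][j]))
                # Move the particle and keep it inside the bounds
                x[i][j] = min(max(x[i][j] + v[i][j], lower[j]), upper[j])
            fx = f(x[i])
            if fx < pfit[i]:
                pbest[i], pfit[i] = x[i][:], fx
                if fx < gfit:
                    gbest, gfit = x[i][:], fx
    return gbest, gfit

sphere = lambda z: sum(t * t for t in z)
best_x, best_f = pso_minimize(sphere, [-5.0, -5.0], [5.0, 5.0])
```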
```csharp
var pso = new ParticleSwarm(Rastrigin, 2, lower, upper);
pso.PopulationSize = 40;
pso.Minimize();

Console.WriteLine($"Solution: [{pso.BestParameterSet.Values[0]:F6}, {pso.BestParameterSet.Values[1]:F6}]");
```

Advantages: Fast convergence, simple concept, works well for continuous problems.
Parameters:
- `PopulationSize`: Number of particles (default = 30)
SCE-UA was specifically developed for calibrating hydrological models [6]. The algorithm partitions the population into several complexes (set via the `Complexes` property); each complex evolves independently through the Competitive Complex Evolution (CCE) procedure, and the complexes are then shuffled together and re-partitioned so that information is shared across the whole population.
The CCE step within each complex proceeds as follows:

1. Sub-complex selection: Select $q = D+1$ points (where $D$ is the number of parameters) from the complex of $N$ points using a trapezoidal probability distribution that favors better-ranked points: $P(i) = \frac{2(N+1-i)}{N(N+1)}$
2. Reflection: Compute the centroid $\mathbf{g}$ of all sub-complex points except the worst point $\mathbf{x}_w$, then reflect: $\mathbf{x}_r = 2\mathbf{g} - \mathbf{x}_w$
3. Contraction: If the reflected point is infeasible or worse than $\mathbf{x}_w$, try contraction: $\mathbf{x}_c = \frac{\mathbf{g} + \mathbf{x}_w}{2}$
4. Mutation: If contraction also fails, generate a random point within the smallest bounding box of the complex.
The shuffling step redistributes points across complexes, preventing any single complex from stagnating. This combination of local competitive evolution within complexes and global information sharing between complexes makes SCE-UA particularly effective for the rugged, multimodal objective functions typical of hydrological model calibration.
```csharp
var sce = new ShuffledComplexEvolution(Rastrigin, 2, lower, upper);
sce.Complexes = 5;
sce.MaxIterations = 1000;
sce.Minimize();

Console.WriteLine($"Calibrated parameters: [{sce.BestParameterSet.Values[0]:F6}, {sce.BestParameterSet.Values[1]:F6}]");
```

Advantages: Excellent for hydrological model calibration, balances exploration and exploitation.
Best for: Calibrating watershed models, water resources applications.
SA mimics the physical process of annealing in metallurgy [7]. It accepts uphill moves with decreasing probability, allowing escape from local minima.
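A candidate move that worsens the objective by $\Delta f > 0$ is accepted with the Metropolis probability

$$P(\text{accept}) = \exp\!\left(-\frac{\Delta f}{T}\right)$$

while improving moves are always accepted. Consistent with the CoolingRate parameter below, the temperature follows a geometric schedule, $T_{k+1} = \alpha\, T_k$, so uphill moves become increasingly rare as the search cools.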
```csharp
var sa = new SimulatedAnnealing(Rastrigin, 2, lower, upper);
sa.InitialTemperature = 10.0;
sa.CoolingRate = 0.95;
sa.MaxIterations = 10000;
sa.Minimize();

Console.WriteLine($"Solution: [{sa.BestParameterSet.Values[0]:F6}, {sa.BestParameterSet.Values[1]:F6}]");
```

Advantages: Can escape local minima, works for discrete and continuous problems.
Parameters:
- `InitialTemperature`: Starting temperature (default = 10)
- `CoolingRate`: Temperature reduction factor (default = 0.95)
Combines local search with multiple random starting points:
```csharp
var initial = new double[] { 0.0, 0.0 };
var ms = new MultiStart(Rastrigin, 2, initial, lower, upper, LocalMethod.BFGS);
ms.MaxIterations = 20; // Number of random starts
ms.Minimize();

Console.WriteLine($"Best solution: [{ms.BestParameterSet.Values[0]:F6}, {ms.BestParameterSet.Values[1]:F6}]");
```

Advantages: Simple to implement, leverages fast local search.
Disadvantages: May waste evaluations in same basin of attraction.
Clustering-based global optimization that avoids redundant local searches:
```csharp
var initial = new double[] { 0.0, 0.0 };
var mlsl = new MLSL(Rastrigin, 2, initial, lower, upper, LocalMethod.BFGS);
mlsl.Minimize();
```

Advantages: More efficient than multi-start, avoids redundant searches.
The Augmented Lagrangian method handles equality and inequality constraints by adding penalty terms to the objective function [8].
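In the standard formulation for equality constraints $g_i(\mathbf{x}) = 0$, the augmented Lagrangian combines a Lagrange multiplier term with a quadratic penalty [8]:

$$L_A(\mathbf{x}, \boldsymbol{\lambda}; \mu) = f(\mathbf{x}) + \sum_i \lambda_i\, g_i(\mathbf{x}) + \frac{\mu}{2} \sum_i g_i(\mathbf{x})^2$$

The inner optimizer minimizes $L_A$ for fixed multipliers; the multipliers are then updated ($\lambda_i \leftarrow \lambda_i + \mu\, g_i(\mathbf{x})$) and the penalty parameter $\mu$ is increased. Inequality constraints are handled analogously, with the penalty applied only while the constraint is violated or active.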
```csharp
using Numerics.Mathematics.Optimization;

// Objective: minimize x² + y²
double Objective(double[] x)
{
    return x[0] * x[0] + x[1] * x[1];
}

// Constraint: x + y >= 1
var constraint = new Constraint(
    x => x[0] + x[1], 2, 1.0, // g(x) >= value form
    ConstraintType.GreaterThanOrEqualTo
);

var lower = new double[] { -5, -5 };
var upper = new double[] { 5, 5 };
var initial = new double[] { 0, 0 };

// Create the inner optimizer
var bfgs = new BFGS(Objective, 2, initial, lower, upper);

// Create the Augmented Lagrange optimizer with constraints
var al = new AugmentedLagrange(Objective, bfgs, new IConstraint[] { constraint });
al.MaxIterations = 100;
al.Minimize();

Console.WriteLine($"Optimal solution: [{al.BestParameterSet.Values[0]:F6}, {al.BestParameterSet.Values[1]:F6}]");
Console.WriteLine($"Constraint satisfied: {al.BestParameterSet.Values[0] + al.BestParameterSet.Values[1]:F6} >= 1");
```

Constraint Types:
- `ConstraintType.EqualTo`: Equality constraint $g(\mathbf{x}) = 0$
- `ConstraintType.LesserThanOrEqualTo`: Inequality constraint $g(\mathbf{x}) \leq 0$
- `ConstraintType.GreaterThanOrEqualTo`: Inequality constraint $g(\mathbf{x}) \geq 0$
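The multiplier-update loop can be illustrated on the problem above (minimize $x^2 + y^2$ subject to $x + y \geq 1$, whose analytic optimum is $(0.5, 0.5)$). This is a toy gradient-based sketch in Python, not the Numerics implementation; the inner solve here is plain gradient descent with an analytic gradient:

```python
# Toy augmented Lagrangian sketch for: minimize x^2 + y^2  s.t.  x + y >= 1.
# Outer loop: update multiplier and grow penalty; inner loop: gradient descent.

def augmented_lagrangian(outer_iters=12, inner_iters=100):
    x, y = 0.0, 0.0        # infeasible starting point
    lam, mu = 0.0, 1.0     # multiplier estimate and penalty parameter
    for _ in range(outer_iters):
        lr = 0.4 / (1.0 + mu)              # step size shrinks as the penalty grows
        for _ in range(inner_iters):
            g = x + y - 1.0                # constraint value; feasible when g >= 0
            t = max(0.0, lam - mu * g)     # nonzero only while the constraint binds
            # Gradient of f + inequality penalty with respect to x and y
            x -= lr * (2.0 * x - t)
            y -= lr * (2.0 * y - t)
        lam = max(0.0, lam - mu * (x + y - 1.0))  # multiplier update
        mu *= 2.0                                  # tighten the penalty
    return x, y

x_opt, y_opt = augmented_lagrangian()
```

The multiplier converges to the KKT value $\lambda = 1$ and the iterates approach $(0.5, 0.5)$; in the library, the inner minimization is delegated to whichever optimizer you pass in (BFGS above).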
Example: minimize a quadratic objective subject to multiple constraints:
```csharp
// Minimize f(x,y) = (x-3)² + (y-2)²
// Subject to: x + y <= 5
//             x >= 1
//             y >= 1
double ObjectiveFunc(double[] x)
{
    return Math.Pow(x[0] - 3, 2) + Math.Pow(x[1] - 2, 2);
}

var c1 = new Constraint(x => x[0] + x[1], 2, 5.0, ConstraintType.LesserThanOrEqualTo);
var c2 = new Constraint(x => x[0], 2, 1.0, ConstraintType.GreaterThanOrEqualTo);
var c3 = new Constraint(x => x[1], 2, 1.0, ConstraintType.GreaterThanOrEqualTo);

var innerOptimizer = new BFGS(ObjectiveFunc, 2, new[] { 2.0, 2.0 },
    new[] { 0.0, 0.0 }, new[] { 10.0, 10.0 });

var constrained = new AugmentedLagrange(ObjectiveFunc, innerOptimizer,
    new IConstraint[] { c1, c2, c3 });
constrained.Minimize();

Console.WriteLine($"Constrained optimum: [{constrained.BestParameterSet.Values[0]:F4}, " +
    $"{constrained.BestParameterSet.Values[1]:F4}]");
```

A complete example of using optimization to calibrate a watershed model:
```csharp
using Numerics.Mathematics.Optimization;
using Numerics.Data.Statistics;

// Observed streamflow data
double[] observed = { 12.5, 15.3, 18.7, 16.2, 14.1, 11.8, 10.3 };

// Simple runoff model: Q = C × P^α where Q is flow, P is precipitation,
// and C and α are the parameters to calibrate
double[] precipitation = { 10, 12, 15, 13, 11, 9, 8 };

// Objective function: minimize RMSE
double ObjectiveFunction(double[] parameters)
{
    double C = parameters[0];
    double alpha = parameters[1];

    // Simulate streamflow
    double[] simulated = new double[observed.Length];
    for (int i = 0; i < observed.Length; i++)
    {
        simulated[i] = C * Math.Pow(precipitation[i], alpha);
    }

    // Compute RMSE
    return GoodnessOfFit.RMSE(observed, simulated);
}

// Parameter bounds
var lower = new double[] { 0.1, 0.5 }; // C >= 0.1, α >= 0.5
var upper = new double[] { 5.0, 3.0 }; // C <= 5.0, α <= 3.0

// Use SCE-UA (recommended for hydrological calibration)
var optimizer = new ShuffledComplexEvolution(ObjectiveFunction, 2, lower, upper);
optimizer.Complexes = 5;
optimizer.MaxIterations = 1000;
optimizer.Minimize();

Console.WriteLine($"Calibrated parameters:");
Console.WriteLine($"  C = {optimizer.BestParameterSet.Values[0]:F4}");
Console.WriteLine($"  α = {optimizer.BestParameterSet.Values[1]:F4}");
Console.WriteLine($"  RMSE = {optimizer.BestParameterSet.Fitness:F4}");
Console.WriteLine($"  Function evaluations: {optimizer.FunctionEvaluations}");

// Verify calibration
double[] finalSimulated = new double[observed.Length];
for (int i = 0; i < observed.Length; i++)
{
    finalSimulated[i] = optimizer.BestParameterSet.Values[0] *
        Math.Pow(precipitation[i], optimizer.BestParameterSet.Values[1]);
}
double nse = GoodnessOfFit.NashSutcliffeEfficiency(observed, finalSimulated);
Console.WriteLine($"  NSE = {nse:F4}");
```

| Problem Type | Recommended Method | Notes |
|---|---|---|
| Smooth, unimodal | BFGS or Powell | Fast convergence to local minimum |
| Non-smooth, unimodal | Nelder-Mead | Robust direct search |
| Multimodal, global search | Differential Evolution or SCE-UA | Thorough exploration |
| Quick global search | Particle Swarm | Faster but less thorough |
| Hydrological calibration | SCE-UA | Specifically designed for this |
| With constraints | Augmented Lagrangian | Handles equality and inequality constraints |
| 1D problem | Brent Search or Golden Section | Specialized efficient methods |
| Machine learning | ADAM | Adaptive learning rates |
| Unknown smoothness | Start with Nelder-Mead | Very robust |
| High dimensions (>20) | Differential Evolution or ADAM | Scale better than others |
- Scale Variables: Normalize parameters to similar ranges (e.g., [0,1] or [-1,1]) for better convergence.
- Set Reasonable Bounds: Tight bounds improve convergence but shouldn't exclude the optimum.
- Multiple Runs: For global optimization, run multiple times with different random seeds to ensure robustness.
- Hybrid Approach: Use a global method followed by local refinement:

  ```csharp
  // Global search
  var de = new DifferentialEvolution(ObjectiveFunc, n, lower, upper);
  de.Minimize();

  // Local refinement from the global solution
  var bfgs = new BFGS(ObjectiveFunc, n, de.BestParameterSet.Values, lower, upper);
  bfgs.Minimize();
  ```

- Monitor Convergence: Check the `ParameterSetTrace` to diagnose convergence issues:

  ```csharp
  foreach (var ps in optimizer.ParameterSetTrace)
  {
      Console.WriteLine($"Fitness: {ps.Fitness}");
  }
  ```

- Adjust Tolerances: Tighter tolerances require more evaluations:

  ```csharp
  optimizer.RelativeTolerance = 1e-10; // Very tight
  optimizer.AbsoluteTolerance = 1e-10;
  ```

- Population Size: For global methods, larger populations explore better but cost more:

  ```csharp
  de.PopulationSize = 20 * NumberOfParameters; // Rule of thumb
  ```
The BestParameterSet contains:

- `Values`: The optimal parameter vector
- `Fitness`: The objective function value at the optimum
- `Weight`: (Optional) Used internally by some algorithms
When `ComputeHessian = true`, the Hessian at the solution is computed numerically. The eigenvalues of the Hessian describe the curvature of the objective surface at the optimum:

- All eigenvalues positive: The solution is a local minimum (the Hessian is positive definite)
- Large eigenvalue: The objective is highly curved in that direction, so the corresponding parameter is well-determined
- Small eigenvalue: The objective is nearly flat, so the parameter is poorly determined (weakly identifiable)
- Near-zero eigenvalue: Indicates a ridge or valley in the objective surface, suggesting parameter correlation or redundancy
The inverse of the Hessian approximates the covariance matrix of the parameters, so parameter standard errors can be estimated as $SE(\hat{\theta}_i) \approx \sqrt{|H_{ii}^{-1}|}$:
```csharp
optimizer.ComputeHessian = true;
optimizer.Minimize();

// Access the Hessian
var H = optimizer.Hessian;

// Parameter standard errors (diagonal approximation)
for (int i = 0; i < optimizer.NumberOfParameters; i++)
{
    double se = Math.Sqrt(Math.Abs(1.0 / H[i, i]));
    Console.WriteLine($"Parameter {i} ± {se:F4}");
}
```

Check the Status property to verify success:
```csharp
if (optimizer.Status == OptimizationStatus.Success)
{
    Console.WriteLine("Optimization converged successfully");
}
else if (optimizer.Status == OptimizationStatus.MaximumIterationsReached)
{
    Console.WriteLine("Maximum iterations reached - may not have converged");
}
```

- Function Evaluation Cost: Most optimization time is spent evaluating the objective function. Optimize your function first.
- Parallel Evaluation: For population-based methods (DE, PSO), evaluate population members in parallel if your function is thread-safe.
- Warm Start: If solving similar problems repeatedly, use the previous solution as the initial guess.
- Gradient Information: If you can provide analytical gradients, BFGS will converge much faster (though the current implementation uses numerical gradients).
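The parallel-evaluation tip amounts to mapping the objective over the population with a worker pool. A generic Python sketch (not library code; for CPU-bound Python objectives a process pool would be the usual choice, but a thread pool is shown because it needs no pickling):

```python
# Sketch: evaluating a population of candidate solutions in parallel.
import math
from concurrent.futures import ThreadPoolExecutor

def rastrigin(x):
    """Thread-safe objective: no shared mutable state."""
    return 10 * len(x) + sum(v * v - 10 * math.cos(2 * math.pi * v) for v in x)

population = [[0.5 * i, -0.25 * i] for i in range(8)]

# Serial evaluation (what a population-based optimizer does each generation)
serial = [rastrigin(x) for x in population]

# Parallel evaluation: map the objective over the population on a thread pool;
# pool.map preserves the order of the population
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(rastrigin, population))
```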
[1] Nocedal, J., & Wright, S. J. (2006). Numerical Optimization (2nd ed.). Springer.
[2] Nelder, J. A., & Mead, R. (1965). A simplex method for function minimization. The Computer Journal, 7(4), 308-313.
[3] Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
[4] Storn, R., & Price, K. (1997). Differential evolution–a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 11(4), 341-359.
[5] Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. Proceedings of IEEE International Conference on Neural Networks, 4, 1942-1948.
[6] Duan, Q., Sorooshian, S., & Gupta, V. (1992). Effective and efficient global optimization for conceptual rainfall-runoff models. Water Resources Research, 28(4), 1015-1031.
[7] Kirkpatrick, S., Gelatt, C. D., & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220(4598), 671-680.
[8] Birgin, E. G., & Martínez, J. M. (2014). Practical Augmented Lagrangian Methods for Constrained Optimization. SIAM.