
Optimization

← Previous: Numerical Differentiation | Back to Index | Next: Root Finding →

Optimization is the process of finding the parameter set that minimizes (or maximizes) an objective function. The Numerics library provides a comprehensive suite of local and global optimization algorithms for both unconstrained and constrained problems. These methods are essential for parameter estimation, model calibration, machine learning, and engineering design optimization.

Overview

| Method | Type | Best For | Requires Derivatives |
|--------|------|----------|----------------------|
| **Local Methods** | | | |
| BFGS | Local | Smooth, differentiable functions | No (numerical) |
| Nelder-Mead | Local | General purpose, robust | No |
| Powell | Local | Smooth functions without derivatives | No |
| Golden Section | Local | 1D problems | No |
| Brent Search | Local | 1D problems, smooth functions | No |
| Gradient Descent | Local | Large-scale, smooth problems | No (numerical) |
| ADAM | Local | Machine learning, stochastic | No (numerical) |
| **Global Methods** | | | |
| Differential Evolution | Global | Multimodal, robust | No |
| Particle Swarm | Global | Multimodal, fast convergence | No |
| Shuffled Complex Evolution | Global | Hydrological calibration | No |
| Simulated Annealing | Global | Discrete/continuous, multimodal | No |
| Multi-Start | Global | Combines local search with random starts | No |
| MLSL | Global | Smooth multimodal functions | Requires local solver |
| **Constrained** | | | |
| Augmented Lagrangian | Constrained | Equality and inequality constraints | No |

Problem Formulation

An optimization problem seeks to find:

$$\min_{\mathbf{x}} f(\mathbf{x})$$

subject to:

$$\mathbf{x}_L \leq \mathbf{x} \leq \mathbf{x}_U$$

where $f(\mathbf{x})$ is the objective function, $\mathbf{x}$ is the parameter vector, and $\mathbf{x}_L$, $\mathbf{x}_U$ are lower and upper bounds.

For maximization problems, minimize $-f(\mathbf{x})$ or use the Maximize() method.
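For instance, a minimal sketch that maximizes $f(x) = -(x-2)^2$ using the BFGS optimizer introduced below (any optimizer exposes the same Maximize() method):

double NegParabola(double[] x) => -Math.Pow(x[0] - 2, 2); // maximum at x = 2

var opt = new BFGS(NegParabola, 1, new double[] { 0.0 },
                   new double[] { -5.0 }, new double[] { 5.0 });
opt.Maximize(); // equivalent to minimizing -f(x)

Console.WriteLine($"Maximum at x = {opt.BestParameterSet.Values[0]:F6}");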

Common Interface

All optimizers in Numerics inherit from the Optimizer base class and share a common interface.

Input Properties

  • MaxIterations: Maximum iterations allowed (default = 10,000)
  • MaxFunctionEvaluations: Maximum function evaluations (default = int.MaxValue)
  • AbsoluteTolerance: Absolute convergence tolerance (default = 1E-8)
  • RelativeTolerance: Relative convergence tolerance (default = 1E-8)
  • ReportFailure: Throw exception on failure (default = true)
  • RecordTraces: Save optimization trace (default = true)
  • ComputeHessian: Compute Hessian at solution (default = true)

Output Properties

  • BestParameterSet: Optimal solution (Values, Fitness)
  • Iterations: Number of iterations performed
  • FunctionEvaluations: Number of function evaluations
  • Status: Optimization status (Success, Failure, etc.)
  • ParameterSetTrace: Full trace of parameter evaluations
  • Hessian: Numerically differentiated Hessian matrix (if computed)

Methods

  • Minimize(): Minimize the objective function
  • Maximize(): Maximize the objective function (minimizes $-f(\mathbf{x})$)
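The sketch below shows the shared workflow, with Nelder-Mead (introduced later) standing in for any optimizer: set input properties, run, then read output properties.

double Sphere(double[] x) => x[0] * x[0] + x[1] * x[1];

var opt = new NelderMead(Sphere, 2, new double[] { 1.0, 1.0 },
                         new double[] { -5.0, -5.0 }, new double[] { 5.0, 5.0 });
opt.MaxIterations = 500;        // input properties are set before the run
opt.RelativeTolerance = 1e-6;
opt.Minimize();

Console.WriteLine($"Status: {opt.Status}");  // output properties are read afterward
Console.WriteLine($"f(x*) = {opt.BestParameterSet.Fitness:E4}");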

Local Optimization

Local optimization methods find the nearest local minimum from a starting point. They are fast and efficient for smooth, unimodal functions but may get trapped in local minima for multimodal problems.

BFGS (Broyden-Fletcher-Goldfarb-Shanno)

BFGS is a quasi-Newton method that builds an approximation to the Hessian matrix using gradient information [1]. It's one of the most effective methods for smooth, unconstrained optimization.

using Numerics.Mathematics.Optimization;

// Rosenbrock function: f(x,y) = (1-x)² + 100(y-x²)²
double Rosenbrock(double[] x)
{
    return Math.Pow(1 - x[0], 2) + 100 * Math.Pow(x[1] - x[0] * x[0], 2);
}

// Initial guess
var initial = new double[] { -1.2, 1.0 };
var lower = new double[] { -2, -2 };
var upper = new double[] { 2, 2 };

// Create and configure optimizer
var bfgs = new BFGS(Rosenbrock, 2, initial, lower, upper);
bfgs.RelativeTolerance = 1e-8;
bfgs.MaxIterations = 1000;

// Minimize
bfgs.Minimize();

// Results
Console.WriteLine($"Optimal solution: x = {bfgs.BestParameterSet.Values[0]:F6}, y = {bfgs.BestParameterSet.Values[1]:F6}");
Console.WriteLine($"Minimum value: {bfgs.BestParameterSet.Fitness:F10}");
Console.WriteLine($"Iterations: {bfgs.Iterations}");
Console.WriteLine($"Function evaluations: {bfgs.FunctionEvaluations}");
Console.WriteLine($"Status: {bfgs.Status}");

Advantages: Fast convergence, memory efficient, works well for most smooth problems.

Disadvantages: Can fail on non-smooth functions, requires bounded search space.

Nelder-Mead Simplex

The Nelder-Mead method is a direct search algorithm that uses a simplex (geometric figure with n+1 vertices in n dimensions) to search the parameter space [2]. It's robust and doesn't require derivatives.

var nm = new NelderMead(Rosenbrock, 2, initial, lower, upper);
nm.RelativeTolerance = 1e-6;
nm.Minimize();

Console.WriteLine($"Optimal solution: [{nm.BestParameterSet.Values[0]:F6}, {nm.BestParameterSet.Values[1]:F6}]");
Console.WriteLine($"Minimum value: {nm.BestParameterSet.Fitness:F10}");

Advantages: Very robust, doesn't require derivatives, handles non-smooth functions.

Disadvantages: Slower convergence than gradient-based methods, can stagnate.

Powell's Method

Powell's method is a conjugate direction algorithm that doesn't require derivatives [1]. It performs successive line searches along conjugate directions.

var powell = new Powell(Rosenbrock, 2, initial, lower, upper);
powell.RelativeTolerance = 1e-8;
powell.Minimize();

Console.WriteLine($"Solution: [{powell.BestParameterSet.Values[0]:F6}, {powell.BestParameterSet.Values[1]:F6}]");

Advantages: No derivatives required, good for smooth functions.

Disadvantages: Can be slow in high dimensions, sensitive to scaling.

Gradient Descent

Simple gradient-based optimization with line search:

var gd = new GradientDescent(Rosenbrock, 2, initial, lower, upper);
gd.Alpha = 0.001; // Step size (learning rate)
gd.Minimize();

ADAM Optimizer

Adaptive Moment Estimation, popular in machine learning applications [3]:

var adam = new ADAM(Rosenbrock, 2, initial, lower, upper);
adam.Alpha = 0.001;
adam.Beta1 = 0.9;  // First moment decay
adam.Beta2 = 0.999; // Second moment decay
adam.Minimize();

Advantages: Adaptive learning rates, works well for noisy objectives.

Best for: Machine learning, stochastic optimization problems.

One-Dimensional Methods

For single-parameter optimization, specialized 1D methods are more efficient:

Golden Section Search

Uses the golden ratio to bracket the minimum; each iteration shrinks the bracketing interval by the constant factor $1/\varphi \approx 0.618$:

Func<double, double> f1d = x => Math.Pow(x - 2, 2) + 3;

var golden = new GoldenSection(f1d, 0.0, 5.0);
golden.Minimize();

Console.WriteLine($"Minimum at x = {golden.BestParameterSet.Values[0]:F6}");

Brent Search

Combines golden section search with parabolic interpolation for faster convergence:

var brent = new BrentSearch(f1d, 0.0, 5.0);
brent.Minimize();

Global Optimization

Global optimization methods are designed to find the global minimum across the entire search space, avoiding local minima. They typically require more function evaluations but are more robust for multimodal problems.

Differential Evolution

Differential Evolution (DE) is a population-based evolutionary algorithm that's very robust for continuous optimization [4]. For each member $\mathbf{x}_i$ of the population, DE creates a trial vector $\mathbf{u}$ using mutation and crossover:

Mutation (DE/rand/1): Three distinct population members $\mathbf{x}_{r_0}$, $\mathbf{x}_{r_1}$, $\mathbf{x}_{r_2}$ are randomly selected, and a mutant vector is formed:

$$\mathbf{v} = \mathbf{x}_{r_0} + G \cdot (\mathbf{x}_{r_1} - \mathbf{x}_{r_2})$$

where $G$ is a scale factor. In the Numerics implementation, $G$ is dithered with 90% probability: $G = 0.5 + \text{rand}() \times 0.5$, which helps avoid stagnation.

Crossover (binomial): The trial vector $\mathbf{u}$ is assembled component-by-component:

$$u_j = \begin{cases} v_j & \text{if } \text{rand}() \leq CR \text{ or } j = j_{\text{rand}} \\ x_{i,j} & \text{otherwise} \end{cases}$$

where $CR$ is the crossover probability and $j_{\text{rand}}$ is a randomly chosen index that ensures at least one component comes from the mutant vector.

Selection (greedy): The trial vector replaces the target only if it has equal or better fitness:

$$\mathbf{x}_i^{(g+1)} = \begin{cases} \mathbf{u} & \text{if } f(\mathbf{u}) \leq f(\mathbf{x}_i^{(g)}) \\ \mathbf{x}_i^{(g)} & \text{otherwise} \end{cases}$$

The algorithm converges when the standard deviation of fitness values across the population falls below the tolerance.
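To make the mechanics concrete, here is a minimal standalone sketch of one trial-vector construction. This is illustrative only, not the library's internal code; the index selection and $G$ dithering are simplified.

// Build one DE/rand/1 trial vector with binomial crossover
double[] TrialVector(double[][] population, int i, double G, double CR, Random rand)
{
    int D = population[i].Length;

    // Pick three distinct members, all different from the target i
    int r0, r1, r2;
    do { r0 = rand.Next(population.Length); } while (r0 == i);
    do { r1 = rand.Next(population.Length); } while (r1 == i || r1 == r0);
    do { r2 = rand.Next(population.Length); } while (r2 == i || r2 == r0 || r2 == r1);

    var u = (double[])population[i].Clone();
    int jRand = rand.Next(D); // guarantees at least one mutant component
    for (int j = 0; j < D; j++)
    {
        if (rand.NextDouble() <= CR || j == jRand)
            u[j] = population[r0][j] + G * (population[r1][j] - population[r2][j]);
    }
    return u;
}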

using Numerics.Mathematics.Optimization;

// Rastrigin function (highly multimodal)
double Rastrigin(double[] x)
{
    double sum = 10 * x.Length;
    for (int i = 0; i < x.Length; i++)
    {
        sum += x[i] * x[i] - 10 * Math.Cos(2 * Math.PI * x[i]);
    }
    return sum;
}

var lower = new double[] { -5.12, -5.12 };
var upper = new double[] { 5.12, 5.12 };

var de = new DifferentialEvolution(Rastrigin, 2, lower, upper);
de.PopulationSize = 50;
de.CrossoverProbability = 0.9;
de.Mutation = 0.75;
de.MaxIterations = 1000;
de.Minimize();

Console.WriteLine($"Global minimum: [{de.BestParameterSet.Values[0]:F6}, {de.BestParameterSet.Values[1]:F6}]");
Console.WriteLine($"Function value: {de.BestParameterSet.Fitness:F10}");

Advantages: Very robust, handles discontinuities, good for difficult problems.

Parameters:

  • PopulationSize: Number of candidate solutions (default = 10 × dimensions)
  • CrossoverProbability: Probability of crossover (default = 0.9)
  • Mutation: Scaling factor for mutation (default = 0.75)

Particle Swarm Optimization

PSO simulates social behavior of bird flocking or fish schooling [5]. Each particle $i$ has a position $\mathbf{x}_i$ and velocity $\mathbf{v}_i$ that are updated at each iteration using:

$$v_{i,j}^{(t+1)} = w \cdot v_{i,j}^{(t)} + c_1 \cdot r_1 \cdot (p_{i,j} - x_{i,j}^{(t)}) + c_2 \cdot r_2 \cdot (g_j - x_{i,j}^{(t)})$$

$$x_{i,j}^{(t+1)} = x_{i,j}^{(t)} + v_{i,j}^{(t+1)}$$

where:

  • $w$ is the inertia weight, which decreases linearly from $w_{\max} = 0.9$ to $w_{\min} = 0.4$ over the optimization, balancing exploration (high $w$) and exploitation (low $w$)
  • $c_1 = c_2 = 2.05$ are the cognitive and social acceleration coefficients
  • $r_1, r_2 \sim U(0,1)$ are independent random numbers (per component, per particle)
  • $\mathbf{p}_i$ is particle $i$'s personal best position (best position it has visited)
  • $\mathbf{g}$ is the global best position (best position any particle has visited)

The three terms represent momentum (continue in the current direction), cognitive pull (return toward personal best), and social pull (move toward the swarm's best).
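A compact sketch of this update for a single particle (illustrative only, not the library's internal code):

// One velocity/position update for a particle; x, v, pBest are this
// particle's arrays and gBest is the swarm's best position
void UpdateParticle(double[] x, double[] v, double[] pBest, double[] gBest,
                    double w, double c1, double c2, Random rand)
{
    for (int j = 0; j < x.Length; j++)
    {
        double r1 = rand.NextDouble(), r2 = rand.NextDouble(); // fresh per component
        v[j] = w * v[j] + c1 * r1 * (pBest[j] - x[j]) + c2 * r2 * (gBest[j] - x[j]);
        x[j] += v[j]; // position update
    }
}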

var pso = new ParticleSwarm(Rastrigin, 2, lower, upper);
pso.PopulationSize = 40;
pso.Minimize();

Console.WriteLine($"Solution: [{pso.BestParameterSet.Values[0]:F6}, {pso.BestParameterSet.Values[1]:F6}]");

Advantages: Fast convergence, simple concept, works well for continuous problems.

Parameters:

  • PopulationSize: Number of particles (default = 30)

Shuffled Complex Evolution (SCE-UA)

SCE-UA was specifically developed for calibrating hydrological models [6]. The algorithm partitions the population into $p$ complexes, evolves each complex independently using a Competitive Complex Evolution (CCE) strategy, then shuffles members between complexes to share information.

The CCE step within each complex proceeds as follows:

  1. Sub-complex selection: Select $q = D+1$ points from the complex using a trapezoidal probability distribution that favors better-ranked points: $P(i) = \frac{2(N+1-i)}{N(N+1)}$

  2. Reflection: Compute the centroid $\mathbf{g}$ of all sub-complex points except the worst point $\mathbf{x}_w$, then reflect:

$$\mathbf{r} = 2\mathbf{g} - \mathbf{x}_w$$

  3. Contraction: If the reflected point is infeasible or worse than $\mathbf{x}_w$, try contraction:

$$\mathbf{c} = \frac{\mathbf{g} + \mathbf{x}_w}{2}$$

  4. Mutation: If contraction also fails, generate a random point within the smallest bounding box of the complex.

The shuffling step redistributes points across complexes, preventing any single complex from stagnating. This combination of local competitive evolution within complexes and global information sharing between complexes makes SCE-UA particularly effective for the rugged, multimodal objective functions typical of hydrological model calibration.
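The reflection/contraction/mutation cascade can be sketched as follows (illustrative only; the full algorithm samples the mutation point from the complex's own bounding box, which is simplified here):

// Propose a replacement for the worst sub-complex point x_w given centroid g
double[] EvolveWorstPoint(double[] g, double[] xw, double fWorst,
                          Func<double[], double> f,
                          double[] lower, double[] upper, Random rand)
{
    int D = xw.Length;
    bool InBounds(double[] p)
    {
        for (int j = 0; j < D; j++)
            if (p[j] < lower[j] || p[j] > upper[j]) return false;
        return true;
    }

    // 1. Reflection: r = 2g - x_w
    var r = new double[D];
    for (int j = 0; j < D; j++) r[j] = 2 * g[j] - xw[j];
    if (InBounds(r) && f(r) < fWorst) return r;

    // 2. Contraction: c = (g + x_w) / 2
    var c = new double[D];
    for (int j = 0; j < D; j++) c[j] = (g[j] + xw[j]) / 2;
    if (f(c) < fWorst) return c;

    // 3. Mutation: random feasible point (simplified; the full algorithm
    //    uses the smallest bounding box of the complex, not the global bounds)
    var m = new double[D];
    for (int j = 0; j < D; j++) m[j] = lower[j] + rand.NextDouble() * (upper[j] - lower[j]);
    return m;
}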

var sce = new ShuffledComplexEvolution(Rastrigin, 2, lower, upper);
sce.Complexes = 5;
sce.MaxIterations = 1000;
sce.Minimize();

Console.WriteLine($"Calibrated parameters: [{sce.BestParameterSet.Values[0]:F6}, {sce.BestParameterSet.Values[1]:F6}]");

Advantages: Excellent for hydrological model calibration, balances exploration and exploitation.

Best for: Calibrating watershed models, water resources applications.

Simulated Annealing

SA mimics the physical process of annealing in metallurgy [7]. It accepts uphill moves with decreasing probability, allowing escape from local minima.
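The classical acceptance rule is the Metropolis criterion: a move that worsens the objective by $\Delta f > 0$ at temperature $T$ is accepted with probability

$$P(\text{accept}) = \exp\left(-\frac{\Delta f}{T}\right)$$

so most uphill moves are accepted early in the run (high $T$), and the search becomes increasingly greedy as the temperature cools. (This is the textbook formulation; in Numerics the schedule is governed by the InitialTemperature and CoolingRate parameters below.)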

var sa = new SimulatedAnnealing(Rastrigin, 2, lower, upper);
sa.InitialTemperature = 10.0;
sa.CoolingRate = 0.95;
sa.MaxIterations = 10000;
sa.Minimize();

Console.WriteLine($"Solution: [{sa.BestParameterSet.Values[0]:F6}, {sa.BestParameterSet.Values[1]:F6}]");

Advantages: Can escape local minima, works for discrete and continuous problems.

Parameters:

  • InitialTemperature: Starting temperature (default = 10)
  • CoolingRate: Temperature reduction factor (default = 0.95)

Multi-Start Optimization

Combines local search with multiple random starting points:

var initial = new double[] { 0.0, 0.0 };
var ms = new MultiStart(Rastrigin, 2, initial, lower, upper, LocalMethod.BFGS);
ms.MaxIterations = 20; // Number of random starts
ms.Minimize();

Console.WriteLine($"Best solution: [{ms.BestParameterSet.Values[0]:F6}, {ms.BestParameterSet.Values[1]:F6}]");

Advantages: Simple to implement, leverages fast local search.

Disadvantages: May waste evaluations on starts that converge to the same basin of attraction.

MLSL (Multi-Level Single Linkage)

Clustering-based global optimization that avoids redundant local searches:

var initial = new double[] { 0.0, 0.0 };
var mlsl = new MLSL(Rastrigin, 2, initial, lower, upper, LocalMethod.BFGS);
mlsl.Minimize();

Advantages: More efficient than multi-start, avoids redundant searches.

Constrained Optimization

Augmented Lagrangian

The Augmented Lagrangian method handles equality and inequality constraints by adding penalty terms to the objective function [8].
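For equality constraints $g_i(\mathbf{x}) = 0$, a common form of the augmented Lagrangian is

$$L_A(\mathbf{x}, \boldsymbol{\lambda}; \mu) = f(\mathbf{x}) + \sum_i \lambda_i \, g_i(\mathbf{x}) + \frac{\mu}{2} \sum_i g_i(\mathbf{x})^2$$

where the multipliers $\lambda_i$ and penalty parameter $\mu$ are updated between successive inner minimizations, and inequality constraints contribute one-sided (active-only) penalty terms [8]. This is the textbook formulation; the exact update rules in Numerics may differ in detail.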

using Numerics.Mathematics.Optimization;

// Objective: minimize x² + y²
double Objective(double[] x)
{
    return x[0] * x[0] + x[1] * x[1];
}

// Constraint: x + y >= 1
var constraint = new Constraint(
    x => x[0] + x[1], 2, 1.0,  // g(x) >= value form
    ConstraintType.GreaterThanOrEqualTo
);

var lower = new double[] { -5, -5 };
var upper = new double[] { 5, 5 };
var initial = new double[] { 0, 0 };

// Create the inner optimizer
var bfgs = new BFGS(Objective, 2, initial, lower, upper);

// Create the Augmented Lagrange optimizer with constraints
var al = new AugmentedLagrange(Objective, bfgs, new IConstraint[] { constraint });
al.MaxIterations = 100;
al.Minimize();

Console.WriteLine($"Optimal solution: [{al.BestParameterSet.Values[0]:F6}, {al.BestParameterSet.Values[1]:F6}]");
Console.WriteLine($"Constraint satisfied: {al.BestParameterSet.Values[0] + al.BestParameterSet.Values[1]:F6} >= 1");

Constraint Types:

  • ConstraintType.EqualTo: Equality constraint $g(\mathbf{x}) = b$
  • ConstraintType.LesserThanOrEqualTo: Inequality constraint $g(\mathbf{x}) \leq b$
  • ConstraintType.GreaterThanOrEqualTo: Inequality constraint $g(\mathbf{x}) \geq b$

where $b$ is the constraint value passed to the Constraint constructor (1.0 in the example above).

Example: Minimize subject to multiple constraints:

// Minimize f(x,y) = (x-3)² + (y-2)²
// Subject to: x + y <= 5
//             x >= 1
//             y >= 1

double ObjectiveFunc(double[] x)
{
    return Math.Pow(x[0] - 3, 2) + Math.Pow(x[1] - 2, 2);
}

var c1 = new Constraint(x => x[0] + x[1], 2, 5.0, ConstraintType.LesserThanOrEqualTo);
var c2 = new Constraint(x => x[0], 2, 1.0, ConstraintType.GreaterThanOrEqualTo);
var c3 = new Constraint(x => x[1], 2, 1.0, ConstraintType.GreaterThanOrEqualTo);

var innerOptimizer = new BFGS(ObjectiveFunc, 2, new[] { 2.0, 2.0 },
                              new[] { 0.0, 0.0 }, new[] { 10.0, 10.0 });
var constrained = new AugmentedLagrange(ObjectiveFunc, innerOptimizer,
                                        new IConstraint[] { c1, c2, c3 });
constrained.Minimize();

Console.WriteLine($"Constrained optimum: [{constrained.BestParameterSet.Values[0]:F4}, " +
                  $"{constrained.BestParameterSet.Values[1]:F4}]");

Practical Example: Calibrating a Hydrological Model

A complete example of using optimization to calibrate a watershed model:

using Numerics.Mathematics.Optimization;
using Numerics.Data.Statistics;

// Observed streamflow data
double[] observed = { 12.5, 15.3, 18.7, 16.2, 14.1, 11.8, 10.3 };

// Simple runoff model: Q = C × P^α where Q is flow, P is precipitation, C and α are parameters
double[] precipitation = { 10, 12, 15, 13, 11, 9, 8 };

// Objective function: minimize RMSE
double ObjectiveFunction(double[] parameters)
{
    double C = parameters[0];
    double alpha = parameters[1];
    
    // Simulate streamflow
    double[] simulated = new double[observed.Length];
    for (int i = 0; i < observed.Length; i++)
    {
        simulated[i] = C * Math.Pow(precipitation[i], alpha);
    }
    
    // Compute RMSE
    double rmse = GoodnessOfFit.RMSE(observed, simulated);
    return rmse;
}

// Parameter bounds
var lower = new double[] { 0.1, 0.5 };  // C >= 0.1, α >= 0.5
var upper = new double[] { 5.0, 3.0 };  // C <= 5.0, α <= 3.0

// Use SCE-UA (recommended for hydrological calibration)
var optimizer = new ShuffledComplexEvolution(ObjectiveFunction, 2, lower, upper);
optimizer.Complexes = 5;
optimizer.MaxIterations = 1000;
optimizer.Minimize();

Console.WriteLine($"Calibrated parameters:");
Console.WriteLine($"  C = {optimizer.BestParameterSet.Values[0]:F4}");
Console.WriteLine($"  α = {optimizer.BestParameterSet.Values[1]:F4}");
Console.WriteLine($"  RMSE = {optimizer.BestParameterSet.Fitness:F4}");
Console.WriteLine($"  Function evaluations: {optimizer.FunctionEvaluations}");

// Verify calibration
double[] final_simulated = new double[observed.Length];
for (int i = 0; i < observed.Length; i++)
{
    final_simulated[i] = optimizer.BestParameterSet.Values[0] * 
                         Math.Pow(precipitation[i], optimizer.BestParameterSet.Values[1]);
}

double nse = GoodnessOfFit.NashSutcliffeEfficiency(observed, final_simulated);
Console.WriteLine($"  NSE = {nse:F4}");

Choosing an Optimization Method

| Problem Type | Recommended Method | Notes |
|--------------|--------------------|-------|
| Smooth, unimodal | BFGS or Powell | Fast convergence to local minimum |
| Non-smooth, unimodal | Nelder-Mead | Robust direct search |
| Multimodal, global search | Differential Evolution or SCE-UA | Thorough exploration |
| Quick global search | Particle Swarm | Faster but less thorough |
| Hydrological calibration | SCE-UA | Specifically designed for this |
| With constraints | Augmented Lagrangian | Handles equality and inequality constraints |
| 1D problem | Brent Search or Golden Section | Specialized efficient methods |
| Machine learning | ADAM | Adaptive learning rates |
| Unknown smoothness | Start with Nelder-Mead | Very robust |
| High dimensions (>20) | Differential Evolution or ADAM | Scale better than others |

Best Practices

  1. Scale Variables: Normalize parameters to similar ranges (e.g., [0,1] or [-1,1]) for better convergence (see the sketch after this list).

  2. Set Reasonable Bounds: Tight bounds improve convergence but shouldn't exclude the optimum.

  3. Multiple Runs: For global optimization, run multiple times with different random seeds to ensure robustness.

  4. Hybrid Approach: Use global method followed by local refinement:

    // Global search
    var de = new DifferentialEvolution(ObjectiveFunc, n, lower, upper);
    de.Minimize();
    
    // Local refinement
    var bfgs = new BFGS(ObjectiveFunc, n, de.BestParameterSet.Values, lower, upper);
    bfgs.Minimize();
  5. Monitor Convergence: Check the ParameterSetTrace to diagnose convergence issues:

    foreach (var ps in optimizer.ParameterSetTrace)
    {
        Console.WriteLine($"Iteration {ps.Fitness}");
    }
  6. Adjust Tolerances: Tighter tolerances require more evaluations:

    optimizer.RelativeTolerance = 1e-10; // Very tight
    optimizer.AbsoluteTolerance = 1e-10;
  7. Population Size: For global methods, larger populations explore better but cost more:

    de.PopulationSize = 20 * NumberOfParameters; // Rule of thumb
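
As noted in practice 1, rescaling often helps. One illustrative approach (a hand-rolled wrapper, not a library feature) is to optimize in normalized space and map back to physical units inside the objective, here reusing ObjectiveFunction from the calibration example above:

// z lives in [0,1]^2; map to the physical ranges before evaluating
double ScaledObjective(double[] z)
{
    double C = 0.1 + z[0] * (5.0 - 0.1);      // z[0] -> [0.1, 5.0]
    double alpha = 0.5 + z[1] * (3.0 - 0.5);  // z[1] -> [0.5, 3.0]
    return ObjectiveFunction(new[] { C, alpha });
}

var scaled = new DifferentialEvolution(ScaledObjective, 2,
                                       new double[] { 0, 0 }, new double[] { 1, 1 });
scaled.Minimize();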

Understanding Results

Parameter Set Structure

The BestParameterSet contains:

  • Values: The optimal parameter vector
  • Fitness: The objective function value at the optimum
  • Weight: (Optional) Used internally by some algorithms

Hessian Matrix

When ComputeHessian = true, the Hessian at the solution is computed numerically. The Hessian $\mathbf{H}$ is the matrix of second partial derivatives of the objective function, and its eigenvalues reveal important information about the solution:

  • All eigenvalues positive: The solution is a local minimum (the Hessian is positive definite)
  • Large eigenvalue: The objective is highly curved in that direction — the corresponding parameter is well-determined
  • Small eigenvalue: The objective is nearly flat in that direction; the parameter is poorly determined (weakly identifiable)
  • Near-zero eigenvalue: Indicates a ridge or valley in the objective surface, suggesting parameter correlation or redundancy

For likelihood-based objectives, the inverse of the Hessian approximates the covariance matrix of the parameters, so parameter standard errors can be estimated as $SE(\hat{\theta}_i) \approx \sqrt{|H_{ii}^{-1}|}$:

optimizer.ComputeHessian = true;
optimizer.Minimize();

// Access Hessian
var H = optimizer.Hessian;

// Parameter standard errors (rough approximation; uses 1/H[i,i]
// in place of the (i,i) entry of the full inverse)
for (int i = 0; i < optimizer.NumberOfParameters; i++)
{
    double se = Math.Sqrt(Math.Abs(1.0 / H[i, i]));
    Console.WriteLine($"Parameter {i} ± {se:F4}");
}

Optimization Status

Check the Status property to verify success:

if (optimizer.Status == OptimizationStatus.Success)
{
    Console.WriteLine("Optimization converged successfully");
}
else if (optimizer.Status == OptimizationStatus.MaximumIterationsReached)
{
    Console.WriteLine("Maximum iterations reached - may not have converged");
}

Performance Tips

  1. Function Evaluation Cost: Most optimization time is spent evaluating the objective function. Optimize your function first.

  2. Parallel Evaluation: For population-based methods (DE, PSO), evaluate population members in parallel if your function is thread-safe (see the sketch after this list).

  3. Warm Start: If solving similar problems repeatedly, use the previous solution as initial guess.

  4. Gradient Information: If you can provide analytical gradients, BFGS will converge much faster (though the current implementation uses numerical gradients).
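
A minimal sketch of tip 2 (illustrative, not part of the library API), assuming the objective function is thread-safe:

using System.Threading.Tasks;

// Evaluate a population of candidate vectors in parallel
double[] EvaluatePopulation(double[][] population, Func<double[], double> objective)
{
    var fitness = new double[population.Length];
    Parallel.For(0, population.Length, i =>
    {
        fitness[i] = objective(population[i]); // objective must be thread-safe
    });
    return fitness;
}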


References

[1] Nocedal, J., & Wright, S. J. (2006). Numerical Optimization (2nd ed.). Springer.

[2] Nelder, J. A., & Mead, R. (1965). A simplex method for function minimization. The Computer Journal, 7(4), 308-313.

[3] Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[4] Storn, R., & Price, K. (1997). Differential evolution–a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 11(4), 341-359.

[5] Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. Proceedings of IEEE International Conference on Neural Networks, 4, 1942-1948.

[6] Duan, Q., Sorooshian, S., & Gupta, V. (1992). Effective and efficient global optimization for conceptual rainfall-runoff models. Water Resources Research, 28(4), 1015-1031.

[7] Kirkpatrick, S., Gelatt, C. D., & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220(4598), 671-680.

[8] Birgin, E. G., & Martínez, J. M. (2014). Practical Augmented Lagrangian Methods for Constrained Optimization. SIAM.


← Previous: Numerical Differentiation | Back to Index | Next: Root Finding →