← Previous: Numerical Integration | Back to Index | Next: Optimization →
Numerical differentiation is a fundamental technique in various scientific and engineering fields. Many optimization algorithms, such as gradient descent and Newton's method, rely on calculating the gradient of a function to find its minimum or maximum points. Some optimization methods, like quasi-Newton methods, use the Hessian matrix (the matrix of second-order derivatives) to determine the curvature of the function. Numerical differentiation provides a way to approximate both the gradient and the Hessian matrix efficiently.
In Numerics, the derivative is evaluated using the two-point (central difference) formula by default:

$$f'(x) \approx \frac{f(x + h) - f(x - h)}{2h}$$

where $f$ is the function being differentiated, $x$ is the point at which the derivative is evaluated, and $h$ is the step size.

For example, consider the simple function:

$$f(x) = x^3$$

Differentiating with respect to $x$ gives:

$$f'(x) = 3x^2$$

Evaluating the derivative at $x = 2$ gives $f'(2) = 3 \cdot 2^2 = 12$.
Now, let's implement this in Numerics. First, we need to reference Numerics and the Mathematics namespace:

```csharp
using Numerics.Mathematics;
```

Next, create the test function:
```csharp
/// <summary>
/// Test function: f(x) = x^3
/// </summary>
public double FX(double x)
{
    return Math.Pow(x, 3);
}
```

And then compute the derivative using the NumericalDerivative class and the Derivative method, which uses the two-point (central difference) formula:

```csharp
double dFdx = NumericalDerivative.Derivative(FX, 2); // 11.999999949167176
```

The accuracy of numerical differentiation depends on the smoothness of the function and the step size. For functions with discontinuities or sharp gradients, the two-point formula might not accurately capture the derivative at those points. If the function changes rapidly, a larger step size might miss important details.
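To make the step-size trade-off concrete, here is a minimal standalone sketch of the two-point formula in plain C# (independent of the Numerics API). For $f(x) = x^3$ the truncation error of the central difference works out to exactly $h^2$, so the estimate improves quadratically as $h$ shrinks, until floating-point round-off eventually dominates.

```csharp
using System;

public static class CentralDifferenceDemo
{
    // Two-point central difference: f'(x) ≈ (f(x + h) - f(x - h)) / (2h)
    public static double CentralDifference(Func<double, double> f, double x, double h)
        => (f(x + h) - f(x - h)) / (2.0 * h);

    public static void Main()
    {
        Func<double, double> f = x => Math.Pow(x, 3); // f'(2) = 12 exactly

        // For f(x) = x^3 the estimate is 12 + h^2, so the error
        // shrinks quadratically with the step size.
        foreach (double h in new[] { 1e-1, 1e-3, 1e-5 })
            Console.WriteLine($"h = {h:E0}: estimate = {CentralDifference(f, 2.0, h)}");
    }
}
```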
The Numerics library provides additional finite difference methods:
The forward difference approximation uses:

$$f'(x) \approx \frac{f(x + h) - f(x)}{h}$$

```csharp
double dFdx = NumericalDerivative.ForwardDifference(FX, 2); // 12.000001490104875
```

This method is useful when you can only evaluate the function ahead of the point, but it's less accurate than the central difference method.
The backward difference approximation uses:

$$f'(x) \approx \frac{f(x) - f(x - h)}{h}$$

```csharp
double dFdx = NumericalDerivative.BackwardDifference(FX, 2); // 11.999998508229537
```

This method is useful when you can only evaluate the function behind the point.
The central difference method (which is also used by the Derivative method) provides better accuracy:

$$f'(x) \approx \frac{f(x + h) - f(x - h)}{2h}$$

```csharp
double dFdx = NumericalDerivative.CentralDifference(FX, 2); // 11.999999949167176
```

The Numerics library also provides Ridder's method for computing the numerical derivative, which can be more accurate than the two-point method in some cases. This method also outputs an estimate of the error in the derivative:
```csharp
double dFdx = NumericalDerivative.RiddersMethod(FX, 2, out var err); // 11.99999994392223
Console.WriteLine($"Derivative: {dFdx}");
Console.WriteLine($"Estimated error: {err}");
```

Ridder's method uses Richardson extrapolation to refine the estimate by evaluating the derivative at multiple step sizes and extrapolating to zero step size.
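The extrapolation idea can be illustrated with a short sketch in plain C# (a simplified illustration, not the library's implementation). Since the central difference error shrinks like $h^2$, two estimates at step sizes $h$ and $h/2$ can be combined so that the leading error term cancels:

```csharp
using System;

public static class RichardsonDemo
{
    public static double Central(Func<double, double> f, double x, double h)
        => (f(x + h) - f(x - h)) / (2.0 * h);

    public static void Main()
    {
        Func<double, double> f = x => Math.Pow(x, 3); // f'(2) = 12 exactly

        // D(h) ≈ f'(x) + c*h^2, so (4*D(h/2) - D(h)) / 3 cancels the h^2 term.
        double h = 0.1;
        double dH = Central(f, 2.0, h);      // 12.01   (error = h^2 = 0.01)
        double dH2 = Central(f, 2.0, h / 2); // 12.0025 (error = h^2 / 4)
        double extrapolated = (4.0 * dH2 - dH) / 3.0; // ≈ 12, up to round-off

        Console.WriteLine($"h:            {dH}");
        Console.WriteLine($"h/2:          {dH2}");
        Console.WriteLine($"extrapolated: {extrapolated}");
    }
}
```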
You can specify a custom step size if the automatic determination is not suitable for your problem:

```csharp
double h = 0.001; // Custom step size
double dFdx = NumericalDerivative.Derivative(FX, 2, h);
```

The Numerics library provides methods for computing second derivatives. The second derivative measures the rate of change of the first derivative, or the curvature of the function.
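A second derivative can be approximated with the standard three-point central difference, $f''(x) \approx \big(f(x + h) - 2f(x) + f(x - h)\big)/h^2$. Here is a standalone sketch of that formula in plain C# (not the library's implementation), using the same test function:

```csharp
using System;

public static class SecondDerivativeSketch
{
    // Three-point central difference: f''(x) ≈ (f(x+h) - 2f(x) + f(x-h)) / h^2
    public static double SecondCentral(Func<double, double> f, double x, double h)
        => (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h);

    public static void Main()
    {
        Func<double, double> f = x => Math.Pow(x, 3); // f''(x) = 6x, so f''(2) = 12
        Console.WriteLine(SecondCentral(f, 2.0, 1e-4)); // ≈ 12
    }
}
```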
The central difference approximation for the second derivative is:

$$f''(x) \approx \frac{f(x + h) - 2f(x) + f(x - h)}{h^2}$$

```csharp
double d2Fdx2 = NumericalDerivative.SecondDerivative(FX, 2); // 11.999997613071552
```

For our test function $f(x) = x^3$, the second derivative is $f''(x) = 6x$, so $f''(2) = 12$, in agreement with the result above. Forward and backward variants are also available:

```csharp
double d2Fdx2 = NumericalDerivative.SecondDerivativeForward(FX, 2);
```

```csharp
double d2Fdx2 = NumericalDerivative.SecondDerivativeBackward(FX, 2);
```

The gradient is a vector of first-order partial derivatives of a scalar-valued function:

$$\nabla f(\mathbf{x}) = \left[ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right]^T$$
For example, consider a function with three variables:

$$f(x, y, z) = x^3 + y^4 + z^5$$

Differentiating with respect to each variable gives:

$$\frac{\partial f}{\partial x} = 3x^2, \qquad \frac{\partial f}{\partial y} = 4y^3, \qquad \frac{\partial f}{\partial z} = 5z^4$$

Evaluating the function at the point $(x, y, z) = (2, 2, 2)$ gives:

$$\frac{\partial f}{\partial x} = 12, \qquad \frac{\partial f}{\partial y} = 32, \qquad \frac{\partial f}{\partial z} = 80$$

The gradient is the vector of these first-order partial derivatives:

$$\nabla f(2, 2, 2) = [12, 32, 80]^T$$
In Numerics, the gradient is computed using the two-point formula described earlier. Let's create the test function:

```csharp
/// <summary>
/// Test function: f(x, y, z) = x^3 + y^4 + z^5
/// </summary>
public double FXYZ(double[] x)
{
    return Math.Pow(x[0], 3) + Math.Pow(x[1], 4) + Math.Pow(x[2], 5);
}
```

And then compute the gradient:
```csharp
double[] gradient = NumericalDerivative.Gradient(FXYZ, new double[] { 2, 2, 2 });
// { 12.000000019411923, 32.000000014301264, 79.999999754774166 }
Console.WriteLine($"∂f/∂x = {gradient[0]:F6}");
Console.WriteLine($"∂f/∂y = {gradient[1]:F6}");
Console.WriteLine($"∂f/∂z = {gradient[2]:F6}");
```

The Hessian is a square matrix of second-order partial derivatives of a scalar-valued function:

$$\mathbf{H}(f) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \, \partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \, \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix}$$
Using the same 3-variable function as before, $f(x, y, z) = x^3 + y^4 + z^5$. Since none of the variables interact with each other, the Hessian reduces to a diagonal matrix:

$$\mathbf{H}(f) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & 0 & 0 \\ 0 & \dfrac{\partial^2 f}{\partial y^2} & 0 \\ 0 & 0 & \dfrac{\partial^2 f}{\partial z^2} \end{bmatrix}$$

Taking the second-order derivatives with respect to each variable gives:

$$\frac{\partial^2 f}{\partial x^2} = 6x, \qquad \frac{\partial^2 f}{\partial y^2} = 12y^2, \qquad \frac{\partial^2 f}{\partial z^2} = 20z^3$$

Evaluating the function at the point $(2, 2, 2)$ gives:

$$\mathbf{H}(f) = \begin{bmatrix} 12 & 0 & 0 \\ 0 & 48 & 0 \\ 0 & 0 & 160 \end{bmatrix}$$

Now, to compute the Hessian in Numerics, simply do the following:
```csharp
double[,] hessian = NumericalDerivative.Hessian(FXYZ, new double[] { 2, 2, 2 });
// [0,0] = 12.000009765258449
// [0,1] = 8.5443864603330376E-06
// [0,2] = 0
// [1,0] = 8.5443864603330376E-06
// [1,1] = 48.000004883487954
// [1,2] = 0
// [2,0] = 0
// [2,1] = 0
// [2,2] = 159.99999349326262

Console.WriteLine("Hessian matrix:");
for (int i = 0; i < 3; i++)
{
    for (int j = 0; j < 3; j++)
    {
        Console.Write($"{hessian[i, j],12:F6} ");
    }
    Console.WriteLine();
}
```

The small off-diagonal elements (on the order of $10^{-5}$ or less) are numerical round-off artifacts of the finite difference approximation; analytically, they are exactly zero.
Now, consider another example where the function variables interact:

$$f(x, y) = x^3 - 2xy - y^6$$

The Hessian matrix is:

$$\mathbf{H}(f) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \, \partial y} \\ \dfrac{\partial^2 f}{\partial y \, \partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix}$$

Taking the second-order derivatives with respect to each variable gives:

$$\frac{\partial^2 f}{\partial x^2} = 6x, \qquad \frac{\partial^2 f}{\partial x \, \partial y} = \frac{\partial^2 f}{\partial y \, \partial x} = -2, \qquad \frac{\partial^2 f}{\partial y^2} = -30y^4$$

Evaluating the function at the point $(x, y) = (1, 2)$ gives:

$$\mathbf{H}(f) = \begin{bmatrix} 6 & -2 \\ -2 & -480 \end{bmatrix}$$

Create the test function:
```csharp
/// <summary>
/// Test function: f(x, y) = x^3 - 2xy - y^6
/// </summary>
public double FXY(double[] x)
{
    return Math.Pow(x[0], 3) - 2 * x[0] * x[1] - Math.Pow(x[1], 6);
}
```

Now, compute the Hessian using Numerics:

```csharp
double[,] hessian = NumericalDerivative.Hessian(FXY, new double[] { 1, 2 });
// [0,0] = 5.9999414101667655
// [0,1] = -2.000001627543075
// [1,0] = -2.000001627543075
// [1,1] = -480.00004883487958

Console.WriteLine("Hessian matrix:");
Console.WriteLine($"[{hessian[0,0],10:F4} {hessian[0,1],10:F4}]");
Console.WriteLine($"[{hessian[1,0],10:F4} {hessian[1,1],10:F4}]");
```

Note that the Hessian is symmetric, as expected from the equality of mixed partial derivatives (Schwarz's theorem), and the off-diagonal elements correctly capture the interaction between variables.
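Since the mixed partials should agree, a quick symmetry check is a useful sanity test for any numerically computed Hessian. A small helper like the following (an illustrative utility, not part of the Numerics API) makes the check explicit:

```csharp
using System;

public static class HessianChecks
{
    // Returns true if the matrix is symmetric to within the given tolerance.
    public static bool IsSymmetric(double[,] m, double tol)
    {
        int n = m.GetLength(0);
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                if (Math.Abs(m[i, j] - m[j, i]) > tol)
                    return false;
        return true;
    }

    public static void Main()
    {
        // Values from the 2x2 Hessian example above.
        var hessian = new double[,]
        {
            {  5.9999414101667655, -2.000001627543075 },
            { -2.000001627543075, -480.00004883487958 }
        };
        Console.WriteLine(IsSymmetric(hessian, 1e-6)); // True
    }
}
```

Significant asymmetry (well beyond the numerical tolerance) usually points to a noisy or discontinuous function, or a step size problem.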
The Jacobian matrix represents the first-order partial derivatives of a vector-valued function. For a function $\mathbf{f}: \mathbb{R}^n \to \mathbb{R}^m$, the Jacobian is the $m \times n$ matrix whose $(i, j)$ entry is $\partial f_i / \partial x_j$:

$$\mathbf{J} = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_m}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_n} \end{bmatrix}$$
The Numerics library provides two overloads for computing the Jacobian:

```csharp
// For a single output function f(x_i, x[])
double[,] jacobian1 = NumericalDerivative.Jacobian(
    (xi, x) => /* function of xi and x[] */,
    xValues,
    point
);

// For a vector-valued function f(x[]) -> y[]
double[,] jacobian2 = NumericalDerivative.Jacobian(
    x => /* returns double[] */,
    point
);
```

Example of computing a Jacobian for a system of equations:
```csharp
// System: f1(x,y) = x^2 + y^2, f2(x,y) = xy
double[] F(double[] vars)
{
    double x = vars[0];
    double y = vars[1];
    return new double[]
    {
        x * x + y * y, // f1
        x * y          // f2
    };
}

var point = new double[] { 2, 3 };
double[,] jacobian = NumericalDerivative.Jacobian(F, point);
// Jacobian at (2,3):
// [∂f1/∂x  ∂f1/∂y]   [2x  2y]   [4  6]
// [∂f2/∂x  ∂f2/∂y] = [ y   x] = [3  2]
```

The CalculateStepSize method computes an appropriate step size for numerical differentiation based on the magnitude of the point and the order of the derivative:
```csharp
double h = NumericalDerivative.CalculateStepSize(x: 2.0, order: 1);
// Returns approximately 1.49e-08 for the first derivative

double h2 = NumericalDerivative.CalculateStepSize(x: 2.0, order: 2);
// Returns approximately 3.45e-06 for the second derivative
```

The step size is chosen to balance truncation error against floating-point round-off error: it is based on the machine epsilon $\epsilon \approx 2.22 \times 10^{-16}$, raised to a power that depends on the derivative order (on the order of $\sqrt{\epsilon}$ for first derivatives and $\epsilon^{1/3}$ for second derivatives, consistent with the values above) and scaled by the magnitude of $x$.
- Use Central Differences: When possible, use the central difference method (the default `Derivative` method) as it provides better accuracy than forward or backward differences.
- Ridder's Method for Critical Applications: When you need both a derivative and an error estimate, use Ridder's method.
- Automatic Step Sizing: The default automatic step sizing works well for most problems. Only specify a custom step size if you have specific numerical issues.
- Beware of Noise: Numerical differentiation amplifies noise in function evaluations. If your function has numerical noise (e.g., from Monte Carlo simulations), consider smoothing or using a larger step size.
- Check for Symmetry: For Hessian matrices, check that the result is symmetric (within numerical tolerance). Significant asymmetry indicates numerical issues.
- Scale Considerations: For problems with variables at very different scales, consider normalizing variables before computing derivatives.
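The accuracy advantage of central over forward differences is easy to verify numerically. This plain C# sketch compares the two for $f(x) = \sin(x)$ at $x = 1$: reducing $h$ tenfold should shrink the forward-difference error roughly 10x but the central-difference error roughly 100x, consistent with their truncation orders.

```csharp
using System;

public static class TruncationOrderDemo
{
    // Absolute error of the forward difference for f = sin at x = 1.
    public static double ForwardError(double h)
        => Math.Abs((Math.Sin(1.0 + h) - Math.Sin(1.0)) / h - Math.Cos(1.0));

    // Absolute error of the central difference for f = sin at x = 1.
    public static double CentralError(double h)
        => Math.Abs((Math.Sin(1.0 + h) - Math.Sin(1.0 - h)) / (2.0 * h) - Math.Cos(1.0));

    public static void Main()
    {
        foreach (double h in new[] { 1e-2, 1e-3 })
            Console.WriteLine(
                $"h = {h:E0}: forward error = {ForwardError(h):E2}, central error = {CentralError(h):E2}");
    }
}
```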
The central difference formula has truncation error $O(h^2)$, while the forward and backward difference formulas have truncation error $O(h)$. For the second derivative, the three-point central difference formula also has truncation error $O(h^2)$.
The numerical differentiation methods implemented in Numerics are based on standard finite difference formulas well-documented in numerical analysis literature [1]. Ridder's method for derivative estimation with error bounds was introduced by Ridders [2].
[1] W. H. Press, S. A. Teukolsky, W. T. Vetterling and B. P. Flannery, Numerical Recipes: The Art of Scientific Computing, 3rd ed., Cambridge, UK: Cambridge University Press, 2007.
[2] C. J. F. Ridders, "Accurate computation of F'(x) and F'(x)F''(x)," Advances in Engineering Software, vol. 4, no. 2, pp. 75-76, 1982.