Chapter 3 — Mathematical Foundations

"Mathematics is the language with which God has written the Universe." — Galileo Galilei


Mathematical computation over real numbers is fundamentally different from mathematical reasoning about them. A textbook derivation of the Black-Scholes formula assumes exact arithmetic; the OCaml implementation must work with 64-bit floating-point numbers that have approximately 15 significant decimal digits and cannot represent most real numbers exactly. This limitation is not a deficiency of the implementation — it is a physical fact about digital computers — but avoiding its pitfalls requires understanding them. The difference between correct and incorrect floating-point code is often invisible until it matters: a summation that is accurate for 100 terms may accumulate unacceptable rounding error over 10 million.

Beyond floating-point arithmetic, quantitative finance uses a specific set of mathematical tools repeatedly: linear algebra for portfolio mathematics and correlation modelling; numerical integration for option pricing and expected loss calculation; root-finding for yield-to-maturity and implied volatility computation; interpolation for yield curve and volatility surface construction; and the normal distribution for virtually everything. This chapter implements each of these tools in OCaml, with attention to both the mathematical content and the numerical implementation issues that practitioners must understand.

The chapter is intentionally practical rather than mathematically rigorous. We assume familiarity with calculus and linear algebra at an undergraduate level, and we focus on the numerical methods needed in later chapters rather than on proofs or derivations. References to rigorous treatments are provided where appropriate.


3.1 Floating-Point Arithmetic and Precision in Finance

Every financial calculation is ultimately a sequence of floating-point operations. Understanding how IEEE 754 arithmetic works — and where it breaks — is essential for building reliable systems.

3.1.1 IEEE 754 Double Precision

A 64-bit double has 53 bits of mantissa, giving approximately 15–17 significant decimal digits. This is adequate for most pricing calculations but insufficient for:

  • Exact settlement amounts (use integer cents or arbitrary precision)
  • Summing thousands of small numbers (use Kahan summation)
  • Comparing prices for equality (never use = on floats)

(** The machine epsilon: smallest e such that 1 + e ≠ 1 *)
let machine_epsilon =
  let rec find e =
    if 1.0 +. e = 1.0 then e *. 2.0
    else find (e /. 2.0)
  in
  find 1.0

(** ~2.22e-16 *)
let () = Printf.printf "Machine epsilon: %.2e\n" machine_epsilon

(** WRONG: never compare floats with = *)
let bad_check price = price = 100.0   (* dangerous *)

(** CORRECT: use tolerance *)
let nearly_equal ?(tol = 1e-9) a b = Float.abs (a -. b) < tol

(** Catastrophic cancellation example *)
let () =
  let a = 1234567890.12345 in
  let b = 1234567890.12340 in
  let diff = a -. b in
  Printf.printf "a - b = %.10f (expected 0.00005)\n" diff
  (* Significant digits are lost when subtracting nearly-equal numbers *)
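
The classic decimal-fraction example makes the point concrete: 0.1 and 0.2 have no exact binary representation, so their sum is not the literal 0.3, and only the tolerance comparison behaves as intended.

(** 0.1 +. 0.2 is not exactly 0.3 in binary floating point *)
let () =
  Printf.printf "0.1 +. 0.2        = %.17f\n" (0.1 +. 0.2);     (* 0.30000000000000004 *)
  Printf.printf "= 0.3?              %b\n" (0.1 +. 0.2 = 0.3);  (* false *)
  Printf.printf "nearly_equal 0.3?   %b\n" (nearly_equal (0.1 +. 0.2) 0.3)  (* true *)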

3.1.2 Kahan Compensated Summation

When summing many numbers of different magnitudes — common in Monte Carlo simulation — use Kahan summation to preserve precision:

(**
    Kahan summation algorithm.
    Maintains a compensation variable c to capture lost low-order bits.
    
    Error bound: O(ε) rather than O(n·ε) for naive summation,
    where ε is machine epsilon and n is the number of terms.
*)
let kahan_sum arr =
  let sum = ref 0.0 in
  let c   = ref 0.0 in   (* running compensation *)
  Array.iter (fun x ->
    let y = x -. !c in
    let t = !sum +. y in
    c   := (t -. !sum) -. y;
    sum := t
  ) arr;
  !sum

(** Compare naive vs Kahan summation *)
let test_summation () =
  (* Many small numbers that sum to exactly 1.0 *)
  let n = 10_000_000 in
  let arr = Array.make n (1.0 /. float_of_int n) in
  let naive = Array.fold_left (+.) 0.0 arr in
  let kahan = kahan_sum arr in
  Printf.printf "Naive:  %.15f\n" naive;
  Printf.printf "Kahan:  %.15f\n" kahan;
  Printf.printf "Error (naive): %.2e\n" (Float.abs (naive -. 1.0));
  Printf.printf "Error (Kahan): %.2e\n" (Float.abs (kahan -. 1.0))

Figure 3.1 (Floating-point rounding error comparison) — Summation error growth as a function of $n$, the number of terms. Naive summation error grows as $O(n \varepsilon)$, whereas Kahan summation maintains $O(\varepsilon)$ precision.

3.1.3 Arbitrary Precision with Zarith

For accounting, clearing, and regulatory calculations that require exact results:

(* Requires the zarith library. Its modules Z (integers) and Q (rationals)
   are top-level, so no open is needed. *)

(** Exact bond accrued interest calculation *)
let accrued_interest ~face_value ~coupon_rate ~days_accrued ~days_in_period =
  (* All arithmetic in exact rationals *)
  let face  = Q.of_float face_value in
  let rate  = Q.of_string coupon_rate in  (* e.g. "5/100" for 5% *)
  let acc_d = Q.of_int days_accrued in
  let per_d = Q.of_int days_in_period in
  (* Accrued = face × coupon × (days_accrued / days_in_period) *)
  Q.mul face (Q.mul rate (Q.div acc_d per_d))
  |> Q.to_float  (* convert back to float for output *)
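
A usage sketch, assuming the coupon rate string denotes the rate for the full coupon period: all intermediate arithmetic stays in exact rationals, and rounding occurs only at the final conversion to float.

let () =
  let ai = accrued_interest ~face_value:1_000_000.0 ~coupon_rate:"5/100"
             ~days_accrued:72 ~days_in_period:181 in
  Printf.printf "Accrued interest: %.2f\n" ai   (* 1,000,000 x 5% x 72/181 *)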

3.2 Vectors, Matrices, and Linear Algebra

Portfolio mathematics, factor models, and curve fitting all require linear algebra. We use the Owl library for matrix operations.

3.2.1 Vectors and Dot Products

open Owl

(**
    Portfolio return = w · r
    where w is weight vector and r is return vector
*)
let portfolio_return weights returns =
  (* Dot product *)
  Mat.dot weights (Mat.transpose returns) |> fun m -> Mat.get m 0 0

(**
    Portfolio variance = w^T Σ w
    where Σ is the covariance matrix
*)
let portfolio_variance weights cov_matrix =
  let wT = Mat.transpose weights in
  let inner = Mat.dot cov_matrix wT in
  Mat.dot weights inner |> fun m -> Mat.get m 0 0

let () =
  (* 3-asset example *)
  let weights = Mat.of_array [|0.4; 0.35; 0.25|] 1 3 in
  let returns = Mat.of_array [|0.08; 0.12; 0.06|] 1 3 in
  let cov = Mat.of_array [|
    0.0144; 0.0072; 0.0036;   (* row 1: asset 1 *)
    0.0072; 0.0225; 0.0090;   (* row 2: asset 2 *)
    0.0036; 0.0090; 0.0100;   (* row 3: asset 3 *)
  |] 3 3 in
  Printf.printf "Portfolio return:   %.4f\n" (portfolio_return weights returns);
  Printf.printf "Portfolio variance: %.4f\n" (portfolio_variance weights cov);
  Printf.printf "Portfolio vol:      %.4f\n" (sqrt (portfolio_variance weights cov))

3.2.2 Cholesky Decomposition

Cholesky decomposition is used extensively in finance to:

  • Generate correlated random numbers for Monte Carlo
  • Verify that a covariance matrix is positive definite
  • Solve symmetric positive-definite linear systems efficiently

Given a symmetric positive definite matrix $\Sigma$, find $L$ such that $\Sigma = L L^T$.

$$L_{ii} = \sqrt{\Sigma_{ii} - \sum_{k=1}^{i-1} L_{ik}^2}$$

$$L_{ij} = \frac{1}{L_{jj}}\left(\Sigma_{ij} - \sum_{k=1}^{j-1} L_{ik} L_{jk}\right), \quad j < i$$

(**
    Cholesky decomposition (lower triangular)
    Returns L such that A = L * L^T
    Raises if matrix is not positive definite
*)
let cholesky a =
  let n = Mat.row_num a in
  let l = Mat.zeros n n in
  for i = 0 to n - 1 do
    for j = 0 to i do
      let sum = ref 0.0 in
      for k = 0 to j - 1 do
        sum := !sum +. Mat.get l i k *. Mat.get l j k
      done;
      let value =
        if i = j then
          let v = Mat.get a i i -. !sum in
          if v <= 0.0 then
            failwith (Printf.sprintf "Matrix not positive definite at (%d,%d)" i i);
          sqrt v
        else
          (Mat.get a i j -. !sum) /. Mat.get l j j
      in
      Mat.set l i j value
    done
  done;
  l

(**
    Simulate correlated normals using Cholesky
    If Z ~ N(0,I), then L*Z ~ N(0, Σ)
*)
let correlated_normals ~cov ~n_paths =
  let n_assets = Mat.row_num cov in
  let l = cholesky cov in
  let z = Mat.gaussian n_assets n_paths in   (* independent standard normals *)
  Mat.dot l z                                (* correlate them *)
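
A sanity check worth keeping in a test suite: rebuild Σ from its factor and measure the worst entrywise deviation. The covariance matrix is the 3-asset example from section 3.2.1.

let () =
  let cov = Mat.of_array [|
    0.0144; 0.0072; 0.0036;
    0.0072; 0.0225; 0.0090;
    0.0036; 0.0090; 0.0100;
  |] 3 3 in
  let l = cholesky cov in
  let max_err = Mat.max' (Mat.abs (Mat.sub (Mat.dot l (Mat.transpose l)) cov)) in
  Printf.printf "max |L L^T - Sigma| = %.2e\n" max_err   (* on the order of machine epsilon *)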

3.2.3 Eigenvalues and PCA

Principal Component Analysis (PCA) of yield curves extracts the dominant modes of movement (level, slope, curvature):

(**
    PCA of a returns matrix.
    Returns (eigenvalues, eigenvectors) sorted by explained variance.
*)
let pca returns_matrix =
  let n_obs = Mat.row_num returns_matrix in
  (* Center the data *)
  let means = Mat.mean ~axis:0 returns_matrix in
  let centered = Mat.sub returns_matrix (Mat.tile means [| n_obs; 1 |]) in
  (* SVD of the centered data: the eigenvalues of the covariance matrix
     (1/n) X^T X are s_i^2 / n and its eigenvectors are the right singular
     vectors. Using SVD avoids the complex-valued results of Owl's general
     eigensolver, and LAPACK returns the singular values already sorted. *)
  let _u, s, vt = Linalg.D.svd centered in
  let eigenvalues =
    Array.map (fun sv -> sv *. sv /. float_of_int n_obs) (Mat.to_array s) in
  (eigenvalues, Mat.transpose vt)

let explained_variance eigenvalues =
  let total = Array.fold_left (+.) 0.0 eigenvalues in
  Array.map (fun e -> e /. total) eigenvalues
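
A smoke test with stand-in Gaussian data exercises the mechanics; in real use the input would be a matrix of daily curve changes, one column per tenor.

let () =
  let data = Mat.gaussian 250 5 in   (* placeholder: 250 days x 5 tenors *)
  let eigenvalues, _eigenvectors = pca data in
  Array.iteri
    (fun i v -> Printf.printf "PC%d explains %5.1f%%\n" (i + 1) (100.0 *. v))
    (explained_variance eigenvalues)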

3.3 Numerical Differentiation

Derivatives appear everywhere in finance: delta, gamma, duration, convexity. When analytic formulas are unavailable, numerical differentiation is the tool.

3.3.1 Finite Differences

First derivative (central differences — second-order accurate):

$$f'(x) \approx \frac{f(x+h) - f(x-h)}{2h}$$

Second derivative (symmetric — second-order accurate):

$$f''(x) \approx \frac{f(x+h) - 2f(x) + f(x-h)}{h^2}$$

(**
    First derivative via central differences
    O(h^2) error
*)
let derivative ?(h = 1e-5) f x =
  (f (x +. h) -. f (x -. h)) /. (2.0 *. h)

(**
    Second derivative via central differences
    O(h^2) error
*)
let second_derivative ?(h = 1e-5) f x =
  (f (x +. h) -. 2.0 *. f x +. f (x -. h)) /. (h *. h)

(**
    Mixed partial derivative d²f/dxdy
    Used for cross-gamma, DV01 wrt vol, etc.
*)
let mixed_partial ?(h = 1e-4) f x y =
  let fpp = f (x +. h) (y +. h) in
  let fpm = f (x +. h) (y -. h) in
  let fmp = f (x -. h) (y +. h) in
  let fmm = f (x -. h) (y -. h) in
  (fpp -. fpm -. fmp +. fmm) /. (4.0 *. h *. h)

(** Example: compute delta and gamma of a European call *)
let () =
  let call spot =
    Black_scholes.call_price ~spot ~strike:100.0 ~rate:0.05 ~vol:0.20 ~tau:1.0 in
  let delta = derivative call 100.0 in          (* ~0.6368 *)
  let gamma = second_derivative call 100.0 in   (* ~0.0188 *)
  Printf.printf "Delta: %.4f  Gamma: %.6f\n" delta gamma

3.3.2 Optimal Step Size

The optimal step size balances truncation error (which shrinks with h) against round-off error (which grows as h shrinks). For a one-sided forward difference the optimum is $h \approx \sqrt{\varepsilon_{\text{machine}}} \cdot |x|$; for the central differences used above it is

$$h_{\text{opt}} \approx \varepsilon_{\text{machine}}^{1/3} \cdot |x|$$

roughly $6 \times 10^{-6}$ near $x = 1$, which is why h = 1e-5 is a sensible default.

let optimal_h x =
  let eps = epsilon_float in                 (* ~2.22e-16 *)
  Float.cbrt eps *. (Float.abs x +. 1.0)     (* use sqrt eps for one-sided differences *)
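
The trade-off is easy to observe. This small experiment differentiates exp at x = 1, where the exact derivative is e; the error falls as h shrinks, bottoms out near 1e-5, then rises again as round-off takes over.

let () =
  let exact = exp 1.0 in
  List.iter (fun h ->
    let err = Float.abs (derivative ~h exp 1.0 -. exact) in
    Printf.printf "h = %.0e  error = %.2e\n" h err
  ) [1e-2; 1e-4; 1e-5; 1e-6; 1e-8; 1e-10]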

3.4 Numerical Integration

Integration appears in option pricing (the integral over the risk-neutral density), expected shortfall, and model calibration.

3.4.1 Simpson's Rule

$$\int_a^b f(x)\,dx \approx \frac{h}{3}\left[f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + \cdots + f(x_n)\right]$$

(**
    Composite Simpson's rule.
    n must be even.
    Global error: O(h^4)
*)
let integrate_simpson f a b n =
  assert (n mod 2 = 0);
  let h = (b -. a) /. float_of_int n in
  let sum = ref (f a +. f b) in
  for i = 1 to n - 1 do
    let x = a +. float_of_int i *. h in
    let coeff = if i mod 2 = 0 then 2.0 else 4.0 in
    sum := !sum +. coeff *. f x
  done;
  h /. 3.0 *. !sum

(**
    Gauss-Legendre quadrature: n-point rule.
    More efficient than Simpson for smooth functions.
    Exact for polynomials up to degree 2n-1.
*)
let gauss_legendre_5 f a b =
  (* 5-point GL nodes and weights on [-1, 1] *)
  let nodes   = [|-0.9061798459; -0.5384693101; 0.0;
                   0.5384693101;  0.9061798459|] in
  let weights = [| 0.2369268851;  0.4786286705; 0.5688888889;
                   0.4786286705;  0.2369268851|] in
  let mid  = (a +. b) /. 2.0 in
  let half = (b -. a) /. 2.0 in
  (* Stdlib Array has no fold_left2, so accumulate with iteri *)
  let acc = ref 0.0 in
  Array.iteri (fun i t ->
    acc := !acc +. weights.(i) *. f (mid +. half *. t)
  ) nodes;
  half *. !acc
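
A comparison on a known integral, ∫₀¹ eˣ dx = e − 1, shows why Gauss-Legendre is preferred for smooth integrands: five function evaluations are competitive with Simpson's hundred and one.

let () =
  let exact = exp 1.0 -. 1.0 in
  Printf.printf "Simpson (n=100) error:  %.2e\n"
    (Float.abs (integrate_simpson exp 0.0 1.0 100 -. exact));
  Printf.printf "Gauss-Legendre 5 error: %.2e\n"
    (Float.abs (gauss_legendre_5 exp 0.0 1.0 -. exact))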

(** 
    Option price as integral over risk-neutral density
    C = e^{-rT} * integral from K to inf of (S_T - K) * p(S_T) dS_T
    where p is the lognormal density
*)
let integrate_call_price ~spot ~strike ~rate ~vol ~tau =
  let from_x = strike in
  let to_x   = spot *. 10.0 in   (* truncate the upper tail of the density *)
  let integrand s_T =
    let log_arg = s_T /. spot in
    let mean = (rate -. 0.5 *. vol *. vol) *. tau in
    let std  = vol *. sqrt tau in
    let log_s = log log_arg in
    let density = exp (-. (log_s -. mean) ** 2.0 /. (2.0 *. std *. std))
                  /. (s_T *. std *. sqrt (2.0 *. Float.pi)) in
    (s_T -. strike) *. density
  in
  exp (-. rate *. tau) *. integrate_simpson integrand from_x to_x 1000

3.5 Root-Finding Algorithms

Root finding is used to compute implied volatility, yield to maturity, internal rate of return, and model calibration targets.

3.5.1 Newton-Raphson

Converges quadratically when started close to the root and the function is smooth:

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$$

type convergence_result =
  | Converged of float
  | FailedToConverge of { iterations : int; residual : float; last : float }

let newton_raphson ?(tol = 1e-10) ?(max_iter = 100) ~f ~f' x0 =
  let rec loop x iter =
    let fx = f x in
    if Float.abs fx < tol then Converged x
    else if iter >= max_iter then FailedToConverge { iterations = iter; residual = fx; last = x }
    else
      let fpx = f' x in
      if Float.abs fpx < 1e-15 then FailedToConverge { iterations = iter; residual = fx; last = x }
      else loop (x -. fx /. fpx) (iter + 1)
  in
  loop x0 0
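
A worked example with a known answer: the positive root of f(x) = x² − 2, with f'(x) = 2x. Starting from 1.0, the iteration reaches √2 to the 1e-10 tolerance in a handful of steps.

let () =
  match newton_raphson ~f:(fun x -> x *. x -. 2.0) ~f':(fun x -> 2.0 *. x) 1.0 with
  | Converged x -> Printf.printf "sqrt 2 = %.12f\n" x
  | FailedToConverge { iterations; residual; _ } ->
    Printf.printf "no convergence after %d iterations (residual %.2e)\n"
      iterations residual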

(** Implied volatility via Newton-Raphson *)
let implied_vol_newton ~market_price ~spot ~strike ~rate ~tau ~option_type =
  let bs_price v =
    match option_type with
    | Call -> Black_scholes.call_price ~spot ~strike ~rate ~vol:v ~tau
    | Put  -> Black_scholes.put_price  ~spot ~strike ~rate ~vol:v ~tau
  in
  (* Vega = dC/dσ — needed for Newton step *)
  let vega v =
    let d1 = Black_scholes.d1 ~spot ~strike ~rate ~vol:v ~tau in
    spot *. sqrt tau *. Black_scholes.norm_pdf d1
  in
  let f v = bs_price v -. market_price in
  let f' v = vega v in
  newton_raphson ~f ~f' 0.20   (* initial guess: 20% vol *)

Figure 3.2 (Newton-Raphson convergence) — Left: successive tangent lines during Newton-Raphson iteration converging to the root $\sqrt{2}$. Right: the absolute error on a log scale; the error roughly squares at each step, doubling the number of correct digits, which is the signature of quadratic convergence.

3.5.2 Brent's Method

Brent's method is a hybrid of bisection, secant method, and inverse quadratic interpolation. It is guaranteed to converge (unlike Newton) and is used when derivatives are unavailable:

let brent ?(tol = 1e-10) ?(max_iter = 200) ~f a b =
  let fa = f a and fb = f b in
  assert (fa *. fb <= 0.0);  (* root must be bracketed *)
  (* Keep the best estimate in b, i.e. |f(b)| <= |f(a)| *)
  let a, b, fa, fb =
    if Float.abs fa < Float.abs fb then b, a, fb, fa else a, b, fa, fb in
  let a = ref a and b = ref b and fa = ref fa and fb = ref fb in
  let c = ref !a and fc = ref !fa in   (* previous iterate *)
  let d = ref (!b -. !a) in            (* iterate before c *)
  let mflag = ref true in              (* true if the last step was a bisection *)
  let iter = ref 0 in
  while Float.abs !fb > tol && Float.abs (!b -. !a) > tol && !iter < max_iter do
    (* Candidate step: inverse quadratic interpolation when f(a), f(b), f(c)
       are pairwise distinct, otherwise a secant step *)
    let s =
      if !fa <> !fc && !fb <> !fc then
        !a *. !fb *. !fc /. ((!fa -. !fb) *. (!fa -. !fc))
        +. !b *. !fa *. !fc /. ((!fb -. !fa) *. (!fb -. !fc))
        +. !c *. !fa *. !fb /. ((!fc -. !fa) *. (!fc -. !fb))
      else
        !b -. !fb *. (!b -. !a) /. (!fb -. !fa)
    in
    (* Fall back to bisection when the candidate leaves the bracket or the
       bracket is shrinking too slowly *)
    let lo = (3.0 *. !a +. !b) /. 4.0 in
    let bisect =
      s < Float.min lo !b || s > Float.max lo !b
      || (!mflag && Float.abs (s -. !b) >= Float.abs (!b -. !c) /. 2.0)
      || (not !mflag && Float.abs (s -. !b) >= Float.abs (!c -. !d) /. 2.0)
    in
    let s = if bisect then (!a +. !b) /. 2.0 else s in
    mflag := bisect;
    let fs = f s in
    d := !c; c := !b; fc := !fb;
    (* Replace the endpoint that keeps the root bracketed *)
    if !fa *. fs < 0.0 then begin b := s; fb := fs end
    else begin a := s; fa := fs end;
    if Float.abs !fa < Float.abs !fb then begin
      let t = !a and ft = !fa in
      a := !b; fa := !fb; b := t; fb := ft
    end;
    incr iter
  done;
  if Float.abs !fb <= tol || Float.abs (!b -. !a) <= tol then Converged !b
  else FailedToConverge { iterations = !iter; residual = !fb; last = !b }
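
The same root through Brent's method needs only a bracket and function values, no derivative.

let () =
  match brent ~f:(fun x -> x *. x -. 2.0) 1.0 2.0 with
  | Converged x -> Printf.printf "sqrt 2 = %.12f\n" x
  | FailedToConverge { iterations; residual; _ } ->
    Printf.printf "no convergence after %d iterations (residual %.2e)\n"
      iterations residual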

3.6 Interpolation

Market data arrives at discrete tenors. Interpolation fills in intermediate maturities for consistent curve construction.

3.6.1 Linear Interpolation

(** 
    Linear interpolation on a sorted table of (x, y) pairs.
    Extrapolates flat at the boundaries.
*)
let linear_interpolate knots x =
  match knots with
  | [] -> failwith "empty knot list"
  | [(_, y)] -> y
  | (x0, y0) :: _ when x <= x0 -> y0
  | knots ->
    let rec find = function
      | [(_, yn)] -> yn
      | (x1, y1) :: ((x2, y2) :: _ as rest) ->
        if x <= x2 then
          let t = (x -. x1) /. (x2 -. x1) in
          y1 +. t *. (y2 -. y1)
        else find rest
      | _ -> assert false
    in
    find knots
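
Applied to the five-point curve from the chapter exercises, a 7-year lookup falls on the 5y-10y segment:

let () =
  let curve = [ (1.0, 0.045); (2.0, 0.048); (5.0, 0.051); (10.0, 0.054); (30.0, 0.056) ] in
  Printf.printf "7y yield (linear): %.4f\n" (linear_interpolate curve 7.0)
  (* 0.0522: 40% of the way from 5.1% to 5.4% *)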

3.6.2 Cubic Spline Interpolation

Cubic splines are the standard in yield curve construction because they:

  • Pass through all data points
  • Have continuous first and second derivatives
  • Minimise the "roughness" integral $\int (f'')^2$

type spline = {
  xs     : float array;
  ys     : float array;
  m      : float array;   (* second derivatives at knots *)
}

(**
    Natural cubic spline interpolation.
    Builds a tridiagonal system and solves via Thomas algorithm.
*)
let make_spline xs ys =
  let n = Array.length xs in
  assert (n >= 2);
  (* h_i = x_{i+1} - x_i *)
  let h = Array.init (n - 1) (fun i -> xs.(i + 1) -. xs.(i)) in
  (* Right-hand side *)
  let rhs = Array.init (n - 2) (fun i ->
    6.0 *. ((ys.(i + 2) -. ys.(i + 1)) /. h.(i + 1)
           -. (ys.(i + 1) -. ys.(i))   /. h.(i))
  ) in
  (* Solve the tridiagonal system for the interior second derivatives *)
  let m = Array.make n 0.0 in  (* natural BCs: m_0 = m_{n-1} = 0 *)
  if n > 2 then begin          (* n = 2 is a straight line: all m stay 0 *)
    (* Thomas algorithm *)
    let a = Array.init (n - 2) (fun i -> h.(i)) in
    let diag = Array.init (n - 2) (fun i -> 2.0 *. (h.(i) +. h.(i + 1))) in
    let c = Array.init (n - 2) (fun i -> h.(i + 1)) in
    (* Forward sweep *)
    for i = 1 to n - 3 do
      let w = a.(i) /. diag.(i - 1) in
      diag.(i) <- diag.(i) -. w *. c.(i - 1);
      rhs.(i)  <- rhs.(i)  -. w *. rhs.(i - 1)
    done;
    (* Back substitution *)
    m.(n - 2) <- rhs.(n - 3) /. diag.(n - 3);
    for i = n - 4 downto 0 do
      m.(i + 1) <- (rhs.(i) -. c.(i) *. m.(i + 2)) /. diag.(i)
    done
  end;
  { xs; ys; m }

let eval_spline { xs; ys; m } x =
  let n = Array.length xs in
  if x <= xs.(0) then ys.(0)
  else if x >= xs.(n - 1) then ys.(n - 1)
  else begin
    (* Binary search for the interval *)
    let lo = ref 0 and hi = ref (n - 2) in
    while !hi - !lo > 1 do
      let mid = (!lo + !hi) / 2 in
      if xs.(mid) <= x then lo := mid else hi := mid
    done;
    let i = !lo in
    let h = xs.(i + 1) -. xs.(i) in
    let t = (x -. xs.(i)) /. h in
    let a = 1.0 -. t in
    (* Cubic Hermite form *)
    a *. ys.(i) +. t *. ys.(i + 1)
    +. h *. h /. 6.0 *. (
          (a *. a *. a -. a) *. m.(i)
       +. (t *. t *. t -. t) *. m.(i + 1)
    )
  end
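
The same 7-year lookup through the natural spline; the value differs slightly from the linear result because the spline bends to respect the neighbouring knots.

let () =
  let xs = [| 1.0; 2.0; 5.0; 10.0; 30.0 |] in
  let ys = [| 0.045; 0.048; 0.051; 0.054; 0.056 |] in
  let s = make_spline xs ys in
  Printf.printf "7y yield (spline): %.4f\n" (eval_spline s 7.0)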

Figure 3.3 (Cubic spline vs linear interpolation) — Both interpolants pass through the control points (left panel); the smoothness difference appears in the first derivative (right panel). The linear interpolant's derivative is a discontinuous staircase with kinks at the knots, while the cubic spline's derivative is continuous.


3.7 Special Functions

Financial models rely on several special functions, most importantly the normal distribution.

3.7.1 Normal CDF and PDF

The cumulative normal distribution $\Phi(x) = P(Z \leq x)$ for $Z \sim \mathcal{N}(0,1)$ appears in virtually every derivative pricing formula.

$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt$$

let pi = 4.0 *. atan 1.0

(** Standard normal PDF *)
let norm_pdf x = Float.exp (-. 0.5 *. x *. x) /. sqrt (2.0 *. pi)

(**
    Rational approximation to Φ(x) — Abramowitz & Stegun 26.2.17
    Maximum error: |ε(x)| < 7.5 × 10^{-8}
*)
let norm_cdf x =
  let sign = if x >= 0.0 then 1.0 else -1.0 in
  let x = Float.abs x in
  let t = 1.0 /. (1.0 +. 0.2316419 *. x) in
  let poly = t *. (0.319381530
           +. t *. (-0.356563782
           +. t *. (1.781477937
           +. t *. (-1.821255978
           +. t *. 1.330274429)))) in
  let phi = 1.0 -. norm_pdf x *. poly in
  if sign > 0.0 then phi else 1.0 -. phi

(**
    Inverse normal CDF (probit function) — Peter Acklam's algorithm
    Relative error: < 1.15 × 10^{-9}
*)
let norm_ppf p =
  assert (p > 0.0 && p < 1.0);
  let a = [|-3.969683028665376e+01; 2.209460984245205e+02;
             -2.759285104469687e+02; 1.383577518672690e+02;
             -3.066479806614716e+01; 2.506628277459239e+00|] in
  let b = [|-5.447609879822406e+01; 1.615858368580409e+02;
             -1.556989798598866e+02; 6.680131188771972e+01;
             -1.328068155288572e+01|] in
  let c = [|-7.784894002430293e-03;-3.223964580411365e-01;
             -2.400758277161838e+00;-2.549732539343734e+00;
              4.374664141464968e+00; 2.938163982698783e+00|] in
  let d = [| 7.784695709041462e-03; 3.224671290700398e-01;
              2.445134137142996e+00; 3.754408661907416e+00|] in
  let p_low  = 0.02425 in
  let p_high = 1.0 -. p_low in
  let horner coeffs x =
    Array.fold_left (fun acc c -> acc *. x +. c) 0.0 coeffs
  in
  if p < p_low then
    let q = sqrt (-. 2.0 *. log p) in
    horner c q /. (1.0 +. q *. horner d q)
  else if p <= p_high then
    let q = p -. 0.5 in
    let r = q *. q in
    q *. horner a r /. (horner b r *. r +. 1.0)
  else
    let q = sqrt (-. 2.0 *. log (1.0 -. p)) in
    -. (horner c q /. (1.0 +. q *. horner d q))
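
A round-trip check, Φ(Φ⁻¹(p)) ≈ p, exercises both approximations at once; the deviation is bounded by the 7.5 × 10⁻⁸ accuracy of the CDF approximation.

let () =
  List.iter (fun p ->
    let err = Float.abs (norm_cdf (norm_ppf p) -. p) in
    Printf.printf "p = %.4f  round-trip error = %.2e\n" p err
  ) [0.001; 0.025; 0.5; 0.975; 0.999]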

3.8 Building a Reusable Numerical Toolkit

Let us assemble the functions from this chapter into a well-structured module:

(** 
    Numerics — A reusable numerical toolkit for quantitative finance.
    Provides: integration, differentiation, root-finding, interpolation,
              special functions, and matrix utilities.
*)
module Numerics = struct

  module Diff = struct
    let deriv ?(h = 1e-5) f x = (f (x +. h) -. f (x -. h)) /. (2.0 *. h)
    let deriv2 ?(h = 1e-5) f x = (f (x +. h) -. 2.0 *. f x +. f (x -. h)) /. (h *. h)
    let grad ?(h = 1e-5) f xs =
      Array.mapi (fun i _ ->
        let xs_up   = Array.copy xs in
        let xs_down = Array.copy xs in
        xs_up.(i)   <- xs.(i) +. h;
        xs_down.(i) <- xs.(i) -. h;
        (f xs_up -. f xs_down) /. (2.0 *. h)
      ) xs
  end

  module Integrate = struct
    let simpson ?(n = 1000) f a b = integrate_simpson f a b n
    let gauss_5 = gauss_legendre_5
  end

  module Roots = struct
    let newton = newton_raphson
    let brent  = brent
  end

  module Special = struct
    let norm_cdf  = norm_cdf
    let norm_pdf  = norm_pdf
    let norm_ppf  = norm_ppf
    let kahan_sum = kahan_sum
  end

  module Interp = struct
    let linear = linear_interpolate
    let spline = (fun xs ys -> make_spline xs ys |> eval_spline)
  end

end
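
A few calls through the assembled namespace confirm that the pieces fit together:

let () =
  let f x = x *. x *. x in
  Printf.printf "d/dx x^3 at 2:         %.6f\n" (Numerics.Diff.deriv f 2.0);            (* ~12 *)
  Printf.printf "integral x^3 on [0,1]: %.6f\n" (Numerics.Integrate.simpson f 0.0 1.0); (* 0.25 *)
  Printf.printf "Phi(1.96):             %.6f\n" (Numerics.Special.norm_cdf 1.96)        (* ~0.975 *)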


3.9 Tail Recursion — Stack-Safe Numerical Algorithms

OCaml guarantees tail-call optimisation (TCO): a call in tail position is compiled as a jump, not a function call, and consumes no stack frame. This is not an incidental optimisation; it is a commitment of the language implementation. Python, Java, and C++ make no such guarantee (CPython deliberately omits TCO as a design choice). Haskell achieves similar effects through lazy evaluation and strictness annotations, but the mechanism is less direct.

For numerical algorithms in quantitative finance, this matters in several concrete cases:

Bond ladder cash flow aggregation. A bond ladder may have 120 semi-annual coupon dates over 60 years. A naive recursive summation uses 120 stack frames. With tail recursion, it uses one:

(** Tail-recursive: accumulates present values without growing the stack *)
let bond_pv ~face ~coupon_rate ~yield ~n_periods =
  let coupon = face *. coupon_rate /. 2.0 in
  let df_half_yr = 1.0 /. (1.0 +. yield /. 2.0) in
  let rec go period acc df =
    if period = 0 then
      acc +. face *. df     (* principal payment at maturity *)
    else
      let df = df *. df_half_yr in              (* advance one coupon date *)
      go (period - 1) (acc +. coupon *. df) df
  in
  go n_periods 0.0 1.0   (* go is in tail position: TCO applies *)

(** Compare the non-tail-recursive version (stack depth = n_periods): *)
let bond_pv_naive ~face ~coupon_rate ~yield ~n_periods =
  let coupon = face *. coupon_rate /. 2.0 in
  let rec go period df =
    if period = 0 then face *. df
    else
      let df = df /. (1.0 +. yield /. 2.0) in
      coupon *. df +. go (period - 1) df
      (* ^^^ go ... is NOT in tail position: +. is applied after the call *)
  in
  go n_periods 1.0

(** For 120 periods: bond_pv uses O(1) stack; bond_pv_naive grows the stack linearly with n_periods *)
let _ex1 = bond_pv ~face:1000.0 ~coupon_rate:0.05 ~yield:0.048 ~n_periods:120

Newton-Raphson with iteration bound. Root-finding algorithms iterate until convergence. A tail-recursive implementation is more natural and avoids mutable state:

(** Tail-recursive Newton-Raphson — O(1) stack, readable, no mutable counter *)
let newton_raphson ~f ~f' ?(tol = 1e-9) ?(max_iter = 200) x0 =
  let rec go x iter =
    let fx  = f x in
    let fx' = f' x in
    if Float.abs fx < tol then Ok x
    else if iter = 0 then Error (Printf.sprintf "Newton failed at x=%.6f" x)
    else if Float.abs fx' < 1e-14 then Error "Zero derivative"
    else go (x -. fx /. fx') (iter - 1)
  in
  go x0 max_iter

(** Yield-to-maturity: find r such that bond_pv(r) = market_price *)
let ytm ~market_price ~face ~coupon_rate ~n_periods =
  let f r  = bond_pv ~face ~coupon_rate ~yield:r ~n_periods -. market_price in
  let f' r = (* finite difference for gradient *)
    let h = 1e-6 in
    (f (r +. h) -. f (r -. h)) /. (2.0 *. h)
  in
  newton_raphson ~f ~f' coupon_rate   (* initial guess: coupon rate *)

(** With a 60y semi-annual bond (120 periods), both go and newton_raphson
    use O(1) stack frames regardless of iteration count. *)
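
A usage sketch with illustrative numbers: a discount bond at 950 per 1,000 face should yield somewhat more than its 5% coupon.

let () =
  match ytm ~market_price:950.0 ~face:1000.0 ~coupon_rate:0.05 ~n_periods:120 with
  | Ok r      -> Printf.printf "YTM: %.4f%%\n" (100.0 *. r)
  | Error msg -> prerr_endline msg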

Lattice traversal for tree models. A recombining binomial tree with n steps has O(n²) nodes (a non-recombining tree grows exponentially), and production pricers routinely use thousands of steps, so the backward-induction sweep must run in constant stack space:

(** Fold a function over a recombining lattice level by level (constant stack: plain loops, not recursion) *)
let binomial_tree_fold ~n ~init ~f =
  let nodes = Array.make (n + 1) init in
  for step = n - 1 downto 0 do
    for j = 0 to step do
      nodes.(j) <- f nodes.(j) nodes.(j + 1)
    done
  done;
  nodes.(0)

(** American put by backward induction — no stack growth for any n *)
let american_put ~spot ~strike ~rate ~vol ~tau ~n_steps =
  let dt   = tau /. float_of_int n_steps in
  let u    = exp (vol *. sqrt dt) in
  let d    = 1.0 /. u in
  let p    = (exp (rate *. dt) -. d) /. (u -. d) in
  let df   = exp (-. rate *. dt) in
  (* Terminal values *)
  let terminal_values = Array.init (n_steps + 1) (fun j ->
    let s = spot *. (u ** float_of_int j) *. (d ** float_of_int (n_steps - j)) in
    Float.max 0.0 (strike -. s)
  ) in
  let nodes = Array.copy terminal_values in
  for step = n_steps - 1 downto 0 do
    for j = 0 to step do
      let s = spot *. (u ** float_of_int j) *. (d ** float_of_int (step - j)) in
      let cont = df *. (p *. nodes.(j + 1) +. (1.0 -. p) *. nodes.(j)) in
      nodes.(j) <- Float.max (strike -. s) cont   (* early exercise *)
    done
  done;
  nodes.(0)
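
Usage with illustrative parameters; the step count is deliberately large to make the stack-safety point, since the double loop touches roughly 12.5 million nodes without a single recursive call.

let () =
  let price = american_put ~spot:100.0 ~strike:100.0 ~rate:0.05
                ~vol:0.20 ~tau:1.0 ~n_steps:5000 in
  Printf.printf "American put: %.4f\n" price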

The american_put uses iterative backward induction rather than recursive traversal, achieving the same stack-safety guarantee through imperative style. In OCaml, the idiomatic choice depends on clarity: recursive for go-style iteration, imperative for indexed grid updates. Both are available, and both are stack-safe in OCaml in a way that Python (recursion limit of 1000) is not.


3.10 Chapter Summary

Numerical methods are the bridge between mathematical theory and working software. Each mathematical operation — summation, matrix multiplication, integration, root-finding, interpolation — has well-understood numerical properties that govern when it produces accurate results and when it fails.

Floating-point arithmetic is the foundation of all numerical computation, and its limitations are non-negotiable. IEEE 754 double precision provides about 15 significant decimal digits, which is sufficient for most financial computations but not all. Large portfolio computations that sum thousands of small numbers across many instruments can accumulate significant error with naive summation; Kahan's compensated summation algorithm reduces this error from $O(n\varepsilon)$ to $O(\varepsilon)$ with minimal overhead. For exact monetary accounting (never floating-point: $0.1 + 0.2 \neq 0.3$ in IEEE 754), Zarith arbitrary-precision arithmetic is the correct tool.

The numerical methods hierarchy — quadrature for integration, Newton-Raphson for root-finding, cubic splines for interpolation — reflects decades of accumulated knowledge about which algorithms converge, how fast, and when they fail. Newton-Raphson converges quadratically for smooth functions near the root but diverges if started far from the root or if the function has a near-zero derivative. Brent's method (bisection combined with secant and inverse quadratic interpolation) is unconditionally convergent when a bracket is available and is the standard choice for implied volatility computation. Cubic splines produce $C^2$ interpolants that are the industry standard for yield curve construction because their continuous second derivative avoids the oscillation artifacts of higher-degree polynomial interpolation.

Tail-recursive algorithms (§3.9) enjoy a language-level guarantee in OCaml: a call in tail position is compiled as a jump and consumes no stack frame. This makes recursive bond ladder aggregation, Newton iteration, and lattice model traversal stack-safe regardless of depth, a guarantee that Python, Java, and C++ do not provide.

Owl provides a comprehensive numerical computing environment for OCaml: BLAS/LAPACK-backed matrix operations, eigendecomposition, Cholesky factorisation, and statistical distributions. The normal CDF $\Phi(x)$, its inverse $\Phi^{-1}(p)$, and the probability density function $\phi(x)$ appear in virtually every derivative pricing formula in this book and must be implemented correctly and validated against reference values.


Exercises

3.1 Implement the trapezoidal rule and compare its convergence rate to Simpson's rule by numerically integrating $e^{-x^2}$ from 0 to 3 at various step counts.

3.2 Implement Halley's method: $x_{n+1} = x_n - 2f(x_n)f'(x_n) / (2[f'(x_n)]^2 - f(x_n)f''(x_n))$. Show that it converges cubically and compare to Newton on the implied volatility problem.

3.3 Given a 5-point yield curve (tenors: 1, 2, 5, 10, 30 years; yields: 4.5%, 4.8%, 5.1%, 5.4%, 5.6%), compute the zero rate at 7 years using linear interpolation vs cubic spline. Explain the difference.

3.4 Verify the relationship $\Phi(x) + \Phi(-x) = 1$ numerically for $x \in \{-3, -2, -1, 0, 1, 2, 3\}$ and measure the maximum deviation from exact symmetry.

3.5 Using the tail-recursive newton_raphson from §3.9, implement ytm_from_price for a 60-year bond with semi-annual coupons (120 periods). Verify that the YTM is self-consistent: bond_pv ~yield:(ytm ...) should recover the original market price to within 1 cent. Confirm that neither function grows the stack by testing with n_periods = 10000.


Next: Chapter 4 — Probability and Statistics