Chapter 4 — Probability and Statistics

"In God we trust. All others must bring data." — W. Edwards Deming


Probability and statistics are the language of quantitative finance. Every model is a probability model — an assertion about the distribution of future asset prices. Every risk measure is a statistical measure: a quantile (VaR), a conditional expectation (Expected Shortfall), or a standard deviation (volatility). Every strategy backtest is a statistical inference problem with the fundamental challenge of distinguishing genuine predictive power from overfitting to historical data.

The classical assumption underlying most financial models — that asset returns are normally distributed — is convenient but false. Daily equity returns have approximately 4-8 times as much kurtosis as the normal distribution, meaning extreme returns occur far more often than Gaussian models predict. The crash of 1987 (a 22-standard-deviation event under Gaussian assumptions, with probability roughly $10^{-106}$) was followed by the LTCM crisis of 1998, the dot-com crash of 2000, the financial crisis of 2008, and the COVID crash of 2020 — a frequency of extreme events that is entirely consistent with fat-tailed distributions but impossible under Gaussian assumptions. Understanding the statistical fingerprint of financial returns — their distribution, their autocorrelation structure, their changing variance — is the prerequisite for building models that fail gracefully rather than catastrophically.

This chapter covers the statistical toolkit of quantitative finance: the key distributions (normal, lognormal, Student-t), moment estimation, random variate generation, Monte Carlo fundamentals, regression and factor models, and time series analysis. These tools appear throughout every subsequent chapter in the book.


4.1 Probability Distributions in Finance

Financial models are built on probability distributions. Asset returns are modeled as random variables; option prices are expected values under a risk-neutral measure; risk measures like VaR are quantiles of loss distributions.

Figure 4.1 — Normal vs Laplace vs Student-t tails, plotted on a log scale. The Normal distribution has rapidly decaying tails, severely underestimating the probability of a 4σ or 5σ event compared to the heavier-tailed Laplace or Student-t distributions commonly fit to financial returns.

4.1.1 The Normal Distribution

The normal distribution $\mathcal{N}(\mu, \sigma^2)$ has PDF:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

It is the workhorse of classical quantitative finance — despite being empirically wrong for asset returns (which exhibit fat tails and asymmetry), it remains the foundation because it is analytically tractable.
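The code in this chapter calls two helper functions, norm_cdf and norm_ppf, that are assumed to be in scope. A minimal self-contained sketch of both — the Abramowitz–Stegun polynomial approximation for the CDF (absolute error below 7.5e-8) and a Newton iteration for the inverse:

```ocaml
(* Standard normal CDF, Abramowitz & Stegun formula 26.2.17
   (absolute error < 7.5e-8) *)
let norm_cdf x =
  let t = 1.0 /. (1.0 +. 0.2316419 *. abs_float x) in
  let poly =
    t *. (0.319381530 +. t *. (-. 0.356563782 +. t *. (1.781477937
      +. t *. (-. 1.821255978 +. t *. 1.330274429))))
  in
  let pdf = exp (-. 0.5 *. x *. x) /. sqrt (2.0 *. Float.pi) in
  let p = 1.0 -. pdf *. poly in
  if x >= 0.0 then p else 1.0 -. p

(* Inverse CDF by Newton's method. Φ is increasing and convex left of the
   root for p < 1/2 (concave for p > 1/2), so the iteration converges
   from x₀ = 0 for any p in (0, 1). *)
let norm_ppf p =
  assert (p > 0.0 && p < 1.0);
  let x = ref 0.0 in
  for _ = 1 to 100 do
    let pdf = exp (-. 0.5 *. !x *. !x) /. sqrt (2.0 *. Float.pi) in
    x := !x -. (norm_cdf !x -. p) /. pdf
  done;
  !x
```

The inverse inherits the CDF approximation's error, so for production use a dedicated rational approximation (e.g. Acklam's, or Wichura's AS 241) is both faster and more accurate in the extreme tails.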

module Normal = struct
  type t = { mu : float; sigma : float }

  let make ~mu ~sigma =
    assert (sigma > 0.0);
    { mu; sigma }

  let pdf { mu; sigma } x =
    let z = (x -. mu) /. sigma in
    exp (-. 0.5 *. z *. z) /. (sigma *. sqrt (2.0 *. Float.pi))

  let cdf { mu; sigma } x =
    let z = (x -. mu) /. sigma in
    norm_cdf z

  let ppf { mu; sigma } p =
    mu +. sigma *. norm_ppf p

  let mean { mu; _ } = mu
  let variance { sigma; _ } = sigma *. sigma
  let std { sigma; _ } = sigma
  let skewness _ = 0.0
  let kurtosis _ = 0.0   (* excess kurtosis *)

  (** Monte Carlo sample via the Box-Muller transform *)
  let sample { mu; sigma } rng =
    let u1 = max 1e-15 (Random.State.float rng 1.0) in   (* avoid log 0 *)
    let u2 = Random.State.float rng 1.0 in
    let z = sqrt (-. 2.0 *. log u1) *. cos (2.0 *. Float.pi *. u2) in
    mu +. sigma *. z
end

4.1.2 The Lognormal Distribution

If $X \sim \mathcal{N}(\mu, \sigma^2)$, then $S = e^X$ is lognormal. Asset prices that follow Geometric Brownian Motion are lognormally distributed:

$$S_T = S_0 \exp\left[\left(\mu - \frac{\sigma^2}{2}\right)T + \sigma\sqrt{T} Z\right], \quad Z \sim \mathcal{N}(0,1)$$

The $-\sigma^2/2$ correction (Itô correction) ensures the mean of $S_T$ is $S_0 e^{\mu T}$:

$$\mathbb{E}[S_T] = S_0 e^{\mu T}$$

module Lognormal = struct
  type t = { mu : float; sigma : float }   (* parameters of the underlying normal *)

  (** Parameters from the lognormal mean m and variance v *)
  let from_mean_variance ~mean ~variance =
    let sigma2 = log (1.0 +. variance /. (mean *. mean)) in
    let mu = log mean -. sigma2 /. 2.0 in
    { mu; sigma = sqrt sigma2 }

  let pdf { mu; sigma } x =
    if x <= 0.0 then 0.0
    else
      let lx = log x in
      exp (-. (lx -. mu) ** 2.0 /. (2.0 *. sigma *. sigma))
      /. (x *. sigma *. sqrt (2.0 *. Float.pi))

  let cdf { mu; sigma } x =
    if x <= 0.0 then 0.0
    else norm_cdf ((log x -. mu) /. sigma)

  let mean { mu; sigma } = exp (mu +. sigma *. sigma /. 2.0)
  let variance { mu; sigma } =
    let m = mean { mu; sigma } in
    (exp (sigma *. sigma) -. 1.0) *. m *. m

  (** E[max(S - K, 0)] — under risk-neutral parameters, discounting this
      expectation gives exactly the Black-Scholes call formula *)
  let call_payoff_expectation { mu; sigma } k =
    let d1 = (mu +. sigma *. sigma -. log k) /. sigma in
    let d2 = d1 -. sigma in
    exp (mu +. sigma *. sigma /. 2.0) *. norm_cdf d1
    -. k *. norm_cdf d2
end
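As a sanity check on call_payoff_expectation, the discounted expectation under risk-neutral parameters should reproduce the Black-Scholes call price. The sketch below is self-contained — it inlines an approximate norm_cdf rather than assuming the chapter's helper, and the parameters (S0 = 100, K = 100, r = 5%, σ = 20%, T = 1) are purely illustrative:

```ocaml
(* Inline approximate normal CDF (Abramowitz & Stegun 26.2.17,
   absolute error < 7.5e-8) so the example runs standalone *)
let norm_cdf x =
  let t = 1.0 /. (1.0 +. 0.2316419 *. abs_float x) in
  let poly = t *. (0.319381530 +. t *. (-. 0.356563782 +. t *. (1.781477937
    +. t *. (-. 1.821255978 +. t *. 1.330274429)))) in
  let p = 1.0 -. exp (-. 0.5 *. x *. x) /. sqrt (2.0 *. Float.pi) *. poly in
  if x >= 0.0 then p else 1.0 -. p

(* E[max(S - K, 0)] for a lognormal whose underlying normal is N(mu, sigma²) *)
let call_payoff_expectation ~mu ~sigma k =
  let d1 = (mu +. sigma *. sigma -. log k) /. sigma in
  let d2 = d1 -. sigma in
  exp (mu +. sigma *. sigma /. 2.0) *. norm_cdf d1 -. k *. norm_cdf d2

(* Under the risk-neutral measure: ln S_T ~ N(ln S0 + (r - σ²/2)T, σ²T);
   discounting the expectation yields the Black-Scholes call price *)
let bs_call_via_lognormal ~s0 ~k ~r ~vol ~t =
  let mu = log s0 +. (r -. 0.5 *. vol *. vol) *. t in
  let sigma = vol *. sqrt t in
  exp (-. r *. t) *. call_payoff_expectation ~mu ~sigma k

let () =
  Printf.printf "Discounted call expectation: %.4f\n"
    (bs_call_via_lognormal ~s0:100.0 ~k:100.0 ~r:0.05 ~vol:0.2 ~t:1.0)
  (* ≈ 10.45 for these parameters, matching the Black-Scholes value *)
```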

4.1.3 Student's t-Distribution

Fat tails in equity returns are better captured by the Student's t-distribution with $\nu$ degrees of freedom. As $\nu \to \infty$, it approaches the normal:

$$f(x; \nu) = \frac{\Gamma((\nu+1)/2)}{\sqrt{\nu\pi}\,\Gamma(\nu/2)}\left(1 + \frac{x^2}{\nu}\right)^{-(\nu+1)/2}$$

For $2 < \nu \leq 4$ the kurtosis is infinite (and for $\nu \leq 2$ even the variance diverges) — capturing the extreme events that normal models underestimate.

(** Student-t PDF (standardised: zero mean, unit scale) *)
let student_t_pdf ~nu x =
  (* Stirling approximation: log Γ(z) ≈ (z - 1/2) log z - z + (1/2) log 2π.
     Adequate for moderate ν; substitute a full lgamma for small ν. *)
  let log_gamma z = (z -. 0.5) *. log z -. z +. 0.5 *. log (2.0 *. Float.pi) in
  let log_norm =
    log_gamma ((nu +. 1.0) /. 2.0)
    -. 0.5 *. log (nu *. Float.pi)
    -. log_gamma (nu /. 2.0)
  in
  exp (log_norm -. (nu +. 1.0) /. 2.0 *. log (1.0 +. x *. x /. nu))

(** Excess kurtosis of Student-t: κ = 6/(ν-4) for ν > 4 *)
let student_t_kurtosis ~nu =
  if nu > 4.0 then 6.0 /. (nu -. 4.0)
  else infinity
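To make the fat-tail claim concrete: for $\nu = 3$ the Student-t CDF has a closed form, so its tail probability can be compared with the normal's directly. A standalone sketch (the inline norm_cdf is an Abramowitz–Stegun approximation; note the comparison is at 4 scale units of the unit-scale, not unit-variance, t):

```ocaml
(* Closed-form CDF of the Student-t with ν = 3:
   F(x) = 1/2 + (1/π) (atan(x/√3) + (x/√3) / (1 + x²/3)) *)
let t3_cdf x =
  let u = x /. sqrt 3.0 in
  0.5 +. (atan u +. u /. (1.0 +. x *. x /. 3.0)) /. Float.pi

(* Approximate standard normal CDF (Abramowitz & Stegun 26.2.17) *)
let norm_cdf x =
  let t = 1.0 /. (1.0 +. 0.2316419 *. abs_float x) in
  let poly = t *. (0.319381530 +. t *. (-. 0.356563782 +. t *. (1.781477937
    +. t *. (-. 1.821255978 +. t *. 1.330274429)))) in
  let p = 1.0 -. exp (-. 0.5 *. x *. x) /. sqrt (2.0 *. Float.pi) *. poly in
  if x >= 0.0 then p else 1.0 -. p

let () =
  let tail_t = 1.0 -. t3_cdf 4.0 in     (* ≈ 1.4e-2 *)
  let tail_n = 1.0 -. norm_cdf 4.0 in   (* ≈ 3.2e-5 *)
  Printf.printf "P(X > 4): t(3) = %.4f, normal = %.2e (ratio ≈ %.0f)\n"
    tail_t tail_n (tail_t /. tail_n)
```

A 4-unit move is roughly 400 times more likely under the t(3) than under the normal — the kind of discrepancy that turns a "once per millennium" Gaussian event into a routine occurrence.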

4.1.4 The Poisson Distribution

Jump processes in asset prices (sudden gaps at earnings, macro announcements) are modeled with the Poisson distribution. The probability of $k$ jumps in a time interval of length $t$ with intensity $\lambda$:

$$P(N_t = k) = \frac{(\lambda t)^k e^{-\lambda t}}{k!}$$

let poisson_pmf ~lambda k =
  (* work in log space: λ^k and k! overflow for large k *)
  let log_fact = ref 0.0 in
  for i = 2 to k do log_fact := !log_fact +. log (float_of_int i) done;
  exp (float_of_int k *. log lambda -. lambda -. !log_fact)

let poisson_sample ~lambda rng =
  (* Knuth's algorithm for Poisson sampling *)
  let l = exp (-. lambda) in
  let k = ref 0 and p = ref 1.0 in
  while !p > l do
    p := !p *. Random.State.float rng 1.0;
    incr k
  done;
  !k - 1

4.2 Moments: Mean, Variance, Skewness, Kurtosis

The first four moments characterise the shape of a distribution and the risk profile of a return stream.

4.2.1 Sample Moments

(**
    Compute the first four central moments in a single pass
    using Welford's online algorithm for numerical stability.
*)
let moments arr =
  let n = Array.length arr in
  assert (n > 1);   (* the unbiased variance needs at least two points *)
  let mean  = ref 0.0 in
  let m2    = ref 0.0 in   (* sum of squared deviations *)
  let m3    = ref 0.0 in
  let m4    = ref 0.0 in
  Array.iteri (fun i x ->
    let ni   = float_of_int (i + 1) in
    let delta = x -. !mean in
    let delta_n = delta /. ni in
    let term1 = delta *. delta_n *. float_of_int i in
    mean := !mean +. delta_n;
    m4   := !m4 +. term1 *. delta_n *. delta_n *. (ni *. ni -. 3.0 *. ni +. 3.0)
             +. 6.0 *. delta_n *. delta_n *. !m2 -. 4.0 *. delta_n *. !m3;
    m3   := !m3 +. term1 *. delta_n *. (ni -. 2.0) -. 3.0 *. delta_n *. !m2;
    m2   := !m2 +. term1
  ) arr;
  let variance = !m2 /. float_of_int (n - 1) in   (* unbiased *)
  let std = sqrt variance in
  let skewness = if std > 0.0 then
    (!m3 /. float_of_int n) /. (std *. std *. std)
  else 0.0 in
  let kurtosis = if variance > 0.0 then
    (!m4 /. float_of_int n) /. (variance *. variance) -. 3.0   (* excess *)
  else 0.0 in
  (`Mean !mean, `Variance variance, `Std std, `Skewness skewness, `Excess_kurtosis kurtosis)

(** Annualised return statistics *)
let annualise_moments ~daily_mean ~daily_var ~trading_days =
  let annual_return = daily_mean *. float_of_int trading_days in
  let annual_vol    = sqrt (daily_var *. float_of_int trading_days) in
  (annual_return, annual_vol)

4.2.2 Interpreting Moments in Finance

Moment   Statistic        Financial Interpretation
------   ---------        ------------------------
1st      Mean             Expected return
2nd      Variance / Vol   Total risk
3rd      Skewness         Asymmetry; negative means crash risk
4th      Kurtosis         Fat tails; positive excess means more extreme events than normal

Empirical equity return skewness is typically negative (−0.3 to −0.8): large negative moves are more common than large positive moves of the same magnitude. Excess kurtosis is typically positive (3–8): extreme events occur far more often than a Gaussian model predicts.


4.3 Sampling and Random Number Generation

4.3.1 Pseudo-Random Number Generation

OCaml's built-in Random module does not use the Mersenne Twister (MT19937) familiar from NumPy: releases before OCaml 5 use a lagged-Fibonacci generator, and OCaml 5 switched to the splittable LXM family. Any of these is statistically adequate for Monte Carlo simulation; what matters in practice is explicit, seedable state for reproducibility.

(** A stateful RNG with reproducible seeds *)
module Rng = struct
  type t = Random.State.t

  let make seed = Random.State.make [|seed|]
  let make_self_init () = Random.State.make_self_init ()

  let uniform rng = Random.State.float rng 1.0

  (** Box-Muller transform: standard normal sample *)
  let normal rng =
    let u1 = max 1e-15 (uniform rng) in   (* avoid log(0) *)
    let u2 = uniform rng in
    sqrt (-. 2.0 *. log u1) *. cos (2.0 *. Float.pi *. u2)

  (** Array of n standard normals *)
  let normals rng n = Array.init n (fun _ -> normal rng)

  (** Ziggurat algorithm would be faster but this is clear *)
end

4.3.2 Quasi-Monte Carlo: Sobol Sequences

Sobol sequences are low-discrepancy sequences that cover the unit cube more uniformly than pseudo-random numbers. For the same number of samples, quasi-Monte Carlo converges at $O((\log N)^d / N)$ vs $O(1/\sqrt{N})$ for standard Monte Carlo.

(**
    1-dimensional Sobol sequence (direction numbers from Joe & Kuo 2010).
    For production use, load the full 21201-dimensional direction number table.
*)
module Sobol = struct
  (* Direction numbers for dimension 1 *)
  let direction_numbers_1d = [|1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
                                1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1|]

  let v = Array.init 32 (fun i ->
    direction_numbers_1d.(i) lsl (31 - i)
  )

  let norm = 1.0 /. float_of_int (1 lsl 32)   (* cur holds a 32-bit value *)

  (** Generate n quasi-random points in [0, 1) *)
  let generate n =
    let x = Array.make n 0.0 in
    let cur = ref 0 in
    for i = 1 to n do
      let c = (* position of rightmost zero bit *)
        let rec find k = if (i - 1) land (1 lsl k) = 0 then k else find (k + 1)
        in find 0
      in
      cur := !cur lxor v.(c);
      x.(i - 1) <- float_of_int !cur *. norm
    done;
    x

  (** Transform to standard normal via inverse CDF *)
  let normals n =
    generate n |> Array.map (fun u ->
      let u = Float.max 1e-12 (Float.min (1.0 -. 1e-12) u) in
      norm_ppf u
    )
end

4.3.3 Antithetic Variates

A simple variance reduction technique: if $Z$ is standard normal, then $-Z$ has the same distribution. By averaging payoffs under $Z$ and $-Z$, variance is reduced:

let mc_antithetic ~payoff ~s0 ~rate ~vol ~tau ~n_paths rng =
  (* single step to maturity; an odd n_paths is rounded down to an even count *)
  let dt = tau in
  let total = ref 0.0 in
  for _ = 1 to n_paths / 2 do
    let z  = Rng.normal rng in
    let st_pos = s0 *. exp ((rate -. 0.5 *. vol *. vol) *. dt +. vol *. sqrt dt *. z) in
    let st_neg = s0 *. exp ((rate -. 0.5 *. vol *. vol) *. dt +. vol *. sqrt dt *. (-. z)) in
    total := !total +. (payoff st_pos +. payoff st_neg) /. 2.0
  done;
  exp (-. rate *. tau) *. !total /. float_of_int (n_paths / 2)
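A quick standalone check that the pairing actually reduces variance: estimate $E[e^Z] = e^{1/2} \approx 1.6487$ with plain and with antithetic sampling and compare per-draw variances. The seed, sample size, and function name here are illustrative:

```ocaml
(* Compare plain vs antithetic Monte Carlo for E[e^Z], Z ~ N(0,1) *)
let antithetic_comparison ~seed ~n =
  let rng = Random.State.make [| seed |] in
  let normal () =
    (* Box-Muller, guarding against log 0 *)
    let u1 = max 1e-15 (Random.State.float rng 1.0) in
    let u2 = Random.State.float rng 1.0 in
    sqrt (-. 2.0 *. log u1) *. cos (2.0 *. Float.pi *. u2)
  in
  let mean a = Array.fold_left (+.) 0.0 a /. float_of_int (Array.length a) in
  let var a =
    let m = mean a in
    Array.fold_left (fun acc x -> acc +. (x -. m) ** 2.0) 0.0 a
    /. float_of_int (Array.length a - 1)
  in
  let plain = Array.init n (fun _ -> exp (normal ())) in
  let anti  = Array.init n (fun _ ->
    let z = normal () in (exp z +. exp (-. z)) /. 2.0)
  in
  (mean plain, var plain, mean anti, var anti)

let () =
  let mp, vp, ma, va = antithetic_comparison ~seed:42 ~n:100_000 in
  Printf.printf "plain: mean %.4f var %.3f | antithetic: mean %.4f var %.3f\n"
    mp vp ma va
```

Each antithetic draw consumes one normal but two payoff evaluations, so a fair comparison halves the plain variance; even then the antithetic estimator wins for this payoff (population variances roughly 1.48 vs 4.67/2 ≈ 2.34).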

4.4 The Law of Large Numbers and Central Limit Theorem

4.4.1 Law of Large Numbers (Simulation)

(** Simulate convergence of sample mean to true mean *)
let simulate_lln ~true_mean ~sample_gen ~max_n =
  let rng = Rng.make 42 in
  let cum_sum = ref 0.0 in
  Array.init max_n (fun i ->
    cum_sum := !cum_sum +. sample_gen rng;
    let n = float_of_int (i + 1) in
    let sample_mean = !cum_sum /. n in
    let error = Float.abs (sample_mean -. true_mean) in
    (i + 1, sample_mean, error)
  )

(** Standard error of mean: σ/√n *)
let standard_error ~sigma ~n = sigma /. sqrt (float_of_int n)
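A standalone illustration of the $\sigma/\sqrt{n}$ law using uniform draws (true mean 0.5, $\sigma = 1/\sqrt{12} \approx 0.2887$); the seed and checkpoints are arbitrary:

```ocaml
(* Record |sample mean - 0.5| for U(0,1) draws at the given checkpoints;
   the error should shrink like σ/√n with σ = 1/√12 *)
let lln_errors ~seed ~checkpoints =
  let rng = Random.State.make [| seed |] in
  let cum = ref 0.0 in
  let last = List.fold_left max 0 checkpoints in
  let out = ref [] in
  for i = 1 to last do
    cum := !cum +. Random.State.float rng 1.0;
    if List.mem i checkpoints then
      out := (i, abs_float (!cum /. float_of_int i -. 0.5)) :: !out
  done;
  List.rev !out

let () =
  lln_errors ~seed:2024 ~checkpoints:[ 100; 10_000; 1_000_000 ]
  |> List.iter (fun (n, err) ->
       Printf.printf "n = %7d  |mean - 0.5| = %.5f  (sigma/sqrt n = %.5f)\n"
         n err (0.28868 /. sqrt (float_of_int n)))
```

The printed errors fluctuate around the theoretical $\sigma/\sqrt{n}$ column: each tenfold-squared increase in $n$ buys roughly one extra decimal digit of accuracy, which is why brute-force Monte Carlo precision is expensive.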

4.4.2 Central Limit Theorem and Monte Carlo Error

For a Monte Carlo estimate of a price $C$:

$$\hat{C} = \frac{1}{N}\sum_{i=1}^N V_i$$

The standard error of $\hat{C}$ is:

$$\text{SE}(\hat{C}) = \frac{\sigma_V}{\sqrt{N}}$$

where $\sigma_V$ is the standard deviation of the payoffs. A 95% confidence interval is $\hat{C} \pm 1.96 \cdot \text{SE}$.

type mc_result = {
  estimate   : float;
  std_error  : float;
  ci_low_95  : float;
  ci_high_95 : float;
  n_paths    : int;
}

let mc_result_of_payoffs payoffs ~rate ~tau =
  let n = Array.length payoffs in
  let df = exp (-. rate *. tau) in
  let mean   = Array.fold_left (+.) 0.0 payoffs /. float_of_int n in
  let var    = Array.fold_left (fun acc x -> acc +. (x -. mean) *. (x -. mean))
                 0.0 payoffs /. float_of_int (n - 1) in
  let se = sqrt var /. sqrt (float_of_int n) in
  {
    estimate  = df *. mean;
    std_error = df *. se;
    ci_low_95  = df *. (mean -. 1.96 *. se);
    ci_high_95 = df *. (mean +. 1.96 *. se);
    n_paths   = n;
  }

let pp_mc_result r =
  Printf.printf "Price: %.4f ± %.4f (95%% CI: [%.4f, %.4f], N=%d)\n"
    r.estimate r.std_error r.ci_low_95 r.ci_high_95 r.n_paths

4.5 Hypothesis Testing and Confidence Intervals

Quantitative finance uses statistical testing to validate models and evaluate trading strategies.

4.5.1 t-Test for Return Significance

Does a trading strategy generate returns significantly different from zero?

$$t = \frac{\bar{r}}{\hat{\sigma}/\sqrt{n}}$$

Under $H_0: \mu = 0$, $t \sim t_{n-1}$.

(** One-sample t-test: H_0: μ = mu_0 *)
let t_test ~returns ~mu_0 =
  let n = Array.length returns in
  let n_f = float_of_int n in
  let mean   = Array.fold_left (+.) 0.0 returns /. n_f in
  let var    = Array.fold_left (fun a x -> a +. (x -. mean) *. (x -. mean))
                 0.0 returns /. (n_f -. 1.0) in
  let se     = sqrt (var /. n_f) in
  let t_stat = (mean -. mu_0) /. se in
  let df     = n - 1 in
  (* p-value requires t-distribution CDF; approximate for large n *)
  let p_value_approx = 2.0 *. (1.0 -. norm_cdf (Float.abs t_stat)) in
  (t_stat, df, p_value_approx)

(** Sharpe ratio and its standard error *)
let sharpe_ratio ~returns ~risk_free_rate ~periods_per_year =
  let n = float_of_int (Array.length returns) in
  let excess = Array.map (fun r -> r -. risk_free_rate /. periods_per_year) returns in
  let mean = Array.fold_left (+.) 0.0 excess /. n in
  let var  = Array.fold_left (fun a x -> a +. (x -. mean) *. (x -. mean))
               0.0 excess /. (n -. 1.0) in
  let std  = sqrt var in
  let sr   = mean /. std *. sqrt periods_per_year in
  (* Standard error of the annualised Sharpe ratio (Lo 2002, i.i.d. case):
     SE ≈ sqrt((P + SR²/2) / n) with P = periods per year *)
  let sr_se = sqrt ((periods_per_year +. 0.5 *. sr *. sr) /. n) in
  (sr, sr_se)

4.5.2 Jarque-Bera Test for Normality

The Jarque-Bera test tests whether sample skewness and kurtosis match a normal distribution:

$$JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right) \sim \chi^2_2 \quad \text{under } H_0$$

let jarque_bera returns =
  let n = float_of_int (Array.length returns) in
  let (`Mean _, `Variance _, `Std _, `Skewness s, `Excess_kurtosis k) = moments returns in
  let jb = n /. 6.0 *. (s *. s +. (k *. k) /. 4.0) in
  (* p-value: JB ~ chi^2(2) under H_0 *)
  (* For chi^2(2): P(X > x) = exp(-x/2) *)
  let p_value = exp (-. jb /. 2.0) in
  (jb, p_value)

4.6 Regression Analysis

4.6.1 Ordinary Least Squares

Linear regression is the foundation of factor models, beta estimation, and model calibration.

Given the linear model:

$$y = X\beta + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma^2 I)$$

The OLS estimator is:

$$\hat{\beta} = (X^T X)^{-1} X^T y$$

open Owl

(**
    Ordinary Least Squares regression.
    
    X: (n_obs × n_features) design matrix (should include intercept column if needed)
    y: (n_obs × 1) response vector
    
    Returns: (beta, residuals, r_squared)
*)
let ols ~x ~y =
  (* β = (X^T X)^{-1} X^T y *)
  let xt  = Mat.transpose x in
  let xtx = Mat.dot xt x in
  let xty = Mat.dot xt y in
  (* Solve using Cholesky for numerical stability *)
  let beta = Linalg.D.linsolve xtx xty in
  let y_hat = Mat.dot x beta in
  let residuals = Mat.sub y y_hat in
  (* R² *)
  let ss_res = Mat.dot (Mat.transpose residuals) residuals |> Mat.get 0 0 in
  let y_mean = Mat.mean' y in
  let ss_tot = Mat.sum' (Mat.sqr (Mat.sub_scalar y y_mean)) in
  let r2 = 1.0 -. ss_res /. ss_tot in
  (beta, residuals, r2)

(**
    Compute CAPM beta and alpha for an asset return series.
    y = alpha + beta * market_return + epsilon
*)
let capm_beta ~asset_returns ~market_returns =
  let n = Array.length asset_returns in
  (* Design matrix: [1, r_m] *)
  let x = Mat.init_2d n 2 (fun i j ->
    if j = 0 then 1.0 else market_returns.(i)
  ) in
  let y = Mat.of_array asset_returns n 1 in
  let (beta, _, r2) = ols ~x ~y in
  let alpha_ann = Mat.get beta 0 0 *. 252.0 in   (* annualised, assuming daily returns *)
  let beta_val  = Mat.get beta 1 0 in
  Printf.printf "Alpha (ann): %.4f  Beta: %.4f  R²: %.4f\n" alpha_ann beta_val r2;
  (alpha_ann, beta_val, r2)

4.7 Time Series Basics

4.7.1 Autocorrelation

Autocorrelation measures the correlation of a time series with its own lagged values. Return autocorrelations should be near zero (consistent with market efficiency); volatility is highly autocorrelated (volatility clustering).

$$\rho_k = \frac{\sum_{t=k+1}^{n}(r_t - \bar{r})(r_{t-k} - \bar{r})}{\sum_{t=1}^n (r_t - \bar{r})^2}$$

let autocorrelation returns max_lag =
  let n = Array.length returns in
  let n_f = float_of_int n in
  let mean = Array.fold_left (+.) 0.0 returns /. n_f in
  let centered = Array.map (fun r -> r -. mean) returns in
  let variance = Array.fold_left (fun a x -> a +. x *. x) 0.0 centered /. n_f in
  Array.init max_lag (fun lag ->
    let k = lag + 1 in
    let cov = ref 0.0 in
    for t = k to n - 1 do
      cov := !cov +. centered.(t) *. centered.(t - k)
    done;
    !cov /. (float_of_int (n - k)) /. variance
  )

(** Ljung-Box test for autocorrelation *)
let ljung_box returns ~lags =
  let n = float_of_int (Array.length returns) in
  let acf = autocorrelation returns lags in
  let q = Array.foldi (fun k acc rho ->
    acc +. rho *. rho /. (n -. float_of_int (k + 1))
  ) 0.0 acf in
  n *. (n +. 2.0) *. q   (* LB statistic ~ chi^2(lags) under H_0 *)

4.7.2 Stationarity and the ADF Test

A stationary time series has constant mean and variance. Many financial time series (prices) are non-stationary; log returns typically are.

The Augmented Dickey-Fuller (ADF) test for unit root:

$$\Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \sum_{i=1}^p \delta_i \Delta y_{t-i} + \varepsilon_t$$

$H_0: \gamma = 0$ (unit root, non-stationary).

let log_returns prices =
  let n = Array.length prices in
  Array.init (n - 1) (fun i ->
    log (prices.(i + 1) /. prices.(i))
  )

(** Simple ADF test (zero lag) — for illustration *)
let adf_test series =
  let n = Array.length series in
  let dy = Array.init (n - 1) (fun i -> series.(i + 1) -. series.(i)) in
  let y_lag = Array.init (n - 1) (fun i -> series.(i)) in
  (* Regress Δy on y_{t-1} *)
  let x = Mat.of_array y_lag (n - 1) 1 in
  let y = Mat.of_array dy (n - 1) 1 in
  let (beta, residuals, _) = ols ~x ~y in
  let gamma = Mat.get beta 0 0 in
  let ss_res = Mat.sum' (Mat.sqr residuals) in
  let se_gamma = sqrt (ss_res /. float_of_int (n - 2)
                       /. Mat.sum' (Mat.sqr x)) in
  let t_stat = gamma /. se_gamma in
  (* No-intercept specification: 5% critical value ≈ -1.95
     (the familiar -2.86 applies when a constant term is included) *)
  (t_stat, t_stat < -. 1.95)

4.8 Chapter Summary

Probability and statistics underpin every quantitative model in this book. The normal distribution is the default assumption in closed-form derivatives pricing not because it is empirically accurate but because it is mathematically tractable. The lognormal distribution (the distribution of $e^X$ where $X$ is normal) is used for asset prices because it is strictly positive and produces the Black-Scholes formula. The Student-t distribution with low degrees of freedom provides a practical fat-tailed alternative to the normal for risk calculations and empirical return modelling.

Moment estimation in the presence of finite samples requires attention to numerical stability. Welford's online algorithm computes mean and variance in a single pass through the data without accumulating large intermediate sums — essential for streaming data where recalculating from scratch is expensive. Higher moments (skewness, kurtosis) are estimated with the corresponding sample formulas; under normality their approximate standard errors are $\sqrt{6/n}$ and $\sqrt{24/n}$, which determine when an estimate differs significantly from zero.

Monte Carlo simulation is the bridge between probability models and numerical computation. The fundamental theorem — the sample mean of $f(X_1), \ldots, f(X_n)$ converges to $E[f(X)]$ at rate $\sigma/\sqrt{n}$ — is the basis for every Monte Carlo pricer, risk calculation, and backtest in this book. Variance reduction techniques (antithetic variates, control variates, importance sampling) accelerate convergence by reducing $\sigma$ rather than increasing $n$. Quasi-Monte Carlo methods replace pseudo-random samples with low-discrepancy sequences like Sobol, achieving nearly $O(1/n)$ convergence (up to logarithmic factors) instead of $O(1/\sqrt{n})$ for smooth integrands.

Factor models and regression connect the statistical framework to the financial application. OLS regression with time series $R_t = \alpha + \beta R_{M,t} + \varepsilon_t$ estimates the CAPM beta, and the residual diagnostics (Durbin-Watson for autocorrelation, Jarque-Bera for normality, ADF for stationarity) validate the model assumptions. The autocorrelation of squared returns — near zero for returns themselves but significantly positive for $|r_t|^2$ — is the statistical fingerprint of GARCH-type volatility clustering.


Exercises

4.1 Using Monte Carlo simulation with 1,000,000 paths, estimate the 99th percentile of a lognormal distribution with $\mu = 0.08$, $\sigma = 0.20$, $T = 1$. Compare to the analytical result $F^{-1}(0.99)$.

4.2 Implement the Polar method (Box-Muller variant) and the Ziggurat algorithm for normal sampling. Benchmark them against Box-Muller for N = 10,000,000 samples.

4.3 Download historical daily prices for two correlated assets (e.g., SPY and QQQ). Compute the sample correlation, run the Ljung-Box test on log returns, and test whether returns are normally distributed using Jarque-Bera.

4.4 Implement a multi-factor OLS regression that regresses stock returns on: market excess return, SMB (small minus big), HML (high minus low). Interpret the regression coefficients.


Next: Chapter 5 — Time Value of Money