Chapter 21 — Portfolio Risk and Optimization
"Mean-variance optimization turns the chaos of markets into a calculus problem — and that's both its strength and its fatal flaw."
Learning objectives: After completing this chapter you will be able to:
- Formulate the Markowitz mean-variance optimization problem and explain why it is called an "error maximiser"
- Implement the minimum-variance portfolio via constrained gradient descent and efficient frontier computation
- Derive and implement risk parity weights and explain the Bridgewater All-Weather intuition
- Apply the Black-Litterman model: perform reverse optimization to get equilibrium returns and update with investor views via Bayes' formula
- Implement portfolio rebalancing triggers (calendar and threshold) and compute transaction costs
In 1952, a 25-year-old PhD student at the University of Chicago named Harry Markowitz submitted a 14-page paper to the Journal of Finance titled "Portfolio Selection." He had a simple insight: an investor should not only care about the expected return of their portfolio but also about its risk, specifically its variance. More importantly, by combining assets whose returns are not perfectly correlated, it is possible to reduce portfolio variance without sacrificing expected return. This was diversification expressed mathematically for the first time.
Markowitz won the Nobel Prize in Economics in 1990, but the road from theory to practice was bumpy. The framework requires estimating expected returns and covariances for every asset in the portfolio — and these inputs are notoriously noisy. When fed historically estimated returns, the optimizer tends to produce extreme, unstable portfolios: it concentrates in the assets that happened to perform best in the historical window (likely due to luck) and takes large short positions in others. The resulting portfolios are "error-maximising" — they find the optimal portfolio only if the inputs are exactly right, and amplify any errors in the inputs into large position mistakes.
This chapter covers the full arc from Markowitz's mean-variance framework to modern robust alternatives. We implement the efficient frontier, minimum-variance portfolio, and maximum Sharpe ratio portfolio; introduce risk parity as an alternative that doesn't require expected return estimates; cover the Black-Litterman model as a Bayesian framework for blending equilibrium with views; and implement practical rebalancing rules.
21.1 Mean-Variance Optimization
Markowitz's central contribution was to observe that the feasible set of portfolios — the set of all combinations of expected return and variance achievable by varying weights across $n$ assets — forms a curved region in the return-variance plane. The efficient frontier is the upper boundary of this region: for each level of risk, it identifies the portfolio with the highest expected return. No rational investor should hold a portfolio strictly below the efficient frontier, because there exists a portfolio with the same risk but higher return.
Formally, the efficient frontier solves:
$$\min_{\mathbf{w}} \frac{1}{2}\mathbf{w}^T \Sigma \mathbf{w} \quad \text{s.t.} \quad \mathbf{w}^T \boldsymbol{\mu} = \mu_{\text{target}}, \quad \mathbf{w}^T \mathbf{1} = 1$$
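Forming the Lagrangian with multipliers $\lambda$ and $\gamma$ for the two equality constraints and setting the gradient to zero gives the classical closed form:
$$\mathcal{L} = \tfrac{1}{2}\mathbf{w}^T \Sigma \mathbf{w} - \lambda\,(\mathbf{w}^T \boldsymbol{\mu} - \mu_{\text{target}}) - \gamma\,(\mathbf{w}^T \mathbf{1} - 1) \;\Rightarrow\; \Sigma \mathbf{w}^* = \lambda \boldsymbol{\mu} + \gamma \mathbf{1} \;\Rightarrow\; \mathbf{w}^* = \Sigma^{-1}(\lambda \boldsymbol{\mu} + \gamma \mathbf{1})$$
with $\lambda$ and $\gamma$ pinned down by substituting $\mathbf{w}^*$ back into the two constraints. Once a long-only constraint $w_i \ge 0$ is added, the closed form disappears and numerical methods are needed; this is what the projected gradient implementation below provides.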
module Markowitz = struct
  type portfolio = {
    weights : float array;
    exp_return : float;
    variance : float;
    sharpe : float;
  }
  let portfolio_stats ~weights ~returns_vec ~cov_matrix ~risk_free =
    let n = Array.length weights in
    (* Expected return: w'μ *)
    let er = ref 0.0 in
    for i = 0 to n - 1 do er := !er +. weights.(i) *. returns_vec.(i) done;
    (* Variance: w'Σw *)
    let var = ref 0.0 in
    for i = 0 to n - 1 do
      for j = 0 to n - 1 do
        var := !var +. weights.(i) *. cov_matrix.(i).(j) *. weights.(j)
      done
    done;
    let vol = sqrt !var in
    { weights; exp_return = !er; variance = !var;
      sharpe = (!er -. risk_free) /. vol }
  (** Euclidean projection onto the probability simplex:
      argmin {||p - w||² : p ≥ 0, Σ p = 1} *)
  let project_simplex w =
    let n = Array.length w in
    let ws = Array.copy w in
    Array.sort (fun a b -> compare b a) ws; (* descending sort *)
    let cssv = ref 0.0 in
    let rho = ref 0 in
    for i = 0 to n - 1 do
      cssv := !cssv +. ws.(i);
      if ws.(i) -. (!cssv -. 1.0) /. float_of_int (i + 1) > 0.0 then rho := i
    done;
    let cssv_rho = ref 0.0 in
    for i = 0 to !rho do cssv_rho := !cssv_rho +. ws.(i) done;
    let theta = (!cssv_rho -. 1.0) /. float_of_int (!rho + 1) in
    Array.map (fun wi -> Float.max 0.0 (wi -. theta)) w

  (** Minimum-variance portfolio via projected gradient descent, a simplified
      stand-in for a proper quadratic-programming solver. Long-only, fully
      invested (weights kept on the probability simplex). *)
  let min_variance_weights ?(tol = 1e-8) ?(max_iter = 10_000) ~cov_matrix () =
    let n = Array.length cov_matrix in
    let w = Array.make n (1.0 /. float_of_int n) in
    let step = 0.001 in
    (try
       for _ = 1 to max_iter do
         (* Gradient of variance w.r.t. w: 2 Σ w *)
         let grad = Array.init n (fun i ->
           let acc = ref 0.0 in
           for j = 0 to n - 1 do acc := !acc +. cov_matrix.(i).(j) *. w.(j) done;
           2.0 *. !acc)
         in
         (* Take a gradient step, then project back onto the simplex *)
         let w_new = Array.mapi (fun i wi -> wi -. step *. grad.(i)) w in
         let w_proj = project_simplex w_new in
         let diff = ref 0.0 in
         for i = 0 to n - 1 do
           let d = w.(i) -. w_proj.(i) in
           diff := !diff +. d *. d
         done;
         Array.blit w_proj 0 w 0 n;
         if sqrt !diff < tol then raise Exit (* converged: stop early *)
       done
     with Exit -> ());
    w
  (** Efficient frontier: sweep target returns from min to max asset return.
      Each point minimizes variance plus a quadratic penalty on missing the
      return target, a soft-constraint stand-in for the exact QP. *)
  let efficient_frontier ?(penalty = 100.0) ~returns_vec ~cov_matrix ~n_points ~risk_free =
    let n = Array.length returns_vec in
    let r_min = Array.fold_left Float.min Float.max_float returns_vec in
    let r_max = Array.fold_left Float.max (-. Float.max_float) returns_vec in
    List.init n_points (fun k ->
      let target =
        r_min +. (r_max -. r_min) *. float_of_int k /. float_of_int (n_points - 1) in
      let w = ref (Array.make n (1.0 /. float_of_int n)) in
      for _ = 1 to 10_000 do
        let pr = ref 0.0 in
        Array.iteri (fun i wi -> pr := !pr +. wi *. returns_vec.(i)) !w;
        (* Gradient of w'Σw + penalty (w'μ - target)² *)
        let grad = Array.init n (fun i ->
          let sw = ref 0.0 in
          Array.iteri (fun j wj -> sw := !sw +. cov_matrix.(i).(j) *. wj) !w;
          2.0 *. (!sw +. penalty *. (!pr -. target) *. returns_vec.(i))) in
        w := project_simplex (Array.mapi (fun i wi -> wi -. 0.001 *. grad.(i)) !w)
      done;
      portfolio_stats ~weights:!w ~returns_vec ~cov_matrix ~risk_free)
  (** Maximum Sharpe ratio (tangency) portfolio — unconstrained version *)
  let max_sharpe ~returns_vec ~cov_matrix ~risk_free =
    let n = Array.length returns_vec in
    let excess = Array.map (fun r -> r -. risk_free) returns_vec in
    (* z = Σ^{-1} (μ - r_f 1), w = z / sum(z) *)
    let cov_owl = Owl.Mat.of_arrays cov_matrix in
    let ex_owl = Owl.Mat.of_array excess n 1 in
    let z_owl = Owl.Linalg.D.linsolve cov_owl ex_owl in
    let z = Owl.Mat.to_array z_owl in
    let sum_z = Array.fold_left (+.) 0.0 z in
    if Float.abs sum_z < 1e-10 then Array.make n (1.0 /. float_of_int n)
    else Array.map (fun zi -> zi /. sum_z) z
end
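As a usage sketch, the following wires the pieces together for a hypothetical three-asset universe; all numbers are illustrative annualized figures, not calibrated estimates:
let () =
  let returns_vec = [| 0.08; 0.04; 0.06 |] in
  let cov_matrix = [| [| 0.0400; 0.0060; 0.0100 |];
                     [| 0.0060; 0.0100; 0.0040 |];
                     [| 0.0100; 0.0040; 0.0225 |] |] in
  let w_mv = Markowitz.min_variance_weights ~cov_matrix () in
  let w_ms = Markowitz.max_sharpe ~returns_vec ~cov_matrix ~risk_free:0.02 in
  let stats =
    Markowitz.portfolio_stats ~weights:w_ms ~returns_vec ~cov_matrix ~risk_free:0.02 in
  Printf.printf "min-var:    %s\nmax-sharpe: %s (SR = %.2f)\n"
    (String.concat "; " (Array.to_list (Array.map (Printf.sprintf "%.3f") w_mv)))
    (String.concat "; " (Array.to_list (Array.map (Printf.sprintf "%.3f") w_ms)))
    stats.Markowitz.sharpe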
Figure 21.1 — The Markowitz efficient frontier for a 4-asset universe. Each dot represents a random portfolio, coloured by its Sharpe ratio ($R_f = 0$). The upper boundary forms the efficient frontier.
21.2 Risk Parity
Risk parity equalises each asset's contribution to portfolio variance. Define the fractional risk contribution of asset $i$ as $RC_i = w_i (\Sigma \mathbf{w})_i / \sigma_P^2$; since $\mathbf{w}^T \Sigma \mathbf{w} = \sum_i w_i (\Sigma \mathbf{w})_i$, the contributions sum to one. Risk parity requires:
$$w_i \cdot (\Sigma w)_i = \frac{\sigma_P^2}{n} \quad \forall i$$
This is a nonlinear system with no general closed-form solution; iterative methods such as gradient descent work well in practice.
Figure 21.2 — Capital allocation (left) compared to risk contribution (right) for a standard 60/40 equity/bond portfolio. Although equities represent 60% of the capital weight, their higher volatility and correlation mean they contribute over 70% of the total portfolio risk.
module Risk_parity = struct
  (** Fractional risk contributions RC_i = w_i (Σw)_i / w'Σw; they sum to 1 *)
  let risk_contributions ~weights ~cov_matrix =
    let n = Array.length weights in
    let sw = Array.init n (fun i ->
      let acc = ref 0.0 in
      for j = 0 to n - 1 do acc := !acc +. cov_matrix.(i).(j) *. weights.(j) done;
      !acc)
    in
    let port_var = ref 0.0 in
    for i = 0 to n - 1 do port_var := !port_var +. weights.(i) *. sw.(i) done;
    Array.mapi (fun i sw_i -> weights.(i) *. sw_i /. !port_var) sw

  (** Risk parity weights via gradient descent on the sum of squared deviations
      of risk contributions from 1/n. The gradient below is an approximation
      that ignores the cross-terms ∂RC_j/∂w_i; crude but adequate for small
      portfolios. *)
  let weights ?(tol = 1e-8) ?(max_iter = 10_000) ~cov_matrix () =
    let n = Array.length cov_matrix in
    let target = 1.0 /. float_of_int n in
    let w = Array.make n target in
    let lr = 0.001 in
    (try
       for _ = 1 to max_iter do
         let rc = risk_contributions ~weights:w ~cov_matrix in
         let grad = Array.mapi (fun i rci -> 2.0 *. (rci -. target) /. w.(i)) rc in
         let w_new = Array.mapi (fun i wi -> Float.max 0.001 (wi -. lr *. grad.(i))) w in
         let s = Array.fold_left (+.) 0.0 w_new in
         let diff = ref 0.0 in
         Array.iteri (fun i wn ->
           let d = w.(i) -. wn /. s in
           diff := !diff +. d *. d;
           w.(i) <- wn /. s) w_new;
         if sqrt !diff < tol then raise Exit (* converged *)
       done
     with Exit -> ());
    w
end
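A quick sanity check: for two uncorrelated assets, risk parity reduces to inverse-volatility weights, so the solver's output can be verified by hand (numbers are illustrative):
let () =
  (* Two uncorrelated assets with vols 20% and 10%: risk parity weights are
     proportional to 1/0.20 and 1/0.10, i.e. close to 1/3 and 2/3 *)
  let cov_matrix = [| [| 0.04; 0.0 |]; [| 0.0; 0.01 |] |] in
  let w = Risk_parity.weights ~cov_matrix () in
  let rc = Risk_parity.risk_contributions ~weights:w ~cov_matrix in
  Array.iteri (fun i wi -> Printf.printf "w%d = %.4f, RC%d = %.4f\n" i wi i rc.(i)) w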
21.3 Black-Litterman Model
The Black-Litterman model (Black and Litterman, 1992) addresses the most serious practical problem with Markowitz: the optimizer is an "error maximiser" that amplifies estimation errors in expected returns into extreme, unstable portfolio weights. The key insight is to start from a prior that is guaranteed to be defensible — the market equilibrium — and update it with specific, quantified investor views.
Step 1: Reverse Optimization to Get Equilibrium Returns
Starting from market-cap weights $\mathbf{w}_{\text{mkt}}$ and the assumption that these weights are mean-variance optimal for the "average investor," we can back out the implicit expected returns:
$$\boldsymbol{\Pi} = \delta \Sigma \mathbf{w}_{\text{mkt}}$$
This is the reverse optimization step. The parameter $\delta$ is the aggregate risk aversion coefficient; in practice $\delta \approx 2.5$ to 3.5 gives plausible equity risk premia. By construction, $\boldsymbol{\Pi}$ produces exactly $\mathbf{w}_{\text{mkt}}$ when plugged back into the unconstrained mean-variance optimization.
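A worked two-asset illustration with made-up inputs: take $\delta = 2.5$, market weights $\mathbf{w}_{\text{mkt}} = (0.6, 0.4)^T$ for equities and bonds, and
$$\Sigma = \begin{pmatrix} 0.0400 & 0.0060 \\ 0.0060 & 0.0100 \end{pmatrix}
\quad\Rightarrow\quad
\boldsymbol{\Pi} = 2.5\,\Sigma\,\mathbf{w}_{\text{mkt}} = 2.5 \begin{pmatrix} 0.0264 \\ 0.0076 \end{pmatrix} = \begin{pmatrix} 6.6\% \\ 1.9\% \end{pmatrix}$$
a 6.6% equity premium and a 1.9% bond premium, both in the historically plausible range.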
Step 2: Specify and Quantify Views
An investor's view is expressed as a linear combination of assets. For example, "I believe US equities will outperform European equities by 2% per year" is expressed as $P_{1\cdot}\boldsymbol{\mu} = q_1 = 0.02$ where $P_{1\cdot} = [1, -1, 0, \ldots]$ (long US, short Europe). The uncertainty in this view is captured by $\Omega_{11} = \sigma_{\text{view}}^2$.
Multiple views are stacked as $P\boldsymbol{\mu} = \mathbf{q} + \boldsymbol{\epsilon}$, $\boldsymbol{\epsilon} \sim N(\mathbf{0}, \Omega)$.
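In terms of the arrays consumed by the implementation below, the single-view example above looks like this (the 1.5% view standard deviation is an arbitrary illustration):
(* Assets ordered [US equity; EU equity; bonds; cash].
   View: US equities outperform European equities by 2%/yr,
   with view standard deviation 1.5% (illustrative). *)
let view_matrix  = [| [| 1.0; -1.0; 0.0; 0.0 |] |]  (* P, k × n with k = 1 *)
let view_returns = [| 0.02 |]                       (* q *)
let view_var     = [| 0.015 *. 0.015 |]             (* diagonal of Ω *)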
Step 3: Bayesian Update
The BL posterior mean is the precision-weighted average of the prior (equilibrium) and the views:
$$\boldsymbol{\mu}_{\text{BL}} = \left[(\tau\Sigma)^{-1} + P^T\Omega^{-1}P\right]^{-1} \left[(\tau\Sigma)^{-1}\boldsymbol{\Pi} + P^T\Omega^{-1}\mathbf{q}\right]$$
The parameter $\tau$ (typically 0.025–0.1) scales the uncertainty about the equilibrium. When $\tau \to 0$, we completely trust the prior and return to the market portfolio. When $\tau \to \infty$, we completely trust our views. In practice, $\tau$ is calibrated so that the estimation uncertainty of the equilibrium prior matches the sample estimation error.
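The one-asset, one-view case makes the precision weighting explicit:
$$\mu_{\text{BL}} = \frac{\pi/(\tau\sigma^2) + q/\omega^2}{1/(\tau\sigma^2) + 1/\omega^2}$$
This is the inverse-variance-weighted average from standard conjugate normal updating; the matrix formula above is its multivariate generalization, and the diagonal implementation below essentially applies this scalar update asset by asset.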
The beauty of BL is that portfolios built from $\boldsymbol{\mu}_{\text{BL}}$ are intuitively sensible: they mostly look like the market portfolio, tilted moderately toward the assets that views favour, with the magnitude of the tilt proportional to view confidence.
module Black_litterman = struct
  (** Step 1: Equilibrium returns via reverse optimization: Π = δ Σ w_mkt.
      delta: risk aversion (~2.5); market_weights: market-cap weights. *)
  let equilibrium_returns ~delta ~cov_matrix ~market_weights =
    let n = Array.length market_weights in
    Array.init n (fun i ->
      let acc = ref 0.0 in
      for j = 0 to n - 1 do
        acc := !acc +. cov_matrix.(i).(j) *. market_weights.(j)
      done;
      delta *. !acc)

  (** Steps 2 and 3: specify views, then compute the posterior mean.
      view_matrix P: (k × n), view_returns q: (k),
      view_var: the diagonal of Omega (k).
      Uses a diagonal approximation of (τΣ)^{-1}; multi-view production code
      should use a full linear algebra solve. *)
  let posterior_mean ~tau ~cov_matrix ~equil ~view_matrix ~view_returns ~view_var =
    let n = Array.length equil in
    let k = Array.length view_returns in
    (* Prior precision times equilibrium: (τΣ)^{-1} Π, diagonal approx τ cov_{ii} *)
    let prior_prec_pi =
      Array.init n (fun i -> equil.(i) /. (tau *. cov_matrix.(i).(i))) in
    (* View precision times views: P' Ω^{-1} q *)
    let p_omega_inv_q =
      Array.init n (fun i ->
        let acc = ref 0.0 in
        for v = 0 to k - 1 do
          acc := !acc +. view_matrix.(v).(i) *. view_returns.(v) /. view_var.(v)
        done;
        !acc) in
    (* Diagonal of the view precision: P' Ω^{-1} P *)
    let p_omega_inv_p_diag =
      Array.init n (fun i ->
        let acc = ref 0.0 in
        for v = 0 to k - 1 do
          acc := !acc +. view_matrix.(v).(i) *. view_matrix.(v).(i) /. view_var.(v)
        done;
        !acc) in
    (* Posterior: (prior_prec + view_prec)^{-1} (prior_prec·Π + P'Ω^{-1}q).
       Note that prior_prec_pi.(i) already equals prior_prec · Π_i. *)
    Array.init n (fun i ->
      let prior_prec = 1.0 /. (tau *. cov_matrix.(i).(i)) in
      let total_prec = prior_prec +. p_omega_inv_p_diag.(i) in
      (prior_prec_pi.(i) +. p_omega_inv_q.(i)) /. total_prec)

  (** Compute BL weights from the posterior mean (unconstrained tangency).
      Diagonal approximation of w = (δΣ)^{-1} μ_BL, normalized to sum to 1;
      production code should solve the full system δΣ w = μ_BL. *)
  let bl_weights ~delta ~cov_matrix ~bl_returns =
    let n = Array.length bl_returns in
    let raw = Array.init n (fun i ->
      bl_returns.(i) /. (delta *. cov_matrix.(i).(i))) in
    let s = Array.fold_left (+.) 0.0 raw in
    if Float.abs s < 1e-12 then Array.make n (1.0 /. float_of_int n)
    else Array.map (fun w -> w /. s) raw
end
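Putting the three steps together for the two-asset example from the reverse-optimization illustration above (a sketch; the view and all numbers are illustrative):
let () =
  let cov_matrix = [| [| 0.0400; 0.0060 |]; [| 0.0060; 0.0100 |] |] in
  let market_weights = [| 0.6; 0.4 |] in
  let equil = Black_litterman.equilibrium_returns
      ~delta:2.5 ~cov_matrix ~market_weights in
  (* View: equities outperform bonds by 3%, with stdev 5% *)
  let bl_returns = Black_litterman.posterior_mean
      ~tau:0.05 ~cov_matrix ~equil
      ~view_matrix:[| [| 1.0; -1.0 |] |]
      ~view_returns:[| 0.03 |]
      ~view_var:[| 0.05 *. 0.05 |] in
  let w = Black_litterman.bl_weights ~delta:2.5 ~cov_matrix ~bl_returns in
  Array.iteri (fun i wi -> Printf.printf "w%d = %.3f\n" i wi) w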
21.4 Portfolio Rebalancing
module Rebalancing = struct
  type rebalance_trigger =
    | Calendar of int       (* every N days *)
    | Threshold of float    (* when max drift exceeds threshold *)
    | Both of int * float   (* rebalance when either condition fires *)

  (** Maximum absolute deviation of current weights from target *)
  let compute_drift ~current_weights ~target_weights =
    Array.map2 (fun c t -> Float.abs (c -. t)) current_weights target_weights
    |> Array.fold_left Float.max 0.0

  let should_rebalance ~trigger ~days_since_last ~current_weights ~target_weights =
    match trigger with
    | Calendar n -> days_since_last >= n
    | Threshold t -> compute_drift ~current_weights ~target_weights >= t
    | Both (n, t) ->
      days_since_last >= n || compute_drift ~current_weights ~target_weights >= t

  (** Net trades needed to rebalance (as fractions of portfolio value),
      plus the total proportional transaction cost *)
  let rebalance_trades ~current_weights ~target_weights ~transaction_cost =
    let trades = Array.map2 (-.) target_weights current_weights in
    let cost =
      Array.fold_left (fun a t -> a +. Float.abs t *. transaction_cost) 0.0 trades in
    trades, cost
end
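A minimal usage sketch, assuming a four-asset equal-weight target and a flat 10 bp proportional cost (both illustrative):
let () =
  let target_weights  = [| 0.25; 0.25; 0.25; 0.25 |] in
  let current_weights = [| 0.31; 0.24; 0.22; 0.23 |] in  (* drifted after a rally *)
  let trigger = Rebalancing.Both (90, 0.05) in           (* quarterly OR 5% drift *)
  if Rebalancing.should_rebalance ~trigger ~days_since_last:30
       ~current_weights ~target_weights
  then begin
    (* drift = 0.06 >= 0.05, so the threshold fires even though only 30 days passed *)
    let trades, cost = Rebalancing.rebalance_trades
        ~current_weights ~target_weights ~transaction_cost:0.001 in
    Printf.printf "cost = %.4f%% of NAV\n" (cost *. 100.0);
    Array.iter (fun t -> Printf.printf "trade %+.3f\n" t) trades
  end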
21.6 Functor-Based Optimizer with Plug-In Risk Models
The portfolio optimizer developed in this chapter solves mean-variance problems given expected returns and a covariance matrix. In practice, the covariance estimator matters as much as the optimization itself: sample covariance works poorly with few observations; Ledoit-Wolf shrinkage stabilises small samples; factor models provide interpretable structure. A production system must support multiple covariance estimators without duplicating the optimization code.
OCaml's module system addresses this with a functor: write the optimizer parametrically over any module satisfying a RISK_MODEL signature:
(** Interface: any covariance estimator must provide [estimate] *)
module type RISK_MODEL = sig
  (** Fill [cov] in place with the (n_assets × n_assets) covariance matrix
      estimated from the (n_obs × n_assets) [returns] matrix *)
  val estimate : returns:float array array -> cov:float array array -> unit
  val name : string
end
(** Sample (historical) covariance: maximum likelihood, but noisy *)
module Sample_covariance : RISK_MODEL = struct
  let name = "sample"
  let estimate ~returns ~cov =
    let n = Array.length returns in        (* number of observations *)
    let m = Array.length returns.(0) in    (* number of assets *)
    let mean = Array.init m (fun j ->
      Array.fold_left (fun s r -> s +. r.(j)) 0.0 returns /. float_of_int n) in
    for i = 0 to m - 1 do
      for j = 0 to m - 1 do
        cov.(i).(j) <-
          Array.fold_left (fun s r ->
            s +. (r.(i) -. mean.(i)) *. (r.(j) -. mean.(j)))
            0.0 returns
          /. float_of_int (n - 1)
      done
    done
end
(** Ledoit-Wolf shrinkage: blend the sample covariance with a scaled identity *)
module Ledoit_wolf : RISK_MODEL = struct
  let name = "ledoit_wolf"
  let estimate ~returns ~cov =
    Sample_covariance.estimate ~returns ~cov;
    let m = Array.length cov in
    (* Shrinkage target: identity scaled by the average variance
       (the mean of the diagonal of the sample covariance) *)
    let mu = ref 0.0 in
    for i = 0 to m - 1 do mu := !mu +. cov.(i).(i) done;
    let mu = !mu /. float_of_int m in
    let alpha = 0.1 in (* shrinkage intensity; in practice, estimated from the data *)
    for i = 0 to m - 1 do
      for j = 0 to m - 1 do
        let target = if i = j then mu else 0.0 in
        cov.(i).(j) <- (1.0 -. alpha) *. cov.(i).(j) +. alpha *. target
      done
    done
end
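A quick comparison of the two estimators on the same (tiny, made-up) return history shows the shrinkage at work:
let () =
  (* 4 observations × 2 assets — deliberately short to make shrinkage matter *)
  let returns = [| [| 0.010; 0.002 |]; [| -0.020; 0.001 |];
                   [| 0.015; -0.004 |]; [| -0.005; 0.003 |] |] in
  let m = Array.length returns.(0) in
  let cov_s = Array.make_matrix m m 0.0 in
  let cov_lw = Array.make_matrix m m 0.0 in
  Sample_covariance.estimate ~returns ~cov:cov_s;
  Ledoit_wolf.estimate ~returns ~cov:cov_lw;
  (* off-diagonals shrink toward zero: cov_lw.(0).(1) = 0.9 *. cov_s.(0).(1) *)
  Printf.printf "sample cov(0,1) = %.6f, shrunk cov(0,1) = %.6f\n"
    cov_s.(0).(1) cov_lw.(0).(1)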
(** Functor: build a complete optimizer for any risk model.
    [Qp] is a quadratic-programming solver module assumed to be defined
    elsewhere in the codebase; it is not shown in this chapter. *)
module Make_optimizer (R : RISK_MODEL) = struct
  let name = Printf.sprintf "mean_variance_%s" R.name

  (** Return the minimum-variance portfolio weights *)
  let min_variance_weights ~returns =
    let n = Array.length returns.(0) in
    let cov = Array.make_matrix n n 0.0 in
    R.estimate ~returns ~cov;
    Qp.min_variance_qp ~cov
      ~constraints:[
        Qp.sum_to_one;   (* weights sum to 1 *)
        Qp.long_only;    (* no short selling *)
      ]

  (** Maximum Sharpe ratio weights given expected returns *)
  let max_sharpe_weights ~returns ~expected_returns ~risk_free =
    let n = Array.length returns.(0) in
    let cov = Array.make_matrix n n 0.0 in
    R.estimate ~returns ~cov;
    Qp.max_sharpe_qp ~cov ~mu:expected_returns ~rf:risk_free
      ~constraints:[
        Qp.sum_to_one;
        Qp.long_only;
      ]
end
(** Two optimizer variants: same code, different risk estimators *)
module Sample_optimizer = Make_optimizer (Sample_covariance)
module Shrinkage_optimizer = Make_optimizer (Ledoit_wolf)

(** First-class module wrapper for runtime selection *)
type optimizer = {
  name : string;
  min_variance : returns:float array array -> float array;
  max_sharpe : returns:float array array -> expected_returns:float array ->
    risk_free:float -> float array;
}

(* A package type must name a module type, so the signature gets a name *)
module type OPTIMIZER = sig
  val name : string
  val min_variance_weights : returns:float array array -> float array
  val max_sharpe_weights : returns:float array array ->
    expected_returns:float array -> risk_free:float -> float array
end

let make_optimizer (module O : OPTIMIZER) = {
  name = O.name;
  min_variance = O.min_variance_weights;
  max_sharpe = O.max_sharpe_weights;
}
let sample_opt = make_optimizer (module Sample_optimizer)
let shrink_opt = make_optimizer (module Shrinkage_optimizer)
(** Configuration-driven optimizer selection *)
let get_optimizer = function
  | "sample" -> Ok sample_opt
  | "shrinkage" -> Ok shrink_opt
  | name -> Error (Printf.sprintf "Unknown optimizer: %s" name)

(** Result record for [run_allocation] *)
type allocation = {
  min_var_weights : float array;
  max_sr_weights : float array;
}

(** Usage: select optimizer from config, run optimization.
    [config] is assumed to carry an [optimizer_name : string] field. *)
let run_allocation ~config ~returns ~mu ~rf =
  match get_optimizer config.optimizer_name with
  | Error e -> Error e
  | Ok opt ->
    let min_var_wts = opt.min_variance ~returns in
    let max_sr_wts = opt.max_sharpe ~returns ~expected_returns:mu ~risk_free:rf in
    Ok { min_var_weights = min_var_wts; max_sr_weights = max_sr_wts }
The optimizer is written once against R : RISK_MODEL and is immediately available for any compatible estimator. Adding a new risk model (e.g., a DCC-GARCH dynamic covariance model or a Barra-style factor model) requires only implementing the RISK_MODEL interface and registering it — the entire optimization machinery is inherited with zero modification. This is the same extensibility pattern as the yield curve registry in Chapter 7 (§7.8) and the model registry in Chapter 2 (§2.12), confirming that OCaml's functor + first-class module pattern scales uniformly across different financial domains.
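To make this concrete, here is a sketch of such an addition: an exponentially weighted covariance in the RiskMetrics style. The module name, the decay value $\lambda = 0.94$, and the zero initialization are illustrative choices, not a production calibration.
(** EWMA covariance: recent observations receive exponentially more weight *)
module Ewma_covariance : RISK_MODEL = struct
  let name = "ewma"
  let lambda = 0.94 (* decay; 0.94 is the classic RiskMetrics daily value *)
  let estimate ~returns ~cov =
    let n = Array.length returns in       (* number of observations *)
    let m = Array.length returns.(0) in   (* number of assets *)
    (* Initialize at zero for simplicity; production code would seed
       with the sample covariance of an initial window *)
    for i = 0 to m - 1 do
      for j = 0 to m - 1 do cov.(i).(j) <- 0.0 done
    done;
    (* Recursion: Σ_t = λ Σ_{t-1} + (1 - λ) r_t r_t' (zero-mean assumption) *)
    for t = 0 to n - 1 do
      for i = 0 to m - 1 do
        for j = 0 to m - 1 do
          cov.(i).(j) <- lambda *. cov.(i).(j)
                         +. (1.0 -. lambda) *. returns.(t).(i) *. returns.(t).(j)
        done
      done
    done
end

(* The entire optimizer is inherited with no further code *)
module Ewma_optimizer = Make_optimizer (Ewma_covariance)
Registering it for runtime selection is one more match arm in get_optimizer.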
21.7 Chapter Summary
Portfolio optimization is the mathematical formalization of diversification. Markowitz's mean-variance framework, despite being over 70 years old, remains the foundation for virtually every systematic asset allocation approach used in practice.
The practical challenge is estimation error. Expected returns are extremely difficult to estimate accurately from historical data: they are swamped by noise over any reasonable estimation window. Covariance matrices are somewhat more stable, but for large portfolios they are high-dimensional objects that require shrinkage or factor-model structure to estimate reliably. The minimum-variance portfolio avoids the expected-return estimation problem entirely and, as a result, tends to be more robust and to perform better out-of-sample than maximum Sharpe ratio portfolios.
Risk parity takes a different approach: instead of solving an expected-return-dependent optimization, it allocates capital so that each asset contributes equally to total portfolio volatility. In practice, this heavily weights bonds relative to equities (because bonds have lower volatility), and risk parity portfolios often use leverage to achieve equity-like returns. The approach was popularized by Bridgewater's All-Weather fund and became widely adopted after its strong performance in the 2001 and 2008 drawdowns.
Black-Litterman addresses the estimation error problem from a Bayesian angle: start from market-cap-implied equilibrium returns (which are by construction consistent with the weights that clear the market), and update them with specific views. The posterior expected returns are a precision-weighted average of the prior and views, producing portfolios that are less extreme and more intuitive than raw mean-variance. The key parameter is the uncertainty in the prior ($\tau\Sigma$): smaller $\tau$ trusts the equilibrium more, larger $\tau$ lets the views dominate.
The functor-based optimizer (§21.6) demonstrates the scalability of OCaml's module system for production quant systems: the optimization algorithm is written once and immediately available for any covariance estimator that satisfies the RISK_MODEL interface, with compile-time type checking for each instantiation.
Exercises
21.1 [Basic] Build the efficient frontier for a 4-asset portfolio (Equity, Bonds, Commodities, REITs) with assumed expected returns and a covariance matrix. Plot mean vs. volatility. Mark the minimum-variance and maximum-Sharpe portfolios.
21.2 [Intermediate] Demonstrate the "error maximiser" property: generate 100 bootstrap samples of the historical return series, compute the max-Sharpe weights for each sample, and display the distribution of allocations. Compare with the stability of minimum-variance weights.
21.3 [Intermediate] Implement risk parity weights for the same 4 assets. Compute risk contributions to verify they are equal. Backtest risk parity, equal-weight, 60/40, and max-Sharpe portfolios over a 5-year simulated period with annual rebalancing. Compare Sharpe ratios and max drawdowns.
21.4 [Intermediate] Apply Black-Litterman: start from market-cap weights $[40\%, 30\%, 20\%, 10\%]$; run reverse optimization to get equilibrium returns; add the view "Equities outperform Bonds by 3% with $\sigma = 5\%$"; compute posterior returns and resulting BL portfolio weights.
21.5 [Advanced] Implement a threshold-based rebalancing simulation: starting from equal weights, simulate a year's price evolution with daily Brownian increments. Rebalance whenever any asset drifts more than 5% from target. Compare total rebalancing cost and average portfolio drift vs. a simple monthly rebalancing strategy.
21.6 [Advanced] Implement a factor risk model (e.g., 3-factor Fama-French) as a third RISK_MODEL module. Register it alongside Sample_covariance and Ledoit_wolf. Backtest all three optimizers on a 20-asset universe over a 10-year simulated period. Which risk model produces the most stable minimum-variance portfolio (lowest out-of-sample volatility)?