Chapter 23 — Execution Algorithms

"The best execution algorithm is the one that moves the market the least."


After this chapter you will be able to:

  • Derive the Almgren-Chriss optimal execution trajectory and interpret the $\kappa$ parameter
  • Explain the tradeoffs between TWAP, VWAP, and Implementation Shortfall benchmarks and when to use each
  • Implement a VWAP schedule from a historical volume profile and an adaptive TWAP
  • Compute the Implementation Shortfall and decompose it into market impact, spread cost, and opportunity cost
  • Understand Smart Order Routing and why fragmented liquidity across venues matters

When a pension fund decides to reallocate \$1 billion from bonds into equities, the investment decision is the easy part. The hard part is execution: how to buy \$1 billion of equities without moving prices substantially against yourself. Placing a single large market order would consume the entire top of the book and walk down multiple price levels, driving up the average execution price by perhaps 2–5%. On a \$1 billion order, that is \$20–50 million in unnecessary cost. Execution algorithms are the systematic answer to this problem.

The Almgren-Chriss (2001) model provided the first rigorous mathematical framework for optimal execution. The insight was to frame the problem as a mean-variance optimization in execution cost: trading faster reduces timing risk (the price may move adversely while you wait) but increases market impact cost (larger individual trades move the market more). The optimal trajectory trades off these two costs, producing a characteristic hyperbolic-sine-shaped execution schedule that front-loads trading slightly relative to TWAP.

In practice, the most widely used algorithms are TWAP (divide execution evenly across time, minimising complexity) and VWAP (track the market's volume profile, minimising impact relative to the market benchmark). Implementation Shortfall algorithms are closer to Almgren-Chriss in spirit and are preferred when timing risk dominates. Smart Order Routing adds the dimension of fragmented liquidity: modern equity markets have dozens of venues (NYSE, NASDAQ, BATS, dark pools), and routing orders optimally across them is itself a real-time optimization problem.


23.1 Almgren-Chriss Optimal Liquidation

The Problem

Suppose you need to liquidate $X$ shares of a stock over a time horizon of $T$ days. You can split the execution into $N$ equal periods of length $\tau = T/N$. Let $x_k$ be the number of shares still held at the start of period $k$ (so $x_0 = X$ and $x_N = 0$), and let $n_k = x_{k-1} - x_k$ be the number of shares sold in period $k$. The execution rate in period $k$ is $v_k = n_k/\tau$.

You face two competing costs:

Market impact cost: each period, selling $n_k$ shares moves the price against you by an amount proportional to the execution rate. Almgren and Chriss decompose impact into permanent and temporary components:

  • Temporary impact: $h(v_k) = \eta v_k$ (cost absorbed this period only; the price recovers afterwards)
  • Permanent impact: $g(n_k) = \gamma n_k$ (permanent adverse price shift; affects all future sales too)

The total expected cost of a trading schedule is: $$E[\text{Cost}] = \sum_{k=1}^{N} n_k \cdot \left(\eta v_k + \sum_{j=1}^{k} \gamma n_j\right)$$

Timing risk: while you wait to sell, the price can drift adversely. The contribution to cost variance from residual position $x_k$ over time $\tau$ is $\sigma^2 x_k^2 \tau$, giving total variance: $$\text{Var}[\text{Cost}] = \sigma^2 \sum_{k=1}^{N} x_k^2 \tau$$

The Optimal Trajectory

The mean-variance optimal problem is: $$\min_{n_1,\ldots,n_N} ;; E[\text{Cost}] + \lambda \cdot \text{Var}[\text{Cost}]$$

where $\lambda$ is the trader's risk aversion parameter (units: 1/dollar, since cost and variance have the same units). By taking the discrete-time solution to the limit as $\tau \to 0$, Almgren and Chriss show that the optimal trajectory satisfies a second-order difference equation whose continuous-time solution is:

$$x^*(t) = X \cdot \frac{\sinh(\kappa(T - t))}{\sinh(\kappa T)}, \qquad \kappa^2 = \frac{\lambda \sigma^2}{\eta}$$

The parameter $\kappa$ (units: $1/\text{time}$) measures the urgency of liquidation. When $\lambda \to \infty$ (extremely risk-averse), $\kappa \to \infty$ and $x^*(t)/X \to 1 - t/T$: uniform (TWAP) selling. When $\lambda \to 0$ (risk-neutral), $\kappa \to 0$ and the trajectory front-loads execution (sell quickly to minimise timing risk). For moderate $\lambda$, the $\sinh$ curve front-loads relative to TWAP: more shares are sold in the early periods, and the pace decreases through time.

Estimating the Parameters

In practice, calibrating the Almgren-Chriss model requires estimates of three parameters:

  • $\sigma$: daily volatility of the stock — readily estimated from historical returns
  • $\eta$: temporary impact coefficient — estimated from historical impact data: plot realised cost against execution rate and fit a linear regression. For liquid large-cap stocks, $\eta$ corresponds to impact of roughly 5–20 basis points per 1% of average daily volume (ADV) traded in one period
  • $\gamma$: permanent impact coefficient — harder to estimate because it requires long-horizon post-trade analysis to measure the price shift that did not reverse. Most practitioners set $\gamma \approx \eta/2$ as a starting approximation

For a trader executing 1% of ADV with $\sigma = 1.5%$/day and $\eta = 10\text{bp/(ADV%)}$, a risk-aversion parameter $\lambda = 10^{-6}$ gives $\kappa \approx 0.1$/day for a $T = 1$ day horizon, producing a trajectory that executes approximately 15% more in the first quarter of the day than a uniform TWAP schedule would.

Almgren and Chriss (2001) solve the problem of liquidating $X$ shares over $T$ periods to minimize total cost:

$$\min_{x_1,...,x_N} E[\text{Cost}] + \lambda \cdot \text{Var}[\text{Cost}]$$

The optimal trajectory balances market impact vs timing risk:

$$x_k^* = X \cdot \frac{\sinh(\kappa(T - t_k))}{\sinh(\kappa T)}$$

where $\kappa^2 = \frac{\lambda \sigma^2}{\eta}$.

module Almgren_chriss = struct

  type params = {
    shares    : float;    (* shares to execute *)
    horizon   : float;    (* execution horizon in days *)
    n_periods : int;
    sigma     : float;    (* daily volatility *)
    eta       : float;    (* temporary impact coefficient *)
    gamma     : float;    (* permanent impact coefficient *)
    lambda    : float;    (* risk aversion parameter *)
  }

  (** Optimal trading schedule under Almgren-Chriss *)
  let optimal_schedule p =
    let tau  = p.horizon /. float_of_int p.n_periods in
    let kappa2 = p.lambda *. p.sigma *. p.sigma /. p.eta in
    let kappa  = sqrt kappa2 in
    let kappaT = kappa *. p.horizon in
    Array.init (p.n_periods + 1) (fun k ->
      let t = float_of_int k *. tau in
      p.shares *. sinh (kappa *. (p.horizon -. t)) /. sinh kappaT
    )

  (** Compute shares to trade in each interval *)
  let trades_from_schedule schedule =
    let n = Array.length schedule - 1 in
    Array.init n (fun k -> schedule.(k) -. schedule.(k + 1))

  (** Expected cost of a trading schedule *)
  let expected_cost p schedule =
    let tau   = p.horizon /. float_of_int p.n_periods in
    let total = ref 0.0 in
    let perm_impact = ref 0.0 in
    let n = p.n_periods in
    for k = 0 to n - 1 do
      let nk = schedule.(k) -. schedule.(k + 1) in   (* shares in period k *)
      let rate = nk /. tau in
      let temp = p.eta *. rate in                     (* temporary impact cost *)
      let perm = p.gamma *. nk in                    (* permanent impact *)
      perm_impact := !perm_impact +. perm;
      total := !total +. nk *. (temp +. !perm_impact)
    done;
    !total

  (** Variance of cost *)
  let variance_cost p schedule =
    let tau  = p.horizon /. float_of_int p.n_periods in
    let var  = ref 0.0 in
    for k = 0 to p.n_periods - 1 do
      let rem = schedule.(k + 1) in  (* remaining after period k *)
      var := !var +. tau *. p.sigma *. p.sigma *. rem *. rem
    done;
    !var

  (** Efficient frontier: sweep lambda to get cost-risk tradeoff *)
  let efficient_frontier_pts ~p ~lambdas =
    List.map (fun lam ->
      let pp  = { p with lambda = lam } in
      let sch = optimal_schedule pp in
      let ec  = expected_cost pp sch in
      let vc  = variance_cost pp sch in
      (ec, sqrt vc, lam)
    ) lambdas

end

23.2 TWAP and VWAP Algorithms

TWAP (Time-Weighted Average Price) is the simplest possible execution strategy: divide the total quantity into equal pieces and execute one piece per time period. It has no market intelligence — it ignores volume, spread, and price dynamics — but its simplicity makes it auditable and hard to game. Counterparties cannot predict when a TWAP algorithm will trade, so it does not telegraph order flow. TWAP is appropriate when the trader has no view on intraday price patterns and wants the cleanest possible average price over a specified horizon.

VWAP (Volume-Weighted Average Price) is the most widely used benchmark in institutional trading. It weights execution to match the market's own volume profile: trade more during high-volume periods (market open and close, which together account for 30–40% of daily volume in a typical U-shaped intraday pattern), and less during the quiet midday. The goal is to trade with the market rather than against it: by concentrating orders when other participants are also trading, you reduce market impact relative to your order size.

However, VWAP has a critical weakness: it is gameable. If your counterparty knows you are running a VWAP algorithm that plans to trade 40% of your order in the first 30 minutes, they can front-run the open by buying before you, then selling back to you at elevated prices. VWAP also penalises the trader when the stock moves strongly: if you are liquidating a stock that rallies 5% intraday, VWAP forces you to sell more shares at the opening low than at the closing high. Your VWAP-benchmarked P&L will look good, but you sold the stock at the wrong time.

Implementation Shortfall (IS), also called arrival price or Perold's shortfall, measures the cost of execution relative to the price when the decision to trade was made (the "arrival price" $P_0$). It decomposes the total slippage into: (1) market impact from trades already executed, (2) timing risk from price movement while waiting, (3) spread cost, and (4) opportunity cost from the portion left unexecuted. IS directly measures the cost of the trading decision, whereas VWAP measures only how well you executed relative to the day's average price.

IS algorithms are preferred when the order must be completed within a specific time window and when timing risk is material (fast-moving markets, news events). VWAP is preferred when the primary goal is to minimise total trading footprint and there is no time pressure. TWAP is preferred for illiquid securities where volume profiles are unreliable.

module Twap = struct

  (** Simple TWAP: divide total quantity equally over N intervals *)
  let schedule ~total_qty ~n_periods =
    let per_period = total_qty /. float_of_int n_periods in
    Array.make n_periods per_period

  (** Adaptive TWAP: accelerate when price is favourable *)
  let adaptive ~total_qty ~n_periods ~prices ~arrival_price ~side =
    let avg = Array.fold_left (+.) 0.0 prices /. float_of_int (Array.length prices) in
    let base = total_qty /. float_of_int n_periods in
    Array.mapi (fun _ p ->
      let favour = match side with
        | `Buy  -> if p < arrival_price then 1.2 else 0.8
        | `Sell -> if p > arrival_price then 1.2 else 0.8
      in
      base *. favour *. avg /. avg  (* normalised *)
    ) prices

end

module Vwap = struct

  (** VWAP schedule: proportion trade to volume profile *)
  let schedule ~total_qty ~volume_profile =
    let total_vol = Array.fold_left (+.) 0.0 volume_profile in
    Array.map (fun v -> total_qty *. v /. total_vol) volume_profile

  (** Participate: target a fixed fraction of market volume each period *)
  let participate ~participation_rate ~market_volumes =
    Array.map (fun v -> participation_rate *. v) market_volumes

end

23.3 Implementation Shortfall Benchmark

The IS benchmark was introduced by Andre Perold in his landmark 1988 paper The Implementation Shortfall: Paper vs. Reality. The key insight was that portfolio managers had been evaluating execution quality by comparing to the VWAP or to some other average, which obscures the actual cost of the trading decision. If you decide to buy a stock at $100 and the stock rises to $105 during execution, VWAP might say you executed “well” (you beat the day's average), but the IS benchmark reveals you paid $5 per share more than you intended because execution was slow.

IS is computed as: $$\text{IS} = \frac{\text{Actual portfolio} - \text{Paper portfolio}}{\text{Paper portfolio}}$$

where the paper portfolio assumes all shares were bought at the arrival price $P_0$. In practice, IS is decomposed into components to diagnose where slippage is occurring:

$$\text{IS} = \underbrace{(\bar{P}\text{exec} - P_0)}\text{market impact} + \underbrace{\frac{\text{spread}}{2}}\text{spread cost} + \underbrace{(1 - \text{fill rate})(P\text{final} - P_0)}_\text{opportunity cost}$$

A low opportunity cost (you filled most of the order) but high market impact suggests a too-aggressive schedule. High opportunity cost suggests a too-passive schedule that left much of the order unfilled as the price moved away.

module Is_benchmark = struct

  (** Track slippage components for a completed execution *)
  type breakdown = {
    timing_risk   : float;   (* from waiting while price moved *)
    market_impact : float;   (* from our own execution *)
    spread_cost   : float;   (* bid-ask crossing *)
    opportunity   : float;   (* unexecuted portion cost *)
    total         : float;
  }

  let compute ~arrival_price ~executed_qty ~total_qty ~vwap ~spread ~final_price =
    let executed_frac = executed_qty /. total_qty in
    let mi    = vwap -. arrival_price in  (* buy scenario *)
    let spread_c = spread /. 2.0 in
    let opp   = (1.0 -. executed_frac) *. Float.abs (final_price -. arrival_price) in
    { timing_risk   = 0.0;   (* need price path for this *)
      market_impact = mi;
      spread_cost   = spread_c;
      opportunity   = opp;
      total         = mi +. spread_c +. opp }

end

23.4 Smart Order Routing

The execution algorithms discussed so far (TWAP, VWAP, Almgren-Chriss) answer the question of when to trade. Smart Order Routing (SOR) answers the question of where to trade.

Before regulatory changes like Regulation NMS in the US (2005) and MiFID in Europe (2007), liquidity for a given stock was heavily concentrated on its primary exchange (e.g., NYSE or LSE). Today, liquidity is highly fragmented across dozens of competing "lit" exchanges (like BATS, Direct Edge), electronic communication networks (ECNs), and "dark pools" (where orders are hidden before execution).

When an algorithm decides to buy 10,000 shares right now, it cannot simply send one order to the NASDAQ. The NASDAQ might only have 2,000 shares available at the best bid. The SOR must split that 10,000-share parent order into multiple child orders and route them simultaneously to different venues to sweep the available liquidity.

Maker-Taker Pricing and Net Price

A naive SOR would simply route to the venue with the best displayed price. However, modern exchanges charge fees for "taking" liquidity (executing against a resting limit order) and pay rebates for "making" liquidity (providing a resting order). These fees are typically quoted in mils (e.g., $0.0001 per share or in basis points).

If Venue A asks $100.00 and charges 30 mils to take, and Venue B asks $100.01 but pays a 20 mil rebate to take (an "inverted" venue), the net prices are:

  • Venue A net: $100.0030
  • Venue B net: $99.9910

The SOR must rank venues by net price (nominal price plus the take fee or minus the rebate), not just nominal price.

OCaml Implementation

We can model a smart order router in OCaml. A venue record contains the available quantity, the nominal price, and the fee. The router sorts the venues by net price and recursively sweeps the order book, generating a list of child orders.

module Smart_order_routing = struct

  type venue = {
    id             : string;
    available_qty  : float;
    ask            : float;
    take_fee_bps   : float; (* expressed in basis points for simplicity *)
  }

  (** Calculate net price including fees *)
  let net_ask v = 
    v.ask *. (1.0 +. v.take_fee_bps /. 10000.0)

  (** Route a marketable buy order to the cheapest venues first *)
  let route_buy ~total_qty ~venues =
    (* 1. Sort venues by net price (cheapest first) *)
    let sorted_venues = 
      List.sort (fun a b -> Float.compare (net_ask a) (net_ask b)) venues 
    in
    
    (* 2. Sweep the book: allocate quantity until the order is filled *)
    let rec sweep remaining routed_orders = function
      | [] -> (List.rev routed_orders, remaining) (* Unfilled remainder *)
      | v :: vs ->
          if remaining <= 0.0 then 
            (List.rev routed_orders, 0.0)
          else
            let fill_qty = Float.min remaining v.available_qty in
            if fill_qty > 0.0 then
              sweep (remaining -. fill_qty) ((v.id, fill_qty) :: routed_orders) vs
            else
              sweep remaining routed_orders vs
    in
    sweep total_qty [] sorted_venues

end

This functional approach uses recursion (sweep) to walk through the sorted venues, accumulating child orders. It explicitly returns both the generated child allocations and any unfilled quantity.

Dark Pools and Ping Routing

In a production system, an SOR will often route to dark pools first. Dark pools do not publish quotes, so the SOR sends an Immediate-or-Cancel (IOC) order with a Minimum Acceptable Quantity (MAQ) to the dark pool — this is known as a "ping." If the order is filled, the SOR saves the spread and adverse selection cost of the lit markets. If it is rejected, the SOR rapidly routes the remainder to the lit exchanges before the price can move.

The complexity of an SOR lies in managing latency races: if it sends an order to Venue A and Venue B simultaneously, but Venue B is slower, a high-frequency trader might see the execution on A and cancel their quote on B before the SOR's second order arrives. Modern SORs use historical latency profiles to stagger the release of their orders so they arrive at all venues at the exact same microsecond.


23.5 Chapter Summary

Execution algorithms address the practical gap between investment decisions and portfolio positions. A large order cannot be executed instantaneously without substantial cost; execution algorithms spread the trade over time and across venues to minimize that cost.

The Almgren-Chriss model provides the theoretical framework. The optimal liquidation schedule trades off linear temporary market impact (each trade moves price proportional to its size) against timing risk (the position is exposed to price drift while not fully exited). The resulting trajectory is a hyperbolic-sine function of time: convex (front-loaded) for high risk aversion (minimise variance of execution cost) and linear (TWAP) in the limit of zero risk aversion. The model's key parameters are the temporary and permanent impact coefficients, which must be estimated from market microstructure data.

TWAP and VWAP are pragmatic simplifications. TWAP divides the order uniformly across time intervals, making minimal assumptions about market structure. VWAP tracks the expected market volume profile (derived from historical data), trading more during high-volume periods. VWAP is the most common institutional benchmark because it measures execution quality relative to a market reference rather than an absolute price.

Implementation Shortfall captures the total cost of a completed trade, including spread, market impact, timing risk, and delay cost. Unlike VWAP benchmarking (which can be gamed by trading large amounts right before the close when volume spikes), IS is a precise and complete cost measure. Smart Order Routing addresses market fragmentation: modern equity markets consist of many competing venues, and the best execution obligation requires routing to the venue with the best combination of price, available quantity, and transaction costs.


Exercises

23.1 Compute the Almgren-Chriss optimal schedule for liquidating 100,000 shares over a day, varying $\lambda$ from $10^{-5}$ to $10^{-3}$. Plot the efficient frontier (cost vs risk).

23.2 Simulate a VWAP order using a typical intraday U-shaped volume profile. Compare achieved VWAP to the market VWAP.

23.3 Implement a participation rate algorithm that automatically pauses when spread exceeds a threshold (adverse market conditions).

23.4 Simulate smart order routing across 3 venues with different depths and fees. Measure average execution price vs single-venue execution.


Next: Chapter 24 — Quantitative Trading Strategies