Statistical Arbitrage: Quantitative Mean Reversion
Two assets walk into a bar. They're historically correlated—when one goes up, the other usually follows. Today? One rallies 5% while the other drops 3%.
Do you panic? Short the strong one? Buy the weak one?
Or do you recognize this as statistical arbitrage gold: a temporary divergence from long-term equilibrium that is likely to revert? If the relationship is cointegrated (not just correlated), you can trade the spread with a quantified edge.
🚨 This Isn't "Buy Dips" Hope Trading
Statistical arbitrage requires rigorous mathematical validation. You can't just eyeball two charts and assume they'll converge. You need cointegration tests, z-score models, and proper risk management—because spreads can diverge further before they converge.
Without quantitative validation? You're gambling, not arbitraging.
🎯 What You'll Master
By the end of this lesson, you'll understand:
- The difference between correlation and cointegration (critical!)
- How to identify viable pairs using Augmented Dickey-Fuller tests
- Z-score entry/exit models for statistical edge
- Position sizing based on spread volatility and half-life
- Real-world stat arb strategies used by hedge funds
⚡ Quick Wins for Tomorrow
- Test Your First Pair for Cointegration Tonight: Use Python: `from statsmodels.tsa.stattools import coint; score, pvalue, _ = coint(spy, qqq)`. If the p-value is below 0.05, the pair passes the cointegration test (candidate to trade). If it is above 0.05, it does not (don't trade). High correlation ≠ cointegration.
- Build Your Z-Score Spreadsheet: Z = (Current Spread - Avg Spread) / StdDev. Entry: |z| > 2.0 (2 standard deviations from mean). Exit: z returns to 0. Stop: |z| > 3.0 (diverging too far). Position size: 2% per pair.
- Paper Trade ONE Pair for 30 Days: Pick one pair that passes the cointegration test (candidates: SPY-IWM, KO-PEP). Enter at |z| ≥ 2.0, exit at |z| ≤ 0.5. Track 10-15 trades. Target: >65% win rate, >1.5% avg profit. Validate execution before live money.
The Most Dangerous Misconception in Pairs Trading
Retail traders love to find "correlated pairs" and trade the divergence. SPY and QQQ move together, right? Tesla and Lucid? Bitcoin and Ethereum?
Here's the trap: Correlation does not guarantee mean reversion.
Two assets can be highly correlated yet drift apart permanently. Why? Because correlation measures direction, not equilibrium.
Correlation: Moves Together, Maybe
Definition: Measures how two assets move in the same direction
Range: -1 (perfect inverse) to +1 (perfect positive)
The Problem: High correlation doesn't mean they'll return to any specific relationship. Example:
- Stock A: $100 → $150 (+50%)
- Stock B: $50 → $75 (+50%)
- Correlation: Perfect 1.0
- Spread: $50 → $75 (widens by 50% and never reverts, even though the ratio stays at 2.0)
Danger: You can't trade a spread that doesn't revert. This is why pure correlation-based pairs fail.
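The numbers in this example can be checked directly. The sketch below assumes both stocks climb in five equal dollar steps (the intermediate prices are not given in the lesson) and computes the correlation, the dollar spread, and the price ratio.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Assumed path: both stocks rise 50% in five equal dollar steps.
stock_a = [100, 110, 120, 130, 140, 150]
stock_b = [50, 55, 60, 65, 70, 75]

corr = pearson(stock_a, stock_b)
spread = [a - b for a, b in zip(stock_a, stock_b)]
ratio = [a / b for a, b in zip(stock_a, stock_b)]

print(round(corr, 4))          # correlation of the two price paths
print(spread[0], spread[-1])   # dollar spread at the start and at the end
print(ratio[0], ratio[-1])     # price ratio at the start and at the end
```

The correlation is perfect while the dollar spread grows from 50 to 75, so a trader waiting for the spread to come back to 50 would wait indefinitely.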
Cointegration: Mean-Reverting Equilibrium
Definition: A statistically stable long-term relationship where deviations are temporary
Test: Augmented Dickey-Fuller (ADF) test on the spread
What It Means: When the spread deviates, it's pulled back toward equilibrium. Example:
- Asset A / Asset B = typical ratio of 2.0
- Today: ratio = 2.4 (spread widened)
- Cointegrated pairs: Ratio reverts toward 2.0
- Edge: Sell the expensive one, buy the cheap one
This Works: Mean reversion is statistically validated, not hoped for.
💡 The Key Insight
Correlation = "They move together"
Cointegration = "Their relationship has gravity"
Only cointegrated pairs can be traded as statistical arbitrage. Everything else is speculation.
Derek's $31,400 Correlation Trap: When Correlated Pairs Don't Revert
Derek Martinez, 34, Austin, TX — Quantitative trader, $220K account.
February 2024: Derek pairs traded ARKK/QQQ based on 0.89 correlation. "When they diverge, they'll converge back." He never tested for cointegration.
By May 2024: 3 trades over 3 months. Total loss: -$31,400 (-14.3%).
🚨 What Derek Learned The Hard Way
"Correlation measures the PAST. Cointegration predicts the FUTURE. I lost $31,400 trading ARKK/QQQ because they were 'highly correlated.' But correlation just means they moved similarly historically. It doesn't mean they'll revert."
— Derek Martinez, May 2024
📉 Derek's 3-Month Disaster: Feb-May 2024
The Three Disasters
Trade #1 (Feb 12-Mar 8): Bought ARKK thinking "cheap vs QQQ." Spread widened instead of reverting. Loss: -$6,600.
Trade #2 (Mar 15-Apr 19): Shorted ARKK "overextended." ARKK exploded +13.5% on genomics hype. Margin call. Loss: -$15,200.
Trade #3 (Apr 25-May 10): Revenge trade. Initially up $26,831, then short squeezed +10.5%. Held hoping for recovery. Loss: -$8,400.
The Rebuild: February-November 2025
New Process:
- ADF test EVERY pair — p-value must be <0.05 for stationary spread
- Calculate half-life — Only trade if mean reversion <20 days
- Z-score entry — |Z| > 2.0 (2+ std devs from mean)
- Decointegration stop — If ADF p-value rises >0.10, exit immediately
Pairs traded: KO/PEP, XOM/CVX, JPM/BAC (all tested cointegrated).
📈 Derek's 9-Month Transformation
💡 Derek's Lesson
Correlation = "They move together." Cointegration = "Their relationship has gravity."
- ADF p-value <0.05 = spread is stationary, can trade
- ADF p-value >0.05 = NOT cointegrated, don't touch it
- Calculate half-life to know how long reversion takes
Win rate went from 0% to 69.6% by testing for cointegration, not just correlation.
Case Study Quiz: Derek lost $31,400 (-14.3%) in 3 months trading ARKK/QQQ pairs. He was confident because they showed 0.89 correlation—they moved together historically. Trade #2 destroyed him: he shorted ARKK thinking it was "overextended vs. QQQ," but ARKK exploded +13.5% while QQQ only +1.1%, causing a margin call and -$15,200 loss. He also missed a $26,831 profit opportunity when the spread reversed but he held too long. What was Derek's fatal mistake?
Correct: C. Derek confused correlation (past similarity) with cointegration (mean-reverting spread). He never ran an ADF test. ARKK pivoted to biotech, breaking the historical link with QQQ—correlation persisted but cointegration died. Recovery: ADF test every pair (p<0.05), calculate half-life, use z-score entry. Result: 69.6% win rate after learning proper stat arb.
The Augmented Dickey-Fuller Test
To validate cointegration, you need to test whether the spread (or ratio) is stationary—meaning it has a constant mean it reverts to.
The standard test: Augmented Dickey-Fuller (ADF) test.
📐 What the ADF Test Does
The ADF test checks if a time series (like the spread between two assets) is stationary or has a unit root (random walk, non-stationary).
- Stationary spread: Has a constant mean and variance. Deviations from the mean are temporary → mean reversion is expected.
- Non-stationary spread: Drifts over time with no equilibrium. Deviations can persist indefinitely → no reliable mean reversion.
Null Hypothesis (H₀): The spread has a unit root (NOT stationary, NOT cointegrated)
Alternative Hypothesis (H₁): The spread is stationary (cointegrated, mean-reverting)
Interpreting ADF Test Results
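The ADF test outputs a p-value: the probability of seeing a test statistic at least this extreme if the spread really were a random walk (non-stationary). A small p-value is evidence against the random walk, not the probability that the spread is one.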
| P-Value | Interpretation | Trade Decision |
|---|---|---|
| < 0.01 | Unit root rejected at the 1% level (strong evidence of stationarity) | ✅ TRADEABLE (strong cointegration) |
| 0.01 - 0.05 | Unit root rejected at the 5% level | ✅ TRADEABLE (acceptable cointegration) |
| 0.05 - 0.10 | Unit root rejected only at the 10% level (weak evidence) | ⚠️ MARGINAL (risky, watch closely) |
| > 0.10 | Cannot reject a unit root (treat as random walk) | ❌ DO NOT TRADE (not cointegrated) |
✅ Rule of Thumb
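Only trade pairs with ADF p-value < 0.05. At that level you reject the random-walk hypothesis at the 5% significance level, which is evidence that the spread has mean-reverted historically, not a guarantee that it will keep doing so.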
If p-value > 0.05, you're trading correlation, not cointegration—and that's gambling.
Calculating the Hedge Ratio
Before running the ADF test, you need to determine the hedge ratio (β)—the optimal proportion to balance the two assets in the pair.
Method: Run an Ordinary Least Squares (OLS) regression:
Asset A = β × Asset B + ε
β (beta) = hedge ratio
Example: KO vs PEP
Step 1: Collect Historical Prices
KO (Coca-Cola): Last 252 days of closing prices
PEP (PepsiCo): Last 252 days of closing prices
Step 2: Run OLS Regression (KO = β × PEP)
Using Python statsmodels or Excel LINEST function
Result: β = 1.12
Interpretation:
Because the regression is run on prices, β is in share terms: each share of KO is hedged with 1.12 shares of PEP on the opposite side.
Step 3: Calculate Spread
Spread = KO - (1.12 × PEP)
Example values:
- KO = $60.50
- PEP = $180.20
- Spread = $60.50 - (1.12 × $180.20) = $60.50 - $201.82 = -$141.32
Step 4: Run ADF Test on Spread
ADF Test Result: p-value = 0.018 ✅
Interpretation: The unit-root hypothesis is rejected at the 5% level (p = 0.018), so the spread is treated as stationary
Conclusion: KO/PEP is a tradeable pair
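Steps 2 and 3 can be sketched in a few lines of plain Python. The regression here has no intercept, matching the form KO = β × PEP used above. The price lists are hypothetical and exist only to show the mechanics; the last line reproduces the Step 3 arithmetic with the lesson's own numbers.

```python
def hedge_ratio(asset_a, asset_b):
    """OLS slope through the origin for asset_a = beta * asset_b + error."""
    num = sum(a * b for a, b in zip(asset_a, asset_b))
    den = sum(b * b for b in asset_b)
    return num / den

def spread(price_a, price_b, beta):
    """Spread = A - beta * B for a single pair of prices."""
    return price_a - beta * price_b

# Hypothetical prices, chosen only to show the mechanics (A is roughly 2 * B).
prices_b = [10.0, 11.0, 12.0, 13.0, 14.0]
prices_a = [20.5, 21.5, 24.5, 25.5, 28.5]

beta = hedge_ratio(prices_a, prices_b)
print(round(beta, 3))

# The lesson's Step 3 numbers: KO = 60.50, PEP = 180.20, beta = 1.12.
ko_pep_spread = spread(60.50, 180.20, 1.12)
print(round(ko_pep_spread, 2))
```

In practice you would feed in the 252 daily closes for each ticker and then pass the resulting spread series to the ADF test.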
Real Example: SPY/IWM Cointegration Test
| Step | Process | Result |
|---|---|---|
| 1. Data Collection | Download 1 year of daily prices for SPY and IWM | 252 trading days |
| 2. Correlation Check | Calculate Pearson correlation | r = 0.94 (high correlation) |
| 3. OLS Regression | SPY = β × IWM + ε | β = 2.68 (hedge ratio) |
| 4. Spread Calculation | Spread = SPY - (2.68 × IWM) | 252 spread values |
| 5. ADF Test | Test if spread is stationary | p = 0.0082 ✅ COINTEGRATED |
| 6. Half-Life | How long for 50% mean reversion? | 8.2 days (fast reversion) |
| 7. Conclusion | Is this pair tradeable? | YES - Trade the spread |
🚨 Common Mistakes in Cointegration Testing
- Using too short a lookback period: Need at least 6-12 months of data for reliable ADF test. 1-3 months = unreliable.
- Not recalculating β regularly: Hedge ratios drift over time. Recalculate monthly or you'll trade the wrong spread.
- Ignoring p-value drift: A pair cointegrated in January may decointegrate by June. Re-test monthly.
- Testing only once: Cointegration is dynamic, not static. Rolling window tests (6-month rolling) are essential.
Z-Score: The Trading Signal
Once you've confirmed cointegration (ADF p < 0.05), you need a trading signal to tell you WHEN to enter and exit. That signal is the z-score.
📐 Z-Score Formula
Z = (Current Spread - Mean Spread) / Standard Deviation of Spread
What it measures: How many standard deviations the current spread is from its historical mean.
- Z = 0: Spread is at its mean (no trade)
- Z = +2: Spread is 2 standard deviations ABOVE mean (Asset A expensive, Asset B cheap)
- Z = -2: Spread is 2 standard deviations BELOW mean (Asset A cheap, Asset B expensive)
Z-Score Trading Frameworks
Conservative Z-Score Model
- Entry: |Z| > 2.0 (spread is 2+ std devs from mean)
- Exit: Z crosses back through 0 (mean reversion)
- Stop Loss: |Z| > 3.5 (spread widening, cut losses)
Why Conservative:
- If the spread is roughly normally distributed, |Z| > 2 occurs only about 5% of the time, so the deviation is unusual
- Fewer false signals, higher win rate
- Lower frequency (fewer trades)
Aggressive Z-Score Model
- Entry: |Z| > 1.5 (more frequent signals)
- Partial Exit: |Z| falls to 0.5 (take some profit)
- Full Exit: Z crosses 0 (mean)
- Stop Loss: |Z| > 3.0
Why Aggressive:
- More trades = more opportunities
- Lower threshold = earlier entries
- Requires tighter risk management
💡 Real-World Application
Spread = $5.20
Mean Spread = $5.00
Std Dev = $0.10
Z-score = ($5.20 - $5.00) / $0.10 = 2.0
Action: Asset A is relatively expensive, Asset B is relatively cheap
Trade: Short Asset A, Long Asset B (expect convergence)
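The same calculation as a small sketch, with the conservative thresholds turned into a signal function. The function names and the "stand aside" label for a flat trader who sees |Z| already beyond the stop level are assumptions for illustration. The entry test uses ≥ so that the example above, which sits at a z-score of 2.0, triggers.

```python
def z_score(current, mean, std):
    """Number of standard deviations the spread sits from its mean."""
    return (current - mean) / std

def conservative_signal(z, entry=2.0, stop=3.5):
    """Map a z-score to an action for a trader with no open position."""
    if abs(z) > stop:
        return "stand aside"        # already beyond the stop level, do not open
    if z >= entry:
        return "short A, long B"    # spread is rich: A expensive relative to B
    if z <= -entry:
        return "long A, short B"    # spread is cheap: A cheap relative to B
    return "no trade"

z = z_score(5.20, 5.00, 0.10)
print(round(z, 2), conservative_signal(z))
```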
Half-Life: How Long Until Reversion?
The half-life tells you how many periods it takes for the spread to revert halfway back to the mean. This informs your holding period.
📖 Calculating Half-Life
Use an AR(1) model (autoregressive) on the spread:
Spread_t = θ × Spread_{t-1} + ε_t
Half-Life = -log(2) / log(θ)
Example: If θ = 0.95, Half-Life = -log(2) / log(0.95) ≈ 13.5 days (roughly 14)
Implication: Expect the spread to close about half of its gap to the mean within ~14 days. If it takes 30+ days, your model may be breaking down.
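A minimal sketch of the half-life calculation, using natural logs. The `estimate_theta` helper is an assumption about how you might fit the AR(1) coefficient: it regresses each deviation from the mean on the previous one, with no intercept, and defaults to the sample mean when no equilibrium level is supplied.

```python
import math

def half_life(theta):
    """Periods for a deviation to decay by half when each step keeps a fraction theta."""
    if not 0 < theta < 1:
        raise ValueError("half-life is only defined for 0 < theta < 1")
    return -math.log(2) / math.log(theta)

def estimate_theta(spread, mean=None):
    """Least-squares AR(1) coefficient on deviations of the spread from its mean."""
    if mean is None:
        mean = sum(spread) / len(spread)
    dev = [s - mean for s in spread]
    num = sum(prev * cur for prev, cur in zip(dev[:-1], dev[1:]))
    den = sum(prev * prev for prev in dev[:-1])
    return num / den

print(round(half_life(0.95), 1))

# Hypothetical noise-free spread decaying toward 10.0 by 10% of the gap per day.
decaying = [10.0 + 8.0 * 0.9 ** t for t in range(50)]
theta_hat = estimate_theta(decaying, mean=10.0)
print(round(theta_hat, 3), round(half_life(theta_hat), 1))
```

On real data the estimate is noisy, so it is worth treating the half-life as a rough holding-period guide rather than a precise forecast.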
How Much to Allocate Per Trade
Statistical arbitrage is NOT "set it and forget it." Spreads can widen before they converge (drawdowns), and some pairs decointegrate over time.
Kelly Criterion for Stat Arb
Use a modified Kelly approach:
f* = (p × b - q) / b
Where:
- p = win rate (historical % of profitable mean reversions)
- q = loss rate (1 - p)
- b = average win / average loss
- f* = optimal fraction of capital to risk
🚨 Never Risk More Than 2-5% Per Pair
Even if Kelly suggests 10%, stat arb models can fail (decointegration, regime change). Use half-Kelly or quarter-Kelly for safety:
Half-Kelly: f* / 2
Quarter-Kelly: f* / 4
This protects you from model breakdown while still capturing edge.
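The sizing rule can be sketched as follows. The 60% win rate and 1.5 win/loss ratio are hypothetical inputs, and the 5% cap is the upper end of the per-pair limit stated above.

```python
def kelly_fraction(win_rate, win_loss_ratio):
    """Full Kelly fraction f* = (p * b - q) / b."""
    p, b = win_rate, win_loss_ratio
    q = 1.0 - p
    return (p * b - q) / b

def position_fraction(win_rate, win_loss_ratio, kelly_scale=0.5, cap=0.05):
    """Scaled Kelly, floored at 0 and capped at the per-pair maximum."""
    scaled = kelly_scale * kelly_fraction(win_rate, win_loss_ratio)
    return max(0.0, min(scaled, cap))

# Hypothetical inputs: 60% win rate, average win 1.5x the average loss.
full = kelly_fraction(0.60, 1.5)               # (0.9 - 0.4) / 1.5
half = position_fraction(0.60, 1.5, 0.5)       # half-Kelly, then capped at 5%
quarter = position_fraction(0.60, 1.5, 0.25)   # quarter-Kelly, then capped at 5%
print(round(full, 4), half, quarter)
```

With these inputs full Kelly is about a third of capital, so even quarter-Kelly exceeds the cap and the cap ends up being the binding constraint.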
Portfolio of Pairs: Diversification
Don't run just one pair. Build a portfolio of 5-10 cointegrated pairs across different sectors to reduce idiosyncratic risk.
Example Portfolio: KO/PEP (Consumer), XOM/CVX (Energy), JPM/BAC (Finance), AAPL/MSFT (Tech), UNH/CVS (Healthcare). 2% per pair, 10% total risk, diversified if one decointegrates.
Regime Changes & Decointegration
Statistical arbitrage works—until it doesn't. Pairs can decointegrate due to:
- Corporate Actions: Mergers, spinoffs, dividend changes
- Structural Shifts: One company changes business model
- Macro Regime Change: Interest rate shocks, sector rotation
- Liquidity Drying Up: Can't exit the position without slippage
🚨 LTCM Collapse (1998)
Long-Term Capital Management, a fund whose partners included two Nobel laureates, lost $4.6B when the Russian debt crisis caused spreads to widen beyond historical norms (z-score hit 10+). Lesson: Models assume normal distributions. Black swan events break assumptions. Always have stop losses.
Stop Loss Rules for Stat Arb
- Z-Score Stop: Exit if |Z| > 3.5 (spread widening, not reverting)
- Time Stop: Exit if position is open > 2× half-life (model failing)
- Dollar Stop: Exit if drawdown exceeds 50% of expected profit
- Cointegration Breakdown: Exit if ADF p-value goes above 0.10 (losing cointegration)
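The four rules above can be combined into one check that runs on every open position. This is a sketch: the function name, argument names, and reason labels are assumptions, and the sample inputs are hypothetical.

```python
def stop_reasons(z, days_open, half_life_days, drawdown, expected_profit, adf_pvalue):
    """Return the list of stop rules currently triggered (empty list means hold)."""
    reasons = []
    if abs(z) > 3.5:
        reasons.append("z-score stop")
    if days_open > 2 * half_life_days:
        reasons.append("time stop")
    if drawdown > 0.5 * expected_profit:
        reasons.append("dollar stop")
    if adf_pvalue > 0.10:
        reasons.append("cointegration breakdown")
    return reasons

# Hypothetical healthy position: nothing triggered.
print(stop_reasons(z=2.4, days_open=5, half_life_days=8.2,
                   drawdown=100, expected_profit=400, adf_pvalue=0.03))

# Hypothetical failing position: every rule triggered.
print(stop_reasons(z=3.8, days_open=20, half_life_days=8.2,
                   drawdown=300, expected_profit=400, adf_pvalue=0.15))
```

Returning the reasons rather than a bare True/False makes the trade log more useful when you later review why positions were closed.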
Building Your Stat Arb System
Step-by-step process for running statistical arbitrage:
Step 1 - Discovery: Screen same-sector stocks with similar market cap and >0.7 correlation.
Step 2 - Test: ADF test on spread (6-12 months data). p<0.05 = pass, p>0.05 = reject.
Step 3 - Backtest: Entry |Z|>2, exit Z=0. Target: Sharpe >1.5, win rate >60%, half-life 5-30 days.
Step 4 - Monitor: Daily z-score, spread volatility, correlation. Re-run ADF monthly.
Pro Tip: Use rolling 6-month window to recalculate β monthly. Exit if p-value drifts >0.10 (decointegrating).
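A rolling hedge ratio can be sketched in plain Python. This version recomputes β at every observation, so for a monthly update you would read off one value per month. For a 6-month window on daily data you would pass roughly 126 observations (an assumption based on about 252 trading days per year). The short price lists are hypothetical and deliberately switch from A = 2 × B to A = 3 × B to show the ratio drifting.

```python
def rolling_hedge_ratios(asset_a, asset_b, window):
    """Through-origin OLS beta for each trailing window of `window` observations."""
    betas = []
    for end in range(window, len(asset_a) + 1):
        a = asset_a[end - window:end]
        b = asset_b[end - window:end]
        betas.append(sum(x * y for x, y in zip(a, b)) / sum(y * y for y in b))
    return betas

# Hypothetical prices: A = 2 * B for the first three days, A = 3 * B afterwards.
prices_b = [10, 11, 12, 13, 14, 15]
prices_a = [20, 22, 24, 39, 42, 45]

betas = rolling_hedge_ratios(prices_a, prices_b, window=3)
print([round(b, 2) for b in betas])
```

A β that moves this much between windows is itself a warning sign, so it is worth logging the series rather than only the latest value.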
High-Frequency Mean Reversion (For Active Traders)
Most stat arb is swing trading (holding 5-30 days), but cointegration also works on intraday timeframes—if you have the infrastructure.
Requirements for Intraday Stat Arb
- Low Latency Execution: Sub-second fills required (DMA or sponsored access)
- High-Liquidity Pairs: SPY/IWM, QQQ/DIA, sector ETFs only
- Tight Spreads: Commission + slippage must be <0.02% per trade
- Automated System: Manual trading too slow for 5-minute half-lives
Example: SPY vs IWM Intraday Pairs
Setup: SPY/IWM, correlation 0.92 (5-min), ADF p=0.009✅, half-life 12 bars (60 min)
Rules: Entry |Z| > 1.8, exit when Z crosses 0 OR 90 min elapsed
Example (Nov 15, 2024): 10:35 AM z=-1.95 → Long 200 SPY, Short 530 IWM → 11:20 AM z=+0.12 → Exit +$142 (45 min, 0.12%)
Scalability: 8-12× per day, $50-200 each. Monthly: ~3-5% with minimal overnight risk.
Intraday Stat Arb Risks
Why Retail Fails: Commission overhead ($0.005/share × 2 legs), slippage on large size, 40-50% false signals from intraday noise, need algo platform.
Requirements: >$100K capital, professional routing (IBKR Pro, Lightspeed), automated execution.
Avoid if: Beginner, using Robinhood/Webull, or trading manually.
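To put the commission overhead in context, here is a rough sketch applied to the Nov 15 example trade. It assumes the quoted +$142 is before commission and that $0.005 per share is charged on both legs at entry and again at exit; slippage is ignored.

```python
def round_trip_commission(shares_per_leg, per_share=0.005):
    """Commission for opening and closing every leg of a pairs trade."""
    one_way = sum(shares_per_leg) * per_share
    return 2 * one_way  # pay once to enter, once to exit

legs = [200, 530]                       # SPY and IWM share counts from the example
commission = round_trip_commission(legs)
net = 142.0 - commission                # assumes the quoted $142 is before commission
print(f"commission ${commission:.2f}, net ${net:.2f}")
```

On the smaller $50 wins in the quoted range, the same commission takes a noticeably larger share of the profit, before any slippage.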
Building a Diversified Stat Arb Book
Single-pair trading is risky (decointegration can wipe you out). Professional stat arb traders run portfolios of 10-30 pairs simultaneously.
Example Portfolio Allocation
Conservative ($100K)
5 pairs, 2% each: KO/PEP, JPM/BAC, XOM/CVX, UNH/CVS, WMT/TGT
Total Risk: 10% | Sharpe: 1.5-2.0 | Return: 12-18% | Max DD: -8% to -12%
Moderate ($100K)
10 pairs, 1.5% each: Daily (6): Conservative + MA/V, DIS/CMCSA | Intraday (4): SPY/IWM, QQQ/DIA, XLF/XLK, XLE/XLU
Total Risk: 15% | Sharpe: 2.0-2.5 | Return: 20-30% | Max DD: -12% to -18%
Aggressive ($100K)
20+ pairs, 1% each: 8 daily blue-chips + 12 intraday ETFs | 1.5× leverage
Total Risk: 20-30% | Sharpe: 2.5-3.5 | Return: 35-50% | Max DD: -20% to -30%
Warning: Requires full-time monitoring and professional execution
🚨 Correlation Between Pairs (Hidden Risk)
Bad: KO/PEP, MDLZ/GIS, K/CPB (all food)—sector crash kills ALL pairs.
Good: KO/PEP (consumer) + JPM/BAC (finance) + XOM/CVX (energy). Sector diversification makes it less likely that every pair breaks down at the same time.
Rebalancing Your Stat Arb Portfolio
📖 Monthly Maintenance
1st of Every Month: Re-run ADF tests, recalculate hedge ratios, update z-score thresholds, remove p>0.10 pairs, add new pairs. Example: KO/PEP p-value drifts Jan 0.021✅ → Apr 0.124❌ = REMOVE.
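The monthly review can be sketched as a simple bucketing step over fresh ADF p-values. The KO/PEP values are the January and April readings quoted above; the other pairs' p-values and the bucket names are hypothetical.

```python
def review_pairs(pvalues, drop_above=0.10, trade_below=0.05):
    """Sort pairs into trade / watch / remove buckets from fresh ADF p-values."""
    buckets = {"trade": [], "watch": [], "remove": []}
    for pair, p in pvalues.items():
        if p > drop_above:
            buckets["remove"].append(pair)
        elif p < trade_below:
            buckets["trade"].append(pair)
        else:
            buckets["watch"].append(pair)
    return buckets

# KO/PEP values come from the lesson; the other p-values are hypothetical.
january = {"KO/PEP": 0.021, "XOM/CVX": 0.034, "JPM/BAC": 0.072}
april = {"KO/PEP": 0.124, "XOM/CVX": 0.041, "JPM/BAC": 0.018}

print(review_pairs(january))
print(review_pairs(april))
```

The middle "watch" bucket corresponds to the marginal 0.05 to 0.10 band in the ADF interpretation table.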
Building Your Stat Arb System from Scratch
Implementation Framework (3-Step Process):
Step 1: Testing for Cointegration
Process: (1) Run OLS regression to find hedge ratio β (e.g., KO = β × PEP), (2) Calculate spread (KO - β × PEP), (3) Run Augmented Dickey-Fuller (ADF) test on spread. If p-value < 0.05 = cointegrated. (4) Calculate half-life (mean reversion speed) using spread lag regression.
Example: KO/PEP β = 0.92, spread p-value = 0.018 (✅ cointegrated), half-life = 12.3 days (about half of any deviation is expected to decay in ~12 days).
Step 2: Calculating Z-Score Signals
Process: Calculate 60-day rolling z-score: z = (spread - spread_mean) / spread_std. Set thresholds: Entry ±2.0σ, Exit 0.0σ (mean reversion), Stop ±3.5σ (breakdown).
Signals: Z-score < -2.0 = Long spread (buy KO, sell PEP). Z-score > +2.0 = Short spread (sell KO, buy PEP). Z-score crosses 0 = Exit (mean reversion complete). |Z-score| > 3.5 = Stop loss (cointegration may have broken).
Step 3: Backtesting the Strategy
Process: Iterate through historical data, track position state (long/short/flat), enter at ±2.0σ thresholds, exit at 0σ or ±3.5σ stop. Calculate metrics: win rate, avg win/loss, profit factor, total PnL.
Target Metrics: Win rate >55%, profit factor >1.5, Sharpe ratio >1.2, max DD <15%. If backtest fails thresholds, pair likely not tradable (insufficient mean reversion or high transaction costs eating edge).
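The backtest loop described in Step 3 can be sketched as a small state machine. This is a simplified illustration, not a full backtester: it uses a fixed mean and standard deviation instead of the 60-day rolling window, ignores transaction costs, and runs on a hypothetical spread series. P&L is measured in spread units per one unit of spread.

```python
def backtest_zscore(spread, z, entry=2.0, stop=3.5):
    """Walk a spread / z-score history and return the P&L of each closed trade.

    Position is +1 (long spread), -1 (short spread) or 0 (flat).
    """
    position, entry_spread, trades = 0, 0.0, []
    for s, zt in zip(spread, z):
        if position == 0:
            if entry <= abs(zt) <= stop:
                position = -1 if zt > 0 else 1   # fade the deviation
                entry_spread = s
        else:
            reverted = zt * position >= 0        # z has crossed back through 0
            stopped = abs(zt) > stop
            if reverted or stopped:
                trades.append(position * (s - entry_spread))
                position = 0
    return trades

def summarize(trades):
    """Win rate, profit factor and total P&L for a list of trade results."""
    wins = [t for t in trades if t > 0]
    losses = [-t for t in trades if t < 0]
    return {
        "win_rate": len(wins) / len(trades) if trades else 0.0,
        "profit_factor": sum(wins) / sum(losses) if losses else float("inf"),
        "total_pnl": sum(trades),
    }

# Hypothetical spread history; mean 5.00 and std 0.10 are held fixed for simplicity.
spread = [5.00, 5.21, 5.12, 4.99, 4.78, 4.90, 5.01, 5.22, 5.30, 5.37, 5.10]
z = [(s - 5.00) / 0.10 for s in spread]

trades = backtest_zscore(spread, z)
stats = summarize(trades)
print([round(t, 2) for t in trades], {k: round(v, 2) for k, v in stats.items()})
```

In this toy series two trades revert and one is stopped out. A realistic test would need hundreds of observations, rolling statistics computed only from past data, and costs on every fill before comparing the results with the target metrics.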
💡 Implementation: Use Python libraries (statsmodels for ADF test, pandas for data handling) or platforms like QuantConnect/Zipline. Full code examples available in community repository.
Live Trading Tips: Paper trade 1-3 months first. Start 0.5-1% sizes. Monitor slippage (>0.05% = reduce size). Log every trade: z-score entry/exit, hold time, P&L.
💡 Statistical Arbitrage Checklist
- Cointegration > Correlation: Always validate with ADF test (p < 0.05)
- Z-Score Entry: |Z| > 2.0 for conservative, > 1.5 for aggressive
- Stop Losses: Exit if |Z| > 3.5 or position open > 2× half-life
- Position Sizing: Never more than 2-5% per pair (use Kelly/half-Kelly)
- Diversify Pairs: Run 5-10 pairs across different sectors
- Monitor Continuously: Re-test cointegration monthly, exit if breaking down
- Portfolio Construction: Ensure pairs are uncorrelated with each other (sector diversity)
- Intraday Trading: Only with professional execution and automation
Statistical arbitrage is a quantitative edge—but it requires discipline. If you execute this correctly, you're trading with mathematical precision instead of hope.
Test Your Understanding
Q1: What was Derek's fatal mistake that led to his $31,400 loss trading ARKK/QQQ?
Correct! Derek assumed 0.89 correlation meant mean-reversion. He never ran an ADF test. Correlation measures past similarity; cointegration validates a statistically stable equilibrium.
Q2: For a pairs trading strategy using statistical arbitrage, what ADF test p-value threshold indicates a valid cointegrated pair?
Correct! ADF p-value < 0.05 means the random-walk hypothesis is rejected at the 5% level, so the spread is treated as stationary (mean-reverting). If p-value > 0.05, don't trade it as stat arb.
Q3: According to the lesson's conservative z-score strategy, when should you enter a statistical arbitrage pairs trade?
Correct! Conservative entry: |z| > 2.0 (2+ std devs from mean). Exit when z returns to 0. Stop out if |z| > 3.5 (spread may be de-cointegrating).
Q4: What's the critical difference between correlation and cointegration?
Correct! Correlation = "moved together historically." Cointegration = "spread has mean-reverting equilibrium." Highly correlated pairs can drift apart permanently. Only cointegrated pairs can be traded as stat arb.
Q5: According to the lesson's risk management guidelines, what's the maximum position size per pair in statistical arbitrage?
Correct! Max 2-5% per pair using Kelly/half-Kelly sizing. Run 5-10 pairs across sectors to diversify. If one pair de-cointegrates, it won't destroy your account.
Related Lessons
- #64: Macro Regime Framework — How regime changes affect stat arb
- #66: Quantitative Strategy Design — Backtest stat arb strategies
- #67: Machine Learning in Trading — ML for pair selection
⏭️ Coming Up Next
Lesson #64: Macro Regime Framework — Understand how regime changes affect your stat arb models and adapt your strategies accordingly.