Signal Pilot
🔴 Advanced • Lesson 63 of 82

Statistical Arbitrage: Quantitative Mean Reversion

Reading time ~18-22 min • Pairs Trading & Cointegration Analysis

Two assets walk into a bar. They're historically correlated—when one goes up, the other usually follows. Today? One rallies 5% while the other drops 3%.

Do you panic? Short the strong one? Buy the weak one?

Or do you recognize this as a possible statistical arbitrage setup: a temporary divergence from long-term equilibrium that should eventually revert? If the relationship is cointegrated (not just correlated), you can trade the spread with a quantified edge.

🚨 This Isn't "Buy Dips" Hope Trading

Statistical arbitrage requires rigorous mathematical validation. You can't just eyeball two charts and assume they'll converge. You need cointegration tests, z-score models, and proper risk management—because spreads can diverge further before they converge.

Without quantitative validation? You're gambling, not arbitraging.

🎯 What You'll Master

By the end of this lesson, you'll understand:

  • The difference between correlation and cointegration (critical!)
  • How to identify viable pairs using Augmented Dickey-Fuller tests
  • Z-score entry/exit models for statistical edge
  • Position sizing based on spread volatility and half-life
  • Real-world stat arb strategies used by hedge funds
⚡ Quick Wins for Tomorrow
  1. Test Your First Pair for Cointegration Tonight: Use Python: `from statsmodels.tsa.stattools import coint; score, pvalue, _ = coint(spy, qqq)`. A p-value below 0.05 is evidence of cointegration (a candidate to trade). Above 0.05, treat the pair as NOT cointegrated (don't trade). High correlation ≠ cointegration.
  2. Build Your Z-Score Spreadsheet: Z = (Current Spread - Avg Spread) / StdDev. Entry: |z| > 2.0 (2 standard deviations from mean). Exit: z returns to 0. Stop: |z| > 3.0 (diverging too far). Position size: 2% per pair.
  3. Paper Trade ONE Pair for 30 Days: Pick one pair that passes the cointegration test (candidates to test: SPY/IWM, KO/PEP). Enter at |z| ≥ 2.0, exit at |z| ≤ 0.5. Track 10-15 trades. Target: >65% win rate, >1.5% avg profit. Validate execution before live money.
Part 1: Why Correlation ≠ Cointegration

The Most Dangerous Misconception in Pairs Trading

Retail traders love to find "correlated pairs" and trade the divergence. SPY and QQQ move together, right? Tesla and Lucid? Bitcoin and Ethereum?

Here's the trap: Correlation does not guarantee mean reversion.

Two assets can be highly correlated yet drift apart permanently. Why? Because correlation measures direction, not equilibrium.

Correlation: Moves Together, Maybe

Definition: Measures how closely two assets' moves line up in direction and relative size

Range: -1 (perfect inverse) to +1 (perfect positive)

The Problem: High correlation doesn't mean they'll return to any specific relationship. Example:

  • Stock A: $100 → $150 (+50%)
  • Stock B: $50 → $75 (+50%)
  • Correlation: Perfect 1.0
  • Spread: $50 → $75 (keeps widening as prices rise, no reversion)

Danger: You can't trade a spread that doesn't revert. This is why pure correlation-based pairs fail.
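
To make the arithmetic concrete, here is a minimal sketch of the Stock A / Stock B example. The price path (a straight line from $100 to $150, with B always half of A) is an assumption chosen to match the bullet points, not real market data.

```python
import numpy as np

# Hypothetical prices matching the example: A goes 100 -> 150, B goes 50 -> 75.
stock_a = np.linspace(100.0, 150.0, 11)
stock_b = stock_a / 2.0            # B is always exactly half of A

correlation = np.corrcoef(stock_a, stock_b)[0, 1]
spread = stock_a - stock_b         # dollar spread between the two prices

print(f"correlation: {correlation:.2f}")
print(f"dollar spread: {spread[0]:.0f} -> {spread[-1]:.0f}")
```

The correlation is perfect, yet the dollar spread ends at $75 after starting at $50. Nothing in the correlation number tells you the spread will come back to any particular level.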

Cointegration: Mean-Reverting Equilibrium

Definition: A statistically stable long-term relationship where deviations are temporary

Test: Augmented Dickey-Fuller (ADF) test on the spread

What It Means: When the spread deviates, it's pulled back toward equilibrium. Example:

  • Asset A / Asset B = typical ratio of 2.0
  • Today: ratio = 2.4 (spread widened)
  • Cointegrated pairs: Ratio reverts toward 2.0
  • Edge: Sell the expensive one, buy the cheap one

This Works: Mean reversion is statistically validated, not hoped for.

💡 The Key Insight

Correlation = "They move together"
Cointegration = "Their relationship has gravity"

Only cointegrated pairs can be traded as statistical arbitrage. Everything else is speculation.
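
The "gravity" idea can be illustrated with simulated data. The sketch below builds one pair that is cointegrated by construction and one pair that is only correlated, then measures how strongly each spread depends on its previous value. Everything here is an assumption for illustration: the random seed, the 0.8 persistence, and the helper name `ar1_coefficient`. It is a rough diagnostic, not a substitute for a formal cointegration test.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000

# Pair 1 (cointegrated by construction): A = B + stationary AR(1) noise.
b1 = 100 + np.cumsum(rng.normal(0, 1, n))
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = 0.8 * noise[t - 1] + rng.normal(0, 1)
a1 = b1 + noise

# Pair 2 (correlated only): two random walks whose daily moves share a common shock.
common = rng.normal(0, 1, n)
a2 = 100 + np.cumsum(common + 0.33 * rng.normal(0, 1, n))
b2 = 100 + np.cumsum(common + 0.33 * rng.normal(0, 1, n))

def ar1_coefficient(spread):
    """Slope of the demeaned spread on its own previous value."""
    s = spread - spread.mean()
    return float(np.dot(s[1:], s[:-1]) / np.dot(s[:-1], s[:-1]))

theta_coint = ar1_coefficient(a1 - b1)
theta_corr = ar1_coefficient(a2 - b2)
return_corr = float(np.corrcoef(np.diff(a2), np.diff(b2))[0, 1])

print(f"cointegrated pair:    spread AR(1) coefficient = {theta_coint:.3f}")
print(f"correlated-only pair: spread AR(1) coefficient = {theta_corr:.3f}")
print(f"correlated-only pair: correlation of daily moves = {return_corr:.2f}")
```

A coefficient well below 1 means each deviation partly decays by the next period. A coefficient near 1 means the spread behaves like a random walk, even though the second pair's daily moves are highly correlated.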

Part 1.5: The $31,400 Lesson—Derek's Correlation Trap

📉 CASE STUDY: Derek's $31,400 Correlation vs. Cointegration Disaster (3 months)

Trader: Derek Martinez, quantitative trader ($220K account), February-May 2024

Strategy: Pairs trading ARKK/QQQ based on 0.89 correlation. "When they diverge, they'll converge back."

Fatal flaw: Never tested for cointegration (no ADF test, no half-life calculation). Confused correlation (past similarity) with cointegration (mean-reverting spread)

Result: Lost $31,400 (-14.3%) in 3 months across 3 trades. Missed $26.8K profit opportunity by holding through reversal. Account: $220K → $188.6K

The destruction (Feb-May 2024):

  • Trade #1 (Feb 12-Mar 8): ARKK dropped and Derek bought it, thinking "cheap vs. QQQ." The spread widened by a further $5.95 instead of reverting (ARKK continued underperforming). Loss: -$6,600.
  • Trade #2 (Mar 15-Apr 19): Derek shorted ARKK thinking "overextended," but ARKK exploded +13.5% on genomics hype while QQQ gained only +1.1%. A margin call forced him to close. Loss: -$15,200.
  • Trade #3 (Apr 25-May 10): Revenge trade, shorted ARKK again. Initially up $26,831, then ARKK short squeezed +10.5% in 2 days. Derek held hoping for recovery, final exit -$8,400.

Total: -$31,400 + missed $26.8K profit.

Fatal mistakes:

  1. Correlation ≠ cointegration (0.89 correlation just means past similarity, NOT guaranteed mean reversion)
  2. No ADF test (spread was NOT stationary)
  3. No half-life (held 25+ days hoping with no statistical basis)
  4. Ignored structural changes (ARKK pivoted to biotech, broke historical link with QQQ)
  5. Revenge trading (entered Trade #3 emotionally without re-validation)

Recovery (Feb-Nov 2025, 9 months later): Derek spent months learning proper statistical arbitrage.

New process:

  1. ADF test EVERY pair (p-value must be <0.05 for stationary spread)
  2. Calculate half-life (only trade if mean reversion <20 days)
  3. Z-score entry (|Z| > 2.0 = 2+ std devs from mean)
  4. Decointegration stop (if ADF p-value rises >0.10, exit immediately)

Pairs traded: KO/PEP, XOM/CVX, JPM/BAC (all tested cointegrated).

Results: 23 trades, 16 wins (69.6% win rate), avg win +$1,847, avg loss -$923, profit factor ≈4.6 (gross profit 16 × $1,847 divided by gross loss 7 × $923), +$23,094 profit (+12.2% in 9 months, Sharpe 1.94). Account: $188.6K → $211.7K. Best trade: +$4,120 JPM/BAC (z-score 2.8 entry).

Derek's advice 15 months later: "Correlation measures the PAST. Cointegration predicts the FUTURE. I lost $31,400 trading ARKK/QQQ because they were 'highly correlated.' But correlation just means they moved similarly historically. It doesn't mean they'll revert in the future. Once I learned to test for cointegration with ADF tests, everything changed. Now I only trade pairs where the spread is PROVEN to mean-revert (ADF p-value <0.05). My win rate went from 0% to 69.6%—and I sleep at night because I'm trading with statistical validation, not hope. Rule: If ADF p-value > 0.05, I don't touch it. No exceptions. I trade math, not correlation coefficients."

Case Study Quiz: Derek lost $31,400 (-14.3%) in 3 months trading ARKK/QQQ pairs. He was confident because they showed 0.89 correlation—they moved together historically. Trade #2 destroyed him: he shorted ARKK thinking it was "overextended vs. QQQ," but ARKK exploded +13.5% while QQQ only +1.1%, causing a margin call and -$15,200 loss. He also missed a $26,831 profit opportunity when the spread reversed but he held too long. What was Derek's fatal mistake?

A) He used too much leverage on the pairs trade, causing margin calls when spreads widened temporarily
B) He held positions too short (under 10 days), not giving mean reversion enough time to occur
C) He NEVER tested for cointegration—confused correlation (0.89 = past similarity) with cointegration (mean-reverting spread). Never ran ADF test (p-value), never calculated half-life. The spread was NOT stationary, so reversion never came
D) He entered at wrong z-scores (z=1.0 instead of z=2.0), so trades lacked statistical edge

Correct: C. Derek confused correlation (past similarity) with cointegration (mean-reverting spread). He never ran an ADF test. ARKK pivoted to biotech, breaking the historical link with QQQ—correlation persisted but cointegration died. Recovery: ADF test every pair (p<0.05), calculate half-life, use z-score entry. Result: 69.6% win rate after learning proper stat arb.

Part 2: Testing for Cointegration—The Augmented Dickey-Fuller Test

The Augmented Dickey-Fuller Test

To validate cointegration, you need to test whether the spread (or ratio) is stationary—meaning it has a constant mean it reverts to.

The standard test: Augmented Dickey-Fuller (ADF) test.

📐 What the ADF Test Does

The ADF test checks if a time series (like the spread between two assets) is stationary or has a unit root (random walk, non-stationary).

  • Stationary spread: Has a constant mean and variance. Deviations from the mean are temporary → mean reversion is expected.
  • Non-stationary spread: Drifts over time with no equilibrium. Deviations can persist indefinitely → no reliable mean reversion.

Null Hypothesis (H₀): The spread has a unit root (NOT stationary, NOT cointegrated)

Alternative Hypothesis (H₁): The spread is stationary (cointegrated, mean-reverting)

Interpreting ADF Test Results

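The ADF test outputs a p-value: the probability of seeing a test statistic at least this extreme if the spread really were a random walk (unit root). A small p-value is evidence against the random walk. It is not the probability that the spread is non-stationary.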

P-Value | Interpretation | Trade Decision
< 0.01 | Unit root rejected at the 1% level (strong evidence of stationarity) | ✅ TRADEABLE (strong cointegration)
0.01 - 0.05 | Unit root rejected at the 5% level (spread treated as stationary) | ✅ TRADEABLE (acceptable cointegration)
0.05 - 0.10 | Unit root rejected only at the 10% level (weak evidence) | ⚠️ MARGINAL (risky, watch closely)
> 0.10 | Cannot reject a unit root (consistent with a random walk) | ❌ DO NOT TRADE (not cointegrated)

✅ Rule of Thumb

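Only trade pairs with ADF p-value < 0.05. At that level you reject the random-walk hypothesis at the 5% significance level: strong evidence that the spread has been mean-reverting over the test window, though not a guarantee that it will keep doing so.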

If p-value > 0.05, you're trading correlation, not cointegration—and that's gambling.
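
The decision bands above can be written down as a small helper, which is handy when screening many pairs at once. This is a sketch: the function name `classify_adf_pvalue` and the short labels are hypothetical, and the boundaries simply mirror the lesson's table.

```python
def classify_adf_pvalue(p_value: float) -> str:
    """Map an ADF p-value to the lesson's trade-decision bands."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < 0.01:
        return "TRADEABLE (strong)"
    if p_value < 0.05:
        return "TRADEABLE (acceptable)"
    if p_value <= 0.10:
        return "MARGINAL"
    return "DO NOT TRADE"


# p-values quoted elsewhere in this lesson
for p in (0.0082, 0.018, 0.124):
    print(f"p = {p}: {classify_adf_pvalue(p)}")
```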

Calculating the Hedge Ratio

Before running the ADF test, you need to determine the hedge ratio (β)—the optimal proportion to balance the two assets in the pair.

Method: Run an Ordinary Least Squares (OLS) regression:

Asset A = β × Asset B + ε

β (beta) = hedge ratio

Example: KO vs PEP

Step 1: Collect Historical Prices
KO (Coca-Cola): Last 252 days of closing prices
PEP (PepsiCo): Last 252 days of closing prices

Step 2: Run OLS Regression (KO = β × PEP)
Using Python statsmodels or Excel LINEST function
Result: β = 1.12

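Interpretation:
For every 1 share of KO, you trade 1.12 shares of PEP on the opposite side to create a hedged pair. (Because the regression is run on prices, β is a share ratio, not a dollar ratio.)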

Step 3: Calculate Spread
Spread = KO - (1.12 × PEP)

Example values:
- KO = $60.50
- PEP = $180.20
- Spread = $60.50 - (1.12 × $180.20) = $60.50 - $201.82 = -$141.32

Step 4: Run ADF Test on Spread
ADF Test Result: p-value = 0.018 ✅
Interpretation: Spread is stationary (unit root rejected at the 5% significance level)
Conclusion: KO/PEP is a tradeable pair
      

Real Example: SPY/IWM Cointegration Test

Step | Process | Result
1. Data Collection | Download 1 year of daily prices for SPY and IWM | 252 trading days
2. Correlation Check | Calculate Pearson correlation | r = 0.94 (high correlation)
3. OLS Regression | SPY = β × IWM + ε | β = 2.68 (hedge ratio)
4. Spread Calculation | Spread = SPY - (2.68 × IWM) | 252 spread values
5. ADF Test | Test if spread is stationary | p = 0.0082 ✅ COINTEGRATED
6. Half-Life | How long for 50% mean reversion? | 8.2 days (fast reversion)
7. Conclusion | Is this pair tradeable? | YES - Trade the spread

🚨 Common Mistakes in Cointegration Testing

  • Using too short a lookback period: Need at least 6-12 months of data for reliable ADF test. 1-3 months = unreliable.
  • Not recalculating β regularly: Hedge ratios drift over time. Recalculate monthly or you'll trade the wrong spread.
  • Ignoring p-value drift: A pair cointegrated in January may decointegrate by June. Re-test monthly.
  • Testing only once: Cointegration is dynamic, not static. Rolling window tests (6-month rolling) are essential.
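
Recalculating β on a rolling window can be sketched with plain NumPy. The function below uses the same no-intercept regression form as the lesson (A = β × B + ε). The name `rolling_hedge_ratio`, the 126-day window (roughly six months of trading days), and the synthetic prices are all assumptions for illustration.

```python
import numpy as np

def rolling_hedge_ratio(asset_a, asset_b, window):
    """No-intercept OLS slope of A on B over each trailing window.

    Entries before the first full window are NaN.
    """
    a = np.asarray(asset_a, dtype=float)
    b = np.asarray(asset_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("price series must have the same length")
    betas = np.full(a.shape, np.nan)
    for end in range(window, len(a) + 1):
        a_win = a[end - window:end]
        b_win = b[end - window:end]
        betas[end - 1] = np.dot(a_win, b_win) / np.dot(b_win, b_win)
    return betas

# Synthetic pair whose true hedge ratio shifts from 2.0 to 2.5 halfway through.
rng = np.random.default_rng(0)
b = 50 + np.cumsum(rng.normal(0, 0.5, 300))
beta_true = np.where(np.arange(300) < 150, 2.0, 2.5)
a = beta_true * b + rng.normal(0, 0.2, 300)

betas = rolling_hedge_ratio(a, b, window=126)
print(f"beta after first full window: {betas[125]:.3f}")
print(f"beta at end of sample:        {betas[-1]:.3f}")
```

A hedge ratio fitted once on the first six months would still be near 2.0 at the end of the sample, so the traded spread would no longer be the hedged one.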
Part 3: Z-Score Trading Models—Entry & Exit Signals

Z-Score: The Trading Signal

Once you've confirmed cointegration (ADF p < 0.05), you need a trading signal to tell you WHEN to enter and exit. That signal is the z-score.

📐 Z-Score Formula

Z = (Current Spread - Mean Spread) / Standard Deviation of Spread

What it measures: How many standard deviations the current spread is from its historical mean.

  • Z = 0: Spread is at its mean (no trade)
  • Z = +2: Spread is 2 standard deviations ABOVE mean (Asset A expensive, Asset B cheap)
  • Z = -2: Spread is 2 standard deviations BELOW mean (Asset A cheap, Asset B expensive)

Z-Score Trading Frameworks

Conservative Z-Score Model

  • Entry: |Z| > 2.0 (spread is 2+ std devs from mean)
  • Exit: Z crosses back through 0 (mean reversion)
  • Stop Loss: |Z| > 3.5 (spread widening, cut losses)

Why Conservative:

  • If the spread is roughly normally distributed, |Z| > 2 occurs only about 5% of the time
  • Fewer false signals, higher win rate
  • Lower frequency (fewer trades)

Aggressive Z-Score Model

  • Entry: |Z| > 1.5 (more frequent signals)
  • Partial Exit: |Z| falls to 0.5 (take some profit)
  • Full Exit: Z reaches 0 (mean)
  • Stop Loss: |Z| > 3.0

Why Aggressive:

  • More trades = more opportunities
  • Lower threshold = earlier entries
  • Requires tighter risk management
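
The two frameworks differ only in their thresholds, so they can share one piece of position logic. The sketch below is one possible encoding: the names `ZScoreModel` and `next_position` are hypothetical, the partial exit at 0.5 is left out for brevity, and refusing to open a trade that is already beyond the stop is an added assumption rather than a rule from the lesson.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ZScoreModel:
    entry: float   # open a trade when |z| exceeds this
    stop: float    # close (or refuse to open) when |z| exceeds this

CONSERVATIVE = ZScoreModel(entry=2.0, stop=3.5)
AGGRESSIVE = ZScoreModel(entry=1.5, stop=3.0)

def next_position(z: float, position: int, model: ZScoreModel) -> int:
    """Return the new position: +1 long spread, -1 short spread, 0 flat."""
    if abs(z) > model.stop:
        return 0                      # stop loss, or too stretched to open
    if position == 0:
        if z > model.entry:
            return -1                 # spread rich: short A, long B
        if z < -model.entry:
            return 1                  # spread cheap: long A, short B
        return 0
    if position == 1 and z >= 0:
        return 0                      # long spread reverted to the mean
    if position == -1 and z <= 0:
        return 0                      # short spread reverted to the mean
    return position                   # otherwise keep holding

print(next_position(-1.6, 0, CONSERVATIVE))  # below the 2.0 entry: stay flat
print(next_position(-1.6, 0, AGGRESSIVE))    # beyond the 1.5 entry: long spread
```

The same z-score of -1.6 is a trade for the aggressive model and a non-event for the conservative one, which is the trade-off between signal frequency and signal quality described above.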

💡 Real-World Application

Spread = $5.20
Mean Spread = $5.00
Std Dev = $0.10

Z-score = ($5.20 - $5.00) / $0.10 = 2.0

Action: Asset A is relatively expensive, Asset B is relatively cheap

Trade: Short Asset A, Long Asset B (expect convergence)

Half-Life: How Long Until Reversion?

The half-life tells you how many periods it takes for the spread to revert halfway back to the mean. This informs your holding period.


📖 Calculating Half-Life

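Use an AR(1) model (autoregressive) on the spread, measured as its deviation from the mean:

Spread_t = θ × Spread_(t-1) + ε_t

Half-Life = -log(2) / log(θ)

Example: If θ = 0.95, Half-Life = -log(2) / log(0.95) ≈ 13.5 days (about 14)

Implication: Expect roughly half of the deviation to close within ~14 days. If the spread is still far from its mean after 30+ days, your model may be breaking down.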
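
The half-life formula is easy to check numerically. This is a minimal sketch; the function name `half_life` and the choice to return infinity when θ is outside (0, 1) are assumptions, since the formula only describes a decaying spread in that range.

```python
import math

def half_life(theta: float) -> float:
    """Periods for an AR(1) deviation to shrink by half: -log(2) / log(theta)."""
    if not 0.0 < theta < 1.0:
        return math.inf               # no steady decay toward the mean
    return -math.log(2) / math.log(theta)

for theta in (0.80, 0.90, 0.95, 0.99):
    hl = half_life(theta)
    remaining = theta ** hl           # fraction of the deviation left after hl periods
    print(f"theta = {theta:.2f}: half-life = {hl:5.1f} periods, remaining = {remaining:.2f}")
```

For θ = 0.95 this gives about 13.5 periods, which rounds to the roughly 14 days quoted above. Small changes in θ near 1 move the half-life a lot, which is one reason to re-estimate it regularly.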

Part 4: Position Sizing & Risk Management

How Much to Allocate Per Trade

Statistical arbitrage is NOT "set it and forget it." Spreads can widen before they converge (drawdowns), and some pairs decointegrate over time.

Kelly Criterion for Stat Arb

Use a modified Kelly approach:

f* = (p × b - q) / b

Where:

  • p = win rate (historical % of profitable mean reversions)
  • q = loss rate (1 - p)
  • b = average win / average loss
  • f* = optimal fraction of capital to risk

🚨 Never Risk More Than 2-5% Per Pair

Even if Kelly suggests 10%, stat arb models can fail (decointegration, regime change). Use half-Kelly or quarter-Kelly for safety:

Half-Kelly: f* / 2
Quarter-Kelly: f* / 4

This protects you from model breakdown while still capturing edge.
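
Here is a short sketch of the Kelly formula with the fractional multiplier and the per-pair cap applied. The function names and the example inputs (60% win rate, 1.5 win/loss ratio) are hypothetical, and the 5% default cap is taken from the upper end of the lesson's 2-5% range.

```python
def kelly_fraction(win_rate: float, win_loss_ratio: float) -> float:
    """Full Kelly: f* = (p * b - q) / b."""
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must be between 0 and 1")
    if win_loss_ratio <= 0:
        raise ValueError("win_loss_ratio must be positive")
    loss_rate = 1.0 - win_rate
    return (win_rate * win_loss_ratio - loss_rate) / win_loss_ratio

def sized_fraction(win_rate, win_loss_ratio, kelly_multiplier=0.5, cap=0.05):
    """Fractional Kelly, floored at zero and capped at the per-pair limit."""
    raw = kelly_fraction(win_rate, win_loss_ratio) * kelly_multiplier
    return max(0.0, min(raw, cap))

full = kelly_fraction(0.60, 1.5)
print(f"full Kelly:       {full:.1%}")
print(f"half Kelly:       {full / 2:.1%}")
print(f"after the 5% cap: {sized_fraction(0.60, 1.5):.1%}")
```

With these inputs full Kelly is about 33% and half Kelly about 17%, so the cap, not the formula, ends up setting the position size. A negative Kelly value means the historical stats show no edge, and the helper returns zero.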

Portfolio of Pairs: Diversification

Don't run just one pair. Build a portfolio of 5-10 cointegrated pairs across different sectors to reduce idiosyncratic risk.

Example Portfolio: KO/PEP (Consumer), XOM/CVX (Energy), JPM/BAC (Finance), AAPL/MSFT (Tech), UNH/CVS (Healthcare). 2% per pair, 10% total risk, diversified if one decointegrates.

Part 5: When Stat Arb Fails

Regime Changes & Decointegration

Statistical arbitrage works—until it doesn't. Pairs can decointegrate due to:

  • Corporate Actions: Mergers, spinoffs, dividend changes
  • Structural Shifts: One company changes business model
  • Macro Regime Change: Interest rate shocks, sector rotation
  • Liquidity Drying Up: Can't exit the position without slippage

🚨 LTCM Collapse (1998)

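Long-Term Capital Management, a relative-value fund whose partners included Nobel laureates Myron Scholes and Robert Merton, lost about $4.6B in 1998 when the Russian debt crisis pushed spreads far beyond their historical norms. Lesson: Models that assume normally distributed spreads understate tail risk, and black swan events break those assumptions. Always have stop losses.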

Stop Loss Rules for Stat Arb

  • Z-Score Stop: Exit if |Z| > 3.5 (spread widening, not reverting)
  • Time Stop: Exit if position is open > 2× half-life (model failing)
  • Dollar Stop: Exit if drawdown exceeds 50% of expected profit
  • Cointegration Breakdown: Exit if ADF p-value goes above 0.10 (losing cointegration)
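
The four stop rules can be checked together on every bar. The sketch below returns the list of rules that have triggered; the function name, argument names, and example numbers are hypothetical, while the thresholds are the ones listed above.

```python
def stop_reasons(z, days_open, half_life_days, drawdown, expected_profit, adf_p_value):
    """Return the names of every stop rule that currently says 'exit'."""
    reasons = []
    if abs(z) > 3.5:
        reasons.append("z-score stop")
    if days_open > 2 * half_life_days:
        reasons.append("time stop")
    if drawdown > 0.5 * expected_profit:
        reasons.append("dollar stop")
    if adf_p_value > 0.10:
        reasons.append("cointegration breakdown")
    return reasons

# A healthy open trade: nothing triggers.
print(stop_reasons(z=2.4, days_open=10, half_life_days=8.2,
                   drawdown=100.0, expected_profit=400.0, adf_p_value=0.03))

# A failing trade: every rule triggers.
print(stop_reasons(z=3.6, days_open=20, half_life_days=8.2,
                   drawdown=300.0, expected_profit=400.0, adf_p_value=0.12))
```

Any non-empty result means exit. Returning the reasons rather than a single flag makes it easier to log why a trade was closed.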
Part 6: Practical Implementation

Building Your Stat Arb System

Step-by-step process for running statistical arbitrage:

Step 1 - Discovery: Screen same-sector stocks with similar market cap and >0.7 correlation.

Step 2 - Test: ADF test on spread (6-12 months data). p<0.05 = pass, p>0.05 = reject.

Step 3 - Backtest: Entry |Z|>2, exit Z=0. Target: Sharpe >1.5, win rate >60%, half-life 5-30 days.

Step 4 - Monitor: Daily z-score, spread volatility, correlation. Re-run ADF monthly.

Pro Tip: Use rolling 6-month window to recalculate β monthly. Exit if p-value drifts >0.10 (decointegrating).
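
The discovery step (Step 1) can be sketched as a simple correlation screen. Several things here are assumptions: the lesson does not say whether the 0.7 threshold applies to prices or returns (this sketch uses daily log returns), the tickers `AAA`, `BBB`, and `CCC` are made up, and the prices are simulated. Pairs that pass still need the cointegration test in Step 2.

```python
from itertools import combinations

import numpy as np

def screen_pairs(prices, min_corr=0.7):
    """Return (ticker_x, ticker_y, correlation) for pairs above min_corr.

    prices: dict mapping ticker -> equal-length sequence of closing prices.
    """
    returns = {t: np.diff(np.log(np.asarray(p, dtype=float)))
               for t, p in prices.items()}
    candidates = []
    for x, y in combinations(list(prices), 2):
        corr = float(np.corrcoef(returns[x], returns[y])[0, 1])
        if corr > min_corr:
            candidates.append((x, y, round(corr, 3)))
    return sorted(candidates, key=lambda c: -c[2])

# Simulated data: AAA and BBB share a common driver, CCC is independent.
rng = np.random.default_rng(7)
common = rng.normal(0, 1, 250)
shocks = {
    "AAA": common + 0.3 * rng.normal(0, 1, 250),
    "BBB": common + 0.3 * rng.normal(0, 1, 250),
    "CCC": rng.normal(0, 1, 250),
}
prices = {t: 100 * np.exp(np.cumsum(0.01 * s)) for t, s in shocks.items()}

candidates = screen_pairs(prices)
print(candidates)
```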

Part 7: Advanced Strategies—Intraday Statistical Arbitrage

High-Frequency Mean Reversion (For Active Traders)

Most stat arb is swing trading (holding 5-30 days), but cointegration also works on intraday timeframes—if you have the infrastructure.

Requirements for Intraday Stat Arb

  • Low Latency Execution: Sub-second fills required (DMA or sponsored access)
  • High-Liquidity Pairs: SPY/IWM, QQQ/DIA, sector ETFs only
  • Tight Spreads: Commission + slippage must be <0.02% per trade
  • Automated System: Manual trading too slow for 5-minute half-lives
Example: SPY vs IWM Intraday Pairs

Setup: SPY/IWM, correlation 0.92 (5-min), ADF p=0.009✅, half-life 12 bars (60 min)

Rules: Entry |Z| > 1.8, exit when Z crosses 0 OR 90 min elapsed

Example (Nov 15, 2024): 10:35 AM z=-1.95 → Long 200 SPY, Short 530 IWM → 11:20 AM z=+0.12 → Exit +$142 (45 min, 0.12%)

Scalability: 8-12× per day, $50-200 each. Monthly: ~3-5% with minimal overnight risk.

Intraday Stat Arb Risks

Why Retail Fails: Commission overhead ($0.005/share × 2 legs), slippage on large size, 40-50% false signals from intraday noise, need algo platform.
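
To see how the commission overhead eats into a small intraday win, here is the arithmetic for the Nov 15 SPY/IWM example above. It assumes the quoted +$142 is before commissions, which the lesson does not state, and it ignores slippage entirely. The function name is hypothetical.

```python
def round_trip_commission(shares_per_leg, per_share_fee=0.005):
    """Commission for opening and closing every leg once."""
    return 2 * per_share_fee * sum(shares_per_leg)

spy_shares = 200
iwm_shares = 530                      # size used in the lesson's example
hedge_ratio_shares = round(spy_shares * 2.68)   # size implied by beta = 2.68

commission = round_trip_commission([spy_shares, iwm_shares])
gross_profit = 142.0                  # assumed to be before commissions
net_profit = gross_profit - commission

print(f"IWM shares implied by the hedge ratio: {hedge_ratio_shares}")
print(f"round-trip commission: ${commission:.2f}")
print(f"net profit: ${net_profit:.2f} ({commission / gross_profit:.1%} lost to commission)")
```

Commission alone takes roughly 5% of this winning trade. On the $50 trades at the low end of the quoted range, the same fee would be a much larger share, before any slippage or losing trades.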

Requirements: >$100K capital, professional routing (IBKR Pro, Lightspeed), automated execution.

Avoid if: Beginner, using Robinhood/Webull, or trading manually.

Part 8: Portfolio Construction for Stat Arb

Building a Diversified Stat Arb Book

Single-pair trading is risky (decointegration can wipe you out). Professional stat arb traders run portfolios of 10-30 pairs simultaneously.

Example Portfolio Allocation

Conservative ($100K)

5 pairs, 2% each: KO/PEP, JPM/BAC, XOM/CVX, UNH/CVS, WMT/TGT

Total Risk: 10% | Sharpe: 1.5-2.0 | Return: 12-18% | Max DD: -8% to -12%

Moderate ($100K)

10 pairs, 1.5% each: Daily (6): Conservative + MA/V, DIS/CMCSA | Intraday (4): SPY/IWM, QQQ/DIA, XLF/XLK, XLE/XLU

Total Risk: 15% | Sharpe: 2.0-2.5 | Return: 20-30% | Max DD: -12% to -18%

Aggressive ($100K)

20+ pairs, 1% each: 8 daily blue-chips + 12 intraday ETFs | 1.5× leverage

Total Risk: 20-30% | Sharpe: 2.5-3.5 | Return: 35-50% | Max DD: -20% to -30%

Warning: Requires full-time monitoring and professional execution

🚨 Correlation Between Pairs (Hidden Risk)

Bad: KO/PEP, MDLZ/GIS, K/CPB (all food)—sector crash kills ALL pairs.

Good: KO/PEP (consumer) + JPM/BAC (finance) + XOM/CVX (energy): pairs from different sectors are less likely to break down at the same time.

Rebalancing Your Stat Arb Portfolio

📖 Monthly Maintenance

1st of Every Month: Re-run ADF tests, recalculate hedge ratios, update z-score thresholds, remove p>0.10 pairs, add new pairs. Example: KO/PEP p-value drifts Jan 0.021✅ → Apr 0.124❌ = REMOVE.

Part 9: Code Implementation (Python Example)

Building Your Stat Arb System from Scratch

Here's a practical Python implementation for testing and trading pairs:

Step 1: Testing for Cointegration
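import pandas as pd
import numpy as np
from statsmodels.tsa.stattools import adfuller
from statsmodels.api import OLS

# Example: Test KO vs PEP cointegration
# Assumes both CSVs cover the same trading days in the same row order
ko = pd.read_csv('KO_daily.csv')['Close']
pep = pd.read_csv('PEP_daily.csv')['Close']

# Calculate hedge ratio using OLS regression (no intercept: KO = beta * PEP)
model = OLS(ko, pep).fit()
beta = model.params.iloc[0]  # .iloc: positional [0] on a Series is deprecated in pandas
print(f"Hedge Ratio (β): {beta:.4f}")

# Calculate spread
spread = ko - (beta * pep)

# Run ADF test on spread
# Note: adfuller's p-value treats beta as known. statsmodels' coint() uses
# critical values adjusted for an estimated beta and is the more conservative check.
adf_result = adfuller(spread)
p_value = adf_result[1]

if p_value < 0.05:
    print(f"✅ COINTEGRATED! (p-value: {p_value:.4f})")
else:
    print(f"❌ NOT cointegrated (p-value: {p_value:.4f})")

# Calculate half-life from the regression
#   spread_t - spread_(t-1) = theta * (spread_(t-1) - mean) + error
# Here theta is the AR(1) coefficient minus 1, so it is negative for a reverting spread.
def calculate_half_life(spread):
    spread_lag = (spread - spread.mean()).shift(1)
    spread_diff = spread.diff()
    model = OLS(spread_diff.iloc[1:], spread_lag.iloc[1:]).fit()
    theta = model.params.iloc[0]
    if theta >= 0 or theta <= -1:
        return np.inf  # no steady pull toward the mean, half-life undefined
    return -np.log(2) / np.log(1 + theta)

half_life = calculate_half_life(spread)
print(f"Half-Life: {half_life:.1f} days")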
Step 2: Calculating Z-Score Signals
# Calculate rolling z-score
lookback = 60  # 60-day rolling window

spread_mean = spread.rolling(window=lookback).mean()
spread_std = spread.rolling(window=lookback).std()
z_score = (spread - spread_mean) / spread_std

# Entry/exit signals
entry_threshold = 2.0
exit_threshold = 0.0
stop_threshold = 3.5

# Long spread signal (buy asset A, sell asset B)
long_entry = z_score < -entry_threshold
long_exit = z_score > -exit_threshold

# Short spread signal (sell asset A, buy asset B)
short_entry = z_score > entry_threshold
short_exit = z_score < exit_threshold

# Stop loss: spread has moved beyond the stop level in either direction
stop_loss = z_score.abs() > stop_threshold

latest_z = z_score.iloc[-1]
print(f"Current Z-Score: {latest_z:.2f}")
if latest_z > entry_threshold:
    print("🔴 SHORT SPREAD (sell KO, buy PEP)")
elif latest_z < -entry_threshold:
    print("🟢 LONG SPREAD (buy KO, sell PEP)")
else:
    print("⚪ NO TRADE (waiting for signal)")
Step 3: Backtesting the Strategy
# Simple backtest: P&L is in spread dollars per unit of spread and ignores costs.
# beta was fitted on the full sample, so these results are optimistic (look-ahead).
position = 0  # 0 = no position, 1 = long spread, -1 = short spread
entry_price = None
trades = []

for i in range(lookback, len(z_score)):
    z = z_score.iloc[i]
    if np.isnan(z):
        continue  # rolling window not filled yet

    # Entry logic
    if position == 0:
        if z < -entry_threshold:
            position = 1  # Long spread
            entry_price = spread.iloc[i]
        elif z > entry_threshold:
            position = -1  # Short spread
            entry_price = spread.iloc[i]

    # Exit logic
    elif position == 1:  # In long spread
        if z > -exit_threshold or z < -stop_threshold:
            exit_price = spread.iloc[i]
            profit = exit_price - entry_price
            trades.append(profit)
            position = 0

    elif position == -1:  # In short spread
        if z < exit_threshold or z > stop_threshold:
            exit_price = spread.iloc[i]
            profit = entry_price - exit_price
            trades.append(profit)
            position = 0

# Calculate statistics (guarded so an empty trade list does not divide by zero)
wins = [t for t in trades if t > 0]
losses = [t for t in trades if t < 0]
win_rate = len(wins) / len(trades) if trades else float('nan')
avg_win = np.mean(wins) if wins else 0.0
avg_loss = np.mean(losses) if losses else 0.0
profit_factor = sum(wins) / abs(sum(losses)) if losses else float('inf')

print(f"Total Trades: {len(trades)}")
print(f"Win Rate: {win_rate:.1%}")
print(f"Avg Win: ${avg_win:.2f}")
print(f"Avg Loss: ${avg_loss:.2f}")
print(f"Profit Factor: {profit_factor:.2f}")
print(f"Total PnL: ${sum(trades):.2f}")

Live Trading Tips: Paper trade 1-3 months first. Start 0.5-1% sizes. Monitor slippage (>0.05% = reduce size). Log every trade: z-score entry/exit, hold time, P&L.

Key Takeaways

💡 Statistical Arbitrage Checklist

  • Cointegration > Correlation: Always validate with ADF test (p < 0.05)
  • Z-Score Entry: |Z| > 2.0 for conservative, > 1.5 for aggressive
  • Stop Losses: Exit if |Z| > 3.5 or position open > 2× half-life
  • Position Sizing: Never more than 2-5% per pair (use Kelly/half-Kelly)
  • Diversify Pairs: Run 5-10 pairs across different sectors
  • Monitor Continuously: Re-test cointegration monthly, exit if breaking down
  • Portfolio Construction: Ensure pairs are uncorrelated with each other (sector diversity)
  • Intraday Trading: Only with professional execution and automation

Statistical arbitrage is a quantitative edge—but it requires discipline. If you execute this correctly, you're trading with mathematical precision instead of hope.

Test Your Understanding

Q1: What was Derek's fatal mistake that led to his $31,400 loss trading ARKK/QQQ?

A) He used too much leverage on his positions
B) He confused correlation (0.89) with cointegration and never ran an ADF test
C) He traded with too small position sizes and couldn't recover losses
D) He exited winners too early instead of holding through reversions

Correct: B. Derek assumed 0.89 correlation meant mean-reversion. He never ran an ADF test. Correlation measures past similarity; cointegration validates a statistically stable equilibrium.

Q2: For a pairs trading strategy using statistical arbitrage, what ADF test p-value threshold indicates a valid cointegrated pair?

A) p-value > 0.10 (spread is non-stationary)
B) p-value < 0.05 (spread is stationary and mean-reverting)
C) p-value > 0.50 (high statistical confidence)
D) p-value = 1.0 (perfect cointegration)

Correct: B. ADF p-value < 0.05 means the random-walk hypothesis is rejected at the 5% significance level, so the spread is treated as stationary (mean-reverting). If p-value > 0.05, don't trade it as stat arb.

Q3: According to the lesson's conservative z-score strategy, when should you enter a statistical arbitrage pairs trade?

A) When |z-score| > 1.0 (spread is 1 standard deviation from mean)
B) When |z-score| > 2.0 (spread is 2+ standard deviations from mean)
C) When |z-score| > 0.5 (any deviation from equilibrium)
D) When z-score returns to 0 (spread at mean)

Correct: B. Conservative entry: |z| > 2.0 (2+ std devs from mean). Exit when z returns to 0. Stop out if |z| > 3.5 (spread may be de-cointegrating).

Q4: What's the critical difference between correlation and cointegration?

A) Correlation is for stocks, cointegration is for futures only
B) Correlation requires mathematical tests, cointegration can be eyeballed
C) Correlation measures past directional similarity; cointegration means the spread has a mean-reverting equilibrium relationship
D) There's no difference—they're the same concept with different names

Correct: C. Correlation = "moved together historically." Cointegration = "spread has mean-reverting equilibrium." Highly correlated pairs can drift apart permanently. Only cointegrated pairs can be traded as stat arb.

Q5: According to the lesson's risk management guidelines, what's the maximum position size per pair in statistical arbitrage?

A) 10-15% per pair (concentrated positions for maximum edge)
B) 2-5% per pair (diversified across 5-10 pairs)
C) 25-30% per pair (all-in on best setups)
D) 0.5-1% per pair (ultra-conservative sizing)

Correct: B. Max 2-5% per pair using Kelly/half-Kelly sizing. Run 5-10 pairs across sectors to diversify. If one pair de-cointegrates, it won't destroy your account.

Related Lessons

⏭️ Coming Up Next

Lesson #64: Macro Regime Framework — Understand how regime changes affect your stat arb models and adapt your strategies accordingly.
