Randomized Algorithms
Randomness as a design tool, not a sign of uncertainty. Monte Carlo algorithms trade a small error probability for guaranteed polynomial time. Las Vegas algorithms are always correct but run in expected polynomial time. Three canonical problems — matrix multiplication verification, Quicksort, and order-statistics selection — show how randomness defeats adversarial inputs and yields elegant expected-time analyses.
Monte Carlo vs Las Vegas
Monte Carlo algorithm: always runs in guaranteed (worst-case) polynomial time; may return an incorrect answer with small, bounded probability. Freivalds’ checker is Monte Carlo.
Las Vegas algorithm: always produces correct output; running time is a random variable with expected polynomial value. Randomized Quicksort and Randomized Select are Las Vegas.
The distinction matters: Monte Carlo algorithms are used when a small error probability is acceptable (e.g., result verification, probabilistic primality testing); Las Vegas algorithms are used when correctness is non-negotiable (e.g., sorting).
Problem 1 — Freivalds’ Matrix Multiplication Checker
Given $n \times n$ matrices $A$, $B$, $C$ over $\mathbb{F}_2$ (arithmetic mod 2), check whether $AB = C$ without computing the full product. Naively: $O(n^3)$. Freivalds’ algorithm: $O(n^2)$ per trial.
1. Choose $r \in \{0,1\}^n$ uniformly at random (each bit independently $1/2$).
2. Compute $A(Br)$ and $Cr$ — each is a matrix-vector multiply: $O(n^2)$.
3. Output YES if $A(Br) = Cr$, else NO.
Correctness: If $AB = C$: $A(Br) = (AB)r = Cr$ always → $\Pr[\text{YES}] = 1$.
If $AB \neq C$: Let $D = AB - C \neq 0$. We need $\Pr[Dr = 0] \leq 1/2$.
Proof that $\Pr[Dr = 0] \leq 1/2$ when $D \neq 0$: Since $D \neq 0$, there exists some entry $d_{ij} \neq 0$. Fix $v = e_j$ (the $j$-th unit vector); then $Dv \neq 0$, since $(Dv)_i = d_{ij} \neq 0$. For any $r$ with $Dr = 0$, let $r' = r + v$ (mod 2). Then $Dr' = Dr + Dv = 0 + Dv \neq 0$. The map $r \mapsto r + v$ is an injection from $\{r : Dr = 0\}$ into $\{r' : Dr' \neq 0\}$, so at most half of the $2^n$ vectors $r$ satisfy $Dr = 0$. Thus $\Pr[Dr = 0] \leq 1/2$. $\square$
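The injection argument can be checked by brute force for small $n$. The snippet below (an illustrative sketch, not part of the proof; parameters are arbitrary) enumerates every $r \in \{0,1\}^n$ for a few random nonzero $D$ over $\mathbb{F}_2$ and confirms that at most half satisfy $Dr = 0$.

```python
import itertools
import numpy as np

# For several random nonzero D over GF(2), enumerate all r in {0,1}^n
# and count how often Dr = 0 (mod 2). The fraction never exceeds 1/2.
rng = np.random.default_rng(0)
n = 6
for _ in range(20):
    D = rng.integers(0, 2, size=(n, n))
    if not D.any():
        continue  # the argument needs D != 0
    zero_count = sum(
        1 for bits in itertools.product([0, 1], repeat=n)
        if not ((D @ np.array(bits)) % 2).any()
    )
    assert zero_count / 2**n <= 0.5
print("Pr[Dr = 0] <= 1/2 held for all sampled nonzero D")
```

Over GF(2) the set $\{r : Dr = 0\}$ is the null space of $D$, a subspace of dimension at most $n-1$ when $D \neq 0$, which is another way to see the $\leq 1/2$ bound.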
Error reduction: Repeat $k$ times independently. $\Pr[\text{false YES after } k \text{ trials}] \leq (1/2)^k$. For $k = 30$: error $\leq 10^{-9}$.
import numpy as np

def freivalds_check(A, B, C, k=20):
    """
    Monte Carlo check: does A @ B == C?
    O(k * n^2) time. Error probability <= 2^(-k).
    If AB == C: always returns True.
    If AB != C: returns False with probability >= 1 - 2^(-k).
    Works over any field; here we use integers (exact arithmetic).
    """
    n = len(A)
    for _ in range(k):
        # Random binary vector r in {0,1}^n
        r = np.random.randint(0, 2, size=(n, 1))
        # Compute A(Br) and Cr — two matrix-vector products: O(n^2)
        Br = B @ r      # n x 1
        ABr = A @ Br    # n x 1
        Cr = C @ r      # n x 1
        if not np.array_equal(ABr, Cr):
            return False  # definite NO: AB != C
    return True  # probably YES (error <= 2^(-k))

def freivalds_mod2(A, B, C, k=30):
    """
    Freivalds over GF(2) (mod 2 arithmetic) — the original formulation.
    Even faster: bitwise operations.
    """
    n = len(A)
    A, B, C = (np.array(M) % 2 for M in (A, B, C))
    for _ in range(k):
        r = np.random.randint(0, 2, size=(n, 1))
        Br = B @ r % 2
        ABr = A @ Br % 2
        Cr = C @ r % 2
        if not np.array_equal(ABr, Cr):
            return False
    return True
# Example: verify Strassen result
import numpy as np
n = 64
A = np.random.randint(0, 10, (n, n))
B = np.random.randint(0, 10, (n, n))
C_correct = A @ B
C_wrong = C_correct.copy(); C_wrong[0, 0] += 1
print(freivalds_check(A, B, C_correct)) # True (always)
print(freivalds_check(A, B, C_wrong))    # False (with prob >= 1 - 2^-20)

// Freivalds' algorithm — O(k * n^2) Monte Carlo checker
#include <vector>
#include <random>
#include <cstdint>
using namespace std;
using Mat = vector<vector<long long>>;
// Matrix-vector product over the integers (exact arithmetic)
vector<long long> mat_vec(const Mat& M, const vector<long long>& v) {
int n = M.size();
vector<long long> res(n, 0);
for (int i = 0; i < n; i++)
for (int j = 0; j < n; j++)
res[i] += M[i][j] * v[j];
return res;
}
bool freivalds(const Mat& A, const Mat& B, const Mat& C, int k = 20) {
int n = A.size();
mt19937 rng(random_device{}());
bernoulli_distribution coin(0.5);
for (int trial = 0; trial < k; ++trial) {
// Random binary vector r
vector<long long> r(n);
for (auto& ri : r) ri = coin(rng);
// Compute A(Br) and Cr in O(n^2)
auto Br = mat_vec(B, r);
auto ABr = mat_vec(A, Br);
auto Cr = mat_vec(C, r);
for (int i = 0; i < n; i++)
if (ABr[i] != Cr[i]) return false; // definite NO
}
return true; // YES with probability >= 1 - 2^(-k)
}

Problem 2 — Randomized Quicksort
Deterministic Quicksort with a fixed pivot (first or last element) runs in $\Theta(n^2)$ on sorted input — easily triggered by any adversary who knows the pivot rule. Randomized Quicksort chooses the pivot uniformly at random, making the expected runtime $\Theta(n\log n)$ for every input.
“Paranoid” Quicksort. To get a cleaner recurrence, repeat pivot selection until we get a good pivot: one where both $|L| \leq \frac{3}{4}n$ and $|G| \leq \frac{3}{4}n$. A uniformly random pivot is good with probability $\geq 1/2$ (it lands in the middle half of the sorted array). Expected number of trials before a good pivot: $\mathbb{E}[\text{trials}] \leq 2$ (geometric with $p \geq 1/2$). Recurrence: $$T(n) \leq T\left(\frac{n}{4}\right) + T\left(\frac{3n}{4}\right) + 2cn,$$ where the $2cn$ term covers the expected $\leq 2$ partition attempts per level.
Recursion tree: at every level, the total work is at most $2cn$. Maximum depth $= \log_{4/3} n$ (each subproblem is at most $3/4$ the size of its parent). Total: $T(n) = \Theta(n\log n)$.
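The depth bound is easy to verify numerically (a quick sketch; the value of $n$ is arbitrary): repeatedly shrinking a subproblem to the worst size a good pivot allows, $3/4$ of its parent, reaches size 1 within $\lceil \log_{4/3} n \rceil$ levels.

```python
import math

# Worst shrink a "good" pivot allows is a factor of 3/4 per level,
# so recursion depth is bounded by ceil(log_{4/3} n).
n = 10**6
size, depth = n, 0
while size > 1:
    size = (3 * size) // 4   # largest subproblem after a good pivot
    depth += 1
print(depth, "<=", math.ceil(math.log(n, 4/3)))
```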
Problem 3 — Randomized Select ($O(n)$ Expected)
Find the $i$-th smallest element in an unsorted array. Median of Medians (Ch1) gives $O(n)$ worst-case. Randomized Select gives $O(n)$ expected time with a much simpler implementation.
Proof by substitution. The expected-time recurrence (always recursing on the larger side) is $T(n) \leq n + \frac{2}{n}\sum_{k=\lceil n/2\rceil}^{n-1} T(k)$. Guess $T(n) \leq cn$. Using $\sum_{k=\lceil n/2\rceil}^{n-1} k \leq \frac{3n^2}{8}$: $$T(n) \leq n + \frac{2}{n}\sum_{k=\lceil n/2\rceil}^{n-1} ck \leq n + \frac{2c}{n} \cdot \frac{3n^2}{8} = n + \frac{3cn}{4} = n\left(1 + \frac{3c}{4}\right)$$ Choose $c = 4$: $T(n) \leq n(1 + 3) = 4n = cn$. $\square$
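The bound can also be sanity-checked numerically (a sketch, taking the recurrence exactly as stated with base case $T(1) = 1$): iterate it bottom-up and confirm $T(n) \leq 4n$ for every $n$ tested.

```python
import math

# Iterate T(n) = n + (2/n) * sum_{k=ceil(n/2)}^{n-1} T(k) bottom-up,
# using a prefix-sum array so each step is O(1), and check T(n) <= 4n.
N = 2000
T = [0.0] * (N + 1)
T[1] = 1.0
prefix = [0.0, T[1]]              # prefix[m] = T[1] + ... + T[m]
for n in range(2, N + 1):
    lo = math.ceil(n / 2)
    s = prefix[n - 1] - prefix[lo - 1]   # sum of T[lo..n-1]
    T[n] = n + (2 / n) * s
    prefix.append(prefix[-1] + T[n])
    assert T[n] <= 4 * n, (n, T[n])
print(f"T({N}) = {T[N]:.1f} <= {4 * N}")
```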
import random

# —— RANDOMIZED QUICKSORT —— O(n log n) expected, O(n^2) worst-case ——
def randomized_quicksort(A, lo=0, hi=None):
    """
    In-place randomized quicksort. Las Vegas: always correct.
    Expected O(n log n) for any input (random pivot defeats adversary).
    The O(n^2) worst case occurs only with vanishingly small probability.
    """
    if hi is None: hi = len(A) - 1
    if lo >= hi: return
    # Random pivot: defeats adversarial sorted/reverse-sorted input
    pivot_idx = random.randint(lo, hi)
    A[pivot_idx], A[hi] = A[hi], A[pivot_idx]  # move pivot to end
    # Partition (Lomuto scheme): O(n)
    pivot = A[hi]
    i = lo - 1
    for j in range(lo, hi):
        if A[j] <= pivot:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i+1], A[hi] = A[hi], A[i+1]
    q = i + 1
    randomized_quicksort(A, lo, q - 1)
    randomized_quicksort(A, q + 1, hi)
# —— PARANOID QUICKSORT —— guaranteed balanced partition ——————————————
def paranoid_quicksort(A, lo=0, hi=None):
    """
    Retry pivot until both sides <= 3/4 n.
    E[retries per level] <= 2. Total: O(n log n) expected.
    Recurrence: T(n) <= T(n/4) + T(3n/4) + 2cn.
    """
    if hi is None: hi = len(A) - 1
    if lo >= hi: return
    n = hi - lo + 1
    while True:
        pivot_idx = random.randint(lo, hi)
        A[pivot_idx], A[hi] = A[hi], A[pivot_idx]
        pivot = A[hi]
        i = lo - 1
        for j in range(lo, hi):
            if A[j] <= pivot:
                i += 1
                A[i], A[j] = A[j], A[i]
        A[i+1], A[hi] = A[hi], A[i+1]
        q = i + 1
        left_size = q - lo
        right_size = hi - q
        if left_size <= 3*n//4 and right_size <= 3*n//4:
            break  # good pivot found
    paranoid_quicksort(A, lo, q - 1)
    paranoid_quicksort(A, q + 1, hi)
# —— RANDOMIZED SELECT —— O(n) expected ———————————————————————————————
def randomized_select(A, i, lo=0, hi=None):
    """
    Find i-th smallest (1-indexed) in A[lo..hi].
    Las Vegas: always correct. Expected O(n) time.
    """
    if hi is None: hi = len(A) - 1
    if lo == hi: return A[lo]
    # Random pivot, Lomuto partition
    pivot_idx = random.randint(lo, hi)
    A[pivot_idx], A[hi] = A[hi], A[pivot_idx]
    pivot = A[hi]
    j = lo - 1
    for k in range(lo, hi):
        if A[k] <= pivot:
            j += 1
            A[j], A[k] = A[k], A[j]
    A[j+1], A[hi] = A[hi], A[j+1]
    q = j + 1
    rank = q - lo + 1  # rank of pivot within A[lo..hi]
    if i == rank:
        return A[q]
    elif i < rank:
        return randomized_select(A, i, lo, q - 1)
    else:
        return randomized_select(A, i - rank, q + 1, hi)

// Randomized Quicksort + Randomized Select
#include <vector>
#include <random>
#include <algorithm>
using namespace std;
mt19937 rng(random_device{}());
// Lomuto partition around A[hi], returns pivot index
int partition(vector<int>& A, int lo, int hi) {
int pivot = A[hi], i = lo - 1;
for (int j = lo; j < hi; j++)
if (A[j] <= pivot) swap(A[++i], A[j]);
swap(A[i+1], A[hi]);
return i + 1;
}
// Randomized Quicksort — O(n log n) expected
void rqs(vector<int>& A, int lo, int hi) {
if (lo >= hi) return;
// Random pivot defeats adversarial inputs
swap(A[uniform_int_distribution<int>(lo,hi)(rng)], A[hi]);
int q = partition(A, lo, hi);
rqs(A, lo, q-1);
rqs(A, q+1, hi);
}
// Randomized Select — O(n) expected, O(n^2) worst-case
// Returns i-th smallest (0-indexed) in A[lo..hi]
int rselect(vector<int>& A, int lo, int hi, int i) {
if (lo == hi) return A[lo];
swap(A[uniform_int_distribution<int>(lo,hi)(rng)], A[hi]);
int q = partition(A, lo, hi);
int rank = q - lo;
if (i == rank) return A[q];
if (i < rank) return rselect(A, lo, q-1, i);
return rselect(A, q+1, hi, i - rank - 1);
}

Expected vs Worst-Case: The Key Distinction
| Algorithm | Type | Expected | Worst-Case | Correct? |
|---|---|---|---|---|
| Freivalds (1 trial) | Monte Carlo | $O(n^2)$ | $O(n^2)$ | One-sided error $\leq 1/2$ |
| Rand. Quicksort | Las Vegas | $O(n\log n)$ | $O(n^2)$ | Always |
| Paranoid Quicksort | Las Vegas | $O(n\log n)$ | $O(n^2)$ | Always |
| Rand. Select | Las Vegas | $O(n)$ | $O(n^2)$ | Always |
| Median of Medians | Deterministic | $O(n)$ | $O(n)$ | Always |
import time, random, sys
sys.setrecursionlimit(100000)

def det_quicksort(A, lo=0, hi=None):
    """Deterministic quicksort (last-element pivot) — O(n^2) on sorted input."""
    if hi is None: hi = len(A) - 1
    if lo >= hi: return
    pivot = A[hi]  # FIXED pivot — adversary's target
    i = lo - 1
    for j in range(lo, hi):
        if A[j] <= pivot:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i+1], A[hi] = A[hi], A[i+1]
    q = i + 1
    det_quicksort(A, lo, q - 1)
    det_quicksort(A, q + 1, hi)

def benchmark(n=5000):
    sorted_input = list(range(n))  # adversarial for det_quicksort
    # Deterministic: O(n^2) on sorted input
    A = sorted_input.copy()
    t0 = time.perf_counter()
    try:
        det_quicksort(A)
        t_det = time.perf_counter() - t0
    except RecursionError:
        t_det = float('inf')
    # Randomized: O(n log n) expected on any input
    A = sorted_input.copy()
    t0 = time.perf_counter()
    randomized_quicksort(A)
    t_rand = time.perf_counter() - t0
    print(f"n={n} sorted input:")
    print(f"  Deterministic: {t_det*1000:.1f}ms  (O(n^2))")
    print(f"  Randomized:    {t_rand*1000:.1f}ms  (O(n log n) expected)")
    if t_det != float('inf'):
        print(f"  Speedup: {t_det/t_rand:.1f}x")

benchmark(3000)
# Typical output: Deterministic: ~80ms, Randomized: ~2ms, Speedup: ~40x

// Benchmark: deterministic vs randomized quicksort on adversarial input
#include <vector>
#include <numeric>
#include <chrono>
#include <iostream>
using namespace std;
// Deterministic quicksort (last-element pivot) — O(n^2) on sorted input
void det_qs(vector<int>& A, int lo, int hi) {
if (lo >= hi) return;
int pivot = A[hi], i = lo-1;
for (int j=lo; j<hi; j++)
if (A[j] <= pivot) swap(A[++i], A[j]);
swap(A[i+1], A[hi]);
int q = i+1;
det_qs(A, lo, q-1);
det_qs(A, q+1, hi);
}
void benchmark(int n) {
vector<int> sorted_v(n);
iota(sorted_v.begin(), sorted_v.end(), 0); // adversarial input
// Deterministic
auto A = sorted_v;
auto t0 = chrono::high_resolution_clock::now();
det_qs(A, 0, n-1);
auto t_det = chrono::duration<double>(
chrono::high_resolution_clock::now()-t0).count();
// Randomized
A = sorted_v;
t0 = chrono::high_resolution_clock::now();
rqs(A, 0, n-1);
auto t_rand = chrono::duration<double>(
chrono::high_resolution_clock::now()-t0).count();
cout << "n=" << n << " det=" << t_det*1000 << "ms"
<< " rand=" << t_rand*1000 << "ms"
<< " speedup=" << t_det/t_rand << "x\n";
}

At TTT, when constructing a NIFTY options strategy at expiry, we need: (1) find the ATM strike (nearest to spot) — a median-like query; (2) verify that a proposed volatility surface matrix $V$ satisfies $V = AB$ for given factor matrices $A$, $B$ (a Freivalds-style check before calibration).
import numpy as np

# Available NIFTY strikes (unsorted, as received from feed)
strikes = [24500, 23000, 23500, 24000, 25000, 22500, 21000, 23250]
spot = 23400

# Find ATM strike: the strike with minimum |s - spot|
# Equivalent to a rank-1 selection on the transformed array
diffs = [abs(s - spot) for s in strikes]
# Randomized select to find the minimum difference in O(n) expected
min_diff = randomized_select(diffs.copy(), 1)  # 1st smallest diff
atm = strikes[diffs.index(min_diff)]
print(f"ATM strike: {atm}")  # 23500 (closest to spot 23400)

# Freivalds check: verify vol surface factorisation V = A @ B
# (Monte Carlo — O(n^2) instead of O(n^3) full multiplication)
# Integer matrices keep the arithmetic exact. Over floats, the checker's
# exact-equality test must be replaced by a tolerance (np.allclose),
# because A @ (B @ r) and (A @ B) @ r round differently.
n = 50  # 50 expiries x 50 strikes
A = np.random.randint(-5, 6, (n, n))
B = np.random.randint(-5, 6, (n, n))
V = A @ B  # correct surface
V_wrong = V.copy(); V_wrong[0, 0] += 1  # single-entry perturbation
print(freivalds_check(A, B, V))        # True (always correct)
print(freivalds_check(A, B, V_wrong))  # False with prob >= 1 - 2^-20
Python’s list.sort() uses Timsort (deterministic, $O(n\log n)$ worst-case), so random.shuffle() followed by sort is an anti-pattern: the shuffle adds work and buys nothing. The real value of randomized algorithms in quant finance is in Monte Carlo simulation (pricing path-dependent options, VaR estimation) and randomized linear algebra (sketching for large covariance matrices, randomized SVD). Freivalds’ checker is used in distributed matrix computations to verify results from untrusted workers without redoing the $O(n^3)$ computation. In TTT’s context: verifying that a numerically computed factor decomposition of the implied vol surface satisfies $V \approx AB$ before using it for Greeks calculation.
Freivalds’ algorithm has error probability $\leq 1/2$ per trial when $AB \neq C$. If we run $k = 30$ independent trials and all return YES, what is the probability of that outcome when in fact $AB \neq C$?
If $AB \neq C$, each trial independently outputs YES with probability $\leq 1/2$ (by the Freivalds bijection argument). For all $k$ trials to output YES when $AB \neq C$: $\Pr[\text{all YES} \mid AB \neq C] \leq (1/2)^k$. For $k=30$: $(1/2)^{30} = 1/2^{30} \approx 9.3 \times 10^{-10}$. This is far smaller than the probability of hardware error or cosmic ray bit-flips. Freivalds is used in practice for this reason.
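The per-trial bound is easy to observe empirically. The sketch below (illustrative; the matrix size, seed, and trial count are arbitrary choices) flips a single bit of a correct product over GF(2) and measures the single-trial false-YES rate; for a single-entry error the rate is exactly $1/2$, since $Dr = 0$ iff the corresponding bit of $r$ is 0.

```python
import numpy as np

# Single-trial false-YES rate of Freivalds over GF(2) on a wrong product.
rng = np.random.default_rng(42)
n, trials = 32, 20000
A = rng.integers(0, 2, (n, n))
B = rng.integers(0, 2, (n, n))
C = (A @ B) % 2
C[0, 0] ^= 1  # flip one bit, so AB != C and D has a single nonzero entry
false_yes = 0
for _ in range(trials):
    r = rng.integers(0, 2, (n, 1))
    if np.array_equal((A @ (B @ r)) % 2, (C @ r) % 2):
        false_yes += 1
print(f"single-trial false-YES rate: {false_yes / trials:.3f}")  # ~0.5
```

Thirty independent trials then compound this to $(1/2)^{30}$, matching the bound above.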
In Paranoid Quicksort, a pivot is “good” if both $|L|$ and $|G|$ are at most $\frac{3}{4}n$. What fraction of elements are good pivots, and what is the expected number of trials before finding one?
An element is a bad pivot if it falls in the bottom $n/4$ or top $n/4$ of the sorted order: $n/2$ bad pivots in total. So the $n/2$ elements of the middle half are the good pivots. The probability that a uniformly random pivot is good is $(n/2)/n = 1/2$. Since trials are independent with success probability $\geq 1/2$, the number of trials is geometrically distributed: $\mathbb{E}[\text{trials}] \leq 1/(1/2) = 2$. Each trial costs $O(n)$ (one partition pass), so the expected cost per level is $\leq 2cn = O(n)$.
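A small simulation (a sketch; the array size, seed, and run count are arbitrary) confirms the expectation: sampling pivot ranks uniformly until one lands in the middle half takes about 2 tries on average.

```python
import random

# Estimate E[trials] until a uniformly random pivot rank falls in the
# middle half [n/4, 3n/4) of an n-element array. Geometric with p = 1/2.
random.seed(1)
n, runs = 1000, 20000
total = 0
for _ in range(runs):
    trials = 1
    while not (n // 4 <= random.randrange(n) < 3 * n // 4):
        trials += 1
    total += trials
print(f"mean trials ~ {total / runs:.2f}")  # close to the expected 2
```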
Randomized Select is $O(n)$ expected but $O(n^2)$ worst-case. What sequence of events would cause the $O(n^2)$ worst case?
Since the pivot is chosen uniformly at random regardless of input, the worst case is not triggered by any particular input; it is triggered by an unlucky sequence of random choices. Specifically: if at every recursion level we happen to pick the minimum element as pivot, the partition produces $|L|=0$, $|G|=n-1$, and we recurse on a problem of size $n-1$: $T(n) = T(n-1) + O(n) = O(n^2)$. The probability of this happening at every level is $\frac{1}{n} \cdot \frac{1}{n-1} \cdots \frac{1}{2} = 1/n!$, astronomically small. The expected time is still $O(n)$; the $O(n^2)$ worst case almost never occurs in practice.
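To see how unlikely deep recursions are, this sketch (parameters arbitrary) simulates the pessimistic chain of subproblem sizes, always recursing on the larger side of a uniformly random pivot: even the deepest of 1000 runs stays near $\log n$ levels, nowhere near the $n$ levels of the true worst case.

```python
import random

# Simulate Randomized Select's subproblem sizes, pessimistically keeping
# the larger side at every level, and record the recursion depth.
random.seed(0)
n0 = 10**6
depths = []
for _ in range(1000):
    n, depth = n0, 0
    while n > 1:
        q = random.randrange(n)      # pivot rank (0-indexed)
        n = max(q, n - 1 - q)        # larger side survives
        depth += 1
    depths.append(depth)
print(f"max depth over 1000 runs: {max(depths)} (n = {n0})")
```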
You need to check if a graph has a perfect matching. You have a Monte Carlo algorithm that runs in $O(n^2)$ and errs with probability $\leq 1/n$. You also have a Las Vegas algorithm that runs in $O(n^3)$ expected time and is always correct. For $n = 1000$ and a time budget of $10^7$ operations, which should you use?
Las Vegas costs $O(n^3) = 10^9$ operations for $n=1000$, $100\times$ over budget. Monte Carlo costs $O(n^2) = 10^6$ per trial; 10 trials cost $10^7$ operations, exactly the budget. After 10 trials, the error probability is $\leq (1/n)^{10} = (10^{-3})^{10} = 10^{-30}$, orders of magnitude smaller than the probability of a hardware failure ($\sim 10^{-15}$ per bit-hour). In practice, the result is “correct enough” for any real-world decision. This is the fundamental Monte Carlo trade-off: accept an exponentially small error probability in exchange for speed.
Work through these independently: Q1–3 are direct application, Q4–7 synthesis, Q8–10 careful argument.
Time np.allclose(A@B, C) for $n=500$ matrices. How many Freivalds trials can you run in the time NumPy takes for one full multiplication? What does this say about the practical value of the Monte Carlo approach? Medium