Intermediate Chapter 06 of 13 · Algorithm Design and Analysis

Randomized Algorithms

Randomness as a design tool, not a sign of uncertainty. Monte Carlo algorithms trade a small error probability for guaranteed polynomial time. Las Vegas algorithms are always correct but run in expected polynomial time. Three canonical problems — matrix multiplication verification, Quicksort, and order-statistics selection — show how randomness defeats adversarial inputs and yields elegant expected-time analyses.

Monte Carlo vs Las Vegas

Monte Carlo algorithm: always runs in polynomial time; output is correct with high probability. Error probability can be made exponentially small by repetition.

Las Vegas algorithm: always produces correct output; running time is a random variable with expected polynomial value. Randomized Quicksort and Randomized Select are Las Vegas.

The distinction matters: Monte Carlo algorithms are used when a small error is acceptable (e.g., checking, probabilistic primality testing). Las Vegas algorithms are used when correctness is non-negotiable (e.g., sorting).
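The split can be seen on a toy problem (a minimal sketch, not one of the chapter's canonical examples; both function names are introduced here): given a bit array in which half the positions hold 1, return the index of any 1.

```python
import random

def las_vegas_find_one(bits):
    """Las Vegas: always correct; running time is the random variable.
    Expected #probes = 2 when half the bits are 1."""
    while True:
        i = random.randrange(len(bits))
        if bits[i] == 1:
            return i

def monte_carlo_find_one(bits, k=30):
    """Monte Carlo: at most k probes; gives up (wrongly) with
    probability <= (1/2)^k when half the bits are 1."""
    for _ in range(k):
        i = random.randrange(len(bits))
        if bits[i] == 1:
            return i
    return None   # bounded time, small chance of error

bits = [1, 0] * 1000
print(bits[las_vegas_find_one(bits)])   # 1, always
```

Repetition drives the Monte Carlo error down exponentially, exactly the mechanism exploited in the Freivalds analysis that follows.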

Problem 1 — Freivalds’ Matrix Multiplication Checker

Given $n \times n$ matrices $A$, $B$, $C$ over $\mathbb{F}_2$ (arithmetic mod 2), check whether $AB = C$ without computing the full product. Naively: $O(n^3)$. Freivalds’ algorithm: $O(n^2)$ per trial.

Freivalds’ Algorithm:
1. Choose $r \in \{0,1\}^n$ uniformly at random (each bit independently $1/2$).
2. Compute $A(Br)$ and $Cr$ — each is a matrix-vector multiply: $O(n^2)$.
3. Output YES if $A(Br) = Cr$, else NO.

Correctness: If $AB = C$: $A(Br) = (AB)r = Cr$ always → $\Pr[\text{YES}] = 1$.
If $AB \neq C$: Let $D = AB - C \neq 0$. We need $\Pr[Dr = 0] \leq 1/2$.

Proof that $\Pr[Dr = 0] \leq 1/2$ when $D \neq 0$: Since $D \neq 0$, there exists some entry $d_{ij} \neq 0$. Fix $v = e_j$ (the $j$-th unit vector); then $Dv \neq 0$, since $(Dv)_i = d_{ij} \neq 0$. For any $r$ with $Dr = 0$, let $r' = r + v$. Then $Dr' = Dr + Dv = Dv \neq 0$. The map $r \mapsto r + v$ is injective, so it matches each $r$ with $Dr = 0$ to a distinct $r'$ with $Dr' \neq 0$. Hence at most half of all $r$ satisfy $Dr = 0$, i.e. $\Pr[Dr = 0] \leq 1/2$. $\square$
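For small $n$ the bound can be verified exhaustively over $\{0,1\}^n$ (a quick sketch; `miss_fraction` is a name introduced here, not from the chapter's code):

```python
import itertools
import numpy as np

def miss_fraction(D):
    """Fraction of r in {0,1}^n with Dr = 0 (mod 2), by brute force."""
    n = D.shape[1]
    zero = sum(
        not np.any((D @ np.array(bits)) % 2)
        for bits in itertools.product([0, 1], repeat=n)
    )
    return zero / 2 ** n

# Rank-1 D: the bound is tight -- exactly half the vectors miss
print(miss_fraction(np.array([[1, 0], [0, 0]])))                    # 0.5
# Higher rank: even fewer vectors miss (2^(-rank) in general)
print(miss_fraction(np.array([[1, 0, 1], [0, 1, 1], [0, 0, 0]])))   # 0.25
```

In general the miss fraction is $2^{-\mathrm{rank}(D)}$, so $1/2$ is the worst case, attained at rank 1.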

Error reduction: Repeat $k$ times independently. $\Pr[\text{false YES after } k \text{ trials}] \leq (1/2)^k$. For $k = 30$: error $\leq 10^{-9}$.
Freivalds’ checker · O(n²) Monte Carlo, error ≤ 2⁻ᵏ
import numpy as np

def freivalds_check(A, B, C, k=20):
    """
    Monte Carlo check: does A @ B == C?
    O(k * n^2) time. Error probability <= 2^(-k).

    If AB == C: always returns True.
    If AB != C: returns False with probability >= 1 - 2^(-k).

    Works over any field; here we use integers (exact arithmetic).
    """
    n = len(A)
    for _ in range(k):
        # Random binary vector r in {0,1}^n
        r = np.random.randint(0, 2, size=(n, 1))

        # Compute A(Br) and Cr — two matrix-vector products: O(n^2)
        Br  = B @ r         # n x 1
        ABr = A @ Br        # n x 1
        Cr  = C @ r         # n x 1

        if not np.array_equal(ABr, Cr):
            return False    # definite NO: AB != C
    return True             # probably YES (error <= 2^(-k))


def freivalds_mod2(A, B, C, k=30):
    """
    Freivalds over GF(2) (mod 2 arithmetic) — the original formulation.
    Even faster: bitwise operations.
    """
    n = len(A)
    A, B, C = (np.array(M) % 2 for M in (A, B, C))
    for _ in range(k):
        r   = np.random.randint(0, 2, size=(n, 1))
        Br  = B @ r % 2
        ABr = A @ Br % 2
        Cr  = C @ r % 2
        if not np.array_equal(ABr, Cr):
            return False
    return True


# Example: verify a claimed product (e.g. from Strassen) without recomputing it
n = 64
A = np.random.randint(0, 10, (n, n))
B = np.random.randint(0, 10, (n, n))
C_correct = A @ B
C_wrong   = C_correct.copy(); C_wrong[0, 0] += 1

print(freivalds_check(A, B, C_correct))  # True  (always)
print(freivalds_check(A, B, C_wrong))    # False (with prob >= 1 - 2^-20)
// Freivalds' algorithm — O(k * n^2) Monte Carlo checker
#include <vector>
#include <random>
#include <cstdint>
using namespace std;
using Mat = vector<vector<long long>>;

// Matrix-vector product over exact integers
vector<long long> mat_vec(const Mat& M, const vector<long long>& v) {
    int n = M.size();
    vector<long long> res(n, 0);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            res[i] += M[i][j] * v[j];
    return res;
}

bool freivalds(const Mat& A, const Mat& B, const Mat& C, int k = 20) {
    int n = A.size();
    mt19937 rng(random_device{}());
    bernoulli_distribution coin(0.5);

    for (int trial = 0; trial < k; ++trial) {
        // Random binary vector r
        vector<long long> r(n);
        for (auto& ri : r) ri = coin(rng);

        // Compute A(Br) and Cr in O(n^2)
        auto Br  = mat_vec(B, r);
        auto ABr = mat_vec(A, Br);
        auto Cr  = mat_vec(C, r);

        for (int i = 0; i < n; i++)
            if (ABr[i] != Cr[i]) return false;  // definite NO
    }
    return true;   // YES with probability >= 1 - 2^(-k)
}

Problem 2 — Randomized Quicksort

Deterministic Quicksort with a fixed pivot (first or last element) runs in $\Theta(n^2)$ on sorted input — easily triggered by any adversary who knows the pivot rule. Randomized Quicksort chooses the pivot uniformly at random, making the expected runtime $\Theta(n\log n)$ for every input.

Randomized Quicksort analysis via indicator variables (CLRS). For elements $z_i < z_j$ (in sorted order), let $X_{ij} = 1$ if $z_i$ and $z_j$ are ever compared. Total comparisons $= \sum_{i < j} X_{ij}$. $z_i$ and $z_j$ are compared iff one of them is chosen as pivot before any element strictly between them. This happens with probability $\frac{2}{j-i+1}$. Expected comparisons: $$\mathbb{E}[\text{comparisons}] = \sum_{i=1}^{n}\sum_{j=i+1}^{n} \frac{2}{j-i+1} \leq 2\sum_{k=1}^{n}\frac{n}{k+1} = O(n\log n)$$
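The closed-form expectation can be checked against actual comparison counts (a sketch; `quicksort_comparisons` is a helper introduced here, counting comparisons inside the Lomuto partition loop):

```python
import random

def quicksort_comparisons(A):
    """Randomized quicksort on a copy of A; returns the number of
    element-vs-pivot comparisons made in the partition loops."""
    A, count = list(A), 0
    def sort(lo, hi):
        nonlocal count
        if lo >= hi:
            return
        p = random.randint(lo, hi)          # random pivot
        A[p], A[hi] = A[hi], A[p]
        pivot, i = A[hi], lo - 1
        for j in range(lo, hi):             # Lomuto partition
            count += 1
            if A[j] <= pivot:
                i += 1
                A[i], A[j] = A[j], A[i]
        A[i + 1], A[hi] = A[hi], A[i + 1]
        sort(lo, i)
        sort(i + 2, hi)
    sort(0, len(A) - 1)
    return count

n, runs = 200, 300
formula = sum(2 / (j - i + 1) for i in range(1, n) for j in range(i + 1, n + 1))
mean = sum(quicksort_comparisons(range(n)) for _ in range(runs)) / runs
print(f"formula {formula:.0f} vs empirical mean {mean:.0f}")  # they agree closely
```

The formula is the exact expectation for this implementation: every comparison in the partition loop is between the pivot and another element, and each pair is compared at most once over the whole run.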

“Paranoid” Quicksort. To get a cleaner recurrence, repeat pivot selection until we get a good pivot: one where both $|L| \leq \frac{3}{4}n$ and $|G| \leq \frac{3}{4}n$. A uniformly random pivot is good with probability $> 1/2$ (the middle half of the sorted array). Expected number of trials before a good pivot: $\mathbb{E}[\text{trials}] \leq 2$ (geometric with $p > 1/2$). Recurrence:

$$T(n) \leq T\!\left(\frac{n}{4}\right) + T\!\left(\frac{3n}{4}\right) + 2cn$$

Recursion tree: at every level, the total work is at most $2cn$ (the subproblems at one level partition at most $n$ elements). Maximum depth $= \log_{4/3} n$, since every subproblem is at most $3/4$ the size of its parent. Total: $T(n) = \Theta(n\log n)$.
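A quick simulation confirms the pivot-retry claim (a sketch; `trials_until_good` is a name introduced here): a uniform pivot lands in the middle half with probability $1/2$, so the trial count is geometric with mean $2$.

```python
import random

def trials_until_good(n):
    """Draw pivot ranks until one lands in the middle half of [0, n)."""
    t = 0
    while True:
        t += 1
        rank = random.randrange(n)          # rank of a uniform random pivot
        if n // 4 <= rank < 3 * n // 4:     # middle half: both sides <= 3n/4
            return t

n, runs = 1024, 50_000
mean = sum(trials_until_good(n) for _ in range(runs)) / runs
print(f"mean trials: {mean:.2f}")           # close to 2 (geometric, p = 1/2)
```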

Problem 3 — Randomized Select ($O(n)$ Expected)

Find the $i$-th smallest element in an unsorted array. Median of Medians (Ch1) gives $O(n)$ worst-case. Randomized Select gives $O(n)$ expected time with a much simpler implementation.

Randomized Select (CLRS): pick a random pivot, partition, recurse on the side containing rank $i$. Expected time $T(n) = O(n)$.

Proof by substitution. $T(n) \leq n + \frac{2}{n}\sum_{k=\lceil n/2\rceil}^{n-1} T(k)$. Guess $T(n) \leq cn$. Then: $$T(n) \leq n + \frac{2}{n}\sum_{k=\lceil n/2\rceil}^{n-1} ck \leq n + \frac{2c}{n} \cdot \frac{3n^2}{8} = n + \frac{3cn}{4} = n\left(1 + \frac{3c}{4}\right)$$ Choose $c = 4$: $T(n) \leq n(1 + 3) = 4n = cn$. $\square$
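Both the sum bound and the resulting $T(n) \leq 4n$ can be sanity-checked by iterating the recurrence numerically (a quick sketch, not part of the chapter's code):

```python
import math

# Bound used in the proof: sum_{k=ceil(n/2)}^{n-1} k <= 3n^2/8
n = 1000
assert sum(range(math.ceil(n / 2), n)) <= 3 * n * n / 8   # 374750 <= 375000

# Iterate T(m) = m + (2/m) * sum_{k=ceil(m/2)}^{m-1} T(k), check T(m) <= 4m
T = [0.0, 1.0]                 # T[0] unused, T[1] = 1
for m in range(2, 2001):
    T.append(m + (2 / m) * sum(T[math.ceil(m / 2):m]))
assert all(T[m] <= 4 * m for m in range(1, 2001))
print(T[2000] / 2000)          # constant factor, below the proof's c = 4
```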
Randomized Quicksort + Randomized Select · O(n log n) / O(n) expected
import random

# ── RANDOMIZED QUICKSORT ── O(n log n) expected, O(n^2) worst-case ──
def randomized_quicksort(A, lo=0, hi=None):
    """
    In-place randomized quicksort. Las Vegas: always correct.
    Expected O(n log n) for any input (random pivot defeats adversary).
    Worst case O(n^2) only with vanishingly small probability.
    """
    if hi is None: hi = len(A) - 1
    if lo >= hi: return

    # Random pivot: defeats adversarial sorted/reverse-sorted input
    pivot_idx = random.randint(lo, hi)
    A[pivot_idx], A[hi] = A[hi], A[pivot_idx]   # move pivot to end

    # Partition (Lomuto scheme): O(n)
    pivot = A[hi]
    i = lo - 1
    for j in range(lo, hi):
        if A[j] <= pivot:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i+1], A[hi] = A[hi], A[i+1]
    q = i + 1

    randomized_quicksort(A, lo, q - 1)
    randomized_quicksort(A, q + 1, hi)


# ── PARANOID QUICKSORT ── guaranteed balanced partition ──────────────
def paranoid_quicksort(A, lo=0, hi=None):
    """
    Retry pivot until both sides <= 3/4 n.
    E[retries per level] <= 2. Total: O(n log n) expected.
    Recurrence: T(n) <= T(n/4) + T(3n/4) + 2cn.
    """
    if hi is None: hi = len(A) - 1
    if lo >= hi: return

    n = hi - lo + 1
    while True:
        pivot_idx = random.randint(lo, hi)
        A[pivot_idx], A[hi] = A[hi], A[pivot_idx]
        pivot = A[hi]
        i = lo - 1
        for j in range(lo, hi):
            if A[j] <= pivot:
                i += 1
                A[i], A[j] = A[j], A[i]
        A[i+1], A[hi] = A[hi], A[i+1]
        q = i + 1
        left_size  = q - lo
        right_size = hi - q
        if left_size <= 3*n//4 and right_size <= 3*n//4:
            break   # good pivot found

    paranoid_quicksort(A, lo, q - 1)
    paranoid_quicksort(A, q + 1, hi)


# ── RANDOMIZED SELECT ── O(n) expected ───────────────────────────────
def randomized_select(A, i, lo=0, hi=None):
    """
    Find i-th smallest (1-indexed) in A[lo..hi].
    Las Vegas: always correct. Expected O(n) time.
    """
    if hi is None: hi = len(A) - 1
    if lo == hi: return A[lo]

    # Random pivot, Lomuto partition
    pivot_idx = random.randint(lo, hi)
    A[pivot_idx], A[hi] = A[hi], A[pivot_idx]
    pivot = A[hi]
    j = lo - 1
    for k in range(lo, hi):
        if A[k] <= pivot:
            j += 1
            A[j], A[k] = A[k], A[j]
    A[j+1], A[hi] = A[hi], A[j+1]
    q = j + 1

    rank = q - lo + 1        # rank of pivot within A[lo..hi]
    if i == rank:
        return A[q]
    elif i < rank:
        return randomized_select(A, i, lo, q - 1)
    else:
        return randomized_select(A, i - rank, q + 1, hi)
// Randomized Quicksort + Randomized Select
#include <vector>
#include <random>
#include <algorithm>
using namespace std;

mt19937 rng(random_device{}());

// Lomuto partition around A[hi], returns pivot index
int partition(vector<int>& A, int lo, int hi) {
    int pivot = A[hi], i = lo - 1;
    for (int j = lo; j < hi; j++)
        if (A[j] <= pivot) swap(A[++i], A[j]);
    swap(A[i+1], A[hi]);
    return i + 1;
}

// Randomized Quicksort โ€” O(n log n) expected
void rqs(vector<int>& A, int lo, int hi) {
    if (lo >= hi) return;
    // Random pivot defeats adversarial inputs
    swap(A[uniform_int_distribution<int>(lo,hi)(rng)], A[hi]);
    int q = partition(A, lo, hi);
    rqs(A, lo, q-1);
    rqs(A, q+1, hi);
}

// Randomized Select โ€” O(n) expected, O(n^2) worst-case
// Returns i-th smallest (0-indexed) in A[lo..hi]
int rselect(vector<int>& A, int lo, int hi, int i) {
    if (lo == hi) return A[lo];
    swap(A[uniform_int_distribution<int>(lo,hi)(rng)], A[hi]);
    int q    = partition(A, lo, hi);
    int rank = q - lo;
    if (i == rank) return A[q];
    if (i <  rank) return rselect(A, lo, q-1, i);
    return rselect(A, q+1, hi, i - rank - 1);
}

Expected vs Worst-Case: The Key Distinction

Algorithm           | Type          | Expected      | Worst-Case | Correct?
Freivalds (1 trial) | Monte Carlo   | $O(n^2)$      | $O(n^2)$   | w.h.p. only
Rand. Quicksort     | Las Vegas     | $O(n\log n)$  | $O(n^2)$   | Always
Paranoid Quicksort  | Las Vegas     | $O(n\log n)$  | $O(n^2)$   | Always
Rand. Select        | Las Vegas     | $O(n)$        | $O(n^2)$   | Always
Median of Medians   | Deterministic | $O(n)$        | $O(n)$     | Always
Comparison: deterministic vs randomized Quicksort on adversarial inputs
import time, random, sys
sys.setrecursionlimit(100000)

def det_quicksort(A, lo=0, hi=None):
    """Deterministic quicksort (last element pivot) โ€” O(n^2) on sorted input."""
    if hi is None: hi = len(A) - 1
    if lo >= hi: return
    pivot = A[hi]                        # FIXED pivot — adversary's target
    i = lo - 1
    for j in range(lo, hi):
        if A[j] <= pivot:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i+1], A[hi] = A[hi], A[i+1]
    q = i + 1
    det_quicksort(A, lo, q - 1)
    det_quicksort(A, q + 1, hi)

def benchmark(n=5000):
    sorted_input = list(range(n))        # adversarial for det_quicksort

    # Deterministic: O(n^2) on sorted input
    A = sorted_input.copy()
    t0 = time.perf_counter()
    try:
        det_quicksort(A)
        t_det = time.perf_counter() - t0
    except RecursionError:
        t_det = float('inf')

    # Randomized: O(n log n) expected on any input
    A = sorted_input.copy()
    t0 = time.perf_counter()
    randomized_quicksort(A)
    t_rand = time.perf_counter() - t0

    print(f"n={n} sorted input:")
    print(f"  Deterministic: {t_det*1000:.1f}ms  (O(n^2))")
    print(f"  Randomized:    {t_rand*1000:.1f}ms  (O(n log n) expected)")
    if t_det != float('inf'):
        print(f"  Speedup: {t_det/t_rand:.1f}x")

benchmark(3000)
# Typical output: Deterministic: ~80ms, Randomized: ~2ms, Speedup: ~40x
// Benchmark: deterministic vs randomized quicksort on adversarial input
#include <vector>
#include <numeric>
#include <chrono>
#include <iostream>
using namespace std;

// Deterministic quicksort (last-element pivot) โ€” O(n^2) on sorted input
void det_qs(vector<int>& A, int lo, int hi) {
    if (lo >= hi) return;
    int pivot = A[hi], i = lo-1;
    for (int j=lo; j<hi; j++)
        if (A[j] <= pivot) swap(A[++i], A[j]);
    swap(A[i+1], A[hi]);
    int q = i+1;
    det_qs(A, lo, q-1);
    det_qs(A, q+1, hi);
}

void benchmark(int n) {
    vector<int> sorted_v(n);
    iota(sorted_v.begin(), sorted_v.end(), 0);   // adversarial input

    // Deterministic
    auto A = sorted_v;
    auto t0 = chrono::high_resolution_clock::now();
    det_qs(A, 0, n-1);
    auto t_det = chrono::duration<double>(
        chrono::high_resolution_clock::now()-t0).count();

    // Randomized
    A = sorted_v;
    t0 = chrono::high_resolution_clock::now();
    rqs(A, 0, n-1);
    auto t_rand = chrono::duration<double>(
        chrono::high_resolution_clock::now()-t0).count();

    cout << "n=" << n << " det=" << t_det*1000 << "ms"
         << " rand=" << t_rand*1000 << "ms"
         << " speedup=" << t_det/t_rand << "x\n";
}
Worked Example · Randomized Order-Statistic for Option Strikes 🇮🇳 TTT Systems

At TTT, when constructing a NIFTY options strategy at expiry, we need: (1) find the ATM strike (nearest to spot) — a median-like query; (2) verify that a proposed volatility surface matrix $V$ satisfies $V = AB$ for given factor matrices $A$, $B$ (a Freivalds-style check before calibration).

Use Randomized Select for ATM strike, Freivalds for matrix check
import numpy as np

# Available NIFTY strikes (unsorted, as received from feed)
strikes = [24500, 23000, 23500, 24000, 25000, 22500, 21000, 23250]
spot    = 23400

# Find ATM strike: nearest to spot = find strike with minimum |s - spot|
# Equivalent to selecting rank of spot in the transformed array
diffs  = [abs(s - spot) for s in strikes]
# Randomized select to find minimum difference in O(n) expected
min_diff = randomized_select(diffs.copy(), 1)   # 1st smallest diff
atm_idx  = diffs.index(min_diff)
atm      = strikes[atm_idx]
print(f"ATM strike: {atm}")    # 23500 (closest to spot 23400)

# Freivalds check: verify vol surface factorisation V = A @ B
# (Monte Carlo — O(k n^2) instead of the O(n^3) full multiplication)
# freivalds_check compares exactly, so use integer factors here; for float
# matrices, compare A @ (B @ r) against V @ r with np.allclose instead
# (floating-point rounding breaks exact equality even when V == A @ B).
n   = 50   # 50 expiries x 50 strikes
A   = np.random.randint(-5, 6, (n, n))
B   = np.random.randint(-5, 6, (n, n))
V   = A @ B                    # correct surface
V_wrong = V.copy(); V_wrong[0, 0] += 1   # single-entry error

print(freivalds_check(A, B, V))        # True  (always)
print(freivalds_check(A, B, V_wrong))  # False with prob >= 1 - 2^-20
Result: ATM strike found in $O(n)$ expected time (vs $O(n\log n)$ for sorting). Vol surface check in $O(kn^2)$ vs $O(n^3)$ for the full multiply — with $k = 20$ trials at $n = 50$ that is an $n/k = 2.5\times$ saving, an advantage that grows linearly with $n$, for error probability $\leq 2^{-20} \approx 10^{-6}$.
Python’s built-in list.sort() uses Timsort (deterministic, $O(n\log n)$ worst-case). random.shuffle() followed by sort is an anti-pattern — unnecessary. The real value of randomized algorithms in quant finance is in Monte Carlo simulation (pricing path-dependent options, VaR estimation) and randomized linear algebra (sketching for large covariance matrices, randomized SVD). Freivalds’ checker is used in distributed matrix computations to verify results from untrusted workers without re-doing the $O(n^3)$ computation. In TTT’s context: verifying that a numerically computed factor decomposition of the implied vol surface satisfies $V \approx AB$ before using it for Greeks calculation.

Practice Problems
4 questions · Chapter 06
prog / daa / ch06 / q01 ★ Freivalds Error

Freivalds’ algorithm has error probability $\leq 1/2$ per trial when $AB \neq C$. After running $k = 30$ independent trials and all return YES, what is the probability that $AB \neq C$?

A
$30/2 = 15\%$ — the error accumulates linearly
B
$\leq (1/2)^{30} \approx 10^{-9}$ — independent trials multiply probabilities
C
$1/2$ — each trial is independent so the last trial’s error is $1/2$
D
$0$ — after 30 trials without a NO, the algorithm is certain $AB = C$
Answer: B — $\leq (1/2)^{30} \approx 10^{-9}$.

If $AB \neq C$, each trial independently outputs YES with probability $\leq 1/2$ (by the Freivalds bijection argument). For all $k$ trials to output YES when $AB \neq C$: $\Pr[\text{all YES} \mid AB \neq C] \leq (1/2)^k$. For $k=30$: $(1/2)^{30} = 1/2^{30} \approx 9.3 \times 10^{-10}$. This is far smaller than the probability of hardware error or cosmic ray bit-flips. Freivalds is used in practice for this reason.
prog / daa / ch06 / q02 ★★ Good Pivot Probability

In Paranoid Quicksort, a pivot is “good” if both $|L|$ and $|G|$ are at most $\frac{3}{4}n$. What fraction of elements are good pivots, and what is the expected number of trials before finding one?

A
$1/4$ of elements are good pivots; expected $4$ trials
B
$1/2$ of elements are good pivots; expected $\leq 2$ trials
C
$3/4$ of elements are good pivots; expected $4/3$ trials
D
Depends on the input; no fixed bound on the fraction
Answer: B — $1/2$ of elements; expected $\leq 2$ trials.

An element is a bad pivot if it falls in the bottom $n/4$ or top $n/4$ of the sorted order. That’s $n/2$ bad pivots total. So $n/2$ elements are good pivots — exactly the middle half. Probability of a uniformly random pivot being good $= n/2 \div n = 1/2$. Since trials are independent with success probability $\geq 1/2$, the number of trials follows a geometric distribution: $\mathbb{E}[\text{trials}] \leq 1/(1/2) = 2$. Each trial costs $O(n)$ (partition step), so expected cost per level $= 2 \cdot cn = O(n)$.
prog / daa / ch06 / q03 ★★ Randomized Select

Randomized Select is $O(n)$ expected but $O(n^2)$ worst-case. What sequence of events would cause the $O(n^2)$ worst case?

A
Any sorted input, since the pivot is always the minimum or maximum
B
A deterministic input that triggers the random number generator to behave periodically
C
The random pivot being the minimum (or maximum) element at every single recursion level — this has probability $1/n!$ and gives $T(n) = T(n-1) + O(n) = O(n^2)$
D
Calling select on an already-sorted array; the partition degenerates just like in deterministic quicksort
Answer: C.

Since the pivot is chosen uniformly at random regardless of input, the worst case is not triggered by any particular input — it’s triggered by an unlucky sequence of random choices. Specifically: if at every recursion level we happen to pick the minimum element as pivot, the partition produces $|L|=0$, $|G|=n-2$. We recurse on a problem of size $n-1$: $T(n) = T(n-1) + O(n) = O(n^2)$. The probability of this happening for all $n$ levels is $(1/n) \cdot (1/(n-1)) \cdots (1/2) = 1/n!$ — astronomically small. The expected time is still $O(n)$; the $O(n^2)$ worst case almost never occurs in practice.
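A simulation makes the point concrete (a sketch; `select_work` is a helper introduced here that counts elements scanned across partition steps): the observed work stays within a small multiple of $n$, nowhere near $n^2$.

```python
import random

def select_work(A, i):
    """Randomized Select for the i-th smallest (1-indexed) on a copy of A;
    returns total elements scanned across all partition steps."""
    A, t = list(A), i - 1                  # t = target index in sorted order
    lo, hi, work = 0, len(A) - 1, 0
    while lo < hi:
        work += hi - lo                    # partition scans hi - lo elements
        p = random.randint(lo, hi)
        A[p], A[hi] = A[hi], A[p]
        pivot, j = A[hi], lo - 1
        for k in range(lo, hi):            # Lomuto partition
            if A[k] <= pivot:
                j += 1
                A[j], A[k] = A[k], A[j]
        A[j + 1], A[hi] = A[hi], A[j + 1]
        q = j + 1
        if q == t:
            break
        if q > t:
            hi = q - 1
        else:
            lo = q + 1
    return work

n = 2000
runs = [select_work(random.sample(range(n), n), n // 2) for _ in range(200)]
print(max(runs) / n, sum(runs) / len(runs) / n)  # small constants, not ~n
```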
prog / daa / ch06 / q04 ★★★ Monte Carlo vs Las Vegas

You need to check if a graph has a perfect matching. You have a Monte Carlo algorithm that runs in $O(n^2)$ and errs with probability $\leq 1/n$. You also have a Las Vegas algorithm that runs in $O(n^3)$ expected time and is always correct. For $n = 1000$ and a time budget of $10^7$ operations, which should you use?

A
Always Las Vegas — correctness must never be compromised
B
Monte Carlo: run it $k = \lceil\log_2 n\rceil = 10$ times ($10 \times 10^6 = 10^7$ ops, within budget), reducing error to $(1/n)^{10} \approx 10^{-30}$, far more reliable than hardware
C
Las Vegas: $O(n^3) = 10^9$ operations, exceeds budget — but correctness requires it
D
Neither — Monte Carlo errors are unacceptable and Las Vegas is too slow
Answer: B — use Monte Carlo with $k=10$ repetitions.

Las Vegas costs $O(n^3) = 10^9$ operations for $n=1000$ — $100\times$ over budget. Monte Carlo costs $O(n^2) = 10^6$ per trial. Running 10 trials costs $10^7$ operations (exactly the budget). After 10 trials, error $\leq (1/n)^{10} = (10^{-3})^{10} = 10^{-30}$. This is orders of magnitude smaller than the probability of a hardware failure ($\sim 10^{-15}$ per bit-hour). In practice, the result is “correct enough” for any real-world decision. This is the fundamental Monte Carlo trade-off: pay in reliability (with exponentially negligible error) to gain in speed.

Terminal Questions — Chapter 06 · 10 problems · No answers given

Work through independently. Q1–3 direct application. Q4–7 synthesis. Q8–10 careful argument.

1
Run Freivalds’ algorithm by hand on the $2\times2$ matrices $A = \begin{pmatrix}1&1\\0&1\end{pmatrix}$, $B = \begin{pmatrix}1&0\\1&1\end{pmatrix}$, $C = \begin{pmatrix}3&1\\1&1\end{pmatrix}$ (wrong), using $r = (1,1)^T$. Does the algorithm catch the error? Now try $r=(1,0)^T$. What does this tell you about worst-case $r$? Easy
2
Trace Randomized Select on $A=[7,2,5,1,9,3,6,8,4]$ to find the 5th smallest element, using pivot index 3 at the first call. Show the partition result and the recursive call. What is the answer? Easy
3
Construct an input of size $n=8$ for which deterministic Quicksort (last-element pivot) makes $\Theta(n^2)$ comparisons. Then show that Randomized Quicksort on the same input has $O(n\log n)$ expected comparisons by computing $\mathbb{E}[\text{comparisons}] = \sum_{i<j} 2/(j-i+1)$ for $n=8$. Easy
4
Implement a randomized primality test (Miller-Rabin) in Python. It is a Monte Carlo algorithm: primes are always declared prime, and each random witness exposes a composite with probability $\geq 3/4$. How many witnesses do you need to reduce the error to below $10^{-20}$? Test on $n = 561$ (a Carmichael number that fools Fermat’s test). Medium
5
Prove the indicator variable analysis of Randomized Quicksort: $\mathbb{E}[\text{comparisons}] = \sum_{i=1}^n\sum_{j=i+1}^n \frac{2}{j-i+1} \leq 2n\ln n$. Show the harmonic sum bound explicitly. Medium
6
Implement and benchmark Freivalds’ checker vs NumPy’s np.allclose(A@B, C) for $n=500$ matrices. How many Freivalds trials can you run in the time NumPy takes for one full multiplication? What does this say about the practical value of the Monte Carlo approach? Medium
7
Design a randomized algorithm for the following problem: given $n$ numbers, estimate the number of distinct values in $O(n)$ time and $O(1)$ space (the HyperLogLog idea). Use a random hash function and track the maximum number of leading zeros in any hash value. Argue why the expected estimate is correct within a constant factor. Medium
8
Complete the Randomized Select expected-time proof: starting from $T(n) \leq n + \frac{2}{n}\sum_{k=\lceil n/2 \rceil}^{n-1}T(k)$, use substitution $T(n) \leq cn$ and show $c=4$ works. Fill in all the algebra for the sum $\sum_{k=n/2}^{n-1} k \leq 3n^2/8$. Hard
9
The Schwartz-Zippel lemma generalises Freivalds: for a non-zero multivariate polynomial $P(x_1,\ldots,x_n)$ of total degree $d$ and a random point $r$ from a set $S$, $\Pr[P(r)=0] \leq d/|S|$. Show that Freivalds’ algorithm is a special case with $P = D_{ij}$ for the entry-wise check, and that the bijection argument matches the $d=1$, $|S|=2$ case. Hard
10
Prove that no deterministic comparison-based sorting algorithm can achieve $o(n\log n)$ comparisons in the worst case (the comparison lower bound). Now explain why Randomized Quicksort’s $O(n\log n)$ expected bound does not contradict this lower bound. What input would trigger the worst case for Randomized Quicksort, and with what probability? Hard