Complexity · Chapter 11 of 13 · Algorithm Design and Analysis

NP-Completeness & Approximation Algorithms

NP-completeness is a proof that a problem is at least as hard as every other problem in NP. Once you know a problem is NP-complete, the right response is not despair but strategy: approximation algorithms find provably near-optimal solutions in polynomial time. This chapter builds the NP-completeness chain from 3SAT through Vertex Cover and Set Cover, then proves approximation ratios for each.

Part I — Complexity Classes

P = decision problems solvable in polynomial time $O(n^k)$ for some constant $k$.

NP = decision problems where YES answers have polynomial-length certificates verifiable in polynomial time. Equivalently: problems solvable by a nondeterministic polynomial-time algorithm.

NP-hard: problem $X$ is NP-hard if every problem $Y \in \text{NP}$ reduces to $X$ in polynomial time. If $X \in P$ then $P = NP$.

NP-complete: $X$ is NP-complete if $X \in \text{NP}$ and $X$ is NP-hard. The most important class in computational complexity.

$P \subseteq NP$: any polynomial-time algorithm is also a polynomial-time verifier (the certificate is just empty). Whether $P = NP$ remains the greatest open problem in computer science.

Karp Reductions

A Karp reduction from $A$ to $B$ is a polynomial-time function $R$ such that $A(x) = B(R(x))$ for all inputs $x$. We write $A \leq_P B$. Implications: if $B \in P$ then $A \in P$; if $A$ is NP-hard then $B$ is NP-hard.

Proving NP-completeness: two-step recipe.
Step 1 — Show $X \in \text{NP}$: exhibit a polynomial-time verifier. Given a certificate (candidate solution), can we verify it in polynomial time?
Step 2 — Show $X$ is NP-hard: give a polynomial-time reduction $R$ from a known NP-complete problem $Y$ to $X$. Show: (a) $R$ runs in polynomial time; (b) if $x$ is a YES instance of $Y$ then $R(x)$ is a YES instance of $X$; (c) if $R(x)$ is a YES instance of $X$ then $x$ is a YES instance of $Y$.

The NP-Completeness Chain

3SAT (Cook 1971): Given a CNF formula where each clause has exactly 3 literals, is there a satisfying assignment? 3SAT is NP-complete (the first — every NP problem can be compiled to a circuit, which converts to a 3-CNF formula). Certificate: variable assignment; verifier: evaluate formula.

3-Dimensional Matching (3DM): Given disjoint sets $X, Y, Z$ of $n$ elements each and triples $T \subseteq X\times Y\times Z$, is there $S \subseteq T$ covering each element exactly once? Reduces from 3SAT via variable/clause gadgets. Certificate: the set $S$; verifier: check each element covered exactly once.
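The 3DM verifier is easy to make concrete. A minimal sketch (function and variable names are illustrative; a full verifier would also check $S \subseteq T$, another polynomial-time test):

```python
def verify_3dm(X, Y, Z, matching):
    """Check that `matching` (a set of (x, y, z) triples) covers each
    element of X, Y, Z exactly once — O(|matching|), clearly polynomial.
    (A complete verifier would also confirm matching ⊆ T.)"""
    seen_x, seen_y, seen_z = set(), set(), set()
    for x, y, z in matching:
        if x in seen_x or y in seen_y or z in seen_z:
            return False                      # some element covered twice
        seen_x.add(x); seen_y.add(y); seen_z.add(z)
    # every element must be covered (exactly once, by the check above)
    return seen_x == set(X) and seen_y == set(Y) and seen_z == set(Z)

X, Y, Z = {'x1', 'x2'}, {'y1', 'y2'}, {'z1', 'z2'}
print(verify_3dm(X, Y, Z, {('x1', 'y1', 'z1'), ('x2', 'y2', 'z2')}))  # True
print(verify_3dm(X, Y, Z, {('x1', 'y1', 'z1'), ('x1', 'y2', 'z2')}))  # False: x1 twice
```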

Subset Sum: Given integers $\{a_1,\ldots,a_n\}$ and target $t$, does any subset sum to $t$? Reduces from 3DM (encode triples as numbers in a large base $b$; target $= \sum_i b^i$). Weakly NP-hard (a pseudo-polynomial DP solves it in $O(nt)$).

Partition: Given $A$, partition into two equal-sum subsets? Reduction from Subset Sum: add $a_{n+1} = \sigma + t$, $a_{n+2} = 2\sigma - t$ where $\sigma = \sum A$. Weakly NP-complete.
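The Subset Sum → Partition reduction can be sanity-checked directly. A sketch using a brute-force (exponential, verification-only) subset-sum checker — helper names are illustrative:

```python
def subset_sum_to_partition(nums, t):
    """Reduction from the text: add sigma + t and 2*sigma - t (sigma = sum)."""
    sigma = sum(nums)
    return nums + [sigma + t, 2 * sigma - t]

def brute(nums, target):
    """Does any subset of nums sum to target? Exponential — demo only."""
    n = len(nums)
    return any(sum(nums[i] for i in range(n) if mask >> i & 1) == target
               for mask in range(1 << n))

nums, t = [3, 1, 4, 2], 6
part = subset_sum_to_partition(nums, t)   # total = 4*sigma, half = 2*sigma
# YES instance of Subset Sum ↔ YES instance of Partition:
print(brute(nums, t), brute(part, sum(part) // 2))   # True True

# NO instance stays NO (t = 11 > sum(nums) = 10):
part_no = subset_sum_to_partition(nums, 11)
print(brute(part_no, sum(part_no) // 2))             # False
```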

Vertex Cover $\leq_P$ Set Cover: For graph $G=(V,E)$, let $S = E$ (set of edges to be covered), and for each vertex $v$ create subset $S_v = \{\text{edges incident to }v\}$. A $k$-vertex cover $\Leftrightarrow$ a $k$-set cover. Therefore Set Cover is NP-hard.

Clique $\leq_P$ Independent Set: $I$ is independent in $G$ $\Leftrightarrow$ $I$ is a clique in $\overline{G}$ (complement graph). The reduction runs in $O(|V|^2)$ (examine every vertex pair to build $\overline{G}$); $k$ stays the same.
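A quick check of the complement-graph equivalence (illustrative helper names; the quadratic cost is the complement construction):

```python
def complement_edges(n, edges):
    """Build the complement graph's edge list by examining all vertex pairs."""
    E = {frozenset(e) for e in edges}
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if frozenset((i, j)) not in E]

def is_independent(edges, S):
    """No edge of G has both endpoints in S."""
    return not any(u in S and v in S for u, v in edges)

def is_clique(edges, S):
    """Every pair in S is an edge."""
    E = {frozenset(e) for e in edges}
    return all(frozenset((u, v)) in E for u in S for v in S if u < v)

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]       # 4-cycle
comp  = complement_edges(4, edges)
print(comp)                                     # [(0, 2), (1, 3)]
S = {0, 2}
print(is_independent(edges, S), is_clique(comp, S))   # True True
```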

Strong vs Weak NP-hardness. Weakly NP-hard problems become polynomial when numbers are bounded (pseudo-polynomial algorithms exist): Subset Sum, Partition, Knapsack. Strongly NP-hard problems remain hard even when all numbers are polynomial in $n$: 3SAT, Vertex Cover, Hamiltonian Cycle, 4-Partition. For strongly NP-hard problems, no pseudo-polynomial algorithm exists (unless P = NP).
NP verifiers + reductions demo · 3SAT, Subset Sum, Vertex Cover
# ── 3SAT VERIFIER ── O(clauses * 3) ──────────────────────────────────
def verify_3sat(clauses, assignment):
    """
    Verify a 3SAT certificate in polynomial time.
    clauses: list of 3-tuples like [(1, -2, 3), (-1, 4, -5)]
             positive = variable, negative = negated variable.
    assignment: dict {var: True/False}.
    O(clauses * 3) = O(n): clearly polynomial.
    """
    for clause in clauses:
        # Each clause is satisfied if at least one literal is true
        if not any(
            (assignment.get(abs(lit), False) if lit > 0
             else not assignment.get(abs(lit), False))
            for lit in clause
        ):
            return False   # clause not satisfied
    return True

# Example: (x1 OR x3 OR NOT x2) AND (NOT x1 OR x4 OR NOT x5)
clauses    = [(1, 3, -2), (-1, 4, -5)]
assignment = {1: True, 2: False, 3: True, 4: True, 5: False}
print("3SAT verified:", verify_3sat(clauses, assignment))   # True


# ── SUBSET SUM (weak NP) ── pseudo-polynomial O(n*T) DP ──────────────
def subset_sum_dp(nums, target):
    """
    Pseudo-polynomial DP: O(n*T) time, O(T) space.
    Works because T is treated as a number, not its bit-length.
    If T = 2^n, this is exponential in the input size (weak NP-hardness).
    """
    dp = {0}    # set of achievable sums
    for x in nums:
        dp = dp | {s + x for s in dp if s + x <= target}
    return target in dp

print("Subset sum:", subset_sum_dp([3,1,4,1,5,9,2,6], 11))  # True


# ── VERTEX COVER VERIFIER ── O(E) ────────────────────────────────────
def verify_vertex_cover(graph_edges, cover):
    """Verify vertex cover: every edge has at least one endpoint in cover."""
    cover_set = set(cover)
    return all(u in cover_set or v in cover_set for u, v in graph_edges)

# ── REDUCTION: Vertex Cover → Set Cover ──────────────────────────────
def vc_to_set_cover(n_vertices, edges):
    """
    Karp reduction: Vertex Cover instance → Set Cover instance.
    Universe S = E (set of edges, indexed 0..m-1).
    Set_v = {edges incident to vertex v}.
    k-vertex cover ↔ k-set cover.
    Polynomial time: O(V + E).
    """
    universe = list(range(len(edges)))    # edges are elements
    sets = {}
    for v in range(n_vertices):
        sets[v] = [i for i, (u, w) in enumerate(edges) if u==v or w==v]
    return universe, sets

edges = [(0,1),(0,2),(1,2),(1,3),(2,4),(3,4)]
universe, sets = vc_to_set_cover(5, edges)
print(f"Universe (edges): {universe}")
print(f"Sets (per vertex): {sets}")
// 3SAT verifier + Subset Sum DP
#include <vector>
#include <unordered_map>
#include <cstdlib>   // abs
using namespace std;

// 3SAT verifier — O(clauses * 3)
bool verify_3sat(const vector<vector<int>>& clauses,
                  const unordered_map<int,bool>& asgn) {
    for (auto& clause : clauses) {
        bool sat = false;
        for (int lit : clause) {
            int var = abs(lit);
            bool val = asgn.count(var) ? asgn.at(var) : false;
            if ((lit > 0 && val) || (lit < 0 && !val)) {
                sat = true; break;
            }
        }
        if (!sat) return false;
    }
    return true;
}

// Subset Sum — pseudo-polynomial O(n*T) via bitset DP
bool subset_sum(const vector<int>& nums, int T) {
    vector<bool> dp(T+1, false);
    dp[0] = true;
    for (int x : nums)
        for (int s = T; s >= x; --s)
            if (dp[s-x]) dp[s] = true;
    return dp[T];
}

// Clique ↔ Independent Set reduction: complement graph
// G' has edge (u,v) iff G does NOT have edge (u,v)
// k-clique in G ↔ k-independent set in G'  (O(|E|))
vector<pair<int,int>> complement_graph(int n,
    const vector<pair<int,int>>& edges) {
    vector<vector<bool>> adj(n, vector<bool>(n,false));
    for (auto [u,v]:edges) { adj[u][v]=adj[v][u]=true; }
    vector<pair<int,int>> comp;
    for (int i=0;i<n;i++)
        for (int j=i+1;j<n;j++)
            if (!adj[i][j]) comp.push_back({i,j});
    return comp;
}

Part II — Approximation Algorithms

When a problem is NP-complete, we can’t solve it exactly in polynomial time (unless P=NP). But we can often find solutions guaranteed to be within a factor of optimal.

Approximation ratio $\rho(n)$: algorithm $A$ is a $\rho$-approximation if for every input it produces cost $C$ satisfying $\max(C/C_\text{opt}, C_\text{opt}/C) \leq \rho(n)$. The ratio applies in both directions so it works for both minimisation and maximisation.

PTAS: Polynomial Time Approximation Scheme — for any $\varepsilon > 0$ gives $(1+\varepsilon)$-approximation in time poly$(n)$ (but possibly exponential in $1/\varepsilon$).

FPTAS: Fully Polynomial Time Approximation Scheme — runs in time poly$(n, 1/\varepsilon)$. Subset Sum has an FPTAS; TSP does not (unless P=NP).

Approximation 1 — Vertex Cover: 2-Approximation

Algorithm (Approx-Vertex-Cover): while edges remain, pick any edge $(u,v)$, add both $u$ and $v$ to the cover, remove all edges incident to $u$ or $v$.

Analysis: let $M$ = set of edges picked (a matching — no two share an endpoint). The algorithm returns $|V'| = 2|M|$. Any vertex cover must include at least one endpoint of every edge in $M$, and since no two edges in $M$ share an endpoint, $C_\text{opt} \geq |M|$. Therefore $C = 2|M| \leq 2 C_\text{opt}$. $\square$

This is tight: there exist graphs where the algorithm returns exactly $2 C_\text{opt}$.
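The tight ratio is easy to observe on a complete bipartite graph. A compact sketch of Approx-Vertex-Cover (an edge is covered iff at least one endpoint is already in the cover, so a membership test stands in for explicit edge removal):

```python
def approx_vc(edges):
    """2-approximation: take both endpoints of each still-uncovered edge."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:   # edge still uncovered
            cover |= {u, v}
    return cover

# Tight instance: K_{3,3} (left side 0-2, right side 3-5).
# Optimal cover = one side (3 vertices); the algorithm picks the
# matching (0,3),(1,4),(2,5) and takes all 6 vertices: ratio exactly 2.
edges = [(l, r) for l in range(3) for r in range(3, 6)]
print(len(approx_vc(edges)))   # 6
```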

Approximation 2 — Set Cover: $(\ln n + 1)$-Approximation

Algorithm (Greedy-Set-Cover): repeatedly pick the set $S_i$ covering the most uncovered elements; repeat until all elements covered.

Analysis: let $|X| = n$ and $C_\text{opt} = t$ (the optimal cover uses $t$ sets). At any step with $|X_k|$ uncovered elements, the $t$ optimal sets cover all of them, so at least one covers $\geq |X_k|/t$ of them. The greedy picks the best, so: $$|X_{k+1}| \leq \left(1 - \frac{1}{t}\right)|X_k| \quad\Rightarrow\quad |X_k| \leq e^{-k/t} \cdot n$$ Setting $k = t\ln n$ gives $|X_k| \leq e^{-\ln n}\, n = 1$, so at most one more pick finishes the cover. Total cost $C \leq t\ln n + 1 \leq C_\text{opt}(\ln n + 1)$. $\square$

The $\ln n$ bound is essentially tight: Feige (1998) proved that no polynomial-time algorithm achieves ratio $(1-\varepsilon)\ln n$ unless NP $\subseteq$ DTIME$(n^{O(\log\log n)})$.
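A small instance shows the logarithmic loss is real: greedy can be baited into picking ever-smaller sets while two sets suffice. The sketch below restates Greedy-Set-Cover for a self-contained demo; the set names are illustrative:

```python
def greedy_cover(universe, sets):
    """Greedy-Set-Cover: always pick the set covering the most uncovered."""
    uncovered, picked = set(universe), []
    while uncovered:
        best = max(sets, key=lambda s: len(uncovered & sets[s]))
        picked.append(best)
        uncovered -= sets[best]
    return picked

U = set(range(14))
sets = {
    'A':  set(range(7)),              # optimal cover is just {A, B}
    'B':  set(range(7, 14)),
    'C1': {0, 1, 2, 3, 7, 8, 9, 10},  # bait: size 8 beats A and B (size 7)
    'C2': {4, 5, 11, 12},             # bait: size 4 beats the leftovers
    'C3': {6, 13},                    # bait: size 2 beats the leftovers
}
print(greedy_cover(U, sets))   # ['C1', 'C2', 'C3'] — 3 sets vs optimal 2
```

Scaling the same pattern (bait sets of sizes $2^k, 2^{k-1}, \ldots$ straddling two optimal halves) drives the ratio to $\Theta(\log n)$.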

Approximation 3 — Max-Degree Vertex Cover: $O(\log n)$-Approximation

Algorithm (Natural-Vertex-Cover): repeatedly pick the vertex of maximum degree, add it to the cover, remove it and all incident edges.

Analysis: let $m = C_\text{opt}$ and let $E_k$ be the set of edges remaining after $k$ iterations, with $|E_0| = |E| = n$. The remaining edges are still covered by the $m$ vertices of an optimal cover, so some remaining vertex has degree $\geq |E_k|/m$; the greedy picks a maximum-degree vertex, so each iteration removes at least a $1/m$ fraction of the remaining edges. After $m$ iterations, $|E_m| \leq (1 - 1/m)^m |E_0| \leq |E_0|/e$: every block of $m$ iterations shrinks the edge count by a constant factor. All edges are gone within $O(m \ln n)$ iterations, and each iteration adds one vertex: $C = O(C_\text{opt} \log n)$. $\square$

This is the same bound as Greedy-Set-Cover (vertex cover reduces to set cover). Both are $O(\log n)$.
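A minimal sketch of Natural-Vertex-Cover (ties broken toward the smallest vertex id for determinism; names illustrative, and the per-iteration degree recount makes this $O(V \cdot E)$ rather than the best possible):

```python
from collections import defaultdict

def max_degree_vc(edges):
    """Natural greedy: repeatedly take a maximum-degree vertex."""
    remaining = set(map(frozenset, edges))
    cover = set()
    while remaining:
        deg = defaultdict(int)
        for e in remaining:              # recount degrees each round
            for v in e:
                deg[v] += 1
        v = max(sorted(deg), key=deg.get)  # max degree, smallest id on ties
        cover.add(v)
        remaining = {e for e in remaining if v not in e}
    return cover

edges = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 4), (3, 4)]
print(sorted(max_degree_vc(edges)))   # [0, 2, 3]
```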

Approximation 4 — Partition PTAS

Problem: partition $n$ items into two groups minimising $\max(w(A), w(B))$ where $w(\cdot)$ is the total weight. Optimal $\geq L = \frac{1}{2}\sum s_i$.

Algorithm (PTAS for Partition): given $\varepsilon > 0$, set $m = \lceil 1/\varepsilon \rceil - 1$. Phase 1: solve exactly for the $m$ largest items (cost $O(2^m) = O(2^{1/\varepsilon})$, exponential in $1/\varepsilon$ but independent of $n$). Phase 2: greedily assign remaining items to the lighter partition.

Proof that the ratio is $(1+\varepsilon)$: let $A$ be the heavier group. If its last item $k$ arrived in Phase 2, it was added to the then-lighter group, so $w(A) - s_k \leq w(B)$, giving $w(A) \leq L + s_k/2$. Since $k > m$: each of the $m+1$ largest items weighs $\geq s_k$, so $2L \geq (m+1)s_k$, giving $s_k \leq 2L/(m+1)$. Thus $w(A)/L \leq 1 + 1/(m+1) \leq 1 + \varepsilon$. (If the heavier group received no Phase-2 item, its load is the exact optimum on the $m$ largest items, which is at most the optimum of the full instance.) $\square$

Metric TSP: 2-Approximation (MST-based)

The Travelling Salesman Problem (find minimum-weight Hamiltonian cycle) is NP-hard. For metric TSP (edge weights satisfy the triangle inequality $c(u,w) \leq c(u,v)+c(v,w)$), there is a 2-approximation.

MST-Approximation for Metric TSP:
1. Find MST $T$ of $G$.
2. Perform a pre-order DFS walk of $T$, listing vertices in visit order: $H$.
3. Return Hamiltonian cycle $H$ (shortcutting repeated vertices).

Analysis: Removing any edge of $H^*$ (optimal cycle) gives a spanning tree, so $c(T) \leq c(H^*)$. Pre-order DFS walk traverses each edge twice: cost $2c(T)$. Shortcutting (triangle inequality) only decreases cost: $c(H) \leq 2c(T) \leq 2c(H^*)$. $\square$

Christofides (1976) achieves ratio $3/2$ using minimum-weight perfect matching on odd-degree vertices. This stood as the best known polynomial-time ratio for general metric TSP for decades, until Karlin, Klein and Oveis Gharan (2021) improved it by a tiny constant.
Approx Vertex Cover (2×) + Greedy Set Cover (ln n) + Metric TSP (2×)
import heapq, math
from collections import defaultdict

# ── 2-APPROX VERTEX COVER ─────────────────────────────────────────────
def approx_vertex_cover_2(edges):
    """
    2-approximation for Minimum Vertex Cover.
    Pick any uncovered edge, add both endpoints to the cover.
    An edge is covered iff at least one endpoint is in the cover,
    so a membership test replaces explicit edge removal.
    C <= 2 * C_opt. Proven tight (matching lower bound exists).
    O(E) time.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:   # edge still uncovered
            cover.add(u)
            cover.add(v)
    return cover


# ── GREEDY SET COVER (ln n + 1 approximation) ─────────────────────────
def greedy_set_cover(universe, sets):
    """
    Greedy set cover: always pick the set covering the most uncovered elements.
    Approximation ratio: ln(|universe|) + 1.
    O(|universe| * |sets|) per iteration → O(n^2 * m) total.
    Lower bound: no poly-time alg achieves (1-eps)*ln(n) unless NP ⊆ DTIME(n^(log log n)).
    """
    uncovered = set(universe)
    sets      = {name: set(s) for name, s in sets.items()}  # accept list values
    cover     = []

    while uncovered:
        # Pick set with maximum intersection with uncovered elements
        best = max(sets, key=lambda s: len(uncovered & sets[s]))
        cover.append(best)
        uncovered -= sets[best]

    return cover


# ── METRIC TSP: MST-BASED 2-APPROXIMATION ─────────────────────────────
def metric_tsp_2approx(n, dist):
    """
    2-approximation for Metric TSP via MST + DFS preorder.
    dist[i][j] must satisfy triangle inequality.
    Returns (tour_cost, tour_order).
    O(n^2) Prim + O(n) DFS = O(n^2) total.
    """
    # Step 1: Prim's MST (O(n^2) with adjacency matrix)
    in_mst = [False]*n
    key    = [float('inf')]*n
    parent = [-1]*n
    key[0] = 0
    adj    = defaultdict(list)  # MST adjacency list

    for _ in range(n):
        # Find min-key vertex not in MST
        u = min((i for i in range(n) if not in_mst[i]), key=lambda i: key[i])
        in_mst[u] = True
        if parent[u] != -1:
            adj[parent[u]].append(u)
            adj[u].append(parent[u])
        for v in range(n):
            if not in_mst[v] and dist[u][v] < key[v]:
                key[v] = dist[u][v]; parent[v] = u

    # Step 2: DFS preorder traversal (shortcutting = visiting unvisited)
    visited = [False]*n
    tour    = []

    def dfs(u):
        visited[u] = True
        tour.append(u)
        for v in sorted(adj[u]):   # sorted for determinism
            if not visited[v]:
                dfs(v)

    dfs(0)

    # Step 3: Compute tour cost (triangle inequality shortcutting is implicit)
    tour_cost = sum(dist[tour[i]][tour[i+1]] for i in range(len(tour)-1))
    tour_cost += dist[tour[-1]][tour[0]]   # return to start

    return tour_cost, tour
// 2-Approx Vertex Cover + Greedy Set Cover + Metric TSP
#include <vector>
#include <set>
#include <utility>
#include <functional>   // std::function for the recursive DFS lambda
using namespace std;

// 2-approximation for vertex cover — O(E log V) with std::set
set<int> approx_vertex_cover(int n, const vector<pair<int,int>>& edges) {
    set<int> cover;
    for (auto [u,v] : edges)
        if (!cover.count(u) && !cover.count(v)) {  // edge still uncovered
            cover.insert(u); cover.insert(v);      // take both endpoints
        }
    return cover;
}

// Metric TSP 2-approx: MST + DFS preorder — O(n^2)
pair<double,vector<int>>
metric_tsp(int n, const vector<vector<double>>& d) {
    // Prim's MST
    vector<bool> inMST(n,false);
    vector<double> key(n,1e18); key[0]=0;
    vector<int> par(n,-1);
    vector<vector<int>> adj(n);
    for (int iter=0;iter<n;iter++) {
        int u=-1;
        for (int i=0;i<n;i++)
            if (!inMST[i]&&(u==-1||key[i]<key[u])) u=i;
        inMST[u]=true;
        if (par[u]!=-1){adj[par[u]].push_back(u);adj[u].push_back(par[u]);}
        for (int v=0;v<n;v++)
            if (!inMST[v]&&d[u][v]<key[v]){key[v]=d[u][v];par[v]=u;}
    }
    // DFS preorder
    vector<bool> vis(n,false);
    vector<int> tour;
    function<void(int)> dfs=[&](int u){
        vis[u]=true; tour.push_back(u);
        for (int v:adj[u]) if(!vis[v]) dfs(v);
    };
    dfs(0);
    double cost=0;
    for (int i=0;i+1<(int)tour.size();i++) cost+=d[tour[i]][tour[i+1]];
    cost+=d[tour.back()][tour[0]];
    return {cost,tour};
}
Partition PTAS + Subset Sum FPTAS · (1+ε)-approximation
import math

def partition_ptas(items, epsilon):
    """
    PTAS for load-balancing partition: minimise max(w(A), w(B)).
    Phase 1: solve exactly for largest m = ceil(1/epsilon) - 1 items.
    Phase 2: greedy assign remaining items.
    Approximation ratio: 1 + epsilon.
    Time: O(2^(1/epsilon) + n) — polynomial in n, exponential in 1/epsilon.
    """
    items_sorted = sorted(items, reverse=True)
    m = min(max(1, math.ceil(1/epsilon) - 1), len(items_sorted))

    # Phase 1: brute-force optimal partition of the m largest items
    best_diff, best_mask = float('inf'), 0
    for mask in range(1 << m):
        wA = sum(items_sorted[i] for i in range(m) if mask >> i & 1)
        wB = sum(items_sorted[i] for i in range(m) if not mask >> i & 1)
        if abs(wA - wB) < best_diff:
            best_diff, best_mask = abs(wA - wB), mask

    A = [items_sorted[i] for i in range(m) if best_mask >> i & 1]
    B = [items_sorted[i] for i in range(m) if not best_mask >> i & 1]

    # Phase 2: greedy assignment of remaining items
    for item in items_sorted[m:]:
        if sum(A) <= sum(B):
            A.append(item)
        else:
            B.append(item)

    return max(sum(A), sum(B)), A, B


def subset_sum_fptas(nums, target, epsilon):
    """
    FPTAS for Subset Sum approximation: find subset with sum in
    [(1-epsilon)*target, target] if it exists.
    Time: O(n^2 / epsilon) — polynomial in both n and 1/epsilon.
    Strategy: round down all numbers by factor K=floor(epsilon*max/n),
              run exact DP on rounded numbers, unscale answer.
    """
    if not nums: return 0, []
    max_val = max(nums)
    K = max(1, int(epsilon * max_val / len(nums)))
    rounded = [x // K for x in nums]
    target_r = target // K

    # Exact DP on rounded numbers — O(n * target/K) = O(n^2/epsilon)
    dp = {0: []}
    for i, x in enumerate(rounded):
        new_dp = {}
        for s, path in dp.items():
            ns = s + x
            if ns <= target_r and ns not in new_dp:
                new_dp[ns] = path + [i]
        dp.update(new_dp)

    # Find best achievable sum
    best = max((s for s in dp if s*K <= target), default=0)
    chosen = [nums[i] for i in dp[best]]
    return sum(chosen), chosen
// Subset Sum FPTAS — O(n^2 / epsilon)
#include <vector>
#include <algorithm>
#include <cmath>
using namespace std;

// Round and solve: O(n * target/K) = O(n^2/epsilon)
int subset_sum_fptas(const vector<int>& nums, int T, double eps) {
    if (nums.empty()) return 0;
    int max_v = *max_element(nums.begin(),nums.end());
    int K = max(1, (int)(eps * max_v / nums.size()));
    int Tr = T / K;
    vector<int> rounded(nums.size());
    for (int i=0;i<(int)nums.size();i++) rounded[i] = nums[i]/K;

    // DP on rounded values
    vector<bool> dp(Tr+1, false);
    dp[0] = true;
    for (int x : rounded)
        for (int s=Tr;s>=x;--s)
            if (dp[s-x]) dp[s]=true;

    // Best rounded sum * K approximates actual sum
    for (int s=Tr;s>=0;--s)
        if (dp[s]) return s*K;  // unscale
    return 0;
}
// Approximation guarantee: answer >= (1 - epsilon) * optimal
Worked Example · Options Strategy Scheduling as Partition 🇮🇳 TTT / Pashupati

TTT runs strategies on two execution servers. We want to assign strategies to servers minimising the maximum-loaded server (makespan). Each strategy has a fixed CPU requirement. This is exactly the Partition problem (minimise $\max(w(A), w(B))$).

Apply PTAS with $\varepsilon = 0.1$ (10% guarantee)
strategies = {
    'STRADDLE_01': 80, 'STRANGLE_06': 60, 'BUTTERFLY_08': 40,
    'CALENDAR_13': 50, 'CONDOR_15':   45, 'VOLT_OTM_27': 35,
    'STRADDLE_27B': 30, 'IRON_CONDOR': 55
}
loads = list(strategies.values())   # [80, 60, 40, 50, 45, 35, 30, 55]

makespan, grpA, grpB = partition_ptas(loads, epsilon=0.1)
L = sum(loads) / 2
print(f"Total load:        {sum(loads)}")
print(f"Optimal lower bd:  {L:.1f}")
print(f"PTAS makespan:     {makespan} (ratio {makespan/L:.3f}, bound 1.1)")
print(f"Group A loads:     {grpA} = {sum(grpA)}")
print(f"Group B loads:     {grpB} = {sum(grpB)}")
Result: the PTAS achieves makespan $\leq 1.1 \times L$ (at most 10% above the lower bound on optimum). For TTT's 8 strategies, $m = \lceil 1/0.1 \rceil - 1 = 9$ exceeds the item count, so Phase 1 examines all $2^8 = 256$ partitions — trivially fast, and the instance is in fact solved exactly. With more strategies than $m$, Phase 2 greedily assigns the remaining lighter ones. The approximation guarantee is proven regardless of the load distribution.
NP-completeness appears directly in financial optimisation. Portfolio selection with cardinality constraints ($\leq k$ assets) is NP-hard (reduces to Knapsack). Index tracking (choose $k$ stocks to replicate an index) is NP-hard. Strategy scheduling (minimise makespan on 2 servers) is NP-hard (Partition). In practice: use PTAS for scheduling, greedy for set cover, and LP relaxations with rounding for cardinality constraints. The key insight is that approximation guarantees apply to every instance, not just in expectation — making them reliable for production systems.

Practice Problems
4 questions · Chapter 11
prog / daa / ch11 / q01 ★ NP Definitions

Which of the following is the correct relationship between P, NP, and NP-complete under the assumption P $\neq$ NP?

A
NP-complete problems are the hardest problems in NP, and since P $\neq$ NP, no NP-complete problem has any polynomial-time algorithm
B
P $=$ NP-complete, since every problem in P reduces to every NP-complete problem
C
P $\subsetneq$ NP; NP-complete problems are in NP but not in P; if any NP-complete problem were in P, then P $=$ NP
D
NP-complete problems are outside NP; they are strictly harder than any problem in NP
Answer: C.

NP-complete problems are in NP (they have polynomial certificates) and NP-hard (every NP problem reduces to them). Under P $\neq$ NP: P $\subsetneq$ NP, and NP-complete problems are not in P. If even one NP-complete problem had a polynomial-time algorithm, every NP problem would too (via reductions), giving P = NP — a contradiction. So P $\neq$ NP $\Rightarrow$ no NP-complete problem is in P. Option D is wrong: NP-complete problems are IN NP (they’re not harder than NP, they are the hardest IN NP).
prog / daa / ch11 / q02 ★★ Vertex Cover 2-approximation

The 2-approximation for Vertex Cover adds both endpoints of each picked edge. Why does this guarantee a ratio of exactly 2 (not better)?

A
Because each edge contributes 2 vertices, and edges are counted twice in any cover
B
The matched edges $M$ form a lower bound: $C_\text{opt} \geq |M|$ (every optimal cover must include one endpoint per matched edge, and matched edges don’t share endpoints). The algorithm uses $2|M|$ vertices, so $C = 2|M| \leq 2C_\text{opt}$. Tightness: bipartite graphs where one side is optimal but the algorithm takes both sides
C
Because the optimal cover has size at least $|E|/2$ in the worst case
D
Because the greedy matching always finds exactly half the optimal cover
Answer: B.

The algorithm picks a maximal matching $M$ (no two edges in $M$ share an endpoint). It adds $2|M|$ vertices (both endpoints of each matched edge). Any vertex cover must include at least one endpoint of every edge in $M$; since matched edges are vertex-disjoint, the optimal cover needs $\geq |M|$ vertices just for $M$ edges. So $C_\text{opt} \geq |M|$, giving $C = 2|M| \leq 2C_\text{opt}$.

Tightness: consider the complete bipartite graph $K_{n/2,n/2}$. Optimal cover: take all $n/2$ vertices from one side (covers all edges). Algorithm picks $n/2$ edges of the perfect matching, adding all $n$ vertices: ratio $= n/(n/2) = 2$.
prog / daa / ch11 / q03 ★★ Greedy Set Cover

Greedy Set Cover achieves ratio $\ln n + 1$. Why does the greedy argument use the bound $|X_{k+1}| \leq (1 - 1/t)|X_k|$, where $t = C_\text{opt}$?

A
Because each iteration removes exactly $1/t$ fraction of elements by definition of the algorithm
B
At each step, $|X_k|$ uncovered elements can be covered by the $t$ optimal sets, so at least one covers $\geq |X_k|/t$ elements; greedy picks the best, so $|X_k| - |X_{k+1}| \geq |X_k|/t$, giving $|X_{k+1}| \leq (1-1/t)|X_k|$
C
Because the optimal cover has $t$ sets each of size $|X_k|/t$ by symmetry
D
The bound comes from the Markov inequality applied to the set sizes
Answer: B.

At iteration $k$, there are $|X_k|$ uncovered elements. The $t$ optimal sets together cover all of them (they still form a valid cover of the original universe, hence also of $X_k$). By the pigeonhole principle, at least one of the $t$ sets covers $\geq |X_k|/t$ currently uncovered elements. The greedy algorithm picks the set with maximum coverage, so it covers $\geq |X_k|/t$ elements: $|X_k| - |X_{k+1}| \geq |X_k|/t \Rightarrow |X_{k+1}| \leq (1-1/t)|X_k|$. Unrolling: $|X_k| \leq (1-1/t)^k n \leq e^{-k/t} n$. Setting $k = t\ln n$: $|X_k| \leq e^{-\ln n} n = 1$. So the algorithm terminates after $\leq t\ln n$ iterations.
prog / daa / ch11 / q04 ★★★ PTAS vs FPTAS

An algorithm for Partition runs in $O(2^{1/\varepsilon} + n)$ time and achieves ratio $1+\varepsilon$. Is this a PTAS or FPTAS, and why?

A
FPTAS, because the exponent $1/\varepsilon$ is a polynomial function of $1/\varepsilon$
B
PTAS but not FPTAS: it is polynomial in $n$ for any fixed $\varepsilon$, but exponential in $1/\varepsilon$; an FPTAS must run in time polynomial in both $n$ and $1/\varepsilon$
C
Neither PTAS nor FPTAS, since $2^{1/\varepsilon}$ is not polynomial in $n$
D
FPTAS if $\varepsilon \geq 1/\log n$, and PTAS otherwise
Answer: B — PTAS but not FPTAS.

A PTAS requires polynomial time in $n$ for each fixed $\varepsilon > 0$. For any fixed $\varepsilon$, $2^{1/\varepsilon}$ is a constant, so $O(2^{1/\varepsilon} + n) = O(n)$ — polynomial in $n$. So this is a PTAS.

An FPTAS additionally requires polynomial time in $1/\varepsilon$: runtime must be $O(n^a / \varepsilon^b)$ for constants $a, b$. The $2^{1/\varepsilon}$ term is exponential in $1/\varepsilon$ — not polynomial. So this is not an FPTAS.

In contrast, the Subset Sum FPTAS runs in $O(n^2/\varepsilon)$ — polynomial in both. The practical difference: PTAS with $\varepsilon = 0.01$ requires $2^{100}$ computation; FPTAS with $\varepsilon = 0.01$ requires $O(100n^2)$ — hugely different.

Terminal Questions — Chapter 11 · 10 problems · No answers given

Work through independently. Q1–3 direct application. Q4–7 synthesis. Q8–10 careful argument.

1
Show that Independent Set is NP-complete by: (a) giving a polynomial-time verifier, (b) reducing from Clique via complement graph construction. Verify the reduction is correct on a 5-vertex example. Easy
2
Run the 2-approximation for Vertex Cover on the graph with edges $\{ab, ac, bc, bd, ce, de\}$. Show the maximal matching found, the resulting cover, its size, and compare to the optimal cover. Compute the exact approximation ratio on this instance. Easy
3
Run Greedy Set Cover on universe $\{1,2,3,4,5,6,7,8,9,10\}$ with sets $S_1=\{1,2,3,4,5\}$, $S_2=\{4,5,6,7\}$, $S_3=\{6,8,9\}$, $S_4=\{7,8,10\}$, $S_5=\{3,9,10\}$. Show each greedy selection, the resulting cover, and compare to optimal. Easy
4
Prove that Hamiltonian Path is NP-complete via reduction from Hamiltonian Cycle (as in R8): split vertex $v$ into $v^+$ (all in-edges to $v$) and $v^-$ (all out-edges from $v$). Show both directions of the equivalence. Medium
5
Implement the Christofides algorithm for metric TSP: (a) find MST, (b) find minimum-weight perfect matching on odd-degree vertices, (c) combine to get an Eulerian multigraph, (d) shortcut to Hamiltonian cycle. Show the ratio is $3/2$. Test on a 6-city random metric instance and compare to the 2-approximate MST tour. Medium
6
Show that the Partition problem (as NP decision: is there a perfect partition?) reduces to Subset Sum (target = $\sum A / 2$), AND Subset Sum reduces to Partition (by adding elements $\sigma+t$ and $2\sigma-t$). This shows Partition $\equiv_P$ Subset Sum. Medium
7
The natural max-degree greedy for Vertex Cover achieves $O(\log n)$ approximation. Construct the worst-case graph from L17 (with $k!$ vertices of different degrees) for $k=4$. Verify the greedy visits all bottom vertices before the optimal top vertices, giving a $\log k$ approximation gap. Medium
8
Prove that general TSP (without triangle inequality) has no polynomial-time $\rho$-approximation for any $\rho$, unless P = NP. Hint: show that if such an approximation existed, you could decide Hamiltonian Cycle in polynomial time. Hard
9
The Knapsack problem (choose items with total weight $\leq W$ maximising total value) is weakly NP-hard but has an FPTAS running in $O(n^2/\varepsilon)$. Describe the FPTAS: round item values by factor $K = \lfloor \varepsilon \cdot v_\text{max}/n \rfloor$, run exact DP on rounded values, recover solution. Prove the approximation ratio $1-\varepsilon$. Hard
10
The PCP theorem (Probabilistically Checkable Proofs) implies that Set Cover cannot be approximated better than $(1-\varepsilon)\ln n$ in polynomial time unless NP $\subseteq$ DTIME$(n^{O(\log\log n)})$. Explain: (a) what a PCP verifier is, (b) why the hardness of approximation follows, (c) what this says about Greedy Set Cover being essentially optimal. Hard