Dynamic Programming II
More DP problems, harder subproblem designs. Floyd-Warshall unifies all-pairs shortest paths in three nested loops. Rod Cutting and Subset Sum introduce integer-valued subproblems — and the subtle distinction between polynomial and pseudopolynomial time.
DP on Integers — A New Subproblem Pattern
In Ch11, subproblems were indexed by positions in a sequence (suffix $A[i:]$, substring $A[i:j]$). A new pattern: subproblems indexed by integers — the value of a target sum, a rod length, or an edge count. This unlocks powerful algorithms but also raises a subtle complexity question: is $O(nT)$ time actually efficient when $T$ can be huge?
Problem 1 — Floyd-Warshall (All-Pairs Shortest Paths)
Given a weighted directed graph, find $\delta(u,v)$ for all pairs $u, v \in V$. Running Bellman-Ford from each source costs $O(|V|^2 \cdot |E|) = O(|V|^4)$ for dense graphs. Floyd-Warshall achieves $\Theta(|V|^3)$ with a beautifully simple DP.
The key idea: instead of bounding path length by number of edges, restrict which intermediate vertices the path may use. Number vertices $1, \ldots, |V|$.
1. Subproblems: $d(u, v, k)$ = minimum weight of any path from $u$ to $v$ using only intermediate vertices from $\{1, \ldots, k\}$ (endpoints $u, v$ may be any vertex).
2. Relate: does the optimal path use vertex $k$ as an intermediate? $$d(u,v,k) = \min\bigl(d(u,v,k-1),\;\; d(u,k,k-1) + d(k,v,k-1)\bigr)$$ 3. Topo order: increasing $k$.
4. Base cases: $d(u,u,0) = 0$; $d(u,v,0) = w(u,v)$ if $(u,v) \in E$; else $+\infty$.
5. Original: $d(u,v,|V|)$ for all $u,v$.
6. Time: $O(|V|^3)$ subproblems $\times$ $O(1)$ work $= O(|V|^3)$.
def floyd_warshall(W):
    """
    All-pairs shortest paths on a |V|-vertex graph.
    W[u][v] = weight of edge (u,v), or float('inf') if no edge.
    W[u][u] = 0 for all u.
    Returns dist[u][v] = delta(u,v); raises if a negative-weight cycle exists.
    O(|V|^3) time, O(|V|^2) space.
    """
    import copy
    V = len(W)
    d = copy.deepcopy(W)                  # d[u][v] = d(u,v,0)
    for k in range(V):                    # allow vertex k as intermediate
        for u in range(V):
            for v in range(V):
                # either don't use k, or go u->...->k->...->v
                via_k = d[u][k] + d[k][v]
                if via_k < d[u][v]:
                    d[u][v] = via_k
    # Detect negative cycles: if d[u][u] < 0, u lies on a negative cycle
    for u in range(V):
        if d[u][u] < 0:
            raise ValueError(f"Negative-weight cycle detected through vertex {u}")
    return d

# Note: the in-place update of d[u][v] is correct because
#   d(u,v,k) = min(d(u,v,k-1), d(u,k,k-1) + d(k,v,k-1))
# and during round k we have d[u][k] = d(u,k,k) = d(u,k,k-1) and
# d[k][v] = d(k,v,k) = d(k,v,k-1): vertex k cannot improve a path to or
# from itself absent a negative cycle, so reading the partially updated
# table still yields the correct values.

#include <vector>
#include <stdexcept>
using namespace std;
const double INF = 1e18;
// Floyd-Warshall APSP — O(|V|^3)
// W[u][v] = edge weight, INF if no edge, 0 on diagonal
vector<vector<double>> floyd_warshall(vector<vector<double>> d) {
    int V = d.size();
    for (int k = 0; k < V; ++k)
        for (int u = 0; u < V; ++u)
            for (int v = 0; v < V; ++v)
                if (d[u][k] + d[k][v] < d[u][v])
                    d[u][v] = d[u][k] + d[k][v];
    for (int u = 0; u < V; ++u)
        if (d[u][u] < 0)
            throw runtime_error("Negative-weight cycle detected");
    return d;
}

Problem 2 — Rod Cutting
A rod of length $L$ can be cut into pieces. A piece of length $\ell$ is worth $v[\ell]$. Maximize total value. This is a subproblem on integers (the remaining rod length), not on sequence positions.
1. Subproblems: $x(\ell)$ = maximum value from a rod of length $\ell$, for $0 \leq \ell \leq L$.
2. Relate: brute-force the length $p$ of the first piece cut: $$x(\ell) = \max\bigl\{v[p] + x(\ell - p) \;\big|\; p \in \{1, \ldots, \ell\}\bigr\}$$ 3. Topo order: increasing $\ell$.
4. Base case: $x(0) = 0$.
5. Original: $x(L)$.
6. Time: $O(L)$ subproblems $\times$ $O(L)$ work each $= O(L^2)$. Strongly polynomial since input has $O(L)$ integers.
def cut_rod(L, v):
    """
    Max value from a rod of length L with piece values v[0..L].
    v[0] = 0; v[l] = value of a piece of length l. O(L^2) time.
    Returns (max_value, list_of_pieces).
    """
    x = [0] * (L + 1)             # x[l] = max value for rod of length l
    parent = [None] * (L + 1)     # parent[l] = remaining length after the first piece
    for l in range(1, L + 1):     # topological order: increasing l
        for piece in range(1, l + 1):
            candidate = v[piece] + x[l - piece]
            if candidate > x[l]:
                x[l] = candidate
                parent[l] = l - piece   # remainder after cutting this piece
    # Reconstruct cuts from parent pointers
    pieces, l = [], L
    while l > 0:
        piece = l - parent[l]     # piece length chosen at length l
        pieces.append(piece)
        l = parent[l]
    return x[L], pieces

# Example: L=7, v=[0,1,10,13,18,20,31,32]
# x[7] = 33 via pieces [2, 2, 3] (10+10+13=33)

#include <vector>
#include <algorithm>
using namespace std;
pair<int, vector<int>> cut_rod(int L, const vector<int>& v) {
    vector<int> x(L + 1, 0), parent(L + 1, -1);
    for (int l = 1; l <= L; ++l) {
        for (int p = 1; p <= l; ++p) {
            int cand = v[p] + x[l - p];
            if (cand > x[l]) {
                x[l] = cand;
                parent[l] = l - p;   // remainder
            }
        }
    }
    // Reconstruct pieces
    vector<int> pieces;
    for (int l = L; l > 0; l = parent[l])
        pieces.push_back(l - parent[l]);
    return {x[L], pieces};
}

Problem 3 — Subset Sum (and the Pseudopolynomial Question)
Given $n$ positive integers and a target $T$: does any subset sum exactly to $T$? This is a decision problem (yes/no), not an optimisation problem.
1. Subproblems: $x(i, t)$ = can any subset of $A[i:]$ sum to exactly $t$? For $0 \leq i \leq n$, $0 \leq t \leq T$.
2. Relate: is item $A[i]$ included? $$x(i,t) = x(i+1,\; t - A[i]) \text{ (if } t \geq A[i]\text{)} \quad\mathbf{OR}\quad x(i+1,\; t)$$ 3. Topo order: decreasing $i$.
4. Base cases: $x(i, 0) = \text{True}$ (empty subset sums to 0); $x(n, t) = \text{False}$ for $t > 0$.
5. Original: $x(0, T)$.
6. Time: $O(nT)$ subproblems $\times$ $O(1)$ work $= O(nT)$.
def subset_sum(A, T):
    """
    Does any subset of A sum to exactly T?
    O(n*T) time, O(T) space (pseudopolynomial).
    """
    # x[i][t] = can a subset of A[i:] sum to t?
    # Space optimisation: only the adjacent row is ever read -> O(T) space.
    # 1D DP: x[t] = t is reachable using the items processed so far.
    x = [False] * (T + 1)
    x[0] = True                   # base: the empty subset sums to 0
    for item in A:                # add one item at a time
        # Traverse t in reverse so each item is used at most once
        for t in range(T, item - 1, -1):
            if x[t - item]:
                x[t] = True       # reach t by adding this item
    return x[T]

def subset_sum_with_trace(A, T):
    """Returns an actual subset summing to T, or None if impossible."""
    n = len(A)
    # Full 2D table for traceback
    x = [[False] * (T + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        x[i][0] = True            # base: t=0 is always reachable
    for i in range(n - 1, -1, -1):      # decreasing i (topological order)
        for t in range(1, T + 1):
            x[i][t] = x[i + 1][t]       # don't include A[i]
            if t >= A[i]:
                x[i][t] = x[i][t] or x[i + 1][t - A[i]]
    if not x[0][T]:
        return None
    # Traceback: find which items were included
    subset, t = [], T
    for i in range(n):
        if t >= A[i] and x[i + 1][t - A[i]]:
            subset.append(A[i])
            t -= A[i]
    return subset
# Example: A=[1,3,4,12,19,21,22], T=47 -> True; one valid subset is [3,4,19,21] (3+4+19+21=47)

#include <vector>
using namespace std;
// Subset sum — O(n*T) pseudopolynomial
bool subset_sum(const vector<int>& A, int T) {
    vector<bool> x(T + 1, false);
    x[0] = true;
    for (int item : A)
        for (int t = T; t >= item; --t)   // reverse to avoid reuse
            if (x[t - item]) x[t] = true;
    return x[T];
}

// With traceback — O(n*T) time and space
vector<int> subset_sum_trace(const vector<int>& A, int T) {
    int n = A.size();
    vector<vector<bool>> x(n + 1, vector<bool>(T + 1, false));
    for (int i = 0; i <= n; ++i) x[i][0] = true;
    for (int i = n - 1; i >= 0; --i)
        for (int t = 1; t <= T; ++t) {
            x[i][t] = x[i + 1][t];
            if (t >= A[i]) x[i][t] = x[i][t] || x[i + 1][t - A[i]];
        }
    if (!x[0][T]) return {};
    vector<int> result;
    int t = T;
    for (int i = 0; i < n; ++i)
        if (t >= A[i] && x[i + 1][t - A[i]]) { result.push_back(A[i]); t -= A[i]; }
    return result;
}

Polynomial vs Pseudopolynomial — A Critical Distinction
Subset Sum runs in $O(nT)$. Is this polynomial? It depends on how big $T$ can be relative to the input size.
Pseudopolynomial: runtime bounded by a polynomial in both the number of input values and the input values themselves. Example: $O(nT)$ for Subset Sum. If $T = 2^n$, this is exponential in $n$.
The key test: can the input integers be arbitrarily large relative to $n$? If yes, pseudopolynomial is not polynomial. If integers are polynomially bounded in $n$ (i.e., $T = O(n^c)$ for some constant $c$), then pseudopolynomial = polynomial for that input class.
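To make the encoding-length point concrete: $T$ occupies only about $\log_2 T$ bits of the input, while the DP table has $T+1$ columns. A small sketch (the specific bit-lengths are chosen for illustration):

```python
n = 100
for bits in (10, 30, 60):
    T = 2 ** bits
    # The input stores T in ~bits bits, but the table has n*(T+1) = O(nT) cells.
    print(f"T = 2^{bits}: ~{bits} bits of input, ~{n * T:.1e} DP cells")
```

Doubling the bit-length of $T$ squares the table size: the runtime is polynomial in the *value* of $T$ but exponential in its *encoding length*.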
| Algorithm | Runtime | Classification | Why |
|---|---|---|---|
| Merge Sort | $O(n\log n)$ | Strongly poly | Depends only on $n$, not on values |
| Rod Cutting | $O(L^2)$ | Strongly poly | Input has $L+1$ integers; $O(L^2)$ is poly in $L$ |
| Subset Sum | $O(nT)$ | Pseudopoly | $T$ can be $2^n$; input has $n+1$ integers but algorithm grows with $T$ |
| Counting Sort | $O(n+u)$ | Pseudopoly | $u$ (key range) can be exponential in $n$ |
| Floyd-Warshall | $O(|V|^3)$ | Strongly poly | Depends only on $|V|$, not on edge weights |
A Tour of DP Subproblem Patterns
Looking back across all DP problems in Chs 11–12, every one uses one of a small number of subproblem patterns. Recognising the pattern is half the battle:
Suffixes $A[i:]$ — Bowling, LIS, Piano Fingering, Subset Sum.
Prefixes $A[:i]$ — Rod Cutting (prefixes of possible lengths), Bellman-Ford (prefix of $k$ edges).
Substrings $A[i:j]$ — Alternating Coin Game, Arithmetic Parenthesization.
Two sequences $A[i:] \times B[j:]$ — LCS.
Integers $\{0,\ldots,T\}$ — Fibonacci, Rod Cutting, Subset Sum.
Vertices $\times$ integers — Bellman-Ford ($v \times k$), Floyd-Warshall ($u \times v \times k$).
With expanded state — LIS (must include $A[i]$), Coin Game (whose turn), Piano (start finger).
When stuck: try the simplest pattern first (suffix), then expand state until the recurrence has enough information to be correct.
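As a concrete instance of the "expand state until correct" advice, here is a minimal LIS sketch (reconstructing the design named in the list above, with the helper name `lis_length` introduced here): plain suffix subproblems carry too little information, but requiring the subsequence to start with $A[i]$ fixes the recurrence.

```python
def lis_length(A):
    """Length of the longest increasing subsequence of A. O(n^2)."""
    n = len(A)
    x = [1] * n                         # x[i] = longest LIS starting at (and including) A[i]
    for i in range(n - 1, -1, -1):      # suffixes: decreasing i
        for j in range(i + 1, n):
            if A[j] > A[i]:             # A[i] is known, so the comparison is well-defined
                x[i] = max(x[i], 1 + x[j])
    return max(x) if A else 0

print(lis_length([3, 1, 4, 1, 5, 9, 2, 6]))   # -> 4 (e.g. 1, 4, 5, 9)
```

Without the "must include $A[i]$" constraint, the subproblem value alone cannot tell whether prepending $A[i]$ keeps the subsequence increasing — exactly the failure the expanded state repairs.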
You have a budget of $B$ rupees. There are $n$ option contracts available, each with cost $c_i$ and expected PnL $p_i$. You want to select a subset of contracts with total cost $\leq B$ to maximise total expected PnL. This is the 0/1 Knapsack problem — a direct generalisation of Subset Sum.
Subproblems: $x(i, b)$ = max PnL using only contracts $i, i+1, \ldots, n-1$ with budget $b$.
Relate: include contract $i$ or don’t:
$$x(i,b) = \max\bigl(x(i+1, b),\;\; p_i + x(i+1, b - c_i)\bigr) \text{ if } b \geq c_i$$
Time: $O(nB)$ — pseudopolynomial.
def knapsack(costs, pnls, B):
    """0/1 Knapsack: max PnL with budget B. O(n*B) pseudopolynomial."""
    n = len(costs)
    # x[b] = max PnL achievable with budget b using the contracts seen so far
    x = [0] * (B + 1)
    for i in range(n):
        for b in range(B, costs[i] - 1, -1):   # reverse to avoid reuse
            x[b] = max(x[b], pnls[i] + x[b - costs[i]])
    return x[B]

# Example: 4 contracts, budget = 10
costs = [3, 4, 5, 2]; pnls = [4, 5, 6, 3]; B = 10
print(knapsack(costs, pnls, B))   # -> 13 (contracts 0, 2, 3: cost = 3+5+2 = 10, pnl = 4+6+3 = 13)
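The 1D table above returns only the optimal value. To recover which contracts were chosen, the same full-table traceback used for subset_sum_with_trace applies; a sketch (the helper name `knapsack_trace` is introduced here, not from the text):

```python
def knapsack_trace(costs, pnls, B):
    """0/1 Knapsack returning (max PnL, chosen contract indices). O(n*B) time and space."""
    n = len(costs)
    # x[i][b] = max PnL using contracts i..n-1 with budget b (suffix subproblems)
    x = [[0] * (B + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for b in range(B + 1):
            x[i][b] = x[i + 1][b]                     # skip contract i
            if b >= costs[i]:
                x[i][b] = max(x[i][b], pnls[i] + x[i + 1][b - costs[i]])
    chosen, b = [], B
    for i in range(n):                                # traceback: was contract i included?
        if b >= costs[i] and x[i][b] == pnls[i] + x[i + 1][b - costs[i]]:
            chosen.append(i)
            b -= costs[i]
    return x[0][B], chosen

print(knapsack_trace([3, 4, 5, 2], [4, 5, 6, 3], 10))   # -> (13, [0, 2, 3])
```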
Floyd-Warshall computes all-pairs shortest paths in $O(|V|^3)$. What does the subproblem $d(u,v,k)$ mean?
$d(u,v,k)$ = minimum weight path from $u$ to $v$ where every intermediate vertex (all vertices except the endpoints $u$ and $v$ themselves) has label $\leq k$. This is the crucial design choice that gives Floyd-Warshall its efficiency: by expanding the allowed intermediate vertex set one vertex at a time, each subproblem has exactly two children ($O(1)$ branching), giving $O(1)$ work per subproblem and $O(|V|^3)$ total.
A natural wrong answer is Bellman-Ford's subproblem ($\delta_k(s,v)$: shortest path from $s$ to $v$ using at most $k$ edges), which leads to $O(|V| \cdot |E|)$ per source vertex $= O(|V|^2 |E|)$ total.
$A = [3, 4, 3, 1]$, $T = 6$. Does a subset of $A$ sum to 6? What is it?
$A = [3, 4, 3, 1]$. The elements at indices 0 and 2 both have value 3, and $3 + 3 = 6 = T$. No other subset works: the singletons are 3, 4, 3, 1; the remaining pairs give $3+4=7$, $3+1=4$, $4+1=5$; and every subset of three or more elements sums to at least $3+3+1 = 7$. So $\{3, 3\}$ (indices 0 and 2) is the unique solution.
Running the DP (suffix subproblems, decreasing $i$): $x(4, \cdot)$: only $t=0$ is True. $x(3, \cdot)$: add $A[3]=1$, so $t \in \{0, 1\}$. $x(2, \cdot)$: add $A[2]=3$, so $t \in \{0, 1, 3, 4\}$. $x(1, \cdot)$: add $A[1]=4$, so $t \in \{0, 1, 3, 4, 5\}$ ($t = 7, 8$ exceed $T$). $x(0, \cdot)$: add $A[0]=3$; $x(0, 6) = x(1, 3) = $ True.
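The hand trace can be checked mechanically by computing the set of reachable sums for each suffix (a small sketch using sets in place of the boolean table):

```python
A, T = [3, 4, 3, 1], 6
reach = [set() for _ in range(len(A) + 1)]
reach[len(A)] = {0}                        # base: the empty suffix reaches only 0
for i in range(len(A) - 1, -1, -1):        # decreasing i (topological order)
    with_i = {s + A[i] for s in reach[i + 1] if s + A[i] <= T}
    reach[i] = reach[i + 1] | with_i
print(reach[0])   # contains 0, 1, 3, 4, 5, 6 -- target 6 is reachable
```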
The Subset Sum DP runs in $O(nT)$. If $n = 100$ and $T = 2^{100}$, approximately how many operations does this require?
The DP is $O(nT)$. If $T = 2^{100}$, then $nT = 100 \times 2^{100} \approx 10^{32}$. At $10^{18}$ operations/second (a modern supercomputer), this would take about $10^{14}$ seconds — roughly four million years.
This is the core of why Subset Sum is NP-complete: the input encodes $T$ in only $O(\log T)$ bits, and no algorithm polynomial in that input size is known. The $O(nT)$ DP is only useful when $T = O(\text{poly}(n))$. For $T = 2^{100}$, even the brute force over all $2^{100}$ subsets doesn't help — both counts are astronomical.
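The arithmetic, spelled out (the $10^{18}$ ops/second rate is the assumption from the quiz answer):

```python
n, T = 100, 2 ** 100
ops = n * T                                # total DP cells = n * T
rate = 10 ** 18                            # assumed operations per second
seconds = ops / rate
years = seconds / (365.25 * 24 * 3600)
print(f"{ops:.2e} operations -> {years:.1e} years")   # ~1.27e+32 ops, ~4.0e+06 years
```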
In the Subset Sum DP, why do we iterate $t$ in reverse (from $T$ down to $A[i]$) when using the space-optimised 1D array?
In 0/1 Knapsack and Subset Sum, each item may be used at most once. When we process item $A[i]$, we update $x[t] = x[t]$ OR $x[t - A[i]]$. If we iterate $t$ in increasing order, then when we set $x[t]$ to True using $x[t - A[i]]$, that entry may already have been updated in the same pass by item $A[i]$ (whenever $t - A[i] \geq A[i]$). This would let the same item be counted multiple times.
Iterating in decreasing order ensures that when we read $x[t - A[i]]$, it still reflects the state before item $A[i]$ was considered — i.e., it represents the result without this item. This is the 0/1 constraint enforced by iteration order, not by extra data structures.
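A minimal demonstration (the helper name `reachable_sums` is introduced here for illustration): with a single item of size 5 and $T = 10$, reverse iteration preserves the 0/1 semantics, while forward iteration silently reuses the item — turning the DP into unbounded Knapsack.

```python
def reachable_sums(A, T, reverse_order):
    x = [False] * (T + 1)
    x[0] = True
    for item in A:
        ts = range(T, item - 1, -1) if reverse_order else range(item, T + 1)
        for t in ts:
            if x[t - item]:
                x[t] = True
    return {t for t in range(T + 1) if x[t]}

print(reachable_sums([5], 10, reverse_order=True))    # {0, 5}: the item is used at most once
print(reachable_sums([5], 10, reverse_order=False))   # {0, 5, 10}: item 5 was reused (5 + 5 = 10)
```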
Work through independently. Q1–3 direct application. Q4–7 synthesis. Q8–10 careful argument.