All-Pairs Shortest Paths, Greedy & MST
Three algorithm design paradigms meet in one chapter. DP solves all-pairs shortest paths via Floyd-Warshall in $O(V^3)$. Johnson's algorithm uses reweighting to get $O(VE + V^2 \log V)$ on sparse graphs. Greedy algorithms — armed with the exchange argument proof technique — solve MST, process scheduling, and event partitioning optimally.
Part I — All-Pairs Shortest Paths (APSP)
Given a weighted directed graph $G = (V, E, w)$, compute $\delta(u, v)$ for all pairs $u, v \in V$. The naive approach: run single-source shortest paths from every vertex.
| Situation | Algorithm | Dense ($E=\Theta(V^2)$) | Sparse |
|---|---|---|---|
| Unweighted | $|V| \times$ BFS | $O(V^3)$ | $O(VE)$ |
| Non-negative weights | $|V| \times$ Dijkstra | $O(V^3)$ | $O(VE + V^2\log V)$ |
| General weights | Floyd-Warshall | $O(V^3)$ | $O(V^3)$ |
| General, sparse | Johnson’s | $O(V^3)$ | $O(VE + V^2\log V)$ |
DP Attempt 1 — Edge-Count Relaxation ($O(V^4)$)
Subproblem $d_{uv}^{(m)}$ = weight of shortest path $u \to v$ using at most $m$ edges. Recurrence: $d_{uv}^{(m)} = \min_x(d_{ux}^{(m-1)} + w(x,v))$. Base case: $d_{uv}^{(0)} = 0$ if $u=v$, else $\infty$. After $n-1$ rounds: $d_{uv}^{(n-1)} = \delta(u,v)$. Complexity: $O(V^3)$ subproblems $\times$ $O(V)$ per subproblem $= O(V^4)$. No better than $|V| \times$ Bellman-Ford.
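A direct transcription of this recurrence (a sketch; `apsp_edge_count` and the dense-matrix input convention — `w[u][v]` is the edge weight or `inf` — are mine):

```python
from math import inf

def apsp_edge_count(n, w):
    """Edge-count DP: after round m, d[u][v] = shortest u->v path using
    at most m edges. O(V) work per cell => O(V^4) total."""
    d = [[0 if u == v else inf for v in range(n)] for u in range(n)]
    for _ in range(n - 1):                       # n-1 relaxation rounds
        nd = [[inf] * n for _ in range(n)]
        for u in range(n):
            for v in range(n):
                best = d[u][v]                   # keep a shorter path as-is
                for x in range(n):               # or extend via edge (x, v)
                    if d[u][x] + w[x][v] < best:
                        best = d[u][x] + w[x][v]
                nd[u][v] = best
        d = nd
    return d
```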
Speedup via matrix multiplication. Define $\oplus = \min$ and $\otimes = +$. Then $D^{(m)} = D^{(m-1)} \otimes W$ (tropical matrix multiplication). Using repeated squaring: $D^{(n-1)}$ in $O(\log n)$ multiplications, each $O(V^3)$ → $O(V^3 \log V)$. (Cannot use Strassen because $(\min, +)$ has no subtraction.)
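The squaring trick can be sketched as follows, again assuming a dense weight matrix with `inf` for absent edges (function names are mine):

```python
from math import inf

def min_plus_mult(A, B):
    """Tropical product: C[u][v] = min_x (A[u][x] + B[x][v]). O(V^3)."""
    n = len(A)
    return [[min(A[u][x] + B[x][v] for x in range(n)) for v in range(n)]
            for u in range(n)]

def apsp_squaring(n, w):
    """APSP by repeated squaring: D^(1) = W (0 on the diagonal),
    D^(2m) = D^(m) (x) D^(m). O(log V) products, O(V^3 log V) total."""
    D = [[0 if u == v else w[u][v] for v in range(n)] for u in range(n)]
    m = 1
    while m < n - 1:       # square until paths of up to n-1 edges are covered
        D = min_plus_mult(D, D)
        m *= 2
    return D
```

Squaring works because `D^(m)` already covers all paths of at most `m` edges, so one tropical product of `D^(m)` with itself covers `2m` edges.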
Floyd-Warshall — $O(V^3)$
Recurrence: does the shortest path use vertex $k$? $$c_{uv}^{(k)} = \min\!\left(c_{uv}^{(k-1)},\;\; c_{uk}^{(k-1)} + c_{kv}^{(k-1)}\right)$$ Base case: $c_{uv}^{(0)} = w(u,v)$. Answer: $\delta(u,v) = c_{uv}^{(n)}$.
Complexity: $O(V^3)$ subproblems, $O(1)$ per subproblem → $O(V^3)$. Space: $O(V^2)$ (overwrite in place).
Negative cycle detection: if $c_{uu}^{(n)} < 0$ for any $u$, graph contains a negative-weight cycle.
Johnson’s Algorithm — $O(VE + V^2 \log V)$
Reweight edges to make all weights non-negative, then run Dijkstra from every vertex. The key: find $h: V \to \mathbb{R}$ such that $w_h(u,v) = w(u,v) + h(u) - h(v) \geq 0$.
Finding $h$: we need $h(v) - h(u) \leq w(u,v)$ for all edges — a system of difference constraints. Add a virtual source $s$ with edges $(s, v, 0)$ for all $v$. Set $h(v) = \delta(s, v)$ (Bellman-Ford from $s$). The triangle inequality guarantees $h(v) - h(u) \leq w(u,v)$. If a negative cycle exists, Bellman-Ford detects it.
Total time: $O(VE)$ (Bellman-Ford) $+ O(E)$ (reweight) $+ V \times O(E + V\log V)$ (Dijkstra from each $v$) $= O(VE + V^2\log V)$.
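The reweighting step can be sketched in isolation. Initialising every tentative distance to 0 is equivalent to adding the virtual source and relaxing its 0-weight edges once; `johnson_potentials` is a hypothetical helper name:

```python
def johnson_potentials(n, edges):
    """h(v) = delta(s, v) from an implicit virtual source s with 0-weight
    edges to every vertex, via Bellman-Ford. Returns None on a negative
    cycle. edges: list of (u, v, w)."""
    h = [0] * n              # = one relaxation pass over the virtual edges
    for i in range(n):       # n-1 passes suffice; the n-th detects cycles
        changed = False
        for u, v, w in edges:
            if h[u] + w < h[v]:
                h[v] = h[u] + w
                changed = True
        if changed and i == n - 1:
            return None      # still relaxing => negative cycle
    return h
```

With these potentials, every reweighted edge weight $w(u,v) + h(u) - h(v)$ is non-negative, which is what Dijkstra requires.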
import heapq
from math import inf
# ── FLOYD-WARSHALL ── O(V^3) ─────────────────────────────────────────
def floyd_warshall(n, edges):
"""
APSP via Floyd-Warshall. O(V^3) time, O(V^2) space.
n: number of vertices (0-indexed).
edges: list of (u, v, w).
    Returns (dist, has_neg_cycle); dist[u][v] = delta(u,v), valid only
    when has_neg_cycle is False.
"""
dist = [[inf]*n for _ in range(n)]
for i in range(n): dist[i][i] = 0
for u, v, w in edges:
dist[u][v] = min(dist[u][v], w) # handle multi-edges
# Relax through intermediate vertex k
for k in range(n):
for u in range(n):
for v in range(n):
if dist[u][k] + dist[k][v] < dist[u][v]:
dist[u][v] = dist[u][k] + dist[k][v]
# Detect negative cycles: dist[u][u] < 0 for some u
has_neg_cycle = any(dist[i][i] < 0 for i in range(n))
return dist, has_neg_cycle
# ── JOHNSON'S ALGORITHM ── O(VE + V^2 log V) ─────────────────────────
def bellman_ford(n, edges, src):
"""Bellman-Ford from src. Returns (dist, has_negative_cycle)."""
dist = [inf] * n
dist[src] = 0
for _ in range(n - 1): # n-1 relaxation passes
for u, v, w in edges:
if dist[u] + w < dist[v]:
dist[v] = dist[u] + w
# Check for negative cycle
for u, v, w in edges:
if dist[u] + w < dist[v]:
return dist, True
return dist, False
def dijkstra(n, adj, src):
"""Dijkstra with binary heap. O((V+E) log V)."""
dist = [inf] * n
dist[src] = 0
pq = [(0, src)] # (distance, vertex)
while pq:
d, u = heapq.heappop(pq)
if d > dist[u]: continue
for v, w in adj[u]:
if dist[u] + w < dist[v]:
dist[v] = dist[u] + w
heapq.heappush(pq, (dist[v], v))
return dist
def johnsons(n, edges):
"""
Johnson's APSP. O(VE + V^2 log V).
Best for sparse graphs with negative weights.
"""
# Step 1: Add virtual source s=n, edges (n->v, 0) for all v
aug_edges = edges + [(n, v, 0) for v in range(n)]
h, neg_cycle = bellman_ford(n + 1, aug_edges, n)
if neg_cycle:
return None, True # negative cycle detected
# Step 2: Reweight edges: w_h(u,v) = w(u,v) + h(u) - h(v) >= 0
adj = [[] for _ in range(n)]
for u, v, w in edges:
w_h = w + h[u] - h[v]
adj[u].append((v, w_h))
# Step 3: Dijkstra from every vertex
all_dist = []
for s in range(n):
d_h = dijkstra(n, adj, s)
# Recover original distances: delta(s,v) = d_h(s,v) - h(s) + h(v)
d = [d_h[v] - h[s] + h[v] if d_h[v] < inf else inf
for v in range(n)]
all_dist.append(d)
    return all_dist, False
// Floyd-Warshall + Johnson's APSP
#include <vector>
#include <tuple>
#include <queue>
#include <algorithm>
using namespace std;
const double INF = 1e18;
using Edge = tuple<int,int,double>;
// Floyd-Warshall — O(V^3)
pair<vector<vector<double>>, bool>
floyd_warshall(int n, const vector<Edge>& edges) {
vector<vector<double>> d(n, vector<double>(n, INF));
for (int i=0;i<n;i++) d[i][i]=0;
for (auto& [u,v,w] : edges) d[u][v]=min(d[u][v],w);
for (int k=0;k<n;k++)
for (int u=0;u<n;u++)
for (int v=0;v<n;v++)
if (d[u][k]+d[k][v] < d[u][v])
d[u][v]=d[u][k]+d[k][v];
bool neg = false;
for (int i=0;i<n;i++) if (d[i][i]<0) neg=true;
return {d, neg};
}
// Dijkstra — O((V+E) log V)
vector<double> dijkstra(int n,
const vector<vector<pair<int,double>>>& adj, int src) {
vector<double> d(n,INF); d[src]=0;
priority_queue<pair<double,int>,
vector<pair<double,int>>, greater<>> pq;
pq.push({0,src});
while (!pq.empty()) {
auto [dd,u]=pq.top(); pq.pop();
if (dd>d[u]) continue;
for (auto [v,w]:adj[u])
if (d[u]+w<d[v]) { d[v]=d[u]+w; pq.push({d[v],v}); }
}
return d;
}
Part II — Greedy Algorithms and the Exchange Argument
A greedy algorithm makes the locally optimal choice at each step. Two properties justify correctness: optimal substructure (the optimal solution embeds optimal sub-solutions) and the greedy choice property (a local optimum is part of some global optimum). Proof technique: exchange argument — show that any solution not following the greedy rule can be transformed into one that does, without worsening the objective.
Problem 1 — Shortest Processing Time First (SPTF)
Schedule $n$ jobs in order to minimise total completion time $\sum C_i$. Each job $i$ takes time $t_i$; completion time of the $k$-th scheduled job is $\sum_{j=1}^k t_{p_j}$.
Exchange argument proof: suppose some schedule places job $i$ before job $j$ with $t_i > t_j$. Swap them. Everything before or after the pair is unchanged; every job strictly between them finishes earlier by $t_i - t_j > 0$; the job now in the earlier slot ($j$) also finishes earlier by $t_i - t_j$, while the job in the later slot inherits the old completion time. Total completion time strictly decreases. Therefore, any non-SPTF order can be improved, and SPTF is optimal. $\square$
Time: $O(n \log n)$ (sort by $t_i$).
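The optimality claim is easy to sanity-check by brute force on small instances (helper names are mine):

```python
from itertools import permutations

def total_completion(order):
    """Sum of completion times for jobs processed in the given order."""
    t, total = 0, 0
    for dur in order:
        t += dur          # clock advances by this job's processing time
        total += t        # this job's completion time
    return total

def check_sptf(times):
    """Verify that the sorted (SPTF) order achieves the minimum sum C_i."""
    best = min(total_completion(p) for p in permutations(times))
    return total_completion(sorted(times)) == best
```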
Problem 2 — Event Interval Partitioning (Cloning)
Given $n$ events with start/finish times, find the minimum number of “clones” (machines/rooms) needed so each event is attended. A clone can handle a set of non-overlapping events. Greedy: sort events by start time; assign each event to a clone whose previous event has already finished, creating a new clone only when none is free.
Correctness: when we create the $m$-th clone for event $(s_i, f_i)$, all $m-1$ existing clones are busy (their last event hasn’t finished). Since all those events started before $s_i$ (sorted by start time), at time $s_i$ there are $m$ concurrent events. Any solution needs $\geq m$ clones. So our greedy solution is optimal. $\square$
By Dilworth’s theorem, the minimum number of chains needed to partition a partial order equals the size of the maximum antichain.
Part III — Minimum Spanning Tree
Given undirected graph $G = (V, E, w)$, find a spanning tree of minimum total edge weight. The key structural insight is the cut property: for any cut $(S, V \setminus S)$, a minimum-weight edge crossing the cut belongs to some MST.
Proof (cut-and-paste). Let $e = \{u,v\}$ be the min-weight crossing edge and $T$ be any MST. If $e \in T$: done. If $e \notin T$: $T$ has a path from $u$ to $v$; that path must cross the cut via some edge $e' = \{u', v'\}$. Replace $e'$ with $e$: $T' = T \setminus \{e'\} \cup \{e\}$ is still a spanning tree, and $w(T') = w(T) - w(e') + w(e) \leq w(T)$. So $T'$ is also an MST containing $e$. $\square$
Optimal substructure. If $e \in \text{MST}(G)$, then contraction preserves optimality: an MST of $G/e$ (the graph with $e$ contracted), plus $e$ itself, is an MST of $G$.
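The cut property can be checked exhaustively on small graphs: enumerate all spanning trees, find the MSTs, and confirm that a minimum-weight edge crossing a given cut appears in some MST. A brute-force sketch (function names are mine):

```python
from itertools import combinations

def spanning_trees(n, edges):
    """All spanning trees (as tuples of edge indices) of a small graph,
    by brute force over (n-1)-edge subsets with a connectivity check."""
    def connected(subset):
        par = list(range(n))
        def find(x):
            while par[x] != x:
                par[x] = par[par[x]]    # path halving
                x = par[x]
            return x
        comps = n
        for i in subset:
            u, v, _ = edges[i]
            ru, rv = find(u), find(v)
            if ru != rv:
                par[ru] = rv
                comps -= 1
        return comps == 1
    return [s for s in combinations(range(len(edges)), n - 1) if connected(s)]

def cut_property_holds(n, edges, S):
    """Some min-weight edge crossing the cut (S, V \\ S) lies in some MST."""
    trees = spanning_trees(n, edges)
    best_w = min(sum(edges[i][2] for i in t) for t in trees)
    msts = [t for t in trees if sum(edges[i][2] for i in t) == best_w]
    crossing = [i for i, (u, v, _) in enumerate(edges) if (u in S) != (v in S)]
    w_min = min(edges[i][2] for i in crossing)
    light = {i for i in crossing if edges[i][2] == w_min}
    return any(light & set(t) for t in msts)
```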
import heapq
# ── PRIM'S MST ── O(E log V) with binary heap ────────────────────────
def prim_mst(n, adj):
"""
Prim's MST. adj[u] = list of (v, w).
Grows MST from vertex 0, always adding the min-weight crossing edge.
Returns (total_weight, mst_edges).
Uses binary heap: O((V+E) log V).
Fibonacci heap would give O(E + V log V).
"""
in_mst = [False] * n
key = [float('inf')] * n
parent = [-1] * n
key[0] = 0
pq = [(0, 0)] # (key, vertex)
total_w = 0
mst_edges = []
while pq:
w, u = heapq.heappop(pq)
if in_mst[u]: continue
in_mst[u] = True
total_w += w
if parent[u] != -1:
mst_edges.append((parent[u], u, w))
for v, edge_w in adj[u]:
if not in_mst[v] and edge_w < key[v]:
key[v] = edge_w
parent[v] = u
heapq.heappush(pq, (edge_w, v))
return total_w, mst_edges
# ── KRUSKAL'S MST ── O(E log E + E alpha(V)) ─────────────────────────
def kruskal_mst(n, edges):
"""
Kruskal's MST using Union-Find.
Sorts edges by weight, adds edge if it connects two different components.
Returns (total_weight, mst_edges).
"""
    # Union-Find with path halving + union by rank
parent = list(range(n))
rank = [0] * n
def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]] # path halving
x = parent[x]
return x
def union(x, y):
rx, ry = find(x), find(y)
if rx == ry: return False
if rank[rx] < rank[ry]: rx, ry = ry, rx
parent[ry] = rx
if rank[rx] == rank[ry]: rank[rx] += 1
return True
edges_sorted = sorted(edges, key=lambda e: e[2])
total_w, mst_edges = 0, []
for u, v, w in edges_sorted:
if union(u, v):
total_w += w
mst_edges.append((u, v, w))
if len(mst_edges) == n - 1:
break
return total_w, mst_edges
# ── SPTF PROCESS SCHEDULING ── O(n log n) ────────────────────────────
def sptf_schedule(jobs):
"""
Shortest Processing Time First: minimise total completion time.
jobs: list of (job_id, processing_time).
Returns (schedule, total_completion_time).
Exchange argument: swapping any two out-of-order adjacent jobs
strictly decreases total completion time.
"""
schedule = sorted(jobs, key=lambda j: j[1]) # sort by t_i
time, total = 0, 0
for job_id, t in schedule:
time += t
total += time # completion time = cumsum
    return schedule, total
// Prim's + Kruskal's MST
#include <vector>
#include <tuple>
#include <functional>
#include <queue>
#include <numeric>
#include <algorithm>
using namespace std;
using Edge = tuple<double,int,int>; // (weight, u, v)
// Prim's MST — O((V+E) log V) with binary heap
pair<double, vector<Edge>>
prim(int n, const vector<vector<pair<int,double>>>& adj) {
vector<bool> in_mst(n,false);
vector<double> key(n,1e18); key[0]=0;
vector<int> par(n,-1);
priority_queue<pair<double,int>,
vector<pair<double,int>>,greater<>> pq;
pq.push({0,0});
double tot=0; vector<Edge> mst;
while (!pq.empty()) {
auto [w,u]=pq.top(); pq.pop();
if (in_mst[u]) continue;
in_mst[u]=true; tot+=w;
if (par[u]!=-1) mst.push_back({w,par[u],u});
for (auto [v,ew]:adj[u])
if (!in_mst[v]&&ew<key[v])
{key[v]=ew;par[v]=u;pq.push({ew,v});}
}
return {tot,mst};
}
// Kruskal's MST — O(E log E + E alpha(V))
pair<double,vector<Edge>>
kruskal(int n, vector<Edge> edges) {
sort(edges.begin(),edges.end());
vector<int> par(n); iota(par.begin(),par.end(),0);
vector<int> rnk(n,0);
function<int(int)> find=[&](int x){
return par[x]==x?x:par[x]=find(par[x]);};
double tot=0; vector<Edge> mst;
for (auto [w,u,v]:edges) {
int ru=find(u),rv=find(v);
if (ru==rv) continue;
if (rnk[ru]<rnk[rv]) swap(ru,rv);
par[rv]=ru; if(rnk[ru]==rnk[rv]) rnk[ru]++;
tot+=w; mst.push_back({w,u,v});
if ((int)mst.size()==n-1) break;
}
return {tot,mst};
}
import heapq
def min_clones(events):
"""
Minimum clones needed to attend all events without overlap.
Greedy: sort by start time, assign to earliest-finishing clone.
O(n log n): dominated by sort; heap ops O(n log k) where k = answer.
Correctness: when creating the m-th clone, m events are concurrent
at that moment => any solution needs >= m clones.
Connection to Dilworth's theorem: minimum chain partition of a
partial order (event i <= event j iff i.finish <= j.start)
equals size of maximum antichain (maximum concurrent events).
"""
events_sorted = sorted(events, key=lambda e: e[0]) # sort by start
heap = [] # min-heap of finish times of each clone's last event
for start, finish in events_sorted:
if heap and heap[0] <= start:
# Reuse earliest-finishing clone (it's now free)
heapq.heapreplace(heap, finish)
else:
# No free clone: create a new one
heapq.heappush(heap, finish)
return len(heap) # number of clones = size of heap
def fractional_knapsack(items, capacity):
"""
Fractional knapsack / fractional make-change.
items: list of (value_per_unit, max_weight).
Greedy: take most valuable items first, fractionally if needed.
O(n log n). Exchange argument proves optimality.
"""
items_sorted = sorted(items, key=lambda x: -x[0]) # desc value/weight
total_value, remaining = 0.0, capacity
selected = []
for value_rate, max_wt in items_sorted:
if remaining <= 0: break
take = min(max_wt, remaining)
total_value += take * value_rate
selected.append((value_rate, take))
remaining -= take
return total_value, selected
# Example: event partitioning
events = [(1,4),(3,5),(0,6),(5,7),(3,9),(5,9),(6,10),(8,11),(8,12),(2,14)]
print(f"Minimum clones needed: {min_clones(events)}")  # 6
// Event partitioning (interval scheduling minimisation)
#include <vector>
#include <queue>
#include <algorithm>
using namespace std;
int min_clones(vector<pair<int,int>> events) {
sort(events.begin(), events.end()); // sort by start time
priority_queue<int,vector<int>,greater<int>> heap; // min-heap of finish times
for (auto [s,f] : events) {
if (!heap.empty() && heap.top() <= s)
heap.pop(); // reuse earliest-finishing clone
heap.push(f); // assign event, update finish time
}
return heap.size(); // clones = remaining heap entries
}
// Fractional knapsack — O(n log n)
double fractional_knapsack(vector<pair<double,double>> items, double cap) {
// items: (value_per_unit, max_weight)
sort(items.begin(), items.end(),
[](auto& a, auto& b){ return a.first > b.first; });
double total = 0;
for (auto [v, w] : items) {
if (cap <= 0) break;
double take = min(w, cap);
total += take * v;
cap -= take;
}
return total;
}
Build a Minimum Spanning Tree of NSE instruments weighted by $1 - |\rho_{ij}|$ (dissimilarity). The MST reveals the backbone correlation structure: instruments connected by short edges are highly correlated, providing a basis for pairs trading or diversification.
import numpy as np
# 5-instrument correlation matrix (NIFTY, BANKNIFTY, RELIANCE, HDFC, TCS)
rho = np.array([
[1.00, 0.92, 0.45, 0.51, 0.38],
[0.92, 1.00, 0.41, 0.48, 0.35],
[0.45, 0.41, 1.00, 0.73, 0.29],
[0.51, 0.48, 0.73, 1.00, 0.31],
[0.38, 0.35, 0.29, 0.31, 1.00],
])
names = ['NIFTY', 'BANKNIFTY', 'RELIANCE', 'HDFC', 'TCS']
# Build edge list: (u, v, 1 - |rho|)
n = len(names)
edges = [(i, j, round(1 - abs(rho[i,j]), 4))
for i in range(n) for j in range(i+1, n)]
total_w, mst = kruskal_mst(n, edges)
print(f"MST total dissimilarity: {total_w:.4f}")
for u, v, w in sorted(mst, key=lambda e: e[2]):
print(f" {names[u]:12s} -- {names[v]:12s} rho={1-w:.2f}")
# MST edges (by correlation strength):
# NIFTY -- BANKNIFTY rho=0.92 (strongest, index cluster)
# RELIANCE -- HDFC rho=0.73 (bluechip cluster)
# NIFTY -- HDFC rho=0.51 (index to financials bridge)
# TCS -- NIFTY rho=0.38 (tech cluster, weakest bridge)
Floyd-Warshall uses subproblem $c_{uv}^{(k)}$ = shortest path using intermediate vertices in $\{1,\ldots,k\}$. Why is this better than the edge-count subproblem $d_{uv}^{(m)}$ from DP Attempt 1?
Both DPs have $O(V^3)$ subproblems (indexed by $(u, v, k)$ or $(u, v, m)$ respectively). The difference is cost per subproblem. Edge-count DP: $d_{uv}^{(m)} = \min_x(d_{ux}^{(m-1)} + w(x,v))$ — must iterate over all $V$ intermediate vertices $x$, giving $O(V)$ per cell and $O(V^4)$ total. Floyd-Warshall: $c_{uv}^{(k)} = \min(c_{uv}^{(k-1)},\; c_{uk}^{(k-1)} + c_{kv}^{(k-1)})$ — only two lookups, $O(1)$ per cell and $O(V^3)$ total. The key insight was to parameterise by the vertex set of intermediates (all $\leq k$) rather than edge count.
In Johnson’s algorithm, why does $h(v) = \delta(s, v)$ (shortest distance from virtual source $s$) satisfy the constraint $h(v) - h(u) \leq w(u,v)$ for all edges?
For any edge $(u,v)$, the shortest path from $s$ to $v$ satisfies the triangle inequality: $\delta(s,v) \leq \delta(s,u) + w(u,v)$ (you can always go from $s$ to $u$ via the shortest path, then take edge $(u,v)$). Setting $h = \delta(s,\cdot)$ and rearranging: $\delta(s,v) - \delta(s,u) \leq w(u,v)$, i.e., $h(v) - h(u) \leq w(u,v)$. This is exactly the difference constraint needed. So $w_h(u,v) = w(u,v) + h(u) - h(v) = w(u,v) - (h(v)-h(u)) \geq 0$.
Prim’s and Kruskal’s both build an MST using the cut property, but maintain different cuts. How does each algorithm’s invariant correspond to a cut?
Prim’s: maintains a single growing set $S$ (the current tree). At each step, the cut is $(S, V\setminus S)$. The invariant is that the tree inside $S$ is a sub-MST, and we add the minimum-weight edge crossing the cut. By the cut property, this edge is safe to add.
Kruskal’s: maintains a forest (multiple components). When considering edge $(u,v)$ with $u \in C_1$ and $v \in C_2$, the relevant cut is $(C_1, V \setminus C_1)$. Since we process edges in increasing weight order, $(u,v)$ is the minimum-weight edge leaving $C_1$ at the time we consider it (any lighter edges already created connections within the same component). By the cut property, this edge is in some MST.
Fractional knapsack (take fractions of items) is solvable greedily, but 0-1 knapsack (items are indivisible) requires DP. What exactly breaks the greedy exchange argument for 0-1 knapsack?
In fractional knapsack, the exchange argument works as follows: if the optimal solution doesn’t take item $i$ (highest value-to-weight ratio) first, we can replace a fraction $\epsilon$ of some lower-ratio item $j$ with $\epsilon \cdot (w_i/w_j)$ units of item $i$. Since $v_i/w_i > v_j/w_j$, value increases. This fraction swap is always possible, so greedy is optimal.
In 0-1 knapsack, we can’t take fractions. Replacing one unit of item $j$ with item $i$ requires dropping all of $j$ and including all of $i$ — but item $i$ might be heavier, causing infeasibility, or we might not want all of $i$. Counterexample: capacity $=10$, items $\{(v=7,w=6), (v=5,w=5), (v=5,w=5)\}$. Greedy by value/weight picks item 1 (ratio $7/6 \approx 1.17$, the highest), after which neither remaining item fits ($6+5 > 10$): value $=7$. Optimal: take items 2 and 3 for value $=10$.
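A counterexample of this shape can be confirmed by brute force over all subsets — a sketch, using an instance with distinct ratios so the greedy choice is unambiguous (`greedy_01` and `optimal_01` are illustrative names):

```python
from itertools import combinations

def greedy_01(items, cap):
    """0-1 greedy by value/weight ratio (no fractions allowed).
    items: list of (value, weight)."""
    total = 0
    for v, w in sorted(items, key=lambda x: -x[0] / x[1]):
        if w <= cap:          # take the whole item if it fits
            total += v
            cap -= w
    return total

def optimal_01(items, cap):
    """Exact 0-1 optimum by enumerating all 2^n subsets."""
    best = 0
    for r in range(len(items) + 1):
        for sub in combinations(items, r):
            if sum(w for _, w in sub) <= cap:
                best = max(best, sum(v for v, _ in sub))
    return best
```

On capacity 10 with items $(v=7,w=6), (5,5), (5,5)$, the greedy total falls short of the enumerated optimum, which is exactly where the fraction-swap exchange argument fails.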
Work through independently. Q1–3 direct application. Q4–7 synthesis. Q8–10 careful argument.