Weighted Shortest Paths
BFS finds shortest paths by edge count. Now edges carry weights. Three algorithms handle progressively harder cases: DAG Relaxation for acyclic graphs ($O(|V|+|E|)$), Bellman-Ford for graphs with negative weights ($O(|V||E|)$), and Dijkstra for non-negative weights ($O(|E|+|V|\log|V|)$).
Weighted Graphs and the Relaxation Framework
A weighted graph adds a weight function $w: E \to \mathbb{Z}$ assigning an integer to every edge. The weight of a path $\pi = (v_1, \ldots, v_k)$ is $w(\pi) = \sum_{i=1}^{k-1} w(v_i, v_{i+1})$. The shortest-path weight $\delta(s,v) = \inf\{w(\pi) \mid \text{path } \pi \text{ from } s \text{ to } v\}$.
Two special cases matter: $\delta(s,v) = +\infty$ if no path exists; $\delta(s,v) = -\infty$ if there is a path through a negative-weight cycle (a cycle with negative total weight). Negative-weight cycles make the shortest path undefined because you can always reduce the path weight by going around the cycle one more time.
The framework maintains distance estimates $d(s,v)$, initialised to $d(s,s) = 0$ and $d(s,v) = \infty$ otherwise. An edge $(u,v)$ is relaxable if $d(s,v) > d(s,u) + w(u,v)$; relaxing it sets $d(s,v) = d(s,u) + w(u,v)$.
Safety Lemma: relaxation only ever sets $d(s,v)$ to the weight of some actual path from $s$ to $v$, so $d(s,v) \geq \delta(s,v)$ holds at all times.
Termination Lemma: if no edge is relaxable, then $d(s,v) = \delta(s,v)$ for all $v$.
All three algorithms below are specialisations of this framework — they differ only in the order they choose edges to relax.
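Before specialising, the framework itself can be sketched in a few lines (edge-list representation assumed; by the termination lemma, whichever order the loop visits edges in, the fixpoint is the true distances):

```python
def relax_until_fixpoint(V, edges, s):
    """Generic relaxation: relax any relaxable edge until none remains.
    Terminates iff no negative-weight cycle is reachable from s."""
    INF = float('inf')
    d = [INF] * V
    d[s] = 0
    changed = True
    while changed:
        changed = False
        for u, v, w_uv in edges:
            if d[u] != INF and d[v] > d[u] + w_uv:
                d[v] = d[u] + w_uv          # relax (u, v)
                changed = True
    return d

# 0 -> 1 (4), 0 -> 2 (1), 2 -> 1 (2), 1 -> 3 (1)
edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 1)]
print(relax_until_fixpoint(4, edges, 0))  # [0, 3, 1, 4]
```

The three algorithms below replace the blind while-loop with a relaxation order that guarantees each edge needs only a bounded number of passes.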
Algorithm Summary
| Algorithm | Graph | Weights | Time |
|---|---|---|---|
| BFS | Any | Unweighted | $O(|V|+|E|)$ |
| DAG Relaxation | DAG only | Any (incl. negative) | $O(|V|+|E|)$ |
| Bellman-Ford | Any (directed) | Any (incl. negative cycles) | $O(|V||E|)$ |
| Dijkstra | Any (directed) | Non-negative only | $O(|E|+|V|\log|V|)$ |
Algorithm 1 — DAG Relaxation
For a DAG (no cycles), process vertices in topological order and relax all outgoing edges of each vertex. Since the topological order puts all predecessors before successors, by the time we process $v$, we have already set optimal distances to all possible predecessors of $v$.
def dag_relaxation(Adj, w, s):
"""
SSSP on a DAG with arbitrary (including negative) edge weights.
Adj: adjacency list. w: weight function w(u,v) -> number.
Returns d[v] = delta(s, v) for all v. O(|V| + |E|).
"""
# Step 1: topological sort via DFS finishing order
_, order = full_dfs(Adj)
topo = order[::-1] # reverse = topological order
# Step 2: initialise distance estimates
INF = float('inf')
d = [INF] * len(Adj)
d[s] = 0
# Step 3: relax edges in topological order
for u in topo:
if d[u] == INF: continue # u unreachable from s, skip
for v in Adj[u]:
if d[v] > d[u] + w(u, v):
d[v] = d[u] + w(u, v) # relax edge (u, v)
return d
def try_relax(d, parent, u, v, w_uv):
"""Relax edge (u,v) with weight w_uv. Helper used by all algorithms."""
if d[v] > d[u] + w_uv:
d[v] = d[u] + w_uv
parent[v] = u
return True
    return False
#include <vector>
#include <functional>
#include <algorithm>
#include <limits>
using namespace std;
const double INF = numeric_limits<double>::infinity();
// DAG Relaxation — O(|V| + |E|)
vector<double> dag_relaxation(
const vector<vector<int>>& Adj,
const function<double(int,int)>& w, int s) {
// topological sort via DFS
int V = Adj.size();
vector<int> color(V,0), order;
function<void(int)> dfs = [&](int u){
color[u]=1;
for(int v: Adj[u]) if(!color[v]) dfs(v);
order.push_back(u);
};
for(int v=0;v<V;v++) if(!color[v]) dfs(v);
reverse(order.begin(), order.end()); // topo order
vector<double> d(V, INF);
d[s] = 0;
for(int u : order){
if(d[u] == INF) continue;
for(int v : Adj[u])
if(d[v] > d[u] + w(u,v))
d[v] = d[u] + w(u,v);
}
return d;
}
Correctness (induction on topological order): when we process vertex $v$ (the $k$-th in topological order), every predecessor $u$ of $v$ on any shortest path has already been processed. By induction $d(s,u) = \delta(s,u)$. So when we relax $(u,v)$, we set $d(s,v) \leq \delta(s,u) + w(u,v) = \delta(s,v)$. Combined with the safety lemma ($d(s,v) \geq \delta(s,v)$), we get $d(s,v) = \delta(s,v)$.
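As a self-contained sanity check (topological sort inlined, since full_dfs belongs to an earlier section; the four-vertex DAG and its weights are made up):

```python
def dag_sssp(Adj, w, s):
    """DAG relaxation with an inline DFS topological sort.
    Assumes Adj really is acyclic (no cycle check)."""
    V = len(Adj)
    seen, order = [False] * V, []
    def dfs(u):                        # recursive; fine for small graphs
        seen[u] = True
        for v in Adj[u]:
            if not seen[v]:
                dfs(v)
        order.append(u)                # DFS finishing order
    for u in range(V):
        if not seen[u]:
            dfs(u)
    INF = float('inf')
    d = [INF] * V
    d[s] = 0
    for u in reversed(order):          # reverse finishing = topological
        if d[u] == INF:
            continue                   # unreachable from s
        for v in Adj[u]:
            if d[v] > d[u] + w(u, v):
                d[v] = d[u] + w(u, v)  # relax (u, v)
    return d

# DAG: 0 -> 1 (5), 0 -> 2 (3), 2 -> 1 (-4), 1 -> 3 (2)
Adj = [[1, 2], [3], [1], []]
W = {(0, 1): 5, (0, 2): 3, (2, 1): -4, (1, 3): 2}
d = dag_sssp(Adj, lambda u, v: W[(u, v)], 0)
print(d)  # [0, -1, 3, 1]
```

Note that the negative edge $(2,1)$ causes no trouble: vertex 2 is processed before vertex 1, so the improvement to $d(s,1)$ happens before 1's outgoing edges are relaxed.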
Algorithm 2 — Bellman-Ford
For general directed graphs with arbitrary weights (including negative), we cannot use a topological order (there may be cycles). The key insight: if no negative-weight cycles exist, then some shortest path to each vertex is simple, so it uses at most $|V|-1$ edges.
Define the $k$-edge distance $\delta_k(s,v)$ = minimum weight of any path from $s$ to $v$ using $\leq k$ edges. If the graph has no negative-weight cycles, $\delta(s,v) = \delta_{|V|-1}(s,v)$. If $\delta_{|V|}(s,v) < \delta_{|V|-1}(s,v)$, vertex $v$ is a witness — reachable via a negative-weight cycle.
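The $k$-edge distances satisfy the recurrence $\delta_k(s,v) = \min\big(\delta_{k-1}(s,v),\ \min_{(u,v)\in E}\ \delta_{k-1}(s,u) + w(u,v)\big)$, which can be computed directly as a DP (a pedagogical sketch, edge-list representation assumed):

```python
def k_edge_distances(V, edges, s):
    """D[k][v] = delta_k(s, v): min weight over paths using <= k edges.
    Recurrence: D[k][v] = min(D[k-1][v],
                              min over edges (u,v) of D[k-1][u] + w(u,v)).
    O(|V| * |E|) total, like Bellman-Ford itself."""
    INF = float('inf')
    D = [[INF] * V for _ in range(V + 1)]
    D[0][s] = 0
    for k in range(1, V + 1):
        D[k] = D[k - 1][:]                       # "stay put" option
        for u, v, w_uv in edges:
            if D[k - 1][u] != INF and D[k - 1][u] + w_uv < D[k][v]:
                D[k][v] = D[k - 1][u] + w_uv     # extend a (k-1)-edge path
    # witnesses: delta_|V| < delta_{|V|-1} signals a negative cycle
    witnesses = [v for v in range(V) if D[V][v] < D[V - 1][v]]
    return D, witnesses

# 3-cycle 0 -> 1 -> 2 -> 0 with total weight 1 + 1 - 3 = -1
edges = [(0, 1, 1), (1, 2, 1), (2, 0, -3)]
D, wit = k_edge_distances(3, edges, 0)
print(wit)  # [0]: vertex 0 improves in round |V|, so a neg cycle exists
```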
Implementation via graph duplication: create $|V|+1$ level copies of the graph where level $k$ represents "reached using $\leq k$ edges." The resulting structure is a DAG. Run DAG relaxation on it. Equivalently, iterate $|V|$ rounds of "relax every edge," which is the classic Bellman-Ford.
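The duplication can be sketched as a small hypothetical helper, where the level-$k$ copy of vertex $v$ is encoded as the integer $kV + v$:

```python
def layered_dag(V, edges):
    """Edges of the (|V|+1)-level duplication. Vertex (k, v), meaning
    'v reached using <= k edges', is encoded as the integer k*V + v."""
    node = lambda k, v: k * V + v
    out = []
    for k in range(V):   # transitions from level k to level k+1
        for v in range(V):
            out.append((node(k, v), node(k + 1, v), 0))      # stay put, free
        for u, v, w_uv in edges:
            out.append((node(k, u), node(k + 1, v), w_uv))   # use edge (u,v)
    return out

# tiny check: 2 vertices, 1 edge -> V * (V + |E|) = 2 * 3 = 6 layered edges
print(len(layered_dag(2, [(0, 1, 5)])))  # 6
```

Running DAG relaxation from the level-0 copy of $s$ makes the level-$k$ copy of $v$ carry $\delta_k(s,v)$; the classic loop below computes the same values in place.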
def bellman_ford(Adj, w, s):
"""
SSSP on any directed graph. Detects negative-weight cycles.
Returns (d, parent) where d[v] = delta(s,v), or -inf if on neg cycle.
O(|V| * |E|).
"""
V = len(Adj)
INF = float('inf')
d = [INF] * V
parent = [None] * V
    d[s] = 0
    parent[s] = s
# |V| - 1 rounds: after round k, d[v] = delta_k(s, v)
for _ in range(V - 1):
for u in range(V):
if d[u] == INF: continue
for v in Adj[u]:
if d[v] > d[u] + w(u, v):
d[v] = d[u] + w(u, v)
parent[v] = u
# Round |V|: detect witnesses (negative-cycle reachability)
witnesses = set()
for u in range(V):
if d[u] == INF: continue
for v in Adj[u]:
if d[v] > d[u] + w(u, v):
witnesses.add(v) # v is a witness — on or past a neg cycle
# BFS/DFS from each witness to mark all reachable vertices as -inf
if witnesses:
from collections import deque
q = deque(witnesses)
neg_inf = set(witnesses)
while q:
u = q.popleft()
for v in Adj[u]:
if v not in neg_inf:
neg_inf.add(v)
q.append(v)
for v in neg_inf:
d[v] = -INF
    return d, parent
#include <queue>
// Bellman-Ford — O(|V| * |E|)
// Returns distance vector; -INF for vertices on/past negative cycles.
vector<double> bellman_ford(
const vector<vector<int>>& Adj,
const function<double(int,int)>& w, int s) {
int V = Adj.size();
vector<double> d(V, INF);
d[s] = 0;
// V-1 rounds of relaxing all edges
for(int k = 0; k < V-1; ++k)
for(int u = 0; u < V; ++u)
if(d[u] != INF)
for(int v : Adj[u])
if(d[v] > d[u] + w(u,v))
d[v] = d[u] + w(u,v);
// Round V: find witnesses
vector<bool> neg(V, false);
for(int u = 0; u < V; ++u)
if(d[u] != INF)
for(int v : Adj[u])
if(d[v] > d[u] + w(u,v))
neg[v] = true;
// BFS from witnesses to mark all reachable as -INF
queue<int> q;
for(int v=0;v<V;v++) if(neg[v]) q.push(v);
while(!q.empty()){
int u = q.front(); q.pop();
for(int v: Adj[u])
if(!neg[v]){ neg[v]=true; q.push(v); }
}
for(int v=0;v<V;v++) if(neg[v]) d[v]=-INF;
return d;
}
Algorithm 3 — Dijkstra’s Algorithm
For non-negative weights, a greedy approach works: repeatedly extract the unvisited vertex with the smallest current distance estimate, then relax all its outgoing edges. Once a vertex is extracted, its distance is final.
This works because non-negative weights guarantee monotone distance increase along shortest paths: if $u$ appears on a shortest path from $s$ to $v$, then $\delta(s,u) \leq \delta(s,v)$. So when we extract the minimum, no future relaxation can improve it.
import heapq
def dijkstra(Adj, w, s):
"""
SSSP for graphs with non-negative edge weights.
Uses a min-heap (binary heap priority queue).
O((|V| + |E|) log |V|).
"""
V = len(Adj)
INF = float('inf')
d = [INF] * V
parent = [None] * V
    d[s] = 0
    parent[s] = s
# min-heap: (distance_estimate, vertex)
heap = [(0, s)]
while heap:
dist_u, u = heapq.heappop(heap)
# Lazy deletion: skip if we already found a better path
if dist_u > d[u]:
continue
for v in Adj[u]:
w_uv = w(u, v)
if d[v] > d[u] + w_uv:
d[v] = d[u] + w_uv
parent[v] = u
heapq.heappush(heap, (d[v], v)) # push updated estimate
return d, parent
# Note: Python's heapq does not support decrease-key.
# We use "lazy deletion": push duplicate entries and skip stale ones.
# This gives O((|V| + |E|) log |E|) = O((|V| + |E|) log |V|) time.
# For dense graphs, replace heap with a simple array scan: O(|V|^2).
#include <queue>
#include <vector>
#include <functional>
using namespace std;
// Dijkstra with binary heap (lazy deletion) — O((|V|+|E|) log |V|)
vector<double> dijkstra(
const vector<vector<int>>& Adj,
const function<double(int,int)>& w, int s) {
int V = Adj.size();
vector<double> d(V, INF);
d[s] = 0;
// min-heap: (distance, vertex)
priority_queue<pair<double,int>,
vector<pair<double,int>>,
greater<>> pq;
pq.push({0.0, s});
while(!pq.empty()){
auto [du, u] = pq.top(); pq.pop();
if(du > d[u]) continue; // stale entry — skip (lazy deletion)
for(int v : Adj[u]){
double nd = d[u] + w(u, v);
if(nd < d[v]){
d[v] = nd;
pq.push({nd, v});
}
}
}
return d;
}
Correctness (induction on extraction order): when vertex $v$ is extracted, suppose for contradiction $d(s,v) > \delta(s,v)$. Consider a shortest path $\pi = s \to \cdots \to x \to y \to \cdots \to v$, and let $(x,y)$ be the first edge where $y$ has not yet been extracted. When $x$ was extracted, $d(s,x) = \delta(s,x)$ by induction, so edge $(x,y)$ was relaxed and $d(s,y) = \delta(s,y) \leq \delta(s,v) \leq d(s,v)$. But $v$ was extracted before $y$, meaning $d(s,v) \leq d(s,y) \leq \delta(s,v)$. Contradiction.
Why non-negative weights are required: if edge $(x,y)$ had negative weight, the chain of inequalities $\delta(s,y) \leq \delta(s,v)$ would break. A later-discovered path through a negative edge could improve $d(s,v)$ after $v$ was already extracted. The greedy argument collapses.
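To make the failure concrete, here is a sketch of the classic finalize-on-extraction variant on a three-vertex graph with one negative edge. (The lazy-deletion code above happens to recover on this particular example because it re-pushes improved vertices; the variant below, which never revisits a finalized vertex, does not.)

```python
import heapq

def dijkstra_finalized(V, edges, s):
    """Dijkstra that permanently finalizes each extracted vertex.
    Correct only for non-negative edge weights."""
    Adj = [[] for _ in range(V)]
    for u, v, w_uv in edges:
        Adj[u].append((v, w_uv))
    INF = float('inf')
    d = [INF] * V
    d[s] = 0
    done = [False] * V
    heap = [(0, s)]
    while heap:
        du, u = heapq.heappop(heap)
        if done[u]:
            continue                   # u already finalized
        done[u] = True
        for v, w_uv in Adj[u]:
            if not done[v] and d[v] > d[u] + w_uv:
                d[v] = d[u] + w_uv
                heapq.heappush(heap, (d[v], v))
    return d

# Negative edge 2 -> 1: true delta(0, 1) = 3 - 2 = 1, but vertex 1 is
# finalized at distance 2 before the improvement via 2 is discovered.
edges = [(0, 1, 2), (0, 2, 3), (2, 1, -2)]
print(dijkstra_finalized(3, edges, 0))  # [0, 2, 3]  (wrong: d[1] should be 1)
```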
Priority Queue Choice Matters
Dijkstra performs $|V|$ delete-min operations and $|E|$ decrease-key operations. The priority queue choice determines the total runtime:
| Priority Queue | delete-min | decrease-key | Dijkstra Total | Best for |
|---|---|---|---|---|
| Array (linear scan) | $O(|V|)$ | $O(1)$ | $O(|V|^2)$ | Dense: $|E| = \Theta(|V|^2)$ |
| Binary Heap | $O(\log|V|)$ | $O(\log|V|)$ | $O(|E|\log|V|)$ | Sparse: $|E| = O(|V|\log|V|)$ |
| Fibonacci Heap | $O(\log|V|)^{(a)}$ | $O(1)^{(a)}$ | $O(|E|+|V|\log|V|)$ | Theoretically optimal |
(a) amortised · Fibonacci Heap not used in practice due to large constants; binary heap is standard.
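For completeness, the first table row (array with linear-scan delete-min) can be sketched as follows, assuming an adjacency-matrix input where W[u][v] is the edge weight or infinity:

```python
def dijkstra_dense(W, s):
    """O(|V|^2) Dijkstra: linear-scan delete-min, O(1) decrease-key.
    W[u][v] is the edge weight, or float('inf') if no edge."""
    V = len(W)
    INF = float('inf')
    d = [INF] * V
    d[s] = 0
    done = [False] * V
    for _ in range(V):
        # delete-min by scanning all unfinished vertices: O(|V|)
        u = min((x for x in range(V) if not done[x]), key=lambda x: d[x])
        if d[u] == INF:
            break                      # remaining vertices unreachable
        done[u] = True
        for v in range(V):
            if d[v] > d[u] + W[u][v]:
                d[v] = d[u] + W[u][v]  # decrease-key: just overwrite
    return d

INF = float('inf')
W = [[INF, 4,   1,   INF],
     [INF, INF, INF, 2],
     [INF, 2,   INF, 6],
     [INF, INF, INF, INF]]
print(dijkstra_dense(W, 0))  # [0, 3, 1, 5]
```

When $|E| = \Theta(|V|^2)$, the $O(|V|)$ scan is charged against $|V|$ relaxations per extraction, so the array variant matches or beats the heap despite its worse delete-min.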
Arbitrage Detection with Bellman-Ford
Consider a simplified currency/instrument graph where you can convert between NIFTY, BANKNIFTY, USDINR, and GOLD using current exchange rates. You want to detect whether a sequence of conversions can yield a profit starting and ending with NIFTY (a negative-weight cycle in log-space).
Create a directed graph with one vertex per instrument. For each conversion from $A$ to $B$ at rate $r_{AB}$ (you receive $r_{AB}$ units of $B$ per unit of $A$), add edge $(A, B)$ with weight $-\log(r_{AB})$. A cycle with negative total weight in this graph corresponds to a product of rates $> 1$ — i.e., a profitable arbitrage loop.
Run Bellman-Ford from any source. If at the end of $|V|-1$ rounds, any edge can still be relaxed, a negative-weight cycle exists — an arbitrage opportunity. The witness vertex is on or reachable from the profitable loop.
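A minimal sketch of the detection, with made-up conversion rates for three instruments:

```python
import math

def has_negative_cycle(V, edges, s):
    """Bellman-Ford witness check: True iff a negative-weight cycle
    is reachable from s. edges = list of (u, v, weight)."""
    INF = float('inf')
    d = [INF] * V
    d[s] = 0
    for _ in range(V - 1):                      # |V|-1 rounds
        for u, v, w_uv in edges:
            if d[u] != INF and d[v] > d[u] + w_uv:
                d[v] = d[u] + w_uv
    # round |V|: any still-relaxable edge witnesses a negative cycle
    return any(d[u] != INF and d[v] > d[u] + w_uv
               for u, v, w_uv in edges)

# Hypothetical rates (made up): 0 = NIFTY, 1 = USDINR, 2 = GOLD.
rates = {(0, 1): 2.0, (1, 2): 3.0, (2, 0): 0.2}   # loop product 1.2 > 1
edges = [(u, v, -math.log(r)) for (u, v), r in rates.items()]
print(has_negative_cycle(3, edges, 0))  # True: profitable loop

no_arb = {(0, 1): 2.0, (1, 2): 3.0, (2, 0): 0.1}  # loop product 0.6 < 1
edges2 = [(u, v, -math.log(r)) for (u, v), r in no_arb.items()]
print(has_negative_cycle(3, edges2, 0))  # False: no arbitrage
```

Taking logs turns the multiplicative threshold (product of rates $> 1$) into the additive one (sum of weights $< 0$), which is exactly what Bellman-Ford tests.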
Python Implementation Notes
heapq does not support decrease-key. The standard Python Dijkstra pattern uses lazy deletion: push a duplicate entry for a vertex whenever its estimate improves, and skip stale entries on pop (the if dist_u > d[u]: continue check in the code above). This weakens the bound from the Fibonacci-heap $O(|E|+|V|\log|V|)$ to $O(|E|\log|E|)$; since $|E| = O(|V|^2)$, this is $O(|E|\log|V|)$ — equivalent for practical purposes. For production use, scipy.sparse.csgraph.shortest_path implements both Dijkstra and Bellman-Ford with compiled C backends, orders of magnitude faster than pure Python for large graphs.
A graph has directed edges with weights $\{-3, -1, 0, 2, 5\}$ and is known to be a DAG (no cycles). Which algorithm is best for single-source shortest paths?
DAG Relaxation handles any edge weights (positive or negative) as long as the graph has no cycles. Since we know it’s a DAG, it runs in $O(|V|+|E|)$ — the fastest possible. Bellman-Ford also works but is $O(|V||E|)$ — strictly slower. Dijkstra requires non-negative weights and would fail here. BFS only works for unweighted graphs. Always exploit structure: a DAG is the most structured case and always admits the fastest SSSP.
Dijkstra’s algorithm fails on graphs with negative edge weights even if there are no negative-weight cycles. Why?
Dijkstra’s correctness proof relies on: “once a vertex $v$ is extracted with estimate $d(s,v)$, no future path can improve it.” This holds because non-negative weights mean distances are monotonically non-decreasing along paths — a later vertex can’t be closer than an earlier one.
With a negative edge $(x, v)$, it’s possible that $v$ was extracted early with a suboptimal estimate, and then $x$ is later settled and reveals $d(s,x) + w(x,v) < d(s,v)$. By then Dijkstra has already “finalized” $v$ and won’t revisit it. Result: wrong answer. Bellman-Ford avoids this by not finalizing any vertex until all $|V|-1$ rounds are done.
Bellman-Ford runs $|V|-1$ rounds. Why exactly $|V|-1$ and not, say, $|V|$ or $|V|/2$?
A simple path (no repeated vertices) has at most $|V|-1$ edges, since it visits at most $|V|$ vertices. If no negative-weight cycles exist, every vertex has a simple shortest path. After round $k$, every estimate satisfies $d(s,v) \leq \delta_k(s,v)$ — all shortest paths with $\leq k$ edges are accounted for. After $|V|-1$ rounds, all simple shortest paths are correct.
The $|V|$-th round is used only to detect negative cycles: if any estimate still improves, a non-simple path (going through a negative cycle) can achieve lower weight, confirming a negative-weight cycle exists.
A road network graph has $|V| = 10{,}000$ intersections and $|E| = 12{,}000$ roads (sparse). Which priority queue gives the fastest Dijkstra, and what is the runtime?
For a sparse graph ($|E| = O(|V|)$), the binary heap is clearly best: $O(|E|\log|V|) \approx 12{,}000 \times 14 \approx 170{,}000$ operations. Array-based Dijkstra costs $O(|V|^2) = 10^8$, roughly 600× slower. A Fibonacci heap gives $O(|E|+|V|\log|V|) \approx 12{,}000 + 140{,}000 \approx 150{,}000$ — slightly better in theory, but its enormous constant factors make it slower in practice for all realistic inputs.
The binary heap crossover: array beats heap when $|E| = \Omega(|V|^2 / \log|V|)$, i.e., for dense graphs. Road networks are extremely sparse ($|E| \approx 1.2|V|$ here) — always use a heap.
Exercises
Work through these independently. Q1–3 are direct application, Q4–7 synthesis, Q8–10 careful argument.
Implement Dijkstra using heapq with lazy deletion. Test it on a 5-vertex directed graph with non-negative weights. Then demonstrate failure on a graph with one negative edge (show the incorrect output). Difficulty: Medium