Graph Traversal
Every graph algorithm boils down to visiting vertices in some principled order. BFS visits them by distance from a source — the right order for shortest paths. DFS visits them by finishing time — the right order for topological sort and cycle detection.
Graph Definitions and Representations
A graph $G = (V, E)$ consists of a set of vertices $V$ and a set of edges $E \subseteq V \times V$. A directed edge $(u, v)$ goes from $u$ to $v$ only. An undirected edge $\{u,v\}$ is equivalent to both $(u,v)$ and $(v,u)$. We assume simple graphs: no self-loops, no duplicate edges, so $|E| = O(|V|^2)$.
The outgoing neighbors of $u$: $\text{Adj}(u) = \{v \in V \mid (u, v) \in E\}$. Out-degree $\deg(u) = |\text{Adj}(u)|$. The handshaking lemma: $\sum_{u \in V} \deg(u) \leq 2|E|$. Consequence: a graph is storable in $\Theta(|V| + |E|)$ space using adjacency lists, and linear-time algorithms on graphs run in $\Theta(|V| + |E|)$.
G = [[1,3], [0,2], [1], [0]] — vertex 0 points to 1 and 3; vertex 1 points to 0 and 2; etc.
Paths and Shortest Path Problems
A path is a sequence of vertices $(v_1,\ldots,v_k)$ where each $(v_i, v_{i+1}) \in E$. The distance $\delta(s,v)$ is the minimum number of edges on any path from $s$ to $v$ ($\infty$ if unreachable). Three natural problems exist, each harder than the last: single-pair reachability, single-pair shortest path, and single-source shortest paths (SSSP) — returning $\delta(s,v)$ for every $v \in V$. BFS solves SSSP in $O(|V| + |E|)$.
Breadth-First Search — Shortest Paths by Level
Key insight: every vertex at distance $i$ from $s$ must have an incoming edge from a vertex at distance $i-1$. Build level sets $L_0, L_1, \ldots$ where $L_i = \{v \mid \delta(s,v) = i\}$. Track a parent array: $\text{parent}[v] = u$ means "$v$ was discovered from $u$". Parent pointers form a shortest-path tree.
from collections import deque
def bfs(Adj, s):
"""
BFS from source s. Returns parent array (shortest-path tree).
parent[s]=s (root), parent[v]=None means unreachable.
O(|V| + |E|).
"""
parent = [None] * len(Adj)
parent[s] = s # mark s visited
queue = deque([s]) # FIFO queue — key BFS structure
while queue:
u = queue.popleft() # O(1)
for v in Adj[u]:
if parent[v] is None: # v not yet visited
parent[v] = u
queue.append(v)
return parent
def shortest_path(Adj, s, t):
"""Recover shortest path s -> t from BFS parent tree."""
parent = bfs(Adj, s)
if parent[t] is None: return None # unreachable
path, v = [], t
while v != s:
path.append(v)
v = parent[v]
path.append(s)
return path[::-1] # reverse to get s -> t#include <vector>
#include <queue>
using namespace std;
vector<int> bfs(const vector<vector<int>>& Adj, int s) {
int V = Adj.size();
vector<int> parent(V, -1);
parent[s] = s;
queue<int> q;
q.push(s);
while (!q.empty()) {
int u = q.front(); q.pop();
for (int v : Adj[u]) {
if (parent[v] == -1) {
parent[v] = u;
q.push(v);
}
}
}
return parent;
}
vector<int> shortest_path(const vector<vector<int>>& Adj, int s, int t) {
auto parent = bfs(Adj, s);
if (parent[t] == -1) return {};
vector<int> path;
for (int v = t; v != s; v = parent[v]) path.push_back(v);
path.push_back(s);
reverse(path.begin(), path.end());
return path;
}Runtime: each vertex enters the queue at most once. For each vertex $u$ dequeued, we do $O(1)$ work per neighbor. Total: $O(\sum_u \deg(u)) = O(|E|)$ by handshaking. Plus $O(|V|)$ for initialization. Grand total: $O(|V| + |E|)$.
Depth-First Search — Following Paths to Their End
DFS explores as deep as possible before backtracking. From $s$, visit each unvisited neighbor recursively. When a vertex has no unvisited neighbors, it finishes and we backtrack. DFS records a finishing order — the order vertices complete. The reverse finishing order is the key to topological sort.
def dfs(Adj, s, parent=None, order=None):
"""
DFS from s. Appends vertices to order in finishing order.
Reverse of order = topological sort (if graph is a DAG).
O(|V| + |E|) when called via full_dfs.
"""
if parent is None:
parent = [None] * len(Adj)
parent[s] = s
order = []
for v in Adj[s]:
if parent[v] is None: # v not yet visited
parent[v] = s
dfs(Adj, v, parent, order) # go deep first
order.append(s) # s finishes
return parent, order
def full_dfs(Adj):
"""Explore entire graph (all components). O(|V| + |E|)."""
parent = [None] * len(Adj)
order = []
for v in range(len(Adj)):
if parent[v] is None:
parent[v] = v # v is a root (new component)
dfs(Adj, v, parent, order)
return parent, order#include <vector>
using namespace std;
void dfs(const vector<vector<int>>& Adj, int s,
vector<int>& parent, vector<int>& order) {
for (int v : Adj[s]) {
if (parent[v] == -1) {
parent[v] = s;
dfs(Adj, v, parent, order);
}
}
order.push_back(s); // s finishes
}
pair<vector<int>,vector<int>> full_dfs(const vector<vector<int>>& Adj) {
int V = Adj.size();
vector<int> parent(V, -1), order;
for (int v = 0; v < V; ++v) {
if (parent[v] == -1) {
parent[v] = v;
dfs(Adj, v, parent, order);
}
}
return {parent, order};
}Topological Sort and Cycle Detection
Claim: the reverse of the DFS finishing order is a topological order for any DAG.
Proof (two cases): for any edge $(u,v)$, we show $v$ finishes before $u$. If $u$ visited before $v$: before $u$ finishes, it visits $v$, so $v$ finishes first. If $v$ visited before $u$: since the graph is acyclic, $u$ is not reachable from $v$, so $v$ finishes entirely before $u$ starts. In both cases $v$ finishes before $u$, so reversing puts $u$ before $v$. ✓
def topological_sort(Adj):
"""Returns vertices in topological order for a DAG. O(|V|+|E|)."""
_, order = full_dfs(Adj)
return order[::-1] # reverse finishing order
def has_cycle(Adj):
"""
Detect directed cycle using 3-color DFS.
WHITE=0 (unvisited), GRAY=1 (on current path), BLACK=2 (done).
A back edge to a GRAY vertex certifies a cycle.
O(|V| + |E|).
"""
color = [0] * len(Adj)
def visit(u):
color[u] = 1 # u is on current DFS path
for v in Adj[u]:
if color[v] == 1: # back edge to ancestor -> cycle!
return True
if color[v] == 0 and visit(v):
return True
color[u] = 2 # u is done
return False
return any(visit(v) for v in range(len(Adj)) if color[v] == 0)#include <vector>
#include <functional>
using namespace std;
vector<int> topological_sort(const vector<vector<int>>& Adj) {
auto [parent, order] = full_dfs(Adj);
reverse(order.begin(), order.end());
return order;
}
bool has_cycle(const vector<vector<int>>& Adj) {
int V = Adj.size();
vector<int> color(V, 0); // 0=white,1=gray,2=black
function<bool(int)> visit = [&](int u) -> bool {
color[u] = 1;
for (int v : Adj[u]) {
if (color[v] == 1) return true; // back edge
if (color[v] == 0 && visit(v)) return true;
}
color[u] = 2;
return false;
};
for (int v = 0; v < V; ++v)
if (color[v] == 0 && visit(v)) return true;
return false;
}BFS vs DFS — When to Use Which
| Property | BFS | DFS |
|---|---|---|
| Data structure | Queue (FIFO) | Stack / recursion (LIFO) |
| Shortest paths (unweighted) | Yes — always | No guarantee |
| Topological sort | No | Yes — reverse finishing order |
| Cycle detection | Indirect | Yes — back edges |
| Connected components | Yes (Full-BFS) | Yes (Full-DFS) |
| Runtime | $O(|V| + |E|)$ | $O(|V| + |E|)$ |
A trading strategy has modules with dependencies: config must load before instruments and risk; instruments before pricer and hedger; risk before hedger. Find a valid startup order.
Vertex per module, directed edge $(u,v)$ meaning "$u$ must start before $v$". Run topological sort on this DAG. DFS finishing order (reversed):
'config': ['instruments', 'risk'],
'instruments': ['pricer', 'hedger'],
'risk': ['hedger'],
'pricer': [],
'hedger': [],
'logger': []
}
One valid order: config → risk → instruments → hedger → pricer → logger. All dependency edges point forward. Modules can start in this sequence without any module waiting on one not yet loaded.
import pandas, Python builds a dependency DAG of all required packages and imports them in topological order. Circular imports cause ImportError — Python's cycle detector firing. For TTT's strategy pipeline, any time you have "X must happen before Y", you have a DAG. Model it explicitly, run topological sort, and you have a provably correct startup sequence.
A graph has 5 vertices and 8 undirected edges. BFS is run from vertex 0. What is the maximum number of times the inner loop for v in Adj[u] can execute across the entire BFS?
Each vertex $u$ is dequeued at most once. When $u$ is dequeued, the inner loop runs $\deg(u)$ times. Total executions $\leq \sum_u \deg(u) \leq 2|E| = 16$ by the handshaking lemma (each undirected edge contributes to two adjacency lists). Option D would be correct only for directed graphs where each edge appears in exactly one adjacency list.
You need to find the shortest path (fewest edges) from $s$ to $t$ in an unweighted graph. Which algorithm is correct?
BFS discovers vertices level-by-level. When $t$ is first reached at level $d$, no shorter path exists — all vertices at distance $< d$ have already been processed. DFS may reach $t$ via a long detour before ever trying a direct short path. Same $O(|V|+|E|)$ runtime, completely different guarantees. Dijkstra with unit weights IS BFS — they are equivalent for unweighted graphs.
DAG with edges $A \to C$, $B \to C$, $B \to D$, $C \to E$, $D \to E$. Which is a valid topological order?
Check all edges: $A \to C$ ✓ ($A$ before $C$); $B \to C$ ✓; $B \to D$ ✓; $C \to E$ ✓; $D \to E$ ✓. All edges point forward — valid.
Option A fails: $B \to C$ violated ($B$ comes after $C$ in that ordering). Option C fails: $E$ is first but $C \to E$ and $D \to E$ require $E$ to come after both. Option D fails: $C$ comes before $B$ but $B \to C$ requires $B$ before $C$.
You want the shortest path from $s$ to $t$ using an odd number of edges. Direct BFS fails. What is the correct approach?
Create $G'$ with two copies of each vertex: $u_E$ (reached via even edges) and $u_O$ (reached via odd edges). For each edge $(u,v) \in E$, add $(u_E, v_O)$ and $(u_O, v_E)$ — each step flips the parity. BFS from $s_E$ to $t_O$ finds the shortest odd-length path. $G'$ has $2|V|$ vertices, $2|E|$ edges, so BFS runs in $O(|V|+|E|)$.
This is the state-space doubling technique: when you need to track something beyond position (here: parity), encode it as part of the vertex. It generalises: tracking colour, energy level, number of turns, etc. all work the same way.
Work through independently. Q1–3 direct application. Q4–7 synthesis. Q8–10 careful argument.
connected_components(Adj) using Full-BFS. Return a list of vertex sets, one per connected component. Test on a graph with three disconnected components. Medium