/
ProgrammingAlgorithms & DSGraph Traversal
Progress
9 / 13
← All Chapters
Graph Algorithms Chapter 09 of 13 · Algorithms and Data Structures

Graph Traversal

Every graph algorithm boils down to visiting vertices in some principled order. BFS visits them by distance from a source — the right order for shortest paths. DFS visits them by finishing time — the right order for topological sort and cycle detection.

Graph Definitions and Representations

A graph $G = (V, E)$ consists of a set of vertices $V$ and a set of edges $E \subseteq V \times V$. A directed edge $(u, v)$ goes from $u$ to $v$ only. An undirected edge $\{u,v\}$ is equivalent to both $(u,v)$ and $(v,u)$. We assume simple graphs: no self-loops, no duplicate edges, so $|E| = O(|V|^2)$.

The outgoing neighbors of $u$: $\text{Adj}(u) = \{v \in V \mid (u, v) \in E\}$. Out-degree $\deg(u) = |\text{Adj}(u)|$. The handshaking lemma: $\sum_{u \in V} \deg(u) \leq 2|E|$. Consequence: a graph is storable in $\Theta(|V| + |E|)$ space using adjacency lists, and linear-time algorithms on graphs run in $\Theta(|V| + |E|)$.

Adjacency list representation. Store a mapping (direct-access array or hash map) from each vertex $u$ to an array of its outgoing neighbors. Standard representation used in all algorithms below.

G = [[1,3], [0,2], [1], [0]] — vertex 0 points to 1 and 3; vertex 1 points to 0 and 2; etc.

Paths and Shortest Path Problems

A path is a sequence of vertices $(v_1,\ldots,v_k)$ where each $(v_i, v_{i+1}) \in E$. The distance $\delta(s,v)$ is the minimum number of edges on any path from $s$ to $v$ ($\infty$ if unreachable). Three natural problems exist, each harder than the last: single-pair reachability, single-pair shortest path, and single-source shortest paths (SSSP) — returning $\delta(s,v)$ for every $v \in V$. BFS solves SSSP in $O(|V| + |E|)$.

Breadth-First Search — Shortest Paths by Level

Key insight: every vertex at distance $i$ from $s$ must have an incoming edge from a vertex at distance $i-1$. Build level sets $L_0, L_1, \ldots$ where $L_i = \{v \mid \delta(s,v) = i\}$. Track a parent array: $\text{parent}[v] = u$ means "$v$ was discovered from $u$". Parent pointers form a shortest-path tree.

BFS · O(|V| + |E|)
from collections import deque

def bfs(Adj, s):
    """
    BFS from source s. Returns parent array (shortest-path tree).
    parent[s]=s (root), parent[v]=None means unreachable.
    O(|V| + |E|).
    """
    parent = [None] * len(Adj)
    parent[s] = s                      # mark s visited
    queue = deque([s])                 # FIFO queue — key BFS structure
    while queue:
        u = queue.popleft()            # O(1)
        for v in Adj[u]:
            if parent[v] is None:      # v not yet visited
                parent[v] = u
                queue.append(v)
    return parent

def shortest_path(Adj, s, t):
    """Recover shortest path s -> t from BFS parent tree."""
    parent = bfs(Adj, s)
    if parent[t] is None: return None  # unreachable
    path, v = [], t
    while v != s:
        path.append(v)
        v = parent[v]
    path.append(s)
    return path[::-1]                  # reverse to get s -> t
#include <vector>
#include <queue>
using namespace std;

vector<int> bfs(const vector<vector<int>>& Adj, int s) {
    int V = Adj.size();
    vector<int> parent(V, -1);
    parent[s] = s;
    queue<int> q;
    q.push(s);
    while (!q.empty()) {
        int u = q.front(); q.pop();
        for (int v : Adj[u]) {
            if (parent[v] == -1) {
                parent[v] = u;
                q.push(v);
            }
        }
    }
    return parent;
}

vector<int> shortest_path(const vector<vector<int>>& Adj, int s, int t) {
    auto parent = bfs(Adj, s);
    if (parent[t] == -1) return {};
    vector<int> path;
    for (int v = t; v != s; v = parent[v]) path.push_back(v);
    path.push_back(s);
    reverse(path.begin(), path.end());
    return path;
}
↑ Run C++ on Compiler Explorer (Godbolt)

Runtime: each vertex enters the queue at most once. For each vertex $u$ dequeued, we do $O(1)$ work per neighbor. Total: $O(\sum_u \deg(u)) = O(|E|)$ by handshaking. Plus $O(|V|)$ for initialization. Grand total: $O(|V| + |E|)$.

The queue enforces level-by-level exploration. When $t$ is first reached at level $d$, that IS the shortest distance — all vertices at distance $< d$ have already been processed. Replace the queue with a stack and you get DFS: same $O(|V|+|E|)$ runtime, no shortest-path guarantee.

Depth-First Search — Following Paths to Their End

DFS explores as deep as possible before backtracking. From $s$, visit each unvisited neighbor recursively. When a vertex has no unvisited neighbors, it finishes and we backtrack. DFS records a finishing order — the order vertices complete. The reverse finishing order is the key to topological sort.

DFS + Full-DFS · O(|V| + |E|)
def dfs(Adj, s, parent=None, order=None):
    """
    DFS from s. Appends vertices to order in finishing order.
    Reverse of order = topological sort (if graph is a DAG).
    O(|V| + |E|) when called via full_dfs.
    """
    if parent is None:
        parent = [None] * len(Adj)
        parent[s] = s
        order = []
    for v in Adj[s]:
        if parent[v] is None:      # v not yet visited
            parent[v] = s
            dfs(Adj, v, parent, order)   # go deep first
    order.append(s)                # s finishes
    return parent, order

def full_dfs(Adj):
    """Explore entire graph (all components). O(|V| + |E|)."""
    parent = [None] * len(Adj)
    order  = []
    for v in range(len(Adj)):
        if parent[v] is None:
            parent[v] = v          # v is a root (new component)
            dfs(Adj, v, parent, order)
    return parent, order
#include <vector>
using namespace std;

void dfs(const vector<vector<int>>& Adj, int s,
         vector<int>& parent, vector<int>& order) {
    for (int v : Adj[s]) {
        if (parent[v] == -1) {
            parent[v] = s;
            dfs(Adj, v, parent, order);
        }
    }
    order.push_back(s);            // s finishes
}

pair<vector<int>,vector<int>> full_dfs(const vector<vector<int>>& Adj) {
    int V = Adj.size();
    vector<int> parent(V, -1), order;
    for (int v = 0; v < V; ++v) {
        if (parent[v] == -1) {
            parent[v] = v;
            dfs(Adj, v, parent, order);
        }
    }
    return {parent, order};
}
↑ Run C++ on Compiler Explorer (Godbolt)

Topological Sort and Cycle Detection

A directed acyclic graph (DAG) is a directed graph with no directed cycles. A topological order assigns a rank $f(v)$ to each vertex so that every edge $(u,v)$ satisfies $f(u) < f(v)$ — all edges point forward.

Claim: the reverse of the DFS finishing order is a topological order for any DAG.

Proof (two cases): for any edge $(u,v)$, we show $v$ finishes before $u$. If $u$ visited before $v$: before $u$ finishes, it visits $v$, so $v$ finishes first. If $v$ visited before $u$: since the graph is acyclic, $u$ is not reachable from $v$, so $v$ finishes entirely before $u$ starts. In both cases $v$ finishes before $u$, so reversing puts $u$ before $v$. ✓
topological_sort + has_cycle · O(|V| + |E|)
def topological_sort(Adj):
    """Returns vertices in topological order for a DAG. O(|V|+|E|)."""
    _, order = full_dfs(Adj)
    return order[::-1]             # reverse finishing order


def has_cycle(Adj):
    """
    Detect directed cycle using 3-color DFS.
    WHITE=0 (unvisited), GRAY=1 (on current path), BLACK=2 (done).
    A back edge to a GRAY vertex certifies a cycle.
    O(|V| + |E|).
    """
    color = [0] * len(Adj)

    def visit(u):
        color[u] = 1               # u is on current DFS path
        for v in Adj[u]:
            if color[v] == 1:      # back edge to ancestor -> cycle!
                return True
            if color[v] == 0 and visit(v):
                return True
        color[u] = 2               # u is done
        return False

    return any(visit(v) for v in range(len(Adj)) if color[v] == 0)
#include <vector>
#include <functional>
using namespace std;

vector<int> topological_sort(const vector<vector<int>>& Adj) {
    auto [parent, order] = full_dfs(Adj);
    reverse(order.begin(), order.end());
    return order;
}

bool has_cycle(const vector<vector<int>>& Adj) {
    int V = Adj.size();
    vector<int> color(V, 0); // 0=white,1=gray,2=black

    function<bool(int)> visit = [&](int u) -> bool {
        color[u] = 1;
        for (int v : Adj[u]) {
            if (color[v] == 1) return true;   // back edge
            if (color[v] == 0 && visit(v)) return true;
        }
        color[u] = 2;
        return false;
    };

    for (int v = 0; v < V; ++v)
        if (color[v] == 0 && visit(v)) return true;
    return false;
}
↑ Run C++ on Compiler Explorer (Godbolt)

BFS vs DFS — When to Use Which

Property BFS DFS
Data structureQueue (FIFO)Stack / recursion (LIFO)
Shortest paths (unweighted)Yes — alwaysNo guarantee
Topological sortNoYes — reverse finishing order
Cycle detectionIndirectYes — back edges
Connected componentsYes (Full-BFS)Yes (Full-DFS)
Runtime$O(|V| + |E|)$$O(|V| + |E|)$
Worked Example · Strategy Startup Dependency Resolution 🇮🇳 TTT Systems

A trading strategy has modules with dependencies: config must load before instruments and risk; instruments before pricer and hedger; risk before hedger. Find a valid startup order.

Model as a DAG

Vertex per module, directed edge $(u,v)$ meaning "$u$ must start before $v$". Run topological sort on this DAG. DFS finishing order (reversed):

G = {
  'config':      ['instruments', 'risk'],
  'instruments': ['pricer', 'hedger'],
  'risk':        ['hedger'],
  'pricer':      [],
  'hedger':      [],
  'logger':      []
}
Topological sort result

One valid order: config → risk → instruments → hedger → pricer → logger. All dependency edges point forward. Modules can start in this sequence without any module waiting on one not yet loaded.

Trading application: if a cycle is detected (e.g., module A depends on B, B depends on A), the system should raise an error at startup, not silently deadlock. DFS cycle detection finds this in $O(|V|+|E|)$ time.
Python's own import system uses topological sort internally. When you import pandas, Python builds a dependency DAG of all required packages and imports them in topological order. Circular imports cause ImportError — Python's cycle detector firing. For TTT's strategy pipeline, any time you have "X must happen before Y", you have a DAG. Model it explicitly, run topological sort, and you have a provably correct startup sequence.

Practice Problems
4 questions · Chapter 09
prog / dsa / ch09 / q01 ★ Runtime Analysis

A graph has 5 vertices and 8 undirected edges. BFS is run from vertex 0. What is the maximum number of times the inner loop for v in Adj[u] can execute across the entire BFS?

A
At most 5 — one per vertex
B
At most 25 — $|V|^2$ worst case
C
At most 16 — $2|E|$ by the handshaking lemma
D
Exactly 8 — one per edge
Answer: C — at most $2|E| = 16$.

Each vertex $u$ is dequeued at most once. When $u$ is dequeued, the inner loop runs $\deg(u)$ times. Total executions $\leq \sum_u \deg(u) \leq 2|E| = 16$ by the handshaking lemma (each undirected edge contributes to two adjacency lists). Option D would be correct only for directed graphs where each edge appears in exactly one adjacency list.
prog / dsa / ch09 / q02 ★★ BFS vs DFS

You need to find the shortest path (fewest edges) from $s$ to $t$ in an unweighted graph. Which algorithm is correct?

A
DFS — it explores all paths so it will find the shortest one
B
BFS — it visits vertices in order of distance, guaranteeing the first path found is shortest
C
Either — same runtime, same result
D
Neither — you need Dijkstra even for unweighted graphs
Answer: B — BFS.

BFS discovers vertices level-by-level. When $t$ is first reached at level $d$, no shorter path exists — all vertices at distance $< d$ have already been processed. DFS may reach $t$ via a long detour before ever trying a direct short path. Same $O(|V|+|E|)$ runtime, completely different guarantees. Dijkstra with unit weights IS BFS — they are equivalent for unweighted graphs.
prog / dsa / ch09 / q03 ★★ Topological Sort

DAG with edges $A \to C$, $B \to C$, $B \to D$, $C \to E$, $D \to E$. Which is a valid topological order?

A
$A, C, B, D, E$
B
$A, B, C, D, E$
C
$E, C, D, A, B$
D
$C, A, B, D, E$
Answer: B — $A, B, C, D, E$.

Check all edges: $A \to C$ ✓ ($A$ before $C$); $B \to C$ ✓; $B \to D$ ✓; $C \to E$ ✓; $D \to E$ ✓. All edges point forward — valid.

Option A fails: $B \to C$ violated ($B$ comes after $C$ in that ordering). Option C fails: $E$ is first but $C \to E$ and $D \to E$ require $E$ to come after both. Option D fails: $C$ comes before $B$ but $B \to C$ requires $B$ before $C$.
prog / dsa / ch09 / q04 ★★★ Graph Modelling

You want the shortest path from $s$ to $t$ using an odd number of edges. Direct BFS fails. What is the correct approach?

A
Run DFS — it naturally finds odd-length paths
B
Add weight 1 to each edge and run Dijkstra
C
Build a doubled graph with "even" and "odd" copies of each vertex, run BFS from $s_E$ to $t_O$
D
Run BFS twice and return the longer of the two paths
Answer: C — doubled graph.

Create $G'$ with two copies of each vertex: $u_E$ (reached via even edges) and $u_O$ (reached via odd edges). For each edge $(u,v) \in E$, add $(u_E, v_O)$ and $(u_O, v_E)$ — each step flips the parity. BFS from $s_E$ to $t_O$ finds the shortest odd-length path. $G'$ has $2|V|$ vertices, $2|E|$ edges, so BFS runs in $O(|V|+|E|)$.

This is the state-space doubling technique: when you need to track something beyond position (here: parity), encode it as part of the vertex. It generalises: tracking colour, energy level, number of turns, etc. all work the same way.

Terminal Questions — Chapter 09 10 problems · No answers given

Work through independently. Q1–3 direct application. Q4–7 synthesis. Q8–10 careful argument.

1
Run BFS on $G = \{0:[1,3],\ 1:[0,2,4],\ 2:[1,5],\ 3:[0,4],\ 4:[1,3,5],\ 5:[2,4]\}$ from vertex 0. Show the queue state after each step. What is the shortest path from 0 to 5? Easy
2
Run DFS on directed $G = \{0:[1,2],\ 1:[3],\ 2:[3],\ 3:[4],\ 4:[]\}$ from vertex 0. Show the finishing order. Reverse it to get the topological order. Verify all edges point forward. Easy
3
Does $G = \{0:[1],\ 1:[2],\ 2:[0],\ 3:[2]\}$ contain a cycle? Run the 3-color DFS cycle detection and identify the back edge. Easy
4
Prove BFS correctly computes shortest-path distances. Prove by induction on $d$ that when BFS first visits vertex $v$ at level $d$, $\delta(s,v) = d$. Your inductive step should use the FIFO property of the queue. Medium
5
Implement connected_components(Adj) using Full-BFS. Return a list of vertex sets, one per connected component. Test on a graph with three disconnected components. Medium
6
A strategy pipeline has 8 components with dependencies given as pairs $(a, b)$ meaning "$a$ must run before $b$". Write Python code to: build the dependency graph, run topological sort, and detect and report a cycle if present. Test with both a valid DAG and one containing a cycle. Medium
7
Describe 0-1 BFS: in a graph where each edge has weight 0 or 1, compute shortest paths in $O(|V|+|E|)$ using a deque (push weight-0 edges to front, weight-1 to back). Prove correctness. How does this differ from standard BFS? Medium
8
Prove that DFS on a DAG produces a valid topological order when reversed (the two-case proof). Then prove the converse: if a topological order exists, the graph must be a DAG. Hard
9
Design an algorithm to find the diameter of an unweighted undirected graph — $\max_{u,v} \delta(u,v)$. (a) $O(|V|(|V|+|E|))$ algorithm. (b) For trees, $O(|V|)$ algorithm using two BFS calls. Prove correctness of (b). Hard
10
Model the arbitrage detection problem: given $n$ currencies and exchange rates $r_{ij}$ (units of $j$ per unit of $i$), detect whether there is a sequence of trades starting and ending with currency 0 that nets a profit. What type of graph problem is this, and which algorithm (from this or the next chapter) solves it? Hard