Sorting I
Sorting is the gateway problem of algorithms. Simple to state, surprisingly subtle to analyse, and the foundation for every efficient data structure that follows. We build from brute force to $\Theta(n^2)$ to $\Theta(n \log n)$ — and prove each step along the way.
Why Sorting? The Set Interface Connection
In Chapter 2 we built Sequence data structures — arrays, linked lists, dynamic arrays. They all store items in an order you impose. But a different interface matters more for search: the Set interface, where items have intrinsic keys and you query by key value rather than position.
find(k) — return the item with key $k$ · insert(x) — add item $x$ · delete(k) — remove item with key $k$
find_min() · find_max() · find_next(k) · find_prev(k) — order queries
A plain unsorted array can implement all of these — just scan for $k$. But every operation costs $\Theta(n)$. A sorted array changes everything:
find(k) drops to $\Theta(\log n)$ via binary search, find_min and find_max become $\Theta(1)$ (first and last elements). The price: building the sorted array costs $\Theta(n \log n)$. That is the preprocessing-vs-query tradeoff, and it is worth it.
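The binary search behind that $\Theta(\log n)$ bound can be sketched in a few lines (a minimal version, not tied to any particular Set implementation):

```python
def binary_search(A, k):
    """Return the index of k in sorted array A, or -1 if absent. Θ(log n)."""
    lo, hi = 0, len(A)           # invariant: k, if present, lies in A[lo:hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if A[mid] == k:
            return mid
        elif A[mid] < k:
            lo = mid + 1         # discard left half, including mid
        else:
            hi = mid             # discard right half
    return -1

A = [23000, 23500, 24000, 24500, 25000]
print(binary_search(A, 24000))   # → 2
print(binary_search(A, 24250))   # → -1
```

Each iteration halves the candidate range, so at most $\lceil \log_2 n \rceil + 1$ probes are needed.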
So sorting is not an end in itself — it is the prerequisite for efficient Set operations. Every chapter from here through Chapter 8 is about building faster and faster Set implementations, all starting from the ability to sort.
The Sorting Problem — Precisely Stated
Given an input array $A$ of $n$ keys, output an array $B$ satisfying two properties:
— Permutation: $B$ contains the same elements as $A$, possibly in a different order.
— Sorted: $B[i-1] \leq B[i]$ for all $i \in \{1, \ldots, n-1\}$.
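Both output properties can be checked mechanically; a small sketch (the helper names are our own):

```python
def is_sorted(B):
    """Check B[i-1] <= B[i] for all i in 1..n-1."""
    return all(B[i - 1] <= B[i] for i in range(1, len(B)))

def is_permutation(A, B):
    """Check B contains exactly the same elements as A, with multiplicity."""
    return sorted(A) == sorted(B)

A = [3, 1, 2, 1]
B = [1, 1, 2, 3]
print(is_sorted(B) and is_permutation(A, B))  # → True
```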
A sort is destructive if it overwrites $A$ in place. A sort is in-place if it uses $O(1)$ extra space beyond the input (in-place implies destructive). A sort is stable if equal elements appear in the output in the same relative order as in the input.
Before analysing good algorithms, let us see the worst possible one — just to appreciate the gap.
Permutation sort: there are $n!$ permutations of $A$; at least one is sorted. Try all of them. Correct by exhaustive case analysis. Runtime: $\Omega(n! \cdot n)$ — checking each permutation takes $\Theta(n)$. For $n = 20$, $20! \approx 2.4 \times 10^{18}$. This is not slow — it is cosmically unusable.
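For completeness, permutation sort fits in a few lines with itertools, purely to make the absurdity concrete, never to be used:

```python
from itertools import permutations

def permutation_sort(A):
    """Try every permutation of A; return the sorted one. Ω(n! · n) — never use."""
    for B in permutations(A):
        # Θ(n) sortedness check per permutation
        if all(B[i - 1] <= B[i] for i in range(1, len(B))):
            return list(B)

print(permutation_sort([3, 1, 2]))  # → [1, 2, 3]
```

Even at $10^9$ checks per second, $n = 20$ would take decades.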
Algorithm 1 — Selection Sort
The first sensible idea: at each step, find the largest unsorted element and put it in its final position. After $i$ steps, the $i$ largest elements are correctly placed at the end of the array.
Invariant: after iteration $i$, sub-array $A[n-i:]$ contains the $i$ largest elements in sorted order, and $A[:n-i]$ contains the remaining elements in some order.
def selection_sort(A):
    """Sort array A in-place. Θ(n²) comparisons, O(n) swaps."""
    n = len(A)
    for i in range(n - 1, 0, -1):   # i goes n-1, n-2, ..., 1
        m = i                       # assume A[i] is max of A[:i+1]
        for j in range(i):          # scan A[:i] for a larger element
            if A[j] > A[m]:
                m = j               # found new max
        A[m], A[i] = A[i], A[m]     # swap max into final position

# Example: sort NIFTY weekly expiry strikes
strikes = [24500, 24000, 23500, 25000, 23000]
selection_sort(strikes)
print(strikes)  # [23000, 23500, 24000, 24500, 25000]
#include <vector>
#include <algorithm>
using namespace std;

// Selection sort — Θ(n²) comparisons, O(n) swaps, in-place
void selection_sort(vector<int>& A) {
    int n = A.size();
    for (int i = n - 1; i > 0; --i) {
        int m = i;                      // index of current max
        for (int j = 0; j < i; ++j)
            if (A[j] > A[m]) m = j;     // find max in A[0..i]
        swap(A[m], A[i]);               // place max at A[i]
    }
}

// Example
// vector<int> strikes = {24500, 24000, 23500, 25000, 23000};
// selection_sort(strikes);
// → {23000, 23500, 24000, 24500, 25000}
Complexity analysis via recurrence. Let $T(n)$ be the time to sort $n$ elements. Finding the max in a prefix of length $i$ takes $\Theta(i)$, so the recurrence is:
$$T(1) = \Theta(1), \qquad T(n) = T(n-1) + \Theta(n)$$
Unrolling gives $T(n) = \Theta(n) + \Theta(n-1) + \cdots + \Theta(1) = \Theta\!\left(\sum_{i=1}^{n} i\right) = \Theta(n^2)$.
Selection sort is in-place ($O(1)$ extra space) and uses at most $O(n)$ swaps — the minimum possible for a comparison sort. But it is not stable: swapping can move equal elements out of their original relative order.
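To see the instability concretely, here is a sketch that runs a keyed variant of the selection sort above on (key, tag) records; the record layout is our own choice:

```python
def selection_sort(A, key=lambda x: x):
    """Selection sort as above, extended with an optional key function."""
    for i in range(len(A) - 1, 0, -1):
        m = i
        for j in range(i):
            if key(A[j]) > key(A[m]):
                m = j
        A[m], A[i] = A[i], A[m]

# Two records share key 2; the long-distance swap reverses their order.
records = [(2, 'first'), (2, 'second'), (1, 'x')]
selection_sort(records, key=lambda r: r[0])
print(records)  # → [(1, 'x'), (2, 'second'), (2, 'first')] — 'second' now precedes 'first'
```

The first pass swaps (2, 'first') past its equal partner into the last slot, breaking the original relative order.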
Algorithm 2 — Insertion Sort
Different invariant: at each step, the first $i$ elements are sorted among themselves. Insert the next element into its correct position in the sorted prefix by sliding it left until it finds its place.
Invariant: after iteration $i$, sub-array $A[:i]$ is sorted. The remaining elements are untouched.
def insertion_sort(A):
    """Sort array A in-place. Stable. Θ(n²) worst case, Θ(n) if nearly sorted."""
    for i in range(1, len(A)):
        j = i
        # slide A[j] left until it finds its sorted position
        while j > 0 and A[j] < A[j - 1]:
            A[j - 1], A[j] = A[j], A[j - 1]
            j -= 1
# Key insight: insertion sort is fast on nearly-sorted data.
# If A is already sorted, the while loop never executes — Θ(n) total.
# Real-world use: TimSort (Python's built-in sort) uses insertion sort
# for small subarrays (n < 64) because cache locality dominates there.
#include <vector>
using namespace std;

// Insertion sort — stable, in-place, Θ(n²) worst, Θ(n) if nearly sorted
void insertion_sort(vector<int>& A) {
    int n = A.size();
    for (int i = 1; i < n; ++i) {
        int key = A[i];
        int j = i - 1;
        // shift elements greater than key one position right
        while (j >= 0 && A[j] > key) {
            A[j + 1] = A[j];
            --j;
        }
        A[j + 1] = key;   // insert key in its correct position
    }
}
// Note: C++ version uses shifts (not swaps) — same O complexity,
// but ~2x fewer memory writes in practice on sorted-ish data.
Python's built-in sorted() (TimSort) exploits exactly this nearly-sorted advantage.
Algorithm 3 — Merge Sort
Both $\Theta(n^2)$ algorithms share the same structure: fix one element per pass, paying $\Theta(n)$ per pass, for $n$ passes total. To do better, we need to fix more structure per pass. Merge sort achieves this by divide and conquer.
Idea: recursively sort the left and right halves, then merge the two sorted halves into one sorted array. The merge step scans both halves simultaneously — two pointers walking from back to front — and takes $\Theta(n)$.
def merge_sort(A, a=0, b=None):
    """Sort A[a:b] in-place. Θ(n log n) time, Θ(n) extra space."""
    if b is None: b = len(A)
    if b - a <= 1: return       # base case: 0 or 1 element
    c = (a + b + 1) // 2        # midpoint (ceiling)
    merge_sort(A, a, c)         # sort left half   T(n/2)
    merge_sort(A, c, b)         # sort right half  T(n/2)
    # merge: copy both halves, then merge back into A[a:b]
    L, R = A[a:c], A[c:b]       # O(n) copy
    i = j = 0
    for k in range(a, b):       # O(n) merge
        if j >= len(R) or (i < len(L) and L[i] <= R[j]):
            A[k] = L[i]; i += 1
        else:
            A[k] = R[j]; j += 1
    # Note: L[i] <= R[j] (not <) makes the sort stable
#include <vector>
using namespace std;

void merge(vector<int>& A, int a, int c, int b) {
    vector<int> L(A.begin() + a, A.begin() + c);   // copy left half
    vector<int> R(A.begin() + c, A.begin() + b);   // copy right half
    int i = 0, j = 0;
    for (int k = a; k < b; ++k) {
        if (j >= (int)R.size() || (i < (int)L.size() && L[i] <= R[j]))
            A[k] = L[i++];
        else
            A[k] = R[j++];
    }
}

void merge_sort(vector<int>& A, int a, int b) {
    if (b - a <= 1) return;     // base case
    int c = (a + b + 1) / 2;
    merge_sort(A, a, c);        // T(n/2)
    merge_sort(A, c, b);        // T(n/2)
    merge(A, a, c, b);          // Θ(n)
}

// Entry point
void merge_sort(vector<int>& A) {
    merge_sort(A, 0, A.size());
}
Solving the Merge Sort Recurrence
The recurrence is $T(1) = \Theta(1)$, $T(n) = 2T(n/2) + \Theta(n)$. We solve it two ways.
Method 1 — Substitution. Guess $T(n) = \Theta(n \log n)$. Substitute $T(n) = cn \log n$:
$$cn \log n = 2 \cdot c\frac{n}{2} \log\frac{n}{2} + \Theta(n) = cn \log n - cn + \Theta(n)$$
which holds once $c$ is large enough to absorb the $\Theta(n)$ term, confirming the guess.
Method 2 — Recurrence Tree. Draw the recursion tree. At depth $i$, there are $2^i$ subproblems each of size $n/2^i$. Work at depth $i$: $2^i \cdot \Theta(n/2^i) = \Theta(n)$. The tree has depth $\log_2 n$ (base case at $n/2^i = 1$). Total:
$$T(n) = \sum_{i=0}^{\log_2 n} \Theta(n) = \Theta(n \log n)$$
The Master Theorem — Solving Recurrences in General
Many divide-and-conquer algorithms have recurrences of the form $T(n) = aT(n/b) + f(n)$ with $f(n) = \Theta(n^c)$. The Master Theorem gives the answer by comparing $c$ to $\log_b a$; the quantity $n^{\log_b a}$ counts the leaves of the recursion tree.
Case 1 — $c < \log_b a$: leaves dominate. $T(n) = \Theta(n^{\log_b a})$
Case 2 — $c = \log_b a$: work balanced across levels. $T(n) = \Theta(n^c \log n)$
Case 3 — $c > \log_b a$: root dominates. $T(n) = \Theta(n^c)$
For merge sort: $a = 2$, $b = 2$, $f(n) = \Theta(n) = \Theta(n^1)$. So $c = 1$ and $\log_b a = \log_2 2 = 1$. Case 2 applies: $T(n) = \Theta(n^1 \log n) = \Theta(n \log n)$. ✓
Comparison — All Three Algorithms
| Algorithm | Worst Case | Best Case | Comparisons | Swaps | In-place | Stable |
|---|---|---|---|---|---|---|
| Selection Sort | $\Theta(n^2)$ | $\Theta(n^2)$ | $\Theta(n^2)$ | $O(n)$ | Yes | No |
| Insertion Sort | $\Theta(n^2)$ | $\Theta(n)$ | $\Theta(n^2)$ | $\Theta(n^2)$ | Yes | Yes |
| Merge Sort | $\Theta(n \log n)$ | $\Theta(n \log n)$ | $\Theta(n \log n)$ | $\Theta(n \log n)$ | No | Yes* |
* Merge sort is stable only if the merge step uses $\leq$ (not $<$) when choosing from the left half.
At 9:15 AM, the XTS API returns NIFTY option chain data as an unordered list of 450 strike-price records. Your strategy needs to: (a) quickly find the ATM strike, (b) iterate over strikes in order from lowest to highest, (c) binary search for any specific strike. Which algorithm should you use to sort the chain, and what is the total cost?
Merge sort. $n = 450$. The chain is unordered at arrival — insertion sort's "nearly sorted" advantage does not apply. We need a guaranteed $\Theta(n \log n)$ with no worst-case surprises. For $n = 450$: $450 \log_2 450 \approx 450 \times 8.8 \approx 3{,}970$ operations. Selection or insertion sort would take $450^2 / 2 \approx 101{,}250$ comparisons — 25× more work for the same result.
Python's sorted(chain, key=lambda x: x['strike']) uses TimSort — a hybrid of merge sort and insertion sort. It is $\Theta(n \log n)$ worst case, $\Theta(n)$ best case, stable, and written in C. Never write your own sort when the built-in is available. Write your own only to understand the algorithm — which is exactly what we just did.
sorted() or list.sort() — both are TimSort, $\Theta(n \log n)$, stable. The one-time sort at session open costs ~4,000 operations on 450 strikes. After that, binary search finds any strike in $\lceil \log_2 450 \rceil = 9$ comparisons. Compared to linear scan (225 comparisons on average), binary search is 25× faster per lookup. With 50,000 lookups per session, that is $(225 - 9) \times 50{,}000 \approx 10.8$M comparisons saved — from one sort call.
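The per-lookup saving can be sketched with the standard library's bisect module; the strike list below is illustrative (450 strikes, 50 points apart):

```python
from bisect import bisect_left

# Illustrative chain: 450 strikes from 23000 in 50-point steps (already sorted)
strikes = list(range(23000, 23000 + 450 * 50, 50))

def find_strike(sorted_strikes, k):
    """Binary search via bisect: Θ(log n) per lookup vs Θ(n) linear scan."""
    i = bisect_left(sorted_strikes, k)
    return i if i < len(sorted_strikes) and sorted_strikes[i] == k else -1

print(find_strike(strikes, 24000))  # → 20
```

bisect_left returns the leftmost insertion point, so one extra equality check distinguishes "found" from "absent".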
list.sort() and sorted() are both TimSort — developed by Tim Peters specifically for Python in 2002. TimSort finds already-sorted "runs" in the input and merges them using a merge sort strategy, but uses insertion sort for runs shorter than 64 elements (where cache effects dominate). It is now used in Java's Arrays.sort() for objects, Android's sort, and many other runtimes. When you call sorted(chain, key=lambda x: x['strike']) on your option chain, that is TimSort running under the hood in C at roughly $10^8$ operations per second.
An options strategy script receives 500 price updates throughout the day. Each update arrives in roughly sorted order — the new price is within 3 positions of where it belongs in the sorted list. Which sort algorithm is most efficient for re-sorting the list after each update?
When every element is at most $k$ positions from its sorted location, insertion sort's inner while loop runs at most $k$ times per element. Total cost: $\Theta(kn)$. For $k = 3$ and $n = 500$: roughly 1,500 operations per re-sort.
Merge sort always costs $\Theta(n \log n) \approx 4{,}500$ operations regardless of input order — it ignores existing structure. Selection sort always costs $\Theta(n^2) \approx 125{,}000$. For nearly-sorted data, insertion sort wins decisively. This is precisely why TimSort uses insertion sort for small, nearly-sorted subarrays.
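A quick sketch that counts inner-loop swaps on nearly-sorted input; the input construction (adjacent pairs swapped, so $k = 1$) is our own:

```python
def insertion_sort_counting(A):
    """Insertion sort (as above) that also counts inner-loop swaps."""
    steps = 0
    for i in range(1, len(A)):
        j = i
        while j > 0 and A[j] < A[j - 1]:
            A[j - 1], A[j] = A[j], A[j - 1]
            j -= 1
            steps += 1
    return steps

# Nearly sorted: swap adjacent pairs, so every element is 1 position off.
A = list(range(500))
for i in range(0, 500, 2):
    A[i], A[i + 1] = A[i + 1], A[i]
steps = insertion_sort_counting(A)
print(steps)  # → 250: one swap per displaced pair, i.e. Θ(kn) with k = 1
```

Each displaced element triggers at most $k$ inner-loop iterations, so total work scales linearly in $n$ for fixed $k$.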
A divide-and-conquer algorithm splits a problem of size $n$ into 4 subproblems of size $n/2$, with $\Theta(n)$ work to combine them. What is the asymptotic runtime?
Apply Master Theorem: $a = 4$, $b = 2$, $f(n) = \Theta(n^1)$. So $\log_b a = \log_2 4 = 2$. Since $c = 1 < \log_b a = 2$, we are in Case 1 — leaves dominate: $$T(n) = \Theta(n^{\log_2 4}) = \Theta(n^2)$$ Recurrence tree intuition: at depth $i$, there are $4^i$ subproblems of size $n/2^i$. Work at depth $i$: $4^i \cdot \Theta(n/2^i) = \Theta(n \cdot 2^i)$. This grows geometrically — the deepest level (with $4^{\log_2 n} = n^2$ leaves) dominates. Doubling the branches changes everything compared to merge sort's 2 branches.
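A numeric sanity check, under the simplifying assumptions $T(1) = 1$ and $f(n) = n$ exactly (the closed form works out to $T(n) = 2n^2 - n$):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """T(n) = 4*T(n/2) + n with T(1) = 1, for n a power of two."""
    return 1 if n == 1 else 4 * T(n // 2) + n

for n in (256, 512, 1024):
    print(n, T(n) / n**2)  # → ratio 2 - 1/n, approaching 2 ⇒ T(n) = Θ(n²)
```

The ratio $T(n)/n^2$ converging to a constant is exactly the Case 1 prediction: the $n^2$ leaves dominate.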
A list of NIFTY option orders is sorted by strike price, then the result is sorted again by order type (CE before PE). Which property must the second sort have to guarantee that orders with the same type remain sorted by strike within each type?
A stable sort guarantees that elements with equal keys appear in the output in the same relative order as in the input. After sorting by strike, all CE orders with strike 24000 appear before 24500, etc. If the second sort (by order type) is stable, then within each type, the strike ordering is preserved — because the sorter never swaps two elements with the same key (order type).
This "sort twice" technique is called a radix-style multi-key sort: sort by the least significant key first, then by the most significant key using a stable sort. It is used everywhere in data pipelines — sorting trades by time within each symbol, sorting options by expiry within each strike, etc. Python's
sorted() and list.sort() are both stable (TimSort), so this pattern is safe to use without thinking about it.
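The two-pass pattern in miniature, on toy orders (the field names are illustrative):

```python
orders = [
    {'strike': 24500, 'type': 'PE'},
    {'strike': 24000, 'type': 'CE'},
    {'strike': 24000, 'type': 'PE'},
    {'strike': 24500, 'type': 'CE'},
]
# Pass 1: least-significant key (strike). Pass 2: most-significant key (type).
orders.sort(key=lambda o: o['strike'])
orders.sort(key=lambda o: o['type'])  # stable: strike order preserved within type
print([(o['type'], o['strike']) for o in orders])
# → [('CE', 24000), ('CE', 24500), ('PE', 24000), ('PE', 24500)]
```

Because the second sort never reorders equal keys, each type group inherits the strike ordering from the first pass.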
Merge sort on an array of $n = 8$ elements. The merge step at the top level merges two sorted halves of size 4. How many total element comparisons does merge sort perform across all merge operations?
Merging two sorted arrays of sizes $p$ and $q$ requires between $\min(p,q)$ and $p+q-1$ comparisons. The minimum occurs when every element of one array precedes every element of the other: the shorter side is exhausted after $\min(p,q)$ comparisons and the remainder is copied without further comparisons. The maximum occurs when elements interleave perfectly.
For $n = 8$ ($\log_2 8 = 3$ levels):
— Level 3 (base merges of size 1+1): 4 merges × 1 comparison each = 4
— Level 2 (merges of size 2+2): 2 merges × (2 to 3) comparisons each = 4 to 6
— Level 1 (top merge of 4+4): 1 merge × (4 to 7) comparisons = 4 to 7
Total range: $4 + 4 + 4 = 12$ to $4 + 6 + 7 = 17$ (not quite 24 — the $n \log n$ bound includes the copying overhead, not just comparisons).
The $\Theta(n \log n)$ bound counts all operations (comparisons + copies), not just comparisons. Comparisons alone are between $\frac{n}{2}\log n$ and $n \log n$.
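An instrumented merge sort that counts only element comparisons confirms the 12-to-17 range for $n = 8$ (a sketch, mirroring the earlier implementation):

```python
def merge_sort_count(A, a=0, b=None, counter=None):
    """Merge sort that returns the total number of element comparisons."""
    if counter is None: counter = [0]
    if b is None: b = len(A)
    if b - a <= 1: return counter[0]
    c = (a + b + 1) // 2
    merge_sort_count(A, a, c, counter)
    merge_sort_count(A, c, b, counter)
    L, R = A[a:c], A[c:b]
    i = j = 0
    for k in range(a, b):
        if i < len(L) and j < len(R):
            counter[0] += 1          # one element comparison L[i] vs R[j]
        if j >= len(R) or (i < len(L) and L[i] <= R[j]):
            A[k] = L[i]; i += 1
        else:
            A[k] = R[j]; j += 1
    return counter[0]

print(merge_sort_count([1, 2, 3, 4, 5, 6, 7, 8]))  # → 12  (already sorted: minimum)
print(merge_sort_count([1, 5, 3, 7, 2, 6, 4, 8]))  # → 17  (perfect interleaving: maximum)
```

The second input makes every merge at every level alternate perfectly, hitting $p + q - 1$ comparisons in each merge.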
Work through these independently: Q1–Q3 are direct application, Q4–Q7 require synthesis, and Q8–Q10 require careful argument.