Linear Programming
Optimise a linear objective subject to linear constraints — the universal language of constrained optimisation. LP subsumes max-flow, shortest paths, and difference constraints as special cases. Duality turns every LP into two mirror problems whose optimal values are equal, giving certificates of optimality without solving the problem twice. Simplex walks the polytope vertices to the optimum; interior-point methods race through the interior.
Standard Form
Every LP can be written in standard form:

$$\text{maximise } c^\top x \quad \text{subject to} \quad Ax \leq b, \quad x \geq 0,$$
where $x \in \mathbb{R}^d$ are the decision variables, $c \in \mathbb{R}^d$ is the objective coefficient vector, $A \in \mathbb{R}^{n\times d}$ encodes the $n$ inequality constraints, and $b \in \mathbb{R}^n$ is the right-hand-side vector. Non-standard forms convert to standard form via simple rules.
— Minimise $c^\top x$: negate to maximise $-c^\top x$
— Unrestricted variable $x_j$ (no sign constraint): replace $x_j = x_j^+ - x_j^-$ with $x_j^+, x_j^- \geq 0$
— Equality constraint $a^\top x = b$: split into $a^\top x \leq b$ and $-a^\top x \leq -b$
— $\geq$ constraint $a^\top x \geq b$: negate to $-a^\top x \leq -b$
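The conversion rules can be applied mechanically. A minimal sketch (the function name and argument layout are my own, not from the text) that turns a minimisation with equality constraints and a free variable into the standard maximise / $\leq$ / nonnegative form:

```python
import numpy as np

def to_standard_form(c_min, A_eq, b_eq, free_vars):
    """Convert: minimise c_min^T x,  A_eq x = b_eq,  x_j free for j in free_vars,
    into:      maximise c^T x',     A x' <= b,      x' >= 0."""
    A = np.asarray(A_eq, dtype=float)
    c = -np.asarray(c_min, dtype=float)        # minimise -> maximise (negate)
    for j in free_vars:                        # x_j = x_j+ - x_j-
        A = np.hstack([A, -A[:, [j]]])         # extra column for x_j-
        c = np.append(c, -c[j])
    A_std = np.vstack([A, -A])                 # a^T x = b -> two <= rows
    b_std = np.concatenate([np.asarray(b_eq, dtype=float),
                            -np.asarray(b_eq, dtype=float)])
    return c, A_std, b_std

# Example: minimise x0 + x1  s.t.  x0 + 2*x1 = 4, x1 unrestricted in sign
c, A, b = to_standard_form([1, 1], [[1, 2]], [4], free_vars=[1])
print(c, A, b)   # c = [-1 -1 1], A = [[1 2 -2], [-1 -2 2]], b = [4 -4]
```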
Geometric Intuition: The Feasible Polytope
Each constraint $a_i^\top x \leq b_i$ defines a halfspace in $\mathbb{R}^d$. The feasible region is the intersection of $n$ halfspaces — a convex polytope. The objective $c^\top x = \text{const}$ is a family of parallel hyperplanes; we want the one that touches the polytope as far as possible in the direction of $c$. Whenever a finite optimum exists, it is attained at a vertex (corner) of the polytope; otherwise the LP is unbounded or infeasible.
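The vertex claim can be checked by brute force on a toy 2-D instance (an example I made up for illustration): every candidate vertex is the intersection of two constraint lines, and the best feasible one matches the solver's optimum.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

# Maximise x + y over {x + 2y <= 4, 3x + y <= 6, x >= 0, y >= 0}
c = np.array([1.0, 1.0])
A = np.array([[1, 2], [3, 1], [-1, 0], [0, -1]], dtype=float)  # x>=0 as -x<=0
b = np.array([4, 6, 0, 0], dtype=float)

vertices = []
for i, j in itertools.combinations(range(len(b)), 2):
    M = A[[i, j]]
    if abs(np.linalg.det(M)) < 1e-12:
        continue                            # parallel lines: no intersection
    v = np.linalg.solve(M, b[[i, j]])
    if np.all(A @ v <= b + 1e-9):           # keep only feasible intersections
        vertices.append(v)

best = max(vertices, key=lambda v: c @ v)   # best vertex: [1.6, 1.2]
lp = linprog(-c, A_ub=A[:2], b_ub=b[:2], bounds=[(0, None)] * 2, method='highs')
print(best, -lp.fun)                        # both give objective value 2.8
```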
Motivating Example: Campaign Spending
3 demographics, 4 policy issues. A coefficient table gives votes gained per dollar spent on each issue within each demographic. We need a majority in each demographic (populations: Urban 100k, Suburban 200k, Rural 50k). Minimise total spending $x_1 + x_2 + x_3 + x_4$ subject to

$$-2x_1 + 8x_2 + 0x_3 + 10x_4 \geq 50{,}000, \qquad 5x_1 + 2x_2 \geq 100{,}000, \qquad 3x_1 - 5x_2 + 10x_3 - 2x_4 \geq 25{,}000, \qquad x_i \geq 0.$$
Optimal solution: $x_1 \approx 18{,}468$, $x_2 \approx 3{,}829$, $x_3 = 0$, $x_4 \approx 5{,}631$. Total cost $\approx 27{,}928$ (i.e., $3{,}100{,}000 / 111$).
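The decimals above correspond to exact rationals (my reconstruction from the quoted total $3{,}100{,}000/111$; the constraint rows are transcribed from the solver code later in these notes). Rational arithmetic confirms all three majority constraints are tight at the optimum:

```python
from fractions import Fraction as F

# Exact optimal values (assumed from the decimals and the total 3100000/111)
x1, x2, x3, x4 = F(2050000, 111), F(425000, 111), F(0), F(625000, 111)
urban    = -2*x1 + 8*x2 + 10*x4          # needs >= 50000  (half of 100k)
suburban =  5*x1 + 2*x2                  # needs >= 100000 (half of 200k)
rural    =  3*x1 - 5*x2 + 10*x3 - 2*x4   # needs >= 25000  (half of 50k)
total    = x1 + x2 + x3 + x4
print(urban, suburban, rural)            # 50000 100000 25000: all tight
print(total)                             # 3100000/111
```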
LP Duality
Every LP (the primal) has a dual LP. The dual’s optimal value equals the primal’s — this is the strong duality theorem (proved via the simplex algorithm or Farkas’ lemma).
| Primal (maximise) | Dual (minimise) |
|---|---|
| Maximise $c^\top x$ | Minimise $b^\top y$ |
| Subject to $Ax \leq b$ | Subject to $A^\top y \geq c$ |
| $x \geq 0$ | $y \geq 0$ |
Strong duality: at optimality, $c^\top x^* = b^\top y^*$. The dual optimal $y^*$ is a certificate of primal optimality.
LP Duality as a Certificate of Optimality
The dual gives short optimality certificates. In the campaign example, multiplying constraints (1), (2), (3) by multipliers $\lambda_1, \lambda_2, \lambda_3 \geq 0$ and summing produces a lower bound on $x_1+x_2+x_3+x_4$. The certificate shows the proposed solution achieves that lower bound — hence it is optimal.
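The certificate can be computed rather than guessed: solve the dual of the campaign LP (data transcribed from the solver code later in these notes) and compare values. The multipliers $y^*$ returned by the dual are exactly the $\lambda_i$ of the argument above.

```python
import numpy as np
from scipy.optimize import linprog

# Primal: minimise 1^T x  s.t.  G x >= h,  x >= 0
G = np.array([[-2, 8, 0, 10], [5, 2, 0, 0], [3, -5, 10, -2]], dtype=float)
h = np.array([50000, 100000, 25000], dtype=float)
primal = linprog(np.ones(4), A_ub=-G, b_ub=-h,
                 bounds=[(0, None)] * 4, method='highs')

# Dual: maximise h^T y  s.t.  G^T y <= 1,  y >= 0  (solved as a minimisation)
dual = linprog(-h, A_ub=G.T, b_ub=np.ones(4),
               bounds=[(0, None)] * 3, method='highs')

print(primal.fun, -dual.fun)   # equal by strong duality (about 27927.93)
print(dual.x)                  # the certificate multipliers lambda_1..lambda_3
```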
LP Formulation: Max-Flow
Variables: a flow $f(e)$ per edge. Maximise the net flow out of the source, $\sum_{e \text{ out of } s} f(e) - \sum_{e \text{ into } s} f(e)$, subject to capacity constraints $0 \leq f(e) \leq c(e)$ and flow conservation (inflow = outflow) at every vertex other than $s$ and $t$.
LP Formulation: Shortest Paths
Variables: a potential $d(v)$ per vertex. Maximise $\sum_v d(v)$ subject to the difference constraints $d(v) - d(u) \leq w(u,v)$ for every edge $(u,v)$ and $d(s) = 0$; at the optimum, $d(v) = \delta(s,v)$, the shortest-path distance from $s$.
The Simplex Algorithm
The simplex method exploits the vertex structure of the feasible polytope. Since the optimum is always attained at a vertex, it suffices to search over vertices. Key insight: move from a vertex to an adjacent vertex only if the objective improves; when no improving neighbour exists, the current vertex is optimal.
Pivot step:
1. Choose a nonbasic variable $x_e$ with positive objective coefficient (it improves $z$ if increased).
2. Find the binding constraint: increase $x_e$ until one basic variable $x_l$ hits 0. This is the leaving variable.
3. Swap: $x_e$ becomes basic, $x_l$ becomes nonbasic. Rewrite all equations in terms of the new basis.
4. Repeat until no nonbasic variable has a positive coefficient.
Termination: when all objective coefficients for nonbasic variables are $\leq 0$, the current basic solution is optimal.
Complexity: $O\!\binom{n+d}{n}$ pivots in the worst case (exponential), but fast in practice. Polynomial-time algorithms exist (ellipsoid, interior-point).
import numpy as np
from scipy.optimize import linprog
# ── CAMPAIGN SPENDING EXAMPLE ── minimise total spending ─────────────
# Negate for maximise convention; linprog minimises by default
# Primal: minimise x1+x2+x3+x4
# Constraints (negated to <=): -(-2x1+8x2+10x4) <= -50000 etc.
c = [1, 1, 1, 1] # minimise x1+x2+x3+x4
# Constraints Ax <= b (negate >= constraints)
A = [[-(-2), -8, 0, -10], # Urban: -(votes) <= -50000
[-5, -2, 0, 0], # Suburban
[-3, 5, -10, 2]] # Rural
b = [-50000, -100000, -25000]
result = linprog(c, A_ub=A, b_ub=b, bounds=[(0,None)]*4,
method='highs')
print(f"Optimal spend: ${result.fun:,.0f}")
print(f"x = {result.x.round(1)}")
# Optimal spend: ~$27,928, x ≈ [18468, 3829, 0, 5631]
# ── LP DUALITY DEMONSTRATION ── weak duality certificate ─────────────
def verify_weak_duality(c, A, b, x_primal, y_dual):
"""
Weak duality: c^T x <= b^T y for any feasible primal x, dual y.
At optimality: c^T x* = b^T y* (strong duality).
"""
primal_val = np.dot(c, x_primal)
dual_val = np.dot(b, y_dual)
print(f"Primal value c^T x = {primal_val:.4f}")
print(f"Dual value b^T y = {dual_val:.4f}")
print(f"Weak duality holds: {primal_val <= dual_val + 1e-9}")
print(f"Strong duality (at optimum): {abs(primal_val-dual_val) < 1e-6}")
# ── MAX-FLOW AS LP ── demonstrating LP subsumes flow ─────────────────
def max_flow_lp(n, edges, source, sink, capacities):
"""
Solve max flow as an LP (educational; scipy linprog is not sparse-aware).
Maximise flow out of source subject to:
- Capacity: f(e) <= c(e) for each edge
- Conservation: inflow = outflow at each internal vertex
- Non-negativity: f(e) >= 0
"""
m = len(edges)
# Variables: f[0..m-1] = flow on each edge
# Objective: -sum of flows out of source (negate to minimise)
c_obj = [0.0] * m
    for i, (u, v) in enumerate(edges):
        if u == source:
            c_obj[i] = -1.0  # maximise source outflow
        elif v == source:
            c_obj[i] = 1.0   # flow back into the source reduces net outflow
# Capacity constraints: f[i] <= cap[i]
A_ub = np.eye(m)
b_ub = capacities
# Conservation constraints (equality): for each internal node
A_eq, b_eq = [], []
internal = [v for v in range(n) if v != source and v != sink]
for node in internal:
row = [0.0] * m
for i, (u,v) in enumerate(edges):
if v == node: row[i] = 1.0 # inflow
if u == node: row[i] = -1.0 # outflow
A_eq.append(row)
b_eq.append(0.0)
result = linprog(c_obj, A_ub=A_ub, b_ub=b_ub,
A_eq=A_eq if A_eq else None,
b_eq=b_eq if b_eq else None,
bounds=[(0,None)]*m, method='highs')
return -result.fun if result.success else None
# ── SHORTEST PATHS AS LP ── difference constraints ────────────────────
def shortest_paths_lp(n, edges, source):
"""
Maximise sum(d) subject to d[v] - d[u] <= w(u,v) and d[source]=0.
Equivalent to finding shortest paths from source.
"""
m = len(edges)
# Variables: d[0..n-1]
c_obj = [-1.0] * n # maximise sum d[v]
A_ub, b_ub = [], []
for u, v, w in edges:
row = [0.0]*n
row[v] = 1.0
row[u] = -1.0
A_ub.append(row)
b_ub.append(w)
# d[source] = 0
A_eq = [[0.0]*n]; A_eq[0][source] = 1.0
b_eq = [0.0]
result = linprog(c_obj, A_ub=A_ub, b_ub=b_ub,
A_eq=A_eq, b_eq=b_eq,
bounds=[(None,None)]*n, method='highs')
    return result.x if result.success else None

// LP standard form — interface sketch
// Production LP solvers in C++: GLPK, CPLEX, Gurobi, HiGHS
// Below: conceptual structure only (use a real solver in practice)
#include <vector>
#include <string>
using namespace std;
struct LP {
int n_vars; // d: number of decision variables
int n_cons; // n: number of inequality constraints
vector<double> c; // objective coefficients (maximise c^T x)
vector<vector<double>> A; // constraint matrix (A x <= b)
vector<double> b; // RHS vector
// x >= 0 implicit
// Dual LP: minimise b^T y s.t. A^T y >= c, y >= 0
LP dual() const {
LP d;
d.n_vars = n_cons; // dual variables = primal constraints
d.n_cons = n_vars; // dual constraints = primal variables
    // Store the dual in the same maximise convention:
    // minimise b^T y  ==  maximise (-b)^T y
    d.c.resize(n_cons);
    for (int i=0;i<n_cons;i++) d.c[i] = -b[i]; // dual objective = negated primal RHS
d.b.resize(n_vars);
for (int j=0;j<n_vars;j++) d.b[j] = -c[j]; // A^T y >= c → -A^T y <= -c
d.A.resize(n_vars, vector<double>(n_cons));
for (int j=0;j<n_vars;j++)
for (int i=0;i<n_cons;i++)
d.A[j][i] = -A[i][j]; // negated transpose
return d;
}
// At optimality: c^T x* = b^T y* (strong duality)
    // Proved via the simplex algorithm or Farkas' lemma;
    // complementary slackness characterises optimal primal-dual pairs
};

Simplex: Step-by-Step Worked Example
Maximise $z = 3x_1 + x_2 + 2x_3$ subject to $x_1+x_2+3x_3 \leq 30$, $2x_1+2x_2+5x_3 \leq 24$, $4x_1+x_2+2x_3 \leq 36$, $x_1,x_2,x_3 \geq 0$.
Pivot 1: $x_1$ has the highest positive coefficient ($+3$). Binding constraint: $x_6 = 36-4x_1 \geq 0 \Rightarrow x_1 \leq 9$. Pivot: $x_1$ enters, $x_6$ leaves. Rewrite $x_1 = 9 - x_2/4 - x_3/2 - x_6/4$. Substitute into all equations. New $z = 27 + x_2/4 + x_3/2 - 3x_6/4$. New solution $(9,0,0,21,6,0)$, $z=27$.
Pivot 2: $x_3$ has positive coefficient ($+1/2$). Binding: $x_5 = 6 - 4x_3 \geq 0 \Rightarrow x_3 \leq 3/2$. Pivot $x_3$ in, $x_5$ out. New $z = 111/4 + x_2/16 - 11x_5/8 - 11x_6/16$. Solution $(33/4, 0, 3/2, 69/4, 0, 0)$, $z=111/4$.
Pivot 3: $x_2$ has positive coefficient. Pivot $x_2$ in. After substitution, all nonbasic coefficients are $\leq 0$. Optimal: $z=28$, at $(x_1,x_2,x_3)=(8,4,0)$.
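The worked result can be cross-checked with an off-the-shelf solver (negating the objective, since linprog minimises):

```python
from scipy.optimize import linprog

res = linprog([-3, -1, -2],                        # maximise 3x1 + x2 + 2x3
              A_ub=[[1, 1, 3], [2, 2, 5], [4, 1, 2]],
              b_ub=[30, 24, 36],
              bounds=[(0, None)] * 3, method='highs')
print(-res.fun, res.x)   # 28.0 at [8. 4. 0.]
```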
import numpy as np
def simplex(c, A, b):
"""
Simplex algorithm in slack form. Maximise c^T x s.t. Ax <= b, x >= 0.
Educational implementation — not numerically robust; use scipy for production.
Returns (optimal_value, optimal_x) or raises if infeasible/unbounded.
Algorithm:
1. Introduce slack variables: B = A|I (n x (d+n)), extended cost c_ext.
2. Pivot: find entering variable (most positive objective coefficient),
find leaving variable (min-ratio test to stay feasible),
update tableau.
3. Stop when no positive objective coefficient remains.
"""
d, n = len(c), len(b)
# Build initial tableau: [A | I | b] with objective row [-c | 0 | 0]
# Rows 0..n-1: constraints. Row n: objective.
tab = np.zeros((n+1, d+n+1))
tab[:n, :d] = A # constraint coefficients
tab[:n, d:d+n] = np.eye(n) # slack variables
tab[:n, -1] = b # RHS
tab[n, :d] = -np.array(c, dtype=float) # objective (negated for min form)
basis = list(range(d, d+n)) # initial basis: slack variables
for _ in range(1000): # max iterations guard
# Find entering variable: most negative in objective row
pivot_col = np.argmin(tab[n, :-1])
if tab[n, pivot_col] >= -1e-9:
break # optimal: all coefficients >= 0
# Min-ratio test: find leaving variable
ratios = np.full(n, np.inf)
for i in range(n):
if tab[i, pivot_col] > 1e-9:
ratios[i] = tab[i, -1] / tab[i, pivot_col]
if np.all(np.isinf(ratios)):
raise ValueError("LP is unbounded")
pivot_row = np.argmin(ratios)
# Pivot: divide pivot row by pivot element, then eliminate column
tab[pivot_row] /= tab[pivot_row, pivot_col]
for i in range(n+1):
if i != pivot_row:
tab[i] -= tab[i, pivot_col] * tab[pivot_row]
basis[pivot_row] = pivot_col
# Extract solution
x = np.zeros(d + n)
for i, b_idx in enumerate(basis):
x[b_idx] = tab[i, -1]
return tab[n, -1], x[:d] # objective value, original variables
# Verify on the lecture example
c_ex = [3, 1, 2]
A_ex = [[1,1,3],[2,2,5],[4,1,2]]
b_ex = [30, 24, 36]
val, x_opt = simplex(c_ex, A_ex, b_ex)
print(f"Optimal value: {val:.4f}") # 28.0
print(f"Optimal x: {x_opt}") # [8, 4, 0]

// Simplex algorithm — educational tableau implementation
// For production: use GLPK, HiGHS, CPLEX, or Gurobi
#include <vector>
#include <algorithm>
#include <numeric>
#include <limits>
#include <stdexcept>
using namespace std;
pair<double, vector<double>>
simplex(const vector<double>& c,
const vector<vector<double>>& A,
const vector<double>& b) {
int d = c.size(), n = b.size();
// Tableau: (n+1) x (d+n+1)
vector<vector<double>> T(n+1, vector<double>(d+n+1, 0));
for (int i=0;i<n;i++) {
for (int j=0;j<d;j++) T[i][j] = A[i][j];
T[i][d+i] = 1; // slack
T[i][d+n] = b[i]; // rhs
}
for (int j=0;j<d;j++) T[n][j] = -c[j]; // objective (negated)
vector<int> basis(n); iota(basis.begin(),basis.end(),d);
for (int iter=0;iter<10000;iter++) {
// Find entering variable (most negative obj coefficient)
int pc = min_element(T[n].begin(), T[n].end()-1) - T[n].begin();
if (T[n][pc] >= -1e-9) break; // optimal
// Min-ratio test
int pr = -1; double min_r = 1e18;
for (int i=0;i<n;i++)
if (T[i][pc]>1e-9) {
double r = T[i][d+n]/T[i][pc];
if (r<min_r) { min_r=r; pr=i; }
}
if (pr==-1) throw runtime_error("unbounded");
// Pivot
double piv = T[pr][pc];
for (auto& x : T[pr]) x /= piv;
for (int i=0;i<=n;i++)
if (i!=pr) {
double f = T[i][pc];
for (int j=0;j<=d+n;j++) T[i][j] -= f*T[pr][j];
}
basis[pr] = pc;
}
vector<double> x(d, 0);
for (int i=0;i<n;i++) if (basis[i]<d) x[basis[i]]=T[i][d+n];
return {T[n][d+n], x};
}

import numpy as np
from scipy.optimize import linprog, minimize
# ── MINIMUM COST FLOW AS LP ── ────────────────────────────────────────
def min_cost_flow_lp(n, edges, demand, costs, capacities):
    """
    Minimum-cost flow as an LP.
    edges: list of (u, v); costs[i], capacities[i] for each edge.
    demand: required net inflow at each node (negative for supply nodes).
    Minimise sum cost[i]*f[i] subject to:
      - Capacity: 0 <= f[i] <= capacities[i]
      - Conservation: inflow - outflow = demand at each node
    """
m = len(edges)
# Objective: minimise cost^T f
c_obj = costs
# Conservation (equality): B f = demand
B = np.zeros((n, m))
for i, (u, v) in enumerate(edges):
B[u][i] = -1 # outflow
B[v][i] = 1 # inflow
bounds = [(0, cap) for cap in capacities]
result = linprog(c_obj, A_eq=B, b_eq=demand,
bounds=bounds, method='highs')
return result.x if result.success else None
# ── MARKOWITZ MEAN-VARIANCE (LP-RELAXED) ── ───────────────────────────
def markowitz_max_return(mu, Sigma, risk_limit, long_only=True):
"""
Maximise expected return mu^T w subject to:
- Portfolio risk w^T Sigma w <= risk_limit^2 (quadratic constraint!)
- Sum weights = 1
- w >= 0 (long only)
Note: this is a QP (quadratic programming), not a pure LP.
LP approximation: linearise variance using diagonal only.
True QP: use scipy minimize with method='SLSQP'.
"""
n = len(mu)
def neg_return(w): return -np.dot(mu, w)
def grad_return(w): return -mu
constraints = [
{'type': 'eq', 'fun': lambda w: np.sum(w) - 1}, # weights sum to 1
{'type': 'ineq','fun': lambda w: risk_limit**2 - w@Sigma@w} # risk constraint
]
bounds = [(0,1)]*n if long_only else [(None,None)]*n
w0 = np.ones(n)/n
result = minimize(neg_return, w0, jac=grad_return,
method='SLSQP', bounds=bounds,
constraints=constraints)
return result.x if result.success else None
# Example: 4 NSE instruments
mu = np.array([0.12, 0.18, 0.10, 0.15]) # expected annual returns
Sigma = np.array([ # covariance matrix
[0.04, 0.02, 0.01, 0.02],
[0.02, 0.09, 0.02, 0.03],
[0.01, 0.02, 0.03, 0.01],
[0.02, 0.03, 0.01, 0.06],
])
w_opt = markowitz_max_return(mu, Sigma, risk_limit=0.15)
print("Optimal portfolio weights:", w_opt.round(3))
print("Expected return:", np.dot(mu, w_opt).round(4))

// Min-cost flow as LP — structure sketch
// Production: use GLPK/HiGHS/CPLEX for LP/QP
#include <vector>
using namespace std;
// Min cost flow LP:
// Minimise sum_i cost[i] * f[i]
// s.t. sum_{edges into v} f[e] - sum_{edges out of v} f[e] = demand[v]
// 0 <= f[e] <= cap[e]
//
// This is a network-structured LP — special structure enables
// the network simplex algorithm which is much faster in practice.
// Markowitz mean-variance:
// Maximise mu^T w
// s.t. w^T Sigma w <= sigma_max^2 (quadratic constraint → QP not LP)
// 1^T w = 1, w >= 0
// Solve with QP solvers (OSQP, ECOS, Gurobi QP, etc.)
// LP approximation: diagonal Sigma → linear in w^2_i constraints

TTT runs multiple option strategies simultaneously. Each strategy has a capital requirement, a delta limit, and an expected return. Given total capital, max delta exposure, and max vega, find the optimal allocation across strategies.
import numpy as np
from scipy.optimize import linprog
# 5 strategies: straddle, strangle, butterfly, calendar, condor
strategies = ['Straddle','Strangle','Butterfly','Calendar','Condor']
# Per-lot: (expected_return, capital_needed, delta, vega)
data = {
'ret': [0.08, 0.05, 0.03, 0.06, 0.04], # per lot
'capital': [50, 40, 20, 30, 35], # in 1000s
'delta': [0.0, 0.0, 0.0, 0.1, 0.05], # net delta per lot
'vega': [200, 150, 80, 120, 100], # vega per lot
}
# Constraints: capital <= 120k, |delta| <= 0.5, vega <= 600
# Minimise -expected_return (to maximise with linprog)
c = [-r for r in data['ret']]
A = [data['capital'], # capital constraint
data['delta'], # delta constraint (one-sided; delta>=0 for these)
data['vega']] # vega constraint
b = [120, 0.5, 600]
# Each strategy: 0 to 5 lots
bounds = [(0, 5)] * 5
result = linprog(c, A_ub=A, b_ub=b, bounds=bounds, method='highs')
print("Optimal lots per strategy:")
for s, lots in zip(strategies, result.x):
print(f" {s:12s}: {lots:.2f} lots")
print(f"Expected return: {-result.fun:.4f}")
scipy.optimize.linprog uses HiGHS (a state-of-the-art interior-point + simplex solver). For large-scale problems (thousands of instruments), cvxpy provides a modelling layer over solvers like OSQP (operator splitting QP, open-source, fast for finance). The LP formulation of shortest paths and max-flow shows that many graph algorithms are secretly solving special-structure LPs — and LP duality is what gives us the min-cut theorem and Bellman-Ford correctness.
The primal LP maximises $c^\top x$ subject to $Ax \leq b$, $x \geq 0$. What is the dual LP, and what does weak duality guarantee?
The dual of the standard-form primal (maximise $c^\top x$ s.t. $Ax \leq b$, $x \geq 0$) is: minimise $b^\top y$ subject to $A^\top y \geq c$, $y \geq 0$. Weak duality: for any primal feasible $x$ and dual feasible $y$: $c^\top x \leq (A^\top y)^\top x = y^\top Ax \leq y^\top b = b^\top y$. So every dual feasible solution provides an upper bound on the primal maximum. At optimality (strong duality): $c^\top x^* = b^\top y^*$ — the primal value equals the dual value, and the dual optimal $y^*$ is a certificate that $x^*$ is truly optimal.
In the simplex algorithm, why does the min-ratio test (choosing the leaving variable as the basic variable with the smallest ratio $b_i / a_{ie}$) ensure feasibility is maintained?
When we increase $x_e$ from 0, basic variable $x_{d+i}$ decreases by $a_{ie} \cdot x_e$. It becomes 0 (its lower bound) when $x_e = b_i / a_{ie}$. If we let $x_e$ exceed this value, $x_{d+i}$ becomes negative — violating feasibility. The min-ratio test finds the smallest such ratio: $\min_i \{b_i/a_{ie} : a_{ie} > 0\}$. This is the maximum we can increase $x_e$ before the first basic variable hits zero. That basic variable leaves the basis and $x_e$ enters. All other basic variables with $a_{ie} > 0$ would also have decreased, but since we stopped at the minimum ratio, they remain $\geq 0$. Basic variables with $a_{ie} \leq 0$ actually increase or stay the same when $x_e$ increases, so they’re no problem.
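Applied to the first pivot of the worked example above (entering column $a_e = (1,2,4)$ for $x_1$, right-hand sides $b = (30,24,36)$), the test reads:

```python
import numpy as np

b   = np.array([30.0, 24.0, 36.0])   # current basic values (slacks x4, x5, x6)
a_e = np.array([1.0, 2.0, 4.0])      # column of the entering variable x1
ratios = np.where(a_e > 0, b / a_e, np.inf)
leave = int(np.argmin(ratios))
print(ratios, leave)   # [30. 12. 9.], index 2: x6 leaves, so x1 <= 9
```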
The shortest paths LP maximises $\sum_v d(v)$ subject to $d(v)-d(u) \leq w(u,v)$ and $d(s)=0$. Why maximise (not minimise), and when is this LP infeasible?
If we minimised $\sum d(v)$, the LP would push every $d(v)$ downward: the constraints $d(v)-d(u) \leq w(u,v)$ only bound differences from above, so nothing stops $d(v)$ from dropping far below the true distance (and the LP is typically unbounded below). By maximising, we push $d(v)$ as large as the triangle inequality $d(v) \leq d(u)+w(u,v)$ allows. This is exactly the shortest path: the maximum $d(v)$ satisfying all triangle inequalities from $s$ is $\delta(s,v)$, the shortest-path distance. Infeasibility: the difference-constraint system $d(v)-d(u) \leq w(u,v)$ has no solution iff the graph contains a negative-weight cycle $v_0\to v_1\to\cdots\to v_k\to v_0$ with $\sum w < 0$: adding the constraints around the cycle gives $0 \leq \sum w < 0$, a contradiction.
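The infeasibility claim can be exercised directly: pose the difference constraints as a feasibility LP (zero objective) and watch the solver reject a negative cycle. The helper name is mine; the cycle $0 \to 1 \to 2 \to 0$ below has total weight $-1$.

```python
import numpy as np
from scipy.optimize import linprog

def difference_constraints_feasible(n, edges, source=0):
    """True iff d[v]-d[u] <= w(u,v) for all edges, with d[source]=0, is satisfiable."""
    A_ub, b_ub = [], []
    for u, v, w in edges:
        row = [0.0] * n
        row[v], row[u] = 1.0, -1.0
        A_ub.append(row)
        b_ub.append(w)
    A_eq = [[1.0 if j == source else 0.0 for j in range(n)]]
    res = linprog(np.zeros(n), A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[0.0],
                  bounds=[(None, None)] * n, method='highs')
    return bool(res.success)

print(difference_constraints_feasible(3, [(0, 1, 2), (1, 2, 3)]))              # True
print(difference_constraints_feasible(3, [(0, 1, 1), (1, 2, -3), (2, 0, 1)]))  # False
```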
Simplex runs in exponential worst-case time but is fast in practice. Ellipsoid runs in polynomial time but is slow in practice. Interior-point methods are polynomial and practical. What fundamental reason explains the simplex algorithm’s practical speed?
The exponential worst case (Klee-Minty cubes require $2^n-1$ pivots) almost never appears in real-world LP instances from transportation, finance, or network problems. Empirically, the simplex method takes $O(n)$ to $O(n\sqrt{d})$ pivots on practical problems. Additionally, each pivot operation is a dense row operation ($O(nd)$) that is highly optimised in libraries like HiGHS and CPLEX (vectorised BLAS routines). Interior-point methods theoretically need $O(\sqrt{n}\log(1/\epsilon))$ iterations but each iteration solves an $n \times n$ linear system ($O(n^3)$ naively), making them slower for medium-sized problems. Modern solvers use both: interior-point to find a good starting point, then simplex to “clean up” to an exact vertex solution.
Work through independently. Q1–3 direct application. Q4–7 synthesis. Q8–10 careful argument.