Worksheet Chapter 10, Part 3
KKT Conditions for Linearly Constrained Problems
Notebook Info
Topics
This notebook contains three coding examples designed to build intuition for KKT conditions in linearly constrained optimization:
Projection onto an affine set
Equality-constrained quadratic
Inequality-constrained QP active-set exploration
Learning goals
Compute and interpret stationarity, primal feasibility, dual feasibility, and complementarity.
See KKT multipliers as measurable quantities (not just symbols).
Connect active constraints to which multipliers are nonzero.
import itertools
import numpy as np
import matplotlib.pyplot as plt
import cvxpy as cp
%matplotlib inline
np.set_printoptions(precision=4, suppress=True)
1. Orthogonal Projection Onto an Affine Space
Previously, we studied orthogonal projection (see lectures 8-2 and 9-1), where the goal is to find the closest point \(\mathbf{y} \in C\) to a given \(\mathbf{x}\), denoted
\[
P_C(\mathbf{x}) = \operatorname*{argmin}_{\mathbf{y} \in C} \|\mathbf{y} - \mathbf{x}\|.
\]
In those lectures, we found formulae for the projection operator for a few simple choices of \(C\). Now, using the KKT conditions, we can find formulae for projections onto more complex sets \(C\); specifically, when \(C\) is an affine space.
Given \(A\in\mathbb{R}^{m\times n}\) with full row rank, we will project \(\mathbf{y}\in\mathbb{R}^n\) onto
\[
C = \{\mathbf{x}\in\mathbb{R}^n : A\mathbf{x} = \mathbf{b}\}
\]
by solving
\[
\min_{\mathbf{x}\in\mathbb{R}^n} \|\mathbf{x} - \mathbf{y}\|^2 \quad \text{subject to} \quad A\mathbf{x} = \mathbf{b}.
\]
In this case, the Lagrangian function is
\[
L(\mathbf{x}, \boldsymbol{\mu}) = \|\mathbf{x} - \mathbf{y}\|^2 + \boldsymbol{\mu}^\top (A\mathbf{x} - \mathbf{b}),
\]
so the stationarity condition is
\[
\nabla_\mathbf{x} L(\mathbf{x}, \boldsymbol{\mu}) = 2(\mathbf{x} - \mathbf{y}) + A^\top \boldsymbol{\mu} = \mathbf{0}.
\]
Combined with the feasibility condition \(A \mathbf{x} = \mathbf{b}\), we find that
\[
\mathbf{x} = P_C(\mathbf{y}) = \mathbf{y} - A^\top (A A^\top)^{-1} (A\mathbf{y} - \mathbf{b}).
\]
See the video lecture for the full derivation.
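Before the exercises, here is a quick numerical sanity check of the projection formula. The system below is made up (it is not the worksheet's example): the projected point should satisfy \(A\mathbf{x}=\mathbf{b}\), and the step \(\mathbf{x}-\mathbf{y}\) should be orthogonal to the null space of \(A\), i.e. to the directions that stay inside \(C\).

```python
import numpy as np

# Sanity check of the closed form (made-up A, b, y, not the worksheet's
# example):  x = y - A^T (A A^T)^{-1} (A y - b)
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])            # full row rank
b = np.array([1.0, 2.0])
y = np.array([3.0, -1.0, 4.0])

x = y - A.T @ np.linalg.solve(A @ A.T, A @ y - b)

print(np.allclose(A @ x, b))               # primal feasibility holds
# x - y lies in the row space of A, hence it is orthogonal to the
# null space of A (the directions we could move while staying feasible)
_, _, Vt = np.linalg.svd(A)
null_basis = Vt[2:].T                      # basis for the null space of A
print(np.allclose(null_basis.T @ (x - y), 0))
```

Both checks print `True`, matching the geometric picture of orthogonal projection.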
Example
We are going to work with the system \(A \mathbf{x} = \mathbf{b}\) for
\[
A = \begin{bmatrix} 3 & 1 & -1 \\ -2 & 2 & 1 \end{bmatrix}, \qquad
\mathbf{b} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\]
❓❓❓ Question ❓❓❓: Take a look at this Desmos plot. What is the set \(A \mathbf{x} = \mathbf{b}\)?
Your answer here
Click for solution
The red plane is the equation defined by the first row of \(A \mathbf{x} = \mathbf{b}\), and the blue plane is the equation defined by the second row. That means the entirety of the set \(A \mathbf{x} = \mathbf{b}\) is the intersection of the two, which is marked with the green line.
❓❓❓ Question ❓❓❓: Use the derivation above to compute the projection \(P_C(\mathbf{y})\) of the point \(\mathbf{y} = (-1,0,2)\). Plot the point you find on the Desmos plot to see if your answer looks reasonable.
A = np.array([[3, 1, -1], [-2, 2, 1]])
b = np.array([[0,0]]).T
y = np.array([[-1,0,2]]).T
# Your code here
Click for solution
See this Desmos plot for the projected point. Note that the segment from \(\mathbf{y}\) to the projected point is perpendicular to the green line \(A\mathbf{x}=\mathbf{b}\).
Solving without doing the algebra by hand
We can actually use linear algebra to do all of our work for us.
After absorbing the factor of 2 into the multiplier (the substitution \(\boldsymbol{\mu} = 2\boldsymbol{\gamma}\) in the stationarity condition), for a given \(\mathbf{y} \in \mathbb{R}^n\), the KKT conditions for this problem are
\[
\mathbf{x} + A^\top \boldsymbol{\gamma} = \mathbf{y}, \qquad A\mathbf{x} = \mathbf{b}.
\]
Right now we have variables \(\mathbf{x}\) and \(\boldsymbol{\gamma}\) to solve for. We can put all our variables together into a column vector, and then solve the following single giant system:
\[
\begin{bmatrix} I & A^\top \\ A & 0 \end{bmatrix}
\begin{bmatrix} \mathbf{x} \\ \boldsymbol{\gamma} \end{bmatrix}
=
\begin{bmatrix} \mathbf{y} \\ \mathbf{b} \end{bmatrix}.
\]
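To see the giant-system approach in action, here is a sketch on a made-up instance (not the example above). It assembles the block matrix \(\begin{bmatrix} I & A^\top \\ A & 0 \end{bmatrix}\) with `np.block`, solves it, and confirms it recovers the closed-form projection.

```python
import numpy as np

# Made-up instance (not the worksheet's A, b, y): build the block system
#   [[I, A^T], [A, 0]] [x; gamma] = [y; b]
# and solve it in one shot.
rng = np.random.default_rng(0)
m, n = 2, 4
A = rng.standard_normal((m, n))            # full row rank with probability 1
b = rng.standard_normal(m)
y = rng.standard_normal(n)

K = np.block([[np.eye(n), A.T],
              [A, np.zeros((m, m))]])
sol = np.linalg.solve(K, np.concatenate([y, b]))
x, gamma = sol[:n], sol[n:]

# The giant system recovers the closed-form projection formula
x_closed = y - A.T @ np.linalg.solve(A @ A.T, A @ y - b)
print(np.allclose(x, x_closed))            # True
print(np.allclose(A @ x, b))               # True
```

The block solve and the closed form are algebraically identical; the block form just avoids doing the elimination by hand.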
❓❓❓ Question ❓❓❓: For our example system above, write out the matrices in that single giant system. Check that all the equations implied by the KKT conditions show up in the matrix multiplication.
Your notes here
Click for solution
The matrix system for our example is
\[
\begin{bmatrix}
1 & 0 & 0 & 3 & -2 \\
0 & 1 & 0 & 1 & 2 \\
0 & 0 & 1 & -1 & 1 \\
3 & 1 & -1 & 0 & 0 \\
-2 & 2 & 1 & 0 & 0
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \gamma_1 \\ \gamma_2 \end{bmatrix}
=
\begin{bmatrix} -1 \\ 0 \\ 2 \\ 0 \\ 0 \end{bmatrix}.
\]
The first three rows are the stationarity condition \(\mathbf{x} + A^\top\boldsymbol{\gamma} = \mathbf{y}\), and the last two rows are the feasibility condition \(A\mathbf{x} = \mathbf{b}\).
❓❓❓ Question ❓❓❓: Now code the matrices you wrote above and use the np.linalg.solve function to determine \(\mathbf{x}\) and \(\boldsymbol{\gamma}\) for this problem. For a challenge, try to build the matrices directly from your saved A, b, and y from above. Did you get the same answer as above for \(\mathbf{x}\)?
# Your code here
2. Equality-Constrained Quadratic
We can pull the same trick to solve a quadratic optimization problem,
\[
\min_{\mathbf{x}} \; \mathbf{x}^\top Q \mathbf{x} + 2\mathbf{c}^\top \mathbf{x} \quad \text{subject to} \quad A\mathbf{x} = \mathbf{b}.
\]
From the 8-2 worksheet, we have that
\[
\nabla_\mathbf{x}\left(\mathbf{x}^\top Q \mathbf{x} + 2\mathbf{c}^\top \mathbf{x}\right) = 2Q\mathbf{x} + 2\mathbf{c},
\]
so (after dividing the stationarity condition by 2 and absorbing the factor into the multiplier \(\boldsymbol{\gamma}\)) the KKT conditions become
\[
Q\mathbf{x} + A^\top \boldsymbol{\gamma} = -\mathbf{c}, \qquad A\mathbf{x} = \mathbf{b}.
\]
❓❓❓ Question ❓❓❓: Write down the equivalent version of the “single giant system” from above that we can solve to get the solution.
Your notes here.
Click for solution
The matrix problem to solve is
\[
\begin{bmatrix} Q & A^\top \\ A & 0 \end{bmatrix}
\begin{bmatrix} \mathbf{x} \\ \boldsymbol{\gamma} \end{bmatrix}
=
\begin{bmatrix} -\mathbf{c} \\ \mathbf{b} \end{bmatrix}.
\]
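As a sanity check on this block system, here is a small sketch with made-up \(Q\), \(\mathbf{c}\), \(A\), \(\mathbf{b}\) (not the worksheet's example): assemble the block matrix, solve, and verify the KKT residuals directly.

```python
import numpy as np

# Made-up equality-constrained QP:  min x^T Q x + 2 c^T x  s.t.  A x = b
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])                 # symmetric positive definite
c = np.array([1.0, -2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

n, m = Q.shape[0], A.shape[0]
K = np.block([[Q, A.T],
              [A, np.zeros((m, m))]])
sol = np.linalg.solve(K, np.concatenate([-c, b]))
x, gamma = sol[:n], sol[n:]

print(np.allclose(Q @ x + c + A.T @ gamma, 0))  # stationarity residual is zero
print(np.allclose(A @ x, b))                    # feasibility holds
```

Both residual checks print `True`; since \(Q\) is positive definite, this KKT point is the unique constrained minimizer.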
Example in \(\mathbb{R}^2\)
Now we will work with the system where
\[
Q = \begin{bmatrix} 3 & -1 \\ -1 & 2 \end{bmatrix}, \qquad
\mathbf{c} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
We will do the simplified case where \(A\) is a single row (i.e., there is just one linear constraint), so we want to find the optimum point over \(\mathbf{a}^\top \mathbf{x} = b\) for
\[
\mathbf{a} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad b = -2.
\]
We match the above notation by setting \(A = \mathbf{a}^\top\).
❓❓❓ Question ❓❓❓: For this Desmos plot and the setup above, what are the red surface, the blue line, and the green line?
Your notes here
Click for solution
The red surface is the objective function \(\mathbf{x}^\top Q\mathbf{x} + 2\mathbf{c}^\top\mathbf{x}\).
The blue line is the constraint set \(\mathbf{a}^\top \mathbf{x} = b\).
The green line is the set of possible function values over the constraint set. So the goal of the problem is to find the point at the bottom of the green parabola.
❓❓❓ Question ❓❓❓: Set up the “single giant system” matrix problem you found above for this specific problem and find the solution \(\mathbf{x}^*\). As before, try to build the matrices directly from the saved variables rather than hard-coding them. Then plot your solution on the Desmos plot to see if it’s right.
Q = np.array([[3, -1], [-1, 2]])
c = np.array([[1, 1]]).T
A = np.array([[1, 1]])
b = np.array([[-2]])
# Your code here
Example in \(\mathbb{R}^3\)
While I can’t draw you pictures, you should be able to do exactly the same thing for larger systems where we are solving for \(\mathbf{x} \in \mathbb{R}^n\) with \(n>2\). In particular, if you already set up your code above to build the matrices you want directly from the given matrices, this next bit should be easy.
The problem to solve is
for
We want to find the optimum point over \(A \mathbf{x} = \mathbf{b}\) for
❓❓❓ Question ❓❓❓:
Is the problem convex?
What is the solution to the problem?
# Your code here
3. Inequality-Constrained Optimization and Active Sets
For the examples above, we had only equality constraints, \(A \mathbf{x} = \mathbf{b}\). It’s a bit more complicated when we want to include inequalities as well.
Consider the optimization problem
\[
\min_{\mathbf{x}} \; \mathbf{x}^\top Q \mathbf{x} + 2\mathbf{c}^\top \mathbf{x} \quad \text{subject to} \quad G\mathbf{x} \leq \mathbf{h}.
\]
In this setup, we can write down the Lagrangian as
\[
L(\mathbf{x}, \boldsymbol{\lambda}) = \mathbf{x}^\top Q \mathbf{x} + 2\mathbf{c}^\top \mathbf{x} + \boldsymbol{\lambda}^\top (G\mathbf{x} - \mathbf{h}).
\]
❓❓❓ Question ❓❓❓: Given the following list of matrix derivative rules, determine \(\nabla_\mathbf{x}L(\mathbf{x}, \boldsymbol{\lambda})\) and thus the stationarity condition.
| Scalar value | Description | Gradient notation | Gradient value |
|---|---|---|---|
| \(x^\top Q x\) | Quadratic form (\(Q\) is symmetric) | \(\dfrac{\partial (x^\top Q x)}{\partial x}\) | \(2Qx\) |
| \(c^\top A x\) | Vector-matrix-vector product | \(\dfrac{\partial (c^\top A x)}{\partial x}\) | \(A^\top c\) |
| \(b^\top x\) | Inner product | \(\dfrac{\partial (b^\top x)}{\partial x}\) | \(b\) |
Your notes here.
Click for solution
The stationarity condition is
\[
\nabla_\mathbf{x} L(\mathbf{x}, \boldsymbol{\lambda}) = 2Q\mathbf{x} + 2\mathbf{c} + G^\top \boldsymbol{\lambda} = \mathbf{0}.
\]
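One way to gain confidence in a gradient formula like this is to compare it against a central finite-difference approximation of \(L\) at a random point. Below is a quick sketch; all of the data (\(Q\), \(\mathbf{c}\), \(G\), \(\mathbf{h}\), \(\boldsymbol{\lambda}\), and the test point) is made up for illustration.

```python
import numpy as np

# Compare the formula 2Qx + 2c + G^T lam against central finite
# differences of L(x) at a random point (all data made up).
rng = np.random.default_rng(1)
Q = np.array([[2.0, 0.5],
              [0.5, 1.0]])                 # symmetric, as the table requires
c = np.array([0.3, -0.7])
G = rng.standard_normal((3, 2))
h = rng.standard_normal(3)
lam = rng.random(3)
x0 = rng.standard_normal(2)

def L(x):
    return x @ Q @ x + 2 * c @ x + lam @ (G @ x - h)

grad_formula = 2 * Q @ x0 + 2 * c + G.T @ lam

eps = 1e-6
grad_fd = np.array([(L(x0 + eps * e) - L(x0 - eps * e)) / (2 * eps)
                    for e in np.eye(2)])
print(np.allclose(grad_formula, grad_fd, atol=1e-5))  # True
```

Since \(L\) is quadratic in \(\mathbf{x}\), the central difference is exact up to floating-point roundoff, so the two gradients agree closely.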
❓❓❓ Question ❓❓❓: Write down the slackness conditions for the problem.
Your notes here
Click for solution
The requirement is \(\lambda_i(G_i x-h_i)=0\) for all \(i\).
Checking active constraints using an example
So far, we have the stationarity condition and the slackness conditions you calculated above. We also need primal feasibility, \(G\mathbf{x} \leq \mathbf{h}\), and dual feasibility, \(\boldsymbol{\lambda} \geq \mathbf{0}\). However, it’s hard to handle inequality constraints with the matrix-multiplication trick above, since a linear solve can only enforce equalities.
So instead we will run a check: assume some subset of the constraints in \(G\mathbf{x}\leq \mathbf{h}\) is active (that is, holds with equality rather than inequality), solve the resulting equality-constrained system, and repeat over all possible subsets.
We will do this for the quadratic function \(\mathbf{x}^\top Q\mathbf{x} + 2\mathbf{c}^\top\mathbf{x}\) and the constraint set \(G\mathbf{x} \leq \mathbf{h}\) given by the following matrices:
Q = np.array([[2.0, 0.2], [0.2, 1.0]])
c = np.array([-1.0, -0.6])
G = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
h = np.array([1.2, 1.0, 1.4])
The code below gives back points \(\mathbf{x}\) and \(\lambda\) such that
the stationarity condition is satisfied, so \(\nabla_\mathbf{x}L(\mathbf{x}, \boldsymbol{\lambda}) = \mathbf{0}\), and
the rows of \(G\mathbf{x} \leq \mathbf{h}\) whose indices appear in the list active_idx are active; said another way, for those rows, \(G_i\mathbf{x} = h_i\).
def solve_active_set(Q, c, G, h, active_idx):
    n = Q.shape[0]
    Aeq = G[list(active_idx), :] if len(active_idx) > 0 else np.zeros((0, n))
    beq = h[list(active_idx)] if len(active_idx) > 0 else np.zeros(0)
    mA = Aeq.shape[0]
    left_side = np.block([[Q, Aeq.T], [Aeq, np.zeros((mA, mA))]])
    right_side = np.concatenate([-c, beq])
    try:
        sol = np.linalg.solve(left_side, right_side)
    except np.linalg.LinAlgError:
        return None, None
    x = sol[:n]
    nu = sol[n:]  # multipliers for the active constraints
    # map the active multipliers back into a full-length lambda vector
    lam = np.zeros(G.shape[0])
    for k, idx in enumerate(active_idx):
        lam[idx] = nu[k]
    return x, lam
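Here is a usage sketch of solve_active_set. So that this cell runs on its own, it re-declares the data and a condensed copy of the function from the cells above; the choice of forcing only row 1 (the constraint \(x_2 \leq 1.0\)) to be active is just for illustration, not part of the exercise below.

```python
import numpy as np

# Re-declare the data and a condensed copy of solve_active_set so this
# cell stands alone; the real definitions are in the cells above.
Q = np.array([[2.0, 0.2], [0.2, 1.0]])
c = np.array([-1.0, -0.6])
G = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
h = np.array([1.2, 1.0, 1.4])

def solve_active_set(Q, c, G, h, active_idx):
    n = Q.shape[0]
    Aeq = G[list(active_idx), :] if len(active_idx) > 0 else np.zeros((0, n))
    beq = h[list(active_idx)] if len(active_idx) > 0 else np.zeros(0)
    mA = Aeq.shape[0]
    left = np.block([[Q, Aeq.T], [Aeq, np.zeros((mA, mA))]])
    try:
        sol = np.linalg.solve(left, np.concatenate([-c, beq]))
    except np.linalg.LinAlgError:
        return None, None
    x, nu = sol[:n], sol[n:]
    lam = np.zeros(G.shape[0])
    for k, idx in enumerate(active_idx):
        lam[idx] = nu[k]
    return x, lam

# Force only row 1 (the constraint x2 <= 1.0) to be active:
x, lam = solve_active_set(Q, c, G, h, (1,))
print(x)    # stationary point with x2 pinned at 1.0
print(lam)  # lam[1] comes out negative, so dual feasibility fails here
```

A negative multiplier is exactly the signal that this guess of the active set cannot be optimal: dual feasibility fails, so the search moves on to another subset.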
❓❓❓ Question ❓❓❓:
Using the function above, what are the \(\mathbf{x}\) and \(\lambda\) so that the 0th and 2nd rows of the inequality constraints are active, that is so that \(x_1 + 0x_2 = 1.2\) and \(x_1+x_2 = 1.4\)? Save the outputs as x and lam.
# Your code here
# Warning: `lambda` is a reserved keyword in Python, so make sure to save the lambda output as `lam` instead to avoid syntax errors.
Now we can check if x and lam satisfy the rest of our requirements.
❓❓❓ Question ❓❓❓: Use the np.all function to complete the following function so that it returns True if all the entries of lam are \(\geq 0\), and False otherwise.
def is_lambda_pos(lam):
pass #<-- Your code here
# Test the output
is_lambda_pos(lam)
❓❓❓ Question ❓❓❓: Use the np.all function to output True if \(G\mathbf{x} \leq \mathbf{h}\), and False otherwise.
def is_Gx_leq_h(G, x, h):
pass #<-- Your code here
is_Gx_leq_h(G, x, h)
❓❓❓ Question ❓❓❓: Use the np.all function to output True if \(\lambda_i (G_i\mathbf{x} - h_i) = 0\) for all \(i\), and False otherwise.
def is_slackness_satisfied(lam, G, x, h):
pass #<-- Your code here
is_slackness_satisfied(lam, G, x, h)
❓❓❓ Question ❓❓❓: Finish the code below by checking the output \(\mathbf{x}\) and \(\boldsymbol{\lambda}\) using the functions above, and add the point to the list if it satisfies all constraints.
feasible_points = []
all_idx = list(range(G.shape[0]))
for r in range(len(all_idx) + 1):
    for active in itertools.combinations(all_idx, r):
        x, lam = solve_active_set(Q, c, G, h, active)
        if x is None:
            continue
        # Your code here
        # Add checks here and add feasible points to the list
print("Feasible points that satisfy KKT conditions:")
for pt in feasible_points:
    print(pt)
Feasible points that satisfy KKT conditions:
❓❓❓ Question ❓❓❓: The problem above was convex, so there was exactly one KKT point, which made our job easy. Now, use your code above to find the optimum of the non-convex problem where
and the constraint set given by matrices
You should get multiple KKT points. How do you know which one is optimal?
# Your code here
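For the last question: a KKT point of a non-convex problem is only a candidate (it may be a maximizer or a saddle along the constraint set), so when the enumeration returns several, compare objective values directly and keep the smallest. Here is a sketch with made-up matrices and candidate points standing in for your enumeration results.

```python
import numpy as np

# Placeholder sketch: given several KKT points, evaluate the objective
# x^T Q x + 2 c^T x at each and keep the minimizer. All data is made up.
Q = np.array([[1.0, 0.0],
              [0.0, -2.0]])                # indefinite, so non-convex
c = np.array([0.5, 0.0])

def objective(x):
    return x @ Q @ x + 2 * c @ x

candidates = [np.array([0.2, 0.0]),
              np.array([1.0, 1.0]),
              np.array([-1.0, 0.5])]
best = min(candidates, key=objective)
print(best, objective(best))
```

Only points that pass all the KKT checks should go into the comparison; the objective values then rank the surviving candidates.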
© Copyright 2026, The Department of Computational Mathematics, Science and Engineering at Michigan State University.