Worksheet 10-1: Optimality Conditions for Linearly Constrained Problems (with Solutions)#

Download: CMSE382-WS10_1.pdf, CMSE382-WS10_1-Soln.pdf

Warning

This is an AI-generated transcript of the worksheet and may contain errors or inaccuracies. Please refer to the original course materials for authoritative content.


Worksheet 10-1: Q1#

Let \(m = n = 2\), and consider the matrix and vector

\[\begin{split}\mathbf{A} = \begin{bmatrix} 6 & 3 \\ 4 & 0 \end{bmatrix}, \qquad \mathbf{c} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}.\end{split}\]
  1. The first possibility from Farkas’ lemma is that there exists \(\mathbf{x} \in \mathbb{R}^2\) such that \(\mathbf{A} \mathbf{x} \leq \mathbf{0}\) and \(\mathbf{c}^T \mathbf{x} > 0\).

    a. Write down the system of inequalities implied by \(\mathbf{A} \mathbf{x} \leq \mathbf{0}\). Use Desmos to sketch the region defined by these inequalities.

    Solution

    The assumption \(\mathbf{A} \mathbf{x} \leq \mathbf{0}\) means

    \[\begin{split}\begin{align*} 6x_1 + 3x_2 &\leq 0, \\ 4x_1 &\leq 0. \end{align*}\end{split}\]

    b. Write down the inequality implied by \(\mathbf{c}^T \mathbf{x} > 0\). Draw this restriction on the same Desmos plot.

    Solution

    The assumption \(\mathbf{c}^T \mathbf{x} > 0\) means \(3x_1 + x_2 > 0\).

    c. Does there exist \(\mathbf{x} \in \mathbb{R}^2\) such that \(\mathbf{A} \mathbf{x} \leq \mathbf{0}\) and \(\mathbf{c}^T \mathbf{x} > 0\)? Justify your answer using the Desmos plot.

    Solution

    No. See my Desmos plot here.

    • The region defined by \(\mathbf{A} \mathbf{x} \leq \mathbf{0}\) is the intersection of the blue and red regions in the plot.

    • The region defined by \(\mathbf{c}^T \mathbf{x} > 0\) is the green region in the plot.

    • Since there is no overlap between the blue/red region and the green region, there is no \(\mathbf{x}\) that satisfies both conditions.
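Beyond the Desmos plot, one way to sanity-check this conclusion numerically (a sketch, assuming SciPy is available) is to maximize \(\mathbf{c}^T \mathbf{x}\) over the cone \(\mathbf{A}\mathbf{x} \leq \mathbf{0}\) as a linear program. If the optimal value is \(0\), then no feasible point achieves \(\mathbf{c}^T \mathbf{x} > 0\):

```python
import numpy as np
from scipy.optimize import linprog

# Maximize c^T x over the cone A x <= 0, i.e. minimize -c^T x.
A = np.array([[6.0, 3.0], [4.0, 0.0]])
c = np.array([3.0, 1.0])
res = linprog(-c, A_ub=A, b_ub=np.zeros(2), bounds=[(None, None)] * 2)
print(res.fun)  # optimal value of -c^T x: 0 (up to solver tolerance)
```

If the first Farkas alternative held, the LP would instead be unbounded, since any witness \(\mathbf{x}\) could be scaled up indefinitely.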

  2. The second possibility from Farkas’ lemma is that there exists \(\mathbf{y} \in \mathbb{R}^2\) such that \(\mathbf{A}^\top\mathbf{y} =\mathbf{c}\) and \(\mathbf{y}\geq 0\).

    a. Write down the system of equations implied by \(\mathbf{A}^\top\mathbf{y} =\mathbf{c}\). Plot the solution of these equations in a new Desmos plot.

    Solution

    The assumption \(\mathbf{A}^\top\mathbf{y} =\mathbf{c}\) means

    \[\begin{split}\begin{align*} 6y_1 + 4y_2 &= 3, \\ 3y_1 + 0y_2 &= 1. \end{align*}\end{split}\]

    b. Is there a solution \(\mathbf{y} \geq 0\) to the equations above? Use your Desmos plot to justify.

    Solution

    The lines intersect at \(\mathbf{y} = (1/3, 1/4)\), see my Desmos plot here. Since both entries are positive, \(\mathbf{y} \geq 0\). So there is a solution \(\mathbf{y} \geq 0\) to the equations above.
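The intersection point can be double-checked with a quick NumPy computation (a sketch, not part of the worksheet):

```python
import numpy as np

A = np.array([[6, 3], [4, 0]])
c = np.array([3, 1])

# Solve the 2x2 system A^T y = c.
y = np.linalg.solve(A.T, c)
print(y)  # approximately [1/3, 1/4]; both entries nonnegative
```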

  3. Based on the previous parts, which of the two possibilities from Farkas’ lemma holds for the matrices \(\mathbf{A}\) and \(\mathbf{c}\) above? Justify your answer.

    Solution

    The second possibility from Farkas’ lemma holds for the matrices \(\mathbf{A}\) and \(\mathbf{c}\) above.

    • We showed in part 1 that there is no \(\mathbf{x}\) such that \(\mathbf{A} \mathbf{x} \leq \mathbf{0}\) and \(\mathbf{c}^T \mathbf{x} > 0\). So the first possibility does not hold.

    • We showed in part 2 that there is a \(\mathbf{y}\) such that \(\mathbf{A}^\top\mathbf{y} =\mathbf{c}\) and \(\mathbf{y}\geq 0\). So the second possibility does hold.

  4. Repeat the previous part for the matrices

\[\begin{split}\mathbf{A} = \begin{bmatrix} 6 & 3 \\ 4 & 0 \end{bmatrix}, \qquad \mathbf{d} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}.\end{split}\]

Which of the two possibilities from Farkas’ lemma holds for the matrices \(\mathbf{A}\) and \(\mathbf{d}\) above? Justify your answer.

Solution

For the first option of Farkas’ lemma:

  • The inequalities from \(\mathbf{A} \mathbf{x} \leq 0\) are the same:

    \[\begin{split} \begin{align*} 6x_1 + 3x_2 &\leq 0, \\ 4x_1 &\leq 0. \end{align*}\end{split}\]

    They are the blue and red regions of this Desmos plot (using the same color scheme as in part 1).

  • The inequality from \(\mathbf{d}^T \mathbf{x} > 0\) is \(x_1 - x_2 > 0\), which is the green region in the plot above.

  • There are lots of options for \(\mathbf{x}\) that satisfy both conditions. For example, \(\mathbf{x} = (-2, -4)\) satisfies both conditions. So the first option of Farkas’ lemma holds for the matrices \(\mathbf{A}\) and \(\mathbf{d}\) above.
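The claimed witness can be verified directly with NumPy (a quick check, not part of the worksheet):

```python
import numpy as np

A = np.array([[6, 3], [4, 0]])
d = np.array([1, -1])
x = np.array([-2, -4])  # candidate witness from the solution above

print(A @ x)  # [-24, -8]: both entries nonpositive, so A x <= 0
print(d @ x)  # 2: strictly positive, so d^T x > 0
```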

The first option of Farkas’ lemma holds, so we know the second doesn’t. But just for the sake of practice, we can check.

  • \(\mathbf{A}^\top \mathbf{y} = \mathbf{d}\) means

    \[\begin{split} \begin{align*} 6y_1 + 4y_2 &= 1, \\ 3y_1 + 0y_2 &= -1. \end{align*}\end{split}\]
  • See this Desmos plot.

  • The lines intersect at \((-1/3, 3/4)\), which is not non-negative. So there is no \(\mathbf{y} \geq 0\) such that \(\mathbf{A}^\top \mathbf{y} = \mathbf{d}\).
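As before, the intersection can be confirmed numerically (a sketch using NumPy):

```python
import numpy as np

A = np.array([[6, 3], [4, 0]])
d = np.array([1, -1])

# Solve the 2x2 system A^T y = d.
y = np.linalg.solve(A.T, d)
print(y)  # approximately [-1/3, 3/4]; the first entry is negative
```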


Worksheet 10-1: Q2#

Find the stationary point(s) for

\[\min \frac{1}{2}\left(x_1^2 + x_2^2 +x_3^2\right) \quad \text{s.t.} \quad x_1 + x_2 + x_3=3\]

by following the steps below.

(a) Determine \(f\), \(A\), \(\mathbf{b}\), \(C\), and \(\mathbf{d}\) to write the problem in standard form:

\[\min_{\mathbf{x}} f(\mathbf{x}) \quad \text{s.t.} \quad A \mathbf{x} \leq \mathbf{b},\; C\mathbf{x} = \mathbf{d}.\]
Solution
  • \(f(\mathbf{x}) = \frac{1}{2} \left(x_1^2 + x_2^2 +x_3^2\right)\)

  • \(A = \mathbf{0}\), \(\mathbf{b} = \mathbf{0}\) (since there are no inequality constraints)

  • \(C = \begin{bmatrix}1 & 1 & 1 \end{bmatrix}\), \(\mathbf{d} = 3\) (since the equality constraint is \(x_1 + x_2 + x_3 = 3\))

(b) Write down the Lagrangian function.

Solution

The Lagrangian function is

\[L(\mathbf{x},\mu) = f(\mathbf{x}) + \boldsymbol{\mu}^{\top} (C\mathbf{x} - \mathbf{d}) = \frac{1}{2} \left(x_1^2 + x_2^2 +x_3^2\right) + \mu (x_1 + x_2 + x_3 - 3).\]

(Since \(A = \mathbf{0}\) and \(\mathbf{b} = \mathbf{0}\), we can drop the inequality constraint terms.)

(c) Write down the KKT condition (also called the stationarity condition).

Solution

The KKT condition is

\[\nabla_\mathbf{x} L(\mathbf{x}, \mu) = \nabla f(\mathbf{x}) + C^\top \boldsymbol{\mu} = \mathbf{0}.\]

We can compute \(\nabla f(\mathbf{x}) = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\) and \(C^\top \boldsymbol{\mu} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \mu\). So the stationarity condition is

\[\begin{split}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \mu = \begin{bmatrix} x_1 + \mu \\ x_2 + \mu \\ x_3 + \mu \end{bmatrix} = \mathbf{0}.\end{split}\]

This can be solved as \(x_1 = -\mu\), \(x_2 = -\mu\), and \(x_3 = -\mu\).

(d) Write down the feasibility constraints.

Solution

The feasibility condition is \(C\mathbf{x} = \mathbf{d}\), which here is

\[x_1 + x_2 + x_3 = 3.\]

(We don’t need \(A \mathbf{x} \leq \mathbf{b}\) or \(\boldsymbol{\lambda} \geq \mathbf{0}\) since there are no inequality constraints.)

(e) Solve for the stationary point(s) by solving the stationarity and feasibility constraints together.

Solution
  • From the stationarity condition, we have \(x_1 = -\mu\), \(x_2 = -\mu\), and \(x_3 = -\mu\).

  • Plugging into \(x_1+x_2+x_3 = 3\), this means \(-3\mu = 3\), which implies \(\mu = -1\).

  • So the stationary point is \(\mathbf{x} = (1, 1, 1)\).
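The stationarity and feasibility conditions above form one linear system in \((x_1, x_2, x_3, \mu)\), which can be solved directly with NumPy (a verification sketch, not part of the worksheet):

```python
import numpy as np

# KKT system for min (1/2)||x||^2 s.t. 1^T x = 3:
#   [ I    1 ] [ x  ]   [ 0 ]
#   [ 1^T  0 ] [ mu ] = [ 3 ]
K = np.zeros((4, 4))
K[:3, :3] = np.eye(3)   # gradient of f: x
K[:3, 3] = 1.0          # C^T mu term
K[3, :3] = 1.0          # equality constraint 1^T x = 3
rhs = np.array([0.0, 0.0, 0.0, 3.0])

sol = np.linalg.solve(K, rhs)
x, mu = sol[:3], sol[3]
print(x, mu)  # x = [1, 1, 1], mu = -1
```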

(f) Is the stationary point(s) optimal? Justify your answer.

Solution

The problem is convex since \(f\) is a convex function and the constraints are linear. So the stationary point is a global optimal solution of the problem.


Worksheet 10-1: Q3#

Consider the problem

\[\begin{split}\begin{aligned} & \text{minimize} & & x_1^2 + 2x_2^2 + 4x_1x_2 \\ & \text{subject to} & & x_1 + x_2 = 1, \\ & & & x_1, x_2 \geq 0. \end{aligned}\end{split}\]

(a) Is this problem convex? Justify your answer.

Solution

The problem is not convex because the objective function is not convex. To see this, we can compute the Hessian of the objective function:

\[\begin{split}H = \begin{bmatrix}2 & 4 \\ 4 & 4 \end{bmatrix}.\end{split}\]

The eigenvalues of \(H\) are \(3 \pm \sqrt{17}\), approximately \(7.12\) and \(-1.12\). Since one eigenvalue is negative, \(H\) is not positive semidefinite. Therefore, the objective function is not convex, and the problem is not convex.
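The eigenvalues can be checked numerically (a quick NumPy sketch):

```python
import numpy as np

H = np.array([[2.0, 4.0], [4.0, 4.0]])

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
eigs = np.linalg.eigvalsh(H)
print(eigs)  # approximately [3 - sqrt(17), 3 + sqrt(17)] = [-1.123..., 7.123...]
```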

(b) Recall the Generalized Extreme Value Theorem (GEVT): If \(f:U \to \mathbb{R}\) is a continuous function and \(U \subseteq \mathbb{R}^n\) is compact, then \(f\) is bounded and there exist \(\mathbf{x}^*, \mathbf{x}_* \in U\) such that

\[f(\mathbf{x}^*) = \sup_{\mathbf{x} \in U} f(\mathbf{x}) \quad \text{and} \quad f(\mathbf{x}_*) = \inf_{\mathbf{x} \in U} f(\mathbf{x}).\]

Use the GEVT to argue that there exists an optimal solution to the problem above.

Solution
  • The objective function is continuous since it is a polynomial.

  • The feasible set is compact since it is closed and bounded. The feasible set is closed since it is the intersection of the closed sets \(\{(x_1, x_2) : x_1 + x_2 = 1\}\), \(\{(x_1, x_2) : x_1 \geq 0\}\), and \(\{(x_1, x_2) : x_2 \geq 0\}\). The feasible set is bounded since \(x_1 + x_2 = 1\) implies \(x_1 \leq 1\) and \(x_2 \leq 1\).

  • Since the objective function is continuous and the feasible set is compact, the GEVT implies that there exists an optimal solution to the problem above.

(c) Find the Lagrangian.

Solution

Following the standard notation:

  • \(f(x_1, x_2) = x_1^2 + 2x_2^2 + 4x_1x_2\)

  • \(A = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}\), \(\mathbf{b} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}\) (since the inequality constraints are \(-x_1 \leq 0\) and \(-x_2 \leq 0\))

  • \(C = \begin{bmatrix} 1 & 1 \end{bmatrix}\), \(\mathbf{d} = 1\) (since the equality constraint is \(x_1 + x_2 = 1\))

So the Lagrangian is

\[L(x_1, x_2, \lambda_1, \lambda_2, \mu) = x_1^2 + 2x_2^2 + 4x_1x_2 + \lambda_1 (-x_1) + \lambda_2 (-x_2) + \mu (x_1 + x_2 - 1).\]

(d) Write down the stationarity KKT condition.

Solution

The stationarity condition is

\[\begin{split}\nabla_\mathbf{x} L(\mathbf{x}, \boldsymbol{\lambda}, \mu) = \nabla f(\mathbf{x}) + A^\top \boldsymbol{\lambda} + C^\top \mu = \begin{bmatrix} 2x_1 + 4x_2 \\ 4x_1 + 4x_2 \end{bmatrix} + \begin{bmatrix}-1 & 0 \\ 0 & -1 \end{bmatrix}^\top \begin{bmatrix}\lambda_1 \\ \lambda_2 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \end{bmatrix} \mu = \mathbf{0}.\end{split}\]

With matrix multiplication, this is equivalent to the system of equations

\[\begin{split}\begin{align*} 2x_1 + 4x_2 - \lambda_1 + \mu &= 0, \\ 4x_1 + 4x_2 - \lambda_2 + \mu &= 0. \end{align*}\end{split}\]

(e) Write down the complementary slackness conditions.

Solution

The complementary slackness conditions are \(\lambda_i (\mathbf{a}_i^\top \mathbf{x}^* - b_i) = 0\) for each inequality constraint. Here, they are

\[\begin{split}\begin{align*} \lambda_1 (-x_1) &= 0, \\ \lambda_2 (-x_2) &= 0. \end{align*}\end{split}\]

This can be simplified to \(\lambda_1 x_1 = 0\) and \(\lambda_2 x_2 = 0\).

(f) Write down the feasibility conditions from the problem constraints.

Solution
  • \(x_1+x_2 = 1\)

  • \(x_1 \geq 0\)

  • \(x_2 \geq 0\)

(g) Write down the feasibility conditions for each of the \(\lambda\)s.

Solution
  • \(\lambda_1 \geq 0\)

  • \(\lambda_2 \geq 0\)

(h) From everything above, copy down the 9 total equations/inequalities to be satisfied by an optimal solution.

Solution
\[\begin{split}\begin{align*} 2x_1 + 4x_2 - \lambda_1 + \mu &= 0, \\ 4x_1 + 4x_2 - \lambda_2 + \mu &= 0, \\ \lambda_1 x_1 &= 0, \\ \lambda_2 x_2 &= 0, \\ x_1 + x_2 &= 1, \\ x_1 &\geq 0, \\ x_2 &\geq 0, \\ \lambda_1 &\geq 0, \\ \lambda_2 &\geq 0. \end{align*}\end{split}\]

(i) This problem involves multiple cases for \(\lambda_1\) and \(\lambda_2\). We will address each separately.

Case 1: \(\lambda_1=\lambda_2= 0\). Use the complementary slackness conditions to solve for \(x_1\) and \(x_2\). Is the solution consistent with the constraints?

Solution
  • If \(\lambda_1 = \lambda_2 = 0\), then the equations simplify to

    \[\begin{split} \begin{align*} 2x_1 + 4x_2 + \mu &= 0, \\ 4x_1 + 4x_2 + \mu &= 0, \\ x_1 + x_2 &= 1. \end{align*}\end{split}\]
  • Solving this system gives \(x_1 = 0\), \(x_2 = 1\), \(\mu = -4\).

  • All constraints are satisfied: \(x_1 \geq 0\), \(x_2 \geq 0\), \(\lambda_1 \geq 0\), \(\lambda_2 \geq 0\).

  • So \((x_1,x_2) = (0, 1)\) is a KKT point.

Case 2: \(\lambda_1, \lambda_2 > 0\). Use the complementary slackness conditions to solve for \(x_1\) and \(x_2\). Is the solution consistent with the constraints?

Solution
  • If \(\lambda_1, \lambda_2 > 0\), then the slackness conditions \(\lambda_1 x_1 = 0\) and \(\lambda_2 x_2 = 0\) imply \(x_1 = 0\) and \(x_2 = 0\).

  • But this contradicts the equality constraint \(x_1 + x_2 = 1\).

  • So there is no KKT point arising from \(\lambda_1, \lambda_2 > 0\).

Case 3: \(\lambda_1>0, \lambda_2 = 0\). Use the complementary slackness conditions to solve for \(x_1\) and \(x_2\). Is the solution consistent with the constraints?

Solution
  • If \(\lambda_1 > 0\) and \(\lambda_2 = 0\), then the slackness conditions imply \(x_1 = 0\).

  • From the equality constraint \(x_1 + x_2 = 1\), we get \(x_2 = 1\).

  • Substituting into the stationarity equations: the second gives \(4 + \mu = 0\), so \(\mu = -4\), and the first then gives \(4 - \lambda_1 - 4 = 0\), so \(\lambda_1 = 0\).

  • This contradicts the assumption \(\lambda_1 > 0\), so this case yields no new KKT point. The point \((x_1,x_2) = (0, 1)\), with \(\lambda_1 = 0\), was already found in Case 1.

Case 4: \(\lambda_1=0, \lambda_2 > 0\). Use the complementary slackness conditions to solve for \(x_1\) and \(x_2\). Is the solution consistent with the constraints?

Solution
  • If \(\lambda_1 = 0\) and \(\lambda_2 > 0\), then the slackness conditions imply \(x_2 = 0\).

  • From the equality constraint \(x_1 + x_2 = 1\), we get \(x_1 = 1\).

  • Substituting into the stationarity equations gives \(\mu = -2\) and \(\lambda_2 = 2 > 0\), so all conditions, including the assumption \(\lambda_2 > 0\), are satisfied.

  • So \((x_1,x_2) = (1, 0)\) is a KKT point.

(j) Write down the KKT points you found above (there should be two of them).

Solution

The KKT points are \((x_1,x_2) = (0, 1)\) and \((x_1,x_2) = (1, 0)\).
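The case analysis above can also be automated: for each of the four cases, complementary slackness forces either \(\lambda_i = 0\) or \(x_i = 0\), which turns the KKT system into a linear system. A small NumPy sketch (not part of the worksheet) that enumerates the cases:

```python
import numpy as np

# Unknowns ordered as (x1, x2, lam1, lam2, mu).
kkt_points = []
for lam1_zero in (True, False):
    for lam2_zero in (True, False):
        rows, rhs = [], []
        rows.append([2, 4, -1, 0, 1]); rhs.append(0)  # stationarity, first equation
        rows.append([4, 4, 0, -1, 1]); rhs.append(0)  # stationarity, second equation
        rows.append([1, 1, 0, 0, 0]); rhs.append(1)   # equality constraint x1 + x2 = 1
        # complementary slackness: force lam_i = 0 or x_i = 0 in each case
        rows.append([0, 0, 1, 0, 0] if lam1_zero else [1, 0, 0, 0, 0]); rhs.append(0)
        rows.append([0, 0, 0, 1, 0] if lam2_zero else [0, 1, 0, 0, 0]); rhs.append(0)
        try:
            x1, x2, l1, l2, mu = np.linalg.solve(np.array(rows, float), np.array(rhs, float))
        except np.linalg.LinAlgError:
            continue  # inconsistent case (e.g. x1 = x2 = 0 contradicts x1 + x2 = 1)
        if min(x1, x2, l1, l2) >= -1e-9:  # primal feasibility and lam >= 0
            kkt_points.append((round(float(x1), 6) + 0.0, round(float(x2), 6) + 0.0))

print(sorted(set(kkt_points)))  # [(0.0, 1.0), (1.0, 0.0)]
```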

(k) Can you use the KKT theorem to determine which of the points you found above is a local optimal solution of the problem? Justify your answer.

Solution

No, since the problem is not convex, the KKT conditions are not sufficient for optimality. So we cannot use the KKT theorem to determine which of the points is a local optimal solution.

(l) Which of the two points is the global optimal solution of the problem? Justify your answer.

Solution
  • Even though the KKT theorem does not give sufficiency here, the GEVT guarantees that a global optimal solution exists.

  • Since the constraints are linear, the KKT conditions are necessary: any local optimal solution must be a KKT point. So the global optimum must be one of our two KKT points.

  • Evaluate the objective function at each KKT point:

    • At \((x_1,x_2) = (0, 1)\): \(f = 0^2 + 2(1)^2 + 4(0)(1) = 2\)

    • At \((x_1,x_2) = (1, 0)\): \(f = 1^2 + 2(0)^2 + 4(1)(0) = 1\)

  • Since we are minimizing, the global optimal solution is \((x_1,x_2) = (1, 0)\) with objective value \(1\).
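This conclusion can also be seen by parameterizing the feasible segment as \(\mathbf{x} = (t, 1-t)\) for \(t \in [0,1]\), which gives \(f(t) = t^2 + 2(1-t)^2 + 4t(1-t) = 2 - t^2\), a decreasing function on \([0,1]\). A quick NumPy check of this (a sketch, not part of the worksheet):

```python
import numpy as np

# Parameterize the feasible set: x = (t, 1 - t) with t in [0, 1].
t = np.linspace(0.0, 1.0, 1001)
f = t**2 + 2 * (1 - t) ** 2 + 4 * t * (1 - t)  # simplifies to 2 - t^2

i = np.argmin(f)
print(t[i], f[i])  # 1.0 1.0, i.e. the minimum is at (x1, x2) = (1, 0) with value 1
```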