Worksheet 11-3: Review of KKT topics (with Solutions)#

Download: CMSE382-WS11_3.pdf, CMSE382-WS11_3-Soln.pdf

Warning

This is an AI-generated transcript of the worksheet and may contain errors or inaccuracies. Please refer to the original course materials for authoritative content.


For each of the following problems:

  1. Show that the problem is convex or not convex.

  2. Prove that there exists an optimal solution to the problem.

  3. Determine a solution strategy (e.g., where in the flow chart you need to go) and check any additional requirements in order to apply the KKT conditions (e.g. Slater’s condition or regularity).

  4. Are the KKT conditions necessary? Sufficient?

Once you have done the above for each problem, go back and use the KKT conditions to find the optimal solution of each problem. Note you will often need to use cases (\(\lambda_i = 0\) vs. \(\lambda_i > 0\)) to find the KKT points. Use Desmos to check your answer.
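Besides Desmos, a numerical solver can double-check a hand-derived solution. A minimal sketch, assuming SciPy is available (SciPy is not mentioned in the worksheet, so treat this as one option among many): note that SciPy's `'ineq'` constraints use the convention \(\text{fun}(\mathbf{x}) \geq 0\), so a constraint \(g(\mathbf{x}) \leq 0\) must be passed as \(-g(\mathbf{x}) \geq 0\).

```python
from scipy.optimize import minimize

# Toy check: minimize (x - 2)^2 subject to x - 1 <= 0.
# SciPy's 'ineq' constraints mean fun(x) >= 0, so we pass -(x - 1) >= 0.
res = minimize(lambda x: (x[0] - 2.0) ** 2,
               x0=[0.0],
               constraints=[{'type': 'ineq',
                             'fun': lambda x: -(x[0] - 1.0)}])
print(res.x[0])  # the constraint is active at the optimum, x ≈ 1
```

The same pattern (negate each \(\leq 0\) constraint) applies to every problem below.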


Worksheet 11-3: Q1#

Consider the problem

\[\begin{split} \begin{aligned} \min &\quad x_1^4-2x_2^2-x_2\\ \text{s.t.} & \quad x_1^2+x_2^2+x_2 \leq 0 \end{aligned} \end{split}\]

Hint: the constraint can also be written as \(x_1^2 + \left(x_2+\frac{1}{2}\right)^2 - \left(\frac{1}{2}\right)^2 \leq 0\).

Solution
  • Convexity:

  • The objective function is not convex because its Hessian is \(\begin{bmatrix} 12x_1^2 & 0\\ 0 & -4 \end{bmatrix}\), and this is indefinite for some values of \(\mathbf{x}\) (e.g. \(x_1=1\)).

  • The constraint is convex because it is the sum of three convex functions: \(x_1^2\), \(x_2^2\) and the linear function \(x_2\). The other version of the constraint is helpful to see that the feasible region is a closed ball of radius \(\tfrac{1}{2}\) in \(\mathbb{R}^2\) centered at \((x_1,x_2)=(0,-\tfrac{1}{2})\).

  • Existence of optimal solution: The problem consists of minimizing a continuous function over a nonempty compact set, so the minimizer exists (by the Generalized Extreme Value Theorem (GEVT), see Theorem 2.30 in the textbook).

  • Solution Strategy: The problem is not convex, so we don’t know that the KKT conditions are sufficient for optimality. However, the constraint is convex, so we can check Slater’s condition. Our inequality constraint is convex but not affine, so we need a point \(\hat{\mathbf{x}}\) that strictly satisfies it. Any point in the interior of the ball works, for example its center \((0,-\tfrac{1}{2})\). So the problem satisfies Slater’s condition and the KKT conditions are necessary for optimality. This means that we can find the optimal solution by finding the KKT points and checking which one is optimal.

  • Solving for the KKT points:

  1. The Lagrangian for this problem is

\[ L = x_1^4-2x_2^2-x_2+\lambda(x_1^2+x_2^2+x_2) \]

and

\[\begin{split} \nabla L = \begin{pmatrix} 4x_1^3+2\lambda x_1\\ -4x_2-1+2\lambda x_2+\lambda \end{pmatrix}. \end{split}\]

The KKT conditions are

\[\begin{split} \begin{aligned} 4x_1^3+2\lambda x_1 &=0\\ -4x_2-1+2\lambda x_2+\lambda&=0\\ \lambda(x_1^2+x_2^2+x_2) &=0\\ \lambda &\geq 0\\ x_1^2+x_2^2+x_2 &\leq 0 \end{aligned} \end{split}\]
  1. Case 1: \(\lambda = 0\). From the first stationarity equation, \(4x_1^3 = 0\), which implies \(x_1 = 0\). From the second stationarity equation, \(-4x_2 - 1 = 0\), which implies \(x_2 = -\frac{1}{4}\). The point \((0, -\frac14)\) is feasible since \(0 + \frac{1}{16} - \frac14 = -\frac{3}{16} \leq 0\), so it satisfies all of the KKT conditions and is a KKT point.

  2. Case 2: \(\lambda > 0\). In this case, because \(\lambda(x_1^2+x_2^2+x_2)=0\), we get that \(x_1^2+x_2^2+x_2 = 0\). From the first equation we get \(x_1(4x_1^2+2\lambda) = 0\). If \(x_1 = 0\), then \(x_2^2+x_2 = 0\), so \(x_2 = 0\) or \(x_2 = -1\). Plugging into the second stationarity equation gives \(\lambda = 1\) for \((0,0)\) and \(\lambda = 3\) for \((0,-1)\); both satisfy \(\lambda > 0\), so both are KKT points. Finally, if \(x_1 \neq 0\), then \(4x_1^2+2\lambda=0\), which implies \(x_1^2 = -\frac{\lambda}{2}\), and hence \(\lambda \leq 0\), a contradiction.

  3. Therefore the KKT points are \((0, -\frac14)\), \((0,-1)\), and \((0, 0)\).

  4. Plugging the KKT points into the original equation,

\[\begin{split} \begin{aligned} f(0, -\frac14) &= \frac{1}{8}\\ f(0, -1) &= -1\\ f(0, 0) &= 0 \end{aligned} \end{split}\]

so the optimal solution is obtained at \((0,-1)\).
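The case analysis above can also be cross-checked symbolically. A sketch using SymPy (an assumption not in the worksheet; any CAS would do): solve stationarity plus complementary slackness as a polynomial system, then keep only solutions with \(\lambda \geq 0\) and a feasible point.

```python
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lambda', real=True)
f = x1**4 - 2*x2**2 - x2          # Q1 objective
g = x1**2 + x2**2 + x2            # Q1 constraint, g <= 0
L = f + lam * g

# Stationarity (grad L = 0) plus complementary slackness (lam * g = 0)
eqs = [sp.diff(L, x1), sp.diff(L, x2), lam * g]
candidates = sp.solve(eqs, [x1, x2, lam], dict=True)

# Keep dual feasibility (lam >= 0) and primal feasibility (g <= 0)
kkt_points = sorted(
    {(s[x1], s[x2]) for s in candidates if s[lam] >= 0 and g.subs(s) <= 0}
)
print(kkt_points)  # the three KKT points found by hand above
```

This reproduces \((0,-1)\), \((0,-\frac14)\), and \((0,0)\); evaluating \(f\) at each still has to be done to pick the optimum.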


Worksheet 11-3: Q2#

Consider the problem

\[\begin{split} \begin{aligned} \min &\quad 2x_1+x_2\\ \text{s.t.} & \quad 4x_1^2+x_2^2-2 \leq 0\\ & \quad 4x_1+x_2+3 \leq 0 \end{aligned} \end{split}\]
Solution
  • Convexity:

  • The objective function is affine, hence convex.

  • The feasible region is the intersection of a filled-in ellipse and a half plane; therefore, it is convex. The first inequality is convex but not affine, while the second one is affine (thus also convex). The intersection of convex sets is convex, so the feasible region is convex.

  • So, the problem is convex.

  • Existence of optimal solution: The problem consists of minimizing a continuous function over a nonempty, compact set, so the minimizer exists (by the Generalized Extreme Value Theorem (GEVT), see Theorem 2.30 in the textbook).

  • Solution Strategy: The problem is convex, so the KKT conditions are sufficient for optimality. Because we know that an optimal point exists and thus must be one of the KKT points, we can find the optimal solution by finding the KKT points and checking which one is optimal.

While not necessary for finding the solution, to see if the KKT conditions are also necessary, we need to check whether the problem satisfies Slater’s condition. Here we need a point \(\hat{\mathbf{x}}\) that strictly satisfies the first inequality; the affine second constraint only needs to be satisfied, not strictly. For example, the point \((-0.6, -0.6)\) strictly satisfies the first inequality (and satisfies the second with equality), so Slater’s condition holds. Therefore, the KKT conditions are also necessary for optimality.

  • Solving for the KKT points:

  1. The Lagrangian is

\[ L = 2x_1+x_2+\lambda_1(4x_1^2+x_2^2-2)+\lambda_2(4x_1+x_2+3) \]

and

\[\begin{split} \nabla L = \begin{pmatrix} 2+8\lambda_1x_1+4\lambda_2\\ 1+2\lambda_1x_2+\lambda_2 \end{pmatrix}. \end{split}\]

The KKT conditions are

\[\begin{split} \begin{aligned} 2+8\lambda_1x_1+4\lambda_2 &=0\\ 1+2\lambda_1x_2+\lambda_2&=0\\ \lambda_1(4x_1^2+x_2^2-2) &=0\\ \lambda_2(4x_1+x_2+3) &=0\\ 4x_1^2+x_2^2-2 &\leq 0\\ 4x_1+x_2+3 &\leq 0\\ \lambda_1, \lambda_2 &\geq 0 \end{aligned} \end{split}\]
  1. Case 1: \(\lambda_1 = 0, \lambda_2 = 0\). From the first equation of stationarity, \(2 = 0\), which is a contradiction. So there are no KKT points in this case.

  2. Case 2: \(\lambda_1 > 0, \lambda_2 = 0\). Replacing \(\lambda_2=0\), the stationarity equations become

\[\begin{split} \begin{aligned} 2+8\lambda_1x_1 &=0\\ 1+2\lambda_1x_2&=0. \end{aligned} \end{split}\]

Solving for \(1/\lambda_1\) (which we are allowed to do because \(\lambda_1 \neq 0\)) from the above equations gives

\[ \frac{1}{\lambda_1} = -4x_1 = -2x_2 \]

so \(x_2 = 2x_1\). Because \(\lambda_1 > 0\), the complementary slackness condition implies that

\[ 4x_1^2+x_2^2-2 = 0. \]

Then we plug in \(x_2 = 2x_1\) into the above equation to get \(8x_1^2 - 2 = 0\), so \(x_1^2 = \frac14\) and hence \(x_1 = \pm \frac12\). This means the possible points are \((-\frac12, -1)\) and \((\frac12, 1)\).

  • Checking the point \((-\frac12, -1)\), plugging into the stationarity equations gives \(\lambda_1 = \frac{1}{2}\) and \(\lambda_2 = 0\), which is consistent with the assumption from this case. So \((-\frac12, -1)\) is a KKT point.

  • Checking the point \((\frac12, 1)\), plugging into the stationarity equations gives \(\lambda_1 = -\frac{1}{2}\) and \(\lambda_2 = 0\), which contradicts the fact that \(\lambda_1 > 0\). So \((\frac12, 1)\) is not a KKT point. So the only KKT point in this case is \((-\frac12, -1)\). Note that since we know that the KKT conditions are sufficient for optimality, this means that \((-\frac12, -1)\) is the optimal solution. However, we will still check the other cases to see if there are any other KKT points (which would also be optimal) and to practice solving for KKT points in different cases.

  1. Case 3: \(\lambda_1 = 0, \lambda_2 > 0\). Replacing \(\lambda_1=0\), the stationarity equations become

\[\begin{split} \begin{aligned} 2+4\lambda_2 &=0\\ 1+\lambda_2&=0. \end{aligned} \end{split}\]

From both equations, \(\lambda_2 = -1\). However, this contradicts the fact that \(\lambda_2 > 0\), so there are no KKT points in this case.

  1. Case 4: \(\lambda_1 > 0, \lambda_2 > 0\). In this case, the complementary slackness conditions imply that \(4x_1^2+x_2^2-2 = 0\) and \(4x_1+x_2+3 = 0\). Substituting \(x_2 = -4x_1-3\) into the first equation gives \(20x_1^2+24x_1+7 = 0\), so \(x_1 = -0.7\) or \(x_1 = -0.5\), yielding the points \((-0.7,-0.2)\) and \((-0.5,-1)\).

  • For the point \((-0.7,-0.2)\), plugging into the stationarity equations and solving for \(\lambda_1,\lambda_2\) gives \(\lambda_1=-\frac{1}{2}\) and \(\lambda_2 = -\frac{6}{5}\), which contradicts the fact that \(\lambda_1, \lambda_2 > 0\). So \((-0.7,-0.2)\) is not a KKT point.

  • For the point \((-0.5,-1)\), plugging into the stationarity equations gives \(\lambda_1 = \frac{1}{2}\) and \(\lambda_2 = 0\), which breaks the assumption from this case. So \((-0.5,-1)\) is not a KKT point. So there are no KKT points in this case.

  • The solution can be seen at this desmos.
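A numerical cross-check of Q2 (a sketch assuming SciPy; both \(\leq 0\) constraints are negated for SciPy’s \(\text{fun}(\mathbf{x}) \geq 0\) convention):

```python
import numpy as np
from scipy.optimize import minimize

objective = lambda x: 2 * x[0] + x[1]
constraints = [
    {'type': 'ineq', 'fun': lambda x: -(4 * x[0]**2 + x[1]**2 - 2)},  # ellipse
    {'type': 'ineq', 'fun': lambda x: -(4 * x[0] + x[1] + 3)},        # half-plane
]
res = minimize(objective, x0=np.array([-0.6, -0.6]),
               constraints=constraints, method='SLSQP')
print(res.x, res.fun)  # expect approximately (-0.5, -1) with value -2
```

Because the problem is convex, any starting point should lead SLSQP to the same minimizer \((-\frac12,-1)\).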


Worksheet 11-3: Q3#

Consider the problem

\[\begin{split} \begin{aligned} \min &\quad x_1^3+x_2^3\\ \text{s.t.} & \quad x_1^2+x_2^2 \leq 1 \end{aligned} \end{split}\]
Solution
  • Convexity: The objective function is not convex because its Hessian is \(\begin{bmatrix}6x_1 & 0\\ 0 & 6x_2 \end{bmatrix}\), which is indefinite for some values of \(\mathbf{x}\) (e.g. \(x_1 = -\frac12\) and \(x_2 = \frac12\)). The constraint is convex because it is the sum of two convex functions: \(x_1^2\) and \(x_2^2\). Note that the constraint set is the unit disk (including the interior) in \(\mathbb{R}^2\). So the problem is not convex.

  • Existence of optimal solution: The problem consists of minimizing a continuous function over a nonempty, compact set, so the minimizer exists (by the Generalized Extreme Value Theorem (GEVT), see Theorem 2.30 in the textbook).

  • Solution Strategy: The problem is not convex, so we don’t know that the KKT conditions are sufficient for optimality. However, the constraint is convex, so we can check Slater’s condition. Our inequality constraint is convex but not affine, so we need a point \(\hat{\mathbf{x}}\) that strictly satisfies it. The origin \((0,0)\) works, so the problem satisfies Slater’s condition, and thus the KKT conditions are necessary for optimality. This means that we can find the optimal solution by finding the KKT points and checking which one is optimal.

  • Solving for the KKT points:

  1. The Lagrangian is \(L = x_1^3+x_2^3+\lambda(x_1^2+x_2^2-1)\).

\[\begin{split} \nabla L = \begin{pmatrix} 3x_1^2+2\lambda x_1\\ 3x_2^2+2\lambda x_2 \end{pmatrix} \end{split}\]

Therefore the KKT conditions are

\[\begin{split} \begin{aligned} 3x_1^2+2\lambda x_1 &=0\\ 3x_2^2+2\lambda x_2&=0\\ \lambda(x_1^2+x_2^2-1) &=0\\ x_1^2+x_2^2 &\leq 1\\ \lambda & \geq 0 \end{aligned} \end{split}\]
  1. Case 1: \(\lambda = 0\). By plugging \(\lambda = 0\) into \(\nabla L\), we get that \(3x_1^2 = 3x_2^2 = 0\), so \(x_1 = x_2 = 0\). The origin is feasible, so \((0,0)\) is a KKT point.

  2. Case 2: \(\lambda > 0\). By the complementary slackness condition, this implies that \(x_1^2+x_2^2 = 1\). From the first equation in \(\nabla L\), we get that \(x_1(3x_1+2\lambda) = 0\), and from the second equation, \(x_2(3x_2+2\lambda) = 0\). Since \(x_1 = x_2 = 0\) is impossible on the unit circle, this leads to three subcases:

  • Subcase 1: \(x_1=0\). In this case, \(x_2 = \pm 1\). Plugging \(x_2 = 1\) into the second equation in \(\nabla L\) gives \(\lambda = -\frac32 < 0\), a contradiction, while \(x_2 = -1\) gives \(\lambda = \frac32 > 0\). So only \((0, -1)\) is a KKT point in this subcase.

  • Subcase 2: \(x_2=0\). In this case, \(x_1 = \pm 1\). By the same argument as Subcase 1, \(x_1 = 1\) forces \(\lambda = -\frac32 < 0\), a contradiction, so only \((-1, 0)\) is a KKT point in this subcase.

  • Subcase 3: \(x_1, x_2\neq0\). In this case, \(3x_1+2\lambda = 3x_2+2\lambda = 0\), so that \(x_1 = x_2 = -\frac{2\lambda}{3}\). Thus \(2\left(\frac{4\lambda^2}{9}\right) = 1\) and hence \(\lambda^2 = \frac98\). Because \(\lambda \geq 0\), this implies that \(\lambda = \frac{3}{2\sqrt2}\). Thus \(x_1 = x_2 = -\frac{2\lambda}{3} = -\frac{1}{\sqrt2}\).

  1. So the KKT points are \((0,0)\), \((0,-1)\), \((-1,0)\), and \((-\frac{1}{\sqrt2}, -\frac{1}{\sqrt2})\). Checking the function values at these points, we get

\[\begin{split} \begin{aligned} f(0,0) &= 0\\ f(0,-1) &= -1\\ f(-1,0) &= -1\\ f\left(-\tfrac{1}{\sqrt2}, -\tfrac{1}{\sqrt2}\right) &= -\frac{1}{\sqrt2} \end{aligned} \end{split}\]

so the optimal solution is obtained at \((0,-1)\) and \((-1,0)\), which both give the same optimal value of \(-1\).
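Because Q3 is not convex, a local solver only returns one KKT point per starting guess, which makes a multistart check instructive. A hedged sketch assuming SciPy (starting points chosen for illustration): asymmetric starts should land near one of the two minimizers, while a start on the symmetry line \(x_1 = x_2\) may stall at the non-optimal KKT point \((-\tfrac{1}{\sqrt2},-\tfrac{1}{\sqrt2})\).

```python
import numpy as np
from scipy.optimize import minimize

objective = lambda x: x[0]**3 + x[1]**3
constraints = [{'type': 'ineq',
                'fun': lambda x: 1 - x[0]**2 - x[1]**2}]  # unit disk

results = []
for x0 in ([0.1, -0.9], [-0.9, 0.1], [-0.5, -0.5]):
    res = minimize(objective, x0, constraints=constraints, method='SLSQP')
    results.append(res)

best = min(results, key=lambda r: r.fun)
print(best.x, best.fun)  # expect a point near (0, -1) or (-1, 0), value ≈ -1
```

Taking the best result over several starts recovers the optimal value \(-1\), matching the hand computation.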


Worksheet 11-3: Q4#

Consider the problem

\[\begin{split} \begin{aligned} \min &\quad x_1^4-x_2^2\\ \text{s.t.} & \quad x_1^2+x_2^2 \leq 1\\ & \quad 2x_2+1 \leq 0 \end{aligned} \end{split}\]
Solution
  • Convexity: The objective function is not convex because its Hessian is \(\begin{bmatrix}12x_1^2 & 0\\ 0 & -2 \end{bmatrix}\), which is indefinite for some values of \(\mathbf{x}\) (e.g. \(x_1=1\)). The first constraint is convex because it is the sum of two convex functions: \(x_1^2\) and \(x_2^2\). The second constraint is convex because it is affine. The intersection of convex sets is convex, so the feasible region is convex. So the problem is not convex.

  • Existence of optimal solution: The problem consists of minimizing a continuous function over a nonempty, compact set, so the minimizer exists (by the Generalized Extreme Value Theorem (GEVT), see Theorem 2.30 in the textbook).

  • Solution Strategy: This is not a convex problem, but the constraints are convex, so we can check Slater’s condition. We need a point \(\hat{\mathbf{x}}\) that strictly satisfies the first inequality; the affine second constraint only needs to be satisfied, not strictly. For example, the point \((-0.6, -0.6)\) strictly satisfies the first inequality and satisfies the second, so Slater’s condition holds. Therefore, the KKT conditions are necessary for optimality. This means that we can find the optimal solution by finding the KKT points and checking which one is optimal.

  • Solving for the KKT points:

  • The Lagrangian is

\[ L = x_1^4-x_2^2+\lambda_1(x_1^2+x_2^2-1)+\lambda_2(2x_2+1) \]

Then

\[\begin{split} \nabla L = \begin{pmatrix} 4x_1^3+2\lambda_1x_1\\ -2x_2+2\lambda_1x_2+2\lambda_2 \end{pmatrix} \end{split}\]

Therefore the KKT conditions are

\[\begin{split} \begin{aligned} 4x_1^3+2\lambda_1x_1&=0\\ -2x_2+2\lambda_1x_2+2\lambda_2&=0\\ \lambda_1(x_1^2+x_2^2-1) &= 0\\ \lambda_2(2x_2+1) &=0\\ x_1^2+x_2^2 &\leq 1\\ 2x_2+1 &\leq 0\\ \lambda_1, \lambda_2 &\geq 0 \end{aligned} \end{split}\]
  • Case 1: \(\lambda_1 = 0, \lambda_2 = 0\) Plugging \(\lambda_1 =\lambda_2 = 0\) into the gradient of the Lagrangian gives \(x_1 =x_2 = 0\). However, this point violates the constraint \(2x_2+1 \leq 0\), so it is not a KKT point.

  • Case 2: \(\lambda_1 = 0, \lambda_2 > 0\) In this case, from the second complementary slackness condition, \(2x_2+1 = 0\), and hence \(x_2 = -\frac12\). Plugging this into the second equation of \(\nabla L\) gives \(1 + 2\lambda_2 = 0\), i.e. \(\lambda_2 = -\frac12 < 0\), a contradiction.

  • Case 3: \(\lambda_1 > 0\) From the first equation in \(\nabla L\), \(x_1(4x_1^2+2\lambda_1) = 0\). If \(x_1 \neq 0\), then \(4x_1^2+2\lambda_1 = 0\), which would force \(\lambda_1 < 0\) since \(4x_1^2 > 0\), a contradiction. So \(x_1 = 0\). By the first complementary slackness condition, \(x_2^2-1 = 0\), so \(x_2 = \pm 1\). The value \(x_2 = 1\) violates the constraint \(2x_2+1 \leq 0\), leaving \((0, -1)\). Since \(x_2 = -1 \neq -\frac12\), complementary slackness forces \(\lambda_2 = 0\), and the second stationarity equation then gives \(\lambda_1 = 1 > 0\), consistent with this case. So \((0, -1)\) is the only KKT point.

  • Overall, the only KKT point is \((0,-1)\), and since the KKT condition is necessary, this means the optimal solution must be at \((0,-1)\).

  • The solution can be seen at this desmos.
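The same numerical cross-check applies to Q4 (a sketch assuming SciPy; both \(\leq 0\) constraints are negated for SciPy’s \(\text{fun}(\mathbf{x}) \geq 0\) convention, and the starting point is an arbitrary feasible guess):

```python
import numpy as np
from scipy.optimize import minimize

objective = lambda x: x[0]**4 - x[1]**2
constraints = [
    {'type': 'ineq', 'fun': lambda x: 1 - x[0]**2 - x[1]**2},  # unit disk
    {'type': 'ineq', 'fun': lambda x: -(2 * x[1] + 1)},        # x2 <= -1/2
]
res = minimize(objective, x0=np.array([0.2, -0.6]),
               constraints=constraints, method='SLSQP')
print(res.x, res.fun)  # expect approximately (0, -1) with value -1
```

The solver should confirm the unique KKT point \((0,-1)\) with objective value \(-1\).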