Worksheet 9-1: Optimization Over Convex Sets (with Solutions)
Download: CMSE382-WS9_1.pdf, CMSE382-WS9_1-Soln.pdf
Warning
This is an AI-generated transcript of the worksheet and may contain errors or inaccuracies. Please refer to the original course materials for authoritative content.
Worksheet 9-1: Q1
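Consider the problem
\[ \min_{x_1,x_2}\; -x_1x_2 \quad \text{subject to} \quad x_1+x_2=1. \]
The feasible set for this problem is
\[ U=\{(x_1,x_2)\mid x_1+x_2=1\}. \]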
Take a look at this Desmos plot. What is the blue surface? What is the red line? What is the green line? Move the slider for \(a\) around. What is the grey point?
Solution
The blue surface is the function \(f(\mathbf{x})\).
The red line is the constraint set, \(U\).
The green line is the function restricted to \(U\).
The grey point is a point on the graph of the function. Its height is the value we’re trying to minimize.
Based on the plot, does this appear to be a convex problem?
Solution
The constraint set is the line \(x_1+x_2=1\), so it is convex. The function \(f(x_1,x_2)=-x_1x_2\) is a saddle, so it is not convex on all of \(\mathbb{R}^2\). However, its restriction to the line \(x_1+x_2=1\) is convex. That means this is actually a convex problem.
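The saddle-versus-convex distinction can be checked numerically. The following is a minimal sketch, not part of the original worksheet: it looks at the eigenvalues of the (constant) Hessian of \(f(x_1,x_2)=-x_1x_2\), and then at the curvature of \(f\) along the direction \(d=(1,-1)\), which is the direction the line \(x_1+x_2=1\) runs in. The variable names are illustrative choices.

```python
import numpy as np

# Hessian of f(x1, x2) = -x1 * x2; it is constant because f is quadratic.
H = np.array([[0.0, -1.0],
              [-1.0, 0.0]])

# Eigenvalues of mixed sign mean f is a saddle, so f is not convex on R^2.
eigenvalues = np.linalg.eigvalsh(H)

# Direction of travel along the line x1 + x2 = 1.
d = np.array([1.0, -1.0])

# Second directional derivative of f along the line.
# A positive value means the restriction of f to the line is convex.
curvature_along_line = d @ H @ d

print(eigenvalues)
print(curvature_along_line)
```

One negative and one positive eigenvalue confirm the saddle shape, while the positive curvature along \(d\) is consistent with the restriction being convex.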
Let’s find the solution without using the stationarity condition first. For a point \(\mathbf{x}=(x_1,x_2)\in U\), write down \(\mathbf{x}\) in terms of just \(x_1\). Then write down \(f(\mathbf{x})\) restricted to \(U\) in terms of just \(x_1\).
Solution
Since \(\mathbf{x}\in U\), \(x_1+x_2=1\), so \(x_2=1-x_1\).
So \(\mathbf{x}=(x_1,1-x_1)\).
This means that in \(U\),
\[ f(x_1,x_2)=f(x_1,1-x_1)=-x_1(1-x_1)=-x_1+x_1^2. \]
Great, this is a function in one variable! Find the minimum. Use this to determine the minimum for the problem. Move the grey point in the Desmos plot to check your answer.
Solution
\(\frac{d}{dx_1}\left(-x_1(1-x_1)\right)=-1+2x_1\).
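The derivative above is 0 if \(-1+2x_1=0\), that is, at \(x_1=\tfrac{1}{2}\). Since the second derivative is \(2>0\), this critical point is a minimum, so the minimum is at \(x_1=\tfrac{1}{2}\).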
Going back to the original problem, this means the minimum occurs at
\[ \left(\tfrac{1}{2},1-\tfrac{1}{2}\right)=\left(\tfrac{1}{2},\tfrac{1}{2}\right). \]
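As a sanity check on this calculation, the restricted function can be evaluated on a grid and its smallest value located. This is only an illustrative sketch: the grid window \([-2,3]\) and the helper name `f_on_line` are assumptions made here, not something the worksheet specifies.

```python
import numpy as np

def f_on_line(x1):
    """f(x1, 1 - x1) = -x1 * (1 - x1), the objective restricted to U."""
    return -x1 * (1.0 - x1)

# Sample x1 over an arbitrary window of the line (an assumption of this sketch).
grid = np.linspace(-2.0, 3.0, 5001)
values = f_on_line(grid)

# Locate the grid point with the smallest objective value.
best_index = np.argmin(values)
x1_min = grid[best_index]
x_min = (x1_min, 1.0 - x1_min)

print(x_min, values[best_index])
```

The grid minimizer should land at (approximately) \(\left(\tfrac{1}{2},\tfrac{1}{2}\right)\) with objective value close to \(-\tfrac{1}{4}\), matching the derivative calculation.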
Let’s go back and understand the stationarity condition for this problem. First, what is \(\nabla f(x_1,x_2)\)?
Solution
\(\nabla f(x_1,x_2)=(-x_2,-x_1)\)
We’ll start with a point that isn’t a stationary point and show that the stationarity condition doesn’t hold. For the point \(\mathbf{x}^*=(0,1)\) and some other point \(\mathbf{x}=(x_1,x_2)\in U\), write down the stationarity condition we would check. Put it in terms of only \(x_1\).
Solution
\(\nabla f(\mathbf{x}^*)^\top(\mathbf{x}-\mathbf{x}^*)\ge 0\)
\((-1,0)^\top(x_1-0,(1-x_1)-1)\ge 0\)
\((-1,0)^\top(x_1,-x_1)\ge 0\)
\(-x_1\ge 0\)
To show that \(\mathbf{x}^*=(0,1)\) is not a stationary point, use your calculation above to find a point \(\mathbf{x}\in U\) that does not satisfy the stationarity condition.
Solution
Any point \((x_1,1-x_1)\) with \(x_1>0\) will work. So, for example, choose \((7,-6)\). This point is in the set \(U\) since the sum of the coordinates is 1. However, when we plug it into the stationarity condition, we get
\[ \nabla f(\mathbf{x}^*)^\top(\mathbf{x}-\mathbf{x}^*)=-x_1=-7, \]
so it is not \(\ge 0\). This means \(\mathbf{x}^*=(0,1)\) is not a stationary point.
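The same check can be done in a few lines of code. This is a small sketch with hypothetical helper names (`grad_f`, `stationarity_value`); it uses the gradient of \(f(x_1,x_2)=-x_1x_2\), which is \((-x_2,-x_1)\).

```python
import numpy as np

def grad_f(x):
    """Gradient of f(x1, x2) = -x1 * x2."""
    return np.array([-x[1], -x[0]])

def stationarity_value(x_star, x):
    """Left-hand side grad f(x*)^T (x - x*) of the stationarity condition."""
    return grad_f(x_star) @ (x - x_star)

x_star = np.array([0.0, 1.0])
x = np.array([7.0, -6.0])  # in U because 7 + (-6) = 1

value = stationarity_value(x_star, x)
print(value)  # -7.0
```

A single negative value is enough to rule out \((0,1)\) as a stationary point, because the condition has to hold for every point of \(U\).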
Now, we’ll do this for \(\mathbf{x}^*\) which gives the minimum that you found on the first page, which should have been \(\mathbf{x}^*=\left(\tfrac{1}{2},\tfrac{1}{2}\right)\). For \(\mathbf{x}^*\) equal to that point, what is \(\nabla f(\mathbf{x}^*)\)?
Solution
\(\nabla f(\mathbf{x}^*)=\left(-\tfrac{1}{2},-\tfrac{1}{2}\right)\)
Say we have some point \(\mathbf{x}=(x_1,x_2)\in U\). Write the stationarity condition for this problem we would check for \(\mathbf{x}^*\) found above in terms of only \(x_1\). Is there any possible \(\mathbf{x}\in U\) that does not satisfy the stationarity condition?
Solution
\(\nabla f(\mathbf{x}^*)^\top(\mathbf{x}-\mathbf{x}^*)\ge 0\)
\(\left(-\tfrac{1}{2},-\tfrac{1}{2}\right)^\top\left(x_1-\tfrac{1}{2},(1-x_1)-\tfrac{1}{2}\right)\ge 0\)
\(\left(-\tfrac{1}{2},-\tfrac{1}{2}\right)^\top\left(x_1-\tfrac{1}{2},\tfrac{1}{2}-x_1\right)\ge 0\)
\(-\tfrac{1}{2}\left(x_1-\tfrac{1}{2}\right)-\tfrac{1}{2}\left(\tfrac{1}{2}-x_1\right)\ge 0\)
The left side simplifies to \(0\), so this turns into \(0\ge 0\), which is always true. No matter which \(\mathbf{x}\in U\) is chosen, the stationarity condition is trivially satisfied. That means the point \(\mathbf{x}^*=\left(\tfrac{1}{2},\tfrac{1}{2}\right)\) is a stationary point.
Of course we knew it was going to be a stationary point because it’s a minimum of a convex problem.
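To see the "always true" claim numerically, one can evaluate the left-hand side of the stationarity condition at \(\mathbf{x}^*=\left(\tfrac{1}{2},\tfrac{1}{2}\right)\) for many points of \(U\). The sketch below is illustrative only; the sampled range of \(x_1\) is an arbitrary assumption.

```python
import numpy as np

def grad_f(x):
    """Gradient of f(x1, x2) = -x1 * x2."""
    return np.array([-x[1], -x[0]])

x_star = np.array([0.5, 0.5])
gradient = grad_f(x_star)

# Sample points (x1, 1 - x1) on the line; the range of x1 is arbitrary.
x1_samples = np.linspace(-10.0, 10.0, 201)
values = np.array([
    gradient @ (np.array([x1, 1.0 - x1]) - x_star)
    for x1 in x1_samples
])

# Largest deviation from zero over all sampled points.
print(np.max(np.abs(values)))
```

Every sampled value should be zero up to floating-point rounding, so none of them violates \(\ge 0\).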
Worksheet 9-1: Q2
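Let’s extend the example above to the more general case. Consider the optimization problem
\[ \min_{\mathbf{x}}\; f(\mathbf{x}) \quad \text{subject to} \quad \sum_{i=1}^n x_i=1, \]
where \(f\) is a continuously differentiable function over \(\mathbb{R}^n\). The feasible set for the problem is
\[ U=\left\{\mathbf{x}\in\mathbb{R}^n \;\middle|\; \sum_{i=1}^n x_i=1\right\}. \]
We will show that the stationarity condition here, namely
\[ \nabla f(\mathbf{x}^*)^\top(\mathbf{x}-\mathbf{x}^*)\ge 0 \quad \text{for all } \mathbf{x}\in U, \]
is satisfied when
\[ \frac{\partial f}{\partial x_1}(\mathbf{x}^*)=\frac{\partial f}{\partial x_2}(\mathbf{x}^*)=\cdots=\frac{\partial f}{\partial x_n}(\mathbf{x}^*). \]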
First, go back to the previous problem. Check that the solution you found for \(\mathbf{x}^*\) satisfies the second condition above.
Solution
In the example above,
\[ \nabla f(\mathbf{x}^*)=\left(-\tfrac{1}{2},-\tfrac{1}{2}\right), \]
and the condition above is just that each entry is the same. Here, they’re all \(-\tfrac{1}{2}\) so this satisfies the condition.
Now we will check that if the second condition above is true, then the stationarity condition above is true. Say that every entry in \(\nabla f(\mathbf{x}^*)\) is \(a\) (this is exactly what the second condition above says). Simplify \(\nabla f(\mathbf{x}^*)^\top(\mathbf{x}-\mathbf{x}^*)\) as much as possible.
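Solution
\[ \nabla f(\mathbf{x}^*)^\top(\mathbf{x}-\mathbf{x}^*)=\sum_{i=1}^n a\,(x_i-x_i^*)=a\left(\sum_{i=1}^n x_i-\sum_{i=1}^n x_i^*\right)=a(1-1)=0, \]
since \(\mathbf{x}\) and \(\mathbf{x}^*\) are both in \(U\), so the entries of each sum to 1.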
Why does the result above imply that \(\mathbf{x}^*\) is a stationary point?
Solution
To be a stationary point, we need \(\nabla f(\mathbf{x}^*)^\top(\mathbf{x}-\mathbf{x}^*)\ge 0\) for all \(\mathbf{x}\in U\), but since the left side is always 0, this is always true.
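The general argument can be illustrated with a concrete example. The sketch below makes several assumptions that are not in the worksheet: it takes the feasible set to be the hyperplane where the coordinates sum to 1 (the natural extension of Q1), it picks \(f(\mathbf{x})=\tfrac{1}{2}\|\mathbf{x}\|^2\) as a stand-in objective, and it uses \(n=5\). For this choice, \(\nabla f(\mathbf{x})=\mathbf{x}\), so at \(\mathbf{x}^*=\left(\tfrac{1}{n},\dots,\tfrac{1}{n}\right)\) every gradient entry equals \(a=\tfrac{1}{n}\).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5  # dimension, chosen arbitrarily for illustration

def grad_f(x):
    """Gradient of the illustrative choice f(x) = 0.5 * ||x||^2."""
    return x

# Candidate point in U whose gradient has all entries equal to a = 1/n.
x_star = np.full(n, 1.0 / n)
a = grad_f(x_star)

def random_point_in_U():
    """Random vector shifted so that its entries sum to 1."""
    z = rng.normal(size=n)
    return z + (1.0 - z.sum()) / n

# Left-hand side of the stationarity condition at many feasible points.
values = np.array([a @ (random_point_in_U() - x_star) for _ in range(1000)])

print(np.max(np.abs(values)))
```

All values should vanish up to rounding, which matches the algebra: a constant gradient vector only sees the sum of the entries of \(\mathbf{x}-\mathbf{x}^*\), and that sum is zero on the hyperplane.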
Worksheet 9-1: Q3
Consider the convex optimization problem
\[ \min\; 2x^2+3y^2+4z^2+2xy-2xz-8x-4y-2z \quad \text{subject to} \quad x,y,z\ge 0. \]
(a) What is the gradient of \(f(\mathbf{x})=2x^2+3y^2+4z^2+2xy-2xz-8x-4y-2z\)?
Solution
\[ \nabla f(x,y,z)=\left(4x+2y-2z-8,\; 2x+6y-4,\; -2x+8z-2\right). \]
(b) Fix \(\mathbf{x}^*=\left(\frac{17}{7},0,\frac{6}{7}\right)\). What is \(\nabla f(\mathbf{x}^*)\)? Is \(\mathbf{x}^*\) a stationary point of the function \(f\)?
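Solution
\[ \nabla f(\mathbf{x}^*)=\left(4\cdot\tfrac{17}{7}+2\cdot 0-2\cdot\tfrac{6}{7}-8,\; 2\cdot\tfrac{17}{7}+6\cdot 0-4,\; -2\cdot\tfrac{17}{7}+8\cdot\tfrac{6}{7}-2\right)=\left(0,\tfrac{6}{7},0\right). \]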
Since \(\nabla f(\mathbf{x}^*)\) is not 0, this point is not a stationary point of the function.
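The gradient evaluation is easy to get wrong by hand, so here is a small exact-arithmetic check. It is a sketch using Python's standard `fractions` module; the function name `grad_f` is an illustrative choice, and the formula is obtained by differentiating \(f\) term by term.

```python
from fractions import Fraction

def grad_f(x, y, z):
    """Gradient of f = 2x^2 + 3y^2 + 4z^2 + 2xy - 2xz - 8x - 4y - 2z."""
    return (4 * x + 2 * y - 2 * z - 8,   # partial derivative in x
            2 * x + 6 * y - 4,           # partial derivative in y
            -2 * x + 8 * z - 2)          # partial derivative in z

x_star = (Fraction(17, 7), Fraction(0), Fraction(6, 7))
gradient_at_x_star = grad_f(*x_star)

print(gradient_at_x_star)  # (Fraction(0, 1), Fraction(6, 7), Fraction(0, 1))
```

Only the \(y\) entry is nonzero, which is why \(\mathbf{x}^*\) is not a stationary point of the unconstrained function.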
(c) Show that the vector \(\left(\frac{17}{7},0,\frac{6}{7}\right)\) is a stationary point of the problem.
Solution
To show that it is a stationary point of the problem, we need to check that for any \(\mathbf{x}\) in the constraint set,
\[ \nabla f(\mathbf{x}^*)^\top(\mathbf{x}-\mathbf{x}^*)\ge 0. \]
The point \(\mathbf{x}=(x,y,z)\) is in the constraint set if \(x,y,z\ge 0\).
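We calculate that
\[ \nabla f(\mathbf{x}^*)^\top(\mathbf{x}-\mathbf{x}^*)=\left(0,\tfrac{6}{7},0\right)^\top\left(x-\tfrac{17}{7},\; y-0,\; z-\tfrac{6}{7}\right)=\tfrac{6}{7}\,y. \]
Since every point in the constraint set has \(y\ge 0\), this means that \(\nabla f(\mathbf{x}^*)^\top(\mathbf{x}-\mathbf{x}^*)\ge 0\), which is the definition of being a stationary point of the problem.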
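A quick numerical spot check of this argument: sample feasible points and confirm that the left-hand side of the stationarity condition is never negative. This is only a sketch; sampling from the box \([0,10]^3\) is an arbitrary assumption, and a finite sample illustrates the claim rather than proving it.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_f(v):
    """Gradient of f = 2x^2 + 3y^2 + 4z^2 + 2xy - 2xz - 8x - 4y - 2z."""
    x, y, z = v
    return np.array([4 * x + 2 * y - 2 * z - 8,
                     2 * x + 6 * y - 4,
                     -2 * x + 8 * z - 2])

x_star = np.array([17 / 7, 0.0, 6 / 7])
g = grad_f(x_star)

# Random feasible points: every coordinate is nonnegative.
points = rng.uniform(0.0, 10.0, size=(1000, 3))

# grad f(x*)^T (x - x*) for each sampled point.
values = (points - x_star) @ g

print(values.min())
```

Each value should agree with \(\tfrac{6}{7}y\) for the sampled point, up to rounding, so the smallest value stays nonnegative.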
(d) Find the first iteration of the gradient projection method starting with \(\mathbf{x}_0=(1,1,1)\), and using a constant step size \(0.5\).
Solution
The equation for the gradient projection algorithm update step is
\[ \mathbf{x}_{k+1}=P_C\left(\mathbf{x}_k-t_k\nabla f(\mathbf{x}_k)\right). \]
For \(k=0\), this is
\[ \mathbf{x}_{1}=P_C\left(\mathbf{x}_0-t_0\nabla f(\mathbf{x}_0)\right). \]
We have constant step size, so \(t_0=0.5\).
We have \(\mathbf{x}_0=(1,1,1)\).
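From above, we know \(\nabla f\) so we can calculate:
\[ \nabla f(\mathbf{x}_0)=\nabla f(1,1,1)=\left(4+2-2-8,\; 2+6-4,\; -2+8-2\right)=(-4,4,4). \]
So
\[ \mathbf{x}_0-t_0\nabla f(\mathbf{x}_0)=(1,1,1)-0.5\,(-4,4,4)=(3,-1,-1). \]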
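In this problem, the constraint set is
\[ C=\{(x,y,z)\mid x,y,z\ge 0\}=\mathbb{R}_+^3. \]
From the projection class earlier, we know that
\[ P_{\mathbb{R}_+^3}(\mathbf{x})=[\mathbf{x}]_+. \]
So,
\[ \mathbf{x}_1=P_C\left(\mathbf{x}_0-t_0\nabla f(\mathbf{x}_0)\right)=\left[(3,-1,-1)\right]_+=(3,0,0). \]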
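The update step can be written out as a short script. This is a minimal sketch of one gradient projection step for this particular problem, with illustrative function names (`grad_f`, `project_nonneg`, `gradient_projection_step`); the projection onto \(\mathbb{R}_+^3\) is implemented as the entrywise positive part \([\mathbf{x}]_+\).

```python
import numpy as np

def grad_f(v):
    """Gradient of f = 2x^2 + 3y^2 + 4z^2 + 2xy - 2xz - 8x - 4y - 2z."""
    x, y, z = v
    return np.array([4 * x + 2 * y - 2 * z - 8,
                     2 * x + 6 * y - 4,
                     -2 * x + 8 * z - 2])

def project_nonneg(v):
    """Projection onto the nonnegative orthant: replace negatives by 0."""
    return np.maximum(v, 0.0)

def gradient_projection_step(v, step):
    """One update x_{k+1} = P_C(x_k - t_k * grad f(x_k))."""
    return project_nonneg(v - step * grad_f(v))

x0 = np.array([1.0, 1.0, 1.0])
x1 = gradient_projection_step(x0, 0.5)

print(x1)  # [3. 0. 0.]
```

The script performs only the single iteration asked for in part (d). Whether repeated steps with the constant step size \(0.5\) converge is a separate question that this sketch does not address.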