Worksheet 4-3: Lipschitz Gradient (with Solutions)

Warning

This is an AI-generated transcript of the worksheet and may contain errors or inaccuracies. Please refer to the original course materials for authoritative content.


Useful facts. Let \(f: \mathbb{R}^n \to \mathbb{R}\) be \(C^2\).

  • If \(f \in C^{1,1}_L(\mathbb{R}^n)\) (i.e., \(\nabla f\) is \(L\)-Lipschitz), then \(\|\nabla^2 f(\mathbf{x})\| \le L\) for all \(\mathbf{x}\).

  • A differentiable function is Lipschitz on a convex set \(\Omega\) if its gradient is bounded on \(\Omega\); in that case \(\sup_{\Omega}\|\nabla f\|\) is a Lipschitz constant.

  • For one-dimensional functions, \(\nabla f\) is Lipschitz on \(\Omega\) iff \(f''\) is bounded on \(\Omega\).
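
As a brief worked justification of the last two facts in one dimension (a sketch, assuming \(\Omega\) is an interval and \(f\) is differentiable on it): by the mean value theorem, for any \(x, y \in \Omega\) there is a \(\xi\) between them with

\[f(x) - f(y) = f'(\xi)\,(x - y) \quad\Longrightarrow\quad |f(x) - f(y)| \le \Big(\sup_{z \in \Omega} |f'(z)|\Big)\,|x - y|.\]

Applying the same argument to \(f'\) in place of \(f\) bounds \(|f'(x) - f'(y)|\) by \(\sup_{\Omega}|f''| \cdot |x - y|\), so a bound on \(f''\) serves as a Lipschitz constant for \(f'\).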


Worksheet 4-3: Q1

For each of the following functions, determine:

  • Is the function Lipschitz on the domain given?

  • Is its gradient/derivative Lipschitz? (i.e., does the function belong to \(C^{1,1}\)?)

| Function | Domain | Lipschitz? | \(\nabla f\) Lipschitz? |
|---|---|---|---|
| \(f(x) = mx + b\) | \(\mathbb{R}\) | | |
| \(f(x) = \sqrt{x}\) | \([0, \infty)\) | | |
| \(f(x) = x^2\) | \(\mathbb{R}\) | | |
| \(f(x) = \sin(x)\) | \(\mathbb{R}\) | | |
| \(f(x) = e^{-x}\) | \([0, \infty)\) | | |
| \(f(x,y) = 2\sin(x) - 10.9y^2 + \pi e^{-(x^2+y^2)}\) | \(\mathbb{R}^2\) | | |

Solutions

\(f(x) = mx + b\), domain \(\mathbb{R}\):

  • \(f'(x) = m\) (constant), so \(|f(x)-f(y)| = |m||x-y|\) → Lipschitz with constant \(|m|\).

  • \(f''(x) = 0\) (bounded) → \(\nabla f\) is Lipschitz (in \(C^{1,1}_0\)).

\(f(x) = \sqrt{x}\), domain \([0, \infty)\):

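  • \(f'(x) = \frac{1}{2\sqrt{x}} \to \infty\) as \(x \to 0^+\), so the derivative is unbounded near \(0\) → NOT Lipschitz on \([0,\infty)\).

  • \(f\) is not differentiable at \(x = 0\), and on \((0,\infty)\) the second derivative \(f''(x) = -\frac{1}{4}x^{-3/2}\) is unbounded as \(x \to 0^+\) → \(\nabla f\) is NOT Lipschitz on \([0,\infty)\). (Failing to be Lipschitz does not by itself rule out a Lipschitz gradient; \(x^2\) on \(\mathbb{R}\) is a counterexample.)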
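
The blow-up near \(0\) can be seen numerically. The following is a minimal sketch, not part of the original worksheet: it evaluates the difference quotient of \(\sqrt{x}\) between \(0\) and a shrinking \(h\), which equals \(1/\sqrt{h}\). The particular values of \(h\) are an illustrative assumption.

```python
import math

# Difference quotient of sqrt between 0 and h:
# (sqrt(h) - sqrt(0)) / (h - 0) = 1 / sqrt(h), which grows without bound as h -> 0+.
quotients = []
for k in range(1, 7):
    h = 10.0 ** (-2 * k)  # illustrative choice: h = 1e-2, 1e-4, ..., 1e-12
    q = (math.sqrt(h) - math.sqrt(0.0)) / h
    quotients.append(q)
    print(f"h = {h:.0e}   quotient = {q:.3e}")
```

No single constant can bound these quotients, which is what the Lipschitz condition would require.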

\(f(x) = x^2\), domain \(\mathbb{R}\):

  • \(f'(x) = 2x\), unbounded → NOT Lipschitz on \(\mathbb{R}\).

  • \(f''(x) = 2\) (bounded) → \(\nabla f\) is Lipschitz with \(L=2\) (i.e., in \(C^{1,1}_2\)), but \(f\) itself is not Lipschitz.
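
To make the contrast concrete, here is a small sketch (an illustration only; the helper name `max_quotient` and the sample grids are assumptions, not from the worksheet). It computes the largest difference quotient over all pairs of sample points, for \(f(x) = x^2\) and for \(f'(x) = 2x\), on intervals of growing radius. A finite grid only gives a lower bound on the true Lipschitz constant, but the trend is the point.

```python
from itertools import combinations

def max_quotient(g, points):
    """Largest |g(a) - g(b)| / |a - b| over all pairs of distinct sample points."""
    return max(abs(g(a) - g(b)) / abs(a - b) for a, b in combinations(points, 2))

def f(x):
    return x * x

def df(x):
    return 2 * x

results = {}
for radius in (1, 10, 100):
    # 201 evenly spaced sample points on [-radius, radius]
    grid = [radius * (i - 100) / 100 for i in range(201)]
    results[radius] = (max_quotient(f, grid), max_quotient(df, grid))
    print(f"radius = {radius:>3}: f quotient = {results[radius][0]:.3f}, "
          f"f' quotient = {results[radius][1]:.3f}")
```

The quotient for \(f\) grows with the interval, since \(|x^2 - y^2|/|x - y| = |x + y|\), while the quotient for \(f'\) stays at \(2\).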

\(f(x) = \sin(x)\), domain \(\mathbb{R}\):

  • \(|f'(x)| = |\cos(x)| \le 1\) → Lipschitz with \(L=1\).

  • \(f''(x) = -\sin(x)\), \(|f''(x)| \le 1\) → \(\nabla f\) is Lipschitz with \(L=1\) (in \(C^{1,1}_1\)).

\(f(x) = e^{-x}\), domain \([0, \infty)\):

  • \(f'(x) = -e^{-x}\), \(|f'(x)| \le 1\) on \([0,\infty)\) → Lipschitz with \(L=1\).

  • \(f''(x) = e^{-x}\), \(|f''(x)| \le 1\) on \([0,\infty)\) → \(\nabla f\) is Lipschitz with \(L=1\) (in \(C^{1,1}_1\)).

\(f(x,y) = 2\sin(x) - 10.9y^2 + \pi e^{-(x^2+y^2)}\), domain \(\mathbb{R}^2\):

  • The \(-10.9y^2\) term contributes \(-21.8y\) to \(\partial f/\partial y\), which is unbounded as \(|y| \to \infty\), while the contributions of the other two terms are bounded → NOT Lipschitz on \(\mathbb{R}^2\).

  • Every entry of \(\nabla^2 f\) is bounded: \(2\sin(x)\) contributes \(-2\sin(x)\), \(-10.9y^2\) contributes the constant \(-21.8\), and the second derivatives of \(\pi e^{-(x^2+y^2)}\) are polynomials times a Gaussian, hence bounded. So \(\|\nabla^2 f\|\) is bounded → \(\nabla f\) is Lipschitz on \(\mathbb{R}^2\) (in \(C^{1,1}\)), even though \(f\) itself is not Lipschitz.
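
As a numerical sanity check of the last item, the sketch below samples the spectral norm of the Hessian on a grid and compares it with the triangle-inequality bound \(2 + 21.8 + 2\pi\) (one valid, not necessarily tight, choice of \(L\)). The grid \([-5,5]^2\) with spacing \(0.1\) and the function names are assumptions made for illustration; the Hessian entries are derived by hand from the formula for \(f\).

```python
import math

def hessian(x, y):
    """Hand-derived Hessian entries (f_xx, f_xy, f_yy) of
    f(x, y) = 2 sin(x) - 10.9 y^2 + pi * exp(-(x^2 + y^2))."""
    g = math.exp(-(x * x + y * y))
    f_xx = -2.0 * math.sin(x) + math.pi * g * (4.0 * x * x - 2.0)
    f_xy = 4.0 * math.pi * x * y * g
    f_yy = -21.8 + math.pi * g * (4.0 * y * y - 2.0)
    return f_xx, f_xy, f_yy

def spectral_norm(a, b, c):
    """Spectral norm of the symmetric matrix [[a, b], [b, c]] via its two eigenvalues."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return max(abs(mean + radius), abs(mean - radius))

# Illustrative sample grid on [-5, 5]^2 with spacing 0.1 (includes the origin).
ticks = [0.1 * i for i in range(-50, 51)]
max_norm = max(spectral_norm(*hessian(x, y)) for x in ticks for y in ticks)

# Triangle-inequality bound: sin term <= 2, quadratic term = 21.8, Gaussian term <= 2*pi.
bound = 2.0 + 21.8 + 2.0 * math.pi
print(f"largest sampled Hessian norm: {max_norm:.4f}")
print(f"triangle-inequality bound:    {bound:.4f}")
```

Sampling cannot prove boundedness on all of \(\mathbb{R}^2\); it only confirms that the sampled norms stay below the analytic bound.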


Worksheet 4-3: Q2

Consider the function \(f(x,y) = x^2 + y^4\) and starting point \(\mathbf{x}_0 = (1, 1)^\top\).

  1. Perform 3 iterations of gradient descent with step size \(t = 1/2\).

    \(\nabla f(x,y) = \begin{bmatrix} 2x \\ 4y^3 \end{bmatrix}\), so \(\mathbf{x}_{k+1} = \mathbf{x}_k - \frac{1}{2}\nabla f(\mathbf{x}_k)\).

    Solution (a): \(\mathbf{x}_1\)
    \[\begin{split}\nabla f(1,1) = \begin{bmatrix} 2 \\ 4 \end{bmatrix}\end{split}\]
    \[\begin{split}\mathbf{x}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} - \frac{1}{2}\begin{bmatrix} 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 0 \\ -1 \end{bmatrix}\end{split}\]
    Solution (b): \(\mathbf{x}_2\)
    \[\begin{split}\nabla f(0,-1) = \begin{bmatrix} 0 \\ -4 \end{bmatrix}\end{split}\]
    \[\begin{split}\mathbf{x}_2 = \begin{bmatrix} 0 \\ -1 \end{bmatrix} - \frac{1}{2}\begin{bmatrix} 0 \\ -4 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}\end{split}\]
    Solution (c): \(\mathbf{x}_3\)
    \[\begin{split}\nabla f(0,1) = \begin{bmatrix} 0 \\ 4 \end{bmatrix}\end{split}\]
    \[\begin{split}\mathbf{x}_3 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} - \frac{1}{2}\begin{bmatrix} 0 \\ 4 \end{bmatrix} = \begin{bmatrix} 0 \\ -1 \end{bmatrix}\end{split}\]
  2. Comment on the convergence behavior observed in the \(x\)- and \(y\)-components. Are they converging at the same rate?

    Solution

    The \(x\)-component converges to \(0\) immediately after the first step and stays there.

    The \(y\)-component oscillates between \(\pm 1\) and does not converge for step size \(t = 1/2\): the update \(y_{k+1} = y_k - 2y_k^3\) maps \(1 \mapsto -1\) and \(-1 \mapsto 1\). The \(y^4\) term has a non-Lipschitz gradient (see part 3), and at \(|y| = 1\) its curvature is \(12y^2 = 12\), so the step \(t = 1/2\) is too large for the \(y\)-component.

    No, the two components are not converging at the same rate: the \(x\)-component converges in one step while the \(y\)-component oscillates.

  3. Is the \(C^{1,1}\) assumption satisfied for \(f(x,y) = x^2 + y^4\)?

    Solution

    No. \(\nabla^2 f = \begin{bmatrix} 2 & 0 \\ 0 & 12y^2 \end{bmatrix}\). The \((2,2)\) entry \(12y^2\) is unbounded as \(|y| \to \infty\), so \(\|\nabla^2 f\|\) is unbounded. Therefore \(f \notin C^{1,1}\) on all of \(\mathbb{R}^2\).

    This explains the oscillation: the convergence guarantee for gradient descent with fixed step \(t \le 1/L\) requires \(f \in C^{1,1}_L\) for some finite \(L\), and no such global \(L\) exists here.
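
The iterations in part 1 can be reproduced with a few lines of code. This is a minimal sketch rather than the course's reference implementation: the function names are hypothetical, and the second run with \(t = 0.1\) over 50 iterations is an illustrative choice that does not appear in the worksheet.

```python
def grad(x, y):
    """Gradient of f(x, y) = x^2 + y^4."""
    return 2.0 * x, 4.0 * y ** 3

def gradient_descent(x, y, t, num_iters):
    """Fixed-step gradient descent; returns the list of iterates including the start."""
    iterates = [(x, y)]
    for _ in range(num_iters):
        gx, gy = grad(x, y)
        x, y = x - t * gx, y - t * gy
        iterates.append((x, y))
    return iterates

# Worksheet setting: start at (1, 1), step size t = 1/2, three iterations.
big_step = gradient_descent(1.0, 1.0, t=0.5, num_iters=3)
for k, point in enumerate(big_step):
    print(f"t = 0.5, x_{k} = {point}")

# Illustrative smaller step (an assumption, not from the worksheet).
small_step = gradient_descent(1.0, 1.0, t=0.1, num_iters=50)
print(f"t = 0.1, x_50 = {small_step[-1]}")
```

With \(t = 1/2\) the \(y\)-component flips sign at every step. With the smaller step both components shrink toward \(0\): \(x\) is multiplied by \(0.8\) at each iteration, while \(y\) is multiplied by \(1 - 0.4y^2\), a factor that approaches \(1\) as \(y \to 0\), so the \(y\)-component decays more slowly.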