# Part 4: Gradient-based Optimization¶

Finally, you will implement a simple gradient descent algorithm for minimizing the objective function

to ‘solve’ the seismic imaging problem.

## Gradient Descent¶

Full waveform inversion is the geophysics terminology for minimizing the objective

The term arises because we will be using the wave equation simulation to match the seismic data to ‘invert’ or solve the seismic inverse problem.

From the notes, we have that the gradient of the objective at a point \(m_0\) is

The method of gradient (or steepest) descent attempts to solve the optimization problem using the iteration

where \(\beta_k\) is a step parameter, best found via a ‘line search,’ that is necessary because the gradient provides no notion of scale. With a sufficiently sized \(\beta_k\), this method can recover the true modem \(m\) approximately.

Note

Most texts use \(\alpha\) for the line search parameter, but to avoid confusion with the \(\alpha\) in the CFL condition, we use \(\beta\) instead.

Problem 4.1

Write a function `gradient_descent(C0, d, k, config)`

which takes a
starting model `C0`

, ‘measured’ data `d`

, and a number of iterations
`k`

as arguments and returns the sequence of `k`

estimates of the true
model, as well as the values of the objective function \(J\) at each
of those points.

How might you select \(\beta_k\)? How does \(m_k\) compare to \(m\)? Why? How many iterations do you need before you start to recover your model? The model recovery is not perfect. Explain the errors.

Warning

Be very careful to update the model correctly, as \(\delta m \ne c - c_0 = \delta c\)! How does \(\delta m\) relate to \(\delta c\)?

Plot the convergence of your gradient descent method using a log-scaled plot.

Problem 4.2

Import the model function `gauss_model`

from `models.py`

. Use your `gradient_descent`

function to try to recover the
new Gaussian model. How do your results change? Explain the errors in the
recovery.

## Bonus Problems¶

**Bonus Problem 4.3:** Gradient descent is going to converge slowly. How might
you accelerate this process?