Seismic data inversion
Enrico Pieroni, Ernesto Bonomi, Emma Calabresu
Geophysics Area, CRS4
AGIP, Milano, 2000-07-07
The Art of Inverse Problems: inferring model parameters from output data

Inverse problems are among the most challenging in computational and applied science and have been studied extensively. Although there is no precise definition, inverse problems are concerned with the determination of inputs or sources from observed outputs or responses. This is in contrast to direct problems, in which outputs or responses are determined from knowledge of the inputs or sources.
Presentation outline
• inversion framework
• mathematical framework
• steepest descent optimization
• Lagrangian approach
• optimization loop
• Newton optimization
• conjugate direction optimization
• 1D optimization
• constrained optimization
• test cases

References:
• "Gauss-Newton and full Newton methods in frequency-space seismic waveform inversion", Pratt, Shin, Hicks, Geophys. J. Int. (1998) 133, 341-362
• "High resolution velocity model estimation from refraction and reflection data", Forgues, Scala, Pratt, SEG 1998
• "Seismic waveform inversion in the frequency domain", Pratt, Geophysics (1999)
• "Nonmonotone Spectral Projected Gradient Methods on Convex Sets", Birgin, Martinez, Raydan (1999)
• "Multiscale seismic waveform inversion", Bunks, Saleck, Zaleski, Chavent, Geophysics (1995) 60, 1457-1473
• "Nonlinear inversion of seismic reflection data in a laterally invariant medium", Pica, Diet, Tarantola, Geophysics (1990) 55, 284-292
• "Pre-stack inversion of a 1D medium", Kolb, Collino, Lailly, IEEE (1986) 74, 498-508
Inversion framework
• Parameters: Nx·Ny·Nz unknowns to recover: the velocity field c(x,y,z)
• Observed data/measurements: data recorded at a reference depth, STACK(x,y,t) = P(x,y,z=0,t)
• Simulated data: wave-field propagation governed by the acoustic wave equation, using some trial velocity field
• Inversion: find the velocity field that minimizes some measure of the misfit between observed and simulated data

We solved the inverse problem with a single-shot acquisition. The generalization to multiple shots is straightforward and can result in a better inversion.
Mathematical framework

Cost function:
E[c] = (1/2) ∫ dV ∫ dt [P(x,y,z,t) − STACK(x,y,t)]² δ(z)

Acoustic wave equation:
(1/c²(x,y,z)) ∂²P/∂t² (x,y,z,t) − ∇²P(x,y,z,t) = s(x,y,z,t)

• measure STACK(x,y,t) at the reference level z=0, produced by a single source
• try a guess c⁽⁰⁾(x,y,z) for the velocity field
• solving the acoustic wave equation, simulate the pressure field over the entire spatial domain (with adequate B.C. and I.C.)
• evaluate the error or cost function and, if necessary, its derivatives (cumbersome)
• update the velocity field iteratively, with the intent of minimizing the error function:
  c⁽ⁿ⁺¹⁾ = c⁽ⁿ⁾ + δc⁽ⁿ⁾  such that  E[P(c)] decreases
• iterate this procedure down to a fixed "error threshold"
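The discretized cost function can be sketched directly. In the snippet below the array names, shapes and index ordering are illustrative assumptions, not taken from the original code; the δ(z) in the integral simply selects the recording level z = 0:

```python
import numpy as np

def misfit(P, stack, dx, dy, dt):
    """Discretized E[c] = 1/2 * sum_{x,y,t} [P(x,y,0,t) - STACK(x,y,t)]^2 dx dy dt.
    P has shape (nx, ny, nz, nt); the delta(z) picks out the z index 0."""
    resid = P[:, :, 0, :] - stack       # residual at the recording level z = 0
    return 0.5 * dx * dy * dt * np.sum(resid ** 2)

# smoke test on tiny arrays
P = np.zeros((4, 4, 3, 5))
stack = np.zeros((4, 4, 5))
```

A perfect match gives E = 0; any mismatch at z = 0 contributes quadratically, weighted by the cell volume and time step.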
Steepest descent optimization

The velocity updating technique is usually based on local information, e.g. the gradient:
c' = c − α dE/dc,  with some α > 0.

A fixed point of this update, dE/dc (c*) = 0, is a (possibly local) minimum of E.

Problem: avoid local minima. Starting from c⁽⁰⁾, does following −dE/dc lead to the global minimum c_min, or only to a nearby local minimum c*?

[Figure: sketch of E(c) showing a local minimum c* and the global minimum c_min]
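A minimal sketch of the update rule on a made-up 1-D error function with one local and one global minimum; the function, step size and iteration count are illustrative, not the seismic cost function:

```python
def E(c):
    # toy 1-D "error function": a tilted double well with a local
    # minimum near c = +1 and the global minimum near c = -1
    return (c ** 2 - 1.0) ** 2 + 0.3 * c

def dE(c):
    return 4.0 * c * (c ** 2 - 1.0) + 0.3

def steepest_descent(c, alpha=0.01, nit=2000):
    # c' = c - alpha * dE/dc, with a fixed alpha > 0 for simplicity
    for _ in range(nit):
        c = c - alpha * dE(c)
    return c

c_left = steepest_descent(-1.5)    # starts in the basin of the global minimum
c_right = steepest_descent(+1.5)   # gets trapped in the local minimum
```

Both runs reach a fixed point with dE/dc ≈ 0, but only the first one is the global minimum: this is exactly the "avoid local minima" problem.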
Lagrangian approach

The constrained minimization problem is handled with a Lagrangian:
j[c, P, λ] = E[c(P)] + ∫ dV ∫ dt λ(x,y,z,t) [ (1/c²) ∂²P/∂t² − ∇²P − s ]

Stationarity with respect to P yields the adjoint field λ: a sort of wave equation whose source term is the residual error,
(1/c²) ∂²λ/∂t² − ∇²λ = [P(x,y,z,t) − STACK(x,y,t)] δ(z)
λ(x,y,z,T) = 0,  ∂λ/∂t (x,y,z,T) = 0

The final conditions at t = T mean the adjoint field is integrated back in time!

From P and λ, evaluate the gradient:
dE/dc = −(2/c³) ∫₀ᵀ dt λ ∂²P/∂t²

PROBLEM: time alignment of the two fields!
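The adjoint recipe can be checked on a scalar toy problem. The sketch below uses a trivial "wave equation" c·P = s as a stand-in for the real one, with made-up names (s, d, lam); it follows the same three steps: solve the state equation, solve the adjoint equation with the residual as source, combine the two fields into the gradient, and verify the result against a finite-difference derivative:

```python
# Toy adjoint-state gradient, a scalar stand-in for the wave-equation case:
#   state equation   c * P = s          (plays the role of the wave equation)
#   misfit           E(c) = 1/2 (P - d)^2
#   Lagrangian       j = 1/2 (P - d)^2 + lam * (c * P - s)

def gradient_adjoint(c, s, d):
    P = s / c              # solve the "direct" problem
    lam = -(P - d) / c     # adjoint equation: stationarity of j w.r.t. P
    return lam * P         # dE/dc = dj/dc = lam * P

def gradient_fd(c, s, d, h=1e-7):
    E = lambda c: 0.5 * (s / c - d) ** 2
    return (E(c + h) - E(c - h)) / (2.0 * h)

g_adj = gradient_adjoint(2.0, 3.0, 1.0)
g_fd = gradient_fd(2.0, 3.0, 1.0)
```

One adjoint solve gives the full derivative, which is the whole point of the Lagrangian approach: no extra direct propagation per parameter.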
Optimization loop

do it = 0, nit-1
   call FMod              ! forward modelling: record data at z=0 and on the boundaries
   do step = 0, nt-1
      call BMod           ! use the boundary information to backpropagate the field P
      call LoadMeasField  ! load the observed data
      call AdjMod         ! build the residual source term and solve for the adjoint field
      call PartialGrad    ! accumulate the gradient
      call PartialCostF   ! accumulate the cost function
   end do
   call Optimizer         ! update the velocity field
end do

Inner loop: align in time both the direct field P(r,t) and the adjoint field λ(r,t), from t = 0 to t = T, to perform an in-core gradient evaluation.
Newton optimization

The optimization procedure can also use information from the Hessian (the second-derivative matrix), e.g. in Newton, Quasi-Newton or Gauss-Newton methods:
c' = c − H⁻¹(E) ∇E

but this is very expensive in both computation (# direct propagations = # parameters) and storage ( [Nx·Ny·Nz]² )! Thus, aiming at a 3D reconstruction, we decided to use only the gradient.
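On a quadratic cost, a single Newton step lands exactly on the minimizer, which is what makes the Hessian attractive when it is affordable. A small sketch with an illustrative 2x2 SPD Hessian (nothing to do with the seismic problem's size, where H alone would need [Nx·Ny·Nz]² entries):

```python
import numpy as np

# One Newton step c' = c - H^{-1} grad E on the quadratic
# E(c) = 1/2 c^T H c - b^T c, whose exact minimizer is H^{-1} b.
H = np.array([[4.0, 1.0],
              [1.0, 3.0]])               # illustrative SPD Hessian
b = np.array([1.0, 2.0])

c = np.array([10.0, -7.0])               # arbitrary starting point
grad = H @ c - b                         # gradient of the quadratic
c_new = c - np.linalg.solve(H, grad)     # Newton step (solve, never invert)

c_star = np.linalg.solve(H, b)           # exact minimizer, for comparison
```

For a non-quadratic E the step is only locally this good, and each step still requires forming or applying H, which is the cost that rules it out here.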
Optimization techniques

[Figure: comparison chart of optimization techniques, trading off storage requirements against convergence]
Conjugate direction optimization

To achieve better convergence we studied different conjugate direction algorithms (but we have not observed noticeable differences between them). With g⁽ᵏ⁾ = ∇E(c⁽ᵏ⁾):

c⁽ᵏ⁺¹⁾ = c⁽ᵏ⁾ + α⁽ᵏ⁾ d⁽ᵏ⁾
d⁽ᵏ⁾ = −g⁽ᵏ⁾ + β⁽ᵏ⁾ d⁽ᵏ⁻¹⁾,   d⁽⁰⁾ = −g⁽⁰⁾

[1] Fletcher-Reeves:   β⁽ᵏ⁾ = |g⁽ᵏ⁾|² / |g⁽ᵏ⁻¹⁾|²
[2] Polak-Ribiere:     β⁽ᵏ⁾ = g⁽ᵏ⁾·(g⁽ᵏ⁾ − g⁽ᵏ⁻¹⁾) / |g⁽ᵏ⁻¹⁾|²
[3] Hestenes-Stiefel:  β⁽ᵏ⁾ = g⁽ᵏ⁾·(g⁽ᵏ⁾ − g⁽ᵏ⁻¹⁾) / d⁽ᵏ⁻¹⁾·(g⁽ᵏ⁾ − g⁽ᵏ⁻¹⁾)
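The three β rules can be compared on a small quadratic test problem. The matrix, vector and iteration count below are illustrative, and the exact line search used is valid only for quadratics; on a quadratic all three rules coincide with linear CG and reach the same minimizer:

```python
import numpy as np

def beta_fr(g_new, g_old, d_old):
    # [1] Fletcher-Reeves
    return (g_new @ g_new) / (g_old @ g_old)

def beta_pr(g_new, g_old, d_old):
    # [2] Polak-Ribiere
    return (g_new @ (g_new - g_old)) / (g_old @ g_old)

def beta_hs(g_new, g_old, d_old):
    # [3] Hestenes-Stiefel
    y = g_new - g_old
    return (g_new @ y) / (d_old @ y)

# quadratic test problem E(c) = 1/2 c^T H c - b^T c, minimizer H^{-1} b
H = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 0.5],
              [0.0, 0.5, 2.0]])
b = np.array([1.0, -2.0, 0.5])
grad = lambda c: H @ c - b

def cg_minimize(beta_rule, c, nit=30):
    g = grad(c)
    d = -g
    for _ in range(nit):
        alpha = -(g @ d) / (d @ H @ d)   # exact line search for a quadratic
        c = c + alpha * d
        g_new = grad(c)
        if np.linalg.norm(g_new) < 1e-12:
            break
        d = -g_new + beta_rule(g_new, g, d) * d
        g = g_new
    return c

c_star = np.linalg.solve(H, b)
```

On a genuinely nonlinear cost the rules differ (Polak-Ribiere and Hestenes-Stiefel reset more gracefully after bad steps), but as the slide notes, the differences observed in practice were small.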
1D optimization

At each iteration step, for each fixed direction d and velocity c, find a scalar α such that the resulting error function (depending now on a single real parameter),
F(α) = j(c + α d),
is minimum:
min over α of F(α),
e.g. by line search, bisection, or generalized decreasing conditions.
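One of the listed options, bisection, looks for a zero of the derivative F'(α). A minimal sketch on a made-up quadratic F; the bracket endpoints and tolerance are illustrative:

```python
def line_search_bisection(dF, a_lo, a_hi, tol=1e-8, max_iter=200):
    """Bisect on the derivative of F(alpha) = E(c + alpha * d):
    find alpha with dF(alpha) ~ 0, assuming dF(a_lo) < 0 < dF(a_hi)."""
    assert dF(a_lo) < 0.0 < dF(a_hi)
    while a_hi - a_lo > tol and max_iter > 0:
        mid = 0.5 * (a_lo + a_hi)
        if dF(mid) < 0.0:
            a_lo = mid      # minimum lies to the right of mid
        else:
            a_hi = mid      # minimum lies to the left of mid
        max_iter -= 1
    return 0.5 * (a_lo + a_hi)

# example: F(alpha) = (alpha - 0.7)^2, so dF(alpha) = 2 (alpha - 0.7)
alpha_star = line_search_bisection(lambda a: 2.0 * (a - 0.7), 0.0, 2.0)
```

Each iteration halves the bracket, so the cost is one dF evaluation per bit of accuracy; in the seismic code each such evaluation is a full wave propagation, which is why cheap generalized decreasing conditions matter.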
Constrained optimization

Because of the box constraints on the velocity, c_min ≤ c ≤ c_max, we are forced to adopt the projected conjugate gradient. The projected gradient g' is:

g'(c) = 0      if c = c_min and g(c) > 0
g'(c) = 0      if c = c_max and g(c) < 0
g'(c) = g(c)   otherwise

and each update c' = c + α d is projected back onto [c_min, c_max].
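The box-constrained update can be sketched with a componentwise projection. This is a plain projected-gradient sketch with made-up numbers; the actual method projects a conjugate direction, but the projection mechanics are the same:

```python
import numpy as np

def project(c, c_min, c_max):
    # componentwise projection onto the box [c_min, c_max]
    return np.clip(c, c_min, c_max)

def projected_gradient(grad, c, c_min, c_max, alpha=0.1, nit=500):
    # c' = P_box(c - alpha * grad E(c)): components hitting a bound
    # are clipped, which is equivalent to zeroing their gradient there
    for _ in range(nit):
        c = project(c - alpha * grad(c), c_min, c_max)
    return c

# example: minimize |c - target|^2 with the target partly outside the box
target = np.array([0.5, 3.0])
grad = lambda c: 2.0 * (c - target)
c_opt = projected_gradient(grad, np.zeros(2), 0.0, 1.0)
```

The free component converges to its unconstrained optimum (0.5), while the component whose optimum lies outside the box sticks to the active bound (1.0), exactly the behavior the projected gradient g' encodes.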
Test cases

We will consider the inversion of small 2D synthetic data-sets. For a better tuning of the algorithms we used velocity fields with no lateral variations, but the code is genuinely 2D.

Parameters: nx = 116, nz = 66, nt = 270, dx = 3., dz = 3., dt = 0.00065, thick_x = 0, thick_z = 0, rec_thick_x = 1, rec_thick_z = 1, z_record = 4, Nopt = 20, Niter = 100
Target: piecewise constant function
Initial guess: straight line
Very good result: only small changes after 140 iterations ...
Log scale! The cost function decreases by about 4 orders of magnitude. The steepest slope occurs in the first ~20 iterations. A second sudden drop comes as the velocity reaches the second ridge!
Target: piecewise constant function
Initial guess: straight line
After ~10 iterations we get the first ridge ...
We see the steepest slope in the first ~10 iterations; a 'plateau' seems to follow!
We take one of the last iterated fields (#11) and freeze the gradient over the first 20 layers.
In ~20 iterations we reach both the first and the second ridges!
After ~5 iterations the main ridge is detected!
Target: piecewise constant function
Initial guess: straight line, but it does not match the 'trend'
Iterated velocity field
Things go wrong if the low-frequency trend is not included in the initial guess ...
Here we start from #2 of the previous iterations ...
Freezing the first 20 layers, the 1st discontinuity gets worse but we better recover the 2nd one ...
Target: parabola
Initial guess: straight line
Good! After ~170 iterations things do not change much!
Log scale! The cost function decreases by 3 orders of magnitude. Steepest slope in the very first (~5) steps.
Target: parabola
We start from the previous velocity field (#60) and freeze the gradient at the first layers (#20).
In ~10 iterations we get a really good result!
In the first ~8 iterations we have the steepest slope ...
Target: parabola + sin
Initial guess: straight line
Iterated velocity field
Nice! The greatest part is done in the first ~100 iterations!
Cost function (log scale!)
As observed, most of the work is done in the first ~100 iterations!
Target: parabola + sin
We start from a previous iteration (#20) and freeze the gradient at the first 20 layers.
Not as good as before: we only get the medium trend!
Some preliminary conclusions

The main problem is the presence of a large number of local minima. To get rid of them it is possible to:
• linearize the direct model (e.g. Born approximation) to obtain a convex cost function; but we lose refracted and multiply reflected waves, etc.
• adopt some multi-scale approach, from large to small spatial scales, or from low to high time frequencies; but the ultra-low-frequency components (the velocity field trend) do not produce reflected waves, so they must already be present in the initial guess.
Time versus Frequency Domain

Advantages of the frequency domain:
• high data compression rate (~10)
• uncoupled problems, embarrassingly parallel
• large-to-small spatial scale approach, inverting small and large frequencies separately: the quickest and most scalable approach

The advantage of the time domain is the intuitive comprehension of the fields and results involved.
Extra time!
Spectral Conjugate Gradient Method

With s⁽ᵏ⁾ = c⁽ᵏ⁾ − c⁽ᵏ⁻¹⁾ and y⁽ᵏ⁾ = g⁽ᵏ⁾ − g⁽ᵏ⁻¹⁾, the spectral scaling is

θ⁽ᵏ⁾ = ⟨s⁽ᵏ⁾, s⁽ᵏ⁾⟩ / ⟨s⁽ᵏ⁾, y⁽ᵏ⁾⟩

and the search direction becomes d⁽⁰⁾ = −g⁽⁰⁾, d⁽ᵏ⁾ = −θ⁽ᵏ⁾ g⁽ᵏ⁾ + β⁽ᵏ⁾ d⁽ᵏ⁻¹⁾.

The advantage is that in this way the conjugate direction (−θ g) contains some explicit information on the Hessian matrix: θ is the inverse of a mean (integral) Hessian along the last step.
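The spectral scaling on its own is the Barzilai-Borwein step. The sketch below demonstrates just that scaling (the conjugate β term is omitted, and the matrix, step counts and safeguards are illustrative):

```python
import numpy as np

H = np.array([[3.0, 0.0],
              [0.0, 1.0]])               # illustrative diagonal Hessian
b = np.array([3.0, -1.0])
grad = lambda c: H @ c - b               # gradient of E = 1/2 c^T H c - b^T c

def spectral_gradient(c, nit=100):
    g = grad(c)
    c_new = c - 1e-3 * g                 # small first step (no theta available yet)
    for _ in range(nit):
        g_new = grad(c_new)
        s = c_new - c                    # s^(k) = c^(k) - c^(k-1)
        y = g_new - g                    # y^(k) = g^(k) - g^(k-1)
        sy = s @ y
        if sy <= 1e-30:                  # s and y vanish: converged
            break
        theta = (s @ s) / sy             # inverse of a mean Hessian along s
        c, g = c_new, g_new
        c_new = c - theta * g            # spectral step d = -theta * g
    return c_new

c_opt = spectral_gradient(np.zeros(2))
```

Note that θ adapts automatically: for this diagonal H it oscillates between the inverses of the eigenvalues 3 and 1, which is precisely the second-order information a plain gradient step lacks.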
Modified Nonlinear Conjugate Gradient

• In geophysical applications the number of parameters is very large; this motivates the choice of a conjugate gradient minimization algorithm.
• Without uphill movements (α < 0) in the line search procedure, no optimization method will prevent trapping inside local minima.
Line search (α > 0)

• In our approach α can be either positive, describing a movement along the descent direction pₖ, or negative.
• For α negative, the line search is similar.
Analytical 1D example

f(x) = x²/2 + (1/16) cos(80x) + (1/24) cos(100x)

• a very noisy function, presenting oscillations down to small scales (many local minima)
• after 7 steps both Wolfe conditions are satisfied

Allowing α < 0, the algorithm can visit and leave most local minima.
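The test function is only partly legible in this copy; assuming the reconstruction f(x) = x²/2 + (1/16)cos(80x) + (1/24)cos(100x), the sketch below counts its local minima on a grid to confirm the "oscillations down to small scales" claim:

```python
import math

# Assumed reconstruction of the slide's noisy test function; the exact
# coefficients are not fully legible, so treat them as illustrative.
def f(x):
    return 0.5 * x ** 2 + math.cos(80.0 * x) / 16.0 + math.cos(100.0 * x) / 24.0

# count interior local minima of f on [-1, 1] with a fine grid scan
n = 20000
xs = [-1.0 + 2.0 * i / n for i in range(n + 1)]
ys = [f(x) for x in xs]
n_local_minima = sum(1 for i in range(1, n)
                     if ys[i] < ys[i - 1] and ys[i] < ys[i + 1])
```

Dozens of local minima in a unit interval is exactly the landscape where a monotone (α > 0 only) line search stalls in the first basin it enters.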
Analytical 2D example

f(x) = (1/2) xᵀH x plus a low-amplitude (~1/100), high-frequency (cos(10·)) perturbation, with H built from sin(1), …, sin(N) and the vector (1, …, 1)

• the function is the sum of a simple convex quadratic and a low-amplitude, high-frequency perturbation (N = 2)
• after 8 steps both Wolfe conditions are satisfied

Allowing α < 0, the algorithm can visit and leave most local minima.
Analytical 32D example

• same function as before, with N = 32
• standard gradient-based minimization methods are not satisfactory with such a noisy function
• on nontrivial analytical examples, our approach converges quickly towards the global minimum
Number of parameters

The number of parameters plays a crucial role in the choice of the algorithm to minimize the cost function j(p) in the parameter space.
The landscape of the cost function presents many local minima; without uphill movements in the line search procedure, no optimization method will prevent trapping inside local minima.

[Figure: optimization methods arranged by storage requirements and convergence rate as the number of parameters grows]
The number p of parameters impacts the choice of the optimization strategy:
• for very small p, the gradient can be computed numerically
• for small p, use the gradient and the Hessian to compute the search directions: exact Hessian (Newton), or an approximation of the Hessian built as the iteration progresses (Quasi-Newton)
• for large p, use only the gradient to compute the search directions: nonlinear conjugate gradient
• for very large p, use stochastic methods: simulated annealing