GS 0109 MIDTERM

Using Derivative Information in the Statistical Analysis
of Computer Models
G. Stephenson, P. Challenor, R. Marsh
1. Introduction to Gaussian Process Emulators
Complex computer models are used in many areas, such as engineering and environmental science,
to simulate the behaviour of real-world systems. An appreciable amount of computing time may be
required to run a complex model, and performing analyses such as sensitivity and uncertainty analysis
can require many runs of the simulator. This quickly becomes impractical with a computationally
expensive model.
Greater efficiency can be achieved by building an emulator, which is a statistical approximation to the
simulator. The approach we use is to model the simulator with a Gaussian process. The emulator is
built based on data collected from running the simulator at a specified, small number of input points. A
Gaussian process emulator is illustrated below, in Figure 1.
4. Emulation of Derivatives
Running a model's adjoint to obtain derivatives, while more efficient and accurate than other methods,
such as the finite difference method, is a computationally expensive task. In addition, the initial effort
of writing the adjoint is considerable and time-consuming. Therefore, if we can emulate
the derivatives of a model, this would decrease the demand for writing and running adjoints.
The derivatives of a Gaussian process remain a Gaussian process, and we are therefore able to model
the derivatives of a model as a Gaussian process. The performance of emulators in predicting
derivatives, built with and without derivative information, is shown below in Figures 10 - 12.
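To make the statement about derivatives of a Gaussian process concrete, the following is a sketch under stated assumptions rather than the poster's own formulation: a zero-mean prior and a squared-exponential covariance with illustrative hyperparameters \sigma^2 and \ell (the poster does not state its covariance function). Because differentiation is a linear operation, f and f' are jointly Gaussian, and their covariances follow by differentiating k. Here X denotes the design points and \mathbf{y} the simulator output at those points.

```latex
\begin{align*}
k(x, x') &= \sigma^2 \exp\!\left(-\frac{(x - x')^2}{2\ell^2}\right), \\
\operatorname{Cov}\!\left(f'(x),\, f(x')\right)
  &= \frac{\partial k(x, x')}{\partial x}
   = -\frac{x - x'}{\ell^2}\, k(x, x'), \\
\operatorname{Cov}\!\left(f'(x),\, f'(x')\right)
  &= \frac{\partial^2 k(x, x')}{\partial x\, \partial x'}
   = \left(\frac{1}{\ell^2} - \frac{(x - x')^2}{\ell^4}\right) k(x, x'), \\
\mathbb{E}\!\left[f'(x_*) \mid \mathbf{y}\right]
  &= \frac{\partial k(x_*, X)}{\partial x_*}\, K^{-1} \mathbf{y},
  \qquad K_{ij} = k(x_i, x_j).
\end{align*}
```

The last line is the posterior mean of the derivative when only function output is available; with derivative observations, the same covariances enlarge K and \mathbf{y}.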
Figure 10 (left). Emulating derivatives based on function output and known
derivatives at 5 points. Figure 10 shows the performance of an emulator predicting
the derivatives of the true model, x + cos(x) + sin(x), based on the model
output and the derivatives at 5 points.
We may not have a model's adjoint and, as such, the
derivatives are unknown. We can still, however, emulate the
derivatives of the model. Figures 11 and 12 illustrate
emulators built only with function output.
Figure 1 (left). The true function is evaluated at 5 points and, training on this output,
an emulator is built. The emulator mean is then evaluated at a set of untried input
points and the standard deviation at each of these points calculated.
In this example, the posterior mean is close to the true value of the simulator; the
predictions become worse and the uncertainty much greater once the emulator is forced to
extrapolate. At the 5 design points the uncertainty pinches in to zero, as the true
value of the simulator at these points is known. The uncertainty becomes more
appreciable the further away from a design point we predict; how quickly the
uncertainty grows between design points depends on the roughness parameter in
the emulator. If we increase the number of design points we would expect the
predictions to become closer to the simulator output and the uncertainty to
decrease.
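The behaviour described for Figure 1 can be reproduced with a minimal sketch. It assumes a zero-mean Gaussian process with a squared-exponential covariance and fixed hyperparameters (the poster does not give its covariance function or parameter values), borrows the toy function from the Figure 10 caption as the simulator, and uses illustrative design points on [0, 6].

```python
import numpy as np

def simulator(x):
    """Toy stand-in for an expensive simulator: x + cos(x) + sin(x)."""
    return x + np.cos(x) + np.sin(x)

def sq_exp_kernel(xa, xb, variance=1.0, length_scale=1.0):
    """Squared-exponential covariance between two 1-d arrays of inputs."""
    diff = xa[:, None] - xb[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def emulate(x_design, y_design, x_new, variance=1.0, length_scale=1.0, jitter=1e-10):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = sq_exp_kernel(x_design, x_design, variance, length_scale)
    K = K + jitter * np.eye(len(x_design))  # small jitter for numerical stability
    K_star = sq_exp_kernel(x_new, x_design, variance, length_scale)
    mean = K_star @ np.linalg.solve(K, y_design)
    # Diagonal of the posterior covariance only; off-diagonals are not needed here.
    var = variance - np.sum(K_star * np.linalg.solve(K, K_star.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

x_design = np.linspace(0.0, 6.0, 5)   # 5 simulator runs (illustrative locations)
y_design = simulator(x_design)
x_new = np.linspace(-1.0, 7.0, 81)    # untried inputs, including extrapolation
mean, sd = emulate(x_design, y_design, x_new)
```

Plotting `mean` with a band of plus or minus two `sd` against `simulator(x_new)` shows the standard deviation collapsing at the design points and growing outside the design range. The length scale plays the part of the roughness parameter mentioned in the caption.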
Figure 11 (left). Emulating derivatives based on function output at 5 points. Figure
11 shows the performance of an emulator based on training data which consists
only of the simulator output at the 5 design points. We can see from Figure 11 that
though we can still predict the derivatives of the true function, the uncertainty is
much greater and the posterior mean is further from the true value than in the
emulator which had derivative information included in the training data. It should be
noted, though, that the computational expense required to build the two emulators
of Figures 10 and 11 is not equal.
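Emulating derivatives from function output alone, as in Figure 11, can be sketched as follows. This is an illustration under assumptions, not the authors' code: it uses a zero-mean Gaussian process with a unit-variance squared-exponential kernel and a fixed length scale, with illustrative design points. The derivative emulator differentiates the kernel with respect to the prediction input.

```python
import numpy as np

def kernel(xa, xb, length_scale=1.0):
    """Unit-variance squared-exponential covariance."""
    diff = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def dkernel_dxa(xa, xb, length_scale=1.0):
    """Derivative of the kernel with respect to its first argument."""
    diff = xa[:, None] - xb[None, :]
    return -(diff / length_scale**2) * kernel(xa, xb, length_scale)

def emulate_derivative(x_design, y_design, x_new, length_scale=1.0, jitter=1e-10):
    """Posterior mean and sd of f'(x_new) given function output only."""
    K = kernel(x_design, x_design, length_scale) + jitter * np.eye(len(x_design))
    dK_star = dkernel_dxa(x_new, x_design, length_scale)
    mean = dK_star @ np.linalg.solve(K, y_design)
    # Prior variance of f'(x) for this kernel is 1 / length_scale**2.
    var = 1.0 / length_scale**2 - np.sum(
        dK_star * np.linalg.solve(K, dK_star.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

x_design = np.linspace(0.0, 6.0, 5)
y_design = x_design + np.cos(x_design) + np.sin(x_design)
x_new = np.linspace(0.0, 6.0, 61)
d_mean, d_sd = emulate_derivative(x_design, y_design, x_new)
true_derivative = 1.0 - np.sin(x_new) + np.cos(x_new)
print("largest absolute error in emulated derivative:",
      np.max(np.abs(d_mean - true_derivative)))
```

Unlike the function emulator, the derivative standard deviation does not shrink to zero at the design points here, because no derivative values were observed.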
2. Use of Derivatives in Emulating Model Output
It is possible to obtain derivatives of a model's outputs with respect to its inputs. One approach is
Automatic Differentiation (AD). Derivatives are generated by repeatedly applying the chain rule to the
combinations of elementary operations in the model. The differentiated code then runs alongside the
original code in the resulting model, which is termed the adjoint model.
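The chain-rule idea can be illustrated with a small forward-mode sketch based on dual numbers. Note that this is a swapped-in technique for illustration: an adjoint model corresponds to reverse-mode differentiation, which propagates derivatives backwards through the code, and production AD tools work on the model source rather than on a hand-written class. The `Dual` class and helper names below are hypothetical.

```python
import math

class Dual:
    """A value paired with its derivative with respect to one chosen input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule applied to one elementary operation.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

def dual_sin(u):
    """sin with the chain rule: d/dx sin(u) = cos(u) * u'."""
    return Dual(math.sin(u.value), math.cos(u.value) * u.deriv)

def dual_cos(u):
    """cos with the chain rule: d/dx cos(u) = -sin(u) * u'."""
    return Dual(math.cos(u.value), -math.sin(u.value) * u.deriv)

def toy_model(x):
    """The toy function x + cos(x) + sin(x), built from elementary operations."""
    return x + dual_cos(x) + dual_sin(x)

x = Dual(0.5, 1.0)          # seed: dx/dx = 1
out = toy_model(x)
print(out.value, out.deriv)  # function value and derivative at x = 0.5
```

Each elementary operation carries its own derivative rule, so the derivative of the whole model is assembled as the code runs.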
The value of learning derivatives when building emulators is being investigated to determine whether
additional efficiency can be achieved. Figures 2 and 3 illustrate a Gaussian process emulator which
has been built with the additional information provided by derivatives.
Figure 2 (left). Emulator with derivative information at 5 points. The emulator in
Figure 1 is repeated but here, in addition to the function output, we include the
derivatives of the output w.r.t. the input at each of the 5 points. The emulator mean
is again evaluated at a set of untried input points and the standard deviation at
each of these points calculated. The resulting emulator is shown in Figure 2. The
emulator mean is much closer to the true function output and the uncertainty much
smaller than for the emulator without derivatives (shown in Figure 1).
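An emulator trained on both function output and derivatives, as in Figure 2, can be sketched by stacking the two kinds of observation into one joint Gaussian. The sketch assumes a zero-mean process with a unit-variance squared-exponential kernel and a fixed length scale `ell`; the function names and design points are hypothetical, and the poster's actual covariance choice is not stated.

```python
import numpy as np

def joint_cov(xa, xb, ell=1.0):
    """Covariance between [f(xa); f'(xa)] and [f(xb); f'(xb)]."""
    d = xa[:, None] - xb[None, :]
    k = np.exp(-0.5 * (d / ell) ** 2)
    k_fd = (d / ell**2) * k                      # Cov(f(xa_i), f'(xb_j))
    k_df = -(d / ell**2) * k                     # Cov(f'(xa_i), f(xb_j))
    k_dd = (1.0 / ell**2 - d**2 / ell**4) * k    # Cov(f'(xa_i), f'(xb_j))
    return np.block([[k, k_fd], [k_df, k_dd]])

def emulate_with_derivatives(x_design, y, dy, x_new, ell=1.0, jitter=1e-10):
    """Posterior mean and sd of f(x_new) given outputs y and derivatives dy."""
    n = len(x_design)
    K = joint_cov(x_design, x_design, ell) + jitter * np.eye(2 * n)
    # Keep only the rows for f(x_new); drop the rows for f'(x_new).
    K_star = joint_cov(x_new, x_design, ell)[: len(x_new), :]
    targets = np.concatenate([y, dy])
    mean = K_star @ np.linalg.solve(K, targets)
    var = 1.0 - np.sum(K_star * np.linalg.solve(K, K_star.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

x_design = np.linspace(0.0, 6.0, 5)
y = x_design + np.cos(x_design) + np.sin(x_design)   # simulator output
dy = 1.0 - np.sin(x_design) + np.cos(x_design)       # derivatives, e.g. from an adjoint
x_new = np.linspace(0.0, 6.0, 61)
mean, sd = emulate_with_derivatives(x_design, y, dy, x_new)
```

Each design point now contributes two pieces of information, which is why the posterior standard deviation stays small near the design points.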
Figure 12 (right). Emulating derivatives based on function output at 8 points. It
would appear from Figure 11 that without derivative information in the training
data, 5 runs of this simulator aren't enough for the emulator mean to produce an
adequate approximation. Continuing with the example, we evaluate 3 more runs
of the simulator and this yields an emulator whose mean and standard
deviation are shown in Figure 12.
The performance of the emulator in Figure 2, while good, is
such that it is difficult to identify precisely how and where
the derivative information is having an effect. We therefore
repeat the example but remove two of the simulator
runs and the corresponding derivatives from the training data.
Figure 3 (above, right). Emulator with derivative information at 3 points. The posterior mean is still close to the true simulator output. The
uncertainty reduces to zero at design points, as expected, but whereas in Figure 1 the uncertainty becomes appreciable once we start
predicting away from a design point, here the uncertainty remains very small for predictions close to the design points. It is the derivative
information in the training data which allows for this reduced uncertainty.
It is necessary to consider the computational cost of using derivative information in computer
experiments, as it must be determined at which point the costs outweigh the benefits. For example, if
generating the derivatives of the model increases the computational cost substantially, this extra
computing time may be better spent evaluating the model at more points instead.
3. Toy Model Investigation
We investigate the value of derivative information in Gaussian process emulation with 3 toy models.
These are shown below in Figures 4 - 6.
5. Application to the C-GOLDSTEIN climate model
C-GOLDSTEIN is an intermediate complexity climate model and the adjoint of the model exists,
enabling the evaluation of derivatives. An example of the type of output C-GOLDSTEIN produces is
shown in Figure 13.
• C-GOLDSTEIN = Coupled Global-Linear Drag Salt and Temperature Equation Integrator.
• It is composed of a 3-d ocean model coupled with a 2-d energy moisture balance model of the
atmosphere and a simple sea ice model.
• C-GOLDSTEIN has simplified physics and a low resolution (36x36 grid).
Figure 13. Air temperature at the year 2000. Figure 13 shows some output of C-GOLDSTEIN run to AD 2000 under default parameter settings.
We apply the method of Section 4 to C-GOLDSTEIN. 30 runs of the climate model are performed,
varying 3 of the input parameters. An emulator is built with the resulting output of global mean air
temperature. Validation data are produced using the finite difference method; 50 derivatives
with respect to each input are generated.
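The finite difference validation step can be sketched as below. This is an illustration under assumptions: central differences are used (the poster does not say which scheme was applied), the step size rule is a common heuristic, and `stand_in_model` is a hypothetical cheap function of 3 inputs, not C-GOLDSTEIN.

```python
import numpy as np

def stand_in_model(theta):
    """Hypothetical cheap function of 3 inputs; NOT the climate model."""
    return theta[0] ** 2 + np.sin(theta[1]) + theta[0] * theta[2]

def central_difference(model, theta, index, rel_step=1e-4):
    """Derivative of model output with respect to theta[index]."""
    theta = np.asarray(theta, dtype=float)
    h = rel_step * max(abs(theta[index]), 1.0)   # step scaled to the input size
    up, down = theta.copy(), theta.copy()
    up[index] += h
    down[index] -= h
    return (model(up) - model(down)) / (2.0 * h)

def gradient_by_central_differences(model, theta):
    """One central difference per input: two extra model runs for each."""
    return np.array([central_difference(model, theta, i)
                     for i in range(len(theta))])

theta0 = np.array([1.0, 0.5, 2.0])
grad = gradient_by_central_differences(stand_in_model, theta0)
print(grad)
```

Each derivative costs two additional model runs, so validating derivatives for 3 inputs at many points quickly adds up for an expensive simulator.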
Figure 14 (left). A comparison of the derivatives of global mean air
temperature with respect to atmospheric moisture diffusivity, at the year
2000. There are some areas where the emulated derivatives match
those produced by finite differences quite well. A number of points
show differences between the two methods though and additional runs
of the simulator are likely to be required to improve the overall
performance of the emulator.
The corresponding plots for the 2 remaining parameters, ocean vertical
diffusivity and atmospheric heat diffusivity, are omitted here but show
similar patterns.
Figure 4 - Toy Model 1
Figure 5 - Toy Model 2
Figure 6 - Toy Model 3
We emulate each toy model, with and without derivative information, for a range of simulator runs. As
the simulators here are not complex models, it is possible to run them at all points we test the emulator
at. An average prediction error is then determined by looking at the difference between the value of the
posterior mean, and the true value at that point as given by the simulator.
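The error calculation described above can be sketched as follows. Several choices are assumptions: the toy models themselves are not given in the text, so the function from the Figure 10 caption stands in; the average error is taken as the mean absolute difference; and the emulator is a zero-mean Gaussian process with a squared-exponential kernel and fixed length scale, built without derivative information.

```python
import numpy as np

def simulator(x):
    """Stand-in toy model: x + cos(x) + sin(x)."""
    return x + np.cos(x) + np.sin(x)

def posterior_mean(x_design, y_design, x_new, length_scale=1.0, jitter=1e-10):
    """Posterior mean of a zero-mean GP with a squared-exponential kernel."""
    def k(a, b):
        return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length_scale) ** 2)
    K = k(x_design, x_design) + jitter * np.eye(len(x_design))
    return k(x_new, x_design) @ np.linalg.solve(K, y_design)

def average_prediction_error(n_runs, x_test):
    """Mean absolute difference between emulator mean and simulator output."""
    x_design = np.linspace(0.0, 6.0, n_runs)   # equally spaced design (assumed)
    mean = posterior_mean(x_design, simulator(x_design), x_test)
    return np.mean(np.abs(mean - simulator(x_test)))

x_test = np.linspace(0.0, 6.0, 121)
errors = {n: average_prediction_error(n, x_test) for n in (3, 5, 8)}
for n, err in errors.items():
    print(f"{n} runs: average error {err:.4f}")
```

Repeating the loop with an emulator that also uses derivatives would give the second curve needed for a with-and-without comparison over a range of simulator runs.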
Figure 7 - Performance of Emulator 1
Figure 8 - Performance of Emulator 2
Figure 9 - Performance of Emulator 3
Figures 7 - 9 show that for all toy models tested here, the mean of an emulator built with derivatives
provides a closer approximation to the relevant simulator. For models 2 and 3, an emulator without
derivative information requires approximately twice as many simulator runs as an emulator with
derivatives to achieve similar accuracy.
6. Conclusions
Gaussian process emulation can provide a practical solution to running computationally expensive
models, and results from investigations on toy models show that using derivative information when
building emulators could improve efficiency. This is, however, dependent on the computational cost of
obtaining derivatives.
Moreover, we can use an emulator to predict the derivatives of a complex model. Further investigation
is required, but current work suggests that emulation of derivatives could reduce the demand for writing
and running adjoint models.
Work will continue by further emulation of both the function output and derivatives of C-GOLDSTEIN.
Thanks to Jeremy Oakley, Robin Hankin and all in the MUCM team.
www.mucm.group.shef.ac.uk