I’ve been at a great workshop on statistical methods for process models (mostly of the Earth system), organised by Lindsay Lee from Leeds. It was set up as an opportunity for modellers of many stripes to hear about some of the latest statistical techniques being applied to a range of domains, and I think it worked well in this.
I gave a talk on using a model to work out the value of collecting new observations (paper), which you can download here. People ask, so if you are interested: the graphics were done using Paper by 53.
This was really only a one-day meeting, and a longer one could usefully have included more of the modellers’ needs and wishes. The quality of the discussion was high, though: there were 15 minutes of discussion at the end of every 30-minute talk, a format I think worked well. The time was often filled with questions.
Here are some of the take-away messages that I remember.
A good Uncertainty Quantification (UQ) project will most likely have a statistician and a domain expert (modeller) working closely together. Communication should be more or less constant, not just at the 3-monthly meetings. It was pointed out that having someone who made or understands the observations on board would complete the dream team. I’ve been struck by the success that the Leeds team have been having, essentially by embedding a statistician in a team of domain experts (atmospheric chemistry).
Peter Challenor on design: Most of the time, building an emulator of your process model will be beneficial. If you can find a statistician, get them involved right at the beginning of the project and ask them to help plan your experimental design. If you can’t find a statistician, use a Latin Hypercube; it will get you most of the way.
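If you want to try that last piece of advice, here is a minimal sketch of generating a Latin Hypercube design in Python. This is my example rather than Peter’s, it assumes scipy’s qmc module is available, and the parameter ranges are made up for illustration:

```python
# Minimal Latin Hypercube design sketch using scipy's qmc module.
# The parameter ranges below are hypothetical, just for illustration.
from scipy.stats import qmc

sampler = qmc.LatinHypercube(d=3, seed=42)   # 3 uncertain model inputs
unit_design = sampler.random(n=30)           # 30 model runs on the unit cube

# Scale each column to (made-up) physical parameter ranges
lower = [0.1, 1e-3, 200.0]
upper = [0.9, 1e-1, 400.0]
design = qmc.scale(unit_design, lower, upper)

print(design.shape)  # (30, 3): one row of input settings per model run
```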
Jonty Rougier: has an operational definition of “climate”. It is subjective. Ask him about the details. (And yes, I’ve been in those meetings where everyone tries to define climate for hours.) Also, an expert is someone whose probabilities you are willing to accept as your own.
Dan Cornford: Never underestimate the amount of time it will take to get data. Even when you own it.
Danny Williamson: Model discrepancy is what is left when you have optimally tuned your model. With good iterative history matching techniques and experimental design, that optimal tuning may be more achievable than we previously thought.
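For the curious, this is roughly what the implausibility calculation at the heart of history matching looks like. This is my own toy sketch, not Danny’s code, and all the numbers are invented:

```python
# A sketch of the implausibility measure used in history matching.
# All values are illustrative; variances would come from an emulator,
# the observation error budget, and a model discrepancy judgement.
import numpy as np

def implausibility(emulator_mean, emulator_var, obs, obs_var, discrepancy_var):
    """Standardised distance between emulator prediction and observation."""
    total_var = emulator_var + obs_var + discrepancy_var
    return np.abs(obs - emulator_mean) / np.sqrt(total_var)

# Candidate inputs are "ruled out" when implausibility exceeds a cutoff
# (3 is a conventional choice, via the three-sigma rule).
I = implausibility(emulator_mean=14.2, emulator_var=0.5,
                   obs=15.0, obs_var=0.2, discrepancy_var=0.3)
not_ruled_out = I < 3.0
```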
Everybody: Not explicitly addressing model discrepancy really stuffs up your inferences. It needs to be thought about and modelled much more. A common experience was seeing people calibrate to a set of observations that lie far outside an ensemble. This just leads to the closest (and often poorest) ensemble member being given the highest weighting in the calibration. David Sexton pointed out that in a multivariate calibration, poorly modelled model discrepancy means that the poorly simulated outputs have a hugely disproportionate influence on the weighting.
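Here is a toy illustration of the point (mine, not from the talks): with no discrepancy term, a simple Gaussian-likelihood calibration against an observation far outside the ensemble puts essentially all the weight on the nearest member.

```python
# Toy example of why ignoring model discrepancy concentrates calibration
# weight on the nearest ensemble member. All numbers are made up.
import numpy as np

obs = 10.0                                   # observation far outside the ensemble
ensemble = np.array([1.0, 2.0, 3.0, 4.0])    # ensemble member outputs
obs_var = 0.1                                # observation error variance

def weights(discrepancy_var):
    # Gaussian log-likelihood weighting, with an (optional) discrepancy variance
    ll = -0.5 * (obs - ensemble) ** 2 / (obs_var + discrepancy_var)
    w = np.exp(ll - ll.max())
    return w / w.sum()

print(weights(0.0))   # no discrepancy term: virtually all weight on the member at 4.0
print(weights(25.0))  # generous discrepancy term: weight spread far more evenly
```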
There were lots of techniques discussed, often with walk-throughs: emulation, sensitivity analysis, Bayesian calibration, tuning and history matching, model reduction, elicitation, assessing the value of observations, perfect model experiments, emulator validation, model discrepancy, and more. The MUCM (Managing Uncertainty in Complex Models) toolkit is probably the best place to go to get immediate advice on all of these things.
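As a small taster for the first of those, emulation, here is a minimal Gaussian process emulator sketch. I’ve used scikit-learn for convenience (the MUCM toolkit itself is not tied to any one package), and the “simulator” is just a cheap stand-in function rather than a real process model:

```python
# Minimal Gaussian process emulator sketch using scikit-learn.
# The simulator() below is a placeholder for an expensive process model.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def simulator(x):
    # stand-in for an expensive model run over two input parameters
    return np.sin(3 * x[:, 0]) + 0.5 * x[:, 1] ** 2

rng = np.random.default_rng(1)
X_train = rng.uniform(0, 1, size=(25, 2))    # design points (e.g. from a Latin Hypercube)
y_train = simulator(X_train)                 # "expensive" training runs

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
gp.fit(X_train, y_train)

X_new = rng.uniform(0, 1, size=(5, 2))
mean, std = gp.predict(X_new, return_std=True)  # cheap predictions with uncertainty
```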