Below is my end of the correspondence that I had with Doug Keenan over the summer. This is being discussed slightly more widely now (e.g. here, here, here), so I thought it would be sensible to have it in the public domain.
The first email is largely a response to Keenan’s article in the Wall Street Journal here.
I don’t agree with a lot of Keenan’s ideas, but I appreciate the fact that he has been unfailingly courteous in all of our interactions.
27th July 2011
Dear Doug Keenan,
Thank you for your correspondence regarding your recent article “How scientific is climate science?”, and related literature. I think it is useful to discuss more technical issues with people from a wide range of backgrounds, and statisticians are particularly welcome. One always learns something useful.
While I think that you make some very good points regarding the uncertainty statements in chapter 3 of the AR4, I am not convinced by the claim that you throw into doubt the scientific foundations of climate science as a whole! I will outline my reasons below. First though, I think that it is a considerable achievement to get some of the more technical (or even any) aspects of probability theory and timeseries statistics into the mainstream media. This is a welcome development, and I hope indicates the willingness of the media to move towards a more mature discussion on science with important societal implications. I think that writers like Ben Goldacre (for The Guardian), and David Spiegelhalter (occasionally for The Times) have led the way in the discussion of probability and risk in the health field: there is perhaps not yet an obvious equivalent writer in the field of climate science.
My reading of the key arguments of your article runs like this:
- Chapter 3 of the IPCC AR4 fits a linear model to the global temperature series, using a number of assumptions. The chapter concludes that the trend is “extremely significant”.
- One of the assumptions is that the residuals from the linear fit should be AR(1).
- The residuals are not AR(1). This might have a bearing on how much internal variability you would expect from the system. What if you expected much more internal variability, and long term persistence? The trend that we see in the global mean temperature data might not be unusual in a historical or theoretical context, and therefore does not offer strong evidence that the climate is changing.
- The conclusion in chapter 3 of a trend in global mean temperature that is “significantly different from zero” is weakened.
- Because of this basic mistake, the theory of anthropogenic global warming is weakened, along with the credibility of the IPCC.
My response is based on this summary, and so please let me know if I have misunderstood the thrust of your arguments. I will leave the discussion of Milankovitch cycles for another day, in the interests of brevity.
Randomness in timeseries
I think your first point, regarding the folly of making rash conclusions based on timeseries with an unknown generating process, is well made. You use the example of a (very) short timeseries, that might be generated either by a flip of a coin, or a roll of a die. The argument is that a short timeseries does not give you much information either way. I like the way that this point is made in a familiar setting, although I would suggest that using a Bayesian analysis would help to quantify how much evidence for any particular hypothesis the outcome of “three upward lines” would give you. With only two hypotheses (e.g. H0 = timeseries generated by a coin flip, or H1 by the die), and a sensible prior distribution, you can work out the probability of either hypothesis, given the data. Without the Bayesian analysis, you can still look at the likelihood ratio of the data, given each hypothesis. I think there is much promise in the Bayesian analysis of climate timeseries – perhaps in the vein of Tol & De Vos (1998). A Bayesian analysis relies on the fact that we should interpret data in the light of all available evidence, and I think that this is a crucial point in the case of climate timeseries.
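As a minimal sketch of the two-hypothesis calculation described above: the probabilities of an upward step under each hypothesis, and the equal prior, are illustrative assumptions of mine and are not taken from the article, and the function names are hypothetical.

```python
from fractions import Fraction


def likelihood_ratio(n_up, p_up_coin, p_up_die):
    """Likelihood of n_up consecutive upward steps under the die
    hypothesis (H1), divided by the likelihood under the coin (H0)."""
    return (p_up_die ** n_up) / (p_up_coin ** n_up)


def posterior_coin(n_up, p_up_coin, p_up_die, prior_coin):
    """Posterior probability of the coin hypothesis (H0) via Bayes' theorem."""
    like_coin = p_up_coin ** n_up
    like_die = p_up_die ** n_up
    evidence = prior_coin * like_coin + (1 - prior_coin) * like_die
    return prior_coin * like_coin / evidence


# Illustrative assumptions, NOT values from the article:
P_UP_COIN = Fraction(1, 2)   # a fair coin: heads means "up"
P_UP_DIE = Fraction(2, 3)    # hypothetical rule: "up" on four of six faces
PRIOR_COIN = Fraction(1, 2)  # equal prior weight on H0 and H1

lr = likelihood_ratio(3, P_UP_COIN, P_UP_DIE)
post = posterior_coin(3, P_UP_COIN, P_UP_DIE, PRIOR_COIN)
print("likelihood ratio (die : coin):", lr, "=", round(float(lr), 2))
print("posterior probability of coin:", post, "=", round(float(post), 2))
```

With these assumed numbers the likelihood ratio is 64/27 (about 2.4) in favour of the die, and the posterior probability of the coin is 27/91 (about 0.30). Three upward steps shift the balance only modestly, which is the point about short timeseries, now expressed as a number.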
Significance of global warming
The main point of your article seems to me to be that the “significance” of the trend of global mean temperature in chapter 3 of the AR4 is overstated. I think there is a real issue of uncertainty communication here, which should be addressed with the help of some statisticians for the upcoming AR5. You say in your article that:
“The latest report from the U.N.’s Intergovernmental Panel on Climate Change (IPCC) was published in 2007. Chapter 3 of Working Group I considers the global temperature series illustrated in Figure 1. The chapter’s principal conclusion is that the increase in global temperatures is extremely significant.”
While paragraph 1 of the chapter 3 executive summary does say that “global mean temperatures have risen by (this much) when estimated by a linear trend over the last 100 years”, just a few lines later it says “… the trend is not linear”, and goes on to talk about shorter trends. This clearly calls into doubt the appropriateness of using the linear model to do inference on the timeseries. It looks to me that the chapter (genuinely) attempts to communicate the informal significance of the rise in temperatures, where what we really want is the more formal detection, and attribution, of a formally significant trend. I think that your conclusion, that we should reject the notion that the trend is significant, does not follow from the finding that we reject an AR(1) correlation structure for the residuals.
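To make the AR(1) question concrete, one could fit the linear trend, take the residuals, and compare their sample autocorrelation at longer lags with the geometric decay that an AR(1) process implies. The following is only a sketch on a synthetic series (not the observed record); the function names, the trend, and the noise parameters are all illustrative assumptions.

```python
import numpy as np


def linear_residuals(y):
    """Residuals from an ordinary least-squares straight-line fit."""
    t = np.arange(len(y), dtype=float)
    slope, intercept = np.polyfit(t, y, 1)  # highest-degree coefficient first
    return y - (slope * t + intercept)


def sample_acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )


def ar1_implied_acf(acf):
    """Autocorrelation an AR(1) process would have, given the lag-1 value."""
    phi = acf[1]
    return phi ** np.arange(len(acf))


# Synthetic example: an assumed linear trend plus AR(1) noise.
rng = np.random.default_rng(0)
n, phi_true = 150, 0.6
noise = np.zeros(n)
for i in range(1, n):
    noise[i] = phi_true * noise[i - 1] + rng.normal(scale=0.1)
series = 0.005 * np.arange(n) + noise

resid = linear_residuals(series)
acf = sample_acf(resid, max_lag=10)
implied = ar1_implied_acf(acf)
for k in range(len(acf)):
    print(f"lag {k:2d}  sample {acf[k]: .3f}  AR(1)-implied {implied[k]: .3f}")
```

A sample autocorrelation that stays well above the AR(1)-implied curve at long lags would be a hint of longer-term persistence, but a mismatch of this kind says something about the noise model, not directly about whether there is a trend.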
In my reading, chapter 3 of the AR4 is not where you should be directing your analysis – I, as a climate scientist, would not look to chapter 3 to examine the conclusion that the globe is warming significantly, and that it is likely caused by humans. That should be Chapter 9 “Understanding and Attributing Climate Change”. I think that you overstate the importance of the global mean temperature record as the primary source of evidence for climate change. In reality, detection and attribution relies on an understanding of the physical principles that govern the behaviour of the Earth system, as well as a wealth of paleo and observational*1 records. We have much more evidence to rely on than simply a global temperature record.
However, I personally would not state the “significance” of linear trends in chapter 3, as occurs in table 3.2, for example. I think that this uncertainty communication could easily get confused with a formal attribution statement. Unfortunately, I don’t have a ready, simple alternative to suggest at the moment. I think that any suggestions that you (or others in the statistical community) might have on this topic would be well received by the climate science community.
As a scientist with Bayesian leanings, I think that discussions of “significance” of trends are often misguided, and distracting. They are too easily conflated with the informal notion of significance, as aptly demonstrated in the recent discussions of the significance (or otherwise) of recent trends of surface warming on the BBC news website. They tend to focus too closely on arbitrary levels of significance, with little justification, and are often misinterpreted. There are many fundamental problems with the philosophy of significance testing, particularly when applied to climate science (see e.g. Ambaum 2010).
Climate models and uncertainty
I understand that many people don’t trust climate models (and in fact, it is my job not to trust them either), but they do offer one of many strands of evidence that the surface warming that we see is unexpected, without anthropogenic forcing of the Earth system. They offer a coherent and testable hypothesis of the workings of the Earth system, built from first principles, that reproduce many of the salient features of its dynamics. I think it important that we make it easier for other scientists, and indeed members of the public, to scrutinise the models, compare them to reality, and to question the way they work. For this, a good understanding of uncertainty and statistics is necessary, along with an understanding of physics, atmospheric chemistry and biophysical processes. To reduce all of climate science, the modelling effort, and the vast numbers of observations available from satellites, field measurements, and ocean drifting buoys, to a single timeseries of global mean temperature, is to throw away a vast amount of information.
Our current understanding of the climate system is hard won, built up incrementally over many years of research in diverse fields. That is not to say that our understanding can’t change: new observations come in all the time, new theories are proposed, and old ones are overturned. A single piece of compelling evidence should cause us to re-examine what we think we know. However, to completely change our understanding of the climate system (as I believe you are proposing), an alternative theory would need to better explain all of the observations and evidence that we gather. While I maintain a healthy skepticism about many of the more precise predictions of climate change, I honestly think that the current climate modelling effort is a good attempt at the exposition of a comprehensive theory. My research primarily concerns the understanding of the differences between the true system and the climate models, and what this means for our uncertainty about the future.
I am intrigued by the “Hurst effect”, the concept of long term persistence, and self-similar systems. I must confess that I found the paper that you sent, Koutsoyiannis (2011), to be a little dense and jargon rich (I’m sure I can be guilty of the same thing). As such, I will need a bit more time to understand it, and help in relating the conclusions of the paper to the world that we are studying. I wonder, given that the timeseries of global temperature change might be a) a result of long term persistence in an essentially random process, or b) largely a result of an energy imbalance at the top of the atmosphere, caused by a changing atmospheric composition and land use, how could we tell the difference? What strategy would we use, and what evidence would we gather in order to tell these things apart?
I would be most interested to hear your suggestions, ideas, and comments.
Climate impacts analyst, Met Office Hadley Centre
*1 Some of the observational evidence was nicely summarised by my colleagues in a report at the end of last year – Evidence: state of the climate http://www.metoffice.gov.uk/climatechange/policy-relevant/evidence
Ambaum, M.H.P. (2010) Significance Tests in Climate Science. J. Climate, 23, 5927–5932, doi: 10.1175/2010JCLI3746.1
Koutsoyiannis, D. (2011) Hurst–Kolmogorov dynamics as a result of extremal entropy production, Physica A, Volume 390, Issue 8, p. 1424-1432
Tol, R.S.J and A.S. De Vos (1998) A Bayesian Statistical Analysis of the Enhanced Greenhouse Effect, Climatic Change, Volume 38, Number 1, 87-112, DOI: 10.1023/A:1005390515242
12th August 2011
Dear Doug Keenan,
Thank you for your comments. It seems that we agree that there should be a new way forward in summarising the uncertainty in the observations chapter of the forthcoming IPCC report. A linear trend, while conveniently easy to understand and apply, is simply inadequate to capture all of the timescales that are apparent in the Earth system. To be fair to the authors of the chapter, they have tried to express the uncertainty in the global mean temperature in other ways, for example by comparing periods at the beginning and end of the timeseries. Further, the statistical assumptions about trends made in this chapter are not crucial to the detection of global warming, and its attribution to largely anthropogenic influences. I think, however, that the “observations” chapter of the forthcoming IPCC summary would ideally just report on the state of the climate, and leave any assessment of detection and attribution to other chapters.
Aside from that, a colleague reminds me that the first “uncertainty issue” that had to be dealt with by chapter 3 was that of the observations themselves. The first question to ask upon seeing an apparent trend was “could this be an artifact of an inadequate observational network?”. It is easy to see how this question might get conflated with “is the trend unexpected”, and “what are the causes of the trend?”. Given the estimated observational uncertainty, it would be hard to imagine arguing that the Earth’s surface has not warmed, regardless of the driving processes.
Thanks for clarifying that you think that GCMs are the best way to understand the climate system. While a focus on observations is good, this discussion only highlights how difficult it is to examine observational evidence in isolation from our physical understanding, which often finds its expression in models of various complexity.
I must take issue with one of your comments, however. You say that:
“The full situation is even worse: there does not seem to be any statistically-valid observational evidence for global warming”
and go on to say
“What would happen if policy makers were officially told that there is no observational evidence for global warming?”
As highlighted in my previous email, there is a great deal of observational evidence that agrees well with our physical understanding of the system – much “physically valid” observational evidence, if you like. However, I think that it is important to be very clear what we mean by “statistically valid”, before we decide whether the evidence that we have conforms to that or not. What would statistically valid evidence for global warming look like? Perhaps more usefully, what would statistically valid evidence against the theory of anthropogenic induced warming look like? To say that observational evidence is not “statistically valid” is probably more a comment on our statistical framework, than our knowledge of the climate. I think it is unfair to make the argument that there is no “statistically valid” evidence, without stating what statistical framework we are working under, and then going on and showing that the evidence for warming is not statistically valid. To further extend this and say that there is “no observational evidence” for global warming is stretching the point even further.
With that in mind, I think your instinct to question the statistical models used in climate science is a good one. We should certainly test a number of timeseries (including those from models) against a number of statistical models and assumptions. You have, however, made a good point about the inappropriateness of a linear trend assumption for global mean temperatures. It would seem unfair then, to test our best statistical models against this statistical model? The appropriate test is surely against our physical models, worked out from first principles, or against a physical-statistical model?
A good example of this approach is used in a paper that I was only made aware of since last writing to you. The paper is Mann (2011), where the author looks at long range dependence in global temperature series. The author claims that the value of the Hurst coefficient seen in the observational temperature record, can best be reproduced in a very simple climate model, with a linear response to natural and anthropogenic forcing, combined with stochastic noise. It also finds that purely stochastic forcing can produce the observed effect, and that the observational record is perhaps too short to distinguish between the two. The structure of the paper seems to me to be a good template for a comparison of physical and statistical models. As a bonus, the code for the experiments is available online. The paper is an editorial comment of Rea et al. (2011), which also looks relevant to our discussions, but which I haven’t had time to digest yet.
Later in your email, you say:
“The central question, though, is this: what statistical models should analyses be based on?. Any statistical model should have both physical realism and a good statistical fit to the data. The only statistical model of which I am aware that has been based on the underlying physics is fGn.”
I think the first sentence is indeed a central question, and I agree with the second sentence. I wonder though, if fGn is really the only statistical model that takes into account the underlying physics? I doubt that this is the case. I notice that there is a package FGN for R, which looks promising. Regarding your suggestions for fitting fGn to models or observations, I have a number of questions:
- How could we test that fGn was an appropriate statistical model for climate model or observational records?
- If fGn does fit either (or both) well, what physical interpretation does this suggest?
- If we were able to fit fGn models to both GCMs (or simpler physical models) and the observations, what would that tell us about the Earth system that was new?
I think there could be an interesting study in comparing an fGn model against an appropriate physical-statistical model, but we would have to be sure to set up a fair test beforehand, and be careful in interpreting the results.
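As one possible starting point for the first question, the theoretical autocorrelation of fGn with Hurst coefficient H is ρ(k) = ½(|k+1|^2H − 2|k|^2H + |k−1|^2H), which decays as a power law, whereas the AR(1) autocorrelation φ^k decays geometrically. The sketch below only evaluates these two standard formulas; it does not use the R package FGN, and the values H = 0.9 and φ = 0.6 are illustrative assumptions rather than estimates from any record.

```python
import numpy as np


def fgn_acf(hurst, max_lag):
    """Theoretical autocorrelation of fractional Gaussian noise, lags 0..max_lag."""
    k = np.arange(max_lag + 1, dtype=float)
    two_h = 2.0 * hurst
    return 0.5 * (
        np.abs(k + 1) ** two_h - 2.0 * np.abs(k) ** two_h + np.abs(k - 1) ** two_h
    )


def ar1_acf(phi, max_lag):
    """Theoretical autocorrelation of an AR(1) process, lags 0..max_lag."""
    return phi ** np.arange(max_lag + 1)


# Illustrative parameter values (assumptions, not fitted to data).
H, PHI, MAX_LAG = 0.9, 0.6, 50
rho_fgn = fgn_acf(H, MAX_LAG)
rho_ar1 = ar1_acf(PHI, MAX_LAG)
for k in (1, 5, 10, 50):
    print(f"lag {k:2d}  fGn {rho_fgn[k]:.3f}  AR(1) {rho_ar1[k]:.2e}")
```

Comparing the sample autocorrelation of a record (or of model output) with both curves would be a crude first check; a fair test would also need to account for sampling uncertainty in a record as short as the instrumental one.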
M. E. Mann (2011) On long range dependence in global surface temperature series, an editorial comment, Climatic Change, Volume 107, Numbers 3-4, 267-276, DOI: 10.1007/s10584-010-9998-z
W. Rea, M. Reale and J. Brown (2011) Long memory in temperature reconstructions, Climatic Change, Volume 107, Numbers 3-4, 247-265, DOI: 10.1007/s10584-011-0068-y