99% certainty (updated)

A month ago, one of the greatest scientific experiments in the world yielded tantalizing results regarding one of the unsolved mysteries of physics: the question of how particles gain their mass. The story is rather exciting, most of all for the physicists involved, whose models predict that this happens through the Higgs mechanism.

The collisions of protons in the LHC are supposed to create a particle that is predicted to exist if the mechanism works the way physicists think it should. The particle itself would remain elusive even if the theory is right: it would be highly unstable and disintegrate into other particles, W bosons, carrying a certain amount of energy. Those bosons are what the scientists at CERN hope to see.

Last month, CERN announced that several such particles had been detected, and statistical analysis showed a 99% certainty that this was not just down to mere coincidence. CERN called the finding ‘tantalizing’ but stopped short of announcing that the long-sought proof of the theoretical models had been found.

Had CERN been the IPCC, it would have called the discovery of the Higgs particle a virtual certainty. And with this sentence I am crossing the Rubicon. Be warned. There be dragons on the other side of the break.

The story of the Higgs didn’t end last month, even though it seemed virtually certain according to the statistics. Further measurements contradict the earlier enthusiasm, and while the statistical confidence now stands at 95%, the mere fact that it has fallen is an indication that the signal may be a fluke. That’s not new to the physicists, who are now reviewing all the ways in which those bosons they found earlier could have been created in their particular setup without the Higgs. Nature put it this way:

It is not entirely unexpected, adds Joe Incandela, deputy spokesperson for the CMS. “The veterans who have been in the trenches know that in the early goings there are often things that are unclear,” he says.

There is good reason why 99% certainty was not enough to call it quits. Statistics is no use in telling you what caused certain results. All it can do, under very special circumstances, is provide an analysis of one very common and inevitable cause of your observations matching your theory – random noise.
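As a toy sketch of what such an analysis does – the counting-experiment arithmetic behind statements like “99% certainty” – consider the standard Poisson excess calculation. The event counts here are invented for illustration; they are not CERN’s actual numbers:

```python
import math

def poisson_excess_pvalue(observed, expected_background):
    """Chance of counting at least `observed` events if nothing but
    random background (Poisson mean `expected_background`) is at work."""
    # P(K >= observed) = 1 - P(K <= observed - 1)
    cdf = sum(math.exp(-expected_background) * expected_background ** k / math.factorial(k)
              for k in range(observed))
    return 1.0 - cdf

# Hypothetical run: 25 candidate events where background alone predicts 12.
p = poisson_excess_pvalue(25, 12.0)
print(f"p-value = {p:.2e}, i.e. {100 * (1 - p):.2f}% confidence it is not noise")
```

Note what the number means: it only quantifies the noise-only explanation. A non-Higgs process that also produces extra bosons would sail straight through this test.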

What it cannot do is tell you whether the mechanism that is supposed to produce your observation is actually the mechanism that was at work. In the case of the LHC experiments, the observed W bosons may not actually have been generated through the Higgs mechanism as described, but through some other process. Such a process would not be random noise, and hence would not be caught by the initial statistical analysis.

Statistical analysis, by its nature, must allow for certain deviations from the predicted result. If a process is going on during the observation period that produces results similar to the predictions, the initial results will seem to imply a very high certainty that the observation was not down to chance. But once more data has been collected, the subtle differences between the proposed process and the one actually at work will show deviations that can no longer be reconciled by blaming them on chance alone.

That is why the standard for certainty in particle physics, before a new discovery is announced, is 5 standard deviations – a likelihood of about 99.99997% of the observation not being down to chance.
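For reference, the conversion from “standard deviations” to such a confidence figure is a one-liner (using the one-sided Gaussian tail, the convention particle physicists use for discovery claims):

```python
import math

def one_sided_confidence(sigmas):
    """Probability that a pure Gaussian fluctuation stays below
    `sigmas` standard deviations."""
    return 1.0 - 0.5 * math.erfc(sigmas / math.sqrt(2.0))

for n in (2, 3, 5):
    print(f"{n} sigma -> {100 * one_sided_confidence(n):.5f}% confidence")
```

Five sigma works out to about 99.99997%, which is why a 99% result (roughly 2.3 sigma) is treated as merely tantalizing.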

This is done under the presumption that most of the subtle effects and deviations that can be expected when treading on territory so far unknown to science will come to bear and be identified before such a high level of confidence can be attained.

Moreover, there is the fact that scientists are human beings. They have ambitions, they have a certain amount of trust in their theories and, like all human beings, they are not perfectly objective in setting up observations or evaluating results. The physicist Richard Feynman (incidentally a particle physicist) was very aware of those faults and the difficulty of dealing with them.

In a famous lecture he held in 1974, he went to great lengths to show just how important it is that a scientist acknowledge his own fallibility and show all the ways in which he could be wrong:

I would like to add something that’s not essential to the science, but something I kind of believe, which is that you should not fool the laymen when you’re talking as a scientist. . . . I’m talking about a specific, extra type of integrity that is not lying, but bending over backwards to show how you’re maybe wrong, [an integrity] that you ought to have when acting as a scientist. And this is our responsibility as scientists, certainly to other scientists, and I think to laymen.

This is why the physicists were (seemingly) so incredibly reluctant, despite 99% statistical certainty, to just go ahead and say: See, we found it, the science is settled.

I applaud CERN for their modesty in announcing their results. Despite the pressure of having to justify billions of dollars spent on one of the largest science projects this planet has ever seen – they kept their calm. Instead of triumphal rhetoric, even Nature had to concede:

For now, physicists are only willing to call them ‘excess events’, but fresh data from two experiments at the Large Hadron Collider (LHC) are hinting at something unusual — and it could be the most sought-after particle in all of physics.

Keep in mind (and here is one of the dragons I was talking about), that all this happened after an observation that the IPCC would have talked about as a “virtual certainty”.

The contrast between such modesty among particle physicists and the rhetoric associated with climate science is stark. In its fourth assessment report, the Intergovernmental Panel on Climate Change agreed upon the following:

  • “virtually certain” (greater than 99% chance the result is true)
  • “extremely likely” (greater than 95% chance the result is true)
  • “very likely” (greater than 90% chance the result is true)
  • “likely” (greater than 66% chance the result is true)
  • “more likely than not” (greater than 50% chance the result is true)
  • “unlikely” (less than 33% chance the result is true)
  • “very unlikely” (less than 10% chance the result is true)
  • “extremely unlikely” (less than 5% chance the result is true)
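The ladder above amounts to a simple lookup. As a sketch (the IPCC of course assigns these terms by expert judgment, not by running a formula):

```python
# The AR4 likelihood ladder from the list above, strongest term first.
LIKELIHOOD_TERMS = [
    (0.99, "virtually certain"),
    (0.95, "extremely likely"),
    (0.90, "very likely"),
    (0.66, "likely"),
    (0.50, "more likely than not"),
]

def ipcc_term(probability):
    """Return the strongest AR4 label whose threshold `probability` exceeds."""
    for threshold, term in LIKELIHOOD_TERMS:
        if probability > threshold:
            return term
    return "no positive label applies"

print(ipcc_term(0.995))  # -> virtually certain
print(ipcc_term(0.96))   # -> extremely likely
```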

(Update) Please note the systematic contortions in this use of language and their implications. While the IPCC report makes numerous claims of “virtual certainty”, the authors of the IPCC reports do not even have the possibility of expressing something like virtual impossibility (flawed as it would be), for lack of allowed language describing such a concept.

There is also not a single finding in which scientists came to the conclusion that something was “less likely than not” (a 33%–50% chance of being true), but a lot that were “more likely than not”. However, not getting such a result among hundreds of inquiries in unbiased research is … well, a “virtual impossibility”. (/Update)
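That “virtual impossibility” is straightforward to put into numbers. If each finding independently had even a modest chance q of landing in the 33%–50% band, the probability that none of N findings does is (1 − q)^N. Both q and N below are illustrative guesses, not counts taken from the report:

```python
def chance_of_zero(q, n):
    """Probability that none of n independent findings lands in the
    'less likely than not' band, if each has chance q of doing so."""
    return (1.0 - q) ** n

for n in (50, 100, 300):
    print(f"N = {n}: P(zero such findings) = {chance_of_zero(0.05, n):.2e}")
```

Even at q = 5%, a few hundred findings without a single one in that band has odds of roughly one in five million.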

After continued experiments and observations, the statistical certainty of the excess bosons in the LHC being caused by the presence of Higgs particles is still technically at about 95% – but no physicist would call this “extremely likely”. In fact, the implication of the confidence falling to a mere 95% is that the observation was probably a false alarm.

I can see none of this modesty – the result of many disappointments, false alarms, fraudulent claims and other hard-won experience in the struggle to find out how nature works – anywhere in the publications of the IPCC. The vocabulary used grossly misrepresents what such confidence means in the practical experience of scientific discovery. (And that is before considering such atrocious expressions as “extreme weather events”, which have led to perfectly arbitrary attributions of any and all somewhat unusual weather events to climate change.) I also have trouble finding this modesty in other publications concerning climate change and its effects.

So please understand my reluctance (and sometimes, admittedly, open hostility) towards the claims made in the climate change debate and most of all the claims about the certainty of the findings.

There are enough reasons to reduce the use of carbon-based fossil fuels: the fundamental scarcity of fossil fuels; the unavoidable environmental damage associated with their use and extraction; or their price, which will inexorably rise as the world industrializes and which will hamper the economies of the industrialized countries unless they reduce their dependence upon them. (Which is something I have alluded to in one of my comments on the Economist fora and will discuss here in some later posting.)

The current display of overconfidence in the results of climate science not only hurts the credibility of science in general today – the backlash against science that can be expected when core assumptions turn out to be wrong can prove to be extremely destructive.

Science must not be swayed by public rhetoric and moral panic. Science is about those things that we can know with such utmost certainty that we can build upon them. Uncertainties and areas of scientific ignorance must find adequate expression and must not be swept under the rug of rhetoric in the way the IPCC has tried to institutionalize.


7 thoughts on “99% certainty (updated)”

    • Well, hardly surprising you found it via /. This is a new blog and I have a pretty good idea where people come from. ;) (Basically from one of the two places I mentioned it, so far.)

      Any help spreading the word is appreciated.

      Of course, we should try to wean ourselves off fossil fuels. But it’s not that easy. I’ll try to get around beginning to write about those things in earnest this weekend.

  1. The IPCC exists to produce reports into the state of knowledge about the character and impacts of climate change. The “many disappointments, false alarms, fraudulent claims and other hard-won experience in the struggle to find out how nature works” take place both in the literature, and in the work that comes before the literature. It is not the job of the IPCC to provide an extensive catalogue of all these failures, and trivially one can see that doing so would vastly increase the length of the reports.

    Instead, their role is to provide a comprehensive assessment of the current “state of the science”, outlining the perceived levels of uncertainty to be found within the literature. This is clearly not a trivial process, and they do not please everyone; indeed they seem to struggle to please anyone, but such is the nature of consensus.

    It is, in my opinion, a valuable and laudable goal to attempt to compile all the thousands of man-years of research that have gone into understanding our climate, and the systems which depend on it, into a set of reports that can be relatively quickly and easily digested by anyone who cares to read them. I think your suggestion of immodesty misrepresents the position of the IPCC reports in relation to the scientific process, namely, that they are both built upon and yet detached from it. They cannot have modesty for work which they do not do.

    On a more pedantic note, I would suggest that the second paragraph of your update:

    “There is also not a single finding, in which scientist came up with the conclusion, that something was “less likely than not” (33%-50% chance of being true) but a lot that were “more likely than not”. However, not getting such a result among hundreds of inquiries in unbiased research is … well, a “virtual impossibility”.”

    assumes that the studies referenced in the IPCC reports should be randomly distributed across result space. This is unlikely to be true as people neglect to work in areas which they think will be uninteresting and/or likely to provide inconclusive results, particularly those that might be negatively so.

    Regards,
    Chris

    • You cannot write a report on the state of knowledge about anything that has not been conclusive (and it is not – otherwise there would be no need to write about the likelihood of something happening) without also writing about the failures in that process.

      In the process of knowledge acquisition, such failures are unavoidable. Knowing which failures have occurred, which hypotheses have been dismissed and what contradicting evidence has been found is as much part of the state of knowledge as knowing which hypotheses have been confirmed or supported by evidence.

      Also, the results of research that is being conducted cannot be predicted – all the more so in narrow cases. It is impossible to believe that in all the important research on climate change where the indication of a result being true or not was very nearly undecided, the scales tipped in every case towards being just a bit more likely than not.

      This cannot be explained by people neglecting to work in areas which they think will be uninteresting and/or likely to provide inconclusive results. Because then there would be no narrow results in the first place.

      Also, an aversion to research in areas that yield inconclusive results inclined towards being negative cannot explain the lack of such results. Because such results just happen – unless the scientists know the results of their research before conducting it. But then we’re no longer talking about research; we’re beginning to talk about fraud. And that, at least, is what people call it when they are faced with those exact same problems in pharmaceutical and medical research.

      A TED talk published last week is a good primer on the topic:
      http://www.ted.com/talks/ben_goldacre_battling_bad_science.html

      Either the research or the reporting on the research is biased.

      I agree that it is a “valuable and laudable goal to attempt to compile all the thousands of man-years of research that have gone into understanding our climate” – but it is despicable to selectively leave out the tens of thousands of man-years of research that turned out not to confirm what it set out to study.

      If this would have made the report longer, then so be it. If it becomes too long because of that, then the IPCC has failed in its goal to write such a report.

      • I want to provide a point-by-point reply, so please excuse the terse style. I have also omitted quoting directly to avoid making the post too big, so hopefully that hasn’t left it unreadable.

        §1 – The uncertainty in the kinds of discussions you are talking about relate to matters of degree, not to the basic theory. Also, if one was to compile, for example, a report on the current state of aviation technology, one would not feel compelled to recount all the crashes from which we have learnt and attained our currently excellent record of safety – it would not be part of the exercise. One could question the presence in the IPCC reports of obviously negative results, but then those do provide strong examples of contrary evidence. I appreciate that this unevenness in the distribution of presented material might be, or at least be interpreted as, a form of selection bias, but I do think that it is primarily controlled by processes taking place before the IPCC, in the ways I discuss below.

        §2 – What you describe here is true for those people who are conducting the science. The IPCC reports are not intended to be literature reviews for scientists, they are there to provide guidance for policy makers. I accept your right to disagree with their conclusions, but you should not expect more from them than they set out to provide.

        §3 – Research is not conducted blindly; whilst it is obviously true that one does not (in general) conduct research for which one knows the result, it is often the case that there is some kind of prediction, i.e. a hypothesis, that one aims to test. You could say that part of being a good scientist is being able to pose good hypotheses, that is, ones that stand up to experiment, although of course the aim of a good experiment is to refute the hypothesis that it is testing. I remain skeptical of the notion that one would expect research ‘events’ to be uniformly distributed across ‘result space’, and ask if there is anything beyond anecdotal evidence of such a uniform distribution in other fields?

        §4 – As with most things, it is a matter of degree. As stated above, people inevitably bias towards testing ideas they think are right. As a scientist you don’t just think up a bunch of random ideas and test them all – you do some hard work and make a good guess and then go and try to prove it one way or the other, more or less successfully depending upon your skill.

        §5 – I think that what I’ve said in the previous paragraphs covers this one fairly well, but I would suggest caution when beginning to accuse people of fraud. There’s a lot of interacting processes going on, and if you just look at one part of the picture (the IPCC), and draw your conclusions from that, then you are not doing the people who are putting in the scientific legwork much of a service.

        §6 – I’m a big fan of Ben Goldacre, but I don’t think I’ve seen this yet, so thanks.

        §7 – I would say the reporting is polarised, and that the science is working within the framework that we have available to us, which points towards exactly the kind of things the IPCC discuss.

        §8 – A cynic might say that I was just another member of the choir, but I have never been shown any body of literature that ‘goes against the grain’ as it were. Whilst it is true that I haven’t looked very hard, I honestly believe that given the field I work in, physical oceanography, it would be being hotly discussed behind our ‘closed doors’ (which are actually open, as it happens), and it just doesn’t happen. This is of course simple anecdote, but it is this experience that informs my position, at least.

        As my closing note I would say that, as far as I can tell, pretty much everything points the same way, though again I’m certainly not an expert. What I do know is that our Earth is a complicated system, and that our models, both conceptual and numerical, are poorly constrained and probably missing important features, but also that none of that means that we are incapable of making well-founded statements about system response. It is the role of the IPCC to take those ideas and put them into a series of pretty books, not to question, conduct or even suggest scientific experiments.

        Regards,
        Chris

      • (Sorry, this reply got a lot longer than I intended.)

        Let me start with an example from another field. Until the late 1990s, and certainly until 1993, the received wisdom of astrophysics was that planets are a rare thing. Theoretical studies had been conducted and everything pointed in the same direction – that the formation of planets depends on subtle effects and equilibria that we were lucky to have in our solar system. The formation of gas giants, for example, was thought to be possible only at the right distance from a star, where water would turn into ice and provide the core for planetary formation. Suggestions that planets might actually be very common were usually put down as the wishful thinking of people who read too much science fiction.

        The very first extra-solar planet to be found around a Sun-like star was a gas giant. It circled its star every few days – much closer than Mercury, so close that its surface would melt lead or aluminum. This came completely out of the blue, and theory had to be restarted from scratch. Hundreds more have been found (using equipment that is still quite limited in performance), contradicting a lot of the theoretical work that had been done. (Shouldn’t solar radiation have blown away all the dust that could have formed planets? How did they get so close to their stars? They couldn’t just form right next to the star – or could they? Didn’t we think that large rocky planets were impossible? Are rocky planets just failed gas giants, and were we completely wrong in modeling two different kinds of planet formation? And so on.) Not only was the number much higher than expected, they were found around types of stars that were supposed to have adverse conditions for planet formation – even planets circling a double star and a pulsar. Current thinking holds that planets are probably rare in the inner parts of the galaxy … but those are hard to observe.

        Systems in climate science are even more complicated than this, and we don’t have a second system (much less another 500) to observe. So scientists will run into a lot of the same kinds of fundamental misconceptions when extrapolating the influence of CO2 on global climate and weather phenomena.

        There is, for example, the issue of cloud formation, which is absolutely fundamental but barely understood. Not only does it influence rainfall (and vegetation, which has its own feedback mechanisms), but also albedo. Clouds reflect light, and even a small change in overall cloud cover can create significant changes. Historic natural climate variability is at best poorly understood – the Dust Bowl, the Little Ice Age, the European medieval climate optimum (which was a disaster in what we currently call Iraq and Iran – with deep snow and very cold winters) and the variability before that are far from being explained.

        Also, given the current array of positive feedback mechanisms being discussed, one wonders how our climate has ever managed to remain relatively stable, or to return to something even recognizable as “normal” after all those extreme global climate phenomena. Ten thousand years ago Earth had just emerged rather quickly from an ice age. But given all the positive feedback mechanisms, why didn’t that continue? Something must have stopped the warming trend. Furthermore, climate was relatively stable over the last ten thousand years, all of which makes me think that a lot of the negative feedback and dampening mechanisms of the climate system have yet to be discovered, or are being underestimated. Because all stable systems are stable because their negative feedback mechanisms outweigh the positive ones.

        Also, I’m not accusing scientists of fraud (at least not on an unusual scale). My hypothesis is biased reporting and publishing, as well as certain social dynamics leading scientists to discard certain lines of inquiry more readily than they should. For example, all reports of research on negative feedback mechanisms are greeted immediately by skepticism and doubt, whereas reports on positive feedback mechanisms usually prompt another round of doomsday scenarios (and implicit acceptance).

        Finally, especially because the IPCC report is meant for policy makers, it must be unbiased, because the bias of the report gets translated into the decision making of politicians. But there is a stark bias in the report, and extremely unreasonable statements “slipping into” the report (like the Himalayan glaciers melting by 2035 instead of 2350) only reinforce the statistical argument. To which I would like to add that the result space need not be perfectly symmetric across the whole range of confidences.

        But having results declared “unlikely”, “likely” and “more likely than not” – yet not having any that turned out to be “less likely than not” – is only a plausible outcome if you assume that all research that turned out to be about as likely as not to be true was given a positive spin of some kind or another (e.g. by choice of null hypothesis, selective reporting, selection bias in journals, etc.). It is that bad.

        The only sure way to eradicate bias is to confront it with new observations of comparable systems. But we don’t have those. All we can do is observe the one system we have, which is giving us new data at a steady and very slow rate, and compare those observations with previous predictions. And those predictions have turned out to be pretty bad. It is not understood, for example, why global temperatures didn’t rise during the last decade. Sure, many of the warmest years on record have been in the last 10 years – but they come in an essentially random order with very little difference between them. Which is consistent not with the predicted warming trend, but with a stagnation at a high level. This may yet change (in either direction) – but why it happened has not been explained.
