This week, That’s Maths (TM018) deals with the “war” between Bayesians and frequentists, a long-running conflict that has now subsided. It is 250 years since the presentation of Bayes’ results to the Royal Society in 1763.
The column below was inspired by a book, The Theory that would not Die, by Sharon Bertsch McGrayne, published by Yale University Press in 2011.
Classical and Bayesian statistics interpret probability in different ways. To classical statisticians, or frequentists, probability is the long-run relative frequency of an event. If an event occurs on average 3 times out of 4, they will assign it a probability of 3/4.
For Bayesians, probability is a subjective way to quantify belief or degree of certainty, based on incomplete information. All probability is conditional, and subject to change when more data emerges. If a Bayesian assigns a probability of 3/4, he or she should be willing to offer odds of 3 to 1 on a bet.
Frequentists find it impossible to draw conclusions about once-off events. By using prior knowledge, Bayesian analysis can deal with individual incidents. It can answer questions about events that have never occurred, such as the risk of an asteroid smashing into the Earth or the chance of a major international war breaking out over the next ten years.
An advantage of Bayesian analysis is that it answers the questions that scientists are likely to ask.
Some Spectacular Successes
During World War II, Alan Turing used Bayesian methods to decode the German Enigma cipher and find the locations of U-boats. The cracking of Enigma was as vital to the Allied victory as any of the military engagements.
In 1966, Bayesian methods enabled the successful location of a lost hydrogen bomb following the crash of a B-52 in Palomares, Spain.
The danger of a major accident for the “Challenger” space shuttle was estimated by a Bayesian analysis in 1983 as 1 in 35. The official NASA estimate at the time was an incredible 1 in 100,000. In January 1986, during the twenty-fifth launch, the Challenger exploded, killing all seven crew members.
In May 2009, on a flight from Rio de Janeiro to Paris, Air France Flight AF447 crashed into the Atlantic Ocean. Bayesian analysis played a crucial role in the location of the flight recorders and the recovery of the bodies of passengers and crew.
Bayes’ Rule transformed probability from a statement of relative frequency into a measure of informed belief. In its simplest form the rule, devised in the 1740s by the Reverend Thomas Bayes, tells us how to calculate an updated assessment of probability in the light of new evidence.
We start with a prior degree of certainty. New data then makes this more or less likely. Bayes’ Rule gives the “likelihood ratio” that multiplies the prior value to give an updated or posterior value. We can write Bayes’ Rule symbolically as

P(H | E) = [ P(E | H) / P(E) ] × P(H)

where H is the hypothesis, E is the new evidence, and the factor P(E | H) / P(E) is the likelihood ratio.
The rule should really be named in honour of Laplace, who formulated the key ideas and expressed them in clear mathematical notation. But many years of usage compel us to credit Laplace’s achievements to Bayes.
Despite spectacular successes, Bayesian methods have been the focus of major controversy and their acceptance has been slow and tortuous. Through most of the twentieth century, the academic community eschewed Bayesian ideas and derided practitioners who applied them.
The nub of the controversy was that probabilities computed with Bayesian methods depend on prior opinion. When data is scarce, this yields a subjective rather than an objective assessment, and subjective opinions may differ widely.
The controversy was long and often bitter. Most prominent on the frequentist side were the leading classical statisticians Ronald Fisher in the UK and Jerzy Neyman in the USA. Antagonists vilified each other, generating great hostility between the two camps. The conflict had aspects of a religious war, with talk of zealots, proselytizers, fervent devotees, dogma, converts and believers.
The war is now over: frequentists and Bayesians both recognize that the two approaches have value in different circumstances. When data is plentiful, the two methods generally produce consistent results.
Difficulties with Acceptance
There were two other major reasons for the delay in acceptance of Bayesian ideas. One was official secrecy: much of the work carried out in cryptanalysis, and also in the location of lost nuclear weapons, was kept under wraps. The other was that powerful computers were required to apply Bayesian methods, and these became common only towards the end of the century.
As well as faster computers, clever algorithms were essential to make the methods practicable. The big breakthrough was a technique with the formidable name Markov Chain Monte Carlo (MCMC) method. By replacing integration by MCMC, it became possible to manage the prodigious calculations needed to compute Bayesian probabilities. The combination of Bayes’ Rule and MCMC has been described as “arguably the most powerful mechanism ever created for processing data and knowledge” (McGrayne, p 224).
MCMC is a very efficient way to sample from a probability distribution. With this algorithm, Bayesian analysis has become practicable. MCMC was originally developed in the 1950s for simulating chain reactions in a hydrogen bomb.
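To give a flavour of how MCMC samples a distribution, here is a minimal sketch of the Metropolis algorithm, one of the simplest MCMC variants. The target density (a standard normal) and all parameters are chosen purely for illustration; a real Bayesian application would put an unnormalised posterior in its place.

```python
# A minimal Metropolis (MCMC) sketch: sampling a standard normal density.
import math
import random

random.seed(42)

def target(x):
    # Unnormalised target density. MCMC never needs the normalising
    # constant, which is exactly why it suits Bayesian posteriors,
    # whose normalising integrals are often intractable.
    return math.exp(-0.5 * x * x)

samples = []
x = 0.0
for _ in range(50_000):
    proposal = x + random.gauss(0.0, 1.0)          # random-walk proposal
    # Accept the move with probability min(1, target(proposal) / target(x))
    if random.random() < target(proposal) / target(x):
        x = proposal
    samples.append(x)                              # record the current state

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"sample mean = {mean:.2f}, sample variance = {var:.2f}")
```

The chain wanders through the sample space, spending time in each region in proportion to the target density; the empirical mean and variance of the samples should come out close to 0 and 1, the moments of the standard normal.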
The Scene Today
Bayesian inference now plays a crucial role in computer science, artificial intelligence, machine learning and language translation. It has many applications, in risk analysis in the chemical and nuclear industries, image enhancement, face recognition, medical diagnosis, setting insurance rates, search and rescue operations and filtering spam email.
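One of the listed applications, spam filtering, reduces to a direct use of Bayes’ Rule. The sketch below is a toy “naive Bayes” filter; the word probabilities and the prior are invented for illustration, and a real filter would estimate them from a large corpus of labelled mail.

```python
# A toy naive-Bayes spam sketch with invented probabilities.
p_spam = 0.4  # assumed prior probability that an incoming message is spam

# Hypothetical per-word probabilities, as if estimated from training mail
p_word_given_spam = {"winner": 0.30, "meeting": 0.02}
p_word_given_ham = {"winner": 0.01, "meeting": 0.20}

def spam_probability(words):
    odds = p_spam / (1 - p_spam)  # prior odds of spam
    for w in words:
        # Each word's likelihood ratio multiplies the odds (the "naive"
        # step assumes words occur independently given the class).
        odds *= p_word_given_spam[w] / p_word_given_ham[w]
    return odds / (1 + odds)      # convert odds back to a probability

print(f"{spam_probability(['winner']):.2f}")   # 'winner' pushes towards spam
print(f"{spam_probability(['meeting']):.2f}")  # 'meeting' pushes towards ham
```

Each word nudges the prior odds up or down, exactly the prior-to-posterior update described earlier, applied once per word.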
Bayesian inference is likely to find many new applications over the coming decades.
Sharon Bertsch McGrayne, 2011: The Theory That Would Not Die. Yale University Press, 336pp.