Owhadi et al.’s When Bayesian Inference Shatters
Owhadi, Scovel and Sullivan, ‘When Bayesian Inference Shatters’, arXiv.org, August 2013.
[Although] Bayesian methods are robust when the number of possible outcomes is finite or when only a finite number of marginals of the data-generating distribution are unknown, they are generically brittle when applied to continuous systems with finite information on the data-generating distribution. This … suggests that Bayesian inference is generically ill-posed in the sense of Hadamard when applied to such systems: if closeness is defined in terms of the total variation metric or the matching of a finite system of moments, then (1) two practitioners who use arbitrarily close models and observe the same (possibly arbitrarily large amount of) data may reach diametrically opposite conclusions; and (2) any given prior and model can be slightly perturbed to achieve any desired posterior conclusions.
One response to the concern that the choice of prior and model are somewhat arbitrary is to optimize over classes of priors and models. This … makes very strong statements about the form of the model components, particularly with regard to the tails, that may be difficult to justify based on a finite amount of prior information. This is particularly important in applied fields such as catastrophe modeling, insurance, and re-insurance, where often the events of interest are high-impact low-probability “Black Swan” events: the difference between an exponentially small and a polynomially small tail can be vitally (or ruinously) important.
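To make the point about tails concrete, here is a minimal sketch of my own (not from the paper): compare exceedance probabilities under an exponentially small tail (standard normal) and a polynomially small one (a Pareto tail, with hypothetical parameters alpha = 2 and scale 1 chosen purely for illustration). At the extreme levels relevant to catastrophe modelling, the two differ by many orders of magnitude.

```python
from math import erfc, sqrt

def normal_tail(x):
    """P(X > x) for a standard normal: an exponentially small tail."""
    return 0.5 * erfc(x / sqrt(2))

def pareto_tail(x, alpha=2.0, xm=1.0):
    """P(X > x) for a Pareto(alpha, xm): a polynomially small tail."""
    return (xm / x) ** alpha if x > xm else 1.0

# At moderate levels the two tails are comparable; at extreme levels
# they diverge enormously (at x = 10 the normal tail is of order
# 1e-23, the Pareto tail is 1e-2).
for x in (2, 5, 10):
    print(x, normal_tail(x), pareto_tail(x))
```

Whether a ruinous event has probability one in a hundred or one in a hundred sextillion is exactly the kind of conclusion that a finite amount of prior information cannot settle.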
A ‘Plain Jane’ summary
The paper is clearly important, but very technical, even for experts in the field. The authors have supplied a ‘plain Jane’ summary.
[A] natural way to assess the sensitivity of the Bayesian answer with respect to the choice of prior is to specify the distribution ℚ of only a large, but finite, number, k, of moments … . This defines a class of priors Π and our results show that no matter how large k is, no matter how large the number of samples is, for any ℚ that has a density with respect to the uniform distribution on the first k moments, if you observe the data at a fine enough resolution, then the minimum and maximum of the posterior value of the mean over the class of priors Π are 0 and 1.
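As a highly simplified toy analogue of this mechanism (my own construction, not the paper's actual proof, which works with classes of priors over spaces of measures): take a forward map that oscillates rapidly at fine scales, and two priors whose first moments agree exactly (and whose second moments agree to within about 10^-3) but which place their mass differently at the fine scale of the map. Conditioning on a single finely resolved observation drives the two posterior means to opposite ends of [0, 1].

```python
import math

def g(theta, n=1000):
    """Rapidly oscillating forward map: a sawtooth with n teeth."""
    return (theta * n) % 1.0

def likelihood(x, theta, sigma=1e-4):
    """Gaussian observation noise of width sigma around g(theta)."""
    return math.exp(-0.5 * ((x - g(theta)) / sigma) ** 2)

def posterior_mean(prior, x):
    """prior is a list of (theta, weight) atoms; returns E[theta | x]."""
    weighted = [(t, w * likelihood(x, t)) for t, w in prior]
    z = sum(w for _, w in weighted)
    return sum(t * w for t, w in weighted) / z

# Both priors have mean 0.49975; their second moments differ by ~5e-4.
# They differ only in which of their two atoms sits on a level set of g.
prior_a = [(0.0005, 0.5), (0.9990, 0.5)]  # low atom hits g(theta) = 0.5
prior_b = [(0.0000, 0.5), (0.9995, 0.5)]  # high atom hits g(theta) = 0.5

x_obs = 0.5  # one observation, resolved far more finely than the moments
print(posterior_mean(prior_a, x_obs))  # ~0.0005
print(posterior_mean(prior_b, x_obs))  # ~0.9995
```

The moments constrain the priors only at coarse scales, while a fine-resolution observation interrogates the fine scales, where the class is unconstrained; this is the sense in which the posterior range of the mean spans essentially all of [0, 1].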
We do not think that this is the end of Bayesian Inference, but we hope that these results will help stimulate the discussion on the necessity to couple posterior estimates with rigorous performance/accuracy analysis.
I have yet to fully digest the paper, but my thoughts are:
- In applications where there is no mathematical or properly scientific justification for the use of Bayesian data analysis, its results could be very misleading.
- One at least needs to supplement such analysis with specific consideration of robustness, rather than taking robustness for granted.
- If one regards P(X) as the probability of X, then one should note that there is additional uncertainty, about the choice of prior and model themselves, which may be far more significant.
- A traditional approach to tail-area statistics has been to assume Gaussian (‘normal’) distributions. Once one abandons this assumption – as one often must – then the Bayesian approach becomes uncertain. I agree that so-called robust Bayesian techniques do not seem to help.
- The discussion broadly mirrors that of Keynes’ Treatise on Probability, adding an explicit construction to support its claims.
- The paper considers only static distributions. The authors intend to develop a means of taming the uncertainty they identify. This may not be helpful in terms of wider uncertainties, such as those that Keynes considers.
- It is not obvious that the dangerous cases briefly referred to above are static in the sense of the paper.
My own approach to this problem has been from the application end. Many attempts to apply Bayesian analysis to complex systems have resulted in conclusions that have later been seen to be ridiculous, or which even if correct have not commanded the confidence of decision-makers. So far, these have all been cases in which there was no logical reason to think that the analysis ‘should’ have worked. Often, the non-probabilistic uncertainty has dominated the overall uncertainty, and attempts to improve probability estimates have been futile. It thus seems important to me to be able to distinguish between situations in which Bayesian approaches will or will not be reliable. For reasons such as those described by Keynes I do not find the paper’s modelling approach appropriate to many important situations, but it does seem a vital step forward in helping to shatter any illusions about Bayesian inference.
On a more positive – yet speculative – note, in specific applications it has often been useful to use statistical methods, including Bayesian ones, to make extrapolations, as long as one treats them as extrapolations, not unconditional predictions. Thus if one uses a family of distributions to make a robust Bayesian estimate, one should make all deductions conditional on the family used, and monitor the credibility of the family. In many systems of interest, such as climate or economies, what is of interest is not so much a transient ‘Black Swan’, but one that makes a lasting difference to the way things work. One does not mind an economic ‘blip’ so much as a depression. Thus one can make reasonable forecasts subject to the condition that there is no ‘Black Swan’ that has a lasting effect. Worrying about such Black Swans is then a separate activity. In this sense, perhaps Bayesian data analysis could be made robust yet bounded, rather than unbounded and fragile.
Similarly, Bayesian techniques are only justified when the world is simple in some sense (such as Keynes discusses). So maybe one should make the conclusions of Bayesian inference conditional on the situation being sufficiently simple.