Interpreting Bayes’ Rule

Bayes’ Rule is about updating the probability, P(H), of a hypothesis, H, on receiving some relevant information (‘evidence’), E. The resultant probability, P(H|E), is the prior probability, P(H), multiplied by the ‘Bayes factor’, which is the likelihood of the evidence under the hypothesis, P(E|H), divided by the probability of the evidence, P(E). That is:

P(H|E) ≡ P(H).P(E|H)/P(E). (Or some algebraic variant.)
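As a minimal sketch of the formula in use, the following computes a single update. The hypothesis, the evidence and all of the numbers are illustrative assumptions, not taken from any real data; the function name `bayes_update` is likewise hypothetical.

```python
def bayes_update(prior, likelihood, evidence_prob):
    """Return P(H|E) = P(H) * P(E|H) / P(E)."""
    if evidence_prob <= 0:
        raise ValueError("P(E) must be positive for the update to be defined")
    return prior * likelihood / evidence_prob

# Assumed example: H = "the coin is double-headed", E = "one toss shows Heads".
prior = 0.1              # P(H), an assumed prior
p_e_given_h = 1.0        # P(E|H): a double-headed coin always shows Heads
p_e_given_not_h = 0.5    # P(E|not H): assumes the only alternative is a fair coin

# P(E) by the law of total probability over H and not-H.
p_e = prior * p_e_given_h + (1 - prior) * p_e_given_not_h

posterior = bayes_update(prior, p_e_given_h, p_e)
print(f"P(E) = {p_e:.4f}, P(H|E) = {posterior:.4f}")
```

Note that P(E) here already depends on what the alternatives to H are assumed to be, which is where the warnings below start to bite.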

Bayes’ rule follows very simply from the identity:

P(A & B) ≡ P(B|A).P(A).

As such it is very rightly regarded as ‘a mathematical truism’. Applications of it are essential to all sciences and pseudo-sciences. But the point of this note is that not all applications respect the mathematics, and so not all are equally to be trusted.

Some warnings follow.

Composite Hypotheses

The assumption behind Bayes’ rule is that the hypothesis determines the likelihood. But often the hypothesis is ambiguous: there is a range of likelihoods that fit. For example, the hypothesis ‘this coin is not fair’ fits all likelihoods except P(‘Heads’)=1/2.
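To illustrate, here is a small sketch that evaluates the likelihood of the same evidence under several biases, all of which are compatible with ‘this coin is not fair’. The evidence (8 Heads in 10 tosses) and the candidate biases are assumptions chosen for illustration, and the tosses are assumed independent.

```python
from math import comb

def binomial_likelihood(heads, tosses, p_heads):
    """P(exactly `heads` Heads in `tosses` independent tosses | P('Heads') = p_heads)."""
    return comb(tosses, heads) * p_heads**heads * (1 - p_heads) ** (tosses - heads)

heads, tosses = 8, 10  # assumed evidence

# Every one of these biases satisfies 'this coin is not fair'.
likelihoods = {p: binomial_likelihood(heads, tosses, p) for p in (0.1, 0.4, 0.6, 0.9)}

for p, lik in likelihoods.items():
    print(f"P('Heads') = {p}: P(E | bias) = {lik:.3g}")
```

The likelihoods differ by several orders of magnitude, so ‘not fair’ on its own does not supply the single P(E|H) that the formula needs; some further assumption about the bias has to be added.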

In such cases you may refine the hypothesis as you gain evidence. For example, if you get a long run of almost all ‘Heads’ you might refine the hypothesis to ‘biased to Heads’. In this case any previous likelihoods need to be re-assessed: Bayes’ rule doesn’t apply.

Similarly if, after having applied Bayes’ rule, you get information relevant to the hypotheses, such as ‘the coin is the same on both sides’, which is not actually ‘evidence’ in the sense of Bayes’ rule, you should re-assess the previous likelihoods.

Dependence on Context

In mathematical versions of Bayes’ rule the probability function P( ) depends on some context, C. Here I denote this context dependency by PC(), the probability in the context of C.

Bayes’ rule then becomes:

PC(H|E) ≡ PC(H).PC(E|H)/PC(E).

It is not usual to be so pedantic, because the context is normally fixed throughout the sequence of updates and so ‘can be taken as read’. But we should always bear this context-dependence in mind. We then have a version of Bayes’ rule that is mathematically equivalent to the usual ones, but less prone to misinterpretation:

  • The probability is multiplied by the Bayes’ Factor, unless the evidence changes the context.

If the evidence does change the context then the initial probabilities and likelihoods may need to change. For example, if one has a formal ‘mathematical’ model and is trying to estimate a parameter, then Bayes’ rule applies when you are just updating the parameter: but the evidence could cast doubt on the model.

This need not be an issue, but where they rely on human judgement, mainstream experimental best practice prohibits such changes, for good reason. The logical solution is to reset the context and restart the experiment. If, as is common practice, one sticks with the old context then the Bayes’ Factors will be unreliable, perhaps even supporting the wrong hypotheses.


  1. You have been watching a market trader when you realise that they are tricking you. So you revise your current interpretation of what you have seen.
  2. A British soldier falls asleep on the plane after an exhausting exercise in Germany, not having noticed that it had been loaded with emergency aid. He wakes up at low level over Ethiopia, with his colleagues dropping the aid. Initially he thinks he is on his way back to the UK, and has appropriate priors. But at some point he realises that he was wrong, and re-interprets what he has seen in the new context.
  3. You have a context, C, with parameter X, denoted C(X), and a prior P(X=x). The likelihood function is P(E | X=x). On obtaining E you update the prior using Bayes’ rule to yield P(X=x | E). But you notice that P(E) was very small, and this keeps happening. You realise that C is a special case of a meta-context, C'(X,Y), with C(X) ≡ C'(X,Y=y). You conjecture that Y≠y, assign priors P(X,Y), gather new evidence and apply Bayes’ rule.

I do not see how Bayes’ rule could apply directly to the above. I have found the last example to be very common. The method I describe can be respectable, for example if the meta-context and appropriate priors were well known and widely accepted even before I was born. More generally it seems to me that Bayesian reasoning is fine for the final presentation of results, but it sometimes needs some preparation to make sure that one can legitimately apply it.
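The third example can be sketched in code. Everything specific here is an assumption made for illustration: X is the coin’s probability of Heads on a three-point grid, Y is a flag for ‘the coin is double-headed’, the original context C fixes Y=0, the priors are uniform, and the evidence is a run of 10 Heads in 10 independent tosses. The helper names are hypothetical.

```python
from itertools import product

def update(prior, likelihood):
    """Bayes' rule on a discrete grid. Returns (posterior, P(E))."""
    joint = {h: p * likelihood(h) for h, p in prior.items()}
    p_e = sum(joint.values())
    return {h: j / p_e for h, j in joint.items()}, p_e

def p_heads(x, y):
    # Assumed meta-context C'(X, Y): Y = 1 means a double-headed coin.
    return 1.0 if y == 1 else x

def run_likelihood(heads, tosses):
    """Likelihood of one specific toss sequence with the given number of Heads."""
    def likelihood(hypothesis):
        x, y = hypothesis
        q = p_heads(x, y)
        return q**heads * (1 - q) ** (tosses - heads)
    return likelihood

xs = (0.4, 0.5, 0.6)  # assumed grid: in context C the coin is 'roughly fair'

# Original context C(X) = C'(X, Y=0): the update works, but P(E) is tiny.
prior_c = {(x, 0): 1 / len(xs) for x in xs}
post_c, p_e_c = update(prior_c, run_likelihood(10, 10))
print(f"Within C:  P(E) = {p_e_c:.4f}")

# Meta-context C'(X, Y): fresh priors over (X, Y), then a NEW run of 10 Heads.
prior_meta = {(x, y): 0.5 / len(xs) for x, y in product(xs, (0, 1))}
post_meta, p_e_meta = update(prior_meta, run_likelihood(10, 10))
p_trick = sum(p for (x, y), p in post_meta.items() if y == 1)
print(f"Within C': P(E) = {p_e_meta:.4f}, P(Y=1 | E) = {p_trick:.4f}")
```

Within C the update is arithmetically valid and simply favours the largest X on the grid; nothing in the posterior itself flags that the context is in doubt. The move to C' and the choice of new priors happen outside Bayes’ rule, which is then applied afresh to new evidence.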

Dave Marsay
