# Illustrations of Uncertainty

These examples of uncertainty are based on ones invented by others and, as such, are simpler than real cases. See Sources of uncertainty for an overview of the situations and factors referred to.

## Pirates: Predicting the outcome of a decision that you have yet to make

Jack Sparrow can’t predict events that he can influence. Here we generalise this observation, revealing limits to probability theories.

In ‘Pirates of the Caribbean’ the hero, Captain Jack Sparrow, mocks the conventions of the day, including probability theory. In ‘On Stranger Tides’ when asked to make a prediction he says something like ‘I never make a prediction on something that I will be able to influence’. This has a mundane interpretation (even he can’t predict what he will do). But it also suggests the following paradox.

Let {Ei} be a set of possible future events dependent on a set, {Dj}, of possible decisions. Then, according to probability theory, for each i, P(Ei) ≡ ∑j P(Ei|Dj)·P(Dj).

Hence to determine the event probabilities, {P(Ei)}, we need to determine the decision probabilities, {P(Dj)}. This seems straightforward if the decision is not dependent on us, but is problematic if we are to make the decision.
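As a minimal sketch of this dependence, the following computes {P(Ei)} from {P(Ei|Dj)} and {P(Dj)}. The numbers and the names `event_probabilities` and `P_E_GIVEN_D` are invented for illustration; the point is only that the event probabilities shift whenever the decision probabilities do, even though no new evidence about the events has arrived.

```python
def event_probabilities(p_event_given_decision, p_decision):
    """Return [P(E_i)] using P(E_i) = sum_j P(E_i | D_j) * P(D_j)."""
    if abs(sum(p_decision) - 1.0) > 1e-9:
        raise ValueError("decision probabilities must sum to 1")
    return [
        sum(p_ij * p_j for p_ij, p_j in zip(row, p_decision))
        for row in p_event_given_decision
    ]


# Hypothetical numbers: two events (rows) and two decisions (columns).
P_E_GIVEN_D = [
    [0.9, 0.2],  # P(E1 | D1), P(E1 | D2)
    [0.1, 0.8],  # P(E2 | D1), P(E2 | D2)
]

early = event_probabilities(P_E_GIVEN_D, [0.5, 0.5])  # still undecided
late = event_probabilities(P_E_GIVEN_D, [1.0, 0.0])   # settled on D1
print(early, late)
```

The conditional probabilities are held fixed throughout; only the decision-maker's state of mind differs between the two calls.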

According to Bayes’ rule the probability of an event only changes when new evidence is received. Thus if we consider a decision to have a particular probability it is problematic if we change our mind without receiving more information.

As an example, suppose that an operation on our child might be beneficial, but has to be carried out within the next hour. The pros and cons are explained to us, and then we have an hour to decide, alone (in an age before modern communications). We are asked how likely we are to go ahead initially, then at half an hour, then for our final decision. It seems obvious that {P(Dj)} would most likely change, if only to become more definite. Indeed, it is in the nature of making a decision that it should change.

From a purely mathematical perspective, there is no problem. As Keynes emphasized, not all future events can be assigned numeric probabilities: sometimes one just doesn’t know. ‘Weights of evidence’ are more general. In this scenario we can see that initially {P(Dj)} would be based on a rough assessment of the evidence, and the rest of the time spent weighing things up more carefully, until finally the pans tip completely and one has a decision. The concept of probability, beyond weight of evidence, is not needed to make a decision.

We could attempt to rescue probabilities by supposing that we only take account of probability estimates that take full account of all the evidence available. Keynes does this, by taking probability to mean what a super-being would make of the evidence. But our decision-maker is not a super-being, and so we can say what the probability distribution should be, not what it is ‘likely’ to be. More seriously, in an actual decision such as this the decision-makers will be considering how the decision can be justified, both to themselves and to others. Justifications often involve stories, and hence are creative acts. It is hard to see how an outsider, however clever, could determine what should be done. Thus even a Keynesian logical probability does not seem applicable.

## Area

Wittgenstein pointed out that if you could arrange for darts to land with a uniform probability distribution on a unit square, then the probability of the dart landing on a sub-set of the square would equal its area, and vice-versa. But some sub-sets are not measurable, so some (admittedly obscure) probabilities would be paradoxical if they existed.
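For a measurable sub-set the correspondence is easy to illustrate. The sketch below throws simulated darts uniformly at the unit square and estimates the probability of landing in a quarter disc, whose area is π/4. The quarter disc, the function names and the dart count are all illustrative choices, and a pseudo-random generator is only an approximation to a uniform distribution. A non-measurable sub-set cannot be handled this way at all, since there is no membership test that could be written down for it.

```python
import random


def estimate_probability(in_subset, n_darts=200_000, seed=0):
    """Estimate P(dart lands in the sub-set) for uniform darts on the unit square."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(in_subset(rng.random(), rng.random()) for _ in range(n_darts))
    return hits / n_darts


def in_quarter_disc(x, y):
    """Membership test for the quarter disc of radius 1 centred on the origin."""
    return x * x + y * y <= 1.0


estimate = estimate_probability(in_quarter_disc)
print(estimate)  # should be close to the area, pi / 4
```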

## Cabs

Tversky and Kahneman, working on behavioural economics, posed what is now a classic problem:

A cab was involved in a hit-and-run accident at night. Two cab companies, the Green and the Blue, operate in the city. You are given the following data:
(i) 85% of the cabs in the city are Green and 15% are Blue;
(ii) A witness identified the cab as a Blue cab.
The court tested his ability to identify cabs under the appropriate visibility conditions. When presented with a sample of cabs (half of which were Blue and half of which were Green) the witness made correct identifications in 80% of the cases and erred in 20% of the cases.

Question: What is the probability that the cab involved in the accident was Blue rather than Green?

People generally say 80%, whereas Kahneman and Tversky, taking account of the base rate using Bayes’ rule, gave 41%. This is highly plausible and generally accepted as an example of the ‘base rate fallacy’. But this answer seems to assume that the witness is always equally accurate against both types of cab, and from an uncertainty perspective we should challenge all such assumptions.
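The 41% figure can be reproduced with a short calculation. This is a sketch of the standard Bayes' rule working, assuming (as the official answer does) that the witness is 80% accurate on Blue cabs and 80% accurate on Green cabs; the function and argument names are my own.

```python
def posterior_blue(prior_blue, p_say_blue_given_blue, p_say_blue_given_green):
    """P(cab was Blue | witness says Blue), by Bayes' rule."""
    prior_green = 1.0 - prior_blue
    numerator = p_say_blue_given_blue * prior_blue
    denominator = numerator + p_say_blue_given_green * prior_green
    return numerator / denominator


# 15% of cabs are Blue; the witness is assumed equally accurate on both colours.
standard = posterior_blue(0.15, 0.80, 0.20)
print(round(standard, 3))  # 0.414
```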

If the witness has lived in an area where most cabs are Green then they may tend to call cabs Green when they are in doubt, and only call them Blue when they are clear. When tested they may have stuck with this habit, or may have corrected for it. We just do not know. It is possible that the witness never mistakes Green for Blue, and so the required probability is 100%. This might happen if, for example, the Blue cabs had a distinctive logo that the witness (who might be colour-blind) used as a recognition feature. At the other extreme (for example, if Green cabs had a distinctive logo), Blue cabs are never mistaken for Green, and the required probability is 31%.
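The range from 31% to 100% can be explored by dropping the equal-accuracy assumption. In the sketch below the only constraint is the reported one: 80% correct overall on a sample that is half Blue and half Green, so the accuracy on Blue cabs and the accuracy on Green cabs must average 80%. The parametrisation by `blue_correct_pct` is my own device for sweeping between the two extremes.

```python
PRIOR_BLUE = 0.15
OVERALL_ACCURACY_PCT = 80  # measured on a half-Blue, half-Green test sample


def posterior_blue(blue_correct_pct):
    """P(cab was Blue | witness says Blue), given the accuracy on Blue cabs."""
    # The two accuracies must average to the overall figure.
    green_correct_pct = 2 * OVERALL_ACCURACY_PCT - blue_correct_pct
    p_say_blue_given_blue = blue_correct_pct / 100
    p_say_blue_given_green = (100 - green_correct_pct) / 100
    numerator = p_say_blue_given_blue * PRIOR_BLUE
    return numerator / (numerator + p_say_blue_given_green * (1 - PRIOR_BLUE))


# 60 means Green is never called Blue; 100 means Blue is never called Green.
for pct in (60, 70, 80, 90, 100):
    print(pct, round(posterior_blue(pct), 2))
```

The symmetric case (80) recovers the official 41%; the other values show how much the answer depends on an assumption that the problem statement does not supply.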

Finally, a witness would normally have the option of saying that they were not sure. In this case it might be reasonable to suppose that they would only say that the cab was Blue if, after taking account of the base rate, the probability was reasonably high, say 80%. Thus an answer of 80% seems more justifiable than the official answer of 41%, but it might be better to give a range of answers for different assumptions, which could then be checked. (This is not to say that people do not often neglect the base rate when they shouldn’t, but simply to say that the normative theory that was being used was not fully reliable.)

## Tennis

Peter Gärdenfors and Nils-Eric Sahlin’s Decision, Probability, and Utility (1988) includes the following example:

Miss Julie … is invited to bet on the outcome of three different tennis matches:

• In Match A, Julie is well-informed about the two players. She predicts that the match will be very even.
• In Match B, Julie knows nothing about the players.
• In Match C, Julie has overheard that one of the players is much better than the other but—since she didn’t hear which of the players was better—otherwise she is in the same position as in Match B.

Now, if Julie is pressed to evaluate the probabilities she would say that in all three matches, given the information she has, each of the players has a 50% chance of winning.

Miss Julie’s uncertainties, following Keynes, are approximately [0.5], [0,1] and {0,1}. That is, they are like those of a fair coin, a coin whose bias is unknown, or a coin with the same face on both sides, where we do not know whether that face is ‘heads’ or ‘tails’. If Miss Julie is risk-averse she may reasonably prefer to bet on match A rather than on either of the other two.
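One way to see why a risk-averse Miss Julie might prefer match A is to look at the range of expected payoffs of an evens bet under each description. The sketch below treats each uncertainty as a set of candidate win probabilities and takes the worst case; this maximin reading of 'risk-averse' is an assumption of mine, as are the names used. Because the expected payoff is linear in the win probability, the end-points 0 and 1 suffice to represent the interval [0,1].

```python
def evens_bet_payoff_range(candidate_win_probs, stake=1.0):
    """Range of expected payoffs of an evens bet over the candidate probabilities."""
    # An evens bet wins `stake` with probability p and loses it otherwise.
    payoffs = [stake * (2 * p - 1) for p in candidate_win_probs]
    return min(payoffs), max(payoffs)


MATCHES = {
    "A": [0.5],       # [0.5]: a known, even chance
    "B": [0.0, 1.0],  # [0,1]: end-points of the interval of possible biases
    "C": [0.0, 1.0],  # {0,1}: the only two possibilities
}

ranges = {name: evens_bet_payoff_range(ps) for name, ps in MATCHES.items()}
best_worst_case = max(ranges, key=lambda name: ranges[name][0])
print(ranges, best_worst_case)
```

Note that a worst-case comparison separates match A from the others but does not distinguish B from C; that distinction only shows up once one asks what the other party might know.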

The difference can perhaps be made clearer if a friend of Miss Julie’s, Master Keynes, offers an evens bet on a match, as he always does. For match A Miss Julie might consider this fair. But for matches B and C she might worry that Master Keynes may have some additional knowledge and hence an unfair advantage.

Suppose now that Keynes offers odds of 2:1. In match A this seems fair. In match C it seems unfair, since if Keynes knows which player is better he will still have the better side of the bet. In match B things are less clear. Does Keynes know Miss Julie’s estimate of the odds? Is he under social pressure to make a fair, perhaps generous, offer? In deciding which matches to bet on, Miss Julie has to consider very different types of factor, so in this sense ‘the uncertainties are very different’.

(This example was suggested by Michael Smithson.)

## Shoes

If a group have a distinctive characteristic, then the use of whole population likelihoods for an in-group crime is biased against a suspect.

For example, suppose that a group of 20 social dancers all wear shoes supplied by X all the time. One of them murders another, leaving a clear shoe-mark. The police suspect Y and find matching shoes. What is the weight of evidence?

If the police fail to take account of the strange habits of the social group, they may simply note that X supplies 1% of the UK’s shoes, and use that to inform the likelihood, yielding moderate evidence against Y. But the most that one should deduce from the evidence is that the culprit was likely to be one of the dance group.
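The effect of the choice of reference population can be sketched as a likelihood ratio. The code below expresses the weight of evidence in decibans (ten times the base-10 logarithm of the likelihood ratio), which is one common convention and an assumption here, as is the simplification that a guilty Y would certainly leave a matching mark.

```python
import math


def weight_of_evidence_db(p_match_if_guilty, p_match_if_innocent):
    """Weight of evidence in decibans: 10 * log10 of the likelihood ratio."""
    return 10 * math.log10(p_match_if_guilty / p_match_if_innocent)


# Whole-population view: an innocent person's shoes match with probability 1%.
whole_population = weight_of_evidence_db(1.0, 0.01)

# In-group view: every dancer wears X's shoes, so an innocent dancer
# would match just as surely as a guilty one.
in_group = weight_of_evidence_db(1.0, 1.0)

print(round(whole_population, 1), round(in_group, 1))
```

Against the whole population the shoe-mark appears to carry real weight; within the dance group it carries none against Y in particular.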

The problem here is that many (most?) people do belong to some group or groups with whom they share distinctive characteristics.

## More illustrations

Yet to be provided.