## Who thinks probability is just a number? A plea.

Many people think – perhaps they were taught it – that it is meaningful to talk about the unconditional probability of ‘Heads’ (i.e. P(Heads)) for a real coin, and even that there are logical or mathematical arguments to this effect. I have been collecting and commenting on works which have been – too widely – interpreted in this way, and quoting their authors in contradiction. De Finetti appeared to be the only example of a respected person who thought that he had provided such an argument. But a friendly economist has just forwarded a link to a recent work that debunks this notion, based on a wider reading of his work.

So, am I done? Does anyone have any seeming mathematical sources for the view that ‘probability is just a number’ for me to consider?

There are some more modern authors who make strong claims about probability, but – unless you know different – they rely on the above, and hence do not need to be addressed separately. I do also opine on a few less well known sources: you can search my blog to check.

Dave Marsay

## Assessing and Communicating Risks and Uncertainty

David Spiegelhalter, Assessing and Communicating Risks and Uncertainty, Science in Parliament vol 69, no. 2, pp. 21-26. This is part of the IMA’s Mathematics Matters: A Crucial Contribution to the Country’s Economy.

This starts with a Harvard study showing that “a daily portion of red meat was associated with an increase in the annual risk of death by 13% over the period of the study”. Does this mean, as the Daily Express claimed, that “10% of all deaths could be avoided”?

David S uses ‘survival analysis’ to show that “a 40 year-old man who eats a quarter-pound burger for his working lunch each day can expect, on average, to live to 79, while his mate who avoids the burger can expect to live to 80.” He goes on: “over a lifetime habit, each daily portion of red meat is associated with about 30 minutes off your life expectancy .. ” (my emphasis.)
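The ‘minutes per portion’ figure can be sanity-checked with simple arithmetic. The sketch below is not the survival analysis used in the article: it merely assumes that the one-year gap between the two quoted life expectancies (79 and 80) is spread evenly over every daily portion eaten from age 40 onwards. The function name and its parameters are illustrative assumptions.

```python
# Back-of-envelope check: spread a one-year gap in life expectancy
# over every daily portion eaten from age 40 onwards.
MINUTES_PER_DAY = 24 * 60

def minutes_lost_per_portion(start_age, expectancy_with, expectancy_without):
    """Average minutes of life expectancy 'lost' per daily portion."""
    years_lost = expectancy_without - expectancy_with
    years_eating = expectancy_with - start_age
    # minutes lost = years_lost * 365 * MINUTES_PER_DAY and
    # portions     = years_eating * 365, so the 365s cancel.
    return years_lost * MINUTES_PER_DAY / years_eating

per_portion = minutes_lost_per_portion(40, 79, 80)
print(f"{per_portion:.0f} minutes per portion")
```

Under these assumptions the result is roughly 37 minutes, the same order as the quoted “about 30 minutes”. Either way it is an average over a population: nothing in the arithmetic refers to any particular individual.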

As a mathematician advising politicians and other decision-makers, I would not be comfortable that policy-makers understood this, and would act appropriately. They might, for example, assume that we should all be discouraged from eating too much red meat.

Even some numerate colleagues with some exposure to statistics might, I think, suppose that their life expectancy was being reduced by eating red meat. But all that is being said is that if a random person were selected from the population as a whole then – knowing nothing about them – a statistician would ‘expect’ them to have a shorter life if they eat red meat. But every actual individual ‘you’ has a family history and many by 40 will have had cholesterol tests. It is not clear what the relevance to them is of the statistician’s ‘averaged’ figures.

Generally speaking, statistics gathered for one set of factors cannot be used to draw precise conclusions about  other sets of factors, much less about individuals. David S’s previous advice at Don’t Know, Can’t Know applies. In my experience, it is not safe to assume that the audience will appreciate these finer points. All that I would take from the Harvard study is that if you eat red meat most days it might be a good idea to consult your doctor. I would also hope that there was research going on into the factors in the apparent dangers.

I would appreciate a link to the original study.

Dave Marsay

## Football – substitution

A Spanish banker has made some interesting observations about a football coach’s substitution choice.

The coach can make a last substitution. He can substitute an attacker for a defender or vice-versa. With more attackers the team is more likely to score but also more likely to be scored against. Substituting a defender makes the final score less uncertain. Hence there is some link with Ellsberg’s paradox. What should the coach do? How should he decide?

A classic solution would be to estimate the probability of getting through the round, depending on the choice made. But is this right?

Pause for thought …

As the above banker observes, a ‘dilemma’ arises in something like the last round of 2012’s group C matches, where the probabilities for each team depend, reflexively, on the decisions of the others. He gives the details in terms of game theory. But what is the general approach?

The classic approach is to set up a game between the coaches. One gets a payoff matrix from which the ‘maximin’ strategy can be determined. Is this the best approach?
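For concreteness, here is a minimal sketch of the classic calculation. The payoff numbers are entirely hypothetical (they are not taken from the banker’s article), and only pure strategies are considered.

```python
# Hypothetical payoffs to 'our' coach (chance of qualifying, in %),
# indexed by our substitution (outer key) and the opponent's (inner key).
PAYOFFS = {
    "attacker": {"attacker": 50, "defender": 30},
    "defender": {"attacker": 40, "defender": 45},
}

def maximin(payoffs):
    """Return the pure strategy whose worst-case payoff is largest."""
    worst_case = {ours: min(row.values()) for ours, row in payoffs.items()}
    best = max(worst_case, key=worst_case.get)
    return best, worst_case[best]

choice, guaranteed = maximin(PAYOFFS)
print(choice, guaranteed)  # defender 40
```

Note that the calculation takes the matrix as given. It says nothing about where the entries come from, or about what to do when the coaches do not agree on them, which is where the doubt comes in.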

If you are in doubt, then that is ‘radical uncertainty’. If not, then consider the alternative in the article: perhaps you should have been in doubt. The implications, as described in the article, have a wider importance, and not just for Spanish bankers.

Other Puzzles, and my notes on uncertainty.

Dave Marsay

## What is the Public Understanding of Risk?

What is the Public Understanding of Risk?
D. Simmons FIMA, MD Analytics, Willis Re

Science in Parliament, Spring 2012, reprinted in the IMA’s Mathematics Today, Vol. 48 No. 3 June 2012

This says very little about the public understanding of risk, and is more about the understanding within insurance and reinsurance companies. It discusses the potential use of probability in legal cases, and says:

There is no reason why such [probabilistic / statistical] tools should not be used in government.

This contrasts oddly with an article in the previous issue:

T. Johnson, Heralding a New Era in Financial Mathematics, April 2012

This starts by referring to Keynes and goes on:

The Bank of England believes that recent developments in financial mathematics have focused on microeconomic issues, such as pricing derivatives. Their concern is whether there is the mathematics to support macroeconomic risk analysis; how the whole system works. While probability theory has an important role to play in addressing these questions, other mathematical disciplines, not usually associated with finance, could prove useful. For example, the Bank’s interest in complexity in networks and dynamical systems has been well documented.

… As well as the Bank of England’s interest in models of market failure and systemic risk, more esoteric topics such as non-ergodic dynamical systems and models of learning in markets would be interesting. Topics associated with mainstream financial mathematics could include control in the presence of liquidity constraints, Knightian uncertainty and behavioural issues and credit modelling.

Thus, there seems to be at least one area where Keynes’ notion that uncertainty cannot always be represented by a single number, probability, is still relevant. Simmons’ contention inevitably lies outside the proper scope of mathematics, and is contentious.

Simmons does say:

All assumptions behind a decision can be seen, discussed, challenged and stressed.

This is a common claim of Bayesians and other probabilists, and has great merit, particularly if one is comparing it with a status quo of relying on gut-feel. But the decision to use a probabilistic approach is not unimportant and we should consider, as Keynes does, the implicit assumptions behind it.

There are actually many different axiomatizations of probability. They all assume that the system under consideration is in some sense regular, and that one is concerned with averages. These conditions seem to apply to insurance and re-insurance, but not always to legal matters or government policy.

My own involvement in reinsurance was in the government’s covering of the market’s failure to cope with the non-stochastic risk presented by terrorism. If it were true that government could address risk in the same way as the reinsurers, what would the point of government cover be? Similarly, in finance, what is the regulatory role of governmental institutions if the probabilistic view of risk is correct? My career has largely been spent in explaining to decision-makers why the people who ultimately carry the risk have to take a different approach to limited liability companies, who can treat risk as if it were a gamble. (I tend to find the tools of Keynes, Turing and Good appropriate to ‘wider risk’.)

Hopefully the IMA president’s up-coming address will enlighten us all.

## The origins of Bayes’ insights: a puzzle

In English speaking countries the Rev. Thomas Bayes is credited with the notion that all kinds of uncertainty can be represented by numbers, such as P(X) and P(X|Y), that can be combined just as one can combine probabilities for gambling (e.g. Bayes’ rule).

You are told that one of these is true:

1. Bayes was in the  habit of attending the local Magistrates Court and making an assessment of the defendant’s guilt based on his appearance, and then comparing it with the verdict.
2. Bayes performed an experiment in which he blindly tossed balls on to a table while an assistant told him whether the ball was to the right or left of the original.

Assign probabilities to these statements. (As usual, I’d be interested in your assumptions, theories etc. If you don’t have any, try here.)

More similar puzzles here.

Dave Marsay

## Avoiding ‘Black Swans’

A UK Blackett Review has reviewed some approaches to uncertainty relevant to the question “How can we ensure that we minimise strategic surprises from high impact low probability risks”. I have already reviewed the report in its own terms. Here I consider the question.

• One person’s surprise may be as a result of another person’s innovation, so we need to consider the up-sides and down-sides together.
• In this context ‘low probability’ is subjective. Things are not surprising unless we didn’t expect them, so the reference to low probability is superfluous.
• Similarly, strategic surprise necessarily relates to things that – if only in anticipation – have high impact.
• Given that we are concerned with areas of innovation and high uncertainty, the term ‘minimise’ is overly ambitious. Reducing would be good. Thinking that we have minimised would be bad.

The question might be simplified to two parts:

1. “How can we ensure that we strategize?”
2. “How can we strategize?”

These questions clearly have very important relative considerations, such as:

• What in our culture inhibits strategizing?
• Who can we look to for exemplars?
• How can we convince stakeholders of the implications of not strategizing?
• What else will we need to do?
• Who might we co-opt or collaborate with?

But here I focus on the more widely-applicable aspects. On the first question the key point seems to be that, where the Blackett review points out the limitations of a simplistic view of probability, there are many related misconceptions and misguided ways that blind us to the possibility of or benefits of strategizing. In effect, as in economics, we have got ourselves locked into ‘no-strategy strategies’, where we believe that a short-term adaptive approach, with no broader or long-term view, is the best, and that more strategic approaches are a snare and a delusion. Thus the default answer to the original question seems to be ‘you don’t  – you just live with the consequences’. In some cases this might be right, but I do not think that we should take it for granted. This leads on to the second part.

We at least need ‘eyes open minds open’, to be considering potential surprises, and keeping score. If (for example, as in International Relations) it seems that none of our friends do better than chance, we should consider cultivating some more. But the scoring and rewarding is an important issue. We need to be sure that our mechanisms aren’t recognizing short-term performance at the expense of long-run sustainability. We need informed views about what ‘doing well’ would look like and what are the most challenging issues, and to seek to learn and engage with those who are doing well. We then need to engage in challenging issues ourselves, if only to develop and then maintain our understanding and capability.

If we take the financial sector as an example, there used to be a view that regulation was not needed. There are two more moderate views:

1. That the introduction of rules would distort and destabilise the system.
2. That although the system is not inherently stable, the government is not competent to regulate, and no regulation is better than bad regulation.

My view is that what is commonly meant by ‘regulation’ is very tactical, whereas the problems are strategic. We do not need a ‘strategy for regulation’: we need strategic regulation. One of the dogmas of capitalism is that it involves ‘free markets’ in which information plays a key role. But in the noughties the markets were clearly not free in this sense. A potential role for a regulator, therefore, would be to perform appropriate ‘horizon scanning’ and to inject appropriate information to ‘nudge’ the system back into sustainability. Some voters would be suspicious of a government that attempts to strategize, but perhaps this form of regulation could be seen as simply better-informed muddling, particularly if there were strong disincentives to take unduly bold action.

But finance does not exist separate from other issues. A UK ‘regulator’ would need to be a virtual beast spanning the departments, working within the confines of regular general elections, and being careful not to awaken memories of Cromwell.

This may seem terribly ambitious, but maybe we could start with reformed concepts of probability, performance, etc.

JS Mill’s views

Dave Marsay

## UK judge rules against probability theory? R v T

Actually, the judge was a bit more considered than my title suggests. In my defence the Guardian says:

“Bayes’ theorem is a mathematical equation used in court cases to analyse statistical evidence. But a judge has ruled it can no longer be used. Will it result in more miscarriages of justice?”

The case involved Nike trainers and appears to be the same as that in a recent appeal judgment, although it doesn’t actually involve Bayes’ rule. It just involves the likelihood ratio, not any priors. An expert witness had said:

“… there is at this stage a moderate degree of scientific evidence to support the view that the [appellant’s shoes] had made the footwear marks.”

The appeal hinged around the question of whether this was a reasonable representation of a reasonable inference.

According to Keynes, Knight and Ellsberg, probabilities are grounded on either logic, statistics or estimates. Prior probabilities are – by definition – never grounded on statistics and in practical applications rarely grounded on logic, and hence must be estimates. Estimates are always open to challenge, and might reasonably be discounted, particularly where one wants to be ‘beyond reasonable doubt’.

Likelihood ratios are typically more objective and hence more reliable. In this case they might have been based on good quality relevant statistics, in which case the judge supposed that it might be reasonable to state that there was a moderate degree of scientific evidence. But this was not the case. Expert estimates had supplied what the available database had lacked, so introducing additional uncertainty. This might have been reasonable, but the estimate appears not to have been based on relevant experience.

My deduction from this is that where there is doubt about the proper figures to use, that doubt should be acknowledged and the defendant given the benefit of it. As the judge says:

“… it is difficult to see how an opinion … arrived at through the application of a formula could be described as ‘logical’ or ‘balanced’ or ‘robust’, when the data are as uncertain as we have set out and could produce such different results.”
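To see how much the conclusion can swing, here is a minimal sketch. It assumes that the shoe would certainly leave a matching mark if it were the source, so that the likelihood ratio is just the reciprocal of the estimated frequency of the sole pattern among other shoes. The frequencies and the verbal thresholds are hypothetical illustrations, not the figures or the scale used in the case.

```python
def likelihood_ratio(p_evidence_if_source, p_evidence_if_not_source):
    """LR = P(evidence | shoe made the mark) / P(evidence | another shoe did)."""
    return p_evidence_if_source / p_evidence_if_not_source

def verbal_label(lr):
    """Purely illustrative thresholds, NOT the scale used in the case."""
    if lr < 10:
        return "weak support"
    if lr < 100:
        return "moderate support"
    return "strong support"

# Hypothetical estimates of how common the sole pattern is among other shoes.
for frequency in (0.005, 0.02, 0.2):
    lr = likelihood_ratio(1.0, frequency)
    print(f"frequency {frequency:.3f}: LR = {lr:.0f} ({verbal_label(lr)})")
```

With these made-up inputs the same formula yields anything from ‘weak’ to ‘strong’ support, depending only on which estimate is fed in.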

This case would seem to have wider implications:

“… we do not consider that the word ‘scientific’ should be used, as … it is likely to give an impression … of a degree of  precision and objectivity that is not present given the current state of this area of expertise.”

My experience is that such estimates are often used by scientists, and the result confounded with ‘science’. I have sometimes heard this practice justified on the grounds that some ‘measure’ of probability is needed and that, if an estimate is needed, it is better that it be given by an independent scientist or analyst than by an advocate or, say, a politician. Maybe so, but perhaps we should indicate when this has happened, and the impact it has on the result. (It might be better to follow the advice of Keynes.)

## Royal Statistical Society

“There is a long history and ample recent experience of misunderstandings relating to statistical information and probabilities which have contributed towards serious miscarriages of justice. … forensic scientists and expert witnesses, whose evidence is typically the immediate source of statistics and probabilities presented in court, may also lack familiarity with relevant terminology, concepts and methods.”

“Guide No 1 is designed as a general introduction to the role of probability and statistics in criminal proceedings, a kind of vade mecum for the perplexed forensic traveller; or possibly, ‘Everything you ever wanted to know about probability in criminal litigation but were too afraid to ask’. It explains basic terminology and concepts, illustrates various forensic applications of probability, and draws attention to common reasoning errors (‘traps for the unwary’).”

The guide is clearly much needed. It states:

“The best measure of uncertainty is probability, which measures uncertainty on a scale from 0 to 1.”

This statement is nowhere supported by any evidence whatsoever. No consideration is given to alternatives, such as those of Keynes, or to the legal concept of “beyond reasonable doubt.”

“The type of probability that arises in criminal proceedings is overwhelmingly of the subjective variety, …”

There is no consideration of Boole and Keynes’ more logical notion, or any reason to take notice of the subjective opinions of others.

“Whether objective expressions of chance or subjective measures of belief, probabilistic calculations of (un)certainty obey the axiomatic laws of probability, …”

But how do we determine whether those axioms are appropriate to the situation at hand? The reader is not told whether the term axiom is to be interpreted in its mathematical or lay sense: as an assumption whose applicability has to be established, or as something that may be assumed without further thought. The first example given is:

“Consider an unbiased coin, with an equal probability of producing a ‘head’ or a ‘tail’ on each coin-toss. …”

Probability here is mathematical. Considering the probability of an untested coin of unknown provenance would be more subjective. It is the handling of the subjective component that is at issue, an issue that the example does not help to address. More realistically:

“Assessing the adequacy of an inference is never a purely statistical matter in the final analysis, because the adequacy of an inference is relative to its purpose and what is at stake in any particular context in relying on it.”

“… an expert report might contain statements resembling the following:
* “Footwear with the pattern and size of the sole of the defendant’s shoe occurred in approximately 2% of burglaries.” …
It is vital for judges, lawyers and forensic scientists to be able to identify and evaluate the assumptions which lie behind these kinds of statistics.”

This is good advice, which the appeal judge took. However, while I have not read and understood every detail of the guidance, it seems to me that the judge’s understanding went beyond the guidance, including its ‘traps for the unwary’.

The statistical guidance cites the following guidance from the forensic scientists’ professional body:

“Logic: The expert will address the probability of the evidence given the proposition and relevant background information and not the probability of the proposition given the evidence and background information.”

This seems sound, but needs supporting by detailed advice. In particular none of the above guidance explicitly takes account of the notion of ‘beyond reasonable doubt’.
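The distinction drawn in the ‘Logic’ principle can be illustrated numerically. The following is only a sketch under stated assumptions: the 2% match frequency and the 999 alternative sources are hypothetical, the true source is assumed to match with certainty, and a uniform prior over all candidates is assumed purely for the sake of the illustration.

```python
def p_source_given_match(match_freq, n_alternatives):
    """P(defendant's shoe is the source | match), assuming a uniform prior
    over the defendant and n_alternatives other equally plausible people."""
    # Bayes' rule: the true source matches with certainty (an assumption),
    # each alternative matches with probability match_freq.
    return 1.0 / (1.0 + match_freq * n_alternatives)

p_e_given_not_source = 0.02  # hypothetical P(evidence | not the source)
posterior = p_source_given_match(p_e_given_not_source, 999)
print(f"P(E | not source) = {p_e_given_not_source}")
print(f"P(not source | E) = {1 - posterior:.2f}")
```

Here the probability of the evidence given the proposition is 0.02, while the probability of the proposition given the evidence is about 0.95. The second figure depends entirely on the assumed prior, which is why the expert is asked to address only the first.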

## Forensic science view

Science and Justice has an article which opines:

“Our concern is that the judgment will be interpreted as being in opposition to the principles of logical interpretation of evidence. We re-iterate those principles and then discuss several extracts from the judgment that may be potentially harmful to the future of forensic science.”

The full article is behind a pay-wall, but I would like to know what principles it is referring to. It is hard to see how there could be a conflict, unless there are some extra principles not in the RSS guidance.

## Criminal Law Review

Forensic Science Evidence in Question argues that:

“The strict ratio of R. v T  is that existing data are legally insufficient to permit footwear mark experts to utilise probabilistic methods involving likelihood ratios when writing reports or testifying at trial. For the reasons identified in this article, we hope that the Court of Appeal will reconsider this ruling at the earliest opportunity. In the meantime, we are concerned that some of the Court’s more general statements could frustrate the jury’s understanding of forensic science evidence, and even risk miscarriages of justice, if extrapolated to other contexts and forms of expertise. There is no reason in law why errant obiter dicta should be permitted to corrupt best scientific practice.”

In this account it is clear that the substantive issues are about likelihoods rather than probabilities, and that consideration of ‘prior probabilities’ is not relevant here. This is different from the Royal Statistical Society’s account, which emphasises subjective probability. However, in considering the likelihood of the evidence conditioned on the suspect’s innocence, it is implicitly assumed that the perpetrator is typical of the UK population as a whole, or of people at UK crime scenes as a whole. But suppose that women are most often murdered by men that they are or have been close to, and that such men are likely to be more similar to each other than people randomly selected from the population as a whole. Then it is reasonable to suppose that the likelihood that the perpetrator is some other male known to the victim will be significantly greater than the likelihood of it being some random man. The use of an inappropriate likelihood introduces a bias.

My advice: do not get involved with people who mostly get involved with people like you, unless you trust them all.

## The Appeal

Prof. Jamieson, an expert on the evaluation of evidence whose statements informed the appeal, said:

“It is essential for the population data for these shoes be applicable to the population potentially present at the scene. Regional, time, and cultural differences all affect the frequency of particular footwear in a relevant population. That data was simply not … . If the shoes were more common in such a population then the probative value is lessened. The converse is also true, but we do not know which is the accurate position.”

Thus the professor is arguing that the estimated likelihood could be too high or too low, and that the defence ought to be given the benefit of the doubt. I have argued that using a whole population likelihood is likely to be actually biased against the defence, as I expect such traits as the choice of shoes to be clustered.
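A minimal sketch of the clustering argument follows. It models the people who might otherwise have left the mark as a mixture of the suspect’s social group and the population at large, and again takes the likelihood ratio to be the reciprocal of the trait’s frequency among those alternatives. All frequencies and weights are hypothetical.

```python
def mixture_frequency(freq_in_group, freq_in_population, weight_on_group):
    """Frequency of the trait among plausible alternative sources, modelled as
    a mixture of the suspect's social group and the population at large."""
    return (weight_on_group * freq_in_group
            + (1 - weight_on_group) * freq_in_population)

FREQ_POPULATION = 0.02  # hypothetical whole-population frequency
FREQ_GROUP = 0.20       # hypothetical frequency within the clustered group

for weight in (0.0, 0.5, 1.0):
    freq = mixture_frequency(FREQ_GROUP, FREQ_POPULATION, weight)
    print(f"weight on group {weight:.1f}: LR = {1 / freq:.1f}")
```

With these numbers the likelihood ratio falls from 50 (whole population only) to about 9 and then to 5 as more weight is put on the clustered group, so relying on the whole-population figure overstates the strength of the evidence against the defendant.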

## Science and Justice

Faigman, Jamieson et al., Response to Aitken et al. on R v T, Science and Justice 51 (2011) 213 – 214

This argues against an unthinking application of likelihood ratios, noting:

• That the defence may reasonably not be able to explain the evidence, so that there may be no reliable source for an innocent hypothesis.
• That assessment of likelihoods will depend on experience, the basis for which should be disclosed and open to challenge.
• That if there is doubt as to how to handle uncertainty, any method ought to be tested in court and not dictated by armchair experts.

On the other hand, when it says “Accepting that probability theory provides a coherent foundation …” it fails to note that coherence is beside the point: is it credible?

## Comment

The current situation seems unsatisfactory, with the best available advice both too simplistic and not simple enough. In similar situations I have co-authored a large document which has then been split into two: guidance for practitioners and justification. It may not be possible to give comprehensive guidance for practitioners, in which case one should aim to give ‘safe’ advice, so that practitioners are clear about when they can use their own judgment and when they should seek advice. This inevitably becomes a ‘legal’ document, but that seems unavoidable.

In my view it should not be simply assumed that the appropriate representation of uncertainty is ‘nothing but a number’. Instead one should take Keynes’ concerns seriously in the guidance and explicitly argue for a simpler approach avoiding ‘reasonable doubt’, where appropriate. I would also suggest that any proposed principles ought to be compared with past cases, particularly those which have turned out to be miscarriages of justice. As the appeal judge did, this might usefully consider foreign cases to build up an adequate ‘database’.

My expectation is that this would show that the use of whole-population likelihoods as in R v T is biased against defendants who are in a suspect social group.

More generally, I think that any guidance ought to apply to my growing uncertainty puzzles, even if it only cautions against a simplistic application of any rule in such cases.