Aitken et al., Communicating and Interpreting Statistical Evidence in the Administration of Criminal Justice, Royal Statistical Society, 2010

1. Fundamentals of Probability and Statistical Evidence in Criminal Proceedings

1. Probability and Statistics in Forensic Contexts

There is a long history and ample recent experience of misunderstandings relating to statistical information and probabilities which have contributed towards serious miscarriages of justice.

There is … no group of professionals working today in the criminal courts that can afford to be complacent about its members’ competence in statistical method and probabilistic reasoning.

The best measure of uncertainty is probability, which measures uncertainty on a scale from 0 to 1.

p(Guilty | I) + p(Innocent | I) = 1 [where I is the information available].

2. Basic Concepts of Probabilistic Inference and Evidence

Probative value (or the “weight” of the evidence) is the measure of the extent to which relevant evidence contributes towards proving, or disproving, a fact in issue. This is a matter of degree.

It is vital for judges, lawyers and forensic scientists to be able to identify and evaluate the assumptions which lie behind these kinds of statistics.

All probabilities are predicated (or “conditioned”) on specified assumptions. This is merely another way of expressing the inherent conditionality of probability as a species of reasoning under uncertainty.

Bayes’ Theorem is a codification of the reasoning that should be applied in the assessment of evidence. It is a statement of logic. Its application ensures evidence is assessed rationally.
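
For reference, the odds form of the theorem can be written as below. The notation is my own assumption rather than a quotation from the guide: H_p and H_d stand for the prosecution and defence propositions, E for the evidence, and I for the background information on which every term is conditioned.

```latex
\[
\underbrace{\frac{P(H_p \mid E, I)}{P(H_d \mid E, I)}}_{\text{posterior odds}}
=
\underbrace{\frac{P(E \mid H_p, I)}{P(E \mid H_d, I)}}_{\text{likelihood ratio}}
\times
\underbrace{\frac{P(H_p \mid I)}{P(H_d \mid I)}}_{\text{prior odds}}
\]
```

The middle term is the only one that depends on how probable the evidence is under each proposition; the prior odds depend on everything else known about the case.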

3. Interpreting Probabilistic Evidence – Anticipating Traps for the Unwary

[P]robability, statistical evidence, and inferential reasoning associated with them do seem to be especially prone to recurrent errors and misinterpretation.

[T]he following analytically distinct (though in practice, often compounded) reasoning errors will be examined and elucidated:

(c) illegitimately transposing the conditional (“the prosecutor’s fallacy”);

(d) source probability error;

(e) underestimating the value of probabilistic evidence;

(f) probability (“another match”) error;

(g) numerical conversion error;

(h) false positive fallacy;

(i) fallacious inferences of uniqueness; and

(j) unwarranted assumptions of independence.
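
Error (c) is perhaps the easiest of these to illustrate numerically. The sketch below is mine, not the guide's, and rests on loudly simplified assumptions: exactly one true source among a hypothetical pool of possible sources, each equally likely beforehand, a test that always matches the true source, and invented figures for the match probability and the pool size.

```python
def posterior_source_probability(match_prob: float, population: int) -> float:
    """P(accused is the source | match).

    Assumes exactly one true source among `population` people, a uniform
    prior over them, and a test that always matches the true source.
    """
    prior = 1.0 / population
    # Bayes: P(S|M) = P(M|S)P(S) / [P(M|S)P(S) + P(M|not S)P(not S)]
    numerator = 1.0 * prior
    denominator = numerator + match_prob * (1.0 - prior)
    return numerator / denominator


match_prob = 1e-6       # hypothetical P(match | accused is not the source)
population = 1_000_000  # hypothetical number of possible sources

fallacious = 1.0 - match_prob  # transposed conditional: "P(source | match)"
correct = posterior_source_probability(match_prob, population)

print(f"Transposed (fallacious) figure: {fallacious:.6f}")
print(f"Posterior under the stated assumptions: {correct:.3f}")
```

Under these assumptions a one-in-a-million match probability leaves the accused roughly as likely as not to be the source, which is very different from the near-certainty that the transposed reading suggests. Different priors would give different answers, which is the point.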

4. Summary and Checklist

[C]onfidence intervals are regarded as appropriate expressions of uncertainty in social science and elsewhere, but they are not an appropriate way of evaluating evidence in criminal proceedings because they are irremediably arbitrary and unjustifiably cause valuable evidence to “fall off a cliff”.

Ideally, expert witnesses should testify to the likelihood of the evidence under two competing propositions (or assumptions), the prosecution’s proposition and the competing proposition advanced by the defence (which may simply be the negation of the prosecution’s proposition in the absence of fuller pre-trial defence disclosure). In other words, experts should testify to the likelihood ratio.

This guidance seems reasonable, provided that the probability estimates are well founded, and not purely subjective or otherwise prejudiced. The main concerns I would have are that:

• If the defendant is innocent, they may not be able to explain the evidence.
• If the defendant is atypical of the population as a whole, the population statistics may mislead.

As an analogy, consider Colin Powell’s case that the Saddam regime still had WMD. We now know that some of the evidence was fabricated, so it is not surprising that Saddam could not explain it. We also know that, contrary to our assumptions, Saddam was more afraid of his neighbours than of the US, and so behaved very oddly. These two factors made the evidence against him appear stronger than it was.

The guide has the following example:

Suppose that a bloody footwear mark taken from the scene of the crime is said to “match” (in some specified sense of what constitutes a “match”) the sole of a shoe in the accused’s possession.

If the accused is innocent, they may not be able to explain the match. As the guide suggests, one needs to assess the probability of the perpetrator’s mark matching by coincidence, which means assessing the proportion of some ‘subject population’ that shares the mark. But there is a subtlety. If the crime is the murder of a young woman who seems to have entertained the murderer before being killed, then the obvious subject population is some group of men known to the victim. But suppose, for example, that the match is to a large fashionable trainer and the accused is known to have been intimate with the victim. Then possibly the victim had a preference for men with large feet and fashionable footwear. In this case the probability of a coincidental match would be higher than if she had simply been intimate with men selected at random from those she knew, possibly much higher. If there are unusual circumstances, it may be that experts in shoe marks, for example, are not qualified to make the necessary probability judgments. One needs to take the particular circumstances into account.
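
To make the sensitivity concrete, here is a minimal sketch with invented proportions. It assumes the mark is certain to match if the accused left it, so that the likelihood ratio is simply one over the proportion of the subject population whose footwear would also match. The function name and both proportions are hypothetical and serve only to show how much the choice of subject population matters.

```python
def likelihood_ratio(match_freq: float) -> float:
    """Likelihood ratio for a reported match.

    Assumes the mark certainly matches if the accused left it, and matches
    with probability `match_freq` if someone else from the subject
    population left it.
    """
    if not 0.0 < match_freq <= 1.0:
        raise ValueError("match_freq must be in (0, 1]")
    return 1.0 / match_freq


# Hypothetical proportions of men whose footwear would match the mark.
freq_among_acquaintances = 0.02  # men the victim knew, taken at random
freq_among_intimates = 0.20      # if she favoured men with such footwear

lr_naive = likelihood_ratio(freq_among_acquaintances)
lr_adjusted = likelihood_ratio(freq_among_intimates)

print(f"LR using all acquaintances: {lr_naive:.0f}")
print(f"LR using likely intimates:  {lr_adjusted:.0f}")
```

With these made-up figures the weight of the evidence falls tenfold once the narrower subject population is used, even though nothing about the shoe mark itself has changed.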

Dave Marsay