What logical term or concept ought to be more widely known?

Various authors, What scientific term or concept ought to be more widely known? Edge, 2017.

INTRODUCTION: SCIENTIA

Science—that is, reliable methods for obtaining knowledge—is an essential part of psychology and the social sciences, especially economics, geography, history, and political science. …

Science is nothing more nor less than the most reliable way of gaining knowledge about anything, whether it be the human spirit, the role of great figures in history, or the structure of DNA.

Contributions

As against others on:

(This is as far as I’ve got.)

Comment

I’ve grouped the contributions according to whether or not I think they give due weight to the notion of uncertainty as expressed in my blog. Interestingly, Steven Pinker seems not to give due weight to it in his article, whereas Nicholas G. Carr credits him with some profound insights (in the first of the second batch). So maybe I am not reading them right.

My own thinking

Misplaced Concreteness

Whitehead’s fallacy of misplaced concreteness, also known as the reification fallacy, “holds when one mistakes an abstract belief, opinion, or concept about the way things are for a physical or ‘concrete’ reality”. Most of what we think of as knowledge is ‘known about a theory’ rather than truly ‘known about reality’. The difference seems to matter in psychology, sociology, economics and physics. This is not a term or concept of any particular science, but rather a seeming ‘brute fact’ of ‘the theory of science’ that perhaps ought to have been given attention in the above article.

Morphogenesis

My own specific suggestion, to illustrate the above fallacy, would be Turing’s theory of ‘Morphogenesis’. Its particular predictions seem to have been confirmed ‘scientifically’, but it is essentially a logical / mathematical theory. If, as the introduction to the Edge article suggests, science is “reliable methods for obtaining knowledge”, then it seems to me that logic and mathematics are more reliable than empirical methods, and deserve some special recognition. Although I must concede that it may be hard to tell logic from pseudo-logic, and that unless you can do so my distinction is potentially dangerous.

The second law of thermodynamics, and much common-sense rationality, assume a situation in which the law of large numbers applies. But Turing adds to the second law’s notion of random dissipation a notion of relative structuring (as in gravity), to show that ‘critical instabilities’ are inevitable. These are inconsistent with the law of large numbers, so the assumptions of the second law of thermodynamics (and much else) cannot be true: the universe cannot be ‘closed’ in the sense the law requires.
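
Turing’s own argument is algebraic, but the gist can be sketched numerically. The following fragment uses my own illustrative numbers, not Turing’s equations: a two-species reaction-diffusion system whose uniform state is stable when well mixed, yet develops growing spatial structure once unequal diffusion (‘relative structuring’) is added.

```python
import numpy as np

# A rough numerical sketch of the 'critical instability' point (the numbers are
# my own illustrative choices, not Turing's 1952 equations).

# Jacobian of the reaction kinetics at the uniform steady state
# (an assumed activator-inhibitor example).
J = np.array([[0.5, -1.0],
              [2.0, -1.5]])
Du, Dv = 0.01, 1.0   # the activator diffuses much more slowly than the inhibitor

# Without diffusion the steady state is stable: both eigenvalues have
# negative real parts, so the well-mixed system just settles down.
print("reaction-only eigenvalues:", np.linalg.eigvals(J))

# With diffusion, a perturbation ~ exp(i*q*x) grows at a rate given by the
# eigenvalues of J - q^2 * diag(Du, Dv).
for q in np.linspace(0.0, 8.0, 17):
    growth = np.linalg.eigvals(J - q**2 * np.diag([Du, Dv])).real.max()
    print(f"wavenumber q = {q:3.1f}: max growth rate = {growth:+.3f}")

# A band of intermediate wavenumbers has positive growth rates: spatial
# structure emerges even though 'law of large numbers' reasoning about the
# well-mixed system predicts none.
```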

Implications

The assumptions of the second law seem to leave no room for free will, and hence no reason to believe in our agency, and hence no point in any of the contributions to Edge: they are what they are and we do what we do. But Pinker does not go so far: he simply notes that if things inevitably degrade we do not need to beat ourselves up, or look for scapegoats, when things go wrong. But this can be true even if the second law does not apply. If we take Turing seriously, then a seemingly permanent status quo can contain the reasons for its own destruction, so that turning a blind eye and doing nothing can mean sleep-walking to disaster. Where Pinker concludes:

[An] underappreciation of the Second Law lures people into seeing every unsolved social problem as a sign that their country is being driven off a cliff. It’s in the very nature of the universe that life has problems. But it’s better to figure out how to solve them—to apply information and energy to expand our refuge of beneficial order—than to start a conflagration and hope for the best.

This would seem to follow more clearly from the theory of morphogenesis than from the second law. Turing’s theory also goes some way towards suggesting, or even explaining, the items in the second batch. So I commend it.

 

Dave Marsay

 

 

Heuristics or Algorithms: Confused?

The Editor of the New Scientist (Vol. 3176, 5 May 2018, Letters, p54) opined, in response to Adrian Bowyer’s wish to distinguish between ‘heuristics’ and ‘algorithms’ in AI, that:

This distinction is no longer widely made by practitioners of the craft, and we have to follow language as it is used, even when it loses precision.

Sadly, I have to accept that AI folk consistently fail to respect a widely held distinction, but it seems odd that their failure should impose an obligation on the New Scientist, which has a much broader readership than just AI folk. I would agree that, in addressing audiences that include significant sectors that fail to make some distinction, we need to be aware of the fact; but if the distinction is relevant – as Bowyer argues – surely we should explain it.

According to the freedictionary:

Heuristic: adj 1. Of or relating to a usually speculative formulation serving as a guide in the investigation or solution of a problem.

Algorithm: n: A finite set of unambiguous instructions that, given some set of initial conditions, can be performed in a prescribed sequence to achieve a certain goal and that has a recognizable set of end conditions.

It also has this quote:

heuristic: of or relating to or using a general formulation that serves to guide investigation.

algorithmic: of or relating to or having the characteristics of an algorithm.

But perhaps this is not clear?

AI practitioners routinely apply algorithms as heuristics in the same way that a bridge designer may routinely use a computer program. We might reasonably regard a bridge-designing app as good if it correctly implements best practice in bridge-building, but this is not to say that a bridge designed using it would necessarily be safe, particularly if it has significant novelties (as in London’s wobbly bridge).

Thus any app (or other process) has two sides: as an algorithm and as a heuristic. As an algorithm, we ask whether it meets its concrete goals. As a heuristic, we ask whether it solves a real-world problem. A process for identifying some kind of undesirable would be regarded as good algorithmically if it conformed to our idea of the undesirables, but may still be poor heuristically. In particular, good AI would seem to depend on someone understanding at least the factors involved in the problem. This may not always be the case, no matter how ‘mathematically sophisticated’ the algorithms involved.
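
To make the two sides concrete, here is a toy sketch; the rule, threshold and transactions are all hypothetical.

```python
# A toy sketch of the two sides (the rule, threshold and transactions are all
# hypothetical): the algorithmic question is whether the code meets its stated
# specification; the heuristic question is whether that specification solves
# the real-world problem. The two can come apart.

def flag_undesirable(transaction: dict, threshold: float = 10_000.0) -> bool:
    """Implements the agreed rule exactly: flag any amount over the threshold."""
    return transaction["amount"] > threshold

# Algorithmic check: does the code meet its concrete goal? (Yes.)
assert flag_undesirable({"amount": 15_000.0}) is True
assert flag_undesirable({"amount": 500.0}) is False

# Heuristic check: does the rule solve the real problem? Someone who splits one
# large transfer into many small ones passes untouched, however correctly the
# rule has been implemented.
split_transfers = [{"amount": 9_500.0} for _ in range(10)]
print("flags raised on split transfers:",
      sum(flag_undesirable(t) for t in split_transfers))   # prints 0
```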

Perhaps you could improve on this attempted explanation?

Dave Marsay

Mathematical Modelling

Mathematics and modelling in particular is very powerful, and hence can be very risky if you get it wrong, as in mainstream economics. But is modelling inappropriate – as has been claimed – or is it just that it has not been done well enough?

As a mathematician who has dabbled in modelling and economics I thought I’d try my hand at modelling economies. What harm could there be?

My first notion is that actors’ activity is habitual.

My second is that habits persist until there is a ‘bad’ experience, in which case they are revised. What is taken account of, what counts as ‘bad’ and how habits are replaced or revised are all subject to meta-habits (habits about habits).

In particular, mainstream economists suppose that actors seek to maximise their utilities, and they never revise this approach. But this may be too restrictive.

Myself, I would add that most actors mostly seek to copy others and also tend to discount experiences and lessons identified by previous generations.

With some such ‘axioms’ (suitably formalised), one can predict booms and busts leading to new ‘epochs’ characterised by dominant theories and habits. For example, suppose that some actors habitually borrow as much as they can to invest in an asset (such as a house for rent), and the asset class performs well. Then they will continue in their habit, and others who have done less well will increasingly copy them, fuelling an asset-price boom. But no asset class is worth an infinite amount, so the boom must end, resulting in disappointment and changes in habit, which may again be copied by those who are losing out on the asset class, giving a bust. Thus one has an ‘emergent behaviour’ that contradicts some of the implicit mainstream assumptions about rationality (such as ‘ergodicity’), and hence undermines the possibility of meaningful ‘expectations’ and of utility functions to be maximised. This is not to say that such things cannot exist, only that if they do exist it must be due to some economic law as yet unidentified, and we need an alternative explanation for booms and busts.
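
For what it is worth, here is a toy formalisation of those ‘axioms’ (entirely my own, with arbitrary parameters, and not a calibrated model of any economy); a boom and a bust emerge without any actor maximising a utility function.

```python
import random

# A toy formalisation of the 'axioms' above (my own invention, arbitrary
# parameters). Actors hold or drop a habit of buying an asset, copy habits
# that appear to be paying off, and only revise after a 'bad' experience.

random.seed(2)
N, STEPS = 500, 150
cap = 5.0                          # no asset class is worth an infinite amount
price, last_price = 1.0, 0.9       # start with the price having just risen
holding = [random.random() < 0.3 for _ in range(N)]   # initial habits
history = []

for t in range(STEPS):
    rose = price > last_price * 1.005          # did the habit pay off noticeably?
    for i in range(N):
        other = random.randrange(N)
        if not holding[i] and holding[other] and rose and random.random() < 0.3:
            holding[i] = True                  # copy a habit that seems to be winning
        elif holding[i] and not rose and random.random() < 0.3:
            holding[i] = False                 # a 'bad' experience: revise the habit

    demand = sum(holding) / N
    last_price = price
    price *= 1.0 + 0.3 * (demand - price / cap)   # simple assumed price dynamics
    history.append(price)

peak = max(history)
print(f"peak price {peak:.2f} at step {history.index(peak)}; final price {history[-1]:.2f}")
# With these assumed settings the price typically booms towards the cap and
# then crashes once the gains stall and habits are revised.
```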

What I take from this is that mathematical models seem possible and may even provide insights. I do not assume that a model that is adequate in the short run will necessarily continue to be adequate, and my model shows how economic epochs can be self-destructing. To me, the problem in economics is not so much that it uses mathematics, and in particular mathematical modelling, but that it does so badly. My ‘axioms’ mimic the approach that Einstein took to physics: they replace an absolutist model with a relativistic one, and show that it makes a difference. In my model there are no magical ‘expectations’; rather, actors may have realistic habits and expectations, based on their experience and interpretation of the media and other sources, which may be ‘correct’ (or at least not falsified) in the short run, but which cannot provide adequate predictions for the longer run. To survive a change of epochs our actors would need at least to be following some actors who were monitoring and thinking about the overall situation more broadly and deeply than those who focus on short-run utility. (Something that currently seems lacking.)

David Marsay

Can polls be reliable?

Election polls in many countries have seemed unusually unreliable recently. Why? And can they be fixed?

The most basic observation is that if one has a random sample of a population in which x% has some attribute then it is reasonable to estimate that x% of the whole population has that attribute, and that this estimate will tend to be more accurate the larger the sample is. In some polls sample size can be an issue, but not in the main political polls.
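
For concreteness, the arithmetic behind that observation is just the standard formula for the sampling error of a proportion (the figures below are arbitrary, assumed values):

```python
import math

# The standard error of a sample proportion shrinks like 1/sqrt(n), so the
# usual 'margin of error' is small at the sample sizes used in national polls.
# No polling data involved; the 40% figure is an arbitrary assumed value.

p = 0.40   # assumed true proportion holding the attribute
for n in (100, 1_000, 10_000):
    se = math.sqrt(p * (1 - p) / n)
    print(f"n = {n:6d}: standard error {se:.3f}, approx 95% margin +/- {1.96 * se:.1%}")
```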

A fundamental problem with most polls is that the ‘random’ sample may not be drawn uniformly from the population, with some sub-groups over- or under-represented. Political polls have some additional issues that are sometimes blamed:

  • People with certain opinions may be reluctant to express them, or may even mislead.
  • There may be a shift in opinions with time, due to campaigns or events.
  • Different groups may differ in whether they actually vote, for example depending on the weather.

I also think that in the UK the trend to postal voting may have confused things, as postal voters will have missed out on the later stages of campaigns, and on later events. (Which were significant in the UK 2017 general election.)

Pollsters have a lot of experience at compensating for these distortions, and are increasingly using ‘sophisticated mathematical tools’. How is this possible, and is there any residual uncertainty?

Back to mathematics, suppose that we have a science-like situation in which we know which factors (e.g. gender, age, social class ..) are relevant. With a large enough sample we can partition the results by combination of factors, measure the proportions for each combination, and then combine these proportions, weighting by the prevalence of the combinations in the whole population. (More sophisticated approaches are used for smaller samples, but they only reduce the statistical reliability.)
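
Here is a minimal sketch of that weighting step (‘post-stratification’); the group definitions, sample counts and population shares are all invented for illustration.

```python
# A minimal sketch of the weighting step just described ('post-stratification').
# All figures are invented for illustration.

# (group) -> (sample size, number supporting party X)
sample = {
    ("female", "under 40"): (150, 90),
    ("female", "40+"):      (250, 110),
    ("male",   "under 40"): (120, 66),
    ("male",   "40+"):      (280, 112),
}
# each group's share of the whole electorate, e.g. from census data (assumed)
population_share = {
    ("female", "under 40"): 0.24,
    ("female", "40+"):      0.27,
    ("male",   "under 40"): 0.23,
    ("male",   "40+"):      0.26,
}

naive = sum(k for _, k in sample.values()) / sum(n for n, _ in sample.values())
weighted = sum(population_share[g] * k / n for g, (n, k) in sample.items())

print(f"raw sample proportion:    {naive:.1%}")
print(f"post-stratified estimate: {weighted:.1%}")
# The two differ because the sample over-represents some groups. The weighting
# corrects for that - but only for the factors we thought to include.
```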

Systematic errors can creep in in two ways:

  1. Instead of using just the poll data, one relies on ‘laws of politics’ (such as the effect of rain) or other heuristics (such as that the swing among postal votes will be similar to that among votes in person), and these may be wrong.
  2. An important factor is missed. (For example, people with teenage children or grandchildren may vote differently from their peers when student fees are an issue.)

These issues have analogues in the science lab. In the first case one is using the wrong theory to interpret the data, and so the results are corrupted. In the second case one has some unnoticed ‘uncontrolled variable’ that can really confuse things.

A polling method using fixed factors and laws will only be reliable when they reasonably accurately capture the attributes of interest, and not when ‘the nature of politics’ is changing, as it often does and as it seems to be doing right now in North America and Europe. (According to game theory, one should expect such changes when coalitions change or are under threat, as they are.) To do better, the polling organisation would need to understand the factors that the parties were bringing into play at least as well as the parties themselves, and possibly better. This seems unlikely, at least in the UK.

What can be done?

It seems to me that polls used to be relatively easy to interpret, possibly because they were simpler. Our more sophisticated contemporary methods make more detailed assumptions. To interpret them we would need to know what these assumptions were. We could then ‘aim off’, based on our own judgment. But this would require pollsters to publish some details of their methods, which they are naturally loth to do. So what could be done? Maybe we could have some agreed simple methods and publish findings as ‘extrapolations’ to inform debate, rather than as predictions. We could then factor in our own assumptions. (For example, our assumptions about student turnout.)

So, I don’t think that we can expect reliable poll findings that are predictions, but possibly we could have useful poll findings that would inform debate and allow us to take our own views. (A bit like any ‘big data’.)

Dave Marsay

 

Mathematical modelling

I had the good fortune to attend a public talk on mathematical modelling, organised by the University of Birmingham (UK). The speaker, Dr Nira Chamberlain CMath FIMA CSci, is a council member of the appropriate institution, and so may reasonably be thought to be speaking for mathematicians generally.

He observed that there were many professional areas that used mathematics as a tool, and that they generally failed to see the need for professional mathematicians as such. He thought that mathematical modelling was one area where – at least for the more important problems – mathematicians ought to be involved. He gave examples of modelling, including one of the financial crisis.

The main conclusion seemed very reasonable, and in line with the beliefs of most ‘right thinking’ mathematicians. But on reflection, I wonder if my non-mathematician professional colleagues would accept it. In the 19th century, professional mathematicians were proclaiming it a mathematical fact that the physical world conformed to classical geometry. On this basis, mathematicians do not seem to have any special ability to produce valid models. Indeed, in the run-up to the financial crash there were too many professional mathematicians advocating mainstream mathematical models of finance and economies in which the crash was impossible.

In Dr Chamberlain’s own model of the crash, it seems that deregulation and competition led to excessive risk-taking, and these risks eventually materialised. A colleague who is a professional scientist but not a professional mathematician has advised me that this general model was recognised by the UK at the time of our deregulation, but that it was assumed (as Greenspan assumed) that somehow some institution would step in to foreclose this excessive risk-taking. To me, the key thing to note is that the risks being taken were systemic and not necessarily recognised by those taking them. The virtue of a model does not just depend on it being correct in some abstract sense, but also on whether it ‘has traction’ with relevant policy and decision makers and takers. Thus, reflecting on the talk, I am left accepting the view of many of my colleagues that some mathematical models are too important to be left to mathematicians.

If we have a thesis and an antithesis, then the synthesis that I and my colleagues have long come to is that any important mathematical model needs to be a collaborative endeavour, with mathematicians having a special role in challenging, interpreting and (potentially) developing the model, including developing (as Dr Chamberlain said) new mathematics where necessary. A modelling team will often need mathematicians ‘on tap’ to apply various methods and theories, and this is common. But what is also needed is mathematical insight into the appropriateness of these tools and the meaning of the results. This requires people who are more concerned with their mathematical integrity than with satisfying their non-mathematical pay-masters. It seems to me that these are a sub-set of those who are generally regarded as ‘professional’. How do we identify such people?

Dave Marsay 

 

Uncertainty is not just probability

I have just had my paper published, based on the discussion paper referred to in a previous post. On Facebook it is described as:

An understanding of Keynesian uncertainties can be relevant to many contemporary challenges. Keynes was arguably the first person to put probability theory on a sound mathematical footing. …

So it is not just for economists. I could be tempted to discuss the wider implications.

Comments are welcome here, at the publisher’s web site or on Facebook. I’m told that it is also discussed on Google+, Twitter and LinkedIn, but I couldn’t find it – maybe I’ll try again later.

Dave Marsay

Artificial Intelligence?

The subject of ‘Artificial Intelligence’ (AI) has long provided ample scope for long and inconclusive debates. Wikipedia seems to have settled on a view, which we may take as a straw man:

Every aspect of learning or any other feature of intelligence can be so precisely described that a machine can be made to simulate it. [Dartmouth Conference, 1956]

The appropriately programmed computer with the right inputs and outputs would thereby have a mind in exactly the same sense human beings have minds. [John Searle’s straw-man hypothesis]

Readers of my blog will realise that I agree with Searle that his hypothesis is wrong, but for different reasons. It seems to me that mainstream AI (mAI) is about being able to take instruction. This is a part of learning, but by no means all. Thus – I claim – mAI is about a sub-set of intelligence. In many organisational settings it may be that sub-set which the organisation values. It may even be that an AI that ‘thought for itself’ would be a danger. For example, in old discussions about whether or not some type of AI could ever act as a G.P. (General Practitioner – first-line doctor), the underlying issue has been whether G.P.s ‘should’ think for themselves, or just apply their trained responses. My own experience is that sometimes G.P.s doubt the applicability of what they have been taught, and that sometimes this is ‘a good thing’. In effect, we sometimes want to train people, or otherwise arrange for them to react in predictable ways, as if they were machines.

mAI can create better machines, and thus has many key roles to play. But between mAI and ‘superhuman intelligence’ there seems to be an important gap: the kind of intelligence that makes us human. Can machines display such intelligence? (Can people, in organisations that treat them like machines?)

One successful mainstream approach to AI is to work with probabilities, such as P(A|B) (‘the probability of A given B’), making extensive use of Bayes’ rule, and such an approach is sometimes thought to be ‘logical’, ‘mathematical’, ‘statistical’ and ‘scientific’. But, mathematically, we can generalise the approach by taking account of some context, C, using Jack Good’s notation P(A|B:C) (‘the probability of A given B, in the context C’). AI that is explicitly or implicitly statistical is more successful when it operates within a definite fixed context, C, for which the appropriate probabilities are (at least approximately) well-defined and stable. For example, training within an organisation will typically seek to enable staff (or machines) to characterise their job sufficiently well for it to become routine. In practice ‘AI’-based machines often show a little intelligence beyond that described above: they will monitor the situation and ‘raise an exception’ when the situation is too far outside what they ‘expect’. But this just points to the need for a superior intelligence to resolve the situation. Here I present some thoughts.

When we state ‘P(A|B)=p’ we are often not just asserting the probability relationship: it is usually implicit that ‘B’ is the appropriate condition to consider if we are interested in ‘A’. Contemporary mAI usually takes the conditions as given, and computes ‘target’ probabilities from given probabilities. Whilst this requires a kind of intelligence, it seems to me that humans will sometimes also revise the conditions being considered, and this requires a different type of intelligence (not just the ability to apply Bayes’ rule). For example, astronomers who refine the value of relevant parameters are displaying some intelligence and are ‘doing science’, but those first in the field, who determined which parameters are relevant, employed a different kind of intelligence and were doing a different kind of science. What we need, at least, is an appropriate way of interpreting and computing ‘probability’ to support this enhanced intelligence.
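
A toy calculation (all numbers invented) may help: the same Bayes’-rule computation gives quite different values for P(A|B) in two contexts, so a figure learned in one context cannot safely be carried over to another, however correctly the rule itself is applied.

```python
# A toy calculation (all numbers invented) of why the context C in Good's
# P(A|B:C) matters: the same Bayes'-rule computation gives quite different
# answers in two contexts.

def posterior(prior_A, p_B_given_A, p_B_given_notA):
    """Bayes' rule: P(A|B) from P(A), P(B|A) and P(B|not A)."""
    p_B = p_B_given_A * prior_A + p_B_given_notA * (1 - prior_A)
    return p_B_given_A * prior_A / p_B

# Context C1: the conditions under which the probabilities were learned (assumed).
print("P(A|B:C1) =", round(posterior(prior_A=0.10, p_B_given_A=0.9, p_B_given_notA=0.2), 3))

# Context C2: the same evidence B, but the base rate and the meaning of B have
# drifted (assumed). The figure learned in C1 no longer applies.
print("P(A|B:C2) =", round(posterior(prior_A=0.01, p_B_given_A=0.6, p_B_given_notA=0.2), 3))
```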

The notions of Whitehead, Keynes, Russell, Turing and Good seem to me a good start, albeit they need explaining better – hence this blog. Economics is perhaps an example: the notion of probability routinely used there would be appropriate if we were certain about some fundamental assumptions. But are we? At least we should realise that it is not logical to attempt to justify those assumptions by reasoning with concepts that implicitly rely on them.

Dave Marsay

Evolution of Pragmatism?

A common ‘pragmatic’ approach is to keep doing what you normally do until you hit a snag, and (only) then to reconsider. Whereas Lamarckian evolution would lead to the ‘survival of the fittest’, with everyone adapting to the current niche, tending to yield a homogeneous population, Darwinian evolution has survival of the maximal variety of all those who can survive, with characteristics only dying out when they are not viable. This evolution of diversity makes for greater resilience, which is maybe why ‘pragmatic’ Darwinian evolution has evolved.

The products of evolution are generally also pragmatic, in that they have virtually pre-programmed behaviours which ‘unfold’ in the environment. Plants grow and procreate, while animals have a richer variety of behaviours, but still tend just to do what they do. But humans can ‘think for themselves’ and be ‘creative’, and so have the possibility of not being just pragmatic.

I was at a (very good) lecture by Alice Roberts last night on the evolution of technology. She noted that many creatures use tools, but humans seem to be unique in that, at some critical population mass, the manufacture and use of tools becomes sustained through teaching, copying and co-operation. It occurred to me that much of this could be pragmatic. After all, until recently development has been very slow, and so may well have been driven by specific practical problems rather than a continual search for improvements. Also, the more recent upswing of innovation seems to have been associated with an increased mixing of cultures and a decreased intolerance of people who think for themselves.

In biological evolution mutations can lead to innovation, so evolution is not entirely pragmatic, but their impact is normally limited by the need to fit the current niche, so evolution typically appears to be pragmatic. The role of mutations is more to increase the diversity of behaviours within the niche than to drive innovation as such.

In social evolution there will probably always have been mavericks and misfits, but the social pressure has been towards conformity. I conjecture that such an environment has favoured a habit of pragmatism. These days, it seems to me, a better approach would be more open-minded, inclusive and exploratory, but possibly we do have a biologically-conditioned tendency to be overly pragmatic: to mistake conventions for facts and heuristics for laws of nature, and not to challenge widely held beliefs.

The financial crash of 2008 was blamed by some on mathematics. This seems ridiculous. But the post-Cold War world was largely one of growth, with the threat of nuclear devastation much diminished, so it might be expected that pragmatism would be favoured. Thus powerful tools (mathematical or otherwise) could be taken up and exploited pragmatically, without enough consideration of the potential dangers. It seems to me that this problem is much broader than economics, but I wonder what the cure is, apart from better education and more enlightened public debate?

Dave Marsay

 

 

Traffic bunching

In heavy traffic, such as on motorways in rush-hour, there is often oscillation in speed and there can even be mysterious ’emergent’ halts. The use of variable speed limits can result in everyone getting along a given stretch of road quicker.

Soros (worth reading) has written an article that suggests that this is all to do with the humanity and ‘thinking’ of the drivers, and that something similar is the case for economic and financial booms and busts. This might seem to indicate that ‘mathematical models’ were a part of our problems, not of the solutions. So I suggest the following thought experiment:

Suppose a huge number of identical driverless cars with deterministic control functions all try to go along the same road, seeking to optimise performance in terms of ‘progress’ and fuel economy. Will they necessarily succeed, or might there be some ‘tragedy of the commons’ that can only be resolved by some overall regulation? What are the critical factors? Is the nature of the ‘brains’ one of them?
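
As a first step towards an answer, here is a rough simulation of the thought experiment using a version of the much-studied ‘optimal velocity’ car-following model (Bando et al.); the parameter values are my own choices.

```python
import math

# A rough sketch of the thought experiment, using the 'optimal velocity'
# car-following model (Bando et al.); parameter values are my own choices.
# The cars are identical and deterministic, yet for these settings the uniform
# flow is unstable and a tiny perturbation should grow into stop-and-go waves,
# with no human 'thinking' involved.

N, L = 100, 200.0             # cars on a ring road of length L (headway 2.0 each)
a, dt, T = 1.0, 0.05, 500.0   # driver sensitivity, time step, simulated time

def V(h):
    """Desired ('optimal') speed as a function of the headway h."""
    return math.tanh(h - 2.0) + math.tanh(2.0)

x = [i * L / N for i in range(N)]   # evenly spaced cars
x[0] += 0.5                         # one small perturbation breaks uniformity
v = [V(L / N)] * N                  # everyone starts at the uniform-flow speed

for _ in range(int(T / dt)):
    h = [(x[(i + 1) % N] - x[i]) % L for i in range(N)]       # headways on the ring
    v = [vi + dt * a * (V(hi) - vi) for vi, hi in zip(v, h)]   # adjust towards V(h)
    x = [(xi + dt * vi) % L for xi, vi in zip(x, v)]           # move the cars

print(f"speed spread after t={T:.0f}: min {min(v):.2f}, max {max(v):.2f}")
# Uniform flow would keep the spread near zero; instead the deterministic
# fleet should end up with some cars crawling and others near free speed.
```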

Are these problems the preserve of psychologists, or does mathematics have anything useful to say?

Dave Marsay

JIC, Syria and Uncertainty

This page considers, from a theory-of-evidence perspective, the case that the Assad regime used gas against the rebels on 21st August 2013. For a broader account, see Wikipedia.

The JIC Assessment

The JIC concluded on the 27th that it was:

highly likely that the Syrian regime was responsible.

In the covering letter (29th) the chair said:

Against that background, the JIC concluded that it is highly likely that the regime was responsible for the CW attacks on 21 August. The JIC had high confidence in all of its assessments except in relation to the regime’s precise motivation for carrying out an attack of this scale at this time – though intelligence may increase our confidence in the future.

A cynic or pedant might note the caveat:

The paper’s key judgements, based on the information and intelligence available to us as of 25 August, are attached.

Mathematically-based analysis

From a mathematical point of view, the JIC report is an ‘utterance’, and one needs to consider the context in which it was produced. Hopefully, best practice would include identifying the key steps in the conclusion and seeking out and hastening any possible contrary reports. Thus one might reasonably suppose that the letter on the 29th reflected all obviously relevant information available up to the end of the 28th, but perhaps not some other inputs, such as ‘big data’, that only yield intelligence after extensive processing and analysis.

But what is the chain of reasoning (29th)?

It is being claimed, including by the regime, that the attacks were either faked or undertaken by the Syrian Armed Opposition. We have tested this assertion using a wide range of intelligence and open sources, and invited HMG and outside experts to help us establish whether such a thing is possible. There is no credible intelligence or other evidence to substantiate the claims or the possession of CW by the opposition. The JIC has therefore concluded that there are no plausible alternative scenarios to regime responsibility.

The JIC had high confidence in all of its assessments except in relation to the regime’s precise motivation for carrying out an attack of this scale at this time – though intelligence may increase our confidence in the future.

The report of the 27th is more nuanced:

There is no credible evidence that any opposition group has used CW. A number continue to seek a CW capability, but none currently has the capability to conduct a CW attack on this scale.

Russia claims to have a ‘good degree of confidence’ that the attack was an ‘opposition provocation’ but has announced that they support an investigation into the incident. …

In contrast, concerning Iraqi WMD, we were told that “lack of evidence is not evidence of lack”. But mathematics is not so rigid: it depends on one’s intelligence sources and analysis. Presumably in 2003 we lacked the means to detect Iraqi CW, but now – having learnt the lesson – we would know almost as soon as any one of a number of disparate groups acquires CW.  Many outside the intelligence community might not find this credible, leading to a lack of confidence in the report. Others would take the JIC’s word for it. But while the JIC may have evidence that supports their rating, it seems to me that they have not even alluded to a key part of it.

Often, of course, an argument may be technically flawed but still lead to a correct conclusion. To fix the argument one would want a much greater understanding of the situation. For example, the Russians seem to suggest that one opposition group would be prepared to gas another, presumably to draw the US and others into the war. Is the JIC saying that this is not plausible, or simply that no such group (yet) has the means? Without clarity, it is difficult for an outsider to assess the report and draw their own conclusion.

Finally, it is notable that regime responsibility for the attack of the 21st is rated ‘highly likely’, the same as their responsibility for previous attacks. Yet mathematically the rating should depend on what is called ‘the likelihood’, which one would normally expect to increase with time. Hence one would expect the rating to increase from possible (in the immediate aftermath) through likely to highly likely, as the kind of issues described above are dealt with. This unexpectedly high rating calls for an explanation, which would need to address the most relevant factors.
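
To illustrate the point (with invented numbers that have nothing to do with the actual intelligence): in the odds form of Bayes’ rule each genuinely independent piece of supporting evidence multiplies the odds by its likelihood ratio, so one would normally expect the rating to strengthen as reports accumulate, provided the evidence keeps pointing the same way.

```python
# A stylised illustration (numbers invented, nothing to do with the actual
# intelligence): each independent piece of supporting evidence multiplies the
# odds by its likelihood ratio, so the rating would normally strengthen over
# time - provided the evidence keeps pointing the same way.

prior_odds = 1.0                      # evens: roughly 'possible'
likelihood_ratios = [3.0, 4.0, 5.0]   # successive (assumed) pieces of evidence

odds = prior_odds
for item, lr in enumerate(likelihood_ratios, start=1):
    odds *= lr
    prob = odds / (1 + odds)
    print(f"after evidence item {item}: P(regime responsible) = {prob:.2f}")
```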

Anticipating the UN Inspectors

The UN weapons inspectors are expected to produce much relevant evidence. For example, it may be that even if an opposition group had CW an attack would necessarily lack some key signatures. But, from a mathematical point of view, one cannot claim that one explanation is ‘highly likely’ without considering all the alternatives and taking full account of how the evidence was obtained. It is quite true, as the PM argued, that there will always be gaps that require judgement to span. But we should strive to make the gap as slight as possible, and to be clear about what it is. While one would not want a JIC report to be phrased in terms of mathematics, it would seem that appropriate mathematics could be a valuable aid to critical thinking. Hopefully we shall soon have an assessment that genuinely rates ‘highly likely’ independently of any esoteric expertise, whether intelligence or mathematics.

Updates

30th August: US

The US assessment concludes that the attack was by Assad’s troops, using rockets to deliver a nerve agent, following their usual procedures. This ought to be confirmed or disconfirmed by the inspectors, with reasonable confidence. Further, the US claim ‘high confidence’ in their assessment, rather than very high confidence. Overall, the US assessment appears to be about what one would expect if Assad’s troops were responsible.

31st August: Blog

There is a good private-enterprise analysis of the open-source material. It makes a good case that the rockets’ payloads were not very dense, and probably a chemical gas. However, it points out that only the UN inspectors could determine if the payload was a prohibited substance, or some other substance such as is routinely used by respectable armies and police forces.

It makes no attribution of the rockets. The source material is clearly intended to show them being used by the Assad regime, but there is no discussion of whether or not any rebel groups could have made, captured or otherwise acquired them.

2nd September: France

The French have declassified a dossier. Again, it presents assertion and argumentation rather than evidence. The key points seem to be:

  • A ‘large’ amount of gas was used.
  • Rockets were probably used (presumably many).
  • No rebel group has the ability to fire rockets (unlike the Vietcong in Vietnam).

This falls short of a conclusive argument. Nothing seems to rule out the possibility of an anti-Assad outside agency loading up an ISO container (or a mule train) with CW (perhaps in rockets), and delivering them to an opposition group along with an adviser. (Not all the opposition groups are allies.)

4th September: Germany

A German report includes:

  • Conjecture that the CW mix was stronger than intended, and hence lethal rather than temporarily disabling.
  • That a Hezbollah official said that Assad had ‘lost his nerve’ and ordered the attack.

It is not clear if the Hezbollah utterance was based on good grounds or was just speculation.

4th September: Experts

Some independent experts have given an analysis of the rockets that is similar in detail to that provided by Colin Powell to the UN in 2003, providing some support for the official dossiers. They assess that each warhead contained 50 litres (13 gallons) of agent. They assess that the rebels could have constructed the rockets, but not produced the large quantity of agent.

No figure is given for the number of rockets, but I have seen a figure of 100, which seems the right order of magnitude. This would imply 5,000 litres or 1,300 gallons, if all held the agent. A large tanker truck has a capacity of about 7 times this, so it does not seem impossible that such an amount could have been smuggled in.

This report essentially puts a little more detail on the blog of 31st August, and is seen as being more authoritative.

5th September: G20

The UK has confirmed that Sarin was used, but seems not to have commented on whether it was of typical ‘military quality’, or more home-made.

Russia has given the UN a 100 page dossier of its own, and I have yet to see a credible debunking (early days, and I haven’t found it on-line).

The squabbles continue. The UN wants to wait for its inspectors.

6th September: Veteran Intelligence Professionals for Sanity

An alternative, unofficial narrative. Can this be shown to be incredible? Will it be countered?

9th September: Germany

German secret sources indicate that Assad had no involvement in the CW attack (although others in the regime might have).

9th September: FCO news conference

John Kerry, at a UK FCO news conference, gives a very convincing account of the evidence for CW use, but without indicating any evidence that the chemicals were delivered by rocket. He is asked about Assad’s involvement, but notes that all that is claimed is senior regime culpability.

UN Inspectors’ Report

21st September. The long-awaited report concludes that rockets were used to deliver Sarin. The report, at first read, seems professional and credible. It is similar in character to the evidence that Colin Powell presented to the UN in 2003, but without the questionable ‘judgments’. It provides some key details (type of rocket, trajectory) which – one hopes – could be tied to the Assad regime, especially given US claims to have monitored rocket launches. Otherwise, they appear to be of a type that the rebels could have used.

The report does not discuss the possibility, raised by the regime, that conventional rockets had accidentally hit a rebel chemical store, but the technical details do seem to rule it out. There is an interesting point here. Psychologically, the fact that the regime raised a possibility in their defence which has been shown to be false increases our scepticism about them. But mathematically, if they are innocent then we would not expect them to know what happened, and hence we would not expect their conjectures to be correct. Such a false conjecture could even be counted as evidence in their favour, particularly if we thought them competent enough to realise that such an invention would easily be falsified by the inspectors.
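
A toy version of that calculation, with all numbers invented:

```python
# A toy version of the point above, with all numbers invented. If an innocent
# regime is guessing blind, a defence conjecture that turns out to be false is
# roughly what innocence predicts, so the resulting likelihood ratio can point
# (weakly) away from guilt, contrary to the psychological reaction.

p_false_conjecture_given_innocent = 0.8   # guessing blind, likely to guess wrong (assumed)
p_false_conjecture_given_guilty = 0.6     # a guilty party might still float a poor cover story (assumed)

lr_for_guilt = p_false_conjecture_given_guilty / p_false_conjecture_given_innocent
prior_odds_of_guilt = 4.0                 # say, 'likely' before this observation (assumed)
posterior_odds = prior_odds_of_guilt * lr_for_guilt

print(f"likelihood ratio for guilt from the false conjecture: {lr_for_guilt:.2f}")
print(f"odds of guilt move from {prior_odds_of_guilt:.1f} to {posterior_odds:.1f}")
```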

Reaction

Initial formal reactions

Initial reactions from the US, UK and France are that the technical details, including the trajectory, rule out rebel responsibility. They appear to be in a good position to make such a determination, and it would normally be a conclusion that I would take at face value. But given the experience of Iraq and their previous dossiers, it seems quite possible that they would say what they said even without any specific evidence. A typical response, from US ambassador to the UN Samantha Power, was:

The technical details of the UN report make clear that only the regime could have carried out this large-scale chemical weapons attack.

Being just a little pedantic, this statement is literally false: one would at least have to take the technical details to a map showing rebel and regime positions, and have some idea of the range of the rockets. From the Russian comments, it would seem they have not been convinced.

Media reaction

A Telegraph report includes:

Whether the rebels have captured these delivery systems – along with sarin gas – from government armouries is unknown. Even if they have, experts said that operating these weapons successfully would be exceptionally difficult.

”It’s hard to say with certainty that the rebels don’t have access to these delivery systems. But even if they do, using them in such a way as to ensure that the attack was successful is the bit the rebels won’t know how to do,” said Dina Esfandiary, an expert on chemical weapons at the International Institute for Strategic Studies.

The investigators had enough evidence to trace the trajectories followed by two of the five rockets. If the data they provide is enough to pinpoint the locations from which the weapons were launched, this should help to settle the question of responsibility.

John Kerry, the US secretary of state, says the rockets were fired from areas of Damascus under the regime’s control, a claim that strongly implicates Mr Assad’s forces.

This suggests that there might be a strong case against the regime. But it is not clear that the government would be the only source of weapons for the rebels, that the rebels would need sophisticated launchers (rather than sticks) or that they would lack advice. Next, given the information on type, timing and bearing it should be possible to identify the rockets, if the US was monitoring their trajectories at the time, and hence it might be possible to determine where they came from, in which case the evidence trail would lead strongly to the regime. (Elsewhere it has been asserted that one of the rockets was fired from within the main Syrian Army base, in which case one would have thought they would have noticed a rebel group firing out.)

17 September: Human Rights Watch

Human Rights Watch has marked the UN estimate of the trajectories on a map, clearly showing that they could have been fired from the Republican Guard 104 Brigade area.

Connecting the dots provided by these numbers allows us to see for ourselves where the rockets were likely launched from and who was responsible.

This isn’t conclusive, given the limited data available to the UN team, but it is highly suggestive and another piece of the puzzle.

This seems a reasonable analysis. The BBC has said of it:

Human Rights Watch says the document reveals details of the attack that strongly suggest government forces were behind the attack.

But this seems to exaggerate the strength of the evidence. One would at least want to see if the trajectories are consistent with the rockets having been launched from rebel-held areas (map, anyone?). It also seems a little odd that a salvo of M14 rockets appears to have been fired over the presidential palace. Was the Syrian Army that desperate? Depending on the view that one takes of these questions, the evidence could favour the rebel hypothesis. On the other hand, if the US could confirm that the only rockets fired at that time to those sites came from government areas, that would seem conclusive.

(Wikipedia gives technical details of the rockets. It notes use by the Taliban, and quotes their normal maximum range as 9.8 km. The Human Rights Watch analysis seems to be assuming that this will not be significantly reduced by the ad-hoc adaptation to carry gas. Is this credible? My point here is that the lack of explicit discussion of such aspects in the official dossiers leaves room for doubt, which could be dispelled if their ‘very high confidence’ is justified.)

18 September: Syrian “proof”

The BBC has reported that the Syrians have provided what they consider proof to the Russians that the rebels were responsible for the CW attack, and that the Russians are evaluating it. I doubt that this will be proof, but perhaps it will reduce our confidence in the ‘very high’ likelihood that the regime was responsible. (Probably not!) It may, though, flush out more conclusive evidence, either way.

19 September: Forgery?

Assad has claimed that the materials recovered by the UN inspectors were forged. The report talks about rebels moving material, and it is not immediately clear, as the official dossiers claim, that this hypothesis is not credible, particularly if the rebels had technical support.

Putin has confirmed that the rockets used were obsolete Soviet-era ones, no longer in use by the Syrian Army.

December: US Intelligence?

Hersh claims that the US had intelligence that the Syrian rebels had chemical weapons, and that the US administration deliberately ‘adjusted’ the intelligence to make it appear much more damning of the Syrian regime. (This is disputed.)

Comment

The UN Inspectors report is clear about what it has found. It is careful not to make deductive leaps, but provides ample material to support further analysis. For example, while it finds that Sarin was delivered by rockets that could have been launched from a regime area, it does not rule out rebel responsibility. But it does give details of type, time and direction, such that if – as appears to be the case from their dossier – the US were monitoring the area, it should be possible to conclude that the rocket was actually fired by the regime. Maybe someone will assemble the pieces for us.

My own view is not that Assad did not do it or that we should not attack, but that any attack based on the grounds that Assad used CW should be supported by clear, specific evidence, which the dossiers prior to the UN report did not provide. Even now, we lack a complete case. Maybe the UN should have its own intelligence capability? Or could we attack on purely humanitarian grounds, not basing the justification on the possible events on 21 Aug? Or share our intelligence with the Russians and Chinese?

Maybe no-one is interested any more?

See Also

Telegraph on anti-spy cynicism. Letters. More controversially: inconclusive allegations, and an attempted debunking.

Discussion of weakness of case that Assad was personally involved. Speculation on UN findings.

A feature of the debate seems to be that those who think that ‘something must be done’ tend to be critical of those who question the various dossiers, and those who object to military action tend to throw mud at the dossiers, justified or not. So maybe my main point should be that, irrespective of the validity of the JIC assessment, we need a much better quality of debate, engaging the public and those countries with different views, not just our traditional allies.

A notable exception was a private blog, which looked very credible, but fell short of claiming “high likelihood”. It gives details of two candidate delivery rockets, and hoped that the UN inspectors would get evidence from them, as they did. Neither rocket was known to have been used, but neither do they appear to be beyond the ability of rebel groups to use (with support). The comments are also interesting, e.g.:

There is compelling evidence that the Saudi terrorists operating in Syria, some having had training from an SAS mercenary working out of Dubai who is reporting back to me, are responsible for the chemical attack in the Ghouta area of Damascus.

The AIPAC derived ‘red line’ little game and frame-up was orchestrated at the highest levels of the American administration and liquid sarin binary precursors mainly DMMP were supplied by Israeli handled Saudi terrorists to a Jabhat al-Nusra Front chemist and fabricator.

Israel received supplies of the controlled substance DMMP from Solkatronic Chemicals of Morrisville, Pa.

This at least has some detail, although not such as can be easily checked.

Finally, I am beginning to get annoyed by the media’s use of scare quotes around Russian “evidence”.

Dave Marsay

Are financiers really stupid?

The New Scientist (30 March 2013) has the following question, under the heading ‘Stupid is as stupid does’:

Jack is looking at Anne but Anne is looking at George. Jack is married but George is not. Is a married person looking at an unmarried person?

Possible answers are: “yes”, “no” or “cannot be determined”.

You might want to think about this before scrolling down.

.

.

.

.

.

.

.

It is claimed that while ‘the vast majority’ (presumably including financiers, whose thinking is being criticised) think the answer is “cannot be determined”,

careful deduction shows that the answer is “yes”.

Similar views are expressed at a learning blog and at a Physics blog, although the ‘careful deductions’ are not given. Would you like to think again?

.

.

.

.

.

.

.

.

Now I have a confession to make. My first impression is that the closest of the admissible answers is ‘cannot be determined’, and having thought carefully for a while, I have not changed my mind. Am I stupid? (Based on this evidence!) You might like to think about this before scrolling down.

.

.

.

.

.

.

.

Some people object that the term ‘is married’ may not be well-defined, but that is not my concern. Suppose that one has a definition of marriage that is as complete and precise as possible. What is the correct answer? Does that change your thinking?

.

.

.

.

.

.

.

Okay, here are some candidate answers that I would prefer, if allowed:

  1. There are cases in which the answer cannot be determined.
  2. It is not possible to prove that there are not cases in which the answer cannot be determined. (So that the answer could actually be “yes”, but we cannot know that it is “yes”.)

Either way, it cannot be proved that there is a complete and precise way of determining the answer, but for different reasons. I lean towards the first answer, but am not sure. Which it is is not a logical or mathematical question, but a question about ‘reality’, so one should ask a Physicist. My reasoning follows … .

.

.

.

.

.

.

.

.

Suppose that Anne marries Henry, who dies while out in space, with a high relative velocity and acceleration. Then to answer “yes” we must at least be able to determine a unique time, in Anne’s time-frame, at which Henry dies, or else (it seems to me) there will be a period of time in which Anne’s status is indeterminate. It is not just that we do not know what Anne’s status is; she has no ‘objective’ status.

If there is some experiment which really proves that there is no possible ‘objective’ time (and I am not sure that there is) then am I not right? Even if there is no such experiment, one cannot determine the truth of physical theories, only fail to disprove them. So either way, am I not right?

Enlightenment, please. The link to finance is that the New Scientist article says that

Employees leaving logic at the office door helped cause the financial crisis.

I agree, but it seems to me (after Keynes) that it was their use of the kind of ‘classical’ logic that is implicitly assumed in the article that is at fault. Being married is a relation, not a proposition about Anne. Anne has no state or attributes from which her marital status can be determined, any more than terms such as crash, recession, money supply, inflation, inequality, value or ‘the will of the people’ have any correspondence in real economies. Unless you know different?
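
To see how much the ‘official’ answer rests on the implicit two-valued assumption, here is a crude enumeration (my own formalisation):

```python
# A crude check (my own formalisation) of how much the 'official' answer rests
# on the implicit two-valued assumption. If Anne must be exactly 'married' or
# 'unmarried', case analysis gives "yes"; if a third possibility is admitted,
# as argued above, the answer cannot be determined. 'Indeterminate' is treated,
# crudely, as a status for which neither assertion about Anne can be made.

def married_looking_at_unmarried(anne_status):
    """Is some married person looking at some unmarried person?"""
    looking = [("married", anne_status),      # Jack (married) looks at Anne
               (anne_status, "unmarried")]    # Anne looks at George (unmarried)
    return any(a == "married" and b == "unmarried" for a, b in looking)

for statuses in (["married", "unmarried"],
                 ["married", "unmarried", "indeterminate"]):
    answers = {married_looking_at_unmarried(s) for s in statuses}
    verdict = "yes" if answers == {True} else "cannot be determined"
    print(f"admissible statuses {statuses}: answer is '{verdict}'")
```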

Dave Marsay

Mathematics, psychology, decisions

I attended a conference on the mathematics of finance last week. It seems that things would have gone better in 2007/8 if only policy makers had employed some mathematicians to critique the then dominant dogmas. But I am not so sure. I think one would need to understand why people went along with the dogmas. Psychology, such as behavioural economics, doesn’t seem to help much, since although it challenges some aspects of the dogmas it fails to challenge (and perhaps even promotes) other aspects, so that it is not at all clear how it could have helped.

Here I speculate on an answer.

Finance and economics are either empirical subjects or they are quasi-religious, based on dogmas. The problems seem to arise when they are the latter but we mistake them for the former. If they are empirical then they have models whose justification is based on evidence.

Naïve inductivism boils down to the view that whatever has always (never) been the case will continue always (never) to be the case. Logically it is untenable, because one often gets clashes, where two different applications of naïve induction are incompatible. But pragmatically, it is attractive.

According to naïve inductivism we might suppose that if the evidence has always fitted the models, then actions based on the supposition that they will continue to do so will be justified. (Hence, ‘it is rational to act as if the model is true’). But for something as complex as an economy the models are necessarily incomplete, so that one can only say that the evidence fitted the models within the context as it was at the time. Thus all that naïve inductivism could tell you is that ‘it is rational’ to act as if the model is true, unless and until the context should change. But many of the papers at the mathematics of finance conference were pointing out specific cases in which the actions ‘obviously’ changed the context, so that naïve inductivism should not have been applied.

It seems to me that one could take a number of attitudes:

  1. It is always rational to act on naïve inductivism.
  2. It is always rational to act on naïve inductivism, unless there is some clear reason why not.
  3. It is always rational to act on naïve inductivism, as long as one has made a reasonable effort to rule out any contra-indications (e.g., by considering ‘the whole’).
  4. It is only reasonable to act on naïve inductivism when one has ruled out any possible changes to the context, particularly reactions to our actions, by considering an adequate experience base.

In addition, one might regard the models as conditionally valid, and hedge accordingly. (‘Unless and until there is a reaction’.) Current psychology seems to suppose (1) and hence has little to help us understand why people tend to lean too strongly on naïve inductivism. It may be that a belief in (1) is not really psychological, but simply a consequence of education (i.e., cultural).

See Also

Russell’s Human Knowledge. My media for the conference.

Dave Marsay

Risks to scientists from mis-predictions

The recent conviction of six seismologists and a public official for reassuring the public about the risk of an earthquake when there turned out to be one raises many issues, mostly legal, but I want to focus on the scientific aspects, specifically the assessment and communication of uncertainty.

A recent paper by O’Hagan  notes that there is “wide recognition that the appropriate representation for expert judgements of uncertainty is as a probability distribution for the unknown quantity of interest …”.  This conflicts with UK best practice, as described by Spiegelhalter at understanding uncertainty. My own views have been formed by experience of potential and actual crises where evaluation of uncertainty played a key role.

From a mathematical perspective, probability theory is a well-grounded theory depending on certain axioms. There are plausible arguments that these axioms are often satisfied, but these arguments are empirical and hence should be considered at best as scientific rather than mathematical or ‘universally true’.  O’Hagan’s arguments, for example, start from the assumption that uncertainty is nothing but a number, ignoring Spiegelhalter’s ‘Knightian uncertainty‘.

Thus it seems to me that, where there are rare critical decisions and a lack of evidence to support a belief in the axioms, one should recognize the attendant non-probabilistic uncertainty, and that failure to do so is a serious error, meriting some censure. In practice, one needs relevant guidance such as the UK is developing, interpreted for specific areas such as seismology. This should provide both guidance (such as that at understanding uncertainty) to scientists and material to be used in communicating risk to the public, preferably with some legal status. But what should such guidance be? Spiegelhalter’s is a good start, but needs developing.

My own view is that one should have standard techniques that can put reasonable bounds on probabilities, so that one has something that is relatively well peer-reviewed, ‘authorised’ and ‘scientific’ to inform critical decisions. But in applying any methods one should recognize any assumptions that have been made to support the use of those methods, and highlight them. Thus one may say that according to the usual methods, ‘the probability is p’, but that there are various named factors that lead you to suppose that the ‘true risk’ may be significantly higher (or lower). But is this enough?

Some involved in crisis management have noted that scientists generally seem to underestimate risk. If so, then even the above approach (and the similar approach of understanding uncertainty) could tend to understate risk. So do scientists tend to understate the risks pertaining to crises, and why?

It seems to me that one cannot be definitive about this, since there are, from a statistical perspective – thankfully – very few crises or even near-crises. But my impression is that there could be something in it. Why?

As at Aquila, human and organisational factors seem to play a role, so that some answers seem to need more justification than others. Any ‘standard techniques’ would need to take account of these tendencies. For example, I have often said that the key to good advice is to have a good customer, who desires an adequate answer – whatever it is – who fully appreciates the dangers of misunderstanding arising, and is prepared to invest the time in ensuring adequate communication. This often requires debate and perhaps role-playing, prior to any crisis. This was not achieved at Aquila. But is even this enough?

Here I speculate even more. In my own work, it seems to me that where a quantity such as P(A|B) is required and scientists/statisticians only have a good estimate of P(A|B’) for some B’ that is more general than B, then P(A|B’) will be taken as ‘the scientific’ estimate for P(A|B). This is so common that it seems to be a ‘rule of pragmatic inference’, albeit one that seems to be unsupported by the kind of arguments that O’Hagan supports. My own experience is that it can seriously underestimate P(A|B).
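
A toy illustration (all numbers invented, and nothing to do with seismology in particular) of how large that gap can be:

```python
# A toy illustration (all numbers invented) of how large the gap can be between
# the quantity quoted, P(A|B'), and the quantity the decision needs, P(A|B),
# when B is a more specific circumstance than the broad reference class B'.

p_A_given_Bprime = 0.001   # e.g. P(event | a typical period), the 'scientific' figure (assumed)
p_B_given_A = 0.5          # P(specific observations | event coming)  (assumed)
p_B_given_notA = 0.02      # P(specific observations | no event)      (assumed)

# Bayes' rule, taking B' as the background within which B is observed.
p_A_given_B = (p_B_given_A * p_A_given_Bprime) / (
    p_B_given_A * p_A_given_Bprime + p_B_given_notA * (1 - p_A_given_Bprime))

print(f"quoted figure   P(A|B'): {p_A_given_Bprime:.4f}")
print(f"required figure P(A|B):  {p_A_given_B:.4f}")   # roughly 25 times larger here
```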

The facts of the Aquila case are not clear to me, but I suppose that the scientists made their assessment based on the best available scientific data. To put it another way, they would not have taken account of ad-hoc observations, such as amateur observations of radon gas fluctuations. Part of the Aquila problem seems to be that the amateur observations provided a warning which the population were led to discount on the basis of ‘scientific’ analysis. More generally, in a crisis, one often has a conflict between a scientific analysis based on sound data and non-scientific views verging on divination. How should these diverse views inform the overall assessment?

In most cases one can make a reasonable scientific analysis based on sound data and ‘authorised assumptions’, taking account of recognized factors. I think that one should always strive to do so, and to communicate the results. But if that is all that one does then one is inevitably ignoring the particulars of the case, which may substantially increase the risk. One may also want to take a broader decision-theoretic view. For example, if the peaks in radon gas levels were unusual then taking them as a portent might be prudent, even in the absence of any relevant theory. The only reason for not doing so would be if the underlying mechanisms were well understood and the gas levels were known to be simply consequent on the scientific data, thus providing no additional information. Such an approach is particularly indicated where – as I think is the case in seismology – even the best scientific analysis has a poor track record.
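
The decision-theoretic point can be made with a crude expected-loss calculation; the costs below are invented and purely illustrative. Even when the assessed probability is small, treating the anomalous observations as a portent may be the prudent act if the loss from a missed warning dwarfs the cost of precaution.

# Invented expected-loss comparison: precaution versus reassurance.
p_event = 0.02            # assessed probability of the event (toy figure)
cost_precaution = 1.0     # cost of precautionary measures, in arbitrary units
loss_if_missed = 1000.0   # loss if the event occurs and no precaution was taken

expected_loss_precaution = cost_precaution           # 1.0
expected_loss_reassure = p_event * loss_if_missed    # 20.0
break_even = cost_precaution / loss_if_missed        # 0.001: precaution pays above this probability

print(expected_loss_precaution, expected_loss_reassure, break_even)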

The bottom line, then, is that I think one should always provide ‘the best scientific analysis’, in the sense of an analysis that gives a numeric probability (or probability range, etc.), but one needs to establish a best practice that takes a broader view of the issue in question, and in particular the limitations and potential biases of ‘best practice’.

The O’Hagan paper quoted at the start says – of conventional probability theory – that “Alternative, but similarly compelling, axiomatic or rational arguments do not appear to have been advanced for other ways of representing uncertainty.” This overlooks Boole, Keynes, Russell and Good, for example. It may be timely to reconsider the adequacy of the conventional assumptions. It might also be that ‘best scientific practice’ needs to be adapted to cope with messy real-world situations. L’Aquila was not a laboratory.

See Also

My notes on uncertainty and on current debates.

Dave Marsay

UK judge rules against probability theory? R v T

Actually, the judge was a bit more considered than my title suggests. In my defence the Guardian says:

“Bayes’ theorem is a mathematical equation used in court cases to analyse statistical evidence. But a judge has ruled it can no longer be used. Will it result in more miscarriages of justice?”

The case involved Nike trainers and appears to be the same as that in a recent appeal judgment, although it doesn’t actually involve Bayes’ rule. It just involves the likelihood ratio, not any priors. An expert witness had said:

“… there is at this stage a moderate degree of scientific evidence to support the view that the [appellant’s shoes] had made the footwear marks.”

The appeal hinged around the question of whether this was a reasonable representation of a reasonable inference.

According to Keynes, Knight and Ellsberg, probabilities are grounded on either logic, statistics or estimates. Prior probabilities are – by definition – never grounded on statistics and in practical applications rarely grounded on logic, and hence must be estimates. Estimates are always open to challenge, and might reasonably be discounted, particularly where one wants to be ‘beyond reasonable doubt’.

Likelihood ratios are typically more objective and hence more reliable. In this case they might have been based on good quality relevant statistics, in which case the judge supposed that it might be reasonable to state that there was a moderate degree of scientific evidence. But this was not the case. Expert estimates had supplied what the available database had lacked, so introducing additional uncertainty. This might have been reasonable, but the estimate appears not to have been based on relevant experience.
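
A rough sketch of the sensitivity involved, with invented figures: if a coincidental correspondence is taken to be roughly as probable as the frequency of the pattern and size combination, then the likelihood ratio is roughly the reciprocal of that frequency, and plausible estimates of the frequency support verbal conclusions ranging from weak to strong.

# Invented figures: sensitivity of a likelihood ratio to an estimated frequency.
# Assume P(match | same shoe) ~ 1 and P(match | different shoe) ~ pattern frequency.
for freq in (0.003, 0.01, 0.03, 0.1):            # plausible but uncertain frequencies
    likelihood_ratio = 1.0 / freq
    print(f"assumed frequency {freq:5.3f} -> likelihood ratio {likelihood_ratio:6.0f}")
# The same evidence yields ratios from about 10 to about 330,
# spanning several verbal categories of support.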

My deduction from this is that where there is doubt about the proper figures to use, that doubt should be acknowledged and the defendant given the benefit of it. As the judge says:

“… it is difficult to see how an opinion … arrived at through the application of a formula could be described as ‘logical’ or ‘balanced’ or ‘robust’, when the data are as uncertain as we have set out and could produce such different results.”

This case would seem to have wider implications:

“… we do not consider that the word ‘scientific’ should be used, as … it is likely to give an impression … of a degree of  precision and objectivity that is not present given the current state of this area of expertise.”

My experience is that such estimates are often used by scientists, and the result confounded with ‘science’. I have sometimes heard this practice justified on the grounds that some ‘measure’ of probability is needed and that if an estimate is needed it is best that it should be given by an independent scientist or analyst than by an advocate or, say, politician. Maybe so, but perhaps we should indicate when this has happened, and the impact it has on the result. (It might be better to follow the advice of Keynes.)

Royal Statistical Society

The guidance for forensic scientists is:

“There is a long history and ample recent experience of misunderstandings relating to statistical information and probabilities which have contributed towards serious miscarriages of justice. … forensic scientists and expert witnesses, whose evidence is typically the immediate source of statistics and probabilities presented in court, may also lack familiarity with relevant terminology, concepts and methods.”

“Guide No 1 is designed as a general introduction to the role of probability and statistics in criminal proceedings, a kind of vade mecum for the perplexed forensic traveller; or possibly, ‘Everything you ever wanted to know about probability in criminal litigation but were too afraid to ask’. It explains basic terminology and concepts, illustrates various forensic applications of probability, and draws attention to common reasoning errors (‘traps for the unwary’).”

The guide is clearly much needed. It states:

“The best measure of uncertainty is probability, which measures uncertainty on a scale from 0 to 1.”

This statement is nowhere supported by any evidence whatsoever. No consideration is given to alternatives, such as those of Keynes, or to the legal concept of “beyond reasonable doubt.”
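
For what it is worth, Boole’s own work already provides one long-established alternative: bounds on probabilities rather than single numbers. A minimal sketch, using the standard Fréchet bounds on a conjunction when nothing is known about the dependence between two events:

# Fréchet bounds: what P(A) and P(B) alone say about P(A and B),
# without assuming independence or any particular dependence.
def conjunction_bounds(p_a: float, p_b: float) -> tuple[float, float]:
    lower = max(0.0, p_a + p_b - 1.0)
    upper = min(p_a, p_b)
    return lower, upper

print(conjunction_bounds(0.75, 0.5))   # (0.25, 0.5): an interval, not a single number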

“The type of probability that arises in criminal proceedings is overwhelmingly of the subjective variety, …”

There is no consideration of Boole and Keynes’ more logical notion, nor is any reason given why one should take notice of the subjective opinions of others.

“Whether objective expressions of chance or subjective measures of belief, probabilistic calculations of (un)certainty obey the axiomatic laws of probability, …”

But how do we determine whether those axioms are appropriate to the situation at hand? The reader is not told whether the term ‘axiom’ is to be interpreted in its mathematical sense, as a formal assumption whose applicability has to be established, or in its lay sense, as something that may be assumed without further thought. The first example given is:

“Consider an unbiased coin, with an equal probability of producing a ‘head’ or a ‘tail’ on each coin-toss. …”

Probability here is mathematical. Considering the probability of an untested coin of unknown provenance would be more subjective. It is the handling of the subjective component that is at issue, an issue that the example does not help to address. More realistically:

“Assessing the adequacy of an inference is never a purely statistical matter in the final analysis, because the adequacy of an inference is relative to its purpose and what is at stake in any particular context in relying on it.”

“… an expert report might contain statements resembling the following:
* “Footwear with the pattern and size of the sole of the defendant’s shoe occurred in approximately 2% of burglaries.” …
It is vital for judges, lawyers and forensic scientists to be able to identify and evaluate the assumptions which lie behind these kinds of statistics.”

This is good advice, which the appeal judge took. However, while I have not read and understood every detail of the guidance, it seems to me that the judge’s understanding went beyond the guidance, including its ‘traps for the unwary’.

The statistical guidance cites the following guidance from the forensic scientists’ professional body:

“Logic: The expert will address the probability of the evidence given the proposition and relevant background information and not the probability of the proposition given the evidence and background information.”

This seems sound, but needs to be supported by detailed advice. In particular, none of the above guidance explicitly takes account of the notion of ‘beyond reasonable doubt’.
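
The principle can be illustrated with a small calculation; the figures are invented. A match probability of 1 in 50 among innocent people does not translate into odds of 49 to 1 on guilt: the transposition depends on the size of the pool of alternative suspects.

# Invented figures: why P(evidence | innocent) must not be read as P(guilty | evidence).
p_match_given_innocent = 1 / 50    # frequency of the matching feature among innocent people
p_match_given_guilty = 1.0         # assume the true perpetrator would certainly match
pool_size = 10_000                 # hypothetical number of people who could have committed the offence

# Assuming, for illustration only, a uniform prior over the pool:
expected_innocent_matches = (pool_size - 1) * p_match_given_innocent        # roughly 200
p_guilty_given_match = p_match_given_guilty / (p_match_given_guilty + expected_innocent_matches)
print(round(p_guilty_given_match, 4))   # roughly 0.005, nowhere near 49/50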

Forensic science view

Science and Justice has an article which opines:

“Our concern is that the judgment will be interpreted as being in opposition to the principles of logical interpretation of evidence. We re-iterate those principles and then discuss several extracts from the judgment that may be potentially harmful to the future of forensic science.”

The full article is behind a pay-wall, but I would like to know what principles it is referring to. It is hard to see how there could be a conflict, unless there are some extra principles not in the RSS guidance.

Criminal Law Review

Forensic Science Evidence in Question argues that:

 “The strict ratio of R. v T  is that existing data are legally insufficient to permit footwear mark experts to utilise probabilistic methods involving likelihood ratios when writing reports or testifying at trial. For the reasons identified in this article, we hope that the Court of Appeal will reconsider this ruling at the earliest opportunity. In the meantime, we are concerned that some of the Court’s more general statements could frustrate the jury’s understanding of forensic science evidence, and even risk miscarriages of justice, if extrapolated to other contexts and forms of expertise. There is no reason in law why errant obiter dicta should be permitted to corrupt best scientific practice.”

In this account it is clear that the substantive issues are about likelihoods rather than probabilities, and that consideration of ‘prior probabilities’ is not relevant here. This is different from the Royal Statistical Society’s account, which emphasises subjective probability. However, in considering the likelihood of the evidence conditioned on the suspect’s innocence, it is implicitly assumed that the perpetrator is typical of the UK population as a whole, or of people at UK crime scenes as a whole. But suppose that women are most often murdered by men that they are or have been close to, and that such men are likely to be more similar to each other than people randomly selected from the population as a whole. Then it is reasonable to suppose that the likelihood of the evidence, given that the perpetrator is some other male known to the victim, will be significantly greater than the likelihood given that the perpetrator is a random member of the population. The use of an inappropriate likelihood introduces a bias.
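
A toy calculation, with invented figures, shows the size of the effect. If the sole pattern occurs in 2% of the population at large but, because tastes cluster, in 20% of the plausible alternative suspects, then the likelihood of the evidence under innocence is ten times larger than the whole-population figure suggests, and the likelihood ratio shrinks accordingly.

# Invented figures: whole-population versus relevant-subpopulation likelihoods.
p_match_given_guilty = 1.0      # assume the perpetrator's shoes would match
freq_whole_population = 0.02    # frequency of the pattern in the population at large
freq_relevant_subpop = 0.20     # frequency among plausible alternative suspects (clustered tastes)

lr_whole = p_match_given_guilty / freq_whole_population     # 50
lr_relevant = p_match_given_guilty / freq_relevant_subpop   # 5

print(lr_whole, lr_relevant)    # the evidence is ten times weaker than the whole-population figure implies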

My advice: do not get involved with people who mostly get involved with people like you, unless you trust them all.

The Appeal

Prof. Jamieson, an expert on the evaluation of evidence whose statements informed the appeal, said:

“It is essential for the population data for these shoes be applicable to the population potentially present at the scene. Regional, time, and cultural differences all affect the frequency of particular footwear in a relevant population. That data was simply not … . If the shoes were more common in such a population then the probative value is lessened. The converse is also true, but we do not know which is the accurate position.”

Thus the professor is arguing that the estimated likelihood could be too high or too low, and that the defence ought to be given the benefit of the doubt. I have argued that using a whole-population likelihood is likely to be biased against the defence, as I expect traits such as the choice of shoes to be clustered.

Science and Justice

Faigman, Jamieson et al., ‘Response to Aitken et al. on R v T’, Science and Justice 51 (2011) 213–214.

This argues against an unthinking application of likelihood ratios, noting:

  • That the defence may reasonably not be able to explain the evidence, so that there may be no reliable source for an innocent hypothesis.
  • That assessment of likelihoods will depend on experience, the basis for which should be disclosed and open to challenge.
  • If there is doubt as to how to handle uncertainty, any method ought to be tested in court and not dictated by armchair experts.

On the other hand, when it says “Accepting that probability theory provides a coherent foundation …” it fails to note that coherence is beside the point: is it credible?

Comment

The current situation seems unsatisfactory, with the best available advice both too simplistic and not simple enough. In similar situations I have co-authored a large document which has then been split into two: guidance for practitioners and justification. It may not be possible to give comprehensive guidance for practitioners, in which case one should aim to give ‘safe’ advice, so that practitioners are clear about when they can use their own judgment and when they should seek advice. This inevitably becomes a ‘legal’ document, but that seems unavoidable.

In my view it should not be simply assumed that the appropriate representation of uncertainty is ‘nothing but a number’. Instead one should take Keynes’ concerns seriously in the guidance and explicitly argue for a simpler approach avoiding ‘reasonable doubt’, where appropriate. I would also suggest that any proposed principles ought to be compared with past cases, particularly those which have turned out to be miscarriages of justice. As the appeal judge did, this might usefully consider foreign cases to build up an adequate ‘database’.

My expectation is that this would show that the use of whole-population likelihoods as in R v T is biased against defendants who are in a suspect social group.

More generally, I think that any guidance ought to apply to my growing collection of uncertainty puzzles, even if it only cautions against a simplistic application of any rule in such cases.

See Also

Blogs: The Register, W Briggs and Convicted by statistics (referring to previous miscarriages).

My notes on probability. A relevant puzzle.

Dave Marsay 

Science advice and the management of risk

Science advice and the management of risk in government and business

The Foundation for Science and Technology, 10 November 2010

An authoritative summary of the UK government’s position on risk, with talks and papers.

  •  Beddington gives a good overview. He discusses probability-versus-impact ‘heat maps’, the use of ‘worst case’ scenarios, the limitations of heat maps and Blackett reviews, and how management strategy has to reflect both the location on the heat map and the uncertainty in that location.
  • Omand discusses ‘Why won’t they (politicians) listen (to the experts)?’ He notes the difference between secrets (hard to uncover) and mysteries (hard to make sense of), and makes ‘common cause’ between science and intelligence in attempting to communicate with politicians. He presents a familiar type of chart in which probability is thought of as totally ordered (as in Bayesian probability) and seeks to standardise the descriptors of ranges of probability, such as ‘highly probable’.
  • Goodman discusses economic risk management and the need to cope with ‘irrational cycles of exuberance’, focussing on ‘low probability, high impact’ events. Only some risks can be quantified. He recommends the ‘generalised Pareto distribution’.
  • Spiegelhalter introduced the discussion with some important insights:

The issue ultimately comes down to whether we can put numbers on these events.  … how can a figure communicate the enormous number of assumptions which underlie such quantifications? … The … goal of a numerical probability … becomes much more difficult when dealing with deeper uncertainties. … This concerns the acknowledgment of indeterminacy and ignorance.

Standard methods of analysis deal with recognised, quantifiable uncertainties, but this is only part of the story, although … we tend to focus at this level. A first extra step is to be explicit about acknowledged inadequacies – things that are not put into the analysis such as the methane cycle in climate models. These could be called ‘indeterminacy’. We do not know how to quantify them but we know they might be influential.

Yet there are even greater unknowns which require an essential humility. This is not just ignorance about what is wrong with the model, it is an acknowledgment that there could be a different conceptual basis for our analysis, another way to approach the problem.

There will be a continuing debate  about the process of communicating these deeper uncertainties.

  • The discussion covered the following:
    • More coverage of the role of emotion and group think is needed.
    • “[G]overnments did not base policies on evidence; they proclaimed them because they thought that a particular policy would attract votes. They would then seek to find evidence that supported their view. It would be more realistic to ask for policies to be evidence tested [rather than evidence-based.]”
    • “A new language was needed to describe uncertainty and the impossibility of removing risk from ordinary life … .”
    •  Advisors must advise, not covertly subvert decision-making.

Comments

If we accept that there is more to uncertainty than  can be reflected in a typical scale of probability, then it is no wonder that organisational decisions fail to take account of it adequately, or that some advisors seek to subvert such poor processes. Moreover, this seems to be a ‘difference that makes a difference’.

From a Keynesian perspective conditional probabilities, P(X|A), sometimes exist but unconditional ones, P(X), rarely do. As Spiegelhalter notes, it is often the assumptions that are wrong: the estimated probability is then irrelevant. Spiegelhalter mentioned the common use of ‘sensitivity analysis’, noting that it is unhelpful. But what is commonly done is to test the sensitivity of P(X|y,A) to some minor variable y while keeping the assumptions, A, fixed. What is more often needed (for these types of risk) is sensitivity to the assumptions themselves. Thus, if P(X|A) is high:

  • one needs to identify possible alternatives, A’, to A for which P(X|A’) is low, no matter how improbable A’ may be regarded.

Here:

  • ‘Possible’ means consistent with the evidence rather than anything psychological.
  • The criteria for what is regarded as ‘low’ or ‘high’ will be set by the decision context.

The rationale is that everything that has ever happened was, with hindsight, possible: the things that catch us out are those that we overlooked, perhaps because we thought them improbable.

A conventional analysis would overlook emergent properties, such as booming cycles of ‘irrational’ exuberance. Thus, in considering alternatives, one needs to consider potential emotional dynamics and other emergent or epochal events.

This suggests that a typical ‘risk communication’ would consist of an extrapolated ‘main case’ probability together with a description of scenarios under which a very different probability would hold.
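
A minimal sketch of the kind of assumption-sensitivity check and risk communication suggested above; the assumption sets and probabilities are invented. Rather than perturbing a minor variable y within fixed assumptions A, one scans alternative assumption sets A’ that remain consistent with the evidence and reports any under which the assessment would be very different.

# Invented figures: sensitivity to assumptions rather than to minor variables.
# Each entry pairs an alternative assumption set A', judged consistent with the
# evidence, with the conditional probability P(X | A') under it.
main_case = ("A: markets remain liquid; behaviour stays within historical norms", 0.9)
alternatives = [
    ("A1: a confidence-driven feedback loop takes hold", 0.2),
    ("A2: a key counterparty fails", 0.3),
    ("A3: reporting delays hide the early signs", 0.85),
]

LOW = 0.5   # what counts as 'low' is set by the decision context, not by the analysis

label, p_main = main_case
print(f"Main case: P(X | {label}) = {p_main}")
for label, p in alternatives:
    if p < LOW:
        print(f"  Overturning scenario: P(X | {label}) = {p}")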

See also

Mathematics, heat maps, extrapolation and induction.

Other debates, my bibliography.

Dave Marsay

 

Induction, novelty and possibilistic causality

The concept of induction normally bundles together a number of stages, of which the key ones are modelling and extrapolating. Here I speculatively consider causality through the ‘lens’ of induction.

If I perform induction and what is subsequently observed fits the extrapolation then, in a sense, there is no novelty. If what happened was part of an epoch where things fit the model, then the epoch has not ended: I only need to adjust some parameter within the model that is supposed to vary with time. In this case I can say that conformance to the model (with the values of its variables) could have caused the observed behaviour. That is, any notion of causality is entailed by the model. If we consider modelling and extrapolation as flow, then what happens seems to be flowing within the epoch. The general model (with some ‘slack’ in its variables) describes a tendency for change, which can be thought of as a field (as Smuts does).

As with the interpretation of induction, we have to be careful. There may be multiple inconsistent models and hence multiple inconsistent possible causes. For example, an aircraft plot may fit both civil and military aircraft, which may be heading for different airports. Similarly, we often need to make assumptions to make the data fit the model, so different assumptions can lead to different models. For example, if an aircraft suddenly loses height we may assume that it has received an instruction, or that it is in trouble. These would lead to different extrapolations. As with induction, we neglect the caveats at our peril.

We can distinguish the following types of ‘surprise’:

  1. Where rare events happen within an epoch, without affecting the epoch. (Like an aircraft being struck by lightning, harmlessly.)
  2. Where the induction was only possibilistic, and one of its possible predictions actually occurred. (Where one predicts that at least one aircraft will manoeuvre to avoid a collision, or there will be a crash.)
  3. Where induction shows that the epoch has become self-defeating. (As when a period of an aircraft flying straight and level has to be ended to avoid a crash – which would end the epoch anyway.)
  4. Where the epoch is ended by external events. (As when air traffic control fails.)

These all distinguish between different types of ’cause’. Sometimes two or more types may act together. (For example, when two airplanes crash together, the ’cause’ usually involves both planes and air traffic control. Similarly, if a radar is tracking an aircraft flying straight and level, we can say that the current location of the aircraft is ’caused by’ the laws of physics, the steady hand of the pilot, and the continued availability of fuel etc. But in a sense it also ’caused by’ not having been shot down.)

If the epoch appears to have continued then a part of the cause is the lack of all those things that could have ended it.  If the epoch appears to have ended then we may have no model or only a very partial model for what happens. If we have a fuller model we can use that to explain what happened and hence to describe ‘the cause’. But with a partial model we may only be able to put constraints on what happened in a very vague way. (For example, if we launch a rocket we may say what caused it to reach its intended target, but if it misbehaves we could only say that it will end up somewhere in quite a large zone, and we may be able to say what caused it to fail but not what caused it to land where it did. Rockets are designed to operate within the bounds of what is understood: if they fail ‘interesting’ things can happen.) Thus we may not always be able to give a possible cause for the event of interest, but would hope to be able to say something helpful.

In so far as we can talk about causes, we are talking about the result of applying a theory / model / hypothesis that fits the data. The use of the word ’cause’ is thus a short-hand for the situation where the relevant theory is understood.

Any attempt to draw conclusions from data involves modelling, and the effectiveness of induction feeds back into the modelling process, fitting some hypotheses while violating others. The term ’cause’ is suggestive that this process is mature and reliable. Its use thus tends to go with a pragmatic approach. Otherwise one should be aware of the inevitable uncertainties. To say that X [possibly] causes Y is simply to say that one’s experience to date fits X causes Y, subject to certain assumptions. It may not be sensible to rely on this, for example where you are in an adversarial situation and your opponent has a broader range of relevant experience than you, or where you are using your notion of causality to influence something that may be counter-adapting. Any notion of causality is just a theory. Thus it seems quite proper for physicists to seek to redefine causality in order to cope with Quantum Physics.

Dave Marsay