Who thinks probability is just a number? A plea.

Many people think – perhaps they were taught it – that it is meaningful to talk about the unconditional probability of ‘Heads’ (i.e. P(Heads)) for a real coin, and even that there are logical or mathematical arguments to this effect. I have been collecting and commenting on works which have been – too widely – interpreted in this way, and quoting their authors in contradiction. De Finetti seemed to be the only respected figure who thought that he had provided such an argument. But a friendly economist has just forwarded a link to a recent work that debunks this notion, based on a wider reading of his work.

So, am I done? Does anyone have any seeming mathematical sources for the view that ‘probability is just a number’ for me to consider?

I have already covered:

There are some more modern authors who make strong claims about probability, but – unless you know different – they rely on the above, and hence do not need to be addressed separately. I do also opine on a few less well-known sources: you can search my blog to check.

Dave Marsay

Is ‘probability’ a systemic issue?

My previous post was in part motivated by the current inquiry into the UK’s Post Office scandal. I note that one investigator said that the computer problems were not ‘systemic’ because there was no specific evidence that they had directly affected that many branches. Oh dear!

I have often opined that the entire computer industry has chronic systemic issues that affect us all (but mostly indirectly). But it seems – on the basis of the above usage – that I have been wrong. But still I claim that there is a serious problem that affects us all. You may not agree, but – hypothetically – what kind of problem would it be, if not ‘systemic’?

As my title suggests, I wouldn’t want to pick on any particular group as having a problem. Actually, I am more concerned about other technocrats, the legal profession, politicians and too many ‘public intellectuals’, and more concerned about impacts in epidemics, conflicts and economics.

It seems to me that we have a ‘systemic’ problem with understanding ‘probability’. But if the meaning of these words is now too narrow, how am I to express myself?

Dave Marsay

The ‘absolute certainty’ uncertainty principle

A well-established principle of research is to be guided by the probabilities, and more generally this is widely regarded as ‘rational’. It would certainly seem odd to look where you didn’t expect to find anything. For example, if you are working in a seemingly well-established field among competent researchers who are guided by ‘the probabilities’ as they see them but failing, you should ‘probably’ look elsewhere. It is thus reasonable for researchers to understand their colleagues’ expectations, to form their own view of their collective successes in their field and of the significance (or otherwise) of any anomalies, and to act on them, often looking where others haven’t or using tools that others haven’t. Some would argue that anyone who does otherwise (e.g., by being unduly influenced by the expressed expectations of ‘the herd’) is not really researching, but simply someone working in the field, often as a ‘resource’ managed by others.

To take an extreme example, suppose that someone expresses ‘absolute confidence’ in something. Then they are admitting that no amount of contrary evidence would change their mind, in which case there can be no grounds for relying on anything they say. If all those engaged in an enterprise have ‘absolute confidence’ in something which we would otherwise think a possible error behind some manifest problem, then this seems a good reason to doubt it more than we would have done. If we think them reasonably competent then we might doubt less the things that they have doubted and seriously investigated. Roughly, the more confidence people express in something, the more reasonable it is to doubt it.

More generally, if we have a certain level of concern that something could be significantly contributing to some problems and we find that it has been either overlooked, denied or regarded as ‘practically certain’ not to be a significant factor, then the more we should be concerned that it is. Moreover, if we find the ‘practical certainty’ is routinely justified by reference to some supposed ‘facts’, then we should be more sceptical about those ‘facts’ the more reliance has been put on them, which tends to be associated with the confidence placed in them.

In summary: (for a given level of concern) the more over-confidence has been placed in something, the less confidence you should have in it. And ‘absolute confidence’ in anything practical is absolutely to be discounted.

This is ‘good practice’ among (some) researchers. I also think that where you find seemingly excessive confidence in something that you already had concerns about, it is at least reasonable to become more concerned; moreover, the more others try to reassure you, distract you or ‘manage’ your thinking or research, the more concerned you should be.

Of course, one has to recognise that in practice there may be good reasons to attempt to manage or otherwise constrain research, so one may need to consider a broader context before acting on or even expressing one’s concerns. But perhaps one should still be concerned, and be more concerned if those concerns are not respected by those who think they have ‘the bigger picture’, and be sceptical of any attempts to suppress or discredit the expression of such reasonable concerns. The media gives me the strong impression that ‘in practice’ ‘absolute confidence’ and ‘practical certainties’ too often corrupt, and that ‘on balance’ we may do better if we had fewer constraints on research and debate. But who can really tell?

Dave Marsay

AI Safety: Uncertain consequences

The UK government is hosting a summit on AI safety. I summarise their concerns below, followed by some additional concerns of my own.

“There are 2 particular categories of risk that we are focusing on:

1. Misuse risks, for example where a bad actor is aided by new AI capabilities in biological or cyber-attacks, development of dangerous technologies, or critical system interference. Unchecked, this could create significant harm, including the loss of life.

2. Loss of control risks: risks that could emerge from advanced systems that we would seek to be aligned with our values and intentions.”

A discussion paper has been prepared “on the need for further research” which outlines some potential societal harms. There is also a ‘future risks’ paper that notes:

“future risks are likely to fall in the following categories:
a. Providing new capabilities to a malicious actor.
b. Misapplication by a non-malicious actor.
c. Poor performance of a model used for its intended purpose, for example leading to biased decisions.
d. Unintended outcomes from interactions with other AI systems.
e. Impacts resulting from interactions with external societal, political, and economic systems.
f. Loss of human control and oversight, with an autonomous model then taking harmful actions.
g. Overreliance on AI systems, which cannot subsequently be unpicked.
h. Societal concerns around AI reduce the realisation of potential benefits.”

It also presents five scenarios to facilitate the consideration of risks:

Scenario 1: Unpredictable Advanced AI
Scenario 2: AI Disrupts the Workforce
Scenario 3: AI ‘Wild West’
Scenario 4: Advanced AI on a knife edge
Scenario 5: AI Disappoints.

My Concerns

I share concerns about potential societal impacts, but my concerns are somewhat broader.

The ‘Systems’ in category ‘e’ risks above could easily be construed as current institutions and patterns of behaviour, and the scenarios could all be taken as variations on the status quo. But faced as we are with challenges such as chronic warfare and climate change, it seems doubtful that the status quo is adequate, so my concern is the extent to which AI may help or hinder societies in coping with their challenges.

As a mathematician, I share doubts about the appropriateness of current narrowly ‘rationalistic’ approaches to addressing issues of economics, finance, epidemics, confrontations and conflicts, as in my blog. Maybe advanced AI could help here. But will it?

It seems to me that an essential aspect of our ability to rise to any challenges is the quality of debate. Imagine that social media and AI had developed in some relatively advanced country, such as the US, UK or Germany, 100 years ago, and spread from there. It might well have made a significant difference to the course of events: would things have necessarily turned out better?

The analysis above suggests a real risk that the subject technologies might lead to increased factionalism, with a poorer standard of debate and potentially much poorer societal outcomes. But might they not also lead to more harmonious debate, more supportive of effective action to combat shared challenges such as climate change, and might not such an outcome deserve our support?

There are certainly opportunities here, but also risks. Sustained effort to meet a chronic challenge might reasonably be expected to lead to a broad consensus of views, and hence a lessening of the appreciation of the need for healthy debate. This could produce an ‘established common sense’ that might easily be supported and even defended, whether by well-intentioned social actors aided by AI or simply as a consequence of the tendency of AI tools to reflect such views. Thus if such technologies had been developed 100 years ago we might now be stuck with social attitudes of the past which we now consider detrimental. It may be that we are complacent about our current social attitudes, but I am concerned that there will always be fresh issues that will challenge any ossified social attitudes (including ‘values’ and ‘intentions’), potentially to our detriment.

More broadly, it seems to me that the types of technologies being considered magnify both our strengths and weaknesses, and the challenge is to improve ourselves lest our interactions with emerging technologies prove fateful. But at least a summit offers potential in this direction.

Dave Marsay

What is ‘evidence’?

As a mathematician, I tend to worry about mathematical theories and how they are applied, particularly how they are misunderstood and misapplied. Hence my blog has mostly been about uncertainty and how it can be understood (or misunderstood) by reference to relevant mathematical theories, and in particular why some regard ‘mathematical probability’ as the go-to theory even when faced with – to take some topical examples – climate change, discrimination (such as ‘institutional racism’) and pandemics, where the theory is clearly not so useful and has such potential to be misapplied.

But in trying to make sense of the UK approach to pandemics, and its claim to be ‘following the science’, I now adjust my question:

What do those with power and influence think ‘evidence’ is?

There is a lot of ‘text’ available, much of it seems to me to assume some sort of implicit underlying theory of uncertainty, such as probability, without addressing the appropriateness of any such theory or even giving much hint of what such a theory might be or how to think about it logically or mathematically. If anyone can suggest something on this topic that I might begin to make sense of (other than as a dogma) and so usefully critique, I’d appreciate it. A related question is:

What kind of logic is appropriate to thinking about ‘evidence’, and how would one judge its appropriateness?

Dave Marsay

Applications of Statistics

Lars Syll has commented on a book by David Salsburg, criticising workaday applications of statistics. Lars has this quote:

Kolmogorov established the mathematical meaning of probability: Probability is a measure of sets in an abstract space of events.

This is not quite right.

  • Kolmogorov established a possible meaning, not ‘the’ meaning. (Actually Wittgenstein anticipated him.)
  • Even taking this theory, it is not clear why the space should be ‘measurable’. More generally one has ‘upper’ and ‘lower’ measures, which need not be equal. One can extend the more familiar notions of probability, entropy, information and statistics to such measures. Such extended notions seem more credible.
  • In practice one often has some ‘given data’ which is at least slightly distant from the ‘real’ ‘events’ of interest. The data space is typically a rather tame ‘space’, so that careful use of statistics is appropriate. But one still has the problem of ‘lifting’ the results to the ‘real events’.
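One way to make ‘upper’ and ‘lower’ measures concrete is the belief/plausibility pair of Dempster-Shafer theory. A minimal sketch in Python (the frame and mass assignment here are invented for illustration): when some mass is left on the whole frame, representing sheer ignorance, the two measures come apart.

```python
def belief_and_plausibility(masses, query):
    # masses: a mass assignment over subsets of a frame, as a dict
    # mapping frozenset -> non-negative mass, summing to 1.
    # 'Belief' (the lower measure) counts mass that certainly lies
    # within the query; 'plausibility' (the upper measure) counts
    # mass that merely could.
    bel = sum(m for s, m in masses.items() if s <= query)
    pl = sum(m for s, m in masses.items() if s & query)
    return bel, pl

# Frame {a, b, c}, with half the mass left on the whole frame
# (ignorance), so the two measures need not be equal.
m = {frozenset("a"): 0.5, frozenset("abc"): 0.5}
print(belief_and_plausibility(m, frozenset("a")))          # (0.5, 1.0)
print(belief_and_plausibility(m, frozenset({"b", "c"})))   # (0, 0.5)
```

Any probability assignment consistent with the evidence lies between the two measures, which is what makes the extended notions more credible when the evidence genuinely underdetermines a single measure.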

These remarks seem to cover the critiques of Syll and Salsburg, but are more nuanced. Statistical results, like any mathematics, need to be interpreted with care. But, depending on which of the above remarks apply, the results may be more or less easy to interpret: not all naive statistics are equally dubious!

Dave Marsay

AI pros and cons

Henry A. Kissinger, Eric Schmidt & Daniel Huttenlocher, The Metamorphosis, The Atlantic, August 2019.

AI will bring many wonders. It may also destabilize everything from nuclear détente to human friendships. We need to think much harder about how to adapt.

The authors are looking for comments. My initial reaction is here. I hope to say more. Meanwhile, I’d appreciate your reactions.


Dave Marsay

The limits of pragmatism

This is a personal attempt to identify and articulate a fruitful form of pragmatism, as distinct from what seems to me the many dangerous forms. My starting point is Wikipedia and my notion that the differences it notes can sometimes matter.

Doubt, like belief, requires justification. Genuine doubt irritates and inhibits, in the sense that belief is that upon which one is prepared to act.[2] It arises from confrontation with some specific recalcitrant matter of fact (which Dewey called a “situation”), which unsettles our belief in some specific proposition. Inquiry is then the rationally self-controlled process of attempting to return to a settled state of belief about the matter. Note that anti-skepticism is a reaction to modern academic skepticism in the wake of Descartes. The pragmatist insistence that all knowledge is tentative is quite congenial to the older skeptical tradition.

My own contribution to things scientific has been on some very specific issues, but which I attempt to generalise:

  • It sometimes seems much too late to wait to act on doubt for something that pragmatic folk would recognize as a ‘specific recalcitrant matter of fact’. I would rather say (with the skeptics) that we should always be in some doubt, but that our actions require justification, and we should only invest in relation to that justification. Requiring ‘facts’ seems too high a hurdle for acting at all.
  • Psychologically, people do seek ‘settled states of belief’, but I would rather say (with the skeptics) that the degree of settledness ought to be only in so far as is justified. Relatively settled belief but not fundamentalist dogma!
  • It is often supposed that ‘facts’ and ‘beliefs’ should concern the ‘state’ of some supposed ‘real world’. There is some evidence that it is ‘better’ in some sense to think of the world as one in which certain processes are appropriate. In this case, as in category theory, the apparent state arises as a consequence of sufficient constraints on the processes. This can make an important difference when one considers uncertainties, but in ‘small worlds’ there are no such uncertainties.

It seems to me that the notion of ‘small worlds’ is helpful. A small world would be one which could be conceived of or ‘mentally modelled’. Pragmatists (of differing varieties) seem to believe that often we can conceive of a small world representation of the actual world, and act on that representation ‘as if’ the world were really small. So far, I find this plausible, even if not my own habit of thinking. The contentious point, I think, is that in every situation we should do our best to form a small world representation and then act as if it were true unless and until we are confronted with some ‘specific recalcitrant matter of fact’. This can be too late.

But let us take the notion of a ‘small world’ as far as we can. It is accepted that the small world might be violated. If it could be violated as a consequence of something that we might inadvertently do then it hardly seems a ‘pragmatic’ notion in terms of ordinary usage, and might reasonably be said to be dangerous in so far as it lulls us into a false sense of security.

One common interpretation of ‘pragmatism’ seems to be that we may as well act on our beliefs as there seems no alternative. But I shall refute this by presenting one. Another interpretation is that there is no ‘practical’ alternative. That is to say, whatever we do could not affect the potential violation of the small world. But if this is the case it seems to me that there must be some insulation between ourselves and the small world. Thus the small world is actually embedded in some larger closed world. But do we just suppose that we are so insulated, or do we have some specific closed world in mind?

It seems to me that doubt is more justified the less our belief in insulation is justified. Even when we have specific insulation in mind, we surely need to keep an open mind and monitor the situation for any changes, or any reduction in justification for our belief.

From this, it seems to me that (as in my own work) what matters is not having some small world belief, but taking a view on the insulations between what you seek to change and what you seek to rely on as unchanging. And from these, identifying not just a single credible world in which to anchor one’s justifications for action, but seeking out several credible possible small worlds in the hope that at least one may remain credible as things proceed.

Dave Marsay

See also my earlier thoughts on pragmatism, from a different starting point.

Addendum: Locke anticipated my 3 bullet points above, by a few centuries. Pragmatists seem to argue that we don’t have to take some of Locke’s concerns too seriously. But maybe we should. It further occurs to me that there are often situations where in the short-run ‘pragmatism pays’, but in the long-run things can go increasingly awry. Locke offers an alternative to the familiar short-term utilitarianism that seems to make more sense. Whilst it may be beneficial to keep developing theories pragmatically, in the longer term one would do well to seek more logical (if less precise) theories from which one can develop pragmatic ‘beliefs’ that are not unduly affected by beliefs that may have been pragmatic in previous situations, but which no longer are. One might say that rather than stopping being pragmatic, one’s pragmatism should, from time to time, consider the potential long-run consequences, lest the long-run eventually burst upon one, creating a crisis and a need for a challenging paradigm shift.

An alternative is to recognise the issues arising from one’s current ‘pragmatic’ beliefs, and attempt to ‘regress to progress’. But this seems harder, and may be impossible under time pressure.

Which pragmatism as a guide to life?

Much debate on practical matters ends up in distracting metaphysics. If only we could all agree on what was ‘pragmatic’. My blog is mostly negative, in so far as it rubbishes various suggestions, but ‘the best is the enemy of the good’, and we do need to do something.

Unfortunately, different ‘schools’ start from a huge variety of different places, so it is difficult to compare and contrast approaches. But it is about time I had a go. (In part inspired by a recent public engagement talk on mathematics).

Suppose you have a method Π that you regard as pragmatic, in the sense that you can always act on it. To justify this, I think (like Popper) that you should have criteria, Γ, which if falsified would lead you to reconsider Π. So your pragmatic process is actually

If Γ then Π, else reconsider.

But this is hardly reasonable if we try to arrange things so that Γ will never appear to be falsified. So an improvement is:

Spend some effort in monitoring Γ. If it is not falsified then Π.

In practice, if one thinks that Γ can be relied on, one may not think it worth spending much effort on checking it, but surely one should at least be open to suggestions that it could be wrong. The proper balance between monitoring Γ and acting on Π seems impossible to establish with any confidence, but ignoring all evidence against Γ seems risky, to say the least.
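The monitored version of the process can be sketched as follows (a toy illustration in Python; the names standing in for Π and Γ, and the failure condition, are invented):

```python
def pragmatic_process(act, caveat_holds, reconsider, steps, monitor_every=5):
    # Mostly act on the pragmatic method ('act', standing in for Pi),
    # but periodically spend some effort re-checking the caveat
    # ('caveat_holds', standing in for Gamma), handing control to
    # 'reconsider' as soon as a check finds the caveat falsified.
    for t in range(steps):
        if t % monitor_every == 0 and not caveat_holds():
            return reconsider()
        act()
    return "caveat held throughout"

# Toy usage: the caveat fails once we have acted 7 times, and the
# periodic monitoring notices at the next scheduled check.
state = {"acts": 0}
result = pragmatic_process(
    act=lambda: state.update(acts=state["acts"] + 1),
    caveat_holds=lambda: state["acts"] < 7,
    reconsider=lambda: "caveat falsified: reconsider",
    steps=100,
)
```

Here `monitor_every` is the (arbitrary) balance between monitoring Γ and acting on Π; as the text says, there is no confident way to set it, but setting it to infinity amounts to ignoring all evidence against Γ.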

Some argue that if you have no alternative to Π then it is pointless considering Γ. This may be a reasonable argument when applied to concepts, but not to actions in the real world. Whatever evidence we may have for Π, it will never uniquely prove it. It may be that it rules out all the alternatives that we have thought of, or which we consider credible or otherwise acceptable, but then we should think again. Logically, there are always alternatives.

The above clearly applies to science. No theory is ever regarded as absolute and forever. Scientists make their careers by identifying alternative theories to explain the experimental results and then devising new experiments to try to falsify the current theory. This process could only ever end when we were all sure that we had performed every possible experiment using every possible means in every possible circumstance, which implies the end of evolution and inventiveness. We aren’t there yet.

My proposal, then, is that very generally (not just in science) we ought to expect any ‘pragmatic’ Π to include a specific ‘caveat’, Γ(Π). If it doesn’t, we ought to develop one. This caveat will include its own falsifying tests, and we ought to regard more severe tests (in some sense) as better. We then seek to develop alternatives that might be less precise (and hence less ‘useful’) than Π but which might survive falsification of Π.

Much of my blog has some ideas on how to do this in particular cases: a work in progress. But an example may appeal:

Faced with what looks like a coin being tossed, we might act ‘as if’ we believe it to be fair and to correspond to the axioms of mathematical probability theory, but keep an eye out for evidence to the contrary. Perhaps we inspect it and toss it a few times. Perhaps we watch whoever tosses it carefully. We do what we can, but still, if someone tosses it and over a very long run gets an excess of ‘Heads’ that our statistical friends tell us is hugely significant, we may be suspicious and reconsider.

In this case we may decline from gambling on coin tosses even if we lack a specific ‘theory of the coin’, but it might be better if we had an alternative theory. Perhaps it is an ingenious fake coin? Perhaps the person tossing it has a cunning technique to bias it? Perhaps the person tossing it is a magician, and is actually faking the results?
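A minimal sketch of such a falsifying check, using the normal approximation to the binomial (the significance threshold here is an arbitrary illustration, not a recommendation):

```python
from math import erf, sqrt

def suspicious(heads, tosses, threshold=1e-6):
    # Two-sided p-value for the observed excess of heads under the
    # fair-coin hypothesis, via the normal approximation to the
    # binomial: z = |heads - n/2| / sqrt(n/4). A tiny p-value is the
    # 'hugely significant' excess that should make us reconsider
    # the fair-coin model.
    z = abs(heads - tosses / 2) / sqrt(tosses / 4)
    p_value = 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))
    return p_value < threshold

suspicious(510, 1000)    # a mild excess: keep acting 'as if' fair
suspicious(9000, 10000)  # an overwhelming excess: reconsider
```

The point is not the particular statistic but that the ‘as if fair’ belief comes bundled with an explicit caveat Γ that we keep testing, rather than being held come what may.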

This seems to me like a good approach, surely better than acting ‘pragmatically’ but without such falsifying criteria. Can it be improved upon? (Suggestions please!)

Dave Marsay

What logical term or concept ought to be more widely known?

Various authors, What scientific term or concept ought to be more widely known? Edge, 2017.

INTRODUCTION: SCIENTIA

Science—that is, reliable methods for obtaining knowledge—is an essential part of psychology and the social sciences, especially economics, geography, history, and political science. …

Science is nothing more nor less than the most reliable way of gaining knowledge about anything, whether it be the human spirit, the role of great figures in history, or the structure of DNA.

Contributions

As against others on:

(This is as far as I’ve got.)

Comment

I’ve grouped the contributions according to whether or not I think they give due weight to the notion of uncertainty as expressed in my blog. Interestingly Steven Pinker seems not to give due weight in his article, whereas he is credited by Nicholas G. Carr with some profound insights (in the first of the second batch). So maybe I am not reading them right.

My own thinking

Misplaced Concreteness

Whitehead’s fallacy of misplaced concreteness, also known as the reification fallacy, “holds when one mistakes an abstract belief, opinion, or concept about the way things are for a physical or ‘concrete’ reality.” Most of what we think of as knowledge is ‘known about a theory’ rather than truly ‘known about reality’. The difference seems to matter in psychology, sociology, economics and physics. This is not a term or concept of any particular science, but rather a seeming ‘brute fact’ of ‘the theory of science’ that perhaps ought to have been called attention to in the above article.

Morphogenesis

My own specific suggestion, to illustrate the above fallacy, would be Turing’s theory of ‘Morphogenesis’. The particular predictions seem to have been confirmed ‘scientifically’, but it is essentially a logical / mathematical theory. If, as the introduction to the Edge article suggests, science is “reliable methods for obtaining knowledge” then it seems to me that logic and mathematics are more reliable than empirical methods, and deserve some special recognition. Although, I must concede that it may be hard to tell logic from pseudo-logic, and that unless you can do so my distinction is potentially dangerous.

The second law of thermodynamics, and much common sense rationality, assumes a situation in which the law of large numbers applies. But Turing adds to the second law’s notion of random dissipation a notion of relative structuring (as in gravity) to show that ‘critical instabilities’ are inevitable. These are inconsistent with the law of large numbers, so the assumptions of the second law of thermodynamics (and much else) cannot be true. The universe cannot be ‘closed’ in its sense.
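Turing’s mechanism can be illustrated with a toy linearised two-species reaction-diffusion system (all coefficients below are invented for illustration, not taken from Turing’s paper). The reaction terms alone damp small fluctuations, as dissipation-only intuition would suggest; but give the two species unequal diffusion rates and some fluctuations grow into structure instead of dying away.

```python
import random

def simulate(Du, Dv, steps=200, dt=0.1, n=64, seed=0):
    # Linearised two-species reaction-diffusion on a ring of n cells:
    #   du/dt = a*u + b*v + Du * laplacian(u)
    #   dv/dt = c*u + d*v + Dv * laplacian(v)
    # The reaction matrix alone is stable (trace < 0, det > 0), so
    # without diffusion small random fluctuations decay to nothing.
    a, b, c, d = 1.0, -1.0, 2.0, -1.5
    rng = random.Random(seed)
    u = [1e-3 * (rng.random() - 0.5) for _ in range(n)]
    v = [1e-3 * (rng.random() - 0.5) for _ in range(n)]
    for _ in range(steps):
        lap_u = [u[i - 1] - 2 * u[i] + u[(i + 1) % n] for i in range(n)]
        lap_v = [v[i - 1] - 2 * v[i] + v[(i + 1) % n] for i in range(n)]
        u, v = (
            [u[i] + dt * (a * u[i] + b * v[i] + Du * lap_u[i]) for i in range(n)],
            [v[i] + dt * (c * u[i] + d * v[i] + Dv * lap_v[i]) for i in range(n)],
        )
    return max(abs(x) for x in u)

quiet = simulate(Du=0.0, Dv=0.0)       # reaction alone: fluctuations decay
patterned = simulate(Du=0.05, Dv=1.0)  # unequal diffusion: fluctuations grow
```

The fast-diffusing second species fails to suppress slowly diffusing local growth of the first, so random noise at certain wavelengths is amplified rather than averaged out: exactly the kind of ‘critical instability’ that is inconsistent with relying on the law of large numbers.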

Implications

The assumptions of the second law seem to leave no room for free will, hence no reason to believe in our agency, and hence no point in any of the contributions to Edge: they are what they are and we do what we do. But Pinker does not go so far: he simply notes that if things inevitably degrade we do not need to beat ourselves up, or look for scapegoats, when things go wrong. But this can be true even if the second law does not apply. If we take Turing seriously then a seemingly permanent status quo can contain the reasons for its own destruction, so that turning a blind eye and doing nothing can mean sleep-walking to disaster. Where Pinker concludes:

[An] underappreciation of the Second Law lures people into seeing every unsolved social problem as a sign that their country is being driven off a cliff. It’s in the very nature of the universe that life has problems. But it’s better to figure out how to solve them—to apply information and energy to expand our refuge of beneficial order—than to start a conflagration and hope for the best.

This would seem to follow more clearly from the theory of morphogenesis than the second law. Turing’s theory also goes some way to suggesting or even explaining the items in the second batch. So, I commend it.


Dave Marsay


Heuristics or Algorithms: Confused?

The Editor of the New Scientist (Vol. 3176, 5 May 2018, Letters, p54) opined, in response to Adrian Bowyer’s wish to distinguish between ‘heuristics’ and ‘algorithms’ in AI, that:

This distinction is no longer widely made by practitioners of the craft, and we have to follow language as it is used, even when it loses precision.

Sadly, I have to accept that AI folk tend consistently to fail to respect a widely held distinction, but it seems odd that their failure has led to an obligation on the New Scientist – which has a much broader readership than just AI folk. I would agree that in addressing audiences that include significant sectors that fail to make some distinction we need to be aware of the fact, but if the distinction is relevant – as Bowyer argues – surely we should explain it.

According to The Free Dictionary:

Heuristic: adj 1. Of or relating to a usually speculative formulation serving as a guide in the investigation or solution of a problem.

Algorithm: n: A finite set of unambiguous instructions that, given some set of initial conditions, can be performed in a prescribed sequence to achieve a certain goal and that has a recognizable set of end conditions.

It also offers this quote:

heuristic: of or relating to or using a general formulation that serves to guide investigation.

algorithmic: of or relating to or having the characteristics of an algorithm.

But perhaps this is not clear?

AI practitioners routinely apply algorithms as heuristics, in the same way that a bridge designer may routinely use a computer program. We might reasonably regard a bridge-designing app as good if it correctly implements best practice in bridge-building, but this is not to say that a bridge designed using it would necessarily be safe, particularly if it has significant novelties (as in London’s wobbly bridge).

Thus any app (or other process) has two sides: as an algorithm and as a heuristic. As an algorithm we ask if it meets its concrete goals. As a heuristic we ask if it solves a real-world problem. Thus a process for identifying some kind of undesirable would be regarded as good algorithmically if it conformed to our idea of the undesirables, but may still be poor heuristically. In particular, good AI would seem to depend on someone understanding at least the factors involved in the problem. This may not always be the case, no matter how ‘mathematically sophisticated’ the algorithms involved.
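The two sides can be shown with an invented toy example: the filter below is a correct algorithm against its concrete spec (it flags exactly those messages containing a listed keyword), yet it may be a poor heuristic for the real-world problem it was built for.

```python
# A hypothetical keyword spam filter. Its algorithmic spec: flag a
# message if and only if it contains one of the listed keywords.
SPAM_KEYWORDS = {"lottery", "prize"}

def flag_spam(message: str) -> bool:
    return bool(set(message.lower().split()) & SPAM_KEYWORDS)

# As an algorithm, it meets its concrete goal exactly:
print(flag_spam("claim your lottery prize"))   # True
print(flag_spam("meeting at noon"))            # False
# As a heuristic, it can fail on the real-world problem:
print(flag_spam("c1aim y0ur l0ttery pr1ze"))   # False, though clearly spam
```

The code is algorithmically beyond reproach, but judging it heuristically requires understanding the factors involved in real spam, which no amount of algorithmic correctness supplies.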

Perhaps you could improve on this attempted explanation?

Dave Marsay

Probability as a guide to life

‘Probability is the very guide to life.’

Cicero may have been right, but ‘probability’ means something quite different nowadays to what it did millennia ago. So what kind of probability is a suitable guide to life, and when?

Suppose that we are told that ‘P(X) = p’. Often there is some implied real or virtual population, P, a proportion ‘p’ of which has the property ‘X’. To interpret such a probability statement we need to know what the relevant population is. Such statements are then normally reliable. More controversial are conditional probabilities, such as ‘P(X|Y) = p’. If you satisfy Y, does P(X)=p ‘for you’?

Suppose that:

  1. All the properties of interest (such as X and Y) can be expressed as unions of elements of some disjoint basis, B.
  2. For all such basis properties, B, P(X|B) is known.
  3. That the conditional probabilities of interest are derived from the basis properties in the usual way. (E.g. P(X|B1∪B2) = (P(B1)·P(X|B1) + P(B2)·P(X|B2))/P(B1∪B2).)
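The construction in step 3 can be sketched directly (the basis properties and numbers below are invented for illustration):

```python
def cond_prob_over_union(parts):
    # parts: pairs (P(B), P(X|B)) for the disjoint basis properties B
    # whose union is the conditioning property.
    # Returns P(X | union of the Bs) by the usual weighted average.
    return sum(pb * pxb for pb, pxb in parts) / sum(pb for pb, _ in parts)

# With P(B1) = 0.3, P(X|B1) = 0.2 and P(B2) = 0.1, P(X|B2) = 0.6:
# P(X | B1 ∪ B2) = (0.3*0.2 + 0.1*0.6) / 0.4 = 0.3
print(cond_prob_over_union([(0.3, 0.2), (0.1, 0.6)]))
```

For a set Z that is not a union of basis properties, no such computation is available: different ways of distributing the basis mass over Z give different answers, which is the range of values the text goes on to discuss.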

The conditional probabilities constructed in this way are meaningful, but if we are interested in some other set, Z, the conditional probability P(X|Z) could take a range of values. But then we need to reconsider decision making. Instead of maximising a probability (or utility), the following heuristics may apply:

  • If the range makes significant difference, try to get more precise data. This may be by taking more samples, or by refining the properties considered.
  • Consider the best outcome for the worst-case probabilities.
  • If the above is not acceptable, make some reasonable assumptions until there is an acceptable result possible.

For example, suppose that we have some urns, each containing a mix of balls, some of which are white. We can choose an urn and then pick a ball at random. We want white balls. What should we do? The conventional rule consists of assessing the proportion of white balls in each, and picking an urn with the most. This is uncontroversial if our assessments are reliable. But suppose we are faced with an urn with an unknown mix? Conventionally our assessment should not depend on whether we want to obtain or avoid a white ball. But if we want white balls the worst-case proportion is no white balls, and we avoid this urn, whereas if we want to avoid white balls the worst-case proportion is all white balls, and we again avoid this urn.

If our assessments are not biased then we would expect to do better with the conventional rule most of the time and in the long-run. For example, if the non-white balls are black, and urns are equally likely to be filled with black as white balls, then assessing that an urn with unknown contents has half white balls is justified. But in other cases we just don’t know, and choosing this urn we could do consistently badly. There is a difference between an urn whose contents are unknown, but for which you have good grounds for estimating proportion, and an urn where you have no grounds for assessing proportion.
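
The urn example can be sketched with interval-valued proportions, applying the worst-case heuristic above. A minimal Python illustration (the urn names and intervals are hypothetical):

```python
# Each urn's proportion of white balls is known only as an interval [lo, hi].
# Urn names and intervals are hypothetical.
urns = {
    "A": (0.6, 0.6),   # well-assessed: 60% white
    "B": (0.4, 0.4),   # well-assessed: 40% white
    "U": (0.0, 1.0),   # unknown mix: anything is possible
}

def best_urn(urns, want_white=True):
    """Choose the urn with the best worst-case proportion of white balls."""
    if want_white:
        # worst case when seeking white is the lower bound
        return max(urns, key=lambda u: urns[u][0])
    # worst case when avoiding white is the upper bound
    return min(urns, key=lambda u: urns[u][1])
```

Whether we seek or avoid white balls, the unknown urn ‘U’ is never chosen: its worst case is bad in both directions, while the well-assessed urns are not.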

If precise probabilities are to be the very guide to life, it had better be a dull life. For more interesting lives imprecise probabilities can be used to reduce the possibilities. It is often informative to identify worst-case options, but one can be left with genuine choices. Conventional rationality is the only way to reduce living to a formula: but is it such a good idea?

Dave Marsay

How can economics be a science?

This note is prompted by Thaler’s Nobel prize, the reaction to it, and attempts by mathematicians to explain both what they do do and what they could do. Briefly, mathematicians are increasingly employed to assist practitioners (such as financiers) to sharpen their tools and improve their results, in some pre-defined sense (such as making more profit). They are less used to sharpen core ideas, much less to challenge assumptions. This is unfortunate when tools are misused and mathematicians blamed. It is no good saying that mathematicians should not go along with such misuse, since the misuse is often not obvious without some (expensive) investigations, and in any case whistleblowers are likely to get shown the door (even if only for being inefficient).

Mainstream economics aspires to be a science in the sense of being able to make predictions, at least probabilistically. Some (mostly before 2007/8) claimed that it achieved this, because its methods were scientific. But are they? Keynes coined the term ‘pseudo-mathematical’ for the then mainstream practices, whereby mathematics was applied without due regard for the soundness of the application. Then, as now, the mathematics in itself is as much beyond doubt as anything can be. The problem is a ‘halo effect’ whereby the application is regarded as ‘true’ just because the mathematics is. It is like physics before Einstein, whereby some (such as Locke) thought that classical geometry must be ‘true’ as physics, largely because it was so true as mathematics and they couldn’t envisage an alternative.

From a logical perspective, all that the use of scientific methods can do is to make probabilistic predictions that are contingent on there being no fundamental change. In some domains (such as particle physics, cosmology) there have never been any fundamental changes (at least since soon after the big bang) and we may not expect any. But economics, as life more generally, seems full of changes.

Popper famously noted that proper science is in principle falsifiable. Many practitioners in science and science-like fields regard the aim of their domain as to produce ‘scientific’ predictions. They have had to change their theories in the past, and may have to do so again. But many still suppose that there is some ultimate ‘true’ theory, to which their theories are tending. But according to Popper this is not a ‘proper’ scientific belief. Following Keynes we may call it an example of ‘pseudo-science’: something that masquerades as a science but goes beyond its bounds.

One approach to mainstream economics, then, is to disregard the pseudo-scientific ideology and just take its scientific content. Thus we may regard its predictions as mere extrapolations, and look out for circumstances in which they may not be valid. (As Eddington did for cosmology.)

Mainstream economics depends heavily on two notions:

  1. That there is some pre-ordained state space.
  2. That transitions evolve according to fixed conditional probabilities.
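
Taken together, these two notions amount to assuming something like a fixed Markov model. A minimal sketch (the states and numbers are my own illustrative choices, not anything from mainstream economics):

```python
# A pre-ordained state space with fixed transition probabilities
# is a (time-homogeneous) Markov chain.  States and numbers illustrative.
transition = {
    "boom": {"boom": 0.9, "bust": 0.1},   # P(next | current = boom)
    "bust": {"boom": 0.5, "bust": 0.5},   # P(next | current = bust)
}

def step(dist, transition):
    """Propagate a probability distribution one step forward."""
    out = {s: 0.0 for s in transition}
    for s, p in dist.items():
        for t, q in transition[s].items():
            out[t] += p * q
    return out

dist = {"boom": 1.0, "bust": 0.0}
for _ in range(100):
    dist = step(dist, transition)
# dist now approximates the stationary distribution (5/6, 1/6).
```

Under these assumptions the long run is settled in advance: the distribution converges regardless of the starting state, which is precisely what a fundamental change would upset.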

For most of us, most of the time, fortunately, these seem credible locally and in the short term, but not globally in space-time. (At the time of writing it seems hard to believe that just after the big bang there were in any meaningful sense state spaces and conditional probabilities that are now being realised.) We might adjust the usual assumptions:

The ‘real’ state of nature is unknowable, but one can make reasonable observations and extrapolations that will be ‘good enough’ most of the time for most routine purposes.

This is true for hard and soft sciences, and for economics. What varies is the balance between the routine and the exceptional.

Keynes observed that some economic structures work because people expect them to. For example, gold tends to rise in price because people think of it as being relatively sound. Thus anything that has a huge effect on expectations can undermine any prior extrapolations. This might be a new product or service, an independence movement, a conflict or a cyber failing. These all have a structural impact on economies that can cascade. But will the effect dissipate as it spreads, or may it result in a noticeable shift? A mainstream economist would argue that all such impacts are probabilistic, and hence all that was happening was that we were observing new parts of the existing state space and new transitions. Even if we suppose for a moment that this is true, it is not a scientific belief, and it hardly seems a useful way of thinking about potential and actual crises.

Mainstream economists suppose that people are ‘rational’, by which they mean that they act as if they are maximizing some utility, which is something to do with value and probability. But, even if the world is probabilistic, being rational is not necessarily scientific. For example, when a levee is built to withstand a ‘100 year storm’, this is scientific if it is clear that the claim is based on past storm data. But it is unscientific if there is an implicit claim that the climate cannot change. When building a levee it may be ‘rational’ to build it to withstand all but very improbable storms, but it is more sensible to add a margin and make contingency arrangements (as engineers normally do). In much of life it is common experience that the ‘scientific’ results aren’t entirely reliable, so it is ‘unscientific’ (or at least unreasonable) to totally rely on them.
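
The levee example can be made concrete. If each year independently has a 1-in-100 chance of such a storm (the implicit past-data assumption), the chance of at least one over a design life follows directly; a sketch with illustrative figures:

```python
def p_at_least_one(annual_p, years):
    """Probability of at least one event in `years` independent years."""
    return 1 - (1 - annual_p) ** years

# A levee built for the '100-year storm', over a 50-year design life
# (figures illustrative): the chance of seeing at least one such storm
# is roughly 0.4 -- even before allowing for a changing climate.
risk = p_at_least_one(1 / 100, 50)
```

This is one reason engineers add margins: the ‘improbable’ storm is quite likely over the structure’s lifetime, and the calculation is itself conditional on the climate staying put.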

Much of this is bread-and-butter in disciplines other than economics, and I am not sure that what economists mostly need is to improve their mathematics: they need to improve their sciencey-ness, and then use mathematics better. But I do think that they need somehow to come to a better appreciation of the mathematics of uncertainty, beyond basic probability theory and its ramifications.

Dave Marsay

Why do people hate maths?

New Scientist 3141 (2 Sept 2017) has the cover splash ‘Your mathematical mind: Why do our brains speak the language of reality?’. The article (p 31) is titled ‘The origin of mathematics’.

I have made pedantic comments on previous articles on similar topics, to be told that the author’s intentions have been slightly skewed in the editing process. Maybe it has again. But some interesting (to me) points still arise.

Firstly, we are told that brain scans show that:

a network of brain regions involved in mathematical thought that was activated when mathematicians reflected on problems in algebra, geometry and topology, but not when they were thinking about non-mathsy things. No such distinction was visible in other academics. Crucially, this “maths network” does not overlap with brain regions involved in language.

It seems reasonable to suppose that many people do not develop such a maths capability from experience in ordinary life or non-mathsy subjects, and perhaps don’t really appreciate its significance. Such people would certainly find maths stressful, which may explain their ‘hate’. At least we can say – contradicting the cover splash – that most people lack a mathematical mind, which may explain the difficulties mathematicians have in communicating.

In addition, I have come across a few seemingly sensible people who may seem to hate maths, although I would rather say that they hate ‘pseudo-maths’. For example, it may be true that we have a better grasp on reality if we can think mathematically – as scientists and technologists routinely do – but it seems a huge jump – and misleading – to claim that mathematics is ‘the language of reality’ in any more objective sense. By pseudo-maths I mean something that appears to be maths (at least to the non-mathematician) but which uses ordinary reasoning to make bold claims (such as ‘is the language of reality’).

But there is a more fundamental problem. The article cites Ashby to the effect that ‘effective control’ relies on adequate models. Such models are of course computational and as such we rely on mathematics to reason about them. Thus we might say that mathematics is the language of effective control. If – as some seem to – we make a dichotomy between controllable and not controllable systems then mathematics is the pragmatic language of reality. Here we enter murky waters. For example, if reality is socially constructed then presumably pragmatic social sciences (such as economics) are necessarily concerned with control, as reflected in their models. But one point of my blog is that the kind of maths that applies to control is only a small portion. There is at least the possibility that almost all things of interest to us as humans are better considered using different maths. In this sense it seems to me that some people justifiably hate control and hence related pseudo-maths. It would be interesting to give them a brain scan to see if their thinking appeared mathematical, or if they had some other characteristic networks of brain regions. Either way, I suspect that many problems would benefit from collaborations between mathematicians and those who hate pseudo-mathematics without necessarily being professional mathematicians. This seems to match my own experience.

Dave Marsay

Mathematical Modelling

Mathematics, and modelling in particular, is very powerful, and hence can be very risky if you get it wrong, as in mainstream economics. But is modelling inappropriate – as has been claimed – or is it just that it has not been done well enough?

As a mathematician who has dabbled in modelling and economics I thought I’d try my hand at modelling economies. What harm could there be?

My first notion is that actors’ activity is habitual.

My second is that habits persist until there is a ‘bad’ experience, in which case they are revised. What is taken account of, what counts as ‘bad’ and how habits are replaced or revised are all subject to meta-habits (habits about habits).

In particular, mainstream economists suppose that actors seek to maximise their utilities, and they never revise this approach. But this may be too restrictive.

Myself, I would add that most actors mostly seek to copy others and also tend to discount experiences and lessons identified by previous generations.

With ‘axioms’ (suitably formalised) such as those above, one can predict booms and busts leading to new ‘epochs’ characterised by dominant theories and habits. For example, suppose that some actors habitually borrow as much as they can to invest in an asset (such as a house for rent) and the asset class performs well. Then they will continue in their habit, and others who have done less well will increasingly copy them, fuelling an asset price boom. But no asset class is worth an infinite amount, so the boom must end, resulting in disappointment and changes in habit, which may again be copied by those who are losing out on the asset class, giving a bust. Thus one has an ‘emergent behaviour’ that contradicts some of the implicit mainstream assumptions about rationality (such as ‘ergodicity’), and hence the possibility of meaningful ‘expectations’ and utility functions to be maximized. This is not to say that such things cannot exist, only that if they do exist it must be due to some economic law as yet unidentified, and we need an alternative explanation for booms and busts.
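
A toy version of such ‘axioms’ can be run as code. The sketch below is my own illustrative formalisation, not the model the text has in mind: actors habitually buy, success is copied, the price follows net demand, and a ‘bad’ experience (the price hitting a ceiling) makes habits flip:

```python
def simulate(steps=60, n_actors=100):
    """Toy boom-bust: net demand moves the price; when the price exceeds a
    ceiling (no asset is worth an infinite amount), buying habits are revised;
    otherwise successful buying is copied and the habit spreads."""
    buyers = 10                  # actors whose current habit is to buy
    price, cap = 1.0, 20.0
    prices = []
    for _ in range(steps):
        # price moves with net demand: excess buyers push it up
        price *= 1 + 0.1 * (2 * buyers - n_actors) / n_actors
        if price > cap:          # a 'bad' experience: habits flip en masse
            buyers = max(0, buyers - 30)
        elif buyers < n_actors:  # success is copied: the buying habit spreads
            buyers = min(n_actors, buyers + 5)
        prices.append(price)
    return prices

prices = simulate()   # rises to the ceiling, then crashes: a boom and a bust
```

Even this crude, deterministic model produces a boom followed by a bust, with no ‘expectations’ or utility maximization anywhere in it.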

What I take from this is that mathematical models seem possible and may even provide insights. I do not assume that a model that is adequate in the short-run will necessarily continue to be adequate, and my model shows how economic epochs can be self-destructing. To me, the problem in economics is not so much that it uses mathematics and in particular mathematical modelling but that it does so badly. My ‘axioms’ mimic the approach that Einstein took to physics: it replaces an absolutist model by a relativistic one, and shows that it makes a difference. In my model there are no magical ‘expectations’, rather actors may have realistic habits and expectations, based on their experience and interpretation of the media and other sources, which may be ‘correct’ (or at least not falsified) in the short-run, but which cannot provide adequate predictions for the longer run. To survive a change of epochs our actors would need to be at least following some actors who were monitoring and thinking about the overall situation more broadly and deeply than those who focus on short run utility. (Something that currently seems lacking.)

David Marsay

Can polls be reliable?

Election polls in many countries have seemed unusually unreliable recently. Why? And can they be fixed?

The most basic observation is that if one has a random sample of a population in which x% has some attribute then it is reasonable to estimate that x% of the whole population has that attribute, and that this estimate will tend to be more accurate the larger the sample is. In some polls sample size can be an issue, but not in the main political polls.
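
For a simple random sample the accuracy claim can be quantified: the standard error of a sample proportion shrinks only as the square root of the sample size. A sketch:

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion p, simple random sample of size n."""
    return math.sqrt(p * (1 - p) / n)

# Margin of error (roughly +/- 2 standard errors) for p = 0.5:
moe_1000 = 2 * standard_error(0.5, 1000)   # about 0.032
moe_4000 = 2 * standard_error(0.5, 4000)   # about 0.016: 4x the sample, half the error
```

A typical poll of 1,000 has a margin of error around ±3 points; quadrupling the sample merely halves it, which is why sample size is rarely the binding problem.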

A fundamental problem with most polls is that the ‘random’ sample may not be representative, with some sub-groups over- or under-represented. Political polls have some additional issues that are sometimes blamed:

  • People with certain opinions may be reluctant to express them, or may even mislead.
  • There may be a shift in opinions with time, due to campaigns or events.
  • Different groups may differ in whether they actually vote, for example depending on the weather.

I also think that in the UK the trend to postal voting may have confused things, as postal voters will have missed out on the later stages of campaigns, and on later events. (Which were significant in the UK 2017 general election.)

Pollsters have a lot of experience at compensating for these distortions, and are increasingly using ‘sophisticated mathematical tools’. How is this possible, and is there any residual uncertainty?

Back to mathematics, suppose that we have a science-like situation in which we know which factors (e.g. gender, age, social class ..) are relevant. With a large enough sample we can partition the results by combination of factors, measure the proportions for each combination, and then combine these proportions, weighting by the prevalence of the combinations in the whole population. (More sophisticated approaches are used for smaller samples, but they only reduce the statistical reliability.)
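
The weighting described is essentially post-stratification. A minimal sketch (the groups and all the numbers are hypothetical):

```python
# Proportion supporting some option within each sampled subgroup
# (hypothetical numbers).
sample_props = {"under_40": 0.60, "over_40": 0.40}

# Known prevalence of each subgroup in the whole population.
population_share = {"under_40": 0.4, "over_40": 0.6}

def poststratified_estimate(sample_props, pop_share):
    """Combine subgroup proportions, weighting by population prevalence."""
    return sum(sample_props[g] * pop_share[g] for g in sample_props)

# If the young were over-represented in the sample, a raw average (0.50)
# would overstate support; weighting corrects the estimate to 0.48.
estimate = poststratified_estimate(sample_props, population_share)
```

The correction is only as good as the chosen factors: a relevant factor that is missed simply is not weighted for.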

Systematic errors can creep in in two ways:

  1. Instead of relying on the poll data alone, some ‘laws of politics’ (such as the effect of rain) or other heuristics (such as that the swing among postal votes will be similar to that for votes in person) may be used, and these may be wrong.
  2. An important factor is missed. (For example, people with teenage children or grandchildren may vote differently from their peers when student fees are an issue.)

These issues have analogues in the science lab. In the first case one is using the wrong theory to interpret the data, and so the results are corrupted. In the second case one has some unnoticed ‘uncontrolled variable’ that can really confuse things.

A polling method using fixed factors and laws will only be reliable when they reasonably accurately capture the attributes of interest, and not when ‘the nature of politics’ is changing, as it often does and as it seems to be right now in North America and Europe. (According to game theory one should expect such changes when coalitions change or are under threat, as they are.) To do better, the polling organisation would need to understand the factors that the parties were bringing into play at least as well as the parties themselves, and possibly better. This seems unlikely, at least in the UK.

What can be done?

It seems to me that polls used to be relatively easy to interpret, possibly because they were simpler. Our more sophisticated contemporary methods make more detailed assumptions. To interpret them we would need to know what these assumptions were. We could then ‘aim off’, based on our own judgment. But this would involve pollsters in publishing some details of their methods, which they are naturally loth to do. So what could be done? Maybe we could have some agreed simple methods and publish findings as ‘extrapolations’ to inform debate, rather than predictions. We could then factor in our own assumptions. (For example, our assumptions about student turnout.)

So, I don’t think that we can expect reliable poll findings that are predictions, but possibly we could have useful poll findings that would inform debate and allow us to take our own views. (A bit like any ‘big data’.)

Dave Marsay

The search for MH370: uncertainty

There is an interesting podcast about the search for MH370 by a former colleague. I think it illustrates in a relatively accessible form some aspects of uncertainty.

According to the familiar theory, if one has an initial probability distribution over the globe for the location of MH370’s flight recorder, say, then one can update it using Bayes’ rule to get a refined distribution. Conventionally, one should search where there is a higher probability density (all else being equal). But in this case it is fairly obvious that there is no principled way of deriving an initial distribution, and even Bayes’ rule is problematic. Conventionally, one should do the best one can, and search accordingly.
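
The ‘familiar theory’ can be sketched on a toy grid of search cells: a prior distribution (exactly the thing that, for MH370, cannot be derived in a principled way) is updated by Bayes’ rule after each unsuccessful search. The cells, prior and detection probability below are all hypothetical:

```python
def bayes_update_after_failed_search(prior, searched, p_detect):
    """Posterior over cells after an unsuccessful search of one cell:
    P(cell | no find) is proportional to P(no find | cell) * P(cell)."""
    likelihood = {c: (1 - p_detect) if c == searched else 1.0 for c in prior}
    unnorm = {c: likelihood[c] * prior[c] for c in prior}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}

prior = {"A": 0.5, "B": 0.3, "C": 0.2}   # the initial distribution is assumed given!
post = bayes_update_after_failed_search(prior, "A", p_detect=0.8)
```

Not finding the object in the searched cell shifts belief towards the unsearched cells. The machinery is impeccable, but everything hangs on the prior and the assumed detection probability.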

The podcaster (Simon) gives examples of some hypotheses (such as the pilot being well, well-motivated and unhindered throughout) for which the probabilistic approach is more reasonable. One can then split one’s effort over such credible hypotheses, not ruled out by evidence.

A conventional probabilist would note that any ‘rational’ search would be equivalent to some initial probability distribution over hypotheses, and hence some overall distribution. This may be so, but it is clear from Simon’s account that this would hardly be helpful.

I have been involved in similar situations, and have found it easier to explain the issues to non-mathematicians when there is some severe resource constraint, such as time. For example, we are looking for a person. The conventional approach is to maximise our estimated probability of finding them based on our estimated probabilities of them having acted in various ways (e.g., run for it, hunkered down). An alternative is to consider the ways they may ‘reasonably’ be thought to have acted and then to seek to maximize the worst case probability of finding them. Then again, we may have a ranking of ways that they may have acted, and seek to maximize the number of ways for which the probability of our success exceeds some acceptable amount (e.g. 90%). The key point here is that there are many reasonable objectives one might have, for only one of which the conventional assumptions are valid. The relevant mathematics does still apply, though!
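
The contrast between objectives can be made concrete. Suppose each search plan has a known probability of success under each hypothesis about how the person acted: the conventional rule averages over an assumed prior, while a robust rule maximizes the worst case. The plans, hypotheses and numbers below are all hypothetical:

```python
# success[plan][hypothesis] = P(finding them | we use plan, they acted that way)
success = {
    "cover_roads":  {"ran": 0.9, "hid": 0.2},
    "cover_woods":  {"ran": 0.1, "hid": 0.9},
    "split_effort": {"ran": 0.5, "hid": 0.5},
}
prior = {"ran": 0.5, "hid": 0.5}   # the conventional rule needs this; the robust one does not

def conventional(success, prior):
    """Maximize the expected probability of success under an assumed prior."""
    return max(success, key=lambda p: sum(success[p][h] * prior[h] for h in prior))

def robust(success):
    """Maximize the worst-case probability of success over the hypotheses."""
    return max(success, key=lambda p: min(success[p].values()))
```

With these numbers the conventional rule picks the plan tuned to its assumed prior, while the robust rule hedges. Which is ‘rational’ depends on the objective, but the mathematics serves both.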

Dave Marsay