J.M. Keynes, A Treatise on Probability, Macmillan, 1921
Keynes studied mathematics under Whitehead. His 1908 fellowship treatise for King’s College, Cambridge, UK, was published in 1921 as part of the lessons identified from the Great War [Smuts Papers]. His approach is neither objective nor subjective, but follows the logical approach of Boole, e.g. ‘Laws of Thought’. As the first chapter says [Ch. I]:
“The Theory of Probability is logical … because it is concerned with the degree of belief which it is rational to entertain in given conditions, and not merely with the actual beliefs of particular individuals, which may or may not be rational.”
“[It] is without significance to call a proposition probable unless we specify the knowledge to which we are relating it.”
“It is as useless … to say ‘b is probable’ as it would be to say ‘b is equal,’ or ‘b is greater than,’ and as unwarranted to conclude that because a makes b probable, therefore a and c together make b probable, as to argue that because a is less than b, therefore a and c together are less than b.”
Note that if one does not have sufficiently ‘given’ conditions then one has additional uncertainty.
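Keynes’ point that ‘a makes b probable’ does not license ‘a and c together make b probable’ can be illustrated with a small sketch. The population below is entirely made up, and relative frequencies are used only as a convenient stand-in for Keynes’ logical probability relation; the names `population`, `prob`, `flies`, `bird` and `bird_and_penguin` are hypothetical.

```python
from fractions import Fraction

# Hypothetical toy population: (is_bird, is_penguin, can_fly) -> number of cases.
population = {
    (True, False, True): 90,   # ordinary birds that fly
    (True, False, False): 5,   # ordinary birds that do not fly
    (True, True, False): 5,    # penguins, none of which fly
}

def prob(event, given):
    """Relative frequency of `event` among the cases that satisfy `given`."""
    total = sum(n for case, n in population.items() if given(case))
    hits = sum(n for case, n in population.items() if given(case) and event(case))
    return Fraction(hits, total)

def flies(case):
    return case[2]

def bird(case):             # the evidence "a"
    return case[0]

def bird_and_penguin(case):  # the evidence "a and c together"
    return case[0] and case[1]

p_given_a = prob(flies, bird)                     # b is probable given a
p_given_a_and_c = prob(flies, bird_and_penguin)   # but not given a and c
print(p_given_a, p_given_a_and_c)
```

Adding the extra evidence c takes the probability of b from high to zero, so a probability only makes sense relative to the stated evidence.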
The Treatise discusses such variations on the notion of precise, numeric, probability as interval-valued probability and weight of evidence, which were taken up by Turing, Good and others. The Treatise culminates in a critique of conventional statistical inference and a constructive theory. Briefly, Keynes notes that inference implicitly assumes a great deal of homogeneity and regularity, and can be in error where it assumes more than is sensible. The constructive theory is sound where, as is typical of science before quantum mechanics and evolutionary dynamics, the assumptions have been tested and found to fit the available facts.
Keynes had worked on British war-time and European post-war finance under Smuts and resigned from the Versailles ‘Peace’ process, arguing in his best-seller The Economic Consequences of the Peace that the terms would lead to a sustained depression and a return to war. The Treatise provided the mathematical underpinning for this work. This experience may have motivated the ending:
O False and treacherous Probability,
Enemy of truth, and friend to wickednesse;
With whose bleare eyes Opinion learnes to see,
Truth’s feeble party here, and barrennesse.
It may be that in the light of these and other developments, Keynes would not find the assumptions of his constructive theory so acceptable. Nevertheless, Keynes’ argumentation remains informative.
Keynes uses a compact notation that helps bring out some key mathematical structures, but which hasn’t caught on. Russell provided a more accessible summary of Keynes’ approach, and developed it into a theory of science.
Classical, or Bayesian, probability is the familiar numeric probability. Keynes gave some alternative axiomatizations, which he discussed at length. He also developed the data fusion formula that combines likelihoods by multiplying them. Like Ramsey, he does not suppose that every proposition or event necessarily has a probability, or that it is measurable.
“It has been assumed hitherto as a matter of course that probability is … measurable. I shall have to limit, not extend, the popular doctrine.” [Ch. III]
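The fusion rule mentioned above, combining likelihoods by multiplying them, can be sketched as follows. This is the standard modern form rather than Keynes’ own notation, and it assumes that the sources are independent given each hypothesis, which is exactly the kind of assumption Keynes warns about. The function name `fuse` and all the numbers are hypothetical.

```python
from math import prod

def fuse(prior, likelihoods):
    """Combine a prior with per-source likelihoods by multiplication.

    prior:       dict mapping hypothesis -> prior probability
    likelihoods: list of dicts mapping hypothesis -> P(observation | hypothesis)
    Assumes the observations are independent given each hypothesis.
    """
    unnormalised = {h: prior[h] * prod(lk[h] for lk in likelihoods) for h in prior}
    total = sum(unnormalised.values())
    return {h: value / total for h, value in unnormalised.items()}

# Made-up example: two sources, each favouring H by a likelihood ratio of 2.
prior = {"H": 0.5, "not_H": 0.5}
sources = [{"H": 0.8, "not_H": 0.4}, {"H": 0.6, "not_H": 0.3}]
posterior = fuse(prior, sources)
print(posterior)
```

If the two sources were in fact reporting the same underlying observation, multiplying their likelihoods would count that evidence twice.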
Keynes, as above, was critical of the assumptions of classical probability theory, and developed alternative theories and methods that are not dependent on them. These were applied to intelligence in both world wars, with Keynes advising on their application and development under Turing and Good at Bletchley Park.
Probabilistic alternatives to classical probability include:
- sets of possible classical probability assignments
- interval-valued probability assignments
- algebraic-valued probability assignments, with constraints
These violate the principle of indifference, the subject of Keynes’ chapter IV. Keynes also developed his concepts of likelihoods and of ‘weight of arguments’ (ch. VI), a more general concept not relying on priors.
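As a minimal sketch of the first two alternatives, a set of classical assignments induces an interval for each event: its lowest and highest probability across the set. The three assignments below are invented for illustration, and `assignments` and `interval` are hypothetical names.

```python
from fractions import Fraction as F

# Hypothetical set of classical probability assignments over the same outcomes.
assignments = [
    {"rain": F(3, 10), "dry": F(7, 10)},
    {"rain": F(1, 2), "dry": F(1, 2)},
    {"rain": F(3, 5), "dry": F(2, 5)},
]

def interval(event):
    """Lower and upper probability of `event` (a set of outcomes) over the set."""
    values = [sum(p[outcome] for outcome in event) for p in assignments]
    return min(values), max(values)

rain_low, rain_high = interval({"rain"})
dry_low, dry_high = interval({"dry"})
print((rain_low, rain_high), (dry_low, dry_high))
```

Here the two intervals overlap, so on this evidence rain is neither more likely than not, nor less likely than not, nor exactly as likely as not.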
Principle of Indifference
“Is our expectation of rain, when we start out for a walk, always more likely than not, or less likely than not, or as likely as not? I am prepared to argue that on some occasions none of these alternatives hold, and that it will be an arbitrary matter to decide for or against the umbrella. If the barometer is high, but the clouds are black, it is not always rational that one should prevail over the other in our minds, or even that we should balance them, though it will be rational to allow caprice to determine us and to waste no time on the debate.”
This anticipates notions of ‘bounded rationality’, normally attributed to Herbert Simon.
“An excellent and classic instance of the danger of wrongful assumptions of independence is given by the problem of determining the probability of throwing heads twice in two consecutive tosses of a coin. The plain man generally assumes without hesitation that the chance is (1/2)². For the à priori chance of heads at the first toss is 1/2, and we might naturally suppose that the two events are independent,—since the mere fact of heads having appeared once can have no influence on the next toss. But … If we do not know whether there is bias, or which way the bias lies, then it is reasonable to put the probability somewhat higher than (1/2)². The fact of heads having appeared at the first toss is not the cause of heads appearing at the second also, but the knowledge, that the coin has fallen heads already, affects our forecast of its falling thus in the future, since heads in the past may have been due to a cause which will favour heads in the future. The possibility of bias in a coin, it may be noticed, always favours ‘runs’; this possibility increases the probability both of ‘runs’ of heads and of ‘runs’ of tails.”
This is not something that appears to be widely appreciated, yet it can be of practical importance. (Keynes discusses further in Ch. XXIX.)
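The coin example can be checked with a small calculation. The sketch assumes, purely for illustration, that the coin’s chance of heads is either 2/5 or 3/5 with equal weight, so that the chance of heads on a single toss is still 1/2; the names `prob_heads` and `prob_two_heads` are hypothetical.

```python
from fractions import Fraction as F

def prob_heads(bias_beliefs):
    """Chance of heads on one toss, averaging over the possible biases."""
    return sum(weight * p for p, weight in bias_beliefs.items())

def prob_two_heads(bias_beliefs):
    """Chance of two heads in two tosses, averaging over the possible biases."""
    return sum(weight * p * p for p, weight in bias_beliefs.items())

# Maps a possible chance of heads -> the weight given to that possibility.
known_fair = {F(1, 2): F(1)}
unknown_bias = {F(2, 5): F(1, 2), F(3, 5): F(1, 2)}

print(prob_heads(known_fair), prob_two_heads(known_fair))
print(prob_heads(unknown_bias), prob_two_heads(unknown_bias))
```

Both coins give heads with chance 1/2 on a single toss, but the possibly biased coin gives two heads with chance 13/50 rather than 1/4. By symmetry the same holds for two tails, so the possibility of bias favours runs of either kind.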
Keynes gives a good review of fusion, arguing that it rarely yields precise (numeric) probabilities, and that a misunderstanding of independence (above) has led to many fallacies.
Keynes (Ch. XXIX) gives examples of supernormal and subnormal dispersions, based on processes that have positive or negative feedback of any deviation from the expected value. This anticipates the notion of large-tailed distributions. Any deviation from normality is an indication that classical methods are inappropriate. Keynes recommends trying to identify the factors behind the abnormalities and developing the models until one (hopefully) has a normal situation. [This may not be possible in game-like or evolving situations, for the reasons below.]
Keynes (Ch. XVII) quotes Whitehead, Introduction to Mathematics, p. 27:
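The effect of feedback on dispersion can be sketched with three urn schemes. These are my illustrative assumption, not necessarily Keynes’ own examples: drawing without replacement (each red drawn makes red less likely, a negative feedback), drawing with replacement (independent draws, the normal case), and returning each ball with an extra one of the same colour (a positive feedback). The function name `variance_of_red_count` is hypothetical.

```python
from fractions import Fraction as F

def variance_of_red_count(red, black, draws, feedback):
    """Exact variance of the number of red balls in `draws` draws from an urn.

    feedback = -1: the drawn ball is removed (negative feedback)
    feedback =  0: the drawn ball is returned (independent draws)
    feedback = +1: the ball is returned with one extra of its colour
                   (positive feedback)
    """
    dist = {0: F(1)}  # number of reds drawn so far -> probability
    for t in range(draws):
        nxt = {}
        for k, pr in dist.items():
            r = red + feedback * k          # reds currently in the urn
            b = black + feedback * (t - k)  # blacks currently in the urn
            p_red = F(r, r + b)
            if p_red > 0:
                nxt[k + 1] = nxt.get(k + 1, F(0)) + pr * p_red
            if p_red < 1:
                nxt[k] = nxt.get(k, F(0)) + pr * (1 - p_red)
        dist = nxt
    mean = sum(k * pr for k, pr in dist.items())
    return sum((k - mean) ** 2 * pr for k, pr in dist.items())

subnormal = variance_of_red_count(5, 5, 4, feedback=-1)
normal = variance_of_red_count(5, 5, 4, feedback=0)
supernormal = variance_of_red_count(5, 5, 4, feedback=+1)
print(subnormal, normal, supernormal)
```

All three schemes have the same expected number of reds, but the variance is smaller than the independent case under negative feedback and larger under positive feedback.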
“There is no more common error than to assume that, because prolonged and accurate mathematical calculations have been made, the application of the result to some fact of nature is absolutely certain.”
It’s a small and simple world
Keynes (Ch XXII) notes the assumption implicit in induction and probability theory that the world is describable by finite probabilistic models, with no innovation.
Inference: limits on conclusions
Keynes notes (Ch XXIX):
“It is very important to notice that two conditions are involved. Not only must the experience, upon which the à priori probability is based, be extensive in comparison with the number of instances to which we apply our prediction; but also the number of previous instances multiplied by the probability based upon them … must be large in comparison with the number of new instances. Thus, even where the prior experience, upon which we found the initial probability P, is very extensive, we must not, if P is very small, say that the probability of n successive occurrences is approximately Pⁿ, unless n is also small.”
This anticipates some of the issues with tail-risk that Taleb has highlighted following the financial crisis of 2007/8/11.
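Keynes’ two conditions can be illustrated numerically. The sketch assumes, purely for illustration, a uniform prior over the unknown chance, so that after s occurrences in m instances the probability of the next occurrence is (s+1)/(m+2), updated after each new instance. That model is an assumption of this sketch rather than anything Keynes endorses, and the function names are hypothetical.

```python
from fractions import Fraction as F

def prob_next_n(successes, trials, n):
    """P(the next n instances are all occurrences), updating after each one.

    Assumes a uniform prior over the unknown chance (an assumption of this sketch).
    """
    result = F(1)
    for i in range(n):
        result *= F(successes + 1 + i, trials + 2 + i)
    return result

def naive_power(successes, trials, n):
    """The shortcut P**n, with P estimated once from the prior experience."""
    return F(successes + 1, trials + 2) ** n

# Extensive experience (1000 instances) but very small P: the shortcut fails.
ratio_small_p = prob_next_n(1, 1000, 5) / naive_power(1, 1000, 5)
# Same experience with P about 1/2: the shortcut is close.
ratio_large_p = prob_next_n(500, 1000, 5) / naive_power(500, 1000, 5)
print(float(ratio_small_p), float(ratio_large_p))
```

In the first case the number of previous instances multiplied by P is about 2, which is not large compared with n = 5, and the shortcut understates the probability of a run more than twenty-fold. In the second case the product is about 500 and the two calculations agree to within about one per cent.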
Keynes notes (Notes on Part III):
“As our knowledge is partial, there is constantly, in our use of the term cause, some reference implied or expressed to a limited body of knowledge.”
That is, empirical models of causality are not necessarily robust ‘facts’. (In my view, they may even be ideological.)
Initially, the book became the classic work, and was well-regarded by both Whitehead and Russell. But later numerous arguments were found to show (correctly) that it is rational to believe in conventional numeric probabilities, and so Keynes’ views were ignored, as were those of his economic theories that relied upon them. But Keynes had already shown that ‘rational’ decision making could be perilous if the assumptions of the theory were violated, as they often are. Interest in Keynes revived following the financial crises of 2007 et seq., but this treatise is still largely overlooked.
- My notes on probability and uncertainty.
- Others’ comments on Keynes’ Treatise.
- My comments on particular sections: