Savage’s Foundations of Statistics
Leonard J. Savage, The Foundations of Statistics, 2nd Ed., Dover 1972 (1st Ed. 1954)
Preface to the Dover Edition
[Savage supposes that his theory of] personal (or subjective) probability is a good key, and the best yet known, to all our valid ideas about the application of probability.
This is considerably more modest than many of the claims that supposedly rely on his work. The work has two intertwined aspects. As mathematics, it gives some reasonable axioms for a theory of statistics based on conventional (numeric) subjective probabilities. It also provides some ‘reasoned argument’ in support of the axioms.
2. Preliminary Considerations on Decision in the Face of Uncertainty
2.3 The world, and states of the world
[T]he following nomenclature is … suggestive and in reasonable harmony with … usages … :
the world the object about which the person is concerned
a state (of the world) a description of the world, leaving no relevant aspect undescribed
the true state (of the world) the state that does in fact obtain.
Savage introduces the notion of a small world, an abstraction from the actual world.
[A] smaller world is derived from a larger by neglecting some distinctions between states, not by ignoring some states outright.
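The distinction between neglecting distinctions and ignoring states can be sketched in code. This is only an illustration: the two-aspect large world and the names `large_world`, `coarsen` and `keep` are assumptions made for the sketch, not Savage's notation. Each small-world state is a set of large-world states, and every large-world state still belongs to exactly one of them.

```python
from itertools import product

# Hypothetical large world: each state fixes two aspects (egg quality, weather).
large_world = list(product(["egg good", "egg rotten"], ["rain", "dry"]))


def coarsen(states, keep):
    """Group states by the aspect keep(state), neglecting all other distinctions."""
    small_world = {}
    for state in states:
        small_world.setdefault(keep(state), set()).add(state)
    return small_world


# Small world that keeps only the egg's quality and neglects the weather.
small_world = coarsen(large_world, keep=lambda state: state[0])

for label, members in small_world.items():
    print(label, sorted(members))
```

No large-world state is dropped: the small-world states form a partition of the large world. Dropping, say, every "rain" state outright would be the kind of reduction the quotation rules out.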
2.5 Consequences, acts and decisions
Savage supposes that one can generally identify a suitable small world.
The point of view under discussion may be symbolized by the proverb “Look before you leap,” and the one to which it is opposed “You can cross that bridge when you come to it.” .. It is utterly beyond our power to plan a picnic or to play a game of chess in accordance with the principle [Look before …], even when the world of states and the set of available acts are artificially reduced to the narrowest reasonable limits.
Though the “Look before you leap” principle is preposterous if carried to extremes, I would none the less argue that it is the proper subject of our further discussion, because to cross one’s bridges when one comes to them means to attack relatively simple problems of decision by artificially confining attention to so small a world that the “Look before you leap” principle can be applied there. I am unable to formulate criteria for selecting these small worlds …
2.7 The sure-thing principle
Suppose that the outcomes of acts f and g depend on an event B.
If the person would not prefer f to g, either knowing that the event B obtained, or knowing that the event ~B obtained, then he does not prefer f to g. …
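As a consistency check (not Savage's own argument, since he takes the principle as a postulate), the principle follows at once for someone who already ranks acts by expected utility. The notation is assumed for this sketch: u is a utility, P a personal probability, and P(B) is taken to be the same whichever act is chosen. If E[u(f) | B] ≤ E[u(g) | B] and E[u(f) | ~B] ≤ E[u(g) | ~B], then:

```latex
\begin{align*}
E[u(f)] &= P(B)\,E[u(f)\mid B] + P(\lnot B)\,E[u(f)\mid \lnot B] \\
        &\le P(B)\,E[u(g)\mid B] + P(\lnot B)\,E[u(g)\mid \lnot B] \\
        &= E[u(g)].
\end{align*}
```

The step in the middle relies on the weights P(B) and P(~B) being identical for f and g. If the probability of B differs between the two acts, the argument does not go through.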
5.5 Small worlds
[I] find it difficult to say with any completeness how such isolated situations are actually arrived at and justified. … Any claim to realism made by this book .. is predicated on the idea that some of the individual decision situations into which actual people tend to sub-divide the single grand decision do recapitulate in microcosm the mechanism of the idealized grand decision.
Savage shows how small worlds can be constructed as ‘pseudo-microcosms’, but his theory requires that they be actual ‘microcosms’. This, in turn, requires that there be two consequences of small-world acts whose expected advantage is the same in the small world as the large. (And hence, that the small world be not too unlike the large.)
I feel … that the possibility of being taken in by a pseudo-microcosm that is not a real microcosm is remote, but the difficulty I find in defining an operationally applicable criterion is, to say the least, ground for caution.
5.6 … comments …
If, after thorough deliberation, anyone maintains a pair of distinct preferences that are in conflict with the sure-thing principle, he must abandon, or modify, the principle … .
7 Partition Problems
7.4 Extensions of observations, and sufficient statistics
A sufficient statistic is one that can be used in lieu of the original observations. It typically ignores some distinctions, but this is valid only so long as the distinction is irrelevant.
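A minimal sketch of the idea, under assumptions that are mine rather than Savage's: the observations are independent Bernoulli trials with a Beta prior on the success probability. In that model the count of successes (together with the number of trials) is sufficient, so the order of the observations is a distinction that can be ignored. The function names are hypothetical.

```python
def posterior_from_observations(observations, alpha=1, beta=1):
    """Update a Beta(alpha, beta) prior one Bernoulli observation at a time."""
    a, b = alpha, beta
    for x in observations:
        if x:
            a += 1  # a success raises the first Beta parameter
        else:
            b += 1  # a failure raises the second Beta parameter
    return a, b


def posterior_from_statistic(successes, n, alpha=1, beta=1):
    """Same update using only the sufficient statistic (successes, n)."""
    return alpha + successes, beta + (n - successes)


# Two samples that differ only in the order of the outcomes.
sample_1 = [1, 0, 1, 1]
sample_2 = [1, 1, 1, 0]

print(posterior_from_observations(sample_1))
print(posterior_from_observations(sample_2))
print(posterior_from_statistic(sum(sample_1), len(sample_1)))
```

All three calls give the same posterior parameters. If the success probability drifted over time, the order would carry information, and the count would no longer serve in lieu of the original observations.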
7.6 Repeated Observations
Under moderate assumptions, the value of repeated observations is bounded above.
11 The Parallelism between the Minimax theory and the theory of two-person games.
11.4 Parallelism and contrast with the minimax theories
[It is a] reasonable expectation that there should be no material difference between regarding [a probability] as meaningless and regarding it as meaningful but utterly unknown.
13 Objections to the Minimax Rules
13.3 Utility and the minimax rule
[I]f all meaning is denied to utility … no unification of statistics seems possible.
13.5 The minimax rule does not generate a simple ordering
It is absurd … to contend that the objectivist minimax rule selects the best available act. … [T]he rule is invoked only as a sometimes practical rule of thumb in contexts where the concept of “best” is impractical … .
It would not be strange … if a banquet committee about to agree to buy chicken should, on being informed that goose is also available, finally compromise on duck.
Savage is concerned with the foundations of ‘proper’ statistics, but if subjective probability is adequate for formal statistics it would seem more than adequate for less formal empirical reasoning, so the work has much wider significance than just ‘reasoning by numbers’. By seeking a proper (mathematical) foundation Savage illustrates the benefits of employing mathematics in this way: he is forced to display his axioms in a very clear way, for our consideration.
Savage’s justification of his axioms is in essence that they seem to him appropriate for the normal applications of statistics. We may turn this on its head: we can be suspicious of the notion that uncertainty can be treated as nothing but conventional numeric subjective probability whenever his axioms are suspect.
For the most part, Savage supposes that one is faced with a typical situation and has to make a one-off decision, such as identifying an enduring strategy. But my own experience has been in trying to deal with (mercifully) atypical situations that have arisen out of a failure of approaches based on conventional reason, and the solution has usually relied on something that has seemed akin to magic to those whose view of logic is strictly classical. It seems to me that, although not what Savage intended, this book provides some forensic insights into what can go wrong, and hence clues about remedies.
Throughout the book runs the notion of a ‘distinction’. In most fields one has experts, who have come to know – explicitly or implicitly – all necessary distinctions. This, then, enables them to form ‘small world’ views of situations, perhaps after some false starts and adjustments. Savage’s theory seems eminently suited to such experts, sharpening their reasoning. But it does not address the problem of becoming an expert, maintaining expertise in the face of substantive change, or coping with fields that have frequent innovations leading to changes in the distinctions required. We could say that it is a theory of reasoning about the short-term, until new distinctions are required. Even here, it seems to suppose that we should ignore the possibility of such change until it actually happens. This ‘pragmatic’ view is not at all justified. (An alternative would be to look out for the possibility of change, and ‘hedge one’s bets’.)
In 2.5, Savage notes that if your policy is to ‘look before you leap’ then sometimes you may not be able to see very far ahead, and suggests that you only consider as far ahead as you are able to look. But Savage’s notion of looking is to assign a probability distribution. It seems to me that in a crisis this would be inappropriate, but that one can often identify and shape (or create) possibilities, and that in practice there are modes of reasoning that are able to exploit such insights, leading to outcomes that seem better than one would expect from ignoring the future beyond the limits of Savage’s probabilistic reasoning.
I also think that if a trickster were to offer to toss a coin, while I might have a subjective probability of Heads of 0.5, I would be foolish to assume that the trickster had the same subjective probability, and if they were keen to make the bet I might reasonably suppose that the objective probability distribution was anything but fair. Sometimes, the world is out to get you.
More mundanely, Savage’s sure-thing principle (2.7) seems at odds with Simpson’s paradox. Briefly, if you have a choice between two treatments whose efficacy depends on whether or not you have condition B, and the first treatment is better whether or not you have condition B, then you ‘should’ – according to Savage – prefer the first treatment. But there is a twist. Suppose that having condition B makes the prognosis much worse, and that treatment 1 is normally given to those with condition B, the other treatment being usual for those without the condition. Then although you may prefer to have treatment 1, you may prefer to have been given the other treatment, since being given it usually indicates that you do not have condition B. In any case, Simpson’s paradox illustrates the need to have made all appropriate distinctions.
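The twist can be made concrete with a small numerical sketch. The recovery counts below are invented purely for illustration (they are not from Savage or from any study), and the names `counts`, `stratum_rate` and `pooled_rate` are hypothetical. Treatment 1 is given mostly to patients with condition B, whose prognosis is worse.

```python
from fractions import Fraction

# Made-up (recovered, treated) counts for each treatment and stratum.
counts = {
    "treatment 1": {"B": (180, 300), "not B": (90, 100)},
    "treatment 2": {"B": (50, 100), "not B": (240, 300)},
}


def stratum_rate(treatment, stratum):
    """Recovery rate for one treatment within one stratum (B or not B)."""
    recovered, treated = counts[treatment][stratum]
    return Fraction(recovered, treated)


def pooled_rate(treatment):
    """Recovery rate for one treatment when the B / not-B distinction is neglected."""
    recovered = sum(r for r, _ in counts[treatment].values())
    treated = sum(n for _, n in counts[treatment].values())
    return Fraction(recovered, treated)


for treatment in counts:
    print(
        treatment,
        "| with B:", float(stratum_rate(treatment, "B")),
        "| without B:", float(stratum_rate(treatment, "not B")),
        "| pooled:", float(pooled_rate(treatment)),
    )
```

With these counts treatment 1 has the higher recovery rate both with B and without B, yet the lower rate once the distinction is neglected, because most of its patients have condition B. The reversal arises because who receives which treatment is not independent of B, which is exactly the distinction that the pooled figures throw away.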
- A later note by Savage.
- A critique of the sure-thing principle by Ellsberg.
- A similar view by Ramsey.
- A broader view by Russell.
My notes on probability.