Freedman’s Causal Inference
David A. Freedman, Statistical Models and Causal Inference: A Dialogue with the Social Sciences, eds. David Collier, Jasjeet S. Sekhon and Philip B. Stark, CUP, 2010.
Freedman presents a definitive account of his approach to causal inference in the social sciences. He explores the foundations and limitations of statistical modelling, illustrating basic arguments with examples from political science, public policy and epidemiology. He maintains that many new technical approaches to statistical modelling constitute not progress but regress.
Editors’ Introduction: Inference and Shoe-leather
[Freedman] demonstrated that it can be better to rely on subject-matter expertise and to exploit natural variation to mitigate confounding and rule out competing explanations. … An increasing number of social scientists now agree that statistical technique cannot substitute for good research design and subject-matter knowledge. This view is particularly common among those who understand the mathematics and have on-the-ground experience.
Part I Statistical Modelling: Foundations and Limitations
1 Issues in the Foundations of Statistics: Probability and Statistical Models
[Foundations of Science (1995) 1: 19-39.]
1.4 A critique of the subjectivist position
[A] Bayesian’s opinion may be of great interest to himself … but why should the results carry any weight for others?
… Under certain circumstances [but not in others], as more and more data become available, two Bayesians will come to agree.
My own experience suggests that neither decision-makers nor their statisticians do in fact have prior probabilities.
… [The] theory addresses only limited aspects of rationality.
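The convergence claim quoted above can be illustrated with a minimal sketch. It assumes two hypothetical analysts who hold different Beta priors about a success rate and who see the same binomial data; the priors, the 30% success rate and the function names are my own illustrative choices, not Freedman's.

```python
from fractions import Fraction

def posterior_mean(prior_a, prior_b, successes, failures):
    """Posterior mean of a Beta(prior_a, prior_b) prior after binomial data."""
    return Fraction(prior_a + successes,
                    prior_a + prior_b + successes + failures)

def disagreement(n, rate=Fraction(3, 10)):
    """Gap between the two analysts' posterior means after n shared trials."""
    successes = int(n * rate)          # illustrative data: 30% successes
    failures = n - successes
    optimist = posterior_mean(8, 2, successes, failures)  # prior mean 0.8
    sceptic = posterior_mean(2, 8, successes, failures)   # prior mean 0.2
    return float(optimist - sceptic)

for n in (0, 10, 100, 10000):
    print(n, disagreement(n))
```

With these particular priors the gap is exactly 6/(10 + n), so it shrinks as the shared data accumulate. The bracketed caveat still matters: the sketch relies on both priors giving positive weight to every possible rate, and a prior that assigns zero probability to the truth cannot be corrected by any amount of data.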
1.5 Statistical models
Regression models … are widely used by social scientists to make causal inferences … . However, the “demonstrations” generally turn out to depend on a series of untested, even unarticulated, technical assumptions.
For [many statistical practitioners], fitting models to data, computing standard errors, and performing significance tests is “informative,” even though the basic statistical assumptions (linearity, independence of errors, etc.) cannot be validated.
2 Statistical Assumptions as Empirical Commitments
No amount of statistical maneuvering can get very far without deep understanding of how the data were generated.
3 Statistical Models and Shoe Leather
[Statistical] technique can seldom be an adequate substitute for good design, relevant data, and testing predictions against reality in a number of settings.
Part II Studies in Political Science, Public Policy, and Epidemiology
8 What is the chance of an earthquake?
8.4 A view from the past
Littlewood [A Mathematician’s Miscellany (1953)] wrote:
Mathematics has no grip on the real world; if probability is to deal with the real world it must contain elements outside mathematics.
10 Relative Risk and Specific Causation
Epidemiologic data usually cannot determine the probability of causation in any meaningful way, because of individual differences.
11 Survival Analysis: An Epidemiological Hazard?
[The] misuse of survival analysis … can lead to serious mistakes … .
[The] big assumption in constructing [cross-sectional] life tables is that death rates do not change over time.
Cox said of the proportional hazards model:
As a basis for rather empirical data reduction, [the model] seems flexible and satisfactory.
Part III New Developments: Progress or Regress?
14 The Grand Leap
The Markov condition says, roughly, that past and future are conditionally independent given the present.
SGS [Spirtes, Glymour and Scheines, three advocates of AI] seem to buy the Automation Principle: The only worthwhile knowledge is the knowledge that can be taught to a computer.
15 On Specifying Graphical Models for Causation, and the Identification Problem
[Causal] relationships cannot be inferred from a data set by running regressions unless there is substantial prior knowledge about the mechanisms that generated the data.
The key to making a causal inference from nonexperimental data by regression is some kind of invariance … .
[Note] the difference between conditional probabilities that arise from selection of subjects with X = x, and conditional probabilities arising from an intervention that sets X to x. The data structures may look the same, but the implications can be worlds apart.
We want to use regression to draw causal inferences from nonexperimental data. To do that, we need to know that certain parameters and certain distributions would remain invariant if we were to intervene. That invariance can seldom if ever be demonstrated by intervention. What, then, is the source of the knowledge? “Economic Theory” seems like a natural answer, but an incomplete one. Theory has to be anchored in reality. Sooner or later, invariance needs empirical demonstration, which is easier said than done.
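The difference between selecting subjects with X = x and setting X to x can be made concrete with a small worked example. The sketch below is mine, not Freedman's: it assumes a hypothetical hidden factor Z that drives both X and Y, while X itself has no effect on Y, and all the probabilities are made up for illustration.

```python
from itertools import product

# Hypothetical toy model: Z influences both X and Y; X does not influence Y.
P_Z1 = 0.5                          # P(Z = 1)
P_X1_GIVEN_Z = {0: 0.1, 1: 0.9}     # P(X = 1 | Z = z) when X is left alone
P_Y1_GIVEN_Z = {0: 0.2, 1: 0.8}     # P(Y = 1 | Z = z), whatever X is

def bernoulli(p_one, value):
    """Probability of `value` (0 or 1) when P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one

def joint(z, x, y, forced_x=None):
    """P(Z=z, X=x, Y=y); if forced_x is given, X is set by intervention."""
    if forced_x is None:
        p_x = bernoulli(P_X1_GIVEN_Z[z], x)     # X follows Z as usual
    else:
        p_x = 1.0 if x == forced_x else 0.0     # X no longer depends on Z
    return bernoulli(P_Z1, z) * p_x * bernoulli(P_Y1_GIVEN_Z[z], y)

def prob_y1_given_x(x, forced_x=None):
    """P(Y = 1 | X = x), by selection or (with forced_x) by intervention."""
    num = sum(joint(z, x, 1, forced_x) for z in (0, 1))
    den = sum(joint(z, x, y, forced_x)
              for z, y in product((0, 1), repeat=2))
    return num / den

selected_gap = prob_y1_given_x(1) - prob_y1_given_x(0)
intervened_gap = (prob_y1_given_x(1, forced_x=1)
                  - prob_y1_given_x(0, forced_x=0))
print(selected_gap, intervened_gap)
```

Selecting subjects gives P(Y = 1 | X = 1) of about 0.74 against about 0.26 for X = 0, yet under intervention both are 0.5, so the apparent effect vanishes. The table of (X, Y) values looks the same in both cases; only knowledge of how X came to take its value tells them apart.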
Freedman quotes Pearl:
[Causal] analysis deals with the conclusions that logically follow from the combination of data and a given set of assumptions, just in case one is prepared to accept the latter. Thus, all causal inferences are necessarily conditional. … In complex fields like the social sciences and epidemiology, there are only a few (if any) real life situations where we can make enough compelling assumptions that would lead to identification of causal effects.
The information in any body of data is usually too weak to eliminate competing causal explanations of the same phenomenon. There is no mechanical algorithm for producing a set of “assumption free” facts or causal estimates based on the facts.
19 Diagnostics Cannot Have Much Power Against General Alternatives
The invariance assumption is not entirely statistical. Absent special circumstances, it does not appear that the assumption can be tested with the data that are used to fit the model. Indeed, it may be difficult to test the assumption without an experiment, either planned or natural.
[As] recent economic history makes clear, a major source of uncertainty in forecasting is specification error in the forecasting models. Specification error is extremely difficult to evaluate using internal evidence.
Unless the relevant class of specification errors can be delimited by prior theory and experience, diagnostics have limited power, and the robust procedures may be robust only against irrelevant departures from assumptions.
Part IV Shoe Leather Revisited
20 On Types of Scientific Inquiry: The Role of Qualitative Reasoning
Informal reasoning, qualitative insights, and the creation of novel data sets that require deep substantive knowledge and a great expenditure of shoe leather.
… Recognizing anomalies is important; so is the ability to capitalize on accidents. …
…. There is a … natural preference for system and rigor over methods that seem more haphazard. These are possible explanations for the current popularity of statistical models.
…. If so, the rigor of advanced quantitative methods is a matter of appearance rather than substance.
The book includes many important medical examples.
Freedman is critical of the view that there is some statistical machinery that can be applied to a set of data to infer (or even validate) a causal model. Some have interpreted Freedman as being critical of mathematical modelling, but his view seems more nuanced. As he recognizes, the mathematical validity of Euclidean geometry is a different issue to its correspondence to the physical world. There is no known force of nature that compels physical space to comply with Euclid’s conception of it. Similarly, we may doubt the validity of astrology no matter how mathematical its methods.
Where I differ from Freedman is that I think that statistical descriptions of the kind provided by the methods that he criticises can often be very helpful, just as long as we recognize them for what they are, and do not over-interpret them. For example, a colleague once consulted me when some response-time data had very peculiar moments. Fortunately I had seen similar results before, in a case where a value such as ‘45.32’ had been interpreted as 45.32 minutes when it actually meant 45 minutes and 32 seconds. In general, standard methods are good for suggesting hypotheses for development and testing. It is only under special circumstances (such as Freedman describes) that their raw output can be relied upon as ‘true’.
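The response-time example can be sketched in a few lines. The readings below are invented for illustration, the function names are hypothetical, and the sketch assumes that seconds are always written with two digits.

```python
def as_decimal_minutes(reading: str) -> float:
    """The misreading: treat 'mm.ss' as a decimal number of minutes."""
    return float(reading)

def as_minutes_and_seconds(reading: str) -> float:
    """The intended reading: the digits after the point are seconds."""
    minutes, seconds = reading.split(".")
    return int(minutes) + int(seconds) / 60

def looks_like_mm_ss(readings) -> bool:
    """Tell-tale sign: no 'fractional part' ever reaches 60."""
    return all(int(r.split(".")[1]) < 60 for r in readings)

readings = ["45.32", "12.59", "13.00", "7.05"]   # illustrative values only
for r in readings:
    print(r, as_decimal_minutes(r), as_minutes_and_seconds(r))
print(looks_like_mm_ss(readings))
```

Under the misreading no value ever falls between .60 and .99 of a minute, which leaves gaps in the distribution and distorts its moments. A summary statistic can flag that something is odd, but it takes knowledge of how the data were recorded to say what.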
Freedman makes a distinction between quantitative and qualitative aspects of a problem, perhaps inviting a naïve reader to associate the quantitative aspect with mathematics. But one might rather say that mathematics is commonly thought to be confined to the quantitative. It is not. While Freedman’s qualitative analysis is not mathematical, the bulk of his work shows that one needs to be very careful about the meaning of one’s terms, and so it seems to me that his analysis could be improved by being more mathematical (perhaps as an annex). For example, it might draw on Keynes’ Thesis.
In particular, all the approaches here seem to suppose that one creates a model, extrapolates, determines some ‘error distribution’, and then makes a decision accordingly. It is possible to finesse this approach, and assume less but with greater diligence. Thus, where the Markov condition supposes that the future depends on the present but not the past, we should note two conditions: firstly, that the system being observed is objectively Markov, and secondly, that we are observing enough of it. Both assume that there is a fixed set of factors to be observed. This seems doubtful. For example, if the Bank of England suddenly decided that the quality of its favourite claret was an important economic indicator then it might act on it, in which case it might become important. In a speculative market, anything can become a factor.
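The second condition, that we observe enough of the system, can be illustrated with a deliberately simplified sketch. It uses a deterministic toy system of my own choosing (each value is the sum of the previous two, modulo 5) rather than a probabilistic one, so it is only an analogue of the Markov condition, and the function names are hypothetical.

```python
from collections import defaultdict

def simulate(steps, modulus=5):
    """Deterministic toy system: each value depends on the previous two."""
    values = [0, 1]
    while len(values) < steps:
        values.append((values[-1] + values[-2]) % modulus)
    return values

def successors(states):
    """Map each observed state to the set of states seen to follow it."""
    seen = defaultdict(set)
    for current, following in zip(states, states[1:]):
        seen[current].add(following)
    return seen

def present_determines_future(states):
    """True if every observed state always has the same successor."""
    return all(len(nexts) == 1 for nexts in successors(states).values())

values = simulate(60)
pairs = list(zip(values, values[1:]))   # widen what counts as 'the present'

print(present_determines_future(values))  # observing only the latest value
print(present_determines_future(pairs))   # observing the latest two values
```

Observed one value at a time, the present does not fix the future, because the same value is followed by different successors at different times. Observed as pairs, it does. Nothing in the narrower record announces that a factor is missing, and in a real system there is no guarantee that the list of relevant factors is fixed at all.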
I often wonder why people seem to make mistakes of the kind that Freedman denigrates, and how we may enlighten them. Freedman puts it down to love of ‘system’ and ‘rigor’. It is my experience that many ‘pragmatic’ people squeeze mathematics into a distorting frame of system and rigor. Perhaps the cure would be greater familiarity with mathematics post 1900, particularly that of Whitehead, Keynes, Russell and Turing, as on this, my blog.