Gerd Gigerenzer, Henry Brighton Homo Heuristicus: Why Biased Minds Make Better Inferences Topics in Cognitive Science 1 (2009) 107–143.
… Homo heuristicus has a biased mind and ignores part of the available information, yet a biased mind can handle uncertainty more efficiently and robustly than an unbiased mind relying on more resource-intensive and general-purpose processing strategies.
An important claim, if true.
Catching a ball
The computational view
The paper characterises the prior view of human thinking as follows:
The view of cognition favoring omniscience and omnipotence suggests that complex problems are solved with complex mental algorithms: ‘‘he behaves as if he had solved a set of differential equations in predicting the trajectory of the ball … At some subconscious level, something functionally equivalent to the mathematical calculations is going on’’ (Dawkins, 1989, p. 96).
The heuristic view
The first example of a heuristic is:
The gaze heuristic is the simplest one and works if the ball is already high up in the air: Fix your gaze on the ball, start running, and adjust your running speed so that the angle of gaze remains constant (see Gigerenzer, 2007). A player who relies on the gaze heuristic can ignore all causal variables necessary to compute the trajectory of the ball––the initial distance, velocity, angle, air resistance, speed and direction of wind, and spin, among others. By paying attention to only one variable, the player will end up where the ball comes down without computing the exact spot.
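The claim that holding the angle of gaze constant brings the player to the landing spot can be checked with a little geometry. The following is only a minimal sketch under assumptions of my own: a drag-free ball, motion in one vertical plane, and a player who tracks the rule perfectly. The function names and the numbers (`h0`, `vx`, `start_pos`) are illustrative and do not come from the paper. The simulator knows the ball's trajectory, but the rule it applies to the player uses only the gaze angle.

```python
import math

def ball_state(t, x0=0.0, h0=20.0, vx=8.0, vy=0.0, g=9.81):
    """Ball (horizontal position, height) at time t; drag is ignored (assumption)."""
    return x0 + vx * t, h0 + vy * t - 0.5 * g * t * t

def landing_time(h0=20.0, vy=0.0, g=9.81):
    """Time at which the ball's height reaches zero."""
    return (vy + math.sqrt(vy * vy + 2 * g * h0)) / g

def player_track(start_pos, steps=200):
    """Positions a player must occupy to keep the gaze angle constant."""
    x, h = ball_state(0.0)
    tan_gaze = h / (start_pos - x)       # angle fixed at first sight of the ball
    t_end = landing_time()
    track = []
    for i in range(steps + 1):
        x, h = ball_state(t_end * i / steps)
        track.append(x + h / tan_gaze)   # the spot where the angle is unchanged
    return track

track = player_track(start_pos=30.0)
landing_x, _ = ball_state(landing_time())
print(round(track[0], 2), round(track[-1], 2), round(landing_x, 2))
```

As the height falls to zero, the only position that preserves the angle is the point directly under the ball, so the last entry of `track` coincides with the landing point even though the rule never refers to it.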
A mathematical view
The paper seems to suppose that the gaze heuristic is a heuristic because it ignores the ‘causal variables necessary to compute the trajectory of the ball’, such as the initial distance, and that the adjustment of running speed does not fit Dawkins’s description of ‘as if’ some differential equations had been solved. But are these assumptions reasonable?
Consider a simpler case, of crossing the road at night. A pair of lights is approaching. Is there time to cross? We cannot directly perceive the paper’s ‘necessary causal variables’, but we can apply something like its gaze heuristic: visually, the lights are getting further apart, and we can extrapolate to estimate the time at which the lights will reach us. It is of little or no consequence whether the lights are further apart on a fast-moving vehicle or closer together on a slower-moving vehicle. The paper’s ‘causal variables’ are irrelevant. All we need to do is extrapolate from our sense-impressions ‘as if’ we were solving (simple) differential equations.
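The road-crossing argument can be made concrete. For small angles the visual separation of the lights is roughly width/distance, and dividing that angle by its rate of growth gives roughly distance/speed, the time to arrival, with the unknown width cancelling out. The sketch below is my own illustration, assuming constant speed and a head-on approach. The function names and numbers are hypothetical.

```python
import math

def angular_separation(width, distance):
    """Visual angle (radians) subtended by two lights `width` apart."""
    return 2 * math.atan(width / (2 * distance))

def time_to_arrival(width, distance, speed, dt=0.01):
    """Estimate arrival time from two successive visual angles only."""
    theta_now = angular_separation(width, distance)
    theta_next = angular_separation(width, distance - speed * dt)
    growth_rate = (theta_next - theta_now) / dt   # finite-difference derivative
    return theta_now / growth_rate

# Two physically different vehicles, both truly 5 seconds away.
wide_fast = time_to_arrival(width=2.0, distance=100.0, speed=20.0)
narrow_slow = time_to_arrival(width=1.0, distance=50.0, speed=10.0)
print(wide_fast, narrow_slow)
```

The observer inside `time_to_arrival` sees only two angles a short time apart. Width, distance and speed are used solely to generate those sense-impressions, and the two vehicles yield the same estimate.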
Ashby [Design for a brain] has a cybernetic description of catching a ball which extends this model of crossing the road. Gigerenzer & Brighton 0, Dawkins 1. But this is only the beginning.
Heuristics as bounded rationality
The paper, I think correctly, notes that:
By the end of the 20th century, the use of heuristics became associated with shoddy mental software, generating three widespread misconceptions:
1. Heuristics are always second-best.
2. We use heuristics only because of our cognitive limitations.
3. More information, more computation, and more time would always be better.
Contrary to the belief in a general accuracy-effort tradeoff, less information and computation can actually lead to higher accuracy, and in these situations the mind does not need to make trade-offs. Here, a less-is-more effect holds.
That simple heuristics can be more accurate than complex procedures is one of the major discoveries of the last decades (Gigerenzer, 2008). … less-is-more phenomena emerge from the bias–variance dilemma that cognitive systems face in uncertain worlds.
It is well known that if one has a family of functions (e.g., polynomials) then increasing the number of parameters (e.g., the degree) will tend to give better results for a problem such as ball-catching up to some point, beyond which performance degrades due to ‘over-fitting’: the apparent increase in accuracy is spurious. This is why attempting to estimate the paper’s ‘causal parameters’ gives poor results compared with just using the growth of the object in the visual field.
The paper notes that in the case of a complicated trajectory, such as of a ball, using a low degree polynomial will yield a systematic bias, whereas a higher degree polynomial may have too much variability. This is the bias-variance dilemma.
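A small experiment illustrates the dilemma. This is a sketch on made-up data, not a reproduction of anything in the paper: a noisy sine curve stands in for a ‘complicated trajectory’, twelve points are used for fitting, and a larger held-out sample measures predictive error. The target function, noise level, seed and helper names are all assumptions of mine.

```python
import math
import random

def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients (lowest power first) via the normal equations."""
    n = degree + 1
    # Augmented matrix [A^T A | A^T y] for the monomial basis.
    rows = [[sum(x ** (i + j) for x in xs) for j in range(n)]
            + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]

def evaluate(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))

def mean_squared_error(coeffs, xs, ys):
    return sum((evaluate(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def noisy_samples(rng, count, noise=0.3):
    xs = [rng.uniform(-1.0, 1.0) for _ in range(count)]
    ys = [math.sin(3 * x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

rng = random.Random(0)
train_x, train_y = noisy_samples(rng, 12)
test_x, test_y = noisy_samples(rng, 500)
results = {}
for degree in range(8):
    coeffs = fit_polynomial(train_x, train_y, degree)
    results[degree] = (mean_squared_error(coeffs, train_x, train_y),
                       mean_squared_error(coeffs, test_x, test_y))
    print(degree, results[degree])
```

The fitting error can only fall as the degree rises, since each family contains the previous one. The held-out error need not follow: low degrees typically suffer from bias, high degrees from variance, and the exact cross-over depends on the seed and sample size.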
The paper, then, is saying that it is often better to use a simple, wrong, model than a more ‘realistic’ model, if you can estimate the parameters of the former adequately, but not the latter. This is not news. It has long been regarded as pragmatic that, given a complicated causal model of the type that the paper envisages, one seeks co-ordinate transformations that produce ‘key parameters’ that can be estimated and used effectively. One determines the potential bias and aims to correct for it. For example, when crossing the road one may keep an eye on the approaching vehicle and speed up if necessary.
More is less
More information or computation can decrease accuracy … .
Less-is-more effects … challenge the classical definition of rational decision-making as the process of weighting and adding all information.
Note that the term less-is-more does not mean that the less information one uses, the better the performance. Rather, it refers to the existence of a point at which more information or computation becomes detrimental, independent of costs.
Thus, with a given methodology, increasing the amount of data used can degrade performance, and computational effort is no guarantee of good results. No surprises here, apart from the somewhat strange description of ‘classical decision-making’.
Of a study comparing tallying (equal weights) with multiple regression (optimised weights), it is noted that:
Averaged across all data sets, tallying achieved a higher predictive accuracy than multiple regression (Fig. 1). Regression tended to overfit the data, as can be seen by the cross-over of lines: It had a higher fit than tallying but a lower predictive accuracy.
[T]he conditions under which tallying succeeds–– low predictability of a criterion, small sample sizes relative to the number of available cues, and dependency between cues––are not infrequent in natural environments.
[R]ational minds might not always weight but may simply tally cues.
The class of one-good-reason heuristics orders cues, finds the first one that allows a decision to be made, and then stops and ignores all other cues.
Take-the-best: … Search is stopped after finding the first cue that enables an inference to be made.
[I]nferences relying on one good reason were more accurate than both multiple regression and tallying.
the success of take-the-best seems to be due to the fact that it ignores dependencies between cues in what turns out to be an adaptive processing policy when observations are sparse.
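For readers who have not met them, the two heuristics just quoted can be written down in a few lines. This is a minimal sketch for comparing two options on binary cues (1 for present, 0 for absent). The function names, the cue encoding and the example values are my own, and the cue ordering is simply supplied by hand rather than estimated from data as in the paper.

```python
def tally(cues_a, cues_b):
    """Equal-weight tallying: choose the option with more positive cues."""
    score_a, score_b = sum(cues_a), sum(cues_b)
    if score_a == score_b:
        return "guess"
    return "a" if score_a > score_b else "b"

def take_the_best(cues_a, cues_b, cue_order):
    """Search cues in the given order; decide on the first that discriminates."""
    for cue in cue_order:
        if cues_a[cue] != cues_b[cue]:
            return "a" if cues_a[cue] > cues_b[cue] else "b"
    return "guess"   # no cue tells the options apart

# Hypothetical options: A has only the top-ranked cue, B has the other two.
option_a = [1, 0, 0]
option_b = [0, 1, 1]
print(tally(option_a, option_b))
print(take_the_best(option_a, option_b, cue_order=[0, 1, 2]))
```

Here the two rules disagree: tallying favours B on the count of cues, while take-the-best stops at the first cue and favours A, ignoring the remaining cues and any dependencies among them.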
Biased minds for making better predictions
The paper promotes the merits of ‘a biased mind’. But what it seems actually to be proposing is that sometimes one is better off using methods that tolerate bias, not that one should select methods because of their bias. The difference may not be slight.
An example is given of stock market investments, using equal weights: a form of tallying. This gave better results than methods that tried to fit data from the previous 10 years. One could argue that the data-fitting methods were biased to the view that stock markets are learnable random systems, so that the future would be like the past. Terms like ‘biased minds’ need to be carefully interpreted.
Unpacking the adaptive toolbox
This section of the paper describes some families of heuristics that people seem to use, and discusses their appropriateness.
It rightly points out that much work on cognition, machine reasoning, etc., assumes a knowable, stable situation with reasoning limited to reasoning about the details, and that much of the world isn’t like this.
The effectiveness of heuristics is explained in terms of ‘uncertainty’, which is due to missing data, or data which cannot be reliably exploited. Thus one might suppose that there is some real world that can, in principle, be described in terms of probabilistic characteristics, and the problem is that in practice we lack sufficient data.
The paper doesn’t really challenge the view that – ideally – getting more data and making ‘full’ use of it would be a good idea. Its key point seems to be that humans are often faced with situations where the ideal isn’t possible, in which case one needs to avoid over-fitting models. This, it points out, can result in models that are ‘biased’, in that they operate as if higher-order terms are 0. But the impact of any such biases is typically mitigated by trial and error type processes, so it seems unreasonable to say that the mind is ‘biased’ on this account.
Heuristics are characterised as ‘ignoring part of the available information’, yet, for example in the gaze heuristic, they often seem to make full use of all the available data. What they do not do is rely on assessing model parameters that cannot meaningfully be assessed: they ignore potentially misleading estimates.
It is claimed that heuristics handle uncertainty robustly, without defining uncertainty or robustness. Uncertainty seems to be conceptualised as just a lack of data, in which case ‘robustness’ is appropriately measured by variance, as in the case studies. But isn’t there more to uncertainty than this? The stock market example seems full of broader uncertainties, and the tally heuristic would seem relatively robust to them. But there is no discussion of what types of heuristic for what types of decisions are likely to show what types of robustness in the face of what types of uncertainty.
It seems to me that the sub-title ‘Biased Minds Make Better Inferences’ is misleading. Might ‘Statistically Competent Minds Make Better Decisions’ be nearer the mark?
Gerd Gigerenzer and Wolfgang Gaissmaier Heuristic Decision Making Annu. Rev. Psychol. 2011. 62:451–82.
A more recent review, with some more interesting examples.
[A] large community continues to routinely model behavior with complex statistical procedures without testing these against simple rules.
It is this community that is the target of these papers. One might also note that:
- There are also many who use ‘sophisticated’ procedures based on ill-founded statistics when simpler heuristics would perform much better, as in some of the examples cited.
- Statisticians and other mathematically-astute academics and technologists have long been giving similar warnings.
- The issues may be much deeper than these papers seem to suppose, particularly as regards pragmatism, randomness and uncertainty.
Gigerenzer, The History of Decision Making
In this video Gigerenzer seems to be saying that to solve problems one can use logic or probability theory or heuristics. He seems to conflate logic with classical symbolic logic and probability with Bayesian statistics. My own view is that we need logic and probability and statistics and heuristics, possibly all at the same time. We need a general logical framework, as in Russell’s Human Knowledge, that includes a logic of ‘probability’ considered broadly and hence including all aspects of uncertainty. This would inform the proper use of statistics and other heuristics. Heuristics (in the sense of not being derived from a sound theory base) are often necessary because there is often no alternative. But a logic of broader uncertainty is also vital.
Finally, Cosmides and Tooby make a similar point, but perhaps more helpfully:
One point is particularly important for economists to appreciate: it can be demonstrated that “rational” decision-making methods (i.e., the usual methods drawn from logic, mathematics, and probability theory) are computationally very weak: incapable of solving the natural adaptive problems our ancestors had to solve reliably in order to reproduce.
Latest: Gigerenzer has a book out, Risk Savvy. In an interview, he says:
The error my dear colleagues make, is that they begin from the assumption that various “rational” approaches to decision-making must be the most effective ones. Then, when they discover that is not how people operate, they define that as making a mistake: When they find that we judge differently, they blame us, instead of their models!