How to Grow a Mind
June 9, 2011
How to Grow a Mind: Statistics, Structure, and Abstraction
Joshua B. Tenenbaum et al.
Science 331, 1279 (2011);
This interesting paper proposes that human reasoning, far from being uniquely human, is understandable in terms of the mathematics of inference, and in particular that concept learning is ‘just’ the combination of Bayesian inference and abstract induction found in hierarchical Bayesian models (HBMs). This has implications for two debates:
- how to conceptualise how people learn
- the validity of Bayesian methods
Progress on these may help, for example:
- to understand how thinking may be influenced, for example, by culture or experience
- to aid teaching
- to understand what might be typical mistakes of the majority
- to understand mistakes typical of important minorities
If it were the case that humans are Bayesians (as others have also claimed, but with less scope) and if one thought that Bayesian thinking had certain flaws, then one would expect to find evidence of these in human activities (as one does – watch this blog e.g. here). But the details matter.
In an HBM one considers that observations are produced by a likelihood function that has a probability distribution, or by a longer chain of likelihood functions ‘topped out’ by a probability function. This is equivalent to having a chain of conditional likelihood functions, with the likelihood of the conditions of each function being given by the next one, topped out by an unconditional probability distribution to make it Bayesian. The paper explains how a Chinese restaurant process (CRP) is used to decide whether new observations fit an existing category (a node in the HBM) or a new one is required. In terms of ordinary Bayesian probability theory, this corresponds to creating a new hypothesis when the evidence does not fit any of the existing ones. It thus breaks the Bayesian assumption that the probabilities of the hypotheses sum to 1. Thus the use of the HBM is Bayesian only for as long as there is no observed novelty. So far, the way that humans reason would seem to meet criticisms of ‘pure’ Bayes.
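To make the categorisation step concrete, here is a minimal sketch of the standard CRP rule, not the paper's own code. An observation joins existing category k with probability n_k / (n + alpha) and opens a new category with probability alpha / (n + alpha), where n_k is the number of earlier observations in category k and n is their total. The function names and the concentration parameter name `alpha` are my own labels. The sketch also ignores how well the observation fits each category; a full model would weight each option by a likelihood term as well.

```python
import random

def crp_probabilities(counts, alpha):
    """Prior probabilities under the standard CRP rule.

    counts: number of earlier observations in each existing category.
    alpha:  concentration parameter (hypothetical name), must be > 0.
    Returns (probabilities of each existing category, probability of a new one).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    total = sum(counts) + alpha
    existing = [c / total for c in counts]
    new = alpha / total
    return existing, new

def crp_assign(counts, alpha, rng):
    """Sample a category for the next observation and update counts in place."""
    existing, new = crp_probabilities(counts, alpha)
    # The last index stands for "open a new category".
    k = rng.choices(range(len(counts) + 1), weights=existing + [new])[0]
    if k == len(counts):
        counts.append(1)
    else:
        counts[k] += 1
    return k

# Simulate 50 observations arriving one at a time.
rng = random.Random(0)
counts = []
for _ in range(50):
    crp_assign(counts, alpha=1.0, rng=rng)
```

Note that the probabilities of the existing categories alone sum to less than 1; the remainder, alpha / (n + alpha), is the mass reserved for a category not yet seen.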
A pragmatic approach is to use the existing model unless and until it is definitely broken, and this seems to be how the paper says humans think. But the paper does not distinguish between the following two situations:
- We seem to be in a familiar, routine situation with no particular reason to expect surprises.
- We are in a completely novel situation, perhaps where others are seeking to outwit us.
The pragmatic approach seems reasonable when surprises are infrequent, ‘out of the blue’ and ‘not to be helped’. One proceeds as if one is a Bayesian until one has to change, in which case one fixes the Bayesian model (HBM) and goes back to being a de facto Bayesian. But if surprises are more frequent then there are theoretical benefits in discounting the Bayesian priors (or frequentist frequency information), discounting more the more surprises are to be expected. This could be accommodated by the CRP-based categorisation process, to give an approach that is pragmatic in a broad sense, but not in the pedantic James’ sense.
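One way to picture ‘discounting the priors’ is to raise the prior to a power between 0 and 1 before combining it with the likelihood. This is only a sketch under my own assumptions: the tempering scheme, the function name and the `discount` parameter are illustrative choices, not something taken from the paper. With no discount this is ordinary Bayes; with full discount the prior drops out and only the likelihood remains.

```python
def discounted_posterior(prior, likelihood, discount):
    """Combine a tempered prior with a likelihood over a finite set of hypotheses.

    prior:      prior probability of each hypothesis.
    likelihood: likelihood of the evidence under each hypothesis.
    discount:   0.0 = ordinary Bayes, 1.0 = prior ignored (hypothetical knob).
    """
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must lie in [0, 1]")
    weight = 1.0 - discount
    # Tempering: prior ** weight flattens the prior as the discount grows.
    unnormalised = [(p ** weight) * l for p, l in zip(prior, likelihood)]
    z = sum(unnormalised)
    if z == 0:
        raise ValueError("evidence has zero weight under every hypothesis")
    return [u / z for u in unnormalised]

prior = [0.9, 0.1]       # strong expectation of the first hypothesis
likelihood = [0.2, 0.8]  # evidence favours the second

routine = discounted_posterior(prior, likelihood, discount=0.0)
novel = discounted_posterior(prior, likelihood, discount=1.0)
```

In the routine case the strong prior still dominates the result; in the novel case the same evidence is judged on likelihood alone, so the more surprises one expects, the less the earlier experience counts.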
There are two other ways in which one might depart further from a pure Bayesian approach, although these are not covered by the paper:
- In a novel situation for which there is no sound basis for any ‘priors’, use likelihood-based reasoning rather than trying (as HBM does) to extrapolate from previous experience.
- In a novel situation, if previous experience has not provided a matching ‘template’ in HBM, consider other sources of templates, e.g.:
  - theoretical (e.g., mathematical) reasoning
  - advice from others
An interesting paper, but we perhaps shouldn’t take its endorsement of Bayesian reasoning too pedantically: there may be other explanations, and even if people are naturally Bayesians in the strict technical sense, that doesn’t necessarily mean that they are beyond education.