# Russell’s Probability

B. Russell, Human Knowledge: Its Scope and Limits, George Allen and Unwin, 1948.

## Part V. Probability

Russell considers probability in so far as it pertains to induction and scientific inference. He reviews the then-mainstream material and adapts Keynes’ theory. His account is perhaps less obscure than Keynes’ but still demanding, possibly because most readers will have been conditioned to the view that statements of the form ‘P(A|B)=p’ always make sense, and so fail to see what the fuss is about.

His main finding is that probability statements may be more or less credible, and credibility is different from probability.

### Introduction

It is generally recognized that the inferences of science and common sense differ from those of deductive logic … in … that, when the premises are true and the reasoning correct, the conclusion is only probable. … doubt that is justified is of three sorts:
… there may be relevant facts of which we are ignorant
… the laws that we have to assume in order to predict the future may be untrue
… we know a law to the effect that something happens usually … though not always … .

Mathematical probability arises always from a combination of two propositions, of which one may be completely known, while the other is completely unknown. …

We infer, in science, not only laws but also particular facts. … But although some knowledge of facts as yet unperceived can be acquired in this way, it is impossible to get far without knowledge of general laws. Such laws … state probabilities (in one sense) and are themselves only probable (in another).

Russell draws particular attention to ‘laws that we have to assume in order to predict the future’, without – at this stage – considering whether we can predict the future or if there is some form of anticipation, falling short of probabilistic prediction, that can inform action in a more principled, and perhaps more effective, way.

### Ch. I  Kinds of Probability

Attempts to establish a logic of probability have been numerous, but to most of them there have been fatal objections. One of the causes … has been failure to distinguish … essentially different concepts. …

The first large fact … is the existence of the mathematical theory of probability. There is … a fairly complete agreement as to everything that can be expressed in mathematical symbols, but an entire absence of agreement as to the interpretation of the mathematical formulae. …

… When we use probability as a guide to conduct, it is because our knowledge is inadequate ; we know that the event in question is one of a class B of events, and we may know what proportion of this class belongs to some class A in which we are interested. But the proportion will vary according to our choice of class B ; we shall thus obtain different probabilities, all equally valid from a mathematical standpoint. …
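The dependence on the choice of class B can be made concrete with a small sketch. The records, attribute names and classes below are invented purely for illustration and are not taken from Russell; the point is only that one and the same individual (here an old smoker) falls under several classes B, each giving a different proportion belonging to A.

```python
from fractions import Fraction

# Hypothetical records: (age_band, smoker, had_event). Invented for illustration only.
RECORDS = [
    ("young", False, False),
    ("young", False, False),
    ("young", True, False),
    ("young", True, True),
    ("old", False, False),
    ("old", False, False),
    ("old", True, True),
    ("old", True, True),
]


def proportion(records, in_class_b):
    """Proportion of class B (picked out by in_class_b) that also belongs to class A (had_event)."""
    members = [r for r in records if in_class_b(r)]
    hits = [r for r in members if r[2]]
    return Fraction(len(hits), len(members))


# An old smoker belongs to every one of these candidate classes B.
reference_classes = {
    "everyone": lambda r: True,
    "old": lambda r: r[0] == "old",
    "smokers": lambda r: r[1],
    "old smokers": lambda r: r[0] == "old" and r[1],
}

class_proportions = {
    name: proportion(RECORDS, test) for name, test in reference_classes.items()
}

for name, p in class_proportions.items():
    print(f"B = {name:12s} proportion in A = {p}")
```

With these made-up figures the four classes give four different proportions, and nothing in the arithmetic says which one should guide conduct.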

The probability which is a guide to life is not the mathematical kind … . If we assert … that all our knowledge is doubtful, we cannot define this doubtfulness in the mathematical way, for in the compiling of statistics it is assumed that we know whether or not this A is a B … . Statistics are built up on a structure of assumed certainty as to past instances, and a doubtfulness which is universal cannot be statistical.

… The importance of probability in practice is due to its connection with credibility, but if we imagine this connection to be closer than it is, we bring confusion into the theory of probability.

### Ch. II  Mathematical Probability

This deals with the case where probability can be represented by a number, not Keynes’ broader concept. Unfortunately for the modern reader it does adopt Keynes’ obscure notation. Other reviews follow.

### Ch. V  Keynes’ Theory of Probability

Keynes’ theory of probability is much broader than what Russell calls the ‘mathematical theory’, containing it as a special case. Russell:

• notes some important technical objections to Keynes’ formulation
• develops ideas on the principle of indifference.

Russell uses Keynes’ notation, which has the technical advantage of not misleading in the way that conventional notation does, but the considerable disadvantage of being unfamiliar; in any case most readers will translate it into conventional terminology. Nonetheless, Russell’s overview is much more accessible than Keynes’ original, and might be a good place for the logically minded to start.

### Ch. VI  Degrees of Credibility

The degree of credibility is the relative reliability of a statement. It is not the same as mathematical probability (above) and cannot always be represented and handled in a similar way. The notion of classification is critical.

#### A.  General Considerations

… the rational man … will be guided by the mathematical theory of probability when it is applicable.

#### B.  Credibility and Frequency

… All calculations of probability have to do with classes which can be defined in terms of the fundamental class. But the fundamental class itself must consist of members which cannot be logically defined in terms of the data. I [Russell] think that when this principle is fulfilled the principle of indifference is always satisfied.

Here the principle of indifference relies on having identified the atomic factors, something which Whitehead, for example, regards as always doubtful.

#### D.  Degrees of Subjective Certainty

Scientific method, broadly speaking, consists of techniques and rules designed to make degrees of belief coincide as nearly as possible with degrees of credibility. We cannot, however, begin to seek such harmony unless we can start from propositions which are both epistemologically credible and subjectively near certain. … We assume in practice that a class of beliefs may be regarded as true if
(a) they are firmly believed by all who have carefully considered them,
(b) there is no positive argument against them,
(c) there is no known reason for supposing that mankind would believe them if they were untrue.

Perfect rationality consists, not in believing what is true, but in attaching to every proposition a degree of belief corresponding to its degree of credibility. …

Note particularly condition (c), which is often overlooked.

#### E.  Probability and Conduct

The probabilities [degrees of credibility] concerned are to be estimated by the rules of “expectation”. … Since distant consequences seldom have any appreciable probability, this justifies the practical man in usually confining his attention to the less remote consequences of his action.

… we should, in practice, treat as certain whatever has a very high degree of probability. This is merely a matter of common sense, and raises no issue that is of interest to the theory of probability.

Russell’s ‘practical man’ seems similar to Keynes’, and hence may exclude those with long-run considerations. In essence, ‘the practical man’ ignores uncertainty beyond probability. This seems to justify Popper’s incrementalism and ‘no-strategy strategies’. It is stated without any supporting argument, and without addressing the proper scope and limitations of the ‘practical man’.
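Russell’s appeal to the rules of “expectation” can be sketched numerically. The consequences, probabilities and values below are hypothetical, chosen only to show how a remote consequence with a very small probability barely moves the probability-weighted sum. The sketch assumes that every consequence has a definite numerical probability and a bounded value, which is exactly the assumption that the comments above call into question.

```python
# Hypothetical consequences of one action: (description, probability, value).
# All figures are invented. The consequences are not mutually exclusive outcomes,
# so the probabilities need not sum to 1; expectation is additive regardless.
CONSEQUENCES = [
    ("immediate gain", 0.90, 10.0),
    ("next-year side effect", 0.10, -20.0),
    ("remote consequence", 0.001, -50.0),
]


def expectation(consequences):
    """Sum of probability-weighted values (the usual rule of expectation)."""
    return sum(p * v for _, p, v in consequences)


full = expectation(CONSEQUENCES)           # every consequence included
near_only = expectation(CONSEQUENCES[:2])  # remote consequence ignored

print(f"all consequences:       {full:.3f}")
print(f"near consequences only: {near_only:.3f}")
```

Here ignoring the remote consequence changes the result by only 0.05, but that depends entirely on the invented figures; it says nothing about cases where the remote probability is not known or the stakes are not bounded.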

### Ch. VII  Probability and Induction

#### A.  Statement of the Problem

Naive induction has it that if something has always been so, then it will continue to be so. It will be argued that in its full generality this is false (change can happen), and some more defensible variants will be considered.

#### B.  Induction by Simple Enumeration

… Whatever limitation may be necessary to make [any induction] principle valid must be stated in terms of the intensions by which the classes … are defined, not in terms of extensions.

Here the intensions are the logical formulae by which classes are defined, as distinct from extensions, which are particular memberships. For example, if one supposes that all criminals have some attribute of criminality then one is logically justified in applying inductive reasoning to criminal cases. But if one supposes that ‘people found guilty of crime C’ is a rather arbitrary bunch, there is no justification for applying inductive reasoning. Russell also gives some mathematical counter-examples.
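The distinction can be illustrated with a toy sketch. This is not one of Russell’s own counter-examples; the numbers and names are invented. A class β defined by a rule (an intension) and a class β defined as a bare list of the cases seen so far (an extension) agree on every observed instance, yet only the former carries over to the next instance.

```python
# alpha: multiples of 4, examined in order. Purely illustrative data.
observed = [4, 8, 12, 16]
next_case = 20


def beta_by_rule(n):
    """beta defined by intension: 'is an even number'."""
    return n % 2 == 0


# beta defined by extension: exactly the cases examined so far, and nothing else.
beta_by_list = set(observed)

# Both definitions fit all of the evidence equally well...
all_observed_fit_rule = all(beta_by_rule(n) for n in observed)
all_observed_fit_list = all(n in beta_by_list for n in observed)

# ...but they disagree about the next, unexamined case.
rule_holds_next = beta_by_rule(next_case)
list_holds_next = next_case in beta_by_list

print(all_observed_fit_rule, all_observed_fit_list)  # evidence so far
print(rule_holds_next, list_holds_next)              # the next instance
```

Simple enumeration cannot tell the two definitions apart from the observed instances alone, which is why any restriction on induction has to be stated in terms of how the classes are defined.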

#### C.  Mathematical Treatment of Induction

From the time of Laplace onward, various attempts have been made to show that the probable truth of an inductive inference follows from the mathematical theory of probability. It is now generally agreed that these attempts were all unsuccessful, and if inductive arguments are valid it must be true of some extra-logical characteristic of the actual world … .

Keynes … has done the best that can be done for induction on purely mathematical lines. …

#### D. Reichenbach’s Theory

[Reichenbach’s] posit is this: … if … after a sufficient number of α’s have been examined, the proportion that are β’s is always roughly m/n, then this proportion will continue however many instances of α may be subsequently observed.

One may object:

• This is a strong form of statistical stationarity, inconsistent with life.
• It is common for proportions to seem to have stabilised for a long period, only for there to be a sudden change. How does one know when one has observed ‘sufficient’ data?
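The second objection can be sketched with an invented sequence. The data below are hypothetical: the proportion of α’s that are β’s sits at exactly 1/2 for the first thousand cases and then the underlying regime changes. An observer applying the posit after any of the early checkpoints would have judged the data ‘sufficient’.

```python
from fractions import Fraction

# Hypothetical sequence of alphas: 1 means the alpha is a beta.
# Regime 1: alternating 1, 0 for 1000 cases. Regime 2: every alpha is a beta.
sequence = [1, 0] * 500 + [1] * 1000


def running_proportion(seq, n):
    """Proportion of betas among the first n alphas."""
    return Fraction(sum(seq[:n]), n)


checkpoints = [100, 500, 1000, 1500, 2000]
running = {n: running_proportion(sequence, n) for n in checkpoints}

for n, p in running.items():
    print(f"after {n:4d} alphas: proportion of betas = {p}")
```

The proportion is identical at 100, 500 and 1000 observations and then drifts upward, so stability over a long run is not, by itself, evidence that the run was long enough.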

#### E.  Conclusions

First: there is nothing in the mathematical theory … to justify us in regarding … induction as probable … .

Second: if no limitation is placed on the character of the intensional definitions of the classes … the principle of induction can be shown to be not only doubtful but false.

… [thus] …

Fifth: scientific inferences, if they are in general valid, must be by virtue of some law or laws of nature, stating a synthetic property of the world, or several such properties. …

So-called mathematical probability assigns numbers to propositions consistent with some firmly established ‘laws’. Keynes’ theory is more general, so that probability is explicitly dependent on what is ‘known’. In particular, the principle of indifference may not always be applicable.

Russell notes that probabilistic reasoning, like any other, may not be reliable when there is strong motive to reach some conclusion rather than none, or when a group consensus has not been tested in practice. In the ‘mathematical’ theory, since P(X) + P(not X) = 1, you are forced to assign a probability of at least 0.5 to at least one of ‘X’ and ‘not X’. In Keynes’ approach you can treat both as very doubtful.

Russell also briefly touches on the point that conditional probabilities (P(A|B)) depend on the class, B, that is conditioned over. In familiar situations one can perhaps take it for granted that conditioning over one’s habitual classes will be adequate, but not otherwise.

Despite Russell’s objections, we can perhaps use probabilistic notions as long as we recognize that they only apply to ‘the current epoch’ (in Whitehead’s terms); that is, that they are conditional on Reichenbach’s posit.

Other works on probability.

Dave Marsay