Turing’s Thesis

A. M. Turing SYSTEMS OF LOGIC BASED ON ORDINALS

A dissertation presented to the faculty of Princeton University in candidacy for the degree of Doctor of Philosophy. Recommended by the Department of Mathematics for acceptance May 1938

Abstract

The well known theorem of Gödel shows that every system of logic is in a certain sense incomplete, but at the same time it indicates means whereby from a system L of logic a more complete system L0 may be obtained. … It may be asked whether a sequence of logics of this kind is complete in the sense that to any problem A there corresponds an ordinal α such that A is solvable by means of the logic Lα.

1. The calculus of conversion. Gödel representations

A WFF [well formed formula] is said to be in normal form if it has no parts of the form …  and none of the form …  where M and N have no free variables.

A formula A is said to be convertible into another B (abbreviated AconvB) if there is a finite chain of immediate conversions leading from one formula to another.

We have not yet assigned any meaning to our formulae, and we do not intend to do so in general. …  In any case where any meaning is assigned to formulae it is desirable that the meaning be invariant under conversion.

I quote here some of the more important theorems concerning normal forms.
(A) If a formula has two normal forms they are convertible into one another by the use or (i) alone. (Church and Rosser [6], pages 479, 481.)
(B) If a formula has a normal form then every well formed part of it has a normal form. (Church and Rosser [6], pages 480–481.)
(C) There is (demonstrably) no process whereby one can tell of a formula whether it has a normal form. (Church [3], page 360, Theorem XVIII.)

If …  then … is called the Gödel representation (GR) of that sequence of symbols. No two WFF have the same GR.

 2. Effective calculability. Abbreviation of treatment

A function is said to be ‘effectively calculable’ if its values can be found by some purely mechanical process. Although it is fairly easy to get an intuitive grasp of this idea it is nevertheless desirable to have some more definite, mathematically expressible definition.

In most cases where we have to deal with an effectively calculable function we shall … In such cases there in no difficulty in seeing how a machine could in principle be designed to calculate the values of the function concerned, and assuming this done the equivalence theorem can be applied. A statement as to what the formula F actually is may be omitted.

There is another point to made clear in connection with the point of view we are adopting. It is intended that all proofs that are given should be regarded no more critically than proofs in classical analysis. The subject matter, roughly speaking, is constructive systems of logic, but as the purpose is directed towards choosing a particular constructive system of logic for practical use; an attempt at this stage to put our theorems into constructive form would be putting the cart before the horse.

These computable functions which take only the values 0 and 1 are of particular importance since they, determine and are determined by computable properties, as may be seen by replacing ‘0’ and ‘1’ by ‘true’ and ‘false’. But besides this type of property we may have to consider a different type, which is, roughly speaking, less constructive than the computable properties, but more so than the general predicates of classical mathematics. Suppose we have a computable function of the natural numbers taking natural numbers as values, then corresponding to this function there is the property of being a value of the function. Such a property we shall describe as ‘axiomatic’; the reason for using this term in that it is possible to define such a property by giving a set of axioms, the property to hold for a given argument if and only if is possible to deduce that it holds from the axioms.

Axiomatic properties may also be characterized in this way. A property ψ of positive integers is axiomatic if and only if there is a computable property ϕ of two positive integers, such that ψ(x) is true if and only if there is a positive integer y such that ϕ(x,y) is true. Or again ψ is axiomatic if and only if there is a WFF F such that ψ(n) is true in and only if F(n)conv 2.

 3. Number theoretic theorems

By a number theoretic theorem we shall mean a theorem of the form ‘θ vanishes for infinitely many natural numbers x’, where θ(x) is a primitive recursive We shall say that a problem is number theoretic if it has been shown that any solution of the problem may be put in the form of a proof of one or more number theoretic theorems. More accurately we may say that a class of problems is number theoretic if the solution of any one of them can be transformed (by a uniform method) into the form of proofs of number theoretic theorems.

We shall now show that questions as to the truth of statements of form ‘does f(x) vanish identically’, where f(x )is a computable function, can be reduced to questions as to the truth of number theoretic theorems. It is understood that in each case the rule for the calculation of f(x) is given and that one is satisfied that this rule is valid, i.e. that the machine which should calculate f(x) is circle free (Turing [17]). … The converse of this result is not quite true …  every number theoretic theorem is equivalent to the statement that a corresponding machine is circle free. …. The machine is circle free if and only if an infinity of these figures are printed, i.e. if and only if θ(x) vanishes for infinitely many values of the argument.

4. A type of problem which is not number theoretic

Let us suppose that we supplied with some unspecified means of solving number theoretic problems; a kind of oracle as it were. We will not go any further into the nature of this oracle than to say that it cannot be a machine; with the help of the oracle we could form a new kind of machine (call them o-machines), having as one of its fundamental processes that of solving a given number theoretic problem.  More definitely these machines are to behave in this way. The moves of the machine are determined as usual by a table except in the case of moves from a certain internal configuration o. If the machine is in the internal configuration o and if the sequence of symbols marked with l is then the well formed formula A, then the machine goes into the internal d or n according as it is or is not true that A is dual. The decision as to which is the case is referred to the oracle.

Given any one of these machines we may ask ourselves the question whether or not it prints an infinity of figures 0 or 1; I assert that this class of problems is not number theoretic. In view of the definition of ‘number theoretic problem’ this means to say that it is not possible to construct an o-machine which when supplied with the description of any other o-machine will determine whether that machine is o-circle free. The argument may be taken directly from Turing [17], page 8.

5. Syntactical theorem as number theoretic theorems

I shall mention a property of number theoretic theorems which suggests that there is reason for regarding them as of particular importance.
Suppose that we have some axiomatic system of a purely formal nature. We do not interest ourselves at all in interpretation for the formulae of this system. They are to be regarded as of interest for themselves.

Now consider a new rule of procedure which is reputed to yield formulae provable in the original sense. We may ask ourselves whether such a rule is valid. The statement that such a rule is valid would be number theoretic.

It might plausibly be argued that all theorems of mathematics which have any significance when taken alone, are in effect syntactical theorems of this kind stating … the validity of certain ‘derived rules’ of procedure. Without going so far as this I should say that theorems of this kind have an importance which make it worth while to give them special consideration.

6. Logic formulae

We shall call a formula L a logic formula (of, it is clear that we are speaking of a WFF, simply a logic) if it has the property that if A is a formula such that L(A)conv 2 then A is dual.

A logic formula gives us a means of satisfying ourselves of the truth of number theoretic theorems.

If L is a logic the set of formulae A for which L(A)conv 2 will be called the extent of L.

[Consider] the property that if A has no free variables then … is always convertible to 1 or to 2 or else has no normal form. A logic with this property will be said to be standardized.

We shall say that a logic L0 is at least as complete as a logic L if the extent of L is a subset of the extent of L0. The logic L0 will be more complete than L if the extent of L is a proper subset of the extent of L0.

In the case where we have symbolic logic those propositions can be interpreted as number theoretic theorems, but are not expressed in the form of the duality of formulae we shall again have a corresponding logic formula, but its relation to the symbolic logic will not be so simple. As an example let us take the case that the symbolic logic proves that certain primitive recursive functions vanish infinitely often.

We shall say that a single logic formula L is complete if its extent includes all dual formulae; that is to say that it is complete if it enables us to prove every true number theoretic theorem. It is a consequence of the theorem of G¨odel (if suitably extended) that no logic formula is complete, and this also follows from (C), page 5 or from the results of Turing [17], §8, when taken in conjunction with §3 of the present paper. The idea of completeness of a logic formula will not therefore be very important, although it is useful to have a term for it.

 7. Ordinals

There are numerous possibilities, and little to guide us as to which definition should be chosen. No one of them could well be described as ‘wrong’; some of them may be found more valuable in applications than others, and the particular choice we have made has been partly determined by the applications we have in view. In the case of theorems of a negative character one would wish to prove them for each one of the possible definitions of ‘ordinal formula’. This program could I think be carried through for the negative results of §9, §10.

To prove that there is no general method for determining of a formula whether it is an ordinal formula we use an argument akin tot hat leading to the Burali-Forti paradox, but the emphasis and the conclusion are different. …. This argument proves more than was originally asserted. In fact it proves that if we take any class E of ordinal formulae in normal form, such that if A is any ordinal formula then there is a formula E representing the same ordinal as A, then there is no method whereby one can tell whether a WFF in normal form belongs to E.

8. Ordinal logics

An ordinal logic is a WFF Λ such that Λ(Ω) is a logic formula whenever Ω is an ordinal formula.

This definition is intended to bring under one heading a number of ways of constructing logics which have recently been proposed or are suggested by recent advances. In thin section I propose to show how to obtain some of these ordinal logics.

9 Completeness questions

The purpose of introducing ordinal logics was to avoid as far as possible the effects of Gödel’s theorem. It is a consequence of this theorem, suitably modified, that [it] is impossible to obtain a complete logic formula, or (roughly speaking now) a complete system of logic. We are able, however, from a given system to obtain a more complete one by the adjunction as axioms of formulae, seen intuitively to be correct, but which Gödel theorem shows are unprovable in the original system; from this we obtained a yet more complete system by a repetition of the process and so on. We found that the repetition of the process gave us a new system for each C-K ordinal formula. We should like to know whether this process suffices, or whether the system should be extended in other ways as well. If it were possible to tell of a WFF in normal form whether it was an ordinal form we should know for certain that it was necessary to extend in other ways.

It should be noticed that our definitions of completeness refer only to number theoretic theorems. Although it would be possible to introduce formulae analogous to ordinal logics which would prove more general theorems than number theoretic ones, and have a corresponding definition of completeness, yet if our theorems are too general we shall find that our (modified) ordinal logics are never complete. This follows from the argument of §4. If our ‘oracle’ tells us, not whether any given number theoretic statement is true, but whether a given formula is an ordinal formula, the argument still applies, and we find there are classes of problems which cannot be solved by a uniform process even with the help of this oracle. This is equivalent to saying that there is no ordinal logic of the proposed modified type which is complete with respect to these problems. This situation becomes more definite if we take formulae satisfying conditions (a)–(e), (f0) (as described at the end of §12) instead of ordinal formulae; it is then not possible for the ordinal logic to be complete with respect to any class of problems more extensive than the number theoretic problems.

We might hope to obtain some intellectually satisfying system of logical inference (for the proof of number theoretic theorems) with some ordinal logic. Gödel’s theorem shows that such a system cannot be wholly mechanical, but with a complete ordinal logic we should be able to confine the non-mechanical steps entirely to verifications that particular formulae are ordinal formulae.

We might also expect to obtain an interesting classification of number theoretic theorems according to ‘depth’. A theorem which required an ordinal α to prove it would be deeper than one which could be proved by use of an ordinal β less than α. However, this presupposes more than is justified. We define
Definition of invariance of ordinal logics. An ordinal logic Λ is said to be invariant up to an ordinal α whenever Ω, Ω0 are ordinal formulae representing the same ordinal less than α, the extent of Λ(Ω) is identical with the extent of Λ(Ω). An ordinal logic is invariant if it is invariant up to each ordinal represented by an ordinal formula.
Clearly the classification into depths presupposes that the ordinal logic used is invariant.
Among the questions we should now like to ask are
(a) Are there any complete ordinal logics?
(b) Are there any complete invariant ordinal logics?

The question (b) must be answered negatively.

10. The continuum hypothesis. A digression

We ask ‘Is it possible to find a computable function of ordinal formulae determining a one-one correspondence between the ordinals represented by ordinal formulae and the computable sequences of figures 0, 1?’. … The answer is no … .

11. The purpose of ordinal logics

Mathematical reasoning may be regarded rather schematically as the exercise of a combination of two faculties† , which we may call intuition and ingenuity. The activity of the intuition consists in making spontaneous judgments which are not the result of conscious trains of reasoning. These judgments are often, but by no means invariably correct (leaving aside the question as to what is meant by ‘correct’). Often it is possible to find some other way of verifying the correctness of an intuitive judgment. One may for instance judge that all positive integers are uniquely factorizable into primes; a detailed mathematical argument leads to the same result. It will also involve intuitive judgments, but they will be ones less open to criticism than the original judgement about factorization. I shall not attempt to explain this idea of ‘intuition’ any more explicitly.

† Footnote: We are leaving out of account that most important faculty which distinguishes topics of interest from others; in fact we are regarding the function of the mathematician as simply to determine the truth or falsity of propositions.

The exercise of ingenuity in mathematics consists in aiding the intuition through suitable arrangements of propositions, and perhaps geometrical figures or drawings. It is intended that when these are really well arranged validity of the intuitive steps which are required cannot seriously be doubted.

The parts played by these two faculties differ of course from occasion to occasion, and from mathematician to mathematician. This arbitrariness can be removed by the introduction of a formal logic. The necessity for using the intuition is then greatly reduced by setting down formal rules for carrying out inferences which are always intuitively valid. When working with a formal logic the idea of ingenuity takes a more definite shape. In general a formal logic will be framed so as to admit a considerable variety of possible steps in any stage in a proof. Ingenuity will then determine which steps are the more profitable for the purpose of proving a particular proposition. In pre-Gõdel times it was thought by some that it would probably be possible to carry this program to such a point that all the intuitive judgments of mathematics could be replaced by a finite number of these rules. The necessity for intuition would then be entirely eliminated.

In our discussions, however, we have gone to the opposite extreme and eliminated not intuition but ingenuity, and this in spite of the fact that our aim has been in much the same direction. We have been trying to see how far it is possible to eliminate intuition, and leave only ingenuity. We do not mind how much ingenuity is required, and therefore assume it to be available in unlimited supply. In our meta-mathematical discussions we actually express this assumption rather differently. We are always able to obtain from the rules of a formal logic a method for enumerating the propositions proved by its means. We then imagine that all proofs take the form of a search through this enumeration for the theorem for which a proof is desired. In this way ingenuity is replaced by patience. In these heuristic discussions however, it is better not to make this reduction.
Owing to the impossibility of finding a formal logic which will wholly eliminate the necessity of using intuition we naturally turn to ‘non-constructive’ systems of logic with which not all the steps in a proof are mechanical, some being intuitive. An example of a non-constructive logic is afforded any ordinal logic. When we have an ordinal logic we are in a position to prove number theoretic theorems by the intuitive steps of recognizing formulae as ordinal formulae, and the mechanical steps of carrying out conversions.

What properties do we desire a non-constructive logic to have if we are to make use of it for the expression of mathematical proofs?
We want it to be quite clear when a step makes use of intuition, and when it is purely formal. The strain put on the intuition should be a minimum. Most important of all, it must be beyond all reasonable doubt that the logic leads to correct results whenever the intuitive steps are correct. It is also desirable that the logic be adequate for the expression of number theoretic theorems, in order that it may be used in meta-mathematical discussions (cf §5). …  There remains the system of Church which is free of these objections.

 12. Gentzen type ordinal logics

In proving the consistency of a certain system of formal logic Gentzen ([7]) has made use of the principle of transfinite induction for ordinals less than 0, and suggested that it is to be expected that transfinite induction carried sufficiently far would suffice to solve all problems of consistency. Another suggestion to base systems of logic on transfinite induction has been made by Zermelo (Zermelo [20]). In this section I propose to show how this method of proof may be put into the form of a formal (non-constructive) logic, and afterwards to obtain from it an ordinal logic.

It is to be understood that the relations that we are about to define hold only when compelled to do so by the conditions we lay down. The conditions should be taken together as a simultaneous inductive definition of all the relations involved.

A Summary

Wikipedia summarises the result has:

in all systems with finite sets of axioms, an exclusive-or condition applies to expressive power and provability; ie one can have power and no proof, or proof and no power, but not both.

In addition:

Martin Davis states that although Turing’s use of a computing oracle is not a major focus of the dissertation, it has proven to be highly influential in theoretical computer science, e.g. in the polynomial time hierarchy.

My Comments

It seems to me that there is much more than can be drawn from Turing’s approach, methods, criteria, definitions and intermediate results.

  • If we wish to rely on any particular logic to any particular problem (not just in number theory), we should worry – as Turing does – about the potential for error, and seek to establish and consider criteria which may support its validity. For example:
    • We need to be very careful that our concepts of ‘axiom’, ‘normal form’, etc. are adequate for our logic.
  • We can’t simply assume that our subject is computable in Turing’s sense.
  • We can’t simply assume that a theory of representations that is appropriate for mathematical structures will be appropriate for our application.
  • In particular we can’t simply assume that a theory of probability or decision theory applies in a particular case, no matter how logical the argument may be, unless we seriously consider the appropriateness of the logic.
  • That even if our subject is not amenable to a number-theoretic-like logic, there may be other appropriate logics.

In the classical period people used classical logic and even had classically logical arguments to support their intuition that this was appropriate, even necessary. More reasonable people, like Newton, reasoned more pragmatically. Classical logic eventually lost its credibility as being ‘the one true logic’ under the weight of the observed anomalies, to be replaced – at least in mainstreams sciences – by extended logics. In physics these are mostly number-theory-like, as if reality has a number-theory-like structure. But it is not at all clear that it does, merely that many scientists know how to reason as if does, so that is what is regarded as ‘scientific’. Currently science is rich in anomalies, but these are not usually seen as a challenge to the underlying logics: merely to misapplication of the logic. But Turing seems to suggest that this condition may not hold, leading to another Kuhnian paradigm-shift.

It also seems to me that Turing’s own work was guided by an intuition that ‘his’ number-theory-like logic was not appropriate to  biology, at least. In particular, the logical underpinnings of his ‘modern Bayesian’ practice seem to me quite different to contemporary ‘number-theory-like’ practice, which while it applies a Turing-type logic, seems not to understand its limitations in the same way that he did. I may blog some more on this.

Dave Marsay

 

 

Advertisements
%d bloggers like this: