Robinson’s Information Sciences

Lyn Robinson and Murat Karamuftuoglu, The nature of information science: changing models, Proceedings of the Seventh International Conference on Conceptions of Library and Information Science — "Unity in diversity" — Part 2

Abstract

Introduction.

This paper considers the nature of information science as a discipline and profession.

Method.

It is based on conceptual analysis of the information science literature, and consideration of philosophical perspectives, particularly those of Kuhn and Peirce.

Results.

It is argued that information science may be understood as a field of study, with human recorded information as its concern, focusing on the components of the information chain, studied through the perspective of domain analysis, in specific or general contexts. A particular aspect of interest is those aspects of information organization, and of human information-related behaviour, which are invariant to changes in technology. Information science can also be seen as a science of evaluation of information, understood as semantic content with respect to qualitative growth of knowledge and change in knowledge structures in domains.

Conclusions.

This study contributes to the understanding of the unique ‘academic territory’ of information science, a discipline with an identity distinct from adjoining subjects.

Main Body

This seems to me to say something insightful about the broad practice of science.

Quantitative models of information

The paper describes Shannon's mathematical theory of communication in terms of "the probability of selection of a particular symbol" "associated with a source, S", making no distinction between a real physical source and a model of it. As mathematics, Shannon's theory clearly applies only to mathematical models of sources, not to actual sources. One might suppose that an actual source was, in some real sense, stochastic, and even that it had some definite probability distribution. But this would be a claim of physics, and from a strictly mathematical point of view a mere conjecture. (Moreover, it is not a conjecture that one could prove, only test.)

“Information generated when a particular symbol is selected from a set of possible symbols is called self-information or surprisal, which measures the uncertainty associated with the selection of the symbol …”
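
In symbols (the standard definition, not quoted from the paper): if the model assigns probability p(x) to a symbol x, its surprisal is

    I(x) = -\log_2 p(x)

The p here belongs to the model; nothing in the formula refers to the physical source.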

This all depends on the model of the source, and not on the actual source. If the source is actually, say, a stream of encrypted data, then this measure leaves out an important aspect of 'uncertainty', namely uncertainty about the model itself. This latter uncertainty is not generally (or maybe ever) measurable, and is quite different from probability. Moreover, in many applications of Shannon, the main information of interest is not about the choice of symbol from some assumed source, but about the nature of the source. (For example, the distinction between the chance that a fair coin produces a long run of identical outcomes and the chance that the coin is not fair.)
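
To make the coin example concrete, here is a minimal sketch of my own (not from the paper); the run length of 20 and the 0.01 prior on a two-headed coin are arbitrary assumptions:

    import math

    def surprisal(p_outcome: float) -> float:
        """Self-information, in bits, of an outcome to which the model assigns probability p_outcome."""
        return -math.log2(p_outcome)

    n = 20  # an observed run of n heads in a row

    # Under the assumed fair-coin model, the run has probability 0.5**n.
    p_fair = 0.5 ** n
    print(f"Surprisal under the fair-coin model: {surprisal(p_fair):.1f} bits")  # 20.0 bits

    # A model that admits doubt about the coin: with prior 0.01 it is two-headed,
    # otherwise fair. The same run is now far less surprising.
    p_mixed = 0.01 * 1.0 + 0.99 * (0.5 ** n)
    print(f"Surprisal under the mixed model: {surprisal(p_mixed):.1f} bits")  # about 6.6 bits

The number depends entirely on which model is written down; the uncertainty about the model itself never appears as a probability in the first calculation.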

The mathematical theory of communication is rightly criticised for not being relevant to information science, the main concern of which is the interpretation of documents, i.e., what documents are about or mean.

On the other hand …

Situation theory provides an ontology (objects, situations, channels, etc.) and a set of logical principles (inference rules) that operate on the objects and situations through channels.

Just as many people suppose that ‘probability measures’ are somehow ‘real’, so too it is often assumed that reality comprises things like objects, which is at least a controversial opinion. It is clearer to say that conventional models of reality often comprise objects etc.

Collectively, ontology and the set of inference rules determine the scope of deductions that can be made, and thus, the type of questions asked and answered about the state of affairs in a given situation.

There is a subtlety here:

In computer science and information science, an ontology encompasses a representation, formal naming and definition of the categories, properties and relations between the concepts, data and entities that substantiate one, many or all domains of discourse. (Wikipedia)

That is, a physics ontology concerns the type of questions asked and answered about the state of affairs in a given situation for a particular theory of physics: it has no other relevance to any supposed reality that is being theorised about. This is in contrast to the more general usage.

To put it in other words, situation theory allows deductions once a model of the world is given in terms of objects and channels that represent the relationships between them.

This seems reasonable for an abstract theory, such as Geometry. But to what extent do these deductions take account of the limitations of empirical theories? (For example, economic theories often imply stability, but only because it is implicitly assumed. So in one sense stability is a valid deduction: in another it is not.)
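
As an illustration of deductions being confined to a model, here is a toy sketch of my own. It is a drastic simplification, not situation theory's actual formalism, and the 'smoke carries the information that there is fire' channel is just an assumed example:

    from typing import Set, Tuple

    Fact = str
    Channel = Tuple[Fact, Fact]  # (antecedent, consequent): the antecedent carries the information that the consequent holds

    def deduce(situation: Set[Fact], channels: Set[Channel]) -> Set[Fact]:
        """Close a set of facts under the channels (simple forward chaining)."""
        facts = set(situation)
        changed = True
        while changed:
            changed = False
            for antecedent, consequent in channels:
                if antecedent in facts and consequent not in facts:
                    facts.add(consequent)
                    changed = True
        return facts

    channels = {("smoke", "fire")}      # the model: smoke carries the information that there is fire
    print(deduce({"smoke"}, channels))  # {'smoke', 'fire'}

Whether the channel actually holds of the world is exactly the kind of question the deduction cannot address.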

Quantitative and qualitative change

'What is a qualitative change?' is a difficult question to answer rigorously. Intuitively, the term qualitative invokes the image of the creation of something new out of old where the steps involved in the transformation of the old into new are not obvious.

In terms of Shannon, a qualitative change might be a change in the source. For economics it might be an actual instability.

We, thus, define quantitative change as a process that leads from one state (old) to another (new) following an effective method. Inferences allowed in situation theory, and generally all deductive argumentation, are essentially effectively calculable.

But how do we know when a source is 'effectively calculable'? Economies, for example, would appear not to be, so the application of 'quantitative methods' would seem illegitimate. (My point is that deductions 'within' a science – as distinct from 'about' a science – can avoid such subtleties.)

Induction is a method of reasoning from particular to general, which produces only probable conclusions that need to be verified by future observations.

Further

C.S. Peirce, prominent philosopher and semiotician, calls the process of creation of a hypothesis from incomplete evidence as abduction. Abduction is different from both deduction and induction in that neither the rule nor the case is given. The rule is hypothesised and, based on this hypothesis, a case is concluded. Abduction is a creative process of hypothesis forming, in which, based on the relevant evidence, the hypothesis that best explains a given phenomenon is formulated. In Peirce’s words: ‘Abduction is the process of forming explanatory hypothesis. It is the only logical operation which introduces any new idea’ … .

The difference here seems unclear. Both induction and abduction produce reasonable conjectures that need to be tested. The difference seems to be that while in both cases the process of hypothesis forming 'creates' a conjecture, in the case of induction this is generally regarded as a routine statistical operation, whereas for abduction it is regarded as 'creative' in the sense of non-routine or perhaps not 'effectively calculable'.

Abduction is characterised thus:

Abduction (Hypothesis)
1. The surprising fact, F, is observed;
2. But if H were true, F would be a matter of course.
3. Hence, there is reason to suspect that H is true (Peirce 1958, v 5, para. 189).

Thus whenever one has a model, M, of a source, S, and uses induction to derive a parameter, p, of M, the notion that this model and parameter somehow apply to the real source, S, is an abduction. Hence, all applied sciences are abductive, whether or not they present themselves as deductive or inductive.
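
Continuing the coin example (again a sketch of my own, not the paper's; the simulated data and sample size are arbitrary assumptions): the estimation step below is routine induction, while the claim that the fitted model describes the physical coin is the abductive step, and it appears nowhere in the calculation.

    import random

    random.seed(1)

    # The "real" source: simulated here, but in practice its nature is unknown to us.
    flips = [random.random() < 0.5 for _ in range(1000)]  # True = heads

    # Induction: fit the parameter p of a Bernoulli model M to the observations.
    p_hat = sum(flips) / len(flips)
    print(f"Estimated P(heads) = {p_hat:.3f}")

    # Abduction (never computed): the claim that M, with parameter p_hat, describes
    # the physical coin itself. That is a hypothesis to be tested, not a further
    # calculation within the model.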

We are now able to relate this idea back to the earlier material of this paper, by noting that the domain-analytic approach suggests that specific theories in a given field of study are built on more general meta-theoretical frameworks (paradigms) or worldviews, which in turn are built on specific philosophical assumptions.

A metatheory, or as sometimes called a paradigm, is essentially a set of principles that prescribes what is acceptable and unacceptable as theory in a scientific discipline.

Kuhn’s work on the history of science (Kuhn 1962) shows that there is normally a single central paradigm, a single way of doing science, which he called “normal science”, in established fields such as physics and astronomy. Before the establishment of a paradigm in a field of study, there is a period that Kuhn called “prescience”, which is characterised by the existence of two or more alternative frameworks that compete to become the dominant paradigm.

Hence, it is arguable from a Kuhnian perspective that relevance assessment and classification of documents should be carried out in terms of the objective movement of competing theories and metatheories/paradigms in a domain. In other words, qualitative judgement of documents requires an understanding of the qualitative growth of knowledge, and change in knowledge structures in domains. This is another defining feature of information science.

My comments

Information science is the study of scientific texts. From this point of view, each 'normal' science is a unique paradigm based on a specific model. Such sciences thus exclude model uncertainty, and so empirical sciences must be wrong. Moreover, deductions 'within' a 'normal' science do not take account of the empirical nature of the theory, except in so far as this nature is explicitly modelled. On this basis, information science and 'normal' sciences in general seem to deny their essential residual uncertainty. So while 'normal' science may be 'normal', this may not be such a good thing in practice. (Actually, many scientists do think across paradigms, but it is still concerning that this is not apparent from 'outside'.)

This seems unfortunate. What one would really want is some theory that is not embedded in a paradigm, or at least a way of reasoning about theories that takes account of the uncertainty that is unavoidably inherent in a paradigm. The paper calls this a ‘prescience’, as if going from prescience to normal science were ‘progress’.
