## Disease

“You are suffering from a disease that, according to your manifest symptoms, is either A or B. For a variety of demographic reasons disease A happens to be nineteen times as common as B. The two diseases are equally fatal if untreated, but it is dangerous to combine the respectively appropriate treatments. Your physician orders a certain test which, through the operation of a fairly well understood causal process, always gives a unique diagnosis in such cases, and this diagnosis has been tried out on equal numbers of A- and B-patients and is known to be correct on 80% of those occasions. The tests report that you are suffering from disease B. Should you nevertheless opt for the treatment appropriate to A … ?”

My thoughts below …

.

.

.

.

.

.

.

.

If, following Good, we use

P(A|B:C) to denote the probability of A, conditional on B in the context C,
Odds(A1/A2|B:C) to denote the odds P(A1|B:C)/P(A2|B:C), and
LR(B|A1/A2:C) to denote the likelihood ratio, P(B|A1:C)/P(B|A2:C),

then we want

Odds(A/B | diagnosis of B : you), given
Odds(A/B : population) and
P(diagnosis of B | B : test), and similarly for A.

This looks like a job for Bayes’ rule! In Odds form this is

Odds(A1/A2|B:C) = LR(B|A1/A2:C).Odds(A1/A2:C).

If we ignore the dependence on context, this would yield

Odds(A/B | diagnosis of B ) = LR(diagnosis of B | A/B ).Odds(A/B).
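As a rough check on what the context-free formula gives, here is a minimal sketch that plugs in the puzzle's figures. It assumes, as the notation above suggests, that the 80% accuracy holds separately for A-patients and for B-patients; the function and variable names are mine, chosen for illustration.

```python
from fractions import Fraction


def posterior_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = LR * prior odds."""
    return likelihood_ratio * prior_odds


def odds_to_probability(odds):
    """Normalise odds (given as a single ratio) to a probability."""
    return odds / (1 + odds)


# Figures from the puzzle, taken at face value (context ignored).
prior_odds_a_to_b = Fraction(19, 1)   # A is nineteen times as common as B
test_accuracy = Fraction(80, 100)     # ASSUMPTION: 80% for A- and B-patients alike

# LR(diagnosis of B | A/B) = P(diagnosis of B | A) / P(diagnosis of B | B)
lr_diag_b = (1 - test_accuracy) / test_accuracy

odds_a_given_diag_b = posterior_odds(prior_odds_a_to_b, lr_diag_b)
prob_a_given_diag_b = odds_to_probability(odds_a_given_diag_b)

print(odds_a_given_diag_b)           # 19/4
print(float(prob_a_given_diag_b))    # roughly 0.826
```

On these assumptions the odds still favour A by 19 to 4 after a diagnosis of B, so the context-free answer is to opt for the treatment appropriate to A.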

But are we justified in ignoring the differences? For simplicity, suppose that the tests were conducted on a representative sample of the population, so that we have Odds(A/B | diagnosis of B : population), but still need Odds(A/B | diagnosis of B : you). According to Blackburn’s population indifference principle (PIP) you ‘should’ use the whole population statistics, but his reasons seem doubtful. Suppose that:

• You thought yourself in every way typical of the population as a whole.
• The prevalence of diseases among those you know was consistent with the whole population data.

Then PIP seems more reasonable. But if you are, for example, of a minority ethnicity, with many relatives, neighbours and friends who share your distinguishing characteristic, then it might be more reasonable to use an informal estimate based on a more appropriate population, rather than a better quality estimate based on a less appropriate population. (This is a kind of converse to the availability heuristic.)
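To see how much the choice of reference population could matter, the sketch below repeats the odds-form calculation for several prior odds. Only the 19:1 figure comes from the puzzle; the other priors are hypothetical stand-ins for a 'more appropriate population', and the 80% accuracy is again assumed to apply to A- and B-patients alike.

```python
from fractions import Fraction


def prob_a_given_diag_b(prior_odds_a_to_b, test_accuracy=Fraction(4, 5)):
    """P(A | diagnosis of B) from prior odds A/B, via Bayes' rule in odds form."""
    # LR(diagnosis of B | A/B) = P(diagnosis of B | A) / P(diagnosis of B | B)
    likelihood_ratio = (1 - test_accuracy) / test_accuracy
    posterior = likelihood_ratio * prior_odds_a_to_b
    return posterior / (1 + posterior)


# 19:1 is the puzzle's whole-population figure; 8, 4 and 2 are ILLUSTRATIVE only.
for prior in (Fraction(19), Fraction(8), Fraction(4), Fraction(2)):
    p = prob_a_given_diag_b(prior)
    print(f"prior odds {prior}:1 -> P(A | diagnosis of B) = {float(p):.3f}")
```

Under these assumptions the break-even point is prior odds of 4:1. If A were less than four times as common as B in the population you think relevant, a diagnosis of B would make B the more probable disease.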

See my notes on Cohen for a discussion of alternatives.

Other, similar, Puzzles.

My notes on probability.

Dave Marsay

## Cab accident

“In a certain town blue and green cabs operate in a ratio of 85 to 15, respectively. A witness identifies a cab in a crash as green, and the court is told [based on a test] that in the relevant light conditions he can distinguish blue cabs from green ones in 80% of cases. [What] is the probability (expressed as a percentage) that the cab involved in the accident was blue?” (See my notes on Cohen for a discussion of alternatives.)

For bonus points: if you were involved, what questions might you reasonably ask before estimating the required percentage? Does your first answer imply some assumptions about the answers, and are they reasonable?

My thoughts below:

.

.

.

.

.

.

If, following Good, we use

P(A|B:C) to denote the probability of A, conditional on B in the context C,
Odds(A1/A2|B:C) to denote the odds P(A1|B:C)/P(A2|B:C), and
LR(B|A1/A2:C) to denote the likelihood ratio, P(B|A1:C)/P(B|A2:C).

Then we want P(blue| witness: accident), which can be derived by normalisation from Odds(blue/green| witness : accident).
We have Odds(blue/green: city) and the statement that the witness “can distinguish blue cabs from green ones in 80% of cases”.

Let us suppose (as I think is the intention) that this means that we know LR(witness| blue/green: test) under the test conditions. This looks like a job for Bayes’ rule! In Odds form this is

Odds(A1/A2|B:C) = LR(B|A1/A2:C).Odds(A1/A2:C),

as can be verified from the identity P(A|B:C) = P(A&B:C)/P(B:C) whenever P(B:C)≠0.

If we ignore the contexts, this would yield:

Odds(blue/green| witness) = LR(witness| blue/green).Odds(blue/green),

as required. But this would only be valid if the context made no difference. For example, suppose that:

• Green cabs have many more accidents than blue ones.
• The accident was in an area where green cabs were more common.
•  The witness knew that blue cabs were much more common than green and yet was still confident that it was a green cab.

In each case, one would wish to re-assess the required odds. Would it be reasonable to assume that none of the above applied, if one didn’t ask?
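A minimal sketch of how the answer moves under one of these possibilities. The baseline uses the puzzle's figures and reads the 80% as applying equally to blue and green cabs; the accident-rate multiplier `k` is a hypothetical parameter of mine, not something the puzzle supplies.

```python
from fractions import Fraction


def prob_blue_given_witness_green(prior_odds_blue_to_green,
                                  witness_accuracy=Fraction(4, 5)):
    """P(blue | witness says green), via Bayes' rule in odds form."""
    # LR(witness says green | blue/green)
    likelihood_ratio = (1 - witness_accuracy) / witness_accuracy
    posterior = likelihood_ratio * prior_odds_blue_to_green
    return posterior / (1 + posterior)


city_odds = Fraction(85, 15)  # Odds(blue/green : city), from the puzzle

# Baseline: treat the city-wide ratio as the prior for the accident.
baseline = prob_blue_given_witness_green(city_odds)
print(f"baseline: {float(baseline):.1%}")   # about 58.6%

# HYPOTHETICAL: a green cab is k times as likely as a blue one to be in an
# accident, so the prior odds for the accident are the city odds divided by k.
for k in (1, 2, 5):
    p = prob_blue_given_witness_green(city_odds / k)
    print(f"k = {k}: P(blue | witness says green) = {float(p):.1%}")
```

With `k = 1` this reproduces the context-free answer of about 59%. If green cabs were twice as accident-prone, the same testimony would make green the more probable colour, which is why the unasked questions matter.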