# Diagnosis

1% of women at age forty who participate in routine screening have breast cancer.  80% of women with breast cancer will get positive mammographies.  9.6% of women without breast cancer will also get positive mammographies.  A woman in this age group had a positive mammography in a routine screening.  What is the probability that she actually has breast cancer?

It is alleged that:

The correct answer is 7.8%, obtained as follows:  Out of 10,000 women, 100 have breast cancer; 80 of those 100 have positive mammographies.  From the same 10,000 women, 9,900 will not have breast cancer and of those 9,900 women, 950 will also get positive mammographies.  This makes the total number of women with positive mammographies 950+80 or 1,030.  Of those 1,030 women with positive mammographies, 80 will have cancer.  Expressed as a proportion, this is 80/1,030 or 0.07767 or 7.8%.
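As a check on the arithmetic only (not on whether the framing is right), the quoted calculation can be sketched in a few lines of Python. The function name `posterior` and its parameter names are my own labels, not the article's. Working with the unrounded rates gives about 0.0776 rather than 0.07767, because the quoted answer rounds 950.4 false positives down to 950; both round to 7.8%.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Proportion of positive tests that are true positives (Bayes' rule)."""
    true_positives = prior * sensitivity
    false_positives = (1 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Figures taken from the problem statement.
p_exact = posterior(prior=0.01, sensitivity=0.80, false_positive_rate=0.096)

# The quoted answer's head count, with 950.4 rounded to 950.
p_quoted = 80 / (80 + 950)

print(f"unrounded rates: {p_exact:.4f}")
print(f"quoted counts:   {p_quoted:.4f}")
```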

Do you believe it? My notes below:

.

.

.

.

.

.

.

.

.

.

One can take either a subjective or an objective view of probability. From a subjective view I could only guess the proportion of 40-year-old women who had this cancer, so my estimate would be uncertain, unlike my estimate for a similar problem involving urns, where I could calculate it exactly. So it might be ‘8%ish’, but not 7.8%. I might do better to suppose that the test was reasonably diagnostic, so the answer can’t be much less than 10%, or else why bother? It certainly isn’t clear to me that I ‘should’ always start by trying to guess the relevant prior.

The article is written from an objectivist viewpoint. In this case we can replace probabilities by proportions, which are ratios of numbers satisfying various criteria. Even for an epidemic where the proportions might be changing rapidly, we suppose that the probabilities exist in principle. But there is still a problem. Suppose that the risk of cancer depends not only on age but also on family history. Then the woman being tested has some family history and hence (following the logic of the article) has a different prior and hence a different final probability. Hence her probability could well be significantly more than 7.8%. The probability that the article calculates is not her objective probability, but a formal probability, derived for some abstract woman of her age, ignoring all else. In this case, it is the probability appropriate to a doctor with no access to patient records who asks no questions. Some people think it appropriate to the woman, based on the principle that everyone should always assume themselves to be average. This may seem very reasonable, but is it so reasonable to ignore family history?
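To get a feel for how much the answer depends on the prior, here is a rough sketch that reruns the same calculation for several priors. The priors other than 1% are made up for illustration and do not come from the article or from any data on family history. The sketch also assumes that the 80% and 9.6% test rates apply unchanged to every group, which is itself open to question.

```python
def posterior(prior, sensitivity=0.80, false_positive_rate=0.096):
    """Proportion of positive tests that are true positives (Bayes' rule)."""
    true_positives = prior * sensitivity
    false_positives = (1 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical priors: only 0.01 is the article's figure.
hypothetical_priors = (0.005, 0.01, 0.02, 0.05)

results = {prior: posterior(prior) for prior in hypothetical_priors}
for prior, post in results.items():
    print(f"prior {prior:.1%} -> posterior {post:.1%}")
```

Under these illustrative priors the final probability runs from roughly 4% to roughly 30%, so the choice of reference group matters far more than the third significant figure of 7.8%.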

At the very least, the problem raises issues that need to be addressed before one can be so sure that one has reduced the lack of certainty to a relatively simple numeric probability.

Similar puzzles. My notes on communicating uncertainty.

Dave Marsay