Cab accident
October 21, 2013 6 Comments
“In a certain town blue and green cabs operate in a ratio of 85 to 15, respectively. A witness identifies a cab in a crash as green, and the court is told [based on a test] that in the relevant light conditions he can distinguish blue cabs from green ones in 80% of cases. [What] is the probability (expressed as a percentage) that the cab involved in the accident was blue?” (See my notes on Cohen for a discussion of alternatives.)
For bonus points … if you were involved, what questions might you reasonably ask before estimating the required percentage? Does your first answer imply some assumptions about the answers to those questions, and are they reasonable?
My thoughts below:
.
.
.
.
.
.
If, following Good, we use
P(A|B:C) to denote the probability of A, conditional on B in the context C,
Odds(A1/A2|B:C) to denote the odds P(A1|B:C)/P(A2|B:C), and
LR(B|A1/A2:C) to denote the likelihood ratio, P(B|A1:C)/P(B|A2:C).
Then we want P(blue|witness: accident), which can be derived by normalisation from Odds(blue/green|witness: accident).
We have Odds(blue/green: city) and the statement that the witness “can distinguish blue cabs from green ones in 80% of cases”.
Let us suppose (as I think is the intention) that this means that we know LR(witness|blue/green: test) under the test conditions. This looks like a job for Bayes’ rule! In Odds form this is
Odds(A1/A2|B:C) = LR(B|A1/A2:C).Odds(A1/A2:C),
as can be verified from the identity P(A|B:C) = P(A&B:C)/P(B:C) whenever P(B:C)≠0.
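For anyone who wants the verification spelled out, here is one way to write it. This is only a sketch: it writes a vertical bar for "conditional on" and a colon for "in the context", and it assumes that P(B:C), P(A_1:C), P(A_2:C) and P(A_2 & B:C) are all non-zero.

```latex
\begin{align*}
\mathrm{Odds}(A_1/A_2 \mid B : C)
  &= \frac{P(A_1 \mid B : C)}{P(A_2 \mid B : C)}
   = \frac{P(A_1 \,\&\, B : C)\,/\,P(B : C)}{P(A_2 \,\&\, B : C)\,/\,P(B : C)} \\
  &= \frac{P(B \mid A_1 : C)\,P(A_1 : C)}{P(B \mid A_2 : C)\,P(A_2 : C)}
   = \mathrm{LR}(B \mid A_1/A_2 : C)\cdot \mathrm{Odds}(A_1/A_2 : C).
\end{align*}
```

The P(B:C) terms cancel, which is why the odds form never needs the overall probability of the evidence. Note that the same context C appears in every term.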
If we ignore the contexts, this would yield:
Odds(blue/green|witness) = LR(witness|blue/green).Odds(blue/green),
as required. But this would only be valid if the context made no difference. For example, suppose that:

Green cabs have many more accidents than blue ones.

The accident was in an area where green cabs were more common.

The witness knew that blue cabs were much more common than green and yet was still confident that it was a green cab.
In each case, one would wish to reassess the required odds. Would it be reasonable to assume that none of the above applied, if one didn’t ask?
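To make the dependence on context concrete, here is a minimal sketch of the context-free calculation, followed by a reassessment under the first supposition above. It assumes that "80% of cases" means the witness is right 80% of the time whichever colour the cab is, so that LR(witness says green|blue/green) = 0.2/0.8. The function names and the multiplier k (how many times more accident-prone a green cab is than a blue one) are hypothetical, introduced only for illustration.

```python
from fractions import Fraction


def posterior_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = LR x prior odds."""
    return likelihood_ratio * prior_odds


def odds_to_probability(odds):
    """Normalise odds a/b to the probability a/(a+b)."""
    return odds / (1 + odds)


# Odds(blue/green: city), taken from the puzzle statement.
prior = Fraction(85, 15)

# LR(witness says green | blue/green: test), ASSUMING the witness is
# equally reliable for both colours: P(says green | blue) = 0.2 and
# P(says green | green) = 0.8.
lr = Fraction(20, 80)

post = posterior_odds(prior, lr)    # 17/12
p_blue = odds_to_probability(post)  # 17/29
print(f"ignoring context: P(blue) = {float(p_blue):.1%}")

# Hypothetical context: green cabs have k times the accident rate of
# blue cabs, so Odds(blue/green: accident) = Odds(blue/green: city) / k.
for k in (1, 2, 5):
    adjusted = odds_to_probability(posterior_odds(prior / k, lr))
    print(f"k = {k}: P(blue) = {float(adjusted):.1%}")
```

With k = 1 this reproduces the context-free figure of 17/29, or roughly 59%. A modest difference in accident rates is already enough to push the answer below 50%, which is why the question seems worth asking.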
See Also
Other Puzzles.
My notes on probability.
Ah yes, the classic blue and green taxi cab problem posed by Daniel Kahneman and Amos Tversky…but in response to your question:
“In each case, one would wish to reassess the required odds. Would it be reasonable to assume that none of the above applied, if one didn’t ask?”
Unless I’m misreading you, I’m not sure that the question of the weight of evidence has been dealt with properly. If my memory serves me correctly, one would have to factor in the driving records of the taxi drivers and the eyesight of the parties involved.
If we use Good’s approach, we have to ask if, for the relevant context, C,
Odds(blue/green: C) = Odds(blue/green: city)? (*)
As with the ‘standard’ solution, there is no need for us to employ any common sense or understanding of cabs, accidents, witnesses and cities: the logic drives us to ask the right questions. To answer them, one might consider the issues that you raise.
It occurs to me that blue cabs might be a large firm whereas green cabs might be a small family business. In this case I would not expect them to be statistically homogeneous, although I wouldn’t like to guess which would be worse. Cohen’s ‘disease’ example brings this out more: I’ll post on that soon.
I try to keep the theory required to follow these puzzles as basic as possible. Hopefully it is fairly obvious, even if you haven’t read Good, that P(‘Heads’) depends on some context, such as knowing that it isn’t a double-headed coin. The basic point about weight of evidence is that one has no evidence to support (*), and that assuming (*) in effect assumes that the lack of such evidence does not matter. But it does.
P.S. Did K&T originate the example?
I see. Sorry for misunderstanding you – and just to make things clear, I have yet to read anything by Irving John Good. As for the blue/green taxi cab problem…Kahneman and Tversky did talk about the problem in one of their contributions, but I wouldn’t be surprised if a similar or identical scenario had been created by other people before they did.
Suppose we regard the problem, stated without information about the context mentioned in your bullet points, as having a definite answer in spite of the apparent absence of relevant information. For any contextual feature we can imagine that makes Green cabs more likely to have accidents than Blue cabs, we can equally well imagine the symmetrical contextual feature with the labels “Green” and “Blue” switched. (Okay, okay, I can think of a few that break the color symmetry, but work with me here.) Therefore the answer must be the (standard) Good answer: any other answer would be tantamount to assuming exactly the kind of contextual information that we just don’t have!
This kind of thing is just Jaynes-type reasoning under uncertainty. I think it only works on a Jaynes-style interpretation of the meaning of probability…
Working with you, I agree, especially with your final ‘….’.
Have you ever been in a situation where someone invokes a similar argument (explicit or not) to justify their acting as if the probability were p, when if only they’d asked you could have given them some highly pertinent information, such as that in my bullet points?
(Sorry for the delay in replying – I somehow overlooked your comment.)
No, but I’ve been the guy who invoked the argument and then received information breaking the symmetry.