The Confirmation Paradox:

[Image: an albino raven]

In today’s post I will be looking at the Confirmation Paradox (also known as the Raven Paradox) by Carl Hempel. Let’s suppose that you have never seen a raven in your life. One fine morning you come across a raven and observe that it is black. Now that you have seen one, you start paying more attention and begin seeing ravens everywhere. Each time you see a raven, you observe that it is black. Being the good scientist that you are, you arrive at a hypothesis: all ravens are black. This is called induction: arriving at a generalization from many specific observations.

Now you would like to confirm your hypothesis. You ask your good friend, Carl Hempel, to help. Carl suggests that you start looking at things around his house that are neither black nor ravens, like his red couch or the yellow tennis ball. He suggests that each of those observations supports your hypothesis that all ravens are black. You are rightfully puzzled by this. This is the confirmation paradox. Carl Hempel was a German-born philosopher who later immigrated to America.

Carl Hempel is correct in this claim, at least as a matter of logic. Let’s look at this further. “All ravens are black” can be restated as “Whatever is not black is not a raven”. This is the contrapositive of your hypothesis, and the two statements are logically equivalent. This means that if you observe something that is not black and is not a raven, it supports your hypothesis. Thus, if you observe a red couch, it is not black and it is also not a raven, and therefore it supports your hypothesis that all ravens are black.
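The equivalence can be checked by brute force. The short Python sketch below walks through all four combinations of "is a raven" and "is black" and compares the hypothesis with its contrapositive. The function names are my own, chosen only for illustration.

```python
from itertools import product


def implies(premise: bool, conclusion: bool) -> bool:
    """Material implication: false only when the premise holds and the conclusion fails."""
    return (not premise) or conclusion


def ravens_are_black(is_raven: bool, is_black: bool) -> bool:
    # "If it is a raven, then it is black", applied to one object.
    return implies(is_raven, is_black)


def non_black_are_non_ravens(is_raven: bool, is_black: bool) -> bool:
    # "If it is not black, then it is not a raven", applied to the same object.
    return implies(not is_black, not is_raven)


# Every possible kind of object: (is_raven, is_black).
cases = list(product([True, False], repeat=2))

# The two statements agree on every case, so they are logically equivalent.
equivalent = all(
    ravens_are_black(r, b) == non_black_are_non_ravens(r, b) for r, b in cases
)

# The only kind of object that contradicts the hypothesis.
falsifiers = [(r, b) for r, b in cases if not ravens_are_black(r, b)]

print("equivalent:", equivalent)
print("falsifying cases (is_raven, is_black):", falsifiers)
```

Note that a red couch (not a raven, not black) is consistent with both statements, and that the only object that contradicts either one is a raven that is not black.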

How do we come to terms with this? Surely, it does not make sense that a red couch supports the hypothesis that all ravens are black. The first point to note here is that one can never prove a hypothesis via induction. An inductive conclusion can only be stated with some level of confidence or certainty. This means that the amount of “support” each observation provides depends on how much information is gained from that observation.

I will explain this further with the concept of information from Claude Shannon’s viewpoint. Information is all around us. Wherever you look, you can get information. Claude Shannon quantified information with the bit as its unit, and the average information over all possible outcomes is called entropy. Information can be described as the amount of surprise or the reduction of uncertainty. It is inversely related to the probability of an event: an event with probability p carries log2(1/p) bits. The less probable an event is, the more information it contains. Let’s look at the schematic below:


The black triangle represents all the black ravens in our observable universe. The blue square represents all of the black things in our observable universe. The red circle represents all the things in the observable universe. Thus, the set of black ravens is a subset of all black things, which in turn is a subset of all things. From a probability standpoint, the probability of observing a black raven is much smaller than the probability of observing a black thing, since there are proportionally a lot more black things in existence. Similarly, the probability of observing a non-black thing is much higher still, since there are a lot more non-black things in existence. Thus, from an information standpoint, the information you get from observing a non-black thing that is not a raven is very small. Logically, this observation does provide additional support; however, the information content is minuscule. Please note that, by the same logic, observing a black raven also supports the equivalent statement that all non-black things are non-ravens.
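To put rough numbers on this, here is a minimal Python sketch that computes the information, in bits, of each kind of observation. The shares in `assumed_shares` are invented purely for illustration (nobody knows the real proportions), and `surprisal_bits` is my own helper name.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Shannon information of an event in bits: log2(1 / p)."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return math.log2(1.0 / probability)


# Hypothetical share of each kind of thing among all observable things.
# These values are assumptions made up for this example, not measurements.
assumed_shares = {
    "black raven": 0.000001,
    "black thing that is not a raven": 0.099999,
    "non-black thing that is not a raven": 0.9,
}

bits = {kind: surprisal_bits(share) for kind, share in assumed_shares.items()}

for kind, value in bits.items():
    print(f"{kind}: {value:.2f} bits")
```

With these made-up shares, spotting a black raven is worth roughly 20 bits, while spotting a red couch is worth about 0.15 bits. The exact figures depend entirely on the assumed shares; the point is only the large gap between them.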

When you first saw a black raven, you had no idea that such a thing existed. The information content of that observation was high. As you observed more ravens, the information you got from each observation started diminishing. Even if you made 10,000 observations of black ravens, you could not prove (confirm with 100% certainty) that all ravens are black. This is the curse of induction. This is where Karl Popper comes in. Karl Popper, an Austrian-British philosopher, had the brilliant insight that good hypotheses should be falsifiable. We should try to look for observations that would falsify our hypothesis. His insight was in the asymmetry of falsifiability. You may have 100,000 observations supporting your hypothesis. All you need is a single observation to refute it. The most popular example of this is the case of the black swan. The belief that all swans are white was discredited when black swans were discovered in Australia. To come back to the information analogy, the observation of a white (albino) raven has a lot more information content, enough to break down your hypothesis, since white ravens occur very rarely in nature. Finding a white raven is quite rare and thus carries the most information or surprise.
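One way to see this asymmetry in numbers is a simple Bayesian update, which is my own choice of tool here and not something Popper or Hempel proposed. The sketch below assumes a rival hypothesis in which 1% of ravens are white, and a starting belief of 50% that all ravens are black. Both numbers, and the function name `update`, are assumptions made for illustration.

```python
def update(prior: float, observed_black: bool, white_rate_if_false: float = 0.01) -> float:
    """Bayes update of the belief in 'all ravens are black' after seeing one raven.

    white_rate_if_false is the assumed share of white ravens if the hypothesis is false.
    """
    # If the hypothesis is true, a black raven is certain and a white raven is impossible.
    likelihood_if_true = 1.0 if observed_black else 0.0
    # If the hypothesis is false, most ravens are still black under our assumption.
    likelihood_if_false = (1.0 - white_rate_if_false) if observed_black else white_rate_if_false
    evidence = prior * likelihood_if_true + (1.0 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence


belief = 0.5  # assumed starting belief
for _ in range(500):
    belief = update(belief, observed_black=True)

belief_after_black_ravens = belief
belief_after_white_raven = update(belief, observed_black=False)

print(f"after 500 black ravens: {belief_after_black_ravens:.4f}")
print(f"after one white raven:  {belief_after_white_raven:.4f}")
```

Under these assumptions, 500 black ravens push the belief close to 1 without ever reaching it, and a single white raven drops it to exactly 0.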

This also brings up the concept of Total Evidence. The concept of Total Evidence was put forth by Rudolf Carnap, a German-born philosopher. He stated that in the application of inductive logic to a given knowledge situation, the total evidence available must be taken as the basis for determining the degree of confirmation. Let’s say that as we learned more about ravens and other birds, we came across the concept of albinism in other animals and birds. This should make us challenge our hypothesis, since we know that albinism can occur in nature, and thus it is not far-fetched that it can occur in ravens as well. The concept of Total Evidence is interesting because even though it has the term “Total” in it, it beckons us to realize that we can never have total information. It is a reminder for us to consider all possibilities and to understand where our mental models break down. In theory, one could also make whimsical statements such as “All unicorns are rainbow colored”, and say that the observation of a white shoe supports it based on the confirmation paradox. Total evidence in this case would require us to have made at least one observation of a rainbow-colored unicorn.

I will finish with another paradox that is similar to the confirmation paradox: the 99-foot man paradox by Paul Berent. Up to this point, we have been looking at qualitative data (black versus not black, or raven versus not raven). Let’s say that you have a hypothesis that all men are less than 100 feet tall. You surveyed over 100,000 men and found all of them to be less than 100 feet tall. One day you hear about a new circus company coming to town. Their main attraction is a 99-foot man. You go to see him in person and, sure enough, he is 99 feet tall. Now, your hypothesis is still intact, since the 99-foot man is technically less than 100 feet tall. However, this adds doubt to your mind. You realize that if there is a 99-foot man, then the occurrence of a 100-foot man is not far-fetched. The paradox arises because the observation of a 99-foot man strengthens your hypothesis, but at the same time it also weakens it.

Always keep on learning…

In case you missed it, my last post was Know Your Edges: