I have had some good conversations recently about epistemology, and today’s post is influenced by them. I am looking at Bayesian epistemology, something that has influenced me a great deal. As the readers of my blog may know, I am a student of Cybernetics. One of the main starting points in Cybernetics is that we are informationally closed. This means that information cannot enter into us from outside. This may be evident to any teachers among my readers. You are not able to open up a student’s brain, pour information in as a commodity, and then seal it back up. What happens instead is that the teacher perturbs the student, and the student in turn generates meaning out of the perturbation. This would also mean that all knowledge is personal, something that was taught by Michael Polanyi.
How we know something is based on what we already know. The obvious question at this juncture is: what about the first knowledge? Ross Ashby, one of the pioneers of Cybernetics, has written that there are two main forms of regulation. One is the gene pattern, something that was developed over generations through the evolutionary process. An example of this is the impulse of a baby to grab or to breastfeed without any training. The second is the ability to learn. The ability to learn amplifies the chance of survival of the organism. In our species, this allows us to literally reach for the celestial bodies.
If one accepts that we are informationally closed, then one has to also accept that we do not have direct access to the external reality. What we have access to is what we make sense of from experiencing the external perturbations. Cybernetics aligns with constructivism, the philosophy that we construct a reality from our experience. Heinz von Foerster, one of my favorite Cyberneticians, postulated that our nervous system as a whole is organized in such a way (organizes itself in such a way) that it computes a stable reality. All we have is what we can perceive through our perception framework. The famous philosopher Immanuel Kant drew this distinction as one between the noumena (the reality that we don’t have direct access to) and the phenomena (the perceived representation of the external reality). We compute a reality based on our interpretive framework. This is just a version of the reality, and the version each one of us computes is unique to us. The stability comes from repeated interactions with the external reality, as well as from interactions with others. We do not exist in isolation from others. The more interactions we have, the more chances we have to “calibrate” our versions of reality against each other.
With this framework, one does not start from ontology, instead one starts from epistemology. Epistemology deals with the theory of knowledge and ontology deals with being (what is out there). What I can talk about is what I know about rather than what is really out there.
Bayesian epistemology is based on induction. Induction is a process of reasoning where one makes a generalization from a series of observations. For example, if all the swans you have seen so far in your life are white swans, then induction would direct you to generalize that all swans are white. Induction assumes uniformity of nature, to quote the famous Scottish philosopher David Hume. This means that you assume that the future will resemble the past. Hume pointed out that induction cannot be logically justified, because no matter how many observations one makes, nothing guarantees that the future will resemble the past. We seek patterns in the world, and we make generalizations from them. Hume pointed out that we do this out of habit. While many people have tried to solve the problem of induction, nobody has really solved it.
All of this discussion lays the background for Bayesian epistemology. I will not go into the math of Bayesian statistics in this post. I will provide a general explanation instead. Bayesian epistemology puts forth that probability is not a characteristic of a phenomenon, but a statement about our epistemology. The probabilities we assign are not for THE reality but for the constructed reality. It is a statement about OUR uncertainty, and not about the uncertainty associated with the phenomenon itself. The Bayesian approach requires that we start with what we know. We start by stating our prior belief, and based on the evidence presented, we modify that belief. The modified belief is termed the “posterior” in Bayesian terms. Today’s posterior becomes tomorrow’s prior because “what we know now” is the posterior.
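Although the math is outside the scope of this post, the prior-to-posterior cycle described above can be sketched in a few lines of Python for readers who like to see it run. This is only an illustration under assumptions of my own choosing: the two hypotheses ("fair" and "biased"), the 75% heads rate for the biased coin, and the function name `update` are all hypothetical and not part of any particular library.

```python
def update(prior, likelihoods):
    """Return the posterior belief over hypotheses after one piece of evidence.

    prior:       dict mapping hypothesis -> current degree of belief (sums to 1)
    likelihoods: dict mapping hypothesis -> probability of the evidence
                 if that hypothesis were true
    """
    unnormalized = {h: prior[h] * likelihoods[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


# Assumed starting point: no preference between the two hypotheses.
belief = {"fair": 0.5, "biased": 0.5}

# Assumed likelihood of seeing heads under each hypothesis.
likelihood_of_heads = {"fair": 0.5, "biased": 0.75}

# Each day we see one more head; today's posterior is tomorrow's prior.
for day in range(1, 4):
    belief = update(belief, likelihood_of_heads)
    print(day, belief)
```

After the first head, the belief in "fair" drops from 0.5 to 0.4 and keeps shifting with each further head, while neither hypothesis is ever ruled out completely.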
Another important thing to keep in mind is that one should not assign 0% or 100% to a belief. Even if you see a coin land heads 10,000 times in a row, you should not be certain that the coin is double headed. This would be jumping into the pit of the problem of induction. We can keep updating our prior based on evidence without ever reaching 100%.
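The coin example can be made concrete with a small sketch. It assumes, purely for illustration, that there are only two possibilities (a double-headed coin or a fair coin) and that we start with a 1 in 1,000 belief that the coin is double headed; both the prior and the function name are hypothetical. Exact fractions are used because ordinary floating point numbers would round the answer to 1.0 and hide the point.

```python
from fractions import Fraction


def posterior_double_headed(prior, heads_in_a_row):
    """Belief that the coin is double headed after seeing only heads.

    Assumes the only alternative is a fair coin.
    """
    prior = Fraction(prior)
    # Probability of the run of heads if the coin is fair.
    fair_likelihood = Fraction(1, 2) ** heads_in_a_row
    # Probability of the run of heads if the coin is double headed is 1.
    return prior / (prior + (1 - prior) * fair_likelihood)


open_minded = posterior_double_headed(Fraction(1, 1000), 10_000)
print(open_minded < 1)  # still short of certainty

# A belief of exactly 0 or 1 can never be moved by any amount of evidence.
print(posterior_double_headed(0, 10_000))
print(posterior_double_headed(1, 10_000))
```

The last two lines show the other side of the same coin: once a belief is set to exactly 0 or exactly 1, no evidence can change it, which is why those two values are avoided. This advice is sometimes called Cromwell's rule.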
I will write more on this topic. I wanted to start off with an introductory post and follow up with additional discussions. I will finish with some appealing points of Bayesian epistemology.
Bayesian epistemology is self-correcting – Bayesian statistics has the tendency to cut down your overconfidence or underconfidence. The new evidence presented over several iterations corrects your over or under reach into confidence.
Bayesian epistemology is observer dependent and context sensitive – As noted above, probability in Bayesian epistemology is a statement of the observer’s belief. The framework is entirely dependent on the observer and the context around sensemaking. You do not remove the observer from the observation. In this regard, the Bayesian framework is hermeneutical. We bring our biases to the equation, and we put our money where our mouth is by assigning a probability value to them.
Circularity – There is an aspect of circularity in the Bayesian framework in that today’s posterior becomes tomorrow’s prior, as noted before.
Second Order Nature – The Bayesian framework requires that you be open to changing your beliefs. It requires you to challenge your assumptions and remain open to correcting your belief system. There is an aspect of error correction in this. You realize that you have cognitive blind spots. Knowing this allows us to better our sensemaking ability. We try to be “less wrong” rather than “more right”.
Conditionality – The Bayesian framework utilizes conditional probability. You see that phenomena or events do not exist in isolation. They are connected to each other and therefore require us to take a holistic viewpoint.
Coherence not Correspondence – The use of priors forces us to use what we know. To use Willard Van Orman Quine’s phrase, we have a “web of belief”. Our priors must make sense with all the other beliefs we already have in place. This supports the coherence theory of truth instead of the realist’s favorite correspondence theory of truth. I welcome the reader to pursue this with this post.
Consistency not completeness – The idea of consistency over completeness is quite fascinating. It arises mainly from the limitation of our nervous system, which cannot hold a true representation of the reality. There is a common belief that we live with uncertainty, but our nervous system strives to provide us a stable version of reality, one that is devoid of uncertainties. We are able to think about this only from a second order standpoint. We are able to ponder our cognitive blind spots because we are able to do second order cybernetics. We are able to think about thinking. We are able to put ourselves into the observed.
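The self-correcting and observer-dependent points above can be illustrated with a short sketch of two observers who start with very different beliefs about a coin and then see the same evidence. The setup is an assumption made for illustration only: each observer's prior is expressed as a number of imagined heads and tails already seen (a Beta prior, in statistical terms), and the names and counts are hypothetical.

```python
def expected_heads_rate(prior_heads, prior_tails, heads, tails):
    """Expected heads rate after combining imagined prior counts with real counts."""
    total = prior_heads + prior_tails + heads + tails
    return (prior_heads + heads) / total


# Assumed priors: the optimist expects about 90% heads, the skeptic about 10%.
optimist_prior = (9, 1)
skeptic_prior = (1, 9)

# Both observers watch the same coin, which lands heads 70% of the time.
for tosses in (0, 10, 100, 1000):
    heads = tosses * 7 // 10
    tails = tosses - heads
    optimist = expected_heads_rate(*optimist_prior, heads, tails)
    skeptic = expected_heads_rate(*skeptic_prior, heads, tails)
    print(tosses, round(optimist, 3), round(skeptic, 3))
```

The two observers begin 0.8 apart, and the gap shrinks as shared evidence accumulates: the overconfident belief is pulled down and the underconfident one is pulled up. Neither observer is removed from the picture, yet repeated interaction with the same evidence brings their constructed realities closer together.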
I will finish with an excellent quote from Albert Einstein:
“As far as the laws of mathematics refer to reality, they are not certain; as far as they are certain, they do not refer to reality”.
Please maintain social distance, wear masks and take vaccination, if able. Stay safe and always keep on learning…
In case you missed it, my last post was Error Correction of Error Correction: