The Cybernetics of Bayesian Epistemology:

I have had some good conversations recently about epistemology, and today’s post is influenced by them. In it, I am looking at Bayesian epistemology, something that I am very influenced by. As the readers of my blog may know, I am a student of Cybernetics. One of the main starting points in Cybernetics is that we are informationally closed. This means that information cannot enter into us from outside. This may be evident to any teachers among my readers. You are not able to open up a student’s brain, pour information in as a commodity and then seal it back up. What happens instead is that the teacher perturbs the student and the student in turn generates meaning out of the perturbation. This would also mean that all knowledge is personal. This is something that was taught by Michael Polanyi.

How we know something is based on what we already know. The obvious question at this juncture is: what about the first knowledge? Ross Ashby, one of the pioneers of Cybernetics, has written that there are two main forms of regulation. One is the gene pattern, something that was developed over generations through the evolutionary process. An example of this is the impulse of a baby to grab or to breastfeed without any training. The second is the ability to learn. The ability to learn amplifies the chance of survival of the organism. In our species, this allows us to literally reach for the celestial bodies.

If one accepts that we are informationally closed, then one has to also accept that we do not have direct access to the external reality. What we have access to is what we make sense of from experiencing the external perturbations. Cybernetics aligns with constructivism, the philosophy that we construct a reality from our experience. Heinz von Foerster, one of my favorite Cyberneticians, postulated that our nervous system as a whole is organized in such a way (organizes itself in such a way) that it computes a stable reality. All we have is what we can perceive through our perception framework. The famous philosopher Immanuel Kant distinguished between the noumena (the reality that we don’t have direct access to) and the phenomena (the perceived representation of the external reality). We compute a reality based on our interpretive framework. This is just a version of the reality, and the version each one of us computes is unique to us. The stability comes from repeated interactions with the external reality, as well as from interactions with others. We do not exist in isolation from others. The more interactions we have, the more chances we have to “calibrate” our versions against each other.

With this framework, one does not start from ontology; instead, one starts from epistemology. Epistemology deals with the theory of knowledge and ontology deals with being (what is out there). What I can talk about is what I know about rather than what is really out there.

Bayesian epistemology is based on induction. Induction is a process of reasoning where one makes a generalization from a series of observations. For example, if all the swans you have seen so far in your life are white swans, then induction would direct you to generalize that all swans are white. Induction assumes uniformity of nature, to quote the famous Scottish philosopher David Hume. This means that you assume that the future will resemble the past. Hume pointed out that induction is faulty because no matter how many observations one makes, one cannot assume that the future will resemble the past. We seek patterns in the world, and we make generalizations from them. Hume pointed out that we do this out of habit. While many people have tried to solve the problem of induction, nobody has really solved it.

All of this discussion lays the background for Bayesian epistemology. I will not go into the math of Bayesian statistics in this post. I will provide a general explanation instead. Bayesian epistemology puts forth that probability is not a characteristic of a phenomenon, but a statement about our epistemology. The probabilities we assign are not for THE reality but for the constructed reality. It is a statement about OUR uncertainty, and not about the uncertainty associated with the phenomenon itself. The Bayesian approach requires that we start with what we know. We start by stating our prior belief, and based on the evidence presented, we modify that belief. The modified belief is termed the “posterior” in Bayesian terms. Today’s posterior becomes tomorrow’s prior because “what we know now” is the posterior.

Another important thing to keep in mind is that one should not assign 0% or 100% to a belief. Even if you see a coin land heads 10,000 times in a row, you should not conclude with certainty that the coin is double headed. This would be jumping into the pit of the problem of induction. We can keep updating our prior based on evidence without ever reaching 100%.
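The coin example can be made concrete with a small sketch. It assumes only two hypotheses (a fair coin and a double-headed coin) and a prior of 1 in 1,000 for the double-headed coin. Both the function name and the prior are illustrative assumptions, not something established above. Exact fractions are used so that rounding cannot hide the point: the belief gets extremely close to 1 but never reaches it.

```python
from fractions import Fraction

def update_on_heads(prior_double_headed: Fraction) -> Fraction:
    """One Bayesian update after observing a single head.

    Assumes two hypotheses only (for illustration):
    double-headed coin -> P(heads) = 1, fair coin -> P(heads) = 1/2.
    """
    likelihood_double = Fraction(1)
    likelihood_fair = Fraction(1, 2)
    evidence = (prior_double_headed * likelihood_double
                + (1 - prior_double_headed) * likelihood_fair)
    return prior_double_headed * likelihood_double / evidence

# Assumed prior: 1 in 1,000 coins is double-headed.
belief = Fraction(1, 1000)
for _ in range(10_000):
    # Today's posterior becomes tomorrow's prior.
    belief = update_on_heads(belief)

print("belief reached certainty:", belief == 1)
print("belief is still below 1: ", belief < 1)
```

Had the prior been set to exactly 0 or 1, no amount of evidence could have moved it, which is another reason to avoid those two values.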

I will write more on this topic. I wanted to start off with an introductory post and follow up with additional discussions. I will finish with some appealing points of Bayesian epistemology.

Bayesian epistemology is self-correcting – Bayesian statistics has the tendency to cut down your overconfidence or underconfidence. New evidence presented over several iterations corrects for overreaching or underreaching in your confidence.

Bayesian epistemology is observer dependent and context sensitive – As noted above, probability in Bayesian epistemology is a statement of the observer’s belief. The framework is entirely dependent on the observer and the context around sensemaking. You do not remove the observer from the observation. In this regard, the Bayesian framework is hermeneutical. We bring our biases to the equation, and we put our money where our mouth is by assigning a probability value to it.

Circularity – There is an aspect of circularity in the Bayesian framework in that today’s posterior becomes tomorrow’s prior, as noted before.

Second Order Nature – The Bayesian framework requires that you be open to changing your beliefs. It requires you to challenge your assumptions and remain open to correcting your belief system. There is an aspect of error correction in this. You realize that you have cognitive blind spots. Knowing this allows us to better our sensemaking ability. We try to be “less wrong” rather than “more right”.

Conditionality – The Bayesian framework utilizes conditional probability. You see that phenomena or events do not exist in isolation. They are connected to each other and therefore require us to look at the holistic viewpoint.

Coherence not Correspondence – The use of priors forces us to use what we know. To use Willard Van Orman Quine’s phrase, we have a “web of belief”. Our priors must make sense with all the other beliefs we already have in place. This supports the coherence theory of truth instead of the realist’s favorite correspondence theory of truth. I welcome the reader to pursue this with this post.

Consistency not completeness – The idea of a consistency over completeness is quite fascinating. This is mainly due to the limitation of our nervous system to have a true representation of the reality. There is a common belief that we live with uncertainty, but our nervous system strives to provide us a stable version of reality, one that is devoid of uncertainties. This is a fascinating idea. We are able to think about this only from a second order standpoint. We are able to ponder about our cognitive blind spots because we are able to do second order cybernetics. We are able to think about thinking. We are able to put ourselves into the observed.
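The self-correcting point in the list above can be illustrated with a minimal sketch. It assumes two hypothetical observers who hold opposite Beta priors about the same coin and who then see the same 500 flips (250 heads, 250 tails). The Beta prior, the observer labels and the numbers are all assumptions chosen for illustration.

```python
from fractions import Fraction

def posterior_mean(prior_heads: int, prior_tails: int,
                   heads: int, tails: int) -> Fraction:
    """Mean belief in heads for a Beta(prior_heads, prior_tails) prior
    after observing the given numbers of heads and tails."""
    return Fraction(prior_heads + heads,
                    prior_heads + prior_tails + heads + tails)

# Two hypothetical observers of the same coin.
leans_heads = (20, 2)   # starts out strongly expecting heads
leans_tails = (2, 20)   # starts out strongly expecting tails

gap_before = posterior_mean(*leans_heads, 0, 0) - posterior_mean(*leans_tails, 0, 0)
gap_after = posterior_mean(*leans_heads, 250, 250) - posterior_mean(*leans_tails, 250, 250)

print("disagreement before evidence:", float(gap_before))
print("disagreement after evidence: ", float(gap_after))
```

Neither observer gives up their prior, yet shared evidence pulls both beliefs toward each other, which is the sense in which interactions let us “calibrate” against each other.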

I will finish with an excellent quote from Albert Einstein:

“As far as the laws of mathematics refer to reality, they are not certain; as far as they are certain, they do not refer to reality”.

Please maintain social distance, wear masks and take vaccination, if able. Stay safe and always keep on learning…

In case you missed it, my last post was Error Correction of Error Correction:

Error Correction of Error Correction:

If I were asked to explain cybernetics, the first thing that would come to my mind would be – error correction. The example that is often used to explain cybernetics is that of the steersman. You have a steersman on a boat moving from point A to point B. Ideally, the boat should move from point A to B in a straight line. However, the wind can change the direction of the boat, and the steersman has to adjust accordingly to stay on course. This negative feedback loop requires a target such that the difference from the target is compensated. In technical terms, there is a comparator (something that can measure) that checks on a periodic or continuous basis what the difference is, and provides this information to make adjustments accordingly. Let’s call this framework first order cybernetics. In this framework, we need a closed loop so that we have feedback. This allows for information to be fed back so that we can compare it against a goal and make adjustments accordingly. This approach was made famous by one of the main pioneers of Cybernetics, Norbert Wiener. He used this for guided missile technology where the missile could change its course as needed, similar to the steersman on the boat. First order cybernetics obviously is quite useful. But it is based on the assumption that there is a target that we can all agree upon. This also assumes that the comparator is able to work effectively and efficiently.
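The steersman loop can be sketched in a few lines. This is only a toy model under stated assumptions: a steady wind pushes the heading by a fixed amount at every step, and the steersman corrects a fixed fraction (the gain) of the measured error. The function name, the gain and the wind values are hypothetical and are not taken from Wiener’s work.

```python
def sail(target: float, steps: int, gain: float, wind_push: float) -> list[float]:
    """Simulate a heading (in degrees) disturbed by a steady wind.

    gain = 0 means no feedback; a gain between 0 and 1 corrects that
    fraction of the measured error at every step.
    """
    heading = target
    history = []
    for _ in range(steps):
        heading += wind_push        # disturbance from the environment
        error = target - heading    # comparator: difference from the goal
        heading += gain * error     # correction that opposes the error
        history.append(heading)
    return history

corrected = sail(target=90.0, steps=20, gain=0.5, wind_push=2.0)
drifting = sail(target=90.0, steps=20, gain=0.0, wind_push=2.0)

print(f"with feedback:    off course by {abs(corrected[-1] - 90.0):.2f} degrees")
print(f"without feedback: off course by {abs(drifting[-1] - 90.0):.2f} degrees")
```

With these numbers the corrected boat stays within about two degrees of its target, while the uncorrected one keeps drifting. Note that the loop only works because the target and the comparator are taken for granted, which is exactly the assumption that second order cybernetics questions.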

With this background, I would now like to look at second order cybernetics. One of the main pioneers of second order cybernetics was Heinz von Foerster. He wanted to go beyond the idea of just error correction. He wanted to look at error correction of error correction. As I noted earlier, the error correction mechanism assumes that the target is clear and available, and also that the comparator and the correcting mechanism are working appropriately. Von Foerster challenged the notion of an objective reality and introduced the notion of the observer being part of what is observed. The general tendency is to keep the observer out of what is being observed with the underlying belief that the observation is readily available for all those who are interested. Von Foerster pushed back on this idea and said that the observer is included in the observation. One of my favorite aphorisms from von Foerster is – only when you realize you are blind, can you see. We all have cognitive blind spots. Realizing this and being aware of it allows us to improve how we look at things. There is a circularity that we have to respect and understand better here. What we see impacts what we understand, and what we understand impacts what we see. It is an ongoing self-correcting cycle. If first order error correction is the correction of a specific problem, then second order error correction is the error correction of the error correction.

There is a great example that von Foerster gives that might explain this idea better. He talked about the Turing test. The Turing test, or the Imitation Game as it was originally called by the great Alan Turing, is a test given to an “intelligent machine” to see if its intelligence is comparable to or indistinguishable from that of a human. Von Foerster turned this on its head by bringing up the second order implications. He noted:

The way I see it, the potential intelligence of a machine is not being tested. In actual fact, the scholars are testing themselves (when they give the Turing test). Yes, they are testing themselves to determine whether or not they can tell a human being from a machine. And if they don’t manage to do this, they will have failed. The way I see it, the examiners are examining themselves, not the entity that is meekly sitting behind the curtain and providing answers for their questions. As I said, “Tests test tests.”

One of the main implications from this is that the observer is responsible for their own construction of what they are observing. We are all informationally closed entities that construct our version of a stable paradigm that we call a reality (not THE reality). And we are responsible for our construction, and we are ethically bound to allow others to construct their versions. We come to an eigenvalue for this “reality” when we continue to interact with each other. The more we stay away from each other in our own echo chambers, the harder it becomes to reconcile the different realities. The term “informationally closed” means that information does not enter us from the outside. We generate meaning from how we are perturbed, based on the affordances of the environment we are interacting with. The main criticism of this approach is that it leads to relativism, the notion that every viewpoint matters. I reject this notion and affirmatively state that we should support pluralism. By saying that we do not have access to an objective reality, I am saying that we need epistemic humility. We need to realize that we do not have the Truth; that there is no Truth out there. As the wonderful Systems Thinker, Charles West Churchman said, “The systems approach begins when first you see the world through the eyes of another.” We should beware of those who claim that they have access to the Truth.

When we understand the second order implications, we realize that although the map is not the territory, the map is all we have. Thus, we have to keep working on getting better at making maps. We have to work on error correction of our error corrections. I will finish with some wise words from von Foerster:

The consciousness of consciousness is self-consciousness. The understanding of understanding is self-understanding. And the organization of organization is self-organization. I propose that whenever this self crops up we emphasize this moment of circularity. The result is this: The self does not appear as something static or firm but instead becomes fluid and is constantly being produced. It starts moving. I would plead that we also maintain the dynamics of this word when we speak of self-organization. The way I see it, the self changes every moment, each and every second.

In case you missed it, my last post was The Open Concept of Systems:

The Open Concept of Systems:

In today’s post, I am looking at the famous American philosopher Morris Weitz’s Closed and Open Concepts. Weitz studied aesthetics, the branch of philosophy interested in beauty and taste. He looked at the simple or not so simple question of “how do you define art?” This might seem to be a simple question at first. As we try to answer this, we will soon find that this is not so easy to answer. This might remind you of Socrates and the Socratic method of asking questions. Socrates would ask questions such as what is virtue? For any answer he got, he would find a contradiction that would push the other person further and further into a corner. Weitz came out against this approach and said that the question “what is art?” is itself the wrong question. Instead, he said that you should ask “what sort of concept is art?” The general tendency amongst theorists is to use strict definitions about the essence of something. Weitz called this approach a “closed concept”. Weitz said:

If necessary and sufficient conditions for the application of a concept can be stated, the concept is a closed one. But this can happen only in logic or mathematics where concepts are constructed and completely defined. It cannot occur with empirically-descriptive and normative concepts unless we arbitrarily close them by stipulating the ranges of their uses.

In this fashion, Weitz noted that – Art, as the logic of the concept shows, has no set of necessary and sufficient properties, hence a theory of it is logically impossible and not merely factually difficult.

To contrast the closed concept with the open concept, Weitz stated:

A concept is open if its conditions of application are emendable and corrigible; i.e., if a situation or case can be imagined or secured which would call for some sort of decision on our part to extend the use of the concept to cover this, or to close the concept and invent a new one to deal with the new case and its new property.

Weitz had strong words against the theorists of Aesthetics wanting to confine the subject into a box:

Aesthetic theory is a logically vain attempt to define what cannot be defined, to state the necessary and sufficient properties of that which has no necessary and sufficient properties, to conceive the concept of art as closed when its very use reveals and demands its openness.

Weitz was a fan of Wittgenstein and seems to have been influenced by his question of what a game is. In his posthumously published book, Philosophical Investigations, Wittgenstein talked about how a concept such as a game can be defined. There are so many different games, and yet you would be able to identify a game when you engage in it. They all have similarities, but it is very hard to properly define a game in a closed concept sense. You know that Chess and Soccer (Football) are games, but they are also very different. Similarly, skating and polo are games, again of a very different nature. They have family resemblances! Wittgenstein’s main point is that the meaning of a word is in its use. Weitz noted:

In his new work, Philosophical Investigations, Wittgenstein raises as an illustrative question, What is a game? The traditional philosophical, theoretical answer would be in terms of some exhaustive set of properties common to all games. To this Wittgenstein says, let us consider what we call “games”: “I mean board-games, card-games, ball-games, Olympic games, and so on. What is common to them all?—Don’t say: ‘there must be something common, or they would not be called “games”’ but look and see whether there is anything common to all.—For if you look at them you will not see something that is common to all, but similarities, relationships, and a whole series of them at that. …” Card games are like board games in some respects but not in others. Not all games are amusing, nor is there always winning or losing or competition. Some games resemble others in some respects—that is all. What we find are no necessary and sufficient properties, only “a complicated network of similarities overlapping and crisscrossing,” such that we can say of games that they form a family with family resemblances and no common trait. If one asks what a game is, we pick out sample games, describe these, and add, “This and similar things are called ‘games.’” This is all we need to say and indeed all any of us knows about games. Knowing what a game is, is not knowing some real definition or theory but being able to recognize and explain games and to decide which among imaginary and new examples would or would not be called “games.”

In other words, a “game” is an open concept. How you define a game is specifically up to how you, as the observer, view the actual functioning of the concept. Weitz does note that it is possible to “close” an “open” concept in certain cases. The example he gives is that of “tragedy” and “Greek tragedy”. Tragedy is an open concept, whereas Greek tragedy is a closed concept. He notes:

Of course, there are legitimate and serviceable closed concepts in art. But these are always those whose boundaries of conditions have been drawn for a special purpose. Consider the difference, for example, between “tragedy” and “Greek tragedy.” The first is open and must remain so to allow for the possibility of new conditions, e.g., a play in which the hero is not noble or fallen or in which there is no hero but other elements that are like those of plays we already call “tragedy.” The second is closed. The plays it can be applied to, the conditions under which it can be correctly used are all in, once the boundary, “Greek,” is drawn. Here the critic can work out a theory or real definition in which he lists the common properties at least of the extant Greek tragedies.

I was fascinated with the idea of open and closed concepts. I think this has use in Systems Thinking. Often, systems are depicted as real entities in the world that one can change or fix. This is, to me, the use of a closed concept in systems thinking. Systems, similar to art, should be viewed as an open concept. A system is entirely dependent upon who does the observation. If we have three observers, then there are at least three systems of the same phenomenon. To paraphrase Dominik Jarczewski, the question of whether something is a system is not a factual problem. It is a decision problem. How you define your system is entirely contingent upon your worldview, your biases and your experiential realities. The knowledge of what is a system is not theoretical but practical. You can replace the word “art” in the previous section with “system”, and there will be no meaning lost.

Peter Checkland, the eminent Systems Thinker provides more light on this. He noted that there will be an observer who gives an account of the world, or part of it, in systems terms; the principle which makes them coherent entities; the means and mechanism by which they tend to maintain their integrity; their boundaries, inputs, outputs, and components; their structure. Finally their behavior may be described in terms of inputs and outputs via state descriptions.

If you are trying to understand a system, you must not view it as a closed concept. You must view it as an open concept, and this means that you have to try to understand where the other person is coming from, and how it is constructed by that person. In other words, how does the functioning of the coherent whole affect that person. It is easy to fall into the mindset that systems can be viewed as closed concepts, where the purpose, the whole, etc. are definable and understandable by everybody. You might be tempted to say that the whole is more important than the parts, as if your whole is accepted by everybody. You might think that holism is the way to do systems thinking, and that reductionism is a terrible idea. When you embrace systems as an open concept, you realize that holism can be as bad as reductionism and reductionism can be as good as holism. All you have are abstractions. Even the holism you look at, is a form of reductionism.

I will finish with some more food for thought from Weitz, which suggests that systems thinking is a meta-discipline (replacing “art” with “system”):

If I may paraphrase Wittgenstein, we must not ask, What is the nature of any system x?, or even, according to the semanticist, What does “x” mean?, a transformation that leads to the disastrous interpretation of “system” as a name for some specifiable class of objects; but rather, What is the use or employment of “x”? What does “x” do in the language? This, I take it, is the initial question, the begin-all if not the end-all of any philosophical problem and solution. Thus, … our first problem is the elucidation of the actual employment of the concept of a system, to give a logical description of the actual functioning of the concept, including a description of the conditions under which we correctly use it or its correlates.

In case you missed it, my last post was Direct and Indirect Constraints:

[1] Art by Annie Jose

Direct and Indirect Constraints:

In today’s post, I am following on the theme of Lila Gatlin’s work on constraints and tying it up with cybernetics. Please refer to my previous posts here and here for additional background. As I discussed in the last post, Lila Gatlin used the analogy of language to explain the emergence of complexity in evolution. She postulated that less complex organisms such as invertebrates focused on D1 constraints to ensure that the genetic material is passed on accurately over generations, while vertebrates maintained a constant level of D1 constraints and utilized D2 constraints to introduce novelty, leading to complexification of the species. Gatlin noted that this is similar to Shannon’s second theorem, which points out that if a message is encoded properly, then it can be sent over a noisy medium in a reliable manner. As Jeremy Campbell notes:

In Shannon’s theory, the essence of successful communication is that the message must be properly encoded before it is sent, so that it arrives at its destination just as it left the transmitter, intact and free from errors caused by the randomizing effects of noise. This means that a certain amount of redundancy must be built into the message at the source… In Gatlin’s new kind of natural selection, “second-theorem selection,” fitness is defined in terms very different and abstract than in classical theory of evolution. Fitness here is not a matter of strong bodies and prolific reproduction, but of genetic information coded according to Shannon’s principles.

The codes that made possible the so-called higher organisms, Gatlin suggests, were redundant enough to ensure transmission along the channel from DNA to protein without error, yet at the same time they possessed an entropy, in Shannon’s sense of “amount of potential information,” high enough to generate a large variety of possible messages.

Gatlin’s view was that complexity arose from the ability to introduce more variety while at the same time maintaining accuracy in an optimal mix, similar to human language, where new ideas constantly emerge while the main grammar, syntax etc. are maintained. As Campbell continues:

In the course of evolution, certain living organisms acquired DNA messages which were coded in this optimum way, giving them a highly successful balance between variety and accuracy, a property also displayed by human languages. These winning creatures were the vertebrates, immensely innovative and versatile forms of life, whose arrival led to a speeding-up of evolution.

As Campbell puts it, vertebrates were agents of novelty. They were able to revolutionize their anatomy and body chemistry. They were able to evolve more rapidly and adapt to their surroundings. The first known vertebrates were bottom-dwelling fish that lived over 350 million years ago. They had a heavy external skeleton that anchored them to the floor of the water-body. They evolved such that some of the spiny parts of the skeleton grew into fins. They also developed a skull with openings for sense organs such as eyes, nose, ears etc. Later on, some of them developed limbs from the bony supports of fins, leading to the rise of amphibians.

What kind of error-correcting redundancy did the DNA of these evolutionary prize winners, the vertebrates, possess? It had to give them the freedom to be creative, to become something markedly different, for their emergence was made possible not merely by changes in the shape of a common skeleton, but rather by developing whole new parts and organs of the body. Yet this redundancy also had to provide them with the constraints needed to keep their genetic messages undistorted.

Gatlin defined the first type of redundancy, one that allows deviation from equiprobability, as ‘D1 constraint’. This is also referred to as ‘governing constraint’. The second type of redundancy, one that allows deviation from independence, was termed by Gatlin as ‘D2 constraint’, and this is also referred to as ‘enabling constraint’. Gatlin’s speculation was that vertebrates were able to use both D1 and D2 constraints to increase their complexification, ultimately leading to a highly cognitive being such as our species, Homo sapiens.
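To make the two kinds of deviation a little more tangible, here is a rough sketch that estimates both from a string of symbols. It assumes the formulation D1 = log2(alphabet size) − H1 and D2 = H1 − H(next symbol | current symbol), with all entropies in bits. Please treat that formulation, the function names and the toy sequences as my assumptions for illustration rather than as Gatlin’s exact procedure.

```python
from collections import Counter
from math import log2

def entropy(counts: Counter) -> float:
    """Shannon entropy in bits of the frequencies in a Counter."""
    total = sum(counts.values())
    return sum(-(n / total) * log2(n / total) for n in counts.values())

def divergences(sequence: str, alphabet: str) -> tuple[float, float]:
    """Return (D1, D2) in bits, estimated from symbol and pair counts."""
    h_max = log2(len(alphabet))        # entropy if all symbols were equiprobable
    h1 = entropy(Counter(sequence))    # entropy of the observed symbol frequencies
    # Conditional entropy of the next symbol given the current one.
    pairs = Counter(zip(sequence, sequence[1:]))
    firsts = Counter(sequence[:-1])
    n_pairs = sum(pairs.values())
    h_cond = 0.0
    for (current, _nxt), n in pairs.items():
        h_cond -= (n / n_pairs) * log2(n / firsts[current])
    d1 = h_max - h1    # deviation from equiprobability
    d2 = h1 - h_cond   # deviation from independence
    return d1, d2

print(divergences("AB" * 50, alphabet="AB"))   # symbols balanced, order fully constrained
print(divergences("A" * 100, alphabet="AB"))   # one symbol dominates completely
```

In the alternating sequence, the symbols are used equally often, so all of the constraint sits in the ordering (D2). In the constant sequence, all of the constraint sits in the unequal frequencies (D1). With short real sequences the D2 estimate can come out slightly negative because the counts are only approximations.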

One of the pioneers in Cybernetics, Ross Ashby, also looked at a similar question. He was looking at the biological learning mechanisms of “advanced” organisms. Ashby identified that for less complex organisms, the main source of regulation is their gene pattern. For Ashby, regulation is linked to their viability or survival. He noted that the less complex organisms can rely just on their gene pattern to continue to survive in their environment. Ashby noted that they are adapted because their conditions have been constant over many generations. In other words, a less complex organism such as a hunting wasp can hunt and survive simply based on its genetic information. It does not need to learn in order to adapt; it is already adapted with what it has. Ashby referred to this as direct regulation. With direct regulation, there is a limit to the adaptation. If the regularities of the environment change, the hunting wasp will not be able to survive. It relies on the regularities of the environment for its survival. Ashby contrasted this with indirect regulation. With indirect regulation, one is able to amplify adaptation. Indirect regulation is the learning mechanism that allows the organism to adapt. A great example for this is a kitten. As Ashby notes:

This (indirect regulation) is the learning mechanism. Its peculiarity is that the gene-pattern delegates part of its control over the organism to the environment. Thus, it does not specify in detail how a kitten shall catch a mouse, but provides a learning mechanism and a tendency to play, so that it is the mouse which teaches the kitten the finer points of how to catch mice.

The learning mechanism in its gene pattern does not directly teach the kitten to hunt mice. However, chasing mice and interacting with them trains the kitten to catch them. As Ashby notes, the gene pattern is supplemented by the information supplied by the environment. Part of the regulation is delegated to the environment.

In the same way the gene-pattern, when it determines the growth of a learning animal, expends part of its resources in forming a brain that is adapted not only by details in the gene-pattern but also by details in the environment. The environment acts as the dictionary. While the hunting wasp, as it attacks its prey, is guided in detail by its genetic inheritance, the kitten is taught how to catch mice by the mice themselves. Thus, in the learning organism the information that comes to it by the gene-pattern is much supplemented by information supplied by the environment; so, the total adaptation possible, after learning, can exceed the quantity transmitted directly through the gene-pattern.

Ashby further notes:

As a channel of communication, it has a definite, finite capacity, Q say. If this capacity is used directly, then, by the law of requisite variety, the amount of regulation that the organism can use as defense against the environment cannot exceed Q.  To this limit, the non-learning organisms must conform. If, however, the regulation is done indirectly, then the quantity Q, used appropriately, may enable the organism to achieve, against its environment, an amount of regulation much greater than Q. Thus, the learning organisms are no longer restricted by the limit.

As I look at Ashby’s ideas, I cannot help but see similarities between the D1/D2 constraints and Direct/Indirect regulation respectively. Indirect regulation, similar to enabling constraints, helps the organism adapt to its environment by connecting things together. Indirect regulation has a second order nature to it, such as learning how to learn. It works on being open to possibilities when interacting with the environment. It brings novelty into the situation. Similar to governing constraints, direct regulation focuses only on the accuracy of the ‘message’. Nothing additional, nor any form of amplification, is possible. Direct regulation is hardwired, whereas indirect regulation is enabling. Direct regulation is context-free, whereas indirect regulation is context-sensitive. What the hunting wasp does is entirely reliant on its gene pattern, no matter the situation, whereas what a kitten does is entirely dependent on the context of the situation.

Final Words:

Cybernetics can be looked at as the study of possibilities, especially why out of all the possibilities only certain outcomes occur. There are strong undercurrents of information theory in Cybernetics. For example, in information theory entropy is a measure of how many messages might have been sent, but were not. In other words, if there are a lot of possible messages available, and only one message is selected, then it eliminates a lot of uncertainty. Therefore, this represents a high-information scenario. Indirect regulation allows us to look at the different possibilities and adapt as needed. Additionally, indirect regulation allows retaining the successes and failures and the lessons learned from them.
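To make the information-theory point concrete, here is a minimal Python sketch. It assumes the simplest case, where every possible message is equally likely, so that selecting one message out of N removes log2(N) bits of uncertainty. The function name is my own and is only for illustration.

```python
import math

def bits_removed_by_selection(possible_messages: int) -> float:
    """Uncertainty (in bits) eliminated when one of N equally likely messages is selected."""
    if possible_messages < 1:
        raise ValueError("there must be at least one possible message")
    return math.log2(possible_messages)

# The more messages that might have been sent but were not, the more information.
for n in (1, 2, 8, 1024):
    print(n, "possible messages ->", bits_removed_by_selection(n), "bits")
```

With a single possible message nothing is learned, while every doubling of the possibilities adds one more bit.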

I will finish with a great lesson from Ashby to explain the idea of the indirect regulation:

If a child wanted to discover the meanings of English words, and his father had only ten minutes available for instruction, the father would have two possible modes of action. One is to use the ten minutes in telling the child the meanings of as many words as can be described in that time. Clearly there is a limit to the number of words that can be so explained. This is the direct method. The indirect method is for the father to spend the ten minutes showing the child how to use a dictionary. At the end of the ten minutes the child is, in one sense, no better off; for not a single word has been added to his vocabulary. Nevertheless, the second method has a fundamental advantage; for in the future the number of words that the child can understand is no longer bounded by the limit imposed by the ten minutes. The reason is that if the information about meanings has to come through the father directly, it is limited to ten-minutes’ worth; in the indirect method the information comes partly through the father and partly through another channel (the dictionary) that the father’s ten-minute act has made available.

Please maintain social distance, wear masks and take vaccination, if able. Stay safe and always keep on learning…

In case you missed it, my last post was D1 and D2 Constraints:

D1 and D2 Constraints:

In today’s post, I am following up from my last post and looking further at the idea of constraints as proposed by Dr. Lila Gatlin. Gatlin was an American biophysicist who used information theory to propose an information-processing aspect of life. In information theory, the ‘constraints’ are the ‘redundancies’ utilized for the transmission of the message. Gatlin’s use of this idea from an evolutionary standpoint is quite remarkable. I will explain the idea of redundancies in language using an example I have used before here. This is the famous idea that if a monkey had infinite time on its hands and a typewriter, it would at some point type out the entire works of Shakespeare, just by randomly striking the typewriter keys. It is obviously highly unlikely that a monkey could actually do this. In fact, this was investigated further by William R. Bennett, Jr., a Yale professor of Engineering. As Jeremy Campbell notes in his wonderful book, Grammatical Man:

Bennett… using computers, has calculated that if a trillion monkeys were to type ten keys a second at random, it would take more than a trillion times as long as the universe has been in existence merely to produce the sentence “To be, or not to be: that is the question.”

This is mainly because the keyboard of a typewriter does not truly reflect the alphabet as it is used in English. The typewriter keyboard has only one key for each letter. This means that every letter has the same chance of being struck. From an information theory standpoint, this represents a maximum entropy scenario. Any letter can come next since they all have the same probability of being struck. In English, however, the distribution of letters is not the same. Some letters such as “E” are more likely to occur than say “Q”. This is a form of “redundancy” in language. Here redundancy refers to regularities, something that occurs on a regular basis. Gatlin referred to this redundancy as “D1”, which she described as divergence from equiprobability. Bennett used this redundancy next in his experiment. This would be like saying that some letters now had a lot more keys on the typewriter, so that they are more likely to be struck. Campbell continues:

Bennett has shown that by applying certain quite simple rules of probability, so that the typewriter keys were not struck completely at random, imaginary monkeys could, in a matter of minutes, turn out passages which contain striking resemblances to lines from Shakespeare’s plays. He supplied his computers with the twenty-six letters of the alphabet, a space and an apostrophe. Then, using Act Three of Hamlet as his statistical model, Bennett wrote a program arranging for certain letters to appear more frequently than others, on the average, just as they do in the play, where the four most common letters are e, o, t, and a, and the four least common letters are j, n, q, and z. Given these instructions, the computer monkeys still wrote gibberish, but now it had a slight hint of structure.
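A rough Python sketch of this first step compares how likely a target line is when every key is equally likely and when the keys are weighted by letter frequency. This is not Bennett’s program. The short sample passage stands in for Act Three of Hamlet, and the add-one weighting and all names are my own assumptions for illustration.

```python
import math
from collections import Counter

# The 28 keys in Campbell's account: 26 letters, a space and an apostrophe.
KEYS = "abcdefghijklmnopqrstuvwxyz '"

# Assumption: a short stand-in for the statistical model (Bennett used Act Three of Hamlet).
SAMPLE = ("to be or not to be that is the question whether tis nobler in the mind "
          "to suffer the slings and arrows of outrageous fortune")
TARGET = "to be or not to be that is the question"

def key_weights(sample, keys):
    """Count each key in the sample, plus one so that no key has zero chance."""
    counts = Counter(ch for ch in sample if ch in keys)
    return {k: counts[k] + 1 for k in keys}

def log2_prob_uniform(text, keys):
    """Log2 probability of typing `text` when every key is equally likely."""
    return -len(text) * math.log2(len(keys))

def log2_prob_weighted(text, weights):
    """Log2 probability of typing `text` when keys are struck in proportion to their weights."""
    total = sum(weights.values())
    return sum(math.log2(weights[ch] / total) for ch in text)

weights = key_weights(SAMPLE, KEYS)
uniform = log2_prob_uniform(TARGET, KEYS)
weighted = log2_prob_weighted(TARGET, weights)
print(f"uniform keys : probability is about 2^{uniform:.1f}")
print(f"weighted keys: probability is about 2^{weighted:.1f}")
```

Both probabilities remain tiny, but the weighted typewriter makes the target line far less improbable than the uniform one, which is the "slight hint of structure" that D1 provides.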

The next type of redundancy in English is the divergence from independence. In English, we know that certain letters are more likely to come together. For example, “ing” or “qu” or “ion”. If we see an “i” and “o”, then there is a high chance that the next letter is going to be an “n”. If we see a “q”, we can be fairly sure that the next letter is going to be a “u”. The occurrence of one letter makes the occurrence of another letter highly likely. In other words, this type of redundancy makes the letters interdependent rather than independent. Gatlin referred to this as “D2”. Bennett utilized this redundancy for his experiment:

Next, Bennett programmed in some statistical rules about which letters are likely to appear at the beginning and end of words, and which pairs of letters, such as th, he, qu, and ex, are used most often. This improved the monkey’s copy somewhat, although it still fell short of the Bard’s standards. At this second stage of programming, a large number of indelicate words and expletives appeared, leading Bennett to suspect that one-syllable obscenities are among the most probable sequences of letters used in normal language. Swearing has a low information content! When Bennett then programmed the computer to take into account triplets of letters, in which the probability of one letter is affected by the two letters which come before it, half the words were correct English ones and the proportion of obscenities increased. At a fourth level of programming, where groups of four letters were considered, only 10 percent of the words produced were gibberish and one sentence, the fruit of an all-night computer run, bore a certain ghostly resemblance to Hamlet’s soliloquy:



We can see that as Bennett’s experiment used more and more of the redundancies found in English, a certain structure started to emerge. With the use of redundancies, even though it might appear that the monkeys were free to choose any key, the program made it such that certain events were more likely to happen than others. This is the basic premise of constraints. Constraints make certain things more likely to happen than others. This is different from a cause-and-effect phenomenon like a billiard ball hitting another billiard ball. Gatlin’s brilliance was to apply this analogy to evolution. She pondered why some species were able to evolve to be more complex than others. She concluded that this has to do with the two types of redundancies, D1 and D2. She considered the transmission of genetic material to be similar to how a message is transmitted from the source to the receiver. She determined that some species were able to evolve differently because they were able to use the two redundancies in an optimal fashion.
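The later stages of the experiment, where each letter depends on the letters before it, can be sketched in Python as a small letter-level Markov chain. This is only my own illustration of the idea and not Bennett’s program: the sample text, the function names and the trick of treating the sample as circular (so that every context has a successor) are all assumptions.

```python
import random
from collections import defaultdict

# Assumption: a tiny stand-in for the statistical model; note the trailing space.
SAMPLE = ("to be or not to be that is the question whether tis nobler in the mind "
          "to suffer the slings and arrows of outrageous fortune ")

def build_model(text, order):
    """Map each run of `order` letters to the letters that follow it in the text."""
    wrapped = text + text[:order]          # treat the text as circular
    model = defaultdict(list)
    for i in range(len(text)):
        model[wrapped[i:i + order]].append(wrapped[i + order])
    return model

def monkey_type(text, order, length, seed=0):
    """Let a constrained 'monkey' type `length` characters."""
    rng = random.Random(seed)
    model = build_model(text, order)
    wrapped = text + text[:order]
    start = rng.randrange(len(text))
    out = wrapped[start:start + order]      # begin with a context seen in the text
    while len(out) < length:
        context = out[len(out) - order:]    # the last `order` letters typed
        out += rng.choice(model[context])
    return out[:length]

# order 0 uses letter frequencies only (D1); higher orders add context (D2).
for order in (0, 1, 2, 3):
    print(order, repr(monkey_type(SAMPLE, order, 60, seed=1)))
```

With such a small sample the higher orders will mostly replay fragments of the original line, so this shows the mechanism rather than reproducing the results Campbell describes.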

If we come back to the analogy with the language, and if we were to only use D1 redundancy, then we would have a very high success rate of repeating certain letters again and again. Eventually, the strings we would generate would become monotonous, without any variety. It would be something like EEEAAEEEAAAEEEO. Novelty is introduced when we utilize the second type of redundancy, D2. Using D2 makes emergence more likely since there are more connections present. Campbell explains the two redundancies further:

Both kinds lower the entropy, but not in the same way, and the distinction is a critical one. The first kind of redundancy, which she calls D1, is the statistical rule that some letters are likely to appear more often than the others, on the average, in a passage of text. D1, which is context-free, measures the extent to which a sequence of symbols generated by a message source departs from the completely random state where each symbol is just as likely to appear as any other symbol. The second kind of redundancy, D2, which is context-sensitive, measures the extent to which the individual symbols have departed from a state of perfect independence from one another, departed from a state in which context does not exist. These two types of redundancy apply as much to a sequence of chemical bases strung out along a molecule of DNA as to the letters and words of a language.
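As a hedged sketch, the two measures can be estimated from a string of symbols in Python. I am assuming the usual formulation, in which D1 is the maximum entropy minus the single-symbol entropy, and D2 is the single-symbol entropy minus the entropy of a symbol given the one before it. Treating the string as circular and the function names are my own simplifications.

```python
import math
from collections import Counter

def entropy_bits(counts):
    """Shannon entropy, in bits, of the distribution given by a Counter."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def d1_d2(text, alphabet_size):
    """Estimate D1 (divergence from equiprobability) and D2 (divergence from independence)."""
    h1 = entropy_bits(Counter(text))
    # Adjacent pairs, with the text treated as circular so both marginals match.
    pairs = Counter(zip(text, text[1:] + text[:1]))
    h_next_given_prev = entropy_bits(pairs) - h1
    d1 = math.log2(alphabet_size) - h1
    d2 = h1 - h_next_given_prev
    return d1, d2

for sample in ("eeeeeeeeaaee", "eaeaeaeaeaea", "eaaeeaeaaeea"):
    d1, d2 = d1_d2(sample, alphabet_size=2)
    print(sample, "D1 =", round(d1, 3), "D2 =", round(d2, 3))
```

A string dominated by one letter scores high on D1, while a strictly alternating string has no D1 at all but a high D2, since each letter fully determines the next.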

Campbell suggests that D2 is a richer version of redundancy because it permits greater variety, while at the same time controlling errors. Campbell also notes that Bennett had to utilize the D1 constraint as a constant, whereas he had to keep on increasing the D2 constraints to the limit of his equipment until he saw something roughly similar to sensible English. Using this analogy to evolution, Gatlin notes:

Let us assume that the first DNA molecules assembled in the primordial soup were random sequences, that is, D2 was zero, and possibly also D1. One of the primary requisites of a living system is that it reproduces itself accurately. If this reproduction is highly inaccurate, the system has not survived. Therefore, any device for increasing the fidelity of information processing would be extremely valuable in the emergence of living forms, particularly higher forms… Lower organisms first attempted to increase the fidelity of the genetic message by increasing redundancy primarily by increasing D1, the divergence from equiprobability of the symbols. This is a very unsuccessful and naive technique because as D1 increases, the potential message variety, the number of different words that can be formed per unit message length, declines. Gatlin determined that this was the reason why invertebrates remained “lower organisms”.

A much more sophisticated technique for increasing the accuracy of the genetic message without paying such a high price for it was first achieved by vertebrates. First, they fixed D1. This is a fundamental prerequisite to the formulation of any language, particularly more complex languages… The vertebrates were the first living organisms to achieve the stabilization of D1, thus laying the foundation for the formulation of a genetic language. Then they increased D2 at relatively constant D1. Hence, they increased the reliability of the genetic message without loss of potential message variety. They achieved a reduction in error probability without paying too great a price for it… It is possible within limits to increase the fidelity of the genetic message without loss of potential message variety provided that the entropy variables change in just the right way, namely, by increasing D2 at relatively constant D1. This is what the vertebrates have done. This is why we are “higher” organisms.

Final Words:

I have always wondered about the exponential advancement of technology and how we as a species were able to achieve it. Gatlin’s ideas made me wonder if they are applicable to our species’ tremendous technological advancement. We started off with stone tools and now we are on the brink of visiting Mars. It is quite likely that we first came across a sharp stone and cut ourselves on it and then thought of using it for cutting things. From there, we realized that we could sharpen certain stones to get the same result. Gatlin puts forth that during the initial stages, it is extremely important that errors are kept to a minimum. We had to first get better at the stone tools before we could proceed to higher and more complex tools. The complexification happened when we were able to make connections – by increasing D2 redundancy. As Gatlin states, D2 endows the structure. The more tools and ideas we could connect, the faster and better we could invent new technologies. The exponentiality only came about when we were able to connect more things to each other.

I was introduced to Gatlin’s ideas through Campbell and Alicia Juarrero. As far as I could tell, Gatlin did not use the terms “context-free” or “context-sensitive”. They seem to have been used by Campbell. Juarrero refers to “context-free constraints” as “governing constraints” and “context-sensitive constraints” as “enabling constraints”. I will be writing about these in a future post. I will finish with a neat observation about the ever-present redundancies in the English language from Claude Shannon, the father of Information Theory:

The redundancy of ordinary English, not considering statistical structure over greater distances than about eight letters, is roughly 50%. This means that when we write English half of what we write is determined by the structure of the language and half is chosen freely.

In other words, if you follow the basic rules of the English language, you could make sense of at least 50% of what you have written, as long as you use short words!

Please maintain social distance, wear masks and take vaccination, if able. Stay safe and always keep on learning… In case you missed it, my last post was More Notes on Constraints in Cybernetics:

More Notes on Constraints in Cybernetics:

In today’s post, I am looking further at constraints. Please see here for my previous post on this. Ross Ashby is one of the main pioneers of Cybernetics, and his book “An Introduction to Cybernetics” still remains an essential read for a cybernetician. Alicia Juarrero is a Professor Emerita of Philosophy at Prince George’s Community College (MD), and is well known for her book, “Dynamics in Action: Intentional Behavior as a Complex System”.

I will start off with the basic idea of a system and then proceed to complexity from a Cybernetics standpoint. A system is essentially a collection of variables that an observer has chosen to make sense of something. Thus, a system is a mental construct and not something that is an objective reality. A system from this standpoint is entirely contingent upon the observer. Ashby’s view on complexity was regarding variety. Variety is the number of possible states of a system. A good example of this is a light switch. It has two states – ON or OFF. Thus, we can state that a light switch has a variety of 2. Complexity is expressed in terms of variety. The higher variety a system has, the more possibilities it possesses. A light switch and a person combined have indefinite variety. The person is able to communicate via messages simply by turning the light switch ON and OFF in a certain logical sequence such as Morse code.

Now let’s look at constraints. A constraint can be said to exist when the variety of a system has diminished. Ashby gives the example of a boys-only school. The variety for sex in humans is 2. If a school has a policy that only boys are allowed in that school, the variety has now decreased to 1 from 2. We can say that a constraint exists at the school.
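The two examples so far can be written down in a few lines of Python. This is just a sketch with names of my own choosing: variety is counted as the number of distinct states, and a constraint is said to exist when the actual set is smaller than the possible set.

```python
import math

def variety(states):
    """Variety in the sense used here: the number of distinguishable states."""
    return len(set(states))

def variety_bits(states):
    """The same variety expressed logarithmically, in bits."""
    return math.log2(variety(states))

light_switch = ["ON", "OFF"]
possible_pupils = ["boy", "girl"]
admitted_pupils = ["boy"]            # the boys-only policy

constraint_exists = variety(admitted_pupils) < variety(possible_pupils)

print("light switch:", variety(light_switch), "states,", variety_bits(light_switch), "bit")
print("school:", variety(possible_pupils), "->", variety(admitted_pupils),
      "constraint exists:", constraint_exists)
```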

Ashby indicated that we should be looking at all possibilities when we are trying to manage a situation. Our main job is to influence the outcomes so that certain outcomes are more likely than others. We do this through constraints. Ashby noted:

The fundamental questions in regulation and control can be answered only when we are able to consider the broader set of what it (system) might do, when ‘might’ is given some exact specification.

We can describe what we have been talking about so far with a simple schematic. We can try to imagine the possible outcomes of the system when we interact with it and utilize constraints so that certain outcomes, P2 and P4, are more likely to occur. There may be other outcomes that we do not know of or cannot imagine. Ashby advises that cybernetics is not about trying to understand what a system is, but what a system does. We have to imagine a set of all possible outcomes, so that we can guide or influence the system by managing variety. The external variety is always more than the internal variety. Therefore, to manage a situation, we have to at least match the variety of the system. We do this by attenuating the unwanted variety and by amplifying our internal variety so that we can match the variety thrown at us by the system. This is also represented as Ashby’s Law of Requisite Variety – only variety can absorb variety. Ashby stated:

Cybernetics looks at the totality, in all its possible richness, and then asks why the actualities should be restricted to some portion of the total possibilities.

Ashby talked about several versions of constraints. He talked about slight and severe constraints. He gave an example of a squad of soldiers. If the soldiers are asked to line up without any instructions, they have maximum freedom or minimum constraints to do so. If the order was given that no man may stand next to a man whose birthday falls on the same day, the constraint would be slight, for of all the possible arrangements few would be excluded. If, however, the order was given that no man was to stand at the left of a man who was taller than himself, the constraint would be severe; for it would, in fact, allow only one order of standing (unless two men were of exactly the same height). The intensity of the constraint is thus shown by the reduction it causes in the number of possible arrangements.
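The soldier example can be checked by brute force for a small squad. The following Python sketch enumerates every possible line-up and counts how many survive each order. The four soldiers, their heights and their birthdays are made up for illustration, and I have deliberately given two of them the same birthday so that the birthday rule excludes something.

```python
from itertools import permutations

# Hypothetical squad: (name, height in cm, birthday as month-day).
SQUAD = [("A", 170, "03-14"), ("B", 182, "07-01"), ("C", 175, "03-14"), ("D", 168, "11-30")]

def no_shared_birthday_neighbours(line):
    """No man stands next to a man whose birthday falls on the same day."""
    return all(left[2] != right[2] for left, right in zip(line, line[1:]))

def never_left_of_taller(line):
    """No man stands at the left of a man who is taller than himself."""
    return all(left[1] >= right[1] for left, right in zip(line, line[1:]))

all_lines = list(permutations(SQUAD))
birthday_ok = [line for line in all_lines if no_shared_birthday_neighbours(line)]
height_ok = [line for line in all_lines if never_left_of_taller(line)]

print("no constraint   :", len(all_lines), "arrangements")
print("birthday rule   :", len(birthday_ok), "arrangements")
print("height rule     :", len(height_ok), "arrangement(s)")
```

In a realistic squad shared birthdays are rare, so the birthday rule removes very little, whereas the height rule always leaves a single arrangement when all heights differ.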

Another way that Ashby talked about constraints was by identifying constraint in vectors. Here, multiple factors are combined in a vector such that the resultant constraint is considered. The example that Ashby gave was that of an automobile. He gave the example of the vector shown below:

(Age of car, Horse-power, Color)

He noted that each component has a variety that may or may not be dependent on the other components. If the components are dependent on each other, the variety of the vector will be less than the sum of the individual component varieties (measured logarithmically), and this shortfall is the constraint. If the components are all independent, then the resultant variety is the sum of the individual varieties. This is an interesting point to further look at. Imagine that we are looking at a team here of say Person A, B and C. Since each person here is able to come up with indefinite possibilities, the resultant variety of the team would also be indefinite. If we allow for the indefinite possibilities to emerge, as in innovation or invention of new ideas or products, the constraints could play a role. When we introduce thinking agents to the mix, the number of possibilities goes up.
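A small Python sketch may help with the vector idea, stated in terms of variety. When the components are independent, the variety of the vector is the product of the component varieties (or the sum, when measured in bits). When the components depend on each other, some combinations never occur and the variety is smaller. The component values and the dependency rule below are invented purely for illustration.

```python
import math
from itertools import product

# Hypothetical component values for the vector (Age of car, Horse-power, Color).
ages = ["new", "old"]
horse_powers = ["low", "high"]
colors = ["red", "blue", "black"]

# Independent components: every combination can occur.
independent = set(product(ages, horse_powers, colors))

# Assumed dependency, for illustration only: old cars never have high horse-power.
dependent = {v for v in independent if not (v[0] == "old" and v[1] == "high")}

sum_of_component_bits = math.log2(len(ages)) + math.log2(len(horse_powers)) + math.log2(len(colors))
constraint_bits = math.log2(len(independent)) - math.log2(len(dependent))

print("independent variety:", len(independent), "=", round(math.log2(len(independent)), 3), "bits")
print("sum of components  :", round(sum_of_component_bits, 3), "bits")
print("dependent variety  :", len(dependent), "so the constraint is", round(constraint_bits, 3), "bits")
```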

Complexity is about managing variety – about allowing room for possibilities to tackle complexity. Ashby famously noted that a world without constraints is totally chaotic. His point is that if a constraint exists, it can be used to tackle complexity. Allowing parts to depend upon each other introduces constraints that could cut down on unwanted variety and at the same time allow for innovative possibilities to emerge. The controller’s goal is to manage variety and allow for certain possible outcomes to be more likely than others. For this, the first step is to imagine the total set of possible outcomes to the best of their ability. This means that the controller also has to have a good imagination and a creative mind. This points to the role of the observer when it comes to seeing and identifying the possibilities. Ashby referred to the set of possibilities as “product space.” Ashby noted that its chief peculiarity is that it contains more than actually exists in the real physical world, for it is the latter that gives us the actual, constrained subset.

The real world gives the subset of what is; the product space represents the uncertainty of the observer. The product space may therefore change if the observer changes; and two observers may legitimately use different product spaces within which to record the same subset of actual events in some actual thing. The “constraint” is thus a relation between observer and thing; the properties of any particular constraint will depend on both the real thing and on the observer. It follows that a substantial part of the theory of organization will be concerned with properties that are not intrinsic to the thing but are relational between the observer and thing.

A keen reader might be wondering how the ideas of constraints stack up against Alicia Juarrero’s versions of constraints. More on this in a future post.  I will finish with a wonderful tribute to Ross Ashby from John Casti:

The striking fact is that Ashby’s idea of the variety of a system is amazingly close to many of the ideas that masquerade today under the rubric “complexity.”

Please maintain social distance and wear masks. Please take vaccination, if able. Stay safe and Always keep on learning… In case you missed it, my last post was Towards or Away – Which Way to Go?

Towards or Away – Which Way to Go?

In today’s post I am pondering the question – as a regulator, should you be going towards or away from a target? Are the two things the same? I will use Erik Hollnagel’s ideas here. Hollnagel is a Professor Emeritus at Linköping University who has a lot of work in Safety Management. Hollnagel challenges the main theme of safety management as getting to zero accidents. He notes:

The goal of safety management is obviously to improve safety. But for this to be attainable it must be expressed in operational terms, i.e., there must be a set of criteria that can be used to determine when the goal has been reached… the purpose of an SMS is to bring about a significant reduction – or even the absence – of risk, which means that the goal is to avoid or get away from something. An increase in safety will therefore correspond to a decrease in the measured output, i.e., there will be fewer events to count. From a control point of view that presents a problem, since the absence of measurements means that the process becomes uncontrollable.

He identifies this as a problem from a cybernetics standpoint. Cybernetics is the art of steersmanship. The controller identifies a target and the regulator works on getting to the target. There is a feedback loop so that when the difference between the actual condition and the target is higher than a preset value, the regulator tries to bring the difference down. Take the example of a steersman of a boat – the steersman propels the boat to the required destination by steering the boat. If there is a strong wind, the steersman adjusts accordingly so that the boat is always moving towards the destination. The steersman is continuously measuring the difference from the expected path and adjusting accordingly.
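The feedback loop described above can be sketched in a few lines of Python. This is a toy model under my own assumptions, not a model of real steering: the deviation from the expected path is a single number, the wind adds to it at every step, and the regulator removes a fixed fraction (`gain`) of the deviation whenever it exceeds a preset value (`tolerance`).

```python
def steer(wind, gain=0.5, tolerance=1.0):
    """Return the deviation from the expected path after each step."""
    deviation = 0.0
    track = []
    for push in wind:
        deviation += push                      # the wind pushes the boat off its path
        if abs(deviation) > tolerance:         # difference is higher than the preset value
            deviation -= gain * deviation      # the steersman corrects part of it
        track.append(deviation)
    return track

steady_wind = [1.0] * 50
unregulated = steer(steady_wind, gain=0.0)
regulated = steer(steady_wind, gain=0.5)

print("final deviation without steering:", unregulated[-1])
print("largest deviation with steering :", max(abs(d) for d in regulated))
```

Without correction the boat simply drifts further with every push, while the regulated boat stays within the tolerance because each measured difference triggers an adjustment.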

Hollnagel continues with this idea:

Quantifying safety by measuring what goes wrong will inevitably lead to a paradoxical situation. The paradox is that the safer something (an activity or a system) is, the less there will be to measure. In the end, when the system is perfectly safe – assuming that this is either meaningful or possible – there will be nothing to measure. In control theory, this situation is known as the ‘fundamental regulator paradox’. In plain terms, the fundamental regulator paradox means that if something happens rarely or never, then it is impossible to know how well it works. We may, for instance, in a literal or metaphorical sense, be on the right track but also precariously close to the limits. Yet if there is no indication of how close, it is impossible to improve performance.

The idea of the fundamental regulator paradox was put forward by Gerald Weinberg. He described it as:

The task of a regulator is to eliminate variation, but this variation is the ultimate source of information about the quality of its work. Therefore, the better job a regulator does, the less information it gets about how to improve.

Weinberg noted that as the regulator gets better at what it is doing, the more difficult it is for it to improve. If we go back to the case of the steersman, perfect regulation is when the steersman is able to make adjustments at a superhuman speed so that the boat travels in a straight line from start to end. Weinberg is pointing out this is not possible. When 100% regulation is achieved, we are also cutting off any contact with the external world. This is also the source of information that the regulator needs to do its job.
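One way to picture the paradox is with a toy simulation in Python. The model and its names are my own assumptions: a regulator removes a fraction (`gain`) of any deviation larger than `tolerance`, and all we keep is the record of what is left over. With perfect regulation, a calm sea and a storm leave exactly the same record, so the record no longer says anything about the conditions the regulator is working in.

```python
def recorded_deviation(wind, gain, tolerance):
    """Deviation left over after each correction; this is all the regulator gets to see."""
    deviation = 0.0
    record = []
    for push in wind:
        deviation += push
        if abs(deviation) > tolerance:
            deviation -= gain * deviation
        record.append(deviation)
    return record

calm = [0.1, -0.1] * 10
storm = [5.0, -3.0] * 10

# An imperfect regulator: the two conditions leave different traces.
loose_calm = recorded_deviation(calm, gain=0.5, tolerance=1.0)
loose_storm = recorded_deviation(storm, gain=0.5, tolerance=1.0)

# A 'perfect' regulator: every deviation is removed completely and at once.
perfect_calm = recorded_deviation(calm, gain=1.0, tolerance=0.0)
perfect_storm = recorded_deviation(storm, gain=1.0, tolerance=0.0)

print("imperfect regulator can tell calm from storm:", loose_calm != loose_storm)
print("perfect regulator can tell calm from storm  :", perfect_calm != perfect_storm)
```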

Coming back to the original question of “away from” or “towards”, Hollnagel states:

From a control perspective it would make more sense to use a definition of safety such that the output increases when safety improves. In other words, the goal should not be to avoid or get away from something, but rather to achieve or get closer to something.

While pragmatically it seems very reasonable that the number of accidents should be reduced as far as possible, the regulator paradox shows that such a goal is counterproductive in the sense that it makes it increasingly difficult to manage safety… The essence of regulation is that a regulator makes an intervention in order to steer or direct the process in a certain direction. But if there is no response to the intervention, if there is no feedback from the process, then we have no way of knowing whether the intervention had the intended effect.

Hollnagel advises that we should see safety in terms of resilience and not as absence of something (accidents, missed days etc.) but rather as the presence of something.

Based on the discussion we can see that “moving towards” is a better approach for a regulator than “moving away” from something. From a management standpoint, we should refrain from enforcing policies that are too strict in the hopes of perfect regulation. They would lack the variety needed to tackle the external variety thrown at us. We should allow room for some noise in the processes. As the variety of the situation increases, we should stop setting targets and instead provide a direction to move towards. Putting a hard target is again an attempt at perfect regulation that can stress the various elements within the organization.

I will finish with some wise words from Weinberg:

The fundamental regulator paradox carries an ominous message for any system that gets too comfortable with its surroundings. It suggests, for instance, that a society that wants to survive for a long time had better consider giving up some of the maximum comfort it can achieve in return for some chance of failure or discomfort.

Please maintain social distance and wear masks. Please take vaccination, if able. Stay safe and Always keep on learning…

In case you missed it, my last post was The Cybernetics of the Two Wittgensteins:


  1. The Trappers’ Return, 1851. George Caleb Bingham
  2. Safety management – looking back or looking forward – Erik Hollnagel, 2008
  3. On the design of stable systems – Gerald Weinberg, 1979

The Cybernetics of the Two Wittgensteins:

In today’s post, I am looking at Wittgenstein and parallels between his ideas and Cybernetics. Wittgenstein is often regarded as one of the most influential philosophers of the twentieth century. His famous works include Tractatus Logico-Philosophicus (referred to as TLP in this article) and Philosophical Investigations (referred to as PI in this article). TLP is one of the most intriguing books I have read and reread in philosophy. His style of writing is poetic and the body of the book is split into sections and sub-sections. Wittgenstein is one of the few philosophers who has written two influential books that held opposing views in linguistic philosophy.

The Early Wittgenstein:

Wittgenstein was very much influenced by Bertrand Russell’s logical representation of mathematics. Wittgenstein came to the conclusion that language also resides in a logical space. He realized that the problems in philosophy are due to a lack of understanding of how language works. He opens TLP with the succinct declaration – “The world is all that is the case.” He followed this up with – “What is the case – a fact – is the existence of states of affairs.”

Wittgenstein is saying that the world is not made up of things, but that the world is the totality of facts. For example, if we take the example of a house, we cannot simply point to the table, the chairs, the rooms and identify a house from the different things. Instead, we have to say that there is a brown dining table in the dining room, and there are six chairs around it. This statement is a representation of a fact. The fact contains objects depicted in a relation between them. The objects by themselves lack the complexity to denote the world. The statement is a state of affairs between the objects, and the state of affairs is a combination of objects in a specific configuration.

Let’s bring up the famous idea of “picture theory” here. The story goes that Wittgenstein read about a judicial proceeding in France where a road accident was depicted using a model of the road with the cars, buildings, pedestrians etc. This gave him the idea of the picture theory. The picture theory is simply a model or a representation of a state of affairs that corresponds to the specific configuration of the objects in the world. The picture is a model of reality. If we say that there is a cat on the mat, then we can picture this as a cat being on the mat. There are other possible configurations, such as the cat being on the side of the mat or the mat being on top of the cat. However, in this particular case, the picture of the cat on the mat depicts the reality of the object “cat” being on top of the object “mat”. The relationship between the two objects is that the cat is on top of the mat. What we talk about using language can be represented by the model with the different objects in the statement having a specific relation between the objects.

Wittgenstein’s main idea was that the use of language is to represent the states of affairs in the world. We can make propositions or statements in language that are pictures of reality. These statements are true if and only if the pictures map onto a corresponding reality in the world. Whatever we speak of using language is senseful only if it is about states of affairs in the world. If we talk about supernatural things, then they are not depicting a state of affairs in the real world, and thus are senseless or nonsense. Wittgenstein then used this approach with thoughts by seeing a logical picture of facts as a thought. A thought thus becomes a proposition with sense. With this approach, Wittgenstein showed that the problems of philosophy arise from a poor understanding of how language works. We can solve these problems only when we understand the logic of language. Wittgenstein said that everything that can be thought can be thought clearly, and everything that can be put into words can be put clearly. Everything else is nonsense. Wittgenstein famously stated that the limits of my language mean the limits of my world. Wittgenstein ended TLP with the following – What we cannot speak about we must pass over in silence.

The Later Wittgenstein:

In PI, Wittgenstein came to the realization that his earlier views were dogmatic. Instead of using the idea of picture theory where language corresponded to the world, the later Wittgenstein concluded that the meaning of a word is in its use. He realized that we should not provide definitions of words, but instead provide descriptions of use. Instead of picture theory, Wittgenstein introduced the idea of language games. We are all engaged in language games when we interact with one another. Wittgenstein never gave a definition for language games, but he gave several examples. Loosely put, we engage in a language game when we converse with each other. We follow certain rules; we act and counteract based on these rules. Things make sense only when we follow these rules. Wittgenstein viewed language as a tool box with all kinds of different tools, and each tool has multiple uses depending on the context. Let’s take the example of a surgeon performing a surgery. The surgeon at times might say “scalpel” or at times simply gesture. The assisting nurse or doctor understands exactly what the surgeon is asking for without the surgeon making a clear statement about the state of affairs. They are all engaged in a language game where the word “scalpel” or the simple gesture of an open hand has a specific meaning unique to that context. If the surgeon is in a restaurant and gestures with an open hand, he might be given a breadstick instead of a scalpel.

One of the other ideas that Wittgenstein brought up in PI that requires our attention is that of private language. Wittgenstein concluded that a private language is not possible. Language has to be public. To provide a simple explanation, we need an external reference to calibrate the meanings of our words. If you are experiencing pain, all you can say is that you experience pain. While the experience of pain is private, all we have is a public language to explain it in. For example, suppose you experience a severe pain on Monday and decide to call it “X”. A week later, you have some pain and decide to call it “Y”. You cannot be sure that “X” was the same as “Y”. Wittgenstein used the example of a beetle in the box to explain this.

Suppose everyone had a box with something in it: we call it a ‘beetle’. No one can look into anyone else’s box, and everyone says he knows what a beetle is by looking at his beetle. Here it would be quite possible for everyone to have something different in his box. One might even imagine such a thing constantly changing. But suppose the word ‘beetle’ had a use in these people’s language? If so, it would not be used as the name of a thing. The thing in the box has no place in the language-game at all; not even as a something: for the box might even be empty. No one can ‘divide through’ by the thing in the box; it cancels out, whatever it is.

The beetle in the box is a thought experiment to show that private language is not possible. The beetle in my box is visible to only me, and I cannot see the beetle in anybody else’s box. All I can see is the box. The way that I understand the beetle or the word “beetle” is by interacting with others. I learn about the meaning only through the use of the word in conversations with others and how others use that word. This is true, even if they cannot see my beetle or if I cannot see their beetle. I can never experience and thus know their pain or any other private sensations. But we all use the same words to explain how each of us experiences the world. The word beetle becomes whatever is in the box, even if the beetles are of different colors, sizes, types etc. Sometimes, the beetles could even be absent. The box in this case is the public language we use to explain the beetle, which is the private experience. The meaning of the word beetle then is not what it refers to; the meaning is determined by how it is used by all of us. It is an emergent phenomenon. And sometimes, the meaning itself changes over time. There is no way for me to know what your beetle looks like. The box comes to represent the beetle.

With these introductions, I will now try to draw parallels between Wittgenstein’s ideas and Cybernetics.

First Order Cybernetics and Early Wittgenstein:

When I look at the ideas of early Wittgenstein, I see a lot of parallels to first order cybernetics. First order cybernetics is described as the study of observed systems. Here the observer is independent of the observed system, and can make a model of how the observed system works and try to control it. The observer creates a model by looking at how the system works. Here the “system” refers to a selection of variables of interest with relation to a phenomenon chosen by the observer. One can see how this corresponds to the picture theory, where the picture is a model of reality depicting relations between objects.

Additionally for Wittgenstein, the logical space contains all possible combinations of the objects. Wittgenstein noted:

If all objects are given, then at the same time all possible states of affairs are also given.

Each thing is, as it were, in a space of possible states of affairs. This space I can imagine empty, but I cannot imagine the thing without the space.

In Cybernetics, this set of all possible combinations is viewed as the variety of the system. ‘The limits of my language are the limits of my world’ is a statement about my variety. ‘What we cannot speak about, we must pass over in silence’ is Wittgenstein’s advice to cut down on the extraneous variety. This can be viewed as the application of Ashby’s “Law of Requisite Variety”. Ashby explained this law as – only variety can absorb variety. The external variety is always greater than our internal variety. Therefore, to manage external variety thrown at us, we have to cut down the external variety coming our way so that we can focus and manage our abilities to cope with the world. For example, our brains have evolved so that we do not pay attention to every minute detail of the world around us.

We manage the world around us by making models of the world, and by interacting with the world through these models. We are able to sustain our viability by managing variety.
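
One way to make “only variety can absorb variety” concrete is a small counting exercise. The Python sketch below is my own toy illustration, not Ashby’s formulation: the `outcome` rule, the function names and the numbers are all assumptions, chosen so that for any fixed response, different disturbances give different outcomes. Under that assumption, a regulator with k responses facing n disturbances cannot push the number of distinct outcomes below n/k, rounded up.

```python
from itertools import product
from math import ceil


def outcome(disturbance, response):
    """Toy outcome rule (an assumption for illustration only).

    For any fixed response, different disturbances give different outcomes.
    """
    return disturbance - response


def min_outcome_variety(n_disturbances, n_responses):
    """Try every regulator strategy and return the fewest distinct outcomes.

    A strategy assigns one response to each disturbance, so there are
    n_responses ** n_disturbances strategies. Keep the numbers small.
    """
    best = n_disturbances
    for strategy in product(range(n_responses), repeat=n_disturbances):
        outcomes = {outcome(d, r) for d, r in enumerate(strategy)}
        best = min(best, len(outcomes))
    return best


if __name__ == "__main__":
    n = 6  # variety of the disturbances
    for k in (1, 2, 3, 6):  # variety of the regulator
        found = min_outcome_variety(n, k)
        bound = ceil(n / k)
        print(f"responses={k}: fewest outcomes={found}, bound={bound}")
```

In this toy setup, a regulator with a single response leaves all six outcomes in play, while it takes six responses to hold the outcome to a single value. The only way to reduce the variety of outcomes is to increase the variety of the regulator or to reduce the variety of the disturbances it has to face.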

Second Order Cybernetics and Later Wittgenstein:

With the later ideas, I see parallels to second order cybernetics. Second order cybernetics is the study of observing systems. Here the observer is not seen as independent of the observed system; rather, the observer is part of the observed system. The idea of meaning as use brings in the need to look at the context. The context of an observed system is the observer doing the observation. The observer is doing the observation with a specific purpose in mind. We cannot remove the observer from the observation. To be aware of our biases and preconceived notions is important. We need to also be mindful about the other observers in the social realm. We need to see how they view the system. A second order cybernetician is aware of the potential blind spots in our observations.

The second idea that resonated with me is the idea of language games. Language games imply that there is more than one player. We are in a social realm and our reality is a stable representation derived from the ongoing interactions with other participants in the social realm. The reality is formed from the specific rules of the game we engage in. Language games require practice just like any other game.

Third Order Cybernetics?

Wittgenstein viewed philosophy as therapy and I welcome Wittgenstein’s view of philosophy as a therapy. To me, it is a second-order activity. I make sense of the world by describing it and therefore the limits of my understanding are based on the limits of my language. This viewpoint is liberating. When I view philosophy as second order cybernetics, I can conclude that there is no need for third order cybernetics. There is no need for a philosophy of philosophy. Wittgenstein talked about whether second order philosophy is needed:

One might think: if philosophy speaks of the use of the word “philosophy” there must be a second-order philosophy. But it is not so: it is, rather, like the case of orthography, which deals with the word “orthography” among others without then being second-order.

Final words:

Wittgenstein saw philosophy as a process for coming up with descriptions instead of explanations. When we try to come up with explanations of things, most often we fall prey to the philosophical problems that Wittgenstein exposed. We come into the realm of nonsense and we try to make sense of things by providing explanations where none can suffice. Wittgenstein said – Don’t think but look. Cybernetics teaches us to look at how the system behaves rather than trying to understand what the system is. We need to look at descriptions rather than explanations. I will finish with a great explanation from Marie McGinn:

What we are concerned with when we ask questions of the form ‘What is time?’, ‘What is meaning?’, ‘What is thought?’ is the nature of the phenomena which constitute our world. These phenomena constitute the form of the world which we inhabit, and in asking these questions we express a desire to understand them more clearly. Yet in the very act of framing these questions, we are tempted to adopt an attitude towards these phenomena which, Wittgenstein believes, makes us approach them in the wrong way, in a way which assumes that we have to uncover or explain something. When we ask ourselves these questions, we take up a stance towards these phenomena in which they seem suddenly bewilderingly mysterious, for as soon as we try to catch hold of them in the way that our questions seem to require, we find we cannot do it; we find that we ‘no longer know’. This leads us deeper and deeper into a state of frustration and philosophical confusion. We think that the fault lies in our explanations and that we need to construct ever more subtle and surprising accounts. Thus, we ‘go astray and imagine that we have to describe extreme subtleties, which in turn we are after all quite unable to describe with the means at our disposal. We feel as if we had to repair a torn spider’s web with our fingers’. The real fault, Wittgenstein believes, is not in our explanations, but in the very idea that the puzzlement we feel can be removed by means of a discovery. What we really need is to turn our whole enquiry round and concern ourselves, not with explanation or theory construction, but with description. The nature of the phenomena which constitute our world is not something that we discover by ‘digging’, but is something that is revealed in ‘the kind of statement we make about phenomena’, by the distinctive forms of linguistic usage which characterize the different regions of our language. 
The method we really need is one that ‘simply puts everything before us, and neither explains nor deduces anything. —Since everything lies open to view there is nothing to explain’. It is by attending to the characteristic structures of what already lies open to view in our use of language that we will overcome our sense of philosophical perplexity and achieve the understanding we seek; the difficulty lies only in the fact that we are so unwilling to undertake, and so unprepared for, this task of description: ‘The aspects of things that are most important for us are hidden because of their simplicity and familiarity. (One is unable to notice something—because it is always before one’s eyes.)’

Please maintain social distance and wear masks. Please take vaccination, if able. Stay safe and Always keep on learning…

In case you missed it, my last post was The Reality of Informationally Closed Entities:

The Reality of Informationally Closed Entities:

In today’s post, I am looking at the idea of “informationally closed”. The idea of informational closure was first proposed by Ross Ashby. Ashby defined Cybernetics as a study of systems that are informationally tight. Ashby wanted cyberneticians to look at all the possible states that a system can be in. Here the system refers to a selection of variables that the observer has chosen. Ashby noted that we should not look at what individual act a system produces ‘here and now’, but at all the possible behaviors it can produce. For example, he asked why the ovum grows into a rabbit, and not a dog or a fish. Ashby noted that this is strictly related to information, and not energy:

Growth of some form there will be; cybernetics asks “why should the changes be to the rabbit-form, and not to a dog-form, a fish-form or even to a teratoma-form?” Cybernetics envisages a set of possibilities much wider than the actual, and then asks why the particular case should conform to its usual particular restriction. In this discussion, questions of energy play almost no part – the energy is simply taken for granted. Even whether the system is closed to energy or open is often irrelevant; what is important is the extent to which the system is subject to determining and controlling factors. So, no information or signal or determining factor may pass from part to part without its being recorded as a significant event. Cybernetics might, in fact, be defined as the study of systems that are open to energy but closed to information and control – systems that are information-tight.

Ashby’s main point regarding this is that the machine or the system under observation selects its actions from a set of possible actions, and this will remain the same until there is a significant event that causes it to alter the set of possible actions. The action of the system is entirely based on its structure, and not because an external agent is choosing that action for the system. The external agent is only triggering or perturbing the system, and the system in turn reacts. This idea of informational closure was further taken up by Humberto Maturana and Francisco Varela. They noted that as cognizant beings, we are informationally closed. We do not have information enter us externally. We are instead perturbed by the environment, and we react in ways that we are accustomed to. The idea of “informationally closed” is a strong premise for constructivism – the idea that all knowledge is constructed rather than perceived through senses. Jonathan D. Raskin expands on this further:

People are informationally closed systems only in touch with their own processes. What an organism knows is personal and private. In adhering to such a view, constructivism does not conceptualize knowledge in the traditional manner, as something moving from “outside” to “inside” a person. Instead, what is outside sets off, triggers, or disrupts a person’s internal processes, which then generate experiences that the person treats as reflective of what is outside. Sensory data and what we make of it are indirect reflections of a presumed outside world. This is why different organisms experience things quite differently. How Jack’s backyard smells to his dog is different from how it smells to him because he and his dog have qualitatively different olfactory systems. Of course, how Jack’s backyard smells to him may also differ from how it smells to Sara because not only is each of them biologically distinct but each has a unique history that informs the things to which they attend and attribute meaning. The world does not dictate what it “smells” like; it merely triggers biological and psychological processes within organisms, which then react to these triggers in their own ways. The kinds of experiences an organism has depend on its structure and history. Therefore, what is known is always a private and personal product of one’s own processes.

Raskin gives the example of a toaster and a washing machine to clarify informational closure.

Maturana asserts that from the point of view of a biologist living systems are informationally closed–that is, things don’t get in and they don’t get out. From the outside, you can trigger a change, but you cannot directly instruct. Think of it as having a toaster and a washing machine. And, the toaster is going to toast no matter what you do. And, the washing machine is going to wash no matter what you do. And they both can be triggered by electricity. But the electricity doesn’t tell the toaster what to do. The toaster’s structure tells the toaster what to do. So similarly, we trigger organisms, but what they do has to do with their internal structure–including their nervous system–and the way it responds to various perturbations.

The idea of informational closure forces us to bring a new perspective to how we view the world. How are we able to know about reality? From a constructivist standpoint, we do not have direct access to the external reality. What we can truly say is how we experience the world, not how the world really is. We do not construct a representation of the external world. This is not possible if we are informationally closed. What we actually construct is how we experience the world. As Raskin points out, the world is not a construction; only our experience of it is. Distinguishing experiential reality from external reality (even a hypothetical, impossible-to-prove-for-sure external reality) is important in maintaining a coherent constructivist stance.

All knowledge from this standpoint is personal, and cannot be passed on as a commodity. In constructivism, there is an idea called the myth of instructive interaction. This means that we cannot be directly instructed. A teacher cannot teach a student with a direct and exact impact. All the teacher can do is to perturb the student so that the student can construct their personal knowledge based on their internal structure. Raskin notes – once people’s internal systems are triggered, they organize their experiential responses into something meaningful and coherent. That is to say, they actively construe. Events alone do not dictate what people know; constructive processes play a central role as people impose meaning and order on sensory data.

The more interactions we have with a phenomenon, the better we can experience the phenomenon, and it aids in our construction of the stable experiential reality of that phenomenon. Repetition is an important ingredient for this. Ernst von Glasersfeld notes:

Without repetition there would be no reason to claim that a given experiential item has some kind of permanence. Only if we consider an experience to be the second instance of the self-same item we have experienced before, does the notion of permanence arise.

From this point, I will try to look at some questions that might help to further our understanding of constructivism.

What is the point of constructivism if it means that we cannot have an accurate representation of the real world? The ultimate point of constructivism is not an ontological stance; it is viability. It is about being able to continue to survive. All organisms are informationally closed, and they continue to stay viable. The goal is to fit into the real world. Raskin explains – the purpose of this knowledge is not to replicate a presumed outside world but to help the organism survive. In Cybernetics, we say that we need to have a model of what we are trying to manage or control. This “model” does not have to be an exact representation of the “system” we are trying to control. We can treat it as a black box where we have no idea about the inner workings of the system. As long as we are able to come up with a set of possibilities and possible triggers for possible outcomes, we can manage the system. A true representation is not needed.
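
The black box idea can be sketched in a few lines of Python. This is only a toy under stated assumptions: the `Toaster` class, its triggers and its outcomes are hypothetical names I made up for illustration, and the box is assumed to respond the same way every time. The observer never reads the inner workings. It only perturbs the box, keeps the trigger and outcome pairs that repeat, and then uses that table to manage the box.

```python
class Toaster:
    """Hypothetical black box. The observer below never reads its internals."""

    def __init__(self):
        # Hidden structure: this, not the trigger, determines the response.
        self._structure = {"low": "pale", "medium": "golden", "high": "burnt"}

    def perturb(self, trigger):
        """Respond to a trigger according to the box's own structure."""
        return self._structure.get(trigger, "no change")


def build_model(box, triggers, repeats=3):
    """Build a trigger -> outcome table purely from repeated interaction.

    A pairing is kept only if the same outcome shows up on every repeat.
    """
    model = {}
    for trigger in triggers:
        observed = {box.perturb(trigger) for _ in range(repeats)}
        if len(observed) == 1:
            model[trigger] = observed.pop()
    return model


def choose_trigger(model, desired_outcome):
    """Pick a trigger that the model associates with the desired outcome."""
    for trigger, result in model.items():
        if result == desired_outcome:
            return trigger
    return None


if __name__ == "__main__":
    model = build_model(Toaster(), ["low", "medium", "high", "shake"])
    print(model)
    print(choose_trigger(model, "golden"))
```

The table built here is not a representation of what is inside the toaster. It is a record of what has worked so far, and it stays useful only as long as the box keeps responding in the same way.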

How would one account for a social realm if we are informationally closed? If each of us is informationally closed, and our knowledge is personal, how do we account for the social realm, where we all acknowledge a version of stable social reality? Raskin provides some clarity on this. He notes:

Von Glasersfeld held that people create a subjective internal environment that they populate with “repeatable objects.” These repeatable objects are experienced as “external and independent” by the person constructing internal representations of them. Certain repeatable objects–those we identify as sentient, primarily other people–are treated as if they have the same active meaning-making abilities that we attribute to ourselves. Consequently, we are able to experience an intersubjective reality whenever other people respond to us in ways that we interpret as indicating they experience things the same way we do. Once again, this alleviates concerns about constructivism being solipsistic because people do relationally coordinate with one another in confirming and maintaining their constructions. 

For von Glasersfeld, it means that people construe one another as active meaning makers and consequently treat their personal understandings as communally shared when others’ behavior is interpreted as affirming those understandings. As I stated elsewhere, “when experiencing sociality or an intersubjective reality, we come to experience our constructions as socially shared to the extent that they appear to be (and, for all functional purposes, can be treated as if they are) also held by others”.

Each one of us constructs an experiential reality of the external world. This external world includes other people in it. Our ongoing interaction with these people enhances and updates our own experiential world. We come to see the external world as a social construction. Our personal construction gets triggered in a social setting, resulting in a social version of that construction. The more frequent and diverse our interactions, the more viable this construction becomes. The other people are part of this experiential reality and thereby become cocreators of the social reality. In many regards, what we construct are not representations of the external world, but more a domain of constraints and possibilities. Making sense of the external world is a question about viability. If it does not affect viability, one may very well believe in a God or think that the world is flat. The moment viability is impacted, the constructions of reality will have to be adjusted or modified.

The image I have chosen for the post is an artwork by the Japanese Zen master, Nakahara Nantenbō (1839 – 1925). The artwork is a depiction of ensō (circle). The caption reads:

Born within the ensō (circle) of the world, the human heart must also become an ensō (circle).


In case you missed it, my last post was The Ghost in the System:



  1. Ross Ashby, An Introduction to Cybernetics (1956)
  2. Jonathan D. Raskin, An introductory perturbation: what is constructivism and is there a future in it? (2015)

The Ghost in the System:

In today’s post, I am looking at the idea of ‘category mistake’ by the eminent British philosopher Gilbert Ryle. Ryle was an ardent opponent of Rene Descartes’ view of mind-body dualism. Ryle also came up with the phrase ‘the ghost in the machine’ to mock the idea of dualism. Cartesian dualism is the idea that mind and body are two separate entities. Descartes was perhaps influenced by his religious beliefs. Our bodies are physical entities that will wither away when we die. But our minds, Descartes concluded, are immaterial and can “live on” after we die. Descartes noted:

There is a great difference between mind and body, inasmuch as body is by nature always divisible, and the mind is entirely indivisible… the mind or soul of man is entirely different from the body.

Ryle called this idea the official doctrine:

The official doctrine, which hails chiefly from Descartes, is something like this. With the doubtful exceptions of idiots and infants in arms every human being has both a body and a mind. Some would prefer to say that every human being is both a body and a mind. His body and his mind are ordinarily harnessed together, but after the death of the body his mind may continue to exist and function.

Ryle referred to the idea of Cartesian dualism as the dogma of the ghost in the machine – the physical body being the machine, and the mind being the ghost. Ryle pointed out that Descartes was engaging in a category mistake by saying that mind and body are separate things. A category mistake happens when we treat an idea as if it belongs to one category when it actually belongs to another. Loosely put, it is like comparing apples to oranges, or even better, comparing apples to hammers. The two items do not belong to the same category and hence, a comparison between the two is a futile and incorrect attempt. The mind is not separate from the body. In fact, the two are interconnected and influence each other in a profound manner. Ryle talked about the idea of dualism as the absurdity of the official doctrine:

I shall often speak of it, with deliberate abusiveness, as ‘the dogma of the Ghost in the Machine’. I hope to prove that it is entirely false, and false not in detail but in principle. It is not merely an assemblage of particular mistakes. It is one big mistake and a mistake of a special kind. It is, namely, a category-mistake. It represents the facts of mental life as if they belonged to one logical type or category (or range of types or categories), when they actually belong to another. The dogma is therefore a philosopher’s myth.

Ryle explained the category mistake with some examples. One of the examples was that of a foreigner visiting Oxford or Cambridge:

A foreigner visiting Oxford or Cambridge for the first time is shown a number of colleges, libraries, playing fields, museums, scientific departments and administrative offices. He then asks ‘But where is the University? I have seen where the members of the Colleges live, where the Registrar works, where the scientists experiment and the rest. But I have not yet seen the University in which reside and work the members of your ‘University’. It has then to be explained to him that the University is not another collateral institution, some ulterior counterpart to the colleges, laboratories and offices which he has seen. The University is just the way in which all that he has already seen is organized. When they are seen and when their co-ordination is understood, the University has been seen. His mistake lay in his innocent assumption that it was correct to speak of Christ Church, the Bodleian Library, the Ashmolean Museum and the University, to speak, that is, as if ‘the University’ stood for an extra member of the class of which these other units are members. He was mistakenly allocating the University to the same category as that to which the other institutions belong.

The foreigner committed the category mistake by assuming that the university is a material entity just like the different buildings he saw. He could not understand that the university is a collective whole made up of the different buildings, the students, the staff etc. I will discuss one more example that Ryle gave:

The same mistake would be made by a child witnessing the march-past of a division, who, having had pointed out to him such and such battalions, batteries, squadrons, etc., asked when the division was going to appear. He would be supposing that a division was a counterpart to the units already seen, partly similar to them and partly unlike them. He would be shown his mistake by being told that in watching the battalions, batteries and squadrons marching past he had been watching the division marching past. The march-past was not a parade of battalions, batteries, squadrons and a division; it was a parade of the battalions, batteries and squadrons of a division.

Similar to the foreigner, the child was looking for a separate entity called “the division”. He could not understand that the division is what he is seeing. It was not a parade of battalions, batteries, squadrons and a division; it was a parade of the battalions, batteries and squadrons of a division.

Ryle also gave another example of a visitor who was getting an explanation of the game of cricket. He saw and understood the different players in the field such as the batsman, the bowler, the fielder etc. After he looked at each one of the players, he asked who is in charge of the team spirit. “But there is no one left on the field to contribute the famous element of team-spirit. I see who does the bowling, the batting and the wicket-keeping; but I do not see whose role it is to exercise esprit de corps.” Ryle explained:

Once more, it would have to be explained that he was looking for the wrong type of thing. Team-spirit is not another cricketing-operation supplementary to all of the other special tasks. It is, roughly, the keenness with which each of the special tasks is performed, and performing a task keenly is not performing two tasks. Certainly exhibiting team-spirit is not the same thing as bowling or catching, but nor is it a third thing such that we can say that the bowler first bowls and then exhibits team-spirit or that a fielder is at a given moment either catching or displaying esprit de corps.

The reader would have noticed that I titled the post – The Ghost in the System. I am alluding to the category mistakes we make in systems thinking. Most often we commit the category mistake of assuming that the system is a standalone objective entity. This is an ontological error. We talk of a hospital system or a transportation system as if it is a physical entity that is visible for everyone to see and understand. We talk about optimizing the system or changing the system as if it is a machine that we can repair by swapping out a faulty part for another. In actuality, the system we refer to is a mental construct of how we imagine the different chosen components interact with each other to produce the specific outcomes we are interested in. When we talk of the issues haunting the hospital system, we might mean the long waits we have to endure, or the expensive tests that we had to go through. Each one of us constructs a version of a “system” and yet we use the same term “system” to talk about different aspects. It is a category mistake to assume that we know what the others are saying. Coming back to the example of the hospital system, when we speak of a hospital system, we point to the hospital buildings, the equipment in the hospitals, the waiting rooms, the doctors, the staff, or the patients. But that is not really a hospital system, because a system is a mental construct that is entirely dependent on who is doing the observing. The observer has a specific thing in mind when they use that word. The artifacts are not the system.

Ryle viewed category mistakes as arising from problems in vocabulary. He wrote:

These illustrations of category-mistakes have a common feature which must be noticed. The mistakes were made by people who did not know how to wield the concepts University, division and team-spirit. Their puzzles arose from inability to use certain items in the English vocabulary.

Wittgenstein famously wrote – The limits of my language mean the limits of my world. Our use of language limits what we can know or tell about the world. To go further with this idea, I am looking at the idea of systems from West Churchman’s viewpoint. Churchman advised us that a systems approach begins when first you see the world through the eyes of another. We live in a social realm and by social realm, I mean that we live in a world where “reality” is co-constructed with the other inhabitants of the realm. We define and redefine reality on an ongoing basis through continual interactions with the other cocreators. We should have a model or an image of what we are trying to manage. But if the social realm is cocreated, we need to be aware of others in the realm and treat it as a cocreation rather than an objective reality that we have access to. Systems do not have an objective existence. Each one of us views and constructs systems from our own viewpoint. Thus, how we define a system is entirely dependent on us, the observers. What we have to do is to seek understanding before we rush in to change or optimize a system. The first step is to be aware of the others in the realm. The next step is to seek understanding and see how each one of them views the world. We have to better our vocabulary so that we can speak their language.

There is no ghost in the machine. There is only the machine.

I will finish with a wonderful reflexive nugget from Ryle:

In searching for the self, one cannot be the hunter and the hunted.


In case you missed it, my last post was The Cybernetics of Complexity:
