The Cybernetics of Ohno’s Production System:

In today’s post, I am looking at the cybernetics of Ohno’s Production System. For this, I will start with the idea of ultrastability from one of the pioneers of cybernetics, Ross Ashby. It should be noted that I am definitely inspired by Ashby’s ideas and thus may take some liberties with them.

Ashby defined a system as a collection of variables chosen by an observer. “Ultrastability” can be defined as the ability of a system to change its internal organization or structure in response to environmental conditions that threaten to disturb a desired behavior or value of an essential variable (Klaus Krippendorff). Ashby observed that an ultrastable system, when disturbed from a state of stability (equilibrium) by the environment, is able to get back to that state of equilibrium. Let’s look at the example of an organism and its environment. The organism is able to survive, or stay viable, by making sure that certain variables, such as internal temperature, blood pressure, etc., stay within a specific range. Ashby referred to these variables as essential variables. When the essential variables go outside their range, the viability of the organism is compromised. Ashby noted:

That an animal should remain ‘alive’, certain variables must remain within certain ‘physiological’ limits. What these variables are, and what the limits, are fixed when the species is fixed. In practice one does not experiment on animals in general, one experiments on one of a particular species. In each species the many physiological variables differ widely in their relevance to survival. Thus, if a man’s hair is shortened from 4 inches to 1 inch, the change is trivial; if his systolic blood pressure drops from 120 mm. of mercury to 30, the change will quickly be fatal.

Ashby noted that the organism affects the environment, and the environment affects the organism: such a system is said to have feedback. Here the environment does not simply mean the space around the organism. Ashby had a specific definition: given an organism, its environment is defined as those variables whose changes affect the organism, and those variables which are then changed by the organism’s behavior. It is thus defined in a purely functional, not a material, sense. The reacting part is the sensory-motor framework of the organism. The feedback between the reacting part (R) of an organism (Orgm.) and the environment (Envt.) is depicted below:

Ashby explains this using the example of a kitten resting near a fire. The kitten settles at a safe distance from the fire. If a lump of hot coal falls near the kitten, the environment is threatening to have a direct effect on the essential variables. If the kitten’s brain does nothing, the kitten will get burned. The kitten, being an ultrastable system, is able to use the correct mechanism – move away from the hot coal – and keep its essential variables in check. Ashby proposed that an ultrastable system has two feedbacks: one feedback operates frequently, while the other operates infrequently, when the essential variables are threatened. The two feedback loops are needed for a system to get back into equilibrium. This is also how the system can learn and adapt. Paul Pangaro and Michael C. Geoghegan note:

What are the minimum conditions of possibility that must exist such that a system can learn and adapt for the better, that is, to increase its chance of survival? Ashby concludes via rigorous argument that the system must have minimally two feedback loops, or double feedback… The first feedback loop, shown on the left side and indicated via up/down arrows, ‘plays its part within each reaction/behavior.’ As Ashby describes, this loop is about the sensory and motor channels between the system and the environment, such as a kitten that adjusts its distance from a fire to maintain warmth but not burn up. The second feedback loop encompasses both the left and right sides of the diagram, and is indicated via long black arrows. Feedback from the environment is shown coming into an icon for a meter in the form of a round dial, signifying that this feedback is measurable insofar as it impinges on the ‘essential variables.’

Ashby depicted his ultrastable system as below:

The first feedback loop can be thought of as a mechanism that cannot change itself. It is static, while the second feedback loop is able to operate on some parameters so that the structure can change, resulting in a new behavior. The second feedback loop acts only when the essential variables are challenged or when the system is not in equilibrium. It must be noted that no decisions are being made within the first feedback loop. It is simply an action mechanism: it keeps doing what was working before, while the second feedback loop alters the action mechanism to produce a new behavior. If the new behavior is successful in maintaining the essential variables, the new action is continued until it is no longer effective. When the system is able to counter the threatening situation posed by the environment, it is said to have requisite variety. The law of requisite variety was proposed by Ashby as – only variety can absorb variety. The system must have the requisite variety (in terms of available actions) to counter the variety thrown upon it by the environment. The environment always possesses far more variety than the system. The system must find ways to attenuate the variety coming in, and amplify its own variety, to maintain the essential variables.
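The double-loop arrangement can be sketched in a few lines of code. The following is a toy simulation of my own construction (not Ashby’s actual homeostat, and all the numbers are made up): the first loop applies a fixed reaction every step, and the second loop fires only when the essential variable leaves its viable limits, randomly swapping in a new reaction until equilibrium returns.

```python
import random

LOW, HIGH = 20.0, 40.0    # viable limits of the essential variable
SETPOINT = 30.0

# Behavior repertoire: each behavior is a fixed reaction gain.
# A wrong-sign gain is like a miswired autopilot: it pushes the
# essential variable further away instead of correcting it.
GAINS = [-0.5, -0.2, 0.1, 0.3, 0.5]

def ultrastable(steps=300, temp=30.0, seed=1):
    rng = random.Random(seed)
    gain = rng.choice(GAINS)
    switches = 0                                 # how often the second loop fired
    for _ in range(steps):
        disturbance = rng.uniform(-2.0, 2.0)     # the environment acts on the system
        temp += disturbance + gain * (SETPOINT - temp)  # first loop: fixed reaction
        if not (LOW <= temp <= HIGH):            # essential variable threatened:
            gain = rng.choice(GAINS)             # second loop "tries something else"
            switches += 1
            temp = min(max(temp, LOW), HIGH)     # the system barely survives
    return temp, gain, switches

temp, gain, switches = ultrastable()
print(round(temp, 1), gain, switches)
```

Once a workable gain is selected, the essential variable stays within its limits and the second loop goes quiet: the behavior "locks in" until the environment changes enough to dislodge it.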

Let’s look at this with the easy example of a baby. When the baby experiences any sort of discomfort, it starts crying. The crying is the behavior that helps put it back into equilibrium (removal of discomfort), since it gets the attention of its mother or other family members. As the baby grows, its desired variables also become more specific (food, water, love, etc.). The action of crying does not always get it what it is looking for. Here the second feedback loop comes in: it tries a new behavior and sees if it results in a better outcome. This behavior could be pointing at something, or even learning and using words. The new action is kept and used as long as it remains successful. The baby/child learns and adapts as needed to meet its own wants and desires.

Pangaro and Geoghegan note that the idea of an ultrastable system is also applicable in the social realm:

To evoke the social arena, we call the parameters ‘behavior fields.’ When learning by trial-and-error, a behavior field is selected at random by the system, actions are taken by the system that result in observable behaviors, and the consequences of these actions in the environment are in turn registered by the second feedback loop. If the system is approaching the danger zone, and the essential variables begin to go outside their acceptable limits, the step function says, ‘try something else’—repeatedly, if necessary—until the essential variables are stabilized and equilibrium is reached. This new equilibrium is the learned state, the adapted state, and the system locks-in.

It is important to note that the first feedback loop is the overt behavior that is locked in. The system cannot change this unless the second feedback loop is engaged. Stuart Umpleby cites Ashby’s example of an autopilot to explain this further:

In his theory of adaptation two feedback loops are required for a machine to be considered adaptive (Ashby 1960).  The first feedback loop operates frequently and makes small corrections.  The second feedback loop operates infrequently and changes the structure of the system, when the “essential variables” go outside the bounds required for survival.  As an example, Ashby proposed an autopilot.  The usual autopilot simply maintains the stability of an aircraft.  But what if a mechanic miswires the autopilot?  This could cause the plane to crash.  An “ultrastable” autopilot, on the other hand, would detect that essential variables had gone outside their limits and would begin to rewire itself until stability returned, or the plane crashed, depending on which occurred first. The first feedback loop enables an organism or organization to learn a pattern of behavior that is appropriate for a particular environment.  The second feedback loop enables the organism to perceive that the environment has changed and that learning a new pattern of behavior is required.

Ohno’s Production System:

Once I saw that the idea of an ultrastable system may be applied to the social realm, I wanted to see how it could be applied to Ohno’s Production System. Taiichi Ohno is regarded as the father of the famous Toyota Production System. Before it was the “Toyota Production System”, it was Ohno’s Production System. Taiichi Ohno was inspired by the challenge issued by Kiichiro Toyoda, the founder of Toyota Motor Corporation: catch up with America in three years in order to survive. Ohno built his ideas with inspiration from Sakichi Toyoda, Kiichiro Toyoda, Henry Ford and the supermarket system. Ohno did a lot of trial and error, and he made sure the ideas he implemented were followed. Ohno was called “Mr. Mustache”. The operators thought of Ohno as an eccentric. They used to joke that military men wore mustaches during World War II, and that it was rare to see a Japanese man with facial hair afterward. “What’s Mustache up to now?” became a common refrain at the plant as Ohno carried out his studies. (Source: Against All Odds, Togo and Wartman)

His ideas were not easily understood by others. He had to tell others that he would take responsibility for the outcomes in order to convince them to follow his ideas. Ohno could not completely make others understand his vision, since his ideas were novel and not always the norm. Ohno was persistent, and he made improvements slowly and steadily. He would later talk about the idea of Toyota being slow and steady like the tortoise. Ohno loved what he did, and he had tremendous passion pushing him forward with his vision. As noted, his ideas were based on trial and error, and were thus perceived as counter-intuitive by others.

Ohno can be viewed as part of the second feedback loop and the assembly line as part of the first feedback loop, while the survivability of the company via the metrics of cost, quality, productivity, etc. can be viewed as the “essential variables”. Ohno implemented the ideas of kanban, jidoka, etc. on the line, and they were followed. The assembly line could not change the mechanisms established as part of Ohno’s production system. Ohno’s production system can be viewed as a closed system in that the framework is static. Ohno watched how the interactions with the environment went, and how the essential variables were being impacted. Based on this, the existing behaviors were either changed slightly, or replaced entirely, until the desired equilibrium was achieved.

Here the production system framework is static because it cannot change itself. The assembly line where it is implemented is closed to changes at a given time. It is “action oriented” without decision powers to make changes to itself. There is no point in copying the framework unless you have the same problems that Ohno faced.

Umpleby also describes the idea of the double feedback loop in terms of quality improvement similar to what we have discussed:

The basic idea of quality improvement is that an organization can be thought of as a collection of processes. The people who work IN each process should also work ON the process, in order to improve it. That is, their day-to-day work involves working IN the process (the first, frequent feedback loop). And about once a week they meet as a quality improvement team to consider suggestions and to design experiments on how to improve the process itself. This is the second, less frequent feedback loop that leads to structural changes in the process. Hence, process improvement methods, which have been so influential in business, are an illustration of Ashby’s theory of adaptation.

This follows the idea of kairyo and kaizen in the Toyota Production System.

Final Words:

It is important to note that Ohno’s Production System is not Toyota Production System is not Toyota’s Production System is not Lean. Ohno’s Production System evolved into the Toyota Production System. Toyota’s production system is emergent, while the Toyota Production System is not. The Toyota Production System’s framework can be viewed as a closed system, in the sense that the framework is static. At the same time, the different plants implementing the framework are dynamic, due to the simple fact that they exist in an ever-changing environment. For an organization to adapt to an ever-changing environment, it needs to be ultrastable. An organization can have several ultrastable systems connected with each other, resulting in homeostasis. I will finish with an excellent quote from Mike Jackson:

The organization should have the best possible model of the environment relevant to its purposes… the organization’s structure and information flows should reflect the nature of that environment so that the organization is responsive to it.

Please maintain social distance and wear masks. Stay safe and Always keep on learning…


The Cybernetics of a Society:

In today’s post, I will be following the thoughts from my previous post, Consistency over Completeness. We were looking at each one of us being informationally closed, and computing a stable reality. The stability comes from the recursive computations of what is being observed. I hope to expand the idea of stability from an individual to a society in today’s post.

Humberto Maturana, the cybernetician biologist (or biologist cybernetician) said – anything said is said by an observer. Heinz von Foerster, one of my heroes in cybernetics, expanded this and said – everything said is said to an observer. Von Foerster’s thinking was that language is not monologic but always dialogic. He noted:

The observer as a strange singularity in the universe does not attract me… I am fascinated by images of duality, by binary metaphors like dance and dialogue where only a duality creates a unity. Therefore, the statement – “Anything said is said by an observer” – is floating freely, in a sense. It exists in a vacuum as long as it is not embedded in a social structure because speaking is meaningless, and dialogue is impossible, if no one is listening. So, I have added a corollary to that theorem, which I named with all due modesty Heinz von Foerster’s Corollary Nr. 1: “Everything said is said to an observer.” Language is not monologic but always dialogic. Whenever I say or describe something, I am after all not doing it for myself but to make someone else know and understand what I am thinking of or intending to do.

Heinz von Foerster’s great insight was perhaps inspired by the works of his distant relative and the brilliant philosopher, Ludwig Wittgenstein. Wittgenstein proposed that language is a very public matter, and that a private language is not possible. The meaning of a word, such as “apple” does not inherently come from the word “apple”. The meaning of the word comes from how it is used. The meaning comes from repeat usage of the word in a public setting. Thus, even though the experience of an apple may be private to the individual, how we can describe it is by using a public language. Von Foerster continues:

When other observers are involved… we get a triad consisting of the observers, the languages, and the relations constituting a social unit. The addition produces the nucleus and the core structure of society, which consists of two people using language. Due to the recursive nature of their interactions, stabilities arise, they generate observers and their worlds, who recursively create other stable worlds through interacting in language. Therefore, we can call a funny experience apple because other people also call it apple. Nobody knows, however, whether the green color of the apple you perceive, is the same experience as the one I am referring to with the word green. In other words, observers, languages, and societies are constituted through recursive linguistic interaction, although it is impossible to say which of these components came first and which were last – remember the comparable case of hen, egg and cock – we need all three in order to have all three.

Klaus Krippendorff defined closure as follows – a system is closed if it provides its own explanation and no references to an input are required. With closures, recursions are a good and perhaps the only way to interact. As organizationally closed entities, we are able to stay viable only as part of a social realm. When we are part of a social realm, we have to construct reality with reference to something external. Understanding is still generated internally, but with an external point of reference. This adds to the reality of the social realm as a collective. If the society is to have an identity that is sustained over time, its viability must come from its members. Like a set of nested dolls, society’s structure comes from participating individuals who are themselves embedded recursively in the societal realm. The structure of the societal or social realm is not designed, but emergent from the interactions, desires, goals, etc. of the individuals. The society is able to live on while the individuals come and go.

I am part of someone else’s environment, and I add to the variety of their environment with my decisions and actions (sometimes inactions). This is an important reminder for us to hold onto in light of recent world events including a devastating pandemic. I will finish with some wise words from Heinz von Foerster:

A human being is a human being together with another human being; this is what a human being is. I exist through another “I”, I see myself through the eyes of the Other, and I shall not tolerate that this relationship is destroyed by the idea of the objective knowledge of an independent reality, which tears us apart and makes the Other an object which is distinct from me. This world of ideas has nothing to do with proof, it is a world one must experience, see, or simply be. When one suddenly experiences this sort of communality, one begins to dance together, one senses the next common step and one’s movements fuse with those of the other into one and the same person, into a being that can see with four eyes. Reality becomes communality and community. When the partners are in harmony, twoness flows like oneness, and the distinction between leading and being led has become meaningless.



Source – The Certainty of Uncertainty: Dialogues Introducing Constructivism by Bernhard Poerksen

Consistency over Completeness:

Today’s post is almost a follow-up to my earlier post – The Truth about True Models. In that post, I talked about Dr. Donald Hoffman’s idea of the Fitness-Beats-Truth or FBT Theorem. Loosely put, the idea behind the FBT Theorem is that we have evolved not to have “true” perceptions of reality. We survived because we had “fitness”-based models, not “true” models. In today’s post, I am continuing this idea using the ideas of Heinz von Foerster, one of my cybernetics heroes.

Heinz von Foerster came up with “the postulate of epistemic homeostasis”. This postulate states:

The nervous system as a whole is organized in such a way (organizes itself in such a way) that it computes a stable reality.

It is important to note here that we are speaking about computing “a” reality and not “the” reality. Our nervous system is informationally closed (to follow up from the previous post). This means that we do not have direct access to the reality outside. All we have is what we can perceive through our perception framework. The famous philosopher Immanuel Kant referred to this as the noumena (the reality that we don’t have direct access to) and the phenomena (the perceived representation of the external reality). All we can do is compute a reality based on our interpretive framework. This is just a version of reality, and the reality each one of us computes is unique to us.

The other concept to make note of is the “stable” part of the stable reality. In Godelian* speak, our nervous system cares more about consistency than completeness. When we encounter a phenomenon, our nervous system looks at stable correlations from the past and present, and computes a sensation that confirms the perceived representation of the phenomenon. Von Foerster gives the example of a table. We can see the table, and we can touch it, and maybe bang on it. With each of these confirmations and correlations between the different sensory inputs, the table becomes more and more a “table” to us.

*Kurt Godel, one of the famous logicians of the last century, came up with the idea that any formal system able to do elementary arithmetic cannot be both complete and consistent; it is either incomplete or inconsistent.

From the cybernetics standpoint, we are talking about an observer and the observed. The interaction between the observer and the observed is an act of computing a reality. The first step in computing a reality is making distinctions. If there are no distinctions, everything about the observed will be uniform, and no information can be processed by the observer. The distinctions refer to the variety of the observed: the more distinctions there are, the more variety the observed has. From a second-order cybernetics standpoint, the variety of the observed depends upon the variety of the observer. This goes back to the earlier point about each of us computing a unique stable reality. Each one of us is unique in how we perceive things. This is our variety as the observer. The observed, that which is external to us, always has more potential variety than us. We cut down, or attenuate, this high variety by choosing certain attributes that interest us. Once the distinctions are made, we find relations between these distinctions to make sense of it all. This corresponds to the confirmations and correlations that we noted above in the example of a table.

We are able to survive in our environment because we are able to continuously compute a stable reality. The stability comes from the recursive computations of what is being observed. For example, let’s go back to the example of the table. Our eyes receive the sensory input of the image of the table. This is a first computation. This sensory image then goes up the “neurochain”, where it is computed again. This happens again and again, as the input gets “decoded” at each level, until it is satisfactorily decoded by our nervous system. The final result is a computation of a computation of a computation, and so on. The stability is achieved from this recursion.
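This recursive route to stability can be illustrated with a toy computation (my own analogy, not von Foerster’s example): apply the same “interpretation” operator to its own output until the result stops changing. Von Foerster called such stable outcomes of recursion “eigenvalues” or “eigenbehaviors”; here the operator is simply the cosine function.

```python
import math

def stabilize(interpret, signal, tol=1e-10, max_steps=1000):
    """Recursively compute interpret(interpret(...(signal))) until stable."""
    for _ in range(max_steps):
        new = interpret(signal)      # a computation of a computation ...
        if abs(new - signal) < tol:  # no further change: a stable "reality"
            return new
        signal = new
    return signal

# Different starting inputs converge to the same stable value:
# the fixed point of cos, roughly 0.739085.
a = stabilize(math.cos, 0.5)
b = stabilize(math.cos, 1.4)
print(a, b)
```

The stable value depends on the operator, not on the starting input, which loosely mirrors the claim above: the stability lies in the recursion of the observer’s computations, not in any single sensory input.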

The idea of consistency over completeness is quite fascinating. It follows mainly from the inability of our nervous system to have a true representation of reality. There is a common belief that we live with uncertainty, yet our nervous system strives to provide us a stable version of reality, one that is devoid of uncertainties. We are able to think about this only from a second-order standpoint. We are able to ponder our cognitive blind spots because we are able to do second-order cybernetics. We are able to think about thinking. We are able to put ourselves into the observed. Second-order cybernetics is the study of observing systems, where the observers themselves are part of the observed system.

I will leave the reader with a final thought: the act of observing oneself is also a computation of “a” stable reality.



Wittgenstein and Autopoiesis:

In Tractatus Logico-Philosophicus, Wittgenstein wrote the following:

“The world of the happy man is a different one from that of the unhappy man.”

He also noted that, if a lion could talk, we would not understand him.

As a person very interested in cybernetics, I am looking at what Wittgenstein said in the light of autopoiesis. Autopoiesis is the brainchild mainly of two Chilean biologist cyberneticians, Humberto Maturana and Francisco Varela. Autopoiesis was put forth as the joining of two Greek words: “auto”, meaning self, and “poiesis”, meaning creating. I have talked about autopoiesis here. I am most interested in autopoiesis’ idea of “organizational closure” for this post. An entity is organizationally closed when it is informationally tight. In other words, autopoietic entities maintain their identities by remaining informationally closed to their surroundings. We human beings are autopoietic entities. We cannot take in information as a commodity. We generate meaning within ourselves based on experiencing external perturbations. Information does not enter our brain from outside.

Let’s take the example of me looking at a blue light bulb. I interpret the presence of the blue light as being blue when my eyes are hit with the light. The light does not inform my brain; rather, my brain interprets the light as blue based on all the previous similar interactions I have had. There is no qualitative information coming to my brain saying that it is a blue light; rather, my brain interprets it as a blue light. The light is “informative” rather than being a commodity piece of information. As cybernetician Bernard Scott noted:

…an organism does not receive “information” as something transmitted to it, rather, as a circularly organized system it interprets perturbations as being informative.

All of my previous interactions/perturbations with the light, and others explaining those interactions as being “blue light”, generated a structural coupling so that my brain perceives a new similar perturbation as being “blue light”. This also brings up another interesting idea from Wittgenstein: we cannot have a private language. One person alone cannot invent a private language. All we have is public language, one that is reinterpreted and reinforced with repeated interactions. The sensation that we call “blue light” is an experience that is 100% unique to me as the interpreter. This supports the concept of autopoiesis as well. We cannot “open” ourselves to others so that they can see what is going on inside our head/mind.

Our interpretive framework, which we use to make sense of the perturbations hitting us, is a result of all our past experiences and reinforcements. Our interpretive framework is unique to us Homo sapiens. We share a similar interpretive framework, but the actual results from our interpretive frameworks are unique to each one of us. It is because of this that, even if a lion could talk to us, we would not be able to understand it, at least not at the start. We lack the interpretive framework to understand it. The uniqueness of our interpretive framework is also the reason we feel differently about the same experiences. This is the reason that, as a happy person, we cannot understand the world of a sad person, and vice versa.

Our brain makes sense based on the sensory perturbation and the interpretive framework it already has. A good example to think about this is the images that fall on our retina. The images are upside down, but we are able to “see” right side up. This is possible due to our structural coupling. What happens if there is a new sensory perturbation? We can only make sense of what we know. If we face a brand-new perturbation, we can make sense of it only in terms of what we know. The more we know, the more we are further able to know. As we face the same perturbation repeatedly, we are able to “better” experience it, and describe it to ourselves in a richer manner. With enough repeat interactions, we are finally able to experience it in our own unique manner. From this standpoint, there is no mind-body separation. The “mind” and “body” are both part of the same interpretive framework.

I will leave with another thought experiment to spark these ideas in the reader’s mind. There has always been talk about aliens. From what Wittgenstein taught us, when we meet the aliens, will we be able to understand each other?

I recommend the following posts to the reader to expand upon this post:

If a Lion Could Talk:

The System in the Box:

A Study of “Organizational Closure” and Autopoiesis:


When is a Model Not a Model?

Ross Ashby, one of the pioneers of Cybernetics, started an essay with the following question:

I would like to start not at: How can we make a model?, but at the even more primitive question: Why make a model at all?

He came up with the following answer:

I would like then to start from the basic fact that every model of a real system is in one sense second-rate. Nothing can exceed, or even equal, the truth and accuracy of the real system itself. Every model is inferior, a distortion, a lie. Why then do we bother with models? Ultimately, I propose, we make models for their convenience.

To go further with this idea: we make models to come up with a way to describe how things work. This also lets us answer the question – what happens when…? If there is no predictive or explanatory power, there is no use for the model. From a cybernetics standpoint, we are not interested in “What is this thing?”, but in “What does this thing do?” We never try to completely understand a “system”. We understand it in chunks, the chunks that we are interested in. We construct a model in our heads that we call a “system” to make sense of how we think things work out in the world. We only care about certain specific interactions and their outcomes.

One of the main ideas that Ashby proposed was the idea of variety. Loosely put, variety is the number of available states a system has. For example, a switch has a variety of two – ON or OFF. A stop light has a variety of three (generally) – Red, Yellow or Green. As we increase the complexity, the variety also increases. The variety is dependent on the ability of the observer to discern the states. A keen-eyed observer can discern a higher number of states for a phenomenon than another observer. Take the example of the great fictional characters, Sherlock Holmes and John Watson. Holmes is able to discern more variety than Watson when they come upon a stranger. Holmes is able to tell the most amazing details about the stranger that Watson cannot. When we construct a model, the model lacks the original variety of the phenomenon we are modeling. This is important to keep in mind. The external variety is always much larger than the internal variety of the observer. The observer simply lacks the ability to tackle the extremely high amount of variety. To address this, the observer removes or attenuates the unwanted variety of the phenomenon and constructs a simpler model. For example, when we talk about a healthcare system, the model in our mind is pretty simple: one hospital, some doctors, some patients, and so on. It does not include the millions of patients, the computer system, the cafeteria, the janitorial service, etc. We only look at the variables that we are interested in.
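The switch and stop-light examples can be made concrete in a few lines of code. This is a small sketch of my own: variety as a count of distinguishable states, combination of components multiplying variety, and attenuation as the observer projecting away the variables they do not care about.

```python
from itertools import product

switch = {"ON", "OFF"}               # variety 2
light = {"Red", "Yellow", "Green"}   # variety 3

# Varieties multiply as components are combined, which is one way to see
# why the environment's variety quickly dwarfs the observer's.
combined = set(product(switch, light))
print(len(switch), len(light), len(combined))   # 2 3 6

# The observer attenuates: the model keeps only the variable of interest.
def attenuate(state):
    return state[1]    # ignore the switch, keep only the light color

model_states = {attenuate(s) for s in combined}
print(len(model_states))   # 3
```

A few dozen interacting components would already produce more combined states than any observer could track, which is why the attenuation step is unavoidable.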

Ashby explained this very well:

Another common aim that will have to be given up is that of attempting to “understand” the complex system; for if “understanding” a system means having available a model that is isomorphic with it, perhaps in one’s head, then when the complexity of the system exceeds the finite capacity of the scientist, the scientist can no longer understand the system—not in the sense in which he understands, say, the plumbing of his house, or some of the simple models that used to be described in elementary economics.

A crude depiction of model-making is shown below. The observer has chosen certain variables that are of interest, and created a similar “looking” version as the model.

Ashby elaborated on this idea as:

We transfer from system to model to lose information. When the quantity of information is small, we usually try to conserve it; but when faced with the excessively large quantities so readily offered by complex systems, we have to learn how to be skillful in shedding it. Here, of course, model-makers are only following in the footsteps of the statisticians, who developed their techniques precisely to make comprehensible the vast quantities of information that might be provided by, say, a national census. “The object of statistical methods,” said R. A. Fisher, “is the reduction of data.”

There is an important saying from Alfred Korzybski – the map is not the territory. His point was that we should not take the map to be the real thing. An important corollary to this, as a model-maker, is:

If the model is the same as the phenomenon it models, it fails to serve its purpose. 

The usefulness of the model is in it being an abstraction. This is mainly due to the observer not being able to handle the excess variety thrown at them. This also answers one part of the question posed in the title of this post – A model ceases to be a model when it is the same as the phenomenon it models. The second part of the answer is that the model has to have some similarities to the phenomenon, and this is entirely dependent on the observer and what they want.

This brings me to the next important point – we can only manage models. We don’t manage the actual phenomenon; we only manage the models of the phenomenon in our heads. The reason, again, is that we lack the ability to manage the variety thrown at us.

The eminent management cybernetician, Stafford Beer, has the following words of wisdom for us:

Instead of trying to specify it in full detail, you specify it only somewhat. You then ride on the dynamics of the system in the direction you want to go.

To paraphrase Ashby, we need not collect more information than is necessary for the job. We do not need to attempt to trace the whole chain of causes and effects in all its richness, but attempt only to relate controllable causes with ultimate effects.

The final aspect of model-making is to take into consideration the temporary nature of the model. Again, paraphrasing Ashby – We should not assume the system to be absolutely unchanging. We should accept frankly that our models are valid merely until such time as they become obsolete.

Final Words:

We need a model of the phenomenon to manage the phenomenon. And how we model the phenomenon depends upon our ability as the observer to manage variety. We only need to choose the specific variables that we want. Perhaps I can explain this further with the deep philosophical question – if a tree falls in a forest and no one is around to hear it, does it make a sound? The answer to a cybernetician should be obvious at this point. Whether there is a sound or not depends on the model you have, and on whether the tree falling making a sound has any value to you.

Please maintain social distance and wear masks. Stay safe and Always keep on learning…

In case you missed it, my last post was The Maximum Entropy Principle:

The Maximum Entropy Principle:

In today’s post, I am looking at the Maximum Entropy principle, a brainchild of the eminent physicist E. T. Jaynes. This idea is based on Claude Shannon’s Information Theory. The Maximum Entropy principle (an extension of the Principle of Insufficient Reason) is the ideal epistemic stance. Loosely put, we should model only what is known, and we should assign maximum uncertainty for what is unknown. To explain this further, let’s look at an example of a coin toss.

If we don’t know anything about the coin, our prior assumption should be that heads or tails are equally likely to happen. This is a stance of maximum entropy. If we assumed that the coin was loaded, we would be trying to “load” our assumption model, and claim unfair certainty.

Entropy is a measure proposed by Claude Shannon as part of his information theory. Low entropy messages have low information content or low surprise content. High entropy messages, on the other hand, have high information content or high surprise content. The information content of an event also falls as its probability rises: low probability events have high information content. For example, an unlikely defeat of a reigning sports team generates more surprise than a likely win. Entropy is the average level of information when we consider all of the probabilities. In the case of the coin toss, the entropy is the average level of information when we consider the probabilities of heads and tails.

For discrete events, the entropy is maximum for equally likely events, or in other words for the uniform distribution. Thus, when we say that the probability of heads or tails is 0.5, we are assuming a maximum entropy model. In the case of the uniform distribution, the maximum entropy model also coincides with Laplace’s principle of insufficient reason. If the coin always landed on heads, we have a zero entropy case because there is no new information available. If it is a loaded coin that makes one side more likely to occur, then the entropy is lower than for a fair coin. This is shown below, where the X-axis is the probability of Heads, and the Y-axis is the information entropy. We can see that Pr(0), or no Heads, and Pr(1), or 100% Heads, have zero entropy. The highest value of entropy occurs when the probability of heads is 0.5, or 50%. For those who are interested, John von Neumann had a great idea to make a loaded coin fair. You can check out that here.
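The entropy curve described here can be sketched with the binary entropy function (a minimal illustration; the function name is my own):

```python
import math

def binary_entropy(p):
    """Shannon entropy (in bits) of a coin with P(heads) = p."""
    if p in (0.0, 1.0):
        return 0.0  # the outcome is certain, so there is no surprise
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(binary_entropy(0.5))  # fair coin: 1.0 bit, the maximum
print(binary_entropy(0.9))  # loaded coin: less than 1 bit
print(binary_entropy(1.0))  # always heads: 0.0 bits
```

Evaluating this function across p from 0 to 1 reproduces the curve described above: zero at the endpoints, peaking at p = 0.5.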

From this standpoint, if we take a game, where one team is more favored to win, we could say that the most informative part of a game is sometimes the coin toss.

Let’s consider the case of a die. There are six possible events (1 through 6) when we roll a die. The maximum entropy model is to assume a uniform distribution, i.e., to assign 1/6 as the probability for each of the values 1 through 6. Now suppose we somehow knew that 6 is more likely to happen. For example, suppose the manufacturer of the loaded die says that the number 6 is likely to occur 3/6 of the time. Per the maximum entropy model, we should divide the remaining 3/6 equally among the remaining five numbers. With each additional piece of information, we should change our model so that the entropy is at its maximum. What I have discussed here is the basic information regarding maximum entropy. Each new piece of “valid” information that we need to incorporate into our model is called a constraint. The maximum entropy approach utilizes Lagrangian multipliers to find the solutions. For discrete events, with no additional information, the maximum entropy model is the uniform distribution. In a similar vein, if you are looking at a continuous distribution, and you know its mean and variance, the maximum entropy model is the normal distribution.
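The die example can be sketched as follows (a toy illustration: here the constraint P(6) = 3/6 is imposed directly, so no Lagrangian machinery is needed):

```python
import math

def entropy(dist):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

# No information: the uniform distribution over the six faces.
uniform = [1 / 6] * 6

# Constraint: P(6) = 3/6. Maximum entropy spreads the remaining
# probability mass equally over the other five faces.
loaded = [(3 / 6) / 5] * 5 + [3 / 6]

print(entropy(uniform))  # log2(6), about 2.585 bits: the maximum possible
print(entropy(loaded))   # lower: the constraint has reduced our uncertainty
```

Each added constraint lowers the achievable maximum entropy, which matches the idea that valid information reduces uncertainty.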

The Role of The Observer:

Jaynes asked a great question about the information content of a message. He noted:

In a communication process, the message m(i) is assigned probability p(i), and the entropy H, is a measure of information. But WHOSE information?… The probabilities assigned to individual messages are not measurable frequencies; they are only a means of describing a state of knowledge.

The general idea of probability in the frequentist’s version of statistics is that it is fixed. However, in the Bayesian version, the probability is not a fixed entity. It represents a state of knowledge. Jaynes continues:

Entropy, H, measures not the information of the sender, but the ignorance of the receiver that is removed by the receipt of the message.

To me, this brings up the importance of the observer and circularity. As the great cybernetician Heinz von Foerster said:

“The essential contribution of cybernetics to epistemology is the ability to change an open system into a closed system, especially as regards the closing of a linear, open, infinite causal nexus into closed, finite, circular causality.”

Let’s go back to the example of a coin. If I were an alien and knew nothing about coins, should my maximum entropy model include only the two possibilities of heads or tails? Why should it not include the coin landing on its edge? Or, if a magician is tossing the coin, should I account for the coin vanishing into thin air? The assumption of just two possibilities (heads or tails) is the prior information we are accounting for when we say that the probability of heads or tails is 0.5. As we gain more knowledge about the coin toss, we can update the model to reflect it, and at the same time change the model to a new state of maximum entropy. This iterative, closed-loop process is the backbone of scientific enquiry and skepticism. The use of the maximum entropy model is a stance that we are taking to state our knowledge. Perhaps a better way to explain the coin toss is that, given our lack of knowledge about the coin, we are saying that heads is not more likely to happen than tails until we find more evidence. Let’s look at another interesting example where I think the maximum entropy model comes up.
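The observer-dependence of the model can be sketched in a few lines (a minimal illustration; the function name is my own). With no other constraints, the maximum entropy model is simply uniform over whatever outcomes the observer admits:

```python
def max_entropy_model(outcomes):
    """With no other constraints, the maximum entropy model is
    the uniform distribution over the admitted outcomes."""
    p = 1 / len(outcomes)
    return {outcome: p for outcome in outcomes}

# Two observers with different knowledge get different models:
print(max_entropy_model(["heads", "tails"]))
print(max_entropy_model(["heads", "tails", "edge"]))
```

The model changes as the set of admitted possibilities changes, even before any data is observed.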

The Veil of Ignorance:

The veil of ignorance is an idea about ethics proposed by the great American Political philosopher, John Rawls. Loosely put, in this thought experiment, Rawls is asking us what kind of society should we aim for? Rawls asks us to imagine that we are behind a veil of ignorance, where we are completely ignorant of our natural abilities, societal standing, family etc. We are then randomly assigned a role in society. The big question then is – what should society be like where this random assignment promotes fairness and equality? The random assignment is a maximum entropy model since any societal role is equally likely.

Final Words:

The maximum entropy principle is a way of saying not to put all of your eggs in one basket. It is a way to be aware of your biases, and it is an ideal position for learning. It is similar to Epicurus’ principle of Multiple Explanations, which says – “Keep all the different hypotheses that are consistent with the facts.”

It is important to understand that “I don’t know,” is a valid and acceptable answer. It marks the boundary for learning.

Jaynes explained maximum entropy as follows:

The maximum entropy distribution may be asserted for the positive reason that it is uniquely determined as the one which is maximally noncommittal with regard to missing information, instead of the negative one that there was no reason to think otherwise… Mathematically, the maximum entropy distribution has the important property that no possibility is ignored; it assigns positive weight to every situation that is not absolutely excluded by the given information.

We learned that probability and entropy are dependent on the observer. I will finish off with the wise words from James Dyke and Axel Kleidon.

Probability can now be seen as assigning a value to our ignorance about a particular system or hypothesis. Rather than the entropy of a system being a particular property of a system, it is instead a measure of how much we know about a system.


In case you missed it, my last post was Destruction of Information/The Performance Paradox:

Destruction of Information/The Performance Paradox:

Ross Ashby was one of the pioneers of Cybernetics. His 1956 book, An Introduction to Cybernetics, is still one of the best introductions to Cybernetics. As I was researching his journals, I came across an interesting phrase – “destruction of information.” Ashby noted:

I am not sure whether I have stated before my thesis – that the business of living things is the destruction of information.

Ashby gave several examples to explain what he meant by this. For example:

Consider a thermostat controlling a room’s temperature. If it is working well, we can get no idea, from the temperature of the room, whether it is hot or cold outside. The thermostat’s job is to stop this information from reaching the occupant.

He also gave the example of an antiaircraft gun and its predictor. Suppose we observe only the error made by each shell in succession. If the predictor is perfect, we shall get the sequence 0, 0, 0, 0, etc. By examining this sequence, we can get no information about how the aircraft maneuvered. Contrast this with the record of a poor predictor: 2, 1, 2, 3, … -3, 0, 3, etc. By examining this, we can get quite a good idea of how the pilot maneuvered. In general, the better the predictor, the less the maneuvers show in the errors. The predictor’s job is to destroy this information.
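The predictor example can be illustrated with a rough sketch that computes the empirical Shannon entropy (bits per symbol) of each error record (the function name and the two sample sequences are my own, modeled on Ashby's):

```python
import math
from collections import Counter

def sequence_entropy(seq):
    """Empirical Shannon entropy (bits per symbol) of a sequence,
    based on the observed symbol frequencies."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

perfect_predictor_errors = [0, 0, 0, 0, 0, 0, 0]
poor_predictor_errors = [2, 1, 2, 3, -3, 0, 3]

print(sequence_entropy(perfect_predictor_errors))  # 0.0: nothing to learn
print(sequence_entropy(poor_predictor_errors))     # positive: the errors leak information
```

The perfect predictor's error record carries zero entropy, so an observer examining it learns nothing about the aircraft's maneuvers.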

As an observer, we learn about a living system or a phenomenon by the variety it displays. Here, variety can be loosely expressed as the number of distinct states a system has. Interestingly, the number of states or the variety is dependent upon the system demonstrating it, as well as the observer’s ability to distinguish the different states. If the observer is not able to make the needed number of distinctions, then less information is generated. On the other hand, if the system of interest is able to hide its different states, it minimizes the amount of information available for the observer. In this post, we are interested in the latter category. Ashby talks about an interesting example to further this idea:

An insect whose coloration makes it invisible will not show, by its survival or disappearance whether a predator has or has not seen it. An imperfectly colored one will reveal this fact by whether it has survived or not.

Another example, Ashby gives is that of an expert boxer:

An expert boxer, when he comes home, will show no signs of whether he had a fight in the street or not. An imperfect boxer will carry the information.

Ashby’s idea can also be looked at from an adaptation standpoint. When you adapt very well to your ever-changing surroundings, you are destroying information – you are not giving any information away. Ashby also noted that adaptation means “destroying information.” In this manner, you know that you are adapting well when you don’t break a sweat. A master swordsman moves effortlessly while defeating an opponent. A good runner is not out of breath after a quick sprint.

The Performance Paradox:

My take on this idea from Ashby is to express it as a form of performance paradox – when something works really well, you will not notice it, or worse, you will think that it’s wasteful. The most effective and highly efficient components stay the quietest. The best spy is the one you have never heard of. When you try to monitor a highly performing component, you may rarely get evidence of its performance. It is almost as if it is wasteful. Another way to view this is – the imperfect components lend themselves to being monitored, while the perfect components do not. The danger in not understanding regulation from a cybernetics standpoint is to completely misread the interactions, and assume that the perfect component has no value.

I encourage the reader to read further upon these ideas here:

Edit (12/1/2020): Adding more clarity on “destruction of information”.

The phrase “destruction of information” was used by Ashby in a Shannon entropy sense. He is indicating that the agent is purposefully reducing the information entropy that would otherwise have been available. Another example is that of a good poker player, who is difficult to read.


In case you missed it, my last post was Locard’s Exchange Principle at the Gemba:

Locard’s Exchange Principle at the Gemba:

In today’s post, I am looking at Locard’s Exchange Principle, named after the famous French Criminologist, Edmond Locard. Succinctly put, the exchange principle can be stated as “every contact leaves a trace.” This is perhaps well explained by Paul L. Kirk in his 1953 book, Crime Investigation: Physical Evidence and the Police Laboratory:

Wherever he steps, whatever he touches, whatever he leaves, even unconsciously, will serve as a silent witness against him. Not only his fingerprints or his footprints, but his hair, the fibers from his clothes, the glass he breaks, the tool mark he leaves, the paint he scratches, the blood or semen he deposits or collects. All of these and more bear mute witness against him. This is evidence that does not forget. It is not confused by the excitement of the moment. It is not absent because human witnesses are. It is factual evidence. Physical evidence cannot be wrong, it cannot perjure itself, it cannot be wholly absent. Only human failure to find it, study and understand it can diminish its value.

In other words, the perpetrator involved in a crime brings something into the scene and at the same time takes something with them. They both can be used against the perpetrator as forensic evidence. As a huge fan of mystery stories and shows, I was very interested when I first heard about this principle. Rather than the applications in the forensics science, I was thinking about it from a cybernetics standpoint. When two people converse with each other, their interactions can be viewed in the light of Locard’s exchange principle. Both of them bring something into the conversation, and in turn take something with them. There is a cross-transfer of ideas with successful conversations. To quote the late German philosopher, Hans-Georg Gadamer:

The true reality of human communication is such that a conversation doesn’t simply enforce one opinion over and against the other, nor does it simply add one opinion to another, as a kind of addition. Rather, true conversation transforms both viewpoints.

It may be challenged that true conversations do not always take place. However, this is something that we can strive for. At the same time, we need to be mindful not to treat information as a commodity that can be passed around. Just because we convey a message by speaking it out loud, it does not mean that the message is conveyed. As the great cybernetician Heinz von Foerster would say – the hearer, not the utterer, determines the meaning of a message.

Claude Shannon, the father of Information Theory, looked in depth at the successful transmission of messages. He noted:

The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is, they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem. The significant aspect is that they are selected from a set of possible messages.

Shannon’s model had a source (a sender of a message), a transmission medium (a channel with noise and distortion) and a receiver. The sender had to encode the message and send it through the medium. The receiver had to receive the message, decode it, and reconstruct it. The receiver had to have a set of possible messages so that they could properly decode the message, such that any distortion or noise introduced in the medium could be compensated for. Shannon came up with a quantitative measure for the amount of information in a message – entropy. This is also a measure of surprise. For a message with low entropy, there is little surprise. For a message with high entropy, there is a lot of surprise, and this requires redundancy to ensure that the message is properly conveyed. For example, if the sender is sending the message “011”, then the sender can repeat the message three times: “011 011 011”. Thus, if the message gets distorted into “011 001 011”, the receiver is still able to decode the message as “011”. Curiously, if the message is a complete surprise, then the receiver will not be able to decode it. Thus, if the message were entirely new information, it would not be decoded successfully, no matter how much redundancy is added. This is the whole point of cryptic messages.
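The repetition-and-majority-vote scheme described above can be sketched as follows (a toy illustration of redundancy, not Shannon's formal coding theory; the function names are my own):

```python
from collections import Counter

def encode(message, repeats=3):
    """Add redundancy by sending several copies of the message."""
    return [message] * repeats

def decode(received):
    """Recover the message by a position-wise majority vote across the copies."""
    decoded = ""
    for bits in zip(*received):
        decoded += Counter(bits).most_common(1)[0][0]
    return decoded

sent = encode("011")               # ["011", "011", "011"]
corrupted = ["011", "001", "011"]  # noise flipped one bit in the second copy
print(decode(corrupted))           # "011": the redundancy absorbed the noise
```

With three copies, any single flipped bit per position is outvoted by the two intact copies, which is the point of the repetition in the example above.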

We are autopoietic entities, which means that we are informationally closed. No information can come into our organization from the outside. We are closed to information coming in. Any information is generated from within when we are exposed to perturbations from the outside. I have talked about this before. See here and here. We generate the information based on the perceptual network evolved specifically for us. We cannot pass information around as a commodity. Autopoiesis is the brainchild of Humberto Maturana and Francisco Varela. They noted:

Autopoietic systems do not have inputs or outputs. They can be perturbated by independent events and undergo internal structural changes which compensate these perturbations.

When we are communicating as part of being at the gemba, we have to keep in mind that we may not completely understand the meaning the way the utterer intended. In a similar way, the hearer, the other person, may not have understood the meaning as we had intended it. Even though we both may have heard each other 100%, we may not have communicated 100% (the way we think, at least). Instead, I am interpreting what the other person is saying, and trying to respond to what I think the other person has said. The same applies to the other person. We are both interpreting each other. We are both trying to perturb each other with the hope that the meaning being generated has some similarity to what we want to communicate. It is here that I appreciate Locard’s Exchange Principle. We are coming in and leaving something (not the entire thing) at the scene, and at the same time, we are taking something (again, not the entire thing) with us as we leave the scene.

When we communicate, we are hopefully inspiring each other. Communication is never achieved 100%, but some transfer of ideas takes place, resulting in the transformation of existing ideas. As Gadamer indicated, when we communicate, the ideas do not get added on top of each other in an additive fashion. Rather, the ideas get transformed. When we are at the gemba, we should be keen on listening with intent. We should be open to receiving ideas from others and be willing to transform. We should be mindful that what we are saying will not be understood the way we want it to be. We should also be mindful of our non-verbal communication. Most of the time, we can tell a lot by how a leader acts. A leader often talks the talk that we want to hear. However, their actions often talk the loudest.

I will stop with the great George Bernard Shaw’s wonderful quote on communication:

The biggest problem with communication is the illusion that it has occurred.


In case you missed it, my last post was The Truth About True Models:

The Truth About True Models:

I recently came across Dr. Donald Hoffman’s idea of Fitness-Beats-Truth or FBT Theorem. This is the idea that evolution stamps out true perceptions. In other words, an organism is more likely to survive if it does not have a true and accurate perception. As Hoffman explains it:

Suppose there is an objective reality of some kind. Then the FBT Theorem says that natural selection does not shape us to perceive the structure of that reality. It shapes us to perceive fitness points, and how to get them… The FBT Theorem has been tested and confirmed in many simulations. They reveal that Truth often goes extinct even if Fitness is far less complex.

Hoffman suggests that natural selection did not shape us to perceive the structure of an objective reality. Evolution gave us a less complex but efficient perceptual network that takes shortcuts to perceive “fitness points.” Evolution by natural selection does not favor true perceptions—it routinely drives them to extinction. Instead, natural selection favors perceptions that hide the truth and guide useful action.

An easy way to digest this idea is to consider our ancient ancestors. If they heard a rustling sound in the grass, it benefitted them not to analyze and capture the entire surroundings to get an accurate and true model of reality. Instead, they survived by forming a “quick and dirty” or good-enough model of the surroundings. They did not gain anything by having an elaborate and accurate perception. Quick and dirty heuristics such as “if you hear a rustling in the grass, then flee” allowed them to survive and pass on their genes. In other words, their fitter perception was not a true and accurate perception of the world around them. They gained (they survived) based on fitness rather than truth. As Hoffman noted, having true perception would have been detrimental because it avoided the shortcuts and heuristics that saved time. As complexity increases, heuristics work much better.

The idea of FBT aligns pretty well with the ideas of second order cybernetics (SOC) and radical constructivism. From an SOC standpoint, the emphasis for the representation of the world is not that of a model of causality, but of a model of constraints. As Ernst von Glasersfeld explains this:

In the biological theory of evolution, we speak of variability and selection, of environmental constraints and of survival. If an organism survives individually or as a species it means that, so far at least, it has been viable in the environment in which it happens to live. To survive, however, does not mean that the organism must in any sense reflect the character or the qualities of his environment. Gregory Bateson (1967) was the first who noticed that this theory of evolution, Darwin’s theory, is really a cybernetic theory because it is based on the concept of constraint rather than on the concept of causation.

In order to remain among the survivors, an organism has to “get by” the constraints which the environment poses. It has to squeeze between the bars of the constraints, to coin a metaphor. The environment does not determine how that might be achieved. It does not cause certain organisms to have certain characteristics or capabilities or to be a certain way. The environment merely eliminates those organisms that knock against its constraints. Anyone who by any means manages to get by the constraints, survives… All the environment contributes is constraints that knock out some of the changed organisms while others are left to survive. Thus, we can say that the only indication we may get of the “real” structure of the environment is through the organisms and the species that have been extinguished; the viable ones that survive merely constitute a selection of solutions among an infinity of potential solutions that might be equally viable.

Nature prefers efficient solutions that do the work most of the time, rather than effective solutions that work all of the time – solutions with the least energy expenditure, the fewest parts, etc. This approach also resonates with Occam’s razor. It is always advisable to have the fewest assumptions in your model. Another way to look at this is – the design with the fewest moving parts is always preferred.

The idea that true perceptions are not always advantageous may be counterintuitive. As complexity increases, we lack the perceptual network to truly comprehend the complexity. How we perceive our world around us depends a lot on our perceptual network, which is unique to our species. Our reality consists of omitting most of the attributes of the world around us. As Hoffman explains – the reality becomes simply a species-specific representation of fitness points on offer, and how we can act to get those points. Evolution has shaped us with perceptions that allow us to survive. But part of that involves hiding from us the stuff we don’t need to know.

Complexity also favors this approach of viable solutions/fitter perceptions. Hoffman notes:

We find that increasing the complexity of objective reality, or perceptual systems, or the temporal dynamics of fitness functions, increases the selection pressures against veridical perceptions.

I will add more thoughts on the FBT theorem at a later time. I encourage the readers to check out Hoffman’s book, The Case Against Reality.


In case you missed it, my last post was Talking about Constraints in Cybernetics:

Talking about Constraints in Cybernetics:

In today’s post, I am looking at constraints with respect to Cybernetics. I am looking mainly at the ideas from Ross Ashby, one of the pioneers of Cybernetics. Ashby wrote one of the best introductions to Cybernetics, aptly titled An Introduction to Cybernetics. Ashby described constraints in terms of variety. Variety is the number of distinct elements that an observer is capable of distinguishing. For example, consider the following set of elements:

{a, b, b, B, c, C}

Someone could say that the variety of this set is 3 since there are three letters. Some other person could say that the variety is actually 5 if the lower and upper cases are distinguished. A very common example to explain variety is a traffic stop light. Generally, the stop light in the US has 3 states (Red, Yellow and Green). Sometimes, additional states are possible such as blinking Red (indicating a STOP sign) or no light. Thus, the variety of a stop light can vary from 3 to 4 to 5.
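The observer-dependence of variety in this example can be sketched in a few lines of Python (a toy illustration; the variable names are my own):

```python
elements = ["a", "b", "b", "B", "c", "C"]

# An observer who ignores case distinguishes only three states...
coarse_variety = len(set(e.lower() for e in elements))

# ...while a keener observer who distinguishes case counts five.
fine_variety = len(set(elements))

print(coarse_variety)  # 3
print(fine_variety)    # 5
```

The same collection yields a different variety depending on the distinctions the observer is able, or willing, to make.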

Ashby explained constraints as follows – when there are two related sets and one set has less variety than the other, we can determine that a constraint is present in the set with less variety. Let’s consider the stop light again. If all three lights were independent, we could have 8 possible states. This is shown below, where “X” means OFF and “O” means ON.

Figure 1 – The Eight States of a Stop Light

Per our discussion above, we utilize mainly 3 of these states to control traffic (ignoring the blinking states). These are identified in the blue shaded cells {2, 6, 7}. Thus, we can say that there is a constraint applied on the stop light, since the actual variety the stop light possesses is 3 instead of 8. Ashby distinguished slight and severe constraints. The example that Ashby gives is applying a constraint on a squad of soldiers in a single rank. The soldiers can be made to stand in numerous ways. For example, if the constraint to be applied is that no soldier is to stand next to another soldier who shares the same birthday, the variety achieved is still high; it is highly unlikely that two soldiers in a small group share the same birthday. This is an example of a slight constraint. However, if the constraint to be applied is that the soldiers should arrange themselves in the order of their height, the variety is highly reduced. This is an example of a severe constraint.
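The reduction from 8 states to 3 can be sketched as follows (assuming the traffic constraint is "exactly one light on at a time," which picks out the three shaded states; the names are my own):

```python
from itertools import product

# Each of the three lights is independently ON ("O") or OFF ("X").
all_states = list(product("OX", repeat=3))
print(len(all_states))  # 8: the variety with no constraint

# The traffic constraint: exactly one light is on at a time.
constrained = [s for s in all_states if s.count("O") == 1]
print(len(constrained))  # 3: Red only, Yellow only, Green only
```

The constraint is visible precisely as the gap between the 8 states that independence would allow and the 3 states actually used.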

Another example that Ashby gives is that of a chair. A chair taken as a whole has six degrees of freedom for movement. However, when the chair is disassembled into its parts, the freedom for movement increases. Ashby said:

A chair is a thing because it has coherence, because we can put it on this side of a table or that, because we can carry it around or sit on it. The chair is also a collection of parts. Now any free object in our three-dimensional world has six degrees of freedom for movement. Were the parts of the chair unconnected each would have its own six degrees of freedom; and this is in fact the amount of mobility available to the parts in the workshop before they are assembled. Thus, the four legs, when separate, have 24 degrees of freedom. After they are joined, however, they have only the six degrees of freedom of the single object. That there is a constraint is obvious when one realizes that if the positions of three legs of an assembled chair are known, then that of the fourth follows necessarily—it has no freedom.

Thus, the change from four separate and free legs to one chair corresponds precisely to the change from the set’s having 24 degrees of freedom to its having only 6. Thus, the essence of the chair’s being a “thing”, a unity, rather than a collection of independent parts corresponds to the presence of the constraint.

Ashby continued:

Seen from this point of view, the world around us is extremely rich in constraints. We are so familiar with them that we take most of them for granted, and are often not even aware that they exist. To see what the world would be like without its usual constraints we have to turn to fairy tales or to a “crazy” film, and even these remove only a fraction of all the constraints.

There are several takeaways from Ashby’s explanation of constraints.

  1. The effect of the observer: The observer is king when it comes to cybernetics. The variety of an observed system depends on the observer. This means that the observation is subject to the constraints the observer applies, knowingly or unknowingly, in the form of biases, beliefs, etc. The observer brings and applies internal constraints on the external world. Taking this a step further, our experiential reality is a result of our limited perceptual network. For example, we can see only a small section of the light spectrum, and we can hear only a small section of the sound spectrum. We have cognitive blind spots that we are not aware of. And yet we claim access to an objective reality and are surprised when people don’t understand our point of view. We should not force our own views on others in ways that create false dichotomies. Sadly, this is all too prevalent in today’s politics, where almost every matter has been turned into a political viewpoint.
  2. Constraints are not a bad thing: Ashby’s great insight was that when a constraint exists, we can take advantage of it. We can make reasonably good predictions when constraints exist. Constraints help us understand how things work. Ashby said that every law of nature is a constraint. We are able to estimate the variety that would exist if total independence occurred, and we are able to reduce this variety by understanding the existing constraints and adding further constraints where possible to produce the results we want. Adding constraints is about reducing unwanted variety. Design engineering makes full use of this. On a similar note, Ashby also pointed out that learning is possible only to the extent that a sequence shows constraint. Learning is only possible when there is a constraint. If we are to learn a language, we learn it by learning the constraints that exist in the language in the form of syntax, meanings of words, grammar, etc.
  3. Law of Requisite Variety: Ross Ashby came up with the Law of Requisite Variety. This law can be stated simply as: variety destroys (compensates for) variety. For example, a good swordsman is able to fend off an opponent if they are able to block and counter-attack every move of the opponent. The swordsman has to match the variety of the opponent (the set of attacks and blocks). To take our previous example, the stop light has to have requisite variety to control traffic. If the 3 states identified in Figure 1 are not enough, the “system” will absorb the variety in the form of a traffic jam. When we think in terms of constraints, the requisite variety should be aligned with the identified constraints. We should minimize bringing in our internal constraints, and watch for the external constraints that already exist. The variety that we need to match must be aligned with those existing constraints.
  4. Constraints do not need to be Objects: Similar to point 1, the narratives and stories we tell ourselves are also constraints. We are Homo Narrans – storytellers. We make sense of the world through the stories we share and tell ourselves and others. We control ourselves and others with the stories we tell. We limit ourselves with what we believe. If we can understand the stories we tell ourselves, and the stories others tell us, we can better ourselves.
  5. Adaptation or Fit: Ashby realized that an organism can adapt just so far as the real world is constrained, and no further. Evolution is about fit. It is about supporting those factors that allow the organism to match the constraints in order to survive. The organism evolves to match the changing constraints present in the changing environment. This often happens through finding new uses for what already exists. The cybernetician and radical constructivist Ernst von Glasersfeld gives a great example – the way a key fits the lock that it is able to open:

The fit describes a capacity of the key, not a property of the lock. When we face a novel problem, we are in much the same position as the burglar who wishes to enter a house. The “key” with which he successfully opens the door might be a paper clip, a bobby pin, a credit card, or a skillfully crafted skeleton key. All that matters is that it fits within the constraints of the particular lock and allows the burglar to get in.
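The Law of Requisite Variety from point 3 above can also be sketched as a toy calculation. Under Ashby’s game-table setup, where each disturbance met by a given response yields a distinct outcome, the best a regulator with R responses can do against D disturbances is to hold the outcomes down to ceil(D / R) distinct values. The specific numbers below (6 attacks, 9 traffic situations) are illustrative assumptions, not figures from the text:

```python
import math

def minimal_outcome_variety(num_disturbances: int, num_responses: int) -> int:
    """Smallest achievable outcome variety for a regulator, per Ashby's
    game-table argument: outcomes >= disturbances / responses."""
    return math.ceil(num_disturbances / num_responses)

# A swordsman with an answer for every one of 6 attacks: full regulation,
# a single outcome (the attack is always parried).
print(minimal_outcome_variety(6, 6))  # 1

# A stop light with only 3 states facing 9 hypothetical traffic situations:
# some variety is left unabsorbed and shows up as congestion.
print(minimal_outcome_variety(9, 3))  # 3
```

When the regulator’s variety matches the disturbance variety, the residual outcome variety collapses to one; when it falls short, the leftover variety has to go somewhere, which is exactly the traffic-jam point above.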

I will finish with Ernst von Glasersfeld’s description of cybernetics in terms of constraints:

Cybernetics is not interested in causality but constraints. Cybernetics is the art of maintaining equilibrium in a world of constraints and possibilities.

Please maintain social distance and wear masks. Stay safe and Always keep on learning…

In case you missed it, my last post was Deconstructing Systems – There is Nothing Outside the Text: