Getting Out of the Dark Room – Staying Curious:

In today’s post, I am looking at the importance of staying curious in the light of Karl Friston’s “Free Energy Principle” (FEP) and Ross Ashby’s ideas on indirect regulation. I have discussed the Free Energy Principle here. The FEP basically states that in order to resist the natural tendency to disorder, adaptive agents must minimize surprise.

Karl Friston, the brilliant mind behind the FEP, noted:

the whole point of the free-energy principle is to unify all adaptive autopoietic and self-organizing behavior under one simple imperative; avoid surprises and you will last longer.

Avoiding surprises means that one has to model and anticipate a changing and itinerant world. This implies that the models used to quantify surprise must themselves embody itinerant wandering through sensory states (because they have been selected by exposure to an inconstant world): Under the free-energy principle, the agent will become an optimal (if approximate) model of its environment. This is because, mathematically, surprise is also the negative log-evidence for the model entailed by the agent. This means minimizing surprise maximizes the evidence for the agent (model). Put simply, the agent becomes a model of the environment in which it is immersed. This is exactly consistent with the Good Regulator theorem of Conant and Ashby (1970). This theorem, which is central to cybernetics, states that “every Good Regulator of a system must be a model of that system.” .. Like adaptive fitness, the free-energy formulation is not a mechanism or magic recipe for life; it is just a characterization of biological systems that exist. In fact, adaptive fitness and (negative) free energy are considered by some to be the same thing.
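To make the relation between surprise and model evidence concrete, here is a tiny sketch in Python (my own toy illustration, with made-up probabilities): surprise is just the negative log of the evidence p(s|m), so the model with the higher evidence for a sensory sample is, by definition, the one least surprised by it.

```python
import math

# Hypothetical probabilities that two models assign to the same sensory sample s.
# The numbers are invented purely for illustration.
evidence = {
    "model_A": 0.60,   # p(s | model_A): model A predicts s well
    "model_B": 0.05,   # p(s | model_B): model B finds s unlikely
}

for name, p_s in evidence.items():
    surprise = -math.log(p_s)   # surprise (surprisal) = negative log-evidence
    print(f"{name}: evidence p(s|m) = {p_s:.2f}, surprise = {surprise:.2f} nats")

# Lower surprise <=> higher evidence: minimizing surprise maximizes the
# evidence for the model entailed by the agent.
```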

This idea of the agent having a model of its environment is quite important in Cybernetics. In fact, the idea behind the FEP can be traced back to Ashby’s ideas on Cybernetics. For an organism to survive, it needs to keep certain internal variables, such as blood pressure and internal temperature, within a certain range. Ashby called these essential variables, depicted by “E”. He noted that the goal of regulation is to keep these essential variables in range in the face of disturbances coming from the environment. In other words, the goal of regulation is to minimize the effect of incoming disturbances. Perfect regulation results in no disturbances reaching the essential variables; in that case, the organism is completely ignorant of what is going on outside. When the regulation succeeds, we say that the regulator has requisite variety: it is able to counter the variety coming in from the environment. Ashby called this “the law of Requisite Variety” and explained it succinctly as “only variety can absorb variety.” Ashby explained direct and indirect regulation as follows:

Direct and indirect regulation occur as follows. Suppose an essential variable X has to be kept between limits x’ and x”. Whatever acts directly on X to keep it within the limits is regulating directly. It may happen, however, that there is a mechanism M available that affects X, and that will act as a regulator to keep X within the limits x’ and x” provided that a certain parameter P (parameter to M) is kept within the limits p’ and p”. If, now, any selective agent acts on P so as to keep it between p’ and p”, the end result, after M has acted, will be that X is kept between x’ and x”.

Now, in general, the quantities of regulation required to keep P in p’ and p” and to keep X in x’ to x” are independent. The law of requisite variety does not link them. Thus, it may happen that a small amount of regulation supplied to P may result in a much larger amount of regulation being shown by X.

When the regulation is direct, the amount of regulation that can be shown by X is absolutely limited to what can be supplied to it (by the law of requisite variety); when it is indirect, however, more regulation may be shown by X than is supplied to P. Indirect regulation thus permits the possibility of amplifying the amount of regulation; hence its importance.
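Ashby liked to present regulation as a game between disturbance and regulator. Here is a toy sketch in that spirit (my own illustration; the outcome rule and numbers are invented, not Ashby’s): the regulator can hold the outcome at its target only when it commands at least as much variety as the disturbances it faces – “only variety can absorb variety.”

```python
# Toy illustration of the law of requisite variety.
# Outcome = (disturbance + response) mod 3; the "good" outcome we must hold is 0.
DISTURBANCES = [0, 1, 2]          # three possible disturbances from the environment

def best_regulation(responses):
    """Return the set of outcomes reachable when the regulator picks, for each
    disturbance, the response that brings the outcome closest to the target 0."""
    outcomes = set()
    for d in DISTURBANCES:
        best = min((d + r) % 3 for r in responses)
        outcomes.add(best)
    return outcomes

print(best_regulation([0]))        # 1 response : outcomes {0, 1, 2} -> no regulation
print(best_regulation([0, 2]))     # 2 responses: outcomes {0, 1}    -> partial regulation
print(best_regulation([0, 1, 2]))  # 3 responses: outcome  {0}       -> full regulation
```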

Ashby illustrated direct and indirect regulation with the following example:

Living organisms came across this possibility eons ago, for the gene-pattern is a channel of communication from parent to offspring: ‘Grow a pair of eyes,’ it says, ‘ they’ll probably come in useful; and better put hemoglobin into your veins — carbon monoxide is rare and oxygen common.’ As a channel of communication, it has a definite, finite capacity, Q say. If this capacity is used directly, then, by the law of requisite variety, the amount of regulation that the organism can use as defense against the environment cannot exceed Q. To this limit, the non-learning organisms must conform. If, however, the regulation is done indirectly, then the quantity Q, used appropriately, may enable the organism to achieve, against its environment, an amount of regulation much greater than Q. Thus, the learning organisms are no longer restricted by the limit.

An organism with lower cognitive capacity may be able to survive by relying solely on its gene-pattern, while an organism with higher cognitive capacity has to supplement the basic gene-pattern with learning behavior. To do this, it has to learn from its environment. Ashby continued:

In the same way the gene-pattern, when it determines the growth of a learning animal, expends part of its resources in forming a brain that is adapted not only by details in the gene-pattern but also by details in the environment… dictionary. While the hunting wasp, as it attacks its prey, is guided in detail by its genetic inheritance, the kitten is taught how to catch mice by the mice themselves. Thus, in the learning organism the information that comes to it by the gene-pattern is much supplemented by information supplied by the environment; so, the total adaptation possible, after learning, can exceed the quantity transmitted directly through the gene-pattern.

It is important to note that the environment does not simply pour information into the organism. Rather, the organism perceives the environment through its actions on the environment, and the environment acts on the organism in turn. Perception is possible only through this circular causal loop. As Ashby noted, the gene-pattern for learning allows the organism to model its environment, and this makes indirect regulation possible. Ashby explains this point further:

This is the learning mechanism. Its peculiarity is that the gene-pattern delegates part of its control over the organism to the environment. Thus, it does not specify in detail how a kitten shall catch a mouse, but provides a learning mechanism and a tendency to play, so that it is the mouse which teaches the kitten the finer points of how to catch mice. This is regulation, or adaptation, by the indirect method. The gene-pattern does not, as it were, dictate, but puts the kitten into the way of being able to form its own adaptation, guided in detail by the environment.

The Dark Room:

At this point, we can look at the idea of the dark room, a thought experiment in FEP that we can also explain using Ashby’s ideas. If the goal of the regulator is to minimize the impact of disturbances on the essential variables, one strategy is simply to move to an environment with minimal disturbances. In FEP, the thought experiment is posed similarly: if the goal of the agent is to minimize surprise, why wouldn’t the agent find a dark room and stay in it indefinitely?

A recurrent puzzle raised by critics of these models (FEP) is that biological systems do not seem to avoid surprises. We do not simply seek a dark, unchanging chamber, and stay there. This is the “Dark-Room Problem.” 

Karl Friston offers an answer to this question:

Technically, the resolution of the Dark-Room Problem rests on the fact that average surprise or entropy H(s|m) is a function of sensations and the agent (model) predicting them. Conversely, the entropy H(s) minimized in dark rooms is only a function of sensory information. The distinction is crucial and reflects the fact that surprise only exists in relation to model-based expectations. The free-energy principle says that we harvest sensory signals that we can predict (cf., emulation theory; Grush, 2004); ensuring we keep to well-trodden paths in the space of all the physical and physiological variables that underwrite our existence. In this sense, every organism (from viruses to vegans) can be regarded as a model of its econiche, which has been optimized to predict and sample from that econiche. Interestingly, free energy is used explicitly for model optimization in statistics (e.g., Yedidia et al., 2005) using exactly the same principles.

This means that a dark room will afford low levels of surprise if, and only if, the agent has been optimized by evolution (or neurodevelopment) to predict and inhabit it. Agents that predict rich stimulating environments will find the “dark room” surprising and will leave at the earliest opportunity. This would be a bit like arriving at the football match and finding the ground empty. Although the ambient sensory signals will have low entropy in the absence of any expectations (model), you will be surprised until you find a rational explanation or a new model (like turning up a day early). Notice that average surprise depends on, and only on, sensations and the model used to explain them. This means an agent can compare the surprise under different models and select the best model; thereby eluding any “circular explanation” for the sensations at hand.
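A toy way to see this numerically (my own sketch, with made-up probabilities): the very same dark, low-entropy sensory stream is unsurprising for an agent whose model predicts darkness, and highly surprising for an agent whose model predicts a rich, stimulating scene.

```python
import math

# Probability each hypothetical agent's model assigns to "total darkness"
# as the next sensory sample. Numbers are invented for illustration only.
p_dark_given_model = {
    "cave_dweller": 0.90,    # its model expects darkness
    "football_fan": 0.001,   # its model expects a packed, noisy stadium
}

for agent, p in p_dark_given_model.items():
    print(f"{agent}: surprise of the dark room = {-math.log(p):.2f} nats")

# The sensory signal itself is identical (and low-entropy) in both cases;
# what differs is the model-based expectation, so only the second agent
# is driven to leave the dark room.
```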

We are born with a gene-pattern that allows for learning. The basic pattern is to learn, and our survival comes mainly from this; it is what lets us get out of the dark room. We are born curious, and this allows us to keep on learning. We have an inner drive to keep looking for answers and not be satisfied with the status quo.

I am sure there is an important lesson for us all in the ideas of the dark room and indirect regulation. I could simply say – stay curious and keep on learning. Or I could let you come to that conclusion on your own. As the famous Spanish philosopher José Ortega y Gasset noted – he who wants to teach a truth should place us in the position to discover it ourselves.

I will finish with a great lesson from Ashby that explains the idea of indirect regulation:

If a child wanted to discover the meanings of English words, and his father had only ten minutes available for instruction, the father would have two possible modes of action. One is to use the ten minutes in telling the child the meanings of as many words as can be described in that time. Clearly there is a limit to the number of words that can be so explained. This is the direct method. The indirect method is for the father to spend the ten minutes showing the child how to use a dictionary. At the end of the ten minutes the child is, in one sense, no better off; for not a single word has been added to his vocabulary. Nevertheless, the second method has a fundamental advantage; for in the future the number of words that the child can understand is no longer bounded by the limit imposed by the ten minutes. The reason is that if the information about meanings has to come through the father directly, it is limited to ten-minutes’ worth; in the indirect method the information comes partly through the father and partly through another channel (the dictionary) that the father’s ten-minute act has made available.

Please maintain social distance and wear masks. Stay safe and Always keep on learning…

In case you missed it, my last post was The Cybernetics of Ohno’s Production System:

The Free Energy Principle at the Gemba:


In today’s post, I am looking at the Free Energy Principle (FEP) proposed by the British neuroscientist Karl Friston. The FEP basically states that in order to resist the natural tendency to disorder, adaptive agents must minimize surprise. A good example to explain this is that successful fish typically find themselves surrounded by water, and very atypically find themselves out of water, since being out of water for an extended time will lead to a breakdown of homoeostatic (autopoietic) relations.[1]

Here the free energy refers to an information-theoretic construct:

Because the distribution of ‘surprising’ events is in general unknown and unknowable, organisms must instead minimize a tractable proxy, which according to the FEP turns out to be ‘free energy’. Free energy in this context is an information-theoretic construct that (i) provides an upper bound on the extent to which sensory data is atypical (‘surprising’) and (ii) can be evaluated by an organism, because it depends eventually only on sensory input and an internal model of the environmental causes of sensory input.[1]
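Here is a minimal discrete sketch of that construct (my own toy example in Python, with made-up numbers): free energy can be computed from the generative model and a “recognition” density alone, and it upper-bounds the surprise, touching it exactly when the recognition density equals the true posterior.

```python
import math

# One hidden cause x in {0, 1}, one observed sensory datum s.
# Hypothetical generative model p(x) and p(s|x); numbers are illustrative only.
p_x = [0.5, 0.5]
p_s_given_x = [0.2, 0.8]                              # likelihood of the observed s under each cause

p_s = sum(p_x[i] * p_s_given_x[i] for i in (0, 1))    # model evidence p(s)
surprise = -math.log(p_s)

def free_energy(q):
    """Variational free energy F = E_q[ln q(x) - ln p(s, x)] for a recognition density q."""
    return sum(q[i] * (math.log(q[i]) - math.log(p_x[i] * p_s_given_x[i])) for i in (0, 1))

posterior = [p_x[i] * p_s_given_x[i] / p_s for i in (0, 1)]

print(f"surprise -ln p(s)         = {surprise:.4f}")
print(f"F with a poor q=[.5,.5]   = {free_energy([0.5, 0.5]):.4f}")   # bound is loose
print(f"F with q = true posterior = {free_energy(posterior):.4f}")    # bound is tight
```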

In FEP, our brains are viewed as predictive engines, or Bayesian inference engines. This idea is built on predictive coding/processing, which goes back to the German physician and physicist Hermann von Helmholtz in the 1800s. The main idea is that we have a hierarchical structure in our brain that tries to predict what is going to happen based on the sensory data received so far. As the philosopher Andy Clark explains, our brain is not a cognitive couch potato waiting for sensory input to make sense of what is going on; it is actively predicting what is going to happen next. This is why minimizing surprise is important. For example, when we lift a closed container, we predict that it is going to have a certain weight based on our previous experiences and the visual appearance of the container. We are surprised if the container is light and can be lifted easily. We have a similar experience when we miss a step on the staircase.

From a mathematical standpoint, we can say that when our internal model matches the sensory input, we are not surprised. This fit can be expressed as a KL divergence in information theory: the lower the divergence, the better the fit between the model and the sensory input, and the lower the surprise. The hierarchy works in both directions: predictions flow top down, while sensory data flows bottom up. If the model matches the sensory data, then nothing goes up the chain. However, when there is a significant difference between the top-down prediction and the bottom-up incoming sensory data, the difference is passed up the chain as a prediction error.

One of my favorite examples to explain this further is to imagine that you are in the shower with your radio playing. You can faintly hear the radio in the shower. When your favorite song plays on the radio, you feel like you can hear it better than when an unfamiliar song is played. This is because your brain is better able to predict what is going to happen, and the prediction helps smooth out the incoming auditory signals. The British neuroscientist Anil Seth has a great phrase for the predictive processing idea: “perception is controlled hallucination.”
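To make the prediction-error idea concrete, here is a minimal toy sketch in Python (my own illustration, not Friston’s or Clark’s actual model), using the container example above: a prior expectation about the container’s weight is revised in proportion to the prediction error.

```python
# Minimal prediction-error loop in the spirit of predictive coding (a toy sketch):
# the prior expectation about a container's weight is nudged toward the sensed
# weight in proportion to the prediction error.
predicted_weight = 2.0      # kg, prior belief from past experience and visual cues
sensed_weight = 0.3         # kg, the container turns out to be nearly empty
learning_rate = 0.5         # how strongly the error revises the prediction

for step in range(5):
    error = sensed_weight - predicted_weight          # bottom-up prediction error
    predicted_weight += learning_rate * error         # top-down prediction is revised
    print(f"step {step}: error = {error:+.3f}, new prediction = {predicted_weight:.3f} kg")

# Large initial errors ("surprise") shrink as the internal model is updated
# to match the incoming sensory data.
```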

Andy Clark explains this further:

Perception itself is a kind of controlled hallucination… [T]he sensory information here acts as feedback on your expectations. It allows you to often correct them and to refine them.

(T)o perceive the world is to successfully predict our own sensory states. The brain uses stored knowledge about the structure of the world and the probabilities of one state or event following another to generate a prediction of what the current state is likely to be, given the previous one and this body of knowledge. Mismatches between the prediction and the received signal generate error signals that nuance the prediction or (in more extreme cases) drive learning and plasticity.

Predictive coding models suggest that what emerges first is the general gist (including the general affective feel) of the scene, with the details becoming progressively filled in as the brain uses that larger context — time and task allowing — to generate finer and finer predictions of detail. There is a very real sense in which we properly perceive the forest before the trees.

What we perceive (or think we perceive) is heavily determined by what we know, and what we know (or think we know) is constantly conditioned on what we perceive (or think we perceive).

(T)he task of the perceiving brain is to account for (to accommodate or ‘explain away’) the incoming or ‘driving’ sensory signal by means of a matching top-down prediction. The better the match, the less prediction error then propagates up the hierarchy. The higher level guesses are thus acting as priors for the lower level processing, in the fashion (as remarked earlier) of so-called ‘empirical Bayes’.

The question of what happens when the prediction does not match is best answered by Friston:

“The free-energy considered here represents a bound on the surprise inherent in any exchange with the environment, under expectations encoded by its state or configuration. A system can minimize free energy by changing its configuration to change the way it samples the environment, or to change its expectations. These changes correspond to action and perception, respectively, and lead to an adaptive exchange with the environment that is characteristic of biological systems. This treatment implies that the system’s state and structure encode an implicit and probabilistic model of the environment.”

Our brains are continuously sampling the data coming in and making predictions. When there is a mismatch between the prediction and the data, we have three options.

  • Update our model to match the incoming data.
  • Attempt to change the environment so that the incoming data matches the model, or try resampling the data coming in.
  • Ignore the mismatch and do nothing.

Option 3 will not always yield positive results. Option 1 is a learning process in which we update our internal models based on the new evidence. Option 2 shows strong confidence in our internal model and in our ability to change the environment; or perhaps there is something wrong with the incoming data and we have to gather more data before we proceed.
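The first two options can be sketched as two different ways of shrinking the same mismatch – roughly, perception versus action (a toy Python illustration with invented numbers, not a formal active-inference model):

```python
# Toy contrast between the first two options (perception vs. action).
# The 'mismatch' is the squared difference between belief and sensed value.
belief, sensed = 20.0, 24.0          # e.g. believed vs. sensed room temperature, in C

def mismatch(b, s):
    return (b - s) ** 2

# Option 1 (perception): update the internal model toward the data.
belief_updated = belief + 0.8 * (sensed - belief)

# Option 2 (action): act on the environment (e.g. turn on the AC) so the data
# moves toward the model; here the sensed value is simply nudged toward the belief.
sensed_after_action = sensed + 0.8 * (belief - sensed)

print("initial mismatch:            ", mismatch(belief, sensed))
print("after updating the model:    ", mismatch(belief_updated, sensed))
print("after acting on environment: ", mismatch(belief, sensed_after_action))
```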

The ideas from FEP can also further our understanding of how we balance maintaining the status quo (exploit) against going outside our comfort zones (explore). To paraphrase the English polymath George Spencer-Brown, the first act of cognition is to differentiate (the act of distinction). We start by differentiating – me/everything else. We experience and “bring forth” the world around us by constructing it inside our minds. This construction has to be a simpler version, because the complexity of the world around us is so high. We only care about correlations that matter to us in our local environment, since these matter the most for our survival and sustenance. This leads to a tension. We want to look for things that confirm our hypotheses and maintain the status quo; this is the short-term vision. However, this doesn’t help our sustenance in the long run. We also need to explore, to look for things that we don’t know about; this is the long-term vision, and it helps us prepare to adapt to the ever-changing environment. There is a balance to be struck between the two.

The idea of FEP can go from “I model the world” to “we model the world” to “we model ourselves modelling the world.” As part of a larger human system, we can cocreate a shared model of our environment and collaborate to minimize the free energy leading to our sustenance as a society.

Final Words:

FEP is a fascinating field, and I encourage readers to check out the works of Karl Friston, Andy Clark, and others. I will finish with the insight from Friston that the idea of minimizing free energy is also a way of characterizing one’s existence.

Avoiding surprises means that one has to model and anticipate a changing and itinerant world. This implies that the models used to quantify surprise must themselves embody itinerant wandering through sensory states (because they have been selected by exposure to an inconstant world): Under the free-energy principle, the agent will become an optimal (if approximate) model of its environment. This is because, mathematically, surprise is also the negative log-evidence for the model entailed by the agent. This means minimizing surprise maximizes the evidence for the agent (model). Put simply, the agent becomes a model of the environment in which it is immersed. This is exactly consistent with the Good Regulator theorem of Conant and Ashby (1970). This theorem, which is central to cybernetics, states that “every Good Regulator of a system must be a model of that system.” .. Like adaptive fitness, the free-energy formulation is not a mechanism or magic recipe for life; it is just a characterization of biological systems that exist. In fact, adaptive fitness and (negative) free energy are considered by some to be the same thing.

Always keep on learning…

In case you missed it, my last post was The Whole is ________ than the sum of its parts:

[1] Buckley, C. L., Kim, C. S., McGregor, S., & Seth, A. K. (2017). The free energy principle for action and perception: A mathematical review.