Hammurabi, Hawaii and Icarus:

patent

In today’s post, I will be looking at Human Error. In November 2017, the US state of Hawaii reinstated the Cold War era nuclear warning signs due to the growing fears of a nuclear attack from North Korea. On January 13, 2018, an employee from the Hawaii Emergency Management Agency sent out an alert through the communication system – “BALLISTIC MISSILE THREAT INBOUND TO HAWAII. SEEK IMMEDIATE SHELTER. THIS IS NOT A DRILL.” The employee was supposed to take part in a drill where the emergency missile warning system is tested. The alert message was not supposed to go to the general public. The cause for the mishap was soon determined to be human error. The employee in the spotlight and a few others left the agency soon afterwards. Even the Hawaiian governor, David Ige, came under scrutiny because he had forgotten his Twitter password and could not update his Twitter feed about the false alarm. I do not have all of the facts for this event, and it would not be right of me to determine what went wrong. Instead, I will focus on the topic of human error.

One of the first proponents of the concept of human error in modern times was the American industrial safety pioneer, Herbert William Heinrich. In his seminal 1931 book, Industrial Accident Prevention, he proposed the Domino theory to explain industrial accidents. Heinrich reviewed several industrial accidents of his time, and came up with the following percentages for proximate causes:

  • 88% are from unsafe acts of persons (human error),
  • 10% are from unsafe mechanical or physical conditions, and
  • 2% are “acts of God” and unpreventable.

The reader may find it interesting to learn that Heinrich was working as the Assistant Superintendent of the Engineering and Inspection Division of Travelers Insurance Company when he wrote the book in 1931. The data that Heinrich collected was somehow lost after the book was published. Heinrich’s domino theory explains an injury from an accident as a linear sequence of events associated with five factors – Ancestry and social environment, Fault of person, Unsafe act and/or mechanical or physical hazard, Accident, and Injury.

H1

He hypothesized that taking away one domino from the chain can prevent the industrial injury from happening. He wrote – “If one single factor of the entire sequence is to be selected as the most important, it would undoubtedly be the one indicated by the unsafe act of the person or the existing mechanical hazard.” I was taken aback by the example he gave to illustrate his point. He talked about an operator fracturing his skull as the result of a fall from a ladder. The investigation revealed that the operator descended the ladder with his back to it and caught his heel on one of the upper rungs. Heinrich noted that the effort to train and instruct him and to supervise his work was not effective enough to prevent this unsafe practice. “Further inquiry also indicated that his social environment was conducive to the forming of unsafe habits and that his family record was such as to justify the belief that reckless tendencies had been inherited.”

One of the main criticisms of Heinrich’s Domino model is that it is too simplistic to explain a complex phenomenon. The Domino model is reflective of the mechanistic view prevalent at that time. The modern view of “human error” is based on cognitive psychology and systems thinking. In this view, accidents are seen as a by-product of the normal functioning of the sociotechnical system. Human error is seen as a symptom and not a cause. This new view uses the approach of “no-view” when it comes to human error. This means that human error should not be its own category for a root cause. The process is not perfectly built, and the human variability that might result in a failure is the same variability that results in the ongoing success of the process. The operator has to adapt to meet the unexpected challenges, pressures and demands that arise on a day-to-day basis. The use of human error as a root cause is a fundamental attribution error – focusing on the human trait of the operator as being reckless or careless, rather than focusing on the situation that the operator was in.

One concept that may help in explaining this further is Local Rationality. Local Rationality starts with the basic assumption that everybody wants to do a good job, and that we try to do the best (be rational) with the information that is available to us at a given time. If a decision led to an error, instead of looking at where the operator went wrong, we need to look at why the decision made sense to him at that point in time. The operator is at the “sharp end” of the system. James Reason, Professor Emeritus of Psychology at the University of Manchester in England, came up with the concept of Sharp End and Blunt End. The sharp end is similar to the concept of Gemba in Lean, where the actual action is taking place. This is mainly where the accident happens and is thus in the spotlight during an investigation. The blunt end, on the other hand, is removed in space and time. The blunt end is responsible for the policies and constraints that shape the situation for the sharp end. The blunt end consists of top management, regulators, administrators etc. Professor Reason noted that the blunt end of the system controls the resources and constraints that confront the practitioner at the sharp end, shaping and presenting sometimes conflicting incentives and demands. The operators at the sharp end of the sociotechnical system inherit the defects in the system due to the actions and policies set by the blunt end, and can be the last line of defense instead of being the main proponents or instigators of the accidents. Professor Reason also noted that – “rather than being the main instigators of an accident, operators tend to be the inheritors of system defects. Their part is that of adding the final garnish to a lethal brew whose ingredients have already been long in the cooking.” I encourage the reader to research the works of Jens Rasmussen, James Reason, Erik Hollnagel and Sidney Dekker since I have tried to only scratch the surface.

Final Words:

Perhaps the oldest source of human error causation is the Code of Hammurabi, the code of ancient Mesopotamian laws dating back to 1754 BC. The Code of Hammurabi consisted of 282 laws. Some examples of human error are given below.

  • If a builder builds a house for someone, and does not construct it properly, and the house which he built falls in and kills its owner, then that builder shall be put to death.
  • If a man rents his boat to a sailor, and the sailor is careless, and the boat is wrecked or goes aground, the sailor shall give the owner of the boat another boat as compensation.
  • If a man lets in water and the water overflows the plantation of his neighbor, he shall pay ten gur of corn for every ten gan of land.

I will finish off with the story of Icarus. In Greek mythology, Daedalus was the master craftsman who created the Labyrinth on the island of Crete for King Minos. Icarus was his son. King Minos imprisoned Daedalus and Icarus in Crete. The ingenious Daedalus observed the birds flying and invented a set of wings made from bird feathers and candle wax. He tested the wings out and made a pair for his son Icarus. Daedalus and Icarus planned their escape. Daedalus was a good Engineer since he studied the failure modes of his design and identified the limits. Daedalus instructed Icarus to follow him closely and asked him not to fly too close to the sea, since the moisture could dampen the wings, and not to fly too close to the sun, since the heat from the sun could melt the wax. As the story goes, Icarus was excited with his ability to fly and got carried away (maybe reckless). He flew too close to the sun, and the wax melted from his wings, causing him to fall to his untimely death.

Perhaps the death of Icarus could be viewed as a human error since he was reckless and did not follow directions. However, Stephen Barlay, in his 1969 book, Aircrash Detective: International Report on the Quest for Air Safety, looked at this story closely. At the high altitude at which Icarus was flying, the air would actually be cold rather than warm. Thus, the failure would come from the cold making the wax brittle so that it broke, instead of the wax melting as indicated in the story. If this were true, the wings would also have broken in cold weather, and Icarus would have died at another time even if he had followed his father’s advice.

Always keep on learning…

In case you missed it, my last post was A Fuzzy 2018 Wish


The Information Model for Poka Yoke:

USB2

In today’s post, I will be looking at poka yoke or error proofing using an information model. My inspirations for this post are Takahiro Fujimoto, who wrote the wonderful book “The Evolution of a Manufacturing System at Toyota” (1999), and a discussion I had with my brother last weekend.

I will start with an interesting question – “where do you see information at your gemba, your production floor?” A common answer to this might be the procedures or the work instructions, or you might point to the visual aids readily available on the floor. Yet another answer might be the production boards where the running total along with reject information is recorded. All of this is correct. A general definition of information is something that carries content, which is related to data. I am not going into Claude Shannon’s work with information in this post. Fujimoto’s brilliant view of information is that every artifact on the production floor, and in fact every material thing, carries information. Fujimoto defines an information asset as the basic unit of an information system. Information cannot exist without the materials or energy in which it is embodied – its medium.

info asset

This information model indicates that the manufactured product carries information. The information it carries came from the design of the product. The information is transferred and transformed from the fixtures/dies/prints etc onto the physical product. Any loss of information during this process results in a defective product. To take this concept further, even if the loss of information is low, the end-user interaction with the product brings in a different dimension. The end-user gains information when he interacts with the product. If this information matches his expectations, he is satisfied. Even if there is minimal loss of information from design to manufacturing, if the end product information does not match the user’s expectations, the user gets dissatisfied.

Let’s look at a simple example of a door. A door with a handle is a poor design since the information of whether to push or pull is not clearly transferred to the user. The user might expect to pull on the handle instead of pushing on it. The information carried by the door handle is to “open the door using handle”. It does not convey whether to push or pull to open the door.

handle

Perhaps, one can add a note on the door that says, “Push”. A better solution to avoid the confusion is to eliminate the handle altogether so that the only option is to push. The removal of the handle with a note indicating “push” conveys the information that to open the door, one has to push. The information gets conveyed to the user and there is no dissatisfaction.

This example brings up an important point – a defect is created only when an operator or machine interacts with imperfect information. The imperfect information could be in the form of a worn-out die or an imperfect work instruction, which contributes to a loss of the original information being transferred to the product. When you are trying to solve a problem on the production floor, you are updating the information available on the medium so that the user’s interaction is modified to achieve the optimum result. This brings us to poka yoke or error-proofing.

If you think about it, you could say that the root cause for any problem is that the current process allows that problem to occur due to imperfect information.  This is what poka yoke tries to address. Toyota utilizes Jidoka and poka yoke to ensure product quality. Jidoka or autonomation is the idea that when a defect is identified, the process is stopped either by the machine in an automated process, or by the operator in an assembly line. The line is stopped so that the quality problem can be addressed. In the case of Jidoka, the problem has already occurred. In contrast, poka yoke eliminates the problem by preventing the problem from occurring in the first place. Poka yoke is the brainchild of probably one of the best Industrial Engineers ever, Shigeo Shingo. The best error-proofing is one where the operator cannot create a specific defect, knowingly or unknowingly. In this type of error-proofing, the information is embedded in the medium such that it conveys the proper method to the operator and if that method is not followed, the action cannot be completed. This information of only one proper way is physically embedded onto the medium.

Information in the form of work instructions may not always be effective because of limited interaction with the user. Information in the form of visual aids can be effective since it interacts with the user and provides useful information. However, the user can ignore this or get used to it. Information in the form of alarms can also be useful. This too may get ignored by the user and may not prevent the error from occurring. However, the user cannot ignore the information in the form of contact poka yoke since he has to interact with it. The proper assembly information is physically embedded in the material. A good example is a USB cable, which can be inserted in only one way. The USB icon on top indicates that it is the top. Apple took this approach further by eliminating the need for orientation altogether with its Lightning cables. The socket on the Apple product prevents any other cable from being inserted due to its unique shape.

Final Words:

The concept of physical artifacts carrying information is enlightening for me as a Quality Engineer. You can update the process information by updating a fixture to have a contact feature so that a part can be inserted in only one way. This information of proper orientation is embedded onto the fixture. This is much better than updating the work instruction to properly orient the part. The physical interaction ensures that the proper information is transferred to the operator to properly orient the part.

As I was researching for this post, I came across James Gleick who wrote the book, “The Information: A History, a Theory, a Flood”. I will finish off with a story I heard from James Gleick regarding information: When Gleick started working at the New York Times, a wise old head editor told him that the reader is not paying for all the news that they put in to be printed. What the reader is paying them was for all the news that they left out.

Always keep on learning…

In case you missed it, my last post was Divine Wisdom and Paradigm Shifts:

Which Way You Should Go Depends on Where You Are:

compass

I recently read the wonderful book “How Not To Be Wrong, The Power of Mathematical Thinking” by Jordan Ellenberg. I found the book to be enlightening and a great read. Jordan Ellenberg has the unique combination of being knowledgeable and capable of teaching in a humorous and engaging way. One of the gems in the book is – “Which way you should go depends on where you are”. This lesson is about the dangers of misapplying linearity. When we are thinking in terms of abstract concepts, the path from point A to point B may appear to be linear. After all, the shortest path between two points is a straight line. This type of thinking is linear thinking.

To illustrate this, let’s take the example of poor quality issues on the line. The first instinct to improve quality is to increase inspection. In this case, point A = poor quality, and point B = higher quality. If we plot this incorrect relationship between Quality and Inspection, we may assume it as a linear relationship – increasing inspection results in better quality.

Inspection and Quality

However, increasing inspection will not result in better quality in the long run and will result in higher costs of production. We must build quality in as part of the normal process at the source and not rely on inspection. In TPS, there are several ways to do this including Poka Yoke and Jidoka.

In a similar fashion, we may look at increasing the number of operators in the hopes of increasing productivity. This may work initially. However, increasing production at the wrong points in the assembly chain can hinder the overall production and decrease overall productivity. Taiichi Ohno, the father of Toyota Production System, always asked to reduce the number of operators to improve the flow. Toyota Production System relies on the thinking of the people to improve the overall system.

The two cases discussed above are nonlinear in nature. Thus increasing one factor may increase the response factor initially. However, continually increasing the factor can yield negative results. One example of a non-linear relationship is shown below:

productivity

The actual curve may of course vary depending on the particularities of the example. In nonlinear relationships, which way you should go depends on where you are. In the productivity example, if you are at the Yellow star location on the curve, increasing the operators will only decrease productivity. You should reduce the number of operators to increase productivity. However, if you are at the Red star, you should look into increasing the operators. This will increase productivity up to a point, after which the productivity will decrease. Which Way You Should Go Depends on Where You Are!
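To make this concrete, here is a minimal sketch using a made-up productivity curve. The function `productivity`, its peak at 10 operators, and the helper `which_way` are my own assumptions for illustration and are not taken from any real process data. The point is only that the same question ("should I add operators?") gets a different answer depending on the starting position.

```python
def productivity(operators):
    """Hypothetical curve: output rises with operators, then falls
    as they start getting in each other's way (peak at 10)."""
    return 20 * operators - operators ** 2


def which_way(operators):
    """Compare one operator more and one operator fewer
    against the current position on the curve."""
    here = productivity(operators)
    if productivity(operators + 1) > here:
        return "add operators"
    if productivity(operators - 1) > here:
        return "reduce operators"
    return "stay put"


print(which_way(4))   # left of the peak, like the Red star
print(which_way(15))  # right of the peak, like the Yellow star
```

On a real line the curve is unknown, so the comparison would have to come from experiments on the process rather than from a formula.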

In order to know where you are, you need to understand your process. As part of this, you need to understand the significant factors in the process. You also need to understand the boundaries of the process where things will start to break down. The only way you can truly learn your process is through experimentation and constant monitoring. It is likely that you did not consider all of the factors or the interactions. Everything is in flux and the only constant thing is change. You should be open to input from the operators and allow improvements to happen from the bottom up.

I will finish off with the anecdote of the “Laffer curve” that Jordan Ellenberg used to illustrate the concept of nonlinearity. One political party in America has been pushing for lowering taxes on the wealthy. The conservatives made this concept popular using the Laffer curve. Arthur Laffer was an economics professor at the University of Chicago. The story goes that Arthur Laffer drew the curve on the back of a napkin during dinner in 1974 with the senior members of then President Gerald Ford’s administration. The Laffer Curve is shown below:

Laffer curve

The horizontal axis shows the tax rate and the vertical axis shows the revenue that is generated from taxation. If there is no taxation, then there is no revenue. If there is 100% taxation, there is also no revenue because nobody would want to work and make money if they cannot hold on to it. The argument that was raised was that America was on the right-hand side of the curve and thus reducing taxation would increase revenue. Whether this assumption was correct has been challenged. Ellenberg used the following passage from Greg Mankiw, a Harvard economist and a Republican who chaired the Council of Economic Advisers under the second President Bush:

Subsequent history failed to confirm Laffer’s conjecture that lower tax rates would raise tax revenue. When Reagan cut taxes after he was elected, the result was less tax revenue, not more. Revenue from personal income taxes fell by 9 percent from 1980 to 1984, even though average income grew by 4 percent over this period. Yet once the policy was in place, it was hard to reverse.

The Laffer curve may not be symmetric as shown above. The curve may not be smooth and even as shown above and could be a completely different curve altogether. Ellenberg states in the book – “All the Laffer curve says is that lower taxes could, under some circumstances, increase tax revenue; but figuring out what those circumstances are requires deep, difficult, empirical work, the kind of work that doesn’t fit on a napkin.”

Always keep on learning…

In case you missed it, my last post was Epistemology at the Gemba:

Rules of 3 and 5:

rules of thumb

It has been a while since I have blogged about statistics. So in today’s post, I will be looking at rules of 3 and 5. These are heuristics or rules of thumb that can help us out. They are associated with sample sizes.

Rule of 3:

Let’s assume that you are looking at a binomial event (pass or fail). You took 30 samples and tested them to see how many passes or failures you get. The results yielded no failures. Then, based on the rule of 3, you can state that at 95% confidence level, the upper bound for a failure is 3/30 = 10% or the reliability is at least 90%. The rule is written as;

p = 3/n

Where p is the upper bound of failure, and n is the sample size.

Thus, if you used 300 samples, then you could state with 95% confidence that the process is at least 99% reliable based on p = 3/300 = 1%. Another way to express this is to say that with 95% confidence fewer than 1 in 100 units will fail under the same conditions.

This rule can be derived from the binomial distribution. The 95% confidence comes from the alpha value of 0.05. The value calculated from the rule of three formula gets more accurate with a sample size of 20 or more.
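As a rough sketch of where the rule comes from: with zero failures in n samples, the exact one-sided 95% upper bound p solves (1 – p)^n = 0.05, which gives p = 1 – 0.05^(1/n). Since –ln(0.05) is about 2.996, this is approximately 3/n. The function names below are my own, chosen for illustration.

```python
def rule_of_three(n):
    """Approximate 95% upper bound on the failure rate
    after n samples with zero failures."""
    return 3.0 / n


def exact_upper_bound(n, confidence=0.95):
    """Exact bound: the p that solves (1 - p)**n = 1 - confidence."""
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n)


# Compare the heuristic with the exact binomial bound.
for n in (10, 20, 30, 300):
    approx = rule_of_three(n)
    exact = exact_upper_bound(n)
    print(f"n={n:4d}  rule of 3: {approx:.4f}  exact: {exact:.4f}")
```

The heuristic always lands slightly above the exact bound, so it errs on the conservative side, and the gap narrows as n grows.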

Rule of 5:

I came across the rule of 5 from Douglas Hubbard’s informative book “How to Measure Anything” [1]. Hubbard states the Rule of 5 as;

There is a 93.75% chance that the median of a population is between the smallest and largest values in any random sample of five from that population.

This is a really neat heuristic because you can actually tell a lot from a sample size of 5! The median is the 50th percentile value of a population, the point where half of the population is above it and half of the population is below it. Hubbard points out that the probability of picking a value above or below the median is 50% – the same as a coin toss. Thus, we can calculate that the probability of getting 5 heads in a row is 0.5^5 or 3.125%. This would be the same for getting 5 tails in a row. Then the probability of not getting all heads or all tails is (100 – (3.125+3.125)) or 93.75%. Thus, we can state that the chance of at least one value out of five being above the median and at least one value being below the median is 93.75%.
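The arithmetic above can be checked with a short sketch. The simulation uses a uniform population between 0 and 1 (median 0.5) purely as an assumption for illustration; the function names, the number of trials, and the seed are also my own choices.

```python
import random


def rule_of_five_exact(k=5):
    """Probability that the population median lies between the
    smallest and largest of k random samples."""
    return 1.0 - 2.0 * 0.5 ** k


def rule_of_five_simulated(trials=100_000, k=5, seed=1):
    """Monte Carlo check on a uniform(0, 1) population, median 0.5."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.random() for _ in range(k)]
        # A hit means the sample range brackets the true median.
        if min(sample) < 0.5 < max(sample):
            hits += 1
    return hits / trials


print(rule_of_five_exact())      # 0.9375
print(rule_of_five_simulated())  # close to 0.9375
```

The same calculation with a different sample size shows how quickly the chance grows: each extra sample halves the probability of missing the median.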

Final words:

The reader has to keep in mind that both of the rules require the use of randomly selected samples. The Rule of 3 is a version of Bayes’ Success Run Theorem and Wilk’s One-sided Tolerance calculation. I invite the reader to check out my posts that shed more light on this: 1) Relationship between AQL/RQL and Reliability/Confidence, 2) Reliability/Confidence Level Calculator (with c = 0, 1….., n) and 3) Wilk’s One-sided Tolerance Spreadsheet.

When we are utilizing random samples to represent a population, we are calculating a statistic – a representative value of the parameter value. A statistic is an estimate of the parameter, the true value from a population. The larger the sample size used, the better the statistic can represent the parameter and the better your estimate.

I will finish with a story based on chance and probability;

It was the finals and an undergraduate psychology major was totally hung over from the previous night. He was somewhat relieved to find that the exam was a true/false test. He had taken a basic stat course and did remember his professor once performing a coin flipping experiment. In a moment of clarity, he decided to flip a coin he had in his pocket to get the answer for each question. The psychology professor watched the student the entire two hours as he was flipping the coin…writing the answer…flipping the coin….writing the answer, on and on. At the end of the two hours, everyone else had left the room except for this one student. The professor walks up to his desk and angrily interrupts the student, saying: “Listen, it is obvious that you did not study for this exam since you didn’t even open the question booklet. If you are just flipping a coin for your answer, why is it taking you so long?”

The stunned student looks up at the professor and replies bitterly (as he is still flipping the coin): “Shhh! I am checking my answers!”

Always keep on learning…

In case you missed it, my last post was Kenjutsu, Ohno and Polanyi:

[1] How to Measure Anything.

Jidoka, the Governing Principle for Built-in-Quality:

721px-Centrifugal_governor

Harold Dodge said – “You cannot inspect quality into a product; it must be built into it.”[1] This is something that has stuck with me ever since I entered the work force. This means that quality must be viewed as an intrinsic attribute of a manufacturing process. The idea of quality being part of the process cannot be brought out by talking to the employees or with slogans or short-lived programs. In order to have quality be a part of the process, it has to be a part of the process intrinsically!

I came across the concept of James Watt’s centrifugal governor. This is essentially a feedback system that holds the speed of an engine at a desired state. This is shown in the picture above. As the speed increases, it causes the “flyballs” to move away from each other due to the centrifugal force, and this causes the arms to go up, which controls the valve to reduce the fuel intake. This is beautifully explained by Stafford Beer in his 1966 book, “Decision and Control” [2]. He states that with the centrifugal governor, the system is brought under control in the very act of going out of control. The regulation is intrinsic (it is part of the system).
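The feedback idea can be sketched with a toy simulation. Everything numeric here is an assumption for illustration: the engine dynamics, the gain, the target speed, and the function name `simulate_governor` are made up and do not model a real engine. The valve opening depends only on the current speed, so the correction is part of the system itself.

```python
def simulate_governor(initial_speed=0.0, target_speed=100.0,
                      gain=0.02, steps=200, dt=0.1):
    """Toy engine with intrinsic speed regulation.

    The valve closes in proportion to how far the speed is above
    the target and opens when the speed is below it.
    """
    speed = initial_speed
    history = []
    for _ in range(steps):
        # Flyballs rise with speed and throttle the valve
        # (the opening is clamped between fully shut and fully open).
        valve = min(1.0, max(0.0, 0.5 - gain * (speed - target_speed)))
        # Hypothetical dynamics: the valve drives the engine,
        # while friction proportional to speed slows it down.
        acceleration = 200.0 * valve - speed
        speed += acceleration * dt
        history.append(speed)
    return history


# Starting too slow or too fast, the speed settles at the target.
print(round(simulate_governor(initial_speed=0.0)[-1], 3))
print(round(simulate_governor(initial_speed=180.0)[-1], 3))
```

No outside inspector checks the speed and issues a correction; the deviation itself moves the valve, which is the sense in which the regulation is intrinsic.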

When you think about it, Jidoka in TPS is doing exactly that. Jidoka is the governing principle in TPS to ensure built-in-quality. Jidoka was introduced as a concept by Sakichi Toyoda with his automatic loom that stopped when a thread was broken. Jidoka was explained by Toyota as autonomation or automation with human touch. In Toyota’s little green book, The Toyota Production System – Leaner Manufacturing for a Greener Planet, Jidoka is explained as;

Jidoka is a humanistic approach to configuring the human-machine interface. It liberates operators from the tyranny of the machine and leaves them free to concentrate on tasks that enable them to exercise skill and judgment.

Jidoka ensures that the machines are able to detect any abnormality and automatically stop whenever one occurs. This concept of stopping production when there is an abnormality was implemented on the production lines with the use of andon cords. When an operator identifies a problem that cannot be solved within the allotted time, the operator can pull on the andon cord to stop the production line, thus making the problem immediately visible. This is a “human jidoka”. This prevents defective items from progressing down the assembly line, causing larger issues and wasting time. It also leads to identifying opportunities for improvement with the product and/or the process, and it provides valuable time to coach the employee.

The concept of Jidoka is an effort to make built-in-quality intrinsic to the manufacturing process. Allowing the operator to stop the entire production line is an act of giving autonomy to the operator. The quality is not being pushed top-down, but allowed to emerge bottom-up. This is an example of what Toyota calls “Good Thinking leading to Good Products”.

In a similar vein, I wanted to draw comparisons to Zen. In Zen, there is a concept of “monkey mind”. This is the racing mind that does not allow one to sit down and meditate. Many different thoughts and emotions go through the mind when one is trying to have a quiet mind. Buddha taught disciples to focus on the breath as a way to calm down the monkey mind. This is a really hard thing to do and requires a lot of practice. When the mind drifts off, it needs to be brought back. The Zen teachers teach us that the source of control is also the mind, the very same thing that causes the focus to be lost. Meditation is the art of coming back to the focus again and again. My favorite story on this is from the great teacher Yunmen Wenyan.

 Yunmen was asked by his student, “How can I control my mind to not lose focus when I am trying to meditate?”

Yunmen replied, “The coin that is lost in the river can only be found in the same river.”

Always keep on learning…

In case you missed it, my last post was Learning to See:

[1] https://www.amazon.com/Out-Crisis-Press-Edwards-Deming-ebook/dp/B00653KTES/ref=sr_1_1?s=books&ie=UTF8&qid=1497211354&sr=1-1&keywords=9780262297189

[2] https://www.amazon.com/Decision-Control-Operational-Management-Cybernetics/dp/0471948381

Process Validation and the Problem of Induction:


From “The Simpsons”

Marge: I smell beer. Did you go to Moe’s?

Homer: Every time I have beer on my breath, you assume I’ve been drinking.[1]

In today’s post, I will be looking at process validation and the problem of induction.  I have looked at process validation through another philosophical angle by using the lesson of the Ship of Theseus [4] in an earlier post.

US FDA defines process validation [2] as;

“The collection and evaluation of data, from the process design stage through commercial production, which establishes scientific evidence that a process is capable of consistently delivering quality product.”

My emphases on FDA’s definition are the two words – “capability” and “consistency”. One of the misconceptions about process validation is that once the process is validated, then it achieves almost an immaculate status. One of the horror stories I have heard from my friends in the Medical Devices field is that the manufacturer stopped inspecting the product since the process was validated. The problem with validation is the problem of induction. Induction is a process in philosophy – a means to obtain knowledge by looking for patterns from observations and coming to a conclusion. For example, the swans that I have seen so far are white, thus I conclude that ALL swans are white. This is a famous example to show the problem of induction because black swans do exist. However, the data I collected showed that all of the swans in my sample were white. My process of collection and evaluation of the data appears capable and the output consistent.

The misconception that the manufacturer had in the example above was the assumption that the process is going to remain the same and thus the output also will remain the same. This is the assumption that the future and present are going to resemble the past. This type of thinking is termed the assumption of “uniformity of nature” in philosophy. This problem of induction was first thoroughly questioned and looked at by the great Scottish philosopher David Hume (1711-1776). He was an empiricist who believed that knowledge should be based on one’s sense based experience.

One way of looking at process validation is to view the validation as a means to develop a process where it is optimized such that it can withstand the variations of the inputs. Validation is strictly based on the inputs at the time of validation. The 6 inputs – man, machine, method, materials, inspection process and the environment, all can suffer variation as time goes on. These variations reveal the problem of induction – the results are not going to stay the same. There is no uniformity of nature. The uniformities observed in the past are not going to hold for the present and future as well.

In general, when we are doing induction, we should try to meet five conditions:

  1. Use a large sample size that is statistically valid
  2. Make observations under different and extreme circumstances
  3. Ensure that none of the observations/data points contradict
  4. Try to make predictions based on your model
  5. Look for ways to make your model fail, and test them

The use of statistics is considered a must for process validation. The use of a statistically valid sample size ensures that we make meaningful inferences from the data. The use of different and extreme circumstances is the gist of operational qualification or OQ. OQ is the second qualification phase of process validation. Above all, we should understand how the model works. This helps us to predict how the process works and thus any contradicting data point must be evaluated. This helps us to listen to the process when it is talking. We should keep looking for ways to see where it fails in order to understand the boundary conditions. Ultimately, the more you try to make your model fail, the better and more refined it becomes.
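To make “statistically valid sample size” a little more concrete, here is a minimal sketch of one common approach for pass/fail data, the zero-failure “success run” calculation. This is my illustration and not something prescribed by the guidance documents; the function name and the confidence and reliability values are assumptions chosen for the example. It finds the smallest n for which n consecutive passes demonstrate a given reliability R at confidence C, that is, 1 − R^n ≥ C.

```python
import math


def success_run_sample_size(confidence: float, reliability: float) -> int:
    """Smallest n such that n passes with zero failures demonstrate
    `reliability` at `confidence`, i.e. 1 - reliability**n >= confidence.

    Hypothetical helper for illustration only.
    """
    if not (0 < confidence < 1 and 0 < reliability < 1):
        raise ValueError("confidence and reliability must be between 0 and 1")
    # Solve reliability**n <= 1 - confidence for n and round up.
    return math.ceil(math.log(1 - confidence) / math.log(reliability))


# Example values (assumed): 95% confidence with 95% or 99% reliability.
print(success_run_sample_size(0.95, 0.95))  # 59
print(success_run_sample_size(0.95, 0.99))  # 299
```

Note how quickly the sample size grows as the reliability target goes up. Even then, the result only speaks for the inputs that existed when the samples were made, which is the problem of induction again.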

The FDA’s guidance on process validation [2] and the GHTF (Global Harmonization Task Force) [3] guidance on process validation both try to address the problem of induction through “Continued Process Verification” and “Maintaining a State of Validation”. We should continue monitoring the process to ensure that it remains in a state of validation. Anytime any of the inputs are changed, or if the outputs show a trend of decline, we should evaluate the possibility of revalidation as a remedy for the problem of induction. This brings to mind the quote “Trust but verify”. It is said that Ronald Reagan got this quote from Suzanne Massie, an American writer and scholar of Russia. The original is a Russian proverb, “Doveryai, no proveryai”.
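As a rough illustration of what continued monitoring can look like, the sketch below derives simple control limits from the data gathered at validation and then flags new measurements that either fall outside those limits or decline steadily. The function names, the three-standard-deviation limits and the threshold of six consecutive decreases are all assumptions I chose for the example; they are not taken from either guidance document, and the numbers are made up.

```python
from statistics import mean, stdev


def control_limits(baseline):
    """Return (lower, center, upper) as mean +/- 3 sample standard deviations."""
    center = mean(baseline)
    spread = 3 * stdev(baseline)
    return center - spread, center, center + spread


def needs_review(baseline, new_points, run_length=6):
    """True if new data leaves the baseline limits or keeps declining.

    `run_length` is the number of consecutive decreases that counts as a
    declining trend (a hypothetical threshold for this sketch).
    """
    lower, _, upper = control_limits(baseline)
    out_of_limits = any(x < lower or x > upper for x in new_points)
    # Track the longest run of consecutive point-to-point decreases.
    longest = current = 0
    for prev, cur in zip(new_points, new_points[1:]):
        current = current + 1 if cur < prev else 0
        longest = max(longest, current)
    return out_of_limits or longest >= run_length


# Made-up measurements from the validation runs.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]

print(needs_review(baseline, [10.0, 10.1, 9.9]))  # False
print(needs_review(baseline, [10.0, 10.6]))       # True
```

A flag here is not a verdict. It is a prompt to go and look at what changed in the inputs, and to decide whether revalidation is warranted.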

I will finish off with a story from the great Indian epic Mahabharata, which points to the lack of uniformity in nature.

Once a beggar asked for some help from Yudhishthir, the eldest of the Pandavas. Yudhishthir told him to come on the next day. The beggar went away. At the time of this conversation, Yudhishthir’s younger brother Bhima was present. He took one big drum and started walking towards the city, beating the drum furiously. Yudhishthir was surprised.

He asked the reason for this. Bhima told him:
“I want to declare that our revered Yudhishthir has won the battle against time (Kaala). You told that beggar to come the next day. How do you know that you will be there tomorrow? How do you know that beggar would still be alive tomorrow? Even if you both are alive, you might not be in a position to give anything. Or, the beggar might not even need anything tomorrow. How did you know that you both can even meet tomorrow? You are the first person in this world who has won the time. I want to tell the people of Indraprastha about this.”

Yudhishthir got the message behind this talk and called that beggar right away to give the necessary help.

Always keep on learning…

In case you missed it, my last post was If a Lion Could Talk:

[1] The Simpsons – Season 27; Episode 575; Every Man’s Dream

[2] https://www.fda.gov/downloads/drugs/guidances/ucm070336.pdf

[3] https://www.fda.gov/OHRMS/DOCKETS/98fr/04d-0001-bkg0001-10-sg3_n99-10_edition2.pdf

[4] https://harishsnotebook.wordpress.com/2015/03/08/ship-of-theseus-and-process-validation/

[5] Non-uniformity of Nature Clock drawing by Annie Jose

The Big Picture of Problem Solving:

In today’s post, I will be looking at Problem Solving. I am a Quality Professional, and this is a topic near and dear to my heart. There are several problem solving methods out there, which include tools like the Ishikawa Diagram, 5 Why, etc. I will try to shed light on the big picture of problem solving.

Sometimes we fall into the trap of reductionist thinking when trying to solve problems. The reductionist approach is to take things apart and study the parts in isolation. We need to understand that problems are sometimes attributable to the emergent properties of the system and are manifestations of the interactions between the parts. This means that a system has parts, and that the properties of the system come from both the parts and the interactions between the parts. The parts themselves cannot perform the function of the system. For example, the wheel of a bicycle cannot do anything by itself. The same is applicable to the handle. Even when the different parts are put together, the bicycle cannot do anything by itself. When there is a rider, then there is the possibility of the pedals moving, and the wheels rolling. We can say that the system is the bicycle and the rider combined together, and this system has a purpose – to go from one place to the other.

From a problem solving standpoint, we should use both reductionist and holistic approaches. Reductionist thinking is mechanistic in nature, and it does not look at how everything works in relation to one another. However, this thinking has value and is needed to some extent. Russell Ackoff, the famous Systems Thinker, has stated that reductionist thinking, the idea that everything can be reduced to its individual parts, helps us in understanding how a system works. However, this does not explain why a system works the way it does. This requires holistic thinking. Holistic thinking is the “big picture” thinking – how the parts interact together to align with the system’s purpose, and how the system’s emergent properties align with the system’s purpose. This is the thinking that leads to the understanding of why a system is acting the way it is.

When we add humans to the mix, we are introducing parts that have a purpose of their own that may not align with the system’s purpose. The problems that arise from the interaction of humans and other parts in the system are tricky. One of my favorite stories on this is the Cobra Effect story. During the British rule in India, there was a concern about the high number of venomous snakes, especially deadly Cobras, in Delhi. The British regime in Delhi posted rewards for dead Cobras. This had some impact initially since the farmers started killing Cobras. However, things soon got out of hand when some of the farmers started breeding Cobras in order to get the reward. The reward program was scrapped by the British regime when they became aware of this. The interaction between the farmers and the reward system was strong, and the purpose of the farmers was to get as much reward as possible, whereas the intent of the system as desired by the British regime was to eliminate or reduce venomous snakes. It is not easy to predict all the things that can go wrong. However, as we build a system we should look into the resilience properties of the system, with the expectation that some interactions have been overlooked.

This also reminds me of a manufacturing related story from my Materials Selection class in school. A plant started utilizing ultrasonically welded plastic parts onto which plastic tubes were assembled. After 6 months, an operator noted that all of the assembled components in inventory were cracked. This puzzled everybody, and the finger was first pointed at the supplier that provided the welded plastic parts. However, the inventory of the incoming components did not show any cracked parts. It was later identified that a new operator had started using alcohol as a lubricant to assemble the tubes onto the plastic parts. The operator was trying to make the operation easier to do. The alcohol-induced chemical stress along with the residual stress from the welding led to the cracking. The human interaction with the part – the ease of assembly – had not been looked at. The operator’s purpose was to make his process easy, and he did not look at the big picture – how this interacted with other parts in the system.

Reductionist thinking alone is linear in nature and leads to quick fixes and band-aids. Some examples are simply replacing a part of the system or providing training alone as the reaction to the problem.

Holistic thinking, on the other hand, is not linear in nature and does not lead to quick fixes applied in the hope that they address the problem. Holistic thinking results in either changing a part of the system, or changing how a part interacts with the system. Both of these result in a modified system.

I have identified nine points to further improve our big picture understanding of problem solving:

  • Problems as Manifestations of Emergent Properties:

Sometimes, the problems are manifestations of the emergent properties in the system. This means that the interactions between parts in the system, when the system is taken as a whole, resulted in the problem. This type of problem cannot be addressed by looking at the parts alone.

  • Cause-Effect Relationship is not Always Linear:

The cause-effect relationship is not always linear. It is not simply that Factor “A” causes Effect “B”. Rather, Factor “A”, interacting with Factor “D” and Factor “E” in the environment of the system, resulted in the problem. The problem and the cause(s) are not always direct and easy to trace.

  • It’s About Interactions:

When trying to solve a problem, understand the interactions in the system first. This was explained by the two stories above.

  • Does Your Solution Create New Problems?

The “verification” stage of a problem solving activity is always deemed important. This is when we verify that our solution addresses the problem. However, we also need to look at whether the solution can create a new problem. Are we impacting or creating any new interactions that we are not aware of? This is evident from the adage – “Today’s problems are created by yesterday’s solutions”.

  • Go to the Gemba:

The best and possibly the only way to truly understand the interactions and how the system behaves in an environment is by going to the Gemba – where the action is. You cannot solve a problem effectively by sitting in an office.

  • How Much Does Your Solution Fix the Problem?

There is always more than one solution that can address the problem. Some of these are not feasible or not cost effective. One solution alone often cannot address the problem in its entirety. There are two questions that are asked in a problem solving process: a) Why did the problem happen? and b) Why did the problem escape the production environment? In the light of these questions, we should understand how much of the problem can be fixed by our solutions.

  • What is the Impact of Environment?

Sometimes problems manifest themselves only under certain environmental conditions. The most recent Wells Fargo incident is reported to have started with the push from management to meet aggressive sales goals. This created an environment that eventually led to fraudulent activities. An article on CNN reported: “Relentless pressure. Wildly unrealistic sales targets.” The employees were asked to sell at least eight accounts to every customer, up from about three accounts ten years earlier. The CEO explained the reason for eight accounts: Why eight? “The answer is, it rhymed with ‘great.’”

  • Quick Fixes = Temporary Local Optimization:

Problems persist when the first reaction is to put band-aids on them. We have to see quick fixes as an attempt to temporarily optimize locally in the hope that the problem will go away. This almost always leads to an increase in cost and a reduction in quality and productivity.

  • Involve the Parts in your Solution:

It goes without saying that the solutions should always involve the people involved in the process. It is ultimately their process. It is our job to make sure that they are aware of the system in its entirety. For example, train them on how a product is eventually used. What is the impact of what they do?

Always keep on learning…

In case you missed it, my last post was In-the-Customer’s-Shoes Quality.

The Pursuit of Quality – A Lesser Known Lesson from Ohno:

In today’s post, I will be looking at a lesser known lesson from Taiichi Ohno regarding the pursuit of Quality.

“The pursuit of quantity cultivates waste while the pursuit of quality yields value.”

Ohno was talking about using andons and the importance of resisting mass production thinking. Andon means “lantern” in Japanese, and is a form of visual control on the floor. Toyota requires and requests the operators to pull the andon cord to stop the line if a defect is found and to alert the lead about the issue. Ohno said the following about andons:

“Correcting defects is necessary to reach our goal of totally eliminating waste.”

Prior to the oil crisis in the early 1970s, all the other companies in Japan were buying high-volume machines to increase output. They reasoned that they could store the surplus in the warehouse and sell it when the time was right. Toyota, on the other hand, resisted this and built only what was needed. According to Ohno, the companies following mass-production thinking got a rude awakening in the wake of the oil crisis since they could not dispose of their high inventory. Meanwhile Toyota thrived and their profits increased. The other companies started taking notice of the Toyota Production System.

Ohno’s lesson of the pursuit of quality to yield value struck a chord with me. This concept is similar to Dr. Deming’s chain reaction model. Dr. Deming taught us that improvement of quality begets the natural and inevitable improvement of productivity. His entire model is shown below (from his book “Out of the Crisis”).

Deming Chain reaction

Dr. Deming taught the Japanese workers that the defects and faults that get into the hands of the customer lose the market and cost him his job. Dr. Deming taught the Japanese management that everyone should work towards a common aim – quality.

Steve Jobs Story:

I will finish with a story I heard from Tony Fadell, who worked as a consultant for Apple and helped with the creation of the iPod. Tony said that Steve Jobs did not like the “Charge Before Use” sticker on all of the electronic gadgets that were available at that time. Jobs argued that the customer had paid money anticipating using the gadget immediately, and that the delay from charging takes away from customer satisfaction. The normal burn-in period used to be 30 minutes for the iPod. The burn-in is part of the Quality/Reliability inspection where the electronic equipment runs certain cycles for a period of time with the intent of stressing the components to weed out any defective or “weak” parts. Jobs changed the burn-in time to two hours so that when the customer got the iPod, it was fully charged and ready to use right away. This was a 300% increase in the inspection time (from 30 minutes to 120 minutes) and would have impacted the lead time. Traditional thinking would argue that this was not a good decision. However, this counterintuitive approach was welcomed by the customers, and nowadays it is the norm that electronic devices come charged so that the end user can start using them immediately.

Always keep on learning…

In case you missed it, my last post was Challenge and Kaizen.

Dharma, Karma and Quality:

In today’s post I will be looking at the statement – quality is everyone’s responsibility. This is an interesting, preachy statement. There are two questions that can be answered by this statement:

  1. Who is responsible for quality?
  2. What is everyone responsible for?

The first question (who is) is the wrong question to ask because it leads to blaming and never results in an improvement of the current state. The second question is just too broad to answer. Everyone is surely responsible for more than just quality.

Dharma and Karma:

The best way to explain responsibility is by looking at “dharma”. “Dharma” is an ancient Sanskrit term, and goes back to about 1500 BC. The word was first explained in the ancient Indian scripture, the Rig Veda, as a means to achieve a sense of order in the world. The term can loosely be translated as “responsibility”, or “something that needs to be done from a sense of duty”. The main purpose of dharma is to preserve or uphold the order in a system. For example, the dharma of a plant is to bloom.

This brings me to the next word – “karma”. “Karma” is more commonly used in the English language, and everybody has some understanding of this word. The term actually means “action” in Sanskrit. The action can be in the past, present or in the future. However, every one of your actions has a consequence. This attaches the “cause and effect” meaning to the word “karma”. There are three types of karma identified in the Sanskrit texts:

  1. Karma = action
  2. Vikarma = wrong action
  3. Akarma = no action (doing nothing is a form of action, and sometimes this is the right thing to do)

If everybody performs karma according to their dharma, then the system is sustained successfully.

Top Management – 85% or 100% Responsible?

The answer to the question, “who is responsible for quality” is sometimes answered as “Top Management”. Dr. Deming taught that “85% of all quality problems are management problems”. He is also supposed to have stated “85% of TQC’s (Total Quality Control program) success depends on the president.” This can be depicted as the chart below.

Responsibility

I have viewed this as – patient zero is in the board room.

The view of Taiichi Ohno, the father of the Toyota Production System, on this was as follows:

“In reality, TQC’s success depends on the president’s resolution to assume 100% responsibility. The president should imagine him or herself taken hostage by TQC and become devoted to human quality control.”

Dr. Deming has also said that – Quality is made in the board room. However, he goes on to clarify this. Quality is everyone’s responsibility, but top management has the most leverage of all to make a meaningful impact with their decisions.

In this light, the answer to the question – “what is your responsibility?” is “You are responsible for what you can control.”

Top management’s dharma is to lay down the framework for the entire organization to grow. This involves strong vision, big and drastic improvements (innovation) and growth. Middle Management’s dharma is to enforce and reinforce the framework through maintaining the status quo while encouraging small improvements (kaizen) and developing people. The operator’s dharma is to aid middle management to maintain status quo while looking for opportunities for improvements. The push for maintaining status quo is to provide a temporary structure for the process so that it can be studied for improvements. The main goal is destruction of the status quo so that a new standard can be achieved. If the karma aligns with the dharma, then the organization will sustain itself, grow and be successful.

Final Words:

I have recently rediscovered Dr. Deming’s definition of quality – Quality is the pride of workmanship. I will use Dr. Deming to succinctly summarize this post.

“In a well organized system all the components work together to support each other. In a system that is well led and managed, everybody wins. This is what I taught Japanese top management and engineers beginning in 1950.”

I will finish off with a Zen monk story:

A monk was driving his car when a dog from nowhere crossed the road. Although the monk tried stopping his car, he ran over the dog, killing it. The monk stopped his car and parked it. He looked around and saw a temple across from the road. He went to the temple and knocked at the door. Another monk opened the door.

The first monk bowed his head and said “I am so sorry.”

He pointed to where the accident happened and continued; “My karma ran over your dogma over dharma”. (My car ran over your dog over there.)

Always keep on learning…

In case you missed it, my last post was To Be or Not To Be.

The Mystery of Missing Advent Calendar Chocolates:

It is Christmas time, which means it is advent calendar time for the kids and for those of us who are kids at heart. My wife bought our kids chocolate advent calendars from Trader Joe’s. For those who do not know advent calendars, these are countdown calendars to Christmas starting on December 1st. Each day has a window which you can open to reveal a chocolate. Each day has a uniquely shaped chocolate: a Christmas tree, a stocking, etc. The kids love this.

We keep the advent calendars on the top of our refrigerator to ensure they are not tempted to eat all of the chocolate at once. This morning, I found the advent calendars on the table and a crying Annie. Annie is our youngest daughter. She was very upset.

“I did not get any chocolate today from my calendar”, she said while crying.

“You must have eaten it already”, was my response. Of course, the kids eat chocolate and sometimes they are impatient and eat more than one day’s worth. In my mind, it was a reasonable assumption to make.

Annie explained that she opened the window with 6 on it and did not find any chocolate. I looked at the calendar, and sure enough, the window with 6 on it was open. My initial hypothesis stayed the same – Annie ate the chocolate, and she is not telling me the entire truth.

My wife suggested she open the window for day 7 and eat that chocolate. Annie then proceeded to open the window with 7 on it, in front of me. Lo and behold, it did not have any chocolate. Annie looked at me with sad eyes. I realized I was wrong to have assumed that Annie had eaten the chocolate!

“This is a mystery”, said Audrey, her twin sister.

Now I had a second hypothesis – those darn calendar makers; they do not know what they are doing. They obviously missed filling all the spots with chocolate. As a Quality Engineer, I have seen operator errors. I have now jumped to my second hypothesis.

Having thought about it for a bit, I looked at the available information. Based on what Annie told me, the chocolate was not in its spot for two consecutive days. These calendars did not have the numbers in consecutive order. They were placed in random order. It did not seem likely to me that two windows at different locations would both be missing candy. She had opened a spot between 6 and 7 on an earlier day, and it had the candy.

I had a reasonable hypothesis – the operator/equipment missed the spots in the calendar. I have seen it happen before in different environments. But still, something was not right.

I proceeded to put the advent calendar back onto the top of the refrigerator. Then I thought of something. I wanted to test the calendar more. I carefully opened the calendar from the base. It was a card board box with a plastic tray inside.

Just then I found out what happened! In multiple places, the chocolate was missing. The chocolates had been dislodged from their cavities. They were all gathered at the bottom of the box. It could be from the transportation. It could be the end user, i.e. my excited young daughter, who shook the calendar. It could be the design of the calendar that allows extra space between the tray and the cardboard.

The most important thing was that Annie was now happy that she got her candies. Audrey was happy that we indeed had a mystery that we could solve. My wife and I were happy that our kids were happy.

Final Words:

This personal story has made me realize again that we should not jump to conclusions. Listen to that tiny little voice that says “there is something more to this”…

Always keep on learning…

In case you missed it, my last post was about “Lady Tasting Tea”.