Process Validation and the Problem of Induction:

From “The Simpsons”

Marge: I smell beer. Did you go to Moe’s?

Homer: Every time I have beer on my breath, you assume I’ve been drinking.[1]

In today’s post, I will be looking at process validation and the problem of induction. In an earlier post, I looked at process validation from another philosophical angle, using the lesson of the Ship of Theseus [4].

The US FDA defines process validation [2] as:

“The collection and evaluation of data, from the process design stage through commercial production, which establishes scientific evidence that a process is capable of consistently delivering quality product.”

I want to emphasize two ideas in the FDA’s definition: capability and consistency (“capable” and “consistently” in the FDA’s wording). A common misconception about process validation is that once a process is validated, it attains an almost immaculate status. One of the horror stories I have heard from friends in the medical device field is of a manufacturer that stopped inspecting the product because the process had been validated. The problem with validation is the problem of induction. In philosophy, induction is a means of obtaining knowledge by looking for patterns in observations and generalizing from them to a conclusion. For example, all the swans I have seen so far are white, so I conclude that ALL swans are white. This is a famous illustration of the problem of induction, because black swans do exist. Yet the data I collected showed that every swan in my sample was white. My process of collecting and evaluating the data appears capable, and its output consistent.

The manufacturer in the example above assumed that the process would remain the same, and therefore that the output would remain the same as well. This is the assumption that the present and future will resemble the past, which philosophers call the assumption of the “uniformity of nature”. The problem of induction was first thoroughly examined by the great Scottish philosopher David Hume (1711-1776). He was an empiricist who believed that knowledge should be based on sense experience.

One way of looking at process validation is as a means of developing a process that is optimized to withstand variation in its inputs. Validation, however, is strictly based on the inputs at the time of validation. The six inputs (man, machine, method, materials, inspection process and environment) can all vary as time goes on. These variations reveal the problem of induction: the results are not guaranteed to stay the same. There is no uniformity of nature, and the uniformities observed in the past need not hold in the present or the future.

In general, when we are doing induction, we should try to meet five conditions:

  1. Use a large sample size that is statistically valid
  2. Make observations under different and extreme circumstances
  3. Ensure that none of the observations/data points contradict the conclusion
  4. Try to make predictions based on your model
  5. Look for ways to make your model fail, and test them

The use of statistics is considered a must for process validation. A statistically valid sample size ensures that we can make meaningful inferences from the data. Testing under different and extreme circumstances is the gist of operational qualification (OQ), the second qualification phase of process validation. Above all, we should understand how the model works. This lets us predict how the process will behave, so any contradicting data point must be evaluated. It helps us listen to the process when it is talking. We should keep looking for the ways it fails in order to understand its boundary conditions. Ultimately, the more you try to make your model fail, the better and more refined it becomes.
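
As a rough illustration of what a “statistically valid” sample size can mean, the sketch below uses the zero-failure (success-run) formula n = ln(1 - C) / ln(R). This formula is an assumption brought in for illustration; neither the FDA nor the GHTF guidance cited here prescribes it, and the function name is hypothetical. It answers one narrow question: how many consecutive passing units are needed to claim reliability R with confidence C.

```python
import math


def success_run_sample_size(confidence: float, reliability: float) -> int:
    """Smallest n such that n passes with zero failures demonstrate
    `reliability` at the given `confidence` (1 - reliability**n >= confidence).

    Illustrative only: assumes pass/fail (attribute) data and independent units.
    """
    if not (0 < confidence < 1 and 0 < reliability < 1):
        raise ValueError("confidence and reliability must be between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(reliability))


if __name__ == "__main__":
    print(success_run_sample_size(0.95, 0.95))  # 59
    print(success_run_sample_size(0.95, 0.99))  # 299
```

Even 299 consecutive passes only support a claim about the process as it was when sampled, which is the problem of induction again.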

The FDA’s guidance on process validation [2] and the GHTF (Global Harmonization Task Force) guidance on process validation [3] both try to address the problem of induction, through “Continued Process Verification” and “Maintaining a State of Validation” respectively. We should continue monitoring the process to ensure that it remains in a state of validation. Any time one of the inputs changes, or the outputs show a declining trend, we should evaluate whether revalidation is needed as a remedy for the problem of induction. This brings to mind the saying “Trust, but verify”. It is said that Ronald Reagan learned it from Suzanne Massie, an American writer on Russia. The original Russian proverb is “Doveryai, no proveryai”.
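
One possible way to picture this kind of ongoing monitoring is an individuals control chart: limits are set from the data collected at validation, and later production data is checked against them. The sketch below is only an illustration under stated assumptions. The function names are hypothetical, the limits use the conventional moving-range constant 2.66, and the “eight points in a row on one side of the center line” rule is one common run rule, not something taken from either guidance document.

```python
from statistics import mean


def individuals_limits(baseline):
    """Return (lcl, center, ucl) for an individuals chart built from
    validation-time measurements, using the average moving range."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline measurements")
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    center = mean(baseline)
    spread = 2.66 * mean(moving_ranges)  # 3 / d2, with d2 = 1.128 for n = 2
    return center - spread, center, center + spread


def find_signals(values, lcl, center, ucl, run_length=8):
    """Return (indices outside the limits, indices at which a run of
    `run_length` consecutive points on one side of the center completes)."""
    outside = [i for i, v in enumerate(values) if v < lcl or v > ucl]
    runs = []
    side, streak = 0, 0
    for i, v in enumerate(values):
        current = (v > center) - (v < center)  # +1 above, -1 below, 0 on center
        if current != 0 and current == side:
            streak += 1
        else:
            side, streak = current, (1 if current != 0 else 0)
        if streak == run_length:
            runs.append(i)
    return outside, runs


if __name__ == "__main__":
    validation_data = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.2, 9.8]
    lcl, center, ucl = individuals_limits(validation_data)
    production_data = [10.1] * 8 + [11.0]
    print(find_signals(production_data, lcl, center, ucl))
```

A signal from either rule does not prove that the process has changed; it is a prompt to investigate and, if needed, to consider revalidation.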

I will finish off with a story from the great Indian epic Mahabharata, which points to the lack of uniformity in nature.

Once a beggar asked Yudhishthir, the eldest of the Pandavas, for some help. Yudhishthir told him to come back the next day, and the beggar went away. Yudhishthir’s younger brother Bhima was present during this conversation. He picked up a big drum and started walking towards the city, beating the drum furiously. Yudhishthir was surprised.

He asked the reason for this. Bhima told him:
“I want to declare that our revered Yudhishthir has won the battle against time (Kaala). You told that beggar to come the next day. How do you know that you will be here tomorrow? How do you know that the beggar will still be alive tomorrow? Even if you are both alive, you might not be in a position to give anything. Or the beggar might not need anything tomorrow. How do you know that the two of you can even meet tomorrow? You are the first person in this world to have conquered time. I want to tell the people of Indraprastha about this.”

Yudhishthir got the message behind this talk and called that beggar right away to give the necessary help.

Always keep on learning…

In case you missed it, my last post was “If a Lion Could Talk”.

[1] The Simpsons – Season 27; Episode 575; Every Man’s Dream

[2] https://www.fda.gov/downloads/drugs/guidances/ucm070336.pdf

[3] https://www.fda.gov/OHRMS/DOCKETS/98fr/04d-0001-bkg0001-10-sg3_n99-10_edition2.pdf

[4] https://harishsnotebook.wordpress.com/2015/03/08/ship-of-theseus-and-process-validation/

[5] Non-uniformity of Nature Clock drawing by Annie Jose

OpenFDA API, with Excel:

The FDA has made its databases more open to developers and businesses alike through open.fda.gov. From their website: “The goal of the project is to create easy access to public data, to create a new level of openness and accountability, to ensure the privacy and security of public FDA data, and ultimately to educate the public and save lives.”

I have created an Excel interface that does not use a JSON library, and allows the user to perform searches based on multiple criteria. This interface will also allow the user to download the data for further manipulation.

A basic screenshot is shown below. Please note that this currently applies only to medical device adverse events.

[Screenshot: main query sheet]

The user enters the required information in the yellow cells. The query is based on a “count” criterion. The “Keyword” search is also worth noting; I found it quite useful when experimenting.

If the query yields results, the “FINAL HYPERLINK” cell turns green. If the query returns no results, the cell turns red. The user can also click on the hyperlink to view the results in a browser.
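
For readers who prefer code to a spreadsheet, the sketch below shows roughly how such a hyperlink could be assembled in Python. It is not the Excel implementation. It assumes the openFDA device adverse event endpoint at https://api.fda.gov/device/event.json with its `search` and `count` parameters; the function names and the example field `device.generic_name` are illustrative choices. The `has_results` helper assumes that the API answers a query with no matches with an HTTP 404, which would correspond to the red cell.

```python
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.fda.gov/device/event.json"


def build_count_query(criteria, count_field, date_range=None):
    """Build a count query URL from {field: value} search criteria.

    date_range is an optional (start, end) pair of YYYYMMDD strings
    applied to date_received.
    """
    terms = [f'{field}:"{value}"' for field, value in criteria.items()]
    if date_range:
        start, end = date_range
        terms.append(f"date_received:[{start} TO {end}]")
    params = {}
    if terms:
        params["search"] = " AND ".join(terms)
    params["count"] = count_field
    # Keep ':' and brackets readable; spaces become '+'.
    return f"{BASE_URL}?{urlencode(params, safe=':[]')}"


def has_results(url, timeout=10):
    """True if the query returns data, False if the API reports no matches
    (assumed here to be an HTTP 404). Other errors are re-raised."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except HTTPError as error:
        if error.code == 404:
            return False
        raise


if __name__ == "__main__":
    url = build_count_query(
        {"device.generic_name": "infusion pump"},
        "date_received",
        date_range=("20140101", "20141231"),
    )
    print(url)
```

Opening the printed URL in a browser plays the same role as clicking the hyperlink cell.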

The count criteria are shown below.

[Screenshot: count criteria]

Once the inputs are entered, the user clicks the “CLICK HERE” button, which runs the query and downloads the dataset to another sheet. This is shown below. I have used the FDA disclaimer section from the results for my data page.

The speed of the query has been pretty impressive.

[Screenshot: downloaded data sheet]

If the count selected is “date received”, the program automatically parses the data and creates a run chart along with the data sheet. This is shown below. The user can further group the dates into weekly or monthly run charts.

[Screenshot: run chart]
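
The regrouping of daily counts into weeks or months can be sketched in a few lines of Python. This is a hedged illustration, not the spreadsheet’s logic. It assumes that a count on a date field returns rows shaped like `{"time": "YYYYMMDD", "count": n}`; the function name and the sample rows are made up for the example.

```python
from collections import Counter
from datetime import datetime


def regroup_counts(results, period="month"):
    """Sum daily count rows into monthly ("YYYY-MM") or ISO-weekly
    ("YYYY-Www") totals, returned as a sorted list of (label, total)."""
    if period not in ("month", "week"):
        raise ValueError("period must be 'month' or 'week'")
    totals = Counter()
    for row in results:
        day = datetime.strptime(row["time"], "%Y%m%d")
        if period == "month":
            label = day.strftime("%Y-%m")
        else:
            iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
            label = f"{iso[0]}-W{iso[1]:02d}"
        totals[label] += row["count"]
    return sorted(totals.items())


if __name__ == "__main__":
    # Made-up rows for illustration only, not real openFDA data.
    sample = [
        {"time": "20140102", "count": 3},
        {"time": "20140115", "count": 2},
        {"time": "20140203", "count": 5},
    ]
    print(regroup_counts(sample, "month"))
    print(regroup_counts(sample, "week"))
```

The resulting (label, total) pairs can be plotted in time order as a run chart with any charting tool.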

Interested in R functions?

I have also created several functions in R to query the API and download the data to a .csv file. If there is interest in these, I can certainly share them.

Feedback request:

I am interested in getting feedback from users. If you have ideas for improving this further, please let me know. You can reach me at harishjose@gmail.com

Disclaimer:

This program must be used at your own risk. I do not guarantee the accuracy of the data. All the data is acquired through OpenFDA’s API. The data is updated frequently, and the “update” information is shown as part of the dataset.

Download:

You can download the spreadsheet here (.xls format).

Always keep on learning…