# AQL/RQL/LTPD/OC Curve/Reliability and Confidence

It has been a while since I have posted about statistics. In today’s post, I am sharing a spreadsheet that generates an OC Curve based on your sample size and the number of rejects. I get asked a lot about a way to calculate sample sizes based on reliability and confidence levels. I have written several posts before. Check this post and this post for additional details.

The spreadsheet is hopefully straightforward to use. The user has to enter data in the required yellow cells. A good rule of thumb is to use a 95% confidence level, which corresponds to an alpha of 0.05. The spreadsheet will plot two curves. One is the standard OC curve, and the other is an inverse OC curve. The inverse OC curve has the probability of rejection on the Y-axis and % Conforming on the X-axis. These correspond to Confidence level and Reliability respectively. I will discuss the OC curve and how we can get a statement that corresponds to a Reliability/Confidence level from the OC curve.

The OC Curve is a plot of the Probability of Acceptance against the % Nonconforming. The lower the % Nonconforming, the higher the Probability of Acceptance. The probability can be calculated using the Binomial, Hypergeometric or Poisson distribution. The OC Curve shown is for n = 59 with 0 rejects, calculated using the Binomial distribution.

The Producer’s risk is the risk of good product getting rejected. The Acceptance Quality Limit (AQL) is generally defined as the percent defective that the plan will accept 95% of the time (in the long run). Lots that are at or better than the AQL will be accepted at least 95% of the time (in the long run). If the lot fails, we can say with 95% confidence that the lot quality level is worse than the AQL. Likewise, we can say that a lot at the AQL that is acceptable has a 5% chance of being rejected. In the example, the AQL is 0.09%.

The Consumer’s risk, on the other hand, is the risk of accepting bad product. The Lot Tolerance Percent Defective (LTPD) is generally defined as the percent defective that the plan will reject 90% of the time (in the long run). We can say that a lot at or worse than the LTPD will be rejected at least 90% of the time (in the long run). If the lot passes, we can say with 90% confidence that the lot quality is better than the LTPD (% nonconforming is less than the LTPD value). We could also say that a lot at the LTPD that is defective has a 10% chance of being accepted.
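The AQL and LTPD definitions above can be checked numerically. The following is a minimal sketch using the Binomial distribution, not the spreadsheet's actual formulas; the function names `prob_accept` and `quality_at` are made up for this illustration.

```python
from math import comb

def prob_accept(p, n, c):
    """One point on the OC curve: Binomial probability of finding at most
    c rejects in a sample of n when the fraction nonconforming is p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

def quality_at(pa_target, n, c, tol=1e-12):
    """Fraction nonconforming at which the probability of acceptance equals
    pa_target. Bisection works because prob_accept falls as p rises."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prob_accept(mid, n, c) > pa_target:
            lo = mid   # accepted too often, so the answer is a worse lot
        else:
            hi = mid
    return (lo + hi) / 2

n, c = 59, 0
aql = quality_at(0.95, n, c)    # accepted 95% of the time
ltpd = quality_at(0.10, n, c)   # rejected 90% of the time
print(f"AQL  = {aql:.2%}")
print(f"LTPD = {ltpd:.2%}")
```

For n = 59 and 0 rejects this should reproduce the AQL of about 0.09% quoted above, with an LTPD of roughly 3.8%.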

The vertical axis (Y-axis) of the OC Curve goes from 0% to 100% Probability of Acceptance. Alternatively, we can say that the Y-axis corresponds to 100% to 0% Probability of Rejection. Let’s call this Confidence.

The horizontal axis (X-axis) of the OC Curve goes from 0% to 100% for % Nonconforming. Alternatively, we can say that the X-axis corresponds to 100% to 0% for % Conforming. Let’s call this Reliability. We can easily invert the Y-axis so that it aligns with a 0 to 100% confidence level. In addition, we can also invert the X-axis so that it aligns with a 0 to 100% reliability level. This is shown below. What we can see is that, for a given sample size and defects, the more reliability we try to claim, the less confidence we can assume. For example, in the extreme case, 100% reliability lines up with 0% confidence.
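The trade-off between reliability and confidence can be tabulated directly. This is a small sketch assuming the Binomial model used for the OC curve; the helper name `confidence` is hypothetical.

```python
from math import comb

def confidence(reliability, n, rejects):
    """Confidence = 1 - P(accept), evaluated at
    % nonconforming = 1 - reliability (the inverted OC curve)."""
    p = 1 - reliability
    p_accept = sum(comb(n, k) * p**k * (1 - p)**(n - k)
                   for k in range(rejects + 1))
    return 1 - p_accept

# Same plan as the OC curve example: n = 59 with 0 rejects
for r in (0.90, 0.95, 0.99, 1.00):
    print(f"reliability {r:.0%} -> confidence {confidence(r, 59, 0):.1%}")
```

The confidence drops as the claimed reliability rises, and reaches 0% at 100% reliability.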

I welcome the reader to play around with the spreadsheet. I am very much interested in your feedback and questions. The spreadsheet is available here.

In case you missed it, my last post was Nature of Order for Conceptual Models:

# MTTF Reliability, Cricket and Baseball

I originally hail from India, which means that I was eating, drinking and sleeping Cricket for at least a good part of my childhood. Growing up, I used to “get sick” and stay home when the one TV channel that we had broadcast Cricket matches. One thing I never truly understood then was how the batting average was calculated in Cricket. The formula is straightforward:

Batting average = Total Number of Runs Scored/ Total Number of Outs

Here “out” indicates that the batsman had to stop his play because he was unable to keep his wicket. In Baseball terms, this is similar to a strikeout or a catch where the player has to leave the field. The part that I could not understand was when the Cricket batsman did not get out. The runs he scored were added to the numerator, but no change was made to the denominator. I could not see this as a true indicator of the player’s batting average.

When I started learning about Reliability Engineering, I finally understood why the batting average calculation was bothering me. The way the batting average in Cricket is calculated is very similar to the MTTF (Mean Time To Failure) calculation. MTTF is calculated as follows:

MTTF = Total time on testing/Number of failures

For a simple example, if we were testing 10 motors for 100 hours and three of them failed at 50, 60 and 70 hours respectively, we can calculate MTTF as 293.33 hours (880 total motor-hours divided by 3 failures). The problem with this is that the data are right-censored. This means that we still have samples where the failure has not occurred when we stopped the testing. This is similar to the case where we do not include the number of innings where the batsman did not get out. A key concept to grasp here is that the MTTF or the MTBF (Mean Time Between Failures) metric is not for a single unit. There is more to this than just saying that on average a motor is going to last 293.33 hours.
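The arithmetic behind the motor example can be written out as a short sketch. The function name `mttf` and its parameters are made up for this illustration.

```python
def mttf(failure_times, n_units, test_duration):
    """Total time on test divided by the number of failures.
    Units that never failed are right-censored: each contributes the
    full test duration to the numerator and nothing to the denominator."""
    survivors = n_units - len(failure_times)
    total_time = sum(failure_times) + survivors * test_duration
    return total_time / len(failure_times)

# 10 motors, 100-hour test, failures at 50, 60 and 70 hours
print(round(mttf([50, 60, 70], n_units=10, test_duration=100), 2))  # 293.33
```

The seven surviving motors add 700 hours to the numerator without adding to the failure count, just as a not-out innings adds runs without adding an out.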

When we do reliability calculations, we should be aware of whether censored data is being used, and use appropriate survival analysis to make a “reliability specific statement” such as: we can expect that 95% of the motor population will survive x hours. Another good approach is to calculate the lower confidence bound for the MTBF. A good resource is https://www.itl.nist.gov/div898/handbook/apr/section4/apr451.htm.
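As a rough sketch of the lower bound idea, the motor example can be worked through under two assumptions that are not stated in the post: failure times follow an exponential distribution (constant failure rate), and the test was stopped at a fixed time. Under those assumptions the one-sided lower bound on MTTF is 2T divided by the chi-square quantile with 2(r + 1) degrees of freedom. The function names are hypothetical, and the chi-square quantile is computed by hand so that only the standard library is needed.

```python
from math import exp, log

def chi2_cdf_even(x, df):
    """Chi-square CDF, closed form valid only for even degrees of freedom."""
    half = x / 2
    term, total = 1.0, 0.0
    for i in range(df // 2):
        total += term              # adds (x/2)**i / i!
        term *= half / (i + 1)
    return 1 - exp(-half) * total

def chi2_quantile_even(prob, df):
    """Invert chi2_cdf_even by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, df) < prob:
        hi *= 2
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_cdf_even(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def mttf_lower_bound(total_time, failures, confidence=0.95):
    """One-sided lower bound on MTTF for a time-terminated test
    (assumes exponentially distributed failure times)."""
    return 2 * total_time / chi2_quantile_even(confidence, 2 * (failures + 1))

total_time = 7 * 100 + 50 + 60 + 70     # 880 motor-hours from the example
lower = mttf_lower_bound(total_time, failures=3)
t95 = -lower * log(0.95)                # time that 95% of units survive
print(f"MTTF lower bound: {lower:.1f} hours")
print(f"95% of motors expected to survive at least {t95:.1f} hours")
```

With these assumptions the 95% lower bound comes out far below the 293.33-hour point estimate (roughly 113 hours), which shows how little three failures tell us. If the failure rate is not constant, a Weibull or other survival model would be the better choice.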

Ty Cobb, Don Bradman and Sachin Tendulkar:

We can compare the batting averages in Cricket to Baseball. My understanding is that the batting average in Baseball is calculated as follows:

Batting Average = Number of Hits/Number of At Bats

Here the hit can be in the form of singles, home runs etc. Apparently, this statistic was first introduced by the English statistician Henry Chadwick, who was a keen Cricket fan.

I want to now look at the greats of Baseball and Cricket, and take a different approach to their batting capabilities. I have chosen Ty Cobb, Don Bradman and Sachin Tendulkar for my analyses. Ty Cobb has the highest career batting average in American Baseball. Don Bradman, an Australian Cricketer often called the best Cricket player ever, has the highest batting average in Test Cricket. Sachin Tendulkar, an Indian Cricketer and one of the best Cricket players of recent times, has the largest number of runs scored in Test Cricket. The batting averages of the three players are shown below:

As we discussed in the last post regarding calculating reliability with a Bayesian approach, we can make reliability statements in place of batting averages. Based on 4191 hits in 11420 at bats, we could state that, with 95% confidence, Ty Cobb is at least 36% likely to make a hit in the next at bat. We can apply the Baseball batting average concept to Cricket. In Cricket, scoring fifty runs is a sign of a good batsman. Bradman scored fifty or more runs on 56 occasions in 80 innings (70%). Similarly, Tendulkar scored fifty or more runs on 125 occasions in 329 innings (38%).

We could state that with 95% confidence, Bradman was at least 61% likely to score fifty or more runs in the next inning. Similarly, Sachin was at least 34% likely to score fifty runs or more in the next inning at a 95% confidence level.
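The three statements above can be reproduced with a short sketch. It assumes a uniform Beta(1, 1) prior, which the post does not state explicitly, and takes the 5th percentile of the resulting Beta posterior as the 95% lower bound. The function names are made up for this illustration. For whole-number parameters, the Beta CDF equals a Binomial tail sum, so only the standard library is needed.

```python
from math import exp, lgamma, log

def beta_cdf(x, a, b):
    """CDF of Beta(a, b) for whole-number a and b, using the identity
    I_x(a, b) = P(Binomial(a + b - 1, x) >= a), summed in log space."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    n = a + b - 1
    log_n_fact = lgamma(n + 1)
    total = 0.0
    for j in range(a, n + 1):
        log_term = (log_n_fact - lgamma(j + 1) - lgamma(n - j + 1)
                    + j * log(x) + (n - j) * log(1 - x))
        total += exp(log_term)
    return total

def lower_bound(successes, trials, confidence=0.95):
    """Lower credible bound on the success probability, assuming a
    uniform Beta(1, 1) prior (an assumption of this sketch)."""
    a = 1 + successes
    b = 1 + trials - successes
    lo, hi = 0.0, 1.0
    for _ in range(50):                 # bisection on the posterior CDF
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < 1 - confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

cobb = lower_bound(4191, 11420)
bradman = lower_bound(56, 80)
tendulkar = lower_bound(125, 329)
for name, value in (("Cobb", cobb), ("Bradman", bradman), ("Tendulkar", tendulkar)):
    print(f"{name}: {value:.0%}")
```

The gap between the raw rate and the lower bound is widest for Bradman (70% versus about 61%) because he has the fewest innings.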

Final Words:

As we discussed earlier, batting average, like MTTF, is not a good estimate for a single inning. It is an attempt at a point estimate of reliability, but we need additional information to interpret it, and it should not be looked at as a single metric in isolation. We cannot expect that Don Bradman would score 99.94 runs in every innings.

In fact, in the very last match that Bradman played, all he had to do was score four runs to achieve the immaculate batting average of 100. He had been out only 69 times, and he needed just four measly runs to complete 7000 runs. Even if he got out in that inning, he would have achieved the spectacular batting average of 100 (7000 runs over 70 outs). He was one of the best players ever. His highest score was 334, which is called a “triple century” in Cricket and is a rare achievement. As indicated earlier, he was at least 61% likely to score fifty runs or more in the next inning. In fact, Bradman had scored more than four runs 69 times in 79 innings. Everyone expected Bradman to cross the 100 mark easily.

As fate would have it, Bradman scored zero runs: he was bowled out (the batsman misses and the ball hits the wicket) by the English bowler Eric Hollies on the second ball he faced. He had hit 635 fours in his career. A four is where the batsman scores four runs by hitting the ball so that it rolls over the boundary of the field. All Bradman needed was one four to achieve the “100”. Bradman proved that to be human is to be fallible. He still remains the best that ever was, and his record is far from broken. At this time, the second best batting average is 61.87.

Always keep on learning…

In case you missed it, my last post was Reliability/Sample Size Calculation Based on Bayesian Inference:

# Reliability/Sample Size Calculation Based on Bayesian Inference

I have written about sample size calculations many times before. One of the most common questions a statistician is asked is “how many samples do I need – is a sample size of 30 appropriate?” The appropriate answer to such a question is always – “it depends!”

In today’s post, I have attached a spreadsheet that calculates reliability based on Bayesian Inference. Ideally, one would want to have some confidence that the widgets being produced are x% reliable, or in other words, that it is x% probable that a widget will function as intended. There is the ubiquitous 90/90 or 95/95 confidence/reliability sample size table that is used for this purpose.

In Bayesian Inference, we do not assume that the parameter (the value that we are calculating, such as Reliability) is fixed. In the non-Bayesian (Frequentist) world, the parameter is assumed to be fixed, and we need to take many samples of data to make an inference regarding the parameter. For example, we may flip a coin 100 times and count the number of heads to determine the probability of heads with the coin (if we believe it is a loaded coin). In the non-Bayesian world, we may calculate confidence intervals. The confidence interval does not provide a lot of practical value.

My favorite explanation for the confidence interval is the analogy of an archer. Let’s say that the archer shot an arrow and it hit the bulls-eye. We can draw a 3” circle around this and call that our confidence interval based on the first shot. Now let’s assume that the archer shot 99 more arrows and they all missed the bulls-eye. For each shot, we drew a 3” circle around the hit, resulting in 100 circles. A 95% confidence interval simply means that 95 of the circles drawn contain the first bulls-eye that we drew. In other words, if we repeated the study many times, 95% of the confidence intervals calculated would contain the true parameter that we are after. This also means that the interval from the one study we did may or may not contain the true parameter.

Compared to this, in the Bayesian world, we calculate the credible interval. This practically means that we can be 95% confident that the parameter is inside the 95% credible interval we calculated.
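The “repeat the study many times” reading of a confidence interval can be illustrated with a small simulation of the coin example. This is only a sketch: it assumes a fair coin and uses the simple normal-approximation (Wald) interval, which is my choice for illustration and not something the post prescribes.

```python
import random
from math import sqrt

random.seed(1)                       # fixed seed so the run is repeatable
TRUE_P, FLIPS, STUDIES = 0.5, 100, 2000

covered = 0
for _ in range(STUDIES):
    # One "study": flip the coin 100 times and build a 95% interval
    heads = sum(random.random() < TRUE_P for _ in range(FLIPS))
    p_hat = heads / FLIPS
    half_width = 1.96 * sqrt(p_hat * (1 - p_hat) / FLIPS)
    if p_hat - half_width <= TRUE_P <= p_hat + half_width:
        covered += 1

coverage = covered / STUDIES
print(f"{coverage:.1%} of the intervals contain the true value")
```

Roughly 95% of the simulated intervals should contain the true value, while any single interval either contains it or does not.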

In the Bayesian world, we can have a prior belief and make an inference based on our prior belief. However, if your prior belief is very conservative, the Bayesian inference might make a slightly liberal inference. Similarly, if your prior belief is very liberal, the inference made will be slightly conservative. As the sample size goes up, the impact of this prior belief is minimized. A common method in Bayesian inference is to use the uninformative prior. This means that we are assuming equal likelihood for all possible values of the reliability. For a binomial distribution, we can use the beta distribution to model our prior belief. We will use (1, 1) to represent the uninformative prior. This is shown below:

For example, if we use 59 widgets as our samples and all of them met the inspection criteria, then we can calculate the 95% lower credible bound as 95.13%. This is assuming the (1, 1) beta values.

Now let’s say that we are very confident of the process because we have historical data. We can then assume a stronger prior belief, with beta values of (22, 1). The new prior plot is shown below:

Based on this, if we had 0 rejects for the 59 samples, then the 95% lower credible bound is 96.37%. A slightly higher reliability is estimated based on the strong prior.
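When the second beta value is 1 and there are no rejects, the lower bound has a closed form, which makes the two results above easy to verify. The sketch below is not taken from the spreadsheet, and the function name is hypothetical. It relies on the fact that a Beta(a, 1) prior updated with n passes and 0 rejects gives a Beta(a + n, 1) posterior, whose CDF is x raised to the power (a + n).

```python
def lower_bound_zero_rejects(n_passed, a_prior=1, confidence=0.95):
    """95% lower credible bound for a Beta(a_prior, 1) prior and zero rejects.
    Solves x ** (a_prior + n_passed) = 1 - confidence for x."""
    return (1 - confidence) ** (1 / (a_prior + n_passed))

uniform = lower_bound_zero_rejects(59, a_prior=1)    # (1, 1) prior
strong = lower_bound_zero_rejects(59, a_prior=22)    # (22, 1) prior
print(f"{uniform:.2%}")   # 95.13%
print(f"{strong:.2%}")    # 96.37%
```

In this special case the prior value of 22 acts like 21 extra passing samples added to the 59 observed.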

We can also calculate a very conservative case of (1, 22), where we assume very low reliability to begin with. This is shown below:

Now when we have 0 rejects with 59 samples, we are pleasantly surprised because we were expecting our reliability to be around 8-10%. The newly calculated 95% lower credible bound is 64.9%.

I have created a spreadsheet that you can play around with. Enter the data in the yellow cells. For a stronger prior (liberal), enter a higher a_prior value. Similarly, for a conservative prior, enter a higher b_prior value. If you are unsure, retain the (1, 1) value to have a uniform prior. The spreadsheet also calculates the maximum expected rejects per million value.
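For readers who prefer code to a spreadsheet, here is a rough sketch of the same kind of calculation for any a_prior and b_prior of at least 1. It is not the spreadsheet's formula: it integrates the Beta posterior density numerically with the midpoint rule and stops where the accumulated area reaches 1 minus the confidence. The function name is made up, and the rejects-per-million line assumes that figure is simply (1 − lower bound) × 1,000,000, which is my guess at what the spreadsheet reports.

```python
from math import exp, lgamma, log

def lower_credible_bound(n, rejects, a_prior=1.0, b_prior=1.0,
                         confidence=0.95, steps=200_000):
    """Lower credible bound on reliability from a Beta posterior.
    Assumes a_prior >= 1 and b_prior >= 1 so the density stays finite."""
    a = a_prior + (n - rejects)      # prior value plus observed passes
    b = b_prior + rejects            # prior value plus observed rejects
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    h = 1.0 / steps
    target = 1 - confidence
    cumulative = 0.0
    for i in range(steps):
        x = (i + 0.5) * h            # midpoint of the i-th cell
        density = exp(log_norm + (a - 1) * log(x) + (b - 1) * log(1 - x))
        if cumulative + density * h >= target:
            # the target area is reached inside this cell; interpolate
            return i * h + (target - cumulative) / density
        cumulative += density * h
    return 1.0

for a_prior, b_prior in ((1, 1), (22, 1), (1, 22)):
    bound = lower_credible_bound(59, 0, a_prior, b_prior)
    ppm = (1 - bound) * 1_000_000    # assumed definition, see above
    print(f"prior ({a_prior}, {b_prior}): {bound:.2%}, max {ppm:,.0f} rejects per million")
```

The three priors should give roughly the 95.13%, 96.37% and 64.9% figures discussed above.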

I will finish with my favorite confidence interval joke.

“Excuse me, professor. Why do we always calculate 95% confidence interval and not a 94% or 96% interval?”, asked the student.

“Shut up,” explained the professor.

Always keep on learning…

In case you missed it, my last post was Mismatched Complexity and KISS:

# Reliability/Confidence Level Calculator (with c = 0, 1, …, n)

The reliability/confidence level sample size calculation is fairly well known to Quality Engineers. For example, with 59 samples and 0 rejects, one can be 95% confident that the process is at least 95% reliable or that the process yields at least 95% conforming product.

I have created a spreadsheet “calculator”, that allows the user to enter the sample size, number of rejects and the desired confidence level, and the calculator will provide the reliability result.
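A minimal sketch of what such a calculator might do is shown below, assuming the Binomial distribution named in the spreadsheet title. The function name and the bisection approach are mine, not taken from the spreadsheet.

```python
from math import comb

def reliability(n, rejects, confidence=0.95, tol=1e-12):
    """Highest reliability R that can be claimed at the given confidence.
    Solves P(X <= rejects | n, p = 1 - R) = 1 - confidence by bisection."""
    def p_accept(p):
        return sum(comb(n, k) * p**k * (1 - p)**(n - k)
                   for k in range(rejects + 1))
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if p_accept(mid) > 1 - confidence:
            lo = mid      # this fraction nonconforming still passes too often
        else:
            hi = mid
    return 1 - (lo + hi) / 2

print(f"{reliability(59, 0):.2%}")   # 59 samples, 0 rejects
print(f"{reliability(93, 1):.2%}")   # 93 samples, 1 reject
```

Both cases come out at just over 95% reliability with 95% confidence, so allowing one reject costs an extra 34 samples.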

It is interesting to note that the reliability/confidence calculation, the LTPD calculation and Wilks’ first degree non-parametric one-sided tolerance calculation all yield the same results.

I will post another day about LTPD versus AQL.

The spreadsheet is available here: Reliability calculator based on Binomial distribution.

I have a new post in this topic. Check out https://harishsnotebook.wordpress.com/2019/10/19/aql-rql-ltpd-oc-curve-reliability-and-confidence/

Keep on learning…

I have created a spreadsheet that allows the user to calculate the number of samples needed for a desired one-sided tolerance interval at a desired confidence level. Additionally, the user can also enter the desired order for the sample size.

For example, if you have 93 samples, you can be 95% confident that 95% of the population is above the 2nd lowest sample value. Alternatively, you can also state that 95% of the population is below the 2nd highest sample value.
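The 93-sample example can be checked with a short sketch of the one-sided, distribution-free calculation. The function names are hypothetical, and the code simply searches upward for the smallest sample size that reaches the target confidence.

```python
from math import comb

def wilks_confidence(n, order, coverage):
    """Confidence that at least `coverage` of the population lies above
    the order-th lowest of n sample values."""
    q = 1 - coverage
    return 1 - sum(comb(n, j) * q**j * coverage**(n - j) for j in range(order))

def wilks_sample_size(coverage=0.95, confidence=0.95, order=1):
    """Smallest n for which the order-th lowest value is a valid bound."""
    n = order
    while wilks_confidence(n, order, coverage) < confidence:
        n += 1
    return n

print(wilks_sample_size(order=1))   # 59
print(wilks_sample_size(order=2))   # 93
```

By symmetry, the same sample sizes apply when the bound is the highest or 2nd highest value. With order 1 the sum has a single term, so the condition reduces to coverage ** n <= 1 - confidence.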

Here is an example of this in use.

If there is an interest, I can also try creating a two sided tolerance interval spreadsheet as well.

The keen student might notice that the formula is identical to the Bayes Success Run Theorem when the order p = 1.

The spreadsheet is available for download here: Wilks one sided

Keep on learning…