A minimal adaptive sampling plan for finite lot inspection

Wei Chen, Computer Science, UNO
Andrew W. Swift, Mathematics, UNO
Abstract
Quality control is an important part of any manufacturing process. Ideally, to guarantee that outgoing quality meets a certain standard, every item should be inspected. In many instances, however, this is either impractical or impossible, so sampling must be used. If the population size is large, statistical theory provides convenient methods for estimating quality. For small populations, however, these large-sample methods are not valid. We introduce an adaptive sampling plan based on exact confidence intervals for the hypergeometric distribution. A key property of the plan is that the sample size is always the smallest size required to show that the outgoing quality meets a specified target with a given confidence.
Key words: sampling plan, hypergeometric confidence interval, quality control
1. Introduction
It is usually not possible to observe every item in a production population directly. Instead, data are collected from a subset of items (a sample), and those observations are used to make inferences about the entire population, or lot.
Ideally, the defective rate in the sample matches that of the larger population. In that case, conclusions drawn from the sample are likely to apply to the entire population.
This correspondence between the sample and the population matters most when customers or manufacturers want to know what proportion of the items is defective or nonfunctional. For boxes of sealed food, for example, a sample is needed to estimate the chance that the food in a box is edible. If we opened every box, we could guarantee that the food is good enough to eat, but inspection would add to the cost of production and the opened boxes could no longer be shipped. So in many instances it is either impractical or impossible to guarantee quality by inspecting items one by one.
What is wanted is a process that reduces the cost of inspection yet still estimates the defective rate accurately, and that can even improve product quality at a given level of assurance.
Inspection and testing of produced items is one of the earliest research areas in quality control and improvement, predating the introduction of statistical methodology. Inspection or sampling can occur at many points in a process. Acceptance sampling is defined as the inspection or classification of a sample of items selected at random from a larger batch or lot, followed by a decision to accept or reject the whole lot. Inspections are classified by the point at which they are carried out. Outgoing inspection is performed immediately after production, before the product is shipped. Incoming inspection is performed before customers accept lots or batches of product from the manufacturer.
A lot acceptance sampling plan is an inspection scheme together with a set of rules for making decisions. The decision, based on the number of defectives counted in a sample, can be to accept the lot, to reject the lot, or, for multiple or sequential plans, to take another sample and then repeat the decision process.
The most widely used sampling plans are single sampling plans, double sampling plans, multiple sampling plans and sequential sampling plans.
In single sampling plans, one sample of items is selected at random from a lot and the disposition of the lot is determined from the resulting information. These plans are usually denoted (n, c) plans for a sample size n, where the lot is rejected if there are more than c defectives. They are the most common (and easiest) plans to use, although not the most efficient in terms of the average number of samples needed.
In double sampling plans, after the first sample is tested there are three choices: accept the lot, reject the lot, or take a second sample. If a second sample is taken, the results of both samples are combined and a final decision is made based on that information.
Multiple sampling plans extend double sampling plans to cases where more than two samples are needed to reach a decision. The advantage of multiple sampling is smaller sample sizes.
Sequential sampling plans extend multiple sampling further: items are selected from a lot one at a time, and after each item is inspected a decision is made to accept the lot, reject the lot, or select another unit. [2]
In this report, we introduce a new sequential sampling plan for outgoing inspection that uses the smallest sample size required to meet a specified target at a given confidence level.
2. Exact confidence interval of the hypergeometric distribution
In statistical methodology, estimating the distribution of defective products is very important. Traditionally, the number of defective items is assumed to follow a binomial or similar distribution. For a large population this approximation is acceptable. When the shipped lot is small, however, the binomial distribution struggles to capture the behaviour of the defective count in a finite lot, and the hypergeometric distribution gives a better estimate.
This section shows how to obtain exact confidence intervals under the hypergeometric distribution.
2.1 Probabilities of “tail” intervals
Consider $X(d, n, N)$, a hypergeometrically distributed random variable with positive integer parameters $d$, $n$ and $N$. For integer $k \in [0, n]$,
$$P(X(d, n, N) = k) = \frac{\binom{d}{k}\binom{N-d}{n-k}}{\binom{N}{n}},$$
and when $d < k \le n$, $P(X(d, n, N) = k) = 0$.
For $c \in [0, n]$, let the lower hypergeometric tail be the function
$$h(c, d) = P(X(d, n, N) \le c) = \sum_{i=0}^{c} \frac{\binom{d}{i}\binom{N-d}{n-i}}{\binom{N}{n}}.$$
Note that, for $c \ge \min(d, n)$ and all $N \ge \max(d, n)$,
$$h(c, d) = 1.$$
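The two formulas above translate directly into a few lines of code. The following is a minimal Python sketch, not code from this report: the function names `pmf` and `lower_tail` are illustrative, and the sketch assumes 0 <= d <= N and 1 <= n <= N.

```python
from math import comb


def pmf(k, d, n, N):
    """P(X(d, n, N) = k): probability of k defectives in a sample of n items
    drawn without replacement from a lot of N items containing d defectives."""
    if k < 0 or k > n:
        return 0.0
    # comb(d, k) is 0 when k > d, which covers the zero-probability case d < k <= n.
    return comb(d, k) * comb(N - d, n - k) / comb(N, n)


def lower_tail(c, d, n, N):
    """h(c, d) = P(X(d, n, N) <= c), the lower hypergeometric tail."""
    return sum(pmf(i, d, n, N) for i in range(c + 1))


# Illustrative values only: a lot of 20 items with 5 defectives, sample of 8.
print(lower_tail(2, 5, 8, 20))
```

Working with exact integer binomial coefficients and dividing once per term keeps the result accurate even for lots of a thousand items.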
Proposition 1. Let $c \in [0, \min(d, n)]$. Then
$$h(c, 0) = 1 \quad \text{and} \quad h(c, N) = 0 \ \text{for } c < n.$$
For $0 \le d \le N$ and fixed $c$, the value of $h(c, d)$ is non-increasing as $d$ increases; that is, $h(c, d)$ is decreasing for $d \in [0, N]$.
Proof: Write
$$h(c, d) = P(X(d, n, N) \le c) = \sum_{i=0}^{c} \frac{\binom{d}{i}\binom{N-d}{n-i}}{\binom{N}{n}} = \frac{1}{\binom{N}{n}} \sum_{i=0}^{c} \binom{d}{i}\binom{N-d}{n-i}.$$
Call this $p(d)$. We want to show that $p(d+1) - p(d) \le 0$ for every non-negative integer $d$ with $d + 1 \le N$.
Let $f(d) = \sum_{i=0}^{c} \binom{d}{i}\binom{N-d}{n-i}$. Using Pascal's rule in the forms $\binom{d+1}{i} = \binom{d}{i} + \binom{d}{i-1}$ and $\binom{N-d}{n-i} = \binom{N-d-1}{n-i} + \binom{N-d-1}{n-i-1}$, with the convention that a binomial coefficient with a negative lower index is 0,
$$\begin{aligned}
f(d+1) - f(d) &= \sum_{i=0}^{c} \binom{d+1}{i}\binom{N-d-1}{n-i} - \sum_{i=0}^{c} \binom{d}{i}\binom{N-d}{n-i} \\
&= \sum_{i=0}^{c} \left[\binom{d}{i} + \binom{d}{i-1}\right]\binom{N-d-1}{n-i} - \sum_{i=0}^{c} \binom{d}{i}\binom{N-d}{n-i} \\
&= \sum_{i=0}^{c} \binom{d}{i}\binom{N-d-1}{n-i} + \sum_{i=0}^{c} \binom{d}{i-1}\binom{N-d-1}{n-i} - \sum_{i=0}^{c} \binom{d}{i}\binom{N-d-1}{n-i} - \sum_{i=0}^{c} \binom{d}{i}\binom{N-d-1}{n-i-1} \\
&= \sum_{i=0}^{c-1} \binom{d}{i}\binom{N-d-1}{n-i-1} - \sum_{i=0}^{c} \binom{d}{i}\binom{N-d-1}{n-i-1} \\
&= -\binom{d}{c}\binom{N-d-1}{n-c-1} \le 0.
\end{aligned}$$
So $f(d+1) - f(d) \le 0$, and therefore
$$h(c, d+1) - h(c, d) = \frac{f(d+1) - f(d)}{\binom{N}{n}} \le 0$$
when $d \in [0, N-1]$, that is, when $d + 1 \in [1, N]$.
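The key step of the proof is the identity $f(d+1) - f(d) = -\binom{d}{c}\binom{N-d-1}{n-c-1}$. As a sanity check, the sketch below verifies it with exact integer arithmetic over every admissible $d$ and $c$ for a few small lots. The helper names `binom`, `f` and `check_identity` are illustrative and not part of the report.

```python
from math import comb


def binom(a, b):
    """Binomial coefficient with the convention C(a, b) = 0 when b < 0."""
    return comb(a, b) if b >= 0 else 0


def f(d, c, n, N):
    """Unnormalised lower tail: sum over i = 0..c of C(d, i) * C(N - d, n - i)."""
    return sum(binom(d, i) * binom(N - d, n - i) for i in range(c + 1))


def check_identity(N, n):
    """True if f(d+1) - f(d) == -C(d, c) * C(N-d-1, n-c-1) for all 0 <= d < N, 0 <= c <= n."""
    for d in range(N):
        for c in range(n + 1):
            lhs = f(d + 1, c, n, N) - f(d, c, n, N)
            rhs = -binom(d, c) * binom(N - d - 1, n - c - 1)
            if lhs != rhs:
                return False
    return True


print(check_identity(12, 5))
```

Because the right-hand side is never positive, a successful check also confirms numerically that the lower tail does not increase with $d$ for these parameters.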
Similarly, the upper hypergeometric tail is defined to be the function
$$\eta(c, d) = P(X(d, n, N) \ge c) = 1 - \sum_{i=0}^{c-1} \frac{\binom{d}{i}\binom{N-d}{n-i}}{\binom{N}{n}}.$$
Proposition 2. Let $c \in [1, \min(d, n)]$. Then
$$\eta(c, 0) = 0 \quad \text{and} \quad \eta(c, N) = 1.$$
For $0 \le d \le N$ and fixed $c$, the value of $\eta(c, d)$ is non-decreasing as $d$ increases; that is, $\eta(c, d)$ is increasing for $d \in [0, N]$.
Proof: With $c > 0$, we have $\eta(c, d) = 1 - h(c-1, d)$, where $c - 1 \in [0, \min(d, n) - 1]$. Applying the previous proposition to $h(c-1, d)$ gives the conclusion stated here.
2.2 Traditional exact confidence intervals
Let $d \in [0, N]$ and let $n$ be a positive integer. Given a number $\gamma \in (0, 1)$ and an observation $c \in [0, n]$ from a hypergeometric random variable $X(d, n, N)$, we would like to describe a $(\gamma \times 100)\%$ confidence interval $I(n, c) \subseteq [0, N]$. The basic idea behind traditional exact confidence intervals is this: an observation like $c$ should be an unusual observation for any $d$ outside of $I(n, c)$. Traditionally, “an observation like $c$” translates into an observation in $[0, c]$ or $[c, n]$. “Unusual” translates into
$$P(X(d, n, N) \le c) \le \frac{1-\gamma}{2} \quad \text{or} \quad P(X(d, n, N) \ge c) \le \frac{1-\gamma}{2}.$$
Since $\gamma \in (0, 1)$, both cannot be true, and one might logically replace $\frac{1-\gamma}{2}$ with $1 - \gamma$; this is our case. Here is a formal statement of the traditional translation of this idea:
$$d \notin I(n, c) \iff P(X(d, n, N) \le c) \le \frac{1-\gamma}{2} \quad \text{or} \quad P(X(d, n, N) \ge c) \le \frac{1-\gamma}{2}.$$
Another way to state the idea is that, having observed $c$, any $d \notin I(n, c)$ is rejected with a risk: for any specific rejected $d$ that is in fact correct, the probability of obtaining something like $c$ from $X(d, n, N)$ is at most $1 - \gamma$.
Let us assume that $c \in [0, n]$. As a function of $d$, $P(X(d, n, N) \le c) = h(c, d)$ decreases from 1 at $d = 0$ to 0 at $d = N$. By the intermediate value theorem, there is a unique smallest $d_2^* \in [0, N]$ such that
$$P(X(d_2^*, n, N) \le c) \le \frac{1-\gamma}{2}, \quad \text{or} \quad d \in [d_2^*, N] \Rightarrow P(X(d, n, N) \le c) \le \frac{1-\gamma}{2}.$$
Thus we reject any $d \in [d_2^*, N]$.
Similarly, with $c \in [1, n]$, the function $P(X(d, n, N) \ge c) = \eta(c, d)$ increases from 0 at $d = 0$ to 1 at $d = N$. Again, by the intermediate value theorem, there is a unique largest $d_1^* \in [0, N]$ such that
$$P(X(d_1^*, n, N) \ge c) \le \frac{1-\gamma}{2}, \quad \text{or} \quad d \in [0, d_1^*] \Rightarrow P(X(d, n, N) \ge c) \le \frac{1-\gamma}{2}.$$
Therefore, we reject any $d \in [0, d_1^*]$. Thus we may select $I(n, c) = [d_1^*, d_2^*]$ as a $(\gamma \times 100)\%$ confidence interval. In most cases, $d_1^*$ and $d_2^*$ are found by numerical search from the equations
$$P(X(d_1^*, n, N) \ge c) = \frac{1-\gamma}{2} \quad \text{and} \quad P(X(d_2^*, n, N) \le c) = \frac{1-\gamma}{2}.$$
Proposition 3. For $c \in [0, n]$, we have $d_1^* \le d_2^*$.
Proof: We prove this by contradiction. Suppose $d_1^* > d_2^*$, and choose some $d$ such that $d_2^* \le d \le d_1^*$. Since $d \le d_1^*$,
$$P(X(d, n, N) \ge c) \le P(X(d_1^*, n, N) \ge c) \le \frac{1-\gamma}{2}.$$
Since $d_2^* \le d$,
$$P(X(d, n, N) \le c) \le P(X(d_2^*, n, N) \le c) \le \frac{1-\gamma}{2}.$$
Therefore
$$1 = P(0 \le X(d, n, N) \le n) \le P(X(d, n, N) \le c) + P(X(d, n, N) \ge c) \le 1 - \gamma < 1$$
when $\gamma \in (0, 1)$. The contradiction implies that Proposition 3 is true.
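Because $d$ only takes integer values, the two bounds can also be found by direct enumeration rather than by solving equations. The sketch below is one possible reading of the construction, not the authors' code: it assumes a value of $d$ is rejected when the relevant tail probability is at most $(1-\gamma)/2$, takes $d_1^*$ as the largest $d$ rejected by the upper tail and $d_2^*$ as the smallest $d$ rejected by the lower tail, and uses the illustrative names `upper_tail` and `traditional_bounds`.

```python
from math import comb


def lower_tail(c, d, n, N):
    """h(c, d) = P(X(d, n, N) <= c); equals 0 for c < 0."""
    if c < 0:
        return 0.0
    terms = sum(comb(d, i) * comb(N - d, n - i) for i in range(min(c, n) + 1))
    return terms / comb(N, n)


def upper_tail(c, d, n, N):
    """P(X(d, n, N) >= c) = 1 - h(c - 1, d)."""
    return 1.0 - lower_tail(c - 1, d, n, N)


def traditional_bounds(c, n, N, gamma):
    """Return (d1, d2) by brute-force enumeration over d = 0..N.
    Either value is None if no d is rejected on that side."""
    alpha = (1 - gamma) / 2
    rejected_low = [d for d in range(N + 1) if upper_tail(c, d, n, N) <= alpha]
    rejected_high = [d for d in range(N + 1) if lower_tail(c, d, n, N) <= alpha]
    d1 = max(rejected_low) if rejected_low else None
    d2 = min(rejected_high) if rejected_high else None
    return d1, d2


# Illustrative values only: 2 defectives seen in a sample of 10 from a lot of 50.
print(traditional_bounds(2, 10, 50, 0.9))
```

Enumeration costs one tail evaluation per candidate $d$, which is negligible for the small lots this report targets.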
2.3 One-sided “tail”
In our case we need only one “tail”: an interval of the form $[d^*, N]$ such that
$$d \in [d^*, N] \Rightarrow P(X(d, n, N) \le c) \le 1 - \gamma.$$
The goal is to find the smallest $d^*$ satisfying $P(X(d^*, n, N) \le c) \le 1 - \gamma$, that is, $h(c, d^*) \le 1 - \gamma$. Since $h(c, d)$ is non-increasing in $d$ (Proposition 1), a simple search algorithm finds the solution.
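A linear scan over $d$ is the simplest such search. The sketch below is a minimal illustration under that assumption; the function name `smallest_d_star` and the symbol `gamma` for the confidence level are choices made here, not taken from the report.

```python
from math import comb


def lower_tail(c, d, n, N):
    """h(c, d) = P(X(d, n, N) <= c)."""
    terms = sum(comb(d, i) * comb(N - d, n - i) for i in range(min(c, n) + 1))
    return terms / comb(N, n)


def smallest_d_star(c, n, N, gamma):
    """Smallest d with h(c, d) <= 1 - gamma, found by scanning d = 0, 1, ..., N.
    Returns None when no such d exists (for example when c >= n, because
    observing at most c defectives is then certain for every d)."""
    threshold = 1 - gamma
    for d in range(N + 1):
        if lower_tail(c, d, n, N) <= threshold:
            return d
    return None


# Illustrative values: no defectives seen in a sample of 22 from a lot of 1000.
print(smallest_d_star(0, 22, 1000, 0.9))
```

Since the tail is monotone in $d$, a bisection over $d$ would give the same answer with fewer evaluations; the scan is kept here for clarity.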
3. The design of the sampling plan
3.1 Mapping the hypergeometric interval to inspection
Imagine the following scenario: before a lot of $N$ products is shipped, we need the defective rate to be below $p$ at the $(\gamma \times 100)\%$ confidence level. We have no further knowledge about the product. How do we decide the inspection size $n$ needed to achieve the requirement?
As mentioned above, the situation fits the hypergeometric distribution $P(X(d, n, N) \le c)$: $N$ is the size of the lot being sampled, $n$ is the size of the inspection sample, $d$ is the (estimated) number of defective items contained in the lot, and $c$ is the number of defective items observed.
With $N$, $n$ and $c$ fixed, the distribution of the random variable $X$ depends on the value of $d$. If the estimate $\hat{d}$ of $d$ is in $[0, d^*]$, then with at least $(\gamma \times 100)\%$ confidence the true $d$ is less than $d^*$.
Returning to our problem, we can obtain $d^*$ from the required defective rate and the lot size $N$:
$$d^* = p \times N.$$
From this information, we want the smallest $n$ such that, when $d \in [d^*, N]$, $P(X(d, n, N) \le c) \le 1 - \gamma$. In other words, when the lot of $N$ items contains $d \ge d^*$ defective items, the chance of observing $c$ or fewer defectives in a sample of size $n$ is as rare as $(1 - \gamma) \times 100\%$.
The way to find $n$ is to search over $[1, N]$. We develop a quick algorithm that, for fixed $N$, $c$ and $d^*$, obtains the smallest $n$ satisfying $P(X(d^*, n, N) \le c) \le 1 - \gamma$.
Why is the resulting $n$ the smallest sample size? Suppose $n$ is the smallest solution of $P(X(d^*, n, N) \le c) \le 1 - \gamma$. This means that if we inspect $n$ items selected at random from the lot and observe no more than $c$ defective items, we can conclude at confidence level $\gamma$ that the total number of defective items is less than $d^*$. If we instead pick $n + 1$ items and the number of observed defectives stays at $c$, then
$$P(X(d^*, n+1, N) \le c) \le P(X(d^*, n, N) \le c) \le 1 - \gamma,$$
so $d$ is even less likely to exceed $d^*$ under sample size $n + 1$ than under sample size $n$; in other words, there is more assurance that $d$ is below the required value. Any larger sample therefore also meets the requirement, and $n$ is the smallest sample size required.
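For concreteness, take the values of the bulb example in Section 3.3 ($N = 1000$, $d^* = 100$, confidence level 0.9) with $c = 0$. With no defectives allowed, the tail reduces to a product; the decimals below are rounded.

```latex
P(X(100, n, 1000) \le 0) = \frac{\binom{900}{n}}{\binom{1000}{n}}
                         = \prod_{i=0}^{n-1} \frac{900 - i}{1000 - i},
\qquad
P(X(100, 21, 1000) \le 0) \approx 0.107 > 0.1,
\qquad
P(X(100, 22, 1000) \le 0) \approx 0.096 \le 0.1 .
```

So 21 items are not enough, and 22 is the smallest sample size for $c = 0$, which agrees with the first line of the table used in that example.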
3.2 Finding the smallest n
From the analysis above, we know that $n$ ranges from 1 to $N$. With $d^*$, $c$ and $N$ fixed, the value of $P(X(d^*, n, N) \le c)$ is non-increasing as $n$ increases. The reason is intuitive: the larger the sample, the greater the chance of finding more defective items, and so the smaller the probability of observing $c$ or fewer. Because of this monotonicity, we can always find the smallest $n$ satisfying $P(X(d^*, n, N) \le c) \le 1 - \gamma$ (one exists whenever $c < d^*$, since inspecting all $N$ items then finds more than $c$ defectives with certainty). Any local search can be applied to this problem.
In our case, a pseudo-gradient search algorithm is used. The search steps are as follows:
1. Initialize n = 1 and step = 1; $N$, $d^*$, $c$ and $\gamma$ are specified.
2. While $P(X(d^*, n, N) \le c)$ is bigger than $1 - \gamma$, set n = n + step (never letting n exceed $N$) and double the step.
3. Halve the step (integer division). If the step is now 0, leave the loop. Otherwise set n = n + step if $P(X(d^*, n, N) \le c)$ is bigger than $1 - \gamma$, or n = n - step if it is not, and repeat step 3.
4. If, for the final n, $P(X(d^*, n, N) \le c) > 1 - \gamma$, set n = n + 1.
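One way to realise the doubling-and-halving search is sketched below in Python. It is an interpretation, not the authors' implementation: it assumes the step is doubled only until the threshold is first crossed and is halved from then on, it caps n at N, and it returns `None` when even full inspection cannot satisfy the condition. The names `smallest_n` and `too_likely` are illustrative.

```python
from math import comb


def lower_tail(c, d, n, N):
    """P(X(d, n, N) <= c) for a hypergeometric X."""
    terms = sum(comb(d, i) * comb(N - d, n - i) for i in range(min(c, n) + 1))
    return terms / comb(N, n)


def smallest_n(N, d_star, c, gamma):
    """Smallest n in [1, N] with P(X(d_star, n, N) <= c) <= 1 - gamma, or None."""
    threshold = 1 - gamma

    def too_likely(n):
        return lower_tail(c, d_star, n, N) > threshold

    if too_likely(N):
        return None  # happens when c >= d_star: no sample size is large enough

    # Growth phase: double the step until the condition first holds.
    n, step = 1, 1
    while too_likely(n):
        n = min(n + step, N)
        step *= 2

    # Refinement phase: halve the step and move towards the boundary.
    while True:
        step //= 2
        if step == 0:
            break
        n = n + step if too_likely(n) else n - step

    # Final correction: the loop can stop one item short of the boundary.
    if too_likely(n):
        n += 1
    return n


print(smallest_n(1000, 100, 0, 0.9))  # 22 for the bulb example in Section 3.3
```

The search needs on the order of log N tail evaluations, compared with up to N for a linear scan, and relies only on the tail being non-increasing in n.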
3.3 The scheme of the proposed sequential sampling plan
Using the algorithm, we can obtain the sample size once $N$, $d^*$, $c$ and $\gamma$ are specified. We therefore build an inspection table in which $c$ is incremented by 1 and the corresponding sample size $n$ is listed beside it, as in Figure 1.
Figure 1. Sampling table.
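A table of this kind can be generated programmatically. The sketch below does not reproduce the figure; it is a minimal illustration that uses a plain linear scan in place of the search algorithm, with the illustrative name `sampling_table`, and it lists the whole lot size N when no smaller sample suffices.

```python
from math import comb


def lower_tail(c, d, n, N):
    """P(X(d, n, N) <= c) for a hypergeometric X."""
    terms = sum(comb(d, i) * comb(N - d, n - i) for i in range(min(c, n) + 1))
    return terms / comb(N, n)


def sampling_table(N, d_star, gamma, max_c):
    """Rows (c, n) for c = 0..max_c, where n is the smallest sample size
    with P(X(d_star, n, N) <= c) <= 1 - gamma."""
    rows = []
    n = 1
    for c in range(max_c + 1):
        # n never decreases as c grows, so each scan resumes where the last stopped.
        while n <= N and lower_tail(c, d_star, n, N) > 1 - gamma:
            n += 1
        rows.append((c, min(n, N)))  # n > N means the whole lot must be inspected
    return rows


# Lot of 1000, at most 100 defectives allowed, confidence level 0.9.
for c, n in sampling_table(1000, 100, 0.9, 3):
    print(c, n)
```

Resuming each scan from the previous sample size keeps the total work proportional to the largest n in the table.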
Items are taken at random from the lot of population size $N$. As items are selected and inspected, a decision is made whether to
1) accept the lot, if a large enough number of nondefective items has already been encountered;
2) continue to select more items for further inspection; or
3) inspect all the remaining items.
For example, suppose we want to check a box of 1000 bulbs; the required defective rate is 0.1 at confidence level 0.9, as described in Figure 1. According to the first line of the table, we need to take 22 items from the lot. If all of these bulbs work, we accept the whole lot. If 2 bulbs do not work, we need to take another 51 - 22 = 29 bulbs for further inspection and remove the 2 defective bulbs. The process is repeated until either, for some c, no further defective items are found in the additional sample, or the whole population has been checked.
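The procedure can be sketched as a short simulation. This is a hedged illustration rather than the authors' program: it assumes the lot is given as a list of booleans in inspection order (True meaning defective), that after finding `found` defectives the sample is extended to the table entry for c = `found`, and that the lot is accepted as soon as an extension reveals no new defective. The name `inspect_lot` and the example lot are hypothetical.

```python
import random
from math import comb


def lower_tail(c, d, n, N):
    """P(X(d, n, N) <= c) for a hypergeometric X."""
    terms = sum(comb(d, i) * comb(N - d, n - i) for i in range(min(c, n) + 1))
    return terms / comb(N, n)


def inspect_lot(lot, d_star, gamma):
    """Run the sequential plan on `lot` and return (items inspected, defectives found)."""
    N = len(lot)
    inspected = found = 0
    n = 1
    while inspected < N:
        # Smallest n with P(X(d_star, n, N) <= found) <= 1 - gamma, or N if none exists.
        while n < N and lower_tail(found, d_star, n, N) > 1 - gamma:
            n += 1
        new_defectives = sum(lot[inspected:n])
        inspected = max(inspected, n)
        if new_defectives == 0:
            break  # the sample of n items shows only `found` defectives: accept
        found += new_defectives
    return inspected, found


# Hypothetical lot: 1000 items, 52 of them defective, in random order.
rng = random.Random(0)
lot = [True] * 52 + [False] * 948
rng.shuffle(lot)
print(inspect_lot(lot, 100, 0.9))
```

The defectives found are the ones that would be removed before shipping, so the shipped lot contains the original defectives minus `found`.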
3.4 Analysis of sampling plan
After this inspection process (the outgoing inspection), the shipped items have at most the required defective rate $p$ at the given confidence level $\gamma$.
The risk of rejecting an acceptable lot is 0, because the inspection needs no information about the assumed defective rate of the production line. In other words, whatever the defective rate of the production line, the required $p$ can always be met by sequential sampling. If the defective rate of the production line, or the number of defective items in the lot, is high, the sample size frequently reaches the population size, which means checking the items one by one. If the real defective rate is low, or the number of bad items in the lot is small, the sample size stays small as well. So when an acceptable lot is under inspection, in the worst case it is confirmed by checking the items one by one; the lot is never rejected.
The risk of accepting an unacceptable lot is at most $1 - \gamma$. As shown in the previous section, when $c$ defective items are observed, the chance that the real number of defective items contained in $N$ exceeds the required number is at most $1 - \gamma$.
4. Simulations and conclusions
4.1 Simulations
We simulate the inspection scenario by computer. We virtually place every 1000 items into a box containing randomly generated defective items. The required shipped defective rate is 0.1, or 100 defective items per 1000 items, and the confidence level is 90%. We then use the sequential sampling plan to inspect the boxes. The results of the simulations are listed in Table 1.
Defective items per 1000    Final sample size    Defective items     Defective rate
(before inspection)                              found (removed)     shipped
52                          37                   1                   5.11%
80                          101                  6                   7.6%
100                         476                  41                  4.28%
105                         999                  105                 0.0%
Table 1. Simulation of products before and after the sequential sampling. Assumptions: every box contains 1000 items; the required defective rate is 10%; the confidence level is 90%; the real numbers of defective items in the boxes are 52, 80, 100 and 105.
The sampling plan accepts lots with a lower defective rate using a smaller sample size. When the lot contains more defective items than required, the sample size becomes bigger, up to the whole population size. The defective rate of the shipped product is below the required rate in every case.
4.2 Conclusions
Our sequential sampling plan is proposed to solve the problem of inspecting lots with a small population size. Traditional sampling plans prefer a large population size, since the binomial and normal distributions work better for large populations. The hypergeometric distribution models the small-population situation better.
The fundamental problem is the estimation of the defective rate $p$ of the production line. The larger the population, the more accurate the estimate of $p$. Our sampling plan is distribution-free with respect to the defective rate of the production line: it avoids the difficulty of obtaining knowledge of that rate, yet still meets the requirement. We can fix in advance the exact confidence interval for the number of defective items according to the requirement $p$ and $N$, and then find the smallest sample size $n$ that satisfies the confidence level.
Another merit of the sampling plan is that the process can improve the quality of the items. If the defective rate of the production line is higher than required, inspection removes the defective items, so the lot that finally passes meets the requirements. In this situation the lot has little chance of passing with a small sample size; it may need a large sample size or even one-by-one inspection. Once the lot passes the sampling, it shows the required characteristic at the given confidence. The cost of inspection can be reduced by improving product quality in production; otherwise quality must be improved through stricter or more extensive inspection.
References
1. Ross, S. M., Introduction to Probability Models, 8th ed., Academic Press, 2003.
2. Montgomery, D. C., Introduction to Statistical Quality Control, 5th ed., John Wiley & Sons, 2005.
3. Ramsey, L. T., (Traditional) exact confidence intervals for the binomial distribution, www.math.hawaii.edu/~ramsey/TraditionalBinomialCI.pdf