Uses of sampling in Quality 1 Why sample? Samples give us information about a Population. For us, the population could be manufactured items, internet orders from Amazon, Skype calls, customers in a checkout line at a store, etc. In SPC (Statistical Quality Control) we are generally interested in some quality related characteristic, qualitative (audio-visual digital quality of Skype calls) or quantitative (waiting times of customers in a checkout line) of some element of the sample(s). We would like to make an inference about the quality characteristic of the population based upon the information from the sample(s). 2 Random sampling Random sampling gives us a way to study population characteristics without systematic bias. Random sampling is not arbitrary sampling, there are systematic ways to do it. Random sampling is not representative, i.e. it does not give an exact reflection of the population. For example, if the proportion of Skype calls made last year with poor visual quality was p, then the proportion of poor quality calls in a random sample will almost never exactly equal p. 3 What does random sampling mean? Suppose the population, which we will denote as P, has N elements and we wish to randomly select a sample of size n. There are a number of ways to accomplish this, but every possible sample of size n must be equally likely. This is called simple random sampling. The number of samples of size n which can be randomly drawn from a population of size N is N N! n n ! N n ! 4 Binomial coefficients The previous formula, often is termed “N choose n”, gives us the number of possible samples of size n that one could possibly draw from a population of size N. We say that we have a method for simple random sampling if each possible sample is equally likely to be drawn. That means that every possible sample has equal probability of being drawn, which is equal to N! 1 n! N n ! 5 How to do it? Random number tables. Use a random number table to give a random number to each item in the population, then order the random numbers from smallest to largest and choose the first n in order of magnitude. This assumes no two values are equal. Random number generators. Excel or any standard package will do this. Statistical software programs. There are a number of programs which will do various forms of randomization for Statistical studies. 6 Two views on sampling and Quality. One can assure good Quality through intensive sampling if everything in the population is sampled and measured. This is generally not done due to excessive cost. In some situations it may be impossible to do, for example, if one needs to do destructive testing. The modern approach is to assure Quality by concentrating on improving the Process itself, then to use sampling to audit or monitor the Process. This is the basis of SPC. 7 Sampling with and without replacement. If we randomly sample n distinct items from a Population of N items, this is simple random sampling without replacement. This is what is typically done in SPC. If each time we randomly sample one item at a time return it, randomly sample another, etc., this is sampling with replacement. From a theoretical standpoint, this mimics sampling from an infinite (or hypothetically infinite) population. A hypothetically infinite population could be all future production of a type of part, or it might be all future transactions of a certain type. 8 Simple vs. stratified random sampling Simple random sampling describes how to select one random sample. When monitoring a process, one will sample the process periodically. So really the population will be divided into subpopulations from which we select random samples. This is also called stratified random sampling. For example: take random samples from each day’s production; sample the dropped calls on Skype by day; sample the late shipments from Amazon by week, etc. 9 Sampling and SPC In SPC we usually collect samples periodically to monitor a process. The samples are collected in a relatively homogeneous set of conditions over time/space. This will allow us to separate “signal” from “noise”. The monitoring of the process is done to check to see if the process is stable over time/space. 10