Statistical methods and application
Evans’ chapter 11
Dr. Pham Huynh Tram
phtram@hcmiu.edu.vn

When to use
In quality control and improvement, to analyze a process:
- describe characteristics of a product or process
- draw conclusions about parameters of a product or process

Examples
→ Sample of the weights of castings (in kilograms) from a production line in the Harrison Metalwork foundry.

38.1  38.5  38.3  37.3  38.4  39.2  38.9  38.7  39    38.6
38.1  38.5  38.4  37.6  38.4  39.2  38.9  38.7  39    38.6
38.2  38.5  38.4  37.7  38.4  39.3  38.9  38.7  39.1  38.7
38.2  38.5  38.4  37.8  38.4  39.3  38.9  38.7  39.1  38.7
38.3  38.6  38.4  37.9  38.4  39.3  38.9  38.7  39.1  38.7
38.3  38.6  38.4  37.9  38.4  39.4  39    38.7  39.1  38.7
38.3  38.6  38.4  37.9  38.4  39.4  39    38.7  39.1  38.7
38.3  38.6  38.4  38.1  38.4  39.5  39    38.8  39.1  38.7
38.3  38.6  38.4  38.1  38.5  39.6  39    38.8  39.2  38.7
38.3  38.6  38.4  38.1  38.5  39.9  39    38.8  39.2  38.7

What would you do to analyse the data? What do you conclude from your analysis?

Descriptive Statistics
Mean                  38.6320
Standard Error         0.0444
Median                38.6000
Mode                  38.4000
Standard Deviation     0.4436
Sample Variance        0.1967
Range                  2.6000
Minimum               37.3000
Maximum               39.9000
Sum                 3863.2000
Count                100.0000

→ The data are fairly normally distributed, with some slight skewing to the right.

Examples
A soup processed by our company for the month of July.
Output of brand A soup in the month of July is 50,000 cans. We want to determine the average weight of cans of brand A, so we randomly select 500 cans of brand A soup from the July output. The average weight of the 500 cans is 295 g.
Next, we want to test the validity of a claim that the average weight of the cans is no less than 300 g. Is the measured average of 295 g significantly smaller than the claimed mean of 300 g?
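
The two examples above can be sketched in Python using only the standard library. The first part recomputes the descriptive statistics for the casting weights; the second part sets up the lower-tailed test for the soup claim. The names `CASTING_WEIGHTS`, `describe`, and `lower_tail_z_test` are illustrative, and the standard deviation of 20 g used for the soup cans is a made-up value, since the slide does not report one. Treat this as a sketch under those assumptions, not as the course's prescribed solution.

```python
import math
import statistics
from statistics import NormalDist

# Casting weights in kilograms, copied row by row from the data table.
CASTING_WEIGHTS = [
    38.1, 38.5, 38.3, 37.3, 38.4, 39.2, 38.9, 38.7, 39.0, 38.6,
    38.1, 38.5, 38.4, 37.6, 38.4, 39.2, 38.9, 38.7, 39.0, 38.6,
    38.2, 38.5, 38.4, 37.7, 38.4, 39.3, 38.9, 38.7, 39.1, 38.7,
    38.2, 38.5, 38.4, 37.8, 38.4, 39.3, 38.9, 38.7, 39.1, 38.7,
    38.3, 38.6, 38.4, 37.9, 38.4, 39.3, 38.9, 38.7, 39.1, 38.7,
    38.3, 38.6, 38.4, 37.9, 38.4, 39.4, 39.0, 38.7, 39.1, 38.7,
    38.3, 38.6, 38.4, 37.9, 38.4, 39.4, 39.0, 38.7, 39.1, 38.7,
    38.3, 38.6, 38.4, 38.1, 38.4, 39.5, 39.0, 38.8, 39.1, 38.7,
    38.3, 38.6, 38.4, 38.1, 38.5, 39.6, 39.0, 38.8, 39.2, 38.7,
    38.3, 38.6, 38.4, 38.1, 38.5, 39.9, 39.0, 38.8, 39.2, 38.7,
]


def describe(data):
    """Return the summary measures listed under Descriptive Statistics."""
    n = len(data)
    sd = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    return {
        "mean": statistics.mean(data),
        "standard_error": sd / math.sqrt(n),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "standard_deviation": sd,
        "sample_variance": statistics.variance(data),
        "range": max(data) - min(data),
        "minimum": min(data),
        "maximum": max(data),
        "sum": math.fsum(data),
        "count": n,
    }


def lower_tail_z_test(sample_mean, claimed_mean, sd, n):
    """z statistic and p-value for H0: mu >= claimed_mean vs H1: mu < claimed_mean."""
    z = (sample_mean - claimed_mean) / (sd / math.sqrt(n))
    p_value = NormalDist().cdf(z)  # area in the lower tail
    return z, p_value


if __name__ == "__main__":
    for name, value in describe(CASTING_WEIGHTS).items():
        print(f"{name:20s}{value:12.4f}")

    # ASSUMPTION: sd = 20 g is hypothetical; the source gives no standard deviation.
    z, p = lower_tail_z_test(sample_mean=295, claimed_mean=300, sd=20, n=500)
    print(f"z = {z:.3f}, p-value = {p:.2e}")
```

A small p-value would suggest that 295 g is significantly below the claimed 300 g, but the conclusion depends entirely on the real standard deviation of the can weights, which has to come from the sample.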
Recall: What are “population”, “sample”, “parameter”, “statistic”, “estimator”, descriptive statistics, inferential statistics?

Statistical Methods
- Clarify characteristics of a process
- Predict future results
- Quantify the uncertainty of sample data
- Test factor significance
- Identify quality problems
- Means of measuring improvement

Statistical Foundations
✹ Random variables
   Discrete vs continuous
✹ Probability distributions
   A theoretical model of the relative frequency of a random variable
✹ Sampling
   Forms a basis for application of statistics

Random variables and Probability distribution
● Binomial: # of successes in n trials
● Negative binomial: # of trials for r successes
● Geometric: # of trials for 1st success
● Poisson: # of independent events that occur in a fixed amount of time or space
● Normal: distribution of a process that is the sum of a number of component processes, e.g. assembly

Recall:
f(x): probability density function (pdf) / p(x): probability mass function (pmf)
F(x): cumulative distribution function (cdf)

Binomial, Poisson, Normal

Examples
1. A process is known to have a nonconformance rate of 0.02. If a random sample of 100 items is selected, what is the probability of finding 3 nonconforming items?
2. A process is known to produce about 6% nonconforming items. If a random sample of 200 items is chosen, what is the probability of finding between 6 and 8 nonconforming items?
3. New Orleans Punch was made by Frutayuda, Inc. and sold in 16-ounce cans to benefit victims of Hurricane Katrina. The mean number of ounces placed in a can by an automatic fill pump is 15.8 with a standard deviation of 0.12 ounce. Assuming a normal distribution, what is the probability that the filling pump will cause an overflow in a can, that is, the probability that more than 16 ounces will be released by the pump and overflow the can?
4. Georgia Tea is sold in 2-liter (2000 milliliter) bottles. The standard deviation for the filling process is 15 milliliters.
If the process requires a 1 percent, or smaller, probability of overfilling, defined as over 1990 milliliters, what must the target mean for the process be?
5. The filling volume of Outback Beer bottles has been found to have a standard deviation of 5 ml. If 95 percent of the bottles contain more than 230 ml, what is the average filling volume of the bottles?

Sampling
● Sampling forms the basis for applications of statistics
● Suppose that we want to determine the attitudes of students about the quality of care they received in IU. Several factors should be considered before making this study:
- What is the objective of the study?
- What type of sample should be used?
- What possible error might result from sampling?
- What will the study cost?

Sampling Plan
A good sampling plan should select a sample at the lowest cost that will provide the best possible representation of the population, consistent with the objectives of precision and reliability that have been determined for the study.

Factors to consider
✹ Sample size
✹ Appropriate sample design

Sampling Error
● Sampling error (statistical error)
Occurs naturally, because a sample may not always be representative of the population no matter how carefully it is selected
→ reduce this error by ?
● Nonsampling error (systematic error)
Bias: tendency to systematically over- or underestimate true values
Non-comparable data: data that come from 2 populations
Uncritical projection of trends: assumption that what happened in the past will continue into the future
Causation: assumption that because 2 variables are related, one must be the cause of changes in the other
Improper sampling: use of an erroneous method for gathering data, thus biasing results
→ reduce this error by ?
Sampling Methods
● Simple random sampling
○ Every item in the population has an equal probability of being selected
● Stratified sampling
○ The population is partitioned into groups, or strata, and a sample is selected from each stratum
● Systematic sampling
○ Every nth (4th, 5th, etc.) item is selected
● Cluster sampling
○ The population is divided into smaller groups known as clusters, and clusters are randomly selected to form the sample

Example
● A box of 1000 plastic components for electrical connectors is thoroughly mixed and 25 parts are selected randomly without replacement
● A particular nursing unit has 30 patients. Five patient records are to be sampled to verify the correctness of a medical procedure
● A population of 28,000 items is produced on 3 different machines
Machine 1: 20,000 items
Machine 2: 5,000 items
Machine 3: 3,000 items
Assume that a specific confidence level requires a sample of 525 units. Simple random samples of 250 units from machine 1, 150 units from machine 2, and 125 units from machine 3 are taken

Example
● A population has 4000 units and a sample of size 50 is required. Select the first unit randomly from among the first 80 units. Every 80th (4000/50) item after that should be selected
● Products are boxed in groups of 50. We draw a sample of boxes and inspect all units in the boxes selected

Central Limit Theorem
✹ If simple random samples of size n are taken from any population, the probability distribution of sample means will be approximately normal as n becomes large. The mean of the sample means for this probability distribution will approach $\mu$, and the standard deviation of the distribution will be $\sigma/\sqrt{n}$.
The CLT is extremely important in any SQC technique that requires sampling.
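
The Central Limit Theorem can be checked with a small simulation. The sketch below draws repeated simple random samples from an exponential population (mean 1, standard deviation 1), which is clearly not normal, and compares the mean and standard deviation of the sample means with the theoretical values. The exponential population, the sample size of 50, the 5000 replications, and the function name `simulate_sample_means` are all choices made for illustration, not part of the slides.

```python
import math
import random
import statistics


def simulate_sample_means(n, replications, rng):
    """Draw `replications` samples of size n from an exponential population
    with mean 1 and standard deviation 1; return the list of sample means."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replications)
    ]


if __name__ == "__main__":
    n = 50
    rng = random.Random(42)  # fixed seed so the run is repeatable
    means = simulate_sample_means(n, replications=5000, rng=rng)

    print(f"mean of sample means: {statistics.fmean(means):.4f} (theory: 1.0000)")
    print(f"sd of sample means:   {statistics.stdev(means):.4f} "
          f"(theory: {1.0 / math.sqrt(n):.4f})")
```

Plotting a histogram of `means` for increasing n would show the skewed exponential shape giving way to a bell shape, which is why control charts based on sample means can rely on the normal distribution.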
(sampling distribution)

Sample size
Simple random sampling is generally used to estimate population parameters such as means, proportions, and variances.
Consider the sample size when using the sample mean to provide a point estimate of the population mean for variables data.
A $100(1-\alpha)\%$ confidence interval for the population mean is given by
$\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$
→ sampling error E:
$E = z_{\alpha/2}\,\sigma/\sqrt{n}$, so $n = (z_{\alpha/2})^2\,\sigma^2 / E^2$
This sample size (n) will provide a point estimate having a sampling error of E or less at a confidence level of $100(1-\alpha)\%$.
A preliminary sample or a good guess based on prior data or similar studies can be used to estimate $\sigma$.
Determine the sample size for estimating a population proportion for attributes data:
$n = (z_{\alpha/2})^2\,p(1-p) / E^2$

Example - sampling
● A firm conducting a process capability study on a critical quality dimension wishes to determine the sample size required to estimate the process mean with a sampling error of at most 0.1 at a 95% confidence level. From control chart data, an estimate of the standard deviation of the process was found to be 0.47. Find the appropriate sample size
● A sample from a large finished goods inventory is needed to determine the proportion of nonconforming product. Historically, about a 0.5% level of nonconformance has been observed. A 90% confidence level with an allowable error of 0.25% is desired. Find the appropriate sample size

Example - sampling
● Localtel, a small telephone company, interviewed 150 customers to determine their satisfaction with service. 27 expressed dissatisfaction. If the allowable error for the proportion dissatisfied is 0.05, is the sample size sufficient at a 90% confidence level?
● A management engineer at XYZ hospital determined that she needs to take a work sampling study to see whether the proportion of idle time in the diagnostic imaging department had changed since being measured in a previous study several years ago. At that time, the percentage of idle time was 10%.
If the engineer can only take a sample of 800 observations due to cost factors and can tolerate an allowable error of 0.02, what percent confidence level can be obtained from the study?
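
The sample-size exercises above all follow from the standard normal-approximation formulas $n = (z_{\alpha/2})^2 \sigma^2 / E^2$ for a mean and $n = (z_{\alpha/2})^2 p(1-p) / E^2$ for a proportion. A minimal Python sketch, using only the standard library, is given below. The function names are illustrative, and the sketch assumes a large population (no finite-population correction) and rounds n up to the next whole unit.

```python
import math
from statistics import NormalDist


def z_value(confidence):
    """Two-sided critical value z_{alpha/2} for a confidence level such as 0.95."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def sample_size_for_mean(sigma, error, confidence):
    """Smallest n with sampling error <= error when estimating a mean."""
    return math.ceil((z_value(confidence) * sigma / error) ** 2)


def sample_size_for_proportion(p, error, confidence):
    """Smallest n with sampling error <= error when estimating a proportion."""
    return math.ceil(z_value(confidence) ** 2 * p * (1.0 - p) / error ** 2)


def achieved_confidence(p, n, error):
    """Confidence level obtained with a fixed n and allowable error."""
    z = error / math.sqrt(p * (1.0 - p) / n)
    return 2.0 * NormalDist().cdf(z) - 1.0


if __name__ == "__main__":
    # Process capability study: sigma = 0.47, E = 0.1, 95% confidence
    print(sample_size_for_mean(0.47, 0.1, 0.95))
    # Finished goods inventory: p = 0.005, E = 0.0025, 90% confidence
    print(sample_size_for_proportion(0.005, 0.0025, 0.90))
    # Localtel: p = 27/150, E = 0.05, 90% confidence (compare with n = 150)
    print(sample_size_for_proportion(27 / 150, 0.05, 0.90))
    # XYZ hospital: p = 0.10, n = 800, E = 0.02
    print(f"{achieved_confidence(0.10, 800, 0.02):.1%}")
```

Under these assumptions the sketch gives 85 observations for the capability study, 2154 for the inventory study, and 160 for Localtel (so 150 interviews fall slightly short), and a confidence level of about 94% for the hospital study. Results read from a printed z table may differ by a unit because of rounding in the z value.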