Practical Aspects of Sampling: An Overview

Why Sample?
Samples are taken to obtain information about populations. Sample estimators are computed to estimate parameters of the population from which the sample was drawn.

Advantages
– Complete enumeration of all sample units in the entire universe is often unnecessary to obtain reasonably accurate results.
– An examination of the entire population is often too costly, too time-consuming, and impractical (if not impossible).
– In the case of destructive testing, the sample elements or units must be destroyed or consumed to obtain the necessary measurements.

Precision
The standard error [se] is a measure of precision. A smaller se, other things remaining the same, means more precision, that is, less variability in the sample estimate.

Sample Size for a Mean
$n = z^2 \sigma^2 / e^2$
where:
– z is the standard normal value for the chosen confidence level (for example, 1.96 for 95%)
– σ is the population standard deviation
– e, the sampling error, is the acceptable difference between the sample mean and the population mean [e is expressed in the units of the variable]

Sample Size for a Proportion
$n = z^2 p (1 - p) / e^2$
where:
– p is the anticipated population proportion
– e, the sampling error, is the acceptable difference between the sample proportion and the population proportion [e is stated in percentage points and entered in the formula as a proportion, e.g. 0.04 for 4 percentage points]

Errors
Sampling (internal) Error
Because a sample was taken rather than a complete enumeration, the sample statistic is expected to deviate from the population parameter. The sampling error refers to the extent to which the sample values on some variable of importance to the research differ from those of the population from which the sample was drawn.

Non-Sampling (external) Error
Arises from practical considerations in taking a sample:
– recording errors
– coding errors
– processing errors

Bias
The most insidious kind of error and the hardest to detect. Sources include:
– poorly defined universe
– inadequate sampling design
– improperly worded questions
– distorted answers
– convenience sampling
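The two sample-size formulas given earlier (for a mean and for a proportion) can be sketched in Python as follows. This is only an illustration: the function names and the input values (sigma = 15, e = 2, p = 0.5, e = 0.05) are hypothetical and do not come from the slides, and the result is rounded up because a sample cannot contain a fraction of a unit.

```python
import math


def sample_size_for_mean(z, sigma, e):
    """Return n = z^2 * sigma^2 / e^2, rounded up to a whole unit.

    z     -- standard normal value for the confidence level (1.96 for 95%)
    sigma -- population standard deviation (usually an estimate)
    e     -- acceptable sampling error, in the units of the variable
    """
    return math.ceil((z ** 2) * (sigma ** 2) / (e ** 2))


def sample_size_for_proportion(z, p, e):
    """Return n = z^2 * p * (1 - p) / e^2, rounded up to a whole unit.

    p -- anticipated population proportion (0.5 is the most conservative)
    e -- acceptable sampling error as a proportion (0.05 = 5 points)
    """
    return math.ceil((z ** 2) * p * (1 - p) / (e ** 2))


# Hypothetical inputs, chosen only to illustrate the arithmetic.
n_mean = sample_size_for_mean(z=1.96, sigma=15.0, e=2.0)
n_prop = sample_size_for_proportion(z=1.96, p=0.5, e=0.05)

print("sample size for a mean:", n_mean)
print("sample size for a proportion:", n_prop)
```

Note how halving e quadruples n in both formulas, since e enters as a square in the denominator.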
Types of Random Samples

Simple Random
– with replacement
– without replacement
– must be able to identify the target population and ensure that each item has an equal likelihood of being selected
– use a table of random numbers, or have a computer generate a series of random numbers

Stratified
When the population is heterogeneous overall but contains homogeneous subpopulations (strata), the population is stratified and a sample is drawn from each stratum.

Systematic
A modified random sample design, as opposed to the simple random selection technique. After a random start, draw every K-th item from the list.

Cluster
Another modified random sample design. It requires that the sample units be grouped in clusters in the universe; the clusters are not homogeneous strata of the population.

Multistage
The selection procedure takes place in a hierarchy of stages:
– first stage: primary sample unit
– second stage: secondary sample unit
– third stage: tertiary sample unit
– .....
– last stage: final (or ultimate) sample unit

Multistage: An Example
The president of Supermarkets, Inc. decided to sample purchases at 150 stores in the US.
– First stage: select, on the basis of clustering (to save travel time), 15 of the 150 stores.
– Second stage: the researcher recommends that cash register files be randomly selected at each of the 15 selected stores.
– Final stage: select every 20th purchase in a file, using a random start.

Comparison of Survey Sampling Designs

Simple Random
How to Select
– assign a number to each element, then select elements using a random numbers table
Strengths/Weaknesses
– basic, simple, often costly
– must assign a number to each element in the target population

Stratified
How to Select
– divide the population into groups that are similar within and different between on the variable of interest
Strengths/Weaknesses
– with proper strata, can produce very accurate estimates
– less costly than simple random sampling
– must stratify target population correctly

Stratified (continued)
One of the main reasons for using a stratified sample is that stratifying reduces the sampling error for a given sample size to a level lower than that of a simple random sample of the same size.
This is so because of a very simple principle: the more homogeneous a population is on the variables being studied, the smaller the sample size needed to represent it accurately. Stratifying makes each sub-sample more homogeneous by eliminating the variation on the variable that is used for stratifying.

Systematic
How to Select
– select every K-th element from a list after a random start
Strengths/Weaknesses
– produces very accurate estimates when elements in the population exhibit order
– used when population size is not known
– simplifies the selection process

Cluster
How to Select
– randomly choose clusters and sample all elements within each cluster
Strengths/Weaknesses
– with proper clusters, can produce accurate estimates
– useful when a sampling frame is not available or travel costs are high
– must cluster target population correctly

Convenience
– in the Dining Commons at dinner
– in the Student Union between classes
– in classes in which you are enrolled
– data available on the www
– a friend knows somebody who...

Mini-Cases
Working as a team, determine the best sampling technique and explain your decision.

Scenario 1
You have been hired by the County of Sacramento to estimate the percentage of registered voters who favor issuing a bond to finance the construction of a new bike trail along the Sacramento River. You want a margin of error of no more than 4 percentage points at the 95% confidence level. How would you conduct such a survey using a simple random sample?

Scenario 1 (continued)
When going over your sampling design with the county Parks Director, you are asked whether you think a stratified sample would be appropriate. What is your reply? Why? What about a systematic sample?
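As one possible starting point for Scenario 1, the sketch below applies the sample-size formula for a proportion and then draws both a simple random sample and a systematic sample from a voter list. Several things here are assumptions, not part of the scenario: p = 0.5 is used because no prior estimate of support is given (it yields the largest n), the voter roll of 200,000 numbered registrations is hypothetical, and the seed is fixed only so the run is repeatable. It does not settle which design is most appropriate; that remains the team's decision.

```python
import math
import random

Z_95 = 1.96        # standard normal value for 95% confidence
MARGIN = 0.04      # 4 percentage points, entered as a proportion
P_ASSUMED = 0.5    # assumption: no prior estimate, so use the worst case

# n = z^2 * p * (1 - p) / e^2, rounded up to a whole voter
n = math.ceil(Z_95 ** 2 * P_ASSUMED * (1 - P_ASSUMED) / MARGIN ** 2)

# Hypothetical sampling frame: registration numbers 1..200,000.
N_VOTERS = 200_000
voter_ids = range(1, N_VOTERS + 1)

rng = random.Random(2024)  # fixed seed, for a repeatable illustration

# Simple random sample without replacement: every voter has the same
# chance of selection.
simple_random = rng.sample(voter_ids, n)

# Systematic sample: random start, then every K-th voter on the list.
k = N_VOTERS // n                      # sampling interval K
start = rng.randrange(k)               # random start within first interval
systematic = [voter_ids[i] for i in range(start, N_VOTERS, k)][:n]

print("required sample size:", n)
print("sampling interval K:", k)
print("first five (simple random):", simple_random[:5])
print("first five (systematic):", systematic[:5])
```

In practice the computed n is the number of completed responses needed, so more voters than this would have to be contacted to allow for non-response.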
Travel Vouchers: Fly the Friendly Skies
Scenario 2
The State of California has hired you to estimate the number of travel vouchers for legislators that have been filed incorrectly. The vouchers are filed in the order in which they are processed.
Which sampling technique would you recommend and why?

Light Rail
Scenario 3
Light Rail has hired you to determine whether passengers like the convenience of using the light rail system.
Which sampling technique would you recommend and why? What other concerns might be investigated?

Trucks
Scenario 4
Marketers, Inc., has hired you to determine why so many young drivers, both male and female, prefer owning a pickup truck to owning an automobile.
Which sampling technique would you recommend and why?

Merit Pay
Scenario 5
You have been hired to determine how faculty at a local university feel about the following statement: “…the union is seeking to obtain a moratorium on merit pay.”
Which sampling technique would you recommend and why?

Questions?

References
Levine, David, et al. Statistics for Managers, Second Edition. Upper Saddle River, NJ: Prentice-Hall, 1999.
Monette, Duane R., et al. Applied Social Research. New York: Holt, Rinehart and Winston, 1986.