# 5: Sampling, pg 1
MOL 501: Probability and Nonprobability Sampling (Revised 11/11/07)
Required Reading: Chs. 8 & 9.

The last lecture ended with a discussion of external validity: the extent to which one can generalize one’s findings from a study to the larger world. Sometimes it is not enough to know whether or not a hypothesis is supported for the people studied; we often really want to know if the hypothesis would be supported if it were tested with other people in other settings. One of the most important determinants of external validity is the quality of the sample studied. A “good” sample is one that is so similar in composition to the larger world we are interested in that what is true for the sample is also true for the larger world. Thus, in many studies, the quality of the sample studied becomes the major influence on external validity. In most social science research, external validity is a very important issue.

However, external validity is not always a concern in organizational research. Often, organizational research is not concerned with people outside of the organization actually studied. For example, studies of the effectiveness of policies, programs and innovations within a particular organization are focused entirely on the members of the organization itself. Whether or not the same policies, programs or innovations would be equally beneficial for other organizations is of no interest to the researcher.

If the organization is not too large and is limited to only one location, it is possible to study every member of the organization. If every member of the organization participates in the study, and the researcher is not interested in generalizing to some larger population, then external validity is not likely to be a problem. However, if the organization is very large or dispersed geographically, it may be too expensive and time consuming to include every member of the organization in the study.
Whenever some members cannot be studied, external validity and, therefore, the quality of the sample become important considerations.

I. Key Concepts

The logic of sampling is based on a few key concepts. Before discussing various types of samples and their strengths and weaknesses, it is best to make sure these concepts make sense.

Population/Universe. All hypotheses refer to some population or group of people or organizations. The group to which the hypothesis applies is referred to as the population or universe. If we are doing an evaluation study of the effectiveness of a program within a single organization, then the population or universe is all of the members of that organization. If the program only applies to some members, then the population is limited to those members of the organization to which the program applies. For example, if we are interested in the effect of an on-site day care facility on the productivity of workers, the population would be limited to workers with children of the appropriate age for daycare.

Population parameters. A population is defined by certain characteristics, and these characteristics are the population parameters. For example, the population parameters for the daycare study example described above would be having responsibility for the care of one or more children, and being an employee of the organization.

Population element. A single member of the population is referred to as a population element. The term “population element” is used instead of “member” or “person” because not all populations consist of individuals. For example, if we are studying the effect of carpeting on the noise level in classrooms, the population is not people but classrooms, and a population element would be a single classroom, not each student.
Similarly, if we were doing a study of the effects of different kinds of leadership on the performance of teams, the population size would be the number of teams, not the number of people in the teams. This is an important distinction because tests of the statistical significance of findings are often reported in the literature, and sample size has a large effect on many tests of significance. Pretending that a sample of classes or teams is really a sample of individuals exaggerates the size of the sample (often by a factor of 10, 30, 50, or even 100). When this kind of error occurs, the author of the article may claim that the independent variable (cause) produced a “significant difference” in the dependent variable when, in fact, the difference may be due to random sampling error.

A good example of this kind of error can be found in published studies of the effect of various factors on college students’ evaluations of their classes and instructors. Many published studies report that variables like time of day, gender of the teacher, grading practices, and the like produce “statistically significant” differences in the ratings instructors receive, but their tests of the “significance” of the differences are calculated using the number of students who completed the surveys as the sample size. Since the sample is really a sample of classes, not students, they are exaggerating the size of the sample by the average size of the classes studied. If, for example, researchers studied student evaluations of 20 classes and the average size of those classes was 40 students, and they misrepresented the sample as a sample of students rather than as a sample of classes, the sample size they use to calculate whether or not the effects were “statistically significant” would be 800 when, in fact, the size of the sample was really only 20.
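To see how much the count matters, here is a minimal sketch of the 20-classes-versus-800-students example. The numbers (a 0.2-point difference in ratings, a standard deviation of 0.8) and the helper `t_statistic` are hypothetical and do not come from any of the studies mentioned; the sketch also holds the spread fixed for both counts, which is a simplification.

```python
import math

def t_statistic(mean_difference, standard_deviation, n_per_group):
    """Two-sample t statistic, assuming equal group sizes and equal spread."""
    standard_error = standard_deviation * math.sqrt(2.0 / n_per_group)
    return mean_difference / standard_error

# Hypothetical values: two groups of classes differ by 0.2 rating points,
# and ratings have a standard deviation of 0.8.
difference, spread = 0.2, 0.8

# Correct count: 20 classes, 10 in each group.
t_classes = t_statistic(difference, spread, n_per_group=10)
# Inflated count: 800 students, 400 in each group.
t_students = t_statistic(difference, spread, n_per_group=400)

print(f"treated as 20 classes:   t = {t_classes:.2f}")
print(f"treated as 800 students: t = {t_students:.2f}")
```

With these made-up values the same difference gives a t of about 0.56 when the sample is counted as 20 classes and about 3.54 when it is counted as 800 students. Only the second clears the conventional cutoff of roughly 2, even though nothing about the data has changed.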
Whether or not an effect or difference is “statistically significant” depends partly on the size of the sample, because sample size (or “degrees of freedom”) is entered into statistical significance formulas. Consequently, some very small and trivial results, or results that occurred by chance, may be reported as “significant”. As a result of such errors, I am sure that some college faculty are worrying and obsessing over the results of such studies when, in fact, most of the effects have little or no real impact on their students’ ratings.

Representative or Isomorphic Sample. A sample is representative if it is isomorphic, that is, if it matches the population in all regards except size. A truly representative sample is a small microcosm of the population that matches it perfectly in every detail. The make-up of the sample in regard to types of people (or other elements) exactly matches the make-up of the population. It will have the same proportion of men and women, the same proportion of rich and poor people, the same proportion of blue eyed people, etc. as the population. Achieving a representative sample is often (but not always) the goal of sampling, but a perfectly representative sample is never achieved in practice. Thus, representative sampling refers to strategies used to come as close to this goal as possible.

Probability sample or random sample. A probability sample or random sample is a sample in which each element of the population has an equal chance of being included, and for which the probability of including a particular element is known. For example, in a small probability sample of employees drawn from an organization of 163 members, each population element would have a one-in-163 chance of being picked on the first draw. It is important to realize that the requirements of probability sampling are very strict.
For example, selecting students to complete a survey by going to the student union and asking each student who enters to complete a survey is not probability sampling. Not all students go to the student union, and of those who do, some go there more often than others. As a result, not all students have an equal chance of being chosen for the sample, and there is no way to calculate the probability that a particular population member will end up in the sample.

Nonprobability sample. A nonprobability sample is any sample that involves selecting members by some method that does not involve random selection. Some nonprobability samples attempt to achieve representativeness without random selection, while other nonprobability samples involve selecting people without regard to representativeness. For example, outlier samples involve selecting people who are known to be atypical, the exact opposite of the goal of representativeness.

II. Probability Samples

While the best way to achieve external validity is to include every member of the population in a study, the next best approach is to use a probability sample. There is no guarantee that a probability sample will be perfectly representative, but it gives us the best odds of achieving representativeness. Furthermore, using probability math, we can estimate the likelihood that any given probability sample will be representative. We can also use probability math to select a sample size that will give us a particular probability of achieving representativeness. For this reason, whenever representativeness is the goal, it is best to use a probability sample.

Unfortunately, the requirements for a sample to be a true probability sample (or “random sample”) are so strict that probability sampling can be quite difficult and expensive.
Generally, it is very difficult to draw a probability sample unless we either have a list of all members of the population, or we can count on them all being at the same place at the same time. For example, we could draw a random sample of all employees of a particular organization by selecting every seventh person off a list of the employees, or we could draw a random sample of the same group by asking all members to meet in the auditorium and selecting every seventh person by having them count off by sevens. However, it would probably be impossible to draw a probability sample of people who engage in employee theft. There is no list of employee thieves (only those who have been caught could end up on a list), and there is no place where all employee thieves congregate because they want to blend in with the general population of employees.

Types of Probability Samples

There are several types of probability samples, and each has its advantages and disadvantages.

The simple random sample. The simple random sample comes closest to meeting the strict definition of random sampling, and it is usually the method most likely to achieve representativeness. Members of the population are usually selected from a list by lot (for example, by putting all names in a hat, shaking it up, and selecting a sample of names), or they are selected in some other systematic way that gives every member an equal chance of being selected. In real life, even a simple random sample is not a true probability sample in the strictest sense because, as each member is selected, the odds of being picked on the next draw go up for those who remain. For example, if we have a population of 100 people and we select our sample by drawing slips of paper with their names out of a hat containing the names of every member, the odds of being chosen first are one out of 100, but the odds of being chosen second are one out of 99, the odds of being chosen third are one out of 98, and so on.
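The arithmetic of the hat example can be checked with a short sketch. The function names `draw_odds` and `inclusion_probability` are made up for illustration. The second function adds one observation that goes slightly beyond the example: although the odds shift from draw to draw, every person still has the same overall chance of ending up in the finished sample.

```python
from fractions import Fraction

def draw_odds(population_size, draws):
    """Odds that a given remaining name is picked on each successive draw."""
    return [Fraction(1, population_size - k) for k in range(draws)]

def inclusion_probability(population_size, sample_size):
    """Overall chance that one particular person ends up in the sample."""
    not_chosen = Fraction(1)
    for k in range(sample_size):
        remaining = population_size - k
        # Chance of being passed over on this draw, given not yet chosen.
        not_chosen *= Fraction(remaining - 1, remaining)
    return 1 - not_chosen

# First three draws from a hat of 100 names: 1/100, 1/99, 1/98.
print(draw_odds(100, 3))

# Overall chance of landing in a sample of 10 drawn without replacement.
print(inclusion_probability(100, 10))
```

For a sample of 10 from a population of 100, the overall chance works out to exactly 1/10 for every name in the hat.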
The only way to achieve perfect random sampling would be to return each chosen slip to the hat so there are always 100 names in it. Of course, random sampling with replacement is not desirable because we do not want to take the chance that the same person will be chosen several times. As a consequence, simple random samples actually used in research are almost always selected by random sampling without replacement.

Simple random samples are sometimes too expensive or too time consuming to use. For example, if we want to compare two groups of employees, one of which is very small, we would need a huge simple random sample in order to obtain enough members of the smaller group to make meaningful comparisons. (For example, comparing female CEOs to male CEOs of large corporations would require a very large sample of CEOs because women make up only a tiny fraction of CEOs in the population. To make sure we have a subsample of only 50 female CEOs would probably require drawing a sample of several thousand CEOs.) In general, the larger the sample, the more time consuming and expensive the study. Similarly, studying a simple random sample of employees of an organization with offices scattered all over the U.S. would be extremely time consuming and expensive because we would have to travel to many different locations to observe or survey the members of our sample. Two modifications of simple random sampling are often used to reduce the costs and effort it requires.

The stratified random sample. If our primary goal is to compare two or more groups to each other, and if those groups are unequal in size, the size of the sample (and, therefore, the costs and effort required to study it) can be reduced by drawing a stratified random sample. This technique involves first dividing up the population into the groups we are interested in, and then drawing a probability sample of each.
For example, if we are interested in comparing male CEOs to female CEOs, we could reduce our costs by first dividing a list of CEOs into two groups or strata (males and females), and then drawing a probability sample of 50 from each group (or stratum). Using equal sized subgroups to compare the groups is an effective approach so long as each subgroup sample is selected randomly from the population of members of that subgroup. However, if we want to accurately represent the population as a whole, stratified random sampling would not be a good choice because we are purposely oversampling one group (in our example, women CEOs). As a consequence, any generalizations drawn about the group as a whole would be misleading because the oversampled group would have too much influence on our results. Of course, we could get around this problem by making sure the proportion of the overall sample that belongs to each stratum is equal to the proportion in the population, but that would negate the money saving and time saving benefits of stratified random sampling because we would end up with a very large sample.

Cluster samples. Although the stratified random sample is excellent for comparing unequal sized groups, it can yield misleading results if we try to use it to study the group as a whole. However, cluster sampling, or multistage sampling, can be used when the goal is to describe the population as a whole for the least cost, and the cost savings are especially large if the population is spread out geographically. Cluster sampling involves identifying clusters in the population, drawing a random sample of those clusters, and then drawing a random sample of individual population elements from each cluster selected.
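The two cost-saving designs described above can be sketched side by side. This is only an illustration under stated assumptions: the population of 2,000 CEOs with 60 women, the 12 offices of 80 employees, and the function names `stratified_sample` and `cluster_sample` are all hypothetical.

```python
import random

def stratified_sample(strata, per_stratum, rng):
    """Draw the same number of elements at random from every stratum."""
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, per_cluster, rng):
    """Randomly pick clusters, then randomly pick elements inside each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], per_cluster) for name in chosen}

rng = random.Random(501)  # fixed seed so the sketch is repeatable

# Hypothetical list of 2,000 CEOs, of whom 60 are women.
strata = {
    "female": [f"F{i}" for i in range(60)],
    "male": [f"M{i}" for i in range(1940)],
}
by_gender = stratified_sample(strata, per_stratum=50, rng=rng)

# Hypothetical organization: 12 offices with 80 employees each.
offices = {f"office_{i}": [f"emp_{i}_{j}" for j in range(80)]
           for i in range(12)}
by_office = cluster_sample(offices, n_clusters=4, per_cluster=20, rng=rng)

print({name: len(people) for name, people in by_gender.items()})
print(sorted(by_office))  # the four offices we would actually visit
```

The stratified draw yields 50 women and 50 men from a list that is only 3% female, while the cluster draw limits travel to four offices instead of twelve.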
For example, if we wanted to study a probability sample of employees of an organization with offices scattered throughout the U.S., we could save a lot of money and effort by first getting a list of the various offices of the organization and drawing a random sample of the offices; then we could go to only the offices selected, and study a sample of members from each. Similarly, if we wanted to administer surveys to a sample of G.U. undergraduates living on campus, we could save a lot of time by first drawing a random sample of dormitories and other campus housing from the list of all G.U. residences, and then going only to those residences to distribute surveys either to every member of each residence or to a randomly selected sample from each residence. For example, drawing a cluster sample by first randomly selecting five residence halls from the 15 G.U. residence halls and two apartments from the six G.U. apartments would cut the time and effort of obtaining surveys from a random sample of G.U. students living on campus to about a third of what it would take to obtain the same number of surveys from a simple random sample.

III. NONPROBABILITY SAMPLES

While probability samples are the best for obtaining representative samples, they are not always desirable or necessary. Some populations cannot be studied with probability samples at all because there is no list of the members, and members may be hard to locate. For example, it is impossible to study probability samples of deviant groups (like employees who steal from the company), groups with no permanent location (like the homeless), or even potential customers or clients (like the population of people who wear hiking boots or the population in need of some kind of professional services) because there is no list of the population of these groups, and even if there were such a list, tracking them down could be impossible.
As a consequence, a number of nonprobability sampling techniques have been developed. As is the case with the various types of probability samples, each type of nonprobability sample has strengths and weaknesses, and the best choice depends on the kind of research being conducted.

A. Accidental Or Convenience Samples.

When lay persons say they selected people “at random” they are often referring to convenience sampling or accidental sampling (the two terms are interchangeable) rather than true random sampling. With convenience sampling we simply reach out to whoever is immediately available and take the first population elements we come across. Convenience samples are the lowest cost samples in terms of time and effort, but they are also the least likely to be representative. Taking whoever is within easy reach almost guarantees that the sample will not be representative of the population. If we simply sample from the people around us at school or work, there is a good chance that the sample will primarily consist of people who are very similar to us. If students sample their classmates, they will be oversampling people with interests, backgrounds, age, and career goals similar to their own.

Of course, we could go to some convenient location that is not a regular stop in our daily routine in an effort to get a wider range of respondents, but such a sample will probably not be representative. For example, many marketing studies are done with “intercept samples”: convenience samples drawn by going to a public place (like a mall) and asking people to complete a survey as they walk by. However, a mall intercept sample will still not be representative of populations other than the patrons of that particular mall.
Some people are particularly likely to spend a lot of time at malls (for example, teenagers who go to the mall to “hang out” with their friends), while others will go to great lengths to avoid shopping at malls. Furthermore, most malls are located in suburban and urban settings, so people living in rural areas are likely to be underrepresented. Generally, convenience sampling should be our last choice, but sometimes limitations on time and money make convenience sampling the only practical choice.

One way to improve the quality of convenience samples is to use time sampling. For example, if we must use a mall intercept sample, we can make it more representative of the people who use that mall by sampling at different times of the day and different days of the week. (We could even draw a probability sample of times and days.) The population that goes to the mall on weekdays while school is in session and most people are at work is probably quite different from the population that normally goes to the mall in the evening or on weekends.

Similarly, it is possible to make convenience samples more representative of the population we wish to study by sampling people from several different locations because different populations sometimes use different locations for their routine activities. For example, students forced to use a convenience sample drawn from the student population can improve the quality of the sample by going to different locations (the student union, several dormitories, campus dining halls frequented by students who live off campus, the library, etc.) to obtain a more diverse sample. Of course, there is no way to make such a sample completely representative of the population, but such efforts will help reduce some aspects of the oversampling and undersampling that inevitably result when we sample the people most convenient for us to study.

B. Quota samples

One way to improve the quality of a convenience sample is to identify various types of people, groups, or ways that population elements normally differ, estimate the proportion of the population they represent, and then set quotas to ensure that they represent the same proportion of the sample as they do in the population. For example, if we are studying students at a particular university with a convenience sample, we can find out what proportion of the student body is female, and what proportion is male (typically about 55% of most college populations are women), and take steps to make sure that our convenience sample includes the same proportion of men and women. This technique is called quota sampling, and the goal is to obtain a sample that is more representative than could be obtained with a simple convenience sample.

Quota samples, like stratified random samples, may be necessary if we wish to compare two groups that are very unequal in size. For example, at Gonzaga University, the number of students whose parents divorced while they were in school is very small. As a consequence, a researcher who wished to compare students whose parents had divorced to those whose parents had not would need a huge convenience sample to obtain enough students from families that experienced divorce to make meaningful comparisons. A less costly approach would be to set a quota for each group and, after enough students from intact families had been obtained, recruit only those whose parents had been divorced. Such a quota sample could be obtained using the intercept technique by beginning the interview or survey with a question designed to identify people who fit the quota.
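The quota procedure just described can be sketched in a few lines. The stream of passersby, the group labels, and the function name `fill_quotas` are hypothetical; the sketch assumes each person's group is learned from a single screening question.

```python
def fill_quotas(passersby, quotas):
    """Accept people in arrival order until every group's quota is full."""
    sample = {group: [] for group in quotas}
    for person, group in passersby:
        # Interview only if this person's group still has room.
        if group in sample and len(sample[group]) < quotas[group]:
            sample[group].append(person)
        # Stop intercepting once all quotas are met.
        if all(len(sample[g]) == quotas[g] for g in quotas):
            break
    return sample

# Hypothetical intercept stream: (respondent id, answer to the screening
# question). Here every tenth passerby has divorced parents.
stream = [(i, "divorced" if i % 10 == 0 else "intact") for i in range(1, 501)]
quotas = {"divorced": 5, "intact": 5}

sample = fill_quotas(stream, quotas)
print(sample)
```

In this made-up stream the quota for intact families fills after the first five passersby, and the next 45 people are screened only to find four more respondents with divorced parents. The people accepted are still whoever happened to walk by first.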
For example, we could begin with the question, “Are your parents still married to each other?” If the answer is “no”, then we could interview the respondents or ask them to complete the questionnaire, but if the answer is “yes”, we could politely inform them that we are only interested in interviewing people whose parents divorced and move on to the next person who walks by.

It is important to keep in mind that quota samples are almost never representative. The people actually sampled are still a convenience sample. In the example above, the children of divorce sampled are a convenience sample of all such children, and the children whose parents never divorced are a convenience sample of people from intact families.

C. Purposive Samples

Sometimes the goal is not to obtain a representative sample of the population. Purposive sampling involves hand picking population elements to meet certain predefined criteria. As your text indicates (see page 136), hand picking cases will inevitably produce a biased sample, but sometimes we actually want a biased (or at least a nonrepresentative) sample.

Comparative samples. In organizational research we often need to have some basis for comparison if we are to correctly interpret our data. For example, an organizational researcher might be asked by management to conduct a study to determine what programs or innovations would benefit employees with young children. This kind of study, called a needs assessment, will be successful only if the unique or special needs of the group studied are identified, but some of the needs of this group will be common among all employees.
If we were interested in identifying the special needs of employees with young children by interviewing them or asking them to complete a survey, we would have no way to tell which of their needs are unique, and which are similar to those of other workers, including those who are single, those with grown children, and those who are childless. One solution to this problem is to draw two samples: a sample with the characteristics we are interested in (in this case, employees with young children), and another sample that is similar in as many ways as possible to the group we are interested in except for the key factor. For example, for the hypothetical needs assessment for employees with young children, we could draw a second quota sample of workers who do not have young children and compare the results obtained from the two samples. The presence of this comparison sample will sensitize us to the unique needs of the group we are interested in.

Outlier analyses. Sometimes studying a sample that is representative of the general population yields very little information because there is not much variation among most members of the population. For example, most research designed to measure the characteristics of successful professionals has yielded little useful information because most members of any profession are quite similar in both competence and background. Professional schools generally have high standards, and most professionals must meet certain minimum requirements to be licensed to practice their profession. As a consequence, most members of a profession are quite similar in regard to their training, qualifications, and background. Thus, while there are some outstanding performers in every profession (and there are also some who are less able than average), they are relatively rare.
As a result, the factors that account for excellence (or failure) in a profession are hard to identify by studying a representative sample because the vast majority are so similar in regard to both competence and background. One approach for identifying the characteristics of truly outstanding professionals is to select and compare outliers: in this case, professionals who are much better than average and those who are much less successful than average. By eliminating the vast group of competent professionals who are close to average in ability and looking only at the extremes, the differences become much more apparent.

Other types of purposive samples. Comparative samples and outlier samples are among the more commonly used purposive samples, but many other variations are possible. What all purposive samples have in common is that population elements are hand picked with some goal in mind. It is important to remember that purposive samples should not be used if the primary goal is to create a representative sample of the population. Hand picking almost guarantees that the sample will not be representative. However, a properly designed purposive sample can yield insights that would not be obtained with other types of samples, including probability samples. As a consequence, purposive samples are particularly useful for exploratory research conducted on topics that are poorly understood or are new to science.

D. The Snowball Sample

Sometimes we want to study a population that is hidden or includes many people who do not wish to be part of a study. Studies of these populations are often impossible if traditional probability or nonprobability sampling is used. For example, if an organizational researcher wished to study the working conditions of illegal immigrants, none of the traditional probability or nonprobability techniques would be useful.
There is no list of illegal immigrants, so probability sampling is out of the question, and nonprobability sampling techniques such as convenience sampling in neighborhoods where many illegal immigrants are likely to live are not likely to yield an adequate sample because most illegal immigrants will avoid strangers asking questions because they might be immigration officials. Similarly, studies of workers who are dissatisfied or disgruntled with management would be difficult if traditional sampling techniques were used because, again, there is no list from which to draw a probability sample, and convenience sampling may fail because the workers may fear the researcher will report them to management.

In recent years, sociologists facing this kind of problem have developed a new solution called snowball sampling. Snowball samples are obtained by first identifying a few people who have the desired characteristics. Typically the researcher will begin with a few friends or acquaintances, or a knowledgeable person is contacted and asked to refer the researcher to a few people with the relevant characteristics. The researcher then interviews, surveys or observes the few people identified, but at the end of each interview or observation session, the researcher asks the respondents if they know anyone else with the relevant characteristics. The people identified by the original respondents are then asked to cooperate, and, after the data are collected, these new respondents are asked to recommend still others for inclusion in the study. In this way a large sample can be gradually created from just a few cases.

Snowball sampling is effective because most people know and are most comfortable with others who are similar to them (social psychologists call this the “principle of homophily”). As a consequence, the best way to find a particular type of person is to ask someone who shares the trait in question.
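The referral chain described above can be sketched as a simple walk outward from the first few respondents. The names, the `referrals` table, and the function `snowball_sample` are invented for illustration; in a real study the referrals are only discovered one interview at a time.

```python
from collections import deque

def snowball_sample(referrals, seeds, target_size):
    """Grow a sample by following referrals outward from a few seed people."""
    sampled = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < target_size:
        person = queue.popleft()
        sampled.append(person)             # interview this person
        for contact in referrals.get(person, []):
            if contact not in seen:        # skip anyone already referred
                seen.add(contact)
                queue.append(contact)
    return sampled

# Hypothetical referrals: who each respondent names at the end of the interview.
referrals = {
    "Ana": ["Ben", "Cruz"],
    "Ben": ["Ana", "Dee"],
    "Cruz": ["Dee", "Eli"],
    "Dee": ["Fay"],
}

print(snowball_sample(referrals, seeds=["Ana"], target_size=5))
```

Starting from a single acquaintance, the sketch reaches five respondents in the order Ana, Ben, Cruz, Dee, Eli. Everyone in the sample is connected to the starting person, which is one reason a snowball sample cannot be assumed to be representative.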
Furthermore, when we inform people that a mutual acquaintance has recommended them to the researcher, they are much more likely to trust the researcher and cooperate.

IV. Avoiding Sampling Problems in Organizational Settings

Most research methods texts are designed primarily for researchers who want to generalize from their research to large populations, so they often assume that the researcher will be studying a sample. But generalizing to large populations is not always the goal of organizational research. Often, the researcher is concerned only with the organization itself, and many organizations have fewer than 500 members, all of whom work in the same location. For example, a small manufacturing company may want to know if a new program has improved the morale or job satisfaction of its members, or a social service organization with a client base of several hundred individuals might want to know if they are satisfied with the services they are receiving. This kind of research does not require a sample at all; smaller organizations often have the resources to study every member of the organization or every current client. This kind of study is called a census, and, if possible and affordable, a census is always preferable to a sample study because there is always some chance that any sample, even a simple random sample, might not be representative of the population.

In organizational research, it is always best to study the entire population if that is feasible. So long as there is a high level of participation, a census eliminates the possibility that the results are biased through sampling errors. When the population of interest is only a few hundred individuals, and they all are available in one location, it is often less costly and less time consuming to study the whole population than to try to draw a representative sample for study.
Of course, if the organization is very large and/or its members are scattered over a wide area, then a sample study may be the only choice.

However, if participation is low because many refuse to answer questions, then sampling bias becomes an issue even if a census study has been completed. Low return rates are particularly likely when researchers rely on mailed surveys or internet surveys. For example, if a researcher uses a mailed survey or an e-mail survey to measure employee satisfaction, it would not be unusual for less than half of the employees to return a completed survey. Furthermore, since the most dissatisfied employees may be more highly motivated to let management know what they think, the results could be somewhat biased by high return rates among the least satisfied and low return rates among the most satisfied. When return rates are low, it is very likely that the survey results would not be representative of the entire population.

Considerable research and experimentation has been done to identify strategies for increasing return rates for mailed and internet surveys. One of the best discussions of these strategies is Dillman, Don A. 2000. Mail and Internet Surveys: The Tailored Design Method, 2nd Edition. John Wiley & Sons.

SAMPLING APPENDIX 1
ESTIMATING DESIRED SAMPLE SIZE
E. F. Vacha 2/20/08

Most people understand that the goal of most (but not all) sampling is to create a sample that is representative of the population. One of the most frequent questions I am asked by both clients and research methods students trying to design a good sampling procedure is “how many people should be in my sample?” Unfortunately, the answer to that question is not simple because it depends on a very nonintuitive aspect of probability statistics.
Sample size estimation is nonintuitive because most of us think of sample size in terms of what percentage of the population is included in the sample, and we incorrectly assume that the “representativeness” of a sample depends on its size relative to the population. In reality, the representativeness of a sample depends on the spread of scores we would obtain if everyone in the population were included in the study. If the scores that people get on whatever measure we are using vary widely, with lots of people getting low scores and many also getting high scores (a lot of spread), we will need a large sample. But if the scores people can obtain vary by only a few points (very little spread), we may be able to get by with a very small sample. If the spread of scores is quite large, it is conceivable that through sheer bad luck, a small sample could include far too many people with scores clustered around one extreme. However, if the scores have very little spread, even a small sample will usually include people at both the high and low extremes as well as people scoring in the middle of the range.

The most common measure of spread of scores is the standard deviation. It is a measure of the average difference between the actual score of each individual and the mean (average) score of the whole sample. If the spread of scores is quite large, the average difference between individual scores and the mean will be large as well, but if the spread of scores is small, most scores will cluster near the group average, so the average difference between individual scores and the mean will be quite small. The standard deviation is simply a way of expressing the average difference between individual scores and the overall average score. If a graph of the scores is a bell shaped or “normal” curve, about two-thirds (roughly 68%) of the scores will be within one standard deviation of the mean. That is, if the average score is 20 and the standard deviation is 2, about two-thirds of the scores will fall between 18 and 22.
If, however, the average score is 20 and the standard deviation is 5, then two-thirds of the scores will be spread out between 15 and 25.

Because the quality of a sample depends on the spread of the scores rather than the size of the population, a poll of 50,000,000 voters may require a sample size no larger than a study of a medium sized organization with only a few thousand members. For example, presidential polls can provide consistently accurate results with random samples of around 1,500 because the standard deviation in a two-choice situation is very small.

Sample Size Estimation for Probability Samples

One of the great advantages of probability sampling is that it allows us to use a mathematical formula to estimate how large our sample should be. To use this formula we must first decide how accurately we want our average sample score to represent the population average (the average score we would obtain if everyone in the population were included in the sample). We call this value the “degree of accuracy”. We must also decide how confident we want to be that the sample score is at the level of accuracy we choose (we call this the “confidence level”). For example, we might decide that we want to try for a sample average score that is within one point of the actual population score, and we want to be 95% sure that the sample will hit that degree of accuracy. Then, if we can estimate the population standard deviation, we can plug all three values into a simple formula to obtain the ideal sample size. Here is the calculation:

1. To determine the desired size of a sample (or subgroup within a sample) for estimating a population mean, three values must be known or guessed:
   a. The confidence level. Usually a 95% confidence level is used. “We will be 95% sure that the mean is within a specific range or degree of accuracy.”
   b. The degree of accuracy. The degree of accuracy is a range expressed as plus or minus some value.
      E.g., “+/- .1” means the true average in the population will be within one tenth of a point of the value obtained from the sample.
   c. The population standard deviation. The range of scores above and below the mean that includes about 2/3 of the population. If the mean is 3 and the population standard deviation is 1, about 2/3 of the population would score between 2 and 4.
2. The population standard deviation must be guessed. If respondents can be expected to have similar scores, the population standard deviation will be small, but if respondents’ scores vary a lot, the population standard deviation will be large. Most researchers use the sample standard deviation(s) from previous studies to estimate the population standard deviation.

The table below illustrates the calculation and some typical results. Notice that when the population standard deviation is small (SD = 1), and we are willing to be 95% sure that our sample’s average score is within .4 points of the actual population average score, our sample need only include 24 individuals, but when the population standard deviation is 3.0, we need 216 in our sample. Of course, we must have a probability sample to use this approach to estimating sample size.

SOME SAMPLE SIZE CALCULATIONS

                                                  Population Standard Deviation (SD) Assumptions
Other Assumptions                                   SD = 1   SD = 1.5   SD = 2   SD = 2.5   SD = 3.0
Confidence Level = 95%, Degree of Accuracy = .1        384        864    1,536      2,401      3,457
Confidence Level = 95%, Degree of Accuracy = .2         96        216      384        600        864
Confidence Level = 95%, Degree of Accuracy = .3         43         96      171        267        384
Confidence Level = 95%, Degree of Accuracy = .4         24         54       96        150        216

Values: 95% Confidence Level = 1.96; 99% Confidence Level = 2.576
Formula: Sample N = (Conf. Level x Est. Pop. St. Dev. / Deg. of Accuracy)^2

Nonquantitative Approaches To Estimating Sample Size

Of course, if we have a nonprobability sample like a convenience, quota, or snowball sample, we can’t estimate the ideal sample size mathematically.
Also, sometimes we have no basis for estimating the population standard deviation of a probability sample. In these situations, the general practice is to use the largest sample we can, and to base sample size decisions on the size of the subgroups we wish to study.

Estimates based on number and size of subgroups. One approach for estimating sample size that is useful for both probability and nonprobability sampling is to base the sample size on the smallest subgroups studied. Whenever we wish to compare groups or types of people to each other, we need to base our sample size on the smallest of those groups because we need to be sure there are enough in each group to provide meaningful comparisons. For example, if women and men in an organization were roughly equal in number, a meaningful comparison might be generated from a sample of only 100-200. However, if only 15% of the population is women, a sample of 100 would yield only about 15 women. With a subgroup as small as 15, if only a few atypical cases or “outliers” were accidentally included in the sample, the average scores for women could be very misleading. Similarly, if we want to compare seven or eight subgroups, we would need a large sample to make sure we have enough people from each group.

Current practices. Another approach to estimating sample size is to base one’s decision on current practices by researchers in the field. For example, researchers could review similar studies on their topic to discover what size samples are usually used. Rossi, Wright and Anderson's (1983) review of several hundred published sociological studies found that typical sample sizes of published research differed depending on whether the studies were national or regional, and depending on the number of subgroup comparisons that were made.
Most organizational studies are more similar to regional studies than national studies because the focus of the study is often on a particular kind of organization or even just one organization. Rossi et al. found that, for regional studies, when few or no subgroup analyses were conducted, samples ranged from 200-500; when an "average" number of subgroup analyses were conducted, sample sizes ranged from 500-1,000; and when many subgroup analyses were conducted, sample sizes ranged from 1,000-2,500.

Sample Sizes For Exploratory Qualitative Research

A special situation exists when researchers use qualitative methods like focus groups, participant observation, and unstructured interviews. All of these approaches to data collection involve intensive and extended interaction between the researcher and each subject. Unstructured interviews can consume several hours per subject, focus groups of 5 to 10 members sometimes require multiple meetings of several hours each, and participant observation often requires months observing a small number of people. Qualitative methods are unsuitable for testing hypotheses because they demand the use of small samples selected from those immediately available to the researcher. The researcher sacrifices representativeness in order to gain a much more complete, nuanced, and rich body of data about each subject. In most such studies, sample sizes typically range between 25 and 250. However, such small samples are adequate for creating hypotheses that can be tested in future research using methods that lend themselves to the study of larger, more representative samples.
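
Worked sketch of the appendix arithmetic. The sample size formula and the subgroup reasoning in this appendix can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the original procedure: the function names (sample_size, expected_subgroup_count, total_for_subgroup_target) are hypothetical, the lookup table holds only the 95% and 99% confidence levels, and the target of 50 women for a subgroup comparison is an assumed figure chosen purely for the example.

```python
import math

# Standard normal critical values for the two confidence levels
# discussed in the appendix (assumption: no other levels are needed).
Z_VALUES = {0.95: 1.96, 0.99: 2.576}


def sample_size(sd_estimate, accuracy, confidence=0.95):
    """Unrounded sample size: (z * estimated SD / degree of accuracy) ** 2."""
    z = Z_VALUES[confidence]
    return (z * sd_estimate / accuracy) ** 2


def expected_subgroup_count(total_sample, subgroup_share):
    """Expected number of subgroup members in a sample of a given size."""
    return total_sample * subgroup_share


def total_for_subgroup_target(target_n, subgroup_share):
    """Total sample needed so the smallest subgroup is expected to reach target_n.

    The tiny tolerance guards against floating-point noise pushing an
    exact result (e.g. 15 / 0.15) up to the next whole number.
    """
    return math.ceil(target_n / subgroup_share - 1e-9)


if __name__ == "__main__":
    # Rows of the sample size table, rounded to the nearest whole number.
    for accuracy in (0.1, 0.2, 0.3, 0.4):
        row = [round(sample_size(sd, accuracy)) for sd in (1, 1.5, 2, 2.5, 3)]
        print(f"accuracy +/- {accuracy}: {row}")

    # The 15%-women example: a sample of 100 is expected to include about 15 women.
    print(expected_subgroup_count(100, 0.15))

    # Hypothetical target of 50 women for the comparison (an assumed figure).
    print(total_for_subgroup_target(50, 0.15))
```

The values are rounded to the nearest whole number here only so they can be compared with the table; many researchers round up instead, so that the sample never falls short of the target. The subgroup helper gives an expected count, not a guarantee, so a real design would usually allow some margin above it.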