SAMPLING FOR PRESERVATION ASSESSMENT: A CONVERSATIONAL APPROACH

Philip Doty
School of Information, University of Texas at Austin
INF 392G, Management of Preservation Programs (Ellen Cunningham-Kruppa)
October 2, 2006

Doing research in real collections often demands that we study samples rather than entire collections. To understand this process conceptually and in its specific applications, let's consider sampling more fundamentally. Similarly, let's look at other concepts and tools in light of what we know about sampling.

Conceptual foundations

While we all generally know that a sample is some wholly contained subset of a (larger) population, defining a population is a bit more difficult. Think of a population as the entire group of persons, artifacts, historical eras, institutions, events, practices, and so on that interest the researcher. They are of interest because they share a characteristic, or characteristics, that matters to the research -- it is this characteristic that makes them a population, nothing else. A sample, then, is some subset of this larger group chosen for participation in the research.

Generally speaking, researchers use samples rather than entire populations for three reasons:

1. The population is too large to observe, e.g., all people living in the United States.
2. The population is theoretical, e.g., all possible throws of a pair of dice.
3. The process of research is destructive, e.g., testing the lifetime of batteries.

Thus, researchers must depend upon samples because they cannot look at every individual member of the population. So another way of thinking of sampling is that it is the choice of what to observe from the infinite possibilities that confront us as researchers.

A powerful way to consider the relationship between populations and samples is that researchers use what they see in samples to gain some understanding of what they cannot see in populations. In order to make such generalizations (to achieve what researchers in the positivist school call "external validity"), researchers must be as certain as they can be that their samples are representative of the population of interest. In order to achieve representativeness, researchers generally use some form of random sampling, sometimes called probabilistic sampling.

Random sampling and representativeness

While there is a huge applied and theoretical literature about random sampling, we can make some useful summary comments here. When we say that a sample is random (or probabilistic), we mean two things, the first of which is more important:

1. Every member of the population has an equal chance of being chosen for participation in the research.
2. Every member of the population has a known chance of being chosen for participation in the research.

What these assertions mean is that the particular members of the population included in the research, i.e., the sample, must be chosen randomly in order for the researcher to be able to make any argument that the sample represents the population as a whole. This assertion rests on some very complex, abstract arguments, many of which are the object of great disagreement that we cannot explore here.
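To make the first property concrete, here is a minimal Python simulation (my illustration, not part of the original handout): it draws many simple random samples from a small stand-in population and checks that every member is included at roughly the same rate. The population of 20 and sample of 5 anticipate the book example below.

```python
import random
from collections import Counter

POPULATION = list(range(20))   # stand-in population of 20 members
SAMPLE_SIZE = 5
TRIALS = 100_000

counts = Counter()
for _ in range(TRIALS):
    for member in random.sample(POPULATION, SAMPLE_SIZE):
        counts[member] += 1

# Each member should appear in roughly SAMPLE_SIZE / len(POPULATION)
# = 0.25 of the trials -- the "equal chance" property in action.
for member in POPULATION:
    print(member, round(counts[member] / TRIALS, 3))
```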
One of the fundamental assumptions of the positivist approach to research is that random sampling, and the resultant representativeness, is likely to eliminate bias in a sample. That is, the sample will not be systematically in error. In addition, this approach assumes that noise (random variability or the effects of chance) will cancel out in the long run. We use statistics, particularly inferential statistics, to help us quantify the effects of chance in the results that we get when doing research. Another way of making this assertion is that statistics helps us to understand how wrong we might be when generalizing from a sample to a population. We'll look at this assertion more closely below, particularly when we discuss sample size and sampling error.

Identifying random samples

In various literatures, we generally see three major ways of implementing random sampling, although there are many ways to do so (Babbie, 2001). The first of the three is simple random sampling. Let's look at a small but useful population to help us understand this process. Suppose we have a 20-member population of books, 17 of which were published after 1950, with the remaining three published in the 17th and 18th centuries. We will identify the first seventeen as B1-B17, and the last three as B18, B19, and B20.

If we wanted to choose a random sample of five books from this group of 20, we might put the numbers 1-20 in a hat and pull out five numbers at random, after mixing them up, without peeking, and without replacing those once chosen. The chance that any particular book will be chosen on the first draw is 1/20, or 0.05. Overall, the chance that a book published since 1950 will be chosen on that draw is 17/20 (0.85), while the chance that a book from the earlier period will be chosen is 3/20 (0.15). Remember that the sum of all of the possibilities must equal 1.00.

Now suppose that we still want to look at a sample with five observations but are particularly interested in ensuring that we include at least one of the books published in the 17th and 18th centuries. Of course, this consideration is very common among preservation administrators, conservators, and collection managers. Then we would need to use stratified random sampling, the second major way of determining random samples that we see in the literature. Some researchers refer to a stratified random sample as a quota sample because the researcher wants to ensure that some sub-population(s) is included in the sample, often keeping the proportions in the sample that the sub-population(s) have in the population as a whole.

We can choose as we did earlier and stop after the first five choices if we get at least one of B18, B19, and B20. If we don't get at least one of those, or even at the start of the research, we might make two "piles" -- one that represents B1-B17, the other B18-B20. Then we could choose four numbers randomly from B1-B17 and choose the fifth item randomly from only B18-B20. The sample still has five elements and is still random, but we ensure that it includes at least one of the earlier published items. This method is the approach Walker et al. (1985) take with the strata from the various Yale libraries and Teper & Atkins (2003) take with the various decks in the Illinois study.
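Here is a short Python sketch (mine, not from the readings) of both procedures on the 20-book example: a simple random sample of five, and the "two piles" stratified version that guarantees at least one early book.

```python
import random

# The 20-book population from the example: B1-B17 published after 1950,
# B18-B20 published in the 17th and 18th centuries.
recent = [f"B{i}" for i in range(1, 18)]
early = ["B18", "B19", "B20"]

# Simple random sample of five: every book has probability 5/20 = 0.25
# of being included, but no guarantee that an early book is among them.
simple_sample = random.sample(recent + early, 5)

# Stratified random sample: four chosen randomly from the post-1950
# stratum, one chosen randomly from the early stratum, guaranteeing
# that at least one early book appears.
stratified_sample = random.sample(recent, 4) + random.sample(early, 1)

print("Simple:    ", simple_sample)
print("Stratified:", stratified_sample)
```

Note that the stratified draw deliberately over-represents the early stratum (1 of 5 in the sample versus 3 of 20 in the population); whether that matters depends on the research question, as the next paragraphs discuss.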
There are other ways to achieve the same goal, but you can see that we have ensured "wider" representativeness of the collection by including one of the earlier works, if there is a reasonable foundation for believing that date of publication matters to the researcher's questions. It is very common for researchers to want to ensure that a minimum number of members of a sub-population(s) of interest are part of the sample(s) chosen for participation in the research. We see this kind of concern in all types of applications, e.g., political polling, biomedical research, strength of materials testing, and on and on.

We also see a third approach in the research literature, usually called multistage random sampling. Such an approach has much in common with the second method described above. You might want to look at multistage random sampling on your own, as well as some of the other ways of implementing a random or probabilistic sample, especially when the population is very large and generating complete sampling frames is impractical. See the last two pages of this handout for a useful summary of sampling methods from Trochim (2001, p. 19, Table 2.1).

Be skeptical, not cynical

Before we proceed to working through the application of these principles to random sampling for identifying a collection's preservation needs, let me make a few more general remarks. As readers and researchers, we should expect that any researcher using random sampling can:

1. Document clearly how a sample, or samples, was chosen.
2. Articulate explicitly what specific procedures were followed to ensure randomness and, thus, representativeness and generalizability.
3. Tie this conceptual argument in a clear and explicit way to the particular data analysis techniques used in the research.

If any of these elements is missing from a research report, the skeptical (but not dismissive) reader has substantial reason to doubt the accuracy and wider applicability of the research findings and conclusions (see, e.g., Best, 2001). Let us look more closely now at how we can use these concepts in the context of preservation management of collections.

Implementation with a collection

The specific procedures of preservation surveys are clearly presented in your readings and in the wider literature. I will emphasize a few activities and ideas that are key to the successful implementation of a random sampling strategy and key to the informed reading of others' research.

Determining sample size

Determining how many elements a sample should have can appear daunting. If we keep a few basic principles in mind, however, we can move forward with some confidence. No matter how large a sample is, it must necessarily include some error, some uncertainty. More specifically, there will be some unknown difference between the characteristics of the sample (what you have) and the characteristics of the population (what you want to know about). This deviation between sample statistics and population parameters is called sampling error. It is unavoidable -- we must reconcile ourselves to that fact while also trying to minimize it and to get reliable estimates of its size. Even with representative samples, our conclusions about the population based on a sample or samples can be very wrong. A brief simulation of sampling error follows.
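The following Python simulation (my illustration; the 500,000-item collection and the 10% figure anticipate the worked example below) draws repeated samples of 2,000 from a population in which exactly 10% of items need treatment, making the scatter of sample results around the true value visible.

```python
import random

# Hypothetical population echoing the worked example below: 500,000
# items, of which exactly 10% (items numbered 0-49,999) need treatment.
N = 500_000
NEEDY = N // 10  # items 0..49,999 "need treatment"

# Draw many samples of 2,000 and record each sample's proportion of
# needy items. The scatter of these proportions around the true 0.10
# is sampling error made visible.
proportions = []
for _ in range(1_000):
    sample = random.sample(range(N), 2_000)
    needy_in_sample = sum(1 for item in sample if item < NEEDY)
    proportions.append(needy_in_sample / 2_000)

print("true proportion:        0.100")
print("smallest sample result:", min(proportions))
print("largest sample result: ", max(proportions))
```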
A second principle to recall is that the size of the population of interest usually does not determine the size of a representative random sample. If the population is a great deal larger than the sample, the size of the population matters little to the size of the sample.

A third fundamental principle is that the major concern related to sample size, apart from the three reasons for using samples instead of populations mentioned above, is the amount of error the researcher can accept. This consideration often involves the real-world consequences of being wrong. In the general context of preservation decisions, only the researchers themselves, their parent institutions, and their peer groups can decide that question.

Sampling error is determined by the sample size, and the amount of sampling error the researcher can accept gives us a relatively easy way to calculate sample size, using tables such as the one attached from Babbie (2001, Appendix H, pp. A38-A39). His and similar tables are based on two important assumptions:

1. The choice we have when classifying a member of the sample is dichotomous, i.e., we decide that a book either needs preservation treatment or it does not; this rationale is why he uses the binomial percentage distribution.
2. We want 95% confidence in our result, i.e., we are willing to be wrong 5% of the time in the long run in this dichotomous situation. As you know, researchers often are willing to accept 95% confidence in their probabilistic findings. For other percentages, see other texts and collections of tables.

We might start our research about a collection of 500,000 items thinking that, given our resources (especially time, attention, and $$$), we'll look at a sample of 2,000 items. Thus, the sample is 0.4%, or 0.004, of the total population. If we were to use this sample and got the result that 10% of items needed preservation treatment, we would expect that 10% of the population as a whole would need preservation treatment, plus or minus 1.3% (10 ± 1.3%), according to Babbie's table. In other words, we would be 95% confident that anywhere between 8.7% and 11.3% of the items in the collection as a whole would need preservation. Is 95% a level of confidence you can live with? Most times, it is.

Suppose you had to take a sample of only 1,000 items, still a very large sample. Then, at the same 95% level of confidence, and presuming we got the same 10% proportion of items needing preservation, we would be 95% confident that the relative proportion of items in the population needing preservation would be 10 ± 1.9% (again from Babbie's table), or from 8.1% to 11.9%. We get a larger amount of uncertainty, that is, a wider interval, because we didn't look at as large a sample as we did above. This principle also applies if we are using an inferential statistical technique such as the generation of confidence intervals around µ (mu, the population mean) -- see any statistics text for further explanation. This approach is generally used in dichotomous situations like the example using the binomial noted above and the one discussed in Walker et al.'s methodological appendix about the Yale study (1985, pp. 127-128). Teper & Atkins (2003) use the same basic strategy, albeit a bit differently (see Babbie, Table H, on estimated sampling error, and the row for n = 400 and its various columns).
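The figures from Babbie's table can be checked with the usual normal approximation to the binomial, which is presumably what underlies such tables; here is a small Python sketch (my illustration, not Babbie's method) that reproduces the ±1.3% and ±1.9% margins.

```python
from math import sqrt

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """95% margin of error for a sample proportion p with sample size n,
    using the normal approximation to the binomial (z = 1.96 for 95%)."""
    return z * sqrt(p * (1 - p) / n)

# The handout's worked example: 10% of sampled items need treatment.
for n in (2_000, 1_000):
    moe = margin_of_error(0.10, n)
    print(f"n = {n}: 10% +/- {100 * moe:.1f} percentage points")

# Expected output (matching Babbie's table to one decimal place):
# n = 2000: 10% +/- 1.3 percentage points
# n = 1000: 10% +/- 1.9 percentage points
```

The same function also makes the next point visible: because the margin shrinks with the square root of n, halving the interval requires quadrupling the sample.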
A quick methodological aside: a summary worth remembering is that the standard error of the mean (SE), a common measure used to estimate sampling error, varies inversely with the square root of the sample size: SE = s/√(n − 1), with s equal to the standard deviation of the sample and n the sample size. For the binomial generally, SE = √(PQ/n), where P is the probability of one outcome and Q the probability of the other; again, n is the sample size. To decrease the amount of uncertainty we can tolerate, for example, from 5% to 1% in a dichotomous situation like the one described above and like the ones you will generally face professionally, we would need to make the sample 25 times as large: five times the precision requires 25 times the sample.

In most inferential circumstances, sample size should be a minimum of 30 items, and a sample size of 2,000 is often used for populations of several hundred thousand or more (most researchers use what are called non-parametric methods to address samples smaller than 30). One can choose between these two extremes by maximizing sample size given your resources and being explicit about what level of uncertainty you can tolerate. Generally speaking, however, if you are doing a large-scale preservation assessment and you can take a sample larger than 2,000, then go for it! No matter what the circumstances or the size of the study, learn to live with the samples that you can take. Remember, the process of sampling inevitably involves uncertainty and error, no matter how large the sample and no matter how meticulous a methodologist you are. Your job is to minimize error and then make professional decisions in the face of uncertainty, not in its absence.

Using or generating a random number table

The use of random number tables is often essential to choosing a random sample for preservation research and is often among the first activities performed in a research project. There is a great deal of disagreement about whether a computer can generate random numbers and whether any random number table is indeed random. While that controversy is serious and, at times, amusing, we cannot engage it here. Instead, I and many others feel secure saying that such tools are "random enough." Although many of you will use Web sites to generate random numbers for sampling purposes, it is also worthwhile to learn how to use random number tables in your research more autonomously.

Let's start with the attached tables of random numbers; the first is from Babbie (2001, Appendix E, pp. A33-A34), while the second is from Spatz (2005, Table B, pp. 382-385). Spend a little bit of time getting familiar with these tools -- as you can see, each is made up of many columns of five-digit numbers. Use Babbie's instructions (2001, "Using a Table of Random Numbers," pp. 198-199) to see how to go about choosing a sample randomly in a collections-based setting (I've also attached Spatz's shorter instructions from pp. 145-146 of his book). A quick summary is to (a sketch of the same selection in code follows the list):

1. Number all the items in the population, either actually or in some way that indicates you're sure of the total, or as sure of the total as you can be. That might mean using the techniques for considering the strata, ranges, shelves, and other sub-elements of the collection as indicated in Walker et al. (1985).

2. Decide how many digits you'll need in your numbers. In our example above we said that the collection as a whole had 500,000 items, of which we would sample 2,000. Thus, we need six (6) digits in our numbers, since 500,000 is a six-digit number.

3. Even though Babbie's and Spatz's tables have only five digits, we can use any digit immediately to the left or right, or above or below, or wherever, to make any five-digit number into a six-digit number.

4. Follow the procedures as discussed in Babbie and Spatz to identify 2,000 unique items in the collection, randomly chosen.
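For those generating random numbers in software rather than from a printed table, the whole procedure collapses into a few lines of Python (a sketch of my own, not a prescribed method). Fixing a seed lets you document exactly how the sample was drawn, which the "Be skeptical" checklist above asks of any researcher.

```python
import random

# Software stand-in for steps 1-4: once every item in the collection is
# numbered 1 through 500,000, random.sample returns 2,000 unique,
# randomly chosen item numbers -- the same outcome the printed-table
# procedure produces by hand.
COLLECTION_SIZE = 500_000
SAMPLE_SIZE = 2_000

random.seed(20061002)  # fixing a seed documents and reproduces the draw

item_numbers = random.sample(range(1, COLLECTION_SIZE + 1), SAMPLE_SIZE)
item_numbers.sort()  # sorted numbers make shelf retrieval easier

print(item_numbers[:10])  # first ten selected item numbers
```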
After you've gone through this process, the research then focuses on judgments about preservation treatment, based on the explicit criteria developed for making them, and on the process of data analysis and drawing conclusions. A researcher should not start the process of random sampling, no matter how rigorously done, without knowing the specific reasons for doing the research and (largely) what data analysis techniques will be used on the data.

Summary remarks

Remember that the process of doing research is one of the hallmarks of mature professionals. You can do it, and you can do it well. Always keep in mind that we can never be certain about our conclusions when we do research, for a number of reasons, especially if we use samples. Sampling inevitably introduces error and uncertainty into our measurements and conclusions because of the differences that exist between what we see in the samples and what "the reality" is in the population as a whole. With all that said, however, your two main jobs as a researcher are to minimize error and to make decisions based on the knowledge you have in hand. As my grandmother used to say to me, all we can do is the best we can with what we have. A little knowledge, uncertain and fragile as it might be, is much better than no knowledge at all, as long as we recognize its limitations. Be confident that you can make good decisions based on your own and others' experience and on your own clear and explicit thinking.

References

•  indicates that I refer to these explicitly in this document
R  indicates that I use these in my version of the Research class, INF 397C, mostly as supplemental texts

• R  Babbie, Earl. (2001). The practice of social research (9th ed.). Belmont, CA: Wadsworth Publishing. [I now use the 10th edition from 2004]

• R  Best, Joel. (2001). Thinking about social statistics: The critical approach. In Damned lies and statistics: Untangling numbers from the media, politicians, and activists (pp. 160-171). Berkeley, CA: University of California Press.

R  Katzer, Jeffrey, Cook, Kenneth H., & Crouch, Wayne W. (1998). Evaluating information: A guide for users of social science research (4th ed.). Boston, MA: McGraw-Hill.

R  Rowntree, Derek. (1981). Statistics without tears: A primer for non-mathematicians. New York: Scribner.

• R  Spatz, Chris. (2005). Basic statistics: Tales of distributions (8th ed.). Pacific Grove, CA: Brooks/Cole.

•  Teper, Thomas, & Atkins, Stephanie. (2003). Building preservation: The University of Illinois at Urbana-Champaign's stacks assessment. College & Research Libraries, 64(3), 211-227.

• R  Trochim, William M. K. (2001). The research methods knowledge base (2nd ed.). Cincinnati, OH: Atomic Dog Publishing. See http://www.socialresearchmethods.net/kb/

•  Walker, Gay, Greenfield, Jane, Fox, John, & Simonoff, Jeffrey S. (1985). The Yale survey: A large-scale study of book deterioration in the Yale University Library.
College & Research Libraries, 46(2), 111-132.