SAMPLING FOR PRESERVATION ASSESSMENT:
A CONVERSATIONAL APPROACH

Philip Doty
School of Information, University of Texas at Austin
INF 392G, Management of Preservation Programs (Ellen Cunningham-Kruppa)
October 2, 2006
Doing research in real collections often demands that we study samples rather than entire collections. To understand this process conceptually and in its specific applications, let's consider sampling more fundamentally. Similarly, let's look at other concepts and tools in light of what we know about sampling.
Conceptual foundations
While we all generally know that a sample is some wholly contained subset of a (larger) population, defining a population is a bit more difficult. Think of a population as the entire group of persons, artifacts, historical eras, institutions, events, practices, and so on that interest the researcher. They are of interest because they share a characteristic, or characteristics, that matter to the research -- it is this characteristic that makes them a population, nothing else. A sample, then, is some subset of this larger group chosen for participation in the research.
Generally speaking, researchers use samples rather than entire populations for three reasons:
- The population is too large to observe, e.g., all people living in the United States.
- The population is theoretical, e.g., all possible throws of a pair of dice.
- The process of research is destructive, e.g., testing the lifetime of batteries.
Thus, researchers must depend upon samples because they cannot look at every individual member of the
population. So another way of thinking of sampling is that it is the choice of what to observe from the
infinite possibilities that confront us as researchers.
A powerful way to consider the relationship between populations and samples is that researchers
use what they see in samples to gain some understanding of what they cannot see in populations. In order
to make such generalizations (to achieve what researchers in the positivist school call “external validity”),
researchers must be as certain as they can be that their samples are representative of the population of
interest. In order to achieve representativeness, researchers generally use some form of random sampling,
sometimes called probabilistic sampling.
Random sampling and representativeness
While there is a huge applied and theoretical literature about random sampling, we can make some
useful summary comments here. When we say that a sample is random (or probabilistic), we mean two
things, the first of which is more important:
1. Every member of the population has an equal chance of being chosen for participation in the research.
2. Every member of the population has a known chance of being chosen for participation in the research.
What these assertions mean is that the particular members of the population that are included in the
research, i.e., the sample, must be chosen randomly in order for the researcher to be able to make any
argument that the sample represents the population as a whole. This assertion rests on some very complex,
abstract arguments, many of which are the object of great disagreement that we cannot explore here.
One of the fundamental assumptions of the positivist approach to research is that random
sampling, and the resultant representativeness, is likely to eliminate bias in a sample. That is, the sample
will not be systematically in error. In addition, this approach assumes that noise (random variability or the
effects of chance) will cancel out in the long run. We use statistics, particularly inferential statistics, to help
us quantify the effects of chance in the results that we get when doing research.
Another way of making this assertion is that statistics helps us to understand how wrong we might
be when generalizing from a sample to a population. We’ll look at this assertion more closely below,
particularly when we discuss sample size and sampling error.
Identifying random samples
In various literatures, we generally see three major ways of implementing random sampling,
although there are many ways to do so (Babbie, 2001). The first of the three is simple random
sampling. Let’s look at a small but useful population to help us understand this process. Suppose we
have a 20-member population of books, 17 of which were published after 1950, with the remaining three
published in the 17th and 18th centuries. We will identify the first seventeen as B1 - B17, and the last three
as B18, B19, and B20.
If we wanted to choose a random sample of five books from this group of 20, we might put the numbers 1-20 in a hat, mix them up, and pull out five numbers at random, without peeking and without replacing numbers once chosen. On the first draw, the chance that any particular book will be chosen is 1/20, or 0.05. Likewise, on that draw, the chance that a book published since 1950 will be chosen is 17/20 (0.85), while the chance that a book from the earlier period will be chosen is 3/20 (0.15). Remember that the sum of all of the possibilities must equal 1.00.
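A minimal sketch of this draw in Python, using the same B1-B20 labels as the example (random.sample draws without replacement):

    import random

    # The 20-book population from the example: B1-B17 published after 1950,
    # B18-B20 published in the 17th and 18th centuries.
    population = [f"B{i}" for i in range(1, 21)]

    # A simple random sample of five, drawn without replacement.
    sample = random.sample(population, 5)
    print(sample)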
Now suppose that we still want to look at a sample with five observations but are particularly
interested in ensuring that we include at least one of the books published in the 17th and 18th centuries. Of
course, this consideration is very common among preservation administrators, conservators, and collection
managers. Then we would need to use stratified random sampling, the second major way of
determining random samples that we see in the literature. Some researchers refer to a stratified random
sample as a quota sample because the researcher wants to ensure that some sub-population(s) is
included in the sample, often keeping the proportions in the sample that the sub-population(s) have in the
population as a whole.
We can choose as we did earlier and stop after the first five choices if we get at least one of B18, B19, and B20. If we do not get at least one of those, or even from the start of the research, we might make two "piles": one that represents B1-B17, the other B18-B20. Then we could choose four numbers randomly from B1-B17 and choose the fifth item randomly from only B18-B20. The sample still has five elements and is still random, but we ensure that it includes at least one of the earlier published items. This method is the approach Walker et al. (1985) take with the strata from the various Yale libraries and Teper & Atkins (2003) take with the various decks in the Illinois study.
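In Python, a sketch of the two-pile approach just described, with four items drawn from the post-1950 stratum and one from the earlier stratum:

    import random

    # The two strata from the example above.
    post_1950 = [f"B{i}" for i in range(1, 18)]   # B1-B17
    earlier = ["B18", "B19", "B20"]

    # Four items drawn randomly from the first stratum, one from the second,
    # guaranteeing at least one 17th- or 18th-century book in the sample.
    sample = random.sample(post_1950, 4) + random.sample(earlier, 1)
    print(sample)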
There are other ways to achieve the same goal, but you can see that we have ensured “wider”
representativeness of the collection by including one of the earlier works, IF there is a reasonable
foundation for believing that date of publication matters to the researcher’s questions. It is very common
for researchers to want to ensure that a minimum number of members of a sub-population(s) of interest are
part of the sample(s) chosen for participation in the research. We see this kind of concern in all types of
applications, e.g., political polling, biomedical research, strength of materials testing, and on and on.
We also see a third approach in the research literature, usually called multistage random sampling. Such an approach has much in common with the second method described above. You might want to look at multistage random sampling on your own, as well as some of the other ways of implementing a random or probabilistic sample, especially when the population is very large and generating complete sampling frames is impractical. See the last two pages of this handout for a useful summary of sampling methods from Trochim (2001, p. 19, Table 2.1).
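For illustration only, here is a sketch of a hypothetical two-stage design in Python; the shelf counts and stage sizes are invented for the example, not drawn from Walker et al. or Teper & Atkins:

    import random

    # Hypothetical collection: 500 shelves, each holding 40 items.
    shelves = {s: [f"shelf{s}-item{i}" for i in range(1, 41)]
               for s in range(1, 501)}

    chosen_shelves = random.sample(sorted(shelves), 20)   # stage 1: pick shelves
    sample = [item
              for s in chosen_shelves
              for item in random.sample(shelves[s], 5)]   # stage 2: pick items
    print(len(sample))   # 100 items in the final sample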
Be skeptical, not cynical
Before we proceed to working through the application of these principles to random sampling for
identifying a collection’s preservation needs, let me make a few more general remarks. As readers and
researchers, we should expect that any researcher using random sampling can:

- Document clearly how a sample, or samples, was chosen
- Articulate explicitly what specific procedures were followed to ensure randomness and, thus, representativeness and generalizability
- Tie this conceptual argument in a clear and explicit way to the particular data analysis techniques used in the research.
If any of these elements is missing from a research report, the skeptical (but not dismissive) reader has substantial reason to doubt the accuracy and wider applicability of the research findings and conclusions (see, e.g., Best, 2001). Let us look more closely now at how we can use these concepts in the context of preservation management of collections.
Implementation with a collection
The specific procedures of preservation surveys are clearly presented in your readings and in the
wider literature. I will emphasize a few activities and ideas that are key to the successful implementation of
a random sampling strategy and key to the informed reading of others’ research.
Determining sample size
Determining how many elements a sample should have can appear daunting. If we keep a few
basic principles in mind, however, we can move forward with some confidence. No matter how large a
sample is, it must necessarily include some error, some uncertainty. More specifically, there will be some
unknown difference between the characteristics of the sample (what you have) and the characteristics of the
population (what you want to know about). This deviation between sample statistics and population
parameters is called sampling error. It is unavoidable – we must reconcile ourselves to that fact while
also trying to minimize and get reliable estimates of the sampling error.
Even with representative sample(s), our conclusions about the
population based on a sample or samples can be very wrong.
A second principle to recall is that the size of the population of interest usually does not determine
the size of a representative random sample. If the population is a great deal larger than the sample, the size
of the population matters little to the size of the sample.
A third fundamental principle is that the major concern related to sample size, with the exception
of the three reasons for using samples instead of populations mentioned above, is the amount of error the
researcher can accept. This consideration often involves the real world consequences of being wrong. In
the general context of preservation decisions, only the researchers themselves, their parent institutions, and
their peer groups can decide that question.
Sampling error is determined by the sample size, and the amount of sampling error the researcher can accept gives us a relatively easy way to calculate sample size, using tables such as the one attached from Babbie (2001, Appendix H, pp. A38-A39). His and similar tables are based on two important assumptions:
1. The choice we have when classifying a member of the sample is dichotomous, i.e., we decide that a book either needs preservation treatment or it does not; this rationale is why he uses the binomial percentage distribution.
2. We want 95% confidence in our result, i.e., we are willing to be wrong 5% of the time in the long run in this dichotomous situation. As you know, researchers often are willing to accept 95% confidence in their probabilistic findings. For other percentages, see other texts and collections of tables.
We might start our research about a collection of 500,000 items thinking that, given our resources (especially time, attention, and money), we'll look at a sample of 2,000 items. Thus, the sample is 0.4%, or 0.004, of the total population. If we used this sample and found that 10% of the sampled items needed preservation treatment, we would expect that 10% of the population as a whole would need preservation treatment, plus or minus 1.3% (10 ± 1.3%), according to Babbie's table. In other words, we would be 95% confident that anywhere between 8.7% and 11.3% of the items in the collection as a whole would need preservation. Is 95% a level of confidence you can live with? Most times, it is.
Suppose you had to take a sample of only 1,000 items, still a very large sample. Then, at the same 95% level of confidence, and presuming we got the same 10% proportion of items needing preservation, our result would be that we are 95% confident that the proportion of items in the population needing preservation is 10 ± 1.9% (again from Babbie's table), or from 8.1% to 11.9%. We get a larger amount of uncertainty, that is, a wider interval, because we did not look at as large a sample as we did above.
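If you want to check the table's figures yourself, here is a short sketch using the binomial standard error formula discussed in the aside below, with the conventional 1.96 multiplier for 95% confidence:

    import math

    def margin_of_error(p, n, z=1.96):
        """Half-width of a 95% confidence interval for a binomial proportion."""
        return z * math.sqrt(p * (1 - p) / n)

    # 10% of sampled items need treatment; compare the two sample sizes above.
    for n in (2_000, 1_000):
        print(f"n = {n}: 10% +/- {margin_of_error(0.10, n) * 100:.1f}%")
    # n = 2000: 10% +/- 1.3%
    # n = 1000: 10% +/- 1.9%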
This principle also applies if we are using an inferential statistical technique such as the generation of confidence intervals around µ (mu, the population mean); see any statistics text for further explanation. This approach is generally used in dichotomous situations like the example using the binomial noted above and the one discussed in Walker et al.'s methodological appendix about the Yale study (1985, pp. 127-128). Teper & Atkins (2003) use the same basic strategy, albeit a bit differently (see Babbie's Table H on estimated sampling error and the row for n = 400 and its various columns).
A quick methodological aside: a summary to remember is that the size of the standard error of µ (SEµ), a common measure used to estimate sampling error, varies inversely with the square root of the sample size:

    SEµ = s / √(n − 1),

with s equal to the standard deviation of the sample and n the sample size. For the binomial generally,

    SE = √((P)(Q) / n),

where P is the probability of one outcome and Q the probability of the other; again, n is the sample size.

To decrease the amount of uncertainty we can tolerate, for example, from 5% to 1% in a dichotomous situation like the one described above and like the ones you will generally face professionally, we would need to make the sample 25 times as large. Five times the certainty requires 25 times the sample.
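A quick numerical check of that square-root relationship, with illustrative values only:

    import math

    def binomial_se(p, n):
        """Standard error for the binomial: sqrt(P * Q / n), with Q = 1 - P."""
        return math.sqrt(p * (1 - p) / n)

    p = 0.10
    print(binomial_se(p, 1_000))    # ~0.0095
    print(binomial_se(p, 25_000))   # ~0.0019, one fifth as large
    # Multiplying the sample size by 25 divides the standard error by 5.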
In most inferential circumstances, a sample should include a minimum of 30 items, and a sample size of 2,000 is often used for populations of several hundred thousand or more (most researchers use what are called non-parametric methods to address samples smaller than 30). One can choose between these two extremes by maximizing sample size given your resources and being explicit about what level of uncertainty you can tolerate. Generally speaking, however, if you are doing a large-scale preservation assessment and you can take a sample larger than 2,000, then go for it! No matter what the circumstances or the size of the study, learn to live with the samples that you can take.
Remember, the process of sampling inevitably involves uncertainty and error, no matter how large the
sample and no matter how meticulous a methodologist you are.
Your job is to minimize error and then make professional decisions in
the face of uncertainty, not in its absence.
Using or generating a random number table
The use of random number tables is often essential to the choosing of a random sample for
preservation research and is often among the first activities performed in a research project. There is a
great deal of disagreement about whether a computer can generate random numbers and whether any
random number table is indeed random. While that controversy is serious and, at times, amusing, we
cannot engage it here. Instead, I and many others feel secure saying that such tools are “random enough.”
Although many of you will use Web sites to generate random numbers for sampling purposes, it is also
worthwhile to learn how to use random number tables in your research more autonomously.
Let's start with the attached tables of random numbers; the first is from Babbie (2001, Appendix E, pp. A33-A34), while the second is from Spatz (2005, Table B, pp. 382-385). Spend a little bit of time getting familiar with these tools; as you can see, each is made up of many columns of five-digit numbers.
Use Babbie's instructions (2001, "Using a Table of Random Numbers," pp. 198-199) to see how to go about choosing a sample randomly in a collections-based setting (I've also attached Spatz's shorter instructions from pp. 145-146 of his book). A quick summary is to:
1. Number all the items in the population, either actually or in some way that indicates you're sure of the total, or as sure of the total as you can be. That might mean using the techniques for considering the strata, ranges, shelves, and other sub-elements of the collection as indicated in Walker et al. (1985).
2. Decide how many digits you'll need in your numbers. In our example above we said that the collection as a whole had 500,000 items, of which we would sample 2,000. Thus, we need six (6) digits in our numbers, since 500,000 is a six-digit number.
3. Even though Babbie's and Spatz's tables have only five digits, we can use any digit immediately to the left or right, or above or below, to make any five-digit number into a six-digit number.
4. Follow the procedures as discussed in Babbie and Spatz to identify 2,000 unique items in the collection, randomly chosen.
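If you generate your random numbers in software rather than from a printed table, a minimal sketch of the same procedure, using the population and sample sizes from the running example:

    import random

    POPULATION_SIZE = 500_000   # items in the collection, numbered 1 to 500,000
    SAMPLE_SIZE = 2_000

    # random.sample draws without replacement, so all 2,000 item numbers are
    # unique -- the software analogue of skipping repeats and out-of-range
    # numbers when reading a printed table.
    item_numbers = sorted(random.sample(range(1, POPULATION_SIZE + 1), SAMPLE_SIZE))
    print(item_numbers[:10])   # the first few selected item numbers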
After you've gone through this process, the research then focuses on judgments about preservation treatment, based on the explicit criteria developed for making them, and on data analysis and drawing conclusions. A researcher should not start the process of random sampling, no matter how rigorously done, without knowing the specific reasons for doing the research and (largely) what data analysis techniques will be used on the data.
Summary remarks
Remember that the process of doing research is one of the hallmarks of mature professionals. You
can do it, and you can do it well. Always keep in mind that we can never be certain about our conclusions
when we do research for a number of reasons, especially if we use samples. Sampling inevitably
introduces error and uncertainty into our measurements and conclusions because of the differences that
exist between what we see in the samples and what “the reality” is in the population as a whole.
With all that said, however, your two main jobs as a researcher are to minimize error and to make
decisions based on the knowledge you have in hand. As my grandmother used to say to me, all we can do
is the best we can with what we have. A little knowledge, uncertain and fragile as it might be, is much
better than no knowledge at all, as long as we recognize its limitations. Be confident that you can make
good decisions based on your own and others’ experience and on your own clear and explicit thinking.
References
• indicates that I refer to these explicitly in this document
R indicates that I use these in my version of the Research class, INF 397C, mostly as supplemental
texts
• Babbie, Earl. (2001). The practice of social research (9th ed.). Belmont, CA: Wadsworth Publishing.
[I now use the 10th edition from 2004] R
• Best, Joel. (2001). Thinking about social statistics: The critical approach. In Damned lies and statistics:
Untangling numbers from the media, politicians, and activists (pp. 160-171). Berkeley, CA: University of
California. R
Katzer, Jeffrey, Cook, Kenneth H., & Crouch, Wayne W. (1998). Evaluating information: A guide for
users of social science research (4th ed.). Boston, MA: McGraw-Hill. R
Rowntree, Derek. (1981). Statistics without tears: A primer for non-mathematicians. New York:
Scribner. R
• Spatz, Chris. (2005). Basic statistics: Tales of distributions (8th ed.). Pacific Grove, CA: Brooks/Cole.
R
• Teper, Thomas, & Atkins, Stephanie. (2003). Building preservation: The University of Illinois at
Urbana-Champaign’s stacks assessment. College & Research Libraries, 64(3), 211-227.
• Trochim, William M. K. (2001). The research methods knowledge base (2nd ed.). Cincinnati, OH: Atomic Dog Publishing. See http://www.socialresearchmethods.net/kb/ R
• Walker, Gay, Greenfield, Jane, Fox, John, & Simonoff, Jeffrey S. (1985). The Yale survey: A large-scale
study of book deterioration in the Yale University Library. College & Research Libraries, 46(2), 111-132.