Unit 1.2 Sampling Design
Corresponds to Chapter 2 in Triola

Overview of Section 1.2
• Observational Studies vs. Experiments
• Issues with Designing an Observational Study
  – Representativeness
  – Timing the Study
• Issues with Designing Good Experiments
  – Randomization
  – Control
  – Replication
  – Blinding

The basics of creating a study
• Observational Study vs. an Experiment
  – An observational study is designed to observe and measure specific characteristics; no treatment is applied.
  – Treatment- An activity meant to produce some change in the attribute being measured
  – Experiment- A treatment is applied and then the experimenter observes its effects

Issues in Designing Observational Studies 1.2a

Sampling Error
• Sampling error- The difference between a sample result and the true population result
• Non-sampling error- This occurs when data are incorrectly collected. For example, if we collected a sample
• Sampling error is inevitable, and the law of large numbers allows us to minimize problems from it by using larger samples. Non-sampling error comes from bad research design, and we want to eliminate it as much as possible.

Sampling Bias
• Sampling bias is a kind of non-sampling error that can invalidate the results of your study.
• A bias is a skewing of the data towards one conclusion or another that is not a result of the actual views or tendencies of a population.
• We want to avoid sampling bias as much as possible so that our results can be robust.

Representativeness
• When conducting an observational study, the single most important factor in creating the study is the representativeness of the sample.
• By representativeness I mean: is your sample selected in such a way as to ensure that all of the people you want to know something about have a good chance of being selected for the study?
• You can obtain a representative sample either by deliberately controlling how the sample is selected or by using randomness to select it.
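The role of the law of large numbers can be seen in a small simulation. The sketch below is only an illustration under stated assumptions: the population is a hypothetical set of 10,000 made-up values, and the names `population` and `average_gap` are invented for this example. It measures the average gap between a sample result (the sample mean) and the true population result for small and large random samples.

```python
import random
import statistics

rng = random.Random(12)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up values centered near 100 (not real data).
population = [rng.gauss(100, 15) for _ in range(10_000)]
population_mean = statistics.fmean(population)


def average_gap(n, trials=200):
    """Average absolute gap between a sample mean and the population mean,
    taken over many random samples of size n."""
    gaps = []
    for _ in range(trials):
        sample = rng.sample(population, n)  # sampling without replacement
        gaps.append(abs(statistics.fmean(sample) - population_mean))
    return statistics.fmean(gaps)


small_gap = average_gap(10)
large_gap = average_gap(1000)
print(f"n = 10:   average gap {small_gap:.2f}")
print(f"n = 1000: average gap {large_gap:.2f}")
```

The gap never disappears entirely, but it should be much smaller for the larger samples. A larger sample does nothing to repair data that were collected incorrectly.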

Random Samples
• Random Sample- Members from the population are selected in such a way that each individual member in the population has an equal chance of being selected.
• Simple Random Sample- A sample of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being chosen.
  – We are going to be discussing samples a lot, and the simple random sample is one of the primary requirements for us to be able to run our statistical tests.
  – We want every possible sample of a particular size to be available for selection, because the mathematics that we are going to use draws conclusions about any sample of that size.

Voluntary Sample
• Voluntary Sample- Individuals select themselves for participation in the study or treatment
  – In general, a voluntary sample has all sorts of problems because the persons in the sample self-select. For example, if you conducted a study on individuals' political opinions through a voluntary sample, you would probably only get individuals who had strong political beliefs. The sample would most likely not be representative.

Convenience Sampling
• A convenience sample is one where the researcher measures particular attributes from a sample that is readily at hand.
• Readily at hand means the people who are most easily available for testing.
• A convenience sample is hardly ever used for scientific purposes because it is easily subject to sampling bias. For example, if you were to stand outside of the student store and ask students their opinions about increasing funding to help schools, your sample might be skewed towards being in favor because the students have a vested interest in more money for schools.

Probability Sample
• Sometimes it is necessary to stack the deck in order to make sure that you have certain members of the population represented in your study.
• A probability sample involves selecting members from a population in such a way that each member of the population has a known (but not necessarily the same) chance of being selected.
  – For example, weighting women in a study so that there is a greater likelihood that they will be chosen, in order to ensure their representation
  – What I mean by representation is that we are actually getting information from that segment of the population.

Systematic Sampling
• Systematic Sampling- Selecting every kth (such as every 50th) member of the population
  – Systematic sampling is a way of getting randomization if the individuals in the group that you are studying have sufficiently similar characteristics and are somewhat evenly distributed in your population.
  – Systematic sampling is not an effective way of getting randomization if you have individuals with similar characteristics who are bunched together.
  – For example, suppose that you wanted to know the buying habits of people in Orange County. If you selected every 50th person in the Santa Ana phonebook, this would not be a good way of sampling. Since Santa Ana is predominantly Latino and of generally lower socioeconomic status, you would most likely only be getting the views of Latinos or people with lower incomes. Hence your sample would not be representative of the county.

Stratified Sampling
• Stratified Sampling- Subdivide the population into at least two different subgroups (or strata) so that subjects in the same subgroup have similar characteristics, and then draw a sample from each subgroup.
  – A common way of doing stratified sampling would be to divide a group of individuals by ethnicity. Then we would select an equal number of people from each ethnicity to sample. This would ensure that all of the ethnicities are represented in our study. Because every ethnicity is represented, ethnicity can be factored out as a possible lurking variable.
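
The selection rules for simple random, systematic, and stratified sampling can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the population of 1,000 numbered members, the `group` labels, and all function names are hypothetical, and the random starting point in the systematic sample is an assumption added here rather than something stated in the definitions above.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 members; the first 250 belong to stratum "A".
population = [{"id": i, "group": "A" if i < 250 else "B"} for i in range(1000)]


def simple_random_sample(pop, n):
    """Every possible sample of size n has the same chance of being chosen."""
    return rng.sample(pop, n)


def systematic_sample(pop, k):
    """Select every kth member, beginning at a random position in the first k
    (the random start is an assumption of this sketch)."""
    start = rng.randrange(k)
    return pop[start::k]


def stratified_sample(pop, key, n_per_stratum):
    """Split the population into strata by `key`, then draw a simple random
    sample of equal size from each stratum."""
    strata = {}
    for member in pop:
        strata.setdefault(member[key], []).append(member)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample


srs = simple_random_sample(population, 50)
every_50th = systematic_sample(population, 50)
stratified = stratified_sample(population, "group", 10)
print(len(srs), len(every_50th), len(stratified))
```

Because stratum "A" is bunched at the front of the list, the systematic sample only picks up "A" members in proportion to their share of the population, while the stratified sample guarantees ten members from each stratum.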

Cluster Sampling
• Cluster Sampling- Divide the population area into sections (or clusters), then randomly select some of those clusters, and then choose all the members from those clusters.
  – For example, suppose that I wanted to study the effects of statistics workshops on student grades. I could separate all of the individuals who are taking Math 219 by the class that they are in. These are the clusters. Then I could randomly select 5 classes and take my measurements from every student in them. This could ensure that I get a good cross-section of instructors and hence reduce the instructor as a lurking variable.

Timing of a Study
• One way to reduce non-sampling error in an observational study is to choose a timing that fits the particulars of your study.
• Cross-Sectional Study- Data are observed, measured, and collected at one point in time
• Retrospective Study- Data are collected from the past by going back in time (through examination of previously collected data)
• Prospective Study (also called a longitudinal or cohort study)- Data are collected throughout a set period of time; generally demonstrates change or persistence

Designing experiments 1.2b

Basic Vocabulary
• Treatment or Experimental group- Those individuals selected for your study who will have the particular variable that you are concerned with changed. For example, they are given the drug that you are looking to test.
• Control group- The individuals who will not be given the treatment that you are looking to test.
• Placebo- A pill or intervention given so that members of the control group think that they are receiving the treatment. The classic example is a sugar pill given so that a sick person thinks that they are getting medicine.
• The Placebo Effect- When a person who is given a placebo shows improvement in the variables under observation because they think that they have been given a treatment.

Bias
• Scientists are human beings and would like for their treatments to show effects.
However, the community of science needs to have robust outcomes in order to build knowledge. Hence, processes are put into place by the scientific community in order to make sure that there is no systematic bias. A bias is a skewing of the data towards a particular outcome due to variables that are not the subject of the study. A systematic bias is a bias in which the scientist, either knowingly or unknowingly, skews the data by how they construct the experiment.

Ways of Preventing Bias in Experimental Studies

Randomization
• Subjects are assigned to different groups through a process of random selection.
  – Suppose that we wanted to conduct an experiment where we tested the effect of a new drug on treating the common cold. We could randomly assign members to the treatment group and the control group, and that would help us to ensure that there was no systematic bias in our samples.

Randomized Block Design
• Grouping subjects that share a similar characteristic into blocks and then randomly sampling from each of those blocks.
  – For example, a study on political opinions is conducted in which the study's participants are grouped according to their political parties. Then a random sample of equal size is selected from each of the groups and tested.

Completely Randomized Design
• Selecting members of the population, and assigning them to groups, through a process of random selection.
  – For example, a random number generator picks student ID numbers from SAC students and assigns those persons to either the treatment or the placebo group.
  – If the sample is of sufficient size, then this is the most common way of designing an experiment.

Rigorously Controlled Design
• Making sure that each member of the treatment group has a counterpart in the control group who is similar in all of the important characteristics involved in the study.
  – For example, if a 40-year-old overweight man who smokes is selected for the treatment group of a health study, then a second 40-year-old overweight man who smokes will be chosen for the placebo group.
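
Random assignment in the completely randomized and randomized block designs can be sketched as follows. This is an illustration under assumptions, not a standard recipe: the twelve subjects, the party labels, and the function names are hypothetical, and the block version interprets the design as randomly splitting each block between treatment and control, which is one common reading of the description above.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable


def completely_randomized(subjects, rng):
    """Shuffle all subjects and split them into treatment and control halves."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


def randomized_block(subjects, block_of, rng):
    """Group subjects into blocks, then randomize separately inside each block
    so every block contributes equally to both groups."""
    blocks = {}
    for subject in subjects:
        blocks.setdefault(block_of(subject), []).append(subject)
    assignment = {"treatment": [], "control": []}
    for members in blocks.values():
        split = completely_randomized(members, rng)
        assignment["treatment"].extend(split["treatment"])
        assignment["control"].extend(split["control"])
    return assignment


# Hypothetical subjects: (ID number, party label), six per party.
subjects = [(i, "Party A" if i < 6 else "Party B") for i in range(12)]

crd = completely_randomized(subjects, rng)
rbd = randomized_block(subjects, lambda s: s[1], rng)
print("Completely randomized treatment group:", sorted(crd["treatment"]))
print("Randomized block treatment group:     ", sorted(rbd["treatment"]))
```

In the completely randomized split, the party mix of each group is left to chance. In the block version, each party is guaranteed three members in treatment and three in control.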

Matched Pair Design
• Compare two treatment groups where the subjects are matched in such a way that each pair of subjects is identical in some way.
  – For example, a set of twins is chosen for a toothpaste study. One twin is placed into the treatment group and the second twin is placed into the placebo group.
  – A more common version of this is a study in which a participant is tested at the beginning of the study and then that same characteristic is tested again at the end of the treatment time or at some point during it.

Replication
• The repetition of an experiment on more than one subject.
• The law of small numbers- Samples that are too small will have too much variation to accurately draw inferences about the population.

Blinding
• The subject does not know whether they are part of the treatment group or the control group.
  – The Placebo Effect- Individuals who believe that they are taking a medicine may improve because of their belief that the medicine is effective. Blinding keeps that belief the same in both groups.

Problems in Experimental and Observational Design
• Confounding occurs in an experiment when you are not able to distinguish among the effects of different factors.
  – A factor is a variable that could have an effect upon an outcome; a confounding factor is one that is not accounted for in the model or in our conclusions.
  – When we draw conclusions about a result and do not take into account possible confounders, we are committing an error. We may attribute causation to a situation where it is not warranted.

Confounding Variables
• Example: In the early 1950s, health researchers concluded that excessive saturated fat was responsible for heart disease. This conclusion was based upon the fact that as Americans ate more saturated fat, there was a commensurate increase in heart disease. However, the researchers did not account for the fact that processed sugar consumption also rose with saturated fat consumption. Sugar consumption here is called a confounder.
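
A small simulation can show how an unaccounted-for variable produces a misleading association. Everything in this sketch is synthetic and hypothetical: the variables `exposure`, `confounder`, and `outcome` are generic stand-ins, the numbers are invented, and nothing here models real dietary or heart disease data. By construction, only the confounder drives the outcome, yet the exposure still correlates with it.

```python
import math
import random
import statistics

rng = random.Random(21)  # fixed seed so the sketch is repeatable


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Synthetic data: the outcome depends only on the confounder,
# while the exposure merely rises and falls along with the confounder.
rows = []
for _ in range(20_000):
    confounder = rng.gauss(0, 1)
    exposure = confounder + rng.gauss(0, 0.5)
    outcome = 2 * confounder + rng.gauss(0, 1)
    rows.append((exposure, confounder, outcome))

r_all = pearson([r[0] for r in rows], [r[2] for r in rows])

# "Account for" the confounder by comparing only subjects whose
# confounder values are nearly identical.
band = [r for r in rows if abs(r[1]) < 0.1]
r_band = pearson([r[0] for r in band], [r[2] for r in band])

print(f"All subjects:               r = {r_all:.2f}")
print(f"Confounder held near zero:  r = {r_band:.2f}")
```

Across all subjects the exposure appears strongly related to the outcome, but among subjects with nearly the same confounder value the relationship should largely vanish, which is the signature of confounding rather than cause.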

Review of Section 1.2
• Observational Studies vs. Experiments
• Issues with Designing an Observational Study
  – Representativeness
  – Timing the Study
• Issues with Designing Good Experiments
  – Randomization
  – Control
  – Replication
  – Blinding