HYPOTHESIS TESTING

Purpose: The purpose of this activity is to introduce you to one method of scientific inquiry – the laboratory experiment. Of course, many of the skills you will learn here are used in other types of scientific investigations, as well. Today you will learn how to identify and categorize variables, use statistical tests to examine patterns in data, and design and modify hypotheses. You will experience the tentative, probabilistic, cumulative nature of ‘truth’ in science.

Methods and Techniques: You will learn how to construct testable hypotheses, collect data, subject these data to a statistical test, interpret the results of the statistical test, and reach a conclusion regarding your hypothesis.

Major Concepts: variables (controlled, random, independent, dependent, confounding); falsifiability; replication; artifact; null hypothesis; type I error; probability; inductive logic; deductive logic.

INTRODUCTION

Webster’s New World Dictionary (1988) defines science as “systematized knowledge derived from observation, study, and experimentation carried on in order to determine the nature or principles of what is being studied; the systematized knowledge of nature and the physical world.” So, science seeks to understand the physical world. Therefore, questions about or dependent upon supernatural phenomena (by definition), or questions regarding ethics or morality, are outside the domain of science.

Likewise, Webster’s definition specifies a method of inquiry – observation, study, and experimentation. As you know, this process is sometimes referred to as ‘the scientific method.’ In fact, there is no single, monolithic ‘scientific method.’ There are several ways to gain information about the physical world through systematized observation, study, and experimentation. In fact, for some questions, observation may be sufficient.
These are called ‘descriptive studies.’ For instance, to answer the question “do humans have tails?” you just have to look – there is no true ‘experiment’ that is conducted. Of course, the validity of the conclusion is still dependent on the rigor and completeness of the investigation, and rigorous observations may lead you to modify your question to be more specific. If you simply use common sense and your own biased impressions (i.e., without applying systematic observation), you might say “of course not – humans don’t have tails.” That is why common sense is not science – it is neither rigorous nor specific. (For instance, common sense led humans to believe that the Earth was flat and the center of the solar system.)

If you observed humans in a systematic, unbiased way, you would find that almost all humans are born without external tails. However, there are a few individuals who are, indeed, born with small external tails. And, barring rare mutations, human embryos have post-anal tails that develop into internal tailbones. So, observations of humans at different stages of development, and observations of internal anatomy, might lead you to answer this simple question about the physical world in a rather specific manner couched in qualifiers: “Like all chordates, human embryos typically develop a post-anal tail. If development continues correctly, post-anal coccygeal vertebrae develop. At birth, these tail bones usually do not protrude from the body as an external tail, except in very rare cases.” So, many facts in science are ‘just’ observations; but they are observations taken in an unbiased, systematic manner. You might think humans don’t have tails, but that’s because your nonscientific impressions were based on biased considerations of only the external anatomy of most postnatal humans.

Our definition then lists study as a scientific method of inquiry. This is the least specific of the three methods and demands further elaboration.
In science, we equate ‘study’ with logical, rational thought, based on the principle of natural causality. This means that when we observe a phenomenon in the physical world and we decide to try to explain it scientifically, we seek a rational explanation for that occurrence as a function of natural causes. We do this for two reasons. First, we have no reliable, repeatable method for inquiries based on supernatural phenomena, and no way to distinguish between alternative supernatural hypotheses with evidence from the physical world. (Did Yahweh make the world? Zeus? Vishnu? Odin? There is no evidence from the physical world that can distinguish one supernatural hypothesis from the other; as such, it is a moot issue scientifically.) A second reason for seeking a rational explanation is that describing the natural causality of events can potentially give us the ability to control those events by inducing these cause-effect relationships ourselves. In other words, science WORKS. It is a useful, predictive tool for understanding and manipulating the physical universe. For instance, through science humans have come to understand electricity. We can control it, and cause electrical impulses to pass through thin pieces of metal. So, although many questions lie outside the domain of science, for questions about the physical universe science is the most appropriate and reliable tool to gain a correct understanding of cause-effect relationships.

The first step in assessing causality is to describe a relationship between variables: “when ‘x’ happens, it seems that ‘y’ then occurs.” This is a correlational study – the occurrence of one event correlates with the occurrence of another. However, this by itself DOES NOT define causality. Event ‘z’ could cause both ‘x’ and ‘y,’ and that could be why ‘x’ and ‘y’ co-occur.

Also, it is important to realize that quantification is typical of a scientific approach.
In science, we often represent what we observe in mathematical terms because these can be relatively precise and less sensitive to subjective interpretation than other languages. Although scientific facts may also be described in prose (‘narrative data’), or as relative, semi-quantitative categorical terms (‘high, medium, low’), eventually a more precise study will demand quantitative data. Biology is a science, and you must be reasonably comfortable with mathematics to enjoy biology and succeed in it.

The final method mentioned in the definition is experimentation. The goal of an experiment is to recreate a situation in which the causal link between one variable and another can be measured. To do this, other variables must be controlled, randomized, or measured so that their potential effects can be nullified or described. There are several types of experiments, too: laboratory experiments, field experiments, and post-dictive experiments. Laboratory experiments provide the greatest control of the experimental environment, and so give the greatest sensitivity for observing a causal relationship between two variables. However, the goal of science is to describe how the physical (i.e., ‘natural’) world operates, and the laboratory environment may be so unnatural that what you observe is a function of this unnatural environment itself, and is not what actually happens in the ‘real world.’ Such results are called laboratory artifacts.

Suppose you are studying the ecological interactions between spiders and lizards on islands in the Caribbean, and you systematically observe that there are few spiders on islands that have many lizards. You hypothesize that the lizards are eating the spiders and reducing spider abundance. So, you bring the system into the laboratory and you conduct an experiment demonstrating that lizards do eat the spiders. But, maybe the lizards eat the spiders in the laboratory because there is nothing else to eat.
On the islands, maybe lizards rarely eat the spiders; rather, they out-compete spiders for flies. Lots of lizards mean fewer flies for spiders to eat, and spider populations decline. Some good field observations could have prevented this premature lab experiment. Another drawback to laboratory experiments is that the spatial scale is limiting – the system must fit into the laboratory environment.

Field experiments are more realistic, but they suffer because the investigator cannot control (or even measure) all of the variables that are fluctuating in the environment. As such, it may be difficult to ascertain exactly which variable is responsible for the response you measure.

Finally, post-dictive experiments allow us to test hypotheses about the past. The typical post-dictive hypothesis is: if ‘x’ happened in the past, then ‘y’ should be present now. (It is best if you have not observed ‘y’ yet; otherwise you may unconsciously be trying to explain ‘y’ rather than testing the effects of ‘x.’) For instance, the ‘Big Bang’ hypothesis predicted that there should still be a radiation ‘echo’ of the universal explosion, and that it should be about 2-3 degrees above absolute zero temperature. Two scientists at Bell Labs, unaware of this prediction, kept finding ‘background radiation’ of this magnitude, no matter which direction they pointed their antenna. They had confirmed a prediction of the Big Bang hypothesis, and with subsequent confirmations of other predictions, this idea was elevated to the status of a Theory – an explanatory model of how the physical world operates that has been tested and supported by numerous, independent experiments.

Evolutionary Theory is another example. In 1859, Darwin suggested that species alive today are descended from common ancestors that lived in the past.
Essentially, all of life is one big family tree, with some species closely related (like siblings in a nuclear family that share the same parents), and others more distantly related (like second cousins that share great-grandparents). Based on anatomical similarities, Darwin hypothesized that humans were more closely related to other primates than to cats, birds, or fish. With the much later discovery of DNA, this hypothesis of relatedness could be tested by looking directly at DNA similarity among existing species. These data confirm predictions based on Darwin’s idea, and thus provide compelling experimental support for the Theory of Evolution.

All true experiments, whether laboratory, field, or post-dictive, involve the rigorous identification of variables. Today’s first activity will give you some practice identifying and categorizing variables. Then, you will create and conduct experiments on the response of crickets to light. You will analyze your data statistically (employing quantification) and draw conclusions based on an understanding of probability. Throughout the work, you should also gain an appreciation for the fact that scientific investigations are iterative (you have to keep testing and modifying your hypothesis until it explains ALL of the related data; a single experiment is not sufficient), cooperative (since no one person can do ALL of the relevant experiments related to a hypothesis, the sharing of data among investigators is crucial), and tentative (you will never know for SURE that your hypothesis is absolutely true, since it can never be tested in every conceivable way).

TESTING PATTERNS OF HABITAT SELECTION IN CRICKETS

In this exercise, you will be investigating the tendency of crickets to gather in shaded or lit habitats by collecting quantitative data from a laboratory experiment, and then subjecting these data to a statistical test.
The goal of this activity is to learn how to use a statistical test to reach a scientific conclusion, and to hone your critical experimental skills.

Today’s experiments: the biological phenomenon of taxis

Typically, animals select habitats that provide food, shelter, and/or mates, or place them in environmental conditions that they can tolerate. Important environmental variables include temperature, humidity, light, wind, sound, disturbance, etc. Obviously, animals usually select habitats based on more than one variable… but which ones? Some variables may be more important to some species than others. Controlled laboratory experiments allow scientists to determine whether organisms respond to particular stimuli.

Orientation in response to a stimulus is called taxis. Furthermore, taxis can be characterized by the type of environmental stimulus guiding the orientation and whether the orientation is toward (positive taxis) or away from (negative taxis) the stimulus. For example, the grayling butterfly (Eumenis semele) flies upward toward the sun when confronted by a predator. The behavior in this case is termed positive phototaxis, phototaxis meaning orientation with respect to light, and positive meaning an orientation toward the stimulus. Similarly, adult female crickets tend to turn and move toward a speaker playing the recorded chirps of a male cricket of the same species (positive phonotaxis). Conversely, cockroaches typically avoid lighted areas of their environment, a type of negative phototaxis. The wood-boring larvae of many species of beetles change direction when exposed to simulated gnawing sounds in front of them (negative phonotaxis). This leads to the irregular ‘galleries’ of burrows in trees that are heavily infested.

In this experiment, you will study the taxis response of crickets to light. Crickets are nocturnal, and they are fairly skittish creatures. Why?
Is their nocturnal behavior a function of an aversion to light, or are they nocturnal because that is when their food is available or when their predators are inactive? As a function of these issues, you might become interested in whether crickets respond to differences in their light environment. (You might not be interested, but you’ll play along, won’t you?) Based on your casual observations, you probably have a preconceived opinion on the matter – this is your ‘working hypothesis.’ Based on the fact that crickets are nocturnal and skittish, your working hypothesis is probably that ‘crickets congregate in shaded habitats.’ To test that general idea, you are going to conduct a specific, restricted test. Therefore, you are going from a general principle to a specific circumstance – this is deductive logic. ‘If, in general, crickets congregate in shade, then they will do so in my specific lab experiment when they are given a choice between shaded and lit habitats.’

EXPERIMENTAL PROTOCOL

The first experiment offers crickets lit and shaded environments in a light-controlled room. You will record the number of individuals that select each type of environment. The white benchtop lights are required for this first experiment, so turn them on (toggle switch on top).

As you move through this experiment, think about how your procedures and the ‘choices’ you make could affect the behavior of the crickets. At the end of the experiment, your goal is to reach a conclusion about how crickets behave in the restricted conditions of your specific experiment. However, your conclusion will be more valuable if you can legitimately apply this conclusion to cricket behavior in general. So, you want to be able to apply inductive logic (compare with deductive logic mentioned earlier) at the end of your experiment, and apply your conclusion about these specific crickets in this specific situation to the behavior of all crickets, in general.
In order to legitimately generalize, the specific conditions of your experiment can’t be too unnatural or unique. Otherwise, the patterns you observe really ARE specific to your experimental conditions, and have no predictive or explanatory power outside of these particular conditions. So, as you go through this experiment, think about the choices you are making, either purposefully or not, and think about whether these choices could be limiting the generality of your conclusion.

After this experiment is over, you will design a new hypothesis and will conduct another experiment. That is, you will have the chance to improve upon this experiment and ask a more refined question that has greater explanatory power.

Procedure:

1. Examine your arena. The top is clear plastic and can slide out. There is also an entry port on one side, with a stopper. Also note a black sleeve, and a wooden rod that fits in the sleeve. There are three slots in the back of the arena so that you can divide the arena in half or into thirds, and you have two metal dividers that fit these slots. You will insert the dividers at the end of a run so that you can count crickets at each end of the maze without them moving back and forth. (Here are some of the choices you have made so far: this arena… its size, color, composition, etc.; this room and all of the environmental conditions in it.)

2. Position the arena on the table, perpendicular to the light. Cover the half of the arena closest to the light – this creates a shady region under this cover. Align the arena and cover so that half of the arena is shady and half is lit. Put a piece of tape on the table to note this position. (Choices made: this type of light: its quality and quantity. This type of shaded spot: its size, position, and amount of reflected light.)

3. Pick up the arena, open one end, and shake in the crickets. Be quick, and don’t bother to count the crickets entering the box!
Grab any stragglers or escapees and add them through the ‘cork-hole’ in the front of the arena. Give the arena a sideways shake to randomize the position of the crickets in the box. Quickly place the arena in position, with the cover in position, too. Note the time. (Choices made: dumping in the crickets through the top of the box; shaking the box; using multiple crickets at once.)

4. You will terminate the experiment after 5 minutes by sliding a divider into the center slot. In the meantime, observe your study organisms. Note that males have two terminal ‘spines’ coming off their abdomen – these are cerci. Females have shorter cerci, but a very long spine between them. This is the ovipositor – an egg-laying device. (Choices made: five-minute interval; both sexes present in box at the same time.)

5. After five minutes, slide the divider into position and count the number of crickets in each end of the maze. Place your data in the table below:

Treatment    # of Males    # of Females    Total
Lit          ________      ________        ________
Shaded       ________      ________        ________

6. So, what do you conclude? How sure are you that this pattern wasn’t just ‘dumb luck?’ In other words, how likely is it that crickets DON’T typically gather in one area over the other, but just by chance ended up sorting themselves as they did? Obviously, knowing that particular probability is very important. If you conclude that crickets move to one area over the other, but in fact this pattern is due to chance, then you are wrong. You have made a type I error, concluding that a difference exists between your treatments when, in reality, the difference you observed was due to chance. This type of error can have grave consequences. Think about Dr. Marsh and the Pilco drug company. Suppose they claim that the difference between the treatment group and the placebo group was caused by their treatment; in other words, they conclude that the drug works. However, suppose this difference was due to chance. They are WRONG.
The drug doesn’t work, their reputations will be shot, and they will be sued. Hmmm… Knowing the probability that chance could be responsible for your pattern becomes pretty important! They should be conservative; they should only say the drug works if the probability that chance caused the pattern is very low (< 1%). This is the function of most statistical tests: determining the probability that the pattern you observe could be caused by chance. Also, think about the physicians who read Dr. Marsh’s report in the Journal of the American Medical Association, and then have to decide whether to prescribe this drug or not, based on their interpretation of her results. These physicians need to know something about statistics and probability, too!

Expected Distribution under this hypothesis: Suppose you had 20 crickets. Then, if there were no preference, you would expect 10 in the shaded region and 10 in the lit region. So, to calculate your expected values, divide your total sample size by 2. Enter your expected and observed values in the table below.

Statistical Null Hypothesis: there is no difference between the observed and expected values.

Table comparing observed and expected distributions:

                 Shaded      Lit         Sum
Obs.             ________    ________    ________
Exp.             ________    ________    ________
O - E            ________    ________    -
(O - E)^2        ________    ________    -
(O - E)^2/E      ________    ________    X^2 = ________

Calculated Chi-squared value (X^2) = ________

Critical Value from Chi-squared table, p = 0.05, = ________

How likely is it that the null hypothesis is TRUE (> 5% or < 5%)?

Conclusion:

Variables:
Dependent:
Independent:
Controlled:
Uncontrolled:
Confounding:

8. Your instructor will accumulate the results from all the groups in the class. Do not begin the next activity until the instructor reviews the entire class’s data. Your own group’s results (and conclusions) may be anomalous. Wouldn’t you like to know?
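
If you would like to check your hand calculations, the chi-squared arithmetic in the table above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for this example, the counts of 15 shaded and 5 lit crickets are hypothetical (not real data), and the critical value of 3.841 is the standard chi-squared table entry for 1 degree of freedom at p = 0.05.

```python
# Minimal sketch of the chi-squared test for "no preference" between two habitats.
# All names and the example counts are hypothetical, for illustration only.

# Standard chi-squared table value for 1 degree of freedom at p = 0.05.
CRITICAL_VALUE_DF1_P05 = 3.841


def chi_squared_no_preference(observed_shaded, observed_lit):
    """Return (expected count per habitat, chi-squared statistic).

    The expected values assume no preference: total sample size divided by 2.
    """
    total = observed_shaded + observed_lit
    if total == 0:
        raise ValueError("At least one cricket must be counted.")
    expected = total / 2
    # Sum of (O - E)^2 / E over the two habitats (shaded and lit).
    chi_sq = sum(
        (observed - expected) ** 2 / expected
        for observed in (observed_shaded, observed_lit)
    )
    return expected, chi_sq


def reject_null(chi_sq, critical_value=CRITICAL_VALUE_DF1_P05):
    """True if the calculated value exceeds the critical value (p < 0.05)."""
    return chi_sq > critical_value


if __name__ == "__main__":
    # Hypothetical example: 20 crickets, 15 counted in shade and 5 in light.
    expected, chi_sq = chi_squared_no_preference(15, 5)
    print("Expected per habitat:", expected)
    print("Calculated chi-squared:", chi_sq)
    print("Reject null hypothesis at p = 0.05:", reject_null(chi_sq))
```

With these hypothetical counts, the expected value is 10 per habitat and the calculated chi-squared is 5.0, which exceeds 3.841, so the null hypothesis of no preference would be rejected at p = 0.05. Replace the example counts with your own observed totals.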