Evaluating science
• The value of a scientific study can in principle be assessed with respect to (a) the quality of the science; (b) the value of the scientific question being investigated.
[Figure: value of scientific question (trivial or dangerous, to important) against quality of scientific study (awful to superb), with regions marked "Where most science should be, but isn’t" and "Where too much science is."]

University of Ottawa - BIO 4000 © Scott Findlay 24/09/2009 12:00 PM

Descriptive versus hypothesis-driven science
• Descriptive science is concerned with the characterization and/or quantification of patterns in nature. The main issues: (a) what are the important observed patterns? (b) are they more imagined than real?
• Hypothesis-driven science is concerned with testing (scientific) hypotheses advanced to explain the (real, one assumes) patterns uncovered by descriptive science. The main issue: how likely is it that the hypothesis in question is true?

University of Ottawa - BIO 5901 © Scott Findlay 24/09/2009 12:00 PM

Descriptive versus hypothesis-driven science
• Descriptive science provides the grist for the hypothesis-driven science mill …
• … while hypothesis-driven science often in turn provides directions as to where it might be productive to look for (more) patterns.
• Both types of science are necessary, neither is sufficient …
• … but it is CRITICAL that they be distinguished!
[Figure: cycle between descriptive science and hypothesis-driven science.]

What is hypothesis-driven science?
• The accumulation of knowledge about the world through the testing of causal theories (explanations).
• A causal theory is a statement about the cause(s) of observed phenomena (“events”, “effects”).
• Science attempts to infer causal relationships (“If A, then B”) by application of the scientific method.
[Figure: cause and (observed) effect, linked by causality in one direction and by inference in the other.]

What makes a “scientific” hypothesis?
• According to Sir Karl Popper, all scientific hypotheses must be refutable, at least in principle.
• A refutable hypothesis is one for which, at least in principle, there are empirical observations which could be inconsistent with the hypothesis.
• So testability = refutability (falsifiability).

Science à la Popper
• Science proceeds by eliminating potential hypotheses for what we want to explain.
• “When you have eliminated the impossible, Watson, whatever remains, however improbable, is the truth.” Sherlock Holmes, The Sign of Four
[Figure: candidate hypotheses surrounding "What we want to explain".]

The logic of scientific inquiry: deduction
• Deduction: if the axioms (premises) are true, the conclusion is necessarily true (reasoning from general to particular).
  All swans are white. This bird is a swan. ∴ This bird is white.
  If (this is a swan) then (it is white). This bird is a swan. ∴ This bird is white.

The logic of scientific inquiry: induction
• Induction: even if all the axioms are true, the conclusion is not necessarily true (reasoning from the particular to the general).
  This bird is a swan & it is white … and this bird is a swan & it is white … ∴ All swans are white.

The scientific method
[Figure: cycle linking Hypothesis, Predictions, Experiment, Observations and Conclusions, with steps labelled Deduction and Induction.]

The logic of hypotheses and predictions
• The logic of Popper’s view of science can be represented by a standard deductive syllogism.
• The implication is that one cannot prove an hypothesis, only support (corroborate) it. To do otherwise would be to commit the logical fallacy of affirming the consequent.
  If H then P; P; ∴ H (fallacy of affirming the consequent)
  If H then P; -P; ∴ -H (no logical fallacy!)

Hypotheses and predictions: why the bathroom light doesn’t work
What we want to explain: light switch on, but no light.
Hypotheses:
• Power off to house
• Bulb burnt out
• Short in circuit

Hypotheses and predictions
• Causal hypothesis: a statement about the cause(s) of some observed event/pattern.
• Prediction: the empirical result/pattern one will see in a particular experiment if the hypothesis is true.
• Inference: the conclusion (H is supported or refuted) is based on deductive inference.
Example: If the light bulb is burnt out (H) … then the light will work if the old bulb is replaced (P).
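The difference between the two syllogisms on the "logic of hypotheses and predictions" slide can be checked mechanically. The following is a minimal sketch, not part of the original slides: it enumerates every truth assignment for H and P and calls an argument valid only if the conclusion holds whenever all premises hold. The helper names `implies` and `is_valid` are illustrative assumptions.

```python
from itertools import product


def implies(antecedent: bool, consequent: bool) -> bool:
    """Material implication: 'if antecedent then consequent'."""
    return (not antecedent) or consequent


def is_valid(premises, conclusion) -> bool:
    """An argument is valid if no truth assignment makes every premise
    true while making the conclusion false."""
    for h, p in product([True, False], repeat=2):
        if all(premise(h, p) for premise in premises) and not conclusion(h, p):
            return False
    return True


# If H then P; P; therefore H  (affirming the consequent)
affirming_the_consequent = is_valid(
    [lambda h, p: implies(h, p), lambda h, p: p],
    lambda h, p: h,
)

# If H then P; not P; therefore not H  (the refutation form)
modus_tollens = is_valid(
    [lambda h, p: implies(h, p), lambda h, p: not p],
    lambda h, p: not h,
)

print(affirming_the_consequent, modus_tollens)  # False True
```

The counterexample that breaks the first form is H false with P true: the prediction is observed even though the hypothesis is wrong, which is why a confirmed prediction can only corroborate H.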
  If H then P; -P; ∴ -H

Hypotheses, experiments and predictions
Hypothesis | Experiment | Prediction
Light bulb burnt out | Replace bulb with new bulb | Light will work
Power off to house | Try other electric switches | No other switch will work
Short in circuit | Replace bulb with new bulb | New bulb will blow &/or breaker will trip

Two different hypotheses, same experiment, same prediction
Hypothesis | Experiment | Prediction
Power off to house | Try other lights/outlets in bathroom and bedroom | No other lights/outlets work
Short circuit & thrown breaker | Try other lights/outlets in bathroom and bedroom | No other lights/outlets work

Two different hypotheses, same experiment, different prediction
Hypothesis | Experiment | Prediction
Power off to house | Try outlets on different breaker from bathroom | No other outlets will work
Short circuit & thrown breaker | Try outlets on different breaker from bathroom | Other outlets will work

Beware ancillary assumptions!
• In very many cases, there are unstated and unvalidated assumptions that must be true in order for the deducibility condition to be satisfied.
• Published science is FULL of this sort of logical misstep, so be on your guard!
Example: If the light bulb is burnt out (H) … then the light will work if the old bulb is replaced (P). Ancillary assumption: new bulb actually works!

Good practice: reconstruct the experimental logic
• Hypothesis: the bulb is burnt out
• Experiment: replace bulb with new bulb
• Prediction: there should be light!
• Ancillary assumption: new bulb itself works
• Validate assumption: try new bulb first in another light that is known to be working.
• Procedure: (1) reconstruct the logic of the experiment; (2) identify necessary ancillary assumptions (AAs); (3) determine how many AAs are in fact known to be true in the experiment in question.

Necessary and sufficient causal hypotheses
Causal hypothesis | Prediction(s) | Refutation
“A is a necessary, but not sufficient, cause of B” | For all observed B events, A is always present. | Event B occurs, but A is not present.
“A is a sufficient, but not necessary, cause of B” | Whenever A is present, event B occurs. | Presence of A does not result in event B.
“A is both a necessary and sufficient cause of B” | For all observed B events, A is always present, AND whenever A is present, event B always occurs. | Event B occurs, but A is not present; OR presence of A does not result in event B.
“A is a contributing cause of B” | When A is present, B is more likely to occur than when it is not; OR when B is observed, A is more likely to be present. | When A is present, event B is no more likely to occur than when it is absent; AND when event B is observed, A is no more likely to be present than when B is not observed.

Why it is important
Causal hypothesis | Prediction(s) | Good design
“A is a necessary, but not sufficient, cause of B” | For all observed B events, A is always present. | Large number of B events required.
“A is a sufficient, but not necessary, cause of B” | Whenever A is present, event B always occurs. | Large number of A events required.
“A is both a necessary and sufficient cause of B” | For all observed B events, A is always present, AND whenever A is present, event B always occurs. | Large number of BOTH A and B events required.
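The refutation column for the first three rows of the table can be expressed as a small check over paired observations. This is a minimal sketch under stated assumptions: each observation is a hypothetical `(a_present, b_occurred)` pair of booleans, and the function name `refuted_hypotheses` is illustrative rather than anything from the course. The "contributing cause" row is probabilistic and is deliberately left out.

```python
from typing import Dict, Iterable, Tuple


def refuted_hypotheses(observations: Iterable[Tuple[bool, bool]]) -> Dict[str, bool]:
    """Report which causal hypotheses about A and B are refuted.

    Each observation is (a_present, b_occurred).
    """
    obs = list(observations)
    # Refutes necessity: event B occurs, but A is not present.
    b_without_a = any(b and not a for a, b in obs)
    # Refutes sufficiency: presence of A does not result in event B.
    a_without_b = any(a and not b for a, b in obs)
    return {
        "necessary": b_without_a,
        "sufficient": a_without_b,
        "necessary_and_sufficient": b_without_a or a_without_b,
    }


# Hypothetical data: A present with B, A present without B, neither.
example = [(True, True), (True, False), (False, False)]
print(refuted_hypotheses(example))
```

Note that a data set with no B events can never refute necessity, and one with no A events can never refute sufficiency, which is the reason the "good design" column asks for many B events, many A events, or both.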
• Good experimental designs depend on the type of hypothesis, as do the inferences drawn from the results.

Hypothesis testability
• An hypothesis H is testable in an experimental design E iff (a) it has at least one prediction P that is derived deductively from H; and (b) there is at least one possible experimental outcome that is inconsistent with P.
• So, testability depends both on the hypothesis H and the experiment E.
• N.B. Testability is a minimal condition: what really matters is how good the test is (more on this later!)

The bathroom light: testability
Causal hypothesis | Experiment | Testable?
Light bulb burnt out | Try new (working) bulb | Yes
Power off to house | Try new (working) bulb | Yes
Short circuit & bathroom breaker blown | Try new (working) bulb | Yes

The bathroom light, testability redux
Causal hypothesis | Experiment | Testable?
Light bulb burnt out | Try razor outlet in bathroom | No
Power off to house | Try razor outlet in bathroom | Yes
Short circuit & bathroom breaker blown | Try razor outlet in bathroom | Yes

Competing versus non-competing hypotheses
• Two hypotheses H1 and H2 are competing if both cannot be true, such that experimental evidence supporting H1 is evidence refuting H2, and vice versa, in experiments where only one is testable.
• In general, (truly) competing causal hypotheses are rare in science, because very rarely can an experiment designed to test a single hypothesis exclude the possibility of multiple causality.

Competing versus non-competing hypotheses
Causal hypothesis | Experiment | Competing?
Light bulb burnt out | Check multiple non-bathroom outlets | No
Power off to house | Check multiple non-bathroom outlets | No
Short circuit & bathroom breaker blown | Check multiple non-bathroom outlets | No

Inferential strength, Ψ
• All hypothesis-driven (HD) studies should conclude that either (a) the hypothesis H is supported (results are consistent with predictions); or (b) H is not supported (results are inconsistent with predictions).
• Inductive inference means inferring from (a) or (b) that H is indeed true or false.
• An experiment E having large Ψ is one for which result (a) means that H is very likely to be true; and for which result (b) means that H is very likely to be false.
[Figure: probability that the hypothesis is true (0 to 1) for possible results R11 to R14 of Experiment 1 (low Ψ) and R21 to R24 of Experiment 2 (high Ψ).]

A priori (Ψ) and a posteriori (Ψ∗) inferential strength
• The a priori inferential strength Ψ of an experiment is the maximum Ψ that is achievable irrespective of the experimental results.
• This sets the upper limit on the a posteriori inferential strength Ψ∗, i.e. that associated with a given experimental result.
[Figure: a priori inferential strength against a posteriori inferential strength.]

What determines the a priori inferential strength of an experiment?
• Type of experiment (manipulative versus correlational (observational))
• Number of independent predictions per hypothesis
• Number of hypotheses tested (in the same experiment)
• Adequacy of controls
• Accuracy and precision (validity) of experimental methods
• Extent of extrapolation from experimental (model) system to system of real interest
• Sample size

Factors affecting a priori inferential strength: study type
• All other things being equal, manipulative studies (where hypothesized causal factors are experimentally manipulated) have greater inferential strength than correlational studies because in the former, other potential causal factors are (presumably!) controlled.
[Figure: a priori inferential strength for correlational versus manipulative studies.]

Inferential strength and study type: manipulative versus correlational
[Figure: primary production against phosphorus concentration, for a manipulative study and a correlational study.]

Effects of exotic wetland plants on native wetland biodiversity
[Figure: two panels. 1 m² scale: average native richness for treatments Control, E1 and E3. Wetland scale: log10 native richness against log10 exotic abundance.]

Factors affecting a priori inferential strength: number of independent predictions
• Each independent prediction serves as a “filter” for different candidate hypotheses, so the more predictions (P1, P2, …), the more the set of candidate hypotheses is winnowed down.
[Figure: candidate hypotheses passed through filters P1 and P2; eliminated hypotheses are marked X.]

Factors affecting a priori inferential strength: number of independent predictions
• Multiple predictions provide more independent opportunities for rejection, making for a stronger test.
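The counting argument behind "more independent opportunities for rejection" can be sketched as follows. This is an illustrative sketch only: it assumes every prediction is deduced from H and has a simple pass/fail outcome, so that any outcome with at least one failed prediction refutes H. The function name `count_refuting_outcomes` is hypothetical.

```python
from itertools import product
from typing import Tuple


def count_refuting_outcomes(n_predictions: int) -> Tuple[int, int]:
    """Return (number of outcomes that refute H, total number of outcomes).

    If H implies (P1 and P2 and ...), H survives only when every
    prediction is borne out.
    """
    outcomes = list(product([True, False], repeat=n_predictions))
    refuting = [outcome for outcome in outcomes if not all(outcome)]
    return len(refuting), len(outcomes)


for n in (1, 2, 3):
    refuting, total = count_refuting_outcomes(n)
    print(f"{n} prediction(s): {refuting} of {total} outcomes refute H")
```

With one prediction, 1 of 2 outcomes refutes H; with two, 3 of 4; with three, 7 of 8. Only the single outcome in which all predictions hold leaves H supported.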
  If H then (P1 and P2)
  Possible outcomes: -P1; P1
  ∴ -H (R); -(-H) (S)

  If H then (P1 and P2)
  Possible outcomes: -P1 & P2; P1 & -P2; -P1 & -P2; P1 & P2
  ∴ -H (R); -H (R); -H (R); -(-H) (S)
  (R: H refuted; S: H supported)

Factors affecting a priori inferential strength: number of alternate hypotheses
• Because for any observed pattern there are many alternate explanations …
• … the more hypotheses tested in a single experiment, the greater the inferential strength.
[Figure: candidate hypotheses filtered by predictions P1 and P2 in experiments E1 and E2; eliminated hypotheses are marked X.]

Factors affecting a priori inferential strength: adequacy of controls
• To unambiguously attribute observed effects to manipulated or correlated factors, one must ensure that the effects of other factors are adequately controlled.
• In manipulative studies, a good control is a “treatment” (or set thereof) that allows one to isolate the effect of only the putative causal factor(s).
• In correlational studies, good control means also measuring variables that might be expected to have some influence on the outcome of interest, in addition to those for which one has hypotheses (and predictions).

Elements of good control
• Randomization of sample units (subjects, plots, etc.) to treatments (not always possible).
• Design allows for the evaluation of the potential effects of different elements of the experimental methodology on the outcome of interest.

Example: effects of an oncolytic virus on tumour development in mice
• Biological question: does in vitro selection for tumour host specificity increase tumour-killing effectiveness in vivo?
• Procedure: Evolve two different (A, B) viral quasispecies in vitro using tumour host cells to select for high tumour specificity.
  Assess change in specificity over generations. Introduce evolved virus into a spontaneous tumour mouse model and assess tumour growth.
• Design question: what are the appropriate controls?

Possible design: is it adequately controlled?
• Treatment 1: 10 mice inoculated with evolved viral species A
• Treatment 2: 10 mice inoculated with evolved viral species B
• Control: 10 untreated mice

Good design, adequately controlled
• Treatment 1: 10 mice inoculated with evolved viral species A
• Treatment 2: 10 mice inoculated with evolved viral species B
• Control 1: 10 untreated mice
• Control 2: 10 mice with sham inoculations on same schedule as T1 and T2
• Control 3: 10 mice treated with ancestral (unevolved) virus A
• Control 4: 10 mice treated with ancestral (unevolved) virus B

Example: primary production (PP) in lakes and phosphorus [P]
• Results are consistent with the prediction that [P] limits PP, but there is no randomization, so the observed relationship may arise from the fact that, e.g., lakes with more phosphorus also have more nitrogen (so other factors have not been randomized).
• So, measurement of nitrogen provides a type of control.
[Figure: primary production against phosphorus, with nitrogen concentration also shown.]

Predictions as models
• All science (descriptive or hypothesis-driven) involves determining the probability φ that some model M [which represents a possible pattern (descriptive science) or a prediction (hypothesis-driven science)] is in fact true.
• Hypothesis testing then requires estimating how well the observed pattern matches the predicted pattern M.
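One simple way to put a number on "how well the observed pattern matches the predicted pattern M" is a sum of squared differences between predicted and observed values. The sketch below is an assumption-laden illustration, not the course's method: the slides do not name a fit statistic, the bulb model values follow the light-bulb example (no light with a non-working replacement bulb, light with a working one), and the observed proportions are invented for illustration.

```python
from typing import Sequence


def sum_squared_residuals(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Smaller values mean the observations sit closer to the model."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


# Model M under H (bulb burnt out): probability of light is 0 with a
# non-working replacement bulb and 1 with a working replacement bulb.
model_m = [0.0, 1.0]

# Hypothetical observed proportions of trials in which the light came on.
good_fit_observations = [0.05, 0.95]
poor_fit_observations = [0.05, 0.10]

good = sum_squared_residuals(model_m, good_fit_observations)
poor = sum_squared_residuals(model_m, poor_fit_observations)
print(good < poor)  # True: the first data set matches M more closely
```

Any monotone measure of mismatch would serve the same illustrative purpose; squared residuals are used here only because they are familiar.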
[Figure: H: light bulb burnt out. Probability of light (0 to 1) against replacement bulb (not working, working), showing model M with observations of good fit and of poor fit.]

Models: dependent and independent variables
• A model can be represented mathematically as an equation, so alternate models give different equations.
• To convert an hypothesis to a model, the (putative) effect Y is on the left hand side (the dependent variable), and the (putative) cause(s) X are on the right hand side.
• All statistical analysis (even a t-test!) involves fitting a specific mathematical model.
[Figure: observations of Y against X with two alternate models, Y = f(X) and Y = g(X).]

Alternate hypotheses, alternate models
• H1: increases in X cause a proportional increase in Y (model f)
• H2: increases in X result in a decelerating increase in Y, indicating saturation (model g)
• Question: how well do the observations fit the two hypotheses?
[Figure: observations of Y against X with two alternate models, Y = f(X) and Y = g(X).]

Model fit, data accuracy and inferential strength
• The lower the measurement accuracy (greater bias), the lower the extent to which “observed” model fit reflects the true model fit …
• … so that inferences about which model fits better will be invalid.
• E.g. a method that overestimates at larger values of Y …
• … leading to incorrect inferences concerning true model fit (f better fit than g).
[Figure: Y against X with models Y = f(X) and Y = g(X); high-accuracy observations compared with biased (low-accuracy) observations.]

Model fit, data precision and inferential strength
• As measurement precision declines, the differences in observed fit among different models decline.
• Because one is unsure which is the better/best model, inferences about which hypotheses are supported/refuted are weaker.
[Figure: Y against X with models Y = f(X) and Y = g(X); high-precision observations compared with low-precision observations.]

Factors affecting a priori inferential strength: extrapolation
• Experiments are almost always conducted on “model” systems.
• Thus, drawing inferences from results of experiments with model systems requires that we assume that the causal structure of the model system is similar to that of the system we are really interested in.
• The greater the degree of extrapolation, the less likely this is to be true, and the lower the inferential strength.
[Figure: model system and system of interest placed on axes of spatial scale and temporal scale, with extrapolation between them.]

Common types of extrapolation
• Interspecies (very common in biomedical studies: are rats really people?)
• From experimental indicators (that which we measure or estimate) to system properties of real interest (e.g. from expression levels to protein levels, from species richness to “biodiversity”, etc.)
• Spatial and temporal scales
• In vitro to in vivo

Factors affecting a priori inferential strength: sample size
• The larger the sample size, the smaller the uncertainty associated with estimates based on the sample …
• … and thus, the greater the ability to detect differences in fit among competing models …
• … leading to stronger inferences about degree of support or refutation.
[Figure: Y against X with models Y = f(X) and Y = g(X), compared for large N and small N.]

Conclusions: are the conclusions supported by the data?
• Are observed results consistent or inconsistent with bona fide predictions?
• Does the study have strong a posteriori inferential strength?
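The first bullet of the sample size slide (larger samples give smaller uncertainty in estimates) can be illustrated with the standard error of a sample mean. This is a minimal sketch assuming independent observations with a known standard deviation; the function name `standard_error` and the value 2.0 are illustrative choices, not taken from the slides.

```python
import math


def standard_error(standard_deviation: float, sample_size: int) -> float:
    """Standard error of the mean for independent observations."""
    if sample_size < 1:
        raise ValueError("sample_size must be at least 1")
    return standard_deviation / math.sqrt(sample_size)


# Quadrupling the sample size halves the uncertainty of the estimate.
for n in (4, 16, 64):
    print(n, standard_error(2.0, n))
# 4 1.0
# 16 0.5
# 64 0.25
```

Because the uncertainty shrinks only with the square root of N, distinguishing two models whose predictions are close together can require a much larger sample than distinguishing two that differ widely.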