Biostatistics and Epidemiology (Stat 4101) Taddele Ch. (MSc.) taddelecherinet@gmail.com /tkibr005@uottawa.ca Addis Ababa University Statistics Department Winter, 2020 Biostatistics and Epidemiology What is Statistics? The field of statistics: the study and use of theory and methods for the analysis of data arising from random process. The study of how to make sense of data. Statistics is the science of learning from data, and of measuring, controlling, and communicating uncertainty. Statistics is also an ART of conducting a study, analyzing the data, and derive useful conclusions from numerical outcomes about real life problems. 3 The two fields of Statistics 1. Mathematical statistics: the study and development of statistical theory and methods in abstract; and 2. Applied Statistics: the application of statistical methods to solve real problems involving randomly generated data, and the development of new statistical methodology motivated by real problems. 4 Classes of Statistics Descriptive statistics is the branch of statistics that includes methods summarizing data. for organizing and Inferential statistics is the branch of statistics that involves generalizing from a sample to the population from which the sample was selected and assessing the reliability of such generalizations. 5 Biostatistics Biostatistics is the branch of applied statistics directed toward applications in health sciences and biology. e.g. Clinical trials, Epidemiology, Pharmacology, Medical decision making, Comparative Effectiveness Research etc. Why Biostatistics? Some statistical methods are more heavily used in health applications than elsewhere Example: survival analysis, longitudinal data analysis Because examples are drawn from health sciences 6 Statistical Methods in Biostatistics The data analysis process can be viewed as a sequence of steps that lead from planning to data collection to making informed conclusions based on the resulting data. 1. Understanding the nature of the problem: know the goal of the research and what questions we hope to answer. 2. Deciding what to measure and how to measure it: what information is needed to answer the questions of interest. 3. Data collection: decide whether an existing data source is adequate or whether new data must be collected. If you decide to use existing data, then understand how were collected and for what purpose. the data 7 4. Data summarization and preliminary analysis: a preliminary analysis that includes summarizing the data graphically and numerically. 5. Formal data analysis: select and apply statistical methods. 6. Interpretation of results: Several questions should be addressed in this final step. Example: - What can we learn from the data? - What conclusions can be drawn from the analysis? And - How can our results guide future research? 8 Application of Biostatistics 1. Collection of vital statistics - for example, mortality rates - used to inform about and to monitor the health status of the population 2. Analysis of accident records - to find out the times during the year when the greatest number of accidents occurred in a plant and decide when the need for safety instruction is the highest 9 Application of Biostatistics 3. Clinical trials - to determine whether or not a new hypertension medication performs better than the standard treatment for mild to moderate essential hypertension 10 Application of Biostatistics… 4. 
Surveys to estimate the proportion of low-income women of child-bearing age with iron deficiency anemia 11 Application of Biostatistics… 5. Studies to investigate whether or not exposure to electromagnetic fields is a risk factor for leukemia (a cancer caused by an overproduction of damaged white blood cells) 12 The Logic of Scientific Reasoning How do we go about deciding something is true? We have two tools at our disposal to pursue scientific inquiry: We have our senses, through which we experience the world and make observations. We have the ability to reason, which enables us to make logical inferences. In science we impose logic on those observations. All the logic in the world is not going to create an observation, and all the individual observations in the world won't in themselves create a theory. 13 The Logic of Scientific Reasoning In deductive inference, we hold a theory and based on it we make a prediction of its consequences. That is, we predict what the observations should be. In inductive inference, we go from the specific to the general. 14 The Logic of Scientific Reasoning We make many observations, discern a pattern, make a generalization, and infer an explanation. For example, it was observed in the Tikur Anbesa Hospital in the 1990s that women giving birth were dying at a high rate of puerperal fever, a generalization that provoked terror in prospective mothers. Induction is based on our belief that the things unobserved will be like those observed or that the future will be like the past. 15 The Logic of Scientific Reasoning Asking which comes first, theory or observation, is like asking which comes first, the chicken or the egg. Theories, then, can be used to predict observations. But these observations will not always be exactly as we predict them due to error inherent variability of natural phenomena. If observations are widely different from our predictions we will have to abandon or modify the theory. 16 The Logic of Scientific Reasoning How do we test the extent of the discordance of our predictions based on theory from the reality of our observations? The test is a statistical or probabilistic test Statistics is a methodology with broad areas of application in Science and industry, medicine and many other fields. 17 The Logic of Scientific Reasoning A phenomenon may be principally based on a) Deterministic model Example. Boyle's laws for a fixed volume an increase in temperature of a gas determines that there is an increase in pressure. b) Probabilistic model which implies that various states of a phenomenon occur with certain probabilities. “The presence of variation requires the use of statistical analysis” (Arias E, Smith BL., 2003). 18 The Logic of Scientific Reasoning When there is little variation with respect to a phenomenon much more weight is given to a small amount of evidence than when there is a great deal of variation Drug cures few patients of Pancreatic (invariably a fatal disease)- More weight to the evidence. Determine Vitamin C cures Cold - More patients should be involved since there may be biological variability among patients. 19 Review of Probability and Statistics Definition: The entire collection of individuals or objects about which information is desired is called the population of interest. In many of the situations, we cannot observe the full population. A sample is a subset of the population, selected for study. An individual subject/object in a population is called an experimental unit. 
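To make the population-sample-inference idea that follows concrete, here is a minimal simulation sketch. It assumes Python with NumPy, and the "population" values and sample size are invented purely for illustration; none of this is taken from the notes themselves.

```python
# Minimal sketch: drawing a random sample from a population and using it for inference.
# The "population" here is simulated; in a real study it is the actual group of interest.
import numpy as np

rng = np.random.default_rng(seed=1)  # seed fixed only for reproducibility

# Hypothetical population: systolic blood pressure (mmHg) of 100,000 adults.
population = rng.normal(loc=120, scale=15, size=100_000)
print(f"population mean: {population.mean():.1f}")

# Draw a random sample of 50 experimental units and use its mean as a point estimate.
sample = rng.choice(population, size=50, replace=False)
print(f"sample mean (point estimate): {sample.mean():.1f}")

# Repeating the sampling shows the uncertainty attached to any single sample.
more_means = [rng.choice(population, size=50, replace=False).mean() for _ in range(5)]
print("five more sample means:", np.round(more_means, 1))
```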
[Diagram: a representative sample is drawn from the population; inference runs from the sample back to the population, carrying uncertainty and reliability.] 20

Definition of Probability
- A probability is a number between 0 and 1 that reflects the likelihood of occurrence of some outcome.
Intuitively: probability is relative frequency in the population.
Formal: Random experiment -----> Events -----> Probabilities
- The probability of an outcome, denoted by P(outcome), is interpreted as the proportion of the time that the outcome occurs in the long run. 21

Some properties
1. The probability of any outcome is a number between 0 and 1.
2. If outcomes cannot occur simultaneously, then the probability that any one of them will occur is the sum of the outcome probabilities.
3. The probability that an outcome will not occur is equal to 1 minus the probability that the outcome will occur. 22

Independence
Independent outcomes: Two outcomes are said to be independent if the probability that one outcome occurs is not affected by knowledge of whether the other outcome has occurred.
- If there are more than two outcomes under consideration, they are independent if knowledge that some of the outcomes have occurred does not change the probabilities that any of the other outcomes have occurred.
Dependent outcomes: If the occurrence of one outcome changes the probability that the other outcome occurs, the outcomes are dependent. 23

Conditional Probability
Now let us consider the case where the chance that a particular event happens is dependent on the outcome of another event. The probability of A, given that B has occurred, is called the conditional probability of A given B, and is written symbolically as P(A|B). 24

Conditional Probability
When we speak of conditional probability, the denominator becomes all the outcomes in the condition, not all possible outcomes, and the numerator consists of those outcomes that are in both the conditioning and conditioned events. [Venn diagram: events A and B inside a sample space of N outcomes, with overlap A ∩ B.]
P(A|B) = P(A ∩ B) / P(B) 25

Bayesian Probability
Imagine that M is the event “loss of memory,” and B is the event “brain tumor.” We can establish from research on brain tumor patients the probability of memory loss given a brain tumor, P(M|B). A clinician, however, is more interested in the probability of a brain tumor, given that a patient has memory loss, P(B|M). 26

Bayesian Probability
P(B|M) is difficult to obtain directly: we would have to study the vast number of persons with memory loss (which in most cases comes from other causes) and determine what proportion of them have brain tumors. By Bayes’ rule,
P(B|M) = P(M|B) P(B) / P(M) = P(M|B) P(B) / [P(M|B) P(B) + P(M|Bc) P(Bc)] 27

Bayesian Probability
“Memory loss, M” can occur either among people with a brain tumor, with probability P(M|B) P(B), or among people with no brain tumor, with probability P(M|Bc) P(Bc). 28

Example 2
To study the proportion of smokers by sex, a random sample of 200 persons was taken from a population; the following table shows the result.

Sex       Non-Smoker   Smoker   Total
Male          64           16      80
Female        42           78     120
Total        106           94     200

a) What is the probability of getting a non-smoker given that the person selected is a female?
b) What is the probability of getting a male given that the person selected is a smoker?
Solution
- P(M) = 80/200, P(F) = 120/200
- P(S) = 94/200, P(N) = 106/200
- P(M and S) = 16/200, P(F and N) = 42/200
1) P(N|F) = P(N and F)/P(F) = (42/200)/(120/200) = 42/120 = 0.35
2) P(M|S) = P(M and S)/P(S) = (16/200)/(94/200) = 16/94 = 0.17

Exercise
A study investigating the effect of prolonged exposure to bright light on retina damage in premature infants:

                 Retinopathy YES   Retinopathy NO   TOTAL
Bright light           18                 3            21
Reduced light          21                18            39
TOTAL                  39                21            60

Find the probability of retinopathy, given that the infant was exposed to bright light. 31

Solution
The probability of developing retinopathy is:
P(Retinopathy) = No. of infants with retinopathy / Total no. of infants = (18 + 21)/(21 + 39) = 0.65
We want to compare the probability of retinopathy given that the infant was exposed to bright light with the probability given that the infant was exposed to reduced light. Exposure to bright light and exposure to reduced light are conditioning events, events we want to take into account when calculating conditional probabilities. 32

The conditional probability of retinopathy, given exposure to bright light, is:
P(Retinopathy | exposure to bright light) = No. of infants with retinopathy exposed to bright light / No. of infants exposed to bright light = 18/21 = 0.86
P(Retinopathy | exposure to reduced light) = No. of infants with retinopathy exposed to reduced light / No. of infants exposed to reduced light = 21/39 = 0.54
The conditional probabilities suggest that premature infants exposed to bright light have a higher risk of retinopathy than premature infants exposed to reduced light. 33

Applications: Diagnostic test
Assume that there is a disease D.
Let D+ be the event that a patient has the disease D, and
let D- be the event that a patient does not have the disease D.
Assume that there exists a test T to diagnose this disease.
Let T+ be a positive test result and
let T- be a negative test result. 34

Applications …
Sensitivity P(T+|D+): The sensitivity of a symptom (or set of symptoms or screening test) is the probability that the symptom is present given that the person has the disease.
Specificity P(T-|D-): The specificity of a symptom (or set of symptoms or screening test) is the probability that the symptom is not present given that the person does not have the disease.
Intuitively, we expect a "good" test to identify the ill persons and to discriminate the non-ill persons. Therefore, a "good" test has a value close to 1 for both the sensitivity and the specificity. 35

Applications …
When a wrong conclusion is made,
P(T-|D+) is the probability of a false negative result;
P(T+|D-) is the probability of a false positive result.
When we develop a new test, we apply it to a group of patients to get a value for the sensitivity and specificity. However, in practice the test will be used to diagnose an individual. 36

Applications …
Hence, we are interested in the probability that a person is ill when he/she has a positive test result. This is the Predictive Positive Value (PPV), P(D+|T+). By Bayes' rule, this is given by
P(D+|T+) = P(T+|D+) P(D+) / [P(T+|D+) P(D+) + P(T+|D-) P(D-)]
This leads to
P(D+|T+) = sensitivity × P(D+) / [sensitivity × P(D+) + (1 − specificity) × (1 − P(D+))] 37

Applications …
P(D+) is the prevalence of the disease. This is the probability that any person is ill. Also, we are interested in the probability that a person is not ill when he/she has a negative test result. This is the Predictive Negative Value (PNV), P(D-|T-).
By Bayes' rule,
P(D-|T-) = P(T-|D-) P(D-) / [P(T-|D-) P(D-) + P(T-|D+) P(D+)] 38

Applications: Example 1
Example: In oncology, we have a test for a rare type of cancer. The sensitivity of this test is 99% and its specificity is 98%. The prevalence of this cancer is equal to 0.005. Compute the predictive positive value.
Solution:
P(D+|T+) = sensitivity × P(D+) / [sensitivity × P(D+) + (1 − specificity) × (1 − P(D+))]
         = (0.99 × 0.005) / (0.99 × 0.005 + (1 − 0.98) × 0.995) = 0.20
We note that although we have a very accurate test, only 20% of the persons with a positive test result have the disease! (A short code sketch of this calculation is given a little further below.) 39

Applications: Example 2
Example: Here is a simplified version of how genes code eye color, assuming only two colors of eyes.
- Each person has two genes for eye color.
- Each gene is either B or b.
- A child receives one gene from each of its parents. The gene it receives from its father is one of its father's two genes, each with probability 1/2; and similarly for its mother. The genes received from father and mother are independent. 40

Applications: Example 2…
If your genes are BB or Bb or bB, you have brown eyes; if your genes are bb, you have blue eyes. Suppose that John has brown eyes. So do both of John's parents, and his sister has blue eyes. What is the probability that John's genes are BB? 41

Applications: Example 2 …
Solution
- John's sister has genes bb, so one b must have come from each parent.
- Thus each of John's parents is Bb or bB; we may assume Bb. So the possibilities for John are (writing the gene from his father first) BB, Bb, bB, bb, each with probability 1/4. John gets his father's B gene with probability 1/2 and his mother's B gene with probability 1/2, and these are independent, so the probability that he gets BB is 1/4. 42

Applications: Example 2…
Solution
Let X be the event 'John has BB genes' and Y the event 'John has brown eyes'. Then X = {BB} and Y = {BB, Bb, bB}. The question asks us to calculate P(X|Y). This is given by
P(X|Y) = P(X ∩ Y) / P(Y) = (1/4) / (3/4) = 1/3 43

Odds and Probability
The odds are simply the ratio of the proportions for the two possible outcomes (success/failure). When the odds of a particular horse losing a race are said to be 4 to 1, he has a 4/5 = .80 probability of losing. To convert an odds statement to probability, we add 4 + 1 to get our denominator of 5. The odds of the horse winning are 1 to 4, which means he has a probability of winning of 1/5 = .20.
If p is the proportion for one outcome, then 1 − p is the proportion for the second outcome:
odds = proportion of successes / proportion of failures = p / (1 − p)
p = odds / (1 + odds) 44

Inference
We described statistical inference as the branch of statistics that involves generalizing from a sample to the population from which it was selected. Interest usually centers on the value of one or more variables. A variable associates a value with each individual or object in a population. A variable can be either categorical or numerical, depending on its possible values. 45

Numerical variables can be either discrete or continuous.
A discrete numerical variable is one whose possible values are isolated points along the number line.
A continuous numerical variable is one whose possible values form an interval along the number line. 46

Estimation
In general terms, estimation uses a sample statistic as the basis for estimating (approximating) the value of the corresponding population parameter.
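Before the estimation discussion continues, here is the code sketch promised above. It reproduces the diagnostic-test calculation of Example 1 and the odds-probability conversions in plain Python; the function names are mine and are not part of the notes.

```python
# Bayes' rule for a diagnostic test (PPV and PNV) plus the odds <-> probability
# conversions discussed above. Function names are illustrative only.

def ppv(sensitivity, specificity, prevalence):
    """Predictive positive value, P(D+|T+)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def pnv(sensitivity, specificity, prevalence):
    """Predictive negative value, P(D-|T-)."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

def prob_to_odds(p):
    return p / (1 - p)

def odds_to_prob(odds):
    return odds / (1 + odds)

# Example 1: sensitivity 0.99, specificity 0.98, prevalence 0.005.
print(f"PPV = {ppv(0.99, 0.98, 0.005):.2f}")  # about 0.20, as in the slides
print(f"PNV = {pnv(0.99, 0.98, 0.005):.5f}")  # very close to 1
print(odds_to_prob(4))  # odds of 4 to 1 of losing -> probability 0.8 of losing
```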
It is also common to use estimation in situations where a researcher simply wants to learn about an unknown population. Given several unbiased statistics that could be used for estimating a population characteristic, the best choice to use is the statistic with the smallest standard deviation. 47

Point estimation
An estimate for a parameter that is one numerical value. An example of a point estimate is the sample mean or the sample proportion.
A statistic whose mean value is equal to the value of the population characteristic being estimated is said to be an unbiased statistic. A statistic that is not unbiased is said to be biased. 49

Confidence interval
A confidence interval (CI): an interval of values computed from sample data that is likely to cover the true parameter of interest.
The interpretation of a CI: "We are 'some level of percent' confident that the 'parameter of interest' lies between the 'lower bound' and the 'upper bound'."
The confidence level associated with a confidence interval estimate is the success rate of the method used to construct the interval.
The standard error of a statistic is the estimated standard deviation of the statistic. 50

Point & Interval Estimation
Point estimator: draws inference using a single number/value; doesn't reflect the effect of larger sample sizes.
Interval estimator: contains a certain percentage of possible values of the parameter.
[Diagram: lower confidence limit, upper confidence limit, and the width of the confidence interval.] 51

Properties of Good Estimators
Unbiased: an estimator whose expected value is equal to that parameter.
Consistent: the difference between the estimator and the parameter grows smaller as the sample size grows larger.
Relatively efficient: the one whose variance is smaller. 52

Estimating the Population Mean when the Population Standard Deviation is Known
How is an interval estimator produced from a sampling distribution?
- To estimate µ, a sample of size n is drawn from the population, and its mean x̄ is calculated.
- Under certain conditions, x̄ is normally distributed (or approximately normally distributed), thus
Z = (x̄ − µ) / (σ/√n) 53

We know that
P( µ − z_{α/2} σ/√n ≤ x̄ ≤ µ + z_{α/2} σ/√n ) = 1 − α
– This leads to the relationship
P( x̄ − z_{α/2} σ/√n ≤ µ ≤ x̄ + z_{α/2} σ/√n ) = 1 − α
In repeated sampling from this distribution, a proportion 1 − α of all the values of x̄ produce an interval
[ x̄ − z_{α/2} σ/√n , x̄ + z_{α/2} σ/√n ]
that includes (covers) the expected value of the population. 54

[Diagram: sampling distribution of x̄ with confidence level 1 − α; lower confidence limit x̄ − z_{α/2} σ/√n, upper confidence limit x̄ + z_{α/2} σ/√n, interval width 2 z_{α/2} σ/√n. See the simulation results demonstrating this point.] 55 56

• The confidence intervals are correct most, but not all, of the time.
[Simulation plot: 100 confidence intervals (with their upper and lower confidence limits) for a population whose real expected value is 100. Not all of the intervals cover the real expected value of 100: the selected confidence level is 90%, and 10 out of the 100 intervals do not cover the real µ.] 57 58

Example: The number and the types of television programs and commercials targeted at children is affected by the amount of time children watch TV. A survey was conducted among 100 North American children, in which they were asked to record the number of hours they watched TV per week. The population standard deviation of TV watch time was known to be σ = 8.0. Suppose that the sample mean is 27.191. Estimate the watch time with 95% confidence level. 59

Solution
The parameter to be estimated is µ, the mean time of TV watching per week per child (of all American children). We need to compute the interval estimator for µ, with x̄ = 27.191.
x ± zα 2 σ n = 27.191 ± z .025 = 27.191 ± 1.96 8.0 100 8.0 100 Since 1 - α =.95, α = .05. Thus α/2 = .025. Z.025 = 1.96 = 27.191 ± 1.57 = [25.621, 28.761] 60 Hypothesis Testing The purpose of hypothesis testing is to determine whether there is enough statistical evidence in favor of a certain belief about a parameter. Is a new drug effective in curing a certain disease? A sample of patient is randomly selected. Half of them are given the drug where half are given a placebo. The improvement in the patients conditions is then measured and compared. Null and alternative hypotheses reference population values, and not observed statistics. 61 Concept of hypothesis testing The critical concepts of hypothesis testing. There are two hypotheses (about a population parameter(s)): H0 - the null hypothesis [ for example µ = 5] H1 - the alternative hypothesis [µ > 5] This is what you want to prove – Assume the null hypothesis is true. • Build a statistic related to the parameter hypothesized. • Pose the question: How probable is it to obtain a statistic value at least as extreme as the one observed from the sample? µ=5 x 62 Make one of the following two decisions (based on the test): - Reject the null hypothesis in favor of the alternative hypothesis. - Do not reject the null hypothesis in favor of the alternative hypothesis. 63 Example - Efficacy Test for New drug Drug company has new drug, wishes to compare it with current standard treatment - Federal regulators tell company that they must demonstrate that new drug is better than current treatment to receive approval - Firm runs clinical trial where some patients receive new drug, and others receive standard treatment - Numeric response of therapeutic effect is obtained - Compute mean difference of the therapeutic effects After few procedures make decisions 64 Possible Outcomes of Statistical Decision α = P(Type I Error ) Ineffective drug is deemed better. β = P(Type II Error ) Effective drug is deemed to be no better • Goal: Keep α and β reasonably should be small •The power of a test is the probability that the test will reject the null hypothesis when the treatment does have an effect. 65 Hypothesis Testing-One Sample The test statistic is converted to a conditional probability called a P-value. - P- value answers the question “If the null hypothesis were true, what is the probability of observing the current data or data that is more extreme?” - Small p values provide evidence against the null hypothesis because they say the observed data are unlikely when the null hypothesis is true. - “significant” means “the observed difference is not likely due to chance.” It does not mean of “important” or “meaningful.” 66 Warnings! Failure to reject the null hypothesis leads to its acceptance. (WRONG! Failure to reject the null hypothesis implies insufficient evidence for its rejection.) The p value is the probability that the null hypothesis is incorrect. (WRONG! The p value is the probability of the current data or data that is more extreme, assuming H0 is true.) α = .05 is a standard with an objective basis. (WRONG! α = .05 is merely a convention that has taken on unwise mechanical use. 67 Warnings!..... NB: There is no sharp distinction between “significant” and “insignificant” results, only increasingly strong evidence as the p value gets smaller. Small p values indicate large effects. (WRONG! p values tell you nothing about the size of an effect.) Data show a theory to be true or false. (WRONG! 
Data can at best serve to support or refute a theory or claim.) Statistical significance implies importance. (WRONG! WRONG! WRONG! Statistical significance says very little about the importance of a relation. 68 Hypothesis Testing-One Sample - Determine whether prenatal alcohol affects birth weight or not. - A sample is selected from the original population and is given alcohol. - The question is what would happen if the entire population were given alcohol. - The treated sample provides information about the unkonwn treated population. 69 Hypothesis Testing-Two Sample Often we want to compare one group to another. What happens when we are comparing two samples? Variability in both samples, and potentially two samples are related Having an accurate measure of tumor size is extremely important because it allows a physician to accurately determine if a tumor is growing, shrinking or remaining constant. The problem is that often the measurements of the tumor size vary from physician to physician Measure by RECIST method rather linear distance across the tumor 70 Hypothesis Testing-Two Sample For a portion of the study, a pair of doctors were shown the same set of tumor pictures. The volume of the tumor was measured by two separate physicians under similar conditions. Question of interest: Did the measurements from the two physicians significantly differ? If not, then there would be no evidence that the volume measurements change based on physician. 20 scans were measured by each physician Measurements in cm3, What can you say about these samples? 71 Example-Paired Two measurement on the same person, they are related so we must account for this Measure the effect of the treatment in each person by taking the difference Instead of having two samples, consider our dataset to be one sample of differences 72 Example-Paired Volume from Dr. 1 Population mean: Sample mean Volume from Dr. 2 Population mean: Sample mean: Difference Population mean: Sample mean: 73 Example-Paired use t-distribution with n-1 df where n is the number of differences Standard deviation of differences Step 1: Hypothesis: No difference between physicians effect Step 2: Level of significance-alpha=0.05 Step 3: Test statistics, t small sample size 74 Example-Paired Step 4: Decision, don not reject the null hypothesis Step 5: Conclusion, there is no evidence of a difference in tumor volume measurement based on physician Confidence interval for paired t-test For our example, the confidence interval is (-1.01, 0.54) Note that the conclusion from the hypothesis test and the confidence interval are the same 75 Example-Paired Other Examples: Differences between left and right eye Differences between dominant and recessive hand Matched samples 76 Example-two sample independent Often it is impractical to design study to use the same patients for both group Example: Comparison of cholesterol in males and females Since the samples are not paired, we cannot use the difference between the individual samples Compare the tumor volume among patients with different forms of cancer. The average tumor size is important to know the effect of treatment can be determined. 
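A minimal sketch of the paired analysis described above, assuming Python with SciPy. The notes do not reproduce the 20 tumor-volume measurements, so the numbers below are hypothetical and only illustrate the mechanics; they will not reproduce the (-1.01, 0.54) interval quoted later for the real data.

```python
# Paired t-test: the same 20 scans measured by two physicians (hypothetical data).
import numpy as np
from scipy import stats

dr1 = np.array([5.2, 4.1, 6.3, 3.8, 5.9, 4.4, 7.1, 5.5, 4.9, 6.0,
                3.6, 5.1, 4.7, 6.4, 5.0, 4.2, 5.8, 6.1, 4.5, 5.3])
dr2 = np.array([5.0, 4.3, 6.0, 4.0, 6.2, 4.1, 7.3, 5.2, 5.1, 5.8,
                3.9, 5.4, 4.5, 6.1, 5.2, 4.0, 6.0, 5.9, 4.8, 5.0])

# Work with the differences: one sample of d = dr1 - dr2, with n - 1 degrees of freedom.
d = dr1 - dr2
t_stat, p_value = stats.ttest_rel(dr1, dr2)  # equivalent to a one-sample t-test on d
print(f"mean difference = {d.mean():.3f}, t = {t_stat:.2f}, p = {p_value:.3f}")

# 95% confidence interval for the mean difference (the notes report (-1.01, 0.54)
# for their data; these hypothetical numbers give a different interval).
se = d.std(ddof=1) / np.sqrt(len(d))
t_crit = stats.t.ppf(0.975, df=len(d) - 1)
print(f"95% CI: ({d.mean() - t_crit * se:.2f}, {d.mean() + t_crit * se:.2f})")
```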
77 Example-two sample independent The null hypothesis is that there is no difference between the volume of the tumor in the two forms of cancer H0: mbrain =mbreast , or mbrain – mbreast =0 More generally, we can test if the difference between two groups is a specific value, m1-m2=D This occurs when comparing two treatment groups and we are interested if the two groups are different by a specific amount 78 Example-two sample independent Basic form of test statistic Known variance Unknown Variance Case 1) Equal variance and known Then the test statistic is Estimate σ βy 79 Two sample independent Known variance Unknown Variance Case 2) Unequal variance and known Then the test statistic is with V df Satterthwaite or Welch approximation 80 Comparing More than two Means 81 Analysis of Variance (ANOVA) Idea: For two or more groups, test difference between means, for quantitative normally distributed variables. Just an extension of the t-test (an ANOVA with only two groups is mathematically equivalent to a t-test) It’s like this: If there are three groups to compare: Do six pair-wise ttests, but this would increase my type I error So, instead look at the pairwise differences “all at once.” To do this, recognize that variance is a statistic that allows more than one difference at a time… will look at two measures of variation, overall variance vs individual differences 82 Analysis of Variance (ANOVA) Use a ratio of the two Between group variation / within group variation Summarizes the mean differences between all groups at once. Variability between groups F= Variability within groups Analogous to pooled variance from a ttest. 83 Example Treatment 1 Treatment 2 Treatment 3 Treatment 4 y11 y21 y31 y41 y12 y22 y32 y42 y13 y23 y33 y43 y14 y24 y34 y44 y15 y25 y35 y45 y16 y26 y36 y46 y17 y27 y37 y47 y18 y28 y38 y48 y19 y29 y39 y49 y110 y210 y310 y410 10 ∑y 1j y1• = 10 ∑ j =1 10 y 2• = 10 ( y1 j − y1• ) 2 j =1 10 − 1 10 10 ∑(y 2j ∑y 2j j =1 y 3• = 10 − y 2• ) j =1 10 − 1 2 10 ∑ ∑y 10 3j j =1 y 4• = 10 ( y 3 j − y 3• ) j =1 10 2 ∑ ∑y The group means j =1 10 ( y 4 j − y 4• ) 2 j =1 10 − 1 4j 10 − 1 The (within) group variances 84 SSW + SSB 4 10 ∑∑ ( y i =1 j =1 ij − y i• ) 4 2 + ∑ i =1 ( y i • − y •• ) = TSS 4 2 = 10 ∑∑ ( y ij − y •• ) 2 i =1 j =1 85 One Way ANOVA Example Assume ”treatment results” from 13 patients visiting one of three doctors are given: Doctor A: 24, 26, 31, 27 Doctor B: 29, 31, 30, 36, 33 Doctor C: 29, 27, 34, 26 H0: The treatment results are from the same population of results H1: They are from different populations 86 One Way ANOVA Example Averages within groups: Doctor A: 27 Doctor B: 31.8 Doctor C: 29 Total average 4 × 27 + 5 × 31.8 + 4 × 29 = 29.46 4+5+ 4 87 One Way ANOVA Example Sum of squares within groups: SSW = (24 − 27) 2 + (26 − 27) 2 + ... + (29 − 31.8) 2 + .... = 94.8 Compare it with sum of squares between groups: SSG = (27 − 29.46) 2 + (27 − 29.46) 2 + ... + (31.8 − 29.46) 2 + .... = 4(27 − 29.46) 2 + 5(31.8 − 29.46) 2 + 4(29 − 29.46) 2 = 52.43 88 One Way ANOVA Example Comparing these, we also need to take into account the number of observations and sizes of groups MSW = 94.8 SSW = = 9.48 n − K 13 − 3 SSB 52.43 MSB = = = 26.2 K −1 3 −1 MSB 26.2 = = 2.76 MSW 9.48 F3−1,13−3,0.05 = 4.10 89 One Way ANOVA Example BetweenGroups WithinGroups Total Sumof Squares 52,431 94,800 147,231 df 2 10 12 MeanSquare 26,215 9,480 F 2,765 Sig. 
,111 Use ”Analyze => Compare Means => One-way ANOVA Do NOT reject the null hypothesis A statistically significant ANOVA (F-test) only tells you that at least two of the groups differ, but not which ones differ. Determining which groups differ (when it’s unclear) requires more sophisticated analyses to correct for the problem of multiple comparisons… 90 One Way ANOVA Example Why not just do all possible pair wise t tests? Answer: because, at an error rate of 5% each test, this means you have an overall chance of up to 1-(.95)3= 14% of making a type-I error (if all 3 comparisons were independent) If you wanted to compare 6 groups, you’d have to do 6C2 = 15 pair wise t tests; which would give you a high chance of finding something significant just by chance (if all tests were independent with a type-I error rate of 5% each); probability of at least one type-I error = 1-(.95)15=54%. 91 Correction for multiple comparisons If your ANOVA test identifies a difference between group means, then you must identify which of your k groups differ. If you did not specify the comparisons of interest (“contrasts”) ahead of time, then you have to pay a price for making all kCr pairwise comparisons to keep overall type-I error rate to α. Bonferroni For example, to make a Bonferroni correction, divide your desired alpha cut- off level (usually .05) by the number of comparisons you are making. Assumes complete independence between comparisons, which is way too conservative 92 Non parametric Tests Most of the statistical methods referred to as parametric require the use of interval- or ratio-scaled data. Nonparametric methods are often the only way to analyze nominal or ordinal data and draw statistical conclusions. Nonparametric methods require no assumptions about the population probability distributions. Non parametric methods are often called distribution-free methods 93 Example: Chi-square test here we have the results of a poll that asked people’s opinions about the use of the death penalty as opposed to life in prison. χ 2 (Oij − Eij ) 2 = ∑ ∑ Eij i j 20.02 2 χ 0.05,2 = 5.99 H0: distribution of female preferences matches distribution of male preferences HA: female proportions do not match male proportions Reject H0 94 Summary of Chisquare Test Type Goodness fit Aim Hypotheses One sample. H0: The observed values are equal to theoretical values (expected). (The data followed the assumed distribution). Ha: The observed values are not equal to theoretical values (expected). (The data did not follow the assumed distribution). of Compares the expected and observed values to determine how well the experimenter’s predictions fit the data. Homogeneity Two different populations (or sub-groups). Applied to one categorical variable. H0: Investigated populations are homogenous. Ha: Investigated populations are not homogenous. Independence One population. Type of variables: nominal, dichotomical, ordinal or grouped interval Each population is at least 10 times as large as its respective sample Research hypothesis: The two variables are dependent (or related). H0: There is no association between two variables. (The two variables are independent). Ha: There is an association between two variables. 95 McNemar Test Suppose we have the situation where measurements are made on the same group of people before and after some intervention, or suppose we are interested in the agreement between two judges who evaluate the same group of patients on some characteristics. 
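Before continuing with the McNemar test, here is a minimal sketch of the one-way ANOVA example above (treatment results from 13 patients visiting one of three doctors). Using Python with SciPy is my assumption; the data and the quoted results are exactly those in the slides.

```python
# One-way ANOVA for the three-doctor example in the slides.
from scipy import stats

doctor_a = [24, 26, 31, 27]
doctor_b = [29, 31, 30, 36, 33]
doctor_c = [29, 27, 34, 26]

f_stat, p_value = stats.f_oneway(doctor_a, doctor_b, doctor_c)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")
# Should agree with the slides: F is about 2.765 with p about 0.111,
# so H0 is NOT rejected at the 5% level (critical value F[2,10; 0.05] = 4.10).

# The same quantities computed by hand, matching the slide's SSW and SSB:
groups = [doctor_a, doctor_b, doctor_c]
n = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n
ssw = sum(sum((y - sum(g) / len(g)) ** 2 for y in g) for g in groups)    # 94.8
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)  # 52.43
msw, msb = ssw / (n - len(groups)), ssb / (len(groups) - 1)
print(f"SSW = {ssw:.1f}, SSB = {ssb:.2f}, F = {msb / msw:.3f}")
```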
In such situations, the before and after measures, or the opinions of two judges, are not independent of each other, since they pertain to the same individuals. The test statistic is: (n12 − n21 ) 2 z = χ12 n12 + n21 2 96 McNemar Test-Example 1319 schoolchildren were questioned on the prevalence of symptoms of severe cold at the age of 12 and again at the age of 14 years. At age 12, 356 (27%) children were reported to have severe colds in the past 12 months compared to 468 (35.5%) at age 14. H0: the prevalence is same at 12 and 14 years age Ha: The prevalence is not the same (256 − 144) 2 = 31.36 (256 + 144) The calculated test statistic is much larger than the tabulated 2 ( χ 0.05,1 = 3.89 ). There is a difference for prevalence of cold at age 12 and 14. 97 Sign Tests Used for paired data Can be ordinal or continuous Very simple and easy to interpret Makes no assumptions about distribution of the data Not very powerful The null hypothesis for the sign test is H0: the median difference is zero 98 Sign Tests To evaluate H0 we only need to know the signs of the differences If half the differences are positive and half are negative, then the median = 0 (H0 is true). If the signs are more unbalanced, then that is evidence against H0. 99 evaluate whether these data provide evidence that orthodontic treatment improves children’s image of their teeth The sign test looks at the signs of the differences 15 children felt better about their teeth (+ difference in ratings) 1 child felt worse (- diff.) 4 children felt the same (difference = 0) Looks like good evidence Need a p-value 100 Sign Tests The p-value is the probability of an outcome as or more extreme (under H0 ) than that observed. We observed 15 positives and 1 negative. If H0 were true we’d expect an equal number of positive and negative differences. More extreme outcomes would be more than 15 positives or less than 1 positives 101 Sign Tests P-value = P(X > 15) + P(X < 1) X is the number of positive differences Under H0, X is Binomial(n = 16, p = 0.5) n =16 because the sign test disregards the zero differences 102 Wilcoxon Signed-rank test Wilcoxon Signed-rank test is another non-parametric test used for paired data. It uses the magnitudes of the differences the sign test does not More powerful than the sign test More difficult to interpret than the sign test 103 child Rating before Rating after 1 1 5 2 1 4 3 3 1 4 2 3 5 4 4 6 1 4 7 3 5 8 1 5 9 1 4 10 4 4 11 1 1 12 1 4 13 1 4 14 2 4 15 1 4 16 2 5 17 1 4 18 1 5 19 4 4 20 3 5 Example: Body image data Use the Wilcoxon signed-rank test to evaluate whether these data provide evidence that orthodontic treatment improves children’s image of their teeth. 104 child Rating before Rating after difference 1 1 5 4 2 1 4 3 3 3 1 -2 4 2 3 1 5 4 4 0 6 1 4 3 7 3 5 2 8 1 5 4 9 1 4 3 10 4 4 0 11 1 1 0 12 1 4 3 13 1 4 3 14 2 4 2 15 1 4 3 16 2 5 3 17 1 4 3 18 1 5 4 19 4 4 0 20 3 5 2 Example: Body image data Use the Wilcoxon signed- rank test to evaluate whether these data provide evidence that orthodontic treatment improves children’s image of their teeth. Work with the differences Remove those with zero difference 105 child diff. 1 4 2 3 3 -2 4 1 6 3 7 2 8 4 9 3 12 3 13 3 14 2 15 3 16 3 17 3 18 4 20 2 Example: Body image data To compute the test we need to 106 child diff. sign 1 4 + 2 3 + 3 -2 - 4 1 + 6 3 + 7 2 + 8 4 + 9 3 + 12 3 + 13 3 + 14 2 + 15 3 + 16 3 + 17 3 + 18 4 + 20 2 + Example: Body image data To compute the test we need to note the signs of the differences 107 child diff. 
sign |diff.| 1 4 + 4 2 3 + 3 3 -2 - 2 4 1 + 1 6 3 + 3 7 2 + 2 8 4 + 4 9 3 + 3 12 3 + 3 13 3 + 3 14 2 + 2 15 3 + 3 16 3 + 3 17 3 + 3 18 4 + 4 20 2 + 2 Example: Body image data To compute the test we need to note the signs of the differences get magnitudes of the differences 108 child diff. sign |diff.| 4 1 + 1 3 -2 - 2 7 2 + 2 14 2 + 2 20 2 + 2 2 3 + 3 6 3 + 3 9 3 + 3 12 3 + 3 13 3 + 3 15 3 + 3 16 3 + 3 17 3 + 3 1 4 + 4 8 4 + 4 18 4 + 4 Example: Body image data To compute the test we need to note the signs of the differences get magnitudes of the differences reorder the data by magnitude 109 child diff. sign |diff.| Avg. ranks 4 1 + 1 1 3 -2 - 2 3.5 7 2 + 2 3.5 14 2 + 2 3.5 20 2 + 2 3.5 2 3 + 3 9.5 6 3 + 3 9.5 9 3 + 3 9.5 12 3 + 3 9.5 13 3 + 3 9.5 15 3 + 3 9.5 16 3 + 3 9.5 reorder the data by magnitude 17 3 + 3 9.5 assign 1 4 + 4 15 8 4 + 4 15 18 4 + 4 15 Example:Body image data To compute the test we need to note the signs of the magnitudes of the differences get differences ranks to the observations 110 child diff. sign |diff.| Avg. ranks 4 1 + 1 1 3 -2 - 2 3.5 7 2 + 2 3.5 14 2 + 2 3.5 20 2 + 2 3.5 2 3 + 3 9.5 6 3 + 3 9.5 9 3 + 3 9.5 12 3 + 3 9.5 13 3 + 3 9.5 15 3 + 3 9.5 16 3 + 3 9.5 17 3 + 3 9.5 1 4 + 4 15 8 4 + 4 15 18 4 + 4 15 Example: Body image data Note that since there are many ties in the magnitudes we had to assign average ranks. 111 child diff. sign |diff.| Avg. ranks 4 1 + 1 1 3 -2 - 2 3.5 7 2 + 2 3.5 14 2 + 2 3.5 20 2 + 2 3.5 through 5th differences all 2 3 + 3 9.5 6 3 + 3 9.5 have the same magnitude, so 9 3 + 3 9.5 we give them all the average 12 3 + 3 9.5 13 3 + 3 9.5 of the 2nd through 5th rank 15 3 + 3 9.5 16 3 + 3 9.5 17 3 + 3 9.5 1 4 + 4 15 8 4 + 4 15 18 4 + 4 15 Example: Body image data For example, the 2nd (2+3+4+5)/4 = 3.5 112 child diff. sign |diff.| Avg. ranks 4 1 + 1 1 3 -2 - 2 3.5 7 2 + 2 3.5 14 2 + 2 3.5 20 2 + 2 3.5 rank test is the sum of the 2 3 + 3 9.5 ranks 6 3 + 3 9.5 9 3 + 3 9.5 12 3 + 3 9.5 13 3 + 3 9.5 15 3 + 3 9.5 16 3 + 3 9.5 17 3 + 3 9.5 1 4 + 4 15 8 4 + 4 15 18 4 + 4 15 Example: Body image data The statistic for the signed- of the positive differences 113 child diff. sign |diff.| Avg. ranks 4 1 + 1 1 3 -2 - 2 3.5 7 2 + 2 3.5 14 2 + 2 3.5 20 2 + 2 3.5 rank test is the sum of the 2 3 + 3 9.5 ranks 6 3 + 3 9.5 9 3 + 3 9.5 12 3 + 3 9.5 13 3 + 3 9.5 15 3 + 3 9.5 16 3 + 3 9.5 17 3 + 3 9.5 1 4 + 4 15 8 4 + 4 15 18 4 + 4 15 Example: Body image data The statistic for the signed- of the positive differences R1 = 1 + 3.5 + 3.5 + 3.5 + 9.5 + 9.5 + 9.5 + 9.5 + 9.5 + 9.5 + 9.5 + 9.5 + 15 + 15 + 15 = 132.5 114 R1: What does it mean? With 16 observations R1 could range from 0 (all differences are negative) to 136 (all differences are positive). If H0 were true we’d expect R1 to be near the middle of the range, in this case, 68. R1= 132.5 appears to be evidence against H0 Need a p-value 115 Signed-rank test p-value For n > 15, can use a normal approximation n(n + 1) µ= 4 3 ( t n ( n )( n ) + + 1 2 1 ∑ i − ti ) 2 − σ = 24 48 where ti are the numbers of ties in each group of ties (note that if ti = 1 then the term is 0), and n is the number of non-zero differences. The two-sided p-value is given by |R1 − µ | − 0.5 p − value = 2 × P N(0,1) > σ 116 p-value for body image example 16(16 + 1) = 68 µ= 4 There are 4 people tied with difference 2, 8 with difference 3, and 3 tied with difference 4. 
So ∑ (t And so, 3 i − t i ) = (43 − 4 ) + (83 − 8 ) + (33 − 3 ) = 588 16 × 17 × 33 588 σ = − = 386.25 24 48 2 117 p-value for body image example |132.5 − 68| − 0.5 p − value = 2 × P N(0,1) > 386.25 = 2 × P (N(0,1) > 3.26 ) = 2 × 0.001 = 0.002 118 p-value for signed-rank test If n < 15 then should not use Normal approximation, but instead use an “exact” p-value or critical tables. Or simply list the possibilities for a given number of sample For n sample size, 2n , possibilities exist and probabilities can be calculated based on the possibilites In body image example, exact p-value is 0.00015. 119 Kruskal-Wallis Test ANOVA is based on the assumption of normality Non-parametric alternative to ANOVA One ordinal dependent variable with 3 or more independent levels Kruskal Wallis involves the analysis of the sums of ranks for each group, as well as the mean rank for each group. As sample sizes get larger, the distribution of the test statistic approaches that of χ2, with df = k – 1 120 Kruskal-Wallis Test The null hypothesis states that there is no difference in the distribution of scores of the K populations from which the samples were selected. The alternative hypothesis states that there is at least difference in the distribution of scores of 2 populations from which the samples were selected. The test statistic is: K Ri2 12 2 = W − 3( n + 1) χ ∑ k −1 n(n + 1) i =1 ni 121 Kruskal-Wallis Test Test Statistic is: Example: K Ri2 12 2 3( 1) = W − n + χ ∑ k −1 n(n + 1) i =1 ni 122 Summary of Tests 123 Chapter Two Principles and Methods of Epidemiology 124 Introduction Health is ‘the state of being free from illness or injury’. It refers to freedom from medically defined diseases. Poverty, social inequalities, unemployment, and crowding are among the main determinants of health. Illness can be observed : Subjectively Objectively 125 Introduction Subjective observations by the patient (symptoms) Example: nausea Subjective observations by the examiner (signs) Example: Weight loss inter-observer variation (the degree of agreement among different examiners) intra-observer variation (the degree of agreement between different examinations made by one examiner). objective observations (tests): to manifestations that can be read from an instrument and hence are less dependent on subjective judgments by the person examined or the examiner. 126 What is Epidemiology The term “epidemic” is used to describe an unexpected increase in the frequency of any disease such as myocardial infarction, obesity, or asthma (in general Genetic, behavioral or environmental). The term "Pandemic" is used to describe an occurring of disease over a wide geographic area and affecting an exceptionally high proportion of the population 127 What is Epidemiology Public health epidemiology uses the “healthy” population to study the transition from being healthy to being diseased or ill. Clinical epidemiology uses the population of patients to study predictors of cure or changes in the disease state. A clinical epidemiologist can study how best to treat diseases without taking an interest in how these diseases emerged. 
128 What is Epidemiology Epidemiology comes from the Greek words Epi, meaning “on or upon,” Demos, meaning “people,” and Logos, meaning “the study of.” “Epidemiology is the study of the distribution and determinants of health-related states or events in specified populations, and the application of this study to the control of health problems.” 129 What is Epidemiology Study- includes: surveillance, observation, hypothesis testing, analytic research and experiments Distribution-Refers to analysis of: times, persons, places and classes of people affected. Time characteristics include annual, seasonal, and daily or even hourly occurrence during an epidemic. Place characteristics include geographic variation, urban-rural differences, and location of worksites or schools. Personal characteristics include demographic factors such as age, race, sex, marital status, and socioeconomic status, as well as behaviors and environmental exposures. 130 What is Epidemiology Epidemiology is concerned with the frequency and pattern of health events in a population. Frequency includes not only the number of such events in a population, but also the rate or risk of disease in the population. This characterization of the distribution of health-related states or events is one broad aspect of epidemiology called descriptive epidemiology. Descriptive epidemiology provides the What, Who, When, and Where of health-related events. 131 What is Epidemiology Determinants include factors that influence health: biological, chemical, physical, social, cultural, economic, genetic and behavioral. Search for causes and other factors that influence the occurrence of health-related events. Analytic epidemiology attempts to provide the Why and How of such events by comparing groups with different rates of disease occurrence and with differences in demographic characteristics, genetic or immunologic make-up, behaviors, environmental exposures, and other so-called potential risk factors. 132 What is Epidemiology Health-related states or events Originally, epidemiology was concerned with epidemics of communicable diseases. Then epidemiology was extended to endemic communicable diseases and non communicable infectious diseases. More recently, epidemiologic methods have been applied to chronic diseases, injuries, birth defects, maternal-child health, occupational health, and environmental health. Now, even behaviors related to health and well-being (amount of exercise, seat-belt use, etc.) are recognized as valid subjects for applying epidemiologic methods. 133 What is Epidemiology Health related events refer to: diseases, causes of death, behaviors such as use of tobacco, positive health states, reactions to preventive regimes and provision and use of health services. Specified populations include those with identifiable characteristics, such as occupational groups. Clinicians are concerned with the health of an individual; epidemiologists are concerned with the collective health of the people in a community or other area. 134 What is Epidemiology Application: to prevention and control the aims of public health—to promote, protect, and restore health Epidemiology is more than “the study of.” epidemiology provides data for directing public health action. using epidemiologic data is an art as well as a science. 
135 Summary Epidemiology is the study (scientific, systematic, data driven) of the distribution (frequency, pattern) and determinants (causes, risk factors) of health-related states and events (not just diseases) in specified populations (patient is community, individuals viewed collectively), and the application of (since epidemiology is a discipline within public health) this study to the control of health problems. 136 Epidemiologists are required to have some knowledge of: Public health: because of the emphasis on disease prevention Clinical medicine: because of the emphasis on disease classification and diagnosis (numerators) Pathophysiology: because of the need to understand basic biological mechanisms in disease (natural history) Biostatistics: because of the need to quantify disease frequency and its relationships to antecedents (denominators, testing hypotheses) Social sciences: because of the need to understand the social context in which disease occurs and presents (social determinants of health phenomena) 137 Uses of Epidemiology 1. To study the history of disease: Trends of a disease for the prediction of trends. Result are useful in planning for health services and public health. 2. Community diagnosis: What are the diseases, conditions, injuries, disorders, disabilities, defects causing illness, health problems or death in a community or region? 138 Uses of Epidemiology 3. Look at risks of individuals as they affect groups or population: What are the risks factors, problems, behavior that affects group? Health screening, medical exams, disease assessments. 4. Assessments, evaluation, research. How well do public health and health services meet the problems and needs of the population or group? 139 Uses of Epidemiology 4. Assessments, evaluation, research. How well do public health and health services meet the problems and needs of the population or group? 5. Completing the clinical picture: Identification and diagnosis process to establish that a condition exists or that a person has a specific disease. 6. Identification of syndromes: Help to establish and set criteria to define syndromes. 140 Uses of Epidemiology 7. Determine the causes and sources of disease: Findings allow for control, prevention, and elimination of the causes of disease, conditions, injury, disability 141 Some Epidemiologic Concepts: Mortality Rates Death is a unique and universal event, and as a final event Mortality rates pertain to the number of deaths occurring in a particular population subgroup and often provide one of the first indications of a health problem. Mid year-an estimate of the average number during the year. A reason for taking the population at midyear as the denominator for determining rates or ratios is because of a population may grow or shrink during the year in question 142 Measures of Mortality 143 Age Specific Death Rates (ASDR) 144 Age Specific Death Rates (ASDR) 145 Maternal Mortality Rate (MMR) 146 Incidence and Prevalence 147 Incidence and Prevalence 148 Incidence and Prevalence The quality of the data is commonly described with use of four terms: Accuracy: the degree to which a measurement represents the true value of the attribute being measured. Precision: the reproducibility of a study result, that is, the degree of resemblance among study results, were the study to be repeated under similar circumstances: lack of precision is referred to as "random error". 
Reliability: a measure of how dependably an observation is exactly the same when repeated; it refers to the measuring procedure rather than to the attribute being measured. 149

Incidence and Prevalence
Validity: the extent to which the study measures what it is intended to measure; lack of validity is referred to as "bias" or "systematic error." 150

Measures of Association
Epidemiologic studies are often interested in knowing how much more likely an individual is to develop a disease if he or she is exposed to a particular factor than the individual who is not so exposed. 151

Measures of Association
Relative Risk
It is the ratio of two incidence rates: the rate of development of the disease for people with the exposure factor, divided by the rate of development of the disease for people without the exposure factor; that is, the probability of the outcome in those exposed divided by the probability of the outcome in those not exposed (cohort study).
RR = p1 / p2 = P(disease | exposed) / P(disease | unexposed) 152

Example Relative Risk
Cross-Classification of Aspirin Use and Myocardial Infarction
RR = P(disease | exposed) / P(disease | unexposed) = (189/11034) / (104/11037) = 0.0171 / 0.0094 = 1.818
The risk of developing a heart attack (myocardial infarction) for individuals who don't use aspirin is 1.818 times the risk for those who use aspirin. 153

Odds Ratio
The odds ratio is the odds of the outcome in one group divided by the odds of the outcome in the other group. Let p1 refer to the probability of the outcome in group 1, and p2 to the probability of the outcome in group 2.
OR = [P(disease | exposed) / (1 − P(disease | exposed))] / [P(disease | unexposed) / (1 − P(disease | unexposed))]
   = [p1 / (1 − p1)] / [p2 / (1 − p2)] = p1(1 − p2) / [p2(1 − p1)] 154

Example Odds Ratio
A study on the relationship between seat-belt use (yes, no) and outcome of an automobile crash (fatality, non-fatality) for drivers involved in accidents is given below. Calculate the odds ratio.
OR = p1(1 − p2) / [p2(1 − p1)] = (160 × 3600) / (510 × 1500) = 0.75 155

Relationship between RR and OR
RR = OR / [(1 − p0) + (p0 × OR)]
where p0 is the proportion of those unexposed who develop the outcome, OR is the odds ratio, and RR is the relative risk estimated from the odds ratio. (A short code sketch of these relative-risk and odds-ratio calculations is given a little further below.) 156

Bias
What do epidemiologists do?
Measure effects - could be a rate or a risk
Attempt to define a cause - just an estimate of the truth
Implement public health measures
However, there might be:
Bias?
Chance? - can be evaluated quantitatively
Confounding? 157

Bias
Despite all preventive efforts, it has to be remembered that bias:
May mask an association or cause
May cause over- or under-estimation of the effect size
Leads to a conclusion different from the truth
Bias can be minimized by the design and conduct of the study
Some types of bias cannot be minimized by increasing the sample size 158

Bias
Deviation of results or inferences from the truth. Bias is defined as "any trend in the collection, analysis, interpretation, publication, or review of data that can lead to conclusions that are systematically different from the truth". Bias can lead to an incorrect estimation of the association between an exposure and the risk of a disease.
Bias can be due to:
Selection
Measurement / misclassification
Confounding 159

Selection Bias
It is a distortion in the estimate of association between risk factors and disease that results from how the subjects are selected for the study.
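As noted above, here is a short sketch of the relative-risk and odds-ratio calculations before the discussion of selection bias continues. It is plain Python; the function names are mine, and the counts are those quoted in the aspirin and seat-belt examples.

```python
# Relative risk, odds ratio, and the RR-from-OR formula from the slides.

def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """RR = P(disease | exposed) / P(disease | unexposed)."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)

def odds_ratio(a, b, c, d):
    """OR as the cross-product (a*d)/(b*c), i.e. p1(1-p2) / [p2(1-p1)] for a 2x2 table."""
    return (a * d) / (b * c)

def rr_from_or(odds_ratio_value, p0):
    """RR estimated from OR, where p0 is the outcome risk among the unexposed."""
    return odds_ratio_value / ((1 - p0) + p0 * odds_ratio_value)

# Aspirin / myocardial infarction example: 189 MIs among 11,034 non-users of aspirin
# versus 104 MIs among 11,037 users.
print(f"RR = {relative_risk(189, 11034, 104, 11037):.3f}")  # about 1.818

# Seat-belt example: the slides compute OR = (160 * 3600) / (510 * 1500); how the four
# counts map onto the 2x2 cells is not shown there, so this call just reproduces it.
print(f"OR = {odds_ratio(160, 510, 1500, 3600):.2f}")        # about 0.75
```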
Could occur because the sampling frame is sufficiently different from the target population or That is the sampling frame is not the mirror image of the target population 160 Response Bias Also called as ascertainment bias Systematic error due to differences in characteristics between those who choose or volunteer to take part in a study and those who do not Avoid differential response rates Of 100 people exposed to a risk factor, 20% develop the disease and of a 100 people unexposed, 16% develop the disease yielding a relative risk of 1.25 161 Response Bias… Now imagine that only 60% of the exposed respond to follow-up, or are ascertained as having or not having the disease, a 60% response rate among the exposed. Assume further that all of the ones who don't respond happen to be among the ones who don't develop disease. The relative risk would be calculated as 2.06 Now imagine that only 60% of the nonexposed reply, a 60% response rate among the nonexposed, and all of the nonexposed who don't respond happen to be among the ones who don't have the disease. Now the relative risk estimate is 0.75. 162 Confounding Confounding exists when a risk factor other than the exposure under study is associated, independently, both with the exposure and with the outcome Confounder Exposure Outcome For a variable to be a confounder, it must have three characteristics: 1) it must be associated with the exposure (causally or not); 2) it must be a cause, or a surrogate of the cause, of the health outcome; 3) it should not be in the causal pathway between the potential risk factor and outcome. 163 Matching Several methods are available to address confounders: randomization, matching, multivariate analysis etc. Matching refers to the selection of unexposed subjects’ i.e., controls that in certain important characteristics are identical to cases (may be with the possible confounder). Matching addresses issues of confounding in the DESIGN stage of a study as opposed to the analysis phase. 164 Matching… A study of coffee and heart disease, match subjects on their smoking history, since smoking may be a confounder of the relationship between coffee and heart disease. Whenever enrolled a coffee drinker into the study, determine if that person was a smoker. If the patient was a smoker, the next patient who would be enrolled who was not a coffee drinker (i.e., a member of the comparison group), would also have to be a smoker. For each coffee-drinking non smoker, a non coffee-drinking non smoker would be enrolled. 165 Chapter Three Designing Research 166 Introduction Research is all about addressing an issue or asking and answering a question or solving a problem or Research is what we do when we have a question or a problem we want to resolve We may already think we know the answer to our question already We may think the answer is obvious, common sense even But until we have subjected our problem to rigorous scientific scrutiny, our 'knowledge' remains little more than guesswork or at best, intuition. 167 Procedures in research 168 What is research design? A framework for the research plan of action. A master plan that specifies the methods and procedures for collecting and analyzing the needed information A strategy for how the data will be collected. The purpose of research design is: It provides the scheme for answering research question. It maintains control to avoid bias that may affect the outcomes. 
It organize the study in a certain way defending the advantages of doing while being aware and caution about potential disadvantages 169 Categories of Research Design Descriptive studies: Examine patterns of disease Analytical studies: Studies of suspected causes of diseases Experimental studies: Compare treatment modalities 170 Observational and Experimental Studies Researchers are interested in comparing reading scores for students in schools with low average family income with scores for students in schools with high average family income. They choose a random sample of schools in each category. This is an observational study: the researchers do nothing to affect either family income or reading scores. Researchers are interested in comparing two methods for teaching reading. They randomly assign half the schools in their sample to one method and the other half to the other method. At the end of the school year, they analyze reading scores of the children in the schools. This is an experiment: the researchers deliberately decide which students receive each teaching method. 171 Observational Study… Looks at the natural history of the disease, can suggest a hypothesis Non-experimental, Observational studies are an alternative to experimental studies. Observational because there is no individual intervention Treatment and exposures occur in a “non-controlled” environment Individuals can be observed prospectively, retrospectively, or currently 172 Case Control Study It is designed to help determine if an exposure is associated with an outcome. A comparison group that does not have the disease. Subjects in whom the disease has been diagnosed. The two groups must be comparable except for the factor of interest. If one wants to show that an associated factor is a cause it is necessary to control for all important differences other than the exposure factor of interest. 173 Case Control Study… Then, look back in time to learn which subjects in each group had the exposure(s), comparing the frequency of the exposure in the case group to the control group. Exposed Cases Not Exposed Population Exposed Control Not Exposed 174 Case Control Study… Case control studies are comparatively quick, inexpensive, and Easy They are particularly appropriate for Investigating outbreaks, and E.g a study of endophthalmitis following ocular surgery Studying rare diseases or outcomes. study of risk factors for uveal melanoma, or corneal ulcers. 175 Case Control Study… Case-control studies cannot provide any information about the incidence or prevalence of a disease because no measurements are made. Case-control studies may prove an association but they do not demonstrate causation. made in a population based sample. All studies which contain ‘cases’ and ‘controls’ are not case-control studies. 176 Cohort Study A study design where one or more samples (called cohorts) are followed prospectively and subsequent status evaluations with respect to a disease or outcome are conducted to determine which initial participants exposure characteristics (risk factors) are associated with it. As the study is conducted, outcome from participants in each cohort is measured and relationships with specific characteristics determined. 177 Cohort Study… A “cohort” is a group of people who have something in common. Can represent the source population—the population from which cases of disease arise For example, the effect of company downsizing on the health of office workers. 
This group is then compared to a similar group that hasn't been exposed to the variable. 178 Cohort Study: Example To determine the long-term effectiveness of influenza vaccines in elderly people, cohorts of vaccinated elderly and unvaccinated community-dwelling elderly were studied. The results suggest that the elderly who are vaccinated have a reduced risk of hospitalization for pneumonia or influenza. This study uses data collected from high school students, and studies the differences in initiation of tobacco use between adolescents that started working for pay and a that did not work. The results suggest that adolescents who work for pay have a higher risk of initiating tobacco use. 179 Case Control and Cohort Study The distinction between case-control studies and prospective studies lies in the sampling. In the case-control study we sample from among the diseased and nondiseased, whereas in a prospective study we sample from among those with the factor and those without the factor. 180 Demonstrating Strength of Causality cross-sectional studies: useful in showing associations, in providing early clues to etiology. case-control studies: useful for rare diseases or conditions, or when the disease takes a very long time to become manifest (synonymous name: retrospective studies). Cohort studies: useful for providing stronger evidence of causality, and less subject to biases due to errors of recall or measurement (synonymous names: prospective studies, longitudinal studies). clinical trials: prospective, experimental studies that provide the most rigorous evidence of causality. 181 Crossectional Study Determines prevalence at a point in time Measure exposure and outcome variables at one point in time. Immediate outcome assessment and no loss to follow-up, therefore faster, cheaper, easier Useful for determining the prevalence of risk factors and the frequency of prevalent cases of a disease for a defined population They are also useful for measuring current health status and planning for selected health services 182 Crossectional Study: Example For example, in a cross-sectional study of high blood pressure and coronary heart disease the investigators determine the blood pressure and the presence of heart disease at the same time. If they find an association, they would not be able to tell which came first. Does heart disease result in high blood pressure or does high blood pressure cause heart disease, or are both high blood pressure and heart disease the result of some other common cause? 183 Choosing study design Does it adequately test the hypotheses? Hypotheses determine participants, variables measured & data analysis methods Example hypotheses tested in student projects Discussion of Requirements of proposal Does it identify and control extraneous factors? Eliminate alternative explanations for results to increase confidence in cause-effect conclusion (internal validity) 184 Choosing study design Control depends on type of design Correlational design has less control Extraneous variables are measured and effects are statistically controlled Are results generalizable? 
Replicate to other samples and other contexts Random selection of participants Features of field experiments enhancing external validity Realistic nature of setting and/or task Manipulation of treatment 185 Choosing study design Use of control group Nature of samples used Lack of control over confounding variables due to non-random assignment or inability for matching Can the hypothesis be rejected or retained via statistical means? (statistical conclusion validity) Need reliable measures Need large enough sample to detect true effect and avoid Type I and Type II errors 186 Choosing study design What is a null hypothesis? No effect proposed What is an alternative hypothesis? What is a directional hypothesis? Is the design efficient in using available resources? Optimal balance between research design, time, resources and researcher expertise 187 Methods and Methodology Research methods are the tools, techniques or processes that we use in our research. These might be, surveys, interviews, Questionaire participant observation, Focus Group Discussion. However, Methods and how they are used are shaped by methodology. 188 Methods and Methodology Methodology is the study of How research is done, How we find out about things, and How knowledge is gained. In other words, methodology is about the principles that guide our research practices. Methodology therefore explains why we are using certain methods or tools (logic, reality, values) in our research. 189 Chapter Four Survival Data Analysis 190 Introduction Survival Analysis typically focuses on time to event data. Survival time refers to a variable which measures the time from a particular starting time to a particular endpoint of interest. Survival analysis is generally defined as a set of methods for analyzing data where the outcome variable is the time until the occurrence of an event of interest. 191 Introduction… In the most general sense, it consists of techniques for positive- valued random variables, such as Time to death Time to onset (acquire/develop) or (relapse/recurence) of a disease Length of stay in a hospital Duration of a strike Money paid by health insurance Viral load measurements: how much HIV is in the blood. Time to finishing a doctoral dissertation! 192 Examples Time to Event Medicine: Time to relapse of a certain disease Time to death of HIV patients after retroviral therapy Time to re-occurrence of a particular symptom Time to cure from a certain disease Agriculture: Length of time required for a cow to conceive after calving Time until a farm experiences its first case of an exotic disease 193 Examples Time to Event… Sociology: “duration analysis”: studying behavioural changes of individuals over time. Time to find a new job after a period of unemployment Time until re-arrest after release from prison Time until getting promotion Engineering: “reliability analysis” Time to the failure of a machine Failure of networks 194 Examples Time to Event… Management: time until turn over, retirment Demography: Time until births, marriages, divorces, migration patterns and deaths Criminology: Time until Commiting crimes, convictions, arrests and rehabilitations Epidemiology: Time until aging , chronic diseases 195 Examples Time to Event… The event can be death, occurrence of a disease, marriage, divorce, etc. The time to event or survival time can be measured in days, weeks, years, etc. For example, if the event of interest is heart attack, then the survival time can be the time in years until a person develops a heart attack. 
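Survival time is simply the elapsed time between a well-defined origin and the event (or the end of follow-up). As a small illustration, with hypothetical dates, the observed time in days could be derived in R as:

start_date <- as.Date("2019-01-15")   # e.g. date of diagnosis (hypothetical)
end_date   <- as.Date("2019-11-02")   # e.g. date of the event or of last follow-up
as.numeric(difftime(end_date, start_date, units = "days"))   # 291 days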
196 Time to Event Diagrammatically censored event censored event censored censored event event Time in a unit of measurement Individuals do not all enter the study at the same time. When the study ends, some individuals still haven't had the event yet. Other individuals drop out or get lost in the middle of the study, and all we know about them is the last time they were still “free" of the event. 197 The Main Concepts of Survival Analysis(Event) An event is ‘‘a change in state as defined by one or more qualitative variables within some observation period and within the relevant state space’’ (Blossfeld, Hamerle & Mayer, 1989). It consists of some form of change in state (Melnyk et al., 1995). Qualitative changes can be identified as events if there is a ‘‘relatively sharp disjunction between what precedes and what follows’’ the change over a period of time. 198 The Main Concepts of Survival Analysis (Event) For instance, an event can be a store opening, a store failure, a job termination, etc. Three questions can be asked: Question 1. Did the event happen? Question 2. When did it happen? Question 3. How do various factors affect the occurrence and the timing of the event? 199 The Main Concepts of Survival Analysis (Measurement Window) The measurement window characterises the period of time during which the researcher makes their observation. The choice of the measurement window length in the different investigations is a personal and arbitrary judgement by the researcher. Indeed, there is little theoretical and empirical evidence to use as a guide 200 The Main Concepts of Survival Analysis (Measurement Window)… Due to the arbitrary aspect of length choice, we must not forget that results can vary consequently with the length of the observation period. This variability in the results can be observed in various works about turnover. In addition to this diversity in the results, studies are also very difficult to compare. 201 The Main Concepts of Survival Analysis(Time) In order to define a failure time random variable, we need: An unambiguous time origin A time scale [real time: days, years] Definition of the event[Death, Disease, Recurrence, Response] Failure time random variables are always non-negative. Calendar time should be distinguished from survival time (patients time). 202 The Main Concepts of Survival Analysis (Censoring) We often possess data composed of duration between two events Individuals do not all enter the study at the same time When the study ends, some individuals still haven't had the event yet Patient withdrawal from a clinical trial Death due to some cause other than the one of interest Migration of human population 203 The Main Concepts of Survival Analysis (Censoring) … Duration may not been completely observed. The possibility that some individuals may not be observed for the full time to failure is occurrence of censoring The censorship refers to an incomplete survival time like the lack of start date the lack of ending date of the event the loss of a customer disappearance from the sample within the measurement window 204 The Main Concepts of Survival Analysis (Censoring) … Censored data can occur when: The event of interest is death, but the patient is still alive at the time of analysis. The individual was lost to follow-up without having the event of interest. The event of interest is death by cancer but the patient died of an unrelated cause, such as a car accident. 
The patient is dropped from the study without having experienced the event of interest due to a protocol violation. 205 The Main Concepts of Survival Analysis (Censoring) … Censored data are data which are incomplete. But in order to be complete, the information must meet three conditions 1. The time during which the subject is exposed to a particular risk type must be specified; 2. We must be able to identify the end of the period; and 3. The end must be due to some event under investigation. These three conditions cannot always be satisfied 206 The Main Concepts of Survival Analysis (Types of Censoring) … Censoring (incomplete survival time) can be: Right Censored Left Censored Interval Censored Sometimes process is terminated due to reasons different from these under investigation. In Real situation we have occasionally both right and left censored data. 207 Right Censoring Right censoring occurs when a subject leaves the study before an event occurs, or The study ends before the event has occurred. Don’t know when the event occurred This unknown date can be : close, distant or even non-existent. 208 Right Censoring … Patients in a clinical trial to study the effect of treatments on stroke occurrence. The study ends after 5 years. Those patients who have had no strokes by the end of the year are censored. Total survival time Observed survival time Start study End study 209 Right Censoring …. The following notation is used to denote right censored data: T = survival time (event time) C = censoring time The data are usually represented as (Y, δ) where Y = min(T,C) is the time recorded, and δ indicates whether we observed an event time or a censoring time, that is 210 Right Censoring …. Y = min(T , C ) T for an uncensored observation = C for a censored observation δ = I (T ≤ C ) 1, for an uncensored observation = 0, for a censored observation 211 Right Censoring … Type I Right censoring all subjects are followed from a common starting point to a common end point Censoring time is the same for all subjects Example: Everyone followed for 1 year Study Start Study End 212 Right Censoring … Type II Right censoring all items are put on test at the same (starting) time and Stop observation when a set number of events have occurred often used in equipment testing replace all light bulbs when five have failed Study Start Study End 213 Right Censoring …. Random Right censorship random refers to fact that censoring process/events unplanned/ not under control of investigator, so events occur “randomly”; for each item/subject the survival time and the censoring time are random variables and that we observe the minimum of these two random variables. 214 Right Censoring …. In random right censoring our focus, more general than Type I Entry is at any time, the study itself continues until a fixed time point but subjects enter and leave the study at different times. Study Start Study End 215 Left Censoring The event has occurred prior to the start of the study Or the true survival time is less than the person’s observed survival time, only a upper bound for the time of event of interest is known We know the event occurred, but unsure when prior to observation In this kind of study, exact time would be known if it occurred after the study started 216 Left Censoring … Example: Survey question: when did you first smoke? 
HIV: infection time Censored during start of study Follow up time Start study End study 217 Left Censoring … The following notation is used to denote left censored data: T = survival time (event time) C = censoring time The data are usually represented as (Y, ε) where Y = max(T, ε) is the time recorded, and ε indicates whether we observed an event time or a censoring time, that is 218 Left Censoring … Y = max(T , C ) ε = I (T ≥ C l ) 1, for an uncensored observation = 0, for a censored observation 219 Interval Censoring Due to discrete observation times, actual times not observed. Example: progression-free survival Progression of cancer defined by change in tumor size Measure in 3-6 month intervals If increase occurs, it is known to be within interval, but not exactly when. 220 Survival Analysis Survival analysis examines the hazard that a certain event occurs . Survival analysis possesses two main aims: Firstly, we want to estimate the time period during which the event can happen. Second, we want to examine and describe the time distribution of the event, and estimate quantitatively the impact of various independent factors, called covariates, on this distribution. Note: Data collection essentially consists of a longitudinal record of events which happen to study units/subjects. 221 Why Survival Analysis… Why not compare mean time-to-event between your groups using a t-test or linear regression? ignores censoring If no censoring (everyone followed to outcome of interest) than ttest on mean or median time to event is fine. Why not compare proportion of events in your groups using risk/odds ratios or logistic regression? ignores time If time at-risk was the same for everyone, could just use proportions. 222 The Survival Analysis Methodology Most modern methods are mainly non-parametric (Kaplan-Meier, 1958) or Semi-parametric (the Cox model, 1972) or It may be parametric When no covariate is available in the data, Kaplan-Meier methods can be used, otherwise the Cox model is the solution. 223 Probability Density Function The probability of the failure time occurring at exactly time t (out of the whole range of possible t’s) is P (t ≤ T < t + ∆t ) f (t ) = lim ∆t → 0 ∆t The goal of survival analysis is to estimate and compare survival experiences of different groups. Survival experience is described by the cumulative survival function: S (t ) = 1 − P (T ≤ t ) = 1 − F (t ) F(t) is the CDF of f(t), and is “more interesting” than f(t). 224 The survival function The survival function reflects the cumulative survival probabilities throughout time. It is the rate of units, individuals, organisations, etc. not yet reached, at time t, by the event studied. When events are studied, the dependent variable is frequently the length of time until the event. The survival function is the unconditional probability that an event has not yet occurred at the period of time t. 225 The Survival Function … A function describing the proportion of individuals surviving to or beyond a given time or probability that a randomly selected individual will survive beyond time t. Notation: T ≡ survival time of a randomly selected individual t ≡ a specific point in time. S(t) = P(T > t) ≡ Survival Function 226 The Survival Function… Example 1: If t=100 years, S(t=100) is probability of surviving beyond 100 years. 
Ŝ(t) = (number of patients surviving longer than t) / (total number of patients in the study) Example 2: Event = death, scale = months since Rx "S(t) = 0.3 at t = 60" 227 Survival Function From example 2, "the 5-year survival probability is 30%", hence "70% of patients die within the first 5 years". Basic properties of the survival function: S(0) = 1 S(∞) = 0; for example, if everyone eventually dies, then S(∞) = 0 S(t) is non-increasing 228 Hazard Function The hazard function h(t) is the probability of dying "at" time t. Also called the instantaneous failure rate, the age-specific failure rate and the force of mortality. h(t) = f(t) / S(t) The risk set is the set of units (individuals) in the sample that are at risk of the event occurring at a certain point in time. 229 Hazard Function For instance, at the first period of time (day, week, month, year), the whole sample is at risk. The hazard rate is the conditional probability that the event occurs to a unit of the sample at a specific time, given that the unit is still at risk: h(t) = lim_{Δt→0} (1/Δt) P(t ≤ T ≤ t + Δt | T ≥ t) = lim_{Δt→0} (1/Δt) P([t ≤ T ≤ t + Δt] ∩ [T ≥ t]) / P(T ≥ t) = lim_{Δt→0} (1/Δt) P(t ≤ T ≤ t + Δt) / P(T ≥ t) = f(t) / S(t) 230 Hazard Rate Example Event = death, scale = months since Rx "h(t) = 1% at t = 12 months" "At 1 year, patients are dying at a rate of 1% per month" "At 1 year the chance of dying in the following month is 1%" 231 Connection Between the Different Quantities Hazard from density and survival: h(t) = f(t) / S(t) Survival from density: S(t) = ∫_t^∞ f(u) du Density from survival: f(t) = −dS(t)/dt Density from hazard: f(t) = h(t) · exp(−∫_0^t h(u) du) Survival from hazard: S(t) = exp(−∫_0^t h(u) du) Hazard from survival: h(t) = −d ln S(t)/dt (A numerical check of these relations in R appears below.) 232 Estimates of Survival Probabilities Unlike ordinary regression models, survival methods correctly incorporate information from both censored and uncensored observations in estimating important model parameters. The dependent variable in survival analysis is composed of two parts: one is the time to event and the other is the event status, which records whether the event of interest occurred or not. 233 Estimates of Survival Probabilities… It is possible to estimate two functions that depend on time: the survival and hazard functions. The survival and hazard functions are key concepts in survival analysis for describing the distribution of event times. While these are often of direct interest, many other quantities of interest (e.g., median survival) may subsequently be estimated from knowing either the hazard or the survival function. 234 Estimates of Survival Probabilities … It is generally of interest in survival studies to describe the relationship of a factor of interest (e.g. treatment) to the time to event, in the presence of several covariates, such as age, gender, race, etc. A number of models are available to analyze the relationship of a set of predictor variables with the survival time. Methods include parametric, nonparametric and semiparametric approaches. 235 Kaplan-Meier Estimates of Survival Probabilities The goal is to estimate a population survival curve from a sample. If every patient is followed until death, the curve may be estimated simply by computing the fraction surviving at each time. However, in most studies patients tend to drop out, become lost to follow-up, move away, etc. The Kaplan-Meier estimator allows estimation of survival over time, even when patients drop out or are studied for different lengths of time.
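As a numerical check of the relations between f(t), S(t) and h(t) listed above, the exponential distribution (constant hazard λ) can be evaluated in a few lines of R; this is only a verification sketch, not part of the course data:

lambda <- 0.2
t <- 3
f <- dexp(t, rate = lambda)        # density f(t)
S <- 1 - pexp(t, rate = lambda)    # survival S(t) = P(T > t)
h <- f / S                         # hazard from density and survival
all.equal(h, lambda)               # TRUE: the exponential hazard is constant
H <- lambda * t                    # cumulative hazard, the integral of h(u) from 0 to t
all.equal(exp(-H), S)              # TRUE: survival recovered from the cumulative hazard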
236 Kaplan-Meier Estimates of Survival Probabilities… Also called the Product-Limit estimate Non-parametric estimate of the survival function Empirical probability of surviving past certain times in the sample Applicable only to right-censored data Commonly used to compare two study populations It is a step function: the Kaplan–Meier estimate does not change between events, nor at times when only censorings occur; it drops only at times when a failure has been observed. 237 Kaplan-Meier Estimates of Survival Probabilities… Limitations: Mainly descriptive Doesn't control for covariates Requires categorical predictors Can't accommodate time-dependent variables 238 Kaplan-Meier Estimate When there are no censored data, the KM estimator is simple and intuitive: estimated S(t) = proportion of observations with failure times > t. For example, if you are following 10 patients, and 3 of them die by the end of the first year, then your best estimate of S(1 year) = 70%. When there are censored data, KM provides an estimate of S(t) that takes censoring into account. 239 Kaplan-Meier Estimate … The PL method assumes that censoring is independent of the survival times (that is, the reason an observation is censored is unrelated to the cause of failure). K-M estimates are limited to the time interval in which the observations fall. If the largest observation is uncensored, the PL estimate at that time equals zero. Median survival is the point in time when S(t) is 0.5, that is, when 50% of the subjects have acquired the event. 240 Kaplan-Meier Estimate … Note that: (1) for each time period the number of individuals present at the start of the period is adjusted according to the number of individuals censored and the number of individuals who experienced the event of interest in the previous time period, and (2) for ties between failures and censored observations, the failures are assumed to occur first. 241 Kaplan-Meier Estimate Observed event times Let there be K distinct event times t1 < t2 < … < tK. At each time tj there are nj individuals at risk (where "at risk" means individuals who have the event at time tj or later), and dj is the number who have the event at time tj. Multiply the probability of surviving event time t by the probabilities of surviving all the previous event times: Ŝ(t) = ∏_{j: tj ≤ t} (nj − dj)/nj which represents the estimated survival probability at time t, P(T > t); dj/nj is the proportion failing at time tj, and 1 − dj/nj is the proportion surviving that event time. 242 Kaplan-Meier Estimate Consider the following pseudo-example data:
Time | At risk | Died | Censored | Point survival probability (pj) | Cumulative survival (Sj)
0 | 31 | 2 | 3 | (31-2)/31 = 0.9355 | 0.9355
1 | 26 | 1 | 2 | (26-1)/26 = 0.9615 | 0.8995
2 | 23 | 1 | 2 | (23-1)/23 = 0.9565 | 0.8604
3 | 20 | 1 | 2 | (20-1)/20 = 0.95 | 0.8173
243 Example: Kaplan-Meier Estimate
Time | Number left | Number failed | Prob. of survival
0 | 17 | 0 | 1.0000
6* | 16 | 0 | 1.0000
10 | 16 | 2 | 0.8750
16 | 14 | 1 | 0.8125
16* | 14 | 1 | 0.8125
16* | 14 | 1 | 0.8125
17 | 11 | 1 | 0.7386
20 | 10 | 1 | 0.6648
23 | 9 | 1 | 0.5909
26 | 8 | 1 | 0.5170
26* | 8 | 1 | 0.5170
29* | 6 | 1 | 0.5170
30* | 5 | 1 | 0.5170
31* | 4 | 1 | 0.5170
38* | 3 | 1 | 0.5170
* indicates censored observations.
244 Kaplan-Meier Estimate Using R There are many functions and data sets in the survival package. > library(survival) #load it. > data(aml) #load the data set aml > aml #see the data To estimate the distribution of lifetimes non-parametrically, based on right-censored observations, we use the Kaplan-Meier estimator. The R function to do that is survfit() (part of the survival package).
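The product-limit arithmetic of the pseudo-example table above can be reproduced by hand in R; the last value comes out as 0.8174 rather than 0.8173 because the slide rounds at each step.

n_risk <- c(31, 26, 23, 20)      # at risk at each event time
d      <- c(2, 1, 1, 1)          # events at each event time
p_j <- (n_risk - d) / n_risk     # conditional survival (nj - dj)/nj
S_j <- cumprod(p_j)              # Kaplan-Meier cumulative survival
round(S_j, 4)                    # 0.9355 0.8995 0.8604 0.8174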
245 Kaplan-Meir Estimate Using R > aml2<-Surv(aml$time,aml$status) #### creates an object with censoring indicated with a + [1] [16] 9 12 13 13+ 18 16+ 23 27 23 30 28+ 31 33 43 34 45+ 48 161+ 5 5 8 8 45 This Surv() object can then be entered into the survfit() function to obtain Kaplan-Meier estimates of the survival function. The survfit function works similarly to an lm() or glm() function, that is, we put the survival time data on the left hand side of the ~ and any predictors (groups) on the right, in this case there are no predictors so this can be considered a simple intercept model 246 Kaplan-Meir Estimate Using R > survfit(aml2~1) Call: survfit(formula = aml2 ~ 1) Records n.max n.start events median 0.95LCL 0.95UCL 23 23 23 18 27 18 45 247 Kaplan-Meir Estimate Using R > summary(survfit(aml2~1)) Call: survfit(formula = aml2 ~ 1) time n.risk n.event survival std.err lower 95% CI upper 95% CI 5 23 2 0.9130 0.0588 0.8049 1.000 8 21 2 0.8261 0.0790 0.6848 0.996 9 19 1 0.7826 0.0860 0.6310 0.971 12 18 1 0.7391 0.0916 0.5798 0.942 13 17 1 0.6957 0.0959 0.5309 0.912 18 14 1 0.6460 0.1011 0.4753 0.878 23 13 2 0.5466 0.1073 0.3721 0.803 27 11 1 0.4969 0.1084 0.3240 0.762 30 9 1 0.4417 0.1095 0.2717 0.718 31 8 1 0.3865 0.1089 0.2225 0.671 33 7 1 0.3313 0.1064 0.1765 0.622 34 6 1 0.2761 0.1020 0.1338 0.569 43 5 1 0.2208 0.0954 0.0947 0.515 45 4 1 0.1656 0.0860 0.0598 0.458 48 2 1 0.0828 0.0727 0.0148 0.462 248 Kaplan-Meir Estimate Using R >plot(survfit(aml2~1,conf.type="plain") ,xlab="Time",ylab="Survival") 249 Kaplan-Meir Estimate Using R >plot(survfit(aml2~1,conf.type="loglog"),xlab="Time",ylab="Survival") 250 Variance for Kaplan-Meir Estimate The Greenwood variance estimate for a K-M curve is defined as: ^ 2 ^ k dj j =1 n j (n j − d j ) Var[ S (t )] = S (t )∑ It underestimates the true variance for small to moderate samples. a confidence interval for all time points t (point wise confidence interval). ^ ^ ^ S (t ) ± zα σ ( S (t )) 2 May contain points outside the [0; 1] interval Use the log-log function option 251 Variance for Kaplan-Meir Estimate ^ Var g ( S (t )) 2 1 1 ' ^ ^ ^ = = g S t Var S t Var S t ( ) ( ) ( ) 2 2 ^ ^ log S (t ) log S (t ) dj k ∑ n (n j =1 j j −dj) confidence interval for log(-log(S(t))) is: ^ ^ log(− log S (t )) ± zα Var[log(− log S (t )] 2 Confidence interval for S(t) is obtained by back-transforming: ^ S (t ) exp ± zα 2 ^ Var log( − log S ( t ) 252 Confidence interval for Survival Curves The Greenwood variance estimate for a K-M curve is defined as: We can use Greenwood variance estimate to derive a confidence interval for all time points t (point wise confidence interval). What might be a potential problem with using this estimate for a confidence interval? Hint: 0 ≤S(t)≤1. But may contain points outside the [0; 1] interval. 253 Confidence interval for Survival Curves Solution: Transform the survivor function so that the confidence interval falls in the [0, 1] range. The usual solution is to use the log-log function option. Let g (S(t)) = log (-log (S(t))). 
Using the delta method, we can get a variance estimate for the transformed function Var g ( S (t )) ^ 2 1 1 ' ^ ^ = = g S t Var S t Var S t ( ) ( ) ( ) 2 2 ^ ^ log S (t ) log S (t ) ^ dj k ∑ n (n j =1 j j −dj) 254 Confidence interval for Survival Curves and therefore our confidence interval for log(-log(S(t))) is ^ ^ log(− log S (t )) ± zα Var[log(− log S (t )] 2 Confidence interval for S(t) is obtained by back- transforming: S (t ) ^ exp( ± zα 2 ^ Var log( − log S ( t ) 255 Confidence interval for Survival Curves Example: Let us assume the following artificial data and compute the log-og back transformed confidence interval. Time (tj) Number Number at risk of event (dj) (nj) 6 7 10 13 16 22 23 21 17 15 12 11 7 6 Survival (s(tj)) 3 1 1 1 1 1 1 0.8571 0.8067 0.7529 0.6902 0.6275 0.5378 0.4482 Standard 95% deviation confidence interval lower upper 0.0764 0.6197 0.9516 0.0870 0.5635 0.9228 0.0964 0.5032 0.8894 0.1068 0.4316 0.8491 0.1141 0.3675 0.8049 0.1282 0.2678 0.7468 0.1346 0.1881 0.6801 2 1 1 ^ ' ^ ^ ^ = = Var g ( S (t )) g= S t Var S t Var S (t ) ( ) ( ) 2 2 ^ ^ log S (t ) log S (t ) ^ ^ log(− log S (t )) ± zα Var[log(− log S (t )] 2 ^ S (t ) exp( ± zα 2 dj k ∑ n (n j =1 j j −dj) ^ Var log( − log S ( t ) 256 Comparison of Survival Curves As in most statistics, a key objective is to test whether subpopulations behave in the same way. As survival analysis is important to compare the survival times of different groups. Various tests have been proposed for testing for differences in survival between categorical covariates. 257 Comparison of Survival Curves Test whether subpopulations behave in the same way. Plot the corresponding estimates of the survivor functions on the same axes. Have a look at the following graph 258 Logrank Due to censoring , classical tests such as t-test and Wilcoxon test cannot be used for comparison of the survival times Various tests have been designed for comparison of survival curves, when censoring is present The most popular ones are: Logrank test Wilcoxon (Gehan) test The Logrank test has more power than Wilcoxon for detecting late differences. 259 Logrank Compute the following Quantities: n1i n2i di ( ni − di ) n1i di e1i = v1i = Where di = d1i + d 2i and ni = n1i + n2i 2 ni ni ( ni − 1) O1 − = E1 k ∑(d −e ) = V k ∑v 1i 1i 1 1i i 1 =i 1 Compute the "Z"-Statistic (Software packages often square this to get a Chi-Square): TMH = 2 χ MH O1 − E1 ~ N (0,1) Under H 0 : No differences in Survival Functions V1 (O1 − E1 ) 2 χ12 = V1 Alternative (less preferred, but easier computationally) method: k O2 = ∑ d 2i E2 = O1 + O2 − E1 i =1 Compute the Chi-Square statistic: X2 ( O1 − E1 ) E1 2 + ( O2 − E2 ) E2 2 ~ χ12 Under H 0 : No differences in Survival Functions 260 Logrank Tests the null hypothesis that the survival curves in the two groups are the same. Example for logrank on the following factious data where + indicates censored observation Treatment Old 3, 5, 7, 9+, 18 Treatment New 12, 19, 20, 20+, 33+ Check whether there is a difference on the survival time of individuals on the new and old treatment using log-rank test. 
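Before working through the hand calculation on the next slide, note that the same log-rank test can be run in R with survdiff(); the chi-square it reports should agree closely with the value computed by hand below.

library(survival)
time   <- c(3, 5, 7, 9, 18, 12, 19, 20, 20, 33)
status <- c(1, 1, 1, 0, 1,  1,  1,  1,  0,  0)    # 0 marks the censored (+) observations
group  <- rep(c("Old", "New"), each = 5)
survdiff(Surv(time, status) ~ group)              # rho = 0 (the default) gives the log-rank test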
261 Example Logrank Days Trt Old Trt New at risk at (n1i) risk(n2i) Trt Old died (d1i) Trt New died (d2i) Expected 3 5 5 1 0 0.200 0.2500 5 4 5 1 0 0.444 0.2469 7 3 5 1 0 0.375 0.2344 9+ 2 5 0 0 0.000 0.0000 12 1 5 0 1 0.000 0.1389 18 1 4 1 0 0.200 0.1600 19 0 4 0 1 0.000 0.0000 20 0 3 0 1 0.000 0.0000 33 0 1 0 0 0.000 0.0000 1.219 1.0302 Total 4 O1 − E1 ) − 1.219 ) (= (4 = 2 2 = χ MH V1 e1i Variance V1i 2 1.0302 7.51 Which indicates significant difference at 5% n1i n2i di ( ni − di ) n1i di = e1i = v1i ni ni2 ( ni − 1) 262 General Expression for two groups 2 m ∑ wi (d1,(i ) − eˆ1,(i ) ~ χ2 TestStatistic = i =1 m 1 2 w ∑ i vˆ1,(i ) i =1 The log-rank test uses wi = 1. It puts emphasis on larger values of time. The (generalised) Wilcoxon test uses wi = nj It puts emphasis on smaller values of time. The Tarone–Ware test uses wi = √n(i−). It puts emphasis on intermediate values of time. m is the disticnt events time. 263 Practicing Exercise Exercise: Pollock et al. (1989) radio-tagged 18 quail (Colinus virginianus L.) and followed their survival. The following are death or censoring(+) times in weeks: 3, 3, 6, 8, 8+, 9, 9+, 9+, 10, 10+, 12+, 13+, 13+, 13+, 13+, 13+, 13+,13+. Construct the Kaplan–Meier estimate of the survival function, the variance of this estimate, and a 95% confidence interval (plain, log and log-log). 264 Parametric Survival Models The Kaplan-Meier estimator is a very useful tool for estimating survival functions. Sometimes, we may want to make more assumptions that allow us to model the data in more detail. By specifying a parametric form for S(t), we can easily compute selected quantiles of the distribution estimate the expected failure time derive a concise equation and smooth function for estimating S(t), H(t) and h(t) estimate S(t) more precisely than KM assuming the parametric form is correct! 265 Exponential Distribution Characterized by one parameter λ> 0 Leads to a constant hazard function, h(t)=λ The survival function is, S(t)= exp(-λt) The density function is, f(t)= λexp(-λt) Memory loss property; P(T ≥t+z | T ≥ t) = P(T≥z) Not reasonable in many applications Hazards usually not constant over time 266 Exponential Distribution An empirical check of the exponential distribution for a set of survival data is provided by plotting the log of the survival function estimate versus log time. Such a plot should approximate a straight line through the origin . 267 Weibul Distribution Two parameters: α- shape parameter >0 β - scale parameter >0 S0(t)=exp(-βtα) f0(t)=αβtα-1exp(-βtα) h0(t)=αβtα-1 268 Weibull Distribution hazard decreases monotonically with time if α< 1 hazard increases monotonically with time if α> 1 hazard is constant over time if α = 1 (exponential case): second parameter makes it more flexible than exponential An empirical check for the Weibull distribution is provided by a plot of the log-log estimate versus log time. The plot should give approximately a straight line. 269 Modeling Survival Data We could use a semi-parametric model, one where the baseline hazard rate is not specified. The most common semi-parametric model is the Cox Proportional Hazards model (Cox, 1972), typically called the Cox Model or PH regression. The hazard rate is simply evaluated at every data point in light of the covariates. Cox Regression builds a predictive model for time-to-event data 270 Modeling Survival Data Note that information from censored subjects contributes usefully to the estimation of the model. 
Proportional hazards assumption: the hazard for any individual is a fixed proportion of the hazard for any other individual Multiplicative risk P (t ≤ T < t + ∆t / T ≥ t ) h(t ) = lim ∆t → 0 ∆t 271 Modeling Survival Data In words: the probability that if you survive to t, you will succumb to the event in the next instant. f (t ) Hazard from density and survival : h(t) = S (t ) 272 Components of Cox PH A baseline hazard function that is left unspecified but must be positive (=the hazard when all covariates are 0) A linear function of a set of k fixed covariates that is exponentiated. (=the relative risk) hi (t ) = h0 (t )e β1 xi 1 +...+ β k xik Can take on any form! log= hi (t ) log h0 (t ) + β1 xi1 + ... + β k xik 273 Cox PH Model hi (t ) = h0 (t )e β1 xi1 +...+ β k xik log= hi (t ) log h0 (t ) + β1 xi1 + ... + β k xik Hazard for person i (eg a smoker) HR = i, j Hazard ratio hi (t ) = h j (t ) h0 (t )e β1 xi1 +...+ β k xik β1 ( xi 1 − x j 1 ) +...+ β1 ( xik − x jk ) e = β x +...+ β k x jk h0 (t )e 1 j 1 Hazard for person j (eg a non-smoker) 274 Parameter interpretation β (1) + β (60) age hi (t ) h0 (t )e smoking β smoking (1− 0) HRlung cancer / smoking = = = e h j (t ) h0 (t )e β smoking (0) + βage (60) HRlung cancer / smoking = e β smoking This is the hazard ratio for smoking adjusted for age. β (0) + β (70) age hi (t ) h0 (t )e smoking β age (70 − 60) = = HRlung cancer /10− years increase= e in age h j (t ) h0 (t )e β smoking (0) + βage (60) HRlung cancer /10− years increase in age = e β age (10) This is the hazard ratio for a 10-year increase in age, adjusted for smoking. Exponentiating a continuous predictor gives you the hazard ratio for a 1-unit increase in the predictor. 275 Example Study on Systolic Hypertension in the Elderly Program (SHEP); This was a study of 4,736 persons over age 60 with isolated systolic hypertension (i.e., people with high systolic blood pressure and normal diastolic blood pressure) to see if treatment with a low-dose diuretic and/or betablocker would reduce the rate of strokes compared with the rate in the control group treated with placebo. Variables Coefficient Se Exp(Coeff) Race -0.1031 0.2607 0.90 Sex (male) 0.1707 0.1952 1.19 Age 0.0598 0.01405 1.06 History of diabetes 0.5322 0.2397 1.70 Smoking (Baseline) 0.6214 0.2390 1.86 Interpreting results This means that a person with untreated systolic hypertension who has a history of diabetes has 1.7 times the risk of having a stroke than a person with the same other characteristics but no diabetes. This can also be stated as a 70% greater risk. The risk at age=5 is 1.35, There is a 35% increase in risk of future stroke per 5-year greater age at baseline, controlling for all the other variables in the model. Chapter Five Disease Screening 278 Introduction Screening is systematic application of a test or investigation to people Screening is the process by which unrecognized disease or defects are identified, using tests which can be applied rapidly and on to large numbers of people. 
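Returning to the Cox proportional hazards model described above: a minimal fit in R, reusing the aml data loaded earlier in the Kaplan-Meier example (its covariate x is the maintenance-chemotherapy group), is sketched below; exp(coef) is the hazard ratio, interpreted as in the SHEP example.

library(survival)
fit <- coxph(Surv(time, status) ~ x, data = aml)
summary(fit)    # coef = log hazard ratio, exp(coef) = hazard ratio with 95% CI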
Introduction screening programm can be population screening (sometimes referred to as ‘mass screening’), in which the aim is to screen everyone in a particular population all newborn babies everyone over the age of 50 years ‘individual screening’ or ‘targeted screening’ frequent eye-tests are carried out on people with diabetes Introduction When examining a screening test we tend to look most closely at its: Validity: compare the screening test against some “gold standard” and as a measure calculate: Sensitivity Specificity Reproducibility: do the tests repeatedly in the same individuals and calculate measures of: Intrasubject Variation Interobserver Variation-example Kappa measures of agreement Efficacy: use following measures Positive Predictive Value and Positive Predictive Value Sensitivity and Specificity Sensitivity- is probability that a person having the disease is detected by the test = P (test positive | they have the disease) = P(T+|D+) Specificity- is probability that a person who does not have the disease is classified that way by the test = P(test negative | they don’t have the disease) = P(T-|D-) Sensitivity and Specificity Disease “Gold standard” Test Result Present Absent Total Positive TP FP All who test + Negative FN TN All who test - Total All with the disease All without the disease TP Sensitivity = TP + FN TN specificity = FP + TN Negative predictive and Positive Predictive values For a measure of the efficacy of the test we use Positive Predictive Value – is probability that someone who tests positive for the disease will actually have the disease = P (have disease | positive test result) Negative Predictive Value- is probability that someone who tests negative for the disease will actually have no the disease =P (don’t have disease | negative test result) Negative predictive and Positive Predictive values Disease “Gold standard” Test Result Present Absent Total Positive TP FP All who test + Negative FN TN All who test - Total All with the disease All without the disease TP PPV = TP + FP TN NPV = FN + TN One of the reasons Positive Predictive Value is used as a measure of efficacy is because it depends on the prevalence of the disease Related Concepts to Screening False Positive: the test reports a positive result for a person who is disease free. The false positive rate is given by P(D-| T+)=c/(a+c). False Negative: the test reports a negative result for a person who actually has the disease. The false negative rate is given by: P(D+| T-)=b/(b+d). Related Concepts to Screening Which false result is the more serious depends on the situation. But we generally worry more about false positives in screening tests. We don't want to tell someone that they have a serious disease when they do not really have it. 
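The screening summaries above follow directly from a 2x2 table; the sketch below uses hypothetical counts (not from the notes) purely to show the arithmetic, including the false positive and false negative rates defined as P(D-|T+) and P(D+|T-).

TP <- 90; FP <- 40      # test positive: with disease / without disease
FN <- 10; TN <- 860     # test negative: with disease / without disease
sensitivity <- TP / (TP + FN)    # P(T+ | D+)
specificity <- TN / (FP + TN)    # P(T- | D-)
ppv <- TP / (TP + FP)            # P(D+ | T+)
npv <- TN / (FN + TN)            # P(D- | T-)
fp_rate <- FP / (TP + FP)        # P(D- | T+) = 1 - PPV
fn_rate <- FN / (FN + TN)        # P(D+ | T-) = 1 - NPV
round(c(sens = sensitivity, spec = specificity, PPV = ppv, NPV = npv,
        FP = fp_rate, FN = fn_rate), 3)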
Calculating False Positive and False Negative
Test result | Disease present ("gold standard") | Disease absent | Total
Positive | TP | FP | All who test +
Negative | FN | TN | All who test −
Total | All with the disease | All without the disease |
false positive rate = P(D− | T+) = FP / (TP + FP)
false negative rate = P(D+ | T−) = FN / (FN + TN)
Cutoff Point Used to screen for a quantitative risk factor: plasma glucose levels (diabetes), body mass index (obesity), blood pressure (hypertension). A perfect separation between groups is difficult; the distributions of the test results will overlap. Cutoff Point Lowering the cutoff point for the screening test will: increase true positives, increase sensitivity; decrease true negatives, decrease specificity. Raising the cutoff point for the screening test will: decrease true positives, decrease sensitivity; increase true negatives, increase specificity. Cutoff Point There are trade-offs between sensitivity and specificity! Whether it is worse to fail to detect some true cases because of lower sensitivity, or to misclassify some people as diseased because of lower specificity, depends highly on: the prevalence of the disease, the severity of the disease, the potential fatality of the disease, how good the test is, and the acceptability of the test to people. Receiver Operating Characteristic (ROC) curve The true positive rate (sensitivity) is plotted as a function of the false positive rate (100 − specificity) for different cut-off points. Each point on the ROC curve represents a sensitivity/specificity pair corresponding to a particular decision threshold. A test with perfect discrimination (no overlap in the two distributions) has a ROC curve that passes through the upper left corner (100% sensitivity, 100% specificity). Therefore the closer the ROC curve is to the upper left corner, the higher the overall accuracy of the test (Zweig & Campbell, 1993). Receiver Operating Characteristic (ROC) curve Chapter Six Clinical Trials 295 New Drug Development RX: from chemicals in test tubes to a new medicine widely used in humans? 296 Introduction to Clinical Trials Trial: is from the Anglo-French trier, meaning to try; broadly, it refers to the action or process of putting something to a test or proof 297 Introduction to clinical trial Clinical: is from clinic, from the French clinique, from the Greek klinike; it refers to the practice of caring for the sick at the bedside 298 Definition of clinical trial Narrowly defined: "the action or process of putting something to a test or proof at the bedside of the sick." Broadly defined: "A clinical trial may be defined as a carefully designed, prospective medical study which attempts to answer a precisely defined set of questions with respect to the effects of a particular treatment or treatments." 299 Old Paradigm of Clinicians Unsystematic observations (experience) are the best way of developing knowledge. Knowledge of pathophysiology coupled with common sense may effectively guide clinical practice. 300 New Paradigm of Clinicians Evidence-Based Medicine Clinical instincts are important, but they must be guided and modified by the results of carefully recorded unbiased observations. While pathophysiologic mechanisms are important, the response to therapy may occasionally be contrary to expectations. Clinical trials provide the most objective and unbiased data to guide therapeutic decisions. 301 What Is Evidence-Based Medicine? EBM is "the conscientious, explicit and judicious use of current best evidence in making decisions about the care of the individual patient.
It mean integrating individual clinical expertise with the best available external clinical evidence from systematic research.” David Sackett 302 Assessing the Benefit of a New Therapy Expert opinion Physiological concepts Clinical experience Retrospective studies Clinical trials The primacy of clinical trials can be traced to the rise of evidence based medicine 303 Evidence Based Medicine Case report: a demonstration only that some event of clinical interest is possible. Case series: a demonstration of certain possibly related clinical events but subject to large selection biases. Database analysis: treatment is not determined by experimental design but by factors such as physician or patient preference. The data are unlikely to have been collected specifically to evaluate efficacy. 304 Evidence Based Medicine Observational study: the investigator takes advantage of “natural” exposures or treatment selection and chooses a comparison group by design. Controlled clinical trials: the treatment is assigned by design. Endpoint ascertainment is actively performed and analyses are planned in advance 305 New Drug Development Assessing the benefit of a new therapy Clinical experiences Physiological concepts Clinical trials: primacy for EBM 306 RX 10 000 molecules 1 new product $1 billion 8 - 10 year (Acute treatment) 12 years (Chronic treatment) 307 New Drug Development Pharmaceutical development Preclinical Clinical Phase I trials Phase II trials Phase III trials Postmarketing surveillance Phase IV trials/study 308 Pharmaceutical developmentDiscovery of compound, synthesis and purification of drug substances, manufacturing procedures Pre-clinical (animal) studies Pharmacological profile, acute toxicity Investigational New Drug Application Phase I clinical trials Small; focus on safety Phase II clinical trials Medium size, focus on safety and short term efficacy Phase III clinical trials Large and comparative, focus on efficacy and cost benefits New Drug Application Phase IV clinical trials “real world” experience; demonstrate cost benefits; rare adverse reactions 309 Phase I Clinical Trials Small number of subjects (20-100) Focus on safety Pharmacokinetics Pharmacodynamics Toxicity For toxic drugs: maximum tolerated dose 310 Phase II Clinical Trials Usually several hundred patients with the medical condition Strict inclusion/exclusion criteria Focus on safety and short-term efficacy Clarify dose and dose regimen Basis for design of “pivotal” studies 311 Phase III Clinical Trials “Pivotal” for NDA submission Strict inclusion/exclusion criteria Large (hundreds - thousands of subjects) Comparative (two or more treatment groups) Focus on efficacy and cost benefits 312 Phase IV Clinical Trials Efficacy in routine clinical practice Assess unusual adverse reactions Demonstrate cost benefits No inclusion/exclusion criteria 313 Trial aims The main focuses of different types of studies Activity Efficacy Effectiveness Efficiency Main Focus Biological effect of the drug on the target system A sample welldefined patients Overall effect of the drug in a population at large Balance of costs and effects of the drug from a public health perspective Types of studies Preclinical studies and early clinical trials(phase I-II) Clinical trials (phase II-III) Late clinical trials(phase IV) Pharmaco economic studies 314 Organization of a Clinical Trial Planning the study – Formulating the hypothesis – Choosing the endpoint – Choosing the design and sample size Conduct of the study – Patient accrual – Data collection Data 
analysis Publication of results 315 Elements of a Design Use of placebo/control group “Blinding” Randomization Early stopping Parallel groups vs. cross-over trials Testing for superiority/equivalence 316 Use of Control Group To obtain the information about mechanisms not related to treatment. Use of non-treated controls may be unethical and problematic. – Unethical if an effective treatment exists. – Problematic if the knowledge of treatment can affect evaluation of treatment effect. Use placebo 317 “Blinding” Concealing the treatment identity to prevent bias in treatment outcome evaluation. Open-label trials: both patients and clinicians know the assigned treatment. Single-blinded trials: the patient does not know the treatment (but the clinician does). Double-blinded trials: neither the patient nor the clinician knows the treatment. 318 Randomization Random assignment of treatment for patients in a clinical trial Goal: elimination of the effect of unobserved factors on response 319 How randomization works? Two treatments (A & B) assigned to patients with probability ½ each. Consider sex: – M males, on average ½ M get A and ½ M get B. – F females, on average ½ F get A and ½ F get B. – In each treatment group there will be ≈ ½ M / (½ M + ½ F) males. – Randomization should balance the distribution of all factors in the compared treatment groups. 320 Trial Participants Are a Selected Group Population of patients as defined by eligibility criteria (population P) Patients recruited (Sample A) Formal entry into trial (patients agreeing to participate: sample B) Eligible patients who, for reason, were not entered into the trial (often refused consent) Randomization Treatment group 1 Treatment group 2 Compare outcomes 321 Randomization Eliminates all sources of bias except accidental bias Tends to ensure balance among treatments with respect to known and unknown prognostic factors Guarantees the distributional assumptions of the test statistics and estimators 322 Bias in Randomization Selection bias Occurs if the allocation process is predictable. If any bias exists as to what treatment particular types of participants should receive, then a selection bias might occur. Accidental bias Can arise if the randomization procedure does not achieve balance on risk factors or prognostic covariates especially in small studies. 323 Sample Size Determination Ho and HA, How small a treatment difference is it important to detect and with what degree of certainty? ( δ, α and β.) Parameters used in calculation are estimates with uncertainty and often base on very small prior studies Population may be different Publication bias--overly optimistic Different inclusion and exclusion criteria 324 Sample Size Determination Quantities used in calculation: Variances mean values response rates difference to be detected Overestimated size: unfeasible early termination Underestimated size justify an increase extension in follow-up incorrect conclusion (WORSE) 325 What is α (Type I error)? The probability of erroneously rejecting the null hypothesis (Put an useless medicine into the market!) What is β (Type II error)? The probability of erroneously failing to reject the null hypothesis. (keep a good medicine away from patients!) is β (Type II error)? 326 What is Power ? Power quantifies the ability of the study to find true differences of various values of δ. Power = 1- β=P (accept H1|H1 is true) the chance of correctly identify H1 (correctly identify a better medicine) What is δ? 
δ is the minimum difference between groups that is judged to be clinically important 327 Sample Size Calculation H0: δ = µt − µc = 0 HA: δ = µt − µc ≠ 0 N = 2(Zα/2 + Zβ)² σ² / δ² per group. Multiply the above number by 2 to get the total number of patients in the trial. 328 Example An investigator wishes to estimate the sample size necessary to detect a 10 mg/dl difference in cholesterol level in a diet intervention group compared to the control group. The standard deviation, from other data, is estimated to be 50 mg/dl. For a two-sided 5% significance level, Zα/2 = 1.96, and for 90% power, Zβ = 1.282. 2N = 4(1.96 + 1.282)²(50)²/10² ≈ 1050 329 Interpretation of Sample Size A sample size of 525 in each group will have 90% power to detect a difference in means of 10.0, assuming that the common standard deviation is 50.0, using a two-group t-test with a 0.05 two-sided significance level. 330 Chapter Seven Research Ethics 331 Introduction An 'ethic' is a moral principle or a code of conduct which … governs what people do. It is concerned with the way people act or behave. "The term 'ethics' usually refers to the moral principles, guiding conduct, which are held by a group or even a profession (though there is no logical reason why individuals should not have their own ethical code)" (Wellington, 2000: 54) "Ethical concerns should be at the forefront of any research project and should continue through to the write-up and dissemination stages" (Wellington, 2000: 3) 332 Ethical Principles Guides to moral behavior Good: honesty, keeping promises, helping others, respecting the rights of others Bad: lying, stealing, deceiving, harming others Universality of ethical principles: they should apply in the same manner in all countries, cultures, communities Relativity of ethical principles: they vary from country to country, community to community 333 Ethical Principles Meanings given to ethics are relative to time, place, circumstance, and the person involved Research ethics serves to protect the rights and welfare of research participants and to protect the wider society or community within which the research is being conducted 334 Protection of Human Research Subjects Research: means "a systematic investigation, including research development, testing, or evaluation, designed to develop or contribute to generalizable knowledge" Human subject: means "a living individual about whom an investigator … conducting research obtains data through intervention or interaction with the individual, or identifiable private information" 335 Protection of Human Research Subjects Protection of human subjects is based upon the principles: Respect for human dignity Respect for free and informed consent Respect for vulnerable persons Respect for privacy and confidentiality Respect for justice and inclusiveness Balancing harms and benefits: minimizing harm and maximizing benefit 336 Protection of Human Research Subjects Respect for persons Every person has the right to determine what shall happen to him or her – participation must be voluntary Protect the multiple and interdependent interests of the person (bodily, psychological, cultural integrity) Special consideration and protection is extended to "vulnerable" subjects such as children, persons with cognitive disabilities, prisoners, and institutionalized persons 337 Protection of Human Research Subjects Beneficence No person shall be placed at risk unless the risks are reasonable in relation to the anticipated benefits Justice Risks and benefits should be justly distributed – who ought to receive the benefits of research and who should bear
its burdens? 338 Informed Consent What is consent? Defined as permission, approval, or assent What is informed consent? Consent given by the patient based on knowledge of the procedure to be performed, including its risks and benefits, as well as alternatives to the proposed treatment/action. Presumption that individuals have capacity and right to make free and informed decisions In research = dialogue, process, rights, duties, requirements for free and informed consent by the research subject 339 Informed Consent Participants should have access to information about the aims and objectives of any research in which they are involved, including sources of help, advice, support and treatment if they experience any ill effects of participation The research cannot proceed without consent Informed consent must be maintained throughout 340 Informed Consent Informed Consent allows individuals: To determine whether participating in research fits with their values and interests. To decide whether to contribute to this specific research project. To protect themselves from risks. To decide whether they can fulfill the requirements necessary for the research. 341 Informed Consent Informed consent should involve the provision or collection of information on: Name and contact details of researcher Name and contact details of participant Aims and objectives of the research project Role of the participant in the research project Treatment of material/information collected Potential risks to the participant Sources of advice/help/support/treatment Voluntary participation and freedom to withdraw 342 Research Integrity “Integrity" means "firm adherence to a code, especially moral or artistic values; incorruptibility.“ Research integrity includes: the use of honest and verifiable methods in proposing, performing, and evaluating research reporting research results with particular attention to adherence to rules, regulations, guidelines, and following commonly accepted professional codes or norms. 343 Research Integrity “Research misconduct means fabrication, falsification, or plagiarism in proposing, performing, or reviewing research, or in reporting research results.” In research misconduct there must be a significant departure from accepted practices of the relevant research community. The misconduct must be committed intentionally, knowingly, or recklessly 344 Research Integrity Shared values in scientific research are: Honesty: convey information truthfully and honoring commitments Accuracy: report findings precisely and take care to avoid errors Efficiency: use resources wisely and avoid waste Objectivity: let the facts speak for themselves and avoid improper bias 345 Authorship Policies Authorship of a research publication is an acknowledgement of the substantial contribution made by a researcher. It carries with it both recognition of work done and responsibility for the material contributed. Authorship must therefore be attributed with due regard for the appropriate conventions. All persons designated as authors should qualify for authorship, and all those who qualify should be listed. 346 Authorship Policies Authorship credit for original, research-based works (in any medium) may be based on: 1. Substantial contributions to conception and design, or acquisition of data, or analysis and interpretation of data; 2. Drafting the article or revising it critically for important intellectual content; 3. 
Sufficient participation in the work to take public responsibility for appropriate portions of the content; and 4. Final approval of the version to be published. 347 Authorship Policies Authors should meet conditions 1, 2, 3, and 4. Other contributions such as provision of a key reagent, or collection of data may also be considered as long as conditions 2, 3 and 4 are met. Authorship credit for reviews or commentaries not based in original research should be based on conditions 2, 3 and 4. 348 Authorship Policies Acquisition of funding, collection of data (for example, from a fee- for-service core facility), or general supervision of the research group (e.g. by former or current mentors not directly involved in the conception or execution of the publication), alone, does not justify authorship. Financial and material support should be disclosed. All contributors who do not meet the criteria for authorship should be listed in an acknowledgments section. 349 Authorship Policies “Ghost-writing,” a practice whereby a commercial entity or its contractor writes an article or manuscript and a scientist is listed as an author, is not permissible. Making minor revisions to an article or manuscript that is ghost-written does not justify authorship. All contributors who do not meet the criteria for authorship should be listed in an acknowledgments section. 350 Data and Safety Monitoring All clinical investigations, including physiologic, toxicity, and dose-finding studies (phase I); efficacy studies (phase II); efficacy, effectiveness and comparative trials (phase III), involving greater than minimal risk to participants (i.e., full Committee review) are, at a minimum, required to develop a data and safety monitoring plan to assure the safety and welfare of the research subjects. The method and degree of monitoring needed is related to the degree of risk involved. be concluded successfully. 351 Data and Safety Monitoring A Data Safety Monitoring Committee (DSMB) is usually required to determine safe and effective conduct and to recommend conclusion of the trial when significant benefits or risks have developed or the study is unlikely to be concluded successfully. 352 Data and Safety Monitoring Risk associated with participation in research must be minimized to the extent practical. Monitoring should be commensurate with size and complexity of the study. Monitoring may be conducted in various ways or by various individuals or groups, depending on the size and scope of the research effort. 353 Data and Safety Monitoring Purposes of DSMB Identify high rates of ineligibility determined after randomization Identify protocol violation that suggests clarification of changes to protocol are needed Identify unexpectedly high drop out rates that threaten the trials ability to produce credible results Ensure validity of study results 354 THANK YOU 355