Introduction
• Populations and Samples
– Population - Set of all individuals or units of interest to investigators. Sometimes we refer to a population of measurements rather than of individuals or units.
– Sample - Subset of a population that is observed and measured by investigators.

Quantitative and Qualitative Variables
• Quantitative variables take on numeric values. They can be further classified as:
– Continuous variables can take on any value along an interval (e.g. blood pressure, temperature)
– Discrete variables can take on only distinct values with “breaks” between them (e.g. woman’s parity, number of prior cardiac events)
• Qualitative variables take on categories. They can be classified as:
– Nominal variables take on categories with no inherent ordering (e.g. presence/absence of parasite, gender, race)
– Ordinal variables take on categories that can be ordered (e.g. prognosis, attitude toward a proposal)

Dependent and Independent Variables
• Dependent variables are outcomes of interest to investigators. Also referred to as Responses or Endpoints
• Independent variables are Factors that are often hypothesized to affect the outcomes (levels of dependent variables). Also referred to as Predictor or Explanatory Variables
• Research questions: Does the I.V. affect the D.V.?

Example - Clinical Trials of Cialis
• Clinical trials conducted worldwide to study efficacy and safety of Cialis (Tadalafil) for erectile dysfunction (ED)
• Patients randomized to Placebo, 10mg, or 20mg
• Co-Primary outcomes:
– Change from baseline in the erectile function domain of the International Index of Erectile Function (Numeric)
– Response to: “Were you able to insert your P… into your partner’s V…?” (Nominal: Yes/No)
– Response to: “Did your erection last long enough for you to have successful intercourse?” (Nominal: Yes/No)
Source: Carson, et al. (2004).

Example - Clinical Trials of Cialis
• Population: All adult males suffering from erectile dysfunction
• Sample: 2102 men with mild-to-severe ED in 11 randomized clinical trials
• Dependent Variable(s): Co-primary outcomes listed on previous slide
• Independent Variable: Cialis Dose (0, 10, 20 mg)
• Research Question: Does use of Cialis improve erectile function?

Parameters and Statistics
• Parameters: Numerical descriptive measures for Populations:
– $\mu$ - Mean (average) of a numeric variable
– $\sigma^2$ - Variance of a numeric variable
– $\sigma$ - Standard deviation of a numeric variable
– CV - Coefficient of variation of a numeric variable
– $p$ - Proportion of population with a nominal characteristic

Parameters and Statistics
• Statistics: Numerical descriptive measures for Samples
– Sample Mean (of a sample of size n): $\bar{y} = \hat{\mu}_y = \frac{\sum y}{n}$
– Sample Variance ($s^2$) and standard deviation ($s$): $s^2 = \frac{\sum (y - \bar{y})^2}{n - 1}$, $s = \sqrt{s^2}$
– Sample coefficient of variation (cv): $cv = \frac{s}{\bar{y}} \cdot 100\%$
– Sample Proportion with a characteristic: $\hat{p}$

Example - Carbonate of Bismuth
• Samples of Carbonate of Bismuth from a sample of 6 London manufacturing chemists
• Measurements: Quantity of Teroxide (theoretically should be 88.30 per 100 parts)
• Measured levels: 89, 88.5, 86.16, 87.66, 87.66, 86
$\bar{y} = \frac{89 + 88.5 + 86.16 + 87.66 + 87.66 + 86}{6} = 87.50$
$s^2 = \frac{(89 - 87.5)^2 + (88.5 - 87.5)^2 + (86.16 - 87.5)^2 + (87.66 - 87.5)^2 + (87.66 - 87.5)^2 + (86 - 87.5)^2}{6 - 1} = 1.47$
$s = \sqrt{1.47} = 1.21$
$cv = \frac{1.21}{87.5} \cdot 100\% \approx 1.39\%$ (computed from the unrounded mean and standard deviation)
Source: Umney (1864)

Example - Clinical Trials of Cialis
• Among the 638 patients receiving placebo (dose=0), 198 responded “Yes” to “Did your erection last long enough for you to have successful intercourse?”
• Of 321 receiving 10mg dose, 186 replied “Yes”
• Of 1143 receiving 20mg dose, 777 replied “Yes”
$\hat{p}_0 = \frac{198}{638} = 0.31$
$\hat{p}_{10} = \frac{186}{321} = 0.58$
$\hat{p}_{20} = \frac{777}{1143} = 0.68$
Note that proportions are often reported as percentages (number with characteristic per 100 exposed) or as rates per 10,000, such as mortality rates for rare causes

Graphical Techniques
• Pictures are worth a bunch of words and computer packages make graphing easy!
– Histograms show the number or percent by category or within ranges of values
– Pie charts show proportionally the number or percent by category or within ranges of values
– Scatterplots plot a dependent variable on the vertical axis versus an independent variable on the horizontal axis, with each subject being a point on the chart

Histogram of ED Severity Level
• In the Cialis trial, the baseline severity level was reported for 2099 patients on an ordinal scale: 1=Normal, 2=Mild, 3=Moderate, 4=Severe
[Figure: histogram of SEVERITY (1.0 to 4.0), cases weighted by PATIENTS; Mean = 2.9, Std. Dev = .92, N = 2099]

Pie Chart of ED Severity Level
[Figure: pie chart with slices Normal, Mild, Moderate, Severe; cases weighted by PATIENTS]

Histogram of Disposition by Dose (Count=%)
[Figure: bar chart of Count versus dispose, grouped by dose (Placebo, 10mg, 20mg); bars show counts]
Disposition: 1=Completed, 2=Adverse event, 3=Lack of Efficacy, 4=Lost to follow-up, 5=Patient Decision, 6=Protocol Violation, 7=Others

Scatterplot of Math Score vs LSD Level
• Response - Mean Math score for 7 subjects
• Predictor - Mean LSD Concentration
[Figure: bivariate scattergram of mathscore (vertical axis) versus lsdconc (horizontal axis)]
Conc | Score
1.17 | 78.93
2.97 | 58.20
3.26 | 67.47
4.69 | 37.47
5.83 | 45.65
6.00 | 32.92
6.41 | 29.97
Source: Wagner and Bing (1968)

Basic Probability
• Probability measures the likelihood or chances of particular outcomes (or events) of a random experiment or observation
• Let A and B be two events, with probabilities P(A) & P(B):
– Intersection - Event that both A and B occur (Notation: $A \cap B$)
– Union - Event that A or B or both occur (Notation: $A \cup B$)
– Complement - Event that the event does not occur (Notation: $\bar{A}$)
• Probability Rules:
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$ = P(A occurs given B has occurred)
$P(A \cap B) = P(A) P(B \mid A) = P(B) P(A \mid B)$
$P(\bar{A}) = 1 - P(A)$

Example - High Cholesterol By Age and Sex
• WHO MONICA Survey of 50000 Adults
• Proportions by Age, Gender, and Cholesterol:
Age | Male, High Chol | Male, Low Chol | Female, High Chol | Female, Low Chol | Total
25-29 | 0.0066 | 0.0486 | 0.0046 | 0.0527 | 0.1125
30-34 | 0.0111 | 0.0542 | 0.0056 | 0.0563 | 0.1272
35-39 | 0.0167 | 0.0476 | 0.0079 | 0.0582 | 0.1304
40-44 | 0.0196 | 0.0457 | 0.0100 | 0.0526 | 0.1279
45-49 | 0.0207 | 0.0440 | 0.0172 | 0.0490 | 0.1309
50-54 | 0.0222 | 0.0430 | 0.0256 | 0.0400 | 0.1308
55-59 | 0.0229 | 0.0425 | 0.0303 | 0.0342 | 0.1299
60-64 | 0.0185 | 0.0344 | 0.0304 | 0.0280 | 0.1113
Total | 0.1383 | 0.3600 | 0.1316 | 0.3710 | 1
Source: Gostynski, et al (2004)

Example - High Cholesterol By Age and Sex
• Probability a Randomly Selected Subject is Male:
$P(M) = P(M \,\&\, HC) + P(M \,\&\, LC) = .1383 + .3600 = .4983$
• Probability a Randomly Selected Subject is 40 years or older:
$P(\geq 40) = P(40\text{-}44) + P(45\text{-}49) + P(50\text{-}54) + P(55\text{-}59) + P(60\text{-}64) = .1279 + .1309 + .1308 + .1299 + .1113 = .6308$
• Probability Female given subject has High Cholesterol:
$P(F \,\&\, HC) = .1316$
$P(HC) = P(M \,\&\, HC) + P(F \,\&\, HC) = .1383 + .1316 = .2699$
$P(F \mid HC) = \frac{P(F \,\&\, HC)}{P(HC)} = \frac{.1316}{.2699} = .4876$

Independence
• Two events A and B are independent if: P(A|B) = P(A) or, equivalently, P(B|A) = P(B)
• Cholesterol Example:
$P(F) = 1 - P(M) = 1 - .4983 = .5017$
$P(F \mid HC) = .4876$
Since P(F | HC) differs from P(F), the occurrence of high cholesterol is not independent of gender

Diagnostic Tests
• True state: Disease Present (D+) or Absent (D-) based on a gold standard
• Diagnostic test result: Positive (T+) or Negative (T-)
• Subjects can be classified in the following table (where a, b, c, and d are the numbers of subjects in the 4 cells):
Test Result \ True State | Positive (D+) | Negative (D-) | Total
Positive (T+) | a | b | a+b
Negative (T-) | c | d | c+d
Total | a+c | b+d | a+b+c+d

Diagnostic Tests
• Sensitivity - The ability of the test to detect that the disease is present: P(T+ | D+)
• Specificity - The ability of the test to detect that the disease is absent: P(T- | D-)
• Positive Predictive Value (PPV) - Proportion of positive test results that actually have the disease
• Negative Predictive Value (NPV) - Proportion of negative test results that do not have the disease
• Overall Accuracy - Proportion of subjects who are correctly diagnosed

Diagnostic Tests
Using the cell counts a, b, c, and d from the table above:
Sensitivity: $P(T^+ \mid D^+) = \frac{a}{a + c}$
Specificity: $P(T^- \mid D^-) = \frac{d}{b + d}$
* PPV: $P(D^+ \mid T^+) = \frac{a}{a + b}$
* NPV: $P(D^- \mid T^-) = \frac{d}{c + d}$
* Accuracy: $\frac{a + d}{a + b + c + d}$
* Assuming the prevalence rate in test subjects is the same as in the population

Example - Paracheck Test for Plasmodium Falciparum (Pf)
• Goal: Develop an inexpensive test for Pf in asymptomatic children in remote parts of India
• Gold Standard: Microscopy
• Diagnostic Test: Paracheck ($0.65/test)
Test Result \ True State | Positive (D+) | Negative (D-) | Total
Positive (T+) | 119 | 49 | 168
Negative (T-) | 7 | 398 | 405
Total | 126 | 447 | 573
Source: Singh, et al (2002)

Example - Paracheck Test for Plasmodium Falciparum (Pf)
Sensitivity $= \frac{119}{126} = .9444$ (94.44%)
Specificity $= \frac{398}{447} = .8904$ (89.04%)
PPV $= \frac{119}{168} = .7083$ (70.83%)
NPV $= \frac{398}{405} = .9827$ (98.27%)
Accuracy $= \frac{119 + 398}{573} = .9023$ (90.23%)

Basic Study Designs
• Studies can generally be classified as observational or experimental
– Observational - Subjects (or nature) select their groups (levels of the independent variable)
    • Studies comparing ethnicities or sexes with respect to drug disposition
    • Studies of effects of smoking or other behaviors
    • Studies comparing outcomes of patients on different therapies
– Experimental - Researchers assign subjects to treatment groups
    • Clinical trials with patients being randomized to active drug or placebo. Typically double-blind (patient/assessor)

Observational Studies
• Case-Control -- Subjects are identified based on presence/absence of the outcome of interest (D.V.). It is then determined whether each subject had been exposed to the risk factor (I.V.). Retrospective Studies.
• Cohort -- Subjects are identified by risk factor or treatment (I.V.) and followed over time to observe the outcome (D.V.). Prospective Studies.
• Cross-sectional -- Subjects are sampled at random from the population and levels of both I.V. and D.V. are observed simultaneously. Many studies based on large medical databases are cross-sectional

Example - Case-Control Study
• Purpose: Study Risk Factors of Hepatitis-A in Hispanic Children living in U.S. on Mexican border (San Diego, CA)
• Cases: 132 Children with Hepatitis-A
• Controls: 354 Children without Hepatitis-A
• Risk Factors:
– Travel outside U.S. (67% of cases, 25% of controls)
– Eating food at taco stand/street vendor on travel
– Eating salad/lettuce on travel
Source: Weinberg, et al (2004)

Example - Cohort Study
• Purpose: Determine whether male adolescents who develop schizophrenia were more likely to smoke prior to onset
• Subjects: Israeli male military recruits not suffering major psychopathology who completed a smoking questionnaire
• Cohorts: 4052 smokers, 10196 non-smokers
• Follow-up/outcome: 4-16 year follow-up for onset of schizophrenia (20 smokers, 24 non-smokers)
Source: Weiser, et al (2004)

Example - Cross-Sectional Study
• Purpose - Investigate effect of high altitude on maternal hemorheology
• Subjects - Pregnant and non-pregnant women at high altitude and at sea level
• Measurements - Blood/Plasma viscosities, Hematocrit, total protein, Fibrinogen, Albumin
• Selected Findings - Blood and Plasma viscosities are higher in pregnant and non-pregnant women at higher altitudes
Source: Kametas, et al (2004)

Experimental Studies
• Randomized Clinical Trials - Studies where investigators assign subjects at random to treatments
• Special Cases (more than one may apply):
– Parallel Groups - Each subject receives only one treatment
– Crossover - Each subject receives each treatment (in random order)
– Placebo Controlled - One group receives only a placebo
– Double Blind - Neither subject nor assessor is aware of which treatment is received
– Double Dummy - Subjects receive regimens that are similar in appearance, used when the different drugs look different
– Intention-to-Treat - Analysis is based on all subjects randomized, including those lost to follow-up
– Completed Protocol - Analysis is based only on subjects who completed the study

Example - Randomized Clinical Trial
• Purpose - Compare three treatments for primary dysmenorrhea in women
• Subjects - 337 women (18-40) suffering dysmenorrhea during past 3 consecutive menstrual cycles
• Treatments (Parallel Groups, double-blind, double-dummy)
– Group 1:
    • 1 tablet meloxicam 7.5mg o.a.d.
    • 1 tablet placebo matching meloxicam 15mg o.a.d.
    • 1 tablet placebo matching mefenamic acid 500mg t.i.d.
– Group 2:
    • 1 tablet meloxicam 15mg o.a.d.
    • 1 tablet placebo matching meloxicam 7.5mg o.a.d.
    • 1 tablet placebo matching mefenamic acid 500mg t.i.d.
– Group 3:
    • 1 tablet mefenamic acid 500mg t.i.d.
    • 1 tablet placebo matching meloxicam 7.5mg o.a.d.
    • 1 tablet placebo matching meloxicam 15mg o.a.d.
• Outcomes: Ordinal global assessment of safety/tolerability by patients and investigators (Good, Satisfactory, Not satisfactory, Bad)
Source: de Mello, et al (2004)
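
Worked check in code
The sample statistics from the Carbonate of Bismuth example and the diagnostic test measures from the Paracheck example can be recomputed with a short script. The following is a minimal Python sketch, not part of the original slides: the function names `sample_stats` and `diagnostic_measures` are hypothetical, and the cell labels a, b, c, d are assumed to follow the Diagnostic Tests table (a = T+ and D+, b = T+ and D-, c = T- and D+, d = T- and D-).

```python
import math


def sample_stats(values):
    """Return (mean, variance, sd, cv in percent) using the n - 1 divisor."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((y - mean) ** 2 for y in values) / (n - 1)
    sd = math.sqrt(variance)
    cv = 100.0 * sd / mean
    return mean, variance, sd, cv


def diagnostic_measures(a, b, c, d):
    """Compute test measures from the four cell counts of a 2x2 table.

    Assumed layout: a = T+ and D+, b = T+ and D-, c = T- and D+, d = T- and D-.
    """
    total = a + b + c + d
    return {
        "sensitivity": a / (a + c),   # P(T+ | D+)
        "specificity": d / (b + d),   # P(T- | D-)
        "ppv": a / (a + b),           # P(D+ | T+)
        "npv": d / (c + d),           # P(D- | T-)
        "accuracy": (a + d) / total,  # proportion correctly diagnosed
    }


# Teroxide measurements from the Carbonate of Bismuth example
bismuth = [89, 88.5, 86.16, 87.66, 87.66, 86]
mean, variance, sd, cv = sample_stats(bismuth)
print(f"mean={mean:.2f} s2={variance:.2f} s={sd:.2f} cv={cv:.2f}%")

# Cell counts from the Paracheck example
paracheck = diagnostic_measures(a=119, b=49, c=7, d=398)
for name, value in paracheck.items():
    print(f"{name}: {value:.4f}")
```

With these inputs the script gives a mean of about 87.50, a variance of about 1.47, a standard deviation of about 1.21, and a cv of about 1.39%. For the Paracheck counts it gives sensitivity 119/126 ≈ .9444, specificity 398/447 ≈ .8904, PPV 119/168 ≈ .7083, NPV 398/405 ≈ .9827, and accuracy 517/573 ≈ .9023. As with the formulas, PPV, NPV, and accuracy are only meaningful for the population if the prevalence among test subjects matches the population prevalence.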