Logistic and Nonlinear Regression
• Logistic Regression - Dichotomous response variable and numeric and/or categorical explanatory variable(s)
  – Goal: Model the probability of a particular outcome as a function of the predictor variable(s)
  – Problem: Probabilities are bounded between 0 and 1
• Nonlinear Regression: Numeric response and explanatory variables, with a non-straight-line relationship
  – Biological (including PK/PD) models are often based on a known theoretical shape with unknown parameters

Logistic Regression with 1 Predictor
• Response - Presence/Absence of characteristic
• Predictor - Numeric variable observed for each case
• Model - $p(x)$ = Probability of presence at predictor level $x$:
  $$p(x) = \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}}$$
• $\beta = 0$: P(Presence) is the same at each level of $x$
• $\beta > 0$: P(Presence) increases as $x$ increases
• $\beta < 0$: P(Presence) decreases as $x$ increases

Logistic Regression with 1 Predictor
• $\alpha$, $\beta$ are unknown parameters and must be estimated using statistical software such as SPSS, SAS, or STATA
• Primary interest is in estimating and testing hypotheses regarding $\beta$
• Large-Sample test (Wald Test):
• $H_0: \beta = 0$   $H_A: \beta \neq 0$
  T.S.: $X^2_{obs} = \left( \hat{\beta} / \hat{\sigma}_{\hat{\beta}} \right)^2$
  R.R.: $X^2_{obs} \geq \chi^2_{\alpha,1}$
  P-val: $P(\chi^2 \geq X^2_{obs})$

Example - Rizatriptan for Migraine
• Response - Complete Pain Relief at 2 hours (Yes/No)
• Predictor - Dose (mg): Placebo (0), 2.5, 5, 10

  Dose (mg)   # Patients   # Relieved   % Relieved
  0            67            2            3.0
  2.5          75            7            9.3
  5           130           29           22.3
  10          145           40           27.6

Source: Gijsmant, et al (1997)

Example - Rizatriptan for Migraine (SPSS)
[SPSS output table; not recoverable from the extraction]
  $$\hat{p}(x) = \frac{e^{-2.490 + 0.165x}}{1 + e^{-2.490 + 0.165x}}$$
  $H_0: \beta = 0$   $H_A: \beta \neq 0$
  T.S.: $X^2_{obs} = \left( 0.165 / 0.037 \right)^2 = 19.819$ (value as reported; the rounded inputs shown here give about 19.89)
  R.R.: $X^2_{obs} \geq \chi^2_{.05,1} = 3.84$
  P-val: .000

Odds Ratio
• Interpretation of Regression Coefficient ($\beta$):
  – In linear regression, the slope coefficient is the change in the mean response as $x$ increases by 1 unit
  – In logistic regression, we can show that:
    $$\frac{odds(x+1)}{odds(x)} = e^{\beta} \qquad \text{where } odds(x) = \frac{p(x)}{1 - p(x)}$$
• Thus $e^{\beta}$ represents the (multiplicative) change in the odds of the outcome when $x$ increases by 1 unit
• If $\beta = 0$, the odds and probability are the same at all $x$ levels ($e^{\beta} = 1$)
• If $\beta > 0$, the odds and probability increase as $x$ increases ($e^{\beta} > 1$)
• If $\beta < 0$, the odds and probability decrease as $x$ increases ($e^{\beta} < 1$)

95% Confidence Interval for Odds Ratio
• Step 1: Construct a 95% CI for $\beta$:
  $$\hat{\beta} \pm 1.96\,\hat{\sigma}_{\hat{\beta}} \equiv \left( \hat{\beta} - 1.96\,\hat{\sigma}_{\hat{\beta}} \;,\; \hat{\beta} + 1.96\,\hat{\sigma}_{\hat{\beta}} \right)$$
• Step 2: Raise $e = 2.718$ to the lower and upper bounds of the CI:
  $$\left( e^{\hat{\beta} - 1.96\,\hat{\sigma}_{\hat{\beta}}} \;,\; e^{\hat{\beta} + 1.96\,\hat{\sigma}_{\hat{\beta}}} \right)$$
• If the entire interval is above 1, conclude positive association
• If the entire interval is below 1, conclude negative association
• If the interval contains 1, cannot conclude there is an association

Example - Rizatriptan for Migraine
• 95% CI for $\beta$: with $\hat{\beta} = 0.165$ and $\hat{\sigma}_{\hat{\beta}} = 0.037$,
  95% CI: $0.165 \pm 1.96(0.037) \equiv (0.0925 \,,\, 0.2375)$
• 95% CI for population odds ratio:
  $\left( e^{0.0925} \,,\, e^{0.2375} \right) \equiv (1.10 \,,\, 1.27)$
• Conclude positive association between dose and probability of complete relief

Multiple Logistic Regression
• Extension to more than one predictor variable (either numeric or dummy variables).
• With $p$ predictors, the model is written:
  $$\pi = \frac{e^{\alpha + \beta_1 x_1 + \cdots + \beta_p x_p}}{1 + e^{\alpha + \beta_1 x_1 + \cdots + \beta_p x_p}}$$
• Adjusted odds ratio for raising $x_i$ by 1 unit, holding all other predictors constant:
  $$OR_i = e^{\beta_i}$$
• Inferences on $\beta_i$ and $OR_i$ are conducted as described above for the case with a single predictor

Example - ED in Older Dutch Men
• Response: Presence/Absence of ED (n=1688)
• Predictors: (p=12)
  – Age stratum (50-54*, 55-59, 60-64, 65-69, 70-78)
  – Smoking status (Nonsmoker*, Smoker)
  – BMI stratum (<25*, 25-30, >30)
  – Lower urinary tract symptoms (None*, Mild, Moderate, Severe)
  – Under treatment for cardiac symptoms (No*, Yes)
  – Under treatment for COPD (No*, Yes)
* Baseline group for dummy variables
Source: Blanker, et al (2001)

Example - ED in Older Dutch Men

  Predictor                       b      sb     Adjusted OR (95% CI)
  Age 55-59 (vs 50-54)            0.83   0.42    2.3 (1.0 – 5.2)
  Age 60-64 (vs 50-54)            1.53   0.40    4.6 (2.1 – 10.1)
  Age 65-69 (vs 50-54)            2.19   0.40    8.9 (4.1 – 19.5)
  Age 70-78 (vs 50-54)            2.66   0.41   14.3 (6.4 – 32.1)
  Smoker (vs nonsmoker)           0.47   0.19    1.6 (1.1 – 2.3)
  BMI 25-30 (vs <25)              0.41   0.21    1.5 (1.0 – 2.3)
  BMI >30 (vs <25)                1.10   0.29    3.0 (1.7 – 5.4)
  LUTS Mild (vs None)             0.59   0.41    1.8 (0.8 – 4.3)
  LUTS Moderate (vs None)         1.22   0.45    3.4 (1.4 – 8.4)
  LUTS Severe (vs None)           2.01   0.56    7.5 (2.5 – 22.5)
  Cardiac symptoms (Yes vs No)    0.92   0.26    2.5 (1.5 – 4.3)
  COPD (Yes vs No)                0.64   0.28    1.9 (1.1 – 3.6)

Interpretations: Risk of ED appears to be:
• Increasing with age, BMI, and LUTS strata
• Higher among smokers
• Higher among men being treated for cardiac symptoms or COPD

Nonlinear Regression
• Theory often leads to nonlinear relations between variables.
  Examples:
  – 1-compartment PK model with 1st-order absorption and elimination
  – Sigmoid-Emax S-shaped PD model

Example - P24 Antigens and AZT
• Goal: Model time course of P24 antigen levels after oral administration of zidovudine
• Model fit individually in 40 HIV+ patients:
  $$E(t) = E_0 (1 - A) e^{-k_{out} t} + E_0 A$$
  where:
  • $E(t)$ is the antigen level at time $t$
  • $E_0$ is the initial level
  • $A$ is the coefficient of reduction of P24 antigen
  • $k_{out}$ is the rate constant of decrease of P24 antigen
Source: Sasomsin, et al (2002)

Example - P24 Antigens and AZT
• Among the 40 individuals for whom the model was fit, the means and standard deviations of the PK “parameters” are given below:

  Parameter   Mean    Std Dev
  E0          472.1   408.8
  A           0.28    0.21
  kout        0.27    0.16

• Fitted model for the “mean subject”:
  $$E(t) = 472.1 (1 - 0.28) e^{-0.27 t} + (472.1)(0.28)$$

Example - P24 Antigens and AZT
[Figure not recovered from the extraction]

Example - MK639 in HIV+ Patients
• Response: $Y = \log_{10}$(RNA change)
• Predictor: $x$ = MK639 AUC0-6h
• Model: Sigmoid-Emax:
  $$Y = \frac{\beta_0 x^{\beta_2}}{x^{\beta_2} + \beta_1^{\beta_2}}$$
• where:
  • $\beta_0$ is the maximum effect (limit as $x \to \infty$)
  • $\beta_1$ is the $x$ level producing 50% of maximum effect
  • $\beta_2$ is a parameter affecting the shape of the function
Source: Stein, et al (1996)

Example - MK639 in HIV+ Patients
• Data on n = 5 subjects in a Phase 1 trial:

  Subject   log RNA change (Y)   MK639 AUC0-6h (x)
  1         0.000                10576.9
  2         0.167                13942.3
  3         1.524                18235.3
  4         3.205                19607.8
  5         3.518                22317.1

• Model fit using SPSS (estimates slightly different from notes, which used SAS):
  $$\hat{Y} = \frac{3.52\, x^{35.60}}{x^{35.60} + 18374.39^{35.60}}$$

Example - MK639 in HIV+ Patients
[Figure not recovered from the extraction]
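The arithmetic in the worked examples above can be checked with a short Python sketch. This is only an illustration under stated assumptions, not the method used to produce the slides: the function names are hypothetical, the estimates are the rounded values quoted in the examples, and the rizatriptan intercept is taken as -2.490 (the sign consistent with a low placebo response). It evaluates the fitted curves and the Wald/odds-ratio formulas; it does not re-estimate any parameters.

```python
import math


def logistic_prob(x, alpha, beta):
    """p(x) = exp(alpha + beta*x) / (1 + exp(alpha + beta*x))."""
    z = alpha + beta * x
    return math.exp(z) / (1.0 + math.exp(z))


def wald_chi2(beta_hat, se):
    """Large-sample Wald statistic: (estimate / standard error) squared."""
    return (beta_hat / se) ** 2


def odds_ratio_ci(beta_hat, se, z=1.96):
    """95% CI for the odds ratio: exponentiate the CI bounds for beta."""
    lower = beta_hat - z * se
    upper = beta_hat + z * se
    return math.exp(lower), math.exp(upper)


def p24_level(t, e0, a, kout):
    """E(t) = E0*(1 - A)*exp(-kout*t) + E0*A."""
    return e0 * (1.0 - a) * math.exp(-kout * t) + e0 * a


def sigmoid_emax(x, b0, b1, b2):
    """Y = b0*x^b2 / (x^b2 + b1^b2), written with (x/b1)^b2 to avoid huge powers."""
    ratio = (x / b1) ** b2
    return b0 * ratio / (1.0 + ratio)


# Rizatriptan example (rounded estimates; intercept sign is an assumption).
ALPHA_HAT, BETA_HAT, SE_BETA = -2.490, 0.165, 0.037
for dose in (0, 2.5, 5, 10):
    print(f"dose {dose:>4} mg: fitted P(relief) = "
          f"{logistic_prob(dose, ALPHA_HAT, BETA_HAT):.3f}")
print(f"Wald chi-square from rounded inputs: {wald_chi2(BETA_HAT, SE_BETA):.2f}")
or_lo, or_hi = odds_ratio_ci(BETA_HAT, SE_BETA)
print(f"95% CI for odds ratio: ({or_lo:.2f}, {or_hi:.2f})")

# P24 antigen "mean subject" curve at a few illustrative time points.
for t in (0, 5, 10):
    print(f"t = {t:>2}: E(t) = {p24_level(t, 472.1, 0.28, 0.27):.1f}")

# MK639 sigmoid-Emax fit evaluated at the five observed AUC values.
for auc in (10576.9, 13942.3, 18235.3, 19607.8, 22317.1):
    print(f"AUC {auc:>8}: fitted Y = {sigmoid_emax(auc, 3.52, 18374.39, 35.60):.3f}")
```

Writing the sigmoid-Emax curve in terms of the ratio x/β1 is algebraically the same as the formula above and keeps the intermediate values small even with a shape parameter as large as 35.60.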