WORKING WITH CLINICIANS & OTHER SCIENTISTS
Chap Le, Xianghua Luo, David Vock, William Thomas

Science is built upon rigorous observation and experimentation. Biostatistics, the application of statistics to understanding health and biology, provides powerful tools for developing questions, designing studies, refining measurements, and analyzing data. A biostatistician’s unique contribution to a research team is founded on quantifying uncertainty in data and generating sound inferences from them. Because of the increasing complexity and quantity of health-related data, the need for biostatistics expertise and for biostatisticians is expanding and evolving.

Biostatistics contributions take one of two forms: consultation and collaboration.

Statistical consultation is often unplanned, less organized, and aimed at smaller projects. Groups that focus on consultation provide a valuable service but fail to maximize the contributions biostatisticians can make to research. In those organizations, biostatistics is sometimes regarded as an ancillary service rather than an academic discipline; investigators or clinical departments expect biostatisticians to fill a perceived service role.

In more modern medical centers, especially academic medical centers, biostatistics support is organized so that the field has a strong identity as an academic discipline, one that spurs intellectual growth and values methodological contributions to health-related research. Contributions are made through collaborations in which biostatisticians are involved early and continuously in every project, from developing questions, designing studies, and refining measurements to analyzing data and publishing results.

All of us in applied environments still provide some statistical consultation, because not all investigators are experienced; but even those consultations are gradually becoming more like “mentoring” than consulting.
Those who have been around for a while are often involved in more meaningful, more rewarding collaborations.

So, where do we contribute? The following few slides provide a simple picture of the makeup of a research project.

[Diagram: three paired levels of a research project: Truth in the Universe / Research Question; Truth in the Study / Study Plan; Findings in the Study / Study Data; with the labels “Starting” and “Finishing”.]

The biggest thread, or the most important component, in research is the concept of “validity”. It involves assessment against accepted standards; we have to be sure that the evaluation covers its intended target or targets.

INFERENCES & VALIDITIES
Two major levels of inference are involved in interpreting the findings of a study:
• The first level concerns internal validity: the degree to which the investigator draws the correct conclusions about what actually happened in the study.
• The second level concerns external validity (also referred to as generalizability or inference): the degree to which these conclusions can be appropriately applied to people and events outside the study.

[Diagram: the same three levels, with External Validity between Truth in the Universe and Truth in the Study, and Internal Validity between Truth in the Study and Findings in the Study.]

Biostatistics contributes to both internal validity (dealing with missing data, refining measurements, analyzing data) and external validity (helping to develop the research question, designing the study, estimating sample size).

[Diagram: Clinical Research, Population Research, and Laboratory Research, connected by T1 and T2.]

Studies can be grouped into three areas: Population, Laboratory, and Clinical; plus Translational Research, the component of basic science that interacts with clinical research (T1) or with population research (T2).

THE ANATOMY & PHYSIOLOGY OF CLINICAL RESEARCH
We form or evaluate a research project from two different angles: the anatomy and the physiology of research, much like the hardware and the software needed to run a computer.
THE ANATOMY PART
From the anatomy of the research, one can see what it is made of. This includes the tangible elements of the study plan: research question, design, subjects, measurements, sample size calculation, and so on. The goal is to create these elements in a form that will make the project feasible, efficient, and cost-effective.

THE PHYSIOLOGY PART
From the physiology of the research, one can see how it works: first, what happened in the study sample, and then how the study findings generalize to people outside the study. The goal is to minimize the errors that threaten conclusions based on these inferences.

THE PROTOCOL
• The structure of a research project, both its anatomy and its physiology, is described in its protocol, the written part of the study.
• The protocol has a vital scientific function: to help the investigator organize his/her research in a logical, focused, and efficient way.

COMPONENTS OF THE PROTOCOL
Research Question: What is the objective of the study, the uncertainty the investigator wants to resolve?
Background and Significance: Why are these questions important?
Design: How is the study structured?
Subjects: Who are the subjects, and how will they be selected and recruited?
Variables: What measurements will be made (predictors, confounders, and outcomes)?
Statistical Considerations: How large is the study, and how will the data be analyzed? (“Design” is an important statistical component but is listed in the Design section.)

You can see “Statistical Fingerprints” everywhere! But productive contributions require some understanding of the “Content Sciences”. That’s why many statisticians gradually specialize in only a few areas of biomedical research.
SOME PROJECTS in Chap Le’s Portfolio:

(1) P01: Biology and Transplantation of the Human Stem Cell
Director: Phil McGlave; NCI: 7/1/10-6/30/15
This program project has three projects, all in Minnesota. They focus on three important issues in the UCB transplant setting: 1) graft versus host disease (GVHD); 2) delayed immune reconstitution with resultant late infection; and 3) refractory or relapsed leukemia.
Statisticians: Chap Le (Core Director), Qing Cao, Todd DeFor, Bruce Lindgren, Xianghua Luo, and Ryan Shanley.

(2) P01: NK Cells, Their Receptors and Unrelated Donor Transplant
Director: Jeff Miller; NCI: 9/1/10-7/31/15
This program brings together a group of international experts in NK cell biology and bone marrow transplantation to investigate the relevance of NK alloreactivity in URD HCT, a setting where KIR repertoires differ in nearly all donor-recipient pairs. The program project has three projects: one here in Minnesota, one at Stanford University, and a multi-center randomized clinical trial with a PI here.
Statisticians: Chap Le (Core Director), Todd DeFor, Xianghua Luo, and Yan Zhang.

(3) P30: Cancer Center Support Grant (CCSG)
Director: Doug Yee; NCI: 6/1/98-1/31/14
The Masonic Cancer Center has 8 research programs (Cancer Outcomes and Survivorship, Carcinogenesis and Chemoprevention, Genetic Mechanisms of Cancers, Immunology, Prevention and Etiology, Transplant Biology and Therapy, Tumor Microenvironment, and Cell Signaling); Biostatistics and Bioinformatics is one of its 13 Shared Resources.
Statisticians: Chap Le (Core Director), Haitao Chu, Yen-Yi Ho, Robin Bliss, Todd DeFor, Bruce Lindgren, and Yan Zhang.
(4) P30: Minnesota Obesity Center
Director: Allen Levine; NIDDK: 9/30/1995-3/31/2016
The Center has 73 active investigators with 137 funded projects related to obesity, energy metabolism, and eating disorders, and is one of 12 funded Nutrition Obesity Research Centers. Statisticians are part of the Biostatistics and Epidemiology Core led by Dr. Robert Jeffery of Epidemiology/SPH (the basic rationale is the relationship between obesity and cancers).
Statisticians: Chap Le, Robin Bliss, and Yan Zhang.

(5) P50 (SPORE): UAB/UMN SPORE in Pancreatic Cancer
Directors: Donald Buchsbaum (Alabama) and Selwyn Vickers (Minnesota); NCI: 8/15/2010-6/30/2015
This SPORE has four projects: two in Birmingham, one here in Minnesota, and one with Co-PIs on both campuses.
Statisticians: Chap Le (Core Co-Director), Yen-Yi Ho, and Bruce Lindgren, and (for pilot projects) Xianghua Luo, Aaron Sarver, and Ryan Shanley.

(6) U54: Evaluating New Nicotine Standards for Cigarettes
Directors: Eric Donny (Pittsburgh) and Dorothy Hatsukami (Minnesota); NIDA-FDA: 9/15/11-6/30/16
This specialized research center has four projects: a multi-center clinical trial headquartered here in Minnesota, two at Pittsburgh, and one at Brown.
Statisticians: Chap Le (Core Director), Qing Cao, Bruce Lindgren, Xianghua Luo, and Joseph Koopmeiners.

(7) U19: Models for Tobacco Products Evaluation
Director: Dorothy Hatsukami; NCI: 9/20/12-9/19/17
The overall goal of this program project is to provide scientists and regulatory agencies with scientifically based guidelines, methods, and measures for the evaluation of tobacco products. The new program includes four projects, all here in Minnesota.
Statisticians: Chap Le (Core Director), Robin Bliss, Yen-Yi Ho, Bruce Lindgren, and Yan Zhang.

(8) R01: Randomized Trial of PEITC as a Modifier of NNK Metabolism
Principal Investigator: J. Yuan; NCI: 4/1/08-1/31/13
The primary aim is to assess, via a cross-over clinical trial, the effect of PEITC supplementation (at 40 mg per day) as a modifier of NNK metabolism in smokers.
Statistician: Chap Le

(9) R01: Green Tea and Reduction of Breast Cancer Risk
Principal Investigator: M. Kurzer; NCI: 9/11/08-7/31/13
The primary aim is to gain a full understanding of the mechanisms by which tea catechins inhibit breast carcinogenesis in humans. It is a randomized controlled clinical trial.
Statistician: Chap Le

(10) P50: Tobacco Center of Regulatory Science (TCORS)
Program Directors: Anne Joseph & Sharon Murphy
We are in the process of applying (deadline: November 14) to establish a Tobacco Center of Regulatory Science (TCORS). There will be four projects to start: three here and one at Boston University.
Statisticians: Chap Le (Core Director), Dipankar Bandyopadyay, Haitao Chu, Yen-Yi Ho, Bruce Lindgren, Robin Bliss, and Ryan Shanley.

SKILLS?
• What kind of skills are needed?
• Most important: “People Skills” (to weather the conflict between two different cultures); will elaborate a bit on this hidden barrier in class.
• Years 1-3 are especially hard and important; but “networking” is always important, even for veterans.

DATA-DRIVEN STATISTICAL RESEARCH
By Xianghua Luo
• Why do we still need to do methodological research?
– It helps you do your consulting work better
– It gives you closure on a study you’ve been involved in or a new method you have just learnt
– It is one way to drive yourself to learn new things
• How to find topics?
– Research how others have analyzed the same type of data. This needs a lot of reading.
– What can be improved?
– How to convince people to use the new method you proposed? Publish!
– Write for both statistical journals and scientific journals. (This is the proof that you care about their scientific problems, you understand their problems, you know their language, etc.)
• Do you need to have your own funding to support your research?
– Depends.
– If you need to, you will find it not that difficult to find an existing data set or an ongoing study that you are involved in. So an R03/R21 on secondary analysis of existing data or an existing project might be a good starting point for you.
– Being a PhD means you will be a PI one way or another someday. Better to practice early.

AN EXAMPLE OF DATA-DRIVEN RESEARCH
Analysis of Cigarette Purchase Task Instrument Data with a Left-Censored Mixed Effects Model
Joint work with Liao W, Le C, Epstein LH, Yu J, Ahluwalia JS, and Thomas J.

Cigarette Purchase Task Survey
Imagine a TYPICAL DAY during which you smoke. The following questions ask how many cigarettes you would consume if they cost various amounts of money. Assume the following:
• Available cigarettes are your favorite brand
• You have the same income/savings that you have now
• You have NO ACCESS to any cigarettes or nicotine products other than those offered at these prices
• You consume the cigarettes you request on that day (in other words, no stockpiling)
Participants were then asked to respond to the following set of questions:
How many cigarettes would you smoke if they were _____ each?: 0¢ (free), 1¢, 5¢, 13¢, 25¢, 50¢, $1, $2, $3, $4, $5, $6, $11, $35, $70, $140, $280, $560, $1,120.

Figure. A typical cigarette demand curve for a smoker, derived from cigarette purchase task survey data (log-log coordinates used)

• Existing statistical methods:
– Individual-specific ordinary least squares model.
– Mixed effects model.
• How are the extra zeros/missing values handled in existing methods?
– Ignore all zeros or missing values;
– Impute the first zero with an arbitrary small number ω, e.g. 0.1, and ignore further zeros;
– Impute all zeros/missing values with ω.
• Any problems in the existing methods?
– Could the zeros be small values that are not observable because they are lower than a certain threshold (LOD)?
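The censoring idea in the last question above can be made concrete with a small sketch. The Python code below writes a left-censored (Tobit-type) normal log-likelihood for one smoker's demand curve: an observed log-consumption contributes a normal density term, while a reported zero contributes the probability of falling below the LOD. This is a simplified illustration under stated assumptions, not the model from the joint work: it has no random effects, it assumes a straight line in log-log coordinates, and the data, the LOD of 0.5 cigarettes, the parameter values, and all function names are hypothetical.

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log density of a Normal(mu, sigma) variable at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def normal_logcdf(x, mu, sigma):
    """Log of P(X <= x) for X ~ Normal(mu, sigma)."""
    z = (x - mu) / sigma
    p = 0.5 * math.erfc(-z / math.sqrt(2.0))
    # Floor the probability so an extreme z cannot trigger log(0).
    return math.log(max(p, 1e-300))

def censored_loglik(log_price, log_q, censored, intercept, slope, sigma, log_lod):
    """Left-censored log-likelihood for a linear demand curve on the log-log scale.

    Observed points contribute a density term; censored points contribute
    the probability that log-consumption lies below log_lod.
    """
    total = 0.0
    for x, y, is_censored in zip(log_price, log_q, censored):
        mu = intercept + slope * x
        if is_censored:
            total += normal_logcdf(log_lod, mu, sigma)
        else:
            total += normal_logpdf(y, mu, sigma)
    return total

# Hypothetical responses from one smoker (price in dollars per cigarette).
prices = [0.01, 0.05, 0.25, 1.0, 5.0, 35.0]
cigs = [20, 20, 15, 10, 2, 0]
lod = 0.5  # assumed threshold below which consumption is reported as zero

log_price = [math.log(p) for p in prices]
censored = [q < lod for q in cigs]
log_q = [math.log(q) if q >= lod else None for q in cigs]

# Evaluate the log-likelihood at one illustrative set of parameter values.
ll = censored_loglik(log_price, log_q, censored,
                     intercept=2.0, slope=-0.4, sigma=0.5,
                     log_lod=math.log(lod))
print(ll)
```

Unlike imputing ω, this treatment never assigns a zero an arbitrary numeric value; it only uses the information that the response fell below the threshold. In a full analysis, terms like these would sit inside a mixed effects likelihood with subject-specific random effects, and a numerical optimizer would maximize it over the parameters.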
• Left-censored mixed effects model
– What if some zeros are real zero consumptions (complete cessation of smoking)?
• A joint modeling approach, with a logistic regression component for the cessation status and a left-censored model for those who have not achieved complete cessation.

What else can you do to improve your consulting work?
– Serve as a referee for scientific journals
– Serve on protocol/proposal review committees
– Go to scientific seminars
– Be approachable, be responsible, be professional, always!

Understanding the Science in Collaborative Research
David M. Vock, Ph.D.

My Background
• First year at University of Minnesota
• Graduate school at North Carolina State
• Interned at Duke Clinical Research Institute (DCRI)
• Worked on secondary manuscripts, mostly in hepatitis C, lung transplantation, and cardiology

What Does “Understanding the Science” Entail?
• Should be able to give an “elevator talk” to another subject area expert
• Know major objectives
• Understand protocol for data collection
• Read the major recent papers
• Comprehend how the study fits within the larger research agenda of the discipline

Not a Revolutionary Idea, But . . .
• Academic departments teach a certain set of skills amenable to solving varied problems
• “Real-world” problems usually require lots of tools to solve them, hence interdisciplinary teams
• Too often statisticians think of themselves as separate from the team

Why is Understanding Science Important?
• Builds credibility with investigators
• Improve the research agenda
• Guide appropriate analysis
• Strengthen manuscript for publication and anticipate problems with review
• Troubleshoot problems

Builds Credibility
• Statisticians are too often viewed as another hoop in the research process
• To be part of an interdisciplinary team, you have to be able to speak a common language
• Stats is not universally known: you must learn the scientific language and thought process
• Forthcoming: value to the team is increased by understanding the science
• Think of yourself as a scientist with purview over the entire research process

Improve Research Agenda
• If you know the science . . .
• Focus the research question; no fishing expeditions
• Help prioritize scientific hypotheses
• Ensure that the question can be answered from the data collected

Guide Appropriate Analysis
• Anticipate appropriate confounders to account for
• Prediction versus estimation problems
• Avoid analyses that are not scientifically interesting
• Move from associational analyses to causal treatment analyses
• You are not going to “win” every disagreement; fight hardest for the points that will affect scientific conclusions

Anticipate Problems in Review
• Extreme resistance to “different” analytical methods
• Must be able to justify departures from standard analysis
• Statistical articles written in medical journals are immensely valuable
• Want to ensure that subject-area conclusions match the analysis performed (cannot be too speculative, either)

Troubleshoot Problems
• Example: quality of life (QOL) study, part of the VALGAN trial
• Pre-specified secondary analysis of a randomized trial of CMV prophylaxis for lung transplant recipients
• Goal was to characterize QOL changes over the first year post-transplant using the SF-36
• Preliminary analyses showed an extremely small gain in QOL, even in physical domains