Basic Biostatistics in Medical Research: What (Not) to Do
November 7, 2013
Leah J. Welty, PhD
Biostatistics Collaboration Center

Welcome to Basic Biostatistics in Medical Research: What (Not) to Do. This is part of a bi-annual lecture series presented by the Biostatistics Collaboration Center. A laudable goal for today would be for you to come away understanding everything you might want or need to know about biostatistics in medical research. Unfortunately, that's highly unlikely in an hour, especially with an audience of varied specialties and backgrounds. Even condensing introductory biostatistics into an hour-long lecture would be impossible. So, rather than covering all the background and methodology that's out there, I will instead focus on areas in which people are prone to making mistakes, incorrectly or inadequately applying biostatistical methods, or becoming confused about what their results mean. This lecture is accordingly divided into four sections:

1. A good picture is worth 1,000 words: the importance of statistical graphics
2. Not all observations are created independent
3. What is a p-value really?
4. How to collaborate with a biostatistician

For those who get excited about this today, please come back next week. If you're serious about expanding your own biostatistics repertoire, there are a number of excellent biostatistics courses offered by the graduate school. If instead you're looking for guidance on the methods appropriate for your own research, I urge you to listen carefully in section 4, and consider visiting the Biostatistics Collaboration Center.

I. A good picture is worth 1,000 words.

Statistical graphics can do two very important things: (1) guide the appropriate choice of statistical analyses; and (2) provide a powerful illustration of results. My first piece of advice to investigators is to look at their data -- not to "fish" for results -- but to understand how individual variables are distributed and best summarized. Then, once data have been (appropriately) analyzed and are being prepared for publication, my next piece of advice is to think about (creative) ways to graphically display the results.

A. Graphics guiding appropriate analysis choices

Example 1: Correlation and Anscombe's Quartet

Correlation measures the strength of the linear association between two variables. It is often denoted by "r" and takes values between -1 and 1. The values have the following interpretations:

r near -1: strong negative linear association
r near 0: no linear association
r near 1: strong positive linear association

Suppose I have two variables A and B, and I tell you that their correlation is 0.82. What impression does that make? Hopefully that A and B are fairly strongly linearly associated. The picture we associate with this relationship might look something like what is shown below (Figure 1), where the variables A and B do in fact have a correlation of 0.82.

Figure 1: [scatterplot of Variables A and B showing a strong positive linear trend; r = 0.82]

However, it's also possible for the relationship between the two variables A and B to actually be much different, yet for their correlation to still be 0.82. First, the variables A and B may be related in a non-linear fashion. Figure 2 illustrates variables A and B that are quadratically related. Although r = 0.82 because there is still a linear trend, correlation is not an accurate description of the strength of the relationship.
Figure 2: [scatterplot of Variables A and B showing a quadratic relationship; r = 0.82]

Second, variables A and B may have no relationship at all, or we may not have adequate information to capture the relationship, yet still r = 0.82. In Figure 3, for all but one observation, it appears that A is completely unrelated to B, or at least that B may vary substantially without any change in A.

Figure 3: [scatterplot in which all observations but one share the same value of A, with a single point far to the right; r = 0.82]

The single value on the right side of the plot is what we refer to as an "influential observation." Correlation is notorious for not being "robust," in the sense that it can depend heavily on just a few observations. If I were presented with this data, I would recommend two courses of action: (1) investigate the influential point (is it an obvious mistake in coding or measurement?) and (2) if possible, collect more data in which observations don't all have the same values for A. Don't throw out the influential observation unless you can determine it was clearly an error (and not just an "error" in the sense that it doesn't match the rest of the data). Sometimes the most unusual observations end up giving us the most insight. It may well be the case that B increases as A increases; we just can't determine that from this limited amount of information. Reporting r = 0.82 would be highly misleading for this data.

Our third and final example involves another unusual observation. In Figure 4 below, variables A and B appear to have a perfect linear relationship, minus one observation, and the correlation is 0.82. As above, it would be wise to investigate this observation a bit more -- is it an error, or are there some people (or observational units) for which A and B don't have the same relationship? It's also incredibly rare to see such a perfect linear relationship in practice, so I would recommend investigating the points that appear perfectly related as well.

Figure 4: [scatterplot of a perfect linear relationship except for one outlying point; r = 0.82]

In only one of the above four cases was correlation a reasonable summary measure of the relationship between A and B. Have you ever computed a correlation coefficient without first making sure it's an appropriate summary measure?

As a final note, there are many kinds of correlation. We've been discussing the most common version, known as Pearson correlation. Spearman rank correlation and Kendall's tau are somewhat common as well, but they do not measure strength of linear association.
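The four scenarios above mirror Anscombe's classic quartet, which happens to ship with base R. As a quick, minimal sketch (the variable names x1-x4 and y1-y4 come from R's built-in anscombe dataset, not from the figures above), you can verify that all four pairs share nearly the same correlation despite wildly different shapes:

    # Anscombe's quartet: four datasets, essentially one correlation (~0.816)
    data(anscombe)
    sapply(1:4, function(i) cor(anscombe[[paste0("x", i)]],
                                anscombe[[paste0("y", i)]]))
    # Plotting first would have revealed the differences immediately
    par(mfrow = c(2, 2))
    for (i in 1:4) plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
                        xlab = paste0("x", i), ylab = paste0("y", i))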
Example 2: Means, Medians, and Days in Corrections

The most common summary measures of continuous data are undoubtedly the mean and the associated standard deviation. However, the mean may not always be the most accurate or appropriate summary statistic for continuous data. Have you ever computed means and standard deviations without actually looking at the distribution of the variable first? The next example illustrates why that's not such a good idea.

This example comes from research I do with the group Health Disparities and Public Policy in the Department of Psychiatry and Behavioral Sciences. In particular, we have a prospective longitudinal study of juvenile delinquents after detention. A number of participants cycle in and out of the correctional system (jail, prison), and one of the measures we are interested in is the amount of time they spend incarcerated.

For this mock data (a subsample of 1,000 participants from an interview 5 years after detention as a juvenile), we found that the average number of days spent in corrections during the past year was 84. However, the median number of days in corrections in the past year was 0. Figure 5, below, illustrates what's going on. Over half the participants (544) had no correctional stays during the past year, and the next largest chunk of participants (99) were in a correctional facility the entire year. The remaining participants are distributed between 1 and 364 days in a fairly uniform way. However, the 99 participants who were in a correctional facility the entire time "pull" the mean up to 84.

Figure 5: [histogram of days in corrections during the past year: a large spike at 0, a smaller spike at 365, and a roughly uniform spread in between]

The mean is not "robust" to outlying values, but the median is. The mean is actually the balance point of the distribution: if you imagine putting the histogram on a fulcrum, you'd need to put the fulcrum at 84 to balance the two ends -- those 99 values are "far away" from the rest of the data, which gives them disproportionate weight. The lesson here is not to blindly compute means without first making sure that they're an appropriate summary measure. If not, don't be afraid to report the median. Just as the mean is generally reported with the standard deviation, the median should be reported with the range and quartiles (often the 25th and 75th percentiles, as well as the distance between them, the interquartile range).

As a final note, the histograms below illustrate fake data in which both variables have a mean of 2.0. For the symmetric (and normally distributed) data on the left, the median is also 2.0. For the skewed data on the right, the median is 1.4. For the data on the right, I would pause before reporting just the mean and the standard deviation.

Figure 6: [two histograms, each with mean 2.0: symmetric, normally distributed data (median 2.0) on the left; right-skewed data (median 1.4) on the right]
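A quick way to check whether the mean is a sensible summary is to compute it alongside the median and quartiles and look at a histogram. Here is a minimal sketch in R using simulated data shaped like the corrections example (the counts mimic Figure 5; these are not the study data, and the simulated mean will not match 84 exactly):

    set.seed(1)
    days <- c(rep(0, 544), rep(365, 99), round(runif(357, 1, 364)))
    mean(days)                      # far above the median, pulled up by the year-long stays
    median(days)                    # 0
    quantile(days, c(0.25, 0.75))   # quartiles to report alongside the median
    hist(days, breaks = 30, xlab = "Days in corrections, past year")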
B. Graphics providing a powerful illustration of results

The graphics you use to generate helpful summaries of your data are generally not the same ones you'll want to use in presentations or publications. Figures are an exciting opportunity to convey your results visually, and they may be far more convincing than text or even the numbers themselves. Unfortunately, programs like Microsoft Excel or PowerPoint don't always provide good guidance on what makes an effective figure for publication. Or, in the event that they can be coerced into making a nice figure, it's not trivial to figure out how. Biostatisticians hardly ever use Excel to generate figures. Common statistical programs such as R, SAS, Stata, and SPSS all have reasonable graphics packages that can easily produce more appropriate graphical summaries. The example below illustrates what's possible with different options and increasing levels of sophistication.

As in the previous section, this example uses data on time incarcerated. The purpose of the figure is to illustrate racial/ethnic differences in time spent incarcerated. The first version was created using Excel with the help of a graduate student who was highly proficient in Excel from her former life as a management consultant. At first glance, are you overwhelmed by the racial/ethnic differences in incarceration? Can you tell what these differences are? Does this type of figure look familiar?

[Excel figure: grouped bar chart of participants by race/ethnicity within categories of months in corrections]

Although such figures are commonplace, there are a number of ways in which this figure doesn't work. Criticisms include: (1) the x-axis divides a continuous variable -- the number of months in corrections -- into categories; (2) the horizontal lines are distracting; and (3) perhaps most importantly, to understand the relationship between race/ethnicity and months in corrections, you need to digest racial/ethnic comparisons across six different categories, the first and the last being the most relevant.

This second presentation of the exact same data was generated using Stata, with no alterations to the standard boxplot command:

[Stata figure: side-by-side boxplots of months in corrections by race/ethnicity]

Side-by-side boxplots are a powerful way of conveying differences in the distributions of continuous variables, but they are sadly underused. The boxes span the middle 50% of the data (from the 25th to the 75th percentile), the line within each box shows the median, and the whiskers reach to the upper and lower limits of the data. In the case of non-Hispanic whites, some of the large observations are considered "outliers" (more than 1.5 times the length of the box beyond the 75th percentile), so they're shown as dots rather than included in the whiskers. It's clear from looking at this boxplot that non-Hispanic whites are generally spending less time incarcerated than minorities.
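For readers who want to try this themselves, a plot like the one above takes a single command in most statistical packages. Here is a minimal sketch in R, with simulated values and group labels standing in for the study data:

    set.seed(7)
    months <- pmax(0, c(rnorm(50, 4, 4), rnorm(50, 9, 6), rnorm(50, 10, 6)))
    race   <- rep(c("Non-Hispanic White", "Non-Hispanic Black", "Hispanic"),
                  each = 50)
    boxplot(months ~ race, ylab = "Months in corrections")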
Finally, our last version, which is close to what was submitted for publication, shows a slightly different view of the data:

[R figure: by race/ethnicity and sex, bars showing (1) the percent who had spent any time incarcerated and (2) the length of those incarcerations]

Note that it's not a conventional plot, but it is effective at demonstrating racial/ethnic and sex differences in two variables: (1) who had spent any time incarcerated, and (2) the length of those incarcerations. This figure was generated using R, which is open-source and freely available statistical software. It has excellent graphics capabilities. For the technically inclined, it is certainly accessible. For others, know that this is the sort of figure your friendly neighborhood biostatistician can create.

C. Good and bad examples of graphics

Edward Tufte has written extensively (in very accessible language) and elegantly about what makes good statistical graphics. Visit http://www.edwardtufte.com/tufte/ for more information. Much of what follows in this section is influenced by his work. Here are some ideas to keep in mind when you're generating graphics:

1. Graphics should convey maximum information with minimum ink. The ubiquitous tower-and-antenna plots (bars topped with error bars) are horrible offenders in this category: one tower and antenna uses a lot of ink to illustrate just two numbers. Why not a dot and a line instead? All the extra ink is distracting. Only use color if color is necessary.
2. Graphics should have no more dimensions than exist in the data. The 3-D bar charts in Excel may look fancy, but they're horrible when it comes to actually reading information from the plot.
3. Labels should be informative, not distracting, and axes should have a sensible range.
4. Avoid pie charts (especially the 3-D kind). Humans are horrible at comparing areas, and even worse at comparing volumes. I recommend bars instead (see the final figure in the previous section); we're better at comparing lengths.

If you're looking for examples of good and creative graphics, check out the New York Times. Below is a picture of an interactive plot illustrating how people spend their time. Although few of us have the technical expertise to generate such a figure, it's important to note that it is highly illustrative even though it's not what we're used to seeing or have likely seen in a statistics textbook.

[New York Times interactive graphic: how people spend their time]

It's also worth noting that the New York Times graphics department relies heavily on R for the first versions of many of the graphics they create.

In contrast, the example below comes from USA Today. It's heavily laden with what Tufte refers to as "chartjunk" -- the gratuitous and cartoonish decoration of statistical graphics. The USA Today example is particularly bad because it combines chartjunk with a pie chart drawn in perspective.

[USA Today graphic: a decorated pie chart shown in perspective, an example of chartjunk]

II. Not all observations are created independent.

The majority of methods taught in introductory and intermediate biostatistics courses assume that observations are independent. However, in medical research especially, we encounter data in which our observations are not independent. Examples of non-independent data include:

1. pre and post measurements on the same individuals or experimental units
2. measurements on cases and matched controls
3. longitudinal data (measurements taken repeatedly on the same individuals over time)
4. nested samples (e.g., in a random sample of hospitals, patients are sampled randomly within each hospital)

In each of these cases, some measurements are more related to each other than others. For example, measurements within the same person are probably more similar than measurements across individuals. Measurements on patients within the same hospital may be more similar than measurements on patients treated at different hospitals. Note that nearly all data have dependencies of some kind -- for example, one variable being associated with another. That isn't the type of dependency we mean here. Rather, we're talking about dependencies introduced by how the data were sampled or collected.

It is key to recognize that there are different biostatistical methods for observations that are not independent. Using methods designed for independent data on observations that are not independent can result in erroneous findings. Paired and dependent data can be very powerful, but as the following example illustrates, it's critical to choose the appropriate analysis method.
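As a small illustration of how much this choice can matter, here is a minimal sketch in R with simulated pre/post data (the numbers are made up, not from any study). The unpaired test drowns a real treatment effect in between-person variation, while the paired test isolates the within-person change:

    set.seed(42)
    pre  <- rnorm(15, mean = 100, sd = 20)    # people differ a lot at baseline
    post <- pre + rnorm(15, mean = 5, sd = 3) # each person shifts up by about 5
    t.test(pre, post)$p.value                 # wrong: treats 30 values as independent
    t.test(pre, post, paired = TRUE)$p.value  # right: analyzes within-person differences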
Example: Hodgkin's and Tonsillectomy

This example comes from the early 1970s and involves two studies investigating whether having a tonsillectomy is associated with the risk of Hodgkin's lymphoma. Although dated, the example uses a nice assortment of biostatistical methods that are familiar and accessible to many people in the room.

The first study was published by Vianna, Greenwald, and Davies in 1971 in the Lancet (citations at the end of this section). They conducted a case-control study in which controls were unmatched. They recruited 101 Hodgkin's patients, and 107 controls who were as a group similar to the cases but not matched on an individual basis. The data are summarized in the following 2 x 2 table:

                         Hodgkin's   Control
    Tonsillectomy           67          43
    No Tonsillectomy        34          64

An appropriate summary of the association between the exposure (tonsillectomy) and the outcome (Hodgkin's) is the odds ratio. The odds ratio captures the association between two binary variables and is often used in medical research. Note that because this was a case-control study, and the authors recruited cases, we can't say anything about the risk of disease in the exposed group compared to the risk of disease in the unexposed group. You may have seen studies report a measure called "relative risk," but we can't do that here. One of the advantages of the odds ratio is that it is still valid in case-control studies, where participants are selected on the outcome of interest.

The odds ratio comparing disease in the exposed group to the unexposed group is simply the odds of disease in the exposed group divided by the odds of disease in the unexposed group. In the tonsillectomy group, the odds of Hodgkin's are 67 to 43, or 67/43 -- for every 67 Hodgkin's cases, we have 43 controls. In the no-tonsillectomy group, the odds are 34/64 -- for every 34 Hodgkin's cases, we have 64 controls. So the odds ratio is (67/43)/(34/64) = 2.93. Odds ratios greater than one suggest that the exposure promotes the outcome; odds ratios less than one suggest that the exposure is protective against the outcome. The odds ratio is always greater than zero.

An odds ratio of 2.93 is reasonably large, and suggests that having a tonsillectomy may be associated with developing Hodgkin's. However, the size alone is not enough to convince us of an association. Instead, we need a test for association between the rows of the table (case/control status) and the columns (tonsillectomy/no tonsillectomy). The appropriate test should be familiar: the chi-squared test of homogeneity. It's worth noting that we could also examine the association via logistic regression, in which we model the log of the odds of disease, but that's beyond the scope of today's discussion.

The chi-squared test compares the observed data to the cell counts we would expect if the exposure were independent of the outcome.

Observed data:

                         Hodgkin's   Control   Total
    Tonsillectomy           67          43      110
    No Tonsillectomy        34          64       98
    Total                  101         107      208

Expected data (if rows and columns were independent):

                         Hodgkin's   Control   Total
    Tonsillectomy          53.4        56.6     110
    No Tonsillectomy       47.6        50.4      98
    Total                 101         107       208

The formula for the expected cell counts is (row total x column total)/overall total, but this is just as easy to reason out: if there are 101 people with Hodgkin's, and Hodgkin's is unrelated to tonsillectomy, we would expect 101 x (110/208) = 53.4 of them to fall in the upper left-hand cell, because overall, 110/208 of our participants have had tonsillectomies.

The chi-squared statistic is equal to ∑ (observed − expected)² / expected, summed over the cells, and it has a chi-squared distribution with 1 degree of freedom provided the expected cell counts are all greater than 5 and the total number of subjects is "large" (208 here, which is plenty). For this example, the chi-squared statistic is 14.46, with an associated p-value of less than 0.001. This is statistically significant evidence for an association between Hodgkin's and tonsillectomy.
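For reference, here is a minimal sketch in R reproducing this analysis from the 2 x 2 table (the small difference from the hand-computed statistic above comes from rounding and from whether a continuity correction is applied):

    tab <- matrix(c(67, 34, 43, 64), nrow = 2,
                  dimnames = list(c("Tonsillectomy", "No tonsillectomy"),
                                  c("Hodgkin's", "Control")))
    (67/43) / (34/64)                 # odds ratio, about 2.93
    chisq.test(tab, correct = FALSE)  # X-squared ~ 14.3, p < 0.001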
The second study was published a year later in the New England Journal of Medicine, and reported on another case-control study examining the same association. In this study, however, the cases and controls were matched: there were 85 pairs in which Hodgkin's patients were matched to siblings of the same sex and within 5 years of age. The data were summarized in a 2 x 2 table analogous to the tables above:

                         Hodgkin's   Control
    Tonsillectomy           41          33
    No Tonsillectomy        44          52

The associated odds ratio is (41/33)/(44/52) = 1.47 -- not as large as what was observed before. Furthermore, the associated chi-squared statistic was 1.53, with an associated p-value of 0.22. The authors concluded that their data failed to support the association published by Vianna, Greenwald, and Davies, and their contradictory finding was reported in the NEJM. But what's wrong with this analysis? The problem is that it ignored the pairings in the data.

Although the 2 x 2 table above is not technically incorrect, it is incredibly misleading: it suggests that there are 170 independent observations when there are only 85 pairs! The odds ratio is not correct at all. A much better table shows the pairings:

                                      Sibling
                         Tonsillectomy   No Tonsillectomy
    Hodgkin's
    Tonsillectomy             26               15
    No Tonsillectomy           7               37

We can think of each of the 85 pairs as falling into one of four categories: (1) both had tonsillectomies; (2) neither had a tonsillectomy; (3) the sibling had a tonsillectomy and the Hodgkin's patient did not; (4) the Hodgkin's patient had a tonsillectomy but the sibling did not. Only the last two categories -- the discordant pairs -- tell us anything about the association between Hodgkin's and tonsillectomy.

The appropriate test in this case compares the percentage of pairs in which the sibling had a tonsillectomy but the Hodgkin's patient did not (7/85 = 8%) to the percentage of pairs in which the Hodgkin's patient had the tonsillectomy but the sibling did not (15/85 = 18%). If Hodgkin's and tonsillectomy were unrelated, we would expect the discordant pairs to split evenly between these two categories (so about 11/85 = 13% in each group). Are our percentages different enough from what we would expect that we ought to be suspicious? The correct test in this case is called McNemar's test, and when applied to this data it yields a p-value of 0.09. Although not statistically significant, this certainly does not cast as much doubt on the first study as the published analysis suggested. We could also use conditional logistic regression to estimate an accurate odds ratio.
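Here is a minimal sketch in R for the matched analysis (the uncorrected version matches the p-value quoted above; by default R applies a continuity correction, which gives a slightly larger p-value):

    pairs <- matrix(c(26, 7, 15, 37), nrow = 2,
                    dimnames = list(Hodgkins = c("Tonsillectomy", "No tonsillectomy"),
                                    Sibling  = c("Tonsillectomy", "No tonsillectomy")))
    mcnemar.test(pairs, correct = FALSE)  # X-squared = (15 - 7)^2/(15 + 7) ~ 2.91, p ~ 0.09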
References:

Vianna, N. J., Greenwald, P., and Davies, J. N. P. "Tonsillectomy and Hodgkin's disease: the lymphoid tissue barrier." Lancet i: 431-432, 1971.

Johnson, S. K., and Johnson, R. E. "Tonsillectomy history in Hodgkin's disease." N Engl J Med 287: 1122-1125, November 30, 1972.

Rice, J. A. Mathematical Statistics and Data Analysis. Duxbury, 1995.

Final thoughts on paired versus independent data:

I find it useful to think about independent versus dependent data in terms of sources of variation. Suppose we are allotted a total of 10 measurements of a quantity of interest. We can take these measurements across 10 different people, 2 measurements each on 5 different people, 5 measurements each on 2 people, or all 10 measurements on the same person. When we have single measurements across many (independently sampled) people, we learn a lot about how a measurement varies across a population, but nothing about how it varies within a person. When we take 10 measurements on the same person, we learn a lot about how the measurement varies within a person, but nothing about how it varies across a population. For the in-between scenarios, we learn something about within-person variation and something about across-person variation. These are the scenarios in which we need to be especially careful not to treat the 10 measurements as independent.

You are probably already aware of some methods for data that aren't independent -- the paired t-test is perhaps the most common. There are many other methods suitable for dependent observations, such as generalized linear mixed models or conditional logistic regression. These are standard practice in biostatistics and relatively accessible to someone with intermediate biostatistics knowledge. If you have dependencies among your observations and you're not sure how to account for them, be sure to consult your friendly neighborhood biostatistician. It's important to remember that paired data can actually be incredibly powerful, but you must do the analysis correctly.

III. What is a p-value really?

1. An illustrative example

There may be no more universally confusing or misunderstood idea in statistics than the p-value. Although I'm not a poker player, I find it useful to think about interpreting p-values using the following scenario:

Dr. X and I are playing poker. Dr. X is winning. In fact, Dr. X's last two hands were a flush and a straight. I'm forced to wonder: is Dr. X cheating? This scenario leads me to set up a hypothesis test:

1. Suppose Dr. X is playing fairly (note that this is the opposite of what I suspect). This is called the null hypothesis, or H0.

2. I observe the data: Dr. X's next hand is two pair. It is critical to note that the data that led to the generation of the hypothesis (the flush and the straight) CANNOT be used to test the hypothesis. That would be ludicrous; unfortunately, people are tempted to do it all the time. It is akin to "fishing" for results, and will produce erroneous and unreplicable findings.

3. I next figure out the probability of Dr. X having a hand of two pair or better if the null hypothesis is true (i.e., if Dr. X is indeed playing fairly). This is called the p-value, and for this example it is approximately 0.08.

4. If this probability is "small," I conclude that my original supposition (the null hypothesis) might not be right. This would lead me to believe that Dr. X is indeed cheating. In formal statistical language, I reject the null hypothesis in favor of the alternative hypothesis. The alternative hypothesis, H1 or Ha, is in most cases the opposite of the null hypothesis. If the probability is not "small," I conclude that I don't have sufficient evidence to reject the null hypothesis. It is important to note that this is not the same as "accepting" the null hypothesis, or showing that the null hypothesis is true. Dr. X may indeed have been cheating; we just didn't detect it.

It's also important to note that we haven't "proved" anything, nor have we computed the probability that Dr. X is cheating. Maybe Dr. X is just very lucky, and isn't cheating at all. Or maybe Dr. X is clever enough to cheat just enough that we don't detect it as statistically significant. All we've discovered is that Dr. X would get a hand of two pair or better only about 8% of the time if Dr. X were playing fairly.

One final note on this example: it's important to define what "small" means before you actually conduct the test. For most analyses, "small" means 0.05 (this is called the "alpha" level). Some clinical trials may use 0.01; some epidemiological studies may use 0.10. The choice of "small" depends very much on the end objectives of the analysis and should never be decided after looking at the results of the test. That, again, is akin to fishing.

2. The p-value defined

The p-value is the probability of the observed data (or of more "extreme" data), under the assumption that the null hypothesis is true:

    p-value = Pr(data | H0)

This actually doesn't tell us what we'd really like to know: the probability of the null hypothesis given the data, or the probability of the alternative hypothesis given the data -- namely Pr(H0 | data) or Pr(H1 | data). If Dr. X is not cheating, we would expect Dr. X to get two pair or better less than 8% of the time. Note that nowhere have we computed Pr(Dr. X cheating | observe two pair).
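One way to internalize the definition is to simulate it. In the minimal sketch below (an illustration of the definition, not part of the lecture's poker example), two groups are repeatedly drawn from the same distribution, so the null hypothesis is true by construction; "significant" results still turn up about 5% of the time, which is exactly what working at alpha = 0.05 implies:

    set.seed(2013)
    pvals <- replicate(10000, t.test(rnorm(20), rnorm(20))$p.value)
    mean(pvals < 0.05)  # ~0.05: the false-alarm rate when H0 is true
    hist(pvals)         # p-values are roughly uniform under the null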
3. A significance test

(Adapted from "The Null Ritual: What You Always Wanted to Know About Significance Testing but Were Afraid to Ask," Gigerenzer, G., Krauss, S., and Vitouch, O., in The Sage Handbook of Quantitative Methodology for the Social Sciences, David Kaplan, Ed., 2004.)

Suppose you have a treatment that you suspect may alter performance on a task. You compare the means of your control and experimental groups (say, 20 subjects per group). You use a simple independent-means t-test, and your result is significant (t = 2.7, df = 18, p = 0.01).

    H0: µ1 = µ2
    H1: µ1 ≠ µ2

Please answer each of the following TRUE or FALSE:

1. You have disproved the null hypothesis (i.e., that there is no difference between the population means).
2. You have found the probability of the null hypothesis being true.
3. You have proved your alternative hypothesis (i.e., that there is a difference between the population means).
4. You can deduce the probability of the alternative hypothesis being true.
5. If you reject the null hypothesis, you know the probability that you are making the wrong decision.
6. If the experiment were repeated thousands of times, you would obtain a significant result about 99% of the time.

(If you marked any of these TRUE, reread the previous subsection: all six statements are false. Each confuses Pr(data | H0) with a probability about the hypotheses themselves or about future replications.)

4. Final thoughts

A few parting thoughts to keep in mind about hypothesis testing:

1. Statistics can't "prove" anything.
2. The p-value is not the probability of a hypothesis.
3. Unfortunately, we can reject the hypothesis that most p-values are interpreted correctly.

IV. How to collaborate with a biostatistician.

As I mentioned at the beginning, I am part of the Biostatistics Collaboration Center, housed in the Department of Preventive Medicine. We're a group of faculty and master's-level biostatisticians who love collaborating with investigators. Here's our mission statement:

The primary goal of the BCC is to collaborate and consult with FSM researchers in order to produce studies and statistical analyses that ultimately result in funded grants, peer-reviewed publications, and presentations at professional meetings. Typically the best results come from researchers and statisticians working hand-in-hand as collaborators in these activities.

We help investigators in a number of different ways, so I encourage you to check out our website, http://www.feinberg.northwestern.edu/sites/bcc/. We offer everything from free one-hour consultations, to help developing proposals, to long-term collaborations in which our faculty members become key co-investigators in research groups. There are also a number of resources on how we can best help you:

Guideline Summary: Know what your biostatistician needs from you.
http://www.feinberg.northwestern.edu/sites/bcc/docs/StatsCollaborationGuideSummary.pdf

Part I: Preliminary Help (Grants and Power). Prepare for your statistical collaboration pertaining to grant applications.
http://www.feinberg.northwestern.edu/sites/bcc/docs/PowerGuide.pdf

Part II: Database Issues. Collect and/or organize your data in the most effective way for statistical analysis.
http://www.feinberg.northwestern.edu/sites/bcc/docs/DataGuide.pdf

Part III: Analysis and Write-Up. Work efficiently with a biostatistician through the analysis and write-up phase.
http://www.feinberg.northwestern.edu/sites/bcc/docs/ProjectGuide.pdf

Investigators often think of us as the people to come to when a grant deadline is approaching and they need a power calculation or sample size justification, or as folks who can help once data have been collected.
Although some of what we do may appear mysterious, it's important to remember that we're not magicians! Unfortunately, we can't conduct a power analysis when we haven't had a chance to thoroughly learn about the science at hand, nor can we wave a statistical wand to salvage poorly collected data. But we can do a lot when investigators come to us early in the grant development phase and we participate fully in the planning and execution of the proposed research. Here is a schematic for developing biomedical research, and how a biostatistician can help (with thanks to Dr. Denise Scholtens):

[Schematic: the stages of a biomedical research project, with the points at which a biostatistician can contribute]

Dr. Roger Peng, a biostatistician, and his collaborator Dr. Elizabeth Matsui, both Associate Professors at the Johns Hopkins Bloomberg School of Public Health, have written informative and entertaining posts on how to collaborate with a biostatistician and, in turn, how to collaborate with a scientist:

http://simplystatistics.org/2013/10/08/the-care-and-feeding-of-the-biostatistician/
http://simplystatistics.org/2013/10/09/the-care-and-feeding-of-your-scientist-collaborator/

Next week's lecture information:

Lecture #2 - Basic Biostatistics in Medical Research: Emerging Trends
Thursday, November 14th, from 1:30-3pm
Lurie Hughes Auditorium
303 E. Superior St., Chicago

Contact information:

Biostatistics Collaboration Center <bcc@northwestern.edu>
http://www.feinberg.northwestern.edu/sites/bcc/