Relevant End-points and Outcome Variables
What Can We Learn from the Editor?

Miquel A. Gassull, M.D.
Editor, Clinical Nutrition, Dept. of Gastroenterology, Hospital Universitari Germans Trias i Pujol, Badalona, Spain, phone & fax +34-93-465-1385, e-mail: mgassull@ns.hugtip.scs.es

Learning Objectives
To describe the role of primary vs secondary end-points in the interpretation of trial results
To list clinically relevant and surrogate end-points
To list criteria for appropriate subgroup analysis
To identify “data torturing” in published trials

Many concepts in clinical nutrition and metabolic care have changed as much in the last three decades as those of other medical disciplines did over a century. This accelerated evolution of nutritional science creates the need to answer new questions (particularly in the field of therapeutics) before older ones (mainly concerning the pathophysiology of nutrient-disease relationships) have been resolved. Under such intellectual pressure, there is a risk that the aims of investigators will exceed what their research can deliver. At best, this may be a source of frustration for the investigators. At worst, it may lead to misleading results. The following paragraphs discuss the value and limitations of the different end-points in clinical research, and their relevance for clinical nutrition.

A Hypothesis for Each Trial: The Primary End-point

The reliability, and hence the relevance, of a clinical trial depends on the prospective definition of an adequate primary end-point. This should be ethically acceptable, biologically plausible, clinically relevant, and potentially influenced by the intervention under investigation. In addition, it has to be precisely defined and easy to assess [1]. For these reasons, a qualitative, binary variable (mortality, morbidity, etc.) is preferred.
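
To make the link between a binary primary end-point and trial design concrete, the sketch below applies the usual normal-approximation formula for comparing two proportions. It is only an illustration: the function name, the event rates, the two-sided alpha of 0.05 and the power of 0.80 are assumptions chosen for the example, not values taken from any of the cited trials.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate number of patients per arm for a two-arm trial with a
    binary primary end-point (normal approximation, unpooled variance).

    p_control and p_treatment are the expected event rates (hypothetical).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_control - p_treatment
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical example: morbidity expected to fall from 30% to 20%.
print(sample_size_per_arm(0.30, 0.20))
# Halving the expected absolute difference requires far more patients.
print(sample_size_per_arm(0.30, 0.25))
```

The required number of patients grows with the inverse square of the expected difference, which is why the end-point and the expected effect have to be fixed before the trial starts.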
On the other hand, the primary end-point must be unique, and the trial must be designed primarily to answer this single question. This is of utmost importance, since the sample size is calculated on its basis. It follows that only the results regarding the primary end-point can be considered reliable [2]. Unfortunately, the prospective definition of a primary end-point is not general practice. Even in journals with a high impact factor, 25% to 50% of randomised controlled trials lack a prospectively defined primary end-point [3].

Secondary End-points and Subgroup Analysis

When the main hypothesis of a study cannot be demonstrated, one might be tempted to resort to so-called “data torturing”; i.e., to search opportunistically for significant associations between variables even though the study was not primarily designed for this [4]. When this is done indiscriminately, the probability of finding significant associations merely by chance is very high. In other words, the practice of analysing a host of outcomes (secondary end-points) and then highlighting all those that are statistically significant increases the risk of concluding that treatments are different when they are not. Although multivariate techniques can be used to control the overall type I error, these techniques rarely help to interpret data in multiple-outcome situations, because they do not take into account the different clinical importance of the individual outcomes [3]. This does not mean that it is pointless to prospectively define secondary end-points for a clinical trial. Like the primary outcome, secondary outcomes must be biologically plausible, clinically relevant and, most importantly, few in number.
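
The inflation of the type I error when many outcomes are tested can be sketched with a short calculation. The example assumes that the tests are independent and each uses the same significance level, which is a simplification (outcomes in a real trial are usually correlated); the function names are illustrative. The Bonferroni threshold is shown only as one common, simple correction and is not a method named in the text.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false-positive result among n_tests
    independent tests, each performed at significance level alpha."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance level that keeps the overall type I error
    at or below alpha (Bonferroni correction; an illustrative choice)."""
    return alpha / n_tests


# Chance of a spurious "significant" finding as more outcomes are analysed.
for n_tests in (1, 5, 10, 20):
    rate = familywise_error_rate(n_tests)
    print(f"{n_tests:2d} outcomes: {rate:.1%} chance of at least one false positive")
```

Under these assumptions, the chance of at least one spurious finding rises steeply with the number of outcomes analysed, which is the quantitative reason for keeping secondary end-points few in number.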
Since the sample size is calculated on the basis of the primary end-point, any information obtained from secondary end-points should be considered merely descriptive: complementary to the primary outcome, but not sufficient to modify future diagnostic and/or therapeutic guidelines [5]. Prospectively defined secondary end-points can be useful for generating new hypotheses to be investigated in future trials. In fact, there are studies designed for this particular purpose. Hypothesis-generating studies should be explicitly identified as such, so that they can be distinguished from opportunistic data torturing [4].

Subgroup analysis is another common form of data torturing. It consists of comparing the primary end-point among patients subdivided according to baseline characteristics [5-7]. Again, the subgroups to be analysed should ideally be defined a priori on two bases: single-factor subgroups with a strong rationale for biological response modification, and multifactorial prognostic subgroups defined from baseline risks [7]. However, single-factor subgroup analyses are often reported without a supporting rationale or formal statistical tests for interactions. We suggest that clinicians interpret published subgroup-specific variations in treatment effects sceptically unless there is a prespecified rationale and a significant treatment-subgroup interaction [5,7]. A major drawback of subgroup analysis is the loss of statistical power due to the reduction in the sample analysed. Oxman and Guyatt [8] developed a series of questions to help clinicians decide whether apparent differences in subgroup responses are real (Table 1).

Table 1. Are Apparent Differences in Subgroup Responses Real? [8]
Is the magnitude of the difference clinically important?
Was the difference statistically significant?
Did the hypothesis precede rather than follow the analysis?
Was the subgroup analysis one of a small number of hypotheses tested?
Was the difference suggested by comparisons within rather than between studies?
Was the difference consistent across studies?
Is there indirect evidence that supports the hypothesised difference?

Surrogate End-points

As mentioned, the primary end-point of a clinical trial must be “clinically relevant”. This means that it has to be important for the patient. From this perspective, the only clinically relevant end-points are mortality, morbidity, and health-related quality of life. Often, however, trials based upon these outcomes require such a large sample size, or such long patient follow-up, that researchers look for alternatives. Substituting surrogate end-points for the target event offers an apparent solution to the dilemma. A surrogate end-point can be defined as a laboratory measurement or a physical sign used as a substitute for a clinically meaningful end-point that measures directly how a patient feels, functions or survives [9].

Reliance on surrogate end-points may be a double-edged sword. Their use is indispensable for drug evaluation in phase 2 and early phase 3 trials geared to establishing a drug’s promise of benefit. However, they should not be sufficient grounds for introducing a new therapy unless no effective therapy exists and the mortality and morbidity of the disease to be treated are high. In fact, a positive effect of an intervention upon a surrogate variable by no means guarantees that the intervention is effective on morbidity or mortality; it may be ineffective or even deleterious [9]. The surrogate variable must be in the causal pathway of the disease process to be treated, but this is not by itself a reason for recommending a treatment based only upon trials with such an end-point [10]. A guide for reliance on a surrogate end-point trial is provided in Table 2 [9].

Table 2. A Guide for Reliance on a Surrogate End-Point Trial [9]
Necessary but not sufficient: is there a strong, independent, consistent association between the surrogate end-point and the clinical end-point?
Is there evidence from randomised trials in other drug classes that improvement in the surrogate end-point has consistently led to improvement in the target outcome?*
Is there evidence from randomised trials in the same drug class that improvement in the surrogate end-point has consistently led to improvement in the target outcome?*
How large, precise and lasting was the treatment effect? (The effect should be large, precise and lasting to consider a surrogate end-point trial as a possible basis for offering patients the treatment.)
Are the likely treatment benefits worth the potential harms and costs? (Offer treatments on the basis of surrogate data only if the patient’s risk for the target outcome is high, the patient places a high value on avoiding the target outcome, and there are no satisfactory alternative therapies.)
(*The answer to one or both of these questions should be “yes”.)

Even if a surrogate end-point meets all these criteria, inferences about a treatment benefit may still prove misleading. Hence, waiting for randomised trials investigating the effect of the treatment on outcomes of unequivocal importance to patients is the only definitive solution to the surrogate end-point dilemma [9].

Research in Clinical Nutrition and Metabolic Care: A Peculiar Scenario

Obtaining reliable results from randomised trials in the field of clinical nutrition is particularly difficult. This may be due to several reasons, some of them related to the definition of the end-points:

The traditional end-point for nutritional therapy has been nutritional replenishment, that is, the correction of protein-energy malnutrition and other nutritional deficiencies, which have to be regarded as surrogate variables for other, clinically relevant outcomes (morbidity, survival, quality of life).
However, the concept that malnutrition is a surrogate of mortality and morbidity has been widely postulated but seldom proven. In hospitalised patients, the co-existence of malnutrition with other diseases makes the relationship between malnutrition and patient outcome particularly difficult to demonstrate [11].

It has to be taken into account that, in contrast to traditional drugs, nutrients are not xenobiotics. This does not mean that the concept of “nutritional pharmacology” is invalid, but it may be important for the design of therapeutic trials: it is conceivable that a therapeutic effect would be harder to demonstrate (e.g. would require a larger sample size) for a nutrient than for a drug, particularly for relevant end-points, and the choice of a reliable placebo for comparison purposes may be extremely difficult if not impossible. In fact, the validity of some “traditional” placebos in nutrition research has recently been questioned [12].

These difficulties can be illustrated by critically reading one of the most cited guidelines in clinical nutrition in recent years [13]. This was a review of the published data on the major topics in clinical nutrition, and its recommendations were classified as A, B or C depending on the sources they were based upon. It is noteworthy that only 16 (42%) of the 38 conclusions of this review were based on class A data (namely, randomised controlled trials or their meta-analyses). Furthermore, only 9 of these 16 “gold” conclusions were related to clinically relevant end-points, and in 7 the conclusion was negative (i.e. no effect or a deleterious effect of the nutritional intervention) or inconclusive (i.e. it read like “there are not enough data to conclude...”) [13].

Conclusions

The first conclusion to draw from the preceding paragraphs is that high-quality research in clinical nutrition is extremely difficult to do.
However, a corollary to such a conclusion is that high-quality research in clinical nutrition is not impossible. It requires defining end-points that are clinically relevant as well as realistic. To achieve this, multicentre collaborative studies are mandatory.

References
1. Meinert CL. Toward more definitive clinical trials. Control Clin Trials 1980; 1: 249-261
2. Moye LA. End-point interpretation in clinical trials: The case for discipline. Control Clin Trials 1999; 20: 40-51
3. Zhang B, Schmidt B. Do we measure the right end points? A systematic review of primary outcomes in recent neonatal randomized trials. J Pediatr 2001; 138: 76-80
4. Mills JL. Data torturing. N Engl J Med 1993; 329: 1196-1199
5. Freemantle N. Interpreting the results of secondary end points and subgroup analyses in clinical trials: Should we lock the crazy aunt in the attic? Br Med J 2001; 322: 989-991
6. Assmann SF, Pocock SJ, Enos LE, Kasten LE. Subgroup analysis and other (mis)uses of baseline data in clinical trials. Lancet 2000; 355: 1064-1069
7. Parker AB, Naylor CD. Subgroups, treatment effects, and baseline risks: some lessons from major cardiovascular trials. Am Heart J 2000; 139: 952-961
8. Oxman AD, Guyatt GH. A consumer's guide to subgroup analyses. Ann Intern Med 1992; 116: 78-84
9. Bucher HC, Guyatt GH, Cook DJ, Holbrook A, McAlister FA. Users' guides to the medical literature: XIX. Applying clinical trial results. A. How to use an article measuring the effect of an intervention on surrogate end points. Evidence-Based Medicine Working Group. J Am Med Assoc 1999; 282: 771-778
10. Bobbio M. Surrogate end points. Ital Heart J 2000; 1(suppl): 877-879
11. Cabré E, Gassull MA. Efecto de la malnutrición sobre la mortalidad: Significación estadística versus significación clínica. In: Miján de la Torre A, ed. Nutrición Clínica: Bases y Fundamentos. Barcelona: Ediciones Doyma S.L., 2000: 473-485
12. Hall JC. Glycine. J Parent Enteral Nutr 1998; 22: 393-398
13. Klein S, Kinney JM, Jeejeebhoy K, Alpers DH, Hellerstein M, Murray M et al. Nutrition support in clinical practice: Review of published data and recommendations for future research directions. J Parent Enteral Nutr 1997; 21: 133-156