
A Gene Expression Signature with Independent Prognostic Significance in Epithelial Ovarian Cancer

Dimitrios Spentzos, M.D., Douglas A. Levine, M.D., Marco F. Ramoni, Ph.D., Marie Joseph, Xuesong Gu, Ph.D., Jeff Boyd, Ph.D., Towia A. Libermann, Ph.D., and Stephen A. Cannistra, M.D.*

From the Program of Gynecologic Medical Oncology, Beth Israel Deaconess Medical Center (DS, SAC), Genomics Center (DS, MJ, XG, TL) and Bioinformatics Core (DS, TL), Beth Israel Deaconess Medical Center, Harvard Medical School, Children’s Hospital Informatics Program and Harvard Partners Center for Genetics and Genomics (MFR), Boston, MA, and the Department of Surgery, Memorial Sloan-Kettering Cancer Center, NY (DL, JB).

Funded in part through grants from the Patricia Cronin Foundation, the Director’s Challenge Grant (U01 CA88175), R01 CA85467, and U24 DK58739.

*Address correspondence and reprint requests to Dr. Stephen A. Cannistra, Program of Gynecologic Medical Oncology, Beth Israel Deaconess Medical Center, 330 Brookline Avenue, Boston, MA 02215. (Telephone: 617-667-4283; Fax: 617-975-5598; E-mail: scannist@bidmc.harvard.edu).

Running head: A prognostic gene signature in ovarian cancer

Presented in part at the Annual Meeting of the American Society of Clinical Oncology, Chicago, IL, June 2003.

Abstract

Purpose: Currently available clinical and molecular prognostic factors provide an imperfect assessment of prognosis for patients with epithelial ovarian cancer (EOC). In this study, we investigated whether tumor transcription profiling could be used as a prognostic tool in this disease.

Methods: Tumor tissue from 68 patients was profiled with oligonucleotide microarrays. Samples were randomly split into training and validation sets. A three-step training procedure was employed to discover a statistically significant Kaplan-Meier split in the training set. The resultant prognostic signature was then tested on an independent validation set for confirmation.

Results: In the training set, a 115-gene signature referred to as the Ovarian Cancer Prognostic Profile (OCPP) was identified. When applied to the validation set, the OCPP distinguished between patients with unfavorable and favorable overall survival (median 30 months versus not yet reached, respectively, log-rank p=0.004). The signature maintained independent prognostic value in multivariate analysis, controlling for other known prognostic factors such as age, stage, grade, and debulking status. The Hazard Ratio for death in the unfavorable OCPP group was 4.8, p=0.021 by Cox Proportional Hazards analysis.

Conclusion: The OCPP is an independent prognostic determinant of outcome in EOC. The use of gene profiling may ultimately permit identification of EOC patients appropriate for investigational treatment approaches, based upon a low likelihood of achieving prolonged survival with standard first-line platinum-based therapy.

Introduction

The majority of patients with epithelial ovarian cancer (EOC) are diagnosed with advanced disease involving sites such as the upper abdomen, pleural space, and para-aortic lymph nodes 1 . Post-operative chemotherapy is almost always required in an attempt to eradicate residual tumor that remains after initial surgery. Standard chemotherapy with carboplatin in combination with a taxane results in an initial response rate of over 70%, although subsequent relapse frequently occurs and the disease eventually becomes resistant to a wide variety of agents 2 . Consequently, the long-term survival of patients with upper abdominal involvement (stage III) or those with disease beyond the abdomen (stage IV) ranges from 30% to less than 10% 1 .

Despite the highly lethal nature of EOC, the clinical course of advanced disease can be difficult to predict in an individual patient. A small fraction of patients will be cured with surgery followed by chemotherapy, another group will experience relapse after a relatively long time interval (e.g., greater than 1-2 years), others will relapse and succumb to this disease within months of completing first-line therapy, and some will exhibit primary resistance to first-line chemotherapy. For patients with advanced disease, features associated with a more favorable prognosis include the ability to perform an optimal surgical debulking, low grade disease, non-clear cell histology, age less than 65 years, a rapid serologic (CA-125) response to chemotherapy, the presence of a BRCA-1 germ-line mutation, and overexpression of pro-apoptotic proteins such as BAX 1,3-7 . Nonetheless, these prognostic factors are imperfect predictors of outcome, and for the most part they do not provide insight into the biologic mechanisms responsible for clinical behavior.

The heterogeneity of clinical outcomes in patients with ovarian cancer suggests that reliable prognostic and/or predictive factors would be of potential clinical value. Accurate predictive markers might identify patients who are appropriate candidates for novel first-line experimental approaches, based upon a high chance of exhibiting resistance to standard first-line chemotherapy. Alternatively, accurate prognostic factors may permit identification of patients who are likely to relapse and die of disease, despite achievement of a complete response. Such patients may be appropriate candidates for experimental approaches designed to determine the value of maintenance or consolidation strategies, for instance. Finally, reliable prognostic and/or predictive factors might provide important insights into the biology of drug resistance and tumor aggressiveness, yielding potentially new molecular targets for drug development.

Previous studies investigating the mechanisms of drug resistance, tumor growth, and metastatic potential have revealed that these processes are multifactorial in nature, associated with genetic abnormalities in multiple gene families. Thus, more recent attempts to develop accurate predictors of clinical outcome in other malignancies have focused on techniques that are capable of assessing global gene expression. This task has become feasible through the development of genome-wide expression arrays (cDNA and oligonucleotide microarrays), which have been capable of distinguishing between specific tumor types (e.g., myeloid versus lymphoid leukemia), between specific histologic subtypes (e.g., follicular versus large cell lymphoma), and between different clinical outcomes 8-11 . For example, microarray expression profiles in patients with non-Hodgkin’s lymphoma have been recently shown to provide prognostic information that was independent of standard clinical metrics such as the International Prognostic Index, attesting to the potential clinical utility of this technique 10,11 .

In this study, we utilized oligonucleotide microarrays to globally analyze gene expression of primary ovarian cancer samples in order to define profiles that have prognostic relevance. We demonstrate that it is possible to accurately prognosticate clinical outcome in patients with EOC using this technique, and we discuss the potential relevance of these findings for clinical management.

Materials and Methods

Study subjects. Sixty-eight patients with epithelial ovarian cancer diagnosed between January 1995 and October 2000 form the basis of this study (n=38 patients from Beth Israel Deaconess Medical Center (BIDMC) and n=30 patients from Memorial Sloan-Kettering Cancer Center (MSKCC)). All patients underwent exploratory laparotomy for diagnosis, staging, and debulking, followed by first-line platinum/taxane based chemotherapy. Standard post-chemotherapy surveillance included serial physical examination, serum CA-125 level, and CT scanning based upon clinical suspicion of relapse. At one of the two institutions (MSKCC), patients who were in complete clinical remission after standard chemotherapy were considered for a second-look laparoscopy, although findings from this procedure were not taken into account in the definition of complete clinical remission (see below). Follow-up data for this study were extracted from the Ovarian Cancer Relational Database at BIDMC and the Ovarian Cancer Clinical Database at MSKCC. The study protocol for collection of tissue and clinical information was approved by the Institutional Review Boards at both institutions, and patients provided written informed consent authorizing collection and use of the tissue for study purposes.

Clinical definitions. Staging was assessed in accordance with the International Federation of Gynecology and Obstetrics (FIGO) 1 . Optimal debulking was defined as less than or equal to 1 cm of gross residual disease, and suboptimal debulking as more than 1 cm of residual disease. A complete clinical response/remission (CCR) was defined as resolution of all clinical/radiographic evidence of disease and normalization of the serum CA-125 level after the completion of first-line chemotherapy. Completion of first-line chemotherapy was considered to be the date of the last administered cycle of treatment. For the purpose of this study, persistent disease was defined as lack of a complete response to first-line chemotherapy. For patients who achieved a CCR, disease-free survival (DFS) was defined as the time interval between the end of first-line chemotherapy and the first confirmed sign of disease recurrence. Overall survival (OS) was defined as the time interval between the date of diagnosis and the date of death from any cause.

RNA isolation. Ovarian cancer samples were collected at the time of primary debulking surgery and frozen at –80 °C. Microdissection was not used in this analysis, in order to assess the contribution of stromal and hematopoietic cell elements to the genetic profile. Tumor samples were pulverized in liquid nitrogen and homogenized in Trizol solution, followed by RNA isolation using standard techniques.

cDNA synthesis, microarray probe preparation, and Affymetrix GeneChip hybridization. These procedures were carried out using standard protocols and are described in detail in the on-line supplement to this manuscript (www.bidmcgenomics.org/OvarianCancer) as well as in previous publications 12-15 . The Affymetrix U95A2 array, containing 12,625 transcripts, was used. Image analysis was performed using the MAS 5 Affymetrix algorithm 12-15 .

Training set data analysis. A three-step process was developed to identify a gene expression profile using a randomly chosen training set of 34 samples (Figure 1). In step 1, samples from seven patients with the shortest survival (excluding censored patients) and seven patients with the longest known survival were analyzed with supervised statistical methods of pattern recognition and class prediction (first training step, Figure 1) 16-23 . The subsets of genes with the highest predictive accuracy (by leave-one-out cross validation) 8 for the initial 14 samples were then selected for a second training step (Figure 1), in order to refine the expression profile. For this step, class labels were assigned to the remaining 20 patient samples from the training set, by predicting their class membership using the genes identified in the first training step. Once the labels were assigned, the survival times of the entire group of 34 training samples were assessed by Kaplan-Meier analysis. Predictive signatures with various numbers of genes were tested (all of them with the highest predictive accuracy for the first 14 training samples), until a distinction with maximal statistical significance and stable class assignments was reached by Kaplan-Meier analysis. The class assignments that yielded the best survival discrimination were considered to be the candidate phenotypes for final refinement in the third training step (Figure 1). For this step, the entire training set of 34 samples was then split into a favorable and an unfavorable group (based on these class assignments), and the two groups were again subjected to pattern recognition and class prediction analysis. The signature with the highest predictive accuracy (by leave-one-out cross validation) for the previously assigned 34 labels was chosen as the final gene profile. The resultant gene profile was then applied to an independent set of samples (validation set) in order to confirm its prognostic significance.
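The sample selection in the first training step can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_extremes`, the toy cohort, and the event coding (1 = death observed, 0 = censored) are hypothetical and are not taken from the study's software. The sketch picks the k shortest survivors among patients with an observed death and the k longest follow-up times regardless of censoring.

```python
def select_extremes(times, events, k=7):
    """Pick the k shortest uncensored and k longest known survival times.

    times  : follow-up in months
    events : 1 if death was observed, 0 if the patient was censored
    Returns two lists of sample indices (short, long).
    """
    n = len(times)
    # Shortest survivors: censored patients are excluded, because a short
    # censored follow-up does not imply a short survival.
    died = [i for i in range(n) if events[i] == 1]
    short = sorted(died, key=lambda i: times[i])[:k]
    # Longest known survival: censored patients are allowed, since a long
    # follow-up is a lower bound on survival.
    rest = [i for i in range(n) if i not in short]
    long_ = sorted(rest, key=lambda i: times[i], reverse=True)[:k]
    return short, long_

# Hypothetical toy cohort (months, event flag); values are illustrative only.
times = [4, 8, 10, 12, 18, 19, 24, 26, 30, 35, 40, 45, 58, 59, 61, 63, 65, 68, 73]
events = [1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]

short_idx, long_idx = select_extremes(times, events, k=7)
short_times = sorted(times[i] for i in short_idx)
long_times = sorted(times[i] for i in long_idx)
print(short_times, long_times)
```

Note that the patient censored at 8 months in the toy cohort is skipped when the short-survival group is formed, which mirrors the exclusion of censored patients described above.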

Gene expression pattern analysis and class prediction. Details on the pattern recognition algorithm are provided in previous publications 16-23 . Briefly, this is a supervised method that is designed to discover patterns of gene expression associated with binary phenotypes. A pattern is defined as a subset of genes whose expression levels are tightly clustered (usually at a high or low expression level) in a subset of samples within a given phenotype. A computer algorithm (SPLASH) 17 is used to discover all patterns characteristic of the two phenotypes at a given level of statistical significance as previously described 16-22 . The degree of differential gene expression was assessed by a signal-to-noise ratio and a permutation test as described previously 8 . Class predictions at all steps (training and validation) were carried out using the weighted voting 8,11,23-25 and k nearest neighbor (k-nn) 9,26,27 algorithms. Predictive accuracy in the training set was assessed by leave-one-out cross validation 8 . The p value for predictor accuracy was calculated using Fisher’s exact test on the prediction contingency table and by a permutation test as described previously 23,25 .
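The signal-to-noise ranking, weighted voting, and leave-one-out cross validation mentioned above can be illustrated with a short sketch. It follows the commonly described form of these methods (signal-to-noise ratio equal to the difference in class means divided by the sum of class standard deviations; each gene votes with weight equal to that ratio times the distance of the new sample from the midpoint of the class means). The function names, the choice to rank genes by absolute signal-to-noise ratio, and the synthetic data are assumptions made for illustration; this is not the Genes at Work or GeneCluster implementation used in the study.

```python
import numpy as np

def signal_to_noise(X, y):
    """Per-gene signal-to-noise ratio and decision boundary.

    X : (samples, genes) expression matrix; y : 0/1 class labels.
    """
    X1, X0 = X[y == 1], X[y == 0]
    mu1, mu0 = X1.mean(axis=0), X0.mean(axis=0)
    sd1, sd0 = X1.std(axis=0, ddof=1), X0.std(axis=0, ddof=1)
    s2n = (mu1 - mu0) / (sd1 + sd0)
    boundary = (mu1 + mu0) / 2.0
    return s2n, boundary

def weighted_vote(X_train, y_train, x_new, n_genes):
    """Predict the class (0 or 1) of one new sample by weighted voting."""
    s2n, boundary = signal_to_noise(X_train, y_train)
    top = np.argsort(-np.abs(s2n))[:n_genes]   # most discriminating genes
    votes = s2n[top] * (x_new[top] - boundary[top])
    return int(votes.sum() > 0)

def loocv_accuracy(X, y, n_genes):
    """Leave-one-out accuracy; genes are re-selected inside every fold."""
    hits = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        pred = weighted_vote(X[keep], y[keep], X[i], n_genes)
        hits += int(pred == y[i])
    return hits / len(y)

# Synthetic example: 14 samples, 50 genes, the first 10 genes shifted in class 1.
rng = np.random.default_rng(0)
y = np.array([0] * 7 + [1] * 7)
X = rng.normal(size=(14, 50))
X[y == 1, :10] += 3.0

acc = loocv_accuracy(X, y, n_genes=10)
print(f"leave-one-out accuracy: {acc:.2f}")
```

Re-selecting the genes within each fold, rather than once on all samples, keeps the held-out sample from influencing the predictor that is used to classify it.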

Statistical tests and survival analysis. Associations between categorical variables were assessed with Fisher’s exact test. Differences in median values were assessed with the Wilcoxon test when appropriate. All deaths observed in the dataset were cancer-related, meaning that overall survival is equivalent to cancer-specific survival for purposes of this analysis. Overall survival (OS) and disease-free survival (DFS) curves were generated with the Kaplan-Meier method, and differences between survival curves were assessed for statistical significance with the log-rank test. Multivariate analysis for confounding factors was carried out using Cox Proportional Hazards Regression with categorical or continuous covariates as appropriate. For this analysis, gene profile was considered as a binary category (favorable, unfavorable) as described in the predictive analysis. Age was considered as a continuous variable and the rest of the covariates were considered as categorical variables. The p values of all statistical tests were two-sided. The Genes at Work (IBM), Whitehead GeneCluster 2 and SPSS (version 11.5) packages were used for statistical tests. Details of the bioinformatics and statistical methods are provided in the on-line supplement to this manuscript (www.bidmcgenomics.org/OvarianCancer).
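For readers unfamiliar with the survival methods named above, the following is a minimal, dependency-free sketch of the Kaplan-Meier estimator and the two-group log-rank test. It is not the SPSS procedure used in the study, and the function names are hypothetical. The example input uses the 14 overall survival times reported in the Results for the first training step, with "+" values treated as censored; because those samples were chosen as extremes, the small p value is expected by construction and serves only to exercise the code.

```python
import math

def kaplan_meier(times, events):
    """Return [(t, S(t))] evaluated at each distinct death time."""
    surv, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve

def median_survival(curve):
    """First time at which S(t) <= 0.5, or None if not yet reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

def logrank_p(times_a, events_a, times_b, events_b):
    """Two-sided log-rank p value (chi-square with 1 degree of freedom)."""
    times = list(times_a) + list(times_b)
    events = list(events_a) + list(events_b)
    obs_minus_exp, var = 0.0, 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)          # at risk, both groups
        n_a = sum(1 for u in times_a if u >= t)      # at risk, group A
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / var
    return math.erfc(math.sqrt(chi2 / 2.0))          # upper tail, 1 df

# Step-1 samples from the Results: shortest OS (all deaths) and longest OS.
short_t, short_e = [4, 10, 12, 18, 19, 24, 26], [1, 1, 1, 1, 1, 1, 1]
long_t, long_e = [58, 59, 61, 63, 65, 68, 73], [0, 0, 0, 1, 0, 0, 0]

median_short = median_survival(kaplan_meier(short_t, short_e))
median_long = median_survival(kaplan_meier(long_t, long_e))
p_value = logrank_p(short_t, short_e, long_t, long_e)
print(median_short, median_long, p_value)
```

A median of `None` corresponds to the "not yet reached" wording used in the text: the estimated survival curve never falls to 50% during follow-up.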

Results

Patient characteristics. The clinical and pathologic characteristics of the 68 patients with epithelial ovarian cancer are shown in Table 1. The median age at diagnosis was 55 years (range 36 to 80 years), and the majority (96%) had advanced stage (FIGO stages III/IV), grade III tumors (80%), with papillary serous histology (97%). Sixty-five percent of patients were optimally cytoreduced after initial surgery (residual disease less than or equal to 1 cm in diameter), and all received post-operative taxane/platinum-based combination chemotherapy. The median follow up was 40+ months (range 1 to 74+ months), with a median overall survival (OS) for the entire group of 49 months, and a median disease-free survival (DFS) of 15 months. Thus, the survival characteristics of this group are typical for patients with advanced epithelial ovarian cancer.

Development of the Ovarian Cancer Prognostic Profile (OCPP). The strategy for identifying a gene expression profile with prognostic significance is shown in Figure 1. In the first training step, 14 samples out of a randomly chosen training set of 34 samples were initially selected for pattern analysis. This group consisted of 7 samples with the shortest OS times (4, 10, 12, 18, 19, 24, 26 months) and 7 samples with the longest OS times (58+, 59+, 61+, 63, 65+, 68+, 73+ months). These samples were selected in an orderly fashion starting from the most extreme sample on each end of the survival spectrum, with the two groups roughly representing less than 2-year and more than 5-year survival. Pattern analysis comparing the two groups revealed approximately 100 multigene patterns that associated with the two survival groups (p<0.001 for each pattern; additional data may be found in the on-line supplement). Class prediction with the weighted voting and the k nearest neighbor algorithms was performed. Predictor sets ranging from 120 to 300 genes using the k nearest neighbor algorithm showed high predictive accuracy (100%, p=0.0004 by Fisher’s exact test and p<0.001 by a permutation test). Similar results were obtained with the weighted voting algorithm (100%, p=0.0003) using 180-400 genes. In the second training step, we used a weighted voting predictor with 200 genes (with 100% accuracy for the initial 14 samples) to assign labels (favorable and unfavorable) to the remaining 20 samples from the training set and then generated survival curves for the entire group. Kaplan-Meier analysis showed a statistically significant difference in OS between the two groups. The unfavorable group had a median survival of 33 months whereas the favorable group had a median survival that has not yet been reached (log rank p=0.0008). We also tested a range of the other highly accurate predictors (between 120 and 400 genes, both by the k-nn and weighted voting methods) for their performance on the remaining 20 samples and obtained similar results, with p values for the Kaplan-Meier analysis being very similar to that of the 180-gene predictor. In the third training step, we utilized the entire group of the 34 training samples in order to develop a final candidate signature. We carried out pattern recognition and obtained 766 multigene patterns that associated with the two classes (favorable and unfavorable) at p<0.001. The highest predictive accuracy was obtained using a 115-gene predictor (85% by weighted voting and 91% by k-nn, p=0.00005 by Fisher’s exact test and p<0.001 by permutation test). This final gene expression profile is shown in Figure 2 and will be referred to as the Ovarian Cancer Prognostic Profile (OCPP).

Association of the OCPP with survival. The OCPP was used to assign labels (favorable versus unfavorable) to a randomly chosen validation set of 34 patient samples, followed by Kaplan-Meier analysis. These samples are distinct from the training set and had not been used at any step in the generation of the OCPP. As shown in Figure 3A, a strong survival split was observed on the basis of the OCPP, with a median OS of the unfavorable and favorable groups of 30 months and not yet reached, respectively, at a median follow up of 47 months (log rank p=0.004). In addition, there is a suggestion of a plateau in the favorable curve that identifies a subset of patients with a particularly indolent course, having a close to 70% long-term survival at 5 years. After the prognostic value of the signature was validated, we then applied the signature to the entire set of 68 patients in order to arrive at a more stable estimate of the effect size (Figure 3B). The median survival for the unfavorable group was 30 months, while it has not yet been reached for the favorable group at a median follow up of 49 months (log rank p=0.0001). By a univariate Cox Proportional Hazards Model, the hazard ratio (HR) for death in the unfavorable group was 4.6 (95% CI: 2.0-10.7, p=0.0001) relative to the favorable group. Patients from both hospital sites were similarly represented in the two groups, with 47% and 60% of samples from each site assigned to the unfavorable and favorable groups, respectively. The OCPP was similarly prognostic when used to analyze the BIDMC versus MSKCC groups separately (see on-line supplement for additional details). The OCPP was also used to assess DFS, as shown in Figure 4. Within the validation set, median DFS was 10 and 33 months for the unfavorable and favorable groups, respectively (log rank p=0.01). When all 68 patients were considered together, the median DFS was 10 and 20 months, respectively (log rank p=0.015).

Figure 5 shows the Kaplan-Meier analysis as a function of gene profile for homogeneous subsets of patients with stage III/IV disease (n=33), grade III disease (n=29), or optimal debulking status (n=24) in the validation set. We purposely avoided mixing the training and validation sets for these subset analyses, in order to avoid re-analyzing samples that had already been used to generate the prognostic gene profile. For the subset of patients with stage III/IV disease, the median OS for the unfavorable and the favorable gene profile classes was 30 months versus not yet reached, respectively (Figure 5A, p=0.006). For patients with grade III disease, the median OS for the unfavorable and the favorable profile was also 30 months versus not yet reached, respectively (Figure 5B, p<0.0006). Restricting the analysis to only those patients who were optimally debulked, the median OS for the unfavorable versus the favorable gene profile groups was 41 months versus not yet reached, respectively (Figure 5C, p=0.08). Thus, the OCPP provided excellent discrimination of survival curves for these patient subsets. More specifically, these results indicate that the survival predictions shown in Figure 3 were not sensitive to the small number of early stage and low-grade patients contained within this patient cohort.

Association of the OCPP with other clinical parameters. Table 2 shows the distribution of several known prognostic factors as a function of gene profile assignment. The two groups (favorable and unfavorable) were well balanced for grade, stage, and histology. However, the favorable profile group was enriched for patients who were optimally cytoreduced (81% versus 51%, p=0.02), whereas the unfavorable profile group was characterized by a higher median age (61 versus 52 years, p=0.001). Therefore, several prognostic factors were next evaluated by both univariate and multivariate analysis (Table 3). In addition to gene profile, debulking status and age maintained prognostic value for OS in univariate analysis. However, the OCPP maintained independent prognostic significance in multivariate analysis (Table 3), when correcting for debulking status and age. Specifically, the HR for death for the unfavorable versus the favorable group was 4.8 in the validation set (95% CI: 1.3-17.9, p=0.021) and 3.6 in the entire dataset (95% CI: 1.6-8.3, p=0.002), while controlling for debulking status and age. Debulking status and age were not independently associated with survival in any of the analyses (training set, validation set, or entire dataset), while controlling for each other and for the OCPP, although debulking status showed a trend towards significance in the validation set (HR 2.6, 95% CI: 0.9-7.5, p=0.069).
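As a rough arithmetic check on how a hazard ratio, its 95% confidence interval, and its p value relate under the usual Wald approximation, the standard error of the log hazard ratio can be recovered from the interval width, and a two-sided p value follows from the normal distribution. The sketch below applies this to the two OCPP hazard ratios reported above. The function name is hypothetical, and because the published figures are rounded, the recomputed p values are only expected to be close to, not identical with, the reported ones.

```python
import math

def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover SE(log HR), Wald z, two-sided p, and the interval's geometric centre."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal tail area
    centre = math.sqrt(lower * upper)        # should sit near the reported HR
    return se, z, p, centre

# Reported multivariate estimates for the unfavorable versus favorable OCPP group.
se_val, z_val, p_val, centre_val = wald_from_ci(4.8, 1.3, 17.9)   # validation set
se_all, z_all, p_all, centre_all = wald_from_ci(3.6, 1.6, 8.3)    # entire dataset

print(f"validation set: centre={centre_val:.2f}, p~{p_val:.3f}")
print(f"entire dataset: centre={centre_all:.2f}, p~{p_all:.3f}")
```

Because a Cox model estimates the log hazard ratio, the confidence interval is symmetric on the log scale, which is why the geometric mean of the limits, rather than their arithmetic mean, should land near the point estimate.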

Association between the OCPP and response to first-line chemotherapy. As shown in Table 4, the percentage of patients achieving a CCR after first-line therapy in the favorable versus unfavorable groups was 96% and 81%, respectively (p=0.063). Although this trend did not reach significance by a two-sided Fisher’s exact test, it suggests that the association between the OCPP and survival (Figures 3 and 4) may be partly related to the likelihood of achieving a CCR with first-line chemotherapy. However, after excluding patients who did not achieve a complete response to chemotherapy, the unfavorable and favorable groups as defined by the OCPP still showed significantly different OS (41 months versus not yet reached, respectively, p=0.012). Taken together, these observations suggest that the prognostic influence of the OCPP is largely independent of response to first-line treatment.

Second-look laparoscopy was routinely performed at one of the two participating institutions (MSKCC) on patients who had achieved complete remission, had no detectable tumor by CT scan at the end of post-operative chemotherapy, and met eligibility criteria for various investigational protocols. Twenty-four (24) of the 30 patients from MSKCC had second-look laparoscopy, with 14 patients having evidence of residual disease. There was no statistically significant association between gene profile (favorable/unfavorable) and the second-look laparoscopy findings. Specifically, the percentage of patients with positive second-look laparoscopy in the unfavorable and favorable groups was 55% and 61%, respectively (Fisher’s p=1.0).
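The two-sided Fisher's exact test used for this comparison can be sketched directly from the hypergeometric distribution. The per-group counts are not given in the text, so the 2x2 table below is a hypothetical reconstruction: 6 of 11 unfavorable and 8 of 13 favorable patients with a positive second look, chosen only because it matches the stated totals (24 patients, 14 positive) and is roughly consistent with the reported percentages. The function name is likewise an assumption, and the actual counts should be taken from the on-line supplement or tables.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    p_obs = prob(a)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical counts (see lead-in): rows = unfavorable, favorable;
# columns = positive, negative second-look laparoscopy.
p_second_look = fisher_exact_two_sided(6, 5, 8, 5)
print(f"two-sided Fisher p = {p_second_look:.3f}")
```

With these assumed counts the observed table is the single most probable one given its margins, so every alternative table is at most as likely and the two-sided p value reaches its maximum.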

Functional classification of genes contained in the OCPP. The OCPP as shown in Figure 2 consists of 70 genes overexpressed in the unfavorable group and 45 genes overexpressed in the favorable group. A full list of the 115 prognostic genes is provided in the on-line supplement. Interestingly, several of these genes belong to families known to be associated with the malignant phenotype (Table 5). In order to avoid inflating the statistical significance of differentially expressed genes, the p values were estimated using the validation set only. Gene families represented in this profile include growth factor receptors and signaling molecules, angiogenesis genes, cellular adhesion and tumor invasion genes, mesenchymal markers, as well as hormone receptor associated genes. The possible significance of this gene expression profile will be briefly discussed below.

Discussion

Currently available clinical factors provide an imperfect assessment of prognosis for patients with advanced epithelial ovarian cancer. By using gene expression profiling, we now demonstrate the independent prognostic value of this technique when applied to tissue samples obtained at the time of initial diagnostic laparotomy. In order to define the OCPP, we combined a number of well-described methods of microarray analysis and phenotypic prediction in a way that allowed us to approach survival as a continuous and censored variable. The training approach that we have developed is based upon an initial assessment of samples at the extreme ends of the survival spectrum, but avoids using an arbitrary cut-off for defining “long” and “short” survival durations. In this regard, our analysis is similar to that used in previous studies involving lymphoma and lung cancer 10,28 , which also approached survival as a continuous outcome in order to discover relevant prognostic signatures.

The gene profile shown in Figure 2 provided independent prognostic information for OS in patients with advanced ovarian cancer. Specifically, we were able to discriminate between two distinct OS groups on the basis of the OCPP (Figure 3), one with median OS of 30 months and another with median OS that has not yet been reached after a median follow up of 49 months (p=0.0001). Importantly, the gene profile was strongly associated with survival in the independent validation set. Beyond the difference in median survival, it is notable that the favorable group demonstrated a possible survival plateau, with a subset of patients having a 70% probability of survival at 5 years. This level of discrimination between poor and good risk patients is not generally possible using conventional clinical factors and may provide a powerful way to identify, at the time of diagnosis, those patients who are at highest risk for an unfavorable outcome with conventional treatment approaches. In addition to OS, the OCPP was also prognostic for DFS (Figure 4).

The prognostic power of our gene expression profile was not dependent upon its association with other known characteristics, as it retained independent significance in multivariate analysis (Table 3). This is a particularly important aspect of this study, given the emphasis recently placed on appropriate multivariate assessment of genomic signatures used for clinical prediction 29 . Although there were 3 patients with early stage disease in our initial cohort (Table 2), excluding these patients from the analysis did not diminish the prognostic significance of the gene profile when applied only to patients with advanced stage disease (Figure 5A). Similarly, the prognostic value of the gene profile was not sensitive to the small number of low-grade tumors that were present in our study (Figure 5B). Furthermore, the profile showed prognostic value even within the subset of optimally debulked patients, although this did not reach statistical significance (p=0.08, Figure 5C).

Although limited by small numbers, insight into potential mechanisms underlying the prognostic value of the OCPP was obtained by analyzing its association with response to first-line chemotherapy. Although the OCPP was associated with a trend in chemotherapy response (p=0.063, Table 4), the profile maintained strong prognostic significance when applied to the homogeneous group of patients with chemosensitive disease. Thus, the prognostic value of the OCPP cannot be solely ascribed to its association with drug resistance, and it is possible that it is identifying other factors such as proliferative rate or metastatic potential that could alter the natural history of this disease. In this regard, several genes with potential functional relevance were overexpressed in the unfavorable group (Table 5 and Figure 2), including the platelet derived growth factor receptor 30-31 and mesenchymal markers such as fibronectin 32 and connective tissue growth factor 33 . The coordinated expression of these and other mesenchymal genes (such as fibromodulin and vimentin, Table 5) observed in the OCPP may reflect a contribution from tumor stroma, and/or might represent a process known as epithelial-mesenchymal transition, which has been correlated with aggressive tumor behavior in preclinical model systems 34-39 . In addition, the overexpression of estrogen pathway related genes (such as the estrogen receptor binding site associated antigen 9) in the favorable group could imply that estrogen responsiveness may contribute to an overall improved outcome, reminiscent of the well-described prognostic association in breast cancer. It is particularly interesting that certain genes upregulated in the unfavorable OCPP signature (Table 5) have been previously associated with poor prognosis in EOC. For example, expression of plasminogen activator inhibitor type 1 (PAI-1), a potentially important mediator of tumor invasion, has correlated with tumor aggressiveness and poor patient outcome in EOC 40-43 , as well as in other tumor types 41 . Likewise, thrombospondin 2 expression has been associated with poor prognosis in endometrial cancer 44 and in EOC 45 . Finally, VEGF-C expression has been previously associated with inferior survival and lymphatic spread in EOC 46-48 . These interesting observations notwithstanding, it is important to point out that the functional role of these genes in ovarian cancer remains to be established and cannot be conclusively derived from this descriptive study.

This study demonstrates that it is feasible to define a gene profile that independently correlates with survival in epithelial ovarian cancer. The availability of a powerful prognostic tool such as the OCPP may enable clinicians to identify those patients most appropriate for investigational approaches such as novel first-line or maintenance strategies. In addition, the availability of molecularly-defined survival phenotypes may permit more rational targeted therapy using agents that inhibit the VEGF or PDGF pathways, for instance. Finally, although not directly tested in our study, it may be possible to use gene profiling to identify patients with early stage disease who are at high risk for relapse, and are therefore most appropriate for adjuvant platinum-based chemotherapy. Although our data suggest the potential utility of this approach, it is recognized that the prognostic value of gene profiling in ovarian cancer must be further evaluated in additional prospective studies of patients with both advanced as well as early stage disease.

Acknowledgements

We thank Dr. Todd Golub for his helpful comments during manuscript preparation. We also thank Dr. Arthur Sytkowski and the Clinical Investigator Training Program (CITP) for providing Dr. Dimitrios Spentzos with prior research experience during fellowship training. Finally, we wish to acknowledge the efforts of gynecologic oncologists at BIDMC and MSKCC in providing tissue samples used in this analysis.

Figure Legends

Figure 1: Development of Gene Expression Profile. One-half of the patient cohort (training set, n = 34) were randomly selected in order to develop a prognostic gene expression profile. Three training steps were used to progressively refine the profile, as described in text. The resultant gene expression profile was then applied to an independent set of patient samples (validation set).
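The random assignment of half of the cohort to a training set can be illustrated with a short sketch. This is only an assumption about how such a split might be coded, not the procedure the authors actually ran; the sample identifiers, the function name `split_cohort`, and the seed are all hypothetical.

```python
import random

def split_cohort(sample_ids, seed=0):
    """Randomly split sample IDs into two halves (training, validation)."""
    ids = list(sample_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]

# Hypothetical identifiers; the study's real sample IDs are not given here.
samples = [f"tumor_{i:02d}" for i in range(1, 69)]
training, validation = split_cohort(samples, seed=2003)
print(len(training), len(validation))
```

With 68 samples the two halves contain 34 samples each, and no sample appears in both, which is what allows the validation set to serve as an independent test of the profile.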

Figure 2: Expression plot of the 115 prognostic genes comprising the Ovarian Cancer Prognostic Profile (OCPP). Rows: Prognostic gene expression levels (normalized). Complete information regarding gene identity is provided in the on-line supplement (a subset of these genes is also provided in Table 5). Columns: Training set samples (n = 34). Red color: Overexpressed genes. Blue color: Underexpressed genes.

Figure 3: Association between the OCPP and Survival. Figure 3A: Overall survival in the validation set (n = 34). Median survival for the unfavorable group is 30 months and has not been reached for the favorable group at a median follow-up of 47 months (p=0.004 by log-rank test).

Figure 3B: Overall survival in the entire data set (n = 68). The OCPP was applied to the entire data set (validation plus training samples) in order to more accurately assess effect size. Median survival for the unfavorable and favorable groups is 30 months and not yet reached, respectively, at a median follow-up of 49 months (p=0.0001 by log-rank test).
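To make the phrase "median survival not yet reached" concrete, the following is a minimal sketch of a product-limit (Kaplan-Meier) estimate, under the assumption that survival curves of this kind are computed in the standard way. The follow-up times and event indicators are invented toy values, not the study data, and the function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  -- follow-up in months
    events -- 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve

def median_survival(curve):
    """First time at which survival is 0.5 or lower; None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Toy data only (hypothetical), chosen to mimic the two situations in the legend.
unfavorable = kaplan_meier([10, 20, 30, 40, 50], [1, 1, 1, 0, 0])
favorable = kaplan_meier([15, 35, 45, 47, 50], [1, 0, 0, 0, 0])
print(median_survival(unfavorable), median_survival(favorable))
```

In the toy unfavorable group the estimated survival falls below 50% at 30 months, whereas in the toy favorable group it never does, so the median is reported as not reached (`None`).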

Figure 4: Association between the OCPP and Disease Free Survival. Figure 4A: Disease-free survival in the validation set. The median DFS for the unfavorable and favorable groups was 10 months and 33 months, respectively (p=0.01). Figure 4B: Disease-free survival in the entire data set. The median DFS for the unfavorable and favorable groups was 10 months and 20 months, respectively (p=0.01).

Figure 5: Relationship between the OCPP and survival in homogeneous patient subsets. Median survival of the unfavorable versus favorable OCPP groups was as follows: Stage III/IV (n=33), 30 months versus not yet reached; Grade 3 (n=29), 30 months versus not yet reached; Optimally debulked (n=24), 41 months versus not yet reached. All analyses were performed in the validation set.

Table 1: Clinical and pathological characteristics (N = 68)

Characteristic                          Number (percentage)
Age (median, range)                     55 (36-80)
Stage (FIGO)
  I                                     1 (1.5%)
  II                                    2 (3%)
  III                                   58 (85.5%)
  IV                                    7 (10%)
Grade
  1                                     1 (1.5%)
  2                                     13 (19%)
  3                                     54 (79.5%)
Histologic subtype
  Papillary serous (pure or mixed)      62 (91%)
  Endometrioid                          1 (1.5%)
  Clear cell                            5 (7.5%)
Debulking status
  Optimal                               44 (65%)
  Suboptimal                            24 (35%)
First-line chemotherapy
  Platinum-based                        68 (100%)
  Taxane (paclitaxel or docetaxel)      68 (100%)

Table 2. Relationship between the OCPP and known prognostic factors.

Characteristic         Unfavorable OCPP (N=37)   Favorable OCPP (N=31)   P value*
Median age (years)     61                        52                      0.0001
Grade                                                                    0.77
  1/2                  7 (19%)                   7 (23%)
  3                    30 (81%)                  24 (77%)
Stage                                                                    0.59
  I/II                 1 (3%)                    2 (6%)
  III/IV               36 (97%)                  29 (94%)
Histology                                                                0.65
  Clear cell           2 (5%)                    3 (10%)
  Other histology      35 (95%)                  28 (90%)
Debulking status                                                         0.02
  Optimal              19 (51%)                  25 (81%)
  Suboptimal           18 (49%)                  6 (19%)

* P values for grade, stage, histology, and debulking status: Fisher’s exact test. P value for age: Wilcoxon’s rank sum test.
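As an illustration of how a Fisher’s exact p value of this kind can be obtained, the sketch below applies a hand-written two-sided test to the debulking-status counts in Table 2 (19 optimal and 18 suboptimal in the unfavorable group; 25 and 6 in the favorable group). The function name is hypothetical, and the two-sided rule used here (summing all tables no more probable than the observed one) is an assumption, since the text does not state which convention the authors used.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, row1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(col1, k) * comb(n - col1, row1 - k) / denom

    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more likely than the observed one.
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-9))

# Debulking status from Table 2: rows are OCPP groups, columns optimal/suboptimal.
p = fisher_exact_two_sided(19, 18, 25, 6)
print(round(p, 3))
```

Under these assumptions the result should land close to the 0.02 reported for debulking status.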

Table 3. Prognostic value of the gene expression profile adjusted for debulking status and age by Cox proportional hazards regression.

                                        Univariate p value   Multivariate p value
Prognostic factor (a)                   All patients         Training set       Validation set    All patients
OCPP (b) (unfavorable/favorable)        0.0001               0.03 (HR (c) 4.2)  0.02 (HR 4.8)     0.002 (HR 3.6)
Debulking status (suboptimal/optimal)   0.03                 0.51               0.069 (HR 2.6)    0.39
Age (d)                                 0.01                 0.36               0.44              0.27

a) Prognostic significance assessed for overall survival.
b) OCPP: Ovarian Cancer Prognostic Profile.
c) HR: hazard ratio for death.
d) Age analyzed as a continuous variable.
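For readers less familiar with the model behind Table 3, the general form of a Cox proportional hazards regression with these three covariates can be written as follows. The symbol names are notation chosen here for illustration and do not appear in the original text.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(
    \beta_{\mathrm{OCPP}}\,x_{\mathrm{OCPP}}
  + \beta_{\mathrm{debulk}}\,x_{\mathrm{debulk}}
  + \beta_{\mathrm{age}}\,x_{\mathrm{age}}
\right),
\qquad
\mathrm{HR}_{\mathrm{OCPP}} = e^{\beta_{\mathrm{OCPP}}}
```

Here h_0(t) is the unspecified baseline hazard and each hazard ratio is the exponential of the corresponding coefficient. Read this way, the hazard ratio of 4.8 reported for the OCPP in the validation set corresponds to a coefficient of about ln 4.8 ≈ 1.57, holding debulking status and age fixed.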

Table 4. Association between the OCPP and response to first-line chemotherapy.

Response category                       Unfavorable OCPP (N=37)   Favorable OCPP (N=31)   P value (a)
Clinical response (b)                                                                     0.063
  Complete response                     30 (81%)                  29 (96%)
  Persistent disease                    7 (19%)                   1 (4%)
Second-look laparoscopy (c) (N=24)                                                        1.0
  Positive                              6 (55%)                   8 (62%)
  Negative                              5 (45%)                   5 (38%)

a) P value by Fisher’s exact test.
b) Clinically defined as described in text.
c) Defined on the basis of surgical findings, including random biopsies.

Table 5: Expression pattern of selected genes in the unfavorable OCPP.

Gene Identity

Connective tissue growth factor

Estrogen receptor binding site antigen 9

Fibromodulin

Fibronectin

Fibronectin precursor

Integrin beta 5 subunit

Leukemia inhibitory factor receptor

Lymphocyte antigen 75

Mitogen inducible gene (mig-2)

PDGF receptor

Plasminogen activator inhibitor I

Receptor protein tyrosine kinase

SHC transforming protein 1

Thrombospondin 2

VEGF-C

Vimentin

v-fos transformation effector protein

Expression pattern

Increased

Decreased

Increased

Increased

P value

<0.0001

<0.001

<0.0001

<0.0001

Increased <0.0001

Increased <0.01

Decreased <0.001

Decreased

Increased

Increased

Increased

Increased

Increased

Increased

Increased

<0.01

<0.0001

<0.0001

<0.0001

<0.0001

<0.001

Increased <0.0001

Increased <0.05

<0.001

<0.0001

Expression patterns were determined using Affymetrix expression values generated with the MAS 5 algorithm. Permutation p values were estimated in the validation set. A color plot with the OCPP expression patterns is also provided in Figure 2. Complete information regarding the identity of genes comprising the OCPP is available in the on-line supplement at www.bidmcgenomics.org/OvarianCancer.
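The idea of a permutation p value for a single gene can be sketched as follows. This is a generic illustration only: the test statistic (absolute difference in group means), the number of permutations, and the expression values are all assumptions made for the example, since the table note does not specify how the authors computed their permutation p values.

```python
import random

def permutation_p_value(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided permutation p value for a difference in group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    hits = 0
    for _ in range(n_perm):
        # Reassign group labels at random by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float round-off
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical normalized expression values for one gene in the two groups.
unfavorable_expr = [8.1, 7.9, 8.4, 8.0, 8.3, 7.8]
favorable_expr = [6.9, 7.1, 6.8, 7.2, 7.0, 6.7]
p_gene = permutation_p_value(unfavorable_expr, favorable_expr)
print(p_gene)
```

Because the toy groups do not overlap at all, very few random relabelings reproduce a difference as large as the observed one, so the estimated p value is small. A gene with similar expression in both groups would yield a p value near 1.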

Figure 1: Development of gene expression profile

[Flow diagram, top to bottom: All patients (n=68); Training set (n=34), randomly selected; Selection of “extreme” survival patients (n=14); Pattern recognition/class prediction (1st training step); Optimization of class predictors (2nd training step) (n=34); Refinement of final predictive signature (3rd training step) (n=34); Independent validation of profile using a separate patient cohort (n=34).]

Figure 2: Expression plot of 115 prognostic genes.

[Expression plot. Samples are grouped by prognosis (Unfavorable, Favorable). Labeled genes, in the order shown: Receptor protein tyrosine kinase; Fibronectin; Fibronectin precursor; Mitogen inducible gene; Integrin beta 5; PDGF receptor; Connective tissue growth factor; VEGF C; Thrombospondin 2; Vimentin; Plasminogen activator inhibitor 1; Leukemia inhibitory factor receptor; Estrogen receptor binding site antigen 9; NOT4Hp transcriptional repressor; Beta dystrobrevin; Non receptor protein tyrosine kinase Tnk1. Color scale: normalized expression, low to high.]

Figure 3A: Overall Survival in the Validation Set

[Survival curves for the favorable and unfavorable profiles; x-axis: months from diagnosis; P=0.004.]

Number at risk:
Unfavorable profile 18 17 14 7 6 3 2 1 0
Favorable profile 16 15 15 14 11 5 2 0 0

Figure 3B: Overall Survival in the Entire Data Set

[Survival curves for the favorable and unfavorable profiles; x-axis: months from diagnosis; P=0.0001.]

Number at risk:
Unfavorable profile 37 34 28 20 13 7 3 1 0
Favorable profile 31 30 30 27 21 12 6 1 0

Figure 4A: Disease Free Survival in the Validation Set

[Survival curves for the favorable and unfavorable profiles; x-axis: months from diagnosis; P=0.01.]

Number at risk:
Favorable profile 15 13 8 7 5 2 1 0
Unfavorable profile 13 8 3 1 0 0 0 0

Figure 4B: Disease Free Survival in the Entire Data Set

[Survival curves for the favorable and unfavorable profiles; x-axis: months from diagnosis; P=0.01.]

Number at risk:
Favorable profile 31 23 16 13 8 4 1 0
Unfavorable profile 29 16 8 4 2 0 0 0

Figure 5: Association of the OCPP with survival in selected patient subsets.

Stages III/IV
[Survival curves for the favorable and unfavorable profiles; x-axis: months from diagnosis; P=0.006.]
Number at risk:
Favorable profile 15 14 14 13 10 6 1 0 0
Unfavorable profile 18 17 15 11 6 3 2 1 0

Grade III
[Survival curves for the favorable and unfavorable profiles; x-axis: months from diagnosis; P=0.0006.]
Number at risk:
Favorable profile 13 13 13 12 9 5 2 0 0
Unfavorable profile 16 15 13 9 4 2 2 1 0

Optimally debulked
[Survival curves for the favorable and unfavorable profiles; x-axis: months from diagnosis; P=0.08.]
Number at risk:
Favorable profile 12 11 11 10 9 6 2 0 0
Unfavorable profile 12 12 11 8 5 3 2 1 0

References

1. Cannistra SA: Cancer of the ovary. N Engl J Med 329:1550-9, 1993

2. McGuire WP, Hoskins WJ, Brady MF, et al: Cyclophosphamide and cisplatin compared with paclitaxel and cisplatin in patients with stage III and stage IV ovarian cancer. N Engl J Med 334:1-6, 1996

3. Ben David Y, Chetrit A, Hirsh-Yechezkel G, et al: Effect of BRCA mutations on the length of survival in epithelial ovarian tumors. J Clin Oncol 20:463-6, 2002

4. Boyd J, Sonoda Y, Federici MG, et al: Clinicopathologic features of BRCA-linked and sporadic ovarian cancer. JAMA 283:2260-5, 2000

5. Cass I, Baldwin RL, Varkey T, et al: Improved survival in women with BRCA-associated ovarian carcinoma. Cancer 97:2187-95, 2003

6. Rubin SC: BRCA-related ovarian carcinoma. Cancer 97:2127-9, 2003

7. Tai YT, Lee S, Niloff E, et al: BAX protein expression and clinical outcome in epithelial ovarian cancer. J Clin Oncol 16:2583-90, 1998

8. Golub TR, Slonim DK, Tamayo P, et al: Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286:531-7, 1999

9. Armstrong SA, Staunton JE, Silverman LB, et al: MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia. Nat Genet 30:41-7, 2002

10. Rosenwald A, Wright G, Chan WC, et al: The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N Engl J Med 346:1937-47, 2002

11. Shipp MA, Ross KN, Tamayo P, et al: Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nat Med 8:68-74, 2002

12. Ruel M, Bianchi C, Khan TA, et al: Gene expression profile after cardiopulmonary bypass and cardioplegic arrest. J Thorac Cardiovasc Surg 126:1521-30, 2003

13. Mitsiades N, Mitsiades CS, Richardson PG, et al: The proteasome inhibitor PS-341 potentiates sensitivity of multiple myeloma cells to conventional chemotherapeutic agents: therapeutic applications. Blood 101:2377-80, 2003

14. Mitsiades CS, Mitsiades NS, McMullan CJ, et al: Transcriptional signature of histone deacetylase inhibition in multiple myeloma: biological and clinical implications. Proc Natl Acad Sci U S A 101:540-5, 2004

15. Mitsiades N, Mitsiades CS, Poulaki V, et al: Molecular sequelae of proteasome inhibition in human multiple myeloma cells. Proc Natl Acad Sci U S A 99:14374-9, 2002

16. Califano A, Stolovitzky G, Tu Y: Analysis of gene expression microarrays for phenotype classification. Proc Int Conf Intell Syst Mol Biol 8:75-85, 2000

17. Califano A: SPLASH: structural pattern localization analysis by sequential histograms. Bioinformatics 16:341-57, 2000

18. Klein U, Tu Y, Stolovitzky GA, et al: Gene expression profiling of B cell chronic lymphocytic leukemia reveals a homogeneous phenotype related to memory B cells. J Exp Med 194:1625-38, 2001

19. Klein U, Tu Y, Stolovitzky GA, et al: Gene expression dynamics during germinal center transit in B cells. Ann N Y Acad Sci 987:166-72, 2003

20. Klein U, Tu Y, Stolovitzky GA, et al: Transcriptional analysis of the B cell germinal center reaction. Proc Natl Acad Sci U S A 100:2639-44, 2003

21. Kuppers R, Klein U, Schwering I, et al: Identification of Hodgkin and Reed-Sternberg cell-specific genes by gene expression profiling. J Clin Invest 111:529-37, 2003

22. Lepre J, Rice JJ, Tu Y, et al: Genes@Work: an efficient algorithm for pattern discovery and multivariate feature selection in gene expression data. Bioinformatics, 2004

23. Pomeroy SL, Tamayo P, Gaasenbeek M, et al: Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature 415:436-42, 2002

24. Ramaswamy S, Ross KN, Lander ES, et al: A molecular signature of metastasis in primary solid tumors. Nat Genet 33:49-54, 2003

25. Ramaswamy S, Tamayo P, Rifkin R, et al: Multiclass cancer diagnosis using tumor gene expression signatures. Proc Natl Acad Sci U S A 98:15149-54, 2001

26. Wu B, Abbott T, Fishman D, et al: Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data. Bioinformatics 19:1636-43, 2003

27. Kim S: Protein beta-turn prediction using nearest-neighbor method. Bioinformatics 20:40-4, 2004

28. Beer DG, Kardia SL, Huang CC, et al: Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat Med 8:816-24, 2002

29. Ntzani EE, Ioannidis JP: Predictive ability of DNA microarrays for cancer outcomes and correlates: an empirical assessment. Lancet 362:1439-44, 2003

30. Henriksen R, Funa K, Wilander E, et al: Expression and prognostic significance of platelet-derived growth factor and its receptors in epithelial ovarian neoplasms. Cancer Res 53:4550-4, 1993

31. Shawver LK, Schwartz DP, Mann E, et al: Inhibition of platelet-derived growth factor-mediated signal transduction and tumor growth by N-[4-(trifluoromethyl)-phenyl]5-methylisoxazole-4-carboxamide. Clin Cancer Res 3:1167-77, 1997

32. Thant AA, Nawa A, Kikkawa F, et al: Fibronectin activates matrix metalloproteinase-9 secretion via the MEK1-MAPK and the PI3K-Akt pathways in ovarian cancer cells. Clin Exp Metastasis 18:423-8, 2000

33. Sakamoto M, Kondo A, Kawasaki K, et al: Analysis of gene expression profiles associated with cisplatin resistance in human ovarian cancer cell lines and tissues using cDNA microarray. Hum Cell 14:305-15, 2001

34. Auersperg N, Pan J, Grove BD, et al: E-cadherin induces mesenchymal-to-epithelial transition in human ovarian surface epithelium. Proc Natl Acad Sci U S A 96:6249-54, 1999

35. Jechlinger M, Grunert S, Tamir IH, et al: Expression profiling of epithelial plasticity in tumor progression. Oncogene 22:7155-69, 2003

36. Thiery JP, Chopin D: Epithelial cell plasticity in development and tumor progression. Cancer Metastasis Rev 18:31-42, 1999

37. Thiery JP: Epithelial-mesenchymal transitions in tumour progression. Nat Rev Cancer 2:442-54, 2002

38. Tran NL, Nagle RB, Cress AE, et al: N-Cadherin expression in human prostate carcinoma cell lines. An epithelial-mesenchymal transformation mediating adhesion with stromal cells. Am J Pathol 155:787-98, 1999

39. Vincent-Salomon A, Thiery JP: Host microenvironment in breast cancer development: epithelial-mesenchymal transition in breast cancer development. Breast Cancer Res 5:101-6, 2003

40. Kuhn W, Schmalfeldt B, Reuning U, et al: Prognostic significance of urokinase (uPA) and its inhibitor PAI-1 for survival in advanced ovarian carcinoma stage FIGO IIIc. Br J Cancer 79:1746-51, 1999

41. Harbeck N, Kruger A, Sinz S, et al: Clinical relevance of the plasminogen activator inhibitor type 1--a multifaceted proteolytic factor. Onkologie 24:238-44, 2001

42. Konecny G, Untch M, Pihan A, et al: Association of urokinase-type plasminogen activator and its inhibitor with disease progression and prognosis in ovarian cancer. Clin Cancer Res 7:1743-9, 2001

43. Chambers SK, Ivins CM, Carcangiu ML: Plasminogen activator inhibitor-1 is an independent poor prognostic factor for survival in advanced stage epithelial ovarian cancer patients. Int J Cancer 79:449-54, 1998

44. Seki N, Kodama J, Hashimoto I, et al: Thrombospondin-1 and -2 messenger RNA expression in normal and neoplastic endometrial tissues: correlation with angiogenesis and prognosis. Int J Oncol 19:305-10, 2001

45. Kodama J, Hashimoto I, Seki N, et al: Thrombospondin-1 and -2 messenger RNA expression in epithelial ovarian tumor. Anticancer Res 21:2983-7, 2001

46. Hsieh CY, Chen CA, Chou CH, et al: Overexpression of Her-2/NEU in epithelial ovarian carcinoma induces vascular endothelial growth factor C by activating NF-kappa B: implications for malignant ascites formation and tumor lymphangiogenesis. J Biomed Sci 11:249-59, 2004

47. Ueda M, Terai Y, Kumagai K, et al: Vascular endothelial growth factor C gene expression is closely related to invasion phenotype in gynecological tumor cells. Gynecol Oncol 82:162-6, 2001

48. Yokoyama Y, Charnock-Jones DS, Licence D, et al: Vascular endothelial growth factor-D is an independent prognostic factor in epithelial ovarian carcinoma. Br J Cancer 88:237-44, 2003
