Program Director/Principal Investigator (Last, First, Middle): Wong, Stephen, T.C., Ph.D., P.E.

Center for Systematic Modeling of Cancer Development

Section N1. Center Overview and Effort Integration

This section provides an overview of the proposed Center for Systematic Modeling of Cancer Development (CSMCaD), including its scientific focus, a description of the individual components and their integration, and an estimated timeline for the overall program. The synergies to be achieved through the establishment of multi-disciplinary teams and novel collaborations are fully described. The Center will draw its strength from an inter-disciplinary, multi-institutional team of experienced investigators and a rich variety of laboratory and institutional resources. The Center PI (Dr. Stephen Wong) and project lead investigators (Dr. Michael Lewis, Dr. Jeffrey Rosen, Dr. Xiaobo Zhou, and Dr. Vittorio Cristini) will be responsible for developing and managing the project, with a representative decision-making process and administrative structure that allow resources to be allocated as needed to meet the scientific goals in a timely and cost-effective fashion.

N1.1. Overview of the scientific focus of the proposed Center (Abstract)

Excluding cancers of the skin, breast cancer is the most common cancer diagnosed in American women (1 in 8 women; about 13%) and is the second leading cause of cancer deaths among women. Systemic therapies such as chemo- or radiation therapy are effective initially in controlling and reversing tumor growth. However, residual cancers will invariably re-grow despite this initial response. While there have been several advances in the treatment of breast cancer in the last two decades, notably targeted therapy for breast cancers expressing estrogen receptor (ER+) or the HER2 (ErbB2) oncogene, breast cancer survivorship has improved only modestly.
Unfortunately, for women with “triple negative” breast cancers (lacking expression of ER, progesterone receptor (PR), and HER2), we currently have no targeted therapies. Our recent clinical data, as well as experimental evidence in both mouse mammary tumors and human xenograft models, support the existence of a subpopulation of cancer cells present in the original tumor that are greatly enriched in residual cancers after conventional systemic therapies. These residual cancer cells are characterized by their intrinsic resistance to chemotherapy and relative growth quiescence. However, a discrete subset of these residual cells possesses enhanced self-renewal capacity, as well as the ability to form tumors upon transplantation. These residual tumor-initiating cells (TIC) (a.k.a. cancer stem cells (CSC)), which may reside in a specific tumor microenvironment (mE), may therefore be responsible for tumor growth, maintenance, resistance to treatment, and disease relapse. If this hypothesis is correct, the failure of traditional systemic therapies, such as radiation and chemotherapy, to cure breast cancer may be due to the fact that they target the highly proliferative cells while allowing survival of treatment-refractory tumor-initiating “cancer stem cells”. These findings fundamentally modify our conceptual approach to oncogenesis and have dramatic implications for breast cancer prevention, treatment, and drug development.

Figure 1. The flowchart for the proposed research for the Center for Systematic Modeling of Cancer Development (CSMCaD)

In this proposal, we seek to build upon, and significantly extend, ongoing laboratory and clinical studies and to use newly developed experimental and imaging methodologies to identify, localize, purify, and characterize TIC to a degree not possible before.
This will then allow us to identify and image TIC in vivo and to model TIC behavior during tumor development mathematically with respect not only to spatial localization and movement, but also to proliferation, apoptosis, and specific changes in gene expression and cellular signaling. Combined functional genomics and data mining strategies will allow us to characterize novel growth regulators. Furthermore, our combined experimental and systems biology approach will allow us to evaluate responses to experimental therapeutics that may inhibit or kill TIC specifically in a manner not possible before. Aside from a wealth of basic biological insight, extensions of this work may allow drug repositioning as well as development of directed, mechanism-based and “stem cell”-centric drug screening and evaluation methods.

Figure 1 provides an overview of the CSMCaD and its scientific goals. The upper portion illustrates a flowchart of the four Specific Aims in Component 1 of the Center. These aims focus on the goal of understanding the behavior of TIC, whose function is governed by the spatial and temporal ordering of multiple interacting components at the molecular, cellular, and tissue levels. The experimental data will be used in Component 2, depicted in the lower portion of the figure, to develop mathematical and computational models of TIC signaling and behavior. These models use mathematical equations and relationships, as well as computer simulations, to represent biological phenomena such as proliferation, apoptosis, cell migration, and treatment response. These approaches serve two purposes. First, they provide a basic framework for the interrogation and integration of data, often providing insight into the type and quality of data needed for addressing a hypothesis or experimental design.
Second, these models or simulations should allow one to predict the biological response of TIC to an experimental therapeutic agent under investigation and to predict how TIC-related processes will behave under different circumstances. The predictions generated in Component 2 can in turn be tested explicitly in experiments conducted within Component 1. We will discuss the synergy of the Specific Aims between Component 1 and Component 2 at the end of this Section.

Component 1 is guided by the hypothesis that TIC represent a unique sub-population of cells within a tumor possessing properties of self-renewal and the ability to give rise to the characteristic cell types present within a given tumor. Because of their unique abilities, we hypothesize further that TIC are localized and function within a spatially and molecularly regulated microenvironment (mE) (a.k.a. niche). To identify, localize, and functionally interrogate TIC in vivo in sufficient detail to allow mathematical modeling of their behaviors and responses to genetic and pharmacological manipulation in Component 2, the Specific Aims of Component 1 are:

Aim 1.1: To identify tumor-initiating cells (cancer stem cells) using newly developed lentiviral fluorescent signaling reporters and to characterize their spatial distribution and behaviors during tumor growth using in vivo imaging.

Based on our current knowledge of TIC regulation by signaling networks, including Wnt, Notch, and Hedgehog, we propose to use a novel set of lentiviral fluorescent signaling reporter vectors to identify, localize, and purify TIC from both mouse and human mammary tumors based on activities of these and other pathways in the TIC themselves. In addition to static histological preparations, individual stem cells can be tracked in live animals using a combination of high-resolution confocal microscopy and two-photon video imaging methods.
Thus, the location and movement of TIC can be monitored over time at different phases of tumor development. These analyses should be informative about interactions between TIC and their local environment, including proximity to blood vessels and extracellular matrix (ECM), and interactions with stromal cell types, such as macrophages, neutrophils, and fibroblasts. These data will be used to develop and validate the bio-mathematical TIC microenvironment (TIC mE) model discussed in Specific Aim 2.1 of Component 2.

Aim 1.2: To identify candidate genes and pathways that may regulate TIC behaviors (e.g., self-renewal, differentiation, and metastasis).

By using the new fluorescent signaling reporter vectors used or developed in Aim 1.1, as well as known cell surface and enzymatic markers (e.g., CD44, CD24, and ALDH1), we will purify (or highly enrich) TIC populations away from other non-tumorigenic cell types using Fluorescence Activated Cell Sorting (FACS). Microarray (Affymetrix) and proteomic (antibody arrays, high-throughput immunofluorescence imaging) analyses will then be used to obtain expression data for each cell population. Data will be analyzed using advanced bioinformatics methods (Component 2) to discover molecular pathways active in TIC and niche cell types. These data describing the relationship between signaling pathways and cellular identities will be used to refine the TIC mE model and the predictive models of Aim 2.1 and Aim 2.2 in Component 2, respectively. The genes identified or predicted will then be tested functionally in Aim 1.3 of Component 1.

Aim 1.3: To conduct a “Directed Iterative Functional Genomic Screen” (DIFGS) to functionally characterize genes that either increase or decrease tumor-initiating capacity.
Using a TIC gene expression signature defined previously using the CD44 and CD24 cell surface markers on human clinical samples, we recently completed an initial functional genomics screen of 1,290 lentiviral shRNA constructs targeting ~500 genes. This screen identified 101 genes regulating mammosphere formation (a surrogate in vitro assay for TIC and normal stem/progenitor cell function). A similar study is underway using a gene expression signature derived from TIC in mouse p53-null tumor models. We propose to extend these screens in a directed, iterative manner: advanced bioinformatic approaches (Component 2) will define a new candidate target list, using the 101 genes as input to identify known or suspected interacting proteins, immediate upstream regulators, and downstream targets. Additional unknowns from microarray data will also be tested whenever possible (up to about 500 genes can be screened at one time). These new candidates will be tested functionally using mammosphere-formation assays to identify only those genes regulating mammosphere-forming efficiency (MSFE), and the process will be repeated for five iterations per species (~2,500 genes per species), or until all bioinformatics-defined interactions are exhausted. Human and mouse gene lists can then be mined for overlapping and unique gene sets and tested in vivo in Specific Aim 1.4, described next. These data will be analyzed through the advanced bioinformatics methods described in Specific Aim 2.3 of Component 2, and the results can be used to validate the refined TIC mE model in Aim 2.2.

Aim 1.4: To define the cellular responses of TIC to genetic and pharmacological manipulation of genes regulating TIC survival or function in vivo.

Once key molecules are identified as functionally important in Aim 1.3 of Component 1, and the integrated molecular and cellular model is built in Component 2, the response to genetic and pharmacological manipulation of molecules in the model will be predicted, tested, and used to refine the model.
Based on the premise that TIC must be targeted specifically for development of effective treatment or prevention of breast cancer, discovery of drugs that kill TIC specifically, or block their function, will be critically important. Our ongoing work investigating inhibitors of normal stem cell self-renewal (including inhibitors of Notch, Hedgehog, and the PI3K/Akt axis) suggests that these agents function at the level of the TIC, since they reduce the frequency of self-renewing cells but typically do not alter tumor volume significantly unless combined with cytotoxic systemic therapies. We expect that a subset of the lentiviral shRNA constructs affecting TIC behavior in mammosphere (MS) assays will have similar activity against TIC function in vivo. We will use a novel collection of mouse mammary tumors and low-passage transplantable human xenografts to study the effects of genetic manipulation (constitutive or doxycycline-inducible lentiviral shRNA expression vectors) and candidate pharmacological TIC inhibitors (currently in use, or suggested from analyses in Component 2) on TIC behavior and frequency in vivo. Moreover, the combinatorial effects of shRNA knockdown or experimental therapeutics with conventional chemotherapies will be examined with the goal of finding more effective cancer treatments for individual breast cancer subtypes. These data will again be used for the development and validation of the drug-integrated model in Aim 2.4 of Component 2.

Component 2 is guided by the hypothesis that TIC behavior during breast cancer development can be simulated using a robust, multiparameter mathematical/computational model. We hypothesize further that these models can be built not only to reflect the molecular, cellular, and tissue-level dynamics, but also to allow prediction of the response of TIC to experimental therapeutics.
Thus, the central goal of Component 2 is to build a multi-scale model platform of the TIC mE for investigating TIC self-renewal, proliferation, localization, and other functions within a spatially and molecularly regulated microenvironment. Based on the experimental data obtained from Component 1 and published knowledge of TIC, we will model the TIC tissue microenvironment (TIC mE) from the molecular and cellular level up to the tissue level. The TIC mE model can further predict and guide the pathway analysis, the candidate gene selection, and the genetic and pharmacological manipulation in Component 1. Accordingly, the Specific Aims of Component 2 are:

Aim 2.1: To model the TIC tissue mE mathematically based on 2D and 3D microscopy and image analysis.

The microenvironment, including cellular and non-cellular components, is well known to play an important role in supporting and influencing the behavior of TIC. Bio-imaging informatics models will be developed to quantify the TIC tissue microenvironment images obtained from Component 1, so that the TIC mE spatial distribution can then be modeled. Based on the quantified data, as well as on the literature and online databases, we can apply ordinary differential equations (ODEs) and more sophisticated differential equations to describe mathematically the relationship among TIC and the molecules, enzymes, nutrients, and other cell types in the microenvironment (e.g., fibroblasts, vasculature, immune cells) in an effort to model tumor development in silico. This will be a model at the cellular and tissue levels onto which the key molecular-level mechanisms discovered in Aim 1.3 of Component 1 can be mapped in Aim 2.3. Further experiments will therefore be carried out in Specific Aim 1.2 of Component 1 based on feedback from the results obtained in this aim.
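
As a purely illustrative sketch of the kind of ODE description mentioned in Aim 2.1, the following Python fragment integrates a minimal two-compartment system (TIC and non-TIC tumor cells). It is not the Center's model. The compartment structure, the division-fate probabilities (`p_sym`, `p_asym`), the rate constants, and all function names are assumptions introduced here for illustration only; a realistic TIC mE model would add nutrients, stromal cell types, and spatial terms.

```python
"""Illustrative two-compartment ODE sketch of TIC dynamics (hypothetical)."""
from typing import Callable, List, Tuple

State = Tuple[float, float]  # (TIC count, non-TIC tumor cell count)


def tic_rhs(state: State, k_div: float, p_sym: float, p_asym: float,
            death_tic: float, death_non_tic: float) -> State:
    """Right-hand side of the assumed ODE system.

    A dividing TIC is assumed to yield two TIC (probability p_sym), one TIC
    plus one non-TIC (p_asym), or two non-TIC (the remainder, p_diff).
    """
    tic, non_tic = state
    p_diff = 1.0 - p_sym - p_asym
    if min(p_sym, p_asym) < 0.0 or p_diff < -1e-12:
        raise ValueError("division probabilities must be >= 0 and sum to <= 1")
    d_tic = k_div * (p_sym - p_diff) * tic - death_tic * tic
    d_non_tic = k_div * (p_asym + 2.0 * p_diff) * tic - death_non_tic * non_tic
    return d_tic, d_non_tic


def rk4_step(f: Callable[[State], State], state: State, dt: float) -> State:
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base: State, slope: State, h: float) -> State:
        return base[0] + h * slope[0], base[1] + h * slope[1]

    k1 = f(state)
    k2 = f(shifted(state, k1, dt / 2.0))
    k3 = f(shifted(state, k2, dt / 2.0))
    k4 = f(shifted(state, k3, dt))
    return (
        state[0] + dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]),
        state[1] + dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]),
    )


def simulate(initial: State, dt: float, n_steps: int, **params: float) -> List[State]:
    """Return the trajectory [(tic, non_tic), ...] including the initial state."""
    trajectory = [initial]
    for _ in range(n_steps):
        trajectory.append(rk4_step(lambda s: tic_rhs(s, **params), trajectory[-1], dt))
    return trajectory


if __name__ == "__main__":
    # Hypothetical parameter values chosen only to exercise the code.
    run = simulate((100.0, 0.0), dt=0.1, n_steps=100, k_div=1.0,
                   p_sym=0.3, p_asym=0.5, death_tic=0.05, death_non_tic=0.5)
    print("final TIC, non-TIC:", run[-1])
```

In this sketch, lowering `p_sym` or raising `death_tic` stands in for a TIC-directed intervention, while raising `death_non_tic` alone stands in for a therapy that spares the TIC compartment. Comparing the two trajectories is one simple way such a model could be interrogated before experimental testing.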
Aim 2.2: To predict the TIC pathways or key genes related to specific cancer subtypes so as to refine the TIC microenvironment model.

Bioinformatic analysis of DNA microarray and proteomic data generated in Specific Aim 1.2 of Component 1, coupled with the genetic and pharmacologic manipulations of TIC function in Aims 1.3 and 1.4, will enable us to identify key candidate components in the pathways that are related to cellular behavior and survival. Subsequently, we will map these signaling pathway factors to specific tumor cell types and further to specific cellular properties by modeling the properties as functions of the factors. For example, $p_{sy} = f(x_1, \ldots, x_n)$ and $p_{asy} = g(x_1, \ldots, x_n)$, where $p_{sy}$ and $p_{asy}$ are the symmetric and asymmetric self-renewal rates, $x_1, \ldots, x_n$ are genes/factors, and $f$ and $g$ are the functions that model the relationship between these rates and the genes in TIC pathways. The TIC mE model will, in turn, be refined based on the newly inferred pathway and network information. With the network of genes integrated into the biomathematical model, predictions can be made by changing the parameter values for the network components, so that a subset of key factors will be found. These predictions will guide the iterative functional genomics experiments in Aim 1.3 of Component 1 to focus on the most likely gene candidates.

Aim 2.3: To develop bioimaging informatics models for mapping gene functional networks within and among TIC and niche cells from the directed iterative shRNA screen and to further refine the TIC mE model.

We will develop bioinformatics models for discovering gene functional networks by integrating gene function annotation results from the shRNA genome subset screening in Specific Aim 1.3 with publicly available multi-modality genomic data. We will first develop an integrated image analysis system for shRNA screens and score each gene based on the phenotypic information; we will then develop an image-based systems biology approach to study the gene functional networks.
Biological processes are often orchestrated by groups of genes, so studying gene functional networks is important for understanding gene function. Combined with prior knowledge, the gene functional annotation results from the shRNA screen have the potential to identify known or suspected interacting proteins, immediate upstream regulators, and downstream targets. New experimental data that are unanticipated by the model can be used to further improve our mathematical TIC mE model.

Aim 2.4: To model the response of TIC and their microenvironment to genetic and pharmacological manipulations of TIC function in vivo.

Based on our ability to assay the relationship between exposure to signaling inhibitors and gene expression in relatively pure cell populations, as well as the mathematical model linking molecular-level data to the cellular and tissue levels, we can adjust the model to predict the response of TIC to new drug candidates. Technologies will be designed to elucidate, interrogate, and model the role of physical forces in varying cellular functions, including cellular ligand-receptor interaction, cell proliferation, differentiation, cell cycle evaluation, apoptosis, evolution of tumor phenotypes, and motility, in order to facilitate an increased understanding of the role that physical forces play in cancer pathology and metastasis. Different conditions (e.g., metastatic or non-metastatic stage, increased or decreased motility, changes in intracellular mechanics, and the ability of cells to interact with the environment) will all be included in modeling the distribution of tumor-initiating cells. Aim 2.4 and Aim 1.4 will proceed iteratively, refining the mathematical model in order to derive more robust drug candidates for inhibiting or managing TIC.

Coherence and Synergy of Specific Aims between Component 1 and Component 2

In this section, we provide a summary of the aforementioned Specific Aims and elucidate the coherence and synergy between Component 1 and Component 2. As proposed, Aim 1.1 of Component 1 will identify, localize, and purify TIC using newly developed experimental biotechnologies and will also analyze the interactions between TIC and their microenvironment. Armed with such information, in Aim 2.1 of Component 2, we propose to construct a biomathematical model describing the cellular behavior of TIC and their interactions with the surrounding cellular and non-cellular components. Further experiments will be carried out in Specific Aim 1.2 of Component 1 based on the feedback of the results obtained in Aim 2.1.

Next, in Aim 1.2 of Component 1, we will identify candidate genes and pathways that may regulate TIC behavior by using genomic and proteomic analysis. Correspondingly, in Aim 2.2, we will use advanced bioinformatics algorithms to determine which of the components identified in Aim 1.2 form a pathway or network that may regulate cellular behavior. We can then investigate models to describe the interactions in this network and map these genes and proteins to specific cellular properties. In this way, we can refine the mathematical TIC mE model derived in Aim 2.1 by incorporating the function of gene networks. Since the initial pathway network can be very large and complex, we will divide the network into several sub-networks according to their functions for better navigation and manipulation. With the refined TIC mE model in Aim 2.2 of Component 2, we can study the effects of all components in these pathway sub-networks by changing their values in the mE model, through which we can predict the outcomes of up-regulation or down-regulation of certain genes.
Thus, we can find the key components in each sub-network, which can be seen as hypotheses for biological mechanisms underlying cellular behavior. The genes in these sub-networks will also be the candidate genes used for the directed iterative shRNA screen in Aim 1.3 of Component 1. To validate the hypotheses, Aim 1.3 will first evaluate the effects on TIC properties of the genes found in Aim 2.2, using mammosphere-formation cellular assays, a surrogate assay for TIC and progenitor cell function. Corresponding bioimaging informatics techniques for analyzing these data are proposed in Aim 2.3. In this way, new experimental data will be generated and analyzed to validate the predictions made by the refined model in Aim 2.2. These data can also further refine our TIC mE model, which will be employed to guide the definition of cellular responses of TIC to genetic and pharmacologic manipulation as well as drug response prediction. After these Specific Aims are completed, an integrated and robust biomathematical model will be established, including interactions from the sub-cellular to the cellular and tissue levels.

Similarly, in Aim 2.4, we will first use data from previous findings of the TIC mE model to incorporate the effects of drugs/shRNA into our model and then predict the potential outcomes by modifying treatment-related parameters. In Aim 1.4, we will investigate further the effect of the drugs (e.g., inhibitors of TIC) experimentally, with the purpose of generating data for validation and improvement of our TIC mE model.

N1.2. CSMCaD Center Organization

The proposed Center for Systematic Modeling of Cancer Development (CSMCaD) is composed of a multi-disciplinary team of investigators from several institutions across Texas Medical Center, including The Methodist Hospital-Weill Cornell Medical College, Baylor College of Medicine, and the University of Texas Health Science Center (UTHSC) at Houston.
The Methodist Hospital, Baylor College of Medicine, and UTHSC at Houston are located within walking distance of each other at Texas Medical Center. This geographic proximity facilitates synergy and interaction among the Methodist-Cornell, Baylor, and UTHSC teams. Many of the team members have been collaborating on a number of research projects on breast and other types of cancer, including those requiring new techniques in computational biology, bioimaging, pathway inference, tumor invasion microenvironment modeling, and computational modeling of drug treatment response. The CSMCaD will be led by a group of established researchers with track records in managing large-scale, nationally allied projects, including the PI, Dr. Stephen Wong, and the other core PIs at the partnering sites: Dr. Michael Lewis, Dr. Jeffrey Rosen, and Dr. Suzanne Fuqua at Baylor College of Medicine, Dr. Xiaobo Zhou at Methodist, and Dr. Vittorio Cristini at UTHSC.

The PI, Dr. Stephen Wong, John S. Dunn Distinguished Endowed Chair of Biomedical Engineering and Professor of Bioengineering and Computer Science in Radiology at Weill Cornell Medical College, is an established scientist and seasoned project manager. He has extensive experience in leading national biomedical research networks. Before he moved to The Methodist Hospital in May 2007, Dr. Wong was the Co-PI and management core PI of NAMIC (National Alliance of Medical Image Computing), an NIH National Center of Biomedical Computing for the analysis and visualization of medical images. Dr. Wong was also the informatics Co-Chair for fBIRN (functional Biomedical Informatics Research Network) and a member of the fBIRN steering committee.
fBIRN is an integral part of BIRN, another NIH Roadmap-funded initiative that fosters distributed collaborations in biomedical science by utilizing information technology innovations. Dr. Wong brings to the CSMCaD more than two decades of experience in building and modeling large-scale systems and managing scientific and product development for leading institutions in academia and industry, including HP, AT&T Bell Labs, the Japanese Fifth Generation Computer Systems project, Philips Medical Systems, Charles Schwab, UCSF, and Harvard. He was a key member of the pioneering UCSF PACS (picture archiving and communication system) program, headed scientific industrial labs at Philips Research and product development departments of Philips Medical Systems, managed the technology division of Charles Schwab, and created several research labs and centers during his tenure at Harvard, including the HCNR Center for Bioinformatics at Harvard Medical School, as well as the Functional and Molecular Imaging Center, Optical Imaging Laboratory, and Conjugate and Medicinal Chemistry Laboratory at Brigham and Women’s Hospital. He directed interdisciplinary teams of over 400 scientists, researchers, and engineers globally while in industry. Dr. Wong received his executive education from MIT Sloan School, Stanford University Graduate School of Business, and Columbia University Graduate School of Business.

Figure 2. The organization structure of the Center for Systematic Modeling of Cancer Development (CSMCaD)

The core leadership team of the CSMCaD is composed of a group of established scientists and researchers from multiple disciplines, including computational biology, imaging bioinformatics, molecular biology, cancer biology, imaging chemistry, bioengineering, biophysics and instrumentation, and clinical oncology, as well as staff members in administrative and other supporting cores.

N1.3. Management

The PI, Dr.
Stephen Wong, will direct and manage the daily operation and coordination across sub-teams of the CSMCaD projects. The site Core PIs at Baylor College of Medicine will take responsibility for Component 1 (Experimental systems biology), and the Core PIs at Methodist-Cornell and UT Health Science Center will be responsible for Component 2 (Computational biology and modeling predictive medicine). The PI will also work with the core PI at Baylor, Dr. Lewis, on Component 3 (education and training) and will manage the pilot projects together with the co-PI and project manager of the Administrative Core, Dr. Fei Cao of Methodist-Cornell. The PI and Core PIs will have regular meetings and ad hoc conversations on project progress. The PI will assume the ultimate responsibility for ensuring smooth execution of this project, with the Internal Advisory Committee (IAC) assisting in conflict resolution. The members of the IAC will include Dr. Wong and the core PIs of the three partnering sites, Dr. Michael Lewis, Dr. Jeffrey Rosen, Dr. Xiaobo Zhou, and Dr. Vittorio Cristini.

The center will also form an External Advisory Panel (EAP) of three to five experts in the areas of cancer cell biology and computational biology. This panel is primarily designed to provide feedback and suggestions and will visit and interact with the center at least once a year. The EAP will also act as counsel for the investigators in managing resources, resolving potential administrative issues that arise, and ensuring the smooth integration of the Methodist-Baylor-UTHSC partnership. The EAP will be selected by the PIs with consent from the NCI program officer. The applicants will identify the desirable profiles of prospective EAP members.
We have also invited a strong consultant team composed of well-known experts in cancer biology, stem cell biology, systems biology, bioinformatics, cancer microenvironment modeling, and in vivo stem cell labeling, including Professor Norbert Perrimon at Harvard Medical School, Professor Dihua Yu at UT MD Anderson Cancer Center, Professors Margaret Goodell and Daniel Medina at Baylor College of Medicine, Professor Michael Zhang at Cold Spring Harbor Laboratory, Professor Muhammad Zaman at UT Austin, and Professor Charles Lin at Massachusetts General Hospital (MGH), Harvard Medical School. They will contribute their experience to guide the proposed project.

To ensure effective management, a relatively small leadership team (PI and Core PIs) will be formed to coordinate primary projects, task-specific projects, and supporting core activities. This team will also bring many facets of knowledge to bear upon the decision-making process, enabling faster, more effective decisions about the direction of the scientific research of the CSMCaD. This will be especially important in view of the demands of working with research groups across multiple disciplines and multiple institutions. The small yet representative nature of the team will minimize overhead costs and ensure swifter communications. Additional input to the decision-making process will come from the core leaders.

N1.4. Interdisciplinary Research Team

The proposed CSMCaD is composed of a multi-disciplinary team of investigators across three major institutions at Texas Medical Center, i.e., The Methodist Hospital Research Institute, Baylor College of Medicine, and the University of Texas Health Science Center.
The expertise ranges from basic science, such as cell biology and cancer genetics, to applied technology, such as computational systems biology, in silico cancer cell-matrix and cell microenvironment modeling, and software development, as well as clinical disciplines, such as clinical pathology and oncology. The PI, Dr. Wong, and the core PIs, Drs. Lewis, Rosen, Zhou, and Cristini, each bring complementary skills and capabilities to the proposed Center for cancer systems biology at Texas Medical Center. Dr. Wong is a world-renowned leader in biomedical informatics and image computing, while Drs. Lewis and Rosen have expertise in the fields of breast development and function, as well as in breast cancer research, particularly in the area of normal and malignant stem cell biology. Dr. Zhou has extensive research experience in bioinformatics and image bioinformatics. Dr. Cristini has rich experience in tumor invasion modeling and drug response simulation. Working together, the team will be responsible for the overall direction of the CSMCaD, and the planning, management, coordination, and integration of all contract activities. They will also be responsible for the scientific and technical leadership of CSMCaD, its implementation, interfacing with ICBP staff and subcontractors, and ensuring that deliverables and milestones are achieved according to established timetables. The comprehensive expertise and experience of the investigator team show that it is well qualified to carry out the proposed project.

Stephen TC Wong, PhD (PI): please see the description of the qualifications and experience of the PI in Section N1.2.

Fei Cao, PhD (Project Manager), Director of Clinical Research Informatics Lab, Bioinformatics & Biomedical Engineering Program, TMHRI, and Assistant Professor of Bioinformatics in Radiology, WCMC, will serve as a project manager to coordinate the management of the proposed center by working closely with Dr. Wong and the co-PIs. Dr.
Cao has over fifteen years of biomedical informatics and scientific project management experience.

Michael Lewis, Ph.D. (Core PI), Assistant Professor of Molecular and Cellular Biology in the Lester and Sue Smith Breast Center at Baylor College of Medicine. Dr. Lewis is trained in normal mammary gland development and breast cancer. His main research focus is the role of hedgehog signaling in the regulation of mammary stem cells and functional differentiation at lactation. His more recent collaborative work with Drs. Jenny Chang and Jeff Rosen has been in the area of identification of tumor-initiating cells and characterization of their intrinsic chemo-resistance phenotypes in clinical samples after treatment with conventional therapeutics. In addition, Dr. Lewis has expertise in in vivo analysis of experimental therapeutics and the effect of experimental therapeutics on the tumor-initiating population. Dr. Lewis has long-standing collaborations with Drs. Chang, Rosen, Hilsenbeck, and Edwards, and has collaborated with Dr. Wong and his groups over the past few years. Dr. Lewis will also serve as the Core co-PI for the educational component in collaboration with Dr. Suzanne Fuqua. Dr. Lewis has extensive teaching experience at the undergraduate and graduate levels, as well as training experience at the graduate and postdoctoral levels.

Jeffrey Rosen, Ph.D. (Core PI), C.C. Bell Professor of Molecular and Cellular Biology, BCM. Dr. Rosen is an internationally recognized leader in the area of hormonal regulation of mammary gland development, stem cells, and the molecular biology of mammary gland gene expression. Dr. Rosen’s laboratory, in collaboration with Dr. Peggy Goodell at Baylor College of Medicine, was the first to identify functional stem/progenitor cell markers in the normal mammary gland.
His laboratory has extended these studies to the characterization of tumor-initiating cells (TICs) in a p53 null mouse model of breast cancer and, using microarray and functional assays, has identified intrinsic differences in cell cycle checkpoints and DNA repair pathways in TICs. In collaboration with Dr. Charles Perou’s laboratory, Drs. Rosen, Jenny Chang, and Michael Lewis at BCM have shown that TICs, often termed “cancer stem cells,” defined in human breast cancers share many common genomic patterns with a new molecular subtype called the “claudin-low” subtype, which was identified by comparative mouse-human oncogenomics. A tumorigenic signature derived from the TICs was present as a small population in all breast cancer subtypes, as revealed by the analysis of residual tumor cells post-therapy. This collaborative study is currently in press in the Proceedings of the National Academy of Sciences.

Xiaobo Zhou, Ph.D. (Core PI), Associate Professor of Bioinformatics in Radiology, Weill Cornell Medical College, and Chief of the Bioinformatics and Bio-image Computing Laboratory, Bioinformatics and Biomedical Engineering Program, TMHRI, will serve as a core PI of Component 2 of CSMCaD. He is an expert in applying advanced mathematics, pattern recognition, computer vision, signal processing, and data mining techniques to analyzing and modeling biological data and images, particularly those generated from high-throughput biotechnology such as genomics, proteomics, tissue arrays, and high content screening. He and Dr. Wong pioneered the field of image bioinformatics and image-based systems biology. They lead the development of a family of new image bioinformatics packages for cell biology and neurobiology studies. They have recently co-authored one of the first books in Computational Systems Bioinformatics.

Vittorio Cristini, Ph.D.
(Core PI), Associate Professor, School of Health Information Sciences at UTHSC at Houston, will serve as a core PI of Component 2. Dr. Cristini is also affiliated with the Bioengineering Departments at UT Austin and UT MD Anderson Cancer Center. His group seeks to integrate experimental and computational methods in an effort to investigate tumor biology. Currently, he focuses on examining the role of tumor micro-environmental spatial and temporal heterogeneity in promoting invasive and eventually metastatic cancer phenotypes. His group develops and applies multi-scale, predictive, computational cancer models based on well-established principles of physics, mathematics, and cancer biology that utilize state-of-the-art numerical techniques. This integrative framework allows the group to form and test hypotheses that drive experimental investigation, which in turn provides data to refine its biomathematical models.

Suzanne A.W. Fuqua, Ph.D. (Core PI), Professor of Medicine, The Lester and Sue Smith Breast Center. Dr. Fuqua is a co-PI for the Breast Center training grant, the course director for the Translational and Clinical Breast Cancer course, and an internationally recognized leader in the areas of estrogen receptor function in breast cancer, hormone therapy resistance, and metastasis. Dr. Fuqua has extensive training experience at the graduate and postdoctoral levels. She will be an invaluable resource for multiple aspects of the project, particularly the educational component, which she and Dr. Lewis will oversee jointly.

Mary Dickinson, Ph.D., Associate Professor, BCM, will serve as a co-investigator to guide in-vivo imaging.
Her laboratory uses a multi-disciplinary approach, including microscopy, molecular biology, and fluid mechanics, to study the role of fluid-derived mechanical forces in vascular remodeling and heart morphogenesis in early vertebrate embryos; her lab has developed methods for time-lapse, confocal imaging of rapid blood flow and heart mechanics using vital fluorescent protein reporters. Her lab is part of the Molecular Physiology and Biophysics Department at BCM.

Jenny Chang, M.D., Medical Director of the Lester and Sue Smith Breast Center and Professor of Medicine at Baylor College of Medicine, and Chief of Breast Medical Oncology, Ben Taub General Hospital, will serve as a co-investigator to guide the clinical aspects of the CSMCaD project. Dr. Chang has extensive clinical and laboratory experience in the area of therapy resistance, gene expression analysis for the prediction of treatment response, and evaluation of experimental therapeutics, both pre-clinically and clinically.

Thomas Westbrook, Ph.D., Assistant Professor of Biochemistry and Molecular Biology at BCM. Dr. Westbrook was recently recruited to Baylor from Dr. Steven Elledge’s laboratory at Harvard University and brings with him extensive expertise in the use of lentivirally-delivered shRNA. Dr. Westbrook is the Director of the Cell Based Assay Screening Service (C-BASS) shRNA core facility and has conducted genome-wide shRNA screens similar to the Directed Iterative Functional Genomics Screen proposed. Dr. Westbrook has also spearheaded the development of novel inducible lentiviral shRNA vectors for use specifically in mammary epithelium, vectors that will be used extensively in Component 1, Aim 1.4.

Susan G.
Hilsenbeck, Ph.D., Professor of Medicine and Director of Biostatistics and Bioinformatics in the Lester and Sue Smith Breast Center at BCM, will serve as a co-investigator to guide the statistical analysis. Dr. Hilsenbeck is an internationally recognized biostatistician with extensive experience in clinical trial design, microarray analysis, and preclinical study design. Dr. Hilsenbeck will work extensively with Dr. Shaw, and both will serve as intellectual and communication bridges between the laboratory-based Component 1 and the mathematical and computational modeling-based Component 2 of this proposal.

Chad Shaw, Ph.D., Assistant Professor of Genetics at BCM, will serve as a co-investigator responsible for the initial analysis of the genomics and proteomics data generated in Component 1. His research interests are systems biology and the analysis of large-scale genomic data. His group analyzes primary microarray data sets from all array platforms, including expression arrays, genome content arrays (aCGH), microRNA arrays, and chromatin arrays, with expertise in data pre-processing and normalization.

Dean P. Edwards, Ph.D., Professor of Molecular and Cellular Biology, BCM, will serve as a co-investigator responsible for the statistical analysis. Dr. Edwards is an internationally recognized expert in hormonal regulation of breast cancer and gene expression. He has extensive experience in the development of monoclonal antibodies, protein biochemistry, and, more recently, high-throughput proteomics analyses. Dr. Edwards is the director of the Proteomics Shared Resource of the Dan L. Duncan Cancer Center at Baylor College of Medicine. He will work closely with Drs. Huang and Engler on the proteomic analyses proposed herein.

Shixia Huang, Ph.D., Assistant Professor of Molecular and Cellular Biology, is an expert in proteomic analyses, particularly using protein and antibody microarrays.
In addition, she has extensive knowledge of issues related to breast cancer and mammary gland development. She will work closely with Dr. Edwards and Dr. Engler on the proteomic analyses proposed.

Ching Tung, Ph.D., Professor of Radiology, WCMC, and Director of the Diagnostic and Imaging Probes Lab, TMHRI, will serve as a co-investigator to guide the in-vivo TIC labeling in the mouse model and assist Dr. Wong (PI) in managing the pilot projects. Dr. Tung’s research focuses on creating molecular probes to detect biomarkers in various types of diseases. Dr. Tung has pioneered the field of optical molecular probes for in vivo imaging. Over the past few years, he has applied the development of novel multi-functional molecules to molecular sensing, in vivo molecular imaging, therapy, and drug delivery.

Jeff (Chung-Che) Chang, M.D., Ph.D., Professor of Pathology & Laboratory Medicine, WCMC, and Director of Hemopathology, TMHRI, will serve as a co-investigator, participating in the TIC mE modeling, pathway inference, and data analysis. Dr. Chang is a hematopathologist with research interests in myeloma, diffuse large B-cell lymphoma, and myelodysplastic syndromes. Dr. Chang focuses on clinical and translational research aimed at identifying markers and signaling pathways that are important for the diagnosis, prognosis, and treatment of hematological malignancies. Dr. Chang has rich experience in DNA/cDNA/tissue microarray analysis, flow cytometry, myeloma stem cells, and molecular diagnostic techniques. He will guide the data analysis and algorithmic development.

John Baxter, M.D., Professor of Medicine, WCMC, Director of the Genomic Core and Co-Director of the Diabetes Center, TMHRI, and Chief of Endocrinology, Department of Medicine, TMH, will serve as an investigator. He was Chief of the Division of Endocrinology and Director of the Metabolic Research Unit at UCSF. He also co-founded several startup companies in biotechnology. Dr.
Baxter will provide guidance for the software development and data analysis.

David Engler, Ph.D., Director of the Proteomics Core at TMHRI, will serve as an investigator. He will be responsible for guiding the protein-level quantification and verification work necessary to validate or confirm the genomics-level data that the CSMCaD will be analyzing. The Proteomics Core will also provide assistance in protein-level pathway analysis stemming from systems-level biological data directly correlated to, or inferred from, many of the genomic, epigenomic, or transcriptomic changes observed in the cancers studied by the CSMCaD.

Paul Macklin, Ph.D., Assistant Professor of Health Informatics at the School of Health Information Sciences at the University of Texas Health Science Center at Houston, will serve as a co-investigator to work on cancer mE modeling. He works at the intersection of biology, medicine, mathematics, and computer science to develop and validate sophisticated computer models of cancer. He works under Vittorio Cristini on several research projects in the field of computational and predictive oncology.

Xiaofeng Xia, Ph.D., Instructor, Medical Systems Biology Lab, The Methodist Hospital Research Institute, will serve as a co-investigator in Component 2 to guide the TIC mE modeling. He completed his postdoctoral training in stem cell research at the University of Wisconsin-Madison and subsequently spent three years as a research scientist at WiCell working under Dr. James Thomson. He has worked in a number of areas, including bioinformatics, neurotransmitter release, and neuronal differentiation of stem cells.

N1.4.
Consultants

In addition, we have assembled a team of leading experts in bioinformatics, cancer biology, computational biology, computational genomics, stem cell biology, and systems biology as consultants to this project. These consultants will serve as “thought leaders” in their respective areas of expertise and for the CSMCaD as a whole to guide the development of biologically-related pathway analysis and data analysis. They will also be called upon to beta test new software and provide feedback on the current CSMCaD performance. The consultants include Dr. Margaret Goodell, Professor and Director of the STaR Center, Baylor College of Medicine, a leading scientist in the basic biology of hematopoietic stem cells and the study of stem cell behavior in vivo and in vitro using mouse stem cells as a model; Dr. Daniel Medina, Professor of Molecular and Cellular Biology, Baylor College of Medicine, an internationally recognized leader in mammary gland development and preneoplastic breast disease; Dihua Yu, MD, PhD, Nylene Eckles Distinguished Professor and Vice Chair of Molecular and Cellular Oncology and Director of the Cancer Biology Program at M.D. Anderson Cancer Center, a leader in understanding breast cancer initiation, metastasis, therapeutic resistance, apoptosis, cell cycle control, signal transduction, cancer stem cell-like properties, microRNA deregulation, cancer deregulation of cellular metabolism, and molecular imaging of breast cancer; Michael Zhang, PhD, Professor of Computational Biology and Bioinformatics at Cold Spring Harbor Laboratory, a well-known expert in building a comprehensive network of the genes involved in the regulation of growth and homeostasis, which can provide a "system level" understanding of genes and pathways; Norbert Perrimon, PhD, Professor of Genetics at Harvard Medical School, a leader in whole-genome RNA interference (RNAi) and other chemical genetics screens using high-throughput screening and pathway analysis; Dr.
Charles Lin, Associate Professor, Wellman Center for Photomedicine & Center for Systems Biology, Massachusetts General Hospital & Harvard Medical School, a leader in cancer stem cell labeling in in-vivo animal models, in-vivo monitoring of cell trafficking in circulation, imaging of vasculature and microenvironment in tissue, and interaction of cells with the microenvironment; and Dr. Muhammad H. Zaman, Assistant Professor, Department of Biomedical Engineering, The University of Texas at Austin, a leading scientist in modeling the cancer cell microenvironment to understand how cancer cells interact with the extracellular matrices in native environments, employing computational and experimental tools of biophysics, cell biology, mechanics, and chemistry to study cancer-related problems.

N1.5. Timeline for the overall program

An outline defining how novel analytical tools will be developed and applied to ICBP data will be generated and submitted to the NCI within the first three months of the project. We request five years to complete the proposed project. The timeline of this project is listed in Table 1.

Table 1: The estimated timeline of the CSMCaD project.
Tasks Year 1 Year 2 Year 3 Year 4 Year 5 TIC identification, labeling & distribution study, TIC mE image analysis and TIC mE modeling TIC distribution study, TIC mE modeling, genomics, proteomics & mechanism study in regulating TIC, TIC Pathway inference, and TIC mE remodeling TIC mE modeling, mechanism study in regulating TIC via pathway analysis, shRNA interference screening, target discovery for regulating TIC shRNA interference screening, bioimaging informatics for candidate target discovery, and TIC mE refining Integration study based on in-vitro/in-vivo information; TIC mE modeling and prediction for drug treatment response

Section N.2.
Administrative Core

The specific aim of the administrative core is to provide a flexible yet effective administrative structure to support the infrastructural and scientific aims, in view of the many-faceted interactions that must necessarily occur among the CSMCaD teams. To accomplish this aim, we will develop and execute a management plan based on a balanced management strategy that supports an environment of shared decision-making and mutual responsibility among the core PIs, while providing the oversight and leadership necessary to produce quality biomedical imaging work. We will manage the overall CSMCaD project using sound basics, including phased delivery, quick and concrete feedback, clear articulation of the project needs, project tracking and oversight, effective governance, and inter-group coordination.

The overall organization and administrative structure of this CSMCaD project is shown in Figure 2. The PI, Dr. Stephen Wong, and the project manager (PM), Dr. Fei Cao, along with Ms. Sample and Ms. Roberts, will direct and manage the daily operation and coordination across subteams of the project. The four specialized cores, including the administration core, experimental system biology core, computational biology core, and education & training core, will provide the infrastructure to execute and support the proposed research projects. The PI, PM, and Core PIs will have regular meetings to interact with each other. The PI will assume the ultimate responsibility for ensuring smooth execution of this project, with the internal advisory committee to assist in conflict resolution. The members of the advisory committee and the management are described in Section N1.3.

N.2.1 Management Plan

The management plan encompasses two balanced goals: Effective Management and Oversight of Quality Research Work.
Effective management: The goal for effective management includes the creation of a relatively small leadership team (PI and Core PIs), which will be responsible for coordinating primary projects, task-specific projects, and supporting core activities. This team will also bring many facets of knowledge to bear upon the decision-making process, enabling faster, more effective decisions to be made about shaping the direction of the scientific research. This will be especially important in view of the demands of working with research groups across multiple collaborative institutions. The small yet representative nature of the team will minimize the cost of overhead and ensure swifter communications. Additional input to the decision-making process will come from the core directors.

Oversight of quality research work: Each of the cores will set its own goals and competencies that will dovetail with the overall CSMCaD objectives. To a great extent, various research projects will run themselves. It will be the role of the leadership team of CSMCaD to shape the overall scientific direction and to bring projects back in line when they go off course, as sometimes occurs in a multidisciplinary, multi-institutional research setting.

The management strategy that emerges for the proposed CSMCaD is one of decentralized decision-making rather than centralized control. The mission of the leadership team is to make the entire multi-disciplinary project a success.
To that end, the tactics derived from the balanced goals of the management strategy include (1) providing a relatively horizontal organizational structure as opposed to the vertical hierarchy that typifies most corporate organizations; (2) implementing project management, quality controls, and service controls; (3) enabling autonomous units; (4) emphasizing flexibility and responsiveness; (5) collecting performance metrics and ensuring that they are reviewed by those whose work is being measured; (6) embracing changes when necessary; and (7) intervening occasionally in situations that cannot be resolved from a distance. These are the characteristics of organizations that succeed over time.

Accordingly, the specific aim of the leadership team of CSMCaD will be accomplished by: (1) establishing an effective management structure; (2) deploying project tracking and oversight mechanisms; (3) administering overall center budgets and fiscal matters; (4) documenting progress reports and accomplishments; (5) organizing and scheduling meetings, including annual all-hands meetings, regular committee meetings, regular core meetings and teleconferences, and ad hoc meetings among investigators of different projects and cores; (6) providing and reinforcing guidelines and policies regarding human subject protection, inclusion of women, minorities, and children in research, care and use of vertebrate animals in research, software licensing, intellectual property, and publication of peer-reviewed scientific papers; (7) reinforcing the NIH data-sharing policies; and (8) coordinating and interfacing with the ICBP steering committee and other ICBP centers.

N.2.2. Organizational Roles of Cores

Administration core: A highly effective administrative resource is critically important in the successful establishment of this CSMCaD. This core will be the administrative center of the Methodist-Baylor-UTHSC team.
The administrative core resource will consist of the PI, the Core PIs, a Research Project Coordinator, and an Administrative Assistant. The CSMCaD advisory committee will communicate directly with this resource in terms of monitoring progress, providing evaluation, and counseling the PI on issues arising during the administration of the Center. The administrative core resource will coordinate the administration of the CSMCaD, organize the steering and advisory committee meetings, and track milestones and project progress. Retreats, symposia, seminars, and meetings will be coordinated and organized through this resource. Monitoring and reconciliation of the various budgets and facilitation of the purchase of supplies will also be provided by this resource.

The CSMCaD project team consists of seventeen investigators spread over three institutions. Efficient communication and a high level of interaction will be achieved through the administrative resource, which will include the maintenance of an interactive Wiki project web site and annual all-hands meetings. Furthermore, email lists will be provided for daily postings on day-to-day activities and information. The administration core will also have support from TMHRI administration resources. Under the direction of Edward Jones, M.B.A., Vice-President in charge of administration, TMHRI has a fully developed and staffed research administration and support infrastructure. Among its innovations is web-based management of the document flow for the Institutional Review Board and other research administration functions. TMHRI is fully compliant with all NIH Grants Policies and OHRP policies regarding human research subjects.

The administration core will ensure synergies and resource sharing between Component 1 and Component 2. The weekly meeting between these two components will take place as usual. The core will also oversee the management of Component 3, Education & Training (see details in Section N5).
N.2.3. Management of the Research Workshop

Annual Interdisciplinary Symposium: The Methodist, Baylor, and UT teams will host an annual 2-day symposium at Texas Medical Center featuring keynote speakers that will include local, national, and international leaders in relevant systems and cancer biology research, as well as leaders from the other ICBP centers and members of the external advisory panel. We will feature live demonstrations of the site, portal, and analysis tools, and hands-on tutorial workshop training sessions. Annotation jamborees will also be scheduled during this event to establish community consensus and standardization of terms or nomenclature. Scholarships will be available to assist young investigators (students and post-doctoral fellows) with travel, lodging, and registration expenses related to the event. Houston is an ideal location for such an event for several reasons: 1) it is the hub for Continental Airlines, one of the nation’s largest, which means that investigators can easily travel here via nonstop flights from most cities, including many major foreign cities; 2) Houston’s central location in the country allows attendees to avoid long, transcontinental flights; and 3) hotels in Houston are more reasonably priced than those in most other major American cities.

N.2.4 Management of Pilot projects

The TMHRI will actively solicit opportunities to collaborate with the scientific community on biological studies of the cancer stem cell microenvironment that involve the application of high-throughput technologies, modeling, and bioinformatics techniques.
The Pilot Projects or the Driving Biological Projects (DBP) funding mechanism will fund projects that have the potential to significantly contribute to the field of cancer stem cell microenvironment and enhance the capabilities and usefulness of the CSMCaD. TMHRI will seek to fund transformative projects whose outcomes will have implications for a broad array of cancer systems biology. For example, these projects could include in-vivo labeling of cancer stem cells and other cells, drug combination prediction, development of new models of tumor metastatic tissue, development of new analytical tools for next-generation sequences, the deep characterization of drug resistance, and novel software to integrate in vivo imaging data with other data modalities, such as optical microscopy data. The CSMCaD plans to fund four projects in each of years 1 and 2, three projects in each of years 3 and 4, and two projects in year 5, for a total of 16 one-year projects over the 5-year period. The first four projects will start in August of 2010. We will post an announcement open to all experimental laboratories in the United States once the CSMCaD is funded. Interested investigators will be invited to submit their proposals. The review panel will consist of the PI, the Core PIs, and the members of the external advisory panel. We anticipate that the TMHRI Bioinformatics and Biomedical Engineering Program will provide the necessary bioinformatics support to the awardees.

Solicitation and Review of Proposals: The strategy that we will use for solicitation and review of proposals is analogous to one that TMHRI and Baylor currently use successfully to manage several internal seed funding proposal programs. White paper proposals will be solicited from the scientific community through announcements posted on the CSMCaD website and other outreach activities, including ads in scientific journals and flyers and posters distributed and posted at scientific meetings.
The Pilot Projects or DBP Program will especially be advertised to all cancer and related investigators in the United States. The DBP white paper guidelines will also be included in the solicitation. The proposals will be submitted electronically via the streamlined Methodist Online Research Technology Initiative (MORTI) system of TMHRI, which forwards the applications to the Research Project Coordinator, then to reviewers and grants staff for budget review. TMHRI investigators currently use this facile electronic system extensively. Award notification and grant management are also handled by the MORTI system.

We will first review the proposals for responsiveness to the solicitation. Proposals deemed responsive to the solicitation will then be sent out to external expert reviewers selected by the management committee. We will use an NIH-style peer-review panel process to rank-order the proposals. Reviewers will assign priority scores to the proposals and return them to the Project Manager within one month. The four applications with the best priority scores will then be forwarded to the ICBP Project Officers for their review and approval. The Research Project Coordinator, with input from the management committee, will address any questions or concerns the officers may have regarding the projects and their management in her Final DBP Project Plan, to be submitted to the officers within 2 weeks of officer approval of the projects. Her plan will include monitoring procedures for the projects and will establish metrics and milestones. The monitoring procedures will likely emulate a U-series reporting structure and include semi-annual reports to be submitted to the Project Coordinator one month prior to the due date of the BRC semi-annual progress report. Upon receipt of written approval of the Final DBP Project Plan from the PI and ICBP officers, funds will be disbursed to grantees.

N2.5.
Existing Supporting Cores (Environments and Resources)

Resources of The Methodist Hospital Research Institute. A complete account of available resources for this proposal is provided in the resource pages of this proposal. Briefly, TMHRI Cores participating in the ICBP Project include the Bioinformatics and Biomedical Engineering Programmatic Core (Stephen Wong, Director and PI on this project, and Xiaobo Zhou, Lab Chief and a core PI), Cellular and Tissue Microscopy Core (Stephen Wong, Director and PI on this project), Animal Imaging Core (Stephen Wong, co-Director and PI), Genomics Core (John Baxter and Paul Webb, Director and co-Director as well as co-investigators on this project), and Proteomics Core (David Engler, Director and co-investigator on this project). These specialized cores at TMHRI provide the infrastructure to execute and support the proposed research projects.

Resources of the UT Health Science Center are provided in the resource pages.

Resources of the Baylor College of Medicine. A detailed list is also provided in the resource pages. Briefly, the institutional core facilities available to support this center are the Gene Expression Core (microarray/qPCR), Proteomics Core (Dean Edwards, Director and co-investigator on this project), C-BASS Core (Thomas Westbrook, Director and co-investigator on this project), Cytometry and Cell Sorting Core, Vector and Virus Production Core, Integrated Microscopy Core, the Genome Sequencing Center, and the Genetically Engineered Mouse Core. In addition to these institutional core facilities, the Lester and Sue Smith Breast Center, in which Dr.
Michael Lewis is a faculty member, has an Animal Handling and Imaging Core (Michael Lewis, Director and core PI on this project), Microarray Core, qPCR Core, Pathology Core, and a Bioinformatics/Biostatistics division (Susan Hilsenbeck, Director and co-investigator on this project). Thus, several of the institutional and center-based core directors are active participants in this project. In addition, Baylor College of Medicine has an interinstitutional agreement with the MD Anderson Cancer Center, which allows full use of MD Anderson Core facilities by faculty at Baylor at subsidized prices.

The site or core PIs will assume the oversight responsibilities for individual core resources in their institutions. The CSMCaD PI will assume the ultimate responsibility for ensuring smooth and streamlined operation of the core resources. The supporting core directors will work closely with CSMCaD members to ensure access to the core resources. The PI and the core directors will also have regular meetings to interact with each other.

N.2.6. Interaction with other ICBP centers

Our ICBP working group is headed by the PI, Dr. Wong. It facilitates interactions and collaborations in order to foster learning and improvements among the Centers, which could then also be applied to data generated by other groups. The CSMCaD will work under the guidance of the ICBP Steering Committee on the development, harmonization, and standardization of methods for data collection and analysis across the different platforms to be developed by centers in the ICBP network. The CSMCaD will work closely with other ICBP centers to identify and test methods suitable for performance validation of multi-scale data acquisition and multimodal imaging, leading to multi-center, multi-platform clinical studies involving all centers in the ICBP network.
The PI and the CSMCaD research group leaders will participate in the ICBP working groups to communicate information relevant to joint activities across the network centers and to create and maintain network-wide resources.

N2.7. Compatibility with caBIG

The software tools and data models developed in the CSMCaD will be made compatible with the NCI caBIG infrastructure, according to the caBIG Compatibility Guidelines. Interoperability between the CSMCaD tools and caBIG software is planned to be at least at the silver level, so that the barrier to third-party use of CSMCaD software is significantly reduced.

N2.8. Plan for Sharing Research Data, Resources, and Intellectual Property

All work generating primary data, datasets, algorithms, and protocols will be conducted at the PI’s labs (Bioinformatics and Biomedical Engineering Program, including the Laboratory for Medical Systems Biology and the Laboratory for Bioinformatics and Bio-image Computing) at TMHRI and at the Core PIs’ labs at Baylor and UTHSC. All data pertaining to this project will be deposited into the CSMCaD database. Data generated as part of this research project will be made freely available after publication. Our labs have an excellent track record of sharing reagents with the community. As in previous studies, we will present and disseminate our research results, software packages, and related publications and documentation on the public CSMCaD website hosted by the PI’s lab. We will post step-by-step instructions, including screenshots and test images, that users can download and run by themselves. On the CSMCaD website, we will set up a “Contact Us” option for users to send us questions and comments. Users may choose to register with us online using only their email addresses (registration is not required to download the software).
We will notify registered users by email about new releases and upgrades, in addition to posting the announcements online. All research data and analysis tools generated under this award will be made publicly available to the research community. This includes all data generated through the driving biological projects and all software and analysis tools developed for pathway analysis, data integration from different sources, and so on. All data will be released once they are verified. Any software algorithms and programs developed by the center and our collaborators will be made publicly available in a manner consistent with the goals of NIH for software dissemination, and unique biological information (DNA sequences, etc.) will be submitted to caBIG for wide dissemination to the research community. Research tools (including analysis tools, algorithms, software interfaces, source code, and other software technologies) developed or enhanced with contract support will be made available in caBIG and the CSMCaD web portal after they have been appropriately tested both internally and by our cadre of expert consultants. The CSMCaD is a collaborative program, and TMHRI and its Baylor/UTHSC partners will work within the guidelines established by the NIH and with other funded sites to prepare a joint dissemination plan upon notice of award.

Intellectual Property: TMHRI-Baylor-UTHSC will be responsible for acquisition of all proprietary and intellectual property rights needed to perform the projects proposed herein. TMHRI will work with our subcontractors to expedite the acquisition processes and stay on track with agreed-upon milestones. TMHRI has dealt with multi-institutional IP issues as part of our 670-page Clinical and Translational Science Award application submitted in 2008.

Figure 3.
The 92-gene taxotere (docetaxel) sensitivity predictor (Left) did not predict AC response (Right). Expression levels are shown in red (above the median for the gene) and blue (below the median for the gene).

Section N3. Previous Accomplishments for New Applicants

N3.1. Accomplishments in Stem Cell Biology and Breast Cancer Research

Gene expression patterns correlating with response to different chemotherapeutic regimens. As part of our previously funded SPORE grant project (Chang/Lewis), we generated gene expression signatures for the prediction of response to docetaxel [1] and anthracycline-based chemotherapy [2] in breast cancer patients, and identified genes regulated by hedgehog signaling agonists and antagonists in breast cancer cell lines (M.T. Lewis, in preparation). Docetaxel is one of the most active agents in breast cancer, but resistance or incomplete response is frequent. To determine whether a gene expression pattern present before treatment could predict sensitivity to docetaxel, core biopsies from 24 patients were obtained before treatment with neoadjuvant docetaxel, and response was assessed after chemotherapy. After 3 months of neoadjuvant chemotherapy, surgical specimens (n = 13) were obtained, and laser capture microdissection (LCM; n = 8) was performed to enrich for tumor cells. From each core, surgical, and LCM specimen, total RNA was extracted for cDNA array analysis using Affymetrix HgU95-Av2 GeneChip microarrays. In the initial core biopsies, differential patterns of expression of 92 genes correlated with docetaxel response (P = .001) (Figure 3, Left). However, the molecular patterns of the residual cancers after 3 months of docetaxel treatment were strikingly similar, independent of initial sensitivity or resistance. This relative genetic homogeneity after treatment was observed in both LCM and non-LCM surgical specimens. The presence of residual tumor after treatment in initially sensitive tumors indicates selection of a residual, resistant subpopulation of cells. The gene expression pattern was populated by genes involved in cell cycle arrest at G2/M (e.g., mitotic cyclins and cdc2) as well as genes later proposed as markers of breast cancer “stem cells”. Of importance, the taxotere sensitivity signature did not predict response to doxorubicin/cyclophosphamide (AC) (Figure 3, Right) [2]. Thus, sensitivity to AC likely depends on a different set of genes, which are currently under analysis.

Figure 4. Effect of chemotherapy on the mean percentage of cells that express high levels of CD44 and low levels of CD24 (CD44+/CD24-/low) as well as mammosphere formation efficiency among HER2-negative patients before, during, and after treatment. [Panels: A, CD44+/CD24- (n=31); B, Mammosphere Formation Efficiency, No. of MS/10,000 cells (n=31). Time points: Initial, Week 3, Week 12. Legend: Observed, Predictive.] Circles represent observed values. Predicted values (dashed lines) and their 95% confidence intervals (CIs; thin error bars) were estimated by linear mixed-effects models. Error bars on circles represent 95% CIs (two SEMs) of experiments at baseline and each time point of follow-up. A) Percentage of tumorigenic cells increased at week 3 (P < .001, model-based contrast) and remained high at surgery (week 12) (P < .001, model-based contrast). Statistical tests were two-sided. B) Effect of chemotherapy on mean mammosphere (MS)-forming efficiency before, during, and after treatment. All patients, P < .001, model-based contrast.

Intrinsic chemo-resistance of tumor-initiating sub-populations of breast cancer cells.
Based on data mining of the taxotere and AC response data, which suggested enrichment of cells expressing putative stem cell markers after conventional treatments, we recently demonstrated in clinical samples that tumorigenic breast cancer cells are enriched in residual tumors after chemotherapy, but not after lapatinib treatment. In matched human breast cancer biopsies (n = 31 pairs), the relative proportion of CD44+/CD24-/low cells (previously characterized as enriched for tumor-initiating cells [3]) increased from a baseline mean of 4.7% to 13.6% after 12 weeks of chemotherapy (p<0.0001), indicating enrichment of chemotherapy-resistant, potentially tumorigenic breast cancer cells (Figure 4A). Consistent with this increase, mean mammosphere-forming efficiency (MSFE) was significantly increased after chemotherapy in matched pre- and post-chemotherapy samples (p<0.001) (Figure 4B), indicating an enrichment for cells capable of anchorage-independent growth. Unlike chemotherapy, lapatinib treatment of HER2+ breast cancers did not increase the proportion of CD44+/CD24-/low breast cancer cells, but led to a statistically non-significant decrease in matched biopsies from a baseline mean of 10.6% to 7.4% (p=0.1) after 6 weeks of lapatinib (Figure 5A). Also unlike chemotherapy, lapatinib treatment did not increase MSFE, which instead showed a non-significant decrease (Figure 5B). Consistent with the effect of lapatinib on tumorigenic cells, use of lapatinib to augment conventional therapy increased the pathological complete response rate by 3- to 4-fold compared to the published rate with conventional therapy alone. Thus, lapatinib appears to target the TIC population as efficiently as the bulk of the tumor. If true, lapatinib would represent the first characterized “stem cell” targeted therapy against HER2+ breast cancer.
Figure 5. Effect of lapatinib on mean percent cells that express high amounts of CD44 and low amounts of CD24 (CD44+/CD24-/low) and mammosphere (MS)-forming efficiency (MSFE) before, during, and after treatment. Circles represent observed values. Predicted values (dashed lines) and their 95% confidence intervals (CIs; thin error bars) were estimated by linear mixed-effects models. Error bars on circles represent 95% confidence intervals (two SEMs) of experiments at baseline and each time point of follow-up. A) Percentage of CD44+/CD24-/low cells in samples from biopsy cores. B) MSFE of samples from biopsy cores.

Gene expression analysis of enriched TIC populations in mouse mammary tumor models. Using the collection of stably transplantable mouse tumors derived from p53 null mammary epithelium carried in the epithelium-free mammary fat pad of host mice, the Rosen laboratory (co-investigator) has demonstrated that TIC in these tumors are characterized by expression of both the CD29 and CD24 cell surface antigens in FACS analysis. Cells that are positive for both CD29 and CD24 (CD29+;CD24+) possess tumor-initiating function, while the other three cell populations (CD29+;CD24-, CD29-;CD24+, or CD29-;CD24-) are greatly diminished in this ability. As few as 20 double positive cells are capable of forming new tumors upon transplantation, whereas several thousand cells of the other populations are required. Using FACS coupled with gene expression microarray analysis, the Rosen laboratory has generated a gene expression signature of p53 null tumor-initiating cells from three individual tumors (Figure 6, from [4]) and has identified 710 probesets differentially expressed in common among the three models used.
Similar to the lentivirus-based shRNA knockdown strategy described below for the human model, we are conducting a functional genomic analysis of these genes using lentiviral shRNA expression vectors to determine which of them are required for TIC function in p53 null tumors. Results of this functional screen can then be compared with those from the human model to identify commonalities and differences among the models.

Figure 6. Differentially expressed transcripts in tumor-initiating cells of p53 transplantable mammary tumors. A, Venn diagram of transcripts differentially expressed in Lin-CD29HCD24H compared with Lin-CD29HCD24L, Lin-CD29LCD24H, and Lin-CD29LCD24L subpopulations of p53-null transplantable mammary gland tumors (P < 0.01 for each comparison). B, the heat map of 710 differentially expressed transcripts in the tumorigenic cancer cell Lin-CD29HCD24H subpopulation. Each row represents a transcript; each column represents subpopulations from three tumors. The red color indicates a high level of expression, whereas blue indicates a low level of expression. The top five IPA-picked molecular and cellular functions in which those down-regulated and up-regulated genes are involved are indicated on the left (number of molecules).

Wnt signaling and radiation resistance of tumor-initiating cells. Based on the observation that stem/progenitor cells remain after breast cancer therapy and may give rise to recurrent disease, we (Rosen) hypothesized that progenitor cells are resistant to radiation, a component of conventional breast cancer therapy, and further that this resistance is mediated at least in part by Wnt signaling, which has been implicated in stem cell survival. To test this hypothesis, we investigated radioresistance by treating primary BALB/c mouse mammary epithelial cells with various clinically relevant doses of radiation. We observed enrichment in normal progenitor cells (stem cell
antigen 1-positive and side population progenitors) capable of clonal growth. Consistent with a role for canonical Wnt signaling, radiation selectively enriched for progenitors in mammary epithelial cells isolated from transgenic mice with activated Wnt/β-catenin signaling but not in background-matched controls, and irradiated stem cell antigen 1-positive cells had a selective increase in active β-catenin and survivin expression compared with stem cell antigen 1-negative cells.

Functional genomics screening technology. A unique advantage of our project is the involvement of Dr. Westbrook. Dr. Westbrook did his postdoctoral training at Harvard Medical School in the laboratory of Steve Elledge, Ph.D., where he developed (in collaboration with the Hannon lab) genome-wide retroviral RNA interference (RNAi) libraries (known as the Hannon-Elledge shRNA libraries). Using barcoding technology, these libraries enable rapid and functional interrogation of the genome in ways that cannot be achieved by current siRNA-based “well-by-well” approaches. Using these technologies, Dr. Westbrook devised a new strategy to identify human tumor suppressors systematically on a genome-wide scale [5]. This highly cited work identified both known and novel tumor suppressors, thus providing a new method for discovering this class of cancer genes. One result of this work was the discovery of REST as a tumor suppressor, which has opened up a new field of studying how neuronal regulators control breast cancer growth and survival, work that Dr. Westbrook has recently expanded [6] as an independent investigator at Baylor.

Real-time in vivo imaging capabilities. Another unique aspect of this project is the time-lapse and real-time in vivo imaging capabilities developed in the laboratory of Dr. Mary Dickinson. Dr. Dickinson has developed methods for imaging vascular development during embryonic stages using a combination of two-photon and advanced confocal imaging technologies. She has also developed a number of specialized image analysis methods for studying the vascularization process in 3D, methods that can be modified (if necessary) to model neovascularization of tumors as they develop in the mouse mammary fat pad. These methods should also allow us to localize fluorescently tagged TIC relative to the vasculature and track their behavior over time.

N3.2. Accomplishments in Computational Biology

We (Wong labs) have been working in computational biology and imaging bioinformatics for decades and are well established in the following areas related to this proposal: the PACASS database system, cellular and molecular high-content screening and data analysis, and computational biology for biomarker discovery and signaling pathway analysis.

N3.2.1. Image Bioinformatics Platforms Developed

D-CELLIQ – Dynamic cellular imaging quantitator

Bioimage Data Generation and Analysis: We successfully imaged HeLa, N-tert1 (telomerase-immortalized keratinocytes), and Hct116 cells on our time-lapse microscopy system. Cells were treated with drugs at different concentrations. Images are first captured as TIFF files using software developed by Compix, which also controls the microscope, stage, and camera. These TIFF files are then transmitted to our in-house D-CELLIQ 1.0 system for image processing and analysis. HeLa H2B-GFP cells were thawed six days before plating for each experiment and cultured in DMEM with 10% FBS. The details of the protocols can be found in [7, 8].

Figure 7. D-CELLIQ 1.0 graphical user interface.
Figure 8. The representative lineage tree structures of cell division with time-lapse microscopy analyzed with D-CELLIQ: asymmetric (left panel) vs. symmetric division (right panel). The numbers in green nodes (representing the time point when cell division is captured) or images are the number of nuclei in the corresponding imaging frames acquired at different time intervals after starting imaging as indicated by the black numbers. The upper row of images is captured 15 minutes before cell division is identified (middle row of images) to illustrate the metaphase of dividing cells. The bottom row of images shows the tracked cells at the end of imaging (50 hours). Cell #21 (left panel) appears to go through an asymmetric division while cell #28 (right panel) goes through a symmetric division.

D-CELLIQ 1.0 consists of eight core modules: image acquisition, nuclei detection, nuclei segmentation, nuclei tracking, feature extraction, cell phase identification, statistical analysis of cell cycle behaviors, and output of all information into a database. D-CELLIQ 1.0 has been publicly available since early 2008; see http://dcelliq.cbi-platform.net. Here we describe our major achievements. Since cells often cluster together, the detection of cell nuclei prior to segmentation is non-trivial. To circumvent this problem, D-CELLIQ 1.0 includes a new detection algorithm that comprises three steps: binarization, local maxima generation, and searching for local maxima [9, 10]. Highly accurate cell tracking is a challenging issue [11-13]. Over- or under-segmentation, fast movement of cells, and overcrowding of cells can all cause tracking errors. Parallel tracking is being studied and developed [14]. We developed two tracking algorithms, both of which achieve good performance [15, 16].
Figure 7 shows a snapshot of the D-CELLIQ 1.0 interface, while Figure 8 shows the automated tracking sequence with symmetric and asymmetric division.

Feature extraction and cell-cycle phase identification: After the segmented nuclei are obtained, feature vectors are generated in D-CELLIQ 1.0 to represent the cells. Each feature vector contains 211 features: 11 general image features, such as shape, size, and intensity (including max intensity, min intensity, deviation of gray level, average intensity), length of long axis, length of short axis, long axis/short axis, area, and perimeter [13]; 14 Haralick co-occurrence textural features [17]; 47 Zernike moment features [18]; 85 features generated by Gabor transformation [11, 19]; and 54 shape features. To remove irrelevant features and improve the performance of the learning system, a prediction risk-based feature selection method was employed in our previous study to choose sub-optimal feature sets [20, 21]. This method adopts an embedded feature selection criterion, prediction risk, that evaluates each feature by calculating the change in prediction performance when that feature is replaced by its average value. It has several advantages; for example, as an embedded method, its selection criterion depends on the learning machine itself. Fifty-eight features are kept for cell phase identification: 37 Gabor features, 1 geometric feature, 14 moment features, 2 texture features, and 4 shape features. The geometric feature used is "perimeter." Gabor features can describe the nuclei in both the spatial and frequency domains, so many Gabor features are kept.

Modeling of cell-cycle mitotic phases: Different methods of feature extraction, feature selection, and classification for phase identification are investigated in D-CELLIQ 1.0, including new methods such as the context-based hidden Markov model (CBHMM) [7] and the on-line support vector machine (OSVM) [8].
An Online Support Vector Classifier (OSVC) algorithm was developed in D-CELLIQ 1.0 to remove support vectors from the old model and assign the new training examples different weights according to their importance. In addition, phase identification in D-CELLIQ 1.0 takes into consideration the temporal-spatial patterns of nuclei dynamics. Each time region contains prophase, metaphase, and anaphase as a temporal-spatial pattern that is reflected by the phase sequence (temporal) and the morphologic appearance of the nuclei (spatial). A dynamic programming algorithm was developed to identify the phase patterns that best satisfy the prophase, metaphase, and anaphase ordering. More results can be found in [8, 13, 22, 23].

G-CELLIQ – whole genome RNAi cellular imaging quantitator

G-CELLIQ enables processing of the large volumes of digital images generated by high-throughput RNAi screening applications and reduces the time required to process the images from months of manual analysis to hours or minutes on a computer. Current capabilities of G-CELLIQ include: (1) an integrated cell image processing pipeline; (2) handling of images from three different channels; (3) automatic two-step cell segmentation, with nuclear segmentation supplying the seed region for cell body segmentation; (4) morphological feature extraction describing each cell using 211 features from 5 categories; (5) feature selection using SVM (Support Vector Machine)-RFE, genetic algorithms, and unsupervised methods like k-nearest neighbors; (6) online phenotype discovery using mixture models and gap statistics; (7) cell classification using SVM and graphical visualization; (8) scoring of cell groups based on the output of different classifiers; and (9) gene function analysis using cluster analysis. One user interface of this system is shown in Figure 9.
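The ordering constraint used in D-CELLIQ phase identification, described above, can be illustrated with a small dynamic-programming sketch. This is not the published D-CELLIQ algorithm [8, 13]; it is a minimal example under stated assumptions: a classifier has already produced per-frame phase probabilities, a nucleus may start in any phase, and from one frame to the next it either stays in its phase or advances by exactly one. The names `PHASES` and `best_ordered_phases` are hypothetical.

```python
import math

# Hypothetical phase order; a nucleus may stay in a phase or advance by one.
PHASES = ["interphase", "prophase", "metaphase", "anaphase"]


def best_ordered_phases(frame_probs):
    """Return the most likely phase sequence that never moves backwards.

    frame_probs: one list per frame, holding a probability for each
    entry of PHASES (for example, classifier outputs for one nucleus).
    """
    eps = 1e-12  # avoids log(0) for phases with zero probability
    n_phases = len(PHASES)
    # score[k]: best log-probability of a valid labeling ending in phase k
    score = [math.log(p + eps) for p in frame_probs[0]]
    back = []  # back[t][k]: best predecessor phase for phase k at frame t+1
    for probs in frame_probs[1:]:
        new_score, pointers = [], []
        for k in range(n_phases):
            candidates = [k] if k == 0 else [k - 1, k]
            prev = max(candidates, key=lambda j: score[j])
            new_score.append(score[prev] + math.log(probs[k] + eps))
            pointers.append(prev)
        score = new_score
        back.append(pointers)
    # Trace the best path back from the best final phase.
    k = max(range(n_phases), key=lambda j: score[j])
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return [PHASES[k] for k in path]


# Frame 3 looks like prophase on its own, which would break the ordering.
frames = [
    [0.9, 0.1, 0.0, 0.0],
    [0.2, 0.7, 0.1, 0.0],
    [0.1, 0.2, 0.6, 0.1],
    [0.1, 0.5, 0.4, 0.0],
    [0.0, 0.1, 0.2, 0.7],
]
print(best_ordered_phases(frames))
```

In this toy input a frame-by-frame argmax would label frame 3 as prophase after metaphase; the ordering constraint instead assigns it to metaphase, which is the kind of correction a sequence-level criterion is meant to provide.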
Computational architecture of G-CELLIQ: The computational functions of G-CELLIQ are organized into: 1) image processing and cell morphology quantification; 2) phenotype modeling and cell classification; and 3) image scoring and gene function annotation. The outputs of modules 1) and 3) comprise the database of quantified cell morphology and the image/gene function score profiles, respectively.

Figure 9. A simple graphical user interface of G-CELLIQ.

Integrated image processing pipeline: An efficient, compatible, and robust image processing pipeline has been constructed. Running on an Intel Core 2 PC (2 GHz, 2 GB RAM), it can handle the large volume of raw images from a 384-well plate (5,384 images, ~100 cells each) within a day and generate a quantified morphology profile for each cell (~half a million cells in total). The workflow includes a two-step cell body segmentation scheme utilizing adaptive thresholding [24, 25], an over-segmentation correction method based on feedback systems [24], and an automatic image quality control system. A group of 211 morphological features [25] is defined and extracted to quantify each cell segment. This workflow is applicable to various HCS datasets after tuning a small set of parameters.

Phenotype identification, cell classification, and gene function scoring: An original online phenotype discovery method [26] is used to identify and validate novel phenotypes and finally form a stable phenotype panel for each experiment. A subset of features is selected using SVM-RFE to differentiate each phenotype from the others, and a series of SVM classifiers is used to classify each cell segment into different phenotypes [25]. The probability outputs of the SVM classifiers are recorded as each cell segment’s similarity to the different phenotypes.
Combining the scoring schemes in [25, 27], the scores for each cell are in turn summarized as morphology scores for each image, each well (16 images per well), and each RNAi treatment condition (TC, each represented by 2-8 wells), and finally consolidated across biological replicate TCs to form a functional score for a single gene (each represented by 2-9 biological replicate TCs).

User interface and functional validation: The available functions are integrated into the GUI shown in Figure 9 to form the present G-CELLIQ workflow. This workflow was applied to an RNAi HCS on part of the KP (kinase-phosphatase) set, aimed at building the signaling network that regulates cell shape change. Datasets from sixteen 384-well plates were analyzed in about six months, and hierarchical clustering analysis was carried out based on the functional scores for each gene involved. The clustering results, shown with some typical images from different clusters in Figure 10, recovered some well-known functional sub-groups and also indicate the roles of previously uncharacterized genes. Compared with the data analysis scheme in a similar screen [27], G-CELLIQ shows its efficiency and promise for the analysis of RNAi HCS.

We also developed two other software packages. One is NeuronIQ – Neuron image quantitator. The software is freely available from http://www.methodisthealth.coTYPE and can accurately detect the central line of dendritic backbones and spines in noisy in vitro and in vivo 3D neuron image data using a curvilinear structure detector. The other is N-CELLIQ – Neuron and cellular image quantitator. The software aims to quantitate and interpret automated fluorescence microscopy images accurately and automatically, in particular for the labeling and measurement of neurites in 2D and 3D space. The 2-D version of N-CELLIQ has been released to the public on our website at http://www.cbi-tmhs.org/software.html.

Figure 10.
Hierarchical clustering results based on gene function scores on KP set.

N3.2.2. Accomplishments in Computational Genomics and Proteomics for biomarker discovery and signaling pathways analysis

We have extensive experience in genomics data analysis [28, 29]. As an example, for one project studying variations of DNA and RNA in cancer patients, we developed several new models for SNP analysis [30-32], eQTL mapping [33], and microRNA regulation [34]. In SNP analyses, we developed a novel LOH inference and segmentation algorithm based on the conditional random pattern (CRP) model, which explicitly considers the distance between two neighboring SNPs, the genotyping error rate, and the heterozygous rate [30]. For a particular disease, myelodysplastic syndromes (MDS), we employed a novel Constraint Moving Average (CMA) algorithm to detect the Copy Number Aberration (CNA) regions of sorted marrow SNP array samples. Two independent applications build on the results of the CMA algorithm. The first is a power analysis to determine the minimum sample size required in the experimental design. The other is a General Variant Level (GVL) score for discriminating between patients with high- and low-grade MDS [31, 32]. In the analyses of eQTL mapping, we constructed the MDS disease gene network by integrating expression quantitative trait loci (eQTL) mapping and human interactome data [33]. In microRNA regulation analyses, we developed a network inference model called significance analysis of microRNA-mRNA targeting (SAMiMT). In this model, the microRNA:gene binding probability is evaluated based on expression and sequence data, followed by estimation of the gene collaboration probability based on transcriptional abundance and the microRNA binding compositions of genes [34].
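To make the idea of moving-average detection of copy number aberrations concrete, the following sketch smooths per-probe log2 ratios with a plain centered moving average and reports contiguous runs that cross a threshold. It is not the CMA algorithm of [31, 32], whose constraints are not described in this section; the window size, the threshold, and the function names `moving_average` and `call_regions` are assumptions chosen only for illustration.

```python
def moving_average(values, window):
    """Centered moving average; the window shrinks at the array ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def call_regions(log_ratios, window=5, threshold=0.3):
    """Return (start, end, kind) for runs of probes called gain or loss.

    start and end are inclusive probe indices; window and threshold are
    illustrative values, not calibrated settings.
    """
    smoothed = moving_average(log_ratios, window)
    regions = []
    start, kind = None, None
    for i, value in enumerate(smoothed):
        if value > threshold:
            label = "gain"
        elif value < -threshold:
            label = "loss"
        else:
            label = None
        if label != kind:
            # Close the run that just ended, then open a new one.
            if kind is not None:
                regions.append((start, i - 1, kind))
            start, kind = i, label
    if kind is not None:
        regions.append((start, len(smoothed) - 1, kind))
    return regions


# Synthetic example: 30 probes with an amplified block at probes 10-19.
ratios = [0.0] * 10 + [0.8] * 10 + [0.0] * 10
print(call_regions(ratios))
```

Because smoothing spreads signal across neighboring probes, the called region can extend slightly beyond the true block; any constraints a production method adds would need to address this boundary effect.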
At TMH, proteomics research focuses on the discovery of protein-level biomarkers for conditions including stroke and major adverse cardiac events (MACE). At BCM, work is currently focused on kinase activation in cancer cells and blood-borne biomarkers for cancer detection, and is operationally similar to work ongoing at TMH. For stroke studies [35], we proposed a novel automatic peak detection method for stroke mass spectrometry (MS) data. In this method, a mixture model is designed to model the spectrum of mass spectrometry proteomic data. A Bayesian approach is used to estimate the parameters of the mixture model, and a Markov chain Monte Carlo method is employed to perform Bayesian inference. Another detection approach for MALDI mass spectrometry studies was also developed using time-frequency analysis [36]. For MACE studies, we described one method for biomarker panel discovery [37] and another for network biomarker discovery [10]. Conventional biomarker discovery focuses mostly on the identification of single markers and thus often has limited success in disease diagnosis and prognosis. The first method identifies an optimized protein biomarker panel based on MS studies for predicting the risk of MACE in patients [37]. We presented a new variant of the genetic algorithm (GA) that embeds the recursive local floating enhancement technique to discover a panel of protein biomarkers with far better prognostic value for prediction of MACE than existing methods, including the one recently approved by the FDA (Food and Drug Administration). The second method identifies network biomarkers based on protein-protein interactions to distinguish MACE patients from control patients [10]. We first built a cardiovascular-related network based on protein information from protein annotations in Uniprot, protein-protein interaction (PPI) data, and signal transduction databases.
Aided by this cardiovascular-related protein-protein interaction network, we proposed a new type of biomarker, the network biomarker, composed of a set of proteins and the interactions among them. The candidate network biomarkers can classify the two groups of patients more accurately than current single-protein biomarkers, which do not consider biological molecular interactions.

N3.2.3. Accomplishments in Cancer Cell Invasion Modeling and Modeling of Drug Responsiveness

Tumor growth and invasion: Cancer is the second most common cause of death in the United States, exceeded only by heart disease, according to the American Cancer Society [38]. Mathematical modeling and computer simulation are tools that can provide a robust framework for better understanding cancer progression. Cristini and colleagues develop and apply multi-scale, predictive, computational models of tumor growth and invasion founded upon well-established principles of physics, mathematics, and cancer biology, using state-of-the-art numerical techniques (see refs. [39-45] and references therein). The in silico parameter values that govern the predictive cancer simulators are set according to in vitro and in vivo experimental evidence [41, 42, 46, 47]. This integrative framework allows us to form and test hypotheses that drive experimental investigation, which in turn provides data to refine our mathematical models. Cristini (co-PI on this project) and colleagues were among the first to advance modeling of complex tumor morphologies beyond the limited capabilities of mathematical linear analysis and into the realm of nonlinear computer simulation [40].
A biologically founded, multi-scale mathematical model was developed in [46, 47] to identify and quantify tumor biologic and molecular properties relating to clinical and morphological phenotype and to demonstrate that tumor growth and invasion are predictable processes governed by biophysical laws and regulated by heterogeneity in phenotypic, genotypic, and micro-environmental parameters. In this model, the behavior of cancer cells and their surroundings is linked to tumor growth, shape, and treatment response. In [39], they developed, analyzed, and numerically simulated a thermodynamically consistent mixture model for avascular tumor growth. The mixture model takes into account the effects of cell-to-cell adhesion, chemotaxis, haptotaxis, and transport of important molecular species (e.g., oxygen and chemotherapeutic drug compounds). In other recent work, Cristini and colleagues advanced the state of the art in linking tumor growth, nutrient transport, hypoxia, release of angiogenesis-regulating factors (e.g., VEGF-A), and tumor-induced angiogenesis, along with tumor-induced biomechanical changes in the microenvironment and neo-vasculature that affected subsequent tumor-host-vasculature dynamics [43]. Cristini and colleagues also presented a two-part series on multispecies nonlinear tumor growth in which they developed, analyzed, and simulated a diffuse interface continuum model of multispecies tumor growth and tumor-induced angiogenesis in two and three dimensions [45, 48]. In the first part [48], they presented simulations of unstable avascular tumor growth in two and three dimensions and demonstrated that their techniques make large-scale three-dimensional simulations of tumors with complex morphologies computationally feasible.
In the second part [45], they investigated multispecies tumor invasion and tumor-induced angiogenesis, examined the effect of variable cell-cell adhesion due to changes in cell phenotype and microenvironmental conditions (e.g., oxygen levels), and focused on the morphological instabilities that may underlie invasive cellular phenotypes.

Modeling for prediction of tumor drug response: The heterogeneity and three-dimensionality of the tumor microenvironment present a challenge to drug assessment, both during development and in the clinic, consequently hindering development of effective therapies. However, a multiscale computer simulator founded on the integration of experimental data and mathematical models can provide valuable insights into these processes and establish a technology platform for analyzing the effectiveness of drug treatments, with the potential to screen drug candidates cost-effectively and efficiently during the drug-development process. Cristini and colleagues have done groundbreaking work on these issues (see refs. [42, 46, 49]). In [49], Cristini and co-workers describe their integrative approach to developing higher-order bio-computational models with the capacity to predict in vivo tumor growth and response to therapy. In [42], they implement an extensive multi-compartment pharmacokinetics/pharmacodynamics model, whose parameters are calibrated with published experimental data, to investigate the pharmacokinetics and effect of doxorubicin and cisplatin in vascularized tumors. This enables a comparison of the tissue- and cell-level drug dynamics of the two drugs and facilitates the generation of hypotheses to explain their in vivo characteristics. Indeed, this methodology could, with additional development, be applied to both established and nascent drugs to refine clinical trials and assist in clinical therapeutic strategy to improve patient comfort and survival.
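As a rough illustration of what a compartment-based pharmacokinetic/pharmacodynamic calculation involves, the following minimal Python sketch integrates a linear two-compartment (plasma and tumor tissue) model and converts tissue exposure into a surviving cell fraction. It is not the model of [42]: the two-compartment structure, the rate constants `k_pt`, `k_tp`, and `k_el`, the exponential kill law, and every numerical value are hypothetical placeholders chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class PKParams:
    """Hypothetical first-order rate constants (1/h); not calibrated values."""
    k_pt: float  # plasma -> tissue transfer
    k_tp: float  # tissue -> plasma transfer
    k_el: float  # elimination from plasma


def simulate(params, dose, hours, dt=0.01):
    """Forward-Euler integration of a linear two-compartment model.

    Returns a list of (time, plasma_amount, tissue_amount) tuples,
    starting with the full dose in the plasma compartment.
    """
    plasma, tissue = dose, 0.0
    trajectory = [(0.0, plasma, tissue)]
    for step in range(1, int(round(hours / dt)) + 1):
        # All fluxes are evaluated from the state at the start of the step.
        to_tissue = params.k_pt * plasma
        to_plasma = params.k_tp * tissue
        eliminated = params.k_el * plasma
        plasma += dt * (to_plasma - to_tissue - eliminated)
        tissue += dt * (to_tissue - to_plasma)
        trajectory.append((step * dt, plasma, tissue))
    return trajectory


def surviving_fraction(trajectory, kill_rate):
    """Assumed pharmacodynamic law: survival = exp(-kill_rate * tissue AUC)."""
    auc = sum(
        0.5 * (a[2] + b[2]) * (b[0] - a[0])  # trapezoid rule on tissue amount
        for a, b in zip(trajectory, trajectory[1:])
    )
    return math.exp(-kill_rate * auc)


if __name__ == "__main__":
    params = PKParams(k_pt=0.5, k_tp=0.2, k_el=0.3)  # illustrative values
    run = simulate(params, dose=100.0, hours=10.0)
    print("final plasma, tissue:", run[-1][1:])
    print("surviving fraction:", surviving_fraction(run, kill_rate=0.001))
```

Comparing two drugs in this toy setting would amount to running `simulate` with two different parameter sets and comparing the resulting tissue exposure; a calibrated model would add many more compartments and experimentally derived parameters.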
The multi-scale mathematical drug response model was recently implemented in [46] to successfully predict the effects of doxorubicin on breast tumor growth in human MCF-7 cell lines. This model hypothesizes specific functional relationships linking tumor growth and regression to the underlying phenotype, incorporates the effects of local drug, oxygen, and nutrient concentrations within the three-dimensional tumor volume, and includes the experimentally observed resistant phenotypes of individual cells to determine whether a prescribed drug will reach the tumor in sufficient quantities to kill the malignant cells. This integrative method, tightly coupling computational modeling with biological data, enhances the value of knowledge gained from current pharmacokinetic measurements and augments efforts to predict drug response. Further, such an approach could predict resistance based on specific tumor properties and thus improve treatment outcome. N.4. Research Program (Integrated research effort of Components 1 and 2) In this proposal, we seek to use newly developed experimental and imaging methodologies to identify, localize, and purify tumor-initiating cells (TIC). This will then allow us to identify and image TIC in vivo, and to model TIC behavior during tumor development with respect not only to spatial localization and movement, but also with respect to specific changes in gene expression and cellular signaling. Combined functional genomics and data mining strategies will allow us to characterize novel growth regulators. Further, our combined experimental and systems biology approach will guide these biological experiments and allow us to evaluate responses to experimental therapeutics that may inhibit or kill TIC specifically in a manner not possible before. 
Aside from a wealth of basic biological insight, future extensions of this work may allow drug repositioning as well as development of directed, mechanism-based and “stem cell”-centric drug screening and evaluation methods. Although we introduce the two components separately in Sections N4.1 and N4.2, the specific aims of the two components are closely integrated, as indicated in Figure 1.

N4.1 Component 1 – Experimental System Biology

Title: Functional analysis of tumor-initiating cells in cancer development and treatment response.

Project Leaders (Core PIs): Michael Lewis, M.D., Ph.D., Jeffrey Rosen, Ph.D.

Project Summary: A major obstacle to effective treatment of breast cancer is that breast cancer is not a single disease but a collection of similar diseases, each represented by a specific breast cancer subtype. Systemic therapies such as chemo- or radiation therapy are effective initially in controlling and reversing tumor growth. However, residual cancers will invariably re-grow despite this initial response. While there have been several advances in the treatment of breast cancer in the last two decades, notably targeted therapy for breast cancers expressing estrogen receptor (ER+) or the HER2 (ErbB2) oncogene, breast cancer survivorship has improved only modestly. Unfortunately, for women with “triple negative” breast cancers (lacking expression of ER, progesterone receptor (PR), and HER2) we currently have no targeted therapies. Our recent clinical data, as well as experimental evidence in both mouse mammary tumors and human breast cancer xenograft models, support the existence of a subpopulation of cancer cells present in the original tumor that are greatly enriched in residual cancers after conventional systemic therapies.
These residual cancer cells are characterized by their intrinsic resistance to chemotherapy and relative growth quiescence. However, a discrete subset of these residual cells possesses enhanced self-renewal capacity, as well as the ability to form tumors upon transplantation. These residual tumor-initiating cells (TIC) (a.k.a. cancer stem cells (CSC)) may therefore be responsible for tumor growth, maintenance, resistance to treatment, and disease relapse. If this hypothesis is correct, the failure of traditional systemic therapies, such as radiation and chemotherapy, to cure breast cancer may be due to the fact that they incorrectly target the highly proliferative cells, while allowing survival of treatment-refractory tumor-initiating “cancer stem” cells. These findings modify our conceptual approach to oncogenesis and have dramatic implications for breast cancer prevention, treatment, and drug development. In this proposal, we seek to build upon, and significantly extend, ongoing laboratory and clinical studies using newly developed experimental and imaging methodologies to identify, localize, purify, and characterize TIC. This will then allow us to identify and image TIC in vivo, and to model TIC behavior during tumor development mathematically with respect not only to spatial localization and movement, but also with respect to proliferation, apoptosis, and specific changes in gene expression and cellular signaling. Combined functional genomics and data mining strategies will allow us to characterize novel growth regulators. Further, our combined experimental and systems biology approach will allow us to evaluate responses to experimental therapeutics that may inhibit or kill TIC specifically in a manner not possible before. Aside from a wealth of basic biological insight, future extensions of this work may allow drug repositioning as well as development of directed, mechanism-based and “stem cell”-centric drug screening and evaluation methods.
Key Personnel (listing percent of effort of each individual): Michael T. Lewis, Ph.D. (Core PI) 20% effort Component 1, 5% Education/Training; Jeffrey M. Rosen, Ph.D. (Core PI) 10%; Jenny C. Chang, M.D. (Co-investigator) 5%; Thomas Westbrook, Ph.D. (Co-investigator) 10%; Susan Hilsenbeck, Ph.D. (Co-investigator) 5%; Chad Shaw, Ph.D. (Co-investigator) 20%; Dean P. Edwards, Ph.D. (Co-investigator) 10%; Mary Dickinson, Ph.D. (Co-investigator) 5%; Shixia Huang, Ph.D. (Co-investigator) 10%; Alejandro Contrerras, M.D. (Co-investigator) 10%; Melissa Landis, Ph.D. (Research Assoc.) 25%; Lacey Dobrelecki (Research Assoc.) 25%; Bhuvanesh Dave, Ph.D. (Postdoc) 80%; Jason Herskowitz, Ph.D. (Postdoc) 100%; Mei Zhang, Ph.D. (Postdoc) 100%; Tegy Vadakkan, Ph.D. (Postdoc) 20%; Shirley Small (Research Tech.) 25%; Wei Wei (Graduate student) 100%; Kristen Meerbry (Graduate student) 50%.

N.4.1.A Specific Aims

Component 1 is guided by the basic hypothesis that TIC represent a unique sub-population of cells within a tumor possessing properties of self-renewal and the ability to give rise to the characteristic cell types present within a given tumor. Because of their unique abilities, we hypothesize further that TIC are localized and function within a spatially and molecularly regulated microenvironment (mE) (a.k.a. niche). To identify, localize, and functionally interrogate TIC in vivo in sufficient detail to allow mathematical modeling of their behaviors and responses to genetic and pharmacological manipulation in Component 2, the Specific Aims of Component 1 are:

Aim 1.1: To identify tumor-initiating cells (cancer stem cells) using newly developed lentiviral fluorescent signaling reporters, and to characterize their spatial distribution and behaviors during tumor growth using in vivo imaging.
Based on our current knowledge of TIC regulation by signaling networks, including Wnt, Notch, and Hedgehog, we propose to use a novel set of lentiviral fluorescent signaling reporter vectors to identify, localize, and purify TIC from both mouse and human mammary tumors based on the activities of these and other pathways in the TIC themselves. In addition to static histological preparations, individual stem cells can be tracked in live animals using a combination of high-resolution confocal microscopy and two-photon video imaging methods. Thus, the location and movement of TIC can be monitored over time at different phases of tumor development. These analyses should be informative about interactions between TIC and their local environment, including proximity to blood vessels and ECM, and interactions with stromal cell types, such as macrophages, neutrophils, and fibroblasts. These data will be used to develop and validate the mathematical model of the TIC microenvironment (mE) discussed in Specific Aim 2.1 of Component 2.

Aim 1.2: To identify candidate genes and pathways that may regulate TIC behaviors, e.g., self-renewal, differentiation, and metastasis. Using the fluorescent signaling reporter vectors used or developed in Aim 1.1, as well as known cell surface and enzymatic markers (e.g., CD44, CD24, and ALDH1), we will purify (or highly enrich) TIC populations away from other non-tumorigenic cell types using Fluorescence Activated Cell Sorting (FACS). Microarray (Affymetrix) and proteomic (antibody arrays, high-throughput immunofluorescence imaging) analyses will then be used to obtain gene expression data for each different cell population. Data will be analyzed using advanced bioinformatics methods (Component 2) to discover molecular pathways active in TIC and niche cell types.
These data describing the relationship between signal pathways and cellular identities will be used to refine the TIC mE model and the predictive models of Aim 2.1 and Aim 2.2 in Component 2, respectively. The genes identified/predicted will then be tested functionally in Aim 1.3.

Aim 1.3: To conduct a “Directed Iterative Functional Genomic Screen” (DIFGS) to functionally characterize genes that either decrease or increase tumor-initiating capacity. Using a TIC gene expression signature defined previously using the CD44 and CD24 cell surface markers on human clinical samples, we recently completed an initial functional genomics screen of 1,290 lentiviral shRNA constructs targeting ~500 genes. This screen identified 101 genes regulating mammosphere formation (a surrogate in vitro assay for TIC and normal stem/progenitor cell function). A similar study is underway using a gene expression signature derived from TIC in mouse p53-null tumor models. We propose to extend these screens in a directed, iterative manner, making use of advanced bioinformatic approaches (Component 2) to define a new candidate target list by using the 101 genes as input to identify known/suspected interacting proteins, immediate upstream regulators, and downstream targets. Additional unknowns from microarray data will also be tested whenever possible (up to about 500 genes can be screened at one time). These new candidates will be tested functionally using mammosphere-formation assays to identify only those genes regulating mammosphere-forming efficiency (MSFE), and the process will be repeated for five iterations per species (~2500 genes per species), or until all bioinformatics-defined interactions are exhausted. Human and mouse gene lists can then be mined for overlapping and unique gene sets and tested in vivo in Specific Aim 1.4, described next.
These data will be analyzed through bioinformatics methods described in Specific Aim 2.3 of Component 2, and the results can be used to validate the refined TIC mE model in Aim 2.2.

Aim 1.4: To define the cellular responses of TIC to genetic and pharmacological manipulation of genes regulating TIC survival or function in vivo. Once key molecules are identified as functionally important in Aim 1.3, and the integrated molecular and cellular model is built in Component 2, the response to genetic/pharmacological manipulation of molecules in the model will be predicted, tested, and used to refine the model. Based on the premise that TIC must be targeted specifically for development of effective treatment or prevention of breast cancer, discovery of drugs that kill TIC specifically, or block their function, will be critically important. Our ongoing work investigating inhibitors of normal stem cell self-renewal (including inhibitors of Notch, Hedgehog, and the PI3K/Akt axis) suggests that these agents function at the level of the TIC since they reduce the frequency of self-renewing cells, but typically do not alter tumor volume significantly unless combined with cytotoxic systemic therapies. We expect that a subset of the lentiviral shRNA constructs affecting TIC behavior in mammosphere (MS) assays will have similar activity against TIC function in vivo. We will use a novel collection of mouse mammary tumors and low-passage transplantable human xenografts to study the effects of genetic (constitutive or doxycycline-inducible lentiviral shRNA expression vectors) and candidate pharmacological TIC inhibitors (currently in use, or suggested from analyses of Component 2) on TIC behavior and frequency in vivo.
Moreover, the combinatorial effects of shRNA knockdown or experimental therapeutics with conventional chemotherapies will be examined with the goal of finding more effective cancer treatments for individual breast cancer subtypes. These data will again be used for the development of the drug-integrated model and for its validation in Aim 2.4 of Component 2.

N.4.1.B Background and Significance

Systemic therapies such as chemo- or radiation therapy are effective initially in controlling and reversing tumor growth. However, residual cancers will invariably re-grow despite this initial response. While there have been several advances in the treatment of breast cancer in the last two decades, notably targeted therapy for breast cancers expressing estrogen receptor (ER+) or the HER2 (ErbB2) oncogene, breast cancer survivorship has improved only modestly. In particular, for women with “triple negative” breast cancers there are currently no targeted therapies. Our recent clinical data [3], as well as experimental evidence in both mouse and human xenograft models [4, 50], support the existence of a subpopulation of cancer cells present in the original tumor that are greatly enriched in residual cancers after conventional systemic therapies. These residual cancer cells are characterized by their intrinsic resistance to chemotherapy and relative growth quiescence. However, a discrete subset of these residual cells possesses enhanced self-renewal capacity, as well as the ability to form tumors upon transplantation. These residual tumor-initiating cells (TIC) may therefore be responsible for tumor growth, maintenance, resistance to treatment, and disease relapse. If this hypothesis is correct, the failure of traditional systemic therapies, such as radiation and chemotherapy, to cure breast cancer may be due to the fact that they incorrectly target the highly proliferative cells, while allowing survival of treatment-refractory tumor-initiating “cancer stem” cells.
These findings modify our conceptual approach to oncogenesis and have dramatic implications for breast cancer prevention, treatment, and drug development. Although the existence of tumor-initiating “cancer stem cells” is gradually being accepted, the identification and purification of tumor-initiating cells is still a great challenge due to the lack of specific markers that identify such cells uniquely. Current methods rely on cell surface markers (e.g., CD44+/CD24low/- (human) or CD29High/CD24High (mouse)) or fluorogenic enzyme substrates (e.g., the aldehyde dehydrogenase (ALDH) substrate Aldefluor; human only), coupled with fluorescence-activated cell sorting or magnetic bead separation techniques. Unfortunately, while these markers do allow enrichment of TIC, they do not allow purification to the degree necessary for detailed analysis of TIC gene expression or function. In an attempt to circumvent this problem, we are constructing a series of novel lentivirus-based fluorescent signaling reporter vectors, including reporters for known stem cell regulators such as Wnt, Notch, and Hedgehog. Crosstalk among these three pathways occurs frequently in normal development, and we suspect that this crosstalk is important for TIC biology. In a recent review, Hayward et al. suggest that Wnt and Notch (‘Wntch’) signaling are integrated such that rather than defining the fate of a cell, they determine the probability that a cell will adopt a particular fate [51]. There is extensive evidence that the Wnt pathway can induce the expression of Notch ligands as well as hedgehog signaling components; however, other interactions have also been reported, including antagonistic relationships. All three of these pathways have been shown to induce epithelial-to-mesenchymal transition, a process essential for normal development and implicated in cancer progression and metastasis, as well as in the acquisition of stem cell characteristics [52].
Based on the preliminary results presented below, we anticipate that these reporters will be more effective than current markers for purification of TIC, either alone or in combination. If results continue to be promising, these individual signaling reporters can be incorporated into a single lentiviral vector that will allow us to monitor the status of all three signaling networks in a single cell, and will also allow us to study the spatial location of TIC within tumors in real time using in vivo imaging. Finally, because TIC may show a unique signaling “signature” with respect to the status of these three signaling reporters, we should be able to purify TIC more effectively and evaluate their response to treatment using gene expression analysis methods such as microarrays and proteomics. In this proposal, we seek to use newly developed experimental and imaging methodologies to identify, localize, and purify TIC. This will then allow us to identify and image TIC in vivo, and to model TIC behavior during tumor development with respect not only to spatial localization and movement, but also with respect to specific changes in gene expression and cellular signaling. Combined functional genomics and data mining strategies will allow us to characterize novel growth regulators. Further, our combined experimental and systems biology approach will allow us to evaluate responses to experimental therapeutics that may inhibit or kill TIC specifically in a manner not possible before. Aside from a wealth of basic biological insight, future extensions of this work may allow drug repositioning as well as development of directed, mechanism-based and “stem cell”-centric drug screening and evaluation methods.
N4.1.C Preliminary Results

The first aim of Component 1 is to identify tumor-initiating cells (cancer stem cells) using newly developed lentiviral fluorescent signaling reporters and to characterize their spatial distribution and behaviors during tumor growth using in vivo microscopic imaging.

Characterization of canonical Wnt pathway using lentiviral transduction into primary tumor cells. The Wnt/β-catenin cascade has emerged as a key regulator of stem cell biology in multiple tissues, including the mammary gland [53]. In addition, data suggest that deregulated Wnt/β-catenin pathway activity leads to uncontrolled self-renewal of cancer cells and to resistance to radiation therapy [54]. These observations suggest that TICs might be characterized by increased canonical Wnt signaling. As a proof of principle in a p53 null mammary tumor, GFP-positive cells derived by transduction with the TOP-eGFP lentivirus (a canonical Wnt-pathway reporter) showed a marked enrichment for tumor-initiating ability. Thus far, tumors have not formed from GFP-negative cells. In agreement with these data, FACS analysis has demonstrated a marked overlap of this cell subpopulation with the TIC population characterized previously as CD29H/CD24H. Taking advantage of the eGFP fluorescence, we can also visualize the location of these breast TICs relative to tumor margins, blood vessels, and other stromal cell types, both in histological sections (Figure 11) and using in vivo confocal microscopy. Spatial localization of TIC has not been possible prior to these studies. Thus, this technique provides a unique opportunity to study the local microenvironment or TIC niche. We plan to extend these experiments initially to representatives of each of the p53 null mouse mammary tumor subtypes (luminal, basal-like, and claudin-low) to determine if canonical Wnt signaling is uniformly activated in each case, as well as to our set of novel human xenograft models representing basal, HER2+, and ER+ breast cancers.
Figure 11. TOP-eGFP expressing cells (TIC-enriched) (green arrow) are uniformly localized adjacent to blood vessels (outlined in white, vasculature stained red for Von Willebrand’s Factor).

Notch and Hedgehog signaling reporters are available. In addition to the Wnt signaling reporter, we have obtained Notch and Hh pathway reporters and are in the process of constructing and validating lentiviral vectors for their ability to identify and localize TIC. Once validated individually, we will construct a triple pathway reporter that can be used in different tumor models both from genetically engineered mouse models and human breast cancer xenografts.

p53-null mouse mammary cancer models are available. Loss- or gain-of-function phenotypes associated with mutations of p53 are observed in 20-30% of spontaneous human breast cancers [55], and TP53 mutation status showed strong association with the basal-like and HER2-enriched subtypes, where up to 50% of these ER-negative tumors may have TP53 mutations. Although deletion of both alleles of p53 rarely occurs in human breast cancer, the relevance of the p53-null mouse model is that absence of p53 sensitizes the mammary epithelium to other stochastic changes, and three main subtypes of p53 null tumors arise (i.e., luminal, basal-like, and claudin-low). In collaboration with Dr. Daniel Medina at BCM, the Rosen laboratory has collected fifty p53 null mammary tumors that develop stochastically at an average of 6 months of age following transplantation of p53 null mammary epithelial cells (MECs) into the cleared mammary fat pad of wildtype p53 syngeneic Balb/c mice [56]. These tumors are aneuploid and show varied histologies and biomarker expression, including some estrogen receptor (ER) positive tumors and, as described above, tumors of at least three different expression subtypes.
Early passage, stably transplantable, human xenograft models are available. The Lewis and Chang laboratories have established a relatively large series of stably transplantable human breast cancer xenograft models that accurately reflect tumor biology observed in the patient. To date, we have established 13 stably transplantable xenograft models, including nine “triple negative”, two HER2+, and two ER+ breast cancer models (one of which is also an inflammatory breast cancer). All of these novel xenograft models are available for our use.

With respect to our Aim 1.2, to identify candidate genes and pathways that may regulate TIC behaviors (e.g., self-renewal, differentiation, and metastasis), we isolated cellular subpopulations enriched for TIC from both mouse (detailed in section N3) and human models and evaluated these enriched populations in comparative microarray analysis in order to characterize candidate genes and pathways regulating TIC behavior.

Gene expression analysis of enriched TIC populations in human breast cancers. We (Lewis/Chang/Rosen) and others [57, 58] have developed “cancer stem cell signatures” from human clinical samples. However, each of these studies has taken a unique approach for purification of TIC and generation of the candidate gene list. In our effort to define a gene expression signature of tumorigenic breast cancer cells, we obtained populations enriched for tumorigenic cells by two methods: CD44+/CD24-/low and the ability to form mammospheres (MS). Comparative gene expression analysis was performed in populations enriched for tumorigenic cells (CD44+/CD24-/low or MS) vs. non-tumorigenic cells (“other” flow sorted, or bulk tumor, respectively), and then analyzed to determine whether there was significant overlap between both enrichment methods. In the first comparison (consisting of 14 CD44+/CD24-/low vs.
15 “other” profiles, representing 19 patients and 9 patient pairs), 2221 RNA transcripts (1424 named genes) were elevated (p<0.01 unpaired, two-sided t-test; fold change>1.5; FDR~0.2) in the flow-sorted CD44+/CD24-/low vs. other cells; in the second comparison (consisting of 15 MS vs. 11 primary cancer profiles, representing 16 patients and 10 patient pairs), 2696 transcripts (1890 genes) were elevated (p<0.01 unpaired, two-sided t-test; fold change>1.5; FDR~0.25) in the MSs vs. primary cancers. The numbers of genes arising in the two separate “enrichment” comparisons greatly exceeded the number expected by chance from multiple testing. The shared overlap of 154 transcripts (117 genes) between these two tumor-initiating enrichment methods was significant (p=1E-5, one-sided Fisher’s exact test) (Figure 12A). Among the transcripts with decreased expression, 339 transcripts (263 genes) significantly overlapped (p=1E-15, one-sided Fisher’s exact test) (Figure 12B). Thus, we defined a “CD44+/CD24-/low−MS gene signature” which comprised the relative “up” and “down” patterns of the 493 (154 over-expressed and 339 under-expressed) transcripts present in the significant overlap between both comparisons (heat map in Figure 12C).

Figure 12. Generation of a gene expression signature from the overlap of expression patterns of two cell populations enriched for tumor initiating cell types.

Gene expression patterns characteristic of populations enriched for TIC are enhanced after hormone and chemotherapy. As detailed in section N3 above, we showed previously that residual tumors after treatment with standard systemic chemotherapies are enriched for cells expressing cell surface markers characteristic of TIC (CD44+;CD24-) and further that these residual cell populations are enriched for cells with mammosphere forming potential (a surrogate assay for TIC function). We reasoned that if the frequency of TIC increased after systemic therapies, genes differentially expressed in such cells relative to other cells in the tumor should also show enhanced correlation after treatment vs. prior to treatment. We have since tested this hypothesis explicitly and have shown that TIC gene expression correlates better not only after conventional chemotherapies, but also after hormone therapy (Letrozole). These data suggest that cells underlying tumor formation in both hormone receptor positive and hormone receptor negative subtypes may share some underlying biological similarities and thus, targeting these residual cancer cells is likely to be the key to effective treatment of multiple breast cancer subtypes.

Figure 13. Enrichment of the TIC-enriched gene expression signature after either hormone therapy (A) or chemotherapy (B).

Figure 14. Gamma secretase inhibition does not alter xenograft growth significantly in two “triple negative” human breast cancer xenografts (xenografts 2147 (left) and 2665 (right)). Note, both xenografts are completely resistant to low dose chemotherapy (10mg/kg). [Panels: tumor volume (mm3) versus days for xenografts 2665 and 2147; treatment groups: vehicle control, Docetaxel (10mg/kg), Merck003 (100mg/kg), and combined; n=8 or 9 per group.]

Figure 15. Gamma secretase inhibition decreases the proportion of mammosphere-initiating cells in treated xenografts in vivo. [Panels: mammosphere forming efficiency (MSFE, %) for xenografts 2665 and 2147; treatment groups: vehicle, Docetaxel (10mg/kg), Merck003 (100mg/kg), and combination.]

With respect to Aim 3, to conduct a “Directed Iterative Functional Genomic Screen” (DIFGS) to characterize genes functionally that either decrease tumor-initiating capacity or increase tumor-initiating capacity, we have ongoing functional genomics experiments investigating the function of genes in our “first generation” human TIC signature. We are testing the effect of lentiviral shRNA knockdown of genes comprising this “TIC signature” on TIC behaviors. Results from an initial screen of ~550 genes (1290 shRNAs) in SUM159 cells have identified 101 genes represented by 116 unique shRNA lentiviruses that regulate mammosphere initiation significantly (increase or decrease), many of which are known stem cell regulators (e.g. beta-catenin and Wnt5A in the Wnt/B-catenin pathway, Hes1 a target of the Notch pathway, and hedgehog signaling components Ptch1 and SuFu). In this project, we propose to extend these analyses in a directed manner by using sequential bioinformatic identification of all known and suspected interacting proteins, upstream regulators, and downstream targets of a given set of experimentally validated TIC regulators such that new sets of candidate shRNAs can be screened in an iterative manner until all known interacting genes are evaluated functionally in a set of tumor models. A certain number of randomly selected “unknowns” (~50) can be included to identify novel regulators with each iteration of the screen.
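The directed, iterative selection of candidates described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function next_candidates, the interaction map, and the placeholder gene names are hypothetical stand-ins for the bioinformatic lookup of interacting proteins, upstream regulators, and downstream targets, and the number of random "unknowns" is a free parameter.

```python
import random

def next_candidates(hits, interactions, screened, universe, n_unknown=50, seed=0):
    """Pick the next round of genes to screen (hypothetical helper).

    hits         -- genes validated as TIC regulators in earlier rounds
    interactions -- dict: gene -> set of known or suspected partners,
                    upstream regulators, and downstream targets
    screened     -- genes already screened (never selected again)
    universe     -- all genes represented in the shRNA library
    n_unknown    -- number of randomly chosen "unknowns" added per round
    """
    directed = set()
    for gene in hits:
        directed |= interactions.get(gene, set())
    directed = (directed & set(universe)) - set(screened)

    # Random "unknowns": unscreened genes with no known link to the hits.
    pool = sorted(set(universe) - set(screened) - directed)
    rng = random.Random(seed)
    unknowns = set(rng.sample(pool, min(n_unknown, len(pool))))
    return directed, unknowns


# Toy example; the interaction map is made up and is not real annotation.
universe = ["CTNNB1", "WNT5A", "HES1", "PTCH1", "SUFU", "GENE_A", "GENE_B", "GENE_C"]
interactions = {"CTNNB1": {"WNT5A", "GENE_A"}, "HES1": {"GENE_B"}}
directed, unknowns = next_candidates(
    hits={"CTNNB1", "HES1"},
    interactions=interactions,
    screened={"CTNNB1", "HES1", "WNT5A"},
    universe=universe,
    n_unknown=2,
)
print(sorted(directed))   # genes linked to validated hits, not yet screened
print(sorted(unknowns))   # randomly chosen unscreened genes
```

In this sketch the iteration would stop when a round returns no new directed candidates, which corresponds to all known interacting genes having been evaluated.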
With respect to Aim 4, to define the cellular responses of TIC to genetic and pharmacological manipulation of genes regulating TIC survival or function, in addition to our functional genomics studies, we and others are actively investigating pharmacological inhibitors of known cell-cell signaling and other regulatory pathways governing tumor-initiating cell survival, behavior, and treatment response (e.g. Notch, Hedgehog, Wnt, PI3K/Akt) as experimental therapeutics. Several of these compounds, notably gamma-secretase inhibitors (Notch signaling) and Hedgehog signaling inhibitors, are showing promising results not necessarily by reducing tumor volume (Figures 14 and 15), but by reducing tumor-initiating cell frequency as estimated by mammosphere-formation assay.

Figure 16. Hedgehog signaling inhibition by GANT61 prevents chemotherapy-induced enrichment of mammosphere-initiating cells. [Panel: GANT61 treatment, mammosphere forming efficiency (%) for xenograft 2147; treatment groups: vehicle control, Docetaxel (20mg/kg), GANT61 (50mg/kg), and combined.]

We are particularly excited by recent preliminary data using a novel class of hedgehog signaling inhibitor. In this study, a stably transplantable smoothened-overexpressing tumor (T505) was transplanted into the cleared #4 mammary fat pads of 36 FVB mice (JAX). When tumors reached a volume of 100-400 mm3, mice were randomized into 4 treatment groups: vehicle control, Docetaxel (20mg/kg), Hedgehog antagonist (40mg/kg), and combination of Docetaxel and Hedgehog antagonist. Docetaxel was administered IP on Day 1 and Hedgehog antagonist was administered by oral gavage daily for 10 days. On day 10 tumors were harvested, pooled by experimental group, and dissociated to single cell suspensions. Cell suspensions were retransplanted bilaterally into cleared mouse mammary fat pads at two cell concentrations to evaluate changes in the proportion of tumor-initiating cells directly.
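Transplantation at graded cell doses lends itself to a limiting-dilution estimate of tumor-initiating cell frequency. The proposal does not specify a statistical model for this step, so the following is a minimal sketch that assumes a single-hit Poisson model (a fat pad forms a tumor if it receives at least one TIC). The function name tic_frequency is hypothetical, and the example counts are the vehicle-group counts from Table 2, used for illustration only.

```python
import math

def tic_frequency(doses, n_sites, n_tumors, lo=1e-9, hi=1.0, iters=200):
    """Maximum-likelihood TIC frequency under a single-hit Poisson model.

    Assumption: P(no tumor | d cells injected) = exp(-f * d), where f is
    the fraction of cells that are tumor-initiating.
    """
    def score(f):
        # Derivative of the binomial log-likelihood with respect to f.
        s = 0.0
        for d, n, k in zip(doses, n_sites, n_tumors):
            p_neg = math.exp(-f * d)
            s += k * d * p_neg / (1.0 - p_neg) - (n - k) * d
        return s

    # The score decreases monotonically in f, so bisect on a log scale.
    # If no site (or every site) forms a tumor, the search simply runs
    # to the lower (or upper) bound.
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Vehicle group: 3/6 tumors at 1200 cells and 1/6 at 500 cells per fat pad.
f_vehicle = tic_frequency(doses=[1200, 500], n_sites=[6, 6], n_tumors=[3, 1])
print(f"estimated TIC frequency: 1 in {1 / f_vehicle:.0f} cells")
```

With only six fat pads per dose the estimate is necessarily imprecise, so a confidence interval would be needed before comparing treatment groups.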
Hedgehog antagonist in combination with Docetaxel resulted in statistically significant reduction in tumor volume relative to vehicle or either agent alone. Upon retransplantation of treated tumor cells, the vehicle group showed 3 tumors out of 6 fat pads (1,200 cells) whereas in Hedgehog antagonist, Docetaxel, and combination groups no tumors formed at this concentration, with only a single tumor forming in the hedgehog antagonist group at 500 cells/fat pad. In this proof-of-principle experiment, Hedgehog signaling inhibition, either alone or in combination with chemotherapy, showed efficacy similar to chemotherapy alone in reducing or eliminating tumor initiating cells in vivo using this particular model. Together, these data are consistent with the hypothesis that hedgehog signaling inhibitors may be useful therapeutic agents for breast cancer treatment by specifically targeting the TIC population.

Table 2. Tumor formation rate in in vivo treated cell populations.

                        Tumor formation rate
Treatment               1200 cells/fat pad    500 cells/fat pad
Vehicle                 3/6                   1/6
Hedgehog antagonist     0/6                   1/6
Docetaxel               0/6                   0/6
Combination             0/6                   0/6

Section N4.1.D: Research Plan

N4.1.D.1. Aim 1.1: To identify tumor-initiating cells (cancer stem cells) using newly developed lentiviral fluorescent signaling reporters and to characterize their spatial distribution and behaviors during tumor growth using in vivo imaging.

Rationale. Currently available markers for TIC, including ALDH activity (Aldefluor assay) and cell surface markers CD44, CD24, and CD29, show insufficient specificity to allow purification of TIC. Thus, localization of
TIC is currently not possible, and definitive molecular analyses of TIC gene expression and function are not efficient and can, in fact, be misleading if a given manipulation functions at the level of the niche or microenvironment rather than at the level of the TIC themselves. Fluorescent reporters for known and suspected TIC regulatory networks should allow us to overcome these obstacles and allow real-time evaluation of the effects of experimental therapeutics on TIC function and behavior. Spatial localization data can then be incorporated into mathematical models and computer simulations of cancer development in Component 2.

Experimental design and methods. Generation of pathway reporters: We have already established the utility of using the TOP-EGFP pathway reporter to evaluate the importance of canonical Wnt signaling in the TIC subpopulation from several basal-like mammary tumors. The frequency of the TOP-EGFPhi cells detected by FACS varies among the basal-like tumors studied to date; however, expression of the EGFP reporter always correlates with TIC function. Importantly, no TOP-EGFP cells were detected in preliminary studies in three of the claudin-low tumors. This suggests that alternative pathways might be important in the claudin-low tumors. The Notch signaling pathway plays a critical role in regulating mammary luminal cell fate commitment [59], so it will also be critical to use Notch reporters in these studies. Finally, studies in our laboratories have also established a role for Smoothened in normal mammary gland development, suggestive of a role for Hh signaling in mammary gland cell fate determination [60]. Thus, we will use established individual pathway reporters and develop a novel triple reporter for simultaneously monitoring all three pathways known to play a role in stem cell self-renewal and differentiation.
For monitoring Wnt pathway activation, we obtained a TOP-EGFP lentiviral construct that contains a series of three LEF-1/TCF binding sites and a TATA box. Wnt ligands bind to Frizzled receptors and trigger signaling by a chain of events. This leads to the release of beta-catenin from a destruction complex, allowing it to enter the nucleus and function as a transcriptional co-activator with TCF/Lef family transcription factors to drive transcription of Wnt targets. This construct was the kind gift of Dr. Irving Weissman at Stanford University. To assay activation of the Notch pathway we will use either pSIN-Hes1p-d4Venus or pSIN-TP-1-d4Venus-1[61]. The pSIN-Hes1p-d4Venus construct contains the 195 bp promoter region of Hes1, a direct target of activated Notch signaling, driving a VENUS reporter. The pSIN-TP-1-d4Venus construct has an artificial promoter that contains 12 RBP-J binding sites and a minimal promoter that drives the VENUS reporter. The intracellular domain of Notch (NICD) acts as a membrane bound transcription factor. Once released following interaction with its ligands, NICD translocates to the nucleus where it interacts with CBF1/RBP-J to drive transcription of target genes. These constructs were the kind gift of Dr. Hideyuki Okano. To assay activation of hedgehog signaling, we will subclone an 8xGli binding site promoter element [62] into a lentiviral vector upstream of a tetrameric dTomato reporter [63]. A control vector containing an 8x mutant Gli binding site will be used as a negative control for reporter specificity. Hedgehog ligands bind to Ptch receptors (Ptch1 or Ptch2) relieving inhibition of Smoothened. Activation of Smoothened leads ultimately to production of transcriptional activator forms of one or more Gli transcription factors (Gli1, Gli2, or Gli3). These activator forms then translocate to the nucleus and activate target gene transcription [64]. 
These promoter elements were obtained from the ATCC and have been validated in transient transfection assays in our laboratory. After these initial vectors have been developed and validated, we will begin to construct other signaling reporters as candidate regulatory networks are identified in Component 2. These may include reporters for STAT3 activation (cytokine signaling), TGF-beta signaling, PPAR-gamma, etc. We will test these reporters in a series of transplantable p53 null mouse mammary tumors representative of three major breast cancer subtypes (ER+, triple negative, claudin-low). We have successfully adapted a protocol from Dr. Bryan Welm, a former graduate student in Dr. Rosen’s laboratory, for infecting mammary epithelial cells in suspension for use with dissociated tumor cells [65]. We have been successful in using this technology for overexpression, gene knockdown, and reporters, and in transplanting the cells into the cleared mammary fat pad of 3-week-old recipient mice to regenerate the normal gland as well as tumors. We are now optimizing a series of doxycycline-regulatable lentiviral vectors for use in vivo in transduced mammary epithelial cells and in our genetically engineered breast cancer models, and comparing the efficiency of vectors driven by elongation factor EF1α with vectors driven by the spleen focus forming virus promoter/enhancer (SFFV) [66], as well as the ubiquitin C (UBC) promoter.

As mentioned above, there exists extensive cross-talk between these pathways during development. One hypothesis is that if one of these pathways is inhibited therapeutically, another may be induced as a compensatory mechanism. In order to monitor all three pathways simultaneously in a given single cell, and test this hypothesis, we will develop a triple pathway reporter system.
We will clone each of the three regulatory elements in front of a different fluorescent reporter. The three fluorescent reporters we will use are VENUS, tandem dimer (td)TOMATO [67], and Enhanced Blue Fluorescent Protein (EBFP) [68]. VENUS and tdTOMATO are both bright, have good spectra, and have proven to be non-toxic. For the third color, we will use EBFP in the shorter wavelength range. The emission spectra for these three reporters do not overlap significantly and are easily distinguishable using standard filter cubes or spectrally tunable electronic filters such as those on our confocal microscopes. Some of these fluorescent reporters are being obtained from Dr. Timm Schroeder. We will test these reporters in combination to determine if we can accurately detect two at a time and then all three simultaneously by FACS. Baylor College of Medicine has a new state of the art Cytometry and Cell Sorting Core (CTIC) facility. The CTIC instrumentation includes two FACSAria machines that can detect up to 13 colors plus forward and side scatter and perform 4 way sorting. Activity of these reporters can also be evaluated and imaged using confocal microscopy available both in the Integrated Microscopy Core and in the Lester and Sue Smith Breast Center here at Baylor College of Medicine. Potential pitfalls and experimental alternatives. It is possible that the EBFP reporter may not be bright enough to detect low level activity of a given regulatory network. If this appears to be the case, we will replace the EBFP with a brighter fluorescent protein reporter. In addition, reporters in the far red range may be preferable under certain circumstances. However, far red fluorescence is not generally visible and thus is a bit more technically challenging to use than a visible reporter since all localization must be done via the microscope imaging equipment rather than by eye. 
The rationale for creating the triple reporter is to ensure that all three reporters are transduced together into the same cell. Introducing three pathway reporters into a single lentiviral vector is challenging and if unsuccessful, alternatively, we will use individual reporters (or dual reporters) with different fluorescent markers that can be co-transduced. In addition to the proposed uses of this reporter system in tumor studies, this resource will be very useful for studies of mammary gland development and other tissues that can be reconstituted. It can also be used for genetic and chemical screens in cell lines looking for activators or inhibitors of these pathways including cross-regulation between pathways. These studies should help determine the functional differences or similarities in the TIC subpopulations and provide the basis for more detailed gene expression and functional genomic analyses described below. N4.1.D.2. Aim 1.2: To identify candidate genes and pathways that may regulate TIC behaviors, e.g. self-renewal, differentiation, and metastasis Rationale. Under our previously funded SPORE grant, we derived a gene expression signature for tumor-derived cell populations enriched for TIC using cell surface markers CD44+;CD24-/low. In this project, we propose to extend these analyses considerably. A similar, but currently unfunded analysis was performed using CD29H/CD24H for a small set of three p53 null mouse tumors. However, since these populations are only enriched for TIC, identification of regulatory pathways critical for TIC survival and growth are not easily identified using gene expression approaches. While our initial shRNA screen (described below in Specific Aim 3) based on our human TIC-enriched signature yielded 101 candidate regulatory genes, we believe that specific signaling reporters will enhance our ability to identify bona fide regulators more efficiently and will enhance our subsequent “hit rate” in functional genomic analysis. 
Based on our preliminary results indicating that a TOP-EGFP Wnt reporter allows enhanced identification and further enrichment of the TIC subpopulation, we propose to use a combination of fluorescent signaling reporters to purify TIC more efficiently and analyze their gene expression patterns more effectively. Genes and pathways identified will be tested functionally in Specific Aim 3. These gene expression data can be incorporated into developmental models to be developed by Component 2.

Experimental design and methods. TIC expressing one or more of the signaling reporters used in Specific Aim 1.1 will be purified by FACS. The method for both human and mouse chip experiments is virtually identical except for the Affymetrix chip used. RNA isolation, cDNA synthesis, and combined in vitro transcription and biotinylation labeling (IVT) on the purified samples will be carried out according to protocols recommended by Affymetrix GeneChip™ (Santa Clara, CA). Briefly, total RNA will be isolated using TRIzol (Invitrogen Corporation, Carlsbad, CA), and subsequently passed over a Qiagen RNeasy column (Qiagen, Valencia, CA) to remove small fragments that have been shown to affect RT-reaction and hybridization quality. Based on the yield from resting lymphocytes, we expect cancer stem cell populations to yield 2 to 6 micrograms of total RNA per 10,000 cells. After RNA recovery, double-stranded cDNA will be synthesized using a chimeric oligonucleotide with an oligo-dT and a T7 RNA polymerase promoter at a concentration of 100 pmol/µL. Reverse transcription will be carried out according to protocols recommended by Affymetrix GeneChip, using commercially available buffers and proteins (Invitrogen Corporation).
Biotin labeling and approximately 250-fold linear amplification followed by phenol-chloroform cleanup of the reverse-transcription reaction product will be carried out by in vitro transcription (Enzo Biochem, New York, NY) over a reaction time of 8 hours. The labeled cRNA will be hybridized onto an Affymetrix U133 plus 2.0 GeneChip (MG430 2.0 GeneChip for mouse) following the recommended procedures for prehybridization, hybridization, washing, and staining with streptavidin-phycoerythrin. Antibody amplification will be accomplished using a biotin-linked anti-streptavidin antibody (Vector Laboratories, Burlingame, CA) with a goat-IgG blocking antibody (Sigma, St. Louis, MO). A second application of streptavidin-phycoerythrin will be used subsequent to additional wash steps. After automated staining and wash, the arrays will be scanned on an Affymetrix GeneChip Scanner (Agilent, Palo Alto, CA) and quantified (Affymetrix, Santa Clara, CA). Low-level microarray data analysis includes quality assessment, normalization, and expression modeling [69, 70]. We most often use dChip[71] for quality control assessments and preliminary analyses, and then convert expression to RMA or GCRMA estimates for analysis in more complex linear models with Splus ArrayAnalyzer©, Bioconductor[72], or BRB Arraytools (http://linus.nci.nih.gov/BRB-ArrayTools.html). We have been highly successful in the use of gene expression arrays for identification of pathways responsible for chemotherapy sensitivity and resistance [1, 73]. Initial quality control and statistical analysis to identify candidate signaling pathways responsible for treatment resistance, and for self-renewal of MS-initiating cells and putative breast cancer stem cells, will be conducted by Dr. Susan Hilsenbeck and Dr. Chad Shaw. These data will also be provided to Component 2 for additional bioinformatic analysis, i.e. 
identification of inhibitory compounds, drug repositioning, and mathematical modeling of TIC signaling/regulation (please see below). For the Affymetrix format, we will use MicroArraySuite® (Affymetrix) to generate probe-level quantitation data. We will then use the DNA-Chip Analyzer (dChip) software package [71] to handle normalization, estimate expression, and visualize results of other higher level statistical analyses. Expression data will be used for a series of analysis of variance (ANOVA) calculations. These analyses will identify genes whose expression is altered as a function of cell type and treatment. Resulting gene lists will be used in dChip to visualize the direction and magnitude of expression changes. Functional annotations from the Gene Ontology database (http://www.geneontology.org) can then be used to segregate known genes by the processes in which they function (e.g. proliferation, differentiation, cell adhesion). High-level analysis varies, depending on the questions to be asked, but will include clustering and gene-wise selection using linear models. We will use pathway discovery software including Ingenuity and PathwayAssist (Stratagene) [74], which uses natural language search algorithms to probe published literature for associations between genes. From the Affymetrix gene expression analyses, we will obtain at least three sets of genes: 1) genes differentially expressed between Wnt responsive (EGFP+) vs. unresponsive cells; 2) genes differentially expressed between Notch responsive cells (EBFP+) and all other cells; and 3) genes differentially expressed between Hedgehog responsive cells (tdTomato+) and all other cells.
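Overlap between gene sets of this kind can be tested with a one-sided Fisher's exact test, as was done for the preliminary signature. The sketch below computes the equivalent hypergeometric upper tail using only the Python standard library. The function names are hypothetical, and the universe size (the number of transcripts that could have been selected) is an assumed value, not a figure taken from this proposal; the list sizes and overlap are the preliminary counts for the up-regulated transcripts.

```python
import math

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def overlap_pvalue(universe, size_a, size_b, overlap):
    """One-sided P(overlap >= observed) for two lists drawn at random.

    This is the upper tail of the hypergeometric distribution, which is
    what a one-sided Fisher's exact test on the 2x2 table evaluates.
    """
    k_max = min(size_a, size_b)
    if overlap > k_max:
        return 0.0
    log_denom = log_comb(universe, size_b)
    logs = [
        log_comb(size_a, k) + log_comb(universe - size_a, size_b - k) - log_denom
        for k in range(overlap, k_max + 1)
    ]
    peak = max(logs)  # factor out the largest term to avoid underflow
    return math.exp(peak) * sum(math.exp(x - peak) for x in logs)


# Assumed universe: 54,675 probe sets on the array (an assumption).
p_up = overlap_pvalue(universe=54675, size_a=2221, size_b=2696, overlap=154)
print(f"one-sided overlap p-value: {p_up:.2e}")
```

Because the universe size is assumed, the printed value need not match the reported p-value; restricting the universe to transcripts that passed filtering would change it.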
There is the possibility that the different breast tumor phenotypes (luminal, basal, ERBB2+) may show different patterns of expression in the profiling experiments, in which case we could have more than three sets of genes to consider. We will determine whether the various gene sets obtained from each of the expression profile datasets show a significant amount of overlap with each other. Such an overlap would be indicative of nonrandom, biological associations between the different signaling networks considered. A number of analytical methods would be used to look for patterns of enrichment, all of which should yield significant results if true patterns exist in our data. A one-sided Fisher’s exact test would compare the overlap between the top genes from each independent comparison with the overlap expected by chance. Alternatively, rank-based approaches that help avoid using arbitrary cutoffs include Gene Set Enrichment Analysis (GSEA) [75].

Validation by real time quantitative RT-PCR (QT-PCR). As with any technique, the results of expression microarray analyses must be confirmed using an independent method. We will use real time quantitative PCR (QT-PCR) for sampling a set of genes identified in microarray analysis with the highest likelihood of participating in the phenotypic alterations. For this we will use TaqMan Low Density Arrays which, depending on the format chosen, will offer quantitative analysis of up to 380 targets (+4 control targets) (“Format 384”) in a single amplification run for an individual RNA sample. Since we are expecting approximately 500 named genes to be altered significantly among the various gene sets, we will use the “Format 384” platform for our analyses. Each sample used for microarray analysis will also be used with a low density array.
Thus, approximately 60% of the expected ~500 named genes identified by microarray can be validated for altered expression using this approach using a single array per sample. Proteomic analysis using Clontech antibody arrays. Antibody arrays are available from a number of companies that now allow one to evaluate differential expression of ~500 human proteins (~300 for mouse)in a single experiment using two color competitive binding and image analysis methods similar to those used for two color analysis of mRNA expression. In this method, high-quality antibodies are immobilized in an array on glass slides. These slides are then incubated with dye-labeled (e.g. Cy3) protein lysates as well as a dye-labeled protein control (e.g. Cy5). A second array is also used in which the samples are labeled with the reciprocal dye (dye-reversal). Differential fluorescence (Cy3/Cy5 and Cy5/Cy3) is then measured at each antibody spot location on each and the two ratios are compared to generate an Internally Normalized Ratio (INR) value, which helps correct for differential dye labeling and antibody/antigen binding. We will use the Clontech Antibody Array 500 for human samples and Antibody Array 300 for mouse samples. For the mouse tumors, we will evaluate three individual p53-null tumors representing each of three different tumor subtypes (basal-like, ER+, and claudin-low). Cells will be separated using the previously identified CD29 and CD24 cell surface markers into the CD29+/CD24+ population (the TIC-enriched population) and “all others”. For the human tumors, we will evaluate four different triple-negative tumors as well as both of our HER2+ xenograft models. Human tumors will be separated using the most effective marker identified above (e.g. ALDH, Wnt, Notch, or HH activity) for the identification of the TIC fraction and sorted accordingly into the TIC-enriched populations versus “all others”. 
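The dye-reversal correction behind the Internally Normalized Ratio can be illustrated with a short calculation. This is a sketch under an explicit assumption: both ratios are written as sample over control, and the INR is taken as their geometric mean, so that a multiplicative dye bias cancels. The function name and the intensities are hypothetical, and the vendor's own formula may be written differently.

```python
import math

def internally_normalized_ratio(sample_cy3, control_cy5, sample_cy5, control_cy3):
    """Combine a forward and a dye-swapped array into one ratio.

    Array 1: sample labeled with Cy3, control labeled with Cy5.
    Array 2: sample labeled with Cy5, control labeled with Cy3.
    A multiplicative dye bias inflates one ratio and deflates the other
    by the same factor, so it cancels in the geometric mean.
    """
    ratio_forward = sample_cy3 / control_cy5
    ratio_swapped = sample_cy5 / control_cy3
    return math.sqrt(ratio_forward * ratio_swapped)


# Made-up intensities: true sample/control ratio of 2, with Cy3 reading
# 1.5 times brighter than Cy5 for the same amount of protein.
inr = internally_normalized_ratio(
    sample_cy3=300.0, control_cy5=100.0,   # forward ratio 3.0 (biased high)
    sample_cy5=200.0, control_cy3=150.0,   # swapped ratio 1.33 (biased low)
)
print(inr)
```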
The Ab Microarray 500 slide (ClonTech, Mountain View, CA, USA) is a robust tool for high-throughput analysis of proteomic profiles. This antibody array consists of 506 individual antibodies spotted in duplicate upon a glass slide, and detects proteins in different functional categories including apoptosis, cancer, cell cycle, protein kinases, and neurobiology. The antibody array protocol is a fluorescence-based procedure in which antibodies printed on a glass surface are used to capture fluorescently-labeled antigens. The buffers in the Ab Microarray Express Buffer Kit yield the highest signal to noise ratio and are specifically formulated to minimize background binding. The mouse array is similar, but slightly smaller. Protein extraction and labeling will be performed according to the manufacturer's small-scale protocol using the Protein Extraction & Labeling Kit (ClonTech, Mountain View, CA, USA). A common reference will be generated by mixing all the samples together. This common reference will be labeled with Cy3, and each sample will be labeled with Cy5 (GE Healthcare, Piscataway, NJ, USA) before being passed through a Microspin Desalting Column (Pierce Biotechnology, Rockford, IL). The labeled common reference and sample pair will be applied to a single slide, and the standard protocol will be followed. The slides will be dried and scanned using an Axon AL4200 scanner, and the scanned images will be analyzed with GenePix 6.0 software (Molecular Devices, Sunnyvale, CA). A GPR result file containing signal intensities with and without background, the signal to noise ratio (SNR) in each channel (Cy3 and Cy5), a flag reflecting the quality of each spot, etc., will be produced for each array. Bioconductor will be used to process the GPR result files and printTipLoess normalization will be performed. A ratio for each antibody spot on sample vs. common reference will be calculated for each array.
This ratio will be used for further statistical analyses across all the samples. We will perform a two-sample t-test for two-group comparisons, and an ANOVA for multi-group comparisons. Differentially expressed proteins will be identified and then validated using traditional methods such as Western blotting and/or ELISA.

Expected results. We expect that cell populations enriched for cells capable of self-renewal will show similar sets of genes/proteins that are differentially expressed relative to cell populations lacking self-renewal capacity. For example, Wnt responsive cells may overlap considerably with Hedgehog responsive cells, if both populations contain the TIC. If TIC are responsive to all three signaling pathways being examined (responsive to Wnt, Notch, and Hedgehog signaling), triple positive cells may identify critical genes essential for TIC function more effectively than any other method used to date. Thus, from this Aim, we hope to identify active pathways responsible for therapy resistance and self-renewal. It is hoped that by examining the gene expression patterns of cells that are chemoresistant and tumor-initiating, we will be able to generate hypotheses concerning novel therapeutic targets. In addition, we predict that new markers and reagents for “stem cell” isolation will result, and that some of these new markers will specifically label “cancer stem cells”, providing targets to be used in producing new anti-cancer stem cell therapies.

Potential pitfalls and experimental alternatives: The techniques to be used are well established in our laboratories and we have published extensively in this area. As such, we do not expect any technical problems. It is possible that a given fluorescent reporter may not emit at a high enough level to be easily discriminated by FACS (e.g. EBFP).
If this is the case, a more robust fluorescent protein reporter (tdTomato, Venus, EGFP) will be chosen and cloned downstream of the target promoter. In the event that no proteins are identified as differentially expressed in Clontech’s antibody arrays, we propose to use a different antibody array platform, the Panorama™ Ab Microarray XPRESS Profiler725 from Sigma. This array contains 725 antibodies printed on a glass slide, and the proteins from Sigma’s array cover different functional groups such as gene regulation, cell signaling, MAPK & PKC pathways, and p53 pathways. Clontech’s and Sigma’s arrays have only 70 proteins in common; therefore, by combining these two antibody array platforms, we will be able to analyze a total of 1160 proteins. At present, Clontech’s arrays show better quality, with lower background and a higher signal to noise ratio; we have therefore decided to use Clontech’s array in this grant, with Sigma’s array as a backup.

N4.1.D.3. Aim 1.3: To conduct a “Directed Iterative Functional Genomic Screen” (DIFGS) to characterize genes functionally that either decrease tumor-initiating capacity or increase tumor-initiating capacity.

Rationale. We developed an initial gene expression signature of cell populations enriched for TIC consisting of 456 differentially expressed genes, and conducted a functional genomics shRNA screen of these genes (and selected others) to determine which of these genes are important for the regulation of mammosphere formation (a surrogate assay for TIC function). From this initial screen, we have identified ~101 genes that either increase or decrease mammosphere formation when disrupted. It is our goal to now extend these functional analyses considerably to conduct a genome-wide screen in a directed, iterative manner. We hypothesize that direct binding partners, as well as immediate upstream regulators and downstream targets of this subset of genes will be highly enriched for additional genes regulating TIC function.
Further, we hypothesize that our functional genomics screen can be conducted iteratively such that the entire library of shRNAs can be screened in a directed fashion.

Experimental design and methods. Starting with our validated list of 101 genes, we will use bioinformatics tools, including Ingenuity Pathway Analysis (IPA) and KEGG repository diagrams, as well as our own knowledge of individual regulatory pathways, to generate a new candidate gene list again consisting of approximately 500 genes. In our preliminary work, about 1300 shRNAs can be screened in a single experiment with relative ease and efficiency. Larger numbers of shRNAs may be feasible to screen with modification of our automation methods, but we do not anticipate being able to conduct a full genome analysis in a single experiment. For the DIFGS, lentiviral shRNA sublibraries will be generated in a sequential manner and used to transduce SUM159 and Hs578T cells as we have performed previously. Changes in mammosphere formation efficiency (MSFE) will be evaluated by plating 2000 transduced cells per well in 96-well ultra-low attachment plates and measuring MSFE after 3-5 days using the GelCount colony imager and quantification software. The experiment will be repeated eight times for the entire sublibrary in each cell line. Mammosphere formation efficiency will be normalized within each plate relative to the median mammosphere-formation efficiency and ranked by Z-score. Z-scores across all 8 replicates will be averaged, re-ranked, and given a p-value. A p-value of 0.05 or less will be considered significant. All shRNAs achieving a significant p-value on either cell line will be rescreened against each cell line to confirm an effect on mammosphere formation efficiency.
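The plate-wise normalization and replicate averaging can be sketched as follows. The text does not state how the Z-score is scaled or how the p-value is assigned, so two assumptions are made here and flagged in the comments: each well is centered on the plate median and scaled by the plate standard deviation, and the averaged Z-score is tested against a normal distribution with standard deviation 1/sqrt(n) for n replicates. Function names, shRNA labels, and MSFE values are hypothetical.

```python
import math
import statistics
from statistics import NormalDist

def plate_z_scores(msfe):
    """Z-score each well of one plate relative to the plate median.

    Assumption: scaling by the plate standard deviation; a robust
    alternative would be the median absolute deviation.
    """
    values = list(msfe.values())
    med = statistics.median(values)
    sd = statistics.stdev(values)
    return {sh: (v - med) / sd for sh, v in msfe.items()}

def combine_replicates(replicates):
    """Average Z-scores over replicate plates; return {shRNA: (mean_z, p)}."""
    n = len(replicates)
    z_by_plate = [plate_z_scores(plate) for plate in replicates]
    results = {}
    for sh in replicates[0]:
        mean_z = sum(z[sh] for z in z_by_plate) / n
        # Assumption: under the null, replicate Z-scores are independent
        # standard normals, so their mean has SD 1/sqrt(n) (two-sided test).
        p = 2 * (1 - NormalDist().cdf(abs(mean_z) * math.sqrt(n)))
        results[sh] = (mean_z, p)
    return results


# Made-up MSFE values (%) for one plate, reused with plate-to-plate scaling.
base = {"shA": 0.30, "shB": 0.28, "shC": 0.32, "shD": 0.31, "shE": 0.29, "shHIT": 0.05}
factors = (0.90, 0.95, 1.00, 1.05, 1.10, 0.92, 1.08, 1.00)   # 8 replicates
replicates = [{sh: v * f for sh, v in base.items()} for f in factors]

results = combine_replicates(replicates)
ranked = sorted(results, key=lambda sh: results[sh][0])       # most reduced first
hits = sorted(sh for sh, (z, p) in results.items() if p <= 0.05)
print(ranked)
print(hits)
```

With roughly 1300 shRNAs per screen, an unadjusted threshold of 0.05 would be expected to pass a number of shRNAs by chance, which is one reason the rescreening step matters.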
Finally, the function of each shRNA on TIC will be confirmed using one or more xenograft lines as the source of cells for transduction and transplantation, as funding permits.

Expected results. Based on our initial screen of 1290 shRNAs representing ~500 genes, we expect that approximately 100 genes will show a significant effect on mammosphere formation in one or the other cell line for each set of 500 genes chosen. We further expect that our “hit rate” will increase with the directed iterative approach, because a functional link should already be known among the candidates chosen. We expect significant overlap between the two cell lines in the genes showing functional effects on mammosphere formation, because both cell lines are of the basal/claudin-low subtype.

Experimental pitfalls and experimental alternatives. Given that we have performed this type of screen already, we do not anticipate technical problems. However, it remains possible that a reduction in mammosphere formation may not accurately reflect a reduction in TIC in all cases. Thus, in vivo validation of each hit via transplantation of transduced cells is critical. We will conduct these in vivo validations as funding and time permit.

N4.1.D.4. Aim 1.4: To define the cellular responses of TIC to genetic and pharmacological manipulation of genes regulating TIC survival or function in vivo.

Rationale. In addition to the cell surface markers currently in use in our laboratories, we will extend these studies to use Wnt-, Notch-, and Hh-specific pathway reporters to assess the activation status of these pathways in TIC from the different subtypes, and to determine how pathway activity and TIC localization/mobility change in response to systemic and “stem cell targeted” therapies.

Experimental Design and Methods.
In addition to standard systemic chemotherapeutics (docetaxel, 20 mg/kg, or doxorubicin, 5 mg/kg) and radiation (2 Gy), we will also use experimental therapeutics under development for inhibition of various pathways thought to play a role in tumor-initiating cell proliferation, self-renewal, and survival. For each pathway, we will treat mice with single-agent biologics, with and without chemotherapy, and again measure bulk tumor volume, long-term survival, and effects on TIC numbers, using both mammosphere assays as a surrogate for TIC function and transplantation to define definitively the effect on TIC number in vivo. In addition, gene expression changes in purified cell populations will be monitored and evaluated using microarray and proteomic strategies in order to inform our developmental and cell-cell signaling models. Many developmental pathways, such as Wnt, are challenging to target. While small molecule Wnt pathway inhibitors are under development in several laboratories, these are not yet available for animal studies. As soon as they are available, they will be integrated into our studies. In the meantime, we can test the effect of disrupted Wnt signaling using a genetic approach, by lentiviral expression of either an inhibitory shRNA (identified already in our initial shRNA knockdown screen in Aim 1.3 above) or a dominant-negative β-engrailed construct used previously in the Rosen laboratory to inhibit canonical Wnt signaling [76]. For the Notch pathway, we will give a gamma-secretase inhibitor (MK003) that is currently under study in our laboratory. In addition, we have two alternatives and are testing both in vivo.
One is GSI-X from CalBiochem, and the other is DBZ, which is showing effects on intestinal cell behavior and looks promising [77]. DBZ is given at a dose of 0.048 mg/day/mouse on a 3-days-on, 4-days-off schedule; when given this way, it promotes goblet cell formation in the gut, which has been shown to be a hallmark of Notch pathway inhibition [78]. Finally, inhibitors of the Hedgehog pathway include those targeting Smoothened, specifically cyclopamine (a generous gift from Infinity Pharmaceuticals), CUR0199691 (Genentech/Curis) (under study, and with which we have extensive experience [79]), LDE225 (Novartis) (pending permission), and IPI926 (Infinity Pharmaceuticals) (pending permission), as well as an agent targeting the Gli1/Gli2 transcription factors called GANT61 [80]. All Hedgehog inhibitors are currently under study using human xenograft models representing the basal subtype, and are available in Dr. Lewis’ laboratory in quantities sufficient for the in vivo experiments proposed herein. Since human xenograft models representing the claudin-low subtype are not yet available, the mouse claudin-low tumors established in the Rosen laboratory will allow us to test these agents for their ability to target TICs in this subtype. In addition to these three major tumor-initiating cell regulators, we will also extend our studies to other developmental signaling pathways as they are identified, or as relatively specific inhibitors become available. Candidate inhibitors include those for STAT3 (cytokine and growth factor signaling targeted by an inhibitor developed in the laboratory of our collaborator Dr.
David Tweardy) and c-Src/Fyn kinases (implicated in our initial shRNA screen above and targeted by dasatinib (BMS)), as well as those against c-Met (HGF receptor), TGF-beta signaling, and PPAR-gamma function. Both our STAT3 inhibitor and dasatinib are showing promise in vitro and in vivo.

Expected results. Based on our preliminary data, we expect inhibitors of Wnt, Notch, and Hedgehog all to reduce the frequency of TIC significantly in vivo. With the exception of the GSI, which is known to have GI toxicity with extended use, we do not anticipate significant side effects from treatment over the timeframe used, since no adverse side effects have been noted thus far. We anticipate that one or more of the additional inhibitors to be studied will have similar effects on the TIC population in limiting dilution transplantation experiments.

Potential pitfalls and experimental alternatives. It is likely that one or more of the signaling inhibitors chosen will not function at the level of the TIC, but rather at the level of the TIC microenvironment or niche (e.g., blood vessel formation, fibroblasts, macrophages). We do not see this as a significant problem, since we will be able to dissect these behaviors in vivo using our imaging and signaling reporter approaches. In fact, if one can target both the TIC and the niche in combination therapies without the need for chemotherapies, one may be able to avoid the adverse side effects that chemotherapies induce. For some regulatory pathways identified, small molecule inhibitors may not be available. In such instances, we will attempt to use an inducible shRNA expression lentivirus in which an inhibitory shRNA is expressed under the control of a doxycycline-inducible promoter; this system is functioning in vitro and is currently being tested in vivo.
N4.2 Component 2 – Computational Biology: Mathematical Modeling and Computer Simulation
Title: Analyze and Model TIC Tissue Microenvironment
Project Leaders (core PIs): Xiaobo Zhou, Ph.D., Vittorio Cristini, Ph.D.

Project Summary: Mathematical modeling involves the use of mathematical equations and relationships to represent biological phenomena. Complementary to this type of modeling is the use of computer simulations to represent these modeling approaches in multiple dimensions. These approaches serve two purposes. First, they provide a basic framework for the interrogation and integration of data, often providing insight into the type and quality of data needed to address a hypothesis or experimental design. This feature is especially useful when trying to integrate or analyze the large datasets generally associated with systems biology. Second, and more importantly, these models or simulations should allow one to predict the biological state under investigation and how the natural process will behave in various circumstances. These problems center on understanding the behavior of biological systems whose function is governed by the spatial and temporal ordering of multiple interacting components at the molecular, cellular, and tissue levels. We will also develop bioinformatics and bioimaging models to integrate and analyze the data generated from Component 1, and will use the information obtained from data analysis, together with biological knowledge, to build in silico models of TIC behavior, cancer cell apoptosis, cell migration, cell cycle changes, and drug treatment response.
The goal of this component is to take advantage of our combined expertise in cell biology and computational modeling to develop coherent experimental protocols and construct biomathematical models for understanding the mechanism underlying breast cancer stem cell evolution, i.e., how one stem cell evolves into breast tumors of various sizes and compositions within the cell microenvironment. Our hypothesis is that TIC behavior during tumor development can be simulated using a robust, multiscale mathematical/computational model of TIC behavior during breast cancer development. We further hypothesize that these models can be built not only to reflect the molecular, cellular, and tissue-level dynamics, but also to allow prediction of the response of TIC to experimental therapeutics.

Key Personnel (listing percent of effort of each individual): Stephen T.C. Wong, Ph.D., P.E. (center PI), 20% in Component 2 (5% in Administrative Core, 5% in Education Core); Xiaobo Zhou, Ph.D. (Core PI), 30%; Vittorio Cristini, Ph.D. (Core PI), 10%; Ching Tung, Ph.D. (co-investigator), 5%; Jeff (Chung-Che) Chang, M.D., Ph.D. (co-investigator), 10%; John Baxter, M.D. (co-investigator), 5%; David Engler, Ph.D. (co-investigator), 5%; Paul Macklin, Ph.D. (co-investigator), 17%; Xiaofeng Xia, Ph.D. (co-investigator), 30%; Hong Zhao, M.D., Ph.D. (Research Associate), 100%; Fuhai Li, Ph.D. (Postdoc), 100%; Di Huang, Ph.D. (Postdoc), 50%; Xiaorong Yang, Ph.D. (Postdoc), 100%; Guangxu Jin, Ph.D. (Postdoc), 100%; Xiuwei Zhu, M.Sc. (Research Programmer), 100%; Huiming Peng, M.Sc. (Research Programmer), 100%.

N4.2.A Specific Aims
Component 2 is guided by the hypothesis that TIC behavior during tumor development can be simulated using a robust, multiparameter mathematical/computational model of TIC behavior during breast cancer development.
We further hypothesize that these models can be built not only to reflect the molecular, cellular, and tissue-level dynamics, but also to allow prediction of the response of TIC to experimental therapeutics. Thus, the central goal of Component 2 is to build a multi-scale model platform of the TIC mE for investigating TIC self-renewal, proliferation, localization, and other functions within a spatially and molecularly regulated microenvironment. Based on the experimental data obtained from Component 1 and published knowledge of TIC, we will model the TIC tissue microenvironment (TIC mE) from the molecular level, through the cellular level, up to the tissue level. The TIC mE model can in turn predict and guide the pathway analysis, candidate gene selection, and genetic and pharmacological manipulation in Component 1. Accordingly, the Specific Aims of Component 2 are:

Aim 2.1: To model the TIC tissue mE mathematically based on 2D and 3D microscopy and image analysis
The microenvironment, including cellular and non-cellular components, is well known to play an important role in supporting and influencing the behavior of TIC. Image bioinformatics models will be developed to quantify the TIC tissue microenvironment images obtained from Component 1, so that the spatial distribution of the TIC mE can be modeled. Based on the quantified data, as well as data from the literature and online databases, we can apply ordinary differential equations (ODEs) and more sophisticated differential equations to describe the relationships among TIC and the molecules, enzymes, nutrients, and other cell types in the microenvironment (e.g., fibroblasts, vasculature, immune cells). This mathematical model is an effort to model tumor development in silico.
This will be a model at the cellular and tissue levels; however, molecular-level mechanisms are more fundamental to understanding the biological problem. Therefore, further experiments will be carried out in Specific Aim 1.2 of Component 1 based on feedback from the results obtained in Aim 2.1.

Aim 2.2: To predict the TIC pathways or key genes related to specific cancers so as to refine the TIC microenvironment model
Bioinformatic analysis of DNA microarray and proteomic data generated in Specific Aim 1.2 of Component 1, coupled with the genetic and pharmacologic manipulations of TIC function in Aims 1.3 and 1.4, will enable us to identify key candidate components in the pathways that are related to cellular behavior and survival. Subsequently, we will map these signaling pathway factors to specific tumor cell types, and further to specific cellular properties, by modeling those properties as functions of the factors. For example, $P_{sy} = f(x_1, \ldots, x_n)$ and $P_{asy} = g(x_1, \ldots, x_n)$, where $x_1, \ldots, x_n$ are genes/factors, and $f$ and $g$ are the functions that model the relationship between the symmetric or asymmetric division probabilities and the genes in TIC pathways. The TIC mE model will, in turn, be refined based on the newly inferred pathway and network information. With the network of genes integrated into the biomathematical model, predictions can be made by changing the parameter values for the network components, and a subset of key factors can then be identified. These predictions will guide the iterative functional genomics experiments in Aim 1.3 of Component 1 to focus on the most likely gene candidates.
Aim 2.3: To develop image bioinformatics models for mapping gene functional networks within and among TIC and niche cells from the directed iterative shRNA screen and to further refine the TIC mE model
We will develop image bioinformatics models for discovering gene functional networks by integrating gene function annotation results from the shRNA genome subset screening in Specific Aim 1.3 with publicly available multi-modality genomic data. We will first develop an integrated image analysis system for shRNA screens and score each gene based on its phenotypic information; we will then develop an image-based systems biology approach to study the gene functional networks. Biological processes are often orchestrated by groups of genes, so studying gene functional networks is important for understanding gene function. Combined with prior knowledge, the gene functional annotation results from the shRNA whole-genome study have the potential to identify known or suspected interacting proteins, immediate upstream regulators, and downstream targets. When compared with the predictions of the refined TIC mE model in Aim 2.2, new experimental data that are unanticipated by the model can be used to further improve our mathematical TIC mE model.

Aim 2.4: To model the response of TIC and their microenvironment to genetic and pharmacological manipulations of TIC function in vivo
Based on our ability to assay the relationship between exposure to signaling inhibitors and gene expression in relatively pure cell populations, as well as the mathematical model linking molecular-level data to the cellular and tissue levels, we can adjust the model to predict the response of TIC to new drug candidates.
Technologies will be designed to elucidate, interrogate, and model the role of physical forces in various cellular functions, including cellular ligand-receptor interaction, cell proliferation, differentiation, cell cycle progression, apoptosis, evolution of tumor phenotypes, and motility, in order to facilitate an increased understanding of the role that physical forces play in cancer pathology and metastasis. Different conditions (e.g., metastatic or non-metastatic stage, increased or decreased motility, changes in intracellular mechanics, and the ability of cells to interact with their environment) will all be included in modeling the distribution of tumor-initiating cells. Aims 2.4 and 1.4 will proceed iteratively to refine the mathematical model and thereby derive more robust drug candidates for inhibiting or managing TIC.

N4.2.B Background and Significance
N4.2.B.1 Research on TIC
Cancer is the second most common cause of death in the United States, surpassed only by heart disease. According to the American Cancer Society, death from cancer is due not only to its wide occurrence in nearly all human tissues (e.g., breast, brain, colon, head and neck, lung, pancreas, and prostate), but also to the extreme difficulty of curing cancer. The range of experimental data available is expanding so dramatically that it is rapidly exceeding our ability to form an intuitive understanding of the underlying mechanisms. Therefore, bioinformatics analysis and mathematical modeling are necessary to analyze and manage data and to provide informative and intuitive interpretation. Indeed, mathematical modeling and computer simulation are tools that can provide a robust framework for understanding cancer progression better.
Through a collaborative effort involving experts in mathematical modeling and experimental biologists who understand that modeling and experiments are complementary, we will expand our knowledge of cancer more efficiently and effectively. The mechanisms underlying cancer initiation, expansion, and progression are extremely complicated. Besides the genetic abnormalities that may or may not be present, the effects of the microenvironment on cells are also of great importance. The interactions between tumor cells and their regulatory factors (both cellular and non-cellular) in the microenvironment are complex. Cell-cell signaling mechanisms, as well as nutrient and oxygen concentrations, may vary spatially and temporally because of the distribution of blood vessels and the different cell types. Indeed, it is these variations in cell densities and in nutrient and oxygen concentrations that create the 'unique microenvironment' that is responsible for the growth or inhibition of cancers. While a large amount of data has recently been generated in cancer biology, it constitutes only 'component knowledge' of the 'system knowledge' for cancer. Integrating experiments in in vitro cell culture and in vivo animal models with mathematical methods for data analysis and prediction is deemed necessary, and this will be our goal in this proposal. Tumors have been found to be heterogeneous both structurally and functionally, with TIC as the key component. TICs have been isolated from many types of tumors, including breast cancer. It has gradually been accepted that, like normal stem cells, TICs in breast cancer act as the engine for tumor initiation, expansion, evolution, migration, and response to therapy [81]. It is thought that the proliferation
and differentiation of TICs significantly affect tumor growth dynamics and heterogeneity. Meanwhile, recent evidence shows that many pathways are involved in regulating the cellular behavior of TIC [82], including Wnt, Notch, Hedgehog, and EGFR. Currently, it is widely accepted that TIC self-renewal patterns are at least partially dependent on the specific microenvironment, or niche, which is a particular growth environment consisting of different cell types and extracellular matrix components [83-85]. Therefore, the properties of TICs can be better characterized by including information from the molecular, cellular, and tissue levels, which is a major goal of our proposal. Breast cancer cells can be separated conceptually into three cellular subpopulations (though intermediate subtypes might exist): tumor-initiating cells, progenitor cells, and “differentiated” cells, each of which can be present in different proportions within a given tumor. Like normal stem cells, TICs divide symmetrically to expand the TIC population and divide asymmetrically to produce progenitor cells, which in turn divide to generate “differentiated” tumor cells (cells contributing to the bulk of the tumor, and perhaps division-competent, but with no tumor-initiating potential). As these tumor cells divide, the tumor volume increases and nutrients and oxygen are consumed. When the nutrient and oxygen concentrations fall below a certain level, tumor cells become necrotic and some growth factors (e.g., VEGF) are secreted; angiogenesis then occurs in response to the gradient of growth factors. After this stage, tumor cells continue to proliferate and may invade normal tissues. Finally, metastasis occurs through blood vessels and/or lymph vessels. To understand the mechanisms underlying breast cancer, signaling pathway networks and their regulatory effects on tumor cell properties will be investigated as well.
Clearly, to devise effective clinical treatments, a sufficient understanding of breast cancer from the molecular to the cellular and tissue levels is required. We believe that the efficiency of basic and clinical research would be greatly enhanced through a complementary approach that couples computational modeling with a targeted experimental program. Throughout this proposal, the following key questions in the development of breast cancer will be considered: (1) How are tumor cells, especially TICs, distributed in the breast tissue? (2) How do TICs behave in the tumor microenvironment? (3) How does angiogenesis occur, and what is the relationship between vasculature and TICs? (4) What other signaling factors can lead to tumor growth and metastasis? (5) How are drugs delivered into the tumor microenvironment, and what are the responses of tumor cells? (6) What multiple-drug strategies can be employed to treat breast cancer efficiently? Bearing these questions in mind, we propose to 1) develop computational models that are grounded in experimental evidence, involving multiple linkages between tumor cells and their microenvironment; 2) make predictions and then guide experiments using the proposed models; and 3) refine both our proposed models and experiments by iterative improvement in response to feedback between them. If successful, we will have a framework to simulate the progression of breast cancer, which can be used to analyze and interpret experimental data, and most importantly, we will have developed a predictive tool that can guide the design of new experiments. By this means, experiments can be motivated by quantitative "engineering hypotheses" rather than qualitative hypotheses or intuition alone. Close collaboration between experts in breast cancer biology and engineering will quantitate known signaling mechanisms and allow identification of new signaling mechanisms between tumor cells and cells of the microenvironment.
Various state-of-the-art mathematical models applied to different spatial and temporal scales will be developed in conjunction with a series of experimental studies in order to improve our understanding of breast cancer.

N4.2.B.2 Drug response modeling
Breast cancer is thought to be a heterogeneous three-dimensional composite of fibrous and connective tissues, stromal components, vasculature, and multiple clones of cancer cells. Cancer drug therapy, which generally can be classified as either radiotherapeutic or chemotherapeutic (targeting tumor cells) or antiangiogenic (targeting vascular endothelial cells), can help control the growth of tumor lesions by impairing cell division or triggering apoptosis in tumor cells, or by inducing apoptosis in endothelial cells. However, the heterogeneity and three-dimensionality of the tumor environment present a challenge to drug assessment, both during development and in the clinic, consequently hindering development of effective therapies. Mathematical modeling and computer simulation are tools that can provide a robust framework for understanding cancer progression and response to drug treatment. Also, successful therapeutic agents must overcome biological barriers occurring at multiple space and time scales and still reach targets at sufficient concentrations. A computer simulator founded on the integration of experimental data and mathematical models can provide valuable insights into these processes and establish a technology platform for analyzing the effectiveness of drug treatments, with the potential to screen drug candidates cost-effectively and efficiently during the drug-development process. In particular, bio-computational modeling of tumor drug response has endeavored in the last two decades to address this need [42, 49, 86].
Doxorubicin cellular pharmacodynamics has been modeled [87]. Drug transport was modeled in spheroids versus monolayers [88]. A model capable of predicting intracellular doxorubicin accumulation that matched experimental observations was described [89]. Different drug kinetic effects in vitro were compared [90]. Models using multi-scale approaches (i.e., linking events at sub-cellular, cellular, and tumor scales [91]; studying vascularized tumor treatment [92]; and simulating nanoparticle effects [93]) have also been developed. Recently, a novel multi-scale computational model [46] extended a previous formulation of tumor growth founded in cancer biology [44, 47] to enable more rigorous quantification of diffusion effects on tumor drug response. This model can represent nonsymmetrical solid tumor morphologies three-dimensionally, thus providing the capability to capture the physical complexity and heterogeneity of the cancer microenvironment. This kind of integrative method, tightly coupling computational modeling with biological data, enhances the value of knowledge gained from current pharmacokinetic measurements. It further shows that such an approach could predict resistance based on specific tumor properties and thus improve treatment outcome. As conceptualized, tumor-initiating cells (TIC) may be responsible for tumor growth, maintenance, resistance to treatment, and disease relapse. If this hypothesis is right, it stands to reason that building TIC into the drug response model as a component dependent on the other tumor cells (TC) and other elements of the mE is a valid approach. The spatial distribution of TIC will differ from that of TC as the tumor grows; more precisely, TIC will lie closer to the vasculature because TIC are more viable than TC. Combined with the fact that vasculature is an important factor in drug response, the novel multi-scale model in our proposal will have biological relevance.

Figure 17. Schematic representation of the composition of a tumor, including the niche (the grey region in the dashed rectangle) for tumor-initiating cells.

N4.2.C. Preliminary Results
The CSMCaD investigation team has worked together on tumor modeling over the past two years. We introduce the preliminary results on TIC tissue microenvironment image analysis and modeling in Section N4.2.C.1. Preliminary results in pathway analysis are presented in Section N4.2.C.2, cellular image analysis and RNAi whole-genome screening in Section N4.2.C.3, and drug response modeling in Section N4.2.C.4.

N4.2.C.1: TIC Tissue Microenvironment Modeling
N4.2.C.1.1 Tumor population dynamics and stem cell niche
Experimental and clinical researchers have recently found that breast cancer can be separated approximately into three cellular subpopulations: stem-like cells, progenitor cells, and differentiated cells. Therefore, three compartments are used in our model: cancer stem cells, which produce progenitor cells, which in turn generate general tumor cells. The dynamics of these three cell populations are described by the mechanisms shown in Figure 17. Although it may not be the exact mechanism underlying breast cancer, we assume in our model that TICs divide either symmetrically to form two TICs (k1) or asymmetrically to form a TIC and a progenitor cell (k2). Progenitor cells divide symmetrically to form two progenitor cells (k3) with a short-term division capacity, and then lose their proliferation capacity and produce TCs (k4) via differentiation; TCs do not divide, but can be lost from the system through many processes, e.g., apoptosis. The death rates for TICs, PCs, and TCs are denoted by d1, d2, and d3, as shown in Figure 17. Taken together, the dynamics of the three cellular populations can be described by equations (1)-(3):

$$\frac{dN_{TIC}}{dt} = P_{sy}\,\rho_{TIC}\,N_{TIC} - d_1 N_{TIC} \tag{1}$$

$$\frac{dN_{PC}}{dt} = P_{asy}\,\rho_{TIC}\,N_{TIC} + \alpha\,\rho_{PC}\,N_{PC} - d_2 N_{PC} \tag{2}$$

$$\frac{dN_{TC}}{dt} = \rho_{PC}\,N_{PC} - d_3 N_{TC} \tag{3}$$

where $N_{TIC}$, $N_{PC}$, and $N_{TC}$ are the cell numbers of the three populations, and $P_{sy}$ and $P_{asy}$ are the probabilities of symmetric and asymmetric division of TICs. Since we assume that TICs divide only in these two ways, we have $P_{sy} + P_{asy} = 1$. $\rho_{TIC}$ and $\rho_{PC}$ are the division rates for TIC and PC. The parameter $\alpha$ in equation (2) is the expansion scale for PC, which is related to the proliferation capacity of PC. For simplicity, we use the function $2^{N_{div}}$ to fit this process, where $N_{div}$ is the maximal number of divisions.

Considering the function of cancer stem cells in relation to the niche, we make three hypotheses. First, TICs can only reside in their niches, where they obtain the nutrients and molecules necessary to survive and maintain their stemness. Once they leave the niche, they differentiate into progenitor cells and subsequently into terminally differentiated cells. Second, the niche is a specific anatomic location for stem cell populations, and thus we assume that the niche has a limited volume, namely a carrying capacity, for cancer stem cells. As a consequence, only a limited number of cancer stem cells can exist. Third, the probability of each division pattern, i.e., symmetric or asymmetric, is determined by the ratio of cancer stem cells to carrying capacity. As the niche is occupied by more and more cancer stem cells, asymmetric division becomes more likely. Therefore, we can integrate the effect of the niche into the system by describing the probability of symmetric division as a function of the number of TICs in the niche.
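The three-compartment dynamics of equations (1)-(3) can be explored numerically before the niche dependence is introduced. The sketch below is illustrative only: it holds the symmetric-division probability fixed, reads the expansion scale as multiplying the progenitor self-renewal term in equation (2), and uses hypothetical parameter values (per hour) and function names that are not taken from our parameter table.

```python
import math
import numpy as np

def rhs(n, p_sy, rho_tic, rho_pc, alpha, d1, d2, d3):
    """Right-hand side of the assumed reading of equations (1)-(3)."""
    n_tic, n_pc, n_tc = n
    p_asy = 1.0 - p_sy  # TICs divide only symmetrically or asymmetrically
    return np.array([
        p_sy * rho_tic * n_tic - d1 * n_tic,                           # eq. (1)
        p_asy * rho_tic * n_tic + alpha * rho_pc * n_pc - d2 * n_pc,   # eq. (2)
        rho_pc * n_pc - d3 * n_tc,                                     # eq. (3)
    ])

def simulate(n0, t_end, dt, **params):
    """Classical fourth-order Runge-Kutta integration with a fixed step dt (hours)."""
    n = np.array(n0, dtype=float)
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(n, **params)
        k2 = rhs(n + 0.5 * dt * k1, **params)
        k3 = rhs(n + 0.5 * dt * k2, **params)
        k4 = rhs(n + dt * k3, **params)
        n = n + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return n

# Hypothetical, illustrative values only.
params = dict(p_sy=0.5, rho_tic=0.007, rho_pc=0.014, alpha=1.0,
              d1=0.001, d2=0.02, d3=0.01)
n_tic, n_pc, n_tc = simulate([100.0, 0.0, 0.0], t_end=1000.0, dt=1.0, **params)

# With constant P_sy, equation (1) has the closed form N0 * exp((P_sy*rho_TIC - d1) * t).
expected_tic = 100.0 * math.exp((0.5 * 0.007 - 0.001) * 1000.0)
print(n_tic, expected_tic, n_pc, n_tc)
```

With a fixed symmetric-division probability the TIC compartment grows or decays exponentially without bound, which is the behavior the niche carrying capacity is intended to limit.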
Similarly to Roeder et al. [94], we model the relationship between $P_{sy}$ and $N_{TIC}$ by a general class of sigmoid functions:

$$P_{sy}(N_{TIC}) = f(N_{TIC}) = \frac{1}{v_1 + v_2 \exp(v_3\, N_{TIC}/N_{niche})} + v_4,$$

where $v_1$, $v_2$, $v_3$, and $v_4$ determine the shape of $P_{sy}$, and $N_{niche}$ is the maximal number of TICs that the niche can hold.

N4.2.C.1.2. Receptor-ligand interactions
In recent years, methods have been developed for modeling various signaling pathways. These methods can now be adapted to virtually any signaling pathway as long as a sufficient number of pathway components are known. As an example, we have modeled the EGF-EGFR pathway with respect to the TIC mE. Many studies suggest a significant correlation between the EGFR signaling pathway and cellular proliferation and survival [95-98]. Following the work of Eladdadi et al. (2008), we consider the proliferation rates and apoptosis rates as functions of the number of EGF-EGFR receptor complexes (Figure 18). It is worth noting that while the epidermal growth factor receptor family comprises four types, we use EGFR alone to represent the whole family for simplification. Indeed, our case can be seen as a special case of Eladdadi’s work, obtained by setting the parameters related to EGFR and HER2 to identical values.

Figure 18. A schematic representation of the effect of conventional treatment and novel treatment on tumor behavior through the EGF-EGFR pathway.

Normally, the binding of EGF to the extracellular domain of EGFR activates the receptors and leads to a series of interactions among activated receptors, recruited proteins, and plasma membrane molecules that eventually activate multiple downstream effectors, which are implicated in the control of proliferation and survival [98].
However, the presence of other ligands or molecules, such as lapatinib, which acts at the intracellular domain to block tyrosine kinase activity and thereby inhibits receptor autophosphorylation and receptor dimerization, blocks the activation of downstream pathways that are responsible for proliferation and apoptosis (e.g., PI3K/Akt) [99-101]. Although the receptor number varies among cell lines and cell types due to differential gene expression, we assume as a first approximation that the total receptor number (denoted by EGFRt) is constant in a given cell type. Considering the reactions that involve EGF-EGFR, two simple chemical reactions are described:

\[ EGF + EGFR \underset{k_r}{\overset{k_f}{\rightleftharpoons}} EGF{:}EGFR \quad \text{and} \quad EGF{:}EGFR + TKI \underset{k_u}{\overset{k_b}{\rightleftharpoons}} EGF{:}EGFR{:}TKI, \]

where $k_f$, $k_r$, $k_b$ and $k_u$ are rate constants. Since these chemical reactions are much faster than cellular behaviors such as cell proliferation, we use the quasi-steady state to obtain the concentration of EGF-EGFR. Using Michaelis-Menten kinetics, we derive the concentrations of EGF-EGFR and EGF-EGFR-TKI as follows:

\[ [EGF{:}EGFR] = \frac{[EGFR_t][EGF]}{K_{m1} + [EGF]} \quad \text{and} \quad [EGF{:}EGFR{:}TKI] = \frac{[EGF{:}EGFR][TKI]}{K_{m2} + [TKI]}, \]

where $K_{m1}$ and $K_{m2}$ are Michaelis constants. Therefore, the effective EGF-EGFR complex that can activate downstream factors is:

\[ [EGF{:}EGFR]_{eff} = [EGF{:}EGFR] - [EGF{:}EGFR{:}TKI]. \]

N4.2.C.1.3. Mathematical description of cell proliferation and apoptosis

Let us now consider how to link the cell proliferation rate and apoptosis with the effective EGF-EGFR:

\[ \lambda^{EE}_{act,eff,i} = \lambda_{max,i} \, \frac{[EGF{:}EGFR]_{eff}}{\lambda_{half} + [EGF{:}EGFR]_{eff}} \qquad (4) \]

\[ d^{EE}_{rep,eff,i} = \frac{d_{max,i}}{1 + [EGF{:}EGFR]_{eff} / k_d} \qquad (5) \]
Similarly to Monod [102] and Eladdadi et al. [97], we use Michaelis-Menten kinetics to model the saturating effect of the EGF-EGFR concentration on the cell proliferation rate, as in equation (4), where $\lambda_{max}$ is the maximum cell proliferation rate and $\lambda_{half}$ is the number of occupied receptors required to generate a half-maximal response. Similarly, the repression function of apoptosis (death rate) is modeled as equation (5), where $d_{max}$ is the maximum death rate and $k_d$ is a constant for the repression threshold. Substituting equations (4)-(5) into equations (1)-(3) for the dynamics of the tumor populations, we obtain a new modeling system that incorporates ligand-receptor reactions and the stem cell niche into the cellular dynamics of the tumor, as shown in equations (6)-(8):

\[ \frac{dN_{TIC}}{dt} = P_{sy}(N_{TIC}) \, \lambda^{EE}_{act,eff,TIC} N_{TIC} - d^{EE}_{rep,eff,TIC} N_{TIC} \qquad (6) \]

\[ \frac{dN_{PC}}{dt} = P_{asy} \, \lambda^{EE}_{act,eff,TIC} N_{TIC} - \lambda^{EE}_{act,eff,PC} N_{PC} - d^{EE}_{rep,eff,PC} N_{PC} \qquad (7) \]

\[ \frac{dN_{TC}}{dt} = \lambda^{EE}_{act,eff,PC} N_{PC} - d^{EE}_{rep,eff,TC} N_{TC} \qquad (8) \]

N4.2.C.1.4. Parameter value selection

The parameters used in this model are listed in Table 3, most of which are based on recent experimental data or the scientific literature. The maximal death rates for the three populations were derived from the work of Michor et al. [103]. The maximal division rate for PCs was estimated using the doubling time of the HB4a cell line, $t_{1/2} = 48$ hours: $\lambda_{max,2} = \ln(2)/t_{1/2} = 0.0143$ hour$^{-1}$. No data were available for estimating $\lambda_{max,1}$; we assumed that $\lambda_{max,1} = 0.5 \, \lambda_{max,2}$ based on findings that TICs divide more slowly than PCs. The number of receptor complexes required to generate a half-maximal response, $\lambda_{half}$, was adopted from the work of Eladdadi et al. $K_{m1} = k_r / k_f$ was calculated using the data from Hendriks et al.; we assumed the same value for $K_{m2}$.

Table 3. Parameters used in our mathematical model. MH2005 represents the paper of (Michor, Hughes et al. 2005); EI2008 represents the paper of (Eladdadi and Isaacson 2008); HO2003 represents the paper of (Hendriks, et al. 2003).
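The receptor-level quantities described above (the quasi-steady-state complexes, the saturating proliferation rate of equation (4), and the repressed death rate of equation (5)) can be sketched in a few lines of Python. This is a minimal illustration, not the proposal's implementation: the function and argument names are hypothetical, and the numeric values in the usage note are placeholders rather than the Table 3 parameters.

```python
def effective_complex(egfr_total, egf, tki, km1, km2):
    """Quasi-steady-state EGF:EGFR complex that is not blocked by TKI.

    Assumes the two-step Michaelis-Menten form given in the text:
    first the EGF:EGFR complex, then the fraction of it bound by TKI.
    """
    complex_ee = egfr_total * egf / (km1 + egf)
    complex_tki = complex_ee * tki / (km2 + tki)
    return complex_ee - complex_tki


def proliferation_rate(c_eff, rate_max, c_half):
    """Equation (4): proliferation saturates as the effective complex grows."""
    return rate_max * c_eff / (c_half + c_eff)


def death_rate(c_eff, d_max, k_d):
    """Equation (5): the death rate is repressed by the effective complex."""
    return d_max / (1.0 + c_eff / k_d)
```

For example, with placeholder values `effective_complex(100.0, 2.0, 1.0, 2.0, 1.0)`, half of the receptors are occupied by EGF and half of those complexes are blocked by TKI, so adding TKI lowers the proliferation rate and raises the death rate computed from the result.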
The initial concentration of EGF ligand was obtained from Hendriks et al., and the TKI concentration was initially set to $1.0 \times 10^{-9}$ M. The total number of receptors per cell was taken as the sum of EGFR and HER2 from experimental studies, which varies from 210000 (normal) to 800000 (high HER2 expression level) [97].

N4.2.C.1.5. Simulation results

The effect of EGFRt on tumor dynamics: As described previously, the total number of EGFR per cell varies over a large range across cell lines; however, the relationship between total EGFR and tumor growth cannot be obtained intuitively. Therefore, we simulated tumor volume over time while varying the total EGFR parameter in our model from 200000 to 800000. The simulation results in Figure 19A show that the tumor grows faster and reaches a larger final volume as the total EGFR increases. In addition to this growth advantage, our model also predicts that the dose-response of the tumor to total EGFR approaches saturation as the total EGFR increases. As a result, increasing the level of total EGFR beyond a threshold will not alter the tumor growth rate or final volume. As shown in Figure 19B, the dependence of cell proliferation rates on total EGFR becomes weaker (a smaller slope).

Figure 19. Simulated tumor growth with various total EGFR per cell. (A) compares the cell population growth. (B) shows the relationships between total EGFR numbers and tumor growth data (points taken from A at time=1500 and time=1600).

Tumor response to various treatments: Let us now consider the tumor response to different treatments. Without treatment, the tumor reaches its steady state after a certain time, with a constant volume and TIC percentage.
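The untreated steady state mentioned above can be illustrated by integrating a three-population TIC/PC/TC system with a niche-limited probability of symmetric division. The sketch below is a simplified reading of the model, under stated assumptions: proliferation and death rates are held constant rather than computed from the EGF-EGFR complex, the PC expansion factor is omitted, the sigmoid shape parameters and death rates are illustrative placeholders (not the Table 3 values), and a plain forward Euler scheme is used.

```python
import math


def p_sym(n_tic, n_niche, v=(1.0, 0.01, 7.0, 0.0)):
    """Sigmoid probability of symmetric TIC division (shape values assumed)."""
    v1, v2, v3, v4 = v
    return 1.0 / (v1 + v2 * math.exp(v3 * n_tic / n_niche)) + v4


def simulate(hours, n_niche=1000.0, lam_tic=0.00715, lam_pc=0.0143,
             d_tic=0.001, d_pc=0.002, d_tc=0.004,
             state=(10.0, 0.0, 0.0), dt=1.0):
    """Forward Euler integration of the TIC, PC and TC populations.

    Rates are per hour. lam_pc matches the 48 h doubling-time estimate in
    the text and lam_tic is half of it; the death rates are placeholders.
    """
    tic, pc, tc = state
    for _ in range(int(hours / dt)):
        ps = p_sym(tic, n_niche)
        # Symmetric divisions add TICs; death removes them.
        d_tic_dt = ps * lam_tic * tic - d_tic * tic
        # Asymmetric TIC divisions feed PCs; PCs differentiate or die.
        d_pc_dt = (1.0 - ps) * lam_tic * tic - lam_pc * pc - d_pc * pc
        # Differentiating PCs feed TCs; TCs die.
        d_tc_dt = lam_pc * pc - d_tc * tc
        tic += dt * d_tic_dt
        pc += dt * d_pc_dt
        tc += dt * d_tc_dt
    return tic, pc, tc
```

Calling `simulate(20000)` drives the system to a steady state in which the TIC number settles where `p_sym` times the TIC division rate equals the TIC death rate, so the niche capacity bounds the TIC pool and, through it, the whole tumor.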
Conventional treatments will be simulated by increasing the death rates of PCs and TCs by an additional 50%. Our simulation results in Figure 20 show that conventional treatment results in a decrease in tumor volume and an increased TIC percentage (Figure 20B), as observed experimentally in our preclinical and clinical studies. For the treatments with TKIs (e.g. lapatinib), the initial concentration of TKI was $1.0 \times 10^{-9}$ M. The effective EGF-EGFR complex is decreased due to the binding of TKIs to the intracellular domain of EGFR. Because we assumed that the amount of effective EGF-EGFR is related to an increased proliferation rate and decreased apoptosis, the presence of TKIs simultaneously represses the proliferation rate and increases the death rate of TICs, PCs and TCs. While the repression of tumor volume is obvious, as shown in Figure 20A, the increased TIC percentage is not expected. Indeed, Li et al. found that treatment with lapatinib (a TKI) led to a non-statistically significant decrease in the percentage of CD44+/CD24-/low cells, which are thought to be cancer stem cells [3]. Our explanation is that TKIs affect not only the proliferation and apoptosis rates but also the self-renewal pattern; that is, they shift symmetric division to asymmetric division, in other words, they induce the differentiation of TICs.

Figure 20. Simulated response of tumor to different treatments. (A) shows tumor volume changes over time and (B) shows TIC percentage changes over time without treatment (solid line), with conventional treatment (green dotted line) and with the treatments of TKIs. Note that while the red dashed lines represent treatments with TKIs that inhibit proliferation and induce apoptosis, the blue dash-dotted lines show treatments that include the change of self-renewal patterns as well.
Therefore, in addition to inhibiting the effective EGF-EGFR, we also multiplied $P_{sy}(N_{TIC})$ by 0.25 to model the shift of TIC division from a symmetric to an asymmetric pattern; this is again a first approximation and will be modeled with a continuous function in the future. The simulation results are shown in Figure 20 with blue dash-dotted lines. This time, both a decrease in tumor volume and a slight decrease in TIC percentage are observed when the TKI treatment is imposed, which is in agreement with the finding of Li et al. [3]. Furthermore, it is worth noting that even after treatment stops, the tumor volume still decreases for a certain time, and a sharp increase in TIC percentage occurs after the treatment. This can be understood in the following way: when the treatment stops, the tumor volume is small and TICs shift back to symmetric division; therefore, the TIC population increases while the PC and TC populations continue to decrease, resulting in a repression of total tumor volume but an increase in TIC percentage.

N4.2.C.2. Pathway Inference in regulating TIC cells

Mapping the key signaling molecules in biochemical pathways will be central to pathway study and drug discovery efforts [104-107]. However, the study of signal transduction currently faces two key challenges. First, the pathways themselves are far from being completely constructed. For example, the signaling pathway map for the various subtypes of breast cancer is still not available. Second, key pathway components suspected of participating in disease will have to be validated as potential drug targets. This might explain the high failure rate in the search for completely new targets and in the repositioning of old drugs. Therefore, it is critical for drug discovery to construct a signal transduction map that can reveal, in a cause-effect format, the correlation between drug targets and the pathways of interest in a particular disease.

Figure 21. The signal transduction mapping for TICs.
Signal transduction from the outside to the inside of a cell is carried out by many diverse molecules, such as hormones, growth factors, neurotransmitters, cytokines, cell adhesion proteins and small molecules. Protein-protein interactions (PPIs) play a key role in this biological process, and constructing signal transduction from PPIs is an important strategy in systems biology [108, 109]. In our study, we filtered protein-protein interaction data for Mus musculus (mouse) to obtain a Filtered Protein Network (FPN), based on five published PPI databases: DIP [110], BIND [111], MIPS [112], MINT [113] and IntAct [114]. Based on the filtered PPIs and available signaling pathways, we found that a special interaction pattern, the network motif [115, 116], plays a key role in the signal transduction of a cell. Network motifs are patterns that occur in different parts of a network at much higher frequencies than those found in randomized networks and have been proposed as the essential components of signal transduction. By examining network topology, we found that these motifs cluster between cancer signaling pathways and other signaling pathways. Microarray data on mouse mammary tumor-initiating cells generated in the Rosen lab were used to construct the essential signal transduction in TICs [4]. The cells were labeled (for FACS sorting) with CD29 and CD24 antibodies, and four subpopulations were collected (CD29HighCD24High, CD29HighCD24Low, CD29LowCD24High, and CD29LowCD24Low), either for in vivo transplantation or for isolation of RNA from each subpopulation for arrays to correlate with the in vivo data. Twelve samples (RNA of four subpopulations based on expression of CD29 and CD24 for each of three tumors) were included in the identification of differentially expressed genes of tumor-initiating cells. Five samples (RNA of subpopulations based on expression of CD29 and CD24 of normal mammary epithelial cells) were included in the normal group analysis.
A reference RNA was used to normalize all samples. The fold-changes (FCs) of genes were computed by comparing the CD29HighCD24High subpopulations with the other subpopulations. Four genes, Lsm5, Calm3, Bmi1 and Ezh2, were highly up-regulated in the CD29HighCD24High subpopulations (FC > 4). Bmi1 plays an important role in regulating the self-renewal capacity of hematopoietic as well as human mammary gland stem cells. Two genes, Calm3 and Bmi1, are involved in network motif clusters. Moreover, the differential genes (FC > 3 or FC < 0.25) are significantly enriched in the 'Cell Cycle', 'Calcium', 'Long-term Potentiation', 'Wnt', 'Phosphatidylinositol', 'Focal adhesion', 'MAPK', 'Adherens junction', 'Cell adhesion molecules (CAMs)', 'B cell receptor', 'T cell receptor', and 'Tight junction' pathways of KEGG (Kyoto Encyclopedia of Genes and Genomes) [117].

Construction of signal transduction mapping: The key signal transduction networks in TICs were constructed using the Multiple Objective Optimization Model (MOOM). The inputs of MOOM are the network motif clusters surrounding Calm3 and Bmi1, the FCs of differential genes, and the enriched signaling pathways. The outputs of MOOM are protein-paths with the differential expression of genes and the cellular processes of signaling pathways. The component proteins in the output protein-paths satisfy two conditions: (1) they are the most differentially expressed in Lin−CD29HCD24H cells, and (2) they pass through a large number of the enriched signaling pathways identified from the differential genes. These output protein-paths can be combined into a network through their shared proteins, as shown in Figure 21.
Most of these genes are either up-regulated or down-regulated in the CD29HighCD24High subpopulations. Using IPA (Ingenuity Pathway Analysis) software, we identified that the up-regulated genes (in red) can accelerate several functions essential to the survival and renewal of TICs, such as 'Cell Growth and Proliferation,' 'Hematological System Development and Function,' 'Cancer,' 'Cell Cycle,' 'Cell Signaling,' 'Cellular Assembly and Organization,' and 'Cellular Development,' and that the down-regulated genes (in green) can decelerate many functions related to 'Post-translational Modification,' 'Cell Growth and Proliferation,' 'Cell-To-Cell Signaling and Interaction,' 'Cell Signaling,' and 'Cell Death.' The signal transduction mapping is important for revealing the key molecules involved in TICs and uncovering the molecular mechanisms for the survival or renewal of TICs.

N4.2.C.3. Cellular image analysis for the Directed Iterative Functional Genomics shRNA screen

Similar to colony-forming assays under anchorage-independent growth conditions in soft agar, we have used mammosphere-forming efficiency (MSFE) as a surrogate for tumor-initiating and progenitor cell function and as a measure of cells capable of self-renewal in breast cancer specimens. TICs and at least some progenitor cells are able to survive and generate spheroid structures termed mammospheres (MS) by anchorage-independent growth in suspension culture. When derived from malignant mammary epithelial cells, MS can be transplanted to form tumors with the same cell type as the parental tumor.

3D Breast Cancer Mammosphere Segmentation: To quantitatively analyze the individual cells from the 3D mammosphere, cell segmentation is necessary. In 3D cell segmentation, cell contact and intensity variation are two major challenges. The energy function and evolution equation of the segmentation method described below are:

\[ E = \lambda_o \sum_{i=1}^{M} \int_\Omega |I - c_i|^2 \, (1 - H(\phi_i)) \, dv + \lambda_b \int_\Omega |I - c_b|^2 \prod_{i=1}^{M} H(\phi_i) \, dv + \nu \sum_{i=1}^{M} \int_\Omega g(I) \, |\nabla H(\phi_i)| \, dv + \omega \sum_{i=1}^{M} \sum_{j=i+1}^{M} \int_\Omega (1 - H(\phi_i))(1 - H(\phi_j)) \, dv \qquad (9) \]

\[ \frac{\partial \phi_i}{\partial t} = \delta(\phi_i) \left[ \lambda_o |I - c_i|^2 - \lambda_b |I - c_b|^2 \prod_{j=1, j \neq i}^{M} H(\phi_j) + \nu \, \mathrm{div}\!\left( g \frac{\nabla \phi_i}{|\nabla \phi_i|} \right) + \omega \sum_{j=1, j \neq i}^{M} (1 - H(\phi_j)) \right] \qquad (10) \]
In this proposal, we implement a segmentation method, consisting of cell center detection and cell body segmentation, to deal with these two challenges. To detect the cell centers, the iterative voting method proposed in [118] is employed. In summary, the iterative voting method filters the images with oriented Gaussian kernels, whose topography is refined and reoriented iteratively, to detect the cell centers. The cell centers can then be detected using a simple thresholding method, e.g. Otsu's method [119]. This method is highly robust to noise and has been shown to be tolerant to perturbation in scale. For details, please refer to [118]. To separate the cells from the background, we apply the Otsu threshold method in local regions. However, the local adaptive threshold method cannot separate clustered cells. The aim of cell center detection is to provide 'seed' information that allows the cell body segmentation to separate clustered cells. To segment the cell body, we propose a modified multiple level set method. Each cell is represented by one level set function, and the spheres around the detected cell centers with radius r (which is empirically set to the minimum radius of cells) are used as initial contours. Specifically, in the proposed multiple level set method, we integrate the intensity information [120] (the first and second terms in the energy function) with the geodesic length information [121] (the third term in the energy function). In addition, we introduce an interactive energy (the fourth term in the energy function) to prevent adjacent level sets from overlapping. The energy function is provided in Equation (9). The evolution equation for each $\phi_i$, as seen in Equation (10), is then obtained by deducing the associated Euler-Lagrange equation.
In Equations (9) and (10), $M$ denotes the number of cells; $\Omega$ is the image domain; $c_i$ is the average intensity inside the i-th cell, and $c_b$ is the average intensity of the background; $H$ represents the Heaviside function, and $\delta$ is the Dirac function; $g$ is the edge indicator function. Representative breast cancer images and segmentation results are provided in Figure 22.

Figure 22. A representative breast cancer mammosphere 3D image. (a), (b) two slices of the mammosphere; (c), (d) 3D visualization from two different view perspectives. (e) 3D rendering of the segmented cells.

We will extend the 2D feature extraction in Section N.3 to 3D. The 3D features include 3D Zernike descriptors, 3D Haralick co-occurrence features, spherical harmonic features, regional features, and phenotype shape descriptors. The last two types of 3D features are extensions of 2D features; we briefly describe the other three features below.

3D Zernike descriptors: 2D Zernike moments were found to be superior to others in terms of noise sensitivity, information redundancy and discrimination power [122]. Guided by this, Canterakis [123] generalized the classical 2D Zernike polynomials to 3D. However, Canterakis considered mostly theoretical aspects, and 3D Zernike moments, which are derived directly from 3D Zernike polynomials as in the 2D case, are not invariant under rotations.

3D Haralick co-occurrence features: Texture is one of the most commonly used features for analyzing and interpreting images; it measures the variation of the intensity of a surface and quantifies properties such as smoothness, coarseness, and regularity. Although the traditional Haralick texture features were defined for 2D images, they have been extended to 3D volumetric data [124-126]. In the 3D case, the Haralick features can be calculated in the same way as in the 2D case. It is worth noting that the number of directions used in the computation of co-occurrence matrices is 13 in the 3D case instead of 4 in the 2D case.
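The direction counts quoted above (4 in 2D, 13 in 3D) follow from enumerating the unit-step neighbor offsets and keeping one representative of each opposite pair, since a symmetric co-occurrence matrix treats an offset and its negation as the same direction. The short Python sketch below checks this count; the function name is hypothetical and the enumeration is only an illustration, not the feature extraction code itself.

```python
from itertools import product


def unique_offsets(ndim):
    """Return one unit-step offset per direction in an ndim-dimensional grid.

    All nonzero offsets with components in {-1, 0, 1} are generated, and
    only the lexicographically larger member of each (d, -d) pair is kept.
    """
    offsets = []
    for d in product((-1, 0, 1), repeat=ndim):
        opposite = tuple(-c for c in d)
        if any(d) and d > opposite:
            offsets.append(d)
    return offsets
```

Here `len(unique_offsets(2))` gives 4 and `len(unique_offsets(3))` gives 13, i.e. (3^n - 1)/2 directions in n dimensions.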
Spherical harmonic features: Spherical harmonics are important in many theoretical and practical applications. Moreover, in 3D computer graphics, spherical harmonics play a special role in a wide variety of topics, including indirect lighting and the recognition of 3D shapes. Herein, we propose a novel kind of feature, called the spherical harmonic feature, which is based on the rotation-invariant spherical harmonic representation proposed in [127], to describe the shape of a 3D tumor for our specific research purpose.

Multidimensional drug profiling using Kullback-Leibler divergence (KLD): The heterogeneous morphological structures of cells indicate the influence of different drug treatments [128-130]. Using the KLD metric, we can profile the drug effects compared with control. For each segmented cell image, we calculated 211 quantitative features [8, 131]. Given a quantitative feature, we obtained two populations: control and drug treatment. The KLD metrics between control and treatments are then calculated as follows. For each plate, we collected the cells from the control wells together as the pooled control population. For each replicate of the treatment, we first generated a sub-population of the same size as the replicate by sampling with replacement from the pooled control population, and then calculated the KLD between the sub-population and the replicate for each quantitative feature.

N4.2.C.4. Preliminary Results of Drug response modeling

We (Cristini and colleagues) use a multiscale computational model [46], extending a previous formulation of tumor growth founded in cancer biology [40, 43, 44, 47, 48], to enable more rigorous quantification of diffusion effects on tumor drug response. This model can represent nonsymmetrical solid tumor morphologies three-dimensionally, thus providing the capability to capture the physical complexity and heterogeneity of the cancer microenvironment.
More significantly, we fully constrain the model through functional relationships with parameters set from experiments. We hypothesize the simplest relationships that are at the same time biologically founded and can be calibrated by the experimental data. These relationships link tumor mass growth and regression to the underlying phenotype. We provide the mathematical basis for describing cell mitosis, apoptosis, and necrosis modulated by diffusion gradients of oxygen, nutrients, and cytotoxic drugs, and enable quantification of the physiologic resistance introduced by these gradients. Input parameters include the diffusion coefficients for these substances and the rate constants for proliferation, apoptosis, and necrosis. We measure parameter values from independent experiments done under conditions with no gradients (i.e., with cells grown as one-dimensional monolayers) and then use these values to calculate cell survival in three-dimensional tumor geometry. These values are compared with experiments in which cells are grown as in vitro tumor spheroids, representing a three-dimensional tumor environment with diffusion gradients. This approach allows us to (a) fully constrain the computational model, using experimentally obtained parameter values, and (b) validate the hypothesized functional relationships by comparing the computed three-dimensional tumor viability with the spheroid tumor growth experiments. By quantifying the link between tumor growth and regression and the underlying phenotype, the work presented here provides a quantitative tool to study tumor drug response and treatment.

N4.2.D. Research Plan

In this component, we propose to develop biomathematical modeling of the TIC tissue mE in Section N4.2.D.1, pathway analysis and TIC mE remodeling in Section N4.2.D.2, image bioinformatics for target discovery and TIC mE remodeling in Section N4.2.D.3, and in-silico models of TIC behavior and drug treatment response in Section N4.2.D.4.

N4.2.D.1. Aim 2.1: TIC 3D Tissue Image Analysis and TIC Tissue mE modeling

N4.2.D.1.1 TIC 3D Tissue Image Analysis

Stacks of confocal images will be used to reconstruct the spatial distribution of Wnt-, Notch-, and Hedgehog-responsive cells, which, at least in the case of Wnt-responsiveness, represent the TIC. Additional reporters will be employed as they become available. We will first reconstruct the 3-D structure of TIC cell populations separately. Each cell and blood vessel (or macrophage, fibroblast etc.) in the TIC tissue microenvironment will be segmented and registered with a 3-D coordinate, so that the TIC population can be obtained to study its 3-D distribution. The accurate detection and segmentation of 3-D cells and vessels is a crucial prerequisite for this subtractive reconstruction. Despite the many commercial image analysis packages available, cell cytoplasm detection and segmentation remain challenging problems due to complicated cellular appearances such as irregular shapes, touching cells, and intensity heterogeneity. We (Wong and Zhou labs) have developed software packages for cellular image segmentation [8, 13, 113, 132-134], and we have successfully used the algorithms we developed to derive a 3-D segmentation scheme for breast cancer cells (Figure 23). Armed with these algorithms, we seek to accomplish the 3-D subtractive reconstruction that cannot be done using any current commercially available software.

Figure 23. 3-D segmentation of breast cancer cells in a tissue section; cells are labeled with different colors for clarity.
The bioimage analysis software we developed will be freely available to the research community after publication. The spatial distribution of the TIC cell population will be analyzed to see whether the cells are highly clustered. A classical tool, Ripley's K function, will be used to analyze the spatial point pattern [135]. The K function is defined as K(t) = E(d<t)/λ, where E(d<t) denotes the number of cells within a distance t of an arbitrary cell, and λ is the density of cells (mean number of cells per unit area). The density can be estimated by N/A, where N is the observed number of points and A is the area of the field of view. Using the Ripley's K function test, we will be able to determine whether TIC cells are distributed randomly or clustered in a breast tumor. Stacks of confocal images will be used to reconstruct the tumor vasculature (or other cell types). Considering the structure of the vasculature, we will employ a curvilinear structure detector [136] to detect the center lines, then use center-line-based level sets [136] to segment the cells, and finally obtain the reconstructed 3-D vasculature by surface rendering. The shortest distance between a cell and the vascular surface will be calculated for each tumor-initiating cell. The distance distribution will be plotted and compared with the control group to determine whether TIC cells are specifically concentrated in perivascular regions.

N4.2.D.1.2 TIC tissue microenvironment (mE) modeling

Angiogenesis and Blood Flow: We simulate tumor-induced angiogenesis in 3D using a hybrid continuum-discrete, lattice-free random walk model, which refines earlier work [137-140]. The model generates the vasculature based on tumor angiogenic regulators, represented by a single variable describing the excess of pro-angiogenic regulators (e.g., VEGF).
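For the Ripley's K analysis of TIC clustering proposed earlier in this subsection, a naive estimator of K(t) = E(d<t)/λ can be sketched as follows. This is only an illustration under simplifying assumptions: no edge correction is applied, the quadratic pairwise loop is suitable only for small point sets, and the function and argument names are hypothetical. Under complete spatial randomness in 2D, K(t) is expected to be close to πt², so values well above that suggest clustering.

```python
import math


def ripley_k(points, t, area):
    """Naive Ripley's K estimate at distance t, without edge correction.

    points: sequence of coordinate tuples; area: size of the field of view.
    """
    n = len(points)
    density = n / area  # lambda estimated as N / A
    neighbors = 0
    for i, p in enumerate(points):
        for j, q in enumerate(points):
            # Count ordered pairs of distinct points closer than t.
            if i != j and math.dist(p, q) < t:
                neighbors += 1
    mean_neighbors = neighbors / n  # estimate of E(d < t)
    return mean_neighbors / density
```

The same function accepts 3-D coordinates, in which case `area` would have to be replaced by the volume of the imaged region; that substitution is an assumption of this sketch and not part of the definition quoted above.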
Endothelial cells (ECs) near the sprout tips proliferate and migrate by chemotaxis and haptotaxis [137, 138]. We model non-Newtonian blood flow with hematocrit in the neovasculature by adapting and refining the network flow equations in [43, 44] to our new, lattice-free vasculature [47, 141]. Similarly to [142], we determine the vessel radius in each network segment by balancing forces: vessel blood pressure (tied to the flow rate) and wall elasticity; and external tumor-generated stresses. This dynamic coupling between external mechanical forces, vessel radius, and flow can then feed back into the angiogenesis model by altering the branching probabilities [47, 141].

Transport of Nutrients, Drug, and Growth Factors: The vasculature releases nutrients (oxygen and glucose) σ that diffuse through the tissue and are taken up by cells during metabolism, while tumor cells secrete VEGF ($n_V$) in response to hypoxia. Drug d (e.g. erlotinib) extravasates from the microvasculature and diffuses through the tissue. On the time scale of tumor proliferation (days), we can assume that the governing equations are quasi-steady [40, 47, 143]. These assumptions are formulated as in Equation 11:

\[
\begin{aligned}
0 &= D_\sigma \nabla^2 \sigma + \delta_{vessel} \, \nu(a_{vessel}, u_b)(1 - \sigma) - (\lambda_{\sigma,U,V} \rho_V + \lambda_{\sigma,U,H} \rho_H) \sigma - \lambda_{\sigma,D} \sigma && \text{(nutrient transport)} \\
0 &= D_d \nabla^2 d + \delta_{vessel} \, \nu(a_{vessel}, u_b)(1 - d) - (\lambda_{d,U,V} \rho_V + \lambda_{d,U,H} \rho_H) d - \lambda_{d,D} d && \text{(drug transport)} \\
0 &= D_{n_V} \nabla^2 n_V + \lambda_{n_V,S} \, \rho_V \, H(\sigma_H - \sigma) - \lambda_{n_V,D} n_V - \delta_{vessel} \lambda_{n_V,U} n_V && \text{(VEGF transport)}
\end{aligned}
\qquad (11)
\]

In the nutrient and drug equations, the terms represent, in order, diffusion, release by vessels, uptake by all cell species, and decay; in the VEGF equation, they represent diffusion, release by hypoxic tumor cells, decay, and uptake by ECs. Here $D_\sigma$, $D_d$ and $D_{n_V}$ are diffusion constants, $\delta_{vessel}$ (Dirac delta function) gives the microvasculature position, ν is the delivery rate (which depends upon $a_{vessel}$ and $u_b$), and $\lambda_{\sigma,U,i}$, $\lambda_{\sigma,D}$, and $\lambda_{n_V,S}$ are the uptake, decay, and secretion rates. Also, H is the Heaviside "switch" function, defined to be 1 where $\sigma_H - \sigma > 0$ (in hypoxic regions) and zero elsewhere.
Tumor Growth and Tissue Invasion: While the mathematical model is described in greater detail elsewhere [39, 44, 47, 48], we provide an overview and adapt our model to include treatment of tumor-initiating cells (TIC), progenitor cells (PC), and tumor cells (TC). Our general approach is to treat cells and tissue according to classical physical conservation laws (making the model mechanistic) while incorporating the appropriate biology as carefully chosen constitutive relations among the physical variables.

3D Distribution of Cell Species: Tissue is modeled as a mixture of interstitial fluid, ECM, and various cell species with densities $\rho_V$ (viable tumor cells, including TIC, PC, and TC sub-populations), $\rho_H$ (viable host), and $\rho_D$ (apoptotic and necrotic tumor and host cells). Cell-cell and cell-ECM mechanical interactions are modeled with a flux J that generalizes Fick's Law [44]. The rate of change in $\rho_i$ (i = V, D, H) is determined by balancing cell advection ($\nabla \cdot (\mathbf{u}_i \rho_i)$, where $\mathbf{u}_i$ is the velocity of the cell species), cell-cell and cell-ECM interactions (adhesion, cell incompressibility, chemotaxis, and haptotaxis; incorporated in $\mathbf{J}_i$), and net cell creation ($S_i$: proliferation minus apoptosis and necrosis) [44, 47]. See Equation 12:

\[ \frac{\partial \rho_i}{\partial t} + \nabla \cdot (\mathbf{u}_i \rho_i) = -\nabla \cdot \mathbf{J}_i + S_i, \quad \text{for } i = V, D, H \qquad (12) \]

Proliferation, Apoptosis, and Necrosis ($S_i$): Each species' density $\rho_i$ (i = V, D, H) increases through proliferation and decreases by apoptosis and necrosis. For simplicity, we assume these primarily affect tumor mass through water transport in the tissue and hence neglect their solid fraction [48].

Cell-Cell Interactions ($\mathbf{J}_i$): Tumor and host cells adhere to one another but preferentially adhere to like cells, modeling (i) relatively low host cell density; (ii) degradation of the stroma by MMPs [144, 145]; and (iii) cell sorting experiments showing differential adhesion [146].
Thus, the tumor and host interface is relatively well-delineated, with an interface thickness that depends upon the interaction of the forces given above. We express these effects with a mechanical interaction potential function E that depends upon $\rho_i$; the precise form of E is given in [48].

Cell Species Velocity ($\mathbf{u}_i$): The movement of cell species i is determined by the balance of proliferation-generated pressure, adhesion, chemotaxis (due to substrate gradients), and haptotaxis (due to gradients in the ECM density f). We model the motion of cells and interstitial fluid through the ECM as the flow of a viscous, inertialess fluid through a porous medium [44].

Tumor TIC-PC-TC Dynamics (Tumor Composition): The tumor cell density $\rho_V$ consists of TIC, PC, and TC, which differ in cellular properties such as proliferation rate, differentiation capacity, chemotaxis strength, and cell-cell adhesion. In any time interval (t, t + Δt], TICs divide symmetrically into two identical daughter TICs with probability $P_{TIC \to TIC}$, and asymmetrically into a TIC and a PC with probability $P_{TIC \to PC}$. In turn, PCs divide symmetrically into two daughter PCs with probability $P_{PC \to PC}$, and divide symmetrically into two daughter TCs with probability $P_{PC \to TC}$. Tumor cells divide symmetrically into two TCs with probability $P_{TC \to TC}$. Each of these cell types has a probability of apoptosing $P_{i,A}$, where i = TIC, PC, TC. Once averaged within any fixed volume, we obtain the system of Equations (13)-(15):

\[ \frac{1}{\rho_V} \frac{d (f_{TIC} \rho_V)}{dt} = (P_{TIC \to TIC} - P_{TIC,A}) \, f_{TIC} \qquad (13) \]

\[ \frac{1}{\rho_V} \frac{d (f_{PC} \rho_V)}{dt} = 2 P_{TIC \to PC} \, f_{TIC} + (2 P_{PC \to PC} - P_{PC \to TC} - P_{PC,A}) \, f_{PC} \qquad (14) \]

\[ \frac{1}{\rho_V} \frac{d (f_{TC} \rho_V)}{dt} = P_{PC \to TC} \, f_{PC} + (P_{TC \to TC} - P_{TC,A}) \, f_{TC} \qquad (15) \]

Note that each of the transition probabilities above depends upon the microenvironment and genetic makeup of the cell constituents.
Furthermore, notice that the fixed transition probabilities for fixed time intervals used in many models (e.g. [147]) can be derived from the more general stochastic process detailed in [148]. PC , A PC Discrete Cell-Scale Agent Model: Each cell is modeled as a semi-deformable sphere with radius r, mass m, distribution of surface adhesion receptors E and I (e-cadherin and integrins), position x, velocity v, phenotypic state S (proliferating P, apoptosing A, migrating M, quiescent Q, or necrotic N), and classification C (TIC, PC, or TC) (Figure 24). Phenotypic Transitions: Each non-quiescent state (P, M, A, or N) has a fixed completion time (βP-1, βM-1, βA-1, and βN-1, respectively). We assume the cell cannot leave P or M prior to completion, except for necrosis (N) or apoptosis (A; during therapy). Transitions from Q to A and P are regulated by stochastic processes that depend upon the microenvironment[149]. After proliferation is complete, the daughter cells’ classifications are assigned according to the symmetric/asymmetric proliferation probabilities in the TIC-PC-TC dynamics above. Transitions to the apoptotic and motile states are governed similarly, and cells become necrotic if σ < σH for a longer time period than a fixed “survival” time βH-1. Balance of Forces: The cell’s velocity is obtained by balancing the forces acting upon it: cell-cell adhesion and repulsion, cell-basement membrane (BM) adhesion and repulsion, and the net force of migration balance with interstitial fluid drag and cell-ECM adhesion by an “inertialess” assumption[43]. Motility: Effects of several signaling pathways will be considered. For example, to approximate EGF-EGFR signaling dynamics, we choose direction and strength of the motile force randomly with a bias towards the maximum EGFR activation[43, 96, 150, 151]. 
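The daughter-classification step of the agent model described above can be illustrated with a minimal sketch. It assumes that a division has already been triggered, so the weights below are conditional on division rather than per-interval probabilities, and it omits apoptosis. The numerical values and the names `DIVISION_OUTCOMES`, `assign_daughters`, and `grow` are hypothetical and are not taken from the proposal.

```python
import random

# Hypothetical outcome weights, conditional on a division having occurred.
# The proposal gives no numerical values; these are placeholders only.
DIVISION_OUTCOMES = {
    "TIC": [(("TIC", "TIC"), 0.3), (("TIC", "PC"), 0.7)],  # symmetric / asymmetric
    "PC": [(("PC", "PC"), 0.6), (("TC", "TC"), 0.4)],      # self-renew / differentiate
    "TC": [(("TC", "TC"), 1.0)],                           # symmetric only
}


def assign_daughters(parent, rng):
    """Draw the classifications of the two daughters of a dividing cell."""
    outcomes = DIVISION_OUTCOMES[parent]
    pairs = [pair for pair, _ in outcomes]
    weights = [weight for _, weight in outcomes]
    return rng.choices(pairs, weights=weights, k=1)[0]


def grow(population, generations, rng):
    """Let every cell divide once per generation (no death, no quiescence)."""
    for _ in range(generations):
        next_population = []
        for cell in population:
            next_population.extend(assign_daughters(cell, rng))
        population = next_population
    return population
```

In the full agent model this draw would be made only when a cell completes state P, and the weights would depend on the local microenvironment rather than being fixed.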
Tissue Geometry: Breast ducts and other basement membrane structures are represented with a signed distance function d with d > 0 inside the lumen, d = 0 on the BM, and d < 0 outside the lumen; the gradient of d gives the normal vector n to the interior BM surface. This method is well-suited to describing the complex BM topology [43].
Molecular dynamics and their effects on cellular properties: Because the cells’ phenotypic properties (e.g., proliferation and apoptosis rates) are not constant throughout the time course of growth (nor are they uniform across the tumor [152]), we will investigate the inclusion of a molecular-scale signaling model into the cell agents. Details are described in N4.2.D.2.
Figure 24. A schematic of the agent cell model.
Integration Across Scales, and Calibration: We generally use the tissue-scale continuum model for overall performance of the simulator, with focused application of the cell-scale model in “patches” where molecular- and cell-scale dynamics are thought to be important (e.g., in hypoxic regions). To achieve this, we apply the equation-free approach (EFA) with “gaptooth” dynamics in the selected patches. The EFA, which we are currently investigating with colleagues in ongoing collaborations across a variety of cancers, accelerates the discrete code in the patches using coarse time projection/integration methods [153-157]. The result is a hybrid, multi-scale tumor simulator that benefits from the efficiency of the continuum model while dynamically applying the discrete model where most needed. We further accelerate 3D whole-organ simulations by (i) using the adaptive multi-grid techniques developed by our group [39, 44, 47, 48] and (ii) parallelizing the code [158].
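The coarse time projection idea mentioned above can be sketched in a few lines. This is only an illustration under strong assumptions: the fine-scale simulator is replaced by a toy scalar model (explicit Euler for dx/dt = -x) standing in for the discrete cell code, and the function names and step sizes are hypothetical.

```python
def fine_step(x, dt):
    """Stand-in for one step of the fine-scale simulator (toy model dx/dt = -x)."""
    return x + dt * (-x)


def coarse_projective_integration(x0, fine_dt, burst_steps, jump, n_cycles):
    """Alternate short fine-scale bursts with extrapolation over a longer jump."""
    x, t = x0, 0.0
    for _ in range(n_cycles):
        prev = x
        # Short burst of the expensive fine-scale model.
        for _ in range(burst_steps):
            prev, x = x, fine_step(x, fine_dt)
            t += fine_dt
        # Estimate the coarse time derivative from the last two fine states,
        # then project forward without running the fine model.
        slope = (x - prev) / fine_dt
        x += jump * slope
        t += jump
    return t, x
```

With `fine_dt=0.01`, `burst_steps=10`, and `jump=0.05`, one third of the simulated time is covered by extrapolation alone; the saving grows with the ratio of jump length to burst length, at the cost of accuracy and stability limits that depend on the fine-scale dynamics.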
Information flows between scales through dynamic up-scaling and down-scaling, which helps to rigorously integrate data coming from a variety of spatiotemporal scales (Figure 25). Below, we give the essential points of the calibration; greater detail can be found in [41], where we applied the tissue-scale portion of the calibration to glioblastoma, and in [148, 159, 160], where we applied the cell-scale protocols to ductal carcinoma in situ of the breast.
A: We initialize the tumor and neo-vasculature from macroscopic measurements such as MRI or CT.
B: We seed the cell population with the percentage of cells in the different states, cell cycle and apoptosis times [148, 159, 160], and EGFR signaling calibration [96, 150, 151]. The other variables are interpolated from (A).
C: The cell-scale proliferation, apoptosis, and motility parameters are up-scaled to calibrate the continuum scale.
D: We update the neo-vasculature, transport, cell densities, and tissue biomechanics.
Looping: After completing (A)-(D), we loop (B)-(D) to simulate the entire lesion and the surrounding microenvironment while efficiently incorporating the finer spatiotemporal dynamics into the evolution.
Figure 25. Multi-scale workflow with data flow across biological scales.
The methodology described here can be easily adapted to study the effects of Notch, Wnt or Hh inhibitors with the data obtained in Specific Aims 1.1 and 1.4.
N4.2.D.2 Aim 2: predict the TIC pathways or key genes related to specific cancer
The bioinformatic analysis of DNA microarray and proteomic data, coupled with the genetic and pharmacological manipulations of TIC function, will enable us to identify key candidate components in the pathways that are related to cellular behavior and survival. Subsequently, we will map these signaling pathway factors to specific tumor cell types and further to specific cellular properties by modeling them as functions of the factors.
N4.2.D.2.1.
Modeling of the TIC evolution at the molecular level
Figure 26. A schematic representation for the system integrating information from molecular to cellular and then to tissue level. Panels: input gene/protein expression of X1, ..., Xn at time points t1, ..., tn (PCR; microarray); output cellular behavior psy, pasy at each time point (imaging; statistics); tissue properties CSC, PC, TC at each time point (markers; cytometry). The blue parts in the matrices are the data that we need for validation and that we can get via various techniques including biological and mathematical methods.
Cellular behavior is governed, in part, by a complex intracellular signaling network. While models at the cellular level cannot capture the mechanisms by which cell behavior is decided, they can fit the statistical results and predict the outcomes of certain conditions. Nevertheless, we also want to incorporate information from the lower biological level, namely the sub-cellular or molecular level, into our statistical models to further understand the underlying mechanisms governing cell behavior (Figure 26). Indeed, we have already considered the effects of micro-environmental factors on cellular behavior in Aim 2.1 while ignoring some important but complex networks connecting the environmental factors. Aim 2.2 includes this part to further enrich the proposed in-silico model. Several genes, such as BRCA1, HER-2, PTEN, and BMI-1, and many signaling pathways, such as Wnt, Notch, and Hedgehog, are involved in regulating stem cell behaviors. In breast cancer, activation of these signaling pathways, amplification of HER-2, or deletion of PTEN may lead to dysregulation of stem-cell self-renewal, resulting in expansion of the stem cell population [82].
Therefore, it is important to integrate the effect of signaling pathways on the determination of stem cell fate. There are many publications on the signaling pathways or networks related to stem cells [161-164], most of which, however, are described in a conceptual and qualitative manner. Based on this literature and known biological knowledge, we can construct mathematical models for pathways from microenvironment factors to nuclear targets and then to protein expression. However, the signaling pathways are usually complex; for instance, Hornberg et al. proposed a model of EGFR signaling [165] with 103 chemical species, 148 reactions, 97 independent reaction rates, and 103 initial conditions. Since our aim is to integrate molecular mechanisms into cellular models, we do not need to deal with the entire signaling pathways in this project. Instead, our strategy is to identify key factors in the pathways and simplify the extremely complex web of signaling pathways into a relatively simple but comprehensive mathematical model. An example of such a strategy is a biomathematical model for studying the effects of HER2 on cell proliferation by Eladdadi et al. [97]. The model described the interactions between a ligand (EGF) and the corresponding receptors (EGFR and HER-2) by using simple chemical reactions. Assuming that the total number of receptors that can initiate a signal transduction pathway is proportional to the number of cells per unit volume (i.e. cell density), they modeled the proliferative behavior of cells as a function of the numbers of HER-2 and EGFR receptors and of the growth factor EGF. Thus, the previously complex pathways linking microenvironment factors to cell fate are greatly simplified. This suggests how to add the signaling pathways into our initial model established in Aim 2.1 without an exponential increase in complexity.
As previously described, the Wnt, Notch, and Hedgehog (Hh) signaling pathways are probably involved in the regulation of stem cell fate. We will first identify key factors in these pathways based on the literature and on model analysis as in Hornberg et al. [165], i.e. sensitivity analysis of each factor, as well as by bioinformatics methods developed by us. For example, analysis of DNA microarray data and protein-protein interactions would enable us to find the candidate key components in the pathways that are related to cellular properties, such as the proliferation rate and patterns of stem cell division (Figure 4). After that, we map these signaling pathways to the cellular properties by modeling them as functions of these factors, thus integrating simplified pathway models (equations) into the cellular model. Mathematically,
$$p_{sy} = f(x_1, \ldots, x_n), \qquad p_{asy} = g(x_1, \ldots, x_n),$$
where $x_1, \ldots, x_n$ are the genes obtained above, and f and g are the functions that model the relationship between the symmetric or asymmetric division rate and the genes in TIC pathways. They can be estimated by using $x_1, \ldots, x_n$ and the symmetric or asymmetric rate at different time points. Since we have described in equations (1)-(2) that individual cell populations are functions of $p_{sy}$ and $p_{asy}$, that is,
$$CSC = f_1(p_{sy}, \ldots), \qquad PC = f_2(p_{asy}, \ldots),$$
where CSC and PC are cancer stem cells and progenitor cells, we can easily map $x_1, \ldots, x_n$ into the cellular populations through $p_{sy}$ and $p_{asy}$ as follows:
$$CSC = f_1(f(x_1, \ldots, x_n), \ldots), \qquad PC = f_2(g(x_1, \ldots, x_n), \ldots).$$
Figure 27. Integration of signaling pathway(s) into cellular decision of stem cell fate. Factors from the microenvironment, e.g. EGF, stimulate signaling pathways targeting nuclear factors, which in turn induce protein expression that modulates cell fate.
N4.2.D.2.2.
Identify key factors in signaling pathways that modulate TIC fate into the in silico model
As described above, we can evaluate the microarray data of the niche and TIC cells in different stages of tumor development. Based on these data and the available PPI information, we can infer the signaling pathways modulating TIC cell fate and progression of MM stem cells.
Network Motif clusters: Although there are many available databases for protein interaction networks, most of them suffer from false-positive data noise. To reduce the negative effects on findings derived from network topology, we filter a high-confidence physical protein network by requiring that every interaction be confirmed by at least two of the following databases: IntAct, DIP, MINT, and MIPS [112]. Using this method, we have identified a filtered protein network (FPN) comprising 2,684 proteins with 3,685 interactions. Network motifs are patterns that occur in different parts of a network at much higher frequencies than those found in randomized networks. They are the basic building blocks of all complex networks and are not distributed randomly in gene regulatory networks. In the filtered protein network, we also found that the network motifs are not distributed randomly and display a high clustering property within the network topology. We define the pattern or structure of multiple network motifs (at least two) sharing common proteins as a network motif cluster (NMC). To analyze the contribution of network motif clusters to human cancers, especially MM, a clustering P-value is proposed to evaluate the extent to which network motif clusters take part in the signaling pathways of cancers. For a protein in the clustering topology of the FPN, we can identify its surrounding network motifs.
Thus, the whole structure of all surrounding network motifs of a protein is called a motif cluster (MC), in which the protein shared by the network motifs is called the center protein (CP) (in red) and every network motif is called an arm of the center protein. For different NMCs, the CPs must be distinct from each other, but some arms of the MCs may be the same network motifs.
Figure 28. Optimal pathway discovery by diffusion maps technique. The optimal pathway from A to B in the mapped large protein interaction network in a spectral view: (a) diffusion distance defined from the dataset structure; (b) conventional Euclidean distance.
Network Inference using diffusion map: Our goal is to identify those high-confidence protein-paths from the NMCs linking up signaling pathways. Every high-confidence protein-path can be identified by a multiple-objective optimization model (MOOM) that quantifies the role of this path in signal transduction specific to TIC. However, the MOOM could fail when noise in the dataset biases the programming toward an unreasonable shortcut. To overcome this problem, we propose a workflow for defining a regulatory network from multi-modality data and unraveling signal transduction pathways using a diffusion map related method. This method measures gene relationships with a focus on connectivity and better reflects the geometrical structure of the whole dataset. We will use the diffusion mapping semi-group approach [166] [167] to search for the shortest path (see Figure 28). Our workflow starts with constructing a general regulatory network, represented by an undirected graph (G, E, W), where G is the set of genes of interest serving as vertices of the graph, E is the set of edges showing the interactions between different genes, and W is the weight on each edge.
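To make the connectivity-based notion of distance on a weighted graph (G, E, W) concrete, the following minimal sketch computes a diffusion distance directly from powers of the random-walk transition matrix, which avoids an eigen-decomposition. It assumes a symmetric, non-negative weight matrix in which every vertex has at least one edge. The six-gene example graph, the diffusion time `t=2`, and all function names are hypothetical.

```python
def transition_matrix(weights):
    """Row-normalize a symmetric weight matrix into random-walk probabilities."""
    return [[w / sum(row) for w in row] for row in weights]


def mat_mul(a, b):
    """Plain matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diffusion_distance(weights, i, j, t=2):
    """Distance between the t-step walk distributions started at i and at j,
    weighted by the inverse stationary distribution of the walk."""
    p = transition_matrix(weights)
    pt = p
    for _ in range(t - 1):
        pt = mat_mul(pt, p)
    total = sum(sum(row) for row in weights)
    stationary = [sum(row) / total for row in weights]
    squared = sum((pt[i][k] - pt[j][k]) ** 2 / stationary[k]
                  for k in range(len(weights)))
    return squared ** 0.5


# Hypothetical graph: two tightly connected triplets joined by one weak edge (2-3).
EXAMPLE_WEIGHTS = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
```

Genes within the same triplet come out closer than genes in different triplets, because the distance aggregates over all paths rather than following a single, possibly spurious, edge.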
The geometrical properties across this graph are then captured using a diffusion map, and pathway discovery can be done using conventional methods in the mapped space, where the connectivity between genes is preserved in a simple measurement. A Euclidean distance in the mapped space incorporates the local geometric structure around each gene; thus such a “diffusion measurement” is robust against outliers in the dataset.
N4.2.D.2.3. Modeling from signaling pathways to cell phenotype by using Flux Analysis
As described above, in Aim 1.2, experimental data corresponding to different cell types or various stages of one cell type can be obtained; in Aim 2.2, bioinformatics analysis of the data enables us to obtain quantitative expressions of all components (genes or proteins) in signaling pathways (i.e. Wnt, Notch, Hedgehog in this proposal). Based on the results of these two Specific Aims, we will be able to determine the interaction networks in these signaling pathways quantitatively. However, another critical question is how to infer the cell phenotypes (cellular behavior parameters) from the information related to signaling pathways. To this end, flux analysis will be performed on the signaling reaction network to calculate the flux for each signaling network. Suppose the network includes n reactants and m reactions. The dynamic mass balance of the signaling system is described using the flux rates of reactions $v \in \mathbb{R}^{m \times 1}$ and the time derivatives of the reactant concentrations $x \in \mathbb{R}^{n \times 1}$. Therefore, at the steady state of the network, i.e., when
$$\frac{dx}{dt} = S v = 0,$$
where $S \in \mathbb{R}^{n \times m}$ is the stoichiometric matrix, the system can then be translated into a linear programming problem:
$$\min_{v} \; c^T v \quad \text{s.t.} \quad S v = 0,$$
where c represents the objective function composition, in terms of the fluxes.
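As a hedged illustration of this linear program, consider a hypothetical three-reaction chain (uptake of A, conversion of A to B, and output from B); it is not one of the pathways in this proposal. The flux bounds below are an added assumption: with $Sv = 0$ as the only constraint the feasible set is a linear subspace, so the minimum is either zero or unbounded, and bounds on $v$ are needed for a meaningful optimum.

```latex
% Two reactants (A, B), three reactions (v_1: uptake of A, v_2: A -> B, v_3: output from B)
S = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix}, \qquad
v = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}, \qquad
S v = 0 \;\Longrightarrow\; v_1 = v_2 = v_3 .

% Maximizing the output flux v_3 corresponds to c = (0, 0, -1)^T.
% With the assumed bound 0 \le v_1 \le v_{\max}:
\min_{v} \; c^T v = -v_3
\quad \text{s.t.} \quad S v = 0, \;\; 0 \le v_1 \le v_{\max}
\;\Longrightarrow\; v^{*} = (v_{\max},\, v_{\max},\, v_{\max})^T .
```

Here the steady-state constraint forces the same flux through every step of the chain, so the optimum is set entirely by the assumed uptake bound.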
Once the programming problem is solved, the optimal solution gives the predicted flux distribution v in the cell. The relationship between the flux distribution and the cell phenotype will then be modeled under a basic linear assumption, e.g. $P_i = \alpha_i v + \beta_i$ for simplification, where $P_i$ are parameters for the cell phenotype, and $\alpha_i$ and $\beta_i$ are constants. Iterative feedback between experiments and simulation results will further refine the mathematical models.
N4.2.D.3. Aim 2.3: develop image bioinformatics models for discovering critical gene functional networks from “Directed Iterative Functional Genomic Screen” (DIFGS)
We seek to develop bioinformatics models for discovering gene functional networks by integrating gene function annotation results from the Directed Iterative Functional Genomics shRNA Screen in Component 1, Aim 1.3, with publicly available multi-modality genomic data. We will first develop an integrated image analysis system for shRNA screens and score each gene based on the phenotypic information. We will then develop an imaging-based systems biology approach to study the gene functional networks. Biological processes are often mediated by an orchestra of genes. Thus, gene functional network studies are important for understanding gene functions in detail. Combined with publicly available data, the gene functional annotation results from the DIFGS shRNA study will allow us to identify known/suspected interacting proteins, immediate upstream regulators, and downstream targets. When compared with the predictions of the refined TIC mE model in Aim 2.2, new experimental data that are unanticipated by the model can be used to further enhance the robustness of our mathematical TIC mE model.
Image Processing: Each image is segmented into different mammospheres (colonies), where each mammosphere contains hundreds of cells, and each cell body is delineated in 3D. We can thus extract various properties for each single cell, including geometric features such as axis length and volume, and the intensities of voxels across the cell body, as described in our preliminary work.
Quantify the effect of shRNA treatment for gene function annotation: We aim to generate a functional annotation signature for each single gene that is targeted in replicate by different shRNA treatments. To do that, we first record the morphology of each single cell, determine a mammosphere signature to summarize the information from single cells, and define an image descriptor based on the signatures from the different mammospheres in each image. The image descriptors are consolidated across a series of time-lapse images of a collection of mammospheres from a single shRNA treatment, and are also consolidated across replicate experiments. The consolidated score summarizes the effect of an shRNA treatment and serves as a gene function signature, so that cluster analysis and pathway analysis can then be carried out based on such gene function signatures.
Colony signature: Based on the image processing results, we identify mammospheres in each image and segment the 3D image of each cell body. For each cell in a given mammosphere, we can obtain a series of region geometry properties $G = \{\text{volume}, \text{major axis length}, \text{eccentricity}\}$. For each geometry property $g \in G$, we define a vector $V_g = [m_g, sd_g, \max_g, \min_g, r_g]$ to record the mean value, standard deviation, maximum value, minimum value, and ratio between $\max_g$ and $\min_g$, respectively.
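The per-property summary vector defined above can be computed as in the following minimal sketch. It assumes strictly positive measurements (so that the max/min ratio is defined) and uses the population standard deviation; the proposal does not say whether the sample or population form is intended. The function name `geometry_vector` is hypothetical.

```python
import statistics


def geometry_vector(values):
    """Return [mean, sd, max, min, max/min] for one geometry property,
    measured over the cells of a single mammosphere."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population standard deviation
    highest, lowest = max(values), min(values)
    return [mean, sd, highest, lowest, highest / lowest]
```

For example, applying `geometry_vector` to the list of cell volumes in one mammosphere yields the vector for g = volume; repeating this for major axis length and eccentricity gives the three vectors collected for that colony.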
Also, we have a series of intensity properties for each single cell, $V_i = [m_i, sd_i, \max_i, cc_i]$, to record the mean, standard deviation, and maximum value of intensity across the cell body, and the intensity at the center point of the cell body. Combining the number of cells in each colony, N, with $V_G$ and $V_i$, we have a vector CS as the signature for the properties of a mammosphere: $CS = [N, V_G, V_i]$.
Image descriptor: We summarize the information from all the mammospheres in a single image to form a vector $ID = [Noc, m_N, sd_N, \max_N, \min_N, m_G^I, sd_G^I, V_i^I]$. In this image descriptor we include the number of mammospheres in the image, $Noc$; the mean, standard deviation, maximum, and minimum of the number of cells N in each colony, $[m_N, sd_N, \max_N, \min_N]$; and the features in $V_i$ as well as the mean and standard deviation of the features in G, this time calculated across the whole image (hence the superscript I in $m_G^I$, $sd_G^I$, $V_i^I$).
Gene function signature: Having defined the ID to describe each image, we consolidate such scores across images from time-lapse imaging experiments and replicate experiments to form a gene function signature. Outliers are discarded, and weighted average values are taken based on correlation coefficients among IDs from replicate experiments so that repeatable results are obtained. The obtained mean values are normalized as Z-scores to control the baseline, so that each feature has a similar scale and cluster analysis is applicable. In the time-lapse experiment scenario, we can also calculate the gene function signature at each time point and use differential equations to analyze the dynamics of the shRNA effect across time.
Figure 29. Flowchart for quantifying shRNA treatment effect for gene function annotation.
Network discovery by integrating results of shRNA screening and functional gene networks: We propose an integrated workflow to combine the single gene function signature with online genome information and reconstruct gene regulatory networks related to relevant biological processes. An SVM classifier is utilized to integrate gene function signatures and heterogeneous genome data to form a functional gene network (FGN). The resulting network is used in combination with an integer linear programming algorithm [168] to extract a gene regulatory network. Figure 30 shows the flow chart of our proposed method. In the proposed method, gene function scores are key to measuring the relationship among different genes. Beyond these scores, many different data types can be used to construct an FGN model, including PPI (protein-protein interaction), gene expression microarray, gene neighborhoods, gene fusion, text mining, and so on. However, most approaches reported in the literature are based on the integration of PPI and gene expression microarray data only [169-173]. We integrate physical interactions using PPI data downloaded from the BIND database [111]. This dataset includes 6,772 proteins with 19,372 interactions. Microarray data related to cell development are obtained from the GEO database [174]. In the first step, PPI data and expression data are combined to obtain an individual score required for the FGN construction [173]. Several other genomic datasets with indirect interactions are then recruited to expand the FGN, including gene co-occurrence, genomic neighborhood, gene fusion, text mining, and gene homology data. We will utilize an SVM (the orange box in Figure 30) to integrate the different data sources. The final combined scores reflect the level of confidence in each individual gene interaction.
The input to the SVM consists of the gene interaction pairs, and we define our true negative gold standard as the collection of pairs that are annotated in KEGG but do not occur in the same pathway. An integer linear programming (ILP) based method [168] is applied to search for the optimal path (or signaling pathway) under the diffusion distance; see the previous section. For each candidate path of a given length, a confidence score is used to evaluate its importance. Given an FGN and possible start and end nodes, we can find such a path by maximizing the confidence score and minimizing the number of edges involved. By predicting a limited set of functionally related candidate genes from RNAi screening, we can discover pathways not implicated before. Indeed, several recent papers have demonstrated how this approach may soon have a major impact on human disease research. Therefore, a single integrated network may be powerfully predictive for many different aspects of human biology and disease. Once the target candidate genes and networks are discovered, we can use a similar approach to further refine the TIC mE model described in N4.2.D.2.4. This refined TIC mE model will be used to predict the response of TIC to genetic and pharmacological treatment.
Figure 30. Flow chart for FGN discovery. The blue boxes present the heterogeneous data sources; different types are used as inputs to a linear SVM classifier, which outputs a combined score. The orange box denotes the obtained weighted functional gene network. The red box denotes our final result, the extracted MAPK signaling pathway.
N4.2.D.4. Aim 2.4: model the response of TIC and their microenvironment according to genetic and pharmacological manipulations of TIC function
For the genetic manipulations, cell cultures and a novel collection of mouse tumors and low passage human xenografts will be used to study the effects of genetic and pharmacological TIC inhibitors on tumor cell behavior in vitro and on tumor development in vivo. Based on these data and bioinformatics technologies, especially the molecular-level signaling pathway modeling methods described previously, we will first find the related signaling pathways and then describe the effects of these inhibitors on cellular behavior with mathematical equations (e.g., by virtually downregulating the corresponding portion of the network model). These equations will then be automatically integrated into the tumor growth model via the multiscale framework, specifically through the upscaling of the molecular/cellular agent model into the continuum model. In this way, the genetic manipulations can be incorporated into the mathematical model. By changing the parameter values in the model that correspond to the effects of genetic TIC inhibitors, we can simulate the outcomes for validation and prediction without doing many more biological experiments. For the pharmacological manipulations, we model the drug delivery from the vasculature through the whole microenvironment to the tumor site, and the further transport from the extracellular compartment into the intracellular compartment. In addition, the cytotoxic function will be integrated. This technique has been employed in our multiscale modeling framework in [42, 46].
Pharmacokinetics (PK) model: We model the extravasation of drug from the vasculature and diffusion through the tissue using the reaction-diffusion equations (11) previously introduced in Section D.4.2.D.1.2, with slight modifications for drug pumping as we now describe. The local transport of drugs from the interstitium into the cells’ cytoplasm and nucleoli is described using a compartmental model at each computational grid point as in [42, 46]. The model consists of four compartments, namely, the local extracellular drug concentration d, the cytosolic concentration S2, the nucleolar DNA-bound concentration S3, and the lysosomal drug content S4, as described in System (16) and in [42, 46, 49]:
$$\begin{aligned}
\lambda_{d,U,V} &= \rho_V \left( k_{12}\, d - k_{21} \frac{S_2}{V_C} - k_{41} \frac{S_4}{V_C} \right), \\
\frac{dS_2}{dt} &= k_{12} V_C\, d - k_{21} S_2 - k_{23} S_2 \left( 1 - \frac{S_3}{S_m} \right) + k_{32} S_3 - k_{24} S_2 + k_{42} S_4, \\
\frac{dS_3}{dt} &= k_{23} S_2 \left( 1 - \frac{S_3}{S_m} \right) - k_{32} S_3, \\
\frac{dS_4}{dt} &= k_{24} S_2 - k_{42} S_4 - k_{41} S_4.
\end{aligned} \tag{16}$$
In particular, note that the first equation in System (16) is the net drug delivery rate for the reaction-diffusion drug transport in Equation (11). Also, kij is a transfer rate from compartment i to j (S1 is identical to d), ki is a rate of removal from compartment i, Sm is a DNA saturation parameter, and VC is the volume of a cell. All these parameters are estimated for doxorubicin in breast cancer cells in [42, 46]. This system may be modified according to the mechanism of the therapeutic agent. For example, erlotinib is a TKI that disrupts EGFR autophosphorylation by binding at the intracellular ATP sites just below the cell membrane, and hence the nucleolar and lysosomal components are unnecessary for describing its pharmacodynamics. Note that these equations are just an example of modeling drug delivery in the microenvironment and intracellular space; compartments in this system can be added or removed according to the biological characteristics of each drug.
Pharmacodynamic (PD) model: This model describes the mechanism of the drug’s cytotoxicity, and can be implemented with various levels of detail. For chemotherapeutic compounds such as doxorubicin that act by inducing apoptosis in cycling cells, we implement a phenomenological model as in [46, 49, 87, 175]. This avoids the issues of unknown drug mechanisms and instead focuses upon the quantitative effects of the drugs.
A typical phenomenological PD model for cytotoxic agents that rely upon binding to DNA (e.g., doxorubicin) is given by the Hill-type Equation (17):
$$E = \frac{N(\sigma)}{1 + A^{-1} x^{-m}}, \qquad x(\hat{t}) = \int_0^{\hat{t}} S_3(s)\, ds, \tag{17}$$
where E is cell inhibition, σ is the nutrient level, x is the time-integrated DNA-bound drug (calculated at each computational grid point as a function of the elapsed time $\hat{t}$ since the administration of the compound), and A and m are fitted parameters. N(σ) is used to model the nutrient-dependence of drug effectiveness, which stems from the fact that such cytotoxic agents act primarily upon cycling cells; more information is given in [42, 46]. In discrete models, we can use x to alter the non-quiescent cells’ probability of entering the apoptotic state, similarly to the hypoxic model in [43]. For the action of erlotinib upon EGFR pathway-addicted cells, we can implement PD either by setting the EGFR signaling rates to 0 in the detailed EGFR pathway models or by varying the rate of apoptosis with both the concentration of drug d and the rate of EGFR:EGF binding. In the latter case, EGFR:EGF is presumed not to transmit its signal for cell survival, and so the apoptosis rate is increased.
N4.3. Other Items
Optional Element - Shared Resources Cores. We will not propose a new supporting core, but will make use of the existing cores. In Section N.2.5, we described the Supporting Cores.
Pilot Research Efforts. Each awarded CCSB will be expected to pursue new opportunities pertinent to the scientific theme of the Center. We have described the strategies in Section N.2.4: Management of Pilot projects.
Section N5: Component 3 - Education, Training, and Outreach Program
N5.1. Rationale for an Educational & Training Program in Systematic Modeling of Cancer Development
Breast cancer is the most commonly diagnosed cancer, and the second leading cause of cancer deaths among American women. As such, breast cancer has been identified as a public health priority in the United States.
Despite this clinical and social importance, we are only now beginning to understand the molecular, cellular, and developmental mechanisms underlying breast cancer initiation and progression in enough detail to allow rudimentary predictions of treatment response to be made. Mathematical modeling and computational simulation offer the extraordinary promise of integrating huge quantities of diverse data into coherent developmental and predictive models capable of informing “personalized medicine” decisions. However, these models can only be as good as our biological understanding of the parameters involved in breast cancer development, and as the experimental data on which they are based. Our educational and training plan is designed to fill a need for an organized training process in combined biological and mathematical/computational modeling of breast cancer for postdoctoral fellows and undergraduates. In subsequent funding cycles, we intend to extend this training program to include graduate student trainees.

Our proposed training program brings together, in a formal way, researchers from clinical, translational, and basic science areas of experimental breast cancer research, as well as researchers in the areas of mathematical modeling, computational biology, and bioinformatics, all with a keen interest in working toward the common goal of understanding breast cancer biology. In this proposal, we will establish two unique multidisciplinary training programs, one for undergraduates and one for postdoctoral trainees.
The undergraduate program will be entitled the “Multidisciplinary Summer Undergraduate Training Program in Experimental and Mathematical Modeling of Cancer.” This program will be geared toward individuals exploring their interest in cancer research and will serve to introduce undergraduates to various aspects of experimental biology and to mathematical/computational modeling approaches and analyses. The multidisciplinary postdoctoral training program will be geared toward the recruitment and training of individuals holding Ph.D. or M.D./Ph.D. degrees in mathematical modeling, computational biology, biostatistics/bioinformatics, or a related field who are interested in gaining significant experimental experience and deeper conceptual insight into breast cancer development in a laboratory or clinical setting. Prospective candidates will be teamed with both a basic science/clinical mentor and a mathematical modeling/computational biology mentor to develop a suitable research project addressing an important unanswered question in breast cancer biology from a combined experimental and mathematical/computational perspective.

N5.1.1. Summer Undergraduate Training Program in Experimental and Mathematical Modeling of Cancer

A summer undergraduate student research program has operated at The Methodist Hospital Research Institute for 5 consecutive years. The program runs for 10 weeks, and the stipend for the 10 weeks is $5,000.00. The students are placed in a research laboratory with a designated mentor, where they are assigned a specific project. During this time they must also attend weekly didactic lectures given by leading researchers and physicians at The Methodist Hospital Research Institute and Methodist Hospital. At the end of the program there is a Student Retreat, where the students present their work to the group of students and faculty mentors. Each year 20-25 students are admitted to the program.
The Institute provides overall administrative support, but mentors are responsible for the summer stipend through their research funding. We will build upon this established infrastructure, using funds from this ICBP program to support up to five additional undergraduate students. These five undergraduate students can be distributed to individual laboratories across the three institutions associated with the CSMCaD (TMHRI, Baylor, and UTHSC) and assigned a specific cancer research project comprising both an experimental and a mathematical/computational component, with both wet lab and dry lab mentoring. The laboratory topics studied and techniques used will vary considerably from lab to lab; however, the training may include in vitro and in vivo imaging, genomics preparation and analysis, protein and antibody array preparation and analysis, reporter technology, biological image analysis, and computational modeling of tumor progression and drug response.

N5.1.2. Postdoctoral Training Program in Experimental and Mathematical Modeling of Cancer

In this joint, multidisciplinary program, our major objective will be the training of bright and ambitious postdoctoral fellows holding Ph.D. degrees in fields such as Mathematics, Computer Science, or Biological Engineering to become well-versed laboratory or clinical researchers with a deep intellectual understanding of breast cancer initiation and progression. In essence, we seek to develop a group of independent researchers who can “speak at least two languages” fluently. This effort, in turn, will enhance the quality and breadth of research in the participating institutions. We will provide annual salaries commensurate with experience according to the NIH-NRSA guidelines. Two postdoctoral fellows will be budgeted at 0 years of experience ($37,368/year), and one fellow will be budgeted at 3 years of experience ($43,860/year). In addition to salary, we will support travel to attend one scientific meeting per year.
We will advertise the postdoc positions nationwide.

N5.1.3. Each trainee will have two mentors: a basic/clinical science mentor and a mathematical/computational modeling mentor. An internationally recognized breast cancer research program has existed for many years in the group that now constitutes the Lester and Sue Smith Breast Center at Baylor College of Medicine, and this program has offered a unique opportunity for postdoctoral fellows interested in starting careers in translational breast cancer research. As with the undergraduate summer program, we will build upon this established training infrastructure with the addition of faculty from internationally recognized mathematical modeling/computational biology programs at TMHRI and other institutions to build a multidisciplinary postdoctoral training program in experimental and mathematical modeling of breast cancer. In addition to numerous individual research grants from federal, private, and industry sources, breast program faculty have an active, well-supported, and multi-disciplinary research program in breast cancer, including a P01 program project grant (Steffi Oesterreich, Ph.D., PI), a Breast Cancer SPORE grant (C. Kent Osborne, M.D., PI), a Susan G. Komen Promise grant (Powel Brown, M.D., Ph.D., PI) for the study of triple negative breast cancer, and an AACR “Stand Up To Cancer” multi-institutional grant to study mechanisms of treatment resistance (C. Kent Osborne, M.D., PI). Finally, we have been awarded a Cancer Center grant from the NCI (C. Kent Osborne, M.D., PI), which will further facilitate and strengthen breast cancer research initiatives at Baylor and associated institutions.
Many members of TMHRI’s Bioinformatics and Biomedical Engineering Program come from the HCNR Center for Bioinformatics at Harvard Medical School and the Functional and Molecular Imaging Center at Brigham and Women’s Hospital. Their work is funded from a variety of sources, including NIH, NSF, DOD, and foundation grants. Four NIH R01 grants are ongoing in the areas of image bioinformatics and computational biology.

N5.2. Program Administration

The Core PI (Director) for this Educational and Training Program will be Michael T. Lewis, Ph.D. (BCM). Dr. Lewis is an Assistant Professor of Molecular and Cellular Biology, and a faculty member of the Lester and Sue Smith Breast Center at BCM. In addition to extensive didactic teaching experience at the graduate and undergraduate levels, Dr. Lewis has trained 3 postdoctoral fellows and serves as a basic science mentor for two clinical fellows. In addition, Dr. Lewis currently has three graduate students in his laboratory. Dr. Lewis has a strong commitment to training at every level and will be mentored by Drs. Fuqua and Wong in the direction of the training program. Dr. Lewis will spend 5% of his time directing this program.

The co-directors for the Educational and Training Program will be Suzanne A. W. Fuqua, Ph.D. (BCM) and Stephen Wong, Ph.D. (TMH). Dr. Fuqua has been directly responsible for the administration of a T32 Training Program in Breast Cancer for many years. Dr. Fuqua is a Professor of Medicine, and a senior faculty member of the Lester and Sue Smith Breast Center at BCM. She has trained 29 pre- and postdoctoral fellows, including both M.D. and Ph.D. trainees, and is currently training three predoctoral graduate students in the Molecular and Cell Biology and the Translational Biology and Molecular Medicine (TBMM) graduate programs at Baylor. Dr. Fuqua will spend 5% of her time administering this Program and will serve as an on-site mentor to Dr. Lewis.

Dr. Wong is currently the faculty mentor for the TMHRI undergraduate program, as well as for the resident and postdoctoral fellow programs of the Methodist Hospital's Departments of Radiology and Pathology. In addition, he serves as a mentor in the BCM graduate program in computational and structural biology and biophysics; in the bioengineering programs at Rice University and the University of Houston and the mechanical engineering program at the University of Houston, at both the undergraduate and graduate levels; and in the School of Health Information Sciences at UTHSC. Dr. Wong also participated in several T32 training programs in biomedical informatics, bioengineering, and genetics at Harvard and UCSF for about fifteen years. Dr. Wong has trained 14 PhDs and 35 postdoctoral fellows. Dr. Wong will spend 5% of his time administering this Program.

N5.3. Administrative Structure

The Training Program in Systematic Modeling of Cancer Development is a 2 to 3 year program. Drs. Lewis, Fuqua, and Wong will constitute an Executive Steering Committee, which will orchestrate the selection of research preceptors by the trainees, assuring that the trainees meet the eligibility criteria and are paired with the appropriate wet lab and dry lab mentors. We will make use of prepared payback agreements, activation notices, and progress reports to manage the financial aspects of the Program. The Executive Steering Committee will organize the didactic components and will spearhead the integration of the two areas of training required by candidates. We will also monitor the progress of all trainees in terms of their didactic experience and their research progress through weekly “Research in Progress” presentations and a monthly Journal Club. In addition, the Executive Steering Committee will meet two or three times per year specifically to discuss the final selection of research preceptors, the progress of each trainee, and the content of the didactic components of the training program.
N5.3.1. Program Faculty – Mathematical Modeling and Computational Biology Co-Mentors

Vittorio Cristini, Ph.D., Associate Professor, UT Health Science Center at Houston; Susan Hilsenbeck, Ph.D., Professor, Baylor College of Medicine (BCM); Chad Shaw, Ph.D., Assistant Professor of Genetics, BCM; Stephen T.C. Wong, Ph.D., John S. Dunn Distinguished Endowed Chair in Biomedical Engineering and Professor in Radiology, Weill Cornell Medical College (WCMC); Xiaobo Zhou, Ph.D., Associate Professor of Bioinformatics in Radiology, WCMC; and Fei Cao, Ph.D., Assistant Professor of Radiology, WCMC, will serve as mentors. The profiles of these professors can be found in Section N1.4. In addition, Zhong Xue, Ph.D., Assistant Professor of Electrical Engineering in Radiology, WCMC, and Chief of the Medical Image Analysis Lab, TMHRI-BBE Program; and Kelvin Wong, Ph.D., Assistant Professor of Electronic Engineering in Radiology, WCMC, and Chief of the Translational Multimodality Optical Imaging Lab, TMHRI-BBE Program, will serve as mentors.

N5.3.2. Program Faculty – Basic/Clinical Science Co-Mentors

Jenny Chang, M.D., Professor of Medicine, BCM; Mary Dickinson, Ph.D., Associate Professor of Molecular Physiology and Biophysics, BCM; Dean Edwards, Ph.D., Professor of Molecular and Cellular Biology, BCM; Suzanne A.W. Fuqua, Ph.D., Professor of Medicine, BCM; Michael T. Lewis, Ph.D., Assistant Professor of Molecular and Cellular Biology, BCM; and Jeffrey M. Rosen, Ph.D., C.C. Bell Professor of Molecular and Cellular Biology, BCM, will serve as mentors. Their profiles can be found in Section N1.4. Other professors include: Powel H. Brown, M.D., Ph.D., Professor of Medicine, BCM, an expert in understanding the process of breast carcinogenesis and in developing more effective ways to prevent breast cancer; Eric Chang, Ph.D., Associate Professor of Molecular and Cellular Biology, BCM, an expert in studying the Ras GTPases; Yi Li, Ph.D., Assistant Professor of Molecular and Cellular Biology, BCM, an expert in dissecting the molecular interactions in breast carcinogenesis by studying how the Wnt and other oncogenic pathways interact to induce mammary tumors using mouse models; Daniel Medina, Ph.D., Professor of Molecular and Cellular Biology, BCM, an expert in the study of early breast tumor development and prevention; Bert O’Malley, M.D., Professor and Chair of Molecular and Cellular Biology, BCM, an expert in studying the "primary molecular endocrine pathway"; C. Kent Osborne, M.D., Professor of Medicine, Director of the Dan L. Duncan Cancer Center and the Lester and Sue Smith Breast Center, Chair of the Department of Hematology and Oncology, BCM, an expert in identifying molecular mechanisms by which breast cancer cells become resistant to the antiestrogen tamoxifen; and Nancy Weigel, Ph.D., Professor of Molecular and Cellular Biology, BCM, an expert in studying how cell signaling influences PR function.

N5.3.3. Didactic Components of the Training Program for Postdoctoral Fellows

TMHRI provides additional training in Mathematics and Computational Biology. It includes bioinformatics, image bioinformatics, systems biology, biomedical imaging informatics, molecular imaging, biostatistics, and biomathematics. One area of TMHRI's computational biology effort focuses on mathematical modeling and computer simulation to study various aspects of tumor initiation, progression, and treatment, including model development and its integration with experimental data, clinical data, or both. Topics include the bio-mechanics of normal vs. tumor-like tissue morphogenesis, micro-fluids in drug delivery, and the biophysics of the tumor microenvironment.
The courses also include mathematical, computational, or engineering sciences, as well as computational fluid dynamics, computer programming, and visualization. The Graduate School at Baylor College of Medicine provides a variety of basic science courses that lead to the award of a Ph.D. degree. Courses taught in Molecular and Cellular Biology, Biochemistry, or Molecular and Human Genetics may be appropriate for individual trainees. At the discretion of the preceptor, trainees will audit these courses in order to broaden their basic science background. Two courses currently offered by the Graduate School are required for trainees in the Program: “Cancer Core Curriculum Course” and “Introduction to Molecular Carcinogenesis.” The “Cancer” course covers cancer as a multi-step process, initiation of carcinogenesis, progression of carcinogenesis, oncogenes, and tumor suppressor genes. For training in breast cancer research, postdoctoral trainees will be required to take three additional courses (listed below). Additional courses will be recommended on an individual basis as needs arise.

1. Translational Breast Cancer Research Course – Dr. Fuqua
Because many of our trainees are expected to come to the Training Program with little if any knowledge of breast cancer, we have designed this lecture series to familiarize them with breast cancer from the standpoint of clinicians and translational basic scientists. The course objective is to provide a broad understanding of current problems in breast cancer, experimental approaches, and active research areas in the field.

2. Introduction to Biostatistics Course for Translational Researchers – Dr. Hilsenbeck
While some candidates will have a strong background in Biostatistics, some may not and will benefit from a basic biostatistics course.
It is our experience that few trainees in laboratory investigation receive any organized instruction in research design or statistical evaluation, and when such training exists, design and statistical methodologies appropriate for the special problems of clinically-oriented laboratory research are virtually never included. Dr. Hilsenbeck is the principal designer and instructor of this course.

3. Scientific Writing and Research Grants Course – Dr. Gary C. Chamness
This course focuses on writing more readable (and fundable) research grants, including specific writing skills, construction of grant elements, layout and other technical aspects, and some points on strategy and tactics. Construction of scientific papers is also covered. We combine lecture and interactive formats, and there are also short but important homework exercises. Dr. Chamness has been an NIH study section member and editor of two journals, has published over 100 papers, and has written numerous grant proposals both large and small, so he brings both the reader's and the writer's perspective to this course.

UTHSC also offers formal courses in computational biology and bioinformatics. Moreover, trainees must participate in: (1) Breast Disease Research in Progress Seminars: Trainees are required to attend a weekly Breast Disease Research in Progress Seminar program (directed by Dr. Lewis); (2) Mathematical Biology and Breast Cancer Journal Clubs: Trainees also attend a Breast Cancer Journal Club, held twice a month at BCM, and a Mathematical Biology and Bioinformatics Journal Club, also held bi-weekly at TMHRI; and (3) Educational Seminars: Trainees also attend three regularly scheduled seminar series. The first is a Breast Disease Workshop organized by Dr. Medina and Dr. Li. This weekly seminar series is given by members of the Baylor faculty from any of several departments who are interested in basic, translational, or clinical research in breast cancer.
The second seminar series is a monthly outside Distinguished Speaker program co-sponsored by the Breast Center, the Cancer Center, and the Department of Molecular and Cellular Biology. The third is a bi-weekly bioinformatics and imaging research seminar series organized by the PI at The Methodist Hospital, which features research topics in computational biology, systems biology, bioinformatics, and a broad spectrum of microscopy imaging and medical imaging, as well as their applications in clinical research and disease management. In addition, trainees are encouraged to select other seminars of interest from the dozens offered at Baylor, Methodist, and UTHSC each week to supplement their learning experience.

N5.3.4. Recruitment Plan for Trainee Candidates

Postdoctoral trainee candidates will be chosen from among applicants who have completed a Ph.D. in mathematics, computational biology, bioengineering, biostatistics, or biophysics and who wish to pursue fellowship training in breast cancer or other related cancer research. For active recruitment of postdoctoral trainee candidates we will use several approaches. 1) We will advertise in international scientific journals including Science, Nature, Cell, and Bioinformatics. Such ads typically garner about 400 applications. 2) We will seek candidates through national meetings of relevant societies including the American Association for Cancer Research, the American Society of Clinical Oncology, the San Antonio Breast Cancer Symposium, and the Endocrine Society. This has been a valuable source of candidates, and many trainees in the Breast Center supported by other funds have been recruited via the San Antonio meeting. We will also advertise through bioinformatics and computational biology meetings to recruit mathematical/computational candidates, including meetings of the International Society for Computational Biology, the Biophysical Society, and the IEEE (Institute of Electrical and Electronics Engineers).
3) We will send program announcements to relevant departments of other universities and institutions, backed up by active recruiting during faculty visits to these institutions. 4) We will pursue recruiting contacts at local colleges and universities, including the University of Houston, Rice University, Texas A&M, the University of Texas at Austin, and Texas Southern University. 5) We will also set up a recruitment website on the CSMCaD and TMHRI web portals for announcing open positions and accepting applications online. A similar approach will be used to recruit summer undergraduate students and, in later years of the center once the postdoctoral and summer student programs are operational, graduate students.

Literature Cited

1. Chang, J.C., et al., Patterns of resistance and incomplete response to docetaxel by gene expression profiling in breast cancer patients. J Clin Oncol, 2005. 23(6): p. 1169-77.
2. Cleator, S., et al., Gene expression patterns for doxorubicin (Adriamycin) and cyclophosphamide (cytoxan) (AC) response and resistance. Breast Cancer Res Treat, 2006. 95(3): p. 229-33.
3. Li, X., et al., Intrinsic resistance of tumorigenic breast cancer cells to chemotherapy. J Natl Cancer Inst, 2008. 100(9): p. 672-9.
4. Zhang, M., et al., Identification of tumor-initiating cells in a p53-null mouse model of breast cancer. Cancer Res, 2008. 68(12): p. 4674-82.
5. Westbrook, T.F., et al., A genetic screen for candidate tumor suppressors identifies REST. Cell, 2005. 121(6): p. 837-48.
6. Westbrook, T.F., et al., SCFbeta-TRCP controls oncogenic transformation and neural differentiation through REST degradation. Nature, 2008. 452(7185): p. 370-4.
7. Wang, M., et al., Context based mixture model for cell phase identification in automated fluorescence microscopy. BMC Bioinformatics, 2007. 8: p. 32.
8. Wang, M., et al., Novel cell segmentation and online SVM for cell cycle phase identification in automated microscopy. Bioinformatics, 2008. 24(1): p. 94-101.
9. Li, F., et al., High content image analysis for human H4 neuroglioma cells exposed to CuO nanoparticles. BMC Biotechnol, 2007. 7: p. 66.
10. Jin, G., et al., The knowledge-integrated network biomarkers discovery for major adverse cardiac events. J Proteome Res, 2008. 7(9): p. 4013-21.
11. Zhou, X. and S. Wong, Informatics challenges of high-throughput microscopy. Signal Processing Magazine, IEEE, 2006. 23: p. 63-72.
12. Li, K., et al., Online tracking of migrating and proliferating cells imaged with phase-contrast microscopy. Proceedings of the 2006 Conference on Computer Vision and Pattern Recognition Workshop, 2006: p. 65.
13. Chen, X., X. Zhou, and S.T. Wong, Automated segmentation, classification, and tracking of cancer cell nuclei in time-lapse microscopy. IEEE Trans Biomed Eng, 2006. 53(4): p. 762-6.
14. Al-Kofahi, O., et al., Automated cell lineage construction. Cell Cycle, 2006. 5(3): p. 327-335.
15. Li, F., et al., Optimal Multiple Nuclei Tracking Using Integer Programming for Quantitative Cancer Cell Cycle Analysis. IEEE Transactions on Medical Imaging, 2008: p. under revision.
16. Zhang, L., et al., Graph-Based Multi-Cells Tracking and Identification by Tracing Maximal Entering Flows. IEEE Transactions on Image Processing, 2008: p. under revision.
17. Haralick, R., Statistical and structural approaches to texture. Proceedings of IEEE, 1979. 67: p. 786-804.
18. Boland, M. and R. Murphy, A neural network classifier capable of recognizing the patterns of all major subcellular structures in fluorescence microscope images of HeLa cells. Bioinformatics, 2001. 17: p. 1213-1223.
19. Manjunath, B. and W. Ma, Texture features for browsing and retrieval of image data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1996. 18: p. 837-842.
20. Li, G., et al., Feature Selection for Multi-class Problems Using Support Vector Machines. Lecture Notes in Computer Science, 2004. 3157: p. 292-300.
21. Guyon, I., et al., Gene Selection for Cancer Classification using Support Vector Machines. Machine Learning, 2002. 46: p. 389-422.
22. Li, F., X. Zhou, and S.T.C. Wong, Novel Nuclei Segmentation and Cell Phase Identification Using Markov Model, in International Symposium on Computational Models for Life Sciences (CMLS). 2007: Gold Coast, Queensland, Australia.
23. Vapnik, V., Statistical Learning Theory. 1998, New York: John Wiley & Sons.
24. Li, F., et al., An automated feedback system with the hybrid model of scoring and classification for solving over-segmentation problems in RNAi high content screening. J Microsc, 2007. 226(2): p. 121-132.
25. Wang, J., et al., Cellular Phenotype Recognition for High-Content RNA Interference Genome-Wide Screening. Journal of Biomolecular Screening, 2008. 13(1): p. 29-39.
26. Yin, Z., et al., Using iterative cluster merging with improved gap statistics to perform online phenotype discovery in the context of high-throughput RNAi screens. BMC Bioinformatics, 2008. 9(1): p. 264.
27. Bakal, C., et al., Quantitative morphological signatures define local signaling networks regulating cell morphology. Science, 2007. 316: p. 1753-1756.
28. Chahrour, M., et al., MeCP2, a key contributor to neurological disease, activates and represses transcription. Science, 2008. 320(5880): p. 1224-9.
29. Zhou, X. and S.T.C. Wong, Computational Systems Bioinformatics - Methods and Biomedical Applications. 2008: World Scientific Publishers.
30. Wu, L.Y., et al., Conditional random pattern algorithm for LOH inference and segmentation. Bioinformatics, 2009. 25(1): p. 61-7.
31. Yang, X., et al., Pattern-selection based power analysis and discrimination of low- and high-grade myelodysplastic syndromes study using SNP arrays. PLoS ONE, 2009. 4(4): p. e5054.
32. Huang, W.T., et al., Multiple distinct clones may co-exist in different lineages in myelodysplastic syndromes. Leuk Res, 2009. 33(6): p. 847-53.
33. Ren, X., et al., The pathogenesis of myelodysplastic syndromes (MDS): insights from a network view, in Proceedings of the 100th Annual Meeting of the American Association for Cancer Research. 2009: Denver, CO. Philadelphia (PA): AACR. Poster 3294.
34. Huang, D., X. Zhou, and S.T. Wong, Inferring Cancer-associated Signaling Networks Based on Significance Analysis of microRNA-mRNA Targeting, in ISMB/ECCB. 2009: Stockholm, Sweden.
35. Wang, Y., et al., Reversible jump MCMC approach for peak identification for stroke SELDI mass spectrometry using mixture model. Bioinformatics, 2008. 24(13): p. i407-13.
36. Zhang, S., Zhou, XB, Wang, HH, Hoehn, GT, DeGraba, TJ, Gonzales, DA, Suffredini, AF, Ching, WK, Ng, MK, and Wong, STC, A Novel Peak Detection Approach with Chemical Noise Removal Using Short-Time FFT for prOTOF MS Data. Proteomics, 2009.
37. Zhou, X., et al., Identification of biomarkers for risk stratification of cardiovascular events using genetic algorithm with recursive local floating search. Proteomics, 2009. 9(8): p. 2286-94.
38. Cancer Facts & Figures. 2009: American Cancer Society.
39. Cristini, V., et al., Nonlinear simulations of solid tumor growth using a mixture model: invasion and branching. J Math Biol, 2009. 58(4-5): p. 723-63.
40. Cristini, V., J. Lowengrub, and Q. Nie, Nonlinear simulation of tumor growth. J Math Biol, 2003. 46(3): p. 191-224.
41. Bearer, E.L., et al., Multiparameter computational modeling of tumor invasion. Cancer Res, 2009. 69(10): p. 4493-501.
42. Sinek, J.P., et al., Predicting drug pharmacokinetics and effect in vascularized tumors using computer simulation. J Math Biol, 2009. 58(4-5): p. 485-510.
43. Macklin, P., et al., An Integrative, Agent-Based Model of Breast Epithelial Cells, with Patient-Specific ex vivo Calibration. 2009.
44. Zheng, X., S.M. Wise, and V. Cristini, Nonlinear simulation of tumor necrosis, neo-vascularization and tissue invasion via an adaptive finite-element/level-set method. Bull Math Biol, 2005. 67(2): p. 211-59.
45. Frieboes, H.B., et al., Three-dimensional diffuse-interface simulation of multispecies tumor growth-II: investigation of tumor invasion. Bull Math Biol, in press.
46. Frieboes, H.B., et al., Prediction of drug response in breast cancer using integrative experimental/computational modeling. Cancer Res, 2009. 69(10): p. 4484-92.
47. Frieboes, H.B., et al., Computer simulation of glioma growth and morphology. Neuroimage, 2007. 37 Suppl 1: p. S59-70.
48. Wise, S.M., et al., Three-dimensional multispecies nonlinear tumor growth--I Model and numerical method. J Theor Biol, 2008. 253(3): p. 524-43.
49. Sanga, S., et al., Mathematical modeling of cancer progression and response to chemotherapy. Expert Rev Anticancer Ther, 2006. 6(10): p. 1361-76.
50. Al-Hajj, M., et al., Prospective identification of tumorigenic breast cancer cells. Proc Natl Acad Sci U S A, 2003. 100(7): p. 3983-8.
51. Hayward, P., T. Kalmar, and A.M. Arias, Wnt/Notch signalling and information processing during development. Development, 2008. 135(3): p. 411-24.
52. Mani, S.A., et al., The epithelial-mesenchymal transition generates cells with properties of stem cells. Cell, 2008. 133(4): p. 704-15.
53. Reya, T. and H. Clevers, Wnt signalling in stem cells and cancer. Nature, 2005. 434(7035): p. 843-50.
54. Woodward, W.A., et al., WNT/beta-catenin mediates radiation resistance of mouse mammary progenitor cells. Proc Natl Acad Sci U S A, 2007. 104(2): p. 618-23.
55. Langerod, A., et al., TP53 mutation status and gene expression profiles are powerful prognostic markers of breast cancer. Breast Cancer Res, 2007. 9(3): p. R30.
56. Jerry, D.J., et al., A mammary-specific model demonstrates the role of the p53 tumor suppressor gene in tumor development. Oncogene, 2000. 19(8): p. 1052-8.
57. Liu, R., et al., The prognostic role of a gene signature from tumorigenic breast-cancer cells. N Engl J Med, 2007. 356(3): p. 217-26.
58. Shipitsin, M., et al., Molecular definition of breast tumor heterogeneity. Cancer Cell, 2007. 11(3): p. 259-73.
59. Bouras, T., et al., Notch signaling regulates mammary stem cell function and luminal cell-fate commitment. Cell Stem Cell, 2008. 3(4): p. 429-41.
60. Moraes, R.C., et al., Constitutive activation of smoothened (SMO) in mammary glands of transgenic mice leads to increased proliferation, altered differentiation and ductal dysplasia. Development, 2007. 134(6): p. 1231-42.
61. Kohyama, J., et al., Visualization of spatiotemporal activation of Notch signaling: live monitoring and significance in neural development. Dev Biol, 2005. 286(1): p. 311-25.
62. Sasaki, H., et al., A binding site for Gli proteins is essential for HNF-3beta floor plate enhancer activity in transgenics and can respond to Shh in vitro. Development, 1997. 124(7): p. 1313-22.
63. Winnard, P.T., Jr., J.B. Kluth, and V. Raman, Noninvasive optical tracking of red fluorescent protein-expressing cancer cells in a model of metastatic breast cancer. Neoplasia, 2006. 8(10): p. 796-806.
64. Ingham, P.W., Hedgehog signalling. Curr Biol, 2008. 18(6): p. R238-41.
65. Welm, B.E., et al., Lentiviral transduction of mammary stem cells for analysis of gene function during development and cancer. Cell Stem Cell, 2008. 2(1): p. 90-102.
66. Weber, K., et al., A multicolor panel of novel lentiviral "gene ontology" (LeGO) vectors for functional gene analysis. Mol Ther, 2008. 16(4): p. 698-706.
67. Tsien, R.Y., Breeding and building molecules to spy on cells and tumors. Keio J Med, 2006. 55(4): p. 127-40.
68. Kremers, G.J., et al., Improved green and blue fluorescent proteins for expression in bacteria and mammalian cells. Biochemistry, 2007. 46(12): p. 3775-83.
69. Irizarry, R.A., et al., Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res, 2003. 31(4): p. e15.
70. Irizarry, R.A., et al., Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics, 2003. 4(2): p. 249-64.
71. Li, H. and F. Hong, Cluster-Rasch models for microarray gene expression data. Genome Biol, 2001. 2(8): p. RESEARCH0031.
72. Gentleman, R.C., et al., Bioconductor: open software development for computational biology and bioinformatics. Genome Biol, 2004. 5(10): p. R80.
73. Chang, J.C.N., et al., Gene expression patterns for de novo and acquired docetaxel resistance in patients with locally advanced breast cancer. J Clin Oncol, 2004: in press.
74. Nikitin, A., et al., Pathway studio--the analysis and navigation of molecular networks. Bioinformatics, 2003. 19(16): p. 2155-7.
75. Sweet-Cordero, A., et al., An oncogenic KRAS2 expression signature identified by cross-species gene-expression analysis. Nat Genet, 2005. 37(1): p. 48-55.
76. Chen, M.S., et al., Wnt/beta-catenin mediates radiation resistance of Sca1+ progenitors in an immortalized mammary gland cell line. J Cell Sci, 2007. 120(Pt 3): p. 468-77.
77. Real, P.J., et al., Gamma-secretase inhibitors reverse glucocorticoid resistance in T cell acute lymphoblastic leukemia. Nat Med, 2009. 15(1): p. 50-8.
78. van Es, J.H., et al., Notch/gamma-secretase inhibition turns proliferative cells in intestinal crypts and adenomas into goblet cells. Nature, 2005. 435(7044): p. 959-63.
79. Zhang, X., et al., Cyclopamine inhibition of human breast cancer cell growth independent of Smoothened (Smo). Breast Cancer Res Treat, 2008.
80. Lauth, M., et al., Inhibition of GLI-mediated transcription and tumor cell growth by small-molecule antagonists. Proc Natl Acad Sci U S A, 2007. 104(20): p. 8455-60.
81. Mimeault, M., et al., Recent advances in cancer stem/progenitor cell research: therapeutic implications for overcoming resistance to the most aggressive cancers. J Cell Mol Med, 2007. 11(5): p. 981-1011.
82. Kakarala, M. and M.S. Wicha, Implications of the cancer stem-cell hypothesis for breast cancer prevention and therapy. J Clin Oncol, 2008. 26(17): p. 2813-20.
83. Adams, G.B. and D.T. Scadden, The hematopoietic stem cell in its place. Nat Immunol, 2006. 7(4): p. 333-7.
84. Scadden, D.T., The stem-cell niche as an entity of action. Nature, 2006. 441(7097): p. 1075-9.
85. Walker, M.R., K.K. Patel, and T.S. Stappenbeck, The stem cell niche. J Pathol, 2009. 217(2): p. 169-80.
86. Sanga, S., et al., Predictive oncology: a review of multidisciplinary, multiscale in silico modeling linking phenotype, morphology and growth. Neuroimage, 2007. 37 Suppl 1: p. S120-34.
87. El-Kareh, A.W. and T.W. Secomb, Two-mechanism peak concentration model for cellular pharmacodynamics of Doxorubicin. Neoplasia, 2005. 7(7): p. 705-13.
88. Ward, J.P. and J.R. King, Mathematical modelling of drug transport in tumour multicell spheroids and monolayer cultures. Math Biosci, 2003. 181(2): p. 177-207.
89. Jackson, T.L., Intracellular accumulation and mechanism of action of doxorubicin in a spatio-temporal tumor model. J Theor Biol, 2003. 220(2): p. 201-13.
90. Norris, E.S., J.R. King, and H.M. Byrne, Modelling the response of spatially structured tumours to chemotherapy: drug kinetics. Math Comp Model, 2006. 43: p. 820-37.
91. Byrne, H.M., et al., Modelling the response of vascular tumours to chemotherapy: a multiscale approach. Math Models Meth Appl Sci, 2005. 16: p. 1219-41.
92. Panovska, J., H.M. Byrne, and P.K. Maini, A theoretical study of the response of vascular tumors to different types of chemotherapy. Math Comp Model, 2007. 47: p. 560-79.
93. Sinek, J., et al., Two-dimensional chemotherapy simulations demonstrate fundamental transport and tumor response limitations involving nanoparticles. Biomed Microdevices, 2004. 6(4): p. 297-309.
94. Roeder, I. and M. Loeffler, A novel dynamic model of hematopoietic stem cell organization based on the concept of within-tissue plasticity. Exp Hematol, 2002. 30(8): p. 853-61.
95. Timms, J.F., et al., Effects of ErbB-2 overexpression on mitogenic signalling and cell cycle progression in human breast luminal epithelial cells. Oncogene, 2002. 21(43): p. 6573-86.
96. Athale, C.A. and T.S. Deisboeck, The effects of EGF-receptor density on multiscale tumor growth patterns. J Theor Biol, 2006. 238(4): p. 771-9.
97. Eladdadi, A. and D. Isaacson, A mathematical model for the effects of HER2 overexpression on cell proliferation in breast cancer. Bull Math Biol, 2008. 70(6): p. 1707-29.
98. Birtwistle, M.R., et al., Ligand-dependent responses of the ErbB signaling network: experimental and modeling analyses. Mol Syst Biol, 2007. 3: p. 144.
99. Gan, H.K., et al., The epidermal growth factor receptor (EGFR) tyrosine kinase inhibitor AG1478 increases the formation of inactive untethered EGFR dimers. Implications for combination therapy with monoclonal antibody 806. J Biol Chem, 2007. 282(5): p. 2840-50.
100. Costa, D.B., et al., BIM mediates EGFR tyrosine kinase inhibitor-induced apoptosis in lung cancers with oncogenic EGFR mutations. PLoS Med, 2007. 4(10): p. 1669-79; discussion 1680.
101. Kong, A., et al., HER2 oncogenic function escapes EGFR tyrosine kinase inhibitors via activation of alternative HER receptors in breast cancer cells. PLoS ONE, 2008. 3(8): p. e2881.
102. Monod, J., The Growth of Bacterial Cultures. Annual Review of Microbiology, 1949. 3(1): p. 371-394.
103. Michor, F., et al., Dynamics of chronic myeloid leukaemia. Nature, 2005. 435(7046): p. 1267-70.
104. Dekker, L., Editorial: Novel approaches to drug discovery in signal transduction. Biotechnol J, 2008. 3(4): p. 428-9.
105. Kurose, H., [Signal transduction and drug development]. Nippon Rinsho, 2002. 60(1): p. 25-30.
106. Slonim, D., et al., Class prediction and discovery using gene expression data. 4th Annual International Conference on Computational Molecular Biology (RECOMB), 2000 Apr 8-11; Tokyo, Japan, 2000: p. 263-272.
107. Wang, J., X. Zhou, P. Bradley, N. Perrimon, and S.T.C. Wong, Phenotype recognition for high-content RNAi genome-wide screening. Molecular Screening, 2008. 13(1): p. 29-39.
108. Chuang, H.Y., et al., Network-based classification of breast cancer metastasis. Mol Syst Biol, 2007. 3: p. 140.
109. Zhao, X.M., et al., Uncovering signal transduction networks from high-throughput data by integer linear programming. Nucleic Acids Res, 2008. 36(9): p. e48.
110. Xenarios, I., et al., DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Res, 2002. 30(1): p. 303-5.
111. Bader, G.D., D. Betel, and C.W. Hogue, BIND: the Biomolecular Interaction Network Database. Nucleic Acids Res, 2003. 31(1): p. 248-50.
112. Mewes, H.W., et al., MIPS: a database for genomes and protein sequences. Nucleic Acids Res, 2002. 30(1): p. 31-4.
113. Chatr-aryamontri, A., et al., MINT: the Molecular INTeraction database. Nucleic Acids Res, 2007. 35(Database issue): p. D572-4.
114. Hermjakob, H., et al., IntAct: an open source molecular interaction database. Nucleic Acids Res, 2004. 32(Database issue): p. D452-5.
115. Shen-Orr, S.S., et al., Network motifs in the transcriptional regulation network of Escherichia coli. Nat Genet, 2002. 31(1): p. 64-8.
116. Vazquez, A., et al., The topological relationship between the large-scale attributes and local interaction patterns of complex networks. Proc Natl Acad Sci U S A, 2004. 101(52): p. 17940-5.
117. Kanehisa, M., et al., KEGG for linking genomes to life and the environment. Nucleic Acids Res, 2008. 36(Database issue): p. D480-4.
118. Parvin, B., et al., Iterative voting for inference of structural saliency and characterization of subcellular events. IEEE Trans Image Process, 2007. 16(3): p. 615-23.
119. Otsu, N., A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man and Cybernetics, 1979. 9(1): p. 62-66.
120. Chan, T. and L. Vese, Active contours without edges. IEEE Transactions on Image Processing, 2001. 10: p. 266-277.
121. Caselles, V., R. Kimmel, and G. Sapiro, Geodesic Active Contours. International Journal of Computer Vision, 1997. 22: p. 61-79.
122. Teh, C.H. and R.T. Chin, On image analysis by the methods of moments. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1988. 10(4): p. 496-513.
123. Canterakis, N., 3D Zernike moments and Zernike affine invariants for 3D image analysis and recognition. 11th Scandinavian Conf. on Image Analysis, 1999.
124. Kurani, A., et al., Co-occurrence matrices for volumetric data. The 7th IASTED International Conference on Computer Graphics and Imaging - CGIM 2004, Kauai, Hawaii, USA, 2004.
125. Tesar, L., et al., 3D extension of Haralick texture features for medical image analysis. Proceedings of SPPRA, 2007.
126. Tesar, L., et al., Medical image analysis of 3D CT images based on extension of Haralick texture features. Comput Med Imaging Graph, 2008. 32(6): p. 513-20.
127. Kazhdan, M., T. Funkhouser, and S. Rusinkiewicz, Rotation invariant spherical harmonic representation of 3D shape descriptors. Eurographics Symposium on Geometry Processing, 2003.
128. Loo, L.H., L.F. Wu, and S.J. Altschuler, Image-based multivariate profiling of drug responses from single cells. Nat Methods, 2007. 4(5): p. 445-53.
129. Young, D.W., et al., Integrating high-content screening and ligand-target prediction to identify mechanism of action. Nat Chem Biol, 2008. 4: p. 59-68.
130. Perlman, Z., et al., Multidimensional drug profiling by automated microscopy. Science, 2004. 306: p. 1194-8.
131. Wang, J., et al., Classify Cellular Phenotype in High-Throughput Fluorescence Microscopy Images for RNAi Genome-Wide Screening, in IEEE/NLM Life Science Systems & Applications Workshop. 2006: Bethesda, MD. p. 1-2.
132. Mao, Y., et al., Multiclass cancer classification by using fuzzy support vector machine and binary decision tree with gene selection. J Biomed Biotechnol, 2005. 2005(2): p. 160-71.
133. Zhou, X., et al., Towards automated cellular image segmentation for RNAi genome-wide screening. Med Image Comput Comput Assist Interv Int Conf Med Image Comput Comput Assist Interv, 2005. 8(Pt 1): p. 885-92.
134. Chen, C., et al., Constraint factor graph cut-based active contour method for automated cellular image segmentation in RNAi screening. J Microsc, 2008. 230(Pt 2): p. 177-91.
135. Ripley, B.D., The second-order analysis of stationary point processes. Journal of Applied Probability, 1976. 13: p. 255-266.
136. Xiong, G., et al., Automated neurite labeling and analysis in fluorescence microscopy images. Cytometry A, 2006. 69(6): p. 494-505.
137. Plank, M.J. and B.D. Sleeman, A reinforced random walk model of tumour angiogenesis and anti-angiogenic strategies. Math Med Biol, 2003. 20(2): p. 135-81.
138. Plank, M.J. and B.D. Sleeman, Lattice and non-lattice models of tumour angiogenesis. Bull Math Biol, 2004. 66(6): p. 1785-819.
139. Anderson, A.R. and M.A. Chaplain, Continuous and discrete mathematical models of tumor-induced angiogenesis. Bull Math Biol, 1998. 60(5): p. 857-99.
140. Chaplain, M.A., Mathematical modelling of angiogenesis. J Neurooncol, 2000. 50(1-2): p. 37-51.
141. Li, X., et al., Nonlinear three-dimensional simulation of solid tumor growth. Discrete and Continuous Dynamical Systems B, 2007. 7(3): p. 581-604.
142. McDougall, S.R., A.R. Anderson, and M.A. Chaplain, Mathematical modelling of dynamic adaptive tumour-induced angiogenesis: clinical implications and therapeutic targeting strategies. J Theor Biol, 2006. 241(3): p. 564-89.
143. Macklin, P. and J. Lowengrub, Nonlinear simulation of the effect of microenvironment on tumor growth. J Theor Biol, 2007. 245(4): p. 677-704.
144. Cristini, V., et al., Morphologic instability and cancer invasion. Clin Cancer Res, 2005. 11(19 Pt 1): p. 6772-9.
145. Frieboes, H.B., et al., An integrated computational/experimental model of tumor invasion. Cancer Res, 2006. 66(3): p. 1597-604.
146. Armstrong, P.B., Light and electron microscope studies of cell sorting in combinations of chick embryo neural retina and retinal pigment epithelium. Wilhelm Roux' Arch, 1971. 168: p. 125-141.
147. Anderson, A.R., A hybrid mathematical model of solid tumour invasion: the importance of cell adhesion. Math Med Biol, 2005. 22(2): p. 163-86.
148. Macklin, P., et al., An Integrative, Agent-Based Model of Breast Epithelial Cells, with Patient-Specific ex vivo Calibration to Ductal Carcinoma in Situ. J Theor Biol, 2009 (submitted).
149. Macklin, P., et al., Agent-Based Cell Model, with Application to Cancer, in Multiscale Modeling of Solid Tumor Growth, V. Cristini and J.S. Lowengrub, Editors. 2009 (accepted), Cambridge University Press: New York.
150. Zhang, L., L.L. Chen, and T.S. Deisboeck, Multi-scale, multi-resolution brain cancer modeling. Mathematics and Computers in Simulation, 2009. 79.
151. Zhang, L., C.A. Athale, and T.S. Deisboeck, Development of a three-dimensional multiscale agent-based tumor model: simulating gene-protein interaction profiles, cell phenotypes and multicellular patterns in brain cancer. J Theor Biol, 2007. 244(1): p. 96-107.
152. Macklin, P., et al., Applications of Agent-Based Modeling to Predictive Breast Cancer Research, in Multiscale Modeling of Solid Tumor Growth, V. Cristini and J.S. Lowengrub, Editors. 2009 (accepted), Cambridge University Press: New York.
153. Bold, K.A., et al., An equation-free approach to analyzing heterogeneous cell population dynamics. J Math Biol, 2007. 55(3): p. 331-52.
154. Kevrekidis, P.G., et al., Minimal model for tumor angiogenesis. Phys Rev E Stat Nonlin Soft Matter Phys, 2006. 73(6 Pt 1): p. 061926.
155. Makeev, A.G., et al., Coarse bifurcation analysis of kinetic Monte Carlo simulations: a lattice gas model with lateral interactions. J Chem Phys, 2002. 117: p. 8229-8240.
156. Setayeshgar, S., et al., Application of coarse integration to bacterial chemotaxis. SIAM Multiscale Model Sim, 2005. 4: p. 307-327.
157. Theodoropoulos, C., Y.H. Qian, and I.G. Kevrekidis, "Coarse" stability and bifurcation analysis using time-steppers: a reaction-diffusion example. Proc Natl Acad Sci U S A, 2000. 97(18): p. 9840-3.
158. Cristini, V., Collaborative Research: Multiscale Modeling of Solid Tumor Growth. 2008-2011, National Science Foundation.
159. Macklin, P., et al., An Integrative, Agent-Based Model of Breast Epithelial Cells, with Patient-Specific ex vivo Calibration. 2009.
160. Macklin, P., et al., Agent-Based Modeling of Ductal Carcinoma in Situ: Application to Patient-Specific Breast Cancer Modeling, in Computational Biology: Issues and Applications in Oncology, T. Pham, Editor. 2009, Springer.
161. Blank, U., G. Karlsson, and S. Karlsson, Signaling pathways governing stem-cell fate. Blood, 2008. 111(2): p. 492-503.
162. Katoh, M. and M. Katoh, WNT signaling pathway and stem cell signaling network. Clin Cancer Res, 2007. 13(14): p. 4042-5.
163. Katoh, M., Networking of WNT, FGF, Notch, BMP, and Hedgehog signaling pathways during carcinogenesis. Stem Cell Rev, 2007. 3(1): p. 30-8.
164. Heasley, L.E. and B.E. Petersen, Signalling in stem cells: meeting on signal transduction determining the fate of stem cells. EMBO Rep, 2004. 5(3): p. 241-4.
165. Hornberg, J.J., et al., Control of MAPK signalling: from complexity to what really matters. Oncogene, 2005. 24(36): p. 5533-42.
166. Coifman, R.R., et al., Geometric diffusions as a tool for harmonic analysis and structure definition of data: diffusion maps. Proc Natl Acad Sci U S A, 2005. 102(21): p. 7426-31.
167. Lafon, S. and A.B. Lee, Diffusion Maps and Coarse-Graining: A Unified Framework for Dimensionality Reduction, Graph Partitioning, and Data Set Parameterization. IEEE Trans. on PAMI, 2006. 28(9): p. 1393-1403.
168. Zhao, X.-M., et al., Uncovering signal transduction networks from high-throughput data by integer linear programming. Nucleic Acids Res, 2008. 36(9): p. e48.
169. Rao, A., et al., Inferring time-varying network topologies from gene expression data. EURASIP J Bioinform Syst Biol, 2007: p. 51947.
170. Schramm, G., et al., Using gene expression data and network topology to detect substantial pathways, clusters and switches during oxygen deprivation of Escherichia coli. BMC Bioinformatics, 2007. 8: p. 149.
171. Brun, C., et al., Functional classification of proteins for the prediction of cellular function from a protein-protein interaction network. Genome Biol, 2003. 5(1): p. R6.
172. Calin, G.A., et al., A MicroRNA signature associated with prognosis and progression in chronic lymphocytic leukemia. N Engl J Med, 2005. 353(17): p. 1793-801.
173. Maraziotis, I.A., K. Dimitrakopoulou, and A. Bezerianos, Growing functional modules from a seed protein via integration of protein interaction and gene expression data. BMC Bioinformatics, 2007. 8: p. 408.
174. Edgar, R., M. Domrachev, and A.E. Lash, Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res, 2002. 30(1): p. 207-10.
175. El-Kareh, A.W. and T.W. Secomb, A mathematical model for cisplatin cellular pharmacodynamics. Neoplasia, 2003. 5(2): p. 161-9.