Genome-wide Regulation of Gene Expression and Transcription Factor Binding during Human Hematopoiesis

Hematopoiesis
• Every functionally specialized mature blood cell develops from a common stem cell. A hematopoietic stem cell is pluripotent, able to differentiate along a number of different pathways and thereby generate erythrocytes, granulocytes, monocytes, mast cells, lymphocytes, and megakaryocytes.
  – Few in number: approximately one stem cell per 10^4 bone marrow cells
• Early in hematopoiesis, a pluripotent stem cell differentiates along one of two pathways, becoming either a lymphoid stem cell or a myeloid stem cell.
  – Each type of stem cell can further differentiate into progenitor cells, which have lost the capacity for self-renewal and are committed to a specific cell lineage.
  – Lymphoid stem cells generate B and T cell progenitors
  – Myeloid stem cells generate progenitor cells for:
      Red blood cells – erythrocytes
      White blood cells – neutrophils, eosinophils, basophils, monocytes, and mast cells
      Platelets
• Progenitor commitment depends on responsiveness to particular growth factors. When the appropriate signal is present, progenitor cells proliferate and differentiate.

HL60
[Figure: HL60 and treatment agents: Vitamin D2, phorbol myristate acetate (PMA/TPA), M-CSF, all-trans retinoic acid (ATRA), 9-cis-retinoic acid, DMSO]

HL60 cell line
• Acute myeloid leukemia
• No PML-RAR translocation
• Like most AML cells, fail to express secondary granule protein genes late in differentiation
• RA treatment and VitD treatment: clinical trials in patients with AML

Experimental Setup
• HL60 cells are treated with ATRA and harvested at 0, 2, 8, and 32 hours.
• For each time point:
  – Cells are phenotyped by:
      • CD11b cell surface expression
      • NBT reduction (phagocytic activity potential)
  – Cells are counted and tested for viability (Trypan Blue exclusion)
  – Cells are harvested for RNA
  – Cells are crosslinked and processed for chromatin preparation.
CD11b Expression: A hallmark cell surface marker indicating differentiation toward granulocyte lineages
Kansas et al. 1990 Blood 9(12):2483-92
• Cell differentiation was analyzed by flow cytometry using an Alexa Fluor 488-conjugated anti-CD11b antibody and an Alexa Fluor 488 isotype-specific control antibody.
• Samples were prepared in triplicate for each time point, and 10,000 events were counted for each time point.

Nitroblue Tetrazolium Chloride (NBT) Reduction Assay
[Figure: NBT reduction by superoxide (O2-)]
[Figure: NBT Reduction Assay; % cells positive vs. time after ATRA treatment (0, 2, 8, 32 hours)]

HL60 differentiation was performed in triplicate and both chromatin and RNA were prepared from each time point
[Figure: Replicates #1, #2, and #3; % cells positive for NBT and CD11b vs. time after ATRA treatment (0, 2, 8, 32 hours)]

Average HL60 differentiation over all replicates
[Figure: % positive cells for NBT and CD11b vs. time after ATRA treatment (0, 2, 8, 32 hours)]

Chromatin IPs for ENCODE Arrays
• Retinoic Acid Receptor α (RARα): Cellular receptor protein for retinoic acid ligands. Binds to retinoic acid response elements (RARE) in target regions to activate and sometimes repress transcription. Although several isoforms of RAR are present in mammalian systems, RARα appears to have the central role in RA-induced granulopoiesis.
• Ezh2: Homo sapiens enhancer of zeste homolog 2 (Drosophila). A member of the polycomb family and found in a complex called HPC2.
Shown to be the methylase component of polycomb complexes and to be associated primarily with H3K27 tri-methylation but also, to some extent, with H3K9 tri-methylation.
• SirT1: Human homolog of the S. cerevisiae Sir2 protein, known to be involved in cell aging and in the response to DNA damage. Binds and deacetylates the p53 protein with specificity for its C-terminal Lys382 residue, modification of which has been implicated in the activation of p53 as a transcription factor.

Chromatin IPs for ENCODE Arrays
• RNA polymerase II
• Acetylated histone H4: associated with gene activation
• Histone H3K9 di-methylated: associated with repressed genes
• Histone H3K9 tri-methylated: associated with repressed and silenced genes (e.g. Suv39H1 and polycomb)
• Histone H3K27 di-methylated: associated with repressed genes
• Histone H3K27 tri-methylated: associated with repressed and silenced genes (e.g. polycomb)
• Histone H4K20 di-methylated: associated with repressed and silenced genes
• HA: Negative control

• RPL10: ribosomal protein L10
  – This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L10E family of ribosomal proteins. It is located in the cytoplasm.
  – In vitro studies have shown that the chicken protein can bind to c-Jun and can repress c-Jun-mediated transcriptional activation, but these activities have not been demonstrated in vivo.
  – This gene has been referred to as 'laminin receptor homolog' because a chimeric transcript consisting of sequence from this gene and sequence from the laminin receptor gene was isolated; however, it is not believed that this gene encodes a laminin receptor.
  – Transcript variants utilizing alternative polyA signals exist. The variant with the longest 3' UTR overlaps the deoxyribonuclease I-like 1 gene on the opposite strand. This gene is co-transcribed with the small nucleolar RNA gene U70, which is located in its fifth intron.
As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome.
  – Downregulated during adipocyte, kidney, and heart differentiation.
  – Not well expressed in non-differentiated HL60 cells as determined by prior array analysis.
[Figure: RPL10 gene, Chr. X. Tracks: RNA at 0, 2, 8, and 32 hours; RNAP II at 0 and 32 hours; H4-Ac at 0 and 32 hours]

• PSMB4: proteasome (prosome, macropain) subunit, beta type, 4
  – The proteasome is a multicatalytic proteinase complex with a highly ordered ring-shaped 20S core structure.
  – The core structure is composed of 4 rings of 28 non-identical subunits; 2 rings are composed of 7 alpha subunits and 2 rings are composed of 7 beta subunits.
  – Proteasomes are distributed throughout eukaryotic cells at a high concentration and cleave peptides in an ATP/ubiquitin-dependent process in a non-lysosomal pathway.
  – An essential function of a modified proteasome, the immunoproteasome, is the processing of class I MHC peptides. This gene encodes a member of the proteasome B-type family, also known as the T1B family, that is a 20S core beta subunit.
  – Expressed in HL60 cells as determined by prior array analysis.
[Figure: PSMB4 gene, Chr. 1. Tracks: RNA at 0, 2, 8, and 32 hours; RNAP II at 0 and 32 hours; H4-Ac at 0 and 32 hours]

• EEF1A1: Eukaryotic translation elongation factor 1 alpha 1
  – This gene encodes an isoform of the alpha subunit of the elongation factor-1 complex, which is responsible for the enzymatic delivery of aminoacyl tRNAs to the ribosome.
  – This isoform (alpha 1) is expressed in brain, placenta, lung, liver, kidney, and pancreas, and the other isoform (alpha 2) is expressed in brain, heart, and skeletal muscle.
  – This isoform is identified as an autoantigen in 66% of patients with Felty's syndrome.
  – This gene has been found to have multiple copies on many chromosomes, some of which, if not all, represent different pseudogenes.
  – Exhibits moderate expression in undifferentiated HL60 cells as determined by prior microarray analysis.
[Figure: EEF1A1 gene, Chr. 6. Tracks: RNA at 0, 2, 8, and 32 hours; RNAP II at 0 and 32 hours; H4-Ac at 0 and 32 hours]

Representative gene for Retinoic Acid Receptor α association
• IRF-1: interferon regulatory factor 1
  – IRF1 encodes interferon regulatory factor 1, a member of the interferon regulatory transcription factor (IRF) family.
  – IRF1 serves as an activator of interferon alpha and beta transcription, and in mouse it has been shown to be required for double-stranded RNA induction of these genes.
  – Specifically binds to the upstream regulatory region of type I IFN and IFN-inducible MHC class I genes (the interferon consensus sequence (ICS)) and activates those genes.
  – IRF1 also functions as a transcription activator of genes induced by interferons alpha, beta, and gamma.
  – Further, IRF1 has been shown to play roles in regulating apoptosis and tumor suppression.
  – Induced by viruses, IFN, and retinoic acid.
  – Deletion or rearrangement of IRF1 is a cause of preleukemic myelodysplastic syndrome (MDS) and of acute myelogenous leukemia (AML).
  – Not well expressed in uninduced HL60 cells as shown in prior array analysis.
[Figure: IRF-1 gene, Chr. 5. Tracks: RNA at 0, 2, 8, and 32 hours; RNAP II at 0 and 32 hours; H4-Ac at 0 and 32 hours; RARα at 0 and 32 hours]

• IRAK1: Interleukin-1 receptor-associated kinase 1
  – IRAK1 encodes the interleukin-1 receptor-associated kinase 1, one of two putative serine/threonine kinases that become associated with the interleukin-1 receptor (IL1R) upon stimulation.
  – IRAK1 is partially responsible for IL1-induced upregulation of the transcription factor NF-kappa B.
  – Well expressed in HL60 cells as determined by prior array analysis.
[Figure: IRAK1 gene, Chr. X. Tracks: RNA at 0, 2, 8, and 32 hours; RNAP II at 0 and 32 hours; H4-Ac at 0 and 32 hours]

Examples of SirT1 potential targets as determined by ENCODE array analysis
• RFX5:
  – A lack of MHC-II expression results in a severe immunodeficiency syndrome called MHC-II deficiency, or the bare lymphocyte syndrome (BLS).
  – At least 4 complementation groups have been identified in B-cell lines established from patients with BLS.
  – The molecular defects in complementation groups B, C, and D all lead to a deficiency in RFX, a nuclear protein complex that binds to the X box of MHC-II promoters.
  – The lack of RFX binding activity in complementation group C results from mutations in the RFX5 gene encoding the 75-kD subunit of RFX. RFX5 is the fifth member of the growing family of DNA-binding proteins sharing a novel and highly characteristic DNA-binding domain called the RFX motif.
  – Not well expressed in undifferentiated HL60 cells as determined by previous array analysis.
[Figure: RFX5 gene, Chr. 1. Tracks: RNA at 0, 2, 8, and 32 hours; RNA pol II at 0 and 32 hours; H4 tetra-Ac at 0 and 32 hours; SirT1 at 0 and 32 hours]

Examples of Polycomb association
• GMPPA: GDP-mannose pyrophosphorylase A
  – This gene is thought to encode a GDP-mannose pyrophosphorylase.
  – This enzyme catalyzes the reaction that converts mannose-1-phosphate and GTP to GDP-mannose, which is involved in the production of N-linked oligosaccharides.
• MDFI: MyoD family inhibitor
  – This protein is a transcription factor that negatively regulates other myogenic family proteins.
  – Inhibits the transactivation activity of the MyoD family of myogenic factors and represses myogenesis. Acts by associating with MyoD family members and retaining them in the cytoplasm by masking their nuclear localization signals. Can also interfere with the DNA-binding activity of MyoD family members. Plays an important role in trophoblast and chondrogenic differentiation.
  – Knockout mouse studies show defects in the formation of vertebrae and ribs that also involve cartilage formation in these structures.
  – Moderately expressed in HL60 cells as determined by prior array analysis.
[Figure: GMPPA gene, Chr. 2. Tracks: RNA at 0 and 32 hours; H3K27 di-Meth at 0 and 32 hours; RNAP II at 0 and 32 hours; H4-Ac at 0 and 32 hours; Ezh2 at 0 and 32 hours; H3K27 tri-Meth at 0 and 32 hours]

Histone H3K9 di-methylation
• PIP5K1A: 68 kDa type I phosphatidylinositol-4-phosphate 5-kinase alpha
  – Kinase activity: 1-phosphatidylinositol-4-phosphate 5-kinase activity
  – Transferase activity
  – Glycerophospholipid metabolism
  – Signal transduction
  – Not well expressed in undifferentiated HL60 cells as determined by prior array analysis.
[Figure: PIP5K1A gene, Chr. 1. Tracks: H3K9 di-Meth at 0 and 32 hours; H3K9 tri-Meth at 0 and 32 hours; H4K20 di-Meth at 0 and 2 hours; SirT1 at 0 and 32 hours; RARα at 0 and 32 hours]

Histone H3K9 tri-methylation?

Histone H4K20 di-methylation?

Future Work
• Finish processing RNA for biological replicates #2 and #3.
• Repeat ChIP-chip for RNAP II, H4-Ac, and H3K27 di-methylation for all time points in all three biological replicates.
• Hybridize to ENCODE arrays.
• Hybridize to Whole Genome Array.
• Potentially pursue the Ezh2 and SirT1 antibodies for Whole Genome Array.
• Do real-time PCR at several genes to determine whether the H3K9 methylation and H4K20 methylation ChIPs are actually working.

• CCAAT/Enhancer Binding Protein α (C/EBPα)
  – Within haematopoiesis, α, β, and δ are primarily expressed in granulocyte, monocyte, and eosinophil lineages. C/EBPα is the isoform primarily expressed in immature granulocytes.
  – Target Genes: MPO (myeloperoxidase), M-CSF receptor, neutrophil elastase (ELA2), Lactoferrin (LF), human neutrophil defensins (HNP), G-CSF receptor, PU.1, MBN (serine protease), Azurocidin (serine protease)
• C/EBPβ
  – Expressed in a wide variety of cells including hepatocytes and haematopoietic cells. A decrease in α correlates with an increase in β. Low expression in immature cells and highest in neutrophils and macrophages.
  – Target Genes: ELA2 and IL-6
• C/EBPγ
  – Dominant inhibitory due to its ability to dimerize with other C/EBPs and its lack of intact basic regions. Ubiquitously expressed, but the highest expression level is in B cells. Currently no role described in myelopoiesis.
• C/EBPδ
  – Shows a similar expression pattern to β but is a stronger activator.
  – Target Genes: ELA2
• C/EBPε
  – Found in later-stage granulocytes and T cells. Has both an activation and a repression domain. In the knockout, neutrophils do not develop secondary granules.
  – Target Genes: LF and HNP

• CCAAT displacement factor (CDP)
  – Antagonizes C/EBP by competing for binding sites. Should be inactivated as cells go to terminal differentiation.
  – Target Genes: Gp91-phox, C/EBPε, HNP, LF
• HoxA10
  – Homeobox domain protein that binds as a dimer with PBX.
  – May play a role analogous to CDP in granulopoiesis. Preferentially expressed in myeloid cell lines but not present in mature neutrophils or monocytes. -/- mice have increased numbers of monocytes.
  – Target Genes: Gp91-phox, p21
• Retinoic acid receptor (RAR α, β, γ)
  – All RARs (especially α) are expressed in haematopoietic cells. Dominant-negative mutants of α result in the inability of myeloid precursor cells to undergo neutrophil or monocyte differentiation.
  – Target Genes: IRF-1, C/EBPε, Folate Receptor β
• c-Myb
  – Expressed in immature myeloid, lymphoid, and erythroid cells. Myb -/- mice lack all of these cell lineages. Expression patterns indicate that the gene has a primary role in haematopoiesis. CD34+ progenitor cells have the highest levels, but c-myb is not expressed in mature granulocytes.
  – Target Genes: MPO, ELA2, MBN, Azurocidin, CD11b, CD34, c-myc, cdc2, c-myb, mim-1

• PU.1
  – Ets family transcription factor with a glutamine-rich transactivation domain.
  – Interacts with GATA1 and inhibits activation of gene expression.
  – Expression levels increase during granulocytic and monocytic differentiation.
  – Specifically represses erythroid genes to push cells toward myeloid lineages. Expressed in B, granulocytic, and monocytic cells.
  – Target Genes: M-CSF receptor, ELA2, MPO, MBN (serine protease), azurocidin (serine protease), Gp91-phox, C/EBPε, HNP, and LF
• GATA1
  – Lowered levels after cell lineage commitment signal neutrophil/eosinophil development rather than erythroid. Critical regulator of erythroid, megakaryocytic, eosinophil, and mast cell differentiation.
  – Target Genes: GATA1, GATA2, α-globin
• GATA2
  – Essential for haematopoietic stem and progenitor cell function.
  – GATA1 and GATA2 seem to be reciprocally expressed in haematopoietic cells.
  – Target Genes: GATA1, GATA2, α-globin
• AML/CBFA
  – Large family defined by homology to a runt domain. Members heterodimerize with CBFβ to bind to DNA and interact with Ets-1 and C/EBP.
  – Deletion of AML1 results in embryonic death as a result of haematopoietic defects. Target for chromosomal translocations in many acute myeloid leukemias. Expression is largely restricted to myeloid and lymphoid cells that are CD34 positive. Stimulates the G1 to S transition in myeloid cells.
  – Target Genes: MPO, ELA2, NF3 (Defensin), M-CSF receptor,