Supplementary Material: MATERIALS AND METHODS

Hemgn promoter isolation and sequence analysis ―A mouse BAC genomic library CITB-CJ7-B (Research Genetics, Catalogue # 96021) was screened by PCR using the Hemgn-specific primers 5’-AAA CAC ACC TCT CTC CTA CCA C and 5’-CCT ACT TTC TGG GCT CCT TCT G. One of the BAC clones (# 567P14) contained the full-length Hemgn gene. Using clone # 567P14 as template, a Hemgn promoter fragment spanning 3171 bp (positions -2975 to +196) was amplified with pfu Turbo DNA polymerase (Stratagene, CA) in a PTC-200 thermal cycler. The Hemgn promoter sequence was confirmed by DNA sequencing. Sequence homology between the human and mouse Hemgn genes was analyzed using VISTA (http://dcode.org) (Fig. 1B) 11. The sequences of the mouse and human proximal promoters and first exons were aligned using ClustalW (http://www2.ebi.ac.uk/clustalw). The MatInspector program (http://www.gene-regulation.com/) was used to identify putative hematopoietic-specific transcription factor binding sites in the sequence (Fig. 1C).

Cell culture, transfections, and reporter assays ―To determine whether the Hemgn promoter is specifically activated in hematopoietic cells, we tested Hemgn promoter activity in several cell lines derived from different tissues. 10T1/2 (a mouse fibroblast cell line), PAC1 (a rat smooth muscle cell line), COS-7 (a monkey kidney cell line) and C2C12 (a mouse skeletal muscle cell line) were cultured in DMEM medium with 10% fetal bovine serum. One µg of the indicated Hemgn promoter-luciferase reporter construct was cotransfected into the cells in 6-well plates using Lipofectamine Plus reagent (Invitrogen), with 0.2 µg of pRL-SV40 Renilla luciferase reporter as an internal control. Cells were harvested 24 hours after transfection and luciferase activities were measured.

K562 cells (a human erythroleukemia cell line) were cultured in RPMI 1640 medium with 10% fetal bovine serum (Invitrogen) in a 5% CO2 incubator at 37°C.
Four µg of plasmid DNA of the Hemgn promoter-luciferase reporter constructs (see below) was transiently transfected into K562 cells in 6-well plates using DMRIE-C liposome reagent (Invitrogen), along with 200 ng of pRL-SV40 Renilla luciferase reporter as an internal control. Cells were harvested 2 days after transfection. Luciferase activities were measured using the Dual-Luciferase™ reporter assay system (Promega, Madison, WI, USA).

Drosophila Mel-2 cells (D. Mel-2, a Schneider S2 insect cell line) were purchased from Invitrogen (Carlsbad, CA) and maintained at 28°C in Schneider’s insect medium supplemented with 10% fetal bovine serum, 2 mM L-glutamine, 100 U/ml penicillin and 100 µg/ml streptomycin. D. Mel-2 cells were cotransfected with 1 µg of the Hemgn-luciferase reporter gene construct (pHemgn-2975+196luc) and GATA1 (50 to 400 ng pPacGATA1) 12 using Fugene™ 6 reagent (Roche Diagnostics Corporation, Indianapolis, IN). Cells were harvested after 24 h for luciferase assays using the Single Luciferase Assay System (Promega). Luciferase activities were normalized to total cell protein, measured by the Bio-Rad protein assay system.

CMS, an acute myeloid leukemia cell line [established from a 2-year-old girl with acute megakaryocytic leukemia (AMkL)] 13, was a gift from Dr. A. Fuse (National Institute of Infectious Diseases, Tokyo, Japan). The CMS cell line was maintained in RPMI 1640 containing 10% heat-inactivated iron-supplemented calf serum (Hyclone Labs), 2 mM L-glutamine, 100 U/ml penicillin and 100 µg/ml streptomycin in a humidified atmosphere at 37°C in the presence of 5% CO2/95% air. The AMkL cell line, Meg-01, was obtained from the American Type Culture Collection (Manassas, VA) and cultured in RPMI 1640 with 10% fetal bovine serum as instructed 14. The CMS cells were stably transfected with pcDNA3-GATA1 15 by electroporation (200 V, 1000 µF).
Forty-eight hours after transfection, the cells were plated in 96-well plates at a density of 1 cell/well in medium containing 1 mg/ml G418 (Sigma). Single colonies of stably transfected cells were selected after 3 weeks, expanded, and screened for GATA1 expression by real time RT-PCR (see below).

Construction of Hemgn promoter deletion and mutant constructs ―Deletion mutants of the Hemgn promoter were prepared by high-fidelity PCR using pfu Turbo DNA polymerase (Stratagene, La Jolla, CA, USA). The primers were designed to generate a BamH I site at the 5’ end and an Xho I site at the 3’ end of each PCR product, which allowed directional cloning of the PCR fragments into the luciferase reporter vector pXP1 16. The constructs were verified by sequencing. The primers for amplifying the Hemgn promoter and its deletion mutants were:
pHemgn-2975+196: 5’-CGC GGA TCC CAC ATC AGA GAC ACC TTG CC and 5’-CCG CTC GAG GGT ATT GGC TTT GAC TTC AC;
pHemgn-831+196: 5’-CGC GGA TCC TTG AAC TAG GGT GGC TCT GG and 5’-CCG CTC GAG GGT ATT GGC TTT GAC TTC AC;
pHemgn-404+196: 5’-CGC GGA TCC AAC AGC CTA CCT AGG AAG AG and 5’-CCG CTC GAG GGT ATT GGC TTT GAC TTC AC;
pHemgn-2975+12: 5’-CGC GGA TCC CAC ATC AGA GAC ACC TTG CC and 5’-CCG CTC GAG ACA CTG CAC AGG TGT GAG GG.

The core GATA binding sequences in the pHemgn-2975+196luc construct were mutated to cATA, a change reported to abolish GATA binding 17,18,19, using the QuikChange™ site-directed mutagenesis kit (Stratagene, La Jolla, CA, USA). The primer for the GATAbox1 mutation was: 5’-CCT CAC ACC TGT GCA GTG TcA TAA AGA AAG TGG. The primer for the GATAbox2 mutation was: 5’-GCT GTG GTC TAA CCA cAT AAA ACT TTT AGG CGG G. Double mutations of both boxes were also generated. The primer for the GATAbox3 mutation was: 5’-GTGTTGGTCTTaAgcTTTTCAAAAAGC.
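The cloning-primer design described above (a BamH I site on each forward primer and an Xho I site on each reverse primer) can be checked with a short script. The following is a minimal Python sketch, not part of the original methods: the primer sequences are copied from the text, while the function names `normalize` and `site_offset` are hypothetical helpers introduced only for illustration.

```python
# Recognition sequences of the two restriction enzymes named in the text.
BAMHI_SITE = "GGATCC"
XHOI_SITE = "CTCGAG"

# (forward, reverse) cloning primers as listed in the text, written 5'->3'.
PRIMERS = {
    "pHemgn-2975+196": ("CGC GGA TCC CAC ATC AGA GAC ACC TTG CC",
                        "CCG CTC GAG GGT ATT GGC TTT GAC TTC AC"),
    "pHemgn-831+196": ("CGC GGA TCC TTG AAC TAG GGT GGC TCT GG",
                       "CCG CTC GAG GGT ATT GGC TTT GAC TTC AC"),
    "pHemgn-404+196": ("CGC GGA TCC AAC AGC CTA CCT AGG AAG AG",
                       "CCG CTC GAG GGT ATT GGC TTT GAC TTC AC"),
    "pHemgn-2975+12": ("CGC GGA TCC CAC ATC AGA GAC ACC TTG CC",
                       "CCG CTC GAG ACA CTG CAC AGG TGT GAG GG"),
}


def normalize(primer):
    """Remove the triplet spacing and upper-case the sequence."""
    return primer.replace(" ", "").upper()


def site_offset(primer, site):
    """Return the 0-based position of the first occurrence of `site`, or -1."""
    return normalize(primer).find(site)


for name, (forward, reverse) in PRIMERS.items():
    # Each primer is expected to carry its site directly after a 3-nt 5' clamp.
    print(name,
          "BamH I offset:", site_offset(forward, BAMHI_SITE),
          "Xho I offset:", site_offset(reverse, XHOI_SITE))
```

In every listed primer the restriction site follows a three-nucleotide 5’ extension (CGC or CCG), and the remainder of the primer is the Hemgn-specific sequence.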
The mutant constructs (named mutGATA box1luc, mutGATA box2luc, mutGATA box3luc and mutGATA box1+2luc) were re-cloned into the pXP1 vector (to eliminate possible mutations in the vector) and were verified by sequencing.

Generation and analysis of Hemgn promoter in transgenic mice ―The Hemgn promoter (pHemgn-2975+196) was linked to a lacZ reporter. The DNA fragment containing the pHemgn-2975+196 promoter-linked lacZ was isolated and used to generate transgenic mice (Cold Spring Harbor Transgenic Core Facility). The transgenic mice were genotyped by PCR using lacZ-specific primers (forward: 5’-GTGGTGGTTATGCCGATCGC; reverse: 5’-TACCACAGCGGATGGTTCGG). RNA from embryos and tissues was isolated using Trizol (Invitrogen), and expression of the lacZ transgene was examined by RT-PCR using lacZ-specific primers. The transgenic fetal and adult tissues were fixed briefly in 1× PBS containing 2% formaldehyde (EMS) and 0.2% glutaraldehyde (Sigma). After three washes with 1× PBS, the embryos/tissues were stained for β-galactosidase activity in staining solution (5 mM ferrocyanide, 5 mM ferricyanide, 2 mM MgCl2, 1 mg/ml X-gal and 1× PBS) at 37°C for 6-24 hours. The stained tissues were embedded in frozen O.C.T. medium (Diagnostics Division, Elkhart, USA) and sectioned at 10 µm using a MICROM GmbH instrument (Walldorf, Germany). After hematoxylin/eosin staining of the frozen sections, the distribution of X-gal staining in the tissue sections was captured with a camera (model: RT slider 2.3.1, Diagnostic Instruments Inc.) mounted on a Carl Zeiss (AX10phot) microscope (Göttingen, Germany) using the SPOT software (Diagnostic Instruments, version 4.5.8).

Gel shift and ChIP assays ―Gel shift assays were performed as described previously 20. Nuclear extracts (3 µg) from a mouse erythroleukemia cell line (MEL) (Active Motif, Carlsbad, CA) were used for each DNA binding reaction in a volume of 20 µl. The double-stranded oligonucleotides used were as follows (upper strand written 5’ to 3’, lower strand written 3’ to 5’):
GATAsite1: 5’-CTGTGCAGTGTGATAAAGAAAGTGG and GACACGTCACACTATTTCTTTCACC-5’;
mutGATAsite1: 5’-CTGTGCAGTGTCTTAAAGAAAGTGG and ACACGTCACAGAATTTCTTTCACC-5’;
GATAsite2: 5’-GGTCTAACCAGATAAAACTTTTAGG and CCAGATTGGTCTATTTTGAAAATCC-5’;
mutGATAsite2: 5’-GGTCTAACCACTTAAAACTTTTAGG and CCTGATTGGTGAATTTTGAAAATCC-5’.
These oligonucleotides were end-labeled by T4 kinase with [γ-32P] dATP. The binding reaction buffer contained 10 mM Tris (pH 7.5), 50 mM NaCl, 1 mM dithiothreitol, 1 mM EDTA, 5% glycerol and 1 µg of poly(dI-dC). The binding reactions were conducted at room temperature for 30 min. The samples were electrophoresed at 200 V at room temperature on a 5% polyacrylamide gel in 0.5× Tris-borate-EDTA buffer. For the competition studies, a 100-fold molar excess of unlabeled oligonucleotide was included in the binding reaction. For supershift analysis, 2 µg of anti-GATA1 or anti-GATA2 antibody (Santa Cruz Biotechnology) was incubated with the nuclear extract at room temperature for 20 min before the binding reaction.

The ChIP assay was performed as described previously 14. Purified chromatin fragments from Meg-01 (an AMkL cell line) were incubated with anti-GATA1 antibodies (C-20, Santa Cruz, Inc.). Standard PCR for the HEMGN promoter region was performed using forward (5’-CCAGACACTTCCTGGCAGAT) and reverse (5’-CACTTGACTTCCCGCCTAAA) primers spanning positions -141 to +89. A coding region (exon 3) of the human GATA1 gene was also amplified, using forward (5’-TGGAGACTTTGAAGACAGAGCGGCTGAG) and reverse (5’-GAAGCTTGGGAGAGGAATAGGCTGCTGA) primers, to validate the specificity of the ChIP assays.

Western blot analyses ―The presence of GATA1 and GATA2 in MEL nuclear extracts was detected by Western blot assays, as described before 21. Proteins separated by SDS-PAGE were transferred to nitrocellulose membranes (Trans-Blot Transfer Medium, Bio-Rad) and subsequently probed with anti-GATA1 or anti-GATA2 antibodies (Santa Cruz Biotechnology).
Primary antibodies were detected with POD-conjugated anti-goat IgG and a BM Chemiluminescence Western Blotting kit (Roche).

Real time RT-PCR ―Total RNA was extracted from mock-transfected and GATA1 stably transfected CMS cells, or from primary myeloblasts of patients with newly diagnosed acute myeloid leukemia, using TriReagent (Molecular Research Center, Inc.). First-strand cDNAs were prepared from 1 µg RNA using random hexamer primers and an RT-PCR kit (Perkin Elmer), and purified with the QIAquick PCR Purification Kit (Qiagen). GATA1, EDAG and 18S RNA transcript levels were quantitated using a LightCycler real-time PCR machine (Roche). PCR reactions contained 2 µl purified cDNA or standard plasmid, 4 mM MgCl2, 0.5 µM each of sense and antisense primers, and 2 µl FastStart DNA Master SYBR Green I enzyme-SYBR reaction mix (Roche). The primers were: EDAG coding sequence, sense (5’-TGTGCCAAGAAGCTGCTGTA-3’) and antisense (5’-TGGTTCTGCTGGATTTTGGT-3’); GATA1 coding sequence, sense (5’-TGGAGACTTTGAAGACAGAGCGGCTGAG-3’) and antisense (5’-GAAGCTTGGGAGAGGAATAGGCTGCTGA-3’); and 18S RNA, sense (5’-GATGCGGGGCGTTATT-3’) and antisense (5’-TGAGGTTTCCCGTGTTGTCA-3’). PCR conditions consisted of an initial denaturing step of 95°C for 10 min, amplification with 35-55 cycles of 95°C, 59-63°C for 10 s, and 72°C for 5 s, followed by melting curve analysis from 40°C to 99°C, and a final cooling step to 40°C. External standard curves for EDAG and GATA1 were constructed using serial dilutions of linearized EDAG and GATA1 cDNA constructs. An 18S RNA plasmid construct was prepared by cloning the 18S RNA amplicon, amplified from K562 cDNA with commercial primers (Ambion), into the pGEM T-Easy vector (Promega). This plasmid construct was linearized with ApaI; serial dilutions were then prepared and quantitated by real-time PCR, and the data were used to prepare the 18S RNA external standard curve. EDAG and GATA1 transcript levels were expressed relative to 18S RNA.
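As an illustration of the external-standard-curve quantitation described above, the following minimal Python sketch fits crossing-point values against log10 template amount for a dilution series, interpolates an unknown sample from the fitted line, and expresses a target transcript relative to 18S RNA. It is a sketch under stated assumptions rather than the analysis actually used: the function names and every numerical value are hypothetical and chosen only for illustration.

```python
def fit_standard_curve(log10_amounts, crossing_points):
    """Least-squares line: crossing_point = slope * log10(amount) + intercept."""
    n = len(log10_amounts)
    if n != len(crossing_points) or n < 2:
        raise ValueError("need at least two paired standards")
    mean_x = sum(log10_amounts) / n
    mean_y = sum(crossing_points) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    if sxx == 0:
        raise ValueError("standards must span more than one dilution")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_amounts, crossing_points))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_crossing_point(crossing_point, slope, intercept):
    """Invert the standard curve to recover the template amount."""
    return 10 ** ((crossing_point - intercept) / slope)


def relative_level(target_amount, reference_amount):
    """Express a target transcript relative to the reference (here 18S RNA)."""
    return target_amount / reference_amount


# Hypothetical 10-fold dilution series (log10 copies) and crossing points.
standards_log10 = [6, 5, 4, 3]
standards_cp = [15.0, 18.5, 22.0, 25.5]
slope, intercept = fit_standard_curve(standards_log10, standards_cp)

# Hypothetical unknown sample read off the fitted curve.
target = amount_from_crossing_point(22.0, slope, intercept)
print("slope:", slope, "intercept:", intercept, "target amount:", target)
```

In practice a separate curve would be fitted for each amplicon (EDAG, GATA1 and 18S RNA), and each sample's EDAG or GATA1 amount would be divided by its 18S RNA amount.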
Real-time PCR results were expressed as mean values from 2-3 experiments using the same cDNA preparation.

Clinical AML samples ―Myeloblasts from children diagnosed with AML were obtained from the Children's Hospital of Michigan leukemia cell bank. Mononuclear cells were isolated on Ficoll-Hypaque gradients to obtain highly purified mononuclear cell fractions consisting mostly of leukemic blasts. Total RNA was extracted from the samples using TRIzol reagent (Life Technologies). The research protocol was approved by the Human Investigation Committee of Wayne State University School of Medicine.

Statistical analysis ―The nonparametric Spearman rank correlation coefficient was used to analyze the relationship between HEMGN and GATA1 transcript levels. Statistical analyses were performed with StatView (Version 4.5, for Windows).
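For readers unfamiliar with the statistic, the Spearman rank correlation coefficient is the Pearson correlation computed on the ranks of the two variables. The following self-contained Python sketch shows that calculation, with tied values assigned the mean of their ranks. It is an illustration only: the original analysis was run in StatView, the function names are hypothetical, and the transcript levels in the example are invented placeholder values, not study data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with the value at order[i].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (sd_x * sd_y)


# Invented placeholder transcript levels (relative to 18S RNA), one per sample.
hemgn_levels = [0.8, 2.1, 0.3, 5.6, 1.4]
gata1_levels = [1.0, 2.5, 0.2, 4.9, 2.0]
print("Spearman rho:", spearman_rho(hemgn_levels, gata1_levels))
```

Because only ranks enter the calculation, the coefficient is insensitive to the skewed distributions typical of transcript-level measurements.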