Supplementary Information
TABLE S1: GENES ON CHROMOSOME 8p: LOCALIZATION AND DESCRIPTION
GENE INFORMATION: LOCUS, BASES, NAME, OTHER NAMES, Gene ID, MIM, Gene type; SUMMARY
1 8p23.3 7101-11265 hypothetical LOC728836 LOC728836 728836 protein coding 2 8p23.3 106086-107024 OR4F21: olfactory receptor, family 4, subfamily F, member 21 OR4F21P, olfactory receptor, family 4, subfamily F, member 21 pseudogene 441308 protein coding 3 8p23.3 140551-131884 protein coding 8p23.3 148344-172227 644128 pseudogene 5 8p23.3 172200-187339 LOC100132317; similar to C20orf69 protein FLJ45055; FLJ45055; DKFZp434B1135 FLJ36123 100132317 4 LOC100132317 similar to C20orf69 protein FLJ45055: 60S ribosomal pseudogene ZNF596: zinc finger protein 596 169270 protein coding 6 8p23.3 172367-173692 LOC728686: hypothetical LOC728686 728686 pseudogene 7 8p23.3 315931-318394 protein coding 8p23.3 318394-317776 family with sequence similarity 87, member A FLJ40008 157693 8 400728 protein coding 9 8p23.3 346808-409876 FAM87A: family with sequence similarity 87, member A FAM87B: family with sequence similarity 87, member B FBXO25: F-box protein 25 FBX25, MGC20256, MGC51975 26260 609098 protein coding
Olfactory receptors interact with odorant molecules in the nose, to initiate a neuronal response that triggers the perception of a smell. The olfactory receptor proteins are members of a large family of G-protein-coupled receptors (GPCR) arising from single coding-exon genes. Olfactory receptors share a 7-transmembrane domain structure with many neurotransmitter and hormone receptors and are responsible for the recognition and G protein-mediated transduction of odorant signals. The olfactory receptor gene family is the largest in the genome. The nomenclature assigned to the olfactory receptor genes and proteins for this organism is independent of other organisms.
GENE ONTOLOGY: FUNCTION, PROCESS, COMPONENT
olfactory receptor activity receptor activity G-protein coupled receptor protein signaling pathway response to stimulus sensory perception of smell signal transduction integral to membrane plasma membrane DNA binding metal ion binding zinc ion binding regulation of transcription, DNA-dependent transcription intracellular nucleus integral to membrane membrane
This gene encodes a member of the F-box protein family which is characterized by an approximately 40 amino acid motif, the F-box. The F-box proteins constitute one of the four subunits of ubiquitin protein ligase complex called SCFs (SKP1-cullin-F-box), which function in phosphorylation-dependent ubiquitination. The F-box proteins are divided into 3 classes: Fbws containing WD-40 domains, Fbls containing leucine-rich repeats, and Fbxs containing either different protein-protein interaction modules or no recognizable motifs. The protein encoded by this gene belongs to the Fbxs class. Three alternatively spliced transcript variants encoding distinct isoforms have been found for this gene.
ubiquitin-protein ligase activity protein ubiquitination nucleus ubiquitin ligase complex
10 8p23.3 11 8p23.3 431646-485331 12 8p23.3 597527-599962 13 8p23.3 599962-597527 14 8p23.3 15 8p23.3 604200-671226 678545-680374 16 8p23.3 677651-1250805 17 8p23.3 1436976-1644049 18 8p23.3 1665779-1692574 Q8NB26: LOC286161 hypothetical protein LOC28616 C8orf42: chromosome 8 open reading frame 42 LOC389607: hypothetical gene supported by AK128318 Q6ZRD0: hypothetical gene supported by AK128318 ERICH1: glutamate-rich 1 LOC401442: hypothetical gene supported by BC028401 C8orf68: chromosome 8 open reading frame 68 DLGAP2: discs, large (Drosophila) homolog-associated protein LOC100130321: hypothetical LOC100130321 hypothetical protein LOC286161, LOC286161 286161 unknown DKFZp686J1521 6, INM01 157695 protein coding LOC389607 389607 unknown LOC389607 389607 unknown HSPC319 157697 protein coding unknown 401442 619343 DAP2, SAPAP2: PSD-95/SAP90-binding protein 2; SAP90/PSD-95-associated protein 2; discs large-associated protein 2 9228 100130321 Annotation category: not annotated on reference assembly protein coding 605438 protein coding
The product of this gene is one of the membrane-associated guanylate kinases localized at postsynaptic density in neuronal cells. These kinases are a family of signaling molecules expressed at various submembrane domains and contain the PDZ, SH3 and the guanylate kinase domains. This protein may play a role in the molecular organization of synapses and in neuronal cell signaling. Alternatively spliced transcript variants encoding different isoforms have been identified, but their full-length nature is not known. Increased expression of PSD-95 and its coassembly with NMDA-receptor subunits NR1 and NR2B in resected epileptic cortical tissue suggest a possible functional role of the complex in in situ epileptogenicity of focal cortical dysplasia.
protein binding cell-cell signaling nerve-nerve synaptic transmission cell junction Neurofilament plasma membrane postsynaptic membrane Synapse pseudogene
19 8p23.3 1699277-1722143 CLN8: ceroid-lipofuscinosis, neuronal 8 C8orf61, EPMR, FLJ39417 2055 20 8p23.3 693181 8p23.3 MIRN596: microRNA 596 ARHGEF10: Rho guanine nucleotide exchange factor (GEF) 10 hsa-mir-596 21 1752880-1752804 1759556-1894214 DKFZp686H072, GEF10, MGC131664: Rho guanine nucleotide exchange factor 10 9639 607837 608136 protein coding
This gene encodes a transmembrane protein belonging to a family of proteins containing TLC domains, which are postulated to function in lipid synthesis, transport, or sensing. The protein localizes to the endoplasmic reticulum (ER), and may recycle between the ER and ER-Golgi intermediate compartment. Mutations in this gene are associated with progressive epilepsy with mental retardation (EPMR), which is a subtype of neuronal ceroid lipofuscinoses (NCL). Patients with mutations in this gene have altered levels of sphingolipid and phospholipids in the brain.
miscRNA Annotation category: not annotated on reference assembly protein coding
Rho GTPases play a fundamental role in numerous cellular processes that are initiated by extracellular stimuli that work through G protein coupled receptors. The encoded protein may form a complex with G proteins and stimulate Rho-dependent signals. Data support a role for ARHGEF10 in developmental myelination of peripheral nerves.
Gef10 is the third member of a Rho-specific GEF family with unusual protein architecture.
adult walking behavior age-dependent response to oxidative stress associative learning cellular protein catabolic process ceramide biosynthetic process cholesterol metabolic process glutamate uptake during transmission of nerve impulse lipid biosynthetic process lipid transport lysosome organization and biogenesis mitochondrial membrane organization and biogenesis negative regulation of apoptosis negative regulation of proteolysis negative regulation of transferase activity nervous system development neurofilament cytoskeleton organization and biogenesis neuromuscular process controlling balance neuromuscular process controlling posture phospholipid metabolic process photoreceptor cell maintenance regulation of cell size retina development in camera-type eye social behavior spinal cord motor neuron differentiation Rho guanyl-nucleotide exchange factor activity guanyl-nucleotide exchange factor activity regulation of Rho protein signal transduction ER-Golgi intermediate compartment membrane endoplasmic reticulum endoplasmic reticulum membrane integral to membrane Membrane Intracellular
22 8p23.3 1872996-1873789 LOC100131395: hypothetical protein LOC100131395 LOC100128157: similar to hCG2041134 KBTBD11: similar to Kelch repeat and BTB domain-containing protein 11 (Kelch domain-containing protein 7B) MYOM2: myomesin (M-protein) 2, 165kDa LOC100131395 100131395 protein coding 23 8p23.3 1908236-1910204 LOC100128157 100128157 protein coding 24 8p23.3 1909451-1942509 KBTBD11 716519 pseudogene 25 8p23.3 1980565-2080787 TTNAP: M-band protein; myomesin (M-protein) 2 (165kD); myomesin 2; titin-associated protein, 165 kD 9172 603509 protein coding 26 8p23.2 2780282-4839736 CSMD1: CUB and Sushi multiple domains 1 KIAA1890 64478 608397 protein coding 27 8p23.2 4632210-4633674 LOC780813 780813 pseudogene 28
8p23.2 LOC100129861 8p23.2 100129861 648237 pseudogene 29 4847457-4917799 4972846-4973240 phosphoribosylaminoimidazole carboxylase, phosphoribosylaminoimidazole succinocarboxamide synthetase pseudogene hypothetical LOC100129861 LOC648237 30 8p23.2 LOC392180 similar to SPT3-associated factor 42 392180 pseudogene 5693627-5694625 LOC648237 similar to laminin receptor homolog LOC392180 similar to Transcriptional adaptor 1 (HFI1 homolog, yeast) like
The giant protein titin, together with its associated proteins, interconnects the major structure of sarcomeres, the M bands and Z discs. The C-terminal end of the titin string extends into the M line, where it binds tightly to M-band constituents of apparent molecular masses of 190 kD and 165 kD. The predicted MYOM2 protein contains 1,465 amino acids. Like MYOM1, MYOM2 has a unique N-terminal domain followed by 12 repeat domains with strong homology to either fibronectin type III or immunoglobulin C2 domains. Protein sequence comparisons suggested that the MYOM2 protein and bovine M protein are identical.
Simple inactivation of CSMD1 may not explain the deletions observed in oropharyngeal squamous cell carcinoma and may call into question the role of this gene in head and neck carcinogenesis. BAC microarray-CGH detects homozygous deletion of CSMD1 in human bladder cancer specimens.
CSMD1 expression is markedly decreased in high stage prostatic adenocarcinomas.
structural constituent of muscle striated muscle contraction striated muscle thick filament integral to membrane membrane pseudogene
31 8p23.1 6251529-6493434 MCPH1: microcephalin 1 32 8p23.1 6460532-6462075 LOC100131112 33 8p23.1 6464118-6553129 LOC100132301 34 8p23.1 6347601-6408170 ANGPT2: angiopoietin 2 35 8p23.1 6553286-6606432 AGPAT5: 1-acylglycerol-3-phosphate O-acyltransferase 5 (lysophosphatidic acid acyltransferase, epsilon) 36 8p23.1 6653448-6680447 XKR5: XK, Kell blood group complex subunit-related family, member 5 37 8p23.1 6679716-6680289 LOC730495: hypothetical protein LOC730495 BRIT1, FLJ12847, MCT: BRCT-repeat inhibitor of TERT expression 1; microcephalin; microcephaly, primary autosomal recessive 1 similar to hCG2002332 79648 100131112 protein coding hypothetical protein LOC100132301 AGPT2, ANG2: Tie2-ligand; angiopoietin-2; angiopoietin-2B; angiopoietin-2a 100132301 unknown 1-AGPAT5, LPAAT-e, LPAAT-epsilon: 1-AGP acyltransferase 5; 1-acyl-sn-glycerol-3-phosphate acyltransferase epsilon; 1-acylglycerol-3-phosphate O-acyltransferase 5; lysophosphatidic acid acyltransferase, epsilon UNQ275, XRG5a, XRG5b: HARL2754; X Kell blood group precursor-related family, member 5; XK-related protein 5a LOC730495 55326 protein coding 389610 protein coding 730495 unknown 285 607117 601922 protein coding protein coding
Microcephalin and ASPM determine the size of the human brain. Identification of microcephalin, a protein implicated in determining the size of the human brain, which is mapped to the MCPH1 locus and is mutated in primary microcephaly. BRIT1 is a crucial DNA damage regulator in the ATM/ATR pathways, which suggests that it functions as a tumor suppressor gene.
The protein encoded by this gene is an antagonist of angiopoietin 1 (ANGPT1) and endothelial TEK tyrosine kinase (TIE-2, TEK).
The encoded protein disrupts the vascular remodeling ability of ANGPT1 and may induce endothelial cell apoptosis. Three transcript variants encoding three different isoforms have been found for this gene. Results describe the expression of angiopoietin-1, 2 and 4 and Tie-1 and 2 in gastrointestinal stromal tumors, leiomyomas and schwannomas.
This gene encodes a member of the 1-acylglycerol-3-phosphate O-acyltransferase family. This integral membrane protein converts lysophosphatidic acid to phosphatidic acid, the second step in de novo phospholipid biosynthesis.
centrosome Intracellular receptor binding angiogenesis cell differentiation multicellular organismal development signal transduction extracellular region extracellular space 1-acylglycerol-3-phosphate O-acyltransferase activity acyltransferase activity transferase activity metabolic process phospholipid biosynthetic process cellular_component integral to membrane membrane Mitochondrion integral to membrane Membrane
38 8p23.1 6715507-6722939 DEFB1: defensin, beta 1 BD1, DEFB-1, DEFB101, HBD1, MGC51822: beta-defensin-1 1672 602056 protein coding 39 8p23.1 6743810-6744400 hypothetical LOC392181 392181 40 8p23.1 6769629-6771008 LOC392181: similar to GM06171p DEFA6: defensin, alpha 6, Paneth cell-specific DEF6, HD-6: defensin 6; defensin, alpha 6 1671 600471 protein coding 41 8p23.1 6780755-6783196 DEFA4: defensin, alpha 4, corticostatin DEF4, HNP-4, HP-4, HP4, MGC120099, MGC138296: corticostatin; defensin, alpha 4; defensin, alpha 4, preproprotein 1669 601157 protein coding 42 8p23.1 6795176-6799755 DEFAP1 449491 43 8p23.1 6822574-6838481 DEFA8P: defensin, alpha 8 pseudogene DEFA1: defensin, alpha 1 DEF1, DEFA2, HNP-1, HP-1, MGC138393, MRS: defensin, alpha 1, myeloid-related sequence; defensin, alpha 2; myeloid-related sequence 1667 44 8p23.1 6841698-6844134 LOC728358: defensin, alpha 1 alpha-defensin 1 728358 45 8p23.1
68608056863226 DEFA3: defensin, alpha 3, DEF3, HNP-3, HNP3, HP-3: 1668 Defensins form a family of microbicidal and cytotoxic peptides made by neutrophils. Members of the defensin family are highly similar in protein sequence. This gene encodes defensin, beta 1, an antimicrobial peptide implicated in the resistance of epithelial surfaces to microbial colonization. This gene maps in close proximity to defensin family member, defensin, alpha 1 and has been implicated in the pathogenesis of cystic fibrosis. DEFB1 is down-regulated in human prostatic and renal carcinomas. G-protein coupled receptor protein signaling pathway chemotaxis defense response to bacterium innate immune response extracellular region Defensins are a family of microbicidal and cytotoxic peptides thought to be involved in host defense. They are abundant in the granules of neutrophils and also found in the epithelia of mucosal surfaces such as those of the intestine, respiratory tract, urinary tract, and vagina. Members of the defensin family are highly similar in protein sequence and distinguished by a conserved cysteine motif. Several alpha defensin genes appear to be clustered on chromosome 8. The protein encoded by this gene, defensin, alpha 6, is highly expressed in the secretory granules of Paneth cells of the small intestine, and likely plays a role in host defense of human bowel. Defensin alpha6 is highly expressed in colon cancer cell lines Defensins are a family of microbicidal and cytotoxic peptides thought to be involved in host defense. They are abundant in the granules of neutrophils and also found in the epithelia of mucosal surfaces such as those of the intestine, respiratory tract, urinary tract, and vagina. Members of the defensin family are highly similar in protein sequence and distinguished by a conserved cysteine motif. Several alpha defensin genes are clustered on chromosome 8. 
This gene differs from other genes of this family by an extra 83-base segment that is apparently the result of a recent duplication within the coding region. The protein encoded by this gene, defensin, alpha 4, is found in the neutrophils; it exhibits corticostatic activity and inhibits corticotropin stimulated corticosterone production. defense response to bacterium defense response to fungus xenobiotic metabolic process extracellular region defense response to bacterium defense response to fungus xenobiotic metabolic process extracellular region Defensins are a family of microbicidal and cytotoxic peptides thought to be involved in host defense. They are abundant in the granules of neutrophils and also found in the epithelia of mucosal surfaces such as those of the intestine, respiratory tract, urinary tract, and vagina. Members of the defensin family are highly similar in protein sequence and distinguished by a conserved cysteine motif. Several alpha defensin genes appear to be clustered on chromosome 8. The protein encoded by this gene, defensin, alpha 1, is found in the microbicidal granules of neutrophils and likely plays a role in phagocyte-mediated host defense. It differs from defensin, alpha 3 by only one amino acid. Alpha-defensins 1-3 levels are nonspecifically elevated in stools from patients with colorectal neoplasia and likely originate from white blood cells. Alpha-defensins 1-3 in stool might serve as markers of inflammatory bowel conditions. chemotaxis defense response to bacterium defense response to fungus immune response response to virus xenobiotic metabolic process extracellular region Defensins are a family of microbicidal and cytotoxic peptides thought to be involved in host defense. 
They are abundant in the granules of neutrophils and also found in the epithelia of mucosal surfaces such as those of the intestine, respiratory tract, urinary tract, and vagina. Members of the defensin family are highly similar in protein sequence and distinguished by a conserved cysteine motif. Several alpha defensin genes appear to be clustered on chromosome 8. The protein encoded by this gene, defensin, alpha 3, is found in the microbicidal granules of neutrophils and likely plays a role in phagocyte-mediated host defense. It differs from defensin, alpha 1 by only one amino acid. Alpha-defensins 1-3 levels are nonspecifically elevated in stools from patients with colorectal neoplasia and likely originate from white blood cells. Alpha-defensins 1-3 in stool might serve as markers of inflammatory bowel conditions.
defense response to bacterium extracellular region pseudogene calcium channel regulator activity pseudogene 125220 protein coding protein coding 604522 protein coding
46 8p23.1 6874150-6874463 47 8p23.1 6900239-6901669 48 8p23.1 6927903-6970663 49 8p23.1 6958052-6958805 50 8p23.1 51 8p23.1 7066637-7067398 7072516-7073541 52 8p23.1 7083118-7083688 53 8p23.1 7092500-7083212 neutrophil-specific defensin 3, neutrophil-specific; defensin, alpha 3; neutrophil peptide 3 DEFA11: defensin, alpha 11 pseudogene DEFA5: defensin, alpha 5, Paneth cell-specific DEFAP3 724068 DEF5, HD-5, MGC129728: defensin 5; defensin, alpha 5; defensin, alpha 5, preproprotein 1670 LOC648665: hypothetical LOC648665 LOC100129712: hypothetical LOC100129712 hypothetical LOC100131970 hypothetical LOC645627 chromosome 11 open reading frame 2 pseudogene OR7E125P: olfactory receptor, family 7, subfamily E, member 125 pseudogene LOC648665 648665 pseudogene LOC100129712 100129712 pseudogene LOC100131970 pseudogene LOC645627 100131970 645627 LOC441320 441320 pseudogene PJCG6 389616 pseudogene
defense response to fungus xenobiotic metabolic process
Defensins are a family of microbicidal and cytotoxic peptides thought to be involved in host defense. They are abundant in the granules of neutrophils and also found in the epithelia of mucosal surfaces such as those of the intestine, respiratory tract, urinary tract, and vagina. Members of the defensin family are highly similar in protein sequence and distinguished by a conserved cysteine motif. Several of the alpha defensin genes appear to be clustered on chromosome 8. The protein encoded by this gene, defensin, alpha 5, is highly expressed in the secretory granules of Paneth cells of the ileum.
defense response to bacterium defense response to fungus xenobiotic metabolic process extracellular region G-protein coupled receptor protein signaling pathway signal transduction integral to membrane plasma membrane pseudogene 600472 protein coding pseudogene
Olfactory receptors interact with odorant molecules in the nose, to initiate a neuronal response that triggers the perception of a smell. The olfactory receptor proteins are members of a large family of G-protein-coupled receptors (GPCR) arising from single coding-exon genes. Olfactory receptors share a 7-transmembrane domain structure with many neurotransmitter and hormone receptors and are responsible for the recognition and G protein-mediated transduction of odorant signals. The olfactory receptor gene family is the largest in the genome. The nomenclature assigned to the olfactory receptor genes and proteins for this organism is independent of other organisms.
olfactory receptor activity receptor activity
54 8p23.1 7091451-7092619 OR7E154P: olfactory receptor, family 7, subfamily E, member 154 pseudogene 55 8p23.1 7098550-7100784 56 8p23.1 7101836-7104846 57 8p23.1 7108407-7106384 58 8p23.1 7109458-7112468 59 8p23.1 7114006-7116029 hypothetical protein LOC729261 FAM90A15: family with sequence similarity 90, member A15 Q4G0H1: hypothetical LOC349196 FAM90A3: family with sequence similarity 90, member A3 LOC729270: hypothetical protein LOC729270 60 8p23.1 7117143-7120645 61 8p23.1 7121628-7123651 62 8p23.1 7124702-7127712 63 8p23.1 7129250-7131273 64 8p23.1 7132324-7135334 FAM90A4P: family with sequence similarity 90, member A4 pseudogene LOC729273 hypothetical protein LOC729273 FAM90A13: family with sequence similarity 90, member A13 LOC729278: hypothetical LOC729278 FAM90A5 403296 pseudogene 729261 protein coding 389630 protein coding 349196 unknown 389611 unknown LOC729270 729270 protein coding FAM90A4 441313 pseudogene LOC729273 729273 protein coding 441314 protein coding 729278 protein coding 441315 protein coding LOC729261 LOC349196 hypothetical protein LOC729278 family with sequence similarity 90,
Olfactory receptors interact with odorant molecules in the nose, to initiate a neuronal response that triggers the perception of a smell. The olfactory receptor proteins are members of a large family of G-protein-coupled receptors (GPCR) arising from single coding-exon genes. Olfactory receptors share a 7-transmembrane domain structure with many neurotransmitter and hormone receptors and are responsible for the recognition and G protein-mediated transduction of odorant signals. The olfactory receptor gene family is the largest in the genome. The nomenclature assigned to the olfactory receptor genes and proteins for this organism is independent of other organisms.
member A5 65 8p23.1 66 8p23.1 7139946-7142956 67 8p23.1 68 8p23.1 7157777-7164882 7167578-7170595 69 8p23.1 7177002-7178911 70 8p23.1 7181730-7183639 71 8p23.1 7186758-7187958 72 8p23.1 7194013-7230490 73 8p23.1 74 8p23.1 7218076-7222432 7263666-7265553 75 8p23.1 76 8p23.1 7136872-7138895 LOC729284: hypothetical LOC729284 FAM90A20: family with sequence similarity 90, member A20 DEFB109: defensin, beta 109 LOC10012889: similar to hCG1993470 LOC401447: similar to ubiquitin-specific protease 17-like protein LOC645402: similar to ubiquitin-specific protease 17-like protein LOC402329: similar to ubiquitin-specific protease 17-like protein LOC10013198: similar to zinc finger protein 705A LOC100133101 HSPDP3: heat shock 60kDa protein 1 (chaperonin) pseudogene 3 hypothetical protein LOC729284 729284 protein coding 728430 protein coding 641517 pseudogene LOC10012889 1001288 9 protein coding LOC401447 401447 similar to deubiquitinating enzyme 3 defense response to bacterium extracellular region unknown cysteine-type peptidase activity ubiquitin thiolesterase activity apoptosis ubiquitin cycle ubiquitin-dependent protein catabolic process 645402 protein coding ubiquitin thiolesterase activity ubiquitin-dependent protein catabolic process similar to deubiquitinating enzyme 3 402329 pseudogene similar to zinc finger protein 705A 1001319 8 protein coding nucleic acid binding regulation of transcription, DNA-dependent Intracellular similar to beta-defensin HSP60P3 100133101 3332 protein coding pseudogene molecular_function defense response to bacterium spermatogenesis extracellular region 7273901-7275280 DEFB103A: defensin, beta 103A beta-defensin 3; defensin, beta 103B; defensin, beta 3 7292686-7308602 SPAG11: sperm associated antigen 11B EP2, EP2C, EP2D, HE2, HE2C, MGC61846, SPAG11: 55894 60661 protein coding 10407 606560 protein coding
Defensins form a family of microbicidal and cytotoxic peptides
made by neutrophils. Members of the defensin family are highly similar in protein sequence. This gene encodes defensin, beta 103B, which has broad spectrum antimicrobial activity and may play an important role in innate epithelial defense. Expression of HBD-3 was detected only in areas adjacent to squamous cell carcinomas.
This gene encodes several androgen-dependent, epididymis-specific secretory proteins. The specific functions of these proteins have not been determined, but they are thought to be involved in sperm maturation. Some of the isoforms contain regions of similarity to beta-defensins, a family of antimicrobial peptides. The gene is located on chromosome 8p23 near the defensin gene cluster. Alternative splicing of this gene results in seven transcript variants encoding different isoforms. Two different N-terminal and five different C-terminal protein sequences are encoded by the splice variants. Two additional variants have been described, but their full length sequences have not been determined. HE2 peptides were detected in human epididymal epithelium, epididymal fluid, and ejaculate.
nucleus
epididymal protein 2; sperm associated antigen 11 77 8p23.1 7315240-7320014 DEFB104B: defensin, beta 104B 503618 protein coding 78 8p23.1 7327436-7331319 DEFB106B: defensin, beta 106B 503841 protein coding 79 8p23.1 7332653-7334483 DEFB105B: defensin, beta 105B 504180 protein coding 80 8p23.1 7340778-7354243 DEFB107B: defensin, beta 107B 503614 protein coding 81 8p23.1 7384560-7391754 100131608 protein coding 82 8p23.1 7391415-7392438 645489 protein coding 83 8p23.1 7393430-7396993 LOC100131608: hypothetical protein LOC100131608 LOC645489: hypothetical LOC645489 FAM90A6P: family with sequence 389618 pseudogene HsT21816
Defensins form a family of microbicidal and cytotoxic peptides made by neutrophils.
Defensins are short, processed peptide molecules that are classified by structure into three groups: alpha-defensins, beta-defensins and theta-defensins. All beta-defensin genes are densely clustered in four to five syntenic chromosomal regions. Chromosome 8p23 contains at least two copies of the duplicated beta-defensin cluster. This duplication results in two identical copies of defensin, beta 104, DEFB104A and DEFB104B, in head-to-head orientation. This gene, DEFB104B, represents the more telomeric copy. Defensins form a family of microbicidal and cytotoxic peptides made by neutrophils. Defensins are short, processed peptide molecules that are classified by structure into three groups: alpha-defensins, beta-defensins and theta-defensins. All beta-defensin genes are densely clustered in four to five syntenic chromosomal regions. Chromosome 8p23 contains at least two copies of the duplicated beta-defensin cluster. This duplication results in two identical copies of defensin, beta 106, DEFB106A and DEFB106B, in head-to-head orientation. This gene, DEFB106B, represents the more telomeric copy. Defensins form a family of microbicidal and cytotoxic peptides made by neutrophils. Defensins are short, processed peptide molecules that are classified by structure into three groups: alpha-defensins, beta-defensins and theta-defensins. All beta-defensin genes are densely clustered in four to five syntenic chromosomal regions. Chromosome 8p23 contains at least two copies of the duplicated beta-defensin cluster. This duplication results in two identical copies of defensin, beta 105, DEFB105A and DEFB105B, in tail-to-tail orientation. This gene, DEFB105B, represents the more telomeric copy. Defensins form a family of microbicidal and cytotoxic peptides made by neutrophils. Defensins are short, processed peptide molecules that are classified by structure into three groups: alpha-defensins, beta-defensins and theta-defensins. 
All beta-defensin genes are densely clustered in four to five syntenic chromosomal regions. Chromosome 8p23 contains at least two copies of the duplicated beta-defensin cluster. This duplication results in two identical copies of defensin, beta 107, DEFB107A and DEFB107B, in tail-to-tail orientation. This gene, DEFB107B, represents the more telomeric copy.
defense response to bacterium extracellular region defense response to bacterium extracellular region
84 8p23.1 7399070-7400093 85 8p23.1 7401070-7404644 86 8p23.1 7406720-7407743 87 8p23.1 7409185-7412392 88 8p23.1 7414366-7415389 89 8p23.1 7416370-7419937 90 8p23.1 7422016-7423038 91 8p23.1 7427584-7424019 92 8p23.1 7429660-7430894 93 8p23.1 7436819-7437987 similarity 90, member A6 pseudogene LOC100132648: hypothetical protein LOC100132648 FAM90A7: family with sequence similarity 90, member A7 LOC100132106: hypothetical protein LOC100132106 FAM90A21P: family with sequence similarity 90, member A21 pseudogene LOC729339: hypothetical LOC729339 FAM90A22: family with sequence similarity 90, member A22 LOC100132485: hypothetical LOC100132485 FAM90A23: family with sequence similarity 90, member A23 LOC729346: hypothetical LOC729346 OR7E157P: olfactory receptor, family 7, subfamily E, member 157 pseudogene 100132648 protein coding 441317 protein coding 100132106 protein coding 619418 pseudogene 729339 protein coding 645558 pseudogene LOC100132485 100132485 protein coding FAM90A23P 645572 pseudogene hypothetical protein LOC729346 729346 protein coding family with sequence similarity 90, member A7 hypothetical protein LOC729339 FAM90A22P pseudogene nucleic acid binding zinc ion binding
Olfactory receptors interact with odorant molecules in the nose, to initiate a neuronal response that triggers the perception of a smell.
The olfactory receptor proteins are members of a large family of G-protein-coupled receptors (GPCR) arising from single coding-exon genes. Olfactory receptors share a 7-transmembrane domain structure with many neurotransmitter and hormone receptors and are responsible for the recognition and G protein-mediated transduction of odorant signals. The olfactory receptor gene family is the largest in the genome. The nomenclature assigned to the olfactory receptor genes and proteins for this organism is independent of other organisms.
94 8p23.1 7456021-7456630 95 8p23.1 7575233-7575841 96 8p23.1 7581347-7582112 97 8p23.1 7600003-7602023 98 8p23.1 7602665-7613951 99 8p23.1 7607089-7609324 100 8p23.1 7614923-7615946 101 8p23.1 7618023-7621595 102 8p23.1 103 8p23.1 7625671-7629243 104 8p23.1 7630219-7631242 105 8p23.1 7633319-7636890 106 8p23.1 7637866-7638889 7622571-7623594 LOC100133084: similar to LOC649305 protein LOC100132212: hypothetical LOC100132212 LOC100131967: hypothetical LOC100131967 LOC728731: similar to olfactory receptor 873 FAM90A14: family with sequence similarity 90, member A14 LOC729371: hypothetical LOC729371 LOC729372: hypothetical LOC729372 FAM90A18: family with sequence similarity 90, member A18 LOC100132048: hypothetical protein LOC100132048 FAM90A16: family with sequence similarity 90, member A16 LOC729379: hypothetical protein LOC729379 FAM90A8: family with sequence similarity 90, member A8 LOC729383: hypothetical LOC729383 hypothetical LOC100133084 100133084 pseudogene LOC100132212 100132212 pseudogene LOC100131967 100131967 pseudogene LOC728731 728731 pseudogene 645651 protein coding 729371 protein coding 729372 protein coding 441326 protein coding 100132048 protein coding 441323 protein coding 729379 protein coding 441324 protein coding 729383 protein coding hypothetical protein LOC729371 LOC729372 LOC100132048 hypothetical LOC729379 hypothetical protein LOC729383
nucleic acid binding zinc ion binding nucleic acid binding zinc ion binding 12 Supplementary Information TABLE S1: GENES ON CHROMOSOME 8p: LOCALIZATION AND DESCRIPTION 107 8p23.1 76409667644538 108 8p23.1 76455147646537 109 8p23.1 76486147652186 110 8p23.1 76531627654185 111 8p23.1 76562627659834 112 8p23.1 76606097661834 113 8p23.1 76639097667482 114 8p23.1 76684687669490 115 8p23.1 76711787676345 116 8p23.1 77066527710648 117 8p23.1 77169407718770 FAM90A17: family with sequence similarity 90, member A17 LOC729387: hypothetical LOC729387 FAM90A19: family with sequence similarity 90, member A19 LOC729394: hypothetical LOC729394 FAM90A9: family with sequence similarity 90, member A9 LOC10013222: hypothetical protein LOC100132221 FAM90A10: family with sequence similarity 90, member A10 LOC10013309: hypothetical protein LOC100133099 LOC10013325: hypothetical protein LOC100133251 DEFB107A: defensin, beta 107A DEFB105A: defensin, beta 105A 728746 protein coding 729387 protein coding 728753 protein coding 729394 protein coding 441327 protein coding 1001322 21 protein coding 441328 protein coding LOC100133099 1001330 99 protein coding LOC100133251 1001332 51 protein coding BD-7, DEFB-7, DEFB10: betadefensin 107; defensin, beta 7 245910 protein coding BD-5, DEFB-5, DEFB10: beta defensin 5; defensin, beta 5 245908 protein coding hypothetical protein LOC729387 hypothetical protein LOC729394 LOC100132221 family with sequence similarity 90, member A10 nucleic acid binding zinc ion binding zinc ion binding nucleic acid binding zinc ion binding nucleic acid binding zinc ion binding Defensins form a family of microbicidal and cytotoxic peptides made by neutrophils. Defensins are short, processed peptide molecules that are classified by structure into three groups: alpha-defensins, beta-defensins and theta-defensins. All beta-defensin genes are densely clustered in four to five syntenic chromosomal regions. 
Chromosome 8p23 contains at least two copies of the duplicated beta-defensin cluster. This duplication results in two identical copies of defensin, beta 107, DEFB107A and DEFB107B, in tail-to-tail orientation. This gene, DEFB107A, represents the more centromeric copy Defensins form a family of microbicidal and cytotoxic peptides made by neutrophils. Defensins are short, processed peptide molecules that are classified by structure into three groups: alpha-defensins, beta-defensins and theta-defensins. All beta-defensin genes are densely clustered in four to five syntenic chromosomal regions. Chromosome 8p23 contains at least two 13 Supplementary Information TABLE S1: GENES ON CHROMOSOME 8p: LOCALIZATION AND DESCRIPTION 118 8p23.1 77201047723985 DEFB106A: defensin, beta 106A BD-6, DEFB-6, DEFB106, MGC118938, MGC118939, MGC118940, MGC118941, MGC133011, MGC133012: defensin, beta 6 245909 protein coding 119 8p23.1 77314037736174 DEFB104A: defensin, beta 104A BD-4, DEFB-4, DEFB104, DEFB4, MGC118942, MGC118944, MGC118945, hBD-4: defensin, beta 4 140596 protein coding 120 8p23.1 77428127758729 SPAG11A: sperm associated antigen 11A HE2 653423 protein coding 121 8p23.1 414325 77761367777596 DEFB103B: defensin, beta 103B beta-defensin 103B protein coding HSPDP2: heat shock 60kDa protein 1 (chaperonin) pseudogene 2 DEFB4: defensin, beta 4 HSP60P2 645808 DEFB-2, DEFB102, DEFB2, HBD-2, SAP1: defensin, beta 2; skinantimicrobial peptide 1 LOC100132396 1673 122 8p23.1 77859297787630 123 8p23.1 77896097791647 124 8p23.1 78212697856230 LOC10013239: hypothetical protein LOC100132396 1001323 96 copies of the duplicated beta-defensin cluster. This duplication results in two identical copies of defensin, beta 105, DEFB105A and DEFB105B, in tail-to-tail orientation. This gene, DEFB105A, represents the more centromeric copy. Defensins form a family of microbicidal and cytotoxic peptides made by neutrophils. 
Defensins are short, processed peptide molecules that are classified by structure into three groups: alpha-defensins, beta-defensins and theta-defensins. All beta-defensin genes are densely clustered in four to five syntenic chromosomal regions. Chromosome 8p23 contains at least two copies of the duplicated beta-defensin cluster. This duplication results in two identical copies of defensin, beta 106, DEFB106A and DEFB106B, in head-to-head orientation. This gene, DEFB106A, represents the more centromeric copy. A pilot study with cRNA probes for in situ hybridization and a synthetic propeptide for the functional characterization demonstrated the tissue-/cell-specific expression and the strong antimicrobial activity of DEFB106. Defensins form a family of microbicidal and cytotoxic peptides made by neutrophils. Defensins are short, processed peptide molecules that are classified by structure into three groups: alpha-defensins, beta-defensins and theta-defensins. All beta-defensin genes are densely clustered in four to five syntenic chromosomal regions. Chromosome 8p23 contains at least two copies of the duplicated beta-defensin cluster. This duplication results in two identical copies of defensin, beta 104, DEFB104A and DEFB104B, in head-to-head orientation. This gene, DEFB104A, represents the more centromeric copy. defense response to bacterium extracellular region defense response to bacterium defense response to bacterium positive regulation of biosynthetic process of antibacterial peptides active against Gram-positive bacteria extracellular region extracellular region pseudogene 602215 protein coding Defensins form a family of microbicidal and cytotoxic peptides made by neutrophils. Members of the defensin family are highly similar in protein sequence. This gene encodes defensin, beta 4, an antibiotic peptide which is locally regulated by inflammation. 
HBD-2 may lead to the death of normal keratinocytes adjacent to the squamous cell carcinomas, which might, in turn, indirectly assist in the multiplication of tumor cells. High concentration in oral squamous cell carcinoma G-protein coupled receptor protein signaling pathway chemotaxis defense response to bacterium immune response extracellular region protein coding 14 Supplementary Information TABLE S1: GENES ON CHROMOSOME 8p: LOCALIZATION AND DESCRIPTION 125 8p23.1 78290967832288 126 8p23.1 78499427850793 127 8p23.1 78622767863476 128 8p23.1 78713257873234 129 8p23.1 78738427882656 130 8p23.1 79067227910283 131 8p23.1 79123597913382 132 8p23.1 79179297914363 133 8p23.1 79200057921028 134 8p23.1 79255787922568 135 8p23.1 79276527928886 136 8p23.1 79348117935979 DEFB108P1: defensin, beta 108, pseudogene 1 LOC10013210: similar to hCG1990697 LOC392187: similar to ubiquitin-specific protease 17-like protein LOC645836: similar to ubiquitin-specific protease 17-like protein LOC10013282: similar to hCG1993470 FAM90A11: family with sequence similarity 90, member A11 LOC729456: hypothetical LOC729456 FAM90A24: family with sequence similarity 90, member A24 LOC729459: hypothetical protein LOC729459 FAM90A12: family with sequence similarity 90, member A12 LOC729462: hypothetical protein LOC729462 OR7E96P: olfactory receptor, family 7, subfamily E, member 96 pseudogene DEFB-8; DEFB108; DEFB108A 503694 pseudogene LOC100132103 1001321 03 protein coding LOC392187 392187 pseudogene LOC645836 645836 protein coding LOC100132828 1001328 28 protein coding FAM90A11P 441331 pseudogene LOC729456 729456 protein coding FAM90A24P 441332 pseudogene LOC729459 729459 protein coding 645879 protein coding 729462 protein coding 401450 pseudogene hypothetical LOC729462 nucleic acid binding zinc ion binding Olfactory receptors interact with odorant molecules in the nose, to initiate a neuronal response that triggers the perception of a smell. 
The olfactory receptor proteins are members of a large family of G-protein-coupled receptors (GPCR) arising from single coding-exon genes. Olfactory receptors share a 7-transmembrane domain structure with many neurotransmitter and hormone receptors and are responsible for the recognition and G protein-mediated transduction of odorant signals. The olfactory receptor gene family is the largest in the genome. The nomenclature assigned to the olfactory receptor genes and proteins for this organism is independent of other organisms. 137 8p23.1 79540087954614 138 8p23.1 79601507960756 139 8p23.1 80835688083741 140 8p23.1 141 8p23.1 81023098139797 81356338136435 142 8p23.1 82126688276667 LOC10013231: hypothetical LOC100132313 LOC10013204: similar to LOC649305 protein LOC10013317: similar to liver-related low express protein 1 FLJ10661: similar to CG7889-PA LOC10012942: similar to hCG1990547 PRAGMIN: homolog of rat pragma of Rnd2 hypothetical LOC100132313 1001323 13 pseudogene hypothetical LOC100132046 1001320 46 pseudogene LRLE1 1001331 7 protein coding 286042 unknown 1001294 2 protein coding 157285 protein coding Rnd2 regulates neurite outgrowth by functioning as the RhoA activator through Pragmin, in contrast to Rnd1 and Rnd3 inhibiting RhoA signaling hCG_1646163, CLDNL, hCG1646163, 2310014B08Rik FLJ23354, MASL1: MFHamplified sequences with leucine-rich tandem repeats 1 CLDN23 gene, a candidate tumor suppressor gene implicated in intestinal-type gastric cancer and pancreatic cancer CLDN23 gene, frequently down-regulated in intestinal-type gastric cancer, is a novel member of CLAUDIN gene family. Identified in a human 8p amplicon, this gene is a potential oncogene whose expression is enhanced in some malignant fibrous histiocytomas (MFH). 
The primary structure of its product includes an ATP/GTP-binding site, three leucine zipper domains, and a leucine-rich tandem repeat, which are structural or functional elements for interactions among proteins related to the cell cycle, and which suggest that overexpression might be oncogenic with respect to MFH MRPS18CP2 alleles and DEFA3 absence as putative chromosome 8p23.1 modifiers of hearing loss due to mtDNA mutation A1555G in the 12S rRNA gene. 143 8p23.1 85970768599027 CLDN23: claudin 23 144 8p23.1 86794098788541 MFHAS1: malignant fibrous histiocytoma amplified sequence 145 8p23.1 88286388828930 146 8p23.1 88504098890368 147 8p23.1 88978608925899 MRPS18CP2: mitochondrial ribosomal protein S18C pseudogene 2 LOC645960: similar to ribosomal protein L10 THEX1: three prime histone mRNA exonuclease 1 LOC10012942 DKFZp761P0423 609203 137075 protein coding 9258 605352 protein coding 286043 pseudogene gen hypothetical protein LOC645960 645960 protein coding 3'HEXO, MGC35395: 3' exoribonuclease; 3'-5' exonuclease ERI1; Eri-1 homolog; histone 90459 protein coding Annotation category: partial on reference assembly. 3'hExo is a primary candidate for the exonuclease that initiates rapid decay of histone mRNA upon completion and/or inhibition of DNA replication. 
3'hExo is a 3' exonuclease specifically interacting with the 3' end of histone mRNA ATP binding non-membrane spanning protein tyrosine kinase activity nucleotide binding transferase activity identical protein binding structural molecule activity GTP binding Protein binding protein amino acid phosphorylation structural constituent of ribosome translation intracellular Ribosome 3'-5' exonuclease activity RNA binding hydrolase activity magnesium ion binding RNA-mediated gene silencing Intracellular calcium-independent cell-cell adhesion cell junction integral to membrane Membrane tight junction small GTPase mediated signal transduction mRNA 3' end-specific exonuclease 148 8p23.1 89609078961295 149 8p23.1 89673798967436 150 8p23.1 90311759045630 151 8p23.1 152 8p23.1 92625389262825 94508559677266 153 8p23.1 154 8p23.1 155 8p23.1 156 LOC1001284: hypothetical LOC1001284727 2 RNU7P4: RNA, U7 small nuclear pseudogene 4 PPP1R3B: protein phosphatase 1, regulatory (inhibitor) subunit 3B LOC10012915: similar to LP5624 TNKS: tankyrase, TRF1-interacting ankyrin-related ADP-ribose polymerase 20746292074533 22363332236249 994923610323805 MIRN597: microRNA 597 MIRN124A1: microRNA 124A1 MSRA: methionine sulfoxide reductase A 8p23.1 1022975510232597 157 8p23.1 1039124610442499 LOC10012899: hypothetical LOC100128999 LOC346702: similar to hCG1643218 158 8p23.1 1042049110433738 UNQ9391: tryptophan/serine 1001284 protein coding U7; U7.55; RNU7P4; HSU7.36; HSU7.55 FLJ14005, FLJ34675, GL, PPP1R4 6075 pseudogene LP5624 1001291 50 8658 PARP-5a, PARP5A, PARPL, TIN1, TINF1, TNKS1 cytosolic methionine-S-sulfoxide reductase; peptide met (O) reductase similar to Plasma kallikrein precursor (Plasma prekallikrein) (Kininogenin) (Fletcher factor) hypothetical protein 79660 610541 603303 protein coding protein coding protein coding Results suggest that in cultured human myotubes, glycogen-targeting PP1 (protein 
phosphatase 1) subunit G(L) (coded for by the PPP1R3B gene) is expressed as in muscle tissue and is unresponsive to glucose or insulin, as are G(M) and PTG genes. Tankyrase1 is a poly(ADP-ribose) polymerase with roles in telomere length control by the TRF1 component of the shelterin complex. The role of TNKS in the poly(ADP-ribosyl)ation of the mitotic spindle apparatus is reported. Tankyrase 1 interacts with Mcl-1 proteins and inhibits their regulation of apoptosis 693182 miscRNA Annotation category: not annotated on reference assembly 406907 miscRNA Annotation category: not annotated on reference assembly protein coding This protein is ubiquitous and highly conserved. It carries out the enzymatic reduction of methionine sulfoxide to methionine. Human and animal studies have shown the highest levels of expression in kidney and nervous tissue. Its proposed function is the repair of oxidative damage to proteins to restore biological activity. MSRA gene on chromosome 8p might possess metastasis suppressor activity in HCC. 
MsrA may play an important role in cellular defenses against oxidative stress and in protection against death by limiting the accumulation of oxidative damage to proteins 4482 601250 NAD+ ADPribosyltransferase activity protein binding transferase activity, transferring glycosyl groups intracellular protein transport across a membrane mRNA transport peptidyl-serine phosphorylation peptidyl-threonine phosphorylation protein transport telomere maintenance via telomerase oxidoreductase activity oxidoreductase activity, acting on sulfur group of donors, disulfide as acceptor peptide-methionine-(S)-S-oxide reductase activity methionine metabolic process protein metabolic process protein modification process response to oxidative stress 1001289 99 protein coding 346702 protein coding catalytic activity serine-type endopeptidase activity proteolysis 203074 protein coding catalytic activity peptidase activity proteolysis Golgi apparatus Golgi membrane Chromosome chromosome, telomeric region cytoplasm membrane nuclear pore nucleus integral to membrane Membrane protease LOC203074 serine-type endopeptidase activity 159 8p23.1 1050126910550027 RP1L1: retinitis pigmentosa 1-like 1 DCDC4B 94137 160 8p23.1 1025980110338697 PP13296; MGC74631; DKFZp686E039 161 8p23.1 1056755710595513 162 8p23.1 1061868810625432 LARP4: La ribonucleoprotein domain family, member 4 C8orf74: chromosome 8 open reading frame 74 SOX7: SRY (sex determining region Y)-box 7 163 8p23.1 1066029410734709 PINX1: PIN2-interacting protein 1 164 8p23.1 1079106411096285 XKR6: XK, Kell blood group complex subunit-related family, member 6 165 8p23.1 166 8p23.1 1093012610930222 1100213211003386 167 8p23.1 1101829211019012 168 8p23.1 1102515511021390 169 8p23.1 1115093911151585 MIRN598: microRNA 598 LOC10012944: hypothetical protein LOC100129441 C8orf15: chromosome 8 open reading frame 15 C8orf16: chromosome 8 open reading frame 16 
LOC392193: hypothetical LOC392193 608581 protein coding The RP1L1 gene is a novel candidate for retinal degenerations and encodes a large, highly polymorphic, retinal-specific protein. 113251 protein coding The c-MPL protein altered expression provide an opportunity to diagnose and identify subpopulations of MPD patients. RNA binding hypothetical protein LOC203076 203076 protein coding MGC10895: SOX7 transcription factor; SRY-box 7 83595 protein coding transcription factor activity regulation of transcription from RNA polymerase II promoter transcription Nucleus FLJ20565, LPTL, LPTS, MGC8850: 67-11-3 protein; hepatocellular carcinoma-related putative tumor suppressor C8orf21, C8orf7, XRG6: X Kell blood group precursor-related family, member 6; XK-related protein 6 54984 This gene encodes a member of the SOX (SRY-related HMG-box) family of transcription factors involved in the regulation of embryonic development and in the determination of the cell fate. The encoded protein may act as a transcriptional regulator after forming a protein complex with other proteins. The protein may play a role in tumorigenesis. A similar protein in mice is involved in the regulation of the wingless-type MMTV integration site family (Wnt) pathway. LOH of PINX1 locus associated with reduced expression of PINX1 in gastric cancer. Over-expression of LPTS-L can induce hepatoma cells into crisis due to the reduction of telomerase activity. Data show that liverrelated putative tumor suppressor gene (LPTS) mutations occur in hepatocellular carcinoma but are infrequent and of little effect on the telomerase inhibitory function of the protein. 
molecular_function nucleic acid binding protein binding cell cycle negative regulation of cell cycle negative regulation of cell proliferation telomere maintenance via telomerase cellular_component chromosome chromosome, telomeric region intracellular nucleolus Nucleus integral to membrane Membrane 606505 protein coding 286046 protein coding 693183 miscRNA 1001294 4 protein coding 439940 unknown 83735 unknown 392193 pseudogene intracellular signaling cascade response to stimulus visual perception nucleic acid binding Annotation category: not annotated on reference assembly 170 8p23.1 1117998911178400 Q96KT8: hypothetical protein C8orf9 MTMR9: myotubularin related protein 9 171 8p23.1 1117941011223065 172 8p23.1 1122590511227105 173 8p23.1 1123455611263371 174 8p23.1 1124088711242975 175 8p23.1 1126332111333577 176 8p23.1 1131638211361663 177 8p23.1 1138893011459517 BLK: B lymphoid tyrosine kinase 178 8p23.1 1145338411454938 LOC10012872: hypothetical AMAC1L2: acylmalonyl condensing enzyme 1-like 2 TDH: L-threonine dehydrogenase LOC10012912: hypothetical protein LOC100129129 C8orf12: chromosome 8 open reading frame 12 C8orf13: chromosome 8 open reading frame 13 LOC157740 157740 C8orf9, DKFZp434K171, LIP-STYX, MGC126672, MTMR8: myotubularin related protein 8; myotubularin-related protein 9 AMAC, acylmalonyl condensing enzyme FLJ25033 66036 D8S265, DKFZp761G151, MGC120649, MGC120650, MGC120651: hypothetical protein LOC83648 MGC10442 unknown protein coding This gene encodes a myotubularin-related protein that is atypical to most other members of the myotubularin-related protein family because it has no dual-specificity phosphatase domain. The encoded protein contains a double-helical motif similar to the SET interaction domain, which is thought to have a role in the control of cell proliferation. 
In mouse, a protein similar to the encoded protein binds with MTMR7, and together they dephosphorylate phosphatidylinositol 3-phosphate and inositol 1,3-bisphosphate. 83650 protein coding This gene seems to be intronless. It has high sequence similarity to the gene encoding acyl-malonyl condensing enzyme on chromosome 17. 157739 pseudogene This gene appears to be an evolving pseudogene of L-threonine 3-dehydrogenase (TDH). In both prokaryotes and eukaryotes, TDH catalyzes the first of two steps in one of two L-threonine degradation pathways. However, in human, the single gene with sequence similarity to TDH is not capable of encoding a functional TDH protein; the predicted protein lacks most of the C-terminus and parts of the NAD+ binding motif when compared to other species' TDH proteins. This suggests that the human gene is therefore a pseudogene. Transcripts of this gene are found in all tissues and alternatively spliced transcripts have been described. It is not known if these transcripts are translated, or if the possible protein product provides any functional role. binding catalytic activity coenzyme binding cellular metabolic process 1001291 29 protein coding 83656 unknown 83648 protein coding Association of systemic lupus erythematosus with C8orf13-BLK and ITGAM-ITGAX. 
ATP binding non-membrane spanning protein tyrosine kinase activity nucleotide binding protein binding transferase activity protein amino acid phosphorylation protein kinase cascade 640 1001287 28 606260 191305 protein coding enzyme regulator activity inositol or phosphatidylinositol phosphatase activity protein binding phospholipid dephosphorylation integral to membrane membrane Mitochondrion protein coding 181 8p23.1 1159916211654918 protein LOC100128728 LOC10013135: hypothetical protein LOC100131351 AMAC1L1: acylmalonyl condensing enzyme 1-like 1 GATA4: GATA binding protein 4 182 8p23.1 1166466611682263 183 8p23.1 1168429511685679 184 8p23.1 11656174.. 11657024 185 8p23.1 1169759911734227 186 8p23.1 1173744211763055 179 8p23.1 1157179111595690 180 8p23.1 1159947411600674 1001313 51 protein coding 646000 protein coding membrane MGC126629: GATA-binding protein 4 2626 600576 protein coding NEIL2: nei like 2 (E. coli) FLJ31644, MGC2832, MGC4505, NEH2: nei-like 2 252969 608933 protein coding SUB1P1: SUB1 homolog (S. cerevisiae) pseudogene 1 C8orf49: chromosome 8 open reading frame 49 FDFT1: farnesyldiphosphate farnesyltransferase 1 hypothetical protein LOC100128728 1001287 28 protein coding FLJ30972 606553 unknown DGPT, ERG9, SQS, SS: FPP:FPP farnesyltransferase; presqualene-didiphosphate synthase; squalene synthase APPS, CPSB: APP secretase; amyloid precursor protein secretase; 2222 184420 protein coding This gene encodes a membrane-associated enzyme located at a branch point in the mevalonate pathway. The encoded protein is the first specific enzyme in cholesterol biosynthesis, catalyzing the dimerization of two molecules of farnesyl diphosphate in a two-step reaction to form squalene. 
In prostate cancer cells SQS expression is enhanced by androgens, channeling intermediates of the mevalonate/isoprenoid pathway toward cholesterol synthesis farnesyl-diphosphate farnesyltransferase activity magnesium ion binding oxidoreductase activity protein binding transferase activity cholesterol biosynthetic process isoprenoid biosynthetic process endoplasmic reticulum endoplasmic reticulum membrane integral to membrane membrane 1508 116810 protein coding The protein encoded by this gene is a lysosomal cysteine proteinase composed of a dimer of disulfide-linked heavy and light chains, both produced from a single protein precursor. It is also known as amyloid precursor protein secretase and is involved in the proteolytic processing of cathepsin B activity cysteine-type endopeptidase activity kininogen binding proteolysis regulation of apoptosis regulation of catalytic activity apical plasma membrane external side of plasma membrane CTSB: cathepsin B This gene encodes a member of the GATA family of zinc-finger transcription factors. Members of this family recognize the GATA motif which is present in the promoters of many genes. This protein is thought to regulate genes involved in embryogenesis and in myocardial differentiation and function. Mutations in this gene have been associated with cardiac septal defects. 4 missense sequence variants (Gly93Ala, Gln316Glu, Ala411Val, Asp425Asn) occurred in patients with cardiac septal defects. 2 led to polarity changes. Non-synonymous GATA4 sequence variants sometimes occur in septal defects & rarely in conotruncal defects. Tbx18 interacts with Gata4 and Nkx2-5 and competes Tbx5-mediated activation of the cardiac Natriuretic peptide precursor type a-promoter. Tbx18 downregulates Tbx6-activated Delta-like 1 expression in the somitic mesoderm in vivo. Hypermethylation of the GATA4 is associated with lung cancer. NEIL2 belongs to a class of DNA glycosylases homologous to the bacterial Fpg/Nei family. 
These glycosylases initiate the first step in base excision repair by cleaving bases damaged by reactive oxygen species and introducing a DNA strand break via the associated lyase reaction. A novel missense variant C367A and several other mutations were found in familial colorectal cancer DNA suggesting a limited role for this gene in the development of CRC. DNA binding RNA polymerase II transcription factor activity metal ion binding protein binding sequence-specific DNA binding transcription activator activity transcription factor activity zinc ion binding damaged DNA binding hydrolase activity, acting on glycosyl bonds lyase activity metal ion binding oxidized purine base lesion DNA N-glycosylase activity zinc ion binding heart development multicellular organismal development positive regulation of transcription positive regulation of transcription, DNA-dependent regulation of transcription, DNA-dependent transcription transcription from RNA polymerase II promoter base-excision repair metabolic process nucleotide-excision repair Nucleus Nucleus integral to membrane Membrane cathepsin B1; cysteine protease; preprocathepsin B 187 8p23.1 1181471411815813 OR7E158P: olfactory receptor, family 7, subfamily E, member 158 pseudogene 392194 pseudogene 188 8p23.1 1182338611824581 OR7E161P: olfactory receptor, family 7, subfamily E, member 161 pseudogene 389626 pseudogene 189 8p23.1 613210 8p23.1 DEFB137 613210 191 8p23.1 DEFB137: beta-defensin 137 DEFB136: beta-defensin 136 DEFB134: beta-defensin 134 DEFB136 190 1186885511869517 1187723911879508 1188889811891169 DEFB134; MGC163333; MGC163335 613211 protein coding protein coding protein coding 192 8p23.1 1189252411929558 OR7E160P: olfactory receptor, family 7, subfamily E, member 160 pseudogene 402333 pseudogene 193 8p23.1 1193842111944543 1001281 74 protein coding 194 8p23.1 1195930711966665 1001332 67 protein coding 195 8p23.1 
1198425612010434 LOC10012817: similar to betadefensin 131 LOC100133267: similar to betadefensin 130 LOC728957: similar to zinc finger protein 75 728957 protein coding MGC131746 amyloid precursor protein (APP). Incomplete proteolytic processing of APP has been suggested to be a causative factor in Alzheimer disease, the most common cause of dementia. Overexpression of the encoded protein, which is a member of the peptidase C1 family, has been associated with esophageal adenocarcinoma and other tumors. At least five transcript variants encoding the same protein have been found for this gene. Olfactory receptors interact with odorant molecules in the nose, to initiate a neuronal response that triggers the perception of a smell. The olfactory receptor proteins are members of a large family of G-protein-coupled receptors (GPCR) arising from single coding-exon genes. Olfactory receptors share a 7-transmembrane domain structure with many neurotransmitter and hormone receptors and are responsible for the recognition and G protein-mediated transduction of odorant signals. The olfactory receptor gene family is the largest in the genome. The nomenclature assigned to the olfactory receptor genes and proteins for this organism is independent of other organisms. Olfactory receptors interact with odorant molecules in the nose, to initiate a neuronal response that triggers the perception of a smell.The olfactory receptor proteins are members of a large family of G-protein-coupled receptors (GPCR) arising from single coding-exon genes. Olfactory receptors share a 7-transmembrane domain structure with many neurotransmitter and hormone receptors and are responsible for the recognition and G protein-mediated transduction of odorant signals. The olfactory receptor gene family is the largest in the genome. The nomenclature assigned to the olfactory receptor genes and proteins for this organism is independent of other organisms. 
response to wounding defense response to bacterium defense response to bacterium defense response to bacterium extracellular region intracellular lysosome melanosome Mitochondrion extracellular region extracellular region extracellular region Olfactory receptors interact with odorant molecules in the nose, to initiate a neuronal response that triggers the perception of a smell. The olfactory receptor proteins are members of a large family of G-protein-coupled receptors (GPCR) arising from single coding-exon genes. Olfactory receptors share a 7-transmembrane domain structure with many neurotransmitter and hormone receptors and are responsible for the recognition and G protein-mediated transduction of odorant signals. The olfactory receptor gene family is the largest in the genome. The nomenclature assigned to the olfactory receptor genes and proteins for this organism is independent of other organisms. Annotation category: spans an assembly gap 21 Supplementary Information TABLE S1: GENES ON CHROMOSOME 8p: LOCALIZATION AND DESCRIPTION 196 8p23.1 1201069512011546 197 8p23.1 1201683612047213 198 8p23.1 1202277612024213 199 8p23.1 1202733512029244 200 8p23.1 1203208612033678 201 8p23.1 1206711712070072 202 8p23.1 1207702212089033 203 8p23.1 1221284312220196 204 8p23.1 1223736512263733 205 8p23.1 1226389412264754 206 8p23.1 1227621312277743 207 8p23.1 1228055212282461 208 8p23.1 1228937412291715 209 8p23.1 12322773- LOC100133184: similar to hCG1990697 LOC100132923: similar to hCG1993470 LOC392196: deubiquitinating enzyme 3 pseudogene LOC392197: similar to deubiquitinating enzyme 3 DUB3: deubiquitinating enzyme 3 FAM90A2P: family with sequence similarity 90, member A2 pseudogene FAM86B1: family with sequence similarity 86, member B1 DEFB130: defensin, beta 130 ZNF705C: zinc finger protein 705C pseudogene LOC100133172: similar to hCG1990697 LOC649346: similar to deubiquitinating enzyme 3 DUB2 LOC649352: similar to deubiquitinating enzyme 3 DUB1 LOC100128995: hypothetical 
protein LOC100128995 FAM90A25P: USP17L7 1001331 84 protein coding 1001329 23 unknown 392196 pseudogene 392197 protein coding 377630 610186 protein coding 729689 pseudogene MGC104828, MGC16279 85002 protein coding DEFB-30: betadefensin 130; defensin, beta 30 ZNF705C; MGC131746 245940 protein coding 389631 pseudogene 1001331 72 protein coding DUB2 649346 pseudogene DUB1 649352 pseudogene 1001289 95 protein coding 389633 pseudogene DUB3 is a member of the ubiquitin processing protease (UBP) subfamily of deubiquitinating enzymes. human DUB-3, like the murine DUB family members, is transiently induced in response to cytokines and can, when constitutively expressed, block growth factor-dependent proliferation ubiquitin thiolesterase activity ubiquitin-dependent protein catabolic process cysteine-type peptidase activity ubiquitin thiolesterase activity apoptosis ubiquitin cycle ubiquitin-dependent protein catabolic process defense response to bacterium Nucleus Nucleus extracellular region 22 Supplementary Information TABLE S1: GENES ON CHROMOSOME 8p: LOCALIZATION AND DESCRIPTION 12316402 210 8p23.1 1233826212325843 211 8p23.1 1235017112370407 212 8p23.1 1237872812378901 213 8p23.1 1244833312448991 214 8p23.1 1248018612481558 215 8p23.1 1249685012567490 216 8p23.1 1250916312509772 217 8p23.1 1258592012587145 218 8p23.1 1258598212587582 219 8p23.1 1259825012598905 family with sequence similarity 90, member A25 pseudogene FAM86B2: family with sequence similarity 86, member B2 LOC646344: similar to sphingomyelinase , intestinal alkaline LOC100127885: similar to liverrelated low express protein 1 LOC100131718: hypothetical LOC100131718 LOC100131581: hypothetical LOC100131581 LOC729732: hypothetical protein LOC729732 LOC100132309: hypothetical LOC100132309 OR7E8P: olfactory receptor, family 7, subfamily E, member 8 pseudogene LOC442381: similar to Protein C11orf2 (Another new gene 2 protein) OR7E15P: olfactory receptor, family 7, subfamily E, member 15 Q4KMP3 653333 protein 
coding similar to hCG2042391 646344 protein coding 100127885 protein coding 100131718 pseudogene 100131581 protein coding 729732 pseudogene 100132309 pseudogene 346708 pseudogene 442381 protein coding 8588 pseudogene similar to LOC649305 protein OR11-11 A OST001; OR7E42P; OR7E80P; OR11392 Olfactory receptors interact with odorant molecules in the nose, to initiate a neuronal response that triggers the perception of a smell. The olfactory receptor proteins are members of a large family of G-protein-coupled receptors (GPCR) arising from single coding-exon genes. Olfactory receptors share a 7-transmembrane domain structure with many neurotransmitter and hormone receptors and are responsible for the recognition and G protein-mediated transduction of odorant signals. The olfactory receptor gene family is the largest in the genome. The nomenclature assigned to the olfactory receptor genes and proteins for this organism is independent of other organisms. olfactory receptor activity receptor activity G-protein coupled receptor protein signaling pathway response to stimulus sensory perception of smell signal transduction integral to membrane plasma membrane Olfactory receptors interact with odorant molecules in the nose, to initiate a neuronal response that triggers the perception of a smell. The olfactory receptor proteins are members of a large family of G-protein-coupled receptors (GPCR) arising from single coding-exon genes. 
Olfactory receptors share a 7-transmembrane domain structure with many neurotransmitter and hormone receptors and are responsible for the recognition and G protein-mediated transduction of odorant signals. The olfactory receptor gene family is the largest in the genome. The nomenclature assigned to the olfactory receptor genes and proteins for this organism is independent of other organisms. pseudogene 220 8p23.1 12604832-12606018 221 8p23.1 222 8p23.1 12623777-12657363 223 8p22 12847554-12931657 224 8p22 12985243-13416766 225 8p22 13468723-13470172 226 8p22 13991744-15140163 OR7E10P: olfactory receptor, family 7, subfamily E, member 10 pseudogene LOC653883: similar to Zinc finger protein 90 (Zfp-90) (Zinc finger protein NK10) LONRF1: LON peptidase N-terminal domain and ring finger 1 OR11-1 pseudogene 10823 653883 protein coding FLJ23749, RNF191 91694 protein coding C8orf79: chromosome 8 open reading frame 79 DLC1: deleted in liver cancer FLJ36980; KIAA1456; MGC43113 57604 protein coding HP; ARHGAP7; STARD12; FLJ21120; p122RhoGAP 10395 C8orf48: chromosome 8 open reading frame 48 SGCZ: sarcoglycan zeta FLJ25402 157773 MGC149397, ZSG1: zeta-sarcoglycan 137868 604258 protein coding Olfactory receptors interact with odorant molecules in the nose, to initiate a neuronal response that triggers the perception of a smell. The olfactory receptor proteins are members of a large family of G-protein-coupled receptors (GPCR) arising from single coding-exon genes. Olfactory receptors share a 7-transmembrane domain structure with many neurotransmitter and hormone receptors and are responsible for the recognition and G protein-mediated transduction of odorant signals. The olfactory receptor gene family is the largest in the genome. The nomenclature assigned to the olfactory receptor genes and proteins for this organism is independent of other organisms. 
Annotation category: not annotated on reference assembly ATP-dependent peptidase activity metal ion binding protein binding zinc ion binding methyltransferase activity transferase activity ATP-dependent proteolysis This gene is deleted in the primary tumor of hepatocellular carcinoma. It is suggested that this gene is a candidate tumor suppressor gene for human liver cancer, as well as for prostate, lung, colorectal, and breast cancers. Alternative splicing at this locus results in several transcript variants encoding different isoforms. GTPase activator activity Rho GTPase activator activity Protein binding cytoskeleton organization Cytoplasm and biogenesis TAS Intracellular PubMed negative regulation of cell growth NAS PubMed regulation of cell adhesion signal transduction This protein is part of the sarcoglycan complex, a group of 6 proteins. The sarcoglycans are all N-glycosylated transmembrane proteins with a short intracellular domain, a single transmembrane region and a large extracellular domain containing a carboxyl-terminal cluster with several conserved cysteine residues. The sarcoglycan complex is part of the dystrophin-associated glycoprotein complex (DGC), which bridges the inner cytoskeleton and the extracellular matrix. 
Protein binding cytoskeleton organization and biogenesis metabolic process unknown 608113 protein coding cytoplasm cytoskeleton integral to membrane plasma membrane sarcoglycan complex 227 8p22 14210906-14212368 228 8p22 229 8p22 14755318-14755390 15442101-15666366 230 8p22 15590790-15591440 231 8p22 15707781-15709283 232 8p22 15844948-15845412 233 8p22 16009761-16094595 234 8p22 16177578-16178181 235 8p22 16276499-16276966 LOC100131565: similar to Eukaryotic translation initiation factor 4E MIRN383: microRNA 383 TUSC3: tumor suppressor candidate 3 LOC100129568: hypothetical LOC100129568 LOC137012: protein phosphatase 1A pseudogene LOC646433: ribosomal protein L32-like pseudogene MSR1: macrophage scavenger receptor 1 LOC646440: similar to chaperonin containing TCP1, subunit 3 (gamma) MRPL49P2: mitochondrial ribosomal protein L49 pseudogene 2 100131565 pseudogene hsa-mir-383 494332 miscRNA D8S1992, MGC13453, N33, OST3A: Putative prostate cancer tumor suppressor 7991 similar to clathrin light-chain A 100129568 601385 CD204, SCARA1, SR-A, phSR1, phSR2 macrophage acetylated LDL receptor I and II; macrophage scavenger receptor type III; scavenger receptor class A, member 1 similar to matricin mitochondrial ribosomal protein L49 pseudogene 2 contributes_to dolichyl-diphosphooligosaccharide-protein glycotransferase activity protein amino acid N-linked glycosylation via asparagine endoplasmic reticulum integral to membrane membrane oligosaccharyl transferase complex This gene encodes the class A macrophage scavenger receptors, which include three different types (1, 2, 3) generated by alternative splicing of this gene. These receptors or isoforms are macrophage-specific trimeric integral membrane glycoproteins and have been implicated in many macrophage-associated physiological and pathological processes including atherosclerosis, Alzheimer's disease, and host defense. 
The isoforms type 1 and type 2 are functional receptors and are able to mediate the endocytosis of modified low density lipoproteins (LDLs). The isoform type 3 does not internalize modified LDL (acetyl-LDL) despite having the domain shown to mediate this function in the types 1 and 2 isoforms. It has an altered intracellular processing and is trapped within the endoplasmic reticulum, making it unable to perform endocytosis. The isoform type 3 can inhibit the function of isoforms type 1 and type 2 when co-expressed, indicating a dominant negative effect and suggesting a mechanism for regulation of scavenger receptor activity in macrophages. lipid transporter activity receptor activity phosphate transport receptor-mediated endocytosis Cytoplasm integral to plasma membrane pseudogene pseudogene 646433 4481 This gene is a candidate tumor suppressor gene. It is located within a homozygously deleted region of a metastatic prostate cancer. The gene is expressed in most nonlymphoid human tissues including prostate, lung, liver, and colon. Expression was also detected in many epithelial tumor cell lines. Two transcript variants encoding distinct isoforms have been identified for this gene. 
pseudogene 137012 hCG_1794003 protein coding 153622 protein coding 646440 pseudogene 346711 pseudogene 236 8p22 16894705-16904045 FGF20: fibroblast growth factor 20 26281 605558 protein coding 237 8p22 16929119-17024518 EFHA2: EF-hand domain family, member A2 238 8p22 DKFZp313A0139 , EF hand domain family A2; EF hand domain family, member A2 similar to mCG50795 286097 610633 Protein coding 100130392 pseudogene 239 8p22 17058402-17124612 ZNF372 rec; zinc finger, DHHC domain containing 2 51201 Protein coding 240 8p22 17131111-17148758 CNOT7: CCR4-NOT transcription complex, subunit 7 CAF1, hCAF-1, BTG1 binding factor 1; carbon catabolite repressor protein (CCR4)-associative factor 1 29883 604913 Protein coding The protein encoded by this gene binds to an anti-proliferative protein, B-cell translocation protein 1, which negatively regulates cell proliferation. Binding of the two proteins, which is driven by phosphorylation of the anti-proliferative protein, causes signaling events in cell division that lead to changes in cell proliferation associated with cell-cell contact. The protein has both mouse and yeast orthologs. Alternate splicing of this gene results in two transcript variants encoding different isoforms. 241 8p22 17148851-17197418 VPS37A: vacuolar protein sorting 37 homolog A (S. cerevisiae) 137492 609927 Protein coding HCRP1 is a subunit of mammalian ESCRT-I, and its function is essential for lysosomal sorting of EGF receptors. 
Results strongly suggest that HCRP1 may be a growth-inhibitory protein associated with decreased invasion of HCC cells. 8p22 17199910-17315207 MTMR7: myotubularin related protein 7 FLJ32642, FLJ42616, HCRP1, hepatocellular carcinoma related protein 1; vacuolar protein sorting 37A DKFZp781E194, MGC163449, MGC163451 242 9108 603562 Protein coding 243 8p22 17333702-17373392 LOC646479: similar to ADAM 29 precursor (A disintegrin and metalloproteinase domain 29) similar to mCG50795 100130392 17052181-17052768 LOC100130392: hypothetical LOC100130392 ZDHHC2: zinc finger, DHHC-type containing 2 The protein encoded by this gene is a member of the fibroblast growth factor (FGF) family. FGF family members possess broad mitogenic and cell survival activities, and are involved in a variety of biological processes including embryonic development, cell growth, morphogenesis, tissue repair, tumor growth and invasion. This gene was shown to be expressed in normal brain, particularly the cerebellum. The rat homolog is preferentially expressed in the brain and able to enhance the survival of midbrain dopaminergic neurons in vitro. 
growth factor activity cell growth cell-cell signaling signal transduction calcium ion binding metal ion binding palmitoyltransferase activity transferase activity zinc ion binding nucleic acid binding protein binding signal transducer activity transcription activator activity transcription factor activity hydrolase activity inositol or phosphatidylinositol phosphatase activity protein tyrosine phosphatase activity extracellular region soluble fraction integral to membrane membrane protein palmitoylation integral to membrane membrane carbohydrate metabolic process positive regulation of transcription from RNA polymerase II promoter regulation of transcription, DNA-dependent signal transduction transcription protein transport CCR4-NOT complex nucleus Cytoplasm, Endosome, late endosome membrane, membrane, nucleus phospholipid dephosphorylation protein amino acid dephosphorylation pseudogene 244 8p22 17440685-17472300 SLC7A2: solute carrier family 7 (cationic amino acid transporter, y+ system), member 2 245 8p22 17478986-17544903 PDGFRL: platelet-derived growth factor receptor-like 246 8p22 17545583-17702666 MTUS1: mitochondrial tumor suppressor 1 247 8p22 17766180-17797327 FGL1: fibrinogen-like 1 248 8p22 17824779-17929764 PCM1: pericentriolar material 1 ATRC2, CAT-2, HCAT2, amino acid transporter, cationic 2; cationic amino acid transporter, y+ system; low-affinity cationic amino acid transporter-2; solute carrier family 7, member 2 PDGRL, PRLTS, platelet-derived growth factor receptor-like protein; platelet-derived growth factor-beta-like tumor suppressor ATIP, DKFZp586D15, DZp686F20243, FLJ14295, KIAA1288, MP44, MTSG1, AT2 receptor-interacting protein; erythroid differentiation-related; mitochondrial tumor suppressor gene 1; transcription factor MTSG1 HFREP1, HP041, LFIRE1, MGC12455, hepassocin; hepatocellular carcinoma-related sequence; hepatocyte-derived fibrinogen-related protein 
1 PTC4 6542 601872 Protein coding Keratinocytes express cationic amino acid transporters 1 and 2. Cationic amino acid transporter-mediated L-arginine transport is essential for inducible nitric oxide synthase and arginase, which modulate proliferation and differentiation of epidermal cells. Insulin increased L-arginine transport and the mRNA levels for hCAT-1 and hCAT-2B. L-amino acid transmembrane transporter activity basic amino acid transmembrane transporter activity L-amino acid transport amino acid metabolic process transport integral to plasma membrane membrane membrane fraction 5157 604584 Protein coding This gene encodes a protein with significant sequence similarity to the ligand binding domain of platelet-derived growth factor receptor beta. Mutations in this gene, or deletion of a chromosomal segment containing this gene, are associated with sporadic hepatocellular carcinomas, colorectal cancers, and non-small cell lung cancers. This suggests this gene product may function as a tumor suppressor. platelet activating factor receptor activity platelet-derived growth factor beta-receptor activity biological_process extracellular region 57509 609589 Protein coding This gene encodes a protein which contains a C-terminal domain able to interact with the angiotensin II (AT2) receptor and a large coiled-coil region allowing dimerization. Multiple alternatively spliced transcript variants encoding different isoforms have been found for this gene. One of the transcript variants has been shown to encode a mitochondrial protein that acts as a tumor suppressor and participates in AT2 signaling pathways. Other variants may encode nuclear or transmembrane proteins but it has not been determined whether they also participate in AT2 signaling pathways. 
cell cycle negative regulation of cell cycle Golgi apparatus Mitochondrion nucleus plasma membrane 2267 605776 Protein coding This protein is homologous to the carboxy terminus of the fibrinogen beta- and gamma-subunits which contains the four conserved cysteines of fibrinogens and fibrinogen related proteins. However, it lacks the platelet-binding site, cross-linking region and a thrombin-sensitive site which are necessary for fibrin clot formation. It may play a role in the development of hepatocellular carcinomas. Four alternatively spliced transcript variants encoding the same protein exist for this gene. receptor binding signal transduction extracellular region fibrinogen complex 5108 600299 Protein coding Multiple processes are involved in regulating the abundance of NIMA (never in mitosis gene a)-related kinase 2 at the centrosome, including microtubule binding, the centriolar satellite component PCM-1, and localized protein degradation. A genetic translocation in atypical chronic protein binding cytoplasm cytoskeleton pericentriolar material 249 8p22 17958221-17986757 250 8p22 251 8p22 17990181-17995697 252 8p22 18111895-18125100 253 8p22 18271499-18273867 254 8p22 18285049-18285707 255 8p22 18293035-18302962 17988662-17988890 ASAH1: N-acylsphingosine amidohydrolase (acid ceramidase) 1 MRPS18CP3: mitochondrial ribosomal protein S18C pseudogene 3 LOC100133073: hypothetical LOC100133073 NAT1: N-acetyltransferase 1 (arylamine N-acetyltransferase) AACP: arylamide acetylase pseudogene LOC392206: similar to ribosomal protein L10a NAT2: N-acetyltransferase 2 (arylamine N-acetyltransferase) AC, ASAH, FLJ21558, FLJ22079, PHP, PHP32, acylsphingosine deacylase AAC1, NATI, N-acetyltransferase 1; arylamide acetylase 1 (N-acetyltransferase 1); arylamine N-acetyltransferase 1 427 228000 359764 pseudogene 100133073 pseudogene 9 108345 carboxylic acid metabolic process ceramide metabolic process 
fatty acid metabolic process lung development response to organic substance lysosome Specific and quantitative reverse transcription-polymerase chain reaction assay for transcription from the major NAT1 promoter detected high expression with limited variability in human tissues. For esophageal and gastric adenocarcinomas, no consistent patterns of elevated risk were associated with one or two copies of NAT1*10 or *11 alleles. Genetic variation may affect the degree of association between pre-1980 hair dye use and the risk of non-Hodgkin lymphoma. In breast, NAT1 mRNA is transcribed from a strong promoter located 11.8 kb upstream of the translated exon, and the mature spliced mRNA includes at least one additional non-coding exon. Single nucleotide polymorphisms of NAT1 and NAT2, and acetylation haplotype were not associated with increased risk for Parkinson disease. The genotype for the NAT1 C1095A polymorphism does not appear to be an independent risk factor for spina bifida. acetyltransferase activity arylamine N-acetyltransferase activity arylamine N-acetyltransferase activity transferase activity metabolic process cytoplasm This gene encodes N-acetyltransferase 2 (arylamine N-acetyltransferase 2). This enzyme functions to both activate and deactivate arylamine and hydrazine drugs and carcinogens. Polymorphisms in this gene are responsible for the N-acetylation polymorphism in which human populations segregate into rapid, intermediate, and slow acetylator phenotypes. Polymorphisms in NAT2 are also associated with higher incidences of cancer and drug toxicity. A second arylamine N-acetyltransferase gene (NAT1) is located near NAT2. 
acetyltransferase activity arylamine N-acetyltransferase activity arylamine N-acetyltransferase activity transferase activity metabolic process cytoplasm pseudogene 392206 10 Protein coding ceramidase activity hydrolase activity transferase activity, transferring acyl groups, acyl groups converted into alkyl on transfer pseudogene 11 AAC2, Arylamine N-acetyltransferase 2; arylamide acetylase 2; arylamide acetylase 2 (N-acetyltransferase 2, isoniazid inactivation); arylamine N-acetyltransferase Protein coding myeloid leukemia yields a new PCM1-JAK2 fusion gene. The rearrangement created by the t(8;9)(p22;p24) was studied using dual-colour FISH on metaphases from patient cells with labelled BAC clones centred on PCM1. The PCM1 gene is implicated in susceptibility to schizophrenia and is associated with orbitofrontal gray matter volumetric deficits. This gene encodes a heterodimeric protein consisting of a nonglycosylated alpha subunit and a glycosylated beta subunit that is cleaved to the mature enzyme posttranslationally. The encoded protein catalyzes the synthesis and degradation of ceramide into sphingosine and fatty acid. Mutations in this gene have been associated with a lysosomal storage disorder known as Farber disease. Two transcript variants encoding distinct isoforms have been identified for this gene. 
243400 Protein coding 256 8p22 18432343-18915476 PSD3: pleckstrin and Sec7 domain containing 3 257 8p22 18635270-18635569 LOC100131275: similar to nervous system abundant protein 11 LOC100128993: similar to hCG2036572 LOC442382: ATPase, Ca++ transporting, plasma membrane 1 pseudogene SH2D4A: SH2 domain containing 4A ChGn: chondroitin beta1,4 N-acetylgalactosaminyltransferase 258 8p22 259 8p22 19140227-19142663 260 8p21.2 19215483-19297596 261 8p21.2 19305952-19584374 262 8p21.2 19500730-19501992 263 8p21.3 19719198-19753869 264 8p21.3 19841058-19869049 19085464-19086494 LOC100130604: hypothetical protein LOC100130604 INTS10: integrator complex subunit 10 LPL: lipoprotein lipase 2 DKFZp761K142, EFA6R, HCA67, ADP-ribosylation factor guanine nucleotide factor 6; hepatocellular carcinoma-associated antigen 67 NSAP11 23362 Protein coding FLJ20967, SH2A CSGalNAcT-1, ChGn, FLJ11264, beta4GalNAcT ARF guanyl-nucleotide exchange factor activity regulation of ARF protein signal transduction cell junction intracellular postsynaptic membrane synapse Protein coding 100131275 hypothetical protein LOC100128993 Down-regulated or absent in ovarian cancer, with an impact on survival. Protein coding 100128993 442382 pseudogene 63898 Protein coding protein binding 306375 Protein coding glucuronosyl-N-acetylgalactosaminyl-proteoglycan 4-beta-N-acetylgalactosaminyltransferase activity glucuronosyltransferase activity glucuronylgalactosylproteoglycan 4-beta-N-acetylgalactosaminyltransferase activity peptidoglycan glycosyltransferase activity UDP-N-acetylgalactosamine metabolic process UDP-glucuronate metabolic process chondroitin sulfate biosynthetic process chondroitin sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process soluble fraction 100130604 Protein coding protein binding snRNA processing integrator complex nucleus heparin binding hydrolase activity lipid 
transporter activity lipoprotein lipase activity blood circulation fatty acid metabolic process lipid catabolic process phospholipid metabolic process chylomicron extracellular region plasma membrane C8orf35, FLJ10569, INT10 55174 611353 Protein coding HDLCQ11, LIPD 4023 609708 Protein coding LPL encodes lipoprotein lipase, which is expressed in heart, muscle, and adipose tissue. LPL functions as a homodimer, and has the dual functions of triglyceride hydrolase and ligand/bridging factor for receptor-mediated lipoprotein uptake. Severe mutations that cause LPL deficiency result in type I hyperlipoproteinemia, while less extreme mutations in LPL are linked to many disorders of lipoprotein metabolism. phospholipase activity triacylglycerol lipase activity drug transporter activity monoamine transmembrane transporter activity triacylglycerol metabolic process 265 8p21.3 20046652-20084997 SLC18A1: solute carrier family 18 (vesicular monoamine), member 1 CGAT, VAT1, VMAT1 6570 193002 Protein coding Expression analysis confirmed that VMAT1 is expressed in human brain at the mRNA and protein level. Results suggest that variations in the VMAT1 gene may confer susceptibility to bipolar disorder in patients of European descent. Expression of VMAT1 is greater in von Hippel-Lindau syndrome than in multiple endocrine neoplasia type 2. 266 8p21.3 20098984-20123487 ATP6V1B: ATPase, H+ transporting, lysosomal 56/58kDa, V1 subunit B2 526 606939 Protein coding This gene encodes a component of vacuolar ATPase (V-ATPase), a multisubunit enzyme that mediates acidification of eukaryotic intracellular organelles. V-ATPase dependent organelle acidification is necessary for such intracellular processes as protein sorting, zymogen activation, receptor-mediated endocytosis, and synaptic vesicle proton gradient generation. V-ATPase is composed of a cytosolic V1 domain and a transmembrane V0 domain. 
The V1 domain consists of three A, three B, and two G subunits, as well as a C, D, E, F, and H subunit. The V1 domain contains the ATP catalytic site. The protein encoded by this gene is one of two V1 domain B subunit isoforms and is the only B isoform highly expressed in osteoclasts. hydrogen ion transporting ATP synthase activity, rotational mechanism hydrogen ion transporting ATPase activity, rotational mechanism hydrogen-exporting ATPase activity, phosphorylative mechanism hydrolase activity metal ion binding ATP synthesis coupled proton transport energy coupled proton transport, against electrochemical gradient ion transport 267 8p21.3 20147956-20157083 LZTS1: leucine zipper, putative tumor suppressor 1 ATP6B1B2, ATP6B2, HO57, VATB, VPP3, Vma2, ATPase, H+ transporting, lysosomal (vacuolar proton pump), beta polypeptide, 56/58kD, isoform 2; ATPase, H+ transporting, lysosomal 56/58kDa, V1 subunit B, isoform 2; H+ transporting two-sector ATPase; V-ATPase B2 subunit; endomembrane proton pump 58 kDa subunit; vacuolar H+-ATPase 56,000 subunit; vacuolar H+-ATPase B2 F37, FEZ1, F37/Esophageal cancer-related gene-coding leucine-zipper motif cytoplasmic vesicle cytoplasmic vesicle membrane integral to membrane membrane membrane fraction cytoplasm melanosome membrane proton-transporting two-sector ATPase complex 11178 606551 Protein coding Variation in the germline sequence is associated with prostate cancer risk. 
Down-regulation of the FEZ1/LZTS1 gene with frequent loss of heterozygosity is associated with oral squamous cell carcinomas. protein binding transcription factor activity cell cycle negative regulation of cell cycle regulation of transcription, DNA-dependent transcription cell junction, cell projection, cytoplasm nucleus, plasma membrane, postsynaptic membrane, synapse 268 8p21.3 20834312-20836707 269 8p21.3 20836100-20874499 270 8p21.3 21593810-21652734; 21688430-21702292 LOC724059: transmembrane protein 97 pseudogene LOC100129163: hypothetical LOC100129163 GFRA2: GDNF family receptor alpha 2 Glial cell line-derived neurotrophic factor (GDNF) and neurturin (NTN) are two structurally related, potent neurotrophic factors that play key roles in the control of neuron survival and differentiation. The protein encoded by this gene is a member of the GDNF receptor family. It is a glycosylphosphatidylinositol (GPI)-linked cell surface receptor for both GDNF and NTN, and mediates activation of the RET tyrosine kinase receptor. This encoded protein acts preferentially as a receptor for NTN compared to its other family member, GDNF family receptor alpha 1. This gene is a candidate gene for RET-associated diseases. glial cell line-derived neurotrophic factor receptor activity transmembrane receptor protein tyrosine kinase signaling pathway extrinsic to membrane plasma membrane 724059 pseudogene similar to MAC30 100129163 pseudogene GDNFRB, NRTNR-ALPHA, NTNRA, RETL2, TRNR2, GFRalpha 2; PI-linked cell-surface accessory protein; RET ligand 2; TGF-beta related neurotrophic factor receptor 2; TRN receptor, GPI-anchored; glial cell line derived neurotrophic factor receptor, beta; glial cell line-derived neurotrophic factor family receptor alpha2b; neurturin receptor alpha 2675 601956 Protein coding drug transport monoamine transport 
GFRalpha-2 was observed within sensory and motor nuclei of cranial nerves, dorsal column nuclei, olivary nuclear complex, reticular formation, pontine nuclei, locus caeruleus, raphe nuclei, substantia nigra, and quadrigeminal plate. Both GFR alpha2a and GFR alpha2c, but not GFR alpha2b, promote neurite outgrowth in transfected Neuro2A cells. 403294 pseudogene Olfactory receptors interact with odorant molecules in the nose, to initiate a neuronal response that triggers the perception of a smell. The olfactory receptor proteins are members of a large family of G-protein-coupled receptors (GPCR) arising from single coding-exon genes. Olfactory receptors share a 7-transmembrane domain structure with many neurotransmitter and hormone receptors and are responsible for the recognition and G protein-mediated transduction of odorant signals. The olfactory receptor gene family is the largest in the genome. The nomenclature assigned to the olfactory receptor genes and proteins for this organism is independent of other organisms. The protein encoded by this gene is constitutively tyrosine phosphorylated in hematopoietic progenitors isolated from chronic myelogenous leukemia (CML) patients in the chronic phase. It may be a critical substrate for p210(bcr/abl), a chimeric protein whose presence is associated with CML. This encoded protein binds p120 (RasGAP) from CML cells. receptor activity 271 8p21.3 21710444-21711454 OR6R2P: olfactory receptor, family 6, subfamily R, member 2 pseudogene 272 8p21.3 21822330-21827151 DOK2: docking protein 2, 56kDa p56DOK, p56dok-2, docking protein 2 9046 604997 Protein coding 273 8p21.3 21833128-21918930 XPO7: exportin 7 KIAA0745, RANBP16, RAN binding protein 16 23039 606140 Protein coding The transport of protein and large RNAs through the nuclear pore complexes (NPC) is an energy-dependent and regulated process. 
The import of proteins with a nuclear localization signal (NLS) is accomplished by recognition of one or more clusters of basic amino acids by the importin-alpha/beta complex. The small GTPase RAN (MIM 601179) plays a key role in NLS-dependent protein import. RAN-binding protein-16 is a member of the importin-beta superfamily of nuclear transport receptors. binding nuclear export signal receptor activity protein transporter activity 274 8p21.3 21938300-21950354 NPM2: nucleophosmin/nucleoplasmin, 2 MGC7865, nucleoplasmin 2 10361 608073 Protein coding Expression of NPM2 in HeLa cells histone binding nucleic acid binding identical protein binding insulin receptor binding transmembrane receptor protein tyrosine kinase docking protein activity Ras protein signal transduction Cell surface receptor linked signal transduction Transmembrane receptor protein tyrosine kinase signaling pathway intracellular protein transport intracellular protein transport across a membrane mRNA transport protein export from nucleus protein import into nucleus, docking chromatin remodeling embryonic development oocyte differentiation positive regulation of meiosis cytoplasm nuclear pore nucleus cytoplasmic chromatin nuclear chromatin nucleus 275 8p21.3 21956374-21962266 FGF17: fibroblast growth factor 17 FGF-13 8822 603725 Protein coding 276 8p21.3 21972680-21995982 EPB49: erythrocyte membrane protein band 4.9 (dematin) 2039 125305 Protein coding 277 8p21.3 22002660-22017836 RAI16: retinoic acid induced 16 278 8p21.3 22020329-22023813 279 8p21.3 280 8p21.3 22158420-22158501 22188796-22269545 NUDT18: nudix (nucleoside diphosphate linked moiety X)-type motif 18 MIRN320: microRNA 320 PIWIL2: piwi-like 2 (Drosophila) DMT, FLJ78462, FLJ98848, dematin; erythrocyte membrane protein band 4.9 FLJ11125, FLJ21801, MGC138352 FLJ22494 281 8p21.3 8p21.3 22027877-22043975 HR: hairless homolog (mouse) 282 8p21.3 
22051478-22055393 REEP4: receptor accessory protein 4 283 8p21.3 22060288-22070289 LGI3: leucine-rich repeat LGI 64760 Protein coding 79873 Protein coding 407037 miscRNA The protein encoded by this gene is a member of the fibroblast growth factor (FGF) family. FGF family members possess broad mitogenic and cell survival activities, and are involved in a variety of biological processes including embryonic development, cell growth, morphogenesis, tissue repair, tumor growth and invasion. This gene was shown to be prominently expressed in the cerebellum and cortex. The mouse homolog of this gene was localized to specific sites in the midline structures of the forebrain, the midbrain-hindbrain junction, developing skeleton and developing arteries, which suggests a role in central nervous system, bone and vascular development. This gene was referred to as FGF-13 in reference 2; however, its amino acid sequence and chromosomal localization are identical to FGF17. FGF17 expression is increased 2-fold in benign prostatic hyperplasia and may contribute to the increased epithelial proliferation seen in this disease. 
Results suggest that phosphorylation of the dematin headpiece acts as a conformational switch within this headpiece domain. Results also suggest a crucial role for this proline residue in the structural stability and folding potential of HP (sub)domains, consistent with Pro-Trp stacking as a more general determinant of protein stability. growth factor activity actin binding regulation of exit from mitosis single fertilization cell-cell signaling fibroblast growth factor receptor signaling pathway nervous system development positive regulation of cell proliferation signal transduction extracellular region extracellular space actin filament bundle formation barbed-end actin filament capping cytoskeleton organization and biogenesis actin cytoskeleton cytoplasm cell differentiation meiotic prophase I multicellular organismal development spermatogenesis regulation of transcription, DNA-dependent transcription cytoplasm hydrolase activity FLJ10351, HILI, MGC133049, PIWIL1L, mili, Miwi like; piwi-like 2 ALUNC, AU, HSA277165, hairless protein 55124 610312 Protein coding Stem-cell protein Piwil2 is widely expressed in tumors and inhibits apoptosis through activation of the Stat3/Bcl-XL pathway. nucleic acid binding protein binding 55806 602302 Protein coding This gene encodes a protein whose function has been linked to hair growth. A similar protein in rat functions as a transcriptional corepressor for thyroid hormone and interacts with histone deacetylases. Mutations in this gene have been documented in cases of autosomal recessive congenital alopecia and atrichia with papular lesions. 
metal ion binding transcription factor activity zinc ion binding C8orf20, FLJ22246, FLJ22277, PP432, receptor expression enhancing protein 4 LGIL4 80346 609349 Protein coding 203190 608302 Protein coding nucleus integral to membrane membrane protein bindimg extracellular region 32 Supplementary Information TABLE S1: GENES ON CHROMOSOME 8p: LOCALIZATION AND DESCRIPTION 2207883522125782 family, member 3 BMP1: bone morphogenetic protein 1 284 8p21.3 285 8p21.3 286 8p21.3 2207511322077928 287 8p21.3 2213318122145549 PHYHIP: phytanoyl-CoA 2hydroxylase interacting protein 288 8p21.3 2215856422164625 POLR3D: polymerase (RNA) III (DNA directed) polypeptide D, 44kDa 289 8p21.3 2228073722336143 SLC39A14: solute carrier 2208330422083926 LOC100129433: hypothetical LOC100129433 SFTPC: surfactant, pulmonaryassociated protein C FLJ44432, PCOLC, PCP, TLD, procollagen C-endopeptidase 649 112264 1001294 33 PSP-C, SFTP2, SMDP2, SP-C, pulmonary surfactant apoprotein-2 SPC; surfactant pulmonaryassociated protein C DYRK1AP3, KIAA0273, PAHX-AP, DYRK1A interacting protein 3; phytanoyl-CoA alpha-hydroxylase associated protein; phytanoyl-CoA hydroxylase interacting protein BN51T, RPC4, RPC53, TSBN51, BN51 (BHK21) temperature sensitivity complementing; polymerase (RNA) III (DNA directed) polypeptide D; temperature sensitive complementation, cell cycle specific, tsBN51 KIAA0062, LZTHs4, ZIP14, Protein coding The BMP1 locus encodes a protein that is capable of inducing formation of cartilage in vivo. Although other bone morphogenetic proteins are members of the TGF-beta superfamily, BMP1 encodes a protein that is not closely related to other known growth factors. BMP1 protein and procollagen C proteinase (PCP), a secreted metalloprotease requiring calcium and needed for cartilage and bone formation, are identical. PCP or BMP1 protein cleaves the C-terminal propeptides of procollagen I, II, and III and its activity is increased by the procollagen C-endopeptidase enhancer protein. 
The BMP1 gene is expressed as alternatively spliced variants that share an N-terminal protease domain but differ in their C-terminal region astacin activity calcium ion binding cytokine activity growth factor activity metal ion binding metallopeptidase activity procollagen Cendopeptidase activity zinc ion binding cartilage condensation cell differentiation multicellular organismal development ossification proteolysis Extracellular space regulation of liquid surface tension respiratory gaseous exchange extracellular region extracellular space proteinaceous extracellular matrix pseudogene 6440 178620 Protein coding Results suggest common cellular responses, including initiation of celldeath signaling pathways, to these lung disease-associated SP-C BRICHOS domain proteins. Finding of heterozygosity for ABCA3 mutations in severely affected infants with SFTPC I73T, and independent inheritance from disease-free parents supports that ABCA3 acts as a modifier gene for the phenotype associated with an SFTPC mutation. 9796 608511 Protein coding PAHX-AP1 may contribute to new cellular functions of DYRK1A and suggest that PAHX-AP1 may be involved in the development of neurological abnormalities observed in Down syndrome patients 661 187280 Protein coding This gene complements a temperature-sensitive mutant isolated from the BHK-21 Syrian hamster cell line. It leads to a block in progression through the G1 phase of the cell cycle at nonpermissive temperatures. DNA binding DNA-directed RNA polymerase activity transcription transcription from RNA polymerase III promoter DNA-directed RNA polymerase III complex nucleus 23516 608736 Protein coding Zinc is an essential cofactor for hundreds of enzymes. 
It is involved in protein, nucleic acid, carbohydrate, and lipid metabolism, as well as in the control of gene transcription, growth, development, and differentiation. SLC39A14 belongs to a subfamily of proteins that show structural characteristics of zinc transporters. metal ion transmembrane ion transport metal ion transport integral to membrane membrane family 39 (zinc transporter), member 14 290 8p21.3 22354541-22454583 PPP3CC: protein phosphatase 3 (formerly 2B), catalytic subunit, gamma isoform 291 8p21.3 22405861-22406933 292 8p21.3 22465196-22488953 LOC652964: basic transcription factor 3 pseudogene SORBS3: sorbin and SH3 domain containing 3 293 8p21.3 22492588-22511483 294 8p21.3 295 cig19, Zrt-, Irt-like protein 14; solute carrier family 39 (metal ion transporter), member 14 CALNA3, calcineurin A gamma 5533 114107 652964 Protein coding transporter activity zinc ion binding Calmodulin-dependent protein phosphatase, calcineurin, is involved in a wide range of biologic activities, acting as a Ca(2+)-dependent modifier of phosphorylation status. This gene has been identified in humans, mice, and rats, and is highly conserved between species (90 to 95% amino acid identity). Results identify PPP3CC, located at 8p21.3, as a potential schizophrenia susceptibility gene and support the proposal that alterations in calcineurin signaling contribute to schizophrenia pathogenesis. calmodulin binding hydrolase activity iron ion binding metal ion binding phosphoprotein phosphatase activity zinc ion binding zinc ion transport calcineurin complex pseudogene SCAM-1, SCAM1, SH3D4, VINEXIN, vinexin beta (SH3-containing adaptor molecule1) 10174 610795 Protein coding Vinexin is enriched at the leading edge of migrating cells, lamellipodia and focal adhesions in well-spread cells.
Vinexin beta plays a role in maintaining the phosphorylation of EGFR on the plasma membrane through the regulation of c-Cbl, and regulates cytoskeletal organization and signal transduction. structural constituent of cytoskeleton transcription factor binding vinculin binding PDLIM2: PDZ and LIM domain 2 (mystique) FLJ34715, SLIM, PDZ and LIM domain 2; mystique 64236 609722 Protein coding metal ion binding protein binding zinc ion binding 22518202-22533929 KIAA1967 DBC-1, DBC1, deleted in breast cancer 1; p30 DBC protein 57805 607359 Protein coding Knockdown of Mystique 2 with small interfering RNA abrogated both adhesion and migration in MCF10A and MCF-7 cells. PDLIM2 deficiency resulted in larger amounts of nuclear p65, defective p65 ubiquitination and augmented production of proinflammatory cytokines in response to innate stimuli. Biological function for DBC-1 in the modulation of ERalpha expression and hormone-independent breast cancer cell survival. Caspase-dependent processing of DBC-1 may act as a feed-forward mechanism to promote apoptosis and possibly also tumor suppression. DBC-1 functions to modulate ER alpha expression and hormone-independent breast cancer cell survival.
8p21.3 22513067-22517608 Protein coding 8p21.3 22533876-22582540 FLJ34715, hypothetical protein LOC541565 MGC14978 541565 296 C8orf58: chromosome 8 open reading frame 58 BIN3: bridging integrator 3 361065 Protein coding 297 8p21.3 22554748-22555669 FLJ14107: hypothetical protein FLJ14107 80094 unknown cell-substrate adhesion negative regulation of transcription from RNA polymerase II promoter positive regulation of MAPKKK cascade positive regulation of cytoskeleton organization and biogenesis positive regulation of stress fiber formation cell junction cytoplasm cytoskeleton colocalizes_with focal adhesion membrane nucleus cell surface cytoplasm cytoskeleton nucleus molecular_function protein binding apoptosis cytoplasm mitochondrial matrix nucleus cytoskeletal adaptor activity protein binding actin filament organization barrier septum formation cell cycle cell division cytokinesis endocytosis protein localization unidimensional cell growth Cytoplasm cytoskeleton EGR3: early growth response 3 298 8p21.3 22601119-22606760 299 8p21.3 22626710-22841366 PEBP4: phosphatidylethanolamine-binding protein 4 300 8p21.3 22913407-22933653 301 8p21.3 22933591-22982637 RHOBTB2: Rho-related BTB domain containing 2 TNFRSF10B: tumor necrosis factor receptor superfamily, member 10b MGC138484, PILOT, zinc finger protein pilot 1960 CORK-1, CORK1, GWTM1933, MGC22776, PRO4408, cousin-of-RKIP 1 protein DBC2, KIAA0717, deleted in breast cancer 2 CD262, DR5, KILLER, KILLER/DR5, TRAIL-R2, TRAILR2, TRICK2, TRICK2A, TRICK2B, TRICKB, ZTNFR9, Fas-like protein; TNF-related apoptosis-inducing ligand receptor 2; TRAIL receptor 2; apoptosis inducing protein TRICK2A/2B; apoptosis inducing receptor TRAIL-R2; cytotoxic TRAIL receptor-2; death domain containing receptor for TRAIL/Apo-2L; death receptor 5; p53-regulated DNA damage-inducible cell death 157310 602419 Protein coding The gene encodes a transcriptional regulator
that belongs to the EGR family of C2H2-type zinc-finger proteins. It is an immediate-early growth response gene which is induced by mitogenic stimulation. The protein encoded by this gene participates in the transcriptional regulation of genes controlling biological rhythm. It may also play a role in muscle development. Some findings support the previous genetic association of altered calcineurin signaling with schizophrenia pathogenesis and identify EGR3 as a compelling susceptibility gene. Protein coding Anti-apoptotic hPEBP4 silencing promotes TRAIL-induced apoptosis of human ovarian cancer cells by activating ERK and JNK pathways. A novel human phosphatidylethanolamine-binding protein resists tumor necrosis factor alpha-induced apoptosis by inhibiting mitogen-activated protein kinase pathway activation and phosphatidylethanolamine externalization. RHOBTB2 is a member of the evolutionarily conserved RHOBTB subfamily of Rho GTPases. DBC2 plays an essential role in microtubule-mediated VSVG transport from the endoplasmic reticulum to the Golgi apparatus. The protein encoded by this gene is a member of the TNF-receptor superfamily, and contains an intracellular death domain. This receptor can be activated by tumor necrosis factor-related apoptosis inducing ligand (TNFSF10/TRAIL/APO-2L), and transduces an apoptosis signal. Studies with FADD-deficient mice suggested that FADD, a death domain containing adaptor protein, is required for the apoptosis mediated by this protein.
23221 607352 Protein coding 8795 603612 Protein coding metal ion binding nucleic acid binding transcription factor activity zinc ion binding circadian rhythm muscle development neuromuscular synaptic transmission peripheral nervous system development regulation of transcription, DNA-dependent transcription intracellular nucleus GTP binding nucleotide binding protein binding small GTPase mediated signal transduction intracellular TRAIL binding caspase activator activity protein binding receptor activity activation of NF-kappaB-inducing kinase caspase activation cell surface receptor linked signal transduction induction of apoptosis via death domain receptors positive regulation of IkappaB kinase/NF-kappaB cascade regulation of apoptosis signal transduction integral to membrane membrane 302 8p21.3 23016379-23030895 TNFRSF10C: tumor necrosis factor receptor superfamily, member 10c, decoy without an intracellular domain 303 8p21.3 23049046-23077485 TNFRSF10D: tumor necrosis factor receptor superfamily, member 10d, decoy with truncated death domain 304 8p21.3 23104915-23138584 TNFRSF10A: tumor necrosis factor receptor receptor (killer); tumor necrosis factor receptor-like protein ZTNFR9 CD263, DCR1, LIT, MGC149501, MGC149502, TRAILR3, TRID, TNF related TRAIL receptor; TNF related apoptosis-inducing ligand receptor 3; TRAIL receptor 3; antagonist decoy receptor for TRAIL/Apo-2L; decoy receptor 1; decoy without an intracellular domain; lymphocyte inhibitor of TRAIL; tumor necrosis factor receptor superfamily, member 10c CD264, DCR2, TRAILR4, TRUNDD, TNF receptor-related receptor for TRAIL; TRAIL receptor 4; TRAIL receptor with a truncated death domain; decoy receptor 2; decoy with truncated death domain; tumor necrosis factor receptor superfamily, member 10d APO2, CD261, DR4, MGC9365, TRAILR-1, 8794 603613 Protein coding The protein encoded by this gene is a member of the TNF-receptor
superfamily. This receptor contains an extracellular TRAIL-binding domain and a transmembrane domain, but no cytoplasmic death domain. This receptor is not capable of inducing apoptosis, and is thought to function as an antagonistic receptor that protects cells from TRAIL-induced apoptosis. This gene was found to be a p53-regulated DNA damage-inducible gene. The expression of this gene was detected in many normal tissues but not in most cancer cell lines, which may explain the specific sensitivity of cancer cells to the apoptosis-inducing activity of TRAIL. transmembrane receptor activity protein binding apoptosis integral to plasma membrane plasma membrane 8793 603614 Protein coding The protein encoded by this gene is a member of the TNF-receptor superfamily. This receptor contains an extracellular TRAIL-binding domain, a transmembrane domain, and a truncated cytoplasmic death domain. This receptor does not induce apoptosis, and has been shown to play an inhibitory role in TRAIL-induced cell apoptosis. transmembrane receptor activity Apoptosis ANTI-apoptosis signal transduction integral to membrane membrane 8797 603611 Protein coding The protein encoded by this gene is a member of the TNF-receptor superfamily. This receptor is activated by tumor necrosis factor-related apoptosis inducing ligand (TNFSF10/TRAIL), and thus transduces cell TRAIL binding caspase activator activity activation of NF-kappaB-inducing kinase apoptosis integral to membrane membrane superfamily, member 10a TRAILR1, TNF-related apoptosis inducing ligand receptor 1; cytotoxic TRAIL receptor; death receptor 4 MGC29816 death signal and induces cell apoptosis. Studies with FADD-deficient mice suggested that FADD, a death domain containing adaptor protein, is required for the apoptosis mediated by this protein.
91782 305 8p21.3 2315711423175452 CHMP7: CHMP family, member 7 306 8p21.3 2320138723209737 DKFZp564N123 203069 307 8p21.3 2321035523317667 R3HCC1: R3H domain and coiled-coil containing 1 LOXL2: lysyl oxidase-like 2 LOR2, WS9-14, lysyl oxidase homolog 2; lysyl oxidase related 2 4017 606663 Protein coding 308 8p21.3 2334607523376624 ENTPD4: ectonucleoside triphosphate diphosphohydrola se 4 KIAA0392, LALP70, LAP70, LYSAL1, NTPDase-4, UDPase, apyrase, lysosomal; guanosinediphosphatase like protein; lysosomal apyrase-like 1 9583 607577 Protein coding 309 8p21.3 2339271523400941 310 8p21.2 2344230823486008 LOC646708: similar to cysteine string protein SLC25A37: solute carrier family 25, member 37 311 8p21.2 23546171- LOC646721: 611130 51312 646721 UBPY MIT domain and another ubiquitin isopeptidase, AMSH, reveals common interactions with CHMP1A and CHMP1B but a distinct selectivity of AMSH for CHMP3/VPS24, a core subunit of the ESCRT-III complex, and UBPY for CHMP7. Results suggest that CHMP7, a novel CHMP4associated ESCRT-III-related protein, functions in the endosomal sorting pathway. 
Protein coding 646708 HT015, MFRN, MSC, MSCP, PRO1278, PRO1584, PRO2217, mitochondrial solute carrier protein; mitoferrin; predicted protein of HQ2217 similar to Protein coding death receptor activity receptor activity caspase activation induction of apoptosis induction of apoptosis via death domain receptors signal transduction protein transport cytoplasm copper ion binding electron carrier activity metal ion binding protein-lysine 6-oxidase activity scavenger receptor activity aging cell adhesion protein modification process extracellular region extracellular space membrane calcium ion binding hydrolase activity magnesium ion binding uridine-diphosphatase activity UDP catabolic process Golgi apparatus Golgi membrane cytoplasmic vesicle integral to Golgi membrane integral to membrane membrane binding iron ion binding iron ion transmembrane transporter activity ion transport mitochondrial iron ion transport integral to membrane membrane mitochondrial inner membrane mitochondrion nucleic acid binding This gene encodes a member of the lysyl oxidase gene family. The prototypic member of the family is essential to the biogenesis of connective tissue, encoding an extracellular copper-dependent amine oxidase that catalyses the first step in the formation of crosslinks in collagens and elastin. A highly conserved amino acid sequence at the C-terminus end appears to be sufficient for amine oxidase activity, suggesting that each family member may retain this function. 
The N-terminus is poorly conserved and may impart additional roles in developmental regulation, senescence, tumor suppression, cell growth control, and chemotaxis to each member of the family The VSFASSQQ motif confers calcium sensitivity to LALP70 during UDP cleavage pseudogene 610387 Protein coding pseudogene 37 Supplementary Information TABLE S1: GENES ON CHROMOSOME 8p: LOCALIZATION AND DESCRIPTION 23548965 similar to Protein FAM60A (Tera protein) NKX3-1: NK3 homeobox 1 312 8p21.2 2359217323596395 313 8p21.2 2361590923618280 NKX2-6: NK2 transcription factor related, locus 6 (Drosophila) 314 8p21.2 2375537923768265 315 8p21.2 316 317 hCG2020539 BAPX2, NKX3, NKX3.1, NKX3A, NK homeobox (Drosophila), family 3, A; NK3 transcription factor homolog A; NK3 transcription factor related, locus 1 CSX2, NKX4-2, tinman paralog 4824 602041 Protein coding The homeodomain-containing transcription factor NKX3A is a putative prostate tumor suppressor that is expressed in a largely prostate-specific and androgen-regulated manner. 
Loss of NKX3A protein expression is a common finding in human prostate carcinomas and prostatic intraepithelial neoplasia protein bindimg sequence-specific DNA binding transcription factor activity multicellular organismal development regulation of transcription, DNA-dependent nucleus 137814 611770 Protein coding Weakly activates transcription of a Cx40 promoter, may have role in heart development sequence-specific DNA binding transcription factor activity nucleus STC1: stanniocalcin 1 STC 6781 601185 Protein coding hormone activity 2420756524268660 ADAM28: ADAM metallopeptidase domain 28 10863 606188 Protein coding metal ion binding metalloendopeptidase activity zinc ion binding proteolysis spermatogenesis extracellular region integral to membrane plasma membrane 8p21.2 2429791624319471 ADAMDEC1: ADAM-like, decysin 1 ADAM23, MDCLm, MDC-Ls, MDCL, Emdcii, a disintegrin and metalloproteinase domain 28; metalloproteinaselike, disintegrinlike, and cysteinerich protein-L M12.219, decysin; disintegrin protease This gene encodes a secreted, homodimeric glycoprotein that is expressed in a wide variety of tissues and may have autocrine or paracrine functions. The gene contains a 5' UTR rich in CAG trinucleotide repeats. The encoded protein contains 11 conserved cysteine residues and is phosphorylated by protein kinase C exclusively on its serine residues. The protein may play a role in the regulation of renal and intestinal calcium and phosphate transport, cell metabolism, or cellular calcium/phosphate homeostasis. Overexpression of human stanniocalcin 1 in mice produces high serum phosphate levels, dwarfism, and increased metabolic rate. This gene has altered expression in hepatocellular, ovarian, and breast cancers This gene encodes a member of the ADAM (a disintegrin and metalloprotease domain) family. 
Members of this family are membraneanchored proteins structurally related to snake venom disintegrins, and have been implicated in a variety of biological processes involving cell-cell and cell-matrix interactions, including fertilization, muscle development, and neurogenesis. The protein encoded by this gene is a lymphocyte-expressed ADAM protein. Alternative splicing results in two transcript variants. The shorter version encodes a secreted isoform, while the longer version encodes a transmembrane isoform. embryonic heart tube development multicellular organismal development regulation of transcription, DNA-dependent cell surface receptor linked signal transduction cell-cell signaling cellular calcium ion homeostasis response to nutrient 27299 606393 Protein coding This encoded protein is thought to be a secreted protein belonging to the disintegrin metalloproteinase family. Its expression is upregulated during dendritic cells maturation. This protein may play an important role in dendritic cell function and their interactions with germinal center T cells. 2435448424422171 ADAM7: ADAM metallopeptidase domain 7 EAPI, GP-83, a disintegrin and metalloproteinase domain 7; epididymal apical protein I; sperm maturation-related 8756 607310 Protein coding The ADAM family is composed of zinc-binding proteins that can function as adhesion proteins and/or endopeptidases. 
They are involved in a number of biologic processes, including fertilization, neurogenesis, muscle development, and immune response. integrin-mediated signaling pathway negative regulation of cell adhesion proteolysis proteolysis extracellular region integral to membrane 8p21.2 integrin binding metal ion binding metalloendopeptidase activity zinc ion binding metalloendopeptidase activity zinc ion binding extracellular region extracellular space integral to membrane membrane 318 8p21.2 24827212-24832511 NEFM: neurofilament, medium polypeptide 150kDa 319 8p21.2 24866240-24869946 NEFL: neurofilament, light polypeptide 68kDa 320 8p21.2 24868656-24870043 321 8p21.2 25098204-25326536 322 8p21.2 25341280-25371837 323 8p21.2 25332693-25337836 LOC100129717: hypothetical protein LOC100129717 DOCK5: dedicator of cytokinesis 5 KCTD9: potassium channel tetramerisation domain containing 9 GNRH1: gonadotropin-releasing hormone 1 (luteinizing-releasing hormone) 324 8p21.2 25372428-25421342 CDCA2: cell division cycle associated 2 325 8p21.2 25757490-25958292 EBF2: early B-cell factor 2 glycoprotein GP83 NEF3, NF-M, NFM, neurofilament-3 (150 kD medium) CMT1F, CMT2E, NF-L, NF68, NFL, neurofilament, light polypeptide (68kD) 4741 162250 Protein coding 4747 162280 Protein coding Mutations in neurofilaments are possible risk factors that may contribute to pathogenesis in amyotrophic lateral sclerosis in conjunction with one or more additional genetic or environmental factors, but are not significant primary causes. Variation in NEF3 influences the rate of response to typical antipsychotic medication. Two polymorphisms of neurofilament M (Ala475Thr and Gly697Arg) occurred at similar frequencies in PD patients and controls. A Pro725Gln substitution and a deletion of valine in position 829 were identified in two PD patients.
Eight novel sequence variations have been identified in the NF-L gene in patients with Charcot-Marie-Tooth phenotype: 5 variants are polymorphisms, including 3 single nucleotide polymorphisms (SNPs), and 3 other missense mutations have been detected. Mutational analyses of the NF-L gene in Parkinsonian patients revealed three silent DNA changes (G163A, C224T, C487T) in three unrelated patients. Association studies based on these haplotypes found no differences between PD patients and controls. structural constituent of cytoskeleton structural molecule activity identical protein binding protein C-terminus binding structural constituent of cytoskeleton structural molecule activity axon cargo transport intermediate filament bundle assembly microtubule cytoskeleton organization and biogenesis neurofilament cytoskeleton organization and biogenesis regulation of axon diameter anterograde axon cargo transport axon transport of mitochondrion intermediate filament organization neurofilament bundle assembly retrograde axon cargo transport axon cytoskeleton intermediate filament neurofilament neuromuscular junction colocalizes_with TSC1-TSC2 complex axon intermediate filament neurofilament 100129717 Protein coding DKFZp451J181, DKFZp779M164, DKFZp781J211 FLJ20038 80005 Protein coding guanyl-nucleotide exchange factor activity 54793 Protein coding protein binding voltage-gated potassium channel activity potassium ion transport membrane voltage-gated potassium channel complex GNRH, GRH, LHRH, LNRH 2796 luteinizing hormone-releasing factor activity cell-cell signaling multicellular organismal development negative regulation of cell proliferation signal transduction extracellular region soluble fraction FLJ25804; RepoMan; MGC129906; MGC129907 COE2, EBF-2, FLJ11500, O/E-3, OE-3; Collier, Olf and EBF 2; OLF1/EBF-LIKE 3; metencephalonmesencephalnon- 157313 Protein coding Data show that GnRH neurones morphologically interact with astrocytes and tanycytes in the human brain and suggest
that glial cells may contribute to the process by which the neuroendocrine brain controls the function of GnRH neurones in humans. Estrogens may exert direct actions upon GnRH neurons exclusively through ER-beta. The proliferation of human ovarian cancer cell lines is time- and dose-dependently reduced by GnRH and its superagonistic analogs. JunD activated by LHRH acts as a modulator of cell proliferation and cooperates with the anti-apoptotic and anti-mitogenic functions of LHRH. Repo-Man forms an essential complex with protein phosphatase 1 (PP1) gamma and is required for the recruitment of PP1 to chromatin. Protein coding EBF2 belongs to the conserved Olf/EBF family (see MIM 164343) of helix-loop-helix transcription factors. DNA binding Metal ion binding Transcription regulation activity Zinc ion binding 64641 152760 609934 Protein coding Cell cycle Cell division mitosis nucleus Multicellular organism development Regulation of transcription, DNA-dependent nucleus 326 8p21.2 26168397-26170067 327 8p21.2 26204951-26284563 328 8p21.2 26292763-26294951 329 330 26296440-26326561 8p21.2 26346429- LOC100129404: hypothetical LOC100129404 PPP2R2A: protein phosphatase 2 (formerly 2A), regulatory subunit B, alpha isoform SDAD1P1: SDA1 domain containing 1 pseudogene 1 BNIP3L: BCL2/adenovirus E1B 19kDa interacting protein 3-like LOC650531: olfactory transcription factor 1; transcription factor COE2 similar to Matrix metallopeptidase 12 B55A, FLJ26613, MGC52248, PR52A, PR55A; PP2A, subunit B, B-alpha isoform; PP2A, subunit B, B55-alpha isoform; PP2A, subunit B, PR55-alpha isoform; PP2A, subunit B, R2-alpha isoform; Serine/threonine protein phosphatase 2A, 55 kDa regulatory subunit B, alpha isoform; alpha isoform of regulatory subunit B55, protein phosphatase 2; protein phosphatase 2 (formerly 2A), regulatory subunit B (PR 52), alpha isoform 100129404 5520 pseudogene 604941 157489 BNIP3a, NIX;
BCL2/adenovirus E1B 19-kd protein-interacting protein 3a; BCL2/adenovirus E1B 19kD-interacting protein 3-like; adenovirus E1B19k-binding protein B5 similar to DnaJ 665 650531 Protein coding The product of this gene belongs to the phosphatase 2 regulatory subunit B family. Protein phosphatase 2 is one of the four major Ser/Thr phosphatases, and it is implicated in the negative control of cell growth and division. It consists of a common heteromeric core enzyme, which is composed of a catalytic subunit and a constant regulatory subunit, that associates with a variety of regulatory subunits. The B regulatory subunit might modulate substrate selectivity and catalytic activity. This gene encodes an alpha isoform of the regulatory subunit B55 subfamily. The protein phosphatase 2A regulatory subunit alpha4 has a novel role in the regulation of cell spreading and migration. Protein phosphatase 2A and separase form a complex regulated by separase autocleavage. Protein binding Protein phosphatase type 2A activity Protein phosphatase type 2A regulator activity Protein amino acid dephosphorylation Signal transduction Protein phosphatase type 2A complex This gene is a member of the BCL2/adenovirus E1B 19 kd-interacting protein (BNIP) family. It interacts with the E1B 19 kDa protein which is responsible for the protection of virally-induced cell death, as well as E1B 19 kDa-like sequences of BCL2, also an apoptotic protector. The protein encoded by this gene is a functional homolog of BNIP3, a proapoptotic protein. This protein may function simultaneously with BNIP3 and may play a role in tumor suppression.
lamin binding protein heterodimerization activity apoptosis defense response to virus induction of apoptosis negative regulation of apoptosis negative regulation of survival gene product activity positive regulation of apoptosis endoplasmic reticulum integral to membrane membrane mitochondrial envelope mitochondrion nuclear envelope nucleus pseudogene 605368 Protein coding pseudogene 40 Supplementary Information TABLE S1: GENES ON CHROMOSOME 8p: LOCALIZATION AND DESCRIPTION 26362013 similar to DnaJ (Hsp40) homolog, subfamily B, member 6 isoform b LOC100130267: hypothetical protein LOC100130267 PNMA2: paraneoplastic antigen MA2 331 8p21.2 2636068226362687 332 8p21.2 2641811326427400 333 8p21.2 2646100226461965 334 8p21.2 2649133826571610 335 8p21.2 2666158426778839 ADRA1A: adrenergic, alpha1A-, receptor 336 8p21.2 2677858626799753 337 8p21.2 2692446626925895 338 8p21.2 27149731- LOC10027897: hypothetical protein LOC100127897 LOC100132229: similar to hCG2040398 STMN4: LOC338097: proteasome activator subunit 2 pseudogene DPYSL2: dihydropyrimidin ase-like 2 1001302 67 Protein coding KIAA0883, MA2, MM2, RGAG2, onconeuronal antigen MA2; paraneoplastic neuronal antigen; retrotransposon gag domain containing 2 PSME2P5 10687 CRMP2, DHPRP2, DRP-2, DRP2; collapsin response mediator protein hCRMP-2 1808 602463 Protein coding ADRA1C, ADRA1L1, ALPHA1AAR; G protein coupled receptor; adrenergic, alpha -1A-, receptor; adrenergic, alpha1C-, receptor; alpha-1Aadrenergic receptor 148 104221 Protein coding MGC111012, 603970 338097 Protein coding A serologic marker of paraneoplastic limbic and brain-stem encephalitis in patients with testicular cancer. Ma1, a novel neuron- and testis-specific protein, is recognized by the serum of patients with paraneoplastic neurological disorders protein bindimg nucleus Collapsin response mediator protein-2 transcriptional activity is inhibited by all-trans-retinoic acid during SH-SY5Y neuroblastoma cell differentiation. 
Aberrant expression of dihydropyrimidinase related proteins-2,-3 and -4 in fetal Down syndrome brain. Significant decrease of crmp-2 protein may represent or underlie impaired neuronal plasticity, neurodegeneration, wiring of the brain in mesial temporal lobe epilepsy. CRMP-2 transports the Sra-1/WAVE1 complex to axons in a kinesin-1-dependent manner and thereby regulates axon outgrowth and formation. A significant association was found between a single nucleotide polymorphism of the DRP-2 gene and schizophrenia in a North American sample. Alpha-1-adrenergic receptors (alpha-1-ARs) are members of the G protein-coupled receptor superfamily. They activate mitogenic responses and regulate growth and proliferation of many cells. There are 3 alpha-1-AR subtypes: alpha-1A, -1B and -1D, all of which signal through the Gq/11 family of G-proteins, and different subtypes show different patterns of activation. This gene encodes alpha-1A-adrenergic receptor. Alternative splicing of this gene generates four transcript variants, which encode four different isoforms with distinct C-termini but having similar ligand binding properties. Candidate gene for benign prostatic hyperplasia.
dihydropyrimidinase activity hydrolase activity, acting on carbonnitrogen (but not peptide) bonds protein bindimg cell differentiation multicellular organismal development nervous system development nucleobase, nucleoside, nucleotide and nucleic acid metabolic process signal transduction axon cell soma cytoplasm dendrite mitochondrion adrenoceptor activity alpha1-adrenergic receptor activity alpha1-adrenergic receptor activity receptor activity rhodopsin-like receptor activity G-protein coupled receptor protein signaling pathway apoptosis cell-cell signaling negative regulation of cell proliferation protein kinase cascade signal transduction smooth muscle contraction integral to plasma membrane plasma membrane pseudogene 1002789 7 Protein coding 1001322 29 Protein coding 81551 Protein intracellular signaling 41 Supplementary Information TABLE S1: GENES ON CHROMOSOME 8p: LOCALIZATION AND DESCRIPTION 27171820 stathmin-like 4 339 8p21.2 2719832127224751 TRIM35: tripartite motifcontaining 35 340 8p21.2 2722491627372820 PTK2B: PTK2B protein tyrosine kinase 2 beta 341 8p21.2 2737319527392730 CHRNA2 : cholinergic receptor, nicotinic, alpha 2 (neuronal) 342 8p21.2 2740456227458403 EPHX2: epoxide hydrolase 2, cytoplasmic RB3; stathminlike-protein RB3 HLS5, KIAA1098, MAIR, MGC17233 CADTK, CAKB, FADK2, FAK2, FRNK, PKB, PTK, PYK2, RAFTK; CAK beta; calciumdependent tyrosine kinase; cell adhesion kinase beta; focal adhesion kinase 2; proline-rich tyrosine kinase 2; protein kinase B; protein tyrosine kinase 2 beta; related adhesion focal tyrosine kinase coding 23087 Protein coding 2185 601212 Protein coding cholinergic receptor, nicotinic, alpha polypeptide 2 (neuronal) 1135 118502 Protein coding CEH, SEH 2053 132811 Protein coding cascade The protein encoded by this gene is a member of the tripartite motif (TRIM) family. The TRIM motif includes three zinc-binding domains, a RING, a Bbox type 1 and a B-box type 2, and a coiled-coil region. 
The function of this protein has not been identified.

This gene encodes a cytoplasmic protein tyrosine kinase which is involved in calcium-induced regulation of ion channels and activation of the MAP kinase signaling pathway. The encoded protein may represent an important signaling intermediate between neuropeptide-activated receptors or neurotransmitters that increase calcium flux and the downstream signals that regulate neuronal activity. The encoded protein undergoes rapid tyrosine phosphorylation and activation in response to increases in the intracellular calcium concentration, nicotinic acetylcholine receptor activation, membrane depolarization, or protein kinase C activation. This protein has been shown to bind CRK-associated substrate, nephrocystin, GTPase regulator associated with FAK, and the SH2 domain of GRB2. The encoded protein is a member of the FAK subfamily of protein tyrosine kinases but lacks significant sequence similarity to kinases from other subfamilies. Four transcript variants encoding two different isoforms have been found for this gene. These data demonstrate that LPXN forms a signaling complex with Pyk2, c-Src, and PTP-PEST to regulate migration of prostate cancer cells. Critical role of the carboxyl terminus of proline-rich tyrosine kinase (Pyk2) in the activation of human neutrophils by tumor necrosis factor: separation of signals for the respiratory burst and degranulation. Tyrosine phosphorylation of PYK2 mediates heregulin-induced glioma invasion.

Mutations in the nAChRs can cause autosomal dominant nocturnal frontal lobe epilepsy. A new CHRNA2 mutation markedly increases the receptor sensitivity to acetylcholine, indicating that the nicotinic alpha 2 subunit alteration is the underlying cause. A novel permutation testing method implicates sixteen nicotinic acetylcholine receptor genes as risk factors for smoking in schizophrenia families.

This gene encodes a member of the epoxide hydrolase family.
The protein, found in both the cytosol and peroxisomes, binds to specific epoxides and converts them to the corresponding dihydrodiols. Mutations in this gene have been associated with familial hypercholesterolemia. Alternate transcriptional splice variants of this gene have been observed but have not been thoroughly characterized. metal ion binding protein binding zinc ion binding ATP binding non-membrane spanning protein tyrosine kinase activity nucleotide binding protein binding signal transducer activity transferase activity acetylcholine receptor activity extracellular ligand-gated ion channel activity ion channel activity nicotinic acetylcholine-activated cation-selective channel activity 4-nitrophenylphosphatase activity epoxide hydrolase activity hydrolase activity magnesium ion binding protein homodimerization activity apoptosis induction of apoptosis negative regulation of cell proliferation apoptosis cell adhesion positive regulation of cell proliferation protein amino acid phosphorylation protein complex assembly response to stress signal complex assembly signal transduction ion transport signal transduction aromatic compound catabolic process cellular calcium ion homeostasis drug metabolic process inflammatory response linoleic acid metabolic process metabolic process oxygen and reactive oxygen species metabolic process positive regulation of blood pressure positive regulation of cytoplasm intracellular nucleus cytoplasm cytoskeleton focal adhesion plasma membrane cell NAS junction PubMed integral to membrane nicotinic acetylcholine-gated receptor-channel complex plasma membrane postsynaptic membrane synapse Cytoplasm cytosol peroxisome soluble fraction IEA vasodilation prostaglandin production during acute inflammatory response 343 8p21.1 27491045-27502507 GULOP: gulonolactone (L) oxidase pseudogene CLU: clusterin 344 8p21.1 27510368-27528244 345 8p21.1 
27547496-27590205 SCARA3: scavenger receptor class A, member 3 346 8p21.1 27572314-27584362 347 8p21.1 27646752-27686089 348 8p21.1 27723057- LOC100129849: hypothetical protein LOC100129849 CCDC25: coiled-coil domain containing 25 PBK: PDZ GULO; SCURVY 2989 pseudogene AAG4, APOJ, CLI, KUB1, MGC24903, SGP2, SGP2, SP-40, TRPM-2, TRPM2; aging-associated protein 4; apolipoprotein J; clusterin (complement lysis inhibitor, SP40,40, sulfated glycoprotein 2, testosterone-repressed prostate message 2, apolipoprotein J); complement lysis inhibitor; complement-associated protein SP-40; sulfated glycoprotein 2; testosterone-repressed prostate message 2 APC7, CSR, CSR1, MSLR1, MSRL1; anaphase promoting complex subunit 7; cellular stress response protein; macrophage scavenger receptor-like 1 1191 185430 Protein coding May be a biomarker for longer survival in patients with surgically resected non-small cell lung cancer. CLU gene expression might play a crucial role in prostate tumorigenesis by exerting differential biological effects on normal versus tumor cells through differential processing of CLU isoforms in the two cell systems. Expression increases early after androgen withdrawal in prostate cancer; protects tumor cells from apoptosis induced by medical castration. Clusterin is strongly expressed in melanoma. Downregulation of clusterin reduces drug-resistance, i.e., reduces melanoma cell survival in response to cytotoxic drugs. Reducing clusterin may be a novel tool to overcome drug-resistance in melanoma. protein binding anti-apoptosis apoptosis cell death complement activation, classical pathway endocrine pancreas development innate immune response lipid metabolic process neurite morphogenesis positive regulation of cell differentiation positive regulation of cell proliferation response to oxidative stress aggresome extracellular region extracellular space perinuclear region of cytoplasm 51435 602728 Protein coding This gene encodes a macrophage scavenger receptor-like protein. 
This protein has been shown to deplete reactive oxygen species, and thus play an important role in protecting cells from oxidative stress. The expression of this gene is induced by oxidative stress. Alternatively spliced transcript variants encoding distinct isoforms have been described. Down-regulation of CSR1 protein expression by promoter methylation is associated with tumor growth and metastasis of prostate cancer. scavenger receptor activity UV protection phosphate transport response to oxidative stress Golgi apparatus Golgi membrane cytoplasm endoplasmic reticulum endoplasmic reticulum membrane integral to membrane membrane This gene encodes a serine/threonine kinase related to the dual specific ATP binding Mitosis cellular_component 1001298 49 Protein coding FLJ10853 55246 Protein coding FLJ14385, Nori- 55872 611210 Protein 27751268 349 8p21.1 27751952-27753854 binding kinase LOC100130612: hypothetical LOC100130612 350 8p21.1 27687977-27718344 ESCO2: establishment of cohesion 1 homolog 2 351 8p21.1 27783655-27906117 352 8p21.1 27935547-27997307 SCARA5: scavenger receptor class A, member 5 (putative) C8orf80: chromosome 8 open reading frame 80 353 28006645-28104585 ELP3: elongation protein 3 homolog (S. cerevisiae) 3, SPK, TOPK; MAPKK-like protein kinase; PDZ-binding kinase; T-LAK cell-originated protein kinase; serine/threonine protein kinase; spermatogenesis-related protein kinase similar to TAF9 RNA polymerase II, TATA box binding protein (TBP)-associated factor 2410004I17Rik, EFO2, RBS; establishment of cohesion 1 homolog 2 FLJ23907, MGC45780, Tesr; testis expressed scavenger receptor FLJ26413, HMFN0672; hypothetical protein LOC389643 FLJ10422, KAT9; elongation protein 3 homolog coding 1001306 12 mitogen-activated protein kinase kinase (MAPKK) family. Evidence suggests that mitotic phosphorylation is required for its catalytic activity. 
This mitotic kinase may be involved in the activation of lymphoid cells and support testicular functions, with a suggested role in the process of spermatogenesis. PBK/TOPK is upregulated in Burkitt's lymphoma and other highly proliferative malignant cells and during normal fetal development. PBK augments tumor cell growth following transient appearance in different types of progenitor cells in vivo as reported. nucleotide binding protein binding protein serine/threonine kinase activity transferase activity protein amino acid phosphorylation This gene encodes a protein that may have acetyltransferase activity and may be required for the establishment of sister chromatid cohesion during the S phase of mitosis. Mutations in this gene have been associated with Roberts syndrome. acyltransferase activity metal ion binding transferase activity zinc ion binding DNA repair cell cycle nucleus scavenger receptor activity phosphate transport cytoplasm integral to membrane plasma membrane contributes_to DNA binding N-acetyltransferase activity contributes_to RNA polymerase II transcription elongation factor activity acyltransferase activity histone acetyltransferase activity iron ion binding iron-sulfur cluster binding metal ion binding phosphorylase kinase regulator activity protein binding transferase activity metabolic process regulation of transcription from RNA polymerase II promoter transcription DNA-directed RNA polymerase II, holoenzyme cytoplasm nucleolus nucleus transcription elongation factor complex pseudogene 157570 609353 Protein coding 286133 611306 Protein coding 389643 Protein coding 55140 Protein coding Reduction of hELP3 mRNA and protein caused a suppression of HSP70-2 and histone H3 hypoacetylation. 
354 28138816-28141043 LOC100131127: hypothetical LOC100131127 355 8p21.1 28151866-28152442 356 8p21.1 28163499-28252925 357 8p21.1 282305688256787 LOC100130891: hypothetical LOC100130891 LOC100129848: hypothetical protein LOC100129848 PNOC: prepronociceptin 358 8p21.1 282590218299896 ZNF395: zinc finger protein 395 359 8p21.1 28341848-28403703 FBXO16: F-box protein 16 360 8p21.1 28407692-28477901 361 8p21.1 28615072-28667121 similar to Degenerative spermatocyte homolog 1, lipid desaturase similar to enigma protein hypothetical protein LOC100129848 1001311 27 pseudogene 1001308 91 pseudogene Protein coding 1001298 48 PPNOC; propronociceptin 5368 601459 Protein coding Nociceptin/NociR is present and functional in human neutrophils, and the results identify a novel dialogue pathway between neural and immune tissues. Peripheral blood levels are elevated in Wilson disease. neuropeptide hormone activity opioid peptide activity DKFZp434K121, HDBP2, PBF, PRF-1, PRF1, Si1-8-14; Huntington's disease gene regulatory region-binding protein 2; papillomavirus regulatory factor PRF-1 FBX16, MGC125923, MGC125924, MGC125925; F-box only protein 16 55893 609494 Protein coding PBF binds to SAP30 and represses transcription via recruitment of the HDAC1 co-repressor complex. PBF is a new cellular factor mediating the effects of PI3K/Akt signaling and 14-3-3 on cell growth. HDBP1 and HDBP2 are novel transcription factors shuttling between nucleus and cytoplasm and bind to the specific GCCGGCG, which is an essential cis-element for HD gene expression in neuronal cells. DNA binding metal ion binding zinc ion binding 157574 608519 Protein coding FZD3: frizzled homolog 3 (Drosophila) Fz-3, hFz3; frizzled 3 7976 606143 Protein coding This gene encodes a member of the F-box protein family, members of which are characterized by an approximately 40 amino acid motif, the F-box. 
The F-box proteins constitute one of the four subunits of ubiquitin protein ligase complex called SCFs (SKP1-cullin-F-box), which function in phosphorylation-dependent ubiquitination. The F-box proteins are divided into three classes: Fbws containing WD-40 domains, Fbls containing leucine-rich repeats, and Fbxs containing either different protein-protein interaction modules or no recognizable motifs. The protein encoded by this gene belongs to the Fbx class.

This gene is a member of the frizzled gene family. Members of this family encode seven-transmembrane domain proteins that are receptors for the Wingless type MMTV integration site family of signaling proteins. Most frizzled receptors are coupled to the beta-catenin canonical signaling pathway. The function of this protein is unknown, although it may play a role in mammalian hair follicle development. Results suggested that the FZD3 gene might be involved in the predisposition to schizophrenia.

EXTL3: exostoses (multiple)-like 3 DKFZp686C2342, EXTR1, KIAA0519, REG, REGR, RPR, 2137 605744 Protein coding Increased expression of Reg genes, specifically Reg IV, contributes to adenoma formation and leads to increased resistance to apoptotic cell death in colorectal cancer. 
G-protein coupled receptor activity Wnt receptor activity non-G-protein coupled 7TM receptor activity protein binding receptor activity glucuronyl-galactosylproteoglycan 4-alpha-N-acetylglucosaminyltransferase activity neuropeptide signaling pathway sensory perception signal transduction synaptic transmission regulation of transcription, DNA-dependent transcription extracellular region G-protein coupled receptor protein signaling pathway Wnt receptor signaling pathway cell proliferation establishment of planar polarity inner ear morphogenesis multicellular organismal development neural tube closure apical part of cell integral to plasma membrane membrane cytoplasm intracellular nucleus endoplasmic reticulum endoplasmic reticulum membrane integral to membrane botv; Reg receptor 362 8p21.1 28671105-28673700 LOC100130764: hypothetical protein LOC100130764 INTS9: integrator complex subunit 9 363 8p21.1 28681099-28803398 364 8p21.1 28804144-28965351 HMBOX1: homeobox containing 1 365 8p21.1 28980715-29176560 KIF13B: kinesin family member 13B 366 8p12 29249530-29264104 DUSP4: dual specificity phosphatase 4 367 8p12 29265855-29266600 368 8p12 29545526-29546104 369 8p12 30005971-30006160 LOC100132051: hypothetical protein LOC100132051 LOC646909: similar to 60S ribosomal protein L17 (L23) MAP2K1P1: mitogen-activated protein kinase kinase 1 manganese ion binding metal ion binding transferase activity, transferring glycosyl groups hypothetical protein LOC100130764 1001307 64 CPSF2L, FLJ10871, RC74, RC74; related to CPSF subunits 74 kDa FLJ21616, HNF1LA, PBHNF; homeobox-containing protein PBHNF GAKIN, KIAA0639: guanylate kinase associated kinesin; kinesin 13B HVH2, MKP-2, MKP2, TYP; MAP kinase phosphatase 2; VH1 homologous phosphatase 2; serine/threonine specific protein phosphatase 55756 membrane Protein coding 611352 79618 Protein coding contributes_to protein binding snRNA processing 
integrator complex nucleus Protein coding Hmbox1 is widely expressed in pancreas and the expression of this gene can also be detected in pallium, hippocampus and hypothalamus. sequence-specific DNA binding transcription factor activity regulation of transcription, DNA-dependent nucleus

Binding of the SH3-I3-GUK module of hDlg to GAKIN activates the microtubule-stimulated ATPase activity of GAKIN by approximately 10-fold. We propose: the cargo-mediated regulation of motor activity is a general paradigm for the activation of kinesins. Results suggest that, in neurons, the GAKIN-PIP3BP complex transports PIP3 to the neurite ends and regulates neuronal polarity formation.

The protein encoded by this gene is a member of the dual specificity protein phosphatase subfamily. These phosphatases inactivate their target kinases by dephosphorylating both the phosphoserine/threonine and phosphotyrosine residues. They negatively regulate members of the mitogen-activated protein (MAP) kinase superfamily (MAPK/ERK, SAPK/JNK, p38), which are associated with cellular proliferation and differentiation. Different members of the family of dual specificity phosphatases show distinct substrate specificities for various MAP kinases, different tissue distribution and subcellular localization, and different modes of inducibility of their expression by extracellular stimuli. This gene product inactivates ERK1, ERK2 and JNK, is expressed in a variety of tissues, and is localized in the nucleus. Two alternatively spliced transcript variants, encoding distinct isoforms, have been observed for this gene. In addition, multiple polyadenylation sites have been reported. 
ATP binding microtubule motor activity nucleotide binding protein binding protein kinase binding MAP kinase tyrosine/serine/threonine phosphatase activity hydrolase activity protein tyrosine phosphatase activity protein tyrosine/threonine phosphatase activity T cell activation microtubule-based movement protein targeting signal transduction cytoplasm cytoskeleton microtubule microtubule associated complex MAPKKK cascade protein amino acid dephosphorylation nucleus 23303 607350 Protein coding 1846 602747 Protein coding 1001320 51 Protein coding 646909 pseudogene 29778 pseudogene pseudogene 1 TMEM66: transmembrane protein 66 LEPROTL: leptin receptor overlapping transcript-like 1 LOC648729: similar to ribosomal protein S15a MBOAT4: membrane bound O-acyltransferase domain containing 4 370 8p12 30040173-30060191 371 8p12 30072487-30085122 372 8p12 30094396-30094847 373 8p12 30108729-30110011 374 8p12 30133355-30160602 DCTN6: dynactin 6 375 8p12 30214338-30218055 376 8p12 30226560-30254863 377 8p12 30308375-30330600 378 8p12 30309019-30309335 379 8p12 30315133-30336791 380 8p12 30360188-30367557 381 8p12 30361486-30549276 LOC392209: similar to heat shock protein 8 LOC642319: similar to transgelin 2 LOC92755: hypothetical gene LOC92755 LOC100128441: hypothetical LOC100128441 LOC100131210: hypothetical protein LOC100131210 LOC100128750: hypothetical protein LOC100128750 RBPMS: RNA binding protein with multiple splicing FLJ22274, FOAP7, HSPC035, MGC8721, XTP3 HSPC112, Vps55, my047 51669 23484 607338 Protein coding Integral to membrane membrane Protein coding Integral to membrane membrane 648729 pseudogene FKSG89, OACT4; O-acyltransferase (membrane bound) domain containing 4 S-3; novel RGD-containing protein 619373 Protein coding 10671 Protein coding similar to heat shock cognate protein 70 similar to KIAA0120 392209 pseudogene 642319 pseudogene 92755 unknown 1001284 41 pseudogene similar to Gtf2a2 
protein 1001287 50 11030 Integral to membrane membrane The protein encoded by this gene contains an RGD (Arg-Gly-Asp) motif in the N-terminal region, which confers adhesive properties to macromolecular proteins like fibronectin. It shares a high degree of sequence similarity with the mouse homolog, which has been suggested to play a role in mitochondrial biogenesis. The exact biological function of this gene is not known. acyltransferase activity dynein binding transferase activity cytoplasm cytoskeleton dynactin complex This gene encodes a member of the RRM family of RNA-binding proteins. The RRM domain is between 80-100 amino acids in length and family members contain one to four copies of the domain. The RRM domain consists of two short stretches of conserved sequence called RNP1 and RNP2, as well as a few highly conserved hydrophobic residues. The protein encoded by this gene has a single, putative RRM domain in its N-terminus. Alternative splicing results in multiple transcript variants encoding different RNA binding Nucleotide binding Protein binding Protein coding 1001312 10 HERMES; RNA-binding protein with multiple splicing acyltransferase activity transferase activity unknown 601558 Protein coding RNA processing isoforms. 
382 8p12 30545233-30547826 LOC100129846: hypothetical protein LOC100129846 GTF2E2: general transcription factor IIE, polypeptide 2, beta 34kDa 383 8p12 30555573-30635274 384 8p12 30655977-30704985 GSR: glutathione reductase 385 8p12 30721232-30744064 UBXD6: UBX domain containing 6 1001298 46 Protein coding E, TF2E2, TFIIEB; general transcription factor IIE, polypeptide 2 (beta subunit, 34kD) MGC78522 2961 189964 Protein coding Introduced point mutations into two regions located near the carboxy terminus of TFIIE beta and identified the functionally essential amino acid residues that bind to RNA polymerase II. general RNA polymerase II transcription factor activity protein binding regulation of transcription, DNA-dependent transcription transcription initiation from RNA polymerase II promoter nucleus transcription factor TFIIE complex 2936 138300 Protein coding Malignant lung tumors (squamous cell carcinoma and adenocarcinoma) had increased activity of this enzyme. Decreased activity of erythrocyte glutathione reductase is associated with cerebral palsy. FAD binding NADP binding glutathione-disulfide reductase activity cytoplasm mitochondrion D8S2298E, REP8, Reproduction/chromosome 8; reproduction 8 PP2CB, PP2Abeta; protein phosphatase 2, catalytic subunit, beta isoform; protein phosphatase type 2A catalytic subunit; serine/threonine protein phosphatase 2A, catalytic subunit, beta isoform DKFZP434M241 5 7993 602155 Protein coding cell redox homeostasis electron transport glutathione metabolic process single fertilization 5516 176916 Protein coding This gene encodes the phosphatase 2A catalytic subunit. Protein phosphatase 2A is one of the four major Ser/Thr phosphatases, and it is implicated in the negative control of cell growth and division. It consists of a common heteromeric core enzyme, which is composed of a catalytic subunit and a constant regulatory subunit, that associates with a variety of regulatory subunits. 
This gene encodes a beta isoform of the catalytic subunit. Two transcript variants encoding the same protein have been identified for this gene. hydrolase activity iron ion binding manganese ion binding metal ion binding phosphoprotein phosphatase activity protein binding protein amino acid dephosphorylation cytoplasm protein phosphatase type 2A complex 56154 605795 Protein coding

The exact function of this gene is not known; however, its encoded product is highly similar to purine-rich element binding protein A. The latter is a DNA-binding protein which binds preferentially to the single strand of the purine-rich element termed PUR, and has been implicated in the control of both DNA replication and transcription. This gene lies in close proximity to the Werner syndrome gene, but on the opposite strand, on chromosome 8p11. Two transcript variants encoding different isoforms have been found for this gene.

This gene encodes a member of the RecQ subfamily and the DEAH (Asp-Glu-Ala-His) subfamily of DNA and RNA helicases. 
DNA helicases are involved in many aspects of DNA metabolism, including transcription, DNA binding 386 8p12 30762668-30789894 PPP2CB: protein phosphatase 2 (formerly 2A), catalytic subunit, beta isoform 387 8p12 30808602-30826075 TEX15: testis expressed 15 388 8p12 30960514-30972106 similar to KIAA0205 1001293 98 pseudogene 389 8p12 30972863-31010773 LOC100129398: hypothetical LOC100129398 PURG: purine-rich element binding protein G MGC119274, PURG-A, PURGB 29942 Protein coding 390 8p12 31010320-31150819 WRN: Werner syndrome KFZp686C2056, RECQ3, RECQL2, 7486 Protein coding 3'-5' exonuclease activity ATP binding nucleus DNA recombination DNA replication multicellular organismal intracellular nucleolus nucleoplasm RECQL3 391 8p12 31195240-31198134 392 8p12 31617043-31618101 LOC642513: similar to potassium channel tetramerisation domain containing NRG1: neuregulin 1 32525295-32741615 393 8p12 32148252-32151311 394 8p12 32744261-32745017 395 8p12 33347884-33450206 LOC100127894: hypothetical protein LOC100127894 LOC100129710: similar to MSTP131 FUT10: fucosyltransferase 642513 RIA, GGF, GGF2, HGL, HRG, HRG1, HRGA, NDF, SMDF; glial growth factor; heregulin, alpha (45kD, ERBB2 p185-activator); neu differentiation factor; sensory and motor neuron derived factor 3084 1001278 94 MSTP131 MGC11141; alpha 1,3-fucosyl 1001297 10 84750 replication, recombination, and repair. This protein contains a nuclear localization signal in the C-terminus and shows a predominant nucleolar localization. It possesses an intrinsic 3' to 5' DNA helicase activity, and is also a 3' to 5' exonuclease. Based on interactions between this protein and Ku70/80 heterodimer in DNA end processing, this protein may be involved in the repair of double strand DNA breaks. Defects in this gene are the cause of Werner syndrome, an autosomal recessive disorder characterized by premature aging. 
Dual role for WRN in tumorigenesis; tumor suppressor-like activity in tumors with WRN inactivation and the promotion of proliferation and survival in tumors that express WRN. WRN missense mutations or polymorphisms could promote genetic instability and cancer in the general population by selectively interfering with recombination in somatic cells. ATP-dependent helicase activity DNA binding DNA helicase activity hydrolase activity nucleotide binding protein binding aging nucleobase, nucleoside, nucleotide and nucleic acid metabolic process regulation of apoptosis regulation of growth rate replicative cell aging telomere maintenance nucleus

Neuregulin 1 (NRG1) was originally identified as a 44-kD glycoprotein that interacts with the NEU/ERBB2 receptor tyrosine kinase to increase its phosphorylation on tyrosine residues. It is known that an extraordinary variety of different isoforms are produced from the NRG1 gene by alternative splicing. These isoforms include heregulins (HRGs), glial growth factors (GGFs) and sensory and motor neuron-derived factor (SMDF). They are tissue-specifically expressed and differ significantly in their structure. The HRG isoforms all contain immunoglobulin (Ig) and epidermal growth factor-like (EGF-like) domains. GGF and GGF2 isoforms contain a kringle-like sequence plus Ig and EGF-like domains; and the SMDF isoform shares only the EGF-like domain with other isoforms. The receptors for all NRG1 isoforms are the ERBB family of tyrosine kinase transmembrane receptors. Through interaction with ERBB receptors, NRG1 isoforms induce the growth and differentiation of epithelial, neuronal, glial, and other types of cells. NRG1 was expressed in 80% of breast cancers studied. Study provides additional suggestive evidence for both the linkage and association of schizophrenia with NRG1. The molecular mechanism of the association between NRG1 risk alleles and schizophrenia may include down-regulation of nAChR alpha7 expression. 
Results indicate that GGF2 is neurotrophic and neuroprotective for developing dopaminergic neurons and suggest a role for NRGs in repair of the damaged nigrostriatal system that occurs in Parkinson's disease. ErbB-3 class receptor binding growth factor activity growth factor activity molecular_function receptor tyrosine kinase binding transcription cofactor activity transmembrane receptor protein tyrosine kinase activator activity cell differentiation cellular protein complex disassembly embryonic development glucose transport myelination negative regulation of transcription nervous system development regulation of blood pressure extracellular region integral to membrane membrane nucleus pseudogene 142445 Protein coding Protein coding molecular_function biological_process protein transport cellular_component integral IEA to membrane Protein coding Protein coding amino acid sequence alpha(1,3)fucosyltransferase L-fucose catabolic process embryonic development Golgi apparatus integral to membrane 396 8p12 33413325-33413760 397 8p12 33462259-33478318 398 8p12 33475775-33490245 10 (alpha (1,3) fucosyltransferase) transferase; fucosyltransferase 10 LOC100132576: hypothetical LOC100132576 RBM13: RNA binding motif protein 13 C8orf41: chromosome 8 open reading frame 41 SNORD13: small nucleolar RNA, C/D box 13 RNF122: ring finger protein 122 similar to Neosin 1001325 76 pseudogene MAK16; MAK16L 84549 Protein coding 399 8p12 33490535-33490638 400 8p12 33524815-33544185 401 8p12 33568393-33576981 DUSP26: dual specificity phosphatase 26 (putative) 402 8p12 33699346-33700138 VENTXP5: VENT homeobox (Xenopus laevis) pseudogene 5 403 8p12 33836440-33837283 LOC388460: hypothetical FLJ23263, hypothetical protein LOC80185 U13 activity transferase activity, transferring glycosyl groups 80185 membrane Protein coding 692084 snoRNA FLJ12526, MGC126622 79845 Protein coding The protein encoded by this gene 
contains a RING finger, a motif present in a variety of functionally distinct proteins and known to be involved in protein-protein and protein-DNA interactions. metal ion binding protein binding zinc ion binding DUSP24, LDP-4, MGC1136, MGC2627, MKP8, NATA1, SKRP3: Novel amplified gene in thyroid anaplastic cancer; dual specificity phosphatase 26; dual-specificity phosphatase SKRP3; low-molecular-mass dual-specificity phosphatase 4; mitogen-activated protein kinase phosphatase 8 78986 Protein coding DUSP26 effectively dephosphorylates p38 and has little effect on extracellular signal-regulated kinase in anaplastic thyroid cancer. hydrolase activity protein tyrosine phosphatase activity protein tyrosine/serine/threonine phosphatase activity 442384 pseudogene Homeobox genes encode DNA-binding proteins, many of which are thought to be involved in early embryonic development. Homeobox genes encode a DNA-binding domain of 60 to 63 amino acids referred to as the homeodomain. This pseudogene is a member of the Vent homeobox gene family. 388460 pseudogene similar to SHUJUN-2 fertilization hemopoiesis nervous system development protein amino acid glycosylation protein folding protein targeting wound healing protein amino acid dephosphorylation Golgi apparatus endoplasmic reticulum integral to membrane membrane Golgi apparatus cytoplasm nucleus LOC388460 CYCSP3: cytochrome c, somatic pseudogene 3 LOC137107: similar to ribosomal protein L10a LOC100133273: similar to hCG2016451 LSM12P: LSM12 homolog (S. cerevisiae) pseudogene UNC5D: unc-5 homolog D (C. 
elegans) 404 8p12 33946621-33946938 405 8p12 34300055-34300702 406 8p12 34851176-34851771 407 8p12 408 8p12 35521452-35771722 409 8p12 36761000-36912801 410 8p12 36771939-36772643 411 8p12 412 8p12 36865155-36865986 37060455-37073080 413 8p12 37452371-37537046 414 8p12 37672459-37675554 KCNU1: potassium channel, subfamily U, member 1 MRPS7P1: mitochondrial ribosomal protein S7 pseudogene 1 FKSG2: apoptosis inhibitor LOC642879: similar to SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily e, member 1 LOC100128034: hypothetical protein LOC100128034 ZNF703: zinc finger protein 703 415 8p12 37713307-37734476 ERLIN2: ER lipid raft associated 2 35500659-35502834 HS7; HCP21 349198 pseudogene 137107 Protein coding 1001332 73 pseudogene 653122 pseudogene 137970 Protein coding 157855 Protein coding 359783 pseudogene 59347 642879 Protein coding pseudogene 1001280 34 Protein coding FLJ14299, ZNF503L 80139 Protein coding C8orf2, Erlin-2, MGC87072, SPFH2: SPFH 11160 FLJ16019, KIAA1777, PRO34692, Unc5h4: netrin receptor Unc5h4 KCNMC1, KCa5.1, Kcnma3, Slo3 611605 Protein coding The human genome has 49 cytochrome c pseudogenes, including a relic of a primordial gene that still functions in mouse. The human somatic cytochrome c gene: two classes of processed pseudogenes demarcate a period of rapid molecular evolution. protein binding receptor activity SPFH2 as a key endoplasmic reticulum associated degradation pathway component and suggest that it may act as a substrate recognition factor. 
Erlin-1 and erlin-2 are novel members of the prohibitin family of proteins metal ion binding nucleic acid binding zinc ion binding molecular_function Apoptosis multicellular organismal development signal transduction integral to membrane membrane apoptosis cytoplasm regulation of transcription, DNA-dependent transcription biological_process Intracellular nucleus endoplasmic reticulum endoplasmic reticulum membrane 416 37724011-37724679 LOC728024: hypothetical protein LOC100128034 PROSC: proline synthetase cotranscribed homolog (bacterial) 417 8p12 37738833-37756444 418 8p12 37773932-37820650 GPR124: G protein-coupled receptor 124 419 8p12 37820558-37826569 BRF2: BRF2, subunit of RNA polymerase III transcription initiation factor, BRF1-like 420 8p12 37835628-37876161 RAB11FIP1: RAB11 family interacting protein 1 (class I) 421 8p12 37910957-37916804 GOT1L1: glutamic-oxaloacetic transaminase 1-like 1 domain family, member 2 hCG1640171 FLJ11861: proline synthetase co-transcribed (bacterial homolog); proline synthetase cotranscribed homolog DKFZp434C211, DKFZp434J0911, FLJ14390, KIAA1531, TEM5: tumor endothelial marker 5 BRFU, FLJ11052, TFIIIB50: RNA polymerase III transcription initiation factor BRF2; RNA polymerase III transcription initiation factor BRFU; transcription factor IIB-related factor, TFIIIB50 DKFZp686E2214, FLJ22524, FLJ22622, MGC78448, NOEL1A, RCP, rab11-FIP1: RAB11 coupling protein; RAB11 family interacting protein 1; Rab effector protein; Rab-interacting recycling protein MGC33309 that define lipid-raft-like domains of the ER. 
integral to membrane membrane LOC728024 Protein coding 11212 604436 Protein coding 25960 606823 Protein coding Proteolytically processed soluble tumor endothelial marker TEM5 mediates endothelial cell survival during angiogenesis by linking integrin alpha(v)beta3 to glycosaminoglycans G-protein coupled receptor activity protein binding receptor activity neuropeptide signaling pathway signal transduction integral to membrane membrane plasma membrane 55290 607013 Protein coding This gene encodes one of the multiple subunits of the RNA polymerase III transcription factor complex required for transcription of genes with promoter elements upstream of the initiation site. The product of this gene, a TFIIB-like factor, is directly recruited to the TATA-box of polymerase III small nuclear RNA gene promoters through its interaction with the TATA-binding protein. protein binding transcription regulator activity translation initiation factor activity zinc ion binding regulation of transcription, DNA-dependent transcription initiation transcription factor complex 80223 608737 Protein coding Proteins of the large Rab GTPase family (see RAB1A; MIM 179508) have regulatory roles in the formation, targeting, and fusion of intracellular transport vesicles. RAB11FIP1 is one of many proteins that interact with and regulate Rab GTPases protein binding protein transport cytoplasmic vesicle membrane recycling endosome Protein coding Neither human hexokinase-1 nor human inorganic pyrophosphatase expression segregated concordantly with human cytoplasmic glutamic-oxaloacetic transaminase expression.
catalytic activity pyridoxal phosphate binding transaminase activity amino acid metabolic process biosynthetic process 137362 cytoplasm intracellular 422 8p12 37939673-37943341 ADRB3: adrenergic, beta-3-, receptor BETA3AR 155 109691 Protein coding The ADRB3 gene product, beta-3-adrenergic receptor, is located mainly in adipose tissue and is involved in the regulation of lipolysis and thermogenesis. Beta adrenergic receptors are involved in the epinephrine and norepinephrine-induced activation of adenylate cyclase through the action of G proteins. 423 8p12 38007177-38037040 EIF4EBP1: eukaryotic translation initiation factor 4E binding protein 1 4EBP1, BP-1, MGC4316, PHAS-I, eIF4E-binding protein 1; phosphorylated heat- and acid-stable protein regulated by insulin 1 1978 602223 Protein coding 424 8p12 38082223-38116216 ASH2L: ash2 (absent, small, or homeotic)-like (Drosophila) ASH2, ASH2L1, ASH2L2, Bre2: ash2-like 9070 604782 Protein coding This gene encodes one member of a family of translation repressor proteins. The protein directly interacts with eukaryotic translation initiation factor 4E (eIF4E), which is a limiting component of the multisubunit complex that recruits 40S ribosomal subunits to the 5' end of mRNAs. Interaction of this protein with eIF4E inhibits complex assembly and represses translation. This protein is phosphorylated in response to various signals including UV irradiation and insulin signaling, resulting in its dissociation from eIF4E and activation of mRNA translation. Phosphorylated 4E-BP1 (p-4E-BP1) expression in tumors is associated with malignant progression and an adverse prognosis regardless of the upstream oncogenic alterations. In patients with ovarian carcinoma, significant expression of p-4E-BP1 was associated with high-grade tumors and a poor prognosis, regardless of other oncogenic alterations upstream.
Findings offer insight into the molecular role of ASH2L, and by extension that of WDR5, in proper H3K4 trimethylation. 425 8p12 38140014-38153183 LSM1: LSM1 homolog, U6 small nuclear RNA associated (S. cerevisiae) CASM, YJL124C 27257 607281 Protein coding 426 8p12 38119375-38127757 STAR: steroidogenic acute regulatory protein STARD1: START domain containing 1; StAR-related lipid transfer (START) domain 6770 600617 Protein coding Sm-like proteins contain the Sm sequence motif, which consists of 2 regions separated by a linker of variable length that folds as a loop. The Sm-like proteins are thought to form a stable heteromer present in tri-snRNP particles, which are important for pre-mRNA splicing. LSM1 is a breast cancer oncogene from the 8p11-12 amplicon. Lsm1 is deeply involved in prostate cancer progression through its down-regulation, independent of any gene mutation. The protein encoded by this gene plays a key role in the acute regulation of steroid hormone synthesis by enhancing the conversion of cholesterol into pregnenolone. This protein permits the cleavage of cholesterol into pregnenolone by mediating the transport of cholesterol from the outer mitochondrial membrane to the inner mitochondrial membrane.
Mutations in this gene are a cause of congenital lipoid adrenal hyperplasia (CLAH), beta3-adrenergic receptor activity contributes_to norepinephrine binding protein homodimerization activity receptor activity rhodopsin-like receptor activity G-protein signaling, coupled to cAMP nucleotide second messenger adenylate cyclase activation NOT arrestin mediated desensitization of G-protein coupled receptor protein signaling pathway carbohydrate metabolic process energy reserve metabolic process generation of precursor metabolites and energy positive regulation of MAPKKK cascade NOT receptor-mediated endocytosis signal transduction insulin receptor signaling pathway negative regulation of translational initiation integral to plasma membrane plasma membrane receptor complex DNA binding metal ion binding protein binding transcription regulator activity zinc ion binding RNA splicing factor activity, transesterification mechanism protein binding hemopoiesis regulation of transcription, DNA-dependent transcription transcription from RNA polymerase II promoter RNA splicing mRNA processing histone methyltransferase complex nucleus cholesterol binding cholesterol transporter activity lipid binding C21-steroid hormone biosynthetic process lipid transport steroid biosynthetic process mitochondrion eukaryotic initiation factor 4E binding protein binding cytosol nucleus ribonucleoprotein complex 427 8p12 38153263-38187694 BAG4: BCL2-associated athanogene 4 428 8p12 38208264-38238143 DDHD2: DDHD domain containing 2 429 8p12 38241414-38245872 430 8p12 38251717-38358947 PPAPDC1B: phosphatidic acid phosphatase type 2 domain containing 1B WHSC1L1: Wolf-Hirschhorn syndrome candidate 1-like 1 431 8p12 38363177-38385218 LETM2: leucine zipper-EF-hand containing 1; cholesterol trafficker; mitochondrial steroid acute regulatory protein; steroid acute regulatory protein; steroidogenic acute regulator BAG-4,
SODD: BAG-family molecular chaperone regulator-4; silencer of death domains also called lipoid CAH. A pseudogene of this gene is located on chromosome 13. 9530 603884 Protein coding The protein encoded by this gene is a member of the BAG1-related protein family. BAG1 is an anti-apoptotic protein that functions through interactions with a variety of cell apoptosis and growth related proteins including BCL-2, Raf-protein kinase, steroid hormone receptors, growth factor receptors and members of the heat shock protein 70 kDa family. This protein contains a BAG domain near the C-terminus, which could bind and inhibit the chaperone activity of Hsc70/Hsp70. This protein was found to be associated with the death domain of tumor necrosis factor receptor type 1 (TNF-R1) and death receptor-3 (DR3), and thereby negatively regulates downstream cell death signaling. The regulatory role of this protein in cell death was demonstrated in epithelial cells which undergo apoptosis while integrin mediated matrix contacts are lost. anti-apoptosis apoptosis protein folding cytoplasm hydrolase activity metal ion binding lipid catabolic process cytoplasm KIAA0725, SAMWD1: PAPLA1 like; SAM, WWE and DDHD domain containing 1; sec23p-interacting protein p125-like phosphatidic acid-preferring phospholipase A1 DPPL1, HTPAP: diacylglycerol pyrophosphate like 1 23259 84513 610626 Protein coding DPPL1 and DPPL2 represent a novel type of mammalian phosphatidate phosphatase. HTPAP is a novel metastatic suppressor gene for hepatocellular carcinoma catalytic activity hydrolase activity cell cycle negative regulation of cell cycle integral to membrane membrane DKFZp667H044, FLJ20353, MGC126766, MGC142029, NSD3, pp14328: WHSC1L1 protein; Wolf-Hirschhorn syndrome candidate 1-like 1 protein FLJ25409: leucine zipper- 54904 607083 Protein coding This gene is related to the Wolf-Hirschhorn syndrome candidate-1 gene and encodes a protein with PWWP (proline-tryptophan-tryptophan-proline) domains.
The function of the protein has not been determined. Two alternatively spliced variants have been described. histone-lysine N-methyltransferase activity metal ion binding methyltransferase activity protein binding transferase activity zinc ion binding cell differentiation cell growth chromatin modification histone methylation regulation of transcription, DNA-dependent transcription nucleus 137994 Protein coding protein binding receptor signaling protein activity Protein coding integral to membrane membrane containing transmembrane protein 2 432 8p12 38389449-38445293 433 8p12 434 8p12 38576848-38577932 435 8p12 38687668-38722137 436 8p11.23 38763938-38829703 437 8p11.23 38877986-38946459 38487509-38505337 FGFR1: fibroblast growth factor receptor 1 (fms-related tyrosine kinase 2, Pfeiffer syndrome) FLJ43582: FLJ43582 protein RNF5P1: ring finger protein 5 pseudogene 1 LOC100131277: hypothetical protein LOC100131277 TACC1: transforming, acidic coiled-coil containing protein 1 PLEKHA2: pleckstrin homology domain containing, family A (phosphoinositide binding specific) member 2 EF-hand containing transmembrane protein 1-like protein BFGFR, CD331, CEK, FGFBR, FLG, FLT2, HBGFR, KAL2, N-SAM: FMS-like tyrosine kinase 2; basic fibroblast growth factor receptor 1; fibroblast growth factor receptor 1; fms-related tyrosine kinase-2; heparin-binding growth factor receptor; hydroxyarylprotein kinase hypothetical protein LOC389649 mitochondrial membrane mitochondrion 2260 136350 389649 The protein encoded by this gene is a member of the fibroblast growth factor receptor (FGFR) family, where amino acid sequence is highly conserved between members and throughout evolution. FGFR family members differ from one another in their ligand affinities and tissue distribution.
A full-length representative protein consists of an extracellular region, composed of three immunoglobulin-like domains, a single hydrophobic membrane-spanning segment and a cytoplasmic tyrosine kinase domain. The extracellular portion of the protein interacts with fibroblast growth factors, setting in motion a cascade of downstream signals, ultimately influencing mitogenesis and differentiation. This particular family member binds both acidic and basic fibroblast growth factors and is involved in limb induction. Mutations in this gene have been associated with Pfeiffer syndrome, Jackson-Weiss syndrome, Antley-Bixler syndrome, osteoglophonic dysplasia, and autosomal dominant Kallmann syndrome 2. Chromosomal aberrations involving this gene are associated with stem cell myeloproliferative disorder and stem cell leukemia lymphoma syndrome. Alternatively spliced variants which encode different protein isoforms have been described; however, not all variants have been fully characterized. HFGFR1 was expressed primarily in the ventricular zone embryologically ATP binding fibroblast growth factor receptor activity fibroblast growth factor receptor activity heparin binding nucleotide binding protein binding receptor activity transferase activity MAPKKK cascade cell growth fibroblast growth factor receptor signaling pathway protein amino acid phosphorylation skeletal development integral to plasma membrane membrane membrane fraction The function of this gene has not yet been determined; however, it is speculated that it may represent a breast cancer candidate gene. It is located close to FGFR1 on a region of chromosome 8 that is amplified in some breast cancers.
Down regulation of TACC1 controls mRNA homeostasis in polarized cells and participates in oncogenic processes in human cancers protein binding cell cycle cell division cytoplasm nucleus phosphatidylinositol binding phospholipid binding biological_process cellular_component cytoplasm nucleus plasma membrane Protein coding pseudogene 286140 100131277 DKFZp686K18126, Ga55, KIAA1103: transforming, acidic coiled-coil containing protein 1 variant TACC1B FLJ25921, TAPP2: pleckstrin homology domain-containing, family A (phosphoinositide binding specific) Protein coding unknown 6867 605301 Protein coding 59339 607773 Protein coding 438 8p11.23 38950854-38964801 HTRA4: HtrA serine peptidase 4 439 8p11.23 38965484-38973198 TM2D2: TM2 domain containing 2 440 8p11.23 38973662-39081937 441 8p11.23 442 member 2; tandem PH Domain containing protein-2 FLJ90724 203100 610700 Protein coding FLJ90724 or HTRA4 is a member of the HtrA family of proteases BLP1, MGC125813, MGC125814: BBP-like protein 1; TM2 domain containing 2, isoform a 83877 610081 Protein coding ADAM9: ADAM metallopeptidase domain 9 (meltrin gamma) KIAA0021, MCMP, MDC9, Mltng 8754 602713 protein coding 39084325-39261593 ADAM32: ADAM metallopeptidase domain 32 203102 Protein coding 8p11.23 39134214-39135377 100129004 pseudogene 443 8p11.23 39291308-39394054 LOC100129004: hypothetical LOC100129004 ADAM5P: ADAM metallopeptidase domain 5 pseudogene FLJ26299, FLJ29004: a disintegrin and metalloprotease domain 32; a disintegrin and metalloproteinase domain 32; metalloproteinase 12-like protein similar to ribosomal protein L3 ADAM5, TMDCII The protein encoded by this gene contains a structural module related to that of the seven transmembrane domain G protein-coupled receptor superfamily.
This protein has sequence and structural similarities to the beta-amyloid binding protein (BBP), but, unlike BBP, it does not regulate a response to beta-amyloid peptide. This protein may have regulatory roles in cell death or proliferation signal cascades. This gene has multiple alternatively spliced transcript variants which encode two different isoforms. This gene encodes a member of the ADAM (a disintegrin and metalloprotease domain) family. Members of this family are membrane-anchored proteins structurally related to snake venom disintegrins, and have been implicated in a variety of biological processes involving cell-cell and cell-matrix interactions, including fertilization, muscle development, and neurogenesis. The protein encoded by this gene interacts with SH3 domain-containing proteins, binds mitotic arrest deficient 2 beta protein, and is also involved in TPA-induced ectodomain shedding of membrane-anchored heparin-binding EGF-like growth factor. Two alternative splice variants have been identified, encoding distinct isoforms. ADAM9 overexpression enhances cell adhesion and invasion of non-small cell lung cancer cells via modulation of other adhesion molecules and changes in sensitivity to growth factors, thereby promoting metastatic capacity to the brain. This gene encodes a member of the ADAM (a disintegrin and metalloprotease domain) family. Members of this family are membrane-anchored proteins structurally related to snake venom disintegrins, and have been implicated in a variety of biological processes involving cell-cell and cell-matrix interactions, including fertilization, muscle development, and neurogenesis. 255926 Protein coding 444 8p11.22 39427721-39499524 ADAM3, CYRN1, tMDCI 1587 pseudogene ADAM3A: ADAM metallopeptidase This gene encodes a member of the ADAM (a disintegrin and metalloprotease domain) family.
Members of this family are membrane-anchored proteins structurally related to snake venom disintegrins, and have been implicated in a variety of biological processes involving cell-cell and cell-matrix interactions, including fertilization, muscle development, and neurogenesis. Human cyritestin genes (CYRN1 and CYRN2) are non-functional. The gene for the human tMDC I sperm surface protein is non-functional: implications for its proposed role in mammalian sperm-egg recognition. peptidase activity protein binding serine-type endopeptidase activity proteolysis integral to membrane membrane SH3 domain binding integrin binding metal ion binding metalloendopeptidase activity protein binding protein kinase binding zinc ion binding protein kinase cascade proteolysis extracellular region integral to plasma membrane plasma membrane metalloendopeptidase activity zinc ion binding metalloendopeptidase activity zinc ion binding proteolysis domain 3A (cyritestin 1) LOC100130964: similar to hCG2045185 ADAM18: ADAM metallopeptidase domain 18 445 8p11.22 39549912-39556810 100130964 Protein coding 446 8p11.22 39561299-39706644 ADAM27, MGC41836, MGC88272, tMDCIII 8749 Protein coding 447 8p11.22 397204119814936 ADAM2: ADAM metallopeptidase domain 2 (fertilin beta) CRYN1, CRYN2, FTNB, PH-30b, PH30: ADAM metallopeptidase domain 2; a disintegrin and metalloproteinase domain 2 (fertilin beta); fertilin beta CD107B, IDO: Indoleamine 2,3-dioxygenase; indole 2,3-dioxygenase 2515 601533 Protein coding 448 8p11.22 39890485-33905107 INDO: indoleamine-pyrrole 2,3 dioxygenase 3620 147435 Protein coding 449 8p11.22 39955743-39992278 INDOL1: indoleamine-pyrrole 2,3 dioxygenase-like 1 indoleamine 2,3-dioxygenase-like 1 protein 169355 450 8p11.21 40130146-40131600 C8orf4: chromosome 8 open reading frame 4 MGC22806, TC1, TC1, hTC-1: human thyroid cancer 1; thyroid cancer-1 56892 451 8p11.21 40507270-40874500 FLJ13842
79698 8p11.21 41238635-41286137 ZMAT4: zinc finger, matrin type 4 SFRP1: secreted frizzled-related protein 1 452 FRP, FRP-1, FRP1, FrzA, SARP2: secreted 6422 Protein coding 607702 Protein coding This gene encodes a member of the ADAM (a disintegrin and metalloprotease domain) family. Members of this family are membrane-anchored proteins structurally related to snake venom disintegrins, and have been implicated in a variety of biologic processes involving cell-cell and cell-matrix interactions, including fertilization, muscle development, and neurogenesis. The protein encoded by this gene is a sperm surface protein. This gene encodes a member of the ADAM (a disintegrin and metalloprotease domain) family. Members of this family are membrane-anchored proteins structurally related to snake venom disintegrins, and have been implicated in a variety of biological processes involving cell-cell and cell-matrix interactions, including fertilization, muscle development, and neurogenesis. This member is a subunit of an integral sperm membrane glycoprotein called fertilin, which plays an important role in sperm-egg interactions. metalloendopeptidase activity zinc ion binding cell differentiation multicellular organismal development proteolysis spermatogenesis integral to membrane membrane membrane fraction integrin binding metalloendopeptidase activity protein binding zinc ion binding cell adhesion fusion of sperm to egg plasma membrane proteolysis integral to membrane membrane Gamma-interferon (IFNG; MIM 147570) has an antiproliferative effect on many tumor cells and inhibits intracellular pathogens such as Toxoplasma and Chlamydia, at least partly because of the induction of indoleamine 2,3-dioxygenase (INDO; EC 1.13.11.42). This enzyme catalyzes the degradation of the essential amino acid L-tryptophan to N-formylkynurenine. IDO2 encodes a novel IDO-related tryptophan catabolic enzyme that is preferentially inhibited by D-1-methyl-tryptophan (D-1MT).
IDO2 may have a distinct role in immune tolerance. Two common human genetic polymorphisms ablate IDO2 enzyme activity. Evolutionary relationships between the INDO and INDOL1 genes. The INDOL1 protein has a distinct expression pattern compared to INDO and both have the ability to catabolise tryptophan. electron carrier activity heme binding indoleamine 2,3-dioxygenase activity iron ion binding metal ion binding heme binding iron ion binding metal ion binding oxidoreductase activity oxidoreductase activity, acting on single donors with incorporation of molecular oxygen, incorporation of two atoms of oxygen female pregnancy tryptophan catabolic process This gene encodes a small, monomeric, predominantly unstructured protein that functions as a positive regulator of the Wnt/beta-catenin signaling pathway. This protein interacts with a repressor of beta-catenin mediated transcription at nuclear speckles. It is thought to competitively block interactions of the repressor with beta-catenin, resulting in up-regulation of beta-catenin target genes. TC-1 overexpression is transforming and may link with the FGFR pathway in a subset of breast cancer. Gene induces transformed phenotype when overexpressed in a breast cancer cell line. Overexpression of TC-1 may be important in thyroid carcinogenesis. Protein coding 604156 Protein coding Secreted frizzled-related protein 1 (SFRP1) is a member of the SFRP family that contains a cysteine-rich domain homologous to the putative Wnt-binding site of Frizzled proteins.
SFRPs act as soluble modulators of Wnt apoptosis DNA binding metal ion binding zinc ion binding protein binding intracellular nucleus Wnt receptor signaling pathway anatomical structure morphogenesis extracellular region extracellular space apoptosis-related protein 2 453 8p11.21 41467238-41487656 GOLGA7: golgi autoantigen, golgin subfamily a, 7 454 8p11.21 41505903-41519068 455 8p11.21 41554876-41597971 GINS4: GINS complex subunit 4 (Sld5 homolog) AGPAT6: 1-acylglycerol-3-phosphate O-acyltransferase 6 (lysophosphatidic acid acyltransferase, zeta) 456 8p11.21 41622986-41624035 NKX6-3: NK6 homeobox 3 457 8p11.21 8p11.21 41637116-41637183 41629901-41774297 MIRN486: microRNA 486 ANK1: ankyrin 1, erythrocytic 458 GCP16, GOLGA3AP1, HSPC041, MGC21096, MGC4876: Golgi complex-associated protein of 16kDa MGC14799, SLD5: SLD5; SLD5 homolog DKFZp586M1819, LPAAT-zeta, LPAATZ, TSARG7: lysophosphatidic acid acyltransferase zeta; testis spermatogenesis apoptosis-related protein 7 FLJ25169, NKX6.3: NK6 transcription factor related, locus 3 hsa-mir-486 51125 609453 Protein coding 84296 610611 Protein coding 137964 608143 Protein coding 157848 610772 miscRNA ANK, SPH1, SPH2: ankyrin 1; ankyrin-1, erythrocytic; ankyrin-R 286 619554 signaling. SFRP1 and SFRP5 may be involved in determining the polarity of photoreceptor cells in the retina. SFRP1 is expressed in several human tissues, with the highest levels in heart. Epigenetic inactivation by methylation is the predominant mechanism of SFRP1 gene silencing in breast cancer. SFRP1 inactivation is a common and early event caused mainly by hypermethylation in gastric cancer. SFRP1 expression loss may be correlated with tumor metastasis in primary gastric cancer. The data support a role for sFRP1 as a tumor suppressor in clear cell renal cell carcinoma and perhaps loss of sFRP1 is an early, aberrant molecular event in renal cell carcinogenesis.
Results indicate that SFRP1 is the Hedgehog target to confine canonical WNT signaling within stem or progenitor cells. Results indicate that GCP16 is the acylated membrane protein, associated with GCP170, and possibly involved in vesicular transport from the Golgi to the cell surface. Data show that H- and N-Ras are palmitoylated by a human protein palmitoyltransferase encoded by the ZDHHC9 and GCP16 genes. The C-terminal domains of the Sld5 and Psf1 subunits are connected by linker regions to the core complex, and the C-terminal domain of Sld5 is important for core complex assembly. Lysophosphatidic acid acyltransferases (EC 2.3.1.51) catalyze the conversion of lysophosphatidic acid (LPA) to phosphatidic acid (PA). LPA and PA are involved in signal transduction and lipid biosynthesis. anti-apoptosis cell differentiation multicellular organismal development signal transduction Golgi apparatus Golgi membrane membrane 1-acylglycerol-3-phosphate O-acyltransferase activity acyltransferase activity transferase activity endoplasmic reticulum integral to membrane membrane sequence-specific DNA binding transcription factor activity diacylglycerol metabolic process fatty acid metabolic process glandular epithelial cell maturation lactation lipid biosynthetic process mammary gland development metabolic process regulation of multicellular organism growth triacylglycerol biosynthetic process regulation of transcription, DNA-dependent transcription cytoskeletal adaptor activity enzyme binding spectrin binding structural constituent of cytoskeleton cytoskeleton organization and biogenesis exocytosis maintenance of epithelial cell polarity signal transduction actin cytoskeleton basolateral plasma membrane cytoplasm plasma membrane sarcoplasmic reticulum nucleus miscRNA 182900 Protein coding Ankyrins are a family of proteins that are believed to link the integral membrane proteins to the underlying spectrin-actin cytoskeleton and play key roles in activities such as cell motility,
activation, proliferation, contact and the maintenance of specialized membrane domains. Multiple isoforms of ankyrin with different affinities for various target proteins are expressed in a tissue-specific, developmentally regulated manner. Most ankyrins are typically composed of three structural domains: an amino-terminal domain containing multiple ankyrin repeats; a central region with a highly conserved spectrin binding domain; and a carboxy-terminal regulatory domain which is the least conserved and subject to variation. Ankyrin 1, the prototype of this family, was first discovered in the erythrocytes, but since has also been found in brain and muscles. Mutations in erythrocytic ankyrin 459 8p11.21 41907420-42028635 MYST3: MYST histone acetyltransferase (monocytic leukemia) 3 460 8p11.21 42129761-42147858 AP3M2: adaptor-related protein complex 3, mu 2 subunit 461 8p11.21 42151908-42184351 PLAT: plasminogen activator, tissue 462 8p11.21 42247986-42309122 IKBKB: inhibitor of kappa light KAT6A, MGC167033, MOZ, RUNXBP2, ZNF220: Monocytic leukemia zinc finger protein; runt-related transcription factor binding protein 2; zinc finger protein 220 AP47B, CLA20, P47B: HA1 47kDA subunit homolog 2; clathrin assembly protein assembly protein complex 1 medium chain homolog 2; clathrin coat assembly protein AP47 homolog 2; clathrin-associated protein AP47 homolog 2; golgi adaptor AP1 47 kDA protein homolog 2 DKFZp686I03148, T-PA, TPA: alteplase; plasminogen activator, tissue type; reteplase; t-plasminogen activator; tissue plasminogen activator (t-PA) 7994 5327 FLJ40509, IKKbeta, IKK2, 3551 601408 10947 Protein coding 1 have been associated in approximately half of all patients with hereditary spherocytosis.
Complex patterns of alternative splicing in the regulatory domain, giving rise to different isoforms of ankyrin 1 have been described, however, the precise functions of the various isoforms are not known. Alternative polyadenylation accounting for the different sized erythrocytic ankyrin 1 mRNAs, has also been reported. Truncated muscle-specific isoforms of ankyrin 1 resulting from usage of an alternate promoter have also been identified. MOZ complexes with TIF2 as a recombinant fusion protein which induces acute myelocytic leukemia DNA binding acetyltransferase activity histone acetyltransferase activity metal ion binding transcription factor binding transferase activity zinc ion binding DNA packaging chromatin modification histone acetylation myeloid cell differentiation negative regulation of transcription nucleosome assembly positive regulation of transcription regulation of transcription, DNA-dependent transcription nucleosome nucleus Protein coding This gene encodes a subunit of the heterotetrameric adaptor-related protein complex 3 (AP-3), which belongs to the adaptor complexes medium subunits family. The AP-3 complex plays a role in protein trafficking to lysosomes and specialized organelles. Some AP3M2 mutations still remain candidates for unmapped disorders including epilepsy, febrile seizure, and other neuronal developmental disorders associated with functional abnormalities of GABAergic transmission protein binding protein transporter activity intracellular protein transport protein complex assembly vesicle-mediated transport Golgi apparatus clathrin adaptor complex 173370 Protein coding peptidase activity plasminogen activator activity blood coagulation protein modification process proteolysis extracellular region 603258 Protein coding This gene encodes tissue-type plasminogen activator, a secreted serine protease which converts the proenzyme plasminogen to plasmin, a fibrinolytic enzyme.
Tissue-type plasminogen activator is synthesized as a single chain which is cleaved by plasmin to a two chain disulfide linked protein. This enzyme plays a role in cell migration and tissue remodeling. Increased enzymatic activity causes hyperfibrinolysis, which manifests as excessive bleeding; decreased activity leads to hypofibrinolysis which can result in thrombosis or embolism. Alternative splicing of this gene results in multiple transcript variants encoding different isoforms. Tissue plasminogen activator and neuroserpin are widely expressed in the human central nervous system NFKB1 (MIM 164011) or NFKB2 (MIM 164012) is bound to REL (MIM 164910), RELA (MIM 164014), or RELB (MIM 604758) to form the ATP binding IkappaB kinase activity activation of NF-kappaB transcription factor cytoplasm polypeptide gene enhancer in B-cells, kinase beta 463 8p11.21 42315187-42348470 POLB: polymerase (DNA directed), beta 464 8p11.21 42368547-42382572 VDAC3: voltage-dependent anion channel 3 465 8p11.21 42350743-42353831 DKK4: dickkopf homolog 4 (Xenopus laevis) 466 8p11.21 42393150-42516225 SLC20A2: solute carrier family 20 (phosphate transporter), member 2 467 8p11.21 42515913-42527280 468 8p11.21 42671719-42712548 C8orf40: chromosome 8 open reading frame 40 CHRNB3: cholinergic receptor, nicotinic, beta 3 IKKB, MGC131801, NFKBIKB: inhibitor of nuclear factor kappa B kinase beta subunit; nuclear factor NF-kappa-B inhibitor kinase beta MGC125976: DNA pol beta; DNA polymerase beta subunit NFKB complex. The NFKB complex is inhibited by I-kappa-B proteins (NFKBIA, MIM 164008, or NFKBIB, MIM 604495), which inactivate NF-kappa-B by trapping it in the cytoplasm. Phosphorylation of serine residues on the I-kappa-B proteins by kinases (IKBKA, MIM 600664, or IKBKB) marks them for destruction via the ubiquitination pathway, thereby allowing activation of the NF-kappa-B complex.
Activated NFKB complex translocates into the nucleus and binds DNA at kappa-B-binding motifs such as 5-prime GGGRNNYYCC 3-prime or 5-prime HGGARNYYCC 3-prime (where H is A, C, or T; R is an A or G purine; and Y is a C or T pyrimidine). In eukaryotic cells, DNA polymerase beta (POLB) performs base excision repair (BER) required for DNA maintenance, replication, recombination, and drug resistance. DNA pol-beta is an essential component of the DNA replication machinery in neuronal cell death in Alzheimer's disease. Deregulated DNA polymerase beta induces chromosome instability and tumorigenesis. 5423 174760 Protein coding HD-VDAC3 7419 610029 Protein coding VDAC3 belongs to a group of mitochondrial membrane channels involved in translocation of adenine nucleotides through the outer membrane. These channels may also function as a mitochondrial binding site for hexokinase DKK-4, MGC129562, MGC129563: dickkopf homolog 4 GLVR2, Glvr-2, MLVAR, PIT-2: gibbon ape leukemia virus receptor 2; murine leukemia virus, amphotropic, receptor for; solute carrier family 20, member 2 hypothetical protein LOC114926 27121 605417 Protein coding 6575 158378 Protein coding This gene encodes a protein that is a member of the dickkopf family. The secreted protein contains two cysteine rich regions and is involved in embryonic development through its interactions with the Wnt signaling pathway. Activity of this protein is modulated by binding to the Wnt coreceptor and the co-factor kremen 2. Two highly conserved glutamate residues critical for sodium-dependent phosphate transport are revealed by uncoupling transport function from retroviral receptor function.
acetylcholine receptor, neuronal nicotinic, beta-3 subunit; 1142 114926
identical protein binding nucleotide binding protein serine/threonine kinase activity protein-tyrosine kinase activity transcription activator activity transferase activity protein amino acid phosphorylation protein amino acid phosphorylation protein modification process beta DNA polymerase activity damaged DNA binding lyase activity magnesium ion binding microtubule binding nucleotidyltransferase activity protein binding sequence-specific DNA binding sodium ion binding transferase activity protein binding voltage-gated anion channel activity DNA-dependent DNA replication anti-apoptosis cell death pyrimidine dimer repair cytoplasm intracellular nucleus spindle microtubule adenine transport anion transport Molecular function Wnt receptor signaling pathway multicellular organismal development negative regulation of Wnt receptor signaling pathway phosphate transport transport integral to plasma membrane membrane mitochondrial outer membrane mitochondrion outer membrane extracellular region inorganic phosphate transmembrane transporter activity receptor activity sodium:phosphate symporter activity Protein coding 118508 Protein coding integral to plasma membrane membrane membrane fraction integral to membrane membrane
Absence of differences in the pharmacological profile of nicotinic receptor alpha3beta4 argues against role for incorporated beta3 subunit in formation of agonist binding sites while changes in channel kinetics suggest important effect on receptor gating
channel activity extracellular ligand-gated ion channel activity ion transport signal transduction synaptic transmission, cholinergic cell junction integral to membrane nicotinic acetylcholine-gated receptor-channel
cholinergic receptor, nicotinic, beta polypeptide 3
469 8p11.21 42726920-42742776 CHRNA6: cholinergic receptor, nicotinic, alpha 6
CHNRA6: cholinergic receptor, nicotinic, alpha polypeptide 6 8973
470 8p11.21 42829548-42870939 RNF170: ring finger protein 170
DKFZP564A022, FLJ38306 81790
471 8p11.21 42810974-42817631 THAP1: THAP domain containing, apoptosis associated protein 1
55145 609520 Protein coding
The protein encoded by this gene contains a THAP domain, a conserved DNA-binding domain. This protein colocalizes with the apoptosis response protein PAWR/PAR-4 in promyelocytic leukemia (PML) nuclear bodies, and functions as a proapoptotic factor that links PAWR to PML nuclear bodies. Alternatively spliced transcript variants encoding distinct isoforms have been observed.
472 8p11.21 42871190-42994084 HOOK3: hook homolog 3 (Drosophila)
84376 607825 Protein coding
Hook proteins are cytosolic coiled-coil proteins that contain conserved N-terminal domains, which attach to microtubules, and more divergent C-terminal domains, which mediate binding to organelles. The Drosophila Hook protein is a component of the endocytic compartment.
473 8p11.21 43030599-43060088 FNTA: farnesyltransferase, CAAX box, alpha
FLJ10477, MGC33014: 4833431A01Rik; THAP domain protein 1; nuclear proapoptotic factor
HK3: golgi-associated microtubule-binding protein HOOK3
FPTA, MGC99680, PGGT1A: FTase-alpha; GGTase-I-alpha; farnesyltransferase alpha-subunit; ras proteins prenyltransferase alpha; type I protein geranylgeranyltransferase alpha subunit
MGC126597, hypothetical protein LOC84197
DKFZp686G24175, FLJ22242, FLJ32731,
2339 134635 Protein coding
Prenyltransferases attach either a farnesyl group or a geranylgeranyl group in thioether linkage to the cysteine residue of proteins with a C-terminal CAAX box. CAAX geranylgeranyltransferase and CAAX farnesyltransferase are heterodimers that share the same alpha subunit but have different beta subunits. This gene encodes the alpha subunit of these transferases. Alternative splicing results in multiple transcript variants encoding different isoforms.
Gene encoding the enzyme deficient in mucopolysaccharidosis IIIC was identified as HGSNAT; mutational analyses identified a splice-junction mutation that accounted for three mutant alleles, and a single base-pair
474 8p11.21 43067818-43097480 FLJ23356: hypothetical protein FLJ23356
475 8p11.21 43114797-43177127 HGSNAT: heparan-alpha-glucosaminide N-acetyltransferase
606888 Protein coding 84197 138050 Protein coding
ion channel activity neurotransmitter receptor activity nicotinic acetylcholine-activated cation-selective channel activity acetylcholine receptor activity extracellular ligand-gated ion channel activity ion channel activity nicotinic acetylcholine-activated cation-selective channel activity metal ion binding protein binding zinc ion binding DNA binding metal ion binding zinc ion binding microtubule binding complex plasma membrane postsynaptic membrane synapse ion transport signal transduction synaptic transmission cell junction integral to membrane nicotinic acetylcholine-gated receptor-channel complex plasma membrane postsynaptic membrane synapse integral to membrane membrane nucleus Golgi localization cytoplasmic microtubule organization and biogenesis Golgi apparatus cis-Golgi network cytoplasm cytoskeleton microtubule CAAX-protein geranylgeranyltransferase activity protein binding protein farnesyltransferase transferase activity protein amino acid farnesylation protein amino acid geranylgeranylation transforming growth factor beta receptor signaling pathway cytoplasm acyltransferase activity heparan-alpha-glucosaminide N- glycosaminoglycan metabolic process integral to membrane lysosomal membrane membrane IEA IPI TAS IEA protein coding 610453 Protein coding
476 8p11.21 43266742-43337485
477 8p11.21 47610000-47645724
478 8p11.21 48002018-48006274
479 8p11.21 48187776-48191592
480 8p11.21 48224100-48225305
481 8p11.21 48336095-48811028
482 8p11.21 48585624-48587835
483 8p11.21 48666940-48667731
484 8p11.21 48812029-48813279
A26A1: ANKRD26-like family A, member 1
ASNSL1: asparagine synthetase-like 1
MAPK6PS4: mitogen-activated protein kinase 6 pseudogene 4
RPL10AP2: ribosomal protein L10a pseudogene 2
ATP6V1GP2: ATPase, H+ transporting, lysosomal 13kDa, V1 subunit G pseudogene 2
KIAA0146
HGNAT, MPS3C, TMEM76: transmembrane protein 76
POTE-8, POTE8
Erk3ps4
340441 608915 pseudogene 286065 pseudogene 253986 pseudogene protein coding
LOC100129093: similar to p47 100129093 pseudogene
LOC100128689: similar to ubiquitin-conjugating enzyme E2 UbcH-ben 100128689 pseudogene
CEBPD: CCAAT/enhancer binding protein (C/EBP), delta 1052
The protein encoded by this intronless gene is a bZIP transcription factor which can bind as a homodimer to certain DNA regulatory regions. It can also form heterodimers with the related protein CEBP-alpha. The encoded protein is important in the regulation of genes involved in immune and inflammatory responses, and may be involved in the regulation of genes associated with activation and/or differentiation of macrophages. It is also involved in cancer.
protein dimerization activity sequence-specific DNA binding transcription factor activity
pseudogene 23514
C/EBP-delta, CELF, CRP3, NFIL6-beta
acetyltransferase activity transferase activity
protein coding 389652 619450
hypothetical protein LOC23514
insertion accounted for the fourth.
Mutational analysis of HGSNAT in Italian Sanfilippo C syndrome patients resulted in identification of 9 alleles (8 novel): 3 splice-site mutations, 3 frameshift deletions resulting in premature stop codons, 1 nonsense mutation, and 2 missense mutations
116898 protein coding
regulation of transcription, DNA-dependent transcription transcription from RNA polymerase II promoter nucleus
GENES ON CHROMOSOME 8p: LOCALIZATION AND DESCRIPTION (GENES = 484)
Data have been obtained from NCBI (http://www.ncbi.nlm.nih.gov/), OMIM, Entrez Gene (http://www.ncbi.nlm.nih.gov/sites/entrez) and Ensembl release 48 (http://www.ensembl.org/Homo_sapiens/contigview?c=8:1748847.5;w=10008).
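
The kappa-B-binding motifs quoted in the table (GGGRNNYYCC and HGGARNYYCC) are written with degenerate base codes: H is A, C, or T; R is A or G; Y is C or T. As a minimal sketch of how such a consensus can be read, the Python below expands each code into a regular-expression character class and scans a sequence on the forward strand only. The function names, the treatment of N as any base, and the example sequences are illustrative assumptions and are not part of the table or its sources.

```python
import re

# Degenerate base codes used by the two motifs quoted in the table.
# N = any base is an assumption based on the usual IUPAC convention;
# the table itself only spells out H, R and Y.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "H": "[ACT]", "N": "[ACGT]",
}

KAPPA_B_MOTIFS = ("GGGRNNYYCC", "HGGARNYYCC")


def motif_to_regex(motif):
    """Expand a degenerate motif into a compiled regex.

    The pattern is wrapped in a lookahead so overlapping sites are found.
    """
    body = "".join(IUPAC[base] for base in motif)
    return re.compile("(?=(" + body + "))")


def find_kappa_b_sites(sequence):
    """Return sorted (start, matched_text, motif) tuples.

    Positions are 0-based; only the given (forward) strand is scanned.
    """
    sequence = sequence.upper()
    hits = []
    for motif in KAPPA_B_MOTIFS:
        for match in motif_to_regex(motif).finditer(sequence):
            hits.append((match.start(), match.group(1), motif))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical test sequence, not taken from the table.
    for hit in find_kappa_b_sites("TTGGGACTTTCCAA"):
        print(hit)
```

Scanning the reverse complement as well would be needed for a real promoter search; it is left out here to keep the sketch short.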