The Genome Database

In the name of God, the Most Gracious, the Most Merciful

Report on: The Genome Database

Submitted to:
Dr/ Ahmed Hisham
Eng/ Ahmed AbdElhady
Eng/ Enas Abd El-Fatah

Submitted by:
1) Abdullah Zein El-Abdeen Abdullah
2) Ali Hameed Mothana Algomaee
3) Hussien Hassan Mahdy
4) Mena Hanna Yucoub
5) Mohamed Ibrahim Resk Hashish
6) Mohammed Mustafa Mohamed

Living organisms are classified as:
1) Non-cellular organisms: they have no cell organelles. Example: viruses.
2) Cellular organisms: they have cell organelles. Examples: humans, plants and animals.

Living cells are one of two types:
1) Eukaryotes: cells that contain a nucleus, which holds the genetic material.
2) Prokaryotes: cells that have no nucleus; their DNA molecules are not enclosed by a membrane.

[Figure: the cell of bacteria]

Eukaryotes
Eukaryotes are single-celled or multicellular organisms whose cells contain organelles, which are defined as membrane-bound structures such as the nucleus, mitochondria, chloroplasts, endoplasmic reticulum (ER), etc.
Two familiar eukaryotic cell types are:
1) Animal cells
2) Plant cells

[Figure: the human cell]

Prokaryotes
Prokaryotes are single-cell organisms.
A prokaryotic cell does not have a nucleus.
Prokaryotes are divided into bacteria and archaea.
Prokaryotic cells do not have membrane-bound organelles.

Genome: the name is a blend of the words gene and chromosome. It is the sum of the genes of an organism, and hence of its genetic determinants.
Bacteria do not possess a nucleus like eukaryotes. Instead they have a structure called a nucleoid, which contains nucleic acid but lacks a nuclear membrane.
The location of the genome depends on the type of organism:
o The genome of a eukaryotic cell is located in the nucleus and in the mitochondria (and in the chloroplasts in plants), which are membrane-bound organelles.
o The genome of a prokaryotic cell is in the cytoplasm.
Virus genomes are composed of either DNA or RNA, depending on the virus.
Types of Genome in Eukaryotes:
1) The chromosomal genome: located inside the nucleus of the cell in the familiar form of chromosomes. It is inherited from both parents.
2) The mitochondrial genome: located outside the nucleus, in organelles in the cytoplasm of the cell. These genes are not part of the chromosomal genome. It is normally inherited only from the mother.
3) The plastome: the DNA found within the chloroplast.

Gene: a segment of the genetic material which, in the simplest view, controls one character or property. Genes are spread like beads along the chromosome. Each gene codes for, or specifies, a certain protein and hence a specific function.
Note: A structural gene is a gene that is used to construct a protein.

A human gene consists of:
1) Exons: the coding parts of the gene.
2) Introns: the non-coding parts between the exons in the gene. Their function is not yet fully understood by scientists.

Deoxyribonucleic Acid (DNA): a double helix of two strands. Each strand is a polymer of nucleotides linked to each other by a sugar-phosphate backbone. The two strands are connected together by hydrogen bonding.
A strand is one of the nucleic acid chains that form the DNA.
A DNA chain is made up of four chemical bases:
o adenine (A)
o guanine (G)
o cytosine (C)
o thymine (T)
A DNA chain, also called a strand, has a sense of direction, in which one end is chemically different from the other. The so-called 5' end terminates in a 5' phosphate group (-PO4); the 3' end terminates in a 3' hydroxyl group (-OH). This is important because DNA strands are always synthesized in the 5' to 3' direction.
The two strands are connected to each other by chemical pairing of each base on one strand to a specific partner on the other strand. Adenine (A) pairs with thymine (T), and guanine (G) pairs with cytosine (C).
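The pairing rule just stated can be illustrated with a small sketch. The Python below is not part of the original report: the function names complement and reverse_complement and the example sequence are assumptions chosen only for illustration. Given one strand written 5' to 3', it derives the partner strand, first base by base and then read in the partner's own 5' to 3' direction.

```python
# Illustrative sketch only; names and the example sequence are assumptions.
# Pairing rule from the text: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the partner base for each base, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand written in its own 5' to 3' direction."""
    return complement(strand)[::-1]


if __name__ == "__main__":
    example = "ATGCGT"  # hypothetical strand, written 5' to 3'
    print(complement(example))
    print(reverse_complement(example))
```

The reversal is needed because the two strands of the double helix run in opposite directions, so the partner of a strand written 5' to 3' is read from the other end.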
A-T and G-C base pairs are therefore said to be complementary. DNA carries the genetic information.
Note: If the DNA of the 46 chromosomes in a single (diploid) human cell were connected end-to-end and straightened, it would have a length of ~2 m.

Ribonucleic Acid (RNA) is a chain, or polymer, of nucleotides whose strand has the same 5' to 3' direction as a DNA strand.
RNA has many types, including:
1) ribosomal RNAs (rRNAs)
2) transfer RNAs (tRNAs)
3) messenger RNAs (mRNAs)
4) small nuclear RNAs (snRNAs)

The protein:
Proteins are long chains containing as many as 20 different kinds of amino acids. Each cell contains thousands of different proteins, in the form of:
1) Enzymes: make new molecules and catalyze nearly all chemical processes in cells.
2) Structural components: give cells their shape and help them move.
3) Hormones: transmit signals throughout the body.
4) Antibodies: recognize foreign molecules.
5) Transport molecules: carry substances such as oxygen.
The genetic code carried by DNA is what specifies the order and number of amino acids and, therefore, the shape and function of the protein.
The genetic information flows from DNA to RNA to protein.
There are two main steps in producing a protein:
1) Transcription: the synthesis of an mRNA copy from a sequence of DNA. The primary mRNA transcript is then edited: the introns are removed, the exons are joined together, and unique features are added to each end of the transcript to make a "mature" mRNA.
2) Translation: the process in which the genetic code carried by mRNA directs the synthesis of proteins from amino acids on the ribosomes.

Does Everyone Have the Same Genes?
When you look at the human species, you see evidence of genetic variation; that is, there are immediately recognizable differences in human traits, such as hair and eye color, skin pigment, and height.
When two individuals differ in such a trait, it often means that the sequence of a gene is slightly different in the two individuals, and the gene is called polymorphic. All people generally have the same genes, but the genes do not have exactly the same DNA sequence.

Genome evolution:
Genomes have traits that may be measured and studied without reference to the details of any particular genes. Researchers compare traits such as:
1) chromosome number (karyotype)
2) gene order
3) GC-content
4) genome size
5) codon usage bias
The genome size: the amount of DNA in a haploid complement, reported as the total number of base pairs. Generally, more complex organisms have more DNA, although there are many exceptions.
The gene number: the number of genes in the organism.

Organism                        Genome Size (Mb)   Gene Number
Hepatitis D virus               0.0017             1
Hepatitis B virus               0.0032             4
HIV-1                           0.0092             9
Bacteriophage λ                 0.0485             80
Escherichia coli                4.6392             4400
S. cerevisiae (yeast)           12.155             6300
D. melanogaster (fruit fly)     137                13600
Homo sapiens (human)            3000               20000-30000

The Genome database
The Genome database provides views for a variety of genomes, complete chromosomes, sequence maps with contigs, and integrated genetic and physical maps.
The database is organized in six major organism groups:
1) Archaea
2) Bacteria
3) Eukaryotae
4) Viruses
5) Viroids
6) Plasmids
It includes complete chromosomes, organelles and plasmids as well as draft genome assemblies.

An analogy to the human genome stored on DNA is that of instructions stored in a library:
1) The library would contain 46 books (chromosomes).
2) The books range in size from 400 to 3340 pages (genes),
3) which is 48 to 250 million letters (A, C, G, T) per book.
4) Hence the library contains over six billion letters in total.
5) The library fits into a cell nucleus the size of a pinpoint.
6) A copy of the library (all 46 books) is contained in almost every cell of our body.

Why do we need the database?
1) Democratized Data
The Human Genome Project helped pioneer the now common practice of making scientific data freely available online.
This open model of research has enabled researchers to make discoveries much more quickly than in the past.
Genome Sequencing Milestone Reached: there are now 1000 complete prokaryotic genomes (bacterial and archaeal) available in Entrez Genome. Microbial resources are available for search, retrieval, and analysis of all genomes.

2) Microbial Genomes Resources
This resource presents public data from prokaryotic genome sequencing projects. The sequence collection contains data from finished genomes as well as draft assemblies. The analytical tools include a specialized BLAST for microbial genomes, the newly developed Concise Protein BLAST, annotation tools and many more.

3) Identification of SNPs
Single-nucleotide polymorphisms (SNPs) have been identified in two Salmonella enterica serovar Enteritidis PT13a pathotypes that point to epidemiological trends. There has been an increasing number of infections leading to salmonellosis caused by Salmonella enterica serovar Enteritidis in the United States. Using preliminary genomic sequence data of Salmonella enterica subsp. I serovar Enteritidis PT4 from the Sanger Institute as a starting point, researchers at the Egg Safety and Quality Research Unit (ESQRU) of the U.S. Department of Agriculture have identified a set of potential SNPs that point to differences between two PT13a pathotypes. These differences may be relevant in distinguishing phenotypic traits that have epidemiological consequences.
Genome data can thus be used to identify the genetic basis of some diseases and to begin addressing the underlying problem in the gene.

4) Added DNA to Human-Origins Tool Kit
The Human Genome Project has proven to be a valuable new tool for studying human origins and the history of our species' migrations. We have learned how young a species we are and how similar so many of us are, particularly those populations that came out of Africa 70,000 years ago.
The genetic data largely back up theories derived from archeological and linguistic studies. By working under the assumption that the more closely related different human populations are to one another, the more similar their genomes will be, scientists have been able to roughly chart out the path that humanity took as it spread around the world.

5) Supercharged Genetic Research
The Human Genome Project has helped foster the creation of newer, faster, and cheaper methods of gene sequencing. That is because the rough draft of the human genome that resulted from the Human Genome Project serves as a reference against which the data from new sequencing methods can be compared.

The Next Step: Functional Genomics
The avalanche of genome data grows daily. The new challenge will be to use this vast reservoir of data to explore how DNA and proteins work with each other and the environment to create complex, dynamic living systems. Systematic studies of function on a grand scale (functional genomics) will be the focus of biological explorations in this century and beyond. These explorations will encompass studies in:
1) Transcriptomics: the large-scale analysis of messenger RNAs transcribed from active genes, to follow when, where, and under what conditions genes are expressed.
2) Proteomics: the study of protein expression and function, which can bring researchers closer to what is actually happening in the cell than gene-expression studies. This capability has applications to drug design.
3) Structural genomics: initiatives are being launched worldwide to generate the 3-D structures of one or more proteins from each protein family, thus offering clues to function and biological targets for drug design.
4) New experimental methodologies: experimental methods for understanding the function of DNA sequences and the proteins they encode include knockout studies, which inactivate genes in living organisms and monitor any changes that could reveal their functions.
5) Comparative genomics: analyzing DNA sequence patterns of humans and well-studied model organisms side-by-side has become one of the most powerful strategies for identifying human genes and interpreting their function.
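
As a toy illustration of comparing sequences side by side, the Python sketch below computes two simple measures: the GC-content of a sequence (one of the genome traits listed under genome evolution) and the percent identity between two sequences. It is not part of the original report and is not how genome-scale comparison is actually performed, which relies on alignment tools such as BLAST. The function names and the short example sequences are assumptions made for illustration, and the sequences are assumed to be already aligned and of equal length.

```python
# Illustrative sketch only; function names and sequences are assumptions.


def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def percent_identity(a: str, b: str) -> float:
    """Return the percentage of positions at which two aligned sequences match."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    seq_one = "ATGCATGC"  # hypothetical sequence from one organism
    seq_two = "ATGCATGA"  # hypothetical sequence from another organism
    print(gc_content(seq_one))
    print(percent_identity(seq_one, seq_two))
```

A high percent identity between a human sequence and a sequence from a model organism is one hint that the two may share a function, which is the idea behind the comparative approach described above.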