The Genome Database

In the name of God, the Most Gracious, the Most Merciful
Report on:
The Genome Database
Submitted to:
Dr/ Ahmed Hisham
Eng /Ahmed AbdElhady
Eng /Enas Abd El-Fatah
Submitted by:
1) Abdullah Zein El-Abdeen Abdullah
2) Ali Hameed Mothana Algomaee
3) Hussien Hassan Mahdy
4) Mena Hanna Yucoub
5) Mohamed Ibrahim Resk Hashish
6) Mohammed Mustafa Mohamed
Living organisms are classified as:
1) Non-cellular organisms:
o They have no cell organelles.
o Example: viruses.
2) Cellular organisms:
o They have cell organelles.
o Examples: humans, plants, and animals.
Living cells are one of two types:
1) Eukaryotes: cells with a nucleus that contains the genetic material.
2) Prokaryotes: cells with no nucleus, whose DNA lies free in the cytoplasm.
[Figure: a bacterial cell]
Eukaryotes
o Single-celled or multicellular organisms.
o Their cells contain organelles, which are defined as membrane-bound
structures such as the nucleus, mitochondria, chloroplasts, and
endoplasmic reticulum (ER).
o Two eukaryotic cell types are considered here:
1) Animal cells
2) Plant cells
[Figure: a human cell]
Prokaryotes
o Single-celled organisms.
o A prokaryotic cell does not have a nucleus.
o Prokaryotes are divided into bacteria and archaea.
o Prokaryotic cells do not have membrane-bound organelles.
Genome:
The name is a blend of the words "gene" and "chromosome".
It is the sum of the genes of an organism, and hence of its genetic determinants.
Bacteria do not possess a nucleus as eukaryotes do. Instead they have a structure
called a nucleoid, which contains nucleic acid but lacks a nuclear membrane.
The location of the genome depends on the type of organism:
o The genome of a eukaryotic cell is located in the nucleus and in the
mitochondria (and, in plants, in the chloroplasts), which are membrane-bound
organelles.
o The genome of a prokaryotic cell is in the cytoplasm.
Viral genomes are composed of either DNA or RNA, depending on the virus.
Types of Genome in Eukaryotes:
1) The chromosomal (nuclear) genome: located inside the nucleus of the cell in the
familiar form of chromosomes. It is inherited from both parents.
2) The mitochondrial genome: located outside the nucleus, in the mitochondria
within the cytoplasm of the cell. These genes are not part of the nuclear genome.
In humans it is inherited almost exclusively from the mother.
3) The plastome: the DNA found within the chloroplast.
Gene:
A gene is a segment of the genetic material that, in the simplest view, controls
one character or property. Genes are arranged along the chromosome like beads on
a string. Each protein-coding gene specifies a certain protein and hence a
specific function.
Note: A structural gene is one that codes for a protein.
A human gene consists of:
1) Exons: the coding parts of the gene.
2) Introns: the non-coding parts between the exons of the gene, whose functions
are not yet fully understood.
Deoxyribonucleic Acid (DNA):
DNA is a double helix of two strands. Each strand is a polymer of nucleotides
linked to one another by a sugar-phosphate backbone. The two strands are held
together by hydrogen bonds.
A strand is one of the nucleic acid chains that form the DNA.
A DNA chain is built from nucleotides carrying one of four chemical bases:
o adenine (A)
o guanine (G)
o cytosine (C)
o thymine (T)

A DNA chain, also called a strand, has a sense of direction, in which one end is
chemically different from the other. The so-called 5' end terminates in a 5'
phosphate group (-PO4); the 3' end terminates in a 3' hydroxyl group (-OH). This
is important because DNA strands are always synthesized in the 5' to 3'
direction.
The two strands are connected to each other by chemical pairing of each base
on one strand to a specific partner on the other strand. Adenine (A) pairs with
thymine (T), and guanine (G) pairs with cytosine (C). Thus, A-T and G-C base
pairs are said to be complementary.
DNA carries the genetic information.
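The pairing rule above can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequence are hypothetical and not part of the report. It shows that, given one strand read 5' to 3', the partner strand follows from the A-T and G-C rule, and that reading the partner in its own 5' to 3' direction means reversing it.

```python
# Watson-Crick pairing as stated in the text: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the partner base for each position, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand written in its own 5' to 3' direction."""
    return complement(strand)[::-1]


if __name__ == "__main__":
    top = "ATGC"  # hypothetical example sequence, read 5' to 3'
    print(complement(top))          # partner bases, read 3' to 5'
    print(reverse_complement(top))  # partner strand, read 5' to 3'
```

For the example strand ATGC, the partner bases are TACG, which read in the 5' to 3' direction give GCAT.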
Note: If the DNA of the 46 chromosomes in a single (diploid) human cell were
connected end-to-end and straightened, it would have a length of ~2 m.
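The ~2 m figure can be checked with a rough calculation. The sketch below rests on two assumptions: a spacing of about 0.34 nm per base pair (the usual textbook value for the common B form of DNA, not stated in this report) and roughly six billion base pairs per diploid cell (the figure used in the library analogy later in the report). The names are illustrative only.

```python
# Assumption (not from the report): B-form DNA rises about 0.34 nm per base pair.
RISE_PER_BP_NM = 0.34
# Roughly six billion base pairs in one diploid human cell (see the library analogy).
DIPLOID_BASE_PAIRS = 6.0e9


def dna_length_m(base_pairs: float, rise_nm: float = RISE_PER_BP_NM) -> float:
    """Length in metres of a straightened double helix with the given base-pair count."""
    return base_pairs * rise_nm * 1e-9  # convert nanometres to metres


if __name__ == "__main__":
    print(f"{dna_length_m(DIPLOID_BASE_PAIRS):.2f} m")
```

Under these assumptions the result is about 2.04 m, consistent with the ~2 m quoted in the note.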



Ribonucleic Acid (RNA):
RNA is a chain, or polymer, of nucleotides whose strands have the same 5' to 3'
direction as DNA strands.
RNA has many types, including:
1) Ribosomal RNAs (rRNAs)
2) Transfer RNAs (tRNAs)
3) Messenger RNA (mRNA)
4) Small nuclear RNAs (snRNAs)
The protein:
Proteins are long chains built from as many as 20 different kinds of amino acids.
Each cell contains thousands of different proteins, which act as:
1) Enzymes: they make new molecules and catalyze nearly all chemical processes in
cells.
2) Structural components: they give cells their shape and help them move.
3) Hormones: they transmit signals throughout the body.
4) Antibodies: they recognize foreign molecules.
5) Transport molecules: they carry substances such as oxygen.
The genetic code carried by DNA specifies the order and number of amino acids
and, therefore, the shape and function of the protein.
Genetic information flows from DNA to RNA to protein.
Producing a protein involves two main steps:
1) Transcription:
o It is the synthesis of an mRNA copy from a sequence of DNA.
o The primary mRNA transcript is then edited: the introns are removed, the exons
are joined together, and distinctive features are added to each end of the
transcript to make a "mature" mRNA.
2) Translation:
o The process in which the genetic code carried by mRNA directs the synthesis of
proteins from amino acids at the ribosomes.
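The two steps above can be sketched in Python as a toy model. Several simplifications are assumptions of this sketch, not of the report: the gene is given as its coding strand (so the mRNA is the same sequence with U in place of T), exon positions are supplied by hand, the features added to the transcript ends are ignored, and the function names and example gene are hypothetical. The codon table is the standard genetic code.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe_and_splice(coding_dna, exons):
    """Join the exon slices (start, end) of the coding strand and swap T for U."""
    mature = "".join(coding_dna[start:end] for start, end in exons)
    return mature.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon; return one-letter amino acids."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # Hypothetical gene: exon 1 + intron + exon 2
    gene = "ATGGCC" + "GTAAGTTTCAG" + "TGGTAA"
    mrna = transcribe_and_splice(gene, [(0, 6), (17, 23)])
    print(mrna)
    print(translate(mrna))
```

For this made-up gene the mature mRNA is AUGGCCUGGUAA, which translates to the three amino acids M, A, and W before the stop codon UAA.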
Does Everyone Have the Same Genes?
When you look at the human species, you see evidence of genetic variation: there
are immediately recognizable differences in human traits such as hair and eye
color, skin pigment, and height. All people generally have the same genes, but
those genes do not have exactly the same DNA sequence. When a gene's sequence
differs slightly between two individuals, the gene is called polymorphic.
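As an illustration of what "slightly different" means, the sketch below compares the same gene from two individuals position by position and lists the single-base differences. It assumes the two sequences are already aligned and of equal length. The sequences and the function name are made up for the example.

```python
def single_base_differences(seq_a, seq_b):
    """List (position, base_a, base_b) where two aligned sequences differ (0-based)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


if __name__ == "__main__":
    # Hypothetical copies of the same gene in two people
    person_1 = "ATGGCCTGGTAA"
    person_2 = "ATGGCTTGGTAA"
    print(single_base_differences(person_1, person_2))
```

Here the two copies differ at a single position (index 5, C versus T), so this toy gene would count as polymorphic between the two individuals.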
Genome evolution:
Genomes have properties that can be measured and studied without reference to the
details of any particular gene. Researchers compare properties such as:
1) chromosome number (karyotype)
2) gene order
3) GC-content
4) genome size
5) codon usage bias
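Of the properties listed above, GC-content is the simplest to compute: it is the fraction of bases in a sequence that are G or C. A minimal sketch follows, with a hypothetical function name and a made-up example sequence.

```python
def gc_content(sequence):
    """Fraction of bases that are G or C (0.0 to 1.0)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    print(gc_content("ATGGCCTGGTAA"))  # hypothetical sequence
```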
Genome size: the amount of DNA in one haploid complement, reported as the total
number of base pairs. More complex organisms generally have more DNA, although
there are many exceptions.
Gene number: the number of genes in the organism.
Organism                      Genome Size (Mb)   Gene Number
Hepatitis D virus             0.0017             1
Hepatitis B virus             0.0032             4
HIV-1                         0.0092             9
Bacteriophage λ               0.0485             80
Escherichia coli              4.6392             4400
S. cerevisiae (yeast)         12.155             6300
D. melanogaster (fruit fly)   137                13600
Homo sapiens (human)          3000               20000-30000
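One way to read these figures is to compute gene density, the number of genes per megabase. The sketch below uses the genome sizes and gene numbers listed above. For the human genome it takes 25000, the midpoint of the stated range, which is an assumption made only to get a single number. The names are illustrative.

```python
# (genome size in Mb, gene number), copied from the figures above
GENOMES = {
    "Hepatitis D virus": (0.0017, 1),
    "Hepatitis B virus": (0.0032, 4),
    "HIV-1": (0.0092, 9),
    "Bacteriophage lambda": (0.0485, 80),
    "Escherichia coli": (4.6392, 4400),
    "S. cerevisiae (yeast)": (12.155, 6300),
    "D. melanogaster (fruit fly)": (137, 13600),
    "Homo sapiens (human)": (3000, 25000),  # assumed midpoint of 20000-30000
}


def genes_per_mb(size_mb, gene_number):
    """Gene density: number of genes per megabase of genome."""
    return gene_number / size_mb


if __name__ == "__main__":
    for name, (size_mb, genes) in GENOMES.items():
        print(f"{name:30s} {genes_per_mb(size_mb, genes):10.1f} genes/Mb")
```

With these numbers, E. coli carries roughly 950 genes per Mb while the human genome carries fewer than 10, so gene number grows far more slowly than genome size.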
The Genome database
The Genome database provides views of a variety of genomes, complete chromosomes,
sequence maps with contigs, and integrated genetic and physical maps.
The database is organized into six major organism groups:
1) Archaea
2) Bacteria
3) Eukaryotae
4) Viruses
5) Viroids
6) Plasmids
It includes complete chromosomes, organelles, and plasmids, as well as draft
genome assemblies.
An analogy to the human genome stored on DNA is that of instructions stored in a
library:
1) The library would contain 46 books (chromosomes).
2) The books range in size from 400 to 3340 pages (genes),
3) which is 48 to 250 million letters (A, C, G, T) per book.
4) Hence the library contains over six billion letters in total.
5) The library fits into a cell nucleus the size of a pinpoint.
6) A copy of the library (all 46 books) is contained in almost every cell of our
body.
Why do we need the database?
1) Democratized Data



The Human Genome Project helped pioneer the now common practice of making
scientific data freely available online.
This open model of research has enabled researchers to make discoveries much
more quickly than in the past.
Genome sequencing milestone reached: there are now 1000 complete prokaryotic
genomes available in Entrez Genome. Microbial resources are available for search,
retrieval, and analysis of all genomes.
2) Microbial Genomes Resources:
o Present public data from prokaryotic genome sequencing projects. The
sequence collection contains data from finished genomes as well as draft
assemblies.
o The analytical tools include a specialized BLAST for microbial genomes, the
newly developed Concise Protein BLAST, annotation tools, and many more.
3) Identification of SNPs (single-nucleotide polymorphisms):
o SNPs have been identified in two Salmonella enterica serovar Enteritidis PT13a
pathotypes that point to epidemiological trends.
o There has been an increasing number of infections leading to salmonellosis
caused by Salmonella enterica serovar Enteritidis in the United States.
o Using preliminary genomic sequence data of Salmonella enterica subsp. I
serovar Enteritidis PT4 from the Sanger Institute as a starting point, researchers
at the Egg Safety and Quality Research Unit (ESQRU) of the U.S. Department
of Agriculture have identified a set of potential SNPs that point to differences
between two PT13a pathotypes. These differences may be relevant in
distinguishing phenotypic traits that may have epidemiological consequences.
o Genome data can therefore be used to identify the genetic basis of some
diseases, as a first step toward correcting the underlying problem in the gene.
4) Added DNA to Human-Origins Tool Kit



The Human Genome Project has proven to be a valuable new tool for studying
human origins and the history of our species' migrations.
We've learned how young a species we are and how similar so many of us are,
particularly those populations that came out of Africa 70,000 years ago.
The genetic data largely back up theories derived from archeological and
linguistic studies. Working under the assumption that the more closely related
different human populations are to one another, the more similar their genomes
will be, scientists have been able to roughly chart the path that humanity took
as it spread around the world.
5) Supercharged Genetic Research


The Human Genome Project has helped foster the creation of newer, faster, and
cheaper methods of gene sequencing.
That is because the rough draft of the human genome that resulted from the
Human Genome Project serves as a reference against which the data from new
sequencing methods can be compared.
The Next Step: Functional Genomics
The avalanche of genome data grows daily. The new challenge will be to use this vast
reservoir of data to explore how DNA and proteins work with each other and the
environment to create complex, dynamic living systems. Systematic studies of function
on a grand scale (functional genomics) will be the focus of biological explorations in
this century and beyond. These explorations will encompass studies in:
1) Transcriptomics
Large-scale analysis of messenger RNAs transcribed from active genes, to follow
when, where, and under what conditions genes are expressed.
2) Proteomics
The study of protein expression and function, which can bring researchers
closer to what is actually happening in the cell than gene-expression studies. This
capability has applications to drug design.
3) Structural genomics
Initiatives are being launched worldwide to generate the 3-D structures of one or
more proteins from each protein family, thus offering clues to function and
biological targets for drug design.
4) New experimental methodologies
Experimental methods for understanding the function of DNA sequences and the
proteins they encode include knockout studies, which inactivate genes in living
organisms and monitor any changes that could reveal their functions.
5) Comparative genomics
Analyzing DNA sequence patterns of humans and well-studied model organisms
side by side has become one of the most powerful strategies for identifying
human genes and interpreting their function.