Introduction to Medical Genetics

Human Genetics
Weibin Shi
Michele Sale
Contact Information
• Shi: ws4v@virginia.edu; 243-9420
• Sale: ms5fe@Virginia.EDU; 982-0368
Recommended textbooks
• Medical Genetics - Jorde, Carey, Bamshad & White. Mosby, ISBN-13: 978-0-323-04035-8
• Human Molecular Genetics - Strachan T, Read A. Garland Science, ISBN-10: 0815341822
Overview of course content
1: Organization of the human genome
2: Genetic variation
3: Patterns of inheritance
4: Population genetics
5: Linkage disequilibrium
6: Genetic epidemiology
7: Applied research in human genetics
Organization of the human genome
Human genome sequence published February 2001
Genes are found in the nucleus and mitochondria
Nuclear genome packaged with proteins to form chromatin
Human chromosomes
• 23 pairs (46 chromosomes)
• 22 pairs: autosomes
• 1 pair: sex chromosomes
• 46,XY: normal male
• 46,XX: normal female
A little more basic terminology
Human genome = nuclear genome + mitochondrial genome
• Nuclear genome: 24 distinct chromosomes (22 autosomal + X + Y); 3,200 Mbp; 25,000 genes
• Mitochondrial genome: 16,569 bp; 37 genes
Human Mitochondrial Genome
Small (16.6 kb) circular DNA
rRNA, tRNA and protein encoding genes (37)
1 gene/0.45 kb
Very few repeats
No introns
93% coding
Genes are transcribed as multigenic (polycistronic) transcripts
Maternal inheritance
What are the mitochondrial genes?
• 24 of 37 genes are RNA coding
  - 22 tRNA
  - 2 ribosomal RNA (12S, 16S)
• 13 of 37 genes are protein coding
  - some subunits of respiratory complexes and oxidative phosphorylation enzymes
Limited autonomy of mitochondrial genome
Complex                     mt-encoded    nuclear-encoded
NADH dehydrogenase          7 subunits    35 subunits
Cytochrome b-c1 complex     1 subunit     10 subunits
Cytochrome c oxidase        3 subunits    10 subunits
ATP synthase complex        2 subunits    14 subunits
Two overlapping genes encoded by the same strand of mtDNA (unique example)
Two independent ATG start codons lie in different reading frames; the second stop codon (TAA) is formed from a terminal TA plus an A from the poly(A) tail
Mitochondrial codon table
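The codon table itself did not survive extraction. As a minimal sketch, the well-known differences between the standard code and the vertebrate mitochondrial code (TGA = Trp, ATA = Met, AGA/AGG = Stop) can be shown by translating the same DNA with both tables. The function name `translate` and the toy sequence are illustrative assumptions, not part of the lecture.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Vertebrate mitochondrial code: four codons are reassigned.
MITOCHONDRIAL = dict(STANDARD)
MITOCHONDRIAL.update({"TGA": "W", "ATA": "M", "AGA": "*", "AGG": "*"})

def translate(dna, table):
    """Translate complete codons of a DNA string; '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(table[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

toy = "ATGATATGAAGA"  # hypothetical sequence: ATG ATA TGA AGA
print(translate(toy, STANDARD))       # MI*R
print(translate(toy, MITOCHONDRIAL))  # MMW*
```

The same four codons read differently in the two compartments, which is one reason mitochondrial genes cannot simply be expressed from the nucleus without recoding.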
Human Nuclear Genome
3,200 Mb
23 (XX) or 24 (XY) linear chromosomes
25,000 genes
1 gene/120 kb
Introns in most genes
1.5 % of DNA is coding
Genes are transcribed individually
Repetitive DNA sequences (45%)
Inherited from both parents
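The gene-density figures quoted for the two genomes follow from dividing genome size by gene count. The sketch below redoes that arithmetic with the sizes and counts given on the slides; the helper name `kb_per_gene` is an assumption for illustration.

```python
def kb_per_gene(genome_bp, n_genes):
    """Average spacing between genes, in kilobases."""
    return genome_bp / n_genes / 1000

nuclear = kb_per_gene(3_200_000_000, 25_000)  # 3,200 Mb and 25,000 genes
mito = kb_per_gene(16_569, 37)                # 16,569 bp and 37 genes

print(f"nuclear: 1 gene per {nuclear:.0f} kb")
print(f"mitochondrial: 1 gene per {mito:.2f} kb")
print(f"density ratio: about {nuclear / mito:.0f}x")
```

With these inputs the nuclear value comes out at 128 kb per gene, so the 120 kb on the slide is best read as a rounded estimate; the mitochondrial value rounds to the quoted 0.45 kb.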
Human Nuclear Genome
In the human nuclear genome, gene-rich regions are separated by gene deserts
Chr. 19 has the highest gene density
Chr. 13 & Y show the lowest gene density
Human genome base content
• 41% GC on average
• 38% GC for Chr. 4 and Chr. 13
• 49% GC for Chr. 19
• Regions with wide swings in GC content (e.g. from 33.1% to 59.3%)
• Gene density correlates with higher GC content
CpG dinucleotide depletion
• Expected frequency is 4.2%
• Observed frequency is five times lower
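The 4.2% expectation follows from the 41% GC content: if C and G are equally frequent (about 20.5% each) and neighbouring bases are independent, the chance of a C followed by a G is 0.205 x 0.205. A minimal sketch under those assumptions, with a hypothetical helper for the observed/expected ratio of any sequence:

```python
def expected_cpg_frequency(gc_fraction):
    """Expected CpG dinucleotide frequency, assuming f(C) == f(G) and independence."""
    f = gc_fraction / 2
    return f * f

def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)

expected = expected_cpg_frequency(0.41)
print(f"expected: {expected:.1%}")                   # 4.2%
print(f"observed (5x lower): {expected / 5:.2%}")
```

A ratio from `cpg_obs_exp` well below 1 indicates CpG depletion; stretches where it approaches 1 are candidates for CpG islands.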
Location of CpG islands in the gene
CpG islands in the regulatory areas of human genes
Human nuclear genome
• Gene density varies widely
• On average, 9 exons per gene
• 363 exons in the titin gene
• Certain genes are intronless
• Largest intron is 800 kb (WWOX gene)
• Smallest introns: 10 bp
• Average 5’ UTR: 0.2-0.3 kb
• Average 3’ UTR: 0.77 kb
• Largest protein: titin, 38,138 aa
Gene density varies substantially between chromosomal regions
Genes vary in size and exon content
INTRONLESS GENES
• Interferon genes
• Histone genes
• Many ribonuclease genes
• Heat shock protein genes
• Many G-protein coupled receptors
• Various neurotransmitter receptors and hormone receptors
Genes within genes
Classical gene families: members exhibit a high degree of sequence similarity
• CS = chorionic somatomammotropin; four placenta-specific genes, primates only
• Albumin family: serum albumin, alpha-albumin, vitamin D-binding protein
Gene families: gene products bearing short conserved amino acid motifs
• DEAD box proteins are involved in mRNA splicing and translation initiation; DEAD box = Asp-Glu-Ala-Asp
• WD proteins take part in a variety of regulatory functions; GH (Gly-His) should lie 23-41 aa from WD (Trp-Asp)
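Short motifs like these can be located in a protein sequence with a regular expression. The sketch below assumes that "23-41 aa distance" means 23 to 41 residues between the GH and WD dipeptides; the pattern names, the helper `find_motifs`, and the toy protein are illustrative only.

```python
import re

# Assumption: 23-41 residues separate the GH and WD dipeptides.
WD_CORE = re.compile(r"GH[A-Z]{23,41}WD")
DEAD_BOX = re.compile(r"DEAD")  # Asp-Glu-Ala-Asp in one-letter code

def find_motifs(protein, pattern):
    """Return (start, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(protein.upper())]

# Hypothetical toy protein: GH, 30 filler residues, WD, then a DEAD box.
toy = "MK" + "GH" + "A" * 30 + "WD" + "LLDEADRS"

print(find_motifs(toy, WD_CORE))
print(find_motifs(toy, DEAD_BOX))
```

A four-letter match such as DEAD will occur by chance in many unrelated proteins, so a hit is only a starting point for family assignment.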
Gene superfamily: proteins that are functionally related in a general sense, but show only weak homology
Functionally similar genes are occasionally clustered, but usually dispersed throughout the genome
Non-coding RNA genes
• Code for functional RNA
• ncRNAs represent 98% of all transcripts in a mammalian cell
• ncRNA can be structural, catalytic, or regulatory
How many genes in the nuclear genome?
~3,000 RNA genes in the nuclear genome (~10% of the human gene count) have not been taken into account in gene counts
Non-coding RNA
• tRNA (transfer RNA): involved in translation
• rRNA (ribosomal RNA): structural component of the ribosome, where translation takes place
• snoRNA (small nucleolar RNA): functional/catalytic in rRNA maturation
• Antisense RNA: gene regulation/silencing
microRNA
• A new class of non-coding RNA gene
• Products are 19-25 nt RNAs
• Precursors are 70-100 nt
• Block translation or result in degradation of target mRNA
Tandem repeats and interspersed repeats
Satellite DNA is repetitive DNA that can be separated by centrifugation
[Figure: equilibrium density gradient centrifugation of sheared DNA in a cesium chloride gradient]
Satellite DNA
Alpha-satellite (centromere DNA)
Microsatellite
Minisatellite
Microsatellite
di-, tri-, and tetra-nucleotide repeats
TGCCACACACACACACACAGC
TGCCACACACACA------GC
TGCTCATCATCATCAGC
TGCTCATCA------GC
TGCTCAGTCAGTCAGTCAGGC
TGCTCAGTCAG--------GC
~10% of the nuclear genome
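The allele pairs above differ only in how many times the repeat unit occurs. A minimal sketch that counts the longest uninterrupted run of a given unit in each allele (alignment dashes are removed first); the helper name `longest_run` is an assumption.

```python
import re

def longest_run(seq, unit):
    """Number of repeat units in the longest uninterrupted tandem run of `unit`."""
    seq = seq.replace("-", "").upper()  # drop alignment gap characters
    runs = re.finditer(f"(?:{re.escape(unit)})+", seq)
    return max((len(m.group()) // len(unit) for m in runs), default=0)

# Allele pairs copied from the slide, with the repeat unit of each pair.
alleles = [
    ("CA", "TGCCACACACACACACACAGC", "TGCCACACACACA------GC"),
    ("TCA", "TGCTCATCATCATCAGC", "TGCTCATCA------GC"),
    ("TCAG", "TGCTCAGTCAGTCAGTCAGGC", "TGCTCAGTCAG--------GC"),
]

for unit, long_allele, short_allele in alleles:
    print(unit, longest_run(long_allele, unit), longest_run(short_allele, unit))
```

For the three pairs this gives 8 vs 5 CA repeats, 4 vs 2 TCA repeats, and 4 vs 2 TCAG repeats, which is the kind of length difference that microsatellite genotyping detects.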
Minisatellites
• 6-64 bp repeating pattern
Repeat: AGGAATTTTT
  1 tgattggtct ctctgccacc gggagatttc cttatttgga ggtgatggag gatttcagga
 61 attttttagg aattttttta atggattacg ggattttagg gttctaggat tttaggatta
121 tggtatttta ggatttactt gattttggga ttttaggatt gagggatttt agggtttcag
181 gatttcggga tttcaggatt ttaagttttc ttgattttat gattttaaga ttttaggatt
241 tacttgattt tgggatttta ggattacggg attttagggt ttcaggattt cgggatttca
301 ggattttaag ttttcttgat tttatgattt taagatttta ggatttactt gattttggga
361 ttttaggatt acgggatttt agggtgctca ctatttatag aactttcatg gtttaacata
421 ctgaatataa atgctctgct gctctcgctg atgtcattgt tctcataata cgttcctttg
α-Satellite repeat
• 171 bp sequence repeat
Interspersed repetitive DNA
SINE (Short interspersed nuclear elements):
• Alu, ~0.3 kb, ~10.7% of human DNA (1,200,000 copies)
• MIR, ~0.13 kb, 3% of human DNA (500,000 copies)
LINE (Long interspersed nuclear elements):
• ~0.8 kb, ~21% of human DNA (~1,000,000 copies)
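The copy numbers quoted for each repeat class can be sanity-checked from the genome fraction and the average element length: copies ≈ fraction x genome size / element length. A rough sketch, assuming the 3,200 Mb nuclear genome size given earlier; `implied_copies` is a hypothetical helper.

```python
GENOME_BP = 3_200_000_000  # nuclear genome size used throughout the lecture

def implied_copies(genome_fraction, mean_length_bp):
    """Copy number implied by a genome fraction and an average element length."""
    return genome_fraction * GENOME_BP / mean_length_bp

# (fraction of human DNA, average length in bp), taken from the slide
repeats = {
    "Alu": (0.107, 300),
    "MIR": (0.03, 130),
    "LINE": (0.21, 800),
}

for name, (fraction, length) in repeats.items():
    print(f"{name}: about {implied_copies(fraction, length):,.0f} copies")
```

The implied values are only order-of-magnitude checks, since the lengths are averages over many truncated copies.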
Chromosomal location of repeats
Pseudogenes
Non-functional copy of a gene
Non-processed pseudogene
• Nonfunctional copies of the genomic DNA sequence of a gene
• Contain exons, introns, and flanking sequences
Processed pseudogene
• Nonfunctional copies of the exonic sequences of a gene
• Reverse-transcribed from an RNA transcript
• No 5’ promoter
• No introns
• Often includes polyA tail
Both include events that make the gene non-functional
• Frameshift
• Stop codons
As many as 20-30% of all genomic sequence predictions could be pseudogenes
We assume pseudogenes have no function, but we really don’t know!
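Both inactivating events named above show up as a premature in-frame stop codon. The sketch below uses a made-up 18-base coding sequence (not a real gene) to show how a one-base deletion or a single substitution moves the first stop codon upstream; all names are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_stop_index(cds):
    """0-based position of the first in-frame stop codon, or None if absent."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i
    return None

# Hypothetical intact reading frame: ATG GCA ATG ACC TGG TAA
functional = "ATGGCAATGACCTGGTAA"

# Frameshift: deleting the base at index 3 shifts the frame to ATG CAA TGA ...
frameshifted = functional[:3] + functional[4:]

# Nonsense mutation: G -> A at index 14 turns TGG (Trp) into TGA (stop).
nonsense = functional[:14] + "A" + functional[15:]

print(first_stop_index(functional))    # 15
print(first_stop_index(frameshifted))  # 6
print(first_stop_index(nonsense))      # 12
```

Real pseudogene annotation compares the candidate against its functional parent gene; a stop-codon scan like this is only the simplest piece of that comparison.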
Human Genome Organization
HUMAN GENOME
• Nuclear genome (3,200 Mb; 25,000 genes)
  - Genes and gene-related sequences (unique or moderately repetitive)
    · Coding DNA
    · Noncoding DNA: pseudogenes; gene fragments; introns, untranslated sequences, etc.
  - Extragenic DNA
    · Unique or low copy number
    · Moderate to highly repetitive: tandemly repeated; interspersed repeats
• Mitochondrial genome (16.6 kb; 37 genes)
  - Two rRNA genes
  - 22 tRNA genes
  - 13 polypeptide-encoding genes