How Do Replication and Transcription Change Genomes?

Andrey Grigoriev
Director, Center for Computational and Integrative Biology
Rutgers University
What are we going to do?
• Observe effects of fundamental processes
• Estimate their relative contribution
• Link them to genome features
• Analyze nucleotide composition
How do Replication and
Transcription Change Genomes?
Well, do they?
Replication and Transcription
• textbook view
faithful reproduction machinery
• basis for selection
parental DNA → fitness advantages
Replication and Transcription
• paradox
both systematically change genomes
which they faithfully reproduce
• and they leave traces
What is in the sequence?
• The usual
– coding, regulatory regions, exons, introns,
RNAs, etc.
• Biases in nucleotide composition
– Traces of the organism's "lifestyle"
– Links to genome features
Counting nucleotides: GC Skew
$s_w = ([G]-[C])/([G]+[C])$
• Short sequence interval (window) w
• Relative excess of G vs C, in the range [-1; 1]
• Plot against position, as % of genome length [0; 100]
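
A minimal Python sketch of the windowed skew described above. The function names `gc_skew` and `gc_skew_windows`, the choice of non-overlapping windows, and the toy sequence are assumptions made for illustration; they do not come from the slides.

```python
def gc_skew(window: str) -> float:
    """Return ([G]-[C])/([G]+[C]) for one window; 0.0 if it has no G or C."""
    window = window.upper()
    g = window.count("G")
    c = window.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def gc_skew_windows(seq: str, w: int) -> list[tuple[float, float]]:
    """Skew of adjacent, non-overlapping windows of size w.

    Returns (position, skew) pairs, where position is the window
    midpoint expressed as % of the sequence length.
    """
    length = len(seq)
    points = []
    for start in range(0, length - w + 1, w):
        midpoint = start + w / 2
        points.append((100.0 * midpoint / length, gc_skew(seq[start:start + w])))
    return points


# Hypothetical toy sequence: G-rich first half, C-rich second half.
toy = "GGGCA" * 4 + "CCCGT" * 4
profile = gc_skew_windows(toy, w=10)
```

Each pair in `profile` is one point of the skew plot: the x value lies in [0; 100] and the y value in [-1; 1].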
[Figure: GC skew plots for Simian virus 40 and Haemophilus influenzae. X-axis: position, % genome length.]
Cumulative Skew Diagrams
$s_w = ([G]-[C])/([G]+[C])$
$S = \sum_{W} s_w \, w / L$
for W adjacent windows of size w << L
S is an integral of the skew function
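
The running sum can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `cumulative_skew`, the non-overlapping window layout, and the toy sequence are assumptions, not part of the original slides.

```python
from itertools import accumulate


def gc_skew(window: str) -> float:
    """Return ([G]-[C])/([G]+[C]) for one window; 0.0 if it has no G or C."""
    g = window.count("G")
    c = window.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def cumulative_skew(seq: str, w: int) -> list[float]:
    """Running sum of s_w * w / L over adjacent windows of size w."""
    length = len(seq)
    terms = (
        gc_skew(seq[start:start + w]) * w / length
        for start in range(0, length - w + 1, w)
    )
    return list(accumulate(terms))


# Hypothetical toy sequence: G-rich first half, C-rich second half.
toy = "GGGCA" * 4 + "CCCGT" * 4
curve = cumulative_skew(toy, w=10)
```

For this toy input the curve rises while G exceeds C and falls once C exceeds G, so a change of skew sign shows up as a peak or a trough rather than as a noisy zero crossing.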
[Figure: cumulative skew diagram for Simian virus 40, with the replication origin (ori) and replication terminus (ter) marked. X-axis: position, % genome length.]
[Figure: cumulative skew diagram for Haemophilus influenzae, with the replication origin (ori) and replication terminus (ter) marked. X-axis: position, % genome length.]
Genome of Escherichia coli
[Figure: skew diagram with Origin and Terminus marked. X-axis: position, % genome length.]
Genome of Bacillus subtilis
[Figure: skew diagram. X-axis: position, % genome length.]
Genome of Borrelia burgdorferi
[Figure: skew diagram. X-axis: position, % genome length.]
Cumulative Skew Diagrams
• Now widely used to predict ori and ter in
novel and less studied microbial genomes
• Predictions confirmed experimentally
• Constant skews over half-genomes
• ori → ter: G > C
• ter → ori: G < C
• Strand properties change at ori and ter
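
As a rough illustration of how such a prediction can work: if G > C holds from ori to ter on the analyzed strand, the cumulative skew rises from ori to ter, so its minimum suggests ori and its maximum suggests ter. The sketch below assumes exactly that; the function name `predict_ori_ter`, the window layout, and the toy genome are hypothetical.

```python
def gc_skew(window: str) -> float:
    """Return ([G]-[C])/([G]+[C]) for one window; 0.0 if it has no G or C."""
    g = window.count("G")
    c = window.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def predict_ori_ter(seq: str, w: int) -> tuple[float, float]:
    """Guess ori and ter as the minimum and maximum of the cumulative skew.

    Positions are returned as % of genome length. The curve starts at 0.0
    at position 0 and gains one point at the end of every window.
    """
    length = len(seq)
    values = [0.0]
    for start in range(0, length - w + 1, w):
        values.append(values[-1] + gc_skew(seq[start:start + w]) * w / length)

    i_min = min(range(len(values)), key=values.__getitem__)
    i_max = max(range(len(values)), key=values.__getitem__)

    def to_percent(index: int) -> float:
        return 100.0 * index * w / length

    return to_percent(i_min), to_percent(i_max)


# Hypothetical toy genome: G > C only between 25% and 75% of its length.
toy = "CCCGT" * 2 + "GGGCA" * 4 + "CCCGT" * 2
ori, ter = predict_ori_ter(toy, w=10)
```

On real genomes the extrema are only candidates; as the slide notes, predictions of this kind are confirmed experimentally.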
Causes: Selection vs. Mutation
• Properties of encoded proteins
• Regulatory sequences
• Most pronounced in 3rd codon position
• Suggests mutation, not selection pressure
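
One way to check the codon-position observation is to compute the skew separately for the first, second, and third base of every codon in a coding sequence. Third-position changes are often synonymous, so they are less constrained by the encoded protein. The sketch below is illustrative; the function name and the toy coding sequence are assumptions.

```python
def gc_skew(bases: str) -> float:
    """Return ([G]-[C])/([G]+[C]); 0.0 if there is no G or C."""
    g = bases.count("G")
    c = bases.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def skew_by_codon_position(cds: str) -> list[float]:
    """GC skew of codon positions 1, 2 and 3 of an in-frame coding sequence."""
    return [gc_skew(cds[offset::3]) for offset in range(3)]


# Hypothetical in-frame coding sequence, used only to exercise the function.
toy_cds = "ATG" + "GCG" * 3 + "CGC" + "TAA"
per_position = skew_by_codon_position(toy_cds)
```

Applied to all genes on one strand of a real genome, a third-position skew larger than the other two would match the pattern described on the slide.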
[Figure: two panels. Transcription: template DNA, mRNA synthesis. Replication: continuous DNA synthesis, discontinuous DNA synthesis. Label: DNA single-stranded, not protected.]
Most Consistent Explanation
• spontaneous deamination of C or 5-methylcytosine (5-MeC)
– by far the most frequent mutation (rates rise over 100-fold when DNA is single-stranded)
– the mutated base is fixed during the next round of replication
– result: depletion of cytosines relative to guanines
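
A toy simulation can show how this mechanism alone produces a positive skew. It is a deliberately crude sketch: it models only C → T changes on one exposed strand, ignores repair and every other mutation type, and uses an assumed rate, strand length, and number of generations that were chosen for illustration.

```python
import random


def gc_skew(seq: str) -> float:
    """Return ([G]-[C])/([G]+[C]); 0.0 if there is no G or C."""
    g = seq.count("G")
    c = seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def deaminate(strand: str, rate: float, rng: random.Random) -> str:
    """Turn each C into T with probability `rate` (toy model of C deamination)."""
    return "".join(
        "T" if base == "C" and rng.random() < rate else base
        for base in strand
    )


rng = random.Random(0)          # fixed seed so the run is repeatable
strand = "ACGT" * 500           # hypothetical strand with [G] == [C]
skew_before = gc_skew(strand)   # exactly 0.0 by construction

for _generation in range(50):   # assumed number of replication rounds
    strand = deaminate(strand, rate=0.01, rng=rng)

skew_after = gc_skew(strand)    # positive: C is depleted, G is untouched
```

Because guanines are never touched in this model, any loss of cytosines pushes the skew above zero, which is the direction of the bias described in the list above.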
Cytosine Deamination
[Figure: structures of cytosine, uracil, and thymine.]
Replication
• Leading strand exposed in replication
bubble, generation after generation
• Unusual replication models consistent with
the single-strand hypothesis
– adenovirus
– mitochondria
Adenovirus Replication
[Figure: skew diagram with replication origins marked. X-axis: position, % genome length.]
Replication or Transcription
• Leading-lagging switch at ori and ter
• Consistent with replication models
• Transcription often colinear with replication
• Direction often changes at ori and ter
Replication vs. Transcription
[Figure: skew diagram for HPV-16. X-axis: position, % genome length.]
Replication vs. Transcription
• Comparable contribution to skew
• [G]=900, [C]=690 where both run in the same direction: additive effect on skew
• [G]=758, [C]=773 where they run in opposite directions: the effects cancel each other out
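
Plugging the two pairs of counts from this slide into the skew formula ([G]-[C])/([G]+[C]) makes the contrast concrete. The variable names below are illustrative.

```python
def gc_skew_from_counts(g: int, c: int) -> float:
    """Return ([G]-[C])/([G]+[C]) from raw base counts."""
    return (g - c) / (g + c)


# Counts quoted on the slide.
same_direction = gc_skew_from_counts(900, 690)      # 210 / 1590, about 0.132
opposite_direction = gc_skew_from_counts(758, 773)  # -15 / 1531, about -0.0098
```

The same-direction skew is more than ten times larger in magnitude than the opposite-direction one, which is close to zero.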
Genome of Bacillus subtilis
[Figure: skew diagram. X-axis: position, % genome length.]
Diagrams are "jagged"
• Sequence constraints
– amino acid composition, regulatory sequences, etc.
• Sequence inversions
– swap strands and change the skew to its opposite between the borders of the inversion
• Horizontal transfer between species
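
The effect of an inversion on skew can be demonstrated directly: inverting a segment replaces it with its reverse complement, which exchanges G and C counts and therefore flips the sign of the skew inside the inversion borders. The helper names and the toy genome below are assumptions for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_skew(seq: str) -> float:
    """Return ([G]-[C])/([G]+[C]); 0.0 if there is no G or C."""
    g = seq.count("G")
    c = seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase ACGT sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def invert_segment(seq: str, start: int, end: int) -> str:
    """Replace seq[start:end] with its reverse complement (an inversion)."""
    return seq[:start] + reverse_complement(seq[start:end]) + seq[end:]


# Hypothetical G-rich toy genome with an inversion between positions 5 and 15.
genome = "GGGCA" * 4
inverted = invert_segment(genome, 5, 15)

skew_inside_before = gc_skew(genome[5:15])    # 0.5
skew_inside_after = gc_skew(inverted[5:15])   # -0.5
```

Outside the borders the sequence is unchanged, so in a cumulative diagram the inversion appears as a local stretch running against the overall trend.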
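Inversion
[Figure: schematic of a sequence inversion showing both strands (5' and 3' ends) and markers A, B, C, D before and after the inversion.]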
Rearrangements in two sequenced strains of Helicobacter pylori
Colored areas under the curve correspond to inversions and translocations.
cagPAI: pathogenicity island (likely horizontal transfer)
Conclusions
• Analyze nucleotide composition
• Observe effects of fundamental processes
• Link them to genome features
• Estimate their relative contribution
• Start asking your own questions