Levy

Molecular Epidemiology

• This is the principal technique of scientific inquiry: by changing the scale of description, we move from unpredictable, unrepeatable individual cases to collections of cases whose behavior is regular enough to allow generalizations to be made.

(S. Levin, 1947)

Epidemiology

• Originally: the study (-ology) of what is upon (epi-) populations of people (demos)

• Now much broader.

• Inquiry into events that take place over very different temporal scales: from identifying organisms that diverged millions of years ago to tracing contacts.

On a Large Scale:

Identity of an infectious agent in an outbreak

Ribosomal RNA

Coding Regions: Highly conserved across widely divergent species.

Transcribed Spacer Regions: Less conserved. Different between closely related species.

Non-transcribed Spacer Regions: Vary between and among species.

Where is it?

• Microscopy:

What is it?

Figure 1. Oocysts of a Cyclospora species (Panel A), Cryptosporidium muris (Panel B), and C. parvum (Panel C) (modified acid-fast stain)

• Molecular Methods:

DNA sequencing. Use “moderately variable” regions, such as the transcribed spacer.

Cyclospora ITS-1

3. Test sensitivity and specificity of primers.

Amplified 36 C. cayetanensis from around the world

Did NOT amplify 20 species with similar pathology, among them Cryptosporidia. Faint band from Babesia gibsoni.
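The primer test above can be summarised as sensitivity and specificity. This is a minimal sketch using only the counts on the slide. Two things are assumptions, not statements from the source: that no C. cayetanensis sample failed to amplify, and that Babesia gibsoni is one of the 20 non-target species. The faint band is scored both ways because the slide does not say how it was called.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of true targets that the primers amplify."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-targets that the primers do not amplify."""
    return true_neg / (true_neg + false_pos)


TARGETS_AMPLIFIED = 36   # C. cayetanensis samples that amplified (from the slide)
TARGETS_MISSED = 0       # assumption: the slide reports no failures
NON_TARGETS = 20         # species with similar pathology (from the slide)

sens = sensitivity(TARGETS_AMPLIFIED, TARGETS_MISSED)

# Assumption: B. gibsoni is one of the 20 non-targets.
spec_lenient = specificity(NON_TARGETS, 0)      # faint band scored as negative
spec_strict = specificity(NON_TARGETS - 1, 1)   # faint band scored as positive

print(f"sensitivity = {sens:.2f}")
print(f"specificity = {spec_strict:.2f} to {spec_lenient:.2f}")
```

With samples this small, both figures carry wide uncertainty, so they are better read as a sanity check than as a validated assay characteristic.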

Assumptions:

• False positives: Less stringent PCR conditions

• False negatives: Overly stringent conditions, combined with unforeseen mutation in primer regions

Zooming in:

• The way to study events on a large scale may not be the way to study events on a small scale

(think physics)

• What is TRUE on one scale may not be true on another scale.

On a Smaller Scale:

Strains: Transmission cycles

GIARDIA

Giardia has Two Heads!

Mycobacterium tuberculosis

•According to the WHO…

•2 Billion infected

•1/10 will become sick

•2.7 million die each year

•TB is the largest single-agent killer of women and of the young.

Mycobacterium tuberculosis

•What is the frequency of exogenous re-infection? With MDR-TB?

•What are the transmission dynamics in endemic countries?

Methods to differentiate strains

• Isoenzymes/allozymes: older methods.

• RFLP

• RAPD/ AP-PCR

• AFLPs

• Sequence surrogates: report nucleotide changes indirectly

Isoenzymes

• Isoenzymes/allozymes: electrophoresis to determine differences in enzymes.

Allozymes detect differences between alleles of a given enzyme. Very weak.

• Detect 60% of changes, and only at enzyme loci.

• Giardia divided into 2 clades: evidence for zoonosis

RFLP

• Restriction fragment length polymorphism

• Usually a true sequence surrogate: a difference in RFLP pattern is ideally due to a change in the nucleotide sequence at one or more restriction sites.

• RFLPs are highly dependent on experimental conditions.

GIARDIA RFLP of Intergenic rRNA Spacer (IGS)

RFLP of the IGS locus differentiates four strains, compared to 2 identified by isoenzyme analysis.

TB-RFLP with Insertion Sequences

IS6110 fingerprinting: use alu to digest the genome. The RFLP itself shows little variation; the question is which fragments contain the insertion element.

• IS6110 is a transposon that jumps around the genome.

• IS6110 is not purely a “sequence surrogate,” it is also a “transposon surrogate”

IS6110

• The ruler is ALIVE

• It is dynamic, and reaches equilibrium more slowly than TB in an outbreak.

IS6110

• # of IS6110 copies in TB genomes varies from 0 to 25. When copy number is low (k<5), there is less change in fingerprints, so contact investigation is very hard.

RAPD or AP-PCR

• RAPD/AP-PCR- Amplify with random primers.

• Sequence surrogate—Tests whether there is a change in the template regions only.

Analysis is the same as that for RFLP.

• Cycles of low stringency lead to amplification of contaminants.

• Highly dependent on reaction conditions.

• Groupings correspond to isozymes.

AFLPs

• AFLPs: digest DNA, ligate to adaptors, PCR

• Don’t need low-stringency steps, less non-specific amplification.

• Same analysis as RFLPs; needs 0.2 to 1 mg of DNA.

• No good for Giardia and other parasites—need too much DNA.

Smaller Still:

Identifying Clonal Lineages: Tracking transmission

Methods:

• Minisatellites

• Microsatellites

• IGS: rDNA intergenic spacer

Microsatellites

• Simple Sequence Repeats

• Repeating motifs of 2-5 bp

• Scattered throughout the genome

• Amenable to PCR and cloning due to small allele size.
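As an illustration of what a microsatellite looks like in sequence data, here is a minimal sketch that scans a DNA string for tandem repeats of 2-5 bp motifs. The function name, the minimum of four repeats, and the example sequence are all hypothetical choices for illustration; real genotyping scores allele sizes from PCR products rather than scanning text.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=5, min_repeats=4):
    """Return (start, motif, repeat_count) for each tandem repeat found.

    The lazy quantifier prefers the shortest motif that repeats, and the
    backreference \\1 requires exact (perfect) repeats of that motif.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical sequence containing an (AC)5 repeat.
print(find_microsatellites("TTGACACACACACAGGT"))  # -> [(3, 'AC', 5)]
```

Two isolates that differ only in the repeat count at such a locus would give PCR products of different length, which is what makes these loci easy to type.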

Minisatellites

• Repeating motifs 10-100 bp

• Analysed with DNA probes specific for a single locus.

TB Spoligotyping

• Spacer Oligotyping

• Direct repeat (DR) locus: 36 bp repeat, frequency varies

• Use primers within the DR to amplify the nonrepetitive spacer sequences (34-41 bp)

• Identify the spacers by hybridization to oligonucleotides of known sequence

– Need sequence to generate the oligos
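The hybridization result can be read as a presence/absence pattern, one position per spacer oligonucleotide. The sketch below compares two such patterns. The string encoding ("1" = spacer detected, "0" = absent), the function name, and the ten-spacer example patterns are hypothetical and not taken from the slides.

```python
def spacer_differences(pattern_a, pattern_b):
    """Return 1-based spacer positions where two spoligotype patterns differ."""
    if len(pattern_a) != len(pattern_b):
        raise ValueError("patterns must cover the same set of spacers")
    return [
        position + 1
        for position, (a, b) in enumerate(zip(pattern_a, pattern_b))
        if a != b
    ]


# Hypothetical isolates typed against ten spacer oligonucleotides.
isolate_1 = "1111011111"
isolate_2 = "1111011001"

print(spacer_differences(isolate_1, isolate_2))  # -> [8, 9]
```

An empty list means the two isolates share a spoligotype. Whether that implies recent transmission depends on how fast the DR region changes, which the next slide flags as unknown.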

Depends on:

Dynamics of DR regions.

Change in sequence in non-repetitive regions.

DR regions-are they at equilibrium?

How often do they repeat?

-Not yet known

Spoligotypes vs. IS6110

# IS6110 copies     # IS6110 types     # Spoligotypes

1                   1                  10

2-5                 7                  8

>5                  80                 52

Spoligotyping can identify M. bovis (BCG vaccine).

Detection and strain differentiation can be done simultaneously, without culture.

Crossing scales:

• DNA sequence of small subunit (SSU) ribosomal RNA (highly conserved) suggests four groups of Giardia. Groups 3 and 4 are only in dogs. 293 bp

• 1-------GCG------_G---------T-------C-------------------

• 2-------ATC-------AC---------G------G-------------------

• 3-------ATC-------AC---------A------G---------T--------

• 4-------ATC-------AC---------A------A----------T----A-

• 1 and 2 are mainly in humans, though some dogs have 3. Groups 2, 3 and 4 are nearly identical

• Is this good evidence against zoonosis?

Models of Nucleotide Substitution

• On a large scale, we can calculate the rate of substitution, then estimate the likelihood of any given substitution and control for confounders (transition-transversion, codon bias, etc.).

• On a small scale, we do not know the rate, the process is nearly random, and confounders may be irrelevant.

Distributions

BINOMIAL:

$\Pr(Y=y) = \frac{n!}{y!\,(n-y)!} P^{y} (1-P)^{n-y}$; Mean = $nP$; Variance = $nP(1-P)$

POISSON:

$\Pr(Y=y) = \frac{\mu^{y} e^{-\mu}}{y!}$; Mean and Variance = $\mu$

Central Limit Theorem: Large number of events → normal distribution

Binomial- coin toss.

Poisson- rare events. Tossing a 100,000 sided die.
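To see why the Poisson is the natural model for rare events, the sketch below compares the two distributions for the die example. It assumes the 100,000-sided die is tossed 100,000 times and counts how often one chosen face comes up; the number of tosses is an assumption, since the slide gives only the number of sides.

```python
from math import comb, exp, factorial


def binomial_pmf(y, n, p):
    """Pr(Y = y) for n independent trials with success probability p."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


def poisson_pmf(y, mu):
    """Pr(Y = y) for a Poisson variable with mean mu."""
    return mu**y * exp(-mu) / factorial(y)


n = 100_000          # assumed number of tosses
p = 1 / 100_000      # chance of one particular face per toss
mu = n * p           # Poisson mean matched to the binomial mean nP

for y in range(4):
    print(y, binomial_pmf(y, n, p), poisson_pmf(y, mu))
```

The two columns agree to several decimal places, and the binomial variance nP(1-P) is almost exactly the mean nP, as the Poisson requires.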

Kimura’s 2 parameter

• For instance, as the rates of transition and transversion become small, Kimura’s 2-parameter model reduces to a one-parameter model

• $K = -\frac{1}{2} \ln\left[(1 - 2P - Q)\sqrt{1 - 2Q}\right]$

For small P and Q, $K \approx P + Q$, where K is the distance per site and P and Q are the fractions of sites with transition and transversion changes.
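A short numerical check of that reduction, as a sketch: the function below evaluates the two-parameter distance, and the loop compares it with the simple sum P + Q. The function name and the example values of P and Q are hypothetical.

```python
from math import log, sqrt


def kimura_2p_distance(P, Q):
    """Kimura two-parameter distance per site.

    P = fraction of sites differing by a transition,
    Q = fraction of sites differing by a transversion.
    """
    return -0.5 * log((1 - 2 * P - Q) * sqrt(1 - 2 * Q))


# Hypothetical (P, Q) pairs, from closely related to more diverged sequences.
for P, Q in [(0.001, 0.0005), (0.01, 0.005), (0.2, 0.1)]:
    print(P, Q, kimura_2p_distance(P, Q), P + Q)
```

For closely related strains the correction for multiple hits is negligible and K is essentially the observed fraction of differing sites; it only matters once sequences have diverged substantially.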

How to Analyze RFLP and other sequence surrogates

• Two sources of information: number of bands, and size of each fragment.

-In practice, it can be difficult to score changes in fragment size. Most studies look only at the presence or absence of a certain pattern.
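One common way to score presence/absence patterns is a band-sharing index, F = 2·n_xy / (n_x + n_y), where n_x and n_y are the band counts of the two isolates and n_xy is the number of shared bands. The slides do not state which index was used, so treat this as an illustrative sketch; the band labels below are hypothetical.

```python
def band_sharing(bands_x, bands_y):
    """Band-sharing index F = 2 * shared / (n_x + n_y), between 0 and 1."""
    x, y = set(bands_x), set(bands_y)
    if not x and not y:
        raise ValueError("at least one isolate must have a band")
    return 2 * len(x & y) / (len(x) + len(y))


# Hypothetical fingerprints: each number labels one scored band position.
isolate_a = [1, 2, 3, 4]
isolate_b = [2, 3, 4, 5]

print(band_sharing(isolate_a, isolate_b))  # -> 0.75
```

Because only presence or absence is used, two bands of slightly different size that are scored as the same position are counted as shared, which is one reason results depend on gel conditions.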

Nei and Li’s model for RFLP

• The expected frequency $a$ of a restriction site with $r$ nucleotide pairs depends on the G+C content of the genome and the G+C content of the restriction site sequence:

$a = (g/2)^{r_1} \, [(1-g)/2]^{r_2}$

• $g$ = G+C content of the genome

• $r_1$, $r_2$ are the numbers of G+C and A+T pairs in the restriction site; $r_1 + r_2 = r$

• $m_t$ = number of nucleotide pairs in genome

• $n = m_t a$, the expected # of restriction sites
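A worked example of these two formulas, as a sketch. The genome size and G+C values are hypothetical round numbers chosen to make the arithmetic easy; the site composition (2 G+C pairs, 4 A+T pairs) matches a 6-bp site such as GAATTC.

```python
def site_frequency(g, r1, r2):
    """Expected frequency a of a restriction site.

    g  = G+C content of the genome (0 to 1),
    r1 = number of G+C pairs in the site,
    r2 = number of A+T pairs in the site.
    """
    return (g / 2) ** r1 * ((1 - g) / 2) ** r2


def expected_sites(m_t, g, r1, r2):
    """Expected number of restriction sites n = m_t * a."""
    return m_t * site_frequency(g, r1, r2)


# Hypothetical genome of 4,096,000 bp with 50% G+C, cut at a GAATTC-like site.
print(expected_sites(4_096_000, 0.5, 2, 4))   # -> 1000.0

# The same A+T-rich site is expected less often in a G+C-rich genome.
print(expected_sites(4_096_000, 0.65, 2, 4))
```

The second call shows why enzyme choice matters for G+C-rich organisms: an A+T-rich recognition site yields fewer, larger fragments.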

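What is the probability that n changes over time t?

Mutations are a Poisson process.

$P = e^{-r \lambda t}$ is the probability that a restriction site remains unchanged, where $\lambda$ = mutation rate per nucleotide, $r$ = length of the restriction sequence, and $t$ = time.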

Nei and Li continued

• $n(t)$ = number of bands at time t = $n_1(t) + n_2(t)$

• $n_1(t)$ = # of sites that do not change

• $n_2(t)$ = number of new sites.

• $E(n) = n_0 P + m_t a (1-P)$, or $E(n_1) + E(n_2)$

• Variance: $n_1(t)$ and $n_2(t)$ are independent

• $\mathrm{Var}[n(t)] = \mathrm{Var}[n_1(t)] + \mathrm{Var}[n_2(t)]$

• $n_1(t)$ is binomial, $n_2(t)$ is Poisson

• $\mathrm{Var}[n(t)] = n_0 P(1-P) + m_t a (1-P)$
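The mean and variance above are easy to evaluate numerically. The sketch below takes n_0 to be the number of sites at time 0, which is an interpretation of the slide's notation, and uses hypothetical values for the mutation rate, elapsed time and genome.

```python
from math import exp


def prob_site_unchanged(r, lam, t):
    """P = exp(-r * lam * t): chance that an r-bp site survives time t."""
    return exp(-r * lam * t)


def band_count_moments(n0, m_t, a, P):
    """Mean and variance of the number of sites n(t) = n1(t) + n2(t).

    n1 (surviving sites) is binomial with n0 trials and probability P;
    n2 (new sites) is Poisson with mean m_t * a * (1 - P).
    """
    mean = n0 * P + m_t * a * (1 - P)
    variance = n0 * P * (1 - P) + m_t * a * (1 - P)
    return mean, variance


# Hypothetical inputs: 6-bp site, 1e-8 mutations per nucleotide per unit time.
P = prob_site_unchanged(r=6, lam=1e-8, t=1_000_000)
mean, variance = band_count_moments(n0=1000, m_t=4_096_000, a=1 / 4096, P=P)
print(P, mean, variance)
```

When n_0 already equals the equilibrium value m_t·a, the expected band count stays constant while the variance grows with elapsed time, so it is the turnover of bands, not their number, that carries the signal.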

IS6110 is modelled similarly

Transposition is rare—modeled as a Poisson process:

• Prob of at least 1 change = $1 - e^{-kqt}$

• Where k= # of copies of transposon in genome

• And q is the rate of transposition when k=1
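To show how copy number drives fingerprint stability, here is a minimal sketch of that probability, written as 1 - exp(-kqt) on the assumption that the exponent is negative, as a Poisson waiting time requires. The values of q and t are hypothetical; no measured transposition rate is given in the slides.

```python
from math import exp


def prob_fingerprint_change(k, q, t):
    """Probability of at least one transposition event in time t.

    k = number of IS6110 copies, q = transposition rate per copy, t = time.
    """
    return 1 - exp(-k * q * t)


Q_PER_YEAR = 0.01   # hypothetical per-copy rate, for illustration only
YEARS = 5

for k in (1, 3, 10, 25):
    print(k, round(prob_fingerprint_change(k, Q_PER_YEAR, YEARS), 3))
```

Under this model a low-copy strain rarely changes pattern over the course of an outbreak, so unrelated low-copy isolates can look identical, which matches the earlier point that contact investigation is hard when k<5.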

Really Small-New Technology

Genetic marking of drug resistance, or virulence

-Representational Difference Analysis (RDA)

-High-throughput genotyping

-Microarrays

Representational Difference Analysis

• “Cloning the Differences Between Two Complex Genomes”, Lisitsyn, Science, Feb. 1993

• Uses Subtractive and Kinetic enrichment to purify fragments present in one population, but absent in another.

– Basically differential amplification of polymorphic fragments

High-Throughput Genotyping

• Fluorescent labels incorporated into RAPDs, microsatellites and AFLPs

• Can run in ONE electrophoresis lane.

• Result: complicated fingerprints that take into account variation at different levels.

Conclusions

1. The strongest analyses will be those that consider variation on multiple temporal levels.

2. Everyone says their technique is economically feasible for use in endemic countries; no one says how much their technique costs.

3. Stay away from Guatemalan raspberries.
