Using Bioinformatics to Develop and Test Hypotheses

Is it E. coli O157:H7? Using Bioinformatics
to Develop and Test Hypotheses
Joanna R. Klein, Northwestern College, St. Paul, MN
Introduction
Bioinformatics is used extensively by researchers and is an area that
students need to become competent in, especially considering rapid
advances in genome sequencing projects. Just as in any inquiry-based
lab, bioinformatics is most meaningful when students learn the tools
while using them to test hypotheses. With this goal in mind, an activity
was designed for students to learn how to use some specific
bioinformatics tools both in developing a hypothesis and then in testing
whether the hypothesis is correct.
Description of Activity
This activity takes a case study approach in which students are asked to
design a PCR-based diagnostic test for E. coli O157:H7 by identifying a
gene that is specific to this pathogenic strain. To do this, students are
provided a set of unknown gene sequences that they identify by
performing BLAST searches at NCBI. They review the function of the
gene products and they develop hypotheses about which might be
unique to O157:H7. They then test their hypotheses by using the
Integrated Microbial Genomes (IMG) database to search specific
bacterial genomes for each gene.
Step 1: Learn about PCR and Gel Electrophoresis
Case Study Scenario: Elizabeth and Colin, two novice scientists, were
asked to test E. coli samples from Lake Johanna to determine if any are
the pathogenic strain O157:H7. How should they go about this task?
They have heard that PCR is often used in bacterial identification, but
they don’t know much about how it works.
As a pre-lab assignment, students are directed to go through virtual labs
on PCR and gel electrophoresis (Figure 1) and to read a section in their
textbook about the use of PCR in clinical diagnosis. They then answer a
set of questions on this material.
Figure 1: Learn Genetics Virtual Labs
http://learn.genetics.utah.edu/content/labs/pcr/
Step 2: Use BLAST to determine the identity of unknown
sequences
Case Study Scenario: Now that Elizabeth and Colin have a better
understanding of PCR, they need to decide how to apply the technique to
their problem. See if you can help them out! They are hoping to use PCR
to amplify a gene that is present in O157:H7, but not in other strains of E.
coli. But what specific gene should they look for?
Their research supervisor provided them with 4 sets of primers that they
could potentially use. Each set has two primers – an upstream primer
and a downstream primer – to amplify a specific gene. They were given
the nucleotide sequence that each set amplifies, which is found in
Appendix A; however, the name of the gene is missing from the file, and
their supervisor unfortunately cannot be reached.
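The way a primer pair defines the amplified region can be sketched in a few lines of Python. This is only an illustration, not part of the original activity: the template and primer sequences are made up, the function name `in_silico_pcr` is hypothetical, and the sketch assumes exact primer matches on a single linear template.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def in_silico_pcr(template, upstream_primer, downstream_primer):
    """Return the region the primer pair would amplify, or None.

    Assumes exact matches only. The upstream primer matches the top
    strand as written; the downstream primer anneals to the top strand,
    so its reverse complement is what appears in the template.
    """
    template = template.upper()
    upstream_primer = upstream_primer.upper()
    start = template.find(upstream_primer)
    if start == -1:
        return None
    downstream_site = reverse_complement(downstream_primer)
    end = template.find(downstream_site, start + len(upstream_primer))
    if end == -1:
        return None
    return template[start:end + len(downstream_site)]


# Hypothetical sequences, chosen only to make the example easy to follow.
template = "TTTTATGCGTAAAAAAAAAACCGTAGTTTT"
product = in_silico_pcr(template, "ATGCGT", "CTACGG")
print(product, len(product))
```

If either primer site is missing from the template, the function returns None, which mirrors the idea behind the diagnostic test: no target gene, no PCR product.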
Joanna Klein, Ph.D.
Dept. of Biology and Biochemistry
Northwestern College
3003 Snelling Ave. N.
St. Paul, MN 55113
651-286-7468
jrklein@nwc.edu
Students are emailed a file containing the sequences of 4 unknown
genes. These 4 genes were chosen because each is expected to be
distributed differently among E. coli strains and because students have
previously been exposed to them directly or indirectly, so prior material
is reinforced. Students determine the name of the product encoded by each
sequence by doing a blastx search at NCBI (Figure 2).
Unknown genes:
1. Glyceraldehyde 3-phosphate dehydrogenase (GAPDH)
2. Cytochrome c oxidase
3. Tryptophanase
4. Shiga toxin subunit A
Figure 2: Tryptophanase BLAST results. A blastx search was done with the student unknown
sequence 3. The top matches are all to tryptophanase genes from various organisms so the
probable identity of the gene is evident.
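A blastx search translates the nucleotide query in all six reading frames and compares each translation against a protein database. The translation step can be sketched as follows; this is a minimal illustration using the standard genetic code, not the BLAST implementation, and the short query sequence is made up.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X', stops '*'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return translations for frames +1, +2, +3 and -1, -2, -3."""
    frames = {}
    for strand, strand_seq in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


# Hypothetical query, far shorter than a real gene.
for frame, protein in six_frame_translations("ATGGCCTAA").items():
    print(frame, protein)
```

Only one of the six frames normally gives a long translation without stop codons, and that is the frame whose protein matches dominate the blastx results.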
Step 3: Develop a hypothesis about which gene is
specific to E. coli O157:H7
Case Study Scenario: Elizabeth and Colin have some idea of the
function of each of these genes based on their previous course work
and make a prediction as to what species of bacteria would contain
each of these genes.
Students review or research the function of each gene and write down
whether they predict that the gene would be:
found in all species of bacteria
found in all E. coli
found in just E. coli O157:H7
absent from E. coli
They write down an appropriate hypothesis to test.
Step 4: Test the hypothesis using IMG
Case Study Scenario: Elizabeth and Colin have heard of a useful database,
Integrated Microbial Genomes (IMG), where they might be able to test their
hypothesis. They plan to search several genomes to determine if they
contain the gene sequences. Which genomes should they search?
Students are asked to make a list of genomes to search and then search
each for the four different genes at IMG (Figures 3 & 4).
Students write down a conclusion regarding their hypothesis.
Figure 4: Gene search at IMG. Four bacterial genomes were selected, including E. coli K12 DH1, E. coli
O157:H7 Sakai and P. aeruginosa PAO1. The genomes were queried with the gene names
“glyceraldehyde”, “cytochrome c oxidase”, “tryptophanase” and “shiga toxin”. Results show that
shiga toxin is the only gene specific to O157:H7.
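The logic of comparing search results against the predictions from Step 3 can be sketched as a small presence/absence table. The hit sets below are an assumption made for illustration (they reflect what one would typically expect for these genes), not a transcript of the IMG output, and the function names are hypothetical.

```python
GENOMES = ["E. coli K12 DH1", "E. coli O157:H7 Sakai", "P. aeruginosa PAO1"]
TARGET = "E. coli O157:H7 Sakai"

# Assumed search hits per gene name; illustrative only, not actual IMG results.
PRESENCE = {
    "glyceraldehyde": set(GENOMES),
    "cytochrome c oxidase": {"P. aeruginosa PAO1"},
    "tryptophanase": {"E. coli K12 DH1", "E. coli O157:H7 Sakai"},
    "shiga toxin": {"E. coli O157:H7 Sakai"},
}


def classify(hits, genomes=GENOMES, target=TARGET):
    """Map a set of genomes with hits onto the Step 3 categories."""
    e_coli = {g for g in genomes if g.startswith("E. coli")}
    if hits == set(genomes):
        return "found in every genome searched"
    if hits == {target}:
        return "found only in " + target
    if hits == e_coli:
        return "found in every E. coli searched"
    if not hits & e_coli:
        return "absent from E. coli"
    return "mixed pattern"


def specific_genes(presence, target=TARGET):
    """Return the genes whose only hit is the target genome."""
    return sorted(gene for gene, hits in presence.items() if hits == {target})


for gene, hits in PRESENCE.items():
    print(f"{gene}: {classify(hits)}")
print("Candidate diagnostic genes:", specific_genes(PRESENCE))
```

A gene is a candidate for the diagnostic PCR only if its hits are limited to the target genome. With so few genomes searched, "found in every genome searched" supports, but does not prove, a prediction about all bacteria.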
Assessment
Students were given a pre- and post-test on 12 key terms and concepts
covered in the activity and showed significant gains (Table 1).
Table 1 – Assessment results
Course        Number of Students   Pre-test (a)   Post-test (a)   Gain
Spring 2010   7                    6.4            17.1            10.7
Spring 2011   15                   3              14.8            11.8
(a) Average score out of 24 points.
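The gain column is simply the post-test average minus the pre-test average. A short sketch that recomputes it from the reported averages, and also expresses each average as a percentage of the 24-point maximum, is shown below; the variable and function names are illustrative.

```python
MAX_SCORE = 24  # both tests were scored out of 24 points

# Average scores as reported in Table 1.
RESULTS = {
    "Spring 2010": {"students": 7, "pre": 6.4, "post": 17.1},
    "Spring 2011": {"students": 15, "pre": 3, "post": 14.8},
}


def gain(pre, post):
    """Return the post-test minus pre-test difference, to one decimal."""
    return round(post - pre, 1)


def percent_of_max(score, max_score=MAX_SCORE):
    """Express an average score as a percentage of the maximum."""
    return round(100 * score / max_score, 1)


for course, row in RESULTS.items():
    print(
        f"{course}: gain {gain(row['pre'], row['post'])} points "
        f"({percent_of_max(row['pre'])}% to {percent_of_max(row['post'])}%)"
    )
```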
Practical Implementation
Audience – microbiology course with freshman through senior biology majors
Format – The computer-based activity is a useful substitute for a lab on
molecular methods and can be incorporated into online courses.
Figure 3: Integrated Microbial Genomes (IMG) database. http://img.jgi.doe.gov/cgi-bin/w/main.cgi
Acknowledgements
I’d like to thank Shellie Kieke, Concordia University – St. Paul, and Ruth
Gyure, Western Connecticut State University, for piloting this lab in their
microbiology courses and providing feedback and assessment data.