Student handout

Name________________________________Class and Period____________________Date________
Partner(s) name(s)___________________________________________________________________
Introduction to Gene Mining Part A: BLASTn-off!
Part A Learning objectives: Use the bioinformatics NCBI Gene and BLASTn tools to search
for a human gene of interest in a plant model.
Evaluate the significance of your search results to see if human and plant genes are similar.
Engage:
Information and Instructions
Recall models that you have studied in school.
Lab Notebook
1. What is the first thing that comes to mind when you think of
the word “model”? Answer this before seeing slide 3.
2. List 3 examples of scientific models: slide 4
3. Why do scientists use models? slide 5
To determine whether a plant might be a useful
model for experimentation, it should have
certain characteristics. Watch the video and
record those.
https://www.youtube.com/watch?v=foHiKrlY9Qc
What underlying principle of biology enables
scientists to study some human diseases in
plants?
Our sample question: Do plants have human
muscle genes?
Compare and contrast human and plant
movement. Watch:
http://www.bbc.co.uk/programmes/p00lx6cl
and
https://www.youtube.com/watch?v=eDA8rmUP5ZM
http://www.OMIM.org is an interesting portal
about human genetic disorders. Open it and
examine “About” and “Entry Statistics” to see
what sorts of genetic information OMIM
provides.
In the OMIM search box, type in any human
genetic disease.
4. Characteristics of the model plant Arabidopsis thaliana are:
Slides 7, 8
5. Why might scientists use plants to study human diseases?
Slides 9, 10
6. Compare and contrast human and plant movement. Slide 11
Human movement
Plant movement
7. How is it possible for a plant to have a version of a human
gene?______________________________________________
8. Summarize types of info you found on OMIM. Slide 12
9. List types of genetic disorders from Entry Statistics. Slide 12
10. What disease name did you enter? What did you find?
Slide 13
What do I know about muscle genes and
proteins involved in movement? Use
Wikipedia, Google, and other open-access
sources to find information. Then try a more
specific source, such as a science journal, and
more specific search engines and portals.
Scientists have performed numerous
experiments to understand what characteristics
are shared by all living organisms or which
characteristics are unique to one domain or
kingdom. Historically, these experiments were
lengthy but the analysis was fairly
straightforward.
Nowadays, scientists are able to rapidly
perform experiments which generate
enormous amounts of data but the data
analysis may take weeks or even years. To
improve analysis of their data, they may make
it accessible in public databases for other
scientists and mathematicians to analyze.
Since plants and animals both move, do they
use the same types of proteins to move? Do
they have the same genes coding for these
proteins?
11. What are names of muscle proteins? How many hits did you find?
How specific was the information? Slides 14 and 15
12. What is an example of a data set that has increased in size
over the past decade? Slide 17 ___________________________
13. What are problems scientists have with “Big Data”? Slide 17
14. Define bioinformatics: Slide 18
15. Of what advantage would it be to a geneticist to use a
bioinformatics approach to study a disease? Slide 19
16. A bioinformatics use or question that interests me (Slide 20)
is __________________________________________because
___________________________________________
17. Make your own hypothesis about whether plants and
humans have the same skeletal muscle protein genes and then
explain your reasoning: (No slide is needed for this question.)
Plants and humans will ____ will not ____ have the same
muscle protein genes because I know that:
Explore the bioinformatics tool BLASTn to find data to test your hypothesis:
Information and Instructions:
Lab Notebook
Which genes or proteins are involved in the biological process
(muscle movement)?
Go to http://www.ncbi.nlm.nih.gov
Use the pull down menu near the top to select Gene. Then
enter: homo sapiens skeletal muscle protein. Click on SEARCH.
1. What information does the result provide? Slide 24
2. Record the Name/Gene ID and Description for the
top 3 results: slide 24
Name/Gene ID Description
By clicking on the Gene Name ACTA1 for actin alpha 1, you will
find its gene page. Use the Summary and the links below to
decide whether alpha actin 1 has functions that would be
shared by a plant.
https://www.youtube.com/watch?v=FzcTgrxMzZk
https://www.youtube.com/watch?v=VVgXDW_8O4U
Scroll down the gene page until you see Genomic regions,
transcripts and products. Find FASTA and click on it. Then
click on GenBank.
In FASTA, copy the entire ACTA1 gene sequence and paste it
into a Word document. Include the top line that begins
“>gi……”. Title the document: Gene sequences.
3. Info I could use to decide whether a plant version of
ACTA1 would have a similar function as the human
gene: slides 25,26
4. What does FASTA show? How is FASTA format
different from GenBank format? Slides 27, 28
5. Describe the acronym, types, and features of
BLASTn. Slide 30,31
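To make the FASTA format from the steps above more concrete, here is a minimal Python sketch, assuming only what the instructions state: a FASTA record has a top line beginning with ">" followed by lines of sequence. The function name parse_fasta and the example record are hypothetical; the sequence is made up for illustration and is not the real ACTA1 gene.

```python
def parse_fasta(text):
    """Split FASTA-formatted text into (header, sequence) pairs."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new ">" line starts a new record; save the previous one first.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        else:
            # Sequence lines are joined into one continuous string.
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up record for illustration only (NOT the real ACTA1 sequence).
example = """>example_id hypothetical description line
ATGTGCGACG
AAGACGAGAC
"""

records = parse_fasta(example)
for header, sequence in records:
    print("Header:", header)
    print("Length:", len(sequence), "nucleotides")
```

If you save your own FASTA text to a file, you could read the file into a string and pass it to the same function.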
One bioinformatics tool used to search for genes in one
organism when a gene is known in another organism is
BLASTn. Go to:
http://blast.ncbi.nlm.nih.gov/Blast.cgi
to learn more about BLASTn.
BLASTn the ACTA1 gene vs the Arabidopsis thaliana genome.
6. Summarize the steps you used with BLASTn, and the
reason for each step, to submit a query for the human
ACTA1 gene against Arabidopsis thaliana. Slides 33-36
Step | Reason for doing that step
1 |
2 |
3 |
4 |
7. The Graphics section of a BLASTn report is shown below.
Explain the significance of the colored blocks (black,
blue, green, pink, and red) and how they relate to the
tracks. Slide 37
What does the solid red bar represent?
____________________________________________
What do the numbers mean?
8. What information is in the Descriptions section?
Slide 38
9. On the Alignment at lower left, label what Q, S, 1,
45, 403 and 447 indicate. Tell what the vertical lines
indicate and what the spaces indicate. Slide 40
When you get an alignment result, should you trust it? How
could you decide whether the result is a true version of the
gene or just a lucky find?
10. What is an alignment score? How is it calculated
for matching 2 sequences that have gaps (insertions or
deletions)? Slide 41
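One way to picture how an alignment score handles gaps is the small Python sketch below. It adds a reward for each matching pair, subtracts a penalty for each mismatch, and charges a cost for opening a gap plus a cost for each gap position. The function name alignment_score and the scoring values are assumptions chosen for illustration; the values BLASTn actually uses depend on the search settings and may differ.

```python
def alignment_score(query, subject, match=1, mismatch=-2,
                    gap_open=-5, gap_extend=-2):
    """Score two already-aligned sequences; '-' marks a gap position.

    The scoring values are illustrative assumptions, not BLASTn settings.
    """
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    gap_in = None  # which sequence the current run of gaps is in
    for q, s in zip(query, subject):
        if q == "-" and s == "-":
            raise ValueError("a column cannot be a gap in both sequences")
        if q == "-" or s == "-":
            side = "query" if q == "-" else "subject"
            if side != gap_in:
                score += gap_open  # charged once when a gap starts
            score += gap_extend    # charged for every gap position
            gap_in = side
        else:
            score += match if q == s else mismatch
            gap_in = None
    return score


# Made-up aligned sequences: 5 matches, 1 mismatch, one gap of length 2.
print(alignment_score("ACGTTGCA", "ACG--GCT"))
```

Try changing the gap costs to see how much a single long gap or several short gaps changes the total.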
For additional information, watch the NCBI webinar on BLAST at
https://www.youtube.com/watch?v=mvjHYMgJDTQ
and answer questions 11-12 (through about 8:30 into the video).
11. What is Query cover? What are its units? Slides 43-45
12. What is the meaning of “Ident”? What are its
units? Slide 45
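The two percentages in questions 11 and 12 can be sketched in Python as follows. This is a simplified illustration, assuming Query cover is the percent of the query's length that falls inside aligned segments and Ident is the percent of alignment positions where the two sequences show the same letter. The function names, sequences, and lengths are hypothetical.

```python
def percent_identity(aligned_query, aligned_subject):
    """Percent of alignment positions where both sequences have the same letter."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(
        1 for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


def query_cover(aligned_ranges, query_length):
    """Percent of query positions inside at least one aligned range.

    Ranges are (start, end) query positions, 1-based and inclusive.
    Overlapping ranges are only counted once.
    """
    covered = set()
    for start, end in aligned_ranges:
        covered.update(range(start, end + 1))
    return 100.0 * len(covered) / query_length


# Made-up example values for illustration only.
identity = percent_identity("ACGTTGCA", "ACG--GCT")
cover = query_cover([(1, 50), (41, 100)], query_length=200)
print("Ident:", identity, "%")
print("Query cover:", cover, "%")
```

Notice that a result can have a high Ident over a short stretch and still have a low Query cover, so the two numbers answer different questions.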
13. A portion of an Alignment section is shown at left.
Label the human query sequence, the aligned subject
sequence, a matching pair of nucleotides and a pair of
nucleotides which does not match. Slide 46
14. What does an Expect (E) value tell you about your
alignment result? What value indicates an acceptable
non-random alignment? Slide 47-48
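The idea behind the Expect value can be sketched with the commonly cited relationship E = m x n x 2^(-S'), where S' is the bit score, m is the query length, and n is the total length of the database searched. The Python sketch below is a rough illustration under that assumption; BLAST adjusts the lengths it uses, so the E values in a real report will not match this calculation exactly. The function name and all numbers are hypothetical.

```python
def expect_value(bit_score, query_length, database_length):
    """Approximate number of alignments this good expected by chance alone."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Made-up numbers: a 1,000-letter query against a 1,000,000-letter database.
weak = expect_value(bit_score=20, query_length=1000, database_length=1_000_000)
strong = expect_value(bit_score=60, query_length=1000, database_length=1_000_000)

print("E value for bit score 20:", weak)
print("E value for bit score 60:", strong)
```

A higher bit score gives a smaller E value, and searching a larger database gives a larger E value for the same score, because more chance matches are possible.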
Use your BLASTn search results to answer questions 16-20.
In the Description section, look at the Query cover for the
Actin 7 gene.
Which Arabidopsis thaliana GENE is most
similar to human ACTA1? Slide 49
_____________________________________________
16. Record the E-value. What does the value indicate
about the alignment of this gene to ACTA1? Slide 49
17. The Query cover for the Arabidopsis thaliana
ACT7 gene is ________%. Explain which part of the
graphic section illustrates that coverage. Slide 45
18. Are there sections of the alignment that have more
consistent matches than others?_________
19. When you compare the alignment with the
graphics, what do you notice? Slide 45 and 48
Under Alignments, click on the Sequence ID to find more about
the aligned Arabidopsis thaliana sequence.
Explain:
20. From the graphics section, propose what the areas
of poorest alignment (shown in thin black lines) might
be in the gene structure:_______________________
Slide 45
21. Use the ACT7 gene information page to record
similar functions of Arabidopsis thaliana ACT7 and
Homo sapiens ACTA1. Slides 50, 51
Before conducting the BLASTn search, you hypothesized about whether plants might have a muscle gene
similar to a human muscle gene.
1. What did the data indicate about your hypothesis?
2. Which data did you use?
3. What additional information would help you determine if the two genes were similar enough to use as a research
model for a particular disease?
4. Why would it be significant for your study of nemaline myopathy for the two genes to be very similar in sequence and
in function?
Extend:
1. Pick one human gene which you think is highly conserved between plants and animals.
2. Follow the procedure you just learned to see if a similar Arabidopsis version exists.
3. Record your info on the scorecard below.
4. Repeat for a gene that you predict is unique to humans.
5. Do this until you have searched for 3 genes you think are conserved between humans and plants and for 3 you think
are unique to humans. Keep score!
Gene Prediction Scorecard
Human gene name | Human Gene ID | Human gene function | Arabidopsis gene name | Arabidopsis Gene ID | Arabidopsis gene function | Will I find a plant version? Predict and explain prediction | Evidence about whether there is or is not a plant version of the gene (E values, function, alignments, etc.) | Correct prediction? (+1 for correct prediction)
 | | | | | | Yes | |
 | | | | | | Yes | |
 | | | | | | Yes | |
 | | | | | | No | |
 | | | | | | No | |
 | | | | | | No | |