Project Bioinformatics BB051B 2015-2016
This document can also be found on Blackboard and on the course website
http://swift.cmbi.ru.nl/teach/LRTOM4/
Mutations in the LRTOMT gene
In the final project of this course you will perform a small bioinformatics research project.
You have so far learned how to use SwissProt, UniProt, EMBL, OMIM, PROSITE, PDB, MRS,
BLAST, Clustal, Yasara and Ensembl. Now you have to show that you know how to apply
these tools in the real world. So, use as many of them as possible when answering
the bioinformatics question stated below.
You will do an analysis of a number of DNA sequences found in patients that have a specific
disease. The main goal is to gather as much information as possible about the sequences,
analyze the mutations and write a comprehensive report about them. The report will have the
layout of a research article.
Hannie Kremer spoke about one of her projects: deafness caused by mutations in the
LRTOMT gene. In her seminar she discussed several LRTOMT mutations, but a genetic
disorder can have many hundreds of molecular causes, and we are going to analyze a few
LRTOMT mutations thoroughly. The goal is to explain at the molecular level whether or not
you expect an effect on the phenotype of the patients: do you think they are
(partially or completely) deaf, or do the mutations have no consequences for their hearing?
Your task
o We expect you to write, in teams of two students, a report of at most 9 A4 pages (excluding
appendices). Please give the names of the group members to Karen or Celia or send
the group names to Celia.vanGelder@radboudumc.nl.
o Write the project report in this Word file (the template you can use is given further on
in this file).
o Do not use a two-column layout; use the full width of the page.
o Hand in your report digitally to Celia.vanGelder@radboudumc.nl as a doc(x) file (not
a pdf!).
o Deadline for the report is January 25, 2016, 9:00 hr.
Preparation
o First sit down and think about the research project.
o What is the meaning of the data? What is the research question that you are going to
answer? Where in this project are you going to transfer information?
o Before starting your research you have to make a short (max 1 A4) project plan
outlining the major steps in your approach.
o Discuss this plan with one of the teachers or assistants. Deadline: Jan 7, 2016 (lunch),
but preferably Dec 18, 2015!
The DNA sequences
The DNA sequences that you are going to analyze are:
Sequence 1
AGACATGGGTGGAAAATCACTCCTTTGTCTTTATTAAAGAAACTTAGACCAGACCTGGCAATCAAGGGGCGAGGTACTGG
CCAGGAAGGTGGAGTAGGTTTCAGGCCCTGGGGATTTCAAGTGCAGA
Sequence 2
CCCATGCCCTGCCCGGTGACCCTGGTCACATCCTCACCACCCTGGACCACTGGAGCAGCCGCTGCGAGTACTTGACCCACA
TGGGGCCTGTCAA
Sequence 3
GCTTATTGCCCGAGCCCTGCCCCCTGGGGGTCGCCTTCTTACTGTGGTGCGGGACCCACGCA
Sequence 4
TTCAGGGTCCTGTCCTGTCTAGCCTGGCTTTTGGTTTCCCTCCCCAACAGATCCTAACAACTTTCTTCAACCTGAGTGTCCTC
TATCTTCACGGCAACAGCATCCAGCGC
Sequence 5
CCCGACCCCTCGCTTACAAGCCTTTCGGGAACTTGCCCTATACCCCCGACCCCAGTCACCTTCAGATTTAATTCC
ACAGAAACAGACCCTCCCCTTTAAGGCACCCCCCCCCCCCCCGGCTCCTCCCTCTCAGGCGCCTCTCCTCACAAA
CCTTACCCCCATAGATTCTGCCCTTTC
Sequence 6
GGGCTGCGGATCGAGGAGCAGGCCTTCAGCTACGTGCTCACCCATGCCCTGCCCGGTGACCCTGGTCACATCCTCACCA
Hints for starting up
Main steps in your research will consist of:
1. Finding out the gene and protein to which the mutations belong.
Hint 1: Use BLAT from Ensembl.
Hint 2: If you want, you can translate the sequences to get a hint about whether they are
in coding or non-coding regions. You can use the Transeq tool for this (see under Links on
the course website).
2. Analyze the gene with Ensembl (or another genome browser). What are the mutations at
DNA level and where are they located in the gene?
3. Try to predict the effect of the mutations on the phenotype of the individuals carrying
them.
a. For mutations that influence the protein sequence you can analyze the 3D
structure of the protein or, in case a 3D structure is not available, transfer
information from a homologous protein sequence that has a 3D structure
available (we call that protein the template). If you have found one, look at
the structure and try to understand what is going on.
b. For mutations that occur in other regions you will have to come up with other
solutions, using the bioinformatics knowledge you have gained so far.
Think about:
i. Intron-exon structure.
ii. Known variations in this gene. Are your variations new or have they
already been discovered?
iii. Regulatory regions in this gene.
iv. Comparison with other organisms, e.g. rodents. This can help in
finding out if you can use e.g. mouse as a model organism to study
deafness.
v. Everything else that you find interesting and important to mention
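The translation mentioned in Hint 2 can also be tried offline. The following is a minimal Python sketch, not a replacement for Transeq: it translates a DNA sequence in all six reading frames using the standard genetic code. The function names (reverse_complement, translate, six_frames) are illustrative choices made for this sketch, and unknown codons are shown as X by assumption.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(dna.upper()))


def translate(dna, frame=0):
    """Translate dna starting at offset frame (0, 1 or 2); '*' marks a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        # Codons containing N or other symbols are reported as X (assumption).
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


def six_frames(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to translations."""
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna, offset)
        frames[f"-{offset + 1}"] = translate(rc, offset)
    return frames


if __name__ == "__main__":
    # Sequence 3 from this document.
    seq3 = "GCTTATTGCCCGAGCCCTGCCCCCTGGGGGTCGCCTTCTTACTGTGGTGCGGGACCCACGCA"
    for label, protein in six_frames(seq3).items():
        print(label, protein)
```

A frame without stop codons (*) is only a hint that the fragment may be coding; the position in the gene still has to be confirmed with BLAT and the genome browser.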
General hints & remarks for the report:
1. General remarks:
o Choose carefully what information is relevant to show, and how you want to
show it. Use figures, tables, pictures etc. to illustrate your results. You can for
instance in the Introduction paragraph show schematically the domains in a
protein sequence.
o Don’t make your text a “bullet list” of results where the reader has to draw their own
conclusions. Take the reader by the hand and lead them through your text
as a story.
o Avoid constructs like “We ran BLAST to… and then we did this and then we did
that”, etc. Write neutral texts: “Searches for homologs were performed with
BLAST (version etc.)”.
o Try to be as clear and specific as possible. When talking about a certain amino
acid residue or mutation mention the protein involved, the residue name and the
residue number. Do not use words like “mutation number 1 has no effect”, but
rather “the A127P mutation has no effect”.
o Remember to use amino acid numbering related to the protein you are studying.
The wild-type (wt) protein numbering can be used as reference.
o Try to make a good overview (e.g. a table) with the mutations. It is impossible to
write a clear story without putting the correct residue numbers to the respective
amino acids or nucleotides.
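When compiling the overview table, the mutation labels can be checked mechanically. The sketch below is a hedged illustration in Python: it parses labels written as wild-type residue, residue number, mutant residue (as in A127P) and prints a plain-text table sorted by residue number. The names parse_mutation and mutation_table, the column layout, and the labels L45F and Q300* are hypothetical placeholders; they do not refer to real LRTOMT variants.

```python
import re

# One-letter to three-letter amino acid codes; '*' denotes a stop codon.
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
    "*": "Ter",
}

# Wild-type residue, residue number, mutant residue (or '*' for a stop).
MUTATION_RE = re.compile(r"^([ACDEFGHIKLMNPQRSTVWY])(\d+)([ACDEFGHIKLMNPQRSTVWY*])$")


def parse_mutation(label):
    """Split a label such as 'A127P' into its parts; raise ValueError if malformed."""
    match = MUTATION_RE.match(label.strip())
    if match is None:
        raise ValueError(f"not a valid mutation label: {label!r}")
    wild_type, position, mutant = match.groups()
    return {
        "label": label.strip(),
        "wild_type": THREE_LETTER[wild_type],
        "position": int(position),
        "mutant": THREE_LETTER[mutant],
    }


def mutation_table(labels):
    """Return a plain-text table of the mutations, sorted by residue number."""
    rows = sorted((parse_mutation(label) for label in labels),
                  key=lambda row: row["position"])
    lines = [f"{'Mutation':<10}{'Position':<10}{'Wild type':<11}{'Mutant':<8}"]
    for row in rows:
        lines.append(
            f"{row['label']:<10}{row['position']:<10}"
            f"{row['wild_type']:<11}{row['mutant']:<8}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # A127P is the example used in this document; the other labels are made up.
    print(mutation_table(["Q300*", "A127P", "L45F"]))
```

A label such as "mutation number 1" is rejected with a ValueError, which makes unclear labels easy to spot before they end up in the report.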
2. Figures and tables:
o should be functional, comprehensive, and nearly self-explanatory
o should be numbered and you must refer to them in the main text, otherwise no
one will read them.
o should always have a title and a legend explaining what is shown in those
figures/tables.
o should not contain things that have nothing to do with the goal of the figure
o should contain labels when relevant
3. Copyright:
You may copy at most 200 words verbatim from any source, provided you put
them between double quotes and give a reference to the source from which you
copied them.
TEMPLATE FOR REPORT
Title

The report has a good title describing the topic of the study.
Authors

List the authors here including student numbers.
Abstract

The abstract will consist of a few sentences (max 6) where you summarise the main aspects of the project:
Sentence 1 summarises the research question being solved in this project.
Sentence 2 summarises your approach. Please don’t mention the tools used here (do not say “we used
Yasara, BLAST and CLUSTAL to investigate…”) but describe the approach.
Sentence 3 lists the main results/conclusion. Be specific here: do not use words like “mutation number 1”
but use the amino acid name and number to indicate a mutation.
And that should normally be enough. If really needed, you can add a fourth sentence summarising some
discussion points.
Often, you will write the abstract after you have written the complete report, because it can be easily
distilled from the report text.
1. Important: Do not put literature references, tables or figures in the Abstract.
2. Do not use SwissProt and PDB codes in the abstract! Describe the protein by its name and species,
not by its code in a database (it goes without saying that you use the proper names for the genes
and proteins involved. Be precise!)
Introduction (maximally 2.5 A4)

The introduction should explain the question and provide the background information needed for
somebody of your own skill-level to understand what you have done to answer that question.
The things really needed in the introduction are:
1. The molecule and its role in biology: what is it, where does it do what, etc. Describe the
genomic environment of the gene, like intron-exon structure, known variations in the gene,
known regulatory regions. Also describe the important functional sites of the protein sequence
and structure, like active site residues, ligands, protein domain structure, known mutations and
anything else you find worth mentioning.
2. The mutations investigated in this study, how were they found, what is their effect. You may want
to put in a table with the mutations, but this can also be part of the Results & Discussion section.
Please use at least one Figure to illustrate an aspect of the biological function that you think is relevant to
show. Be careful to choose a functional picture, not just a nice colourful image.
Methods (max 1 A4)

The report has a short Methods section where you describe the methods and tools used.
In principle, others should find enough information in the Methods section to allow them to repeat your
studies.
Do NOT put results or discussion in the methods section. You also do not discuss here WHY you use
certain tools, only HOW you use(d) them.
The following sentence is an example of a good sentence that you can use: “Searches for homologs in
SwissProt (version…) and PDB (has no version, so you cannot list it) were performed with MRS BLAST. The
PDB file 1ABC was used for all studies, except blabla”.
Put references to the tool and databases in the References section (and not in the Methods text!). In the
case of tools, you can use the website of the tool as reference. Please include the version numbers of the
databases and tools.
Use the right tool for the job. Use BLAST to search for homologs, use CLUSTAL for alignments. So make
sure to use CLUSTAL, and not BLAST (which is a local alignment tool for fast database searching), to make
the crucial alignment in the Results & Discussion section.
When analyzing structure(s) you may want to use the WHAT IF servers
(http://swift.cmbi.ru.nl/servers/html/) to:
1. List the sequence of a PDB file (Under Administration)
2. Analyze protein-cofactor contacts (Under Protein Analysis)
3. Analyze hydrogen bonds
Results & Discussion (maximally 6 A4)

In these sections the results of your study have to be described, including an explanation of the strategy
you used, the steps you took, etc. They also include the final results, i.e. the answers to the biological
question(s).
Analyze all the mutations and describe the analysis results. Think carefully about how to report for
mutations in the coding region (you will talk about amino acids, structures, protein sequence alignments)
and for mutations in the non-coding regions (you will talk about gene sequences and gene properties).
Important: If you are going to transfer information from one sequence to another, indicate the E-value,
the percentage identity and the length of the alignment, and make clear whether you trust the quality of the
alignment and are therefore allowed to transfer information. Also describe carefully from which sequence you
are transferring information and to which sequence you are transferring it.
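The percentage identity and the alignment length can be verified by hand from the aligned sequences. A minimal Python sketch follows, assuming two rows of equal length taken from one alignment, with '-' marking gaps. Tools differ in the denominator they use for the percentage; this sketch divides by the number of columns in which both sequences have a residue, which is an assumption made here and should be stated in the report if used. The E-value cannot be derived this way and has to be taken from the BLAST output. The function name alignment_stats and the toy sequences are hypothetical.

```python
def alignment_stats(row_a, row_b, gap="-"):
    """Count alignment columns, ungapped columns and identical residues.

    row_a and row_b are two rows of the same alignment (equal length).
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    aligned = 0    # columns where both rows have a residue
    identical = 0  # columns where both rows have the same residue
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == gap or b == gap:
            continue
        aligned += 1
        if a == b:
            identical += 1
    percent = 100.0 * identical / aligned if aligned else 0.0
    return {
        "columns": len(row_a),
        "aligned": aligned,
        "identical": identical,
        "percent_identity": percent,
    }


if __name__ == "__main__":
    # Toy rows for illustration only; not real protein sequences.
    stats = alignment_stats("MKT-AL", "MRTGAL")
    print(stats)
```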
As a minimum requirement: Include a Figure showing the crucial protein sequence alignment used for
transfer of information! This figure belongs here and not in the Appendix.
When showing an alignment:
1. you should use a monospaced (fixed-width) font, e.g. Courier, otherwise the amino acids will not line up
correctly.
2. make sure to have the names of the proteins in the alignment and not names like “sequence 1”
etc.
3. Think of ways to highlight important amino acids in your alignment. Use colours, numbers, boxes,
labels, arrows, etc. Be creative. Also put the residue number of important amino acids in the
alignment.
4. skip useless data. If an alignment has been made to only find out if two sequences are homologs,
then you should not put the alignment in the report. But if an alignment was made to transfer
information from one sequence to another, an alignment is needed.
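Locating a numbered residue in a gapped alignment is easy to get wrong by hand, because gaps shift the columns. The Python sketch below is one hedged way to do it: it finds the alignment column that holds a given residue number in the first (reference) row and prints a marker line under that column. The function names and the toy rows are hypothetical, and the names query_protein and template_protein are placeholders that should be replaced by the real protein names.

```python
def column_of_residue(row, residue_number, start=1, gap="-"):
    """Return the 0-based alignment column that holds residue_number in row.

    start is the residue number of the first residue in the row.
    """
    number = start - 1
    for column, char in enumerate(row):
        if char != gap:
            number += 1
            if number == residue_number:
                return column
    raise ValueError("residue number not present in this row")


def format_alignment(named_rows, mark_residue, start=1):
    """Format (name, aligned_sequence) pairs and mark one reference residue.

    The first pair is treated as the reference for residue numbering.
    """
    _, reference_row = named_rows[0]
    column = column_of_residue(reference_row, mark_residue, start)
    width = max(len(name) for name, _ in named_rows) + 2
    lines = [f"{name:<{width}}{row}" for name, row in named_rows]
    # Marker line: a caret under the column, followed by residue and number.
    lines.append(" " * (width + column) + f"^ {reference_row[column]}{mark_residue}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy rows for illustration only.
    rows = [("query_protein", "MK-TAL"), ("template_protein", "MKGTAL")]
    print(format_alignment(rows, mark_residue=3))
```

The output only lines up when it is shown in a font in which every character has the same width.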
Be systematic. If you describe the mutations, name and describe them in the same order as you mentioned
them in the introduction (if you did). Feel free to add small but clear 3D pictures for the mutations. If
possible, combine two mutations in one picture.
When showing structures/ parts of structures:
o Don’t make pictures because they are nice, but because they make something clear to the reader
that you think is important to get clear
o Remove from pictures everything that does not add information useful for the point you want to
make
o In detailed pictures make sure to display the side chain atoms of the amino acids. In close-up
pictures we want to look at atoms, not at ribbons!
o When making pictures of molecules, use YASARA with a white background.
o Remove hydrogens if showing them is not relevant.
o Only label atoms or residues that you discuss in the text.
o Do not zoom in too much, do not zoom out too much, think about what you want to show
o Use colours only when needed to show something or to help the reader find a residue, atom,
interaction, etc., and not because they make the picture nicer.
Conclusion (maximally 1/2 A4)

This is a short section containing the overall conclusion of your research. You can also talk about things
like potential weaknesses in your study, suggestions for follow-up research and anything else you think
should be discussed.
Acknowledgements

If you want you can add Acknowledgements.
References

Here you give a list of references. These are typically both:
1. research papers
2. hyperlinks (to tools, databases & webservers).
Make sure that you cite the references in the text of the report.
Please choose one method of referencing and use this method for all references.
For research articles: Please give the full literature reference and not only a PubMed link! If you want you
can also add the URL (e.g. of PubMed) for the paper, but it should be preceded by the full literature reference!
Example for reference to a research paper:
1. Joosten RP, Joosten K, Cohen SX, Vriend G, Perrakis A. Automatic rebuilding and optimization of
crystallographic structures in the Protein Data Bank. Bioinformatics 27:3392-8 (2011).
Example of reference to tool or database:
4. http://www.ensembl.org/, Release 69, consulted 16-12-2012
You also have to put a reference with a figure if you borrowed the picture from a paper.
There is more out there than Wikipedia! Use more sources to gather information, otherwise it will cost you
points.
Appendices

Feel free to use Appendices but choose carefully what to put in the main text, what to put in an Appendix,
and what not to put in at all because you can also describe it in two sentences instead of showing a large
computer output.
Appendices contain additional information that is not required to follow the flow of the story you tell in
the report.
If you use Appendices make sure to refer to them in the main text, otherwise they will not be read at all.