ePlant and Biological Modelling - Science and Plants for Schools

ePlant and biological modelling – how and why do
researchers use the software?
Freya Scoates, Department of Plant Sciences, University of Cambridge
Modelling has always been integral to scientific research, because researchers use models to
help understand their theories and make predictions based on them. Large data-sets have
become a common feature of today’s science as a result of technological advances. For
example, recent developments in so-called ‘ultra-high-throughput’ sequencing mean that you
can now sequence 5,000 Mb/day, or 5,000,000,000 base pairs in a day.[1] The entire human
genome consists of only 3,000,000,000 base pairs: this means that it is now possible to
sequence the entire human genome in one day.
The plant Arabidopsis thaliana (an annual in the brassica family) has a relatively small
genome, at only 157 million base pairs. This makes it comparatively simple to use for genetic
research, as it is easy to manipulate. Arabidopsis is now one of the most important model
organisms used in genetics and biology. In contrast, the largest genome yet discovered is that
of Paris japonica, a flowering plant; its genome is 149,000,000,000 base pairs long.[2]
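To put these genome sizes in perspective, the short Python sketch below works out how long the quoted throughput of 5,000 Mb/day would take to read each genome once. The genome sizes and the throughput are the figures given in the text. The function name and the `coverage` parameter are illustrative assumptions: in practice each base is read many times over, so real projects take longer than a single pass suggests.

```python
# Figures quoted in the text, in base pairs.
THROUGHPUT_BP_PER_DAY = 5_000 * 1_000_000  # 5,000 Mb/day

GENOME_SIZES_BP = {
    "Arabidopsis thaliana": 157_000_000,
    "Human": 3_000_000_000,
    "Paris japonica": 149_000_000_000,
}


def days_to_sequence(genome_bp, coverage=1):
    """Days needed to read `genome_bp` bases `coverage` times over.

    `coverage` is a hypothetical parameter added for illustration;
    coverage=1 means every base is read exactly once.
    """
    return genome_bp * coverage / THROUGHPUT_BP_PER_DAY


for name, size in GENOME_SIZES_BP.items():
    print(f"{name}: {days_to_sequence(size):.2f} days for a single pass")
```

On these figures a single pass takes about 0.03 days for Arabidopsis, 0.6 days for the human genome and just under 30 days for Paris japonica.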
These new experimental techniques allow the generation of very large data sets, and
modellers now have to produce new models to cope with this new kind of data.
ePlant was developed as a way for researchers to integrate and interact with these
massive amounts of data, using the model organism Arabidopsis. It allows you to
select a gene and follow its influence over the plant from the genome all the way up to the
whole organism.
What is ePlant?
ePlant is an online tool that allows researchers to visualise genetic data related to Arabidopsis
thaliana, the model plant, in 3D.[3] It is part of a wider initiative which aims to improve
biological models and to work with these vast data sets. ePlant allows you to pick a gene and
view the known information about it in five key ways:
• ‘Homologs and Polymorphisms’ shows the linear genetic sequence of your gene of
interest, along with known polymorphisms and the corresponding amino acid
sequence. The amino acids are colour coded to show their physicochemical
properties.
• ‘Plant Expression’ displays the varying levels at which your gene is expressed
throughout the plant, by mapping data onto a 3D model of Arabidopsis in different
colours.
• ‘Tissue Expression’ shows the relative levels at which your gene is expressed in
different tissues as different coloured 3D images of different plant tissues, such as
stamens and pollen.
• ‘Subcellular Localisation’ displays where in the cell the protein product is found,
mapped onto a 3D model of a plant cell.
• ‘Protein Model’ shows the predicted tertiary structure of the protein product of your
gene of interest. You can then manipulate the diagram to show different colours.

[1] Kircher, M. and Kelso, J. (2010) High-throughput DNA sequencing: concepts and limitations
[Electronic version] Bioessays 32: 524-536. doi: 10.1002/bies.200900181
[2] Pellicer, J., Fay, M.F., Leitch, I.J. (2010) The largest eukaryotic genome of them all? [Electronic
version] Botanical Journal of the Linnean Society 164(1): 10–15. doi: 10.1111/j.1095-8339.2010.01072.x
[3] Fucile, G., Di Biase, D., Nahal, H., La, G., Khodabandeh, S. et al. (2011) ePlant and the 3D Data
Display Initiative: Integrative Systems Biology on the World Wide Web. PLoS ONE 6(1): e15237. doi:
10.1371/journal.pone.0015237

Why do researchers use ePlant?
ePlant helps geneticists and biologists learn about the expression of a gene in Arabidopsis. It
allows researchers to observe the action of the gene on several different scales - from the
primary structure of the protein to the way it is expressed in different plant tissues. Because
Arabidopsis is the model plant, the expression of the gene in Arabidopsis is used as a starting
point for understanding its expression in other plants.
How will modelling tools change in the future?
This type of software, which allows you to look at so many different aspects of gene
expression and organism development, is set to become more popular as our ideas and
perspectives on plant development change and evolve. For example, another programme,
The Computable Plant, has been developed at the University of California, and aims to put
together a whole-systems view of developmental biology in plants. It is constructed to show
how different environmental and genetic factors can jointly influence the biochemistry and
morphology of a plant over the course of its lifetime.[4]
Similar software may also be developed to take advantage of the many new genomes due to
be released during the next few years, including those of several commercially valuable
crop plants: the tomato genome is due to be released some time towards the end of 2011,
while the maize genome is also close to being completed.
Advancing our understanding of the development of these vital plants will be invaluable as
researchers continue to try to improve crop productivity.
What is the future of ePlant?
The computer modelling programme, ePlant, was launched in October 2010 by a team in the
Department of Systems and Computational Biology at the University of Toronto.
At present, only 72% of known Arabidopsis gene sequences are available through ePlant. As
more is discovered about the genes and proteins of Arabidopsis, more data can be included
in the database. The creators are also planning to improve the visualisation of certain plant
organs, such as the roots, so that more detailed images can be produced. Due to the
incredible complexity of protein interactions within cells, our knowledge of different metabolic
pathways is still developing. The developers of ePlant plan to include a new function that
shows users the metabolic pathways in which their gene of interest is involved.
[4] www.computableplant.org, retrieved 11th August 2011.