Only 20 standard amino acids are used to build proteins, but why

Only 20 standard amino acids are used to build proteins, but why exactly nature "chose" these
particular amino acids is still a mystery. One step towards solving this is to explore the
“amino acid space”, the set of possible or hypothetical amino acids that might have been used
instead. New research has used computer models to construct a large database of plausible
amino acids, revealing thousands of amino acid structures that could have been used.
Building blocks
All organisms on Earth employ the same workforce to perform a wide range of essential
biochemical tasks. This workforce is composed of proteins, each of which is constructed from a
long string of amino acids attached to each other. Even for proteins with particularly long
chains of amino acids, there are still only 20 different types of amino acids which are
genetically encoded. These amino acids are essentially the building blocks of life, and the
same 20 standard amino acids have been used in proteins throughout the history of life on
Earth, since the existence of the Last Universal Common Ancestor three to four billion years
ago.
Amino acids all have a similar "backbone" structure, which is the foundation upon which the
acid is built. This backbone is held together via a single carbon atom acting as a bridge to
connect different groups of atoms. Amino acids with a single carbon connector are called
alpha amino acids. However, it is possible to have more than one carbon atom in the bridge:
with two carbons they are called beta amino acids, and so on.
A group of atoms, called a sidechain, is affixed to the backbone, and it is the structure of the
sidechain that differs from one amino acid to the next, creating a staggering amount of
variability.
Of course, amino acids don't just occur in proteins. There are many more that have different
biological functions, and some amino acids are also produced abiotically. Some of these
abiotic amino acids are not exclusive to the Earth. For instance, the Murchison meteorite was
found to be harboring at least 75 amino acids, and it is even thought that the amino acid
glycine might exist in the interstellar medium.
However, abiotic chemistry can still only account for half of the 20 genetically encoded
amino acids, and there are many unanswered questions as to the role amino acids play. Could
extraterrestrial life use a different set of amino acids? Why did life on Earth select those
particular amino acids? What other amino acids could have been selected? These are all open
questions in astrobiology, and one step towards answering them is to gauge the diversity of
the amino acids that could have been used for life on Earth.
The 20 different amino acids will stick together in various formations to form protein.
Credit: Plant & Soil Sciences eLibrary
Defining amino acids
Markus Meringer, Jim Cleaves and Stephen Freeland set about taking this step by attempting
to generate a synthetic map of plausible amino acid structures that are similar in size and
composition to the 20 genetic amino acids. Up until now, modeling these structures has been
hampered by the complexity of generating so many different chemical structures.
However, by taking a different approach to the problem, the scientists were able to draw a
preliminary amino acid map.
They input a molecular formula into a computer program that had the capability to visualize
different amino acid structures based on this formula. However, computing all possible
amino acids is a strenuous task for even the fastest computers. Also, listing every possible
amino acid does not narrow down the ones of interest to astrobiology. Therefore, the main
challenge for the scientists was actually in defining what an amino acid should be, and they
used different methods to do this.
Different variations of amino acids
The way to narrow down the interesting amino acids is to explore the "space" around the 20
genetic amino acids. This can be done by generating multiple variations of each amino acid
by shuffling the atoms around. For instance, an isomer has the same molecular formula but a
different chemical structure, so generating isomers of each amino acid will give the "isomer
space".
This isomer space varies in size for each amino acid, partially depending on how many atoms
there are in the acid. Therefore, the isomer space is largest around tryptophan, the amino acid
with the greatest number of atoms.
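The bookkeeping behind this idea can be sketched in a few lines of Python. This is only an illustration, not the software the researchers used: the helper names (`parse_formula`, `are_isomers`, `total_atoms`) are made up for this example, and only a handful of the 20 amino acids are listed. It checks that leucine and isoleucine, two of the genetic amino acids, share one molecular formula and are therefore isomers of each other, and it finds the amino acid in the list with the most atoms.

```python
import re
from collections import Counter

# Molecular formulas of a few of the 20 genetically encoded amino acids.
FORMULAS = {
    "glycine": "C2H5NO2",
    "alanine": "C3H7NO2",
    "leucine": "C6H13NO2",
    "isoleucine": "C6H13NO2",
    "arginine": "C6H14N4O2",
    "tryptophan": "C11H12N2O2",
}


def parse_formula(formula):
    """Return a Counter mapping each element symbol to its atom count."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


def are_isomers(name_a, name_b):
    """Two molecules are isomers if their molecular formulas are identical."""
    return parse_formula(FORMULAS[name_a]) == parse_formula(FORMULAS[name_b])


def total_atoms(name):
    """Total number of atoms in the named amino acid."""
    return sum(parse_formula(FORMULAS[name]).values())


largest = max(FORMULAS, key=total_atoms)
print(are_isomers("leucine", "isoleucine"))  # True
print(largest, total_atoms(largest))         # tryptophan 27
```

Sharing a formula only says that two molecules sit in the same isomer space. Enumerating every structure in that space needs a dedicated structure generator, which this sketch does not attempt.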
Fuzzy formulas
However, the isomer space is still a lower limit on the number of potential amino acids that
could have been available for use in proteins. The isomer space only probes the area in the
immediate vicinity of each amino acid, rather than reaching out towards its neighbors to
explore the intervening space between formulas. Therefore, the scientists included extra
combinations by considering the minimum and maximum numbers of possible atoms for each
chemical element. The trick that they applied to do this was to use a "fuzzy formula".
The Murchison meteorite has at least 75 amino acids in it. Credit: Wikimedia Commons
This means that instead of telling the software that each chemical element must occur an exact
number of times, the fuzzy formula tells the software to be a bit more vague, or "fuzzy", so
that the element can have various numbers of atoms. For example, oxygen could be specified
as a range from 2 to 4, so that the program would search for solutions that included 2, 3 or 4
oxygen atoms.
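To make the idea concrete, here is a minimal Python sketch of how a fuzzy formula expands into ordinary molecular formulas. It is an illustration under stated assumptions rather than the researchers' method: the function names are invented for this example, and the carbon, hydrogen and nitrogen ranges are hypothetical. Only the oxygen range of 2 to 4 comes from the example above.

```python
from itertools import product


def expand_fuzzy_formula(fuzzy):
    """Yield every concrete formula (element -> count) allowed by the ranges."""
    elements = list(fuzzy)
    ranges = [range(low, high + 1) for low, high in (fuzzy[el] for el in elements)]
    for counts in product(*ranges):
        yield dict(zip(elements, counts))


def to_string(formula):
    """Write a formula such as {'C': 2, 'H': 5, 'N': 1, 'O': 2} as 'C2H5NO2'."""
    return "".join(
        f"{el}{n if n > 1 else ''}" for el, n in formula.items() if n > 0
    )


# Hypothetical ranges (min, max) per element; only oxygen's 2..4 is from the text.
fuzzy = {"C": (2, 3), "H": (5, 7), "N": (1, 1), "O": (2, 4)}

formulas = [to_string(f) for f in expand_fuzzy_formula(fuzzy)]
print(len(formulas))  # 18, i.e. 2 * 3 * 1 * 3 combinations
print(formulas[:3])   # ['C2H5NO2', 'C2H5NO3', 'C2H5NO4']
```

Each of these formulas is just a list of atom counts. Not all of them correspond to a chemically sensible amino acid, and each valid one still has its own set of isomers to be generated, which is where most of the computing effort goes.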
Using this fuzzy formula uncovered a treasure trove of additional amino acid combinations.
However, a single fuzzy formula can only be used to explore the space around 15 of the
amino acids. A single formula that can include all 20 is still too much for current computing
power to handle.
Biochemistry's palette
The next step was to try to explore the amino acid space beyond the isomers while
including the five that had been neglected in the previous step. This meant that multiple fuzzy
formulas had to be used, which in turn required classifying the genetic amino acids
into ten different groups.
"There are a lot of ways one could classify the coded amino acids according to functional groups
and properties," said Jim Cleaves. "But if you stuck to just using the functional groups
observed in biology and computationally poked around with that chemical diversity, it
wouldn't be nearly as wide as what we came up with, and it's clear that biochemistry had a
huge palette to play with during evolution."
Using ten fuzzy formulas proved to be the most successful way of exploring the amino acid
space. Not only does this method require less processing time than using one fuzzy formula,
but it also has the advantage of including variations of all of the genetic amino acids.
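A rough way to see why several narrow fuzzy formulas can be cheaper than one broad one is to count how many concrete molecular formulas each approach has to consider. The Python sketch below does this for three made-up groups. The groups, their ranges and the function names are all hypothetical and are not the ten classes used in the study.

```python
from math import prod


def formula_count(fuzzy):
    """Number of concrete molecular formulas a fuzzy formula allows."""
    return prod(high - low + 1 for low, high in fuzzy.values())


def bounding_formula(groups):
    """Smallest single fuzzy formula that contains every group's formula."""
    elements = {el for group in groups for el in group}
    return {
        el: (
            min(group.get(el, (0, 0))[0] for group in groups),
            max(group.get(el, (0, 0))[1] for group in groups),
        )
        for el in elements
    }


# Hypothetical groups for illustration only (not the study's ten classes).
groups = [
    {"C": (2, 4), "H": (5, 9), "N": (1, 1), "O": (2, 3)},
    {"C": (5, 6), "H": (9, 14), "N": (2, 4), "O": (2, 2)},
    {"C": (3, 5), "H": (7, 11), "N": (1, 1), "O": (2, 2), "S": (1, 1)},
]

single = bounding_formula(groups)
print(formula_count(single))                  # 800
print(sum(formula_count(g) for g in groups))  # 81
```

In this toy case the single all-covering formula admits about ten times as many molecular formulas as the three targeted ones combined, most of them far from any of the starting groups. The real cost lies in generating structures for each formula, so the practical difference can be larger still.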
Cartography of amino acids
The number of amino acid structures generated surpasses all previous estimates. Using the
method with the single fuzzy formula produced 120,000 plausible structures, and using ten
fuzzy formulas narrows this down to a more biologically relevant set of nearly 4,000 amino
acids. This shows that there was a staggering number of options available that could have
been used for building the genetically encoded amino acid set, and yet there are only 20.
The “isomer space” shown on the left only explores the space immediately around the amino
acids. The second figure shows that using the single “fuzzy formula” explores a much wider
space, but cannot account for all amino acids. The final figure shows the amino acid space
when multiple fuzzy formulas are used. Reprinted with permission from Meringer et al. (2013).
Copyright 2013 American Chemical Society.
They compared the output of both methods to databases of biological alpha amino acids
beyond the 20 genetic ones, as well as to amino acids found in carbonaceous meteorites.
Many of the amino acids present in the computer library also occur in nature, showing that
the computer generation of amino acids is a way of identifying potentially interesting amino
acids that could be used in proteins. It is even possible that the computer-generated library
already contains the chemical structures of natural amino acids that have yet to be discovered.
The computer libraries generated by the team can now be used as a foundation for further
exploration into the jungle of amino acids, and may ultimately lead to an understanding of
life's building blocks.
The research was published in the November issue of the Journal of Chemical Information
and Modeling and can be found here: http://pubs.acs.org/doi/abs/10.1021/ci400209n