BioGeometry Project

Researchers “Shaping” Future of Proteomics
by Dennis Meredith
With the rough draft of the human genome completed – as announced to great
fanfare last month – researchers now face the even more daunting task of figuring out how the
30,000 or so genes give rise to the biological protein machinery that makes humans uniquely
human.
A central problem in this field, called “proteomics,” is how to mathematically describe
the intricate folding of proteins. These molecules, central to all life, begin as one-dimensional
linear strings of amino acids, but collapse immediately, like intricate origami, into
three-dimensional proteins, such as the enzymes that are the workhorse catalysts of biochemical
reactions in the cell.
Since last fall, researchers from Duke, UNC Chapel Hill, Stanford and North Carolina
A&T have been tackling the daunting problems of mathematically describing the shapes of these
proteins, the complex contortions they undergo in carrying out cell processes and how their
triggering molecules called ligands plug into them.
The scientists are working under a $7 million National Science Foundation grant to
develop new tools of “computational geometry” that will help biologists take the next
key steps in understanding protein shape and function.
Principal investigator on the project is Duke Professor of Computer Science Herbert
Edelsbrunner, and his Duke collaborators include Associate Professor of Biochemistry Homme
Hellinga and Professor of Computer Science Pankaj Agarwal. They are joined by UNC
professors Fred Brooks, Jack Snoeyink and Charles Carter; Stanford professors Jean-Claude
Latombe, Leonidas Guibas and Michael Levitt; and North Carolina A&T Professor Solomon
Bililign.
According to the project summary, “This research is expected to shed light on some of
the most important unsolved biological puzzles: prediction of protein structure, simulation of
protein folding and analysis of ligand to protein docking. These processes link form to function.
“Understanding them will pave the way to a post-genomic era in biological research in
which the wealth of DNA sequence information is complemented by corresponding knowledge
of geometric shape. Together, sequence and shape will provide a description of the biological
function so critical for all life.”
According to Edelsbrunner, the prodigious scientific effort needed to move biology into
the new realm of three-dimensionality is well worth it.
“Since we live in a three-dimensional world, everything is represented geometrically, so
the aim of computational geometry and of this project is to represent geometric shapes and do
computing about them,” he said. In their work, the scientists will seek to combine two previously
mutually exclusive approaches to computing, he explained.
“There are really two basic camps in the use of geometric algorithms -- the combinatorial
and the numerical,” Edelsbrunner said. “In the combinatorial approach to computational
geometry, logic is used at every step of the computations, critically relying on yes or no
decisions, and the result being a mathematical tree with branches.
“Numerical approaches use approximations and avoid the problem in combinatorial
approaches, in which a single small error can lead to the wrong branch and compound itself, with
the algorithm breaking down. Our approach will be a combined one, using whichever method is
better for the task.” Such efforts are complicated by the fact that researchers will first have to
understand where the scientific weaknesses lie in techniques of mathematically modeling
proteins.
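The two camps Edelsbrunner describes can be seen in one of the simplest yes-or-no decisions in computational geometry: does the path through three points turn left, turn right, or go straight? The Python sketch below is an illustration only, not the project's software. The function names, the test points and the error-bound constant are assumptions made for this example. It shows a floating-point test reporting "straight" for points that actually turn right, an exact test that gets the branch right at higher cost, and one possible combined test that uses the fast arithmetic when the answer is clear and falls back to exact arithmetic only when it is not.

```python
import sys
from fractions import Fraction


def _sign(x):
    """Return +1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def orient_float(a, b, c):
    """Numerical camp: turn direction a -> b -> c in floating point.

    +1 means left turn, -1 right turn, 0 collinear. Rounding can
    silently change the answer for nearly collinear points.
    """
    det = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return _sign(det)


def orient_exact(a, b, c):
    """Combinatorial camp: the same test in exact rational arithmetic."""
    (ax, ay), (bx, by), (cx, cy) = [tuple(map(Fraction, p)) for p in (a, b, c)]
    return _sign((bx - ax) * (cy - ay) - (by - ay) * (cx - ax))


def orient_combined(a, b, c):
    """Hypothetical combined test: fast path first, exact fallback.

    The factor 4 in the error bound is a deliberately conservative
    assumption chosen for this sketch, not a value from the project.
    """
    left = (b[0] - a[0]) * (c[1] - a[1])
    right = (b[1] - a[1]) * (c[0] - a[0])
    det = left - right
    bound = 4 * sys.float_info.epsilon * (abs(left) + abs(right))
    if abs(det) > bound:
        return _sign(det)          # floating point is trustworthy here
    return orient_exact(a, b, c)   # too close to call: decide exactly


# Three points that turn right by a margin too small for doubles to see.
e = 2.0 ** -30
a, b, c = (0.0, 0.0), (1.0 + e, 1.0), (1.0, 1.0 - e)

print(orient_float(a, b, c))     # 0  (wrong branch: reports collinear)
print(orient_exact(a, b, c))     # -1 (right turn)
print(orient_combined(a, b, c))  # -1 (falls back to exact arithmetic)
```

For well-separated points such as (0, 0), (1, 0) and (0, 1), all three functions agree, and the combined test never touches the slower exact arithmetic.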
“In addressing the protein-folding problem it is hard to pinpoint whether there’s an
essential weakness in the physical understanding of proteins or in the computing needed to
describe them,” Edelsbrunner said. “In fact, maybe there’s not a particular culprit; maybe the
whole system is just not good enough.”
Thus, said Edelsbrunner, he and his colleagues will seek not only to better understand the
biology of protein folding, but to speed up the computing tools, called algorithms, so that
biologists can perform more experiments to simulate the intricate molecular motions proteins
undergo as they fold.
“Say, if we can speed up a modeling algorithm by a factor of a thousand, then people can
run more molecular dynamics faster, and rather than waiting a week for their results, they can see
them in half an hour. Thus, they can perform more simulations and improve the system faster.”
Importantly, he said, simulating a protein effectively means simulating not just its shape, but how
it moves.
“When you address the protein folding problem you must address molecular motion, and
the simulation of that motion right now is not good enough,” he said. For example, he said, in
building a mathematical simulation of protein motion, researchers must now largely ignore the
smaller vibrations of the thousands of individual atoms in the protein, although they can prove
important to understanding how the larger protein molecule behaves.
“If you try to simulate molecular motion, you may go through thousands of iterations in
which the whole system does nothing but vibration most of the time, and then occasionally
something large moves,” he said. Also, Edelsbrunner said, there are few shortcuts to developing
the complex simulation techniques.
“The most difficulty we have with our software is that it is so labor-intensive,” he said.
“It is sophisticated geometric software that takes a lot of time to develop. And once you have it, it
is great, but you always need an expert to create it.”
According to Edelsbrunner, the research group consists of teams of mathematicians and
biologists, each contributing and teaching their expertise to make the software not only
mathematically sound, but biologically useful. For example, while the mathematicians
understand the intricacies of modeling techniques, the biologists understand the complex
analytical technique of X-ray crystallography – in which beams of X-rays shone through crystals
of pure protein yield information about the protein’s structure.
The results of their interdisciplinary collaboration will be not only useful new software
for biologists, but also new talent.
“Part of the success of this project will be the education of postdoctoral students in this
interdisciplinary area,” he said. “There is a huge lack of computing and mathematical skills in
biology right now. And while there are many people interested in the field, there are no
training programs. We want to help fill that need.”
And the scientists hope their research will yield new discoveries about the shape of the
proteins they study and how they interact with ligands in biochemical reactions. All in all,
Edelsbrunner said, the project will be not just an experiment in modeling protein structures, but
in creating fruitful collaborations among computer scientists and biologists to advance the
understanding of life’s processes.
“There is certainly a great potential for progress, in which we improve one part of a
modeling technique and it allows an advance that enables us to see how to improve or even
redesign another part -- and step-by-step, we will build far more useful scientific tools,”
Edelsbrunner said.