Problem statement from VBG

Compensated pathogenic deviations in human disease proteins
Violeta Beleva-Guthrie
Disease-causing mutations in humans are also known as pathogenic deviations (PDs).
Most PDs are found at sites that are highly conserved between human proteins and
their functional equivalents in other species. However, a PD residue sometimes occurs as
the wild-type residue in some of the non-human functionally equivalent protein sequences.
During evolution, these PDs are presumed to have spread by evolving together with
other neutralizing mutations. Such neutralizing mutations tend to occur in the same
protein and compensate for the deleterious effect of the original PD. PDs of this kind
are termed “compensated pathogenic deviations” (CPDs). The ability to distinguish
between compensated and uncompensated PDs has key applications in genetic
testing for diagnostic purposes.
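As an illustration of the definition above, the following minimal sketch checks whether a human PD residue appears as the wild-type residue in any non-human sequence of a multiple alignment. The function name, the dictionary-based alignment format, and the toy sequences are assumptions made for this example only; they are not part of the project materials.

```python
def find_cpd_candidates(alignment, human_id, position, mutant_residue):
    """Return IDs of non-human sequences that carry the mutant residue
    at the alignment column matching the given human position.

    alignment: dict mapping sequence ID -> aligned sequence ('-' marks a gap).
    position:  1-based position in the ungapped human sequence.
    """
    human = alignment[human_id]

    # Map the ungapped human position to an alignment column.
    column = None
    seen = 0
    for i, residue in enumerate(human):
        if residue != "-":
            seen += 1
            if seen == position:
                column = i
                break
    if column is None:
        raise ValueError("position lies outside the human sequence")

    return [
        seq_id
        for seq_id, seq in alignment.items()
        if seq_id != human_id and seq[column] == mutant_residue
    ]


# Toy alignment (hypothetical sequences, for illustration only).
toy_alignment = {
    "human": "MKT-AVL",
    "mouse": "MKTGAVL",
    "fish": "MRT-SVL",
}

# A hypothetical PD replacing alanine at human position 4 with serine:
# serine is the wild-type residue in the toy "fish" sequence.
candidates = find_cpd_candidates(toy_alignment, "human", 4, "S")
print(candidates)
```

A non-empty result only flags the PD as a candidate CPD; whether a compensating mutation is actually present in that sequence still has to be established separately.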
In this project you will develop a machine-learning approach to distinguish between
compensated and uncompensated PDs. To do so, you will use publicly available protein
databases and compile a set of biophysical features for your model. Additionally, you
may investigate which features tend to characterize the neutralizing
(compensating) mutations associated with a CPD.
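A minimal sketch of what compiling biophysical features for a single substitution might look like is given below. The choice of features (change in Kyte-Doolittle hydropathy, change in formal side-chain charge, and a proline flag) and the function name are assumptions for illustration; the project does not prescribe a feature set. Histidine is treated as neutral here, which is a simplification.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Formal side-chain charge at neutral pH (assumption: histidine counted as 0).
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def substitution_features(wild_type, mutant):
    """Return a small feature dictionary for a wild_type -> mutant substitution."""
    for residue in (wild_type, mutant):
        if residue not in KD_HYDROPATHY:
            raise ValueError(f"unknown amino acid: {residue!r}")
    return {
        "hydropathy_change": KD_HYDROPATHY[mutant] - KD_HYDROPATHY[wild_type],
        "charge_change": CHARGE.get(mutant, 0) - CHARGE.get(wild_type, 0),
        "to_proline": int(mutant == "P"),
    }


# Example: a charge-reversing glutamate-to-lysine substitution.
features = substitution_features("E", "K")
print(features)
```

One such feature row per PD, labeled as compensated or uncompensated, could then be passed to a classifier of your choice. Structural features such as solvent accessibility or predicted stability change would need additional data sources and are not covered by this sketch.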
Note: Some additional background in protein biophysics and modeling of protein
evolution is required for this project.