Compensated pathogenic deviations in human disease proteins

Violeta Beleva-Guthrie

Disease-causing mutations in humans are also known as pathogenic deviations (PDs). Most PDs are found at sites that are highly evolutionarily conserved in human proteins and in their functional equivalents in other species. However, a PD sometimes occurs as the wild-type residue in some of the non-human functionally equivalent protein sequences. During evolution, these PDs are presumed to have spread by evolving together with other, neutralizing mutations. Such neutralizing mutations tend to occur in the same protein and compensate for the deleterious effect of the original PD. PDs of this kind are termed “compensated pathogenic deviations” (CPDs).

The ability to distinguish between compensated and uncompensated PDs has key applications in genetic testing for diagnostic purposes. In this project you will develop a machine-learning approach to distinguish between compensated and uncompensated PDs. To do so, you will use publicly available protein databases and compile a set of biophysical features for your model. Additionally, you may consider investigating which features tend to characterize the neutralizing (compensating) mutations associated with a CPD.

Note: Some additional background in protein biophysics and modeling of protein evolution is required for this project.
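The definition of a CPD given above can be made concrete with a small sketch. It assumes a pre-computed multiple sequence alignment of the human protein and its non-human functional equivalents, stored as a dictionary of equal-length gapped strings. The function name `find_cpd_species`, the sequence identifiers, and the toy sequences are all hypothetical and for illustration only; they are not part of the project specification and are not real data.

```python
from typing import Dict, List


def find_cpd_species(
    alignment: Dict[str, str],
    human_id: str,
    position: int,
    mutant: str,
) -> List[str]:
    """Return the non-human sequences whose wild-type residue equals the
    human disease-associated residue at the aligned site.

    `position` is 1-based and counts residues of the ungapped human sequence.
    A non-empty result marks the PD as a candidate CPD.
    """
    human = alignment[human_id]

    # Map the ungapped human position onto an alignment column.
    seen = 0
    column = None
    for i, residue in enumerate(human):
        if residue != "-":
            seen += 1
            if seen == position:
                column = i
                break
    if column is None:
        raise ValueError(f"position {position} is outside the human sequence")

    # Collect every other sequence carrying the mutant residue in that column.
    return sorted(
        seq_id
        for seq_id, seq in alignment.items()
        if seq_id != human_id and seq[column] == mutant
    )


# Toy alignment (invented sequences, hypothetical identifiers).
toy_alignment = {
    "human": "MK-TAYIA",
    "mouse": "MKQTAYIA",
    "zebrafish": "MK-TVYIA",
}

# Hypothetical PD: alanine at human position 4 replaced by valine.
carriers = find_cpd_species(toy_alignment, "human", 4, "V")
```

Here `carriers` lists the sequences in which the disease-associated residue is tolerated as wild type. A real pipeline would additionally need to confirm functional equivalence of the aligned proteins and search the same protein for candidate compensating substitutions, neither of which this sketch attempts.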