Fundamental Thermodynamics of Protein Folding: Theory and Prediction

Table of Contents
Appendix
Abstract
1.0 Introduction
2.0 Thermodynamic Theories of Protein Folding
3.0 Statistical Thermodynamics of Protein Folding
3.1 Hydrophobic Interaction
3.2 Hydrogen Bonds
3.3 Electrostatic Interaction
3.4 Van der Waals Forces
4.0 Protein Structure Prediction and Optimization
5.0 Conclusion and Future Prospects
6.0 References

Appendix

Symbol                             Meaning
$\Delta H$                         Enthalpy change
$\Delta S$                         Entropy change
$T$                                Thermodynamic temperature
$\Delta G$                         Gibbs free energy change
$\Delta G_{u,\text{es}}$           Free energy required to open a folded protein with electrostatic interaction
$\Delta G_{u,\text{non-es}}$       Free energy required to open a folded protein without electrostatic interaction
$\Delta\Delta G_{\text{es}}$       Free energy contribution of electrostatic interaction in the folding process

Abstract

The thermodynamic hypothesis proposed by Anfinsen states that the native protein always occupies the lowest free energy level. This idea is represented graphically by the energy landscape and has become the basic theory of ab initio structure prediction. In addition, building on classical thermodynamics, statistical thermodynamics provides a clearer way to calculate and analyze the thermodynamic properties of the protein folding process in terms of microscopic states. This report focuses on the explanation of the folding process in both classical and statistical thermodynamic theories, as well as on protein structure prediction. Examining the process thermodynamically reveals its fundamental principles. However, protein folding is so complex a process that new conceptual breakthroughs will be required for further progress.
1.0 Introduction

As the fundamental components of almost every living system, more than 100,000 kinds of proteins perform different functions in the human body, for example as enzymes, regulatory proteins and immunoglobulins. These functional differences are mainly determined by the number of amino acids and their sequence, namely the primary structure of the protein. Generally, proteins are considered to have four hierarchical levels of structure [1]. The primary structure is the sequence of different types of amino acids on the polypeptide chain. The secondary structure is the regular structure formed locally by the polypeptide chain, such as the α-helix and the β-sheet. The tertiary structure is the three-dimensional structure formed by the close packing of secondary structures in space. The quaternary structure refers to the complex formed by the interaction between different polypeptide chains. After the unfolded protein undergoes a complex folding process, the resulting native structure takes on specific functions and tasks in the organism. This delicate construction, as the machinery of life, mediates most of the processes taking place in living cells. What distinguishes proteins from all other polymers, with their broad range of properties and functions, is their peculiar structural configuration. Proteins are clearly important carriers of life activities, so it is important to study their folding mechanism.

The significance of studying the principles of protein folding is threefold. First, it can help to find the reason why proteins fold. Second, it can help to understand the structure and function of proteins more clearly. Third, it can provide a basis for the design of new proteins and targeted drugs.
Moreover, although most processes in living cells proceed under non-equilibrium conditions, measuring and analyzing the thermodynamic properties of proteins is an efficient and reliable method that can make an essential contribution to biology. It has been shown extensively that the folding of a polypeptide into a protein is under both thermodynamic and kinetic control [2]. This report focuses on analysing this process from a thermodynamic point of view and on explaining why it proceeds the way it does.

The first part is devoted mainly to theory. The folding and construction process is sophisticated and complex, and only the first three stages, which are governed by thermodynamics and molecular kinetics, will be discussed in this article. The theories are presented from both the kinetic and the thermodynamic point of view, since both types of control are believed to be active simultaneously. More specifically, two main theories, Anfinsen's law and the energy landscape for proteins, will be analysed in turn to clarify the construction process [3]. Finally, practical experiments on protein construction will be presented for further explanation. The second part gives a general introduction to statistical thermodynamics, an approach now widely used to analyze the protein folding process. Specifically, the effects of hydrophobic interaction, hydrogen bonding, electrostatic interaction and van der Waals forces on the stability of the protein structure and on the free energy of the folding process will be analyzed in detail. The third part introduces ab initio structure prediction together with three kinds of optimization methods.
2.0 Thermodynamic Theories of Protein Folding

The protein folding process involves three structural levels (a fourth, which concerns the interaction between chains, is not covered in this article). The first level is the sequence of the peptide, and the second consists of basic local twists and bends of the chain. These two levels are believed to be controlled by thermodynamics. The essential question is therefore how the polypeptide sequence at the first level thermodynamically controls the spatial folding at the second level.

Two thermodynamic approaches have been put forward for constructing protein folding models. The first assumes that the native conformation of a protein molecule is its most stable configuration, the one in the lowest thermodynamic energy state. Taking into account all interaction forces within the molecule and the interaction between the whole protein and the solvent, the native spatial structure is simulated as the minimum-energy structure using molecular mechanics. The second approach compares thermodynamic data from experimental measurements with the current protein databases in order to find regularities and construct three-dimensional models. The structure forecast is based on protein homology and probes the conformation step by step, obtaining the structure level by level. Specifically, first-level data are used to construct the second level of configuration, followed by the more complex three-dimensional protein structure. With the regularities obtained in this process and the feasible models derived from experiment, unreasonable structures can be discarded, and further refinement is possible using the principle that the molecule adopts its lowest energy level. However, the first approach is mathematically extremely difficult, while the second has limited accuracy.
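The first approach, and the reason it becomes mathematically hard, can be illustrated with a toy model. The sketch below uses the two-dimensional HP lattice model (each residue is either hydrophobic, H, or polar, P, and sits on a square lattice), which is an assumption introduced here for illustration and is not a model used in this report. The function names, the energy rule of -1 per non-bonded H-H contact, and the example sequence are likewise hypothetical choices. The code enumerates every self-avoiding conformation and returns those of lowest energy.

```python
# Toy 2D HP lattice model: H = hydrophobic, P = polar (illustrative assumption).
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def conformations(n):
    """Yield every self-avoiding walk of n residues on a square lattice.

    The first bond is fixed along +x to remove rotational duplicates.
    """
    def extend(path):
        if len(path) == n:
            yield list(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in path:  # self-avoidance: one residue per lattice site
                path.append(nxt)
                yield from extend(path)
                path.pop()

    yield from extend([(0, 0), (1, 0)][:n])


def energy(seq, path):
    """Return -1 for each pair of H residues in lattice contact but not bonded."""
    e = 0
    for i in range(len(seq)):
        for j in range(i + 2, len(seq)):  # skip chain neighbours (j = i + 1)
            if seq[i] == seq[j] == "H":
                (xi, yi), (xj, yj) = path[i], path[j]
                if abs(xi - xj) + abs(yi - yj) == 1:
                    e -= 1
    return e


def fold(seq):
    """Exhaustive search: (minimum energy, lowest-energy paths, number searched)."""
    scored = [(energy(seq, p), p) for p in conformations(len(seq))]
    e_min = min(e for e, _ in scored)
    natives = [p for e, p in scored if e == e_min]
    return e_min, natives, len(scored)


# Hypothetical four-residue sequence; longer chains can be tried the same way.
e_min, natives, n_total = fold("HPPH")
print(e_min, len(natives), n_total)
```

Even in this simplified setting the number of conformations grows roughly exponentially with chain length, so exhaustive search is only feasible for very short chains. Realistic energy functions and off-lattice geometry make the problem harder still.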
In 1973, Anfinsen first analysed the connection between the peptide sequence and the secondary structure of polypeptide chains [4]. In their studies on ribonuclease, he and his group found that denatured ribonuclease was able to renature spontaneously. In subsequent in vivo and in vitro experiments, they identified an enzyme in cells that catalysed the folding process by locating thermodynamically unstable positions. From these experiments Anfinsen drew two conclusions. The first was that the native protein has the lowest free energy under constant physical conditions, including temperature, pressure and pH, so that a protein that is not in this lowest state will spontaneously decrease its free energy to reach the thermodynamically favourable one. This is known as the Thermodynamic Hypothesis [4]. The second was that all the information needed for a polypeptide to fold into basic secondary structures such as α-helices and β-strands, and for these to assemble in order into the complete structure, is contained entirely within the amino acid sequence [5]. After folding into the secondary conformation, the energy level is lower than that of the unfolded chain.

In the following years, many experiments and models concerning Anfinsen's dogma were put forward. In 1994, Kolinski and Skolnick successfully folded model helical proteins in simulation [6]. Among their final structures, the energies of the unsuccessful folding simulations, which gave incorrect structures, were about 50-60 $k_\mathrm{B}T$ higher than those of the correct ones. However, a surprising exception exists, in which the conformation after folding has a higher energy than before (Figure 1A and 1B). Chaperones were invoked to explain this unusual phenomenon.
Classical chaperones have been shown to exist in many living cells, yet they only improve the efficiency of folding or accelerate it rather than alter the reaction path. Highly specific steric chaperones have also been discovered, which are able to transfer conformational information to the protein on its way to the native configuration. They lower the barrier between the native and the partially folded state, which enables the folding process to proceed in the direction of rising energy (Figure 1C). Although some have argued that the existence of this type of steric chaperone violates Anfinsen's law, it should still be regarded as an exception, and further validation at a deeper level is required.

Figure 1. (A) Energy diagram of a normal folding process. (B) Energy diagram of a folded protein that has a higher energy level than the dissociated state. (C) Gibbs free energy diagram with the use of steric chaperones [5].

Although Anfinsen's principle has been widely doubted because of its slight conflict with thermodynamics, most researchers acknowledge that folding is an irreversible thermodynamic process. However, in the absence of criteria and standard measurements, scientists can only construct their models by relying on both theory and experimental experience. In this situation, Frauenfelder first proposed the concept of the energy landscape as an explanation. The landscape is based on the second law of thermodynamics, which requires that the total free energy of the protein and its surroundings decreases during folding, although the energy of the protein itself does not always drop during the process [7]. In Onuchic's view, the polypeptide chain first collapses into a compact structure that is only similar in shape to the active protein structure, and then rebuilds its most stable conformation through steric stretching, twisting, transient interactions and bonding.
The process above leads to a funnel-shaped or multi-funnel-shaped landscape. Anfinsen's thermodynamic principle can just about explain the single-funnel diagram, but the multi-funnel one is beyond its reach. For the multi-funnel landscape a new question arises: among all the equivalent funnels, why is the native funnel selected for the structure? The explanation offered is that the transient forces needed to push a given α-amino acid sequence into the native funnel come from vibrational excited states, which is known as the VES hypothesis. According to this hypothesis, vibrational excited states trigger the transient forces that constitute the first step in protein folding and function [8]. To test the essential idea that conformations settle into the structure of global minimum Gibbs free energy, three different configurations of four types of protein were examined thermodynamically in the study by Cruzeiro et al. Comparison of the resulting energy landscapes showed that in a multi-funnel Gibbs energy landscape each funnel represents an average structure with an energy different from that of the native one, and that each structure in a different funnel is theoretically attainable (Figure 2). Although diverse structures are thermally possible, the structure is not decided solely by Gibbs energy minimization. Levinthal proposed as early as 1968 that a funnel-selecting step based on a non-equilibrium kinetic mechanism precedes the folding process. This step is now identified with the VES hypothesis mentioned above.

Figure 2. Example of a funnel-shaped energy landscape for the folding of an α-helical protein, described by effective energy, conformation and entropy [8].

The peptide sequence clearly determines the landscape.
The study by Chakraborty et al., which aimed to link the peptide encoding with the Gibbs energy of the process, is based on the transition from an α-helix to a β-hairpin [9]. The polypeptide chain was characterized using the Discrete Path Sampling (DPS) technique. For contrast, the DP5 peptide sequence was recoded to give DP3, and the landscapes were mapped out by DPS. The multi-funnel energy landscape of the DP3 set was reshaped after recoding the sequence (Figure 3). Comparing the α-helix and β-hairpin configurations after the change of peptide sequence shows that both the thermodynamic and the kinetic properties are altered. At the microscopic level, the molecular mechanism and transient features remained unchanged, meaning that the adjustment of the energy landscape might result from changes in key hydrogen bonding during the process [9].

Figure 3. (a) Multi-funnel energy landscape of the α-helix and the β-hairpin of the DP5 sequence at 300 K. (b) Multi-funnel energy landscape of the α-helix and the β-hairpin of the DP3 sequence at 300 K. In both panels, blue branches represent helical conformations and red branches represent hairpin conformations; other partial structures are also shown [9].

The next phase of the folding process is a kinetically dominated stage, in which the secondary structure begins to fold into a more complex three-dimensional conformation. This process takes place in an external environment in which pressure, temperature and acidity are variable. From the kinetic point of view, the process can be reversible depending on the external conditions. For this part of the process, equilibrium has to be reached. It is known that all systems evolve towards an equilibrium state, and that an isolated system at equilibrium is characterized by maximum entropy.
In contrast, living systems seem to violate this rule by increasing the order of the system and never reaching equilibrium. However, the entropy of the universe must increase: the surroundings supply negative entropy to the biological subsystem and so keep it away from equilibrium. In other words, the subsystem may show a trend away from equilibrium and a decrease in entropy, but the overall entropy still increases [10]. The equilibrium may be disturbed by a change in external conditions, such as pressure or pH. The tertiary structure may then be damaged and lose its basic biological function, which is called denaturation of the protein conformation. The damaged structure may fold back if the external conditions are restored. However, if a deeper level of structure, generally the secondary structure, is damaged, the protein cannot recover.

3.0 Statistical Thermodynamics of Protein Folding

It is difficult to describe the folding process of a single protein molecule. First, a single protein molecule is in constant random thermal motion and passes through different structural states, so its trajectory and precise state are hard to describe. Second, a protein molecule does not have one unique spatial structure but is an ensemble of interconnected and interconvertible microscopic states. For example, the structure of a native protein observed by NMR is an ensemble average over different microscopic states rather than a single stable structure [11]. It is therefore of great significance to study the collection of microscopic states that make up a native protein from the perspective of statistical thermodynamics. Protein folding is mainly affected by four factors: hydrophobic interaction, hydrogen bonding, electrostatic force and van der Waals force.
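The ensemble picture above can be made concrete with Boltzmann statistics, in which each microscopic state is populated in proportion to exp(-E/RT) and an observed property is the population-weighted average. The following is a minimal sketch under stated assumptions: the four microstate energies are hypothetical values chosen only for illustration, the gas constant is rounded, and the function names are not taken from any source cited in this report.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K), rounded


def boltzmann_populations(energies, temperature=298.15):
    """Return the equilibrium population of each microstate (energies in kcal/mol)."""
    e_min = min(energies)  # shift by the lowest energy for numerical stability
    weights = [math.exp(-(e - e_min) / (R_KCAL * temperature)) for e in energies]
    z = sum(weights)  # partition function relative to the lowest state
    return [w / z for w in weights]


def ensemble_average(values, populations):
    """Population-weighted average of a per-microstate property."""
    return sum(v * p for v, p in zip(values, populations))


# Hypothetical microstate energies (kcal/mol), for illustration only
energies = [0.0, 0.5, 1.0, 3.0]
pops = boltzmann_populations(energies)
mean_energy = ensemble_average(energies, pops)
print([round(p, 3) for p in pops], round(mean_energy, 3))
```

The lowest-energy microstate is the most populated, but states within about one RT of it still contribute noticeably, which is why a measured structure reflects an average rather than a single conformation.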
The second part of this article discusses how these four factors (hydrophobic interaction, hydrogen bonding, electrostatic force and van der Waals force) influence the stability of the protein and the free energy of the folding process.

3.1 Hydrophobic Interaction

The twenty common amino acids in nature can be divided into two categories according to whether they are hydrophilic or hydrophobic. Nine are hydrophobic, and the remaining eleven are hydrophilic. An unfolded protein chain contains both hydrophobic and hydrophilic residues. The side chains of hydrophobic residues end in non-polar hydrocarbon groups, while the carbonyl and amino groups in the main chain are hydrophilic. Under the combined action of the hydrophobic and hydrophilic groups, the hydrophobic groups gather inside the protein to form a hydrophobic core, while the hydrophilic groups form an approximately spherical coating on the protein surface. A native protein can therefore be regarded as an approximately spherical structure with a hydrophobic core inside and a hydrophilic coating outside.

Studies have shown that hydrophobic interaction is an important driving force for the rapid inward collapse in the protein folding process [12]. Because destroying the tight hydrophobic core requires energy, the reverse process, in which the hydrophobic groups of an unfolded protein collapse inward to form the core, is exothermic, which means that the enthalpy change is negative. Because an unfolded chain has more spatial conformations available than the native protein, folding is an entropy-reducing process, which means that the entropy change is negative.
The change in free energy is related to the changes in enthalpy and entropy by

$$\Delta G = \Delta H - T\Delta S$$

With $\Delta H < 0$ and $\Delta S < 0$, the term $-T\Delta S$ is positive, so the process is spontaneous only if the decrease in enthalpy outweighs it. Experiments and studies have confirmed that, under the influence of the hydrophobic effect, the reduction of enthalpy is the dominant contribution. The free energy change of this process is therefore negative, which shows that protein collapse is a spontaneous process [12].

3.2 Hydrogen Bonds

A hydrogen bond is an interaction formed between a hydrogen atom and a strongly electronegative atom, usually oxygen or nitrogen. Hydrogen bonds are directional: in proteins, they can form only when the carbonyl group and the amino group meet at a suitable angle. Hydrogen bonds are widespread in proteins. In an α-helix, hydrogen bonds form between successive turns of the helix. In a β-sheet, hydrogen bonds form between one strand and another antiparallel strand. Because hydrogen bonds occur regularly between residues in α-helices and β-sheets, they hold the protein in a stable state and are the key force in the formation of secondary structure. The formation of hydrogen bonds between residues releases a large amount of energy, resulting in a decrease in enthalpy. Shirley et al. studied the changes in hydrogen bonding in ribonuclease T1 mutants [13] and reported that the average contribution of a hydrogen bond to structural stability is about 1.3 kcal/mol. Since a single protein contains many hydrogen bonds, their formation plays a vital role in the stability of the protein structure.

3.3 Electrostatic Interaction

Among the eleven hydrophilic amino acids, three are positively charged (arginine, histidine and lysine) and two are negatively charged (aspartic acid and glutamic acid).
When these charged residues are close to each other, they interact electrostatically, forming what is also known as a salt bridge. The electrostatic attraction draws the charged residues closer together, which reduces the free energy and makes the protein structure more stable. In addition, because charged residues are hydrophilic, they are usually located in the polar environment on the outside of the protein. During folding, however, some charged residues are inevitably drawn into the hydrophobic core, a process called desolvation of charges. The accumulation of charges in the hydrophobic core causes an increase in enthalpy. In other words, desolvation increases the free energy and makes the protein structure less stable [14].

If $\Delta G_{u,\text{es}}$ denotes the free energy required to open a folded protein with electrostatic interaction, and $\Delta G_{u,\text{non-es}}$ denotes the free energy required to open a folded protein without electrostatic interaction, then the free energy contribution $\Delta\Delta G_{\text{es}}$ of electrostatic interaction to the folding process is

$$\Delta\Delta G_{\text{es}} = \Delta G_{u,\text{es}} - \Delta G_{u,\text{non-es}}$$

If $\Delta\Delta G_{\text{es}} > 0$, the electrostatic interaction makes the protein structure more stable; otherwise, it makes the structure less stable. Because the contributions of salt bridges and of charge desolvation are similar in magnitude, the net effect of electrostatic forces on stability is finely balanced. In a specific analysis, whether electrostatics stabilizes or destabilizes the protein structure depends on which factor dominates.

3.4 Van der Waals Forces

Van der Waals forces are forces between molecules and include both attraction and repulsion.
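The balance between attraction and repulsion is often modelled with the Lennard-Jones 12-6 potential. This functional form is an assumption introduced here for illustration, since the report does not specify one, and the parameter values below (well depth `epsilon`, contact distance `sigma`) are hypothetical rather than taken from any force field. The sketch shows that the energy is repulsive at short range, zero at `sigma`, and most favourable at a separation of 2^(1/6) times `sigma`.

```python
def lennard_jones(r, epsilon=0.2, sigma=3.5):
    """12-6 Lennard-Jones energy (epsilon in kcal/mol; r and sigma in angstrom)."""
    sr6 = (sigma / r) ** 6
    # sr6 * sr6 is the short-range repulsion term, -sr6 the longer-range attraction
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_minimum(sigma=3.5):
    """Separation at which the 12-6 potential reaches its minimum of -epsilon."""
    return 2.0 ** (1.0 / 6.0) * sigma


# Scan a few separations (angstrom) to see repulsion turn into attraction
for r in (3.0, 3.5, lj_minimum(), 5.0, 8.0):
    print(f"r = {r:5.2f}  E = {lennard_jones(r):8.4f}")
```

Because the attractive well is shallow for any single atom pair, the overall contribution to stability comes from summing many such pairs, which depends strongly on how tightly the structure is packed.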
Owing to the diversity of protein structures, the influence of van der Waals forces on the stability of protein structures is also complicated. Because van der Waals forces can draw side chains closer to each other, they make the tertiary structure of the native protein more compact. In addition, studies have shown that van der Waals attraction can bring the amino group and the carbonyl group closer together, thereby strengthening the hydrogen bond and reducing the free energy of the protein, which makes the system more stable [15]. Since van der Waals forces exist widely between molecules and are affected by the complex spatial structure of different proteins, it is still challenging to analyze quantitatively their effects on protein stability and on the free energy change during folding.

4.0 Protein Structure Prediction and Optimization

At the 13th Critical Assessment of Protein Structure Prediction (CASP13) in 2018, AlphaFold from DeepMind predicted 25 protein structures and won the competition by a significant margin. This not only drew attention to the field of deep learning, but also led to a rethinking of traditional protein structure prediction algorithms. At present, there are three main kinds of prediction method: homology modelling, fold recognition and ab initio structure prediction. The first two rely on structures that have already been solved: the target model is constructed by comparing the protein with templates from the Protein Data Bank. However, the Protein Data Bank contained only about 100,000 structures, whereas about 90 million protein sequences were known by 2015 [17]. This huge gap means that template-based methods cannot be applied in most cases (Figure 4). In such cases, 3D models have to be constructed from scratch, which is called ab initio structure prediction.

Figure 4.
The numbers of known protein sequences and solved protein structures from 1995 to 2015 [17].

Ab initio protein structure prediction, which has been under development for more than 60 years, is still a promising prediction method and is regarded as the holy grail of molecular biology [16]. Compared with homology modelling and fold recognition, the ab initio method is an ideal way to predict a protein structure, since it relies only on the amino acid sequence and not on known protein structures. Because it is template-free, it is the hardest of the protein structure prediction approaches. However, it can be very helpful for understanding the fundamental mechanism of protein folding and how a polypeptide chain turns into a protein with a specific function [17]. The theoretical basis of the ab initio method is the thermodynamic hypothesis, proposed by Anfinsen in 1973, that the native protein corresponds to the global minimum of free energy [4].

The basic protocol of the ab initio method is to search for conformations that minimize an appropriate potential energy function, which should lead to the prediction of native folds. After the folds have been recognized or predicted, the quality of the predicted structures is assessed to determine whether each structure is rejected or selected. However, there are two main obstacles to the successful implementation of the ab initio method. The first is the lack of an effective potential function that can distinguish native from non-native conformations, so that the global minimum of the energy function corresponds to the native protein. The second is that the potential surface has an astronomical number of local minima, which makes it hard to sample efficiently.
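The sampling obstacle can be illustrated on a deliberately simple landscape. The sketch below is a toy under stated assumptions: the one-dimensional energy function (a broad funnel with ripples), the 100 discrete states, the cooling schedule and all function names are hypothetical and are not taken from any protein force field or cited method. It compares a purely downhill search, which stops at the nearest local minimum, with a temperature-controlled Metropolis search that can accept uphill moves.

```python
import math
import random

N_STATES = 100  # discrete toy "conformations" 0..99


def energy(x):
    """Toy rugged landscape: a broad funnel plus ripples (purely illustrative)."""
    return 0.01 * (x - 60) ** 2 + 3.0 * math.cos(0.7 * x)


def greedy_descent(x):
    """Move to the lower-energy neighbour until no neighbour is lower."""
    while True:
        neighbours = [n for n in (x - 1, x + 1) if 0 <= n < N_STATES]
        best = min(neighbours, key=energy)
        if energy(best) >= energy(x):
            return x  # local minimum reached
        x = best


def annealed_search(x, t_start=5.0, t_end=0.05, steps=5000, seed=0):
    """Metropolis search with geometric cooling; returns the best state visited."""
    rng = random.Random(seed)
    best = x
    for k in range(steps):
        t = t_start * (t_end / t_start) ** (k / max(steps - 1, 1))
        cand = x + rng.choice((-1, 1))
        if not 0 <= cand < N_STATES:
            continue  # reject moves that leave the landscape
        delta = energy(cand) - energy(x)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x = cand
            if energy(x) < energy(best):
                best = x
    return best


start = 5
global_min = min(range(N_STATES), key=energy)  # brute force, feasible only for a toy
stuck = greedy_descent(start)
found = annealed_search(start)
print(stuck, found, global_min)
```

On a real protein energy surface the brute-force reference used here is unavailable, which is why stochastic search strategies with controlled acceptance of uphill moves are widely used.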
In 1983, Kirkpatrick et al. were inspired by the annealing of solids, in which, according to the Boltzmann probability, the atoms settle into the lowest-energy state as the temperature is lowered. On this basis they created simulated annealing (SA), a powerful optimization method that is probably the most commonly used method in protein structure prediction so far [18]. In this method, an algorithm generates a series of conformations that follow the Boltzmann energy distribution at a given temperature. It starts with a high-temperature simulation, followed by a series of simulations according to a temperature reduction schedule, in order to find the ground state. As an improvement on SA, conformational space annealing, which can search for a larger number of low-energy families, was later introduced [19]. In this method, a bank containing a preassigned number of random conformations is built first, and these conformations are then energy-minimized. Several dissimilar ones are selected as seeds to be modified. By reducing the distance between conformations and updating the bank, lower-energy regions are located. After assessment, the native-like structure is determined (Figure 5).

Figure 5. Schematic diagram of conformational space annealing.

There are also many other conformational search methods, such as the entropic ensemble, which uses the entropy of the system at an initial temperature to provide an entropy estimate of a larger system for Monte Carlo simulations [20].

5.0 Conclusion and Future Prospects

The basic thermodynamic mechanism of each stage of the protein folding process has been discussed above. Viewing the process thermodynamically not only provides a new way to explain and understand protein folding, but also makes it possible to predict protein configurations.
In fact, protein folding is an extremely complex process that spans many subjects, including thermodynamics, biology, chemistry and molecular kinetics. It is impossible to deduce the process from thermodynamics alone, but thermodynamics still provides a fundamental overall concept and explanation. The multi-funnel landscape uses the Gibbs energy to explore which structures a folding polypeptide can reach, whether the native structure or uncommon configurations. Together, Anfinsen's law and the energy landscape explain most protein structures. However, there are exceptions in which folding cannot be driven solely by Gibbs energy minimization. Such phenomena may be accounted for by the VES hypothesis, which states that a deterministic force exists at the first structural level and decides which specific conformation is adopted. Regrettably, the detailed mechanism behind the hypothesis is still vague. The study by Chakraborty et al. showed that the peptide sequence at the primary level decides not just the conformation but the whole energy landscape. The last phase of the process is more a matter of kinetics: the secondary structure, that is, an elementary protein configuration with some basic spatial structure, folds reversibly into a more complex protein structure. Equilibrium exists and controls the process at this stage. This report focuses mainly on the principles governing the first two levels of protein construction, since the third falls within the scope of molecular kinetics.

However, limitations still exist. The irreversible unfolding or folding of some extremely large or complex protein structures at certain temperatures is still difficult to capture with current models [21]. Also, the energy landscape cannot describe every type of protein folding process; some landscapes are not even funnel-shaped, which puts them far beyond current models and simulations.
Studies have shown that hydrophobic interaction plays the most important driving role in protein folding. In addition, hydrogen bonds, electrostatic forces and van der Waals forces also affect the stability of the protein and the free energy of the folding process. These research results help to understand the driving forces of protein folding more deeply and lay the foundation for further research on protein folding.

With the growing number of successful genome sequencing projects, more and more amino acid sequences have become known. However, structural information remains essential for understanding their functions. The ab initio protein structure prediction method has attracted much controversy over the past decades. Based on the thermodynamic hypothesis, the ab initio method places high demands on optimization algorithms and effective energy functions, which used to be hard to meet. However, the success of deep learning and more advanced computing technology has gradually brought new vitality to traditional structure prediction. A major step forward in protein structure prediction can be expected in the near future through the combination of state-of-the-art technologies and traditional thermodynamics.

6.0 References

[1] Y.S. Chiang, T.I. Gelfand, A.E. Kister, I.M. Gelfand, New classification of supersecondary structures of sandwich-like proteins uncovers strict patterns of strand assemblage. Proteins: Structure, Function, and Bioinformatics, 68 (2007) 915-921. https://doi.org/10.1002/prot.21473.
[2] P.L. Privalov, Thermodynamics of protein folding. Journal of Chemical Thermodynamics, 29 (1997) 447-474. https://doi.org/10.1006/jcht.1996.0178.
[3] N.P. King, A.W. Jacobitz, M.R. Sawaya, L. Goldschmidt, T.O. Yeates, Structure and folding of a designed knotted protein.
Proceedings of the National Academy of Sciences, 107 (2010) 20732-20737. https://doi.org/10.1073/pnas.1007602107.
[4] C.B. Anfinsen, Principles that govern the folding of protein chains. Science, 181 (1973) 223-230. https://science.sciencemag.org/content/181/4096/223.
[5] K. Pauwels, I. Van Molle, J. Tommassen, P. Van Gelder, Chaperoning Anfinsen: the steric foldases. Molecular Microbiology, 64 (2007) 917-922. https://doi.org/10.1111/j.1365-2958.2007.05718.x.
[6] A. Kolinski, J. Skolnick, Monte Carlo simulations of protein folding. II. Application to protein A, ROP, and crambin. Proteins: Structure, Function, and Bioinformatics, 18 (1994) 353-366. https://doi.org/10.1002/prot.340180406.
[7] Q.Y. Zhao, Protein folding of the irreversible thermodynamics and landscapes theory. Chemistry of Life, 30 (2010) 323-328.
[8] L. Cruzeiro, Protein's multi-funnel energy landscape and misfolding diseases. Journal of Physical Organic Chemistry, 21 (2008) 549-554. https://doi.org/10.1002/poc.1315.
[9] D. Chakraborty, Y. Chebaro, D.J. Wales, A multifunnel energy landscape encodes the competing α-helix and β-hairpin conformations for a designed peptide. Physical Chemistry Chemical Physics, 22 (2020) 1359-1370. https://doi.org/10.1039/C9CP04778F.
[10] T. Lazaridis, M. Karplus, Thermodynamics of protein folding: a microscopic view. Biophysical Chemistry, 100 (2002) 367-395. https://doi.org/10.1016/S0301-4622(02)00293-4.
[11] S.W. Englander, Protein folding intermediates and pathways studied by hydrogen exchange. Annual Review of Biophysics and Biomolecular Structure, 29 (2000) 213-238. https://doi.org/10.1146/annurev.biophys.29.1.213.
[12] P.L. Privalov, S.J. Gill, Stability of protein structure and hydrophobic interaction. in: C.B. Anfinsen, J.T. Edsall, F.M. Richards, D.S. Eisenberg (Eds.), Advances in Protein Chemistry, Academic Press, Cambridge, 1988, pp. 191-234.
[13] B.A. Shirley, P. Stanssens, U. Hahn, C.N.
Pace, Contribution of hydrogen bonding to the conformational stability of ribonuclease T1. Biochemistry, 31 (1992) 725-732. https://doi.org/10.1021/bi00118a013.
[14] H.R. Bosshard, D.N. Marti, I. Jelesarov, Protein stabilization by salt bridges: concepts, experimental approaches and clarification of some misunderstandings. Journal of Molecular Recognition, 17 (2004) 1-16. https://doi.org/10.1002/jmr.657.
[15] R. Nelson, M.R. Sawaya, M. Balbirnie, A.Ø. Madsen, C. Riekel, R. Grothe, D. Eisenberg, Structure of the cross-β spine of amyloid-like fibrils. Nature, 435 (2005) 773-778. https://doi.org/10.1038/nature03680.
[16] D.T. Jones, Progress in protein structure prediction. Current Opinion in Structural Biology, 7 (1997) 377-387. https://doi.org/10.1016/S0959-440X(97)80055-3.
[17] J. Lee, P.L. Freddolino, Y. Zhang, Ab initio protein structure prediction. in: D.J. Rigden (Ed.), From Protein Structure to Function with Bioinformatics, Springer, Dordrecht, 2017, pp. 3-35.
[18] S. Kirkpatrick, C.D. Gelatt Jr., M.P. Vecchi, Optimization by simulated annealing. Science, 220 (1983) 671-680. https://science.sciencemag.org/content/220/4598/671.
[19] T.J. Ewing, I.D. Kuntz, Critical evaluation of search algorithms for automated molecular docking and database screening. Journal of Computational Chemistry, 18 (1997) 1175-1189. https://doi.org/10.1002/(SICI)1096-987X(19970715)18:9<1175::AID-JCC6>3.0.CO;2-O.
[20] J. Lee, New Monte Carlo algorithm: Entropic sampling. Physical Review Letters, 71 (1993) 211-214. https://doi.org/10.1103/PhysRevLett.71.211.
[21] S. Gopi, A. Aranganathan, A.N. Naganathan, Thermodynamics and folding landscapes of large proteins from a statistical mechanical model. Current Research in Structural Biology, 1 (2019) 6-12. https://doi.org/10.1016/j.crstbi.2019.10.002.