Rotation Report: An introduction to Immunology and Molecular Dynamics
Denise Chac
25 November 2014

Contents:
I. Expectations for Rotation
II. Immunology
III. Molecular Dynamics
IV. Protocols for simulating HIV – gp120 in water

I. EXPECTATIONS FOR ROTATION

At the start of this rotation, I had no background experience in immunology or molecular dynamics. Entering Dr. Ha Youn Lee’s lab, I expected to learn the basics of immunology with regard to HIV and to gain an understanding of molecular dynamics. Because I lacked wet-lab experience, I wanted to focus on the computational side of the lab. Although I had only basic knowledge of math and physics, I hoped to improve those skills by learning molecular dynamics.

Goals:
1) Gain a basic understanding of immunology
2) Acquire the skills to set up and run molecular dynamics simulations

II. IMMUNOLOGY

Immunology is the study of how our bodies react to foreign pathogens such as bacteria and viruses. The body responds using two different systems: the innate immune system and the adaptive immune system. The innate immune system is always available and can readily combat a wide range of pathogens. While it is the more rapid system, it lacks specificity. The adaptive immune system, which is the more specific and effective form of immunity, requires previous encounters with the pathogens.

One way in which our immune system reacts to pathogens is through white blood cells, specifically lymphocytes. The lymphocytes comprise B cells, T cells, and natural killer (NK) cells.

B Cells

B cells derive from the bone marrow and carry membrane-bound antibodies on their cell surface to identify antigens. All B cells have antibodies on their cell surface, but the types of antibodies are highly variable. This variation is essential for identifying the different pathogens that invade the body.
B cells are activated when they encounter their matching antigen through the variable antibody. The antibodies on the B cell bind to an area on the antigen called the epitope. Once the B cell binds to the antigen, it is activated and begins to engulf the pathogen; the pathogen is digested and the resulting peptides are presented on the cell surface by MHC (major histocompatibility complex) class II proteins. These MHC proteins are important for the activation of other immune responses. Before the B cell can begin replicating and differentiating, the MHC must attract the help of mature helper T cells. Once it does, the naïve B cell differentiates into memory cells and effector cells. Effector cells are plasma cells that produce free antibodies, which recognize the original pathogen and tag it for further immune response. Memory B cells are clones of the original B cell that carry the same antibody variant and can recognize the original pathogen.

T Cells

T cells also derive from the bone marrow, but they mature in the thymus. They differ from B cells in that they have T-cell receptors that recognize different MHC classes. There are two types of T cells: cytotoxic and helper T cells.

Cytotoxic T cells are capable of identifying class I MHC on any antigen-presenting cell. They have a CD8 protein that further helps the T cell bind to the target cell. Once the T cell binds to its target cell, it begins differentiating into effector and memory cells. Effector cells are capable of latching onto cancerous or “bad” cells through the class I MHC protein presented on the cell surface. Once an effector cell is bound, it releases proteins that lyse the target cell's membrane, resulting in cell death.

Helper T cells, on the other hand, are attracted to class II MHC found on B cells. They have CD4 proteins that help the helper T cell bind to the target cell. More generally, helper T cells are attracted to professional antigen-presenting cells.
Once helper T cells attach to their target cell, they release cytokines and differentiate into effector and memory cells. Cytokines are a general group of proteins that help increase cell signaling and recruit other proteins to fight the pathogens.

Viruses

Viruses are unique in that they use the cell’s own machinery to reproduce and proliferate. They act differently from bacteria and other pathogens because they are capable of incorporating their own genetic material into the host or target cell. The virus first fuses its membrane with the target cell. Once the membranes are fused, the virus releases its contents into the host cell, and the viral genetic material is integrated into the host DNA. Once the viral DNA is combined with the host DNA, the virus can be replicated whenever the host cell replicates.

As for the other remnants of the virus, viral peptides are transported into the ER lumen. Each viral peptide is then bound to an MHC protein and transported through the Golgi apparatus. The MHC protein and the attached viral peptide are then presented on the cell surface. Once the MHC class I protein and antigen peptide are presented, cytotoxic T cells can recognize the infection.

HIV

The Human Immunodeficiency Virus (HIV) is a retrovirus that attacks key components of the immune system. The HIV virion first binds to CD4 proteins on T cells through the membrane envelope protein gp120. Once it is bound, another envelope protein, gp41, helps the HIV envelope fuse with the target cell’s membrane. Once the membranes are fused, the HIV contents are released into the cytosol of the target cell. Reverse transcriptase transcribes the HIV RNA into DNA, which integrase then incorporates into the target cell’s DNA. When the target cell (usually a T cell) becomes activated, the HIV provirus (the portion of the viral DNA incorporated into the host genome) is transcribed and replicated.

MORE BACKGROUND

III.
MOLECULAR DYNAMICS

While in vitro and in vivo experiments are the conventional way to test hypotheses, technological advances now make it possible to perform biological experiments in silico. Molecular dynamics is a powerful tool that allows biologists to observe the behavior of molecules in a simulated environment. Through th

IV. CASE STUDY: GP120

The envelope glycoprotein gp120 is an essential component of HIV that determines the docking of the HIV virion to the targeted T cell. On the cell surface, gp120 attaches to the CD4 protein on the T cell, which enables the HIV virion to dock and then continue its life cycle by fusing its membrane and releasing the viral contents into the targeted cell.

Simulating gp120 using Molecular Dynamics

In this project, I (1) minimized the gp120 protein in water to obtain the lowest energy conformation, (2) heated the structure from 0 K to 300 K, and (3) ran a simulation of gp120 at a constant temperature of 300 K and a constant pressure of 1 atm for 60 ps.

V. PROTOCOLS

Necessary Programs:
o Terminal Box / Command Box
o X11 (for Mac)
o AMBER
    tLEaP – basic model building
    SANDER – Simulated annealing with NMR-derived energy restraints
    Ptraj – program to analyze trajectories
o Gedit – text editor
o Gnuplot – program to graph data
o VMD – Visual Molecular Dynamics

1. Loading gp120

To build the molecule, you need the LEaP program provided by the AMBER software package. This program provides the tools to build the system/model and then produce the Amber coordinate and parameter/topology input files. After opening the LEaP program, the force field must be specified. The molecule can be built from a sequence ($ foo = sequence { ACE ALA NME } ) or loaded from a PDB file ($ foo = loadpdb file.pdb ). Solvating the structure is also possible in LEaP.
For this project, the gp120 protein was solvated in a periodic box of water with a 10.0 Angstrom buffer (between the protein and the periodic box wall). After building the molecule, the system must be saved in the file formats Amber expects: prmtop and inpcrd.
o prmtop = parameter and topology file
o inpcrd = coordinate file

COMMANDS
o $ tleap
  $ source leaprc.ff03
  $ gp120 = loadpdb 1MEQ.pdb
  $ solvatebox gp120 TIP3PBOX 10.0
  $ saveamberparm gp120 prmtop inpcrd
  $ quit

2. Scripting input files
o Once all start files are obtained (prmtop and inpcrd), the next step is to make the input files that tell the MD program what to do to the system. To create these scripts, the simple text editor gedit was used.

Minimization
o This script is used to obtain the lowest energy structure.
o Contents:
    imin = 1        this dictates that it is a minimization run
    ntx = 1         read the coordinates but not the velocities from the ASCII formatted inpcrd coordinate file
    irest = 0       do not restart simulation (not applicable to minimization)
    maxcyc = 2000   maximum minimization cycles
    ncyc = 1000     use the steepest descent algorithm for the first ncyc cycles (0 to ncyc), then switch to the conjugate gradient algorithm for the remaining cycles (ncyc to maxcyc)
    ntpr = 100      print to the Amber mdout output file every ntpr cycles
    ntwx = 0        no Amber mdcrd trajectory file written (not applicable to minimization)
    cut = 8.0       nonbonded cutoff distance in Angstroms

Gedit Script – Minimization

    Minimize
     &cntrl
      imin=1,
      ntx=1,
      irest=0,
      maxcyc=2000,
      ncyc=1000,
      ntpr=100,
      ntwx=0,
      cut=8.0,
     /

Heat
o This script is used to heat the system from 0 K to 300 K while maintaining a constant number of particles and volume.
o Contents:
    imin = 0        dictates the type of run: molecular dynamics (it is not a minimization run)
    nstlim = 10000  number of MD steps in the run
    dt = 0.002      time step in picoseconds (ps)
    ntf = 2         do not calculate forces for SHAKE-constrained bonds
    ntc = 2         enable SHAKE to constrain all bonds involving hydrogen
    tempi = 0.0     initial thermostat temperature in K
    temp0 = 300.0   final thermostat temperature in K
    ntwx = 100      write to the Amber mdcrd trajectory file every ntwx steps
    ntb = 1         periodic boundaries for constant volume
    ntp = 0         no pressure control
    ntt = 3         temperature control with Langevin thermostat
    gamma_ln = 2.0  Langevin thermostat collision frequency
    nmropt = 1      NMR restraints and weight changes read
    ig = -1         randomize the seed for the pseudo-random number generator
o Final steps: the &wt entries at the end of the script set the heating schedule. For steps 0-9000, increase the temperature from 0 K to 300 K. For steps 9001-10,000, keep the temperature constant at 300 K.

Gedit Script – Heat

    Heat
     &cntrl
      imin=0,
      ntx=1,
      irest=0,
      nstlim=10000,
      dt=0.002,
      ntf=2,
      ntc=2,
      tempi=0.0,
      temp0=300.0,
      ntpr=100,
      ntwx=100,
      cut=8.0,
      ntb=1,
      ntp=0,
      ntt=3,
      gamma_ln=2.0,
      nmropt=1,
      ig=-1,
     /
    &wt type='TEMP0', istep1=0, istep2=9000, value1=0.0, value2=300.0 /
    &wt type='TEMP0', istep1=9001, istep2=10000, value1=300.0, value2=300.0 /
    &wt type='END' /

Production
o This script is used to run the simulation of gp120 in the periodic solvent box at a constant temperature of 300 K and a constant pressure of 1 atm.
o Contents:
    ntx = 5         read both coordinates and velocities from the input coordinate file
    irest = 1       restart the previous MD run; velocities from the inpcrd file will be used for the initial atom velocities
    temp0 = 300.0   thermostat temperature set at 300 K
    ntb = 2         use periodic boundary conditions with constant pressure
    ntp = 1         use the Berendsen barostat for constant pressure simulation

Gedit Script – Production

    Production
     &cntrl
      imin=0,
      ntx=5,
      irest=1,
      nstlim=30000,
      dt=0.002,
      ntf=2,
      ntc=2,
      temp0=300.0,
      ntpr=100,
      ntwx=100,
      cut=8.0,
      ntb=2,
      ntp=1,
      ntt=3,
      gamma_ln=2.0,
      ig=-1,
     /

To calculate the run length, take the number of MD steps (nstlim) and multiply it by the MD time step (dt).
o Heat:
    nstlim = 10,000   run simulation for 10,000 MD time steps
    dt = 0.002        MD time step is 0.002 ps
    Length of simulation: 10,000 * 0.002 = 20 ps
o Production:
    nstlim = 30,000   run simulation for 30,000 MD time steps
    dt = 0.002        MD time step is 0.002 ps
    Length of simulation: 30,000 * 0.002 = 60 ps

Commands
o $ gedit Min.in
o $ gedit Heat.in
o $ gedit Prod.in
o Once the program opens in a separate window, create each script by pasting in the contents listed above and saving the file.

3. MD Simulation
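
Before launching any runs, it can help to sanity-check the numbers in the input files. The short Python sketch below recomputes the run lengths (nstlim * dt) for the heating and production scripts, and the thermostat target implied by the two TEMP0 weight changes in the heating script. It is an illustration only and not part of the Amber workflow: the names HEAT_SCHEDULE, run_length_ps, and target_temperature are hypothetical, and the sketch assumes the target temperature changes linearly between istep1 and istep2.

```python
# Illustrative helpers only; these names are hypothetical and not part of Amber.

# One tuple per TEMP0 weight change in the heating script:
# (istep1, istep2, value1, value2)
HEAT_SCHEDULE = [
    (0, 9000, 0.0, 300.0),
    (9001, 10000, 300.0, 300.0),
]


def run_length_ps(nstlim, dt):
    """Simulated time in ps: number of MD steps times the time step in ps."""
    return nstlim * dt


def target_temperature(step, schedule=HEAT_SCHEDULE):
    """Thermostat target (K) at an MD step, assuming a linear ramp per segment."""
    for istep1, istep2, value1, value2 in schedule:
        if istep1 <= step <= istep2:
            fraction = (step - istep1) / (istep2 - istep1)
            return value1 + fraction * (value2 - value1)
    raise ValueError(f"step {step} is outside the heating schedule")


if __name__ == "__main__":
    print("Heat run length (ps):", run_length_ps(10000, 0.002))
    print("Production run length (ps):", run_length_ps(30000, 0.002))
    for step in (0, 4500, 9000, 10000):
        print("step", step, "target temperature (K):", target_temperature(step))
```

Running the script prints the two run lengths and the target temperature at a few steps of the heating run, which can be compared against the values quoted for the heating and production scripts.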