NMRDocumentation

advertisement
NMR Documentation
This document describes how the programs are to be used. The inputs and outputs are described. The
example output is given in the directory Example. These programs should be of use to anybody who
wants to use trajectories.
Programs

Inputs and outputs

Example command lines for each of the programs

Bash files with example commands

Example output

Interpretation of the output

Documentation explaining the theory of the calculations of the programs
The 3J programs are
-
3J_coupling_calculation
-
3J_coupling_check
The noe programs are
-
noe_calculation
-
noe_check
There is also a program that calculates the text files to calculate the noe at any mixing time
-
noecalculation
There is a program to examine the trajectory
-
image_of_trajectory
The relaxation rate programs are
-
relaxation_rate
-
particular_relaxation_rate
-
particular_relaxation_rate_no_tumbling
There are also bash files for these programs.
3J_coupling_calculation
This program calculates the average 3J couplings of a trajectory. The 3J coupling of each frame is
calculated of the atom-atom-atom-atom, than it averages over all the frames. The default output is the
C-…-…-C, C-…-…-H, H-…-…-C, H-…-…-H 3-bonds. The atoms are adjacent.
-
Types of coefficients and functions
The couplings are found as a function of the coupling function, e.g. Acos*cos+Bcos+C. The program
uses a file called fileequation.txt which has the information of the 3J couplings. This file is parsed for
each of the calculations to determine the 3J coupling coefficients. The user can add to the file any
coefficients for any type of 3J coupling. The file has all the coefficients of the 3J couplings for each of
the 4 adjacent atoms.
An example of coefficients for a ‘type’ of 3J coupling is in fileequation.txt
Karplus
C
C
C
C
3.49
0
.16
The program will open the fileequation.txt file, look to find if the user type input (e.g. Karplus) is found,
than parse the information for the 1-4 atoms and coefficients. In this example, if the user wanted a
Karplus type of 3J coupling for the C-C-C-C then the coefficients of A=3.49, B=0, C=.16 are used in the 3J
coupling calculation.
-
Theory of the 3J coupling calculation
The paper “Developments in the Karplus equation as they relate to the NMR coupling constants of
carbohydrates,” Advances in Carbohydrate Chemistry and Biochemistry, v.62, pp.17-82 describes the
theory of the 3J couplings. The fileequation.txt as a default has the coefficients of the Karplus type from
that paper. The paper is included in the download.
Figure 1 shows the 3J coupling as a function of angle for the Karplus type. There are two angles which
give the same 3J coupling. This makes it difficult to use the 3J coupling for determination of structure
unless noe’s are also used.
-
Command line
The most basic usage of the program uses
./3J_coupling_calculation file.top file.crd Type out-dir samplefrequency
(1)
There are also the options of using the command line of
./3J_coupling_calculation file.top file.crd Type out-dir samplefrequency atomType1 atomType2
atomType3 atomType4
./3J_coupling_calculation file.top file.crd out-dir samplefrequency atomType1 atomType2
atomType3 atomType4 coeff1 coeff2 coeff3
-
Input
The inputs are described.
Topology file
-
file.top – topology file. This is an amber .prmtop file.
Trajectory file
-
file.crd – trajectory file. This is a trajectory file .crd.
Type of 3J calculation
-
Type – type of coupling – string, e.g. Karplus. This is the type of Karplus coupling as found in
fileequation.txt.
Output directory of the files
-
out-dir – the directory of the output, e.g. test. This is the directory in which the output will be
given.
Sampling – 1 frame out of samplefrequency frames is used. This can be important to uncorrelate the
motion.
-
samplefrequency
In case the second optional command line is used the atom types are given as inputs. The third optional
command line does not specify a type of Karplus coefficients. The user specifies coeff1, coeff2, coeff3 as
inputs. This option could be useful if somebody would like to test different coefficients.
-
Output
The use of the command line (1) will produce files fileCC.txt, fileCH.txt, fileHH.txt. The 1-4 adjacent
atoms with type [C=atom1, any=atom4; any=atom1, C=atom4], [C=atom1, any=atom2, any=atom3, Hatom4], [H=atom1, any=atom2, any=atom3, C=atom4], [H=atom1, any=atom2, any=atom3, H=atom4]
are used to find all the 3J couplings of the molecule.
Example output of the proton-proton 3J coupling is
6 5 7 8 : H-C-C-H
3YA:H1 3YA:C1 3YA:C2 3YA:H2
3.44614
The first four numbers are the atom numbers. The atomTypes are given. The second line says what the
residues and atom names of the four atoms. The nomenclature uses the amber format from the
topology file. The last number is the average 3J coupling using the coefficients that the user wanted
from the fileequation.txt.
3J_coupling_check
In order to compare theory with experiment it is necessary to know if the calculations are correct. This
program will examine the correlation of 3J couplings as a function of the time of the trajectory. The
program will output information that can be used to tell you about the 3J couplings of the 1-4 atoms as
a function of time. It calculates the ‘std-mean’ of the 3J couplings. It also calculates the average of the
3J coupling as a function of the trajectory time.
These calculations are important to determine if the trajectory is not behaving the way it should. An
example would be if the trajectory was made with unnatural constraints. This would show in these
calculations in the statistics of the 3J coupling as a function of the time of trajectory.
These calculations can also show processes that are present in the motion of the molecule. These could
be important.
-
Theory of the standard deviation in the mean
Standard deviation about an average calculation does not make sense if the information does not satisfy
the central limit theorem. Statisticians use different techniques to examine sets of information which is
correlated. The program uses a binning of the information of the 3J coupling as a function of the
trajectory time to calculate the ‘standard deviation in the mean,’ which is a good way to find out how
precise the calculated 3J coupling is. Two powerpoint presentations are given which describe the
calculations that the program does. The figures that the program give should ‘flatten’ for large number
of bins. If the figures don’t flatten, then the trajectory should be examined.
Example output of the program is given in figure 1 and figure 2 which show
-
Bad trajectory
-
Unusual phenomena such as several processes of different timescales affecting the 3J average
calculation
The program uses the 3J couplings as a function of the time units than bins them into some number of
bins which is specified by the user (bininitial to binfinal). The standard deviation/sqrt(number of bins) of
the 3J couplings of the bins is found, as a function of the number of bins. Then a curve fit is done of the
standard deviation/sqrt(number of bins) as a function of the number of bins (e.g., a+bx, x=number of
bins). First a line is used. Then it uses a quadratic, then a cubic, ..., until the chi square is accurate.
-
Command line
./adjacentfit file.top file.crd type samplefrequency atomResidue1 atom Name1 atomResidue2
atomName2 atomResidue3 atomResidue4 bininitial binfinal point font
-
Input
Topology file
-
file.top – topology file
Trajectory file
-
file.crd – trajectory file
Type of 3J coupling, e.g. Karplus
(2)
-
type – type of 3J coupling coefficients, e.g. Karplus
Samples 1 out of samplefrequency frames
-
samplefrequency
AtomResidues, AtomNames
-
atomResidue1 – first atom residue name, uses the topology file
-
atomName1 – first atom name
-
same for all the atomResidue, atomName
Bin information
-
bininitial, binfinal – the calculation will use the # of bins from bininitial to binfinal, e.g. 2, 100
Figures show this many points
-
point – the figures will show point of the calculation. The figures should not show 100,000 points of
information, typical is 500 for a figure. The program calculates the 3J coupling using the
samplingfrequency. The figures only show point data points. E.g., 100,000 frames,
samplefrequency=10, point=500. 100,000/10=10,000 frames. The figure of the 3J coupling will
show the 3J coupling at every 200 frames.
Output type of figures
-
font – jpeg, eps, epslatex, pdf to be used in the figures
-
Output
The program will produce a figure of the 3J coupling as a function of the trajectory time for the four
atoms, a figure of the std-mean as a function of bins, the file of the binning calculation, text file of the 3J
coupling as a function of the trajectory time, and a text file of the std-mean as a function of bins.
The information of these files can be used to interpret the 3J calculation.
noe_calculation
This program calculates the noe’s. This calculation is the most time-consuming of the programs. The
program will give the noe intensities for all the mixing times specified by the user. These intensities
have to be normalized to compare with experiment. The output of this program is what the
experimentalists measure.
-
Theory of calculation
The calculation of a noe between two atoms involves all the atoms in the molecule if the complete
relaxation rate formalism is used. That is the reason the calculation takes time. The complete relaxation
rate matrix can be large.
This is an abstract of a presentation which explains the use of the program.
NOEs from cross-relaxation between proton pairs provide essential structural information on both large
and small molecules. For rigid molecules, the analysis can be straightforward using the inverse sixth
power and isolated spin-pair approximation. However, for flexible molecules that have multiple
conformational states, such as oligosaccharides, small peptides or intrinsically disordered proteins,
simple 1/r^6 averaging over conformational states may not provide structurally relevant models.
Although formal treatments of this situation have been known for some time, it is difficult to apply
these experimentally. One way is to make use of advances in computational power to generate long
molecular dynamics trajectories. These calculated molecular motions can be used to generate
correlation functions for dipolar interactions and then treat the system with a full relaxation matrix
approach.
In general the correlation function has two 1/r^3 terms that are typically combined into 1/r^6 averaging,
assuming motions are not correlated. It is not uncommon for small molecules to have internal motions
such that the effective inter-nuclear distance is between the centers of motional range of the two
nuclei. However, the steep 1/r^6 averaging will heavily weight the positions of closest approach. Time
points in an MD trajectory can be used to calculate the correlation function directly, if the trajectory is
long enough to sample a uniform distribution of molecular orientations and conformations that would
average the NMR data. For small molecules such as a disaccharide, these simulations would be on the
order of hundreds of nanoseconds or millisecond range.
The paper “Influence of Rapid Intramolecular Motion on NMR Cross Relaxation Rates: A Molecular
Dynamics Study of Antamide in Solution,” R. Bruschweller, B. Roux, M. Blackledge, C. Greisenger, M.
Karplus, R.R. Ernst, J. Am. Chem. Soc. 1992, 114, 2289-2302 is used for the formulation of the
calculation.
-
Comparing the output of the program with experimental information
The model which is used should produce noe intensities which satisfy
𝑛𝑜𝑒1 < 1⁄𝑟16 >
=
𝑛𝑜𝑒2 < 1⁄ 6 >
𝑟2
(3)
Noe1 and noe2 are the noe intensities of two spin pairs. The expectation values are the average of <
1⁄ > of the trajectory. The noe1 and noe2 intensities are calculated. These are measurable. The
𝑟6
program also calculates the averages of < 𝑟 > and < 1⁄ 6 >, which is included in file_distance.txt.
𝑟
The second noe is used to normalize to compare theory with experiment. This noe is usually found from
a spin pair inside a ring, or monomer. The first noe could be from a spin pair across the glycosidic
linkage. The second noe has motion which does not change in conformations, i.e. this spin par is inside
a ring.
The output of the program can be used to compare theory with experiment by normalizing the noe of
interest with the second spin pair.
An example is given from a GAG. The H3-H5 spin pair in a ring is used to normalize the H1-H3 glycosydic
spin pair H1-H3. The simulations are bad by a factor of 3.0 which is about 10% in the interproton
distance if the equation (3) is used.
Experiment
noe
Simulation
H1-H3
H3-H5
3.6057
8.9491
.4029
H1-H3
H3-H5
.06487
.04828
1.3436
intensity
This molecule has two conformations of the rings containing the H1 and the H3. The molecular
dynamics simulation in general is known not to be accurate in the population of the conformations. The
H1-H3 noe could change to be experimentally correct if the population of these two conformations
changed by a factor of 2. Factor of 2-3 is about 10% in the inter-proton distance of the H1-H3 if the sixth
root is taken.
-
noe calculation
Figure 3 shows the steps necessary to calculate the noe’s. The topology file is used to simulate the
molecule. The correlation function of the spin pair is calculated. This gives information of the tumbling.
The fourier transform of the correlation function is calculated. The fourier transform is used to calculate
the relaxation rates at some strength of magnetic field. The rho and sigma’s are then used in the
complete relaxation matrix, which is exponentiated to find the noe at any mixing time.
Figure 3
.
-
Command line
./noe_calculation file.top file.crd totaltime displaytime samplefrequency bfieldz fourier
total_time unit distanceatom font all-notall dir-out
-
Input
Time of simulation,
-
double totaltime. This is in nanoseconds.
Correlation function is displayed for this time
-
double displaytime.
Sampling – 1 frame out of samplefrequency is used to calculate the correlation function
-
int samplefrequency
Magnetic field in Tesla
-
double bfieldz
Fourier transform for display – don’t show all points
(3)
-
double fourier
Mixing time is from 0 to total_time ms.
-
double total_time
The figures of the build-up curves show the information in units of unit ms
-
double unit
The spin pairs which are separated by at most distanceatom are included in the complete relaxation
matrix. Most noe’s are not measurable if the inter-dstance is less than 5 angstroms.
-
double distanceatom
Output is either all the noes within some distance or particular spin pairs. The output could be >1000
files if the molecule has >100 atoms. all means all the information of the spin pairs are the output. If
notall is used, than a file called atom_file.txt is used to output the noe information of those spin pairs.
-
argv[11]=notall,all
The output
-
dir-out = argv[12]
Font of the figures, jpeg, eps, epslatex, pdf
-
font=argv[13]
-
Output
The output of the program will give many files. Each of the noe’s which are calculated will give a figure
of the build-up curve and a txt file of the build-up curve points.
Several files are also given which contain the information of the cross-relaxation-matrix, the distances of
the spin pairs, and the tumbling times of the spin pairs. These are called
-
cross_relaxation.txt
-
file_distance.txt
-
file_tumbling.txt
Cross_relaxation.text has all the information of the rhos and sigmas. File_distance.txt has the distance
information of < 𝑟 >, < 1⁄ 6 >,
𝑟
−1⁄6
√1⁄𝑟 6 . File_tumbling has the tumbling in nanoseconds of the spin
pairs. Curve fitting of the correlation function is used to determine the tumbling time. An exponential fit
is used.
Figure 4 shows a build-up curve which is calculated from the complete relaxation matrix. The text file
noecurve_OME_H3-OME_H1.txt contains all the data points. The build-up curve is evaluated at some
mixing time, e.g. 400 milliseconds, to compare theory with experiment.
Figure 4
noecheck
This program checks the noe calculation. The program will calculate the average vector distance of a
spin pair, which should be 0 if the molecule is tumbling isotropically. It calculates the inter-distance of a
spin pair which is used for conformation identification. The program also calculates the std-mean of
these quantities to determine if any correlation or unusual phenomena is present of the spin pair.
There are several reasons to check the noe calculation,

bad trajectory

unusual phenomena in the dynamics of the molecule

the noe is sensitive to the population of conformations
If the trajectory was not made correct then the correlation function used in the calculation of the noe
may be bad behaved. A non-isotropic box with the wrong box size or an isotropic box with the wrong
size could make the correlation function not compatible with the model used to calculate the noe’s.
The dynamics of the molecule may be interesting. The statistics of the 3J coupling shows sometimes
that there are processes of different time-scales. These processes could affect the noe calculation.
The conformations will add differently to the spin pair noe calculation. The conformations of the spin
pair of minimal distance will add most to the noe calculation.
These calculations can tell you how precise the trajectory noe calculation is. The measurements are
usually 10%. The program will also show how to change the population of the conformations to agree
with experiment.
-
Command line
./noe_check file.top file.crd totaltime samplefrequency residue1 atom1 residue2 atom2 point out-dir
font
-
Input
Time of the simulation, nanoseconds.
-
double totaltime
Sample frequency – 1 out of samplefrequency frames used to calculate
-
int amplefrequency
Atom names, residues.
-
string residue
-
string atom
Point shows this many points in the figures
int point
Out-dir
string arv[10]
test –figures are jpeg, eps, epslatex, pdf
string font;
-
Output
The program gives information of the distances, the average distances, and the std-mean of the
calculation of the distances for the spin-pair.
-
Distances:
The program calculates the distance between the spin pairs. The vector of the spin pair used. The point
of the second atom minus the point of the first atom is the vector. There is an x-component of the
vector, a y-component, and a z-component. Also there is the magnitude of the vector distance. The
output is a figure of these 4 quantities as a function of the trajectory time. The text file with the data
points are given in case the figure is not in the form which is wanted; you can replot using this
information.
The information can be used to determine if the molecule is tumbling isotropically. The averages of
these components are given in the figures. Isotropic tumbling should give an average of zero in the
vector components. It can be the case if the simulation was done wrong that the average won’t be zero,
than the figures and the .txt files will show what is wrong.
Figure 5 shows the x-direction for a gag spin pair, UYA:H3 to OME:H1.
Average distances
The average vector distance of the spin pairs should average to zero if the molecule is tumbling
isotropically. The program calculates the average vector distance. This is calculated by: at each time of
a trajectory it uses all previous time units to calculate the average vector distance. The average
distances are shown in figures for each of the components. The program also calculates the average of
the magnitude of the vector distance as a function time. Figures are given also the text file with all the
Figure 7
data points of the figure. Figure 7 is an example. The average goes to zero.
-
Std-mean correlation
The correlation of the motion is calculated using the std-mean. This is done for the x-component, ycomponent, z-component, and magnitude of the distance. The output is the figure of the binning, the
text file of the data points, and the binning information. This is done for all vector components and the
magnitude of the vector of the spin pair.
image_of_trajectory
This program calculations the inter-distance of a spin pair and shows it in a figure. The program will do
this for all trajectories in the directory.
-
command line
.image_of_trajectory topolog trajectory.crd in-dir totaltime out-dir points
-
Inputs
Topology is the topology file of the trajectory.
-
Toology
Trajectory is the trajectory .
-
Trajectory
In-dir i- of the topology/trajectory.
Totaltime is the time of the simulation.
-
Totaltime
Out-dir – of the calculation.
Points is the number of points shown in the trajectory of the totaltime.
-
Output
The program calculates the inter-distances of the spin pairs. It will give images and text.
relaxation_rate
This program takes in the topology file and the trajectory file to calculate relaxation rates.
Command line
./relaxation_rate file.top file.crd totaltime displaytime samplefrequency bfieldz fourier
distanceatom font all-notall dir-out
-
Input
Time of simulation,
-
double totaltime. This is in nanoseconds.
Correlation function is displayed for this time
-
double displaytime.
Sampling – 1 frame out of samplefrequency is used to calculate the correlation function
-
int samplefrequency
Magnetic field in Tesla
-
double bfieldz
Fourier transform for display – don’t show all points
-
double fourier
The spin pairs which are separated by at most distanceatom are included in the complete relaxation
matrix. Most noe’s are not measurable if the inter-dstance is less than 5 angstroms.
-
double distanceatom
Output is either all/not-all the relaxation rates within some distance or particular spin pairs. aAll means
all the information of the spin pairs are the output. If notall is used, than a file called atom_file.txt is
used to output information of those spin pairs.
-
argv[11]=notall,all
The output
-
dir-out = argv[12]
Font of the figures, jpeg, eps, epslatex, pdf
-
font=argv[13]
Output –
The file relaxation_rate.txt contains all the proton and carbon relaxation rates.
particular_relaxation_rate
This program takes in the topology file and the trajectory file to calculate relaxation rates of a spin pair.
Command line
./particular_relaxation_rate file.top file.crd totaltime displaytime samplefrequency bfieldz fourier
distanceatom font dir-out residue1 atom1 residue2 atom2
-
Input
Time of simulation,
-
double totaltime. This is in nanoseconds.
Correlation function is displayed for this time
-
double displaytime.
Sampling – 1 frame out of samplefrequency is used to calculate the correlation function
-
int samplefrequency
Magnetic field in Tesla
-
double bfieldz
Fourier transform for display – don’t show all points
-
double fourier
The spin pairs which are separated by at most distanceatom are included in the complete relaxation
matrix. Most noe’s are not measurable if the inter-dstance is less than 5 angstroms.
-
double distanceatom
Output is either all/not-all the relaxation rates within some distance or particular spin pairs. aAll means
all the information of the spin pairs are the output. If notall is used, than a file called atom_file.txt is
used to output information of those spin pairs.
-
argv[11]=notall,all
Font of the figures, jpeg, eps, epslatex, pdf
-
font=argv[2]
The output
-
argv[13]=out-dir
Output –
The file carbon_relaxation_rate.txt and proton_relaxation_rate.txt contains all the proton and carbon
relaxation rates. There are also figures of the correlation function and fourier transform. There are
also text files.
particular_relaxation_rate_no_tumbling
This program takes in the topology file and the trajectory file to calculate relaxation rates of a spin pair
of a correlatin function of the internal motion. The program will average the correlation function of all
directions of the B-field.
Command line
./particular_relaxation_rate file.top file.crd totaltime displaytime samplefrequency bfieldz fourier
distanceatom font dir-out residue1 atom1 residue2 atom2
-
Input
Time of simulation,
-
double totaltime. This is in nanoseconds.
Correlation function is displayed for this time
-
double displaytime.
Sampling – 1 frame out of samplefrequency is used to calculate the correlation function
-
int samplefrequency
Magnetic field in Tesla
-
double bfieldz
Fourier transform for display – don’t show all points
-
double fourier
The spin pairs which are separated by at most distanceatom are included in the complete relaxation
matrix. Most noe’s are not measurable if the inter-dstance is less than 5 angstroms.
-
double distanceatom
Output is either all/not-all the relaxation rates within some distance or particular spin pairs. aAll means
all the information of the spin pairs are the output. If notall is used, than a file called atom_file.txt is
used to output information of those spin pairs.
-
argv[11]=notall,all
Font of the figures, jpeg, eps, epslatex, pdf
-
font=argv[2]
The output
-
argv[13]=out-dir
Output –
The file carbon_relaxation_rate.txt and proton_relaxation_rate.txt contains all the proton and carbon
relaxation rates. There are also figures of the correlation function and fourier transform. There are
also text files.
Download