ZENO -- A PROGRAM FOR COMPUTING HYDRODYNAMIC PROPERTIES OF MACROMOLECULES.

Marc L. Mansfield
Department of Chemistry
Stevens Institute of Technology
Hoboken, New Jersey 07030

OUTLINE

I. INTRODUCTION
II. THE PROGRAM MODELS ARBITRARY SHAPES AS UNIONS OF SIMPLE BODY ELEMENTS
III. COMPILING AND INVOKING THE PROGRAM
IV. GRAMMATICAL RULES FOR THE BODY FILE
V. DESCRIPTION OF THE INTEGRATIONS
VI. DESCRIPTION OF THE OUTPUT DATA
APPENDIX A. Examples.
APPENDIX B. Internal dictionary (words recognized by the grammar of the body file).
APPENDIX C. Error estimates and propagation.
APPENDIX D. Random numbers.
REFERENCES AND NOTES

I. INTRODUCTION

This document describes how to use the program zeno, which computes various shape functionals (e.g., certain electrostatic and hydrodynamic properties) of macromolecules of arbitrary shape. The Fortran code is stored in the file zeno.f. Before running the program, you must create a text file containing the complete body specification, referred to hereafter as the “body file.” The program returns its results in a second text file, referred to as the “zeno file” or the “report file.” Each shape is identified to the program by a character string of at most 25 characters, represented in this document by the symbol <identifier>. The body file has the name <identifier>.bod, and the zeno file has the name <identifier>.zno.

The program performs as many as three different numerical integrations on the body. (You request only the integrations you want; you need not request all three.) The three integrations are:

1. The “Zeno” computation, a numerical path integration technique that solves Laplace’s equation for two separate boundary value problems: an isolated, charged conductor, and a conductor in an external electric field. Nz Brownian paths are initiated from a sphere of radius RL, the launch sphere, which completely encloses the body.
You specify the value of Nz, but RL is determined internally from the specification of the body. This computation determines the electrostatic capacity or capacitance, C, and the nine components of the electrostatic polarizability tensor, α. Because of analogies between the hydrodynamic and the electrostatic boundary value problems, these quantities then permit the program to estimate the hydrodynamic radius, Rh, and the intrinsic viscosity, [η].

2. The “interior” computation, a Monte Carlo integration over the interior of the body. You specify a large number Ni. The program begins generating points at random inside the launch sphere, and continues until 2Ni points are found that also lie inside the body. The volume, V, of the body is obtained as the volume of the launch sphere times the ratio of successful points to trial points. The mean-square radius of gyration of the interior of the body, Rgi^2, is obtained as one half of the mean-square distance between successive pairs of interior points.

3. The “surface” computation, a Monte Carlo integration over the surface of the body. You specify a large number Ns. The program generates 2Ns points distributed randomly over the surface, and uses these to compute the surface area, A; the Kirkwood radius, RK, defined as the harmonic mean distance between arbitrary pairs of surface points; and the mean-square radius of gyration of the surface of the body, Rgs^2, defined as one half of the mean-square distance between arbitrary pairs of surface points.

The raw values obtained from these integrations are then used to compute a number of derived quantities.

II. THE PROGRAM MODELS AN ARBITRARY SHAPE AS A UNION OF SIMPLE BODY ELEMENTS

The body is modeled as the union of simple, component body elements. Currently, eight different types of elements are recognized. In this document, we use the terms “open cylinder” and “closed cylinder” to refer to cylinders without and with ends, respectively.
(Think of a tin can: a closed cylinder is the can with the ends intact; an open cylinder is the can with both ends cut off.) Table 1 defines the body elements currently in use.

TABLE 1. SUMMARY OF BODY ELEMENT TYPES

BODY ELEMENT   SHAPE TYPE   DEFINITION
sphere         A            Set of points within a given distance of a given
                            point.
triangle       B            Three points constitute the vertices of a
                            triangle; the triangle element is the set of
                            points in the plane defined by the vertices and
                            in the interior of the resulting triangle.
disk           B            The set of points in a plane within a given
                            distance of a given point.
cylinder       B            The locus of points generated by rotating a line
                            segment about an axis to which it is parallel
                            (an open cylinder).
torus          A            The locus of points generated by rotating a
                            circle about an axis outside the circle.
lens           A            The set of points formed as the intersection of
                            two distinct spheres with different centers; the
                            spheres need not have the same radius.
ellipsoid      A            The set of points inside an arbitrarily oriented
                            and translated ellipsoid; the three axes can all
                            be distinct.
cube           A            This body element is only defined for cubes whose
                            edges are parallel to the Cartesian axes;
                            arbitrary orientations can always be achieved
                            with 12 triangles (2 for each face of the cube).

The body elements are of two types. The first type (type A) includes the elements that have three-dimensional interiors, or that consist of surfaces enclosing a region of three-dimensional space: spheres, tori, lenses, ellipsoids, and cubes. The second type (type B) includes the elements that are two-dimensional surfaces: triangles, disks, and cylinders.

In all cases, the overall shape is taken to be the union of some collection of these body elements. This permits considerable power in defining shapes. Two important examples are, first, defining a molecule as the region of space occupied by a set of overlapping spheres, and second, defining an arbitrary surface with a grid of triangles (e.g., a geodesic dome).
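
As an illustration of the union construction, the following Python sketch tests whether a point belongs to a body built from overlapping spheres, and uses that test in the hit-or-miss volume estimate outlined in Section I. It is not taken from zeno.f: the function names are hypothetical, the launch sphere is assumed to be centered at the origin, and for brevity the loop stops after n_inside interior points.

```python
import math
import random

def in_union(point, spheres):
    """True if point lies inside at least one sphere (cx, cy, cz, r)."""
    x, y, z = point
    return any((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r * r
               for cx, cy, cz, r in spheres)

def estimate_volume(spheres, n_inside, seed=0):
    """Hit-or-miss estimate of the volume of a union of spheres."""
    rng = random.Random(seed)
    # Assumed launch sphere: centered at the origin, enclosing every sphere.
    launch = max(math.sqrt(cx * cx + cy * cy + cz * cz) + r
                 for cx, cy, cz, r in spheres)
    trials = hits = 0
    while hits < n_inside:
        p = tuple(rng.uniform(-launch, launch) for _ in range(3))
        if p[0] ** 2 + p[1] ** 2 + p[2] ** 2 > launch * launch:
            continue  # outside the launch sphere, so not counted as a trial
        trials += 1
        if in_union(p, spheres):
            hits += 1
    # Launch-sphere volume times the fraction of successful trial points.
    return 4.0 / 3.0 * math.pi * launch ** 3 * hits / trials

# Two tangent spheres of radius 1 and 2; the exact volume is 12*pi.
body = [(0.0, 0.0, 0.0, 1.0), (3.0, 0.0, 0.0, 2.0)]
print(estimate_volume(body, 4000), 12 * math.pi)
```

Because a point is counted once no matter how many spheres contain it, overlapping regions are not double-counted, which is the property that makes the union definition convenient.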
The complete set of body elements used in any given specification, along with their positions and orientations, is specified in the body file. See below for the grammatical rules that must be followed in setting up the body file.

This program can consider three major shape classifications. The first are those shapes that have three-dimensional interiors, such as spheres, or a set of overlapping spheres. The second are those shapes that are two-dimensional surfaces but that are embedded in three-dimensional space, such as an open cylinder. The third are those shapes that are only two-dimensional, that exist in a plane, such as a square. (There are also hybrid shapes, for example, the union of a sphere and a square.)

Certain shape functionals, such as the electrostatic capacity, can be defined for all three classifications, and the zeno integration works successfully on all three. However, the interior integration can only be performed for shapes of the first classification, for only in this case is the interior defined. This creates a problem, because it is not always easy to program the computer to determine which of the three classifications we might have. For example, we can represent any surface as a grid of triangles, but it probably requires evaluation of certain topological invariants to determine whether the surface is in the first or the second class, and this evaluation is beyond the scope of this program. Therefore, the following rules apply:

(1) If the body contains any body elements of type B, an interior integration will not be performed, even if one is requested.

(2) It follows that the volume of the body can be determined only if it contains only body elements of type A. It is defined as the total volume occupied by points inside the launch sphere that are also inside at least one of the body elements.
(3) The surface area of the body is defined as the area contributed by all points on the surface of each body element (of either type) that are not in the interior of any other body element of type A.

(Users should realize that Rule 3 introduces vagueness into the definition of the surface area, but only for some kinds of bodies. Imagine constructing a body as the union of a square and a sphere, but representing both as grids of triangles. Then by Rule 3, points on the square but inside the sphere will contribute to the surface area as calculated by this program.)

III. COMPILING AND INVOKING THE PROGRAM

The program was developed using the Linux f77 compiler. Therefore, it should compile without difficulty with either of the two Linux compilers f77 or f90. The following Linux commands can be used:

    f77 zeno.f -o zeno

or

    f90 zeno.f -o zeno

Either command prepares an executable file named zeno. You then issue the following command to invoke the program:

    ./zeno <identifier> <action-code-1> <action-code-2> <action-code-3>

The four strings <identifier>, <action-code-1>, etc., are accessed by the program via a call to the intrinsic Fortran subroutine getarg. The string <identifier> is the same identifier discussed above. The full body specification must be prepared beforehand and saved in the body file, which should be named <identifier>.bod. Section IV describes in detail the grammatical rules required for the body file. Output goes to a file with the name <identifier>.zno.

Each action code is a string with three parts: a one-character prefix, a set of digits in the middle, and optionally a suffix. The prefix is one of the following three characters:

    z    do a zeno integration on the body
    i    do an interior integration on the body
    s    do a surface integration on the body

Allowed suffixes are any of the three characters:

    t = thousand
    m = million
    b = billion

but these suffixes are optional.
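
The action-code format just described (prefix, digits, optional suffix) can be sketched as a small parser. This is an illustration in Python, not the code in zeno.f: the name parse_action_code is hypothetical, and the sketch assumes that only the lowercase prefixes and suffixes listed above are accepted.

```python
import re

# Multiplier for each optional one-character suffix.
SUFFIX = {"": 1, "t": 10 ** 3, "m": 10 ** 6, "b": 10 ** 9}

def parse_action_code(code):
    """Split an action code such as 'z100t' into (prefix, integration size)."""
    match = re.fullmatch(r"([zis])(\d+)([tmb]?)", code)
    if match is None:
        raise ValueError(f"not a valid action code: {code!r}")
    prefix, digits, suffix = match.groups()
    return prefix, int(digits) * SUFFIX[suffix]

for code in ("z100t", "i1b", "s5000000"):
    print(code, parse_action_code(code))
```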
A few examples demonstrate the proper format of the action codes:

    z100t       requests the zeno integration with Nz = 100 thousand.
    i1b         requests the interior integration with Ni = 1 billion.
    s5000000    requests the surface integration with Ns = 5000000.

You can specify as few as zero action codes (in which case the program takes no action) or as many as three. For example, the following invocation of the zeno program:

    ./zeno spheroid z1m i1m s1m

directs the program to get data on the body from the file spheroid.bod, to perform the zeno, interior, and surface integrations on the body, each with one million steps, and to put the results in a file named spheroid.zno. As another example, this invocation:

    ./zeno w85 i100t

directs the program to get data on the body from the file w85.bod, to perform only the interior integration with 100 thousand steps, and to put the results in a file named w85.zno.

IV. GRAMMATICAL RULES FOR THE BODY FILE

You use the body file to give the full specification of the body to the program. The name of the body file is <identifier>.bod, where <identifier> represents the identifier string provided during the invocation of the program. As mentioned above, the body is set up as a union of simple body elements. Allowable elements are listed in Table 1.

The data in the body file consist of a series of commands, which in turn consist of a series of character strings. The strings are delimited by spaces or by carriage returns. A single line may hold at most 80 characters, so do not put more than 80 characters between carriage returns.

The first string of a command is its “predicate,” and identifies the type of command. The remaining strings in each command are “modifiers” of the predicate. The modifiers to each predicate come in a specific order following that predicate, and each predicate requires a specific number of modifiers. There are no punctuation marks flagging the end of one command or the beginning of another.
A command is defined as a valid predicate followed by the correct number of modifiers; the next string after the last modifier is the predicate of the next command. A single command, i.e., a predicate with its modifiers, can be spread across more than one line, and one command may end and another begin on the same line. However, for ease of reading by humans, you will probably want to design the body file with carriage returns between commands.

To process the file, the program looks at the first string in the file. This string must be a valid predicate. If it is not, the program aborts. Then the program takes the next n strings, where n is the number of modifiers required for this particular predicate. The program also aborts if it has trouble interpreting any of the modifiers. Assuming these n strings are interpreted successfully, the program repeats, reading the next predicate and its modifiers, etc., until it encounters the end of the file.

The strings are of two types: “numeric” strings, or simply “numbers,” and “alphabetic” strings, or simply “words.” A valid numeric string is any character string that can be interpreted by the Fortran internal-read, free-format command:

    read(string,*) value

(This converts the numeric string into a floating point number.) A valid alphabetic string is one of the fifty or so words found in the program’s “internal dictionary.” All these words are given in this section, and are also summarized in Appendix B. The program knows whether to expect a word or a number based on the position of the string relative to the beginning of the command.

Any line with an asterisk in column 1 is interpreted as a comment, and is skipped over by the program. Blank lines can also be inserted for readability; these are also skipped over.

Table 2 summarizes each command type, giving the valid predicate or predicates, the valid modifiers, and the action of the command.
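
The reading procedure described above (skip comment lines, read a predicate, then read a fixed number of modifiers) can be sketched as follows. This is a hedged Python illustration rather than the program's own reader: the names N_MODIFIERS, tokens, and parse_body are hypothetical, only three ADD commands from Table 2 are included, Python's float() stands in for the Fortran free-format read (the two do not accept exactly the same strings), and the word-valued modifiers of the SPECIFY commands are not handled.

```python
# Modifier counts for a few predicates (a subset of Table 2).
N_MODIFIERS = {}
for names, count in ((("SPHERE", "sphere", "S", "s"), 4),
                     (("TRIANGLE", "triangle", "T", "t"), 9),
                     (("CUBE", "cube"), 4)):
    for name in names:
        N_MODIFIERS[name] = count

def tokens(text):
    """Yield the strings of a body file, skipping comment lines."""
    for line in text.splitlines():
        if line.startswith("*"):   # asterisk in column 1 marks a comment
            continue
        yield from line.split()    # blank lines contribute no strings

def parse_body(text):
    """Return a list of (predicate, [numeric modifiers]) commands."""
    stream = tokens(text)
    commands = []
    for predicate in stream:
        if predicate not in N_MODIFIERS:
            raise ValueError(f"invalid predicate: {predicate!r}")
        modifiers = []
        for _ in range(N_MODIFIERS[predicate]):
            try:
                modifiers.append(float(next(stream)))
            except StopIteration:
                raise ValueError("file ended inside a command") from None
        commands.append((predicate, modifiers))
    return commands

sample = """* two commands on one line, one command split over two lines
S 1 1 1 1 s -1 -1 -1 1

sphere
-3 0 0 2
"""
print(parse_body(sample))
```

Because the number of modifiers is fixed by the predicate, no end-of-command punctuation is needed, which is why commands may share a line or span several lines.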
The commands either add a body element to the growing body (ADD commands) or set the value of some variable (SPECIFY commands). Most of the commands have several synonymous predicates; e.g., the four strings “SPHERE,” “sphere,” “S,” and “s” are all valid predicates for the ADD-SPHERE command. The order of the modifiers as given below must be followed in the body file. The order of the commands is not important. Body elements can be added in any order, and the SPECIFY and ADD commands can be interspersed. See Appendix A for examples of valid body files.

Table 2. VALID BODY FILE COMMANDS

ADD-SPHERE COMMAND
    Valid predicates: SPHERE, sphere, S, s
    Modifiers: Four numbers: cx cy cz r
    Action: Adds a sphere to the list of body elements. The sphere is centered at the point (cx, cy, cz) and has radius r.

ADD-TRIANGLE COMMAND
    Valid predicates: TRIANGLE, triangle, T, t
    Modifiers: Nine numbers: v1x v1y v1z v2x v2y v2z v3x v3y v3z
    Action: Adds a triangle to the list of body elements. The three vertices of the triangle are the points (v1x, v1y, v1z), (v2x, v2y, v2z), and (v3x, v3y, v3z).

ADD-DISK COMMAND
    Valid predicates: DISK, disk, D, d
    Modifiers: Seven numbers: cx cy cz nx ny nz r
    Action: Adds a circular disk to the list of body elements. The disk is centered at the point (cx, cy, cz). The vector (nx, ny, nz) specifies the direction normal to the disk, and need not be normalized. The disk has radius r.

ADD-CYLINDER COMMAND
    Valid predicates: CYLINDER, cylinder
    Modifiers: Eight numbers: cx cy cz nx ny nz r L
    Action: Adds an open cylinder to the list of body elements. The cylinder has center (cx, cy, cz). The vector (nx, ny, nz) specifies the principal axis of the cylinder, and need not be normalized. The cylinder has radius r and length L. (NOTE: You can add a closed cylinder as one open cylinder and two disks.)

ADD-TORUS COMMAND
    Valid predicates: TORUS, torus, TO, to
    Modifiers: Eight numbers: cx cy cz nx ny nz r1 r2
    Action: Adds a torus to the list of body elements. The torus has center (cx, cy, cz). The vector (nx, ny, nz) specifies the principal axis of the torus, and need not be normalized. The two radii r1 and r2 are defined such that if the torus were in a reference frame with the center at the origin and the principal axis in the z-direction, it would be formed by revolving the circle $(x - r_1)^2 + z^2 = r_2^2$, which lies in the xz-plane, about the z-axis.

ADD-LENS COMMAND
    Valid predicates: LENS, lens
    Modifiers: Eight numbers: cx cy cz dx dy dz rc rd
    Action: Adds a lens to the list of body elements. A lens is defined as the intersection of two spheres, one sphere centered at (cx, cy, cz) and having radius rc, the other centered at (dx, dy, dz) and having radius rd.

ADD-ELLIPSOID COMMAND
    Valid predicates: ELLIPSOID, ellipsoid, E, e
    Modifiers: Twelve numbers: cx cy cz n1x n1y n1z n2x n2y n2z a b c
    Action: Adds an ellipsoid to the list of body elements. The center is at (cx, cy, cz); one axis is parallel to the vector (n1x, n1y, n1z), another to the vector (n2x, n2y, n2z). These two axis vectors need not be normalized; they will be automatically normalized by the program. However, they should be orthogonal. The third axis is determined by the program as the cross product of these two. Then a is the semiaxis along the direction (n1x, n1y, n1z), b is the semiaxis along the direction (n2x, n2y, n2z), and c is the semiaxis along the third direction.

ADD-CUBE COMMAND
    Valid predicates: CUBE, cube
    Modifiers: Four numbers: cx cy cz s
    Action: Adds a cube to the list of body elements. The cube is defined as the locus of points (x, y, z) satisfying $c_x \le x \le c_x + s$, $c_y \le y \le c_y + s$, $c_z \le z \le c_z + s$. Note, therefore, that this version only adds cubes that are aligned parallel to the Cartesian axes.
    To add cubes with other orientations, use 12 triangles, two on each face.

SPECIFY-SKIN-THICKNESS COMMAND
    Valid predicates: ST, st
    Modifiers: One number: ε
    Action: Sets the value of the skin-thickness parameter, ε.
    Note: This command is optional. A value of ε is only needed if the zeno integration is to be performed. If you omit this command, ε defaults to the value $10^{-6} R_L$.

SPECIFY-LENGTH-UNITS COMMAND
    Valid predicates: UNITS, units
    Modifiers: This command takes a single word as a modifier. It does not take number modifiers. Only one of the following five strings will be accepted as the modifier:
        m     (meters)
        cm    (centimeters)
        nm    (nanometers)
        A     (Ångström units)
        L     (generic or unspecified length units)
    Action: Sets the length unit for quantities found in the body file. All coordinates, radii, lengths, etc., must always be given in the same length units, and you use this command to specify the units.
    Note: This command is optional. If not given, the length unit defaults to L (generic or unspecified units).

SPECIFY-TEMPERATURE COMMAND
    Valid predicates: TEMP, temp
    Modifiers: The predicate must be followed by two modifiers. The first is a number; the second gives the temperature units. Valid temperature unit codes:
        C    (Celsius)
        K    (Kelvin)
    Action: Specifies the temperature.
    Note: This command is optional. It needs to be present if you want the program to compute the diffusion coefficient from the Stokes-Einstein formula.

SPECIFY-MASS COMMAND
    Valid predicates: MASS, mass
    Modifiers: The predicate must be followed by two modifiers. The first is a number, and the second gives the mass units. Valid mass unit codes:
        Da     (Daltons)
        kDa    (kilodaltons)
        g      (grams)
        kg     (kilograms)
    Action: Specifies the mass of the molecule.
    Note: This command is optional. It needs to be present if you want the program to compute the intrinsic viscosity in conventional units.

SPECIFY-SOLVENT-VISCOSITY COMMAND
    Valid predicates: VISCOSITY, viscosity
    Modifiers: The predicate must be followed by two modifiers. The first is a number; the second gives the viscosity units. Valid viscosity unit codes:
        p     (poise)
        cp    (centipoise)
    Note: This command is optional. If you want the program to compute the diffusion coefficient by the Stokes-Einstein formula, it will need the solvent viscosity. There are two options: (1) this command, or (2) the SPECIFY-SOLVENT command in conjunction with the SPECIFY-TEMPERATURE command.

SPECIFY-SOLVENT COMMAND
    Valid predicates: SOLVENT, solvent
    Modifiers: The predicate takes only one modifier. At present, the modifier must be one of the two codes:
        water
        WATER
    Note: This command is optional. If you want the program to compute the diffusion coefficient by the Stokes-Einstein formula, it will need the solvent viscosity. There are two options: (1) the SPECIFY-SOLVENT-VISCOSITY command, or (2) this command in conjunction with the SPECIFY-TEMPERATURE command. At some later date, it may be possible to add other solvents to the list.

V. DESCRIPTION OF THE INTEGRATIONS

A. The Zeno Integration.

The zeno integration is a numerical path integration. It simultaneously solves two separate boundary value problems in electrostatics: the charge distribution on, first, a charged conductor, and second, a grounded conductor in a uniform external field. A flowchart summarizing the computation is given in Ref. [7]. Three quantities are required as input: the integration size, Nz, which you set through the action code during program invocation; the launch radius, RL, which is determined automatically by the program from the body specification; and the skin thickness, ε, which is either set in the body file or, by default, is set equal to $10^{-6} R_L$. If you set the skin thickness yourself, we recommend values five to six orders of magnitude smaller than the lateral dimensions of the body.
Outputs are the electrostatic capacity, C, and the nine components of the electrostatic polarizability tensor, α.

B. The Interior Integration.

This is a Monte Carlo integration over the total volume of the object. Two quantities are required as input: the integration size, Ni, which you set through the action code during program invocation, and the launch radius, RL, which is determined automatically by the program from the body specification. The calculation is performed by generating points at random inside the launch sphere, and discarding all those that do not also lie inside the body. This continues until 2Ni points have been found inside the body. Outputs are: first, the volume, V, of the object, which is set equal to $(4/3)\pi R_L^3 f_i$, where $f_i$ is the fraction of points found inside the body; and second, the square radius of gyration accumulated by points over the interior, which has the definition:

    $$R_{gi}^2 = \frac{1}{2V^2} \int_V d\mathbf{r}_1 \int_V d\mathbf{r}_2 \, r_{12}^2$$

where $\int_V d\mathbf{r}$ denotes integration over the volume of the body. This integral is evaluated internally by averaging the square distance between pairs of points for a total of Ni pairs. The integration will be skipped if any of the body elements are of type B, as explained above.

C. The Surface Integration.

This is a Monte Carlo integration over the total surface area of the object. One quantity is required as input: the integration size, Ns, which you set through the action code during program invocation. The calculation is performed by generating points at random over the surface of each body element (with each body element weighted according to its own surface area), and discarding all those that lie inside some other body element. This continues until 2Ns points have been located on the surface of the body.
Outputs are: first, the surface area, A, of the object, which is set equal to $A_0 f_s$, where $f_s$ is the fraction of points retained and $A_0$ is the total combined surface area of all the body elements; second, the square radius of gyration accumulated by points over the surface, which has the definition:

    $$R_{gs}^2 = \frac{1}{2A^2} \int_A d\mathbf{r}_1 \int_A d\mathbf{r}_2 \, r_{12}^2$$

where $\int_A d\mathbf{r}$ denotes integration over the surface of the body; and third, the Kirkwood radius, or harmonic mean distance between arbitrary surface points:

    $$R_K^{-1} = \frac{1}{A^2} \int_S d\mathbf{r}_1 \int_S d\mathbf{r}_2 \, \frac{1}{r_{12}}$$

These integrals are evaluated internally by averaging over pairs of points for a total of Ns pairs.

D. Computation times.

Of the three integration procedures, the zeno integration is generally slower than the other two for comparable values of Nz, Ni, or Ns, although the precise timing depends on the shape. The times for the three integrations are generally linear in Nz, Ni, and Ns, respectively. The time for a zeno integration is also linear in the number of body elements. When the body elements are spheres, and for Pentium III processors, an estimate of the time for a zeno integration is $2.3 \times 10^{-8} N_z$ minutes per body element.[7] (This is notable because finite element computations are cubic in the number of body elements.) The ellipsoid body elements are somewhat slower than the others.

VI. DESCRIPTION OF THE OUTPUT DATA

Results of the computation are reported in the zeno file, a text file created by the program. The name of the zeno file is <identifier>.zno, where <identifier> represents the identifier string introduced above.

Whether a result is reported in the zeno file depends on whether the requisite computation was performed and whether other requisite variables were set. To codify the rules followed by the program in reporting a quantity, let us first define several Boolean variables:

TABLE 3. BOOLEAN VARIABLES CONTROLLING DATA OUTPUT.

BOOLEAN VARIABLE   DEFINITION
BL                 The launch radius, RL, was successfully determined.
BK                 The skin thickness, ε, was set.
BZ                 The zeno integration finished successfully.
BS                 The surface integration finished successfully.
BI                 The interior integration finished successfully.
BT                 The temperature was set.
BM                 The mass was set.
BV                 The solvent viscosity was set or determined.
BU                 Specific length units, rather than the generic unit “L,” were set.

The following table summarizes all quantities that are reported in the zeno file. It includes the Boolean truth-function that determines whether or not the program displays the quantity, and in some cases, a brief definition, the formula by which the quantity is computed, and appropriate literature references.

TABLE 4. SUMMARY OF OUTPUT DATA

Launch radius, RL
    Displayed if: BL
    Definition: Radius of a sphere centered at the origin which encloses the body.

Skin thickness, ε
    Displayed if: BK
    During the zeno integration, paths approaching the body to within this distance are considered to have made first-passage onto the surface.

Temperature, T
    Displayed if: BT

Mass, m
    Displayed if: BM

Solvent viscosity, η
    Displayed if: BV
    The viscosity is either supplied in the body file, or else it is computed from the temperature, using formulas given on page F49 of the CRC Handbook of Chemistry and Physics, 55th edition.

Zeno Monte Carlo steps, Nz
    Displayed if: BZ
    The number of independent paths employed in the zeno integration.

Capacitance, C
    Displayed if: BZ
    The proportionality between total charge and electrostatic potential for a charged, conducting body. It has the units of length and provides one measure of the size of the body. It is also an excellent approximation to the hydrodynamic radius.

Polarizability, α
    Displayed if: BZ
    The tensor giving the proportionality between induced dipole moment and external field for a polarized conducting body. All nine components are determined in the zeno integration.

Surface Monte Carlo steps, Ns
    Displayed if: BS
    The number of independent pairs of points sampled over the surface during the surface integration.

Kirkwood radius, RK
    Displayed if: BS
    $R_K^{-1} = \frac{1}{A^2} \int_S d\mathbf{r}_1 \int_S d\mathbf{r}_2 \, \frac{1}{r_{12}}$, the harmonic mean distance between arbitrary pairs of surface points. Also used, via the Kirkwood double sum formula, to approximate the hydrodynamic radius and capacity. See Ref. [10] for an appraisal of the accuracy of this approximation.

Surface area, A
    Displayed if: BS

Surface radius of gyration, Rgs
    Displayed if: BS
    Radius of gyration as contributed only by the surface points.

Russell radius, RRus
    Displayed if: BS
    $R_{Rus} = (A / 4\pi)^{1/2}$, an approximation to the hydrodynamic radius and capacitance, most accurate for nearly spherical ellipsoids. See Refs. [9] and [10].

Rayleigh radius, RRay
    Displayed if: BS
    $R_{Ray} = (2/\pi)(A/\pi)^{1/2}$, an approximation to the hydrodynamic radius and capacitance, most accurate for two-dimensional, nearly circular bodies. See Refs. [9] and [10].

Interior Monte Carlo steps, Ni
    Displayed if: BI
    The number of independent pairs of points sampled through the interior during the interior integration.

Volume, V
    Displayed if: BI

Interior radius of gyration, Rgi
    Displayed if: BI
    Radius of gyration as contributed by the interior points.

Hydrodynamic radius, Rh
    Displayed if: BZ
    $R_h = q_1 C$, where $q_1 = 1$. The radius of a hypothetical sphere having the same diffusion coefficient as the molecule in question. This formula results from the electrostatic-hydrodynamic analogy and is valid to within 1% or so. In our opinion, this is the best approximation for Rh.

Trace of polarizability tensor, Tr(α)
    Displayed if: BZ
    $\mathrm{Tr}(\alpha) = \alpha_{11} + \alpha_{22} + \alpha_{33}$

Hydrodynamic volume, Vh
    Displayed if: BZ
    $V_h = q_2 \mathrm{Tr}(\alpha) / 3$, where $q_2 = 0.79$. The volume of a hypothetical sphere having the same intrinsic viscosity as the molecule in question. Like the hydrodynamic radius, this formula results from the electrostatic-hydrodynamic analogy and is valid to within about 5%.
    In our opinion, this is the best approximation for Vh.

$C_0 = (3V / 4\pi)^{1/3}$
    Displayed if: BI
    The capacitance of a hypothetical sphere having the same volume as the molecule in question.

$\sigma_{ii} = \alpha_{ii} / (3V)$
    Displayed if: BZ and BI
    Diagonal elements of the polarizability tensor normalized by the volume.

$\sigma_{\perp} = (\alpha_{11} + \alpha_{22}) / (3V)$; $\sigma_{\parallel} = \alpha_{33} / (3V)$
    Displayed if: BZ and BI
    Only significant for bodies with rotational symmetry about the z-axis, in which case this represents a decomposition of σ into components parallel and perpendicular to the rotation axis.

$\sigma = \mathrm{Tr}(\alpha) / (3V) = (\alpha_{11} + \alpha_{22} + \alpha_{33}) / (3V)$
    Displayed if: BZ and BI
    Trace of the polarizability tensor normalized by the volume. This is a convenient shape functional.

Intrinsic viscosity (volume-normalized), [η]V
    Displayed if: BZ and BI
    $[\eta]_V = V_h / V = q_2 \mathrm{Tr}(\alpha) / (3V)$, where $q_2 = 0.79$. Intrinsic viscosity in terms of volume fraction. Approximation good to about 5%.

C/Rgi, Rgi/Rh, C/C0, V/Vh
    Displayed if: BZ and BI
    Various dimensionless shape functionals.

C/RK, C/Rgs, Rgs/Rh, RRus/C, RRay/C
    Displayed if: BZ and BS
    Various dimensionless shape functionals.

Sphericity
    Displayed if: BI and BS
    $A / (36\pi V^2)^{1/3}$, a dimensionless shape functional that quantifies departure from sphericity.

Intrinsic viscosity (mass-normalized), [η]M
    Displayed if: BZ and BM
    $[\eta]_M = V_h / m = q_2 \mathrm{Tr}(\alpha) / (3m)$, where $q_2 = 0.79$. Intrinsic viscosity defined in terms of mass concentration (conventional units). Approximation good to about 5%.

Diffusion coefficient, D
    Displayed if: BZ and BT and BV and BU
    $D = kT / (6\pi\eta R_h)$. Stokes-Einstein relation for the diffusion coefficient.

Appendix A. EXAMPLES

Example 1. A cube.

Tables A.1 and A.2 display the body and zeno files, respectively, for the computation performed on a cube.
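
Several of the Table 4 quantities depend only on the surface area A and the volume V, so they can be checked by hand. The Python sketch below evaluates them for a cube of edge 2 using the exact values A = 24 and V = 8. The function names are hypothetical, and the Rayleigh-radius expression (2/π)(A/π)^(1/2) is an assumption, chosen because it is consistent with the value printed in Table A.2.

```python
import math

def russell_radius(area):
    """(A / 4 pi)^(1/2): radius of the sphere with the same surface area."""
    return math.sqrt(area / (4.0 * math.pi))

def rayleigh_radius(area):
    """(2 / pi)(A / pi)^(1/2): assumed form, see the note above."""
    return (2.0 / math.pi) * math.sqrt(area / math.pi)

def equal_volume_capacitance(volume):
    """C0 = (3V / 4 pi)^(1/3): radius of the sphere with the same volume."""
    return (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)

def sphericity(area, volume):
    """A / (36 pi V^2)^(1/3); equals 1 for a sphere."""
    return area / (36.0 * math.pi * volume ** 2) ** (1.0 / 3.0)

area, volume = 24.0, 8.0   # exact values for a cube of edge 2
print(russell_radius(area), rayleigh_radius(area))
print(equal_volume_capacitance(volume), sphericity(area, volume))
```

The Monte Carlo values of C0 and the sphericity in Table A.2 differ slightly from these exact-geometry results because they are computed from the sampled volume rather than from V = 8.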

Table A.1 box.bod

    cube -1 -1 -1 2

Table A.2 box.zno

    Body name: box
    Number of body elements: 1
    ==================================================
    JOB SUMMARY:
    Actions      Checked if    Monte Carlo
    requested    completed     size
    --------------------------------------------------
    z            *             1000000
    s            *             1000000
    i            *             1000000
    ==================================================
    launch radius . . . . . . . .  1.73205 L
    skin thickness . . . . . . . . 0.173205E-05 L
    zeno m.c. steps . . . . . . .  1000000
    capacitance, C . . . . . . . . 1.3230(7) L
    polarizability 11 . . . . . .  2.916(8)E+01 L^3
    polarizability 12 . . . . . .  -7(8)E-02 L^3
    polarizability 13 . . . . . .  1.0(7)E-01 L^3
    polarizability 21 . . . . . .  5(9)E-02 L^3
    polarizability 22 . . . . . .  2.923(7)E+01 L^3
    polarizability 23 . . . . . .  1.0(8)E-01 L^3
    polarizability 31 . . . . . .  9(7)E-02 L^3
    polarizability 32 . . . . . .  -5(7)E-02 L^3
    polarizability 33 . . . . . .  2.923(5)E+01 L^3
    surface m.c. steps . . . . . . 1000000
    RK . . . . . . . . . . . . . . 1.294(3) L
    surface area . . . . . . . . . 2.40000(0)E+01 L^2
    Rg (surface) . . . . . . . . . 1.2916(3) L
    R(Russell) . . . . . . . . . . 1.38198(0) L
    R(Rayleigh) . . . . . . . . .  1.75959(0) L
    interior m.c. steps . . . . .  1000000
    volume . . . . . . . . . . . . 7.993(4) L^3
    Rg (interior) . . . . . . . .  9.996(4)E-01 L
    Rh . . . . . . . . . . . . . . 1.32(1) L
    Tr(alpha) . . . . . . . . . .  8.76(1)E+01 L^3
    Vh . . . . . . . . . . . . . . 2.3(1)E+01 L^3
    C0 . . . . . . . . . . . . . . 1.2403(2) L
    sig 11 . . . . . . . . . . . . 1.216(3)
    sig 22 . . . . . . . . . . . . 1.219(3)
    sig 33 . . . . . . . . . . . . 1.219(2)
    sig(normal) . . . . . . . . .  2.435(5)
    sig(parallel) . . . . . . . .  1.219(2)
    sigma . . . . . . . . . . . .  3.654(5)
    [eta](V) . . . . . . . . . . . 2.9(1)
    C/Rg(int) . . . . . . . . . .  1.3235(8)
    Rg(int)/Rh . . . . . . . . . . 7.56(8)E-01
    C/C0 . . . . . . . . . . . . . 1.0666(6)
    V/Vh . . . . . . . . . . . . . 3.5(2)E-01
    C/RK . . . . . . . . . . . . . 1.023(3)
    C/Rg(surf) . . . . . . . . . . 1.0243(6)
    Rg(surf)/Rh . . . . . . . . .  9.8(1)E-01
    R(Russell)/C . . . . . . . . . 1.0446(5)
    R(Rayleigh)/C . . . . . . . .  1.3300(7)
    sphericity . . . . . . . . . . 1.2414(5)

Example 2. Five overlapping spheres.

Tables A.3 and A.4 display the body and zeno files for a body constructed from five overlapping spheres. The body file shows some of the freedom you have in formatting, including the use of comments and blank lines, and extending commands across more than one line.

Table A.3 some.spheres.bod

    * This body consists of 5 spheres

    * Blank lines are OK:

    * This line inserts a sphere of radius 1 at the origin
    SPHERE 0 0 0 1
    * The next line inserts a sphere of radius 2 tangent
    * to the first sphere with center on the x axis
    S 3 0 0 2
    * Carriage returns are permissible during the
    *specification of any one element:
    sphere
    -3 0 0 2
    *
    * You can also run different elements together on the same line
    S 1 1 1 1 s -1 -1 -1 1
    * This command establishes nanometers as the length unit
    units nm

Table A.4 some.spheres.zno

    Body name: some.spheres
    Number of body elements: 5
    ==================================================
    JOB SUMMARY:
    Actions      Checked if    Monte Carlo
    requested    completed     size
    --------------------------------------------------
    z            *             1000000
    i            *             1000000
    s            *             1000000
    ==================================================
    launch radius . . . . . . . .  5.00000 nm
    skin thickness . . . . . . . . 0.500000E-05 nm
    zeno m.c. steps . . . . . . .  1000000
    capacitance, C . . . . . . . . 3.101(2) nm
    polarizability 11 . . . . . .  7.98(3)E+02 nm^3
    polarizability 12 . . . . . .  7(3) nm^3
    polarizability 13 . . . . . .  5(2) nm^3
    polarizability 21 . . . . . .  5(1) nm^3
    polarizability 22 . . . . . .  2.167(9)E+02 nm^3
    polarizability 23 . . . . . .  4.5(9) nm^3
    polarizability 31 . . . . . .  4.6(9) nm^3
    polarizability 32 . . . . . .  1.55(9)E+01 nm^3
    polarizability 33 . . . . . .  2.179(9)E+02 nm^3
    surface m.c. steps . . . . . . 1000000
    RK . . . . . . . . . . . . . . 2.944(5) nm
    surface area . . . . . . . . . 1.2588(3)E+02 nm^2
    Rg (surface) . . . . . . . . . 3.316(2) nm
    R(Russell) . . . . . . . . . . 3.1650(4) nm
    R(Rayleigh) . . . . . . . . .  4.0298(5) nm
    interior m.c. steps . . . . .  1000000
    volume . . . . . . . . . . . . 7.824(5)E+01 nm^3
    Rg (interior) . . . . . . . .  3.181(1) nm
    Rh . . . . . . . . . . . . . . 3.10(3) nm
    Tr(alpha) . . . . . . . . . .  1.233(3)E+03 nm^3
    Vh . . . . . . . . . . . . . . 3.2(2)E+02 nm^3
    C0 . . . . . . . . . . . . . . 2.6533(6) nm
    sig 11 . . . . . . . . . . . . 3.40(1)
    sig 22 . . . . . . . . . . . . 9.23(4)E-01
    sig 33 . . . . . . . . . . . . 9.28(4)E-01
    sig(normal) . . . . . . . . .  4.32(1)
    sig(parallel) . . . . . . . .  9.28(4)E-01
    sigma . . . . . . . . . . . .  5.25(1)
    [eta](V) . . . . . . . . . . . 4.1(2)
    C/Rg(int) . . . . . . . . . .  9.751(9)E-01
    Rg(int)/Rh . . . . . . . . . . 1.03(1)
    C/C0 . . . . . . . . . . . . . 1.169(1)
    V/Vh . . . . . . . . . . . . . 2.4(1)E-01
    C/RK . . . . . . . . . . . . . 1.054(2)
    C/Rg(surf) . . . . . . . . . . 9.352(9)E-01
    Rg(surf)/Rh . . . . . . . . .  1.07(1)
    R(Russell)/C . . . . . . . . . 1.0205(8)
    R(Rayleigh)/C . . . . . . . .  1.299(1)
    sphericity . . . . . . . . . . 1.4230(7)

Example 3. The myoglobin molecule.

Tables A.5 and A.6 display the application of these techniques to a protein molecule, myoglobin. The structure of the molecule was taken from the Protein Data Bank, entry code 1a6m. This protein consists of 151 amino acids, and was modeled with overlapping spheres, one sphere centered at each alpha-carbon. Each sphere has radius 5 Å. This example demonstrates use of the SPECIFY-MASS, SPECIFY-TEMPERATURE, and SPECIFY-SOLVENT commands in the body file. Since these variables were specified, the program was able to compute the mass-normalized intrinsic viscosity and the diffusivity, neither of which appears in the previous zeno files. (For the sake of brevity, over 140 lines have been omitted from the body file.)

Table A.5 1a6m.5.bod

    s -3.526 15.758
    s -0.689 14.190
    s -1.487 12.495
    s 0.324 13.366
    . .
. s -0.870 33.550 s -1.223 31.968 s 1.894 29.853 ST 0.001 mass 16747 Da temp 20 C solvent water units A 14.900 16.862 20.143 23.335 5.000 5.000 5.000 5.000 -1.190 -4.522 -4.417 5.000 5.000 5.000 Table A.6 1a6m.5.zno Body name: 1a6m.5 Number of body elements: 151 ================================================== JOB SUMMARY: Actions Checked if Monte Carlo requested completed size -------------------------------------------------z * 100000 s * 100000 i * 100000 ================================================== launch radius . . . . . . . . 47.3601 A skin thickness . . . . . . . . 0.100000E-02 A temperature . . . . . . . . . 2.932(5)E+02 K mass . . . . . . . . . . . . . 1.67470(5)E+04 Da solvent viscosity (computed) . 1.002(1) cp zeno m.c. steps . . . . . . . 100000 capacitance, C . . . . . . . . 2.071(6)E+01 A polarizability 11 . . . . . . 1.29(2)E+05 A^3 21 polarizability 12 . polarizability 13 . polarizability 21 . polarizability 22 . polarizability 23 . polarizability 31 . polarizability 32 . polarizability 33 . surface m.c. steps . RK . . . . . . . . . surface area . . . . Rg (surface) . . . . R(Russell) . . . . . R(Rayleigh) . . . . interior m.c. steps volume . . . . . . . Rg (interior) . . . Rh . . . . . . . . . Tr(alpha) . . . . . Vh . . . . . . . . . C0 . . . . . . . . . sig 11 . . . . . . . sig 22 . . . . . . . sig 33 . . . . . . . sig(normal) . . . . sig(parallel) . . . sigma . . . . . . . [eta](V) . . . . . . C/Rg(int) . . . . . Rg(int)/Rh . . . . . C/C0 . . . . . . . . V/Vh . . . . . . . . C/RK . . . . . . . . C/Rg(surf) . . . . . Rg(surf)/Rh . . . . R(Russell)/C . . . . R(Rayleigh)/C . . . sphericity . . . . . [eta](M) . . . . . . D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 -2(3)E+03 -1.7(2)E+04 -7(2)E+03 1.01(2)E+05 -1.7(2)E+04 -1.3(2)E+04 -2.6(2)E+04 1.12(2)E+05 100000 1.957(6)E+01 7.18(2)E+03 1.930(2)E+01 2.391(2)E+01 3.044(3)E+01 100000 2.957(6)E+04 1.646(2)E+01 2.07(2)E+01 3.42(4)E+05 9.0(5)E+04 1.918(1)E+01 1.46(3) 1.14(2) 1.26(3) 2.59(4) 1.26(3) 3.85(5) 3.0(2) 1.258(4) 7.95(8)E-01 1.079(3) 3.3(2)E-01 1.058(4) 1.073(3) 9.3(1)E-01 1.154(4) 1.470(5) 1.553(4) 3.2(2) 1.03(1)E-06 A^3 A^3 A^3 A^3 A^3 A^3 A^3 A^3 A A^2 A A A A^3 A A A^3 A^3 A cm^3/g cm^2/s Appendix B. SUMMARY OF WORDS RECOGNIZED BY THE GRAMMAR OF THE BODY FILE. Table B1 gives the “dictionary” for the grammar file. All the synonyms of any one word are listed together. In column 2, “P” indicates predicate, “M” indicates modifier. Table B1. WORD A C cm cp cube, CUBE cylinder, CYLINDER d, D, disk, DISK Da e, E, ellipsoid, ELLIPSOID g K kDa kg L lens, LENS m mass, MASS nm p s, S, sphere, SPHERE solvent, SOLVENT st, ST t, T, triangle, TRIANGLE temp, TEMP to, TO, torus, TORUS units, UNITS viscosity, VISCOSITY water, WATER TYPE MEANING M M M M P P Ångstrom units Celcius centimeters centipoise COMMAND IN WHICH THIS WORD IS FOUND SPECIFY-LENGTH-UNITS SPECIFY-TEMPERATURE SPECIFY-LENGTH-UNITS SPECIFY-SOLVENT-VISCOSITY ADD-CUBE ADD-CYLINDER Daltons ADD-DISK SPECIFY-MASS ADD-ELLIPSOID P M P M M M M M P M P M M P grams Kelvins kilodaltons kilograms Generic length unit meters nanometers poise SPECIFY-MASS SPECIFY-TEMPERATURE SPECIFY-MASS SPECIFY-MASS SPECIFY-LENGTH-UNITS ADD-LENS SPECIFY-LENGTH-UNITS SPECIFY-MASS SPECIFY-LENGTH-UNITS SPECIFY-SOLVENT-VISCOSITY ADD-SPHERE P SPECIFY-SOLVENT P P SPECIFY-SKIN-THICKNESS ADD-TRIANGLE P P SPECIFY-TEMPERATURE ADD-TORUS P P SPECIFY-LENGTH-UNITS SPECIFY-SOLVENT-VISCOSITY M SPECIFY-SOLVENT 23 Appendix C. UNCERTAINTY ESTIMATES AND PROPAGATION. Like any Monte Carlo integration, the results display sampling error. 
The program estimates the sampling error in the integrations, and propagates the errors through subsequent computations. Error estimation, propagation, and reporting by the program are explained here.

Significant figures in the input data.
The input quantities set in the body file, namely temperature, mass, and solvent viscosity, are assumed to contain experimental error. All digits displayed in a SPECIFY command, including trailing zeros, will be considered significant by the program. In other words, the two commands “MASS 20000 Da” and “MASS 2.00E4 Da” will be interpreted as $m = 20000 \pm 0.5$ Da and $m = 20000 \pm 50$ Da, respectively. These uncertainties will then be propagated through subsequent computations as explained below.

Sampling errors in the integrations.
The following quantities are results of a Monte Carlo integration, and therefore display sampling error:

    Capacitance, $C$
    Polarizability, $\alpha$
    Kirkwood radius, $R_K$
    Surface area, $A$
    Square surface radius of gyration, $R_{gs}^2$
    Volume, $V$
    Square interior radius of gyration, $R_{gi}^2$

To estimate the integration error, each integral is performed 20 times independently, using an integration size of $N/20$. The final value is taken as the mean of these 20 independent integrations, while the sampling error is taken as the standard deviation divided by $\sqrt{20}$.

Uncertainties resulting from the electrostatic-hydrodynamic analogy.
Two formulas, first given above,

$$R_h = q_1 C$$

and

$$[\eta]_V = \frac{V_h}{V} = \frac{q_2\,\mathrm{Tr}(\alpha)}{3V}$$

result from analogies between electrostatic and hydrodynamic boundary value problems. But the analogies are only approximate, so that $q_1$ and $q_2$ are not constant; rather they vary from shape to shape. The variation is small, however, and so using standard values of the two coefficients lets us use the results of an electrostatic calculation to approximate hydrodynamic properties. The current version of the program uses the values

$$q_1 = 1.00 \pm 0.01 \quad \text{and} \quad q_2 = 0.79 \pm 0.04$$

These uncertainties are propagated through subsequent calculations.
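The effect of the uncertainties in the two coefficients can be checked by hand against Table A.2. The following Python sketch is not part of zeno.f; the function names are hypothetical, and it assumes the relations read $R_h = q_1 C$ and $V_h = q_2\,\mathrm{Tr}(\alpha)/3$, with independent uncertainties combined in quadrature.

```python
import math

# Coefficient values and uncertainties quoted in Appendix C.
Q1, DQ1 = 1.00, 0.01
Q2, DQ2 = 0.79, 0.04


def hydrodynamic_radius(c, dc):
    """Return (Rh, dRh) for Rh = q1 * C (assumed form of the analogy)."""
    rh = Q1 * c
    # First-order propagation: contributions from q1 and from C add in quadrature.
    drh = math.sqrt((c * DQ1) ** 2 + (Q1 * dc) ** 2)
    return rh, drh


def hydrodynamic_volume(tr_alpha, dtr_alpha):
    """Return (Vh, dVh) for Vh = q2 * Tr(alpha) / 3 (assumed form of the analogy)."""
    vh = Q2 * tr_alpha / 3.0
    dvh = math.sqrt((tr_alpha * DQ2 / 3.0) ** 2 + (Q2 * dtr_alpha / 3.0) ** 2)
    return vh, dvh


# Inputs read from Table A.2 (the cube):
# C = 1.3230(7) L and Tr(alpha) = 8.76(1)E+01 L^3.
rh, drh = hydrodynamic_radius(1.3230, 0.0007)
vh, dvh = hydrodynamic_volume(87.6, 0.1)
print(f"Rh = {rh:.4f} +/- {drh:.4f} L")
print(f"Vh = {vh:.2f} +/- {dvh:.2f} L^3")
```

Under these assumptions the results round to about 1.32 ± 0.01 L and 23 ± 1 L^3, in line with the entries Rh = 1.32(1) and Vh = 2.3(1)E+01 in Table A.2. In both cases the coefficient uncertainty, not the Monte Carlo sampling error, dominates.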
Because the uncertainties in $q_1$ and $q_2$ are propagated, hydrodynamic properties directly related to these two coefficients will never appear with more than 2 or 3 significant figures, no matter how many Monte Carlo steps are used in the zeno integration.

Propagation of uncertainties.
All other quantities are computed from the above values. Suppose that the computation of a variable $y$ in terms of several variables $x_j$ is represented in the following functional form:

$$y = f(x_1, x_2, \ldots)$$

Furthermore, let $\delta y$ and $\delta x_j$ represent uncertainties in $y$ and $x_j$, respectively. Then, to estimate $\delta y$, the program uses

$$(\delta y)^2 = \sum_j \left( \frac{\partial f}{\partial x_j} \right)^2 (\delta x_j)^2$$

Final display of error estimates.
As explained in the preceding paragraphs, uncertainty estimates are calculated for all quantities. The final results are always rounded, with the uncertainty in the final digit enclosed in parentheses. For example, the string 1.03(1)E-06 represents the range of numbers $(1.03 \pm 0.01) \times 10^{-6}$.

Appendix D. Random numbers.

The program uses the random number generator ran2 published in Press, Teukolsky, Vetterling, and Flannery, Numerical Recipes in Fortran 77, 2nd edition, Cambridge University Press (1992). The program uses the Linux date command to generate the seed for the random numbers. Therefore, after execution, you will find the file <identifier>.dfl in your directory. It contains the Linux date stamp at the time of program initiation. It is perfectly safe to delete this file.

References:
[1] Derivation of the analogy between capacitance and hydrodynamic radius: Hubbard and Douglas, “Hydrodynamic friction of arbitrarily shaped Brownian particles,” Physical Review E, 47, R2983-R2986 (1993).
[2] Derivation of the path integral technique for the capacitance: Zhou, Szabo, Douglas, and Hubbard, “A Brownian dynamics algorithm for calculating the hydrodynamic friction and the electrostatic capacitance of an arbitrarily shaped object,” J. Chem. Phys., 100, 3821-3826 (1994).
[3] Test of the analogy between capacitance and hydrodynamic radius: Douglas, Zhou, and Hubbard, “Hydrodynamic friction and the capacitance of arbitrarily shaped objects,” Physical Review E, 49, 5319-5331 (1994).
[4] Derivation and testing of the analogy between polarizability and intrinsic viscosity: Douglas and Garboczi, “Intrinsic viscosity and the polarizability of particles having a wide range of shapes,” Adv. Chem. Phys., 91, 85-153 (1995), and Garboczi and Douglas, Physical Review E, 53, 6169-80 (1996).
[5] The derivation of the path integral formulation for the polarizability and its application to a number of different shapes: Mansfield, Douglas, and Garboczi, “Intrinsic viscosity and the electrical polarizability of arbitrarily shaped objects,” Physical Review E, 64, 061401 (2001).
[6] Application of the zeno algorithm to flexible polymer models: Mansfield and Douglas, “Numerical path-integration calculation of transport properties of star polymers and theta-DLA aggregates,” Condensed Matter Physics, 5, 249 (2002).
[7] Application of the zeno algorithm to proteins: Kang, Mansfield, and Douglas, “Numerical path integration technique for the calculation of transport properties of proteins,” Physical Review E, 69, 031918 (2004). This reference also gives a flowchart for the zeno algorithm.
[8] A good introduction to the use of path integral techniques in solving boundary value problems: Douglas and Friedman, “Coping with complex boundaries,” IMA Series on Mathematics and its Applications, Vol. 67 (Springer, New York, 1995), pp. 166-185.
[9] The quantities RRus and RRay are discussed in an appendix of Douglas and Freed, “Competition between hydrodynamic screening (‘draining’) and excluded volume interactions in an isolated polymer chain,” Macromolecules, 27, 6088-6099 (1994). This also contains references to the original literature.
[10] An appraisal of how accurately RK, RRus, and RRay represent the hydrodynamic radius: Mansfield and Douglas, “Accuracy of several approximate formulas for the hydrodynamic radius and the diffusion coefficient,” in preparation.
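Supplementary sketch for Appendix C: the reporting convention, in which the uncertainty in the final digit is enclosed in parentheses, can be unpacked mechanically when post-processing a zeno file. The Python fragment below is not part of zeno.f. The function name parse_reported_value and the regular expression are assumptions made for illustration, and only strings of the form shown in Tables A.2, A.4, and A.6 are handled.

```python
import re

# Optional sign, digits, optional fraction, (uncertainty digits), optional exponent.
_REPORTED = re.compile(r"^([+-]?)(\d+)(?:\.(\d*))?\((\d+)\)(?:[eE]([+-]?\d+))?$")


def parse_reported_value(text):
    """Split a string such as '1.03(1)E-06' into (value, uncertainty)."""
    match = _REPORTED.match(text.strip())
    if match is None:
        raise ValueError(f"not in value(uncertainty) form: {text!r}")
    sign, whole, frac, unc_digits, exponent = match.groups()
    frac = frac or ""
    scale = 10.0 ** int(exponent or 0)
    mantissa = float(f"{sign}{whole}.{frac or '0'}")
    # The parenthesized digits count units of the last displayed decimal place.
    uncertainty = int(unc_digits) * 10.0 ** (-len(frac))
    return mantissa * scale, uncertainty * scale


value, error = parse_reported_value("1.03(1)E-06")
print(value, error)
```

For the diffusivity entry of Table A.6, this yields a value near 1.03E-06 with an uncertainty near 1E-08, matching the worked example in Appendix C. Entries without a parenthesized uncertainty, such as the launch radius, are deliberately rejected.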