Acta Cryst. (1961). 14, 641

The Refinement of Structures Partially Determined by the Isomorphous Replacement Method

By Michael G. Rossmann and D. M. Blow

Medical Research Council Unit for Molecular Biology, Cavendish Laboratory, Cambridge, England

(Received 8 August 1960)

When independent results about the phase of a reflexion are obtained, what is the best way of combining them? This problem arises in the multiple isomorphous replacement method, and in a more general form where, in addition, part of the structure is known. The method proposed is to use each result to form a probability function for the phase of the reflexion; the combination of these results is achieved by multiplying the probability functions together. This joint probability function can be fully represented by the magnitude and phase of two vectors $C$ and $D_{is}$ formed by addition of components from each result. From these, structure factors can be calculated which provide the 'best' combination of all the data. A discussion of the significance of the vectors $C$, $D_{is}$ indicates how estimates of the relative importance of structural information and of the isomorphous replacement method may be made.

Introduction

The isomorphous-replacement method has been used by Kendrew et al. (1960) to determine the phases of 9,600 reflexions in a study of the protein myoglobin at 2 Å resolution. In the resulting Fourier synthesis, most of the atoms in the helical polypeptide chain can be placed with precision, and a number of the amino acid side chains can be identified. The question now arises, how may an extension or refinement of the structure best be made? The two available techniques, calculation of phases either on the basis of the known part of the structure or by the isomorphous-replacement method, will lead to different sets of phase angles. Is there a satisfactory way of combining these?
The large number of reflexions imposes a requirement for a completely automatic method of combination. At the same time, the labour of model building is such that it is worth doing a great deal of preliminary computing in order to reduce the number of cycles of refinement to a minimum.

The problem is approached from the point of view of an earlier paper on the treatment of errors in the isomorphous-replacement method (Blow & Crick, 1959). In a projection subject to error, one may define as 'best Fourier' the Fourier transform which has least mean-square difference from the 'true' Fourier transform when averaged over the whole unit cell. In the usual case, one knows the magnitude $F_o$ of the structure amplitude with accuracy, but its phase is uncertain. If the relative probability of each phase is plotted round a circle of radius $F_o$ on an Argand diagram, the centre of gravity of the probability distribution gives the structure factor, $\xi$, to be used in calculating the 'best Fourier'. This may be expressed by

\[ \xi = F_o \frac{\int_0^{2\pi} \exp(i\alpha)\,P(\alpha)\,d\alpha}{\int_0^{2\pi} P(\alpha)\,d\alpha} . \tag{1} \]

Here $P(\alpha)\,d\alpha$ is the relative probability that the phase angle $\alpha$ lies between $\alpha$ and $\alpha + d\alpha$.

In the following paper we present (a) a convenient analytical expression, governed by two vectors, that represents the phase probability distribution indicated by a set of isomorphous-replacement data; (b) a similar expression, governed by one vector, that represents the corresponding distribution for the 'heavy-atom' or partially known structure method; (c) a method for the combination of these to obtain the 'best Fourier' when the two methods are combined simultaneously. This leads to some general strategic considerations in the solution of large structures.
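The centre-of-gravity definition in equation (1) can be illustrated numerically. The following is only a minimal sketch: the function name `best_structure_factor`, the three test distributions and the 360-point summation are assumptions made for illustration and are not taken from the paper.

```python
import cmath
import math


def best_structure_factor(f_obs, prob, n_steps=360):
    """Return F_o times the centroid of prob(alpha) on the unit circle (equation (1)).

    prob is any non-negative function of the phase angle in radians. It need
    not be normalised, because the same factor appears in the numerator and
    the denominator.
    """
    step = 2.0 * math.pi / n_steps
    numerator = 0j
    denominator = 0.0
    for k in range(n_steps):
        alpha = k * step
        weight = prob(alpha)
        numerator += cmath.exp(1j * alpha) * weight
        denominator += weight
    return f_obs * numerator / denominator


# Hypothetical distributions, chosen only to show the behaviour of the centroid.
def uniform(alpha):
    """No phase information at all."""
    return 1.0


def sharp(alpha):
    """A single narrow peak at 90 degrees."""
    return math.exp(50.0 * math.cos(alpha - math.pi / 2.0))


def bimodal(alpha):
    """Two equal narrow peaks at +60 and -60 degrees."""
    return (math.exp(50.0 * math.cos(alpha - math.pi / 3.0))
            + math.exp(50.0 * math.cos(alpha + math.pi / 3.0)))
```

With these assumed inputs, a uniform distribution gives a coefficient of essentially zero, a sharp peak gives a coefficient close to $F_o$ at the peak phase, and two equal peaks at ±60° give a real coefficient of roughly half of $F_o$, which is the compromise the 'best Fourier' makes when the phase is ambiguous.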
The isomorphous replacement method

In the isomorphous-replacement method Blow & Crick (1959) show that a particular phase angle, $\alpha$, is related to a particular error $\varepsilon(\alpha)$ in the experimental data of one isomorphous compound. This error can be allotted a probability $\exp\{-\varepsilon^2(\alpha)/2E^2\}$ on the basis of a Gaussian distribution of error with a standard deviation $E$, which can be estimated, for example, from data for a centrosymmetric zone. In general, $J$ isomorphous derivatives may be used simultaneously. In this case

\[ P(\alpha)\,d\alpha = \prod_{j=1}^{J} \exp\{-\varepsilon_j^2(\alpha)/2E_j^2\}\,d\alpha = \exp\Bigl\{-\sum_{j=1}^{J} \varepsilon_j^2(\alpha)/2E_j^2\Bigr\}\,d\alpha . \tag{2} \]

$\varepsilon_j$ is given by

\[ (F_{Hj} + \varepsilon_j)^2 = F_o^2 + f_j^2 + 2F_o f_j \cos(\alpha - \varphi_j) , \tag{3} \]

where $F_o$ is the structure amplitude of the unsubstituted derivative, $F_{Hj}$ that of the $j$th heavy atom derivative, and $f_j \exp(i\varphi_j)$ is the calculated structure factor of the substituent atoms in the $j$th compound (see Fig. 1).

Fig. 1. Vector diagram showing error in the isomorphous replacement technique.

Workers who have used this method (Blow, 1958; Kendrew et al., 1960; Perutz et al., 1960) have so far used numerical methods for the evaluation of (1), by calculating $P(\alpha)$ from (2) and (3) at intervals of 5° or 10°.

An analytical expression for the integrals of (1) will now be developed. Since in all important cases $F_o, F_{Hj} \gg \varepsilon_j$ except where the probabilities are negligible, $\varepsilon_j^2$ can safely be neglected in (3). This leads at once to

\[ \varepsilon_j = (F_o^2 + f_j^2 - F_{Hj}^2)/2F_{Hj} + (F_o f_j/F_{Hj}) \cos(\alpha - \varphi_j) \tag{4} \]

and

\[ \varepsilon_j^2/2E_j^2 = e_j - c_j \cos(\alpha - \varphi_j) + 2d_j \cos^2(\alpha - \varphi_j) = (e_j + d_j) - c_j \cos(\alpha - \varphi_j) + d_j \cos 2(\alpha - \varphi_j) , \]

where

\[ c_j = (F_{Hj}^2 - f_j^2 - F_o^2)\,F_o f_j/(2F_{Hj}^2 E_j^2), \qquad d_j = F_o^2 f_j^2/(4F_{Hj}^2 E_j^2), \qquad e_j = (F_{Hj}^2 - f_j^2 - F_o^2)^2/(8F_{Hj}^2 E_j^2) . \]

It follows from (4) that

\[ \sum_{j=1}^{J} \varepsilon_j^2/2E_j^2 = \sum_{j=1}^{J} (e_j + d_j) - \sum_{j=1}^{J} c_j \cos(\alpha - \varphi_j) + \sum_{j=1}^{J} d_j \cos 2(\alpha - \varphi_j) . \tag{5} \]

This expression shows that $\varepsilon_j^2(\alpha)$ can be expressed with a good accuracy by a Fourier expansion up to the second-order term, and $c_j$ and $d_j$ may be thought of as Fourier coefficients. As will be shown presently, they are, however, like vectorial coefficients with phases $\varphi_j$ and $2\varphi_j$.

The first term of (5) results in a constant multiplier, which appears at top and bottom of (1). The second and third terms provide the required information for calculation of the structure factors, $\xi$, which give rise to the 'best Fourier'. Let us now write

\[ C_{is} \cos \Phi_1 = \sum_{j=1}^{J} c_j \cos \varphi_j ; \quad C_{is} \sin \Phi_1 = \sum_{j=1}^{J} c_j \sin \varphi_j ; \quad D_{is} \cos 2\Phi_2 = \sum_{j=1}^{J} d_j \cos 2\varphi_j ; \quad D_{is} \sin 2\Phi_2 = \sum_{j=1}^{J} d_j \sin 2\varphi_j . \tag{6} \]

This corresponds to forming vectors $C_{is}$ and $D_{is}$ by addition of the $c_j$ and $d_j$ vectors with their corresponding phases, $\varphi_j$ and $2\varphi_j$. (5) now reduces to

\[ \sum_{j=1}^{J} \varepsilon_j^2/2E_j^2 = K - C_{is} \cos(\alpha - \Phi_1) + D_{is} \cos 2(\alpha - \Phi_2) . \tag{7} \]

Finally, by combining (1), (2) and (7) we have

\[ \xi = F_o \times \frac{\int_0^{2\pi} \exp\{i\alpha\} \exp\{C_{is} \cos(\alpha - \Phi_1) - D_{is} \cos 2(\alpha - \Phi_2)\}\,d\alpha}{\int_0^{2\pi} \exp\{C_{is} \cos(\alpha - \Phi_1) - D_{is} \cos 2(\alpha - \Phi_2)\}\,d\alpha} . \]

Putting $\omega = \alpha - \Phi_1$,

\[ \xi = F_o \exp(i\Phi_1) \times \frac{\int_0^{2\pi} \exp\{i\omega + C_{is} \cos\omega - D_{is} \cos 2(\omega + \Phi_1 - \Phi_2)\}\,d\omega}{\int_0^{2\pi} \exp\{C_{is} \cos\omega - D_{is} \cos 2(\omega + \Phi_1 - \Phi_2)\}\,d\omega} \]

or most briefly

\[ \xi = F_o \exp(i\Phi_1)\,g(C_{is}, D_{is}, \Phi_1 - \Phi_2) . \tag{8} \]

This last form shows that the integrals are functions of the three variables $C_{is}$, $D_{is}$, $(\Phi_1 - \Phi_2)$. Their ratio is a complex number $g(C_{is}, D_{is}, \Phi_1 - \Phi_2)$ which can readily be calculated on a computer. Some values are given in Table 1.

The validity of the approximation introduced by neglecting $\varepsilon^2$ was tested by computation of some phase probability curves from data for five isomorphous compounds of horse haemoglobin. These were calculated both by using equation (3) (accurate) and equation (4) (approximate). The two curves for the reflexion showing the poorest agreement are shown in Fig. 2. The error, about 5°, is quite insignificant.

Fig. 2. Dashed line shows probability distribution for reflexion (260) of haemoglobin with approximations suggested in this paper; continuous line shows the same distribution calculated without error.

All the information available about a structure factor from isomorphous replacements can thus be expressed in terms of five parameters: $F_o$ and the real and imaginary parts of $C_{is}$ and $D_{is}$, given by

\[ C_{is} = \sum_{j=1}^{J} \frac{(F_{Hj}^2 - f_j^2 - F_o^2)\,F_o f_j}{2F_{Hj}^2 E_j^2} \exp(i\varphi_j), \qquad D_{is} = \sum_{j=1}^{J} \frac{F_o^2 f_j^2}{4F_{Hj}^2 E_j^2} \exp(i2\varphi_j) . \]

For practical application in a computer this has several advantages:

(i) It is necessary to store only five parameters per reflexion as a complete record of the results, rather than the $(3J + 1)$ parameters required for observed structure amplitudes and calculated heavy-atom contributions.

(ii) If further information becomes available about a reflexion at a later stage, it can be readily added into the $C_{is}$ and $D_{is}$ vectors.

(iii) Only one integration is required (or a single reference to a three-dimensional table of $g$), rather than $J$ such operations which would be required if the isomorphous replacements were treated separately.

The heavy atom method

A probability distribution $P(\alpha)$ can also be derived in the case where partial knowledge of the structure is available, either from the positions of a small number of heavy atoms or from a large number of light atoms. The two cases are formally identical, and are here both referred to as the 'heavy-atom method'.

Let $\mathbf{f}_H = f_H \exp(i\varphi_H)$ be a structure factor calculated from that part of the structure which is known. Sim (1959) has derived the probability function $P(\alpha)$, by making the assumption that the contribution of the unknown atoms conforms to Wilson statistics (Wilson, 1949), and has a mean square value $\Sigma$. Following Sim, the probability may be written

\[ P(\alpha)\,d\alpha = K \exp(C_H \cos(\alpha - \varphi_H))\,d\alpha , \tag{9} \]

where

\[ C_H = 2F_o f_H/\Sigma \]

and

\[ K^{-1} = 2\pi I_0(C_H) . \]

$I_0$ is the modified zero-order Bessel function of imaginary argument $iC_H$. Substituting (9) into (1), and changing the variable to $\psi = \alpha - \varphi_H$, gives

\[ \xi = F_o \exp(i\varphi_H) \frac{\int_0^{2\pi} \cos\psi \exp\{C_H \cos\psi\}\,d\psi}{\int_0^{2\pi} \exp\{C_H \cos\psi\}\,d\psi} , \tag{10} \]

which may be shown to reduce to

\[ \xi = F_o \exp(i\varphi_H)\,I_1(C_H)/I_0(C_H) \tag{11} \]

(Sim, 1960; Watson, 1922).

One feature of Sim's treatment which may readily be improved is the assumption that the 'known' part of the structure is perfectly accurate. In addition to the contribution to $\Sigma$ from the $L$ atoms whose position is unknown, there will be a further contribution due to error in the parameters of the $H$ 'known' atoms. If the $h$th known atom has been assigned a position $\mathbf{r}_h$, while its true position is $\mathbf{r}_h + \boldsymbol{\delta}_h$, and its scattering factor is $f_h$, then

\[ (\mathbf{f}_H)_{\text{true}} = \sum_{h=1}^{H} f_h \exp\{2\pi i(\mathbf{r}_h + \boldsymbol{\delta}_h)\cdot\mathbf{s}\} . \]

Here $\mathbf{s}$ is the reciprocal-lattice vector, and assuming the angle $2\pi\boldsymbol{\delta}_h\cdot\mathbf{s}$ is small,

\[ (\mathbf{f}_H)_{\text{true}} \simeq (\mathbf{f}_H)_{\text{calc.}} + \sum_{h=1}^{H} 2\pi i(\boldsymbol{\delta}_h\cdot\mathbf{s})\,f_h \exp\{2\pi i\,\mathbf{r}_h\cdot\mathbf{s}\} . \]

The last term is a vector sum made up of the contributions of each 'known' atom to the total structure factor. Its mean square value is

\[ \tfrac{4}{3}\pi^2 s^2 \sum_{h=1}^{H} \sigma_h^2 f_h^2 , \]

where $\sigma_h$ is the r.m.s. value of $\delta_h$, or the standard error in position of the $h$th atom. This quantity needs to be added to the mean square contribution of the 'unknown' atoms, giving

\[ \Sigma = \sum_{l=1}^{L} f_l^2 + \tfrac{4}{3}\pi^2 s^2 \sum_{h=1}^{H} \sigma_h^2 f_h^2 . \tag{12} \]

This latter term, although probably negligible in the early stages, becomes increasingly important in later cycles when more of the structure is known. This is particularly true of the terms with large $s$, or if atomic positions determined at a low resolution (small $s$) are used for phase determination at a higher resolution.
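The accumulation of the isomorphous-replacement vectors in equations (4) to (7) can be sketched in code. This is a minimal illustration, not the authors' program: the function names, the tuple layout `(F_Hj, f_j, phi_j, E_j)` and the helper `make_derivative`, which builds an error-free hypothetical derivative, are all assumptions introduced here.

```python
import cmath
import math


def coefficients(f_o, f_h, f, e_rms):
    """Return (c_j, d_j, e_j) as defined after equation (4)."""
    denom = f_h ** 2 * e_rms ** 2
    diff = f_h ** 2 - f ** 2 - f_o ** 2
    c = diff * f_o * f / (2.0 * denom)
    d = f_o ** 2 * f ** 2 / (4.0 * denom)
    e = diff ** 2 / (8.0 * denom)
    return c, d, e


def accumulate(f_o, derivatives):
    """Form the complex vectors C_is, D_is of equation (6) and the constant K of (7).

    derivatives is a list of (F_Hj, f_j, phi_j, E_j) tuples, phi_j in radians.
    abs(C_is) and cmath.phase(C_is) give C_is and Phi_1;
    abs(D_is) and cmath.phase(D_is) / 2 give D_is and Phi_2.
    """
    c_is = 0j
    d_is = 0j
    k_const = 0.0
    for f_h, f, phi, e_rms in derivatives:
        c, d, e = coefficients(f_o, f_h, f, e_rms)
        c_is += c * cmath.exp(1j * phi)
        d_is += d * cmath.exp(2j * phi)
        k_const += e + d
    return c_is, d_is, k_const


def exponent_from_vectors(alpha, c_is, d_is, k_const):
    """Right-hand side of (7): K - C_is cos(alpha - Phi_1) + D_is cos 2(alpha - Phi_2)."""
    return (k_const
            - (c_is * cmath.exp(-1j * alpha)).real
            + (d_is * cmath.exp(-2j * alpha)).real)


def exponent_from_errors(alpha, f_o, derivatives):
    """Left-hand side of (7): sum of eps_j^2 / 2E_j^2 with eps_j from equation (4)."""
    total = 0.0
    for f_h, f, phi, e_rms in derivatives:
        eps = ((f_o ** 2 + f ** 2 - f_h ** 2) / (2.0 * f_h)
               + (f_o * f / f_h) * math.cos(alpha - phi))
        total += eps ** 2 / (2.0 * e_rms ** 2)
    return total


def make_derivative(f_o, alpha_true, f, phi, e_rms):
    """Hypothetical error-free derivative: F_H closes the triangle of Fig. 1 exactly."""
    f_h = abs(f_o * cmath.exp(1j * alpha_true) + f * cmath.exp(1j * phi))
    return (f_h, f, phi, e_rms)
```

In this sketch the two complex numbers returned by `accumulate`, together with $F_o$, are the five stored parameters per reflexion mentioned in advantage (i), and a later derivative can be added simply by adding its $c_j \exp(i\varphi_j)$ and $d_j \exp(2i\varphi_j)$ to the running sums, as in advantage (ii).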
Alternatively, $\Sigma$ could be estimated for a range of $s$ by the study of a centric zone, in the same way as $E_j$ has been determined for the isomorphous-replacement technique.

It has been assumed that each atom of the structure can be assigned into one of the two groups 'known' and 'unknown'. In the case of an uncertain atom a possible procedure would be to put part of its weight into the 'known' set and part into the 'unknown'. However, while this is the best procedure for determining the rest of the structure, it will not decide whether this atom really exists.

Table 1. The function $g(x, y, \omega)$

The function is set out in blocks of constant $\omega$ (0°, 15°, 30°, 45°, 60°, 75° and 90°). The left-hand blocks give 100 × the magnitude of $g$ and the right-hand blocks its phase in degrees, for $x$ = 0.5 to 5.0 and $y$ = 0 to 4.0 in steps of 0.5.

[Table 1 body: only the $y = 0$ column is legible in this copy; there 100|g| = 24, 45, 60, 70, 76, 81, 84, 86, 88, 89 for $x$ = 0.5, 1.0, ..., 5.0.]

Referring again to equation (10), and comparing it with the function $g$ defined by equation (8), it is evident that for the heavy atom method

\[ \xi = F_o \exp(i\varphi_H)\,g(C_H, 0, 0) . \]

The results of the heavy-atom method have thus been reduced to the same mathematical form as those of the isomorphous-replacement method, and it will next be shown that they may readily be combined.

The combination of the isomorphous-replacement and the heavy-atom method

The case may now be considered, which has arisen in the myoglobin work of Kendrew, where isomorphous-replacement data are available, and also the structure is partially known.
In this case the two contributions to $P(\alpha)$ must be multiplied together, with the result

\[ P(\alpha)\,d\alpha = \exp\{C_{is} \cos(\alpha - \Phi_1) + C_H \cos(\alpha - \varphi_H) - D_{is} \cos 2(\alpha - \Phi_2)\}\,d\alpha . \]

$C_{is}$ and $C_H$ may be combined in just the same way as the $c_j$'s were combined in equation (6):

\[ C \cos \Phi = \sum_{j=1}^{J} c_j \cos \varphi_j + C_H \cos \varphi_H , \qquad C \sin \Phi = \sum_{j=1}^{J} c_j \sin \varphi_j + C_H \sin \varphi_H . \tag{13} \]

By exactly the same steps as lead to equation (8) we reach the result

\[ \xi = F_o \exp(i\Phi)\,g(C, D_{is}, \Phi - \Phi_2) . \tag{14} \]

Thus all available information can be summarized in terms of three variables which determine the definite integrals of $g$. The same function $g(x, y, \omega)$ gives analytical expression to the weighting scheme of Blow & Crick (1959) when applied to the results from any number of isomorphous replacements and the heavy atom method. The various cases are set out in Table 2.

Table 2. The form of the function $g$ in various methods

'Heavy atom' method: $g(C_H, 0, 0)$ (Sim, 1960)
Single isomorphous pair: $g(c_1, d_1, 0)$
Single isomorphous pair combined with 'heavy atom' method: $g(C, d_1, \Phi - \varphi_1)$
Multiple isomorphous replacement method: $g(C_{is}, D_{is}, \Phi_1 - \Phi_2)$
Multiple isomorphous replacement combined with 'heavy atom' method: $g(C, D_{is}, \Phi - \Phi_2)$ (Blow & Rossmann, 1961)

$E_j$ and $\Sigma$ may be determined with moderate accuracy by analysis of the results from a centrosymmetric zone, in the way demonstrated by Blow & Crick (1959). In addition, a theoretical estimate of $\Sigma$ may be made from (12). Although errors in $E_j$ and $\Sigma$ will result in inaccurate relative weighting of the different sets of data, it is unlikely that these will have an important effect on the phase angles.

The function $g(x, y, \omega)$

The nature of the function

\[ g(x, y, \omega) = \frac{\int_0^{2\pi} \exp\{i\psi + x \cos\psi - y \cos 2(\psi + \omega)\}\,d\psi}{\int_0^{2\pi} \exp\{x \cos\psi - y \cos 2(\psi + \omega)\}\,d\psi} \tag{15} \]

may now be considered.
It can be thought of as representing the centre of gravity of a circular wire of radius unity with a density $\exp\{x \cos\psi - y \cos 2(\psi + \omega)\}$. Since the centre of gravity of such a wire can never be outside the wire, it follows that the magnitude of $g$ always lies between 0 and 1, approaching unity as we approach certainty about the true phase of a reflexion.

$g$ must always be real if the weight distribution is symmetrical about $\psi = 0$. This occurs not only when $y = 0$ (as in the case of the heavy atom technique) but also when $|\omega| = 0, \pi/2, \pi, \ldots$. In the case of a single isomorphous pair ($J = 1$) (8) reduces to

\[ \xi = F_o \exp(i\varphi_1)\,g(c_1, d_1, 0) . \]

We know that this case results in a probability distribution for the phase angle $\alpha$ which is symmetrical about $\varphi_1$. Thus $\omega$ may be regarded as expressing the asymmetry of the phase probability distribution.

When $x \gg y$ then $x \cos\psi$ is the governing factor of the probability distribution. Hence we obtain a unimodal distribution with a peak whose sharpness increases as $x$ becomes large. Conversely when $y$ becomes $\gg x$ the $y \cos 2(\psi + \omega)$ term governs the probability distribution. This results in a bimodal distribution with peaks tending towards $\pi/2 - \omega$, $3\pi/2 - \omega$. If $\omega = 0$, $g \to 0$ as $y$ increases; but for other values of $\omega$ (implying an asymmetric distribution), as $y$ increases and sharpens the distribution $|g|$ increases, and its phase swivels round towards $(\pi/2 - \omega)$. For smaller, but still considerable values of $y$, the probability distribution has two peaks, which coalesce into one as $y$ becomes small compared to $x$. It can easily be shown that the distribution must be unimodal when $x \ge 4y$ and $\omega = 0$.
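The properties of $g$ described above can be checked by evaluating equation (15) directly. The sketch below is an illustration under stated assumptions: it replaces the integrals by a 720-point sum, obtains $I_1/I_0$ from the standard power series of the modified Bessel functions, and uses names (`g`, `bessel_ratio`, `count_peaks`, `best_structure_factor`) chosen here for convenience. It is not the EDSAC 2 tabulation.

```python
import cmath
import math


def g(x, y, omega, n_steps=720):
    """Evaluate equation (15) by summing over equally spaced values of psi."""
    step = 2.0 * math.pi / n_steps
    numerator = 0j
    denominator = 0.0
    for k in range(n_steps):
        psi = k * step
        weight = math.exp(x * math.cos(psi) - y * math.cos(2.0 * (psi + omega)))
        numerator += cmath.exp(1j * psi) * weight
        denominator += weight
    return numerator / denominator


def bessel_ratio(x, terms=40):
    """I_1(x) / I_0(x) from the power series of the modified Bessel functions."""
    half = 0.5 * x
    i0 = sum(half ** (2 * k) / math.factorial(k) ** 2 for k in range(terms))
    i1 = sum(half ** (2 * k + 1) / (math.factorial(k) * math.factorial(k + 1))
             for k in range(terms))
    return i1 / i0


def count_peaks(x, y, omega, n_steps=720):
    """Count local maxima of the exponent x cos(psi) - y cos 2(psi + omega) on the grid."""
    step = 2.0 * math.pi / n_steps
    values = [x * math.cos(k * step) - y * math.cos(2.0 * (k * step + omega))
              for k in range(n_steps)]
    return sum(1 for k in range(n_steps)
               if values[k] > values[k - 1] and values[k] > values[(k + 1) % n_steps])


def best_structure_factor(f_o, c_vec, d_vec):
    """Equation (14) for complex vectors C = C exp(i Phi) and D_is exp(2i Phi_2)."""
    big_phi = cmath.phase(c_vec)
    phi_2 = 0.5 * cmath.phase(d_vec)
    return f_o * cmath.exp(1j * big_phi) * g(abs(c_vec), abs(d_vec), big_phi - phi_2)
```

Under these assumptions, `g(x, 0, 0)` agrees with `bessel_ratio(x)`, which is the heavy-atom weight of equation (11); `g` is real for $\omega = 0$ and $\omega = \pi/2$ but not for intermediate values; and `count_peaks` finds one maximum for $x = 5$, $y = 1$, $\omega = 0$ but two for $x = y = 1$, in line with the $x \ge 4y$ condition.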
The strategy of a structure determination

Apart from its direct application to the combination of structural and isomorphous-replacement data, the analysis given above is useful in evaluating the relative merits of different strategies in a large structure determination.

First consider the effect of increasing the number of isomorphous compounds. This may be done by considering the effect of adding one member to the vector summation for $C_{is}$ and $D_{is}$, defined by (6). It may be seen from (3) and (4) that if $\alpha_0$ is a phase angle for which $\varepsilon_j = 0$, so that

\[ 2F_o f_j \cos(\alpha_0 - \varphi_j) = F_{Hj}^2 - F_o^2 - f_j^2 , \]

then

\[ c_j = 4d_j \cos(\alpha_0 - \varphi_j) , \]

while $d_j \simeq f_j^2/4E_j^2$ is a positive number. If the phases are assumed to be random, the problem corresponds to a random-walk problem, giving a mean square value for $D_{is}$

\[ \langle D_{is}^2 \rangle = \sum_{j=1}^{J} d_j^2 \simeq \sum_{j=1}^{J} f_j^4/(16E_j^4) . \tag{16} \]

Thus in the case where the $f_j/E_j$'s are of the same order for the different isomorphous derivatives, the r.m.s. value of $D_{is}$ increases as $\sqrt{J}$. Note that two isomorphous substituents $j$, $k$ at closely adjacent sites will lead to similar values of $\varphi_j$, $\varphi_k$, making $D_{is}$ larger than it would be if the phases were random.

For a rough approximation it can be assumed that $\alpha_0$ is the 'true' phase, and has the same value for all $j$. In the vector summation

\[ C_{is} = \sum_{j=1}^{J} 4d_j \cos(\alpha_0 - \varphi_j) \exp(i\varphi_j) \]

it is seen that terms with phase $\varphi_j$ far from $\alpha_0$ are scaled down by the factor $\cos(\alpha_0 - \varphi_j)$, and thus $C_{is}$ tends to have the phase of $\alpha_0$. Expanding

\[ C_{is} = \sum_{j=1}^{J} 4d_j\,[\cos\alpha_0 \cos^2\varphi_j + \sin\alpha_0 \sin\varphi_j \cos\varphi_j + i \cos\alpha_0 \sin\varphi_j \cos\varphi_j + i \sin\alpha_0 \sin^2\varphi_j] . \]

Assuming, as before, a random distribution of $\varphi_j$, we see that the mean value of $C_{is}$ is

\[ \langle C_{is} \rangle = \sum_{j=1}^{J} 2d_j \exp(i\alpha_0) \simeq \sum_{j=1}^{J} [f_j^2/2E_j^2] \exp(i\alpha_0) . \tag{17} \]

If all $\langle f_j^2/E_j^2 \rangle$ are of the same order of magnitude, $\langle |C_{is}| \rangle$ increases as $J$ and $\langle |C_{is}| \rangle / \langle D_{is}^2 \rangle^{1/2}$ as $\sqrt{J}$. This expresses the fact that, as the number of isomorphous compounds is increased, the distribution is sharpened and there is a decrease of their tendency towards being bimodal.

Consider now the heavy-atom method. If a large number of atomic positions have been found in a partially known structure, we may assume that Wilson statistics apply, and the mean square value of $f_H$ is

\[ \langle f_H^2 \rangle = \sum_{h=1}^{H} f_h^2 , \]

while

\[ \langle F_o^2 \rangle = \langle f_H^2 \rangle + \Sigma . \]

Hence the mean square value of $C_H$ is

\[ \langle C_H^2 \rangle = 4 \langle F_o^2 \rangle \langle f_H^2 \rangle / (\langle F_o^2 \rangle - \langle f_H^2 \rangle)^2 . \]

Let $\langle f_H^2 \rangle = r \langle F_o^2 \rangle$ so that we may say briefly 'a fraction $r$ of the structure is known'. Then the r.m.s. value of $C_H$ is

\[ \langle C_H^2 \rangle^{1/2} = 2r^{1/2}/(1 - r) . \tag{18} \]

Remembering that $C$ characterizes the sharpness of the phase probability distribution and thus the accuracy of the phase determination, we are now in a position to answer questions about the relative power of the isomorphous-replacement method and the heavy atom method under specific conditions. Suppose, for example, a fraction $r$ of the structure has been revealed by using $J$ isomorphous replacements, is it more advantageous to use one more isomorphous replacement, or to use the heavy-atom method? We can calculate the probable value of $|C|$ in each case. Let $\langle C_J \rangle$ be the mean value of $C$ when $J$ isomorphous replacements are used. Then from (17)

\[ \langle C_{J+1} \rangle = \frac{J + 1}{J} \langle C_J \rangle , \]

while using (18), adding heavy-atom data gives

\[ \langle C_{J+H}^2 \rangle \simeq \langle C_J \rangle^2 + \frac{2r}{(1 - r)^2} . \]

It may be mentioned that the isomorphous-replacement method tends to increase $D_{is}$ as well as $C$, so that a comparison based only on the value of $|C|$ tends to over-estimate the value of the isomorphous-replacement method. Nevertheless, to make the example more definite, let us consider typical data for the protein haemoglobin. These would be

\[ \langle f_j^2 \rangle = 100^2, \qquad E_j = 50 . \]

Thus $f_j^2/2E_j^2 = 2$ and $\langle C_J \rangle = 2J$, so that

\[ \langle C_{J+1} \rangle^2 \simeq 4(J + 1)^2 , \qquad \langle C_{J+H}^2 \rangle \simeq 4J^2 + \frac{2r}{(1 - r)^2} . \]

The heavy-atom method will therefore be more powerful if the latter quantity is greater (for $J = 2$, for example, if $2r/(1 - r)^2 > 20$), namely if $r > 0.73$ ($J = 2$), $0.77$ ($J = 3$), $0.79$ ($J = 4$). (In the case $J = 1$ it would certainly be necessary to take account of the effect of $D_{is}$.) These results are sensitive to the relative magnitudes of the inaccurately known $E_j$ and $\Sigma$ values. This type of calculation can therefore only give an order of magnitude for $r$. More detailed comparisons could readily be made.

From the above it can be seen how quantitative expression can be given to the convergence of sets of isomorphous-replacement data or heavy-atom refinements, towards an accurate set of phases. In principle it would be possible to go further and use the actual values of $|g|$, which are the 'figures of merit' (Dickerson et al., 1961), to give the standard error of electron density (Blow & Crick, 1959).

This study has been stimulated by a fruitful interchange of ideas with other members of this Unit. In particular we are indebted to Dr F. H. C. Crick, who we hope will recognize his influence throughout the work. The tabulation of the function $g(x, y, \omega)$ was made possible by the use of the University of Cambridge Mathematical Laboratory's electronic computer EDSAC 2.

References

BLOW, D. M. (1958). Proc. Roy. Soc. A, 247, 302.
BLOW, D. M. & CRICK, F. H. C. (1959). Acta Cryst. 12, 794.
BLOW, D. M. & ROSSMANN, M. G. (1960). Acta Cryst. 14 (In press).
DICKERSON, R. E., KENDREW, J. C. & STRANDBERG, B. E. (1961).
Conference on Computing Methods and the Phase Problem. (Glasgow, 1960, London: Pergamon Press.)
KENDREW, J. C., DICKERSON, R. E., STRANDBERG, B. E., HART, R. G., DAVIES, D. R., PHILLIPS, D. C. & SHORE, V. C. (1960). Nature, Lond. 185, 422.
PERUTZ, M. F., ROSSMANN, M. G., CULLIS, A. F., MUIRHEAD, H., WILL, G. & NORTH, A. C. T. (1960). Nature, Lond. 185, 416.
SIM, G. A. (1959). Acta Cryst. 12, 813.
SIM, G. A. (1960). Acta Cryst. 13, 511.
WATSON, G. N. (1922). The Theory of Bessel Functions, p. 77. Cambridge: University Press.
WILSON, A. J. C. (1949). Acta Cryst. 2, 318.

Acta Cryst. (1961). 14, 647

Preparation and Structure of $\mathrm{Ba_5Ta_4O_{15}}$ and Related Compounds*

By Francis Galasso and Lewis Katz

Department of Chemistry, University of Connecticut, Storrs, Connecticut, U.S.A.

(Received 3 August 1960)

The preparation and characterization of a new ternary oxide of tantalum, $\mathrm{Ba_5Ta_4O_{15}}$, and isomorphous compounds, $\mathrm{Sr_5Ta_4O_{15}}$ and $\mathrm{Ba_5Nb_4O_{15}}$, are described. The barium tantalum compound was found to belong to the trigonal system; the axes of the primitive hexagonal cell are $a = 5.79$, $c = 11.75$ Å. Single crystals of $\mathrm{Ba_5Ta_4O_{15}}$ grown in a lead(II) oxide flux were used in determining the structure. Anion deficiencies have been produced in these compounds by preparing them with tetravalent niobium and tantalum.

Introduction

In a general survey of the barium-tantalum-oxygen system, the existence of several phases has been noted. At a ratio of barium to tantalum of one half, two different phases have been reported, one a hexagonal phase, $\mathrm{Ba_{0.44}TaO_{2.94}}$ (Galasso, Katz & Ward, 1958), and the other a tetragonal phase, $\mathrm{Ba_{0.5}TaO_3}$ (Galasso, Katz & Ward, 1959a) with the tetragonal bronze structure (Magnéli, 1949). When the Ba/Ta ratio was increased to three, the phase $\mathrm{Ba(Ba_{0.5}Ta_{0.5})O_{2.75}}$ was obtained with an ordered cubic perovskite structure (Brixner, 1958; Galasso, Katz & Ward, 1959b). At an intermediate ratio, the compound $\mathrm{Ba_5Ta_4O_{15}}$ has been identified.
The preparation and structure of this compound and isomorphous phases is the subject of this paper.

Experimental

Preparation of $\mathrm{Ba_5Ta_4O_{15}}$

The best preparation of $\mathrm{Ba_5Ta_4O_{15}}$ resulted from mixing barium carbonate in 2% excess of the amount indicated in the reaction

\[ 5\,\mathrm{BaCO_3} + 2\,\mathrm{Ta_2O_5} \rightarrow \mathrm{Ba_5Ta_4O_{15}} + 5\,\mathrm{CO_2} \]

and heating in air at 1150 °C for 24 hr. The white product obtained gave an X-ray powder pattern which was indexed with the aid of single crystal data on the basis of a hexagonal cell with $a = 5.79$ Å and $c = 11.75$ Å. The density was found pycnometrically to be 7.9 g cm$^{-3}$, and using the above parameters the unit cell content weight was calculated to be 1622, as compared to 1650 for the formula $\mathrm{Ba_5Ta_4O_{15}}$. Analysis gave 44.04% Ta, 42.70% Ba, as compared to the theoretical 43.85% Ta, 41.64% Ba for $\mathrm{Ba_5Ta_4O_{15}}$.

* Part of this research was sponsored by the Office of Naval Research. Reproduction in whole or in part is permitted for any purpose of the United States Government.

Preparation of $\mathrm{Ba_5Ta_4^{IV}O_{13}}$

The compound $\mathrm{Ba_5Ta_4^{IV}O_{13}}$ was prepared in an evacuated sealed silica capsule using tantalum pentoxide and tantalum metal as a source of tantalum IV, and barium oxide as the source of barium. As is frequently the case with oxygen deficient compounds, the powder pattern of the reduced (blue) phase could be indexed using the same parameters as for the oxidized phase.
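The unit-cell content weight quoted for $\mathrm{Ba_5Ta_4O_{15}}$ can be reproduced from the density and the hexagonal cell edges. The sketch below is an illustrative check only: the atomic weights and the Avogadro constant are modern standard values supplied here as assumptions, and the function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1 (modern value, an assumption here)

# Standard atomic weights, supplied as an assumption; the paper does not list them.
ATOMIC_WEIGHT = {"Ba": 137.327, "Ta": 180.948, "O": 15.999}


def hexagonal_cell_volume(a, c):
    """Volume in cubic angstroms of a hexagonal cell with edges a and c in angstroms."""
    return math.sqrt(3.0) / 2.0 * a * a * c


def cell_content_weight(density, volume):
    """Mass of one cell in atomic mass units from density (g cm^-3) and volume (A^3)."""
    return density * volume * 1e-24 * AVOGADRO


def formula_weight(composition):
    """Formula weight for a dict mapping element symbol to number of atoms."""
    return sum(ATOMIC_WEIGHT[element] * count for element, count in composition.items())


volume = hexagonal_cell_volume(5.79, 11.75)
measured = cell_content_weight(7.9, volume)
ideal = formula_weight({"Ba": 5, "Ta": 4, "O": 15})
tantalum_percent = 100.0 * 4 * ATOMIC_WEIGHT["Ta"] / ideal
```

With these assumed constants the measured cell content comes out within a unit or two of the quoted 1622 and the formula weight close to 1650, which is consistent with one formula unit of $\mathrm{Ba_5Ta_4O_{15}}$ per cell.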