Acta Cryst. (1961). 14, 641
The Refinement of Structures Partially Determined
by the Isomorphous Replacement Method
BY MICHAEL G. ROSSMANN AND D. M. BLOW
Medical Research Council Unit for Molecular Biology, Cavendish Laboratory, Cambridge, England
(Received 8 August 1960)
When independent results about the phase of a reflexion are obtained, what is the best way of
combining them? This problem arises in the multiple isomorphous replacement method, and in a
more general form where, in addition, part of the structure is known. The method proposed is to
use each result to form a probability function for the phase of the reflexion; the combination of these
results is achieved by multiplying the probability functions together. This joint probability function
can be fully represented by the magnitude and phase of two vectors $C$ and $D_{is}$ formed by addition
of components from each result. From these, structure factors can be calculated which provide the
'best' combination of all the data. A discussion of the significance of the vectors $C$, $D_{is}$ indicates
how estimates of the relative importance of structural information and of the isomorphous-replacement
method may be made.
Introduction
The isomorphous-replacement method has been used by Kendrew et al. (1960) to determine the
phases of 9,600 reflexions in a study of the protein myoglobin at 2 Å resolution. In the resulting
Fourier synthesis, most of the atoms in the helical polypeptide chain can be placed with precision,
and a number of the amino acid side chains can be identified. The question now arises, how may an
extension or refinement of the structure best be made? The two available techniques, calculation of
phases either on the basis of the known part of the structure or by the isomorphous-replacement
method, will lead to different sets of phase angles. Is there a satisfactory way of combining these?
The large number of reflexions imposes a requirement for a completely automatic method of
combination. At the same time, the labour of model building is such that it is worth doing a great
deal of preliminary computing in order to reduce the number of cycles of refinement to a minimum.
The problem is approached from the point of view of an earlier paper on the treatment of errors in
the isomorphous-replacement method (Blow & Crick, 1959). In a projection subject to error, one may
define as 'best Fourier' the Fourier transform which has least mean-square difference from the 'true'
Fourier transform when averaged over the whole unit cell. In the usual case, one knows the magnitude
$F_o$ of the structure amplitude with accuracy, but its phase is uncertain. If the relative probability
of each phase is plotted round a circle of radius $F_o$ on an Argand diagram, the centre of gravity of
the probability distribution gives the structure factor, $\xi$, to be used in calculating the 'best
Fourier'. This may be expressed by
$$ \xi = F_o \, \frac{\int_0^{2\pi} \exp(i\alpha)\, P(\alpha)\, d\alpha}{\int_0^{2\pi} P(\alpha)\, d\alpha} . \tag{1} $$
Here $P(\alpha)\,d\alpha$ is the relative probability that the phase angle lies between $\alpha$ and
$\alpha + d\alpha$.
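As a minimal sketch of equation (1), assuming the relative probability has been sampled at equal angular steps round the circle (the function name `best_structure_factor` and the sample values are illustrative and not part of the paper), the centre of gravity can be formed as a weighted sum of unit vectors:

```python
import cmath
import math


def best_structure_factor(F_o, probs):
    """Centre of gravity of a sampled phase distribution, eq. (1).

    probs[k] is the relative probability at alpha_k = 2*pi*k/len(probs).
    The common step d(alpha) cancels between numerator and denominator.
    """
    n = len(probs)
    total = sum(probs)
    if total <= 0.0:
        raise ValueError("probabilities must have a positive sum")
    centroid = sum(p * cmath.exp(2j * math.pi * k / n)
                   for k, p in enumerate(probs)) / total
    return F_o * centroid


# Hypothetical data: 72 samples correspond to 5 degree steps.
n = 72
uniform = [1.0] * n                                  # phase completely unknown
peaked = [1.0 if k == 18 else 0.0 for k in range(n)]  # all weight at 90 degrees

xi_uniform = best_structure_factor(100.0, uniform)   # magnitude close to zero
xi_peaked = best_structure_factor(100.0, peaked)     # close to 100 at phase 90 degrees
```

A completely undetermined phase gives a coefficient of zero weight, while a sharply determined phase gives the full amplitude $F_o$; intermediate cases fall between the two.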
In the following paper we present (a) a convenient analytical expression, governed by two vectors,
that represents the phase probability distribution indicated by a set of isomorphous-replacement
data; (b) a similar expression, governed by one vector, that represents the corresponding
distribution for the 'heavy-atom' or partially known structure method; (c) a method for the
combination of these to obtain the 'best Fourier' when the two methods are combined simultaneously.
This leads to some general strategic considerations in the solution of large structures.
The isomorphous replacement method
In the isomorphous-replacement method Blow & Crick (1959) show that a particular phase angle,
$\alpha$, is related to a particular error $\varepsilon(\alpha)$ in the experimental data of one
isomorphous compound. This error can be allotted a probability $\exp\{-\varepsilon^2(\alpha)/2E^2\}$
on the basis of a Gaussian distribution of error with a standard deviation $E$, which can be
estimated, for example, from data for a centrosymmetric zone. In general, $J$ isomorphous derivatives
may be used simultaneously. In this case
$$ P(\alpha)\,d\alpha = \prod_{j=1}^{J} \exp\{-\varepsilon_j^2(\alpha)/2E_j^2\}\,d\alpha
   = \exp\Big\{-\sum_{j=1}^{J} \varepsilon_j^2(\alpha)/2E_j^2\Big\}\,d\alpha . \tag{2} $$
$\varepsilon_j$ is given by
$$ (F_{Hj} + \varepsilon_j)^2 = F_o^2 + f_j^2 + 2F_o f_j \cos(\alpha - \varphi_j) , \tag{3} $$
where $F_o$ is the structure amplitude of the unsubstituted derivative, $F_{Hj}$ that of the $j$th
heavy-atom derivative, and $f_j \exp(i\varphi_j)$ is the calculated structure factor of the
substituent atoms in the $j$th compound (see Fig. 1).

Fig. 1. Vector diagram showing error in the isomorphous replacement technique.

Workers who have used this method (Blow, 1958; Kendrew et al., 1960; Perutz et al., 1960) have so far
used numerical methods for the evaluation of (1), by calculating $P(\alpha)$ from (2) and (3) at
intervals of 5° or 10°. An analytical expression for the integrals of (1) will now be developed.
Since in all important cases $F_o, F_{Hj} \gg \varepsilon_j$ except where the probabilities are
negligible, $\varepsilon_j^2$ can safely be neglected in (3). This leads at once to
$$ \varepsilon_j = (F_o^2 + f_j^2 - F_{Hj}^2)/2F_{Hj} + (F_o f_j/F_{Hj}) \cos(\alpha - \varphi_j) \tag{4} $$
and
$$ \varepsilon_j^2/2E_j^2 = e_j - c_j \cos(\alpha - \varphi_j) + 2d_j \cos^2(\alpha - \varphi_j)
   = (e_j + d_j) - c_j \cos(\alpha - \varphi_j) + d_j \cos 2(\alpha - \varphi_j) , $$
where
$$ c_j = \frac{(F_{Hj}^2 - f_j^2 - F_o^2)\,F_o f_j}{2F_{Hj}^2 E_j^2} , \qquad
   d_j = \frac{F_o^2 f_j^2}{4F_{Hj}^2 E_j^2} , \qquad
   e_j = \frac{(F_{Hj}^2 - f_j^2 - F_o^2)^2}{8F_{Hj}^2 E_j^2} . $$
This expression shows that $\varepsilon_j^2(\alpha)$ can be expressed with a good accuracy by a
Fourier expansion up to the second-order term, and $c_j$ and $d_j$ may be thought of as Fourier
coefficients. As will be shown presently, they are, however, like vectorial coefficients with phases
$\varphi_j$ and $2\varphi_j$.
It follows from (4) that
$$ \sum_{j=1}^{J} \varepsilon_j^2/2E_j^2 = \sum_{j=1}^{J} (e_j + d_j)
   - \sum_{j=1}^{J} c_j \cos(\alpha - \varphi_j) + \sum_{j=1}^{J} d_j \cos 2(\alpha - \varphi_j) . \tag{5} $$
The first term of (5) results in a constant multiplier, which appears at top and bottom of (1). The
second and third terms provide the required information for calculation of the structure factors,
$\xi$, which give rise to the 'best Fourier'.
Let us now write
$$ C_{is} \cos \Phi_1 = \sum_{j=1}^{J} c_j \cos \varphi_j ; \qquad
   C_{is} \sin \Phi_1 = \sum_{j=1}^{J} c_j \sin \varphi_j ; $$
$$ D_{is} \cos 2\Phi_2 = \sum_{j=1}^{J} d_j \cos 2\varphi_j ; \qquad
   D_{is} \sin 2\Phi_2 = \sum_{j=1}^{J} d_j \sin 2\varphi_j . \tag{6} $$
This corresponds to forming vectors $C_{is}$ and $D_{is}$ by addition of the $c_j$ and $d_j$ vectors
with their corresponding phases, $\varphi_j$ and $2\varphi_j$. (5) now reduces to
$$ \sum_{j=1}^{J} \varepsilon_j^2/2E_j^2 = K - C_{is} \cos(\alpha - \Phi_1) + D_{is} \cos 2(\alpha - \Phi_2) . \tag{7} $$
Finally, by combining (1), (2) and (7) we have
$$ \xi = F_o \, \frac{\int_0^{2\pi} \exp\{i\alpha\} \exp\{C_{is} \cos(\alpha - \Phi_1) - D_{is} \cos 2(\alpha - \Phi_2)\}\,d\alpha}
                       {\int_0^{2\pi} \exp\{C_{is} \cos(\alpha - \Phi_1) - D_{is} \cos 2(\alpha - \Phi_2)\}\,d\alpha} . $$
Putting $\psi = \alpha - \Phi_1$,
$$ \xi = F_o \exp(i\Phi_1) \, \frac{\int_0^{2\pi} \exp\{i\psi + C_{is} \cos\psi - D_{is} \cos 2(\psi + \Phi_1 - \Phi_2)\}\,d\psi}
                                     {\int_0^{2\pi} \exp\{C_{is} \cos\psi - D_{is} \cos 2(\psi + \Phi_1 - \Phi_2)\}\,d\psi} $$
or most briefly
$$ \xi = F_o \exp(i\Phi_1)\, g(C_{is}, D_{is}, \Phi_1 - \Phi_2) . \tag{8} $$
This last form shows that the integrals are functions of the three variables $C_{is}$, $D_{is}$,
$(\Phi_1 - \Phi_2)$. Their ratio is a complex number $g(C_{is}, D_{is}, \Phi_1 - \Phi_2)$ which can
readily be calculated on a computer. Some values are given in Table 1.
The validity of the approximation introduced by neglecting $\varepsilon_j^2$ was tested by
computation of some phase probability curves from data for five isomorphous compounds of horse
haemoglobin. These were calculated both by using equation (3) (accurate) and equation (4)
(approximate). The two curves for the reflexion showing the poorest agreement are shown in Fig. 2.
The error, about 5°, is quite insignificant.

Fig. 2. Dashed line shows probability distribution for reflexion (260) of haemoglobin with
approximations suggested in this paper; continuous line shows the same distribution calculated
without error.
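The step from (3) to (4) and the Fourier form of $\varepsilon_j^2/2E_j^2$ can be checked numerically. The sketch below is illustrative only: the function names and the amplitudes ($F_o = 1000$, $F_{Hj} = 1050$, $f_j = 100$, $E_j = 50$) are assumptions chosen for the example, not data from the paper.

```python
import math


def closure_error_exact(F_o, F_H, f, phi, alpha):
    """epsilon from eq. (3): |F_o e^{i alpha} + f e^{i phi}| - F_H."""
    return math.sqrt(F_o**2 + f**2 + 2.0 * F_o * f * math.cos(alpha - phi)) - F_H


def closure_error_approx(F_o, F_H, f, phi, alpha):
    """epsilon from eq. (4), i.e. eq. (3) with the epsilon**2 term dropped."""
    return (F_o**2 + f**2 - F_H**2) / (2.0 * F_H) + (F_o * f / F_H) * math.cos(alpha - phi)


def fourier_coefficients(F_o, F_H, f, E):
    """c_j, d_j, e_j as defined after eq. (4)."""
    k = F_H**2 - f**2 - F_o**2
    c = k * F_o * f / (2.0 * F_H**2 * E**2)
    d = F_o**2 * f**2 / (4.0 * F_H**2 * E**2)
    e = k**2 / (8.0 * F_H**2 * E**2)
    return c, d, e


# Hypothetical amplitudes for a single derivative.
F_o, F_H, f, phi, E = 1000.0, 1050.0, 100.0, 0.5, 50.0
c, d, e = fourier_coefficients(F_o, F_H, f, E)

series_gap = 0.0   # Fourier form versus eq. (4) squared: should vanish
neglect_gap = 0.0  # eq. (4) versus eq. (3): should equal eps**2 / (2 F_H)
for step in range(72):                       # 5 degree intervals
    alpha = math.radians(5 * step)
    exact = closure_error_exact(F_o, F_H, f, phi, alpha)
    approx = closure_error_approx(F_o, F_H, f, phi, alpha)
    series = (e + d) - c * math.cos(alpha - phi) + d * math.cos(2.0 * (alpha - phi))
    series_gap = max(series_gap, abs(series - approx**2 / (2.0 * E**2)))
    neglect_gap = max(neglect_gap, abs(approx - exact - exact**2 / (2.0 * F_H)))
```

The second check makes the size of the approximation explicit: the error in $\varepsilon_j$ introduced by (4) is $\varepsilon_j^2/2F_{Hj}$, which is small wherever $\varepsilon_j \ll F_{Hj}$.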
All the information available about a structure factor from isomorphous replacements can thus be
expressed in terms of five parameters: $F_o$ and the real and imaginary parts of $C_{is}$ and
$D_{is}$, given by
$$ C_{is} = \sum_{j=1}^{J} \frac{(F_{Hj}^2 - f_j^2 - F_o^2)\,F_o f_j}{2F_{Hj}^2 E_j^2} \exp(i\varphi_j) , \qquad
   D_{is} = \sum_{j=1}^{J} \frac{F_o^2 f_j^2}{4F_{Hj}^2 E_j^2} \exp(i2\varphi_j) . $$
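The vector sums for $C_{is}$ and $D_{is}$ map directly onto complex arithmetic. The following is a minimal sketch, assuming the coefficients $c_j$, $d_j$ and phases $\varphi_j$ have already been computed for each derivative; the class name `IsomorphousSummary` and the numerical values are hypothetical.

```python
import cmath


class IsomorphousSummary:
    """Running vector sums C_is and D_is for one reflexion (eq. 6)."""

    def __init__(self):
        self.C = 0j  # sum of c_j exp(i phi_j)
        self.D = 0j  # sum of d_j exp(2 i phi_j)

    def add_derivative(self, c_j, d_j, phi_j):
        """Add one isomorphous derivative; phi_j is in radians."""
        self.C += c_j * cmath.exp(1j * phi_j)
        self.D += d_j * cmath.exp(2j * phi_j)

    def parameters(self):
        """Return (C_is, Phi_1, D_is, Phi_2); Phi_2 is defined modulo pi."""
        return abs(self.C), cmath.phase(self.C), abs(self.D), 0.5 * cmath.phase(self.D)


# Hypothetical coefficients for two derivatives of the same reflexion.
summary = IsomorphousSummary()
summary.add_derivative(1.0, 0.5, 0.3)
single = summary.parameters()            # with one derivative, Phi_1 = Phi_2 = phi_1
summary.add_derivative(2.0, 0.25, 1.2)   # a derivative that becomes available later
combined = summary.parameters()
```

Only the two complex sums (together with $F_o$) need to be stored for each reflexion, and a further derivative is incorporated by one more call to `add_derivative`.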
For practical application in a computer this has several advantages:
(i) It is necessary to store only five parameters per reflexion as a complete record of the results,
rather than the $(3J + 1)$ parameters required for observed structure amplitudes and calculated
heavy-atom contributions.
(ii) If further information becomes available about a reflexion at a later stage, it can be readily
added into the $C_{is}$ and $D_{is}$ vectors.
(iii) Only one integration is required (or a single reference to a three-dimensional table of $g$),
rather than $J$ such operations which would be required if the isomorphous replacements were treated
separately.
The heavy atom method
A probability distribution $P(\alpha)$ can also be derived in the case where partial knowledge of the
structure is available, either from the positions of a small number of heavy atoms or from a large
number of light atoms. The two cases are formally identical, and are here both referred to as the
'heavy-atom method'.
Let $\mathbf{f}_H = f_H \exp(i\varphi_H)$ be a structure factor calculated from that part of the
structure which is known. Sim (1959) has derived the probability function $P(\alpha)$, by making the
assumption that the contribution of the unknown atoms conforms to Wilson statistics (1949), and has a
mean square value $\Sigma$. Following Sim, the probability may be written
$$ P(\alpha)\,d\alpha = K \exp\{C_H \cos(\alpha - \varphi_H)\}\,d\alpha , \tag{9} $$
where
$$ C_H = 2F_o f_H/\Sigma $$
and
$$ K^{-1} = 2\pi I_0(C_H) . $$
$I_0$ is the modified zero-order Bessel function (the Bessel function of imaginary argument $iC_H$).
Substituting (9) into (1), and changing the variable to $\psi = \alpha - \varphi_H$, gives
$$ \xi = F_o \exp(i\varphi_H) \, \frac{\int_0^{2\pi} \cos\psi \, \exp\{C_H \cos\psi\}\,d\psi}
                                       {\int_0^{2\pi} \exp\{C_H \cos\psi\}\,d\psi} , \tag{10} $$
which may be shown to reduce to
$$ \xi = F_o \exp(i\varphi_H)\, I_1(C_H)/I_0(C_H) \tag{11} $$
(Sim, 1960; Watson, 1922).
One feature of Sim's treatment which may readily be improved is the assumption that the 'known' part
of the structure is perfectly accurate. In addition to the contribution to $\Sigma$ from the $L$
atoms whose position is unknown, there will be a further contribution due to error in the parameters
of the $H$ 'known' atoms. If the $h$th known atom has been assigned a position $\mathbf{r}_h$, while
its true position is $\mathbf{r}_h + \boldsymbol{\delta}_h$, and its scattering factor is $f_h$, then
$$ (\mathbf{f}_H)_{true} = \sum_{h=1}^{H} f_h \exp\{2\pi i (\mathbf{r}_h + \boldsymbol{\delta}_h) \cdot \mathbf{s}\} . $$
Here $\mathbf{s}$ is the reciprocal-lattice vector, and assuming the angle
$2\pi \boldsymbol{\delta}_h \cdot \mathbf{s}$ is small,
$$ (\mathbf{f}_H)_{true} \simeq (\mathbf{f}_H)_{calc} + \sum_{h=1}^{H} 2\pi i (\boldsymbol{\delta}_h \cdot \mathbf{s}) f_h \exp\{2\pi i \mathbf{r}_h \cdot \mathbf{s}\} . $$
The last term is a vector sum made up of the contributions of each 'known' atom to the total
structure factor. Its mean square value, for errors of random direction, is
$$ \tfrac{4}{3} \pi^2 s^2 \sum_{h=1}^{H} \sigma_h^2 f_h^2 , $$
where $\sigma_h$ is the r.m.s. value of $\delta_h$, or the standard error in position of the $h$th
atom. This quantity needs to be added to the mean square contribution of the 'unknown' atoms, giving
$$ \Sigma = \sum_{l=1}^{L} f_l^2 + \tfrac{4}{3} \pi^2 s^2 \sum_{h=1}^{H} \sigma_h^2 f_h^2 . \tag{12} $$
This latter term, although probably negligible in the early stages, becomes increasingly important
in later cycles when more of the structure is known. This is particularly true of the terms with
large $s$, or if atomic positions determined at a low resolution (small $s$) are used for phase
determination at a higher resolution.
Alternatively, $\Sigma$ could be estimated for a range of $s$ by the study of a centric zone, in the
same way as $E_j$ has been determined for the isomorphous-replacement technique.
It has been assumed that each atom of the structure can be assigned into one of the two groups
'known' and 'unknown'. In the case of an uncertain atom a possible procedure would be to put part of
its weight into the 'known' set and part into the 'unknown'. However, while this is the best
procedure for determining the rest of the structure, it will not decide whether this atom really
exists.
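The weighting implied by equations (9) to (11) can be sketched as follows. The Bessel functions are summed from their power series so that the example needs only the standard library; the function names and the input values are illustrative assumptions, not taken from the paper.

```python
import math


def bessel_i(order, x, tol=1e-15):
    """Modified Bessel function I_order(x) from its ascending power series.

    I_n(x) = sum_k (x/2)**(2k+n) / (k! (k+n)!), adequate for moderate x.
    """
    half = 0.5 * x
    term = half**order / math.factorial(order)
    total = term
    k = 0
    while abs(term) > tol * abs(total):
        k += 1
        term *= half * half / (k * (k + order))
        total += term
    return total


def heavy_atom_weight(F_o, f_H, sigma):
    """|xi| / F_o = I_1(C_H) / I_0(C_H), with C_H = 2 F_o f_H / Sigma (eqs. 9-11)."""
    C_H = 2.0 * F_o * f_H / sigma
    return bessel_i(1, C_H) / bessel_i(0, C_H)


# Hypothetical reflexions, all with Sigma = 1 on an arbitrary scale.
w_weak = heavy_atom_weight(1.0, 0.05, 1.0)   # C_H = 0.1, weight near zero
w_mid = heavy_atom_weight(1.0, 0.5, 1.0)     # C_H = 1
w_strong = heavy_atom_weight(2.0, 2.0, 1.0)  # C_H = 8, weight approaching unity
```

The weight rises monotonically from 0 towards 1 as $C_H$ grows, so reflexions for which the known part of the structure dominates enter the 'best Fourier' at nearly full amplitude.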
Table 1. The function $g(x, y, \omega)$
The function is set out in blocks of constant $\omega$. The left hand blocks give $100 \times$ the
magnitude of $g$ and the right hand blocks its phase in degrees.
[Blocks for $\omega$ = 0°, 15°, 30°, 45°, 60°, 75° and 90°; rows $x$ = 0 to 5.0 and columns $y$ = 0
to 4.0, each in steps of 0.5. Tabulated values not reproduced.]
Referring again to equation (10), and comparing it with the function $g$ defined by equation (8), it
is evident that for the heavy atom method
$$ \xi = F_o \exp(i\varphi_H)\, g(C_H, 0, 0) . $$
The results of the heavy-atom method have thus been reduced to the same mathematical form as those
of the isomorphous-replacement method, and it will next be shown that they may readily be combined.
The combination of the isomorphous-replacement and the heavy-atom method
The case may now be considered, which has arisen in the myoglobin work of Kendrew, where
isomorphous-replacement data are available, and also the structure is partially known. In this case
the two contributions to $P(\alpha)$ must be multiplied together, with the result
$$ P(\alpha)\,d\alpha = \exp\{C_{is} \cos(\alpha - \Phi_1) + C_H \cos(\alpha - \varphi_H)
   - D_{is} \cos 2(\alpha - \Phi_2)\}\,d\alpha . $$
$C_{is}$ and $C_H$ may be combined in just the same way as the $c_j$'s were combined in equation (6):
$$ C \cos \Phi = \sum_{j=1}^{J} c_j \cos \varphi_j + C_H \cos \varphi_H , \qquad
   C \sin \Phi = \sum_{j=1}^{J} c_j \sin \varphi_j + C_H \sin \varphi_H . \tag{13} $$
By exactly the same steps as lead to equation (8) we reach the result
$$ \xi = F_o \exp(i\Phi)\, g(C, D_{is}, \Phi - \Phi_2) . \tag{14} $$
Thus all available information can be summarized in terms of three variables which determine the
definite integrals of $g$.
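In complex form the combination (13) is a single vector addition. As a purely illustrative case (the numerical values are assumed, not taken from the paper), let $C_{is} = 2$ at $\Phi_1 = 0°$ and $C_H = 2$ at $\varphi_H = 90°$:

$$ C e^{i\Phi} = C_{is} e^{i\Phi_1} + C_H e^{i\varphi_H} = 2 + 2i
   \quad\Longrightarrow\quad C = 2\sqrt{2} \approx 2.83 , \qquad \Phi = 45° , $$

so that (14) gives $\xi = F_o \exp(i\pi/4)\, g(2.83, D_{is}, 45° - \Phi_2)$. Two sources of equal weight that disagree by 90° thus yield a phase midway between them, with a sharpness parameter smaller than the value 4 that would result if they agreed.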
The same function $g(x, y, \omega)$ gives analytical expression to the weighting scheme of Blow &
Crick (1959) when applied to the results from any number of isomorphous replacements and the heavy
atom method. The various cases are set out in Table 2.
$E_j$ and $\Sigma$ may be determined with moderate accuracy by analysis of the results from a
centrosymmetric zone, in the way demonstrated by Blow & Crick (1959). In addition, a theoretical
estimate of $\Sigma$ may be made from (12). Although errors in $E_j$ and $\Sigma$ will result in
inaccurate relative weighting of the different sets of data, it is unlikely that these will have an
important effect on the phase angles.

Table 2. The form of the function $g$ in various methods
'Heavy atom' method: $g(C_H, 0, 0)$ (Sim, 1960)
Single isomorphous pair: $g(c_1, d_1, 0)$
Single isomorphous pair combined with 'heavy atom' method: $g(C, d_1, \Phi - \varphi_1)$
Multiple isomorphous replacement method: $g(C_{is}, D_{is}, \Phi_1 - \Phi_2)$
Multiple isomorphous replacement combined with 'heavy atom' method: $g(C, D_{is}, \Phi - \Phi_2)$ (Blow & Rossmann, 1961)
The function $g(x, y, \omega)$
The nature of the function
$$ g(x, y, \omega) = \frac{\int_0^{2\pi} \exp\{i\psi + x \cos\psi - y \cos 2(\psi + \omega)\}\,d\psi}
                          {\int_0^{2\pi} \exp\{x \cos\psi - y \cos 2(\psi + \omega)\}\,d\psi} \tag{15} $$
may now be considered. It can be thought of as representing the centre of gravity of a circular wire
of radius unity with a density
$$ \exp\{x \cos\psi - y \cos 2(\psi + \omega)\} . $$
Since the centre of gravity of such a wire can never be outside the wire, it follows that the
magnitude of $g$ always lies between 0 and 1, approaching unity as we approach certainty about the
true phase of a reflexion.
$g$ must always be real if the weight distribution is symmetrical about $\psi = 0$. This occurs not
only when $y = 0$ (as in the case of heavy atom technique) but also when
$|\omega| = 0, \pi/2, \pi, \ldots$. In the case of a single isomorphous pair ($J = 1$) (8) reduces to
$\xi = F_o \exp(i\varphi_1)\, g(c_1, d_1, 0)$. We know that this case results in a probability
distribution for the phase angle $\alpha$ which is symmetrical about $\varphi_1$. Thus $\omega$ may
be regarded as expressing the asymmetry of the phase probability distribution.
When $x \gg y$ then $x \cos\psi$ is the governing factor of the probability distribution. Hence we
obtain a unimodal distribution with a peak whose sharpness increases as $x$ becomes large. Conversely
when $y$ becomes $\gg x$ the $y \cos 2(\psi + \omega)$ term governs the probability distribution.
This results in a bimodal distribution with peaks tending towards $\pi/2 - \omega$,
$3\pi/2 - \omega$. If $\omega = 0$, $g \to 0$ as $y$ increases; but for other values of $\omega$
(implying an asymmetric distribution), as $y$ increases and sharpens the distribution $|g|$
increases, and its phase swivels round towards $(\pi/2 - \omega)$.
For smaller, but still considerable values of $y$, the probability distribution has two peaks, which
coalesce into one as $y$ becomes small compared to $x$. It can easily be shown that the distribution
must be unimodal when $x \geq 4y$ and $\omega = 0$.
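Equation (15) is straightforward to evaluate numerically. The sketch below uses an equally spaced sum over $\psi$, which converges rapidly for a smooth periodic integrand; the function names, the number of sample points and the test arguments are assumptions made for illustration, and this is not the procedure used to compute Table 1.

```python
import cmath
import math


def g(x, y, omega, n=720):
    """Centre of gravity of the unit 'wire' of eq. (15), by an equally spaced sum."""
    num = 0j
    den = 0.0
    for k in range(n):
        psi = 2.0 * math.pi * k / n
        density = math.exp(x * math.cos(psi) - y * math.cos(2.0 * (psi + omega)))
        num += density * cmath.exp(1j * psi)
        den += density
    return num / den


def count_peaks(x, y, omega, n=720):
    """Number of local maxima of x cos(psi) - y cos 2(psi + omega) on the sample grid."""
    expo = [x * math.cos(2.0 * math.pi * k / n)
            - y * math.cos(2.0 * (2.0 * math.pi * k / n + omega)) for k in range(n)]
    return sum(1 for k in range(n)
               if expo[k] > expo[k - 1] and expo[k] >= expo[(k + 1) % n])


g_heavy = g(1.0, 0.0, 0.0)                 # heavy-atom case: real, equal to I_1(1)/I_0(1)
g_symmetric = g(1.0, 0.8, 0.0)             # omega = 0: symmetrical, so g is real
g_skew = g(2.0, 1.5, math.radians(30.0))   # asymmetric case: complex g

peaks_unimodal = count_peaks(5.0, 1.0, 0.0)  # x >= 4y with omega = 0
peaks_bimodal = count_peaks(0.5, 2.0, 0.0)   # y large compared with x
```

The modulus of `g(...)` is the weight applied to $F_o$ and its argument is the shift of the 'best' phase away from the reference phase $\Phi_1$ (or $\Phi$ in the combined case).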
The strategy of a structure determination
Apart from its direct application to the combination of structural and isomorphous-replacement data,
the analysis given above is useful in evaluating the relative merits of different strategies in a
large structure determination.
First consider the effect of increasing the number of isomorphous compounds. This may be done by
considering the effect of adding one member to the vector summation for $C_{is}$ and $D_{is}$,
defined by (6). It may be seen from (3) and (4) that if $\alpha_0$ is a phase angle for which
$\varepsilon_j = 0$, so that
$$ 2F_o f_j \cos(\alpha_0 - \varphi_j) = F_{Hj}^2 - F_o^2 - f_j^2 , $$
then
$$ c_j = 4d_j \cos(\alpha_0 - \varphi_j) , $$
while $d_j \simeq f_j^2/4E_j^2$ is a positive number. If the phases are assumed to be random, the
problem corresponds to a random-walk problem, giving a mean square value for $D_{is}$
$$ \langle D_{is}^2 \rangle = \sum_{j=1}^{J} d_j^2 \simeq \sum_{j=1}^{J} f_j^4/(16E_j^4) . \tag{16} $$
Thus in the case where the $f_j/E_j$'s are of the same order for the different isomorphous
derivatives, the r.m.s. value of $D_{is}$ increases as $\sqrt{J}$. Note that two isomorphous
substituents $j$, $k$ at closely adjacent sites will lead to similar values of $\varphi_j$,
$\varphi_k$, making $D_{is}$ larger than it would be if the phases were random.
For a rough approximation it can be assumed that $\alpha_0$ is the 'true' phase, and has the same
value for all $j$. In the vector summation
$$ C_{is} = \sum_{j=1}^{J} 4d_j \cos(\alpha_0 - \varphi_j) \exp(i\varphi_j) $$
it is seen that terms with phase $\varphi_j$ far from $\alpha_0$ are scaled down by the factor
$\cos(\alpha_0 - \varphi_j)$, and thus $C_{is}$ tends to have the phase of $\alpha_0$. Expanding
$$ C_{is} = \sum_{j=1}^{J} 4d_j \,[\cos\alpha_0 \cos^2\varphi_j + \sin\alpha_0 \sin\varphi_j \cos\varphi_j
   + i \cos\alpha_0 \sin\varphi_j \cos\varphi_j + i \sin\alpha_0 \sin^2\varphi_j] . $$
Assuming, as before, a random distribution of $\varphi_j$, we see that the mean value of $C_{is}$ is
$$ \langle C_{is} \rangle = \sum_{j=1}^{J} 2d_j \exp(i\alpha_0) \simeq \sum_{j=1}^{J} [f_j^2/2E_j^2] \exp(i\alpha_0) . \tag{17} $$
If all $\langle f_j^2/E_j^2 \rangle$ are of the same order of magnitude, $\langle |C_{is}| \rangle$
increases as $J$ and $\langle |C_{is}| \rangle / \langle D_{is}^2 \rangle^{1/2}$ as $\sqrt{J}$. This
expresses the fact that, as the number of isomorphous compounds is increased, the distribution is
sharpened and there is a decrease of their tendency towards being bimodal.
Consider now the heavy-atom method. If a large number of atomic positions have been found in a
partially known structure, we may assume that Wilson statistics apply, and the mean square value of
$f_H$ is
$$ \langle f_H^2 \rangle = \sum_{h=1}^{H} f_h^2 , $$
while $\langle F_o^2 \rangle = \langle f_H^2 \rangle + \Sigma$. Hence the mean square value of $C_H$
is
$$ \langle C_H^2 \rangle = 4 \langle F_o^2 \rangle \langle f_H^2 \rangle / (\langle F_o^2 \rangle - \langle f_H^2 \rangle)^2 . $$
Let $\langle f_H^2 \rangle = r \langle F_o^2 \rangle$ so that we may say briefly 'a fraction $r$ of
the structure is known'. Then the r.m.s. value of $C_H$ is
$$ \langle C_H^2 \rangle^{1/2} = 2r^{1/2}/(1 - r) . \tag{18} $$
Remembering that $C$ characterizes the sharpness of the phase probability distribution and thus the
accuracy of the phase determination, we are now in a position to answer questions about the relative
power of the isomorphous-replacement method and the heavy atom method under specific conditions.
Suppose, for example, a fraction $r$ of the structure has been revealed by using $J$ isomorphous
replacements, is it more advantageous to use one more isomorphous replacement, or to use the
heavy-atom method? We can calculate the probable value of $|C|$ in each case. Let
$\langle C_J \rangle$ be the mean value of $C$ when $J$ isomorphous replacements are used. Then from
(17)
$$ \langle C_{J+1} \rangle = \sum_{j=1}^{J+1} \frac{f_j^2}{2E_j^2} \exp(i\alpha_0) \simeq \frac{J+1}{J} \langle C_J \rangle ; $$
while using (18), adding heavy-atom data gives
$$ \langle C_{J+H} \rangle^2 \simeq \langle C_J \rangle^2 + \frac{2r}{(1 - r)^2} . $$
It may be mentioned that the isomorphous-replacement method tends to increase $D_{is}$ as well as
$C$, so that a comparison based only on the value of $|C|$ tends to over-estimate the value of the
isomorphous-replacement method. Nevertheless, to make the example more definite, let us consider
typical data for the protein haemoglobin. These would be
$$ \langle f_j^2 \rangle = 100^2 , \qquad E_j = 50 . $$
Thus, from (17),
$$ \langle C_J \rangle = 2J , $$
so that
$$ \langle C_{J+1} \rangle^2 \simeq 4(J + 1)^2 , \qquad
   \langle C_{J+H} \rangle^2 \simeq 4J^2 + \frac{2r}{(1 - r)^2} . $$
The heavy-atom method will therefore be more powerful if the latter quantity is greater, namely if
$r > 0.73$ ($J = 2$), $0.77$ ($J = 3$), $0.79$ ($J = 4$). (In the case $J = 1$ it would certainly be
necessary to take account of the effect of $D_{is}$.)
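The quoted limits on $r$ can be reproduced with a few lines of arithmetic. This sketch assumes the haemoglobin figures given above ($\langle f_j^2 \rangle = 100^2$, $E_j = 50$) and takes the heavy-atom contribution to $\langle C \rangle^2$ as $2r/(1-r)^2$; the function name and the use of bisection are illustrative choices.

```python
def heavy_atom_threshold(J, mean_f=100.0, E=50.0):
    """Fraction r above which heavy-atom data outweigh a (J+1)th derivative.

    Solves ratio**2 * J**2 + 2r/(1-r)**2 = ratio**2 * (J+1)**2 for r by
    bisection, where ratio = <f^2>/(2 E^2) is the mean C per derivative (eq. 17).
    """
    ratio = mean_f**2 / (2.0 * E**2)
    gain_iso = ratio**2 * ((J + 1)**2 - J**2)  # gain in <C>^2 from one more derivative
    lo, hi = 0.0, 1.0 - 1e-12
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if 2.0 * mid / (1.0 - mid)**2 > gain_iso:  # heavy-atom gain is increasing in r
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


thresholds = {J: round(heavy_atom_threshold(J), 2) for J in (2, 3, 4)}
# thresholds -> {2: 0.73, 3: 0.77, 4: 0.79}
```

Changing `mean_f` or `E` shifts the limits appreciably, which illustrates why the comparison can give only an order of magnitude for $r$.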
These results are sensitive to the relative magnitudes of the inaccurately known $E_j$ and $\Sigma$
values. This type of calculation can therefore only give an order of magnitude for $r$.
More detailed comparisons could readily be made. From the above it can be seen how quantitative
expression can be given to the convergence of sets of isomorphous-replacement data or heavy-atom
refinements, towards an accurate set of phases. In principle it would be possible to go further and
use the actual values of $|g|$ which are the 'figures of merit' (Dickerson et al., 1961) to give the
standard error of electron density (Blow & Crick, 1959).

This study has been stimulated by a fruitful interchange of ideas with other members of this Unit.
In particular we are indebted to Dr F. H. C. Crick, who we hope will recognize his influence
throughout the work.
The tabulation of the function $g(x, y, \omega)$ was made possible by the use of the University of
Cambridge Mathematical Laboratory's electronic computer EDSAC 2.
References
BLOW, D. M. (1958). Proc. Roy. Soc. A, 247, 302.
BLOW, D. M. & CRICK, F. H. C. (1959). Acta Cryst. 12, 794.
BLOW, D. M. & ROSSMANN, M. G. (1961). Acta Cryst. 14 (In press).
DICKERSON, R. E., KENDREW, J. C. & STRANDBERG, B. E. (1961). Conference on Computing Methods and the Phase Problem. (Glasgow, 1960, London: Pergamon Press.)
KENDREW, J. C., DICKERSON, R. E., STRANDBERG, B. E., HART, R. G., DAVIES, D. R., PHILLIPS, D. C. & SHORE, V. C. (1960). Nature, Lond. 185, 422.
PERUTZ, M. F., ROSSMANN, M. G., CULLIS, A. F., MUIRHEAD, H., WILL, G. & NORTH, A. C. T. (1960). Nature, Lond. 185, 416.
SIM, G. A. (1959). Acta Cryst. 12, 813.
SIM, G. A. (1960). Acta Cryst. 13, 511.
WATSON, G. N. (1922). The Theory of Bessel Functions, p. 77. Cambridge: University Press.
WILSON, A. J. C. (1949). Acta Cryst. 2, 318.
Acta Cryst. (1961). 14, 647
Preparation and Structure of Ba5Ta4O15 and Related Compounds*
BY FRANCIS GALASSO AND LEWIS KATZ
Department of Chemistry, University of Connecticut, Storrs, Connecticut, U.S.A.
(Received 3 August 1960)
The preparation and characterization of a new ternary oxide of tantalum, Ba5Ta4O15, and isomorphous
compounds, Sr5Ta4O15 and Ba5Nb4O15, are described. The barium tantalum compound was found to belong
to the trigonal system; the axes of the primitive hexagonal cell are a = 5.79, c = 11.75 Å. Single
crystals of Ba5Ta4O15 grown in a lead II oxide flux were used in determining the structure. Anion
deficiencies have been produced in these compounds by preparing them with tetravalent niobium and
tantalum.
Introduction
In a general survey of the barium-tantalum-oxygen system, the existence of several phases has been
noted. At a ratio of barium to tantalum of one half, two different phases have been reported, one a
hexagonal phase, Ba0.44TaO2.94 (Galasso, Katz & Ward, 1958), and the other a tetragonal phase,
Ba0.5TaO3 (Galasso, Katz & Ward, 1959a) with the tetragonal bronze structure (Magnéli, 1949). When
the Ba/Ta ratio was increased to three, the phase Ba(Ba0.5Ta0.5)O2.75 was obtained with an ordered
cubic perovskite structure (Brixner, 1958; Galasso, Katz & Ward, 1959b). At an intermediate ratio,
the compound Ba5Ta4O15 has been identified. The preparation and structure of this compound and
isomorphous phases is the subject of this paper.
Experimental
Preparation of Ba5Ta4O15
The best preparation of Ba5Ta4O15 resulted from mixing barium carbonate in 2% excess of the amount
indicated in the reaction
5 BaCO3 + 2 Ta2O5 -> Ba5Ta4O15 + 5 CO2
and heating in air at 1150 °C. for 24 hr. The white product obtained gave an X-ray powder pattern
which was indexed with the aid of single crystal data on the basis of a hexagonal cell with
a = 5.79 Å and c = 11.75 Å.
The density was found pycnometrically to be 7.9 g.cm.^-3, and using the above parameters the unit
cell content weight was calculated to be 1622, as compared to 1650 for the formula Ba5Ta4O15.
Analysis gave 44.04% Ta, 42.70% Ba, as compared to the theoretical 43.85% Ta, 41.64% Ba for
Ba5Ta4O15.
Preparation of Ba5Ta4(IV)O13
The compound Ba5Ta4(IV)O13 was prepared in an evacuated sealed silica capsule using tantalum
pentoxide and tantalum metal as a source of tantalum IV, and barium oxide as the source of barium.
As is frequently the case with oxygen deficient compounds, the powder pattern of the reduced (blue)
phase could be indexed using the same parameters as for the oxidized phase.

* Part of this research was sponsored by the Office of Naval Research. Reproduction in whole or in
part is permitted for any purpose of the United States Government.