New versions of the Maximum Entropy program MEED for X

advertisement
Enhanced versions of the Maximum Entropy program MEED for xray and neutron diffraction
K. Burger
Institut für Kristallographie der Universität
Charlottenstraße 33
D – 72070 Tübingen
Germany
e-mail: karsten.burger@uni-tuebingen.de
1
Abstract
Three new „Tübingen“ versions of the japanese Maximum Entropy method (MEM) program
MEED (Kumazawa, Kubota, Takata & Sakata (1995), J. Apll. Cryst. 26, 453-457) are
presented. Changes to the original code include: modernization of the programming style,
automatization of the constraint divergence error handling, and addition of a Fourier
transformation program capable of reading the MEM-input files. The new programs are
named MEED, MEND and MEEDCAB: 1. With MEED strictly positive scattering densities
can be calculated (e.g. electron densities from x-ray data); 2.
with MEND positive or
negative densities using neutron scattering data. 3. MEEDCAB is an extended version of
MEED incorporation a new constraint function for acentric reflections with ambiguous phase,
which are obtained from certain anomalous scattering experiments, e.g. powder diffraction
using the absorption edge of one type of resonantly scattering atoms.
The programs are written in FORTRAN77 and are available with complete source code at
internet address http://www.uni-tuebingen.de/uni/pki . The publication of these programs is
intended to help to overcome the problem of limited availability of MEM programs –
currently the original MEED program is available via internet from Japan, but not MEND.
1. Introduction
The Maximum-Entropy method has been used in many fields of crystallography, as recently
reviewed by Gilmore (1996). It is a non-linear calculation that has its roots in information and
probability theory, and is originally designed to reconstruct the most probable and least biased
probability distribution in an underdetermined situation. In crystallography it is often used to
2
calculate electron densities which are strictly positive, using single crystal or powder
diffraction data. Usually a MEM-density is less affected by missing reflections and series
termination errors as compared with an ordinary Fourier density. Some interesting
crystallographic applications published are: the case of poor resolution in one given direction
(Papoular, Zheludev, Ressouche & Schweizer, 1995); thermal vibrations from single-crystal
neutron diffraction data (Takata, Sakata, Kumazawa, Larsen & Iversen, 1994); model-free
search for extra-framework cations in zeolites using powder diffraction (Papoular & Cox,
1995); extraction of strictly positive integrated intensities from strongly overlapping
reflections in powder patterns (Sivia & David, 1994; however the availability of the program
is very limited); quasicrystals (Haibach & Steurer, 1995); disorder in crystals (Papoular,
Prandl & Schiebel, 1992); the problem of overlapping reflections in Laue diffraction patterns
(Popov & Bartunik, 1996); magnetisation densities from polarized neutron diffraction data
(Schleger, Puig-Molina, Ressouche, Rutty & Schweizer, 1997).
We used it mainly with data obtained from resonant (anomalous) scattering powder
diffraction experiments, where Patterson, difference Patterson and partial Patterson densities
can be calculated, as well as phases of structure factors of the unique reflections in the powder
diagram (Burger, 1997; Burger & Prandl, 1997; Burger, Cox, Papoular & Prandl, 1998). With
these densities an ab initio structure solution is possible.
Since not all phases can be determined, and many reflections usually overlap in the powder
diagram, the dataset used to calculate an electron density is systematically incomplete, and
much information is lost compared with the single crystal case. A valuable feature of the
MEM is, that all remaining information obtained from such a measurement can be used in the
calculation of the charge density, i.e. phased, unphased, ambiguously phased and overlapping
reflections, while in a Fourier calculation only the phased reflections can be used, which leads
often to severe distortions of the calculated Fourier-density.
3
2. The MEM-Algorithm used in MEED
The programs described here are based on the original code of the program MEED
(Kumazawa, Kubota, Takata & Sakata, 1993). The algorithm used there has been described
by Sakata & Toraya (1990) and Sakata, Mori, Kumazawa, Takata & Toraya (1990), and its
`single-pixel-approximation´ has been further discussed by Kumazawa, Takata & Sakata
(1995). The basis is an iterative algorithm of Collins (1982). The unit cell of a crystal is
divided in pixels of equal size, and the positive charge density is used in a normalized form.
This discrete density is then used as if it were a probability distribution of independent events,
so that the Maximum-Entropy principle of Jaynes can be applied to it (Skilling 1989; Buck &
Macauley, 1991): the most probable distribution fitting the known data is the one with the
maximum value of the „entropy“ H 

  ln 
with  the normalized pixel density value,
pixels
and  a prior density incorporating prior knowledge. This number has to be maximized
subject to the constraint, that the measured data have to be reproduced.
In the current algorithm, the measured data are included via a 2 –like sum
C
1
N
| F
c
j
 F jc | 2 /  2j , with the calculated and observed value of the structure factor F cj
and F oj , respectively, the number of observations N, and the estimated error  (| F o |) of an
observation. The key feature of Collins´ algorithm is an approximation: new values of the
density value l of every pixel l in the cell (asymmetric unit) are calculated iteratively,
4

 C (  (n), F o ,  ( F o )) 
 l (n  1)  l (n)  exp  

l


(1)
where  l (n  1) is the new density value, and  l (n) the value of the cycle before.
Independent
pixels are assumed, and relatively small changes of  . Without prior
knowledge, the calculation is usually started from the uniform distribution  l  const. In the
current algorithm, several different constraint functions C are implemented:

F-constraints: for the
NF
reflections with completely known structure factor
F | F | exp( i) , i.e. with known phase, C F 

NF
| F
c
j
 F jo | 2 /  2j
(2)
j
I-constraints: for the N I reflections where only the intensity | F | 2 is known, i.e. with
unknown phase, C I 

1
NF
1
NI
NI
|F
c
j
2
|  | F jo | /  2j
(3)
j
G-constraints: for closely overlapping reflections, which cannot be resolved in a powder
diagram, only their intensity sum
m
i
| Fi | 2 can be measured ( mi multiplicity). In this
reflect.
case the `mean structure factor of the group j´ Gj is calculated, G j 
m | F
m
i
i
|2
i
, and
i
used in the constraint function CG 

1
NG
NG
 (G
c
j
 G oj ) 2 /  2j .
(4)
j
New A-constraints: in our enhanced version of MEED1, MEEDCAB,
there is the
possibility to include acentric reflections with ambiguous phase       , where 
is obtained from the measured intensities | F ( ) | 2 of a certain unique reflection with the
wavelength  near to and far from the absorption edge of a single chemical element
contained in the sample, and 
is the phase of the partial structure factor
5
F0  | F0 | exp( i ) of this element.  is known at this stage of the procedure, since the
partial structure is determined e.g. from a partial Patterson or a difference Patterson
density, which are much easier to interprete than a usual Patterson density (from onewavelength data only). A complete description can be found in Burger (1997) and Burger
& Prandl (1998). Such an ambiguity occurs with powder data, since no Bijvoet
differences can be measured, and with single crystal SIR data. With the quantities
A' | F | cos(   ) and | B' || F |  | sin(    ) | known, the constraint function is
CA 
1
NA
 ( A'
NA
c
j
 A' oj ) 2 /  A2 ' j  (| B' cj |  | B' oj |) 2 /  B2 ' j

(5)
j
In the current algorithm the different constraint functions are simply summed up and then
used in eq. (1). The iteration is stopped, when C  1 is reached. The additional constraint of
normalization of the density is fulfilled by doing it after each cycle. In the `Tübingen´ version
of MEED and MEND, the restriction of relatively small changes of  used in the derivation
of Collins is taken into account by limiting the maximum change of C to an ad hoc value of
ten percent. If the change of C is larger in a certain cycle, then the value of  is halfed, and
the calculation redone. In a mathematical sense only the constraint function C F is convex, the
others are concave and have no unique solution (for a mathematical treatment see e.g.
Wilkins, Varghese & Lehmann, 1983; Wu 1997). However they contain very important
information and should not be omitted. On the other hand, the simple linear Fourier
transformation can only use the F-constraints.
In principle one should introduce an independent `Lagrangian factor‘ i for each constraint,
and include all prior knowledge as well as each observation as extra constraints. The ´weak
constraints´ (Sato, 1992) used instead in the current algorithm are surely a compromise with
1
The original code of MEED (Kumazawa et al. 1993) is available at address http://www.mcr.nuap.nagoyau.ac.jp/mem .
6
ease of implementation and computational costs. One could think of additional constraints,
e.g. insuring smoothness of the density, and the correct Gaussian distribution of
( F c  F o ) /  around the observations F o . These considerations make plausible, why some
authors have reported of problems, when very weak features in charge densities are studied,
and proposed extensions to the current algorithm (Jauch & Palmer, 1993; de Vries, Briels &
Feil, 1994; Takata & Sakata, 1996; Yamamoto, Takahashi, Ohshima, Okamura, & Yukino,
1996; Brummerstedt-Iversen, Ledet-Jensen & Danielsen, 1997).
The algorithm of the neutron version MEND has been published by Sakata, Uno, Takata &
Howard (1993). Since probabilities are always positive, and also negative scattering densities
have to be modeled, two positive normalized densities   and   are refined in parallel, of
which   describes the negative fraction of the total scattering density. In other aspects the
two algorithms are the same. The key equation in the MEM-process (1) is then simply
replaced by

 l (n  1)  l (n)  exp  


 l (n  1)  l (n)  exp  

C 
l 
(6)
C 
.
l 
(7)




and the effective scattering density is  l   bi  l   bi  l with the given sums of
positive and negative scattering lengths of all atoms in the unit cell.
3. Practical aspects
The programs read a MEM-input file with the crystal parameters and reflection list, and
produce an asymmetric unit density file and a documentation file. If a starting density is
7
given, it is read before the first cycle. The asymmetric unit pixel coordinates are determined
from a general algorithm by testing each pixel in the whole unit cell. The space group is input
via its number and setting number as listed in the International Tables for Crystallography
(1983).
For Patterson densities, the corresponding space group without translational components has
to be used (e.g. crystal: Pnma  Patterson density: Pmmm). For MEED and MEEDCAB, the
unit-cell charge sum F (000) is input, while for MEND the scattering length sums of both the
positive and negative part are input.
The programs run easily on modern desktop computers or PCs, however the calculation times
and computer memory needed may become a problem with low-symmetry crystals, high pixel
resolution, or with many reflections (say, more than 1000). The memory requirement is
dominated by two tables of cos- and sin-values (single-precision variables), for each single
pixel and each reflection. In the case of a centrosymmetric sample, only the cos-table is
needed. Because of this, the programs MEED and MEND are used for a centrosymmetric
sample, MEEDC and MENDC for a non-centrosymmetric sample. MEEDCAB (with the new
A-constraints) is for non-centrosymmetrical samples. At the start of the main program
FORTRAN source code, some parameters define the maximum pixel resolution along a cellaxis, maximum number of reflections, and maximum number of pixels in the asymmetric unit
– these parameters should be adjusted and the program newly compiled in exeptional cases.
The program FOURIER can read MEED and MEEDCAB files for all space groups and
calculates the ordinary Fourier transformation. It can however only use the F-constraints
information. There is a special version FOURIERN to do the Fourier-transformation using a
MEND neutron case input file. As a result of the calculation, an asymmetric unit density file
can be created similar to the MEM density file, or arbitrary sections and projections on these
planes. The auxiliary program MECO reads the asymmetric unit density file from either a
8
MEM program or FOURIER, reconstructs the density of the whole unit cell, and allows to
extract sections and projections from it, with plane normals parallel to a cell edge or with
arbitrary direction. It creates ASCII files, which the user is supposed to view with his own
graphics program.
If the parameter  in eq. (1) is chosen too small, the calculation converges too slowly, if it is
too large, the constraint sum C changes too quickly, or produces a divergence error, when C
gets larger instead of smaller. The calculation of this cycle is then automatically repeated with
 set to half its value, until the proper convergence is achieved again. If the last cycle was
successful, then  is automatically increased by a small amount of
1.5 %. With this
procedure the program can run in a batch queue without inspection by the user, avoiding both
too small and too large values of  (idea of Th. Haibach, ETH Zürich).
In cases with a small fraction of the reflections known as unique F-constraints it may happen,
that the algorithm does not converge to the target value C  1 . This may also be the case if a
Patterson density2 is calculated, which has an extremely strong peak at the origin, so that the
final density is extremely far from the uniform starting density. In the current version, the
atomic density peaks have often shapes, for which (instead of the expected physical nearlyGaussian shape) the center is somewhat exaggerated, and the diameter of the atom appears to
be smaller than usual. This underlines the qualitative nature of the density calculated.
4. Conclusion
Although the present algorithm can be developed further by incorporation of more prior
knowledge in the form of additional constraints, it has also in its present form very impressive
2
For a Patterson or difference Patterson density, we use only the unique reflections with their intensities | F |
and omit overlapping reflection groups. In practical use, instead of the ordinary F-constraints, where A and B of
2
the structure factor
F  A  iB are input, now at the place of A the intensity | F |2 is input, and B is set to
9
features compared with the ordinary Fourier transform: with x-ray data or in cases where it is
known from the physics of an experiment that the resulting scattering densities have to be
positive, this is guaranteed by the use of the program MEED. The Maximum-Entropy
algorithm is less influenced by missing reflections in an incomplete dataset. Important is the
possibility to use different informations together in a single calculation of a scattering density:
a first set of
reflections with known structure factor F | F | exp( i) ; a second set of
reflections with the phase determined with a remaining ambiguity,
      ; a third set
of the reflections with only the intensity | F | 2 known; and a fourth set
of groups of
overlapping reflections in a powder diffraction pattern, where only the sum of intensities
m
i
| Fi | 2 is known for each group. The ordinary Fourier transformation can only use the
first set, but has to omit the others, while a MEM calculation can now use all four sets in the
form of constraint functions of different type, which contribute to the total constraint sum C.
These additional informations used improve the calculated density considerable.
New versions of the original MEED program have been presented, which have been improved
in several ways: MEED for strictly positive densities, MEND for neutron data, where also
negative scattering densities may occur, and MEEDCAB with the new „ambiguous phase“ Aconstraints implemented. These programs run on common desktop computers, are freely
available, and offer important help to crystallographers in the ab initio structure
determination, especially if only powder data is available.
Financial support by the German ministry for research (BMBF, project number 05 647 VTA)
is gratefully acknowledged.
zero. Currently here the G-constraint type cannot be used, since otherwise a special version of the G-constraint
10
References
Brummerstedt-Iversen, Bo, Ledet-Jensen, J. & Danielsen, J. (1997).
Acta. Cryst. A53, 376--387.
Buck, B. & Macauley, V.A., Maximum Entropy in Action. Clarendon, Oxford
1991.
Burger, K. (1997).
PhD thesis, Universit\"at T\"ubingen. Published by Shaker, Aachen, Germany.
Burger, K., Cox, D.E., Papoular, R. & Prandl, W. (1998). Accepted by J.
Appl. Cryst.
Burger, K., Prandl, W.
& Doyle, S. (1997). Z. Kristall. 212, 493--505.
Burger, K. & Prandl, W. (1998). Submitted to
Collins, D. (1982).
Acta. Cryst. A.
Nature 298, 49-51.
Cox, D.E. (1992). In Synchrotron radiation crystallography,
Ph. Coppens, editor, chapter 6, p. 105--138. Academic Press, London.
Cox, D.E. & Papoular, R. (1996).
Gilmore, Chr. J. (1996).
Mat. Sci. Forum 228--231, 233--238.
Acta. Cryst. A52, 561--589.
International Tables for Crystallography, Vol. A (1993). Th. Hahn, editor.
Reidel, Dordrecht: Holland.
Kumazawa, S., Kubota, Y., Takata, M.
J. Appl. Cryst. 26, 453--457.
& Sakata, M. (1993).
Kumazawa, S., Takata, M. & Sakata, M. (1995). Acta. Cryst. A51, 47-53.
Papoular, R. & Cox, D. E. (1995). Europhys. Lett. 32,
337--342.
Papoular, R. & Delapalme, A. (1994). Phys. Rev. Lett. 72,
1486--9.
Sakata, M. & Toraya, H. (1990). Acta. Cryst. A46, 263-270.
Sakata, M., Mori, R., Kumazawa, S., Takata, M. & Toraya, H. (1990). J.
Appl. Cryst. 23, 526-534.
Sakata, M., Uno, T., Takata, M. & Howard, Chr.J. (1993),
J. Appl. Cryst. 26, 159-165.
function had to be implemented.
11
Sato, T. (1992). Acta. Cryst. A48, 842-850.
Schleger, P, Puig-Molina, A., Ressouche, E., Rutty, O. & Schweizer, J.
(1997). Acta. Cryst. A53, 426-435.
Takata, M., Sakata, M., Kumazawa, S., Larsen, F. & Iversen, B. (1994).
Acta. Cryst. A50, 330.
Skilling, J. (editor),
Academic, London 1989.
Maximum
Entropy
and
Bayesian
methods.
Kluwer
Wu, N., The Maximum Entropy Method. Springer, Berlin 1997.
Yamamoto, K., Takahashi, Y., Ohshima,
(1996), Acta. Cryst. A52, 606-613.
K.,
Okamura,
F.P.
&
Yukino,
K.
12
Download