Structure determination from X-ray powder diffraction data

Alok Kumar Mukherjee
Department of Physics, Jadavpur University, Kolkata - 700032
E-mail: akm_ju@rediffmail.com

Crystal structure determination using X-ray powder diffraction is a far more difficult task than several other applications of X-ray powder diffraction, such as:
(i) Phase identification
(ii) Quantitative phase analysis
(iii) Microstructure study
(iv) Texture analysis
(v) Thin film study
(vi) Dynamic and non-ambient diffraction

Single crystal versus powder XRD

Although single crystal and powder XRD patterns contain essentially the same information, in the former case the information is distributed in three-dimensional space, whereas in the latter case the three-dimensional data are "compressed" into one dimension.

[Figure: Single crystal diffraction | Powder diffraction]

As a consequence, there is generally considerable overlap of peaks in the X-ray powder pattern, leading to ambiguities in extracting the peak positions (2θ) and intensities, I(hkl), of individual diffraction maxima.

[Figure: The powder diffraction pattern]

Why structure determination from powder XRD?

Many crystalline solids, including several organic, metal-organic and pharmaceutical compounds and nano-materials, are available only as polycrystalline powders. In such cases, X-ray powder diffraction is the only realistic option for structure elucidation.

Flow chart for structure determination from X-ray powder diffraction data

1. High quality powder XRD data collection: measurement of step-scanned intensity data
2. Unit cell determination: indexing the XRD pattern
3. Possible space group assignment: look for any obvious systematic absences
4. Structure solution: determination of approximate atomic positions
5. Structure refinement: refinement of atomic coordinates, temperature factors, occupancy factors

The whole procedure of structure determination from X-ray powder data is not automatic.
It requires significant intelligent input from the crystallographer at every step of the process.

For success in structure determination from powder diffractometry (SDPD), the first step, i.e. data collection, is crucial. The quality of data necessary depends on the type of analysis intended.
• Indexing requires accurate and precise peak positions, especially at low 2θ values (large d-spacings).
• Structure solution requires reliable intensities.
• Structure refinement requires good quality high-angle data (small d-spacings).

In addition, high-resolution data (narrow 2θ peak width) are desirable. High sample purity is required. One potential adverse effect of using very high resolution powder diffractometers is that any impurity peaks from the sample are more easily seen → problem in indexing!

Some useful data collection parameters

(i) Wavelength of X-ray (usually CuKα is used)
(ii) Monochromatization
(iii) Zero-shift alignment
(iv) Incident and diffracted beam apertures
     Divergence slit ~ 0.1 mm - 0.6 mm
     Receiving slit ~ 0.1 mm - 0.2 mm
(v) Data collection parameters (step scan mode)
     (a) Step size ~ 0.008° - 0.02°
     (b) Counting time ~ 20 - 30 s/step

Data acquisition variables can be adjusted to produce a better quality data set.

[Figure: View of the Bruker D8 Advance diffractometer with stationary X-ray source and synchronized rotations of both the detector arm and the sample holder. Labelled parts: sample holder, X-ray tube, detector, slit boxes.]

Some criteria for good quality X-ray powder data

1. Peak-to-noise ratio should be high
2. Full width at half maximum (~ 0.08° - 0.10°) → governs peak overlap
3. Number of peaks
4. Reproducibility of the powder pattern → check for preferred orientation

Number of expected diffraction peaks

The number of reflections up to the diffraction angle θ in a powder diffraction pattern is determined by the size and symmetry of the unit cell and the wavelength of the radiation used.
$N = \dfrac{32\pi V \sin^{3}\theta}{3\lambda^{3} Q}$

where V is the unit cell volume, λ is the wavelength and Q is the product of the average multiplicity of the reflections and the number of lattice points per unit cell.

For example, with λ = 1.5418 Å, V = 1500 Å³ and 2θ = 25° (θ = 12.5°):

$N = \dfrac{32\pi V \sin^{3}\theta}{3\lambda^{3} Q} \approx \dfrac{139}{Q}$

For an orthorhombic system: Q = average multiplicity × no. of lattice points per unit cell = 5 × 1 = 5, so N ≈ 139/5 ≈ 28.

Effect of counting time on statistical error during a single measurement

Counts    Counting   Registered    Spread   Error ε (%) at    Error ε (%) at
per sec   time (s)   counts (N)    (√N)     90% confidence    99% confidence
100       1          100           10       16.4              25.9
100       25         2500          50       3.3               5.2

$\varepsilon = \dfrac{Q}{\sqrt{N}} \times 100\%$, with Q = 1.64 for the 90% confidence level and Q = 2.59 for the 99% confidence level. (This Q is the confidence-level constant, not the multiplicity factor Q used above.)

Good quality XRD data is the primary requirement

• Particular problem with organic and pharmaceutical compounds: diffraction data often contain very few peaks
• Non-reproducible data due to preferred orientation

[Figure: Effect of the divergence slit width. Panels: 0.05 mm slit | 0.6 mm slit]

[Figure: Effect of the divergence slit width on peak asymmetry. Panels: 0.05 mm slit | 0.6 mm slit]

Powder pattern indexing

The basic equation used for indexing a powder diffraction pattern is

$Q_{hkl} = \dfrac{1}{d_{hkl}^{2}} = h^{2}a^{*2} + k^{2}b^{*2} + l^{2}c^{*2} + 2kl\,b^{*}c^{*}\cos\alpha^{*} + 2hl\,c^{*}a^{*}\cos\beta^{*} + 2hk\,a^{*}b^{*}\cos\gamma^{*}$

where d_hkl is the interplanar spacing corresponding to the (hkl) plane and a*, b*, c*, α*, β*, γ* are the reciprocal lattice cell parameters.

The 2θ positions of a reasonable number of diffraction maxima (say 20-30) are extracted from the observed pattern by fitting individual peaks with an appropriate profile function.

Different indexing programs:

TREOR → A semi-exhaustive trial-and-error indexing program, which is based on the permutation of Miller indices in a selected basis set of lowest Bragg angle peaks.
DICVOL → An exhaustive trial-and-error indexing program based on the variation of the lengths of cell edges and inter-axial angles over finite ranges, followed by a progressive reduction of these intervals by means of a dichotomy procedure.

ITO → A zone search indexing method with provision for reduction to the most probable unit cell. It is based on specific relations among Q_hkl values in reciprocal space.

McMaille → Based on Monte Carlo and grid search methods.

X-Cell → An indexing algorithm which uses an extinction-specific dichotomy procedure to perform an exhaustive search of parameter space and establish a complete list of all possible unit cell solutions.

Commonly used figures of merit

To assess the reliability of indexing, two figures of merit are commonly used.

The de Wolff figure of merit (M20)

$M_{20} = \dfrac{Q_{20}}{2 N_{20}\,(\Delta Q)_{\mathrm{av}}}$

where Q20 is the Q value (1/d²) of the 20th observed line, N20 is the number of different diffraction lines possible up to the 20th observed line, and (ΔQ)av is the average absolute discrepancy between the observed and calculated Q values for the 20 observed lines.

The Smith and Snyder figure of merit (F_N)

$F_{N} = \dfrac{N}{N_{\mathrm{poss}}\,(\Delta 2\theta)_{\mathrm{av}}}$

where N_poss is the number of calculated diffraction lines up to the Nth observed line and (Δ2θ)av is the average absolute discrepancy between the observed and calculated 2θ values.

The higher the accuracy of the data collected and the more complete the pattern, the larger the M20 and F_N values will be.

Comments on the value of M20

• A 'solution' with an FOM value less than 5.0 is worthless.
• Any 'solution' that leaves more than two very weak lines unexplained is not useful. However, if the FOM is greater than 10.0 it might be worthwhile to examine the input data and the 'solution' more closely.
• Even 'solutions' that index all lines with an FOM > 10.0 should not be accepted uncritically without further investigation.
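The two figures of merit defined above are simple ratios, so they are easy to check by hand or in a few lines of code. The following is a minimal sketch in Python; the function names `de_wolff_m20` and `smith_snyder_fn` and the M20 input values are hypothetical and chosen only for illustration, while the F20 inputs are the numbers quoted later in this document for a TREOR result reported as F(20) = 84 (0.003343, 72).

```python
def de_wolff_m20(q20, n20, mean_abs_dq):
    """de Wolff figure of merit: M20 = Q20 / (2 * N20 * (dQ)av)."""
    return q20 / (2.0 * n20 * mean_abs_dq)


def smith_snyder_fn(n_obs, n_poss, mean_abs_d2theta):
    """Smith-Snyder figure of merit: F_N = N / (N_poss * (d2theta)av)."""
    return n_obs / (n_poss * mean_abs_d2theta)


# Hypothetical inputs: Q20 in 1/A^2, 40 possible lines, mean |dQ| of 2.5e-5
m20 = de_wolff_m20(q20=0.05, n20=40, mean_abs_dq=2.5e-5)

# Inputs taken from the F(20) value quoted later in this document
f20 = smith_snyder_fn(n_obs=20, n_poss=72, mean_abs_d2theta=0.003343)

print(f"M20 = {m20:.1f}, F20 = {f20:.1f}")
```

With the quoted inputs the Smith-Snyder expression evaluates to roughly 83, in line with the reported value of 84. Both ratios grow as the average discrepancy shrinks and as fewer calculated lines are left unobserved, which is why they reward accurate peak positions and small, fully explained cells.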
Experience shows that M20 values greater than 15 are likely to correspond to correct solutions for laboratory X-ray powder data.

Pitfalls in indexing:

• Attempting to index poor data is a non-starter → like trying to solve a jigsaw puzzle with half the pieces missing!
• Pseudo-symmetry: certain lattice parameters have values that make the symmetry of the lattice appear higher than it really is, e.g.
  → Monoclinic angle β very close to 90°: the metric tensor of the lattice will be similar to that of an orthorhombic lattice.
  → Unit cell has monoclinic symmetry, but a ≈ c and β is close to 120°: the symmetry of the lattice is then pseudo-hexagonal.
• Instrumental errors: a satisfactory solution may not be obtained if the 2θ zero error is greater than 0.08°.
• Dominant zones: indexing a powder pattern with dominant zones (one cell axis much shorter than the other two) is often a problem. If most of the first few lines are of h0l type → the d-spacings intrinsically lack three-dimensional unit cell information. To overcome this problem, the XRD pattern should be collected from an unground powder sample.
• Impurities and other phases: which peak(s) are due to the impurity?
• Samples that change phase during data acquisition.

Indexing of XRD data collected with step size 0.02° and counting time 20 s per step results in a triclinic cell with low FOM.

Crystal system: Triclinic
a = 4.734(4) Å     α = 92.1(1)°
b = 8.248(6) Å     β = 85.8(1)°
c = 10.811(7) Å    γ = 98.0(1)°
Vol = 416.7 Å³
F20 = 23, M20 = 15

[Figure: WPPD plot. Annotation: no corresponding calculated position]

Indexing of XRD data from the same sample collected with step size 0.008° and counting time 30 s per step results in an orthorhombic cell with high FOM.

Crystal system: Orthorhombic
a = 16.666(2) Å
b = 10.548(2) Å
c = 9.476(2) Å
Vol = 1665.8 Å³
F20 = 88, M20 = 52

[Figure: WPPD plot]

Severely overlapping peaks may result in no indexing at all.

Space group assignment: the trickiest part!

The choice of space group is possibly the most difficult part to automate.
The presence or absence of screw axes or glide planes is not always obvious from indexing, since the average powder pattern is likely to contain only a few reflections of the type h00 (or 0k0, 00l) and hk0 (or h0l, 0kl etc.). The paucity of data results in real ambiguities in the choice of space group.

If there is a choice between a commonly observed space group and a rare one, based on occurrence in the Cambridge Structural Database (e.g. P2/c vs P2₁/c), then try to solve the structure in the more common space group first.

Example: A molecular structure indexed in the orthorhombic system showed Z = 8. The reflection conditions are ambiguous and indicate either Pbca or Pnma (this situation can easily arise due to overlap of key reflections in the powder pattern). Both these space groups occur frequently for molecular compounds. For Pnma, however, the presence of the mirror plane would suggest either two molecules related by mirror symmetry or two molecules per asymmetric unit with the mirror plane passing through both molecules → an unlikely situation from packing considerations. So, in this case, the first choice should be Pbca.

A better way to overcome this problem is to employ a whole pattern fitting approach, in which the diffraction profile is fitted in the absence of a structural model but in the presence of the reflection conditions of the various space groups under consideration.

In EXPO2004, for each crystal system, the probabilities of different extinction symbols are calculated. A list of possible space groups compatible with each extinction symbol is presented.

Only a successful structure solution can confirm the correctness of the assigned space group!

[Figure: Structure was solved in P2₁/a, which had a lower FOM]

Structure solution

Traditional methods (Patterson, direct methods)

Using the extracted intensities, the structure solution procedures are similar to those in the single crystal case.
Direct space approach (Monte Carlo, simulated annealing, grid search etc.)

All direct space approaches avoid the problematic step of extracting I_hkl from the experimental powder pattern.
• Trial structural models are generated in direct space.
• The suitability of a model is assessed by comparing the powder pattern calculated from the model with the observed powder pattern.

Monte Carlo approach

In the Monte Carlo approach a sequence of structures is generated as potential structure solutions. The first structure (x_1) is generally chosen as a random position of the structural fragment in the unit cell. Starting from the structure x_i, the structural fragment is subjected to a random displacement to generate a trial structure (x_trial). The trial structure is then accepted or rejected by considering the difference z between the Rwp values corresponding to the structures x_trial and x_i:

$z = R_{wp}(x_{\mathrm{trial}}) - R_{wp}(x_i)$

If z ≤ 0, the trial structure is automatically accepted, whereas if z > 0, the trial structure is accepted with probability exp(-z/s) and rejected with probability [1 - exp(-z/s)], where s is a scaling factor.

After a sufficiently extensive range of structural space has been explored, the structure corresponding to the lowest Rwp is taken as the starting model for refinement.

The main factor limiting the efficiency of a Monte Carlo calculation is the number of structural degrees of freedom varied during the calculation. Thus, in the Monte Carlo approach, the number of degrees of freedom in the structural fragment is a more important consideration than the number of atoms in the asymmetric unit.

This approach is quite popular for organic and pharmaceutical compounds, where the molecular composition is known a priori.

Rietveld method

In the Rietveld method, the integrated intensities of the reflections (I_hkl) are calculated from the atomic parameters of the model.
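Looking back at the Monte Carlo procedure described above, its accept/reject rule can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the names `monte_carlo_search` and `toy_rwp` and all numerical settings are hypothetical, and `toy_rwp` is a simple analytic stand-in for a real Rwp evaluation, which would require calculating a full powder pattern from each trial structure.

```python
import math
import random


def monte_carlo_search(rwp, x_start, step, s, n_moves, seed=0):
    """Explore parameter space with the acceptance rule exp(-z/s).

    rwp     -- callable returning the agreement factor for parameters x
    x_start -- starting parameters (e.g. fragment position, orientation)
    step    -- maximum random displacement applied to each parameter
    s       -- scaling factor in the acceptance probability
    """
    rng = random.Random(seed)
    x_i = list(x_start)
    r_i = rwp(x_i)
    best_x, best_r = list(x_i), r_i
    for _ in range(n_moves):
        # Random displacement of the current structure gives the trial structure
        x_trial = [v + rng.uniform(-step, step) for v in x_i]
        r_trial = rwp(x_trial)
        z = r_trial - r_i
        # z <= 0: always accept; z > 0: accept with probability exp(-z/s)
        if z <= 0 or rng.random() < math.exp(-z / s):
            x_i, r_i = x_trial, r_trial
            if r_i < best_r:
                best_x, best_r = list(x_i), r_i
    # The lowest-Rwp structure visited is the candidate starting model
    return best_x, best_r


def toy_rwp(x):
    """Hypothetical stand-in for Rwp: minimum value 0.05 at (0.3, 0.7)."""
    return 0.05 + (x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2


best_x, best_r = monte_carlo_search(
    toy_rwp, x_start=[0.9, 0.1], step=0.05, s=0.01, n_moves=5000
)
print(best_x, best_r)
```

The cost of each move is dominated by the Rwp evaluation, and the number of moves needed grows quickly with the length of the parameter list, which is the sense in which the number of degrees of freedom limits efficiency.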
The calculated intensity at the ith step then becomes

$y(2\theta_i)_{\mathrm{cal}} = s \sum_{hkl} m_{hkl}\,(Lp)_{hkl}\,|F_{hkl}|^{2}\,P_{hkl}\,\Phi_{hkl}(2\theta_i - 2\theta_{hkl}) + b(2\theta_i)$

where s is the scale factor, m_hkl is the reflection multiplicity, (Lp)_hkl is the Lorentz-polarization factor, P_hkl is the preferred orientation function, Φ_hkl is the reflection profile function, F_hkl is the structure factor of the reflection and b(2θ_i) is the background intensity at the ith step.

Criteria of fit

The agreement between the observed and calculated powder patterns is judged by several indicators. The profile R-factors are:

R-pattern (profile):
$R_{p} = \dfrac{\sum_i |y_i(\mathrm{obs}) - y_i(\mathrm{calc})|}{\sum_i |y_i(\mathrm{obs})|}$

R-weighted pattern (profile):
$R_{wp} = \left[\dfrac{\sum_i w_i\,\{y_i(\mathrm{obs}) - y_i(\mathrm{calc})\}^{2}}{\sum_i w_i\,\{y_i(\mathrm{obs})\}^{2}}\right]^{1/2}$

R-structure factor:
$R_{F} = \dfrac{\sum_K |(I_K(\mathrm{obs}))^{1/2} - (I_K(\mathrm{calc}))^{1/2}|}{\sum_K (I_K(\mathrm{obs}))^{1/2}}$

R-Bragg factor:
$R_{B} = \dfrac{\sum_K |I_K(\mathrm{obs}) - I_K(\mathrm{calc})|}{\sum_K |I_K(\mathrm{obs})|}$

R-expected:
$R_{E} = \left[\dfrac{N - P}{\sum_i w_i\,\{y_i(\mathrm{obs})\}^{2}}\right]^{1/2}$

where N and P are the number of profile points and refined parameters, respectively.

Goodness-of-fit:
$\chi^{2} = (R_{wp} / R_{E})^{2}$

[Figure: Steps in structure determination from powder data]

Examples from our recent work

Example-1: 5,5′-disubstituted hydantoins

Imidazolidine-2,4-dione, or hydantoin, is a five-membered heterocyclic ring containing a reactive cyclic urea nucleus. Hydantoins can serve as useful intermediates in the synthesis of amino acids.
Dipropyl glycine hydantoin

Indexing with TREOR showed an orthorhombic unit cell with a = 7.167(1), b = 13.964(1), c = 10.957(3) Å [M(20) = 49, F(20) = 84 (0.003343, 72)].

Statistical analysis using the FINDSPACE module of EXPO2004 indicated Pnma as the most probable space group (Z′ = 0.5).

[Figure: ORTEP view. The hydantoin moiety lies on the mirror plane and the side chains are above and below the plane.]

[Figure: Final Rietveld plot]

[Figure: Formation of the $C_1^1(4)C_1^1(4)[R_2^2(8)]$ network]

Crystal data

Chemical formula                  C9H16N2O2
Mr                                184.24
Temperature (K)                   293(2)
Crystal system, space group, Z    Orthorhombic, Pnma, 4
a, b, c (Å)                       7.1577(5), 13.9721(9), 11.0211(10)
Volume (Å³)                       1102.2(2)
D (Mg/m³)                         1.110
Wavelength (Å)                    1.54056
Diffractometer                    Bruker D8 Advance
Rp                                0.0477
Rwp                               0.0687
χ²                                1.671

Cyclohexanespiro-5′-hydantoin

Synthesized as a microcrystalline powder of cyclohexanespiro-5′-hydantoin monohydrate (I). Compound (I) was slowly heated to 115 °C and kept at that temperature for 1 h. The sample was then cooled to room temperature (25 °C) to obtain anhydrous cyclohexanespiro-5′-hydantoin (II).

[Figure: Comparison of observed powder profiles of the anhydrous (phase II) and hydrated (phase I) forms. Axes: normalized intensity vs 2θ (°)]

[Figure: Final Rietveld plot of hydrated form I. Axes: intensity (counts) vs 2θ (°)]

[Figure: Final Rietveld plot of anhydrous form II. Axes: intensity (counts) vs 2θ (°)]

[Figure: Comparison of the packing of the hydrated and anhydrous phases, showing the R₂²(8) ring formed by N-H…O hydrogen bonds]

Crystal data and Rietveld refinement parameters for hydrated and anhydrous phases

                        (I) Hydrated          (II) Anhydrous
Formula weight          186.21                168.20
Temperature             293(2) K              293(2) K
Crystal system          Orthorhombic          Monoclinic
Space group             Pna2₁                 P2₁/c
Unit cell dimensions    a = 16.887(2) Å       a = 12.7468(4) Å
                        b = 9.245(2) Å        b = 7.0777(2) Å
                        c = 6.267(1) Å        c = 10.3348(3) Å
                                              β = 110.891(2)°
Volume                  978.3(4) Å³           871.09(4) Å³
Z                       4                     4
Density (calculated)    1.264 g/cm³           1.283 g/cm³
Rp                      0.0478                0.0399
Rwp                     0.0778                0.0557
R(F²)                   0.1029                0.0815
χ²                      4.786                 0.9655

Example-2: Two nimesulide derivatives

[Figure: Compounds 1 and 2]

Nimesulide,
N-(4-nitro-2-phenoxyphenyl)methanesulfonamide, is an effective non-steroidal anti-inflammatory drug (NSAID), which can selectively inhibit the cyclooxygenase-2 (COX-2) enzyme.

To the best of our knowledge, this is the first example of structure determination of nimesulide analogues from X-ray powder diffraction data.

[Figure: Rietveld plot and ORTEP view of 1]

[Figure: Rietveld plot and ORTEP view of 2]

Example-3: Three o-hydroxyacetophenone derivatives with varied degrees of flexibility

[Figure: Indexing and space group determination for compounds 1, 2 and 3]

[Figure: Final Rietveld plot and ORTEP diagram (one panel for each of compounds 1, 2 and 3)]

Crystal data

                                  (1)                (2)                (3)
Chemical formula                  C10H12O3           C17H18O3           C14H18O5
Mr                                180.20             270.33             266.29
Temperature (K)                   295(2)             295(2)             295(2)
Crystal system, space group, Z    Monoclinic,        Monoclinic,        Monoclinic,
                                  P2₁/c, 4           P2₁/c, 4           P2₁/c, 4
a (Å)                             9.8702(15)         12.207(6)          8.0139(5)
b (Å)                             13.7735(19)        16.487(8)          7.2616(3)
c (Å)                             8.1460(12)         7.495(4)           23.9540(18)
β (°)                             123.368(5)         102.952(5)         95.738(5)
Volume (Å³)                       924.9(3)           1470.0(13)         1386.99(16)
D (Mg/m³)                         1.294              1.221              1.275
Wavelength (Å)                    1.5406             1.5406             1.5406
Data / restraints / parameters    4750 / 78 / 123    4790 / 122 / 173   4800 / 126 / 152
Rp                                0.0465             0.0579             0.0496
Rwp                               0.0659             0.0855             0.0751
R(F²)                             0.0978             0.1208             0.1208
χ²                                9.68               6.66               3.35

Example-4: 3-phenylpropionic acid derivatives

[Figure: Final Rietveld plot and ORTEP diagram, compound 2]

[Figure: Final Rietveld plot and ORTEP diagram, compound 3]

[Figure: Final Rietveld plot and ORTEP diagram, compound 4]

Crystal data

                       C10H12O3 (2)     C10H12O3 (3)     C10H12O3 (4)
Molecular weight       180.2            180.2            180.2
Temperature (K)        293              293              293
Crystal system         Monoclinic       Monoclinic       Monoclinic
Space group            P2₁/n            P2₁/c            P2₁/n
a (Å)                  22.8913(14)      7.8243(4)        12.2392(5)
b (Å)                  4.8938(27)       19.6301(7)       7.6727(2)
c (Å)                  8.2326(6)        6.7344(4)        11.1229(2)
β (°)                  99.227(3)        114.385(4)       114.168(3)
Volume (Å³)            910.33(14)       942.08(6)        952.97(3)
Z                      4                4                4
Dcalc (g cm⁻³)         1.315            1.27             1.256
Wavelength (Å)         1.5418           1.5418           1.5418
2θ interval (°)        5-100            5-100            5-100
No. of parameters      64               55               58
No. of data points     4624             4780             4780
No. of restraints      51               47               46
Rp                     0.0527           0.0541           0.0476
Rwp                    0.0748           0.078            0.0662
R(F²)                  0.1425           0.1009           0.0669
χ²                     5.514            3.204            1.416

Example-5: 3-phenylpropionic acid and its derivatives with Z′ = 2

[Figure: Final Rietveld plot and ORTEP diagram (one panel for each of compounds 1, 5 and 6)]

Crystal data

                                   C9H10O2 (1)      C10H12O2 (5)     C10H12O3 (6)
Molecular weight                   150.18           164.20           180.20
Temperature (K)                    293              293              293
Wavelength (Å)                     1.5418           1.5418           1.5418
Crystal system                     Monoclinic       Monoclinic       Monoclinic
Space group                        P2₁/a            P2₁/n            P2₁/a
a (Å)                              31.6676(22)      17.3804(12)      13.8863(14)
b (Å)                              9.8510(6)        6.1389(5)        10.6470(13)
c (Å)                              5.4829(4)        16.9998(12)      13.9018(17)
β (°)                              98.776(5)        91.025(4)        114.787(3)
Volume (Å³)                        1690.4(3)        1813.5(4)        1866.0(6)
Z, Z′                              8, 2             8, 2             8, 2
Dcalc                              1.180            1.203            1.282
2θ interval (°)                    5-65             5-100            5-100
Step size (°), counting time (s)   0.02, 10         0.02, 10         0.02, 10
No. of data points                 3014             4774             4774
No. of parameters                  116              121              126
No. of background points           20               10               10
Rp                                 0.0346           0.0501           0.0609
Rwp                                0.0489           0.0715           0.0853
R(F²)                              0.0788           0.0729           0.1435
χ²                                 4.285            2.220            1.742

Example-6: Ramipril-tris(hydroxymethyl)aminomethane

Ramipril, an angiotensin-converting enzyme (ACE) inhibitor, belongs to a family of drugs used to treat cardiovascular problems.

In the absence of suitable single crystals of the ramipril-tris co-crystal, structure solution was achieved from laboratory X-ray powder diffraction data.

Indexing of the powder pattern showed a monoclinic system with a = 24.334 Å, b = 6.460 Å, c = 9.532 Å, β = 96.95°, V = 1474.5 Å³. The space group indicated was P2₁.

After several unsuccessful attempts, the crystal structure of ramipril-tris was finally solved in direct space using the program FOX.

[Figure: View of the asymmetric unit of the ramipril-tris co-crystal]

[Figure: Final Rietveld plot of ramipril-tris]

Crystal data and Rietveld refinement parameters of ramipril-tris

Empirical formula                C23H32N2O5 · C4H11NO3
Formula weight                   537.65
Temperature                      293(2) K
Wavelength                       1.5418 Å
Crystal system                   Monoclinic
Space group                      P2₁
Unit cell dimensions             a = 24.382(2) Å, b = 6.477(1) Å, c = 9.554(1) Å; β = 96.92(1)°
Volume                           1497.7(4) Å³
Z                                2
Density (calculated)             1.214 g/cm³
2θ range for data collection     6 to 60°
Step size                        0.01°
Counting time per step           10 s
No. of profile data steps        5726
No. of variable parameters       218
No. of background points         10
Rp                               0.0520
Rwp                              0.0680
R(F²)                            0.1735
χ²                               24.66

References:

1. A. Bhattacharya, S. Ghosh, K. Kankanala, V. R. Reddy, K. Mukkanti, S. Pal & A. K. Mukherjee. Chem. Phys. Lett. (2010), 493, 151-157.
2. A. Bhattacharya, K. Kankanala, S. Pal & A. K. Mukherjee. J. Mol. Struct. (2010), 975, 40-46.
3. B. Chattopadhyay, A. K. Mukherjee, N. Narendra, H. P. Hemantha, V. V. Sureshbabu, M. Helliwell & M. Mukherjee. Cryst. Growth & Des. (2010), 10, 4476-4484.
4. U. Das, B. Chattopadhyay, M. Mukherjee & A. K. Mukherjee. Chem. Phys. Lett. (2011), 501, 351-357.
5. B. Chattopadhyay, M. Mukherjee, Kantharaju, V. V. Sureshbabu & A. K. Mukherjee. Z. Krist. (2008), 223, 591-597.
6. B. Chattopadhyay, S. Ghosh, S. Mondal, M. Mukherjee & A. K. Mukherjee. CrystEngComm (2011), DOI: 10.1039/C1CE05920C.
7. U. Das, B. Chattopadhyay, M. Mukherjee & A. K. Mukherjee. Cryst. Growth & Des. (2011), DOI: 10.1021/cg201290g.

Acknowledgements

Prof. Monika Mukherjee
Dr. Soumen Ghosh
Dr. Basab Chattopadhyay
Dipak K. Hazra
Abir Bhattacharya
Uday Das