Journal of Computational Chemistry, 2013

Continuous symmetry measures for complex symmetry group

Chaim Dryzun*
Department of Natural Sciences, The Open University of Israel, Raanana, 43107, Israel.
* Corresponding author: chdnew@yahoo.com

Supplementary Information

1. The Continuous Symmetry Measures (CSM), Continuous Chirality Measure (CCM) and Continuous Shape Measure (CShM)

This article focuses on the Continuous Symmetry Measure (CSM), which was developed by Zabrodsky, Peleg and Avnir more than two decades ago 1. Over the years, several related methodologies were developed from the original idea behind the CSM: the Continuous Chirality Measure (CCM) 2 and the Continuous Shape Measures (CShM) 3-4. The differences between the methodologies are subtle, so we dedicate this section to explaining the similarities and differences between the main methodologies.

All of the above methods are based on distance functions that measure the normalized square of the geometric distance between the original structure and a reference structure. In all cases the outcome is a number in the range 0 to 100, where a value of zero indicates that the original structure and the reference structure are identical, and therefore that the original structure has a certain topological characteristic (symmetry, achirality or a specific shape). The main difference between the methods lies in the reference structure and the way it is constructed.

In the original CSM publication 1, the reference structure is the closest G-symmetrical structure (the closest structure which belongs to a specific symmetry point group G), which is not known a priori. This structure can be found using the Folding-Unfolding algorithm 1,5 or the average operation algorithm 6-7, which yields a formula that can be solved analytically in some cases 7. The CShM method compares the original structure with a predefined shape or structure using the Kabsch algorithm 3,8.
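As an illustration of such a comparison, the following is a minimal sketch of a Kabsch-based shape measure. It is not the implementation used by the SHAPE program. It assumes that the vertices of the structure and of the reference shape are already paired row by row, that only proper rotations are allowed, that the reference may be scaled freely, and that the squared distance is normalized by the squared distances of the structure's vertices from its centroid. The names shape_measure, tetra, moved and distorted are hypothetical and serve only this example.

```python
import numpy as np

def shape_measure(structure, reference):
    """Sketch: 100 * min |Q - s R P|^2 / |Q|^2 over rotation R and scale s.

    Assumes row k of `structure` is paired with row k of `reference`.
    """
    q = np.asarray(structure, dtype=float)
    p = np.asarray(reference, dtype=float)
    q = q - q.mean(axis=0)  # remove translation by centering both sets
    p = p - p.mean(axis=0)
    # Kabsch step: the SVD of the 3x3 covariance matrix gives the best rotation.
    u, sing, vt = np.linalg.svd(p.T @ q)
    d = 1.0 if np.linalg.det(vt.T @ u.T) > 0 else -1.0  # exclude reflections
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    # Least-squares scale factor for the rotated reference.
    scale = (sing[0] + sing[1] + d * sing[2]) / np.sum(p ** 2)
    fitted = scale * (p @ rot.T)
    return 100.0 * np.sum((q - fitted) ** 2) / np.sum(q ** 2)

# Reference shape: vertices of a perfect tetrahedron.
tetra = np.array([[1.0, 1.0, 1.0], [1.0, -1.0, -1.0],
                  [-1.0, 1.0, -1.0], [-1.0, -1.0, 1.0]])

# A rotated, scaled and translated copy is still a perfect tetrahedron.
angle = 0.7
rz = np.array([[np.cos(angle), -np.sin(angle), 0.0],
               [np.sin(angle), np.cos(angle), 0.0],
               [0.0, 0.0, 1.0]])
moved = 2.5 * tetra @ rz.T + np.array([1.0, -2.0, 0.5])

# Moving one vertex produces a distorted tetrahedron.
distorted = tetra.copy()
distorted[0] = [1.3, 1.0, 0.8]

print(shape_measure(moved, tetra))      # numerically zero
print(shape_measure(distorted, tetra))  # small positive value
```

In this sketch the measure is invariant to the position, orientation and size of the structure, so only the distortion of the shape itself contributes to the value.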
This gives the deviation of the original structure from a specific shape. The CCM finds the closest achiral structure, and is therefore a measure of chirality 2. Mathematically, achiral structures contain at least one improper symmetry operation. In practice, the achiral structures encountered in chemistry mainly have Ci, Cs, S4, S6, S8 or S10 symmetry. Therefore, we can apply the CSM methodology to the original structure with each of these symmetries and choose the one that minimizes the CSM value. This minimum is the CCM value, and the corresponding closest symmetrical structure is also the closest achiral structure.

2. Permutations

In previous publications we introduced and discussed the concept of allowed permutations 7. Here we present only a brief review of the subject. Molecules, like most three-dimensional structures, are usually described by a set of vectors which represent the positions of the atoms with respect to some origin. When a symmetry operation is applied to this set of vectors, the coordinates (the entries of the vectors) change, but the order of the vectors remains the same. If we now want to calculate the minimal distance between the original and the transformed structures, we first have to find the best permutation, that is, the best pairing of atoms (as can be seen in figure SI-1). Finding a permutation means that for each atom in the original set we find the label of the atom in the transformed set which is closest to it.

Figure SI-1. After applying a symmetry operation to a given structure, the best permutation must be found in order to minimize the distance function.

For a given structure containing N vertices, there are N! possible permutations. Not all of these permutations are valid 7. A valid permutation is one which leads to a nearest structure which is perfectly G-symmetric.
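The pairing step described above can be sketched as a brute-force search over all N! permutations, which is feasible only for very small N. This is an illustrative sketch, not the procedure of ref. 7: it does not filter out invalid permutations, it uses a fixed C2 rotation about the z axis as the symmetry operation, and the function names and coordinates are hypothetical.

```python
import math
from itertools import permutations

def rotate_c2_z(points):
    """Apply a C2 rotation about the z axis: (x, y, z) -> (-x, -y, z)."""
    return [(-x, -y, z) for x, y, z in points]

def squared_distance(original, transformed, perm):
    """Sum of squared distances when atom i is paired with transformed atom perm[i]."""
    return sum(math.dist(original[i], transformed[j]) ** 2
               for i, j in enumerate(perm))

def best_permutation(original, transformed):
    """Try every one of the N! pairings and keep the one with the smallest distance."""
    n = len(original)
    return min(permutations(range(n)),
               key=lambda perm: squared_distance(original, transformed, perm))

# A slightly distorted square in the xy plane (hypothetical coordinates).
square = [(1.0, 0.1, 0.0), (0.0, 1.0, 0.0), (-1.1, 0.0, 0.0), (0.05, -0.9, 0.0)]
rotated = rotate_c2_z(square)

best = best_permutation(square, rotated)
print(best)  # (2, 3, 0, 1): each atom is paired with the image of the opposite atom
print(squared_distance(square, rotated, best))
```

The rotation keeps the order of the vectors, so atom 0 of the rotated set lies near atom 2 of the original set; the search recovers this relabeling before any distance is evaluated.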
Mathematically, the allowed permutations are those in which the number of vertices interchanged by the permutation equals the order of the symmetry group G or an integer divisor of it. A detailed example of this concept can be found in ref. 7. Checking only the allowed permutations can drastically reduce the number of permutations we need to check, but if N is large enough, the scaling still approaches N!. A fast and efficient method for producing and checking only the most probable permutations was introduced in ref. 9. This method scales as N² in the worst case, but it has an error of ~2% for CSM values smaller than 5, and the error can reach up to 20% if the original structure has symmetry measures larger than 5.

3. Symmetry analysis of several distortion pathways

As an example of the new methodology and as a test of its performance, we analyzed the symmetry changes along several distortion pathways. We chose to focus on four pathways representing four different distortions of a tetrahedron 10-11: the spread, plier, umbrella and scissoring pathways (Figure SI-2). We analyzed some of the relevant complex symmetry point groups: Td, D4h, C3v and C2v. The results are shown in figures SI-3 to SI-6. It can be seen that if the relevant symmetry is preserved along the pathway, the symmetry value is zero, as expected. In other cases, the symmetry measures change continuously and smoothly, unless the reference structure changes, which appears as a sharp discontinuity point. These results are consistent with the results of previous publications 10-11.

Figure SI-2. The spread, plier, umbrella and scissoring distortion pathways.

Figure SI-3. Several symmetry measures along the spread distortion pathway.

Figure SI-4. Several symmetry measures along the plier distortion pathway.

Figure SI-5. Several symmetry measures along the umbrella distortion pathway.

Figure SI-6. Several symmetry measures along the scissoring distortion pathway.

4. Error analysis

Several types of tests were performed in order to assess the error of the method. In cases where the closest symmetrical object was known, we compared our results with the results of the SHAPE program, which uses the CShM methodology 3-4. For example, for all the pathways that were checked in the previous section we compared our S(Td) results with the results of the SHAPE program, using a perfect tetrahedron as the reference structure. For S(D4h), we used the SHAPE program with a planar square (D4h symmetry) and a linear structure (D∞h symmetry) as reference structures, and compared the lower of the two values with our results.

We also created 1000 randomly distorted tetrahedra (Td symmetry), 1000 randomly distorted octahedra (Oh symmetry), 1000 randomly distorted cubes (Oh symmetry), 1000 randomly distorted icosahedra (Ih symmetry), 1000 randomly distorted dodecahedra (Ih symmetry), 1000 randomly distorted fullerenes (Ih symmetry), 1000 randomly distorted triangular bipyramids (D3h symmetry) and 1000 randomly distorted ammonia molecules (C3v symmetry). The bond length was set to 1.0±0.4 Å, and the angles were set to the angles of the perfect structure within a range of ±5 degrees. The structural changes are kept small compared to the original symmetrical structures, so the CSM and CShM values should be the same. We checked this assumption during the calculations: cases where our results were lower than the results of the SHAPE program were excluded. Also excluded were all the cases where the closest symmetrical structure was different from the original symmetrical structure. A few examples are given in figure SI-7. In these cases we can compare our results with the results of the SHAPE program, as the reference structures are known a priori.

Figure SI-7. Example structures of a distorted tetrahedron, distorted cube, distorted octahedron, distorted icosahedron, distorted dodecahedron, distorted triangular bipyramid, distorted fullerene and distorted ammonia molecule.
The upper value is the CShM value (calculated by the SHAPE program), the middle value is the CSM value calculated using our method and the lower value is the error.

5. Analysis of the [Cu(II)Cl4]2- and [Ni(II)(CN)4]2- complexes

CSD 12 reference codes for the [Cu(II)Cl4]2- complexes 11:
MPEACU LABNOR VACGUB PRZCUB CINCCU BOPWUY VITSAS FUTRER NUTDOV METHCC JEPLEV YITGUD JEZLAB CEJLAE KOFSON ZIPHIP TCENPT FISJEW VACGUB WEHVOU

CSD 12 reference codes for the [Ni(II)(CN)4]2- complexes 11:
CIKBIH HAYDAM RIHBEP RUGVUE TOJXIZ VILXOD YOTVAE ZIBFUL ZURRIN ZURRUT

6. References

1. (a) H. Zabrodsky, S. Peleg, D. Avnir, J. Am. Chem. Soc. 1992, 114, 7843; (b) H. Zabrodsky, S. Peleg, D. Avnir, J. Am. Chem. Soc. 1993, 115, 8278; (c) H. Zabrodsky, D. Avnir, J. Am. Chem. Soc. 1995, 117, 462; (d) H. Zabrodsky, S. Peleg, D. Avnir, IEEE Trans. Pattern Anal. Machine Intelligence 1995, 17, 1154.
2. (a) D. Avnir, A.Y. Meyer, J. Molec. Struct. (Theochem) 1991, 226, 211; (b) H. Zabrodsky, D. Avnir, J. Am. Chem. Soc. 1995, 117, 462.
3. (a) M. Pinsky, D. Avnir, Inorg. Chem. 1998, 37, 5575; (b) S. Alvarez, D. Avnir, M. Llunell, M. Pinsky, New J. Chem. 2002, 26, 996; (c) J. Cirera, E. Ruiz, S. Alvarez, Organometallics 2005, 24, 1556; (d) D. Casanova, M. Llunell, P. Alemany, S. Alvarez, Chem. Eur. J. 2005, 11, 1479.
4. http://www.ee.ub.es/index.php/downloads/viewcategory/4-registerddownload.
5. (a) H. Zabrodsky, D. Avnir, Adv. Mol. Struct. Res. 1995, 1, 1; (b) Y. Solomon, D. Avnir, Journal of Mathematical Chemistry 1999, 25, 295.
6. (a) C. Dryzun, D. Avnir, Phys. Chem. Chem. Phys. 2009, 42, 9653; (b) C. Dryzun, D. Avnir, ChemPhysChem 2011, 12, 197.
7. (a) M. Pinsky, D. Casanova, P. Alemany, S. Alvarez, D. Avnir, C. Dryzun, Z. Kizner, A. Sterkin, J. Comput. Chem. 2008, 29, 190; (b) M. Pinsky, C. Dryzun, D. Casanova, P. Alemany, D. Avnir, J. Comput. Chem. 2008, 29, 2712.
8. (a) W. Kabsch, Acta Crystallographica 1976, 32, 922; (b) W. Kabsch, Acta Crystallographica 1978, A34, 827.
9. C. Dryzun, A. Zayit, D. Avnir, J. Comput. Chem. 2011, 32, 2526.
10. (a) S. Alvarez, P. Alemany, D. Casanova, J. Cirera, M. Llunell, D. Avnir, Coordination Chem. Rev. 2005, 249, 1693; (b) D. Casanova, J. Cirera, M. Llunell, P. Alemany, D. Avnir, S. Alvarez, J. Am. Chem. Soc. 2004, 126, 1755.
11. (a) S. Keinan, D. Avnir, Inorg. Chem. 2001, 40, 318; (b) S. Keinan, D. Avnir, J. Chem. Soc. Dalton Transactions 2001, 941.
12. F. H. Allen, Acta Cryst. 2002, B58, 380.