Structure Validation Challenges
in Chemical Crystallography
Ton Spek
Utrecht University,
The Netherlands.
Madrid, Aug. 26, 2011
Validation History
• Structure Validation of data supplied in
computer readable CIF format was pioneered by
Acta Cryst. C (Syd Hall et al., 1990s).
• Initially the numerical checking of papers
submitted to Acta C in CIF format was done by
the Chester staff.
• Subsequently automated checking of the CIF for
data consistency, data completeness and validity
was introduced (checkCIF)
• PLATON facilities to check for Missed Symmetry
and VOIDS were added later on.
• Soon followed by the inclusion of numerous
other PLATON based tests (PLATxxx) of the
reported structure (currently more than 400).
checkCIF/PLATON
FCF Validation
• Fo/Fc reflection file deposition and archival
in CIF format (FCF) was made mandatory
early on for Acta Cryst. papers.
• Useful for subsequent analysis of possibly
unique data.
• CIF + FCF checking was added to the IUCr
CheckCIF/PLATON suite in 2010.
• Major chemical journals now require CIF
deposition and validation reports but not
(yet) the deposition of reflection data.
• The CCDC now accepts FCFs for
deposition.
Why Automated Structure
Validation
• The large volume of new and routine structure
reports submitted for publication.
• The limited number of experienced and available
crystallographic referees for validation.
• Detection of errors due to the black box use of
crystallography by non-crystallographers.
• Setting standards of quality and reliability.
• Automated detection of unusual though not
necessarily erroneous issues that need special
attention (ALERTS A,B,C,G).
• Sadly: The need to detect fraudulent structure
reports.
Systematic Fraud
• A massive fraud was detected in late 2009 of structures mainly
published around 2007 in Acta Cryst. E. (Soon 200 retractions !)
• Before 2010, nobody was prepared for serious and systematic fraud in this
non-competitive field of routine structures.
• Many deviations from the expected results can often be explained as
errors, inexperience or due to poor data.
• Several retractions before 2010 might in hindsight concern fraudulent
structures and not errors.
• Ongoing testing of our validation software on the archived data for
structures published in Acta E often indicated suspect structures
needing a more detailed investigation.
• It was only by following up on one such strange structure report
with an analysis of all structures published by the authors of that
paper that a fraud pattern emerged.
• It was discovered that the same data set was used to publish a
series of invented isomorphous structures.
• Full story: Acta Cryst. E (2010) editorial and a Powerpoint
Presentation of the E-section editor Jim Simpson (IUCr Website).
Bogus Variations (with Hirshfeld ALERTS) on the Published Structure
2-hydroxy-3,5-nitrobenzoic acid (ZAJGUM)
OH=>NH2
NO2=>COOH
OH => F
H2O => NH3
Fraud Detection Tools
• Generalized Hirshfeld Rigid Bond Test.
• CIF versus FCF data checking.
• Scatter Plots of the reflection data of the same
or related structure(s).
• Look in Difference Maps for unusual features.
• SHELXL re-refinement using the supplied CIF &
FCF data.
• Check in the CSD for related structures.
• Two case studies that illustrate the use of the
above validation and analysis tools follow.
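The rigid bond idea behind the first tool in the list above can be sketched as follows. For two bonded atoms, the mean-square displacement amplitudes along the bond direction should be nearly equal, and a large difference may point to a wrongly assigned atom type. This is a minimal sketch of the basic test only, not PLATON's generalized implementation. The coordinates, U tensors (assumed Cartesian, in Ang.^2) and the 0.001 Ang.^2 cut-off are hypothetical values chosen for illustration.

```python
import math

def msda_along(u_tensor, direction):
    """Mean-square displacement amplitude of U along a unit vector: d^T U d."""
    return sum(direction[i] * u_tensor[i][j] * direction[j]
               for i in range(3) for j in range(3))

def rigid_bond_delta(pos_a, u_a, pos_b, u_b):
    """Absolute difference of the two atoms' MSDAs along the A-B bond."""
    bond = [b - a for a, b in zip(pos_a, pos_b)]
    length = math.sqrt(sum(c * c for c in bond))
    unit = [c / length for c in bond]
    return abs(msda_along(u_a, unit) - msda_along(u_b, unit))

# Hypothetical Cartesian coordinates (Ang.) and U tensors (Ang.^2).
pos_c1 = (0.0, 0.0, 0.0)
pos_c2 = (1.5, 0.0, 0.0)
u_c1 = [[0.020, 0.0, 0.0], [0.0, 0.030, 0.0], [0.0, 0.0, 0.025]]
u_ok = [[0.0205, 0.0, 0.0], [0.0, 0.040, 0.0], [0.0, 0.0, 0.030]]
u_bad = [[0.045, 0.0, 0.0], [0.0, 0.040, 0.0], [0.0, 0.0, 0.030]]

THRESHOLD = 0.001  # assumed illustrative cut-off, not PLATON's actual setting

delta_ok = rigid_bond_delta(pos_c1, u_c1, pos_c2, u_ok)
delta_bad = rigid_bond_delta(pos_c1, u_c1, pos_c2, u_bad)
print("plausible bond flagged:", delta_ok > THRESHOLD)
print("suspect bond flagged:", delta_bad > THRESHOLD)
```

Only the component of U along the bond enters the test, so the larger perpendicular components in the example do not trigger a flag.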
Example 1: Error or Fraud ?
Structure I
Submitted to Acta Cryst. (2011)
PLATON Report Part 1
PLATON Report Part 2
RELATED STRUCTURE FROM THE CSD
Structure II
Structure Report for II
Scatter Plots I(obs) versus I(calc)
(I)
(II)
Analysis
• Structure (II) has no validation issues.
• C-CH3 distance in (II) of 1.50 Ang. as expected.
• ‘C-F’ distance in (I) is 1.50 Ang. and not the
expected 1.35 Ang.
• Conclusion: Structure (I) is the CH3 variety and
not F.
• Data sets of (I) & (II) are not identical (see next).
• Data set (I) likely based on CH3 compound.
• Fraud or Error ? DIFABS file Error ?
• Authors of (I) confirmed an error, having believed an external
chemist's proposal. The paper was retracted.
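The distance argument in this analysis can be expressed as a small plausibility check. The two expected distances are the ones quoted above. The function names and the 0.05 Ang. tolerance are hypothetical, and a real check would draw expected values and spreads from the CSD rather than from a two-entry table.

```python
# Expected distances quoted in the analysis above (Ang.).
EXPECTED = {"C-F": 1.35, "C-CH3": 1.50}

def best_matching_bond(observed, expected=EXPECTED):
    """Return the bond label whose expected distance is closest to the observed one."""
    return min(expected, key=lambda label: abs(expected[label] - observed))

def deviates(observed, label, tolerance=0.05, expected=EXPECTED):
    """True if the observed distance differs from the expected one by more than
    the tolerance (0.05 Ang. is an assumed, illustrative value)."""
    return abs(observed - expected[label]) > tolerance

observed_in_structure_1 = 1.50  # reported 'C-F' distance in structure (I)
print(best_matching_bond(observed_in_structure_1))
print(deviates(observed_in_structure_1, "C-F"))
```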
Scatter Plots of 2 Data Sets
Two Unrelated Data Sets
Two Identical Data sets
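A scatter plot comparison of two data sets can be condensed into a single number by correlating I(obs) over the reflections the two sets share. The sketch below is an illustration under stated assumptions: reflections are held in dictionaries keyed by (h, k, l), the intensities are invented, and the Pearson coefficient stands in for visual inspection of the plot. Isomorphous structures also give correlated intensities, so a high value calls for a closer look rather than proving duplication.

```python
import math

def common_intensities(set_a, set_b):
    """Pair I(obs) values for the reflections (h, k, l) present in both sets."""
    shared = sorted(set(set_a) & set(set_b))
    return [set_a[hkl] for hkl in shared], [set_b[hkl] for hkl in shared]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented intensities, for illustration only.
set_one = {(1, 0, 0): 120.0, (0, 1, 0): 15.0, (0, 0, 1): 430.0,
           (1, 1, 0): 60.0, (1, 0, 1): 5.0}
set_copy = {hkl: 2.0 * i_obs for hkl, i_obs in set_one.items()}  # same data, rescaled
set_other = {(1, 0, 0): 30.0, (0, 1, 0): 400.0, (0, 0, 1): 50.0,
             (1, 1, 0): 10.0, (1, 0, 1): 220.0}

r_copy = pearson(*common_intensities(set_one, set_copy))
r_other = pearson(*common_intensities(set_one, set_other))
print(f"rescaled copy: r = {r_copy:.3f}, unrelated set: r = {r_other:.3f}")
```

The correlation coefficient is insensitive to an overall scale factor, which is why the rescaled copy still registers as the same data.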
CIF versus FCF data Check
• The R & S values in the three lines # R= should be identical within
rounding error.
• The reported and calculated residual density ranges should also be
nearly identical.
• This is the case in the first example but not in the second where the
CIF & FCF data do not match.
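The consistency check described above amounts to recomputing an R value from the deposited Fo/Fc pairs and comparing it with the value reported in the CIF. The sketch below uses the conventional definition R1 = sum(||Fo| - |Fc||) / sum(|Fo|). The reflection list, the reported values and the function names are hypothetical, and the usual restriction to reflections above a sigma cut-off is left out for brevity.

```python
def r1_factor(reflections):
    """R1 = sum(| |Fo| - |Fc| |) / sum(|Fo|) over (Fo, Fc) pairs."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in reflections)
    denominator = sum(abs(fo) for fo, _ in reflections)
    return numerator / denominator

def matches_reported(reported, recomputed, decimals=4):
    """True if the recomputed value agrees with the reported one to within
    rounding at the given number of decimals."""
    return abs(reported - recomputed) <= 0.5 * 10 ** (-decimals) + 1e-12

# Invented (Fo, Fc) pairs standing in for the content of an FCF.
fcf_pairs = [(10.0, 9.5), (20.0, 20.4), (5.0, 5.1), (15.0, 14.0)]
r1 = r1_factor(fcf_pairs)

print("consistent CIF:", matches_reported(0.0400, r1))
print("inconsistent CIF:", matches_reported(0.0310, r1))
```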
Example 2: Iron(III) Complex
Fe(III) Validation Part 1
Fe(III) Validation Part 2
Example 2: Difference Density Map
Fe Structure Re-refined
Conclusion ?
• Structure now O.K. after an erratum ?
• Search for similar (isomorphous)
structures in the CSD
• Yes, there is an isomorphous Mn complex
published by a different set of authors from
a different university.
• Let us compare both structures.
Isomorphous Mn(III) Complex
Mn Structure Validation Part 1
Mn Validation Part 2
Scatter Plot Fe versus Mn I(obs)
Fe and Mn Data Sets Identical !
Analysis on Fe/Mn Structures
• The Displacement parameters in the CIF
for the H2O molecule in the Fe complex
are different from those used in the final
refinement.
• Reflection sets identical for papers from
two different sets of authors and locations.
• CSD: Unusual coordination distances
• Fraud or Error ?
• Withdraw/Retract one or both ?
Validation Challenges
• Avoid False Positive and Negative ALERTS
• Disordered structures (true or artifact)
• Handling of Twinning (data names missing)
• Powder structure validation (experts needed)
• Incommensurate structure validation (experts)
• Fabricated reflection data – Can we detect them ?
• Education – What is the meaning of an ALERT ?
• Should validation criteria be different for
structures published in chemical journals ?
Concluding Remarks
• PLATON includes a standalone Validation
Tool. It is part of the WEB-based IUCr
CheckCIF/PLATON Tool that is capably
managed by Mike Hoyland (IUCr)
• Validation is still a learning process.
• Chemical insight might be very helpful and
often decisive as a validation tool.
• Deposition of structure factors should be a
requirement for all journals (The CCDC
now accepts those along with the CIF)
Thanks To
• Martin Lutz and many
others for taking the
time to bring various
unresolved issues to
my attention with
actual data.
• Send to
a.l.spek@uu.nl