Routine authoring and publication of enhanced figures

Accessing the data: going beyond what the author wanted to tell you
Interactive Publications and the Record of Science
ICSTI Winter Workshop
Paris, Monday, February 8, 2010
Brian McMahon
International Union of Crystallography
5 Abbey Square, Chester CH1 2HU, UK
PDFs and data impoverishment
Henry Rzepa (19 December 2009): Publishers are likely to love interactive PDF, since it is
easy to archive. However ... such objects are data impoverished.
Whereas with Jmol, one is obliged to provide semantically accurate
data (e.g. CML or equivalent), the PDF object is simply a (pre)rendering
of that data. Thus reconstituting a useful molecule from Jmol is trivial
(and that reconstitution can then be used for many other purposes),
reconstituting a molecule from a 3D PDF is likely to be non trivial, and
will almost certainly suffer information loss compared to the original
data. By all means, provide both, but I strongly urge that a 3D PDF
should not be the only object provided.
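To illustrate the point about semantically accurate data, here is a minimal sketch of how atoms and bonds can be read straight back out of a CML-style fragment. The water molecule, its approximate coordinates, and the helper names read_molecule and element_counts are assumptions made for this illustration; they do not come from the talk.

```python
import xml.etree.ElementTree as ET
from collections import Counter

CML_NS = {"cml": "http://www.xml-cml.org/schema"}

# Illustrative CML fragment (approximate water geometry, for demonstration only).
WATER_CML = """<molecule xmlns="http://www.xml-cml.org/schema" id="water">
  <atomArray>
    <atom id="a1" elementType="O" x3="0.000" y3="0.000" z3="0.000"/>
    <atom id="a2" elementType="H" x3="0.957" y3="0.000" z3="0.000"/>
    <atom id="a3" elementType="H" x3="-0.240" y3="0.927" z3="0.000"/>
  </atomArray>
  <bondArray>
    <bond atomRefs2="a1 a2" order="1"/>
    <bond atomRefs2="a1 a3" order="1"/>
  </bondArray>
</molecule>"""


def read_molecule(cml_text):
    """Return (atoms, bonds) recovered from a CML string.

    atoms maps atom id -> (element symbol, (x, y, z));
    bonds is a list of (atom id, atom id) pairs.
    """
    root = ET.fromstring(cml_text)
    atoms = {}
    for atom in root.findall("cml:atomArray/cml:atom", CML_NS):
        xyz = tuple(float(atom.get(axis)) for axis in ("x3", "y3", "z3"))
        atoms[atom.get("id")] = (atom.get("elementType"), xyz)
    bonds = [
        tuple(bond.get("atomRefs2").split())
        for bond in root.findall("cml:bondArray/cml:bond", CML_NS)
    ]
    return atoms, bonds


def element_counts(atoms):
    """Count atoms per element, e.g. to rebuild a molecular formula."""
    return Counter(element for element, _ in atoms.values())


atoms, bonds = read_molecule(WATER_CML)
print(element_counts(atoms), bonds)
```

The element types, coordinates and connectivity survive intact because they are stored as data. A pre-rendered 3D object generally holds only the geometry of the drawing, so the same recovery would have to be inferred from shapes.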
Jmol interactive visualizations
• Not new
Biochem J. (2008). 412
• Bespoke design /
• Expensive
• Requires consultation
• Supplementary
The right tool for the job
Then (ca. 2004):
• Protein structures (RasMol)
• Small organic chemical molecules (Chime)
• Crystal lattices (symmetry)
• Inorganic materials (coordination polyhedra)
• Displacement ellipsoids
• Symmetry operations
• Electron orbitals
• Electron density maps
Making it easier to use
• Editing toolkit
• High-quality immediate visual feedback
• Context-sensitive help
• Manuals, examples, tutorials
• Reference: McMahon, B. & Hanson, R.M. (2008).
J. Appl. Cryst. 41, 811-814. A toolkit for publishing enhanced
Interactive molecular visualizations enhance understanding
Acta Cryst. (2008). F64,
Modify orientation
Alternative representations
Overlay representations
Infrastructure for publication workflow
Server/client architecture
Ability to create interactive figures before or during article submission/review
Opportunity for peer review/revision
Auto-generation of static equivalent
Easy generation/activation of multiple scripts to provide alternative views
Requirements for routine publication of enhanced figures
• Platform independence
• Web access for authors
• Serving visualization application and data
• Integration into submission/review procedures
• Integration into journal production workflow
• Automated generation of static copy (for failsafe/PDF
• Authoring tools
The authoring environment
• The author uploads a data file
• The system provides different default styles according to the type of structure
• The author edits and annotates the view
• The author may supply additional scripts
• The author saves the result as an enhanced figure + publication-quality static figure
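The step "default styles according to the type of structure" could be sketched as follows. This is only an illustration: the mapping, the file-extension heuristic, and the names DEFAULT_STYLES, guess_structure_type and build_default_script are hypothetical, and the Jmol script strings are plausible defaults chosen for this sketch, not the presets actually used by the toolkit.

```python
# Hypothetical mapping from structure type to a default Jmol script.
# The script text is an assumption made for this sketch.
DEFAULT_STYLES = {
    "protein": "cartoon only; color structure;",
    "small_molecule": "wireframe 0.15; spacefill 23%; color cpk;",
    "crystal": "wireframe 0.15; spacefill 23%; unitcell on;",
}


def guess_structure_type(filename):
    """Crude guess from the file extension (an assumed heuristic).

    A real system would inspect the file contents instead.
    """
    lower = filename.lower()
    if lower.endswith((".pdb", ".ent")):
        return "protein"
    if lower.endswith(".cif"):
        return "crystal"
    return "small_molecule"


def build_default_script(filename, structure_type=None):
    """Return a Jmol script that loads the file and applies a default style.

    The author can override the guessed type, then edit the result further.
    """
    kind = structure_type or guess_structure_type(filename)
    return f'load "{filename}"; {DEFAULT_STYLES[kind]}'


# Hypothetical file names used only for illustration.
print(build_default_script("3jw1.pdb"))
print(build_default_script("compound.cif", structure_type="small_molecule"))
```

The generated script is only a starting point: the author would go on to edit and annotate the view, and may add further scripts before saving.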
Saving the enhanced figure
• Interactive applet
• Active scripts provided by the author
• High-resolution static image
• Option to view dynamic or static image online
• Link to allow peer review
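One way to picture what is saved is a small record that bundles the data file, the author's scripts and the static image, with the static image as the fallback. The class name EnhancedFigure, its fields and the file names below are hypothetical, introduced only for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class EnhancedFigure:
    """Hypothetical record of a saved enhanced figure."""
    data_file: str                                # structure data served to the applet
    static_image: str                             # high-resolution static equivalent
    scripts: dict = field(default_factory=dict)   # view label -> script text

    def add_view(self, label, script):
        """Register an alternative view that the reader can activate."""
        self.scripts[label] = script

    def render_choice(self, applet_available):
        """Choose the dynamic figure when possible, else the static image."""
        if applet_available and self.scripts:
            return "interactive"
        return "static"


# Hypothetical file names and script text, for illustration only.
figure = EnhancedFigure(data_file="structure.cif", static_image="figure1.png")
figure.add_view("ball and stick", "wireframe 0.15; spacefill 23%;")
print(figure.render_choice(applet_available=True))
print(figure.render_choice(applet_available=False))
```

Keeping the static image in the same record means there is always something to show when the applet cannot run, and the same record can be handed to referees through a review link.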
The toolkit editing interface
• Essential tool for authors
• Accommodates novice and advanced users
• Tabbed interface allows authors to concentrate on scientific aspects of visualization
• Presets tuned to journal style requirements
• Live testing, preview and feedback mechanisms
• Author may prepare enhanced figure ahead of publication
• Simply enter URL of edit workspace when asked to ‘upload source files’
• Presented alongside other conventional figures
• Available for peer review
• Can be edited in response to referee comments
Interactive authorship: publBio
• Start with the data (PDB), example 3jw1
• Add structured text
• Online look-up:
• authors
• references
• crystallization solution components
• Validation
• references
• Visualisation (Jmol)
• Update data file as submission
Uniform (compatible) markup systems
• Crystallographic Information Framework (CIF)
• Treat data/metadata, text/numerical data as peers
• Domain-specific extensions (dictionaries = ontologies)
• Image format
• Some data fields may need to contain richer content
• Text markup
• Mathematical equations
• Interactive figure scripts
• Machine validation of dictionary attributes
• Methods
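As a rough illustration of machine validation against dictionary attributes, the sketch below reads simple name/value items from a CIF-like fragment and checks them against a miniature dictionary. The parser handles only single-line items (no loops or multi-line text fields), the numerical values are invented, and MINI_DICTIONARY with its type codes and ranges is a hypothetical stand-in for a real CIF dictionary.

```python
import re

# Hypothetical miniature dictionary: data name -> (type, minimum, maximum).
MINI_DICTIONARY = {
    "_cell_length_a": ("numb", 0.0, None),
    "_cell_angle_alpha": ("numb", 0.0, 180.0),
    "_symmetry_space_group_name_H-M": ("char", None, None),
}

# Invented example values, for illustration only.
CIF_TEXT = """data_example
_cell_length_a     10.234(2)
_cell_angle_alpha  90.00
_symmetry_space_group_name_H-M  'P 21 21 21'
"""


def parse_simple_cif(text):
    """Collect single-line '_name value' items; everything else is skipped."""
    items = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0].startswith("_"):
            items[parts[0]] = parts[1].strip().strip("'\"")
    return items


def numeric_value(raw):
    """Convert '10.234(2)' to 10.234 by dropping the standard uncertainty."""
    return float(re.sub(r"\(\d+\)$", "", raw))


def validate(items, dictionary):
    """Return a list of human-readable problems (empty if all checks pass)."""
    problems = []
    for name, raw in items.items():
        if name not in dictionary:
            problems.append(f"{name}: not defined in dictionary")
            continue
        kind, low, high = dictionary[name]
        if kind != "numb":
            continue
        try:
            value = numeric_value(raw)
        except ValueError:
            problems.append(f"{name}: '{raw}' is not numeric")
            continue
        if low is not None and value < low:
            problems.append(f"{name}: {value} is below {low}")
        if high is not None and value > high:
            problems.append(f"{name}: {value} is above {high}")
    return problems


items = parse_simple_cif(CIF_TEXT)
print(validate(items, MINI_DICTIONARY))
```

Because the constraints live in the dictionary and not in the code, the same validator could serve any domain-specific extension that supplies its own definitions.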
The working scientist really wants to interact with the data
What interactive PDF offers is currently limited
Publishers should develop compatible architectures
Need domain-specific implementations (learned societies)
Investment in new applications; integration with workflow
Education for a new paradigm
requires more standardisation
proper compound document model
concentrate on data (or semantic content), not the implementation
‘record not what it looks like, but what you are looking at’
• Distributed content sources
• data not necessarily integral part of document
• retrieval of non-discrete data sets