Interactive Poster: Enlightenment: An Integrated Visualization and
Analysis Tool for Drug Discovery
Christopher E. Mueller
Array BioPharma, 3200 Walnut St., Boulder, CO 80301
cmueller@arraybiopharma.com
Abstract
Commercial software tools for interpreting analytical chemistry
data provide basic views but offer few domain-specific
enhancements for exploring the data. Gaining an understanding
of the results for an individual compound and a large set of
compounds requires examining multiple data sets in multiple
applications for each compound. In this poster, we present
Enlightenment, a new tool that takes the traditional look and feel
of an analytical application and significantly enhances the utility
of the visualizations. Using Enlightenment, analytical chemists
can review large sets of compounds quickly and explore the data
from a single, unified interface. Enlightenment demonstrates how
applying domain knowledge can enhance the usefulness of
traditional displays.
Keywords: Visualization, Chromatography, HPLC, Mass Spec,
High Throughput Synthesis
1 High Throughput Synthesis
High Throughput Synthesis is the process of using combinatorial
chemistry to create large numbers of related but diverse
compounds quickly. The main vessel for handling compounds is a
plate. A plate consists of wells arrayed in an m x n matrix, where
m x n is typically 12 x 8 yielding 96 wells.
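The plate layout can be sketched as a simple index mapping. The snippet below is a minimal illustration, not part of Enlightenment: it assumes the common microplate labeling convention (rows lettered A to H, columns numbered 1 to 12) and a row-major well order, neither of which is stated in the text. The function names `well_label` and `well_index` are hypothetical.

```python
import string

ROWS, COLS = 8, 12  # 8 rows x 12 columns = 96 wells


def well_label(index, rows=ROWS, cols=COLS):
    """Map a 0-based, row-major well index to a label such as 'A1'."""
    if not 0 <= index < rows * cols:
        raise ValueError(f"well index {index} outside 0..{rows * cols - 1}")
    row, col = divmod(index, cols)
    return f"{string.ascii_uppercase[row]}{col + 1}"


def well_index(label, rows=ROWS, cols=COLS):
    """Inverse of well_label: map a label such as 'B1' to its index."""
    row = string.ascii_uppercase.index(label[0].upper())
    col = int(label[1:]) - 1
    if row >= rows or not 0 <= col < cols:
        raise ValueError(f"label {label!r} is not on a {rows}x{cols} plate")
    return row * cols + col
```

Under these assumptions, `well_label(12)` gives the first well of the second row and `well_index` reverses the mapping.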
To confirm that the correct products are created, each plate is
analyzed using a high performance liquid chromatography
(HPLC) instrument with UV and mass spectrometric (MS)
detection to confirm purity and identity, respectively. An
algorithm is applied to the data to make the first determination as
to whether or not the compound was created properly. These
results are then reviewed by an analytical chemist who either
confirms or amends them. Interpreting the results algorithmically
is non-trivial and often produces incorrect results, requiring
human intervention to determine if a compound passes or fails.
The manual process consists of using a collection of
vendor-supplied tools to explore the data, each task requiring a separate
application: one for viewing the plate and algorithmic results, one
for viewing raw data for each well, one for viewing compound
structures, and a spreadsheet for tracking observations. Finnigan's
Xcalibur/Discovery [1,2] and Waters’ OpenLynx [3] are
examples of such commercial systems.
2 Enlightenment
Enlightenment provides a unified interface to all plate, structure,
and analytical data. It applies information visualization
techniques to increase the data density of the visualizations and
to help the analytical chemist understand the results quickly.
When data exploration is required, a series of data-aware, linked
plots allows the chemist to drill down into the data from a single
application.
Figure 1 - Enlightenment
Enlightenment is designed to be immediately familiar to analytical
chemists but provides a more information-rich view of the data
than commercially available tools. The main views integrated
into the UI are the plate view with its linked tree and compound
structure views as well as an analytical data view that shows the
processing results, linked to plots of the raw data.
3 Plate View
The plate view in commercial applications displays a grid of
color-coded circles for each well, with the color denoting the
status of the well. By default, Finnigan’s Discovery Browser [2]
uses four colors denoting pass (green), found but not pure
(yellow), pure but not found (pink), and fail (red). However, other
data items exist that can be displayed at the well level to give the
chemist a better idea of what is happening in the plate. It is often
the case that the chemist will step through each well to acquire
these, just to get a better view of the big picture.
Enlightenment uses the Finnigan color scheme to maintain
familiarity, but replaces pink with blue since some displays made
it hard to distinguish pink and red. The intensity of the colors
was also adjusted using the guidelines in [4, p. 164] so that no
single color stood out.
Figure 2 – Icons, Colors and Overlays
Enlightenment uses overlays and size to show clearly three extra
dimensions of data: HPLC signal strength, channel
used, and percent BPI (MS signal strength). These values are
typically used to understand problems with a plate and are only
available through analysis of multiple plots per well in
commercial applications.
Signal strength is illustrated by the size of the circle: smaller for
low signals and larger for signals that are too strong. Size alone
was hard to distinguish on small displays, so a “noisy” border was
added to give the appearance of a deviant signal.
Selected channel and percent BPI use overlays to highlight cases
that occur infrequently. Generally, channel 1 is selected and the
BPI is 100%. If a different channel was used, the channel's
number is overlaid in the upper left corner of the well. If the BPI
is below a threshold (e.g. 80%), a bar appears on the left edge of
the well, its height proportional to the BPI. By using the overlay only
in these cases, wells that exhibit these behaviors stand out.
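The encoding rules above can be summarized in a short sketch. This is an illustration under stated assumptions, not Enlightenment's actual code: the names `WellGlyph` and `well_glyph`, the signal limits `low` and `high`, and the radius factors are all hypothetical, and the 80% threshold is only the example value given in the text.

```python
from dataclasses import dataclass
from typing import Optional

# Status colors as described in the text (blue replaces Finnigan's pink).
STATUS_COLORS = {
    "pass": "green",
    "found_not_pure": "yellow",
    "pure_not_found": "blue",
    "fail": "red",
}

BPI_THRESHOLD = 80.0  # percent; the text gives 80% only as an example


@dataclass
class WellGlyph:
    color: str
    radius: float                    # relative to a normal well (1.0)
    noisy_border: bool               # drawn when the signal is deviant
    channel_label: Optional[str]     # upper-left overlay, None for channel 1
    bpi_bar_height: Optional[float]  # fraction of well height, None if hidden


def well_glyph(status, signal, channel, bpi, low=0.2, high=1.5):
    """Build the drawing attributes for one well.

    `signal`, `low`, and `high` use hypothetical HPLC signal units;
    the radius factors 0.6 and 1.3 are arbitrary illustration values.
    """
    if signal < low:
        radius, noisy = 0.6, True    # weak signal: smaller circle
    elif signal > high:
        radius, noisy = 1.3, True    # overly strong signal: larger circle
    else:
        radius, noisy = 1.0, False
    return WellGlyph(
        color=STATUS_COLORS[status],
        radius=radius,
        noisy_border=noisy,
        # Overlays appear only in the infrequent cases, so they stand out.
        channel_label=str(channel) if channel != 1 else None,
        bpi_bar_height=bpi / 100.0 if bpi < BPI_THRESHOLD else None,
    )
```

A typical well (channel 1, BPI of 100%, normal signal) produces a plain colored circle with no overlays, which is what lets the exceptional wells stand out.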
Enlightenment's plate view uses different levels of detail (LODs)
to display more or less information about each well, depending on
the audience. For instance, business development staff can select an
LOD that only displays green/red to determine which compounds
can be sold, whereas an analytical chemist would select the most
detailed LOD.
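One way to think about LODs is as a filter applied before drawing. The sketch below is hypothetical: the text describes only the coarsest (green/red) and the most detailed levels, so the intermediate `STATUS` level, the names `LOD`, `well_color`, and `show_overlays`, and the rule that only a full pass maps to green are all assumptions made for illustration.

```python
from enum import IntEnum


class LOD(IntEnum):
    PASS_FAIL = 0  # green/red only
    STATUS = 1     # four status colors (assumed intermediate level)
    FULL = 2       # status colors plus size and overlays


FULL_COLORS = {
    "pass": "green",
    "found_not_pure": "yellow",
    "pure_not_found": "blue",
    "fail": "red",
}


def well_color(status, lod):
    """Return the well color for a status at the given level of detail."""
    if lod == LOD.PASS_FAIL:
        # Assumption: anything short of a full pass is shown as red.
        return "green" if status == "pass" else "red"
    return FULL_COLORS[status]


def show_overlays(lod):
    """Size, channel, and BPI overlays are drawn only at the full LOD."""
    return lod >= LOD.FULL
```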
The plate view is linked to a tree view that displays detailed
information for each compound and a structure view that displays
the structure of the selected compound (Figure 1, top row). The
analytical views are also linked to the selected well.
4 Analytical Results View
The analytical results views are located beneath the plate view
(Figure 1, bottom three rows). There are four different channels of
analytical information used to characterize a compound, three
displayed by default. Applying the concept of multiples in space
and time [5], each channel has an identical results view and a set
of plots. Because the results view is linked to the plate view,
changing the status of a well in the results view also changes the
color and overlays for that well in the plate view.
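Linking of this kind is commonly built on a shared model that notifies its views. The sketch below shows that general observer pattern; it is not taken from Enlightenment, and the class and method names (`PlateModel`, `PlateView`, `set_status`, `subscribe`) are hypothetical.

```python
class PlateModel:
    """Holds well statuses and notifies subscribed views of changes."""

    def __init__(self):
        self._status = {}
        self._listeners = []

    def subscribe(self, callback):
        """Register a callable invoked as callback(well, status)."""
        self._listeners.append(callback)

    def status(self, well):
        return self._status.get(well)

    def set_status(self, well, status):
        """Update a well and notify listeners only if the value changed."""
        if self._status.get(well) == status:
            return
        self._status[well] = status
        for callback in self._listeners:
            callback(well, status)


class PlateView:
    """Stand-in for the plate view: records which wells need redrawing."""

    def __init__(self, model):
        self.redrawn = []
        model.subscribe(self.on_status_changed)

    def on_status_changed(self, well, status):
        # A real view would recolor the well and refresh its overlays here.
        self.redrawn.append((well, status))
```

With this arrangement the results view only calls `model.set_status(...)`; it needs no direct reference to the plate view.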
5 Analytical Plot Views
HPLC and MS data are represented by line and stick plots,
respectively. HPLC data consists of a time-series trace with
distinct peaks. Each peak corresponds to some amount of
material passing through the detector and comparing peak areas
gives the purity for each peak. Each peak has a start and end
point, and the MS data is sub-sampled to show data in the range
for each peak. Selecting a peak in an HPLC trace displays the
sub-sample of the MS data in the MS plot. MS plots show the
mass-to-charge (m/z) ratio on the x-axis and relative intensity on
the y-axis.
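The two calculations described above can be sketched as follows. This is a minimal illustration under assumptions: purity is taken to be each peak's share of the total integrated area (the usual area-percent convention, which the text implies but does not spell out), MS scans are assumed to be time-stamped and sorted, and the function names are hypothetical. How the selected scans are combined into one displayed spectrum is not specified in the text and is left out.

```python
from bisect import bisect_left, bisect_right


def peak_purities(areas):
    """Return each peak's area as a percentage of the total peak area."""
    total = sum(areas)
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return [100.0 * area / total for area in areas]


def ms_scans_for_peak(scan_times, start, end):
    """Return indices of MS scans acquired within [start, end].

    `scan_times` must be sorted in ascending order; `start` and `end`
    are the peak's start and end points in the same time units.
    """
    lo = bisect_left(scan_times, start)
    hi = bisect_right(scan_times, end)
    return list(range(lo, hi))
```

For example, two peaks with areas 30 and 10 have purities of 75% and 25%, and a peak spanning 0.5 to 1.5 minutes selects only the scans recorded inside that window.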
Figure 3 - Chromatogram and Mass Spec Plots
Applying the principle of maximizing data ink [6], the HPLC and
MS plots were redesigned to display more information than the
simple scientific plots used in commercial tools. The axes on all
plots were replaced with range-frame axes with carefully selected
tick marks.
Signal strength is important for HPLC traces; too low or strong a
signal leads to incorrect purity results. The y-axis range-frame
starts with the minimum good value and ends with the maximum
observed value. If the signal is low, a single tick-mark with no
axis denotes the maximum value (Figure 1, middle plot). Thus, a
quick glance can tell a chemist if the signal was strong enough for
proper evaluation. Signals that are too strong lead to obviously
distorted traces and have no special marking. Often, all data prior
to a certain time will be excluded from analysis. The x-axis
range-frame spans only the time range used in processing and
includes a single tick mark showing the time for the currently
selected peak. Labels on the peaks denote the purity of each peak.
If the target compound was found for a given peak, its mass is
displayed alongside the purity value.
For an MS intensity to be useful, it should be above 20%. This is
displayed by the y-axis range-frame on the MS plot, which spans
20-100%. The x-axis range-frame spans the entire length of the
plot with ticks at either end displaying the min and max m/z
values. Sticks are labeled with the m/z value.
The peaks in the HPLC plot are dynamically linked to the MS
plot. Changing the endpoints of a peak or drawing a new peak
sub-samples the MS data in real time to display the mass spec for
the new peak.
All plots feature interactive panning, zooming, and arbitrary
value picking. Zooming is accomplished by drawing a rectangular
region around a plot area to define the new view or by scrolling
the ends of the PanBar controls (Figure 4). PanBars are similar to
Spotfire's Range Sliders [7] and allow both panning and zooming.
Originally, only the PanBars were available for zooming, but user
feedback led to the addition of the zoom box and a button in the
lower-left corner of the view that zooms out completely. If no
mouse button is pressed, the current x/y value below the mouse
cursor is displayed in the status bar in data coordinates.
Figure 4 – PanBars and Zoom Controls
6 Conclusions
Enlightenment is similar to commercial analytical chemistry
applications. However, careful analysis of the domain and the
chemist’s usage patterns has led to several enhancements. By
combining the functionality of multiple applications into one, we
have eliminated redundant features and provided better linking
among views. Using information visualization techniques, the
views build on familiar displays but show significantly more
information and allow chemists to draw conclusions more
effectively.
References
[1] Finnigan (2000). Xcalibur 1.2. [Software]
[2] Finnigan (2000). Xcalibur Discovery Browser 1.2. [Software]
[3] Waters (2003). OpenLynx Application Manager - Processing &
Reporting (Retrieved June 17, 2003). www.waters.com.
[4] Kosslyn, S. M. (1994). Elements of Graph Design. US: W. H.
Freeman and Company.
[5] Tufte, E. R. (2002). Visual Explanations. Conn: Graphics Press.
[6] Tufte, E. R. (2001, 2nd Ed.). The Visual Display of Quantitative
Information. Conn: Graphics Press.
[7] Spotfire, Inc (2001). Spotfire DecisionSite 6.3.0.349 [Software]