Remote Visualisation of Large Oceanic Datasets

LJ West, J Stark, PD Killworth, P Challenor, J Marotzke
Southampton Oceanography Centre
European Way, Southampton SO14 3ZH UK
Abstract
The global high-resolution ocean model, OCCAM, has been run by D. Webb
and colleagues at the Southampton Oceanography Centre (SOC) for many years. It was
configured to resolve the energetic scales of oceanic motions, and its output is stored at
the Manchester Supercomputer Centre. Although this community resource represents a
treasure trove of potential new insights into the nature of the world ocean, it remains
relatively unexploited for a number of reasons, not least of which is its sheer size.
Computer visualisation of datasets is a powerful way of presenting vast amounts
of information in a fashion accessible to the human mind. However, the lack of readily
available hardware and software tools amenable to the task means that, too often, it is
simply not an option.
Under discussion is a system being developed at SOC which makes the remote
visualisation of very large volumes of data on modest hardware (e.g. a laptop with no
special graphics capability) a present reality.
This system is enabling our researchers to investigate the unresolved question of
oceanic convection and its relationship to large-scale flows; a question which lies at the
heart of many current climate change issues.
1. Introduction
The use of large scientific datasets
continues to expose unforeseen bottlenecks,
pitfalls and other surprises throughout the
computer visualisation process (VP).
For example, at the beginning of the
process, it is frequently necessary to store 3-dimensional data fields in a ‘chunked’ format in
order to spread the burden of reading slices
evenly over the data array’s axes. This
innovation (supported by the Hierarchical Data
Format) vastly reduces the time taken to
perform the ‘average’ slicing operation of
OCCAM data, and reduces network traffic, too.
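As an illustration only (the file name, grid dimensions and chunk shape below are assumptions, not the actual OCCAM settings), a minimal sketch of creating such a chunked 3-dimensional dataset with the HDF5 C API might look as follows:

```cpp
// Minimal sketch: creating a chunked 3-D dataset with the HDF5 C API.
// Dataset name, dimensions and chunk shape are illustrative only.
#include <hdf5.h>

int main()
{
    // Hypothetical OCCAM-like grid: 36 levels x 1440 x 720 points.
    hsize_t dims[3]  = {36, 1440, 720};
    // Modest chunks spread the cost of slicing over all three axes.
    hsize_t chunk[3] = {8, 64, 64};

    hid_t file  = H5Fcreate("occam_temp.h5", H5F_ACC_TRUNC,
                            H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(3, dims, NULL);
    hid_t dcpl  = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 3, chunk);              // enable chunked layout

    hid_t dset = H5Dcreate(file, "temperature", H5T_NATIVE_FLOAT,
                           space, H5P_DEFAULT, dcpl, H5P_DEFAULT);

    H5Dclose(dset);
    H5Pclose(dcpl);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}
```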
An example from the middle of the VP
is the computation of isosurfaces – a staple for
consumers of 3D visualisations. It is a
computationally expensive task and, even
though it is highly parallelisable, it still cannot
be executed in real time (i.e. in the blink
of an eye), which can render some applications
non-interactive.
Another limit is being approached at the
end of the VP: the number of model elements
(grid points in the case of OCCAM) along one
axis of a dataset is becoming comparable to the
number of pixels on a high resolution display
device.
It is unlikely that computer-screen
resolutions will increase significantly in the
future because of the limited resolving power of
the human eye, but the relentless march of
Moore’s Law virtually guarantees that dataset
sizes will increase almost indefinitely.
Rendering problems such as this are
beginning to be addressed in a practical way by
open source software visualisation libraries such
as ‘VTK’ from Kitware, which provides a
facility for subsampling geometric objects.
Each of these problems is compounded
when the components of a computer
visualisation system exist in different
geographical locations. For example, the raw
data may reside on a GRID server in one
location, the processing cluster used to generate
the requested diagnostic fields may reside
elsewhere, the computer graphics (CG) engine
may be housed at a further site, and a scientific
end user (armed with only a modest PC and
network connection) could be working almost
anywhere in the world.
As part of the GODIVA project, a
prototype system of this kind is being developed
at the SOC, enabling remote users to process
and visualise vast amounts of OCCAM output
hosted by the GADS server at ESSC – a hitherto
impossible task.
The intermediate stages (scientific
processing and isosurface generation) are
performed by the SOC’s 12-processor, 24Gb
SGI Onyx300 graphics supercomputer,
‘proteus’. The final rendering is performed
locally or, optionally, by one of proteus’s 1Gb
graphics pipes if the client machine’s
graphics card is not up to scratch.
In the final stage, the generated
isosurfaces are cached on proteus, courtesy of
those 24Gb, allowing users to sweep through
isosurface values in real time, subject to
network constraints.
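The caching idea can be sketched as follows; this is a minimal illustration using VTK types, not the actual GODIVA code, and the class and its methods are hypothetical:

```cpp
// Minimal sketch of the caching idea (not the actual GODIVA code):
// isosurfaces are generated once on the server and kept in memory,
// so a sweep through isovalues only swaps pre-built geometry.
#include <map>
#include <vtkSmartPointer.h>
#include <vtkPolyData.h>
#include <vtkContourFilter.h>

class IsosurfaceCache
{
public:
    explicit IsosurfaceCache(vtkContourFilter* contour) : contour_(contour) {}

    // Return the cached surface for this isovalue, generating it on a miss.
    vtkPolyData* Get(double isovalue)
    {
        std::map<double, vtkSmartPointer<vtkPolyData> >::iterator it =
            cache_.find(isovalue);
        if (it != cache_.end())
            return it->second.GetPointer();     // cache hit: no recomputation

        contour_->SetValue(0, isovalue);
        contour_->Update();                     // expensive step, done once

        vtkSmartPointer<vtkPolyData> copy = vtkSmartPointer<vtkPolyData>::New();
        copy->DeepCopy(contour_->GetOutput());  // detach from the pipeline
        cache_[isovalue] = copy;
        return copy.GetPointer();
    }

private:
    vtkContourFilter* contour_;
    std::map<double, vtkSmartPointer<vtkPolyData> > cache_;
};
```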
In the following sections, the model
data and software system are outlined, before
preliminary results are presented, followed by a
short discussion of further work and technical issues.
2. OCCAM Ocean Model
The Ocean Circulation Climate
Advanced Model (OCCAM) uses two
curvilinear rectangular patches connected at the
Atlantic equator to span the world ocean. The
version used in the present work has 1/4°
horizontal resolution and 36 vertical levels.
Vertical contours of water-mass
properties such as temperature can indicate
convection, and vertical velocities, inferred
from divergence, may represent downwelling.
The question of whether these two processes are
co-located lies at the heart of many current
climate change issues.
The prognostic fields required to
investigate this matter, Temperature and
Horizontal Velocity, are downloaded along with
the configuration fields, Topography and Depth,
before being processed into vertical velocities
and passed on to the next stage.
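As a sketch of this inference (the exact OCCAM discretisation and bottom boundary condition are not reproduced here), the vertical velocity follows from the incompressible continuity equation:

$$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} = 0
\quad\Longrightarrow\quad
w(z) = w(-H) - \int_{-H}^{z} \left( \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} \right) dz',$$

where $H$ is the local depth and $w(-H)$ is taken to be zero over a flat bottom.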
The Temperature field is processed
into a sequence of isosurfaces (approximately
40) between two isovalues of interest, chosen a priori. Isosurface generation is a CPU-intensive
task, and the size of the resulting geometry
objects may vary considerably, because of the
connectivity information required to represent a
completely general, possibly multiply-connected
or highly convoluted region.
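A minimal VTK sketch of this generation step is given below; the reader class, file name and temperature range are illustrative assumptions rather than the actual GODIVA configuration:

```cpp
// Sketch of generating ~40 isotherms with VTK (illustrative only).
#include <vtkSmartPointer.h>
#include <vtkStructuredPointsReader.h>
#include <vtkContourFilter.h>
#include <vtkPolyDataMapper.h>
#include <vtkActor.h>

int main()
{
    vtkSmartPointer<vtkStructuredPointsReader> reader =
        vtkSmartPointer<vtkStructuredPointsReader>::New();
    reader->SetFileName("occam_temperature.vtk");    // hypothetical file

    vtkSmartPointer<vtkContourFilter> contour =
        vtkSmartPointer<vtkContourFilter>::New();
    contour->SetInputConnection(reader->GetOutputPort());
    // Roughly 40 isotherms between two temperatures of interest (deg C).
    contour->GenerateValues(40, 0.0, 20.0);
    contour->Update();                               // CPU-intensive step

    vtkSmartPointer<vtkPolyDataMapper> mapper =
        vtkSmartPointer<vtkPolyDataMapper>::New();
    mapper->SetInputConnection(contour->GetOutputPort());

    vtkSmartPointer<vtkActor> actor = vtkSmartPointer<vtkActor>::New();
    actor->SetMapper(mapper);
    return 0;
}
```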
Caching these surfaces allows the user to flick through the
isovalues and get a feel for the structure of the
3D field; but why, it could be asked, is this
better than, or even different from, scrolling through
a movie-style visualisation?
There are a number of reasons: Firstly,
the cached object can be manipulated, interacted
with and viewed from different angles, unlike a
movie frame. Secondly, other objects, sheets of
vertical w-velocities, for example, can be added
to the scene. The total number of different scene
configurations, then, is n^m rather than n × m, as it
would be in the case of a movie, where n is the
number of cached isovalues and m is the number
of objects in the scene. (With n = 40 cached isovalues and m = 2 objects, for example, caching offers 1600 possible scenes, whereas a movie offers only 80 frames.)
3. System Details
Much effort went into choosing an
appropriate software platform for such a tool,
although when all the requirements were
gathered, the number of technologies fit for the
task was small.
The software had to be 64bit (to cope
with the sheer size of the data, and access
proteus’s address space), multithreaded (for
future parallelisation), and fast (for efficient
processing). Other desirables were platform
independence and open source, for portability,
distribution and accessibility to the wider
scientific community.
C++ was adopted because of its speed,
ubiquity and descriptive power (i.e. it takes
fewer lines of code to do the job).
The GSOAP library provides access to
web services, although it is only distributed in a
32bit format at present, which inhibits the use of
very large files, so today’s demonstration uses a
locally cached server file. ‘Locally’, in this
sense, means local to the processing server, i.e.
at Southampton, and definitely NOT local to the
client machine here in Nottingham. The 32bit
problem will be discussed in a later section.
The Kitware VTK library supplies high-level
visualisation functionality and supports a
number of useful features, such as parallel
processing of isosurfaces and subsampling of
geometry objects by implementing level-of-detail actors.
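A hedged sketch of how these facilities might be combined is given below; the reduction factor and cloud-point count are illustrative, and the helper function is hypothetical:

```cpp
// Sketch of the VTK subsampling facilities mentioned above: a decimated
// copy of an isosurface, and a level-of-detail actor that falls back to a
// point cloud while the scene is being manipulated.
#include <vtkSmartPointer.h>
#include <vtkAlgorithmOutput.h>
#include <vtkDecimatePro.h>
#include <vtkPolyDataMapper.h>
#include <vtkLODActor.h>

vtkSmartPointer<vtkLODActor> MakeLODActor(vtkAlgorithmOutput* isosurfacePort)
{
    // Reduce the triangle count by ~80% while preserving topology.
    vtkSmartPointer<vtkDecimatePro> decimate =
        vtkSmartPointer<vtkDecimatePro>::New();
    decimate->SetInputConnection(isosurfacePort);
    decimate->SetTargetReduction(0.8);
    decimate->PreserveTopologyOn();

    vtkSmartPointer<vtkPolyDataMapper> mapper =
        vtkSmartPointer<vtkPolyDataMapper>::New();
    mapper->SetInputConnection(decimate->GetOutputPort());

    vtkSmartPointer<vtkLODActor> actor = vtkSmartPointer<vtkLODActor>::New();
    actor->SetMapper(mapper);
    actor->SetNumberOfCloudPoints(10000);  // coarse LOD used during interaction
    return actor;
}
```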
The entire development environment is
open source and has been constructed on both
32- and 64-bit platforms, and is suitable both for PC
users and for users of large datasets (i.e. >2Gb).
The Silicon Graphics Vizserver
software allows post rendered graphical output
from proteus to be piped across the internet, and
supports a number of compression modes up to
a ratio of 32:1. This enables remote users to
view and manipulate highly detailed and
complex scenes by utilizing the full graphics
capabilities of proteus at Southampton.
4. Results
Consider figure 1. It shows the CPU
and memory usage of an end user ‘client’
machine without using Vizserver. In other
words, the client machine is responsible for
displaying the geometry information received
across the internet. Naturally, the performance
of the application will depend very much on the
power of the graphics card, and this is revealed
in the figure.
Here, the client machine has an
NVIDIA GeForce4-4800 Titanium (384Mb)
graphics card and dual AMD Athlon processors,
2Gb of main RAM and is running Red Hat 9 –
hardly low powered for a PC.
The earlier CPU history (not shown) is
steady and the levels are approximately equal to
those at the left of the CPU Usage and %
Memory Usage graphs respectively. If the
Vizserver software is used, very little changes in
these graphs, except for a small amount of CPU
usage incurred by manipulating the scene.
Unsurprisingly, this system monitor
profile reflects the activity of proteus.
Clearly, there are spikes of activity in
the CPU Usage History graph. These spikes
reach halfway up the scale, suggesting that the
client window is running on one processor (50%
CPU on a dual CPU machine). This is indeed
the case. It also appears that their frequency is
increasing slightly.
Between the spikes, CPU usage levels
are the same as before the application begins to
run, suggesting that the client box is doing
nothing as it waits for more information from
the main server.
The earlier peaks are barely noticeable,
whereas the later ones overwhelm the single
CPU, albeit briefly. This increase in CPU usage
is heralded by a gentle increase in memory
usage.
If the Vizserver software is used, these
profiles remain uninterestingly flat for practical
purposes, and so are not shown here, but this
indicates that the variations in CPU and memory
usage are due entirely to the changing contents
of the graphics window.
As has been stated, the generation of
isosurfaces is CPU intensive, and the process is
being executed on a single processor of the
server, proteus. This bottleneck corresponds to
periods of inactivity on the client machine – the
flat patches between spikes. Once a surface has
been generated, however, a call to render it
sends the geometry object ‘down the wire’ to
the client which takes responsibility for
rendering on receipt.
At first, the client machine’s memory is
effectively empty, and the geometry object is
simply passed through to the graphics card,
whose activity is not recorded in the System
CPU Usage History (which is one of the
reasons for having a graphics card in the first
place).
Eventually, it becomes clear that even
384Mb of texture memory is not sufficient to
cope with a scene of this complexity, and
subsequent geometry objects must be stored in
slower main memory, whose allocation is
evidenced by an incline in the % memory usage
graph.
The CPU must now take part in the
graphical process, as the graphics card begins to
make demands on main memory at every turn.
This is the reason for the increase in the size of
the CPU Usage spikes.
Figure 1. System load under local rendering.
By the time all the surfaces have been
loaded and rendered, and before the scene has
even been interacted with, approximately 1.5Gb
of graphics data have been transferred from
proteus to the client, which is clearly impractical
over a low-bandwidth connection
(~56Kbps), or for a machine with a low-powered
graphics card, a small memory, or
one that is just downright slow.
It is important to remember here that,
even if the user were operating proteus locally,
the sweep through isosurfaces could
not be performed in real time (i.e. interactively),
even by utilizing many processors. It is caching
that achieves this, at the cost of a longer start-up
time.
The increasing frequency of the
spikes is due to the nature of the
temperature data on the interval of interest.
Colder isosurfaces are generated before warmer
ones, i.e. to the left of the history graph. The
colder surfaces tend to span the ocean in a
stratified manner and so yield global (and
therefore large) geometries, but the warmer
ones outcrop at the surface and frequently
consist of only a few isolated blobs, and
therefore have a smaller memory footprint.
Figure 2 shows a temperature
isosurface intersecting a ‘hedgehog’ of vertical,
w-velocity arrows in the North Atlantic. Large
outliers in w can be seen as huge arrows
emerging from the surface. These occur at the
coast, where the water has nowhere to go
but vertically, because w is inferred
from the horizontal divergence. The w-field is
notoriously difficult to compute accurately, due
to the cancellation of large and similar terms.
5. Further Work
Isosurface Parallelisation
Parallelisation of isosurface generation
will improve results for users of higher
bandwidth connections and higher-performance
graphics cards, as this will reduce the length of
flat spots in CPU usage for these users. VTK
supports this kind of parallelisation very well.
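One possible data-parallel scheme (a sketch only, not the GODIVA implementation) is for each MPI process to extract and contour its own slab of the volume; the file name and isovalue are assumptions, and gathering of the resulting pieces is omitted:

```cpp
// Sketch of data-parallel contouring: each MPI rank contours one slab.
#include <vtkSmartPointer.h>
#include <vtkMPIController.h>
#include <vtkStructuredPointsReader.h>
#include <vtkExtractVOI.h>
#include <vtkContourFilter.h>

int main(int argc, char* argv[])
{
    vtkSmartPointer<vtkMPIController> controller =
        vtkSmartPointer<vtkMPIController>::New();
    controller->Initialize(&argc, &argv);

    int rank  = controller->GetLocalProcessId();
    int nproc = controller->GetNumberOfProcesses();

    vtkSmartPointer<vtkStructuredPointsReader> reader =
        vtkSmartPointer<vtkStructuredPointsReader>::New();
    reader->SetFileName("occam_temperature.vtk");    // hypothetical file
    reader->Update();

    int extent[6];
    reader->GetOutput()->GetExtent(extent);

    // Split the vertical (k) axis into one slab per process,
    // assuming nproc is no larger than the number of levels.
    int kmin = extent[4], kmax = extent[5];
    int slab = (kmax - kmin) / nproc + 1;
    int lo   = kmin + rank * slab;
    int hi   = (rank == nproc - 1) ? kmax : lo + slab;

    vtkSmartPointer<vtkExtractVOI> voi = vtkSmartPointer<vtkExtractVOI>::New();
    voi->SetInputConnection(reader->GetOutputPort());
    voi->SetVOI(extent[0], extent[1], extent[2], extent[3], lo, hi);

    vtkSmartPointer<vtkContourFilter> contour =
        vtkSmartPointer<vtkContourFilter>::New();
    contour->SetInputConnection(voi->GetOutputPort());
    contour->SetValue(0, 10.0);      // illustrative isotherm (deg C)
    contour->Update();               // each rank contours only its slab

    controller->Finalize();
    return 0;
}
```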
GUI Toolkit
Use of Trolltech’s QT widget library is
popular and has proved successful with other
GODIVA partners. Drawing on the group’s
expertise, QT has been adopted at SOC and will
be used in subsequent GUI development.
GSOAP32 issues
64bit versions of libraries are
increasingly common, but are far from
ubiquitous. It is unfortunate that the GSOAP
library, necessary to access the ESSC GADS
webserver, is distributed in a 32bit format only
at present.
There are two possible ways to
circumvent this problem. The preferred solution
would be to obtain a 64bit version, either from
the GSOAP community or by minimal in-house
development.
Alternatively, a separate 32bit server
component could be developed which reads
data in <2Gb chunks and streams them
through a socket to a software tool derived from
the current 64bit application.
The ACE communications toolkit
appears to be ideal for the purpose; it is available
in both 32- and 64-bit versions and is also used
successfully by other GODIVA partners.
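A hedged sketch of such a client using ACE is given below; the host name, port and framing protocol are assumptions:

```cpp
// Sketch of the proposed workaround: a 64-bit client pulls a field from a
// separate 32-bit server process in sub-2Gb chunks over a socket via ACE.
#include <ace/INET_Addr.h>
#include <ace/SOCK_Connector.h>
#include <ace/SOCK_Stream.h>
#include <vector>

int main()
{
    ACE_INET_Addr server("dataserver.soc.soton.ac.uk:9000"); // hypothetical
    ACE_SOCK_Connector connector;
    ACE_SOCK_Stream stream;

    if (connector.connect(stream, server) == -1)
        return 1;                                 // could not reach the server

    const size_t chunkBytes = 64 * 1024 * 1024;   // well under the 2Gb limit
    std::vector<char> buffer(chunkBytes);
    std::vector<char> field;                      // reassembled in 64-bit space

    // Assumed protocol: the server streams the whole field and then
    // closes the connection to mark the end of the transfer.
    ssize_t n;
    while ((n = stream.recv(&buffer[0], buffer.size())) > 0)
        field.insert(field.end(), buffer.begin(), buffer.begin() + n);

    stream.close();
    return 0;
}
```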
Figure 2. z-plane of w-velocity cutting through an isotherm in the North Atlantic.