>> Aaron Smith: Okay. So I think we can get started now. Sorry for the
interruption or the delay. Yeah. So I'm Aaron Smith from Extreme Computing
Group. And it's my pleasure today to host Paul Navratil from TACC, Texas
Advanced Computing Center. So Paul and I have known each other for over a
decade.
So I remember when he was using small displays. And single cores. So at
TACC he manages the visualization software group, and he's visiting today because of SC, so I talked him into coming over here to give us a talk. He's going to tell us about the new display technology and the software they've been building. Thanks for coming.
>> Paul Navratil: Thanks, Aaron. Thanks to all of you for being here and
watching on the Web. I'm Paul Navratil, part of the visualization group at TACC,
the Texas Advanced Computing Center. It's part of the Vice President for Research's portfolio at the University of Texas.
And we have a mission to serve the advanced computing needs of the university, the UT System, and, primarily through NSF funding, the national cyberinfrastructure.
So what I'm going to talk about today is some of the challenges we face as a
visualization group at TACC and some of the ways we're solving those
challenges both in our remote computing platforms and in our local large format
displays.
>>: Tell us how big you are and --
>> Paul Navratil: In terms of TACC?
>>: Number of people and --
>> Paul Navratil: I don't have slides for that, but off the top of my head we're about 100 people, almost all at UT Austin at our research campus about ten miles north of main campus.
We do have a few folks in remote offices, at the Office of Naval Research and down at the UT medical branch in Houston. So we have just the beginnings of branching outside. But otherwise we're divided into a visualization group, high performance computing, advanced systems that keep the big iron running, user support, and outreach.
And so as part of our mission, our director, Jay Boisseau, really has an emphasis on
giving back to the community, not only the scientific community, but also the
larger community.
And both to inform the public and to inspire the next generation of scientists that
will take our places. Okay. So basically what I want to talk about is how we're
using the cluster technology that predominates high performance computing
today and applying that technology to visualization. Because as you'll see, once a simulation runs on a very large cluster, 100,000 cores, for instance, the data it produces really can't go anywhere else. So that cluster has to become the visualization and analysis instrument.
Now I'll talk a little more later about what that looks like in terms of what we do.
So I'm also going to contrast the visualization workflow with traditional high
performance computing. If you think about, say, a massively parallel fluid solver, you can just divide up the domain evenly, or, say, if you're doing an ionization study of dark matter, you can take each of your stars that emits the ionization and duplicate the entire universe that you're studying across your cluster.
But for visualization, the computational workflow looks different and the results
and the demands placed on those computations are different as well. And we'll
talk a little bit more about what that means.
Then I'll also describe some of the solutions that we're pursuing at TACC. Of
course, I'm biased towards the ones I'm most familiar with, so you'll see a lot of the software and some of the hardware that I've helped design. And I'll talk a little bit at the end about the motivation for this work on future clusters.
Okay. So let's look at what a typical HPC workflow looks like. Say you have
some parameters that you input into an equation. So this could be, say, the
initial conditions for a weather simulation: temperature, cloud density, and fluid flow in the atmosphere. You might have some initial conditions, starting points, and you
ship those into your super computer and run a simulation and then you get some
results, typically in the form of some graphs or maybe just even a single number
or set of numbers, and you might also get some time steps out that you either feed back as new initial conditions or do later analysis on.
And so most of your work happens at the machine, in terms of inter- and intra-process communication. That might take the form of MPI, maybe Pthreads, maybe some combination of both. Nowadays you're also talking about hybrid technology, so GPU computing. Intel is making big announcements about MIC, their Many Integrated Core project that came out of the Larrabee development, if you're familiar with that work.
So visualization workflow takes these time steps that were typically generated
from some simulation. They could also be from an instrument, say an MRI scan or an electron microscope. You run visualization algorithms on some high
performance or advanced computing hardware. And then you get some
geometry, for instance, that you're actually going to render into a picture.
You then have to feed that geometry back into some sort of hardware to perform
rendering that creates the pixels of the images that you create. And then you
have to display those pixels somewhere. And that can be an online process for
interactive visualization, or an offline process if you're rendering frames to create a movie.
And typically this process is iterative. So, say, you're trying to do a visualization
and you don't like the technique you used, maybe you used isosurfacing and want to do volume rendering instead, so you change the way you're rendering the
geometry. Maybe you didn't like the color palette you used, so you have to
create a different image. Maybe you have point data and you need to resample it
into a grid so that you can use more visualization algorithms on it. So you
actually have to manipulate the data itself.
Then you typically also have a process where you're creating your rough
visualizations just for your own edification, then you polish it up a little bit and
show it to your colleagues. Then you polish it even more and put it in your
publications or into your talks.
So there's another axis of iteration that comes out of the screen. So let's -- what
does this communication pattern look like? And I put this in quotes, because I mean inter-process and intra-process communication similar to HPC, but I also mean the communication that a human gives to the simulation or to the process
and that comes back to the human, because there's typically an interactive
component to this.
So there's the same inter and intra process communication. There's interactive
algorithm manipulation, think of changing the isovalue in an iso surface, and
there's also interactive display of the data. So all of this process typically has to
happen in, say, a 20th of a second for just interactive or a 60th of a second if you
want really high performance interaction.
So the algorithms tend to be more demanding on the hardware than a typical
HPC algorithm for a couple of reasons. First, the calculations tend to be more
irregular. There are data-driven calculations; think of traversing a tree or searching for
a particular isovalue in a dataset. You also generate new data.
So if you're doing that weather simulation, the region of the country you're
simulating never changes. Right? You may change the gridding of it. But once
you've formed that grid it's constant through the calculation. In visualization
algorithms you're returning new geometry or you're generating images that you'll
return.
So you have to buffer that data in memory. Also you're interacting with someone
controlling the algorithm. And controlling the calculation. And so if you're
changing that isovalue, then your parameters change and you have to
recalculate very quickly. And the users expect interactive response. They can tolerate down to, say, ten frames a second. But if you start getting into seconds per frame, where they have to go get coffee or check their e-mail between frames, you've lost them. And there's also interactive display of
data and particularly for large simulations as the pixel count grows, it's more and
more difficult to ship those pixels in a meaningful way to a display.
Okay. So let me give you just a quick example if you're not familiar with
visualization algorithms. Some work that we've done in GPU-based isosurfacing.
An isosurface means you're looking for a surface of constant value in the dataset. Think of the
temperature lines on a weather map extrapolated to 3-D. And a classic way to
do this is with marching cubes where you find where the value crosses in a cube
and then there's a lookup table to determine how the geometry is placed.
And so, say, if our value crossed edges 4-7, 4-5 and 4-0, then we go back to the lookup table and place the triangle at the crossing points through trilinear interpolation.
Okay. So the way this is done on the GPU is actually a three-step process.
They classify the voxels, find out which voxels have your data value and which
don't. You do a scan to determine where those voxels are, and then you
compact them into an active voxel list and generate the triangle. The challenge
here is that you don't know how many triangles you're going to generate ahead of
time. In a CPU-based algorithm, that's fine; you just create a new vector, maybe an STL vector, and push onto the end of it. On the GPU, that's harder, because until recently you couldn't allocate new memory on the device side. In OpenCL that's still true; CUDA now gives you the ability to do that.
But either way you have to buffer some of your available memory for the data
that's going to result.
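To make that three-step pattern concrete, here is a minimal CPU-side sketch of the classify, scan, and compact stages using NumPy. The grid, the isovalue, and the array names are illustrative only; on a GPU each stage would be a parallel kernel, with the output buffer sized up front because the triangle count is not known in advance.

```python
# Illustrative classify / scan / compact sketch (NumPy stand-in for GPU kernels).
# The scalar field and isovalue are made up; only the pattern matters here.
import numpy as np

rng = np.random.default_rng(0)
field = rng.random((32, 32, 32))              # placeholder scalar field
isovalue = 0.5
nx, ny, nz = field.shape

# Classify: a cell is "active" if its corner values straddle the isovalue.
corners = np.stack([field[i:nx - 1 + i, j:ny - 1 + j, k:nz - 1 + k]
                    for i in (0, 1) for j in (0, 1) for k in (0, 1)])
active = (corners.min(axis=0) <= isovalue) & (isovalue <= corners.max(axis=0))

# Scan: an exclusive prefix sum gives each active cell its slot in the output.
flags = active.ravel().astype(np.int64)
offsets = np.cumsum(flags) - flags
n_active = int(flags.sum())

# Compact: gather active cell indices; a real implementation would then apply
# the marching-cubes lookup table and write triangles at the scanned offsets.
active_cells = np.flatnonzero(flags)
print(f"{n_active} active cells; first output slots: {offsets[active_cells[:5]]}")
```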
And so you can see that the execution time is really bad in this space, one, because of the multi-pass process and, two, because of the amount of global memory that you're accessing in that compaction step. And so one of the things we've done -- whoops, that went back, sorry -- this is just summarizing the issues that I mentioned previously: you don't know how many triangles you're creating initially, but also you're generating the triangles in parallel and writing them to a single buffer. So the massive number of threads on a GPU have to synchronize down into that single buffer.
single buffer. So think of it as a reduction but with more data.
So then what we've done is take a classic approach, you can think of it as an octree approach from graphics: instead of dealing with the individual cells, we create metacells that help filter out regions of the data that don't contain the isovalue of interest. So a step that took ten milliseconds we can reduce to three, for about a two and a third times speedup. And also, because we're operating on fewer cells, the classification step becomes a little faster.
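Roughly, the metacell filter amounts to precomputing a min/max range per block of cells and skipping blocks whose range cannot contain the isovalue. The block size and the smooth placeholder field below are assumptions for illustration, not the configuration used in the actual work.

```python
# Metacell-style culling sketch: per-block min/max ranges prune regions that
# cannot intersect the isosurface. Block size and field are illustrative, and
# cells straddling block boundaries are glossed over in this sketch.
import numpy as np

n, B = 64, 8                                   # grid edge and metacell edge
axis = np.linspace(-1.0, 1.0, n)
x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
field = np.sqrt(x**2 + y**2 + z**2)            # smooth placeholder scalar field
isovalue = 0.5                                 # a sphere of radius 0.5

blocks = field.reshape(n // B, B, n // B, B, n // B, B)
block_min = blocks.min(axis=(1, 3, 5))
block_max = blocks.max(axis=(1, 3, 5))

# Only blocks whose [min, max] range straddles the isovalue need the expensive
# per-cell classification; the rest are culled outright.
candidates = (block_min <= isovalue) & (isovalue <= block_max)
print(f"classify {candidates.sum()} of {candidates.size} metacells")
```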
But so I hope that that motivates just a little bit. We can talk more about that
work afterwards. I know I've just given it quick treatment. But basically the
take-aways are that we need more memory than HPC algorithms do, and the amount of data we need is dynamic, or can be. And the large problems need significant
computational resources that may be larger than what is evenly divisible on a
single node.
For instance, we have done some work with dark matter simulation where each
individual time step is 650 gigabytes. And that is the memory footprint for the
HPC simulation itself. For the additional structures that the visualization
algorithm needs, that expands to three terabytes. And some of that is because the visualization toolkit VTK is only now becoming concerned with memory efficiency. But some of that is just that you need the extra space to allow the algorithm to work.
So what we found is that when we first started working on this problem, our largest visualization resource had only a terabyte of memory. So we had to go back to the HPC resource, but we were now allocating nodes for memory instead of for processing power. And because libraries like VTK aren't multi-threaded, we would allocate the 32 gigabytes per node on Ranger, one of our machines, and 15 cores would sit idle. So there's definitely an opportunity, and this is some of what we're working on at TACC: to parallelize this so we can make use of those cores even when we're allocating for memory.
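As a back-of-the-envelope illustration of allocating for memory rather than cores, using the figures just mentioned: a roughly three-terabyte working set, 32 gigabytes per Ranger node, and 16 cores per node (consistent with the 15 idle cores above), with a single-threaded library.

```python
# Rough arithmetic for memory-driven node allocation on a Ranger-class machine.
working_set_gb = 3 * 1024        # ~3 TB needed by the visualization structures
node_mem_gb = 32
cores_per_node = 16

nodes = -(-working_set_gb // node_mem_gb)      # ceiling division -> 96 nodes
cores_allocated = nodes * cores_per_node
cores_busy = nodes                             # one single-threaded process each
print(f"{nodes} nodes, {cores_allocated} cores allocated, "
      f"{cores_busy} busy ({cores_busy / cores_allocated:.0%} utilization)")
```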
Okay? Okay. So the original solution to all this was to move the data to a
separate machine to do VIZ interactively. For instance, some datasets were
small enough that you could move to your own desktop or to your laptop.
However, as we're getting larger datasets, moving them off the machine where
they're generated is becoming untenable. So even if you have a ten gigabit connection, for a terabyte, go get a cup of coffee; for a petabyte, go on vacation, right?
If you have to do this over wireless, forget it, it's just not going to happen. We
today even work on datasets where it's easier to put the data on physical drives
and mail them rather than try to do the transfer.
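The coffee-versus-vacation comparison comes straight out of the arithmetic. A rough sketch, assuming the ten-gigabit link runs at full line rate, which real transfers rarely reach:

```python
# Idealized transfer times over a 10 Gb/s link (real transfers are slower).
def transfer_seconds(num_bytes, link_gbps=10):
    return num_bytes * 8 / (link_gbps * 1e9)

print(f"1 TB: about {transfer_seconds(1e12) / 60:.0f} minutes")
print(f"1 PB: about {transfer_seconds(1e15) / 86400:.1f} days")
```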
Okay. So this is only going to get worse, because as the machines grow, the
datasets grow but the disk technology and the network infrastructure isn't there
and this suffers from a last mile problem.
So at UT we have ten gigabit connecting our machine room to the main campus,
but if I try to send that to UT El Paso it's going to pass through a thin pipe and
that's only as fast as that transfer is going to go.
Okay. So what we've done is we're moving the visualization resource to at least
the same machine room and ideally moving forward on the same machine.
So our first system like this was built in 2008: Spur, which was an attachment to Ranger, with eight nodes, 32 GPUs, and a terabyte of aggregate RAM. And each node had 128 gigabytes of RAM, because there are still some legacy shared memory codes that need that much RAM.
Longhorn is a machine we just put out in 2010. 256 nodes. 512 GPUs and 13
and a half terabytes of aggregate RAM.
And because it's in the same machine room, we can put a very high bandwidth connection to the Ranger file system and operate on the data without moving it. And now our Lone Star system, which we also released in 2010, right now has eight nodes, each with two GPUs, but we're going to expand that by the end of the year to 72 nodes with GPUs. So we have a smaller version of Longhorn sharing the same interconnect and disk as the larger nodes of Lone Star. And Lone Star is a 20,000 core machine.
Our new machine that was just announced, Stampede, is a ten petaflop resource to be released in 2013, and yes, everything has a Texas tie-in to the naming scheme. This will have 128 nodes, each with, probably, a Kepler GPU from NVIDIA, and also each node will have an Intel MIC across the entire system, which will be on the order of thousands of nodes. So that will be interesting to experiment with.
And so for problems that don't fit on the vis subsystem, we use software rendering and move the work back to the HPC cluster. Okay. So what about shared
memory?
There's some in the community that think shared memory is and always will be
necessary for visualization. We've done the experiment ourselves. And Spur was actually a replacement for Maverick, which was a 512 core -- I'm sorry, 1K core, half a terabyte shared memory machine. And what we found is that we were able to build Spur with more capability, for cheaper, and it was easier to maintain. And utilization of Spur was actually much higher than that of the shared memory machine Maverick.
And the nice thing is that if you go to a distributed memory model, you can get
much more aggregate RAM than is possible on a single machine today. So you
can get an order of magnitude more. And there are single nodes you can get now, which Lone Star and Stampede will both have, that have a terabyte of RAM. So
you can still -- you can still get a significant shared memory resource even in a
cluster environment.
>>: Can you give us sort of the one-liner of why shared memory fell through?
Because this is great evidence that it did but...
>> Paul Navratil: Sure. I think because the environment is harder to control, because underneath you either have NUMA access that you don't have control over, or you're slowing everything down to the least common denominator.
And it's ultimately a shared resource.
In our clusters we can give everybody exclusive access to their set of nodes
whereas on a shared machine everyone's playing in the same sandbox.
>>: Okay.
>> Paul Navratil: And so also this is just following trends. Vis machines have always followed the path of HPC machines. And clusters have definitely won out. I think 480 some odd of the top 500 are cluster machines, and that's just going to grow. And we're trying to bring the community into that fold. Okay. So there are some tools out there, toolkits like ParaView and VisIt, two large shared memory -- I'm sorry -- large open source platforms based on VTK to do visualization.
And what they've done is they use a Fat Client model where the geometry is
generated on a server and then shipped to the client. And some of them, VisIt in particular, try to be smart about when to ship pixels versus geometry. But what we've seen is that the data traffic, especially doing managed communication within the software, can be too high for low bandwidth conditions.
And also the connection options are still lagging a bit behind. Sometimes they
just assume a large shared memory system that you're connecting to even
remotely.
What we've done instead is push everything server side, and now we just use a thin VNC client to interact with the server. And that's been successfully used on TACC machines from literally half a world away. We have collaborators in
the Gulf states working on our visualization system remotely. And there's
definitely latency in that model but you're going to experience the latency either
way and this allows the computation to move forward while just moving keyboard
and mouse movements from the user and pushing pixels back.
And with the new VNCs they're doing smart things like only updating the changed region of the window rather than pushing an entire window across.
And so this really minimizes the bandwidth, and if you want we have a full-featured GNOME or KDE on the back end, or I just tend to use twm, which is ancient and spartan but gets the job done, and it minimizes even the overhead of the windowing system.
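A rough sketch of the bandwidth argument behind that thin-client approach; the triangle count, frame size, frame rate, and compression ratio below are illustrative assumptions, not measurements from TACC's systems.

```python
# Shipping geometry to a fat client versus streaming compressed pixels to a
# thin client. All figures here are assumptions for illustration.
triangles = 20_000_000                    # a modest isosurface from a large run
bytes_per_triangle = 3 * 3 * 4            # three vertices, xyz as 32-bit floats
geometry_gb = triangles * bytes_per_triangle / 1e9

width, height, fps = 1920, 1080, 10       # remote desktop session
raw_bytes_per_sec = width * height * 3 * fps
compression = 20                          # assumed VNC-style encoding gain
pixel_mbps = raw_bytes_per_sec * 8 / compression / 1e6

print(f"geometry transfer per update: ~{geometry_gb:.1f} GB")
print(f"sustained pixel stream:       ~{pixel_mbps:.0f} Mb/s")
```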
So let me show you just briefly -- I won't play this entire video. But this is our
Web-based interface onto Longhorn. It's called EnVision. And so someone, through the Web, can either get that VNC window I mentioned or go through an interview process to do their own visualization. And this version, I think, is VTK-based, but it has tie-ins where we can use other rendering solutions. So you get the idea that this all happens, again, in your browser. And this is a mummy MRI that they're playing with.
>>: How much is happening [inaudible] and how much is VTK window?
>> Paul Navratil: This is actually all on the browser. So VTK is rendering server
side and the pixels are being shipped.
>>: Okay.
>> Paul Navratil: So it's not even a VTK window. So let's talk about image display, because generating the images is only part of the solution. Your analysis of those images is still limited by the pixel count on the display you're using. If you have an electron microscope that's producing an image that's 25,000 pixels by 25,000 pixels, you're either zooming and panning in something like a Bing Maps interface, where you zoom in to see what you want to see, or pulling back. But you have to trade off context for detail.
There are other images, too. NASA Blue Marble has a 3.4 gigapixel image of the entire earth, at half a kilometer of resolution per pixel. The Google Art Project has large scans; each museum in the project has donated a piece of art to be scanned at high resolution, and one of those scans holds as much data as the entire earth scanned at a kilometer of resolution per pixel. Very high resolution. And again, the electron microscopy is at half a gigapixel already. So it's nice to have both resolution and size: resolution to see the details, size to see the context.
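To put numbers on that detail-versus-context tradeoff, here is a small sketch comparing how much of a 25,000 by 25,000 pixel scan is visible at full detail on a single four-megapixel monitor versus on a 307-megapixel wall; the 2560 by 1600 per-panel resolution is an assumption consistent with those totals.

```python
# Fraction of a large scan visible at 1:1 zoom on displays of different sizes.
image_px = 25_000 * 25_000                 # electron microscope scan
desktop_px = 2_560 * 1_600                 # one 30-inch, ~4-megapixel monitor
wall_px = 75 * 2_560 * 1_600               # ~307-megapixel tiled display

print(f"single monitor: {desktop_px / image_px:.1%} of the scan at full detail")
print(f"tiled wall:     {wall_px / image_px:.1%} of the scan at full detail")
```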
So we have multiple display technologies in our visualization lab that allow you to choose the technology that's right for you. And this is a brief view of the Vis Lab from the Longhorn Network. This is all of the Longhorn Network I've seen; I don't know if you're following that saga, but apparently there's a neighborhood in Dallas that has the Longhorn Network and Austin doesn't.
That was our 307 megapixel display. That's a 12 megapixel touch display. We
actually have that touch display in our booth at SC. If you have an opportunity to
go to the exhibitors fair, it's at Booth 223.
So this is just a minute of it. You can see high resolution photography. Actually, the lab has allowed us to expand beyond the traditional STEM fields: science, technology, engineering, and mathematics. We've got fine arts, architecture, and humanities in here. We've had artists build pieces for the Vis Lab. And what we found is that by working with the artists, they have the vision of how it's supposed to look and they challenge our technology to reach it.
Then we can take that new ability and bring it back to the scientists and
engineers to expand what they can do. So we find it's a virtuous cycle to work
with these folks.
This is a 3-D display. This is just an 82-inch commodity television driven by a Quadro graphics card. This lab replaced a Barco projector solution; the lab is 2,900 square feet, and in the space that that single projector display took up, we have a meeting room and six high tech displays now.
So we still have a projector, but we also have so much more that we built and maintain ourselves, and the power of the commoditization of this hardware has really come to the fore. Okay. So this was our first tiled display, Colt, three by three. These are 30-inch LCD monitors, and we built the frame ourselves. With any drafting program that gives you actual measurements, you can design it and send it to a company called 80/20, an industrial erector set. They'll cut it, send it back to you, and anodize it any color you want, as long as it's black. This is the 307 megapixel
display. That's showing Mars in the background, plus some visualizations on top
of it.
The nice thing about this is you can show either very large data, or you can show multiple views of data or correlated data.
And so you can have everything up at once rather than having to switch between
windows on a smaller display. Our work here has also started to expand beyond just our center. We now run a lab in the College of Education at UT that allows them to work on the information visualization problems we have a partnership on, such as the test scores for every student in California in 1995.
So those are the types of problems they're trying to visualize. We've also
consulted on tile displays at UT San Antonio, UT El Paso, and I believe also in
Colorado. So that's broadening out.
Again, these are all maintained by TACC staff and use commodity equipment, so also at a fraction of the cost. This is a schematic of Stallion. You don't have to read it. The take-away is that it's 75 30-inch displays. There's a three by five hot spot, just by a feature of how the hardware worked out. Outside that hot spot, each GPU drives two displays; inside the hot spot, each GPU drives one.
So there's a notable rendering performance boost in that hot spot. There are 23 workstations, and at the time this was built, in mid 2008, there were no server class machines that contained GPUs. It was really at the birth of the GPGPU revolution, so we just got gaming boxes; these are G80 GeForce machines driving it. If we did this today, we would use server rack-mounted machines that contain GPUs and we'd use Quadros. We have SDR InfiniBand just to have high bandwidth among them, and that's it. We also have a five terabyte file system that's in the process of being expanded to about 50 terabytes.
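As a sanity check on those numbers, the aggregate resolution and the GPU count implied by the hot-spot layout work out as follows, assuming 2560 by 1600 pixels per 30-inch panel (the talk does not state the per-panel resolution or the exact GPU count).

```python
# Aggregate resolution of Stallion and the GPU count implied by the mapping
# described above; the per-panel resolution is an assumption.
panels = 75
panel_px = 2560 * 1600
print(f"aggregate resolution: {panels * panel_px / 1e6:.0f} megapixels")  # ~307

hot_spot = 3 * 5                        # one display per GPU
outside = panels - hot_spot             # two displays per GPU
implied_gpus = hot_spot + outside // 2
print(f"implied GPU count: {implied_gpus}")
```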
Okay. Displaying data. So that square is a scale representation of the 30-inch
monitor compared to Stallion. Okay. And then that Blue Marble project I
mentioned. So this is to scale. 3.4 gigapixels versus 300 megapixels versus four
megapixels on a single display. So you can't see the entire image, but which one would you rather look at that image on, if you had the choice of either in your office?
So the point being you can't see all the data but you can see more of the data at
resolution. So that you can see not only the details but get the context.
You can almost see the Great Wall on that. It's pretty cool. Okay. So display software is not standardized at this scale. The Calit2 team has built CGLX; those were really the first folks. And the folks at Chicago have built Sage, but they're now closed source. That's a problem, because some of the configuration assumptions are baked in. When we first got Sage we had to modify it; it was open source then. We have to run CGLX in a degraded mode because it doesn't accept our hardware configuration naturally. And full-featured windowing environments like Xdmx don't scale; there are actually hard-coded limits in the code at 16 displays. And so with 75 -- we just uncommented that to see what happens. Bad things.
There's definitely a need for folks to do X right, because outside of a full-featured environment, your ability to host software on the large display is limited as well. It's primarily images and video streams, because that's the first thing people try and it's relatively easy. You have an API for third-party software, but if you want to run a closed source, proprietary piece of software or a large software base, you'd have to come in and modify it.
You can use something like Chromium to just sniff the OpenGL and map it across. But Chromium stopped development in the mid-to-late nineties. It only supports OpenGL 1.3; we're at GL 4 or 5 now. So there's a lot of stuff it doesn't support.
And then to give you that pan-and-zoom feature for very large images, there's a separate application in Sage, and CGLX doesn't have support for that at all.
So what we've done instead is we've now beta released our own DisplayCluster software, which will remain open source. And it combines the features of Sage, CGLX, and MagicCarpet for large data and large images. It has all those features, plus it allows you to toggle between using network bandwidth versus disk bandwidth. For example, we had a development event before a UT football game, and we streamed 75 historic football games on the tiled display, which was really cool and amazing. They gave us the copyright permission to do that.
But it really taxed the ability of the other software packages to do that, so we had to run it in our own software. We are also exploring a touch interface; you'll see a video of that in a moment. And we also have TUIO Bluetooth connectivity, so you can use smartphones and tablets to control the display. And we're also providing a Python scripting interface so that you can script demonstrations or do advanced scripting for interactions across the display.
This was really motivated initially by our artists who wanted particular images to
come on the display at a particular size at a particular point and the other
windowing environments really couldn't handle that.
Okay. So let me show you a little clip of this. And so this is showing that pan
and zoom that you would normally have to have a separate application open for.
But now in DisplayCluster we can show multiple images and zoom into them simultaneously. These are some of the Google Art Project pieces, Van Gogh's Starry Night and The Ambassadors from the 17th century. The nice thing about it is everything's public domain, so we can put this up. Here's the Kinect interface. We liked that. [chuckling] So this is a piece of 17th century Russian art. And we were amazed that the face looks a lot like Robert Downey, Jr., particularly after a bender. It's a little green tinted.
And so again part of the motivation for this is the College of Education wanted to
use Macs to run their cluster because they have a relationship with that. And so
CGLX and Sage wouldn't run on the Mac. So that really motivated us. We had talked about it for a long time, but once we had a tangible problem to address and fix, that's what motivated us to do it.
We've already had a lot of interest from all the UT institutions that have tile
displays, folks at Stanford are interested, the University of Central Florida, the University of Michigan. So I think folks are using what's out there now because that's what's out there, and hopefully, moving forward, we'll have some nice experiences to share about how our stuff works. This is the touch screen display. This is, I think, Bing Maps over the University of Texas. This was designed in house from six 46-inch Samsung LCDs and a PQ Labs 32-point IR frame. And we also have a pane of glass in front to give a smooth swipe experience over the bezels.
And we also have the Kinect to do a touchless interface. This is really working as a test bed to then take those interface designs to the larger displays, so there's a nice feedback loop. And it's driven from a single node with an AMD Eyefinity six-output GPU. That allowed us to keep the costs down. I'll talk a little more about this later. The challenge there is it has really reduced rendering capacity; it only has four gigabytes, and it's driving six displays at two megapixels each. If you're trying to stream a video on each of those displays, things break down pretty quickly. It also boots into Windows 7 and [inaudible].
I think this is the last video. But this gives you a sense of how the display works.
So this has been motivated by a project with the National Archives. This is
showing the digital holdings in a tree map view, and the National Archives are really struggling because they don't understand half of what they have in digital form, much less how to respond to Freedom of Information Act requests. So this is meant to allow those folks from the School of Information and the National Archives to really interact with it in a more dynamic way.
So let me get to some of the interaction. That's just one of the demos. There's
multi-touch map navigation. You can see the whole video on YouTube. So that's
Bing there. That's one thing: you actually have to hold that and do the rotation. Google Maps lets you do a gesture like this to do that rotation. So, any Bing developers in the audience, that would be a nice feature to add.
Okay. So let me talk a little bit now in the last few minutes about what -- the
types of things that we're doing on these displays. And some of the user
successes that we've had.
So this is really something that could only happen in Texas, I think: after the BP oil spill, we had scientists who have models of the patterns of ocean currents in the Gulf, and they just modified that code to track where the oil particles are going. So here's -- I lied, that wasn't the last video. Whoops. Pause it.
So here's a visualization of that simulation. And these were done -- the
simulations were done in real time so we could give an idea to the responders
where the oil might be going.
So you can see how this is the end of Louisiana. This is the Mississippi and
coming into the New Orleans area. So the simulation was run at multiple
resolutions so you could get the coastal effects and also track more broadly in
the Gulf.
>>: How much can you zoom?
>> Paul Navratil: So this -- on this video itself? Or just in general? You can zoom in to really fine-grained detail. Now, the simulation has granularity limits, because it's a Gulf-scale simulation, so the coastal elements are rather coarse. But the simulation itself was pretty accurate in terms of where the oil was going.
And so this is in the spirit of what we do with NOAA during hurricane season. We have several hundred thousand hours of compute time set aside so that when the plane flies through a hurricane, they can take those immediate readings and then run what's called an ensemble simulation, where they run about a dozen different hurricane simulations with slight parameter tweaks or slightly different implementations simultaneously, and that's where they get that cone of probability from. They just average the simulation results together.
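A minimal sketch of that ensemble-averaging idea; the member tracks below are synthetic random perturbations, not output from any real hurricane model.

```python
# Toy ensemble: average perturbed member tracks into a consensus track and use
# their spread as a crude "cone of probability." All values are synthetic.
import numpy as np

rng = np.random.default_rng(42)
hours = np.arange(0, 73, 6)                       # forecast lead times
n_members = 12                                    # "about a dozen" runs

# Synthetic member tracks: a drifting base track plus random perturbations.
base_lat = 25.0 + 0.05 * hours
base_lon = -90.0 + 0.08 * hours
members_lat = base_lat + rng.normal(0, 0.02, (n_members, 1)) * hours
members_lon = base_lon + rng.normal(0, 0.03, (n_members, 1)) * hours

consensus_lat = members_lat.mean(axis=0)          # ensemble-mean track
consensus_lon = members_lon.mean(axis=0)
spread = np.hypot(members_lat.std(axis=0), members_lon.std(axis=0))

for h, la, ln, s in zip(hours[::4], consensus_lat[::4],
                        consensus_lon[::4], spread[::4]):
    print(f"+{h:2d}h  mean ({la:.2f}, {ln:.2f})  spread {s:.2f} deg")
```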
Back when Ike came through in 2008, our supercomputer actually predicted at that moment that the storm was going to go over Austin, which caused the UT football game to be moved a month later, and then the storm took a right turn and missed us entirely. So it was our claim to fame that we got the game cancelled.
So we've also studied H1N1, not only the molecule itself and how the virus
attacks cells, but we've also looked at the epidemiological effects, and this is a nice Web interface that Greg Johnson on my staff has developed. That part really isn't exciting. So here I think we're getting a little choppy. But you can see the counties of Texas, and then once you zoom into a particular point, you can look at the results of the simulation. I wonder if I broke things. Yep.
Yep. So you can see the counties are changing color by how much infection is
occurring at a particular time. And you can see how it radiates out across the
state from a particular input. So that was a disease breaking out in Travis County, where Austin is, and how it slowly follows the population around the state. And this is something that was commissioned by the Texas Department of Health; they actually want to use this to track Texas epidemics and to prepare for future ones. And what you saw there was the graphic output of amounts of antivirals, numbers of folks who are susceptible to the disease, who have recovered, and the mortality rates.
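Behind a display like that is essentially compartmental bookkeeping; here is a minimal SIR-style sketch for a single well-mixed population, with made-up parameters, rather than the county-by-county spatial model actually shown.

```python
# Minimal SIR (susceptible / infected / recovered) sketch; parameters and
# population are made up, and mortality is omitted for brevity.
def sir(population=1_000_000, infected0=10, beta=0.3, gamma=0.1, days=120):
    s, i, r = population - infected0, infected0, 0.0
    history = []
    for day in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((day, s, i, r))
    return history

for day, s, i, r in sir()[::30]:
    print(f"day {day:3d}: susceptible {s:10.0f}  infected {i:10.0f}  recovered {r:10.0f}")
```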
So this is a really powerful way for those folks to really learn more about what's
going on in terms of Texas health. Okay. We've also worked on high resolution mantle convection -- this work was featured on the cover of Science not too long ago -- and that allowed us, with the same team, to then do a simulation of the Tohoku earthquake. And so what you'll see here -- this is work that Greg Abram did -- is the seismic wave propagation that originates in Japan, in the corner here. So there's Japan.
I assume this is right. Looks like the screen froze. I think the system's having a
little problem with it. The video should be smoother. But what you'll see is that
as the earthquake waves radiate out, watch in the center here: the waves actually hit the core of the earth and reflect back up. And you can see that in the simulation eventually. And so here it comes. There's the shadow of it. Right there. And so that's part of the reverberating effects of the original quake; it really rings the earth like a bell. And you can see the waves propagating all the way out, through Alaska, down the east -- the western coast of the U.S., through Greenland, into Russia.
Okay. And so this is another piece of visualization from that NARA project with
the National Archives. And so this is 3-D visualization and/or 3-D imagery that
the National Park Service has plus photographic tours. So this allows them to
see where those particular files are located.
So to summarize, let's talk about the trends: visualization systems track HPC systems, both at the large scale and at the small scale. The reason we could build a system like Stallion or the touch display is because of the power of individual workstations nowadays. And vis systems must be able to scale to HPC size, either in the raw hardware or in the software that has to run on the visualization systems. So, for instance, a multi-threaded Mesa or another software rendering package would be excellent to develop.
And advances in the GPUs themselves, not only in compute, where we can move the visualization algorithms onto the GPU, but just in terms of their display power, allow us to really increase the density. Although, as we increase the density, it limits the power behind the display. So in terms of expanding a single window, like for Bing Maps or Google Maps, that's fine. But for high intensity video streaming, which is either CPU or GPU intensive, a single node does show its limitations. But that power allows you to trade off.
So you can design your system either with many nodes, to make it a proper compute resource in addition to a display, or you can minimize the nodes to reduce the cost and still have a very large format display.
For instance, if we were to redo Stallion today we could span anywhere from
seven to 38 nodes, depending on the amount of compute power we wanted
behind the display.
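That seven-to-38-node range follows from how many outputs you hang off each node. A rough check, assuming for illustration two six-output (Eyefinity-class) GPUs per node at the dense end and one display per GPU with two GPUs per node at the compute-heavy end; these per-node figures are assumptions, not TACC specifications.

```python
# Node count for a 75-panel wall under two assumed per-node output densities.
import math

displays = 75
dense_outputs_per_node = 6 * 2        # two six-output GPUs per node
compute_outputs_per_node = 1 * 2      # one display per GPU, two GPUs per node

print("minimum nodes:", math.ceil(displays / dense_outputs_per_node))    # 7
print("maximum nodes:", math.ceil(displays / compute_outputs_per_node))  # 38
```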
And so for future work, really as a community we need to address the algorithmic inefficiencies in the visualization stack. Memory efficiency: VTK still has an annoying habit of moving the dataset into an unstructured grid even when it doesn't have to; especially when there's a structured grid algorithm that would be more efficient, the raw VTK library may still do the dumb thing. Both the VisIt and ParaView teams have reimplemented a fair amount of the VTK stack to avoid situations like that and to use the fast case when possible.
We're just starting to explore as a visualization community how to include
accelerator support. We're already using the GPU for rendering. If we can move more of the algorithm down there and generate the geometry we're going to render in place, so much the better. And, again, improved software rendering, particularly for HPC systems that don't have graphics support. Right now we can really only use the Mesa library, and that's single threaded.
And in terms of usability for large scale displays, the windowing interfaces are
still limited, but we at TACC are working to fix that. Distributed rendering support: Chromium is the latest effort I'm aware of, and that was back at the end of the mid-90s, so that's 20 years old, or almost. And then better user interaction. That's what we're working on, both in touch and touchless interfaces, or using the interfaces that people carry around in their pockets, their smartphones, to actually let us use the displays.
And with that, I'll open it up to questions. Thanks for your attention.
[applause]
>>: So I have a question.
>> Paul Navratil: Go for it.
>>: So I was interested in the utilization of the Kinect. So what exactly can you do with that interface? Because on the video, the guy's kind of struggling to --
>> Paul Navratil: Yeah. So that's with the touch. With the Kinect, that allows us,
especially on a display like Stallion, to have a centered control point and then
expand more of the display. We're also using, investigating using multiple
Kinects to do three dimensional sensing, not just in front but also getting a side
sense, and then doing a chain of Kinects to have control through the expansive
Stallion instead of just one in the center. So you could be at the side of the
display and still have effective control.
Part of the challenge is determining clicks. So if you want to grab a window and
move it you have to have some sort of gesture, maybe closing a hand so the
surface area is reduced. I think the gesture we're using right now is kind of a Pac-Man: I want to grab this and move it. That's part of the challenge too. There is work out there using gloves or using a color-coding system with cameras. We're trying to simplify the equipment needed; we don't want to have Michael Jackson gloves for everybody.
Yeah?
>>: When I look at your picture, I whisper: bezels, he's got bezels.
>> Paul Navratil: He does.
>>: How much pain do you find users experience having bezels across the display?
>> Paul Navratil: If they notice it at all, they notice it the first time and grow increasingly resilient to them. We're used to looking out at things like windows: if you're looking out of a window and you want to see something behind the frame, you move your head. Here you just move the data. What it doesn't do well is PowerPoint. If that text is behind the bezel, you're not going to read it. So we have a 4K projection system that gives us the slide show. I did my dissertation defense there on the four megapixel screen. We had another dissertation defense with the four megapixel slides and then the data all across Stallion, in terms of videos and stills, and it was ten interactive -- well, ten video posters of his work, and it was very compelling.
The bezels also give us some other advantages. In purchase cost, because these are commodity; we don't have to do any modification, we just stick them up there. In replacement cost -- you may have noticed in the video that some of those workstations have the same monitors as in Stallion, so we have hot spares we can essentially just pop in there. The only caveat is that you want the same manufacturing lot, so the fluorescent backlights have the same color temperature; otherwise it gets messy.
And we do the construction and maintenance ourselves. Also, if you ever come to Austin and get right up to Stallion, it's not a precision fit. Those bezels buffer us a little bit, so we can just put the monitors together ourselves, without shimming them or doing any sort of fine adjustment, whereas with the Barco system that we replaced, we had to pay a five-figure contract to have someone drive up from Houston every six months to readjust and realign it to account for building shift, thermal expansion, things like that.
Another thing is those Barco systems had to be kept at 60 degrees, and so the students would come in in their winter jackets and wouldn't want to stay there. The entire walls were painted black to have this immersive experience. Here we've made it warmer. People want to be in there, and because it's all commodity parts, it's used to being in the same environment as humans; it's made to be in an office. So we can keep the room at a balmy 68, 70, and people can stay in there.
>>: 75 monitors heat it up to 84.
>> Paul Navratil: Exactly. There is a thermocline as you move forward. We
adjusted the HVAC so it now has a row of registers that blow straight down on
the display.
But it's 2,900 square feet, so it's pretty reasonable. It doesn't get too hot in there.
>>: We did an 18-panel wall and we found the amount of power we were sucking in there was substantial. I was wondering about your power consumption.
>> Paul Navratil: So I know the stats in terms of amps, just because when we were doing the circuitry on the wall, we wanted to make sure we weren't going to blow anything. The display draw is determined by the brightness of the monitors. We have it at probably about 1.5 amps per monitor, which is just the default at turn-on. If you max the brightness it goes to about 1.8 amps; if you minimize it you can go down to about 0.8 amps. As part of the renovation we put glass doors in so that people could see the display as they go by in the hall, and we tend to keep the display on with interesting things. It's great; you literally get these pauses and walk-backs from people wondering what's going on in there.
The machines themselves are rated at 8.3 amps. A warm boot is, if I remember correctly, maybe 1.2 amps, and even running -- it wasn't LINPACK, but another stress test -- it never got above 2.5 amps. So that 8.3, I'm not sure where they get that, but that's extreme, crazy, doing-something territory. So in normal operation, I'd say for the entire system we're probably under 100 amps, which is still a lot. But I think for the size of the display it's pretty reasonable, and compared to a datacenter it's not moving the needle.
Good question, though.
>>: So right now you have people coming in and using it.
>> Paul Navratil: Yes.
>>: Is there -- I can imagine to a certain extent that they can't do their typical
things on here. They've got special apps and the signup, and is it mostly sort of a showpiece right now, or do they actually get some work done?
>> Paul Navratil: There's a lot of outreach that happens here and education coming through. For instance, the electron microscopy lab came in and they loved it. The PI was a 40-ish woman who had never touched a game controller; it took 40 minutes to pry it out of her hands because she was zooming around and exploring it. Great. The 3-D reconstruction technology, the science they're actually doing, is like a key-framer: they stack these slices together and recreate the neurons in 3-D. That software is single threaded, runs on a single workstation, and doesn't utilize the 23 nodes of Stallion. And so while we could bring up those images at the flick of a button, which they loved, their actual software didn't run there.
As a center, we do have funding to do advanced user support projects to port that kind of software over. But there's also the last mile problem of: this isn't in my office, this isn't in my building, I can't walk outside in the Texas summer. So there have to be some motivating factors for that. But we do have folks using it -- we're actually expecting to get more out of the fine arts side, because being able to zoom in until a brush stroke fills an entire screen, especially of something like the Birth of Venus, which is in Italy and is a fresco so it doesn't travel, that allows folks -- it doesn't replace interacting with the piece, but you can get a much different experience with it.
And then the science follows. In terms of analysis, rather than big data, I think the multiple-views use gets more traction. Where, for instance, for hurricanes you can put up the path of the storm, the ensemble forecasting, and the storm surge, very large and all together. They can analyze it as a group, and bring in the news media, and it looks great there. [chuckling] If you go to the Texas website, just the home page, the new campaign video for development was shot in the lab. And so we have the university president, we have all the students, and probably about 60 percent of that footage was in the lab.
And even though we're a ten-year-old center, we have top-ranked machines in
the world. There are still folks on campus that don't know we exist.
So in terms of an ambassador and outreach piece, it's already worth it for what it can do. And the science we can get out of it is in some ways a bonus.
>>: Do you have the [inaudible] any of that super high res video conferencing
stuff they've been doing?
>> Paul Navratil: Yeah, and so we can do that as well. That's one of the things we wanted to design into our software. One of the lead developers for Sage was on our staff for a time; he's since gone to Schlumberger. But in Portland we had a live stream to our display Colt from Australia. It comes back to the bandwidth: we had to set it up with National LambdaRail to get a dedicated 10 gig connection. A whole lot of hoops to jump through, rather than just using Skype and calling Aaron on a Sunday.
>>: Don't call me on a Sunday.
>> Paul Navratil: Exactly. So in terms of high-rate streaming data, the interesting take-away from that experiment is that the compressed video actually looked much better than the raw. We did compressed and raw HD streams, and just the amount of packet loss in between made the raw stream really noticeably janky -- yeah, that's a technical term -- where the compression actually recovered from that a bit.
>> Aaron Smith: All right. No more questions. Let's thank our speaker one
more time.
>> Paul Navratil: Thank you very much. I appreciate it. [applause]