>> Lee Dirks: The second session here is the -- the theme of it is tools and
technologies for communication and problem solving, and so that will be more
the emphasis for this.
And our first speaker in this session is Curtis Wong, and Curtis is a principal
researcher in the eScience group at Microsoft Research. He joined Microsoft in
1998 to start the Next Media research group, focusing on interaction, media, and
visualization technologies. He has been granted more than 25 patents in areas
such as interactive television, media browsing, visualization, search and gaming.
He's the co-author of Microsoft's 5,000th patent in 2006 and its 10,000th patent in 2009.
He's the co-creator of the Worldwide Telescope featuring the largest collection of
the highest resolution ground and space based imagery ever assembled into a
single learning environment inspiring millions to explore and expand the
universe.
His most recent project is Project Tuva, featuring the 1964 Messenger Lectures
of Nobel Prize-winning physicist Richard Feynman, whom many of us in
the Department of Energy knew, presented in an interactive media
player featuring rich interactive simulations and related content.
His presentation this morning is called Telling Stories in the Cloud. So help me
welcome Curtis Wong.
[applause].
>> Curtis Wong: Thank you.
Okay. Well, I'm really excited to be here with you guys. You're sort of my
people. Multimedia, interactive, visualization.
So how many people here have heard of the Worldwide Telescope? Can I see
some hands? A few? Maybe almost half.
How many people have played with it? About a quarter?
Anybody authored with it yet? Oh, you have a real treat coming for you.
Okay. So I'm going to spend a little bit of time talking a little about the Worldwide
Telescope and take you into stuff that you probably haven't seen or noticed
before, and then more of the recent work we've been doing with it is bringing in
data. And I'll show you an example of that.
And then there's also been some really interesting work, I think, around what we
call guided tours, which are essentially like hypermedia publications. It's very
similar to the last speaker. And I think you'll find that particularly interesting as
well.
So let me just dive in here. I'm going to set up -- running here, and I'm going to
turn this down. Excuse me a second here. It's amazing. It just happens.
Okay. So Worldwide Telescope will run at pretty much any resolution. Right
now we're running it at a fairly low resolution so that it can be captured on video,
but when we were running it on, like, a wall or something like that, we could
pretty much run it at any resolution, which is pretty wonderful.
So you're looking at the visible light view of the night sky. This is the Milky Way,
of course. And this is a trillion pixel image. It was put together with some folks
here and based on the Palomar DSS Survey, which is about 1700 plates, and
those were mosaicked together and the seams were eliminated and, you know, so you
can do this kind of thing, zoom in pretty much anywhere.
But within this same environment we have the ability to compare other
wavelengths. And you notice that they're all sort of registered. So if we take a
look at, say, the X-ray view, you can zoom in and look at this heat signature from
a supernova remnant, and we can make that a foreground view and then we'll go
over and make the visible background view. And you can compare two different
data sets.
So you can see the heat signature as opposed to the debris clouds from that
particular supernova remnant, or if you wanted to see that in hydrogen, you can
see the compression of hydrogen from that particular explosion.
So having base layer imagery is particularly important. We have 50 of those,
50 different wavelengths. And then within collections we have collections from
Spitzer, Hubble, and Chandra, the highest resolution imagery of all of them here,
and these are overlaid on top of the visible light view that you have.
Say, here's an example: M42. And notice down below we're showing you,
as we slew over, this blue rectangle, it shows you the field of view as we're
zooming in to show you what portion of the sky you're looking at. The numbers
here tell you the field of view in degrees. And down here it shows you what part
of the celestial sphere you're looking at.
So you're looking at a very high resolution image of M42. And you can sort of
zoom down as far as you want. This happens to be a proplyd that has
resolved into a disc of dust where you can see the star forming at the center
here. There are many examples here in this particular area. Here's another one,
an earlier stage proplyd right there.
So let me show you something else. One of the most exciting features, I think, of
Worldwide Telescope lies in the idea of what we're calling guided tours for the
public, but essentially they're just paths within this very, very large data set of
hundreds of terabytes of imagery that we have.
And what's exciting about it is that it allows you, because we have built in an
authoring environment, to create these paths that you can richly annotate with
graphics, text, imagery, hyperlinks to other things as well. I'll show you an
example of -- this is a tour that Alyssa Goodman did a couple of years ago. It's
interesting that when you add up all the people that have seen this tour and then
you compare it to all the people that have been students of hers, the freshman
undergraduates, all the people that have read her technical papers, and all the
people that have been at her public talks, it's about a factor of
10 more that have seen this tour.
So in terms of public outreach, it's a terrific opportunity to -- let's see. Is that
right? We have a small technical problem. It's not coming through because it's
still coming through on my speakers, which is -- there we go. One of those things.
[Video playing. Voice of Alyssa Goodman]
>> Narrator: -- because we're inside it. Here's a spiral galaxy not far from us,
about 12 million light years away called M81. If we look at it in optical light, we
see billions of stars shining together in a spiral pattern. If we look at the heat
from M81 rather than the light, it looks like the false color orangey image we see
here. This Spitzer Space Telescope image uses long wavelength cameras that
can see heat just like the one that took the picture of this cat.
Galaxies are filled with tiny particles called dust that absorb the light from stars --
[Video paused]
>> Curtis Wong: So it feels like video. Video is like the most powerful media
form we have. It can really sort of engage somebody at a deep level. But the
problem with video is that if it's something you already know about, it's not going
fast enough. If it's something you don't know about, it's going too slow.
So the opportunity really here is that, as I mentioned before, these things that
look like video are actually rendered in realtime. And so because I've paused,
I'm back in this environment of the Worldwide Telescope so I can then look at a
particular object from any other telescope, I can look at a different part of that
particular object that I was looking at. I can look at other tours that intersect with
this particular galaxy. So I can sort of semantically branch off.
If I wanted to know more information about this particular galaxy, I could go and
look up -- if I was a kid doing research, I could go look at Wikipedia and go get
some information from Wikipedia. So that's one source.
There's other sources. If I was an astronomer and wanted to know what are the
latest papers that have been written that reference this particular object, we can
go and it does a realtime query to Smithsonian ADS and it will show you all the
results from that, of which there are 2625, and the most recent one was from
January of this year.
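For readers who want to reproduce that kind of live literature lookup, here is a minimal sketch against the current public NASA/SAO ADS search API. The endpoint, query syntax, and the need for an API token are features of today's ADS interface, not necessarily the mechanism Worldwide Telescope used at the time.

```python
# Hedged sketch: fetch the most recent papers referencing a named object
# (e.g. M81) from the ADS search API. Requires a personal API token.
import requests

ADS_URL = "https://api.adsabs.harvard.edu/v1/search/query"
TOKEN = "your-ads-api-token"   # placeholder, obtained from the ADS website

def papers_for_object(obj_name: str, rows: int = 5):
    """Return the most recent papers that reference a named object."""
    resp = requests.get(
        ADS_URL,
        headers={"Authorization": f"Bearer {TOKEN}"},
        params={"q": f"object:{obj_name}",
                "fl": "title,year,bibcode",
                "sort": "date desc",
                "rows": rows},
    )
    resp.raise_for_status()
    return resp.json()["response"]["docs"]

for doc in papers_for_object("M81"):
    print(doc["year"], doc["bibcode"], doc["title"][0])
```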
And there are other information sources that are essentially on the web. But if I
wanted to get an original source image, I could directly go get a DSS image from
that source or somewhere else, or if I wanted to get an image from the Sloan
Digital Sky Survey, it goes and does a query, pulls it out from the database and I
don't have to do anything other than just access it.
And then there's a number of other things that you can do with it. So far we have
this rich visual environment. We have the ability to create these paths in this
environment. And we also have this connection to data underneath it. And so
that gets particularly interesting from an educational perspective as you start to
think about, well, what kind of stories can I tell about these different environments
that we have. And we have quite a lot of different environments in here.
I mean, if you look at planets, you can create -- here's the planet Mars, and if I
wanted to sort of explore down into a particular area, Valles Marineris,
you could do it and I could create a path down into the valley and I can annotate
it based on interesting things that I was looking at or seeing. And that's one
example.
Of course, we have the entire Virtual Earth data set as well, whether it's -- I'll
show you here -- whether it happens to be the straight Earth view or the map
view or Earth at night or -- these are blue marble, winter and summer. And these
can all sort of be used within that experience.
If I wanted to go out beyond the solar system, so right now we're looking at the
sun and planets, actually quite small, if you -- I can turn on the orbits, but when I
show you the orbits you'll see that we have quite a lot of bodies here.
Okay. See the little lint balls? Those are the moons around -- as you look at
Jupiter here, you can go in and see all those. We can crank up the clock and
you see them spin around like crazy.
But let's go get some other context for kids. You can see why Pluto is not a
planet [laughter]. It's sort of a Kuiper Belt object.
But let's take a look at Orion here, the little three stars in the belt. If you keep
going further out you'll notice that the constellation starts to distort. Of course,
that's because the planets in Orion are at different locations -- I mean the stars
that are in Orion are in different locations. But we can exit, you know, the Milky
Way and continue going out into the million or so galaxies in the Sloan.
So if you look at some of these galaxies here, such as this one, the Coma Cluster,
each one of these has a real galaxy behind it and you can go and say I want to
get information about that particular outlier galaxy.
Here's the image of that galaxy. There's the spectra, there's the red shift, you
can download the data for it, you can take a look at chemical composition. And
this is sort of information that's behind every one of these million galaxies that are
here rendered.
So let's go back to the Earth and I'll show you a little bit about what we can do
with data.
Okay. So the first thing I'm going to do is I'm going to go find some data. And
this one will be -- we'll say this one. So I have some data in here in Excel. A lot
of people have data in Excel. So this one happens to be earthquakes over about
18 months, from 2009 into 2010. It's about 40,000 rows of data. And it's difficult to sort
of discern any real insight by looking at all of this data, but we've been working
on connecting Worldwide Telescope with things like Excel. And so we can go
and take that data and bring it into Worldwide Telescope.
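As a rough illustration of that Excel-to-Worldwide-Telescope connection, the sketch below pushes tabular rows into a running WWT client over its HTTP Layer Control API. The port, command names, and the response format are assumptions based on the published Layer Control API documentation, and may differ from the exact mechanism being demonstrated here.

```python
# Hedged sketch: create a data layer in a running WorldWide Telescope client
# and stream spreadsheet rows (time, lat, lon, magnitude, ...) into it.
import csv
import urllib.parse
import urllib.request

WWT = "http://localhost:5050/layerApi.aspx"   # assumed default LCAPI endpoint

def lcapi(params, body=None):
    """Send one Layer Control API command; return the raw XML response text."""
    url = WWT + "?" + urllib.parse.urlencode(params)
    data = body.encode("utf-8") if body is not None else None
    with urllib.request.urlopen(url, data=data) as resp:
        return resp.read().decode("utf-8")

# 1. Create a new layer on the Earth reference frame for the quake data.
reply = lcapi({"cmd": "new", "name": "Quakes 2009-2010", "frame": "Earth"})
# The reply is assumed to carry the new layer's ID in a <NewLayerID> element.
layer_id = reply.split("<NewLayerID>")[1].split("</NewLayerID>")[0]

# 2. Push the spreadsheet rows into the layer as tab-separated text,
#    header row first, so columns like Magnitude can drive the plot.
with open("earthquakes.csv", newline="") as f:
    rows = list(csv.reader(f))
payload = "\n".join("\t".join(r) for r in rows)
lcapi({"cmd": "update", "id": layer_id, "hasheader": "true"}, body=payload)
```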
So let's see. Let's go in here. Let me copy this and bring it into here. All right.
And here we have -- I'm going to minimize this. Let's take a look at -- so here's
all of that data in Worldwide Telescope, but we can play with some of the
properties of this. If we wanted to change the scale of some of the data that we
were looking at, you could see it here.
Oh, okay. Wait a minute.
The challenge right here is that I didn't select a critical field, which is magnitude
[laughter]. That's why it's not plotting anything.
All right. Let's try that again. All right. I'm going to delete this guy out of here.
All right. That's better.
So it's pretty easy to sort of see the ring of fire. We can change the scale a little
bit, which will allow you to look a little bit deeper at what
we're seeing there. You can change the color of it as well and make it a little
bit more dramatic.
But when you start looking at these earthquakes, you notice that there's some
funny patterns of these quakes out in the middle of the ocean. And you kind of
wonder what's going on out here.
And so what we can do is we can switch to a different view of the underlying
information about the ocean floor and you can see that, of course, this is a little
bit of a subduction zone where you see the pattern of the earthquakes following
that subduction zone there.
You can go look down here. Here's the Haiti earthquake. And it was very close
to the surface.
But, you know, this is an interesting distribution right here, right above Puerto
Rico, seeing some interesting structures here. We can take this same data and
look at it over time. So we'll select that and select time series and we can go into
the clock and crank up the clock a little bit more and -- and pause there. So you
can start to see some of these patterns here.
Let's go in here and see them a little more closely.
Some of them even look like lightning bolts. Let's speed this up a little bit.
I think what will help us too, is if we adjust the latency of it to give it a little bit
more persistence so we can look at more temporal patterns.
So we're seeing something here that is very difficult to see any other way. And
one of the nice things we have about this environment of being able to author in
this space -- I'm going to pause here for a second -- is that we can create an
annotation about the data here.
So we're going to create a little guided tour. And the way you create a tour is just
by deciding on a point of view, which happens to be this one. We could pick a
different one. And it creates a little snapshot of a point that we're looking at. And
then maybe what we might want to do is look at it from a little different angle, and
it will interpolate between those two views.
And we also want to pick a segment of time that we're looking at. Say that
segment. And then if we set an end camera there and then we can create
another slide that will be a starting point, and then we'll move the time slider a
little bit more, we might change the view a little bit more here, add an end camera
position to that and then we can take a look at what we have.
So I just saw something interesting in here. Maybe what I wanted to do is I might
want to cull something out and maybe I wanted to add something here of -- let's
see. I want to make a little audio note to somebody to take a look at this thing. And
let's sort of preview that.
[Audio clip played].
>>: Carl, can you take a look at these [inaudible] there's something interesting
going on.
[Audio clip stopped]
>> Curtis Wong: I just recorded that with a little digital audio recorder.
And then, you know, within this space you can add anything like text,
images, pictures, and all of those objects -- if I added, say, even this ring
and I wanted to do something to it, if I wanted to hyperlink that thing to a
website, I could go to the HSTI [phonetic] and get that URL, copy that, paste it into
here. And I'm actually going to mute myself because I don't want to hear myself
again.
So that's going, and I could click that and it would link me to some relevant other
related piece of information to what we were looking at. And I could actually
animate this particular object just by setting a different starting and ending point
in between those two.
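Under the hood, a tour slide like this is essentially two keyframes that the engine interpolates between. The toy sketch below shows the idea with plain linear interpolation of a viewpoint and the data-time slider; the real renderer presumably uses smoother easing and spherical interpolation, and the field names here are hypothetical rather than WWT's actual tour schema.

```python
# Toy sketch of slide playback: interpolate camera and data time between
# a start keyframe and an end keyframe.
from dataclasses import dataclass

@dataclass
class Camera:
    lat: float    # degrees
    lon: float    # degrees
    zoom: float   # arbitrary zoom/altitude value
    t: float      # position on the data time slider, e.g. days since start

def lerp(a: float, b: float, u: float) -> float:
    return a + (b - a) * u

def interpolate(start: Camera, end: Camera, u: float) -> Camera:
    """Return the camera at fraction u (0..1) along a slide."""
    return Camera(lerp(start.lat, end.lat, u),
                  lerp(start.lon, end.lon, u),
                  lerp(start.zoom, end.zoom, u),
                  lerp(start.t, end.t, u))

# Example: 100 frames sweeping from one viewpoint/time to another.
start = Camera(lat=18.5, lon=-66.5, zoom=4.0, t=0.0)
end   = Camera(lat=18.5, lon=-66.5, zoom=1.5, t=120.0)
frames = [interpolate(start, end, i / 99) for i in range(100)]
```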
So that's sort of a simple example of something that we're doing around data.
I'm going to show you a guided tour that was created by some astronomers at
the Harvard Center for Astrophysics, and this one is called John Huchra's
Universe as a dedication to John Huchra, who was one of the two astronomers
that worked on the original work on the large-scale structure of the universe.
[Clip played].
>>: About 14 billion years after the Big Bang. It's only during the last hundred of
that 14 billion years that human beings have figured out that we live in an
expanding universe. When John was born, no one really knew how fast the
universe was expanding, how old it was, or where the galaxies were in that
stretching space.
>> Curtis Wong: So this tour is 12 minutes long. So I inserted these little
[inaudible] in there as little jump points so we could sort of jump through it and I
could show you some different features about this.
Now, what we're doing here in the background, this is actually a path within the
Sloan Digital Sky Survey, which is sort of a testament to him. So this is an
example about the analogy of --
>> Narrator: Here's a 3-dimensional view of the top of Mt. Hopkins in Arizona, the site of many of John Huchra's astronomical
observations. Adding altitude measurements to a two dimensional map of this
part of Arizona gives a 3D view. In the same way, adding distances inferred from
galaxy redshifts to a two dimensional map of the sky can make a 3-dimensional
map of the universe, like slide 33 seen here. One study of Markarian galaxies
got John into a heated discussion in the journal Nature about whether the
distribution of these galaxies was lumpy or smooth, another harbinger of things to
come.
After his Ph.D., John moved east to the Harvard Smithsonian Center for
Astrophysics. He never left. John and a number of his young colleagues were
deeply influenced by Jim Peebles' desire to understand the origins of large-scale
structure in the universe. Big surveys of galaxies seemed the right way to
proceed, and John was up for the challenge.
If you measure a galaxy's redshift you can infer its distance. This lets you add a
third dimension to maps of the sky. On the Earth adding altitude lets you
understand mountains and canyons. Would a 3D map of the universe similarly
reveal such grand structures?
Soon after his arrival in Cambridge, Massachusetts, in 1976 [inaudible] who
wondered whether the universe was smooth or lumpy. At that time many people
had opinions, but there were few facts.
To build a 3D view of the way galaxies are arranged in space, you need a
spectrum for hundreds, or, more likely, thousands of galaxies in order to
determine each one's red shift and, thus, its distance, and none had ever set out
to measure -- and summarizes some big red shift [inaudible].
>> Curtis Wong: It's an interactive timeline.
>> Narrator: The CFA1 was the first systematic attempt to map out a large swath
of the universe in 3D. In the late 1970s and early '80s, John and his colleagues
spent hundreds of hours atop Mt. Hopkins measuring hundreds [inaudible] seemed to indicate a million cubic
megaparsec hole in the galaxy distribution. But nobody knew if voids like this
were rare or common.
The results from CFA1, published in 1982, hinted that there might be interesting
large-scale structure in the universe, but they were not conclusive. A larger
survey with more red shifts was needed.
After CFA1 --
>> Curtis Wong: So this is the actual CFA survey plotted against the sky. Each
one of these squares is that survey. In fact, you can go in and look and find a
galaxy behind each one of those little squares.
>> Narrator: -- structure in the universe, but they were not conclusive. A larger
survey -- the CFA2 red shift survey should employ a different strategy than CFA1.
She thought that more would be learned by sampling long, thin strips on the sky
than by samples over a broader region as had been done in CFA1.
The terrestrial analogy is that more can be learned about the Earth's general
topography from a long, thin strip of elevation measurements stretching from
coast to coast than from sampling elevations over a random patch of Earth. After
all, the CFA mappers reasoned, a long strip would encounter rivers and mountain
ranges and oceans, while a sample patch could turn out to be all ocean or all
Iowa.
John, Margaret and their students and coworkers began the CFA2 red shift
survey in 1984, using the long, thin strip strategy that Margaret had proposed.
The first strip was 130 degrees long and just 6 degrees wide and contained the
1100 galaxies highlighted here.
A new insight was their 3D map.
>> Curtis Wong: So you remember this.
>> Narrator: It showed a view of their first slice, seen as if they could look at
the universe from above. Here is the famous first slice from CFA2 where
velocity, a proxy for distance and Hubble flow, increases radially outward. The
range from east to west, stretching for 130 out of 360 degrees around on the sky,
is shown as the angular coordinate from right to left.
The thin range of 6 degrees north and south is scrunched down in this diagram,
which shows galaxies of the slice projected to make a two dimensional view as if
we viewed the universe from an orientation 90 degrees away from our usual sky
view.
It's easier to understand this diagram if we look at it in context. So let's put it there
before we dissect its meaning.
Shown here again are the galaxies of the first strip as projected on the sky in two
dimensions.
If we color code the same first strip galaxies in a 3D view of the universe given
to us by the modern Sloan Digital Sky Survey, we can see how a strip on the sky
translates to a wedge in three dimensions.
In John's modest description of the first slice he says -- total for CFA2 --
>> Curtis Wong: This is 18,000 galaxies plotted against the sky.
>> Narrator: Here we show all 18,000 galaxies on the sky color coded by their
red shift --
>> Curtis Wong: So I can pause and you can sort of see that segment of the
strip that they're talking about, and you can drill into any one of these. And the
color code tells you about the red shift. We can go in and pull up red shifts about
any one of those.
>> Narrator: The total for CFA2 reached about 18,000 in the 1990s. Here we
show all 18,000 galaxies on the sky color coded by their red shift. Red markers
are farthest, and blue nearest. The result from the CFA2 red shift survey are
best understood in three dimensions. Here we see the 3-dimensional locations
of the same 18,000 galaxies we saw just a moment ago on the sky. The bubbles
and sheets are now obvious, as is the Great Wall.
The CFA red shift survey inspired many larger surveys. The Sloan Digital Sky
Survey is shown here. The Sloan survey measured red shifts for hundreds of
thousands of galaxies. The synthetic 3D map of the Sloan galaxies shown
here in Worldwide Telescope was created by placing two dimensional images of
galaxies at the right positions in 3-dimensional space.
The CFA survey, where each of the 18,000 galaxies is marked by a colored dot,
was much smaller than the Sloan. But the essential features of large-scale
structure and network of filaments and --
>> Curtis Wong: Okay. I want to make sure we have time for questions.
So this, like all the others, it's a real 3D model. There's real data behind all of
those points, and I think it alludes a little bit to what the last speaker said in terms of
how you might connect both visualization and the data underneath it.
Worldwide Telescope connects to a number of different data sources
transparently within this application. You don't know that, you know, some of the
Mars data is coming from JPL and other data is coming from other sources, and
you don't really have to care. You just have to say point me to the data and you
can get it.
And that was part of the original reason for wanting to build Worldwide
Telescope, because much of the astronomical imagery -- say, for the Hubble the
visible light data was resident with Hubble, and the X-ray stuff was at Chandra and the
infrared stuff was at Vizier -- all these different laboratories sort of had their
own different data, and there was an effort about ten years ago to try and bring
all of this imagery into something called the International Virtual Observatory
Alliance.
And part of that was just to have a standard so everybody could publish their
data on the web and we could sort of get to it. And so I pushed them about six or
seven years ago in saying we should create this uber environment where
everything is brought together seamlessly and then connect it with the tools to
author experiences as well as connect them with information. And now the IVOA
essentially uses Worldwide Telescope as their de facto visualization tool.
So we have a few minutes. I want to make sure we have time for questions.
>>: Thank you. I think this is -- it's a great demonstration of that kind of data
linkage and such. I have a question just in terms of how people were using the
data. In a number of the examples you showed were probably more educational.
>> Curtis Wong: Right.
>>: Which is probably good for a more lay audience like I think we have here.
Are scientists using this as a way to get access to data sets to --
>> Curtis Wong: Yeah. I mean, I can show you an example -- let me bring up -- here's a data set. There's an astronomer that did that tour about dust, and her
data set -- she's posted her data sets. And if I click here -- so this is her data set
overlaid on top of Worldwide Telescope. And I think that's the real benefit of it is
they have their data sets and they can compare their data to any other
multi-spectral data from a number of different sources.
And we have an API so that they can just construct their own unique viewer on a
web page that they can do their own things with. There's a connection for
amateur astronomers. They can take a picture of the sky, whatever it is, upload
it to Flickr, and on Flickr there's a group called Astrometry.net which solves the
position of the stars, and then there's a link to Worldwide Telescope and they
can see their image overlaid in Worldwide Telescope and you can sort of cross
fade between them to look for what you can see in their image versus what's
commercially available.
Another question?
>>: I'm just curious about the project size in terms of how many developers have
been working on this, how long have you been working on it, those kind of --
>> Curtis Wong: Well, I've been thinking about this project for about 30 years,
and there really was not the technology to do it until Jim Gray and Alex
Szalay started working with the Sloan about 10 years ago. I joined
them -- I did a little bit of work with them in 2002, and that was when I said, you
know, just having the data isn't enough, we should build this larger thing. And
they were huge supporters of it.
And because it was one of these -- I wasn't in the E Science part of the
organization at that time. And so it wasn't really sort of core in my job, so it sort
of was this background thing. And about 2006 I had a little window of time that I
could start working on this, so I hired a really good developer named Jonathan
Fay, and Jonathan and I worked on it for about a year and a half. We
got an intern to help us with all the data, and that was the first launch we did in
2008.
We had other people help us with the website and things like that. But the core
team was very, very small just because, you know, it's kind of a labor of love.
We wanted to create something for education and something that we hoped
would have some benefits to science, and I think some of the data visualization
work that we're doing now I think has a lot of interesting potential, both scientific
and commercial.
>>: Hi. This is a really compelling demo, and actually some of the features here
remind me somewhat of the work on collaborative visualization by Jeff Heer,
who's at Stanford. In particular, some of you may know about his
sense.us project. And one of the nice things there is that the system kind of
directly supports communication and discussions and collaborations based on
the data set kind of anchored with respect to the data set. And it seems like this
system would definitely support this type of discussion as well. Are there
features built in like for commenting and for kind of having discussions around
these different views of the data?
>> Curtis Wong: We haven't done that yet. But, I mean, as you saw, Alyssa has
a data set that I was just showing you. She could easily make a tour with that
data set which has pointers to the data, and that could be shared with someone
else. She could annotate it and send it to somebody else, and it's a very small
file because it's basically xml with pointers, and then somebody else could
annotate it too or send it to a group or have it just be more broadly
available. So there's a mechanism for it, and, you know, it's another one of our
sort of many to-do things.
>>: It seems like it just would be so cool to, like, have that video that's been
viewed by so many people, for instance, and then having them be able to leave
particular kind of comments or notes kind of embedded as part of that guided
experience.
>> Curtis Wong: And have it as a space where you could sort of view all this sort
of discussion and collaboration sort of geographically, look at a heat map of the
whole sky and say where's the activity and where isn't there activity, and
should there be. And, you know, Harvard is working with ADS, which is the
publication sort of home for all those papers that I was showing you, to generate
a heat map of publications about the sky. Because wouldn't it be great to try and
figure out, you know, where is all the scholarship going and where is it not and
what are we missing in an interactive way?
>>: Let me get this row in the back, Peter, and I'll come up to you in a second.
>>: So kind of related to that, this seems like a new way to -- or a different way
than the traditional method for browsing and navigating data. So have you found
that has resulted in a different type of metadata structure or for semantics in
support of this way to kind of be tying things through geographic location?
>> Curtis Wong: A different kind of metadata structure? Well, we were trying to
use sort of conventions that exist out there already which allows us to bring in
lots of other metadata without having to do anything special. So we can bring in
shaped files, we can bring in 3D studio polygons, we can bring in a whole
number of different things with this environment, and each sort of -- you can
define each one as a separate layer which has its own rendering space which
you could then plot your data into that rendering space within the larger context
of another environment.
>>: It seems like with that component where you're looking at, you know, the
space and let's go look at -- you're kind of querying what's out there. I didn't
know if maybe having that kind of link required anything -- is going to require
anything different or anything that you found.
>> Curtis Wong: Well, for the sky and the Earth there are existing sort of
conventions there. We're also working on, you know, totally abstract 3D or ND
spaces that you can do visualization, and we're going to try and do the same
thing there where we use the same conventions, because we don't want to invent
too many things that we don't have to.
>>: Two things. First, I want to note for the audience that when the astronomer
started talking, the clock at the back stopped [laughter].
>> Curtis Wong: I'm sorry, what? I didn't hear that.
>>: When you started talking, the clock at the back stopped.
>> Curtis Wong: Oh [laughter].
>>: But I'm interested -- I know it's tough for someone who's been so involved in
this -- is the Worldwide Telescope at a disruptive technology stage for
astronomers, and if it is, are they afraid of it or welcoming it?
>> Curtis Wong: It's really interesting, because I think at the -- in the early stages
a number of astronomers said, wow, that's really cool, it's for education, right?
You know how that argument is. And then some of the astronomers that were
using it were showing some other astronomers and saying, well, you know, I
could do that, but let me show you how I do it here, and it's like a fraction of the
work. So what would you rather be doing, being a computer programmer or
doing science? And they go, you're right.
Some of the early advocates of Worldwide Telescope, to this day, are
two astronomers at Space Telescope, and ironically those two astronomers are
the ones that persuaded Google to do Google Sky. But now I think they're partly
a fan of what we've been doing because we've sort of taken it much further. I
mean, we were very passionate about sort of the accuracy of rendering. So
when you're doing the sky, you can't do the sky in a Mercator projection because
anything within 15 degrees of the poles is distorted. So that's a problem.
So we created our own projection method that essentially has no distortion
wherever you look. And so that's particularly helpful for Earth data sets if you're
trying to look at what's happening climate-wise on the Earth. I mean, you saw a
little bit of earth in one of the tours. You could essentially create a tour of any
data set that's in this environment. So it's quite flexible.
And I didn't show you the tour that was done by a six-year-old. There was a
great tour done by a 16-year-old girl who did a tour about extrasolar planets,
which is particularly relevant. And she really does a really fantastic tour about
that.
>> Lee Dirks: Help me thank Curtis, please. Thank you.
>> Curtis Wong: Thank you.
[applause].
>> Lee Dirks: Thank you very much.
So our second speaker for -- our second speaker for this session is Tim Smith
from CERN. And, of course, I think everyone knows what CERN is. The
English expansion of that acronym is the European Organization for Nuclear
Research. And Tim has been so gracious to appear before [inaudible] venues
before. He's active in this area as well. He holds a Ph.D. in physics and
performed research at the CERN LEP Accelerator for ten years before moving to
IT. He leads a group in the CERN IT department that provides services for the
10,000 strong CERN user community covering the domains of audio/visual,
conferencing, document management, print shop and copy editing as well as the
IT help desk.
Of course, these responsibilities cover the burgeoning area of multimedia, and
Tim oversees CERN's multimedia archive, which contains 25 terabytes of open
access photos and videos. In addition to the technology of disseminating
multimedia, he will also give us a view of how scientists at CERN are capturing
their work in the form of visual media.
Help me welcome Tim Smith. Thank you.
[applause].
>> Tim Smith: So thanks very much for inviting me to speak. What I'm going to
talk about is in fact the work of many other people, so I'd like to first of all
acknowledge that it's the physicists around the world who work for CERN, at
CERN, with CERN that I'm going to present their work here.
Multimedia means so many different things --
>>: [inaudible].
>> Tim Smith: Yeah. I can move it up if you can't hear.
The scope of multimedia has expanded so much over recent years that it covers
so many different diverse things that instead of going into any one in particular,
I'm going to go and skim across the top. So I hope it isn't too light for you.
I'm going to start with talking about how multimedia is used in communications,
communications to the public, which in general is slightly different in the way that
we use the multimedia for the scientists themselves. But when I get on to that I'll
talk just a tiny little bit about the supporting technologies behind that.
So starting with the public and starting with the weird and wonderful, physicists
around the world have come up with different ways to engage the public and try
and make what they're doing inside the buildings interesting to the public working
outside.
So here is one idea from the Danish physicists who projected with 96 LED
projectors onto their physics institute realtime events as they were happening in
the collider. The tracks go across their institute just to try and attract the
attention of people walking by and get them to come in and ask what on earth
this is all about. That's called the Colliderscope.
A group in London took the reconstructed events -- the
raw data, the tracks and the identified particles -- and fed them through music
composition software that generated music from the events. So this way you
can not only hear about the Higgs particle when we find it but you can hear the
Higgs particle.
>>: What's it going to sound like.
>> Tim Smith: Hmm [laughter].
And then how do you attract the younger generation? Well, gaming is the only
way to attract them. So what we did was we made a little CERNland game. You
can't actually see this very well. Can I dim? Here we go.
So the idea here being to try and attract their attention by making it a game, but
nonetheless, have all the correct elements, quarks and gluons and how to
construct protons and neutrons out of their elements so that they sort of absorb
the physics and perhaps get fascinated by it just by playing around with it.
But once we got the attention of the public, they obviously want more information.
They want to explore what we're doing.
In recent years we've been called the cathedrals of Science and you want to walk
around a cathedral. But the problem is it's a hundred meters underground and
not only is it underground and it's in a big cavern, but the experiment itself is
massive, and it completely fills the cavern. So even if you get to go down there,
which is unfortunately an extremely rare chance, you can hardly get a
perspective of the whole thing, and you can only go on a tiny little tour of it.
So what we wanted to do is give them some way of exploring it without having to
go down.
And the solution was we painted it on the outside of the buildings. Actually, that's
not what we did -- it is what we did, but it's not the solution. So we painted it real
scale so that we could actually give an impression for the people that came to
visit and can't actually go down.
But what we did is we started to create a virtual tour, an interactive video, so you
could actually explore it without actually going down.
So what we did here is we made a navigation graph of the entire cavern
where you can walk on the [inaudible] and the walkways where the viewing
points might be useful from, and then we got some experts to come in and take
360 degree pan photography at each of these intersection points on the graph.
We coded all this up in flash and then we also, between all the connector points,
we made videos so that we could put all this together into a virtual tour.
So, again, it doesn't show very well.
So I'll risk doing this online.
So here is the result. This means that you can now, at any given point, you can
start rotating around, zooming in, zooming out, the normal way of sort of a tour,
and then if you go round to the edge and then you start saying, oh, I want to walk
along there, then it actually goes and walks you along the video, along the
connectors, to the next point, and then when you're at a decision point then you
can interact with the Flash, which puts it all together.
So at this point here, I walk into the lift and I can choose which floor I want to go
to and then I'll come out somewhere else.
So this gives people who can't go down at least an impression of what it's all
about. And once they've got an impression, they want to know more about it.
What are these experiments, what's going on in the apparatus. So what we try
and do here is we make illustrative videos.
So the top right-hand corner is there is an illustration of the experimental
apparatus itself. Perhaps I'll just leave the lights down because these are all
rather dark.
So what we have here, we've coded into just an animation software the outer
shells of the apparatus and then we do a cut-away so that we can go in. So if
only I'd known about this software that was described this morning, I'd have done
this differently. But it cuts away, and then you can see what's actually happening
in the heart of the experiment.
And underneath you can also show by analogy what a magnetic trap looks like
by portraying it as sort of a hill in terms of a gravitational analogy.
Other things that we can do with the illustrative videos, I'll explain how the
accelerators themselves work, how we collect particles, how the electric and
magnetic fields work.
So these are very useful tools, but they have their limitations. So when we get
on to explaining the processes that are under study, in this case here, this was
an anti-hydrogen trap. And what happens is when the anti-hydrogen finally
leaves the trap and hits the outer wall, it annihilates and you can see the decay
products.
So this is fine for a relatively simple experiment like this. You can recode it up
into an animation software and get something useful for the public.
But when it comes to the bigger experiments, when there are millions of detector
elements and tens of thousands of tracks coming out, we don't want to go and
code all that in just for the sake of illustration. So instead we have to have a
different approach, and there we start from what the physicists themselves use.
So the physicist himself wants an accurate representation so he can visualize
what's happening in each of the events. So for any given event we have a
reconstruction software that takes the hundred million channels of readout and
allows you to visualize it in various different ways.
You can actually -- the reconstruction program itself joins all the dots to make
tracks and shows you where the energy depositions are, so you can actually see
what's happening in the event in the center and the physicist can then rotate it
around in 3D, take slices through it or make projections, so energy projection
plots on theta phi or theta [inaudible] plots.
So this is what the physicist wants. And it's a very powerful tool. But
nonetheless, this doesn't actually convey enough information to a non-expert. It's
actually coded in -- the whole reconstruction is coded in C++, so we have
converters to make these event displays work that actually extract the
information into xml which then a java visualizer can actually work on.
Now, that intermediate step, the xml, we can use to feed into other software --
visualization software -- and make it useful or understandable to the public.
So the usual stuff that goes on in something like 3D Studio Max, when you
convert you add texture, and then we can basically make a video replay of the
event as it happened, slowed down so we can explain to the public what's
actually happening in the center there.
Let me just get rid of the -- so here, pointing to our digital library, here is the video
that comes out of this. So this is taken from real data, a real collision. We
haven't had to code it back in. We've just converted the information into
something that the renderer can actually work on.
So you can see the collision and the particles coming out. Sorry, this is realtime
from CERN over wireless so I'm asking a bit much.
And then you can see the energy depositions in the calorimeters from the
outside, and then we just rotate it around.
So this has the limitations of video that were just mentioned, but nonetheless, it
allows us to explain our science a lot easier than using one of the complex
programs the physicists themselves use.
>>: [inaudible].
>> Tim Smith: So once we've explained what's going on in the apparatus, the
next thing that we want to get them to understand is the theories that go behind.
What are we trying to study and why. And that's a lot more challenging for us to
try and explain some of these concepts that the theorists are coming up with:
Hidden dimensions, parallel universes, strings.
I put anti-matter there because most people think it's a theory rather than
something that we actually create and use in the lab.
So what we happily rely on here is the creative genius of authors around the
world like Philip Pullman and Dan Brown here and Hollywood to
explain exactly what these theories are all about.
So the relevance here is that when we discussed this with Sony and [inaudible],
they agreed that we could -- if they were allowed to film in our underground
chambers, then we could put our multimedia material along with the video. So
actually if you look at the Blu-ray disc of Angels and Demons, you got the
whole multimedia pack from CERN explaining what is in fact the science behind
Angels and Demons, and they created our own special ambigram for that.
So that's great. We got the attention of the public. But, of course, these authors
don't always portray it in the best light. So it creates a lot of fear as well. So
people were worried whether or not we were actually creating earth-eating black
holes, so we had to work out a way of actually allaying their fears and giving
them sufficient information.
So what we decided is we would -- the best way to do science is really to do it in the
open with everyone watching at the time we do it. So when we were switching
on we decided to open up the lab and have a webcast so people could really see
what we were doing and they could have links to all the supplementary material
at the same time.
So this worked extremely well. The public were interested, and we got a million
watchers of the webcast, which seems to have been the biggest scientific
webcast to date.
And at that time when they started hitting our network they were asking for more
and more multimedia material for these explanations and things like that. So that
put a huge stress on the infrastructure underneath because they want from one
place to start navigating out to all the extra information, exactly what we were
talking about, these extended publications. They want to go through our entire
digital library, especially the multimedia part of it.
So this -- that was the webcast itself. This put enormous challenges on our
digital library. So just another side of multimedia is how you serve it to the
outside world. So prior to the multimedia library we just had basically one
instance of -- this is the CERN digital library, CDS, which was a front end, back
end, but it was just one simple item.
So when multimedia comes into the game, unfortunately then you have to have
streaming servers that can stream the content. You have to have transcoders
that can translate it into all sorts of different formats for the different browsers and
different usage patterns. You have to have servers that can serve it
securely for the high resolution content, and servers that can orchestrate this whole thing. And
not just a few but many, many, many.
So we turned from one server into a farm of 30 or 40 servers just to serve
multimedia content. So the flip side of having interest and having content is you
have to manage it in a completely different way.
So given the location we're in, I just wanted to point out that we achieve all this
by actually using virtualization, and these are all based on Hyper-V virtual
machines and we can just instantiate them as needed if there's more transcoding
or more streaming needing to be done underneath there.
So what do the physicists want? Sort of going from the public side. The
physicists want a similar thing. They want to see streaming media, but they want
to see it from internally on site. They want to see at any given time what's
happening in all the other lecture theatres, and they don't want to be sitting in front
of all of them. So we've set up this system whereby basically any lecture
anywhere on site can be streamed through the webcast server.
Just a comment here. We find that rather than just filming one part of it, I think it
was -- John was showing this yesterday as well, what our scientists want is in
fact to see exactly what's being pointed at so we capture the VGA, and they don't
particularly want to see what the lecturer is waving his hands at. Especially when
it happens to be in a darkened room and you can't see anything. So, hence, the
relative sizes.
But once they've watched it live they want to go back and either download it or
jump to somewhere and look at something specific in there. So what they want
from the same webcast is afterwards to have a web lecture which they can then
use and jump around in.
So to be able to do this we capture the video, the audio, the VGA signal, even
the whiteboards the theorists still use and we try and put that together into a
lecture object. We analyze it, again, like was described yesterday, so that we
can detect all the slide transitions and we can make chapters. We can make it
into an interesting blog for the scientists to use.
So this is what it then looks like: they can download this, they can use it, they
can see it on the web. This is a big flash object where you can jump forwards,
next slide, back a slide, and you can see the slides and the speaker or you can
jump anywhere in it. Because we've done the slide transitions, you can jump
along and you can just look at any -- the video syncs up and you can just look at
what the lecturer is saying at any given point.
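One plausible way to get those slide-transition chapter marks from the captured VGA stream is simple frame differencing, sketched below. The actual CERN pipeline is not described in detail here, so the library choice, threshold, and method are illustrative assumptions only.

```python
# Hedged sketch: mark chapter points wherever the captured screen content
# changes abruptly between consecutive frames.
import cv2
import numpy as np

def find_slide_transitions(video_path: str, threshold: float = 0.15):
    """Return timestamps (seconds) where the screen content changes abruptly."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
    transitions, prev, frame_idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(cv2.resize(frame, (160, 120)), cv2.COLOR_BGR2GRAY)
        if prev is not None:
            # Fraction of pixels that changed noticeably since the last frame.
            changed = np.mean(cv2.absdiff(gray, prev) > 25)
            if changed > threshold:
                transitions.append(frame_idx / fps)
        prev = gray
        frame_idx += 1
    cap.release()
    return transitions

# chapters = find_slide_transitions("lecture_vga.mp4")
```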
So this is one type of multimedia -- the stuff that's more like what the public are
after. But the other side of it which was discussed this morning I think by Michael
is they want all sorts of other information which is vaguely classed as multimedia
information which supports the science that's going on.
So along with the publication, they want the data sets, they want the
supplementary plots, anything that didn't actually make the publication but
support it. They want access to all of this stuff and, of course, in a linked way
also to the presentations and videos that I said.
So this means that in our digital library we're starting to get more and more tabs
coming along here with the extra pieces of information that they want. And as
was mentioned this morning, they want to be able to add their data set there
such that someone else can analyze it. So you can click on the data set and it
will bring up the analysis program that all the physicists use, load up the data and
then just allow you to revalidate what was being discussed in the paper.
Now, one little aspect I'd like to bring up here that wasn't mentioned this morning
is this is all great as a new tool for new people that are thinking of new ways of
using it. But to attract the people who have one way of doing things, that are
used to things, we have to actually show them that it's useful. And there's this
sort of chicken and egg that if they're the first one to put it in and they're not
seeing anyone else using it, then they don't really want to put it in and be the first.
So rather than looking at just stuff that's being contributed, we decided to go a
different way through it and see what we could mine out and put in there anyway
to show the value of having it there in the first place.
So just take a couple of slides on that. So the first thing that we tried to do is to
mine out all of the plots. So from the zip files that are contributed or the PDFs or
even the raw tech files, we go through with analysis programs, start extracting all
the information and putting them out as separate objects behind all of the
publications automatically. So this already was starting to be extremely useful
because then the physicists see that they don't have to measure things on paper
anymore, they can get to the original objects.
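A minimal sketch of that kind of mining, assuming the submission includes LaTeX source: walk the figure environments, pull out the referenced graphics files, and keep the caption alongside each one (anticipating the caption extraction described next). This is illustrative only, not the production CDS code, which also handles PDFs and zip archives.

```python
# Illustrative sketch: extract (graphics files, caption) pairs from .tex sources
# so each plot can be stored and indexed as a separate object.
import re
from pathlib import Path

FIGURE_RE = re.compile(r"\\begin\{figure\*?\}(.*?)\\end\{figure\*?\}", re.S)
GRAPHIC_RE = re.compile(r"\\includegraphics(?:\[[^\]]*\])?\{([^}]+)\}")
CAPTION_RE = re.compile(r"\\caption\{((?:[^{}]|\{[^{}]*\})*)\}", re.S)

def extract_figures(tex_source: str):
    """Return (graphics files, caption) pairs found in one .tex file."""
    records = []
    for env in FIGURE_RE.findall(tex_source):
        files = GRAPHIC_RE.findall(env)
        caption = CAPTION_RE.search(env)
        records.append((files, caption.group(1).strip() if caption else ""))
    return records

for tex in Path("submission").glob("*.tex"):
    for files, caption in extract_figures(tex.read_text(errors="ignore")):
        print(files, "->", caption[:80])
```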
So then what we did was we started extracting the captions as well, which
means that this is much more useful information because then you can start
doing searches on it, such as on the full text of a caption where the plot
discussed a certain particular particle. And you can really hone down very
quickly on the graphs that you're interested in.
But this takes quite a lot of processing power. So the next logical step from that
was once you've got objects like this which are not textual, you want to do
searching not just in the classic way on the caption but actually on the image
itself.
So there we started working with the University of Geneva on some of the tools
that they were doing for medical imaging and we started to see whether this
could actually be useful in our domain, whether or not you can use these tools
which actually look for the color distributions and the different shapes and
surfaces and can actually go through our data and be of use.
So this is from the lab. This is not in production yet. But it allowed you just to
click, you know, I want something similar to this or not like that and then it would
do a search through the entire multimedia database based on this -- the tool
having already indexed all these features.
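To give a feel for that kind of content-based search, here is a stand-in that indexes each image by a simple colour histogram and ranks stored plots by histogram similarity. The University of Geneva tools mentioned above use much richer shape and surface features; this is only a minimal illustration of the principle.

```python
# Minimal content-based image search: colour histograms plus correlation.
import cv2
import numpy as np

def colour_signature(path: str, bins: int = 8) -> np.ndarray:
    """Normalised 3D colour histogram used as the image's feature vector."""
    img = cv2.imread(path)
    hist = cv2.calcHist([img], [0, 1, 2], None, [bins] * 3,
                        [0, 256, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()

def most_similar(query_path: str, candidate_paths, top_k: int = 5):
    """Rank candidates by histogram correlation with the query image."""
    q = colour_signature(query_path).astype("float32")
    scored = [(cv2.compareHist(q,
                               colour_signature(p).astype("float32"),
                               cv2.HISTCMP_CORREL), p)
              for p in candidate_paths]
    return sorted(scored, reverse=True)[:top_k]
```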
So another thing I think I've shown you sometime in the past already, another
way of visualizing things, this is done already in many different ways, but it's very
fun. Once we're starting to analyze all the papers, the tech and the PDF and
things, you can extract all sorts of different pieces of information out from them
automatically.
So here we've extracted all the references, then we do the citation graph, the
citation network and the co-citation network, and then we can start plotting things
so they visually speak all the quicker to people.
So, for instance, a new person, a new researcher, can see that this was a very
hot topic in the 1970s here and then died off. Some papers lose their interest
completely. The subject is no longer interesting. This is obviously a really
seminal work because it stood the test of time. And they can look for just shapes
in the plots here.
Like this one, it's obviously something that's really hot, that's taking off, and it
allows the young researcher to focus more on the type of articles they're
interested in because you can search for things that were interesting, are
interesting, are taking off and things like that just by the shapes that you can see
through these visual patterns.
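The citation-history plots themselves are straightforward once the references have been mined. A toy version with made-up input data is sketched below so the "seminal", "died off", and "taking off" shapes become concrete.

```python
# Toy citation-history plot: citations per year for each paper.
from collections import Counter
import matplotlib.pyplot as plt

# Made-up example data: for each paper, the years in which it was cited.
citation_years = {
    "seminal_1974_paper": [1975, 1976, 1980, 1990, 2000, 2005, 2008, 2009],
    "hot_new_result":     [2008, 2009, 2009, 2009, 2010, 2010, 2010, 2010],
}

for title, years in citation_years.items():
    counts = Counter(years)
    xs = sorted(counts)
    plt.plot(xs, [counts[y] for y in xs], marker="o", label=title)

plt.xlabel("year")
plt.ylabel("citations per year")
plt.legend()
plt.show()
```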
I won't talk much on the type of crowds and other things that are normal now in
the social tools. I'm trying to bring them back to the science community as well.
So all of this, of course, needs a lot of analysis power. So what we have to do
instead of just having a digital library, the digital library now has to be able to
dispatch all of these mining tasks off. So we parallelize it, dispatch it, and
then collect and reintegrate the results, often merging the indexed results with
the classic text indexing.
So we did this on the grid originally. We made connectors out to our grid that the
physicists used for the analysis and are now actually doing it out into the cloud
instead.
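The dispatch-and-reintegrate pattern itself is simple; the sketch below uses a local process pool as a stand-in for the grid or cloud workers that CDS actually dispatches to, with the mining task itself left as a placeholder.

```python
# Sketch of fan-out / gather for mining tasks, using a local process pool
# in place of grid or cloud workers.
from concurrent.futures import ProcessPoolExecutor

def mine_record(record_id: int) -> dict:
    """Placeholder for one mining task (figure extraction, reference parsing, ...)."""
    return {"id": record_id, "figures": [], "references": []}

def run_mining(record_ids):
    """Fan the tasks out to a pool of workers, then gather the results."""
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(mine_record, record_ids))
    # Reintegration step: merge the mined metadata back into the search index here.
    return results

if __name__ == "__main__":
    mined = run_mining(range(100))
```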
I won't talk about all the other things because I probably haven't got time.
So just on to more things to do is the visualization. Something we tried ten years
ago was to see what virtual reality could do for us to help with the assembly. So
the LHC was constructed by institutes around the world, and they had to validate
that what was being made in these different places would actually fit together,
in a sort of 27-kilometer long accelerator, so it was quite a challenge.
There were 142 different CAD systems being used around the world, and they
had to integrate the information together in order to validate it.
So what they did was they made a virtual reality lab. This was ten years ago, so
these were massive goggles with your hands on balls and you could navigate
through and just see whether or not a cooling pipe would go through an electrical
supply or something like that.
So this was fun and useful, but you had to be at CERN in that virtual reality lab,
so it didn't get much use.
So when we came to describing or assembling the detectors later on we tried
another technique which is something that anybody could view from anywhere,
which is to take all the diagrams and to try and work through a sequence, a time
sequence, of how we'd actually put them together and then we could just run it in
front of people and see if that fit what they thought they were going to be doing,
what they were responsible for and to make sure that it works.
So each of these are animation videos taken from the CAD output, again, with a
texturizing and the surface coloring and things like that, and then we just animate
it with a timeline to make sure that things insert as they're expected and there
aren't any clashes.
So this turned out to be quite a useful tool for the assembly and as useful
afterwards for the public to understand what makes a detector and how does it
go together.
So one last visualization thing to do with the grid. So the physicists distribute the
data all around to the participating institutes because we haven't got enough
compute power or even data storage to have everything at CERN in multiple
copies. So we distribute jobs around the world, dispatch them, then bring them
back together, the results, across the grid.
So we have a visualization tool that shows how these jobs and the data get
distributed to the 350 centers around the world that are actually participating in
this grid.
Let's just see if this works.
So then you can see in realtime the distribution of jobs. It's just pulsating
to give a visual impact of how big each of the centers is. But you
can see the data transfers and the job transfers, meant to be realtime, around the
grid.
So that's about it for the summary of the different techniques that we use. I just
want to finish and ask if there's any questions.
>> Lee Dirks: Any questions of Tim?
>>: Tim, wearing taxpayer's hat, as I understand it, the amount of running time
for the LHC to prove the Higgs boson is a critical feature. How do you try and
convey that you've not yet got statistical certainty on a particular particle? So if
you combine the Worldwide Telescope type of approach, then you would have to
have some slide bar for speeding up, showing accumulation of data, and that
would, you know, convey to the general public, you know, that the machine
needs to keep running.
>> Tim Smith: The way we generally do it is -- because it's a statistical analysis
and you have to show a signal over a massive background, we generally do it
just by literally showing what we have accumulated to date in the distributions
and we just show that you can't possibly see anything with the statistical
accuracy. And then with the modeling of the detector over 20 years, we can
show how the error bars go down.
The problem is all of this is statistical, and error bars and things like that, it's
hard to convey that sort of stuff to the public.
>> Lee Dirks: Other questions?
All right. Thank you very much, Tim.
[applause]