>>: All right, everyone. I guess we'll go ahead and get started. I'd like to
introduce our next speaker, Peter Fox. Peter is Tetherless World Constellation
Chair and Professor of Earth and Environmental Science and Computer Science
at Rensselaer Polytechnic Institute.
Previously he spent 17 years at the High Altitude Observatory of the National
Center for Atmospheric Research as chief computational
scientist. Fox's research specializes in the fields of solar and solar terrestrial
physics, computational and computer science, information technology, and grid
enabled distributed semantic data frameworks.
Fox is currently PI for the Semantic eScience Framework, the Integrated
Ecosystem Assessment and Semantic Provenance Capture in Data Ingest
Systems projects. And if you'll take a look, many, many other varied experiences
and background. Quite an impressive bio.
I'd like to now hand it over to Peter to tell us about semantics for innovation in
visualization and multimedia. Peter.
>> Peter Fox: Thank you, Lee. Thank you for the invitation from a variety of
people. It's very nice to have an opportunity to come back and talk to the ICSTI
crew, the crowd. Because I remember the days when we used to get booed out
of this room. And you'll remember it too.
So what I'd like to you do is please buckle your seatbelt because I'm going to run
you through a series of slides and a series of concepts around this idea of
innovation, around visualization for science.
And I've got to introduce a few concepts. And I'm going to show you a few
visualizations but I'm actually going to talk more about visualization and some
emerging opportunities. So this beautiful little graphic there, which of course
is an artist's rendition of how the sun-Earth system works, is really key to
some of the things I'm going to say. The artistic element, I think, is one where
there are substantial opportunities.
And on these new means of conduct, I'm going to tell you a little bit about linked
open data, the semantic Web for real, and then some new opportunities in open
source realtime (or near-realtime) software development, to try and invoke a new
way of conducting science. It's a part of data science, which you've heard
about, but it digs in a little bit more.
And then we get into what, when I wrote the abstract, I called the semantics of
portrayal. Of course I didn't think about it carefully enough: it's really
the semiotics of portrayal, and if you don't know what semiotics are, I'll show you
what they are and just briefly describe why they're important. And they do
include semantics.
And then I want to talk about something very specific in science that we seem
to be utterly failing at, and that is the representation of things that scientists really
care about. And maybe general public would care about too, if they knew they
were there. But they're not. And then a little bit of a speculation, tell you where
exactly we're heading.
So now, my working premise. My laptop is old and slow, so these slides
will take a little time to come up. This is my working premise. Some of you who have
seen me give a talk will have seen this slide many times. And really the top
phrasing and the two bullets that appear underneath it are really where we
should be because we do have a lot of technology capability, and we do want
this access to some of the things we've heard about in this workshop today.
Distributed, knowledge-based scientific data. But it's got to appear to be integrated
and it has to appear to be locally available.
And this is really important. And you'll see this is where the shift of the burden
has to come. We have overextended ourselves and pushed a lot of
responsibilities out on to users. And we have to make it look as though it's
integrated and local, just like it was yours. And that has a lot of implications in it.
But we have this problem. And you can read all that. But the really important part
about this is the red statement, and this is something I bumped into a long time
ago, and that is really all data and information's created in a form to facilitate the
generation and not the use, except by accident. That's why all of us are in
business. If it was facilitated to make the use easy, we wouldn't -- most of us
wouldn't be in a job. It would be easy.
And we have complications. There's heterogeneity, there's large-scale systems,
there's all sorts of complications that go with it.
So you might at this time say uh-oh, but really this statement was made around
eight years ago. And I still carry it through today because we're making
deliberate progress on this particular challenge.
Now, because this is a meeting about visualization, I just want to give you a view
of something that Jim Hendler and I have been talking about for about two years,
since I went to RPI, and now we were invited to give a perspective article for
science called Changing the Equation For Scientific Data Visualization. Now, it's
embargoed until Friday, so I can't tell you about it. Especially out --
>>: [inaudible].
>> Peter Fox: But I'm being taped. So I read the fine print. [laughter]. And the
fine print, you know what the fine print says.
But there are really three important points. And that's unlocked data, which you
should be convinced of and what we call visualization for the masses throughout
the lifecycle of data. And I'm going to argue that there's perhaps being too much
attention being paid to the curation aspects of creating really nice visualizations.
And really what they do is fill up the majority of the time we take in
doing science. We can generate data really quickly. We can actually turn out
because of many of you here, get it published, but it's this middle piece that's
consuming more and more time. And it's getting more and more difficult. The tools
are not scaling.
And so our premise is to do smarter data and smarter visualizations. We are,
however, presented with diagrams like this. And this diagram is in front of really
every scientist, almost, in the world. And so you can start to read it. It says data
has lots of audiences, rising up from the science public to museums, educators,
policy makers, decision makers, and it becomes more and more strategic. So
there's always this big emphasis on producing visualizations for a wide variety of
audiences. And we've seen and heard a lot about that so far today.
But scientists are getting crushed by this pyramid of extra use. So what I'm really
intending is to bring visualization back earlier in the research lifecycle of data,
which will also, if we do it right, allow this scaling up to more strategic but
lower level of detail, less use of jargon and terminology, more integration if
you like, more aggregation.
And the only way to do that is to do this. I want to explain that. When I present
this slide in classes, I tell my students this is their job security slide. And I'll
explain it. So in the early days -- remember the early days of the Web -- to make
data available we put up static HTML pages, wrote the code ourselves and had
listings. Who did that? Yup.
When the common gateway interface came along and scripting languages we
could generate pages on the fly from databases so it was a little more
maintainable. Who has done that? Yup.
Now we have Web services, with which we have rich additions and annotation
and merging of datasets and all sorts of things that you can do. How many of
you have done those? You've written a Web service, Walt? Fantastic. So
you're pretty representative. He's a renaissance man. I love it.
The key for computer science and information technology is the complexity of the
data structures increases as you go that way. And the level of skill and
resources needed to create it go up this way. And that's usually why, you know,
nowadays your graduate student or even you can only sort of sit down here. But
the key is there's a decreasing level of resources required to maintain it.
So what we're doing is we're shifting the burden, we have to shift the burden
away from the users to the providers. Rich services that work together.
Now, just to emphasize that we of course have too many diagrams, I'm going to
show you another diagram which I also like quite a lot because initially it might
paint this myth of a data lifecycle: usually you see this diagram and there's
this progression from data to information to knowledge. Well, I completely
disagree with that. What it really is, is a set of spheres in which sometimes
things are data and sometimes information. This green ellipse, a contextual
ellipse, is really the great definer of when something becomes information
instead of data.
So this diagram is actually really interesting as are the arrows back in this
direction. There's the producer community. There's the consumer community.
There's an overlap in the middle. And as you all know, just to make it very clear,
this is the visualization. This is the information space. This is where when you
get people in the loop, this is the important part of it. So it requires bridging on
the knowledge side, understanding of the data and data structures,
representation. But really it's about this presentational aspect of data. And this
is the bit that's actually pretty hard.
All right. So those are the contextual setting slides.
I want to flip now to -- so who's heard about linked data, linked open data? Who
has heard about data.gov? Okay. You can't really get away from it. So as some
of you may know, we've been helping the data.gov team in the US Government bring
in linked data. And this is just a quick screenshot from the linked
open government data page at RPI. Jim Hendler is leading this activity. You've
heard his name a few times now.
And the reason I'm showing this is with this unlocking of the data I've talked
about, you suddenly have the opportunity to bring datasets together in a
relevantly simple way using really off the shelf technology, service based
technologies, not downloading the data but actually accessing it and mashing it
up on the fly. And this just happens to be, what is this, GDP
per capita, a US and China comparison. But I encourage you, if you haven't seen
some of these things to go to this site, click on the demos, look at the videos,
check out the datasets, check out the tools and technology. The idea is to push
all of this out into the community.
And the reason why it's important is what it really looks like underneath:
the idea behind linked data is that it's in an interlingua. The interlingua is RDF,
the Resource Description Framework, or a variant of RDF.
And it's a first-class citizen. It has a URI. It lives on the Web. Very
important. And it's either accessible directly or queryable behind a triple
store interface. A triple store is just like a regular database except the
currency is a triple: a subject, a predicate, and an object. And it looks like this.
Data.gov sits here. We convert it to RDF to make it useful. We load it into a triple
store. We convert it to JavaScript Object Notation. And we pump it into Google
graphics, Pivot, anything you like. That's a very simple curation procedure.
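A toy version of that pipeline, end to end (the CSV content and the URI scheme are made up for illustration; the real data.gov conversion is considerably more involved):

```python
import csv
import io
import json

# Toy CSV standing in for a data.gov table.
raw = "country,year,gdp_per_capita\nUS,2009,46999\nChina,2009,3749\n"

# Step 1: CSV -> triples, using the column headers as predicates.
triples = []
for i, row in enumerate(csv.DictReader(io.StringIO(raw))):
    subject = f"http://example.org/row/{i}"   # illustrative URI scheme
    for column, value in row.items():
        triples.append((subject, column, value))

# Step 2: triples -> JSON, ready to feed a charting library.
payload = json.dumps([{"s": s, "p": p, "o": o} for s, p, o in triples])
```

Each stage is mechanical, which is why the curation procedure is "very simple"; the hard part, as the talk goes on to say, is what you do with the graphics on the other end.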
But what we didn't anticipate was that the really important thing about data.gov is
really none of this. It's all about this. Because it's an information resource. So
the graphics part of it became fundamentally important and we were really failing
because of this curation problem around getting together decent graphics. And if
you look on the demonstration side, you see tons of demonstrations. And I know
this is being recorded, but I'm saying it. I really dislike all of them. But they're
very easy to put together. So we need some more new means. And the new
means that came to us was through a creative exchange. We're very lucky to
have an Experimental Media and Performing Arts Center. This is just a
snapshot. It sort of looks like a big wooden whale suspended in a building
overlooking Troy. It's really fantastic. You should go check out the website,
which I'll show you at the end.
But we bumped into some digital artists. And when talking to them and sort of
describing what my ambition and goal for visualization was, and I said I need to
be able to visualize as quickly as I can think and experiment. And they said, well,
that's what we needed for art. You know, when we made the transition from
physical art playing with pens and brushes and paints and building things, when
they went to digital art it nearly killed them because the tools were horrific. So
they built a lot of their own tools.
So they needed to be able to do -- they needed good creative visual tools out of
the speed of creative thought, feeling, intuition, mental representation. And
fortunately they loved programming. Otherwise we would be in -- this is a little
hard to see. Can we -- just for this purpose. This is probably the only one that
needs it. And it's a little hard to see.
So the idea behind this activity is that the artists, when they start to create, start
in a small creative space, but they want to go to a performance space. They want
to go to a large space. So one of the key characteristics of the Experimental
Media and Performing Arts Center is the intent to scale from the flat screen many
of you are looking at now, up to a black box studio. That includes any form of
projection you can think of, any form of multi-dimensional sound, 360 theater,
360 projection, a whole range of multimedia capabilities, all completely
configurable at your request.
Thank you, Lee. So what we have done is we have initiated a collaborative
project with a group called the open ended group, which is one of the cooler
names I've ever heard. And they have this very magical piece of software called
Field. I think because the primary author lives in Chicago and loves the Field
Museum. I think I just worked that out. I haven't asked him yet. And don't worry,
it looks boring here. I'm going to show you just a quick video of what Field can
do. I encourage you to go to their site. They've got videos on Vimeo that you
can go and look at their artistic works. But what we're doing is we're really
bringing the artistic realtime development of visual artifacts that can be displayed
all the way from laptops all the way into creative spaces all the way to scale.
So, there's no sound on this. And it's just intended -- now, this is realtime.
We're going to draw some lines, going to write some Python code up here in this
little box. But all these interfaces know what each of the other interfaces are
doing. So it draws a line. You can go over there and move it around, add a point
in the middle, have lines run through it.
Now, this is for artistic purposes; the intent is to have the data under this so that
you can come in at this pace, bang out a little bit of script code. It gets more
impressive towards the end. And really play around. Experiment and see the
impacts right away so you can see colors flipping back and forth adding lots of
lines and we're going to expand it here.
So you can start to imagine, if this was a seismic waveform, getting in and being
able to play around with it at the scripting level. Now, the key difference
between this and some of the tools you saw before is that in those you can only do
things that were intended to be built into the tool. By giving script-level
access with a very large toolbox of services, into which you can plug and wrap
any service or any tool that you have -- for example, Matlab can be wrapped in this
as well -- you can start creating visualizations like this. And this is all done in the
realtime that you saw there and export to PDF.
What we're doing then: we've just been funded for an exploratory grant from the
National Science Foundation. We have a set of tasks. I wish I could have shown
you the first visualization, but it didn't come out in time. But what we're
going to do is we're porting Field to Linux, so it will both run in server and client
mode. We're basically linking the linked data with Field. So for example, I'll
show you some diagrams on this in a minute. We can feed the current graphics
that come out of the linked data directly into Field for manipulation, distortion,
those sorts of things. Then we'll unscrew the Google graphics and the Pivot graphics,
screw in JSON, and query and consume raw RDF, which is where the semantics
come in. All with the idea of visualizing at the speed of thought, with this idea of
scale.
And this is really where the semantics start to reenter. And so I'll just run through
that. All right. So you remember that previous diagram where the graphics were in here.
We're going to unscrew that, put in Field, and it will consume JavaScript Object
Notation, which is a nice flexible format. The next version will talk to the triple
store directly. And the version after that will be able to access RDF content
from anywhere on the Web, so this really semantic-Webifies it. This has this
open world nature of bringing in related content as well as content that's been
populated into a triple store such as many of the stores that are being made
available.
And then the other part is to make this dynamic rather than static so that if the
data changes, the RDF is updated automatically, so you now have these tightly
coupled, interactive graphics. So that's the plan with the linked data
and Field project.
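A sketch of that dynamic coupling (the converter, data, and class here are toy stand-ins, not the actual Field or data.gov code): regenerate the RDF payload only when the underlying data has actually changed, so the graphic stays tied to the data without constant reconversion:

```python
import hashlib
import json

def to_rdf_json(rows):
    """Toy converter: rows -> triple list (illustrative, not the real pipeline)."""
    return json.dumps([(f"row/{i}", k, v) for i, r in enumerate(rows)
                       for k, v in r.items()])

class LiveGraphic:
    """Rebuilds its RDF payload only when the source data changes."""
    def __init__(self):
        self._digest = None
        self.payload = None
        self.rebuilds = 0

    def refresh(self, rows):
        digest = hashlib.sha256(repr(rows).encode()).hexdigest()
        if digest != self._digest:        # data changed: regenerate the RDF
            self._digest = digest
            self.payload = to_rdf_json(rows)
            self.rebuilds += 1
        return self.payload

g = LiveGraphic()
g.refresh([{"gdp": "100"}])
g.refresh([{"gdp": "100"}])   # unchanged: no rebuild
g.refresh([{"gdp": "101"}])   # changed: rebuild
```

Hashing the source is one cheap way to detect change; a production system would more likely listen for update events from the triple store.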
Now what I want to turn to, fairly briefly, is changing the means of conducting
science. You know, it sounds scary. Whenever I say this to scientific audiences
they're saying what the hell are you talking about? Well, there are really two
modes of conducting science. And these are very much ingrained in us.
There's the deductive approach and the inductive approach: theory through
hypothesis, comparison to observation, and confirmation; or observations,
patterns, tentative hypothesis, and theory. And that's good. We've conducted
science that way for a very long time.
The problem is that all those means of induction and deduction have been built in
to our information systems. And so I say what about abduction? And I don't
mean the criminal meaning. But I mean this is where semiotics comes in. So
Charles Sanders Peirce formulated this idea of abductive reasoning. And it's
how we used to do science. You would have a hunch. You sort of have some
idea that there's something there you want to explore.
But our current information systems guide you either inductively or deductively.
They don't allow for you to find things that you don't know are there. And it's hard
to write the tools to be actually able to, you know, confirm hunches.
And so abductive reasoning starts when you've got a set of unrelated facts.
That's the worst possible scenario for an information system. But you're armed
with an intuition that they're somehow connected. And that's a great job for
visualization: unlocking this possibility of exploration through
visualization, but leveraging the open world, semantics, and the Web. So a really
open, open world.
And this is where the information theory comes in and semiotics. So semiotics is
this nice encapsulation of the study of signs and significations of those signs.
And it's the superset of syntax, semantics and as soon as you really start to care
about it, pragmatics. Right? You can see the definitions of them, and the slides
will be available. But really, let's just look at something more concrete.
So you have a sign. It has a signifier and something that becomes signified.
When you group them together, you've got a code. And those codes are executed
according to paradigms. And the syntax is important.
So just for my Schenectady [colleague].
>>: [inaudible].
>> Peter Fox: It's a little hard to see. So this is 87 going north near the town of
Troy, and this is 87 north to Montreal, 7 east, Troy to [inaudible], and then there's
another sign here that says Schenectady in [inaudible].
This is a semiotic system. This is a set of signs, a system of signs with syntax,
combined in a way that conveys meaning, structure, and use. All right? So if
you want to go to Montreal, you know you stay on the road. If you want to go to
Schenectady, you take that exit. But how do you know how to use
the sign? Anybody? Intuition is one. Usually experience. You see someone
else use it first.
Okay. So this is great, but this is a completely analog system of signs. And
one of the big problems we have is that we've tended to think in this analog
representation of visual objects instead of in a digital one. And so this is where
the semantics of portrayal really come into it. So we're talking about a digital
world, and not just about annotation and hard coding; I'm talking about
declarative relations between elements of these signs and what they mean.
And this goes beyond the separation of content from presentation. We have
means for content semantics, but we also need means for pragmatics, the way in
which we actually bring these visualizations together, which is manifest when a
person sits and writes sets of scripts in Python and changes them around; that's
all about use. And we're in the process of coming up with vocabularies for
portrayal that really take into account all these different factors.
And, in addition, to capture visualization provenance appropriately: the order of
what happened, why line choices were made, why colors were chosen, why
representations were chosen, and their relation to each other. So that's where
semiotics comes in.
And how am I doing for time? Lee?
>>: [inaudible].
>> Peter Fox: Five. Okay. So I want to give you an example, then turn, I think
finally, to this example for science visualization. The big problem: as we started to
implement a lot of these more advanced visualization capabilities, more
advanced semantics, mashing up data, all of a sudden we're running in to the
problems that people really care about. Things like data and information quality,
data and information uncertainty, particularly bias, and the need to have
evidence for when these things start to occur in visualization.
So let me give you an example. This is my current favorite. These are a set of
four correlations -- you don't really need to know very much about what they are --
of NASA satellite data. So: longitude, latitude, correlation. Red is 1, purple is minus
1, green is zero. And so you can see, in this plot, two quantities that you would
think would be the same. That's what's being chosen here. So this one is
comparing cloud top pressure from the same satellite measured by two different
instruments. And this one is comparing the same instrument on two
different satellites.
And you'll notice these artifacts in these graphics. No annotations, no indication
that anything's wrong. So any guesses? International date line. Does the
satellite know about the date line? Does the atmosphere know about the date
line? No.
Similarly, if you know orbital tracks, this is centered around the date line with the
shape of the orbit. So there's an explanation for this that's actually a
straightforward one. But, you know, if you look at this, you might just throw your
hands up, walk away and not tell anyone. The explanation is a combination of
how the day is defined and the fact that one satellite is descending while the
other is ascending. And the times they cross the equator are different. So in
reality -- here's that other plot. This is the explanation. This is the difference
in time between when those two measurements think they're the same. All right?
So blue means there's no time difference; you're comparing reasonably. Red
means there's 22 hours' difference. So they're almost a day apart. You wouldn't
expect them to correspond, all right? So my problem is when we bury this under
"known issues": the difference of equatorial crossing time and daytime node,
modulated by the day-to-day definition, causes the included overpass time
difference, which introduces the artifact.
But why are we saying this in words? We get so wrapped up in creating these really
authentic representations of the data that we forget to include a representation of all
the other things that are important. So, you know, why isn't this overlaid with
something that's false color red? Or a big cross on it? You know, we just don't
do things like that. And we've got to be able to do it. We've got to be able to
understand where to put it and what it should look like and what additional
information.
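That kind of overlay can be made computable. A minimal sketch, with made-up numbers rather than real satellite data: mask the correlation field wherever the overpass-time difference makes the comparison meaningless, instead of plotting the suspect cells as if they were valid:

```python
# Toy grids (overpass-time difference in hours, and a correlation field).
# All values are invented for illustration, not real MODIS/satellite data.
time_diff_hours = [[0.5, 2.0, 22.0],
                   [1.0, 21.5, 0.2],
                   [22.0, 0.8, 1.5]]
correlation = [[0.9, 0.8, -0.3],
               [0.85, 0.1, 0.95],
               [-0.2, 0.9, 0.7]]

MAX_COMPARABLE = 3.0  # assumed threshold: beyond ~3 h apart, don't compare

# Replace suspect cells with an explicit flag instead of silently plotting
# them -- the "big red cross" made part of the data product itself.
flagged = [[c if dt <= MAX_COMPARABLE else "ARTIFACT"
            for c, dt in zip(crow, trow)]
           for crow, trow in zip(correlation, time_diff_hours)]

n_flagged = sum(cell == "ARTIFACT" for row in flagged for cell in row)
```

A renderer consuming `flagged` can then draw the artifact cells in a false color or hatch them out, so the known issue travels with the visualization rather than living in a caption.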
So what I want to do then is I want to really push on this idea of an abductive
information system. So what would application tools look like to let you explore
your hunches? And the real idea is to allow for abduction before you go on with
the more detailed analysis of either induction or deduction. And I'm pushing this
idea of open world and integrative information is the way to go. But there are lots
of things you have to take into account.
I'm running a little short on time, so I've got two last slides on which to be
speculative. But I want to go back to big data and the need to turn things like --
and this is where the artists, you know, just open your eyes and say, why do you
have a wall? It's just a bigger screen. Why don't you have an exhibit that you
can look through, walk around -- and we don't have one yet -- a digital
exhibit of all the different and related and integrative facts around a particular topic?
Why don't we have that? It's actually pretty easy to do. We're going to
implement one. And I don't mean immersion but experience. And we are really
not taking advantage of synesthesia. So synesthesia you taste a sound, you see
a noise. It's the intermixing of senses. And we haven't taken advantage of the
multimedia. We tend to use sound for sound and visualization for visualizing but
we don't mix them together. And we need to be able to do it rapidly.
Last -- second last slide. We need to do this at scale. Scale meaning from when
you develop it all the way up to large spaces. Because when you involve the
human in it, that's where you can start to explore things. You can look at them
from different angles. There's a perspective aspect to doing things at scale that
you just don't get sitting looking in front of a flat screen. Stereo. I haven't heard
this mentioned, but this idea that most of what we're looking at in the real world is
actually multidimensional and we collapse it back down into two dimensional and
then we teach the computer to try and trick us into thinking that it's
multidimensional. Why do we do that? We have to sort of think through that.
And so the goal is to restore this idea of abductive reasoning and so it's going to
change how science is done both for specialists and non-specialists.
And this has to be an informatics approach. You know, there's a tendency for the
techies to get in and try and bang together some cool tools, but it has to be
integrative. It's cognitive science, social science, library science, computer
science, the science itself and the engineers. So there has to be collaboration.
And my view is we're certainly ready to play.
And if you'd like some more information, there's my e-mail and our website. The
Open Ended Group is really worth checking out. And the links to open government
data and EMPAC. Thank you very much for listening.
>>: Thank you, Peter.
[applause].
>>: Very thought provoking talk. Are there questions for Peter?
>>: When you said in one of your examples not an immersive environment but
nonetheless a multidimensional environment, I'm not quite sure what you mean
by that.
>> Peter Fox: So immersive environments tend to be directly related to an
individual getting immersed in something, in an experience, whereas in an exhibit
style there's interaction with other people looking at the things at the same time
they are. So immersion is really intended to direct your view. Exhibit is
meant to broaden your view. And so it lets you see things that you weren't
necessarily intending to see whereas immersion really is intended to go in that
other direction.
A little bit of a generalization, but that's the way we're looking at it. Uh-oh.
>>: [inaudible].
>> Peter Fox: Yeah, here. Job safety.
>>: No, no. I've been thinking a lot about linked data, and using RDF sort of
implies you need some sort of ontologies for the words you're putting in there. So
are you defining ontologies for each of these government datasets? How much
effort does it take to go from a dataset of raw data to make an RDF version?
Isn't that the critical thing? Isn't that what you've got to automate?
>> Peter Fox: It is. And so actually, really because of experience, we started all
the way from literally just translating comma-separated values, with headers
carrying the meanings of each of the columns, into RDF representations --
so basically translating the schema, whatever the schema was, and just
reproducing the names of the schema. And then looking at the metadata to
establish minimal relationships between the class concepts or property concepts.
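A minimal sketch of that "lightweight" translation, assuming nothing beyond the CSV header and one small metadata hint (the namespaces, header names, and data are all hypothetical):

```python
import csv
import io

# Toy CSV; "dc:" stands for Dublin Core, "ex:" is a made-up local namespace.
raw = "station,date,temp_c\nALB,2010-05-01,18.2\n"

# The only semantics added up front: map a header onto a well-known
# vocabulary term where the metadata tells us to; everything else just
# reproduces the schema name, exactly as described above.
metadata_hints = {"date": "dc:date"}

def header_to_predicate(header):
    return metadata_hints.get(header, f"ex:{header}")

triples = []
for i, row in enumerate(csv.DictReader(io.StringIO(raw))):
    subject = f"ex:row{i}"
    triples.extend((subject, header_to_predicate(k), v)
                   for k, v in row.items())
```

This is the low end of the spectrum the answer describes: no ontology up front, just schema names as predicates, with richer relationships layered on later as the knowledge builds up.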
As you can tell, that only gets you so far. Now, if Jim Hendler was here, he
would just keep saying lightweight, lightweight, lightweight -- you know, don't go
overboard, don't overdo the semantics. And I would agree. And the reason
is one thing that we've done is we've tended to be too rigid in how we define the
knowledge relationships. And that goes against what I said about induction and
deduction. You want to put in as few relationships as you can and be able to
explore. Especially for large data. Once you start to build up the knowledge,
then, yes, ontologies can come in, you can get richer integration, you can do way
more things. So there's a very broad spectrum we have to be able to
tolerate. And the good news is the tools on that low end of the spectrum are really
getting pretty good now.
>>: So you don't have to start by defining an ontology?
>> Peter Fox: No, don't do that. Definitely not.
>>: You don't have to rewrite the schemas of the comma-separated values or the
relational database in terms of vocabularies?
>>: [inaudible].
>>: You do not have to rewrite the ontologies, but you do have to rewrite the
schemas in terms of the RDF vocabularies, right, Dublin Core and [inaudible] and
what have you?
>> Peter Fox: If at all possible, that gives you lots of leverage. The reason why
this point is important is that occasionally people develop schemas to structurally
store their data that actually are completely logically inconsistent. And you
probably don't want to translate something that's logically inconsistent to a form,
another form, because it's still going to be logically inconsistent. So we do --
now, back to this -- we have tools for doing sort of very rapid automation of
converting CSV into RDF. Those tools are all on the site. We would love you to try
them out, so that you can start to investigate exactly these things, run some
queries, see if it makes sense. It's up to him, not me.
There's not a hand up there, but I'm going to presume that --
>>: My question is along the vein of what she was talking about. Are there going
to be efforts -- there's the W3C -- are there groups that are working on
creating the namespaces and registering them and maintaining them? Is there
going to be --
>> Peter Fox: Yeah --
>>: -- cooperation as the project goes forward?
>> Peter Fox: Yes. So W3C is coordinating namespaces, in particular URIs.
There's a group in the government coordinating namespace naming at the
moment. I encourage everyone in the community if you're interested in this to
get involved and have a say. Because it actually is a fairly formative time
because we want these URIs, these first-class objects to be there and continue
to be there. And getting the naming as right as possible is actually pretty
important.
>>: So I'd like now to introduce my [inaudible]. Lin is a senior software
developer and test lead on Silverlight here at Microsoft. Having spent
several years building a variety of data visualization tools within the Live Labs
group at Microsoft, the PivotViewer project she leads has recently graduated to
become part of the developer division's Silverlight software development kit.
Before joining Microsoft, she worked as a developer with IBM's supercomputing
division. She holds a BA from Harvard University in biomedical engineering.
And I'll turn it over to Lin to tell us more.
>> Jennifer Lin: Thanks. Thanks so much, Lee. This is great. All right. It's
really fabulous to follow Professor Fox, because so many of the motivations for
the research and development that we've done for PivotViewer follow right in line
with what he was speaking about: basically being able to translate data from
something that's just abstract and unformed into something that's more like
knowledge that can be acted upon.
So what I'm going to be introducing to you is the PivotViewer control. And let me
be totally up front that none of this data is mine. So basically I have some
colleagues that have been kind enough to share some fabulous datasets that
really highlight the ways that you can use PivotViewer with scientific data.
All right. And this is a quick screen shot of PivotViewer in its glorious reflective
form. And our first dataset. This data is provided by Professor Ilius Sazlaski
[phonetic] of UCSD. He is an environmental informatics researcher and has been
working with numerous different organizations. This particular dataset is from
Conservation International. They're working with the Bill and Melinda Gates
Foundation to investigate climate impact on various species.
So what we're seeing here is output from an array of camera traps around
Tanzania, and they're motion-detector triggered. And basically data around each
of these animal sightings is collated and gathered to kind of just look at different
trends in what we're seeing around that sensitive part of the world.
A quick tour of PivotViewer. We have a filter pane here which -- hold on. This is
a little bit off screen, unfortunately. There we go. That's a little bit off screen.
Let's go back. I'm trying not to touch that too much. All right. So continuing our
tour.
This is the [inaudible]. It contains metadata around the visual representation you
see here. And each photograph represents one sighting of one animal on one
occurrence. And when you're zoomed in, you can kind of get some metadata
around basically the latitude and longitude of the camera in the array, the time,
and information about the animal sighted.
They broke down their data according to class. So basically they have various
different classes, each of these buttons here. I'll just stick to the first one.
So what's nice about PivotViewer is that it will re-render your data
according to the facets that are provided in the metadata.
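The faceted re-rendering described here can be sketched in miniature: bucket items by one metadata facet and let the view regroup accordingly. This is just the underlying idea, not PivotViewer's actual implementation; the field names and sightings below are made up.

```python
from collections import defaultdict

def group_by_facet(items, facet):
    """Bucket items by the value of one metadata facet,
    the way a faceted viewer regroups a collection."""
    buckets = defaultdict(list)
    for item in items:
        buckets[item.get(facet, "unknown")].append(item)
    return dict(buckets)

# Hypothetical camera-trap sightings; fields are invented for illustration.
sightings = [
    {"species": "leopard", "temp_c": 24},
    {"species": "baboon", "temp_c": 31},
    {"species": "leopard", "temp_c": 26},
]
by_species = group_by_facet(sightings, "species")
print({k: len(v) for k, v in by_species.items()})  # -> {'leopard': 2, 'baboon': 1}
```

Switching the facet argument (say, to a temperature bin) is what produces the different layouts the demo walks through.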
So the researchers that were working from Conservation International found
some interesting trends in their data. You sort according to, for example, the
temperature at the time of the spotting. There's a pretty reasonable bell curve of
what temperatures animals are spotted at. But if you actually get to the genus
level -- of species, here we go, it's even better with species. If you look at the
granularity of different species, you find that they have distinct distributions
according to temperature. So basically the researchers were able to look at this
data and say, borrowing from Professor Fox, I have a hunch. You know, there
must be something to this. Perhaps the comfort zone of these animals will center
around certain very precise temperatures.
The fact that it was very precise and specific was surprising, and the fact that it
differed between species was surprising. And [inaudible] that will be published in
an article in Nature in the future.
Similar exercise, but also quite interesting. If you look at each species and the
moon phase when they were sighted, there's another fairly strong and surprising
correlation. For example, this is a relatively even distribution, but
still clustering around the end of the moon phase cycle. This is kind of bimodal.
And this probably would be [inaudible] if you had more data points.
So hopefully that gives everyone an introduction to the control and kind of how it
can be used when you're looking at a dataset that wasn't in any way, you know,
prepped for a specific visualization.
Let's talk a little bit more about PivotViewer. Okay. So PivotViewer is hosted
inside of Silverlight, which is in turn hosted inside of a webpage or a scripting
engine. So therefore all of these visualizations live in the cloud. Some early
attempts at similar visualizations were done by the team using WPF and a client
app that required installation. And the major feedback we got was: this is not
platform agnostic, you know, let's get this into a Web form.
All right. Now I'm going to move on to an example that kind of comes from the
genetics field. This data is courtesy of my friend Beatrice Dias Acosta, who is in
the back of the room. So thanks, Bea.
Just going to give you some background in genetics, because we're all from many
different backgrounds. The human genome has 23 chromosomes and contains
three billion base pairs and 25,000 distinct genes. To get a sense of the scale
between a chromosome and a gene -- or actually a nucleotide, which is on the
far left -- right -- left, left for you -- you have to do many, many, many levels of
zooming to get from something as, you know, microbiologically large as a
chromosome to the actual base pairs, the A, G, T, and Cs.
What's very difficult about this problem, when you're looking at it from a data
visualization perspective, is that there's a lot of noise. The signal-to-noise ratio is
problematic, and it's one dimensional. So basically you want to be zooming
through what could be essentially garbage to a geneticist until you actually get
to the jewels of the genes hidden inside.
So we have an approach that represents the genes using a trading card. That's
what we call the block on the left. It has some information about the name, the
location within the chromosome, a description, and a protein sequence -- or a
series of protein sequences.
So each of those colorful blocks is an amino acid, which is encoded by a trio of
base pairs. Basically interchangeable as far as the information they contain, but
just giving a more concise visualization.
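The mapping from base-pair triplets to amino acids that underlies those colored blocks can be sketched as a codon lookup. The table below is deliberately tiny (a real genetic-code table maps all 64 codons), so this is only an illustration of the encoding, not a usable translator.

```python
# Tiny, deliberately incomplete codon table for illustration;
# a real table maps all 64 codons to 20 amino acids plus stops.
CODON_TABLE = {
    "ATG": "M",  # methionine (the usual start codon)
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",  # glycine
    "CCT": "P", "CCC": "P",  # proline
}

def translate(dna):
    """Translate a DNA string codon-by-codon into amino acid letters,
    using '?' for codons missing from this toy table."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "?"))
    return "".join(protein)

print(translate("ATGGGTCCC"))  # -> MGP
```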
And I just want to give you kind of a sense of the different experiences, from Pivot
through another MSR-based tool called Genozoom, and the much more commonly
used UCSC genome browser.
Okay. So here we have a representation of the human genome. Each of these
items represents one chromosome. So basically we have the numbered
chromosomes through to the X and the Y; the sizes give kind of the relative scale
between the different chromosomes. And the colors represent the density of genes
that are found in each of them. So I want to show you something interesting --
a few interesting things about chromosome number two.
One quick way to kind of drill down into this data is to look at it according to the
starting base pair, which is a nice analog to its position within the chromosome.
And once you're in this experience you can just kind of do the zooming, panning
exploration to see what -- you know, what kind of patterns might emerge to look
for that kind of intuitive hypothesis discovery.
I've done this before, so I know what I'm looking for, but I'm not finding it yet.
Okay. So I want to go at it backwards, because it's a little hard to see on this
monitor. Okay. So let's say that I am interested in looking at collagens for my
research. All right. So in chromosome two, here are some examples of genes
that have collagen in their metadata.
One thing kind of strikes me about this particular gene, and that is that the
sequence of Gs is very diagonal. It seems to be following a pretty set pattern.
And what's interesting is, you know, just playing around, it looks like the next one
has a similar pattern. Perhaps -- oh, yes. And here, this one even has some
of it down in here. So perhaps there's something to this.
So out of curiosity, I want to test my hypothesis against this dataset, just to kind of
see: is there something about striping patterns and collagen? I tend to like
chromosome 9. Let's check this one out too. All right. So again you kind of see
this pattern of, in our representation, stripy white regions. And, in fact, when
Beatrice kind of went through this exercise herself, she went and talked to a
geneticist to ask why collagen keeps kind of showing up in this way, and the
response was: well, it's a structural protein, so of course there would be some
sort of repeating element to it. That kind of made sense and jibed from a
genomic perspective. So compared to other visualization tools, just being able to
kind of search through metadata is a convenient feature. The way this data is
set up, you can also search for base pair sequences. So let's say I wanted to
find repeating -- just some sort of repeating G sequence. Okay. So that's
pretty common.
Maybe look for an even more repeating G sequence. And then you can do a certain
narrowing down to say okay, these are basically representing similar
characteristics; can I find something that's also common between these genes?
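The repeated-G search described in the demo is essentially a run-length pattern match over the sequence. A minimal sketch, assuming a plain string of bases (the sequence below is made up):

```python
import re

def find_g_runs(seq, min_len=5):
    """Locate runs of at least `min_len` consecutive G bases,
    returning (start position, matched run) pairs."""
    return [(m.start(), m.group()) for m in re.finditer(f"G{{{min_len},}}", seq)]

# Made-up sequence for illustration.
seq = "ATGGGGGGTACGGGGGGGA"
print(find_g_runs(seq))  # -> [(2, 'GGGGGG'), (11, 'GGGGGGG')]
```

Raising `min_len` is the "look for an even more repeating G sequence" step: it narrows the hits to genes sharing the stronger characteristic.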
Okay. I wanted to give some context around similar tools that are used for this
kind of application. I'll start with looking at the UCSC genome browser.
So basically this is based on the same data, the same database as we were just
looking at. But these experiences are widely different. So in order to zoom in --
I didn't mean to zoom in that far -- basically there are many clicks required. So say
10X, 10X, okay. Let's keep going in. Keep going in. Maybe somewhere along
this line if I click at something here. And then finally I come to some metadata
about that region of the gene that I was looking at.
Just wanted to kind of give you a representation of what people are using today,
and how much you lose the context as you're clicking through items. When we
were looking at PivotViewer, you could see that there were things laid out
according to the metadata that you've provided. Here this is just click click click,
discrete interactions, much less pleasant. And the user often has to wait,
because it takes time to load Web pages.
And I also want to show you Genozoom, which is another MSR product. What
Genozoom tried to do was to look at the same kind of contextual fidelity as
we had with Pivot, but with an experience that was more tuned to the
one-dimensional nature of a gene -- basically, you know, chromosomes are just
one long, very, very long string, so how do we kind of take advantage of
that fact?
So this is an E. coli chromosome. Kind of pan around a little bit. Zooming. And
this slice represents what you see in this view. And then this view can be
moved around to kind of look for something in particular. And then this down
here is giving you the level of the base pairs that we were talking about before.
And what's nice about this view is we have the percentage of Gs in that area of
the genetic material, and that tends to be a marker for a higher likelihood of
finding a gene. So, you know, maybe it would be worth just kind of panning
around and saying okay, where are these hotspots with the high red G quotient,
and kind of investigating whether or not there's an opportunity for investigation.
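The G-density track she describes amounts to a sliding-window base count over the sequence. A rough stand-in, with a toy sequence and window size chosen purely for readability:

```python
def g_fraction_windows(seq, window=4):
    """Fraction of G bases in each sliding window of the sequence --
    a rough stand-in for the G-density 'hotspot' track described."""
    return [
        seq[i:i + window].count("G") / window
        for i in range(len(seq) - window + 1)
    ]

seq = "GGGGATAT"
print(g_fraction_windows(seq))  # -> [1.0, 0.75, 0.5, 0.25, 0.0]
```

Windows with a high fraction would light up as the red hotspots worth panning to first.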
So this is kind of giving you a little bit of contextual information about the state of
genomic data visualization. I want to talk a little about PivotViewer itself. There's
an API that provides two-way interaction between the control and your Silverlight
application. So basically you can learn about the user's interaction with the control,
and you can also provide data to the control. I will say this: in the version
that's currently publicly available, everything is static. So all the data
that I was showing in the control earlier is represented in a CXML file, or is built
just in time on the server and provided that way.
So basically everything is either server intensive or static. And what we're going
to be doing for the next release, Silverlight 5, is providing an API that makes the
entire experience programmatically drivable. So you can add items, remove
items, change properties on the fly. This will hopefully lower the barrier to entry
for producing experiences in PivotViewer; we found that just static data
doesn't cut it with the services that we're trying to work with.
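The add/remove/change-on-the-fly model she outlines can be sketched abstractly. This is a toy, language-agnostic model of a programmatically drivable collection, not the Silverlight API; class and member names are invented.

```python
# Toy model of a dynamically drivable collection: mutations bump a
# version counter that a viewer would watch to know when to re-render.
class Collection:
    def __init__(self):
        self.items = {}
        self.version = 0

    def add(self, item_id, props):
        """Add an item with its metadata properties."""
        self.items[item_id] = dict(props)
        self.version += 1

    def remove(self, item_id):
        """Remove an item if present."""
        self.items.pop(item_id, None)
        self.version += 1

    def set_property(self, item_id, key, value):
        """Change one property on the fly."""
        self.items[item_id][key] = value
        self.version += 1

c = Collection()
c.add("gene-1", {"name": "COL1A1"})
c.set_property("gene-1", "chromosome", "17")
print(c.version, c.items["gene-1"])  # -> 2 {'name': 'COL1A1', 'chromosome': '17'}
```

The contrast with the static CXML approach is that here the viewer's contents are mutable state rather than a file baked ahead of time.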
This is a little bit about the collections. I apologize if I went through the demos
too quickly and just made too many assumptions. So just walking through what
constitutes a PivotViewer collection: the zooming is made possible by using the
Deep Zoom format for the imagery. That's a Silverlight-specific
platform piece. And the first example I showed you is a simple collection.
There's no connection between different datasets. It's just a single dataset.
The second one I showed you, if you remember, there's a chromosome representation,
and then when you click on it, it brings you to the detailed gene representation.
So this is what we call a linked collection. There are lots of possibilities to
intermingle datasets using either approach. But just kind of giving you a frame of
reference for how we think about this.
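The simple-versus-linked distinction can be sketched with two toy datasets: in a linked collection, an item carries a reference into another collection's detail view. The structure and names below are invented for illustration, not the real CXML schema.

```python
# Toy illustration of a "linked" collection: the chromosome item
# points at a second collection holding its gene detail cards.
chromosomes = {
    "chr2": {"name": "Chromosome 2", "link": "genes-chr2"},
}
gene_collections = {
    "genes-chr2": [{"name": "COL3A1"}, {"name": "COL5A2"}],
}

def follow_link(item):
    """Clicking a linked item opens the collection it points to;
    a simple collection (no link) just returns nothing further."""
    return gene_collections.get(item.get("link"), [])

detail = follow_link(chromosomes["chr2"])
print([g["name"] for g in detail])  # -> ['COL3A1', 'COL5A2']
```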
And then I just wanted to give you some links for resources to take advantage of
PivotViewer now. And I'm sure I'm over time because I had my little snafu with
the projector, but I would love to hear questions if anyone has some.
>>: Questions for Jen? First a round of applause.
[applause].
>>: We love Silverlight PivotViewer so it's great. Great tool.
>> Jennifer Lin: I can't take all the credit. There's some people back there who
have to share it too.
>>: I wanted to ask -- it's amazing to see that there was some orders-of-magnitude
change from the chromosomes all the way down to the genes, the
nucleic acids. I was wondering, for an example like a query of a virus genome
against the database, how would it look to do those kinds of queries, you know,
rather than just looking for the repeat patterns -- more sophisticated searches? Is
there anything that has been discovered through this visualization, or -- what
are the advantages when you do that type of work?
>> Jennifer Lin: Yeah. I definitely think that once we have the dynamic
capabilities it will be fairly seamless to take the metadata in richer sources and
integrate it into the experience and just do queries that dynamically on the fly
provide results. The limitation of this example is that it was statically collated. So
basically there was a richness to the data but it was a static richness. And I think
that once we have our next release it will be much easier to provide experiences
that can query a database for something specific and come back with like here's
collagen across different species, like here's what they could look like, things like
that.
So I guess I was looking for kind of examples that showed some serendipity --
like, here are things you can explore and see just almost as an amateur looking at
these sciences. But the experience should be richer with an expert eye.
>>: Do you happen to know what happened to the getpivot.com site? It's gone.
That URL doesn't work. And there were all these terrific collections out there, like
the Sports Illustrated and the classic dog breeds and the cars and all those, and
they're just gone. And a lot of our demos are broken as a result.
>> Jennifer Lin: Okay. I do apologize for the transition. This was developed
under Live Labs and then we moved to Silverlight, and in all the enthusiasm some
of the Web resources were -- not lost, I won't say lost, but they were moved.
I believe that all the collections still exist. But the link is long. And I believe if you
go to the download page -- hey, Angela, I'm sorry to bother you during the
presentation. If you look at the download page is there a link to the collections
from there at this point?
>>: No, but there is the Microsoft.com//silverlight -- I can get that URL.
>> Jennifer Lin: I'm sorry. Yeah.
>>: We can get the URL to you.
>> Jennifer Lin: Yeah. I will get you the URL. That's totally fine. Yeah.
Unfortunately with the transition some things got jumbled. But definitely it's all
still -- the collateral is still there. And I apologize about the demos getting broken.
>>: I don't know.
>> Jennifer Lin: Hello, professor.
>>: Thank you. I don't know if you can do this, but do you have any
instrumentation capabilities in PivotViewer to trace sort of usability patterns,
and sort of see how people browse and select, and where they pause and where
they don't?
>> Jennifer Lin: Yeah.
>>: And do you have stats on that?
>> Jennifer Lin: Yeah. We've done usability studies in house. So we do have
some data about how people interact with it. If there's a specific site that has
data, and you want to see the usage of that site and that data, there are events
for when items are clicked, when you filter, when the filter state changes, and
basically when the collection view changes. And so you should actually be able
to do some home-grown infographics or information about that.
There's nothing that I could think of that's kind of more global that we've
produced, but I will take that as a future request for maybe some simple code in
the future.
>>: Okay. Thanks.
>> Jennifer Lin: All right. Thanks. Thanks, everyone.
>>: Thanks very much.
[applause].
>>: I'd like to introduce Jeff Falgout. He's the senior systems administrator for
the US Geological Survey Center for Biological Informatics, CBI, in Denver,
Colorado. He manages [inaudible] infrastructure that supports five district
programs and over 100 websites. He's responsible for daily infrastructure
operations, including the configuration and support of servers and applications in
compliance with government security regulations.
In addition, Jeff leads long-term infrastructure planning activities that support the
needs of a diverse USGS bioinformatics community. He has over nine years
experience working in government.
Prior to joining the USGS in 2007 he worked as a system administrator for both
the Bureau of Land Management and Jefferson County Colorado Information and
Technical Operations.
He holds a bachelor of science in biology from Northwestern State University in
Louisiana.
>> Jeff Falgout: Okay. Thank you, guys. I am the last thing standing between
you and going home, so this is a pretty precarious situation, I guess. But thanks
for having me. And I appreciate you guys being here.
So a quick outline of what we're here to talk about: quick organization background.
Most people don't know or aren't aware that USGS also handles biological data,
ecological data. We're not just earthquakes and water. Then of course biological
informatics challenges -- we are living some of the challenges Peter was talking
about a few minutes ago, with 100-plus years of data -- and some of the things
we're trying to do to solve these challenges. We're headquartered in Reston,
Virginia, outside of DC, responsible for informatics activities within the USGS.
We support a diverse group of programs, including the National Biological
Information Infrastructure, which I'm sure a lot of you have heard of
if you're involved in biology. The GAP analysis program, which is an effort to keep
common species common; in other words, for the species that aren't on any sort of
management plan, we want to keep them off management plans like the one for
endangered species. We do some cooperative work with the National Park
Service to map the vegetation characterizations within each of their parks.
And some of the other programs we support include the Integrated Taxonomic
Information System, which is one of the taxonomic authorities. And it's in
partnership with the Smithsonian Institution to actually be one of the top authorities
for genus-species lookups.
We have international partners and global partners along with DataONE, which is
an up and coming effort and a pretty big deal for us. 60 people across the
country.
For those who aren't familiar with informatics -- this is of course borrowed from the
ecoinformatics field -- we are the intersection between modeling, analysis, and
synthesis of ecological data; the raw ecology or biology science; and
information technology. And I'm looking at this from the information side.
The tasks within biological informatics -- I'm not going to
read through this, but it gives you an idea that there are so many different parts to
bioinformatics. What is the National Biological Information Infrastructure?
Marketing stuff.
Contributors and users: we go all the way from federal governments and
international governments all the way down to private citizens sometimes. We
run the gamut. And of course that leads to disparate datasets, as you can
imagine.
We are a distributed network. Some nodes are based on regions, geographic
ones especially; some are national themes -- fish, some birds. And then we do have
some infrastructure nodes, including places that provide computing
infrastructure for us. And one of our big ones is the Oak Ridge National Lab in
Oak Ridge, Tennessee.
Some example projects you can see. We have a ton of stuff that comes down
our pipe. I'll get into that.
Our diagram of what we try to do. As you can see, we have a lot of data holdings --
not necessarily holdings within our infrastructure, but with remote partners. We try to
provide some data access and geospatial services, and then some visualization
services on top of this. And if you look at other visualizations, they would be
inserted into this between the top two layers there. And we can show some
modeling results towards the end, which isn't our work, but it's some stuff from
DataONE. But we're trying to move more towards expanding that distributed
services model there.
Of course we all have challenges, and ours are not small. Metadata, metadata,
metadata. Everybody for the most part throughout the day has been talking
about metadata. And we really, really, really rely on metadata. We do a lot of
work. We have a clearinghouse with approximately 100,000 records in there
pointing to datasets.
The tools you see there -- sometimes there are way too many, and deciding on them
is a challenge, and you end up going down a path and have to rerun that path
because you come to a point where a tool just didn't quite work out
for you, or the tools aren't adequate enough. So sometimes it's too many, sometimes
not enough.
The culture issues: biologists are known for not wanting to do anything outside
the field. So we have to change that culture. And that's becoming a real --
a big challenge, and it's bigger than we ever thought it would be, simply because
they're not focused on dealing with data and data management. They are out
there doing research, and money's tight, and they want to get that money for the
research and the data collection, and not necessarily deal with data
management. So we have to change the culture. And part of that is providing tools
for them.
Standards -- of course everybody relies on standards. Infrastructure. And then
a lack of lots of stuff. And then data silos -- sometimes these silos are made of
titanium and you can't get through them. So we are trying to bust through some
of those.
And then of course our nemesis is large numbers of small datasets, because
each one of these datasets has its own schema. And then sometimes they're
in an Excel spreadsheet with three different sheets in it. Sometimes they're a
little bit bigger than that, but they're as unique as each personality. So we struggle
with that a lot.
Some examples of the projects and the ways we're trying to address these
challenges. Species mash-ups. I think of mashed potatoes every time I see
this, but essentially that's the point: we're taking information from different
data sources and trying to put it all into one view. Here you can see lookups
on -- I can't even remember what this is, I think it's a bullfrog or something. And
then drilling down to information on that taxonomy or that species. And some of
this stuff is pulled from our data holdings within NBII and CBI. And some is
pulled from other places like GBIF.
Species of concern by state. So each state typically has to file a state wildlife
action plan for species of concern. And what we found is that as we
visualized and presented what states were doing for their
wildlife action plans, other states didn't realize that the state next to them was
doing the same thing. So we kind of exposed that to them, and the states
adjusted what they were doing. Because, you know, animals don't pay
attention to political boundaries. So sometimes those action plans overlapped
each other.
And of course we can find even more information in what's called GBIF, the Global
Biodiversity Information Facility. And that's a framework that has over 200 million
specimen records. So it's a massive data store, and we certainly don't want to take
on the task of holding that information. But we do link out to the Web services
so we can get the information from it. And these are specimen records from
museum occurrences, not necessarily observations in the field.
Going further into these species mash-ups, you can see the ecology of the animal
and then what other datasets are referenced to it. And I mentioned the
clearinghouse before. And this is that metadata information, which we're pretty
hot on. And it's the result of decent metadata records that we can find some
of those datasets.
In addition to what we showed here, we're working on pictures for stuff and then
you can see the Google results, also.
Another big deal for us is the Ocean Biogeographic Information System, and
that tries to give views into datasets from ocean -- or marine work. We have
information from the west coast, the east coast. We have, just for an example,
data from Woods Hole from 1903 to 1909. We have just large numbers of datasets.
So what this is trying to do is dissect the data so it's easy to wrap your head
around, instead of just looking at spreadsheets or links to data. This gives you an
idea of where things are. Another view of that. And then you can drill down to
taxon or -- yeah, so you can just drill down to species if you need to.
And this gives you a geographical view of data observations for a single species.
So you can see it's a lot of information there and graphically represented seems
to be the most efficient way to do it.
We also run something called EKey. And what this does is give
people an easy way to identify a fish they've found or the fish they've caught.
Sometimes they look the same. And then once you identify it, you open the
world for further research on what you've got, and you can find links to other
datasets.
A geographically based representation. This one happens to deal with exotic versus
native species, and the number of observations and the locations on a map. Again,
graphically representing information seems to make a lot more sense to a lot of
people, and especially decision makers who don't have time to dig into this stuff.
This really comes in handy.
And of course we don't forget our metadata people, because they are critical to
us. So what we've done is create visualization tools for the metadata QA/QC
process. You can see on the top left the density, or the top right the density. You
can see where our metadata record providers are, and the data providers also.
Of course there's a Many Eyes reference to what our clearinghouse holds.
We also have a dashboard for the metadata QA/QC people. They can see
broken links, they can see what searches are popular at a certain time, they
can see missing fields. So that's really increased a lot of QA/QC capability,
making sure that a metadata record is better than it was when it
came in. They can go back to the initial submitter and say: you need to provide
this information for me before we put it out there, because nothing's more
frustrating than an incomplete metadata record.
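One of the QA/QC checks described, flagging missing fields so the submitter can be asked to fill them in, can be sketched simply. The required-field list here is hypothetical, not the actual clearinghouse schema.

```python
# Hypothetical required fields for a metadata record; a real
# clearinghouse would enforce a standard such as FGDC metadata.
REQUIRED = ("title", "abstract", "contact")

def find_missing_fields(record):
    """Report which required fields a metadata record is missing
    or left empty -- the kind of gap a QA/QC dashboard surfaces."""
    return [f for f in REQUIRED if not record.get(f)]

record = {"title": "Camera trap survey", "abstract": ""}
print(find_missing_fields(record))  # -> ['abstract', 'contact']
```

Running a check like this over every incoming record is what lets the QA/QC team bounce incomplete submissions back before publication.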
And of course the other advantage of the clearinghouse is that once you've done
that search for that metadata record, now you can go off and find all kinds of
information, of course linked back to GBIF, the Global Biodiversity Information
Facility.
We've also looked at revamping the way data mining is done. We've decided to
bring some visualization into it, as far as geography and pictures, and also
allowing dynamic refining of results. If you see what's called clusters up there,
we integrate a thesaurus Web service in there to help you: say you searched
on ecology -- did you want ecosystems, also? So that provides some suggestions,
and it gives you the capability to refine. Also, a geographic search sometimes is
very helpful.
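The thesaurus-backed suggestion step can be sketched as simple query expansion. The miniature thesaurus below is made up; the real system calls out to a Web service backed by a controlled vocabulary.

```python
# Hypothetical miniature thesaurus; the real system would back this
# lookup with a controlled-vocabulary web service.
THESAURUS = {
    "ecology": ["ecosystems", "ecological systems"],
    "bird": ["avian", "aves"],
}

def expand_query(term):
    """Offer related terms, so a search on 'ecology' can also
    suggest 'ecosystems' as a refinement."""
    return [term] + THESAURUS.get(term.lower(), [])

print(expand_query("ecology"))  # -> ['ecology', 'ecosystems', 'ecological systems']
```

An unknown term simply comes back alone, so the suggestions never get in the way of an exact search.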
A different view into that. You can also drill down into who is providing the data,
who has published it. And then of course link off to more records and more
information.
Of course at USGS everything is a map. And what this does is give you a visual --
or map representation of the species information we have.
So we've also gone in and looked at IBM's Many Eyes to see what we could do
there. Unfortunately, I think it's still considered a research project and not
really for production use. But we've done several things with it, just to give you
an idea of what we've done. And these are some of the things we've done with
visualization. We've tried to enable visualization more than actually produce the
visualizations, but we can't get away from producing them ourselves.
My next three slides are some work that was done at DataONE, which actually
shows the visualization process and the data integration process. eBird is a
site that allows citizen scientists to submit individual bird sightings. And I believe
right now they're collecting about a million records or more a month. And so that
gets the end user, the citizen, involved in the data collection process.
And they've also taken land cover imagery, meteorological information -- boy,
I can't say that -- and MODIS information. I believe the MODIS information they
had to process was 200 terabytes alone. And they've taken the intersection of all
that data with the help of the TeraGrid, which is basically a cloud-based
supercomputer. A lot of that comes from ORNL and some of the other
supercomputer providers.
And they've built this model. And what this model is is a representation of a
prediction of the distribution of the indigo bunting. And you can see the potential
uses there. But what they've done is try to ground-truth this model, and they've
found that it's highly accurate in predicting where things should be.
And this is important, because now they can play with values. What happens
when the green-up occurs two weeks earlier, two weeks later? What happens in
the event of a drought? So as far as the ground-truthing goes, you see the
traditional distribution map that's been accepted previously. And this is the
estimate based on that previous work: eBird, the supercomputer, MODIS, and
the meteorology work. And you can see how accurate that is. And it's almost even
more accurate, because it gives you the center of distribution, or the highest density
of the occurrences of those -- of the thrush.
Now, as they run these models, the first thing you see, right, is: why is that
happening? Why is the bird showing up in central California earlier in the
year than in the rest of the country? So of course with new discoveries come new
questions. And they've gone and researched it, and they believe that it's a
change in land use -- in the agricultural use in Central and South America -- that's
allowing these birds to be closer to North America.
So without that visualization, that data, how would you really spot that and make
that apparent? That's obvious in two seconds.
So that's pretty much all I got. If you have any questions.
[applause].
>>: Questions for Jeff?
>>: The eBird project is part of the DataONE NSF project. Are you talking to
them about their -- they have a plan for looking after bio and ecological
databases?
>> Jeff Falgout: So eBird is actually a project of Cornell, but the principal
investigator, Steve Kelling, is on the DataONE team. And Mike Frame is a
principal investigator with the DataONE team. He's also on the leadership. So
we are intimately involved in the DataONE project right now, uh-huh.
>>: Any other questions? All right? Well, then I will hand it over to Roberta to
wrap up. And thank you again very much, Jeff.
[applause].
>> Roberta Shaffer: Well, I want to start with thank-yous. I want to thank all of the
presenters today. I think it was an excellent day and we all learned a lot. I'd like
to thank our wonderful host and inspiration, Tony Hey. And of course, what we
would have done without Lee, I don't know. So we're very grateful.
And then of course the ICSTI team, Brian and Tony and Herbert and Bernard
and Elizabeth and everyone. So I really want to express our gratitude for
everyone and particularly all of you. I was quite impressed that people were
up and attentive to this very moment, watching the birds.
It's been a fantastic day. I think of what I know they're now calling
techno-tourism, where you travel to different places but you actually stay in one
spot. And just to make a quick review for you, we have in the short span of
this day been to the moon and beyond. We've been several hundred feet
underground. We've been bird watching. We've been dissecting human beings.
We've been in the operating theater. We've really been all over. We have been
speaking many languages. And not only in the sense of foreign language but in
the sense of disciplinary language. And we've been wowed, we've been inspired
and I have a sense, but I'm not quite sure, that we may have been abducted to
Troy, New York. [laughter].
So now the time has come to travel home, and I want to wish you all a very, very
safe travels and say that I hope to see all of you again in the near term in Beijing.
Before we actually leave, though, I'd like to open the floor for any closing
comments. So anything you'd like to say about what you've seen, any
announcements you'd like to quickly make about things you know that could
connect people to what they've learned today, this is the opportunity, and I open
the floor to you.
I thought that Robert has something to share with us. So, Robert.
>>: Is this being recorded too or are we off the air?
>>: Probably. [laughter].
>>: Well, let me thank you all, and all the other speakers too. I mean, I had
honestly never heard of this organization before a few months ago. And it's really great
to meet many of you. And I just made a connection this morning that many of
you may not be aware of, and I just wanted to share that with you. There is an
organization called the Gordon Research Conferences. And one of -- these are
typically in chemistry and math and physics and biology. But there's a very
interesting conference that a lot of people don't know about. And I've been
going to it for maybe the last two cycles. It's an every-other-year conference.
It's in the summer, in July, the 10th through the 15th, at Bryant University in
Rhode Island. It's every two years, and it alternates between Oxford in England
and Rhode Island; this cycle around it's in Rhode Island. But its title is the
Gordon Research Conference on Visualization in Science and Education.
It's quite a unique conference. It's not a huge number of people, about
100 to 120 participants.
So it's very small. Great format. You have conference meetings in the morning,
then you have the afternoon all free to discuss and meet people. And then there's a
happy hour and a poster session, and in the evening more talks.
So it's just a wonderful week. And I would just encourage you to check into that if
that interests you at all, with a wide variety of people from museum curators,
graphic designers, chemists, biologists, all sorts of people interested in
visualization. I just get this sense that maybe there's some people in this room
who might be interested in that. So thank you very much.
>> Roberta Shaffer: Tony, I'm sorry. Oh, I'm sorry.
>>: Just a little more housekeeping, too. We're going to be posting as much of
this material as we can on the ICSTI.org website, so we encourage you to go
back to that. And just a reminder to all the speakers: if you have files that you
haven't sent already, send them to Lee, or send links in case the files are
too big or if there were videos in some cases.
>>: [inaudible].
>>: Okay. That's good. Find Lee today. And there's an agreement I
think between Microsoft and the University of Washington where this kind of
proceedings can be streamed and you can watch them and share them with your
friends. And I know ICSTI will be putting out some communications about
this, the [inaudible] organization will obviously be putting out some press releases
about our launch of Science Cinema. And Microsoft will be tweeting and putting
out some blogs of their own. And so I encourage everyone to try to use the
social media to spread the word about this as much as possible. Thank you all
for coming.
>> Roberta Shaffer: Anyone else? Safe travels and thank you again.
[applause]