SPIRE Workshop Draft Report 2-04

Federal Government
Mike Frame USGS mike_frame@usgs.gov
Lisa Zolly USGS lisa_zolly@usgs.gov
Bruce Bargmeyer LBL/EPA bebargmeyer@lbl.gov
Larry Fitzwater EPA fitzwater.larry@epa.gov
Frank Olken LBL olken@lbl.gov
Kevin Keck LBL kdkeck@lbl.gov
James Mercurio Veterans Affairs james.mercurio@med.va.gov
Not-for-profit
Grant Ballard PRBO gballard@prbo.org
UMBC CS
Tim Finin finin@cs.umbc.edu
Pavan Redivarri pavan2@umbc.edu
Poorva Arankalle poorvaarankalle@hotmail.com
UMBC GEST
Joel Sachs jsachs@umbc.edu
Susan Hoban susan.hoban@gsfc.nasa.gov
UMD MINDSWAP
Jen Golbeck golbeck@cs.umd.edu
Bernardo Cuenco bernardo@frodo.mindlab.umd.edu
UC Davis
Jim Quinn jfquinn@ucdavis.edu
Allan Hollander adh@ice.ucdavis.edu
David Kaplan dmkaplan@ucdavis.edu
RMBL
Neo Martinez neo@sfsu.edu
Rich Williams rich@sfsu.edu
Jen Dunne jdunne@sfsu.edu
Colorado State University Natural Resource Ecology Laboratory
Greg Newman newmang@nrel.colostate.edu
1. ELVIS
Neo Martinez and Jim Quinn reported on the NoCal meeting, and the napkin drawings
that came out of it. I’ll send out PowerPoint versions of these pictures. Essentially, they
storyboard ELVIS. ELVIS has two main components. The top half builds a species list
for a given location. The bottom half builds a food web for the given species list.
We broke out into ELVIS top and ELVIS bottom groups.
ELVIS Top:
The input is a time and a place. Lake Tahoe, 2004 is a leading candidate for our first
demo. Four or five sources for species data need to be tied together to come up with a list.
Some of these sources are under the purview of CAIN, and CAIN looks to MINDSWAP
for help in building a semantic web service around some of their resources.
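Tying the sources together might look roughly like the sketch below. The source names, the record layout, and the exact-match test on place and year are all assumptions made for illustration; the records are invented, and the real sources and their interfaces were not settled at the workshop.

```python
from collections import defaultdict

# Hypothetical records, invented for illustration: (source, place, year, species)
RECORDS = [
    ("source_a", "Lake Tahoe", 2004, "Oncorhynchus mykiss"),
    ("source_b", "Lake Tahoe", 2004, "Oncorhynchus mykiss"),
    ("source_b", "Lake Tahoe", 2004, "Salvelinus namaycush"),
    ("source_a", "Lake Tahoe", 1998, "Catostomus tahoensis"),
    ("source_c", "Mono Lake", 2004, "Artemia monica"),
]


def species_list(records, place, year):
    """Map each species reported for (place, year) to the sorted sources reporting it."""
    found = defaultdict(set)
    for source, rec_place, rec_year, species in records:
        if rec_place == place and rec_year == year:
            found[species].add(source)
    return {sp: sorted(srcs) for sp, srcs in sorted(found.items())}


tahoe_2004 = species_list(RECORDS, "Lake Tahoe", 2004)
for name, sources in tahoe_2004.items():
    print(name, sources)
```

Keeping the contributing sources next to each species is a design choice for the sketch: it leaves room to weigh a species reported by several sources differently from one reported by a single source.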
ELVIS Bottom:
The input is a species list. The output is a matrix with the species labeling each column,
and also each row. A “1” in the intersection of row and column indicates a trophic link; a
“.9” indicates a likely trophic link; etc. The challenge lies in inferring likely trophic links
from observed ones. Neo’s gang has some heuristics for doing this. They are along the
lines of “if A eats X and A is taxonomically close to B (for some meaning of “close”),
then infer a trophic link from B to X”. This reminded Jen Golbeck of inferring trust
relationships, and she’s going to work on Trust-o-matic
(http://trust.mindswap.org/trusto-matic.shtml) to try to get it to do the necessary work.
She’ll need a species list and some trophic data (both in OWL), to be provided by Rich Williams.
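The heuristic above can be sketched roughly as follows. This is an illustration only, not the RMBL group's actual method: the species, the observed link, the same-genus meaning of "close", the 0.9 score for an inferred link, and the convention that rows are consumers and columns are resources are all assumptions.

```python
# Hypothetical data, invented for illustration.
GENUS = {
    "Salmo trutta": "Salmo",
    "Salmo salar": "Salmo",
    "Daphnia pulex": "Daphnia",
}

# Observed trophic links as (consumer, resource) pairs.
OBSERVED = {("Salmo trutta", "Daphnia pulex")}

LIKELY = 0.9  # assumed score for an inferred (likely) link


def taxonomically_close(a, b):
    """Assumed meaning of 'close': two distinct species in the same genus."""
    return a != b and GENUS[a] == GENUS[b]


def build_matrix(species, observed):
    """Return matrix[consumer][resource]: 1.0 observed, 0.9 inferred, 0.0 otherwise."""
    matrix = {c: {r: 0.0 for r in species} for c in species}
    for consumer, resource in observed:
        matrix[consumer][resource] = 1.0
    # If A eats X and A is close to B, infer that B likely eats X.
    for a, x in observed:
        for b in species:
            if taxonomically_close(a, b) and matrix[b][x] < LIKELY:
                matrix[b][x] = LIKELY
    return matrix


species = sorted(GENUS)
matrix = build_matrix(species, OBSERVED)
for consumer in species:
    print(consumer, [matrix[consumer][resource] for resource in species])
```

An observed link is never downgraded by an inferred one, because the inferred score is only written where the existing entry is lower.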
One of the great sources of trophic data in this world is Fishbase. The RMBL team plans
to buy Fishbase on CD-ROM, mock it up as a semantic web service, and then convince
Fishbase to transform itself into a semantic web service.
2. Education
Our goal is to introduce something into the classroom each semester for each of the next
6 semesters. Our first project will be an electronic field form that publishes directly to the
semantic web.
Susan Hoban presented her vision for the astronomy classroom, and asked “how does it
map to biology?”
- assume that all students will have GPS- and WiFi-enabled PDAs
- activities that kill the “sage on the stage” model; kids construct their own
understanding
- some discussion about how much technical savvy we should expect from
kids
Poorva Arankalle gave an overview of existing electronic field form projects.
- The feeling was that we should extend Jalama (based at MIT; used by LTER) to do
what we want (namely, publish field data on the semantic web).
We digressed into a realization that we’re unlikely to build anything usable
by field biologists outside of our circle of friends. So the goal is a prototype
that demonstrates functionality sufficient to get development funds.
Ontological elements in support of electronic field forms include:
- DC
- Methodology
- equipment
- experimental design:
Experimental Design/Methodology can be a huge can of worms; we need requirements
for our Field Form project. E.g., transect vs. point count; how to infer the absence of a
species; etc.
Possible acronyms: OWL-F (OWL Electronic Field Forms); PYON (Publish Your
Observations Now); PYLON (Publish Your Lovely Observations Now)
3. SPORE
Tim Finin encouraged everyone to start using SPORE (the SPIRE Ontology Repository)
(http://pear.cs.umbc.edu/spire/v2.1/ont/). He also laid out plans for SPORE to become
a “Google for Ontologies”, indexing OWL ontologies and instance data, and developing a
notion of PageRank, which would enable ontologies to be returned according to the extent
of their use.
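One way to read “a notion of PageRank” for ontologies is sketched below: treat each ontology as a node, treat each import or reference to another ontology as a link, and run the standard power iteration. The ontology names, the link graph, and the damping factor of 0.85 are assumptions; how SPORE would actually measure use was not specified.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over {node: [nodes it references]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's rank evenly among the ontologies it references.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node (references nothing): spread its rank over all nodes.
                for target in nodes:
                    new_rank[target] += damping * rank[node] / n
        rank = new_rank
    return rank


# Hypothetical usage graph: which ontology references which.
LINKS = {
    "field_forms": ["darwin_core", "geo"],
    "food_web": ["darwin_core"],
    "darwin_core": [],
    "geo": [],
}

ranks = pagerank(LINKS)
ranked = sorted(ranks, key=ranks.get, reverse=True)
print(ranked)
```

In this toy graph the ontology referenced by the most others comes out on top, which is the "returned according to the extent of their use" behavior described above.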
4. MINDSWAP Tools
JenG demoed the MINDSWAP site, PhotoStuff and SWOOP.
PhotoStuff generated much interest. Something to use after coming home from the field.
Maybe in the field, if we can get it on a PDA.
The question was raised: Can it be used as a general purpose annotation tool? E.g., to
annotate a table. The ensuing discussion indicated a strong desire on the part of many
people (at least Greg, JimQ, myself, Susan, Neo) to experiment with an annotation framework.
We’ll take another stab at Annotea, and also look at using PhotoStuff as the basis for an
annotation client.
Note: NSDL is getting ready to tackle this. See
http://annotations.comm.nsdl.org/cgi-bin/wiki.pl?Annotation_and_Review_Services
5. Metadata Registries
Larry Fitzwater described the ISO 11179 standard and metadata registry metamodel, and
the structure of EPA’s Environmental Data Registry (EDR). (To search for a defined concept,
or to compare definitions from different sources, go to
http://oaspub.epa.gov/edr/compare_tool$.startup)
How to get all this well-defined terminology into our RDF/OWL ontologies?
Frank Olken and Kevin Keck described some of what it would mean to transform ISO
11179 registries (EDR in particular) into semantic web services. Who would fund this?
We had a discussion about the pros and cons of permanent URIs, and mechanisms for
providing permanent URIs.
6. Ontology Mapping
Jen converted Darwin Core to OWL using excel2rdf (?), expressing DC elements as
OWL datatype properties. Rich pointed out that it wouldn’t map to the SEEK and WoW
ontologies because a datatype property can’t be declared equivalent to a class. So Jen
redid the ontology, this time wrapping DC elements as classes.
We then spent some time trying to figure out how to use Protégé to declare equivalences.
For some reason, this was harder than it should have been.
We also talked about the utility of ontology mappings. Will all of our applications need to
reason from one ontology to another? If not, should we have a tool that builds an XSLT
out of an ontology mapping, and that gets invoked whenever an application encounters
data expressed in a “wrong” ontology?
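The translate-on-encounter idea can be sketched as below. Note that this swaps in a plain dictionary lookup over triples in place of a generated XSLT, purely to keep the example short. The example.org URIs and the two term pairs are hypothetical and are not taken from the Darwin Core, SEEK, or WoW ontologies.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mapping from terms in a "wrong" ontology to terms in the expected one.
MAPPING = {
    "http://example.org/dwc#ScientificName": "http://example.org/seek#TaxonName",
    "http://example.org/dwc#Locality": "http://example.org/seek#Place",
}


def translate(triples, mapping):
    """Rewrite every subject, predicate, or object that has an entry in the mapping."""
    return [tuple(mapping.get(term, term) for term in triple) for triple in triples]


# Hypothetical instance data expressed in the "wrong" ontology.
data = [
    ("http://example.org/obs/1", RDF_TYPE, "http://example.org/dwc#ScientificName"),
    ("http://example.org/obs/2", RDF_TYPE, "http://example.org/dwc#Locality"),
]

translated = translate(data, MAPPING)
for triple in translated:
    print(triple)
```

A term-for-term rewrite like this only covers mappings that are simple equivalences; anything that needs reasoning would fall outside it.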
Even if we have the appropriate mechanism for using mappings, how useful will they be?
I’d like to complete the experiment of mapping Darwin Core to SEEK to gauge the
degree of equivalence, and then follow that up with other mapping experiments.
Rich presented pieces of the SEEK and WoW ontologies, and decried the absence of
good ontology visualization. Neo recalled seeing huge chunks of the cancer (?) ontology
projected on a huge wall. Maybe we’ll do this at our next meeting.
7. Invasive Species Forecasting
Greg Newman gave an overview of work being done at the Invasive Species Forecasting
Institute in Fort Collins.
Two main areas of ISFI-SPIRE connection:
i. Providing probability distributions as a web service
ii. Using trophic information as a correlate.
Neo and Greg will talk more about (ii). (Possible acronym: TPOIS (Trophic Predictors of
Invasive Species).)
8. Field Trip
Grant Ballard took us out to the Point Reyes Bird Observatory. Amidst the extreme
beauty of the drive, the birds, the venue, a few observations:
Most field notes are semi-structured, combining observations with remarks in an
idiosyncratic manner.
An example of a more structured data entry mechanism is the form used for reporting
identified and banded birds. For a number of reasons, it is desired that the initial entry
remain handwritten, but many applications (e.g., tracking birds as they move up the
flyway) could be enabled by generating RDF/OWL from the entry form.
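Generating RDF from a transcribed entry form could look roughly like this sketch, which writes one N-Triples statement per field. The namespace, the field names, and the sample values are all hypothetical; they are not taken from the actual PRBO banding form.

```python
BASE = "http://example.org/banding#"  # hypothetical namespace


def escape(value):
    """Escape backslashes, double quotes, and line breaks for an N-Triples literal."""
    text = str(value)
    text = text.replace("\\", "\\\\").replace('"', '\\"')
    return text.replace("\n", "\\n").replace("\r", "\\r")


def form_to_ntriples(record_id, fields):
    """Return one N-Triples line per form field, with every value as a plain literal."""
    subject = f"<{BASE}record/{record_id}>"
    lines = []
    for name, value in sorted(fields.items()):
        lines.append(f'{subject} <{BASE}{name}> "{escape(value)}" .')
    return "\n".join(lines)


# Hypothetical entry, invented for illustration.
entry = {
    "bandNumber": "1234-56789",
    "species": "Zonotrichia leucophrys",
    "date": "2004-02-14",
}

ntriples = form_to_ntriples("r1", entry)
print(ntriples)
```

Plain literals keep the sketch simple; a real form would presumably want typed dates and URIs for species so that records from different stations along the flyway can be joined.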
9. Other – Miscellaneous conversations and digressions
- Text mining of literature to generate RDF. NBII has subscriptions to the relevant
ecological digital libraries, and can provide a huge corpus to work with.
- Open Database Initiative. Publish database schema in RDF and develop agents to find
data; suggest potential data mining queries; etc.