Trident Scientific Workflow For Neptune Roger S. Barga

Trident

Scientific Workflow

For Neptune

Roger S. Barga

Architect, Technical Computing Group

Microsoft Corporation

Trident Project Contributors

Monterey Bay Aquarium Research Institute

Jim Bellingham

Yanwu Zhang

Mike Goding

University of Washington

Keith Grochow, Dept. of Computer Science

Mark Stoermer, Dept. of Oceanography

Donald Averill, Dept. of Oceanography

Microsoft Technical Computing Group

Luciano Digiampietri, intern, UNICAMP Brazil

Nolan Li, intern, Johns Hopkins University

Roger Barga

Project Neptune

North East Pacific Time-Series Undersea Networked Experiment

The world’s first plate-scale undersea observatory

Project Neptune: Themes

Scientific Research

Plate tectonic processes

Regional ocean/climate dynamics

Gas hydrates, etc.

Engineering Challenges

Delivering power and internet into the ocean, uninterrupted operation, device lifetime, data processing and storage

Technical Challenges

Disparate, high-volume data sets and streams: scalar, complex, streaming

Several types of CTD devices, ROVs, AUVs

ADCP

ZAP (vertical and horizontal)

8-megapixel movable digital still camera

3-way hydrophone array

HDTV camera

From raw data to useable data products

Data cleaning, analysis, regridding, interpolation

Support real-time, on-demand visualization
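The step from raw data to usable data products can be illustrated with a minimal sketch. It assumes a single scalar time series of (time, value) samples, a simple range check as the cleaning rule, and linear interpolation onto a regular grid; the function names, thresholds, and sample values are hypothetical and are not taken from Trident.

```python
import math
from bisect import bisect_left

def clean(samples, lo, hi):
    """Drop samples whose value is NaN or outside the plausible range [lo, hi]."""
    return [(t, v) for t, v in samples
            if not math.isnan(v) and lo <= v <= hi]

def regrid(samples, grid):
    """Linearly interpolate (time, value) samples onto the given grid times.

    Grid times outside the sampled interval yield None instead of extrapolating.
    """
    samples = sorted(samples)
    times = [t for t, _ in samples]
    out = []
    for g in grid:
        i = bisect_left(times, g)
        if i < len(times) and times[i] == g:
            out.append(samples[i][1])          # exact hit, no interpolation needed
        elif i == 0 or i == len(times):
            out.append(None)                   # outside the sampled interval
        else:
            (t0, v0), (t1, v1) = samples[i - 1], samples[i]
            out.append(v0 + (v1 - v0) * (g - t0) / (t1 - t0))
    return out

# Made-up temperature readings with one dropout (NaN) and one spike (999.0).
raw = [(0.0, 10.0), (1.0, float("nan")), (2.0, 12.0), (3.0, 999.0), (4.0, 14.0)]
cleaned = clean(raw, lo=-2.0, hi=40.0)
gridded = regrid(cleaned, grid=[0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
print(gridded)  # [10.0, 11.0, 12.0, 13.0, 14.0, None]
```

Real instrument streams would need per-device quality rules and multidimensional regridding; the sketch only shows the shape of a clean-then-regrid pipeline.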

Technical Challenges

Support a variety of users interacting with system

Community: easy access to regularly generated data products (model output, images, visualizations, etc.)

Researchers: access to quasi-live or historical data through a thin client (web browser), ability to both access data and create visualizations on demand, author both data analysis pipelines and visualizations remotely

PIs: direct access to their own instrument for live access, add new instruments, introduce new analysis codes and algorithms into the system

Never been done before in oceanography

Requirements not easily obtained

Principal Goals Of Trident

Allow users to

Automate tedious data cleaning and analysis pipelines.

Explore and visualize data, regardless of source.

Compose, run and catalog experiments, save results.

A workflow starter kit, one that will allow users to easily extend Trident functionality.

Learn by exploring and visualizing ocean and model data.

By…

Allowing experts to author custom workflow activities without forcing basic users to see the details.

Allowing user access mostly through a web portal, one that is intuitive and requires nominal local resources.

A Quick Look At Trident

Scientific workflow workbench for oceanography

Populate Windows WF with custom activities

Introduce gridded data structures;

Define basic operators (data transformations);

Implemented as custom activities;

Introduce parameterized activities

Easier for users to design workflows

Tool to convert custom to parameterized activities

Invoke and author workflows via web browser

Persistent workflows, checkpoints (stop-revise-rerun)
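The stop-revise-rerun idea can be sketched as follows. This is not how Windows WF persistence works and not Trident code; it is a minimal illustration that assumes a linear workflow of named steps, JSON-serializable intermediate data, and a checkpoint file. The names `run_workflow`, `stop_after`, and the step functions are hypothetical.

```python
import json
import os
import tempfile

def run_workflow(steps, data, checkpoint_path, stop_after=None):
    """Run (name, function) steps in order, checkpointing after each one.

    If a checkpoint file already exists, completed steps are skipped and
    their saved output is reused, so a later step can be revised and rerun.
    """
    done = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
        done, data = state["done"], state["data"]
    for i, (name, fn) in enumerate(steps):
        if i < done:
            continue                      # already completed in an earlier run
        data = fn(data)
        with open(checkpoint_path, "w") as f:
            json.dump({"done": i + 1, "data": data}, f)
        if name == stop_after:
            break                         # stop here; state is on disk
    return data

steps = [
    ("clean", lambda xs: [x for x in xs if x is not None]),
    ("scale", lambda xs: [x * 2 for x in xs]),
]
path = os.path.join(tempfile.mkdtemp(), "workflow.json")

partial = run_workflow(steps, [1, None, 3], path, stop_after="clean")  # stop
steps[1] = ("scale", lambda xs: [x * 10 for x in xs])                   # revise
final = run_workflow(steps, [1, None, 3], path)                         # rerun
print(partial, final)  # [1, 3] [10, 30]
```

On the rerun the "clean" step is not executed again; only the revised "scale" step runs against the checkpointed data.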

Internal Data Structures for Spatiotemporal Data

Oceanographic Data

Spatiotemporal information stored in CDF and NetCDF files, in various formats

Internal data structures

ISTem3DCollection: A collection of spatiotemporal points. For each point there is a collection of objects to represent measured values;

HyperCube4DOfDoubles: A four dimensional hypercube (grid). For each point there is an array of doubles to represent measured values;
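The difference between the two structures can be sketched roughly as below. The actual Trident types are not shown in the slides, so every class name, field name, and storage layout here is an assumption made for illustration: a scattered point collection holding arbitrary objects per point, and a dense four-dimensional grid holding a fixed-length array of doubles per cell.

```python
import math
from dataclasses import dataclass, field

@dataclass
class SpatioTemporalPoint:
    """One observation location in space and time (hypothetical field names)."""
    x: float
    y: float
    z: float
    t: float
    values: list = field(default_factory=list)  # arbitrary measured objects

class PointCollection:
    """Rough analogue of ISTem3DCollection: an unordered set of points."""
    def __init__(self):
        self.points = []

    def add(self, x, y, z, t, *values):
        self.points.append(SpatioTemporalPoint(x, y, z, t, list(values)))

class HyperCube4D:
    """Rough analogue of HyperCube4DOfDoubles: a dense (x, y, z, t) grid.

    Each cell holds n_vars floats, one per measured variable, initially NaN.
    """
    def __init__(self, nx, ny, nz, nt, n_vars):
        self.shape = (nx, ny, nz, nt)
        self.n_vars = n_vars
        self._cells = [[math.nan] * n_vars for _ in range(nx * ny * nz * nt)]

    def _index(self, ix, iy, iz, it):
        for i, n in zip((ix, iy, iz, it), self.shape):
            if not 0 <= i < n:
                raise IndexError("grid index out of range")
        _, ny, nz, nt = self.shape
        return ((ix * ny + iy) * nz + iz) * nt + it   # row-major flattening

    def set(self, ix, iy, iz, it, values):
        if len(values) != self.n_vars:
            raise ValueError("expected %d values per cell" % self.n_vars)
        self._cells[self._index(ix, iy, iz, it)] = [float(v) for v in values]

    def get(self, ix, iy, iz, it):
        return self._cells[self._index(ix, iy, iz, it)]

scattered = PointCollection()
scattered.add(-122.0, 36.8, -50.0, 0.0, 11.2, "CTD cast 7")

cube = HyperCube4D(nx=2, ny=3, nz=4, nt=5, n_vars=2)
cube.set(1, 2, 3, 4, [3.5, 35.0])   # e.g. temperature and salinity
```

The point collection suits irregular instrument tracks, while the hypercube suits model output and regridded products where every cell has the same variables.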

Custom Activities In Windows WF

Parameterized Activity

Converting Custom To Parameterized Activities

The conversion is made automatically:

Custom Activity

Parameterized Activity (web accessible)
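One way to picture an automatic conversion of this kind is through introspection: read a custom activity's parameters and emit a description that a web form could be generated from. The slides do not say how Trident's tool works, so the sketch below is only an analogy in Python; `smooth`, `parameterize`, and `invoke` are hypothetical names, and the real tool operates on Windows WF activities.

```python
import inspect

def smooth(values, window: int = 3, method: str = "mean"):
    """A hypothetical custom activity: moving-window smoothing of a series."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(max(chunk) if method == "max" else sum(chunk) / len(chunk))
    return out

def parameterize(activity, data_arg="values"):
    """Describe an activity's tunable parameters as plain dictionaries."""
    params = []
    for name, p in inspect.signature(activity).parameters.items():
        if name == data_arg:
            continue                       # the data input is wired, not typed in
        has_type = p.annotation is not inspect.Parameter.empty
        has_default = p.default is not inspect.Parameter.empty
        params.append({
            "name": name,
            "type": p.annotation.__name__ if has_type else "any",
            "default": p.default if has_default else None,
        })
    return {"activity": activity.__name__, "parameters": params}

def invoke(activity, data, form):
    """Call the activity with string values as they would arrive from a web form."""
    sig = inspect.signature(activity)
    kwargs = {}
    for name, raw in form.items():
        ann = sig.parameters[name].annotation
        kwargs[name] = raw if ann is inspect.Parameter.empty else ann(raw)
    return activity(data, **kwargs)

spec = parameterize(smooth)
result = invoke(smooth, [1, 2, 3, 4], {"window": "3", "method": "max"})
print(spec["parameters"][0])  # {'name': 'window', 'type': 'int', 'default': 3}
print(result)                 # [2, 3, 4, 4]
```

The expert writes only `smooth`; the generated description is what a basic user would see and fill in through the browser.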

Converting Custom To Parameterized Activities

Select Custom Activities

Converting Custom To Parameterized Activities

Parameterized, and now web accessible

Remote Authoring Via Web Browser

Luciano Digiampietri (UNICAMP), Nolan Li (JHU)

Interns, Summer 2007

Technical Computing Group at Microsoft

To Sum Up

Trident is a very young project (3 months old), growing by application pull from its contributors; most features are now being designed jointly

It is not quite an alpha release, but it is deployed at both UW and MBARI

The screen shots were the baby pictures!

There is a lot more to cover and work on…

© 2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

Microsoft Research

Faculty Summit 2007