WDO-It! 102: Using an abstraction of a process to capture provenance

WDO-It! 102 Workshop:
Using an abstraction of a process to
capture provenance
UTEP’s Trust Laboratory
GRAVITY MAP:
A SCIENTIFIC SYSTEM
Understanding the Gravity Map
System (1/3)
Longitude      Latitude     OBS
-074.4244296   40.0049488   4176.40
-074.9746118   40.0051130   4189.60
-074.4245976   40.0051168   4176.40
-074.7647730   40.0059447   4199.07
-074.7647730   40.0059447   4199.10
-074.3714268   40.0099501   4173.71
-074.3714268   40.0101141   4173.69
-074.2129201   40.0109512   4159.90
-074.3237562   40.0139483   4172.05
-074.3237562   40.0139483   4172.10
- Shell scripting
- Web services
- GMT
set gravity-dataset-path=gravityDataset.txt
set gridded-dataset-path=esriGrid.txt
set contoured-dataset-path=gravityContourMap.ps
call GetGravityData -109 -107 33 34 %gravity-dataset-path%
call GridGravityData %gravity-dataset-path% 0.02 0.2 %gridded-dataset-path%
call ContourGriddedData %gridded-dataset-path% %contoured-dataset-path%
ncols 5
nrows 5
cellsize=2
4604 4599 4598 4602 4606
4619 4618 4611 4586 4566
4596 4599 4598 4593 4585
4551 4562 4575 4572 4535
4532 4512 4482 4449 4459
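A quick way to sanity-check the gravity dataset (Longitude, Latitude, OBS) from the command line is sketched below. It is only an illustration, assuming the file is whitespace- or tab-delimited with the observed gravity in the third column, as in the sample on this slide. The function name `summarize_obs` is hypothetical and is not part of the workshop bundle.

```shell
#!/bin/sh
# Hypothetical helper (not part of the workshop bundle).
# Prints the row count and the min/max of column 3 (OBS).
# Assumption: columns are longitude, latitude, obs, separated by
# spaces or tabs; rows whose third field is not numeric (for example
# a header row) are ignored.
summarize_obs() {
    awk '
        $3 ~ /^-?[0-9]+(\.[0-9]+)?$/ {
            v = $3 + 0
            n++
            if (n == 1 || v < min) min = v
            if (n == 1 || v > max) max = v
        }
        END {
            if (n) printf "rows=%d min=%.2f max=%.2f\n", n, min, max
            else   print "rows=0"
        }
    ' "$1"
}
```

Calling `summarize_obs gravityDataset.txt` after a run prints a one-line summary that can be compared against the expected range for the requested region.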
Understanding the Gravity Map
System (2/3)
Longitude      Latitude     OBS
-074.4244296   40.0049488   4176.40
-074.9746118   40.0051130   4189.60
-074.4245976   40.0051168   4176.40
-074.7647730   40.0059447   4199.07
-074.7647730   40.0059447   4199.10
-074.3714268   40.0099501   4173.71
-074.3714268   40.0101141   4173.69
-074.2129201   40.0109512   4159.90
-074.3237562   40.0139483   4172.05
-074.3237562   40.0139483   4172.10
ncols 5
nrows 5
cellsize=2
4604 4599 4598 4602 4606
4619 4618 4611 4586 4566
4596 4599 4598 4593 4585
4551 4562 4575 4572 4535
4532 4512 4482 4449 4459
Installing and Running the Gravity Map
System (1/3)
• Download scripts to generate a contour map here:
http://trust.utep.edu/wdo/workshop/gravityMapClient.zip
• Uncompress download file
• Requirements:
– OS: Windows or Unix/Linux/MacOS
– Web access
– Java 6
– Recommendation: Clear your Java Cache
• Go to Control Panel > Java
• Under “Temporary Internet Files”, click on Settings, then click on Delete Files
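On Unix/Linux/MacOS, the requirements above can be checked from a terminal before running anything. The sketch below is illustrative only: the helper names `have_cmd` and `check_prereqs` are hypothetical, and `unzip` is an assumption about how the downloaded archive will be uncompressed (any unzip tool will do).

```shell
#!/bin/sh
# Hypothetical helpers (not part of the workshop bundle).

# Return 0 if the named command is on PATH, 1 otherwise.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report which prerequisites are visible from this shell.
# Assumption: "unzip" is the tool used to uncompress the download.
check_prereqs() {
    missing=0
    for cmd in java unzip; do
        if have_cmd "$cmd"; then
            echo "found: $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}
```

If `java` is found, `java -version` reports the installed version so it can be compared with the Java 6 requirement.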
Installing and Running the Gravity Map
System (2/3)
• Open command prompt or terminal window
• Navigate to “scripts” directory
• Run ContourMapWorkflow script
– Windows
• C:\homedir\scripts\ContourMap
– Unix/Linux/MacOS
• chmod 775 /homedir/scripts/*.sh
• ./ContourMapWorkflow.sh
Installing and Running the Gravity Map
System (3/3)
• The master script “ContourMapWorkflow” calls three
other scripts that in turn invoke a Java 6 client that
consumes a Web service
• All the core functionality in this example is provided
through a Web service, for all the activities whether
it is gridding a dataset or accessing the gravity
datastore
Notes about the outputs
• This small workflow, or pipeline, is built on three activities:
“getGravityData”, “gridGravityData”, and “contourGriddedData”.
Each writes its output to the file system for the next activity to consume
• The outputs are “gravityDataset.txt”, “esriGrid.txt”, and
“gravityContourMap.ps” (the final output); their names can be
changed in the script “ContourMapWorkflow”
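Because each activity hands its result to the next through a file, a simple post-run check is to confirm that all three files exist and are non-empty. The following is a minimal sketch under that assumption; the function name `check_outputs` is hypothetical, and the file names are the defaults set in the “ContourMapWorkflow” script (adjust them if you renamed the outputs).

```shell
#!/bin/sh
# Hypothetical helper (not part of the workshop bundle).
# $1 = directory the workflow wrote to (defaults to the current one).
# Returns 0 only if every expected output exists and is non-empty.
check_outputs() {
    dir=${1:-.}
    status=0
    for f in gravityDataset.txt esriGrid.txt gravityContourMap.ps; do
        if [ -s "$dir/$f" ]; then
            echo "ok: $f"
        else
            echo "missing or empty: $f"
            status=1
        fi
    done
    return "$status"
}
```

Run it from the “scripts” directory as `check_outputs .` after the workflow finishes.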
GENERATING CODE TO CAPTURE PROVENANCE:
THE DATA ANNOTATOR
Running WDO-It!
• The WDO-It! tool can be run at:
http://trust.utep.edu/wdo/downloads/latest/wdoit.jnlp
WDO-It! interface
Loading a SAW
• WDO-It! can open a SAW from a local file or from a URL.
There is also a bookmarks option.
• To start, load the Gravity Contour Map Workflow
from the bookmarks
• From the top menu open
bookmarks > CreateGravityContourMapSAW
WDO-It! Gravity Contour Map Workflow
Data Annotators
• Specialized modules used to capture provenance at a specific
time in the execution of a scientific process
• Data Annotators do not take part in the information
processing of the original system
• Data Annotators are created using the WDO-It! tool based on
a SAW that represents the scientific process
• DA implementation depends on:
– PML encoder agent
– Scientific system’s execution platform
Data Annotators
[Diagram: the System (instruction 1 ... instruction n) calls a DA with a result. The DA uses the PML Encoder Agent and reads its configuration from an XML config file (based on process knowledge from the SAW). Output: Data + Provenance Links.]
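To make the idea concrete, the sketch below mimics the role of a data annotator: it is called after a step, reads the step's output, and records a provenance entry without changing the data. This is a deliberately simplified, hypothetical stand-in. It is not the script WDO-It! generates, it does not produce PML, and the function name `annotate_step` and the log format are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for a data annotator (NOT the WDO-It! generated
# script and NOT PML): append one provenance record per completed step.
# $1 = step name, $2 = file the step produced, $3 = provenance log file.
annotate_step() {
    step=$1
    output=$2
    log=$3
    size=$(wc -c < "$output" | tr -d ' ')
    stamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    # The output file is only read, never modified.
    printf '%s\t%s\t%s\t%s bytes\n' "$stamp" "$step" "$output" "$size" >> "$log"
}
```

For example, `annotate_step Gridding esriGrid.txt provenance.log` appends one tab-separated line and leaves `esriGrid.txt` untouched, which mirrors the point that annotators observe the system rather than take part in its processing.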
Data Annotator Wizard
To start exporting data annotators, click
File > Export Workflow > Generate Data Annotators
from the top menu.
Select SAW
A prompt asks which SAW will be used to generate the data annotators.
Select the SAW for which you would like to capture provenance.
NOTE: A master script must be available for the SAW being
selected.
For this example, select CreateGravityContourMapSAW.owl
By clicking OK, WDO-It! will open the Data Annotator
Wizard to configure the scripts and configuration files to
be generated.
Data Annotator Wizard
The first screen of the DA
wizard asks the user to
set:
- Output directories
- Annotator Agents
- Targeted System
How to enter each will be
covered in the following
slides.
DA Configuration: Setting Output paths
Select any directory in your file
system where you wish to save all
generated files (DAs).
Annotator Directory – scripts and
configuration files.
Annotator Output Directory – dump all
scientific process data.
PML Output directory – dump PML
files generated.
NOTE: do not try to set the PML dump path
with the Browse button as we are currently
configuring the CI-Server widget and it may
produce some exceptions.
DA Configuration: Targeted System
For the targeted system, a user
must select what type of scripts he
or she requires.
• Shell scripts (.sh files) for UNIX,
Linux, or Mac systems
• Batch scripts (.bat files) for
Microsoft Windows systems
Choose the option that matches your
system, then click Next.
DA Configuration: Bindings
The second screen of the DA
wizard will ask the user to specify
details regarding the semantic
abstract workflow. The interface
looks like this.
DA Configuration: Sources
Select an engine from the dropdown menu for each of the listed
sources.
For this example choose
“UTEPGISGravityDatabase”
DA Configuration: Data
The data section specifies what data will be
created, transformed, and dumped during the
scientific process.
A user must specify the format of each by
selecting the format type from the drop-down
menu.
In addition, the user must also specify how the
data annotators will know about the data itself
• Pass by Reference – link the data file
• Pass by Value – include value in PML file
DA Configuration: Data
For this example, select the following values
Data Formats:
• For ContourMap use “ps3”
• For GravityDataset use “tab-delimited-dataset”
• For GriddedDataset use “esriGrid”
Data Filenames*:
• For ContourMap use “gravityContourMap.ps”
• For GravityDataset use “gravityDataset.txt”
• For GriddedDataset use “esriGrid.txt”
* These files come with the example
bundle. They are generated the first time
the main process is run.
DA Configuration: Methods
The final section of the Bindings tab requires the
user to set the engine for each method listed.
For this example, choose the following:
Method Engines:
• For Contouring use “contour”
• For Gridding use “gridder”
Once selected, you are now finished setting
up the Data Annotator wizard. This is a good
time to check that everything was set up
correctly.
Click on the “Generate” button to create the
scripts and the configuration files.
The Generated Data Annotator
• Go to the Annotator Output directory you selected in the DA
Wizard.
• You should see the following directories:
– Pml
– Data
– mappings
• You should also see the following files:
– Contouring*
– Contouring.xml
– Gridding*
– Gridding.xml
– WebServicePACES*
– WebServicePACES.xml
– environmentVariables*
* .bat or .sh depending on which system you selected.
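Before instrumenting the system, it can help to confirm that the wizard produced everything listed above. The sketch below is one way to do that on Unix/Linux/MacOS. The function name `check_annotators` is hypothetical, and it assumes each annotator script is paired with an XML file of the same base name (including `Gridding.xml`).

```shell
#!/bin/sh
# Hypothetical helper (not generated by WDO-It!).
# $1 = directory containing the generated annotators
# $2 = script extension, "sh" (default) or "bat"
# Assumption: every annotator script has a matching <name>.xml file.
check_annotators() {
    dir=$1
    ext=${2:-sh}
    status=0
    for name in Contouring Gridding WebServicePACES; do
        for f in "$name.$ext" "$name.xml"; do
            if [ ! -f "$dir/$f" ]; then
                echo "missing: $f"
                status=1
            fi
        done
    done
    if [ ! -f "$dir/environmentVariables.$ext" ]; then
        echo "missing: environmentVariables.$ext"
        status=1
    fi
    return "$status"
}
```

A silent return with status 0 means all seven files are present; otherwise each missing file is named.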
INSTRUMENT GRAVITY MAP SYSTEM FOR
PROVENANCE CAPTURE
Instrument Gravity Map scripts
• We need to modify the gravity map client script to
invoke the appropriate data annotator at the
appropriate time.
• Each DA script is named after a SAW activity or
source of information, but a system analyst must find
the appropriate location in the system that “maps”
to it.
• A call to the corresponding DA script must be
inserted in the system directly after the execution of
the workflow step it is annotating.
Instrument Gravity Map Scripts
• For example, to capture the “gridding” step
in our gravity map example, we would add a
call to “Gridding.sh” in the
gravityMapClient workflow:
export gravitydatasetpath=gravityDataset.txt
export griddeddatasetpath=esriGrid.txt
export contoureddatasetpath=gravityContourMap.ps
./GetGravityData.sh -109 -107 33 34 $gravitydatasetpath
./GridGravityData.sh $gravitydatasetpath $griddeddatasetpath
./Gridding.sh
./ContourGriddedData.sh $griddeddatasetpath $contoureddatasetpath
Instrument Gravity Map Scripts
• A call to an initialization script is also needed.
./environmentVariables.sh
export gravitydatasetpath=gravityDataset.txt
export griddeddatasetpath=esriGrid.txt
export contoureddatasetpath=gravityContourMap.ps
./GetGravityData.sh -109 -107 33 34 $gravitydatasetpath
./WebServicePACES.sh
./GridGravityData.sh $gravitydatasetpath $griddeddatasetpath
./Gridding.sh
./ContourGriddedData.sh $griddeddatasetpath $contoureddatasetpath
./Contouring.sh
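One optional refinement, not part of the workshop scripts, is to call a data annotator only when the step it documents actually succeeded, so that no provenance is recorded for a failed step. The wrapper below is a minimal sketch of that idea; the name `run_and_annotate` is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the workshop bundle).
# Runs a workflow step, then its data annotator only if the step
# succeeded. $1 = annotator command, remaining args = step command line.
run_and_annotate() {
    annotator=$1
    shift
    if "$@"; then
        "$annotator"
    else
        rc=$?
        echo "step failed, skipping annotator: $annotator" >&2
        return "$rc"
    fi
}

# Possible use inside the instrumented workflow script:
#   run_and_annotate ./WebServicePACES.sh \
#       ./GetGravityData.sh -109 -107 33 34 "$gravitydatasetpath"
#   run_and_annotate ./Gridding.sh \
#       ./GridGravityData.sh "$gravitydatasetpath" "$griddeddatasetpath"
#   run_and_annotate ./Contouring.sh \
#       ./ContourGriddedData.sh "$griddeddatasetpath" "$contoureddatasetpath"
```

The wrapper returns the failing step's exit status, so the master script can stop or continue as it sees fit.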
CAPTURING AND USING GRAVITY MAP
PROVENANCE
Rerunning Gravity Map Script
• To generate PML documenting the derivation
of the postscript gravity map, simply rerun the
workflow script and PML will be dumped in
the directories which you specified in the DA
wizard.
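A simple way to confirm that the rerun produced provenance is to count the files in the PML output directory before and after running the workflow. The sketch below assumes nothing about PML file names or extensions. The function name `count_pml_files` is hypothetical, and the directory to pass is whichever PML Output directory you chose in the DA wizard.

```shell
#!/bin/sh
# Hypothetical helper (not part of the workshop bundle).
# $1 = the PML output directory chosen in the DA wizard.
# Prints the number of regular files found under it.
count_pml_files() {
    find "$1" -type f | wc -l | tr -d ' '
}
```

If the count printed by `count_pml_files` is higher after rerunning the workflow script than before, the data annotators wrote new files.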
Browsing Gravity Map Provenance
• Running Probe-It!
Thank you!
For more information please contact:
Leonardo Salayandia, leonardo@utep.edu
Paulo Pinheiro da Silva, paulo@utep.edu
Nick Del Rio, ndel2@miners.utep.edu
Antonio Garza, agarza6@miners.utep.edu
Aida Gandara, agandara1@miners.utep.edu