reacchpna.org
northwestknowledge.net
Providing Researchers
High Performance Web
Service Access to Big
Downscaled Climate
Data
Internet 2 Technical Workshop
July 2014
Palouse region, Northern Idaho
Erich Seamon, M.S. PMP GISP
Environmental Data Manager
Regional Approaches to Climate Change for
Pacific Northwest Agriculture (REACCHPNA)
College of Agricultural and Life Sciences
University of Idaho
208.885.1230
erichs@uidaho.edu
Luke Sheneman, Ph.D.
Technical Services Manager
Northwest Knowledge Network
University of Idaho
sheneman@uidaho.edu
www.northwestknowledge.net
Paul Gessler, Ph.D.
Professor
Department of Forest,
Rangeland, and Fire Sciences
College of Natural Resources,
University of Idaho
paulg@uidaho.edu
Boulder Workshop Summer 2014
University of Idaho GIS/CI Day
2013
Presentation Overview
• Overview of data science
efforts under the University of
Idaho’s Northwest Knowledge
Network (NKN) research
consortium
• Data-centric research efforts in
agriculture within NKN
• NKN scientific research
collaboration efforts using
multi-dimensional datasets
• Methods for heterogeneous
systems/tool integration
• Lessons learned
Goals
• Analysis tools
• Climatic data storage and access
• Extensible data cataloging
• Climatic science integrative research
• Climate science value to the public and regional stakeholders
Supporting technologies: Interactive Python, THREDDS, ArcGIS Server, PostgreSQL
Climate data and the Pacific
Northwest
REACCH Cyberinfrastructure and Data
Management Overview, 2013
NKN Services Model
• NKN functions as a unit under the Office of VP for
Research
• Functions as an ‘Online Data Observatory’,
providing customer projects with data cataloging
and archiving
• Working to extend capabilities to serve all
data/metadata via web services
• NKN functions under a service center model
• Currently refining a cost model to distribute
services/technology efforts to projects on an
equitable basis
• Development of a shared technology
research environment is a focused priority for
big data research
February 14th, 2013
NKN Systems and
Networking Efforts
On-going systems and network upgrades are an essential component of NKN’s
technology strategy
NSF award #1341040 helped the University of Idaho and NKN
extend their research networking. Upgrades included:
1. The UI campus core network;
2. The Northwest Knowledge Network data repository; and
3. The DoE Idaho National Laboratory (INL) for replication/mirroring of NKN
data with proximate access to significant High Performance Computing
(HPC) and visualization resources for researchers.
This complements previous institutional and NSF-funded improvements and
enables true 10 Gigabit per second (Gbps) end-to-end data transfers to support
all researchers at the UI.
Scope: Core Network in Moscow, ID and INL HPC in Idaho Falls, ID
Status: 90% complete as of March 29, 2014
NKN Systems and
Networking Efforts
• Includes perfSONAR servers
• Excludes stateful firewalls
– Science DMZ specifies flexible router ACLs for high
performance network security
• Collaborate with other Institutions
– Included Idaho National Lab HPC which provides
hosting for:
• University of Idaho
• Boise State University
• Idaho State University
Focused Project Overview:
REACCH
The Regional Approaches to Climate
Change project is a five year, $20M
coordinated regional agricultural project,
funded by the National Institute of Food
and Agriculture to improve the long-term
profitability of the cereal production
systems in the Pacific Northwest under
ongoing and projected climate change,
while contributing to climate change
mitigation by reducing emissions of
greenhouse gases.
REACCH includes efforts in research, extension, and education that integrate
diverse elements including climate modeling, cropping systems modeling,
economics, agronomy, crop protection, and others in a transdisciplinary
manner.
www.reacchpna.org
Focused Project Overview:
REACCH
• Inland Pacific Northwest (IPNW) is a
critical agricultural region
• Diverse research efforts abound – UI,
WSU, OSU, UW, USDA/ARS, NSF,
NOAA
• Clear connection between climate
change and agriculture processes
Focused Project Overview:
REACCH
Integrating climate modeling with
other research efforts is essential
for integrative research
• NetCDF formats
• Gridded model dataset outputs
• Gridded meteorological datasets
• Over 20TB for western US
• http://nimbus.cos.uidaho.edu/MACA
• John Abatzoglou/University of
Idaho
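For a sense of scale, the sketch below estimates how long a 20 TB collection would take to move over a 10 Gbps path such as the one described in the networking slides. It assumes decimal units (1 TB = 10^12 bytes) and a single uncontended stream. The `efficiency` parameter is a hypothetical knob for protocol overhead, and none of the figures are measurements from the presentation.

```python
def transfer_time_hours(size_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Estimate hours needed to move size_tb terabytes over a link_gbps link.

    Assumptions (not from the slides): decimal units, one stream, and an
    'efficiency' fraction (0 < efficiency <= 1) of the nominal line rate.
    """
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")

    bits_to_move = size_tb * 1e12 * 8                  # terabytes -> bits
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    return bits_to_move / usable_bits_per_second / 3600


if __name__ == "__main__":
    for gbps in (1, 10):
        hours = transfer_time_hours(20, gbps)
        print(f"20 TB at {gbps} Gbps: about {hours:.1f} hours")
```

At the full line rate this works out to roughly 4.4 hours on a 10 Gbps path, versus about 44 hours at 1 Gbps.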
Focused Project Overview:
REACCH
The REACCHPNA project is divided into ten
functional objective teams (listed to the left),
with lead investigators for each area,
examining:
• the relationship between climate change
and cereal crops, primarily winter wheat
• how climate change might affect cereal
crops
• how production practices might contribute to
or help mitigate climate change
• what farming methods might help these
crops withstand climate change
• factors that influence decisions about crop
management
REACCH/NKN Systems
Model
[Diagram components: ArcGIS Server, Geoportal Server, IPython, THREDDS, PHP, REST, JavaScript, MySQL, PostgreSQL; diagram labels: Aggregation and Programmatic, Geospatial, Database]
• THREDDS: aggregation and interrogation of NetCDF datasets
• Geoportal Server: metadata cataloging, modified to allow data uploading
• IPython: Interactive Python. Python in a web browser! Can be used to compile and document research processes
• ArcGIS Server: web server technology used for geospatial mapping processes
• PostgreSQL: open source enterprise database
REACCH Data Library
• Based on Esri’s
Geoportal Server
software
• Linux/Tomcat/Java
• Library can be
accessed at
data.reacchpna.org
REACCH THREDDS Server
• Thematic Real-time
Environmental Distributed
Data Services
(THREDDS)
• Developed by UCAR
• Aggregates and
subsets multidimensional datasets
(NetCDF)
thredds.reacchpna.org
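THREDDS serves datasets over OPeNDAP, where a subset is requested by appending `[start:stride:stop]` index ranges to a variable name in the URL. The sketch below only builds such a request URL. The host, dataset path, variable name `tasmax`, and the dimension order (time, lat, lon) are hypothetical placeholders, not the actual REACCH catalog layout.

```python
from typing import Sequence, Tuple

IndexRange = Tuple[int, int, int]  # (start, stride, stop), stop is inclusive


def opendap_subset_url(dataset_url: str, variable: str,
                       index_ranges: Sequence[IndexRange],
                       response: str = "ascii") -> str:
    """Build an OPeNDAP (DAP2) URL requesting a hyperslab of one variable.

    dataset_url is the OPeNDAP endpoint of a single dataset; one index
    range is expected per dimension of the variable, in dimension order.
    """
    parts = []
    for start, stride, stop in index_ranges:
        if start < 0 or stride < 1 or stop < start:
            raise ValueError(f"invalid index range: {(start, stride, stop)}")
        parts.append(f"[{start}:{stride}:{stop}]")
    return f"{dataset_url}.{response}?{variable}{''.join(parts)}"


if __name__ == "__main__":
    # Hypothetical endpoint and variable, for illustration only.
    url = opendap_subset_url(
        "http://thredds.example.org/thredds/dodsC/maca/tasmax.nc",
        "tasmax",
        [(0, 1, 30), (100, 1, 110), (200, 1, 210)],  # time, lat, lon
    )
    print(url)
```

Because the server applies the ranges, only the requested cells cross the network. Some HTTP clients require the square brackets to be percent-encoded.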
REACCH Data Analysis
Library
• Use of geoprocessing
services for analytics
• Climate time series
• Subsetting and aggregation
• Integrative data queries (e.g.,
biotic and climatic data)
• More applied tools:
• Growing Degree Day
analysis
• Crop buffering
analysis.reacchpna.org
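Growing degree days are commonly computed by averaging each day's minimum and maximum temperature and accumulating the excess over a base temperature. The sketch below uses that simple averaging method. The 0 °C default base is an assumption often used for wheat, and the presentation does not say which variant the REACCH tool implements.

```python
from typing import Sequence


def growing_degree_days(tmin_c: Sequence[float], tmax_c: Sequence[float],
                        base_c: float = 0.0) -> float:
    """Accumulate growing degree days (degrees C x days) over a period.

    Each day contributes max(0, (tmin + tmax) / 2 - base_c).
    """
    if len(tmin_c) != len(tmax_c):
        raise ValueError("tmin_c and tmax_c must have the same length")

    total = 0.0
    for tmin, tmax in zip(tmin_c, tmax_c):
        daily_mean = (tmin + tmax) / 2
        total += max(0.0, daily_mean - base_c)  # days below base add nothing
    return total


if __name__ == "__main__":
    # Three made-up days of minimum and maximum temperature in degrees C.
    tmin = [-4.0, 2.0, 6.0]
    tmax = [2.0, 10.0, 18.0]
    print(growing_degree_days(tmin, tmax))              # base 0 C
    print(growing_degree_days(tmin, tmax, base_c=5.0))  # base 5 C
```

Applied per grid cell to daily temperature series, the same function yields a gridded accumulation map.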
Data Library Integrative
Analysis Tool Examples
Interactive Python Server
• Useful for collaboration
and informal scientific
analysis
• Allows for arcpy
integration
• IPython Notebook
server available
ipython.reacchpna.org
Summary Overview
• Development of a data architecture that emphasizes the use
of web services (OPeNDAP, REST) has given PNW
researchers access to more robust datasets for varied and
integrative analysis
• Computing capabilities have been enhanced by placing data
closer to HPC, as well as by using perfSONAR to test for
bottlenecks
• Subsetting and aggregation methods have been very valuable
• Python geoservices are a nice way to encapsulate and deploy
geographic and temporal data transformations
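As one illustration of a temporal transformation that a Python service might encapsulate, the sketch below aggregates a daily time series into monthly means using only the standard library. The function name and the sample values are hypothetical and are not taken from the REACCH services.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple


def monthly_means(daily: Iterable[Tuple[date, float]]) -> Dict[Tuple[int, int], float]:
    """Aggregate (date, value) pairs into {(year, month): mean value}."""
    totals = defaultdict(lambda: [0.0, 0])  # (year, month) -> [sum, count]
    for day, value in daily:
        bucket = totals[(day.year, day.month)]
        bucket[0] += value
        bucket[1] += 1
    # Return months in chronological order.
    return {key: total / count for key, (total, count) in sorted(totals.items())}


if __name__ == "__main__":
    # Made-up daily values for illustration.
    series = [
        (date(2014, 1, 1), 2.0),
        (date(2014, 1, 2), 4.0),
        (date(2014, 2, 1), 10.0),
    ]
    for (year, month), mean in monthly_means(series).items():
        print(f"{year}-{month:02d}: {mean:.1f}")
```

Wrapping a transformation like this behind a web service lets clients request the aggregated series rather than downloading the daily data.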
References
• CC:NIE – Support Big Science Data at U. of Idaho
– http://www.nsf.gov/awardsearch/showAward?AWD_ID=1341040
• Northwest Knowledge Network (NKN)
– https://www.northwestknowledge.net/
• perfSONAR
– http://www.perfsonar.net/
• Science DMZ Security
– http://fasterdata.es.net/science-dmz
REACCH Information Access
www.reacchpna.org
data.reacchpna.org
analysis.reacchpna.org
policy.reacchpna.org
research.reacchpna.org
education.reacchpna.org
extension.reacchpna.org
press.reacchpna.org
dictionary.reacchpna.org
help.reacchpna.org
reacch-list@uidaho.edu
reacch-student-list@uidaho.edu
reacch-faculty-list@uidaho.edu
Presentation and contact info
available @:
erichs@uidaho.edu
sheneman@uidaho.edu
Questions?