reacchpna.org northwestknowledge.net

Providing Researchers High Performance Web Service Access to Big Downscaled Climate Data
Internet2 Technical Workshop, July 2014
[Photo: Palouse region, Northern Idaho]

Erich Seamon, M.S. PMP GISP
Environmental Data Manager
Regional Approaches to Climate Change for Pacific Northwest Agriculture (REACCHPNA)
College of Agricultural and Life Sciences
University of Idaho
208.885.1230
erichs@uidaho.edu

Luke Sheneman, Ph.D.
Technical Services Manager
Northwest Knowledge Network
University of Idaho
sheneman@uidaho.edu
www.northwestknowledge.net

Paul Gessler, Ph.D.
Professor
Department of Forest, Rangeland, and Fire Sciences
College of Natural Resources, University of Idaho
paulg@uidaho.edu

Boulder Workshop Summer 2014
University of Idaho GIS/CI Day 2013

Presentation Overview
• Overview of data science efforts under the University of Idaho’s Northwest Knowledge Network (NKN) research consortium
• Data-centric research efforts in agriculture within NKN
• NKN scientific research collaboration efforts using multi-dimensional datasets
• Methods for heterogeneous systems/tool integration
• Lessons learned

Goals
• Climatic data storage and access
• Climatic science integrative research
• Climate science value to the public and regional stakeholders

Analysis Tools
• Extensible data cataloging
• Interactive Python
• THREDDS
• ArcGIS Server
• PostgreSQL

Climate data and the Pacific Northwest

REACCH Cyberinfrastructure and Data Management Overview, 2013

NKN Services Model
• NKN functions as a unit under the Office of the VP for Research
• Functions as an ‘Online Data Observatory’, providing customer projects with data cataloging and archiving
• Working to extend capabilities to serve all data/metadata via web services
• NKN functions under a service center model
• Currently refining a cost model to distribute services/technology efforts to projects on an equitable basis
• Development of a shared technology research environment is a focused priority for big data research

February 14th, 2013 REACCH Cyberinfrastructure and Data Management Overview, 2013

NKN Systems and Networking Efforts
• Ongoing systems and network upgrades are an essential component of NKN’s technology strategy
• NSF funding award #1341040 helped the University of Idaho and NKN extend our research networking
• Upgrades included:
  1. The UI campus core network;
  2. The Northwest Knowledge Network data repository; and
  3. The DoE Idaho National Laboratory (INL) for replication/mirroring of NKN data with proximate access to significant High Performance Computing (HPC) and visualization resources for researchers.
• This complements previous institutional and NSF-funded improvements and enables true 10 Gigabit per second (Gbps) end-to-end data transfers to support all researchers at the UI.
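
To put the 10 Gbps figure in perspective, here is a back-of-the-envelope estimate, not a measurement. The 20 TB dataset size is the figure quoted later in this deck for the western US climate data; the 1 Gbps comparison link and the `efficiency` parameter are illustrative assumptions, and `transfer_hours` is a hypothetical helper name.

```python
def transfer_hours(size_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Estimate hours needed to move `size_tb` terabytes over a `link_gbps` link.

    `efficiency` (0 < efficiency <= 1) is an assumed fraction of the nominal
    line rate actually achieved end to end; 1.0 means the ideal case.
    """
    if size_tb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, link rate > 0, efficiency in (0, 1]")
    bits = size_tb * 1e12 * 8                      # decimal terabytes -> bits
    bits_per_second = link_gbps * 1e9 * efficiency
    return bits / bits_per_second / 3600.0         # seconds -> hours


if __name__ == "__main__":
    # Compare an assumed 1 Gbps path with the 10 Gbps path described above.
    for gbps in (1, 10):
        print(f"{gbps:>2} Gbps: {transfer_hours(20, gbps):.1f} h for 20 TB")
```

Under these assumptions the ideal 10 Gbps path moves the full dataset in a few hours rather than the better part of two days, which is why end-to-end testing for bottlenecks matters as much as the nominal link speed.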
Scope: Core Network in Moscow, ID and INL HPC in Idaho Falls, ID
Status: 90% complete as of March 29, 2014

NKN Systems and Networking Efforts
• Includes perfSONAR servers
• Excludes stateful firewalls
  – Science DMZ specifies flexible router ACLs for high performance network security
• Collaborate with other institutions
  – Included Idaho National Lab HPC, which provides hosting for:
    • University of Idaho
    • Boise State University
    • Idaho State University

Focused Project Overview: REACCH
The Regional Approaches to Climate Change project is a five-year, $20M coordinated regional agricultural project funded by the National Institute of Food and Agriculture. It aims to improve the long-term profitability of cereal production systems in the Pacific Northwest under ongoing and projected climate change, while contributing to climate change mitigation by reducing greenhouse gas emissions. REACCH includes efforts in research, extension, and education that integrate diverse elements, including climate modeling, cropping systems modeling, economics, agronomy, crop protection, and others, in a transdisciplinary manner.
www.reacchpna.org

Focused Project Overview: REACCH
• Inland Pacific Northwest (IPNW) is a critical agricultural region
• Diverse research efforts abound
  – UI, WSU, OSU, UW, USDA/ARS, NSF, NOAA
• Clear connection between climate change and agricultural processes

Focused Project Overview: REACCH
Integrating climatic modeling with other research efforts is paramount for integrative research
• NetCDF formats
• Gridded model dataset outputs
• Gridded meteorological datasets
• Over 20 TB for the western US
• http://nimbus.cos.uidaho.edu/MACA
• John Abatzoglou/University of Idaho

Focused Project Overview: REACCH
The REACCHPNA project is divided into ten functional objective teams (listed to the left), with lead investigators for each area, examining:
• the relationship between climate change and cereal crops, primarily winter wheat
• how climate change might affect cereal crops
• how production practices might contribute to or help mitigate climate change
• what farming methods might help these crops withstand climate change
• factors that influence decisions about crop management

REACCH/NKN Systems Model
[Diagram components: ArcGIS Server, Geoportal Server, IPython, THREDDS, PHP, REST, JavaScript, MySQL, PostgreSQL; layer labels: Aggregation and Programmatic, Geospatial, Database]
• THREDDS – aggregation and interrogation of NetCDF datasets
• Geoportal Server – metadata cataloging, modified to allow data uploading
• IPython – Interactive Python: Python in a web browser! Can be used to compile and document research processes
• ArcGIS Server – web server technology used for geospatial mapping processes
• PostgreSQL – open source enterprise DB

REACCH Data Library
• Based on ESRI’s Geoportal Server software
• Linux/Tomcat/Java
• Library can be accessed at data.reacchpna.org

REACCH THREDDS Server
• Thematic Real-time Environmental Distributed Data Services (THREDDS)
• Developed by UCAR
• Aggregates and subsets multidimensional datasets (NetCDF)
thredds.reacchpna.org

REACCH Data Analysis Library
• Use of geoprocessing services for analytics
• Climate time series
• Subsetting and aggregation
• Integrative data queries (e.g., biotic and climatic data)
• More applied tools:
  • Growing Degree Day analysis
  • Crop buffering
analysis.reacchpna.org

Data Library Integrative Analysis Tool Examples

Interactive Python Server
• Useful for collaboration and informal scientific analysis
• Allows for arcpy integration
• IPython Notebook server available
ipython.reacchpna.org

Summary Overview
• Development of a data architecture that emphasizes the use of web services (OPeNDAP, REST) has given PNW researchers access to more robust datasets for varied and integrative analysis
• Computing capabilities have been enhanced by placing data closer to HPC, as well as by using perfSONAR testing to find bottlenecks
• Subsetting and aggregation methods have been very valuable
• Python geoservices are a nice way to encapsulate and deploy geographic and temporal data transformations

References
• CC:NIE – Support Big Science Data at U. of Idaho – http://www.nsf.gov/awardsearch/showAward?AWD_ID=1341040
• Northwest Knowledge Network (NKN) – https://www.northwestknowledge.net/
• perfSONAR – http://www.perfsonar.net/
• Science DMZ Security – http://fasterdata.es.net/science-dmz

REACCH Information Access
www.reacchpna.org
data.reacchpna.org
analysis.reacchpna.org
policy.reacchpna.org
research.reacchpna.org
education.reacchpna.org
extension.reacchpna.org
press.reacchpna.org
dictionary.reacchpna.org
help.reacchpna.org
northwestknowledge.net

reacch-list@uidaho.edu
reacch-student-list@uidaho.edu
reacch-faculty-list@uidaho.edu

Presentation and contact info available from:
erichs@uidaho.edu
sheneman@uidaho.edu

Questions?
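
Backup example: the OPeNDAP subsetting mentioned in the Summary Overview can be illustrated with a small sketch that builds a DAP2 constraint expression, so a client requests only an index range of one variable instead of a whole NetCDF file. This is a minimal sketch, not the project's code: the helper name `opendap_subset_url`, the dataset path, and the variable name `tasmax` are hypothetical, and the `/thredds/dodsC/` prefix is assumed from the default THREDDS Data Server layout.

```python
from typing import Sequence, Tuple

# (start, stride, stop) index triple; in DAP2 the stop index is inclusive.
IndexRange = Tuple[int, int, int]


def opendap_subset_url(dataset_url: str, variable: str,
                       ranges: Sequence[IndexRange],
                       response: str = "ascii") -> str:
    """Build an OPeNDAP (DAP2) URL requesting a hyperslab of one variable.

    `ranges` holds one (start, stride, stop) triple per dimension, in the
    variable's own dimension order (for example time, lat, lon).
    `response` is the DAP2 response suffix, such as "ascii" or "dods".
    """
    hyperslab = ""
    for start, stride, stop in ranges:
        if start < 0 or stride < 1 or stop < start:
            raise ValueError(f"invalid index range: {(start, stride, stop)}")
        hyperslab += f"[{start}:{stride}:{stop}]"
    return f"{dataset_url}.{response}?{variable}{hyperslab}"


if __name__ == "__main__":
    # Hypothetical dataset path and variable name, for illustration only.
    base = "http://thredds.reacchpna.org/thredds/dodsC/HYPOTHETICAL/maca_example.nc"
    # First 31 time steps over a small lat/lon index window.
    print(opendap_subset_url(base, "tasmax",
                             [(0, 1, 30), (100, 1, 120), (200, 1, 240)]))
```

Because the server performs the slicing, only the requested window crosses the network, which is the practical benefit of the subsetting approach summarized above.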