Survey of Emerging IT Trends & Technology

1
CSIG 10
Survey of Emerging IT Trends
and Technologies
Chaitan Baru
SDSC
2
Cyberinfrastructure
• The “cyberinfrastructure” initiative is an attempt to provide explicit
investments in IT for science & engineering research and education
• From NSF’s Cyberinfrastructure Vision for 21st Century Discovery,
www.nsf.gov/od/oci/ci-v7.pdf, July 20, 2006
– “The comprehensive infrastructure needed to capitalize on dramatic advances in
information technology has been termed cyberinfrastructure.”
– “…integrates hardware for computing, data and networks, digitally-enabled sensors,
observatories and experimental facilities…”
– “…an interoperable suite of software and middleware services and tools...”
– “Investments in interdisciplinary teams and cyberinfrastructure professionals with
expertise in algorithm development, system operations, and applications development are
also essential…”
– “In 1999, the PITAC released the seminal report ITR-Investing in our Future, prompting
new and complementary NSF investments in CI projects, such as the Grid Physics
Network (GriPhyN) and international Virtual Data Grid Laboratory (iVDGL) and the
Geosciences Network, known as GEON.”
3
Geoinformatics
• A vision for Geoinformatics, from the
NSF Workshop on Envisioning a National
Geoinformatics System for the United
States, Denver, March 2007
– “…a future in which someone can sit at a
terminal and have easy access to vast stores of
data of almost any kind, with the easy ability to
visualize, analyze and model those data.”
4
Geoinformatics
From David Lambert, NSF EAR/GEO
Presentation at GEON Annual Meeting, 2005
5
Geoinformatics
Cyberinfrastructure for the Solid
Earth Sciences: Objectives
• Make data, tools, applications
…and communities…
easily accessible online
• Provide an integration environment for 3D and 4D
geoscience data integration
Book to be published this year by Cambridge University
Press. Co-editors: Randy Keller and Chaitan Baru
6
A Use Case for Geoinformatics
• A user request of the form:
“For a given region (i.e. lat/long extent, plus
depth), return a 3D structural model with
accompanying physical parameters of density,
seismic velocities, geochemistry, and geologic
ages, using a cell size of 10km”
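One way to make this request concrete is to write it down as a data structure. The following is a minimal sketch, not a GEON interface: the class name `RegionRequest`, its fields, and the spherical-Earth conversion constant are all assumptions introduced for illustration. It shows how a lat/long extent, a depth, and a cell size determine the dimensions of the 3D grid that the returned model would have to fill.

```python
import math
from dataclasses import dataclass

# Approximate length of one degree of latitude (assumes a spherical Earth).
KM_PER_DEG_LAT = 111.32


@dataclass(frozen=True)
class RegionRequest:
    """Hypothetical request object for the use case; not an actual GEON API."""
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float
    max_depth_km: float
    cell_size_km: float = 10.0
    parameters: tuple = (
        "density", "seismic_velocity", "geochemistry", "geologic_age",
    )

    def grid_shape(self):
        """Number of cells along north-south, east-west, and depth."""
        mid_lat = math.radians((self.min_lat + self.max_lat) / 2)
        ns_km = (self.max_lat - self.min_lat) * KM_PER_DEG_LAT
        # Degrees of longitude shrink with the cosine of latitude.
        ew_km = (self.max_lon - self.min_lon) * KM_PER_DEG_LAT * math.cos(mid_lat)
        extents = (ns_km, ew_km, self.max_depth_km)
        return tuple(max(1, math.ceil(e / self.cell_size_km)) for e in extents)


request = RegionRequest(0.0, 1.0, 0.0, 1.0, max_depth_km=50.0)
print(request.grid_shape())
```

Every cell in that grid would then need a value for each requested parameter, which is what drives the search, access, and integration services discussed later in the deck.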
Portal-based Science Environments
Support for resource sharing and collaborations
EarthScope Data Portal
– SDSC, San Diego
– IRIS, Seattle
– UNAVCO, Boulder
– ICDP, Potsdam
portal.earthscope.org
9
CUAHSI Hydrologic Information
System, HIS (http://his.cuahsi.org)
– Data Discovery, Data Access, Data Publication
10
GEON: Geosciences Network*
• Funded by NSF IT Research program
• Multi-institution collaboration between IT and Earth
Science researchers
• GEON Cyberinfrastructure provides:
– Authenticated access to data and Web services
– Registration of data sets, tools, and services with metadata
– Search for data, tools, and services, using ontologies
– Scientific workflow environment and access to HPC
– Data and map integration capability
– Scientific data visualization and GIS mapping
* The network / grid concept has been evolving over the past several years
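Registration with metadata and ontology-based search can be illustrated with a toy registry. This is a sketch under stated assumptions: the `IS_A` hierarchy, the `Registry` class, and the registered data set names are invented for illustration and are not GEON's actual ontologies or services. The point is that a search for a broad concept also finds resources registered under narrower terms.

```python
from dataclasses import dataclass, field

# Hypothetical mini-ontology: child term -> parent term.
IS_A = {
    "basalt": "igneous_rock",
    "granite": "igneous_rock",
    "igneous_rock": "rock",
    "sandstone": "sedimentary_rock",
    "sedimentary_rock": "rock",
}


def is_a(term, ancestor):
    """Return True if term equals ancestor or descends from it in IS_A."""
    while term is not None:
        if term == ancestor:
            return True
        term = IS_A.get(term)
    return False


@dataclass
class Registry:
    """Toy registry of resources annotated with ontology terms."""
    entries: list = field(default_factory=list)

    def register(self, name, kind, terms):
        self.entries.append({"name": name, "kind": kind, "terms": set(terms)})

    def search(self, concept):
        """Names of resources with at least one term under the concept."""
        return [
            e["name"]
            for e in self.entries
            if any(is_a(t, concept) for t in e["terms"])
        ]


registry = Registry()
registry.register("state_geologic_map", "dataset", ["basalt", "sandstone"])
registry.register("granite_geochemistry", "dataset", ["granite"])
registry.register("gravity_grid", "dataset", ["gravity_anomaly"])

print(registry.search("igneous_rock"))
```

A plain keyword search for "igneous_rock" would match neither data set; the hierarchy is what connects the query to resources registered under "basalt" and "granite".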
GEON: The Geosciences Network
www.geongrid.org
• GEON is a coalition among IT and Earth
Science researchers with the goal of developing
advanced information technologies to enable
new modes of geosciences research
• GEON is developing technologies for
information integration and knowledge
discovery
• Project participants: 14 PI institutions, and
partners including other projects, agencies, and
industry
• GEON has deployed a Web services-based,
distributed computing infrastructure, called the
GEONgrid, across PI and partner sites
• GEONgrid provides access to data
collections, tools, and applications that support
geosciences research
• Project funding: $11.25M, 2002-2007
RESEARCH AND EDUCATION PRODUCTS
AND RESULTS
• Technologies for Ontology-Based Data
Registration, GIS Map Integration, Distributed
Portals, and 4D Visualization
• Research on
– 3D Lithospheric structure
– Gravity Modeling
– Remote Sensing Data Integration
• Cyberinfrastructure Summer Institute for
Geoscientists and graduate courses in
Geoinformatics
GEON Partners
• 14 PI institutions
• Over 20 other partners, including universities,
industry, and
government agencies/labs
PI Institutions
• Arizona State University
• Bryn Mawr College
• Penn State University
• Rice University
• San Diego State University
• San Diego Supercomputer Center/UCSD
• University of Arizona
• University of Idaho
• University of Missouri, Columbia
• University of Texas at El Paso
• University of Utah
• Virginia Tech
• UNAVCO
• Digital Library for Earth System Education (DLESE)
Partners
• Chronos
• CUAHSI-HIS
• ESRI
• Calit2
• Georgia State University
• Geological Survey of Canada
• Georeference Online
• HP
• IBM
• Lawrence Livermore Natl Laboratory
• NASA Goddard, Earth System Division
• SCEC
• U.S. Geological Survey (USGS)
• Purdue University
Affiliated Projects
• EarthScope, IRIS
Key Informatics Areas
• Portals
– Authenticated, role-based access to cyber resources: data, tools, models, model outputs,
collaboration spaces, …
• Data Integration
– Search, discovery and integration of data from heterogeneous information sources
(“mediation” and “semantic integration”)
• Use of workflow systems, and access to HPC
– Ability to “program” at a higher level of abstraction
– Sharing of models, along with “provenance” information
– Gateways to HPC environments
• Management of Geospatial Information
– Using GIS capabilities, map services, geospatial data integration
• Visualization of 3D, 4D geospatial data and information
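The workflow bullet above mentions sharing models together with "provenance" information. A minimal sketch of that idea follows; the function names `run_workflow` and `digest`, the two example steps, and the input values are hypothetical and are not taken from any GEON workflow system. Each step is run in order, and a short content digest of its input and output is recorded so that a shared result can be traced back through the chain.

```python
import hashlib
import json


def digest(value):
    """Short content hash of a JSON-serializable value."""
    encoded = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()[:12]


def run_workflow(steps, data):
    """Apply each (name, function) step in order and record provenance."""
    provenance = []
    for name, func in steps:
        result = func(data)
        provenance.append({
            "step": name,
            "input_digest": digest(data),
            "output_digest": digest(result),
        })
        data = result
    return data, provenance


# Hypothetical two-step workflow over elevations given in metres.
steps = [
    ("drop_missing", lambda pts: [p for p in pts if p is not None]),
    ("to_kilometres", lambda pts: [p / 1000.0 for p in pts]),
]
result, provenance = run_workflow(steps, [1500, None, 250])
print(result)
for record in provenance:
    print(record)
```

Because the output digest of one step equals the input digest of the next, anyone receiving the result and its provenance list can check that the recorded steps form an unbroken chain.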
14
GEON Portal
portal.geongrid.org
• Generic Capabilities:
– Search
– Workbench
– Dynamic map services, map integration
• Applications:
– Paleo database integration
– LiDAR data access and data processing
– SYNSEIS: Online access to computational modeling
system
– Gravity and Magnetic database for US
GEON and Related Portals
Chesapeake Bay Environmental Observatory
National Ecological Observatory Network Prototype
CUAHSI Hydrologic Information System
Tropical Ecology Assessment and Monitoring
Network
EarthScope
Data Search and Integration
GEON LiDAR Workflow
(GLW) Portlet
18
GEON Project and Funding
Structure
GEON
NSF EAR/IF Facility
(GEO, OCI, CISE)
• NSF ITR
• OCI Software Development for
Cyberinfrastructure (SDCI)
OpenTopography
OpenEarth Framework
NSF Geoinformatics
GEON Portal
NSF CluE (GEO, CISE)
CluE
19
Integrated Cyberinfrastructure System
• Education and Training
• Discovery & Innovation
• Application Domains
– Geosciences, Engineering, Environmental Sciences, Physics,
Astronomy, Archaeology, Neurosciences, Biomedicine, …
• Development Tools & Libraries
• Domain-specific Cybertools (software)
• Shared Cybertools (software)
• Middleware Services
• Hardware
• Distributed Resources (computation, storage, communication, etc.)
Source: Dr. Deborah Crawford, Chair, NSF CI Working Committee
20
Community Cyberinfrastructure Projects
• Friendly Work-Facilitating Portals
• Science Domains
– Ocean Observing (ORION)
– Ecological Observatories (NEON)
– Earthquake Engineering (NEES)
– Geosciences (GEON)
– Biomedical Informatics (BIRN)
– High Energy Physics (GriPhyN)
• Your Specific Tools & User Apps.
• Shared Tools
• Development Tools & Libraries
• Middleware Services
• Hardware
• Authentication - Authorization - Auditing - Resource Discovery - Workflows - Visualization - Analysis
• Distributed Computing, Instruments and Data Resources
Source: Prof. Mark Ellisman, UC San Diego
21
Services implied by the
Geoinformatics use case
“For a given region (i.e. lat/long extent, plus depth),
return a 3D structural model with accompanying
physical parameters of density, seismic velocities,
geochemistry, and geologic ages, using a cell size
of 10km”
22
Services implied by the use case
1. Search and discovery
2. Data access
3. Data integration, including
transformations, model execution, and
visualization
– Some scientific visualization
4. Result publication
(and preservation, so that results can be searched and
discovered)
– Digital libraries and archives
All in a distributed environment
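The four services can be read as stages of one pipeline. The sketch below is only an illustration under stated assumptions: `CATALOG` and `PUBLISHED` are in-memory stand-ins for what would be distributed catalog and archive services, and the function names, data set names, and values are invented. It chains discovery, access by bounding box, a trivial integration step, and publication of the result.

```python
# Hypothetical in-memory stand-ins for distributed catalog and archive services.
CATALOG = {
    "gravity": [
        {"lat": 32.1, "lon": -110.2, "value": -45.0},
        {"lat": 40.0, "lon": -105.0, "value": -180.0},
    ],
    "seismic_velocity": [
        {"lat": 32.4, "lon": -110.9, "value": 6.1},
    ],
}
PUBLISHED = {}


def discover(keyword):
    """1. Search and discovery: names of data sets matching a keyword."""
    return sorted(name for name in CATALOG if keyword in name)


def access(name, bbox):
    """2. Data access: records inside (min_lat, max_lat, min_lon, max_lon)."""
    min_lat, max_lat, min_lon, max_lon = bbox
    return [
        r for r in CATALOG[name]
        if min_lat <= r["lat"] <= max_lat and min_lon <= r["lon"] <= max_lon
    ]


def integrate(layers):
    """3. Data integration: combine per-data-set records into one result."""
    return {name: [r["value"] for r in recs] for name, recs in layers.items()}


def publish(result_id, result):
    """4. Result publication: store the result so it can be found later."""
    PUBLISHED[result_id] = result
    return result_id


bbox = (30.0, 35.0, -112.0, -109.0)
layers = {name: access(name, bbox) for name in discover("")}
model = integrate(layers)
publish("demo-region-1", model)
print(model)
```

In a real deployment each function would be a call to a remote service, which is why agreement on discovery and access protocols matters more than the internals of any one stage.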
23
Data “integration”
• A priori integration
– Consistent metadata and data standards and data
“schema”/structure, and semantics are pre-defined
across a set of data resources
– User simply issues a query and receives a result
versus
• Ad hoc integration
– Consistent standards for discovery and data access, but
retrieved data are visualized in a common environment
and user interactively integrates the data
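The difference between the two styles can be sketched in a few lines. This is a hedged illustration only: the source records, field names such as `age_ma` and `age_years`, and the helper `harmonize` are all hypothetical. In the a priori case the sources already share a schema, so one query runs across them unchanged; in the ad hoc case the user supplies the field renaming and unit conversion at integration time.

```python
# A priori integration: both sources already follow one agreed schema.
SOURCE_A = [{"site": "A1", "age_ma": 65.0}]
SOURCE_B = [{"site": "B7", "age_ma": 2.6}]


def query_a_priori(sources, max_age_ma):
    """One query works across all sources because the schema is shared."""
    return [r for src in sources for r in src if r["age_ma"] <= max_age_ma]


# Ad hoc integration: this source uses different field names and units.
SOURCE_C = [{"station": "C3", "age_years": 1.2e7}]


def harmonize(records, rename, convert):
    """Apply a user-supplied field renaming, then per-field conversions."""
    harmonized = []
    for rec in records:
        row = {rename.get(key, key): value for key, value in rec.items()}
        for key, func in convert.items():
            row[key] = func(row[key])
        harmonized.append(row)
    return harmonized


young = query_a_priori([SOURCE_A, SOURCE_B], max_age_ma=10.0)
mapped = harmonize(
    SOURCE_C,
    rename={"station": "site", "age_years": "age_ma"},
    convert={"age_ma": lambda years: years / 1e6},
)
print(young)
print(mapped)
```

The mapping passed to `harmonize` is exactly the knowledge that an a priori system fixes in advance and an ad hoc system leaves to the user.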
24
Evolution of distributed
environments
• Mainframes
– with distributed “synchronous” terminals
• Networked minicomputers
– with proprietary computer networking
protocols
• The Web
– Engineering workstations with open
communications protocols
25
Evolution of distributed
environments
• The Grid
– Distributed computational and storage
resources owned by organizations, orchestrated
together to form “metacomputers”
• The Cloud
– On-demand computational and storage
resources provided as a service over the
Internet, with incremental cost models
26
Clients in a distributed
environment
• “Dumb” terminals
– IBM 3270, vt100
• “Thick” clients
– Workstations as clients in a client-server system
• “Thin” clients
– Original PC desktops
• Thick clients
– Modern PCs with powerful capabilities (64-bit, multicore, large
memory)
• Thin clients
– Mobile devices
27
Distributed
environments…contd.
• Service-oriented architecture, SOA
– A programming style for distributed computing
– Services may be distributed in wide area
(Internet scale)
– or local area (within a datacenter)
• Data inertia
– Moving data to computation vs
– Computation to data
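The data inertia trade-off comes down to simple arithmetic on transfer times. The numbers below are illustrative assumptions, not measurements: a 1 TB data set, a 10 MB analysis program, and a 1 Gbit/s link with no protocol overhead. The function name `transfer_seconds` is likewise introduced only for this sketch.

```python
def transfer_seconds(size_bytes, bandwidth_bits_per_s):
    """Idealized transfer time, ignoring latency and protocol overhead."""
    return size_bytes * 8 / bandwidth_bits_per_s


# Illustrative assumptions only.
DATASET_BYTES = 10**12   # 1 TB of observations
CODE_BYTES = 10**7       # 10 MB analysis program
LINK_BPS = 10**9         # 1 Gbit/s wide-area link

move_data = transfer_seconds(DATASET_BYTES, LINK_BPS)
move_code = transfer_seconds(CODE_BYTES, LINK_BPS)

print(f"moving data to computation: {move_data:.0f} s")
print(f"moving computation to data: {move_code:.2f} s")
```

Under these assumptions, shipping the data takes hours while shipping the program takes a fraction of a second, which is the usual argument for running services close to large data collections.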
28
Virtual Organizations (VOs)
• A socio-technical concept
• A distributed collection of entities and resources that
come together to solve a specific problem
– Multiple participants
– Distributed sites
– Participants are from different “administrative domains”
– Policies, rules, systems of the VO may be different than those of
the participating organizations
• Requires agreement on basic standards and protocols
to enable resource and data sharing
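A VO's own policy layer can be sketched as a small role-based access check. Everything here is hypothetical: the `Member` and `VirtualOrganization` classes, the role names, the actions, and the example users are invented for illustration. The sketch shows participants from different administrative domains being granted rights by the VO's policy rather than by their home organizations.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Member:
    """A participant identified by user name and home administrative domain."""
    user: str
    home_domain: str


@dataclass
class VirtualOrganization:
    """Toy VO with its own role-based policy, separate from home domains."""
    name: str
    roles: dict = field(default_factory=dict)        # Member -> role name
    permissions: dict = field(default_factory=dict)  # role name -> set of actions

    def enroll(self, member, role):
        self.roles[member] = role

    def is_allowed(self, member, action):
        role = self.roles.get(member)
        return action in self.permissions.get(role, set())


vo = VirtualOrganization(
    name="demo-vo",
    permissions={
        "researcher": {"search", "download"},
        "curator": {"search", "download", "register"},
    },
)
alice = Member("alice", "university-a.example")
bob = Member("bob", "survey-b.example")
vo.enroll(alice, "curator")
vo.enroll(bob, "researcher")

print(vo.is_allowed(alice, "register"))
print(vo.is_allowed(bob, "register"))
```

Authentication itself would still rely on agreed standards shared by the participating sites; the sketch covers only the authorization decision made inside the VO.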
29
Other Geoinformatics Efforts
• OneGeology.org
– International initiative of
geological surveys to make
dynamic geological map data
available via the web.
• USGS initiative
– Presentation by Dr. Linda
Gundersen, at Geoinformatics
2007, San Diego.
USGS: 1000s of National and
Regional Databases
• The National Map – topographic, elevation,
orthoimagery, transportation, hydrography, etc.
• Geospatial One Stop portal
• MRDATA – Mineral Resources and Related
Data
• The National Geologic Map Database –
standardized community collection of
geologic mapping
• National Water Information System, NWISWeb
• National Geochemical Survey Database
(PLUTO, NURE)
• National Geophysical Database (aeromag,
gravity, aerorad)
• Earthquake Catalogs
• North American Breeding Bird Survey
• National Vegetation/speciation maps
• National Oil and Gas Assessment
• National Coal Quality Inventory
Source: Presentation by Dr. Linda Gundersen, USGS, at Geoinformatics
2007, San Diego, CA.
31
USGIN: Geoscience
Information Network