Alan Blatecky

OCI:
Opportunities & Challenges
Alan Blatecky
Office of Cyberinfrastructure
OCI Role
Technology Push
Science Pull
Capabilities increase as refinements are implemented
Development modifications made as required
Spiral Development
Advanced Computational Infrastructure (ACI)
Vision: Support a comprehensive portfolio of advanced computing
infrastructure, programs and other resources to facilitate cutting-edge
foundational research in Computational and Data Enabled Science and
Engineering (CDS&E) and its applications to all disciplines.
Advanced Computational Infrastructure
• Invest in diverse and innovative national-scale shared resources,
outreach and education, complementing campus and other investments
• Leverage and invest in collaborative, flexible “fabrics” dynamically
connecting scientific communities with computational resources and
services at all scales (campus, regional, national, international)
CIPRES: Cyberinfrastructure for Phylogenetic Research
XSEDE
Blue Waters/UIUC
• National resource offering large allocations for a small number of
diverse and significant research projects across the U.S.
• Highly Scalable Heterogeneous System to enable investigations of
computationally challenging problems that require sustained
PetaFlops (10^15) performance and/or large data and large memory
Stampede/UT at Austin
• Expands the range of data-intensive, computationally challenging
science and engineering applications that can be tackled with
current national resources
• Introduces a new heterogeneous architecture based on Intel MIC to
science and engineering research communities
Stampede will accommodate larger simulations (both in fidelity
and number of ensemble members), producing more accurate
forecasts, and permit more research groups to participate during
critical response efforts
Phylogenetic Trees: Stampede will allow us to
approach the full tree for all green plant
species (~500,000) on Earth to gain insights
into the origins of drought resistance or nitrogen
efficiency in plants, which could then be bred
into future food crops
Hurricane Ike tracking predictions using the WRF program and 30
ensemble members (Courtesy F. Zhang, PSU and Y. Weng)
XSEDE
• Enable campus, regional and national resources and
communities to interact transparently
• Flexibly add diverse, distributed, heterogeneous sets of
digital resources (computers, data, instruments) that
change over time
• Provide ACI support (management and user), education
and outreach to science community
• Develop computational science and education expertise
and capabilities across a broad set of disciplines
Computational usage, first 9 months of FY12
• 8,695 distinct users
• 32 NSF Divisions
• 1,833 Publications
[Chart: Number of Allocations (axis 0 to 600) by NSF division: ECD, SBR,
ART, MIP, DDM, HUM, NCR, BNS, DPP, SEE, IRI, BCS, ECS, SES, OCE, IBN,
DEB, EAR, CDA, MSS, STA, DMS, CCR, ATM, TRA, ASC, PHY, CTS, AST, DMR,
CHE, MCB]
Demand and allocation of Service Units
1 Billion SU gap
ACI Challenges for the next decade
• Technology diversity, pace of change and sustainability
• Increase collaboration and interaction among local,
national, and international cyberinfrastructures
• Broadening ACI capabilities to all science and education
including a balance between “Deep” and “Wide”
• Rapidly growing requirements for CDS&E tools,
capabilities, and expertise
• Data, computation and software are three sides of the
same coin – inextricably linked and co-dependent
• Allocation and prioritization of resources
Science and Society Transformed by Data

• Modern science
– Data- and compute-intensive
– Integrative, multiscale
• Multi-disciplinary Collaborations for Complexity
– Individuals, groups, teams, communities
• Sea of Data – Big Data
– Age of Observation
– Distributed, central repositories, sensor-driven, diverse, etc.
Building a National Data Infrastructure
• The data infrastructure will be complex and involve a
range of modalities
– Data centers, clouds, distributed systems, replication
– Partnerships between campuses, government, business
• Leveraging and building on the myriad of data efforts
and projects underway
• New focus on curation, interoperability, sharing of data,
common approaches and data policies
• New sustainability models for data stewardship will
emerge, driven by the needs of individual communities
• Data resources will have to be allocated
– More data being generated than can be stored
– What should be kept, what can be discarded, and on what basis?
Data challenges
• Increase in volume of simulation-based data will strain
and break existing usage models
• Need significant investments in data analytics, tools and
applications development
• Storage solutions and models already a critical problem
• New sustainability models for data stewardship need to
be developed
• CDS&E workforce expertise is becoming ever more
critical, from algorithm developers to data creators,
technicians, managers, and scientists
Role of Software in Science
• Software essential for the bulk of science
– About half the papers in recent issues of Science were
software-intensive projects
– Research becoming dependent upon advances in
software
– Significant software development being conducted
across NSF: NEON, OOI, NEES, NCN, iPlant, etc
• Wide range of software types: system, apps,
modeling, gateways, analysis, algorithms,
middleware, libraries
• Development, production and maintenance are
people intensive
• Software life-times are long compared to hardware
• Under-appreciated value
Software Challenges
• Robust software for data-driven science
– Documentation and sustainability
– Managing increasing complexity
– Disruptive architectures
– Governance of software communities
• Software assurance, reproducibility, trust in models,
simulation & data
• Education: using modern software in education,
educating people how to use and create software,
software engineering
– Interaction with consumer trends, such as app store models
• Policies for citation, stewardship, attribution and
authorship for use of open software
CC-NIE: Data Driven Networking Infrastructure for
the Campus and Researcher
• Network infrastructure improvements at the campus level
– Network upgrades within a campus network to support a wide range of
science data flows
– Re-architecting a campus network to support large science data flows
– Campus network upgrades addressing sustainable infrastructure through
improvements in energy-efficient networking
– Campus network upgrades addressing the growing needs in mobile
networking
– Network connection upgrade for the campus connection to a regional
optical exchange or point-of-presence that connects to Internet2 or
National LambdaRail
CC-NIE: Network Integration and Applied
Innovation
• End-to-end network CI through integration of existing
and new technologies and applied innovation
• Applying network research results, prototypes, and
emerging innovations to enable (identified) research
and education
• Leverage new and existing investments in network
infrastructure, services, and tools by combining or
extending capabilities to work as part of the CI
environment used by scientific applications and users
International Research Network Backbone Concept: Notional only
Architecture, aggregation nodes, locations, bandwidth, connectivity to be determined
Aggregation nodes: Primary connection to International Backbone; every country
and economy has the opportunity to be an aggregation node or connect to one
Shared High Bandwidth network (multi 10/100 Gig) interconnecting
Aggregation nodes
Sep 2012
Education, Learning, Workforce Development,
CDS&E
At the end of the day, cyberinfrastructure is all about
people; enabling them to do what they have not been
able to do before
Conclusion
OCI
Transforming Science a Bit at a Time