National Science Foundation

Cyberinfrastructure for Materials* Discovery and Innovation
*Some Assembly Required
Daryl W. Hess, Program Director, DMR
dhess@nsf.gov

Big Data Request for Input: closes November 14
https://www.nitrd.gov/bigdata/rfi/02102014.aspx
Opportunities to Shape How Science is Done

Data plays a key role …
• The way computation, data, experiment & theory interact
• Enhanced Connectivity
  – Unprecedented Communication of Results
  – What didn’t work, not just what did
• The Data is getting bigger ….
  – Simulations
  – New instruments
• How do we handle BIG data?
  – Distributed, Inhomogeneous
  – Provenance?
  – Access? “Presentation?”
  – Will it be there tomorrow?
  – Discovery tool? A gateway to new problems? ….

OSTP: Big Data is a Big Deal
President: … an “all hands on deck” effort.
“By improving our ability to extract knowledge and insights from large and complex collections of digital data, the initiative promises to help accelerate the pace of discovery in science and engineering, strengthen our national security, and transform teaching and learning.”
More than $200 million from 6 Federal departments and agencies
Big Data: data sets so large, complex, or rapidly-generated that they can’t be processed by traditional information and communication technologies
Year 2: Obama Administration encourages federal agencies, industry, academia, state and local governments to develop and participate in Big Data initiatives.
• advance core Big Data technologies
• use Big Data to advance national goals
• use competitions and challenges

To help businesses discover, develop, and deploy new materials twice as fast, we’re launching what we call the Materials Genome Initiative. The invention of silicon circuits and lithium ion batteries made computers and iPods and iPads possible, but it took years to get those technologies from the drawing board to the market place. We can do it faster.
-President Obama, Carnegie Mellon University, June 2011

The Materials Genome Initiative
Discovery-to-market in less than half the time at half the cost
A Materials Innovation Infrastructure is Required

MGI for Global Competitiveness: A strategy for acceleration
[Figure: Number of Materials “to Market” versus Time (yrs), with numbered stages 1 through 7 (1. Discovery) and a 3-5 year span marked]

Nanotechnology Knowledge Infrastructure: Enabling National Leadership in Sustainable Design
A Nanotechnology Signature Initiative
Thrust: Foster an agile modeling network for multidisciplinary collaboration.
Thrust: Create a robust digital nanotechnology data and information infrastructure
Thrust: Build a sustainable nanotechnology cybertoolbox
Thrust: Nurture a diverse collaborative community to create nanotechnology to meet national challenges
A community-based knowledge infrastructure to accelerate nanotechnology discovery and innovation
http://www.nano.gov/NSINKI

The Science Drives The Cyberinfrastructure

NSF Workshop
The Materials Genome Initiative: The Interplay of Experiment, Theory and Computation
“… a seamless interplay between experiment, computation and demonstration is essential.”
A key role for data
J.J. de Pablo, B. Jones, C. Lind-Kovacs, V. Ozolins, A. P. Ramirez. Current Opinion in Solid State and Materials Science 18, 99–117 (April, 2014)

NSF Workshop
The Materials Genome Initiative: The Interplay of Experiment, Theory and Computation
Consider a university researcher with an idea for a new battery material … She queries a large database of experimental and computational data, accesses an online simulation service and suggests other promising compounds … … [Others] make some of the materials … and upload energy storage performance data to the database. … other computational researchers show why the new materials are effective.
The new material is made in several labs and … stimulating industry. Commercial scale fabrication and scale-up are taken into consideration by industrial scientists and engineers as materials are being developed. All of this happens faster than the time it takes for a single research paper to be published today. (highly abridged)
J.J. de Pablo, B. Jones, C. Lind-Kovacs, V. Ozolins, A. P. Ramirez. Current Opinion in Solid State and Materials Science 18, 99–117 (April, 2014)

Designing Materials to Revolutionize and Engineer our Future
NSF’s signature response to the MGI
What: … activities that accelerate materials discovery and development by building the fundamental knowledge base needed to progress towards designing and making a material with a specific and desired function or property from first principles.
DMREF goal: to control material properties through design. This is to be accomplished by understanding the interrelationships of composition, processing, structure, properties, performance, and process control.
How: … the proposed research must be a collaborative and iterative process wherein theory guides computational simulation, computational simulation guides experiments, and experiments further guide theory.
Benefits from existing software and data and will contribute new data, models, and software

Software Infrastructure for Sustained Innovation (SI2)
Innovation
Reusable and Sustainable Software
NSF 14-520
• Through Division of Advanced Cyberinfrastructure
• ~50 Elements & Frameworks projects & 13 potential Institutes planning projects (http://bit.ly/sw-ci for current SI2 projects)

SI2-SSI Collaborative Research: A Computational Materials Data and Design Environment
Goal: Develop modular and extensible high-throughput ab-initio tools and critical property databases for materials design
Developing tools to model
• Defect energetics/thermodynamics
• Solid state diffusion
• Charged surfaces
Participants: Dane Morgan, Kristin Persson, Gerbrand Ceder, Alan K Dozier, Raphael Finkel
[Figure panels: Formation energies; Migration energies]
CyberInfrastructure Impact
• Open source tools for easy public use
• Workshops on high-throughput computation and the Materials Genome framework.
• Integration with Materials Project at LBNL
• Training of next generation in automated computational materials design tools

SSI: Scalable, Extensible, and Open Framework for Ground and Excited State Properties of Complex Systems
Sohrab Ismail-Beigi, Yale (139804)
Laxmikant Kale, UIUC (1339715)
Glenn Martyna, IBM (collaborator)
• Develop a first-principles ground state/excited state and response code with good scaling properties on HPC platforms
• For study of new materials science, condensed matter physics, and chemistry problems.
• An American electronic structure code
• Highly scalable codes for HPC that would open new frontiers through the ability to tackle larger scale problems.
• Helps support sophisticated materials design efforts and the Materials Genome Initiative.
[Figure caption: DFT-GGA predicted structure of the ordered nanoscale heterojunction of P3HT polymers (gold, above) anchored covalently by sulfur atoms to the [10-10] surface of a ZnO nanowire (red and purple, bottom). A GW/BSE prediction of the electronic bands and excitons of this interface would help for photo-voltaic applications.]

S2I2 Software Institutes
What does materials research need?
• Long-term hubs of excellence in software infrastructure and technologies, research and application communities of substantial size and disciplinary breadth.
• ~ $1-2 Mil./yr.
• Conceptualization awards possible
~13 awarded across NSF
Collaborative Research: Scientific Software Innovation Institute for Advanced Analysis of X-Ray and Neutron Scattering Data (SIXNS) - Brent Fultz, CalTech
Collaborative Research: A Scientific Software Innovation Institute for Computational Chemistry and Materials Modeling (S2I2C2M2) - Tom Crawford, VaTech

Data & NSF: Enabling a Knowledge Infrastructure
Warning: Emphasis will change!
• BIGDATA (NSF 14-542) [$200K/yr. - $500K/yr.]
Critical Techniques and Technologies for Advancing Big Data Science & Engineering
"Foundations" (F): developing or studying fundamental techniques, theories, methodologies, and technologies of broad applicability to Big Data problems.
"Innovative Applications" (IA): developing techniques, methodologies and technologies of key importance to a Big Data problem directly impacting at least one specific application.

Data & NSF: Enabling a Knowledge Infrastructure
Warning: Emphasis will change!
• DIBBs (NSF 14-542)
Data Infrastructure Building Blocks
“… development of robust and shared data-centric cyberinfrastructure capabilities to accelerate interdisciplinary and collaborative research in areas of inquiry stimulated by data.”
Pilot Demonstration Awards (up to $500K/yr.)
Addresses needs of a large number of researchers within a domain
Early Implementation Awards (up to $1 Million/yr.)
Multiple research communities in multiple S&E domains

Community Input: Grappling with the *issues
Workshops
– NKI: Data Sharing Workshop
  • Diverse participants: nanotechnology data producers & users, journal editors, industry, ….
  • Cross-cutting needs of the communities
– DMR: Data Workshop
  • Data Sharing in the context of materials research
* Data curation, completeness, relevance, quality, “code publication,” ontologies …

Big Data Request for Input
https://www.nitrd.gov/bigdata/rfi/02102014.aspx
“This request encourages feedback from multiple big data stakeholders to inform the development of a framework, set of priorities, and ultimately a strategic plan for the National Big Data R&D Initiative.”
VISION STATEMENT: We envision a Big Data innovation ecosystem in which the ability to analyze, extract information from, and make decisions and discoveries based upon large, diverse, and real-time data sets enables new capabilities for federal agencies and the nation at large; accelerates the process of scientific discovery and innovation; leads to new fields of research and new areas of inquiry that would otherwise be impossible; educates the next generation of 21st century scientists and engineers; and promotes new economic growth.

Moving Forward
A useful way to think about data?
DMR MGI Workshop – Shared Data is a focal point for collaborative research
Data infrastructure across the scales …
[Figure labels: Small Group Collaboration; Center-Center Collaboration; Intra-Center Collaborations; Many Mixed Collaborations]

Thank You!
Nanotechnology Knowledge Infrastructure: Enabling National Leadership in Sustainable Design
A Nanotechnology Signature Initiative
Thrust Areas:
• A diverse collaborative community of scientists, engineers, and technical staff to support research, development, and applications of nanotechnology to meet national challenges
• An agile modeling network for multidisciplinary intellectual collaboration that effectively couples experimental basic research, modeling, and applications development
• A sustainable cyber-toolbox to enable effective application of models and knowledge to nanomaterials design
• A robust digital nanotechnology data and information infrastructure to support effective data sharing, collaboration, and innovation across disciplines and applications
Accelerating Nanotechnology discovery and innovation
http://www.nano.gov/NSINKI