Internet2 DBDA Co Chair Call 08102015

Distributed Big Data and
Analytics (DBDA)
Internet2 CINO Initiative Working Group Co-Chair Meeting
10 August 2015
Chairs:
Alex Feltus, Clemson
Sam Gustman, USC
Marc Hoit, NC State
Meeting Objectives
•  Discussion of eight submitted use cases
•  Discussion of next steps
Use case input from the working group
Use Case # | Use Case Title | Name | Institution
1 | Data Analytics of Campus-Scale Power System | Submitted by Alex Feltus; contact Dan Noneaker | Clemson
2 | Intelligent Maintenance Systems Center | Submitted by Jane Combs; contact Prof. Jay Lee | Univ of Cincinnati
3 | Machine Tool Ball Screw Health Monitoring | Submitted by Jane Combs; contact Prof. Jay Lee | Univ of Cincinnati
4 | Bioinformatics | Submitted by Jane Combs | Univ of Cincinnati
5 | Computational Fluid Dynamics Research: Aerospace | Submitted by Jane Combs | Univ of Cincinnati
6 | Geography/Climate | Submitted by Jane Combs | Univ of Cincinnati
7 | High Energy Physics | Submitted by Jane Combs | Univ of Cincinnati
8 | Modeling and Simulations | Submitted by Jane Combs | Univ of Cincinnati
Data Analytics of Campus-Scale Power System
Submitted by Alex Feltus at Clemson
Contact Dan Noneaker at Clemson
•  Project/Research Title: Data Analytics of Campus-Scale Power System
•  Industry Sector: Electric Power Utility
•  Science Sub-domain: Electrical Engineering
•  Short Description of Project & Relation to Big Data: The local electric power grid will be heavily instrumented on a campus
containing a mix of residential sites, office spaces, industrial-scale electromechanical systems, and distributed energy sources.
The instruments will be networked to a server that provides data for use in analytics focused on electric energy consumption,
electric-service reliability, power quality, and local grid planning and design. The analytics will support research in local-grid
technologies, distributed control of the electric grid, and power electronics, etc.
•  Potential Industrial Partners: Duke Energy (Clemson's electric service provider), other electric utilities, power-industry
instrumentation and electronics manufacturers, power-system monitoring and control vendors
•  Other Faculty Involved: All power faculty at Clemson, power research staff at Clemson's CURI site in Charleston, SoC faculty
working in data analytics
•  Best Contact: Dan Noneaker, ECE Dept. Chair, dnoneak@clemson.edu
•  Big Data Attributes: sensor, near-realtime, distributed, geospatial
•  Aggregate Data Size: Now: 2 TB; 2016: 4 TB; 2017: 16 TB; 2020: 1 PB
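The projected sizes above imply a steep growth curve. As a rough illustration only, the sketch below computes the constant yearly multiplier that would connect two of the projected sizes. It assumes that "Now" means 2015 (the year of this call) and that 1 PB = 1000 TB; the function name and the dictionary are hypothetical helpers, not part of the use case submission.

```python
def annual_growth_factor(start_tb: float, end_tb: float, years: float) -> float:
    """Constant yearly multiplier that turns start_tb into end_tb after `years` years."""
    if start_tb <= 0 or end_tb <= 0 or years <= 0:
        raise ValueError("sizes and years must be positive")
    return (end_tb / start_tb) ** (1.0 / years)


# Figures from the slide. Assumptions: "Now" = 2015, 1 PB = 1000 TB.
projections_tb = {2015: 2.0, 2016: 4.0, 2017: 16.0, 2020: 1000.0}

# Implied multiplier per year over the whole 2015-2020 span.
overall = annual_growth_factor(projections_tb[2015], projections_tb[2020], 2020 - 2015)

# Implied multiplier per year for each consecutive pair of projections.
years = sorted(projections_tb)
for first, second in zip(years, years[1:]):
    factor = annual_growth_factor(projections_tb[first], projections_tb[second], second - first)
    print(f"{first} -> {second}: x{factor:.2f} per year")
print(f"overall 2015 -> 2020: x{overall:.2f} per year")
```

Under these assumptions the projection works out to more than a tripling of stored data every year, which is the kind of figure the deep dive calls could check against the planned instrumentation rollout.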
Intelligent Maintenance Systems (IMS) Center
Submitted by Jane Combs at University of Cincinnati
Contact Prof. Jay Lee at University of Cincinnati
•  Project/Research Title: Utilizing Prognostics & Health Management (PHM) Cloud Technology to Improve Band Sawing
Process
•  Industry Sector: Manufacturing / Industrial Machinery
•  Science Sub-domain: Data Analytics / Prognostics & Health Management
•  Short Description of Project & Relation to Big Data: The goal of this project is to acquire a large amount of operating data from
band saw machines both in the field and from an in-house test bed. This data is then analyzed using the Watchdog Agent®
toolkit to assess and predict the health condition of the monitored band saws. Once validated, this approach will be used to
construct a commercial cloud-based platform and mobile app for the project sponsor.
•  Best Contact: Professor Jay Lee
•  Big Data Attributes: Sensor, Near Real-time
•  Aggregate Data Size: Now: 5 TB; 2016: 10 TB; 2017: not provided; 2020: not provided
Machine Tool Ball Screw Health Monitoring
Submitted by Jane Combs at University of Cincinnati
Contact Prof. Jay Lee at University of Cincinnati
•  Project/Research Title: Machine Tool Ball Screw Health Monitoring
•  Industry Sector: Manufacturing / Industrial Machinery
•  Science Sub-domain: Data Analysis / Prognostics & Health Management
•  Short Description of Project & Relation to Big Data: The goal of this project is to conduct multiple run-to-failure tests using
commercially available machine tool ball screws and motors to collect data and design a data driven model for health monitoring
and prediction of such ball screws. Data from these tests is transferred and stored on a central server. A mobile app will be
developed for monitoring these tests.
•  Best Contact: Professor Jay Lee
•  Big Data Attributes: Sensor, Near Real-time
•  Aggregate Data Size: Now: 5 TB; 2016: 15 TB; 2017: 30 TB; 2020: not provided
Bioinformatics
Submitted by Jane Combs at University of Cincinnati
•  Project/Research Title: NIH BD2K-LINCS Perturbation Data Coordination and Integration Center
•  Industry Sector: Health care / Environmental Health/ Biomedical Research
•  Science Sub-domain: Bioinformatics and Data Science
•  Short Description of Project & Relation to Big Data: The Library of Integrated Network-Based Cellular Signatures (LINCS) project
is expected to produce masses of data collected from human cells and tissues perturbed with drugs and other molecules. The
center’s role is to develop new methods to integrate big data, devise intelligent ways to mine and analyze it, build intuitive tools
to interact with it, and educate the research community on how best to leverage this trove of information for biomedical
research.
•  Best Contact: Jane Combs, combsje@uc.edu
•  Big Data Attributes: In biomedical research, these data sources include the diverse, complex, disorganized, massive, and
multimodal data being generated by researchers, hospitals, and mobile devices around the world.
•  Aggregate Data Size: not provided for Now, 2016, 2017, or 2020
Computational Fluid Dynamics Research: Aerospace
Submitted by Jane Combs at University of Cincinnati
•  Project/Research Title: Study of Active and Passive Flow Control Techniques over Turbine Blades
•  Industry Sector: Aerospace
•  Science Sub-domain: Mechanical Engineering, Comp Fluid Dynamics
•  Short Description of Project & Relation to Big Data: Collaborative immersive visualization of large datasets and simulation
trajectories to support the study of active and passive flow control techniques and turbine-blade cooling. The goal of such
simulations is to provide predictive performance analysis of physical systems that may contain many integrated components and
which are described by multiple, interacting, physical processes.
•  Best Contact: Jane Combs, combsje@uc.edu
•  Big Data Attributes: Simulation datasets and data visualization
•  Aggregate Data Size: Now: 2 TB; 2016: 4 TB; 2017: 6 TB; 2020: not provided
Geography/Climate
Submitted by Jane Combs at University of Cincinnati
•  Project/Research Title: Toward a Circumarctic Lakes Observation Network (CALON)--Multiscale Observations of
Lacustrine Systems
•  Industry Sector: Geography
•  Science Sub-domain: Climate
•  Short Description of Project & Relation to Big Data: Expand on existing lake monitoring sites in northern Alaska by developing a
network of regionally representative lakes along environmental gradients from which we will collect baseline data to assess
current physical, chemical, and biological lake characteristics. Download and process hundreds of processed satellite
image data sets of up to 1 TB with four Internet2 universities and the NSF National Snow and Ice Data Center. Develop and refine data
management, visualization, and archiving activities with ACADIS.
•  Best Contact: Jane Combs, combsje@uc.edu
•  Big Data Attributes: Satellite image data sets
•  Aggregate Data Size: Now: 1 TB; 2016: 3 TB; 2017: not provided; 2020: not provided
High Energy Physics
Submitted by Jane Combs at University of Cincinnati
•  Project/Research Title: Large Hadron Collider beauty (LHCb) experiment at CERN studying heavy flavor physics
•  Industry Sector: Nuclear Physics, Energy
•  Science Sub-domain: Physics
•  Short Description of Project & Relation to Big Data: The physics focus is studying oscillations of matter into anti-matter and
studying the differences in decay rates of matter and corresponding anti-matter to “mirror-image” final states. These address
the nature of fundamental interactions between the basic constituents of matter. We move large data files from host laboratories
to computers at UC for final analysis. For example, a single file with an LHCb NTUPLE from a small fraction of the data is 3GB.
The size of the data set to be transferred for this analysis will be on the order of 1TB.
•  Best Contact: Jane Combs, combsje@uc.edu
•  Big Data Attributes: LHCb NTUPLE data sets
•  Aggregate Data Size: Now: 3 TB; 2016: 6 TB; 2017: 6 TB; 2020: not provided
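The description above gives two concrete numbers: roughly 3 GB per NTUPLE file and about 1 TB to transfer for one analysis. The sketch below is a back-of-the-envelope estimate of what that means in file count and wall-clock transfer time. The link rates (1 and 10 Gbit/s), the efficiency parameter, the uniform 3 GB file size, and the decimal convention 1 TB = 1000 GB are all assumptions for illustration; none of them come from the submission.

```python
import math


def file_count(dataset_gb: float, file_gb: float) -> int:
    """Number of files needed to hold dataset_gb if every file is file_gb."""
    return math.ceil(dataset_gb / file_gb)


def transfer_hours(dataset_gb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Hours to move dataset_gb (decimal gigabytes) over a link of link_gbps gigabits/s.

    `efficiency` is the assumed fraction of the nominal link rate actually achieved.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link rate must be positive and efficiency in (0, 1]")
    bits = dataset_gb * 8e9
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


DATASET_GB = 1000.0  # ~1 TB, from the slide (decimal units assumed)
FILE_GB = 3.0        # one NTUPLE file, from the slide (assumed typical)

print(f"files to move: about {file_count(DATASET_GB, FILE_GB)}")
for gbps in (1.0, 10.0):  # hypothetical link rates
    print(f"{gbps:>4.0f} Gbit/s: {transfer_hours(DATASET_GB, gbps):.2f} h at full rate")
```

Real transfers rarely reach the nominal rate, so passing an efficiency below 1.0 gives a more cautious estimate.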
Modeling and Simulation
Submitted by Jane Combs at University of Cincinnati
•  Project/Research Title: Study of Active and Passive Flow Control Techniques over Turbine Blades
•  Industry Sector: Consumer Products
•  Science Sub-domain: Mechanical Engineering, Modeling and Simulation
•  Short Description of Project & Relation to Big Data: The UC Simulation Center, in collaboration with Procter & Gamble, focuses
on high-fidelity numerical simulation of complex flow phenomena with a wide range of length and time scales. The UC
Simulation Center is a partnership where students work on M&S of complex industrial problems associated with porous media,
multiphase flows, etc. Predictive performance analysis of the multidisciplinary, multi-scale systems generates terabytes of data.
•  Best Contact: Jane Combs, combsje@uc.edu
•  Big Data Attributes: Simulation datasets and data visualization
•  Aggregate Data Size: Now: 2 TB; 2016: 4 TB; 2017: 6 TB; 2020: not provided
Next Steps
•  Schedule deep dive calls between Co-Chairs and Use Case POCs
•  Determine co-chair presenters for August 31 Joint Collaborative Innovation Community call
•  Preparation for in-person meetings at TechEx – October 4-7 in Cleveland, Ohio
   –  DBDA Innovation Working Group meeting – 75 minutes
      •  Review current status, use cases, gather new ideas
   –  Collaborative Innovation Community meeting – all 3 working groups together – 90 minutes
      •  Each working group presents status
      •  Invite new participants and new ideas
      •  Innovation hackathon over lunch for new ideas in current innovation areas or new ones
•  Monthly team meeting – next one September
Thank You