Simulation and Big-Data Challenges in Tuning Building Energy Models

Jibonananda Sanyal, Ph.D. and Joshua New, Ph.D.
Building Technologies Research & Integration Center (BTRIC)
Whole Building and Community Integration Group
Workshop on Modeling and Simulation of Cyber-Physical Energy Systems
May 20, 2013
Presentation Summary
• Autotune – calibration problem
• Running a number of EnergyPlus simulations
– Inputs
– Workflow
– Shared and distributed memory supercomputers
– Data management
• Autotune software-as-a-service
• Key tricks
Managed by UT-Battelle
for the U.S. Department of Energy
The Autotune Idea
Bridging the gap between the real world and the virtual one
[Figure: E+ Input Model]
Autotune
Bridging the gap between the real world and the virtual one
[Figure: E+ Input Model; E+ output (internal variable data); Avg; Instrumented Building; Sensor data]
Manual mapping initially
Autotune
Large-scale sensitivity analysis and uncertainty quantification
[Figure: E+ Input Model; ensemble of E+ inputs; E+ output changes; Sensor Data]
E+ output effect compared to sensor data
Generating the Inputs
• Parametric sampling
– Experts selected 156 of 3000+ input parameters
– Brute force using 3 levels: 5x10^52 E+ simulations!
– 14 parameter full combinatorial subset
– Markov Order 1 and 2 sampling
• Input generation
– 700 to 950 KB input file
– Perl program, sequential; Excel; Python
– E+ supplied parametric preprocessor
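A full combinatorial subset can be enumerated lazily, as in the minimal sketch below. The parameter names and level values are hypothetical placeholders, and it is an assumption here that the 14-parameter subset used the same three levels per parameter as the brute-force estimate; under that assumption the count is 3^14 = 4,782,969, which is close to the 5 million residential simulations listed for the building types.

```python
from itertools import islice, product


def full_factorial(levels_by_param):
    """Yield one dict per combination of parameter levels (lazy, so memory stays flat)."""
    names = list(levels_by_param)
    for combo in product(*(levels_by_param[name] for name in names)):
        yield dict(zip(names, combo))


def combination_count(num_params, num_levels):
    """Number of runs in a full combinatorial design."""
    return num_levels ** num_params


# Hypothetical parameters, each with three levels (low, default, high).
example = {
    "wall_insulation_r": (11.0, 13.0, 19.0),
    "infiltration_ach": (0.2, 0.35, 0.5),
    "window_u_factor": (0.3, 0.4, 0.5),
}

first_two = list(islice(full_factorial(example), 2))
total_small = sum(1 for _ in full_factorial(example))
total_14 = combination_count(14, 3)
print(total_small, total_14)
```

Each yielded dict would then be substituted into a template input file; that substitution step is not shown because the slides do not describe its details.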
Types of buildings simulated
• Residential
– 5 million simulations
• Medium Office
– 1 million
• Stand-alone retail
– 1 million
• Warehouse
– 1 million
Torcellini et al. 2008, “DOE Commercial Building
Benchmark Models”, NREL/CP-550-43291, National
Renewable Energy Laboratory, Golden CO.
ORNL High Performance Computing Resources
Cost: $97 million
DOE BTO use: 500k hours granted (CY13)
Autotune: parametric E+ sims; data mining with machine learning
Jaguar: 224k cores, 360TB memory, 10PB of disk, 1.7 petaflops
Cost: $104 million
DOE BTO: 500k hours granted (CY12)
Nautilus:
Frost: 2048 SGI Altix; 136 nodes
1024 cores, shared-memory 200k hours granted (CY13)
DOE BTO: 30k hours granted (CY11), 200k hours granted (CY12), 250k hours (CY13)
Lens cluster: 77 nodes – 45x 128GB, 32x 64GB with NVIDIA 880 and Tesla dual-GPU
EVEREST visualization (CY13)
Gordon:
250k hours (CY13)
Target supercomputers
Titan (formerly Jaguar)
• 299,008 cores
• 18,688 nodes
• 20 petaflops
• 710 TB of distributed RAM
• 32 GB per node
Nautilus
• 1024 cores
• 4TB Shared memory
Frost
• 2048 cores
• 4 TB distributed RAM
• 32 GB per node
Simulation Workflow
• E+ is designed for desktops, not supercomputers
• Biggest bottleneck is Lustre IO
• Run from Ramdisk… tricky with supercomputers
• Pack inputs – 64 files into each tarball
• Created a simplified, managed script to invoke E+ from RAM
• dplace used on Nautilus to place jobs
– E.g., for 256 cores, 4 tarballs loaded
– Elaborate PBS script
– Takes a long time to place individual jobs for 512 or more cores
• Frost and Titan
– MPI program
– Asynchronous node level barriers mitigate metadata server requests
– 1 tarball per node; each node has 16 cores, so 4 iterations
• After a block has been run, compress to disk
• Iterate
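The packing step and the "4 iterations" figure can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: the file names and contents are hypothetical, and the tarballs are built in memory only to keep the example self-contained, whereas the workflow above stages them from Lustre into a RAM disk.

```python
import io
import tarfile


def chunk(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def pack_inputs(named_inputs, files_per_tarball=64):
    """Pack (name, bytes) pairs into gzip tarballs, 64 inputs each by default."""
    tarballs = []
    for group in chunk(named_inputs, files_per_tarball):
        buf = io.BytesIO()
        with tarfile.open(fileobj=buf, mode="w:gz") as tar:
            for name, data in group:
                info = tarfile.TarInfo(name=name)
                info.size = len(data)
                tar.addfile(info, io.BytesIO(data))
        tarballs.append(buf.getvalue())
    return tarballs


def iterations_per_node(files_per_tarball=64, cores_per_node=16):
    """Rounds one node needs to work through one tarball (ceiling division)."""
    return -(-files_per_tarball // cores_per_node)


# Hypothetical inputs: 256 tiny placeholder files standing in for 700 to 950 KB models.
inputs = [(f"sim_{i:04d}.idf", b"Version,7.0;\n") for i in range(256)]
tarballs = pack_inputs(inputs)
print(len(tarballs), iterations_per_node())
```

With 256 inputs this yields 4 tarballs, matching the 256-core Nautilus example, and a 16-core node needs 4 rounds per 64-file tarball.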
Shared memory Nautilus
Distributed memory Titan
Data management
• Data generated
– 45 TB in 68 mins for ½ million E+ runs
– At least 270 TB raw
• Data storage
– Compressed, around 70 TB
– Lustre is scratch space (14 days)
– Need to move this data before it is scratched
– Many database technologies explored
• Data transfer
– Speed of generation is faster than you can pump out!
– Firewalls complicate things
• Data analysis
– Move computation to data
– Stitch them together
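A back-of-the-envelope calculation shows why generation outpaces transfer. The sketch below uses the figures quoted above (45 TB in 68 minutes for half a million runs, 270 TB raw); the decimal unit convention and the 10 Gbit/s link speed are assumptions chosen only for illustration.

```python
TB = 10**12  # decimal terabyte in bytes (assumed convention)


def generation_rate_gb_per_s(total_tb, minutes):
    """Average output rate in GB/s for a run that writes total_tb in the given minutes."""
    return total_tb * TB / (minutes * 60) / 10**9


def transfer_hours(total_tb, link_gbit_per_s):
    """Hours needed to move total_tb over a link of the given speed, ignoring overheads."""
    return total_tb * TB * 8 / (link_gbit_per_s * 10**9) / 3600


rate = generation_rate_gb_per_s(45, 68)      # about 11 GB/s
per_run_mb = 45 * TB / 500_000 / 10**6       # average output per simulation, in MB
hours_270 = transfer_hours(270, 10)          # hypothetical 10 Gbit/s link
print(round(rate, 2), per_run_mb, hours_270)
```

Under these assumptions the simulations write roughly 11 GB/s, while the hypothetical link would need about 60 hours to drain 270 TB.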
Autotune software-as-a-service model
Key tricks that helped us
• Determining the RAM based filesystem on these machines
– Poor documentation
• Ratio of number of cores, size of simulation, and available RAM
– Appropriately fit the task in RAM, with enough RAM left for the application heap
• Mitigating Lustre IO
• Asynchronous elements in bulk-synchronous processing
– All cores do not hit the filesystem at the same time
– Compile static
• Streamlining the workflow
– E+ invokes a number of programs and has a script that performs copious amounts of redundant IO
– Remove unneeded calls in the individual simulation workflow
• Managing shifting bottlenecks
• Think in parallel, even to list files on a drive!
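Two of these tricks lend themselves to a small sketch: fitting simultaneous runs into node RAM, and spreading filesystem access over time. The per-simulation sizes, the reserve, and the stagger window below are hypothetical values, and the evenly spaced delays are a simplified stand-in for the asynchronous node-level barriers mentioned in the workflow, not the authors' MPI implementation.

```python
def max_concurrent_sims(ram_gb, cores, sim_files_gb, sim_heap_gb, reserve_gb=0.0):
    """Largest number of simultaneous runs whose RAM-disk files and heap both fit, capped at cores."""
    usable = ram_gb - reserve_gb
    per_sim = sim_files_gb + sim_heap_gb
    if usable <= 0 or per_sim <= 0:
        return 0
    return min(cores, int(usable // per_sim))


def stagger_offsets(num_workers, window_s):
    """Evenly spaced start delays so workers do not hit the filesystem at the same instant."""
    if num_workers <= 0:
        return []
    return [window_s * i / num_workers for i in range(num_workers)]


# 32 GB and 16 cores per node come from the slides; the other numbers are assumptions.
light = max_concurrent_sims(ram_gb=32, cores=16, sim_files_gb=0.5, sim_heap_gb=1.0, reserve_gb=4.0)
heavy = max_concurrent_sims(ram_gb=32, cores=16, sim_files_gb=1.0, sim_heap_gb=2.0, reserve_gb=4.0)
delays = stagger_offsets(4, 2.0)
print(light, heavy, delays)
```

When the per-simulation footprint grows, the RAM budget rather than the core count becomes the limit, which is one way a bottleneck can shift.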
http://autotune.roofcalc.com
Machine Learning on Supercomputers
One year of 15-min data, 144 sensors/house
• Support Vector Machines
• Genetic Algorithms
• FF/Recurrent Neural Networks
• (Non-)Linear Regression
• Self-Organizing Maps
• C/K-Means
• Ensemble Learning
[Figure: Nautilus Supercomputer]
Acknowledgment: UTK computer science Ph.D. candidate Richard Edwards; student of Dr. Lynne Parker
Real demonstration facilities
ZEBRAlliance homes
2800 ft² residence
269 sensors @ 15-minutes
50-60% energy savers
5M simulations of E+ model!
Heavily instrumented and equipped with occupancy simulation:
• Temperature
• Plugs
• Lights
• Range
• Washer
• Radiated heat
• Dryer
• Refrigerator
• Dishwasher
• Heat pump air flow
• Shower water flow
Large Data
156 inputs (permutes *.idf)
!-Generator IDFEditor 1.41
!-Option SortedOrder ViewInIPunits
!-NOTE: All comments with '!-' are ignored by the IDFEditor and are generated automatically.
!-      Use '!' comments if they need to be retained when using the IDFEditor.

!-   ===========  ALL OBJECTS IN CLASS: VERSION ===========

Version,
    7.0;                     !- Version Identifier

!-   ===========  ALL OBJECTS IN CLASS: SIMULATIONCONTROL ===========

SimulationControl,
    No,                      !- Do Zone Sizing Calculation
    No,                      !- Do System Sizing Calculation
    No,                      !- Do Plant Sizing Calculation
    No,                      !- Run Simulation for Sizing Periods
    Yes;                     !- Run Simulation for Weather File Run Periods

!-   ===========  ALL OBJECTS IN CLASS: BUILDING ===========

Building,
    ZEBRAlliance House number 1 SIP House, !- Name
    -37,                     !- North Axis {deg}
    Suburbs,                 !- Terrain
    0.04,                    !- Loads Convergence Tolerance Value
    0.4,                     !- Temperature Convergence Tolerance Value {deltaC}
    FullExteriorWithReflections, !- Solar Distribution
    25,                      !- Maximum Number of Warmup Days
    6;                       !- Minimum Number of Warmup Days

!-   ===========  ALL OBJECTS IN CLASS: SITE:LOCATION ===========

Site:Location,
    Oak Ridge,               !- Name
    35.96,                   !- Latitude {deg}
    -84.29,                  !- Longitude {deg}
    -5,                      !- Time Zone {hr}
82 outputs @ 15m (*.csv)
Large Data
• 8M sims * 7.24m = 110 compute-years (cloud=$77,226)
– “Free” supercomputers and desktop utility for multiple runs+upload
• 8M sims * 35MB = 267 TB database (cloud=$512,237/month)
– Cost-effective hardware (1 time, ~$28k)
• Database engines: MyISAM load data 0.71s vs. InnoDB 2.3s
– Others: NoSQL/key-value pair, column-store, compression ratios
• Database partitioned by month, views span tables
• Software stack for analysis
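The compute and storage totals above can be reproduced with a few lines of arithmetic. This is a minimal check, assuming a 365-day year; it also suggests that the 267 TB figure corresponds to a binary conversion (1 TB = 1024 × 1024 MB), since the decimal conversion gives 280 TB.

```python
SIMS = 8_000_000


def compute_years(sims, minutes_per_sim):
    """Total sequential runtime in years, assuming a 365-day year."""
    return sims * minutes_per_sim / 60 / 24 / 365


def storage_tb(sims, mb_per_sim, binary=True):
    """Total size in TB, using 1024^2 MB per TB if binary, else 1000^2."""
    factor = 1024**2 if binary else 1000**2
    return sims * mb_per_sim / factor


years = compute_years(SIMS, 7.24)
tb_binary = storage_tb(SIMS, 35)
tb_decimal = storage_tb(SIMS, 35, binary=False)
print(round(years), round(tb_binary), tb_decimal)
```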
Making ORNL Data Available
[Figure: Computing Resources; E+ Input Model; E+ Simulations; Jaguar Supercomputer; Nautilus; Web Server (PowerEdge R510); Data Mining (96 ~ HP rx2600)]
Automated process to run millions of simulations and host publicly online
Genetic Algorithms
#1 problem with E+ is simulation speed
Use AI to approximate E+
Exact solution if in database (~milliseconds)
Approx. solution (seconds)
Exact solution (5-10 mins)
Dual buffer, Genetic Algorithm Island model for evolving tuned model
[Figure: E+ Input Model; slow buffer/island; fast buffer/island; multi-objective fitness evaluation]
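The three answer tiers (database lookup, approximation, full simulation) can be sketched as a simple fallback chain. This is an illustration only: the class name, the dictionary used as the database, and the toy surrogate and simulator functions are all hypothetical stand-ins, and the genetic algorithm island model itself is not shown.

```python
from typing import Callable, Dict, Optional, Tuple

Params = Tuple[float, ...]


class TieredEvaluator:
    """Answer from the database if possible, else from a surrogate, else run the simulation."""

    def __init__(self, database: Dict[Params, float],
                 surrogate: Optional[Callable[[Params], float]],
                 simulate: Callable[[Params], float]):
        self.database = database
        self.surrogate = surrogate
        self.simulate = simulate

    def evaluate(self, params: Params, allow_approx: bool = True) -> Tuple[float, str]:
        if params in self.database:                 # exact and fast
            return self.database[params], "database"
        if allow_approx and self.surrogate is not None:
            return self.surrogate(params), "surrogate"  # approximate
        result = self.simulate(params)              # exact but slow
        self.database[params] = result              # cache for later lookups
        return result, "simulation"


def toy_simulate(params: Params) -> float:
    """Hypothetical stand-in for a full E+ run."""
    return sum(x * x for x in params)


def toy_surrogate(params: Params) -> float:
    """Hypothetical stand-in for a trained approximation."""
    return sum(abs(x) for x in params)


evaluator = TieredEvaluator({(1.0, 2.0): 5.0}, toy_surrogate, toy_simulate)
print(evaluator.evaluate((1.0, 2.0)), evaluator.evaluate((3.0, 4.0)))
```

Caching the simulated result means a later request for the same parameters is served from the first tier.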
Data
• Plan to make available in FY13
• Will run on desktop machine (overnight testing, stop on demand)
• I+O = 8M*156 + 8M*35,040*96 = 26.9 trillion data points (eventually)
TOTAL COST = 4.3 * 10^-16 cent
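The data-point total can be checked directly. In the sketch below, 35,040 is the number of 15-minute steps in a 365-day year, and the 96 outputs per step are taken from the formula above as-is.

```python
INTERVALS_PER_YEAR = 365 * 24 * 4  # 15-minute steps in a 365-day year


def data_points(sims, inputs_per_sim, steps, outputs_per_step):
    """Inputs plus time-series outputs summed over all simulations."""
    return sims * inputs_per_sim + sims * steps * outputs_per_step


total = data_points(8_000_000, 156, INTERVALS_PER_YEAR, 96)
print(INTERVALS_PER_YEAR, total, round(total / 1e12, 1))
```

The outputs dominate: the inputs contribute about 1.2 billion values, against roughly 26.9 trillion output values.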
http://autotune.roofcalc.com
Acknowledgement: This research used resources of the AutotuneDB at the Oak Ridge National Laboratory, which was supported by the Office of Science of the U.S. Department of Energy.
Disclaimer: No service-level performance or availability guarantees implied
BTRIC 2011 accomplishments
• Support for Weatherization and Intergovernmental Program (WIP) grows
– Develop plan for new multi-family building audit
– Make existing single-family and mobile home audits web-based
– Continue the retrospective national evaluation of the Weatherization Assistance Program (WAP)
– Initiate national evaluation of the State Energy Program (SEP) and the Energy Efficiency Block Grant Program (EEBGP)
– Complete the planning for the national evaluation of ARRA Weatherization
– Aided in the weatherization of 600,000 homes three months ahead of schedule
ORNL staff and subcontractors have been supporting the expenditure of over $10B in ARRA funds in the WIP portfolio
Science to transform today's buildings into smart, responsive, and efficient structures
[Figure: Experimental S&T Capabilities; Modeling and Visualization R&D; Better Buildings via Novel Tools and Technologies; Building Science; Materials Science; Computational Science; Neutron Science; Web-Based Tools; Automated Model Calibration; Industry CRADAs; Innovative Products; Sensors, Controls, Grid; Next Generation Commercial Buildings; Next Generation Residential Buildings; Data/Knowledge]
4th Paradigm
• Empirical – guided by experiment/observation
– In use thousands of years ago, natural phenomena
• Theoretical – based on coherent group of principles and theorems
– In use hundreds of years ago, generalizations
• Computational – simulating complex phenomena
– In use for decades
• Data exploration (eScience) – unifies all 3
– Data capture, curation, storage, analysis, and visualization
4th Paradigm
Johannes Kepler
3 laws of planetary motion:
• Elliptical orbit (based on location of Mars)
• Planets sweep out equal areas in equal times
• The squares of the periodic times are to each other as the cubes of the mean distances
4th Paradigm
• #3 - Computer simulation
4th Paradigm
• #4 - Visualization and Analysis
4th Paradigm
Visual Analytics (AI)
• Sensor-based Energy Modeling