2016 Winter Conference
Orlando, Florida

Technical Paper Session 11
Strategies to Improve Building Models and Operation
Paper #5 - Suitability of ASHRAE Guideline 14 Metrics for Calibration

Joshua New, Ph.D.
Oak Ridge National Laboratory
newjr@ornl.gov
865-241-8783

Learning Objectives
Objective 1 - Describe the current state of testing for building model calibration.
Objective 2 - Explain the major components of the Trinity testing framework.

ASHRAE is a Registered Provider with The American Institute of Architects Continuing Education Systems. Credit earned on completion of this program will be reported to ASHRAE Records for AIA members. Certificates of Completion for non-AIA members are available on request.

This program is registered with the AIA/ASHRAE for continuing professional education. As such, it does not include content that may be deemed or construed to be an approval or endorsement by the AIA of any material of construction or any method or manner of handling, using, distributing, or dealing in any material or product. Questions related to specific materials, methods, and services will be addressed at the conclusion of this presentation.

Acknowledgments
• Aaron Garrett – JSU
• Amir Roth – DOE BTO
• Zheng O’Neill – UA

Outline/Agenda
• Publication edits
• Context
• Trinity Test
  – Limitations
  – Web service implementation
• Purpose of this paper
• Results of a large Calibration study

Publication Edits
• Trinity test – implementation of BESTEST-EX method to evaluate calibration, whether manual or automatic
• ASHRAE Guideline 14 definition (b) of “calibration”
  – “process of reducing the uncertainty of a model by comparing the predicted output of the model under a specific set of conditions to the actual measured data for the same set of conditions.”
• We use the word “calibration” herein whether it is to actual measured data or to simulation output (as a surrogate to measured data)

Autotune
Automatic calibration of models to data
E+ Input Model . . .
Autotune Performance
DOE Office of Science; DOE-EERE: BTO; Industry and building owners

Results
                       Monthly utility data     Hourly utility data
                       CV(RMSE)    NMBE         CV(RMSE)    NMBE
ASHRAE G14 Requires    15%         5%           30%         10%
Autotune Results       1.20%       0.35%        3.65%       0.35%
Results of 20,000+ Autotune calibrations (15 types, 47-282 tuned inputs each)

Other error metrics
• Residential home: within 30¢/day (actual use $4.97/day)
• Tuned input avg. error: Hourly – 8%, Monthly – 15% (3 bldgs, 8-79 inputs)

Features
• Calibrate any model to data
• Calibrates to the data you have (monthly utility bills to submetering)
• Runs on a laptop and in the cloud

High Performance Computing
• Different calibration algorithms
• Machine learning – big data mining
• Large-scale calibration tests
• 30+ Publications: http://bit.ly/autotune_papers
• Open source (GitHub): http://bit.ly/autotune_code

Leveraging HPC resources to calibrate models for optimized building efficiency decisions

Trinity Test – what is it?
• “True” model – defined by the user for a specific test case; the answer key used to quantify accuracy of the calibrated model
• Calibration (edits) – simulation output as a surrogate for measured data

Advantages
• Reproducibility!
  – No specific, unique buildings of interest
  – No faulty or unshared data used for calibration
  – No variation in definitions or metrics
  – No sole focus on simulation output
• Proliferation in calibration literature
  – Necessarily unique
  – Largely irreplicable
  – Essentially incomparable

Limitations
• Cleanroom approach which has removed all real-world “noise” from the calibration process
  – No: sensor drift, missing data, utility data measured at different times, unaccounted-for occupancy/behavior changes, model/form uncertainty (but can allow study)
• Allows use of any weather file (TMY)
  – For real-world application, you need AMY data
• No mapping of simulation output to measured data
  – Temperature gradients: what point is “Temp.
of N wall?”
• No sensor placement/material issues
• Test results equally weight all inputs/outputs, even though some matter more than others

[Figure: Trinity test website/service data flow. Labels: IDF + CSV, XML, EPW, CSV, Website/service, Results. Example exchange: “Thickness of metal siding?” Calibrator: Between 0 and 0.5 and less than 1-B. Oracle: 0.055.]
Website: http://bit.ly/trinity_test

Results
Metric                                            Value
Input error average                               24.38
Input error maximum                               66.12
Input error minimum                               0.09
Input error variance                              228.53
CV(RMSE)
  CH4:Facility [kg](Monthly)                      9.95
  CO2:Facility [kg](Monthly)                      15.42
  CO:Facility [kg](Monthly)                       20.40
  Carbon Equivalent:Facility [kg](Monthly)        14.42
  Cooling:Electricity [J](Hourly)                 1577.96
  Electricity:Facility [J](Hourly)                10.48
  …
NMBE
  CH4:Facility [kg](Monthly)                      -9.57
  CO2:Facility [kg](Monthly)                      -14.78
  CO:Facility [kg](Monthly)                       -19.52
  Carbon Equivalent:Facility [kg](Monthly)        -13.83
  Cooling:Electricity [J](Hourly)                 592.77
  Electricity:Facility [J](Hourly)                -9.52
  Electricity:Facility [J](Monthly)               -9.52
143+ outputs
Callouts: CV(RMSE)<30%; NMBE<10%; Exceeds G14!!!

Purpose of this Study
• Are CV(RMSE) and NMBE the best metrics to use for calibration?
• What about no-CV:
• What about Mean Absolute Percent Error?
• What about (non-normalized) Mean Bias Error?
• What about Percent Absolute Error?
…maybe calibration using another metric would allow a calibration algorithm to reach lower input-side error (i.e. recover the “real” model of the building)

20,000 Building Calibration Study
Building              #Inputs    #Groups
Restaurant            49         49
Hospital              227        139
Large Hotel           110        67
Large Office          85         43
Secondary School      231        122
Stand-alone Retail    282        131
Small Hotel           72         58
Small Office          59         55
Medium Office         81         36
Strip Mall            113        85
Midrise Apartment     155        78
Super Market          78         72
Primary School        166        109
Quick Service         54         54
Warehouse             47         44
TOTAL                 1809       1142

Results
For the Strip Mall: If you use MAPE to minimize error to “measured” data, then you’ll have the closest building match in terms of CV(RMSE)
Is there anything better?
Output Variable                                    Number of Buildings
5    InteriorEquipment:Electricity [J](Hourly)     10
7    InteriorLights:Electricity [J](Hourly)        9

• Correlation to other properties showed that in most buildings (out of 15), calibrating to electrical usage of interior equipment and lights yielded better calibration results than any other building properties
• ASHRAE G14 extension to allow a tier-2 calibration using (increasingly feasible) submetering requirements would allow more accurate and useful models from calibration

Conclusions
• Trinity test allows replicable calibration studies and quantifies calibration performance
• An unsupported website and web service: http://bit.ly/trinity_test
• A calibration study was conducted
  – 20,000 calibrations, 15 DOE commercial buildings, each with 36-139 calibrated groups
• CV(RMSE) and NMBE are as good as any of the proposed alternatives…which is to say, BAD.
  – Calibration to important subsets is proposed

Bibliography
• Energy. 2015. Scalable tuning of building energy models to hourly data. Energy 84, 493-502.
• ASHRAE. 2013. Evolutionary tuning of building models to monthly electrical consumption. Transactions 119(2).
• Energy and Buildings. 2012. Evaluation of weather data for building energy simulations. ENB 49(0), 109-18.
• IBPSA. 2012. Autotune E+ Building Models. IBPSA 37.
• NREL. 2011. Building energy simulation test for existing homes (BESTEST-EX). NREL/TP-5500-52414.
• ASHRAE. 2002. ASHRAE Guideline 14, Measurement of energy and demand savings.

Questions?
Joshua New
newjr@ornl.gov
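
Backup: sketch of the output-side metrics
The output-side metrics discussed in this talk (CV(RMSE), NMBE, and MAPE) can be sketched in a few lines of Python. This is a minimal illustration, not the Autotune or Trinity test code. It assumes the commonly used forms with an (n - p) denominator and p = 1, and a "measured minus simulated" sign convention; the function names and the sample data are hypothetical. The limits are the ASHRAE G14 values quoted on the Autotune Performance slide (monthly 15% / 5%, hourly 30% / 10%).

```python
import math

# ASHRAE G14 limits as quoted in this talk: (CV(RMSE) %, |NMBE| %)
G14_LIMITS = {"monthly": (15.0, 5.0), "hourly": (30.0, 10.0)}


def nmbe(measured, simulated, p=1):
    """Normalized mean bias error in percent (assumed form, p = 1)."""
    n = len(measured)
    mean = sum(measured) / n
    bias = sum(m - s for m, s in zip(measured, simulated))
    return 100.0 * bias / ((n - p) * mean)


def cv_rmse(measured, simulated, p=1):
    """Coefficient of variation of the RMSE in percent (assumed form, p = 1)."""
    n = len(measured)
    mean = sum(measured) / n
    sse = sum((m - s) ** 2 for m, s in zip(measured, simulated))
    return 100.0 * math.sqrt(sse / (n - p)) / mean


def mape(measured, simulated):
    """Mean absolute percent error; measured values must be non-zero."""
    n = len(measured)
    return 100.0 / n * sum(abs((m - s) / m) for m, s in zip(measured, simulated))


def meets_g14(measured, simulated, interval="monthly"):
    """True if both CV(RMSE) and |NMBE| are under the quoted G14 limits."""
    cv_limit, nmbe_limit = G14_LIMITS[interval]
    return (cv_rmse(measured, simulated) < cv_limit
            and abs(nmbe(measured, simulated)) < nmbe_limit)


if __name__ == "__main__":
    # Hypothetical data: "measured" values and a simulation that alternates
    # between over- and under-prediction, so the bias cancels out.
    measured = [100.0, 100.0, 100.0, 100.0]
    simulated = [90.0, 110.0, 90.0, 110.0]
    print("NMBE %:", nmbe(measured, simulated))
    print("CV(RMSE) %:", cv_rmse(measured, simulated))
    print("MAPE %:", mape(measured, simulated))
    print("Meets monthly G14:", meets_g14(measured, simulated, "monthly"))
```

Note that all three metrics look only at simulation output. A model can pass meets_g14 while its inputs are still far from the “true” model, which is the gap the Trinity test input-error metrics are meant to expose.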