Deriving Archetype Templates for Urban Building Energy Models Based on Measured Monthly Energy Use

by

Julia A. Sokol

B.A., Mechanical Engineering, Harvard University, 2010

Submitted to the Department of Mechanical Engineering in partial fulfillment of the requirements for the degree of Master of Science in Mechanical Engineering at the Massachusetts Institute of Technology

June 2015

© Massachusetts Institute of Technology, 2015. All rights reserved.

Signature of Author: [signature redacted], Department of Mechanical Engineering, May 27, 2015
Certified by: [signature redacted], Christoph Reinhart, Associate Professor, Thesis Supervisor
Certified by: [signature redacted], Leslie Norford, Professor, Thesis Reader
Accepted by: David Hardt, Professor of Mechanical Engineering, Chairman, Department Committee on Graduate Theses

Deriving Archetype Templates for Urban Building Energy Models Based on Measured Monthly Energy Use

by Julia A. Sokol

Submitted to the Department of Mechanical Engineering on May 27, 2015 in partial fulfillment of the requirements for the degree of Master of Science in Mechanical Engineering.

Abstract

Interest in urban energy modeling has grown among planners and policy-makers as more and more municipalities set targets for reduction of greenhouse gas emissions. Urban-scale building energy models can help evaluate the efficiency of proposed district designs, consequences of building retrofit interventions, or energy supply options. Bottom-up models based on physical descriptions and engineering calculations are the most versatile for modeling scenarios and evaluating results at high spatial and temporal resolutions.
Such urban building energy models (UBEMs) are typically created by grouping buildings with similar properties into archetypes, which standardize many properties that are not uniform in reality, such as occupancy-driven parameters. Since most UBEMs are validated using aggregated, annual measured data, this standardization is usually adequate; however, for a more accurate model that considers end-use differentiation or seasonal variation, neither this standardization nor this validation method is sufficient. This work proposes a new methodology for archetype definition and customization using metered monthly energy data. Customization is done by inferring certain parameters from the energy data and estimating others probabilistically from parametric analysis. The methodology is developed and tested on a case study of 453 low-rise residential buildings in Cambridge, Massachusetts. Four model iterations are compared: a single template, eight archetype templates, eight archetypes with individual building customization, and the latter with the addition of parametric analysis and generation of frequency distributions for unknown parameters. The results show an improvement in mean goodness of fit from 46% with one template and 37% with eight templates to 18% for the final iteration. The distribution of energy use intensities, as well as the monthly electricity and gas profiles, approach the observed values more closely with each iteration. The results also demonstrate that error metrics based on aggregated annual consumption, commonly used for urban model validation, are not necessarily representative of the model's fit on a monthly basis.

Thesis Supervisor: Christoph Reinhart
Title: Associate Professor of Building Technology

Acknowledgements

I would like to express my deepest gratitude to my research advisor Christoph Reinhart for his guidance, patience, kindness, and inspiration.
I am also grateful to Professors Les Norford and Leon Glicksman, the rest of the Building Technology Faculty, and Kathleen Ross, for their help and advice, and for creating a supportive BT community. This research is part of a larger series of urban modeling projects in the Sustainable Design Lab and was performed in close collaboration with PhD candidate Carlos Cerezo Davila, whose insight and experience were invaluable. To all the talented Building Technology students who have shared their time and knowledge, both in lab and outside of it (especially Aiko, Cody, David, Irmak, Jeff, Leo, Madeline, Manos, Tarek, and Timur): you have made the past two years memorable. Thank you to all my other friends at MIT and farther away (especially Katie and the rest of the EBM lab) and to my dear roommate Polina for their love and encouragement. Thanks also to Meder, through whom I got to know MIT long before I could ever imagine being a student here, and to Anton, whose spirit is still here in Cambridge. Lastly, deepest of thanks to my dad, sister, and brother-in-law for their endless support. And to Cal, without whom I wouldn't be here, for all the magic.

Contents

Abstract
Acknowledgements
Contents
List of Figures
List of Tables

1 Introduction
  1.1 Motivation
  1.2 Research Question
  1.3 Thesis Goals and Outline

2 Background
  2.1 Energy Modeling of Existing Buildings
  2.2 Calibration of Building Energy Models
    2.2.1 Calibration Assessment
    2.2.2 Calibration Methods
  2.3 Urban Building Energy Modeling
    2.3.1 Archetype Definitions
    2.3.2 User Behavior Modeling
    2.3.3 Model Validation

3 Proposed Methodology for Urban Building Energy Modeling
  3.1 Data Collection
    3.1.1 Weather Data
    3.1.2 Energy Data
  3.2 Building Data
    3.2.1 Geometric Properties
    3.2.2 Non-Geometric Properties
  3.3 Template Generation
  3.4 Template Customization
    3.4.1 Inference of Parameters
    3.4.2 Probabilistic Estimation of Parameters
  3.5 Model Execution
  3.6 Model Validation

4 Application of Methodology to Cambridge, MA Case Study
  4.1 Data Collection
    4.1.1 Weather Data
    4.1.2 Energy Data
  4.2 Building Data
    4.2.1 Geometric Properties
    4.2.2 Non-Geometric Properties
  4.3 Template Generation
    4.3.1 Constructions
    4.3.2 Internal Loads and DHW
    4.3.3 Schedules
  4.4 Template Customization
    4.4.1 Inference of Parameters
    4.4.2 Probabilistic Estimation of Parameters
  4.5 Model Iterations

5 Results for Cambridge, MA Case Study
  5.1 Error Metrics
  5.2 Annual and Monthly Simulation Results
    5.2.1 Monthly Comparison
    5.2.2 EUI Distribution Comparison
  5.3 Parameter Distributions
    5.3.1 Inferred Parameters
    5.3.2 Probabilistically-Estimated Parameters
  5.4 Results Summary

6 Discussion and Conclusion
  6.1 Discussion
    6.1.1 Parameter Uncertainty Reduction
    6.1.2 Generation of Improved Archetypes
    6.1.3 Evaluating Consequences of Data Availability
  6.2 Limitations
    6.2.1 Geometric Limitations
    6.2.2 Modeling Simplifications
    6.2.3 Limitations of Results
  6.3 Future Work
    6.3.1 Validation
    6.3.2 Methodology Refinement
      6.3.2.1 Sensitivity Analysis
      6.3.2.2 Hourly Energy Data
    6.3.3 Automation
  6.4 Conclusion

A Energy Model Templates
  A.1 Constructions
  A.2 Schedules

Bibliography

List of Figures

1.1 U.S. energy flows in quadrillion Btu, 2014. [1]
1.2 Historical energy consumption by end-use sector, in quadrillion Btu. [1]
1.3 Monthly energy consumption by end-use sector, 2012-2014, in quadrillion Btu. [1]
2.1 Techniques used for estimating regional or national residential energy consumption. [2]
3.1 Example of residential monthly energy use in Cambridge, MA.
4.1 Tax parcels of the City of Cambridge.
4.2 Histogram by total 2008 EUI, shaded by construction period (pre-1945, 1946-1980, post-1980).
4.3 Histograms by 2008 EUI, separated into gas and electric use intensities.
4.4 Monthly energy use intensities for low-rise residential buildings in Cambridge.
4.5 Building geometry generation process.
4.6 Energy use shaded by age category of building.
5.1 Rendering of the Cambridgeport 3D model.
5.2 Annual results for baseline run with a single template assigned to all buildings.
5.3 Annual results for Run 1 with initial templates generated from annual data.
5.4 Annual results for templates customized to each building based on parameters inferred from monthly data.
5.5 Annual results for lowest-error parametric runs with variable occupancy and setpoints.
5.6 Measured monthly gas and electric use intensity.
5.7 Monthly simulation results for a subset of 200 buildings.
5.8 Measured (shaded grey) and simulated (shaded green) EUI distributions for neighborhood, in W/m²: Run 0 (top left), Run 1 (top right), Run 2 (bottom left), Run 3 (bottom right).
5.9 Distribution of peak electric load intensities.
5.10 Distribution of peak domestic hot water flow.
5.11 Distributions of AC use (left) and heating system efficiencies (right).
5.12 Distribution of occupancy density per square meter for GOF < 10.
5.13 Cooling (left) and heating (right) setpoint distributions for GOF < 10.
5.14 Goodness of fit by individual building for every run. (Note that the vertical axis is truncated, so the highest error points are not displayed.)
6.1 3D models of Cambridgeport with and without LiDAR data.

List of Tables

2.1 Error limits for whole building calibrated simulation. [3, 4]
4.1 Summary of Cambridge residential dataset for 3,395 buildings.
4.2 Linear regression results.
4.3 Heating system efficiencies.
4.4 Parametric analysis settings.
4.5 Energy model iterations.
5.1 Summary of validation results by run, time period considered, and aggregation level.
Chapter 1

Introduction

1.1 Motivation

In 2014, the residential and commercial sectors were responsible for 40.5% of the total energy consumed in the United States, adding up to 11,685 terawatt-hours or 39.9 × 10^15 Btu (Figure 1.1) [1]. The energy consumption of these two sectors can be ascribed almost entirely to buildings: the electricity and fuel used for electrical appliances, lighting, and space conditioning of all private living quarters and of business and institutional facilities. (The commercial sector also includes energy consumed by street lighting and sewage treatment facilities, but their contribution is comparatively small [1].) This is not just a U.S. phenomenon: globally, residential and commercial buildings are responsible for 30-40% of final energy consumption and about a third of the world's greenhouse gas emissions. Building-related consumption continues to grow as developing countries increase in population and standard of living.

Fortunately, much of this consumption is avoidable through the implementation of appropriate energy efficiency measures. For developing and urbanizing areas, this entails designing sustainable new neighborhoods with high energy efficiency standards. For developed countries with low building stock growth, where the majority of existing buildings were constructed before building energy codes went into effect in the 1970s, policy makers need to implement smart strategies for building retrofits.

In response to these challenges, state and municipal governments in the United States and around the world have started instituting targets for reduction of greenhouse gas emissions from the buildings sector. Since the first step in managing progress is measuring baseline performance, these targets have led to the establishment of the first energy disclosure laws. These laws require all buildings above a certain floor area to submit annual records of their energy consumption to city governments.
This information is then used to benchmark the performance of buildings with the same function and to identify areas for improvement. Some cities have combined energy disclosure laws with additional requirements for conducting building energy audits and retro-commissioning at certain time intervals.

FIGURE 1.1: U.S. energy flows in quadrillion Btu, 2014. [1]

However, so far only fourteen cities and two states in the U.S. have energy disclosure policies in place (see footnote 1). Most of them apply only to commercial buildings (some include multifamily residential), and half require this information only from buildings with floor area above 50,000 square feet (4,645 m²) [5]. Consequently, these regulations exclude the majority of low-rise residential structures. Individually such homes might have insignificant energy demands compared to those of large commercial buildings; yet, taken together, homes with one to four residential units comprise 200 billion square feet, compared to 87 billion square feet for all commercial buildings [6, 7]. Historically, the residential sector has consistently consumed more energy than the commercial sector (Figure 1.2). In recent years, monthly energy use in the residential sector at peak heating periods has even reached that of the industrial sector (Figure 1.3).

Footnote 1: Cities: Seattle, Portland, Berkeley, San Francisco, Santa Fe, Austin, Minneapolis, Chicago, Atlanta, Washington, D.C., Philadelphia, New York, Boston and Cambridge. States: California and Washington.

Within the residential sector, low-rise buildings (i.e., four units and under) make up 89% of total residential floor area nationally [7]. This percentage varies by municipality depending on the population density of the city. Even in a city as dense as New York, however, 1-4 family residences make up around 41% of the city's total residential floor area [8]. Therefore, it would be valuable to understand and model this significant component of the building stock in greater detail than has been done so far.

FIGURE 1.2: Historical energy consumption by end-use sector (industrial, transportation, residential, commercial), 1949-2014, in quadrillion Btu. [1]

FIGURE 1.3: Monthly energy consumption by end-use sector (transportation, industrial, residential, commercial), 2012-2014, in quadrillion Btu. [1]

On top of its large contribution to global energy use, the residential sector presents a series of challenges for energy monitoring and regulation. First, it is set apart from commercial and industrial facilities by highly decentralized ownership and by the disaggregation of energy use among tenants within a building. Utility companies supplying residential customers are the only source, besides the tenants themselves, that can provide information on their customers' energy use; however, privacy concerns generally inhibit utilities from sharing information that links the energy use of an account to a specific address (see footnote 2). Utility companies are more willing to provide data aggregated by block or zip code, and this has recently been used by some researchers to create block-level maps of energy consumption [8, 10, 11]. While energy maps at this scale might be useful for urban planning, they are not particularly helpful for identifying underperforming buildings or informing owners which retrofits to pursue.

Footnote 2: Some progress is being made on this front: more and more utilities are participating in the Green Button program [9], which allows customers to view and download their own historical usage data in a consistent format and to share it with third-party applications on an opt-in basis. However, the effectiveness of this practice depends entirely on tenants' desire to participate.

Furthermore, while many commercial buildings have the capability to track their energy use on an hourly or sub-hourly basis through submetering, residential energy tracking lags behind, since submetering at that scale is considered cost-prohibitive. In addition, commercial buildings can exert a high degree of control over the operation of equipment and lighting through Building Automation Systems (BAS), so the building manager can relatively easily make adjustments affecting anything from a specific room to the entire facility. In residential buildings, equipment and lighting controls are generally much more primitive and more heavily influenced by user behavior, which varies greatly with the nature of the occupants and is very difficult to track.

1.2 Research Question

Urban-scale building modeling has emerged as a way to understand current trends, and predict future ones, in energy consumption and greenhouse gas emissions in cities. Various approaches have been taken to formulate such models, from top-down statistical analyses to bottom-up engineering models built from physical building descriptions. While top-down models are useful for describing existing conditions, their predictive ability is limited to very small variations from the status quo. The bottom-up engineering model approach is the most versatile for modeling various scenarios and evaluating results at high spatial and temporal resolution.
Such urban building energy models (UBEMs) can be used by urban planners and policy-makers to evaluate the energy efficiency of a proposed neighborhood, assess impacts of potential retrofits on an existing district, or compare possible energy supply alternatives.

Due to the difficulty of gathering information on large quantities of buildings, bottom-up UBEMs are typically generated by grouping buildings with similar properties and defining detailed inputs for just one characteristic building, or archetype, per group. These archetype properties are then combined with some representation of the buildings' geometries. Thermal simulations with location-specific weather conditions are then performed, either for each building (if their geometric shapes are defined individually) or for each archetype building, with subsequent scaling up by the floor area of all buildings within the type. If the neighborhood already exists and historical energy data is available, the outputs of the energy model are then validated, typically by comparing to the neighborhood's annual energy consumption.

This general method has been followed by many researchers with some variation within each step. However, it has limitations when the model is intended for any subsequent analysis requiring more detail than annual, aggregated energy demand. Neither this method of defining building properties nor this validation procedure is sufficient to ensure a model representative of actual seasonal energy variations or end-use proportions. In particular, three main limitations in urban building energy modeling have been identified and are addressed in this thesis:

1. Lack of a systematic methodology for defining building archetypes for urban energy models: Currently, archetype classification is usually done somewhat arbitrarily, based on expert judgment or classifications used in prior research. Without linking archetype definitions to measured data, there is no way of determining which variables in the modeled building stock actually affect energy use and, conversely, no way to demonstrate that the archetypes, once defined, create groups with homogeneous energy profiles.

2. Uniformity in modeling occupancy parameters: Since occupant behavior is highly variable and information on it is rarely available, urban models, even ones that are meant to provide results on an hourly basis, tend to assign the same occupancy-related parameters to buildings of the same function [12, 13]. This includes parameters that are directly under the influence of the occupants rather than the physical properties of the building, such as operating schedules, occupant densities, or thermostat setpoints. Such uniformity of behavior is unrepresentative of reality and becomes an issue in energy modeling, since differences in occupancy have been shown to be responsible for a large portion of the variance in energy consumption, especially among residential buildings.

3. Reliance on annual, aggregated measured data for model validation: Little validation of UBEMs against measured data has been done to date [14]; the models that have been validated were compared to annual energy use quantities. A matching annual value, however, does not provide any guarantee that the attribution of this energy to different end-uses was done correctly. Some models do break energy down into end-uses or look at just a single end-use when validating; even then, it is unknown whether the model's results match well at shorter time scales. Furthermore, validation is typically done at the aggregated scale of an entire block, neighborhood or city. Few researchers have gone into greater detail and checked their urban models against individual buildings' energy uses; those that have report individual building errors that are several times larger than aggregated ones.
1.3 Thesis Goals and Outline

The aim of this thesis is to address the limitations identified above for modeling urban building energy use through an improved bottom-up approach that employs data analysis to reduce uncertainty. It explores the extent to which a UBEM can be refined when sub-annual, individual-building energy data is available for a neighborhood. Specifically, its goals are to:

1. Present a data-based methodology for the creation and refinement of building archetype templates.

2. Probabilistically estimate indeterminate occupancy-driven parameters through a parametric analysis; use the resulting probability distributions to account for uncertainty in these parameters in the given or a similar neighborhood.

3. Validate the model against both annual and monthly data, on an individual building basis and in aggregate; compare results to determine the advantage of monthly data availability.

Chapter 2 provides a background on current urban modeling practices. The general proposed methodology for urban building energy modeling and refinement is developed in Chapter 3. Chapter 4 describes specific details of the application of this methodology to a case study of Cambridge, Massachusetts. The results of the case study are presented in Chapter 5. Chapter 6 discusses the main conclusions of this research, its limitations, and plans for future work.

Chapter 2

Background

In its aim of addressing uncertainty and increasing accuracy in urban building energy models (UBEMs), this thesis builds upon prior work in urban energy modeling, which in turn arose out of the building energy modeling (BEM) field. This chapter summarizes BEM practices for existing buildings, focusing on the approaches that have been used to calibrate models to measured data. In UBEM, calibration as such has not really been done due to the lack of granular data, both of building properties and of measured energy.
Rather, comparison to measured data in UBEMs is usually termed validation and is done primarily on an aggregated basis.

2.1 Energy Modeling of Existing Buildings

Building Energy Modeling (BEM) has been used since the 1970s for predicting electric and fuel demands of buildings. These models can be divided into two broad approaches: forward models and data-driven (inverse) models [15]. Forward models, also called physical models, are used in the design stages of buildings before any measured data is available. They rely on inputs of climate data plus the building's geometry, constructions, occupancy profiles, zoning and systems, as designed. These inputs are used in energy balance calculations that determine heating and cooling loads for the spaces at certain time intervals, which are in turn fed into models of the system components that serve to satisfy the calculated loads. Forward models are most commonly created using software developed specifically for BEM, which uses detailed algorithms based on physical principles of heat and mass transfer. Such software includes both government-funded (DOE-2, EnergyPlus) and commercially-developed simulation engines (TRNSYS, ECOTECT, IES-VE, among others).

For the simulation of existing buildings, however, forward models are not sufficient: so much of the energy use is determined by a building's operation that forward models are generally not representative of real post-occupancy conditions. When a building's measured energy values for a certain time period are available, data-driven (or inverse) models need to be used. Inverse models use the known outputs to determine the system's input parameters and their mathematical relationship to the outputs. They can be broken down into three subtypes:

• Black-Box Models: This is a purely mathematical approach that relies on identifying a model to link measured energy use to a set of building properties and weather parameters. Multiple Linear Regression has been used most often due to ease of interpretation; however, more complicated models, such as Multistage Regression, Artificial Neural Networks (ANN), Support Vector Machines (SVM) and others, have been used in recent years.

• Gray-Box Models: This method uses a combination of physical modeling and statistical methods. A physical (usually simplified) model is created to represent actual features of a building, while statistical analysis with measured data is used to identify specific building characteristics.

• Calibrated Models: This method uses BEM software to create a physical model of a given building, then tunes the model's input parameters to match measured energy use as closely as possible. This method tends to be the most time-consuming of the three but provides the most flexible result. The accuracy of the model, however, greatly depends on the method and the time step used. (Calibrating to 8,760 hourly values, for instance, will inevitably result in a higher-accuracy model than calibrating to a single annual value.)

This thesis uses a variation of the Calibrated Model approach due to its potential for automation, accuracy of physical process representations, and versatility in applications. The following section expands on procedures that have been used for calibrating models of existing buildings.

2.2 Calibration of Building Energy Models

The calibration approach to modeling existing buildings is typically used when detailed building energy data is required, such as for examining the effects of proposed retrofits to a building or testing different control schemes. The advantage of a calibrated model over a black-box or gray-box one is that, while all three approaches can yield outputs that closely match observed energy consumption, only the calibrated model is entirely based on physical phenomena; thus, if the calibration is sufficiently accurate, it offers the greatest flexibility in the scenarios the model can predict.
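For contrast with the calibrated approach, the black-box idea from Section 2.1 can be illustrated with a minimal sketch. It is not taken from the thesis: it uses a simple one-predictor linear regression (a special case of multiple linear regression), it assumes heating degree days as the weather parameter, and the function name `fit_line` and all monthly values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly data for one building (illustrative, not measured).
hdd = [620, 540, 430, 250, 90, 10, 0, 0, 60, 230, 400, 580]  # heating degree days
gas_kwh = [3300, 2900, 2400, 1500, 700, 320, 280, 290, 560, 1400, 2250, 3100]

# slope: gas use per degree day; base_load: weather-independent gas use.
slope, base_load = fit_line(hdd, gas_kwh)
predicted = [base_load + slope * d for d in hdd]
```

The fitted coefficients reproduce the observed relationship between weather and gas use but carry no physical description of the building, which is why such a model cannot be expected to predict the effect of a retrofit or a change in controls.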
The BEM engines used in calibration are generally evaluated according to ASHRAE Standard 140 [16] on their ability to model specific components consistent with established requirements. Ones that perform well can be assumed to faithfully represent physical processes within a building, at least insofar as every model is a simplification of reality. Therefore, if the simulation algorithms are assumed to be "correct" (i.e., they perform engineering calculations using accepted equations and numerical methods) and the observed energy consumption data is assumed accurate as well, what remains to be determined during calibration are the inputs to the simulation engine.

However, BEM calibration remains an under-defined and over-parametrized problem without a unique solution. The difficulty stems from the fact that many disparate properties of a building can have similar effects on energy use (e.g., both thicker wall insulation and higher occupancy density can lead to a decrease in heating load), while others have effects that cancel each other out to various extents. In such cases, there can be an unlimited number of input parameter vectors that result in the same or comparable calibration errors. The determination of which one of these solutions, or input vectors, is the "most correct" is generally left up to the judgment of the practitioner. Furthermore, a model is usually considered sufficiently calibrated if its error (defined below) is within a certain maximum bound; the resulting model's parameters are rarely validated by inspection against the actual building, even when it is feasible.

2.2.1 Calibration Assessment

The two official standards most often referred to when assessing the quality of a calibrated energy model are ASHRAE Guideline 14 for Measurement of Energy and Demand Savings [3], used in the United States, and the International Performance Measurement and Verification Protocol (IPMVP) [4], used internationally.
Both of these provide techniques for quantifying effects of energy conservation measures (ECMs) on existing buildings, with Calibrated Simulation as one of the options. Two statistical indices are used in these standards for assessing calibration quality: the normalized mean bias error (NMBE) and the coefficient of variation of the root mean square error (CVRMSE). These are defined as follows:

$$ \mathrm{NMBE} = 100 \times \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)}{n \times \bar{y}} \tag{2.1} $$

$$ \mathrm{CVRMSE} = 100 \times \frac{\sqrt{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 / n}}{\bar{y}} \tag{2.2} $$

where $n$ is the number of measurements in the calibration period, $y_i$ is the observed energy value for the $i$th period, $\hat{y}_i$ is the simulated value for the same period, and $\bar{y}$ is the mean of the $n$ observed values. The CVRMSE is a measure of how well the model fits the measured values at each interval; the NMBE measures the error between the means of the simulated and measured data for the entire time period.

ASHRAE Guideline 14 and IPMVP declare a model to be 'calibrated' if it meets the criteria in Table 2.1 for these two metrics, which vary with the time interval used in the comparison. While these criteria provide some guidance on how far total simulated energy values can be from measured ones, they are relatively vague. They do not require separation between electricity and fuel used by the building and do not place any constraints on matching energy by specific end-use. This allows room for inconsistencies between the model and the actual building even when the maximum error thresholds are met.

TABLE 2.1: Error limits for whole building calibrated simulation. [3, 4]

Metric    ASHRAE 14 Monthly    ASHRAE 14 Hourly    IPMVP Monthly    IPMVP Hourly
NMBE      5%                   10%                 20%              5%
CVRMSE    15%                  30%                 -                20%

2.2.2 Calibration Methods

While ASHRAE Guideline 14 and IPMVP prescribe metrics to assess the quality of a calibrated model, they do not provide much guidance on the actual steps of the calibration process, leaving it up to the practitioner to determine the best method.
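
As a minimal sketch of how the two calibration indices from Section 2.2.1 might be computed, the Python snippet below assumes the common form with n in the denominator and the sign convention of observed minus simulated (some formulations apply a degrees-of-freedom correction to n instead). The function names and the monthly values are hypothetical and are not taken from this work.

```python
import math

def nmbe(observed, simulated):
    """Normalized mean bias error in percent (observed minus simulated)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    bias = sum(y - s for y, s in zip(observed, simulated))
    return 100.0 * bias / (n * mean_obs)

def cvrmse(observed, simulated):
    """Coefficient of variation of the root mean square error in percent."""
    n = len(observed)
    mean_obs = sum(observed) / n
    rmse = math.sqrt(sum((y - s) ** 2 for y, s in zip(observed, simulated)) / n)
    return 100.0 * rmse / mean_obs

# Hypothetical values: over- and under-predictions of equal size.
observed = [100.0, 100.0, 100.0, 100.0]
simulated = [110.0, 90.0, 110.0, 90.0]

bias_pct = nmbe(observed, simulated)    # errors cancel in the mean
fit_pct = cvrmse(observed, simulated)   # interval-level misfit remains
print(bias_pct, fit_pct)
```

In this illustrative case the bias is zero while the CVRMSE is 10%, which shows why the two indices are reported together: a model can match the mean and still miss each interval.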
As yet, there is no widely-accepted general calibration procedure, which makes the process largely non-replicable and inhibits inter-model comparisons. As Coakley et al. point out, "many of the current approaches to model calibration rely heavily on user knowledge, past experience, statistical expertise, engineering judgment, and an abundance of trial and error" [17].

In an attempt to provide additional guidance for the calibration process, many researchers have proposed their own methodologies. These feature a variety of approaches, ranging from manual graphical-based methods to ones rooted in statistical analysis and optimization. The latter type typically requires less user judgment and thus offers greater opportunity for automation of the calibration process. In the context of urban energy simulation, automation is essential, as large numbers of buildings make manual calibration time-prohibitive.

The first comprehensive review of calibration methods was performed by Reddy et al. [18, 19] as part of ASHRAE's Research Project 1051. Reddy synthesized the most successful procedures into a partially-automated multi-step methodology employing sensitivity analysis and two-stage optimization. Importantly, Reddy advocated choosing a set of most plausible solutions, rather than a single one, as a way to account for the uncertainty associated with non-uniqueness of solutions. Reddy's case studies showed good agreement with measured data: Goodness of Fit values, defined as a weighted combination of CVRMSE and NMBE, were within 1-2%. However, a later paper by Gestwick et al. [20] tested Reddy's method on a 12,000 m² office building and showed less satisfactory results, with a monthly NMBE of 8.4%, even after significant manual adjustments.

Since Reddy's analysis, many other automated calibration methods have been proposed. They can generally be classified under one of the following types:

o Optimization involves the minimization of an objective function (usually a measure of calibration error) using one of many optimization algorithms to search through the parameter vector space.

o Bayesian Calibration assumes known prior distributions for the variable parameters. These priors are updated using known outputs (measured data) so that their posterior distributions reflect this added knowledge.

o Meta-Modeling involves the creation of a simplified, algebraic or machine-learning-based model based on the original engineering building energy model created with BEM software, with the goal of obtaining the same results but at a much lower computational cost.

More detailed descriptions of the latest calibration methods can be found in recent reviews by Coakley et al. [17] and Fabrizio et al. [21]. Most of these methods have not been tested by other researchers; hence, their general applicability to various building types has not been confirmed.

When scaling up from a single building to a neighborhood or city with hundreds or thousands of buildings, data availability and computational requirements become prominent issues in model calibration. For this reason, researchers in urban energy modeling have been obliged to adapt the single-building approaches described above or develop alternatives. An overview of these is presented in the following section.

2.3 Urban Building Energy Modeling

Urban Building Energy Modeling (UBEM) is a growing field of research with a range of applications:

o For districts, cities or countries that are evaluating energy efficiency measures for the building sector, calibrated urban models provide a way to estimate effects of potential policies or to assess the feasibility of greenhouse gas reduction targets.

o For a campus or neighborhood considering the installation of a district heating system or distributed generation, a UBEM can provide information on hourly loads for the district, separated by end-use.
o UBEM can provide predictions for utilities forecasting future electrical demand for new or growing districts.

Urban building energy modeling is divided into two major approaches that have been labeled "top-down" and "bottom-up." The top-down approach consists of using known aggregate energy consumption for a given region and time period (usually annual) and subdividing it into portions that are attributed to specific groups of buildings (e.g., to all buildings of one function, to each ZIP code in a city, etc.). The bottom-up approach attempts to match the same aggregated energy consumption by taking the inverse route and creating models at an individual building level, then summing up the results for all buildings in the set.

While both approaches aim to describe the same energy consumption, top-down models are limited in the sense that they are trained using historical data on consumption levels, building conditions, economic indicators, etc. Being entirely statistical in nature, they are limited in their predictive ability to very small variations from the status quo and so cannot model consequences of technological advances, changes in construction practices, etc. [2]. The bottom-up approach does not have that limitation. A further advantage is that the energy consumption can from the very beginning be separated into end uses, and that results can have high spatial resolutions.

Two sub-methods exist within the bottom-up approach: the engineering method (EM), which creates engineering models of buildings using either BEM software or simplified thermal models, and the statistical method (SM), which uses black-box modeling. Figure 2.1 from Swan et al. [2] shows the breakdown of methods that have been used for modeling residential energy consumption; these are generalizable to urban energy models for any building type.
FIGURE 2.1: Techniques used for estimating regional or national residential energy consumption. [2]

A typical bottom-up urban building energy modeling procedure involves the steps of data collection, conversion of data into energy model inputs, model execution, and validation of results by comparison to measured data. Data access is typically the limiting factor in how detailed a UBEM can be and how closely it can be validated. Building properties may be available on a large-scale basis thanks to housing, insurance, or property tax assessment databases; however, this data is rarely accurate or complete enough to serve as inputs to energy models. Energy data is even more difficult to obtain in large quantities due to privacy concerns. Most UBEMs end up relying on annual, district- or city-wide energy totals to validate their results. This reliance on spatially-aggregated and temporally-coarse data embeds large uncertainties into both urban model creation and model validation.

2.3.1 Archetype Definitions

Bottom-up engineering models use building characteristics as inputs to energy simulation algorithms. When making an energy model of an existing building, these characteristics are typically collected by examination of architectural and mechanical drawings and an in-person energy audit. For an urban model with hundreds or thousands of buildings, this procedure is infeasible, so modelers must employ some level of generalization. The usual way to generalize this step is by separating the building stock into homogeneous groups and assigning the same building properties to all buildings within a group. This method automatically assumes that there are certain building properties that can explain much of the variance in energy consumption between groups of buildings.
The non-geometric, non-climatic properties most often used to differentiate buildings by group include age (most common), use, HVAC system, and construction types. Yet, when measured energy data for individual buildings from the area being modeled (or from another location, similar with respect to geography, demographics, and construction practices) is unavailable, the rationale for defining archetypes remains somewhat arbitrary. It is usually based on expert judgment, general knowledge of construction practices, or methods used in prior research. In this situation, there is no way of determining which variables in the modeled building stock actually affect energy use, or of demonstrating that the defined archetypes create appropriate groups with similarity in energy consumption.

Aksoezen et al. [22] explored whether classification of buildings by age and use is indeed a meaningful way to generate archetypes. They noted that several studies had thrown doubt on the common classification of buildings by construction age and dwelling type, as variations within the same class were often greater than between classes. To check their hypothesis, Aksoezen et al. performed an analysis on about 20,000 buildings in Basel, Switzerland, for which energy data for the year 2011 was available. They used annual natural gas consumption intensity as a measure of energy performance and demonstrated that, for the given dataset, buildings constructed in the period 1921-1979 used more gas than those built before 1921 or after 1980, so in this case age was indeed appropriate as a differentiation factor.

The impulse to use building age as a classification variable is understandable, since it can automatically reveal further information about the construction practices and materials used in the building. However, this information is not deterministic and cannot solely be relied on for archetype definitions.
A building's construction date contains no information on whether any refurbishment, such as improvement in insulation or replacement of original windows, has been performed. Orehounig [23] confirmed this with a case study of 100 buildings in a village in Switzerland, which were divided into 7 templates by age and modeled with the CitySim energy simulation environment. The authors noted that energy predictions for buildings less than 30 years old were much more accurate than for the older stock, whose energy consumption was highly variable.

Data-based approaches to archetype definitions have been limited. Aksoezen et al. checked just one possible archetype distinction on their data set. Famuyibo et al. are among the few researchers who applied measured energy data to create archetypes for a building stock (residential homes in Ireland) [24]. The measurements came from the Energy Performance Survey of Irish Housing, which contained energy use and physical characteristics for 150 typical Irish dwellings. They conducted a literature review to identify which variables had been considered significant to energy consumption in prior studies, followed that with a multivariate linear regression for the given measured dataset, then looked at the frequency distributions for each of the significant variables and chose representative values of these variables based on histogram peaks. The thirteen resulting archetypes were said to be representative of 65% of the Irish housing stock.

2.3.2 User Behavior Modeling

The most frequently cited limitation of bottom-up UBEMs is the uniformity of occupancy modeling. Most urban models tailor occupancy and schedules to the building type but do not vary the profiles within one type of building. This means that all residential spaces are modeled with the same occupancy, setpoints, and lighting and equipment schedules.
However, it has been shown that user behavior is among the chief drivers of energy use in the residential sector [25]. Since most urban models only validate their results on an annual basis, discrepancies in occupancy and schedules might not be obvious due to averaging when aggregated. However, when these models attempt to use annually-validated results to predict spatially-distributed daily or hourly energy use within a city or neighborhood, the idea that every building shares the same user behavior is certainly erroneous.

Strzalka et al. [26] compared ten similar apartments within the same multifamily building by annual heating energy intensity; they showed variation between 30 and 90 kWh/m² in the same year. The same study, when evaluating a heating model for 300 residential houses that assumes the same user behavior among buildings, showed that the model, influenced primarily by the buildings' geometries, was not able to account for the large variations in energy use among these buildings (EUIs ranging from 20 to 90 kWh/m²); these large variations can only be attributed to user behavior. Based on previous sensitivity studies stating that thermostat setpoint is among the most influential parameters on heating demand, the authors adjusted setpoints according to a normal distribution around a mean of 68°F. The addition of this degree of freedom resulted in much closer correlation between simulated and measured values.

The Household Electricity Survey conducted in the United Kingdom in 2010-2011 [27] illustrates some of the issues related to the unpredictability of residential energy consumption. This survey, the most detailed one ever undertaken of electricity consumption in UK homes, monitored 250 buildings, of which 26 were monitored for an entire year and the remainder for one month each on a rolling basis. Monitoring was done with meters added to the buildings' central distribution boards and to some individual appliances in order to disaggregate different end-uses. The researchers analyzed the electrical base loads of different end-uses (the minimum hourly value over a day) and found huge variation between individual households; a small portion of sample houses had base loads several times higher than their peers. For example, about 60% of the total IT category base load could be attributed to just 12% of the households. Similar contrasts were noticed for lighting and audiovisual appliances. In the audiovisual category, 17 households used under 100 kWh per year, while the top 22 used more than 1000 kWh per year, ten times the amount of the lowest users. One limitation of this study is that energy use between households was compared by building and not normalized by floor area; normalization would likely have reduced the variation in demand and annual consumption to a certain extent. However, the variance in floor area and number of household members still would not fully account for the variance in energy use, indicating the large role played by occupant behavior.

Accounting for user behavior could be done in various ways. Research has been done on generating behavior schedules based on stochastic algorithms. Alternatively, multiple researchers have noticed correlations between demographic factors such as household income and energy use ([28], [29]). This suggests the possibility of using census data to create appropriate schedules or appliance loads based on population. In a different approach, travel surveys could be used to inform occupant schedules ([30], [31]). Keirstead and Sivakumar [30] used activity-based modeling to simulate hourly electricity and fuel demands. Activity-based modeling is a type of integrated land use and transportation modeling based on microsimulation of agents' schedules with regard to activity and location throughout a time period.
Keirstead's model assigned schedules to each of 65,000 statistically-representative agents (population members) and simulated the electric and gas demand by time of day across 391 spatial zones in the city of London. One of the limitations of their study was a lack of variation in the domestic activity profiles, since the travel survey used to build the model did not contain this information; therefore, it would not help in predicting residential energy consumption.

2.3.3 Model Validation

Calibration, which has long been used in the building energy modeling practice, is associated with a range of difficulties even when applied to a single building (Section 2.2). An urban model greatly increases not only the quantity of buildings that need to be calibrated, but also the uncertainty of initial, pre-calibrated model descriptions. For these reasons, as well as the difficulty of obtaining energy data for individual buildings at shorter time intervals, the practice of urban modeling has so far mostly relied on annual, aggregated energy measurements for model validation.

Calibration of groups of individual building models within an urban context to sub-annual data has not, to our knowledge, been performed. One recent work by Sehrawat et al. [32] did compare errors in energy models at monthly intervals for a block with 27 office buildings in Los Angeles, CA. (The energy data was provided by the LA Department of Water and Power.) Their model showed an error variation of 11-23% by month for the aggregated block energy consumption. Errors by individual building were reported only annually and ranged up to 14% of EUI.

Fonseca et al. [33] created their own detailed integrated model of energy use in buildings instead of using BEM software and validated it against a neighborhood of 23 buildings. While their integrated model can predict energy use at hourly intervals, its results were not validated on a sub-annual basis.
The model was evaluated on annual errors for major end-uses (heating, cooling, electricity), both for the neighborhood together and for each building separately. The neighborhood errors for these categories ranged between 1-19% but increased to 4-66% when looking at individual buildings (excluding outliers).

Chapter 3
Proposed Methodology for Urban Building Energy Modeling

The main steps in the creation of an urban building energy model (UBEM) consist of (1) the collection of data relevant for a thermal energy model (weather, building shapes, constructions, etc.), (2) organization of data for input into the thermal energy model (template creation), (3) execution of the thermal simulation algorithm, and (4) validation of results by comparison to measured data. This chapter presents a general overview of data sources and procedures that can be applied to a variety of models. Afterwards, Chapter 4 describes a specific application of this methodology to a case study on the City of Cambridge, Massachusetts.

3.1 Data Collection

3.1.1 Weather Data

A UBEM should attempt to use weather data specific to the location and the time period of calibration. In order of decreasing accuracy, sources of weather data include:

o Local weather data from privately-installed stations can often be accessed online through aggregators such as Weather Underground [34] or individual webpages. Data downloaded directly from weather stations is typically recorded in sub-hourly intervals and may contain periods with missing observations, so it needs to be post-processed into the appropriate hourly weather file format.

o Actual Meteorological Year (AMY) files, with historical data from weather stations around the world. This is important for calibrating or weather-normalizing a model with energy data from a specific year, since much of the energy use is correlated with the number of heating and cooling degree days in the given year. The National Weather Service logs data for over 4,000 weather stations around the world. For locations not covered by NWS weather stations, services such as Weather Analytics [35] provide additional AMY files by combining actual meteorological station data, U.S. National Oceanic and Atmospheric Administration (NOAA) data, and proprietary algorithms to generate weather files for every 35 x 35-km area around the globe.

o Typical Meteorological Year (TMY) files, representing statistically "typical" yearly weather from recent decades. These files are available for many locations around the world. [36]

3.1.2 Energy Data

Measured energy use records for groups of buildings are not readily available. Usually they need to be obtained directly from utility companies, and such permission can be difficult to receive. Another source of energy data is municipal databases, for those select cities that have implemented energy disclosure laws for certain classes of buildings (see Introduction).

3.2 Building Data

Another challenge in urban modeling is collecting accurate data on building properties. Two categories of data are needed to create a bottom-up urban model: geometric and non-geometric.

3.2.1 Geometric Properties

Envelope geometry plays a role in a building's thermal energy requirements and needs to be defined prior to simulation. Methods that can be used for this purpose, in order of decreasing accuracy, include:

o Combining extrusions of building footprints from 2D GIS data with LiDAR data from aerial scans. This provides more details on the building geometries, which is especially important when buildings have irregular shapes, roofs are not flat, or floor areas change by story. (LiDAR has been used in solar mapping, but not yet in urban energy simulation.)
o The CityGML format for storage and exchange of city models, which specifies 3D geometries and locations of entities [37].

o Using GIS shapefiles with building footprints, along with information on building heights or numbers of stories, to extrude the footprints vertically to the appropriate height.

o Creating one prototypical geometry for each of the building archetypes being simulated, then scaling up energy use intensity results by floor area (as seen in [12, 13]).

Once the exterior geometry has been specified, the buildings need to be separated into thermal zones. Prior research work has used both single-zone and various multizone configurations. For urban modeling purposes, multizone configurations are generally either done by floor or by splitting into core and perimeter zones.

3.2.2 Non-Geometric Properties

The non-geometric properties of buildings can be characterized by a space of numerical or categorical parameters, which can be separated into several types:

o Physical (fixed) parameters describe the properties of the building that remain unchanged over time and do not depend on the occupancy. Numerical parameters can include floor area, number of rooms, number of specific appliances, year of construction, window-to-wall ratio, and others. Categorical ones include the presence or absence of air conditioning, type of heating system or heating fuel, type of wall construction, and so on. While these parameters can potentially vary when considering long time periods (e.g., windows can be upgraded or insulation can be added), they are usually assumed to be constant over the calibration period (typically one year).
Data sources: In-person audits, property tax assessment records, expert evaluation, local building codes, national building codes.
o Occupancy-driven (variable) parameters are ones that to a large extent are correlated with occupancy rather than with the physical properties of the building. The number of occupants typically determines the level of electricity use for appliances, the frequency of cooking equipment use, and the amount of hot water used for showering, laundry, and dishwashing. Occupant preferences and activities within the house also determine settings dependent on individual comfort requirements, such as thermostat setpoints or the amount of lighting needed.
Data sources: Direct polling of occupants, residential surveys, national census.

o Scheduling (time series) parameters: This category is also occupancy-driven, but, while the previous one defines steady effects of occupant number and preferences (e.g., the peak power requirement for appliances), this one accounts for daily fluctuations using hourly schedules (fractional, on/off, temperature).
Data sources: Appliance-level submetering, direct polling of occupants, residential or transportation surveys.

Scheduling parameters become most important when calibrating to hourly energy consumption data. Since this work looks only at calibration to monthly and annual measurements, this methodology focuses on determining the other two types of properties: physical and occupancy-driven. The following sections present two approaches to setting these two categories of inputs. Physical properties are assigned to the model through archetype templates, while occupancy-driven properties are inferred from measured data and parametric analysis.

3.3 Template Generation

Because collecting all the information required for an energy model on an individual building basis would be time-prohibitive, many urban- and national-scale modelers have used the concept of building archetypes as a way of assigning building properties.
An archetype defines a set of characteristics that is representative of a group of buildings with similar properties.

This research proposes the use of multivariate linear regression as the first step in archetype development. Assuming some information on energy consumption is known and building properties have been identified or estimated as described above, regression can be used to systematically select the variables by which archetypes should be distinguished. Since geometry is already accounted for within the EnergyPlus input files, archetype templates should be based upon combinations of the non-geometric variables that are shown to significantly affect energy use intensity. Once all combinations are defined, any groups that contain very small numbers of buildings can be eliminated for the sake of simplicity by merging them with more populous archetypes with similar properties.

3.4 Template Customization

After geometry has been created and archetype templates have been used to assign building properties, energy simulation can be run. Generally, it is unlikely that the results of the energy simulation will match measured results, even if templates are assigned correctly, because occupants have a greater effect on energy use than the physical properties of the construction. Most researchers have avoided addressing this problem by validating their results in aggregate, i.e., summing up the energy use for an entire neighborhood or city and comparing it to the measured value. The errors for these aggregated comparisons have ranged from 4% to 21% [14]. Even in the best cases, however, it is not likely that an individual building selected from that set would conform to one "typical" occupancy profile; the low error is likely a result of averaging out under- and over-predicted spaces. Indeed, researchers who have looked at errors both on the aggregate and the individual building scale have reported single-building errors between 5% and 99% [14].
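
The averaging effect described above can be illustrated with a small Python sketch. The four buildings and their annual energy values are hypothetical numbers chosen for illustration only; they are not results from this work or from [14].

```python
def percent_error(simulated, measured):
    """Signed error of a simulated value relative to the measured one, in percent."""
    return 100.0 * (simulated - measured) / measured

# Hypothetical annual energy use (kWh) for four buildings sharing one template.
measured = [10000.0, 20000.0, 15000.0, 5000.0]
simulated = [14000.0, 16000.0, 12000.0, 8000.0]

# Error of each building taken on its own.
individual = [abs(percent_error(s, m)) for s, m in zip(simulated, measured)]

# Error of the neighborhood total, where over- and under-predictions cancel.
aggregate = abs(percent_error(sum(simulated), sum(measured)))

print(individual, aggregate)
```

Here the neighborhood total is matched exactly even though every building is off by 20% or more, so a small aggregate error by itself says little about the fit of any single building model.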
FIGURE 3.1: Example of residential monthly energy use in Cambridge, MA. [Two monthly panels, months 1-12, with regions labeled HEATING, COOLING, and BASE LOAD.]

This step involves taking each template as defined above and further customizing it to individual buildings. Since the templates are assumed to be representative of constructions, this customization step applies to settings related to internal loads and heating and air conditioning systems. For single BEMs, these settings are typically defined manually by referring to building lighting plans and mechanical equipment schedules. For UBEMs with hundreds of buildings, this would be infeasible. However, if monthly measured data by building are available, certain input parameters can be inferred in an automated fashion. The resulting distributions of these inferred parameters can then be used as the basis for generation of models for other buildings in the same or similar neighborhoods. Moreover, this procedure can identify inconsistencies or mistakes in the building information database being used.

3.4.1 Inference of Parameters

Monthly energy data affords some ability to differentiate between end uses. This in turn reveals information about the buildings that could not be gleaned from annual measurements. The information that can be inferred will vary based on the climate zone, and on whether the building uses all-electric energy or electricity plus a heating fuel. In a climate with both heating and cooling seasons, the electricity base loads in shoulder months are typically representative of the appliance and lighting loads in the dwelling, electric peaks in the summer indicate air conditioning, while the base fuel load during non-heating season is primarily used for hot water (Figure 3.1).
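
The inference described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure used in this work: the choice of shoulder months (April, May, October) and summer months (June to September), the function name, and the monthly readings are all hypothetical.

```python
def infer_end_uses(monthly_elec, monthly_gas,
                   shoulder_months=(4, 5, 10),    # assumed shoulder months
                   summer_months=(6, 7, 8, 9)):   # assumed cooling / non-heating months
    """Split 12 monthly readings into rough end-use estimates."""
    # Electric base load: mean of shoulder months (appliances and lighting).
    elec_base = sum(monthly_elec[m - 1] for m in shoulder_months) / len(shoulder_months)
    # Fuel base load: mean of non-heating months (mostly hot water).
    gas_base = sum(monthly_gas[m - 1] for m in summer_months) / len(summer_months)
    # Cooling: summer electricity above the electric base load.
    cooling = sum(max(0.0, monthly_elec[m - 1] - elec_base) for m in summer_months)
    # Heating: fuel use above the fuel base load, summed over the year.
    heating = sum(max(0.0, g - gas_base) for g in monthly_gas)
    return {"elec_base": elec_base, "gas_base": gas_base,
            "cooling": cooling, "heating": heating}

# Hypothetical readings for one dwelling (kWh of electricity, therms of gas).
elec = [300.0, 300.0, 300.0, 300.0, 300.0, 400.0,
        500.0, 500.0, 400.0, 300.0, 300.0, 300.0]
gas = [50.0, 45.0, 35.0, 20.0, 10.0, 5.0,
       5.0, 5.0, 5.0, 15.0, 30.0, 45.0]

result = infer_end_uses(elec, gas)
print(result)
```

A dwelling whose summer electricity never rises above its base load would return zero cooling, which is one way the absence of air conditioning could be flagged.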
This information can be used to customize certain parameters in energy model inputs to each building.

3.4.2 Probabilistic Estimation of Parameters

After customizing individual building templates with properties that can be inferred from the energy data, we can go a step further and attempt to estimate those properties that are not directly deducible. This can be done with a procedure similar to some automated calibration processes that have been used for individual buildings. As mentioned in Chapter 2, the parameter values resulting from a calibration process will almost never be unique since many combinations of input vectors can provide similar outcomes. This issue is addressed by generating probability distributions of unknown input parameters instead of single values, using the probabilistic estimation method that has been described by Cerezo et al. [38].

The probabilistic estimation procedure consists of the following steps:

1. Select a set of $N$ unknown parameters ($X_i$).

2. Assign to each one a uniform probability distribution in the range $[min_i, max_i]$, $i \in [1, N]$, where the minima and maxima are set based on reasonable limits.

3. Use the $N$ uniform distributions to generate an $N$-dimensional grid of discrete parameter values with step sizes $t_i$. This results in a set of $S = \prod_{i=1}^{N} \left( \frac{max_i - min_i}{t_i} + 1 \right)$ input combinations.

4. Generate and simulate a set of $S$ files for each original (customized) building EnergyPlus input file, calculating the calibration error for each parametric run.

5. Define an acceptable calibration error and select all results within that range. Combine the values for parameters $X_i$ from all those results into a multivariate joint probability mass distribution.

6. Sample the resulting probability mass distribution with Monte-Carlo methods for use in modeling other buildings in the same or analogous neighborhood.
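
The steps above can be sketched in Python for a single building. This is a sketch under stated assumptions, not the implementation used in this work: the two parameters, their ranges and step sizes, the measured value, and the 5% threshold are hypothetical, and `toy_annual_heating` is a made-up stand-in for an EnergyPlus run with no physical meaning.

```python
import itertools
import random
from collections import Counter

def grid_values(lo, hi, step):
    """Discrete values from lo to hi inclusive, spaced by step."""
    count = int(round((hi - lo) / step)) + 1
    return [round(lo + k * step, 10) for k in range(count)]

# Steps 1-2: hypothetical unknown parameters with (min, max, step size).
ranges = {
    "heating_setpoint_C": (18.0, 22.0, 1.0),
    "infiltration_ach": (0.2, 1.0, 0.2),
}

# Step 3: N-dimensional grid of input combinations.
grids = [grid_values(*bounds) for bounds in ranges.values()]
combos = list(itertools.product(*grids))

def toy_annual_heating(setpoint, ach):
    """Hypothetical stand-in for a simulation run; not a physical model."""
    return 1000.0 * (setpoint - 10.0) * (0.5 + ach)

measured = 10000.0     # hypothetical measured annual heating energy
max_error_pct = 5.0    # hypothetical acceptable calibration error

# Steps 4-5: evaluate every combination and keep those within the threshold.
accepted = [
    c for c in combos
    if abs(100.0 * (toy_annual_heating(*c) - measured) / measured) <= max_error_pct
]

# Step 5: joint probability mass distribution over accepted combinations.
counts = Counter(accepted)
pmf = {combo: k / len(accepted) for combo, k in counts.items()}

# Step 6: Monte-Carlo sampling of the joint distribution (seeded for repeatability).
rng = random.Random(0)
samples = rng.choices(list(pmf), weights=list(pmf.values()), k=5)
print(len(combos), accepted, samples)
```

In this toy case three quite different combinations pass the threshold, which mirrors the non-uniqueness discussed in Chapter 2. Pooling the accepted combinations from many buildings into the same counter would give the neighborhood-level joint distribution.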
The results of parameter inference and probabilistic estimation can all be used to create probability mass distributions for the neighborhood being explored. If the modeled set of buildings is representative of the neighborhood, these distributions can then be used to assign properties to other buildings for which information is not available.

3.5 Model Execution

This process uses the EnergyPlus [39] simulation engine developed by the U.S. Department of Energy (DOE). EnergyPlus is chosen for its versatility in interacting with external interfaces, the ease of manipulating input files and running batch simulations, and continuing improvements by the development team at the DOE. Each new release of EnergyPlus is validated according to industry standards, and validation reports are available online.

3.6 Model Validation

After simulations are run, their results need to be compiled and compared to measured monthly values. An error metric should be chosen consistent with or derived from ASHRAE Guideline 14 (see Section 2.2.1).

Chapter 4
Application of Methodology to Cambridge, MA Case Study

The methodology described in the previous chapter was applied to the residential building stock in the City of Cambridge, Massachusetts. This chapter describes the specifics of each step tailored to the data available for Cambridge. The primary data sources were (1) monthly natural gas and electricity readings provided by the local utility, (2) annual property tax assessments provided by the City of Cambridge, and (3) GIS maps from the Cambridge Geographic Information System department. Most data processing and plotting was performed using R, a free software environment for statistical computing [40]. Analysis was limited to low-rise residential buildings with 1- to 4-family occupancy. The first step was to merge energy data with tax assessment data by address.
Tax assessor information was labeled by parcel with a unique Map-Lot number (Figure 4.1), while energy readings were provided by account number with a separate file linking account numbers to street addresses. After converting all addresses to a format consistent across the two databases, each tax parcel could be associated with one or more electric and gas meters. The rest of the analysis was conducted using Map-Lot numbers as unique IDs. The final subset of data contained 3,395 residential buildings across Cambridge. Of these, 453 located in the same neighborhood, Cambridgeport, were modeled with EnergyPlus. Thus, the numerical data analysis below was conducted on the entire sample of 3,395 buildings; the energy model results concern just the Cambridgeport subset. The year 2008 was chosen for model calibration, as it had the largest amount of metered energy data available.

FIGURE 4.1: Tax parcels of the City of Cambridge.

4.1 Data Collection

4.1.1 Weather Data

Weather data for Cambridge for the year of analysis, 2008, were obtained from two sources and combined into one file in EnergyPlus Weather (EPW) format. Most of the data were taken from a weather station in Central Square, Cambridge (KMACAMBR4) [41], which is located within the area being modeled and has records available from 2005 onward. Its measurements, recorded at 5-minute intervals, include drybulb temperature, dewpoint temperature, relative humidity, barometric pressure, wind speed, and wind direction. After converting to hourly data and correcting any gaps due to temporary sensor failure, these values were used to populate fields in a new EPW file for Cambridge, MA in 2008. However, since the Central Square weather station's historical data did not include solar radiation, they were supplemented by the Weather Analytics [35] AMY file for that year.
This was used to fill in values for global horizontal, direct normal, and diffuse horizontal radiation in the new EPW file.

4.1.2 Energy Data

Energy data were provided by NStar (recently renamed to EverSource), the utility company servicing parts of New Hampshire, Connecticut and Massachusetts [42]. EverSource agreed to provide MIT with partial data on their electric and gas customers in Cambridge for calendar years 2007-2010 for research purposes. The following steps were used to pre-process the energy data for further analysis.

1. Cleaning and merging:
   (a) Energy consumption was provided with the read date for each bill. The read dates were not consistent from building to building and typically fell anywhere within the first or the last week of every month, in some cases with multiple readings in one month. The readings were therefore standardized by calendar month using the 'xts' package in R.
   (b) Obvious outliers (e.g., monthly values exceeding the values of the previous and following months by a factor of two or more) were corrected by linear interpolation (less than 0.02% of 81,480 data points were corrected).
   (c) Natural gas consumption values were converted from therms to kilowatt-hours.
   (d) Since an address could be associated with one or more gas and electric accounts, all readings for one address were summed to get the entire building use. It was not always clear whether all of a building's accounts were included in the data provided; however, when the number of accounts differed drastically from the number of units reported for the building and the EUI was exceedingly low, those buildings were excluded on the basis of incomplete data.
2. Selection:
   (a) Only buildings which had natural gas as the heating fuel were retained, since consumption of fuel oil or other heating fuels was unknown.
   (b) The data were trimmed down to only those accounts that had at least 12 continuous months of data.
   (c) The year 2008 was chosen for model calibration, as it had the largest number of complete observations.

After pre-processing the data and merging with building property data, energy consumption could be normalized by floor area to energy use intensities (in kWh per square meter) for simpler comparison between buildings. The following plots show histograms of annual EUI distributions for the building set (Figures 4.2, 4.3) and monthly gas use and electricity use profiles (Figure 4.4).

FIGURE 4.2: Histogram by total 2008 EUI, shaded by construction period (pre-1945, 1946-1980, post-1980). (Horizontal axis: 2008 EUI in kWh/m².)

FIGURE 4.3: Histograms by 2008 EUI, separated into gas and electric use intensities. (Horizontal axes: 2008 gas and 2008 electricity in kWh/m².)

FIGURE 4.4: Monthly energy use intensities for low-rise residential buildings in Cambridge.
(A) 24 months of gas use intensity for 3,395 buildings, including gas- and oil-heated ones. The cyan color represents buildings originally labeled in the tax assessment as oil-heated.
(B) 24 months of electricity use intensity for 3,395 buildings. Peaks occur in both summer and winter months, implying the use of both air conditioners and supplementary electric heat.

4.2 Building Data

4.2.1 Geometric Properties

The Cambridge case study used GIS data made available by the City of Cambridge [43] in combination with the computer-aided design software Rhinoceros 3D [44].
The exact methodology is described below and illustrated in Figure 4.5.

1. The 3D geometry of the neighborhood was created in the Rhinoceros modeling environment with the visual programming plug-in Grasshopper [45]. (See Figure 4.5.)
   (a) A Grasshopper algorithm read in data from a GIS shapefile (SHP) and generated polygons for the perimeter of each building in Rhinoceros.
   (b) The SHP attribute table included information on the elevations of the ground and highest point of each building; these were used to extrude the polygons to the specified height.
   (c) The window-to-wall ratio was used to generate glazing surfaces distributed evenly around the facade of a building (except on walls adjacent to other buildings).
2. The Grasshopper plug-in ArchSim [46] was used to create EnergyPlus input files (IDFs) for each building. ArchSim processed the vertex coordinates of each surface into the corresponding EnergyPlus surface objects with specified boundary conditions.
   (a) A pre-processing algorithm was used to identify shading surfaces affecting each building. Generating shading surfaces from every surface in the neighborhood would be computationally prohibitive in both the IDF-generation and simulation phases, so only shading surfaces in close proximity to each building were included.
   (b) The number of stories was used to split the entire building volume into floors; each story was represented as a separate thermal zone in the IDF files.
   (c) The last step in IDF file creation was to specify building properties by assigning each one a template defining one of the district's archetypes (see Section 4.2.2).

4.2.2 Non-Geometric Properties

For the case study, information on physical properties of buildings was provided in the form of tax assessment records compiled annually by the City of Cambridge [47]. This is a widely applicable source of building information, as property tax assessments are performed in every location in the U.S.
and are publicly available, though the information recorded and frequency of updates may vary by municipality. For each tax parcel with a building, the Cambridge tax assessment listings contained information such as the year of construction, number of stories and rooms, facade and roof types, heating system types, ratings of interior and exterior conditions, and others.

FIGURE 4.5: Building geometry generation process.
(A) 2D GIS view of a section of Cambridge. Energy simulation was done for the buildings shaded green; others contributed to shading.
(B) Grasshopper component for creation of 3D geometry and EnergyPlus input files.
(C) 3D geometry after outline extrusion.

Exploratory data analysis showed that sections of the tax records were either incorrect or inconsistent with other information. This is not entirely surprising, given that the role of tax assessors does not extend beyond determining a property's value. In conversation, Clifford Cook, the Planning Information Manager at the Cambridge Community Development Department, called tax assessment "a sort of black art" and acknowledged that tax records are not meant to be used for other purposes [48]. Therefore, measured energy data were assumed to be the more reliable source of information and were used as a check for certain tax assessment records. Some of these errors are insignificant for the purposes of energy simulation, but others are more substantial and, if left uncorrected, can ascribe properties to a building that it does not possess, so that attempts at calibration can result either in large errors or in unrealistic outcomes. One example of such errors, noticed when combining energy data with the residential tax assessment, was the field Fuel Type.
Many of the buildings labeled as having oil fuel did not actually show a significant difference in monthly use profiles from gas-labeled ones. These were assumed to have been either labeled erroneously or not updated after a house had switched from oil to gas heat. Thus, Fuel Type was re-labeled based on energy data: buildings that had near-zero gas use year-round or a base gas load that did not rise significantly in winter were labeled as oil-heated and not retained for further analysis, since no data were available for non-gas fuels. The final, clean dataset contained values for 3,395 parcels in Cambridge with 24 months of monthly electricity and gas data. Table 4.1 contains a summary of known properties of the dataset.

TABLE 4.1: Summary of Cambridge residential dataset for 3,395 buildings.

Property         Unit      Mean      St.Dev.   Min     Max
Living Area      m²        226.8     104.8     39.0    1,183.9
Bldg Value       USD/m²    1,985.2   790.4     35.2    8,271.8
Stories          -         2.3       0.4       1.0     4.0
Units            -         1.5       0.7       1       7
Bedrooms         -         4.1       1.6       1       13
Kitchens         -         1.5       0.7       1       7
Baths            -         2.5       1.016     1.0     7.5
Total Rooms      -         9.3       3.255     2       28
EUI 2008 Gas     kWh/m²    203.0     83.9      14.8    776.8
EUI 2008 Elec    kWh/m²    39.2      22.8      0.5     196.2
EUI 2008 Total   kWh/m²    242.2     94.5      19.3    794.6

4.3 Template Generation

The initial archetype templates were based on the results of a multivariate regression analysis relating annual energy use intensities to building properties as predictor variables. All non-categorical building properties were normalized by floor area, and strongly correlated variables were excluded (e.g., number of bedrooms was correlated with number of kitchens, number of units, and total number of rooms). The categorical variables include AC Use (No/Yes), Heating Type (Forced Air/Hot Water/Steam/Other), Foundation (No Slab/Slab On Grade), Wall (Masonry/Non-Masonry), Roof (Flat/Sloped), Building Type (Attached/Detached/Semi-Detached), and Built Period (Pre-1945, 1946-1980, Post-1980).
All variables were tested with annual EUIs for 2008 and 2009 to check for consistency; results are in Table 4.2.

TABLE 4.2: Linear regression results.

                         Dependent variable:
Variable                 2008 EUI                  2009 EUI
Intercept                70.65*** (14.25)          69.53*** (14.17)
Stories/sqm †            8,518.65*** (432.07)      8,571.18*** (429.78)
Bedrooms/sqm             1,121.00*** (281.29)      1,023.02*** (279.80)
Fireplaces/sqm           3,692.64*** (411.01)      4,189.72*** (408.83)
Building Value/sqm       0.01*** (0.003)           0.01*** (0.003)
Exterior Condition       -9.98*** (2.01)           -10.52*** (2.00)
AC Use                   13.75*** (3.06)           12.59*** (3.04)
Heating: HW              18.18*** (3.56)           19.04*** (3.54)
Heating: Other           1.45 (7.18)               1.89 (7.15)
Heating: Steam           12.94*** (4.63)           13.09*** (4.60)
Foundation: Slab         10.41 (11.84)             8.08 (11.78)
Wall: Non-Masonry        -19.16*** (7.15)          -16.71** (7.11)
Roof: Sloped             15.67*** (4.73)           17.01*** (4.71)
Type: Detached †         46.89*** (8.44)           48.30*** (8.40)
Type: Semi-Detached †    33.35*** (8.66)           32.87*** (8.62)
Built 1946-1980          8.38 (7.15)               9.57 (7.11)
Built Post-1980          -56.80*** (7.91)          -60.92*** (7.87)
Observations             3,395                     3,395
R²                       0.18                      0.19
Adjusted R²              0.18                      0.18

Note: *p<0.1; **p<0.05; ***p<0.01

Some of the variables identified as statistically significant by the regression model can be attributed purely to geometric properties (indicated with † in the table), while others need to be accounted for by other non-geometric means. The geometric variables (stories per square meter, type of building) do not need to be included in the templates since they are taken care of when IDF geometry is generated. Among the ones not automatically accounted for by the geometry, the statistically significant ones included:

Bedrooms/sqm: Can be interpreted as an indicator of occupancy and is positively related to EUI.

Fireplaces/sqm: The number of fireplaces seems to increase EUI significantly.
This could be caused by increased air exchange rates from the stack effect when a fireplace is in operation, as well as increased air leakage through a chimney with an imperfectly closed damper while a fireplace is non-operational. It could also be partially a result of correlation between the number of fireplaces and the building value (r = 0.41).

Building Value/sqm: Has a very slight positive correlation with EUI. This could possibly be an indicator of a wealthier household with higher appliance use.

Exterior Condition: An integer ranking from 0 (poor) to 10 (excellent). The negative regression coefficient implies that better exterior condition corresponds to slightly lower energy use.

AC Use: Increases the EUI when present.

Heating Type: Hot Water and Steam were the categories showing a significant difference from the base case category (Forced Air).

Wall Type and Roof Type: Non-masonry walls show a negative coefficient, which could be due to the fact that masonry buildings are not present in the most recent age group. Sloped roofs have a positive coefficient: one possible explanation is that, if the roof is insulated worse than the exterior walls, the sloped roofs surrounding a conditioned space would provide more surface area for heat exchange than a flat roof.

Built Period: The results imply little difference between buildings constructed pre-1945 and 1946-1980, with newer (post-1980) buildings having lower energy consumption. This is logical given that post-1980 construction had to comply with energy codes specifying insulation levels, while earlier buildings had variable, if any, insulation levels.

Of the variables identified as significant above, construction period (used to define envelope constructions) and air conditioning were included in the initial template set. Masonry/non-masonry wall type was originally included as another differentiator, but the number of buildings with masonry construction was so low that the division was deemed superfluous.
The number of bedrooms/occupants and heating type are accounted for later in the customized templates. Number of fireplaces, building value, and exterior condition were excluded as there was no clear way to model them with EnergyPlus. The flat/sloped roof variable was also excluded since it should in theory be defined by geometry, but with the current GIS extrusion process roofs cannot be modeled in detail.

FIGURE 4.6: Energy use shaded by age category of building. (Axis label recovered from the plot: 2008 electricity in kWh/m².)

Finally, an additional variable not used in the regression was included in the initial set of templates. As Figure 4.6 illustrates, there is no obvious relation between gas/electric EUIs and the two older age groups. Because of the large spread in heating (gas) EUIs, it was assumed that a more important categorization than age would be whether (and to what extent) the house had been retrofitted with insulation since its original construction. At the time of this study, the City of Cambridge was unable to provide information on dates of major renovations, which would have simplified this categorization, and no other variables in the tax assessment data could be related to presence of insulation or renovation status. Therefore, a simplified procedure based on error comparison was used to infer whether a building had been insulated or not. All pre-1945 buildings were simulated with both insulated and non-insulated templates. When the error results of both runs were plotted against measured EUIs, the buildings for which the uninsulated run had better fit to measured data were assigned to the uninsulated category. Since this categorization could not be backed up by data, its outcome will need to be checked during validation (see Section 6.3.1).
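The error-comparison rule for insulation status can be sketched in a few lines of Python. This is only an illustration: the building IDs and error values are invented, and sending ties to the insulated category is an assumption, since the text only states that buildings whose uninsulated run fit better were labeled uninsulated.

```python
def assign_insulation(errors):
    """Label each pre-1945 building by whichever template fits better.

    errors maps a building ID to a pair:
    (error with insulated template, error with uninsulated template).
    """
    categories = {}
    for building_id, (err_insulated, err_uninsulated) in errors.items():
        if err_uninsulated < err_insulated:
            categories[building_id] = "uninsulated"
        else:
            # Assumption: ties default to the insulated template.
            categories[building_id] = "insulated"
    return categories


# Hypothetical error values from the two simulation runs per building.
errors = {
    "B-001": (0.42, 0.15),
    "B-002": (0.12, 0.38),
    "B-003": (0.20, 0.20),
}
categories = assign_insulation(errors)
```

Because the label is derived from the same measured data later used to judge the model, it is a calibration outcome rather than an independent observation, which is why the text defers its verification to the validation step.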
This step resulted in 8 distinct archetype templates, with divisions based on physical parameters only: 3 age groups (pre-1945, 1946-1980, post-1980; each used to define envelope constructions typical of the time period), use of air conditioning (yes/no), and presence of insulation (yes/no, for the pre-1945 set only). Occupancy-related building properties were assigned uniformly to all templates.

4.3.1 Constructions

The built period was used to define the types of constructions used for the buildings' facades, roofs, floors, and windows. Construction information by time period was derived from Massachusetts Building Codes [49, 50] and Architectural Graphic Standards [51]. Appendix A contains detailed tables with types and thicknesses of the materials used in construction layers for the different templates, along with the overall construction U-values.

4.3.2 Internal Loads and DHW

Initial internal loads were based on the average lighting, appliance, and miscellaneous electric consumption for a sample of low-rise buildings in Massachusetts included in the Residential Energy Consumption Survey (RECS) [7]. The domestic hot water load was an average from the same source.

4.3.3 Schedules

Schedules of lighting, equipment, heating/cooling setpoints and occupancy are an integral part of any energy model and crucial when calibrating to hourly data. In this study, since measured data are available only on a monthly basis, hourly schedules are not as crucial and were assumed to be the same for all buildings, with just the peak loads differing. Schedules were primarily based on the NREL publication of Commercial and Residential Hourly Load Profiles for all TMY3 Locations in the United States [52]. Appendix A lists all the schedules used for the Cambridge model.
4.4 Template Customization

4.4.1 Inference of Parameters

Air Conditioning

The use of air conditioning can generally be detected through a peak in electric load during summer months. In Cambridge's climate with warm but not overly hot summers (the mean daily temperature of the hottest month is about 23 °C), air conditioning is not the norm in low-rise housing. Newer constructions typically have central AC systems installed, but older homes are usually equipped by tenants with non-permanent window ACs. These might be installed in every room of the house or only in certain ones and operated based on occupant preferences and presence in the home; their monthly electric load profiles will tend to rise in the summer months to varying degrees. In an EnergyPlus model of a building, cooling energy is controlled with a setpoint and daily schedule, so it is not easy to simulate irregular use of air conditioning. It is also not possible to simulate a cooling load in just one room when the entire floor is modeled as a single zone, which is typical of urban models. In order to match monthly electric loads over the cooling season by building, it is necessary to separate buildings into those that have AC and those that do not. An automated way to do this is to define a ratio of electricity in the summer months to electricity in the shoulder months (when no extra heating or cooling is needed so the electric use is at the base load), above which the house is considered to use air conditioning on a regular basis. In the Cambridge study, this cutoff ratio was set to 1.5. This step, then, adds a field for ACUSE to the building data table with the formula:

$$ACUSE = \begin{cases} 1 & \text{if } \dfrac{E_{summer}}{E_{shoulder}} \geq 1.5 \\ 0 & \text{otherwise} \end{cases} \quad (4.1)$$

Electric Heat

Analogously to air conditioning, higher electric use in winter than in shoulder months usually indicates use of supplementary electric heat.
Since all the buildings in this study use natural gas as the main heating fuel, it is assumed that this rise corresponds to supplementary heat provided by space heaters. Their presence was identified using the ratio of winter electric energy to that in the shoulder months, with 1.75 as the cutoff. It should be noted that some increase in winter gas use over shoulder months is expected even for buildings that only use it for domestic hot water and cooking, due to cold-weather behavior changes such as longer, warmer showers or more frequent cooking at home.

$$ELECHEAT = \begin{cases} 1 & \text{if } \dfrac{E_{winter}}{E_{shoulder}} \geq 1.75 \\ 0 & \text{otherwise} \end{cases} \quad (4.2)$$

Due to difficulties in modeling a supplementary heat source in the current setup, buildings identified to have electric heat were excluded from further analysis.

Domestic Hot Water

Since the dataset used in the Cambridge study retained only buildings with natural gas as the heating fuel, it was assumed that the majority of those buildings also used gas-fired tanks for domestic hot water (DHW). While gas could also be used for cooking with gas stoves, (1) its consumption for cooking is lower overall than for domestic hot water, (2) it is less predictable (some people might cook several meals a day at home, while others practically never use their stoves), and (3) the numbers of buildings using gas and electric stoves are nearly equal [7]. Thus, this study attributes the natural gas base load in every building to domestic hot water.
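The two ratio tests of Equations 4.1 and 4.2 can be sketched as follows. The month groupings for summer, winter, and shoulder seasons and the use of mean monthly values are assumptions made for this illustration, since the text gives only the cutoff ratios; the monthly electricity values are invented.

```python
from statistics import mean

# Assumed season definitions (calendar month numbers); not specified in the text.
SUMMER = (6, 7, 8)
WINTER = (12, 1, 2)
SHOULDER = (4, 5, 10)


def seasonal_ratio(monthly_kwh, season, shoulder=SHOULDER):
    """Ratio of mean monthly electricity in a season to the shoulder-month mean."""
    base = mean(monthly_kwh[m] for m in shoulder)
    return mean(monthly_kwh[m] for m in season) / base


def infer_flags(monthly_kwh, ac_cutoff=1.5, heat_cutoff=1.75):
    """Apply the cutoffs of Equations 4.1 and 4.2 to one building."""
    acuse = 1 if seasonal_ratio(monthly_kwh, SUMMER) >= ac_cutoff else 0
    elecheat = 1 if seasonal_ratio(monthly_kwh, WINTER) >= heat_cutoff else 0
    return {"ACUSE": acuse, "ELECHEAT": elecheat}


# Hypothetical monthly electricity use (kWh) for one building, keyed by month.
elec = {1: 300, 2: 300, 3: 300, 4: 300, 5: 300, 6: 500,
        7: 600, 8: 550, 9: 350, 10: 300, 11: 300, 12: 320}
flags = infer_flags(elec)
```

In this example the summer mean is 550 kWh against a shoulder mean of 300 kWh, so the building is flagged for air conditioning, while the small winter rise stays well below the 1.75 cutoff.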
EnergyPlus specifies DHW loads in terms of peak flow in cubic meters per second and a fractional use schedule, so the natural gas energy for DHW needs to be converted to volumetric flow using:

$$Q = \dot{m} C_p \Delta T = \rho \dot{V} C_p (T_{sup} - T_{in}) \quad (4.3)$$

$$\dot{V} = \frac{Q}{\rho C_p (T_{sup} - T_{in})} \quad (4.4)$$

where $Q$ = energy input rate (kW), $\dot{V}$ = volumetric flow rate (m³/s), $T_{sup}$ = DHW supply temperature (60 °C), $T_{in}$ = inlet (mains) temperature (13 °C), $C_p$ = specific heat (4185 J/kgK at 37 °C), and $\rho$ = density (993 kg/m³ at 37 °C) (liquid properties from [53]). Since energy is known only on a monthly basis, it can be converted to the peak energy input rate once a daily schedule has been defined:

$$Q_{peak} = \frac{E_{DHW}}{\left( \dfrac{\text{days}}{\text{month}} \right) \times \left( \dfrac{\text{full load hours}}{\text{day}} \right)} \quad (4.5)$$

where the full load hours are the sum of all the hourly fractional values in the daily schedule.

Internal Electric Loads

Another customization that can be done is for internal electric loads. Base electric loads in the shoulder months can be assumed to comprise the typical electricity used for domestic appliances and lighting. Like DHW, EnergyPlus defines these inputs in peak power densities (W/m²) and fractional hourly schedules. Monthly energy values were converted to power densities after defining fractional schedules:

$$Q_{elecload,peak} = \frac{E_{elecload}}{\left( \dfrac{\text{days}}{\text{month}} \right) \times \left( \dfrac{\text{full load hours}}{\text{day}} \right)} \quad (4.6)$$

Since appliances and lighting are rarely separately metered in residences, it is common to lump all internal loads together when modeling residential buildings (e.g., the ASHRAE Handbook of Fundamentals chapter on Residential Cooling and Heating Load Calculations provides equations for sensible and latent internal gains from occupants, lighting, and appliances combined [15]; ASHRAE Standard 90.2: Energy-Efficient Design of Low-Rise Residential Buildings also specifies a combined hourly internal heat gain profile [54]).
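The conversions in Equations 4.3 to 4.6 can be worked through numerically with a short Python sketch. The daily schedule, the monthly energy values, and the function names are hypothetical; the fluid properties and temperatures are those quoted in the text. Two assumptions are made explicit in the comments: kilowatts are converted to watts before dividing by a specific heat given in J/kgK, and no water heater efficiency is applied.

```python
RHO = 993.0     # water density, kg/m3 (value quoted in the text)
CP = 4185.0     # specific heat, J/(kg K) (value quoted in the text)
T_SUP = 60.0    # DHW supply temperature, deg C
T_IN = 13.0     # inlet (mains) temperature, deg C


def full_load_hours(schedule):
    """Sum of the 24 hourly fractional values in a daily schedule."""
    return sum(schedule)


def peak_rate(monthly_energy, days_in_month, schedule):
    """Equations 4.5 and 4.6: monthly energy to peak rate.

    kWh in gives kW out; Wh/m2 in gives W/m2 out.
    """
    return monthly_energy / (days_in_month * full_load_hours(schedule))


def dhw_peak_flow(q_peak_kw):
    """Equation 4.4: peak volumetric flow in m3/s.

    Assumption: kW is converted to W, and the gas input is treated as
    heat delivered to the water (no heater efficiency applied).
    """
    return q_peak_kw * 1000.0 / (RHO * CP * (T_SUP - T_IN))


# Hypothetical daily fractional schedule (24 hourly values, 6.4 full load hours).
schedule = [0.1] * 6 + [0.5] * 2 + [0.2] * 9 + [0.6] * 4 + [0.2] * 3

q_dhw_kw = peak_rate(384.0, 30, schedule)        # 384 kWh of gas in a 30-day month
flow_m3_s = dhw_peak_flow(q_dhw_kw)
elec_w_m2 = peak_rate(2500.0, 30, schedule)      # 2500 Wh/m2 in a 30-day month
```

With these illustrative inputs the DHW peak rate is 2 kW, which corresponds to a peak flow of roughly 1.0 × 10⁻⁵ m³/s for the whole building; dividing by floor area would give the per-square-meter value that a template stores.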
For the Cambridge model, lighting and appliance schedules were defined as identical since both primarily depend upon the occupants' presence at home (with minor differences, e.g., lights are usually fully turned off at night while some appliances, like refrigerators, stay on), and the $Q_{elecload,peak}$ was left as one value encompassing lighting and appliances, since end-use separation was not crucial. If more accurate results are needed, it is possible to split internal electric loads into lighting and appliance components. Since appliance use is more uncertain than lighting, the lighting power density can be set to a maximum and the rest of the internal loads assigned to appliances.

Occupancy

Occupant density can be customized to each building based on the number of bedrooms. As a starting value, it is fair to set the number of occupants equal to the number of bedrooms (based on RECS data for Massachusetts, the median number of occupants/bedroom is 1.0 [7]).

HVAC System Efficiency

If a list of HVAC system types is available, efficiencies can be adjusted. This is important for the current model because the buildings are modeled with Ideal Air Loads for heating and cooling, which implies that the simulation engine returns the demanded heating and cooling amounts without accounting for losses in plant or distribution equipment. For this reason, efficiencies need to be assigned post-simulation. Since this does not allow for calculation of transient efficiencies that vary with part load, an average efficiency value needs to be used. The best such metric for heating equipment is the annual fuel utilization efficiency (AFUE), defined as the ratio of total annual heat output for combustion equipment to the energy of the annual fuel supply. Since AFUE is an annual measure that accounts for transient variations, this is used as the constant efficiency factor for heating equipment in the model.
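As a rough illustration of assigning efficiencies post-simulation, the sketch below divides ideal-load results by a constant heating efficiency and a constant cooling COP to obtain metered-equivalent gas and electricity. The function name, the monthly load values, and the efficiency values of 0.80 and 2.5 are placeholders chosen for easy arithmetic, not results or settings from this study.

```python
def loads_to_metered(heating_load_kwh, cooling_load_kwh, heating_eff, cooling_cop):
    """Convert ideal heating/cooling loads to fuel and electricity use.

    heating_eff is a constant annual efficiency such as AFUE (0 to 1);
    cooling_cop is a constant coefficient of performance.
    """
    gas_kwh = [load / heating_eff for load in heating_load_kwh]
    elec_kwh = [load / cooling_cop for load in cooling_load_kwh]
    return gas_kwh, elec_kwh


# Hypothetical monthly ideal loads (kWh) for three months of one building.
heating = [800.0, 400.0, 0.0]
cooling = [0.0, 50.0, 250.0]
gas, elec = loads_to_metered(heating, cooling, heating_eff=0.80, cooling_cop=2.5)
```

Applying a single factor per building keeps the simulation itself independent of system type, so a corrected system label only requires rescaling the results rather than rerunning the model.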
Cooling: Since it is not possible to accurately differentiate central AC from unitary window ACs from the data, one coefficient of performance (COP) is assigned to all buildings that were labeled with ACUSE = 1. In this study, a COP of 2.6 was assumed as an average of federally mandated standards for residential room air conditioners for the years 2000-2014 [55].

Heating: The building database separated heating system types into Forced Air, Hot Water, Steam, and Space Heat. A typical efficiency was assigned to each of these system types based on residential heating equipment descriptions from the 2008 ASHRAE Handbook: HVAC Equipment and Applications [56], per Table 4.3.

TABLE 4.3: Heating system efficiencies.

System        Efficiency
Forced Air    0.78
Hot Water     0.80
Steam         0.75
Space Heat    0.75

It should be noted that these values carry high uncertainty, since (1) the age of the systems is unknown, (2) distribution losses can vary based on location and condition of piping or ductwork, and (3) the system type might not be assigned correctly in the tax assessment database in the first place.

4.4.2 Probabilistic Estimation of Parameters

The probabilistic estimation procedure was conducted as described in Section 3.4.2, focusing on nondeterministic parameters resulting from occupant behavior and preferences. Specifically, the ones analyzed here were occupant density (OCC) and heating and cooling setpoints (HEATSET, COOLSET). The ranges and step sizes defined for the parametric analysis are listed in Table 4.4.

TABLE 4.4: Parametric analysis settings.

Parameter   Unit        Min       Max          Step Size
OCC         people/m²   1/unit*   2/bedroom*   Variable (6 steps)*
HEATSET     °C          18        24           2
COOLSET     °C          22        28           2

*Occupancy ranges are not uniform but depend on units and bedrooms per building.

4.5 Model Iterations

Using the steps of progressive template customization described above, four iterations of the energy model were simulated, starting from the least to the most customized, as outlined in Table 4.5. The results of these are reported in the following chapter.

TABLE 4.5: Energy model iterations.

Run   Templates   Divisions      Elec Load (W/m²)   DHW (m³/s/m²)   Heat Eff (fraction)   Occup (pp/m²)   Setpoints (H/C)
0     1           N/A            13.3               4.5 × 10⁻⁸      0.85                  0.021           20 °C / 25 °C
1     8           Age/Insul/AC   13.3               4.5 × 10⁻⁸      0.85                  0.021           20 °C / 25 °C
2     8           Age/Insul/AC   Inferred           Inferred        Inferred              Inferred        20 °C / 25 °C
3     8           Age/Insul/AC   Inferred           Inferred        Inferred              Param           Param / Param

Inferred = inferred from building & energy data; Param = varied in the parametric analysis.

Chapter 5
Results for Cambridge, MA Case Study

This chapter presents results for the simulation of 453 buildings in Cambridgeport. Figure 5.1 shows the 3D model for this subset of Cambridge. The buildings in the chosen neighborhood that were not simulated due to lack of energy data for calibration were still included in the 3D file for use as shading surfaces.

FIGURE 5.1: Rendering of the Cambridgeport 3D model.

5.1 Error Metrics

Several error measures were considered for application in the urban context. The chosen metrics to compare results of different model iterations are shown below. The Relative Error (RelErr) is a reflection of the overall annual error in energy use intensity (EUI), while the Goodness of Fit (GOF) accounts for both the error in annual means (through NMBE) and monthly variances (through CVRMSE) (definitions in Section 2.2.1).

$$RelErr = \frac{E_{meas,yr} - E_{sim,yr}}{E_{meas,yr}} \quad (5.1)$$

$$GOF = \sqrt{\frac{w_{CVRMSE}^2 \, CVRMSE^2 + w_{NMBE}^2 \, NMBE^2}{w_{CVRMSE}^2 + w_{NMBE}^2}} \quad (5.2)$$

Here, $w_{CVRMSE}$ and $w_{NMBE}$ are weighting factors such that $w_{CVRMSE} + w_{NMBE} = 1$. They were set equal to $w_{CVRMSE} = 0.25$ and $w_{NMBE} = 0.75$. The CVRMSE and NMBE were calculated based on the monthly total energy use (sum of electricity and gas). An attempt to calculate these statistics for monthly electric and gas use separately and then combine them into one GOF metric resulted in low measured electricity values being too influential.
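The error metrics of this section can be sketched in Python as follows. The NMBE and CVRMSE forms used here divide by the number of months n without a degrees-of-freedom correction, which is an assumption, since their definitions are given in Section 2.2.1 rather than here; the default weights of 0.25 and 0.75 follow the values stated for the weighting factors, and the monthly data are invented.

```python
import math


def nmbe(measured, simulated):
    """Normalized mean bias error (assumed form: divide by n, no correction)."""
    n = len(measured)
    mean_measured = sum(measured) / n
    return sum(m - s for m, s in zip(measured, simulated)) / (n * mean_measured)


def cvrmse(measured, simulated):
    """Coefficient of variation of the RMSE (assumed form: divide by n)."""
    n = len(measured)
    mean_measured = sum(measured) / n
    rmse = math.sqrt(sum((m - s) ** 2 for m, s in zip(measured, simulated)) / n)
    return rmse / mean_measured


def gof(measured, simulated, w_cvrmse=0.25, w_nmbe=0.75):
    """Goodness of Fit combining CVRMSE and NMBE with squared weights."""
    cv = cvrmse(measured, simulated)
    nb = nmbe(measured, simulated)
    numerator = (w_cvrmse * cv) ** 2 + (w_nmbe * nb) ** 2
    return math.sqrt(numerator / (w_cvrmse ** 2 + w_nmbe ** 2))


def rel_err(measured, simulated):
    """Relative error in annual totals, measured minus simulated."""
    return (sum(measured) - sum(simulated)) / sum(measured)


# Hypothetical monthly totals (electricity plus gas) for one building.
measured = [100.0] * 12
simulated = [90.0] * 12
fit = gof(measured, simulated)
annual = rel_err(measured, simulated)
```

With a uniform 10% under-prediction, NMBE, CVRMSE, GOF, and RelErr all equal 0.10; the metrics diverge only when the monthly shape of the simulated profile differs from the measured one, which is the case the GOF is meant to expose.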
To reflect the dominance of natural gas energy over electricity, both NMBE and CVRMSE were therefore calculated based on monthly energy sums.

5.2 Annual and Monthly Simulation Results

The plots below illustrate the annual and monthly results from each simulation run. The runs were set up as described in Section 4.5 and summarized below:

Single Template
Run 0  Baseline run with a single template for all buildings.

Multiple Templates
Run 1  Buildings assigned to 8 initial templates, separated by age group (pre-1945, 1946-1980, post-1980), AC (yes/no), and insulation (yes/no, for the pre-1945 set only).
Run 2  Same templates as above, but each building is customized for occupancy, DHW, electric loads, and heating system efficiency.

Multiple Templates + Parametric Analysis
Run 3  Uses customized input files from Run 2 to conduct parametric runs with varying occupancy density and heating/cooling setpoints. Results presented are for lowest-error runs for each building.

RUN 0: Single Template

This simulation assigns a single template (1946-1980 age group with AC; template details in Appendix A) to all buildings to be used as a baseline for assessing the next iterations. The plots below show measured versus simulated annual energy use intensity (in order of increasing measured EUI), the Goodness of Fit for each building (calculated as shown in Section 5.1) that accounts for monthly agreement between values, and the Relative Error, which represents just the difference in annual energy use. (Note that the vertical axes are cut off so that the largest errors are not displayed in the plots.)

The results for this run show annual energy consumption being under-predicted for the majority of buildings, indicating that the constructions defined in the template are likely better than actual. Additionally, the simulated buildings do not show much variance in EUI compared to the variance exhibited by measured data.
Finally, it is seen that very few buildings have annual errors under 10%.

FIGURE 5.2: Annual results for baseline run with a single template assigned to all buildings.

RUN 1: Initial Templates, Uncustomized

This run separated buildings into 8 templates, categorized by age group and insulation (pre-1945 insulated, pre-1945 uninsulated, 1946-1980, post-1980) and by use of air conditioning (details in Appendix A). This separation results in greater variation in the simulated EUIs and closer agreement to measured data, which demonstrates the large effect insulated/uninsulated facades can have on energy use. However, buildings with the smallest and largest measured EUIs are not well matched by the simulation, showing that constructions and AC are not enough to account for all the variation in energy use.

FIGURE 5.3: Annual results for Run 1 with initial templates generated from annual data.

RUN 2: Multiple Templates, Customized By Building

In this simulation, each of the 8 templates was customized to individual buildings. This was done by changing values in the EnergyPlus input files based on parameters that could be inferred from monthly energy data (Section 4.4.1 details this process). This results in better matching of extremes at both ends of the spectrum and reduces the magnitudes of relative errors.

FIGURE 5.4: Annual results for templates customized to each building based on parameters inferred from monthly data.
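One of the per-building customizations in this run is the heating system efficiency, which, like the cooling COP, is applied as a constant factor to the simulated thermal demand after the simulation. A minimal Python sketch of that conversion is given below. The function and variable names are hypothetical, the demand values are made up, and the sketch assumes that all heating is supplied by gas and all cooling by electricity; only the efficiency values and the COP of 2.6 come from this study.

```python
# Heating efficiencies by system type (Table 4.3) and the assumed cooling COP.
HEATING_EFFICIENCY = {
    "Forced Air": 0.78,
    "Hot Water": 0.80,
    "Steam": 0.75,
    "Space Heat": 0.75,
}
COOLING_COP = 2.6


def delivered_energy(heating_demand, cooling_demand, system_type, has_ac):
    """Convert monthly thermal demands into gas and cooling electricity use.

    heating_demand, cooling_demand: monthly thermal demands in consistent
    energy units (hypothetical inputs taken from simulation output).
    Returns (gas_use, cooling_electricity) as lists of the same length.
    """
    efficiency = HEATING_EFFICIENCY[system_type]
    gas_use = [q / efficiency for q in heating_demand]
    if has_ac:
        cooling_electricity = [q / COOLING_COP for q in cooling_demand]
    else:
        # Buildings without AC are assumed to use no cooling electricity.
        cooling_electricity = [0.0 for _ in cooling_demand]
    return gas_use, cooling_electricity


# Made-up demands for a winter and a summer month.
gas, elec = delivered_energy([78.0, 0.0], [0.0, 26.0], "Forced Air", has_ac=True)
print(gas, elec)
```

Because the factors are constant, this step scales the seasonal profile without changing its shape, which is why fan and pump variations are not captured by it.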
RUN 3: Parametric Analysis for Customized Templates

This final run takes the customized templates from Run 2 and further adjusts them to occupancy parameters by varying the heating setpoints, cooling setpoints (for buildings with AC), and occupant density. All combinations of these three parameters are simulated and the runs with the smallest error (GOF) by building are selected. In some cases, multiple combinations of parameters can result in the same smallest error; in that case, just one of these was chosen for this part of the results, but all such low-error combinations are included later when generating parameter distributions. (Note that the number of buildings simulated in this run is smaller than in the previous three, 404 rather than 453, because buildings inferred to have supplementary electric heating in winter were excluded prior to simulation.)

The plot for this run shows the closest match between measured and simulated consumption. Buildings with EUIs at the low end are still not explained but, overall, the annual relative error and goodness of fit are closer to zero much more often than in previous runs.

FIGURE 5.5: Annual results for lowest-error parametric runs with variable occupancy and setpoints.

5.2.1 Monthly Comparison

Figure 5.7 shows the iterative improvement in the matching of monthly results to measured ones shown in Figure 5.6. Energy bills show that, while the monthly gas use profile keeps a consistent shape for all buildings, the electricity one has much more variation relative to its mean due to its lower weather dependence and higher occupancy influence. A subset of 200 buildings out of the 453 simulated is plotted.
The first two runs, with uniform electric and domestic hot water loads across the building stock, show little in common with the measured data profiles since between-building variation in base loads is not modeled. Plots for Run 2 with customized templates cover the variation in base loads seen in measured data much better. Run 3 shows the same buildings adjusted to better match measurements: electricity use is lowered where necessary by increasing the cooling setpoint, and gas use is changed using occupant density and heating setpoints. Note that the gas use plot for this run matches the measured one more closely than Run 2, since lines of the same shade correspond to the same building.

FIGURE 5.6: Measured monthly gas and electric use intensity.

FIGURE 5.7: Monthly simulation results for a subset of 200 buildings. (A) Run 0 (single template) results. (B) Run 1 (initial templates) results. (C) Run 2 (customized templates) results. (D) Run 3 (parametric simulation) results.

5.2.2 EUI Distribution Comparison

Another way to visually evaluate the results of each run is by comparing the EUI histograms for the neighborhood modeled. Figure 5.8 does this with density plots of simulated EUIs along with the measured EUIs. In the first run, simulated EUIs are closely concentrated around one peak value, since the only difference between buildings is their geometry. As the number of templates increases, the EUI variance increases as well.
In Run 3, the simulated EUIs have comparable variance to measured, although the frequency of low-EUI buildings is still slightly higher than in reality.

FIGURE 5.8: Measured (shaded grey) and simulated (shaded green) EUI distributions for neighborhood, in kWh/m2: Run 0 (top left), Run 1 (top right), Run 2 (bottom left), Run 3 (bottom right).

5.3 Parameter Distributions

5.3.1 Inferred Parameters

The plots below demonstrate the distributions that resulted from parameter inference described in Section 4.4.1. Monthly energy data for each building was used to deterministically estimate the values for internal electric loads (appliances and lighting combined), domestic hot water flow rates, and the presence or absence of air conditioning in every building. Additionally, building tax assessment data was used to estimate the efficiency of the heating system. If the sample of buildings from which these distributions were derived is representative of the larger environment, they can be sampled for assigning parameter values to buildings with unavailable information.

FIGURE 5.9: Distribution of peak electric load intensities.
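Sampling from an inferred distribution such as the one in Figure 5.9 can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this study: the function names, the 2 W/m2 bin width, and the list of inferred peak loads are all hypothetical.

```python
import random
from collections import Counter


def empirical_pmf(values, bin_width):
    """Bin values and return {bin lower edge: relative frequency}."""
    counts = Counter(bin_width * int(v // bin_width) for v in values)
    total = sum(counts.values())
    return {edge: c / total for edge, c in sorted(counts.items())}


def sample_from_pmf(pmf, n, seed=None):
    """Draw n bin lower edges with probability equal to their frequency."""
    rng = random.Random(seed)
    edges = list(pmf)
    weights = [pmf[edge] for edge in edges]
    return rng.choices(edges, weights=weights, k=n)


# Made-up inferred peak electric loads (W/m2) for buildings with energy data.
inferred_loads = [3.1, 3.9, 5.0, 6.4, 7.7, 8.2, 9.5, 12.4]
pmf = empirical_pmf(inferred_loads, bin_width=2)

# Assign a load bin to five buildings that have no energy data.
assigned = sample_from_pmf(pmf, n=5, seed=1)
print(pmf)
print(assigned)
```

The sketch returns the lower edge of each sampled bin; drawing a value uniformly within the bin would be a natural refinement if a continuous input is needed.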
FIGURE 5.10: Distribution of peak domestic hot water flow.

FIGURE 5.11: Distributions of AC use (left) and heating system efficiencies (right).

5.3.2 Probabilistically-Estimated Parameters

Following the procedure described in Section 3.4.2 for the probabilistic estimation of the three chosen parameters (occupancy density, heating and cooling setpoints), a batch of parametric runs was simulated for every building. The cutoff error below which the results of a parametric run were considered acceptable was set equal to a Goodness of Fit of 10%. This cutoff resulted in 49% of buildings having one or more acceptable parametric runs. (A higher error cutoff of GOF = 15% resulted in 64% of buildings being acceptable; raising the cutoff to 30% resulted in 84% of buildings having results within the limit.) The frequencies of combinations of the 3-component vector of parameters (OCC, COOLSET, HEATSET) were compiled into joint probability mass distributions. For the GOF < 10% cutoff, the distributions are illustrated in Figures 5.12 and 5.13.

FIGURE 5.12: Distribution of occupancy density per square meter for GOF < 10.

The occupancy distribution has a wide spread, which seems reasonable considering the large spread in EUIs as well as in the average room size (inverse of the number of rooms per unit floor area) of the sampled buildings. The heating setpoint distribution is maximized at 20 to 22°C, as can be expected. The cooling setpoint distribution increases with setpoint, with a setpoint of 28°C occurring most often. While this is too high to be a typical cooling setpoint, it can be explained by the fact that, among the buildings simulated as having cooling, many do not have their air conditioner turned on constantly. Thus, since cooling schedules were not varied in the simulations, the higher setpoint can be treated as a proxy for reduced hours of operation.

FIGURE 5.13: Cooling (left) and heating (right) setpoint distributions for GOF < 10.

These distributions represent a probabilistic method for the characterization of occupancy-related parameters in urban archetypes using prior information (uniformly-distributed parameter ranges) with measured monthly energy data. These distributions can be sampled with Monte Carlo methods to produce large numbers of possible input combinations. Each of these samples can then be used in a simulation, resulting in a set of simulation results that demonstrates the probabilities of occurrence of different outcomes given the parameter uncertainties.

5.4 Results Summary

Error measures on a per-building and aggregate basis are summarized below. The monthly calibration, represented by the Goodness of Fit metric, improves with each step while its variance decreases. It is interesting to note that between Runs 2 and 3, the mean of the annual Relative Error does not change, but the GOF improves dramatically (from 31.6% to 18.4%). Furthermore, it is seen that the aggregated relative error (i.e., the error in the EUI based on the sums of the energy and square footage for all buildings) is not a very good indication of the fit of the model. The aggregated error is almost the same for Runs 1 and 3, and lower for Run 2, with all under 6%; yet, this metric does not convey the spread in EUIs that was seen in Figure 5.8.

In all of the runs, the absolute values of maximum errors are high: in the 500% range for Run 0, down to 300% for Run 3.
This is primarily due to the buildings with unusually low energy use intensities, for which even differences small in magnitude result in large errors relative to the low measured EUI. As Figure 5.14 shows, most GOF values are concentrated under 50% for all four runs, and in Run 3 the majority does not exceed 20%. During the validation stage, buildings with errors outside 3 standard deviations of the mean error should be inspected to identify potential explanations.

                By Building                               Aggregated
                Annual     Monthly                        Annual
Run  Stat       RelError   NMBE      CVRMSE    GOF        RelError
0    Mean       3.5        3.9       57.1      45.6       16.5 (EL -35.6) (NG 26.5)
     StDev      57         61.8      45.1      43.5
     Min        -464       -505.9    13.4      4.6
     Max        75         82.3      549.7     510.5
1    Mean       -12.7      -0.7      48.7      36.7       -5.5 (EL -18.9) (NG -2.8)
     StDev      49.7       56.0      42.2      43.4
     Min        -457.6     -499.2    14.0      4.8
     Max        77.2       246.8     505.3     499.8
2    Mean       -8.3       -9.1      45.1      31.6       -4.2 (EL -13.8) (NG -2.3)
     StDev      36.4       39.8      31.7      28.4
     Min        -309.3     -337.4    15.5      5.2
     Max        46.4       50.7      382.3     342.2
3    Mean       -8.3       -9.1      34.0      18.4       -5.4 (EL 8.2) (NG -7.7)
     StDev      24.0       26.1      27.0      23.2
     Min        -244.3     -266.5    13.1      4.2
     Max        40.7       44.4      318.3     272.1

TABLE 5.1: Summary of validation results by run, time period considered, and aggregation level. All values are in percent.

FIGURE 5.14: Goodness of fit by individual building for every run. (Note that the vertical axis is truncated, so the highest error points are not displayed.)

Chapter 6
Discussion and Conclusion

6.1 Discussion

The goal of this work was to develop a method for iterative refinement of an urban energy model given information on a set of buildings and the monthly energy measurements.
The method consisted of (1) generation of a set of archetype templates after identifying variables with significant effect on energy use, (2) automated customization of templates for individual buildings based on parameters inferred from energy data, and (3) probabilistic estimation of other unknown parameters through parametric analysis. The results showed a clear improvement in agreement with measured data with each step. Furthermore, the customization and parametric analysis steps enabled the creation of probability mass distributions for a set of parameters representative of the chosen neighborhood. The contributions of this research are summarized below.

6.1.1 Parameter Uncertainty Reduction

In the practice of urban modeling, occupancy-driven parameters have so far typically been defined as uniform across a building archetype, due to lack of definite information and their unpredictability. When validating a model on a spatially-aggregated, annual basis, these parameters cannot be identified and checked for accuracy. However, in order to make urban models that would be representative of reality on smaller spatial and time scales, matching end-use energy consumption at hourly, daily, or monthly intervals becomes important. This work used monthly energy readings for a set of buildings to generate distributions of such difficult-to-predict occupancy-driven parameters, both by direct inference and by probabilistic analysis. The results included distributions for internal electric load density, domestic hot water use, occupant density, and heating and cooling thermostat setpoints.

These distributions are a result of applying measured energy data to prior uncertainty intervals, specified as uniform distributions between a given minimum and maximum. The resulting distributions are more informative and useful than either the uniform distributions or the single values typically assumed in urban modeling.
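The way measured data narrows a uniform prior can be sketched in Python as follows. This is a simplified illustration, not the workflow used in this study: `toy_gof` is a hypothetical stand-in for a full simulation run, the two "true" setpoint pairs are made up, and occupant density is left out because its range depends on each building. The setpoint grids and the 10% cutoff follow the settings reported in Chapters 4 and 5.

```python
from collections import Counter
from itertools import product

# Setpoint grids in degrees C (uniform prior: each value is tried once).
HEATSET = [18, 20, 22, 24]
COOLSET = [22, 24, 26, 28]


def joint_pmf(results_by_building, cutoff=10.0):
    """Joint probability mass function over acceptable combinations.

    results_by_building: list of dicts mapping (heatset, coolset) -> GOF in %.
    Every combination with GOF below the cutoff is counted once per building.
    """
    counts = Counter()
    for gof_by_combo in results_by_building:
        counts.update(c for c, gof in gof_by_combo.items() if gof < cutoff)
    total = sum(counts.values())
    return {combo: n / total for combo, n in counts.items()} if total else {}


def marginal(pmf, index):
    """Marginal distribution of one component of the parameter vector."""
    out = Counter()
    for combo, p in pmf.items():
        out[combo[index]] += p
    return dict(out)


def toy_gof(heat, cool, best_heat, best_cool):
    """Hypothetical error model: GOF grows with distance from the best fit."""
    return 5.0 + 2.0 * abs(heat - best_heat) + 1.0 * abs(cool - best_cool)


# Two made-up buildings, each evaluated on the full grid.
buildings = [(20, 28), (22, 26)]
results = [
    {(h, c): toy_gof(h, c, bh, bc) for h, c in product(HEATSET, COOLSET)}
    for bh, bc in buildings
]
pmf = joint_pmf(results, cutoff=10.0)
heat_marginal = marginal(pmf, 0)
print(heat_marginal)
```

Counting every acceptable combination, rather than only the single best one per building, keeps equally plausible solutions in the distribution, which matches the treatment of tied low-error runs described in Chapter 5.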
This probabilistic method offers an alternative to the typical most-probable-outcome single simulation result by allowing for the generation of sets of simulation results that account for parameter uncertainty. This can be done with Monte Carlo analysis, in which the probability distributions for multiple variables are sampled to produce hundreds of possible input combinations. Each of these samples is then used in a simulation, and the set of simulation results demonstrates the probabilities of different outcomes subject to parameter probabilities. Having a set of probabilistic outcomes, rather than a single one, in turn allows greater understanding of uncertainty when using the building models to evaluate results of interventions in the building stock. This is of value to decision-makers or financing entities when assessing possible retrofit policies.

6.1.2 Generation of Improved Archetypes

Using monthly data for calibration allowed the customization of initial building archetypes with occupancy-related parameters defined in the form of probability distributions. Thus, the outcome of this analysis can be translated into a set of archetype templates defining fixed building properties, along with a set of distributions defining parameters with less certainty and greater variability. These templates and the accompanying distributions can then be considered to comprise a full description of buildings in a given neighborhood, with a mix of deterministic and probabilistic parameters. These descriptions can be used to generate model inputs for buildings in a similar neighborhood for which energy data is unavailable.

6.1.3 Evaluating Consequences of Data Availability

As Aksoezen et al. [22] noted, access to measured energy data can greatly improve generation and characterization of building archetypes for urban modeling.
Aksoezen et al. were primarily referring to annual data since it is most often available; this study looked at whether using monthly data instead of annual would provide even more of an advantage in archetype definition. Annual data was still used for the multivariate regression to identify initial archetype templates; monthly data was used after that to further customize the templates to every building.

This work is the first detailed look at monthly, individual-building results for a bottom-up urban model. It has demonstrated that the same relative error in annual EUI calculated for a district in aggregate (the quantity used most often for validating urban models) can result from models with very different input parameters and monthly energy uses (e.g., Runs 1 and 3). Thus, the usual urban model validation metric is not sufficient to ensure a model representative of actual conditions.

In addition to disaggregated energy data, it is essential to note the importance of accurate building property data. Many of the building properties used in this analysis were taken as givens from the tax assessment database; however, those properties that could be checked using energy data (e.g., heating fuel, presence of air conditioning) showed many inaccuracies. In addition, the property database had inconsistencies among its own data fields (e.g., the number of kitchens listed as very different from the number of apartment units, or the number of stories being different from what a recent photograph of the building shows). This casts doubt on whether the other properties were all documented correctly, and puts into question the results of analysis based on these properties. In this study, many mistakes in the most important fields (such as floor area or number of stories) were corrected manually by consulting an online property database.
In most cases, however, it is impossible to check every piece of information when modeling hundreds of buildings, so consistent and accurate data on building characteristics would be highly beneficial for increased accuracy in urban-scale models in the future.

6.2 Limitations

6.2.1 Geometric Limitations

One limitation of the current workflow is that sloped roofs are not modeled explicitly due to lack of information on the roof shapes, even though the regression model in Chapter 4 identified roof shape to be a significant predictor of building EUIs for this dataset. This information could be gathered from LiDAR imaging of cities. In the case of Cambridge, LiDAR scans exist at least for some areas of the city and have been used to create true 3D models of portions of the city with more accurate roof shapes (Figure 6.1). However, only certain neighborhoods have so far been modeled using LiDAR, and most of the 453 buildings in this case study did not have this information available. Therefore, the 2.5-D model with flat roofs was kept for this study. In the future, a process to incorporate LiDAR data using the same Grasshopper component should be developed.

Another aspect that was not taken into account in the current model was basements. At this stage, all buildings were modeled as having slab-on-grade floors with earth contact, because information on basements was not part of the tax assessment dataset. When this information is available, basements should be modeled with the appropriate constructions and internal loads, which can be inferred based on whether a basement is finished or unfinished.

Finally, all buildings were modeled with the same window-to-wall ratio (0.15), considered representative of this building stock.

FIGURE 6.1: 3D models of Cambridgeport with and without LiDAR data. (A) Detailed roof geometry using LiDAR scans. (B) 3D geometry used in model.
6.2.2 Modeling Simplifications

Several simplifications were incorporated in the EnergyPlus models in order to enable automated model generation and quick simulation at the urban scale. First, each floor was modeled as a single thermal zone, which precludes precise representation of the heat transfer that occurs between rooms of variable occupancy and time of use. However, since the arrangement of rooms by floor is unknown, this simplification is unavoidable.

Second, heating, ventilation, and cooling systems were not explicitly modeled in EnergyPlus with central or unitary equipment. Instead, constant efficiency factors were applied post-simulation to the energy demand for cooling and heating; fan and pump electric use, if present, was not explicitly modeled. (That contribution was automatically included in the measured electric load to which the modeled appliance base loads were customized; however, this does not account for seasonal variations in fan and pump loads.)

Another simplification was the specification of uniform schedules across the building stock. Besides neglecting hourly between-building or between-floor variations in behavior, this also neglects any seasonal variation, such as from students leaving their apartments empty over summer or winter vacations.

6.2.3 Limitations of Results

While the final iteration of the model showed good agreement with measured data (mean monthly Goodness of Fit under 20%, mean annual error under 10%) and appropriate separation into electric and natural gas consumption (about 8% annual error each, when summed over all buildings), it still does not ensure that further breakdowns by end-use within each type are accurate. If more precise results are needed, more input data will need to be provided to the model, such as counts of appliances and light fixtures, operating hours of each, and so on.
Furthermore, although the variables chosen for parametric analysis (occupant density and setpoints) improved the model's Goodness of Fit significantly, there is no guarantee that they were actually the ones responsible for the metered energy variation. As in any calibration problem, multiple solutions to matching measured data are possible; the resulting distributions provide just a subset of such solutions that involve only those three variables. We believe the resulting distributions are useful due to the reportedly large influence of occupants and setpoints on a home's energy use; however, this assumes that the rest of the building's parameters, primarily the physical ones, were specified accurately, which is not necessarily the case. This limitation is addressed further in the Future Work section.

6.3 Future Work

6.3.1 Validation

While the proposed methodology resulted in good agreement between simulated and measured energy use, the energy models were dependent on assumptions that were made during template customization. These assumptions (e.g., use of air conditioning or electric heating, presence of insulation) were inferred from energy data but not yet verified against reality. The next step in validating this methodology would be to collect more information on the actual buildings that were modeled and check whether the assumptions made in template assignment and customization were indeed correct. Since the City of Cambridge does not maintain such records, this validation could be done either by in-person visual inspection of the buildings to identify characteristics visible from the outside or through surveys sent to building occupants. The latter method would be preferred since occupants have better knowledge of their home (e.g., whether it has been renovated or how often air conditioning is used); these surveys should be designed to be short and simple to complete. Validation of this methodology will be performed in the upcoming months.
Another type of validation concerns the application of the resulting parameter distributions to modeling buildings that were not part of the original dataset. A new batch of IDF files will be generated for another subset of residential buildings in Cambridge, with Monte Carlo sampling from the distributions used to specify parameters. Since energy data is available for almost 3,000 more buildings in addition to the ones used in the model of Cambridgeport, these can be used as the testing set for the archetypes and distributions generated from the model.

6.3.2 Methodology Refinement

6.3.2.1 Sensitivity Analysis

The proposed methodology focused on probabilistic estimation of occupancy-related parameters, but is flexible enough to be used for almost any of the thousands of inputs to an EnergyPlus model. As a way to improve on the methodology and justify the set of parameters chosen, extensive sensitivity analysis should be performed. Sensitivity analysis has been frequently used as the first step in calibrating a building energy model in order to separate parameters that are insignificant and can be fixed from more influential ones that should be varied in the calibration process. This sensitivity analysis has not yet been done on an urban scale, but would provide further insight into the proposed methodology.

6.3.2.2 Hourly Energy Data

A further step will be to use the proposed methodology with hourly measured data. This will add several challenges, including logistical (much larger volumes of data, requiring higher computing power for processing) and methodological (choice of parameters and level of detail for calibration). Hourly data should primarily enable the determination of daily schedules, which at the level of this study were fixed. Distributions of fractional energy use at each hour of the day derived from measured data would allow the creation of better daily profiles for urban models.
This would be most valuable for utilities trying to forecast power demand for new or changing neighborhoods, and would be an improvement over current urban models that generate hourly results based on "standard" occupancy and appliance use profiles.

6.3.3 Automation

For this case study, the processes of generating templates and customizing them were partially automated but still often required manual intervention. Initial templates and IDF files were generated in Grasshopper with ArchSim. Afterward, Python scripts were used for customization of EnergyPlus IDF files, initialization of EnergyPlus simulations, extraction of results from EnergyPlus output files, and calculation of calibration errors from these results. GenOpt [57] was used to generate and run the parametric input files for each original IDF. Finally, Microsoft Excel was used to compile results from all runs and generate the plots in Chapter 5. In order to speed up the process and simplify dealing with larger datasets, it would be beneficial to centralize and automate the workflow and interactions between the different software tools to a greater extent. Additionally, the setup of a better data management system and cloud-based simulations should be explored.

6.4 Conclusion

The methodology presented in this work uses publicly available building data and information provided by the utility company to characterize the low-rise residential building stock in Cambridge, Massachusetts. It improves the mean monthly Goodness of Fit by 27 percentage points relative to modeling all buildings with just one template based on building function, and by 18 percentage points relative to the best urban modeling practice of defining archetypes based on annual energy use data. This methodology provides a way to generate a set of templates plus probability distributions that are expected to be useful for modeling buildings within the same stock for which data is not readily available.
The proposed methodology will be validated on a test set of other residences in Cambridge, and results of the validation are expected to guide further research on this subject.

A side goal of this work was to identify gaps in current urban building energy modeling and validation practices and advocate for greater transparency in building energy data sharing. Accurate data on energy use and building thermal properties are essential for developing more reliable representations of building archetypes, and are the only resource to increase confidence in the predictive value of district energy models for use in sustainable urban planning.

Appendix A
Energy Model Templates

A.1 Constructions

Category        Pre-1945, Uninsulated              Pre-1945, Insulated
Exterior Wall   Oriented strand board    0.022     Oriented strand board        0.022
                Air layer                0.1       Cellulose fiber insulation   0.1
                Gypsum board             0.019     Gypsum board                 0.019
                U-value                  0.97      U-value                      0.36
Roof            Asphalt shingles         0.013     Asphalt shingles             0.013
                Plywood wood panels      0.02      Plywood wood panels          0.02
                Gypsum board             0.013     Fiberglass batt              0.08
                U-value                  1.89      Gypsum board                 0.013
                                                   U-value                      0.42
Ground Floor    Concrete                 0.15      Concrete                     0.15
                U-value                  4.17      Extruded polystyrene         0.025
                                                   U-value                      0.93
Fenestration    Single-pane, clear                 Double-pane, low-e, clear
Infiltration    0.8 ACH                            0.6 ACH
Building Archetype Templates Category 1946-1980 Exterior Wall Oriented strand board Fiberglass batt Air layer Gypsum board 0.022 0.05 0.05 0.019 U-value 0.54 Asphalt shingles Plywood wood panels Fiberglass batt Gypsum board 0.013 0.02 0.1 0.013 U-value Ground Floor Concrete Extruded polystyrene U-value Fenestration Infiltration Double-pane, clear 0.6 ACH Roof Post-1980 Oriented strand board Extruded polystyrene Hardwood Fiberglass batt Gypsum board U-value 0.016 0.05 0.04 0.05 0.019 0.30 0.35 Asphalt shingles Plywood wood panels Fiberglass batt Extruded polystyrene Gypsum board U-value 0.013 0.02 0.152 0.05 0.013 0.17 0.15 0.025 0.93 Concrete Extruded polystyrene U-value 0.15 0.05 0.52 Double-pane, low-e, clear 0.5 ACH 72 Appendix A. Building Archetype Templates A.2 Schedules Schedule Weekend Weekday 0i9 0 0.9 09 9 A a9 0, 09 7 0 IiiiI 04 0-20Q2 02 022 02 11 220 02 Occupancy 123 4 5 ij n,,rns 1at i Ciii Light, Equip 0.,05 91 0-9 iiiIII Q2 a2 0203 92 0.8 ,2 2 U Ii 02 04 .0 02 03 C7 7 2 a 0 A W t 2s 07 CA Q. 1,A 00.9 9 1 000 000o1 1234347911121misihi is Hot Water 73 19toi:2 II232 Bibliography [1] U.S. Energy Information Administration. Monthly Energy Review, April 2015. (Visited on 05/01/2015). URL http: //www. eia.gov/totalenergy/data/monthly/. [2] Lukas G. Swan and V. Ismet Ugursal. Modeling of end-use energy consumption in the residential sector: A review of modeling techniques. Renewable and Sustainable Energy Reviews, 13(8): 1819-1835, 2009. doi: 10.1016/j.rser.2008.09.033. [3] American Society of Heating, Refrigeration and Air Conditioning Engineers. ASHRAE Guideline 14-2002 for Measurement of Energy and Demand Savings. 2002. [4] EVO. International Performance Measurement and Verification Protocol. 2007. [5] Institute for Market Transformation. Comparison of U.S. Benchmarking and Transparency Policies, January 2015. Commercial Building Energy URL http://www.imt.org/ resources/detail/comparison-of-commercial-building-benchmarking-policies. 
(Visited on 05/01/2015).
[6] U.S. Energy Information Administration. Commercial Buildings Energy Consumption Survey (CBECS) Data, March 2015. URL http://www.eia.gov/consumption/commercial/data/2012. (Visited on 05/01/2015).
[7] U.S. Energy Information Administration. Residential Energy Consumption Survey (RECS) Data, May 2013. URL http://www.eia.gov/consumption/residential/data/2009. (Visited on 05/01/2015).
[8] B. Howard, L. Parshall, J. Thompson, S. Hammer, J. Dickinson, and V. Modi. Spatial distribution of urban building energy consumption by end use. Energy and Buildings, 45:141-151, 2012. doi: 10.1016/j.enbuild.2011.10.061. URL http://dx.doi.org/10.1016/j.enbuild.2011.10.061.
[9] Green Button Alliance. Green Button. URL http://www.greenbuttondata.org/. (Visited on 05/01/2015).
[10] Stephanie Pincetl and Jacki Murdock. An Interactive Map of LA Energy Consumption, 2013. URL http://sustainablecommunities.environment.ucla.edu/map/. (Visited on 05/01/2015).
[11] City of Chicago. Chicago Energy Data Map, 2015. URL http://energymap.cityofchicago.org/. (Visited on 05/01/2015).
[12] Shem Heiple and David J. Sailor. Using building energy simulation and geospatial modeling techniques to determine high resolution building sector energy consumption profiles. Energy and Buildings, 40(8):1426-1436, 2008. doi: 10.1016/j.enbuild.2008.01.005.
[13] Yu Joe Huang. A Bottom-Up Engineering Estimate of the Aggregate Heating and Cooling Loads of the Entire U.S. Building Stock Prototypical Residential Buildings. Proceedings of the 2000 ACEEE Summer Study on Energy Efficiency in Buildings, pages 135-148, 2000.
[14] Christoph Reinhart and Carlos Cerezo. Urban Building Energy Modeling - A Review of a Nascent Field. 2015.
[15] American Society of Heating, Refrigeration and Air Conditioning Engineers. ASHRAE Handbook: Fundamentals (SI Edition). 2013.
[16] American Society of Heating, Refrigeration and Air Conditioning Engineers.
ASHRAE Standard 140-2011: Standard Method of Test for the Evaluation of Building Energy Analysis Computer Programs. 2011.
[17] Daniel Coakley, Paul Raftery, and Marcus Keane. A review of methods to match building energy simulation models to measured data. Renewable and Sustainable Energy Reviews, 37:123-141, 2014. doi: 10.1016/j.rser.2014.05.007. URL http://dx.doi.org/10.1016/j.rser.2014.05.007.
[18] T. Agami Reddy, Itzhak Maor, and Chanin Panjapornpon. Calibrating Detailed Building Energy Simulation Programs with Measured Data-Part I: General Methodology (RP-1051). HVAC&R Research, 13(2):221-241, 2007.
[19] T. Agami Reddy. Calibrating Detailed Building Energy Simulation Programs with Measured Data-Part II: Application to Three Case Study Office Buildings (RP-1051). HVAC&R Research, 13(2):243-265, 2007.
[20] Michael J. Gestwick and James A. Love. Trial Application of ASHRAE 1051-RP: Calibration Method for Building Energy Simulation. Journal of Building Performance Simulation, 7:346-359, January 2015. doi: 10.1080/19401493.2013.838698. URL http://dx.doi.org/10.1080/19401493.2013.838698.
[21] Enrico Fabrizio and Valentina Monetti. Methodologies and Advancements in the Calibration of Building Energy Models. Energies, 8(4):2548-2574, 2015. doi: 10.3390/en8042548. URL http://www.mdpi.com/1996-1073/8/4/2548/.
[22] Mehmet Aksoezen, Magdalena Daniel, Uta Hassler, and Niklaus Kohler. Building Age As An Indicator For Energy Consumption. Energy & Buildings, 87:74-86, 2015. doi: 10.1016/j.enbuild.2014.10.074. URL http://dx.doi.org/10.1016/j.enbuild.2014.10.074.
[23] Kristina Orehounig, Georgios Mavromatidis, Ralph Evins, Viktor Dorer, and Jan Carmeliet. Predicting Energy Consumption of a Neighborhood Using Building Performance Simulations. 2011.
[24] Adesoji Albert Famuyibo, Aidan Duffy, and Paul Strachan. Developing archetypes for domestic dwellings-An Irish case study. Energy and Buildings, 50:150-157, July 2012. doi: 10.1016/j.enbuild.2012.03.033.
URL http://www.sciencedirect.com/science/article/pii/S0378778812001818.
[25] G. Branco, B. Lachal, P. Gallinelli, and W. Weber. Predicted Versus Observed Heat Consumption Of A Low Energy Multifamily Complex In Switzerland Based On Long-Term Experimental Data. Energy and Buildings, 36(6):543-555, 2004. doi: 10.1016/j.enbuild.2004.01.028.
[26] Aneta Strzalka, Jürgen Bogdahn, Volker Coors, and Ursula Eicker. 3D City Modeling for Urban Scale Heating Energy Demand Forecasting. HVAC&R Research, 17(4):37-41, 2011. doi: 10.1080/10789669.2011.582920.
[27] Daniel Godoy-Shimizu, Jason Palmer, and Nicola Terry. What Can We Learn from the Household Electricity Survey? Buildings, 4(4):737-761, 2014. doi: 10.3390/buildings4040737. URL http://www.mdpi.com/2075-5309/4/4/737/.
[28] Merih Aydinalp-Koksal, V. Ismet Ugursal, and Alan S. Fung. Modeling Of The Appliance, Lighting, And Space-Cooling Energy Consumptions In The Residential Sector Using Neural Networks. Applied Energy, 71:87-110, 2002. doi: 10.1016/S0306-2619(01)00049-6.
[29] M. Santamouris, K. Kapsis, D. Korres, I. Livada, C. Pavlou, and M. N. Assimakopoulos. On The Relation Between The Energy And Social Characteristics Of The Residential Sector. Energy and Buildings, 39(8):893-905, 2007. doi: 10.1016/j.enbuild.2006.11.001.
[30] James Keirstead and Aruna Sivakumar. Using Activity-Based Modeling to Simulate Urban Resource Demands at High Spatial and Temporal Resolutions. Journal of Industrial Ecology, 16(6):889-900, 2012. doi: 10.1111/j.1530-9290.2012.00486.x.
[31] Tarek Rakha, Cody Rose, and Christoph Reinhart. A Framework for Modeling Occupancy Schedules and Local Trips Based on Activity Based Surveys. In 2014 ASHRAE/IBPSA-USA Building Simulation Conference, pages 433-440, Atlanta, GA, 2014.
[32] Praveen Sehrawat and Karen Kensek. Urban Energy Modeling: GIS As An Alternative To BIM. In 2014 ASHRAE/IBPSA-USA Building Simulation Conference, pages 235-242, Atlanta, GA, 2014.
[33] Jimeno A.
Fonseca and Arno Schlueter. Integrated Model For Characterization Of Spatiotemporal Building Energy Consumption Patterns In Neighborhoods And City Districts. Applied Energy, 142:247-265, 2015. doi: 10.1016/j.apenergy.2014.12.068. URL http://dx.doi.org/10.1016/j.apenergy.2014.12.068.
[34] Weather Underground. URL http://www.wunderground.com/about/data.asp. (Visited on 05/01/2015).
[35] Weather Analytics. URL http://www.weatheranalytics.com/. (Visited on 05/01/2015).
[36] U.S. Department of Energy. Weather Data for Simulation, 2014. URL http://apps1.eere.energy.gov/buildings/energyplus/weatherdata-simulation.cfm. (Visited on 05/01/2015).
[37] Ursula Eicker, Romain Nouvel, Eric Duminil, and Volker Coors. Assessing Passive And Active Solar Energy Resources In Cities Using 3D City Models. Energy Procedia, 57:896-905, 2014. doi: 10.1016/j.egypro.2014.10.299. URL http://dx.doi.org/10.1016/j.egypro.2014.10.299.
[38] Carlos Cerezo, Julia Sokol, Christoph Reinhart, and Adil Al-Mumin. Comparison Of Three Methods For The Characterization Of Building Archetypes In Urban Scale Energy Simulation. The Case Study Of A Residential Neighborhood In Kuwait. 2015.
[39] U.S. Department of Energy. EnergyPlus Energy Simulation Software. URL http://apps1.eere.energy.gov/buildings/energyplus/. (Visited on 04/01/2015).
[40] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2014. URL http://www.R-project.org.
[41] Weather Underground. Weather History for Cambridge, MA [KMACAMBR4]. URL http://www.wunderground.com/personal-weather-station/dashboard?ID=KMACAMBR4.
[42] EverSource. URL https://www.eversource.com/. (Visited on 05/01/2015).
[43] City of Cambridge, MA. Geographic Information Systems. URL http://www.cambridgema.gov/GIS/gisdata.aspx. (Visited on 04/01/2015).
[44] Robert McNeel and Associates. Rhinoceros 3D, 2015. URL https://www.rhino3d.com/.
[45] Robert McNeel and Associates. Grasshopper: Algorithmic Modeling for Rhino, 2015. URL http://www.grasshopper3d.com/.
[46] Timur Dogan. ArchSim. URL http://archsim.com/.
[47] City of Cambridge, MA. Residential Data Report. 2015. URL https://data.cambridgema.gov/Assessing/Residential-Data-Report/xiwh-f97k.
[48] Clifford Cook. Phone call. 2015.
[49] Commonwealth of Massachusetts. Commonwealth of Massachusetts State Building Code. Office of the Massachusetts Secretary of State, Michael J. Connolly, 4 edition, 1980. URL http://www.archive.org/details/commonwealthofma1980mass.
[50] Commonwealth of Massachusetts. Massachusetts State Building Code, 780 CMR. Office of the Secretary of the Commonwealth, William F. Galvin, 7 edition, 2008. URL http://www.mass.gov/eopss/consumer-prot-and-bus-lic/license-type/csl/7th-edition-base-september-2008.html.
[51] Charles George Ramsey and Harold Reeve Sleeper. Architectural Graphic Standards. John Wiley & Sons, Inc., New York, NY, 1956.
[52] National Renewable Energy Laboratory. Commercial and Residential Hourly Load Profiles for all TMY3 Locations in the United States, 2015. URL https://catalog.data.gov/dataset/commercial-and-residential-hourly-load-profiles-for-all-tmy3-locations-in-the-united-state-1d21c. (Visited on 04/01/2015).
[53] A.F. Mills. Heat Transfer. Prentice Hall, Upper Saddle River, NJ, 2 edition, 1999.
[54] American Society of Heating, Refrigeration and Air Conditioning Engineers. ASHRAE Standard 90.2: Energy-Efficient Design of Low-Rise Residential Buildings, 2004.
[55] U.S. Department of Energy, Building Technologies Office. Appliance and Equipment Standards: Residential Room Air Conditioners. URL http://www1.eere.energy.gov/buildings/appliancestandards/product.aspx/productid/41.
[56] American Society of Heating, Refrigeration and Air Conditioning Engineers. 2012 ASHRAE Handbook: HVAC Systems and Equipment (SI Edition). 2012.
[57] Lawrence Berkeley National Laboratory. GenOpt: Generic Optimization Program. URL http://simulationresearch.lbl.gov/GO/. (Visited on 04/01/2015).