CPUC/UCLA Scope of Work
DRAFT

Project Title: California Energy Data Repository

Project Description

The California Energy Data Repository is intended to serve as a resource for policy makers, energy planners, State agency staff, ratepayer advocates and public interest researchers. The repository will create a focal point for cross-agency, cross-sectoral and cross-utility partnerships to reduce building energy consumption across the state. It will provide data aggregation services and annual energy statistics that stakeholders can use to support evaluation of and research on state energy consumption and state energy projects, programs and policies.

The repository will be developed in collaboration with staff from the California Public Utilities Commission (CPUC) to maximize the benefits to the CPUC, leverage available data and minimize overlap with existing activities conducted to support the evaluation, measurement and verification mandated by the Commission. This project and any research conducted will be developed in the context of the public processes for other evaluation projects established by the Commission in D.10-29-04.

The data repository project builds on previous work conducted in Los Angeles County and funded by the California Energy Commission and the Southern California Regional Energy Network (through the County of Los Angeles). In addition, the work will be conducted in compliance with the guidelines set forth in the recent CPUC decision on energy data access resulting from the Smart Grid Proceeding. That proceeding set forth a clear set of standards and procedures to improve access to energy data. Under this decision, researchers from accredited universities can access covered information. By connecting available energy data to a wide variety of data sets, then analyzing, aggregating and anonymizing it, researchers for this project will expand the public and ratepayer benefits of this decision.
This project is proposed to be funded under the 2013-2014 Evaluation Plan in the subcategory of “Policy Research / Evaluation Methods,” which had not yet been scoped (see page 11). This project will benefit several stakeholders, including other State agencies, ratepayer advocates, researchers and policy-makers, and will expand research opportunities by:
- Developing cross-utility, standardized data sets to enable local and statewide policy research
- Enabling analysis of overall consumption trends resulting from multiple factors (non-program and program induced) and across space and time
- Providing centralized access to a standardized data set to enable research not currently conducted under the evaluation, measurement and verification mandates of the Commission
- Supporting baseline and longitudinal analysis in support of the CEC Demand Forecast

Overview of Project Deliverables

I. Build Energy Data Repository

Researchers will use cleaned data from the CPUC evaluation consultants, as available, to standardize and geocode energy consumption data from the Southern California Edison, SoCal Gas and San Diego Gas and Electric territories from 2006 through 2014. The full list of data variables is below. Researchers will match this data to other publicly available and proprietary data sets and provide baseline statistical information. Research staff will develop the standards and protocols for providing data and aggregation services and define the scope of these services. Additional, non-CPUC funds will be sought to include POU data.

Researchers will develop a relational geospatial database that links building energy consumption data and customer program information in Southern California to the following data sets:
- County assessor’s building and parcel information (includes building type, year built, square footage, land value, number of bedrooms, etc.)
- Weather and climate zones
- Industrial classification (limited)
- Census characteristics

By linking this consumption and program information to other datasets, the work will build on the information provided by the EE Stats website, adding new layers of information and data that can answer additional questions and extend the utility of the data to stakeholders beyond the CPUC’s direct evaluation activities. The data infrastructure will create a dynamic and responsive platform to support a host of stakeholder uses. These uses include expanding statewide, cross-utility, cross-program/activity public interest energy research, engaging ratepayers and informing policy and program design.

Researchers will collect, geocode and organize the following energy data. The CPUC will provide a portion of the data prepared by its evaluation consultants, using secure transfer for personally identifiable information:
1. Monthly service/site address level consumption data
   a. Including peak, off-peak and shoulder kWh
2. Tariff
3. Climate zone and new Title 24 climate zone
4. Mailing address (to help verify site address, which tends to have many errors)
5. Latitude/longitude, as available
6. Participation in low-income assistance program
7. Participation in solar program
8. EV charging information
9. Cost of electricity
10. Participation in EE and other programs
   a. Program
   b. Installation date
   c. Rebate amount
   d. Projected kWh savings
   e. Sociodemographic data regarding participants, if available

We will include all of these variables in the database to enable flexible future analysis, but will not analyze or evaluate variables six through ten. Data will only be made available in compliance with the guidelines set forth in the recent CPUC decision on energy data access.

Deliverables

At the completion of this phase, researchers will provide:
1. A methods report that outlines the steps for cleaning, standardizing and organizing all data sets
2. A technical report that describes the database infrastructure and its applicability to agency goals, ratepayer advocacy, policy development and public interest research
3. A research report that provides baseline energy and GHG information for Southern California
4. An interactive website that maps the baseline findings described in the research report. This map will follow the same form and template as the LA Energy Atlas.
5. Recommendations for infrastructure, standards and process for ongoing population and refresh of the database

II. Establish Protocols, Standards and Procedures for Data Services

Meeting the State’s aggressive energy and GHG reduction goals requires action by multiple stakeholders, including policy-makers, energy planners, ratepayer advocates and researchers. Many of these stakeholders do not have the resources to conduct the sophisticated analysis their work requires. For example, as the CEC implements Prop 39, it would be helpful for staff to have information about past school energy retrofits and historic school building performance data. An energy data repository would make such data sets available and easily accessible. Further, access to sophisticated yet easy-to-use, web-based energy data tools (aggregated and anonymized to protect privacy) will help numerous stakeholders, from energy efficiency companies to ratepayer advocates, work more effectively and efficiently.

Thus, a fundamental purpose of the data infrastructure described above is to provide data services and basic annual metrics to inform and enhance policy research and cross-utility/statewide analysis. The data repository will provide several benefits, including expanding public interest energy research, streamlining data access and providing aggregated and anonymous energy statistics and interactive energy analysis tools for the public.
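As an illustration of what aggregated and anonymous energy statistics could look like in practice, the following minimal Python sketch sums account-level consumption by geographic zone and withholds any zone with too few accounts. The field names (`zone`, `kwh`), the function name `aggregate_consumption`, the assumption of one record per account and the threshold of 15 accounts are all hypothetical and chosen only for illustration; the actual thresholds and techniques would need to follow the CPUC data access guidelines.

```python
from collections import defaultdict


def aggregate_consumption(records, min_accounts=15):
    """Sum kWh by zone, withholding zones with too few accounts.

    Assumes one record per account; min_accounts is an illustrative
    threshold, not a value taken from any CPUC decision.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        totals[record["zone"]] += record["kwh"]
        counts[record["zone"]] += 1
    # Publish a total only when enough accounts contribute to it.
    return {
        zone: {"accounts": counts[zone], "total_kwh": totals[zone]}
        for zone in totals
        if counts[zone] >= min_accounts
    }


# Hypothetical account-level records: zone A has 20 accounts, zone B has 3.
records = (
    [{"zone": "A", "kwh": 500.0} for _ in range(20)]
    + [{"zone": "B", "kwh": 900.0} for _ in range(3)]
)
summary = aggregate_consumption(records)
print(summary)  # zone B is suppressed because it has fewer than 15 accounts
```

A simple count threshold is only one possible approach; it does not, for example, guard against a single large customer dominating a zone total.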
Through a transparent stakeholder process and in conjunction with CPUC staff, researchers will develop standards for the following:
- Data cleaning and standardization procedures
  o This includes reviewing the literature and working with a community of researchers to develop a transparent process for data standardization and cleaning
- Data security and storage
  o This includes hardware, software and technology for storing data, as well as personnel policies and auditing procedures (if data is shared externally)
- Data privacy
  o This includes thresholds and techniques for data aggregation and anonymization
- Data access
  o This includes standards for allowing access to data and at what level (for example, researchers working for an accredited university can access covered information with an NDA). Access levels would follow CPUC guidelines as laid out in the recent data use case decision
  o It will also include the process and procedures for evaluating proposals

The protocols and standards will undergo stakeholder review every three years. Researchers will implement updates as necessary. No funding is currently available to implement the data services. Researchers are actively seeking additional funding for implementation and to maintain the repository over time.

Deliverables
- A technical manual of policies and procedures for energy data services that addresses all of the concerns listed above and documents the stakeholder-determined policies
- Needs assessment (this need not be elaborate, but it will help clarify the gap this project is trying to fill and how the repository will be used, so that we can gauge success in meeting those objectives over time as the project evolves)
- Public webinar for assessing need and user interest

III. Budget and Staffing

Staffing Needs (100% unless otherwise noted)

Staff includes a Principal Investigator (5% time), a Project Manager and Legislative Liaison, a Contract Manager (60%), a Database Manager/Systems Administrator, a Data Analyst and two GIS Analysts.

Additional Resources

In addition to the above, the project requires hardware and software to develop the database; support for website development; funds to support travel to meetings with the CPUC, utilities and other stakeholders; and funds to purchase proprietary datasets (primarily address-level industrial classification data).

Budget

$1,000,000 (more detail will be provided by task as the project is scoped with the contracts office).