AN ABSTRACT OF THE THESIS OF Jonathan H. Luc for the degree of Master of Science in Mechanical Engineering presented on December 10, 2015. Title: Automated Synthesis of Hybrid Energy Network Optimization using A* & Ensemble Forecasting Abstract approved: ______________________________________________________ Matthew I. Campbell This thesis describes a method for automatically generating and evaluating small-scale off-grid energy systems. Such systems are comprised of components such as solar panels, residential-scale wind turbines, batteries, inverters and charge controllers. These disparate components are assembled into feasible networks through the use of topological optimization and graph grammar rules. The evaluation of candidates is done through the use of A* and Beam search algorithms and ensemble forecasting. Resulting networks are intended to provide users with optimal configurations that are cost-effective and reliable. In the results, the approach is tested with real historical climate data and real user data (from two different residential homes and location) recorded hourly throughout the year. Optimal networks are found to be dependent on the connections between each component within the network, the hour to hour information, and the geographical location of the network. ©Copyright by Jonathan H. Luc December 10, 2015 All Rights Reserved Automated Synthesis of Hybrid Energy Network Optimization using A* & Ensemble Forecasting by Jonathan H. Luc A THESIS Submitted to Oregon State University in partial fulfillment of the requirements for the degree of Master of Science Presented December 10, 2015 Commencement June 2016 Master of Science thesis of Jonathan H. Luc presented on December 10, 2015. APPROVED: Major Professor, representing Mechanical Engineering Head of the School of Mechanical, Industrial and Manufacturing Engineering Dean of the Graduate School I understand that my thesis will become part of the permanent collection of Oregon State University libraries. My signature below authorizes release of my thesis to any reader upon request. Jonathan H. Luc, Author ACKNOWLEDGEMENTS The author would like to thank Dr. Matthew Campbell for being my thesis advisor and guiding me through this research and thesis. The members of my committee Dr. Bryony DuPont, Dr. Ted Brekken, and Dr. Rakesh Gupta for taking time out of their work schedule to help me. To my fellow Design Engineering Lab mates for their suggestions and help. To my friends, for making these few years a learning experience, fun, and enjoyable. Finally to my family for their support and my grandpa for inspiring me to become an engineer and pushing me to learn as much as I can. The author would like to acknowledge Oregon Best, and the Bonneville Power Administration for funding this research. TABLE OF CONTENTS Page 1 Introduction ...…………………….…………………………………………………...….. 1 2 Related Work …………………….……………………………………………………..… 5 3 Artificial Intelligence Search Method ………………………...……….…………….…… 7 3.1 Uniform Cost Search (UCS) …………………………………………....……… 9 3.2 Greedy Search ……………………………………….…….………………….. 11 3.3 A* (A-Star) Search ……………….……………………….…………….…….. 12 3.4 Beam Search ……………………………………………………………….….. 13 4 Methods Used To Find Optimal Network …………………..………………….….……. 15 4.1 Topological Optimization Using GraphSynth …………….………..…....……. 17 4.2 A* With Beam Search ………………….……………….…………….……..... 19 4.3 Ensemble Forecasting ………………..…………………………….…..…........ 20 5 Experimental Setup ……………………………………..……………….……….….…... 24 5.1 User Data ……………………………………………………….….………….. 25 5.2 Wind Data ……………………………………………………………..………. 28 5.3 Solar Data ………………………………………………………….….……..... 28 5.4 Data Assumptions ………………………………………………….…………. 30 6 Results ……………………………………………………………………….………….. 32 6.1 End User #1 ……….……….……………………………………….…………. 33 6.2 End User #2 ……….…………………………………………….…………….. 39 6.3 Location Based Results ………………………………………….……………. 41 TABLE OF CONTENTS (Continued) Page 7 Conclusion …………………..…………………………………………………………... 44 References ……..………………………………………………………………….……….. 46 LIST OF FIGURES Figure Page 1.1: Flowchart of the search process …................................................................................. 3 3.1: Basic search tree diagram …........................................................................................... 8 3.2: Expanded search tree with new successors …................................................................ 9 3.3: Illustration of uniform cost search …............................................................................. 10 3.4: An illustration of greedy search …................................................................................. 12 4.1a: An illustration of the search tree in the search space …............................................... 15 4.1b: An illustration of one of the candidates, candidate #56, in the design space ….......... 16 4.2: A Grammar rule used to grow the graph …................................................................... 18 4.3: Predicted solar insolation and historical average vs time ….......................................... 23 5.1: Screenshot of the raw user data from NEEA …............................................................. 26 5.2: Graph of one end user's electrical demand vs hour of the year …................................. 27 5.3: Monthly average variation of user 1 vs. time of day …................................................. 27 5.4: Screenshot of part of the raw wind data from NOAA …............................................... 28 5.5: Screenshot of part of the raw solar insolation data from NSRBD …............................. 29 6.1.1: User 1's optimal network at 90% off-grid at KPDX, energy vs. time ….................... 36 6.1.2: User 1's optimal network at 90% off-grid at KIAH, energy vs. time …..................... 37 6.1.3: Objective function value vs candidates during the search process …......................... 39 6.3.1: Comparison between the amounts of network components vs user vs location ......... 42 6.3.2: Energy produced vs. user and location ….................................................................... 43 LIST OF EQUATIONS Equations Page 3.1: A* (A-Star) Objective Function …................................................................................. 12 4.1: Equation used to predict future data points …................................................................ 21 5.1: Equation used to calculate wind power from a turbine ….............................................. 28 5.2: Equation used to calculate the solar energy generated by a solar panel ......................... 30 LIST OF TABLES Table Page 5.1: A table of components used in the network …………………………………………... 25 6.1.1: End user #1's optimal networks at various off-grid percentages, at KPDX …............ 33 6.1.2: End user 1's optimal networks at various off-grid percentages, at KIAH …….......… 34 6.2.1: End user #2's optimal network at various off-grid percentages, at KPDX ….............. 40 6.2.2: End user #2's optimal network at various off-grid percentages, at KIAH …….......… 40 1 CHAPTER 1: INTRODUCTION According to the United States Energy Information Administration (EIA) annual energy outlook 2015 report, the demand for electricity in the US is expected to increase in the next few decades [1]. This increase in energy demand raises the question as to how this demand could be met. One solution is to use a hybrid energy networks to generate enough energy to meet these demands. A hybrid energy network is a network containing multiple types of energy generating devices, both renewable and non-renewable. It uses various forms of electrical generation such as electrochemistry, photovoltaic, thermoelectric, and electromagnetic to name a few. The purpose of using multiple types of energy devices is to stabilize the generation of electricity to a predictable and consistent manner that will meet and/or exceed the demand of the end users. For example, if photovoltaic generation is suddenly decreased due to a cloud passing over a solar panel the network would still generate electricity consistently if batteries were used to meet the demand when the solar panels could not. Hybrid energy networks contain devices to store the generated energy for later use. There are also various ways to store the energy generated such as deep cycle batteries, flywheels, compressed air, pump-storage, and thermal energy storage. Other essential components found in these networks include charge controllers, inverters, and electrical wiring. Charge controllers are used to regulate the charging of batteries. This ensure that the batteries are neither over-charged nor completely drained, which may decrease the performance and lifespan of the batteries [2]. There are also different types of charge controllers available. One example is the maximum power point trackers chargers also known as MPPT [3]. MPPTs charge controllers are used to ensure that 2 the energy generating device will charge all batteries evenly when the energy generated by the device is less than the nominal output. Inverters are also essential to the network. Inverters convert direct current generated from the sources into a useable alternating current allowing consumers to connect the network into their homes or facilities [4]. For photovoltaic devices there are several types of inverters to choose from: string inverters, which are located at the end of each solar panel array; module inverters that are located at each solar panel; and central inverters, located at the junction between the source and the line going into the breaker panel. All the different available device types and options for the end user to choose from can become overwhelming, even on a small residential scale. For industrial scales, the size and quantity of the components are scaled up and the complexity of the network can make it difficult to determine the correct network for a particular user. Since there are multiple categories for energy generating devices and multiple sub-categories within each type of device, the optimal cost-effective solution is not trivial due to the magnitude of possible networks that may exists. The goal of a hybrid energy network is to generate electricity reliably for the end user regardless of any external variation to the network, such as weather. Thus the end user can become more self-sufficient in terms of electrical energy needs compared to an end user that received all electrical needs from the grid. This thesis seeks to find a method to automatically generate an optimal, cost effective, hybrid energy network that reliably meets the electrical needs of any end user regardless of the scale and geographical location. The prescribed method will be able to take into consideration the different sub categories with each type of energy generating device as well as the complexity of the network from small residential scale to the industrial scale. 3 Figure 1.1: Flowchart of the search process This developed method takes in information about the different network devices available, weather data at the end user’s location, and seeks to create all the feasible combinations of network, while having components connected in parallel, or in series, or a combination of the two. Figure 1.1 shows the process of this prescribed method. Starting at the green color boxes on the left. The first step is to take the raw data and generate the future weather data points. Then the network devices, end user’s requirements, and seed graph are used to generate new networks called children. Each child contains the same network as the parent graph with the addition of another network component via a grammar rule. After all possible children are generated from their 4 parents, each child is compared to a list of already generated children to ensure that no two networks are the same and evaluated twice, as shown in the center of Figure 1.1. Following the duplicate check the candidate networks are evaluated against the 10 future weather years’ to determine their network objective function value and network capital cost, which is on the rightside of Figure 1.1. Each child is then check to see if it meets the end user’s requirements for a certain off-grid percentage. If the child turns out to meet the requirements then its data of the network and various performance parameters are recorded. A second check is used to determine if the child is the global minimum, or optimal network. If the child does not meet the goal requirements it is stored in a sorted list, based on lowest objective value, along with other children that failed to meet the goal. Once all the children are check against the goal criteria, the first 1000 children are placed into a list called “the beam” to be turned into parents for the next iteration. The remaining candidates are not expanded in the next iteration as their objective values will not be less than the children in the beam of the next iteration. Finally the children in the beam are turned into parents to start the whole cycle over. This cycle of generation, evaluation, and guidance is repeated until a global minimum is found. The global optimal is the optimal network configuration for that end user at that geographical location. By changing the end user input data this method becomes scalable from small cabins to large manufacturing facilities. 5 CHAPTER 2: RELATED WORK There is a substantial and widely varied body of research related to work currently being done on hybrid energy systems. Some of these studies focus on certain parts of the network. Such studies include a fix sized network, system control issues, and representing aspects of particular components within the network. While such work is important but not relevant, one technological innovation that closely parallels the efforts presented here is the work on the software HOMER created by Homer Energy LLC [5]. HOMER is a software that optimizes microgrid designs. It simulates a hybrid microgrid for an entire year and optimizes the design based on the lowest total net present cost [6]. The software incorporates the use of various renewable and non-renewable energy sources in their design. It is however, different than the method used in this thesis. In this thesis, the relationship between the connections of the components are taken into account not just the quantity of those components. The relationship of how the components are connected to each other are as important as the selection of the components themselves, such as parallel and series connections between solar panels or batteries to create a battery bank. Another software that creates solutions that will meet the end user’s electricity demand is Google’s Sunroof project. Google is using their aerial imagery and maps to determine the amount of sunlight one’s rooftop will receive within a year to help choose the best solar installation plan for that user [7]. This software is designed to output solutions for the user based on their rooftop real estate and geographical location. However, it does not provide the user with the exact components needed to achieve the user’s demand. Rather, it estimates and connects the users to service providers in their local area who will determine the network they need, which may or may not be optimal. 6 For wind turbines the power coefficient has a maximum theoretical value of 0.59 but most wind turbines are under this value [8]. It is shown that the power generated by the wind turbines can increase if the power coefficient is increased or if the swept area is increased [8]. This means that a smaller cheaper more efficient turbine may have a higher power output than a larger more expensive turbine. As a result the optimal cost effective turbine is difficult to determine by size and power output alone. Other artificial intelligence search methods have been used to find optimal hybrid energy systems or networks such as power management strategy [9] and using pre-existing software in a risk based approach to find a cost optimal system [10]. Methods such as genetic algorithm, particle swarm, simulated annealing, and even ant colony algorithms have been used to find optimal hybrid networks [11]. There are advantages and disadvantages to some of these methods. In the case of genetic algorithm, it is able to get out of local minimums due to the cross-over and mutation parameter but the disadvantage is the response time and the complexity of the algorithm’s structure [11]. For particle swarm optimization, if there are more options to choose from the efficiency of the search is reduce and potentially worse than a genetic algorithm [11]. 7 CHAPTER 3: ARTIFICIAL INTELLIGENCE SEARCH METHOD Since the network includes multiple components and connections between components the possible network solutions can be quite large. In order to traverse this design search space to find the global optimal solution a computational search method must be used. A search tree is one way of representing this design space. It can represent all the feasible configurations of a network and take into account the connectivity of the components in the network. Each of the methods described in the following sections are the artificial intelligence (AI) methods used in this thesis. These methods are different types of search tree methods. The search is described as an upside down tree because it starts with a “root” or seed node and braches out to one or more nodes. Each node of the search tree represents a potential solution or goal to the problem. The nodes are connected by edges or arcs and the search space is traversed using these edges. In Figure 3.1, the parent node generates two successor or child states, states A and B, which are completely different states and possible solutions, since these states have no successors they would be referred to as leaf nodes since it’s at the bottom end of the tree. 8 Figure 3.1: Basic search tree diagram. As an example, at each iteration the tree is expanded by taking the leaf nodes and generating successor states. The tree in figure 3.1 is expanded at the next iteration to produce Figure 3.2. The new successors or “child” of state A “the parent” is the same as the parent state except for one additional item to child state. States C and D are said to be at the second level, or depth of 2, of the tree whereas the states A and B are at first level in the tree. The children of state B are generated after all the children of state A’s is generated. So if state B had successors or children, they are on the same level of the tree as C and D. By using a search tree, all possible states, also referred to as nodes, can be visited in an orderly manner. 9 Figure 3.2: Expanded search tree with new successors CHAPTER 3.1: Uniform Cost Search (UCS) Uniform Cost Search (UCS) is an uninformed search [12]. This means the search method does not have knowledge about the relationship between the current state and the goal state. It assigns a value to each node it generates via a path-cost function. The value is the cost to get from the start node to the current node. UCS always expands the tree by expanding the lowest cost leaf node. Then expands the second lowest cost leaf node second and so forth [12]. Thus the goal that 10 is found, using this method, has the cheapest path-cost of all the visited nodes. An example of this search is illustrated in Figure 3.3 where each number in the node represents the total cost from the seed node, from first iteration (top left) to the fourth iteration (bottom right). In the first tree in the top left of Figure 3.3 generates three successor nodes, the cheapest of those three nodes is expanded next (top right). The cheapest leaf node is after this expansion is the node labeled #2, therefor it is expanded next (bottom left) and continues on expanding in that manner as seen in the bottom right tree of Figure 3.3. Figure 3.3: Illustration of uniform cost search. In order for uniform cost search to find the first cheapest goal the path cost must monotonically increase from the seed node, otherwise there is no guarantee the goal found is the cheapest. Uniform cost search is thereby complete and optimal [12]. 11 CHAPTER 3.2: Greedy Search Greedy search on the other hand is a search method that only takes into account the cost to the goal node from the current node [12]. This makes the method an informed search method because it makes informed decision about which path to take to minimize the distance between the current and goal nodes. The path-cost from the start node to the goal node decreases to a point where the goal node has a greedy search path-cost of zero. As greedy search expands a node it looks at the cheapest option of the expanded nodes and does not take into consideration the overall cheapest path but rather the cheapest child the parent node generated. An example of this can be seen in Figure 3.4 below, where the search begins at the seed node the first expansion would occur on node 1, resulting in node 6 and 3. The next node to be expanded is at node 3, since it is cheaper than node 6. If the cheapest goal is the successor to node 2, it would not be seen by the search, thus making the search incomplete. Greedy search will find solutions quickly but the optimal solution is not guaranteed to be found, so it is incomplete and not optimal [12]. 12 Figure 3.4: An illustration of greedy search. CHAPTER 3.3: A* (A-Star) Search The A* method is one that takes the advantages of the quickness of greedy search and combines it with the completeness of uniform cost search [12]. In the search tree, the children nodes are generated and A* looks at the child’s path cost from the seed node and the distance to the goal node. It then combines the two values and assigns that value to the child. This type of objective function can be seen in the equation shown below in Equation 3.1. Equation 3.1: A* (A-Star) Objective Function 13 In this objective function the first term is the path cost, c(n), and the second term is the distance to the goal, h(n), and is referred to as the heuristic value of the objective function. The path cost is a monotonically increasing function and the distance to the goal is determine by a heuristic function. The overall value of the objective function will reach a minimum when the heuristic value goes to zero. A* is guaranteed to find the global optimal if the heuristic is admissible [12]. Thus, the solution found is both complete in the sense that it search the entire search space and that the path from the seed node to the goal node is the most efficient path. [12]. CHAPTER 3.4: Beam Search Beam search is a method that keeps a certain amount of successor to be expanded. It selects the best or cheapest node, based that the objective function, and generates the successor nodes from said node. This method is used to reduce computational memory because it only keeps a percentage of the newly generated successors in memory and neglects the rest [13]. The beam is referring to the current level or depth in the tree and all of the “best” nodes on that level. The number of “best” nodes is predetermined by the programmer, which is referred to as the beam width. In beam search, all the parents nodes are expanded so that all the child nodes can be evaluated one level at a time. The search then determines the best child nodes to keep in the beam and all other child nodes are discarded. At the next iteration the children that are in the beam get expanded and all of their children (grand-children to the original parents) are compared to each other at that next level. This process of selecting the elite or best nodes at a particular level to survive to the next round repeats until a goal is found. By cutting the number of nodes down at each level this method uses less memory than ones that keep all the generated children [13]. 14 Since there are multiple methods to use in finding the optimal network. A combination of uniform cost and greedy search is used to help reduce computational time as well as ensuring the optimal solution is found. The combined methods is basically the A* search method. The drawback with using just A* is the amount of memory required to store the all the data of every node that is was visited. If beam search is used along with A* then this memory storage constraint can be mitigated. 15 CHAPTER 4: METHODS USED TO FIND OPTIMAL NETWORK In the context of the problem, each node of the search tree in the search space represents a possible combination of hybrid energy network components (i.e. one inverter, one charge controller, and 6 deep cycle batteries). The edges of the search tree represent the path from one network combination to another. Each node of the search tree is a candidate and is also a graph where each node represents the physical components of the hybrid energy network and the edges connecting the nodes represent the electrical connections between each component. Figure 4.1a shows the search tree graph, where each of the nodes is a potential solution, while Figure 4.1b show one of the candidates and their connections between the components of the network in the design space. Figure 4.1a: An illustration of the search tree in the search space. 16 Figure 4.1b: An illustration of one of the candidates, candidate #56, in the design space. The problem of finding an optimal hybrid energy network is divide into four categories: representation (Figure 4.1b), generation, evaluation, and guidance. These four categories are explained in the following sections. 17 CHAPTER 4.1: TOPOLOGICAL OPTIMIZATION USING GRAPHSYNTH The representation and generation of the network is done using a graph synthesis program called GraphSynth [14]. Since networks may widely differ in their topology the use of a graph to represent the network is a necessity [15]. GraphSynth is used to represent and generate the actual setup of the physical network and the relationship and connections between each component in the network, which is referred to as the host graph. Grammar rules are used to grow the design space graph from start state consisting of one component into various possible networks. These rules have a left-hand-side graph and a righthand-side graph. During a recognition stage in GraphSynth, the rule looks for any location in the host graph that matches the rule’s left-hand-side graph. In the apply stage, the rule replaces the left-hand-side graph with the right-hand-side graph. The host graph is then altered to match the right-hand-side graph of the rule, and thus the network grows in complexity. An example of this can be seen in the Figure 4.2, this grammar rule is used to add one battery unit to the network. The left side of the rule shows a battery bank node and the right side of the rule shows a battery unit connected to the battery bank node. The center graph show what the left and right side graphs have in common. 18 Figure 4.2: A Grammar rule used to grow the graph. Multiple rules can be applied to the initial state. As the number of rules applied to the graph increases, the complexity of the graph also increases. There can be duplicate copies of the same graph depending on the order the rules are applied, this issue will be discussed later. These rules are used to connect each component within the graph to the proper component such that the resulting graph resembles the actual connections in a real hybrid energy network (e.g. solar panels are connected to a charge controller which is then connected to the batteries). There are rules for every component in the network as well as rules that specify whether the components are connected in parallel, series, or a combination. This network optimization is not an integer optimization because two networks can have the same number of components, those networks maybe completely different due to the parallel and series connections between components and will result in different network energy generation values. Using these grammar rules ensures that the generation of the network is a feasible network that it can be built and function properly in the real world. With feasible networks being generated it is now possible to evaluate candidate graphs in the hope to find an optimal one. 19 CHAPTER 4.2: A* WITH BEAM SEARCH The A* with beam search method is used as the objective function for finding networks. The decision to use A* was to ensure the completeness and optimality of the solution found. Due to computational limits, beam search is implemented to ensure the solution would be found within a reasonable amount of time and without overloading the computational memory. In the search space, each candidate in the current level of the tree generates a child candidate using one of the graph grammar rules mentioned earlier. This results in one parent candidate generating one child candidate per grammar rule. A naming scheme is used to identify the child candidates and only the children with unique names are kept to reduce the number of redundant copies of the same network. Once redundant copies are removed the unique candidates can be evaluated. In the network evaluation, the cost of each candidate in the search tree is the capital cost of the network and the heuristic value. The capital cost is the monotonically increase path cost term, c(n), and the heuristic is the h(n) term in the equation shown earlier in Equation 3.1. The heuristic value is the amount of energy not met by the network, translated into a monetary cost (US$). Therefore, the function evaluation is the cost one would end up paying after using the network for one year. The capital cost of the network is monotonically increasing since adding a component will never decrease the cost of said network. The heuristic for this is the amount of money one would spend in electricity by using the hybrid energy network. If the end user were to go completely off-grid the value of the heuristic would be zero. Calculating the value for the heuristic term is done by multiplying the cost of electricity, units of kilowatt-hours per dollar, by the difference in the energy generated and demand multiplied by 20 a multiplier. This multiplier is used to ensure the heuristic reminds admissible. It is tuned by starting at a value of two, which results in a search similar to a breadth-first search, and goes up to a value of fifty which results in solutions being found at large depths in the tree within 24 hours. A multiplier value of fifty was used in this thesis to ensure solutions in the high off-grid ranges were found. The beam used for this search method contains the first 1,000 best preforming candidates in each iteration, based on the objective function value. The other candidates not within the beam are discarded and therefore their successors are not generated. A beam of 1,000 was used after starting with a beam of 10,000 and reducing the beam in increments of 1,000 to finding the smallest beam width that did not change optimal solution found by A*. Thus, computational memory was reduced by using this beam. The energy generated is the sum of each type of source (e.g. solar energy generated plus wind energy equals the total network energy generation). This difference in energy demand and generation is determined using ensemble forecasting, which allows one to predict future weather [16], which is used to calculate energy generation and compare it with the user’s electrical demand. CHAPTER 4.3: ENSEMBLE FORECASTING Ensemble forecasting is a method used to quantify uncertainties of future weather forecast by producing multiple forecasts using slightly different initial conditions [16]. It is similar to a Monte Carlo analysis, which runs a simulation multiple times to obtain a distribution of the output variable [17]. For renewable energy, common data collected include but is not limited to solar radiation, wind speed and direction, and volumetric flow rates of streams and rivers. Gathering multiple years’ worth of data at small intervals can help improve the ensemble forecasting results [18]. 21 Once the data is collected it is then used to compute the average values and a beta random distribution curve at each data point for an entire calendar year. This ensures that the forecasted values follow a seasonal trend and extreme outliers are reduced. In other words one would not see a forecast of freezing temperatures in mid-summer if the average for that data point, from past records, is near 90 ̊F (32 ̊C). Instead, the values forecasted will be around the historical average for that data point. Ensemble forecasting allows us to make a test bed where a candidate network experiences a typical weather cycle for any number of fictional years. This process tests the candidate’s robustness to fluctuations in the climate data and is used as a performance metric. Ensemble forecasting is used to predict the future weather data because it can capture the uncertainty by using a form of Monte Carlo analysis. In order to forecast the value of either wind speed or solar radiation at a particular hour of the year the following equation, shown in Equation 4.1 is used. Equation 4.1: Equation used to predict future data points. The value at any given predicted future hour is determine by the historical average of that data point plus the product of the eigenvalue of that data point and a beta random distribution, multiplied by the eigenvector of that data point. The average (µ) is based on the 10 historical years’ data values at that particular hour of the year. A covariance matrix is calculated from the raw data to get the eigenvectors (Φ) and eigenvalues (λ). This ensures that the hour to hour data are within a reasonable range and do not vary wildly to create extreme weather data points. It also captures 22 the season trends one would expect such as more solar radiation during summer months than the winter months. A beta distribution is used in this prediction equation to introduce the random uncertainty that naturally occurs in nature but also take into account the historical data. The alpha and beta values of the distribution used is one and three, respectively. These values create a high probability that the data point will change by a small amount and a low probability that the data point will change by a large amount. By using the above equation, the predicted data matches the trend of the ten years’ worth of historical solar data as seen in Figure 4.3 below. The figure show the raw data value for the first week of November at Portland from the year 2000 to 2009 as well as the historical average (blue line) and the predicted data (black line). The predicted data follows the trend of the historical data on an hour to hour basis due to the beta distribution but also varies by a small amount due to the randomness. The other faded colors show the different years from 2000 to 2009 solar insolation values at each hour to show that the predicted data follows this trend and lies within the data set. 23 Figure 4.3: Predicted solar insolation and historical average vs time. 24 CHAPTER 5: EXPERIMENTAL SETUP An experiment is setup to show that using A* and beam search with ensemble forecasting one can find the optimal network for any user regardless of the geographical location of the user or network. The expected results is to have two different optimal networks for two users due to their different energy needs as well as different networks for the same user but with a different geographical location. Three data sets are collected, end user’s energy demand, wind speed data, and solar radiation data. Ten years’ worth of solar and wind data are used in this setup for the ensemble forecasting. The data is collected for two locations the airports in Portland, Oregon (KPDX) and Houston, Texas (KIAH). Both locations are chosen due to the completeness of the available historical data and their use of renewable energy networks within those states. The components used in this experimental setup can be found in Table 5.1. Each component’s data is taken from the manufacture’s specification or datasheet. Three different brands of each solar, wind, and battery are used in this setup. The cost and performance of the inverter and charge controller are neglected for this setup as each simulation will use the same inverter and charge controller to reduce the branching factor and thereby reduce the computational time of each simulation. 25 Table 5.1: A table of components used in the network CHAPTER 5.1: USER DATA In the network, the end users are two difference home owners in the Pacific Northwest. The electrical usage or demand is collected and recorded by Northwest Energy Efficiency Alliance (NEEA) [19] as part of their residential building stock assessment field survey. This survey recorded the total energy usage of various devices in households around the Pacific Northwest at intervals of 15 minutes for a full calendar year, as seen in Figure 5.1. The data is then compressed into data points of one hour intervals to match the other data segments. The data is also converted into the same units as the other data segments; kilowatt-hours. 26 Figure 5.1: Screenshot of the raw user data from NEEA. The user demand can be seen as an electrical demand vs hour of the year in Figure 5.2. This figure shows fluctuations on an hourly basis for one of the end users for the first 100 hours of the year. The large fluctuations can make it difficult to find a network that matches this user’s demand. Not only is there variation of demand on an hourly basis but there is also a fluctuation on a monthly basis as seen in Figure 5.3. The methods used in this thesis intends to generate networks to meet this variation of end user demand both on the hourly basis and on a monthly basis. 27 Figure 5.2: Graph of one end user's electrical demand vs hour of the year. Figure 5.3: Monthly average variation of user 1 vs. time of day. 28 CHAPTER 5.2: WIND DATA Wind data is taken from the National Oceanic and Atmospheric Administration (NOAA) [20], at both locations as well. NOAA uses automated surface observing system units that are commonly found at airports, resulting in detail hourly weather measurements. This data, in Figure 5.4, is formatted in the same manner as the user data. Figure 5.4: Screenshot of part of the raw wind data from NOAA. Equation 5.1: Equation used to calculate wind power from a turbine. The equation used to calculate the wind power is shown in Equation 5.1 [21]. The available power from wind is based on the air density (ρ), the swept area of the turbine (A), the cube of the wind velocity, and the power coefficient has a theoretical maximum value of 59% [21]. CHAPTER 5.3: SOLAR DATA Solar data is collected from the National Solar Radiation Database (NSRDB) [22], for both test locations for ten years, the Portland international airport (KPDX) in Oregon and George Bush International Airport (KIAH) in Texas. Solar radiation data used are at one hour intervals at each 29 location for the entire year and are assumed to be the direct and diffused solar radiation received on a horizontal surface during that interval, as seen in Figure 5.5 [22]. These test locations will be used to show that the optimal network also depends on the geographical location of the end users. Figure 5.5: Screenshot of part of the raw solar insolation data from NSRBD. The equation used to calculate the solar power generated is shown in Equation 5.2. This equation is a simplified version of the actual equation used to calculate power of a solar panel [23]. The power generated by a panel is equal to the efficiency (η) of the panel multiplied by the area of the solar cells (A) in units of square meters, and the insolation received (i) in kilowatt per square meter. 30 Equation 5.2: Equation used to calculate the solar energy generated by a solar panel. At each location the solar data is used to predict multiple years’ worth of future solar radiation at one hour intervals based off the past 10 years of solar reading. Future solar data is generated using the mentioned ensemble forecasting method. By taking the 10 years’ worth of data and finding the average solar radiation and adding variation at each data point. A correlation matrix and a beta distribution is used to generate the variation. This ensures that future data points do not vary dramatically from the trend of the 10 years of solar data. CHAPTER 5.4: DATA ASSUMPTIONS It should be noted that some assumptions are made for these components. To simplify the simulation, effects of air temperature on the components are assumed to be negligible. Wind turbine cut-in speed is assumed to be 0 mph (0 kph) and the cut-out speed is not considered, which greatly simplifies the simulation. Wind speeds are held constant for the entire hour of each reading as well as the direction of the wind. Due to practical reasons the power coefficient of the turbines are assumed to be at 0.31. Transmission loses in the wires are also assumed to be negligible. Ramp up and ramp down time of the components are also assumed to be negligible and the same central inverter is used by all candidates. Other assumptions include solar panels not being placed in shaded areas, receive 100% of the solar radiation, and any mismatch in the output voltage between solar panels is neglected. Two users are used in this study, as stated in the “User Data” section above. It is important to note that 31 the user data is only acquired from households in the Pacific Northwest which may not be representative of actual user data in Houston, Texas. Another assumption is the cost of electricity, which is held constant at $0.095/kWh for both locations, which is the average cost of electricity for a household living in Oregon. It is held constant as to not influence the selection of components between locations. The weather at each location is the driving factor as to want components are selected. The air density at sea level is used and is also held constant for both locations. The depth of discharge for all batteries is assumed to be at 20% of the full capacity of the battery and the batteries cannot exceed their rated voltage. Lastly another assumption in this experiment is that there are no restrictions as to the number of components or types this method can select (i.e. property or land to place components or zoning codes). In the next section the results of the optimal networks are shown versus geographical locations. Each end user’s data is kept the same between locations only the weather data is different. The weather data used corresponds to the geographical location of interest. 32 CHAPTER 6: RESULTS In this section, results are gathered in tables showing various off-grid percentages. “Offgrid percent goal” refers to the goal in which the network must meet or exceed to be considered the goal if it was used. Only the global optimal, minimums, are shown. The A* value is the value of the objective function. This value is the total of the capital cost of the network and the product of the energy needed from the grid value and a cost multiplier to penalize the network for needed energy from the grid. In all the results, the cost multiplier is set to value of 50.0 to make the cost of electricity from the grid 50 times more expensive than the cost of electricity from the network. A naming scheme is used to identify each unique network. The names are based on the components and quantity of each brand of component used. The layout of the naming scheme is as follows, I_BB_CC_ET_ = Inverter, Battery Bank, Charge Controller, Energy Type B_(# of Brand 1)_(# of Brand 2)_(# of Brand 3) = Quantity of Battery Units T_(# of Brand 1)_(# of Brand 2)_(# of Brand 3) = Quantity of Wind Turbine Units S_(# of Brand 1)_(# of Brand 2)_(# of Brand 3) = Quantity of Solar Panel units BArcs_(# of batteries in parallel)_(# of batteries in series) = Battery Unit Connections For the battery edge, or arc, connections labeled “BArcs” the first number after the notation refers to the number of leaf nodes in the design space tree. If all batteries in the network are leaf nodes then it is said that all the batteries are connected in parallel, since they all have one arc connecting each battery node to the battery bank node. The second number after the “BArcs” notation refers to the number of batteries that have one arc going to another node and one arc coming from another node, thus the node is in series with the other nodes. 33 CHAPTER 6.1: END USER #1 RESULTS For end user #1, which is a household in the Pacific Northwest, the following networks are automatically synthesized at this location if they wished to be a certain percentage off-grid. In Table 6.1.1 the optimal networks at each off-grid percentage can be seen for Portland, Oregon. Table 6.1.1: End user #1's optimal networks at various off-grid percentages, at KPDX. It can be seen that the optimal goal for 80% off-grid is not just doubling the components of the 40% network. This is due to the connections between components. In the 80% off-grid optimal network, there seven batteries connected in series while the 40% off-grid optimal network contains only two batteries connected in series. The jump from 80% off-grid to 90% off-grid is dramatically different as one can see the different brands of battery and solar panels being used to reach a cost effective network. One can see in the above table that the number of solar panels chosen has increased from 12 panels at 80% off-grid to 66 panels at 90%. This increase might suggest that adding another turbine is more expensive and less beneficial than adding solar panels. 34 Table 6.1.2: End user 1's optimal networks at various off-grid percentages, at KIAH. In tables 6.1.2, the results of the optimal network for user #1 at the Houston location can be seen. If one examines the two tables above of end user #1, the optimal networks at each offgrid percentage is different. Not only are the optimal networks different but the quantity of the brands of components used are also difference. For example, the 90% off-grid optimal network at Portland uses solar panels of brand #3, but the network in Houston does not contain any solar panels of brand #3. The cost of the networks at the two locations show that the user would pay less for the optimal networks if end user #1 lived in Houston while still demanding the same amount of electrical energy. If one looks at the cost of the network versus percent off-grid, the cost increases while the A* value decreases. The A* value is the objective function value as mention earlier. The difference between the cost and A* value is the heuristic value, which is decreasing as the off-grid percentage increases, as expected, and would reach a value of zero if the network was 100% off-grid. 35 In Figure 6.1.1 and 6.1.2 below, end user #1’s optimal networks output for 90% off-grid is shown at the two locations. These graph show the energy produced by the solar, wind, and battery devices on the Y-axis versus the hour of the year on the X-axis. The X-axis shows the first week of November for a non-leap year, the hour starts at 7297th hour of the year, which is the first hour for the month of November. One can see the energy production and consumption of the optimal network with the rise and fall of the light blue and green lines, respectively. The rise and fall of the yellow line is the charging and discharging of the battery bank of the network. The flats at the top of the yellow lines indicates the battery bank is fully charged, while the flats at the bottom indicate the battery bank is at its minimum charge of 20%. This 20% depth of discharge is the assumption stated earlier in the data assumption section. The energy production of the wind turbines and solar panels depicted by the gray and orange lines, respectively. As expected, the solar energy production peaks around the middle of the day and the battery bank charge rises up until that peak due to the energy generated exceeding the energy demanded. When the energy generated falls below the energy demanded the battery bank is discharged to supply the net difference, as depicted by the black line falling below the green line in Figure 6.1.1 and 6.1.2. When the battery bank is discharged to 20% of its maximum charge and the energy demanded is not met the net difference is taken up by the grid, which results in a lower off-grid percentage. Since this network is at 90% off-grid it is expected that 10% of the energy needed must come from the grid, blue line, during the year. 36 Figure 6.1.1: User 1's optimal network at 90% off-grid at KPDX, energy vs. time. 37 Figure 6.1.2: User 1's optimal network at 90% off-grid at KIAH, energy vs. time. 38 When comparing Figure 6.1.1 and 6.1.2, one can see the difference in the magnitude of energy produced between the two locations. Other differences include the charge and discharge of the battery bank, the solar and wind energy produced. All of these different fluctuations resulted in a different optimal network at 90% off-grid, even though the user demand remained the same between the two locations. On average, the Houston location produced more energy the Portland location for user #1. The objective function value and heuristic values during the search process can be seen in Figure 6.1.3 below. This show the network cost, heuristic value, and the A* value, all in terms of dollars, of each unique candidate during the search process in order from the first candidate to the 65,000th candidate but due to legibility only the first 4,000 are shown. The blue line represents the A* value while the orange and gray lines represent the heuristic and network cost, respectively. As the search generates new candidates only the first 1,000 cheapest A* value are kept in the beam as mentioned earlier in chapter 3. This explains the sudden drop in the A* value, as the 2,001st candidate is most likely the successor of the 1,000th candidate. With each iteration the A* value drops and each A* value of the i-th plus one iteration is cheaper than the i-th iteration. As the search continues with each iteration the entire beams network cost goes up but the heuristic goes down. If the A* value intersects the network cost line then the optimal solution for 100% off-grid is found since the heuristic would reach a value of zero. 39 Figure 6.1.3: Objective function value vs candidates during the search process. CHAPTER 6.2: END USER #2 RESULTS In table 6.2.1, the optimal networks for end user #2 at Portland is completely different than that of end user #1. The optimal network for each off-grid percentage is different for user #2 and is not as simple as adding another component to the system. Take for example in table 6.2.1, the 50% and 40% off-grid networks. The 40% off-grid network contains five batteries connected in parallel while the 50% off-grid network contains nine batteries connected in series. This shows that the optimal network depends on the topology of the network not just the quantity of components in the network. If those five batteries were connected in series it would create a suboptimal network for the user and possibly resulting in a lower off-grid percentage. 40 Table 6.2.1: End user #2's optimal network at various off-grid percentages, at KPDX. Table 6.2.2: End user #2's optimal network at various off-grid percentages, at KIAH In table 6.2.2, the same user has a different set of optimal networks when said user is located in Houston. The solutions for the networks are also cheaper for a location in Houston that in Portland. This could be due to more solar insolation or higher average wind speeds. What this does show is that the geographical location does matter when selecting network components. At both locations the 80 and 90 percent off-grid networks were not found within 15 hours of computational time or 100 iterations, which translates to 100 network components. This could be 41 caused by the assumptions made about the wind turbines and solar panels. Solution for these network exist but may require more computational memory and time. CHAPTER 6.3: LOCATION BASED RESULTS In Figure 6.3.1, one can see the number of components in the optimal network at various off-grid percentages as well as the location. This graph shows that different users require a different amount of network components to achieve an optimal network. It also shows that the same user may require a different number of network components if the location of said user varies. This is expected as each user will have a different energy demand profile which will generate networks with different amounts of components. In general the trend in Figure 6.3.1 show that users located in Portland will require more network components than in Houston. The jump at 70% off-grid for user #2 could be a result of the efficiencies and cost between the solar panels and wind turbines. Where X number of turbines could produce less energy when compared to Y number of solar panels or than the panels are a magnitude cheaper than the turbines. 42 Figure 6.3.1: Comparison between the amounts of network components vs user vs location. Depending on the user and their location the amount of energy produced are different even for the same user as seen in Figure 6.3.2. This graph show the output data of the optimal networks for 70% off-grid for both users at both locations. The black and orange line represents end user #1 at Houston and Portland, respectively. While the yellow and blue line represents end user #2 at Houston and Portland. From this graph, one can see the different networks generating different amounts of energy to meet the user’s 70% off-grid requirement at each hour for the first three days of a typical future month of November. The energy generated for user #2 is higher than user #1 at each location, which is expected since the energy demand data for user #2 is higher than user #1. 43 Figure 6.3.2: Energy produced vs. user and location. 44 CHAPTER 7: CONCLUSION Hybrid energy networks are used to generate electricity to help reduce the end users dependency on the grid. They consist of various renewable and non-renewable energy generating devices that can vary between users as shown in the results section. The networks can also vary depending on the geographical location of the user. Ensemble forecasting is used to solve the issue of end user geographical location. By using ten years of historical data at each location (Portland and Houston), it is possible to predict future weather data that closely matched the historical average on an hourly basis. Ensemble forecasting also captures the seasonal trends of the weather such as higher solar insolation during the summer month compared to the winter months. Robustness of the generated networks can be tested even further by changing the ensemble forecasting to generate 100 future years to rigorously test the networks. In order to reduce the amount of computational time only ten years for ensemble forecasting was used as opposed to 100 years. The automated generation of potential networks is done using GraphSynth. This allows all feasible network connections to be represented in the generation of the network. Grammar rules were used to expand the networks in a feasible manner. With this capability to predict future weather data and to generate networks, it is possible to evaluate the performance of each network using A* with a beam. By using an artificial intelligence search method it was possible to automatically generate and find the optimal network at various off-grid percentages regardless of the end user and their geographical location. The optimal networks for end user #1 were completely different than that of end user #2 at both the Portland and Houston location. This is most likely due to the end user’s demand being 45 different. Optimal networks found at each percentage interval are the cheapest network for that percent off-grid interval, in terms of capital cost for one year of ownership. Each increase in offgrid percentage resulted in a different optimal network for the same user. When comparing the two users at the same off-grid percentage the optimal network is also different. The quantity of each component as well as the number of parallel and series connections for the batteries is different. This suggests that the search method used can find solutions tailored to each unique user’s energy demand profile. For the same user, some optimal networks had components connecting in parallel while others had connections in series. This shows that the topology of the network need to be accounted for. Developing the capability to analyze the topology of a network is important for finding the optimal network. The location of the end users also changed the optimal network configurations. In the Portland, Oregon the number of solar panels, wind turbines, and batteries for each optimal network were different than the ones at the Houston, Texas location. Both users had different optimal networks at each location and the energy generated at these locations were also different. This suggest that the geographical location of the user affects the optimal network needed to meet the user’s energy demand. The thesis defends the notion that it is possible to automatically synthesize the optimal hybrid energy network for any end user while incorporating the geographical location, the topology of the network, and their unique energy profiles. In order to find the optimal network for any end user. This method can be applied to systems of different scales by simple changing the sizes or scale of the system components used and the end user’s electrical energy profile. Thus, industrial components can be used, instead of residential one, to evaluate industrial scale networks. 46 REFERENCES [1] U.S. Energy Information Administration, "Annual Energy Outlook 2015," 2015. [2] J. Woodworth, M. Thomas, J. Stevens, S. Harrington, J. Dunlop and M. Swamy, "Evaluation of the Batteries and Charge Controllers in Small Stand-Alone Photovoltaic Systems," in IEEE First World Conference on Photovoltaic Energy Conversion,conference record of the Twenty Fourth IEEE Photovoltaic Specialists Conference, Waikoloa, 1994. [3] V. Salas, E. Olias, A. Barrado and A. Lazaro, "Review of the maximum power point tracking algorithms for stand-alone photovoltaic systems," Solar Energy Materials and Solar Cells, vol. 90, no. 11, pp. 1555-1578, 2006. [4] J. Enslin and P. Heskes, "Harmonic Interaction Between a Large Number of Distributed Power Inverters and the Distribution Network," IEEE Transactions on Power Electronics, vol. 19, no. 6, pp. 1586- 1593, 2004. [5] HOMER, "HOMER - Hybrid Renewable and Distributed Generation System Design Software," [Online]. Available: http://www.homerenergy.com/. [Accessed 25 Jan 2015]. [6] T. G. P. L. P. Lambert, "Micropower system modeling with HOMER," in Integration of Alternative Sources of Energy, John Wiley & Sons, 2006, pp. 379 - 418. [7] Google, "Project Sunroof," [Online]. Available: https://www.google.com/get/sunroof/about/. [Accessed October 2015]. [8] J. Libii, "Comparing the calculated coefficients of performance of a class of wind turbines that produce power between 330 kW and 7,500 kW," World Transactions on Engineering and Technology Education, vol. 11, no. 1, pp. 36 - 40, 2013. [9] M. Trifkovic, M. Sheikhzadeh and K. Nigim, "Modeling and Control of a Renewable Hybrid Energy System With Hydrogen Storage," IEEE Transactions on Control Systems Technology, vol. 22, no. 1, pp. 169 - 179, 2004. [10] R. Huang, S. Low, U. Topcu and K. Chandy, "Optimal Design of Hybrid Energy System with PV/ Wind Turbine/ Storage: A Case Study," IEEE International Conference on Smart Grid Communications, pp. 511 516, October 2011. [11] O. Erdinc and M. Uzunoglu, "Optimum design of hybrid renewable energy systems: Overview of different approaches," Renewable and Sustainable Energy Reviews, vol. 16, pp. 1412 - 1425, 2012. [12] S. Russel and P. Norvig, Artificial Intelligence a Modern Approach, Englewood Cliffs: Prentice Hall, 1994. [13] P. Norvig, Paradigms of Artificial Intelligence Programming: Case Studies in Common LISP, San Mateo: Kaufman, 1991. [14] M. Campbell, "GraphSynth," UT Austin Design Engineering Lab, [Online]. Available: http://designengrlab.github.io/GraphSynth/. [15] M. Campbell, "A Graph Grammar Methodology for Generative Systems," University of Texas, 2009. [16] T. Gneiting and A. E. Raftery, "Weather Forecasting with Ensemble Methods," Science, vol. 310, no. 5746, pp. 248-249, October 2005. [17] K. H. D. Binder, Monte Carlo Simulation in Statistical Physics, 5 ed., Springer-Verlag Berlin Heidelberg, 2010. [18] H. A. Nielsen, H. Madsen, T. S. Nielsen, J. Badger, G. Giebel, L. Landberg, K. Sattler and H. Feddersen, "Wind Power Ensemble Forecasting," in Proceedings of the 2004 Global Windpower Conference and Exhibition, 2004. [19] Northwest Energy Efficiency Alliance, "Residential Building Stock Assessment," 20011. [Online]. Available: http://neea.org/resource-center/regional-data-resources/residential-building-stock-assessment. [Accessed 2014]. [20] National Oceanic and Atmospheric Administration, "Automated Surface Observing System," [Online]. Available: http://www.ncdc.noaa.gov/data-access/land-based-station-data/land-based-datasets/automatedsurface-observing-system-asos. [Accessed 2014]. 47 [21] Royal Academy of Engineering, "Wind Turbine Power Calculations," Royal Academy of Engineering, 2003. [22] National Renewable Energy Laboratory, "National Solar Radiation Database," National Oceanic and Atmospheric Administration, 2012. [23] P. Nema, R. Nema and S. Rangnekar, "A current and future state of art development of hybrid energy system using wind and PV-solar: A review," Renewable & Sustainable Energy Reviews, vol. 13, pp. 2096 - 2102, 24 October 2008. 48 This page was intentionally left blank