Optimal Electricity Supply Bidding by Markov Decision Process
Authors: Haili Song, Chen-Ching Liu, Jacques Lawarree, & Robert Dahlgren
Presentation Review By: Feng Gao, Esteban Gil, & Kory Hedman
IE 513 Analysis of Stochastic Systems
Professor Sarah Ryan
April 28, 2005

Outline
  Summary of the previous presentation
  Numerical Simulation Case Study
    Example 1: Production Limit
    Example 2: Market Power
  Conclusion
  Extension

Summary of previous presentation
  Solution procedure: value iteration
  Model validation:
    Accumulate actual data and observations from the market over a period of time
    Estimated rewards r(i,j,a) and actual rewards w(i,j,a) are compared by linear regression
  3 suppliers: GenCoA, GenCoB, and GenCoC
  GenCoA is the decision-maker and uses the Markov Decision Process technique
    GenCoA: 1 generating unit
    GenCoB: 2 generating units
    GenCoC: 2 generating units
  Planning horizon: 7 days (the bid decision for the next day considers the entire week ahead)
  GenCoA chooses its decision from a set of prespecified decision options
  GenCoA models the bidding behavior of GenCoB and GenCoC as prices, quantities, and associated probabilities, based on its own knowledge and information

Comparison of Two Examples
  With a production limit, the bidder is concerned with saving resources for future, more expensive periods (Example 1)
  With market power, the bidder is concerned with influencing the future spot price so as to maximize future profit (Example 2)
  Knowing whether the bidder has market power or a production limit is crucial, because it changes how profits relate to bidding decisions and remaining resources

Example 1
  The decision-maker has a production limit over the planning horizon
  210 states, with 10 possible production levels between 0 and the production limit
  The parameters for state 1 are listed in Table III
  [Table: GenCoA's cost, production limit, and decision options]
  [Table: GenCoB's decision options in state 1]
  Value iteration algorithm is used
  to calculate the optimal decision for each state and each stage
  One rational strategy, the Peak-Load Strategy, is chosen for comparison with the BIDS results
  Peak-Load Strategy: the decision-maker bids at cost, divides the production limit evenly over the week, and sells only at the peak-load period of each day
  The results show that the expected rewards from BIDS are consistently higher than those of the Peak-Load Strategy
  The optimal strategy is time dependent because of the production limit
  In some states the optimal decision is not to sell, but to save the resources for more profitable days

Example 2
  The decision-maker has market power: it can manipulate its bid to influence the spot price
  The decision-maker has no production limit
  The decision-maker chooses the bid that maximizes the expected reward over the planning horizon
  The MDP model has 21 states, since there is no production limit
  The Daily Maximum Strategy and the BIDS results are compared
  The Daily Maximum Strategy maximizes the daily reward without considering how the bid affects the market trend

Example 2 Results
  The Daily Maximum Strategy is time independent: the decision-maker makes the same decision whenever the system is in the same state
  The BIDS results are time dependent: they take into account how current bidding affects future spot prices
  Because the decision-maker has market power:
    Bidding at high prices gives a high immediate reward, and the system moves to high-price states with higher probability
    Bidding at low prices gives a lower immediate reward, and the system moves to low-price states with higher probability
  The decision-maker should not rely on daily profit alone when making decisions

Conclusion
  The spot market bidding decision-making problem is formulated as a Markov Decision Process
  An algorithm is developed to calculate the transition probabilities and the reward function
  The value iteration method is applied to solve the MDP
  Implementation and
  case study
    Three GenCos; GenCoA is the decision-maker
    5 generators among the 3 GenCos
    Two examples: production limit without market power, and market power without production limit

Extension
  A risk-neutral decision-maker is assumed. To incorporate risk attitude, the variance of the decision-maker's profits should be brought into the bidding decisions. A slightly different formulation, the risk-sensitive MDP, can be used to solve this problem.
  The objective is to maximize the total expected future reward over the 7-day planning horizon. The 7-day horizon is chosen based on the accuracy of short-term load forecasting; how to choose an optimal planning horizon is left for future discussion.
  The competitors' strategies are modeled as prices, quantities, and probabilities from the decision-maker's point of view. However, game theory should be incorporated to model how competitors' actions are affected by the decision-maker's decisions as well.

Questions?
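
Supplementary sketch: value iteration over the planning horizon
  The slides state that value iteration gives the optimal decision for each state and each stage, and that the result is time dependent. The following is a minimal sketch of that idea as finite-horizon backward induction, not the authors' implementation. The two states (low and high price), the two actions (bid low, bid high), and every number in P and R are hypothetical values chosen only for illustration; the function name is also an assumption.

```python
import numpy as np

def finite_horizon_value_iteration(P, R, horizon):
    """Backward induction for a finite-horizon MDP (illustrative sketch).

    P[a, i, j]: probability of moving from state i to state j under action a.
    R[a, i, j]: reward r(i, j, a) earned on that transition.
    Returns V with shape (horizon + 1, n_states) and
    policy with shape (horizon, n_states).
    """
    n_actions, n_states, _ = P.shape
    V = np.zeros((horizon + 1, n_states))      # V[horizon] = 0: nothing earned after the last day
    policy = np.zeros((horizon, n_states), dtype=int)
    expected_reward = (P * R).sum(axis=2)      # expected one-day reward, shape (n_actions, n_states)
    for t in range(horizon - 1, -1, -1):       # work backward from the last day
        Q = expected_reward + P @ V[t + 1]     # reward today plus expected value of tomorrow
        V[t] = Q.max(axis=0)
        policy[t] = Q.argmax(axis=0)
    return V, policy

# Hypothetical toy data. States: 0 = low price, 1 = high price.
# Actions: 0 = bid low, 1 = bid high.
P = np.array([
    [[0.8, 0.2], [0.6, 0.4]],   # bid low: the market drifts toward low prices
    [[0.4, 0.6], [0.2, 0.8]],   # bid high: the market drifts toward high prices
])
R = np.array([
    [[3.0, 3.0], [6.0, 6.0]],   # rewards when bidding low
    [[2.0, 2.0], [7.0, 7.0]],   # rewards when bidding high
])

V, policy = finite_horizon_value_iteration(P, R, horizon=7)
print(policy)   # one row per day, one column per state
```

  In this toy model the decision in the low-price state changes with the stage: on the last day the bid with the larger one-day reward is chosen, while on earlier days the other bid is chosen because it makes high-price states more likely later in the week. This mirrors the contrast drawn in the slides between a time-independent daily-maximum rule and a time-dependent optimal policy.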