Optimal Electricity Supply Bidding by Markov Decision Process
Authors: Haili Song, Chen-Ching Liu, Jacques Lawarree, & Robert Dahlgren
Presentation Review By:
Feng Gao, Esteban Gil, & Kory Hedman
IE 513 Analysis of Stochastic Systems
Professor Sarah Ryan
April 28, 2005
Outline
 Summary of the previous presentation
 Numerical Simulation Case Study
   Example 1 – Production Limit
   Example 2 – Market Power
 Conclusion
 Extension
Summary of previous presentation
 Solution Procedure
 Value Iteration
 Model Validation
 Accumulate actual data and observations from the
market over a period of time
 Estimated rewards r(i,j,a) and actual rewards
w(i,j,a) are analyzed by linear regression.
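As a minimal sketch of the validation step above, the following fits a least-squares line of actual rewards against estimated rewards. The data values and the helper name `fit_line` are hypothetical, and the slides do not say which regression diagnostics the authors used; a slope near 1 and an intercept near 0 would simply suggest that the estimates track the observations.

```python
def fit_line(estimated, actual):
    """Ordinary least squares fit: actual ~ slope * estimated + intercept."""
    n = len(estimated)
    mean_x = sum(estimated) / n
    mean_y = sum(actual) / n
    sxx = sum((x - mean_x) ** 2 for x in estimated)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(estimated, actual))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical samples: estimated rewards r(i,j,a) and observed rewards w(i,j,a)
estimated = [100.0, 200.0, 300.0, 400.0]
actual = [110.0, 190.0, 310.0, 390.0]

slope, intercept = fit_line(estimated, actual)
print(f"w = {slope:.2f} * r + {intercept:.2f}")
```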
Summary of previous presentation
 3 suppliers: GenCoA, GenCoB, and GenCoC
 GenCoA is the decision maker using the Markov Decision Process technique
 GenCoA: 1 generating unit
 GenCoB: 2 generating units
 GenCoC: 2 generating units
 Planning Horizon: 7 days (the bid decision for the next day considers the entire week ahead)
Summary of previous presentation
 GenCoA makes a decision from a set of pre-specified decision options
 GenCoA models the bidding behavior of GenCoB and GenCoC by prices, quantities, and the associated probabilities, based on its own knowledge and information
Comparison of Two Examples
 With a production limit, the bidder is concerned with saving resources for future, higher-priced periods (Example 1)
 With market power, the bidder is concerned with properly influencing the future spot price to maximize future profit (Example 2)
 Knowing whether the bidder has market power or a production limit is crucial, since profits, decisions, and remaining resources all depend on each other
Example 1
 Decision-maker has a production limit over the
planning horizon
 210 states, 10 possible production levels
between 0 and the production limit
 The parameters for state 1 are listed in Table
III
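The state counts quoted for the two examples are consistent with a product structure. As an assumption not stated on the slides: if each state in Example 1 pairs one of the 21 market states used in Example 2 with one of the 10 production levels, the count works out as

```latex
|S_{\text{Example 1}}| = \underbrace{21}_{\text{market states}} \times \underbrace{10}_{\text{production levels}} = 210
```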
Example 1
 GenCoA’s Cost, Production Limit and Decision Options
 GenCoB’s Decision Options in State 1
Example 1
 Value iteration algorithm is used to calculate the
optimal decision for each state and each stage
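For reference, a finite-horizon value iteration of this kind can be written as the backward recursion below. This is a sketch under assumptions: r(i,j,a) is the reward notation used earlier in the slides, while the transition probability symbol p(i,j,a), the zero terminal value, and the absence of discounting are assumptions not confirmed by the slides. T = 7 corresponds to the 7-day planning horizon.

```latex
V_T(i) = 0, \qquad
V_t(i) = \max_{a} \sum_{j} p(i,j,a)\,\bigl[\, r(i,j,a) + V_{t+1}(j) \,\bigr],
\qquad t = T-1, \dots, 0
```

The maximizing decision a at each pair (t, i) is the optimal decision for that state and stage.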
Example 1
 One rational strategy, the Peak-Load Strategy, is chosen for comparison with the BIDS results
   Peak-Load Strategy: the decision maker bids at cost, divides the production limit evenly over the week, and sells only at the peak-load period of each day
 The results show that the expected rewards from BIDS are consistently higher than those of the Peak-Load Strategy
 Optimal strategy is time dependent due to the
production limit
 In some states the optimal decision is not to sell, but
to save the resources for more profitable days
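The last two points can be illustrated with a toy model that is far simpler than the 210-state case study. In this sketch every number is hypothetical: the bidder has a single unit left, the daily profit margin is either low or high with equal probability, and the horizon is shortened to 3 days. It is a stand-in for the idea, not the authors' model.

```python
from functools import lru_cache

MARGINS = (4.0, 10.0)   # hypothetical low/high profit margin per unit
PROBS = (0.5, 0.5)      # hypothetical probability of each margin
DAYS = 3                # shortened horizon, for readability only

@lru_cache(maxsize=None)
def expected_value(day, units_left):
    """Expected optimal profit from `day` onward, before that day's margin is known."""
    if day == DAYS or units_left == 0:
        return 0.0
    return sum(
        p * max(sell_value(day, units_left, m), hold_value(day, units_left))
        for m, p in zip(MARGINS, PROBS)
    )

def sell_value(day, units_left, margin):
    """Profit from selling one unit today plus the expected value of the rest."""
    return margin + expected_value(day + 1, units_left - 1)

def hold_value(day, units_left):
    """Expected value of keeping every remaining unit for later days."""
    return expected_value(day + 1, units_left)

def best_decision(day, units_left, margin):
    """Optimal action once today's margin has been observed."""
    if sell_value(day, units_left, margin) >= hold_value(day, units_left):
        return "sell"
    return "hold"

for day in range(DAYS):
    print(day, [best_decision(day, 1, m) for m in MARGINS])
```

With these numbers the same low margin leads to "hold" early in the horizon and "sell" on the last day, which is the sense in which the optimal strategy is time dependent under a production limit.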
Example 2
 Decision-maker has market power: it can manipulate the bid to influence the spot price
 Decision-maker has no production limit
 Decision-maker makes the bidding decision to maximize the expected reward over the planning horizon
 There are 21 states in the MDP model, since there is no production limit
 The Daily Maximum Strategy and BIDS results are used for comparison
Example 2
 Daily Maximum Strategy maximizes the daily
reward without considering how the bid affects
the market trend
Example 2 Results
 The Daily Maximum Strategy is time independent: the decision-maker makes the same decision whenever the system is in the same state
 BIDS results are time dependent: they take into account how current bidding affects future spot prices
Example 2
 Since the decision-maker has market power:
 Bidding at high prices gives the decision-maker a
high immediate reward and the system transfers to
high price states with higher probabilities
 Bidding at low prices gives the decision-maker a
lower immediate reward and the system transfers
to low price states with higher probabilities
 Decision-maker should not rely on the daily
profit in making decisions
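The contrast between the two strategies can be sketched on a two-state toy market. All rewards and transition probabilities below are hypothetical and were chosen only so that the strategies disagree; they are not taken from the paper's 21-state model. Option 0 earns slightly more today, while option 1 makes the high-price state more likely tomorrow.

```python
# Hypothetical market: state 0 = low spot price, state 1 = high spot price.
# REWARD[s][a]: expected immediate reward of bid option a in state s.
REWARD = [[5.0, 4.0],
          [9.0, 8.0]]
# TRANS[a][s][s2]: probability of moving from state s to s2 after bid option a.
TRANS = [
    [[0.9, 0.1], [0.6, 0.4]],   # option 0: market drifts toward the low-price state
    [[0.2, 0.8], [0.1, 0.9]],   # option 1: market drifts toward the high-price state
]
HORIZON = 7

def backward_induction(choose):
    """Return (V, policy); V[t][s] is the expected reward-to-go when following `choose`."""
    V = [[0.0, 0.0] for _ in range(HORIZON + 1)]
    policy = [[0, 0] for _ in range(HORIZON)]
    for t in reversed(range(HORIZON)):
        for s in range(2):
            # q[a] = immediate reward plus expected future value of option a
            q = [REWARD[s][a] + sum(TRANS[a][s][s2] * V[t + 1][s2] for s2 in range(2))
                 for a in range(2)]
            a = choose(q, s)
            policy[t][s] = a
            V[t][s] = q[a]
    return V, policy

def lookahead(q, s):
    """Pick the option with the best total (immediate plus future) value."""
    return max(range(2), key=lambda a: q[a])

def daily_maximum(q, s):
    """Pick the option with the best immediate reward, ignoring the future."""
    return max(range(2), key=lambda a: REWARD[s][a])

V_look, pi_look = backward_induction(lookahead)
V_daily, pi_daily = backward_induction(daily_maximum)
print("look-ahead policy by day:", pi_look)
print("daily-maximum policy by day:", pi_daily)
```

In this toy, the daily-maximum rule picks option 0 on every day, whereas the look-ahead rule gives up some immediate reward on earlier days and switches to option 0 only on the final day, when there is no future left to influence.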
Conclusion
 The spot market bidding decision-making problem is formulated as a Markov Decision Process
 An algorithm is developed to calculate the transition probabilities and reward function
 The value iteration method is applied to solve the MDP
 Implementation and case study
   Three GenCos; GenCoA is the decision maker
   5 generators among the 3 GenCos
 Description of 2 examples:
   Production Limit without Market Power
   Market Power without Production Limit
Extension
 A risk-neutral decision-maker is assumed. To incorporate risk attitude, the variance of the decision-maker’s profits should be brought into the bidding decisions. A slightly different formulation, the risk-sensitive MDP, can be used to solve this problem.
 The objective function maximizes the total expected future reward over the 7-day planning horizon. The 7-day horizon is chosen based on the accuracy of short-term load forecasting; the choice of an optimal planning horizon should be discussed in the future.
 The competitors’ strategies are modeled as price, quantity, and probability from the decision-maker’s point of view. Game theory should also be incorporated to model how the competitors’ actions are affected by the decision-maker’s decisions.
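On the first point above, one common way to bring variance into the objective is a mean-variance criterion. The form below is only an illustration of that idea: the symbols R, r_t, π and λ are assumptions introduced here, and the slides do not specify which risk-sensitive formulation the authors have in mind.

```latex
\max_{\pi} \; \mathbb{E}_{\pi}[R] \;-\; \lambda \, \mathrm{Var}_{\pi}[R],
\qquad R = \sum_{t=0}^{T-1} r_t, \quad \lambda \ge 0
```

Setting λ = 0 recovers the risk-neutral objective of maximizing total expected reward; larger λ penalizes policies whose weekly profit is more variable.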
Questions???