Model of Approximate Dynamic Programming Applied on Day-Ahead Trading of a Renewable Producer of Energy
Vadym Omelchenko
Faculty of Mathematics and Physics, Charles University in Prague, and Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic
1) A renewable producer generates energy but does not know how much he will generate on the following day because of the uncertainty of the weather.
2) We assume that the producer is penalized for insufficient delivery of energy, because this corresponds to market conditions and because some countries, e.g. Bulgaria, have introduced such a system.
3) In our setting, the state is a two-dimensional variable that consists of wind data and the electricity price.
5) Our goal is to determine a bidding strategy of the producer by using dynamic programming.
There is a special case in which the reward function depends not only on the current state but also on the next state or states. In this case, the reward function is written as follows:
$C_t(S_t, x_t, S_{t+1})$
The value function then takes the following form:
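A standard form consistent with this reward is sketched here. It is an assumption rather than a formula taken from the slides: the discount factor $\gamma$ is borrowed from the value-iteration step later in the deck, and the conditional expectation over $S_{t+1}$ follows the usual dynamic programming convention.

```latex
V_t(S_t) = \max_{x_t} \mathbb{E}\left[ C_t(S_t, x_t, S_{t+1}) + \gamma V_{t+1}(S_{t+1}) \,\middle|\, S_t, x_t \right]
```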
1) At some $T < \infty$ we have $V_{T+1} = 0$. Knowledge of the value function at the terminal state enables us to calculate the value function backward in time.
2) $T$ tends to infinity.
Wind production depends on the wind speed (let us denote it as "Wind"):
$\text{WindProduction} = c \cdot \text{Wind}^3$, where $c$ is a positive constant.
The square root of the wind speed can be modeled by an AR(1) process. Taking into account the dependence of wind production on wind speed, we modeled $\text{WindProduction}^{1/6}$ by an AR(1) process.
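Since $\text{WindProduction}^{1/6} = c^{1/6} \cdot \text{Wind}^{1/2}$, an AR(1) model for the square root of wind speed carries over directly to the sixth root of production. The following is a minimal sketch of that transformation. The parameter names `phi`, `mu` and `sigma` are hypothetical, and the Gaussian innovations are a simplifying assumption made only for illustration; the slides fit a stable distribution to the residuals.

```python
import random


def simulate_wind_production(phi, mu, sigma, n, seed=0):
    """Simulate z_t = WindProduction_t^(1/6) as an AR(1) process and
    map it back to production via WindProduction_t = z_t^6.

    phi, mu, sigma are illustrative AR(1) parameters (assumptions).
    Innovations are Gaussian here purely for simplicity.
    """
    rng = random.Random(seed)
    z = mu  # start the transformed series at its mean level
    production = []
    for _ in range(n):
        # AR(1) recursion on the transformed scale
        z = mu + phi * (z - mu) + sigma * rng.gauss(0.0, 1.0)
        # Production cannot be negative, so clip before raising to the 6th power
        production.append(max(z, 0.0) ** 6)
    return production


if __name__ == "__main__":
    path = simulate_wind_production(phi=0.8, mu=2.0, sigma=0.1, n=5)
    print(path)
```

Working on the transformed scale keeps the recursion linear; the nonlinearity enters only when mapping back to production.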
Figure: Visualization of the goodness-of-fit test. Residuals of the AR(1) model for prices, fitted by a stable distribution. The Kolmogorov-Smirnov and Anderson-Darling tests did not reject the hypothesis that the residuals have the stable distribution $S_{1.562}(1, 0, 0)$.
Figure: Visualization of the goodness-of-fit test. Residuals of the AR(1) model applied to $\text{WindProduction}^{1/6}$, fitted by a stable distribution. The Kolmogorov-Smirnov and Anderson-Darling tests did not reject the hypothesis that the residuals have the stable distribution $S_{1.651}(1, 0, 0)$.
Assumptions on the Residuals of Autoregressive Models of Wind Production and Prices
Assumption 1. The residuals are independent. We can assume different tail indices.
Assumption 2. The residuals are not independent, because wind affects prices. Their joint distribution is sub-Gaussian.
Assumption 3. The residuals are not independent, and we assume that the tail index is different for wind production and prices.
Assumptions on the Residuals of Autoregressive Models of Wind Production and Prices
Assumption 1. We can analyse the residuals separately. This is easy to implement.
Assumption 2. Sub-Gaussian distributions can be expressed as follows:
$X = W^{1/2} \cdot Z$, where $W \sim S_{\alpha/2}\left((\cos(\pi\alpha/2))^{2/\alpha}, 1, 0\right)$ and $Z \sim N(0, Q)$.
We need to approximate the distribution function.
Assumption 3. This case is complicated because of the spectral measure. The distribution is operator stable.
In the following slides, we comment on what follows from these assumptions.
Assumptions on the Residuals of Autoregressive Models of Wind Production and Prices
Assumption 1. The tail index $\alpha$ of the residuals of wind production equals 1.651, and the tail index of the residuals of prices equals 1.562.
Assumption 2. The classical correlation equals 45%. The dependence parameter between the residuals, under the assumption that the joint distribution is sub-Gaussian, equals 63%. The tail index is 1.61. We need to approximate the distribution function.
Assumptions on the Residuals of Autoregressive Models of Wind Production and Prices
Assumption 3. Any univariate stable distribution can be simulated by means of exponential and uniform distributions. In our case it looks as follows. Let $W(\alpha, \exp(1), U(-\pi/2, \pi/2))$ denote a variable with distribution $S_{\alpha/2}\left((\cos(\pi\alpha/2))^{2/\alpha}, 1, 0\right)$.
Any state is a two-dimensional vector $S = (\text{Price}, \text{Wind})^T$.
$X_{\text{price}} = W(\alpha_{\text{price}})^{1/2} \cdot Z$, $X_{\text{wind}} = W(\alpha_{\text{wind}})^{1/2} \cdot Z$
$X^* = (X_{\text{price},1}, X_{\text{wind},2})$
In this case, we approximate the distribution function by the empirical distribution function, because it converges uniformly to the true distribution function.
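As an illustration of simulating a stable variable from one exponential and one uniform draw, and of the empirical distribution function, here is a minimal sketch. It uses the Chambers-Mallows-Stuck representation for the symmetric case with unit scale only, which matches the univariate laws $S_\alpha(1, 0, 0)$ quoted in the figure captions. It does not implement the totally skewed subordinator $W$ or the bivariate construction above, and the function names are hypothetical.

```python
import bisect
import math
import random


def symmetric_stable(alpha, rng):
    """One draw from a symmetric alpha-stable law with unit scale,
    for 1 < alpha <= 2, built from a uniform and an exponential draw."""
    u = rng.uniform(-math.pi / 2, math.pi / 2)
    e = rng.expovariate(1.0)
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (e / math.cos((alpha - 1.0) * u)) ** ((alpha - 1.0) / alpha))


def ecdf(sample):
    """Return the empirical distribution function of a sample."""
    xs = sorted(sample)
    n = len(xs)

    def F(x):
        # Fraction of observations less than or equal to x
        return bisect.bisect_right(xs, x) / n

    return F


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [symmetric_stable(1.562, rng) for _ in range(10000)]
    F = ecdf(draws)
    print(F(-1.0), F(0.0), F(1.0))
```

For `alpha = 2` the formula reduces to `2 * sin(u) * sqrt(e)`, which is a normal variable with variance 2, a convenient sanity check.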
Step 0. Set $v^0(s) = 0$ for all $s \in S$. Fix a tolerance parameter $\varepsilon > 0$. Set $n = 1$.
Step 1. For each $s \in S$ compute:
$v^n(s) = \max_{x \in X} \left( C(s, x) + \gamma \sum_{s' \in S} P(s' \mid x, s)\, v^{n-1}(s') \right)$  (1)
Let $x^n$ be the decision vector that solves equation (1).
Step 2. If $\|v^n - v^{n-1}\| < \varepsilon (1 - \gamma) / (2\gamma)$, let $x^\pi$ be the resulting policy that solves (1), let $v^\varepsilon = v^n$, and stop ($\|\cdot\|$ denotes the maximum norm). Otherwise set $n = n + 1$ and go to Step 1.
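The three steps can be sketched in a few lines of Python. This is a minimal illustration on a generic finite problem, not the author's implementation: the callables `reward(s, x)` and `prob(s_next, x, s)` and the two-state toy problem in the usage part are hypothetical stand-ins for $C(s, x)$ and $P(s' \mid x, s)$.

```python
def value_iteration(states, actions, reward, prob, gamma, eps):
    """Value iteration with the stopping rule
    max_s |v^n(s) - v^{n-1}(s)| < eps * (1 - gamma) / (2 * gamma)."""
    v = {s: 0.0 for s in states}  # Step 0: v^0(s) = 0
    while True:
        v_new, policy = {}, {}
        for s in states:  # Step 1: apply equation (1) to each state
            best_x, best_q = None, float("-inf")
            for x in actions:
                q = reward(s, x) + gamma * sum(
                    prob(s2, x, s) * v[s2] for s2 in states)
                if q > best_q:
                    best_x, best_q = x, q
            v_new[s], policy[s] = best_q, best_x
        # Step 2: maximum-norm stopping test
        diff = max(abs(v_new[s] - v[s]) for s in states)
        v = v_new
        if diff < eps * (1.0 - gamma) / (2.0 * gamma):
            return v, policy


if __name__ == "__main__":
    # Toy problem: being in state 1 pays 1; "stay" keeps the state,
    # "switch" moves to the other state with probability one.
    states, actions = [0, 1], ["stay", "switch"]
    reward = lambda s, x: 1.0 if s == 1 else 0.0
    prob = lambda s2, x, s: 1.0 if (s2 == s) == (x == "stay") else 0.0
    v, policy = value_iteration(states, actions, reward, prob, 0.9, 1e-6)
    print(v, policy)
```

In the toy problem the exact values are $1/(1-\gamma)$ in state 1 and $\gamma/(1-\gamma)$ in state 0, so the output can be checked by hand.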
Application of Random Forests to Estimate Value Function
We apply random forests to estimate the value function after reformulating the problem in terms of post-decision variables.
We used the value function obtained by value iteration as a benchmark. We express the value function as a function of price, WindProduction, price$^2$, WindProduction$^2$, price · WindProduction, and price$^2$ · WindProduction$^2$. In the case of regression and instrumental variables, it is a linear function of these variables. The approximations yield similar results, and random forests outperform regression and instrumental variables: the relative error is 2.5 percent for instrumental variables and 2.1 percent for random forests.
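To make the linear case concrete, the six variables can be collected into a feature vector and combined with a coefficient vector. The sketch below shows only this functional form. The names `features`, `linear_value` and `theta` are hypothetical, no intercept is included because the slides list none, and the coefficients would in practice come from regression or instrumental variables rather than being chosen by hand.

```python
def features(price, wind_production):
    """The six variables used to express the value function."""
    p, w = price, wind_production
    return [p, w, p ** 2, w ** 2, p * w, p ** 2 * w ** 2]


def linear_value(theta, price, wind_production):
    """Linear approximation: inner product of coefficients and features."""
    phi = features(price, wind_production)
    return sum(t * f for t, f in zip(theta, phi))


if __name__ == "__main__":
    theta = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]  # illustrative coefficients only
    print(features(2.0, 3.0))
    print(linear_value(theta, 2.0, 3.0))
```

A random forest would take the same feature vector as input but replace the inner product with an average over regression trees.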
1) To reduce simplifying assumptions.
2) To combine the technique of ADP with techniques of price prediction.
3) To implement ADP for Assumption 2 and Assumption 3.
4) To handle only a one-sided dependence structure: wind can affect prices but not vice versa.
5) To use bidding strategies that follow from the improved model for trading purposes.
1) L. Breiman. Random Forests. Statistics Department, University of California, Berkeley, CA 94720. January 2001.
2) N. Lohndorf, S. Minner. Optimal Day-Ahead Trading and Storage of Renewable Energies - An Approximate Dynamic Programming Approach. Department of Business Administration, University of Vienna. December 2009.
3) W.R. Scott, W.B. Powell. Approximate Dynamic Programming for Energy Storage with New Results on Instrumental Variables and Projected Bellman Errors. Submitted to Operations Research.
4) S. Snih. Random Forests for Classification Trees and Categorical Dependent Variables: an informal Quick Start R Guide. Stanford University. February 2011.