© 2009 Warren B. Powell© 2008 Warren B. Powell Slide 1 SMART: A Stochastic Multiscale Energy Policy Model using Approximate Dynamic Programming Power Systems.

SMART: A Stochastic Multiscale Energy Policy Model using Approximate Dynamic Programming

Power Systems Modeling Conference, University of Florida

March, 2009

Warren Powell
Abraham George
Alan Lamont
Jeffrey Stewart

Goals for an energy policy model

Potential policy questions:

» How do we design policies to achieve energy goals (e.g. 20% renewables by 2015) with a given probability?

» How does the imposition of a carbon tax change the likelihood of meeting this goal?

» What might happen if ethanol subsidies are reduced or eliminated?

» How would a long-term rise or fall in oil prices affect investments?

Some challenges:

» The marginal value of wind and solar farms depends on the ability to work with intermittent supply.

» The impact of intermittent supply will be affected by our use of storage.

» The need for storage (and the value of wind and solar) depends on the entire portfolio of energy producing technologies.

Demand modeling

[Figure: commercial electric demand over 7 days.]

Intermittent energy sources

[Figures: wind speed and solar energy time series; horizons shown: 30 days and 1 year.]

Storage

» Batteries
» Ultracapacitors
» Flywheels
» Hydroelectric

Long-term uncertainties…

[Figure: timeline from 2010 to 2030 showing sources of long-term uncertainty: tax policy, batteries, solar panels, carbon capture and sequestration, price of oil, climate change.]

Goals for an energy policy model

Model capabilities we are looking for
» Multi-scale
• Multiple time scales (hourly, daily, seasonal, annual, decade)
• Multiple spatial scales
• Multiple technologies (different coal-burning technologies, new wind turbines, …)
• Multiple markets
– Transportation (commercial, commuter, home activities)
– Electricity use (heavy industrial, light industrial, business, residential)
– ….
» Stochastic (handles uncertainty)
• Hourly fluctuations in wind, solar and demands
• Daily variations in prices and rainfall
• Seasonal variations in weather
• Yearly variations in supplies, technologies and policies

Prior work

Existing deterministic models
» MARKAL, NEMS, EFOM, META*Net
» Use linear programming with nonlinear market equilibration
» Typically focus on long horizons (decades) with long time steps (1-5 years)
» Not set up to handle uncertainty or fine-grained spatial and temporal features

Techniques incorporating uncertainty
» LP-averaging – Stochastic MARKAL
» Decision trees – SOCRATES
» Stochastic programming – SETSTOCH
» Markov decision processes (you have to be kidding)

Decision making under uncertainty

Mixing optimization and uncertainty

[Figure: a “large optimization model (e.g. NEMS, MARKAL, …)” chooses energy sources over time. The model is solved separately for Scenario 1, Scenario 2 and Scenario 3, and each scenario yields a different solution.]

Now we have to combine the results of these three optimizations to make decisions.

Making decisions

Stochastic programming using scenario trees

[Figure: scenario tree by month (Jan, Feb, Mar, Apr, May, …, Sep, …, Jan), divided into a known period and an unknown period. Branches carry probability weights (e.g. 0.4). Labels: “Wetter”, “Drier”, “Average over 30”, “Flow forecast scenarios”, with branches at 15%, 50% and 85% exceedance.]

Stochastic programming does not “cheat”, but it does not scale. It is best designed for coarse-grained sources of uncertainty, but will not handle fine-grained temporal resolution.
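The scaling problem can be made concrete with simple arithmetic. The sketch below counts the leaves of a scenario tree under the assumption that every stage branches the same number of ways; the function name and the choice of three branches (e.g. wetter, average, drier) are illustrative and not taken from the slides.

```python
def scenario_count(branches_per_stage: int, stages: int) -> int:
    """Number of leaf scenarios in a tree that branches uniformly at every stage."""
    return branches_per_stage ** stages

# Three outcomes per stage: a few stages are manageable, monthly stages are
# already large, and hourly stages (8,760 per year) are far out of reach.
for stages in (3, 12, 24):
    print(stages, scenario_count(3, stages))
```

Under these assumptions the count grows exponentially with the number of stages, which is why scenario trees are usually restricted to a handful of coarse-grained branching points.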

Stochastic MARKAL

Received Wednesday, March 18 2009

Energy resource modeling

SMART: A Stochastic, Multiscale Allocation model for energy Resources, Technology and policy
» Stochastic – able to handle different types of uncertainty:
• Fine-grained – Daily fluctuations in wind, solar, demand, prices, …
• Coarse-grained – Major climate variations, new government policies, technology breakthroughs
» Multiscale – able to handle different levels of detail:
• Time scales – Hourly to yearly
• Spatial scales – Aggregate to fine-grained disaggregate
• Activities – Different types of demand patterns
» Decisions
• Hourly dispatch decisions
• Yearly investment decisions
• Takes as input parameters characterizing government policies, performance of technologies, assumptions about climate

System state: $S_t = (R_t, D_t, \rho_t)$

» $R_t$: Resource state
» $D_t$: Market demands
» $\rho_t$: “System parameters”
• State of technology (costs, performance)
• Climate, weather (temperature, rainfall, wind)
• Government policies (tax rebates on solar panels)
• Market prices (oil, coal)

The decision variable:

New capacity

Retired capacity

Storage

Location

Technology

$x_t$ for each

Exogenous information:

New information: $W_t = (\hat{R}_t, \hat{D}_t, \hat{\rho}_t)$

» $\hat{R}_t$ = Exogenous changes in capacity, reserves
» $\hat{D}_t$ = New demands for energy from each source
» $\hat{\rho}_t$ = Exogenous changes in parameters

Information can be:
» Fine-grained: wind, solar, demand, prices, …
» Coarse-grained: changes in technologies, policies, climate

The transition function

$$S_{t+1} = S^M(S_t, x_t, W_{t+1})$$

Also known as the “system model,” “plant model” or just “model.”

All the physics of the problem.
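As a rough illustration of what a transition function looks like in code, the sketch below models a single reservoir. The `State` fields, the `capacity` default and the spill rule are all assumptions made for this example; the slides do not specify SMART's actual state representation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    reservoir: float  # R_t: water held in storage (hypothetical units)
    demand: float     # D_t: demand to be served in this period

def transition(state: State, release: float, rainfall: float,
               new_demand: float, capacity: float = 100.0) -> State:
    """Toy S_{t+1} = S^M(S_t, x_t, W_{t+1}) for one reservoir.

    release is the decision x_t; rainfall and new_demand play the role of
    the exogenous information W_{t+1}.
    """
    # The decision cannot release a negative amount or more than is stored.
    release = min(max(release, 0.0), state.reservoir)
    # Water above capacity is spilled.
    level = min(state.reservoir - release + rainfall, capacity)
    return State(reservoir=level, demand=new_demand)

s0 = State(reservoir=50.0, demand=20.0)
s1 = transition(s0, release=20.0, rainfall=5.0, new_demand=25.0)
print(s1)
```

Keeping the decision and the exogenous information as separate arguments mirrors the notation: the same function can be driven by an optimizer choosing `release` and by a simulator sampling `rainfall` and `new_demand`.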

Introduction to ADP

The challenge of dynamic programming:

$$V_t(S_t) = \max_x \left( C(S_t, x_t) + E\{ V_{t+1}(S_{t+1}) \mid S_t \} \right)$$

Problem: Three curses of dimensionality
» State space
» Outcome space (the expectation)
» Action space (x is a vector)

Introduction to ADP

The computational challenge:

$$V_t(S_t) = \max_x \left( C(S_t, x_t) + E\{ V_{t+1}(S_{t+1}) \mid S_t \} \right)$$

» How do we find $V_{t+1}(S_{t+1})$?
» How do we compute the expectation?
» How do we find the optimal solution?
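For a problem small enough to enumerate, all three questions can be answered by brute force. The sketch below solves the recursion exactly by backward induction on a toy reservoir with a scalar integer state. The horizon, capacity, inflow distribution and square-root reward are assumptions chosen for illustration, not values from the talk.

```python
import math

T = 4                               # number of periods (hypothetical)
R_MAX = 5                           # reservoir capacity in integer units
INFLOW = {0: 0.5, 1: 0.3, 2: 0.2}   # assumed P(W_{t+1} = w)

def contribution(x: int) -> float:
    """Concave reward C(S_t, x_t) for releasing x units (assumed form)."""
    return math.sqrt(x)

def backward_dp():
    # V[t][r] is the value of holding r units at time t; V[T][.] = 0.
    V = [[0.0] * (R_MAX + 1) for _ in range(T + 1)]
    policy = [[0] * (R_MAX + 1) for _ in range(T)]
    for t in reversed(range(T)):
        for r in range(R_MAX + 1):              # loop over the state space
            best_val, best_x = -math.inf, 0
            for x in range(r + 1):              # loop over the action space
                # Loop over the outcome space to compute the expectation.
                future = sum(p * V[t + 1][min(r - x + w, R_MAX)]
                             for w, p in INFLOW.items())
                val = contribution(x) + future
                if val > best_val:
                    best_val, best_x = val, x
            V[t][r], policy[t][r] = best_val, best_x
    return V, policy

V, policy = backward_dp()
print(policy[0])
```

The three nested loops correspond to the three curses of dimensionality. Here each loop has at most six elements; once the state, outcome and decision become vectors, every loop grows exponentially and this enumeration stops being practical.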

Introduction to ADP

We first use the value function around the post-decision state variable, removing the expectation:

$$V_t(S_t) = \max_x \left( C(S_t, x_t) + V^x_t\big(S^x_t(S_t, x_t)\big) \right)$$

We then replace the value function with an approximation that we estimate using machine learning techniques:

$$V_t(S_t) = \max_x \left( C(S_t, x_t) + \bar{V}_t\big(S^x_t(S_t, x_t)\big) \right)$$
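The step that removes the expectation can be spelled out as a short derivation. It assumes the standard definition of the post-decision state as the state immediately after the decision is made but before new information arrives. The symbol $S^{M,W}$ for the second half of the transition is notation introduced here for illustration.

```latex
% Split the transition into a decision step and an information step:
S^x_t = S^{M,x}(S_t, x_t), \qquad S_{t+1} = S^{M,W}(S^x_t, W_{t+1}).

% Define the value of the post-decision state as the expected downstream value:
V^x_t(S^x_t) = E\{ V_{t+1}(S_{t+1}) \mid S^x_t \}.

% Substituting into the optimality equation moves the expectation
% out of the maximization:
V_t(S_t) = \max_x \left( C(S_t, x_t) + V^x_t\big(S^{M,x}(S_t, x_t)\big) \right).
```

Because $S^x_t$ is a deterministic function of $S_t$ and $x_t$, the problem inside the maximization is deterministic once $V^x_t$ (or an approximation of it) is available.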

Making decisions

Following an ADP policy

Making decisions

Approximate dynamic programming

With luck, the objective function will improve steadily

[Figure: objective function (vertical axis) versus iteration (horizontal axis, 1 to 81).]

Step 1: Start with a pre-decision state $S^n_t$.

Step 2 (deterministic optimization): Solve the deterministic optimization using an approximate value function:

$$\hat{v}^n_t = \max_x \left( C(S^n_t, x_t) + \bar{V}^{n-1}_t\big(S^{M,x}(S^n_t, x_t)\big) \right)$$

to obtain $x^n_t$.

Step 3 (recursive statistics): Update the value function approximation:

$$\bar{V}^n_{t-1}(S^{x,n}_{t-1}) = (1 - \alpha_{n-1}) \bar{V}^{n-1}_{t-1}(S^{x,n}_{t-1}) + \alpha_{n-1} \hat{v}^n_t$$

Step 4 (simulation): Obtain a Monte Carlo sample of $W_t(\omega^n)$ and compute the next pre-decision state:

$$S^n_{t+1} = S^M\big(S^n_t, x^n_t, W_{t+1}(\omega^n)\big)$$

Step 5: Return to step 1.
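The five steps can be sketched on a toy reservoir with a lookup-table approximation around the post-decision state. This is a minimal illustration under stated assumptions, not the SMART implementation: the reward, inflow distribution and starting level are hypothetical, the stepsize rule `a / (a + n - 1)` is one common choice rather than the one used in the talk, and the epsilon-greedy exploration is an addition that does not appear in the steps.

```python
import math
import random

T = 4                                        # periods per iteration (hypothetical)
R_MAX = 5                                    # reservoir capacity, integer units
INFLOW, PROBS = [0, 1, 2], [0.5, 0.3, 0.2]   # assumed rainfall distribution

def contribution(x: int) -> float:
    return math.sqrt(x)                      # assumed concave reward for releasing x

def run_adp(n_iters: int = 500, eps: float = 0.1, a: float = 10.0, seed: int = 0):
    rng = random.Random(seed)
    # v_bar[t][r_post]: estimated value of post-decision level r_post at time t.
    v_bar = [[0.0] * (R_MAX + 1) for _ in range(T)]
    objective = []
    for n in range(1, n_iters + 1):
        alpha = a / (a + n - 1)              # assumed stepsize rule
        r, prev_post, total = 2, None, 0.0   # Step 1: initial pre-decision state
        for t in range(T):
            # Step 2: deterministic optimization against the current approximation.
            v_hat, x_star = max((contribution(x) + v_bar[t][r - x], x)
                                for x in range(r + 1))
            # Step 3: smooth v_hat into the previous post-decision state's value.
            if prev_post is not None:
                old = v_bar[t - 1][prev_post]
                v_bar[t - 1][prev_post] = (1 - alpha) * old + alpha * v_hat
            # Exploration (an addition for this sketch): occasionally try a random release.
            x = rng.randint(0, r) if rng.random() < eps else x_star
            total += contribution(x)
            prev_post = r - x
            # Step 4: sample new information and move to the next pre-decision state.
            w = rng.choices(INFLOW, weights=PROBS)[0]
            r = min(prev_post + w, R_MAX)
        objective.append(total)              # Step 5: start the next iteration
    return v_bar, objective

v_bar, objective = run_adp()
print(sum(objective[-50:]) / 50)
```

Note that the inner maximization never touches the random inflow: the sample is drawn only after the decision is made, so the decision cannot anticipate it.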


Optimization Simulation

Approximate dynamic programming combines simulation and optimization in a rigorous yet flexible framework.

Optimization
» Strengths
• Produces optimal decisions.
• Mature technology.
» Weaknesses
• Cannot handle uncertainty.
• Cannot handle high levels of complexity.

Simulation
» Strengths
• Extremely flexible.
• High level of detail.
• Easily handles uncertainty.
» Weaknesses
• Models decisions using user-specified rules.
• Low solution quality.

The investment problem

[Figure: the investment problem, starting in 2009. For each energy source (oil, wind, coal, natural gas), the diagram links the resource state $R_t$ and the investment decision $x_t$ at time $t$ to the resource state $R_{t+1}$ and decision $x_{t+1}$ at time $t+1$. New information $(\hat{R}, \hat{D}, \hat{\rho})$ for each source arrives between decisions.]

The dispatch problem

Hourly electricity “dispatch” problem

Hourly model
» Decisions at time t impact t+1 through the amount of water held in the reservoir.

[Figure: dispatch networks for hour t and hour t+1.]

Value of holding water in the reservoir for future time periods.

[Figure: hours 1, 2, 3, 4, …, 8760 of 2009, with dispatch across oil, wind, natural gas and coal, nested in a yearly timeline (2008, 2009, 2010, 2011, …, 2038); each year is annotated “~5 seconds”.]

Use statistical methods to learn the value of resources in the future.

Resources may be:
» Stored energy
• Hydro
• Flywheel energy
• …
» Storage capacity
• Batteries
• Flywheels
• Compressed air
» Energy transmission capacity
• Transmission lines
• Gas lines
• Shipping capacity
» Energy production sources
• Windmills
• Solar panels
• Nuclear power plants

[Figure: value function $\bar{V}_t(R_t)$ plotted against the amount of resource.]

Following sample paths
» Demands, prices, weather, technology, policies, …

$W_t = (\hat{R}_t, \hat{D}_t, \hat{\rho}_t)$

[Figure: sample paths over time; annotation: achieved goal with probability 0.70.]

Need to consider:
» Little noise (wind, rain, demand, prices, …)
» Big noise (technology, policy, climate, …)
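The probability of meeting a policy goal can be estimated by counting the fraction of sample paths on which the goal is met. The sketch below shows only that counting step. A random walk stands in for the full simulation, and every number in it (starting share, growth, noise, horizon, goal) is a hypothetical placeholder, not a result from SMART.

```python
import random

def simulate_renewable_share(rng: random.Random, years: int = 6,
                             start: float = 0.10, mean_growth: float = 0.02,
                             noise: float = 0.01) -> float:
    """One sample path of the renewable share (toy random walk, assumed dynamics)."""
    share = start
    for _ in range(years):
        share += rng.gauss(mean_growth, noise)   # yearly change plus noise
    return share

def prob_goal_met(goal: float = 0.20, n_paths: int = 10_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(final share >= goal)."""
    rng = random.Random(seed)
    hits = sum(simulate_renewable_share(rng) >= goal for _ in range(n_paths))
    return hits / n_paths

print(prob_goal_met())
```

In a full model, each path would come from following the ADP policy through sampled demands, prices, weather, technology and policy changes, and changing a policy input such as a carbon tax would shift the estimated probability.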

Convergence analysis

» For scalar, piecewise linear approximations:
• Nascimento, J. and W. B. Powell, “An Optimal Approximate Dynamic Programming Algorithm for the Lagged Asset Acquisition Problem,” Mathematics of Operations Research, Vol. 34, No. 1, pp. 210-237 (2009).
• Nascimento, J. and W. B. Powell, “Optimal approximate dynamic programming algorithms for a general class of storage problems,” under review at SIAM J. on Control and Optimization.
» For general continuous states and actions:
• J. Ma and W. B. Powell, “Convergence Proofs for Least Squares Policy Iteration Algorithm of High-Dimensional Infinite Horizon Markov Decision Process Problems,” under review at Machine Learning.
» Stepsizes:
• George, A. and W. B. Powell, “Adaptive Stepsizes for Recursive Estimation with Applications in Approximate Dynamic Programming,” Machine Learning, Vol. 65, No. 1, pp. 167-198 (2006).

Convergence analysis

Research on approximations:
» George, A., W.B. Powell and S. Kulkarni, “Value Function Approximation Using Hierarchical Aggregation for Multiattribute Resource Management,” Journal of Machine Learning Research, Vol. 9, pp. 2079-2111 (2008).
» L. Hannah, D. Blei and W. B. Powell, “Density Estimation and Regression with Dirichlet Process-Generalized Linear Mixture Models,” in preparation for submission to Machine Learning.
» J. Ma and W. B. Powell, “A Convergent Algorithm for Continuous State Value Function Approximations Using Kernel Regression,” in preparation.

Benchmarking
» Compare ADP to optimal LP for a deterministic problem
• Annual model
– 8,760 hours over a single year
– Focus on ability to match hydro storage decisions
• 20 year model
– 24 hour time increments over 20 years
– Focus on investment decisions
» Comparisons on stochastic model
• Stochastic rainfall analysis
– How does ADP solution compare to LP?
• Carbon tax policy analysis
– Demonstrate nonanticipativity

Energy resource modeling

ADP objective function relative to optimal LP

[Figure: ADP objective versus iterations (0 to 500); the ADP solution reaches 0.06% over optimal.]

[Figures: reservoir level, demand and rainfall over the year, shown for the optimal solution from the linear program and for the ADP solution.]

Annual energy model

[Figure: sample paths over time periods 0 to 800.]

[Figure: solutions optimal for individual scenarios, over time periods 0 to 800.]

Multidecade energy model

Optimal vs. ADP – daily model over 20 years

[Figure: percentage (axis from 10% to 40%) versus iterations (0 to 600); the ADP solution reaches 0.24% over optimal.]

Energy policy modeling

Traditional optimization models tend to produce all-or-nothing solutions

[Figure: investment in IGCC versus the cost differential (IGCC minus pulverized coal), spanning the regions where pulverized coal is cheaper and where IGCC is cheaper. Two curves: traditional optimization and approximate dynamic programming.]

Policy study:
» What is the effect of a potential (but uncertain) carbon tax in year 8?

[Figures: investment in renewable technologies and carbon-based technologies over years 0 to 20, with no carbon tax and with a carbon tax. The final panel marks the period in which the carbon tax policy is unknown and the point at which it is determined.]

Conclusions

Capabilities
» SMART can handle problems with over 300,000 time periods so that it can model hourly variations in a long-term energy investment model.
» It can simulate virtually any form of uncertainty, either provided through an exogenous scenario file or sampled from a probability distribution.
» Accurate modeling of climate, technology and markets requires access to exogenously provided scenarios.
» It properly models storage processes over time.
» Current tests are on an aggregate model, but the modeling framework (and library) is set up for spatially disaggregate problems.

Conclusions

Limitations
» Value function approximations capture the resource state vector, but are limited to very simple exogenous state variations.
» More research is needed to test the ability of the model to use multiple storage technologies.
» Extension to spatially disaggregate model will require significant engineering and data.
» Run times will start to become an issue for a spatially disaggregate model.
