Wind Farm Power prediction with Graph Neural Network
Junyoung Park
SYSTEMS INTELLIGENCE Lab
Industrial and Systems Engineering (ISysE)
<https://wall.alphacoders.com/big.php?i=526859>
Wind Farm Power Estimation Task
2
[Figure: the same wind farm under wind direction 1 and wind direction 2]
Wind Farm Power Estimation Task
3
[Figure: wind farm under wind direction 1]
• Farm-level power estimation: wind-farm power = ??
• Turbine-level power estimation: wind turbine powers = ??
Wind Farm and Its Graph Representation
5
[Figure: wind farm under wind direction 1]
𝒢 = (N, E, g)
Node features N = {free-flow wind speed of turbine i, ∀i ∈ turbine indices}
Edge features E = {the downstream wake distance d, the radial wake distance r, ∀(i, j)*}
Global features g = {free-flow wind speed}
* ∀(i, j) ∈ interacting turbines, where i, j are turbine indices
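The graph construction above can be sketched in code. This is a minimal illustration, not the authors' implementation: the function name `build_farm_graph`, the wake cut-off `max_radial`, and the coordinate conventions are all assumptions.

```python
import numpy as np

def build_farm_graph(coords, wind_speed, wind_dir_deg, max_radial=200.0):
    """Build (node, edge, global) features for a wind-farm graph.

    coords: (n, 2) turbine x/y positions. A directed edge (i -> j) is created
    when turbine j lies downstream of turbine i along the wind direction and
    within an assumed radial cut-off max_radial.
    """
    wind = np.array([np.cos(np.deg2rad(wind_dir_deg)),
                     np.sin(np.deg2rad(wind_dir_deg))])
    nodes = np.full((len(coords), 1), wind_speed)  # free-flow wind speed per turbine
    senders, receivers, edges = [], [], []
    for i, pi in enumerate(coords):
        for j, pj in enumerate(coords):
            if i == j:
                continue
            delta = pj - pi
            d = float(delta @ wind)                      # downstream wake distance
            r = float(np.linalg.norm(delta - d * wind))  # radial wake distance
            if d > 0 and r < max_radial:                 # j sits in i's wake
                senders.append(i); receivers.append(j); edges.append([d, r])
    g = np.array([wind_speed])  # global feature: free-flow wind speed
    return nodes, np.array(edges), (senders, receivers), g
```

For two turbines aligned with the wind, only the downstream one receives an edge, with d equal to their spacing and r equal to zero.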
Details on Edge Features
6
[Figure: wake geometry under wind direction 1]
𝒢 = (N, E, g)
Edge features E = {the downstream wake distance d, the radial wake distance r, ∀(i, j)*}
Neural Network in EXTREMELY High Level View
7
[Figure: input data, pairs of (x, y)]
ŷ = NeuralNetwork(x; θ)
A neural network is a function approximator with trainable parameters θ,
trained so that ŷ ≈ y as accurately as possible.
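As a toy illustration of ŷ = NeuralNetwork(x; θ): the one-parameter "network" below, the synthetic data, and the learning rate are all assumptions; gradient descent on the mean-squared error drives ŷ toward y.

```python
import numpy as np

# Minimal sketch: a one-parameter "network" yhat = theta * x, trained by
# gradient descent on mean-squared error so that yhat ≈ y.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 3.0 * x  # the target function the network should approximate

theta = 0.0
for _ in range(200):
    yhat = theta * x
    grad = np.mean(2 * (yhat - y) * x)  # d(MSE)/d(theta)
    theta -= 0.1 * grad
# theta converges toward 3.0
```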
Why Graph Representation?
8
[Figure: wind farm under wind direction 1]
𝒢 = (N, E, g)
vs.
Matrix (Tensor) Representations

      X coord.  Y coord.
T0    850       713
T1    303       587
T2    569       775
T3    642       290
T4    217       97
(rows: # turbines)
Why Graph Representation?
9
[Table: turbine coordinates T0–T4, as on the previous slide]
1. An MLP/CNN's input size tends to be fixed.
   e.g., MNIST = [28 × 28]
   If we deploy one more turbine to the farm,
   then the input dimension would change.
2. The input data has no natural order.
   e.g., a time series has a time index!
   Which turbine should be the first input?
Spatial/Temporal Adjacency does not imply ‘related’
10
The convolution operation presumes that
'nearby pixels are somewhat related',
since we share the convolution filters.
Figure source <Left: https://github.com/vdumoulin/conv_arithmetic>, <Right: https://towardsdatascience.com/illustrated-guide-to-recurrent-neural-networks-79e5eb8049c9>
RNNs presume that
'nearby inputs are somewhat related',
since we share the RNN blocks.
Graph Neural Network
11
Image source <https://becominghuman.ai/lets-build-a-simple-neural-net-f4474256647f?gi=743618029571>
[Figure: an input graph with node features x0–x4 is mapped to an output graph with node features (or tensors) y0–y4]
- Graph Convolution Networks (GCN)
- Attention-based approaches
- Relational inductive bias (GN block)
- …
𝒢_y = GraphNeuralNetwork(𝒢_x; θ)
Imposing Relational Inductive Bias
12
Share the edge update function f and the node update function g for updating graph-represented data.
Edge update function f(·), node update function g(·)
[Figure: edge e0,1 between nodes n0 and n1 is updated to e′0,1 by f(·)]
Input Graph → Updated Graph
Imposing Relational Inductive Bias
13
Share the edge update function f and the node update function g for updating graph-represented data.
Edge update function f(·), node update function g(·)
[Figure: edge e0,4 between nodes n0 and n4 is updated to e′0,4 by f(·); e′0,1 is already updated]
Input Graph → Updated Graph
Imposing Relational Inductive Bias
14
Share the edge update function f and the node update function g for updating graph-represented data.
Edge update function f(·), node update function g(·)
Input Graph → Updated Graph
[Figure: node n1, together with its edges e1,4 and e1,2, is updated to n′1 by g(·)]
Imposing Relational Inductive Bias
15
Share the edge update function f and the node update function g for updating graph-represented data.
Edge update function f(·), node update function g(·)
Input Graph → Updated Graph
Imposing Relational Inductive Bias
16
Share the edge update function f and the node update function g for updating graph-represented data.
Edge update function f(·), node update function g(·)
Input Graph → Updated Graph
[Figure: node n0, together with edges e0,1, e1,2, e0,2, and e0,4, is updated to n′0 by g(·)]
Physics-induced Graph Neural Network On Wind Power Estimations
17
GN (Graph Neural) Block
18
[Figure: the input graph 𝒢 (global features g; node features N0–N4; Node0 features; Edge0,1 features) is mapped by the Graph Neural (GN) Block to an updated graph 𝒢′ (global features g′; node features N′0–N′4; Node′0 features; Edge′0,1 features)]
Edge update network f(·; θ0)
Node update network f(·; θ1)
Global update network f(·; θ2)
GN Block – Edge update steps
19
[Figure: input graph 𝒢 with global features g, node features N0–N4, Node0 features, and Edge0,1 features]
Edge′0,1 = f(Edge0,1, Node1, Node0, g; θ0)
Update edge features with f(Edge features, Receiver features, Sender features, g; θ0)
GN Block – Edge update steps
20
[Figure: input graph 𝒢 with global features g, node features N0–N4, Node0 features, and Edge0,1 features]
Edge′4,1 = f(Edge4,1, Node4, Node1, g; θ0)
Update edge features with f(Edge features, Receiver features, Sender features, g; θ0)
GN Block – Edge update steps
21
[Figure: input graph 𝒢 with all edge features now updated]
Update edge features with f(Edge features, Receiver features, Sender features, g; θ0)
GN Block – Node update steps
22
[Figure: input graph 𝒢 with updated edge features; global features g; node features N0–N4]
Node′0 = f(Ē0; θ1)
Ē0 = mean( concat(Edge′0,i, Node0, Nodei) ) over all incoming edges i
Aggregation function: any function that obeys the 'input-order invariant' and 'input-number invariant' properties, e.g., mean, max, min, etc.
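The two invariance properties above can be checked directly. A small sketch (the feature values are arbitrary): mean aggregation gives the same result under any input ordering, and produces a fixed-size output for any number of incoming edges.

```python
import numpy as np

# Three incoming edge-feature vectors for one node (arbitrary numbers).
incoming = np.array([[1.0, 2.0], [3.0, 0.0], [2.0, 5.0]])
shuffled = incoming[[2, 0, 1]]  # same edges, different order

# Input-order invariance: mean (likewise max, min) ignores the ordering.
agg_a = incoming.mean(axis=0)
agg_b = shuffled.mean(axis=0)

# Input-number invariance: a node with only 2 incoming edges still yields a
# fixed-size aggregate, so the same node-update network f(.; theta1) applies.
agg_two = incoming[:2].mean(axis=0)
```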
GN Block – Node update steps
23
[Figure: input graph 𝒢 with updated edge features and updated node features]
GN Block – Global feature update
24
[Figure: input graph 𝒢 with updated edge and node features; global features g are updated to g′]
g′ = f(Ē′, N̄′, g; θ2)
Ē′ = mean(Edge′i,j) over all edges (i, j)
N̄′ = mean(Node′i) over all nodes i
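Putting the three update steps together, one GN-block pass might look like the following sketch. The linear stand-in "update networks", the feature sizes, and the example edge list are assumptions; only the edge → node → global ordering and the mean aggregation come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, node_dim, edge_dim, glob_dim = 5, 3, 2, 1
nodes = rng.normal(size=(n_nodes, node_dim))
senders = np.array([0, 1, 4])      # example edges 0->1, 1->2, 4->1
receivers = np.array([1, 2, 1])
edges = rng.normal(size=(len(senders), edge_dim))
g = rng.normal(size=(glob_dim,))

def linear(in_dim, out_dim):  # stand-in for a trained update network
    W = rng.normal(size=(in_dim, out_dim)) * 0.1
    return lambda x: x @ W

f_edge = linear(edge_dim + 2 * node_dim + glob_dim, edge_dim)  # f(.; theta0)
f_node = linear(edge_dim + 2 * node_dim, node_dim)             # f(.; theta1)
f_glob = linear(edge_dim + node_dim + glob_dim, glob_dim)      # f(.; theta2)

# 1) Edge update: f(edge, receiver, sender, g; theta0) for every edge.
edge_in = np.hstack([edges, nodes[receivers], nodes[senders],
                     np.tile(g, (len(edges), 1))])
edges_new = f_edge(edge_in)

# 2) Node update: mean-aggregate incoming edges, then apply f(.; theta1).
nodes_new = nodes.copy()
for i in range(n_nodes):
    mask = receivers == i
    if mask.any():
        agg = np.hstack([edges_new[mask],
                         np.tile(nodes[i], (mask.sum(), 1)),
                         nodes[senders[mask]]]).mean(axis=0)
        nodes_new[i] = f_node(agg)

# 3) Global update: f(mean of edges', mean of nodes', g; theta2).
g_new = f_glob(np.hstack([edges_new.mean(axis=0), nodes_new.mean(axis=0), g]))
```

Nodes with no incoming edges keep their features in this sketch; a real block could still update them from the global features.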
Revisit Aggregation Method
25
[Figure: input graph 𝒢 with updated edge features]
Node′0 = f(Ē0; θ1)
Ē0 = mean( concat(Edge′0,i, Node0, Nodei) ) over all incoming edges i
Aggregation function: any function that obeys the 'input-order invariant' and 'input-number invariant' properties, e.g., mean, max, min, etc.
Weighted “__” ≈ Attention (in Deep Learning)
26
Figure source <Agile Amulet: Real-Time Salient Object Detection with Contextual Attention>
Consider weighted Aggregations
27
Figure source <Left: https://www.youtube.com/watch?v=HHlN0TDgllE> , <Right: VAIN: Attentional Multi-agent Predictive Modeling>
[Figures: robot soccer (left); visualized attention weights (right)]
How can we get the weights?
28
Learn to weight!
GN Block – Edge update steps Revisit
29
[Figure: input graph 𝒢 with global features g, node features N0–N4, Node0 features, and Edge0,1 features]
Edge′4,1 = W4,1 × f(Edge4,1, Node4, Node1, g; θ0)
Update edge features with f(Edge features, Receiver features, Sender features, g; θ0)
W4,1 = f(some possible inputs; θ3)
Physics-induced Attention
30
Figure source <Cooperative wind turbine control for maximizing wind farm power using sequential convex programming by Jinkyoo Park, Kincho H.Law >
J. Park and K. H. Law suggest the continuous deficit factor δu(d, r) as

δu(d, r) = 2α · (R0 / (R0 + κd))² · exp( −(r / (R0 + κd))² )

R0: rotor diameter
d: downstream wake distance
r: radial wake distance
α, κ: tunable parameters
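The deficit factor is easy to evaluate numerically. A sketch with illustrative parameter values (α = 0.08, κ = 0.05, R0 = 60 are assumptions, not fitted values):

```python
import numpy as np

def delta_u(d, r, alpha=0.08, kappa=0.05, R0=60.0):
    """delta_u(d, r) = 2*alpha*(R0/(R0+kappa*d))**2 * exp(-(r/(R0+kappa*d))**2).

    Default parameter values are illustrative assumptions, not fitted values.
    """
    s = R0 + kappa * d  # effective wake radius grows with downstream distance
    return 2.0 * alpha * (R0 / s) ** 2 * np.exp(-(r / s) ** 2)

# The deficit decays with both downstream distance d and radial distance r.
near = delta_u(d=200.0, r=0.0)
far = delta_u(d=1000.0, r=0.0)
off_axis = delta_u(d=200.0, r=100.0)
```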
Physics-induced Attention
31
Figure source <Cooperative wind turbine control for maximizing wind farm power using sequential convex programming by Jinkyoo Park, Kincho H.Law >
δu(d, r) = 2α · (R0 / (R0 + κd))² · exp( −(r / (R0 + κd))² )

δu(d, r) indicates how much the downstream turbine is affected
by the upstream turbine → the weighting factor W!
However, they tuned the parameters α, κ to the observed data.
Physics-induced Attention
32
[Figure: input graph 𝒢 with global features g, node features N0–N4, Node0 features, and Edge0,1 features]
Edge′4,1 = W4,1 × f(Edge4,1, Node4, Node1, g; θ0)
Let the neural network learn α, κ, R0!
W4,1 = f(some possible inputs; θ3)
f(some possible inputs; θ3) becomes f(r, d; α, κ, R0) = 2α · (R0 / (R0 + κd))² · exp( −(r / (R0 + κd))² )
Physics-induced Graph Neural Network On Wind Power Estimations
33
Graph Dense Layer
35
[Figure: updated node features N′0 and global features g′ feed the prediction network f(·; θ5), producing per-turbine powers P0–P4]
Prediction network f(·; θ5)
P0 = f(N′0; θ5)
Graph Dense Layer
36
[Figure: each updated node feature N′i is mapped by the shared prediction network to a power estimate Pi]
P0 = f(N′0; θ5)
Graph Dense Layer
37
[Figure: per-turbine power predictions P0–P4 from the updated node features]
P0 = f(N′0; θ5)
Graph Dense Layer
38
[Figure: updated graph with node features N′0 and global features g′]
P0 = f(N′0; θ5)
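A sketch of the graph dense layer: one shared network f(·; θ5) applied to every updated node feature. The two-layer MLP, its random weights, and the feature sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
node_feats = rng.normal(size=(5, 8))  # N'_0 ... N'_4, 8-dimensional each

# One shared prediction network f(.; theta5), here a tiny untrained MLP.
W1, b1 = rng.normal(size=(8, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 1)), np.zeros(1)

def f(n):
    h = np.maximum(0.0, n @ W1 + b1)  # ReLU hidden layer
    return h @ W2 + b2                # scalar power estimate

# P_i = f(N'_i; theta5): the same parameters are applied to every node,
# so the layer works for any number of turbines.
P = np.vstack([f(n) for n in node_feats])
```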
How to train your PGNN
39
[Figure: predicted turbine powers P̂0–P̂4 compared against the simulated powers P0–P4]
We use mean-squared error as the loss function of PGNN.
Lovely but Dreadful Exponential functions
40
f(x) = exp(x)
[Figure: the exp(x) curve; numerical under-flow for large negative x, numerical over-flow for large positive x]
Simple approximation for exponential functions
41
exp(x) := Σ_{k=0}^{∞} x^k / k! ≈ Σ_{k=0}^{D} x^k / k!

We set D = 5.
Downside of the power-series approximation
42
Question: "Why don't you use Taylor's expansion?"
Answer: "You may encounter the exponential again!"
The suggested approximation works
(relatively) properly when x is small.
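The truncated series (with D = 5, as in the deck) is straightforward to check; the error behaviour below illustrates the "small x" caveat:

```python
import math

def approx_exp(x, D=5):
    """Degree-D power-series approximation: sum_{k=0}^{D} x^k / k!."""
    return sum(x**k / math.factorial(k) for k in range(D + 1))

# Accurate for small |x|, but the error grows quickly as |x| grows.
small_err = abs(approx_exp(0.5) - math.exp(0.5))
large_err = abs(approx_exp(4.0) - math.exp(4.0))
```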
Scale-only normalization
43
Instead of using the raw downstream distance d and the radial wake distance r as inputs,

d′ = d / σ(d) × max(0, s_d),   r′ = r / σ(r) × max(0, s_r)

s_d, s_r are learnable parameters.
Dissect Scale-only normalization
44
Instead of using the raw downstream distance d and the radial wake distance r as inputs,

d′ = d / σ(d) × max(0, s_d)

(1) Why not subtract the mean?
→ We want the scaled values to be positive.
(2) What is max(0, s) for?
→ Since the s's are learnable parameters, without max(0, ·) they could become negative.
(3) How do you get σ(·)?
→ We employed an EWMA to get the μ(·), σ(·) estimates.
(4) Why multiply by max(0, s) again?
→ If no scaling was best, the network can then recover the original values.
The same intuition as Batch Normalization.
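Scale-only normalization can be sketched as below. The EWMA decay value, the initial s, and estimating σ from the raw second moment (no mean subtraction, so positive inputs stay positive) are illustrative assumptions.

```python
import numpy as np

class ScaleOnlyNorm:
    """x' = x / sigma(x) * max(0, s): scale, but never shift."""

    def __init__(self, s=1.0, decay=0.99):
        self.s = s                # learnable parameter (trained elsewhere)
        self.second_moment = 1.0  # EWMA estimate used for sigma
        self.decay = decay

    def __call__(self, x):
        self.second_moment = (self.decay * self.second_moment
                              + (1 - self.decay) * float(np.mean(x ** 2)))
        sigma = np.sqrt(self.second_moment)
        return x / sigma * max(0.0, self.s)  # max(0, s) clips a negative scale

norm = ScaleOnlyNorm()
d = np.array([120.0, 340.0, 560.0])
d_prime = norm(d)  # positive inputs remain positive
```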
Approximated weighting function
45
[Figure: the downstream-wake distance d and the radial-wake distance r pass through scale-only normalization to give normalized d and r, then through the approximated f_w(·; θ3) to produce the weight w]
Training Procedure
46
[Figure: sample a wind-farm layout; run power simulations with FLORIS to get P0–P4; encode the farm as a graph; the PGNN predicts P̂; compare P and P̂ with the MSE]
Sample s ~ U(5.0 m/s, 15.0 m/s), θ ~ U(0°, 360°); # turbines n ∈ {5, 10, 15, 20}
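The sampling step of this procedure can be sketched as follows. The layout bounds and the use of uniform turbine placement are assumptions, and the FLORIS simulation call itself is omitted (a real run would invoke the FLORIS package).

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_scenario():
    """Draw one training scenario as described in the slides."""
    n = rng.choice([5, 10, 15, 20])             # number of turbines
    coords = rng.uniform(0, 2000, size=(n, 2))  # layout in meters (assumed bounds)
    speed = rng.uniform(5.0, 15.0)              # wind speed S ~ U(5, 15) m/s
    direction = rng.uniform(0.0, 360.0)         # wind direction theta ~ U(0°, 360°)
    return coords, speed, direction

coords, speed, direction = sample_scenario()
```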
Generalization Tests
47
Generalization over environmental factors: wind speed S, wind direction θ
Generalization over wind farm layouts
Generalization Over Environmental Factors
48
[Figures: wind speed = 8.0 m/s; errors 0.0172 and 0.022]
Generalization Over Layouts
49
- Sample 20 wind farm layouts and estimate the average estimation error.
- Each layout has 20 wind turbines in it.
Qualitative Analysis on Physics-induced Bias
50
Edge′4,1 = W4,1 × f(Edge4,1, Node4, Node1, g; θ0), with W4,1 = f(inputs; θ3)
PGNN: f(inputs; θ3) = f(r, d; α, κ, R0) = 2α · (R0 / (R0 + κd))² · exp( −(r / (R0 + κd))² )
DGNN: f is another neural network
Qualitative Analysis on Physics-induced Bias
51
PGNN achieved an 11% smaller validation error than DGNN.
[Figures: inferred weights on the training data vs. out-of-distribution inputs]
Case Study on Inferred Weights
52
[Figures: inferred weight values; ignored edges]
Case Study on a Regularized Grid Layout
53
[Figures: Error = 0.0642 and Error = 0.0702]
Anyway the wind blows
Junyoung Park
SYSTEMS INTELLIGENCE Lab
Industrial and Systems Engineering (ISysE)
Normalizing powers
55