Building Electricity Demand Forecasting
SHUBHAM SAINI, PANDARASAMY ARJUNAN, AMARJEET SINGH
As part of the work done at Mobile and Ubiquitous Computing Group
May 29, 2015
OVERVIEW
The IIIT – Delhi campus has more than 200 smart meters installed, collecting around 10 electrical parameters every 30 seconds.
It is important to calculate an accurate consumption baseline and to monitor any deviations from it.
A forecasting pipeline is proposed for predicting the power consumption of an electric load at any given point of time.
Motivation
Energy consumption is increasing worldwide
India: energy forecasting has an important role in the formulation of effective energy policies
Electricity consumption analysis is useful for monitoring environmental issues
FORECASTING MODELS
Auto-Regressive Integrated Moving Average (ARIMA)
Artificial Neural Networks (ANN)
Hybrid ARIMA+ANN
EnerNOC
ARIMA (p,d,q)(P,D,Q)
(p,P) - number of lagged variables (non-seasonal, seasonal)
(d,D) - degree of differencing necessary to make the time series stationary
(q,Q) - moving average over the number of last observations
Where yt and εt are the actual value and the random error at time t
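The definitions above fit the usual non-seasonal ARMA(p, q) form applied to the differenced series. A sketch under that assumption follows; the coefficient symbols \phi_i and \theta_j and the constant \theta_0 are assumed notation, not taken from the slides.

```latex
y_t = \theta_0 + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p}
      + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \dots - \theta_q \varepsilon_{t-q}
```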
Artificial Neural Networks
Popular for flexible non-linear modeling
Single hidden layer feed-forward network
Where wj and wi,j are model parameters called connection weights, p is the number of input nodes and q is the number of hidden nodes.
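A single-hidden-layer feed-forward network with p lagged inputs and q hidden nodes is commonly written as below. This is a sketch consistent with the weights named above; the bias terms w_0 and w_{0,j}, the activation g, and the choice of a logistic g are assumptions.

```latex
y_t = w_0 + \sum_{j=1}^{q} w_j \, g\!\left( w_{0,j} + \sum_{i=1}^{p} w_{i,j} \, y_{t-i} \right) + \varepsilon_t,
\qquad g(x) = \frac{1}{1 + e^{-x}}
```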
Hybrid ARIMA+ANN
Power consumption is composed of a linear and a non-linear structure
Yt = Lt + Nt
ARIMA is able to model the linear component Lt, giving the forecast LFt
Residuals of the ARIMA model are modeled by the ANN, giving the forecast NFt
et = Yt - LFt
Final fitted value:
YFt = LFt + NFt
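The additive decomposition can be sketched in a few lines of Python (illustration only, not the authors' R code). To stay dependency-free, the sketch uses deliberately simple stand-ins: a least-squares AR(1) fit in place of ARIMA, and a regression on the squared previous value in place of the ANN. The function names and the synthetic series are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y ~ a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    b = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sxx
    return my - b * mx, b

def sse(actual, fitted):
    """Sum of squared errors between two aligned sequences."""
    return sum((a - f) ** 2 for a, f in zip(actual, fitted))

def hybrid_fit(y):
    """Return (linear, nonlinear, fitted), each aligned with y[1:]."""
    prev, cur = y[:-1], y[1:]
    # Step 1: linear component LFt (AR(1) stand-in for ARIMA).
    a, b = fit_line(prev, cur)
    linear = [a + b * p for p in prev]
    # Step 2: residuals of the linear model.
    resid = [c - l for c, l in zip(cur, linear)]
    # Step 3: non-linear component NFt fitted to the residuals
    # (quadratic-feature regression as a stand-in for the ANN).
    squared = [p ** 2 for p in prev]
    c0, c1 = fit_line(squared, resid)
    nonlinear = [c0 + c1 * s for s in squared]
    # Step 4: final fitted value YFt = LFt + NFt.
    fitted = [l + n for l, n in zip(linear, nonlinear)]
    return linear, nonlinear, fitted

# Hypothetical non-linear series (logistic map), not measured data.
y = [0.3]
for _ in range(40):
    y.append(3.7 * y[-1] * (1 - y[-1]))

linear, nonlinear, fitted = hybrid_fit(y)
```

Because the second stage is a least-squares fit with an intercept on the residuals, the in-sample error of the combined fit cannot exceed that of the linear stage alone.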
EnerNOC
Based on averaging the load on X days for each interval
D-3 12-1am 1-2am 2-3am 3-4am 4-5am
D-2 12-1am 1-2am 2-3am 3-4am 4-5am
D-1 12-1am 1-2am 2-3am 3-4am 4-5am
Event Day 12-1am 1-2am 2-3am 3-4am 4-5am
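The per-interval averaging can be sketched as follows, in Python for illustration (not the authors' R code). The function name and the load values are hypothetical.

```python
def interval_average_baseline(history):
    """history: list of days, each a list of per-interval loads of equal length.
    Returns the mean load of each interval across the given days."""
    n_days = len(history)
    n_intervals = len(history[0])
    return [sum(day[k] for day in history) / n_days for k in range(n_intervals)]

# Hypothetical hourly loads (kW) for 12-1am ... 4-5am.
history = [
    [10.0, 9.0, 8.0, 8.0, 9.0],    # D-3
    [12.0, 9.0, 7.0, 8.0, 10.0],   # D-2
    [11.0, 9.0, 9.0, 8.0, 11.0],   # D-1
]
baseline = interval_average_baseline(history)  # forecast for the event day
```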
Prediction Pipeline
Multiple models can be learned by using different sub-models at each of these stages.
Initial Parameters - Granularity
Very high resolution data available, sampled every 30 seconds
Intervals that are too small or too large are both detrimental to a model's performance
Experimented with 1-hour, 30-minute, and 15-minute intervals
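Changing the granularity amounts to aggregating the 30-second samples into coarser intervals. A minimal Python sketch follows (not the authors' R code); it assumes mean aggregation, which the slides do not specify, and uses hypothetical data.

```python
from statistics import mean

def resample_mean(samples, sample_seconds, interval_minutes):
    """Average fixed-rate samples into coarser intervals.

    Returns one mean per complete interval; a trailing partial
    interval is dropped.
    """
    per_interval = interval_minutes * 60 // sample_seconds
    n_complete = len(samples) // per_interval
    return [
        mean(samples[i * per_interval:(i + 1) * per_interval])
        for i in range(n_complete)
    ]

# Hypothetical data: one day of 30-second readings (2880 samples).
day = [100.0 + (i % 120) for i in range(2880)]
hourly = resample_mean(day, 30, 60)
half_hourly = resample_mean(day, 30, 30)
quarter_hourly = resample_mean(day, 30, 15)
```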
Initial Parameters – Forecast Horizon
The forecast horizon is the number of data points a model forecasts into the future.
Days may be divided into working/non-working hours, day/night hours, or peak/off-peak hours.
SELECTION OF SIMILAR (Y) DAYS
Criteria: Previous Business Days, Previous Same Days
Lookback window: 4, 7, 10
7 similar days
14 similar days
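The two similarity criteria can be sketched in Python (illustration only, not the authors' R code). The sketch assumes that business days are Monday to Friday and ignores holidays, and that "same days" means the same weekday in earlier weeks; both function names are hypothetical.

```python
from datetime import date, timedelta

def previous_business_days(event_day, lookback):
    """Return the `lookback` most recent Mon-Fri dates before event_day."""
    days, d = [], event_day
    while len(days) < lookback:
        d -= timedelta(days=1)
        if d.weekday() < 5:  # 0 = Monday ... 4 = Friday
            days.append(d)
    return days

def previous_same_days(event_day, lookback):
    """Return the same weekday from each of the previous `lookback` weeks."""
    return [event_day - timedelta(weeks=k) for k in range(1, lookback + 1)]

event_day = date(2014, 3, 13)  # a Thursday
business = previous_business_days(event_day, 4)
same = previous_same_days(event_day, 4)
```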
Sub-sampling (X) Days
Criteria:
High X Days: makes sense for demand-response
Excluding Highest and Lowest Days: anomalies could be due to load failure, a holiday, unpredicted occupancy, etc.
X:Y = 8:10
X:Y = 6:10
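Both sub-sampling criteria reduce the Y similar days to X days by ranking them. A minimal Python sketch follows (not the authors' R code); ranking days by total load is an assumption, and the function names and data are hypothetical.

```python
def high_x_of_y(daily_loads, x):
    """Keep the x days with the highest total load.

    daily_loads: dict mapping a day label to its list of interval loads.
    """
    ranked = sorted(daily_loads, key=lambda d: sum(daily_loads[d]), reverse=True)
    return ranked[:x]

def drop_extremes(daily_loads, n_each=1):
    """Drop the n_each highest and n_each lowest days by total load."""
    ranked = sorted(daily_loads, key=lambda d: sum(daily_loads[d]))
    return ranked[n_each:len(ranked) - n_each]

# Hypothetical daily totals for Y = 10 similar days.
totals = [50, 42, 47, 90, 45, 44, 12, 46, 48, 43]
loads = {f"D-{k}": [float(v)] for k, v in zip(range(1, 11), totals)}

high_8 = high_x_of_y(loads, 8)           # X:Y = 8:10
trimmed_8 = drop_extremes(loads, 1)      # X:Y = 8:10
trimmed_6 = drop_extremes(loads, 2)      # X:Y = 6:10
```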
Adjustments – ARIMA+ANN
Training data used to forecast future values includes an additional 2-4 hours of data from the event day.
For example, in order to forecast consumption on the event day for 12PM - 5PM, we use 10AM - 5PM data on the X similar days, as well as 10AM - 12PM data on the event day.
These additional data more accurately reflect load conditions on the event day.
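Assembling the training series for the 12PM - 5PM example can be sketched as follows, in Python for illustration (not the authors' R code). Hourly granularity, the function name, and the load values are assumptions.

```python
def build_training_series(similar_days, event_day_so_far, start_hour, end_hour):
    """Concatenate the [start_hour, end_hour) window of each similar day
    (oldest first), then append the event-day readings observed so far."""
    series = []
    for day in similar_days:
        series.extend(day[start_hour:end_hour])
    series.extend(event_day_so_far)
    return series

# Hypothetical hourly loads: X = 3 similar days of 24 values each.
similar_days = [[float(100 * d + h) for h in range(24)] for d in (1, 2, 3)]
# Event-day readings for 10AM-11AM and 11AM-12PM.
event_morning = [410.0, 411.0]

training = build_training_series(similar_days, event_morning,
                                 start_hour=10, end_hour=17)
```

A model fitted on `training` would then forecast the next five hourly points, covering 12PM - 5PM on the event day.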
Adjustments - EnerNOC
To adjust the forecasted value of a time interval, for example 12PM - 1PM, the adjustment is computed at 11AM.
The mean difference between the actual values and the forecasted values between 8AM and 11AM is added to the 12PM - 1PM forecasted value (a negative mean lowers the forecast).
Event-day data are not always available!
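The additive adjustment can be sketched in Python (illustration only, not the authors' R code). Hourly intervals keyed by their starting hour, the function name, and the values are assumptions.

```python
def additive_adjustment(actual, forecast, window):
    """Mean of (actual - forecast) over the interval keys in `window`."""
    diffs = [actual[h] - forecast[h] for h in window]
    return sum(diffs) / len(diffs)

# Hypothetical hourly values keyed by starting hour (8 means 8AM-9AM).
forecast = {8: 40.0, 9: 42.0, 10: 45.0, 12: 50.0}
actual = {8: 43.0, 9: 44.0, 10: 49.0}  # event-day readings known at 11AM

offset = additive_adjustment(actual, forecast, window=[8, 9, 10])  # 8AM-11AM
adjusted_noon = forecast[12] + offset  # adjusted 12PM-1PM forecast
```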
Results
Brute-force approach to find optimal parameters
Over 700 different combinations of parameters tested
Varying parameters:
1. No. of similar days - 4, 7, 10
2. Similarity Criteria - Previous Business Days, Previous Same Days
3. Sub-sampling: High X of Y
4. X:Y Ratio - 6:10, 8:10
5. Models - Hybrid ARIMA+ANN, EnerNOC, Adjusted EnerNOC
6. Time Duration - 12AM - 12AM, 12AM - 7AM, 7AM - 12PM, 12PM - 5PM
7. Dates - 13-March-2014, 11-March-2014, 5-March-2014, 3-March-2014, 28-February-2014
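The brute-force search is a full Cartesian product over the listed parameter values. A minimal Python sketch follows (not the authors' R code); the dictionary keys and the `best_configuration` helper are hypothetical, and the scoring function is left to the caller because the slides do not name the error metric.

```python
from itertools import product

grid = {
    "similar_days": [4, 7, 10],
    "criteria": ["Previous Business Days", "Previous Same Days"],
    "subsampling": ["High X of Y"],
    "x_to_y": ["6:10", "8:10"],
    "model": ["Hybrid ARIMA+ANN", "EnerNOC", "Adjusted EnerNOC"],
    "duration": ["12AM-12AM", "12AM-7AM", "7AM-12PM", "12PM-5PM"],
    "date": ["2014-03-13", "2014-03-11", "2014-03-05", "2014-03-03", "2014-02-28"],
}

# One dict per combination: 3 * 2 * 1 * 2 * 3 * 4 * 5 combinations in total.
combos = [dict(zip(grid, values)) for values in product(*grid.values())]

def best_configuration(combos, score):
    """Return the combination with the lowest score (e.g. a forecast error)."""
    return min(combos, key=score)
```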
Results (Contd.)
Load #1: Academic Building - Floor Total - First Floor
Sample Result
Load #1: Academic Building - Floor Total - First Floor
Number of Similar Days (Y) – 7
X : Y Ratio – 0.8
Similarity Criteria - Previous Same Days
Time Duration - 12AM – 7AM
Model - Adjusted EnerNOC
Implementation
Developed using the R language for statistical computing, version 3.0 (RStudio IDE)
Reasons for choosing R over other statistical computing languages like Matlab are:
1. Free and Open-Source
2. Graphics and Data Visualization
3. Flexible statistical analysis toolkit
4. Powerful, cutting-edge analytics
5. Robust, vibrant community
UI Design and Layout
GUI for simple data visualization using Shiny web framework v0.98
Tab layout with a sidebar
Sidebar contains options to set the forecasting parameters
Main window - training data, and output of various forecasting models
Time-Series clustering (In Progress)
Global features extracted from the time series through statistical operations:
trend, seasonality, periodicity, serial correlation, skewness, kurtosis, chaos, nonlinearity, self-similarity
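A few of these global features can be computed directly from a series. The Python sketch below (not the authors' R code) covers only trend, serial correlation, skewness, and kurtosis. The specific definitions (least-squares slope, lag-1 autocorrelation, population moments) are assumptions, and the series must not be constant.

```python
from statistics import mean, pstdev

def global_features(series):
    """Return a small dict of global features for a non-constant series."""
    n = len(series)
    mu = mean(series)
    sigma = pstdev(series)
    z = [(v - mu) / sigma for v in series]  # standardized values
    t_mean = (n - 1) / 2
    slope = (
        sum((i - t_mean) * (v - mu) for i, v in enumerate(series))
        / sum((i - t_mean) ** 2 for i in range(n))
    )
    return {
        "trend": slope,                                   # least-squares slope
        "serial_correlation": sum(z[i] * z[i + 1] for i in range(n - 1)) / n,
        "skewness": sum(v ** 3 for v in z) / n,
        "kurtosis": sum(v ** 4 for v in z) / n - 3.0,     # excess kurtosis
    }

features = global_features([1.0, 2.0, 3.0, 4.0, 5.0])
```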
Time-Series clustering (In Progress)
Clustering – K-Means or Hierarchical
Using global characteristics, group all available streams into an optimal number of clusters
For each cluster, find the optimal forecasting model (through the prediction pipeline)
For any new stream – assign the stream to one of the clusters and apply the optimal forecasting model
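The cluster-then-assign workflow can be sketched with a bare-bones K-Means in Python (illustration only, not the authors' R code). The number of clusters is taken as given, the initialization simply uses the first k points, and the feature vectors and per-cluster model choices are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))

def k_means(points, k, iterations=20):
    """Plain K-Means with a naive initialization (first k points)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

# Hypothetical two-feature vectors, one per meter stream.
streams = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9], [0.1, 0.2], [4.9, 5.0]]
centroids, labels = k_means(streams, k=2)

# Hypothetical outcome of running the prediction pipeline per cluster.
best_model = {0: "Adjusted EnerNOC", 1: "Hybrid ARIMA+ANN"}
new_stream = [4.8, 5.2]
model_for_new_stream = best_model[nearest(new_stream, centroids)]
```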
Questions ???