AN ANALYSIS OF SHORT-TERM LOAD FORECASTING ON RESIDENTIAL
BUILDINGS USING DEEP LEARNING MODELS
SREERAG SURESH
THESIS SUBMITTED TO THE FACULTY OF THE VIRGINIA POLYTECHNIC
INSTITUTE AND STATE UNIVERSITY IN PARTIAL FULFILLMENT OF THE
REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
IN
ENVIRONMENTAL ENGINEERING
FARROKH JAZIZADEH KARIMI, CHAIR
LINSEY C MARR
GABRIEL ISAACMAN-VANWERTZ
MAY 21ST, 2020
BLACKSBURG, VIRGINIA
KEYWORDS: LOAD FORECASTING, BUILDING ENERGY, CNN, DEEP
LEARNING, LSTM
Copyright @ 2020, Sreerag Suresh
AN ANALYSIS OF SHORT-TERM LOAD FORECASTING ON RESIDENTIAL
BUILDINGS USING DEEP LEARNING MODELS
SREERAG SURESH
ABSTRACT
Building energy load forecasting is becoming an increasingly important task with the
rapid deployment of smart homes, the integration of renewables into the grid and the
advent of decentralized energy systems. Residential load forecasting remains a
challenging task because residential load is highly stochastic. Deep learning models
have shown tremendous promise on time-series and sequential data and have been used
successfully for short-term load forecasting at the building level. Although other
studies have applied deep learning models to building energy forecasting, most have
examined only a limited number of homes or the aggregate load of a collection of homes.
This study aims to address that gap and serves as an investigation into selecting the
better deep learning model architecture for short-term load forecasting on 3
communities of residential buildings. CNN and LSTM deep learning models are used in
the study. For 15-min-ahead forecasting across a collection of homes, it was found
that homes with higher variance were better predicted by CNN models, while LSTM models
performed better for homes with lower variance. The effect of adding weather variables
on 24-hour-ahead forecasting was also studied, and adding weather parameters did not
improve forecasting performance. In all the homes, the deep learning models are shown
to outperform a simple ANN model.
AN ANALYSIS OF DEEP LEARNING MODELS FOR SHORT TERM LOAD
FORECASTING ON RESIDENTIAL BUILDINGS
SREERAG SURESH
GENERAL AUDIENCE ABSTRACT
Building energy load forecasting is becoming an increasingly important task with the
rapid deployment of smart homes, the integration of renewables into the grid and the
advent of decentralized energy systems. Residential load forecasting remains a
challenging task because residential load is highly stochastic. Deep learning models
have shown tremendous promise on time-series and sequential data and have been used
successfully for short-term load forecasting. Although other studies have applied deep
learning models to building energy forecasting, most have examined only a single home
or the aggregate load of a collection of homes. This study aims to address that gap
and serves as an analysis of short-term load forecasting on 3 communities of
residential buildings. Model performance is analyzed in detail across all homes. Deep
learning models are used in this study, and their efficacy is measured against a
simple ANN model.
ACKNOWLEDGMENTS
I would like to express my sincere gratitude to my advisor, Dr. Farrokh Jazizadeh for
his constant support throughout this study. I would not have been able to complete this
research without his guidance. I would also like to express my gratitude to Dr. Linsey
Marr and Dr. Gabriel Isaacman-VanWertz for serving on my committee and providing
their valuable insights.
I am extremely grateful to all my colleagues at Virginia Tech, especially all members at
the INFORM lab for their constant support. I would also like to express my sincere
gratitude to Cristiano Ronaldo, Kobe Bryant (RIP Mamba) and Uzumaki Naruto. They
have been an immense source of inspiration and constantly motivate me to do my best.
I am thankful to my friends Manu Krishnan, Amal Shaj, Nevedita Sankararaman,
Venkatesh Modi and Prachi Jain for providing me their valuable feedback and helping
me improve my thesis.
I would like to dedicate this thesis to my parents Mr. Suresh Babu and Mrs. Deepa
Convolutional Neural Networks belong to a class of deep learning networks used for
processing data with a grid-like topology [49]. This includes time-series data and
image data, which can be thought of as 1-D and 2-D data grids respectively. They have
been used successfully in computer vision, human activity recognition, natural
language processing, drug discovery, time-series forecasting and more
([51],[8],[52],[53],[40]). A CNN uses a specialized linear mathematical operation
called convolution in at least one of its layers [49]. In CNNs, the convolution
operation is performed by repeatedly applying filters, or kernels, to the input data
to obtain a feature map.
Three different operations take place in a convolutional layer. The first operation,
described above, produces the feature map. The second step is activation of the
elements in the feature map using a nonlinear activation function, most commonly ReLU,
the rectified linear unit [49]. In the third step, a pooling operation is used to
smooth and reduce the dimensions of the feature map. Max pooling is used in this
study: it returns the maximum value within each rectangular neighborhood of the
previous layer's output [49]. A CNN may consist of one or more convolutional layers.
The output of the convolutional layers is then passed to the hidden, or fully
connected, layers. The output layer follows the hidden layers and performs a role
identical to that of an output layer in a conventional neural network [37].
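The three operations can be illustrated with a minimal NumPy sketch. The series and kernel values below are made up purely for illustration; the thesis models use Keras layers rather than this hand-rolled code.

```python
import numpy as np

def conv1d(x, kernel):
    """Valid 1-D convolution (cross-correlation, as in CNN layers): slide the
    kernel over the input and take a dot product at each position."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

def relu(x):
    """Nonlinear activation applied element-wise to the feature map."""
    return np.maximum(x, 0.0)

def max_pool(x, size):
    """Max pooling: keep only the maximum within each window of `size` steps."""
    return np.array([x[i:i + size].max() for i in range(0, len(x) - size + 1, size)])

# A short toy load series (kW) and a 3-step kernel that responds to rising edges.
series = np.array([0.2, 0.3, 1.5, 1.4, 0.2, 0.1, 1.6, 1.7])
kernel = np.array([-1.0, 0.0, 1.0])

fmap = relu(conv1d(series, kernel))   # feature map after activation
pooled = max_pool(fmap, 2)            # down-sampled feature map
```

The three stages correspond exactly to the operations above: convolution produces the feature map, ReLU zeroes out the negative responses, and max pooling halves its length.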
Figure 4 : CNN architecture for time-series data²
3.5 Theoretical Background – Multilayer Perceptrons (MLPs)
Multilayer perceptrons, also known as feedforward or artificial neural networks, are
the archetypal deep learning models. They are powerful machine learning models used
for learning non-linear relationships within data and are highly flexible universal
approximators [54]. They are extremely useful to machine learning practitioners and
form the basis for many commercial machine learning applications [49]. They have been
used successfully in load forecasting and other time-series applications [2].
At a high level, a simple neural network consists of an input layer, a hidden layer
and an output layer. Unlike recurrent neural networks, they have no feedback
connections through which model outputs are fed back into the model [49]. Networks
with just one hidden layer, also known as vanilla artificial neural networks, are used
in this study. Detailed information on the workings of the multilayer perceptron can
be found in the literature [49].
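A forward pass through such a one-hidden-layer network can be sketched in NumPy. The 32-16-1 shapes mirror the MLP-1 model described later in Section 4.4; the weight values here are random placeholders, not trained parameters.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """Vanilla one-hidden-layer MLP: input -> ReLU hidden layer -> linear output."""
    h = np.maximum(x @ W1 + b1, 0.0)   # hidden layer with ReLU activation
    return h @ W2 + b2                  # single linear output unit

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 32))            # batch of 4 samples, 32 lagged inputs each
W1 = 0.1 * rng.normal(size=(32, 16))    # input layer: 32 units -> 16 hidden units
b1 = np.zeros(16)
W2 = 0.1 * rng.normal(size=(16, 1))     # hidden layer: 16 units -> 1 output unit
b2 = np.zeros(1)

y_hat = mlp_forward(x, W1, b1, W2, b2)  # one forecast value per sample
```

There are no feedback connections: the data flow strictly forward from input to output, which is what distinguishes this architecture from the recurrent networks discussed earlier.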
Figure 5 : Simple multilayer perceptron¹
3.5.1 Optimization Algorithm
In this study, all models use the Adam optimization algorithm to optimize the weights
of each layer. This adaptive learning-rate optimization algorithm converges more
quickly than traditional SGD [55]. It is a first-order gradient-based optimization
algorithm that is intuitive, computationally efficient and well suited to optimizing
models with a large number of parameters. Unlike stochastic gradient descent, which
naively updates the weights with a constant learning rate, Adam computes individual
adaptive learning rates from the moments of the gradients. Further details about the
Adam optimization algorithm can be found in the literature [55].
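A single Adam update can be sketched directly from the update rule in [55]. This is an illustration of the algorithm, not the Keras implementation used in the study; the learning rate here is set larger than the paper's 0.001 default so the toy run converges quickly.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: per-parameter step sizes adapted from exponential
    moving averages of the first and second gradient moments."""
    m = beta1 * m + (1 - beta1) * grad          # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)                # bias-corrected moment estimates
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Minimize the toy objective f(w) = w^2 (gradient 2w); Adam drives w toward 0.
w, m, v = 5.0, 0.0, 0.0
for t in range(1, 5001):
    w, m, v = adam_step(w, 2.0 * w, m, v, t)
```

The division by the square root of the second moment is what gives each parameter its own effective learning rate, in contrast to the single constant rate of plain SGD.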
1,2 These figures are generated using the web application: http://alexlenail.me/NN-SVG/index.html
3.5.2 Regularization
Machine learning models often suffer from overfitting, which results in testing
errors much worse than errors on the training data. This occurs when the model fits
the training data too well, resulting in high variance and low bias. Strategies that
decrease the testing error, sometimes at the cost of increased training error, are
known as regularization [49]. In this study, weight-decay regularization is used to
address overfitting. The regularization parameter lambda is set to 0.01, the default
value in Keras [56].
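The effect of the weight-decay penalty can be sketched with a linear model. This is an illustrative ridge-regression example using lambda = 0.01, not code from the thesis: the penalty term 2·lambda·w in the gradient pulls every weight toward zero at each update.

```python
import numpy as np

def ridge_grad(w, X, y, lam=0.01):
    """Gradient of MSE + lam * ||w||^2: the 2*lam*w term is the weight decay
    that shrinks the weights and discourages overfitting."""
    n = len(y)
    return 2.0 * X.T @ (X @ w - y) / n + 2.0 * lam * w

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
true_w = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ true_w + rng.normal(scale=0.1, size=100)  # noisy linear data

w = np.zeros(5)
for _ in range(500):                 # plain gradient descent on the penalized loss
    w -= 0.05 * ridge_grad(w, X, y)
```

With lambda this small the shrinkage is mild, so the fitted weights stay close to the true ones while still being regularized away from fitting the noise.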
4. Evaluation Method
4.1 Data Collection and Characteristics
4.1.1 Residential Homes Data
The residential building data used in the study are obtained from the Pecan Street
Inc. Dataport [57] for Austin, New York and California. The dataset is publicly
available and was downloaded using the free student license. One year of
15-min-frequency load data is used for the study, with 25 homes from Austin, 24 homes
from New York and 23 homes from California selected. All the Austin homes cover the
range 1 January 2018 to 1 January 2019, whereas each California home contributes one
year of data from anywhere between 1 January 2014 and 1 January 2019. The New York
data span 1 January 2019 to 31 October 2019.
4.1.2 Weather Data
Weather data are used only for the city of Austin. One year of weather data, from
1 January 2018 to 1 January 2019, is obtained from openweathermap.org [58]. The
weather data consist of temperature, humidity and atmospheric pressure.
4.2 Data Description and Pre-processing
A total of 72 homes (25, 24 and 23) from Austin, New York and California are initially
selected. All 72 homes are checked for missing values, since the deep learning models
cannot run with missing values present in the dataset. None of the California homes
have missing values, whereas several homes in Austin do. In the case of New York,
large sections of the data are missing, so that collection of homes is analyzed as a
separate case study.
4.2.1 Austin
The home types and missing-value statistics for the city of Austin are provided in
the table below. Homes in Austin with more than 0.5 % missing values (approximately
170 missing values out of 35,000) are omitted, and homes with missing values below
0.5 % are filled by linear interpolation. The houses omitted from Austin are
highlighted in orange in the table below. This leaves 20 homes from Austin (after
interpolation). All the Austin homes belong to the same building type, with a few
homes having solar generation capacity.
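The screening and filling steps might look as follows in pandas. The series below is a made-up toy gap, not Pecan Street data; only the 0.5 % screen and linear interpolation reflect the procedure described above.

```python
import numpy as np
import pandas as pd

# Hypothetical 15-min load series with short gaps, standing in for one home.
idx = pd.date_range("2018-01-01", periods=8, freq="15min")
load = pd.Series([1.2, 1.4, np.nan, np.nan, 2.0, 1.8, np.nan, 1.6], index=idx)

pct_missing = 100 * load.isna().mean()      # screen: homes above 0.5 % are dropped
filled = load.interpolate(method="linear")  # otherwise fill gaps linearly
```

On the real data the percentage is computed over the full year (about 35,000 readings), and only homes passing the 0.5 % screen are interpolated and kept.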
Table 2 : List of residential buildings from Austin. Source: Pecan Street Inc, Dataport[57]
House ID % Missing Values Building Type Solar Available
661 0.95 Single-Family Home Yes
1642 0.52 Single-Family Home Yes
2335 0 Single-Family Home Yes
2361 0 Single-Family Home Yes
2818 0 Single-Family Home Yes
3039 1.5 Single-Family Home No
3456 0.01 Single-Family Home Yes
3538 0 Single-Family Home Yes
4031 0.01 Single-Family Home Yes
4373 0.01 Single-Family Home Yes
4767 0 Single-Family Home Yes
5746 0.38 Single-Family Home No
6139 0 Single-Family Home Yes
7536 0.01 Single-Family Home Yes
7719 0 Single-Family Home Yes
7800 0.06 Single-Family Home Yes
7901 0.01 Single-Family Home No
7951 0 Single-Family Home No
8156 0.01 Single-Family Home Yes
8386 0.01 Single-Family Home No
8565 0 Single-Family Home No
9019 0 Single-Family Home Yes
9160 0 Single-Family Home Yes
9278 7.08 Single-Family Home No
9922 4.26 Single-Family Home Yes
Figure 6 : 1 year of load data for a home (with solar, ID=2361) in Austin.
Figure 7 : 1-week of load data for a home (with solar, ID=2361) in Austin.
4.2.2 California
All 23 homes in California contained no missing values, but they consist of different
building types: single-family homes, townhomes and apartments. Except for one home,
none of the homes in California has solar generation capacity. The data for the homes
are given below.
Table 3 : List of residential buildings from California. Source: Pecan Street Inc, Dataport[57]
House ID Missing Values Building Type Solar Available
203 0 Single-Family Home No
1450 0 Town Home No
1524 0 Single-Family Home No
1731 0 Town Home No
2606 0 Town Home No
3687 0 Town Home No
3864 0 Town Home No
3938 0 Apartment No
4495 0 Apartment No
4934 0 Town Home No
5938 0 Town Home No
6377 0 Apartment No
6547 0 Town Home No
7062 0 Town Home No
7114 0 Town Home No
8061 0 Town Home No
8342 0 Town Home No
8574 0 Apartment No
8733 0 Apartment No
9213 0 Apartment No
9612 0 Town Home No
9775 0 Apartment No
9836 0 Town Home Yes
Figure 8 : 1 year of load data for a home (no solar, ID = 1450) in California
Figure 9 : 1 week of load data for a home (no solar, ID = 1450) in California
4.2.3 New York
In the case of the New York dataset, it was found that for all 24 homes the available
data cover the period between January 1, 2019 and October 31, 2019, with large
portions of the data missing over similar spans of dates. This can be observed in
Figure 10 for home ID=914; other homes show a similar pattern. All the homes are of
the single-family type, with a few homes having solar generation capacity.
Table 4 : List of residential buildings from New York. Source: Pecan Street Inc, Dataport[57]
House ID Missing Values Building Type Solar Available
27 0 Single-Family Home Yes
387 0 Single-Family Home Yes
558 0 Single-Family Home No
914 0 Single-Family Home Yes
950 0 Single-Family Home Yes
1222 0 Single-Family Home Yes
1240 0 Single-Family Home No
1417 0 Single-Family Home No
2096 0 Single-Family Home No
2318 0 Single-Family Home No
2358 0 Single-Family Home No
3000 0 Single-Family Home Yes
3488 0 Single-Family Home Yes
3517 0 Single-Family Home Yes
3700 0 Single-Family Home No
3996 0 Single-Family Home No
4283 0 Single-Family Home No
4550 0 Single-Family Home No
5058 0 Single-Family Home Yes
5587 0 Single-Family Home Yes
5679 0 Single-Family Home Yes
5982 0 Single-Family Home No
5997 0 Single-Family Home Yes
9053 0 Single-Family Home No
Figure 10 : 1 year of load data for a home (with solar, ID=914) in New York
Figure 11 : 1-week of load data for a home (with solar, ID=914) in New York
4.3 Feature Engineering
To test the efficacy of the multistep (24-hour ahead) forecasting models with
different combinations of features (multivariate forecasting), features had to be
manually added to the load data. This experiment is carried out only for a single home
in Austin (ID=2361), and 24-hour-ahead prediction is done for both grid and use data.
Both the grid and use data were rescaled from 15-min frequency to 1-hour frequency,
since the weather data were available only at the latter frequency. Weather-based
features (temperature, humidity, pressure) and time-based features (day of the week,
weekend/day, holiday) are used in this study. All these features cover the range
1 January 2018 to 1 January 2019. The weather and holiday data, obtained from external
sources, are manually appended to the load data. The 'day of the week' and
'weekend/day' features are constructed from the datetime index in the load file. For
the 'day of the week' feature, values of 0 to 6 are assigned for Monday through
Sunday. For the 'weekend/day' feature, values of 1 and 0 are assigned for weekdays and
weekends, respectively.
The above-mentioned features were added to the rescaled load data, and the Pearson
correlations between the load data and the features were studied. The time-based
features, i.e. 'day of the week' and 'weekend/day', show close to zero correlation
with the load data. Similarly, the pressure variable shows minimal correlation with
the load data. These features are therefore not considered for the multivariate
multistep forecast study.
4.4 Implementation Setup
All models are developed using the Keras API running on a TensorFlow 1.0
backend [59]. The analysis is done in the Google Colab (Colaboratory) online
environment, which provides access to external graphics processing units. Seven
different models are used to carry out the multiple-home analysis, comprising a
multilayer perceptron and LSTM and CNN networks, named MLP-1, CNN-1, CNN-2, CNN-3,
CNN-4, LSTM-1 and LSTM-2 respectively. For all models, 70 % of the data is used for
training and 30 % for testing. The MLP-1 model consists of 3 layers: 32 units in the
input layer, 16 units in the hidden layer and a single output unit. The architectures
of the other models are provided in the tables below.
Table 5 : Correlation with 'Grid' data
Feature Correlation with 'Grid'
Temperature 0.424
Humidity -0.335
Pressure -0.169
'Day of the week' -0.0049
'Weekend/day' -0.0038

Table 6 : Correlation with 'Use' data
Feature Correlation with 'Use'
Temperature 0.479
Humidity -0.313
Pressure -0.168
'Day of the week' -0.0068
'Weekend/day' -0.0054
Table 8 : CNN model architectures used for multiple home analysis.
CNN Model Filters Kernel Sizes Pooling Filters Hidden Layers
1 [32] [[3]] [2] [1]
2 [64,32] [[3,3]] [2,2] [1]
3 [32,32,64,64,32,32] [[3,3,3,3,3,3,3]] [2,2] [1]
4 [32,8] [[4,4]] [3] [1]
Table 9 : LSTM architectures used for multiple home analysis.
LSTM Model LSTM Units Dropout Hidden Layers
1 [30] 0.2 1
2 [30,15] 0.2,0.2 1
Figure 12 : Code block of CNN-4 architecture compiled in Python in Google Colab
4.5 Evaluation Metric
To assess the efficacy of the models, the accuracy metric RMSE, or root mean squared
error, is used. It is a scale-dependent accuracy metric:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(y_{\mathrm{pred},i} - y_{\mathrm{act},i}\right)^{2}}{n}}$$

where
y_pred: the predicted values
y_act: the actual values
n: the number of samples.
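In code, the formula above is a direct NumPy translation:

```python
import numpy as np

def rmse(y_pred, y_act):
    """Root mean squared error: scale-dependent, reported in the units of the
    load itself (kW in this study)."""
    y_pred, y_act = np.asarray(y_pred, float), np.asarray(y_act, float)
    return np.sqrt(np.mean((y_pred - y_act) ** 2))

error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # per-sample errors: 0, 0, -2
```

Being scale-dependent, RMSE values are comparable across models on the same home but not directly across homes with very different load magnitudes.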
5. Results and Discussion
5.1 Introduction
This study was conducted to answer the research questions stated at the end of
Section 1. This section describes the results of the experiments for the two major
research questions: evaluating the need for individual deep learning models for each
home within a collection of homes, and evaluating multistep-ahead (24-hour ahead)
forecasting for an individual home using different features.
5.2 Multiple Home Analysis
In this experiment, 7 different models comprising ANN, CNN and LSTM architectures are
used for single-step (15-min-ahead) univariate forecasting across all the homes in
the 3 locations, i.e. Austin, California and New York, from the Pecan Street data.
The models are trained and tested on individual homes, and forecasting performance is
evaluated using RMSE values. To answer the research question, the RMSE values of all
homes for all models are tabulated, and the overall best model is identified as the
one with the minimum average RMSE. Then, for each home, the overall best model's RMSE
is compared with all other models' RMSEs to check for any significant difference from
the best model. A significant difference in RMSE would indicate the need for a
separate model for that particular home. The multiple-home analysis is done as 2
separate case studies: one for Austin and California, which after preprocessing
contain no missing values, and a second for New York, where all the homes contain
significant chunks of missing data.
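The selection procedure above can be sketched as follows. The RMSE values and home/model names here are hypothetical placeholders; the actual values are tabulated in Table 10 and its counterparts.

```python
import numpy as np
import pandas as pd

# Hypothetical test-RMSE table (kW): rows are homes, columns are models.
rmse = pd.DataFrame(
    {"MLP-1": [0.52, 0.61, 0.70],
     "CNN-1": [0.45, 0.58, 0.66],
     "LSTM-2": [0.44, 0.55, 0.68]},
    index=["home_A", "home_B", "home_C"],
)

best_model = rmse.mean().idxmin()        # overall best: minimum average RMSE
best_rmse = rmse[best_model]             # that model's RMSE on each home
per_home_min = rmse.min(axis=1)          # best achievable RMSE per home
pct_diff = 100 * (best_rmse - per_home_min) / per_home_min
needs_own_model = pct_diff > 5           # significant (> 5 %) gap flags a home
```

A home is flagged only when the overall best model trails that home's best model by more than 5 % in RMSE; unflagged homes can share the single overall model.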
5.2.1 Austin and California
5.2.1.1 Austin
The 7 model architectures are trained and run on each of the 20 homes in Austin. The
LSTM-2 architecture shows the best overall performance, whereas MLP-1 shows the worst
among the 7 models. It is also observed that increasing model complexity by adding
more layers, for both CNN and LSTM, does not yield any significant improvement in
forecasting performance. The table below shows the test RMSE values obtained for all
7 models on the Austin homes for single-step (15-min-ahead) predictions. The best
overall model (LSTM-2) is compared with every other model's RMSE for each home, and
the percentage difference between the best overall model's RMSE and the minimum RMSE
for that home is noted in the table below. For the Austin dataset, only 3 homes show
a significant difference (> 5 % in RMSE) between the best overall model and the
minimum RMSE for that home.
Table 10 : Test RMSE (in kW) values of 20 homes in Austin.
5.3 Multistep-Ahead Forecasting

In this experiment, the effect of adding weather-based variables on multistep-ahead
(24-hour ahead) forecasting is studied. The experiments are carried out on a single
home in Austin (ID=2361) with solar generation capacity, using 1-hour-frequency data.
The 24-hour-ahead forecasting is done for both the 'grid' data and the 'electricity
use' data for the home. After trial and error, a best CNN model is identified for the
multistep forecast, and the effects of adding the weather features and changing the
length of the sliding window are studied in this section.
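Framing the hourly series into sliding lookback windows and 24-step-ahead targets can be sketched as follows. `make_windows` is an illustrative helper, not code from the thesis; the lookback of 96 hours is one of the values explored later.

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Frame a 1-D series as supervised pairs: (samples, lookback) inputs and
    (samples, horizon) multistep targets, via a sliding window."""
    X, Y = [], []
    for i in range(len(series) - lookback - horizon + 1):
        X.append(series[i:i + lookback])                       # input window
        Y.append(series[i + lookback:i + lookback + horizon])  # next `horizon` steps
    return np.array(X), np.array(Y)

hourly = np.arange(400, dtype=float)   # stand-in for 1-hour-frequency load data
X, Y = make_windows(hourly, lookback=96, horizon=24)  # 4-day lookback, 24-h ahead
```

Each training sample thus pairs 96 past hours with the following 24 hours, which is exactly the shape the multistep models consume.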
5.3.1 Grid Data Forecasting
The CNN model is used to forecast grid data for the home (ID=2361). Using different
combinations of features for 24-hour-ahead forecasting, it is found that adding
weather-based features such as temperature and humidity to the grid data yields no
performance improvement over using only 'grid' data. This could be because
temperature and humidity did not show a strong correlation with the grid data. The
model with all the features combined shows the worst results, which could be due to
overfitting of the model.
Table 17 : RMSE values with different combination of features for 'grid' data
Features Used Test RMSE Values (in kW)
Only Grid 0.9120
Grid with Temperature 0.9367
Grid with Humidity 0.9410
Grid with Temperature & Humidity 0.9841
Figure 23 : 24-hr ahead forecast with 'Only grid'
Figure 24 : 24-hr ahead forecast with 'Grid' and 'Temperature'
Figure 25 : 24-hr ahead forecast with 'Grid' and 'Humidity'
Figure 26 : 24-hr ahead forecast with all the features.
5.3.2 Use Data Forecasting
The same CNN model used for grid forecasting is used for forecasting the
24-hour-ahead 'electricity use' data. As with the 'grid' data, adding the
weather-based features temperature and humidity does not improve forecasting
performance. Although the correlation of temperature with 'use' is higher than with
'grid', as shown in Table 6, it is still not strong enough to improve the forecasting
performance.
Table 18 : RMSE values with different combinations of features for 'use' data
Features Used Test RMSE Values (in kW)
Only Use 0.5090
Use with Temperature 0.5115
Use with Humidity 0.5244
Use with Temperature & Humidity 0.5419
Figure 27 : 24-hr ahead forecast for 'Use'
Figure 28 : 24-hr ahead forecast for Use and 'Temperature'
Figure 29 : 24-hr ahead forecast for 'Use' and 'Humidity'
Figure 30 : 24-hr ahead forecast for all the features.
5.3.3 Variation with Lookback
In this experiment, the effect of different 'lookback' values, i.e. the number of
timesteps (hours, in this case) used as input for the forecast, is analyzed. The
effect is studied for predicting the 'electricity use' value for a home in Austin
(ID=2361). The ideal lookback for the 24-hour-ahead forecast is observed to lie in
the range of 4-8 days (96-192 hours). This behaviour is seen in 3 experiments
forecasting with 'Only Use', 'Use and Temperature' and 'Use and Humidity', as shown
in Figures 31, 32 and 33 below. Similar results are obtained for 24-hour-ahead
forecasting using 'grid' data. Thus, a lookback of 4-8 days gives the best results
for 24-hour-ahead forecasts.
Figure 31 : RMSE vs Lookback for 24-hr forecast using 'Only Use'
Figure 32 : RMSE vs Lookback for 24-hr forecast using 'Use' and 'Temperature'
Figure 33 : RMSE vs Lookback for 24-hr forecast using 'Use' and 'Humidity'
6. Conclusions
With the growth of smart grids, smart homes and decentralized energy production, load
forecasting at the individual building level becomes an increasingly important task.
Deep learning models have been shown to surpass traditional statistical as well as
hybrid models, and with more data available and increased computational power, their
performance is only set to improve. This study reinforces the use of deep learning
models for building energy forecasting. Where other studies have focused on models
for an individual building or an aggregate building load, this study aims to address
that gap and serves as an analysis of short-term load forecasting on communities of
residential buildings. The following conclusions are drawn from this study:
1. The deep learning models outperform the ANN model in all cases.
2. For the multiple-home analysis, LSTM-2 was the best overall model for Austin and
California. Only 5 of the 43 homes (Austin and California combined) show a
significant difference (> 5 % in RMSE) from the best overall model. This indicates
that there is no pressing need for individual models for each home: the best overall
model can be applied across all homes with satisfactory results.
3. For the multiple-home analysis, the CNN models give better performance for homes
with higher variance, and the LSTM models give better performance for homes with
lower variance.
4. In the New York dataset, all homes have large chunks of missing data in the
initial months of the year. The missing data do not affect performance for
single-step forecasting, possibly because the lookback used is only 6 hours.
5. For 24-hour-ahead forecasting, adding weather-based features such as temperature
and humidity did not improve forecasting performance. This could be because the
correlations with the weather-based features were not strong enough. The
literature [40] shows that adding temperature can improve forecasting performance
when the correlation between temperature and load is as high as 0.74.
6. For the 24-hour forecast, performance also depends on the lookback window used.
A lookback window in the range of 4 to 8 days gives the best results.
7. A forecasting competition on a public residential dataset could be used to
compare the different models already published, as most studies focus on an
individual home and do not report results on more than a single home.
References
1. Omer, A.M., Energy use and environmental impacts: A general review. Journal of renewable and Sustainable Energy, 2009. 1(5): p. 053101.
2. Amasyali, K. and N.M. El-Gohary, A review of data-driven building energy consumption prediction studies. Renewable and Sustainable Energy Reviews, 2018. 81: p. 1192-1205.
3. Namlı, E., H. Erdal, and H.I. Erdal, Artificial Intelligence-Based Prediction Models for Energy Performance of Residential Buildings, in Recycling and Reuse Approaches for Better Sustainability. 2019, Springer. p. 141-149.
4. Li, K., et al., A hybrid teaching-learning artificial neural network for building electrical energy consumption prediction. Energy and Buildings, 2018. 174: p. 323-334.
5. Rahman, A., V. Srikumar, and A.D. Smith, Predicting electricity consumption for commercial and residential buildings using deep recurrent neural networks. Applied energy, 2018. 212: p. 372-385.
6. Dong, B., et al., A hybrid model approach for forecasting future residential electricity consumption. Energy and Buildings, 2016. 117: p. 341-351.
7. Li, K., et al., Building's electricity consumption prediction using optimized artificial neural networks and principal component analysis. Energy and Buildings, 2015. 108: p. 106-113.
8. Um, T.T., V. Babakeshizadeh, and D. Kulić. Exercise motion classification from large-scale wearable sensor data using convolutional neural networks. in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). 2017. IEEE.
9. Voß, M., C. Bender-Saebelkampf, and S. Albayrak. Residential short-term load forecasting using convolutional neural networks. in 2018 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm). 2018. IEEE.
10. American Public Power Association. More accurate load forecasts help utilities save. 2018, September 25; Available from: https://www.publicpower.org/periodical/article/more-accurate-load-forecasts-help-utilities-save.
11. Gillies, D., B. Bernholtz, and P. Sandiford, A New Approach to Forecasting Daily Peak Loads. Transactions of the American Institute of Electrical Engineers. Part III: Power Apparatus and Systems, 1956. 75(3): p. 382-387.
12. Che, J. and J. Wang, Short-term load forecasting using a kernel-based support vector regression combination model. Applied energy, 2014. 132: p. 602-609.
13. Bedi, J. and D. Toshniwal, Deep learning framework to forecast electricity demand. Applied Energy, 2019. 238: p. 1312-1326.
14. He, F., et al., A hybrid short-term load forecasting model based on variational mode decomposition and long short-term memory networks considering relevant factors with Bayesian optimization algorithm. Applied energy, 2019. 237: p. 103-116.
15. Wu, Z., et al., A hybrid model based on modified multi-objective cuckoo search algorithm for short-term load forecasting. Applied energy, 2019. 237: p. 896-909.
16. Sadaei, H.J., et al., Short-term load forecasting by using a combined method of convolutional neural networks and fuzzy time series. Energy, 2019. 175: p. 365-377.
17. Liang, Y., D. Niu, and W.-C. Hong, Short term load forecasting based on feature extraction and improved general regression neural network model. Energy, 2019. 166: p. 653-663.
18. Bianchi, F.M., et al., Short-term electric load forecasting using echo state networks and PCA decomposition. Ieee Access, 2015. 3: p. 1931-1943.
19. Johannesen, N.J., M. Kolhe, and M. Goodwin, Relative evaluation of regression tools for urban area electrical energy demand forecasting. Journal of cleaner production, 2019. 218: p. 555-564.
20. Hayes, B., J. Gruber, and M. Prodanovic. Short-term load forecasting at the local level using smart meter data. in 2015 IEEE Eindhoven PowerTech. 2015. IEEE.
21. Sauter, P.S., et al. Load Forecasting in Distribution Grids with High Renewable Energy Penetration for Predictive Energy Management Systems. in 2018 IEEE PES Innovative Smart Grid Technologies Conference Europe (ISGT-Europe). 2018. IEEE.
22. Ebrahim, A.F. and O.A. Mohammed, Pre-Processing of Energy Demand Disaggregation Based Data Mining Techniques for Household Load Demand Forecasting. Inventions, 2018. 3(3): p. 45.
23. Wang, Y., et al., Review of smart meter data analytics: Applications, methodologies, and challenges. IEEE Transactions on Smart Grid, 2018. 10(3): p. 3125-3148.
24. Srivastava, A., A.S. Pandey, and D. Singh. Short-term load forecasting methods: A review. in 2016 International Conference on Emerging Trends in Electrical Electronics & Sustainable Energy Systems (ICETEESES). 2016. IEEE.
25. Mocanu, E., et al., Deep learning for estimating building energy consumption. Sustainable Energy, Grids and Networks, 2016. 6: p. 91-99.
26. Kong, W., et al., Short-term residential load forecasting based on resident behaviour learning. IEEE Transactions on Power Systems, 2017. 33(1): p. 1087-1088.
27. Alberg, D. and M. Last, Short-term load forecasting in smart meters with sliding window-based ARIMA algorithms. Vietnam Journal of Computer Science, 2018. 5(3-4): p. 241-249.
28. Chakhchoukh, Y., P. Panciatici, and P. Bondon. Robust estimation of SARIMA models: Application to short-term load forecasting. in 2009 IEEE/SP 15th Workshop on Statistical Signal Processing. 2009. IEEE.
29. Bercu, S. and F. Proïa, A SARIMAX coupled modelling applied to individual load curves intraday forecasting. Journal of Applied Statistics, 2013. 40(6): p. 1333-1348.
30. Hu, Z., Y. Bao, and T. Xiong, Electricity load forecasting using support vector regression with memetic algorithms. The Scientific World Journal, 2013. 2013.
31. Jain, A. and B. Satish. Clustering based short term load forecasting using support vector machines. in 2009 IEEE Bucharest PowerTech. 2009. IEEE.
32. Voß, M., A. Haja, and S. Albayrak. Adjusted feature-aware k-nearest neighbors: Utilizing local permutation-based error for short-term residential building load forecasting. in 2018 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm). 2018. IEEE.
33. Li, K., H. Su, and J. Chu, Forecasting building energy consumption using neural networks and hybrid neuro-fuzzy system: A comparative study. Energy and Buildings, 2011. 43(10): p. 2893-2899.
34. Baccouche, M., et al. Sequential deep learning for human action recognition. in International workshop on human behavior understanding. 2011. Springer.
35. Yu, D. and L. Deng, Deep learning and its applications to signal and information processing. IEEE Signal Processing Magazine, 2010. 28(1): p. 145-154.
36. Marino, D.L., K. Amarasinghe, and M. Manic. Building energy load forecasting using deep neural networks. in IECON 2016-42nd Annual Conference of the IEEE Industrial Electronics Society. 2016. IEEE.
37. Amarasinghe, K., D.L. Marino, and M. Manic. Deep neural networks for energy load forecasting. in 2017 IEEE 26th International Symposium on Industrial Electronics (ISIE). 2017. IEEE.
38. Fan, C., et al., Deep learning-based feature engineering methods for improved building energy prediction. Applied Energy, 2019. 240: p. 35-45.
39. Fan, C., et al., Assessment of deep recurrent neural network-based strategies for short-term building energy predictions. Applied Energy, 2019. 236: p. 700-710.
40. Cai, M., M. Pipattanasomporn, and S. Rahman, Day-ahead building-level load forecasts using deep learning vs. traditional time-series techniques. Applied Energy, 2019. 236: p. 1078-1088.
41. Jiao, R., et al., Short-term non-residential load forecasting based on multiple sequences LSTM recurrent neural network. IEEE Access, 2018. 6: p. 59438-59448.
42. Shi, H., M. Xu, and R. Li, Deep learning for household load forecasting—A novel pooling deep RNN. IEEE Transactions on Smart Grid, 2017. 9(5): p. 5271-5280.
43. Ryu, S., J. Noh, and H. Kim, Deep neural network based demand side short term load forecasting. Energies, 2017. 10(1): p. 3.
44. Elvers, A., M. Voß, and S. Albayrak. Short-term probabilistic load forecasting at low aggregation levels using convolutional neural networks. in 2019 IEEE Milan PowerTech. 2019. IEEE.
45. LeCun, Y., Y. Bengio, and G. Hinton, Deep learning. Nature, 2015. 521(7553): p. 436-444.
46. Hinton, G.E., S. Osindero, and Y.-W. Teh, A fast learning algorithm for deep belief nets. Neural Computation, 2006. 18(7): p. 1527-1554.
47. Silver, D., et al., Mastering the game of go without human knowledge. Nature, 2017. 550(7676): p. 354-359.
48. Cao, Y., et al., Predicting Long-Term Health-Related Quality of Life after Bariatric Surgery Using a Convolutional Neural Network: A Study Based on the Scandinavian Obesity Surgery Registry. Journal of Clinical Medicine, 2019. 8(12): p. 2149.
49. Goodfellow, I., Y. Bengio, and A. Courville, Deep learning. 2016: MIT Press.
50. Olah, C. Understanding LSTM Networks. 2015 [cited 2020; Available from: http://colah.github.io/posts/2015-08-Understanding-LSTMs/.
51. Garg, R., et al. Unsupervised cnn for single view depth estimation: Geometry to the rescue. in European Conference on Computer Vision. 2016. Springer.
52. Zhang, Y., S. Roller, and B. Wallace, MGNC-CNN: A simple approach to exploiting multiple word embeddings for sentence classification. arXiv preprint arXiv:1603.00968, 2016.
53. Gawehn, E., J.A. Hiss, and G. Schneider, Deep learning in drug discovery. Molecular informatics, 2016. 35(1): p. 3-14.
54. Pao, J.J. and D.S. Sullivan, Time Series Sales Forecasting. Final Year Project, 2017.
55. Kingma, D.P. and J. Ba, Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
56. Chollet, F. Keras: Deep Learning for humans. 2015 [cited 2020; Available from: https://github.com/keras-team/keras.
57. Street, P. Dataport: Researcher access to Pecan Street's groundbreaking energy and water data. [cited 2020; Available from: https://www.pecanstreet.org/dataport/.
58. Ltd., O. OpenWeather Map. 2020 [cited March 20, 2020; Available from: https://openweathermap.org/.
59. Team, G.B. An end-to-end open source machine learning platform. [cited 2019; Available