Big Data Stream Mining - Sunseed EU | Sustainable and ...
sunseed-fp7.eu/wp-content/uploads/2015/04/13_SUNSEED...

Jun 18, 2018

SUNSEED project is partially funded by EC FP7 programme under grant agreement #619437.

Big Data Stream Mining

• Maintain summaries of the streams that are sufficient to answer the expected queries about the data:
    • Summaries can take various forms: clusters (flat or hierarchical), statistical aggregates, …
    • Maintain a sliding window of the most recently arrived data; operations on a sliding window mimic more traditional database/mining operations
• Sampling: obtain a representative data sample (i.e., one on which the required operations can be performed correctly)
    • Smart sampling: take x % from a stream of multiple data sources or, alternatively, take y % of selected data sources
• Similarity comparison: smart indexing
• Incremental updating of predictive models
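The summary and sampling ideas listed above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the SUNSEED implementation: the names `SlidingWindowMean` and `reservoir_sample` are hypothetical, the running mean stands in for any statistical aggregate, and reservoir sampling is just one common way to draw a uniform sample from a stream of unknown length (the poster does not name a specific sampling algorithm).

```python
import random
from collections import deque


class SlidingWindowMean:
    """Running mean over the `size` most recent stream values (hypothetical helper)."""

    def __init__(self, size):
        self.window = deque(maxlen=size)
        self.total = 0.0

    def add(self, value):
        # If the window is full, the oldest value is about to be evicted,
        # so remove it from the running total first.
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]
        self.window.append(value)
        self.total += value

    def mean(self):
        # None signals that no data has arrived yet.
        return self.total / len(self.window) if self.window else None


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of up to k items from a stream of unknown length."""
    rng = rng or random.Random()
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample
```

Both helpers use memory that depends only on the window size or the sample size, not on the length of the stream, which is the property that makes such summaries usable on unbounded data.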

M. Skrjanc, B. Kazic
{Maja.Skrjanc, Blaz.Kazic}@ijs.si
Jozef Stefan Institute, Jamova ul. 39, Ljubljana, Slovenia

Forecasting in Smart Grids

• Types of forecasting problems:
    • Electricity load (short term, medium term, long term)
    • Renewable sources generation
    • Electricity prices
    • Customer segmentation
• Input sources:
    • Historical load variables: used for learning models and detecting short-term trends
    • Meteorological data: known to be correlated with load (depends on location)
    • Static data: special calendar data (holidays, summer season) and the topology of the electrical grid
• Methods used:
    • Naive approach: localized averages, previous values. Computationally undemanding, fast, robust and easy to maintain. Can work surprisingly well.
    • Classical approaches: autoregressive (ARIMA) and regression-based statistical methods. Based on historical data. Can take advantage of seasonal trends, but usually don't include other data sources.
    • Computational intelligence approaches: artificial neural networks, support vector machines. Data-driven approach that can take advantage of various heterogeneous data sources.
    • Hybrid methods: combine two or more different approaches in order to take advantage of each method's benefits and overcome their drawbacks.
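As a rough illustration of the naive approach described above, the following Python sketch forecasts the next load value from the previous value and from a localized average. Everything here is an assumption made for illustration: the function names, the default `period=24` (hourly data with a daily cycle) and `n_periods=7` are hypothetical, and the weighted blend only hints at the idea of combining methods; the hybrid methods on the poster may well combine quite different model families.

```python
def persistence_forecast(history):
    """Predict the next value as the most recent observation."""
    return history[-1]


def localized_average_forecast(history, period=24, n_periods=7):
    """Predict the next value as the mean of the values observed at the
    same position in up to `n_periods` previous periods."""
    # The next value would have index len(history); step back one period at a time.
    idx = len(history) - period
    values = []
    while idx >= 0 and len(values) < n_periods:
        values.append(history[idx])
        idx -= period
    if not values:
        raise ValueError("need at least one full period of history")
    return sum(values) / len(values)


def blended_forecast(history, weight=0.5, period=24, n_periods=7):
    """Weighted blend of the two naive forecasts (illustrative assumption)."""
    return (weight * persistence_forecast(history)
            + (1.0 - weight) * localized_average_forecast(history, period, n_periods))
```

With hourly load data, `localized_average_forecast(load)` would average the load at the same hour over the last seven days. Forecasts of this kind are also a useful baseline against which the classical and computational intelligence approaches can be compared.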