Transcript slides

DESIGNING AND AGGREGATING EXPERTS
FOR ENERGY DEMAND FORECASTING
SESO 2014 International Thematic Week
“Smart Energy and Stochastic Optimization”
June 23 to 27, 2014
Yannig Goude (EDF R&D)
Georges Oppenheim (UPEM & Paris 11)
Pierre Gaillard (EDF R&D, HEC Paris-CNRS)
Gilles Stoltz (HEC Paris-CNRS)
INDUSTRIAL CHALLENGES
 Smart grids

More and more « real time » data (e.g. Linky, 1 million meters in 2016)

Demand response (new tariffs, real time pricing…)

New communication tools with customers (web services, online reporting…)
 Renewable energy development

A more and more probabilistic context
 Opening of the electricity market:

Losses/gains of customers
 Sensors data:

Production/consumption sites

Smart home, internet of things
 New usages/tariffs:

Electric cars

Heat pumps, smartphones, battery charging, computers, flat screens…

Demand response, special tariffs (time varying…)
STATISTICAL CHALLENGES
 Large scale data sets

Parallelizing statistical algorithms

Complex data analysis: heterogeneous spatial/temporal sampling, different sources/natures of data
 Adaptivity

Non-parametric models, functional data analysis

Model selection, data driven penalty…
 Sequential estimation

Break detection

On-line update, sequential data treatment (data flow, connection to big data)

Aggregation with online weights
 Multi-scale models

Multi-horizon models

Multi level data on the grid
 Data mining of time series
 Large scale simulations

Simulation platform, parallel processing
CONTRIBUTIONS
 Large scale data sets

GAM parallel processing

EDF R&D/IBM simulation platform
 Adaptivity

GAM models, automatic GAM selection

Functional data analysis (CLR: curve linear regression; KWF: kernel wavelet functional)
 Sequential learning:

Adaptive GAM

Combining forecasts
 Spatio temporal/multi-scale models, complex data


« Downscaling » electricity consumption: linking INSEE (socio-demographic, census) data to local electricity consumption (meters, grid data) and weather data
EDF R&D/IBM simulation platform
LOAD FORECASTING
 Electricity consumption is the main input for optimizing the production units
ELECTRICITY CONSUMPTION DATA
 Trend
 Yearly, Weekly, Daily cycles
 Meteorological events
 Special days
GAM (GENERALIZED ADDITIVE MODELS)
 A good trade-off between complexity and adaptivity
 Publications

Application to load forecasting
• A. Pierrot and Y. Goude, Short-Term Electricity Load Forecasting With Generalized Additive Models, Proceedings of ISAP Power, pp. 593-600, 2011.
• R. Nédellec, J. Cugliari and Y. Goude, GEFCom2012: Electric Load Forecasting and Backcasting with Semi-Parametric Models, International Journal of Forecasting, 2014, 30, 375-381.

GAM « parallel »: BAM (Big Additive Models)
• S. N. Wood, Y. Goude and S. Shaw, Generalized additive models for large datasets, Journal of the Royal Statistical Society: Series C, 2014.

Adaptive GAM (forgetting factor)
• A. Ba, M. Sinn, Y. Goude and P. Pompey, Adaptive Learning of Smoothing Functions: Application to Electricity Load Forecasting, Advances in Neural Information Processing Systems 25, 2012, 2519-2527.
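One common way to realize a forgetting factor is recursive least squares in which past errors are discounted exponentially. The sketch below applies it to a plain linear model, standing in for the coefficients of a GAM's spline basis. It is an illustration under that assumption, not the algorithm of Ba et al.; the function name `rls_forgetting` and the parameters `lam` (forgetting factor) and `delta` (initial scale) are hypothetical.

```python
import numpy as np


def rls_forgetting(X, y, lam=0.99, delta=1e3):
    """Recursive least squares with forgetting factor lam in (0, 1].

    X: (n, d) array of regressors, y: (n,) array of targets.
    Returns the final coefficients and the one-step-ahead forecasts.
    """
    n, d = X.shape
    theta = np.zeros(d)
    P = delta * np.eye(d)          # large initial P: weak prior on theta
    preds = np.empty(n)
    for t in range(n):
        x = X[t]
        preds[t] = x @ theta       # forecast made before y[t] is revealed
        Px = P @ x
        gain = Px / (lam + x @ Px)
        theta = theta + gain * (y[t] - preds[t])
        P = (P - np.outer(gain, Px)) / lam   # lam < 1 discounts old data
    return theta, preds
```

With `lam=1.0` this reduces to ordinary recursive least squares over the whole history; smaller values let the coefficients track a drifting relationship between load and its covariates.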
GEFCOM COMPETITION
 20 substations on the US grid
 11 temperature series
 Hourly data from January 2004 to June 2008
 9 weeks to predict: 8 from 2005 to 2006, and the one following the training set (no temperature forecast available)
105 teams
One issue: no location information
http://www.kaggle.com/c/global-energy-forecasting-competition-2012-load-forecasting
R. Nédellec, J. Cugliari and Y. Goude, GEFCom2012: Electric load forecasting and backcasting with semi-parametric models, International Journal of Forecasting, 2014, 30, 375-381.
CURVE LINEAR REGRESSION
 Regressing curves on curves

Dimension reduction: SVD of cov(Y, X), dimension chosen by penalised model selection

Scales to big data sets (SVD + linear regression)
 Publications

Application to electricity load forecasting
• H. Cho, Y. Goude, X. Brossat & Q. Yao, Modeling and Forecasting Daily Electricity Load Curves: A Hybrid Approach, Journal of the American Statistical Association, 2013, 108, 7-21.
• H. Cho, Y. Goude, X. Brossat & Q. Yao, Modelling and forecasting daily electricity load using curve linear regression, submitted to Lecture Notes in Statistics: Modeling and Stochastic Learning for Forecasting in High Dimension.

Clustering functional data
• H. Cho, Y. Goude, X. Brossat & Q. Yao, Clustering for curve linear regression, technical report, 2013.
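The "SVD + linear regression" recipe on this slide can be sketched as follows: take the SVD of the empirical cross-covariance between response curves Y and regressor curves X, keep the r leading singular directions, and run an ordinary least-squares fit between the resulting low-dimensional scores. This is a simplified illustration, not the exact estimator of Cho et al.; the names `fit_clr` and `predict_clr` are hypothetical, and the dimension `r` is passed by hand rather than selected by a penalised criterion.

```python
import numpy as np


def fit_clr(X, Y, r):
    """X: (n, q) regressor curves, Y: (n, p) response curves, r: kept dimension."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    cov_yx = Yc.T @ Xc / len(X)                     # (p, q) cross-covariance
    U, _, Vt = np.linalg.svd(cov_yx, full_matrices=False)
    U_r, V_r = U[:, :r], Vt[:r].T                   # leading singular directions
    eta = Xc @ V_r                                  # (n, r) scores of X
    xi = Yc @ U_r                                   # (n, r) scores of Y
    B, *_ = np.linalg.lstsq(eta, xi, rcond=None)    # regress Y scores on X scores
    return {"x_mean": x_mean, "y_mean": y_mean, "U_r": U_r, "V_r": V_r, "B": B}


def predict_clr(model, X_new):
    """Map new regressor curves to predicted response curves."""
    eta = (X_new - model["x_mean"]) @ model["V_r"]
    return model["y_mean"] + eta @ model["B"] @ model["U_r"].T
```

In a load-forecasting setting, each row of X could be one day's load curve and the matching row of Y the next day's curve; the cost is one SVD of a p × q matrix plus a small regression, which is why the approach scales to large data sets.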
OTHER MODELS
 Random forest: a popular machine learning method for classification/regression
• Breiman, L., Random Forests, Machine Learning, 45 (1), 2001.
[Figure: example regression tree with splits on Bank Holiday (yes/no), Temperature (<6°C / >6°C) and Lag Load (<55GW / >55GW)]
http://luc.devroye.org/BRUCE/brucepics.html
 KWF (Kernel Wavelet Functional): another approach for functional data forecasts
• See: Antoniadis, A., Brossat, X., Cugliari, J., Poggi, J., Clustering functional data using wavelets. In: Proceedings of the Nineteenth International Conference on Computational Statistics (COMPSTAT), 2010.
• Antoniadis, A., Paparoditis, E., Sapatinas, T., A functional wavelet–kernel approach for time series prediction.
Journal of the Royal Statistical Society: Series B 68(5), 837–857, 2006.
SEQUENTIAL AGGREGATION OF EXPERTS
Cesa-Bianchi, N., Lugosi, G.: Prediction, Learning, and Games. Cambridge University Press (2006)
EXPONENTIALLY WEIGHTED AVERAGE FORECASTER (EWA)
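A minimal sketch of EWA, assuming the standard form described in Cesa-Bianchi & Lugosi (2006): at each round, expert k receives a weight proportional to exp(-eta × its cumulative loss so far), and the forecast is the weighted average of the expert forecasts. The square loss, the function names and the learning rate `eta` are illustrative choices, not taken from the slide.

```python
import math


def exp_weights(scores, eta):
    """Normalized weights proportional to exp(-eta * score)."""
    m = min(scores)  # shift by the minimum for numerical stability
    raw = [math.exp(-eta * (s - m)) for s in scores]
    total = sum(raw)
    return [r / total for r in raw]


def ewa(expert_preds, observations, eta):
    """expert_preds[t][k] is the forecast of expert k at round t.

    Returns the aggregated forecasts and the weights after the last round.
    """
    n_experts = len(expert_preds[0])
    cum_loss = [0.0] * n_experts
    forecasts = []
    for preds, y in zip(expert_preds, observations):
        weights = exp_weights(cum_loss, eta)       # uses past losses only
        forecasts.append(sum(w * p for w, p in zip(weights, preds)))
        for k, p in enumerate(preds):
            cum_loss[k] += (p - y) ** 2            # square loss of expert k
    return forecasts, exp_weights(cum_loss, eta)
```

The weights start uniform and concentrate on the experts with the smallest cumulative loss; `eta` controls how fast that happens and has to be tuned, either from theory or on the data.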
EXPONENTIATED GRADIENT FORECASTER (EG)
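A minimal sketch of EG, assuming the usual definition: the weights are exponential in the cumulative gradient of the loss of the aggregated forecast with respect to the weight vector, rather than in each expert's own cumulative loss. With the square loss (an assumption made here), the gradient component for expert k at a given round is 2 × (aggregated forecast − observation) × forecast of expert k. Function names and `eta` are illustrative.

```python
import math


def softmin_weights(scores, eta):
    """Normalized weights proportional to exp(-eta * score)."""
    m = min(scores)  # shift by the minimum for numerical stability
    raw = [math.exp(-eta * (s - m)) for s in scores]
    total = sum(raw)
    return [r / total for r in raw]


def eg(expert_preds, observations, eta):
    """expert_preds[t][k] is the forecast of expert k at round t.

    Returns the aggregated forecasts and the weights after the last round.
    """
    n_experts = len(expert_preds[0])
    cum_grad = [0.0] * n_experts
    forecasts = []
    for preds, y in zip(expert_preds, observations):
        weights = softmin_weights(cum_grad, eta)
        y_hat = sum(w * p for w, p in zip(weights, preds))
        forecasts.append(y_hat)
        residual = 2.0 * (y_hat - y)               # derivative of (y_hat - y)^2
        for k, p in enumerate(preds):
            cum_grad[k] += residual * p            # gradient w.r.t. weight k
    return forecasts, softmin_weights(cum_grad, eta)
```

Because the update follows the loss of the mixture itself, EG can settle on a convex combination that beats every single expert, for instance when one expert is biased upwards and another downwards.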
OTHER ALGORITHMS
Theoretical calibration
Works well in practice
Gaillard, P., Stoltz, G., van Erven, T.: A second-order bound with excess losses (2014). arXiv:1402.2044
APPLICATION TO LOAD FORECASTING
 Initial « heterogeneous » experts:

GAM

Kernel Wavelet Functional

Curve Linear Regression

Random Forest
 Designing a set of experts from the original ones: 4 « home made » tricks


Bagging: 60 experts
Boosting: 45 experts, each trained on […] such that […] performs well

Specializing: focus on cold/warm days, some periods of the year… 24 experts

Time scaling: MD with GAM, ST with the 3 initial experts
 Publications
• M. Devaine, P. Gaillard, Y. Goude & G. Stoltz, Forecasting electricity consumption by aggregating specialized experts: A review of the sequential aggregation of specialized experts, with an application to Slovakian and French country-wide one-day-ahead (half-)hourly predictions, Machine Learning, 2013, 90, 231-260.
• P. Gaillard & Y. Goude, Forecasting electricity consumption by aggregating experts; how to design a good set of experts, to appear in Lecture Notes in Statistics: Modeling and Stochastic Learning for Forecasting in High Dimension, 2013.
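The bagging trick listed on this slide can be sketched as follows: fit the same base model on bootstrap resamples of the training set, and keep every fitted copy as a separate expert for the aggregation step. The base learner here is a one-variable least-squares line, chosen only to keep the example short; it stands in for the real forecasting models (GAM, CLR, KWF, random forest). The function names are hypothetical, and only the count of 60 experts comes from the slide.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns a predict function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0   # degenerate resample: constant fit
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def bagged_experts(xs, ys, n_experts=60, seed=0):
    """Build n_experts predictors, each fitted on its own bootstrap resample."""
    rng = random.Random(seed)
    n = len(xs)
    experts = []
    for _ in range(n_experts):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        experts.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return experts
```

Classical bagging would average the 60 predictors with equal weights; keeping them as individual experts instead lets an online aggregation rule such as EWA or EG choose the weights sequentially.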
COMBINING FORECASTS
[Figure: labels « Designing experts » and « combining »]
ANOTHER DATA SET: HEAT DEMAND
PERSPECTIVES
 Forecasting methods:


Industrial implementation under way (national, substations, cogeneration plant in Poland: 30% better with GAM)
CLR: improve automatic clustering, forecast the clusters (HMM)
 Combining:

Publication of the R package OPERA (Online Prediction through ExpeRts Aggregation) coming soon

Application to other data sets

Deriving probabilistic forecasts from a set of experts
 Probabilistic forecasts