Benjamin Goehry’s research while affiliated with University of Paris-Saclay and other places


Publications (3)


Figure 1. A partition of [0, 1]² and the associated binary tree; c₁, c₂, c₃ are the constants associated with each cell.
Random forests for time-dependent processes
  • Article
  • Full-text available

April 2020 · 252 Reads · 20 Citations

ESAIM Probability and Statistics

Benjamin Goehry

Random forests were introduced by Breiman in 2001. We study theoretical aspects of both Breiman's original random forests and a simplified version, the centred random forests. Under the independent and identically distributed (i.i.d.) hypothesis, Scornet, Biau and Vert proved the consistency of Breiman's random forest, while Biau studied the simplified version and obtained a rate of convergence in the sparse case. However, the i.i.d. hypothesis is generally not satisfied, for example when dealing with time series. We extend the previous results to the case where observations are weakly dependent, more precisely when the sequences are stationary and β-mixing.
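To make the "centred random forest" mentioned in the abstract concrete, here is a minimal sketch in plain Python. It assumes the usual description of a centred tree on [0, 1]^d: at each level a coordinate is drawn uniformly at random and the current cell is cut at its midpoint, and the prediction is the mean response of the training points falling in the query's cell. The function names, the default parameters and the handling of empty cells are illustrative assumptions, not taken from the paper.

```python
import random


def centred_tree_cell(x, depth, rng):
    """Return the cell (lower, upper corners) of one centred tree that contains x.

    Only the path followed by x is simulated: at each level a coordinate is
    drawn uniformly at random and the current cell is halved along it.
    """
    d = len(x)
    lower, upper = [0.0] * d, [1.0] * d
    for _ in range(depth):
        j = rng.randrange(d)                # coordinate chosen uniformly
        mid = (lower[j] + upper[j]) / 2.0   # split at the centre of the cell
        if x[j] < mid:
            upper[j] = mid
        else:
            lower[j] = mid
    return lower, upper


def in_cell(point, lower, upper):
    """Cells are half-open, except that the face at 1.0 is closed."""
    return all(lo <= v < hi or v == hi == 1.0
               for v, lo, hi in zip(point, lower, upper))


def centred_forest_predict(X, y, x, depth=3, n_trees=50, seed=0):
    """Average the cell means of n_trees independent centred trees at x."""
    rng = random.Random(seed)
    tree_means = []
    for _ in range(n_trees):
        lower, upper = centred_tree_cell(x, depth, rng)
        labels = [yi for xi, yi in zip(X, y) if in_cell(xi, lower, upper)]
        if labels:                          # assumption: empty cells are skipped
            tree_means.append(sum(labels) / len(labels))
    if not tree_means:                      # assumption: fall back to global mean
        return sum(y) / len(y)
    return sum(tree_means) / len(tree_means)
```

For example, `centred_forest_predict([[0.1], [0.2], [0.6], [0.9]], [0.0, 0.0, 1.0, 1.0], [0.25], depth=1)` averages the responses of the training points in [0, 0.5). The sketch says nothing about dependence between observations; the paper's contribution is the theory that justifies such estimators when the data are β-mixing rather than i.i.d.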


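Prévision multi-échelle par agrégation de forêts aléatoires. Application à la consommation électrique. [Multi-scale forecasting by aggregation of random forests. Application to electricity consumption.]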

December 2019 · 30 Reads · 1 Citation

This thesis has two objectives. The first concerns forecasting a total load in the context of smart grids, using approaches based on the bottom-up forecasting method. The second is the study of random forests for dependent observations, more precisely time series. In this setting, we extend the consistency results for Breiman's original random forests, as well as the rates of convergence for a simplified random forest, both of which had so far been established only for independent and identically distributed observations. The last contribution on random forests describes a new methodology that incorporates the dependence structure of the data into the construction of the forests, thereby yielding a performance gain for time series, with an application to forecasting the consumption of a building.


Aggregation of Multi-Scale Experts for Bottom-Up Load Forecasting

October 2019 · 90 Reads · 58 Citations

IEEE Transactions on Smart Grid

The development of smart grids and new advanced metering infrastructures creates new opportunities and challenges for utilities. Exploiting smart meter information for forecasting is a key point for energy providers, who have to deal with a time-varying portfolio of customers, as well as for grid managers, who need to improve the accuracy of local forecasts to cope with the development of distributed renewable energy generation. We propose a new machine learning approach to forecast the system load of a group of customers, exploiting individual load measurements in real time and/or exogenous information such as weather and survey data. Our approach consists of building experts using random forests trained on subsets of customers, then normalising their predictions and aggregating them with a convex expert aggregation algorithm to forecast the system load. We propose new aggregation methods and compare two strategies for building subsets of customers: 1) hierarchical clustering based on survey data and/or load features and 2) a random clustering strategy. These approaches are evaluated on a real data set of residential Irish customers' load at a half-hourly resolution. We show that our approaches achieve a significant gain in short-term load forecasting accuracy, of around 25 percent in RMSE.
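The convex aggregation step described in the abstract can be illustrated with a short sketch. The abstract does not name the aggregation rule, so the exponentially weighted average used here is only a standard example of convex expert aggregation, not the authors' method. The function names, the learning rate `eta` and the squared loss are assumptions, and the experts' forecasts are taken as given (the random forests and the normalisation step are not reproduced).

```python
import math


def ewa_weights(cum_loss, eta):
    """Convex weights proportional to exp(-eta * cumulative loss)."""
    best = min(cum_loss)                    # shift for numerical stability
    raw = [math.exp(-eta * (loss - best)) for loss in cum_loss]
    total = sum(raw)
    return [r / total for r in raw]


def ewa_aggregate(expert_forecasts, observations, eta=0.5):
    """Sequentially aggregate expert forecasts.

    expert_forecasts[t][k] is the forecast of expert k at time t.
    Returns the aggregated forecasts and the weights after the last step.
    """
    n_experts = len(expert_forecasts[0])
    cum_loss = [0.0] * n_experts
    aggregated = []
    for forecasts, obs in zip(expert_forecasts, observations):
        # Weights at time t depend only on losses observed before time t.
        weights = ewa_weights(cum_loss, eta)
        aggregated.append(sum(w * f for w, f in zip(weights, forecasts)))
        # The observation is revealed; update each expert's squared loss.
        for k, f in enumerate(forecasts):
            cum_loss[k] += (f - obs) ** 2
    return aggregated, ewa_weights(cum_loss, eta)
```

Because the weights are non-negative and sum to one, each aggregated forecast stays between the smallest and largest expert forecast at that time step, and experts with lower past error gradually receive more weight.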

Citations (3)


... Following Theorems 3 and 4, this needs to extend now to estimation on time-dependent observations. Consistency and convergence rates on α-mixing sequences are derived for Lasso in Wong et al. (2020), for random forests in Goehry, Benjamin (2020), for boosting algorithms in Lozano, Kulkarni, and Schapire (2014), for support vector machines in Steinwart, Hush, and Scovel (2009), for kernel and nearest-neighbour regressions in Irle (1997) and for spline and wavelet series regression estimators in X. Chen and Christensen (2015). Consistency of deep feed-forward neural networks with ReLU activation functions on exponentially α-mixing processes was recently shown in Ma and Safikhani (2022). ...

Reference:

Semiparametric inference for impulse response functions using double/debiased machine learning
Random forests for time-dependent processes

ESAIM Probability and Statistics

... However, trees with lower complexity are preferred. The widespread use of XGB in the machine learning community can be attributed to several factors: demonstrated state-of-the-art performance in supervised learning tasks, including classification and regression; models that are fairly easy to interpret; scalability to large datasets; and, as an open-source project, availability on many platforms [30]. ...

Prévision multi-échelle par agrégation de forêts aléatoires. Application à la consommation électrique.
  • Citing Thesis
  • December 2019

... A solution involving the sequential aggregation of multiscale random forest-based experts is provided in [80]. It considers individual customers and the problem is to disaggregate the global signal to improve the forecast of global demand. ...

Aggregation of Multi-Scale Experts for Bottom-Up Load Forecasting
  • Citing Article
  • October 2019

IEEE Transactions on Smart Grid