Humboldt-Universität zu Berlin - Mathematisch-Naturwissenschaftliche Fakultät - Institut für Mathematik

Research Seminar Mathematical Statistics

For the Statistics Group


A. Carpentier, S. Greven, W. Härdle, M. Reiß, V. Spokoiny

 

Location

Weierstrass-Institut für Angewandte Analysis und Stochastik
HVP 11 a, Room 313 (please note the room change!)
Mohrenstrasse 39
10117 Berlin

 

Time

Wednesdays, 10:00 - 12:00


Program

 
16 April 2025
TBA

23 April 2025
TBA

30 April 2025
Ratmir Miftachov (HU Berlin) 
Early Stopping for Regression Trees

Abstract: We develop early stopping rules for growing regression tree estimators. The fully data-driven stopping rule is based on monitoring the global residual norm. The best-first search and the breadth-first search algorithms together with linear interpolation give rise to generalized projection or regularization flows. A general theory of early stopping is established. Oracle inequalities for the early-stopped regression tree are derived without any smoothness assumption on the regression function, assuming the original CART splitting rule, yet with a much broader scope. The remainder terms are of smaller order than the best achievable rates for Lipschitz functions in dimension $d$. On real and synthetic data the early stopping regression tree estimators attain the statistical performance of cost-complexity pruning while significantly reducing computational costs.

 

7 May 2025
TBA

 

14 May 2025
Vladimir Spokoiny (WIAS/ HU)
Estimation and classification for DNN: bless of dimension
 

21 May 2025
Yi Yu (University of Warwick)

28 May 2025
Jason Klusowski (Princeton University)

4 June 2025
Sebastian Kassing (TU Berlin)

11 June 2025
Dimitri Konen (University of Cambridge)

18 June 2025
Lasse Vuursteen (University of Pennsylvania)

25 June 2025
Ryan Tibshirani (University of California, Berkeley)

 

2 July 2025
Frank Konietschke (Charité)

9 July 2025
Vincent Rivoirard (Université Dauphine, Paris)

16 July 2025
Claudia Strauch (Universität Heidelberg)


All interested parties are cordially invited.

For questions, please contact:

Ms. Marina Filatova

Mail: marina.filatova@hu-berlin.de
Phone: +49-30-2093-45460
Humboldt-Universität zu Berlin
Institut für Mathematik
Unter den Linden 6
10099 Berlin, Germany