Humboldt-Universität zu Berlin - Mathematisch-Naturwissenschaftliche Fakultät - Institut für Mathematik

Forschungsseminar Mathematische Statistik

For the Statistics Section


A. Carpentier, S. Greven, W. Härdle, M. Reiß, V. Spokoiny

 

Location

Weierstrass-Institut für Angewandte Analysis und Stochastik
HVP 11 a, R. 313 (please note the room change!)
Mohrenstrasse 39
10117 Berlin

 

Time

Wednesdays, 10:00-12:00


Program

 
16 April 2025
N.N. 

 

23 April 2025
N.N.
30 April 2025
Ratmir Miftachov (HU Berlin) 
Early Stopping for Regression Trees

Abstract: We develop early stopping rules for growing regression tree estimators. The fully data-driven stopping rule is based on monitoring the global residual norm. The best-first search and the breadth-first search algorithms, together with linear interpolation, give rise to generalized projection or regularization flows. A general theory of early stopping is established. Oracle inequalities for the early-stopped regression tree are derived without any smoothness assumption on the regression function, assuming the original CART splitting rule, yet with a much broader scope. The remainder terms are of smaller order than the best achievable rates for Lipschitz functions in dimension $d \ge 2$. On real and synthetic data, the early stopping regression tree estimators attain the statistical performance of cost-complexity pruning while significantly reducing computational costs.
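To make the stopping rule concrete, here is a minimal Python sketch (not the authors' implementation): grow a best-first tree one leaf at a time and stop once the squared residual norm falls below a noise threshold. The helper early_stopped_tree and the threshold kappa = n * sigma2_hat are illustrative assumptions; refitting from scratch at each generation forgoes the computational savings of growing a single tree incrementally, but keeps the sketch short.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def early_stopped_tree(X, y, sigma2_hat, max_leaves=500):
    # Hypothetical helper: grow trees of increasing size and stop at the
    # first generation whose residual norm crosses the threshold.
    n = len(y)
    kappa = n * sigma2_hat  # assumed discrepancy threshold (noise level x n)
    for k in range(2, max_leaves + 1):
        # scikit-learn grows the tree by best-first search when
        # max_leaf_nodes is set, matching one of the two search orders above
        tree = DecisionTreeRegressor(max_leaf_nodes=k).fit(X, y)
        rss = np.sum((y - tree.predict(X)) ** 2)  # squared global residual norm
        if rss <= kappa:
            return tree, k
    return tree, max_leaves

# toy usage: noisy sine signal in dimension 2, noise variance 0.09
rng = np.random.default_rng(0)
X = rng.uniform(size=(400, 2))
y = np.sin(4 * X[:, 0]) + 0.3 * rng.standard_normal(400)
tree, n_leaves = early_stopped_tree(X, y, sigma2_hat=0.09)
print(f"stopped at {n_leaves} leaves")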

 

7 May 2025
N.N.

 

14 May 2025
Vladimir Spokoiny (WIAS / HU Berlin)
Estimation and inference for Deep Neural Networks and inverse problems

Abstract: The talk discusses two important issues in modern high-dimensional statistics. The success of DNNs in practical applications is at the same time a great challenge for statistical theory due to the "curse of dimensionality" problem. Manifold-type assumptions are not really helpful and do not explain the double descent phenomenon, in which the accuracy of a DNN improves with overparametrization. We offer a different view on the problem based on the notion of effective dimension and a calming device. The idea is to decouple the structural DNN relation by extending the parameter space and to use a proper regularization without any substantial increase of the effective dimension. The other, related issue is the choice of regularization in inverse problems.

We show that a simple ridge penalty (Tikhonov regularization) does a good job in any inverse problem for which the operator is more regular than the unknown signal. In the opposite case, one should use a model reduction technique such as spectral cut-off.
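The ridge-versus-cut-off dichotomy can be illustrated numerically. The following Python sketch contrasts the two spectral filters computed from the SVD of a simulated operator; the decay rates, noise level, and tuning values are arbitrary illustrative choices, not the talk's setting.

import numpy as np

def tikhonov(A, y, lam):
    # ridge / Tikhonov solution: minimize ||A f - y||^2 + lam * ||f||^2
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt.T @ ((s / (s**2 + lam)) * (U.T @ y))

def spectral_cutoff(A, y, m):
    # model reduction: invert only on the m leading singular directions
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    filt = np.where(np.arange(len(s)) < m, 1.0 / s, 0.0)
    return Vt.T @ (filt * (U.T @ y))

# toy ill-posed problem with polynomially decaying singular values (assumed setup)
rng = np.random.default_rng(1)
n = 200
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = U @ np.diag(np.arange(1, n + 1) ** -1.0) @ V.T
f_true = V @ (np.arange(1, n + 1) ** -1.5)  # signal coefficients in the V basis
y = A @ f_true + 0.01 * rng.standard_normal(n)
print("ridge error:  ", np.linalg.norm(tikhonov(A, y, 1e-4) - f_true))
print("cut-off error:", np.linalg.norm(spectral_cutoff(A, y, 20) - f_true))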

21 May 2025
Yi Yu (University of Warwick)
Contextual Dynamic Pricing: Algorithms, Optimality and Local Differential Privacy Constraints

Abstract: We study contextual dynamic pricing problems where a firm sells products to $T$ sequentially arriving consumers who behave according to an unknown demand model. The firm aims to minimize its regret relative to a clairvoyant that knows the model in advance. The demand follows a generalized linear model (GLM), allowing for stochastic feature vectors in $\mathbb{R}^d$ that encode product and consumer information. We first show that the optimal regret is of order $\sqrt{dT}$, up to logarithmic factors, improving existing upper bounds by a $\sqrt{d}$ factor. This optimal rate is attained by two algorithms: an explore-then-commit (ETC) algorithm and a confidence-bound-type algorithm. A key insight is an intrinsic connection between dynamic pricing and contextual multi-armed bandit problems with many arms, obtained through a careful discretization. We further extend our study to adversarial contexts and propose algorithms that are statistically and computationally more efficient than existing methods in the literature. We also study contextual dynamic pricing under local differential privacy (LDP) constraints. We propose a stochastic gradient descent-based ETC algorithm achieving regret upper bounds of order $d\sqrt{T}/\epsilon$, up to logarithmic factors, where $\epsilon>0$ is the privacy parameter. The upper bounds with and without LDP constraints are matched by newly constructed minimax lower bounds, characterizing the cost of privacy. Moreover, we extend our study to dynamic pricing under mixed privacy constraints, improving the privacy-utility trade-off by leveraging public data. This is the first time such a setting has been studied in the dynamic pricing literature, and our theoretical results seamlessly bridge dynamic pricing with and without LDP. Extensive numerical experiments and real-data applications are conducted to illustrate the efficiency and practical value of our algorithms.
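A schematic explore-then-commit loop in Python, under an assumed logistic demand P(sale | x, p) = sigmoid(x . beta - alpha * p): explore with random prices, fit the GLM once, then commit to the greedy revenue-maximizing price. The demand model, exploration length T0, and price grid are illustrative choices rather than the paper's specification; the confidence-bound and LDP variants are omitted.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
d, T, T0 = 3, 5000, 1000                 # context dimension, horizon, exploration length
beta_true, alpha_true = np.array([1.0, 0.5, -0.3]), 1.2
grid = np.linspace(0.1, 3.0, 30)         # candidate price grid (discretization)

def buy_prob(x, p, beta, alpha):
    # logistic demand: P(sale | x, p) = sigmoid(x . beta - alpha * p); p may be an array
    return 1.0 / (1.0 + np.exp(-(x @ beta - alpha * p)))

hist_z, hist_s, revenue, model = [], [], 0.0, None
for t in range(T):
    x = rng.standard_normal(d)
    if t < T0:                           # explore: price uniform on the grid
        p = rng.choice(grid)
    else:
        if model is None:                # fit the GLM once on exploration data, then commit
            # features [x, -p] so the last coefficient estimates alpha; large C ~ no penalty
            model = LogisticRegression(C=1e6).fit(np.array(hist_z), np.array(hist_s))
            b_hat, a_hat = model.coef_[0, :d], model.coef_[0, d]
        # greedy price: maximize estimated expected revenue p * P_hat(sale | x, p)
        p = grid[np.argmax(grid * buy_prob(x, grid, b_hat, a_hat))]
    sale = rng.random() < buy_prob(x, p, beta_true, alpha_true)
    revenue += p * sale
    hist_z.append(np.append(x, -p))
    hist_s.append(sale)
print(f"total revenue over T={T} rounds: {revenue:.1f}")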

28 May 2025
Jason Klusowski (Princeton University)
4 June 2025
Sebastian Kassing (TU Berlin)
11 June 2025
Dimitri Konen (University of Cambridge)
18 June 2025
Lasse Vuursteen (University of Pennsylvania)
25 June 2025
Ryan Tibshirani (UC Berkeley)

 

2 July 2025
Frank Konietschke (Charité)
9 July 2025
Vincent Rivoirard (Université Dauphine, Paris)

16 July 2025
Claudia Strauch (Universität Heidelberg)

All interested parties are cordially invited.

For inquiries, please contact:

Ms. Marina Filatova

Email: marina.filatova@hu-berlin.de
Phone: +49-30-2093-45460
Humboldt-Universität zu Berlin
Institut für Mathematik
Unter den Linden 6
10099 Berlin, Germany