Research Seminar on Mathematical Statistics
For the Statistics Group
A. Carpentier, S. Greven, W. Härdle, M. Reiß, V. Spokoiny
Location
Weierstrass-Institut für Angewandte Analysis und Stochastik
Erhard-Schmidt-Raum
Mohrenstrasse 39
10117 Berlin
Time
Wednesdays, 10:00 to 12:00
Program

 25 October 2023
 Denis Belomestny (Universität Duisburg-Essen)
 Provable Benefits of Policy Learning from Human Preferences
 Abstract: A crucial task in reinforcement learning (RL) is reward construction. In practice, there is often no obvious choice of reward function. A popular approach is therefore to introduce human feedback during training and to leverage that feedback to learn a reward function. Among all policy learning methods that use human feedback, preference-based methods have demonstrated substantial success in recent empirical applications such as InstructGPT. In this work, we develop a theory that provably shows the benefits of preference-based methods in tabular and linear MDPs. The main idea of our method is to use KL-regularization with respect to the learned policy to ensure more stable learning.
 1 November 2023
 Victor Panaretos (EPFL Lausanne)
 Optimal transport for covariance operators
 Covariance operators are fundamental in functional data analysis, providing the canonical means to analyse functional variation via the celebrated Karhunen-Loève expansion. These operators may themselves be subject to variation, for instance in contexts where multiple functional populations are to be compared. Statistical techniques to analyse such variation are intimately linked with the choice of metric on covariance operators, and with the intrinsic infinite-dimensionality of these operators. I will describe how the geometry and tools of optimal transportation can be leveraged to construct natural and effective statistical summaries and inference tools for covariance operators, taking full advantage of the nature of their ambient space. Based on joint work with Valentina Masarotto (Leiden), Leonardo Santoro (EPFL), and Yoav Zemel (EPFL).
 8 November 2023
 Sven Wang (Humboldt-Universität zu Berlin)
 Statistical convergence rates for transport and ODE-based generative models
 Measure transport provides a powerful toolbox for estimation and generative modelling of complicated probability distributions. The common principle is to learn a transport map which couples a tractable (e.g. uniform or normal) reference distribution to some complicated target distribution, e.g. by maximizing a likelihood objective. In this talk, we discuss recent advances in statistical convergence guarantees for such methods. While a general theory is developed, we will primarily treat (1) triangular maps, which are the building blocks for "autoregressive normalizing flows", and (2) ODE-based maps, defined through an ODE flow. The latter encompasses neural ODEs, a popular method for generative modeling. Our results imply that transport methods achieve minimax-optimal convergence rates for nonparametric density estimation over Hölder classes on the unit cube. Joint work with Youssef Marzouk (MIT, United States), Robert Ren (MIT, United States) and Jakob Zech (U Heidelberg, Germany).
 15 November 2023
 TBA
 22 November 2023
 Marc Hoffmann (Université Paris-Dauphine)
 On Estimating Multidimensional Diffusions from Discrete Data

We revisit an old statistical problem: estimate nonparametrically the drift (vector field) and diffusion coefficient (matrix) of a diffusion process from discrete data $(X_0, X_D, X_{2D}, \ldots, X_{ND})$. The novelties are: (i) the multivariate case: only few results have been obtained in this setting from discrete data (and, to the best of our knowledge, no results for the diffusion coefficient); (ii) the sampling scheme has high frequency but is arbitrarily slow: $D=D_N \rightarrow 0$, $ND_N \rightarrow \infty$ and $ND_N^q \rightarrow 0$ for some possibly arbitrarily large $q$ (à la Kessler); and (iii) the process lies in a (not necessarily convex, not necessarily bounded) domain in $\mathbb R^d$ with reflection at the boundary. (In particular we recover the case of a bounded domain or the whole Euclidean space $\mathbb R^d$.) We establish a relatively standard minimax (adaptive) program for integrated squared error loss over bounded domains (and more losses in the simpler case of the drift) over standard smoothness classes, including lower bounds for the diffusion coefficient with a bit of Malliavin calculus. When $ND_N^2 \rightarrow 0$ and in the special case of the conductivity equation over a bounded domain, we actually obtain contraction rates in squared error loss in a nonparametric Bayes setting. The main difficulty here lies in controlling small ball probabilities for the likelihood ratios; we develop small time expansions of the heat kernel with a bit of Riemannian geometry to control adequate perturbations in KL divergence, using old ideas of Azencott and others. That last part is joint with K. Ray.
Although this problem could have been methodologically addressed almost two decades ago, we heavily rely on the substantial progress made in the domain to clarify and quantify the stability of ergodic averages via concentration chaining techniques and explicit mixing bounds, as well as Malliavin calculus for the lower bound (Dirksen, Gobet, Nickl, Paulin, Ray, Reiss and many others).
 29 November 2023

Martin Spindler (Universität Hamburg)
High-Dimensional L2-Boosting: Rate of Convergence
Boosting is one of the most significant developments in machine learning. This paper studies the rate of convergence of L2-Boosting, which is tailored for regression, in a high-dimensional setting. Moreover, we introduce so-called "post-Boosting", "iterated Boosting", "restricted Boosting" and "orthogonal Boosting" and analyze their properties. To show the latter results, we derive new approximation results for the pure greedy algorithm, based on analyzing the revisiting behavior of L2-Boosting. We also introduce feasible rules for early stopping, which can be easily implemented and used in applied work. Our results also allow a direct comparison between LASSO and boosting, which has been missing from the literature. Finally, we present simulation studies and applications to illustrate the relevance of our theoretical results and to provide insights into the practical aspects of boosting. In these simulation studies, L2-Boosting clearly outperforms LASSO.
(joint work with Jannis Kück and Ye Luo)
 6 December 2023
 Dennis Nieman (Vrije Universiteit Amsterdam)
 Frequentist guarantees for variational Gaussian process regression
 We discuss the variational Bayesian approach with inducing variables introduced by Titsias (2009). This is a sparse approximation for the Bayesian posterior distribution in the nonparametric Gaussian process regression model. The procedure is analyzed from a frequentist perspective: we study contraction rates and validity of the uncertainty quantification. Specifically, it is shown how the frequentist properties of the variational posterior depend on the chosen prior distribution and the dimension of the approximation. Most of the theory is developed under the assumption that the smoothness of the true, data-generating parameter is known, but we also discuss a smoothness-adaptive variational procedure.
 13 December 2023
 Boris Buchmann (ANU Canberra)
 Weak subordination of multivariate Lévy processes

Subordination is the operation which evaluates a Lévy process at a subordinator, giving rise to a pathwise construction of a "time-changed" process. In probability semigroups, subordination was applied to create the variance gamma process, which is prominently used in financial modelling. However, subordination may not produce a Lévy process unless the subordinate has independent components or the subordinate has indistinguishable components. We introduce a new operation known as weak subordination that always produces a Lévy process by assigning the distribution of the subordinate conditional on the value of the subordinator, and matches traditional subordination in law in the cases above. Weak subordination is applied to extend the class of variance-generalised gamma convolutions and to construct the weak variance-alpha-gamma process. The latter process exhibits a wider range of dependence than using traditional subordination. Joint work with Kevin W. Lu (Australian National University, Australia) and Dilip B. Madan (University of Maryland, USA).
 15 December 2023
 Laura Sangalli (MOX Milano)
 Physics-Informed Spatial and Functional Data Analysis
 Recent years have seen an explosive growth in the recording of increasingly complex and high-dimensional data, whose analysis calls for the definition of new methods, merging ideas and approaches from statistics and applied mathematics. My talk will focus on spatial and functional data observed over non-Euclidean domains, such as linear networks, two-dimensional manifolds and non-convex volumes. I will present an innovative class of methods, based on regularizing terms involving Partial Differential Equations (PDEs), defined over the complex domains being considered. These Physics-Informed statistical learning methods enable the inclusion of the available problem-specific information, suitably encoded in the regularizing PDE. Illustrative applications from environmental and life sciences will be presented.
 20 December 2023
 TBA

 10 January 2024
 Eric Moulines (École Polytechnique)
 Score-based diffusion models and applications
 Deep generative models represent an advanced frontier in machine learning. These models are adept at fitting complex data sets, whether they consist of images, text or other forms of high-dimensional data. What makes them particularly noteworthy is their ability to provide independent samples from these complicated distributions at a cost that is both computationally efficient and resource efficient. However, the task of accurately sampling a target distribution presents significant challenges. These challenges often arise from the high dimensionality, multimodality or a combination of these factors. This complexity can compromise the effectiveness of traditional sampling methods and make the process either computationally prohibitive or less accurate. In my talk, I will address recent efforts in this area that aim to improve traditional inference and sampling algorithms. My major focus will be on score-based diffusion models. By utilizing the concept of score matching and time-reversal of stochastic differential equations, they offer a novel and powerful approach to generating high-quality samples. I will discuss how these models work, their underlying principles and how they are used to overcome the limitations of conventional methods. The talk will also cover practical applications, demonstrating their versatility and effectiveness in solving complex real-world problems.
 17 January 2024
 Matteo Giordano (University of Torino)
 Likelihood Methods for Low Frequency Diffusion Data
 The talk will consider the problem of nonparametric inference in multidimensional diffusion models from low-frequency data. Implementation of likelihood-based procedures in such settings is a notoriously delicate task, due to the computational intractability of the likelihood. For the nonlinear inverse problem of inferring the diffusivity in a stochastic differential equation, we propose to exploit the underlying PDE characterisation of the transition densities, which allows the numerical evaluation of the likelihood via standard numerical methods for elliptic eigenvalue problems. A simple Metropolis-Hastings-type MCMC algorithm for Bayesian inference on the diffusivity is then constructed, based on Gaussian process priors. Furthermore, the PDE approach also yields a convenient characterisation of the gradient of the likelihood via perturbation techniques for parabolic PDEs, allowing the construction of gradient-based inference methods including MLE and Langevin-type MCMC. The performance of the algorithms is illustrated via the results of numerical experiments. Joint work with Sven Wang.
 24 January 2024
 Simon Wood (The University of Edinburgh)
 On Neighbourhood Cross Validation

Cross validation comes in many varieties, but some of the more interesting flavours require multiple model fits with consequently high cost. This talk shows how the high cost can be sidestepped for a wide range of models estimated using a quadratically penalized smooth loss, with rather low approximation error.
Once the computational cost has the same leading order as a single model fit, it becomes feasible to efficiently optimize the chosen cross-validation criterion with respect to multiple smoothing/precision parameters. Interesting applications include cross-validating smooth additive quantile regression models, and the use of leave-out-neighbourhood cross validation for dealing with nuisance short range autocorrelation.
The link between cross validation and the jackknife can be exploited to obtain reasonably well calibrated uncertainty quantification in these cases.
 31 January 2024
 Gianluca Finocchio (Universität Wien)
 An extended latent factor framework for ill-posed linear regression
 The classical latent factor model for linear regression is extended by assuming that, up to an unknown orthogonal transformation, the features consist of subsets that are relevant and irrelevant for the response. Furthermore, a joint low-dimensionality is imposed only on the relevant features vector and the response variable. This framework allows for a comprehensive study of the partial-least-squares (PLS) algorithm under random design. In particular, a novel perturbation bound for PLS solutions is proven and the high-probability L²-estimation rate for the PLS estimator is obtained. This novel framework also sheds light on the performance of other regularisation methods for ill-posed linear regression that exploit sparsity or unsupervised projection. The theoretical findings are confirmed by numerical studies on both real and simulated data.
 7 February 2024
 Evgenii Chzhen (LMO Orsay, Paris)
 Small Total-Cost Constraints in Contextual Bandits with Knapsacks

I will talk about some recent developments in the literature on contextual bandit problems with knapsacks [CBwK], a problem where at each round, a scalar reward is obtained and vector-valued costs are suffered. The goal is to maximize the cumulative rewards while ensuring that the cumulative costs are lower than some predetermined cost constraints. In this setting, total-cost constraints had so far to be at least of order $T^{3/4}$, where $T$ is the number of rounds, and were even typically assumed to depend linearly on $T$. Elaborating on the main technical challenge and drawback of the previous approaches, I will present a dual strategy based on projected-gradient-descent updates that is able to deal with total-cost constraints of the order of $T^{1/2}$ up to poly-logarithmic terms. This strategy is direct, and it relies on a careful, adaptive tuning of the step size. The approach is inspired by parameter-free-type algorithms arising from the convex (online) optimization literature, which I will also briefly review. The talk is based on joint works with C. Giraud, Z. Li, and G. Stoltz.
 14 February 2024
 Martin Wahl (U Bielefeld)
 Heat kernel PCA with applications to Laplacian Eigenmaps

Laplacian eigenmaps and diffusion maps are nonlinear dimensionality reduction methods that use the eigenvalues and eigenvectors of (un)normalized graph Laplacians. Both methods are applied when the data is sampled from a low-dimensional manifold, embedded in a high-dimensional Euclidean space. From a mathematical perspective, the main problem is to understand these empirical Laplacians as spectral approximations of the underlying Laplace-Beltrami operator. In this talk, we study Laplacian eigenmaps through the lens of kernel PCA, and consider the heat kernel as reproducing kernel feature map. This leads to novel points of view and allows one to leverage results for empirical covariance operators in infinite dimensions.
All interested are cordially invited.
For questions, please contact:
Ms. Sabine Bergmann
Email: bergmann@mathematik.hu-berlin.de
Phone: +49-30-2093-45450
Fax: +49-30-2093-45451
Humboldt-Universität zu Berlin
Institut für Mathematik
Unter den Linden 6
10099 Berlin, Germany