Research Seminar Mathematical Statistics
For the Statistics Section
A. Carpentier, S. Greven, W. Härdle, M. Reiß, V. Spokoiny
Location
Weierstrass Institute for Applied Analysis and Stochastics (WIAS)
Erhard-Schmidt-Raum
Mohrenstrasse 39
10117 Berlin
Time
Wednesdays, 10:00–12:00
Program

 17. April 2024
 Gil Kur (ETH Zurich)
 Connections between Minimum Norm Interpolation and Local Theory of Banach Spaces
 Abstract: We investigate the statistical performance of "minimum norm" interpolators in nonlinear regression under additive Gaussian noise. Specifically, we focus on norms that satisfy either 2-uniform convexity or the cotype 2 property; these include inner-product spaces, ℓ_p norms, and W_p Sobolev spaces for 1 ≤ p ≤ 2. Our main result demonstrates that under 2-uniform convexity, the bias of the minimal norm solution is bounded by the Gaussian complexity of the class. We then prove a "reverse" Efron-Stein type estimate for the variance of the minimal norm solution under cotype 2, which provides an optimal bound for overparametrized linear regression. Our approach leverages tools from the local theory of finite-dimensional Banach spaces and, to the best of our knowledge, is the first to study nonlinear models that are "far" from Hilbert spaces.
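For orientation, the central object can be written down in the overparametrized linear setting; this is a sketch in standard notation, not taken from the talk:

```latex
% Minimum-norm interpolation: responses y = X\theta^* + \xi with design
% X \in \mathbb{R}^{n \times p}, noise \xi \sim \mathcal{N}(0, \sigma^2 I_n),
% and p > n, so that infinitely many \theta interpolate the data exactly.
\hat{\theta} \in \operatorname*{arg\,min}_{\theta \in \mathbb{R}^p}
  \|\theta\| \quad \text{subject to} \quad X\theta = y
% For the Euclidean norm, \hat{\theta} = X^{\dagger} y (pseudoinverse solution);
% the talk considers norms with 2-uniform convexity or cotype 2,
% e.g. \ell_p with 1 \le p \le 2.
```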
 24. April 2024
 Nicolas Verzelen (INRAE Montpellier)

Computational Trade-offs in High-dimensional Clustering
 Abstract: In this talk, I will discuss the fundamental problem of clustering a mixture of isotropic Gaussians. After reviewing some results on K-means-type procedures and some of their relaxations, I will investigate the existence of a fundamental computation-information gap for this problem in the high-dimensional regime, where the ambient dimension p is larger than the number n of points. The existence of a computation-information gap in a specific Bayesian high-dimensional asymptotic regime was conjectured by Lesieur et al. (2016) based on the replica heuristic from statistical physics. We provide evidence for the existence of such a gap generically in the high-dimensional regime p ≥ n by proving a non-asymptotic low-degree polynomials computational barrier for clustering in high dimension, matching the performance of the best known polynomial-time algorithms.
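As background, the "mixture of isotropic Gaussians" model referred to above reads, in standard notation (not drawn from the talk):

```latex
% n observations in ambient dimension p, K clusters:
X_i = \mu_{z_i} + \varepsilon_i, \qquad
  \varepsilon_i \overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2 I_p),
  \qquad i = 1, \dots, n
% with unobserved labels z_i \in \{1, \dots, K\} and unknown centers
% \mu_1, \dots, \mu_K \in \mathbb{R}^p; the goal is to recover the labels
% z_i in the high-dimensional regime p \ge n.
```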
 08. May 2024
 Georg Keilbar, Ratmir Miftachov (Humboldt-Universität zu Berlin)

Shapley Curves: A Smoothing Perspective
 Abstract: This paper addresses the limited statistical understanding of Shapley values as a variable importance measure from a nonparametric (or smoothing) perspective. We introduce population-level Shapley curves to measure the true variable importance, determined by the conditional expectation function and the distribution of covariates. Having defined the estimand, we derive minimax convergence rates and asymptotic normality under general conditions for the two leading estimation strategies. For finite sample inference, we propose a novel version of the wild bootstrap procedure tailored for capturing lower-order terms in the estimation of Shapley curves. Numerical studies confirm our theoretical findings, and an empirical application analyzes the determining factors of vehicle prices.
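For readers unfamiliar with the object: the Shapley value of a feature j at a point x, with the conditional expectation as value function, is given by the standard formula (textbook notation, not quoted from the paper):

```latex
% Value function v(S) = E[ m(X) | X_S = x_S ] for the regression function m;
% Shapley value of feature j \in \{1, \dots, d\} at x:
\phi_j(x) = \sum_{S \subseteq \{1,\dots,d\} \setminus \{j\}}
  \frac{|S|!\,(d-|S|-1)!}{d!}
  \left( v\bigl(S \cup \{j\}\bigr) - v(S) \right)
% The map x \mapsto \phi_j(x) is the population-level "Shapley curve";
% plugging a nonparametric estimate of m into v yields the estimators
% whose rates and asymptotic normality the talk studies.
```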
 15. May 2024
 Fabian Telschow (Humboldt-Universität zu Berlin)
 Estimation of the Expected Euler Characteristic of Excursion Sets of Random Fields and Applications to Simultaneous Confidence Bands
 Abstract: The expected Euler characteristic (EEC) of excursion sets of a smooth Gaussian-related random field over a compact manifold can be used to approximate the distribution of its supremum for high thresholds. Viewed as a function of the excursion threshold, the EEC of a Gaussian-related field is expressed by the Gaussian kinematic formula (GKF) as a finite sum of known functions multiplied by the Lipschitz–Killing curvatures (LKCs) of the generating Gaussian field.
In the first part of this talk we present consistent estimators of the LKCs as linear projections of “pinned” Euler characteristic (EC) curves obtained from realizations of zero-mean, unit-variance Gaussian processes. As observed data are seldom Gaussian, we generalize these LKC estimators by an unusual use of the Gaussian multiplier bootstrap to obtain consistent estimates of the LKCs of the Gaussian limiting fields of non-stationary statistics. In the second part, we explain applications of LKC estimation and the GKF to simultaneous familywise error rate inference, for example by constructing simultaneous confidence bands and CoPE sets for spatial functional data over complex domains, such as fMRI and climate data, and discuss their benefits and drawbacks compared to other methodologies.
 22. May 2024
 Vladimir Spokoiny (WIAS / HU Berlin)
 Gaussian Variational Inference in high dimension
 Abstract: We consider the problem of approximating a high-dimensional distribution by a Gaussian one by minimizing the Kullback-Leibler divergence.
The main result extends Katsevich and Rigollet (2023) and states that the minimizer is well approximated by the Gaussian distribution whose mean and variance match those of the underlying measure. We also describe the accuracy of the approximation and its range of applicability in terms of the effective dimension. The obtained results can be used for the analysis of various sampling schemes in optimization.
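In standard notation (a sketch, not quoted from the abstract), the problem and the stated conclusion read:

```latex
% Gaussian variational inference: find the closest Gaussian to a measure P
% on R^d in Kullback-Leibler divergence,
(\mu^*, \Sigma^*) \in \operatorname*{arg\,min}_{\mu \in \mathbb{R}^d,\ \Sigma \succ 0}
  \mathrm{KL}\bigl( \mathcal{N}(\mu, \Sigma) \;\big\|\; P \bigr)
% The result states that the minimizer is close to the moment-matched Gaussian,
% \mu^* \approx \mathbb{E}_P[X] and \Sigma^* \approx \operatorname{Cov}_P(X),
% with accuracy governed by an effective dimension of P.
```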

 29. May 2024
 Tailen Hsing (University of Michigan)
 A functional-data perspective in spatial data analysis
 Abstract: More and more spatiotemporal data nowadays can be viewed as functional data. The first part of the talk focuses on the Argo data, a modern oceanography dataset that provides unprecedented global coverage of temperature and salinity measurements in the upper 2,000 meters of the ocean. I will discuss a functional kriging approach to predict temperature and salinity as a smooth function of depth, as well as a cokriging approach for predicting oxygen concentration based on temperature and salinity data. In the second part of the talk, I will give an overview of some related topics, including spectral density estimation and variable selection for functional data.
 05. June 2024
 JiaJie Zhu (WIAS Berlin)
 Wasserstein and Beyond: Optimal Transport and Gradient Flows for Machine Learning and Optimization
 Abstract: In the first part of the talk, I will provide an overview of gradient flows over non-negative and probability measures and their application in modern machine learning tasks, such as variational inference, sampling, training of overparameterized models, and robust optimization. Then, I will present our recent results on the analysis of a couple of particularly relevant gradient flows, including the settings of Wasserstein, Hellinger/Fisher-Rao, and reproducing kernel Hilbert space. The focus is on the global exponential decay of entropy functionals along gradient flows such as the Hellinger-Kantorovich (a.k.a. Wasserstein-Fisher-Rao) flow, and on a new type of gradient-flow geometry that guarantees convergence when minimizing a maximum mean discrepancy, which we term interaction-force transport.
The talk is based on joint work with Alexander Mielke, Pavel Dvurechensky, and Egor Gladin.
 12. June 2024
 Marc Hallin (Université libre de Bruxelles)

The long quest for quantiles and ranks in R^d and on manifolds
 Abstract:
Quantiles are a fundamental concept in probability and an essential tool in statistics, from descriptive to inferential. Still, despite half a century of attempts, no satisfactory and fully agreed-upon definition of the concept, and of the "dual" notion of ranks, is available beyond the well-understood case of univariate variables and distributions. The need for such a definition is particularly critical for variables taking values in R^d, for directional variables (values on the hypersphere), and, more generally, for variables with values on manifolds. Unlike the real line, indeed, no canonical ordering is available on these domains. We show how measure transportation brings a solution to this problem by characterizing distribution-specific (data-driven, in the empirical case) orderings and center-outward distribution and quantile functions (ranks and signs in the empirical case) that satisfy all the properties expected from such concepts while reducing, in the case of real-valued variables, to the classical univariate notion.
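The measure-transportation construction mentioned at the end can be sketched as follows (standard formulation, with regularity assumptions simplified; not quoted from the abstract):

```latex
% Center-outward distribution function of a distribution P on R^d:
% the a.s.-unique gradient of a convex function \phi pushing P forward
% to the spherical uniform U_d on the unit ball,
F_{\pm} = \nabla \phi, \qquad (F_{\pm})_{\#} P = U_d
% The center-outward quantile function is Q_{\pm} = F_{\pm}^{-1}; for d = 1,
% F_{\pm} = 2F - 1, which recovers the classical distribution and quantile
% functions up to an affine rescaling.
```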
 19. June 2024: WIAS Evaluation
 26. June 2024 (note the different room and building: Room 3.13, HVP 11a!)
 Clement Berenfeld (Universität Potsdam)
 A theory of stratification learning
 Abstract: Given an i.i.d. sample from a stratified mixture of immersed manifolds of different dimensions, we study the minimax estimation of the underlying stratified structure. We provide a constructive algorithm that estimates each mixture component at its optimal dimension-specific rate, adaptively. The method is based on an ascending hierarchical co-detection of points belonging to different layers, which also identifies the number of layers and their dimensions, assigns each data point to a layer accurately, and estimates tangent spaces optimally. These results hold regardless of any ambient assumptions on the manifolds or on their intersection configurations. They open the way to a broad clustering framework, where each mixture component models a cluster emanating from a specific nonlinear correlation phenomenon.

 03. July 2024
 Celine Duval (Université de Lille)
 Geometry of excursion sets: computing the surface area from discretized points
 Abstract: The excursion sets of a smooth random field carry relevant information in their various geometric measures. After introducing these geometric quantities and showing how they are related to the parameters of the field, we focus on the problem of discretization. From a computational viewpoint, one never has access to a continuous observation of the excursion set, but rather to observations at discrete points in space. It has been reported that for specific regular lattices of points in dimensions 2 and 3, the usual estimate of the surface area of the excursions remains biased even when the lattice becomes dense in the domain of observation. We show that this limiting bias is invariant to the locations of the observation points and that it only depends on the ambient dimension. (Based on joint work with H. Biermé, R. Cotsakis, E. Di Bernardino and A. Estrade.)
 10. July 2024
 Anya Katsevich (MIT)
 Laplace asymptotics in high-dimensional Bayesian inference
 Abstract: Computing integrals against a high-dimensional posterior is the major computational bottleneck in Bayesian inference. A popular technique to reduce this computational burden is to use the Laplace approximation (LA), a Gaussian distribution, in place of the true posterior. We derive a new, leading-order asymptotic decomposition of integrals against a high-dimensional Laplace-type posterior which sheds valuable light on the accuracy of the LA in high dimensions. In particular, we determine the tight dimension dependence of the approximation error, leading to the tightest known Bernstein-von Mises result on the asymptotic normality of the posterior. The decomposition also leads to a simple modification of the LA which yields a higher-order accurate approximation to the posterior. Finally, we prove the validity of the high-dimensional Laplace asymptotic expansion to arbitrary order, which opens the door to approximating the partition function, of use in high-dimensional model selection and many other applications beyond statistics.
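For reference, the Laplace approximation in question is the standard mode-centered Gaussian (textbook notation, not quoted from the abstract):

```latex
% Posterior \pi(x) \propto e^{-n f(x)} on R^d with mode
% \hat{x} = \operatorname*{arg\,min}_x f(x) (the MAP estimate):
\hat{\pi}^{\mathrm{LA}} = \mathcal{N}\Bigl( \hat{x},\ \bigl(n \nabla^2 f(\hat{x})\bigr)^{-1} \Bigr)
% i.e. mean at the posterior mode and covariance equal to the inverse Hessian
% of the negative log-posterior there; the talk quantifies how the error of
% integrals against \hat{\pi}^{\mathrm{LA}} scales with the dimension d.
```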
 17. July 2024
 no seminar

All interested are cordially invited.
For inquiries, please contact:
Ms. Marina Filatova
Email: marina.filatova@hu-berlin.de
Phone: +49 30 2093 45460
Humboldt-Universität zu Berlin
Institut für Mathematik
Unter den Linden 6
10099 Berlin, Germany