Research Seminar on Mathematical Statistics
For the Statistics research area
A. Carpentier, S. Greven, W. Härdle, M. Reiß, V. Spokoiny
Location
Weierstrass-Institut für Angewandte Analysis und Stochastik
ESH,
Mohrenstrasse 39
10117 Berlin
Time
Wednesdays, 10:00 - 12:00
Program
- 15 October 2025
- Carlos Soto (University of Massachusetts Amherst)
- Differential Privacy over Manifolds and Shape Space
Abstract: In this work we consider the problem of releasing a differentially private statistical summary that resides on a Riemannian manifold. We present extensions of the Laplace and K-norm mechanisms that utilize intrinsic distances and volumes on the manifold. We also consider in detail the specific case where the summary is the Fréchet mean of data residing on a manifold. We demonstrate that our mechanism is rate optimal and depends only on the dimension of the manifold, not on the dimension of any ambient space, while also showing how ignoring the manifold structure can decrease the utility of the sanitized summary. Lastly, we illustrate our framework in three examples of particular interest in statistics: the space of symmetric positive definite matrices, which is used for covariance matrices, the sphere, which can be used as a space for modeling discrete distributions, and Kendall's 2D planar shape space.
- 22 October 2025
- Nina Doernemann (Aarhus University, Denmark)
- Tracy-Widom, Gaussian, and Bootstrap: Approximations for Leading Eigenvalues in High-Dimensional PCA
- Abstract: Under certain conditions, the largest eigenvalue of a sample covariance matrix undergoes a well-known phase transition when the sample size $n$ and data dimension $p$ diverge proportionally.
In the subcritical regime, this eigenvalue has fluctuations of order $n^{-2/3}$ that can be approximated by a Tracy-Widom distribution, while in the supercritical regime, it has fluctuations of order $n^{-1/2}$ that can be approximated by a Gaussian distribution. However, the statistical problem of determining which regime underlies a given dataset is far from resolved. We develop a new testing framework and procedure to address this problem. In particular, we demonstrate that the procedure has an asymptotically controlled level, and that it is power consistent for certain alternatives. This testing procedure also enables the design of a new bootstrap method for approximating the distributions of functionals of the leading sample eigenvalues within the subcritical regime, which is the first such method that is supported by theoretical guarantees. This talk is based on joint work with Miles E. Lopes (UC Davis).
- 29 October 2025
- Ludger Overbeck (Universität Gießen)
- Bayesian Estimation with MCMC Methods for Stochastic Processes with Hidden States
- Abstract: Markov Modulated Stochastic Processes (MMSPs) are a flexible class of stochastic processes that capture phase changes arising in economies by allowing jumps in drift and volatility, linked to the hidden states of a Markov chain. These models have been used to model option prices and renewable energy markets, as well as for risk quantification. While Bayesian inference methods exist for simpler regime-switching models, we aim to extend them to more complex MMSPs. Our approach involves applying Bayesian estimation techniques to recover the hidden states and the parameters associated with each state of the Markov chain. We propose Markov Chain Monte Carlo algorithms to perform Bayesian inference for MMSPs. This will allow for a more data-driven analysis of asset returns with regime shifts and jumps. In the first part of the talk we review well-known estimation techniques for the CIR process, which is the solution of
$$dX_t = (a + bX_t)\,dt + \sigma\sqrt{X_t^+}\,dW_t$$
based on the paper Estimation in the CIR-Process (O. & Rydén. Scand, Econometric Theory. 1997). We then want to extend this to a model with hidden variables, namely to
$$dX_t = (a(Z_t) + b(Z_t)X_t)\,dt + \sigma(Z_t)\sqrt{X_t^+}\,dW_t$$
where $Z$ is a continuous-time Markov chain. This is a special case of a general regime-switching stochastic process:
$$dX_t = \beta(Z_t, X_t)\,dt + \sigma(Z_t, X_t)\,dW_{\zeta(Z_t)}$$
We consider two special cases
$$dX_t = (a(Z_t) + b(Z_t)X_t)\,dt + \sigma(Z_t)\sqrt{X_t^+}\,dW_t.$$
For parameter estimation we present a new version of the EM (expectation-maximization) approach, which was used in a similar way in the paper On the estimation of regime-switching Lévy models (Chevalier & Goutte, Stud. Nonlinear Dyn. E. 2017). Finally, we discuss a potential modification of the EM algorithm in which conditional least-squares minimization replaces likelihood maximisation in the "M" step.
- 05 November 2025
- TBA
- 12 November 2025
- Thomas Kneib (Universität Göttingen)
- Demystifying Spatial Confounding
- Abstract: Spatial confounding is a fundamental issue in spatial regression models which arises because spatial random effects, included to approximate unmeasured spatial variation, are typically not independent of covariates in the model. This can lead to significant bias in covariate effect estimates. The problem is complex and has been the topic of extensive research with sometimes puzzling and seemingly contradictory results. We will give an introduction to spatial confounding and discuss some suggested solutions for dealing with it, including the formalisation as a structural equation model and spatial+, where spatial variation in the covariate of interest is regressed away first and the remaining residuals are then used to identify the relevant effect.
In the second part of the presentation, we develop a broad theoretical framework that brings mathematical clarity to the mechanisms of spatial confounding, relying on an explicit analytical expression for the resulting bias. We see that the problem is directly linked to spatial smoothing and identify exactly how the size and occurrence of bias relate to the features of the spatial model as well as the underlying confounding scenario. Using our results, we can explain subtle and counter-intuitive behaviours. Finally, we propose a general approach for dealing with spatial confounding bias in practice, applicable to any spatial model specification. When a covariate has non-spatial information, we show that a general form of the so-called spatial+ method can be used to eliminate bias. When no such information is present, the situation is more challenging but, under the assumption of unconfounded high frequencies, we develop a procedure in which multiple capped versions of spatial+ are applied to assess the bias in this case. We illustrate our approach with an application to air temperature in Germany.
- 19 November 2025
- Alain Celisse (Paris)
Abstract:
- 26 November 2025
- Alessandro Palummo (Politecnico Milano)
- Physics-informed Functional Principal Component Analysis of Large-scale Datasets
Abstract: Physics-informed statistical learning is an emerging area of spatial and functional data analysis that integrates observational data with prior physical knowledge encoded by partial differential equations (PDEs). We propose an iterative Majorization–Minimization scheme for functional Principal Component Analysis of random fields in a general Hilbert space, formulated under the practically relevant assumption of partial observability of the data. By combining differential penalties with finite element discretizations, our approach recovers smooth principal component functions while preserving the geometric features of the spatial domain. The resulting estimation procedure involves solving a smoothing problem, which may become computationally demanding for large-scale datasets. After establishing the well-posedness of this smoothing problem under mild assumptions on the PDE parameters, we develop an efficient iterative algorithm for its solution. This framework enables the practical analysis of massive functional datasets at the population level, ranging from physics-informed fPCA to functional clustering, with applications to environmental and neuroimaging data.
- 03 December 2025
- Fanny Seyzilles (Cambridge)
- Abstract:
- 10 December 2025
- Vladimir Spokoiny (WIAS)
Abstract:
- 17 December 2025
- Cesare Miglioli (University of Pittsburgh)
- Incomplete U-Statistics of Equireplicate Designs: Berry-Esseen Bound and Efficient Construction
Abstract: U-statistics are a fundamental class of estimators that generalize the sample mean and underpin much of nonparametric statistics. Although extensively studied in both statistics and probability, key challenges remain. These include their inherently high computational cost, addressed partly through incomplete U-statistics, and their non-standard asymptotic behavior in the degenerate case, which typically requires resampling methods for hypothesis testing. This talk presents a novel perspective on U-statistics, grounded in hypergraph theory and combinatorial designs. Our approach bypasses the traditional Hoeffding decomposition, which is the main analytical tool in this literature but is highly sensitive to degeneracy. By fully characterizing the dependence structure of a U-statistic, we derive a new Berry–Esseen bound that applies to all incomplete U-statistics based on deterministic designs, yielding conditions under which Gaussian limiting distributions can be established even in the degenerate case and when the order diverges. Moreover, we introduce efficient algorithms to construct incomplete U-statistics of equireplicate designs, a subclass of deterministic designs that, in certain cases, achieve minimum variance. To illustrate the power of this novel framework, we apply it to kernel-based testing, focusing on the widely used two-sample Maximum Mean Discrepancy (MMD) test. Our approach leads to a permutation-free variant of the MMD test that delivers substantial computational gains while retaining statistical validity.
- 07 January 2026
- 14 January 2026
- Frank Konietschke (Charité)
- Abstract:
- 21 January 2026
- 28 January 2026
- Jakob Runge (Uni Potsdam)
- 04 February 2026
- 11 February 2026
- Béatrice Matteo (University of Geneva) and Ho Yun (EPFL)
Title Ho Yun: Linear Monge is All You Need
Abstract: In this talk, we explore the geometry of the Bures-Wasserstein space for potentially degenerate Gaussian measures on a separable Hilbert space, based on our recent work with Yoav Zemel. A key feature of our approach is its simplicity: relying only on elementary arguments from linear operator theory, we are able to derive explicit results without resorting to Kantorovich duality or Otto's calculus. We provide a complete characterisation of both the Monge and Kantorovich problems in this context, regardless of the degeneracy of the measures. Furthermore, we show a simple way to construct all possible Wasserstein geodesics connecting two Gaussian measures. Finally, we generalise our results to characterise Wasserstein barycenters of Gaussian measures, borrowing the idea of Procrustes distance from statistical shape analysis.
Title Béatrice Matteo: Kernel ridge regression for spherical responses
Abstract: The aim is to propose a novel nonlinear regression framework for responses taking values on a hypersphere. Rather than performing tangent space regression, where all the sphere responses are lifted to a single tangent space on which the regression is performed, we estimate conditional Fréchet means by minimizing squared distances on the nonlinear manifold. The tangent space nevertheless serves as a linear predictor space in which the regression function takes values. The framework integrates Riemannian geometry techniques with functional data analysis by modelling the regression function using methods from vector-valued reproducing kernel Hilbert space theory. This formulation enables the reduction of the infinite-dimensional estimation problem to a finite-dimensional one via a representer theorem and leads to an estimation algorithm by means of Riemannian gradient descent. Explicit, checkable conditions on the data that ensure the existence and uniqueness of the minimizing estimator are given.
All interested parties are cordially invited.
For questions, please contact:
Ms. Marina Filatova
Email: marina.filatova@hu-berlin.de
Phone: +49-30-2093-45460
Humboldt-Universität zu Berlin
Institut für Mathematik
Unter den Linden 6
10099 Berlin, Germany