Humboldt-Universität zu Berlin - Faculty of Mathematics and Natural Sciences - Institute of Mathematics

Research Seminar on Mathematical Statistics

For the Statistics Group


A. Carpentier, S. Greven, W. Härdle, M. Reiß, V. Spokoiny

 

Location

Weierstrass-Institut für Angewandte Analysis und Stochastik
Erhard-Schmidt-Raum
Mohrenstrasse 39
10117 Berlin

 

Time

Wednesdays, 10:00 - 12:00


Program

 

Please note!
The seminar will be hybrid and held via Zoom. According to hygiene recommendations, our lecture room ESH has a capacity of only 24 people. If you intend to come to some of the talks in person, you must register for our mailing list with Andrea Fiebig (fiebig@math.hu-berlin.de).
When participating in the seminar, a self-test must be taken on the day (or before the start) of the seminar.
The so-called "3G rule" applies at the Weierstrass Institute.
 
19 October 2022
Otmar Cronie (Chalmers University of Technology & University of Gothenburg) (10:00 - 11:00)
Point Process Learning: A Cross-validation-based Approach to Statistics for Point Processes
Abstract: Point processes are random sets which generalise the classical notion of a random (iid) sample by allowing i) the sample size to be random and/or ii) the sample points to be dependent. Therefore, point processes have become ubiquitous in the modelling of spatial and/or temporal event data, e.g. earthquakes and disease cases. Motivated by cross validation’s general ability to reduce overfitting and mean square error, in this talk, we present a new cross-validation-based statistical theory for general point processes. It is based on the combination of two novel concepts for general point processes: cross validation and prediction errors. Our cross-validation approach uses thinning to split a point process/pattern into pairs of training and validation sets, while our prediction errors measure discrepancy between two point processes. The new statistical approach exploits the prediction errors to measure how well a given model predicts validation sets using associated training sets. Due to its connection to the general idea of empirical risk minimisation, it is referred to as Point Process Learning. We discuss properties of the proposed approach and its components, and we illustrate how it may be applied in different spatial statistical settings. In (at least) one of these settings, we numerically show that it outperforms the state of the art.
David Frazier (Monash University, Melbourne, Australia) (approx. 11:00 - 12:00)
Guaranteed Robustness via Semi-Modular Posterior Inference
Abstract: Even in relatively simple settings, model misspecification can cause Bayesian inference methods to fail spectacularly. In situations where the underlying model is built by combining different modules, an approach to guard against misspecification is to employ cutting feedback methods. These methods modify conventional Bayesian posterior inference algorithms by artificially limiting the information flows between the (potentially) misspecified and correctly specified modules. By artificially limiting the flow of information when updating our prior beliefs, we essentially "cut" the link between these modules, and ultimately produce a posterior that differs from the exact posterior. However, it is generally unknown when one should prefer this "cut posterior" over the exact posterior. Rather than choosing a single posterior on which to base our inferences, we propose a new Bayesian method that combines both posteriors in such a way that we can guard against misspecification, and decrease posterior uncertainty. We derive easily verifiable conditions under which this new posterior produces inferences that are guaranteed to be more accurate than using either posterior by itself. We demonstrate this new method in a host of applications.
26 October 2022
TBA
02 November 2022
Johannes Schmidt-Hieber (University of Twente) 
Overparametrization and the bias-variance dilemma
Abstract: For several machine learning methods such as neural networks, good generalisation performance has been reported in the overparametrized regime. In view of the classical bias-variance trade-off, this behaviour is highly counterintuitive. The talk summarizes recent theoretical results on overparametrization and the bias-variance trade-off. This is joint work with Alexis Derumigny (Delft).
09 November 2022
Claudia Schillings (FU Berlin) (PLEASE NOTE: Room 3.13, Hausvogteiplatz 11a)
The Convergence of the Laplace Approximation and Noise-Level-Robust Computational Methods for Bayesian Inverse Problems
Abstract: The Bayesian approach to inverse problems provides a rigorous framework for the incorporation and quantification of uncertainties in measurements, parameters and models. We are interested in designing numerical methods which are robust w.r.t. the size of the observational noise, i.e., methods which behave well in case of concentrated posterior measures. The concentration of the posterior is a highly desirable situation in practice, since it relates to informative or large data. However, it can pose a computational challenge for numerical methods based on the prior measure. We propose to use the Laplace approximation of the posterior as the reference measure for the numerical integration and analyze the efficiency of Monte Carlo methods based on it.
16 November 2022
Aila Särkkä (Chalmers University of Technology and University of Gothenburg) 
Anisotropy analysis and modelling of spatial point patterns
Abstract: In the early spatial point process literature, observed point patterns were typically small and no repetitions were available. It was natural to assume that the patterns were realizations of stationary and isotropic point processes. Nowadays, large data sets with repetitions have become more and more common and it is important to think about the validity of these assumptions. Non-stationarity has received quite a lot of attention in recent years and it is straightforward to include it in many point process models. Isotropy, on the other hand, is often still assumed without further checking, and even though several tools have been suggested to detect isotropy and test for it, they have not been widely used. This talk will give an overview of nonparametric methods for anisotropy analysis of (stationary) point processes ([3], [4], [5]). Methods based on nearest neighbour and second order summary statistics as well as on spectral and wavelet analysis will be discussed. The techniques will be illustrated on both a clustered and a regular example. In the second part of the talk, one of the methods will be used to estimate the deformation history in polar ice using the measured anisotropy of air inclusions from deep ice cores [2]. In addition, an anisotropic point process model for nerve fiber data will be presented [1].
References: [1] Konstantinou K and Särkkä A (2022). Pairwise interaction Markov model for 3D epidermal nerve fiber endings. To appear in Journal of Microscopy.
[2] Rajala T, Särkkä A, Redenbach C, and Sormani M (2016). Estimating geometric anisotropy in spatial point patterns. Spatial Statistics 15, 100–114.
[3] Rajala T, Redenbach C, Särkkä A, and Sormani M (2018). A review on anisotropy analysis of spatial point patterns. Spatial Statistics 28, 141–168.
[4] Rajala T, Redenbach C, Särkkä A, and Sormani M (2022). Tests for isotropy in spatial point patterns. Under revision.
[5] Sormani M, Redenbach C, Särkkä A, and Rajala T (2020). Second order directional analysis of point processes revisited. Spatial Statistics 38, 100456.
23 November 2022
Alexei Kroshnin (WIAS Berlin)
Robust k-means clustering in Hilbert and metric spaces
Abstract: In this talk, we consider robust algorithms for the k-means clustering (quantization) problem where a quantizer is constructed based on N independent observations. While the well-known asymptotic result by Pollard shows that the existence of two moments is sufficient for strong consistency of an empirically optimal quantizer in R^d, non-asymptotic bounds are usually obtained under the assumption of bounded support. We discuss a robust k-means in Hilbert and metric spaces based on trimming, and prove non-asymptotic bounds on the excess distortion, which depend on the probability mass of the lightest cluster and the second moment of the distribution.
30 November 2022
Mikolaj Kasprzak (University of Luxembourg)
How good is your Laplace approximation? Finite-sample error bounds for a variety of useful divergences
Abstract: The Laplace approximation is a popular method of approximating an intractable Bayesian posterior by a suitably chosen Gaussian distribution. But can we trust this approximation for practical use? Its theoretical justification comes from the celebrated Bernstein-von Mises theorem (also known as the Bayesian CLT or BCLT). However, an obstacle to its wider use is the lack of widely applicable post-hoc checks on its quality. Our work provides closed-form, finite-sample quality bounds for the Laplace approximation that simultaneously (1) do not require knowing the true parameter, (2) control posterior means and variances, and (3) apply generally to models that satisfy the conditions of the asymptotic BCLT. In fact, our bounds work even in the presence of misspecification. We compute exact constants in our bounds for a variety of standard models, including logistic regression, and numerically demonstrate their utility. And we provide a framework for analysis of more complex models. This is joint work with Ryan Giordano (MIT) and Tamara Broderick (MIT). A preliminary version of the work is available here: https://arxiv.org/abs/2209.14992.
07 December 2022
Matthias Vetter (Universität Kiel)
On goodness-of-fit testing for point processes
Abstract: Typical models for point processes like Hawkes processes or inhomogeneous Poisson processes are often of a parametric form where the intensity function or an additional self-exciting component is known up to an unspecified parameter. A lot of research since the seminal paper by Ogata (1978) has been devoted to the estimation of these unknown parameters but even in these rather standard models a consistent goodness-of-fit test has been missing. This talk aims to fill this gap. We will show how to formally set up a bootstrap procedure to allow for goodness-of-fit testing and we will discuss how to prove consistency of the test in the (already quite involved) case of an inhomogeneous Poisson process.
14 December 2022
Jovanka Lili Matic (HU, IRTG 1792)
Global Sensitivity Analysis in the Presence of Missing Values
04 January 2023
Franz Besold (WIAS Berlin)
Adaptive Weights Community Detection
Abstract: Due to the technological progress of the last decades, Community Detection has become a major topic in machine learning. However, there is still a huge gap between practical and theoretical results, as theoretically optimal procedures often lack a feasible implementation and vice versa. This paper aims to close this gap and presents a novel algorithm that is both numerically and statistically efficient. Our procedure uses a test of homogeneity to compute adaptive weights describing local communities. The approach was inspired by the Adaptive Weights Community Detection (AWCD) algorithm by Adamyan et al. (2019). This algorithm delivered some promising results on artificial and real-life data, but our theoretical analysis reveals its performance to be suboptimal on a stochastic block model. In particular, the involved estimators are biased and the procedure does not work for sparse graphs. We propose significant modifications, addressing both shortcomings and achieving a nearly optimal rate of strong consistency on the stochastic block model. Our theoretical results are illustrated and validated by numerical experiments.
11 January 2023
Leonhard Held (Universität Zürich)
tba
18 January 2023
Tim Jahn (Universität Bonn)
tba
25 January 2023
Maria Grith (Erasmus University Rotterdam)
tba
01 February 2023
Radu Stoica (Université de Lorraine, Nancy)
Random structures and patterns in spatio-temporal data: probabilistic modelling and statistical inference
Abstract: The useful information carried by spatio-temporal data is often outlined by geometric structures and patterns. Filaments or clusters induced by galaxy positions in our Universe are such an example. Two situations are to be considered. First, the pattern of interest is hidden in the data set, hence the pattern should be detected. Second, the structure to be studied is observed, so a relevant characterization of it should be given. This talk is structured in four parts. The first part presents the construction of different marked point processes together with their properties, such that characteristics of the patterns of interest are modelled by these processes. Second, MCMC dynamics tailored to these models are presented for simulating them. A discussion of the performance of these algorithms and a comparison with exact simulation methods are provided. Third, on this basis, inference procedures are derived. They include level set estimators, global optimisation, and Approximate Bayesian Computation. Finally, applications to real cosmological and geological data are shown.
08 February 2023
Vanessa Didelez (Universität Bremen)
tba
15 February 2023
TBA

 

 
 


All interested parties are cordially invited.

For questions, please contact:

Ms Andrea Fiebig

Email: fiebig@mathematik.hu-berlin.de
Phone: +49-30-2093-45460
Fax: +49-30-2093-45451
Humboldt-Universität zu Berlin
Institut für Mathematik
Unter den Linden 6
10099 Berlin, Germany