Research Seminar Mathematical Statistics
For the Statistics Section
A. Carpentier, S. Greven, W. Härdle, M. Reiß, V. Spokoiny
Location
Weierstrass-Institut für Angewandte Analysis und Stochastik
Erhard-Schmidt-Raum
Mohrenstrasse 39
10117 Berlin
Time
Wednesdays, 10:00-12:00
Program

 Attention!
 The seminar will be hybrid and also held via Zoom. According to the hygiene recommendations, our lecture room ESH has a capacity of only 24 people. If you intend to attend some of the talks in person, you must register for our mailing list with Andrea Fiebig (fiebig@math.hu-berlin.de).
 When participating in the seminar, a self-test must be taken on the day of (or before the start of) the seminar.
 The so-called "3G rule" applies at the Weierstrass Institute.
 19 October 2022
 Otmar Cronie (Chalmers University of Technology & University of Gothenburg) (10:00-11:00)
 Point Process Learning: A Cross-validation-based Approach to Statistics for Point Processes
 Abstract: Point processes are random sets which generalise the classical notion of a random (iid) sample by allowing i) the sample size to be random and/or ii) the sample points to be dependent. Point processes have therefore become ubiquitous in the modelling of spatial and/or temporal event data, e.g. earthquakes and disease cases. Motivated by cross-validation's general ability to reduce overfitting and mean square error, in this talk we present a new cross-validation-based statistical theory for general point processes. It is based on the combination of two novel concepts for general point processes: cross-validation and prediction errors. Our cross-validation approach uses thinning to split a point process/pattern into pairs of training and validation sets, while our prediction errors measure the discrepancy between two point processes. The new statistical approach exploits the prediction errors to measure how well a given model predicts validation sets using the associated training sets. Due to its connection to the general idea of empirical risk minimisation, it is referred to as Point Process Learning. We discuss properties of the proposed approach and its components, and illustrate how it may be applied in different spatial statistical settings. In (at least) one of these settings, we numerically show that it outperforms the state of the art.
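The thinning-based split of a point pattern into training and validation sets can be illustrated with a minimal sketch (the uniform toy pattern and the function name `thinning_split` are illustrative assumptions for this example, not the speaker's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def thinning_split(points, p=0.5, rng=rng):
    """Split a point pattern into training and validation sets by
    independent p-thinning: each point is retained in the training
    set with probability p, and otherwise moved to the validation set."""
    keep = rng.random(len(points)) < p
    return points[keep], points[~keep]

# Simulate a homogeneous Poisson pattern on the unit square:
# a Poisson number of points, placed uniformly at random.
n = rng.poisson(100)
pattern = rng.random((n, 2))

train, valid = thinning_split(pattern, p=0.5)
```

Repeating the split with independent thinnings yields the pairs of training and validation sets over which prediction errors can then be averaged.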
 David Frazier (Monash University, Melbourne, Australia) (ca. 11:00-12:00)
 Guaranteed Robustness via Semi-Modular Posterior Inference
 Abstract: Even in relatively simple settings, model misspecification can cause Bayesian inference methods to fail spectacularly. In situations where the underlying model is built by combining different modules, an approach to guard against misspecification is to employ cutting feedback methods. These methods modify conventional Bayesian posterior inference algorithms by artificially limiting the information flows between the (potentially) misspecified and correctly specified modules. By artificially limiting the flow of information when updating our prior beliefs, we essentially "cut" the link between these modules, and ultimately produce a posterior that differs from the exact posterior. However, it is generally unknown when one should prefer this "cut posterior" over the exact posterior. Rather than choosing a single posterior on which to base our inferences, we propose a new Bayesian method that combines both posteriors in such a way that we can guard against misspecification, and decrease posterior uncertainty. We derive easily verifiable conditions under which this new posterior produces inferences that are guaranteed to be more accurate than using either posterior by itself. We demonstrate this new method in a host of applications.
 26 October 2022
 N.N.
 02 November 2022
 Johannes Schmidt-Hieber (University of Twente)
 Overparametrization and the bias-variance dilemma
 Abstract: For several machine learning methods such as neural networks, good generalisation performance has been reported in the overparametrized regime. In view of the classical bias-variance trade-off, this behaviour is highly counterintuitive. The talk summarizes recent theoretical results on overparametrization and the bias-variance trade-off. This is joint work with Alexis Derumigny (Delft).
 09 November 2022
 Claudia Schillings (FU Berlin) (NOTE: Room 3.13, Hausvogteiplatz 11a)
 The Convergence of the Laplace Approximation and Noise-Level-Robust Computational Methods for Bayesian Inverse Problems
 Abstract: The Bayesian approach to inverse problems provides a rigorous framework for the incorporation and quantification of uncertainties in measurements, parameters and models. We are interested in designing numerical methods which are robust w.r.t. the size of the observational noise, i.e., methods which behave well in case of concentrated posterior measures. The concentration of the posterior is a highly desirable situation in practice, since it relates to informative or large data. However, it can pose a computational challenge for numerical methods based on the prior measure. We propose to use the Laplace approximation of the posterior as the reference measure for the numerical integration and analyze the efficiency of Monte Carlo methods based on it.
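As a rough illustration of the idea, the following sketch builds the Laplace approximation (a Gaussian centred at the MAP estimate, with covariance given by the inverse Hessian of the negative log-posterior) for a toy one-dimensional inverse problem, and then uses it as the reference measure for self-normalised importance sampling. The forward map, noise level, and all numerical choices are assumptions made for the example only:

```python
import numpy as np
from scipy.optimize import minimize

# Toy Bayesian inverse problem: y = G(x) + noise, with G(x) = x**3,
# a standard normal prior, and a small noise level sigma
# (so the posterior concentrates, as in the informative-data regime).
sigma, y_obs = 0.05, 0.5

def neg_log_post(x):
    x = float(np.atleast_1d(x)[0])
    return 0.5 * ((y_obs - x**3) / sigma) ** 2 + 0.5 * x**2

# Laplace approximation: mode via optimisation, curvature via a
# central finite difference (in 1D the Hessian is a second derivative).
res = minimize(neg_log_post, x0=0.5)
map_est = float(res.x[0])
h = 1e-4
hess = (neg_log_post(map_est + h) - 2 * neg_log_post(map_est)
        + neg_log_post(map_est - h)) / h**2
laplace_var = 1.0 / hess

# Self-normalised importance sampling with the Laplace approximation
# as the reference measure instead of the prior.
rng = np.random.default_rng(1)
xs = rng.normal(map_est, np.sqrt(laplace_var), size=10_000)
log_w = (-np.array([neg_log_post(x) for x in xs])
         + 0.5 * (xs - map_est) ** 2 / laplace_var)
w = np.exp(log_w - log_w.max())
posterior_mean = np.sum(w * xs) / np.sum(w)
```

Because the proposal already tracks the concentrated posterior, the importance weights stay well behaved as sigma shrinks, which is the robustness property a prior-based sampler lacks.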
 16 November 2022
 Aila Särkkä (Chalmers University of Technology and University of Gothenburg)
 Anisotropy analysis and modelling of spatial point patterns
 Abstract: In the early spatial point process literature, observed point patterns were typically small and no repetitions were available. It was natural to assume that the patterns were realizations of stationary and isotropic point processes. Nowadays, large data sets with repetitions have become more and more common, and it is important to think about the validity of these assumptions. Non-stationarity has received quite a lot of attention in recent years, and it is straightforward to include it in many point process models. Isotropy, on the other hand, is often still assumed without further checking, and even though several tools have been suggested to detect and test for isotropy, they have not been widely used. This talk will give an overview of nonparametric methods for anisotropy analysis of (stationary) point processes ([3], [4], [5]). Methods based on nearest-neighbour and second-order summary statistics as well as on spectral and wavelet analysis will be discussed. The techniques will be illustrated on both a clustered and a regular example. In the second part of the talk, one of the methods will be used to estimate the deformation history in polar ice using the measured anisotropy of air inclusions from deep ice cores [2]. In addition, an anisotropic point process model for nerve fiber data will be presented [1].
 References: [1] Konstantinou K and Särkkä A (2022). Pairwise interaction Markov model for 3D epidermal nerve fiber endings. To appear in Journal of Microscopy.
 [2] Rajala T, Särkkä A, Redenbach C, and Sormani M (2016). Estimating geometric anisotropy in spatial point patterns. Spatial Statistics 15, 100–114.
 [3] Rajala T, Redenbach C, Särkkä A, and Sormani M (2018). A review on anisotropy analysis of spatial point patterns. Spatial Statistics 28, 141–168.
 [4] Rajala T, Redenbach C, Särkkä A, and Sormani M (2022). Tests for isotropy in spatial point patterns. Under revision.
 [5] Sormani M, Redenbach C, Särkkä A, and Rajala T (2020). Second order directional analysis of point processes revisited. Spatial Statistics 38, 100456.
 23 November 2022
 Alexei Kroshnin (WIAS Berlin)
 Robust k-means clustering in Hilbert and metric spaces
 Abstract: In this talk, we consider robust algorithms for the k-means clustering (quantization) problem, where a quantizer is constructed based on N independent observations. While the well-known asymptotic result by Pollard shows that the existence of two moments is sufficient for strong consistency of an empirically optimal quantizer in R^d, non-asymptotic bounds are usually obtained under the assumption of bounded support. We discuss a robust k-means procedure in Hilbert and metric spaces based on trimming, and prove non-asymptotic bounds on the excess distortion which depend on the probability mass of the lightest cluster and the second moment of the distribution.
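A minimal numerical sketch of the trimming idea in R^2: plain Lloyd iterations in which the alpha-fraction of points farthest from their nearest centre is discarded before the centres are recomputed. The deterministic initialisation and the toy data are assumptions for illustration only, not the estimator analysed in the talk:

```python
import numpy as np

def trimmed_kmeans(X, init_centres, alpha=0.1, n_iter=50):
    """Lloyd-style k-means in which, at every iteration, the fraction
    alpha of points farthest from their nearest centre is trimmed
    before the centres are recomputed."""
    centres = np.array(init_centres, dtype=float)
    n_keep = int(np.ceil((1 - alpha) * len(X)))
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        nearest, dist = d.argmin(axis=1), d.min(axis=1)
        keep = np.argsort(dist)[:n_keep]  # indices surviving the trim
        for j in range(len(centres)):
            members = keep[nearest[keep] == j]
            if len(members):
                centres[j] = X[members].mean(axis=0)
    return centres

# Two well-separated Gaussian clusters plus a few gross outliers,
# which would drag ordinary k-means centres far off.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (50, 2)),
               rng.normal(5.0, 0.1, (50, 2)),
               rng.uniform(50.0, 60.0, (5, 2))])  # outliers

# Initialise at one observed point from each cluster (a real
# implementation would use random restarts or a smarter seeding).
centres = trimmed_kmeans(X, X[[0, 50]], alpha=0.1)
```

With alpha large enough to absorb the outliers, the trimmed update sees only the two clean clusters, so the centres settle near the cluster means despite the heavy-tailed contamination.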
 30 November 2022
 Mikolaj Kasprzak (University of Luxembourg)
 How good is your Laplace approximation? Finite-sample error bounds for a variety of useful divergences
 Abstract: The Laplace approximation is a popular method of approximating an intractable Bayesian posterior by a suitably chosen Gaussian distribution. But can we trust this approximation for practical use? Its theoretical justification comes from the celebrated Bernstein-von Mises theorem (also known as the Bayesian CLT or BCLT). However, an obstacle to its wider use is the lack of widely applicable post-hoc checks on its quality. Our work provides closed-form, finite-sample quality bounds for the Laplace approximation that simultaneously (1) do not require knowing the true parameter, (2) control posterior means and variances, and (3) apply generally to models that satisfy the conditions of the asymptotic BCLT. In fact, our bounds work even in the presence of misspecification. We compute exact constants in our bounds for a variety of standard models, including logistic regression, and numerically demonstrate their utility. And we provide a framework for analysis of more complex models. This is joint work with Ryan Giordano (MIT) and Tamara Broderick (MIT). A preliminary version of the work is available here: https://arxiv.org/abs/2209.14992.
 07 December 2022
 Matthias Vetter (Universität Kiel)
 On goodness-of-fit testing for point processes
 Abstract: Typical models for point processes like Hawkes processes or inhomogeneous Poisson processes are often of a parametric form where the intensity function or an additional self-exciting component is known up to an unspecified parameter. A lot of research since the seminal paper by Ogata (1978) has been devoted to the estimation of these unknown parameters, but even in these rather standard models a consistent goodness-of-fit test has been missing. This talk aims to fill this gap. We will show how to formally set up a bootstrap procedure to allow for goodness-of-fit testing, and we will discuss how to prove consistency of the test in the (already quite involved) case of an inhomogeneous Poisson process.
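To fix ideas, here is a sketch of a parametric bootstrap goodness-of-fit test in the simplest possible case: a homogeneous Poisson null on [0, T], checked with a Kolmogorov-Smirnov statistic on the event times. The entire setup is an illustrative assumption, far simpler than the Hawkes and inhomogeneous-Poisson settings of the talk:

```python
import numpy as np

rng = np.random.default_rng(2)

def ks_stat(times, T=1.0):
    """KS distance between the event times and the uniform distribution,
    which is the conditional law of the points of a homogeneous Poisson
    process on [0, T] given their number."""
    t = np.sort(np.asarray(times)) / T
    n = len(t)
    grid = np.arange(1, n + 1) / n
    return max(np.max(grid - t), np.max(t - (grid - 1 / n)))

def bootstrap_gof(times, T=1.0, B=500):
    """Parametric bootstrap p-value for H0: homogeneous Poisson on [0, T].
    The rate is estimated by n/T; bootstrap patterns are simulated from
    the fitted model and the KS statistic is recomputed on each."""
    lam_hat = len(times) / T
    obs = ks_stat(times, T)
    boot = []
    for _ in range(B):
        m = rng.poisson(lam_hat * T)
        boot.append(ks_stat(rng.uniform(0, T, max(m, 1)), T))
    return np.mean(np.array(boot) >= obs)

# Under H0 the p-value behaves like a uniform draw ...
p_h0 = bootstrap_gof(rng.uniform(0, 1, rng.poisson(200)))
# ... while a strongly inhomogeneous pattern is rejected.
p_h1 = bootstrap_gof(rng.uniform(0, 1, 200) ** 3)
```

The consistency question addressed in the talk is exactly whether such a bootstrap scheme reproduces the null distribution of the statistic, which is delicate once estimated parameters enter the simulation step.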
 14 December 2022
 Jovanka Lili Matic (HU, IRTG 1792)
 Global Sensitivity Analysis in the Presence of Missing Values
 04 January 2023
 Franz Besold (WIAS Berlin)
 Adaptive Weights Community Detection
 Abstract: Due to the technological progress of the last decades, Community Detection has become a major topic in machine learning. However, there is still a huge gap between practical and theoretical results, as theoretically optimal procedures often lack a feasible implementation and vice versa. This work aims to close this gap and presents a novel algorithm that is both numerically and statistically efficient. Our procedure uses a test of homogeneity to compute adaptive weights describing local communities. The approach was inspired by the Adaptive Weights Community Detection (AWCD) algorithm by Adamyan et al. (2019). This algorithm delivered some promising results on artificial and real-life data, but our theoretical analysis reveals its performance to be suboptimal on a stochastic block model. In particular, the involved estimators are biased and the procedure does not work for sparse graphs. We propose significant modifications, addressing both shortcomings and achieving a nearly optimal rate of strong consistency on the stochastic block model. Our theoretical results are illustrated and validated by numerical experiments.
 11 January 2023
 Leonhard Held (Universität Zürich)
 tba
 18 January 2023
 Tim Jahn (Universität Bonn)
 tba
 25 January 2023
 Maria Grith (Erasmus University Rotterdam)
 tba
 01 February 2023
 Radu Stoica (Université de Lorraine, Nancy)
 Random structures and patterns in spatiotemporal data: probabilistic modelling and statistical inference
 Abstract: The useful information carried by spatio-temporal data is often outlined by geometric structures and patterns. Filaments or clusters induced by galaxy positions in our Universe are such an example. Two situations are to be considered. First, the pattern of interest is hidden in the data set, and hence the pattern should be detected. Second, the structure to be studied is observed, so a relevant characterization of it should be carried out. This talk is structured in four parts. The first part presents the construction of different marked point processes together with their properties, such that the characteristics of the patterns of interest are modelled by these processes. Second, MCMC dynamics tailored to these models, which are used to simulate them, are presented, together with a discussion of the performance of these algorithms and a comparison with exact simulation methods. Third, on this basis, inference procedures are derived. They include level-set estimators, global optimisation, and Approximate Bayesian Computation. Finally, applications to cosmological and geological real data are shown.
 08 February 2023
 Vanessa Didelez (Universität Bremen)
 tba
 15 February 2023
 N.N.
All interested colleagues are cordially invited.
For inquiries, please contact:
Ms Andrea Fiebig
Mail: fiebig@mathematik.hu-berlin.de
Phone: +49 30 2093 45460
Fax: +49 30 2093 45451
Humboldt-Universität zu Berlin
Institut für Mathematik
Unter den Linden 6
10099 Berlin, Germany