Research Seminar Mathematical Statistics
For the area of Statistics
A. Carpentier, S. Greven, W. Härdle, M. Reiß, V. Spokoiny
Location
Weierstrass-Institut für Angewandte Analysis und Stochastik
Erhard-Schmidt-Raum
Mohrenstrasse 39
10117 Berlin
Time
Wednesdays, 10:00–12:00
Programme

 Attention!
 The seminar will be held in hybrid form via Zoom. In line with the current hygiene recommendations, our lecture room ESH has a capacity of only 16 people. If you intend to attend some of the talks in person, you must register for our mailing list with Andrea Fiebig (fiebig@math.hu-berlin.de). Before each talk, a Doodle poll will be sent by email; signing up there (putting your name on the list) is mandatory for anyone who wants to attend in person. If 16 guests have already registered, please follow the streamed talk instead via the Zoom link (available from fiebig@math.hu-berlin.de).
 The so-called "3G rule" applies at the Weierstrass Institute.
(joint work with E. Chzhen and G. Stoltz)
 Pure Differential Privacy in Functional Data Analysis
 Abstract: We consider the problem of achieving pure differential privacy in the context of functional data analysis, or more generally nonparametric statistics, where the summary of interest can naturally be viewed as an element of a function space. In this talk I will give a brief overview of and motivation for differential privacy before delving into the challenges that arise in the sanitization of an infinite-dimensional summary. I will present a new mechanism, called the Independent Component Laplace Process, for achieving privacy, followed by several applications and examples.
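As a hedged illustration of the basic idea (not the speaker's actual construction): truncating the functional summary to finitely many basis coefficients and adding independent Laplace noise per coefficient gives a minimal finite-dimensional sketch. The `sensitivities` vector and the equal split of the privacy budget are assumptions made here for illustration.

```python
import numpy as np

def laplace_basis_mechanism(coeffs, sensitivities, epsilon, rng=None):
    """Privatize a vector of basis coefficients by adding independent
    Laplace noise; component j gets scale sensitivities[j] / epsilon_j,
    where the per-component budgets epsilon_j sum to the total epsilon."""
    rng = np.random.default_rng(rng)
    coeffs = np.asarray(coeffs, dtype=float)
    k = len(coeffs)
    eps_j = np.full(k, epsilon / k)          # naive equal split of the budget
    scales = np.asarray(sensitivities, dtype=float) / eps_j
    return coeffs + rng.laplace(loc=0.0, scale=scales)

# Example: privatize the first 5 basis coefficients of a mean function
true_coeffs = np.array([1.0, 0.5, 0.25, 0.125, 0.0625])
private = laplace_basis_mechanism(true_coeffs, np.ones(5), epsilon=1.0, rng=0)
```

The infinite-dimensional case, where such truncations must be handled carefully to retain pure (not approximate) privacy, is precisely what the talk addresses.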
 01 December 2021
 Nikita Puchkin (Higher School of Economics, Moscow) (online talk)
 Rates of convergence for density estimation with GANs
 Abstract: We undertake a thorough study of the non-asymptotic properties of vanilla generative adversarial networks (GANs). We derive theoretical guarantees for density estimation with GANs under a proper choice of the classes of deep neural networks representing the generators and discriminators. In particular, we prove that the resulting estimate converges to the true density $p^*$ in terms of the Jensen-Shannon (JS) divergence at the rate $(\log n / n)^{2\beta/(2\beta+d)}$, where $n$ is the sample size and $\beta$ determines the smoothness of $p^*$. Moreover, we show that the obtained rate is minimax optimal (up to logarithmic factors) for the considered class of densities.
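The JS divergence in which the rate is stated can be computed directly for discrete densities; a minimal sketch (using the natural logarithm):

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete densities:
    JS(p, q) = (KL(p||m) + KL(q||m)) / 2 with m = (p + q) / 2."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0                      # 0 * log(0) is taken as 0
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

uniform = np.full(4, 0.25)
point = np.array([1.0, 0.0, 0.0, 0.0])
```

JS is symmetric and bounded by log 2, which makes it a convenient loss for comparing a GAN-generated density with the truth.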
 08 December 2021
 Davy Paindaveine (Université libre de Bruxelles)
 Hypothesis testing on high-dimensional spheres: the Le Cam approach
 Abstract: Hypothesis testing in high dimensions has been one of the most active research topics of the last decade. Both theoretical and practical considerations make it natural to restrict to sign tests, that is, to tests that use the observations only through their directions from a given center. This maps the original Euclidean problem to a spherical one, still in high dimensions. With this motivation in mind, we tackle two testing problems on high-dimensional spheres, both under a symmetry assumption specifying that the distribution at hand is invariant under rotations about a given axis. More precisely, we consider the problem of testing the null hypothesis of uniformity ("detecting the signal") and the problem of testing the null hypothesis that the symmetry axis coincides with a given direction ("learning the signal direction"). We solve both problems by exploiting Le Cam's asymptotic theory of statistical experiments, in a double- or triple-asymptotic framework. Interestingly, contiguity rates depend in a subtle way on how well the parameters involved are identified, as well as on a possible further antipodally symmetric nature of the distribution. In many cases, strong optimality results are obtained from local asymptotic normality. When this cannot be achieved, it is still possible to establish minimax rate optimality.
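A classical fixed-dimension relative of such uniformity tests is the Rayleigh test, which rejects uniformity when the empirical mean direction is too long; under uniformity on the unit sphere in R^d, the statistic n·d·||mean||² is asymptotically chi-squared with d degrees of freedom. A minimal sketch (an illustration only, not the speaker's high-dimensional procedure):

```python
import numpy as np

def rayleigh_statistic(X):
    """Rayleigh test of uniformity: rows of X are unit vectors in R^d;
    returns n * d * ||sample mean||^2, asymptotically chi2_d under
    uniformity on the sphere."""
    n, d = X.shape
    xbar = X.mean(axis=0)
    return n * d * float(xbar @ xbar)

rng = np.random.default_rng(1)
Z = rng.standard_normal((2000, 3))
U = Z / np.linalg.norm(Z, axis=1, keepdims=True)   # uniform directions
stat_null = rayleigh_statistic(U)                  # moderate under the null

V = Z + np.array([3.0, 0.0, 0.0])                  # a strongly directed sample
V /= np.linalg.norm(V, axis=1, keepdims=True)
stat_alt = rayleigh_statistic(V)                   # very large
```

The talk studies what happens to such sign-based tests when the dimension d itself grows with n.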
 05 January 2022
 Alexandra Suvorikova (WIAS Berlin) (online talk)
 Robust k-means in metric spaces and spaces of probability measures
 12 January 2022
 Martin Wahl (HU Berlin)
 Lower bounds for invariant statistical models with applications to PCA
 Abstract: This talk will be concerned with non-asymptotic lower bounds for the estimation of principal subspaces. I will start by reviewing some previous methods, including the local asymptotic minimax theorem and the Grassmann approach. Then I will present a new approach based on a van Trees inequality (i.e. a Bayesian version of the Cramér-Rao inequality) tailored for invariant statistical models. As applications, I will provide non-asymptotic lower bounds for principal component analysis, the matrix denoising problem and the phase synchronization problem.
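For reference, in its scalar form the van Trees inequality mentioned above states that for a prior density $\pi$ (vanishing at the boundary of its support) and any estimator $\hat\theta$,

```latex
\[
  \mathbb{E}\bigl[(\hat\theta - \theta)^2\bigr]
  \;\ge\;
  \frac{1}{\mathbb{E}_\pi[\,I(\theta)\,] + \mathcal{I}(\pi)},
  \qquad
  \mathcal{I}(\pi) = \int \frac{\pi'(\theta)^2}{\pi(\theta)}\,d\theta,
\]
```

where the expectation on the left is over both the data and $\theta \sim \pi$, and $I(\theta)$ is the Fisher information of the model. Unlike the Cramér-Rao bound, it requires no unbiasedness and immediately yields minimax lower bounds; the talk extends this tool to invariant models.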
 19 January 2022
 Denis Belomestny (Universität Duisburg-Essen)
 Minimax bounds on the sample complexity of reinforcement learning with a generative model
 Abstract: We consider the problem of learning the optimal value function in discounted-reward Markov decision processes (MDPs). We analyze the sample complexity of a new upper-value iteration procedure in the presence of a generative model of the MDP. The main result indicates that for an MDP with $N$ state-action pairs, only $O(N \log(N)/\epsilon)$ state-transition samples are required to find an $\epsilon$-optimal estimate of the corresponding value function with high probability. This bound should be contrasted with the $O(N \log(N)/\epsilon^2)$ complexity bound for estimating the action-value function. We also discuss the optimality of the obtained complexity bound.
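The generative-model setting can be sketched with a toy sampled value iteration, where each Bellman backup draws m next states per state-action pair from the simulator instead of using the exact kernel (a schematic illustration, not the speaker's upper-value procedure):

```python
import numpy as np

def sampled_value_iteration(P, R, gamma, m, iters, rng=None):
    """Value iteration for a finite MDP using a generative model: each
    backup uses m sampled next states per state-action pair.
    P: (S, A, S) transition probabilities, R: (S, A) rewards."""
    rng = np.random.default_rng(rng)
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = np.empty((S, A))
        for s in range(S):
            for a in range(A):
                nxt = rng.choice(S, size=m, p=P[s, a])  # generative-model call
                Q[s, a] = R[s, a] + gamma * V[nxt].mean()
        V = Q.max(axis=1)
    return V

# Tiny deterministic MDP: state 0 earns reward 1 and moves to absorbing state 1
P = np.zeros((2, 1, 2)); P[0, 0, 1] = 1.0; P[1, 0, 1] = 1.0
R = np.array([[1.0], [0.0]])
V = sampled_value_iteration(P, R, gamma=0.5, m=3, iters=50, rng=0)
```

The sample-complexity question of the talk is how large m (hence the total number of transition samples) must be for such an estimate to be epsilon-accurate with high probability.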
 26 January 2022
 Pierre Jacob (ESSEC Paris) (online talk)
 Some methods based on couplings of Markov chain Monte Carlo algorithms
 Abstract: Markov chain Monte Carlo algorithms are commonly used to approximate a variety of probability distributions, such as posterior distributions arising in Bayesian analysis. I will review the idea of coupling in the context of Markov chains, and how this idea not only leads to theoretical analyses of Markov chains but also to new Monte Carlo methods. In particular, the talk will describe how coupled Markov chains can be used to obtain 1) unbiased estimators of expectations and of normalizing constants, 2) nonasymptotic convergence diagnostics for Markov chains, and 3) unbiased estimators of the asymptotic variance of MCMC ergodic averages.
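The basic coupling ingredient behind these methods can be illustrated with a maximal coupling of two discrete distributions: a joint draw (X, Y) with the correct marginals in which X = Y with the largest possible probability, 1 − TV(p, q). A minimal sketch:

```python
import numpy as np

def maximal_coupling(p, q, rng):
    """Sample (X, Y) with X ~ p and Y ~ q such that
    P(X = Y) = sum_i min(p_i, q_i) = 1 - TV(p, q)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    x = rng.choice(len(p), p=p)
    if rng.uniform() * p[x] <= q[x]:      # accept: the two draws coincide
        return x, x
    resid = np.maximum(q - p, 0.0)        # residual part of q
    resid /= resid.sum()
    return x, rng.choice(len(q), p=resid)

rng = np.random.default_rng(0)
p = np.array([0.3, 0.7])
x, y = maximal_coupling(p, p, rng)        # identical laws: the draws meet
```

Running two Markov chains with such couplings until they meet is what produces the unbiased estimators and convergence diagnostics described in the talk.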
 02 February 2022
 Alessandra Menafoglio (MOX, Dept. of Mathematics, Politecnico di Milano)
 Object Oriented Data Analysis in Bayes spaces: from distributional data to the analysis of complex shapes
 Abstract: In the presence of increasingly massive and heterogeneous data, the statistical modeling of distributional observations plays a key role. Choosing the 'right' embedding space for these data is of paramount importance for their statistical processing, to account for their nature and inherent constraints. The Bayes space theory provides a natural embedding space for (spatial) distributional data and has been successfully applied in varied settings. In this presentation, I will discuss state-of-the-art methods for the modelling, analysis, and prediction of distributional data, with particular attention to cases where their spatial dependence cannot be neglected. I will embrace the viewpoint of object-oriented spatial statistics (O2S2), a system of ideas for the analysis of complex data with spatial dependence. All the theoretical developments will be illustrated through their application to real data, highlighting the intrinsic challenges of a statistical analysis that follows the Bayes space approach. Applications will cover a varied range of fields, from the assessment of the impact of COVID-19 on mortality to the analysis of complex shapes produced in additive manufacturing.
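A standard entry point to Bayes-space methods is the centered log-ratio (clr) transform, which maps (discrete) densities to an unconstrained linear space where ordinary statistical tools apply; a minimal sketch:

```python
import numpy as np

def clr(p):
    """Centered log-ratio transform: maps a strictly positive discrete
    density to a zero-mean vector in ordinary Euclidean space."""
    logp = np.log(np.asarray(p, float))
    return logp - logp.mean()

def clr_inverse(z):
    """Map back: exponentiate and renormalize to a density."""
    w = np.exp(np.asarray(z, float))
    return w / w.sum()

p = np.array([0.1, 0.2, 0.3, 0.4])
z = clr(p)            # unconstrained representation of the density
```

Working in the transformed space respects the scale-invariance and positivity constraints of densities, which is the point of the Bayes-space embedding discussed in the talk.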
 09 February 2022
 Erwan Scornet (France)
 Variable importance in random forests
 Abstract: Nowadays, machine learning procedures are used in many fields, with the notable exception of so-called sensitive areas (health, justice, defense, to name a few), in which the decisions to be taken are fraught with consequences. In these fields, it is necessary to obtain a precise decision, but, to be effectively applied, these algorithms must also provide an explanation of the mechanisms that lead to the decision and, in this sense, be interpretable. Unfortunately, the most accurate algorithms today are often the most complex. A classic technique for explaining their predictions is to compute indicators of the strength of the dependence between each input variable and the output to be predicted. In this talk, we will focus on variable importance measures designed for the original random forest algorithm: the Mean Decrease Impurity (MDI) and the Mean Decrease Accuracy (MDA). We will see how theoretical results provide guidance for their practical use.
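The MDA idea can be sketched generically: permute one feature column and measure how much the prediction error increases. The toy `model` below is a stand-in for a fitted forest (a schematic illustration of the permutation mechanism, not the exact forest-based definition):

```python
import numpy as np

def permutation_importance(predict, X, y, rng=None):
    """MDA-style importance for a fitted regressor: the increase in mean
    squared error after permuting one column of X at a time."""
    rng = np.random.default_rng(rng)
    base = np.mean((predict(X) - y) ** 2)
    scores = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])   # break the link to feature j
        scores.append(np.mean((predict(Xp) - y) ** 2) - base)
    return np.array(scores)

# Toy data: the response depends on the first feature only
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))
y = 2.0 * X[:, 0]
model = lambda Z: 2.0 * Z[:, 0]                # stand-in for a fitted forest
imp = permutation_importance(model, X, y, rng=0)
```

Permuting the informative feature degrades the fit, while permuting the irrelevant ones leaves the error unchanged; the talk examines when such scores are theoretically trustworthy.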
 16 February 2022
 Céline Duval
 tba
All interested are cordially invited.
For inquiries, please contact:
Ms Andrea Fiebig
Mail: fiebig@mathematik.hu-berlin.de
Phone: +49 30 209345460
Fax: +49 30 209345451
HumboldtUniversität zu Berlin
Institut für Mathematik
Unter den Linden 6
10099 Berlin, Germany