## Research Seminar Mathematical Statistics

### For the Statistics Group

G. Blanchard, M. Reiß, V. Spokoiny, W. Härdle

*Location*

Weierstrass-Institut für Angewandte Analysis und Stochastik

Erhard-Schmidt-Raum

Mohrenstrasse 39

10117 Berlin

*Time*

Wednesdays, 10:00 - 12:30

*Program*

- 18 October 2017
**Vladimir Spokoiny** (WIAS Berlin) - Big ball probability with applications in statistical inference
- Abstract: We derive bounds on the Kolmogorov distance between the probabilities of two Gaussian elements hitting a ball in a Hilbert space. The key property of these bounds is that they are dimension-free and depend on the nuclear (Schatten-one) norm of the difference between the covariance operators of the elements. We are also interested in the anticoncentration bound for the squared norm of a non-centered Gaussian element in a Hilbert space. All bounds are sharp and cannot be improved in general. We also provide a list of motivating examples and applications of the derived results in statistical inference.
- 25 October 2017
**Debarghya Ghoshdastidar** (U Tübingen) - Two-sample Hypothesis Testing for Inhomogeneous Random Graphs
- Abstract: In this work, we consider the problem of testing between two populations of inhomogeneous random graphs defined on the same set of vertices. We are particularly interested in the high-dimensional setting where the population size is potentially much smaller than the graph size, and may even be constant. It is known that this setting cannot be tackled if the separation between two models is quantified in terms of total variation distance.

Hence, we study two-sample testing problems where the separation between models is quantified by the Frobenius or operator norms of the difference between the population adjacency matrices. We derive upper and lower bounds for the minimax separation rate for these problems. Interestingly, the proposed near-optimal tests are uniformly consistent in both the “large graph, small sample” and “small graph, large sample” regimes.

This is joint work with Maurilio Gutzeit, Alexandra Carpentier and Ulrike von Luxburg.
- 1 November 2017
**Denis Belomestny** (U Duisburg-Essen) - Statistical inference for McKean-Vlasov-SDEs
- Abstract: McKean-Vlasov-SDEs provide a very rich modelling framework for large complex systems. They naturally appear in the modelling and simulation of turbulent flows by fluid-particle methods. In biomathematics, a McKean-Vlasov-SDE model for neuronal networks has been proposed. Although these models are potentially very powerful, the lack of efficient statistical procedures prevents further expansion of these results into application areas. When proposing a McKean-Vlasov-SDE model, one of the main challenges is the appropriate choice of the coefficients. In this talk, we study the problem of the nonparametric estimation of the McKean-Vlasov diffusion coefficients from low-frequency observations.
- 8 November 2017
**A. Dalalyan** (Paris) - On the Exponentially Weighted Aggregate with the Laplace Prior
- Abstract: In this talk, we will present some results on the statistical behaviour of the Exponentially Weighted Aggregate (EWA) in the problem of high-dimensional regression with fixed design. Under the assumption that the underlying regression vector is sparse, it is reasonable to use the Laplace distribution as a prior. The resulting estimator and, specifically, a particular instance of it referred to as the Bayesian lasso, was already used in the statistical literature because of its computational convenience, even though no thorough mathematical analysis of its statistical properties was carried out. The results of this talk fill this gap by establishing sharp oracle inequalities for the EWA with the Laplace prior. These inequalities show that if the temperature parameter is small, the EWA with the Laplace prior satisfies the same type of oracle inequality as the lasso estimator does, as long as the quality of estimation is measured by the prediction loss. Extensions of the proposed methodology to the problem of prediction with low-rank matrices will be discussed as well.

(Based on joint work with Edwin Grappin and Quentin Paris.)
- 16 November 2017
**Enkelejd Hashorva** (U Lausanne) - From Classical to Parisian Ruin in Gaussian Risk Models
- Abstract: This talk is concerned with Gaussian risk models, which reasonably approximate the risk process of an insurance company. Such models incorporate various financial elements related to inflation/deflation and taxation. Also of interest from the probabilistic point of view is the approximation of the ruin probability and the ruin time when the initial capital is large. The concept of Parisian ruin is quite new and appealing for mathematical models of insurance risks. However, the calculation of the Parisian ruin probability and the Parisian ruin time is a hard problem. Recent research has also focused on the investigation of multi-valued risk models, analysing the ruin probability and the ruin time. Currently, due to the lack of appropriate tools, results are available only for the Brownian risk model. In this talk, various approximations of the ruin probability and ruin times for both the classical and the Parisian case will be discussed, including results for the multi-valued Brownian risk model.

Joint work with K. Debicki (University of Wroclaw) and L. Ji (University of Lausanne).
- 22 November 2017
**We celebrate the 50th anniversary of the MMS** **The older Math Stat Seminar in Germany great guests com e.g.**
- Program
- 6 December 2017
**Alexander Meister** (U Rostock) - Nonparametric density estimation for intentionally corrupted functional data
- Abstract: We consider statistical models where, in order to satisfy privacy constraints, functional data are artificially contaminated by independent Wiener processes. We show that the corrupted observations have a Wiener density, which determines the distribution of the original functional random variables uniquely. We construct a nonparametric estimator of the functional density and study its asymptotic properties. We provide an upper bound on its mean integrated squared error which yields polynomial convergence rates, and we establish lower bounds on the minimax convergence rates which are close to the rates attained by our estimator. Our estimator requires the choice of a basis and of two smoothing parameters. We propose data-driven ways of choosing them and prove that the asymptotic quality of our estimator is not significantly affected by the empirical parameter selection. We apply our technique to a classification problem of real data and provide some numerical results.

This talk is based on joint work with A. Delaigle (University of Melbourne).
- 13 December 2017
**Fabian Dunker** (University of Canterbury) - Multiscale Tests for Shape Constraints in Linear Random Coefficient Models
- Abstract: A popular way to model unobserved heterogeneity is the linear random coefficient model $Y_i=\beta_{i,1}X_{i,1}+\beta_{i,2}X_{i,2}+\ldots+\beta_{i,d}X_{i,d}$. We assume that the observations $(X_i,Y_i)$, $i=1,\ldots,n$, are i.i.d., where $X_i=(X_{i,1},\ldots,X_{i,d})$ is a $d$-dimensional vector of regressors. The random coefficients $\beta_i=(\beta_{i,1},\ldots,\beta_{i,d})$, $i=1,\ldots,n$, are unobserved i.i.d. realizations of an unknown $d$-dimensional distribution with density $f_\beta$, independent of $X_i$. We propose and analyze a nonparametric multi-scale test for shape constraints of the random coefficient density $f_\beta$. In particular, we are interested in confidence sets for slopes and modes of the density. The test uses the connection between the model and the $d$-dimensional Radon transform and is based on Gaussian approximation of empirical processes. This is joint work with K. Eckle, K. Proksch, and J. Schmidt-Hieber.
- 10 January 2018
**Antoine Chambaz** (Université Paris-Descartes) - An introduction to targeted learning
- Abstract: Coined by Mark van der Laan and Dan Rubin in 2006, targeted learning is a general approach to learning from data that reconciles machine learning and statistical inference. On the one hand, "machine learning" refers to the estimation of infinite-dimensional features of the law of the data, $P$, for instance a regression function. Machine learning algorithms are versatile, and produce (possibly highly) data-adaptive estimators. Driven by the need to make accurate predictions, they do not care so much about the assessment of prediction uncertainty. On the other hand, "statistical inference" refers to the estimation of finite-dimensional parameters of $P$, for instance a measure of association with a causal interpretation. It focuses on the construction of confidence regions or the development of hypothesis tests. Emphasis is placed on robustness (guaranteeing that one goes to the truth even under mild and reasonable assumptions on $P$), efficiency (trying to draw as much information from the data as possible), and controlling the asymptotic levels or type I errors.
- 17 January 2018
**Anthony Nouy** (École Centrale de Nantes) - Learning high-dimensional functions with tree tensor networks
- Abstract: Tensor methods are among the most prominent tools for the approximation of high-dimensional functions. Such approximation problems naturally arise in statistical learning, stochastic analysis and uncertainty quantification. In many practical situations, the approximation of high-dimensional functions is made computationally tractable by using rank-structured approximations. In this talk, we give an introduction to tree-based (hierarchical) tensor formats, which can be interpreted as deep neural networks with particular architectures. Then we present adaptive algorithms for the approximation in these formats using statistical methods.
- 24 January 2018
**Andreas Maurer** (München) - Concentration for functions of bounded interaction
- Abstract: Some multivariate functions have the property that their variation in any argument does not change too much when another argument is modified. The talk will give some examples and concentrate on the random variable $W$ obtained by applying such a function to a vector of independent variables. Functions with weakly interacting arguments share some important properties with sums: the expectation of $W$ can be estimated by a version of Bernstein's inequality, and its variance can be tightly estimated in terms of an i.i.d. sample which has only one datum more than the function has arguments. There is also a version of the central limit theorem.
- 31 January 2018
**Elisabeth Gassiat** (Université Paris-Sud) - Estimation of the proportion of explained variation in high dimensions
- Abstract: The estimation of the heritability of a phenotypic trait based on genetic data may be set as the estimation of the proportion of explained variation in high dimensional linear models. I will be interested in understanding the impact of:

— not knowing the sparsity of the regression parameter,

— not knowing the variance matrix of the covariates

on minimax estimation of heritability. In the situation where the variance of the design is known, I will present an estimation procedure that adapts to unknown sparsity. When the variance of the design is unknown and no prior estimator of it is available, I will show that consistent estimation of heritability is impossible. (Joint work with N. Verzelen, and PhD thesis of A. Bonnet.)
- 7 February 2018
**Gitta Kutyniok** (TU Berlin) - Optimal Approximation with Sparsely Connected Deep Neural Networks
- Abstract: Despite the outstanding success of deep neural networks in real-world applications, most of the related research is empirically driven and a mathematical foundation is almost completely missing. One central task of a neural network is to approximate a function, which for instance encodes a classification task. In this talk, we will be concerned with the question, how well a function can be approximated by a neural network with sparse connectivity. Using methods from approximation theory and applied harmonic analysis, we will derive a fundamental lower bound on the sparsity of a neural network. By explicitly constructing neural networks based on certain representation systems, so-called $\alpha$-shearlets, we will then demonstrate that this lower bound can in fact be attained. Finally, we present numerical experiments, which surprisingly show that already the standard backpropagation algorithm generates deep neural networks obeying those optimal approximation rates. This is joint work with H. Bölcskei (ETH Zurich), P. Grohs (Uni Vienna), and P. Petersen (TU Berlin).
- 14 February 2018
**Jean-Pierre Florens** (U Toulouse) - Is Completeness Necessary? Penalized Estimation in Non Identified Linear Models
- Abstract: Identification is an important issue in many econometric models. This paper studies potentially non-identified and/or weakly identified ill-posed inverse models. The leading examples are the nonparametric IV regression and the functional linear IV regression. We show that in the case of identification failures, a very general family of continuously-regularized estimators is consistent for the best approximation of the parameter of interest. We obtain $L_2$ and $L_\infty$ convergence rates for this general class of regularization schemes, including Tikhonov, iterated Tikhonov, spectral cut-off, and Landweber-Fridman. Unlike in the identified case, estimation of the operator has a non-negligible impact on convergence rates and inference. We develop inferential methods for linear functionals in such models. Lastly, we demonstrate the discontinuity in the asymptotic distribution in the case of weak identification. In particular, the estimator has a degenerate U-statistics type behavior in the extreme case of a weak instrument.

All interested are cordially invited.

**For inquiries, please contact:**

**Ms. Andrea Fiebig**

**Mail: fiebig@mathematik.hu-berlin.de
Phone: +49-30-2093-5860
Fax: +49-30-2093-5848
Humboldt-Universität zu Berlin
Institut für Mathematik
Unter den Linden 6
10099 Berlin, Germany**