Humboldt-Universität zu Berlin - Mathematisch-Naturwissenschaftliche Fakultät - Martin Wahl

Martin Wahl - Teaching


Measure Theory for Statisticians (Maßtheorie für Statistiker), Summer Semester 2022


High-Dimensional Statistics (Hochdimensionale Statistik), Winter Semester 2021/22


A brief description:
The goal of this course is to provide a rigorous introduction to the concepts and methods of high-dimensional statistics, which have numerous applications in data science, machine learning, and signal processing.

Lecture:
Tuesday, 13:15-15:00, Room 3.008, Rudower Chaussee 25 (RUD25)

Exercise:
Tuesday, 15:00-16:00, Room 3.008, Rudower Chaussee 25 (RUD25)

Prerequisites:
Stochastik I (and II), Lineare Algebra, Analysis
Methoden der Statistik or Mathematische Statistik

Books and supplementary material:
A. High-Dimensional Statistics by Martin J. Wainwright
B. High-Dimensional Probability by Roman Vershynin
C. Introduction to High-Dimensional Statistics by Christophe Giraud
D. Lecture notes on High Dimensional Statistics by Philippe Rigollet and Jan-Christian Hütter
E. An introduction to random matrices by Greg W. Anderson, Alice Guionnet, Ofer Zeitouni
F. Spectral Analysis of Large Dimensional Random Matrices by Zhidong Bai, Jack W. Silverstein
G. Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery by Vladimir Koltchinskii
H. Mathematical foundations of infinite-dimensional statistical models by Evarist Giné and Richard Nickl

Topics (tentative):
1. Compressed sensing and sparse recovery
2. High-dimensional covariance estimation, (sparse) PCA
3. Low-rank matrix completion
4. Foundations of statistical learning theory: Glivenko-Cantelli classes, VC-dimension, metric entropy, classification and regression problems
5. Minimax lower bounds
6. Spiked covariance model, Marchenko-Pastur theorem
7. Implicit regularization, benign overfitting

Lecture 1: Short overview of the course (see this expository work by David Donoho) and beginning of sparse recovery & compressed sensing (Chapter 7.1 in Ref A)
Lecture 2: Compressed sensing, the restricted isometry property and exact recovery (Chapter 7.2 in Ref A)
Lecture 3: Subgaussian r.v., Hoeffding's inequality, Bernstein's inequality, The Johnson-Lindenstrauss Lemma (Chapter 2.1 in Ref A, Chapter 2 in Ref B)
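The Johnson-Lindenstrauss lemma from this lecture is easy to check numerically. Here is a minimal sketch (the dimensions, seed, and variable names are illustrative choices, not taken from the lecture): project a Gaussian point cloud with a scaled Gaussian matrix and compare pairwise squared distances before and after the projection.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, k = 50, 1000, 300          # 50 points in R^1000, projected down to R^300
X = rng.standard_normal((n, d))  # arbitrary point cloud
A = rng.standard_normal((k, d)) / np.sqrt(k)  # scaled Gaussian projection

Y = X @ A.T                      # projected points

def pdist2(Z):
    """Matrix of pairwise squared Euclidean distances."""
    G = Z @ Z.T
    sq = np.diag(G)
    return sq[:, None] + sq[None, :] - 2 * G

D_orig = pdist2(X)
D_proj = pdist2(Y)

mask = ~np.eye(n, dtype=bool)              # exclude the zero diagonal
ratios = D_proj[mask] / D_orig[mask]       # distortion of each pair
print(ratios.min(), ratios.max())          # both stay close to 1
```

With k of order eps^(-2) log n, the lemma predicts that all distance ratios lie in [1 - eps, 1 + eps]; the printed minimum and maximum should be close to 1.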
Lecture 4: Restricted isometry property for random matrices, Sparse recovery for the linear model (Chapter 7.3 in Ref A)
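Exact recovery in this regime can be reproduced in a few lines: basis pursuit (minimize ||x||_1 subject to Ax = y) becomes a linear program after splitting x into nonnegative parts, x = u - v. A minimal sketch using SciPy's `linprog`; the problem sizes and seed are illustrative choices.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
n, d, s = 60, 200, 5                          # measurements, ambient dim, sparsity
A = rng.standard_normal((n, d)) / np.sqrt(n)  # Gaussian design matrix

x_true = np.zeros(d)                          # s-sparse ground truth
support = rng.choice(d, size=s, replace=False)
x_true[support] = rng.standard_normal(s)
y = A @ x_true                                # noiseless measurements

# basis pursuit as an LP in (u, v) with x = u - v, u, v >= 0:
#   minimize sum(u) + sum(v)  subject to  A u - A v = y
c = np.ones(2 * d)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
x_hat = res.x[:d] - res.x[d:]

print(np.max(np.abs(x_hat - x_true)))        # ~0: exact recovery
```

With n well above s log(d/s), as here, the Gaussian design satisfies a restricted isometry property with high probability and the LP recovers x exactly.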
Lecture 5: Motivation for PCA and spectral clustering (Chapter 8.1 in Ref A, Chapter 5.4 in Ref D), Random matrices and the estimation of covariance matrices (Chapter 6.3 in Ref A, Chapter 5.3 in Ref D)
Lecture 6: Eigenvalue and eigenvector perturbation theorems of Weyl, Hoffman-Wielandt, and Davis-Kahan (see Chapter 6 in this Ref)
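A numerical sanity check of a Davis-Kahan sin-theta bound, in the variant sin(theta) <= 2 ||E||_op / (lambda_1 - lambda_2); the matrix size, spectrum, and perturbation scale below are illustrative choices of mine.

```python
import numpy as np

rng = np.random.default_rng(2)
p = 30

# symmetric matrix with top eigenvalues 5 and 2 (spectral gap 3)
Q = np.linalg.qr(rng.standard_normal((p, p)))[0]
spec = np.concatenate([[5.0, 2.0], rng.uniform(0.0, 1.0, p - 2)])
S = Q @ np.diag(spec) @ Q.T

# small symmetric perturbation E
E = rng.standard_normal((p, p))
E = 0.05 * (E + E.T) / 2
S_pert = S + E

u = np.linalg.eigh(S)[1][:, -1]           # top eigenvector of S
u_hat = np.linalg.eigh(S_pert)[1][:, -1]  # top eigenvector of S + E

# sine of the angle between the two eigenvectors (sign-invariant)
sin_theta = np.sqrt(np.clip(1.0 - (u @ u_hat) ** 2, 0.0, None))
bound = 2 * np.linalg.norm(E, 2) / (5.0 - 2.0)  # 2 * ||E||_op / gap
print(sin_theta, bound)                   # sin_theta stays below the bound
```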
Lecture 7: Sparse PCA: s-sparsest maximal eigenvector and its SDP relaxation, application to sparse spectral clustering
Lecture 8: Minimax lower bounds, Fano's method, applications to the linear model and to PCA (Chapter 15 in Ref A or Chapter 2 in this Ref)
Lecture 9: Proof of Fano's method, sparse Varshamov-Gilbert bound
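For reference, the form of Fano's inequality typically used in such lower bounds (a standard statement; the notation is mine and not necessarily the lecture's):

```latex
% M hypotheses \theta_1, \dots, \theta_M with pairwise separation
% d(\theta_j, \theta_k) \ge 2\delta for j \ne k, and
% \mathrm{KL}(P_j \,\|\, P_k) \le \beta for all j, k. Then
\inf_{\hat\theta}\, \max_{1 \le j \le M}
  P_j\!\left( d(\hat\theta, \theta_j) \ge \delta \right)
  \;\ge\; 1 - \frac{\beta + \log 2}{\log M}.
```

Combined with a packing set from the (sparse) Varshamov-Gilbert bound, this yields the minimax rates for the linear model and for PCA mentioned above.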
Lecture 10: Short overview of high-dimensional asymptotics (see this expository work by Iain M. Johnstone) and beginning of the semicircle law (Chapter 2.1 in Ref E)
Lecture 11: Proof of Wigner's semicircle law by the method of moments (Chapter 2.1 in Ref E)
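Wigner's theorem is also easy to visualize by simulation. A minimal sketch (the dimension and seed are arbitrary choices): the eigenvalues of a symmetric Gaussian matrix with entry variance 1/n concentrate on [-2, 2], following the semicircle density.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000

# Wigner matrix: symmetric, Gaussian entries with variance 1/n
G = rng.standard_normal((n, n))
W = (G + G.T) / np.sqrt(2 * n)
eigs = np.linalg.eigvalsh(W)

print(eigs.min(), eigs.max())   # close to the spectral edges -2 and 2
print(np.mean(eigs ** 2))       # second moment of the semicircle law is 1
```

A histogram of `eigs` against the density (1 / (2 pi)) * sqrt(4 - x^2) on [-2, 2] makes the convergence visible already at this n.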
Lecture 12: Proof of the Marchenko-Pastur theorem by the method of moments (Chapter 3.1 in Ref F)
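The analogous simulation for the Marchenko-Pastur law, here for a sample covariance matrix with identity population covariance (the sizes and seed are illustrative): the spectrum fills the interval [(1 - sqrt(y))^2, (1 + sqrt(y))^2] with aspect ratio y = p/n.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 4000, 1000                  # aspect ratio y = p/n = 1/4
X = rng.standard_normal((n, p))    # rows: i.i.d. samples with covariance I_p
S = X.T @ X / n                    # sample covariance matrix
eigs = np.linalg.eigvalsh(S)

y = p / n
edges = ((1 - np.sqrt(y)) ** 2, (1 + np.sqrt(y)) ** 2)
print(eigs.min(), eigs.max())      # close to the edges 0.25 and 2.25
print(edges)
```

Even though the population covariance is the identity, the sample eigenvalues spread over the whole Marchenko-Pastur bulk, which is the basic obstruction to naive covariance estimation when p is proportional to n.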
Lecture 13: Truncation techniques (Chapter 2.1 in Ref E)
Lecture 14: Properties of the Stieltjes transform, Schur complement formulas
Lecture 15: End of proof of Wigner's semicircle law by the Stieltjes transform (Chapter 2.3 in Ref F), lower bounds for testing
Lecture 16: Low-degree method, low-degree likelihood, applications to Gaussian mixture model

Lecture notes:
Lecture notes will be written after each week’s lecture.
Lecture from 15.02.: https://www.dropbox.com/s/ptpjkbc0vgt5n5a/VL%2016.pdf?dl=0
Outline (Gliederung): https://www.dropbox.com/s/s7kxg0fuipidhmm/Gliederung.pdf?dl=0

Homework:
Link: https://www.dropbox.com/s/skcg4s3po8i0k6e/Exercises.pdf?dl=0


Previous semesters