
Martin Wahl - Teaching

 

Hochdimensionale Statistik (High-Dimensional Statistics), Winter Semester 2021/22

 

A brief description:
The goal of this course is to provide a rigorous introduction to the concepts and methods of high-dimensional statistics, which have numerous applications in data science, machine learning, and signal processing.

Lecture:
Tuesday, 13:15-15:00, Room 3.008, Rudower Chaussee 25 (RUD25)

Exercise:
Tuesday, 15:00-16:00, Room 3.008, Rudower Chaussee 25 (RUD25)

Next Q&A:
Friday, 03.12., 16:00-17:00: https://hu-berlin.zoom.us/j/65093594681

Prerequisites:
Stochastik I (and II), Lineare Algebra, Analysis
Methoden der Statistik or Mathematische Statistik

Books and supplementary material:
A. High-Dimensional Statistics by Martin J. Wainwright
B. High-Dimensional Probability by Roman Vershynin
C. Introduction to High-Dimensional Statistics by Christophe Giraud
D. Lecture notes on High Dimensional Statistics by Philippe Rigollet and Jan-Christian Hütter
Two more advanced books:
E. Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery by Vladimir Koltchinskii
F. Mathematical Foundations of Infinite-Dimensional Statistical Models by Evarist Giné and Richard Nickl

Topics (tentative):
1. Compressed sensing and sparse recovery
2. High-dimensional covariance estimation, (sparse) PCA
3. Foundations of statistical learning theory: Glivenko-Cantelli classes, VC-dimension, metric entropy, classification and regression problems
4. Minimax lower bounds
5. Spiked covariance model, Marchenko-Pastur theorem, covariance estimation, benign overfitting

Lecture 1: Short overview of the course (see this expository work by David Donoho) and beginning of sparse recovery & compressed sensing (see Chapter 7.1 in Reference A)
Lecture 2: Compressed sensing, the restricted isometry property and exact recovery (Chapter 7.2 in Ref A; see the statement sketched after this list)
Lecture 3: Subgaussian random variables, Hoeffding's inequality, Bernstein's inequality, the Johnson-Lindenstrauss lemma (Chapter 2.1 in Ref A, Chapter 2 in Ref B; see the numerical illustration after this list)
Lecture 4: Restricted isometry property for random matrices, sparse recovery for the linear model (Chapter 7.3 in Ref A)
Lecture 5: Motivation for PCA and spectral clustering (Chapter 8.1 in Ref A, Chapter 5.4 in Ref D), random matrices and the estimation of covariance matrices (Chapter 6.3 in Ref A, Chapter 5.3 in Ref D; see the benchmark rate after this list)
Lecture 6: Eigenvalue and eigenvector perturbation theorems by Weyl, Hoffman-Wielandt and Davis-Kahan (see Chapter 6 in this Ref; a variant of the Davis-Kahan theorem is stated after this list)
Lecture 7: Sparse PCA: the s-sparsest maximal eigenvector and its SDP relaxation, application to sparse spectral clustering (the relaxation is sketched after this list)
Lecture 8: Minimax lower bounds, Fano's method, applications to the linear model and to PCA (Chapter 15 in Ref A or Chapter 2 in this Ref; see the statement after this list)
Lecture 9: Proof of Fano's method, the (sparse) Varshamov-Gilbert bound
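
For Lecture 2, here is one standard formulation of the restricted isometry property and the type of exact recovery guarantee it yields; this is an indicative statement, with Candès' classical constant, and may differ in constants and normalization from the version in Ref A. A matrix $X \in \mathbb{R}^{n \times d}$ satisfies the RIP of order $s$ with constant $\delta_s \in (0,1)$ if

\[ (1-\delta_s)\,\|\theta\|_2^2 \;\le\; \|X\theta\|_2^2 \;\le\; (1+\delta_s)\,\|\theta\|_2^2 \qquad \text{for all } s\text{-sparse } \theta \in \mathbb{R}^d. \]

If $\delta_{2s} < \sqrt{2}-1$, then every $s$-sparse vector $\theta^*$ is the unique solution of the basis pursuit program

\[ \min_{\theta \in \mathbb{R}^d} \|\theta\|_1 \quad \text{subject to} \quad X\theta = X\theta^*. \]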
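
For Lecture 3, a minimal numerical illustration (not part of the official course material) of the Johnson-Lindenstrauss lemma: a scaled Gaussian random projection to roughly log(N)/eps^2 dimensions preserves all pairwise squared distances among N points up to a factor 1 ± eps with high probability. The constant 8 in the choice of k is a common textbook choice and an assumption here, not the constant from Refs A or B.

import numpy as np

rng = np.random.default_rng(0)

# N points in a high-dimensional space R^d.
N, d, eps = 50, 10_000, 0.25
points = rng.normal(size=(N, d))

# Target dimension k ~ log(N) / eps^2 (the constant 8 is an assumption).
k = int(np.ceil(8 * np.log(N) / eps**2))

# Gaussian projection, scaled so squared norms are preserved in expectation.
Pi = rng.normal(size=(k, d)) / np.sqrt(k)
projected = points @ Pi.T

def pairwise_sq_dists(x):
    """All pairwise squared Euclidean distances between the rows of x."""
    sq = np.sum(x**2, axis=1)
    return sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)

orig = pairwise_sq_dists(points)
proj = pairwise_sq_dists(projected)
mask = ~np.eye(N, dtype=bool)
ratios = proj[mask] / orig[mask]

print(f"k = {k}; squared-distance ratios in [{ratios.min():.3f}, {ratios.max():.3f}]")
# With high probability all ratios lie in [1 - eps, 1 + eps].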
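
For Lecture 5, the benchmark rate for covariance estimation in operator norm, stated informally (see Chapter 6.3 in Ref A for the precise version): for i.i.d. subgaussian observations $X_1, \dots, X_n$ in $\mathbb{R}^d$ with covariance $\Sigma$, the sample covariance $\hat{\Sigma} = n^{-1} \sum_{i=1}^n X_i X_i^\top$ satisfies, with high probability,

\[ \|\hat{\Sigma} - \Sigma\|_{\mathrm{op}} \;\lesssim\; \|\Sigma\|_{\mathrm{op}} \left( \sqrt{\frac{d}{n}} + \frac{d}{n} \right), \]

so consistency in operator norm requires $d/n \to 0$ unless additional structure (such as sparsity or low effective rank) is imposed.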
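
For Lecture 6, a frequently used variant of the Davis-Kahan sin-theta theorem for the leading eigenvector, in the form popularized by Yu, Wang and Samworth (the constant 2 is theirs and may differ from the version in the cited chapter): if $\Sigma$ and $\hat{\Sigma}$ are symmetric with eigenvalues $\lambda_1 > \lambda_2 \ge \dots$ and leading unit eigenvectors $v_1$ and $\hat{v}_1$, then

\[ \sin \angle(v_1, \hat{v}_1) \;\le\; \frac{2\,\|\hat{\Sigma} - \Sigma\|_{\mathrm{op}}}{\lambda_1 - \lambda_2}. \]

In particular, PCA is accurate only when the perturbation is small relative to the spectral gap.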
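
For Lecture 7, the combinatorial problem and its standard SDP relaxation, following d'Aspremont et al. (the notation is indicative). The $s$-sparsest maximal eigenvector solves

\[ \max_{\|v\|_2 = 1,\ \|v\|_0 \le s} v^\top \hat{\Sigma} v, \]

which is NP-hard in general. Lifting to $Z = v v^\top$ (so that $\operatorname{tr}(Z) = 1$ and $\sum_{i,j} |Z_{ij}| = \|v\|_1^2 \le s$) and dropping the rank constraint gives the semidefinite program

\[ \max_{Z \succeq 0,\ \operatorname{tr}(Z) = 1,\ \sum_{i,j} |Z_{ij}| \le s} \operatorname{tr}(\hat{\Sigma} Z). \]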
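
For Lectures 8 and 9, the core reduction behind Fano's method, stated informally (Chapter 15 in Ref A gives the precise version): if $\theta_1, \dots, \theta_M$ are $2\delta$-separated in a (semi-)metric $\rho$, then

\[ \inf_{\hat{\theta}} \sup_{\theta} \mathbb{E}_\theta\, \rho(\hat{\theta}, \theta) \;\ge\; \delta \left( 1 - \frac{I(V; X) + \log 2}{\log M} \right), \]

where $V$ is uniform on $\{1, \dots, M\}$ and $X \sim P_{\theta_V}$. The Varshamov-Gilbert bound supplies exponentially many well-separated hypercube points, which turns this inequality into the minimax lower bounds for the linear model and for PCA discussed in the lectures.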

Lecture notes:
Lecture notes will be written after each week’s lecture.
Link: https://www.dropbox.com/s/s7kxg0fuipidhmm/Gliederung.pdf?dl=0

Homeworks:
Link: https://www.dropbox.com/s/skcg4s3po8i0k6e/Exercises.pdf?dl=0

Oral exam:
- First exam: 22.02. 
- Second exam: 04.04.

Further links:
Course Description
Leitfaden für Präsenzlehrveranstaltungen (guidelines for in-person courses)

 

 

Previous semesters