Humboldt-Universität zu Berlin - Mathematisch-Naturwissenschaftliche Fakultät - Research Unit 1735

Multiple testing under unspecified dependency structure


Multiple hypothesis testing has emerged as one of the most active research fields in statistics over the last 10-15 years, currently accounting for approximately 8% of all articles in the four leading methodological statistics journals (data from Benjamini, 2010). This growing interest is driven especially by large-scale applications, such as those in genomics, proteomics or cosmology. Many new multiple type I and type II error criteria, such as the now quite popular “false discovery rate” (FDR), have recently been proposed and published together with explicit algorithms for controlling them. A broad class of these methods employs a marginal test statistic or p-value for each individual hypothesis, together with a set of critical constants against which these are compared. So far, the behaviour of such methods is well understood only under joint independence of all marginal p-values. Under unspecified dependence, the type I error level is often either not kept accurately or not fully exhausted. This holds especially for the FDR and related measures, and it leaves room for improving those procedures with respect to type I error control and power. An adaptation to the dependency structure can therefore lead to a gain in validity (the type I error rate is kept accurately) and in efficiency (quantified by multiple power measures). In this project, a general theory of the use of parametric copula methods in multiple testing shall be developed. This will be complemented by structural assumptions on the multivariate distribution of the p-values that reduce the complexity of the problem, for instance the dimensionality of the copula parameter. Moreover, we will develop resampling techniques for the empirical calibration of multiple testing thresholds in the case of unspecified dependency.
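To illustrate the class of methods described above, in which marginal p-values are compared with a set of critical constants, the following minimal Python sketch implements the linear step-up procedure of Benjamini and Hochberg, one well-known member of that class. It is an illustration only, not the methodology to be developed in the project. The function name `step_up_rejections` and the example p-values are hypothetical, and the critical constants i·α/m are those of the linear step-up test, for which FDR control is classically established under joint independence of the p-values.

```python
import numpy as np


def step_up_rejections(p_values, alpha=0.05):
    """Linear step-up test (hypothetical helper name, for illustration).

    Sorts the m marginal p-values, compares the i-th smallest with the
    critical constant i * alpha / m, and rejects all hypotheses whose
    p-value is at most the largest p-value lying below its constant.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                        # indices sorting p ascending
    critical = alpha * np.arange(1, m + 1) / m   # critical constants
    below = p[order] <= critical                 # ordered p-value vs. constant
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])         # largest index meeting its constant
        reject[order[:k + 1]] = True             # step-up: reject everything up to k
    return reject


# Made-up p-values, purely for demonstration.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(step_up_rejections(example, alpha=0.05))
```

The constants in `critical` are fixed in advance and ignore the joint distribution of the p-values; they are the quantities that an adaptation to the dependency structure would modify.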

 

The principal investigators are

 

Scientific staff is