A More Powerful Two-Sample Test in High Dimensions using Random Projection
Miles Lopes and Laurent Jacob and Martin Wainwright
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2014-28
April 9, 2014
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-28.pdf
We consider the hypothesis testing problem of detecting a shift between the means of two multivariate normal distributions in the high-dimensional setting, allowing for the data dimension p to exceed the sample size n. Specifically, we propose a new test statistic for the two-sample test of means that integrates a random projection with the classical Hotelling T-squared statistic. Working under a high-dimensional framework with (p,n) tending to infinity, we first derive an asymptotic power function for our test, and then provide sufficient conditions for it to achieve greater power than other state-of-the-art tests. Using ROC curves generated from synthetic data, we demonstrate superior performance against competing tests in the parameter regimes anticipated by our theoretical results.
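The combination described in the abstract can be illustrated with a minimal sketch: project both samples onto k dimensions with a random matrix, then compute the classical Hotelling T-squared statistic on the projected data. Several details here are assumptions made for illustration and are not stated in the text above: the projection matrix has i.i.d. standard normal entries, the projected dimension k is supplied by the caller, both samples share a covariance matrix, and the p-value uses the classical F calibration conditional on the drawn projection. The function name `projected_hotelling_test` is hypothetical.

```python
import numpy as np
from scipy import stats


def projected_hotelling_test(X, Y, k, seed=None):
    """Randomly project two samples to k dimensions, then apply Hotelling T^2.

    X: (n1, p) array, Y: (n2, p) array. Returns (t2, p_value).
    Assumes Gaussian data with a shared covariance (illustrative sketch only).
    """
    rng = np.random.default_rng(seed)
    n1, p = X.shape
    n2, p_y = Y.shape
    if p != p_y:
        raise ValueError("X and Y must have the same number of columns")
    n = n1 + n2 - 2  # degrees of freedom of the pooled covariance
    if not 1 <= k <= min(n, p):
        raise ValueError("k must satisfy 1 <= k <= min(n1 + n2 - 2, p)")

    # Assumed projection: p x k matrix with i.i.d. N(0, 1) entries.
    P = rng.standard_normal((p, k))
    Xk, Yk = X @ P, Y @ P

    # Difference of projected sample means.
    mean_x, mean_y = Xk.mean(axis=0), Yk.mean(axis=0)
    diff = mean_x - mean_y

    # Pooled sample covariance of the projected data (k x k).
    Xc, Yc = Xk - mean_x, Yk - mean_y
    S = (Xc.T @ Xc + Yc.T @ Yc) / n

    # Classical Hotelling T^2 in the k-dimensional projected space.
    t2 = (n1 * n2 / (n1 + n2)) * float(diff @ np.linalg.solve(S, diff))

    # Conditional on P, under the null, the scaled statistic is F(k, n - k + 1).
    f_stat = (n - k + 1) / (k * n) * t2
    p_value = float(stats.f.sf(f_stat, k, n - k + 1))
    return t2, p_value


# Usage: p = 200 exceeds the total sample size, so the unprojected
# pooled covariance would be singular; the projected one is not.
data_rng = np.random.default_rng(0)
X = data_rng.standard_normal((30, 200))
Y = data_rng.standard_normal((30, 200)) + 1.0  # mean shifted in every coordinate
t2, p_value = projected_hotelling_test(X, Y, k=29, seed=1)
```

Because p exceeds n1 + n2 - 2 in this example, the pooled covariance of the raw data cannot be inverted, whereas the k x k projected covariance is invertible with probability one when k is at most n1 + n2 - 2. The choice k = 29 (half of n1 + n2 - 2) is only one plausible setting.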
Advisor: Martin Wainwright
BibTeX citation:
@mastersthesis{Lopes:EECS-2014-28,
    Author = {Lopes, Miles and Jacob, Laurent and Wainwright, Martin},
    Title = {A More Powerful Two-Sample Test in High Dimensions using Random Projection},
    School = {EECS Department, University of California, Berkeley},
    Year = {2014},
    Month = {Apr},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-28.html},
    Number = {UCB/EECS-2014-28},
    Abstract = {We consider the hypothesis testing problem of detecting a shift between the means of two multivariate normal distributions in the high-dimensional setting, allowing for the data dimension p to exceed the sample size n. Specifically, we propose a new test statistic for the two-sample test of means that integrates a random projection with the classical Hotelling T-squared statistic. Working under a high-dimensional framework with (p,n) tending to infinity, we first derive an asymptotic power function for our test, and then provide sufficient conditions for it to achieve greater power than other state-of-the-art tests. Using ROC curves generated from synthetic data, we demonstrate superior performance against competing tests in the parameter regimes anticipated by our theoretical results.}
}
EndNote citation:
%0 Thesis
%A Lopes, Miles
%A Jacob, Laurent
%A Wainwright, Martin
%T A More Powerful Two-Sample Test in High Dimensions using Random Projection
%I EECS Department, University of California, Berkeley
%D 2014
%8 April 9
%@ UCB/EECS-2014-28
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-28.html
%F Lopes:EECS-2014-28