Permutation Tests for Classification

Mukherjee, Sayan; Golland, Polina; Panchenko, Dmitry

dc.contributor.author	Mukherjee, Sayan
dc.contributor.author	Golland, Polina
dc.contributor.author	Panchenko, Dmitry
dc.date.accessioned	2005-12-19T23:02:49Z
dc.date.available	2005-12-19T23:02:49Z
dc.date.issued	2003-08-28
dc.identifier.other	MIT-CSAIL-TR-2003-016
dc.identifier.other	AIM-2003-019
dc.identifier.uri	http://hdl.handle.net/1721.1/30408
dc.description.abstract	We introduce and explore an approach to estimating statistical significance of classification accuracy, which is particularly useful in scientific applications of machine learning where high dimensionality of the data and the small number of training examples render most standard convergence bounds too loose to yield a meaningful guarantee of the generalization ability of the classifier. Instead, we estimate statistical significance of the observed classification accuracy, or the likelihood of observing such accuracy by chance due to spurious correlations of the high-dimensional data patterns with the class labels in the given training set. We adopt permutation testing, a non-parametric technique previously developed in classical statistics for hypothesis testing in the generative setting (i.e., comparing two probability distributions). We demonstrate the method on real examples from neuroimaging studies and DNA microarray analysis and suggest a theoretical analysis of the procedure that relates the asymptotic behavior of the test to the existing convergence bounds.
dc.format.extent	22 p.
dc.format.extent	22876548 bytes
dc.format.extent	882217 bytes
dc.format.mimetype	application/postscript
dc.format.mimetype	application/pdf
dc.language.iso	en_US
dc.relation.ispartofseries	Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory
dc.subject	AI
dc.subject	Classification
dc.subject	Permutation testing
dc.subject	Statistical significance
dc.title	Permutation Tests for Classification
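
The procedure summarized in the abstract (compare the observed classification accuracy with the accuracies obtained after randomly permuting the class labels) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the report: the nearest-centroid classifier, the leave-one-out accuracy estimate, the synthetic two-class data, and the function names `loo_accuracy`, `permutation_p_value`, and `make_toy_data`.

```python
import random


def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    The classifier is an illustrative stand-in; any classifier and any
    accuracy estimate could be substituted here.
    """
    n = len(X)
    correct = 0
    for i in range(n):
        # Accumulate per-class feature sums over all samples except i.
        sums, counts = {}, {}
        for j in range(n):
            if j == i:
                continue
            if y[j] not in sums:
                sums[y[j]] = [0.0] * len(X[j])
                counts[y[j]] = 0
            sums[y[j]] = [s + v for s, v in zip(sums[y[j]], X[j])]
            counts[y[j]] += 1
        # Predict the class whose centroid is closest (squared Euclidean).
        best_label, best_dist = None, float("inf")
        for label in sorted(sums):
            centroid = [s / counts[label] for s in sums[label]]
            dist = sum((a - b) ** 2 for a, b in zip(X[i], centroid))
            if dist < best_dist:
                best_label, best_dist = label, dist
        if best_label == y[i]:
            correct += 1
    return correct / n


def permutation_p_value(X, y, n_permutations=200, seed=0):
    """Return (observed accuracy, permutation p-value).

    The p-value is the fraction of label permutations whose accuracy is at
    least the observed one, with an add-one correction so it is never zero.
    """
    rng = random.Random(seed)
    observed = loo_accuracy(X, y)
    labels = list(y)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)  # breaks any real link between data and labels
        if loo_accuracy(X, labels) >= observed:
            at_least_as_good += 1
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


def make_toy_data(n_per_class=10, n_features=5, shift=5.0, seed=1):
    """Synthetic two-class Gaussian data (hypothetical, for the demo only)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, mean in ((0, 0.0), (1, shift)):
        for _ in range(n_per_class):
            X.append([rng.gauss(mean, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    accuracy, p_value = permutation_p_value(X, y)
    print("observed accuracy:", accuracy)
    print("permutation p-value:", p_value)
```

A small p-value indicates that accuracy as high as the observed one rarely arises when the labels carry no information about the data. The smallest p-value this sketch can report is 1 / (n_permutations + 1), so the number of permutations limits its resolution.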

Files in this item

Name: MIT-CSAIL-TR-2003-016.ps
Size: 21.81Mb
Format: Postscript

Name: MIT-CSAIL-TR-2003-016.pdf
Size: 861.5Kb
Format: PDF

This item appears in the following Collection(s)

CSAIL Technical Reports (July 1, 2003 - present)
