RMS bounds and sample size considerations for error estimation in linear discriminant analysis

Amin Zollanvari, Ulisses M. Braga-Neto, Edward R. Dougherty

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The validity of a classifier depends on the precision of the error estimator used to estimate its true error. This paper considers the necessary sample size to achieve a given validity measure, namely RMS, for resubstitution and leave-one-out error estimators in the context of LDA. It provides bounds for the RMS between the true error and both the resubstitution and leave-one-out error estimators in terms of sample size and dimensionality. These bounds can be used to determine the minimum sample size in order to obtain a desired estimation accuracy, relative to RMS. To show how these results can be used in practice, a microarray classification problem is presented.
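
The quantity the abstract refers to, the RMS deviation between an error estimator and the true error of an LDA classifier, can be illustrated numerically. The sketch below is not the paper's analytical bound; it is a Monte Carlo approximation under assumptions chosen here for illustration: two Gaussian classes with identity covariance, equal priors, equal class sample sizes, and means separated by a hypothetical `delta` along the first coordinate. All function names are illustrative.

```python
import numpy as np
from math import erf, sqrt


def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def fit_lda(X0, X1):
    """Fit LDA; returns (a, b) so that W(x) = a @ x + b > 0 predicts class 0."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled sample covariance of the two classes.
    S = ((X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)) / (len(X0) + len(X1) - 2)
    a = np.linalg.solve(S, m0 - m1)
    b = -0.5 * a @ (m0 + m1)
    return a, b


def true_error(a, b, mu0, mu1, Sigma):
    """Exact error of the fixed rule (a, b) for Gaussian classes, equal priors."""
    s = sqrt(a @ Sigma @ a)
    e0 = norm_cdf(-(a @ mu0 + b) / s)  # class 0 points with W(x) <= 0
    e1 = norm_cdf((a @ mu1 + b) / s)   # class 1 points with W(x) > 0
    return 0.5 * (e0 + e1)


def resub_error(X0, X1):
    """Resubstitution: error rate on the same data used for training."""
    a, b = fit_lda(X0, X1)
    wrong = int(np.sum(X0 @ a + b <= 0)) + int(np.sum(X1 @ a + b > 0))
    return wrong / (len(X0) + len(X1))


def loo_error(X0, X1):
    """Leave-one-out: retrain without each point, then test on that point."""
    wrong = 0
    for i in range(len(X0)):
        a, b = fit_lda(np.delete(X0, i, axis=0), X1)
        wrong += int(X0[i] @ a + b <= 0)
    for i in range(len(X1)):
        a, b = fit_lda(X0, np.delete(X1, i, axis=0))
        wrong += int(X1[i] @ a + b > 0)
    return wrong / (len(X0) + len(X1))


def rms_by_simulation(n_per_class, p, delta=2.0, trials=200, seed=0):
    """Monte Carlo RMS of (estimate - true error) for resubstitution and LOO.

    Assumed model (illustrative only): N(0, I) versus N(delta * e_1, I).
    Requires 2 * n_per_class - 3 >= p so the pooled covariance is invertible.
    """
    rng = np.random.default_rng(seed)
    mu0, mu1 = np.zeros(p), np.zeros(p)
    mu1[0] = delta
    Sigma = np.eye(p)
    sq_resub, sq_loo = [], []
    for _ in range(trials):
        X0 = rng.standard_normal((n_per_class, p)) + mu0
        X1 = rng.standard_normal((n_per_class, p)) + mu1
        a, b = fit_lda(X0, X1)
        eps = true_error(a, b, mu0, mu1, Sigma)
        sq_resub.append((resub_error(X0, X1) - eps) ** 2)
        sq_loo.append((loo_error(X0, X1) - eps) ** 2)
    return sqrt(np.mean(sq_resub)), sqrt(np.mean(sq_loo))


if __name__ == "__main__":
    # Compare the simulated RMS at two sample sizes for a fixed dimension.
    for n in (10, 40):
        rms_resub, rms_loo = rms_by_simulation(n_per_class=n, p=2)
        print(f"n per class = {n}: RMS resub = {rms_resub:.3f}, RMS loo = {rms_loo:.3f}")
```

Sweeping `n_per_class` for a fixed `p` and reading off the smallest value at which the simulated RMS falls below a target mirrors, empirically, the sample-size question that the paper addresses with analytical bounds.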

Original language: English
Title of host publication: 2010 IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS 2010
DOI: https://doi.org/10.1109/GENSIPS.2010.5719691
Publication status: Published - 2010
Externally published: Yes
Event: 2010 IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS 2010 - Cold Spring Harbor, NY, United States
Duration: Nov 10 2010 → Nov 12 2010

Fingerprint

Discriminant Analysis
Sample Size
Error analysis
Classifiers

ASJC Scopus subject areas

  • Genetics
  • Signal Processing

Cite this

Zollanvari, A., Braga-Neto, U. M., & Dougherty, E. R. (2010). RMS bounds and sample size considerations for error estimation in linear discriminant analysis. In 2010 IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS 2010 [5719691] https://doi.org/10.1109/GENSIPS.2010.5719691

@inproceedings{372888d1c3924e7f840e2d6c7cf98c73,
title = "RMS bounds and sample size considerations for error estimation in linear discriminant analysis",
abstract = "The validity of a classifier depends on the precision of the error estimator used to estimate its true error. This paper considers the necessary sample size to achieve a given validity measure, namely RMS, for resubstitution and leave-one-out error estimators in the context of LDA. It provides bounds for the RMS between the true error and both the resubstitution and leave-one-out error estimators in terms of sample size and dimensionality. These bounds can be used to determine the minimum sample size in order to obtain a desired estimation accuracy, relative to RMS. To show how these results can be used in practice, a microarray classification problem is presented.",
author = "Amin Zollanvari and Braga-Neto, {Ulisses M.} and Dougherty, {Edward R.}",
year = "2010",
doi = "10.1109/GENSIPS.2010.5719691",
language = "English",
isbn = "9781612847924",
booktitle = "2010 IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS 2010",

}

TY - GEN

T1 - RMS bounds and sample size considerations for error estimation in linear discriminant analysis

AU - Zollanvari, Amin

AU - Braga-Neto, Ulisses M.

AU - Dougherty, Edward R.

PY - 2010

Y1 - 2010

N2 - The validity of a classifier depends on the precision of the error estimator used to estimate its true error. This paper considers the necessary sample size to achieve a given validity measure, namely RMS, for resubstitution and leave-one-out error estimators in the context of LDA. It provides bounds for the RMS between the true error and both the resubstitution and leave-one-out error estimators in terms of sample size and dimensionality. These bounds can be used to determine the minimum sample size in order to obtain a desired estimation accuracy, relative to RMS. To show how these results can be used in practice, a microarray classification problem is presented.

UR - http://www.scopus.com/inward/record.url?scp=79952791781&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79952791781&partnerID=8YFLogxK

U2 - 10.1109/GENSIPS.2010.5719691

DO - 10.1109/GENSIPS.2010.5719691

M3 - Conference contribution

SN - 9781612847924

BT - 2010 IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS 2010

ER -