A Theoretical Analysis of the Peaking Phenomenon in Classification

Research output: Contribution to journal, Article

Abstract

In this work, we analytically study the peaking phenomenon in the context of linear discriminant analysis in the multivariate Gaussian model under the assumption of a common known covariance matrix. The focus is on the finite-sample setting, where the sample size and the observation dimension are comparable. To study the phenomenon in such a setting, we use an asymptotic technique whereby the number of sample points is kept comparable in magnitude to the dimensionality of the observations. The analysis provides a more thorough picture of the phenomenon. In particular, it shows that as long as the Relative Cumulative Efficacy of an additional Feature set (RCEF) is greater (less) than the size of this set, the expected error of the classifier constructed using these additional features will be less (greater) than the expected error of the classifier constructed without them. Our result highlights the underlying factors of the peaking phenomenon relative to the classifier used in this study and, at the same time, calls into question the classical wisdom around the peaking phenomenon.
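The setting described in the abstract can be illustrated numerically. The sketch below is not the paper's asymptotic analysis and does not compute the RCEF, which the abstract does not define. It is a plain Monte Carlo estimate of the expected error of linear discriminant analysis with a known covariance matrix, under assumptions introduced here for illustration only: the covariance is the identity, the classes are equally likely, each class contributes n training points, and the names normal_cdf, true_error and expected_error are hypothetical helpers. It shows the qualitative effect only: adding features whose class means coincide raises the expected error, while adding an informative feature lowers it.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def true_error(mu0, mu1, m0, m1):
    """Exact error of the LDA rule built from sample means m0, m1.

    Assumes identity covariance (an illustrative choice) and equal priors.
    The rule assigns x to class 0 when (x - (m0 + m1) / 2) . (m0 - m1) > 0.
    """
    a = [u - v for u, v in zip(m0, m1)]           # discriminant direction
    norm_a = math.sqrt(sum(c * c for c in a))
    if norm_a == 0.0:
        return 0.5                                # degenerate rule: coin flip
    mid = [(u + v) / 2.0 for u, v in zip(m0, m1)]
    s0 = sum((mu - c) * w for mu, c, w in zip(mu0, mid, a))
    s1 = sum((mu - c) * w for mu, c, w in zip(mu1, mid, a))
    err0 = normal_cdf(-s0 / norm_a)               # class 0 point sent to class 1
    err1 = normal_cdf(s1 / norm_a)                # class 1 point sent to class 0
    return 0.5 * (err0 + err1)


def expected_error(deltas, n, trials, rng):
    """Monte Carlo estimate of the expected error over training sets.

    deltas[j] is the difference between the class means on feature j;
    n is the number of training points per class.
    """
    mu0 = [d / 2.0 for d in deltas]
    mu1 = [-d / 2.0 for d in deltas]
    sd = 1.0 / math.sqrt(n)                       # std. dev. of a sample mean
    total = 0.0
    for _ in range(trials):
        m0 = [rng.gauss(mu, sd) for mu in mu0]
        m1 = [rng.gauss(mu, sd) for mu in mu1]
        total += true_error(mu0, mu1, m0, m1)
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)
    # One informative feature followed by k uninformative ones.
    for k in (0, 5, 50):
        deltas = [2.0] + [0.0] * k
        print(len(deltas), expected_error(deltas, n=5, trials=2000, rng=rng))
```

Because the covariance is known, only the class means are estimated, so any increase in error from extra features in this sketch comes from noise in the sample means alone.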

Original language: English
Journal: Journal of Classification
DOI: 10.1007/s00357-019-09327-3
Publication status: Published - Jan 1 2019

Fingerprint

Discriminant Analysis
Sample Size
Theoretical Analysis
Classifier
Observation
Sample point
Multivariate Models
Gaussian Model
wisdom
Covariance matrix
Dimensionality
Efficacy

Keywords

  • Classification error rate
  • Linear discriminant analysis
  • Multiple asymptotic analysis
  • Peaking phenomenon

ASJC Scopus subject areas

  • Mathematics (miscellaneous)
  • Psychology (miscellaneous)
  • Statistics, Probability and Uncertainty
  • Library and Information Sciences

Cite this

A Theoretical Analysis of the Peaking Phenomenon in Classification. / Zollanvari, Amin; James, Alex Pappachen; Sameni, Reza.

In: Journal of Classification, 01.01.2019.

@article{1a6c41345f824f728cf4bff5d0fc646e,
title = "A Theoretical Analysis of the Peaking Phenomenon in Classification",
keywords = "Classification error rate, Linear discriminant analysis, Multiple asymptotic analysis, Peaking phenomenon",
author = "Amin Zollanvari and James, {Alex Pappachen} and Reza Sameni",
year = "2019",
month = "1",
day = "1",
doi = "10.1007/s00357-019-09327-3",
language = "English",
journal = "Journal of Classification",
issn = "0176-4268",
publisher = "Springer New York",

}

TY - JOUR

T1 - A Theoretical Analysis of the Peaking Phenomenon in Classification

AU - Zollanvari, Amin

AU - James, Alex Pappachen

AU - Sameni, Reza

PY - 2019/1/1

Y1 - 2019/1/1

KW - Classification error rate

KW - Linear discriminant analysis

KW - Multiple asymptotic analysis

KW - Peaking phenomenon

UR - http://www.scopus.com/inward/record.url?scp=85069659068&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85069659068&partnerID=8YFLogxK

U2 - 10.1007/s00357-019-09327-3

DO - 10.1007/s00357-019-09327-3

M3 - Article

JO - Journal of Classification

JF - Journal of Classification

SN - 0176-4268

ER -