TY - JOUR
T1 - A Theoretical Analysis of the Peaking Phenomenon in Classification
AU - Zollanvari, Amin
AU - James, Alex Pappachen
AU - Sameni, Reza
N1 - Funding Information:
This material is based in part upon work supported by the Nazarbayev University Faculty Development Competitive Research Grant, under award number SOE2018008.
Publisher Copyright:
© 2019, Classification Society of North America.
PY - 2020
Y1 - 2020
N2 - In this work, we analytically study the peaking phenomenon in the context of linear discriminant analysis in the multivariate Gaussian model under the assumption of a common known covariance matrix. The focus is on the finite-sample setting, where the sample size and observation dimension are comparable. To study the phenomenon in such a setting, we use an asymptotic technique whereby the number of sample points is kept comparable in magnitude to the dimensionality of observations. The analysis provides a more thorough picture of the phenomenon. In particular, the analysis shows that as long as the Relative Cumulative Efficacy of an additional Feature set (RCEF) is greater (less) than the size of this set, the expected error of the classifier constructed using these additional features will be less (greater) than the expected error of the classifier constructed without them. Our result highlights underlying factors of the peaking phenomenon relative to the classifier used in this study and, at the same time, calls into question the classical wisdom around the peaking phenomenon.
AB - In this work, we analytically study the peaking phenomenon in the context of linear discriminant analysis in the multivariate Gaussian model under the assumption of a common known covariance matrix. The focus is on the finite-sample setting, where the sample size and observation dimension are comparable. To study the phenomenon in such a setting, we use an asymptotic technique whereby the number of sample points is kept comparable in magnitude to the dimensionality of observations. The analysis provides a more thorough picture of the phenomenon. In particular, the analysis shows that as long as the Relative Cumulative Efficacy of an additional Feature set (RCEF) is greater (less) than the size of this set, the expected error of the classifier constructed using these additional features will be less (greater) than the expected error of the classifier constructed without them. Our result highlights underlying factors of the peaking phenomenon relative to the classifier used in this study and, at the same time, calls into question the classical wisdom around the peaking phenomenon.
KW - Classification error rate
KW - Linear discriminant analysis
KW - Multiple asymptotic analysis
KW - Peaking phenomenon
KW - Machine learning
UR - http://www.scopus.com/inward/record.url?scp=85069659068&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85069659068&partnerID=8YFLogxK
U2 - 10.1007/s00357-019-09327-3
DO - 10.1007/s00357-019-09327-3
M3 - Article
AN - SCOPUS:85069659068
SN - 0176-4268
VL - 37
SP - 421
EP - 434
JO - Journal of Classification
JF - Journal of Classification
IS - 2
ER -