TY - GEN
T1 - Critical Analysis of Validation Methods for Machine Learning Models Used in E-health Applications
AU - Yatbaz, Hakan Yekta
AU - Yazici, Adnan
AU - Ever, Enver
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
AB - Various sensor-based networks and wearable devices are used to build data sets for human activity recognition, which can be applied effectively in the e-health domain. Machine learning models and related algorithms are used to perform detection with high accuracy. The validation method used to compute that accuracy may vary with the characteristics of the data set. Correct validation of machine learning algorithms is essential for assessing model performance reliably, especially when the data analyzed are health-related. Activity recognition algorithms based on sliding windows with different overlap ratios are commonly used together with cross-validation methods such as k-fold, leave-one-out and leave-one-subject-out for validation. In this study, validation methods commonly used for windowing-based activity recognition systems using wearable devices are analyzed. The advantages and disadvantages of each method are discussed, taking into account various parameters. A case study using the well-known MHEALTH data set is presented with state-of-the-art machine learning approaches. With a 50% window overlap, experimental testing with a one-second window using 10-fold cross-validation, a five-second window using leave-one-out cross-validation, and a one-second window using 5-fold cross-validation gave the highest accuracies: 96.71%, 95.65% and 95%, respectively.
UR - http://www.scopus.com/inward/record.url?scp=85163703241&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85163703241&partnerID=8YFLogxK
U2 - 10.1109/ICISCT55600.2022.10146901
DO - 10.1109/ICISCT55600.2022.10146901
M3 - Conference contribution
AN - SCOPUS:85163703241
T3 - 2022 International Conference on Information Science and Communications Technologies, ICISCT 2022
BT - 2022 International Conference on Information Science and Communications Technologies, ICISCT 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 International Conference on Information Science and Communications Technologies, ICISCT 2022
Y2 - 28 September 2022 through 30 September 2022
ER -