TY - JOUR
T1 - SRPM-ST
T2 - Sequential retraining and pseudo-labeling in mini-batches for self-training
AU - Mukhamediya, Azamat
AU - Zollanvari, Amin
N1 - Publisher Copyright:
© 2024 Elsevier B.V.
PY - 2024/11/7
Y1 - 2024/11/7
N2 - An impediment to training accurate classifiers in supervised learning is the scarcity of labeled data. In this respect, semi-supervised learning can help by using both labeled and unlabeled data. A specific form of semi-supervised learning is self-training (ST). In its basic form, ST trains an initial classifier on the labeled data and uses it to generate pseudo-labels for the unlabeled set. At this point, either the whole set of pseudo-labeled data or a subset whose pseudo-labels carry high confidence scores is selected. The selected pseudo-labeled data are then used to update the initial classifier. Although this process can be repeated to generate new pseudo-labels for the unlabeled data, it is typically tacitly assumed that the classifier is updated only after all pseudo-labels have been generated; regardless of any confidence-score-based subset selection, we refer to this process as full-batch ST (F-ST). Here, we show that sequential retraining and pseudo-labeling in mini-batches (SRPM) can improve the performance of the classifier relative to F-ST. Our empirical results show the existence of a data-dependent mini-batch size for SRPM that is optimal in the sense of achieving the lowest error rate. In practice, this mini-batch size can be treated as a hyperparameter to tune.
AB - An impediment to training accurate classifiers in supervised learning is the scarcity of labeled data. In this respect, semi-supervised learning can help by using both labeled and unlabeled data. A specific form of semi-supervised learning is self-training (ST). In its basic form, ST trains an initial classifier on the labeled data and uses it to generate pseudo-labels for the unlabeled set. At this point, either the whole set of pseudo-labeled data or a subset whose pseudo-labels carry high confidence scores is selected. The selected pseudo-labeled data are then used to update the initial classifier. Although this process can be repeated to generate new pseudo-labels for the unlabeled data, it is typically tacitly assumed that the classifier is updated only after all pseudo-labels have been generated; regardless of any confidence-score-based subset selection, we refer to this process as full-batch ST (F-ST). Here, we show that sequential retraining and pseudo-labeling in mini-batches (SRPM) can improve the performance of the classifier relative to F-ST. Our empirical results show the existence of a data-dependent mini-batch size for SRPM that is optimal in the sense of achieving the lowest error rate. In practice, this mini-batch size can be treated as a hyperparameter to tune.
KW - Pseudo-labeling
KW - Self-training
KW - Semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85201376861&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85201376861&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2024.128343
DO - 10.1016/j.neucom.2024.128343
M3 - Article
AN - SCOPUS:85201376861
SN - 0925-2312
VL - 605
JO - Neurocomputing
JF - Neurocomputing
M1 - 128343
ER -