TY - GEN
T1 - Data-driven morphological analysis and disambiguation for Kazakh
AU - Makhambetov, Olzhas
AU - Makazhanov, Aibek
AU - Sabyrgaliyev, Islam
AU - Yessenbayev, Zhandos
PY - 2015/1/1
Y1 - 2015/1/1
N2 - We propose a method for morphological analysis and disambiguation for the Kazakh language that accounts for both inflectional and derivational morphology, including derivation that is not fully productive. The method is data-driven and does not require manually generated rules. We leverage so-called “transition chains” that help prune false segmentations while keeping correct ones. At the disambiguation step we use a standard HMM-based approach. Evaluating our method against open-source solutions on several data sets, we show that it achieves better or on-par performance. We also provide an extensive error analysis that sheds light on common problems in the morphological disambiguation of the language.
AB - We propose a method for morphological analysis and disambiguation for the Kazakh language that accounts for both inflectional and derivational morphology, including derivation that is not fully productive. The method is data-driven and does not require manually generated rules. We leverage so-called “transition chains” that help prune false segmentations while keeping correct ones. At the disambiguation step we use a standard HMM-based approach. Evaluating our method against open-source solutions on several data sets, we show that it achieves better or on-par performance. We also provide an extensive error analysis that sheds light on common problems in the morphological disambiguation of the language.
UR - http://www.scopus.com/inward/record.url?scp=84942574744&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84942574744&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-18111-0_12
DO - 10.1007/978-3-319-18111-0_12
M3 - Conference contribution
AN - SCOPUS:84942574744
SN - 9783319181103
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 151
EP - 163
BT - Computational Linguistics and Intelligent Text Processing - 16th International Conference, CICLing 2015, Proceedings
A2 - Gelbukh, Alexander
PB - Springer Verlag
T2 - 16th Annual Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2015
Y2 - 14 April 2015 through 20 April 2015
ER -