TY - GEN
T1 - Do LLMs Speak Kazakh? A Pilot Evaluation of Seven Models
AU - Maxutov, Akylbek
AU - Myrzakhmet, Ayan
AU - Braslavski, Pavel
N1 - Publisher Copyright: © 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
N2 - We conducted a systematic evaluation of seven large language models (LLMs) on tasks in Kazakh, a Turkic language spoken by approximately 13 million native speakers in Kazakhstan and abroad. We used six datasets corresponding to different tasks: question answering, causal reasoning, middle school math problems, machine translation, and spelling correction. Three of the datasets were prepared for this study. As expected, the quality of the LLMs on the Kazakh tasks is lower than on the parallel English tasks. GPT-4 shows the best results, followed by Gemini and AYA. In general, LLMs perform better on classification tasks and struggle with generative tasks. Our results provide valuable insights into the applicability of currently available LLMs for Kazakh. We made the data collected for this study publicly available: https://github.com/akylbekmaxutov/LLM-eval-using-Kazakh.
AB - We conducted a systematic evaluation of seven large language models (LLMs) on tasks in Kazakh, a Turkic language spoken by approximately 13 million native speakers in Kazakhstan and abroad. We used six datasets corresponding to different tasks: question answering, causal reasoning, middle school math problems, machine translation, and spelling correction. Three of the datasets were prepared for this study. As expected, the quality of the LLMs on the Kazakh tasks is lower than on the parallel English tasks. GPT-4 shows the best results, followed by Gemini and AYA. In general, LLMs perform better on classification tasks and struggle with generative tasks. Our results provide valuable insights into the applicability of currently available LLMs for Kazakh. We made the data collected for this study publicly available: https://github.com/akylbekmaxutov/LLM-eval-using-Kazakh.
UR - http://www.scopus.com/inward/record.url?scp=85204708080&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85204708080&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85204708080
T3 - SIGTURK 2024 - 1st Workshop on Natural Language Processing for Turkic Languages, Proceedings of the Workshop
SP - 81
EP - 91
BT - SIGTURK 2024 - 1st Workshop on Natural Language Processing for Turkic Languages, Proceedings of the Workshop
A2 - Ataman, Duygu
A2 - Derin, Mehmet Oguz
A2 - Ivanova, Sardana
A2 - Koksal, Abdullatif
A2 - Saleva, Jonne
A2 - Zeyrek, Deniz
PB - Association for Computational Linguistics (ACL)
T2 - 1st Workshop on Natural Language Processing for Turkic Languages, SIGTURK 2024
Y2 - 15 August 2024
ER -