
Bulletin of Shakarim University. Technical Sciences


DEEP LEARNING-BASED HATE SPEECH DETECTION IN KAZAKH: A HYBRID FRAMEWORK FOR ROBUST TEXT ANALYSIS

https://doi.org/10.53360/2788-7995-2025-4(20)-25

Abstract

This study presents a new deep learning-based system for the automatic detection of hate speech in the Kazakh language. Particular attention is paid to the status of Kazakh as a low-resource language, where limited linguistic data make it difficult to build reliable models. A multilingual corpus covering a wide range of speech contexts was collected from online sources (social media, forums, and news portals) and preprocessed. To improve efficiency and generalization, a hybrid architecture is proposed that combines convolutional neural networks (CNNs), bidirectional long short-term memory networks (BiLSTMs), and Transformer attention mechanisms. Evaluation with precision, recall, F1-score, and accuracy showed that the proposed model outperforms traditional machine learning algorithms. The results contribute to the development of automatic content moderation systems and support a safer, more inclusive, and linguistically sensitive digital space for Kazakh-speaking users.
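For readers unfamiliar with the four evaluation metrics named in the abstract, the following is a minimal sketch of how they are conventionally computed for a binary classifier. It is not the authors' evaluation code: the function name `binary_metrics`, the label convention (1 = hate speech, 0 = neutral), and the toy labels are assumptions made purely for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute precision, recall, F1-score, and accuracy for binary labels.

    Assumed convention (not from the article): 1 = hate speech, 0 = neutral.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(y_true) if len(y_true) else 0.0

    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Toy labels, invented for illustration only (not data from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

On hate speech data, where the positive class is often much rarer than the neutral class, precision, recall, and F1 are usually more informative than accuracy alone, which is presumably why the abstract reports all four.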

About the Authors

D. Sultan
Narxoz University
Kazakhstan

Daniyar Sultan – PhD, Ac. Associate Professor of the School of Digital Technologies

050035, Republic of Kazakhstan, Almaty, 55 Zhandossova street



R. Abdrakhmanov
International University of Tourism and Hospitality
Kazakhstan

Rustam Abdrakhmanov – Candidate of Technical Sciences, Associate Professor of the Humanities school

161205, Turkistan, Kazakhstan, 14 A Rabiga Sultan Begim str.



E. Adali
Istanbul Technical University
Turkey


34010, Istanbul



T. Turymbetov
International University of Tourism and Hospitality
Kazakhstan

Tursinbay Turymbetov – Candidate of Technical Sciences, Associate Professor of the Humanities school

161205, Turkistan, Kazakhstan, 14 A Rabiga Sultan Begim str.



G. Bekeshova
L.N. Gumilyov Eurasian National University
Kazakhstan

Gulvira Bekeshova – Senior Lecturer at the Department of Information Security, IT Faculty  

010000, Republic of Kazakhstan, Astana, K. Satpayev Street, 2 





For citations:


Sultan D., Abdrakhmanov R., Adali E., Turymbetov T., Bekeshova G. DEEP LEARNING-BASED HATE SPEECH DETECTION IN KAZAKH: A HYBRID FRAMEWORK FOR ROBUST TEXT ANALYSIS. Bulletin of Shakarim University. Technical Sciences. 2025;1(4(20)):210-219. https://doi.org/10.53360/2788-7995-2025-4(20)-25



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2788-7995 (Print)
ISSN 3006-0524 (Online)