Improving Low-Resource Kazakh-English and Turkish-English Neural Machine Translation Using Transfer Learning and Part of Speech Tags

Yazar, Bilge Kagan; Kilic, Erdal

doi:10.1109/ACCESS.2025.3542491

dc.authorscopusid	57212212990
dc.authorscopusid	22953804000
dc.authorwosid	Kiliç, Erdal/Hjy-2853-2023
dc.contributor.author	Yazar, Bilge Kagan
dc.contributor.author	Kilic, Erdal
dc.contributor.authorID	Kiliç, Erdal/0000-0003-1585-0991
dc.contributor.authorID	Yazar, Bilge Kağan/0000-0003-2149-142X
dc.date.accessioned	2025-12-11T01:23:32Z
dc.date.issued	2025
dc.department	Ondokuz Mayıs Üniversitesi	en_US
dc.department-temp	[Yazar, Bilge Kagan; Kilic, Erdal] Ondokuz Mayis Univ, Fac Engn, TR-55200 Samsun, Turkiye	en_US
dc.description	Kiliç, Erdal/0000-0003-1585-0991; Yazar, Bilge Kağan/0000-0003-2149-142X	en_US
dc.description.abstract	This study presents a novel translation framework that combines transfer learning with part-of-speech (POS) tagging to improve low-resource neural machine translation for the Kazakh-English and Turkish-English language pairs. The approach aims to maximize the benefit of transfer learning by exploiting the structural similarities between Turkish and Kazakh, and to produce more accurate and consistent translations by supplying the model with grammatical and syntactic information through POS tags. POS tags were generated with a RoBERTa model for Kazakh and with the Zemberek library for Turkish, and were used as an additional feature in Transformer-based models. The findings show that transfer learning and POS tags each improve performance on their own, and that combining the two yields more meaningful and consistent results. The results are evaluated with the BLEU, chrF, and METEOR metrics and analyzed in detail. The models built for the Kazakh-English direction are compared with other models, and the proposed methods are found to give much better results. For the Turkish-English direction, the results were examined on the Tatoeba and TED2020 corpora, which differ in size. The experiments on the Tatoeba corpus in particular showed significant increases in the examined metrics. Overall, the methods achieved successful results for low-resource languages, and the use of different corpora and language pairs demonstrated the generalizability of the proposed approach.	en_US
dc.description.woscitationindex	Science Citation Index Expanded
dc.identifier.doi	10.1109/ACCESS.2025.3542491
dc.identifier.endpage	32356	en_US
dc.identifier.issn	2169-3536
dc.identifier.scopus	2-s2.0-85217895243
dc.identifier.scopusquality	Q1
dc.identifier.startpage	32341	en_US
dc.identifier.uri	https://doi.org/10.1109/ACCESS.2025.3542491
dc.identifier.uri	https://hdl.handle.net/20.500.12712/43381
dc.identifier.volume	13	en_US
dc.identifier.wos	WOS:001440225500023
dc.identifier.wosquality	Q2
dc.language.iso	en	en_US
dc.publisher	IEEE-Inst Electrical Electronics Engineers Inc	en_US
dc.relation.ispartof	IEEE Access	en_US
dc.relation.publicationcategory	Article - International Peer-Reviewed Journal - Institutional Faculty Member	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Translation	en_US
dc.subject	Transformers	en_US
dc.subject	Accuracy	en_US
dc.subject	Transfer Learning	en_US
dc.subject	Data Models	en_US
dc.subject	Encoding	en_US
dc.subject	Tagging	en_US
dc.subject	Grammar	en_US
dc.subject	Vectors	en_US
dc.subject	Solid Modeling	en_US
dc.subject	Neural Machine Translation	en_US
dc.subject	Low-Resource Languages	en_US
dc.subject	Multi-Feature Transformer	en_US
dc.subject	Part of Speech Tags	en_US
dc.title	Improving Low-Resource Kazakh-English and Turkish-English Neural Machine Translation Using Transfer Learning and Part of Speech Tags	en_US
dc.type	Article	en_US
dspace.entity.type	Publication

Collections

WoS Indexed Publications Collection
Scopus Indexed Publications Collection
