Publication: Improving Low-Resource Kazakh-English and Turkish-English Neural Machine Translation Using Transfer Learning and Part of Speech Tags
| dc.authorscopusid | 57212212990 | |
| dc.authorscopusid | 22953804000 | |
| dc.authorwosid | Kiliç, Erdal/Hjy-2853-2023 | |
| dc.contributor.author | Yazar, Bilge Kagan | |
| dc.contributor.author | Kilic, Erdal | |
| dc.contributor.authorID | Kiliç, Erdal/0000-0003-1585-0991 | |
| dc.contributor.authorID | Yazar, Bilge Kağan/0000-0003-2149-142X | |
| dc.date.accessioned | 2025-12-11T01:23:32Z | |
| dc.date.issued | 2025 | |
| dc.department | Ondokuz Mayıs Üniversitesi | en_US |
| dc.department-temp | [Yazar, Bilge Kagan; Kilic, Erdal] Ondokuz Mayis Univ, Fac Engn, TR-55200 Samsun, Turkiye | en_US |
| dc.description | Kiliç, Erdal/0000-0003-1585-0991; Yazar, Bilge Kağan/0000-0003-2149-142X | en_US |
| dc.description.abstract | This study presents a novel translation framework that combines transfer learning and part-of-speech (POS) tagging to improve low-resource neural machine translation for the Kazakh-English and Turkish-English language pairs. The aim is to maximize the effectiveness of transfer learning by exploiting the structural similarities between Turkish and Kazakh, and to obtain more accurate and consistent translations by integrating grammatical and syntactic information into the model through POS tags. POS tags were generated with a RoBERTa model for Kazakh and with the Zemberek library for Turkish, and were used as an additional feature in Transformer-based models. The findings show that transfer learning and POS tags each improve performance on their own, and that combining the two methods yields more meaningful and consistent results. The results are evaluated with the BLEU, chrF, and METEOR metrics and analyzed in detail. The models built for the Kazakh-English direction are compared with other models and achieve much better results. For the Turkish-English direction, the results were examined on the Tatoeba and TED2020 corpora, which differ in size. Significant improvements in the examined metrics were observed in particular on the Tatoeba corpus. Overall, the applied methods achieved successful results for low-resource languages, and the use of different corpora and language pairs demonstrated the generalizability of the proposed approach. | en_US |
| dc.description.woscitationindex | Science Citation Index Expanded | |
| dc.identifier.doi | 10.1109/ACCESS.2025.3542491 | |
| dc.identifier.endpage | 32356 | en_US |
| dc.identifier.issn | 2169-3536 | |
| dc.identifier.scopus | 2-s2.0-85217895243 | |
| dc.identifier.scopusquality | Q1 | |
| dc.identifier.startpage | 32341 | en_US |
| dc.identifier.uri | https://doi.org/10.1109/ACCESS.2025.3542491 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12712/43381 | |
| dc.identifier.volume | 13 | en_US |
| dc.identifier.wos | WOS:001440225500023 | |
| dc.identifier.wosquality | Q2 | |
| dc.language.iso | en | en_US |
| dc.publisher | IEEE-Inst Electrical Electronics Engineers Inc | en_US |
| dc.relation.ispartof | IEEE Access | en_US |
| dc.relation.publicationcategory | Makale - Uluslararası Hakemli Dergi - Kurum Öğretim Elemanı | en_US |
| dc.rights | info:eu-repo/semantics/openAccess | en_US |
| dc.subject | Translation | en_US |
| dc.subject | Transformers | en_US |
| dc.subject | Accuracy | en_US |
| dc.subject | Transfer Learning | en_US |
| dc.subject | Data Models | en_US |
| dc.subject | Encoding | en_US |
| dc.subject | Tagging | en_US |
| dc.subject | Grammar | en_US |
| dc.subject | Vectors | en_US |
| dc.subject | Solid Modeling | en_US |
| dc.subject | Neural Machine Translation | en_US |
| dc.subject | Low-Resource Languages | en_US |
| dc.subject | Multi-Feature Transformer | en_US |
| dc.subject | Part of Speech Tags | en_US |
| dc.title | Improving Low-Resource Kazakh-English and Turkish-English Neural Machine Translation Using Transfer Learning and Part of Speech Tags | en_US |
| dc.type | Article | en_US |
| dspace.entity.type | Publication |
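
The abstract states that POS tags were used as an additional feature in Transformer-based models. The record does not say how the two inputs were combined, so the following is only a minimal sketch of one common option: concatenating a token embedding with a POS-tag embedding before the encoder. The sentence, the tag set, the embedding sizes, and every function name here are hypothetical and are not taken from the paper.

```python
import random


def build_vocab(items):
    """Map each distinct item to an integer id, in first-seen order."""
    vocab = {}
    for item in items:
        vocab.setdefault(item, len(vocab))
    return vocab


def make_embeddings(vocab_size, dim, seed):
    """Create a random embedding table: one vector of length `dim` per id."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(vocab_size)]


def embed_with_pos(tokens, tags, tok_vocab, tag_vocab, tok_emb, tag_emb):
    """Concatenate each token vector with the vector of its POS tag."""
    if len(tokens) != len(tags):
        raise ValueError("need exactly one POS tag per token")
    return [tok_emb[tok_vocab[tok]] + tag_emb[tag_vocab[tag]]
            for tok, tag in zip(tokens, tags)]


# Hypothetical Turkish sentence with illustrative coarse-grained tags.
tokens = ["ben", "kitap", "okudum"]
tags = ["PRON", "NOUN", "VERB"]

tok_vocab = build_vocab(tokens)
tag_vocab = build_vocab(tags)

# Assumed sizes: 8 dimensions per token, 2 per tag.
tok_emb = make_embeddings(len(tok_vocab), 8, seed=0)
tag_emb = make_embeddings(len(tag_vocab), 2, seed=1)

features = embed_with_pos(tokens, tags, tok_vocab, tag_vocab, tok_emb, tag_emb)
print(len(features), len(features[0]))
```

Concatenation keeps the tag signal in its own dimensions; summing equal-sized token and tag embeddings is another common choice. In a real model the tables would be learned parameters rather than fixed random values.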
