Publication: Turkish Optical Character Recognition Under the Lens: A Systematic Review of Language-Specific Challenges, Dataset Scarcity, and Open-Source Limitations
| dc.authorscopusid | 57274538200 | |
| dc.authorscopusid | 60129175000 | |
| dc.authorscopusid | 22953804000 | |
| dc.authorwosid | Sahin, Durmus/Aaj-7961-2020 | |
| dc.authorwosid | Kiliç, Erdal/Y-2198-2018 | |
| dc.contributor.author | Goksu Ozturk, Mirac | |
| dc.contributor.author | Ozkan Sahin, Durmus | |
| dc.contributor.author | Kilic, Erdal | |
| dc.date.accessioned | 2025-12-11T00:43:58Z | |
| dc.date.issued | 2025 | |
| dc.department | Ondokuz Mayıs Üniversitesi | en_US |
| dc.department-temp | [Goksu Ozturk, Mirac] Ondokuz Mayis Univ, Inst Grad Studies, Dept Computat Sci, TR-55200 Samsun, Turkiye; [Ozkan Sahin, Durmus; Kilic, Erdal] Ondokuz Mayis Univ, Fac Engn, Dept Comp Engn, TR-55200 Samsun, Turkiye | en_US |
| dc.description.abstract | This systematic literature review explores the progress, challenges, and opportunities in the field of Optical Character Recognition (OCR) for the Turkish language. Despite significant advancements, the development of robust Turkish OCR systems faces several obstacles, such as a lack of publicly available datasets, limited open-source solutions, and the underutilization of cutting-edge deep learning techniques. These challenges hinder the creation of OCR systems that can match the capabilities of those developed for languages like English. Focusing on 38 peer-reviewed studies published between 2019 and 2023, this paper provides the first systematic review of Turkish OCR research, offering a comprehensive analysis of the current methods, datasets, and evaluation metrics across both modern Turkish (Latin script) and Ottoman Turkish (Arabic script) contexts. Our findings highlight that while Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Convolutional Recurrent Neural Network (CRNN) architectures are frequently used, Transformer-based and end-to-end models remain underexplored in Turkish OCR. We also identify data scarcity and the lack of reproducible benchmark datasets as key barriers. By analyzing current research trends, pinpointing challenges, and emphasizing opportunities for future advancements, this review aims to be a valuable resource for researchers working on Turkish language recognition. Our study contributes to the field by offering a structured overview of existing methods and by proposing practical recommendations for improving dataset availability, encouraging open-source collaboration, and adopting more advanced model architectures. | en_US |
| dc.description.woscitationindex | Science Citation Index Expanded | |
| dc.identifier.doi | 10.1109/ACCESS.2025.3614147 | |
| dc.identifier.endpage | 168997 | en_US |
| dc.identifier.issn | 2169-3536 | |
| dc.identifier.scopus | 2-s2.0-105017393015 | |
| dc.identifier.scopusquality | Q1 | |
| dc.identifier.startpage | 168977 | en_US |
| dc.identifier.uri | https://doi.org/10.1109/ACCESS.2025.3614147 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12712/38840 | |
| dc.identifier.volume | 13 | en_US |
| dc.identifier.wos | WOS:001586205100041 | |
| dc.identifier.wosquality | Q2 | |
| dc.language.iso | en | en_US |
| dc.publisher | IEEE-Inst Electrical Electronics Engineers Inc | en_US |
| dc.relation.ispartof | IEEE Access | en_US |
| dc.relation.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US |
| dc.rights | info:eu-repo/semantics/openAccess | en_US |
| dc.subject | Optical Character Recognition | en_US |
| dc.subject | Text Recognition | en_US |
| dc.subject | Systematic Literature Review | en_US |
| dc.subject | Surveys | en_US |
| dc.subject | Convolutional Neural Networks | en_US |
| dc.subject | Accuracy | en_US |
| dc.subject | Systematics | en_US |
| dc.subject | Measurement | en_US |
| dc.subject | Linguistics | en_US |
| dc.subject | Focusing | en_US |
| dc.subject | Classification | en_US |
| dc.subject | Deep Learning | en_US |
| dc.subject | Machine Learning | en_US |
| dc.subject | OCR Applications | en_US |
| dc.subject | Turkish OCR | en_US |
| dc.title | Turkish Optical Character Recognition Under the Lens: A Systematic Review of Language-Specific Challenges, Dataset Scarcity, and Open-Source Limitations | en_US |
| dc.type | Article | en_US |
| dspace.entity.type | Publication |
