2nd International Conference on Modern and Advanced
Research
January 15-16, 2025 : Konya, Turkey
https://as-proceeding.com/index.php/icmar/home
© 2025 Published by All Sciences Academy
Old-Alphabet Ottoman Turkish to Latin Based Turkish Translation
Systems and Current Situation Analysis
Abdulkadir OZTURK1, Abdulkerim SAKA2, Murat KOKLU3
1 Graduate School of Natural and Applied Sciences, Selcuk University, Konya, Türkiye;
ozturk.abdullkadir@gmail.com; ORCID: 0009-0000-7258-3303
2 Faculty of Letters, Department of History, Selcuk University, Konya, Türkiye;
aksaka@selcuk.edu.tr, ORCID: 0000-0003-0440-8858
3 Department of Computer Engineering, Selcuk University, Konya, Türkiye;
mkoklu@selcuk.edu.tr; ORCID: 0000-0002-2737-2360
Corresponding Author (mkoklu@selcuk.edu.tr)
Abstract – Old-alphabet Ottoman Turkish translation systems are methods developed for the translation
of historical texts into Latin-based Turkish. While traditional methods (transcription and simplification)
remain faithful to the historical context, modern deep learning-based approaches offer the advantage of
speed and accuracy. Modern technologies are supported by Natural Language Processing and context-
sensitive models. However, problems such as limited data sets and context losses limit the success of
these systems. In the future, hybrid translation models may combine the strengths of traditional and
modern methods. Diversification of datasets and development of context-sensitive algorithms can
increase translation accuracy. Additionally, applications supported by digital archives and educational
materials can enable translation systems to reach large audiences. The wider development of these
systems with international cooperation offers an important opportunity to preserve the cultural and
historical heritage in the transition from old-alphabet Ottoman Turkish to Latin-based Turkish. Ottoman
Turkish translation systems are not only a linguistic transformation tool, but also a bridge carrying
historical and cultural memory into the future. Therefore, the implementation of technological
developments in harmony with language and culture is critical to increase success in this field.
Keywords – Cultural Heritage, Digital Archives, Latin-Based Turkish, Ottoman Turkish, Translation Systems.
I. INTRODUCTION
Old-alphabet Ottoman Turkish is a form of Turkish that developed under the influence of Arabic and
Persian and was used in official correspondence, literature and scientific texts in the Ottoman Empire.
This written language, composed in the Arabic script, differs significantly from Latin-based Turkish in
its complex grammatical structures and extensive vocabulary. These differences make it difficult to
translate historical documents into Latin-based Turkish, yet such translation is indispensable for
preserving and transmitting Ottoman cultural heritage [1]. The process of constructing the concept of a
modern state and the efforts to build a national identity during the Second Constitutional Era also
manifested in the field of language [2]. The Turkish Language Reform ended the use of Ottoman Turkish,
but it also made the historical heritage harder for the public to access [1].
Handling old-alphabet Ottoman Turkish with modern approaches is critically important not only for the
Latinization of historical documents but also for the preservation of cultural and historical memory.
The transition from old-alphabet Ottoman Turkish to Latin-based Turkish is not only a linguistic but
also a socio-political transformation. Therefore, the accurate and effective transfer of documents from
the Ottoman period into Latin-based Turkish is vital both academically and socially [3].
Traditional translation methods have been used to meet this need for many years. Methods such as
transcription, simplification and semantic analysis are time-consuming and error-prone because they
rely on manual processing. Studies addressing the limitations of these methods state that modern
technologies, especially Natural Language Processing (NLP) and deep learning-based approaches, can
increase the efficiency of the translation process. It has been shown that NLP-based systems can be
effective even for complex languages such as old-alphabet Ottoman Turkish and that these systems can
be optimized with large datasets [4, 5].
There are various studies in the literature on the translation of old-alphabet Ottoman Turkish. Some
studies have examined the transition process from old-alphabet Ottoman Turkish to Latin-based Turkish
in the context of Westernization. Bakırcı (2019) developed a deep learning-based translation system
and showed how effective these technologies are in processing historical texts. The same study also
drew attention to the importance and difficulties of preparing datasets for the translation of
old-alphabet Ottoman Turkish [6-8].
This study aims to review the literature on translation systems from old-alphabet Ottoman Turkish to
Latin-based Turkish, and to analyze the transition process from traditional methods to modern deep
learning-based approaches. It also aims to contribute to research in this field by discussing the
current challenges and areas for improvement in the translation process. The main flow of the study is
presented in Figure 1.
Figure 1. Flow diagram of the study
II. OLD-ALPHABET OTTOMAN TURKISH AND LATIN-BASED TURKISH: LINGUISTIC
DIFFERENCES AND THE NEED FOR TRANSLATION
Old-alphabet Ottoman Turkish is a complex written language that developed under the influence of
Arabic and Persian and was used especially in official documents and literary works. This complexity
made the language difficult for a large segment of society to understand, and over time it became a
tool that only an elite segment could master. Latin-based Turkish, on the other hand, was transformed
by the Turkish Language Reform in the early 20th century into a simpler language that the public could
easily understand. This linguistic transformation made translation from old-alphabet Ottoman Turkish
to Latin-based Turkish not only a linguistic need but also a cultural necessity [1].
The most fundamental difference between old-alphabet Ottoman Turkish and Latin-based Turkish stems
from the structural features of the language. While old-alphabet Ottoman Turkish has complex
grammatical structures and a wide vocabulary of Arabic and Persian origin, Latin-based Turkish has been
reconstructed with simplified grammatical rules and words of Turkish origin. In this context,
translations from old-alphabet Ottoman Turkish to Latin-based Turkish require the reconstruction of not
only a linguistic but also a historical and cultural understanding. These differences make it
difficult for contemporary readers to understand cultural texts and official documents in particular
[9, 10].
The difficulties in conveying old-alphabet Ottoman Turkish to today's readers are related to the
disconnection from the historical context rather than to the complexity of the linguistic structure.
The language policies of the Republican period increased the need for translation by completely
separating old-alphabet Ottoman Turkish from Latin-based Turkish. In particular, simplifying archival
documents and literary works requires care both technically and conceptually. Researchers have
emphasized the importance of preserving linguistic fidelity in this process, but they have also
stated that pursuing this goal often risks losing the cultural context [9-11].
Another important difficulty in the translation process is that old-alphabet Ottoman Turkish texts
do not have one-to-one equivalents in Latin-based Turkish. Especially in official documents, the complex
language structures and terminology pose significant challenges for translation into Latin-based
Turkish. The translation process involves not only linguistic transformation but also an effort to preserve
the historical context and cultural meaning. In this context, translations from old-alphabet Ottoman
Turkish to Latin-based Turkish are considered an important tool for establishing a connection
with the past. Although these linguistic and historical differences make it difficult to convey old-alphabet
Ottoman Turkish texts to today's readers, the importance of this process is increasing day by day.
Transfers to Latin-based Turkish play a critical role not only for academic research but also for society to
understand its past and carry this heritage to the future [12, 13].
III. TRADITIONAL AND MODERN TRANSLATION METHODS
A. Traditional Translation Methods
Traditional methods in the translation of old-alphabet Ottoman Turkish mostly focus on preserving the
original language and form of the text. These methods include techniques such as transcription
(transforming the text from Ottoman letters to Latin letters) and simplification (transforming the text
into a more understandable form without losing meaning). These approaches have been effective in
preserving historical and cultural meanings, but they may be insufficient for modern readers to grasp
the meaning [6, 14].
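The transcription step described above can be illustrated with a minimal Python sketch. The letter table below is a hypothetical, deliberately incomplete mapping introduced only for illustration; it is not taken from any of the systems reviewed here. The sketch shows why letter-by-letter substitution alone is not sufficient: the Arabic script usually leaves short vowels unwritten, so the output lacks them and a human transcriber or a lexicon has to restore them.

```python
# Hypothetical and incomplete letter table, for illustration only.
LETTER_MAP = {
    "\u0643": "k",  # kaf
    "\u062A": "t",  # te
    "\u0627": "a",  # elif
    "\u0628": "b",  # be
}


def naive_transliterate(word: str) -> str:
    """Replace each Arabic-script letter with a Latin letter; unknown letters become '?'."""
    return "".join(LETTER_MAP.get(ch, "?") for ch in word)


# The word written kaf-te-elif-be, read "kitab" (book).
word = "\u0643\u062A\u0627\u0628"
print(naive_transliterate(word))  # prints "ktab": the short vowel "i" is not written
```

The missing vowel in the output is the kind of gap that traditional transcription fills with expert knowledge of the language.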
B. Modern Translation Methods
Modern methods offer more effective solutions in the translation of old-alphabet Ottoman Turkish by
taking advantage of technological innovations. In this context, NLP techniques, especially
morphological analysis and word embedding, attract attention. Modern translation systems offer a
powerful tool to overcome the limitations of traditional methods and can reach high accuracy rates
using large datasets [15].
Deep learning-based translation models offer more effective solutions in understanding the contextual
meaning of texts. For example, deep learning-based models have made translations of old-alphabet
Ottoman Turkish more fluent by preserving both the historical and cultural meanings of words. Studies
that define intralingual translation as a tool of modernization have also drawn attention to the
importance of linguistic fidelity and cultural harmony of meaning in translations from old-alphabet
Ottoman Turkish to Latin-based Turkish [16, 17].
C. Comparison of Traditional and Modern Methods
While traditional methods prioritize staying true to the historical context, modern methods offer
advantages in terms of accuracy and speed. Modern methods offer a more systematic and versatile
translation process in the analysis of complex languages such as old-alphabet Ottoman Turkish.
However, it should not be overlooked that modern methods depend on the accuracy of the datasets and
sometimes risk missing details of the historical context [18].
IV. DATASETS AND PROCESSING TECHNIQUES
Integrating historical languages such as old-alphabet Ottoman Turkish into NLP processes has increased
the need for comprehensive and high-quality datasets. However, the complex structure of this language
and its limited digital resources make it difficult to create and process datasets. The grammatical structure
and vocabulary of the language pose a great challenge in the process of creating basic NLP resources for
old-alphabet Ottoman Turkish. In this context, the creation of accurate datasets and effective processing
techniques are critical to the success of translation systems [5].
A. Dataset Preparation
Digitizing and labeling Ottoman Turkish texts written in the old alphabet is a fundamental step in the
dataset preparation process. Work on a framework for the morphological processing of Ottoman
texts has shown that word-based analyses are effective in understanding the complex structure
of the language. However, it should not be overlooked that OCR technology frequently makes errors
when digitizing Ottoman texts written in the Arabic alphabet, and manual verification is therefore a
critical step [19, 20].
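One way to decide which OCR lines need manual verification is to compare a sample of OCR output against a manually verified transcription and compute the character error rate (CER). The Python sketch below is a minimal illustration under that assumption; the function names and the review threshold are hypothetical and are not taken from the cited studies.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ocr_text: str, verified_text: str) -> float:
    """Character error rate of OCR output relative to a verified transcription."""
    if not verified_text:
        return 0.0 if not ocr_text else 1.0
    return edit_distance(ocr_text, verified_text) / len(verified_text)


REVIEW_THRESHOLD = 0.1  # hypothetical cut-off; a real project would tune this

verified = "\u0643\u062A\u0627\u0628"  # kaf-te-elif-be
ocr_out = "\u0643\u0628\u0627\u0628"   # OCR read the second letter as be
needs_review = cer(ocr_out, verified) > REVIEW_THRESHOLD
print(cer(ocr_out, verified), needs_review)
```

Here one of four characters is wrong, so the line exceeds the assumed threshold and would be sent for manual checking.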
B. Pre-processing Techniques
During the processing of datasets, cleaning and normalization of texts are important. Especially in
old-alphabet Ottoman Turkish, it is necessary to determine word roots and assign context-appropriate
meanings. The use of hybrid datasets in the pre-processing stage of old-alphabet Ottoman
Turkish OCR data has been reported to increase the accuracy rate. In addition, removing stop words
and analyzing word roots contribute to more effective processing of texts [21, 22].
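The cleaning and normalization steps mentioned above can be sketched in Python with the standard library. This is a minimal illustration, not the pipeline of any cited study: it assumes that elongation marks (tatweel) and combining vowel marks should be removed, which discards information and may not suit every corpus, and the stop-word list is a hypothetical two-word example.

```python
import re
import unicodedata

TATWEEL = "\u0640"  # Arabic elongation character, purely typographic


def normalize(text: str) -> str:
    """Apply NFC, drop tatweel and combining marks, and collapse whitespace."""
    text = unicodedata.normalize("NFC", text)
    text = text.replace(TATWEEL, "")
    # Category "Mn" covers non-spacing marks such as the Arabic vowel signs.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    return re.sub(r"\s+", " ", text).strip()


def remove_stop_words(tokens, stop_words):
    """Keep only tokens that are not in the stop-word set."""
    return [t for t in tokens if t not in stop_words]


# kaf + kasra, te, tatweel, elif, be, surrounded by stray spaces
raw = "  \u0643\u0650\u062A\u0640\u0627\u0628  "
clean = normalize(raw)

STOP_WORDS = {"ve", "ile"}  # hypothetical list: "and", "with"
tokens = remove_stop_words("kitap ve kalem".split(), STOP_WORDS)
print(len(clean), tokens)
```

Root analysis is left out of the sketch because it requires a morphological analyzer for the language.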
C. Dataset Variety and Improvement Methods
Existing datasets for old-alphabet Ottoman Turkish are generally limited and focused on historical
contexts. To overcome these limitations, the use of data augmentation methods has been suggested.
These methods make it possible to create larger datasets from a small number of texts. Similarly, it
is stated that digitizing and tagging historical texts such as “Evliya Çelebi’s Seyahatname”
increases data diversity in NLP-based translations [4, 23].
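As a rough illustration of data augmentation, the Python sketch below creates additional training pairs by replacing single letters in the source text with visually similar ones while keeping the target transcription unchanged. The confusion table is an assumption made for this example (the letters be and te differ only in their dots); the cited works do not prescribe this particular method.

```python
def augment(source: str, target: str, confusions: dict) -> list:
    """Return (noisy_source, target) pairs, each with one letter substituted."""
    variants = []
    for i, ch in enumerate(source):
        for alt in confusions.get(ch, ()):
            variants.append((source[:i] + alt + source[i + 1:], target))
    return variants


# Hypothetical confusion table: letters that differ only by their dots.
CONFUSIONS = {
    "\u0628": ["\u062A"],  # be -> te
    "\u062A": ["\u0628"],  # te -> be
}

pairs = augment("\u0643\u062A\u0627\u0628", "kitap", CONFUSIONS)
print(len(pairs))  # one variant per confusable letter in the source
```

A model trained on such pairs sees the same target for slightly corrupted inputs, which is one way to make it more tolerant of OCR noise.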
V. EVALUATION OF EXISTING SYSTEMS
Translation systems from old-alphabet Ottoman Turkish to Latin-based Turkish have been developed
with various approaches, from traditional methods to modern technologies based on NLP and deep
learning. The success of these translation systems depends on the quality of the language models, datasets
and processing methods used. Evaluating the performance of the systems is important in understanding
the problems encountered in the translation process and identifying areas for improvement.
A. Limitations of Traditional Systems
Traditional systems are generally based on grammatical rules and fixed lexical structures. Morphology-
and dictionary-based translation systems are reported to be generally inadequate in understanding
the context of the language in old-alphabet Ottoman Turkish texts, and to perform less well on
linguistically complex texts that require historical context [24, 25].
B. Achievements of Modern Approaches
Deep learning-based approaches offer effective tools to minimize loss of meaning and accurately grasp
context in translation systems. It has been observed that the deep learning model developed for Old
Ottoman Turkish achieves higher accuracy rates compared to traditional methods. Similarly, it has
been shown that deep learning models provide more consistent results in context-based translations [7,
15].
C. System Performance Evaluation
The performance of modern translation systems is evaluated using metrics such as BLEU and
ROUGE. It is emphasized that modern metrics used in intralingual translation processes provide an
objective measurement of system performance. It has been noted that deep learning-based systems are
more efficient than manual methods in automatic transcription of old-alphabet Ottoman Turkish texts
[10, 26].
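To make the metrics concrete, the Python sketch below computes a simplified sentence-level BLEU score (clipped n-gram precision up to bigrams, with a brevity penalty) and ROUGE-1 recall for a single reference. It is a minimal illustration: it omits smoothing and multiple references, the example sentences are invented, and published evaluations normally rely on established toolkits rather than hand-written code.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Share of candidate n-grams found in the reference, with clipped counts."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    return sum(min(c, ref[g]) for g, c in cand.items()) / total


def bleu(candidate, reference, max_n=2):
    """Geometric mean of n-gram precisions times the brevity penalty."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity = 1.0 if c > r else math.exp(1 - r / c)
    return brevity * math.exp(log_avg)


def rouge1_recall(candidate, reference):
    """Share of reference unigrams that also appear in the candidate."""
    if not reference:
        return 0.0
    cand, ref = Counter(candidate), Counter(reference)
    return sum(min(c, cand[t]) for t, c in ref.items()) / len(reference)


reference = "bu kitap çok eski".split()   # invented example sentence
candidate = "bu kitap eski".split()       # system output missing one word
print(round(bleu(candidate, reference), 4), rouge1_recall(candidate, reference))
```

In this example every candidate word appears in the reference, but one of the two candidate bigrams does not, and the output is shorter than the reference, so both the bigram precision and the brevity penalty lower the BLEU score.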
D. Challenges and Improvement Areas
There are still many challenges to be solved in current translation systems. These include limited
datasets that lack context, preserving the historical and cultural context of the language, and
resolving complex grammatical structures. Furthermore, modern systems are particularly inadequate
in capturing nuances of meaning in texts [27].
VI. STRENGTHS AND WEAKNESSES
Translation systems from old-alphabet Ottoman Turkish to Latin-based Turkish have certain
advantages and limitations in both traditional and modern approaches. Evaluation of these systems is
important to understand the deficiencies in the translation process and to make the best use of the
potential of technology in this area.
A. Strengths
One of the strengths of the Old Alphabet Ottoman Turkish translation systems is their success in
preserving the historical and cultural context of the language. It is known that Old Alphabet Ottoman
Turkish translations play an important role in keeping historical memory alive by conveying the
linguistic and cultural contexts of the texts. Modern translation systems based on deep learning can
produce more consistent and meaningful results based on context using large datasets. Such systems
are more successful than traditional methods, especially in solving complex grammatical structures
[24, 28].
B. Weaknesses
Translation systems face various challenges. Traditional methods are inefficient because they are
slow and rely on manual processing. In modern systems, a significant weakness is the
use of limited and context-insensitive datasets. In addition, the inability of modern systems to capture
contextual nuances can negatively affect the accuracy and fluency of translations [6, 24].
C. Comparison of Traditional and Modern Approaches
Traditional translation methods have advantages over modern approaches, especially in terms of
linguistic fidelity. However, modern systems have become much more efficient in terms of data
processing and speed. Combining these two approaches in a hybrid model may yield better
results in terms of both accuracy and efficiency [28, 29].
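One possible shape for such a hybrid model is sketched below in Python: a curated lexicon, standing in for expert transcription, is consulted first, and a statistical or neural model is used only for words the lexicon does not cover. The lexicon entry, the function names and the stand-in model are all hypothetical; the sketch shows the control flow only and is not a design proposed in the cited works.

```python
from typing import Callable, Dict, Tuple


def hybrid_transcribe(word: str,
                      lexicon: Dict[str, str],
                      model: Callable[[str], str]) -> Tuple[str, str]:
    """Return (transcription, source), preferring the curated lexicon."""
    if word in lexicon:
        return lexicon[word], "lexicon"
    return model(word), "model"


# Hypothetical curated entry: kaf-te-elif-be -> "kitap".
LEXICON = {"\u0643\u062A\u0627\u0628": "kitap"}


def placeholder_model(word: str) -> str:
    """Stand-in for a trained model; it only marks the word as unresolved."""
    return "<" + word + ">"


known = hybrid_transcribe("\u0643\u062A\u0627\u0628", LEXICON, placeholder_model)
unknown = hybrid_transcribe("\u0642\u0644\u0645", LEXICON, placeholder_model)
print(known, unknown[1])
```

Recording which component produced each output makes it possible to route model-generated words to manual review while trusting lexicon hits.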
VII. FUTURE PERSPECTIVES AND SUGGESTIONS
A. Development of Digital Archives
Digital archives are increasingly important for preserving old-alphabet Ottoman Turkish texts and
making them accessible to a wider audience. They play a critical role in the preservation of
historical texts, and the digitization of old-alphabet Ottoman Turkish paves the way for new
academic studies. Digital archives not only preserve the texts but also accelerate research by
allowing users to access them easily [30, 31].
B. Model Improvement Methods
In the future, it is necessary to develop context-sensitive language models to increase the success of
Old-alphabet Ottoman Turkish translation systems. Deep learning-based language models have great
potential in Old-alphabet Ottoman Turkish translations, and contextual analysis tools need to be
developed, especially to prevent loss of meaning. Similarly, it is thought that digital methodologies
should be integrated with Ottoman studies, and thus the analysis of historical texts can be further
deepened [32, 33].
C. Dataset Diversification
Larger and more diverse datasets are needed to develop translation systems for old-alphabet
Ottoman Turkish. Creating more diverse datasets by selecting texts from different periods and
writing styles is expected to increase the accuracy of translation systems. In addition, digital
humanities offer new opportunities in this context, and large-scale datasets can be created with
volunteer participants [34, 35].
D. Educational Applications
The use of old-alphabet Ottoman Turkish translation systems in education will facilitate the transfer of
this historical heritage to new generations. Digital technologies can make learning old-alphabet
Ottoman Turkish more accessible. In addition, an educational process supported by digital tools
will facilitate the understanding of texts and preserve the historical context [34].
E. Long-Term Strategies
In the future, the integration of Old-alphabet Ottoman Turkish translation systems with global research
should be targeted. It is shown that Old-alphabet Ottoman Turkish translation projects can become
more comprehensive with international cooperation and that these studies will preserve a valuable
heritage not only in linguistic but also in cultural terms [36].
VIII. CONCLUSIONS AND RECOMMENDATIONS
Translation systems for old-alphabet Ottoman Turkish are gaining importance as a research area
that combines the historical and cultural context of the language with modern tools. Different tools, from
traditional methods to modern natural language processing-based approaches, have been effective in
translating historical texts into Latin-based Turkish. However, the success of these systems is directly
related to the effectiveness of the methods used and the quality of the data sources [10].
Modern approaches to the translation of old-alphabet Ottoman Turkish achieve significant success
through context sensitivity and analyses suited to the grammatical structure of the language. Deep
learning-based models are reported to be more effective than traditional methods in analyzing the
complex structure of the language. In addition, the digitization of historical texts and the increase in
automation in the translation process have revealed the importance of technological developments in this
field [15, 37].
In the future, translation systems for old-alphabet Ottoman Turkish are expected to be developed
with larger and more diverse datasets. Digital technologies will make it possible to reach a wider
user base and thus make the translation of old-alphabet Ottoman Turkish more accessible. In
addition, the integration of these systems with education and cultural studies
will offer both academic and societal benefits [14, 38].
Translation systems for old-alphabet Ottoman Turkish are an important tool for preserving historical
texts and transferring cultural heritage to future generations. The inclusion of technological developments
in applications in this field will increase the accuracy and efficiency of the systems. In addition,
international cooperation and the development of standardized data sources will broaden the scope of
translation processes [39].
Old-alphabet Ottoman Turkish translation systems play a critical role in transferring historical texts into
Latin-based Turkish. When the fidelity of traditional methods to historical context is combined with the
speed and accuracy advantages of modern technologies, more effective translation solutions can be
offered. However, the limitations of existing translation systems, particularly the scarcity of diverse data
and difficulties in capturing context, indicate that further development in this field is necessary.
In the future, hybrid translation models can be developed by combining the strongest aspects of
traditional methods and modern technologies. The creation of broader and more diverse datasets,
enhancing contextual accuracy, and adopting approaches focused on preserving cultural heritage will
form the cornerstone of progress in this area.
Education and digital archiving applications have significant potential in promoting the widespread use of
old-alphabet Ottoman Turkish translation systems. Specifically, the integration of these systems into
language education and history teaching can facilitate a better understanding of historical texts by new
generations. Additionally, international collaboration and technological innovations will make it possible
for the translation of old-alphabet Ottoman Turkish to hold universal, rather than merely national, value.
Old-alphabet Ottoman Turkish translation systems not only serve as tools for linguistic transformation but
also function as historical and cultural bridges. In the future, further research and development in this area
will contribute greatly to both linguistic studies and the preservation of historical heritage.
REFERENCES
[1] G. Lewis, The Turkish language reform: A catastrophic success. OUP Oxford, 1999.
[2] A. Saka, "İkinci Meşrutiyet Konya'sında Siyasal Hayat 1908-1914," 2023.
[3] A. Bolcakan, "The Language of Politics, The Politics of Language: The Political Literature in the Late Ottoman
Empire and the Early Turkish Republic," 2021.
[4] F. Aladag, "The Potential of GPT in Ottoman Studies: Computational Analysis of Evliya Çelebi’s Travelogue with
NLP and Text Mining and Digital Edition with TEI," CULTURE, vol. 5, p. 7, 2023.
[5] S. B. Ozates et al., "Building Foundations for Natural Language Processing of Historical Turkish: Resources and
Models," arXiv preprint arXiv:2501.04828, 2025.
[6] O. Berk, "Translation and Westernisation in Turkey (from the 1840s to the 1980s)," University of Warwick, 1999.
[7] A. Bakirci, "A deep learning based translation system from Ottoman Turkish to Modern Turkish," Fen Bilimleri
Enstitüsü, 2019.
[8] S. B. Ozates et al., "Building Foundations for Natural Language Processing of Historical Turkish: Resources and
Models," arXiv preprint arXiv:2501.04828, 2025.
[9] S. Pamuk, "Ottoman past and today's Turkey," in Ottoman Past and Today's Turkey: Brill, 2021.
[10] O. Berk Albachten, "The Turkish language reform and intralingual translation," in Tradition, tension and translation
in Turkey: John Benjamins Publishing Company, 2015, pp. 165-180.
[11] Y. Colak, "Language policy and official ideology in early republican Turkey," Middle Eastern Studies, vol. 40, no. 6,
pp. 67-91, 2004.
[12] A. L. Jensen, "The sociolinguistic role of Ottoman Turkish and Arabic in Turkish nationalism," 2017.
[13] E. Gunaydin, B. Gencturk, C. Ergen, and M. Koklu, "Digitization and Archiving of Company Invoices using Deep
Learning and Text Recognition-Processing Techniques," Intelligent Methods In Engineering Sciences, vol. 2, no. 4,
pp. 90-101, 2023.
[14] S. Paker, "Translation as terceme and nazire: Culture-bound concepts and their implications for a conceptual
framework for research on Ottoman translation history," in Crosscultural Transgressions: Routledge, 2014, pp. 120-
143.
[15] I. Dolek and A. Kurt, "The Ottoman-Turkish Transliteration using Traditional NLP Techniques," 2025.
[16] O. L. Ottoman, "Translation, Transcription, and the Making of World Literature," Turkish Literature as World
Literature, 2020.
[17] Ö. Berk Albachten, "Intralingual translation as ‘modernization’ of the language: The Turkish case," Perspectives, vol.
21, no. 2, pp. 257-271, 2013.
[18] S. T. Gurcaglar, S. Paker, and J. Milton, Tradition, tension and translation in Turkey. John Benjamins Publishing
Company, 2015.
[19] O. Dursun, "A Novel Framework for Morphological Processing of Turkish," 2023.
[20] M. E. Almahdi, "Quantitative ways of measuring natural language change through time and location," Fen Bilimleri
Enstitüsü, 2020.
[21] S. Demir and S. Oktem, "A benchmark dataset for Turkish data-to-text generation," Computer Speech & Language,
vol. 77, p. 101433, 2023.
[22] I. Dolek and A. Kurt, "A deep learning model for Ottoman OCR," Concurrency and Computation: Practice and
Experience, vol. 34, no. 20, p. e6937, 2022.
[23] H. Z. Koytak and M. H. Celik, "A text mining approach to determinants of attitude towards Syrian immigration in the
Turkish Twittersphere," Social Science Computer Review, vol. 41, no. 2, pp. 608-625, 2023.
[24] G. Hagen, "Translations and translators in a multilingual society: a case study of Persian-Ottoman translations, late
fifteenth to early seventeenth century," Eurasian studies, vol. 2, no. 1, pp. 95-134, 2003.
[25] J. Korkut, "Morphology and lexicon-based machine translation of Ottoman Turkish to Modern Turkish," Princeton
University, Princeton, NJ, USA, 2019.
[26] S. Kirmizialtin and D. Wrisley, "Automated transcription of non-Latin script periodicals: a case study in the ottoman
Turkish print archive," arXiv preprint arXiv:2011.01139, 2020.
[27] C. Woodhead, "Ottoman languages," in The Ottoman World: Routledge, 2011, pp. 143-158.
[28] E. N. Rothman, "Dragomans and “Turkish literature”: The making of a field of inquiry," Oriente Moderno, vol. 93,
no. 2, pp. 390-421, 2013.
[29] S. Paker, "Ottoman Conceptions of Translation and its Practice: The 1897 ‘Classics Debate' as a Focus for Examining
Change," in Translating Others (Volume 2): Routledge, 2014, pp. 325-348.
[30] S. Kalafat, "Triggering A Renaissance in Historical Textual Studies in Turcology: Text Encoding Initiative (TEI) and
the Initial Standardised Ottoman Mathematics Text," Osmanlı Mirası Araştırmaları Dergisi, vol. 11, no. 30, pp. 261-
283, 2024.
[31] O. Celiktemel-Thomen, "Prime Ministry Ottoman Archives: Inventory of Written Archival Sources for Ottoman
Cinema History," Tarih Graduate History Journal, pp. 17-48, 2013.
[32] F. Aladag, "Digital Humanities and Ottoman Studies 2.0," Journal of Digital Islamicate Research, vol. 2, no. 1-2, pp.
63-89, 2024.
[33] M. Kirca and H. Baktir, "Comparative Literature in the Turkish Context: Past, Present and Possible Trajectories,"
Comparative Literature: East & West, pp. 1-10, 2024.
[34] B. De Nicola, "Manuscripts and Digital Technologies: A Renewed Research Direction in the History of Ilkhanid
Iran," Iran Namag, vol. 5, no. 1, pp. 4-21, 2020.
[35] A. Erbil, "Translation and the growth of juristic discourse in sixteenth-century Ottoman political writing," 2021.
[36] S. Gunasti, The Qur'an between the Ottoman Empire and the Turkish Republic: an exegetical tradition. Routledge,
2019.
[37] C. Demircioglu, "Translating Europe: The case of Ahmed Midhat as an Ottoman agent of translation," in Agents of
translation: John Benjamins Publishing Company, 2009, pp. 131-159.
[38] C. Kerslake, "Ottoman Turkish," in The Turkic Languages: Routledge, 2021, pp. 174-194.
[39] E. Wigen, State of translation: Turkey in interlingual relations. University of Michigan Press, 2018.