About
79 Publications
13,993 Reads
603 Citations
Publications (79)
The phenomenon we investigate in this article is the translation process of the discourse connective 'and' in discourse relations that involve parentheticals. We argue that the use of 'and' together with parentheticals in discourse functions as a perspective shifter. We examine the use of 'and' in TED talks in the original language, English, and compare...
One of the most interesting aspects of natural language is how texts cohere, which involves the pragmatic or semantic relations that hold between clauses (addition, cause-effect, conditional, similarity), referred to as discourse relations. A focus on the identification and classification of discourse relations appears as an imperative challenge to...
Proposals such as continuity and causality-by-default relate the level of expectedness of a relation to its linguistic marking as an explicit or implicit relation. We investigate these two proposals with regard to the English transcripts of six TED Talks and their Lithuanian, Portuguese and Turkish translations in the TED-Multilingual Discourse Ban...
This paper describes a rule-based approach and a machine learning approach to disambiguating the discourse usage of connectives in Turkish, which has not only single and phrasal connectives, as most languages do, but also suffixal connectives that largely correspond to subordinating conjunctions in English. Since these connectives have different linguist...
This paper contributes to the question of how discourse relations are realised in TED talks. Drawing on an annotated, multilingual discourse corpus of TED talk transcripts, we examine discourse relations in English and Lithuanian, Portuguese and Turkish translations by concentrating on three aspects: the degree of explicitness in discourse relation...
We describe Turkish Discourse Bank 1.2, the latest version of a discourse corpus annotated for explicitly or implicitly conveyed discourse relations, their constitutive units, and senses in the Penn Discourse Treebank style. We present an evaluation of the recently added tokens and examine three commonly occurring dependency patterns that hold amon...
The single biggest obstacle in performing comprehensive cross-lingual discourse analysis is the scarcity of multilingual resources. The existing resources are overwhelmingly monolingual, compelling researchers to infer the discourse-level information in the target languages through error-prone automatic means. The current paper aims to provide a mo...
Corpus-based contrastive and translation research are areas that keep evolving in the digital age, as the range of new corpus resources and tools expands, opening up to different approaches and application contexts. The current book contains a selection of papers which focus on corpora and translation research in the digital age, outlining some rec...
Manually annotated linguistic corpora record language users’ intuitions electronically. Reusability of annotated corpora depends on adopting an annotation approach which aspires to be linguistically sound and reliable. The major goal of this study is to present an overall assessment of the validity and reliability of Turkish Discourse Bank (TDB 1.0...
Discourse relations are expressed either with a discourse connective or conveyed with no connective, referred to as implicit relations. Both forms of expression affect the translation of the texts. Translators may omit the connectives which are explicit in the source text, which is implicitation; or they may add connectives which are merely inferre...
The conjunction 'and' is a highly frequent and polyfunctional word (Crible et al. 2019). In the Penn Discourse Treebank (PDTB-3), multiple relations are categorised under four conditions: 1) where a single explicit connective has multiple senses; 2) in the absence of any explicit connective; 3) where there are multiple explicit connectives; and 4) where...
There is a growing interest in understanding how discourse relations are processed and translated into different languages (Zufferey & Gygax, 2016; Hoek et al., 2017; Crible et al., 2019). This study aims to extend the TED-Multilingual Discourse Bank (TED-MDB) as an annotated resource of an English-Turkish parallel corpus and to lay the ground for fur...
The volume aims to bring together original, unpublished papers on discourse structure and meaning from different frameworks or theoretical perspectives to address research questions revolving around issues instigated by Turkish. Another goal is to offer methodologically different solutions for the research gaps identified in individual chapters. Th...
We introduce the Turkish Emotion-Voice Database (TurEV-DB), which involves a corpus of over 1735 tokens based on 82 words uttered by human subjects in four different emotions (angry, calm, happy, sad). The speech data were produced by amateur actors, checked by assessors, recorded, and preprocessed by a denoising procedure. An emotion corpus was co...
TED-Multilingual Discourse Bank, or TED-MDB, is a multilingual resource where TED talks are annotated at the discourse level in 6 languages (English, Polish, German, Russian, European Portuguese, and Turkish) following the aims and principles of PDTB. We explain the corpus design criteria, which have three main features: the linguistic characteristi...
Languages enable their speakers to use word order to mark the information status of the various elements in a sentence. This chapter investigates the information status of syntactically subordinate clauses in Turkish by examining the cases where subordinate clauses have a discourse role. Using data from the current release of the Turkish Discourse...
The Turkish Discourse Bank (TDB) is a resource of approximately 400,000 words in its current release in which explicit discourse connectives and phrasal expressions are annotated along with the textual spans they relate. The corpus has been annotated by annotators using a semiautomatic annotation tool. We expect that it will enable researchers to s...
The paper offers a quantitative and qualitative analysis of explicit inter- and intra-sentential discourse connectives in Turkish Discourse Bank, or TDB version 1.1, a multi-genre resource of written Turkish manually annotated at the discourse level following the goals and principles of Penn Discourse TreeBank. TDB 1.1 is a 40K-word corpus involving...
We introduce TED-Multilingual Discourse Bank, a corpus of TED talk transcripts in 6 languages (English, German, Polish, European Portuguese, Russian and Turkish), where the ultimate aim is to provide a clearly described level of discourse structure and semantics in multiple languages. The corpus is manually annotated following the goals and princi...
In this chapter, we provide an overview of Turkish Discourse Bank, a resource of ~400,000 words built on a sub-corpus of the 2-million-word METU Turkish Corpus annotated following the principles of Penn Discourse Tree Bank. We first present the annotation framework we adopted, explaining how it differs from the annotation of the original la...
In this paper we present the recent developments on Turkish Discourse Bank (TDB). We first summarize the resource and present an evaluation. Then, we describe TDB 1.1, i.e. enrichments on 10% of the corpus (namely, added senses for explicit discourse connectives and new annotations for implicit relations, entity relations and alternative lexicaliza...
The Acquisition of Turkish in Childhood presents recent research on the nature of language acquisition by typically and atypically developing monolingual and bilingual Turkish-speaking children. The book summarises the most recent research findings on the acquisition of Turkish in childhood, with a focus on (i) the acquisition of phonology, morphol...
This study primarily aims to build a Turkish psycholinguistic database including three variables: word frequency, age of acquisition (AoA), and imageability, where AoA and imageability information are limited to nouns. We used a corpus-based approach to obtain information about the AoA variable. We built two corpora: a child literature corpus (CLC)...
The volume assembles a wide variety of research papers on phonetics, phonology, syntax, morphology, semantics, as well as language acquisition, discourse analysis, pragmatics, language contact studies, receptive multilingualism, and sociolinguistics. The 58 articles, written specifically for this collection, give an informative overview of the curr...
The present study investigates the parsing of pre-nominal relative clauses (RCs) in children for the first time with a real-time methodology that reveals moment-to-moment processing patterns as the sentence unfolds. A self-paced listening experiment with Turkish-speaking children (aged 5–8) and adults showed that both groups display a sign of proce...
Computational theories of discourse help us understand coherence and natural language processing. This article discusses various computational theories of discourse and the issues they address. For example, theories that explain types of coherence through the presence of connectives, that propose seeking a direct link between syntax and discourse, and that they are finite in number...
This paper investigates the major characteristics of Corrective discourse relations in Turkish. The data source is METU Turkish Discourse Bank, a ~400,000-word corpus of contemporary written Turkish. The paper suggests that Corrective discourse relations are unambiguously inferred from the adjacency of a negative and a positive clause. Cer...
This work brings a computational perspective to the problem of early lexical acquisition. It is a preliminary investigation to see if the underlying mechanism relates to computational complexity, by which short, frequent and unambiguous words are supposed to be acquired first; and long, ambiguous or infrequent words (including n...
The languages of Europe and North and Central Asia provide a rich variety of data. In this volume, some articles are summaries of large areal typological research projects, and some articles focus on structures or constructions in a single language. However, it is common to all the articles that they investigate phenomena that have not been examine...
In an attempt to extend Penn Discourse Tree Bank (PDTB) / Turkish Discourse Bank (TDB) style annotations to spoken Turkish, this paper presents the first attempt at annotating the explicit discourse connectives in the Spoken Turkish Corpus (STC) demo version. We present the data and the method for the annotation. Then we reflect on the issues and c...
It is a widely accepted fact that coherence enables a text’s comprehensibility. A major source of coherence is discourse cohesion (textual properties of the text). Lexical cohesion (e.g. synonymy) and discourse connectives are two major types of discourse cohesion. We investigate the contribution of these two types of cohesion to the overall compre...
This paper briefly describes the Turkish Discourse Bank, the first publicly available annotated discourse resource for Turkish. It focuses on the challenges posed by annotating Turkish, a free word order language with rich inflectional and derivational morphology. It shows the usefulness of the PDTB style annotation but points out the need to expan...
This paper introduces a procedure that we call pair annotation, after pair programming. We describe the initial annotation procedure of the TDB, followed by the inception of the pair annotation idea and how it came to be used in the Turkish Discourse Bank. We discuss the observed benefits and issues encountered during the process, and conclude by di...
This study, in which 310 university students participated, was designed to investigate whether computer interfaces that offer human-like apologetic error messages influence users’ self-appraisals of performance in the computerized environment. The study consists of three phases. In the first phase, using the CCSARP (cross-cultural study of speech a...
This study presents a novel computational approach to the analysis of unaccusative/unergative distinction in Turkish by employing feed-forward artificial neural networks with a backpropagation algorithm. The findings of the study reveal correspondences between semantic notions and syntactic manifestations of unaccusative/unergative distinction...
In this paper, we report on the annotation procedures we developed for annotating the Turkish Discourse Bank (TDB), an effort that extends the Penn Discourse Tree Bank (PDTB) annotation style by using it for annotating Turkish discourse. After a brief introduction to the TDB, we describe the annotation cycle and the annotation scheme we developed,...
In this paper, we describe an annotation environment developed for the marking of discourse structures in Turkish, and the kinds of discourse relation configurations that led to its design.
In this paper we explain how we annotated subordinators in the Turkish Discourse Bank (TDB), an effort that started in 2007 and is still continuing. We introduce the project and describe some of the issues that were important in annotating three subordinators, namely karşın, rağmen and halde, all of which encode the coherence relation C...
Diary studies have become a useful tool for both L2 teachers and teacher educators. Such studies encourage teachers to assimilate lessons they have learned throughout their teaching experience. For teacher educators, using diaries has the promising benefit of focusing on teaching as it is understood by teachers. That is, diaries help clarify the mo...
This volume includes 14 papers investigating politeness phenomena in Greece and Turkey, the cultural cross-roads of Europe, Asia and the Middle East. It reflects current research and provides observations of and findings in patterns of linguistic politeness in a geographical area other than the much studied English speaking ones. The book appeals t...
Koşul cümlelerinde varsayımsallık ve gerçek karşıtlığı (Hypotheticality and counterfactuality in conditional sentences in Turkish)
In the discourse analysis of narratives and folktales, competence, efficiency and appropriateness are the main characteristics. There is a strategy for constructing stories. Based on an analysis of a Turkish tale of type AaTh 877 from Beaugrande's point of view, the author presents the structure of the tale and the way in which the narr...
Runs are inherent to Turkish folktales, as they are to folktales of certain other cultures. They are traditionally accepted forms, and useful compositional devices that function as bridges between the world of the tale and the world of everyday reality. This study attempts to demonstrate the dynamics of runs through an examination of the stylistic...
This paper describes first steps towards extending the METU Turkish Corpus from a sentence-level language resource to a discourse-level resource by annotating its discourse connectives and their arguments. The project is based on the same principles as the Penn Discourse TreeBank (http://www.seas.upenn.edu/~pdtb) and is supported by TUBITAK, The Sc...
Questions such as how linguistic categories like noun and verb should be defined and what kinds of words they contain form a quite important part of traditional grammar. It is hard to say that contemporary linguistic theories deal directly with the classification of words, but out of curiosity about how the theories approach linguistic categories...
1. Introduction. Grammatical theories put forward within the framework of generative grammar make a claim about the place of competence within the general information-processing system. This claim usually includes hypotheses about the information links with other systems (e.g. perception, meaning), and also makes proposals about the internal structure of the grammatical architecture...