Corpus Linguistics - Science topic

Explore the latest publications in Corpus Linguistics, and find Corpus Linguistics experts.
Publications related to Corpus Linguistics (10,000)
Sorted by most recent
Article
Full-text available
Tests in the null hypothesis significance testing (NHST) framework are designed to identify differences between groups. Thus, researchers wishing to assess similarities are out of luck if they use widespread techniques in the field such as t-tests, chi-square tests, and linear regression. This is unfortunate given that researchers with many common...
Article
Full-text available
This article introduces a protocol designed to analyze large corpora for vocabulary profiling, aimed at enhancing corpus-based studies of academic discourse. Given the complexity and volume of data typical in academic fields, this protocol integrates advanced corpus compilation techniques with lexical analysis tools to effectively identify and cate...
Cover Page
Full-text available
The focus of LxGr is the interaction of lexis and grammar. The focus is influenced by Halliday’s view of lexis and grammar as “complementary perspectives” (1991: 32), and his conception of the two as notional ends of a continuum (lexicogrammar), in that “if you interrogate the system grammatically you will get grammar-like answers and if you interr...
Article
Full-text available
Considered an iconic symbol of indigenous legal heritage, Islamic law is adopted nowadays in whole or in part in the legal systems of the Muslim world and is also of significance in Muslim-minority European countries, where it typically finds its niche in civil and financial domains. This article sets out to investigate the norms of translating Isl...
Article
Full-text available
The discourse of song lyrics can be depicted as a mirror and as a mould of social life: It reflects, but also stimulates, certain practices and subjectivities. Using a corpus of 1 000 pop-rock lyrics in Spanish and techniques from corpus linguistics, socio-semiotics, and psychological text analysis, this study explores the discourse of lyrics from...
Article
Full-text available
The question addressed in this research is whether statistical techniques can verify that ChatGPT's generative language models GPT-3.5 (free version) and GPT-4 (paid version) have a writing style distinct from that of humans, and that they can be distinguished by at least three types of features: lexical, s...
Article
Full-text available
This article proposes and illustrates with examples new Spanish as a foreign language teaching strategies which are based on the Lexical Approach empowered by the creation and didactic use of self-compiled corpora. The proposed methodology is aimed at the teaching of multiword expressions. Our proposition derives from the latest acknowledgements in...
Article
Full-text available
Introduction: the eurozone crisis of 2009, marked by a sovereign debt and banking crisis with Greece at its epicentre, attracted great media interest in Europe, notably in Spain, where parallels were drawn with the Greek situation. Methodology: this study analyses the media coverage of the crisis in Gre...
Article
Full-text available
The integration of corpora into language teaching appears to be lacking, raising the question of whether this omission is due to teachers intentionally choosing not to use corpora in their classes or a lack of awareness and knowledge about these tools. This study of teachers in Indonesia aimed to explore two research questions: (a) what are the tea...
Article
Full-text available
The study addresses issues related to the usage of “to be” in unstructured written conversations among vocational high school students through the WhatsApp group platform. The focus is an analysis using AntConc to understand conversational patterns involving “to be”. The research aims to identify sentence variations and common errors among the stude...
Article
Full-text available
This study explored Indonesia’s diverse narratives and counter-narratives concerning pandemics with a focus on public sentiment and governmental responses. The research highlighted the challenges and complexities of sentiment analysis in such a multifaceted field by gathering data from numerous sources, including the unstructured terrain of Twitter...
Article
Full-text available
This study examines the effectiveness of Data-Driven Learning (DDL) in enhancing grammar instruction for Thai English as a Foreign Language (EFL) learners, addressing the limitations of traditional methods such as the Grammar Translation Method (GTM). DDL, rooted in corpus linguistics, engages learners in analysing authentic language patterns throu...
Article
Full-text available
A sentence is generally construed as a unit which is made up of one or more clauses. Hence, the English simple sentence, made up of a subject and a predicate, contains just a clause which makes a complete thought and expresses a single proposition. Against this backdrop, this study examines the connection between valency and passive voice in the En...
Article
Full-text available
This critical discourse analysis aims to expose the violations of the standards and guidelines of conversation specifically in the context of senate investigations on the matter of People's Initiative in the Philippines. Moreover, this study delved into the violations of maxims within the premise of cooperative principle conceptualized by Grice. Th...
Article
Full-text available
The global discourse on COVID-19 has shifted from a broad discussion of the pandemic to a focus on the vaccine. However, how COVID-19 vaccines have been discursively constructed and communicated in mainstream newspapers has received insufficient scientific attention, particularly given that research indicates the news media is a more reliable sourc...
Article
Full-text available
This research analyzes the phenomenon of amelioration on concrete nouns in the novels Layar Terkembang by Sutan Takdir Alisjahbana and Sitti Nurbaya by Marah Roesli. Amelioration is the process of changing the meaning of a word that leads to a more positive connotation. The purpose of this study is to identify the forms of amelioration, the positiv...
Article
Full-text available
We explore the surprising lexical be construction in English (e.g. Why don’t you be quiet?). After an overview of previous discussions, an investigation of the use of lexical be in the COCA and SOAP corpora is provided. It is shown that its distribution is highly skewed and that it is completely felicitous only under a very limited set of conditio...
Article
Full-text available
This article examines the nexus between crisis and change in the context of German security policy after the Russian invasion of Ukraine. Chancellor Olaf Scholz’s announcement of a Zeitenwende (historic turning point) on 27 February 2022, a few days after the Russian attack, suggests a substantial change in German foreign and security policy. Germa...
Article
Full-text available
This paper addresses the intricate task of studying humor considering its dependence on cognition, emotions and even human perception. It focuses on the use of stereotypes within verbal humor, having stand-up comedy as the center of the study. Specifically, it examines how Trevor Noah, a South African comedian, utilizes stereotypes to entertain and...
Preprint
Full-text available
Legal interpretation is experiencing an empirical turn. Over the past decade, jurists have informed interpretation with corpus linguistics and survey experiments. Recently, others have used machine learning (“ML”) and natural language processing (“NLP”). Over the past few years, the transformer architecture—a technological advancement that has chan...
Article
Full-text available
This study examines the application of corpus linguistics in English language teaching (ELT) in Colombia, where its practical adoption remains limited compared to many developed countries. Data were gathered through a survey of English instructors at Colombian universities to assess their familiarity with corpus linguistics. Findings indicate signi...
Article
Full-text available
Recent research underscores the significance of data-led and collaborative reflection in enhancing teaching practices and professional development of teachers. While video-based reflections have been extensively studied, the potential of corpus-based methods remains underexplored. We address this gap in two ways. Firstly, we describe a research and...
Article
Full-text available
Lexical reciprocal verbs, defined as those verbs that inherently lexicalize a symmetric situation, e.g., consentio ‘agree’, occur in at least three distinct argument structure constructions, depending on whether their participants are expressed as a single (plural) subject, asymmetrically expressed – one as subject and the other one as oblique – or...
Article
Full-text available
In this article, we introduce a sociolinguistic perspective on language modeling. We claim that language models in general are inherently modeling varieties of language, and we consider how this insight can inform the development and deployment of language models. We begin by presenting a technical definition of the concept of a variety of languag...
Article
Full-text available
This study explored the representation of Pakistani identity in Mohsin Hamid's The Reluctant Fundamentalist using a corpus-based approach. The nuanced portrayal of Pakistani identity was investigated using the selected terms Pakistan, Lahore, Muslim, Urdu and beard and their variations. Drawing on frequency, collocations, concordance lines, and semant...
Article
Full-text available
The present research investigated comments on political reports that elicited negative criticism, predominantly influenced by the readers' heated political perspectives. The study utilised lexical and pragmatic analyses to explore a corpus of 1000 reader comments on Indonesian news items concerning a new capital city construction, Ibu Kota Nusantar...
Article
Full-text available
Background: Identifying language variation in healthy aging speakers is important for understanding normal cognitive aging. Establishing a baseline for language in normal aging is a prerequisite for evaluating the language performance of older adults. Lexical concreteness, a well‐studied psycholinguistic parameter, has been used to detect se...
Article
Full-text available
The study analyses diminutive forms of terms for family members (mother, father, daughter, son) in English and Slovak. Specific suffixes used to create diminutive forms can be identified. The aim is to compare these known suffixes and analyse their occurrence, frequency and usage in the electronic cor...
Thesis
Full-text available
The present study is conducted to examine the occurrence of adjectives in English and Arabic newspaper corpora. The purpose is to find out the most frequent lexical phrases and their phraseological patterns. In addition, the study is an attempt to explain the cognitive phenomenon behind how linguistic units are constructed, produced and developed i...
Article
Full-text available
Metaphors, arguments and emotional appeals have considerable persuasive power in political discourse, yet they are rarely studied together. To explore the interactions between these interrelated phenomena, we employ three methods of analysis: Metaphor Identification Procedure, Inference Anchoring Theory, and lexicon-based sentiment analysis. Our da...
Article
Full-text available
The article delves into the question of Freud's concept of reading, and the fear of being photographed based on an analysis of the article "A Case of Paranoia Running Counter to the Psychoanalytic Theory of That Disease" (1915). Freud explicitly guides readers on how to read and not read this text. In alignment with contemporary concepts of paranoi...
Article
Full-text available
This contribution proposes a possible course of study on the teaching of vocabulary: in recent decades, publications on the subject have been numerous, making it necessary to bring order to the field so that its intended users (the teacher and the scholar) can orient themselves more easily. It was decided to subdivi...
Book
Full-text available
This book is the first of its kind to bridge the gap between corpus linguistics and forensic linguistics, illustrating the value of applying corpus linguistic data, tools, and methods in the analysis of language in the law, evidence, crime, and justice. The volume begins by taking stock of the use of corpus linguistics in the field of forensic and...
Article
Full-text available
This study investigates potential differences in language use between genders, by applying a modified model of thought representation. Our hypothesis is that women use more direct forms of thought representation than men in modern spoken British English. Women are said to favour “private speech” that creates intimacy and nearness through d...
Article
Full-text available
The growing field of corpus linguistics has been heavily engaged in language pedagogy during the last two decades. This has encouraged researchers to look for more applications of corpora in language teaching and learning, and has led to the emergence of corpus use in language testing. The aim of this article is to provide an overview of using...
Article
Full-text available
The purpose of this study is to demonstrate how to integrate two in-house specialized corpora into a university-level English for Specific Purposes (ESP) course for nonnative speakers of English. The ESP course was an introductory level of wine tasting for Applied English Department students at a university specializing in hospitality in Taiwan. Tw...
Article
Full-text available
The paper gives an overview of learner corpora and their application to second language learning and teaching. It is proposed that there are four core components in learner corpus research, namely, corpus linguistics expertise, a good background in linguistic theory, knowledge of SLA theory, and a good understanding of foreign language teaching iss...
Article
Full-text available
Aim To highlight the use of corpus linguistics for analysing language data and to provide a worked example of this approach in nursing research. Design Methodology discussion paper. Methods This paper introduces corpus linguistics as a distinct approach to undertaking qualitative research in nursing. Examples are provided to illustrate how corpus...
Article
Full-text available
Corpus linguistics has transformed linguistic research but has had only a moderate impact on ESL teaching and learning. This essay introduces the Wikipedia Corpus, designed by Mark Davies. The corpus allows teachers to search Wikipedia in a powerful way: they can search by word, phrase, part of speech, and synonyms. Teachers can also find...
Article
Full-text available
This study attempts to examine the use of English modals in terms of their frequency and functions. For this purpose, Form 4 and College students’ argumentative compositions were extracted from the Malaysian Corpus of Students’ Argumentative Writing (MCSAW). In order to analyze the data, this study employed discourse analysis and some descriptive s...
Article
Full-text available
Recent studies in corpus linguistics have revealed apparent inconsistencies between the prescriptive grammar presented in EFL textbooks and the type of grammar used in the speech of native speakers. Such variations and learning gaps deprive EFL learners of the actual use of English and delay their oral/aural developmental processes. The focus of thi...
Article
Full-text available
Studies of ESL/EFL learners’ use of the progressive reveal that it is one of the grammatical aspects most problematic for them. This paper presents the results of a study on the use of progressives in compositions by Year 5, Form 1 and Form 4 Malaysian ESL learners, using the English of Malaysian School Students (EMAS) corpus. The purpose of this s...
Article
Full-text available
The purpose of this study is to explore the complexity of the lexical bundles and/or formulaic patterns in law texts, to create a corpus of authentic formulaic patterns of verb forms in law, and to propose a workable method for identifying and teaching the accumulated specialised registers of formulaic patterns of law, the types and function of law lexical...
Article
Full-text available
In the case of students studying English as an L2, a genre classification was tested via corpus referencing using the Multidimensional Analysis Tagger, statistical comparisons and qualitative assessments. The study explores the academic genres that students produce, and the influences of the genres to which they are exposed. Academic genre is exami...
Article
Full-text available
The aim of this article is to analyze explicit influences that can be identified in Pamela Faber's frame-based terminology theory (FBT). We deem that FBT is primarily based on cognitive-semantic theories, such as Fillmore's frame semantics, and Cabré’s and Temmerman’s terminology theories. In order to prove this hypothesis, first, a review of termi...
Article
Full-text available
Genre-based pedagogy (PBG, from the Spanish pedagogía basada en el género) considers it important to recognize and make explicit the rhetorical moves of discourse genres. For this reason it has acquired particular relevance in academic literacy processes. To identify the patterns of a genre, it is necessary to have representative text samples,...
Article
Full-text available
This paper examines the form sort of in British men's and women’s speech, and investigates whether there is a gender difference in the use of this form. We do so through corpus analysis of the British National Corpus (BNC). We contend there is no quantitative difference in the use of sort of in men's and women’s speech. Contrary to general bel...
Article
Full-text available
This study examines the depiction of gender construction in Tehmina Durrani's novel Blasphemy by implementing a corpus-based technique and employing the feminist theory of Gloria Jean Watkins (bell hooks) and Kimberlé Crenshaw's intersectionality framework. The study uncovers patterns of male dominance and feminine submission through a qualitative inve...
Article
Full-text available
The Old Irish glosses in contemporary manuscripts are the most reliable evidence for Old Irish syntax. These glosses convey discontinuous utterances that depend on the Latin text to which they are attached. One of the most obvious consequences of this discontinuous and textually dependent character is that the glosses very often convey what we coul...
Article
Full-text available
This study aims to analyze and determine the extent to which environmental issues and natural sustainability are accommodated in Law no. 3 of 2022 concerning IKN using a corpus linguistics approach. The data collected in this research is the complete text of Law no. 3 of 2022 concerning IKN, which covers the entire text of the law, including preamb...
Article
Full-text available
The significance of academic word lists has consistently been underscored by scholars, particularly following the establishment of the AWL by Coxhead in 2000. Therefore, it is necessary to conduct a systematic literature review on the criteria for word selection that are crucial in the process of developing word lists. This study systematically rev...
Article
Full-text available
This study scrutinizes the authorial presence in English and Indonesian research articles written by Indonesian authors. The study aims to identify the frequency of first-person pronouns used as a form of authorial presence across research articles. Furthermore, the study also intends to examine the discourse functions of these first-person pronoun...
Article
Full-text available
This study investigates the representation of Islam in European Parliament sessions using the Europarl 3: German corpus, focusing on collocations related to Islam to understand their associated sentiments and semantic distributions. Political discourse plays a critical role in shaping public perceptions, and Islam is often framed within the context...
Article
Full-text available
The initiative of the Clear Writing Movement (Kimble 1992), targeting the democratization of communication by simplifying legal documents, has influenced the presentation of law globally. By uniting diverse philosophies of Plain Language and Easy-to-Read under the broad umbrella of text clarity and accessibility (Maaß 2020), this movement has parti...
Article
Full-text available
Given the well-established fact that news is not "facts" but an opinionated representation of events, individuals, and issues, and thus the subjective and infosuasive nature of news reporting, we need a deeper exploration of how media frame and construct narratives to shape public perception. In the present paper we depart from the assumption that...
Article
Full-text available
This work shows how corpus-based studies can be applied to the field of specialised translation and, more precisely, to the legal translation of employment contracts. Legal language, given its complexity and cultural specificity, presents considerable challenges for translators, and reliance on bilingual dictionaries may not always result in high-q...
Article
Full-text available
The article investigates the dynamics of fear and condemnation in the female context through the analysis of biblical translations using Natural Language Processing (NLP) methods. The authors adopt an interdisciplinary approach that combines corpus linguistics and word embedding to analyze the discourse related to female figures associated with sentimen...
Article
Full-text available
This research paper focuses on an investigation of if-conditional structures in naturally-occurring American English. The corpus-based analysis has revealed several alternative grammatical constructions, all of which occurred with higher frequency than the three traditional main types, which actually accounted for less than half of the entire if-co...
Article
Full-text available
If we consider corpus linguistics as the study of a language through its samples, we should give credit to its contribution to the advancement of various sub-fields of linguistics: lexicography, translation studies, applied linguistics, diachronic studies and contrastive linguistics. The latter can be regarded as a special case of a linguistic typo...
Article
Full-text available
The article analyses comments posted under videos on the coronavirus pandemic by Grzegorz Braun, one of the leaders of the right-wing party Konfederacja Wolność i Niepodległość. The study employed a range of corpus linguistics methods, adopting the perspective of contemporary discourse studies. It was shown that supporters of Grzegorz Brau...
Article
Full-text available
Theoretical and empirical research into affix acquisition in Second Language Acquisition has attracted increasing attention (Peng Tingting, 2009; Zhao Ming, 2014; Chen Jie, 2017). However, there are still few empirical studies on affix acquisition by Chinese learners of English, especially from the perspective of corpus linguistics. The presen...
Article
Full-text available
The paper summarizes the results of a large-scale international project aimed at investigation of Russian-Vietnamese mutual perceptions reflected in language and culture. The authors describe the process of development and testing of complex methodology for the reconstruction of ethnic “portraits” and “self-portraits” in two aspects, the characteri...
Article
Full-text available
This study aims to identify whether there are contextual differences in the use of es decir, o sea and en plan among Spanish speakers. With the emergence of material from the Internet, a new paradigm places certain uses of discourse markers at the intersection between the oral and the written. In this conte...
Article
Full-text available
The adoption of online learning has significantly transformed educational practices, sparking extensive debate about its effectiveness, challenges, and potential. Central to these discussions are the roles of educators and students, innovative strategies for maintaining engagement, and the integration of technology to meet diverse learning needs. Teach...
Article
Full-text available
This contribution falls within the field of corpus exploitation applied to language teaching. Specifically, it aims to investigate how learners of Spanish as a second or foreign language (L2/LE) use the quantifier un poco as a pragmatic-discursive resource in spoken language, compared with native speakers. The ESLORA corpus of Spanish...
Article
Full-text available
The aim of corpus linguistics is the elaboration and analysis of real language texts. In this article we synthesise the basic concepts of transcription and elaboration of textual and oral corpora, focusing on the latter, as well as on the systems of transcription of prosodic elements in colloquial conversation. Following this review, we propose a p...
Article
Full-text available
Oral corpora are essential tools for linguistic research, as they provide authentic data on language use in real contexts. This paper presents the Corpus Oral de la Comunidad de Habla LGTBI (Navarro-Carrascosa 2023a), which collects speech samples from this community, facilitating the study of its codes...
Article
Full-text available
The Chinese Dream, which describes a set of ideals, received extensive media coverage after its proclamation by Chinese President Xi Jinping in November 2012. Making use of this rich source of media data, this article explores the ideology and ideals of the Chinese Dream represented in China’s state-run English-language newspapers. Modeled on the approach of...
Article
Full-text available
This article reports a state-of-the-art review of recent developments in corpus linguistics and corpus-based research in Hong Kong. A top-down, multi-layer, stratified review identified 29 ongoing research projects from the eight research-active universities in Hong Kong. These projects make use of corpus technology to address a wide range of research...
Article
Full-text available
The key function of storytelling is a meeting of hearts: a resonance in the recipient(s) of the story narrator’s emotion toward the story events. This paper focuses on the role of gestures in engendering emotional resonance in conversational storytelling. The paper asks three questions: Does story narrators’ gesture expressivity increase from story...
Article
Full-text available
This paper sets out to analyse a series of autobiographical narratives by members of AMMAR, a union of Argentine sex workers, compiled in two volumes titled Tacones cercanos (AMMAR, 2016, 2017), from the standpoint of discourse analysis (Arnoux, 2006, 2019), understood as an interpretive practice that links contextual data with...
Article
Full-text available
Critical Metaphor Analysis is concerned with integrating critical discourse analysis, corpus linguistics, pragmatics and cognitive linguistics to explore implicit speaker intentions and covert power relations through the analysis of metaphoric expressions. CMA has been a meaningful enrichment of both Critical Discourse Analysis and Conceptual Metap...
Book
Full-text available
Changes in consumption habits, driven by the development of the internet and digitalization, have profoundly transformed our society in recent decades, and journalism is no exception. This monograph invites readers to explore current journalistic discourse and its genres, paying special attention to the news story, a genre that, as...
Article
Full-text available
Climate change is a multifaceted issue that encompasses environmental, social, cultural, and political dimensions. Greater involvement and awareness have been fostered among online community members thanks to the rise of the internet and the use of social media networks. Among these, X (formerly known as Twitter) stands out as a catalyst for creati...
Article
Full-text available
The study mainly examines the connotative meaning of several terms that are frequently used in the media in the political discourse of the September 11th attacks and the ‘War on Terror’. Eight items were identified which are ‘Sunni’, ‘jihad’, ‘Islamist’, ‘fatwa’, ‘terrorism’, ‘radicalism’, ‘militant’ and ‘fundamentalism’. The study explores the exi...
Article
Full-text available
Tim Johns’s Data-Driven Learning (DDL) is further developed as Pedagogic Processing of Corpora (PPC) in classroom language teaching. In this study, a group of ESL learners (the Experimental Group) acquire vocabulary by being exposed to a large quantity of teacher-edited language materials from the corpora. In this implementation, a free-of-charge on...
Article
Full-text available
The study uses a Corpus Assisted Feminist Stylistic Analysis to look at how women are portrayed in Hamlet by William Shakespeare. This current study carries out these analyses with the help of adjectives. The adjectives are identified from the play with the help of computer software Sketch Engine. The study examines how women, notably Ophelia and G...
Article
Full-text available
This study proposes and analyses a General Service List of Conversational English based on the spoken British National Corpus 2014. Most general service lists are based on written data (Brezina & Gablasova, 2015), overlooking the inherently unique features of conversations. This paper addresses this gap by presenting a general service list of conve...
Article
Full-text available
This research explores the integration of digital humanities methods in the analysis of Indonesian language texts to enhance linguistic and cultural understanding. The primary objective is to develop tailored digital humanities methodologies, applying computational tools such as text mining, natural language processing, and corpus linguistics to an...
Article
Full-text available
Institutions or people can express their political stances or attitudes toward a specific topic if they keep using some words rather than others repetitively and consistently. This study uses the corpus linguistic technique of frequency to examine the influence of the country where the newspaper is published on its agenda and coverage using a corpu...
Article
Full-text available
Management sociology poses the problem of the quantitative interpretation of qualitative research. The article deals with the corpus-based method, which can be considered as one of the solution tools. Based on ‘grounded theory’ methodology (Strauss & Corbin, n. d.) and partly debating with conceptual analysis (Sartory & Goertz, n. d.), we propose t...
Article
Full-text available
Though rather rare and not favoured by corpus linguists due to computationally hard-to-handle problems, learner corpora consisting of spoken and written texts by students from different L1 backgrounds can benefit both researchers in the field of second language acquisition and language teachers. Growing from this need and considering corpora’s pote...
Article
Full-text available
Abstract. In the field of natural language processing (NLP), specialised text corpora are needed for statistical modelling and for training algorithms. A language corpus is a large, structured collection of linguistic data that supports analysis and processing for NLP tasks. This article covers the types of corpora, the requirements placed on them...
Article
Full-text available
The development of corpus linguistics has laid a theoretical foundation and provided technical support for breaking the bottleneck in traditional vocabulary instruction in China. Corpora allow access to authentic data and show frequency patterns of words and grammatical constructions. Such patterns can be used to improve language materials or to directly...
Article
Full-text available
The aim of this paper is to make suggestions for improving the Georgian National Corpus on the basis of selected linguistic processes. The Georgian National Corpus is currently the most developed and detailed corpus of the Georgian language. One of the reasons for this is the included annotation of the texts, the variety of text genres and the size...
Article
Full-text available
Starting from the assumption that “if corpora are to play a role in the translation professions of tomorrow, it is important that they impact on the education of the students of today” (Bernardini & Castagnoli, 2008, p. 40), this study endeavours to show how translation corpora of parallel texts (in English and in Italian) can be used in a Speciali...