Computational Linguistics - Science topic

Computational linguistics is an interdisciplinary field dealing with the statistical or rule-based modeling of natural language from a computational perspective.
Publications related to Computational Linguistics (8,641)
Sorted by most recent
Article
Full-text available
Computational semantics, a branch of computational linguistics, involves automated meaning analysis that relies on how words occur together in natural language. This offers a promising tool to study schizophrenia. At present, we do not know if these word-level choices in speech are sensitive to the illness stage (i.e., acute untreated vs. stable es...
Article
Full-text available
Knowledge or information is transferred or stored from person to person, person to machine, or machine to person in the form of communication. Modern technologies try to make such communication as simple as possible by means of artificial intelligence. Humans and animals make their communication easy with the help of cognition in the brain....
Article
Full-text available
Computational linguistics (CL) is the application of computer science to analysing and comprehending written and spoken languages. Recently, emotion classification and sentiment analysis (SA) have become the two most widely used techniques in the Natural Language Processing (NLP) field. Emotion analysis refers to the task of recognizing the attitu...
Preprint
Full-text available
Figures of speech, such as metaphor and irony, are ubiquitous in literary works and colloquial conversations. This poses a great challenge for natural language understanding, since figures of speech usually deviate from their ostensible meanings to express deeper semantic implications. Previous research lays emphasis on the literary aspect of figure...
Article
Full-text available
Sentiment analysis has been one of the hot topics for researchers for the previous two decades. Researchers from domains like natural language processing (NLP), statistics, computational linguistics and information retrieval (IR) have been targeting different types of problems related to sentiment analysis. While the number of research problems related...
Article
Full-text available
1 The computational treatment of natural languages. The computational treatment of linguistic data has been on the agenda of linguists and computer scientists for at least five decades; however, only in the last two decades has this movement gained momentum in Brazil. This movement has drawn in researchers from various...
Article
Full-text available
The success of deep learning in natural language processing raises intriguing questions about the nature of linguistic meaning and ways in which it can be processed by natural and artificial systems. One such question has to do with subword segmentation algorithms widely employed in language modeling, machine translation, and other tasks since 2016...
Article
Full-text available
The task of processing natural language automatically has been on the radar of researchers since the dawn of computing, fostering the rise of fields such as computational linguistics and human–language technologies [...]
Article
Full-text available
Over the past couple of decades, Information Technology has opened new vistas, mostly to simplify the day-to-day life of humans. One of the key contributions of Information Technology is the application of Artificial Intelligence to achieve better results. The advent of artificial intelligence has given rise to a new branch of Natural Language Processin...
Article
Full-text available
Non-arbitrary phenomena in language, such as systematic association in the form-meaning interface, have been widely reported in the literature. Exploiting such systematic associations previous studies have demonstrated that pseudowords can be indicative of meaning. However, whether semantic activation from words and pseudowords is supported by the...
Article
Full-text available
Personality is a set of stable and tendentious behaviors, thoughts and emotions. How to measure personality more conveniently and accurately has always been a problem for scholars in related fields. With the rapid development of computer technology and the widespread popularity of social media in recent years, the research of computational personal...
Cover Page
Full-text available
Machine Translation Protein Structures, Sequence Reads and Computational Linguistics
Article
Full-text available
Computational linguistics explores how human language is interpreted automatically and then processed. Research in this area takes the logical and mathematical features of natural language and advances methods and statistical procedures for automated language processing. Slot filling and intent detection are significant modules in task-based dialog...
Preprint
Full-text available
Background Over the last few decades, a growing body of evidence suggests a role for various infectious agents in Alzheimer’s Disease (AD) pathogenesis. Despite diverse pathogens (virus, bacteria, or fungi) being detected in AD subjects’ brains, most research has focused on individual pathogens and only a few studies investigated the hypothesis of...
Article
Full-text available
KHUSAINOVA ZILOLA YULDASHEVNA, Tashkent State University of Uzbek Language and Literature named after Alisher Navoi, Master of Computer Linguistics. ANNOTATION: The article covers the stages of development of modern lexicography. It describes the advantages of electronic dictionaries over paper ones and the stages of the creation process. Th...
Presentation
Full-text available
The article examines the main linguistic features of Hercule Poirot’s speech etiquette figures, particularly the French phrases, as well as the speciality of translation in the detective genre of literature using computer linguistics, as computer linguistics and machine translation issues and problems become more relevant as scientific and technolo...
Article
Full-text available
The Collaborative Research Center 1412 “Register: Language Users’ Knowledge of Situational-Functional Variation” (CRC 1412) investigates the role of register in language, focusing in particular on what constitutes a language user’s register knowledge and which situational-functional factors determine a user’s choices. The following paper is an extr...
Preprint
Full-text available
Distributional semantics, the quantitative study of meaning variation and change through corpus collocations, is currently one of the most productive research areas in computational linguistics. The wider availability of big data and of reproducible algorithms for analysis has boosted its application to living languages in recent years. But can we...
Article
Full-text available
Questions are crucial expressions in any language. Many Natural Language Processing (NLP) or Natural Language Understanding (NLU) applications, such as question-answering computer systems, automatic chatting apps (chatbots), digital virtual assistants, and opinion mining, can benefit from accurately identifying similar questions in an effective man...
Preprint
Full-text available
The idea that discourse relations are construed through explicit content and shared, or implicit, knowledge between producer and interpreter is ubiquitous in discourse research and linguistics. However, the actual contribution of the lexical semantics of arguments is unclear. We propose a computational approach to the analysis of contrast and conce...
Article
Full-text available
Introduction Symptoms of schizophrenia are closely related to aberrant language comprehension and production. Macroscopic brain changes seen in some patients with schizophrenia are suspected to relate to impaired language production, but this is yet to be reliably characterized. Since heterogeneity in language dysfunctions, as well as brain structu...
Article
Full-text available
The paper describes computational tools that can be of great help to both qualitative and quantitative scholars in the humanities and social sciences who deal with words as data. The Java and Python tools described provide computer-automated ways of performing useful tasks: 1. check the well-formedness of filenames; 2. find user-defined characters in...
Article
Full-text available
Recent achievements have turned computational linguistics into a dynamic research area and an important field of application development that is being explored by leading technology companies. Despite the advances, there is still much room for improvement to allow humans to interact with computational devices, through natural language, in the same...
Article
Full-text available
Infectious diseases have been an impending threat to the survival of individuals and groups throughout our evolutionary history. As a result, humans have developed psychological pathogen-avoidance mechanisms and groups have developed societal norms that respond to the presence of disease-causing microorganisms in the environment. In this work, we d...
Article
Full-text available
Post-authorship attribution is a scientific process of using stylometric features to identify the genuine writer of an online text snippet such as an email, blog, forum post, or chat log. It has useful applications in manifold domains, for instance, in a verification process to proactively detect misogynistic, misandrist, xenophobic, and abusive po...
Article
Full-text available
This article discusses the use and tools of the spaCy module, which is written in the Python programming language, in Natural Language Processing (NLP), considered one of the main areas of computer linguistics. A text in a natural language contains separate units (symbols) and can be divided into several interrelated parts belonging...
Article
Full-text available
Spell checking is the process of finding misspelled words and possibly correcting them. Most modern commercial spell checkers use a straightforward approach to finding misspellings, which considers a word erroneous when it is not found in the dictionary. However, this approach is not able to check the correctness of words in their context and t...
Preprint
Full-text available
Lemmatization is the process of grouping together the inflected forms of a word so they can be analysed as a single item, identified by the word's lemma, or dictionary form. In computational linguistics, lemmatisation is the algorithmic process of determining the lemma of a word based on its intended meaning. Unlike stemming, lemmatisation depends...
Article
Full-text available
Natural language processing is an important branch of deep learning. In particular, the classification of short texts is one of the main tasks of computer linguistics, because it ensures the security of information. Therefore, this paper reviews the text classification methods for the first time, aiming at comparing the modern methods to solve the...
Article
Full-text available
In this article, the concept (i.e., the mathematical model and methods) of computational phonetic analysis of speech with an analytical description of the phenomenon of phonetic fusion is proposed. In this concept, in contrast to the existing methods, the problem of multicriteria of the process of cognitive perception of speech by a person is stric...
Article
Full-text available
Ancient undeciphered scripts present problems of different nature, not just tied to linguistic identification. The undeciphered Cypro-Minoan script from second millennium BCE Cyprus, for instance, currently does not have a standardized, definitive inventory of signs, and, in addition, stands divided into three separate subgroups (CM1, CM2, CM3), wh...
Presentation
Full-text available
Students are frequently matched to texts via readability indices (e.g., Lexile); however, little is known about the syntactic characteristics of texts at each level. We applied computational linguistics to quantify variability on syntactic variables to better inform teachers about the grammatical challenges students may face as they progress throug...
Article
Full-text available
Natural Language Processing (NLP) is a discipline at the intersection between Computer Science (CS), Artificial Intelligence (AI), and Linguistics that leverages unstructured human-interpretable (natural) language text. In recent years, it gained momentum also in health-related applications and research. Although preliminary, studies concerning Low...
Article
Full-text available
Introduction. Conceptual metaphors are a cognitive process that consists of expressing one conceptual domain (such as love, companies, or time) in terms of another (such as journeys, trees, or paths). Thus, we produce expressions such as: "I don't know where we are going in this relationship" (love is a journey), "We are going to trim profits"...
Preprint
Full-text available
The following chapters are a collection of self-contained, more or less bite-sized essays on distinctive topics in computational linguistics. As an alternative to reading book chapters in their given sequence to avoid loss of logical presentation, the chapters here may be read according to whichever title strikes the gentle reader's fancy. Because...
Article
Full-text available
Given the difficulties of defining “machine” and “think”, Turing proposed to replace the question “Can machines think?” with a proxy: how well can an agent engage in sustained conversation with a human? Though Turing neither described himself as a philosopher nor published much on philosophical matters, his Imitation Game has stood the test of time...
Article
Full-text available
Competency in skills associated with collaborative problem solving (CPS) is critical for many contexts, including school, the workplace, and the military. Innovative approaches for assessing individuals’ CPS competency are necessary, as traditional assessment types such as multiple-choice items are not well suited for such a process-oriented compe...
Preprint
Full-text available
This paper proposes methods of predicting dynamic time series (including non-stationary ones) based on a linguistic approach, namely, the study of occurrences and repetition of so-called N-grams. This approach is used in computational linguistics to create statistical translators, detect plagiarism and duplicate documents. However, the scope of app...
Article
Full-text available
Modal verbs express modality, and modality is concerned with the status of the proposition that describes an event; it also expresses the opinion and attitude of a speaker toward the proposition of an utterance. Since modalities are directly related to the objective world, subjective world, and language use, they have been a hot topic of philosophe...
Article
Full-text available
Various applications in computational linguistics and artificial intelligence rely on high-performing word sense disambiguation techniques to solve challenging tasks such as information retrieval, machine translation, question answering, and document clustering. While text comprehension is intuitive for humans, machines face tremendous challenges i...
Article
Full-text available
Graphical representations of speech generate powerful computational measures related to psychosis. Previous studies have mostly relied on structural relations between words as the basis of graph formation, i.e., connecting each word to the next in a sequence of words. Here, we introduced a method of graph formation grounded in semantic relationship...
Article
Full-text available
In communication, textual data are a vital attribute. In all languages, the meaning of ambiguous or polysemous words changes depending on the context in which they are used. Determining the correct meaning of an ambiguous word is a challenging task in natural language processing (NLP). Word sense disambiguation (WSD) is an NLP proce...
Article
Full-text available
Comparative judgments permit the assessment of open-ended student works by constructing a latent quality scale through repeated pairwise comparisons (i.e., which works “win” or “lose”). Adaptive comparative judgments speed up the judgment process by maximizing the Fisher information of the next comparison. However, at the start of a judgment proces...
Book
Full-text available
The International Conference on the Statistical Analysis of Textual Data (JADT, Journées d’Analyse Statistique des Données Textuelles) has reached its 16th edition. It was held for the first time in Naples, from the 6th to the 8th of July 2022, organised at the University of Naples Federico II by the VADISTAT Per Simona Balbi Association. This b...
Article
Full-text available
Research on the development of methods for identifying signs of hidden manipulation (destructive information and psychological impact) in text messages that are published on Internet sites and distributed among users of social networks is relevant. One of the main problems in the development of these methods is the difficulty of formalizing the pro...
Article
Full-text available
Over the past decades, the process of knowledge generation has accelerated, producing a lot of scientific publications, which makes reviewing even a relatively narrow subject area very demanding, if not impossible. However, recent text data mining tools can assist researchers in conducting such analysis in an objective and time-efficient way. We co...
Preprint
Full-text available
Natural language generation models are computer systems that generate coherent language when prompted with a sequence of words as context. Despite their ubiquity and many beneficial applications, language generation models also have the potential to inflict social harms by generating discriminatory language, hateful speech, profane content, and oth...
Article
Full-text available
The study presents an overview of discursive complexology, an integral paradigm of linguistics, cognitive studies and computer linguistics aimed at defining discourse complexity. The article comprises three main parts, which successively outline views on the category of linguistic complexity, history of discursive complexology and modern methods of...
Article
Full-text available
The dramatic expansion of modern linguistic research and the enhanced accuracy of linguistic analysis have become a reality due to the ability of artificial neural networks not only to learn and adapt, but also to carry out automated linguistic analysis and to select, modify and compare texts of various types and genres. The purpose of this article and the journ...
Chapter
Full-text available
We aim to demonstrate the productivity of Corpus Linguistics for descriptive linguistic studies and to highlight its relevance in the (continuing) training of researchers in Linguistics and Applied Linguistics. The studies presented here are tied to different areas, such as Dialectology, Lexicology, Sociolinguistics, Language Teaching...
Book
Full-text available
This book aims to demonstrate the productivity of Corpus Linguistics for descriptive linguistic studies and to highlight its relevance in the (continuing) training of researchers in Linguistics and Applied Linguistics. The studies presented here are tied to different areas, such as Dialectology, Lexicology, Sociolinguistics, Teaching...
Preprint
Full-text available
The primary use of any probabilistic model involving a set of random variables is to run inference and sampling queries on it. Inference queries in classical probabilistic models are concerned with the computation of marginal or conditional probabilities of events given as an input. When the probabilistic model is sequential, more sophisticated margin...
Preprint
Full-text available
Sentiment analysis is a sub-discipline in the field of natural language processing and computational linguistics and can be used for automated or semi-automated analyses of text documents. One of the aims of these analyses is to recognize an expressed attitude as positive or negative as it can be contained in comments on social media platforms or p...
Conference Paper
Full-text available
The application of machine learning techniques to ancient writing systems is a relatively new idea, and it poses interesting challenges for researchers. One particularly challenging aspect is the scarcity of data for these scripts, which contrasts with the large amounts of data usually available when applying neural models to computational linguist...
Preprint
Full-text available
In this paper, we launch a new Universal Dependencies treebank for an endangered language from Amazonia: Kakataibo, a Panoan language spoken in Peru. We first discuss the collaborative methodology implemented, which proved effective to create a treebank in the context of a Computational Linguistics course for undergraduates. Then, we describe the ge...
Conference Paper
Full-text available
The paper presents new software for endangered and low-resourced languages, the Linguistic Field Data Management and Analysis System (LiFE): an open-source, web-based linguistic data analysis and management application allowing systematic storage, management, usage and sharing of linguistic data collected from the field. The application enables users to...
Poster
Full-text available
In spite of their diversity, the Indigenous languages of the American continent have received little attention from the technological perspective (Mager, Gutierrez-Vasques, Sierra & Meza-Ruiz, 2018). Guarani is one of the most widely spoken native South American languages. However, its presence in the web is scarce, even in Paraguayan websites, whe...
Preprint
Full-text available
Argumentation analysis is a field of computational linguistics that studies methods for extracting arguments from texts and the relationships between them, as well as building argumentation structure of texts. This paper is a report of the organizers on the first competition of argumentation analysis systems dealing with Russian language texts with...
Article
Full-text available
Words, sentences, and paragraphs are the basis of texts. When we consider texts as data and want to establish a relationship between qualitative and quantitative perspectives, we can do this with the word frequencies in a text. We aim to examine to what extent the relative frequencies of the words differ in Turkish and English scientific articles....
Article
Full-text available
In this paper, I use a framework from computational linguistics, the Rational Speech Act framework, to model deceptive probabilistic communication. This account allows agents to discount for the biases they perceive their interlocutors to have. This way, agents can update their credences with the perceived interests of others in mind.
Technical Report
Full-text available
The International AAAI Conference on Web and Social Media (ICWSM) is a forum for researchers from multiple disciplines to come together to share knowledge, discuss ideas, exchange information, and learn about cutting-edge research in diverse fields with the common theme of online social media. This overall theme includes research in new perspective...
Poster
Full-text available
This contribution presents the first steps towards the analysis of Leonardo Fibonacci's Liber Abbaci using computational linguistics methods. The work is currently carried out in the context of a joint research project between the Tuscany Region and the University of Pisa with the help of an interdisciplinary team.
Preprint
Full-text available
Dictionary approaches are at the forefront of current techniques for quantifying central bank communication. This paper proposes embeddings (a language model trained using machine learning techniques) to locate words and documents in a multidimensional vector space. To accomplish this, we utilize a text corpus that is unparalleled in size and diversi...
Article
Full-text available
This work represents a novel direction for computational linguistics research on metaphor in the Ever-Glorious Qur'ān. The present study proposed a basic/non-basic meaning criterion as a marker for the computational identification of metaphor in the Ever-Glorious Qur'ān. The corpus was Sūrat Hūd, where manual identification for candidate metaphors...
Article
Full-text available
Language models, such as BERT, construct multiple, contextualized embeddings for each word occurrence in a corpus. Understanding how the contextualization propagates through the model's layers is crucial for deciding which layers to use for a specific analysis task. Currently, most embedding spaces are explained by probing classifiers; however, som...
Article
Full-text available
The statistical measurement of agreement is important in a number of fields, e.g., content analysis, education, computational linguistics, biomedical imaging. We propose Sklar's Omega, a Gaussian copula-based framework for measuring intra-coder, inter-coder, and inter-method agreement as well as agreement relative to a gold standard. We demonstrate...