Sílvia Araújo, University of Minho · Centro de Estudos Humanísticos da Universidade do Minho (CEHUM)
Ph.D. in Language Sciences – Corpus Linguistics
Associate Professor at the Department of Romance Studies of the University of Minho
72 publications · 8,165 reads · 114 citations
Introduction
Sílvia Araújo is an Associate Professor at the Department of Romance Studies at the University of Minho. Her research interests include corpus linguistics, natural language processing, and the application of AI technologies to education.
Publications (72)
Taking as its framework the Theory of Enunciative Operations proposed by Antoine Culioli, our paper studies the uses of the simple present in Portuguese and French. In order to reach reliable conclusions about the similarities and divergences in the conditions of occurrence and the mode of functioning of the present in the...
In this document we present the Per-Fide project, whose main goal is the creation of bilingual resources between Portuguese and six other languages: Spanish, Russian, French, Italian, German and English. This process will begin with the compilation of parallel corpora in different areas, namely literature, religion and politics (...
This paper presents a corpus-based study of pronominal causative constructions in a French-Spanish-Portuguese perspective. The combination of monolingual and multilingual corpus searches will help determine, at an initial phase, the conditions that underlie the functioning of se faire/hacerse/fazer-se in each language and, subsequently, the linguis...
In this study, we describe an ICT-based teaching proposal for teacher training in the area of foreign language learning and analyse the learning outcomes obtained in the course unit (Unidade Curricular, hereafter UC) Tecnologias Aplicadas às Línguas (TAL) of the Master's in Spanish as a Second Language or Foreign Lang...
This paper aims to analyse and describe the functioning of pronominal anaphora in English-Portuguese and Portuguese-English simultaneous interpreting. For this purpose, we selected a random sample of 24 speeches from the plenary sessions of the European Parliament. This small sample was taken from a larger pool of speech transcripts, which will be...
Designing a Phrase Bank for Academic Learning and Teaching: A European Portuguese Case Study. This work focuses on the development and application of an academic phrase bank for European Portuguese. Our goal is to present an overview of this resource, discuss its potential replication in other languages, and explore its application in the classroom...
In this chapter, we examine generative artificial intelligence in the context of Digital Humanities. We commence by providing a concise overview of this technology and its prevalent models. Following that, we provide a brief survey of the existing literature on the application of generative AI in the field of Digital Humanities. Generative AI liter...
This edited volume explores how digital humanities can address critical societal challenges in social media, health, education, archives, heritage, and the arts. It features contributions from leading scholars and practitioners in various fields, offering a comprehensive overview of the role of digital humanities in addressing pressing social and e...
This article delves into the realm of science communication and data visualization, presenting a platform designed to enhance the dissemination of scientific knowledge. Rooted in the context of the DIAL4U project, the study investigates the creation of online resources to bridge the gap between scientific experts and diverse audiences. The integrat...
In the introductory chapter of this volume, the authors contemplate the role of Digital Humanities in today’s fast-paced and interconnected world and present an overview of the book’s key values. These include the intersection of Digital Humanities with data science and big data, the application of digital tools and methodologies, and the perspecti...
alBERTUM is a Portuguese search engine designed for scientific and academic language. This paper provides an overview of alBERTUM’s data sources, model, architecture, and two core functions. The search engine uses a vast collection of scientific open data available in national repositories to provide bilingual terminology searches and offer a Portu...
This paper introduces a methodology for facilitating the acquisition of specialized language using artificial intelligence tools, specifically ChatGPT, to simplify complex texts. The methodology consists of five steps: summarizing, simplifying, extracting terminology, creating mind maps, and visualizing, by generating multimodal content such as sho...
Nowadays, the Internet is one of the main sources of data, which can be collected and interpreted in order to obtain valuable insights. This paper presents a method for extracting and analyzing data from tourism platforms, focusing on easy techniques and tools usable by virtually any person from any scientific area. This methodology allows the user...
This paper addresses the challenge researchers face when translating their work. Due to the high cost associated with human translation services, many researchers turn to automatic translation tools as a cost-effective alternative. Therefore, assessing the quality of these translations is crucial. This paper presents a comparative evaluation of tra...
The lack of access to academic literacy skills and tools is a serious problem, as it furthers inequality among students. In this paper, we propose a methodology to semi-automatically extend a corpus of academic phraseology, previously manually extracted and categorized, using BERT machine learning models. We begin by describing the constitution of the...
1. From planning to drafting: a multimodal approach to preparing a literature review using ICT. The difficulties students face, from primary to higher education, in navigating the myriad academic genres they encounter over an academic path that increasingly stretches across almost two decades are well documen...
While digital technologies have revolutionized how we collect, store and retrieve information, there is a lack of strategies that support academic collaboration at the institutional level. Today it is vital to know how to communicate accurately and efficiently, respecting the specificities of each communicative context. We present an interface prot...
In order to take full advantage of science and technology, they must be accessible to as many people as possible and their message must be easily understood. In fact, the Covid pandemic highlighted the importance of simple and effective science communication. However, researchers are only trained for the creation of knowledge and in no in...
In this paper, we present a methodology for collaborative and cooperative literature review in multimodal and multigenre learning scenarios. The goal is the development of (creative) learning scenarios in distance or hybrid learning contexts, which promote multimodal mapping of syllabus topics and the improvement of multiliteracie...
In this paper, we will present the process of developing a resource that we consider to be useful for both native and non-native college students in the process of writing Portuguese academic texts: a BERT-powered Writing Assistant for academic purposes in European Portuguese. The Writing Assistant includes two main components: a phrase bank, that...
In this article, we present the results of a pedagogical experiment carried out in a university context in Portugal and France. The aim of this experiment is to introduce students to active methodologies through a three-stage pedagogical approach: i) documentary research of five scientific articles...
The importance of technology in the tourism sector has dramatically increased in the last few years. The amount of data available in almost any field demands appropriate techniques to provide a basis for development strategies. This paper presents a science mapping bibliometric analysis on tourism and technology. Its main goals are to de...
In this paper, we describe our approach for solving Task 3 of the SimpleText Lab, organized as part of the CLEF 2022 conference. The SimpleText Lab addresses issues of automatic text simplification of scientific texts in order to make scientific knowledge more accessible to everyone. To address Task 3, we trained Simple T5. In the first experiment,...
In this article, we present a research project that aims to create a portal of multimodal resources (Kress et al., 2014) that support academic literacy and writing in specialized languages, targeting students, teachers, researchers and professionals from various scientific fields. By prioritizing texts and...
Background. The recent growth of the field of academic discourse studies accompanies demographic changes (Purser et al., 2008) and the increase in students pursuing postgraduate and doctoral studies (Ferrão Tavares & Pereira, 2021). The concern with describing the genres of academic discourse is also tied to questions of litera...
In this article, we explore the process of building a multifunction engine that is being developed within the PortLinguE research project (ref. PTDC/LLT-LIG/31113/2017) and that is based on the reuse of scientific data available in open access. We describe the general architecture of the engine, which rests on a Django framework...
Following the lockdown of learners and teachers in Spain as a consequence of the COVID-19 pandemic, it is important to analyse the situation of teaching French as a foreign language in an Alliance française. The Alliance Française de Vigo (Af Vigo) had to adapt its professional practice to the reality of lockdown, moving...
The present study compares the analytic causative constructions featuring the verbs faire and fazer. It is based on a bilingual corpus, the source language being French and the target language Portuguese. Using this contrastive data, we first examine the cases where there is correspondence between faire and fazer; then we focus on the equivalents t...
While humour and wordplay are among the most intensively studied problems in the field of translation studies, they have been almost completely ignored in machine translation. This is partly because most AI-based translation tools require a quality and quantity of training data (e.g., parallel corpora) that has historically been lacking for humour...
Although citizens agree on the importance of objective scientific information, they tend to avoid scientific literature due to access restrictions, its complex language or their lack of prior background knowledge. Instead, they rely on shallow information on the web or social media often published for commercial or political incentives rather t...
https://simpletext-project.com
SimpleText tackles technical challenges and evaluation challenges by providing appropriate data and benchmarks for text simplification.
We propose the following shared tasks:
TASK 1 What is in (or out)? Select passages to include in a simplified summary, given a query
TASK 2 What is unclear? Given a passage and a q...
https://www.joker-project.com/clef-2022/EN/project
The goal of the JOKER workshop is to bring together translators and computer scientists to work on an evaluation framework for wordplay, including data and metric development, and to foster work on automatic methods for wordplay translation.
Tasks
We invite you to submit both automatic and manua...
The Web and social media have become the main source of information for citizens, with the risk that users rely on shallow information in sources prioritizing commercial or political incentives rather than the correctness and informational value. Non-experts tend to avoid scientific literature due to its complex language or their lack of prior back...
ABSTRACT. In this article, we present the results of a pedagogical experiment conducted in a university context. The goal of this experiment is to introduce students to active methodologies through a pedagogical approach in three stages: i) documentary research of five scientific articles relating to an active methodology chosen by the students (fl...
This paper aims to explain how virtual tours are an important resource that can bring students closer to science, by allowing them to visit research centers and laboratories – not only from their country, but also internationally – without requiring their physical presence. This technology enables them to see places they would not be able to visit...
In this article, we present the results of a pedagogical experiment conducted in a university context. The experiment aims to introduce students to active methodologies through a pedagogical approach in three stages. The students were divided into pairs and were asked to complete the following tasks: i) documentary research of five scientific art...
Learning etymologically close foreign or second languages (FL/L2) is often a source of difficulties and misunderstandings for learners of those languages. This is the case for the Portuguese-Spanish language pair, in which etymological proximity facilitates intercomprehension and enables lexical transfers from the L1...
In this paper, we present the Per-Fide project, aimed at the construction of parallel corpora mapping the Portuguese language to six other languages-Spanish, French, Italian, German, English and Russian-in various domains, including literary, journalistic and religious texts. We will demonstrate how the Per-Fide Corpus can be used for contrastive a...
This study compares the syntactic and semantic features of analytic verbal causative constructions with faire and fazer, on the basis of a bilingual corpus that we examine in the direction from the original text (French) to the translation (Portuguese). From these contrastive data, we will see that Portuguese is a...
The digital age plays a key role in learning, helping to discover new ways of acquiring, building and disseminating knowledge. One of the most attractive features of this new digitally-oriented learning lies in the fact that students are actively involved in their learning process. In this paper, we discuss the methodology of a teaching experiment,...
Abstract. To address translation students' lack of language awareness, as future translators, regarding how their working languages function and the linguistic resources characteristic of the various specialized languages, we propose a heuristic didactic approach that takes the text as the basic unit...
The study discussed here aims precisely to highlight the significant impact that corpus linguistics can have on research and teaching in fields related to the study of language and translation (Williams, 2005). We propose a reflection on the usefulness of a multilingual corpus (such as the Per-Fi...
This volume of the journal H2D, devoted to the theme Novas práticas no ensino em tempos de pandemia: desafios e soluções em ambiente digital (New teaching practices in times of pandemic: challenges and solutions in a digital environment), aimed to give voice to teachers around the world who had to adapt to new modes of teaching without necessarily having the tools needed to move, in such a short...
This paper focuses on intervention effects obtained by embedding a topic constituent (either a displaced topic or a clitic left-dislocated topic) within the domain of wh-movement. We present the results of two acceptability judgment tests carried out in European Portuguese (EP), which indicate that only a subset of the constructions in which a topi...
This work focuses on the use of Cheat Sheets and Padlet as a new learning tool. Cheat Sheets are well known among students, though not for the right reasons. Yet the practical side of Cheat Sheets can be summed up as “While doing it, I learn”. And this is the key issue of this study: is Cheat Sheets to promote lear...
The development of digital technologies has transformed modes of teaching and learning. This evolution has brought about a revolution in pedagogical practices and the establishment of innovative learning environments. By enabling a diversification of reading and writing practices, these environments provide conte...
Writing is a challenging task especially for graduate students, who are often required to produce scientific writing. Mind maps help to organize ideas and are an essential tool in the planning phase of writing. Software tools have been developed to support mind-mapping, but in our research we have not found tools offering specific support for writi...
CALL FOR PAPERS: May 15, 2018: abstract submission
techLING is an international conference devoted to the synergies between languages/linguistics and technology, which focuses on the impact of technological tools on language processing. This year it will be held in Portugal, at Universidade Autónoma de Lisboa, on October 2...
Over the last few years, the number of parallel resources for natural language processing has been increasing, especially for languages other than Portuguese. Thanks to the entry of Portugal into the EU, the Europarl (Koehn, 2005) and the JRC-Acquis (Steinberger et al., 2006) corpora became a reference in the legal field, which include the Portugues...
Mobile devices (smartphones, tablets, e-readers, etc.) have come to be used as tools for mobile learning. Several studies support the integration of such technological devices with learning, particularly with language learning. In this paper, we wish to present an Android app designed for the teaching and learning of Portuguese as a foreign languag...
In this paper, we propose an exploratory study about the usefulness of multilingual corpora in areas related to the study of language, translation and, in particular, of simultaneous interpreting. After a brief overview of corpus-based interpreting studies as well as of some existing electronic interpreting corpora, we move on to describe the compi...
This chapter introduces the Per-Fide corpus, a multilingual parallel collection that encompasses six languages: English, Russian, French, Italian, German and Spanish. The corpus is bidirectional, with Portuguese as the pivot language (it is always the search or target language). The chapter details the various stages involved in the preparation of...
This paper is intended to establish a space for reflection on the usefulness of a monolingual corpus in fields concerning the study of language, using for that purpose a concrete example, i.e. that of se faire+Vinf constructions. Based on a literary and journalistic corpus, which we queried to determine the types of verbs more often selected by se...
If the concepts of theme and rheme, defined as «the subject you say something about»/ «what you say about it», seem to fit perfectly within the analysis of a statement as simple as (i) mon père a une moto hyper-rapide, they run into difficulties as soon as they meet more complex statements, such as (ii) moi, mon père, il a une moto hyper-rapide....
This study deals with the theoretical implications of a project for a multilingual dictionary of pronominal structures in Spanish, which would be based on contrastive analysis with other neighbouring languages (Italian, French, German, English, Portuguese), and which, in its applied linguistics dimension, also seeks to explore in greater depth the aspe...
Presentation of a project for a multilingual dictionary of pronominal structures in Spanish, based on contrastive analysis with other neighbouring languages (Italian, French, German, English, Portuguese), which addresses both theoretical-descriptive and didactic aspects.