Carla Teixeira Lopes
University of Porto | UP · Departamento de Engenharia Informática
PhD
About
84
Publications
6,464
Reads
401
Citations
Publications (84)
Archives are facing numerous challenges. On the one hand, archival assets are evolving to encompass digitized documents and increasing quantities of born-digital information in diverse formats. On the other hand, the audience is changing along with how it wishes to access archival material. Moreover, the interoperability requirements of cultural he...
Archives preserve materials that allow us to understand and interpret the past and think about the future. With the evolution of the information society, archives must take advantage of technological innovations and adapt to changes in the kind and volume of the information created. Semantic Web representations are appropriate for structuring archi...
Wikipedia is the world’s largest online encyclopedia, but maintaining article quality through collaboration is challenging. Wikipedia designed a quality scale, but with such a manual assessment process, many articles remain unassessed. We review existing methods for automatically measuring the quality of Wikipedia articles, identifying and comparin...
Linked Data is used in various fields as a new way of structuring and connecting data. Cultural heritage institutions have been using linked data to improve archival descriptions and facilitate the discovery of information. Most archival records have digital representations of physical artifacts in the form of scanned images that are non-machine-re...
Research data management is essential for safeguarding and prospecting data generated in a scientific context. Specific issues arise regarding data in image format, as this data typology poses particular challenges and opportunities; however, not much attention has been given to data as images. We reviewed 109 articles from several research domains...
In a world increasingly present online, people are leaving a digital footprint, with valuable information scattered on the Web, in an unstructured manner, beholden to the websites that keep it. While there are potential harms in being able to access this information readily, such as enabling corporate surveillance, there are also significant benefi...
The 2nd edition of the International Workshop on Archives and Linked Data ran in conjunction with the 26th International Conference on Theory and Practice of Digital Libraries (TPDL 2022). TPDL 2022 was an in-person event in Padua with an online-only registration for non-speakers. Archives, the guardians of large volumes of historical and current i...
Web search engines have marked everyone's life by transforming how one searches and accesses information. Search engines give special attention to the user interface, especially search engine result pages (SERP). The well-known "10 blue links" list has evolved into richer interfaces, often personalized to the search query, the user, and other asp...
Web Search Engine Results Pages (SERP) are among the best-known and most used web pages. These pages started as simple "10 blue links" pages, but the information in SERP currently goes way beyond these links. Several features have been included in these pages to complement organic and sponsored results and attempt to provide answers to the...
An institution must understand its users to provide quality services, and archives are no exception. Over the years, archives have adapted to the technological world, and their users have also changed. To understand archive users’ characteristics and motivations, we conducted a study in the context of the Portuguese Archives. For this purpose, we a...
Research data management is an essential process in scientific research activities. It includes monitoring data from the moment it is created until it is deposited in a repository so that later it can be accessed and reused by others. Sharing and reuse are the last steps in this process. It is essential to ensure that the data stored in digital rep...
Research data management (RDM) includes people with different needs, specific scientific contexts, and diverse requirements. The description is a big challenge in the domain of RDM. Metadata plays an essential role: it allows the inclusion of information essential for interpreting data and enhances data reuse and preservation. The e...
This report provides an overview of the field of Information Retrieval (IR) in healthcare. It does not aim to introduce general concepts and theories of IR but to present and describe specific aspects of Health Information Retrieval (HIR). After a brief introduction to the broader field of IR, the significance of HIR at current times is discus...
Research data management (RDM) practices are critical for ensuring research success. Data can assume diverse formats and data in image format have been understudied in RDM. To understand image management habits in research, we have conducted semi-structured interviews with researchers from four research domains. Most researchers do not formally man...
The International Workshop on Archives and Linked Data was a satellite event of the 25th International Conference on Theory and Practice of Digital Libraries (TPDL 2021). TPDL 2021 was an online event with free registration, and the same applied to the workshop. Linked Data and Semantic Web technologies offer new possibilities for digital curation...
Archives are evolving. Analog archives are becoming increasingly digitized and linked with other cultural heritage institutions and information sources. Diverse forms of born-digital archives are appearing. This diversity asks for systematic ways to characterize existing archives managing physical or digital records. We conducted a systematic revie...
Archives are faced with great challenges due to the vast amounts of data they have to curate. New data models are required, and work is underway. The International Council on Archives is creating the RiC-CM (Records in Context), and there is a long line of work in museums with the CIDOC-CRM (CIDOC Conceptual Reference Model). Both models are based...
Research data management is the basis for making data more Findable, Accessible, Interoperable and Reusable. In this context, little attention is given to research data in image format. This article presents the preliminary results of a study on the habits related to the management of images in research. We collected 107 answers from researchers us...
Medico-scientific concepts are not easily understood by laypeople, who frequently use lay synonyms. For this reason, strategies that help users formulate health queries are essential. Health Suggestions is an existing extension for Google Chrome that provides suggestions in lay and medico-scientific terminologies, both in English and Portuguese. Th...
Archives have well-established description standards, namely the ISAD(G) and ISAAR(CPF) with a hierarchical structure adapted to the nature of archival assets. However, as archives connect to a growing diversity of data, they aim to make their representations more apt to the so-called linked data cloud. The corresponding move from hierarchical, ISA...
Readability is a linguistic feature that indicates how difficult it is to read a text. Traditional readability formulas were made for the English language. This study evaluates their adequacy to the Portuguese language. We applied the traditional formulas in 10 parallel corpora. We verified that the Portuguese language had higher grade scores (less...
Wikipedia is the largest on-line collaborative encyclopedia, containing information from a plethora of fields, including medicine. It has been shown that Wikipedia is one of the top visited sites by readers looking for information on this topic. The large reliance on Wikipedia for this type of information drives research towards the analysis of the...
Because of terminology mismatches, health consumers frequently face difficulties while searching the Web for health information. Difficulties arise in query formulation but also in understanding the retrieved documents. In this work we analyze how documents' readability affects users' comprehension and how both affect the retrieval performance, mea...
Health consumers usually face difficulties in their online searches, mainly because of the differences between the terminologies used by laypeople and health professionals. This work presents a tool, HealthTranslator, available as a Google Chrome extension, that intends to reduce this terminological gap while users are searching the Web for health infor...
The importance of research data management is widely recognized. Dendro is an ontology-based platform that allows researchers to describe datasets using generic and domain-specific descriptors from ontologies. Selecting or building the right ontologies for each research domain or group requires meetings between curators and researchers in order to...
Purpose: The quality of consumer-oriented health information on the web has been defined and evaluated in several studies. Usually it is based on evaluation criteria identified by the researchers and, so far, there is no agreed standard for the quality indicators to use. Based on such indicators, tools have been developed to evaluate the quality of...
Introduction. The concept and study of relevance has been a central subject in information science. Although research in information retrieval has been focused on topical relevance, other kinds of relevance are also important and justify further study. Motivational relevance is typically inferred by criteria such as user satisfaction and success. M...
A patient's health literacy has a direct impact on their health, but more than a third of the USA population has "basic" or "below basic" levels of health literacy. An individual's wellbeing is also affected by the communication with their physician, as the use of technical terminology may hinder the patient's understanding. A patient's ability to,...
We present a case study of quality evaluation of online health information. Two participants were selected from a health information search (HIS) study, in which we are investigating consumers' evaluation of the quality of online health information. The selected cases offered a rare example of two almost exactly opposite eye-movement patterns on th...
Relevance is usually estimated by search engines using document content, disregarding the user behind the search and the characteristics of the task. In this work, we look at relevance as framed in a situational context, calling it situational relevance, and analyze whether it is possible to predict it using documents, users and tasks characteristi...
Searching for health information is one of the most popular activities on the web. In this domain, users often misspell or lack knowledge of the proper medical terms to use in queries. To overcome these difficulties and attempt to retrieve higher-quality content, we developed a query suggestion system that provides alternative queries combining the...
The Web is frequently used as a way to access health information. In the health domain, the terminology can be very specific, frequently assuming a medico-scientific character. This can be a barrier to users who may be unable to understand the retrieved documents. Therefore, it would be useful to automatically assess how well a certain document wil...
Searching for health information is one of the most popular activities on the Web. In this domain, users frequently encounter difficulties in query formulation, either because they lack knowledge of the proper medical terms or because they misspell them. To overcome these difficulties and attempt to retrieve higher-quality content, we developed a q...
To help laypeople overcome the common difficulties they face when searching for health information on the Web, we built Health Suggestions, an extension for Google Chrome that assists users in obtaining high-quality search results in the health domain. This is achieved by providing users with suggestions of queries formulated in different terminologies an...
The health domain is rich in specific vocabulary and information structures. Previous work on this area includes the collection of this information in information systems. However, the language of these can limit their use. To overcome this, we present Health Translations, a web application that uses crowd-sourcing to translate a large vocabulary s...
Nowadays, online communities are becoming an important resource for health consumers who want to retrieve and share information about health subjects. These communities have the potential to influence patients' health behaviors and increase their engagement with therapies. However, the interaction dynamics in this type of media remains poorly under...
If it were possible to automatically detect proficiency in languages using data from eye movements, new levels of customizing computer applications could possibly be achieved. An example in case is web searches where suggestions and results could be adjusted to the user's knowledge of the language. The objective of this study is to compare the read...
Search engines typically estimate relevance using features of the documents. We believe that several features from the user and task can also contribute to this process. In the health domain there are specific characteristics of web documents that can also add value to this estimation. In the present work, using a dataset composed of a set of annotat...
Prior studies have shown that terminology support can improve health information retrieval but have not taken into account the characteristics of the user performing the search. In this chapter, the impact of translating queries' terms between lay and medico-scientific terminology, in users with different levels of health literacy and topic familia...
Identifying the user's intent behind a query is a key challenge in Information Retrieval. This information may be used to contextualize the search and provide better search results to the user. The automatic identification of queries targeting a search for health information allows the implementation of retrieval strategies specifically focused on...
Information and Communication Technologies (ICT) play a leading role in our society, across the various sectors of economic activity, in educational institutions, in Public Administration, and even at home. But it is in the lives of young people, namely those born from 1980 onward (the net generation), that these technologies have a ro...
Assessing the degree of readiness for e-learning is a very useful strategy for the successful implementation of e-learning practices, since it identifies the most critical aspects to consider before and during implementation. This article describes a model for assessing the degree of readiness for the implementation of e-learn...
English is by far the most used language on the web. In some domains, the existence of less content in the users' native language may not be problematic and even help to cope with the information overload. Yet, in domains such as health, where information quality is critical, a larger quantity of information may mean easier access to higher quality...
In this paper we propose a multilingual method to identify health-related queries and classify them into health categories. Our method uses a consumer health vocabulary and the Unified Medical Language System semantic structure to compute the association degree of a query to medical concepts and categories. This method can be applied in different l...
We conducted a user study to analyze how health literacy, topic familiarity and the terminology used in past queries affect query behavior in health searches. We found that users with inadequate health literacy have less success in web searches and show more difficulties in query formulation. These users and the ones not familiar with the topic use...
Purpose: The intent of this work is to evaluate several generalist and health-specific search engines for retrieval of health information by consumers: to compare the retrieval effectiveness of these engines for different types of clinical queries, medical specialties and condition severity; and to compare the use of evaluation metrics for binary re...
The Web is being increasingly used by health consumers to search for health information. In this domain, the quality of the retrieved contents is crucial to avoid healthcare hazards. To address this problem and help the user identify reliable and credible contents, initiatives have appeared that certify the compliance of health websites to quality...
The Internet plays an important role in higher education institutions, where Learning Management Systems (LMS) occupy a central role in the eLearning realm. In this chapter we aim to characterize the Internet and LMS usage patterns and their role in the largest Portuguese Polytechnic Institute. The usage patterns were analyzed in two components: char...
It is recognized by the Information Retrieval community that context affects the retrieval process. Query formulation and relevance assessment are stages where the user role is central. The first determines what the system will search for and the second is frequently used to evaluate how the system behaved. With a large human involvement, these stage...
We have conducted a user study to evaluate several generalist and health-specific search engines on health information retrieval. Users evaluated the relevance of the top 30 documents of 4 search engines in two different health information needs. We introduce the concepts of local and global precision and analyze how they affect the evaluation. Res...
The context in which a search takes place affects the Information Retrieval (IR) process. It affects the searcher's interaction with the IR system, his expectations and his decisions about the documents he retrieves. Therefore, knowing more about what features are important in a searcher's context and what they are used for, can help design more us...
in HIR applications. We intend to characterize this behavior through the application of questionnaires to a sample of health professionals, interviews with a smaller set of professionals and, if logs are available, through the analysis of searches made on general and specialized search engines. The next step is to propose an information retrieval...
This report provides an overview of the field of Information Retrieval (IR) in healthcare. It does not aim to introduce general concepts and theories of IR but to present and describe specific aspects of Health Information Retrieval (HIR). After a brief introduction to the broader field of IR, the significance of HIR at current times is disc...
Usage analysis of a Web Information System is a valuable aid in predicting user needs, assessing the system's impact and guiding its improvement. This is usually done by analysing clickstreams, a low-level approach involving huge amounts of data that calls for data warehouse techniques. This paper presents a dimensional model to monitor user behaviour in Hig...
The publication of rankings of high schools, which started in 2001, had a significant impact at several levels, initiating a series of comments and speculations in mainstream media and influencing the decisions (or at least the aspirations) of many people. Taking advantage of the availability of five years of data, and the possibility of comparing the...
Monitoring the user behaviour of an information system (IS) is a valuable way to assess its impact and to guide its improvement. In the case of Web Based IS this is usually done by analyzing the Web server logs or clickstreams, an overly low-level approach with huge amounts of data. Dealing with this calls for data warehouse techniques, which abstract the...
An e-learning readiness evaluation is critical to the success of an e-learning strategy, identifying issues that should be considered before and during an e-learning intervention. This paper describes a model to evaluate the e-learning readiness of a Higher Education Institution and reports the results of its application in ESTSP, a Porto's Allied...
The use of the Web to find health information is a common practice nowadays. The improvement of Health Information Retrieval depends on studies that, frequently, require the identification of health-related queries. Being usually done by human assessors, this identification may turn out to be inefficient and even impracticable in some cases. To...
Abstract: Web Accessibility is a subject that has been gaining prominence in developed societies, because the Web is a medium that allows people with special needs to achieve goals they previously could not have imagined. This study aims to analyze whether the Government of Portugal websites in the ".gov.pt" subdomain comp...
Researchers are aware that context affects information retrieval in general. The health area is no exception and is particularly rich in terms of context. To understand how context is used in health information research, we collected a sample of health information research papers that use context features. Papers were analyzed and classified acco...
Master's thesis. Information Management. Faculdade de Engenharia. Universidade do Porto. 2005