
Vanessa Lopez, IBM Research Europe
PhD, KMi, Open University
About
95 Publications
24,636 Reads
3,060 Citations
Introduction
Additional affiliations
January 2012 - present
April 2003 - December 2011
Publications (95)
Regulations govern many aspects of citizens' daily lives. Governments and businesses routinely automate these in the form of coded rules (e.g., to check a citizen's eligibility for specific benefits). However, the path to automation is long and challenging. To address this, recent global initiatives for digital government, proposing to simultaneous...
Computer-assisted scientific discovery promises to revolutionise how humans discover new materials, find novel drugs or identify new uses for existing ones, and improve clinical trial design and efficiency. The potential of technology to accelerate scientific discovery when the space of possible candidate solutions is too large for human evaluation...
To protect vital health program funds from being paid out on services that are wasteful and inconsistent with medical practices, government healthcare insurance programs need to validate the integrity of claims submitted by providers for reimbursement. However, due to the complexity of healthcare billing policies and the lack of coded rules, maintaini...
There is a growing trend in building deep learning patient representations from health records to obtain a comprehensive view of a patient’s data for machine learning tasks. This paper proposes a reproducible approach to generate patient pathways from health records and to transform them into a machine-processable image-like structure useful for de...
In many machine learning tasks, models are trained to predict structured data such as graphs. For example, in natural language processing, it is very common to parse texts into dependency trees or abstract meaning representation (AMR) graphs. On the other hand, ensemble methods combine predictions from multiple models to create a new one that is mor...
Automated Theorem Proving (ATP) deals with the development of computer programs able to show that some conjectures (queries) are a logical consequence of a set of axioms (facts and rules). There exist several successful ATPs where conjectures and axioms are formally provided (e.g., formalised as First Order Logic formulas). Recent approaches,...
In challenging times, ensuring financial integrity and fairer distribution of services by reducing disparities are among the top priorities for social and health-care systems globally. To deliver services at population-scale, governments and healthcare agencies are increasingly automating aspects of policy rule processing – e.g., automating activit...
In challenging economic times, obtaining value for money by ensuring financial integrity and fairer distribution of services are among the top priorities for social and health-care systems globally. However, healthcare billing policies are complex and identifying non-compliance is often narrow-scope, manual and expensive. Maintaining 'integrity' is...
Social determinants of health (SDoH) are the factors which lie outside of the traditional health system, such as employment or access to nutritious foods, that influence health outcomes. Some efforts have focused on identifying vulnerable populations during the COVID-19 pandemic; however, both the short- and long-term social impacts of the pandemic...
Health and social services are complex domains that have a direct impact on people's lives and where vast amounts of money are spent globally. When funding intended for public health programs is lost to Fraud, Waste and Abuse, vulnerable citizens are ultimately the victims. In challenging times, ensuring financial integrity and fairer distribution...
Social determinants of health (SDoH) are the complex set of circumstances in which individuals are born, or with which they live, that impact their health. Integrating SDoH into practice requires that information systems are able to identify SDoH-related concepts from charts and case notes through vocabularies or terminologies. Despite significant...
Financial losses in Medicaid, from Fraud, Waste and Abuse (FWA), in the United States are estimated to be in the tens of billions of dollars each year. This results in escalating costs as well as limiting the funding available to worthy recipients of healthcare. The Centers for Medicare & Medicaid Services mandate thorough auditing, in which policy...
With healthcare fraud accounting for financial losses of billions of dollars each year in the United States, the task of investigating regulation adherence is key to reducing the impact of Fraud, Waste and Abuse (FWA) on the healthcare industry. Providers rendering services to patients typically submit claims to healthcare insurance agencies. Such cl...
The Semantic Web contains an enormous amount of information in the form of knowledge bases (KBs). To make this information available, many question answering (QA) systems over KBs have been created in recent years. Building a QA system over KBs is difficult because there are many different challenges to be solved. In order to address these challenges,...
Health and social care professionals are under increasing pressure to assimilate the ever-growing volume of data from case notes and electronic medical records. In this paper, we propose and evaluate with domain experts a cognitive system for patient-centric care that leverages and combines natural language processing, semantics, and learning from...
We propose a cognitive system for patient-centric care that leverages and combines natural language processing, semantics, and learning from users over time to support care professionals working with large volumes of unstructured patient notes. The proposed methods highlight entities embedded in the unstructured data to provide a holistic semantic...
Conversational message thread identification is relevant to a wide spectrum of applications, ranging from social network marketing to virus propagation, digital forensics, etc. Many different approaches have been proposed in the literature for the identification of conversational threads, focusing on features that are strongly dependent on the dataset. In this...
We propose a cognitive system for patient-centric care that leverages and combines natural language processing, semantics, and learning from users over time to support care professionals working with large volumes of patient notes. The proposed methods highlight the entities embedded in the unstructured data to provide a holistic semantic view of a...
We present a domain-agnostic system for Question Answering over multiple semi-structured and possibly linked datasets without the need of a training corpus. The system is motivated by an industry use-case where Enterprise Data needs to be combined with a large body of Open Data to fulfill information needs not satisfied by prescribed application da...
Providing appropriate support for the most vulnerable individuals carries enormous societal significance and economic burden. Yet, finding the right balance between costs, estimated effectiveness and the experience of the care recipient is a daunting task that requires considering vast amounts of information. We present a system that helps care team...
Efficiently detecting conversation threads from a pool of messages, such as social network chats, emails, comments to posts, news etc., is relevant for various applications, including Web Marketing, Information Retrieval and Digital Forensics. Existing approaches focus on text similarity using keywords as features that are strongly dependent on the...
Nowadays, most users carry mobile devices with high computing power, and speech recognition is one of the main technologies available in every modern smartphone, although battery drain and application performance (resource shortage) have a big impact on the experienced quality. Shifting applications and services to the cloud may help to im...
This paper introduces an extension of DALI, a framework for data integration and visualisation. When integrating new data, DALI automatically tries to recognise the schema and contents of the file, semantically lift them, and annotate them with existing ontologies. The extension presented in this paper allows users to import data from external data...
DALI is a practical system that exploits Linked Data to provide federated entity search and spatial exploration across hundreds of information sources containing Open and Enterprise data pertaining to cities, which are stored in tabular files or in their original enterprise systems. Our system is able to lift data into a meaningful linked structur...
The success of a society is often judged by its ability to support the most vulnerable. Supporting the most vulnerable individuals is extremely challenging from an information needs perspective, since it requires data from numerous domains and systems, including Social Care, Healthcare, Public Safety and Juridical systems. Information sharing on th...
More and more urban data is published every day, and consequently, consumers want to take advantage of this body of knowledge. Unfortunately, metadata and schema information around this content is sparse. To effectively fulfill user information needs, systems must be able to capture user intent and context in order to evolve beyond current search a...
Patient-Centric Care requires comprehensive visibility into the strengths and vulnerabilities of individuals and populations. The systems involved in Patient-Centric Care are numerous and heterogeneous, span medical, behavioral and social domains and must be coordinated across government and NGO stakeholders in Health Care, Social Care and more. We...
We present an approach to access and consolidate complex information spanning multiple specialist domains and make it available to non-experts. We are using a combination of business rules and contextual exploration to reduce interface complexity and improve consumability. We present a use case and a prototype on top of a real-world enterprise solu...
We present SPUD, a semantic environment for cataloguing, exploring, integrating, understanding, processing and transforming urban information. A series of challenges are identified: namely, the heterogeneity of the domain and the impracticality of a common model, the volume of information and the number of data sets, the requirement for a low entry...
Comprehensive Care requires comprehensive visibility on the strengths and vulnerabilities of individuals and populations. The systems involved in Care are numerous and heterogeneous, span very broad domains, such as Social Care, Healthcare and Public Safety, and draw on specialist knowledge from many disciplines. We present a system, based on Linke...
The third edition of the open challenge on Question Answering over Linked Data (QALD-3) was conducted as a half-day lab at CLEF 2013. Unlike previous editions of the challenge, it put a strong emphasis on multilinguality, offering two tasks: one on multilingual question answering and one on ontology lexicalization. While no submissi...
Today plenty of data is emerging from various city systems. Beyond the classical Web resources, large amounts of data are retrieved from sensors, devices, social networks, governmental applications, or service networks. In such a diversity of information, answering specific information needs of city inhabitants requires holistic IR techniques, capa...
The availability of large amounts of open, distributed and structured semantic data on the web has no precedent in the history of computer science. In recent years, there have been important advances in semantic search and question answering over RDF data. In particular, natural language interfaces to online semantic data have the advantage that th...
The dynamics of social events happening in large metropolitan areas are extremely complex. Location-based user-generated data could be an exceptionally rich source of information about events; however, the vastness and heterogeneity of such information make it almost impossible for city managers to have a comprehensive view. Some events are pl...
The demonstration and poster track is an opportunity for researchers and practitioners to present their innovative prototypes, practical developments, on-going projects, lessons learned and late-breaking results. This year we had a very exciting track with thirty-five poster and thirty-two demo submissions. All poster and demonstration papers were...
Governments and enterprises are interested in the return-on-investment for exposing their data. This brings forth the problem of making data consumable, with minimal effort. Beyond search techniques, there is a need for effective methods to identify heterogeneous datasets that are closely related, as part of data integration or exploration tasks. T...
This book constitutes the thoroughly refereed post-proceedings of the satellite events of the 10th International Conference on the Semantic Web, ESWC 2013, held in Montpellier, France, in May 2013. The volume contains 44 papers describing the posters and demonstrations, 10 best workshop papers selected from various submissions and four papers of the...
In this paper, we present QuerioCity, a platform to catalog, index and query highly heterogeneous information coming from complex systems, such as cities. A series of challenges are identified: namely, the heterogeneity of the domain and the lack of a common model, the volume of information and the number of data sets, the requirement for a low ent...
With the recent rapid growth of the Semantic Web (SW), the processes of searching and querying content that is both massive in scale and heterogeneous have become increasingly challenging. User-friendly interfaces, which can support end users in querying and exploring this novel and diverse, structured information space, are needed to make the visi...
This work investigates the process of selecting, extracting and reorganizing content from Semantic Web information sources, to produce an ontology meeting the specifications of a particular domain and/or task. The process is combined with traditional text-based ontology learning methods to achieve tolerance to knowledge incompleteness. The paper de...
With the continued growth of online semantic information, the processes of searching and managing this massive scale and heterogeneous content have become increasingly challenging. In this work, we present PowerAqua, an ontology-based Question Answering system that is able to answer queries by locating and integrating information, which can be dist...
With the continued growth of online semantic information, the processes of searching and managing this massive scale and heterogeneous content have become increasingly challenging. In this work, we present PowerAqua, an ontology-based Question Answering system that is able to answer queries by locating and integrating information, which can be mas-...
Linked Data semantic sources, in particular DBpedia, can be used to answer many user queries. PowerAqua is an open multi-ontology Question Answering (QA) system for the Semantic Web (SW). However, the emergence of Linked Data, characterized by its openness, heterogeneity and scale, introduces a new dimension to the Semantic Web scenario, in which e...
Evaluations of semantic search systems are generally small scale and ad hoc due to the lack of appropriate resources such as test collections, agreed performance criteria and independent judgements of performance. By analysing our work in building and evaluating semantic tools over the last five years, we conclude that the growth of the semantic we...
In this paper we propose algorithms for combining and ranking answers from distributed heterogeneous data sources in the context of a multi-ontology Question Answering task. Our proposal includes a merging algorithm that aggregates, combines and filters ontology-based search results and three different ranking algorithms that sort the final answers...
PowerAqua is a Question Answering system, which takes as input a natural language query and is able to return answers drawn from relevant semantic resources found anywhere on the Semantic Web. In this paper we provide two novel contributions: First, we detail a new component of the system, the Triple Similarity Service, which is able to match quer...
Thanks to the huge efforts deployed in the community for creating, building and generating semantic information for the Semantic Web, large amounts of machine processable knowledge are now openly available. Watson is an infrastructure component for the Semantic Web, a gateway that provides the necessary functions to support applications in using th...
Currently, techniques for content description and query processing in Information Retrieval (IR) are based on keywords, and therefore provide limited capabilities to capture the conceptualizations associated with user needs and contents. Aiming to solve the limitations of keyword-based models, the idea of conceptual search, understood as searching...
The construction of standard datasets and benchmarks to evaluate ontology-based search approaches and to compare them against baseline IR models is a major open problem in the semantic technologies community. In this paper we propose a novel evaluation benchmark for ontology-based IR models based on an adaptation of the well-known Cranfield paradig...
While semantic search technologies have been proven to work well in specific domains, they still have to confront two main challenges to scale up to the Web in its entirety. In this work we address this issue with a novel semantic search system that a) provides the user with the capability to query Semantic Web information using natural language, b...
Matching has been recognized as a plausible solution for the semantic heterogeneity problem in many traditional applications, such as schema integration, ontology integration, data warehouses, data integration, and so on. Recently, a line of new applications has emerged, characterized by their dynamics, such as peer-to-peer systems, agents, w...
Although research on integrating semantics with the Web started almost as soon as the Web was in place, a concrete Semantic Web, that is, a large-scale collection of distributed semantic metadata, emerged only over the past four to five years. The Semantic Web's embryonic nature is reflected in its existing applications. Most of these applications te...