Rafael Berlanga-Llavori

Universitat Jaume I | UJI · Department of Computer Languages and Systems

PhD

About

152
Publications
24,070
Reads
1,845
Citations
Citations since 2017
9 Research Items
428 Citations
[Chart: citations per year, 2017-2023]
Additional affiliations
October 1992 - present
Universitat Jaume I
Position
  • Full-time professor

Publications (152)
Article
Full-text available
The generation of electricity through renewable energy sources increases every day, with solar energy being one of the fastest-growing. The emergence of information technologies such as Digital Twins (DT) in the field of the Internet of Things and Industry 4.0 allows a substantial development in automatic diagnostic systems. The objective of this w...
Conference Paper
Full-text available
Social networking sites are one of the “megatrends” that have significantly impacted the whole tourism industry, and their wide use in travellers’ decision making encourages many to accomplish their dreams. With the spectacular growth in tourist movement across the world, countries have taken serious note of the safe and secure travel of tourists...
Article
Full-text available
Presents a new methodology based on Linked Open Data (LOD) infrastructures for performing analysis tasks on social networks. This methodology follows the typical phases of a Business Intelligence (BI) project, in which a set of data sources is used to obtain metrics and i...
Article
Light pollution and nature preservation are new trends in which European cities are involved as they evolve into Smart Cities. The Internet of Things is changing the way that sensors and management control systems are designed and implemented. In this article, our main objective is to present an Outdoor Light Desktop Central Management architectu...
Article
Text mining of scientific literature has been essential for setting up large public biomedical databases, which are being widely used by the research community. In the biomedical domain, the existence of a large number of terminological resources and knowledge bases (KB) has enabled a myriad of machine learning methods for different text mining rel...
Article
While the Linked Data (LD) initiative has given place to open, large amounts of semi-structured and rich data published on the Web, effective analytical tools that go beyond browsing and querying are still lacking. To address this issue, we propose the automatic generation of multidimensional (MD) analytical stars. The success of the MD model for d...
Article
In this work, we present a first approximation to the semantic annotation of Unified Medical Language System (UMLS®) concept descriptions based on the extraction of relevant linguistic features and their use in conditional random fields (CRF) to classify them into the different semantic groups provided by UMLS. Experiments have been carried out over th...
Article
Full-text available
A new methodology based on language models retrieves product features and opinions from a collection of free-text customer reviews about a product or service. The proposal relies on a language-modeling framework that can be applied to reviews in any domain and language provided with a minimal knowledge source of sentiments or opinions (that is, a m...
Article
Full-text available
In this paper, we introduce a new methodology for modeling product aspects from a collection of free-text customer reviews. The proposal relies on a language modeling framework and is domain independent. It combines both a kernel-based model of opinion words and a stochastic translation model between words to approach the aspect model of products....
Article
Full-text available
Abstract: Clustering methods have been widely used in many Information Processing tasks in order to capture unknown object categories. However, clustering has rarely been used as a method for sense tagging in Word Sense Disambiguation (WSD), that is, as a way of id...
Article
Full-text available
Abstract: In this article we present a framework for obtaining structurally complex condensed representations of document sets, which will serve as a basis for building summaries, answering complex questions, etc. This framework includes a method for extracting a ranked list of facts, triples...
Conference Paper
In this paper, we present a new methodology aimed at retrieving relevant product aspects from a collection of customer reviews, as well as the most salient sentiments expressed about them. Our proposal is both unsupervised and domain independent, and does not rely on NLP techniques such as parsing or dependence analysis. In our experiments, the p...
Conference Paper
Full-text available
The annotation of texts in natural language links some terms of the text to an external information source that gives us more detailed information about them. Most of the approaches made in this field get any text and annotate it by trying to find out the context of each term, as there are terms that have different meanings depending on the topic t...
Conference Paper
This paper proposes the application of multidimensional analysis over large semantically annotated biomedical corpora for the identification of relevant abstract relations between the recognized entities. The identification of relations is one of the most challenging issues in information extraction, as they guide the definition of the patterns use...
Conference Paper
Full-text available
Word sense disambiguation (WSD) is an intermediate task within information retrieval and information extraction, attempting to select the proper sense of ambiguous words. Due to the scarcity of training data, knowledge-based and knowledge-lean methods receive attention as disambiguation methods. Knowledge-based methods compare the context of the am...
Conference Paper
In this paper, we propose a methodology for obtaining a probabilistic ranking of product features from a customer review collection. Our approach mainly relies on an entailment model between opinion and feature words, and suggests that in a probabilistic opinion model of words learned from an opinion corpus, feature words must be the most probable w...
Article
Full-text available
We propose a novel approach to facilitate the concurrent development of ontologies by different groups of experts. Our approach adapts Concurrent Versioning, a successful paradigm in software development, to allow several developers to make changes concurrently to an ontology. Conflict detection and resolution are based on novel techniques that tak...
Article
Full-text available
Web opinion feeds have become one of the most popular information sources users consult before buying products or contracting services. Negative opinions about a product can have a high impact on its sales figures. As a consequence, companies are more and more concerned about how to integrate this information in their Business Intelligence (BI)...
Article
Full-text available
Research in the Life Sciences depends on the integration of large, distributed and heterogeneous data sources and web services. The discovery of which of these resources are the most appropriate to solve a given task is a complex research question, since there is a large number of plausible candidates and there is little, mostly unstructured, metad...
Conference Paper
Full-text available
The establishment of links between data (e.g., patient records) and Web resources (e.g., literature) and the proper visualization of such discovered knowledge is still a challenge in most Life Science domains (e.g., biomedicine). In this paper we present our contribution to the community in the form of an infrastructure to annotate information reso...
Article
Full-text available
This paper presents a method for semi-automatically building tailored application ontologies from a set of data acquisition forms. Such ontologies are intended to facilitate the integration of very heterogeneous data generation processes and their linkage to well-known external resources. The resulting tool is being applied to the medical domain, w...
Conference Paper
Current research in domains such as the Life Sciences depends heavily on the integration of information coming from diverse sources, which are typically highly complex and heterogeneous, and usually require exploratory access. Web services are increasingly used as the preferred method for accessing and processing these sources. Due to the large num...
Article
Ontologies are frequently used in information retrieval, their main applications being the expansion of queries, the semantic indexing of documents and the organization of search results. Ontologies provide lexical items, allow conceptual normalization and provide different types of relations. However, the optimization of an ontology to perform informat...
Conference Paper
In this demonstration we present XTaGe (XML Tester and Generator), a flexible tool for the creation of complex XML collections. XTaGe focuses on XML collections with complex structural constraints and domain-specific characteristics, which would be very difficult or impossible to replicate using existing XML generators. It addresses the limitations...
Conference Paper
Full-text available
The amount of ontologies and semantic annotations available on the Web is constantly increasing. This new type of complex and heterogeneous graph-structured data raises new challenges for the data mining community. In this paper, we present a novel method for mining association rules from semantic instance data repositories expressed in RDF/S and O...
Article
In this paper, we introduce a new clustering algorithm for discovering and describing the topics comprised in a text collection. Our proposal relies on both the most probable term pairs generated from the collection and the estimation of the topic homogeneity associated to these pairs. Topics and their descriptions are generated from those term pai...
Conference Paper
The Semantic Web (SW) deployment is now a realization and the amount of semantic annotations is ever increasing thanks to several initiatives that promote a change in the current Web towards the Web of Data, where the semantics of data become explicit through data representation formats and standards such as RDF/(S) and OWL. However, such initiativ...
Article
Full-text available
In this paper we present a preliminary logic-based evaluation of the integration of post-composed phenotypic descriptions with domain ontologies. The evaluation has been performed using a description logic reasoner together with scalable techniques: ontology modularization and approximations of the logical difference between ontologies. Comment: in...
Conference Paper
Full-text available
We propose a silver standard based on the UMLS Metathesaurus to align NCI, FMA and SNOMED CT. This silver standard aims at being exploited within the OAEI and SEALS Campaigns.
Article
Ontologies represent domain knowledge that improves user interaction and interoperability between applications. In addition, ontologies deliver precious input to text mining techniques in the biomedical domain, which might improve the performance in different text mining tasks. This chapter will explore the mutual benefits for ontologies and tex...
Article
Full-text available
Clustering methods have been extensively used in the solution of many Information Processing tasks in order to capture unknown object categories. This paper presents an approach to Word Sense Disambiguation based on clustering. The underlying idea is that the clustering of word senses provides a useful way to discover semantically related senses. W...
Article
Full-text available
Today's public database infrastructure spans a very large collection of heterogeneous biological data, opening new opportunities for molecular biology, biomedical and bioinformatics research, but also raising new problems for their integration and computational processing. In this paper we survey the most interesting and novel approaches for t...
Article
Full-text available
This paper is intended to explore how to use terminological resources for ontology engineering. Nowadays there are several biomedical ontologies describing overlapping domains, but there is not a clear correspondence between the concepts that are supposed to be equivalent or just similar. These resources are quite precious but their integration and...
Article
Ontological resources such as controlled vocabularies, taxonomies and ontologies from the OBO foundry are used to represent biomedical domain knowledge. The development of such resources is a time consuming task. Once they are finished they contribute to standardization of information representation, interoperability of IT solutions, literature ana...
Conference Paper
Full-text available
We introduce XTaGe (XML Tester and Generator), a system for the synthesis of XML collections meant for testing and micro-benchmarking applications. In contrast with existing approaches, XTaGe focuses on complex collections, by providing a highly extensible framework to introduce controlled variability in XML structures. In this paper we present the...
Conference Paper
Full-text available
We propose a general method and novel algorithmic techniques to facilitate the integration of independently developed ontologies using mappings. Our method and techniques aim at helping users understand and evaluate the semantic consequences of the integration, as well as to detect and fix potential errors. We also present ContentMap, a system th...
Article
This paper presents a relevance model to rank the facts of a data warehouse that are described in a set of documents retrieved with an information retrieval (IR) query. The model is based on language modeling and relevance modeling techniques. We estimate the relevance of the facts by the probability of finding their dimension values and the query...
Chapter
Full-text available
There is a proliferation of research and industrial organizations that produce sources of huge amounts of biological data issuing from experimentation with biological systems. In order to make these heterogeneous data sources easy to use, several efforts at data integration are currently being undertaken based mainly on XML. Starting from a discuss...
Article
Full-text available
Ontologies are frequently used in information retrieval, their main applications being the expansion of queries, the semantic indexing of documents and the organization of search results. However, the optimization of an ontology to perform information retrieval tasks is still unclear. In this paper, we propose an ontology query model to analyze the usef...
Article
Full-text available
The Semantic Web enables organizations to attach semantic annotations taken from domain and application ontologies to the information they generate. The concepts in these ontologies could describe the facts, dimensions and categories implied in the analysis subjects of a data warehouse. In this paper we propose the Semantic Data Warehouse to be a r...
Conference Paper
Full-text available
Nowadays very large domain knowledge resources are being developed in domains like Biomedicine. Users and applications can benefit enormously from these repositories in very different tasks, such as visualization, vocabulary homogenizing and classification. However, due to their large size and lack of formal semantics, they cannot be properly manag...
Conference Paper
Full-text available
Motivation: OWL ontologies are already being used in many application domains. In particular, OWL is extensively used in the clinical sciences; prominent examples of OWL ontologies are the National Cancer Institute (NCI) Thesaurus, SNOMED CT, the Gene Ontology (GO), the Foundational Model of Anatomy (FMA), and GALEN. These ontologies are large an...
Conference Paper
Full-text available
We present ContentCVS, a system that implements a novel approach to facilitate the collaborative development of ontologies. Our approach adapts Concurrent Versioning, a successful paradigm in collaborative software development, to allow several developers to make changes concurrently to an ontology. Conflict detection and resolution are based on no...
Conference Paper
Full-text available
The UMLS Metathesaurus (UMLS-Meta) is currently the most comprehensive effort for integrating independently-developed medical thesauri and ontologies. The techniques used in the construction of UMLS-Meta are mostly based on lexical matching and often disregard the semantics of the sources being integrated. In this paper we aim at developing logic-b...
Conference Paper
In this paper we present a novel approach for identifying and describing the possible subtopics that can be derived from the result set of a topic-based query. Subtopic descriptions rely on the conceptual indexing of the retrieved documents, which consists of mapping the document terms into concepts of an existing thesaurus (i.e. UMLS meta-thesauru...
Conference Paper
In this paper, we introduce a new clustering algorithm for obtaining labeled document clusters that accurately identify the topics of a text collection. In order to determine the topics, our approach relies on both probable term pairs generated from the collection and the estimation of the topic homogeneity associated to term pair clusters. Experim...
Conference Paper
Many XML-based information systems that must handle highly heterogeneous information require multiple similarity measures. Until now, little guidance exists for the design of application-dependent measures in such systems. This paper contributes a four-step methodology that guides the development of multi-similarity systems, and shows its usefulnes...
Conference Paper
The concept of heterogeneity is very important in XML data management, since many common applications must deal with large and complex collections which do not conform to a schema. Heterogeneity in XML collections can be present at many different levels (textual and structural) and needs to be addressed from several perspectives. This paper contrib...
Conference Paper
Full-text available
The integration of heterogeneous biomedical information is one important step towards providing the level of personalization required in the next generation of healthcare provision. In order to provide the computer-based decision support systems needed to access this integrated healthcare information it will be necessary to handle the semantics of...
Article
Full-text available
Background: In recent years, the recognition of semantic types from the biomedical scientific literature has been focused on named entities like protein and gene names (PGNs) and gene ontology terms (GO terms). Other semantic types like diseases have not received the same level of attention. Different solutions have been proposed to identify disease...
Article
Current data warehouse and OLAP technologies are applied to analyze the structured data that companies store in databases. The context that helps to understand data over time is usually described separately in text-rich documents. This paper proposes to integrate the traditional corporate data warehouse with a document warehouse, resulting in a con...
Article
Due to the heterogeneous nature of XML data for internet applications, exact matching of queries is often inadequate. The need arises to quickly identify subtrees of XML documents in a collection that are similar to a given pattern. Similarity involves both tags, which are not required to coincide, and structure, in which not all the relationships am...
Conference Paper
Full-text available
Driven by application requirements and using well-understood theoretical results, we describe a novel methodology and a tool for modular ontology design. We support the user in the safe use of imported symbols and in the economic import of the relevant part of the imported ontology. Both features are supported in a well-understood way: safety guara...
Conference Paper
In this demonstration we will show a series of tools that support a methodology [1] for the design of complex similarity functions in the context of heterogeneous XML systems.
Article
This work-in-progress paper describes ArHeX, a similarity-oriented XML processing toolkit [9]. The distinguishing features of ArHeX are: (i) its ability to support collections which are heterogeneous at multiple levels of granularity, (ii) its flexible pattern-based query model, and (iii) its component-based architecture. These features allow ArHeX to...
Article
Full-text available
This paper surveys the most relevant research on combining Data Warehouse (DW) and Web data. It studies the XML technologies that are currently being used to integrate, store, query, and retrieve Web data and their application to DWs. The paper reviews different DW distributed architectures and the use of XML languages as an integration tool in the...
Conference Paper
Full-text available
As more semantic web services become available on the Internet, it is feasible for users to collaborate to save effort in complex web solutions by sharing and reusing existing semantic web services, rather than building them from scratch. In this paper we focus on the problem of discovering and reusing semantic web services at a high level of abstr...
Conference Paper
A major difficulty of text categorization problems is the high dimensionality of the feature space. Thus, feature selection is often performed in order to increase both the efficiency and effectiveness of the classification. In this paper, we propose a feature selection method based on Testor Theory. This criterion takes into account inter-feature...
Conference Paper
This paper addresses the characterization of a large text collection by introducing a method for retrieving sets of relevant WordNet concepts as descriptors of the collection contents. The method combines models for identifying interesting word co-occurrences with an extension of a word sense disambiguation algorithm in order to retrieve the concep...