About
85 Publications
94,262 Reads
403 Citations
Publications (85)
Food is an important factor when choosing a tourist destination, and culinary images are a fundamental tool in gastronomic marketing. This paper presents an approach to analyse the use of color in food images based on a dataset of more than 22000 recipes coming from a popular recipe website, including images of the final dishes and scores that in...
This paper presents an approach for analysing gastronomic images and also their related comments published by the Getcookingcanada Instagram account, which belongs to a cooking school. Our approach processes the published images to calculate the moods that the image can generate depending on its colour palette, and also analyses the comments relate...
The image of a tourist destination can be influenced by any of a multitude of aspects, one of the most important being gastronomy, within which it is well known that colour plays a major role. To this end, an exploratory study into how the perception of food colour affects the mood of Latin American and Spanish people is presented. A survey i...
A person’s preference to select or reject certain meals is influenced by several aspects, including colour. In this paper, we study the relevance of food colour for such preferences. To this end, a set of images of meals is processed by an automatic method that associates mood adjectives that capture such meal preferences. These adjectives are obta...
This paper presents an approach for analysing food-porn images and their related comments published by the Instagram account of the cooking school Getcookingcanada. Our approach processes the published images to extract colour parameters, counts the number of likes, and also analyses the comments related to each publication. A dataset containing all these...
The business environment today is characterized by high competition and saturated markets, and Pay-TV platforms are no exception. Because of that, the cost of acquiring new customers is much higher than the cost of retaining existing ones. Therefore, it is important for Pay-TV platforms to keep customer churn under control. Therefore,...
This paper presents a cognitively inspired qualitative theory, QCharm, which defines five operators for colour combination based on the qualitative colour descriptor (QCD) and applies these operators to recommend palettes of harmonic colours. Machine learning techniques have been applied to learn the QCD colour coordinates in Kobayashi’s colour...
In the European Higher Education Area (EHEA) the coordination of subjects presents a challenge and a key factor for students’ learning and competence development. The joint planning of subjects about fundamentals of software engineering and design and implementation of information systems in computer science higher studies provides the students wit...
The QArt-Learn approach for style painting categorization based on Qualitative Color Descriptors (QCD), color similarity (SimQCD), and quantitative global features (i.e. average of brightness, hue, saturation and lightness and brightness contrast) is presented in this paper. k-Nearest Neighbor (k-NN) and support vector machine (SVM) techniques have...
The tremendous popularity of web-based social media is attracting the attention of industry, which seeks to profit from the massive availability of sentiment data, considered of high value for Business Intelligence (BI). So far, BI has been mainly concerned with corporate data, with little or no attention to the external world. However, for...
Colour naming consists of successfully finding the correspondence between colours as named by humans and the colour coordinates used by machine displays. Its successful implementation is crucial for human-machine interaction tasks, e.g. for the communication between service robots and humans. However, significant variability among human groups make...
An approach for a query-by-sketch system on qualitative shape information for image retrieval in databases is proposed and evaluated. The use of qualitative methods for shape description allows the gathering of semantic information from the sketches. The qualitative description and recognition of sketches are evaluated in order to verify that it is...
The problem of colour naming consists of successfully recognizing, categorising and labeling colours. This paper presents a colour naming theory based on a Qualitative Colour Description (QCD) model, which is validated here by an experiment carried out by real users in order to determine whether the QCD model is close enough to common human colour...
In this paper, qualitative colours are named by defining intervals on reference systems built on the Hue, Saturation and Lightness (HSL) colour space. The new model for qualitative colour description (QCD) distinguishes rainbow colours, pale, light, dark colours and colours in the grey scale. For comparing the qualitative colours described, a similar...
Background: Open metadata registries are a fundamental tool for researchers in the Life Sciences trying to locate resources. While most current registries assume that resources are annotated with well-structured metadata, evidence shows that most of the resource annotations simply consist of informal free text. This reality must be taken into accou...
An approach for a query-by-sketch system based on qualitative shape information for image retrieval in databases is proposed and evaluated. The use of qualitative methods for shape description allows the gathering of semantic information from the sketches. Therefore, in this paper the qualitative description and recognition of sketches is evaluated...
Research in the Life Sciences depends on the integration of large, distributed and heterogeneous web resources (e.g., data sources and web services). The discovery of which of these resources are the most appropriate to solve a given task is a complex research question, since there are many candidate resources and there is little, mostly unstructur...
Open metadata registries are a fundamental tool for researchers in the Life Sciences trying to locate resources such as web services or databases. While sophisticated standards have been produced for annotating these resources with rich, well-structured metadata, evidence shows that in open registries a majority of annotations simply consist of in...
An approach is presented that aims to produce narrative descriptions of objects within digital images which are understandable by human beings. A context-free grammar is defined based on qualitative models for colour and shape description. This approach was tested using images of the MPEG-7 CE Shape-1 library and images of tile geometric pieces. The re...
Research in the Life Sciences largely depends on the integration of large, distributed and heterogeneous data sources and web services. Due to the large number of available web services, the sheer complexity of the data and the frequent lack of documentation, discovering the most appropriate web service for a given task is a challenge for the user....
The application of a qualitative shape and qualitative colour description and similarity calculus to an Image Query By Example problem is presented in this paper. Specifically, the qualitative shape and colour similarity theories are applied to the problem of matching icon images. The suitability of this approach as a foundation for a practical que...
Research in the Life Sciences depends on the integration of large, distributed and heterogeneous data sources and web services. The discovery of which of these resources are the most appropriate to solve a given task is a complex research question, since there is a large number of plausible candidates and there is little, mostly unstructured, metad...
This paper presents a method for semi-automatically building tailored application ontologies from a set of data acquisition forms. Such ontologies are intended to facilitate the integration of very heterogeneous data generation processes and their linkage to well-known external resources. The resulting tool is being applied to the medical domain, w...
Current research in domains such as the Life Sciences depends heavily on the integration of information coming from diverse sources, which are typically highly complex and heterogeneous, and usually require exploratory access. Web services are increasingly used as the preferred method for accessing and processing these sources. Due to the large num...
In this demonstration we present XTaGe (XML Tester and Generator), a flexible tool for the creation of complex XML collections. XTaGe focuses on XML collections with complex structural constraints and domain-specific characteristics, which would be very difficult or impossible to replicate using existing XML generators. It addresses the limitations...
Technical Engineering in Management Informatics (2001 Curriculum). IG18: Databases
Many important applications in scientific fields such as Bioinformatics depend on the management of large collections of heterogeneous XML documents, containing complex domain-specific data. To allow advanced querying, these systems need to support techniques such as data exploration and approximate query processing by using multiple notions of sim...
Today's public database infrastructure spans a very large collection of heterogeneous biological data, opening new opportunities for molecular biology, bio-medical and bioinformatics research, but also raising new problems for their integration and computational processing. In this paper we survey the most interesting and novel approaches for t...
We introduce XTaGe (XML Tester and Generator), a system for the synthesis of XML collections meant for testing and micro-benchmarking applications. In contrast with existing approaches, XTaGe focuses on complex collections, by providing a highly extensible framework to introduce controlled variability in XML structures. In this paper we present the...
There is a proliferation of research and industrial organizations that produce sources of huge amounts of biological data issuing from experimentation with biological systems. In order to make these heterogeneous data sources easy to use, several efforts at data integration are currently being undertaken based mainly on XML. Starting from a discuss...
In this demonstration we will show a series of tools that support a methodology [1] for the design of complex similarity functions in the context of heterogeneous XML systems.
Many XML-based information systems that must handle highly heterogeneous information require multiple similarity measures. Until now, little guidance exists for the design of application-dependent measures in such systems. This paper contributes a four-step methodology that guides the development of multi-similarity systems, and shows its usefulnes...
The concept of heterogeneity is very important in XML data management, since many common applications must deal with large and complex collections which do not conform to a schema. Heterogeneity in XML collections can be present at many different levels (textual and structural) and needs to be addressed from several perspectives. This paper contrib...
Due to the heterogeneous nature of XML data for internet applications, exact matching of queries is often inadequate. The need arises to quickly identify subtrees of XML documents in a collection that are similar to a given pattern. Similarity involves both tags, which are not required to coincide, and structure, in which not all the relationships am...
This work-in-progress paper describes the ArHeX similarity-oriented XML processing toolkit [9]. The distinguishing features of ArHeX are: (i) its ability to support collections which are heterogeneous at multiple levels of granularity, (ii) its flexible pattern-based query model, and (iii) its component-based architecture. These features allow ArHeX to...
In this work we introduce a novel retrieval language, named OntoPath, for specifying and retrieving relevant ontology fragments. This language is intended to extract customized self-standing ontologies from very large, general-purpose ones. Through OntoPath, users can specify the desired detail level in the concept taxonomies as well as the propert...
This work-in-progress paper describes the features of the ArHeX similarity-oriented XML processing toolkit. ArHeX is designed to assist in the engineering of XML similarity-oriented applications, supporting the design and evaluation of suitable similarity measures and their associated indexes for each specific application.
The Health-e-Child project aims to develop an integrated healthcare platform for European paediatrics. In order to achieve a comprehensive view of children’s health, a complex integration of biomedical data, information, and knowledge is necessary. Ontologies will be used to formally define this domain knowledge and will form the basis for the medi...
Highly heterogeneous XML collections are thematic collections exploiting different structures: the parent-child or ancestor-descendant relationships are not preserved and vocabulary discrepancies in the element names can occur. In this setting current approaches return answers with low precision. By means of similarity measures and semantic i...
Handling the heterogeneity of structure and/or content of XML documents for the retrieval of information is a fertile field of research nowadays. Many efforts are currently devoted to identifying approximate answers to queries that require relaxation on conditions both on the structure and the content of XML documents [1,2,4,5]. Results are ranked...
The large amount and heterogeneity of XML documents on the Web require the development of clustering techniques to group together similar documents. Documents can be grouped together according to their content, their structure, and links inside and among documents. For instance, grouping together documents with similar structures has interesting ap...
In this paper we present and evaluate two approaches for the generation of Semantic Fields, which are used as a tool for resource discovery in the Semantic Web. We mainly concern ourselves with semantic networks that describe their interests and resources by means of ontologies. Semantic Fields are intended to help users to locate these resources b...
Due to the heterogeneous nature of XML data for internet applications, exact matching of queries is often inadequate. The need arises to quickly identify subtrees of XML documents in a collection that are similar to a given pattern. In this paper we discuss different similarity measures between a pattern and subtrees of documents in the collection....
In this paper we introduce a new view definition language, named OntoPathView, which combines the simplicity of XPath with the high expressiveness of RDF/S based view languages. This language will allow us to work in a collaborative environment for the development of complex ontologies.
An ontology is a conceptual representation of a domain resulting from a consensus within a community. One of its main applications is the integration of heterogeneous information sources available in the Web, by means of the semantic annotation of web documents. This is the cornerstone of the emerging Semantic Web. However, nowadays most of the info...
This work addresses the automatic generation of conceptual models for XML-oriented databases, which in many cases have little or no support for schemata. Our techniques are based on both an incremental clustering algorithm, which groups together the incoming XML documents according to their structural similarities, and a schema inference method, wh...
We describe the distributed, object-oriented architecture of TREVI, a system designed to help overcome the information overload problem. TREVI provides a comprehensive framework for the processing and dissemination of documents coming from news streams and other external sources. TREVI is designed to perform a variety of linguistic and text process...
After our previous works on modelling a database of newspapers and designing a specially suited retrieval language, we are now developing an application to automatically acquire, summarize and store newspaper documents published in distinct web resources. This paper describes the current implementation of the acquisition process which includes the...
Abstract: In this paper we show how publishing digital newspapers in XML (eXtensible Markup Language) improves the development of tools for storing, locating and searching news information on the Web. Currently, digital newspapers are merely a browsable version of the print edition, without...
Building digital libraries from Internet-accessible document repositories is a challenging task, due to the current mismatch between the desired DBMS-like capabilities of the former and the schemaless HTML files stored in web sites. In order to address this problem, we propose a distributed architecture for the extraction of metadata from WWW docum...
In this paper we examine the problem of extracting schema-conforming metadata from HTML sources. A technique founded on semistructured data analysis is explained. It is based on the combination of HTML styles, which abstract the visual characteristics of documents, and document-oriented context-free grammars, which provide structural information. This...
Current research in domains such as Bioinformatics depends heavily on the integration and processing of information coming from diverse sources, which are typically highly complex and heterogeneous, and require approximate queries based on a ranking method. Processing these sources requires sophisticated query processing techniques, which are based...