Bruno Martins

Instituto Técnico y Cultural, Santa Clara de Portugal, Michoacán, Mexico

Publications (83) · 4.93 Total Impact

  • David S Batista · Bruno Martins · Mário J Silva
    ABSTRACT: Semi-supervised bootstrapping techniques for relationship extraction from text iteratively expand a set of initial seed relationships while limiting semantic drift. We investigate bootstrapping for relationship extraction using word embeddings to find similar relationships. Experimental results show that relying on word embeddings achieves better performance than a baseline using TF-IDF to find similar relationships, on the task of extracting four types of relationships from a collection of newswire documents.
    Full-text · Conference Paper · Sep 2015
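
    A minimal sketch of the similarity computation at the core of this idea: represent the words between two entities by the average of their word embeddings and compare contexts by cosine similarity. The tiny embedding table and example phrases are invented for illustration; the paper's actual bootstrapping loop and drift control are not reproduced here.

      import numpy as np

      # Invented toy embeddings; in practice these come from a model such as word2vec.
      embeddings = {
          "acquired": np.array([0.8, 0.1, 0.3]),
          "bought":   np.array([0.7, 0.2, 0.3]),
          "located":  np.array([0.1, 0.9, 0.2]),
      }

      def phrase_vector(tokens):
          # Average the embeddings of the words between the two entities.
          vecs = [embeddings[t] for t in tokens if t in embeddings]
          return np.mean(vecs, axis=0) if vecs else np.zeros(3)

      def cosine(a, b):
          return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

      seed = phrase_vector(["acquired"])        # context of a seed relationship
      candidate = phrase_vector(["bought"])     # context of a candidate extraction
      print(cosine(seed, candidate))            # high score: "bought" is near "acquired"
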
  • André Leal · Bruno Martins · Francisco M Couto
    ABSTRACT: This paper describes a system developed for the disorder identification subtask within task 14 of SemEval 2015. The system is based on a chain of two modules, one for recognition and another for normalization. The recognition module is based on an adapted version of the Stanford NER system, used to train CRF models that recognize disorder mentions. The CRF models were built on a novel encoding of entity spans as token classifications, which also covers non-continuous entities, along with a rich set of features based on (i) domain lexicons and (ii) Brown clusters inferred from a large collection of clinical texts. For disorder normalization, we (i) generated an unambiguous dictionary of abbreviations from the labelled files, using it together with (ii) a heuristic method based on similarity search and (iii) a comparison method based on the information content of each disorder. The system achieved an F-measure of 0.740 (the second best), with a precision of 0.779 and a recall of 0.705.
    Full-text · Conference Paper · Jul 2015
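
    A rough sketch of the two-step normalization described above: expand abbreviations through the generated dictionary, then fall back to similarity search over a disorder dictionary. The toy dictionaries, the example CUIs, and the use of difflib as the similarity search are illustrative stand-ins for the paper's actual resources and methods.

      import difflib

      abbreviations = {"mi": "myocardial infarction"}          # from labelled files
      disorder_cuis = {"myocardial infarction": "C0027051",    # toy dictionary
                       "migraine": "C0149931"}

      def normalize(mention):
          text = abbreviations.get(mention.lower(), mention.lower())
          if text in disorder_cuis:                            # exact dictionary hit
              return disorder_cuis[text]
          close = difflib.get_close_matches(text, disorder_cuis, n=1, cutoff=0.8)
          return disorder_cuis[close[0]] if close else None    # similarity search

      print(normalize("MI"))                   # C0027051, via abbreviation expansion
      print(normalize("myocardial infarct"))   # C0027051, via fuzzy matching
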
  • Tiago Gonçalves · Ana Paula Afonso · Bruno Martins
    ABSTRACT: With the prevalence of mobile computing systems and location based services, large amounts of spatio-temporal data are nowadays being collected, representing the mobility of people performing various activities. However, despite the increasing interest in the exploration of these data, there are still open challenges in various application contexts, e.g. related to visualisation and human–computer interaction. In order to support the extraction of useful and relevant information from the spatio-temporal and thematic properties associated with human trajectories, it is crucial to develop and study adequate interactive visualisation techniques. In addition to the properties of the visualisations themselves, it is important to take into consideration the types of information present within the data and, more importantly, the types of tasks that a user might need to carry out in order to achieve a given goal. Understanding these factors may, in turn, simplify the development and assessment of a given interactive visualisation. In this paper, we present and analyse the most relevant concepts associated with these topics. In particular, our analysis addresses the main properties of (human) trajectory data, the main types of visualisation tasks/objectives that users may require in order to analyse those data, and the high-level classes of techniques for visualising trajectory data. In addition, this paper presents an overview of a user study, conducted on the basis of this analysis, that compares two classes of visualisation techniques, namely static maps and space-time cubes, regarding their adequacy in helping users complete basic visualisation tasks.
    No preview · Article · Apr 2015 · Journal of Location Based Services
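
    To make the space-time cube concrete, a minimal matplotlib sketch that plots a trajectory with the two spatial dimensions on the horizontal axes and time on the vertical axis; the coordinates are invented.

      import matplotlib.pyplot as plt
      from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (registers the 3d projection)

      # Invented trajectory: (x, y) positions at successive time steps.
      xs = [0.0, 1.0, 2.0, 2.5, 3.0]
      ys = [0.0, 0.5, 0.5, 1.5, 2.0]
      ts = [0, 1, 2, 3, 4]

      fig = plt.figure()
      ax = fig.add_subplot(projection="3d")    # the "cube": x/y are space, z is time
      ax.plot(xs, ys, ts, marker="o")
      ax.set_xlabel("x"); ax.set_ylabel("y"); ax.set_zlabel("time")
      plt.show()
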
  • João Tiago Luís Santos · Ivo Miguel Anastácio · Bruno Emanuel Martins
    ABSTRACT: This article addresses the problem of disambiguating named entities, in text documents, towards entries in a knowledge base like Wikipedia. The proposed approach uses supervised learning to rank the candidate knowledge base entries for each entity mentioned in a text, and then to classify the top-ranked entry as either the correct disambiguation or not. We present results with Portuguese and Spanish texts for a wide range of models and configuration options. Our experiments attest to the effectiveness of supervised learning methods in this specific task, showing that out-of-the-box algorithms and relatively simple features can achieve a high accuracy.
    No preview · Article · Mar 2015 · IEEE Latin America Transactions
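
    A schematic sketch of the two-stage decision described in the abstract: score each candidate entry, rank them, and accept the top-ranked one only if the classifier is confident enough (otherwise return NIL). The three-value feature vectors and the use of logistic regression are placeholders, not the article's actual feature set or model.

      from sklearn.linear_model import LogisticRegression

      # Placeholder training data: one feature vector per (mention, candidate)
      # pair, e.g. name similarity, candidate popularity, context overlap.
      X_train = [[0.9, 0.8, 0.7], [0.2, 0.1, 0.3], [0.8, 0.9, 0.6], [0.1, 0.4, 0.2]]
      y_train = [1, 0, 1, 0]  # 1 = correct disambiguation

      ranker = LogisticRegression().fit(X_train, y_train)

      def disambiguate(candidate_features, accept_threshold=0.5):
          scores = ranker.predict_proba(candidate_features)[:, 1]  # rank candidates
          best = int(scores.argmax())
          # Classify the top-ranked entry as correct or not (NIL detection).
          return best if scores[best] >= accept_threshold else None

      print(disambiguate([[0.85, 0.7, 0.6], [0.3, 0.2, 0.1]]))  # accepted index or None
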
  • André Leal · Bruno Martins · Francisco M Couto
    ABSTRACT: Clinical notes in the form of textual content occur frequently in Electronic Health Records (EHRs). They are mainly used to describe treatment plans, symptoms, diagnostics, etc. Clinical notes are recorded in narrative language without any structured form and, since each medical professional uses different terminologies according to context and to their specialization, these notes are very challenging to process due to their complexity, heterogeneity and dependence on context. Forcing medical professionals to enter the information in a predefined structure simplifies interpretation; however, the imposition of such a rigid structure not only increases the time needed to record data, but also raises heavy barriers to recording unusual cases. One possible solution consists in applying text-mining techniques to the clinical texts, in order to support the recognition and normalization of medical concepts. Together, these techniques can enable correct and efficient information gathering by information systems. We developed a system which first recognizes medical concepts in clinical notes and then normalizes them to a UMLS concept unique identifier (CUI). The system was developed with the intention of overcoming some challenges presented by this task, such as the recognition of non-continuous entities and the normalization of ambiguous entities. For the recognition we use the novel SBIEON encoding, which contains a tag for words inside recognized entity spans that are not part of the entity. We also explore non-annotated clinical notes to generate lower-dimensional representations of the word vocabulary, and thereby reduce data sparsity. CRF models were generated based on the mentioned features among others, such as domain-specific lexicons and token shape,
    Full-text · Conference Paper · Jan 2015
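
    To make the encoding concrete, a small sketch of how a non-continuous mention could be labelled token by token. The tag inventory shown (S=single, B=begin, I=inside, E=end, O=outside, N=inside-the-span-but-not-part-of-the-entity) is our reading of the SBIEON acronym, not a specification taken from the paper.

      # "left atrium ... dilated": a discontinuous disorder mention whose span
      # contains tokens ("is", "mildly") that are not part of the entity.
      tokens = ["The", "left", "atrium", "is", "mildly", "dilated", "."]
      labels = ["O",   "B",    "I",      "N",  "N",      "E",       "O"]

      def decode(tokens, labels):
          # Keep only B/I/E/S tokens, skipping the N tokens inside the span.
          return [t for t, l in zip(tokens, labels) if l in ("B", "I", "E", "S")]

      print(" ".join(decode(tokens, labels)))  # left atrium dilated
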
  • ABSTRACT: This paper describes our participation in Task 7 of SemEval 2014, which focused on the recognition and disambiguation of medical concepts. We used an adapted version of the Stanford NER system to train CRF models to recognize textual spans denoting diseases and disorders within clinical notes. We considered an encoding that accounts for non-continuous entities, together with a rich set of features (i) based on domain-specific lexicons like SNOMED CT, or (ii) leveraging Brown clusters inferred from a large collection of clinical texts. Together with this recognition mechanism, we used a heuristic similarity search method to assign an unambiguous identifier to each concept recognized in the text. Our best run on Task A (i.e., the recognition of medical concepts in the text) achieved an F-measure of 0.705 in the strict evaluation mode, and a promising F-measure of 0.862 in the relaxed mode, with a precision of 0.914. For Task B (i.e., the disambiguation of the recognized concepts), we achieved less promising results, with an accuracy of 0.405 in the strict mode, and of 0.615 in the relaxed mode.
    Full-text · Conference Paper · Sep 2014
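
    A small sketch of the usual way Brown clusters enter a CRF feature set: each token contributes prefixes of its cluster bit string, so the model can generalize at several levels of granularity. The bit strings below are invented; real ones come from running Brown clustering over a clinical corpus.

      # Invented Brown cluster paths for a few tokens.
      brown = {"fever": "110100", "cough": "110101", "aspirin": "0111"}

      def brown_features(token, prefixes=(2, 4, 6)):
          path = brown.get(token.lower())
          if path is None:
              return {}
          # One feature per prefix length, e.g. brown_p4 = "1101".
          return {f"brown_p{p}": path[:p] for p in prefixes if len(path) >= p}

      print(brown_features("fever"))  # {'brown_p2': '11', 'brown_p4': '1101', 'brown_p6': '110100'}
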
  • Carolina Bento · Daniel Gonçalves · Bruno Martins

    No preview · Conference Paper · Jul 2014
  • João Santos · Ivo Anastácio · Bruno Martins
    ABSTRACT: This paper presents a machine learning method for disambiguating place references in text. Solving this task can have important applications in the digital humanities and computational social sciences, by supporting the geospatial analysis of large document collections. We combine multiple features that capture the similarity between the candidate disambiguations, the place references, and the context where the place references occur, in order to rank and choose from a set of candidate disambiguations obtained from a knowledge base containing geospatial coordinates and textual descriptions for different places from all around the world. The proposed method was evaluated on English corpora used in previous work in this area, and also on a subset of the English Wikipedia. Experimental results demonstrate that the proposed method is indeed effective, showing that out-of-the-box learning algorithms and relatively simple features can obtain a high accuracy in this task.
    No preview · Article · Jun 2014 · GeoJournal
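
    An illustrative sketch of two of the feature families mentioned above: string similarity between the place reference and a candidate's name, and a prominence signal from the knowledge base. The candidate records are invented, and this is only a fragment of the feature extraction, not the paper's full method.

      import difflib
      import math

      # Invented gazetteer candidates for the reference "Paris".
      candidates = [
          {"name": "Paris", "lat": 48.85, "lon": 2.35,   "population": 2_100_000},
          {"name": "Paris", "lat": 33.66, "lon": -95.55, "population": 25_000},
      ]

      def features(reference, cand):
          return {
              "name_sim": difflib.SequenceMatcher(None, reference.lower(),
                                                  cand["name"].lower()).ratio(),
              # Log-scaled population: larger places are likelier referents a priori.
              "log_pop": math.log10(cand["population"] + 1),
          }

      for cand in candidates:
          print(features("Paris", cand))
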
  • Catarina Moreira · Pável Calado · Bruno Martins
    ABSTRACT: Expert finding is an information retrieval task that is concerned with the search for the most knowledgeable people with respect to a specific topic, based on documents that describe people's activities. The task involves taking a user query as input and returning a list of people sorted by their level of expertise with respect to that query. Despite recent interest in the area, the current state-of-the-art techniques lack principled approaches for optimally combining different sources of evidence. This article proposes two frameworks for combining multiple estimators of expertise, derived from the textual contents, from the graph structure of the citation patterns for the community of experts, and from profile information about the experts. More specifically, this article explores the use of supervised learning to rank methods, as well as rank aggregation approaches, for combining all of the estimators of expertise. Several supervised learning algorithms, representative of the pointwise, pairwise and listwise approaches, were tested, and various state-of-the-art data fusion techniques were also explored for the rank aggregation framework. Experiments performed on a dataset of academic publications from the Computer Science domain attest to the adequacy of the proposed approaches.
    Full-text · Article · Nov 2013 · Expert Systems
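
    A toy sketch of the pointwise flavour of the framework: fit a regression model on per-(query, expert) feature vectors that combine the individual expertise estimators, then sort experts by predicted relevance. The features, grades, and choice of gradient boosting are invented for illustration; the article also covers pairwise, listwise, and rank aggregation variants.

      from sklearn.ensemble import GradientBoostingRegressor

      # Invented features per (query, expert) pair: a text-based score, a
      # citation-graph score, and a profile score; targets are relevance grades.
      X_train = [[0.9, 0.7, 0.5], [0.4, 0.2, 0.6], [0.8, 0.9, 0.4], [0.1, 0.1, 0.2]]
      y_train = [2.0, 0.0, 2.0, 0.0]

      model = GradientBoostingRegressor().fit(X_train, y_train)

      def rank_experts(experts, feature_rows):
          scores = model.predict(feature_rows)
          return sorted(zip(experts, scores), key=lambda p: -p[1])

      print(rank_experts(["alice", "bob"], [[0.85, 0.8, 0.5], [0.2, 0.1, 0.3]]))
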
  • David S Batista · Rui Silva · Bruno Martins · Mário J Silva
    ABSTRACT: Relationship extraction concerns the detection and classification of semantic relationships between entities mentioned in a collection of textual documents. This paper proposes a simple online approach for the automated extraction of semantic relations, based on the idea of nearest neighbor classification, and leveraging a minwise hashing method for measuring similarity between relationship instances. Experiments with three different datasets that are commonly used for benchmarking relationship extraction methods show promising results, both in terms of classification performance and scalability.
    Full-text · Conference Paper · Oct 2013
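
    A compact sketch of the min-hash idea: represent each relationship instance by the shingles of its context, take the minimum hash under k random hash functions, and estimate Jaccard similarity by the fraction of agreeing signature positions. The shingle size, k, and the CRC32-plus-mask hashing are our own choices for illustration.

      import random
      import zlib

      random.seed(0)
      K = 64
      masks = [random.getrandbits(32) for _ in range(K)]  # k simulated hash functions

      def shingles(text, n=3):
          toks = text.split()
          return {" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)} or {text}

      def signature(text):
          hs = [zlib.crc32(s.encode()) for s in shingles(text)]
          return [min(h ^ m for h in hs) for m in masks]

      def minhash_sim(a, b):
          # Fraction of agreeing positions estimates the Jaccard similarity.
          return sum(x == y for x, y in zip(signature(a), signature(b))) / K

      print(minhash_sim("X , based in Y , said", "X , based in Y"))  # roughly 0.6

    Nearest neighbor classification then amounts to assigning a new instance the relationship type of its most similar stored instances.
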
  • Tiago Gonçalves · Ana Paula Afonso · Bruno Martins
    ABSTRACT: With the prevalence of mobile computing systems and location based services, the research interest in spatio-temporal data has significantly increased, as evidenced by the collection of huge amounts of movement data. Consequently, this type of data raises several issues, namely in the research area of geographic information visualization. Despite the existence of several visual analysis techniques for the exploration of movement data, it is still unclear how usable and useful these techniques are, how they can be improved, and for which situations they are most suitable. In this paper, we present current open challenges in the visual analysis of movement data, and the Ph.D. work in progress aiming to address these problems. Our work will explore several factors that may affect the users' performance and, based on those factors, we will propose a taxonomy and an evaluation framework covering different tasks and techniques.
    No preview · Conference Paper · Jun 2013
  • Catarina Moreira · Bruno Martins · Pável Calado
    ABSTRACT: The task of expert finding has been getting increasing attention in the information retrieval literature. However, the current state-of-the-art still lacks principled approaches for combining different sources of evidence. This paper explores the usage of unsupervised rank aggregation methods as a principled approach for combining multiple estimators of expertise, derived from the textual contents, from the graph structure of the citation patterns for the community of experts, and from profile information about the experts. We specifically experimented with two unsupervised rank aggregation approaches well known in the information retrieval literature, namely CombSUM and CombMNZ. Experiments performed over a dataset of academic publications for the area of Computer Science attest to the adequacy of these methods.
    Full-text · Dataset · Mar 2013
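
    For reference, the two fusion rules in a few lines, assuming each expertise estimator yields normalized scores per expert; the toy score dictionaries are invented.

      def comb_sum(score_lists):
          # CombSUM: sum each expert's normalized scores across all estimators.
          fused = {}
          for scores in score_lists:
              for expert, s in scores.items():
                  fused[expert] = fused.get(expert, 0.0) + s
          return fused

      def comb_mnz(score_lists):
          # CombMNZ: CombSUM times the number of estimators that scored the expert.
          fused = comb_sum(score_lists)
          hits = {e: sum(1 for sc in score_lists if sc.get(e, 0) > 0) for e in fused}
          return {e: fused[e] * hits[e] for e in fused}

      text_scores = {"alice": 0.9, "bob": 0.4}
      graph_scores = {"alice": 0.6, "carol": 0.8}
      print(comb_mnz([text_scores, graph_scores]))  # alice rewarded for appearing in both
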
  • ABSTRACT: When developing a conversational agent, there is often an urgent need to have a prototype available in order to test the application with real users. A Wizard of Oz setup is a possibility, but sometimes the agent should simply be deployed in the environment where it will be used. There, the agent should be able to capture as many interactions as possible and to understand how people react to failure. In this paper, we focus on the rapid development of a natural language understanding module by non-experts. Our approach follows the machine learning paradigm and sees the process of understanding natural language as a classification problem. We test our module with a conversational agent that answers questions in the art domain. Moreover, we show how our approach can be used by a natural language interface to a cinema database.
    Full-text · Article · Feb 2013
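
    A minimal sketch of the classification view of language understanding described above: map an utterance to an interpretation label with a standard text classifier. The intent labels, training utterances, and the TF-IDF-plus-SVM pipeline are invented placeholders for whatever a non-expert would provide.

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.pipeline import make_pipeline
      from sklearn.svm import LinearSVC

      # Invented examples for an art-domain agent and a cinema interface.
      utterances = ["who painted this", "when was it painted",
                    "what movies are playing", "show me the cinema schedule"]
      intents = ["ASK_AUTHOR", "ASK_DATE", "LIST_MOVIES", "LIST_MOVIES"]

      nlu = make_pipeline(TfidfVectorizer(), LinearSVC()).fit(utterances, intents)
      print(nlu.predict(["who is the painter"]))  # expected: ['ASK_AUTHOR']
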
  • Catarina Moreira · Pável Calado · Bruno Martins
    ABSTRACT: The task of expert finding has been getting increasing attention in the information retrieval literature. However, the current state-of-the-art still lacks principled approaches for combining different sources of evidence in an optimal way. This paper explores the usage of learning to rank methods as a principled approach for combining multiple estimators of expertise, derived from the textual contents, from the graph structure of the citation patterns for the community of experts, and from profile information about the experts. Experiments performed over a dataset of academic publications for the area of Computer Science attest to the adequacy of the proposed approaches.
    Full-text · Article · Feb 2013
  • ABSTRACT: Huge amounts of movement data are nowadays being collected, as a consequence of the prevalence of mobile computing systems and location based services. While the research interest in the analysis of spatio-temporal data has also significantly increased, there are still several open challenges in areas such as interaction and information visualization. In this paper, we present the first steps of a research project that aims to study the usability of visualization techniques for mobility data. We present ST-TrajVis, an application for the visualization of movement data, based on the innovative combination of two popular techniques, namely a 2D map and a space-time cube, augmented with data processing techniques supporting the interaction with interesting subsets of the data. We conducted a user study to assess the usefulness of ST-TrajVis, and to obtain feedback regarding the users' interaction with the different techniques. The results suggest the adequacy of combining 2D maps with space-time cubes, the existence of some features of interest to users, and the need to conduct further comparative studies between the different techniques.
    No preview · Conference Paper · Jan 2013
  • S Moreira · J Filgueiras · B Martins · F Couto · M Silva

    No preview · Conference Paper · Jan 2013
  • Wesley Mathew · Bruno Martins
    ABSTRACT: The analysis of human location histories is currently getting increasing attention, due to the widespread usage of geopositioning technologies such as GPS, and also of online location-based services that allow users to share this information. Tasks such as the prediction of human movement can be addressed through the usage of these data, in turn offering support for more advanced applications, such as adaptive mobile services with proactive context-based functions. This paper addresses the problem of predicting human mobility on the basis of Hidden Markov Models (HMMs), an approach that allows us to account for location characteristics as unobservable parameters, and also for the effects of each individual's previous actions. We report on a series of experiments with both regular and second-order HMMs. The experiments were made with a real-world location history dataset from the LifeMap project, and the results show that a high prediction accuracy, relative to the difficulty of the task, can be achieved when considering relatively small regions.
    No preview · Conference Paper · Nov 2012
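
    As a simplified stand-in for the paper's HMMs (a plain Markov chain, with no hidden states), a second-order next-location predictor over discretized regions shows the shape of the task; the example history is invented.

      from collections import Counter, defaultdict

      # Invented history of visited region ids (in practice, GPS fixes
      # discretized into small cells).
      history = ["home", "cafe", "work", "cafe", "work", "home", "cafe", "work"]

      # Second-order transition counts: next cell given the previous two.
      counts = defaultdict(Counter)
      for a, b, c in zip(history, history[1:], history[2:]):
          counts[(a, b)][c] += 1

      def predict_next(prev2, prev1):
          options = counts.get((prev2, prev1))
          return options.most_common(1)[0][0] if options else None

      print(predict_next("home", "cafe"))  # 'work'
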
  • ABSTRACT: Reading is an important activity for individuals. Content-based recommendation systems are typically used to recommend scientific papers or news, where search is driven by topic. Literary reading, or reading for leisure, differs from scientific reading, because users search for books not only by topic but also by author or writing style. Choosing a new book to read can be tricky, and recommendation systems can make it easier by selecting books that the user will like. In this paper we study recommendation through writing style and the influence of negative examples on user preferences. Our experiments were conducted in a hybrid set-up that combines a collaborative filtering algorithm with stylometric relevance feedback. Using the LitRec data set, we demonstrate that writing style influences book selection; that book content, characterized by writing style, can be used to improve collaborative filtering results; and that negative examples do not improve final predictions.
    No preview · Conference Paper · Oct 2012
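
    An illustrative sketch of the kind of stylometric description a book's text might be reduced to before the relevance-feedback step; the particular features (sentence length, vocabulary richness, function-word rate) are a common stylometry choice, not necessarily the paper's exact set.

      import re

      FUNCTION_WORDS = {"the", "of", "and", "to", "a", "in", "that", "it"}

      def stylometric_features(text):
          words = re.findall(r"[a-z']+", text.lower())
          sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
          return {
              "avg_sentence_len": len(words) / max(len(sentences), 1),
              "type_token_ratio": len(set(words)) / max(len(words), 1),
              "function_word_rate": sum(w in FUNCTION_WORDS for w in words)
                                    / max(len(words), 1),
          }

      print(stylometric_features("It was the best of times. It was the worst of times."))
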
  • Wesley Mathew · Ruben Raposo · Bruno Martins
    ABSTRACT: The analysis of human location histories is currently getting increasing attention, due to the widespread usage of geopositioning technologies such as GPS, and also of online location-based services that allow users to share this information. Tasks such as the prediction of human movement can be addressed through the usage of these data, in turn offering support for more advanced applications, such as adaptive mobile services with proactive context-based functions. This paper presents a hybrid method for predicting human mobility on the basis of Hidden Markov Models (HMMs). The proposed approach clusters location histories according to their characteristics, and later trains an HMM for each cluster. The usage of HMMs allows us to account for location characteristics as unobservable parameters, and also for the effects of each individual's previous actions. We report on a series of experiments with a real-world location history dataset from the GeoLife project, showing that a prediction accuracy of 13.85% can be achieved when considering regions of roughly 1280 square meters.
    Preview · Conference Paper · Sep 2012
  • ABSTRACT: Literary reading is an important activity for individuals and can be a long-term commitment, making book choice an important task for book lovers and public library users. In this paper, we present a hybrid recommendation system to help readers decide which book to read next. We study book and author recommendations in a hybrid recommendation setting and test our algorithm on the LitRec data set. Our hybrid method combines two item-based collaborative filtering algorithms to predict books and authors that the user will like. Author predictions are expanded into a booklist that is subsequently aggregated with the book predictions. Finally, the resulting booklist is used to yield the top-n book recommendations. By means of various experiments, we demonstrate that author recommendation can improve overall book recommendation.
    No preview · Conference Paper · Jun 2012
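
    A toy sketch of the item-based collaborative-filtering core shared by the two component algorithms: score an unrated item for a user from the user's ratings of similar items, with item-item similarity taken as cosine over rating columns. The tiny ratings matrix is invented, and the paper's aggregation of book and author lists is not reproduced.

      import numpy as np

      # Invented user-by-item ratings (0 = unrated); "items" can be books in
      # one run of the algorithm and authors in the other.
      R = np.array([[5, 3, 0, 1],
                    [4, 0, 0, 1],
                    [1, 1, 5, 4],
                    [0, 1, 5, 4]], dtype=float)

      def item_cosine(R):
          norms = np.linalg.norm(R, axis=0) + 1e-9
          return (R.T @ R) / np.outer(norms, norms)

      def predict(R, user, item, sim):
          rated = np.nonzero(R[user])[0]                # items this user rated
          w = sim[item, rated]
          return float(w @ R[user, rated] / (np.abs(w).sum() + 1e-9))

      sim = item_cosine(R)
      print(predict(R, user=1, item=2, sim=sim))        # predicted rating for item 2
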

Publication Stats

731 Citations
4.93 Total Impact Points

Institutions

  • 2009-2015
    • Instituto Técnico y Cultural
      Santa Clara de Portugal, Michoacán, Mexico
  • 2004-2015
    • University of Lisbon
      • Faculty of Science
      Lisboa, Lisbon, Portugal
    • Faculdade Campo Grande
      Campo Grande, Estado de Mato Grosso do Sul, Brazil
  • 2011-2012
    • Inesc-ID
      Lisboa, Lisbon, Portugal
  • 2008-2009
    • Technical University of Lisbon
      • Departamento de Engenharia Informática (DEI)
      Lisbon, Lisbon, Portugal
  • 2007-2008
    • Instituto Superior de Contabilidade e Administração de Lisboa
      Lisboa, Lisbon, Portugal