Bruno Martins

Instituto Técnico y Cultural, Santa Clara de Portugal, Michoacán, Mexico

Publications (76) · 2.54 Total impact

  • João Santos, Ivo Anastácio, Bruno Martins
    ABSTRACT: This paper presents a machine learning method for disambiguating place references in text. Solving this task can have important applications in the digital humanities and computational social sciences, by supporting the geospatial analysis of large document collections. We combine multiple features that capture the similarity between candidate disambiguations, the place references, and the context where the place references occur, in order to rank and choose from a set of candidate disambiguations, obtained from a knowledge base containing geospatial coordinates and textual descriptions for different places from all around the world. The proposed method was evaluated through English corpora used in previous work in this area, and also with a subset of the English Wikipedia. Experimental results demonstrate that the proposed method is indeed effective, showing that out-of-the-box learning algorithms and relatively simple features can obtain a high accuracy in this task.
    GeoJournal 01/2014; DOI:10.1007/s10708-014-9553-y
  • ABSTRACT: Expert finding is an information retrieval task that is concerned with the search for the most knowledgeable people with respect to a specific topic, and the search is based on documents that describe people's activities. The task involves taking a user query as input and returning a list of people who are sorted by their level of expertise with respect to the user query. Despite recent interest in the area, current state-of-the-art techniques lack principled approaches for optimally combining different sources of evidence. This article proposes two frameworks for combining multiple estimators of expertise. These estimators are derived from textual contents, from the graph structure of the citation patterns for the community of experts, and from profile information about the experts. More specifically, this article explores the use of supervised learning-to-rank methods, as well as rank aggregation approaches, for combining all of the estimators of expertise. Several supervised learning algorithms, which are representative of the pointwise, pairwise and listwise approaches, were tested, and various state-of-the-art data fusion techniques were also explored for the rank aggregation framework. Experiments performed on a dataset of academic publications from the Computer Science domain attest to the adequacy of the proposed approaches.
    Expert Systems 11/2013; (in press). DOI:10.1111/exsy.12062 · 0.75 Impact Factor
  • ABSTRACT: The task of expert finding has been getting increasing attention in the information retrieval literature. However, the current state of the art still lacks principled approaches for combining different sources of evidence. This paper explores the use of unsupervised rank aggregation methods as a principled approach for combining multiple estimators of expertise, derived from the textual contents, from the graph structure of the citation patterns for the community of experts, and from profile information about the experts. We specifically experimented with two unsupervised rank aggregation approaches well known in the information retrieval literature, namely CombSUM and CombMNZ. Experiments over a dataset of academic publications in the area of Computer Science attest to the adequacy of these methods.
  • ABSTRACT: When developing a conversational agent, there is often an urgent need to have a prototype available in order to test the application with real users. A Wizard of Oz is a possibility, but sometimes the agent should simply be deployed in the environment where it will be used. Here, the agent should be able to capture as many interactions as possible and to understand how people react to failure. In this paper, we focus on the rapid development of a natural language understanding module by non-experts. Our approach follows the learning paradigm and treats the process of understanding natural language as a classification problem. We test our module with a conversational agent that answers questions in the art domain. Moreover, we show how our approach can be used by a natural language interface to a cinema database.
  • ABSTRACT: The task of expert finding has been getting increasing attention in the information retrieval literature. However, the current state of the art still lacks principled approaches for combining different sources of evidence in an optimal way. This paper explores the use of learning-to-rank methods as a principled approach for combining multiple estimators of expertise, derived from the textual contents, from the graph structure of the citation patterns for the community of experts, and from profile information about the experts. Experiments over a dataset of academic publications in the area of Computer Science attest to the adequacy of the proposed approaches.
  • ABSTRACT: Huge amounts of movement data are nowadays being collected, as a consequence of the prevalence of mobile computing systems and location-based services. While research interest in the analysis of spatio-temporal data has also significantly increased, there are still several open challenges in areas such as interaction and information visualization. In this paper, we present the first steps of a research project that aims to study the usability of visualization techniques for mobility data. We present ST-TrajVis, an application for the visualization of movement data, based on the innovative combination of two popular techniques, namely a 2D map and a space-time cube, augmented with data processing techniques supporting the interaction with interesting subsets of the data. We conducted a user study to assess the usefulness of ST-TrajVis, and to obtain feedback regarding the users' interaction with the different techniques. The results suggest the adequacy of the combination of 2D maps with space-time cubes, the existence of some features of interest to users, and the need to conduct further comparative studies between the different techniques.
    British Computer Society Conference on Human-Computer Interaction, BCS HCI 2013; 01/2013
  • ABSTRACT: With the prevalence of mobile computing systems and location-based services, research interest in spatio-temporal data has significantly increased, as evidenced by the collection of huge amounts of movement data. Consequently, this type of data raises several issues, namely in the research area of geographic information visualization. Despite the existence of several visual analysis techniques for the exploration of movement data, it is still unclear how usable and useful these techniques are, how they can be improved, and for which situations they are most suitable. In this paper, we present current open challenges in the visual analysis of movement data, and the Ph.D. work in progress aiming to address these problems. Our work will explore several factors that may affect the users' performance and, based on those factors, we will propose a taxonomy and an evaluation framework covering different tasks and techniques.
    Proceedings of the 2013 IEEE 14th International Conference on Mobile Data Management - Volume 02; 01/2013
  • Wesley Mathew, Bruno Martins
    ABSTRACT: The analysis of human location histories is currently receiving increasing attention, due to the widespread usage of geopositioning technologies such as GPS, and also of online location-based services that allow users to share this information. Tasks such as the prediction of human movement can be addressed through the usage of these data, in turn offering support for more advanced applications, such as adaptive mobile services with proactive context-based functions. This paper addresses the problem of predicting human mobility on the basis of Hidden Markov Models (HMMs), an approach that allows us to account for location characteristics as unobservable parameters, and also to account for the effects of each individual's previous actions. We report on a series of experiments with both regular and second-order HMMs. The experiments were made with a real-world location history dataset from the LifeMap project, and the results show that a high prediction accuracy, relative to the difficulty of the task, can be achieved when considering relatively small regions.
    Proceedings of the First ACM SIGSPATIAL International Workshop on Mobile Geographic Information Systems; 11/2012
  • ABSTRACT: Reading is an important activity for individuals. Content-based recommendation systems are, typically, used to recommend scientific papers or news, where search is driven by topic. Literary reading or reading for leisure differs from scientific reading, because users search books not only for their topic but also by author or writing style. Choosing a new book to read can be tricky and recommendation systems can make it easy by selecting books that the user will like. In this paper we study recommendation through writing style and the influence of negative examples in user preferences. Our experiments were conducted in a hybrid set-up that combines a collaborative filtering algorithm with stylometric relevance feedback. Using the LitRec data set, we demonstrate that writing style influences book selection; that book content, characterized with writing style, can be used to improve collaborative filtering results; and that negative examples do not improve final predictions.
    Proceedings of the fifth ACM workshop on Research advances in large digital book repositories and complementary media; 10/2012
  • Wesley Mathew, Ruben Raposo, Bruno Martins
    ABSTRACT: The analysis of human location histories is currently receiving increasing attention, due to the widespread usage of geopositioning technologies such as GPS, and also of online location-based services that allow users to share this information. Tasks such as the prediction of human movement can be addressed through the usage of these data, in turn offering support for more advanced applications, such as adaptive mobile services with proactive context-based functions. This paper presents a hybrid method for predicting human mobility on the basis of Hidden Markov Models (HMMs). The proposed approach clusters location histories according to their characteristics, and later trains an HMM for each cluster. The usage of HMMs allows us to account for location characteristics as unobservable parameters, and also to account for the effects of each individual's previous actions. We report on a series of experiments with a real-world location history dataset from the GeoLife project, showing that a prediction accuracy of 13.85% can be achieved when considering regions of roughly 1280 square meters.
    Proceedings of the 2012 ACM Conference on Ubiquitous Computing; 09/2012
  • ABSTRACT: Literary reading is an important activity for individuals and can be a long term commitment, making book choice an important task for book lovers and public library users. In this paper, we present a hybrid recommendation system to help readers decide which book to read next. We study book and author recommendations in a hybrid recommendation setting and test our algorithm on the LitRec data set. Our hybrid method combines two item-based collaborative filtering algorithms to predict books and authors that the user will like. Author predictions are expanded into a booklist that is subsequently aggregated with the former book predictions. Finally, the resulting booklist is used to yield the top-n book recommendations. By means of various experiments, we demonstrate that author recommendation can improve overall book recommendation.
    JCDL 2012; 06/2012
  • André Nunes, Pável Calado, Bruno Martins
    ABSTRACT: This paper describes an approach for resolving user identifiers in the context of social networks, using techniques from the area of duplicate record detection [1]. We reduce the user identity resolution problem to a binary classification task, where the goal is to classify pairs of identifiers as either belonging to the same person or not. The pairs are represented as feature vectors that combine multiple sources of similarity (e.g. similarity between profile information, descriptions of people's interests, and people's friend lists). We report on a thorough evaluation of different machine learning algorithms and different feature sets, concluding that user identities can be resolved with high accuracy.
  • ABSTRACT: Literary reading is an important activity for individuals and choosing to read a book can be a long-term commitment, making book choice an important task for book lovers and public library users. In this paper we present a hybrid recommendation system to help readers decide which book to read next. We study book and author recommendation in a hybrid recommendation setting and test our approach on the LitRec data set. The proposed hybrid book recommendation approach combines two item-based collaborative filtering algorithms to predict books and authors that the user will like. Author predictions are expanded into a book list that is subsequently aggregated with the former list generated through the initial collaborative recommender. Finally, the resulting book list is used to yield the top-n book recommendations. By means of various experiments, we demonstrate that author recommendation can improve overall book recommendation.
  • ABSTRACT: The task of expert finding has been getting increasing attention in the information retrieval literature. However, the current state of the art still lacks principled approaches for combining different sources of evidence. This paper explores the use of unsupervised rank aggregation methods as a principled approach for combining multiple estimators of expertise, derived from the textual contents, from the graph structure of the citation patterns for the community of experts, and from profile information about the experts. We specifically experimented with two unsupervised rank aggregation approaches well known in the information retrieval literature, namely CombSUM and CombMNZ. Experiments over a dataset of academic publications in the area of Computer Science attest to the adequacy of these methods.
  • ABSTRACT: This paper describes an approach for performing recognition and resolution of place names mentioned in the descriptive metadata records of typical digital libraries. Our approach exploits evidence provided by the existing structured attributes within the metadata records to support the place name recognition and resolution, in order to achieve better results than by just using lexical evidence from the textual values of these attributes. In metadata records, lexical evidence is very often insufficient for this task, since short sentences and simple expressions are predominant. Our implementation uses a dictionary-based technique for recognition of place names (with names provided by Geonames), and machine learning for reasoning on the evidence and choosing a possible resolution candidate. The evaluation of our approach was performed on data sets with a metadata schema rich in Dublin Core elements. Two evaluation methods were used. First, we used cross-validation, which showed that our solution is able to achieve a very high precision of 0.99 at 0.55 recall, or a recall of 0.79 at 0.86 precision. Second, we used a comparative evaluation with an existing commercial service, where our solution performed better on any confidence level (p
    Proceedings of the 2011 Joint International Conference on Digital Libraries, JCDL 2011, Ottawa, ON, Canada, June 13-17, 2011; 01/2011
  • Bruno Martins
    ABSTRACT: This paper presents a novel approach for detecting duplicate records in the context of digital gazetteers, using state-of-the-art machine learning techniques. It reports a thorough evaluation of alternative machine learning approaches designed for the task of classifying pairs of gazetteer records as either duplicates or not, built by using support vector machines or alternating decision trees with different combinations of similarity scores for the feature vectors. Experimental results show that using feature vectors that combine multiple similarity scores, derived from place names, semantic relationships, place types and geospatial footprints, leads to an increase in accuracy. The paper also discusses how the proposed duplicate detection approach can scale to large collections, through the usage of filtering or blocking techniques.
    GeoSpatial Semantics - 4th International Conference, GeoS 2011, Brest, France, May 12-13, 2011. Proceedings; 01/2011
  • Rui Candeias, Bruno Martins
    ABSTRACT: The association of illustrative photos with textual contents is a challenging cross-media retrieval problem with many practical applications. For instance, associating photos with specific parts of travelogues, i.e. textual descriptions of travel experiences, may lead to a better usage of these documents. Despite the huge number of high-quality photos on websites like Flickr, these photos are currently not being properly explored in cross-media retrieval tasks.
    19th ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems, ACM-GIS 2011, November 1-4, 2011, Chicago, IL, USA, Proceedings; 01/2011
  • Vitor Loureiro, Ivo Anastácio, Bruno Martins
    ABSTRACT: Geo-temporal information is pervasive over textual documents, since most of them contain references to particular locations, calendar dates, clock times or duration periods. An important text analytics problem is therefore related to resolving the place names and the temporal expressions referenced in the texts, i.e. linking the character strings in the documents that correspond to either locations or temporal instances, to the specific geospatial coordinates or the time intervals that they refer to. However, geo-temporal reference resolution presents several non-trivial problems to the area of text mining, due to the inherent ambiguity and contextual assumptions of natural language discourse.
    19th ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems, ACM-GIS 2011, November 1-4, 2011, Chicago, IL, USA, Proceedings; 01/2011
  • Ana Silva, Bruno Martins
    ABSTRACT: This paper presents methods for annotating georeferenced photos with descriptive tags, exploring the annotations for other georeferenced photos which are available at online repositories like Flickr. Specifically, by using the geospatial coordinates associated with the photo which we want to annotate, we start by collecting the photos from an online repository which were taken from nearby locations. Next, for each tag associated with the collected photos, we compute a set of relevance estimators based on factors such as the tag frequency, the geospatial proximity of the photo, the image content similarity, and the number of different users employing the tag. The multiple estimators can then be combined through supervised learning-to-rank methods such as RankBoost or AdaRank, or through unsupervised rank aggregation methods well known in the information retrieval literature, namely the CombSUM or the CombMNZ approaches. The most relevant tags are finally suggested. Experimental results with a collection of photos collected from Flickr attest to the adequacy of the proposed approaches.
  • Bruno Martins, Ivo Anastácio, Pável Calado
    ABSTRACT: This paper presents a machine learning method for resolving place references in text, i.e. linking character strings in documents to locations on the surface of the Earth. This is a fundamental task in the area of Geographic Information Retrieval, supporting access through geography to large document collections. The proposed method is an instance of stacked learning, in which a first learner based on a Hidden Markov Model is used to annotate place references, and then a second learner implementing a regression through a Support Vector Machine is used to rank the possible disambiguations for the references that were initially annotated. The proposed method was evaluated through gold-standard document collections in three different languages, having place references annotated by humans. Results show that the proposed method compares favorably against commercial state-of-the-art systems such as the Metacarta geo-tagger and Yahoo! Placemaker.
    07/2010: pages 221-236;
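
Several of the abstracts above name CombSUM and CombMNZ as unsupervised rank aggregation methods for combining multiple estimators. As a rough illustration only, and not the authors' implementation, the two methods can be sketched as below. The min-max normalization step, the function names, and the sample candidates and scores are all assumptions made for this sketch; the abstracts do not specify them.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale one estimator's scores to [0, 1] (assumed normalization).

    If all scores are equal, every candidate is mapped to 1.0.
    """
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {item: 1.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}


def comb_sum(estimators):
    """CombSUM: sum each candidate's normalized scores across estimators."""
    fused = defaultdict(float)
    for scores in estimators:
        for item, s in min_max_normalize(scores).items():
            fused[item] += s
    return dict(fused)


def comb_mnz(estimators):
    """CombMNZ: CombSUM times the number of estimators with a non-zero score."""
    summed = comb_sum(estimators)
    hits = defaultdict(int)
    for scores in estimators:
        for item, s in min_max_normalize(scores).items():
            if s > 0:
                hits[item] += 1
    return {item: summed[item] * hits[item] for item in summed}


def rank(fused):
    """Order candidates by fused score, highest first; ties broken by name."""
    return sorted(fused, key=lambda item: (-fused[item], item))


# Hypothetical estimators: text-based, citation-graph, and profile-based scores.
estimators = [
    {"ana": 3.0, "rui": 1.0, "eva": 2.0},
    {"ana": 10.0, "rui": 50.0},
    {"eva": 0.2, "rui": 0.4, "ana": 0.2},
]
print(rank(comb_sum(estimators)))
print(rank(comb_mnz(estimators)))
```

CombMNZ rewards candidates that several estimators agree on, whereas CombSUM only accumulates evidence; which behaves better depends on the data and on the normalization chosen.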

Publication Stats

586 Citations
2.54 Total Impact Points

Institutions

  • 2009–2013
    • Instituto Técnico y Cultural
      Santa Clara de Portugal, Michoacán, Mexico
    • Universidade da Beira Interior
      Covilhã, Castelo Branco, Portugal
  • 2011–2012
    • Inesc-ID
      Lisboa, Lisbon, Portugal
  • 2008–2009
    • Technical University of Lisbon
      • Departamento de Engenharia Informática (DEI)
      Lisbon, Lisbon, Portugal
  • 2007–2008
    • Instituto Superior de Contabilidade e Administração de Lisboa
      Lisboa, Lisbon, Portugal
  • 2004–2006
    • University of Lisbon
      • Faculty of Science
      Lisboa, Lisbon, Portugal
    • Faculdade Campo Grande
      Campo Grande, Estado de Mato Grosso do Sul, Brazil