Learning to resolve geographical and temporal references in text

Conference Paper · January 2011
DOI: 10.1145/2093973.2094020 · Source: DBLP
Conference: 19th ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems, ACM-GIS 2011, November 1-4, 2011, Chicago, IL, USA, Proceedings
Geo-temporal information is pervasive in textual documents, since most of them contain references to particular locations, calendar dates, clock times, or durations. An important text analytics problem is therefore resolving the place names and temporal expressions referenced in texts, i.e., linking the character strings that denote locations or temporal instances to the specific geospatial coordinates or time intervals they refer to. However, geo-temporal reference resolution poses several non-trivial problems for text mining, due to the inherent ambiguity and contextual assumptions of natural language discourse.
    • "There is a fair body of research on georeferencing both outside and inside the domain of natural history. Within the natural language processing community georeferencing is treated as a follow-up task to named entity recognition (Leidner and Lieberman 2011; Loureiro et al. 2011), or possibly as complementary to it (Godoy et al. 2011). There are also several open source tools available such as OpenSextant (http://opensextant.github.io) "
    ABSTRACT: For biodiversity research, the field of study that is concerned with the richness of species of our planet, it is of the utmost importance that the location of an animal specimen find is known with high precision. Due to specimens often having been collected over the course of many years, their accompanying geographical data is often ambiguous or may be very imprecise. In this article, we detail an approach that utilizes reasoning and external sources to improve the geographical information of animal finds. Our main contribution is to show that adding external domain knowledge improves the ability to georeference locations over traditional methods that focus solely on analyzing geographical information. Additionally, our system is able to output the confidence it has in its decisions through a confidence measure based on the difficulty of the instance and the steps undertaken to disambiguate it. Our results show that adding domain knowledge to the georeferencing process increases the accuracy @5km from 38.9% to 61.7% and from 47.0% to 74.5% @25km. Furthermore, we reduce the mean distance by more than half, from 251.1km to 114.5km, and decrease the number of records for which no reference can be found from 26.2% to 7.4%.
    Article · Dec 2014
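The figures quoted in the abstract above (accuracy @5km and @25km, mean distance, share of records with no reference) are standard georeferencing evaluation measures. The following is a minimal sketch of how such measures could be computed, not the authors' evaluation code. It assumes great-circle (haversine) distance, counts unresolved records as misses for accuracy, and averages distance over resolved records only; the abstract specifies none of these choices, and the names `haversine_km` and `evaluate` are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km (assumed spherical Earth)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def evaluate(predicted, gold, thresholds_km=(5, 25)):
    """Score predictions against gold coordinates.

    predicted: list of (lat, lon) tuples, or None where no reference was found.
    gold: list of (lat, lon) tuples, same length as predicted.
    """
    n = len(gold)
    # Distances are only defined for records that were resolved.
    distances = [haversine_km(*p, *g)
                 for p, g in zip(predicted, gold) if p is not None]
    result = {}
    for k in thresholds_km:
        # Assumption: unresolved records count as misses, so divide by n.
        result[f"accuracy@{k}km"] = sum(d <= k for d in distances) / n
    # Assumption: mean distance is taken over resolved records only.
    result["mean_distance_km"] = (sum(distances) / len(distances)
                                  if distances else float("nan"))
    result["unresolved"] = sum(p is None for p in predicted) / n
    return result


if __name__ == "__main__":
    gold = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
    predicted = [(0.0, 0.0), (0.0, 0.1), None]
    print(evaluate(predicted, gold))
```

Under these assumptions, a system can raise accuracy at a given threshold either by placing resolved records closer to the gold location or by resolving records it previously left unresolved.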
