Geocoding of Spatial Relationships Contained in Tweets

Imelda Escamilla (Centro de Investigación en Computación, Instituto Politécnico Nacional (IPN), Mexico City, Mexico), Clodoveu A. Davis Jr. (Universidade Federal de Minas Gerais, Belo Horizonte, Brazil), Marco Moreno-Ibarra (Centro de Investigación en Computación, Instituto Politécnico Nacional (IPN), Mexico City, Mexico) and Vladimir Luna (Centro de Investigación en Computación, Instituto Politécnico Nacional (IPN), Mexico City, Mexico)
Copyright: © 2016 | Pages: 17
DOI: 10.4018/IJKSR.2016010102

The human ability to understand approximate references to locations, disambiguated by means of context and reasoning about spatial relationships, is key to describing spatial environments and sharing information about them. In this paper, the authors propose an approach to geocoding that takes advantage of the spatial relationships contained in the text of tweets, using semantic and spatial analyses. Microblog text has special characteristics (e.g., slang, abbreviations, and acronyms) and thus represents a special variation of natural language. The main objective of this work is to associate the spatial relationships found in text with a spatial footprint in order to determine the location of the event described in the tweet. The feasibility of the proposal is demonstrated using a corpus of 200,000 tweets posted in Spanish and related to traffic events in Mexico City.
Article Preview

Although there is research with good results in this area, the spatial information in the context is often not taken into consideration in applications and is not processed as part of the location information. For example, Kordjamshidi et al. (2010) implemented a method called “spatial role labelling”, which obtains spatial information from natural language sentences by mapping terms onto formal spatial relations. Other approaches to spatial analysis in natural language use part-of-speech tagging methods, such as those of Zhang et al. (2009) and Hall and Jones (2008).
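As a rough illustration of what mapping terms onto formal spatial relations can look like, the sketch below scans a Spanish tweet for a handful of spatial expressions and tags each with a coarse relation label. It is a simplified keyword lookup, not the spatial role labelling method of Kordjamshidi et al. (2010); the `SPATIAL_TERMS` table, the relation labels, and the `find_spatial_relations` function are hypothetical names introduced only for this example.

```python
import re

# Hypothetical lookup table: Spanish spatial expressions -> coarse relation labels.
# Both the phrases chosen and the labels are assumptions made for illustration.
SPATIAL_TERMS = {
    "entre": "BETWEEN",
    "cerca de": "NEAR",
    "frente a": "IN_FRONT_OF",
    "esquina con": "INTERSECTION",
    "a la altura de": "AT_LEVEL_OF",
}


def find_spatial_relations(text):
    """Return (relation_label, matched_phrase, start_offset) tuples in text order."""
    lowered = text.lower()
    hits = []
    for phrase, label in SPATIAL_TERMS.items():
        # Word boundaries keep a phrase from matching inside a longer word.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, lowered):
            hits.append((label, phrase, match.start()))
    return sorted(hits, key=lambda hit: hit[2])


tweet = "Choque en Insurgentes a la altura de Reforma, cerca de la glorieta"
relations = find_spatial_relations(tweet)
for label, phrase, offset in relations:
    print(label, repr(phrase), offset)
```

A lookup like this only detects the relation term; deciding which place names act as the reference objects of each relation would still require the kind of syntactic or semantic analysis discussed in the works cited above.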

The quality of the results depends on the type and style of the information. If the sentence structure is syntactically poor, contains spelling errors, includes standard and non-standard abbreviations, uses acronyms, and depends on the individual style of the user, the result may not be accurate and needs to be analysed thoroughly (Blanco et al., 2015).

These issues are generally addressed by identifying segments or key phrases, or by weighting terms based on, e.g., text documents associated with a particular geographical context. There are several approaches to retrieving or identifying locations in text. One commonly used technique is Named Entity Recognition (NER), which focuses on extracting names that represent sites or locations (Crooks & Wise, 2013). This detection is always accompanied by a disambiguation process that determines which of the multiple candidate locations obtained relates to the text (Daly et al., 2013). This kind of detection has been shown to perform worse on short texts than on longer ones.
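The detection and disambiguation steps can be sketched with a toy gazetteer lookup. In the example below, each token of a tweet is checked against a small dictionary of place names, and an ambiguous name is resolved by picking the candidate closest to a context point. This nearest-candidate heuristic is a stand-in chosen for brevity, not the disambiguation process of the cited works; `GAZETTEER`, `resolve_locations`, and the context point are hypothetical, and the coordinates are rough illustrative values rather than authoritative data.

```python
import math

# Hypothetical toy gazetteer: lower-cased name -> list of (description, lat, lon).
# Coordinates are approximate and for illustration only.
GAZETTEER = {
    "reforma": [
        ("Paseo de la Reforma, Mexico City", 19.427, -99.168),
        ("Reforma, Chiapas", 17.86, -93.15),
    ],
    "insurgentes": [
        ("Avenida de los Insurgentes, Mexico City", 19.39, -99.17),
    ],
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def resolve_locations(text, context_point):
    """Map each gazetteer name found in text to its candidate nearest the context point."""
    tokens = [token.strip(".,;:!?¡¿#@").lower() for token in text.split()]
    resolved = {}
    for token in tokens:
        candidates = GAZETTEER.get(token)
        if candidates:
            resolved[token] = min(
                candidates,
                key=lambda c: haversine_km(context_point[0], context_point[1], c[1], c[2]),
            )
    return resolved


# Assumed context: the tweets are known to concern Mexico City.
MEXICO_CITY = (19.4326, -99.1332)
result = resolve_locations("Choque en Insurgentes y Reforma.", MEXICO_CITY)
for name, (description, lat, lon) in result.items():
    print(name, "->", description)
```

Single-token matching of this kind misses multi-word names and misspellings, which is one reason such detection tends to degrade on short, informal text.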
