Classification of Traffic Events in Mexico City Using Machine Learning and Volunteered Geographic Information

Magdalena Saldana-Perez, Miguel Torres-Ruiz, Marco Moreno-Ibarra
DOI: 10.4018/978-1-5225-7347-0.ch008


Volunteered geographic information (VGI) and user-generated content represent a source of up-to-date information about what people perceive in their environment. Analyzing them creates the opportunity to develop processes for studying and solving social problems that affect people's lives, merging technology and real data. One of the problems in urban areas is traffic. Every day in big cities, people lose time, money, and quality of life when they get stuck in traffic jams; another urban problem derived from traffic is air pollution. In the present approach, a traffic event classification methodology is implemented to analyze VGI and internet information related to traffic events, with a view to identifying the main traffic problems in a city and visualizing the congested roads. The methodology uses different computing tools and algorithms to achieve this goal. To obtain the data, a social media platform and RSS channels are consulted. The extracted texts are classified into seven possible traffic events and geolocated. A machine learning algorithm is applied in the classification.
Chapter Preview


The internet is a huge source of information. Many websites, social media platforms, and repositories let programmers, analysts, and users access information about what happens in the real world. Most internet content is created by users. People are not aware of how much data they have produced in recent years (Li, 2015).

Social media represent a source of information about everything that concerns people, such as education, technology, health, politics, and the environment, among others. One advantage of social media is that it makes it possible to locate people or events by letting users share their location (Zhao, 2015).

The internet and social media are used to identify and solve social problems; for example, social movements use them to ask governments for justice and for information about their management. They are also used to communicate ideas and connect people, to share safety measures when natural disasters occur, and to spread information about situations and conditions that affect people’s daily activities, such as traffic, air pollution, weather, and rain, among others (Adamko, 2015).

One of people's purposes when participating in social media is to be useful to others (Han et al., 2017). When a person reports something perceived in the environment, for example a flood, the report helps others look for alternatives to avoid the area and informs authorities about the problem in the hope of a response.

In social media, people act as sensors, and the information they provide is the sensor data. In their posts, citizens communicate the events and problems they perceive on roads through short texts, photos, and images; moreover, if the user allows it, many social media platforms can obtain the person's coordinates, which helps to identify the described events. The number of publications and posts related to a certain event on the internet lets analysts and citizens understand the magnitude of the problem and formulate hypotheses and solutions.

Due to the rapid development and growth of cities, problems on their roads, such as traffic and air pollution, among others, have increased considerably, affecting almost all the activities carried out in a city.

Every day, bottlenecks, traffic jams, and car accidents affect the economic, educational, and social activities in a city; the time a person spends in traffic events also affects their health. Social media has opened the opportunity to obtain information about the real situation on the roads, letting people who have access to this information change their routes or itineraries when problems are reported.

The data management needed to obtain information about the factors that affect social dynamics can be achieved through different computational and statistical methods, for example, classifying messages related to a specific event, or resubmitting the messages posted by trusted sources in order to make them accessible to more people (Dou, 2013).
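To make the idea of classifying messages by event more concrete, the following is a minimal sketch of a text classifier. It is not the chapter's method: this preview does not name the algorithm or the seven event categories, so the sketch assumes a multinomial Naive Bayes model with add-one smoothing, and the posts, the two labels ("accident", "congestion"), and the function names are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy posts and labels, invented for illustration only.
# The chapter's real categories and training data are not given in this preview.
TOY_POSTS = [
    ("car accident blocking two lanes", "accident"),
    ("accident crash near the bridge", "accident"),
    ("heavy traffic jam on the avenue", "congestion"),
    ("slow traffic jam downtown", "congestion"),
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(samples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest Naive Bayes log-probability."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        # Log prior of the label.
        score = math.log(count / total)
        # Denominator for add-one (Laplace) smoothing.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TOY_POSTS)
print(classify(model, "crash accident on the highway"))
```

In a real setting, the training posts would come from labeled social media and RSS texts, and the tokenizer would need to handle punctuation, hashtags, and Spanish-language text.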

Some research on urban traffic makes use of data generated by vehicles equipped with GPS devices (Zhang, 2011); such data are used to predict traffic in specific study areas. As an alternative to GPS-generated data, can social media information increase the accuracy of predictions? Or does the ambiguity in certain social media posts decrease it? Few research works consider this question.
