Big Data Analytics for Train Delay Prediction: A Case Study in the Italian Railway Network

Emanuele Fumeo (University of Genoa, Italy), Luca Oneto (University of Genoa, Italy), Giorgio Clerico (University of Genoa, Italy), Renzo Canepa (Rete Ferroviaria Italiana S.P.A., Italy), Federico Papa (Ansaldo STS S.P.A., Italy), Carlo Dambra (Ansaldo STS S.P.A., Italy), Nadia Mazzino (Ansaldo STS S.P.A., Italy) and Davide Anguita (University of Genoa, Italy)
Copyright: © 2018 | Pages: 29
DOI: 10.4018/978-1-5225-3176-0.ch014


Current Train Delay Prediction Systems (TDPSs) do not take advantage of state-of-the-art tools and techniques for extracting useful insights from the large amounts of historical data collected by railway information systems. Instead, these systems rely on static rules, based on classical univariate statistics, built by experts of the railway infrastructure. The purpose of this book chapter is to build a data-driven TDPS for large-scale railway networks that exploits the most recent big data technologies, learning algorithms, and statistical tools. In particular, we propose a fast learning algorithm for Shallow and Deep Extreme Learning Machines that fully exploits recent in-memory large-scale data processing technologies for predicting train delays. The proposal has been compared with the current state-of-the-art TDPSs. Results on real-world data from the Italian railway network show that our proposal improves over the current state-of-the-art TDPSs.
Chapter Preview


Big Data Analytics is currently one of the trending research topics in the context of railway transportation systems. Indeed, many aspects of the railway world can greatly benefit from new technologies and methodologies able to collect, store, process, analyze and visualize large amounts of data (Paakkonen & Pakkala, 2015; Thaduri, Galar, & Kumar, 2015; Zarembski, 2014; Jina, Wah, Chenga et al., 2015; Wu & Chin, 2014; Schmidt, Chen, Matheson, & Ostrouchov, 2016), as well as from new methodologies coming from machine learning, artificial intelligence, and computational intelligence that analyze those data in order to extract actionable information (Chen & An, 2016; Yu & Boyd, 2016; Aridhi & Nguifo, 2016; Colombo & Ferrari, 2015; Al-Jarrah, Yoo, Muhaidat, Karagiannidis, & Taha, 2015). Examples are: condition-based maintenance of railway assets (Fumeo, Oneto, & Anguita, 2015; Li, Qian, Parikh, & Hampapur, 2013; Li et al., 2014; Núñez, Hendriks, Li et al., 2014), automatic visual inspection systems (Feng et al., 2014; Aytekin, Rezaeitabar, Dogru et al., 2015), risk analysis (Figueres-Esteban, Hughes, & Van Gulijk, 2015), network capacity estimation (Branishtov, Vershinin, Tumchenok et al., 2014), optimization for energy-efficient railway operations (Bai, Ho, Mao, Ding, & Chen, 2014), marketing analysis for rail freight transportation (Xueyan & Depeng, 2014), usage of ontologies and linked data in railways (Morris, Easton, & Roberts, 2014; Tutcher, 2014), big data for rail inspection systems (Li, Zhong, Liang et al., 2015), complex event processing over train data streams (Ma, Wang, Chu et al., 2015), fault diagnosis of vehicle on-board equipment for high-speed railways (Wang, Xu, Zhao et al., 2015; Zhao, Xu, & Hai-feng, 2014; Noori & Jenab, 2013) and for conventional ones (Bin & Wensheng, 2015), research on storage and retrieval of large amounts of data for high-speed trains (Wang, Li, Hei et al., 2015), development of an online geospatial safety risk model for railway networks (Sadler et al., 2016), train marshalling optimization through genetic algorithms (Qingyang & Xiaoyun, 2015), and research on new technologies for railway ticketing systems (Zhu, Wang, Shan et al., 2014).

The work described in this book chapter tackles the problem of predicting train delays using Big Data Analytics, aiming both to improve traffic management and dispatching and to scale to large railway networks. In particular, this work focuses on exploiting the large amount of historical train movement data collected by railway information systems.
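
For readers unfamiliar with the Extreme Learning Machines named in the abstract, the following is a minimal sketch of a generic shallow ELM regressor: hidden-layer weights are drawn at random and left fixed, and only the output weights are fitted, here by ridge-regularized least squares. It is not the chapter's algorithm, which this preview does not describe; the class name `ShallowELM`, the helper `solve_linear`, the two input features, and the synthetic targets are all assumptions made purely for illustration, and no real railway data is involved.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


class ShallowELM:
    """Generic single-hidden-layer ELM (illustrative, hypothetical class)."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-3, seed=0):
        rng = random.Random(seed)
        # Random hidden weights and biases: drawn once, never trained.
        self.w = [[rng.gauss(0, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.gauss(0, 1) for _ in range(n_hidden)]
        self.ridge = ridge
        self.beta = [0.0] * n_hidden

    def _hidden(self, x):
        return [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(self.w, self.b)]

    def fit(self, xs, ys):
        h = [self._hidden(x) for x in xs]
        k = len(self.b)
        # Normal equations: (H^T H + ridge * I) beta = H^T y
        gram = [[sum(row[i] * row[j] for row in h)
                 + (self.ridge if i == j else 0.0)
                 for j in range(k)] for i in range(k)]
        rhs = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(k)]
        self.beta = solve_linear(gram, rhs)
        return self

    def predict(self, x):
        return sum(bj * hj for bj, hj in zip(self.beta, self._hidden(x)))


# Synthetic toy data (assumption): two scaled features standing in for,
# say, a previous delay and a time of day; the target is an invented function.
rng = random.Random(42)
xs = [[rng.random(), rng.random()] for _ in range(200)]
ys = [10.0 * x[0] + 3.0 * math.sin(3.0 * x[1]) for x in xs]

model = ShallowELM(n_inputs=2, n_hidden=20).fit(xs, ys)

# Compare training error against a constant predictor (the mean target).
mean_y = sum(ys) / len(ys)
model_mae = sum(abs(model.predict(x) - y) for x, y in zip(xs, ys)) / len(ys)
baseline_mae = sum(abs(mean_y - y) for y in ys) / len(ys)
print(f"ELM MAE: {model_mae:.3f}  constant-baseline MAE: {baseline_mae:.3f}")
```

Because training reduces to a single linear solve, this family of models is comparatively cheap to fit, which is one plausible reason it suits large historical datasets; a deep variant or a distributed in-memory implementation would need considerably more than this sketch.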
