Time-Series Forecasting and Analysis of COVID-19 Outbreak in Highly Populated Countries: A Data-Driven Approach


Arunkumar P. M., Lakshmana Kumar Ramasamy, Amala Jayanthi M.
Copyright: © 2022 | Pages: 17
DOI: 10.4018/IJEHMC.20220701.oa3


A novel coronavirus disease, COVID-19, is spreading across countries at an alarming rate and has become a major threat to human life. With a death count of more than eight lakh (800,000) within a short span of seven months, this deadly virus has affected more than 24 million people across 213 countries and territories around the world. Time-series analysis, modeling, and forecasting is an important research area that extracts hidden insights from large sets of time-bound data to support better decisions. In this work, data analysis on a COVID-19 dataset is performed by comparing the six most populous countries in the world. The data used for the evaluation covers the period from 22nd January 2020 to 23rd August 2020. A novel time-series forecasting approach based on the Auto-regressive integrated moving average (ARIMA) model is also proposed. The results will help researchers from the medical and scientific community to gauge the trend of the disease spread and improve containment strategies accordingly.

1. Introduction

The novel coronavirus was first identified in Wuhan City, Hubei province, China, in December 2019 and was subsequently named COVID-19 by the World Health Organization. The most common symptoms of the virus include fever, cough, and tiredness. Less common symptoms are headache, diarrhea, sore throat, and loss of taste or smell. Most severe cases of COVID-19 showed symptoms of breathing difficulty and chest pain. In-depth monitoring of epidemiological changes gives better insight into the disease outbreak (Rotha & Byrareddy, 2020). Research on time-series data is critical because temporal data is used in a wide variety of applications. Large size, high dimensionality, and frequent updates are a few characteristics of time-series data. Time-series data is subjected to various processing steps to discover patterns for better decision making. Apart from pattern discovery and clustering, other important tasks of time-series data mining include classification, rule mining, and summarization (Fu, 2011). Distance-based clustering, the fuzzy c-means (FCM) algorithm, Autoregressive integrated moving average (ARIMA) models, and the Hidden Markov model (HMM) are a few of the methods adopted for time-series clustering and pattern discovery. Time-series forecasting analyzes past observations of a random variable and builds a model that captures the underlying relationship and its patterns. Each forecasting method follows four important steps, namely problem definition, information gathering, selecting the best model, and forecasting (Hyndman & Athanasopoulos, 2018). Time-series analysis and forecasting of the COVID-19 outbreak is an emerging research area that requires deep knowledge and careful experimentation for interpreting the trend and evaluating the predictions.
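To make the idea of building a model from past observations concrete, the following is a minimal sketch of one of the simplest ARIMA variants, ARIMA(1,1,0), fitted by conditional least squares in plain Python. It is an illustration only and not the model proposed in this article; the function names `fit_arima_110` and `forecast_arima_110` and the toy input series are hypothetical, and the series is not COVID-19 data.

```python
def fit_arima_110(series):
    """Fit ARIMA(1,1,0) by conditional least squares.

    The series is differenced once (the "I" part, d = 1) and an AR(1)
    model with intercept is fitted to the differences:
        diff[t] = c + phi * diff[t-1] + error
    Returns the tuple (c, phi).
    """
    diffs = [b - a for a, b in zip(series, series[1:])]
    x, y = diffs[:-1], diffs[1:]
    n = len(x)
    if n < 2:
        raise ValueError("need at least four observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # If the differences are constant, the slope is undefined; fall back to 0.
    phi = sxy / sxx if sxx else 0.0
    c = mean_y - phi * mean_x
    return c, phi


def forecast_arima_110(series, c, phi, steps):
    """Forecast `steps` values ahead by iterating the fitted AR(1) on the
    differences and adding each predicted difference back to the level."""
    level = series[-1]
    diff = series[-1] - series[-2]
    forecasts = []
    for _ in range(steps):
        diff = c + phi * diff
        level += diff
        forecasts.append(level)
    return forecasts


if __name__ == "__main__":
    toy_series = [0.0, 8.0, 12.0, 14.0, 15.0]  # hypothetical toy values
    c, phi = fit_arima_110(toy_series)
    print(forecast_arima_110(toy_series, c, phi, steps=3))
```

In practice, the orders (p, d, q) would be chosen from the data, and a statistical library would normally be used for estimation; the sketch only shows how differencing and autoregression combine in a forecast.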

Holt–Winters Additive Model (HWAAS), Auto-regressive integrated moving average (ARIMA), TBAT, Prophet, DeepAR, N-Beats, and Vector Autoregression (VAR) are a few of the models used by researchers around the world for time-series forecasting (Papastefanopoulos, 2020). The HWAAS model takes the trend and seasonal variation of the data into account. It is an advanced model that adds features to Holt’s exponential smoothing. In exponential smoothing, each newly recorded observation is used to update the predicted level, with recent observations weighted more heavily than older ones. The additive method is favored when the seasonal variations are approximately constant throughout the data series. Holt–Winters exponential smoothing is also called triple exponential smoothing. The TBAT method involves four components, namely trigonometric seasonal formulation, Box–Cox transformation, ARMA errors, and a trend component (Harvey et al., 1997; Box & Cox, 1964; Adhikari & Agrawal, 2013). The TBAT model can accommodate multiple seasonalities; each seasonality is modeled with a trigonometric representation based on Fourier series. The Prophet method was proposed by Facebook. The three major components used by this model are trend, seasonality, and holidays.
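The additive Holt–Winters update described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used by the cited authors: the function name `holt_winters_additive`, the simple initialization from the first two seasons, and the fixed smoothing parameters are all choices made for this example, and the input values are toy numbers rather than COVID-19 data.

```python
def holt_winters_additive(y, m, alpha, beta, gamma, horizon):
    """Additive Holt-Winters (triple exponential smoothing).

    y       : list of observations
    m       : season length (number of observations per season)
    alpha   : smoothing parameter for the level
    beta    : smoothing parameter for the trend
    gamma   : smoothing parameter for the seasonal component
    horizon : number of future steps to forecast
    """
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Simple initialization (an assumption of this sketch): level from the
    # first season's mean, trend from the change between the first two
    # seasonal means, seasonal terms as deviations from the initial level.
    first, second = y[:m], y[m:2 * m]
    level = sum(first) / m
    trend = (sum(second) - sum(first)) / (m * m)
    seasonals = [value - level for value in first]

    for t in range(m, len(y)):
        s_prev = seasonals[t % m]          # seasonal term one season back
        prev_level, prev_trend = level, trend
        level = alpha * (y[t] - s_prev) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        seasonals[t % m] = (gamma * (y[t] - prev_level - prev_trend)
                            + (1 - gamma) * s_prev)

    n = len(y)
    # h-step-ahead forecast: last level, plus h trend steps, plus the most
    # recent seasonal term for the matching position in the season.
    return [level + h * trend + seasonals[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


if __name__ == "__main__":
    toy = [10, 20, 30, 10, 20, 30, 10, 20, 30]  # hypothetical toy values
    print(holt_winters_additive(toy, m=3, alpha=0.5, beta=0.1,
                                gamma=0.1, horizon=4))
```

The seasonal term is added to the level and trend rather than multiplied, which is why the additive form suits series whose seasonal swings stay roughly constant in size. In practice, the smoothing parameters would be estimated from the data rather than fixed by hand.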
