COVID-19 Analysis, Prediction, and Misconceptions: A Computational Machine Learning Model as a New Paradigm in Scientific Research

Balachandran Krishnan, Sujatha Arun Kokatnoor, Vandana Reddy, Boppuru Rudra Prathap
DOI: 10.4018/978-1-7998-9805-4.ch010

Abstract

COVID-19 is an infectious disease caused by a newly discovered coronavirus (CoV). Open access (OA) resources have been critical in the context of the COVID-19 epidemic. OA aided in the development of a vaccine and informed the public health actions necessary to stop the virus from spreading. Many publishers implicitly acknowledged that OA was vital to promoting science in the fight against the disease. Accordingly, publishers have committed to OA publication and scholarly communication of disease-related scientific research. This chapter covers three issues based on the modeling of the CoV dataset. First, an exploratory data analysis is done to detect hidden facts and relevant information patterns about the affected, recovered, and death cases caused by the CoV, and about vaccination details. Second, a predictive model is developed using machine learning techniques to predict the number of COVID-19 positive cases in India. Third, a hybrid computational model is developed to identify the misconceptions that are spread through social media networks.

Background

The literature review is carried out based on the sub-topics addressed in this chapter.

Key Terms in this Chapter

Ridge Regression: Ridge regression is a technique for estimating the coefficients of multiple-regression models when the independent variables are highly correlated. It has been applied in various domains such as econometrics, chemistry, and engineering.
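
As an illustration only, the shrinkage effect can be sketched for two nearly collinear features using the closed-form solution of (X^T X + lambda I) w = X^T y. The function name `ridge_fit_2d`, the toy data, and the omission of an intercept are assumptions made for this sketch; they are not taken from the chapter.

```python
def ridge_fit_2d(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y for two features, no intercept."""
    # Entries of the regularised 2x2 matrix X^T X + lam * I.
    a = sum(r[0] * r[0] for r in X) + lam
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X) + lam
    # Entries of the vector X^T y.
    p = sum(r[0] * t for r, t in zip(X, y))
    q = sum(r[1] * t for r, t in zip(X, y))
    # Invert the symmetric 2x2 matrix explicitly.
    det = a * d - b * b
    return ((d * p - b * q) / det, (a * q - b * p) / det)


# Hypothetical toy data: the two columns are almost identical.
X = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1)]
y = [2.0, 4.1, 6.1, 7.9, 10.2]

w_ols = ridge_fit_2d(X, y, 0.0)    # ordinary least squares (lam = 0)
w_ridge = ridge_fit_2d(X, y, 1.0)  # ridge estimate with lam = 1
```

With a positive penalty the squared norm of the coefficient vector is smaller than in the unpenalised fit, which is the stabilising effect that makes ridge regression useful for correlated predictors.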

Seasonal Auto-Regressive Integrated Moving Average With eXogenous Factors (SARIMAX): SARIMAX extends ARIMA with seasonal terms and exogenous regressors. It is used to forecast daily COVID-19 cases in this chapter.

Neural Network (NN): A neural network is a set of algorithms that attempts to detect underlying relationships in a batch of data using a technique similar to how the human brain works. In this context, neural networks are systems of neurons that might be biological or artificial in origin.

Holt-Winters Model (HWM): HWM is a model of time series behavior. Forecasting usually requires a model, and Holt-Winters is a method for modelling three components of a time series: a typical value (average), a slope (trend) across time, and a cyclical repeating pattern (seasonality).
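
The three components can be sketched with the additive form of Holt-Winters smoothing. This is a minimal illustration, not the chapter's implementation: the function name `holt_winters_additive`, the initialisation from the first two seasons, and the parameter values are all assumptions, and the series must contain at least two full seasons.

```python
def holt_winters_additive(series, m, alpha, beta, gamma, horizon):
    """Additive Holt-Winters forecast with season length m."""
    # Assumed simple initialisation from the first two seasons.
    first = sum(series[:m]) / m
    second = sum(series[m:2 * m]) / m
    level = first
    trend = (second - first) / m
    seasonals = [x - first for x in series[:m]]

    for t in range(m, len(series)):
        s = seasonals[t % m]
        prev_level = level
        # Level: deseasonalised observation blended with the previous projection.
        level = alpha * (series[t] - s) + (1 - alpha) * (prev_level + trend)
        # Trend: change in level blended with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # Seasonality: deviation from the level blended with the old seasonal term.
        seasonals[t % m] = gamma * (series[t] - level) + (1 - gamma) * s

    n = len(series)
    return [level + h * trend + seasonals[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


# Hypothetical series with a period of 2 and no trend.
forecast = holt_winters_additive([1, 3, 1, 3, 1, 3, 1, 3], m=2,
                                 alpha=0.5, beta=0.5, gamma=0.5, horizon=2)
```

On this purely seasonal toy series the forecast continues the alternating pattern.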

Artificial Intelligence (AI): AI refers to the capacity of a computer, or of a robot controlled by a computer, to do tasks that are normally performed by people because they require human intellect and judgement.

Logistic Regression: Logistic regression is a statistical model that uses a logistic function to represent a binary dependent variable in its most basic form; however, many more advanced extensions exist. Logistic regression (or logit regression) in regression analysis is used to estimate the parameters of a logistic model (a form of binary regression).
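
A minimal sketch of the idea, assuming a single input feature and plain gradient descent on the log-loss: the logistic function maps a linear score to a probability, and the parameters are adjusted to fit binary labels. The names `sigmoid` and `fit_logistic`, the learning rate, and the toy data are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Estimate weight w and bias b by gradient descent on the mean log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean log-loss with respect to w and b.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical, non-separable toy data: larger x makes label 1 more likely.
xs = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
p_high = sigmoid(w * 4.5 + b)  # estimated probability of class 1 at x = 4.5
p_low = sigmoid(w * 0.5 + b)   # estimated probability of class 1 at x = 0.5
```

Statistical packages usually estimate these parameters by maximum likelihood with more robust optimisers; gradient descent is used here only to keep the sketch short.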

Random Forest (RF): Random forest is a supervised machine learning algorithm frequently used in classification and regression applications. It constructs decision trees from several samples of the data and uses their majority vote for classification and their average for regression.

Data Science (DS): Data science is an interdisciplinary subject that combines scientific techniques, procedures, algorithms, and systems to extract knowledge and insights from noisy, structured, and unstructured data.

Latent Dirichlet Allocation (LDA): The LDA is a generative statistical model that allows unobserved groups to explain why some parts of the data are similar.

Exploratory Data Analysis (EDA): EDA is a data analysis approach that enables the discovery of hidden information within a data collection. This technique is frequently used to derive inferences from data.

Exponential Smoothing (ES): ES is a univariate time series forecasting approach that may be expanded to accommodate data with a systematic trend or seasonal component. It is a strong forecasting approach that may be used in place of the popular Box-Jenkins ARIMA family of algorithms.
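
In its simplest form, without trend or seasonal components, the method can be sketched as a running weighted average in which recent observations receive more weight. The function name `simple_exponential_smoothing` and the choice to initialise with the first observation are assumptions for illustration.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed values of a series for a smoothing factor alpha in (0, 1]."""
    level = series[0]  # assumed initialisation: start at the first observation
    smoothed = [level]
    for x in series[1:]:
        # New level is a blend of the latest observation and the old level.
        level = alpha * x + (1 - alpha) * level
        smoothed.append(level)
    return smoothed


smoothed = simple_exponential_smoothing([10, 20, 30], alpha=0.5)
```

The last smoothed value serves as the one-step-ahead forecast; a larger alpha makes the forecast react faster to recent changes.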

FBProphet (FP): The FBProphet library, which is created by Facebook and is primarily used for time series forecasting, is used in the prediction analysis.

Centers for Disease Control and Prevention (CDC): The CDC is the national health protection agency of the United States, operating around the clock to protect the country from foreign and domestic health and safety risks and to improve its health security.

World Health Organization (WHO): The WHO, dedicated to the well-being of all people and informed by science, leads and champions worldwide efforts to provide everyone, everywhere, an equal chance to live a healthy life.

Autoregressive Integrated Moving Average (ARIMA): ARIMA is a statistical analysis model that uses time-series data to understand a data set better or predict future patterns in the data set. Autoregressive statistical models predict future values based on the previous values.
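
The autoregressive and integrated parts can be sketched with a stripped-down ARIMA(1,1,0): difference the series once, fit one autoregressive coefficient, and accumulate the predicted differences. This sketch assumes no intercept and no moving-average term, and it estimates the coefficient by least squares rather than maximum likelihood; all function names are hypothetical.

```python
def difference(series):
    """First-order differencing (the 'integrated' step with d = 1)."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(diffs):
    """Least-squares AR(1) coefficient without intercept."""
    num = sum(prev * cur for prev, cur in zip(diffs, diffs[1:]))
    den = sum(prev * prev for prev in diffs[:-1])
    return num / den


def forecast_arima_110(series, steps):
    """Forecast by predicting future differences and summing them back up."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    last_value, last_diff = series[-1], diffs[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = phi * last_diff   # autoregressive step on the differences
        last_value += last_diff       # undo the differencing
        forecasts.append(last_value)
    return forecasts


# Hypothetical series whose increments halve at every step.
forecasts = forecast_arima_110([0, 8, 12, 14, 15], steps=2)
```

Each forecast depends only on previous values of the series, which is the sense in which autoregressive models predict future values from past ones.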

Least Absolute Shrinkage and Selection Operator (LASSO): LASSO is a regression analysis approach that uses attribute selection and regularization to improve the predictability and interpretability of the final statistical model.
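
The attribute-selection effect comes from the L1 penalty, which can set a coefficient to exactly zero. A minimal sketch for a single feature without intercept, assuming the objective (1/2n) * sum((y - x*w)^2) + lam * |w|, whose minimiser is given by soft-thresholding; the names `soft_threshold` and `lasso_1d` and the toy data are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam, returning exactly 0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_1d(xs, ys, lam):
    """Closed-form LASSO coefficient for one feature and no intercept."""
    n = len(xs)
    rho = sum(x * y for x, y in zip(xs, ys)) / n   # feature-target covariance term
    scale = sum(x * x for x in xs) / n             # feature second moment
    return soft_threshold(rho, lam) / scale


xs = [1.0, -1.0, 1.0, -1.0]
ys = [2.0, -2.0, 2.0, -2.0]
w_small_penalty = lasso_1d(xs, ys, lam=0.5)  # coefficient is shrunk
w_large_penalty = lasso_1d(xs, ys, lam=3.0)  # coefficient is removed
```

With several features, the same soft-thresholding step is typically applied one coordinate at a time (coordinate descent), so weakly related attributes drop out of the model.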

Decision Trees (DTs): DTs are a non-parametric supervised learning method used for classification and regression. The objective is to learn simple decision rules from data attributes in order to predict the value of a target variable. A tree can be seen as a piecewise constant approximation.
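
The piecewise constant idea can be sketched with the smallest possible regression tree, a single split (a "stump") on one attribute. The names `fit_stump` and `predict_stump` and the toy data are hypothetical, and the sketch assumes the attribute takes at least two distinct values.

```python
def fit_stump(xs, ys):
    """Find the threshold on x that minimises the squared error of two constant fits."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no valid threshold between equal attribute values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def predict_stump(stump, x):
    """Apply the learned decision rule: one constant value on each side of the split."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


stump = fit_stump([1, 2, 3, 10, 11, 12], [5, 5, 5, 20, 20, 20])
```

A full decision tree repeats this split search recursively inside each region, producing more constant pieces.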

Machine Learning (ML): ML is a branch of artificial intelligence (AI) that allows software programs to improve their prediction accuracy without being explicitly programmed to do so. Machine learning algorithms use past data as input to forecast new output values.

Predictive Analytics (PA): PA is a subset of advanced analytics that predicts future events by combining historical data with statistical modelling, data mining tools, and machine learning. Companies use predictive analytics to discover hazards and opportunities by looking for trends in data.

Holt’s Linear Model (HLM): Holt's two-parameter model, sometimes known as linear exponential smoothing, is a prominent smoothing technique for forecasting data with a trend. Holt's model consists of three equations (level, trend, and forecast) that interact to provide a final forecast.
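
A minimal sketch of how a level update, a trend update, and a forecast step fit together, assuming the level is initialised to the first observation and the trend to the first difference; the function name `holt_linear_forecast` and the parameter values are hypothetical.

```python
def holt_linear_forecast(series, alpha, beta, horizon):
    """Forecast with Holt's linear (double) exponential smoothing."""
    level = series[0]               # assumed initial level
    trend = series[1] - series[0]   # assumed initial trend
    for x in series[1:]:
        prev_level = level
        # Level equation: blend the observation with the previous projection.
        level = alpha * x + (1 - alpha) * (prev_level + trend)
        # Trend equation: blend the latest level change with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # Forecast equation: extend the last level along the last trend.
    return [level + h * trend for h in range(1, horizon + 1)]


forecast_hlm = holt_linear_forecast([2, 4, 6, 8], alpha=0.5, beta=0.5, horizon=2)
```

For this perfectly linear toy series the forecast simply continues the line; on noisy data, alpha and beta control how quickly the level and trend adapt.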

LSTM-Regression: The LSTM (long short-term memory) model is a gated recurrent neural network, and bidirectional LSTM is an extension that processes the sequence in both directions. The crucial aspect is that these networks can retain information across time steps for later processing.
