Downscaling of Open Coarse Precipitation Data Using a Machine Learning Algorithm

Ismail Elhassnaoui, Zineb Moumen, Hicham Ezzine, Marwane Bel-lahcen, Ahmed Bouziane, Driss Ouazar, Moulay Driss Hasnaoui
Copyright: © 2021 | Pages: 34
DOI: 10.4018/978-1-7998-3343-7.ch001


In this chapter, the authors propose a novel statistical model with residual correction for downscaling the coarse-resolution TRMM 3B43 precipitation product. The study was carried out over Morocco, and its objective is to improve statistical downscaling of the TRMM 3B43 product using a machine learning algorithm. The statistical model is based on the Transformed Soil Adjusted Vegetation Index (TSAVI), elevation, and distance from the sea. TSAVI was retrieved using the quantile regression method. Stepwise regression was implemented by minimizing the Akaike information criterion and Mallows' Cp indicator. The model was validated using ten in-situ measurements from rain gauge stations (the most data available). The results show that the model provides the best fit to the TRMM 3B43 product and estimates precipitation at 1 km with good accuracy according to R², RMSE, bias, and MAE. In addition, TSAVI improved model accuracy in the humid bioclimatic stage and, to some extent, in the Saharan region, owing to its capacity to reduce the effect of soil brightness.
Chapter Preview


Precipitation is the most significant component of the hydrologic cycle (Elhassnaoui et al., 2019). Indeed, precipitation data are a fundamental requirement across meteorology, hydrology, groundwater, streamflow, flood, drought, agriculture, and economics (Maidment, 1993). In the context of climate change, providing highly precise precipitation data, a vital variable that describes a variety of features and phenomena, is very important (Chen & Li, 2016; L. Tang et al., 2015; Zhao et al., 2017).

Since ancient times, rain gauge stations have been an essential tool for observing precipitation from a hydro-meteorological perspective (Schneider et al., 2014; Schwaller & Morris, 2011). However, due to the topographical characteristics of catchments, rain gauge networks suffer from sparse spatial distribution (Guofeng et al., 2016; Maggioni et al., 2016). A sparse rain gauge network cannot provide a statistically meaningful rainfall distribution through interpolation (Kro & Law, 2005). Furthermore, rain gauge stations face many impediments to recording and monitoring precipitation, namely weather modification, wind speed, and the type of precipitation, which lead to significant errors in rainfall observation (Essery & Wilcock, 1991; Peck, 1974; Rodda & Smith, 1986; John Rodda & Dixon, 2011; Spring & Peck, 1980).

To overcome rain gauge errors and provide quantitative spatial measurements of precipitation, remote sensing techniques have been developed (Gregg & Casey, 2004; Karaska et al., 2004). Moreover, passive satellite sensors can provide accurate global coverage of precipitation data without data interpolation (Duan & Bastiaanssen, 2013; Kidd & Levizzani, 2011; Zhang & Li, 2018). Indeed, satellite sensors combined with a set of algorithms that exploit microwave radiation have high potential for measuring precipitation, because microwaves interact directly with raindrops through emission, absorption, and scattering (Ezzine et al., 2017; Liu et al., 2018; Maidment, 1993). The temperature threshold method is the standard method for estimating rainfall by remote sensing (Arkin et al., 1980; Arkin et al., 1979).

Satellite sensor programs have become an advanced and efficient tool for providing accurate precipitation data (Silva & Lopes, 2017) with broad spatial coverage (Ezzine et al., 2017; Irvem & Ozbuldu, 2019; Kidd, 2001).

Key Terms in this Chapter

Stepwise Regression: Is a multiple regression method that aims to identify the best-fitting set of model predictors. This method tries to optimize the choice of the explanatory variables that predict the dependent variable. Stepwise regression builds the model automatically and iteratively, adding or removing one variable at each step.
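
As a minimal sketch of this idea, forward stepwise selection can be written with NumPy alone. The function names `gaussian_aic` and `forward_stepwise`, the dictionary-of-predictors input, and the least-squares form of AIC used for scoring are illustrative assumptions; they are not taken from the chapter, whose implementation is not shown in this preview.

```python
import numpy as np

def gaussian_aic(y, X):
    """AIC of an ordinary least-squares fit, up to an additive constant.

    Assumes Gaussian errors, so AIC = n * ln(RSS / n) + 2 * k,
    where k is the number of fitted coefficients (columns of X).
    """
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + 2 * k

def forward_stepwise(y, predictors):
    """Greedy forward selection: add the predictor that lowers AIC the most.

    `predictors` maps a (hypothetical) variable name to a 1-D array.
    Stops when no remaining predictor improves the AIC.
    """
    n = len(y)
    design = np.ones((n, 1))            # start from the intercept-only model
    best_aic = gaussian_aic(y, design)
    selected, remaining = [], list(predictors)
    while remaining:
        scores = {
            name: gaussian_aic(y, np.column_stack([design, predictors[name]]))
            for name in remaining
        }
        candidate = min(scores, key=scores.get)
        if scores[candidate] >= best_aic:
            break                       # no candidate improves the criterion
        best_aic = scores[candidate]
        design = np.column_stack([design, predictors[candidate]])
        selected.append(candidate)
        remaining.remove(candidate)
    return selected, best_aic
```

Calling `forward_stepwise(precip, {"tsavi": ..., "elevation": ..., "distance_to_sea": ...})` would return the variables in the order they were added, together with the final AIC. A backward or bidirectional variant follows the same pattern with a removal step.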

Transformed Soil Adjusted Vegetation Index (TSAVI): Is a vegetation index that seeks to minimize the soil brightness that adds noise to the electromagnetic signal reflected from vegetation.
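
For illustration, the commonly cited form of TSAVI can be computed as below. This sketch assumes the usual parameterization with soil-line slope `a`, soil-line intercept `b`, and an adjustment factor `x` defaulting to 0.08; the preview does not state which parameter values or bands the chapter used, so all names and defaults here are assumptions.

```python
import numpy as np

def tsavi(nir, red, a, b, x=0.08):
    """Transformed Soil Adjusted Vegetation Index (commonly cited form).

    nir, red : near-infrared and red reflectance (scalars or arrays)
    a, b     : slope and intercept of the soil line, NIR = a * red + b
    x        : adjustment factor that damps soil-background effects
    """
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    numerator = a * (nir - a * red - b)
    denominator = a * nir + red - a * b + x * (1.0 + a ** 2)
    return numerator / denominator
```

Two properties make the formula easy to sanity-check: a pixel lying exactly on the soil line gives a value of zero, and with `a = 1`, `b = 0`, `x = 0` the expression reduces to NDVI, `(nir - red) / (nir + red)`.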

Remote Sensing: Is the science of retrieving information about an object, a landscape, or a phenomenon without being in contact with it. Remote sensing techniques are based on sensors that retrieve information by analyzing the electromagnetic spectrum reflected from an object, a landscape, or a phenomenon. There are two types of sensors: passive sensors, which respond to the amount of light naturally reflected from an object, a landscape, or a phenomenon, and active sensors, such as radars, which transmit their own signal and measure the part of it that is reflected back.

Machine Learning: Is an algorithm-based technique that learns from sample data, known as training data. A machine learning algorithm builds a mathematical model that analyzes and computes data in order to make predictions or decisions. It is a significant method in widespread use among data scientists. There are two main types of machine learning techniques: supervised and unsupervised learning.

Akaike Information Criterion (AIC): Is a statistical criterion that aims to choose the best independent variables that can explain the dependent variable. A variable is considered a good predictor when including it gives the minimum AIC among the candidate variables.
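
For reference, the standard definition of the criterion is given below, together with the form it takes for a least-squares regression under the assumption of Gaussian errors. The symbols are introduced here for illustration: $k$ is the number of estimated parameters, $\hat{L}$ the maximized likelihood, $n$ the number of observations, and $\mathrm{RSS}$ the residual sum of squares.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
\qquad\text{and, for least squares with Gaussian errors,}\qquad
\mathrm{AIC} = n\,\ln\!\left(\frac{\mathrm{RSS}}{n}\right) + 2k + C
```

Here $C$ is a constant that is the same for every candidate model fitted to the same data, so it does not affect which model has the minimum AIC.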

Mallows’ Cp: Is a statistical indicator that aims to assess the precision and bias of the predictors. A set of predictors is considered unbiased and precise when its Cp value is small and close to the number of predictors.
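
A short sketch of how the indicator is usually computed may help. It assumes the textbook definition, Cp = RSS_p / s² − n + 2p, where s² is the residual variance of the full model and p counts the fitted coefficients (intercept included). The helper names `rss` and `mallows_cp` are hypothetical.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares of an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((y - X @ beta) ** 2))

def mallows_cp(y, X_subset, X_full):
    """Mallows' Cp of a candidate model relative to the full model.

    X_subset : design matrix of the candidate model (n rows, p columns)
    X_full   : design matrix containing every available predictor
    """
    n, p = X_subset.shape
    # Error variance estimated from the full model's residuals.
    sigma2 = rss(y, X_full) / (n - X_full.shape[1])
    return rss(y, X_subset) / sigma2 - n + 2 * p
```

By construction the full model always has Cp equal to its own number of coefficients, so the indicator is informative only for comparing smaller subsets: a subset whose Cp is far above p is likely missing an important predictor.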

Spatial Downscaling: Is a technique that aims to bridge the gap between global or regional data and local data. Spatial downscaling mainly seeks to estimate global or regional data at a fine scale, taking into account the environmental factors of the local area.
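
The following is a rough sketch of regression-based downscaling with residual correction, under several simplifying assumptions that are not taken from the chapter: a plain linear model, block averaging to bring fine-scale predictors to the coarse grid, and nearest-neighbour replication of the coarse residuals (smoother interpolation is often used in practice). The function names are hypothetical.

```python
import numpy as np

def block_mean(fine, factor):
    """Aggregate a fine grid to a coarse grid by averaging factor x factor blocks."""
    h, w = fine.shape
    return fine.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def design_matrix(grids):
    """Stack an intercept column and one flattened column per predictor grid."""
    return np.column_stack([np.ones(grids[0].size)] + [g.ravel() for g in grids])

def downscale_with_residuals(coarse_precip, fine_predictors, factor):
    """Fit at coarse scale, predict at fine scale, then add back coarse residuals."""
    coarse_predictors = [block_mean(p, factor) for p in fine_predictors]
    X_coarse = design_matrix(coarse_predictors)
    beta, *_ = np.linalg.lstsq(X_coarse, coarse_precip.ravel(), rcond=None)

    # Residual = what the regression fails to explain at the coarse scale.
    fitted = (X_coarse @ beta).reshape(coarse_precip.shape)
    residual_coarse = coarse_precip - fitted

    # Apply the same coefficients to the fine-scale predictors.
    fine_shape = fine_predictors[0].shape
    fine_model = (design_matrix(fine_predictors) @ beta).reshape(fine_shape)

    # Nearest-neighbour replication of each coarse residual over its block.
    residual_fine = np.kron(residual_coarse, np.ones((factor, factor)))
    return fine_model + residual_fine
```

With these particular choices, averaging the fine-scale output back to the coarse grid reproduces the original coarse field, which is one reason the residual step is useful: the downscaled product stays consistent with the satellite estimate it was derived from.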

Quantile Regression: Is a regression technique used in data science. It is an extension of linear regression that makes no assumptions about the distribution of the residuals.
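
To make the idea concrete, quantile regression replaces the squared-error loss of ordinary regression with the asymmetric "pinball" loss. The sketch below shows that loss and, as the simplest possible case, a constant-only model chosen from the observed values. The function names are illustrative assumptions, and a full quantile regression would minimize the same loss over regression coefficients instead.

```python
import numpy as np

def pinball_loss(y, y_pred, tau):
    """Mean pinball (quantile) loss for quantile level tau in (0, 1).

    Under-predictions are weighted by tau, over-predictions by (1 - tau).
    """
    r = np.asarray(y, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.maximum(tau * r, (tau - 1.0) * r)))

def best_constant(y, tau):
    """Observed value that minimizes the pinball loss as a constant prediction."""
    y = np.asarray(y, dtype=float)
    losses = [pinball_loss(y, np.full_like(y, c), tau) for c in y]
    return float(y[int(np.argmin(losses))])
```

For `tau = 0.5` the loss is proportional to the absolute error, so the best constant is the median, which is why the method is far less sensitive to outliers than a least-squares fit.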
