Most research on document warehousing has addressed the modeling and OLAP analysis of documents. We classify the contributions in the literature into two main categories: i) works that use or extend the multidimensional (MD) model, and ii) works that propose a specific model.
Works in the first category adapt the conventional MD model initially proposed for numeric data warehouses (DWs); examples include (Golfarelli, Maio, & Rizzi, 1998), (Kimball & Ross, 2003), (Inmon, 2005), and (Azabou, Khrouf, Feki, Soulé-Dupuy, & Vallès, 2017). Within this category, some authors have built data warehousing models, especially XML models, by processing raw XML documents into a specified data warehouse repository, such as (Rusu, Rahayu, & Taniar, 2004) and (Rusu, Rahayu, & Taniar, 2005). Other authors have enriched the conventional MD model with extensions specific to textual modeling and processing for data-centric documents, such as (Jin, Han, Cao, Luo, Ding, & Lin, 2010) and (Liu, Zhou, Pan, Qian, Cai, & Lian, 2009), whereas others, such as (Lin, Ding, Han, Zhu, & Zhao, 2008), extend the MD model to build document-centric document warehouses.