Comparing Different Sparse Matrix Storage Structures as Index Structure for Arabic Text Collection

Comparing Different Sparse Matrix Storage Structures as Index Structure for Arabic Text Collection

Basel Bani-Ismail (Department of Computer Science, Sultan Qaboos University, Muscat, Oman) and Ghassan Kanaan (Department of Computer Science, Amman Arab University, Amman, Jordan)
Copyright: © 2012 |Pages: 16
DOI: 10.4018/ijirr.2012040105
OnDemand PDF Download:
$37.50

Abstract

In the authors’ study they evaluate and compare the storage efficiency of different sparse matrix storage structures as index structure for Arabic text collection and their corresponding sparse matrix-vector multiplication algorithms to perform query processing in any Information Retrieval (IR) system. The study covers six sparse matrix storage structures including the Coordinate Storage (COO), Compressed Sparse Row (CSR), Compressed Sparse Column (CSC), Block Coordinate (BCO), Block Sparse Row (BSR), and Block Sparse Column (BSC). Evaluation depends on the storage space requirements for each storage structure and the efficiency of the query processing algorithm. The experimental results demonstrate that CSR is more efficient in terms of storage space requirements and query processing time than the other sparse matrix storage structures. The results also show that CSR requires the least amount of disk space and performs the best in terms of query processing time compared with the other point entry storage structures (COO, CSC). The results demonstrate that BSR requires the least amount of disk space and performs the best in terms of query processing time compared with the other block entry storage structures (BCO, BSC).
Article Preview

2. Sparse Matrices

A matrix is considered as sparse when it is beneficial to determine which entries have non-zero values (Tavakoli, 1997). In many cases, the matrices are sparse; therefore, only the non-zero elements and their indices are stored to save storage and reduce computational requirements. The performance of sparse matrix-vector multiplication heavily depends on the used storage structure and affects the performance of many applications in scientific computations, economic modeling, signal and image processing, and IR (Shahnaz & Usman, 2011).

2.1. General Sparse Matrix Background

The nature of text data in a matrix structure results in a very sparse matrix. To save space and run time, it is important to store only the non-zero elements. There are different compression structures to compress the sparse matrix by storing only non-zero elements. Among these compression structures is Scalar ITPACK (CSR) (Peters, 1991; Petiton et al., 1993). The BLAS (Basic Linear Algebra Subprograms) technical forum suggests sparse matrix structures such as Coordinate (COO), Compressed Sparse Row (CSR), Compressed Sparse Column (CSC), Sparse Diagonal (DIA), Block Coordinate (BCO), Block Compressed Sparse Row (BSR), Block Compressed Sparse Column (BSC), Block Sparse Diagonal (BDI), and Variable Block Compressed Sparse Row (VBR) (Carney et al., 1994).

Complete Article List

Search this Journal:
Reset
Open Access Articles: Forthcoming
Volume 7: 4 Issues (2017)
Volume 6: 4 Issues (2016)
Volume 5: 4 Issues (2015)
Volume 4: 4 Issues (2014)
Volume 3: 4 Issues (2013)
Volume 2: 4 Issues (2012)
Volume 1: 4 Issues (2011)
View Complete Journal Contents Listing