Compressing and Vague Querying (XCVQ) Design

Badya Al-Hamadani (University of Huddersfield, UK) and Joan Lu (University of Huddersfield, UK)
DOI: 10.4018/978-1-4666-1975-3.ch010


As shown in the literature review in the previous chapter, a good number of studies address compressing XML documents and querying the compressed version without fully decompressing it. However, vague queries, which are one of the most important query types, have so far been processed only against raw XML documents, not compressed ones. Following the SDM, the complete system is designed first and then implemented; the implementation can be seen in Appendix B in Chapter 12. This chapter presents the design architecture of XCVQ (XML Compressing and Vague Querying), which compresses XML documents and uses the compressed files to retrieve information in response to vague queries. It starts with the main architecture of the system, followed by the design of each of its parts, namely XCVQ's compressor, decompressor, and query processor.

1.1 System Architecture

As illustrated in Figure 1, the XCVQ system consists of two main stages. The first is a new XML compression technique that converts ordinary XML documents into a compressed version. The second is a retrieval technique that processes vague XPath queries in order to retrieve the relevant information from the compressed documents.

Figure 1. Preliminary architecture of XCVQ

The design of XCVQ does not rely on the XML Schema or the DTD of the document, for several reasons:

1. The main purpose of XCVQ is to process vague queries, which are usually written, as illustrated in a previous section, by inexperienced users who may not want another technology linked to their documents.

2. Even if a schema for a document exists, it may not be accessible to the user.

3. Since the main purpose of any compressor is to reduce storage space and transmission bandwidth, XCVQ saves the memory that would otherwise be required to store the schema.

As illustrated in the design of XCVQ, all the compressed XML documents are stored in a repository that is used in the retrieval process. To the best of our knowledge, XCVQ is the first retrieval technique able to retrieve information from more than one XML document without requiring the user to specify in advance which documents to search and without depending on the documents' schemas. This approach helps users retrieve more relevant information no matter which documents contain it. The complete design of XCVQ is illustrated in Figure 2.

Figure 2. The complete design of XCVQ
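
The repository idea can be sketched in a few lines of Python. This is only an illustrative sketch under loud assumptions, not the XCVQ design itself: the `Repository` class and its methods are hypothetical names, `zlib` is a stand-in for the XCVQ-C compressor, and the query step fully decompresses each document before evaluating a path, whereas the chapter describes retrieving from the compressed files. What the sketch does show is one query being answered from every stored document without naming the documents in advance and without consulting any schema.

```python
import zlib
import xml.etree.ElementTree as ET


class Repository:
    """Hypothetical store of compressed XML documents, keyed by name."""

    def __init__(self):
        self._docs = {}

    def add(self, name, xml_text):
        # Assumption: zlib stands in for XCVQ-C, whose technique is different.
        self._docs[name] = zlib.compress(xml_text.encode("utf-8"))

    def query(self, path):
        """Evaluate one ElementTree path against every stored document."""
        hits = []
        for name, blob in self._docs.items():
            # Stand-in only: this step decompresses the whole document.
            root = ET.fromstring(zlib.decompress(blob).decode("utf-8"))
            for elem in root.findall(path):
                hits.append((name, elem.text))
        return hits


repo = Repository()
repo.add("books.xml", "<library><book><title>XML Basics</title></book></library>")
repo.add("papers.xml", "<papers><paper><title>Querying XML</title></paper></papers>")

# The user asks for titles without knowing which documents hold them
# or where in each document's structure they appear.
results = repo.query(".//title")
print(results)
```

The caller never passes a document name or a schema to `query`; both sample documents use different structures, and the same path expression is tried against each of them.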

The following sections present the design of each part of the system, starting with the XCVQ-Compressor (XCVQ-C), followed by the XCVQ-Decompressor (XCVQ-D), and ending with the XCVQ-Query Processor (XCVQ-QP).


1.2 XCVQ-C Design

The XCVQ-C compressor takes an XML document as input and produces a compressed version of it in several steps. The example in Figure 3, from (, 2006b), is used in the following sections to illustrate each step.

Figure 3. An XML example
