Lossless Compression of Semi-Ordered Trees

Lossless Compression of Semi-Ordered Trees

Habibeche Salah eddine, Ben-Naoum Farah
Copyright: © 2022 |Pages: 21
DOI: 10.4018/IJARB.290341
OnDemand:
(Individual Articles)
Available
$37.50
No Current Special Offers
TOTAL SAVINGS: $37.50

Abstract

In this paper, authors are interested in the problem of lossless compression of unlabeled semi-ordered trees. Semi-ordered trees are a class of trees that present an order between some sibling while some other sibling are unordered. They offer a wide possibility of applications especially for the representation of plants architecture. Authors show that these trees present remarkable compression properties covering those of ordered and unordered trees. To illustrate this approach, authors apply these notions to a particular class of semi-ordered trees which is the most studied branching structure particularly for a botanical motivation, namely axial trees.
Article Preview
Top

Introduction

Trees are among the most important nonlinear structure arising in computer science (Knuth,2011). They occur in several diverse areas such as genealogical studies, computer vision and pattern recognition (Shih,1991), programming compilation and natural language processing.

The diversity of these application fields and the nature of the already studied trees leads to divide this family into three major classes: ordered trees, when there is a predefined order within each set of sibling nodes in the tree (Zhang,1995), unordered trees, when there is no significant order within each set of sibling nodes (Zhang,1996), and semi-ordered trees if some elements of a set of sibling are unordered while some others can be totally ordered (Aïda,2009).

This diversity induces a wide variety of algorithms resolving queries on tree structures. However, the increase in the amount of tree data especially in biology and also the size of each data requires the implementation of appropriate tools (O’Donoghue,2010). This requires not only effective processing, but also effective representation taking into account the complexity of tree data. A way to address the data storage problem is to use redundancies within the object and store them once. This principle can be applied for the compression of tree data since they often present internal redundancies of substructures in different scales of the tree. Hence, the problem of tree compression aims to condense its similar parts to build a reduced form that keeps the major original tree properties and achieves a better algorithmic conditions.

In the literature, data compression is an operation which consists in passing from an initial representation of an object to a more compact representation, that is to say requiring less storage space. A compression algorithm takes as input a data IJARB.290341.m01 and generates a representation IJARB.290341.m02 which requires fewer bits, and there is a reconstruction algorithm which generates from the compressed representation IJARB.290341.m03 a reconstruction IJARB.290341.m04 (Figure 1). Based on the reconstruction requirements, data compression techniques can be divided into two classes: lossless compression, in which IJARB.290341.m05 is identical to IJARB.290341.m06, and lossy compression which allows IJARB.290341.m07 to be different from IJARB.290341.m08 (Sayood,2017). In this paper, we are concerned by the lossless compression.

Figure 1.

Compression and reconstruction scheme

IJARB.290341.f01

Several attempts have been proposed to solve the problem of ordered tree compression. In (Salah,2019) authors assign signatures to each node of the input tree to identify similar subtrees. For that, subtrees rooted in nodes with similar signatures are replaced by a single occurrence, then they used hierarchical relations between these occurrences to form a graph in which they augment each of its edges by the order of the child occurrence among the other children of the parent. This additional information makes possible the exact reconstitution of the initial tree from its compression. Bill and al in (Bill,2015) present another tree compression called top tree compression, that exploits repetitive patterns in a tree and that support navigational queries in a IJARB.290341.m09 time. Ordered tree compression showed its importance in the representation of structured files (Lohrey,2013), this has largely contributed to the processing of queries on XML file compression, where Sakr presented a survey in this context (see (Sakr,2009) for review).

Complete Article List

Search this Journal:
Reset
Volume 13: 1 Issue (2024): Forthcoming, Available for Pre-Order
Volume 12: 2 Issues (2022): 1 Released, 1 Forthcoming
Volume 11: 2 Issues (2021)
Volume 10: 2 Issues (2020)
Volume 9: 2 Issues (2019)
View Complete Journal Contents Listing