1. Introduction
XML (World Wide Web Consortium [W3C], 2008) is a widely used open standard for representing and disseminating content over the Internet. It is one of the most commonly used formats for exchanging data between otherwise incompatible systems. XML is well known for its flexible and adaptable structure, which lets users define their own document structure according to their requirements. XML documents are used extensively in real-world application areas (Ashish, & Maluf, 2009; Maluf, Bell, Ashish, Knight, & Tran, 2005; Singh, 2007) such as e-commerce, social networks, research communities, healthcare, enterprises, and government and private organizations, with file sizes ranging from kilobytes (KB) to exabytes (EB). XML big data is an active research area. XML documents play a prominent role in publish/subscribe systems (Datta, Gradinariu, Raynal, & Simon, 2003; Kundu, & Bertino, 2008), which efficiently disseminate selected XML content to different subscribed users. A publish/subscribe system typically follows a service-oriented architecture (SOA). SOA is an architecture that provides services and is widely adopted in large-scale enterprises (Alwadain, Fielt, Korthaus, & Rosemann, 2013; Beydoun, Xu, & Sugumaran, 2013). SOA can be implemented using web services.
A publish/subscribe system (Sankari, & Bose, 2014) involves three entities: the producer, the consumer and the publisher. The producer is the owner of the XML document and disseminates selected content of that document to subscribed consumers through the publisher. To assure the confidentiality and integrity of the content to be disseminated, the producer has to securely label, encode and encrypt the XML document. Consumers who are authorized for an XML document subscribe to the producer. Subscribed consumers receive from the producer the secure labels for the XML content they may access, together with credentials for authentication and decryption. The producer sends the securely labeled, encoded and encrypted selective XML content to the publisher. The publisher, a third party also called a message broker, receives the encrypted XML content along with the details of the subscribed consumers from the producer. The publisher disseminates the encrypted XML content to the respective subscribed consumers based on their requests. Each subscribed consumer receives the encrypted XML content and decrypts it.
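The message flow among the three entities can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names, the dotted string labels, and in particular `toy_cipher`, which is an insecure placeholder standing in for whatever encryption a real system would use. The encoding and secure-labeling steps are omitted, so this is not the scheme of the cited work.

```python
import hashlib
import secrets
from dataclasses import dataclass, field


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream.

    Placeholder only, NOT secure. Applying it twice with the
    same key returns the original bytes.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


@dataclass
class Publisher:
    """Third-party broker: stores ciphertext only, never holds keys."""
    store: dict = field(default_factory=dict)  # consumer id -> {label: ciphertext}

    def receive(self, consumer_id, label, ciphertext):
        self.store.setdefault(consumer_id, {})[label] = ciphertext

    def request(self, consumer_id, label):
        return self.store[consumer_id][label]


@dataclass
class Producer:
    """Owner of the document, held here as label -> XML fragment text."""
    fragments: dict
    subscriptions: dict = field(default_factory=dict)

    def subscribe(self, consumer_id, labels):
        # Hand the consumer a decryption key and the labels it may access.
        key = secrets.token_bytes(32)
        self.subscriptions[consumer_id] = (key, list(labels))
        return key, list(labels)

    def disseminate(self, publisher):
        # Send only the subscribed fragments, encrypted, to the publisher.
        for consumer_id, (key, labels) in self.subscriptions.items():
            for label in labels:
                plaintext = self.fragments[label].encode()
                publisher.receive(consumer_id, label, toy_cipher(key, plaintext))


# Hypothetical document fragments and consumer.
producer = Producer({"1.1": "<title>A</title>", "1.2": "<price>10</price>"})
publisher = Publisher()
key, labels = producer.subscribe("alice", ["1.1"])
producer.disseminate(publisher)
ciphertext = publisher.request("alice", "1.1")
print(toy_cipher(key, ciphertext).decode())
```

In this sketch the publisher holds only ciphertext indexed by consumer and label, which mirrors the role separation described above: the key travels directly from producer to consumer and never through the broker.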
The producer needs to label the XML document in order to uniquely identify its content. To accomplish this, an XML document can be viewed as an XML tree by following a standard called the document object model (DOM). DOM (W3C, 2015) allows an XML document to be represented as an XML tree by exploiting the hierarchical structural relationships that exist in the document. Elements, attributes, content, etc., present in an XML document are usually represented as nodes in the XML tree. An XML labeling scheme assigns a unique label to every node of the XML tree, so that each label identifies exactly one node. The basic requirements of an XML labeling scheme are a small label size, an efficient labeling time, and labels that preserve the structural relationships among the nodes of the XML tree. Document order (DO), sibling, parent-child (PC), ancestor-descendant (AD) and lowest common ancestor (LCA) are the structural relationships that normally exist between the nodes of an XML tree. An XML label acts as a key that distinguishes the content of an XML document. The producer sends the XML labels to the subscribed consumers as additional information alongside the actual XML content. From these labels, a consumer can deduce structural information about the XML document that the producer holds. Therefore, any consumer can become an internal attacker and exploit this information leakage to mount a security attack. Hence, the producer prefers a secure XML labeling scheme that retains the basic features of a labeling scheme while preventing information leakage, thereby assuring secure content dissemination. Enhanced Dewey coding (Sankari, & Bose, 2013) and secure Dewey coding (Sankari, & Bose, 2014) are two recent secure XML labeling schemes proposed for publish/subscribe systems.
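To make the structural relationships concrete, the following sketch labels an XML tree with plain Dewey labels, in which each label is the path of 1-based child positions from the root. This is the basic, non-secure Dewey scheme, shown only as an illustration; it is not the enhanced or secure Dewey coding cited above. The sample document and all function names are hypothetical, and only element nodes are labeled for brevity.

```python
import xml.etree.ElementTree as ET


def dewey_labels(root):
    """Map each element's Dewey label (a tuple of 1-based positions) to its tag."""
    labels = {}

    def visit(node, label):
        labels[label] = node.tag
        for position, child in enumerate(node, start=1):
            visit(child, label + (position,))

    visit(root, (1,))
    return labels


def is_ancestor(a, b):
    """AD: a is a proper prefix of b."""
    return len(a) < len(b) and b[:len(a)] == a


def is_parent(a, b):
    """PC: b is a with exactly one more component."""
    return len(b) == len(a) + 1 and b[:-1] == a


def are_siblings(a, b):
    """Sibling: distinct labels that share the same parent prefix."""
    return a != b and len(a) == len(b) and a[:-1] == b[:-1]


def lca(a, b):
    """LCA: the longest common prefix of the two labels."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return tuple(prefix)


def precedes(a, b):
    """DO: lexicographic tuple order matches preorder document order."""
    return a < b


# Hypothetical sample document.
SAMPLE = (
    "<catalog>"
    "<book><title>A</title><price>10</price></book>"
    "<book><title>B</title></book>"
    "</catalog>"
)

labels = dewey_labels(ET.fromstring(SAMPLE))
for label in sorted(labels):
    print(".".join(map(str, label)), labels[label])
```

The same sketch shows why plain labels leak structure: a consumer who is given only the label 1.2.1 can already infer that the node sits at depth three and that the root has at least two children, without having access to any of the other nodes.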