Content-based routing is a form of data delivery whereby the flow of messages is driven by their content rather than by the IP address of their destination. With the recognition of XML as the standard for data exchange, specialized XML routing services have become necessary. In this chapter, the authors first demonstrate the relevance of such systems by presenting different real-world application scenarios where XML routing systems are needed and/or employed. Then, they present a survey of the current state of the art. Lastly, they identify open issues and problems that have yet to be investigated and suggest directions for further research in the context of such systems.
Content-based routing is a form of data delivery whereby the flow of messages is driven by their content rather than by the IP address of their destination. Specifically, in XML routing, there is a continuous stream of XML messages (usually, each message carries one XML document) from data producers to consumers, without any party having knowledge of the others (Snoeren, 2001). Message transmission is performed by an overlay network of application-level, content-based routers (called message brokers or XML routers) that match data messages against registered client subscriptions and, based on that matching, forward the messages to output links, i.e., other routers or clients. The task of matching incoming messages against the set of client subscriptions is called message filtering.
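The matching and forwarding behaviour described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the design of any system surveyed in this chapter: the class names `Broker` and `Client`, the `deliver` and `subscribe` methods, the sample messages, and the choice of expressing subscriptions in the limited XPath subset supported by Python's standard `xml.etree.ElementTree` module are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


class Client:
    """Hypothetical consumer endpoint that records the messages it receives."""

    def __init__(self, name):
        self.name = name
        self.inbox = []

    def deliver(self, xml_text):
        self.inbox.append(xml_text)


class Broker:
    """Hypothetical content-based router: forwards a message on every
    output link that has at least one matching subscription."""

    def __init__(self, name):
        self.name = name
        self.routes = []  # list of (subscription path, output link)

    def subscribe(self, path, link):
        # `link` is any object with a deliver() method:
        # a Client or another Broker.
        self.routes.append((path, link))

    def deliver(self, xml_text):
        root = ET.fromstring(xml_text)
        forwarded = set()
        for path, link in self.routes:
            if id(link) in forwarded:
                continue  # send at most one copy per output link
            # Message filtering: does the subscription select any element?
            if root.find(path) is not None:
                forwarded.add(id(link))
                link.deliver(xml_text)


# A two-broker overlay: producers publish to `core`, clients attach to `edge`.
core = Broker("core")
edge = Broker("edge")
alice = Client("alice")
bob = Client("bob")

core.subscribe(".//stock", edge)  # edge wants every stock message
edge.subscribe(".//stock[@symbol='ACME']", alice)
edge.subscribe(".//stock[@symbol='INIT']", bob)

# The producer only talks to `core`; it never names a consumer.
core.deliver('<quote><stock symbol="ACME"><price>12.5</price></stock></quote>')
core.deliver('<news><headline>Markets open</headline></news>')
```

In this sketch the producer addresses no recipient: the first message reaches `alice` only because its content satisfies her subscription, while the second message is dropped at `core` because no registered subscription matches it. Evaluating every subscription in turn, as done here, is the naive form of message filtering; it does not reflect the indexing or aggregation techniques proposed in the literature.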
This form of communication is widely employed by content-based information dissemination services, which are usually instantiated as publish/subscribe systems (pub/sub for short). For example, pub/sub systems have created opportunities for new applications such as alert and notification services that inform interested users of new products on the market, stock price changes, currency fluctuations, better offers and so on. Furthermore, with the expansion of Web services, new pub/sub systems are released regularly. For instance, online travel agencies such as Priceline.com and Hotwire.com inform their clients of price changes and hot deals that take the subscriber’s interests into consideration. Likewise, Ticketmaster.com sends its users email alerts about upcoming events and pre-sale information according to the artists and locations each user has signed up for.
With the recognition of XML as the standard for data exchange, specialized XML-aware information dissemination services have become necessary (Diao, 2004). These services can be implemented as publish/subscribe systems in which the information to be routed is encoded using XML, and the user subscriptions (or profiles) are expressed using XML query languages. Figure 1 illustrates the general architecture of an XML routing system.
Figure 1. General architecture of an XML routing system
Recent research on XML-aware information dissemination has investigated issues related to different parts of the routing system architecture. The most relevant aspects include: the discovery of semantic communities of users with similar interests (Chand, 2007); the construction of the overlay dissemination network structure (Fenner, 2005; Diao, 2004; Snoeren, 2001); the indexing and aggregation of the profiles within a message broker (Chan, 2002; Diao, 2003; Gong, 2005; Kwon, 2005; Li, 2007; Moro, 2007a; Raj, 2007); the distribution of consumer profiles (Diao, 2004; Li, 2007; Papaemmanouil, 2005; Yoo, 2006); the encoding of the routed messages (Vagena, 2007a; Vagena, 2007b); the message filtering task (Altinel, 2000; Chan, 2002; Diao, 2003; Gong, 2005; He, 2006; Li, 2007; Kwon, 2005; Moro, 2007a; Raj, 2007; Tian, 2004; Vagena, 2007a; Vagena, 2007b); in-situ transformation of the original information (Diao, 2004); and computation sharing among message brokers (Chan, 2007).