Semantic Matching, Propagation and Transformation for Composition in Component-Based Systems

Eric Bouillet (IBM Research, USA), Mark Feblowitz (IBM Research, USA), Zhen Liu (IBM Research, USA), Anand Ranganathan (IBM Research, USA) and Anton Riabov (IBM Research, USA)
Copyright: © 2012 | Pages: 20
DOI: 10.4018/978-1-4666-0261-8.ch008


Composing software applications from component parts in response to high-level goals is a long-standing and highly challenging problem. We target composition in flow-based information processing systems and demonstrate how application composition and component development can be facilitated by the use of semantically described application metadata. The semantic metadata describe both the data flowing through each application and the processing performed in the associated application code. In this chapter, we explore some of the key features of the semantic model, including the matching of outputs to input requirements and the transformation and propagation of semantic properties by components.
Chapter Preview

Overview: Components And Composition

A flow-based application is modeled as a graph (a flow composition) of interconnected components that describes the flow of data from one or more external data sources, through a number of software components, to some desired end result. In our work, the flows are described as directed acyclic graphs (DAGs). Figure 1(a) depicts a simple flow composition in the vehicle traffic analysis domain; the desired end result for this flow composition is a stream of traffic congestion levels detected at the intersection of Broadway and 42nd St in New York City.

Figure 1. Example of a flow composition

Components are connected in the usual manner for flow-based applications: a component observes data via its inputs, performs some processing, and publishes data via its outputs. Thus, a flow composition includes a collection of components (both data sources and software components) and a description of the components’ interconnections, over which data published by one component can be observed by other components.
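To make the graph view concrete, the following is a minimal Python sketch of a flow composition as a DAG. It is not the authors' implementation: the `FlowComposition` class, its method names, and the `TrafficCamera` and `CongestionAnalyzer` component names are hypothetical and chosen only for illustration. Only the Video Image Sampler appears in the text.

```python
from collections import defaultdict, deque


class FlowComposition:
    """Hypothetical model: components plus directed publisher-to-observer links."""

    def __init__(self):
        self.components = set()
        self.links = defaultdict(set)  # publisher name -> names of observers

    def connect(self, publisher, observer):
        """Record that `observer` observes data published by `publisher`."""
        self.components.update((publisher, observer))
        self.links[publisher].add(observer)

    def processing_order(self):
        """Return the components in a topological order (Kahn's algorithm).

        Raises ValueError if the links contain a cycle, i.e. the flow
        is not a DAG.
        """
        indegree = {c: 0 for c in self.components}
        for observers in self.links.values():
            for o in observers:
                indegree[o] += 1
        # Sorting keeps the result deterministic for this example.
        ready = deque(sorted(c for c, d in indegree.items() if d == 0))
        order = []
        while ready:
            c = ready.popleft()
            order.append(c)
            for o in sorted(self.links.get(c, ())):
                indegree[o] -= 1
                if indegree[o] == 0:
                    ready.append(o)
        if len(order) != len(self.components):
            raise ValueError("flow composition contains a cycle")
        return order


# Hypothetical three-component chain: data source -> sampler -> analyzer.
flow = FlowComposition()
flow.connect("TrafficCamera", "VideoImageSampler")
flow.connect("VideoImageSampler", "CongestionAnalyzer")
print(flow.processing_order())
# ['TrafficCamera', 'VideoImageSampler', 'CongestionAnalyzer']
```

The topological ordering is used here only as a convenient way to check acyclicity; the text does not say how the DAG property is enforced.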

Component descriptions contain all of the descriptive metadata needed to select and assemble components into flow compositions, and much of the metadata needed to deploy the compositions to a target runtime environment. Descriptions of a component’s inputs include metadata describing the constraints (typically constraints on input data) that must be satisfied for the component to be included in a flow composition. Similarly, descriptions of a component’s outputs capture the characteristics of the data published by the component, for observation by other components and/or by subscribers to the application’s result data. For a component to be included in a flow, all of its inputs must be connected to other components’ outputs in a way that satisfies each of the input constraints. Outputs are typically connected to other components’ inputs and/or are produced as result data output from the flow composition. For a flow composition to satisfy the processing goal, each of the described results must be provided as an output from some component in the flow composition.
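The inclusion and goal conditions above can be sketched in Python as follows. This is a deliberately simplified illustration, not the chapter's semantic model: it assumes that an input constraint is just a set of required type names and that an output satisfies it when it provides all of them. The `Port`, `Component`, `satisfies`, `can_include`, and `goal_satisfied` names are hypothetical, as is the `AudioSegments` type.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Port:
    name: str
    types: frozenset  # input: required type names; output: provided type names


@dataclass
class Component:
    name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def satisfies(output: Port, input_: Port) -> bool:
    """Assumed rule: the output must provide every type the input requires."""
    return input_.types <= output.types


def can_include(component: Component, connections: dict) -> bool:
    """True if every input is wired to an output that satisfies its constraint.

    `connections` maps an input port name to the output Port connected to it.
    """
    return all(
        inp.name in connections and satisfies(connections[inp.name], inp)
        for inp in component.inputs
    )


def goal_satisfied(goal_results: list, components: list) -> bool:
    """True if each described result is provided by some component's output."""
    outputs = [out for c in components for out in c.outputs]
    return all(any(result <= out.types for out in outputs) for result in goal_results)


# Hypothetical ports loosely based on the Video Image Sampler example.
camera_out = Port("video", frozenset({"VideoSegments", "TimeIntervals"}))
audio_out = Port("audio", frozenset({"AudioSegments"}))
sampler = Component(
    "Video Image Sampler",
    inputs=[Port("in", frozenset({"VideoSegments", "TimeIntervals"}))],
    outputs=[Port("out", frozenset({"Images", "Times"}))],
)

print(can_include(sampler, {"in": camera_out}))  # True
print(can_include(sampler, {"in": audio_out}))   # False
```

A subset test stands in for semantic matching here; a description-based matcher would replace the body of `satisfies` while leaving the two checks around it unchanged.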

Descriptive metadata for software components also includes a declaration of the component’s executable code. This can take the form of actual source or binary executable code (or a reference to a repository location of either), or a specification from which the component can be generated.

Consider the Video Image Sampler component in Figure 1(c). Its sole input requires VideoSegments and TimeIntervals, and its sole output produces Images and Times. The component also requires a SamplingFrequency, expressed as a configuration parameter, and is associated with the VIS.cpp source code file.
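As an illustration, the description of the Video Image Sampler might be captured in a record such as the one below. The field names and the `missing_parameters` helper are assumptions made for this sketch; the chapter's actual description format is not shown in this excerpt. Only the type names, the SamplingFrequency parameter, and the VIS.cpp file name come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentDescription:
    """Hypothetical record of the metadata listed for a software component."""

    name: str
    input_requires: tuple      # data the sole input requires
    output_produces: tuple     # data the sole output produces
    config_parameters: tuple   # configuration parameters the component needs
    code_reference: str        # associated source file or repository location

    def missing_parameters(self, supplied: dict) -> list:
        """Return the required configuration parameters absent from `supplied`."""
        return [p for p in self.config_parameters if p not in supplied]


video_image_sampler = ComponentDescription(
    name="Video Image Sampler",
    input_requires=("VideoSegments", "TimeIntervals"),
    output_produces=("Images", "Times"),
    config_parameters=("SamplingFrequency",),
    code_reference="VIS.cpp",
)

# The value 1.0 is an arbitrary placeholder, not a figure from the text.
print(video_image_sampler.missing_parameters({}))                          # ['SamplingFrequency']
print(video_image_sampler.missing_parameters({"SamplingFrequency": 1.0}))  # []
```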

The example in this chapter describes a stream-oriented set of applications that provide real-time traffic information and vehicle routing services by analyzing data obtained from various sensors, web pages and other data sources. A user describes a continuous query for traffic congestion levels at a particular roadway intersection, e.g. the corner of Broadway and 42nd Street in New York City. A flow constructed for such a query might use raw data from a variety of sources. It might use video from a traffic camera at the intersection, extracting images from the video stream and examining them for alignment with visual patterns of congestion at that intersection (the upper thread in Figure 1(a)). It might also compare audio data from a sound sensor at the intersection with known congestion audio patterns (the lower thread in Figure 1(a)). If combining both analyses is believed to provide a more accurate assessment of congestion, the two analytic chains can be joined using an additional component; the combined threads are depicted by the entire flow composition in Figure 1(a).
