Recently, there has been increased interest in sharing digitized information between government agencies, with the goals of improving security, reducing costs, and offering better quality service to users of government services. Most previous work on interagency information sharing has focused on structured information held in heterogeneous data sources, whereas government agencies need to share data with varying degrees of structure, ranging from free text documents to relational data. In this work, we explore the different technologies available to share information. Specifically, our framework discusses the data storage options available to support a Service Oriented Architecture (SOA). We compare XML document, free text search engine, and relational database technologies and analyze the pros and cons of each approach. We explore these options along four dimensions: information definition, information storage, access to this information, and maintenance of shared information.
The digitization of information has fundamentally altered the environment in which government agencies conduct their missions and deliver services. Recently, there has been considerable interest in exploring how emerging technologies can be used to promote information sharing among different governmental agencies (Bajaj & Ram, 2003). Such information sharing is desirable for several reasons. First, increased levels of security can be achieved if different government agencies share information. These effects can be felt in areas as diverse as global counter-terrorism (Goodman, 2001) and the war on drugs (Forsythe, 1990). Several recent articles (e.g., Dizard, 2002) strongly endorse the view that sharing intelligence information among different law enforcement agencies will enhance their ability to fulfill their required functions. Second, there has been a growing need to streamline inter-agency communication in order to save money. For example, Minahan (1995) shows how the lack of information sharing between different government organizations considerably hampered the establishment of an import-export database that would have streamlined the flow of goods into and out of the US and potentially saved billions of dollars. As Stampiglia (1997) points out, data sharing between health care agencies can also result in significant cost savings. Third, inter-agency information sharing reduces the number of contact points that end-users of public services must deal with, thereby making the delivery of these services more efficient. For example, allowing agencies to share geographic information system (GIS) data improves the quality of customer service afforded to end-users of these services (Hinton, 2001). Other common examples of activities that can benefit from information sharing include the application for licenses and permits and the ability of aid workers to provide essential services.
Much work has been done on the integration of structured information between heterogeneous databases (Hayne & Ram, 1990; Reddy, Prasad, Reddy, & Gupta, 1994; Larson, Navathe, & Elmasri, 1989; Batini, Lenzerini, & Navathe, 1986; Hearst, 1998; Ram & Park, 2004; Ram & Zhao, 2001). The two broad approaches in this area are (a) the creation of virtual federated schemas for query integration (Zhao, 1997; Chiang, Lim, & Storey, 2000; Yan, Ng, & Lim, 2002) and (b) the creation of actual materialized integrated warehouses for integration of both queries and updates (Vaduva & Dittrich, 2001; Hearst, 1998). While structured information integration is relatively well researched, considerably less attention has been paid to the integration of unstructured information (e.g., free text documents) between heterogeneous information sources. Recently, several researchers (Khare & Rifkin, 1997; Sneed, 2002; Glavinic, 2002) have pointed out the advantages of the Extensible Markup Language (XML) standard as a means of adding varying degrees of structure to information, and as a standard for exchanging information over the Internet.