Semantic Integration and Knowledge Discovery for Environmental Research

Zhiyuan Chen (University of Maryland, Baltimore County (UMBC), USA)
DOI: 10.4018/978-1-60566-172-8.ch003

Abstract

Environmental research and knowledge discovery both require extensive use of data stored in various sources and created in different ways for diverse purposes. We describe a new metadata approach to elicit semantic information from environmental data and implement semantics-based techniques to assist users in integrating, navigating, and mining multiple environmental data sources. Our system contains specifications of various environmental data sources and the relationships among them. User requests are augmented with semantically related data sources, and the results are automatically presented as a visual semantic network. In addition, we present a methodology for data navigation and pattern discovery using multi-resolution browsing and data mining. The data semantics are captured and utilized in terms of their patterns and trends at multiple levels of resolution. We demonstrate the efficacy of our methodology through experimental results.

Introduction

The urban environment is formed by complex interactions between natural and human systems. Studying the urban environment requires the collection and analysis of very large datasets that span many disciplines, have semantic (including spatial and temporal) differences and interdependencies, are collected and managed by multiple organizations, and are stored in varying formats. Scientific knowledge discovery is often hindered because of challenges in the integration and navigation of these disparate data. Furthermore, as the number of dimensions in the data increases, novel approaches for pattern discovery are needed.

Environmental data are collected in a variety of units (metric or SI), time increments (minutes, hours, or even days), map projections (e.g., UTM or State Plane) and spatial densities. The data are stored in numerous formats, multiple locations, and are not centralized into a single repository for easy access. To help users (mostly environmental researchers) identify data sets of interest, we use a metadata approach to extract semantically related data sources and present them to the researchers as a semantic network. Starting with an initial search (query) submitted by a researcher, we exploit stored relationships (metadata) among actual data sources to enhance the search result with additional semantically related information. Although domain experts need to manually construct the initial semantic network, which may only include a small number of sources, we introduce an algorithm to let the network expand and evolve automatically based on usage patterns. Then, we present the semantic network to the user as a visual display of a hyperbolic tree; we claim that semantic networks provide an elegant and compact technique to visualize considerable amounts of semantically relevant data sources in a simple yet powerful manner.
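The query augmentation and usage-driven growth described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the chapter's algorithm: the `SemanticNetwork` class, its method names, the numeric link weights, the threshold, and the example source names are all hypothetical.

```python
from collections import defaultdict


class SemanticNetwork:
    """Hypothetical weighted, undirected graph whose nodes are data sources."""

    def __init__(self):
        # weights[a][b] is the assumed strength (0..1) of the link between a and b
        self.weights = defaultdict(dict)

    def relate(self, a, b, weight):
        """Store a symmetric relationship, e.g. one entered by a domain expert."""
        self.weights[a][b] = weight
        self.weights[b][a] = weight

    def augment(self, query_sources, threshold=0.5):
        """Return sources linked to the query with weight >= threshold, strongest first."""
        scores = {}
        for src in query_sources:
            for neighbor, w in self.weights.get(src, {}).items():
                if neighbor not in query_sources and w >= threshold:
                    scores[neighbor] = max(scores.get(neighbor, 0.0), w)
        return sorted(scores, key=lambda s: (-scores[s], s))

    def record_co_use(self, a, b, step=0.1):
        """Strengthen (or create) a link each time two sources are used together."""
        current = self.weights.get(a, {}).get(b, 0.0)
        self.relate(a, b, min(1.0, current + step))


# Initial network built by hand (illustrative source names and weights).
net = SemanticNetwork()
net.relate("stream_flow", "precipitation", 0.8)
net.relate("stream_flow", "land_use", 0.4)

before = net.augment({"stream_flow"})

# Two sessions in which users combined stream flow with land use data.
net.record_co_use("stream_flow", "land_use")
net.record_co_use("stream_flow", "land_use")

after = net.augment({"stream_flow"})
print(before, after)
```

In this sketch a weak link starts below the recommendation threshold and crosses it only after repeated joint use, which is one simple way a small expert-built network could expand based on usage patterns.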

Once users have used the semantic network to settle on a set of environmental data sources, they can access the actual sources to extract data and apply knowledge discovery techniques. We introduce a new approach to integrate urban environmental data and provide scientists with semantic techniques to navigate and discover patterns in very large environmental datasets.

Our system provides access to a multitude of heterogeneous and autonomous data repositories and helps the user navigate the abundance of diverse data sources as if they were a single homogeneous source. More specifically, our contributions are:

  • (1) Recommendation of additional and relevant data sources. We present our approach to recommend data sources that are potentially relevant to the user’s search interests. Currently, it is tedious and impractical for users to locate relevant information sources by themselves. We provide a methodology that addresses this problem and automatically supplies users with additional and potentially relevant data sources that they might not be aware of. To discover these additional recommendations, we exploit semantic relationships between data sources. We define semantic networks for interrelated data sources and present an algorithm to automatically refine, augment, and expand an initial and relatively small semantic network with additional and relevant data sources; we also exploit user profiles to tailor the resulting data sources to specific user preferences.

  • (2) Visualization and navigation of relevant data sources. The semantic network with the additional sources is shown to the user as a hyperbolic tree, which improves usability by displaying the semantic relationships among relevant data sources visually. After the user has chosen the data sources of interest (based on our metadata approach) and has accessed the actual data, we also assist the user in navigating through the plethora of environmental data using visualization and navigation techniques that describe data at multiple levels of resolution, enabling pattern and knowledge discovery at different semantic levels. We achieve this using wavelet transformation techniques, and we demonstrate the resilience of wavelet transformation to noisy data.

  • (3) Implementation of a prototype system. Finally, we have designed and implemented a prototype system as a proof of concept for our techniques. Using this system, we have demonstrated the feasibility of our contributions and have conducted a set of experiments verifying and validating our approach.
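The multi-resolution view mentioned in contribution (2) can be illustrated with a small wavelet example. The text above does not say which wavelet is used; the sketch below assumes the unnormalized Haar wavelet because it is the simplest, and the function names and the hourly readings are hypothetical.

```python
def haar_step(signal):
    """One Haar level: pairwise averages (coarse view) and half-differences (detail)."""
    averages = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    details = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return averages, details


def multi_resolution(signal):
    """Return the signal at successively coarser resolutions.

    The length must be a power of two so that every level can be halved.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    levels = [list(signal)]
    while len(levels[-1]) > 1:
        averages, _ = haar_step(levels[-1])
        levels.append(averages)
    return levels


# Eight illustrative hourly readings; each level halves the temporal resolution.
hourly = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
levels = multi_resolution(hourly)
for level in levels:
    print(level)
```

Each coarser level summarizes the trend of the level beneath it, while the discarded detail coefficients hold the fine-grained variation. Browsing from coarse to fine is one way a user could inspect patterns at different levels of resolution.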
