Persistence of Knowledge across Layered Architectures

Janet Fredericks (Woods Hole Oceanographic Institution, USA)
Copyright: © 2015 | Pages: 21
DOI: 10.4018/978-1-4666-6567-5.ch013

Abstract

In this chapter, a model demonstrating methods for integration of semantic technologies within observational data services is described. Implementation of the model captures knowledge about data provenance where it is best understood and also enables its persistence across architectural layers through the use of standards-based technologies. Domain experts can build upon the semantic layer to create meaningful ontologies. Brokering services can utilize the ontologies for automated mediation of terms and translation between standards-based technologies. Research communities will be enabled to operate within their own framework, utilizing their familiar, specialized terminology and tools. The role of communities of practice is explored relating to knowledge management across layered architectures. Implementation of semantic technologies within Web-based data and brokering services will minimize the operational barriers to data discovery and access and provide mechanisms that enable the formation of collaborative environments that will facilitate repeatable, well-documented research.

Introduction

Since the advent of the web and the subsequent ability to openly share and access data, there have been significant advances in creating, adopting and promoting standards to describe data provenance, requiring information such as keywords, ownership details, parameter units and geospatial-temporal coverage. The use of standards-based mechanisms is enabling access to machine-harvestable data along with the basic information describing it. But are we missing an opportunity to do more than describe data and methods of access? Can we push the technologies and their implementation further, so that information sufficient to describe how data are created can also be captured and harvested machine-to-machine? The ability to understand observational provenance will provide a foundation of trust in our global data assets.

In the workshop report on “The Future of Scientific Knowledge Discovery in Open Networked Environments” (National Research Council, 2012), Paul Edwards describes the need to extend the types of available knowledge required to ensure trust and add value to data as they become available through open networked environments. He summarizes institutional impediments, including the fact that participants are remote rather than local, and notes the pervasive problem that agents along the digital life cycle do not consider it their responsibility to document data provenance and assume that they cannot know what will be needed by future unknown users. In Diviacco (2015, see Chapter 1 of this book), the complexity of understanding and knowledge relating to science within this environment is discussed, along with relevant levels of communication of knowledge, such as syntactic, semantic and pragmatic meaning.

With the wide adoption of web technologies, researchers are now able to conduct research remotely, providing opportunities for collaboration internationally and across disciplines (Juan, Daradoumis, Roca, Grasman & Faulin, 2012). A collaborative environment enables participants to build a workspace to address common goals or problems by sharing access to data, to processes, or to both. It may evolve as a community that needs to access data from outside its own community, or it may have data and want to share tools to utilize them. Researchers may want data to be dynamic or may share static data. Collaboration can focus on building data resources for sharing or on building tools and workflows to be applied to disparate data sets.

Often researchers have to perform tedious tasks when accessing others’ data, such as reformatting, subsampling, and interpreting terms and conventions. Data providers typically serve data collections that require researchers to download volumes of data and to create specialized tools to access and subset the data, putting an onerous burden on their time and physical resources. Then there is the task of converting data formats and determining what is meant by each parameter in the set, assuring the understanding of units and conventions. The discovery and integration of data from outside a community’s area of expertise can also be prone to error and omission. Each community may have a set of tools for processing, visualization or workflows that are to be applied to numerous data sets. Collaboration environments let scientists and decision makers focus on their respective research and analysis, rather than on the tasks of data discovery and transfer.

The main concept of a collaborative environment is to create a methodology or workspace that facilitates ease of operation for a group of researchers or decision makers. Once the framework is built, the functionality can provide repeatable and documented results for use within the community. Either the same workflows are applied to different data (a new location, a different time, etc.), or the same collected data are used to address a particular area of interest with different approaches. In all cases, the environment must provide access to knowledge about the data and processes that are being made available.
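
As a loose illustration of the two modes just described (the same workflow applied to different data, and the same data addressed with different approaches), the following Python sketch keeps a small record of what was run on which data. It is only a sketch under stated assumptions: the names Dataset, RunRecord and run_workflow, the site names and the sample values are all hypothetical and are not taken from the chapter's model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Sequence


@dataclass
class Dataset:
    """Observations plus the descriptive knowledge that travels with them."""
    name: str
    values: Sequence[float]
    units: str
    location: str


@dataclass
class RunRecord:
    """What was done, to which data, and when (hypothetical structure)."""
    workflow: str
    dataset: str
    units: str
    location: str
    result: float
    run_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def run_workflow(step: Callable[[Sequence[float]], float], data: Dataset) -> RunRecord:
    """Apply one workflow step and keep the data description with the result."""
    return RunRecord(
        workflow=step.__name__,
        dataset=data.name,
        units=data.units,
        location=data.location,
        result=step(data.values),
    )


def mean(values: Sequence[float]) -> float:
    """A stand-in for a community's shared processing step."""
    return sum(values) / len(values)


# Same workflow, different data (made-up values for illustration only).
site_a = Dataset("site_a_temperature", [10.0, 12.0, 14.0], "degC", "Site A")
site_b = Dataset("site_b_temperature", [8.0, 9.0, 10.0], "degC", "Site B")
records = [run_workflow(mean, d) for d in (site_a, site_b)]

# Same data, different approach.
records.append(run_workflow(max, site_a))
```

Because each record carries the units and location alongside the result, a later user of the output does not have to go back to the data provider to learn what the number refers to, which is the kind of knowledge the environment is expected to keep accessible.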
