Adaptive Study Design Through Semantic Association Rule Analysis

Ping Chen (University of Houston-Downtown, USA), Wei Ding (University of Massachusetts-Boston, USA) and Walter Garcia (University of Houston-Downtown, USA)
DOI: 10.4018/jssci.2011040103

Abstract

Association mining aims to find valid correlations among data attributes, and has been widely applied to many areas of data analysis. This paper presents a semantic network-based association analysis model including three spreading activation methods. It applies this model to assess the quality of a dataset, and generate semantically valid new hypotheses for adaptive study design especially useful in medical studies. The approach is evaluated on a real public health dataset, the Heartfelt study, and the experiment shows promising results.

1. Introduction

Association rule mining has been widely applied to numerous domains, such as analysis of market-basket datasets, text mining, and disease diagnosis (Agrawal et al., 1996). Association rules whose support and confidence are above user-specified thresholds are considered statistically significant and presented to end-users. While these objective measures are effective in reducing rule redundancy, incorporating subjective and domain-specific knowledge remains a critical challenge for association analysis, and this knowledge should be represented in a more structured way to maximize its usefulness. Hence, we choose a semantic network to represent knowledge for association analysis. Semantic networks have been implemented in many knowledge bases. Concepts and ideas in the human brain have been shown to be semantically linked, which motivates modern research on semantic networks (Quillian, 1998). A recent development in the study of human memory was described in (Widrow et al., 2010), and a general cognitive knowledge representation model was described in (Ramirez, 2010). Numerous knowledge representation models for more specific areas have also been proposed recently, tailored to model interesting aspects of knowledge (Chen et al., 2009).
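To make the two objective measures concrete, the following minimal sketch computes support and confidence for a single rule and checks them against user-specified thresholds. The function names, the toy records, and the threshold values are illustrative assumptions and are not taken from the paper or from the Heartfelt dataset.

```python
from typing import FrozenSet, List


def support(transactions: List[FrozenSet[str]], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0.0:
        return 0.0
    return support(transactions, antecedent | consequent) / base


def passes_thresholds(
    transactions: List[FrozenSet[str]],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    min_support: float,
    min_confidence: float,
) -> bool:
    """Keep a rule only if both measures reach the user-specified thresholds."""
    rule_support = support(transactions, antecedent | consequent)
    rule_confidence = confidence(transactions, antecedent, consequent)
    return rule_support >= min_support and rule_confidence >= min_confidence


# Hypothetical toy records, invented for illustration only.
records = [
    frozenset({"stressed", "high_bp"}),
    frozenset({"stressed", "high_bp"}),
    frozenset({"stressed"}),
    frozenset({"calm"}),
]

lhs = frozenset({"stressed"})
rhs = frozenset({"high_bp"})
rule_support = support(records, lhs | rhs)
rule_confidence = confidence(records, lhs, rhs)
rule_kept = passes_thresholds(records, lhs, rhs, min_support=0.4, min_confidence=0.6)
```

In this toy example the rule stressed → high_bp appears in two of four records, and in two of the three records that contain the antecedent. Both measures are purely frequency-based, which is why they cannot by themselves tell whether a rule is semantically meaningful.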

A semantic network represents knowledge as a directed graph, where vertices represent concepts and edges represent semantic relations between the concepts. Figure 1 shows a sample semantic network whose vertices represent concepts and whose edges are labeled with the names of relations. This semantic network was created in the case study (Section 8) when we examined the Heartfelt adolescent health study. Concepts are organized into a hierarchical structure by is-a edges, and other edges show causal relations, e.g., an observable entity diagnoses a disease or syndrome, stressed is a mental process, and a disease can be the result of a mental process. Compared with other knowledge representation models, a semantic network has the following advantages:

Figure 1. A fragment of the semantic network used in our case study

  1. Easy to use. A user needs little training or computer background to build semantic networks. Semantic networks are easy to understand, and their explanations are usually straightforward.

  2. Flexible, incremental, and easy to update. Building a semantic network does not require a user to have a complete or perfect understanding at the beginning. Instead, the building process can be incremental, and knowledge can be updated locally as a user becomes more familiar with a domain.

  3. Generative. A semantic network is not merely a static structure; it also has a vertex-firing mechanism called spreading activation. Firing or activation of a vertex sends activation to its semantically connected neighbor vertices. Each firing accesses only local neighbor vertices, so the cost of a single activation step depends on the number of neighbors rather than on the size of the network.
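The directed-graph representation and the vertex-firing mechanism described above can be sketched as follows. This is a minimal illustration under stated assumptions, not one of the paper's three spreading activation methods, which are not shown in this preview. The class `SemanticNetwork`, the function `spread_activation`, the relation labels, and the `decay` and `threshold` parameters are all hypothetical; the three edges are taken from the relations named in the text.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class SemanticNetwork:
    """Directed graph: vertices are concepts, edges carry a relation label."""

    def __init__(self) -> None:
        self.edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add_relation(self, source: str, relation: str, target: str) -> None:
        self.edges[source].append((relation, target))

    def neighbors(self, concept: str) -> List[Tuple[str, str]]:
        # Only the outgoing edges of one vertex are touched (local access).
        return self.edges.get(concept, [])


def spread_activation(
    network: SemanticNetwork,
    start: str,
    decay: float = 0.5,      # assumed attenuation per hop
    threshold: float = 0.1,  # assumed cutoff below which a vertex does not fire
) -> Dict[str, float]:
    """Fire `start` and propagate decayed activation along outgoing edges."""
    activation = {start: 1.0}
    frontier = [start]
    while frontier:
        next_frontier = []
        for concept in frontier:
            outgoing = activation[concept] * decay
            if outgoing < threshold:
                continue  # too weak to fire its neighbors
            for _relation, neighbor in network.neighbors(concept):
                # Re-fire a neighbor only if it receives stronger activation.
                if outgoing > activation.get(neighbor, 0.0):
                    activation[neighbor] = outgoing
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return activation


# Edges mirroring the relations mentioned in the text; labels are assumptions.
net = SemanticNetwork()
net.add_relation("stressed", "is-a", "mental process")
net.add_relation("disease or syndrome", "result-of", "mental process")
net.add_relation("observable entity", "diagnoses", "disease or syndrome")

levels = spread_activation(net, "observable entity")
```

Here, firing "observable entity" activates "disease or syndrome" and then "mental process" at decreasing strength, while "stressed" stays inactive because no edge leads to it. Each firing inspects only the outgoing edges of the vertex being fired, which is the locality property noted in the third item.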

In this paper we will discuss a semantic network-based association analysis model. With this model we will provide the following analysis techniques:

  1. Hypothesis generation. New hypotheses are generated through generalization and inference from the association rule set, and give end-users directions for further investigation.

  2. Data quality assessment. A dataset is only an imperfect and incomplete reflection of a real-world object or scenario. By analyzing association rules we can assess the quality of the original dataset.
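As a rough illustration of hypothesis generation by generalization, the sketch below lifts each item of a mined rule to its parent concept along is-a edges, yielding a more general candidate hypothesis. This is only one plausible reading of "generalization"; the paper's actual procedure is not shown in this preview. The function `generalize`, the `is_a` mapping, and the item "hypertension" are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple

# A rule is a pair (antecedent items, consequent items).
Rule = Tuple[FrozenSet[str], FrozenSet[str]]


def generalize(rule: Rule, is_a: Dict[str, str]) -> Rule:
    """Replace each item with its is-a parent; items without a parent are kept."""
    antecedent, consequent = rule

    def lift(items: FrozenSet[str]) -> FrozenSet[str]:
        return frozenset(is_a.get(item, item) for item in items)

    return lift(antecedent), lift(consequent)


# Hypothetical is-a edges; only "stressed is a mental process" comes from the text.
is_a = {
    "stressed": "mental process",
    "hypertension": "disease or syndrome",
}

mined_rule: Rule = (frozenset({"stressed"}), frozenset({"hypertension"}))
hypothesis = generalize(mined_rule, is_a)
```

The lifted rule, mental process → disease or syndrome, is a candidate hypothesis that a domain expert could then examine; whether it holds would still need to be tested against data.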

Our work is closely related to cognitive informatics, a transdisciplinary field emerging from Cognitive Science, Computational Intelligence, Artificial Intelligence, Formal Semantics, and Human-Computer Interaction. A set of cognitive models for causation analyses and causal inferences was proposed in (Wang, 2011), which formalized causal inference methodologies to simulate subtle aspects of human reasoning.
