Towards Scalingless Generation of Formal Contexts from an Ontology in a Triple Store

Frithjof Dau
DOI: 10.4018/ijcssa.2013010102

Abstract

The EU-funded research project CUBIST investigates how Formal Concept Analysis can be applied as a Visual Analytics tool on top of information stored in a Triple Store (TS). This paper provides first steps for utilizing SPARQL in order to generate formal contexts out of the data in the TS, where the emphasis is put on using object-properties between individuals. Thus it complements FcaBedrock, which will be used in CUBIST as well and focuses on the scaling of datatype-properties between individuals and literals. It is discussed how the approaches of this paper and FcaBedrock can be combined.

1. Introduction

CUBIST is an EU-funded research project that seeks to raise Business Intelligence (BI) to a new level of precise, meaningful and user-friendly data analytics by following a best-of-breed approach that combines essential features of Semantic Technologies, Business Intelligence and Visual Analytics based on Formal Concept Analysis (FCA). CUBIST aims to:

  • Support federation of data from unstructured and structured sources;

  • Persist the federated data in an information warehouse, using an approach based on a BI-enabled triple store;

  • Provide novel ways of applying visual analytics based on meaningful diagrammatic representations.

The Visual Analytics part of CUBIST complements traditional BI means by utilizing Formal Concept Analysis (FCA) to analyze the data in the triple store. FCA is a well-known theory of data analysis which allows objects to be conceptually clustered with respect to a given set of attributes and the (lattice-ordered) set of clusters to be visualized, e.g. by means of Hasse diagrams. The starting point of FCA is a formal context (O, A, I) consisting of a set O of formal objects, a set A of formal attributes, and an incidence relation I ⊆ O × A between the formal objects and attributes. There exists a variety of tools to carry out analysis of formal contexts (e.g. ConExp, Lattice Miner, or conflexplore), but nearly all of them take a formal context as input. Real data to be analyzed, however, often comes in different forms:

  • Conceptually, attributes are often not binary but have values such as numbers, strings, or dates (i.e. we have many-valued attributes);

  • Technically, data can come in the form of CSV files, databases, triple stores, etc.
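To make the definition of a formal context (O, A, I) given above concrete, the following minimal Python sketch stores a toy context as a set of object-attribute pairs and implements the two derivation operators from which formal concepts are built. The objects, attributes, and function names are illustrative assumptions and are not taken from CUBIST.

```python
# Toy formal context (O, A, I); all names below are hypothetical.
objects = {"frog", "dog", "reed"}
attributes = {"needs_water", "lives_on_land", "can_move"}
incidence = {
    ("frog", "needs_water"), ("frog", "lives_on_land"), ("frog", "can_move"),
    ("dog", "needs_water"), ("dog", "lives_on_land"), ("dog", "can_move"),
    ("reed", "needs_water"), ("reed", "lives_on_land"),
}


def intent(objs):
    """Return the attributes shared by every object in objs."""
    return {a for a in attributes if all((o, a) in incidence for o in objs)}


def extent(attrs):
    """Return the objects that have every attribute in attrs."""
    return {o for o in objects if all((o, a) in incidence for a in attrs)}


def is_concept(objs, attrs):
    """A pair (objs, attrs) is a formal concept if each determines the other."""
    return intent(objs) == set(attrs) and extent(attrs) == set(objs)
```

In this toy context, the pair consisting of the objects frog and dog together with all three attributes is a formal concept, whereas reed alone with its two attributes is not, because frog and dog share those attributes as well.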

For dealing with many-valued attributes, the best-known and most widely used method is conceptual scaling (Ganter & Wille, 1989). Essentially, for a given many-valued attribute, a conceptual scale is a specific context with the values of the many-valued attribute as formal objects. The choice of the formal attributes of the scale is a question of the design of the scale: the formal attributes are meaningful attributes to describe the values; they might be different entities or they might even be the values of the property again. Using a conceptual scale, a dataset with a many-valued attribute can be “translated” into a formal context, where the objects are the objects of the dataset and the attributes are the attributes of the conceptual scale, and the derived formal context can be analyzed with FCA.
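As a minimal sketch of this translation, assume a hypothetical many-valued attribute "risk" with the values low, medium and high, and an ordinal scale designed for it. The data, the scale attributes, and the helper apply_scale are all assumptions made for illustration.

```python
# Many-valued attribute "risk" for three objects (hypothetical data).
risk = {"p1": "low", "p2": "high", "p3": "medium"}

# Conceptual scale: a formal context whose formal objects are the values
# of "risk". This is an ordinal scale, so each value carries every
# threshold attribute that it reaches.
scale = {
    "low": {"risk>=low"},
    "medium": {"risk>=low", "risk>=medium"},
    "high": {"risk>=low", "risk>=medium", "risk>=high"},
}


def apply_scale(many_valued, scale):
    """Derive a one-valued context: dataset object -> set of scale attributes."""
    return {obj: set(scale[value]) for obj, value in many_valued.items()}


derived = apply_scale(risk, scale)
```

The derived context keeps the objects of the dataset (p1, p2, p3) and takes its attributes from the scale, so it can be passed to any FCA tool that expects a plain formal context.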

From the technical point of view, there are (to the author’s knowledge) essentially two tools which allow for scaling real datasets:

  • ToscanaJ (Becker & Correia, 2005) is a suite of tools which allows creating conceptual scales out of data from a relational database and then interactively visualizing and exploring the generated concept lattices;

  • FcaBedrock (Andrews, 2009; Andrews & Orphanides, 2010) is a tool which converts CSV files into formal contexts. It is “taking each many-valued attribute and converting it into as many Boolean attributes as it has values and converting continuous values using ranges.” (Andrews & Orphanides, 2010).
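
The kind of conversion described in the quotation can be sketched in a few lines of Python. This is not FcaBedrock's actual interface or implementation; the CSV data, the column names, the range boundaries, and the function to_context are assumptions chosen only to illustrate the idea of one Boolean attribute per value and range-based attributes for continuous values.

```python
import csv
import io

# Hypothetical CSV input; the first column identifies the formal object.
CSV_TEXT = """name,colour,age
a,red,15
b,blue,42
c,red,70
"""

# Assumed half-open ranges [lo, hi) for the continuous column "age".
AGE_RANGES = [(0, 18), (18, 65), (65, 200)]


def to_context(rows, nominal, ranged):
    """Map each row to a set of Boolean attributes.

    nominal: columns that get one Boolean attribute per distinct value.
    ranged:  mapping of column -> list of (lo, hi) ranges.
    """
    context = {}
    for row in rows:
        attrs = set()
        for col in nominal:
            attrs.add(f"{col}-{row[col]}")
        for col, ranges in ranged.items():
            value = float(row[col])
            for lo, hi in ranges:
                if lo <= value < hi:
                    attrs.add(f"{col}-[{lo},{hi})")
        context[row["name"]] = attrs
    return context


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
context = to_context(rows, nominal=["colour"], ranged={"age": AGE_RANGES})
```

The choice of range boundaries is a design decision comparable to the design of a conceptual scale: different boundaries yield different formal attributes and therefore different concept lattices.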
