Predicting Reasoner Performance on ABox Intensive OWL 2 EL Ontologies

Jeff Z. Pan (Department of Computing Science, University of Aberdeen, Aberdeen, UK), Carlos Bobed (IRISA/Université de Rennes 1, Rennes, France), Isa Guclu (University of Aberdeen, Aberdeen, UK), Fernando Bobillo (I3A, University of Zaragoza, Zaragoza, Spain), Martin J. Kollingbaum (University of Aberdeen, Aberdeen, UK), Eduardo Mena (I3A, University of Zaragoza, Zaragoza, Spain) and Yuan-Fang Li (Faculty of Information Technology, Monash University, Clayton, VIC, Australia)
Copyright: © 2018 | Pages: 30
DOI: 10.4018/IJSWIS.2018010101

Abstract

In this article, the authors introduce the notion of ABox intensity in the context of predicting reasoner performance to improve the representativeness of ontology metrics, and they develop new metrics that focus on ABox features of OWL 2 EL ontologies. Their experiments show that taking into account the intensity through the proposed metrics contributes to overall prediction accuracy for ABox intensive ontologies.

Introduction

Reasoner performance prediction for OWL 2 ontologies has been studied from several angles. One key aspect of these studies is predicting how much time a particular reasoning task will take on a given ontology. Several approaches have adopted machine-learning techniques to predict the time consumption of different reasoning tasks from features of the input ontologies. However, these studies have mainly focused on the complexity of the TBoxes, while paying little attention to ABox details. ABox information is particularly important in real-world scenarios, where the volume of data is much larger than the schema information that describes it.

Reasoning in the language OWL 2 DL (Cuenca-Grau et al. (2008)), the most expressive profile of OWL 2, is 2NEXPTIME-complete in the worst case (Kazakov (2008)), which constitutes a bottleneck for performance-critical applications. Empirical studies show that reasoning can become too time-consuming even in the EL profile, which is less expressive and PTIME-complete (Dentler et al. (2011), Kang et al. (2012b)).

There have been several studies on predicting reasoner performance on ontologies. Kang et al. (2012a) used machine-learning techniques to predict the hardness category (a category defined by reasoning time) of reasoner-ontology pairs. Using the reasoners FaCT++ (Tsarkov & Horrocks (2006)), HermiT (Glimm et al. (2014)), Pellet (Sirin et al. (2007)), and TrOWL (Pan et al. (2016, 2012), Ren et al. (2010), Thomas et al. (2010)), their predictions were highly accurate in terms of hardness category, but not in terms of reasoning time. In a subsequent study, Kang et al. (2014) investigated regression techniques to predict reasoning time. They conducted experiments, based on their syntactic metrics, using the reasoners FaCT++, HermiT, JFact, MORe (Armas-Romero et al. (2012)), Pellet, and TrOWL. These metrics are generally effective when there is a balance between TBox axioms and ABox axioms. However, our preliminary experiments in Guclu, Bobed, Pan, Kollingbaum & Li (2016) showed that the accuracy of these metrics decreases as the size of the ABox relative to the TBox increases.

We regard this observation as important, as there are many real-world scenarios where the amount of data far exceeds the size of the associated schema (e.g., Linked Data repositories (Bizer et al. (2009))). Besides, as observed in Yus & Pappachan (2015), there is an increasing interest in using semantic technologies on mobile devices (Bobed et al. (2015)). The ABox constitutes the data of an ontology (Fokoue et al. (2012), Hogan et al. (2011), Ren et al. (2012)), whereas the TBox constitutes the schema. On mobile devices, with their restricted resources, TBox axioms are expected to be rather static, whereas ABox axioms (data) tend to change more frequently. Thus, due to this volume and dynamism, accurate overall predictions require an approach that captures the influence of the ABox on reasoning performance more accurately. Plenty of applications can benefit from such a prediction mechanism, both in resource-limited scenarios and in unrestricted ones. For example, on the one hand, accurate processing time prediction can be combined with battery consumption prediction (Guclu, Li, Pan & Kollingbaum (2016)) to devise new adaptive methods for reasoning on mobile devices. On the other hand, semantic applications dealing with highly volatile data can also use these predictions to decide whether or not to update the materialization of their knowledge (Bobed et al. (2014)).
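
As a rough illustration of the relative size of the ABox with respect to the TBox discussed above, the following Python sketch computes a simple axiom-count ratio. The function name `abox_tbox_ratio` and its handling of an empty TBox are assumptions made for this illustration only; the ratio is not the ABox intensity metric proposed in the article.

```python
import math


def abox_tbox_ratio(abox_axioms: int, tbox_axioms: int) -> float:
    """Hypothetical helper: number of ABox axioms per TBox axiom.

    This is an illustrative measure of how data-heavy an ontology is,
    not the metric defined by the article's authors.
    """
    if abox_axioms < 0 or tbox_axioms < 0:
        raise ValueError("axiom counts must be non-negative")
    if tbox_axioms == 0:
        # Assumption: an ontology with data but no schema is treated as
        # unboundedly ABox-heavy; an empty ontology gets a ratio of 0.
        return math.inf if abox_axioms > 0 else 0.0
    return abox_axioms / tbox_axioms
```

For instance, a hypothetical ontology with 50 TBox axioms and 1,000 ABox axioms yields a ratio of 20, whereas a balanced ontology with equal counts yields 1.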
