Ontological Random Forests for Image Classification

Ning Xu (Beckman Institute, University of Illinois at Urbana-Champaign, USA), Jiangping Wang (Beckman Institute, University of Illinois at Urbana-Champaign, USA), Guojun Qi (Beckman Institute, University of Illinois at Urbana-Champaign, USA), Thomas Huang (Beckman Institute, University of Illinois at Urbana-Champaign, USA) and Weiyao Lin (Shanghai Jiao Tong University, China)
Copyright: © 2015 | Pages: 14
DOI: 10.4018/IJIRR.2015070104


Previous image classification approaches mostly neglect semantics, which leads to two major limitations. First, categories are treated as independent even though they overlap semantically. For example, “sedan” is a specific kind of “car”, so it is unreasonable to train a classifier to distinguish between “sedan” and “car”. Second, the same image feature representation is used to classify all categories, whereas the human perception system is believed to use different features for different objects. In this paper, the authors leverage semantic ontologies to address both problems. They propose an ontological random forest algorithm in which the splitting of decision trees is determined by semantic relations among categories. Hierarchical features are then learned automatically by multiple-instance learning to capture visual dissimilarities at different concept levels. The approach is tested on two image classification datasets. Experimental results demonstrate that it not only outperforms state-of-the-art results but also identifies semantic visual features.
1. Introduction

Most existing image classification algorithms treat categories as completely independent, both visually and semantically. However, humans are believed to use semantic relations to classify categories (Collin, 2005). For example, it is unreasonable to distinguish “truck” from “vehicle” because “truck” is a kind of “vehicle”. In addition, humans commonly use different features to discriminate different objects. For example, “wheel” is a useful feature for distinguishing “car” from “animal”, while shape differences are more discriminative for distinguishing “truck” from “sedan”.

Most existing image classification algorithms (Shao, 2014; Wang, 2010; Zhang, 2014) perform well on relatively easy datasets such as Caltech 101 (Fei-Fei, 2007) and Caltech 256 (Griffin, 2007). Because they neglect semantics, however, they achieve only limited results on challenging problems such as fine-grained image classification (Deng, 2009; Welinder, 2010), and they are also at odds with the human visual system.

An ontology is a hierarchical structure consisting of categories and high-level relations such as “is-a” and “part-of”. It encodes semantics in a hierarchical way that is very similar to human perception, and therefore provides a useful tool for incorporating semantics into traditional image classification frameworks. However, traditional ontology-based algorithms (Marszalek, 2007; Tsai, 2010; Xu, 2014) build ontological classifiers that place a classifier at every ontological node to discriminate among that node's sub-categories. This simple framework leads to error propagation: if an image is misclassified at any intermediate node along the path from the root concept to the leaf concept, the final prediction will be wrong. The issue is serious because super-categories have large intra-class variations, i.e., it is difficult to train a good classifier for general concepts such as “animal” and “vehicle”. As a result, previous uses of ontologies in image classification mainly aim at improving classification speed rather than classification accuracy.
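The error-propagation argument can be illustrated with a small sketch. The toy ontology, the per-node accuracies, and the function names below are hypothetical and chosen only for illustration; the sketch also assumes that errors at different nodes are independent, which the text above does not claim.

```python
from math import prod

# Hypothetical toy ontology stored as child -> parent ("is-a") links.
PARENT = {
    "vehicle": "entity",
    "animal": "entity",
    "car": "vehicle",
    "truck": "vehicle",
    "sedan": "car",
    "dog": "animal",
}


def path_from_root(leaf, parent=PARENT):
    """Return the nodes visited from the root concept down to `leaf`."""
    path = [leaf]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path[::-1]


def path_accuracy(leaf, node_accuracy, parent=PARENT):
    """Chance that every decision on the root-to-leaf path is correct.

    One decision is made at each internal node on the path (every node
    except the leaf). Errors are assumed independent, for illustration only.
    """
    path = path_from_root(leaf, parent)
    return prod(node_accuracy[node] for node in path[:-1])


if __name__ == "__main__":
    # Assumed accuracies: each node classifier is right 90% of the time.
    accuracy = {"entity": 0.9, "vehicle": 0.9, "car": 0.9}
    print(path_from_root("sedan"))
    print(round(path_accuracy("sedan", accuracy), 3))
```

Under these assumptions, three node classifiers that are each correct 90% of the time give a correct leaf prediction only about 73% of the time, which is why a weak classifier at a general concept near the root is so costly.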

In comparison to the fixed structures of ontologies, decision trees have the advantage of a flexible structure. Previous decision-tree-based approaches fall into two directions according to their splitting methods. Approaches in the first direction (Belgiu, 2014; Yao, 2011) use random splits, which randomly partition the categories into two groups at each tree node. Approaches in the second direction (Fan, 2014; Liu, 2013) use visual splits, which group categories with similar visual appearances together at each tree node. However, random splits do not leverage any prior knowledge of the data, so their discriminative power is weak. Visual splits, on the other hand, become too costly as the number of images and categories grows.
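The two splitting strategies can be contrasted with a minimal sketch. It is not taken from the cited approaches: the function names are hypothetical, and the visual split here is a simple stand-in that groups categories around the two most distant category centroids, where each centroid is assumed to be a mean feature vector computed beforehand from that category's images.

```python
import random


def random_split(categories, rng):
    """Randomly partition categories into two non-empty groups.

    Uses no knowledge of the data, so it is cheap but uninformed.
    Requires at least two categories.
    """
    cats = list(categories)
    rng.shuffle(cats)
    cut = rng.randint(1, len(cats) - 1)  # keeps both sides non-empty
    return set(cats[:cut]), set(cats[cut:])


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def visual_split(centroids):
    """Group visually similar categories (illustrative stand-in).

    `centroids` maps each category to an assumed mean feature vector.
    The two most distant categories act as seeds, and every category
    joins the nearer seed.
    """
    names = list(centroids)
    pairs = ((a, b) for i, a in enumerate(names) for b in names[i + 1:])
    seed_a, seed_b = max(
        pairs, key=lambda p: sq_dist(centroids[p[0]], centroids[p[1]])
    )
    left, right = set(), set()
    for name in names:
        d_a = sq_dist(centroids[name], centroids[seed_a])
        d_b = sq_dist(centroids[name], centroids[seed_b])
        (left if d_a <= d_b else right).add(name)
    return left, right


if __name__ == "__main__":
    # Made-up 2-D centroids for illustration.
    centroids = {
        "car": (0.0, 0.0),
        "truck": (0.1, 0.0),
        "dog": (5.0, 5.0),
        "cat": (5.1, 5.0),
    }
    print(random_split(centroids, random.Random(0)))
    print(visual_split(centroids))
```

Even in this toy form, the visual split compares every pair of categories and relies on features computed from every image, which hints at why its cost grows with the size of the dataset, while the random split ignores the data entirely.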
