Detecting Bias on Aesthetic Image Datasets

Adrian Carballal (Department of Information and Communication Technologies, University of A Coruña, A Coruña, Spain), Luz Castro (Department of Information and Communication Technologies, University of A Coruña, A Coruña, Spain), Rebeca Perez (Department of Information and Communication Technologies, University of A Coruña, A Coruña, Spain) and João Correia (Department of Informatics Engineering, University of Coimbra, Coimbra, Portugal)
DOI: 10.4018/ijcicg.2014070104

In recent years, there have been attempts to discover the principles that determine aesthetic value in the domain of computing. Many diverse studies have tried to capture these principles through technical characteristics. To this end, and helped by the ease of acquiring data over the Internet, image datasets have been published whose contents were collected online, at random, from websites and photography competitions. To guarantee the validity of an aesthetic image classification system, one must first guarantee its capacity for generalization. This paper studies how the indiscriminate selection of images can affect the generalization capacity of a binary classifier.

2. State Of The Art

Among the works oriented towards automatic aesthetic classification, some of the most cited are those of Datta et al. (2006), Wong et al. (2008), Ke et al. (2006) and Luo et al. (2009). Each of these authors has proposed a different method in the search for ideal design characteristics based on technical components such as luminosity and saturation.

Despite differing in their aims and methods, these authors’ investigations have all employed the same kind of dataset: photographs accompanied by human evaluations.

Although these datasets may be a suitable source for study, each one comes with its own peculiarities. They consist of large groups of images created by third parties external to the investigation. Also, each photograph includes its own evaluation in the form of a rating carried out by various individuals on the basis of different criteria.

However, the conditions under which these ratings were given were not controlled, as they would be in an experiment conducted in person. Very little information is available about the participants, and we cannot rule out the possibility that extraneous variables have contaminated the sample.

Most of these photo databases exhibit a semantic bias and a bias in terms of content. Because they are aimed at professional photographers, they tend towards particular types of subject, framing and use of colour.
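
One way to see why such content bias matters for generalization is a small, purely illustrative sketch. It is not the experiment reported in this paper: the data are synthetic, the two features (labelled saturation and luminosity) and the nearest-centroid classifier are assumptions chosen for brevity, and all function names are hypothetical. Dataset A is built so that only the first feature relates to the rating, and dataset B so that only the second does. A binary classifier trained on A is then scored on held-out images from A and on images from B.

```python
import random

def make_dataset(n, informative, seed):
    """Build n synthetic 'images' described by two features in roughly [0, 1]
    (assumed here to stand for saturation and luminosity).
    Only the feature at index `informative` is related to the binary label."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2                                   # balanced classes
        features = [rng.random(), rng.random()]         # uninformative noise
        features[informative] = rng.gauss(0.8 if label else 0.2, 0.05)
        X.append(features)
        y.append(label)
    return X, y

def fit_centroids(X, y):
    """Nearest-centroid 'training': store the mean feature vector per class."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def accuracy(centroids, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(centroids, x) == t for x, t in zip(X, y))
    return hits / len(y)

# Dataset A: the rating follows feature 0. Dataset B: the rating follows feature 1.
train_a = make_dataset(400, informative=0, seed=1)
test_a = make_dataset(400, informative=0, seed=2)
test_b = make_dataset(400, informative=1, seed=3)

model = fit_centroids(*train_a)
within = accuracy(model, *test_a)   # same source as the training data
cross = accuracy(model, *test_b)    # different source, different bias

print(f"within-dataset accuracy: {within:.2f}")
print(f"cross-dataset accuracy:  {cross:.2f}")
```

In this toy setting the classifier scores highly on held-out images from its own source but performs close to chance on the other source, because it has learned the regularity specific to dataset A rather than anything that transfers. A gap of this kind between within-dataset and cross-dataset accuracy is one simple symptom of dataset bias.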
