Validating Machine Vision Competency Against Human Vision

Vani Ashok Hiremani, Kishore Kumar Senapati
Copyright: © 2023 | Pages: 18
DOI: 10.4018/978-1-7998-9220-5.ch061

Abstract

This work compares human and machine intelligence in classifying faces by geographical region, using a sample of 120 human identifiers and computational models such as a convolutional neural network and colour local binary pattern. A novel Indian colour face database is created, consisting of 2010 distinctive face images from the east and south regions. On the human side, an automated human intelligence system is established to evaluate human visual capabilities. On the machine side, the authors trained two ConvNets: one with more layers, trained on 1800 normal face images from the database, and another trained on 1000 contoured face images obtained by the Canny edge detection approximation method. The contour-trained network was used to assess the human response, which found face shape to be more discriminative than the other facial features. Experimental results showed that human classification proficiency (96%) was superior to the machine algorithms, even under challenging conditions.

Introduction

This chapter presents an empirical analysis of human intelligence: how humans perceive an image and which prominent facial features they rely on to decide its class, studied before a machine is trained for classification. At present, image classification accuracy is not high enough because of the large amount of redundant information and features. Human vision excels at object identification and recognition under difficult conditions and becomes more competent with age, so the primary focus should be on how human intelligence approaches image classification rather than only on training the machine. The human brain processes visual statistics in semantic space by extracting semantically important features such as contour information, line segments, and edges, which computers detect poorly. Machines, on the other hand, require high-resolution images and rigorous image processing before training. They have to process visual statistics in data space, obtained from strongly detectable but less informative features such as texture patterns and chromatic information (Zhang, 2010). Hence the primary aim of this chapter is to use human interaction and response analysis to validate the competency of machine intelligence in face classification. Understanding how humans extract features can enable a variety of AI applications with human-like performance. Over the years humans have shown considerable proficiency in judging age, gender, behavior, state of mind, and race from faces, even under many obstacles (Chellappa, 2010; Jain et al., 2011; Sinha et al., 2006; Sinha et al., 2007). This chapter addresses an intra-class classification problem, namely classifying Indian faces against other Indian faces (Kattia & Aruna, 2018).
The racial classification problem is more demanding in a highly populated and highly diverse country like India, where every region represents a different culture and set of traditions. This motivates building a labeled collection of regional faces. Racial classification of Caucasian, Black, and Asian races, along with gender (Brooks & Gwinn, 2010; Fu et al., 2014), has been performed precisely using computer vision (Tariq et al., 2016). Knowledge of race is an important first step in discriminating between faces. Faces from a geographical region share a stereotyped structure that incorporates many discriminative features. Both humans and machines process these features systematically, applying experience and computational logic respectively. The classification problem becomes harder when feature variations are subtle, i.e. for finer-grained races such as Chinese/Japanese/Korean (Duan et al., 2010), Chinese sub-ethnicities (Tin & Sein, 2010), and Myanmar (Bruce, 1986), but these studies have not characterized human performance in a systematic way. Which features humans actually consider for classification therefore remains an open question. To address this, an Automated Human Intelligence System (AHIS) is designed that involves randomly selected, untrained identifiers in a fine-grained race classification problem to evaluate the potential of human vision. Identifiers were questioned about the given regional face images, with emphasis on local conventional facial features (skin tone, face shape, shape of eyes, eyebrows, shape of nose, and orientation of mouth) and non-conventional features (style and color of applied vermillion, style of dressing, draping of the sari according to regional tradition, physique, moustache, and accessories such as jewelry and regional amulet thread). This rich feature set is intended as a reference input set for training neural models to solve computer vision problems. Non-conventional features can compensate for the absence of conventional ones.
Augmenting these symmetric features can bring tangible gains in classification. The human vision experiments showed an accuracy of 88% when the identifier and the person in the image were from different regions, and 96% when both were from the same region. Familiarity with the faces thus improved classification performance. The proficiency of humans in this classification problem is measured systematically, and the derived discriminative features are characterized with computer vision algorithms on a novel face database. This work emphasizes CNNs, since they have long achieved commendable success in image classification on large-scale datasets. The success of AlexNet has led researchers to improve classification precision by either reducing filter size or increasing network depth, and efficient pre-trained models such as GoogLeNet, which supply trained network weights, have reduced the number of steps required for the output to converge. CNN model training is a global optimization problem. The authors describe several variations that improve the traditional CNN model in finding the best-fitting set of parameters, incorporating three aspects: the Inception module, spectral pooling, and the leaky ReLU activation function. Face contour information is obtained using the Canny edge detection approximation method and characterized through a CNN model. This leads to exploring the perceptual annotation of how individual features influence the overall face. To this end the authors have developed a novel Indian regional face database (IRFD) consisting of a large set of distinctive face images from the north, east, west, and south regions of India, to mitigate the scarcity of labeled regional face images for future supervised classification. The face images were collected from different universities and acquired both online and offline.
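
Of the three CNN modifications named above, the leaky ReLU activation is the simplest to illustrate. The following is a minimal sketch in plain Python, not the authors' implementation. The function names `leaky_relu` and `leaky_relu_layer` and the negative slope of 0.01 are assumptions chosen for illustration; the chapter preview does not state the slope that was used.

```python
def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: pass positive inputs through unchanged and scale
    non-positive inputs by a small slope instead of zeroing them.

    negative_slope=0.01 is an assumed illustrative value.
    """
    return x if x > 0 else negative_slope * x


def leaky_relu_layer(values, negative_slope=0.01):
    """Apply leaky ReLU element-wise to a list of activations."""
    return [leaky_relu(v, negative_slope) for v in values]


if __name__ == "__main__":
    # Hypothetical pre-activation values, for illustration only.
    print(leaky_relu_layer([-2.0, 0.0, 3.5]))
```

Unlike the standard ReLU, the small non-zero slope keeps a gradient flowing for negative inputs, which is the usual motivation for choosing it.
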
IRFD is made public for further research, adding to the relatively few available datasets of Indian faces. The study also found that, apart from facial features, non-facial factors such as dressing style, physique, moustache, style of applying vermillion, and regional amulet thread influenced human decisions. The experiments yielded an accuracy of 96% under the assumption that the (human) identifier belongs to the same state as the person in the image. This outcome demonstrates the effectiveness and viability of the study on human intelligence. The findings are expected to serve as a booster for machine intelligence in feature selection.
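
The same-region versus different-region comparison reported in this introduction amounts to grouping identifier responses by whether the identifier's home region matches the region of the person in the image. The sketch below shows one way such a tally could be computed. The function name `accuracy_by_familiarity`, the record layout, and the toy records are all hypothetical; they are not data or code from the study.

```python
from collections import defaultdict


def accuracy_by_familiarity(responses):
    """Compute accuracy separately for same-region and different-region cases.

    responses: iterable of (identifier_region, true_region, predicted_region)
    tuples. This record layout is an assumption made for illustration.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for identifier_region, true_region, predicted_region in responses:
        # The group depends on whether the identifier is from the pictured region.
        if identifier_region == true_region:
            group = "same_region"
        else:
            group = "different_region"
        total[group] += 1
        correct[group] += int(predicted_region == true_region)
    return {group: correct[group] / total[group] for group in total}


if __name__ == "__main__":
    # Hypothetical toy records, not results from the chapter.
    toy_responses = [
        ("south", "south", "south"),
        ("south", "south", "south"),
        ("south", "east", "east"),
        ("south", "east", "south"),
    ]
    print(accuracy_by_familiarity(toy_responses))
```

Keeping the two groups separate is what makes a familiarity effect visible; a single pooled accuracy would hide it.
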
