Outliers, Missing Values, and Reliability: An Integrated Framework for Pre-Processing of Coding Data

Swati Aggarwal (NSIT, India) and Shambeel Azim (Vidyadaan Institute of Technology and Management, India)
DOI: 10.4018/978-1-5225-1008-6.ch014


Reliability is a major concern in qualitative research. Most current research deals with measuring the reliability of data, but little work has been reported on how to improve the reliability of unreliable data. This paper discusses three important aspects of data pre-processing: detecting outliers, dealing with missing values, and increasing the reliability of the dataset. The authors propose a framework for pre-processing inter-judged data that is incomplete and contains erroneous values. The framework integrates three approaches: Krippendorff's alpha for reliability computation, a frequency-based outlier detection method, and a hybrid fuzzy c-means and multilayer perceptron based imputation technique. The proposed integrated approach increases the reliability of the dataset, which can then be used to draw strong conclusions.
Chapter Preview

Several researchers have proposed ways to estimate intercoder reliability in content analysis. Cohen’s Kappa (Cohen, 1960) is a reliability measure that works only on nominal data and corrects for agreement that occurs by chance. It is also limited to exactly two raters. Fleiss’ Kappa (Fleiss, 1971) improves on Cohen’s Kappa by handling more than two raters, but it shares the same limitation of applying to nominal data only. Cronbach’s alpha is the most commonly used reliability coefficient (Hogan et al., 2000). It measures the internal consistency of a test. The problem with Cronbach’s alpha is that it is not robust: a single observation can greatly affect the coefficient value (Christmann & Van, 2006). Krippendorff’s alpha (Hayes et al., 2007; Krippendorff, 2013, 2011) improves on these statistics for measuring the reliability of interrater data. It is a flexible, reliable measure that can be used with any number of raters, with any metric or level of measurement, and with missing data.
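To make the last point concrete, the following is a minimal sketch of Krippendorff's alpha for nominal data, using the standard coincidence-matrix formulation. It is not the chapter authors' implementation. The function name `krippendorff_alpha_nominal`, the list-of-lists data layout, and the use of `None` for a missing rating are all assumptions made for illustration. Other levels of measurement (ordinal, interval, ratio) would need a different disagreement weight than the simple "same or different" test used here.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal ratings (illustrative sketch).

    units: iterable of per-unit rating lists, one entry per rater.
           None marks a missing rating (an assumed convention).
    """
    # Coincidence matrix: every ordered pair of ratings within a unit
    # contributes 1 / (m - 1), where m is the number of ratings present.
    coincidences = Counter()
    for ratings in units:
        values = [r for r in ratings if r is not None]
        m = len(values)
        if m < 2:
            continue  # a unit with fewer than two ratings cannot be paired
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per category, and the total number of pairable values.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least one unit with two or more ratings")

    # Observed and expected disagreement (nominal metric: 1 if a != b, else 0).
    # Both are scaled by the common factor n, which cancels in the ratio.
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one category is used")

    return 1.0 - observed / expected


if __name__ == "__main__":
    # Hypothetical data: four units, three raters, with missing ratings.
    example = [
        ["a", "a", None],
        ["a", "b", "b"],
        ["b", "b", "b"],
        [None, "a", None],  # only one rating, so this unit is skipped
    ]
    print(krippendorff_alpha_nominal(example))
```

Units with fewer than two ratings are dropped rather than imputed, which is how the coefficient tolerates missing data. In the framework described above, imputation would be a separate step before reliability is recomputed.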
