Methods for the Identification of Data Outliers in Interactive SQL

Ronald Dattero, Edna M. White, Marius A. Janson
Copyright: © 1991 | Pages: 12
DOI: 10.4018/jdm.1991010102

Abstract

The purpose of this paper is twofold. First, the paper discusses the importance of identifying data outliers (that is, extremely unusual values) for three major purposes in the management of database systems: (1) data validation, (2) statistical analysis, and (3) extreme point recognition. Outliers are defined following the original approach developed by Tukey for use in Exploratory Data Analysis. Second, the paper develops two outlier identification procedures that can be inserted directly into relational database systems through the nonprocedural relational database language SQL (Structured Query Language). The first procedure, the Hinge Procedure, follows Tukey’s definition of outliers exactly, but requires the creation of an additional base table. The second procedure, the Quartile Procedure, provides only a close approximation to Tukey’s definition, but can be implemented solely through views. Developing these procedures was non-trivial because SQL offers only a limited set of mathematical functions. The advantages and disadvantages of the two procedures are discussed in the paper.
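
As a rough illustration of the hinge-based outlier definition the abstract refers to, the following Python sketch flags values that lie beyond Tukey-style fences. It is not the paper's SQL Hinge or Quartile Procedure, whose details the abstract does not give. The function names `tukey_hinges` and `tukey_outliers` are hypothetical, and the multiplier `k=1.5` is the commonly cited textbook value, assumed here rather than taken from the paper.

```python
from statistics import median


def tukey_hinges(values):
    """Return (lower_hinge, upper_hinge) for a non-empty list of numbers.

    The hinges are the medians of the lower and upper halves of the sorted
    data. When the count is odd, the overall median is included in both
    halves (the usual textbook convention, assumed here).
    """
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one value")
    half = (n + 1) // 2  # size of each half; halves share the median if n is odd
    return median(data[:half]), median(data[n - half:])


def tukey_outliers(values, k=1.5):
    """Return the values lying more than k hinge-spreads outside the hinges.

    k=1.5 is an assumed default, not a figure stated in the abstract.
    """
    lower, upper = tukey_hinges(values)
    spread = upper - lower            # distance between the two hinges
    low_fence = lower - k * spread    # anything below this is flagged
    high_fence = upper + k * spread   # anything above this is flagged
    return [v for v in values if v < low_fence or v > high_fence]


if __name__ == "__main__":
    sample = [10, 12, 13, 14, 15, 16, 18, 95]
    print(tukey_hinges(sample))
    print(tukey_outliers(sample))
```

Computing the hinges requires sorting and locating medians of half-samples, which hints at why, as the abstract notes, expressing the same logic in SQL with few built-in mathematical functions was non-trivial.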
