Does Protecting Databases Using Perturbation Techniques Impact Knowledge Discovery?

Rick L. Wilson (Oklahoma State University, USA) and Peter A. Rosen (University of Evansville, USA)
Copyright: © 2005 | Pages: 12
DOI: 10.4018/978-1-59140-471-2.ch003

Abstract

Data perturbation is a data security technique that adds noise, in the form of random numbers, to numerical database attributes in order to keep individual records confidential. Generalized Additive Data Perturbation (GADP) methods are a family of techniques that preserve key summary information about the data while satisfying the security requirements of the database administrator. However, the effectiveness of these techniques has been studied only with simple aggregate measures of the database, such as averages. To compete in today's business environment, organizations must use data mining to discover information about themselves that may be hidden in their databases. Database administrators therefore face competing objectives: protecting confidential data versus disclosing it for data mining applications. This chapter empirically explores whether the protection provided by perturbation techniques adds a so-called Data Mining Bias to the database. The original study found limited support for this idea, but a follow-up study on a larger, more realistically sized database found stronger support for the existence of this bias.
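
The basic idea of additive perturbation can be illustrated with a minimal sketch. It shows plain additive noise only, with independent zero-mean Gaussian draws added to one numerical attribute; it is not GADP, which is designed to preserve more of the data's summary information. The function name `perturb`, the parameter `noise_sd`, and the salary figures are hypothetical and chosen for illustration.

```python
import random
import statistics


def perturb(values, noise_sd, seed=None):
    """Return a copy of `values` with zero-mean Gaussian noise added.

    Simple additive noise for illustration only (not GADP).
    `noise_sd` is the standard deviation of the noise; `seed` makes
    the result reproducible.
    """
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, noise_sd) for v in values]


# Hypothetical confidential attribute (illustrative values only).
salaries = [52000.0, 61000.0, 58500.0, 73000.0, 49000.0, 66500.0]
masked = perturb(salaries, noise_sd=2000.0, seed=42)

# Individual values change, while an aggregate such as the mean
# is disturbed only by the average of the noise.
print("original mean:", statistics.mean(salaries))
print("masked mean:  ", statistics.mean(masked))
```

Because the noise has zero mean, averages over many records stay close to their original values while each individual record is altered. Whether patterns found by data mining tools survive the same treatment is the question the chapter examines.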
