Approximating Proximity to Fast and Robust Distance-Based Clustering

Vladimir Estivill-Castro (University of Newcastle, Australia) and Michael Houle (University of Sydney, Australia)
Copyright: © 2002 | Pages: 21
DOI: 10.4018/978-1-930708-25-9.ch002

Distance-based clustering results in optimization problems that typically are NP-hard or NP-complete and for which only approximate solutions are obtained. For the large instances emerging in data mining applications, the search for high-quality approximate solutions in the presence of noise and outliers is even more challenging. We exhibit fast and robust clustering methods that rely on the careful collection of proximity information for use by hill-climbing search strategies. The proximity information gathered approximates the nearest-neighbor information produced by traditional, exact, but expensive methods. The proximity information is then used to produce fast approximations of robust objective optimization functions, and/or to compare two feasible solutions rapidly. These methods have been successfully applied to spatial and categorical data, surpassing well-established methods such as k-MEANS in terms of the trade-off between quality and complexity.
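
As a rough illustration of the idea in the abstract (not the chapter's actual algorithm), the following minimal Python sketch shows one way proximity lists could steer a hill-climbing search. Everything in it is an assumption made for illustration: the function names `knn_lists`, `medoid_cost`, and `hill_climb` are hypothetical, the neighbor lists are computed exactly by brute force as a stand-in for the approximate proximity information the chapter describes, and the objective is a medoid-style sum of unsquared distances, chosen because it is less sensitive to outliers than the squared-error criterion of k-MEANS.

```python
import math
import random


def knn_lists(points, k):
    """Brute-force k-nearest-neighbor lists.

    Assumption: a stand-in for the cheaper, approximate proximity
    information described in the abstract.
    """
    lists = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        lists.append(others[:k])
    return lists


def medoid_cost(points, medoids):
    """Sum of (unsquared) distances from each point to its nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)


def hill_climb(points, n_clusters, k=5, seed=0):
    """Hill-climb over medoid sets, trying only swaps suggested by the k-NN lists."""
    rng = random.Random(seed)
    neighbors = knn_lists(points, k)
    medoids = rng.sample(range(len(points)), n_clusters)
    cost = medoid_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for slot in range(n_clusters):
            # Candidate replacements are limited to the current medoid's neighbors.
            for cand in neighbors[medoids[slot]]:
                if cand in medoids:
                    continue
                trial = medoids[:slot] + [cand] + medoids[slot + 1:]
                trial_cost = medoid_cost(points, trial)
                if trial_cost < cost:  # accept only strict improvements
                    medoids, cost = trial, trial_cost
                    improved = True
                    break
    return medoids, cost


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    print(hill_climb(pts, n_clusters=2, k=3, seed=1))
```

Restricting candidate swaps to a medoid's neighbor list keeps each hill-climbing step cheap, at the price of possibly stopping in a local optimum; in this sketch the cost of each trial solution is still evaluated exactly, whereas the abstract describes approximating the objective as well.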

Complete Chapter List

Table of Contents
Hussein A. Abbass, Ruhul Sarker, Charles S. Newton
Chapter 1
Vladimir Estivill-Castro, Michael Houle
Distance-based clustering results in optimization problems that typically are NP-hard or NP-complete and for which only approximate solutions are...
Approximating Proximity to Fast and Robust Distance-Based Clustering
Chapter 2
Erick Cantu-Paz
With computers becoming more pervasive, disks becoming cheaper, and sensors becoming ubiquitous, we are collecting data at an ever-increasing pace....
On the Use of Evolutionary Algorithms in Data Mining
Chapter 3
Beatriz de la Iglesia, Victor J. Rayward-Smith
Knowledge Discovery in Databases (KDD) is an iterative and interactive process involving many steps (Debuse, de la Iglesia, Howard & Rayward-Smith...
The Discovery of Interesting Nuggets Using Heuristic Techniques
Chapter 4
Jay T. Rodstein, Katherine S. Watters
Safety and health issues in virtual offices are part of progressive telecommuting programs. Telecommuting agreements between employers and employees...
From Evolution to Immune to Swarm to? A Simple Introduction to Modern Heuristics
Chapter 5
Inaki Inza, Pedro Larranaga, Basilio Sierra
Feature Subset Selection (FSS) is a well-known task of Machine Learning, Data Mining, Pattern Recognition or Text Learning paradigms. Genetic...
Estimation of Distribution Algorithms for Feature Subset Selection in Large Dimensionality Domains
Chapter 6
Jorge Muruzabal
Evolutionary algorithms are by now well-known and appreciated in a number of disciplines including the emerging field of data mining. In the last...
Towards the Cross-Fertilization of Multiple Heuristics: Evolving Teams of Local Bayesian Learners
Chapter 7
Neil Dunstan, Michael de Raadt
Sensing devices are commonly used for the detection and classification of subsurface objects, particularly for the purpose of eradicating Unexploded...
Evolution of Spatial Data Templates for Object Classification
Chapter 8
Peter W.H. Smith
Genetic Programming (GP) has increasingly been used as a data-mining tool. For example, it has successfully been used for decision tree induction...
Genetic Programming as a Data-Mining Tool
Chapter 9
Andries P. Engelbrecht, L. Schoeman, Sonja Rouwhorst
Genetic programming has recently been used successfully to extract knowledge in the form of IF-THEN rules. For these genetic programming approaches...
A Building Block Approach to Genetic Programming for Rule Discovery
Chapter 10
Rafael S. Parpinelli, Heitor S. Lopes, Alex A. Freitas
This work proposes an algorithm for rule discovery called Ant-Miner (Ant Colony-Based Data Miner). The goal of Ant-Miner is to extract...
An Ant Colony Algorithm for Classification Rule Discovery
Chapter 11
Jonathan Timmis, Thomas Knight
The immune system is highly distributed, highly adaptive, self-organising in nature, maintains a memory of past encounters and has the ability to...
Artificial Immune Systems: Using the Immune System as Inspiration for Data Mining
Chapter 12
Leandro Nunes de Castro, Fernando J. Von Zuben
This chapter shows that some of the basic aspects of the natural immune system discussed in the previous chapter can be used to propose a novel...
aiNet: An Artificial Immune Network for Data Analysis
Chapter 13
David Taniar, J. Wenny Rahayu
Data mining refers to a process of nontrivial extraction of implicit, previously unknown and potentially useful information (such as knowledge rules...
Parallel Data Mining
About the Authors