Super Computer Heterogeneous Classifier Meta-Ensembles
Anthony Bagnall (University of East Anglia, UK), Gavin Cawley (University of East Anglia, UK), Ian Whittley (University of East Anglia, UK), Larry Bull (University of West of England, UK), Matthew Studley (University of West of England, UK), Mike Pettipher (University of Manchester, UK) and Firat Tekiner (University of Manchester, UK)
Copyright: © 2008
This article describes the entry of the Super Computer Data Mining (SCDM) Project to the 10th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) 2006 Data Mining Competition. The SCDM project is developing data mining tools for parallel execution on Linux clusters. The code is freely available; please contact the first author for a copy. We combine several classifiers, some of which are themselves ensemble techniques, into a heterogeneous meta-ensemble that produces a probability estimate for each test case. We then use a simple decision-theoretic framework to turn these estimates into a classification. The meta-ensemble contains a Bayesian neural network, a learning classifier system (LCS), attribute-selection-based ensemble algorithms (Filtered Attribute Subspace based Bagging with Injected Randomness [FASBIR]), and better-known classifiers such as logistic regression, Naive Bayes (NB), and C4.5.
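The two-stage idea described above (pool the members' probability estimates, then apply a decision rule) can be illustrated with a minimal sketch. This is not the SCDM code: the abstract does not state how the estimates are combined or which costs are used, so the weighted average, the function names `combine_probabilities` and `decide`, and the misclassification costs are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def combine_probabilities(
    member_probs: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Combine per-case positive-class probabilities from several classifiers.

    member_probs[m][i] is member m's estimate for test case i.
    ASSUMPTION: a (weighted) average is used; the source does not say
    which combination rule the meta-ensemble applies.
    """
    n_members = len(member_probs)
    if weights is None:
        # Equal weights by default (hypothetical choice).
        weights = [1.0 / n_members] * n_members
    n_cases = len(member_probs[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, member_probs))
        for i in range(n_cases)
    ]


def decide(p_positive: float, cost_fp: float = 1.0, cost_fn: float = 1.0) -> int:
    """Pick the class with the lower expected misclassification cost.

    ASSUMPTION: cost_fp and cost_fn are hypothetical costs of a false
    positive and a false negative; correct decisions cost nothing.
    """
    expected_cost_if_positive = (1.0 - p_positive) * cost_fp
    expected_cost_if_negative = p_positive * cost_fn
    return 1 if expected_cost_if_negative > expected_cost_if_positive else 0


# Made-up estimates from three members for three test cases.
member_probs = [
    [0.9, 0.2, 0.4],
    [0.7, 0.1, 0.3],
    [0.8, 0.3, 0.2],
]
pooled = combine_probabilities(member_probs)
equal_cost_labels = [decide(p) for p in pooled]
skewed_cost_labels = [decide(p, cost_fp=1.0, cost_fn=3.0) for p in pooled]
```

With equal costs the rule reduces to thresholding the pooled probability at 0.5. Making a false negative three times as costly lowers the threshold to 0.25, so a borderline case can switch to the positive class.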