DWFIST: The Data Warehouse of Frequent Itemsets Tactics Approach
Rodrigo Salvador Monteiro (Federal University of Rio de Janeiro, Brazil and University of Stuttgart, Germany), Geraldo Zimbrao (Federal University of Rio de Janeiro, Brazil), Holger Schwarz (University of Stuttgart, Germany), Bernhard Mitschang (University of Stuttgart, Germany) and Jano Moreira de Souza (Federal University of Rio de Janeiro, Brazil and University of Stuttgart, Germany)
Copyright: © 2008
This chapter presents the core of the DWFIST approach, which supports the analysis and exploration of frequent itemsets and derived patterns, such as association rules, in transactional datasets. The goal of this new approach is to provide: (1) flexible pattern-retrieval capabilities without requiring the original data during the analysis phase; and (2) a standard model for data warehouses of frequent itemsets, which eases the development and reuse of tools for the analysis and exploration of itemset-based patterns. Instead of storing the original datasets, our approach organizes the frequent itemsets that hold on different partitions of the original transactions in a data warehouse that retains sufficient information for future analysis. A running example for mining calendar-based patterns on data streams is presented. Staging area tasks are discussed, and standard conceptual and logical schemas are presented. Properties of this standard model allow the retrieval of frequent itemsets holding on any set of partitions, along with upper and lower bounds on their frequency counts. Furthermore, precision guarantees are provided for some interestingness measures of association rules.
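The idea of retrieving an itemset over any set of partitions with upper and lower bounds on its frequency count can be illustrated with a minimal sketch. It is not the chapter's schema or algorithm: the dictionary layout, the `min_count` field (an absolute per-partition frequency threshold), and the function name `frequency_bounds` are all assumptions made for illustration. The sketch assumes that each partition stores exact counts only for itemsets that reached its threshold, so an itemset missing from a partition can have occurred at most `min_count - 1` times there.

```python
def frequency_bounds(itemset, partitions):
    """Return (lower, upper) bounds on the count of `itemset` over `partitions`.

    Hypothetical layout: each partition is a dict with
      "min_count": absolute frequency threshold used when mining the partition
      "frequent":  mapping frozenset(items) -> exact count, for frequent itemsets only
    """
    key = frozenset(itemset)
    lower = 0
    upper = 0
    for part in partitions:
        count = part["frequent"].get(key)
        if count is not None:
            # The itemset was frequent here, so its exact count is known.
            lower += count
            upper += count
        else:
            # Not stored: the count was below the threshold, so it lies
            # somewhere between 0 and min_count - 1.
            upper += max(part["min_count"] - 1, 0)
    return lower, upper


# Illustrative data only: three partitions, e.g. three time periods.
partitions = [
    {"min_count": 10, "frequent": {frozenset({"a", "b"}): 25}},
    {"min_count": 8, "frequent": {}},
    {"min_count": 12, "frequent": {frozenset({"a", "b"}): 14}},
]

bounds = frequency_bounds({"a", "b"}, partitions)
print(bounds)  # (39, 46)
```

In this sketch the gap between the two bounds comes entirely from partitions where the itemset was not frequent, so lower thresholds at mining time give tighter bounds at analysis time, at the cost of storing more itemsets.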