A Free Service of IGI Global Publishing House

What is Data Stream Pre-Processing

Handbook of Research on Text and Web Mining Technologies
The application, prior to the mining phase, of several methods aimed at improving the overall data mining results. It usually consists of (1) data cleaning, which fixes missing values, outliers, and possibly inconsistent data, and (2) data reduction, which is the application of any technique (affecting data representation) that saves storage space without compromising the ability to query the compressed data.
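As a rough illustration of the two steps named in this definition, the Python sketch below cleans one window of stream readings and then reduces it to bucket sums. The function names, the median fill rule, the clamping bounds, and the sample readings are all assumptions made for this example; the definition itself does not prescribe any particular technique.

```python
from statistics import median


def clean_window(values, low, high):
    """Data cleaning (assumed rules): fill missing readings (None) with the
    median of the observed ones, then clamp outliers into [low, high]."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [min(max(fill if v is None else v, low), high) for v in values]


def reduce_window(values, bucket_size):
    """Data reduction (assumed rule): keep one sum per run of bucket_size
    values, i.e. a flat equi-width histogram of the window."""
    return [sum(values[i:i + bucket_size])
            for i in range(0, len(values), bucket_size)]


# Hypothetical readings: two missing values and one outlier (100).
raw = [4, None, 5, 100, 6, 5, None, 4]
cleaned = clean_window(raw, low=0, high=10)
buckets = reduce_window(cleaned, bucket_size=4)
print(cleaned)
print(buckets)
```

Eight readings are stored as two bucket sums, so a sum over a whole bucket is still answered exactly, while a sum over part of a bucket can only be estimated. That is the storage-versus-queryability trade-off the definition refers to.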
Published in Chapter:
Approximate Range Querying over Sliding Windows
Francesco Buccafurri (University “Mediterranea” of Reggio Calabria, Italy)
Copyright: © 2009 | Pages: 15
DOI: 10.4018/978-1-59904-990-8.ch016
Abstract
In the context of Knowledge Discovery in Databases, data reduction is a pre-processing step that delivers succinct yet meaningful data to subsequent stages. If the targets of mining are data streams, it is crucial to reduce them suitably, since analyses on such data often require multiple scans. In this chapter, we propose a histogram-based approach to reducing sliding windows that supports approximate arbitrary (i.e., non-biased) range-sum queries. The histogram is based on a hierarchical structure (as opposed to the flat structure of traditional ones), which makes it suitable for directly supporting hierarchical queries such as drill-down and roll-up operations. In particular, both sliding window shifting and query answering take time logarithmic in the sliding window size. Experimental analysis shows that our method is more accurate than state-of-the-art histogram-based sliding window reduction techniques.
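To make the logarithmic-cost claim concrete, the Python sketch below lays a binary tree of partial sums over a circular buffer. It is not the chapter's histogram: it is exact and uncompressed, whereas the chapter's structure is approximate and space-reduced. The class name `SlidingWindowSums`, its methods, and the power-of-two size restriction are assumptions made for this illustration only.

```python
class SlidingWindowSums:
    """Exact range sums over the last `size` stream values (hypothetical
    sketch). `size` must be a power of two in this simplified version."""

    def __init__(self, size):
        if size < 1 or size & (size - 1):
            raise ValueError("size must be a power of two")
        self.size = size
        self.tree = [0] * (2 * size)  # tree[1] is the root; leaves start at index size
        self.count = 0                # number of values seen so far

    def push(self, value):
        """Shift the window by one value: overwrite the oldest slot and
        recompute its O(log size) ancestors."""
        node = self.size + self.count % self.size
        self.tree[node] = value
        node //= 2
        while node >= 1:
            self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]
            node //= 2
        self.count += 1

    def _slot_sum(self, lo, hi):
        """Sum of buffer slots lo..hi-1 using a bottom-up tree walk."""
        total = 0
        lo += self.size
        hi += self.size
        while lo < hi:
            if lo & 1:
                total += self.tree[lo]
                lo += 1
            if hi & 1:
                hi -= 1
                total += self.tree[hi]
            lo //= 2
            hi //= 2
        return total

    def range_sum(self, start, end):
        """Sum of window positions start..end-1, where 0 is the oldest value."""
        filled = min(self.count, self.size)
        if not 0 <= start <= end <= filled:
            raise IndexError("range outside the current window")
        oldest = self.count % self.size if self.count >= self.size else 0
        lo = (oldest + start) % self.size
        length = end - start
        if lo + length <= self.size:
            return self._slot_sum(lo, lo + length)
        # The requested range wraps around the end of the circular buffer.
        return (self._slot_sum(lo, self.size)
                + self._slot_sum(0, lo + length - self.size))

    def roll_up(self, level):
        """Partial sums at one tree level (0 = root, deeper = finer),
        listed in buffer order rather than arrival order."""
        first = 1 << level
        return self.tree[first:2 * first]


w = SlidingWindowSums(4)
for v in range(1, 7):      # the window ends up holding 3, 4, 5, 6
    w.push(v)
print(w.range_sum(1, 3))   # 4 + 5
print(w.roll_up(1))        # coarser view: two half-window sums
```

Moving from `roll_up(1)` to `roll_up(2)` corresponds to a drill-down, and the reverse to a roll-up. A space-reduced variant would keep only some levels of the tree and estimate sums inside the finest stored bucket, which is where approximation would enter.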