Preparing the dataset is a critical step in data mining. If the input contains noise, errors, or other problems, the results will reflect them. Because the data represent real-world observations, not every possible combination of variable values should appear: the variables are expected to be correlated. If every possible combination were represented, and represented equally often, the variables would tell us nothing about one another, and there would be no knowledge to gain from the mining process.
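To make the point about combinations concrete, here is a minimal sketch on made-up toy data. The function names, the (season, clothing) records, and the choice of mutual information as the measure of "knowledge to be gained" are all assumptions for illustration, not something the text prescribes. The sketch compares how many value combinations actually occur with how many are possible, and measures how much one variable tells us about the other.

```python
from collections import Counter
from itertools import product
from math import log2


def combination_coverage(rows):
    """Return (observed, possible) counts of distinct value combinations.

    `rows` is a list of equal-length tuples, one tuple per record.
    """
    columns = list(zip(*rows))
    possible = 1
    for col in columns:
        possible *= len(set(col))  # distinct values seen in this column
    observed = len(set(rows))      # distinct combinations actually present
    return observed, possible


def mutual_information(pairs):
    """Mutual information (in bits) between the two fields of `pairs`."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


# Hypothetical toy data: (season, clothing) records.
# Correlated: only "sensible" combinations occur.
correlated = [("summer", "shorts")] * 4 + [("winter", "coat")] * 4
# Uniform: every combination occurs, and equally often.
uniform = list(product(["summer", "winter"], ["shorts", "coat"])) * 2

for name, data in [("correlated", correlated), ("uniform", uniform)]:
    observed, possible = combination_coverage(data)
    mi = mutual_information(data)
    print(f"{name}: {observed}/{possible} combinations, MI = {mi:.2f} bits")
```

In the correlated sample only two of the four possible combinations occur, and knowing the season fully determines the clothing. In the uniform sample all four combinations occur equally often, and the mutual information drops to zero: there is no pattern left to mine.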