Data Mining
Data mining is the process of discovering new patterns in large data sources. Knowledge discovery from databases (KDD) is an interdisciplinary subfield of computer science concerned with discovering patterns in large data sets, using methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use.
Beyond the raw analysis itself, the process involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating. The term is not limited to large-scale data analysis or information processing; it is also applied more loosely to any kind of computer decision support system, including artificial intelligence, machine learning, and business intelligence.
Data mining uses historical data to analyze the likely outcome of a particular problem or situation that may arise. It typically works on data stored in data warehouses.
Managers also use data mining to decide on marketing strategies for their products. They can also use the data to compare themselves with competitors.
Data mining turns its data into real-time analysis that can be used to increase sales, promote new products, or drop products that do not add value to the company. It is mostly used to support business decision making, an application often referred to as business intelligence. In the broad sense, data mining is the entire process of applying computer methodology for knowledge discovery.
Steps in data mining:
- Data Cleaning: The phase in which noise and irrelevant data are removed from the collection.
- Data Integration: In this stage, multiple data sources, often heterogeneous, are combined into a common source.
- Data Selection: At this stage, the data relevant to the analysis is identified and retrieved from the data collection.
- Data Transformation: Also known as data consolidation. In this phase, the selected data is transformed into forms appropriate for the mining procedure.
- Data Mining: The crucial step, in which intelligent techniques are applied to extract potentially useful patterns.
- Pattern Evaluation: In this step, the truly interesting patterns representing knowledge are identified based on given interestingness measures.
- Knowledge Representation: The final phase, in which the discovered knowledge is presented visually to the user. This step uses visualization techniques to help users understand and interpret the data mining results, so that the output appears in a human-readable format.
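The steps above can be sketched as a small pipeline. The following is a minimal illustration in Python, not a reference implementation: the purchase records, the function names, and the choice of co-occurring item pairs with a minimum-support threshold as the "pattern" and "measure" are all assumptions made for the example, and a plain text report stands in for the visualization step.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw data from two sources (illustrative only).
store_a = [
    {"id": 1, "items": ["bread", "milk", ""]},   # contains an empty item name
    {"id": 2, "items": ["bread", "butter"]},
    {"id": 3, "items": None},                    # noisy record with no items
]
store_b = [
    {"id": 4, "items": ["Milk", "bread", "butter"]},
    {"id": 5, "items": ["milk", "bread"]},
]


def clean(records):
    """Data cleaning: drop records without items and remove blank item names."""
    cleaned = []
    for rec in records:
        if not rec["items"]:
            continue
        items = [i for i in rec["items"] if i and i.strip()]
        if items:
            cleaned.append({"id": rec["id"], "items": items})
    return cleaned


def integrate(*sources):
    """Data integration: combine several sources into one collection."""
    return [rec for src in sources for rec in src]


def select(records):
    """Data selection: keep only the field relevant to the analysis."""
    return [rec["items"] for rec in records]


def transform(baskets):
    """Data transformation: normalize each basket to sorted, lowercase, unique names."""
    return [sorted({i.strip().lower() for i in basket}) for basket in baskets]


def mine_pairs(baskets):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(basket, 2))
    return counts


def evaluate(counts, n_baskets, min_support):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    return {
        pair: count / n_baskets
        for pair, count in counts.items()
        if count / n_baskets >= min_support
    }


def present(patterns):
    """Knowledge representation: render the patterns as human-readable lines."""
    ordered = sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in ordered]


records = integrate(clean(store_a), clean(store_b))
baskets = transform(select(records))
patterns = evaluate(mine_pairs(baskets), len(baskets), min_support=0.5)
report = present(patterns)
for line in report:
    print(line)
```

Each function corresponds to one step in the list, so an individual stage can be swapped out (for example, a different mining technique or a graphical presentation) without changing the others.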