Visualization of High Dimensional Data

Gokmen Zararsiz (Hacettepe University Ankara, Turkey), Cenk Icoz (Anadolu University Eskisehir, Turkey) and Erdener Ozcetin (Anadolu University Eskisehir, Turkey)
Copyright: © 2014 | Pages: 12
DOI: 10.4018/978-1-4666-5202-6.ch236

Chapter Preview

Background

The visualization of data has always been a strong desire of, and an interesting application for, data analysts (Fyfe & Garcia-Osorio, 2005). The era of data began after the pioneering work of John Wilder Tukey in exploratory data analysis. Today, not only statisticians and data analysts but also researchers in other fields, such as doctors, chemists, and geologists, represent data visually in order to look for patterns and interactions. These researchers try to answer the questions that arise during their research by investigating only the data they have gathered (Donoho, 2000). As expected, data exist in all areas, and their volume will continue to increase.

During the 1970s, statistical graphics for high dimensional data (HDD) were proposed to let researchers find patterns and interactions in progressively higher dimensions. Andrews’ curves (Andrews, 1972) and Chernoff faces (Chernoff, 1973) are examples of these graphics. Dimension reduction techniques were also generalized to extract interesting information from HDD in lower dimensional graphics (Friendly, 2008). In 1974, PRIM-9, the first dynamic and interactive tool for viewing and manipulating HDD in up to 9 dimensions, was developed (Fisherkeller, Friedman, & Tukey, 1974).
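As an illustration of the Andrews' curves mentioned above, the following minimal sketch evaluates the standard Andrews function, f_x(t) = x1/√2 + x2 sin(t) + x3 cos(t) + x4 sin(2t) + x5 cos(2t) + ..., for a single observation. Each observation in the data set becomes one curve over t in [-π, π], so observations with similar values produce similar curves. The function names `andrews_curve` and `andrews_curve_points` and the sample values are hypothetical and chosen only for this example; the sketch is not taken from the chapter.

```python
import math


def andrews_curve(x, t):
    """Evaluate the Andrews function f_x(t) for one observation x at angle t."""
    # First term of the series: x1 / sqrt(2)
    value = x[0] / math.sqrt(2)
    # Remaining terms alternate sin and cos, with the frequency
    # increasing after every sin/cos pair: 1, 1, 2, 2, 3, 3, ...
    for i, xi in enumerate(x[1:], start=1):
        k = (i + 1) // 2
        if i % 2 == 1:
            value += xi * math.sin(k * t)
        else:
            value += xi * math.cos(k * t)
    return value


def andrews_curve_points(x, n_points=101):
    """Return (ts, values): the curve of x sampled on an even grid over [-pi, pi]."""
    ts = [-math.pi + 2 * math.pi * j / (n_points - 1) for j in range(n_points)]
    return ts, [andrews_curve(x, t) for t in ts]


# Hypothetical 5-dimensional observation, used only to show the call pattern.
sample = [1.0, 2.0, 3.0, 0.5, -1.0]
ts, values = andrews_curve_points(sample)
```

Plotting `values` against `ts` for every observation, with one line per observation, gives the familiar Andrews plot. Because the order of the variables determines which frequency each one receives, reordering the columns changes the appearance of the curves.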

Key Terms in this Chapter

Cartesian Space: Euclidean space defined by Cartesian coordinates.

Dimension Reduction Techniques: Techniques for working with fewer dimensions than the original data have while retaining as much of the total variance as possible.

Exploratory Data Analysis: A statistical approach that relies on graphical techniques to get an initial idea of the data at hand. It can be regarded as a preliminary analysis carried out before drawing any conclusions about the data. An advantage of these techniques is that they do not depend on distributional assumptions.

Cloud Computing: The use of remote resources via a network or the Internet.

Bar Chart Techniques: A way of summarizing categorical data by using bar graphs.

Data Mining Techniques: Techniques used in data mining, such as clustering or classification, to extract meaningful patterns from data.

Outlier: An observation whose value is numerically extreme relative to the remaining data.
