Experimental Study III: Forest Cover Type Dataset

DOI: 10.4018/978-1-5225-5029-7.ch006

Abstract

This chapter implements the proposed model on the Forest Cover Type data set. It covers pattern extraction from this dataset, following the series of steps discussed in the proposed model chapter. It also details pattern prediction from the same dataset for numeric variables, nominal variables, and aggregate data, which likewise follows the series of steps discussed earlier.

6.1 Dataset Introduction

The Forest Cover Type data set is taken from the UCI Machine Learning Repository website (Asuncion & Newman, 2007). The data set has a mixture of numeric and nominal variables (13 in total) and contains 581,012 records. Ten of the variables are numeric and the remaining three are nominal. The numeric variables are Elevation, Aspect, Slope, Horizontal Distance To Hydrology, Vertical Distance To Hydrology, Horizontal Distance To Roadways, Hillshade 9am, Hillshade Noon, Hillshade 3pm, and Horizontal Distance To Fire Points. The nominal variables and their distinct values are shown in Table 1.

Table 1. Nominal variables along with their distinct values in the Forest dataset

Nominal Variable    Distinct Values
Wilderness_Areas    Rawah_Wilderness_Area, Comanche_Peak_Wilderness_Area, Neota_Wilderness_Area, Cache_la_Poudre_Wilderness_Area
Soil_Type           SoilType, SoilType2, …, SoilType40
Cover_Type          Spruce/Fir, Krummholz, Lodgepole_Pine, Aspen, Douglas-fir, Ponderosa_Pine, Cottonwood/Willow
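
The variable inventory described in this section can be written down as a small schema, which makes the counts (10 numeric, 3 nominal, 13 in total) easy to check. The following is a minimal sketch in Python. The names `NUMERIC_VARIABLES`, `NOMINAL_VARIABLES`, and `is_allowed_value` are illustrative assumptions, not part of the chapter's implementation. The Soil_Type values are left unspecified because the table elides them.

```python
# Numeric variables exactly as named in Section 6.1.
NUMERIC_VARIABLES = [
    "Elevation", "Aspect", "Slope",
    "Horizontal Distance To Hydrology",
    "Vertical Distance To Hydrology",
    "Horizontal Distance To Roadways",
    "Hillshade 9am", "Hillshade Noon", "Hillshade 3pm",
    "Horizontal Distance To Fire Points",
]

# Nominal variables and their distinct values from Table 1.
# None means "value list not spelled out in the source table".
NOMINAL_VARIABLES = {
    "Wilderness_Areas": [
        "Rawah_Wilderness_Area",
        "Comanche_Peak_Wilderness_Area",
        "Neota_Wilderness_Area",
        "Cache_la_Poudre_Wilderness_Area",
    ],
    "Soil_Type": None,  # elided in Table 1, so not enumerated here
    "Cover_Type": [
        "Spruce/Fir", "Krummholz", "Lodgepole_Pine", "Aspen",
        "Douglas-fir", "Ponderosa_Pine", "Cottonwood/Willow",
    ],
}


def is_allowed_value(variable, value):
    """Check a nominal value against the schema (hypothetical helper)."""
    allowed = NOMINAL_VARIABLES[variable]
    # Variables without an enumerated value list accept any value.
    return allowed is None or value in allowed


total_variables = len(NUMERIC_VARIABLES) + len(NOMINAL_VARIABLES)
```

A call such as `is_allowed_value("Cover_Type", "Aspen")` returns `True`, while a value that does not appear in Table 1 is rejected.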

6.2 Pattern Extraction

This section presents the first part of the proposed framework, called Pattern Extraction. As discussed above, this part works through a series of steps in which each step takes its input from the previous step. The steps are generation of hierarchical clusters, ranking of variables, multi-dimensional scaling, schema generation, mining of association rules, advanced evaluation of association rules, and visualization of the extracted patterns. The subsections below explain each step briefly and then provide its implementation using the example dataset.
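
As a rough illustration of the first step listed above (generation of hierarchical clusters), the sketch below runs agglomerative clustering on a handful of records. This preview does not state which clustering algorithm, linkage, or distance measure the proposed model uses, so single linkage with Euclidean distance is an assumption made here purely for illustration. The function name `single_linkage` is hypothetical, and the sample (Elevation, Slope) rows are made up rather than taken from the dataset.

```python
from itertools import combinations
from math import dist


def single_linkage(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    Returns the final clusters (as lists of row indices) and the merge
    history, which records the hierarchy from the bottom up.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Single linkage: cluster distance is the smallest distance
            # between any point of one cluster and any point of the other.
            d = min(dist(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because combinations() guarantees a < b
    return [sorted(c) for c in clusters], merges


# Made-up (Elevation, Slope) rows, for illustration only.
points = [(2596, 3), (2590, 2), (2804, 9), (2785, 18), (3200, 12), (3210, 14)]
clusters, merges = single_linkage(points, 3)
```

The merge history is the part a later step would consume: each entry names the two groups that were joined and the distance at which they were joined. In practice the variables would normally be rescaled first, since Elevation dominates the raw Euclidean distance in this toy example.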
