Experimental Study II: Adult Dataset

DOI: 10.4018/978-1-5225-5029-7.ch005


This chapter presents an experimental study of the proposed model on the Adult dataset. It covers pattern extraction from this dataset, following the series of steps discussed earlier, and then the detailed implementation of pattern prediction for numeric variables, nominal variables, and aggregate data, which likewise follows the steps discussed earlier.

5.1 Dataset Introduction

The Adult dataset is taken from the UCI Machine Learning Repository (Asuncion & Newman, 2007). It contains a mixture of numeric and nominal variables (14 in total) and 48,842 records. Eight of the variables are nominal and the remaining six are numeric. The nominal variables, along with their distinct values, are given in Table 1. The numeric variables are Age, Final Weight, Edu Num, Cap Gain, Cap Loss, and Hrs Per Week. Because the available dataset contains missing values, we removed the records with missing values and used the remaining 30,162 records (about 62% of the total) for our case study.
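The removal of records with missing values can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the UCI file layout (comma-separated fields, with "?" marking a missing value). The column order, the extra Income class column, and the three sample rows are illustrative assumptions rather than material from the chapter.

```python
import csv
import io

# Column names follow the chapter's wording. The order and the final
# "Income" class column mirror the UCI file layout (an assumption here).
COLUMNS = [
    "Age", "Work Class", "Final Weight", "Education", "Edu Num",
    "Marital Status", "Occupation", "Relationship", "Race", "Sex",
    "Cap Gain", "Cap Loss", "Hrs Per Week", "Native Country", "Income",
]
MISSING = "?"  # assumed missing-value marker, as used in the UCI files


def read_records(text):
    """Parse comma-separated rows into dictionaries keyed by column name."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(COLUMNS, row)) for row in reader if row]


def drop_incomplete(records):
    """Keep only the records in which no field equals the missing marker."""
    return [r for r in records if MISSING not in r.values()]


# Illustrative rows in the assumed layout; the second has missing values.
SAMPLE = """\
39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K
54, ?, 180211, Some-college, 10, Married-civ-spouse, ?, Husband, Asian-Pac-Islander, Male, 0, 0, 60, South, >50K
50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K
"""

records = read_records(SAMPLE)
complete = drop_incomplete(records)
print(f"kept {len(complete)} of {len(records)} sample records")

# Share of records retained in the chapter's case study.
chapter_ratio = 30162 / 48842
print(f"chapter retention ratio: {chapter_ratio:.4f}")
```

To apply the same filter to the real file, the contents of the downloaded data file would be passed to read_records in place of SAMPLE.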


5.2 Pattern Extraction

This section presents the first part of the proposed framework, Pattern Extraction. As discussed earlier, this part proceeds through a series of steps, each taking its input from the previous one: generation of hierarchical clusters, ranking of variables, multi-dimensional scaling, schema generation, mining of association rules, advanced evaluation of association rules, and visualization of the extracted patterns. The subsections below explain each step briefly and then implement it on the example dataset.

Table 1. Nominal variables along with their distinct values from the Adult dataset

| Nominal Variable | Distinct Values |
|---|---|
| Work Class | Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked |
| Education | Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool |
| Marital Status | Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse |
| Occupation | Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces |
| Relationship | Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried |
| Race | White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black |
| Sex | Female, Male |
| Native Country | United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holand-Netherlands |
