The Distance and Cluster Procedure

Sean Eom (Southeast Missouri State University, USA)
DOI: 10.4018/978-1-59904-738-6.ch009

Abstract

This chapter describes the DISTANCE and CLUSTER procedures of the SAS system. SAS version 9 introduced the DISTANCE procedure (PROC DISTANCE). Previous versions of SAS used two programs (xmacro.sas and distnew.sas) to process a transposed cocitation matrix (input) and produce a distance matrix (output). Cluster analysis is a data reduction technique that groups entities (individuals, variables, objects) into clusters so that entities in the same cluster are more similar to each other than to entities in other clusters, with respect to predetermined selection criteria. The first section of this chapter explains how to create a distance matrix, which is the input to the CLUSTER procedure. The second section focuses on the PROC CLUSTER statement, which invokes the CLUSTER procedure. The chapter also discusses how to interpret the results of cluster analysis.

Introduction

The input to cluster analysis and multidimensional scaling (MDS) is a proximity matrix, so the cocitation frequency count matrix must first be converted into a distance or similarity matrix. SAS version 9 introduced the DISTANCE procedure (PROC DISTANCE) for this purpose: it computes various measures of distance, dissimilarity, or similarity between the authors under investigation. Previous versions of SAS used two programs (xmacro.sas and distnew.sas) to process a transposed cocitation matrix (input) and produce a distance matrix (output). The resulting distance matrix is the input to the CLUSTER and MDS procedures.
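
The conversion from cocitation counts to a distance matrix can be illustrated outside SAS. The following Python sketch is not the PROC DISTANCE implementation. It assumes a small, made-up symmetric count matrix for three hypothetical authors (A, B, C) and uses Euclidean distance between rows, which is only one of several possible measures. The diagonal values are arbitrary placeholders.

```python
import math

# Hypothetical cocitation counts for three made-up authors (symmetric matrix).
authors = ["A", "B", "C"]
cocitation = [
    [10, 8, 1],
    [8, 12, 2],
    [1, 2, 9],
]

def euclidean_distance_matrix(rows):
    """Return the matrix of Euclidean distances between every pair of rows."""
    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((x - y) ** 2 for x, y in zip(rows[i], rows[j])))
            dist[i][j] = dist[j][i] = d  # a distance matrix is symmetric
    return dist

distance = euclidean_distance_matrix(cocitation)
for name, row in zip(authors, distance):
    print(name, [round(d, 2) for d in row])
```

In this toy data, authors A and B have similar rows and therefore end up close together, while C is far from both. A matrix of this kind is what a clustering or MDS step would consume.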

There are many ways of measuring inter-object similarity, including distance measures (the proximity or difference between each pair of objects) and the correlation coefficient between a pair of objects. A higher cocitation frequency between a pair of authors represents a stronger cognitive linkage, or greater similarity, between them. In author cocitation analysis (ACA), the cocitation frequency count matrix, the correlation coefficient matrix, and the distance matrix are three different outputs of the same transformation process (see Table 1). Understanding the input and output relationships in this process helps in selecting the correct options in the DISTANCE and MDS procedures.
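
The difference between a distance measure and a correlation coefficient can be seen in a small example. The Python sketch below is illustrative only. The three cocitation profiles are invented, and Euclidean distance and Pearson correlation stand in for the wider family of measures mentioned above.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def euclidean(x, y):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Hypothetical profiles: cocitation counts of each author with four others.
profile_a = [12, 9, 2, 1]
profile_b = [24, 18, 4, 2]   # same pattern as A, cited twice as often
profile_c = [1, 2, 10, 11]   # a different pattern

r_ab = pearson(profile_a, profile_b)
r_ac = pearson(profile_a, profile_c)
d_ab = euclidean(profile_a, profile_b)
d_ac = euclidean(profile_a, profile_c)

print(f"A vs B: r = {r_ab:.3f}, distance = {d_ab:.2f}")
print(f"A vs C: r = {r_ac:.3f}, distance = {d_ac:.2f}")
```

Correlation treats A and B as having the same citation pattern even though their raw counts differ, whereas Euclidean distance also reflects the difference in magnitude. This is one reason the choice of measure matters when a procedure is configured.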

Table 1. Summary of input/output relationships in various PROC statements

Input                            Procedure        Output
Cocitation frequency matrix      PROC FACTOR      Factor pattern/structure correlations
Cocitation frequency matrix      PROC DISTANCE    Distance matrix
Distance matrix                  PROC CLUSTER     Clusters
Distance matrix                  PROC MDS         MDS configuration coordinates
MDS configuration coordinates    PROC PLOT        Two-dimensional MDS maps
MDS configuration coordinates    PROC G3D         Three-dimensional MDS maps
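
The PROC CLUSTER step in Table 1 takes a distance matrix and returns clusters. As a rough illustration of what such a step does, the Python sketch below applies single-linkage agglomerative clustering to a made-up distance matrix for four hypothetical authors. Single linkage is chosen here only for brevity. It is an assumption of this sketch, not the method the chapter prescribes, and the code is not how SAS implements the procedure.

```python
def single_linkage(dist, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(dist))]  # start with one cluster per item
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: cluster distance is the smallest
                # member-to-member distance.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(c) for c in clusters]

# Hypothetical symmetric distance matrix for four made-up authors (0..3).
distance = [
    [0.0, 1.0, 6.0, 7.0],
    [1.0, 0.0, 5.0, 6.5],
    [6.0, 5.0, 0.0, 1.5],
    [7.0, 6.5, 1.5, 0.0],
]

groups = single_linkage(distance, 2)
print(groups)  # [[0, 1], [2, 3]]
```

Authors 0 and 1 are merged first because they are the closest pair, followed by authors 2 and 3, leaving two clusters.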
