Clustering by K-Means Method and K-Medoids Method: An Application With Statistical Regions of Turkey

Onur Önay

Source Title: Handbook of Research on Engineering, Business, and Healthcare Applications of Data Science and Analytics

DOI: 10.4018/978-1-7998-3053-5.ch024

Abstract

Data science and data analytics are becoming increasingly important and are widely used in scientific and real-life applications. Their methods enable us to analyze, understand, and interpret data in every field. In this study, k-means and k-medoids clustering methods are applied to cluster the Level 2 Statistical Regions of Turkey. Clustering analyses are carried out for the years 2017 and 2018. The datasets consist of the 2017 and 2018 values of “Distribution of expenditure groups according to Household Budget Survey”, the 2017 and 2018 values of “Gini coefficient by equivalised household disposable income”, and some features of the 2017 values of “Regional Purchasing Power Parities for the main groups of consumption expenditures”. The elbow method and the average silhouette method are first applied to determine the number of clusters. Results are given and interpreted in the conclusion.

Chapter Preview

Introduction

Data science and data analytics are becoming more and more important as their areas of application widen. Spending time on the internet, visiting shopping sites, reading news sites, using search engines, looking at social media posts, and commenting on content all contribute to the formation of datasets. Data is therefore constantly growing, coming from social media, weather stations, government agencies, purchases, and so on (Dichev & Dicheva, 2017). Data is a very important source of knowledge. In business, data science is widely used to design successful strategies and policies (Gibert et al., 2018). Hundreds of scientific studies and real-life applications of data science and data analytics can be found on the internet, built on data collected in many different areas. Data can come in a variety of formats, such as numeric, text, or image. Data science and data analytics help us understand the data by analyzing it with various methods. Data science combines mathematical and statistical analysis with information technology tools, and builds systems and algorithms to discover information, detect patterns, and create useful insights and predictions. In doing so, it uses techniques such as classification, clustering, regression, and association rule mining (Molina-Solana et al., 2017).

Clustering methods are data science methods that can be used to understand the meaning of data. The data are grouped into clusters, and the resulting clusters are interpreted. There are many types of clustering methods, such as partitioning, hierarchical, grid-based, and model-based methods (Kaur et al., 2014). In this study, k-means and k-medoids, which are partitioning clustering methods, are used. These methods form clusters by looking at the distances between the data points.
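The distance-based grouping described above can be illustrated with a minimal k-means sketch. It is not the chapter's implementation: the toy points are hypothetical (they are not the Turkish regional data), Euclidean distance is assumed, and a deterministic farthest-first initialization is assumed in place of the random starts that are more common in practice.

```python
import math

def kmeans(points, k, max_iter=100):
    """Lloyd-style k-means: alternate assignment and centroid update."""
    # Assumed initialization: start from the first point, then repeatedly
    # add the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return labels, centroids

# Hypothetical toy data: two well-separated groups in two dimensions.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

Here the centroids are averages of the cluster members, so they generally do not coincide with any observed data point.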

In this study, the Level 2 Statistical Regions of Turkey are clustered with the k-means and k-medoids clustering methods. Turkey has three levels of Statistical Regions: “Level 1”, “Level 2”, and “Level 3”. Details of the Statistical Regions of Turkey are given in section two (background). Turkey also has 7 geographical regions, and their details are given before the analysis. The 2017 dataset (in Table 3) consists of “Distribution of expenditure groups according to Household Budget Survey (Horizontal %), 2015-2017, 2017”, “Gini coefficient by equivalised household disposable income 2017”, and some features of “Regional Purchasing Power Parities for the main groups of consumption expenditures” for 2017. The 2018 dataset (in Table 2) consists of “Distribution of expenditure groups according to Household Budget Survey (Horizontal %), 2016-2018, 2018” and “Gini coefficient by equivalised household disposable income” for 2018. Based on these datasets, the clustering analysis identifies which Statistical Regions fall in the same cluster and which fall in different clusters, so that the groupings can be interpreted. Anyone who knows one region can make inferences about the other regions in the same cluster. In this way, data science methods help us understand how the Level 2 Statistical Regions of Turkey cluster with respect to the topics of the datasets.

This study has seven sections. Section one is the introduction. Section two is the background; it gives information about the classification system of the Statistical Regions of Turkey and presents related studies from the literature as examples. Section three is the main focus of the chapter; it covers clustering, k-means, k-medoids, determining the number of clusters, and the data. Analyses and results are given in section four, the solutions and recommendations section. Section five offers some ideas for future research directions. Section six gives the conclusions, and the final section lists the references. The study can be summarized in the following steps:

Step 1: Determine the research problem
Step 2: Do background research
Step 3: Give information of the data and analysis methods
Step 4: Analyze the data
Step 5: Communicate the results
Step 6: Conclusion

Key Terms in this Chapter

K-Means Algorithm: An algorithm used for cluster analysis.

Statistical Regions of Turkey in Level 1: Level 1 is one of the levels of the Statistical Regions of Turkey and consists of 12 regions. They are produced by grouping Level 2 regions.

Statistical Regions of Turkey in Level 3: Level 3 is one of the levels of the Statistical Regions of Turkey. It consists of the 81 provinces (cities) of Turkey.

Geographical Regions of Turkey: There are 7 geographical regions of Turkey. They are determined by their features.

K-Medoids Algorithm: An algorithm that uses medoids for cluster analysis.

Statistical Regions of Turkey in Level 2: Level 2 is one of the levels of the Statistical Regions of Turkey and consists of 26 regions. They are produced by grouping Level 3 regions.

Cluster Analysis (Clustering): A method used to separate data into groups using different techniques.
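To make the K-Medoids Algorithm term above more concrete, the following is a minimal sketch of an alternating k-medoids procedure. It is an illustration only and is not taken from the chapter: it is a simple assign-and-update variant rather than the full PAM swap search, the toy points are hypothetical, Euclidean distance is assumed, and the farthest-first initialization is an assumption made to keep the example deterministic.

```python
import math

def assign(points, medoids):
    """Label each point with the position of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda j: math.dist(p, points[medoids[j]]))
        for p in points
    ]

def kmedoids(points, k, max_iter=100):
    """Alternating k-medoids; medoids are stored as indices into points."""
    # Assumed initialization: first point, then farthest-first selection.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(
            range(len(points)),
            key=lambda i: min(math.dist(points[i], points[m]) for m in medoids),
        ))
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:  # an empty cluster keeps its previous medoid
                new_medoids.append(medoids[j])
                continue
            # The new medoid is the member with the smallest total
            # distance to all other members of its cluster.
            new_medoids.append(min(
                members,
                key=lambda i: sum(math.dist(points[i], points[m]) for m in members),
            ))
        if new_medoids == medoids:
            break  # medoids are stable, so the procedure has converged
        medoids = new_medoids
    return assign(points, medoids), medoids

# Hypothetical toy data: two well-separated groups in two dimensions.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels, medoids = kmedoids(points, k=2)
print(labels)
print([points[m] for m in medoids])
```

Unlike a k-means centroid, each medoid returned here is an actual observation from the data.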
