An Ensemble Random Forest Algorithm for Privacy Preserving Distributed Medical Data Mining

Musavir Hassan, Muheet Ahmed Butt, Majid Zaman

Source Title: International Journal of E-Health and Medical Communications (IJEHMC) 12(6)

DOI: 10.4018/IJEHMC.20211101.oa8

Abstract

The widespread adoption of Electronic Health Records (EHRs) generates a voluminous amount of data. Medical health facilities have great potential to discern patterns from this data and use them to diagnose a specific disease or predict the outbreak of an epidemic. Discerning these patterns, however, might reveal sensitive information about individuals, and this information is vulnerable to misuse. Sharing such sensitive data is therefore a challenging task, as it compromises the privacy of patients. In this paper, a random forest-based distributed data mining approach is proposed. Performance of the proposed model is evaluated using accuracy, F-measure, and kappa statistics. Experimental results reveal that the proposed model is efficient and scalable in both performance and accuracy on imbalanced data, and that it maintains privacy by sharing only useful healthcare knowledge in the form of local models, without revealing or sharing sensitive data.

1. Introduction

The age of big data has empowered many organizations to gather extensive volumes of information. In many real-world applications, the data required for crucial data mining tasks is distributed among several parties. To find useful patterns and discover knowledge that cannot be mined from the data of a single party, these parties must share data. Centralizing the data from all participating parties is infeasible because of huge communication costs, computation costs, central storage requirements, security concerns and, most importantly, privacy concerns. To overcome the drawbacks of a centralized system, efficient global models can be constructed from collaborating participants. This collaboration is challenging, however, because it requires sharing data among the participants, which raises privacy concerns. Thus, various distributed data mining algorithms have been proposed in the literature to mine patterns from data shared among different participants without revealing the original data.

Data shared among different participants may have the same attributes at each participant location; such data is said to be horizontally partitioned. For example, medical data of patients who suffer from a common disease will have the same attributes at each medical facility. On the other hand, data belonging to a specific entity may be shared among different participants such that each participant stores different attributes of the same entity. Such data is said to be vertically partitioned. For example, the medical data of a patient may be stored by a medical facility, whereas the medical billing data, health cover information, etc. of the same patient may be stored by an insurance company.

Various distributed privacy preserving approaches, based on different machine learning algorithms, have been proposed in the literature to mine horizontally and vertically partitioned data. One such approach is to perform local data mining at the different participant locations in parallel to produce local data models, keeping the disjoint datasets at their respective locations. These local models are then transmitted to a central site that combines them into a global model (Myneni and Patel, 1999; Chawla et al., 2004; Tsoumakas, 2003). A second approach sub-samples the original data at each local site and accumulates the samples at a central site to form a global subset (Chawla et al., 2004). Another approach is to introduce perturbation in the local data of participants with the help of a third-party coordinator in order to preserve the privacy of the data. The perturbed data from each participant can then be published in the form of a centralized database on which different data mining tasks are performed, as done by Sheela and Vijayalakshmi (2017).

Distributed data mining algorithms that work in a fully decentralized manner have also been proposed in the literature. The participants involved mine shared data using a message-passing mechanism. Such algorithms are characterized by the distribution of data across participant sites and by asynchronous communication, which enables learning from participants that are not available at a given time. Such algorithms should also be scalable, so that they can work with more participants, and therefore more data, added to the system at a later time. An important consideration when using decentralized distributed data mining algorithms is preserving the privacy of the data local to each participant. The above-mentioned techniques have potential weaknesses that may put the privacy of the data at risk. Moreover, the different privacy preserving methods used in these techniques have certain limitations, discussed in Hassan et al. (2017).
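
The two partitioning schemes, and the first approach described above (local models combined at a central site), can be illustrated with a minimal Python sketch. It is not the algorithm proposed in this paper: the patient records, the attribute names, and all function names are hypothetical, and a simple one-feature threshold rule stands in for a real local learner such as a random forest. The sketch only shows that the central site receives local models, never the raw records.

```python
from collections import Counter

# Hypothetical toy records, invented for illustration only:
# (patient_id, age, systolic_bp, billed_amount, has_condition)
RECORDS = [
    (1, 34, 118, 200, 0),
    (2, 61, 150, 900, 1),
    (3, 47, 135, 400, 0),
    (4, 70, 160, 1200, 1),
    (5, 29, 110, 150, 0),
    (6, 58, 148, 800, 1),
]


def horizontal_partition(records, n_sites):
    """Each site holds the SAME attributes for a DIFFERENT subset of patients."""
    return [records[i::n_sites] for i in range(n_sites)]


def vertical_partition(records):
    """Each site holds DIFFERENT attributes of the SAME patients, keyed by id."""
    clinical = {r[0]: (r[1], r[2], r[4]) for r in records}  # medical facility
    billing = {r[0]: (r[3],) for r in records}              # insurance company
    return clinical, billing


def train_local_model(site_records):
    """Stand-in local learner: a threshold on systolic_bp placed midway
    between the mean of the positive class and the mean of the negative class.
    Assumes the site holds at least one record of each class."""
    pos = [r[2] for r in site_records if r[4] == 1]
    neg = [r[2] for r in site_records if r[4] == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def global_predict(local_models, systolic_bp):
    """Central site: majority vote over the local models it received.
    Use an odd number of sites to avoid ties."""
    votes = Counter(int(systolic_bp > threshold) for threshold in local_models)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    sites = horizontal_partition(RECORDS, 3)
    # Only these thresholds leave each site; the records stay local.
    models = [train_local_model(site) for site in sites]
    print("local models:", models)
    print("prediction for systolic_bp=155:", global_predict(models, 155))
```

In this sketch the only information transmitted to the central site is one number per participant. A real deployment would transmit richer local models, and whether those models themselves leak information about the underlying records is a separate question that the sketch does not address.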
