
Privacy Preserving Feature Selection for Vertically Distributed Medical Data based on Genetic Algorithms and Naïve Bayes

Boudheb Tarik, Elberrichi Zakaria

Source Title: International Journal of Information System Modeling and Design (IJISMD) 9(3)

DOI: 10.4018/IJISMD.2018070101


Abstract

Machine learning is a powerful tool for mining useful knowledge from vast databases. Many establishments in the medical area, such as hospitals and laboratories, want to join their efforts in order to extract more accurate models. However, this approach faces problems. Due to laws protecting patient privacy and other similar concerns, parties are reluctant to share their data. Moreover, within vast amounts of data, which attributes are useful and pertinent for constructing accurate data mining models? In this article, the researchers deal with these challenges for vertically distributed medical data. They propose an original secure wrapper solution that performs feature selection based on genetic algorithms and distributed Naïve Bayes. Contrary to previous solutions, the original data is not perturbed, so data utility and performance are preserved. They prove that the proposed solution selects relevant attributes to increase performance while preserving patient privacy.


1. Introduction

Medicine has a special status in science, philosophy, and daily life. The outcomes of medical care are life-or-death, and they apply to everybody. Medicine is a necessity, not merely an optional luxury, pleasure, or convenience. The only justification for collecting medical data is to benefit the individual patient (Cios & Moore, 2002). One of the major challenges in the medical domain today is how to exploit the vast amount of data that this field generates. Machine-learning approaches are required (Anguera, Barreiro, Lara & Lizcano, 2016). They are capable of discovering useful knowledge for decision making in the medical field. Data mining holds great potential for the healthcare area. Some experts believe the opportunities to improve care and reduce costs concurrently could apply to as much as 30% of overall healthcare spending (Eliason & Crockett, 2017).

Nowadays, thanks to progress in network and storage technologies, many kinds of patient health records are collected, such as diagnoses, blood analyses, and radiology results. The data can be stored at different sites, since patients may visit several hospitals or laboratories during their lifetime. With the aim of building more accurate models, some organizations would like to collaborate and enhance their data mining process by using additional external information. In vertically distributed data, they use external attributes about the same patients. For example, a hospital that treats breast cancer may use external information about its patients, such as blood analysis, biopsy, radiology, and MRI scan results.
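The vertical layout described above can be pictured with a small sketch. All site names, attribute names, and values below are hypothetical and chosen only for illustration; the point is that each site holds different attributes for the same patient identifiers.

```python
# Hypothetical sites, attributes, and values, for illustration only.
SITES = {
    "hospital": {  # site A: clinical attributes and the class label
        "p1": {"age": 54, "tumor_size_mm": 18, "diagnosis": "malignant"},
        "p2": {"age": 41, "tumor_size_mm": 7, "diagnosis": "benign"},
    },
    "laboratory": {  # site B: blood analysis for the same patients
        "p1": {"hemoglobin": 11.2, "wbc": 9.1},
        "p2": {"hemoglobin": 13.8, "wbc": 6.4},
    },
}


def feature_owner(sites):
    """Map each attribute name to the site that holds it."""
    return {
        attr: name
        for name, table in sites.items()
        for record in table.values()
        for attr in record
    }


def shared_patients(sites):
    """Return the patient identifiers that are present at every site."""
    tables = list(sites.values())
    return sorted(set(tables[0]).intersection(*tables[1:]))


if __name__ == "__main__":
    print(shared_patients(SITES))
    print(feature_owner(SITES))
```

In this layout no single site sees a complete record, which is why any model that uses attributes from both sites needs some form of cooperation between them.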

There is no guarantee that external information will enhance the data mining process; non-pertinent data can even decrease the performance of the model. Feature selection is a technique for selecting relevant attributes in order to build more accurate data mining models. It reduces dimensionality, speeds up learning, and improves model interpretability. There are three categories of feature selection methods: wrapper, filter, and embedded methods. Wrapper methods treat the selection of a set of features as a search problem, in which different combinations are prepared, evaluated, and compared with one another. A predictive model is used to evaluate each combination of features and assign it a score based on model accuracy (Brownlee, 2014). Filter methods apply a statistical measure to assign a score to each attribute. The features are ranked by score and either kept or removed from the dataset. Embedded methods combine the qualities of filter and wrapper methods. They are implemented by algorithms that have their own built-in feature selection mechanisms (Kaushik, 2016).
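The wrapper idea can be sketched as a search over feature masks scored by a classifier. The following is a minimal, centralized illustration that assumes a small categorical Naïve Bayes evaluator and a simple genetic algorithm. It is not the secure distributed protocol proposed in the article, and every function name, parameter value, and the synthetic dataset are assumptions made for this example.

```python
import math
import random
from collections import Counter, defaultdict


def nb_accuracy(rows, labels, mask, split):
    """Train categorical Naive Bayes on rows[:split] using only the masked
    features, then return its accuracy on rows[split:]."""
    cols = [i for i, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    train_y = labels[:split]
    class_counts = Counter(train_y)
    value_counts = defaultdict(Counter)  # (column, class) -> value frequencies
    values_seen = defaultdict(set)       # column -> distinct training values
    for x, y in zip(rows[:split], train_y):
        for c in cols:
            value_counts[(c, y)][x[c]] += 1
            values_seen[c].add(x[c])
    correct = 0
    for x, y in zip(rows[split:], labels[split:]):
        best_cls, best_lp = None, -math.inf
        for cls, n in class_counts.items():
            lp = math.log(n / len(train_y))
            for c in cols:
                # Laplace smoothing keeps unseen values from zeroing the product
                count = value_counts[(c, cls)][x[c]]
                lp += math.log((count + 1) / (n + len(values_seen[c])))
            if lp > best_lp:
                best_cls, best_lp = cls, lp
        correct += best_cls == y
    return correct / (len(rows) - split)


def select_features(rows, labels, split, pop_size=12, generations=15, seed=0):
    """Genetic search over bit masks; fitness is the wrapper score above.
    Assumes at least two features."""
    rng = random.Random(seed)
    n = len(rows[0])

    def fitness(mask):
        return nb_accuracy(rows, labels, mask, split)

    # Start from the all-features mask plus random masks
    population = [tuple([1] * n)] + [
        tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        next_gen = [scored[0]]  # elitism: the best mask always survives
        while len(next_gen) < pop_size:
            p1 = max(rng.sample(scored, 3), key=fitness)  # tournament selection
            p2 = max(rng.sample(scored, 3), key=fitness)
            cut = rng.randint(1, n - 1)                   # one-point crossover
            child = p1[:cut] + p2[cut:]
            # bit-flip mutation with probability 0.1 per gene
            child = tuple(b ^ 1 if rng.random() < 0.1 else b for b in child)
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    return best, fitness(best)


def make_data(n_rows=200, seed=1):
    """Synthetic data: two informative features followed by four noise features."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n_rows):
        y = rng.randint(0, 1)
        informative = [y if rng.random() < p else 1 - y for p in (0.9, 0.8)]
        rows.append(informative + [rng.randint(0, 2) for _ in range(4)])
        labels.append(y)
    return rows, labels


if __name__ == "__main__":
    data_rows, data_labels = make_data()
    print(select_features(data_rows, data_labels, split=150))
```

Because the search starts from the all-features mask and keeps the best mask in every generation, the returned score on the evaluation split can never fall below that of using every feature. A more careful setup would score masks with cross-validation and keep a separate test set, since selecting and reporting on the same split gives an optimistic estimate.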

Due to laws and other concerns that prohibit the disclosure of private information about individuals, organizations such as hospitals, clinics, and laboratories are reluctant to share their local data. However, the data mining process needs complete access to the data to construct accurate models. Thus, in the last decade, privacy preservation of sensitive data has become an important topic, and it must be incorporated into every data mining process. Privacy preserving feature selection has received great attention. Many solutions have been proposed for distributed data, but few of them use wrapper methods without perturbing the original data. The challenge with perturbation techniques is to find a good tradeoff between privacy and accuracy (Zhong & Wright, 2005). The more patients' private information is protected, the less accurate the miner's results; conversely, the more accurate the results, the less privacy patients retain.
