Special Offers
- IGI Global’s New Emerging Topic e-Book Collections
  Acquire highly focused and affordable Cutting-Edge Peer-Reviewed Research Content through a selection of 17 topic-focused e-Book Collections discounted up to 90%, compared to list prices. Collection topics include Artificial Intelligence, Data Science, Language Learning, Marketing and Customer Relations, Sustainability, and many more. Hosted on the InfoSci^® platform, these collections feature no DRM, no additional cost for multi-user licensing, no embargo of content, full-text PDF & HTML format, and more.
  Learn More
- Open Access Book (Free Access) - Encyclopedia of Information Science and Technology, Sixth Edition (ISBN: 9781668473665)
  The Encyclopedia of Information Science and Technology, Sixth Edition) continues the legacy set forth by the first five editions by providing comprehensive coverage and up-to-date definitions of the most important issues, concepts, and trends pertaining to technological advancements and information management within a variety of settings and industries. The entire book is being published under open access.
  Read Now
- Open Access Book (Free Access) - Food Sustainability, Environmental Awareness, and Adaptation and Mitigation Strategies for Developing Countries (ISBN: 9781668456293)
  Food Sustainability, Environmental Awareness, and Adaptation and Mitigation Strategies for Developing Countries provides information on the recent technology, mitigation, and environmental protection that must be applied for food sustainability in developing countries. This book is being published under Platinum Open Access through funding from Diponegoro University, Indonesia.
  Read Now
- Open Access Book (Free Access) - New Models of Higher Education: Unbundled, Rebundled, Customized, and DIY (ISBN: 9781668438091)
  The Walmart Corporation and the Lumina Foundation have provided funding to make New Models of Higher Education: Unbundled, Rebundled, Customized, and DIY fully open access, completely removing any paywall between scholars in education and the latest research on new models for the future of higher education.
  Read Now
- Open Access Book (Free Access) - Handbook of Research on the Global View of Open Access and Scholarly Communications (ISBN: 9781799898054)
  Through a collaboration between IGI Global and the University of North Texas, the Handbook of Research on the Global View of Open Access and Scholarly Communications has been published as fully open access, completely removing any paywall between researchers of any field, and the latest research on the equitable and inclusive nature of Open Access and all of its complications.
  Read Now
Books
- - Books by Subject
  - Business, Administration, & Management
  - Scientific, Technical, & Medical (STM)
  - Education
  - Books by Field
Journals
- - Journals
  - OnDemand Journal Articles
  - Journals by Subject
  - Business, Administration, & Management
  - Scientific, Technical, & Medical (STM)
  - Education
  - Journals by Field
e-Collections
Open Access
- View All Open Access Opportunities
  Search across all of IGI Global’s available open access publishing opportunities to unleash your research potential.
  Find an Open Access Journal for Your Next Manuscript
  Search across all of IGI Global’s available open access publishing opportunities to unleash your research potential.
  Submit an Open Access Book Proposal
  Learn more about open access book publishing and how it can propel your research forward in the field.
  Convert Your Work to Open Access
  Already published? You can convert your work to open access to increase its impact through IGI Global’s Restrospective Open Access Program.
  Utilize Open Access Collection Database
  Open up your research potential by utilizing our open access content or integrating the open access collection into your library
  Consider Open Access Agreements
  For Libraries: consider no-cost or investment-level open access agreements with IGI Global to support your faculty's research endeavors.
  Search Funding Resources
  Looking for additional funding resources to support your open accesss endeavors? View industry resources compiled by our open access team.
  Review Open Access Policies & Ethical Guidelines
  Considering IGI Global to publish your work under open access? Review IGI Global’s open access policies and ethical guidelines
Publish with Us
Resources
- - Instructors
  - Course Adoption
  - Teaching Cases
  - K-12 Online Learning Collection
  - Authors and Editors
  - eEditorial Discovery^® System
  - Peer Review Process
  - Ethics and Malpractice
  - COPE Membership
  - Fair Use Policy
  - Open Access Publishing
  - FAQ
Catalogs
About Us
Newsroom

Solving Complex Problems in Human Genetics Using Nature-Inspired Algorithms Requires Strategies which Exploit Domain-Specific Knowledge

Casey S. Greene, Jason H. Moore

Source Title: Computer Engineering: Concepts, Methodologies, Tools and Applications

DOI: 10.4018/978-1-61350-456-7.ch804

OnDemand:

(Individual Chapters)

Available

$37.50

Current Special Offers

No Current Special Offers

Abstract

In human genetics the availability of chip-based technology facilitates the measurement of thousands of DNA sequence variations from across the human genome. The informatics challenge is to identify combinations of interacting DNA sequence variations that predict common diseases. The authors review three nature-inspired methods that have been developed and evaluated in this domain. The two approaches this chapter focuses on in detail are genetic programming (GP) and a complex-system inspired GP-like computational evolution system (CES). The authors also discuss a third nature-inspired approach known as ant colony optimization (ACO). The GP and ACO techniques are designed to select relevant attributes, while the CES addresses both the selection of relevant attributes and the modeling of disease risk. Specifically, they examine these methods in the context of epistasis or gene-gene interactions. For the work discussed here we focus solely on the situation where there is an epistatic effect but no detectable main effect. In this domain, early studies show that nature-inspired algorithms perform no better than a simple random search when classification accuracy is used as the fitness function. Thus, the challenge for applying these search algorithms to this problem is that when using classification accuracy there are no building blocks. The goal then is to use outside knowledge or pre-processing of the dataset to provide these building blocks in a manner that enables the population, in a nature-inspired framework, to discover an optimal model. The authors examine one pre-processing strategy for revealing building blocks in this domain and three different methods to exploit these building blocks as part of a knowledge-aware nature-inspired strategy. They also discuss potential sources of building blocks and modifications to the described methods which may improve our ability to solve complex problems in human genetics. Here it is argued that both the methods using expert knowledge and the sources of expert knowledge drawn upon will be critical to improving our ability to detect and characterize epistatic interactions in these large scale biomedical studies.

Chapter Preview

Top

Introduction

Nature-inspired algorithms are a natural fit for solving problems in biological domains, not just because of the connection between method and application but also because many of the problems natural systems solve are common to biological data. Biological organisms evolve in a noisy environment with a rugged fitness landscape. Many of the interesting problems in human genetics also likely involve a rugged fitness landscape where models that contain some but not all of the relevant attributes may not have an accuracy greater than that of the surrounding noise. In addition these data are frequently noisy, in the sense that two individuals with the same values at the relevant attributes may have different disease states. In this context it is no surprise that we look to natural systems for inspiration when designing algorithms which succeed in this domain. Wagner discusses the role of robustness and evolvability in living systems (Wagner, 2005). We must design and use algorithms that, like living systems, are both robust to the noise in the data and evolvable despite the rugged fitness landscape. We briefly discuss the Relief family of machine learning methods which are useful for separating signals from noise in this type of data and then focus on approaches that exploit this information. The nature-inspired methods we examine here are genetic programming (GP), a computational evolution system (CES), and ant colony optimization (ACO).

GP is an automated computational discovery tool inspired by Darwinian evolution and natural selection (Koza, 1992, 1994; Koza, Andre, Bennett & Keane, 1999; Koza, 2003; Banzhaf, Nordin, Keller & Francone, 1998; Langdon & Koza, 1998; Langdon & Poli, 2002). The goal of GP is to evolve computer programs which solve problems. This is accomplished by generating programs composed of the building blocks needed to solve or approximate a solution and then iteratively evaluating, selecting, recombining, and mutating these programs to form new computer programs. This process repeats until a best program or set of programs is identified. Genetic programming and its many variations have been applied successfully to a wide range of different problems including data mining, knowledge discovery (Freitas, 2002), and bioinformatics (Fogel & Corne, 2003). Despite the power of this method, there remain a number of challenges that GP practitioners and theorists must address before this computational discovery tool becomes a standard in the modern problem solver's toolbox. Yu et al. list 22 such challenges (Yu, Riolo & Worzel, 2006). We discuss here methods that address some of these challenges, in particular those related to practice. We specifically discuss methods that use information from pre- and post- processing, methods for handling large high dimensional datasets, and methods for integrating domain knowledge. We argue that these methods will be critical if we are to successfully and reliably analyze these genetic data for epistasis.

Spector, as part of an essay regarding the roles of theory and practice in genetic programming, discusses the push towards biology by GP practitioners (Spector, 2003). Banzhaf et al. propose the transformation of overly simplistic and abstracted artificial evolution methods such as GP into computational evolution systems (CES) that more closely resemble the complexity of real biological and evolutionary systems (Banzhaf et al., 2006). We review work here regarding a CES developed for this domain. The working hypothesis addressed in the papers we discuss is that a GP-based genetic analysis system will find better solutions faster if it is implemented as a CES that can evolve a variety of complex operators that in turn generate variability in solutions. This is in contrast to an artificial evolution system that employs a fixed set of operators. Unlike the other two systems we discuss, the goal here is to both identify attributes and discover a mathematical model that correctly classifies individuals with regard to their disease state.

Complete Chapter List

Search this Book:

Reset

MLA

APA

Chicago

Export Reference

Solving Complex Problems in Human Genetics Using Nature-Inspired Algorithms Requires Strategies which Exploit Domain-Specific Knowledge

Abstract

Introduction

Complete Chapter List