Bio-Inspired Data Mining for Optimizing GPCR Function Identification

Safia Bekhouche, Yamina Mohamed Ben Ali

Source Title: International Journal of Cognitive Informatics and Natural Intelligence (IJCINI) 15(4)

DOI: 10.4018/IJCINI.20211001.oa40

Abstract

GPCRs are the largest family of cell surface receptors, and many of them remain orphans. Predicting GPCR function is a very important bioinformatics task: it consists of assigning each protein to its corresponding functional class. This classification step requires a good protein representation method and a robust classification algorithm. However, the task becomes more complex because of the large number of GPCR features in most databases, which produces a combinatorial explosion. To reduce complexity and optimize classification, the authors propose using bio-inspired metaheuristics both for feature selection and for choosing the best couple (feature extraction strategy (FES), data mining algorithm (DMA)). Specifically, they propose using the BAT algorithm to extract the pertinent features and the Genetic Algorithm to choose the best couple. They compared the results they obtained with two existing algorithms. Experimental results indicate the efficiency of the proposed system.

1. Introduction

The identification of G-protein-coupled receptor (GPCR) function is an area of current interest in pharmaceutical and biological research. Of the approximately 500 clinically marketed drugs, more than 30% are modulators of GPCR function, making GPCRs the most successful target class in terms of drug discovery (Drews 2000).

Intense efforts have been devoted to identifying new GPCR functions for orphans. However, for many GPCRs, such efforts have failed to yield reliable results.

At this stage, several questions arise. What are the necessary steps for good protein function identification? What is an adequate protein representation method (PRM) for extracting features and constructing numerical attribute vectors? Which Data Mining Algorithm (DMA) should be selected to make an accurate classification? How can the combinatorial explosion of classification algorithms, caused by the complex nature of protein data, be avoided?

Although many GPCR function prediction approaches have been proposed, a great number of GPCRs are still orphans. The previous common methodology is sequence similarity searching in protein databases, which is mainly based on pairwise sequence alignment tools such as BLAST (Zhang et al., 2012). But it is difficult to identify GPCRs successfully this way because they share no significant sequence similarities. Moreover, two proteins can have very different sequences and perform a similar function, or have very similar sequences and perform different functions (Nemati et al., 2009). To solve this problem, some statistical and machine learning approaches have been developed (Secker et al., 2007).

There are three major problems in computational protein function prediction with classification algorithms: the choice of the classification algorithm, the choice of the PRM, and the selection of relevant attributes to avoid the combinatorial explosion problem. These are open problems, as they are in any classification problem, because there are many choices and it is not clear which one is best.

Generally, there are several strategies for extracting attributes from a protein sequence, and the choice of the PRM might be as important as the choice of the DMA. By contrast, a few works (King et al., 2001) overlook the feature extraction strategy and focus more on which classification algorithm to use. Other researchers have developed a hybrid feature extraction strategy (Rehman & Khan, 2011) that exploits both the pseudo-amino-acid composition strategy (PseAAC) and multiscale energy representation, while some authors (Secker et al., 2010; Naveed & Khan, 2012) have compared the predictive accuracies of a few PRMs in protein classification.

The transformation of the protein chain can yield an enormous numerical attribute vector, and the size and components of this vector strongly influence the predictive accuracy and the error rate of the classification. To improve these rates, it is necessary to eliminate noise (redundancies or useless information) present in the examples to be classified. Furthermore, datasets with hundreds or thousands of attributes may cause the curse of dimensionality and combinatorial explosion problems (Chen et al., 2014).
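To make the growth of the attribute vector concrete, the following is a minimal sketch of two simple protein representation methods: amino acid composition and dipeptide composition. These are illustrative assumptions and not necessarily the PRMs used by the authors; the function names and the toy sequence are hypothetical. It shows how moving from single residues to residue pairs raises the vector size from 20 to 400 attributes.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def aa_composition(seq):
    """Relative frequency of each standard amino acid (20 attributes)."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return [counts[a] / len(residues) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Relative frequency of each ordered residue pair (20 * 20 = 400 attributes)."""
    s = seq.upper()
    pairs = [s[i:i + 2] for i in range(len(s) - 1)]
    # Keep only pairs made of two standard residues.
    pairs = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    if not pairs:
        raise ValueError("sequence contains no standard residue pairs")
    counts = Counter(pairs)
    return [counts[a + b] / len(pairs) for a, b in product(AMINO_ACIDS, repeat=2)]


if __name__ == "__main__":
    toy = "MKTAYIAKQR"  # made-up fragment for illustration, not a real GPCR sequence
    print(len(aa_composition(toy)), len(dipeptide_composition(toy)))
```

For a short sequence, most of the 400 dipeptide attributes are zero, which is one simple way such vectors come to carry redundant or useless information.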

One of the most feasible techniques for coping with this problem is feature selection (FS) (Sayes et al., 2007; Bagherzadeh-Khiabani et al., 2016), which optimizes the classification model and improves the performance measurements. This technique is widely used to improve results in different fields, such as protein function prediction (Nemati et al., 2009), and it is mostly used in big data and data mining (Li & Liu, 2017; Tupe & Wakchaure, 2017).
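As a rough illustration of what feature selection does to an attribute matrix, here is a minimal sketch of a simple filter that drops near-constant columns and exact duplicate columns. This is an assumed baseline chosen for brevity, not the BAT-based selection proposed by the authors; the function names, the threshold, and the toy data are hypothetical.

```python
def select_features(rows, min_variance=1e-12):
    """Return indices of columns to keep.

    Drops columns whose variance is at or below min_variance (useless
    information) and columns identical to an earlier kept column (redundancy).
    """
    if not rows:
        return []
    kept = []
    seen = set()
    for j in range(len(rows[0])):
        col = tuple(r[j] for r in rows)
        mean = sum(col) / len(col)
        var = sum((v - mean) ** 2 for v in col) / len(col)
        if var <= min_variance:
            continue  # near-constant column carries no discriminating signal
        if col in seen:
            continue  # exact duplicate of a column already kept
        seen.add(col)
        kept.append(j)
    return kept


def project(rows, kept):
    """Restrict every example to the selected columns."""
    return [[r[j] for j in kept] for r in rows]


if __name__ == "__main__":
    # Toy data: column 0 is constant, column 2 duplicates column 1.
    data = [
        [1.0, 0.5, 0.5, 3.0],
        [1.0, 0.2, 0.2, 1.0],
        [1.0, 0.9, 0.9, 2.0],
    ]
    kept = select_features(data)
    print(kept, project(data, kept))
```

A filter like this ignores the class labels and the classifier. Wrapper approaches, including metaheuristic searches, instead score candidate feature subsets by the classification performance they produce.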
