Analysis of Protein Structure for Drug Repurposing Using Computational Intelligence and ML Algorithm

Deepak Srivastava, Kwok Tai Chui, Varsha Arya, Francisco José García Peñalvo, Pramod Kumar, Anuj Kumar Singh
DOI: 10.4018/IJSSCI.312562

Abstract

Proteins are fundamental to biological processes and central to the analysis of drug target indication for drug repurposing. Identifying relevant features is a necessary step in determining protein structure. Because a classifier relies on the most informative features in a dataset, feature selection is an important part of the process. Recent research on protein structure prediction has produced a wide range of new methods to improve accuracy. The authors use principal component analysis (PCA) with correlation-matrix-based feature selection to analyse breast cancer data. In this paper, they discuss a therapeutic agent, reduce the dataset with a reduction-based algorithm, and then apply the reduced dataset, labelled the Standard Gold Dataset, to a machine learning model to analyze drug target indication. Using an SVM with an RBF kernel, they report accuracies of 92.8%, 93.9%, and 95.3% on the three datasets with 200, 500, and 1000 features, respectively. The best result they report, 97.8%, was obtained with the same classifier.
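
The abstract does not spell out the correlation-based selection rule, so the following is only a minimal sketch of one common reading: compute the Pearson correlation matrix of the features and greedily drop any feature that is strongly correlated with one already kept. The function names, the 0.9 threshold, and the toy data are assumptions made for illustration, not the authors' method; PCA and the RBF-kernel SVM would be applied to the retained features afterwards and are not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant column is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Full k x k matrix of pairwise correlations between feature columns."""
    k = len(columns)
    return [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]


def select_features(columns, threshold=0.9):
    """Keep a feature only if |r| with every already-kept feature is below threshold.

    The threshold value is hypothetical; the source does not state one.
    """
    corr = correlation_matrix(columns)
    kept = []
    for j in range(len(columns)):
        if all(abs(corr[j][i]) < threshold for i in kept):
            kept.append(j)
    return kept


# Toy data (invented): 3 features, 5 samples each; feature 1 is 2 x feature 0.
features = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [5.0, 1.0, 4.0, 2.0, 3.0],
]
kept = select_features(features)
print(kept)  # [0, 2]
```

Feature 1 is dropped because it duplicates feature 0, while feature 2 (correlation of -0.3 with feature 0) is retained. A lower threshold removes more features, so the value would need tuning against the classifier's accuracy.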

1. Introduction

Data mining techniques use computational analysis of very large data sets to uncover trends and patterns and to classify information. Data mining emerged in response to the rapid growth of data, and it enables the acquisition of knowledge and the extraction of significant information.

Prediction of protein secondary structure is a fundamental procedure in protein science, and computational biology is used to recognize three-dimensional protein structures in order to uncover biological functions [1]. Protein sequencing technologies aid in the identification of disease-causing genetic variants. This involves analysing a massive quantity of sequencing data so that computational approaches can efficiently discover the subset of disease-related variants. Extracting information from an organism's DNA sequence is important for both biological study and medical applications.

Figure 1. Breast cancer classifications

1.1 Protein Structure Identification

A protein is a polymeric macromolecule composed of amino acid building units connected by peptide bonds. The linear polypeptide chain is the protein's primary structure, written as a sequence of letters with one letter per amino acid. A second sequence, the protein's secondary structure, assigns each amino acid to its corresponding secondary structure element. The secondary structure elements are organized into a tertiary structure, which is represented by the coordinates of all protein atoms. Multiple protein chains link together to form protein complexes [1, 2]. These complexes are known as the quaternary protein structure.
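
The structural levels described above can be sketched as a small data structure. This is an illustrative sketch only: the class name `ProteinChain`, the three-state helix/strand/coil alphabet (`H`, `E`, `C`), and the toy sequence are assumptions, not taken from the article.

```python
from dataclasses import dataclass, field
from itertools import groupby

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard one-letter codes
SS_STATES = set("HEC")  # assumed three-state alphabet: helix, strand, coil


@dataclass
class ProteinChain:
    sequence: str   # primary structure: one letter per amino acid
    secondary: str  # secondary structure: one state per residue
    coords: list = field(default_factory=list)  # tertiary: (x, y, z) per atom, optional

    def __post_init__(self):
        if set(self.sequence) - AMINO_ACIDS:
            raise ValueError("sequence contains non-standard residue codes")
        if len(self.secondary) != len(self.sequence):
            raise ValueError("secondary structure must have one state per residue")
        if set(self.secondary) - SS_STATES:
            raise ValueError("secondary structure contains unknown states")

    def segments(self):
        """Return (state, start, end) runs, with end exclusive."""
        runs, pos = [], 0
        for state, group in groupby(self.secondary):
            length = len(list(group))
            runs.append((state, pos, pos + length))
            pos += length
        return runs


# Toy chain (invented for illustration, not a real annotated protein).
chain = ProteinChain(sequence="MKTAYIAKQR", secondary="CHHHHHEECC")
print(chain.segments())  # [('C', 0, 1), ('H', 1, 6), ('E', 6, 8), ('C', 8, 10)]

# A quaternary structure can then be modelled as a list of chains.
complex_chains = [chain, ProteinChain("GAVLI", "CEEEC")]
```

Keeping the secondary structure as a string aligned residue by residue with the sequence makes the per-residue labelling explicit, which is the form a secondary structure predictor would typically output.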

1.2 Therapeutic Protein Agents

Breast cancer is one of the most challenging diseases to diagnose. It is a common disease, and genome-wide association studies have identified a variety of risk variants and genes associated with it. Using such conditions as case studies for repurposing therapeutic protein agents (immuno-oncology) will serve as a model for forecasting the development of novel drugs [1, 16]. As a result, this strategy offers significant potential for cancer prediction using an immunological signature.

1.3 Machine Learning for Biomedical Data

Computer-assisted learning (CALM) is a hybrid of AI and computational intelligence. According to researchers in this area, intelligent machines capable of resolving real-world problems will be modelled on the human mind [2, 3]. Genetic algorithms, belief networks, and learning theory are among the techniques used, alongside probabilistic reasoning. These techniques must be taken into account when designing, developing, and implementing intelligent discovery.

Figure 2. Connections between Medical Informatics, Medicine data, and Protein Structure
