Prediction of Compound-protein Interactions with Machine Learning Methods

Yoshihiro Yamanishi (Mines ParisTech, Institut Curie, Inserm U900, France) and Hisashi Kashima (IBM Tokyo Research Laboratory, Japan)
Copyright: © 2012 | Pages: 15
DOI: 10.4018/978-1-60960-818-7.ch315

Abstract

In silico prediction of compound-protein interactions from heterogeneous biological data is critical in the process of drug development. In this chapter the authors review several supervised machine learning methods to predict unknown compound-protein interactions from chemical structure and genomic sequence information simultaneously. The authors review several kernel-based algorithms from two different viewpoints: binary classification and dimension reduction. In the results, they demonstrate the usefulness of the methods on the prediction of drug-target interactions and ligand-protein interactions from chemical structure data and genomic sequence data.

Introduction

Most drugs are small compounds that interact with their target proteins and inhibit or activate the biological behavior of those proteins. Therefore, the identification of interactions between compounds (ligands, small molecules, drugs) and proteins (targets) is an important part of genomic drug discovery. Examples of pharmaceutically useful target proteins are enzymes, ion channels, G protein-coupled receptors (GPCRs) and nuclear receptors. Owing to the completion of the human genome sequencing projects, we are beginning to understand the genomic spaces populated by these protein classes. At the same time, the high-throughput screening of large-scale chemical compound libraries with various biological assays is enabling us to explore the chemical space of possible compounds (Kanehisa et al., 2006; Stockwell, 2000; Dobson, 2004). However, our knowledge about the relationship between the chemical and genomic spaces is very limited.

In 2003 the U.S. National Institutes of Health announced the Roadmap, which contained new chemical genomics initiatives. The aim of chemical genomics research is to relate the chemical space to the genomic space in order to identify potentially useful compounds such as imaging probes and drug leads. Toward this goal, the PubChem database was established at NCBI (Wheeler et al., 2006) to store chemical information on millions of compounds, but the number of compounds with information on their target proteins is very limited. This implies that many potential interactions between the chemical and genomic spaces remain undiscovered. There is therefore a strong incentive to develop new methods capable of detecting these potential compound-protein interactions efficiently.

Although some biotechnologies such as binding assays are becoming available, experimental determination of compound-protein interactions remains very challenging and expensive. It is therefore of great practical interest to develop effective in silico prediction methods that can both provide new predictions to experimentalists and supply supporting evidence for experimental studies. Computational prediction is expected to increase research productivity in genomic drug discovery.

In this chapter we review various computational approaches to predict compound-protein interactions from chemical structures and protein sequences. We formulate the prediction of compound-protein interactions as a machine learning problem and introduce several recently developed supervised methods from two different viewpoints: binary classification and dimension reduction. In the results, we show the usefulness of the methods for predicting compound-protein interactions from chemical structure data and genomic sequence data. We also discuss the characteristics of the methods and outline some perspectives for future work.
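
To make the binary classification viewpoint concrete, the following is a minimal sketch and not the chapter's own algorithm. It treats each compound-protein pair as one example, multiplies a compound similarity by a protein similarity to obtain a kernel on pairs, and trains a kernel perceptron on that pair kernel. Everything specific here is an assumption made for illustration: the toy substructure sets, the Tanimoto and 2-mer kernels, the kernel perceptron as the learner, and all function names.

```python
from collections import Counter
from math import sqrt


def compound_kernel(a, b):
    """Tanimoto similarity between two sets of (hypothetical) substructure ids."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def sequence_kernel(s, t, k=2):
    """Cosine-normalised k-mer count kernel between two protein sequences."""
    cs = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    ct = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    dot = sum(cs[m] * ct[m] for m in cs)
    norm = sqrt(sum(v * v for v in cs.values()) * sum(v * v for v in ct.values()))
    return dot / norm if norm else 0.0


def pair_kernel(p, q):
    """Kernel on (compound, protein) pairs: product of the two base kernels."""
    (c1, s1), (c2, s2) = p, q
    return compound_kernel(c1, c2) * sequence_kernel(s1, s2)


def decision_score(alpha, pairs, labels, x):
    """Weighted sum of kernel values between x and the training pairs."""
    return sum(a * y * pair_kernel(xj, x) for a, xj, y in zip(alpha, pairs, labels))


def train_kernel_perceptron(pairs, labels, epochs=20):
    """Kernel perceptron: bump a pair's weight whenever it is misclassified."""
    alpha = [0] * len(pairs)
    for _ in range(epochs):
        mistakes = 0
        for i, (x, y) in enumerate(zip(pairs, labels)):
            if y * decision_score(alpha, pairs, labels, x) <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training pair is classified correctly
            break
    return alpha


def predict(alpha, pairs, labels, x):
    """Return +1 (predicted interaction) or -1 (predicted non-interaction)."""
    return 1 if decision_score(alpha, pairs, labels, x) > 0 else -1


# Toy data (invented for illustration): two compounds and two proteins.
c1, c2 = frozenset({1, 2, 3}), frozenset({4, 5, 6})
p1, p2 = "AAAA", "CCCC"
train_pairs = [(c1, p1), (c1, p2), (c2, p1), (c2, p2)]
train_labels = [1, -1, -1, 1]  # +1 = known interaction, -1 = assumed negative

alpha = train_kernel_perceptron(train_pairs, train_labels)

# Score two unseen pairs built from a compound that resembles c1.
new_compound = frozenset({1, 2, 7})
print(predict(alpha, train_pairs, train_labels, (new_compound, "AAAC")))
print(predict(alpha, train_pairs, train_labels, (new_compound, "CCCA")))
```

In the toy labels, neither the compound nor the protein alone determines the outcome; only the combination does. That is the reason for defining the kernel on pairs. In practice the perceptron would usually be replaced by a support vector machine or a similar kernel classifier, and the toy kernels by chemical structure and sequence similarity measures suited to the data.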
