Chemoinformatics is a scientific area that endeavours to study and solve complex chemical problems using computational techniques and methods.
Chemoinformatics and Advanced Machine Learning Perspectives: Complex Computational Methods and Collaborative Techniques provides an overview of current research in machine learning and applications to chemoinformatics tasks. As a timely compendium of research, this book offers perspectives on key elements that are crucial for complex study and investigation.
The many academic areas covered in this publication include, but are not limited to:
- Advanced PLS techniques in chemometrics
- Bayesian statistics
- Chemoinformatics on metabolic pathways
- Compound-protein interactions with machine learning methods
- Graph kernels for chemoinformatics
- Graph mining in chemoinformatics
- Machine learning in drug discovery and development
- Nonlinear partial least squares
- Similarity fusion for virtual screening
- Support vector machines
Reviews and Testimonials
The book presents cutting-edge tools and strategies for solving problems in chemoinformatics. It explains key elements of the field. The authors have highlighted many future research directions that will foster multidisciplinary collaborations and hence lead to significant development and progress in the field of chemoinformatics.
– Huma Lodhi (Imperial College, UK); Yoshihiro Yamanishi (Kyoto University, Japan)
Table of Contents and List of Contributors
Preface
Chemoinformatics endeavors to study and solve complex chemical problems by using computational tools and methods. It involves storing and analyzing data, and drawing inferences from chemical information. Recent advances in high-throughput technologies and the generation of large amounts of data have created huge interest in designing, analyzing and applying novel computational, and more specifically learning, methodologies to solve chemical problems. The advances in machine learning for chemoinformatics establish a need for a comprehensive text on the subject. The book addresses this need by presenting in-depth descriptions of novel learning algorithms and approaches for foundational topics ranging from virtual screening to chemical genomics.
The book is designed for a multidisciplinary audience. It is intended for researchers, scientists and experts in fields ranging from chemistry and biology to machine learning. It is stimulating, clear and accessible to its readers. It provides useful and efficient tools to experts in industry, including pharmaceutical, agrochemical and biotechnology companies. It aims to foster collaborations between experts in chemistry, biology and machine learning, which are crucial for advances in chemoinformatics.
The chapters of the book are organized into five sections. The first section presents methods for computing similarity in chemical spaces. Kernel methods are a well-known class of machine learning algorithms that give state-of-the-art performance. Chapters one and two present novel kernel functions for problems in chemoinformatics. Graph kernels, which can be viewed as effective approaches to computing similarity and predicting the properties of chemical compounds, are described in detail in the first chapter. The next chapter explains optimal assignment kernels for predicting the absorption, distribution, metabolism and excretion (ADME) properties of compounds. Other useful kernels, pharmacophore kernels, are presented in chapter three. The use of data fusion in chemoinformatics is the topic of chapter four. The last chapter of section one introduces molecular features such as autocorrelation descriptors and their effectiveness in building quantitative structure-activity relationship models.
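As a rough illustration of what computing similarity between compounds can look like, the sketch below uses the Tanimoto coefficient on sets of substructure features. This is a deliberately simple stand-in, not one of the graph, optimal assignment or pharmacophore kernels described in the chapters; the function name `tanimoto` and the toy feature sets are hypothetical and chosen only for this example.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        # Convention assumed here: two empty descriptions are identical.
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


# Hypothetical substructure features; real descriptors would come
# from a chemoinformatics toolkit rather than hand-written labels.
ethanol = frozenset({"C-C", "C-O", "O-H"})
propanol = frozenset({"C-C", "C-O", "O-H", "C-C-C"})
ethane = frozenset({"C-C"})

print(tanimoto(ethanol, propanol))  # 3 shared of 4 distinct features
print(tanimoto(ethanol, ethane))    # 1 shared of 3 distinct features
```

The value lies between 0 and 1, with higher values indicating that two compounds share a larger fraction of their features.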
The complexity of chemical problems has led researchers to develop techniques that integrate different areas. Section two of the book presents methodologies that combine graph mining and machine learning techniques. For example, chapter six introduces a framework for integrating graph mining algorithms with support vector machines (SVMs), partial least squares (PLS) regression and least angle regression (LARS) in order to extract informative or discriminative chemical fragments. Chapter seven describes graph matching algorithms that compare the 3D structures of protein molecules to predict their biological functions.
Section three explains important elements of chemoinformatics. It gives an in-depth description of statistical and Bayesian techniques. Partial least squares methods are well known for their effectiveness in constructing accurate models in chemoinformatics. Chapters eight and nine not only explain PLS techniques but also motivate the further development and enhancement of these methods. Virtual screening is a very useful tool in drug design and development. A number of Bayesian approaches are presented in chapter ten, and the efficacy of their application to virtual screening is validated. The next chapter further explains the use of Bayesian and non-Bayesian learning methods for large-scale virtual screening.
Integration of ideas from different fields such as chemoinformatics, bioinformatics and systems biology is important for developing methods to solve challenging chemical problems. The research presented in section four aims to enhance the design of these techniques. Chapter twelve gives an overview of a number of learning methods, ranging from classical to modern approaches, and establishes their efficacy for tasks in chemoinformatics. Chapter thirteen highlights the key issues that need to be addressed to construct structure-activity relationship models and introduces a useful tool for modeling human intestinal absorption. Mutagenicity is an unfavorable characteristic of drugs that can cause adverse effects. Chapter fourteen gives an overview of techniques for detecting and identifying mutagenic compounds: inductive logic programming (ILP), propositional methods within ILP, ensemble methods, probabilistic methods and kernel methods. Chapter fifteen describes an exciting research avenue, the classification of odorants using machine learning.
The final section of the book introduces machine learning for chemical genomics, which requires useful computational approaches to investigate the relationship between the chemical space of possible compounds and the genomic space of possible genes or proteins. Chapter sixteen addresses this issue by presenting a number of machine learning approaches for predicting drug-target and ligand-protein interactions from the integration of chemical and genomic data on a large scale. Chapter seventeen presents research that integrates genomics and metabolomics to analyze enzymatic reactions on metabolic pathways, thereby introducing tools that can address challenges ranging from environmental issues to health problems.
As described in the preceding paragraphs, the book presents cutting-edge tools and strategies for solving problems in chemoinformatics. It explains key elements of the field. The authors have highlighted many future research directions that will foster multidisciplinary collaborations and hence lead to significant development and progress in the field of chemoinformatics.
We want to thank IGI Global for their help in processing the book.
Huma Lodhi and Yoshihiro Yamanishi
Author(s)/Editor(s) Biography
Huma Lodhi obtained her Ph.D. in computer science from the University of London. She is a researcher in the Department of Computing, Imperial College London. She has published in leading international journals, books and conference proceedings, and has edited the volume Elements of Computational Systems Biology (Wiley Series in Bioinformatics, 2010; edited by Huma M Lodhi and Stephen H Muggleton). Her research interests are machine learning and data mining and their application to tasks in bioinformatics, chemoinformatics and computational systems biology.
Yoshihiro Yamanishi is a faculty member at the Centre for Computational Biology, Mines ParisTech, France. He is also a researcher in the department of Bioinformatics and Computational Systems Biology of Cancer, Mines ParisTech - Institut Curie - INSERM U900. He works on statistics and machine learning for bioinformatics, chemoinformatics and genomic drug discovery. He obtained his Ph.D. in 2005 from Kyoto University in Japan. He was a post-doctoral research fellow at the Center for Geostatistics, Ecole des Mines de Paris, from 2005 to 2006, and an assistant professor at the Institute for Chemical Research, Kyoto University, from 2006 to 2007.