Complexity, Emergence and Molecular Diversity via Information Theory

Francisco Torrens (Institut Universitari de Ciència Molecular, Universitat de València, Spain) and Gloria Castellano (Universidad Católica de Valencia San Vicente Mártir, Spain)
DOI: 10.4018/978-1-4666-2077-3.ch009


Numerous definitions of complexity have been proposed, with little consensus. The definition used here is related to Kolmogorov complexity and Shannon entropy measures. The price, however, is that context dependence is introduced into the definition of complexity. Such context dependence is an inherent property of complexity. Scientists are uncomfortable with it because it smacks of subjectivity, which is why little agreement exists on the meaning of the terms. In an article published in Molecules, Lin presented a novel approach for assessing molecular diversity based on Shannon information theory, in which a set of compounds is viewed as a static collection of microstates that can register information about their environment. The method is characterized by a strong tendency to oversample remote areas of the feature space and to produce unbalanced designs. This chapter demonstrates the limitation with some simple examples and provides a rationale for the method's failure to produce consistent results.
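As a rough illustration of the Shannon entropy measure mentioned above, the following minimal sketch computes the entropy of an empirical distribution of labels. It is not Lin's method or the chapter's own procedure; the function name shannon_entropy and the bin labels "A" to "D" (standing for hypothetical regions of a feature space to which compounds have been assigned) are assumptions made only for this example.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    # H = sum over observed labels of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical bin assignments for three small compound sets (illustrative only).
clustered = ["A", "A", "A", "A"]   # all compounds fall in one bin
mixed = ["A", "A", "B", "B"]       # two equally populated bins
spread = ["A", "B", "C", "D"]      # one compound per bin

for name, labels in [("clustered", clustered), ("mixed", mixed), ("spread", spread)]:
    print(name, shannon_entropy(labels))
```

Under this measure a set spread evenly over more bins scores higher than a clustered one. The result depends on how the bins are chosen, which is one simple way to see the context dependence discussed above.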
Chapter Preview


Complexity is conceptually impressive: it is a discipline that can be applied to energy packets, traffic, neurons, stock markets, molecules inside the cell, etc., that is, to any system formed by elements that interact with one another in an apparently random way until, without our understanding why or how, something happens (Mitchell, 2009). The emergent properties are impressive; they leave us astonished and raise two basic questions. (1) Is there a science that could describe the internal laws of complexity satisfactorily? (2) Was the novelty that appears already there, so that finding it is only a matter of refining the calculations, or is it really unforeseen?

We give an example from chemistry (Pullman, 1994). Imagine that someone knows everything about the structure of the water molecule. Can this person foresee the transition from liquid to ice? Or does it happen at a different level? We think that it is fundamentally different because, e.g., the second law of thermodynamics appears only when there is more than one particle. Physicists do not usually like this statement because they think that by taking the system apart top-down they could come to understand all its features. They tend to be more reductionist, but we as chemists have a different approach: in the laboratory we build up new complex systems instead of deconstructing existing ones. Moreover, we study them in a more holistic way.

In earlier reports, fractal hybrid-orbital analyses of protein tertiary structure were carried out (Torrens, 2000, 2001, submitted). Valence topological charge-transfer indices for molecular dipole moments were obtained (Torrens, 2004). Information-entropy molecular classification was applied to local anaesthetics (Castellano & Torrens, 2009; Torrens & Castellano, 2006, in press, a) and to inhibitors of human immunodeficiency virus type 1 (Torrens & Castellano, 2009, 2010, in press, b). The structural classification of complex molecules by artificial intelligence techniques was reported (Torrens & Castellano, in press, c), as was their structural classification by information entropy and the equipartition conjecture (Torrens & Castellano, in press, d). Molecular diversity studies were performed on the bond-based linear indices of the non-stochastic and stochastic edge adjacency matrix for the physicochemical properties of organic molecules (Marrero-Ponce et al., in press) and on novel coumarin-based tyrosinase inhibitors discovered by the Organisation for Economic Co-operation and Development (OECD) principles-validated quantitative structure–activity relationship (QSAR) approach from an enlarged balanced database (Le-Thi-Thu et al., in press). The following subsections describe the problem of complexity, the concept of emergence, entropy as a case study of emergence, and complexity in group method of data handling (GMDH)-type neural networks. The computational method is then explained. In the subsequent sections the calculation results are presented and discussed. The final section summarizes the conclusions.
