Introduction
Simplicity provides clarity, reliability, and reproducibility; the drive to minimize variety has therefore always been characteristic of research practice in general, and of quantitative structure-property/activity relationship (QSPR/QSAR) analyses in particular (Montchamp et al. 1993). The optimal descriptor is one example of this pursuit of simplicity.
The first attempts to construct optimal descriptors were based on molecular graphs (Randic & Basak 1999; Randic & Pompe 2001; Randic & Basak 2001; Da Silva Junkes et al. 2005). It is well known that most topological indices are calculated from numerical data on the molecular graph: the vertex degrees (i.e., the number of vertices connected to a given vertex), the numbers of paths of length 2, 3, etc., and the Morgan vertex degrees (extended connectivities) of increasing order. The zero-order extended connectivity is the usual number of neighbors. The first-order extended connectivity is calculated with the recurrence formula from the usual vertex degrees, the second-order extended connectivity is calculated with the recurrence formula from the first-order extended connectivity, and so on (Amic et al. 1998; Toropov & Toropova 2002a,b; Toropov & Toropova 2004).
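The recurrence for the extended connectivity can be illustrated with a minimal sketch. It assumes the usual Morgan scheme, in which the value of a vertex at order k+1 is the sum of the order-k values of its neighbors; the function name `extended_connectivity`, the neighbor-list encoding, and the example molecule (the hydrogen-suppressed graph of 2-methylbutane) are chosen here for illustration only and are not taken from the cited works.

```python
def extended_connectivity(adjacency, order):
    """Return the Morgan extended connectivity of the given order per vertex.

    adjacency maps each vertex to the list of its neighbors.
    Assumption: order k+1 is the sum of the order-k values of the neighbors.
    """
    # Zero order: the usual vertex degree (number of neighbors).
    ec = {v: len(neighbors) for v, neighbors in adjacency.items()}
    for _ in range(order):
        # Each step sums the previous-order values over the neighbors.
        ec = {v: sum(ec[u] for u in neighbors)
              for v, neighbors in adjacency.items()}
    return ec


# Hypothetical example: hydrogen-suppressed graph of 2-methylbutane,
# C1-C2(-C5)-C3-C4, with vertices numbered by hand.
isopentane = {1: [2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}

ec0 = extended_connectivity(isopentane, 0)
ec1 = extended_connectivity(isopentane, 1)
ec2 = extended_connectivity(isopentane, 2)
```

In this sketch the branched vertex 2 has the largest value at orders zero and two, and shares the largest first-order value with vertex 3; higher orders are obtained by applying the same step repeatedly.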
Representing the molecular structure by a molecular graph is convenient, but in practice this representation involves the adjacency matrix (Randic & Basak 1999; Randic & Pompe 2001; Randic & Basak 2001), i.e., a matrix with n × n elements (n is the number of atoms in the molecule), whereas the simplified molecular input-line entry system (SMILES) (Weininger 1988; Weininger et al. 1989; Weininger 1990) represents the structure as a string of symbols. In other words, SMILES is more convenient and more “economical” than the molecular graph for databases of physicochemical and biochemical endpoints available on the Internet. Under these circumstances, optimal descriptors calculated from SMILES (Toropov et al. 2008; Nesmerak et al. 2013; Toropova & Toropov 2014; Nesmerak et al. 2014; Masand et al. 2014; Veselinovic et al. 2014; Toropova et al. 2014) become an attractive alternative to optimal descriptors calculated from molecular graphs (Randic & Basak 1999; Randic & Pompe 2001; Randic & Basak 2001; Toropov & Toropova 2002a,b; Toropov & Toropova 2004).
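The difference in size between the two representations can be seen in a small sketch for ethanol. The atom list and bond list below are written by hand for the hydrogen-suppressed graph rather than parsed from the SMILES string, and the helper name `adjacency_matrix` is hypothetical.

```python
# SMILES string for ethanol: three symbols for three non-hydrogen atoms.
smiles = "CCO"

# Hand-written hydrogen-suppressed graph of the same molecule (an assumption
# of this sketch; no SMILES parser is used here).
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]


def adjacency_matrix(n_atoms, bonds):
    """Build the symmetric n x n adjacency matrix from a list of bonds."""
    matrix = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix


matrix = adjacency_matrix(len(atoms), bonds)

# The matrix stores n * n entries, while the string stores one symbol per atom
# for this simple unbranched molecule.
n_matrix_entries = sum(len(row) for row in matrix)
n_smiles_chars = len(smiles)
```

For larger molecules the matrix grows quadratically with the number of atoms, whereas the SMILES string grows roughly in proportion to it, although ring closures, branches, and bond symbols add further characters.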
However, the SMILES and molecular-graph representations of the molecular structure differ, and each has advantages as well as disadvantages. This makes so-called hybrid optimal descriptors (Toropova et al. 2012; Achary 2014a,b; Fatemi & Malekzadeh 2015; Ghaedi 2015) attractive: they are calculated from molecular attributes extracted from both SMILES and molecular graphs.