QSPR Modeling For Critical Temperatures Of Organic Compounds Using Hybrid Optimal Descriptors

Khalid Bouhedjar (Laboratoire de Synthèse et Biocatalyse Organique, Université Badji Mokhtar, Annaba, Algeria & Centre de Recherche en Biotechnologie Constantine, Constantine, Algeria), Abdelmalek Khorief Nacereddine (Laboratoire de Synthèse et Biocatalyse Organique, Université Badji Mokhtar Annaba, Annaba, Algeria & Department of Physics and Chemistry, Higher Normal School of Technological Education-Skikda, Skikda, Algeria), Hamida Ghorab (University of Constantine 1, Constantine, Algeria) and Abdelhafid Djerourou (Laboratoire de Synthèse et Biocatalyse Organique, Université Badji Mokhtar Annaba, Annaba, Algeria)
DOI: 10.4018/IJQSPR.2019100102

Abstract

The simplified molecular input line entry system (SMILES) is particularly suitable for high-speed machine processing. Quantitative structure-property relationships (QSPR) for critical temperatures have been established with the Monte Carlo method in the CORAL software, using a dataset of 165 diverse organic compounds and hybrid optimal descriptors defined by the molecular graph and SMILES notation. External validation is one of the most important parts of evaluating model performance. However, previous models built on the same dataset either showed poor predictive power on the external test set or were not externally validated by their authors. In the present work, the predictive ability of the model has been tested using external validation. The statistical quality of the models for the three splits is similar and good. The r² values for the best model are: r² = 0.98 for the training set, r² = 0.95 for the calibration set, and r² = 0.94 for the validation set.

2. Method

2.1. Data Set

The data set used in the study consists of 165 organic compounds obtained from the literature (Duchowicz & Castro, 2002; Katritzky et al., 1998; Poling, Thomson, George, & Friend, 2008). This set is structurally very diverse: it includes saturated and unsaturated hydrocarbons, halogenated compounds, and compounds containing hydroxyl, cyano, amino, ester, ether, carbonyl, and carboxyl functionalities. The data were split three times into training, calibration, and validation sets (Toropov, Toropova, Benfenati, Leszczynska, & Leszczynski, 2009a, 2009b; Toropova, Toropov, Benfenati, Leszczynska, & Leszczynski, 2015). The training set was used to build the QSPR model, the calibration set was used as the “critic” of the developed model, and the validation set was used to assess the predictive power of the model (Toropova & Toropov, 2014). The Supplementary Materials section (Table S1) contains the SMILES and the corresponding experimental and calculated critical temperatures (Tc).
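As an illustration only, the three-fold partitioning described above might be sketched in Python as follows. This is not the CORAL procedure itself. The function name `three_way_split`, the set fractions (0.5, 0.25, 0.25), the use of a seeded random shuffle, and the integer compound identifiers are all assumptions made for the example; the text above does not state how the compounds were assigned to each set or how large each set was.

```python
import random


def three_way_split(items, fractions=(0.5, 0.25, 0.25), seed=0):
    """Randomly partition items into training, calibration and validation sets.

    The fractions are hypothetical; the source does not report set sizes.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(items)
    # A dedicated, seeded generator keeps each split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_train = round(fractions[0] * len(shuffled))
    n_calib = round(fractions[1] * len(shuffled))
    return {
        "training": shuffled[:n_train],
        "calibration": shuffled[n_train:n_train + n_calib],
        # Whatever remains after the first two sets goes to validation.
        "validation": shuffled[n_train + n_calib:],
    }


# Placeholder identifiers standing in for the 165 compounds of Table S1.
compound_ids = list(range(1, 166))

# Three independent splits, one per (arbitrary) seed.
splits = [three_way_split(compound_ids, seed=s) for s in (1, 2, 3)]
```

Each element of `splits` holds three disjoint lists that together cover all 165 identifiers, so a model can be built and checked separately on each split and the resulting statistics compared.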
