Quantitative structure-activity relationships (QSARs) are models that use descriptors to relate the structure of a compound to a specific molecular property of interest. Substituting the descriptor values for a given compound into the QSAR yields a prediction of the property for that compound. This process is known as the forward QSAR problem. QSARs can be used to refine the search for molecules matching a desired property in an existing database, but ideally one would like to examine potential compounds outside the database. Here we present a novel algorithm to accomplish that goal by solving the inverse-QSAR (I-QSAR) problem with a powerful molecular descriptor known as Signature [1]. A Signature tree example for methanol is provided in Figure 1. The local topology of the root atom is encoded out to a specified height h. As the Signature height increases, the degeneracy (the number of structures sharing the same Signature) decreases. To solve the inverse problem, Diophantine constraint equations (equations with integer coefficients and integer solutions) are generated from the atomic Signatures based on valence and consistency restrictions [2]. Solutions to the Diophantine equations are filtered for favorable properties and can be converted back to actual structures.
Figure 1. The Signature tree for methanol is rooted at the carbon atom.
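The height-h encoding illustrated in Figure 1 can be sketched in a few lines of Python. This is a simplified illustration, not the canonical Signature algorithm of [1]: the function name, the nested-parenthesis notation, and the rule of skipping only the parent atom (which is adequate for acyclic molecules such as methanol) are assumptions made for this example.

```python
# Simplified sketch of an atomic Signature as a tree rooted at one atom.
# Assumptions: acyclic molecule, single bonds only, string notation invented here.

ATOMS = ["C", "O", "H", "H", "H", "H"]            # methanol, CH3OH
BONDS = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5)]  # C-O, three C-H, one O-H


def atomic_signature(atoms, bonds, root, height):
    """Encode the neighborhood of `root` out to `height` bonds as a string."""
    adjacency = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)

    def build(node, parent, remaining):
        label = atoms[node]
        if remaining == 0:
            return label
        # Sort the child subtrees so the string does not depend on atom order.
        children = sorted(
            build(nbr, node, remaining - 1)
            for nbr in adjacency[node]
            if nbr != parent
        )
        return label + "".join(f"({child})" for child in children)

    return build(root, None, height)


def distinct_signatures(atoms, bonds, height):
    """Set of distinct atomic Signatures in the molecule at a given height."""
    return {atomic_signature(atoms, bonds, i, height) for i in range(len(atoms))}


print(atomic_signature(ATOMS, BONDS, 0, 1))      # C(H)(H)(H)(O)
print(atomic_signature(ATOMS, BONDS, 0, 2))      # C(H)(H)(H)(O(H))
print(len(distinct_signatures(ATOMS, BONDS, 0)))  # 3
print(len(distinct_signatures(ATOMS, BONDS, 1)))  # 4
```

At height 0 the four hydrogens are indistinguishable, while at height 1 the hydroxyl hydrogen acquires its own Signature. This is a small-scale view of how a greater height resolves more structural detail.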
In our previous work with hydrofluoroethers (HFEs) [3], the compounds in the database were relatively small (10-30 atoms) and the inverse problem was easily solved at height 1. For larger compounds (50-100 atoms), the number of possible I-QSAR structures matching a solution to the Diophantine equations can be difficult to manage. Because the degeneracy decreases as the Signature height increases, fewer I-QSAR structures are generated at height 2 than at height 1. In this study, we compare working at height 1 with working at height 2 and discuss the advantages and disadvantages of each for solving the I-QSAR problem on a set of antifungal compounds [4].
Once solutions to the Diophantine equations are obtained, several types of QSAR can be applied. The predictions for the training set compounds and the inverse solutions will be evaluated with linear, nonlinear, and principal component techniques. The QSARs will be compared with one another by calculating a Euclidean distance on their predictions. Algorithms developed to further refine the focused database based on energetic considerations will also be presented.
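One plausible reading of the comparison step is sketched below: each QSAR produces a vector of predictions over the same ordered list of compounds, and two QSARs are compared by the Euclidean distance between those vectors. The function names and the model labels are hypothetical, and the numbers in the usage example are placeholders, not results.

```python
import math


def prediction_distance(pred_a, pred_b):
    """Euclidean distance between two prediction vectors over the same compounds."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction vectors must cover the same compounds")
    return math.dist(pred_a, pred_b)


def pairwise_distances(predictions):
    """Distance for every unordered pair of models in a {name: vector} mapping."""
    names = sorted(predictions)
    return {
        (a, b): prediction_distance(predictions[a], predictions[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Placeholder values only, chosen to make the arithmetic easy to follow.
example = {
    "linear": [0.0, 0.0],
    "nonlinear": [3.0, 4.0],
    "principal_component": [0.0, 0.0],
}
for pair, distance in pairwise_distances(example).items():
    print(pair, distance)
```

A small distance indicates that two QSARs rank and score the candidate compounds similarly, while a large distance flags compounds or models that merit closer inspection.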
[1] D. Visco Jr., R. Pophale, M. Rintoul, J. L. Faulon, “Developing a methodology for an inverse quantitative structure-activity relationship using the signature molecular descriptor”, J. Molecular Graphics and Modeling, 20, 429-438 (2002).
[2] C. Churchwell, M. D. Rintoul, S. Martin, D. P. Visco, Jr., A. Kotu, R. S. Larson, L. O. Sillerud, D. C. Brown and J. L. Faulon, “The Signature Molecular Descriptor. 3. Inverse Quantitative Structure-Activity Relationship of ICAM-1 Inhibitory Peptides”, J. Molecular Graphics and Modeling, 22, 263-273 (2004).
[3] D. Weis, J. L. Faulon, R. LeBone, D. Visco, “The Signature Molecular Descriptor. 5. The Design of Hydrofluoroether Foam Blowing Agents Using Inverse-QSAR”, Ind.
[4] J. Caballero, M. Fernandez, “Linear and nonlinear modeling of antifungal activity of some heterocyclic ring derivatives using multiple linear regression and Bayesian-regulated neural networks”, J. Mol. Model., 12, 168-181 (2006).