With the discovery of numerous new chemicals for various scientific applications, the need arises to assess their toxic effects with reasonable accuracy. Extensive efforts have been reported in the literature to identify methods for evaluating the toxicity of chemical compounds. Among toxicity-related issues, skin sensitization caused by exposure to toxic chemical compounds is a major work-related problem, accounting for up to 95% of occupational contact dermatitis cases. Although reliable test procedures for skin sensitization exist, their application to chemical compounds is limited by either the time or the cost involved. Hence, it becomes necessary to develop computational techniques that not only reduce time and cost but also support animal welfare. Non-testing procedures such as quantitative structure-property relationships (QSPRs) can be used as effective methods for a priori prediction of physical properties of chemical compounds. QSPR models offer an attractive alternative, with the potential to provide reliable property estimates based on information obtained solely from the chemical structure.
In this work, an effort has been made to obtain a QSPR skin sensitization model with wide applicability. An extensive database comprising test results from three distinct test procedures was used for QSPR model development. This work focuses primarily on the following objectives: (a) develop QSPR models to predict skin sensitization; because the experimental procedures and endpoint rankings for the local lymph node assay (LLNA), the guinea pig maximization test (GPMT), and the Federal Institute for Health Protection of Consumers and Veterinary Medicine (BgVV) differ, three separate QSPR models were developed; and (b) improve the predictive capability of the QSPR models using a combination of literature-recommended and statistically determined descriptors. The resultant QSPR models are capable of predicting the skin sensitization of the diverse set of molecules considered with an accuracy of 88%, 93% and 90% for the LLNA, GPMT and BgVV datasets, respectively.