Predicting Hydration Enthalpy of Low Molecular Weight Organic Molecules using COSMO-SAC Modeling

Document Type: Research Article


1 Department of Pharmacy, Al-Zahrawi University College, Karbala, Iraq

2 College of Applied Medical Sciences, University of Kerbala, Kerbala, Iraq

3 Department of Pharmacy, Al-Noor University College, Nineveh, Iraq

4 College of Food Sciences, Al-Qasim Green University, Babylon, Iraq


COSMO-SAC modeling is a reliable method for determining the activity coefficients of mixtures, and it is used here to predict the hydration enthalpy of low molecular weight organic compounds. A dataset of the activity coefficients of 96 organic molecules in mixtures with different solvents (water, ethanol, methanol, toluene, and benzene) was obtained over the full composition range with COSMO-SAC. The resulting database was merged with the FreeSolv dataset so that the hydration enthalpy of these compounds, together with the van der Waals diameter and other important molecular descriptors, could be used for machine learning training. Support vector, random forest, and gradient boosting decision tree regressors were trained on the data and used to predict the hydration enthalpy of organic and pharmaceutical compounds. The train:test split ratio was the most influential parameter in the prediction of the enthalpy of hydration. Among the methods studied, random forest regression was the most accurate, predicting the enthalpy of hydration with an RMSD of 1.5 % at a train:test ratio of 0.25:0.75.
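The comparison described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn and NumPy are available. It is not the authors' pipeline: the function name `compare_regressors`, the default hyperparameters, and the synthetic descriptor matrix are all hypothetical placeholders standing in for the merged COSMO-SAC and FreeSolv data, and the score is an absolute RMSD rather than the percentage quoted above.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR


def compare_regressors(X, y, train_fraction=0.25, seed=0):
    """Fit three regressors on one split and return the test-set RMSD of each.

    Hypothetical helper: hyperparameters are library defaults, not values
    taken from the study.
    """
    # A train:test ratio of 0.25:0.75 corresponds to train_size=0.25.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, train_size=train_fraction, random_state=seed
    )
    models = {
        "svr": SVR(),
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=seed),
        "gradient_boosting": GradientBoostingRegressor(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        # RMSD is the square root of the mean squared deviation.
        scores[name] = float(np.sqrt(mean_squared_error(y_test, predictions)))
    return scores


# Synthetic placeholder data (NOT real descriptors or enthalpies):
# 96 rows to mirror the dataset size, 4 made-up descriptor columns.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(96, 4))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=96)

scores = compare_regressors(X_demo, y_demo)
for name, rmsd in scores.items():
    print(f"{name}: RMSD = {rmsd:.3f}")
```

Repeating the call with different `train_fraction` values is one way to probe how sensitive each model is to the split ratio, which the abstract identifies as the most influential parameter.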


Main Subjects