Identification and Prediction of Fresh Gasoline Locations and Branding Using Newly Targeted Compound Chromatograms with Chemometrics and Machine Learning

Document Type: Research Article

Authors

1 Jabatan Kimia Malaysia, 46661 Petaling Jaya, Selangor, Malaysia

2 East Coast Environmental Research Institute (ESERI), Universiti Sultan Zainal Abidin, Gong Badak, 21300 Kuala Nerus, Terengganu, Malaysia

3 Faculty of Bioresources and Food Industry, Universiti Sultan Zainal Abidin, Besut Campus, 22200 Besut, Terengganu, Malaysia

DOI: 10.22034/crl.2024.407768.1233

Abstract

Forensic investigations place significant importance on the identification of gasoline and its use at crime scenes, particularly in arson cases. This study used gas chromatography-mass spectrometry (GC-MS) to analyse gasoline samples. Chemometric techniques, specifically principal component analysis (PCA) and discriminant analysis (DA), together with classification and regression tree (CART) machine learning, were used to identify and differentiate gasoline brands and geographical origins. The study covers three widely recognised gasoline brands obtained from stations across eight Malaysian states that also host an oil refinery. A novel chromatogram, the targeted compounds chromatogram (TCC), was developed. It consists of toluene, p-xylene, propylbenzene, 1-ethyl-2-methylbenzene, mesitylene, and indane, which were identified as markers using factor analysis. The TCC was then applied to 53 training samples, classifying brand and location of origin with 94% accuracy. A CART machine-learning model was constructed and used to analyse 100 unidentified real gasoline samples, achieving a mean absolute error (MAE) of 1.1 for location and 0.4 for brand. The accuracy of the estimator remained consistent when the training data set was changed. These results illustrate the capacity of this methodology to assist in solving criminal investigations.
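As a reading aid for the MAE figures quoted above, the following is a minimal sketch of how a mean absolute error can be computed over class labels. It assumes the brands and locations were encoded as integers (for example, brands 1 to 3), which the abstract does not state; the function name and the sample values are hypothetical and are not data from the study.

```python
def mean_absolute_error(true_codes, predicted_codes):
    """Average absolute difference between paired integer-encoded labels."""
    if len(true_codes) != len(predicted_codes):
        raise ValueError("inputs must have the same length")
    if not true_codes:
        raise ValueError("inputs must not be empty")
    total = sum(abs(t - p) for t, p in zip(true_codes, predicted_codes))
    return total / len(true_codes)


# Hypothetical brand codes (1-3) for five samples; NOT data from the study.
true_brand = [1, 2, 3, 1, 2]
pred_brand = [1, 2, 1, 1, 3]

# Absolute differences are 0, 0, 2, 0, 1, so the mean is 3 / 5.
brand_mae = mean_absolute_error(true_brand, pred_brand)
print(brand_mae)
```

Under this assumed encoding, an MAE below 1 means that predictions are, on average, less than one code away from the true label. Because the codes are nominal, the size of an individual error depends on how the labels were numbered.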
