Classification of local diesel fuels and simultaneous prediction of their physicochemical parameters using FTIR-ATR data and chemometrics
Abstract
Class identification and prediction of physicochemical variables were investigated for eight diesel fuel brands collected from several stations within the Atlanta metropolitan area in the State of Georgia, using principal component analysis (PCA), partial least squares discriminant analysis (PLS2-DA), and partial least squares regression (PLSR) as modeling techniques. The fuels came from a common pipeline and were therefore assumed to have very similar characteristics. Ten FTIR-ATR spectra per fuel brand were collected over the 650–4000 cm⁻¹ mid-infrared region, and the resulting 80 × 3351 matrix was submitted to PCA to determine whether the spectra formed clusters. Following PCA, the 80 × 3351 matrix was split into a training matrix (56 × 3351) and a test matrix (24 × 3351). PLS2-DA models were built and evaluated for class identification, with dummy variables (1, 0) encoding class membership as the input matrix. For prediction of the physicochemical variables, models were developed via PLSR using the FTIR-ATR training matrix and the physicochemical variables obtained from the Georgia Department of Agriculture Labs as input. Correlation coefficients for the eight fuels ranged from 0.9960 to 0.9998. PCA resolved all eight diesel fuels into distinct clusters despite this narrow range of correlation coefficients. With a 1.0 ± 0.1 cut-off for fuel identification, the PLS2-DA models gave 100% correct predictions for four or five fuel brands and 75% correct predictions for all eight fuel brands. PLSR correctly predicted all of the physicochemical variables, with root mean square error of prediction (RMSEP) values ranging from 0.019 to 1.132 for all 80 variables targeted.
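The dummy-variable coding, the 1.0 ± 0.1 identification cut-off, and the RMSEP figure of merit mentioned above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the brand labels "A" to "C", and the numeric values are hypothetical, and two behaviours are assumptions made here for illustration, namely that the cut-off bounds are inclusive and that a sample is left unassigned when no class, or more than one class, falls within the cut-off.

```python
from math import sqrt


def dummy_matrix(labels, classes):
    """Build the (1, 0) class matrix: 1 in the column of the sample's class, 0 elsewhere."""
    return [[1 if label == c else 0 for c in classes] for label in labels]


def identify(predicted_row, classes, target=1.0, tol=0.1):
    """Assign the class whose predicted dummy value lies within target ± tol.

    Assumption for this sketch: return None when no class, or more than
    one class, meets the cut-off.
    """
    hits = [c for c, y in zip(classes, predicted_row) if abs(y - target) <= tol]
    return hits[0] if len(hits) == 1 else None


def rmsep(reference, predicted):
    """Root mean square error of prediction over a set of test samples."""
    n = len(reference)
    return sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


# Hypothetical example with three brands instead of eight.
classes = ["A", "B", "C"]
Y_train = dummy_matrix(["A", "B", "C", "A"], classes)

# Hypothetical predicted dummy values for two test spectra.
print(identify([0.02, 0.97, -0.01], classes))  # 0.97 is within 1.0 ± 0.1
print(identify([0.02, 0.85, 0.10], classes))   # 0.85 is outside the cut-off
```

In this sketch a prediction counts as correct only when exactly one dummy value falls inside the cut-off window and it belongs to the true brand, which is a stricter rule than simply taking the largest predicted value.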