Fig. 2

Methodology and outcomes of the feature selection process for the machine learning models. (a) Five-fold cross-validation curve for LASSO regression, indicating an optimum of 19 features, subsequently reduced to 12 to prevent overfitting. (b) LASSO regression coefficient paths for all 54 features. (c) Optimal feature selection for the SVM model using recursive feature elimination (RFE), achieving the lowest root mean square error (RMSE) with 26 features. (d) Optimal feature selection for the RF model using RFE, achieving peak accuracy with 53 features. (e) Ranking of feature importance in the RF model. (f) Ranking of feature importance in the XGBoost model.
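
The selection pipeline the caption describes (LASSO with five-fold cross-validation, RFE for the SVM and RF models, and impurity-based importances) can be sketched with scikit-learn. This is a minimal illustration on synthetic data, not the study's actual code: the dataset, hyperparameters, and the XGBoost importance step are assumptions, and the synthetic data will not reproduce the 19/12, 26, or 53 feature counts reported in the figure.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV
from sklearn.feature_selection import RFECV
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the study's 54-feature dataset (assumption:
# the real data are not available here).
X, y = make_regression(n_samples=200, n_features=54, n_informative=12,
                       noise=5.0, random_state=0)

# (a, b) LASSO with five-fold cross-validation over the regularisation
# path; features with non-zero coefficients form the selected subset.
lasso = LassoCV(cv=5, random_state=0).fit(X, y)
lasso_selected = np.flatnonzero(lasso.coef_)

# (c) RFE for a linear SVM, scored by RMSE (negated, because
# scikit-learn maximises scores) under five-fold cross-validation.
svm_rfe = RFECV(SVR(kernel="linear"), cv=5,
                scoring="neg_root_mean_squared_error").fit(X, y)

# (d, e) RFE for a random forest, then its impurity-based importances
# from the final estimator refit on the selected features.
rf_rfe = RFECV(RandomForestRegressor(n_estimators=50, random_state=0),
               cv=5).fit(X, y)
importances = rf_rfe.estimator_.feature_importances_

print(len(lasso_selected), svm_rfe.n_features_, rf_rfe.n_features_)
```

An XGBoost analogue of panel (f) would read `feature_importances_` from a fitted `xgboost.XGBRegressor` in the same way; it is omitted here to keep the sketch dependent only on scikit-learn.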