Machine Learning to Predict Etiology for Infectious Diseases of Classic Fever of Unknown Origin in Adults
Machine Learning to Predict Etiology for Infectious Diseases
DOI:
https://doi.org/10.56042/ijeb.v61i07.2826
Keywords:
C-reactive protein (CRP), Extreme gradient boosting (XGBoost), Light gradient boosting machine (LightGBM), Random forest (RF), SHAP
Abstract
The etiologies of infectious diseases (IDs) of classic fever of unknown origin (FUO) are numerous, and different etiologies call for different medication decisions. Here, we attempt to predict the type of etiology of IDs of classic FUO in adults using a machine learning (ML) model. Ten years of clinical data on 408 cases of classic FUO were retrospectively collected from August 2012 to August 2022 at Huzhou Central Hospital. A total of 256 adult patients with ID of classic FUO were divided into four subgroups for clinical characteristic analysis. Random forest (RF), light gradient boosting machine (LightGBM), and extreme gradient boosting (XGBoost) were used to construct prediction models with 10-fold cross-validation. The micro-average and weighted-average F1 scores were calculated to evaluate the performance of the models. SHapley Additive exPlanations (SHAP) was used to explain the relationship between the features and the predicted results. Clinical characteristic analysis showed that 25 indices were statistically different (P < 0.05). The RF, LightGBM, and XGBoost models were constructed on the basis of these indices. Among them, the XGBoost model showed the best performance (micro-F1 = 0.7129, weighted-F1 = 0.6618). The areas under the ROC curve for the four subgroups were 0.7477, 0.7162, 0.9200, and 0.7500, respectively. C-reactive protein (CRP), N%, C3, and C4, which had high SHAP values, were positively correlated with the bacterial ID model output, which was used to distinguish bacterial infection from other causes. Bacterial infections were the main cause of IDs. The XGBoost model could be regarded as an auxiliary tool to predict the etiological types of IDs of classic FUO, improve the etiological diagnostic rate, and provide evidence for clinical drug application.
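The two evaluation metrics named in the abstract, micro-average and weighted-average F1, can be illustrated with a minimal sketch. This is not the authors' code: the function name `f1_scores`, the integer subgroup labels 0 to 3, and the toy predictions are hypothetical and chosen only to show how the two averages are computed and why they can differ when one subgroup is predicted poorly.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, weighted_f1) for single-label multiclass predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # class p was predicted wrongly
            fn[t] += 1          # class t was missed

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro average: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))

    # Weighted average: per-class F1, weighted by each class's true count.
    support = Counter(y_true)
    weighted = sum(support[c] * f1(tp[c], fp[c], fn[c]) for c in labels) / len(y_true)
    return micro, weighted


# Hypothetical toy labels for four subgroups (0-3); not data from the study.
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 0, 0, 1, 0, 2, 2, 0, 0]

micro, weighted = f1_scores(y_true, y_pred)
print(f"micro-F1 = {micro:.4f}, weighted-F1 = {weighted:.4f}")
```

In this toy example subgroup 3 is never predicted correctly, so its per-class F1 of zero pulls the weighted average below the micro average, the same ordering as the reported XGBoost scores. Whether that is the cause of the gap in the study cannot be determined from the abstract.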