Feature Selection for Biomedical Data Classification: Statistical vs. Swarm Intelligence Methods

BIOMEDICAL DATA CLASSIFICATION — STATISTICAL VS. SWARM INTELLIGENCE METHODS

Authors

  • Ulfeta Marovac, Department of Technical and Technological Sciences, State University of Novi Pazar, Vuka Karadžića 9, 36300 Novi Pazar, Serbia
  • Aldina Avdić, Department of Technical and Technological Sciences, State University of Novi Pazar, Vuka Karadžića 9, 36300 Novi Pazar, Serbia
  • Irfan Fetahović, Department of Technical and Technological Sciences, State University of Novi Pazar, Vuka Karadžića 9, 36300 Novi Pazar, Serbia
  • Lejlija Memić, Department of Technical and Technological Sciences, State University of Novi Pazar, Vuka Karadžića 9, 36300 Novi Pazar, Serbia
  • Nataša Đorđević, Department of Natural and Mathematical Sciences, State University of Novi Pazar, Vuka Karadžića 9, 36300 Novi Pazar, Serbia
  • Zana Dolićanin, Department of Biomedical Sciences, State University of Novi Pazar, Vuka Karadžića 9, 36300 Novi Pazar, Serbia
  • Goran Babić, Faculty of Medical Sciences, University of Kragujevac, Svetozara Markovića 69, 34000 Kragujevac, Serbia

DOI:

https://doi.org/10.56042/jsir.v84i6.13842

Keywords:

Biomedical data classification, Feature selection, Machine learning, Swarm intelligence

Abstract

Applying machine learning methods to datasets with many features raises challenges in training time and model complexity. Feature selection is crucial for reducing data dimensionality, improving classification accuracy, and improving model interpretability. This study aims to improve the classification of integrated biomedical data in order to support the diagnosis of thrombophilia. The dataset consists of 71 features from 35 women (22 healthy, 13 with thrombophilia), and three classification algorithms (K Nearest Neighbors, Random Forest, Support Vector Machine) are used to evaluate model performance. Key features related to thrombophilia diagnosis are identified using both filter methods and wrapper methods based on swarm intelligence algorithms, and the two approaches are analyzed and compared as candidates for the feature selection process. For clinical and biological data, the wrapper method outperformed the filter methods, achieving a classification accuracy of 0.97 compared to 0.91 while selecting only 4 key features compared to 10. For demographic data, both methods produced the same classification accuracy (0.83), but the wrapper method selected fewer features. These findings indicate that wrapper methods based on swarm intelligence algorithms improve model performance and allow more efficient data management, which has practical value for thrombophilia diagnostics. The study also highlights the advantage of applying the Bat Algorithm to feature selection for thrombophilia prediction, which contributes to both the novelty and the utility of the approach.
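To make the distinction between the two families concrete, the following is a minimal sketch, not the authors' implementation. It contrasts a filter method (ranking features by a simple class-separation score) with a wrapper method (a simplified binary Bat Algorithm whose fitness is the leave-one-out accuracy of a k-nearest-neighbors classifier). Everything in it is an assumption made for illustration: the function names, the fixed loudness and pulse rate, the size penalty in the fitness function, and the synthetic data, which only reuses the 22/13 class split mentioned above and is not the study's dataset.

```python
import math
import random
import statistics

def loo_knn_accuracy(X, y, mask, k=3):
    """Leave-one-out accuracy of k-NN using only the features where mask is 1."""
    cols = [j for j, m in enumerate(mask) if m]
    if not cols:
        return 0.0  # no features selected: treat as worst case
    correct = 0
    for i in range(len(X)):
        dists = sorted(
            (sum((X[i][j] - X[n][j]) ** 2 for j in cols), y[n])
            for n in range(len(X)) if n != i
        )
        votes = sum(label for _, label in dists[:k])
        correct += (1 if votes * 2 > k else 0) == y[i]
    return correct / len(X)

def filter_select(X, y, top):
    """Filter: keep the `top` features with the largest scaled class-mean difference."""
    n = len(X[0])
    scores = []
    for j in range(n):
        col = [row[j] for row in X]
        pos = [v for v, label in zip(col, y) if label == 1]
        neg = [v for v, label in zip(col, y) if label == 0]
        spread = statistics.pstdev(col) or 1.0
        scores.append(abs(statistics.mean(pos) - statistics.mean(neg)) / spread)
    keep = set(sorted(range(n), key=scores.__getitem__, reverse=True)[:top])
    return [1 if j in keep else 0 for j in range(n)]

def bat_select(X, y, n_bats=8, n_iter=20, loudness=0.9, pulse_rate=0.5,
               penalty=0.05, seed=0):
    """Wrapper: simplified binary Bat Algorithm (fixed loudness and pulse rate)."""
    rng = random.Random(seed)
    n = len(X[0])

    def fitness(mask):
        # Reward accuracy, lightly penalize large subsets (assumed trade-off).
        return loo_knn_accuracy(X, y, mask) - penalty * sum(mask) / n

    bats = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n_bats)]
    vel = [[0.0] * n for _ in range(n_bats)]
    fit = [fitness(b) for b in bats]
    best_i = max(range(n_bats), key=fit.__getitem__)
    best, best_fit = bats[best_i][:], fit[best_i]

    for _ in range(n_iter):
        for i in range(n_bats):
            freq = rng.random()  # random frequency in [0, 1)
            cand = bats[i][:]
            for j in range(n):
                vel[i][j] += (bats[i][j] - best[j]) * freq
                # V-shaped transfer: larger |velocity| makes a bit flip more likely.
                if rng.random() < abs(math.tanh(vel[i][j])):
                    cand[j] = 1 - cand[j]
            if rng.random() > pulse_rate:
                # Local search: flip one random bit of the best mask so far.
                cand = best[:]
                j = rng.randrange(n)
                cand[j] = 1 - cand[j]
            cand_fit = fitness(cand)
            if cand_fit > fit[i] and rng.random() < loudness:
                bats[i], fit[i] = cand, cand_fit
            if cand_fit > best_fit:
                best, best_fit = cand[:], cand_fit
    return best, best_fit

def make_data(seed=1, n_features=12):
    """Synthetic data: 22 negatives, 13 positives, only features 0 and 1 informative."""
    rng = random.Random(seed)
    y = [0] * 22 + [1] * 13
    X = []
    for label in y:
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 2.0 * label
        row[1] -= 2.0 * label
        X.append(row)
    return X, y

X, y = make_data()
filter_mask = filter_select(X, y, top=4)
bat_mask, bat_fit = bat_select(X, y)
for name, mask in (("filter", filter_mask), ("bat", bat_mask)):
    print(name, sum(mask), "features, LOO accuracy",
          round(loo_knn_accuracy(X, y, mask), 2))
```

The filter scores each feature independently of any classifier, so it is cheap but must be told how many features to keep. The wrapper searches over whole feature subsets and scores each one by the classifier's own accuracy, which is why it can settle on a smaller subset at a higher computational cost. With only 35 samples, any accuracy estimate of this kind is noisy, so the numbers printed by this sketch say nothing about the results reported in the paper.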

Published

18-06-2025

Section

Computer Sciences, Communication and Information Technology

How to Cite

Feature Selection for Biomedical Data Classification: Statistical vs. Swarm Intelligence Methods: BIOMEDICAL DATA CLASSIFICATION — STATISTICAL VS. SWARM INTELLIGENCE METHODS. (2025). Journal of Scientific & Industrial Research (JSIR), 84(6), 672-680. https://doi.org/10.56042/jsir.v84i6.13842
