Publication:
Comparison of Machine Learning Algorithms for Predicting Length of Stay in Chronic Kidney Disease Patients

dc.authorscopusid: 60022943100
dc.authorscopusid: 56589621700
dc.contributor.author: Kiremit, B.Y.
dc.contributor.author: Şahin, D.Ö.
dc.date.accessioned: 2025-12-11T00:34:22Z
dc.date.issued: 2025
dc.department: Ondokuz Mayıs Üniversitesi (en_US)
dc.department-temp: [Kiremit] Birgül Yabana, Ondokuz Mayis Üniversitesi, Samsun, Turkey; [Şahin] Durmuş Ozkan, Ondokuz Mayis Üniversitesi, Samsun, Turkey (en_US)
dc.description.abstract: The length of stay (LOS) for patients in hospitals is crucial for workforce planning, resource allocation, and bed capacity management, impacting healthcare costs, future needs, and financial planning. This study focuses on calculating the LOS for Chronic Kidney Disease (CKD) patients admitted as inpatients and estimating their hospital bills based on services rendered during their stay. Utilizing data from 5,583 CKD patients and 11 input variables, various machine learning (ML) algorithms were applied to develop regression and classification models. To optimize model performance and address potential overfitting issues, feature selection techniques were also employed. The Random Forest (RF) algorithm achieved the highest performance for bill amount estimation, with a Correlation Coefficient (CC) of 0.736. The algorithms predicting LOS showed even more promising results, with all performing above 0.848 on the CC metric. The best performances were obtained from Support Vector Machine (SVM), M5P trees, and RF, with (Mean Absolute Error (MAE), CC) results of (2.580 days, 0.875), (2.587 days, 0.880), and (2.611 days, 0.880), respectively. LOS was also classified as short or long using ML algorithms, with Logistic Regression (LogR) achieving the best classification results: 0.944 on the AUC-ROC (Area Under the ROC Curve) metric and 0.872 on the F-Measure metric. The RF algorithm also excelled in classification based on patient units, producing results of 0.788 on the AUC-ROC and 0.863 for accuracy. Additionally, feature selection revealed that reducing input variables maintained prediction accuracy for bill amount and LOS, but it generally negatively affected classification performance. Feature selection was identified as a critical challenge, particularly in balancing the trade-off between dimensionality reduction and predictive accuracy.
While dimensionality reduction can improve computational efficiency, careful selection of input variables is essential to maintain robust classification performance. Given the lengthy treatment processes for CKD patients, accurate predictions of LOS, billing amounts, and admission units will assist health managers in planning for future resource needs, such as medical supplies and workforce. Ultimately, this study provides insights that can enhance the financial sustainability and management of healthcare services. © 2025 Elsevier Ltd (en_US)
dc.identifier.doi: 10.1016/j.compbiomed.2025.110825
dc.identifier.issn: 0010-4825
dc.identifier.issn: 1879-0534
dc.identifier.pmid: 40763677
dc.identifier.scopus: 2-s2.0-105012195998
dc.identifier.scopusquality: Q1
dc.identifier.uri: https://doi.org/10.1016/j.compbiomed.2025.110825
dc.identifier.uri: https://hdl.handle.net/20.500.12712/37574
dc.identifier.volume: 196 (en_US)
dc.identifier.wosquality: Q1
dc.language.iso: en (en_US)
dc.publisher: Elsevier Ltd (en_US)
dc.relation.ispartof: Computers in Biology and Medicine (en_US)
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member (en_US)
dc.rights: info:eu-repo/semantics/closedAccess (en_US)
dc.subject: Chronic Kidney Disease Patients (en_US)
dc.subject: Classification Models (en_US)
dc.subject: Forecasting Models (en_US)
dc.subject: Length of Stay (en_US)
dc.subject: Machine Learning (en_US)
dc.title: Comparison of Machine Learning Algorithms for Predicting Length of Stay in Chronic Kidney Disease Patients (en_US)
dc.type: Article (en_US)
dspace.entity.type: Publication
