Publication:
Literature-Based Explainable Machine Learning Models for Predicting Pathogen and Antibiotic Resistance Gene Loads from Animal Manure

dc.authorscopusid: 60117458500
dc.authorscopusid: 57830207600
dc.authorscopusid: 58923609400
dc.authorwosid: Kadioğlu, Elif Nihan/Mbw-2544-2025
dc.authorwosid: Atalay Eroğlu, Handan/Lrb-8975-2024
dc.contributor.author: Gokalp, Ayse Birsen Kadioglu
dc.contributor.author: Eroglu, Handan Atalay
dc.contributor.author: Kadioglu, Elif Nihan
dc.date.accessioned: 2025-12-11T00:42:22Z
dc.date.issued: 2025
dc.department: Ondokuz Mayıs Üniversitesi (en_US)
dc.department-temp: [Gokalp, Ayse Birsen Kadioglu] Erciyes Univ, Fac Vet Med, Dept Microbiol, TR-38039 Kayseri, Turkiye; [Eroglu, Handan Atalay; Kadioglu, Elif Nihan] Ondokuz Mayis Univ, Fac Engn, Dept Environm Engn, TR-55139 Samsun, Turkiye (en_US)
dc.description.abstract: Applying animal manure (cattle, pig, poultry, and sheep) in agriculture increases soil fertility and reduces the use of chemical fertilizers. However, it also poses serious environmental and public health problems, because microbial contaminants such as pathogenic microorganisms and antibiotic resistance genes (ARGs) can spread into the environment. To assess this dual risk, we developed a machine learning (ML) framework that predicts both pathogen load and ARG levels. The dataset contains 223 records systematically collected from 54 scientific studies published between 2015 and 2024. Six regression models were compared; Gradient Boosting (R² = 0.93) performed best for pathogen load and Ridge Regression (R² = 0.84) performed best for ARG level. Model generalizability was tested with 5- and 10-fold cross-validation. Learning curves and residual analysis confirmed a low overfitting risk for the two selected models (Gradient Boosting for pathogen load and Ridge Regression for ARG level), whereas other models such as Decision Tree showed clear signs of overfitting and were excluded from further analysis. The transparency of model decisions was examined with SHapley Additive exPlanations (SHAP) analyses, which highlighted "application period", "ARG type" and "fertilizer type" as determining variables. In addition, Partial Dependence Plot (PDP) analyses revealed biologically meaningful marginal effects of environmental and operational factors on the target variables. This integrated modelling approach contributes to the optimization of sustainable fertilization strategies and the development of environmental-health policies. (en_US)
dc.description.woscitationindex: Science Citation Index Expanded
dc.identifier.doi: 10.1016/j.mran.2025.100355
dc.identifier.issn: 2352-3522
dc.identifier.issn: 2352-3530
dc.identifier.scopus: 2-s2.0-105017255560
dc.identifier.scopusquality: Q2
dc.identifier.uri: https://doi.org/10.1016/j.mran.2025.100355
dc.identifier.uri: https://hdl.handle.net/20.500.12712/38602
dc.identifier.volume: 30 (en_US)
dc.identifier.wos: WOS:001587499200001
dc.identifier.wosquality: Q2
dc.language.iso: en (en_US)
dc.publisher: Elsevier (en_US)
dc.relation.ispartof: Microbial Risk Analysis (en_US)
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member (en_US)
dc.rights: info:eu-repo/semantics/closedAccess (en_US)
dc.subject: Machine Learning (en_US)
dc.subject: Ridge Regression (en_US)
dc.subject: Gradient Boosting (en_US)
dc.subject: Antibiotic Resistance Genes (en_US)
dc.subject: Pathogen Load (en_US)
dc.subject: Animal Manure (en_US)
dc.title: Literature-Based Explainable Machine Learning Models for Predicting Pathogen and Antibiotic Resistance Gene Loads from Animal Manure (en_US)
dc.type: Article (en_US)
dspace.entity.type: Publication
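
The abstract in this record mentions checking generalizability with 5- and 10-fold cross-validation and reporting R² for a Ridge Regression model. As a rough illustration of that evaluation step only, the sketch below runs k-fold cross-validation on a single-feature ridge fit using the Python standard library. It is not the authors' pipeline: the data are synthetic, the record count of 223 is borrowed from the abstract purely as a sample size, and the function names (`fit_ridge_1d`, `r2_score`, `k_fold_r2`) and the penalty `alpha=1.0` are assumptions made for this example.

```python
import random
from statistics import mean


def fit_ridge_1d(xs, ys, alpha):
    """Fit y ~ w*x + b with an L2 penalty on w (the intercept is not penalized)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    w = sxy / (sxx + alpha)  # closed-form ridge slope on centered data
    return w, y_bar - w * x_bar


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_r2(xs, ys, k, alpha, seed=0):
    """Return one held-out R^2 score per fold."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint, near-equal folds
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w, b = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], alpha)
        preds = [w * xs[i] + b for i in held_out]
        scores.append(r2_score([ys[i] for i in held_out], preds))
    return scores


# Synthetic stand-in data (NOT the study's dataset): a noisy linear relation.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(223)]
ys = [0.5 * x + 2.0 + rng.gauss(0, 1.0) for x in xs]

for k in (5, 10):
    scores = k_fold_r2(xs, ys, k, alpha=1.0)
    print(f"{k}-fold mean R^2: {mean(scores):.3f}")
```

Comparing the mean held-out R² across fold counts, and against the training-set R², is one simple way to look for the kind of overfitting the abstract describes for the excluded models.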

Files