Classification and Diagnosis of Renal Failure Disease Using Machine Learning
DOI:
https://doi.org/10.62933/qgj6hr87
Abstract
In recent years, the number of classification techniques has grown alongside rapid advances in technology based on machine learning. Medical experts and doctors now widely use machine learning, a branch of artificial intelligence, to help predict and diagnose various diseases, and machine learning techniques are used extensively for data processing in health research. In this study, we applied three machine learning algorithms to a medical diagnosis problem and analyzed their predictive performance. The study focuses on the diagnosis of renal failure disease (RFD) and the factors influencing it, using serum test results from patients with and without the disease. The dataset consists of 165 cases and 13 attributes. The goal is to evaluate the k-nearest neighbors (KNN), decision tree (DT), and random forest (RF) classifiers for RFD prediction in terms of accuracy, geometric mean, kappa coefficient, and area under the curve (AUC). The experimental results show that the RF classifier outperforms the other two classifiers. Additionally, the final fitted models indicate that urea, albumin, and magnesium are the most significant factors affecting patients with renal failure disease.
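The four evaluation measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so this assumes the usual binary-classification definitions: geometric mean as the square root of sensitivity times specificity, Cohen's kappa for chance-corrected agreement, and AUC as the probability that a positive case receives a higher score than a negative one. The labels and scores are toy values chosen for illustration and are not taken from the study's dataset; the function names are hypothetical.

```python
from math import sqrt

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 1 (disease) and 0 (no disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def geometric_mean(tp, tn, fp, fn):
    # Assumed definition: sqrt(sensitivity * specificity).
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sqrt(sensitivity * specificity)

def cohen_kappa(tp, tn, fp, fn):
    n = tp + tn + fp + fn
    p_observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    return (p_observed - p_expected) / (1 - p_expected)

def auc(y_true, scores):
    # Rank-based AUC: fraction of (positive, negative) pairs ranked correctly,
    # counting ties as half.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy example (illustrative values only, not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

counts = confusion_counts(y_true, y_pred)
acc = accuracy(*counts)
g_mean = geometric_mean(*counts)
kappa = cohen_kappa(*counts)
auc_value = auc(y_true, scores)
print(f"accuracy={acc:.3f} g_mean={g_mean:.3f} kappa={kappa:.3f} auc={auc_value:.3f}")
```

In a comparison like the one described, the same four functions would be applied to the test-set predictions of each classifier (KNN, DT, RF), so that the models are ranked on identical measures.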
License
Copyright (c) 2025 Ammar Ahmed Ali, Omar Q. Alshebly, Rizgar Maghdid Ahmed (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Licensed under a CC BY-NC-SA 4.0 license: https://creativecommons.org/licenses/by-nc-sa/4.0/