Evaluating the Performance of Machine Learning Classifiers on Predicting Hypothyroidism for Public Healthcare Good

Authors

  • Sena Okuboyejo, Metro State University
  • Jawad Haqbeen, Kyoto University
  • Takayuki Ito, Kyoto University

DOI:

https://doi.org/10.52731/liir.v004.166

Abstract

Hypothyroidism is an endocrine disorder in which the thyroid gland does not secrete enough hormones. If left undetected and untreated, it has grave consequences for the patient's health and quality of life, so early detection is vital for treatment. Artificial intelligence (AI) is expected to drive transformation in the health sector, as in many other sectors, by offering new approaches to optimizing the operation and reliability of health systems, delivering techno-economic advantages while also improving patients' quality of life (QoL) in a meaningful way. It is therefore critical to find innovative approaches using AI. To this end, we evaluate the performance of machine learning (ML) classifiers in predicting hypothyroidism for public healthcare good. This work uses supervised ML algorithms to predict hypothyroidism from the available features and identifies the best-performing classifier. We built and trained seven classifiers using the specified ML algorithms, and we present an experimental case study in which the models are validated and their performance is measured. A comparative analysis of the classifiers revealed that the tree-based classifiers (Random Forest, Decision Tree, and Gradient Boost) outperformed the other models on F1-score and AUC, consistent with the existing literature. This work has implications for the development of health informatics systems.
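
The comparison described above ranks classifiers by F1-score and AUC. As a minimal sketch of how those two metrics can be computed and used to rank models, the pure-Python example below may help. The labels, scores, model names (`model_a`, `model_b`), the 0.5 decision threshold, and the helper `rank_classifiers` are all hypothetical illustrations; they are not the study's data, classifiers, or results.

```python
# Illustrative sketch only: labels and scores are made up, not taken from the paper.

def f1_score(y_true, y_pred):
    """F1 for the positive class (1): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_classifiers(y_true, model_scores, threshold=0.5):
    """Return (name, F1, AUC) tuples sorted best-first by F1, then AUC.

    The 0.5 threshold is an assumption made for this sketch.
    """
    rows = []
    for name, scores in model_scores.items():
        preds = [1 if s >= threshold else 0 for s in scores]
        rows.append((name, f1_score(y_true, preds), roc_auc(y_true, scores)))
    return sorted(rows, key=lambda r: (r[1], r[2]), reverse=True)


# Hypothetical ground truth (1 = hypothyroid) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
model_scores = {
    "model_a": [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.1],
    "model_b": [0.9, 0.8, 0.7, 0.3, 0.2, 0.4, 0.1, 0.1],
}

for name, f1, auc in rank_classifiers(y_true, model_scores):
    print(f"{name}: F1={f1:.3f} AUC={auc:.3f}")
```

F1 depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two are often reported together when classes are imbalanced.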

Published

2023-12-20