Early Detection of Diabetes using AI and Machine Learning Models
Author names:
Dr. Shruti Mantri
Email: shruti_mantri@isb.edu
ISB Institute of Data Science
Vishal Siram
Email: vishalsiram_anil@isb.edu
ISB Institute of Data Science
Diabetes is a common chronic disease of increasing concern. According to the World Health Organization (WHO), around 422 million people worldwide suffer from diabetes, and the number is estimated to rise to approximately 642 million by 2040. Diabetes is estimated to cause one death every six seconds (about five million a year), more than HIV, tuberculosis and malaria combined, and 1.6 million deaths a year are attributed directly to the disease. The disease is of particular concern to working adults because of rising work pressure, changes in living standards and a proclivity towards unhealthy food habits. In addition, women who are diabetic have special concerns during pregnancy. High blood sugar levels early in pregnancy can cause birth defects and increase the risk of miscarriage and diabetes-related complications. Women can also develop high blood sugar levels during pregnancy (gestational diabetes). This condition occurs in three to nine percent of all pregnancies and poses a risk to both the mother and the baby.
Applying machine learning to diabetes can reform the approach to its diagnosis and management. Machine learning models have been used to predict the risk of developing diabetes or its consequent complications. AI and machine learning can support patient care, healthcare professionals and healthcare systems. Clinically, case-based reasoning, deep learning and neural networks enable prediction of population risk, enhanced decision making and self-management.
In the current study, the authors have developed a web-based application using machine learning tools that can identify women at high risk of diabetes based on genetic and metabolic factors. Predictive models have been built to leverage big data analytics to estimate the likelihood that a patient will develop diabetes. The machine learning methods used also identify the early signs of gestational diabetes in pregnant women. The web-based intelligent application predicts early signs of diabetes based on the following parameters:
(i) Glucose (ii) Blood pressure (iii) Skin thickness (iv) Insulin (v) Body mass index (vi) Diabetes pedigree function (vii) Age (viii) Number of pregnancies.
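As a minimal sketch of how one submission of these eight parameters could be represented before it reaches a model, the record below maps form fields to a fixed-order feature vector. The class name `PatientRecord`, the field names, the non-negativity check and the sample values are illustrative assumptions, not the application's actual code; the field order follows the attribute order of the Pima Indians dataset.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class PatientRecord:
    """One form submission; class and field names are hypothetical."""
    pregnancies: int
    glucose: float
    blood_pressure: float
    skin_thickness: float
    insulin: float
    bmi: float
    diabetes_pedigree: float
    age: int

    def __post_init__(self):
        # Assumed sanity check: none of the parameters can be negative.
        for name, value in zip(self.__dataclass_fields__, astuple(self)):
            if value < 0:
                raise ValueError(f"{name} must be non-negative, got {value}")

    def to_features(self):
        """Return the values as a list of floats in field order."""
        return [float(v) for v in astuple(self)]


# Made-up values for illustration only; not data from the study.
record = PatientRecord(2, 120.0, 70.0, 25.0, 80.0, 28.5, 0.45, 33)
features = record.to_features()
```

Keeping the order fixed in one place means the vector passed to a trained classifier always lines up with the columns the classifier was trained on.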
Support vector machine, logistic regression, k-nearest neighbors (KNN) and decision tree algorithms were compared to identify the model most suitable for detecting early signs of diabetes based on the eight parameters listed above. Data is collected in real time from patients entering their values through the web-based system. A second dataset, the Pima Indians diabetes dataset (Jegan, 2014), is used to train the system. All records in this dataset are of females. The dataset has the following attributes: (i) number of pregnancies, (ii) plasma glucose concentration after a 2-h oral glucose tolerance test, (iii) diastolic blood pressure, (iv) triceps skin fold thickness, (v) 2-h serum insulin, (vi) body mass index, (vii) diabetes pedigree function and (viii) age. The dataset contains 2000 samples. The Outcome column is encoded with binary values: 0 (False) means the person does not suffer from diabetes and 1 (True) means the person suffers from diabetes.
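To make one of the four classifiers concrete, the following is a minimal sketch of k-nearest neighbors classification in plain Python. The function name `knn_predict`, the choice of k = 3, the Euclidean distance and the two-feature toy rows are all assumptions for illustration; the study does not state its KNN settings, and a real model would use all eight attributes, typically after scaling them.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    # Pair each training row's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Made-up toy rows (glucose, BMI) for illustration only; not study data.
train_x = [(85, 22.0), (90, 24.5), (95, 23.0), (160, 33.0), (170, 35.5), (180, 31.0)]
train_y = [0, 0, 0, 1, 1, 1]  # 0 = no diabetes, 1 = diabetes

low_risk = knn_predict(train_x, train_y, (88, 23.5))
high_risk = knn_predict(train_x, train_y, (165, 34.0))
```

Here the first query sits among the rows labelled 0 and the second among the rows labelled 1, so the votes are unanimous in each case.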
Accuracy, F-measure, recall, precision and ROC (Receiver Operating Characteristic) curve measures are used to define the performance of the different machine learning techniques. Table 1, Table 2, Table 3 and Table 4 give the classifier performance measures of the algorithms in terms of accuracy, precision, recall and F-measure.
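The four tabulated measures can all be derived from the counts of true and false positives and negatives. The sketch below shows the standard definitions for binary labels; the function name `classification_metrics` and the label lists are hypothetical and are not results from the study.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F-measure for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Made-up labels for illustration only; not results from the study.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

With these labels there are three true positives, three true negatives, one false positive and one false negative, so all four measures come out at 0.75.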
Tables 1–4 show that the decision tree achieves the highest accuracy; it can predict the chances of diabetes more accurately than the other classifiers. The performance of the classifiers on the various measures is plotted in Figure 1, Figure 2, Figure 3 and Figure 4.
Early detection of diabetes is an important real-world chronic medical problem. In this study, systematic efforts were made to design a web-based application that predicts diseases such as diabetes. Four machine learning classification algorithms were evaluated on various measures to identify the one most suitable for real-time detection of diabetes. Experimental results confirm the adequacy of the designed system, which achieved an accuracy of 94% using the decision tree algorithm.
Self-management is the key to the treatment of diabetes. With the advent of AI and machine learning, patients can be empowered to manage their own diabetes, generate data and parameters, and be their own health experts. Awareness and knowledge of early signs will be useful in the management of diabetes in women, especially pregnant women. The end users of technical advances in diabetes care include healthcare professionals, patients, diabetes management centers and data science enthusiasts.
AI and machine learning have introduced a quantum change in healthcare systems, especially diabetes care, and will continue to evolve. In future, experience generated from the system developed will help improve it further in terms of functionality and utility in diabetes care using concepts of reinforcement learning.
References
1. Bamnote, M.P., G.R., 2014. Design of Classifier for Detection of Diabetes Mellitus Using Genetic Programming. Advances in Intelligent Systems and Computing 1, 763–770. doi:10.1007/978-3-319-11933-5.
2. Esposito, F., Malerba, D., Semeraro, G., Kay, J., 1997. A comparative analysis of methods for pruning decision trees. IEEE Transactions on Pattern Analysis and Machine Intelligence 19, 476–491. doi:10.1109/34.589207.
3. Iancu I., Mota M., Iancu E. (2008). “Method for the analysing of blood glucose dynamics in diabetes mellitus patients,” in Proceedings of the 2008 IEEE International Conference on Automation, Quality and Testing, Robotics, Cluj-Napoca: 10.1109/AQTR.2008.4588883
4. Jegan C. (2014). Classification of diabetes disease using support vector machine. Microcomput. Dev. 3 1797–1801.
5. Krasteva A., Panov V., Krasteva A., Kisselova A., Krastev Z. (2011). Oral cavity and systemic diseases — Diabetes Mellitus. Biotechnol. Biotechnol. Equip. 25 2183–2186. 10.5504/BBEQ.2011.0022
6. Lonappan A., Bindu G., Thomas V., Jacob J., Rajasekaran C., Mathew K. T. (2007). Diagnosis of diabetes mellitus using microwaves. J. Electromagnet. Wave. 21 1393–1401. 10.1163/156939307783239429
7. Orabi, K.M., Kamal, Y.M., Rabah, T.M., 2016. Early Predictive System for Diabetes Mellitus Disease, in: Industrial Conference on Data Mining, Springer. pp. 420–427.
8. Perveen, S., Shahbaz, M., Guergachi, A., Keshavjee, K., (2016). Performance Analysis of Data Mining Classification Techniques to Predict Diabetes. Procedia Computer Science 82, 115–121. doi:10.1016/j.procs.2016.04.016.
9. Priyam, A., Gupta, R., Rathee, A., Srivastava, S., 2013. Comparative Analysis of Decision Tree Classification Algorithms. International Journal of Current Engineering and Technology Vol. 3, 334–337. June 2013. ISSN 2277-4106.
10. Robertson G., Lehmann E. D., Sandham W., Hamilton D. (2011). Blood glucose prediction using artificial neural networks trained with the AIDA diabetes simulator: a proof-of-concept pilot study. J. Electr. Comput. Eng. 2011:681786 10.1155/2011/681786