PROJECT TITLE :
Diabetes Prediction Using Ensembling of Different Machine Learning Classifiers
Diabetes, a chronic illness, is a group of metabolic diseases caused by a persistently high blood sugar level. If accurate early prediction is achievable, the risk and severity of diabetes can be considerably reduced. Because diabetes datasets contain few labeled samples and often include outliers or missing values, robust and reliable diabetes prediction is difficult. In this paper we propose a robust framework for diabetes prediction that combines outlier rejection, data standardization, feature selection, K-fold cross-validation, and several Machine Learning (ML) classifiers (k-Nearest Neighbour, Decision Trees, Random Forest, AdaBoost, Naive Bayes, and XGBoost) as well as a Multilayer Perceptron (MLP). We also propose a weighted ensemble of the different ML models to improve diabetes prediction, where each model's weight is calculated from its Area Under the ROC Curve (AUC). AUC is chosen as the performance metric and is maximized using grid search during hyperparameter tuning. All experiments were carried out on the Pima Indian Diabetes Dataset under identical experimental settings. Across all of the extended experiments, our proposed ensemble classifier exceeds the state-of-the-art results by 2.00 percent in AUC, achieving sensitivity, specificity, false omission rate, diagnostic odds ratio, and AUC of 0.789, 0.934, 0.092, 66.234, and 0.950, respectively. The proposed framework outperforms the other methods discussed in the article and gives better results on the same dataset. The source code for diabetes prediction has been made publicly available.
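The AUC-weighted ensembling described above can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so the sketch assumes each model's weight is its validation AUC divided by the sum of all models' AUCs, and that the ensemble averages positive-class probabilities (soft voting). The function names, the model names `model_a` and `model_b`, and the toy numbers are hypothetical and are not taken from the authors' published code.

```python
from typing import Dict, List, Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_weights(aucs: Dict[str, float]) -> Dict[str, float]:
    """Assumed scheme: weight = model AUC / sum of all model AUCs."""
    total = sum(aucs.values())
    return {name: auc / total for name, auc in aucs.items()}


def weighted_ensemble(
    probas: Dict[str, Sequence[float]], weights: Dict[str, float]
) -> List[float]:
    """Weighted average of per-model positive-class probabilities."""
    n_samples = len(next(iter(probas.values())))
    return [
        sum(weights[name] * probas[name][i] for name in probas)
        for i in range(n_samples)
    ]


# Toy validation labels and made-up model outputs, for illustration only.
y_val = [0, 0, 1, 1, 0, 1]
val_probas = {
    "model_a": [0.1, 0.4, 0.35, 0.8, 0.2, 0.7],
    "model_b": [0.3, 0.6, 0.5, 0.9, 0.5, 0.4],
}

aucs = {name: roc_auc(y_val, p) for name, p in val_probas.items()}
weights = auc_weights(aucs)
ensemble_scores = weighted_ensemble(val_probas, weights)
ensemble_auc = roc_auc(y_val, ensemble_scores)
```

In this sketch the model with the higher validation AUC contributes more to the averaged score. In practice the per-model AUCs would come from the K-fold cross-validation mentioned above rather than from a single toy split.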