Comparing Different Resampling Methods in Predicting Students' Performance Using Machine Learning Techniques


Thanks to technological advances, predicting students' performance has become a valuable and important research area. In education, data mining is particularly useful for analyzing student performance. Because datasets in this field are imbalanced, predicting students' performance is a difficult task, and a comparison of the different resampling strategies has been lacking. Using two different datasets, this study compares several resampling strategies (Borderline SMOTE, Random Over Sampler, SMOTE, SMOTE-ENN, SVM-SMOTE, and SMOTE-Tomek) for handling the imbalanced data problem when predicting students' performance. The difference between multiclass and binary classification, as well as the structure of the features, is also investigated. To better assess how well the resampling methods address the imbalance problem, the paper employs a variety of machine learning classifiers: Random Forest, K-Nearest Neighbor, Artificial Neural Network, XGBoost, Support Vector Machine (Radial Basis Function kernel), Decision Tree, Logistic Regression, and Naive Bayes. Models are validated with random hold-out and shuffled 5-fold cross-validation. The results, obtained using various evaluation measures, show that models with fewer classes and nominal features perform better. In addition, classifiers do not perform well on imbalanced data, so this issue must be addressed; using balanced datasets improves classifier performance. The Friedman test, a statistical significance test, also confirms that SVM-SMOTE is more efficient than the other resampling methods. Furthermore, with SVM-SMOTE as the resampling approach, the Random Forest classifier outperformed all other models.
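
As a rough illustration of what the resampling step does, the sketch below implements two of the simpler ideas named above in plain Python: random oversampling (duplicating minority rows) and a SMOTE-style interpolation between a minority row and one of its nearest same-class neighbours. It is not the study's implementation. The function names `random_oversample` and `smote_like`, the parameter `k`, and the toy pass/fail data are all assumptions made for this example, and the abstract does not say which library the authors used. In practice a library such as imbalanced-learn would normally be used instead.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until it
    matches the size of the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = [list(x) for x in X], list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(list(rng.choice(rows)))
            y_out.append(label)
    return X_out, y_out


def smote_like(X, y, k=3, seed=0):
    """SMOTE-style oversampling (simplified sketch): each synthetic row
    lies on the segment between a minority row and one of its k nearest
    same-class neighbours. Classes with fewer than two rows are skipped."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = [list(x) for x in X], list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        if n == target or len(rows) < 2:
            continue
        for _ in range(target - n):
            base = rng.choice(rows)
            # Same-class rows other than the base, nearest first
            # (squared Euclidean distance).
            others = [r for r in rows if r is not base]
            others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
            neighbour = rng.choice(others[:k])
            gap = rng.random()  # interpolation weight in [0, 1)
            X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


# Toy data invented for illustration: six "pass" rows, two "fail" rows.
X = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3],
     [0.9, 0.7], [1.3, 1.0], [3.0, 3.0], [3.2, 2.9]]
y = ["pass"] * 6 + ["fail"] * 2

X_ros, y_ros = random_oversample(X, y)
X_sm, y_sm = smote_like(X, y)
print(Counter(y), Counter(y_ros), Counter(y_sm))
```

Random oversampling only repeats existing minority rows, whereas the SMOTE-style variant creates new points between them; the other methods listed in the abstract refine where such points are generated or clean the result afterwards. When resampling is combined with hold-out or cross-validation, it is generally applied to the training portion only, so that the evaluation data keeps its original class distribution.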


