AN EVALUATION OF MACHINE LEARNING ALGORITHMS IN PREDICTING STUDENT PERFORMANCE
Abstract
This study evaluates machine learning algorithms for predicting student performance on a publicly available Kaggle dataset containing academic, behavioral, and socio-demographic attributes. Four algorithms (Logistic Regression, Random Forest, Decision Tree, and Gradient Boosting) were compared using cross-validation to obtain reliable accuracy estimates.
The Gradient Boosting classifier emerged as the best-performing model, achieving an accuracy of 96%. Its interpretability and simplicity make it well suited to educational data analysis. Random Forest and Decision Tree provided competitive results, while Logistic Regression performed worse because of the dataset’s non-linear patterns.
These results highlight the critical role of algorithm selection in student performance prediction and underscore the potential of machine learning in enhancing educational decision-making. Future work could explore advanced models and feature engineering to further improve prediction accuracy.
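The cross-validated accuracy assessment described in this abstract can be illustrated with a minimal sketch. It is not the study's implementation: the helper names (`k_fold_indices`, `cross_val_accuracy`), the majority-class placeholder model, and the toy data are all assumptions made for illustration, and the Kaggle dataset and the four classifiers are not reproduced here. In practice one would more likely pass each classifier to a library routine such as scikit-learn's `cross_val_score`.

```python
import random
from collections import Counter
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(fit, predict, X, y, k=5, seed=0):
    """Return per-fold accuracies under k-fold cross-validation.

    fit(X_train, y_train) -> model and predict(model, X_test) -> labels are
    hypothetical callables standing in for any classifier under comparison.
    """
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held_out_set = set(held_out)
        train = [i for i in range(len(X)) if i not in held_out_set]
        # Train only on the k-1 remaining folds, then score on the held-out fold.
        model = fit([X[i] for i in train], [y[i] for i in train])
        predicted = predict(model, [X[i] for i in held_out])
        correct = sum(p == y[i] for p, i in zip(predicted, held_out))
        scores.append(correct / len(held_out))
    return scores


# Placeholder model (an assumption): always predicts the most common training label.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, X_test):
    return [model for _ in X_test]


# Toy data, not the Kaggle dataset: 20 samples, 14 labelled "pass", 6 "fail".
X = [[i] for i in range(20)]
y = ["pass"] * 14 + ["fail"] * 6

scores = cross_val_accuracy(fit_majority, predict_majority, X, y, k=5)
print(f"mean accuracy over {len(scores)} folds: {mean(scores):.2f}")
# prints "mean accuracy over 5 folds: 0.70"
```

Because every sample is held out exactly once, the mean of the fold accuracies estimates performance on unseen data. The majority-class baseline scores 0.70 on this toy data, which is the kind of reference point against which a reported accuracy such as 96% can be judged.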