Authors: Kabumba R.K., Mbayandjambe A.M., Tashev T.A., Slavov V.D., Kambale W.V., Ngoie R.B.M., Kyamakya K., Kasereka S.K.
Title: Predictive Modeling of Academic Success Using Machine Learning: A Comparative Analysis with SMOTE-Based Class Balancing
Keywords: Academic success, Dropout prediction, Higher education, Machine learning, Predictive analytics, Random Forest, SMOTE
Abstract: This study evaluates six supervised machine learning algorithms: K-Nearest Neighbors (KNN), Decision Tree (DT), Logistic Regression (LR), Support Vector Machines (SVM), Random Forest (RF), and Artificial Neural Networks (ANN), for predicting academic outcomes (Graduate, Dropout, Enrolled) using an imbalanced real-world dataset. SMOTE was applied to address class imbalance. Random Forest achieved the highest accuracy (84.19%) after balancing and majority voting. The findings highlight the role of socio-economic factors and the effectiveness of ensemble learning in educational decision-making. The proposed model supports early interventions, especially in low- and middle-income academic contexts.
References
- UNESCO. Progress on education: SDG 4 dashboard, 2023. Accessed: 2025-07-05.
- Naureen Durrani, Gulmira Qanay, Ghulam Mir, et al. Achieving SDG 4, equitable quality education after COVID-19: Global evidence and a case study of Kazakhstan. Sustainability, 15:14725, 2023.
- Mihai Ciolacu, Amir F. Teherani, Robert Biere, and Hannes Popp. Promoting student performance through machine learning methods. In Proceedings of the IEEE Conference, 2017. Accessed: 2025-01-15.
- P. A. Carneiro, L. M. Ribeiro, and F. J. Viana. A machine learning approach to predict academic performance of first-year students. Education and Information Technologies, 28:457–476, 2023.
- J. Pecuchova and M. Drlik. Predicting students at risk of early dropout using ensemble methods. Procedia Computer Science, 225:3223–3232, 2023.
- A. López-García, O. Blasco-Blasco, M. Liern-García, and S. Parada-Rico. Early detection of students’ failure using machine learning techniques. Operations Research Perspectives, 11:100292, 2023.
- Y. N. S. A. Husaini and N. S. A. Shukor. Factors affecting students’ academic performance: A review. Res Militaris, 12(6):284–294, 2022.
- B. Han and C. Rideout. Factors associated with university students’ development and success: Insights from senior undergraduates. The Canadian Journal for the Scholarship of Teaching and Learning, 2022.
- M. V. Martins, D. Tolledo, J. Machado, L. M. T. Baptista, and V. Realinho. Early prediction of student performance in higher education: A case study. In Trends and Applications in Information Systems and Technologies, volume 1 of Advances in Intelligent Systems and Computing. Springer, 2021.
- M. T. Nguyen and A. H. Vo. Socioeconomic variables in predictive analytics for education. International Journal of Educational Technology in Higher Education, 19(1):1–18, 2022.
- Y.-S. Su, Y.-D. Lin, and T.-Q. Liu. Applying machine learning technologies to explore students’ learning features and performance prediction. Frontiers in Neuroscience, 2022.
- J. Pecuchova and M. Drlik. Predicting students at risk of early dropping out from course using ensemble classification methods. Procedia Computer Science, 225:3223–3232, 2023.
- M. Ciolacu, A. F. Teherani, R. Biere, and H. Popp. Promoting student performance through machine learning methods. In Proceedings of IEEE Conference, 2017. Accessed: 2025-01-15.
- Valentim Realinho, Mónica Vieira Martins, Jorge Machado, and Luís Baptista. Predict Students’ Dropout and Academic Success. UCI Machine Learning Repository, 2021. DOI: https://doi.org/10.24432/C5MC89.
- F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
- C. P. Dancey and J. Reidy. Statistics Without Maths for Psychology. Pearson Education, Harlow, 7th edition, 2017.
- Y. N. S. A. Husaini and N. S. A. Shukor. Factors affecting students’ academic performance: A review. Res Militaris, 12(6):284–294, 2022. Faculty of Communication Visual Art and Computing, Universiti Selangor, Malaysia.
- Kenneth Holstein, Jennifer Wortman Vaughan, Hal Daumé, Miro Dudik, and Hanna Wallach. Improving fairness in machine learning systems: What do industry practitioners need? In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, CHI’19, pages 1–16, New York, NY, USA, 2019. Association for Computing Machinery.
- European Parliament and Council of the European Union. Regulation (EU) 2024/1227 of the European Parliament and of the Council of 24 April 2024 on the protection of children in migration, 2024. Official Journal of the European Union, L, 2024/1227.
- European Commission. Ethics guidelines for trustworthy AI. https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai, 2019. High-Level Expert Group on Artificial Intelligence, published April 8, 2019.
Issue: Procedia Computer Science, vol. 272, pp. 242-250, 2025, Albania, https://doi.org/10.1016/j.procs.2025.10.202