Main Article Content


To improve student performance, several universities use machine learning to analyze and evaluate their data and thereby raise the quality of education they provide. To gain new insight from a tracer study dataset into the relevance between university performance and graduates' capability in business and industry, the author develops a model that predicts student performance from the tracer study dataset using an Artificial Neural Network (ANN). To obtain attributes that correspond to the labels, the Phi coefficient is used for feature selection, keeping the attributes that correlate highly with the labels. Because the dataset is imbalanced, the author also applies oversampling with the Synthetic Minority Oversampling Technique (SMOTE), and evaluates the model using K-Fold Cross-Validation. The cross-validation results show that K = 3 yields a low standard deviation of the evaluation scores, making it the best candidate for splitting the dataset. The average standard deviation is 0.038 across all evaluation scores (Accuracy, Precision, Recall, and F1 Score). After SMOTE is applied to treat the imbalanced dataset, with 65% of the data used for training and 35% for testing, accuracy increases by 0.10, from 0.77 to 0.87.
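As a rough illustration of two steps named in the abstract, the sketch below computes the Phi coefficient between binary attributes and a binary label, keeps the attributes whose absolute correlation reaches a threshold, and then picks the K whose per-fold scores have the lowest standard deviation. It is not the author's implementation: the function names, the attribute names, the 0.3 threshold, and all data values are hypothetical and were chosen only so that the example runs with the Python standard library.

```python
import math
import statistics


def phi_coefficient(x, y):
    """Phi coefficient between two equal-length binary (0/1) sequences."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # A constant column has no variance; treat its correlation as 0.
    return 0.0 if denom == 0 else (n11 * n00 - n10 * n01) / denom


def select_features(columns, labels, threshold=0.3):
    """Return names of columns whose |phi| with the label is >= threshold."""
    return [
        name
        for name, values in columns.items()
        if abs(phi_coefficient(values, labels)) >= threshold
    ]


def most_stable_k(scores_by_k):
    """Return the K whose per-fold scores have the smallest standard deviation."""
    return min(scores_by_k, key=lambda k: statistics.stdev(scores_by_k[k]))


# Hypothetical toy data, NOT taken from the tracer study dataset.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
columns = {
    "did_internship": [1, 1, 1, 0, 0, 0, 0, 1],  # hypothetical attribute
    "joined_club": [1, 0, 0, 1, 1, 0, 1, 0],     # hypothetical attribute
}
selected = select_features(columns, labels, threshold=0.3)

# Hypothetical per-fold accuracy scores for two candidate values of K.
scores_by_k = {
    3: [0.80, 0.82, 0.81],
    5: [0.70, 0.90, 0.80, 0.85, 0.75],
}
best_k = most_stable_k(scores_by_k)
```

In a full pipeline the selected attributes would then be oversampled and passed to the ANN; those steps depend on third-party libraries and are left out of this sketch.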


Artificial Neural Network, Imbalanced Dataset, K-Fold Cross Validation, Student Performance, Tracer Study
