
Success in academic studies prediction through Machine Learning

Madie Fabio 1
1 Norwegian University of Science and Technology, Trondheim 7034, Norway
Abstract. As we start our academic studies, we have to invest a lot to succeed: a lot of money for tuition fees, housing, and miscellaneous expenses. The purpose of this study is to create a prediction model from a data set available on the website kaggle.com, treated and analyzed in the
paper "Predicting Student Dropout and Academic Success"[1]. Higher education
institutions collect a large amount of data about their students, representing a tremendous opportunity to develop information, knowledge, and monitoring.
School dropout and educational failure in higher education are both barriers to
economic growth, employment, competitiveness, and production, with significant consequences for students and their families, higher education institutions,
and society as a whole. Student achievement is critical at educational institutions
since it is frequently used as a criterion for the institution's performance. Early
discovery of at-risk students, combined with preventive actions, can significantly
increase their achievement. Machine learning techniques have recently been
widely used for prediction. In this regard, it is necessary, first of all, to prepare
the data set that will be used, presented on the Kaggle page affiliated with the
project[2]. Different types of models will be compared in order to determine the
most efficient at making an early prediction of student academic success. Some
less common models will also be considered for this task and their effectiveness will be assessed. The idea of the project is to identify the main risk factors for dropout,
take early interventions to prevent it, and determine the factors that lead to a
positive academic outcome.
Keywords: Machine Learning Prediction, Multi-class model, Academic success.
1 Introduction

1.1 Project Idea
The purpose of this project is twofold. The main idea is to provide a program able to determine the chance of a student dropping out and, for this percentage, the main failure elements. On the other hand, it will deliver the principal success components and an associated academic success rate.
Several machine learning models exist; the target is multi-class, which limits the methods we can use. We will explore different types of machine learning such as Classification and Regression. Because this type of dataset has been treated extensively in existing research, we will only use one type of Regression model, known as Logistic Regression, to determine its effectiveness for this type of problem. It is not commonly used in multi-class problems, but it is quite interpretable. Then, we will use classification algorithms such as the K-Nearest Neighbors method
because, in this data set, several variables such as demographic information, academic performance, and financial information can be used to predict whether a student is likely to drop out or
graduate. K-nearest Neighbors can be used to group students based on these variables, which can
help identify common characteristics or factors that contribute to their retention or dropout rates.
We will compare the Decision Tree method and the Random Forest to assess whether the gain in effectiveness justifies the increase in complexity. Finally, we will implement a Support Vector Machine, which can handle non-linear decision boundaries that may appear in our dataset, is effective with high-dimensional data, and is robust to outliers.
Since the data is well-labeled, only supervised training methods will be used. The performance of each model will be compared using several metrics: accuracy, precision, recall, and F1-score.
1.2 Literature Review
The topic of predicting a student's academic failure or success has always been of interest to researchers. Indeed, in order to maximize the chances of success, it is interesting to wonder which courses to follow and in which geographical area. All of the following has been extracted from "Artificial neural networks in academic performance prediction: Systematic implementation and predictor evaluation"[3]. This literature review of academic success prediction through time defines academic success and presents the methodology used in research. It provides a complete guideline describing data mining techniques and summarizes previous studies. In this report, only information relative to studies based on prediction at the course level will be considered. The dataset currently used is based on student performance in undergraduate degrees. Previous research depending on the year level or the degree level is irrelevant. The following table summarizes the content of the existing research:
Table 1: Review of the existing studies

| Reference                  | Algorithms Used           | Model Type | Sample Size | Best Accuracy |
| Almarabeh (2017)[5]        | NB, BN, ID3, J48, NN      | [C]        | 255         | NB – 93%      |
| Mueen et al. (2016)[6]     | NB, NN, C4.5              | [C]        | 60          | NB – 86%      |
| Mohamed & Waguih (2017)[7] | J48, Rep Tree, RT         | [C]        | 8080        | J48 – 85%     |
| Sivasakthi (2017)[8]       | SMO, NB, J48, NN, REPTree | [C]        | 300         | MLP – 93%     |
| Putpuek et al. (2018)[9]   | ID3, C4.5, KNN, NB        | [C]        | –           | NB – 43.18%   |
| Garg (2018)[10]            | C4.5                      | [C]        | 400         | –             |
| Yassein et al. (2017)[11]  | C4.5                      | [C] [CC]   | 150         | –             |

[C] for classification; [R] for regression; [CC] for clustering; BN Bayes net, DT decision tree, KNN k-nearest neighbors, LR logistic regression, NB naive Bayes, (P)NN (probabilistic) neural network, RB rule-based, RI rule induction, RF random forest, RT random tree, NN neural network, TE tree ensemble; –: information not available [4]
The same evaluation will be applied to each type of algorithm used in order to provide a solid comparison with the previous studies. Although the accuracy of the best-performing model is greatly influenced by the data set and the preprocessing applied to it, this gives an overall impression of which type of model is the most efficient. The Naïve Bayes model distinguishes itself from the others. Classification models are the most commonly used, along with clustering models. Regression models are irrelevant because the output is categorical: the student either graduates or drops out. This report will provide an instance of the inefficiency of this type of algorithm.
2 Methodology
The review methodology is based on the approach of the previous research[4]. Since every student's information could influence the prediction, it would not be accurate to remove data without consideration of the model response. First of all, we have to prepare the data, which is raw and cannot be used directly for analysis and modeling. It could contain missing, inconsistent, or duplicate values that must be removed before any other step. The data will then be split: models will be trained on one part and tested on the other. Each model's results will be evaluated and the important features deduced. Further tests will be realized with those features only.
2.1 Data Set Presentation

2.1.1 Data Description
As explained in the dataset report[1], the data sources are varied and the dataset is a combination
of external data from the Academic Management System (AMS), the Support System for the
Teaching Activity of the institution (developed internally and called PAE), the General Directorate of Higher Education (DGES) regarding admission through the National Competition for
Access to Higher Education (CNAES) and the Contemporary Portugal Database (PORDATA)
regarding macroeconomic data. It refers to records of students enrolled between the academic
years 2008/2009 to 2018/2019. It includes 17 undergraduate degrees specified in the original
report.
The data set used is presented on the website Kaggle[2]. It is used to predict dropouts and academic outcomes. It is composed of various data including demographic, social, economic, academic performance, and personal factors. The values come from students enrolled in higher education institutions. Social parameters are, for instance, the mother's/father's qualifications, nationality, and the mother's/father's occupation. All the data is explained in Table 2, which describes each attribute used in the dataset, grouped by class[1]. The possible values for each attribute are detailed in the Appendix of the report from which the dataset was extracted[1].
This dataset includes 4424 records with 34 attributes and no missing values. The analysis of the dataset will be performed with Python 3 using the Pandas library; the Sklearn library for the feature selection, the machine learning models, and their evaluation; and the Matplotlib and Seaborn libraries for the data visualization.
The main difficulties that can be faced with a dataset are missing data, meaning that either some data are not provided in the dataset or there are not enough data to construct an accurate model. In our case, we do not have any missing data. Another common issue is an imbalanced dataset, where the number of samples from one class is significantly smaller than the number of samples from the other classes. Finally, for a regression-type model, a problem of multicollinearity can occur. It appears whenever an independent variable is highly correlated with one or more of the other independent variables in a multiple regression equation. All these problems will be treated during the pre-processing phase. Although the dataset is fairly well-sized, its classes are not balanced. Our target feature is a categorical value that indicates whether a student is 'Dropout', 'Enrolled', or 'Graduate'. The problem is, in light of that, a multi-class classification. Hence, a transformation will have to be done to get a result from the regression model.
Table 2: Data Description

| Data Class                   | Attribute                                     | Data Type          |
| Demographic                  | Marital Status                                | Numeric/discrete   |
| Demographic                  | Nationality                                   | Numeric/discrete   |
| Demographic                  | Displaced                                     | Numeric/binary     |
| Demographic                  | Gender                                        | Numeric/binary     |
| Demographic                  | Age at enrollment                             | Numeric/discrete   |
| Demographic                  | International                                 | Numeric/binary     |
| Socioeconomic                | Mother's qualification                        | Numeric/discrete   |
| Socioeconomic                | Father's qualification                        | Numeric/discrete   |
| Socioeconomic                | Mother's occupation                           | Numeric/discrete   |
| Socioeconomic                | Father's occupation                           | Numeric/discrete   |
| Socioeconomic                | Educational special needs                     | Numeric/binary     |
| Socioeconomic                | Debtor                                        | Numeric/binary     |
| Socioeconomic                | Tuition fees are up to date                   | Numeric/binary     |
| Socioeconomic                | Scholarship holder                            | Numeric/binary     |
| Macroeconomic                | Unemployment rate                             | Numeric/continuous |
| Macroeconomic                | Inflation rate                                | Numeric/continuous |
| Macroeconomic                | GDP                                           | Numeric/continuous |
| Academic data at enrollment  | Application mode                              | Numeric/discrete   |
| Academic data at enrollment  | Application order                             | Numeric/ordinal    |
| Academic data at enrollment  | Course                                        | Numeric/ordinal    |
| Academic data at enrollment  | Daytime/evening attendance                    | Numeric/binary     |
| Academic data at enrollment  | Previous qualification                        | Numeric/discrete   |
| Academic results 1st semester | Curricular units 1st sem (credited)          | Numeric/discrete   |
| Academic results 1st semester | Curricular units 1st sem (enrolled)          | Numeric/discrete   |
| Academic results 1st semester | Curricular units 1st sem (evaluations)       | Numeric/discrete   |
| Academic results 1st semester | Curricular units 1st sem (approved)          | Numeric/discrete   |
| Academic results 1st semester | Curricular units 1st sem (grade)             | Numeric/discrete   |
| Academic results 1st semester | Curricular units 1st sem (without evaluations) | Numeric/discrete |
| Academic results 2nd semester | Curricular units 2nd sem (credited)          | Numeric/discrete   |
| Academic results 2nd semester | Curricular units 2nd sem (enrolled)          | Numeric/discrete   |
| Academic results 2nd semester | Curricular units 2nd sem (evaluations)       | Numeric/discrete   |
| Academic results 2nd semester | Curricular units 2nd sem (approved)          | Numeric/discrete   |
| Academic results 2nd semester | Curricular units 2nd sem (grade)             | Numeric/discrete   |
| Academic results 2nd semester | Curricular units 2nd sem (without evaluations) | Numeric/discrete |
| Target                       | Target                                        | Categorical        |
2.1.2 Data Processing and Data Transformation
First of all, a data selection, also called 'Dimensionality Reduction', has to be considered for the sake of reducing the computing power needed to compute results. Furthermore, irrelevant features can yield below-optimal prediction results. Two different selection methods exist:
- Vertical selection consists of the removal of redundant or irrelevant features to simplify the understanding of patterns and decrease the time of the learning phase. However, it requires a good understanding of the data to select the features.
- Horizontal selection consists of the removal of conflicting instances to strengthen the dataset. However, it requires a large sample size first and foremost.
Data cleaning is a crucial step in any machine learning project, as it helps ensure the data used to train the model is accurate, consistent, and free from anomalies or errors. First of all, we have to clean the data set and check that there are no missing values that could create issues during the prediction. Data sources can be inconsistent or contain noise, so this step is crucial in order to get usable results. In our dataset, there are no missing or duplicate values.
Next, the data type that represents the student's status must be changed. Indeed, to be interpretable by the models, the information in the 'Target' column must be an integer and not a string as originally presented in the dataset. We have to remove every case where the student is still enrolled in the school. Every graduated student is labeled with a 1 and every student who dropped out with a 0. With that, the multi-class classification problem becomes a binary classification. In order to improve the quality of the results, the dataset will be scaled. The goal of normalization is to transform features to be on a similar scale; this improves the performance and training stability of the model.
There is a strong imbalance in the dataset towards the 'Graduate' group, as shown in Figure 1. The majority class, 'Graduate', represents 50% of the total records (2209 of 4424), 'Dropout' represents 32% (1421 of 4424), and 'Enrolled' represents 18% (794 of 4424). Even when the Enrolled class is removed from the dataset, the majority is still the 'Graduate' class, with 61% (2209 of 3630) against 39% (1421 of 3630). A strong imbalance might cause a prediction driven by the majority group, which is not optimal. This problem will be addressed by applying the Synthetic Minority Oversampling Technique (SMOTE) to the dataset to oversample the minority class.
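As a rough illustration of the idea behind SMOTE (the project uses a library implementation; real SMOTE interpolates toward one of the k nearest minority neighbors, and the points below are made up):

```python
import random
random.seed(0)

# Minimal illustration of the idea behind SMOTE: synthesize new minority
# samples by interpolating between a minority point and another nearby
# minority point (real SMOTE picks among the k nearest neighbors).
minority = [(1.0, 2.0), (1.2, 1.8), (0.9, 2.2)]  # hypothetical Dropout points

def synthesize(a, b):
    gap = random.random()  # random position along the segment a -> b
    return tuple(ai + gap * (bi - ai) for ai, bi in zip(a, b))

synthetic = [synthesize(minority[i], minority[(i + 1) % 3]) for i in range(3)]
print(len(minority) + len(synthetic))  # minority class doubled: 6
```

Each synthetic point lies on the segment between two existing minority points, so the oversampled class stays within its original region of the feature space.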
Figure 1: Distribution of student records (bar chart of the Target frequencies: Graduate, Dropout, Enrolled)

2.2 Machine Learning Algorithms
2.2.1 Models Used
Once the data has been cleaned and preprocessed, we have to split the data set into training and testing parts. To do that, a random shuffle is applied to the cleaned dataset, and the training and testing parts are converted back into integers. The values are also separated into 5 folds for cross-validation analysis.
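The shuffle-and-split step can be sketched as follows (a toy index list stands in for the real records; the project itself uses Sklearn's splitting utilities):

```python
import random
random.seed(42)

# Sketch of the shuffle-and-split step: random shuffle, then an 80/20
# train/test split and five folds of the training part for cross-validation.
indices = list(range(10))          # stand-in for the real record indices
random.shuffle(indices)

cut = int(0.8 * len(indices))
train, test = indices[:cut], indices[cut:]

folds = [train[i::5] for i in range(5)]  # five disjoint folds
print(len(train), len(test), [len(f) for f in folds])  # 8 2 [2, 2, 2, 1, 1]
```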
For every model, we will use the same notation: $\vec{x}$: input feature vector, $y$: target variable, $\hat{y}$: model prediction.
- The Logistic Regression model (LR) is a straightforward model that works well with small to medium-sized datasets:
We compute the linear combination $z = \vec{x} \cdot \vec{w} + b$, where $\vec{w}$ and $b$ are the parameters of the model, also called the coefficients or the weights. This value is passed to the sigmoid function:

$$g(z) = \frac{1}{1 + e^{-(\vec{x} \cdot \vec{w} + b)}}$$

In order to set the output to either 0 or 1 only, we can set up a threshold where:

$$\hat{y} = \begin{cases} 1, & g(z) \ge \text{threshold} \\ 0, & g(z) < \text{threshold} \end{cases}$$

In the Sklearn library, the threshold for a two-class problem is 0.5.
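A minimal plain-Python sketch of this decision rule (the weights below are illustrative, not fitted coefficients; the project itself fits them with Sklearn's LogisticRegression):

```python
import math

# Plain-Python sketch of the logistic-regression decision rule above.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b, threshold=0.5):   # 0.5 matches the Sklearn default
    z = sum(xi * wi for xi, wi in zip(x, w)) + b
    return 1 if sigmoid(z) >= threshold else 0

x = [0.7, 0.2]           # e.g. two scaled features for one student
w, b = [2.0, 1.5], -1.0  # hypothetical coefficients, not fitted values
print(predict(x, w, b))  # z = 0.7 -> g(z) ~ 0.67 -> class 1
```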
A K-Nearest Neighbors (KNN) which is capable of handling intricate correlations between features, but it can be computationally expensive for large datasets:
For all the new data 𝒙′, a distance measure (Euclidian distance in the Sckear library) is done
with its K-nearest neighbors π’™π’Œ :
𝒅(𝒙′ , π’™π’Œ ) = √∑(π’™π’Š − π’™π’Œ π’Š )
𝟐
The data 𝒙′ is assigned to the majority class of the K-nearest Neighbors.
In the Sklearn library, the default value for k is 5 but further testing tends to prove that the
most optimal is 1. A value above this number is too computationally expensive
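The KNN rule above can be sketched in plain Python (the labeled points are made up; the project itself uses Sklearn's KNeighborsClassifier):

```python
import math
from collections import Counter

# Plain-Python sketch of the KNN rule: Euclidean distance, then a
# majority vote among the k nearest labeled points.
train = [([1.0, 1.0], 1), ([1.2, 0.8], 1), ([4.0, 4.0], 0), ([4.2, 3.9], 0)]

def knn_predict(x, train, k=1):      # k=1 as retained in this project
    nearest = sorted(train, key=lambda p: math.dist(x, p[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(knn_predict([1.1, 0.9], train))  # nearest neighbor is class 1
```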
- Decision Tree (DT) is capable of handling both continuous and categorical variables, but it is susceptible to overfitting:
This model uses one of two selection criteria to measure impurity: the entropy or the Gini index. Because there is little difference between these two criteria, the less computationally expensive one will be chosen and developed in this report.
Let the data at node $m$ be represented by $Q_m$ with $n_m$ samples. For each candidate split $\theta = (j, t_m)$, consisting of a feature $j$ and a threshold $t_m$, partition the data into $Q_m^{left}$ and $Q_m^{right}$ subsets:

$$Q_m^{left}(\theta) = \{(\vec{x}, y) \mid x_j < t_m\}$$
$$Q_m^{right}(\theta) = Q_m \setminus Q_m^{left}(\theta)$$

The target is a classification outcome taking on values $k \in \{0, 1\}$; for node $m$, let:

$$p_{mk} = \frac{1}{n_m} \sum_{y \in Q_m} I(y = k)$$

We can compute the impurity function based on the Gini index:

$$H(Q_m) = \sum_k p_{mk}(1 - p_{mk})$$

Hence, we minimize the impurity by looking for the split $\theta$ that minimizes:

$$\operatorname{argmin}_\theta \left( \frac{n_m^{left}}{n_m} H\!\left(Q_m^{left}\right) + \frac{n_m^{right}}{n_m} H\!\left(Q_m^{right}\right) \right)$$

The split achieving this will be selected to start another node. This continues until the maximum allowable depth is reached (None by default) or all the data have been split.
- Random Forest (RF) is good for high-dimensional datasets with complicated relationships:
It is an ensemble model that fits several decision tree classifiers on subsamples of the training dataset and uses averaging to prevent overfitting and increase accuracy.
Each decision tree classifier is based on the previous model with the same parameters. After testing, the most effective number of classifiers to prevent overfitting while remaining computationally inexpensive is 100, with a maximum depth of 10.
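The bagging idea behind Random Forest can be illustrated in plain Python; here trivial one-threshold "stumps" trained on bootstrap samples stand in for full decision trees, and the data is made up (the project itself uses Sklearn's RandomForestClassifier):

```python
import random
random.seed(1)

# Minimal illustration of bagging: several stumps, each trained on a
# bootstrap sample, vote, and the majority wins.
data = [(0.2, 0), (0.3, 0), (0.7, 1), (0.9, 1), (0.8, 1), (0.1, 0)]

def train_stump(sample):
    # Split threshold at the midpoint between the two class means.
    zeros = [x for x, y in sample if y == 0]
    ones = [x for x, y in sample if y == 1]
    if not zeros or not ones:
        return 0.5  # degenerate bootstrap sample: fall back to midpoint
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def forest_predict(x, thresholds):
    votes = sum(1 if x > t else 0 for t in thresholds)
    return 1 if votes > len(thresholds) / 2 else 0

thresholds = []
for _ in range(5):
    sample = [random.choice(data) for _ in data]  # bootstrap sample
    thresholds.append(train_stump(sample))

print(forest_predict(0.85, thresholds))  # above every split: class 1
```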
- Support Vector Machine (SVM) works well in high-dimensional domains and can handle complex interactions, although it can be expensive to compute and challenging to interpret:
It is a supervised machine learning model in which we try to find the hyperplane that best separates the two classes. Because the target is binary, a linear SVM will be considered.
The functional margin of a hyperplane is defined by $\hat{j}_i = y^{(i)}(w^T x^{(i)} + b)$. Hence,

$$\begin{cases} y = 1, & w^T x^{(i)} + b \gg 0 \\ y = 0, & w^T x^{(i)} + b \ll 0 \end{cases}$$

The goal of the model is to maximize the geometric margin:

$$j = \frac{\hat{j}_i}{\|w\|} = \frac{y^{(i)}(w^T x^{(i)} + b)}{\|w\|}$$

By optimizing the margin, the problem reduces to the minimization of:

$$\min_{w,b} \frac{\|w\|^2}{2} + C \sum_{i=1}^{n} \zeta_i$$

where $\zeta_i$ are positive slack variables introduced to relax the margin; $C$, also called the regularization constant, controls these variables and has to be identified through cross-validation. For this dataset, C has been determined to be optimally equal to $1 \cdot 10^{-5}$.
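The soft-margin objective above can be evaluated for a candidate hyperplane in a few lines of Python (the data, $w$, $b$, and $C$ below are made up; labels use the conventional $\pm 1$ encoding rather than the document's $\{0, 1\}$):

```python
# Plain-Python evaluation of the soft-margin SVM objective for one
# candidate hyperplane (not an optimizer).
def objective(w, b, C, data):
    # slack zeta_i = max(0, 1 - y_i (w.x_i + b)) with labels y in {-1, +1}
    slack = [max(0.0, 1 - y * (sum(wi * xi for wi, xi in zip(w, x)) + b))
             for x, y in data]
    return 0.5 * sum(wi ** 2 for wi in w) + C * sum(slack)

data = [([2.0, 0.0], 1), ([-2.0, 0.0], -1), ([0.5, 0.0], 1)]
print(objective([1.0, 0.0], 0.0, C=1.0, data=data))  # 0.5 + 1*0.5 = 1.0
```

A small C, as retained here, tolerates margin violations cheaply and favors a flatter (more regularized) hyperplane.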
2.2.2 Results Evaluation

This type of classification problem fits an evaluation using a confusion matrix. While evaluating each model, four different cases related to a given success prediction occur:
- True Positive (TP): number of students correctly classified as 'Graduate'
- False Positive (FP): number of students wrongly classified as 'Graduate'
- True Negative (TN): number of students correctly classified as 'Dropout'
- False Negative (FN): number of students wrongly classified as 'Dropout'
To evaluate the performance of each model, we used accuracy, precision, recall, and F1 Score.
Information on the measurement tools is provided in Table 3.
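The four measures can be computed directly from the confusion-matrix counts (the counts below are hypothetical, not results from this study):

```python
# The four measures of the evaluation, computed from hypothetical
# confusion-matrix counts.
TP, FP, TN, FN = 90, 10, 80, 20

accuracy = (TP + TN) / (TP + FP + TN + FN)
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, round(recall, 3), round(f1, 3))
# 0.85 0.9 0.818 0.857
```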
Table 3: Measurement Tools for classification problem

| Performance assessment | Calculation                           | Interpretation |
| Accuracy               | (TP + TN) / (TP + FP + TN + FN)       | Number of correct predictions |
| Precision              | TP / (TP + FP)                        | Number of students correctly labeled as 'Graduate' among all students predicted as 'Graduate' |
| Recall                 | TP / (TP + FN)                        | Number of students correctly labeled as 'Graduate' among all 'Graduate' students |
| F1 Score               | 2 × (Precision × Recall) / (Precision + Recall) | The balance between the precision of the classifier and its robustness |

3 Results and Analysis

3.1 Feature Selection and Preprocessing
Collinearity can be an issue in our dataset: the analysis of the heatmap (Figure 2) shows that some pairs of features have high Pearson correlation coefficients. Collinearity is strongest within the same group of features but is also present between groups. "Nationality" and "International" or "Mother's occupation" and "Father's occupation" have high collinearity coefficients, as do "Curricular units 1st sem (approved)" and "Curricular units 2nd sem (approved)": the performance at the end of one semester greatly influences the next one.
Figure 2: Correlation Table
A test has been performed to determine the most important features using Permutation Feature Importance. The 10 most important features for each model are plotted in Figure 3. The analysis of these results shows that five features are considered important by all algorithms: "Curricular units 2nd sem (approved)", "Curricular units 1st sem (approved)", "Curricular units 2nd sem (grade)", "Course", and "Tuition fees up to date".
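The principle of permutation feature importance can be illustrated with a toy predictor (the data and the threshold "model" below are made up; the project itself uses the Sklearn implementation on fitted models):

```python
import random
random.seed(0)

# Minimal illustration of permutation feature importance: shuffle one
# feature column and measure the drop in accuracy of a toy predictor
# that simply thresholds feature 0.
data = [([0.9, 0.1], 1), ([0.8, 0.7], 1), ([0.2, 0.9], 0), ([0.1, 0.3], 0)]

def accuracy(rows):
    return sum((1 if x[0] > 0.5 else 0) == y for x, y in rows) / len(rows)

base = accuracy(data)
col0 = [x[0] for x, _ in data]
random.shuffle(col0)  # break the link between feature 0 and the label
permuted = [([c, x[1]], y) for c, (x, y) in zip(col0, data)]

print(base, base - accuracy(permuted))  # importance = accuracy drop
```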
Figure 3: Plot of top 10 Permutation Feature Importance
3.2 Model Evaluation
We compared, tested, and assessed five classifiers on the dataset. All 34 available attributes were used with all five classifiers. We employed fivefold cross-validation, which means that the dataset was randomly divided into five equal-sized sections. Table 4 displays the results, averaged over ten runs of the experiment, using all of the attributes.
Table 4: Classifier results using all attributes

|                 | LR     | KNN    | DT    | RF    | SVM   |
| Accuracy        | 91.5 % | 86.3 % | 90.0% | 91.0% | 90.7% |
| Precision Score | 91.5   | 86.2   | 90.3  | 91.1  | 91.3  |
| Recall Score    | 91.6   | 86.2   | 89.8  | 90.9  | 90.7  |
| F1 Score        | 91.6   | 86.2   | 89.8  | 90.9  | 90.7  |
Logistic Regression and Random Forest are the best-performing models on all the previously defined performance metrics, and every classifier achieves high accuracy. Now, in order to be less computationally expensive, irrelevant features are removed. The classifiers are executed again on a reduced dataset with only the ten most important features for each, using five-fold cross-validation. The results for this reduced dataset can be seen in Table 5.
Table 5: Classifier results using the ten best attributes

|                 | LR     | KNN    | DT     | RF     | SVM    |
| Accuracy        | 90.8 % | 83.9 % | 88.0 % | 91.9 % | 90.4 % |
| Precision Score | 91.0   | 84.9   | 88.3   | 92.0   | 91.0   |
| Recall Score    | 90.8   | 84.9   | 88.0   | 92.0   | 90.3   |
| F1 Score        | 90.8   | 84.9   | 87.9   | 91.9   | 90.4   |
The best-performing algorithms do not change with the removal of irrelevant features; however, for every model except Random Forest, performance slightly decreases. In the case of the Random Forest, some of the removed features must have high collinearity with others. Although it would have been beneficial to remove the features with the highest collinearity, this was not possible because they are also the most heavily weighted features; their removal would have hurt performance even more. The performance of every algorithm is rather high except for the K-Nearest Neighbors model.
4 Conclusion
The ability to predict a student's performance is very important in educational environments. It makes it possible to identify what a student must focus on in order to graduate. This work is an example of how machine learning can be used to analyze students' academic success. This research aims to assist teachers in identifying early signs of dropout, so that they can provide extra attention to these students and help them enhance their performance. Multiple classifiers were used, together with data pre-processing techniques such as Dimensionality Reduction and the Synthetic Minority Oversampling Technique (SMOTE). This permits achieving a great performance for the Random Forest model, which outperforms every other algorithm with an average accuracy of 91.9%. The factors that have to be watched are which course the student follows, whether their tuition fees are up to date, whether their curricular units are approved, and what their grades in those units are. Finally, for future study, it would be interesting to conduct more trials with larger datasets that include different courses, educational levels, and degrees.
References
1. Realinho, V., Machado, J., Baptista, L., Martins, M. V.: Predicting students' dropout and academic success. https://www.mdpi.com/2306-5729/7/11/146
2. Affiliated Kaggle page: https://www.kaggle.com/datasets/thedevastator/higher-education-predictors-of-student-retention
3. Rodríguez-Hernandez, C. F., Musso, M., Kyndt, E., Cascallar, E.: Artificial neural networks in academic performance prediction: Systematic implementation and predictor evaluation. https://www-sciencedirect-com.docelec.insa-lyon.fr/science/article/pii/S2666920X21000126#bbib5
4. Alyahyan, E., Düştegör, D.: Predicting academic success in higher education: literature review and best practices. International Journal of Educational Technology in Higher Education (springeropen.com)
5. Almarabeh, H. (2017). Analysis of students' performance by using different data mining classifiers. International Journal of Modern Education and Computer Science, 9(8), 9–15. (mecs-press.org)
6. Mueen, A., Zafar, B., & Manzoor, U. (2016). Modeling and predicting students' academic performance using data mining techniques. International Journal of Modern Education and Computer Science, 8(11), 36–42. https://www.researchgate.net/publication/311068715_Modeling_and_Predicting_Students'_Academic_Performance_Using_Data_Mining_Techniques/citations
7. Mohamed, M. H., & Waguih, H. M. (2017). Early prediction of student success using a data mining classification technique. International Journal of Science and Research, 6(10), 126–131. https://www.semanticscholar.org/paper/Early-Prediction-of-Student-Success-Using-a-Data-MohamedWaguih/e90dcba96b0c9472750869e4f127a8240e6763e1
8. Sivasakthi, M. (2017). Classification and prediction based data mining algorithms to predict students' introductory programming performance. ICICI, 0–4. https://www.ijsr.net/archive/v6i10/ART20177029.pdf
9. Putpuek, N., Rojanaprasert, N., Atchariyachanvanich, K., & Thamrongthanyawong, T. (2018). Comparative study of prediction models for final GPA score: A case study of Rajabhat Rajanagarindra University. In 2018 IEEE/ACIS 17th International Conference on Computer and Information Science (pp. 92–97). https://www.researchgate.net/publication/327820955_Comparative_Study_of_Prediction_Models_for_Final_GPA_Score_A_Case_Study_of_Rajabhat_Rajanagarindra_University
10. Garg, R. (2018). Predict student performance in different regions of Punjab. International Journal of Advanced Research in Computer Science, 9(1), 236–241. http://ijarcs.info/index.php/Ijarcs/article/view/5234/4486
11. Yassein, N. A., Helali, R. G. M., & Mohomad, S. B. (2017). Predicting student academic performance in KSA using data mining techniques. Journal of Information Technology and Software Engineering, 7(5), 1–5. https://www.longdom.org/open-access-pdfs/predicting-student-academic-performance-in-ksa-using-data-mining-techniques-2165-78661000213.pdf