Multi-split Optimized Bagging Ensemble Model Selection for Multi-class Educational Data Mining

9 Jun 2020  ·  MohammadNoor Injadat, Abdallah Moubayed, Ali Bou Nassif, Abdallah Shami ·

Predicting students' academic performance has been a research area of interest in recent years with many institutions focusing on improving the students' performance and the education quality. The analysis and prediction of students' performance can be achieved using various data mining techniques. Moreover, such techniques allow instructors to determine possible factors that may affect the students' final marks. To that end, this work analyzes two different undergraduate datasets at two different universities. Furthermore, this work aims to predict the students' performance at two stages of course delivery (20% and 50% respectively). This analysis allows for properly choosing the appropriate machine learning algorithms to use as well as optimize the algorithms' parameters. Furthermore, this work adopts a systematic multi-split approach based on Gini index and p-value. This is done by optimizing a suitable bagging ensemble learner that is built from any combination of six potential base machine learning algorithms. It is shown through experimental results that the posited bagging ensemble models achieve high accuracy for the target group for both datasets.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here