Student Intelligence System

Diagnose. Plan. Improve.

Traceback analyzes learning signals to identify why each student is struggling, then generates a personalized recovery plan.

Total Students

28,785

OULAD dataset

On Track

13,134

45.6% of students

Need Intervention

8,449

Critical + High urgency

Model Accuracy

92.5%

XGBoost classifier

Plans Generated

28,785

100% coverage

Charts: Root Cause Distribution, Intervention Urgency, Avg Score by Root Cause, Avg Engagement by Final Result

Project Documentation

How Traceback Works

A full machine learning pipeline built on the OULAD dataset to diagnose student learning problems and generate personalized interventions.

The Pipeline

Data Cleaning & Merging

7 OULAD CSV files cleaned and merged into a master table of 28,785 students × 32 columns.
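As a rough illustration of the merge step, the sketch below aggregates click logs per student and left-joins them onto the student table. The column names follow the public OULAD schema (studentInfo, studentVle), but the tiny in-memory tables, the total_clicks column name, and the zero-fill for students without activity are assumptions made for this example, not the project's actual cleaning rules.

```python
import pandas as pd

# Toy stand-ins for two of the OULAD tables (values invented for illustration).
student_info = pd.DataFrame({
    "code_module": ["AAA", "AAA", "BBB"],
    "code_presentation": ["2013J", "2013J", "2013J"],
    "id_student": [1, 2, 3],
    "final_result": ["Pass", "Fail", "Withdrawn"],
})
student_vle = pd.DataFrame({
    "code_module": ["AAA", "AAA", "AAA"],
    "code_presentation": ["2013J", "2013J", "2013J"],
    "id_student": [1, 1, 2],
    "sum_click": [10, 5, 2],
})

keys = ["code_module", "code_presentation", "id_student"]

# Collapse the click log to one row per student and course presentation.
clicks = (
    student_vle.groupby(keys, as_index=False)["sum_click"]
    .sum()
    .rename(columns={"sum_click": "total_clicks"})
)

# A left join keeps students who never touched the platform.
master = student_info.merge(clicks, on=keys, how="left")

# Assumption: no click rows means zero clicks, not missing data.
master["total_clicks"] = master["total_clicks"].fillna(0).astype(int)

print(master)
```

The left join matters here: an inner join would silently drop exactly the students with no platform activity.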

Feature Engineering

10 signals normalized. 3 composite risk scores built: academic_risk, engagement_risk, persistence_risk.
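One plausible way to build such features is min-max normalization followed by simple combinations of the normalized signals. The sketch below uses only four invented signals, and the formulas for academic_risk, engagement_risk, and persistence_risk are hypothetical placeholders; the source does not state how the real composites are computed.

```python
import pandas as pd

def min_max(col: pd.Series) -> pd.Series:
    """Scale a column to [0, 1]; a constant column maps to all zeros."""
    span = col.max() - col.min()
    if span == 0:
        return col * 0.0
    return (col - col.min()) / span

# Invented raw signals for three students.
students = pd.DataFrame({
    "avg_score": [82.0, 35.0, 58.0],
    "total_clicks": [1200, 40, 600],
    "days_active": [90, 5, 45],
    "assessments_submitted": [10, 2, 6],
})

# Normalize every signal column independently.
normalized = students.apply(min_max)

# Hypothetical composites: higher value means higher risk.
features = normalized.copy()
features["academic_risk"] = 1 - normalized["avg_score"]
features["engagement_risk"] = 1 - normalized[["total_clicks", "days_active"]].mean(axis=1)
features["persistence_risk"] = 1 - normalized["assessments_submitted"]

print(features.round(3))
```

Normalizing first keeps every signal on the same 0 to 1 scale, so no single raw unit (clicks versus percentage scores) dominates a composite.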

Root Cause Classification

XGBoost classifier trained on 13 features. 92.5% accuracy across 5 root cause categories.

Learning Plan Generation

Rule-based engine generates 4-step personalized plans using each student's real signal values.
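A rule-based generator of this kind can be as simple as a table of step templates filled in with the student's own numbers. In the sketch below, the template wording, the signal names (active_days, total_clicks, avg_score), and the targets are all invented for illustration, and only two of the root causes are covered.

```python
# Hypothetical four-step templates keyed by root cause.
PLAN_TEMPLATES = {
    "NO_ENGAGEMENT": [
        "Log in on at least 3 days this week (you were active on {active_days}).",
        "Open the current unit's materials (total clicks so far: {total_clicks}).",
        "Book a 15-minute check-in with your tutor.",
        "Submit the next assessment, even if it is incomplete.",
    ],
    "KNOWLEDGE_GAP": [
        "Review feedback on your last assessment (average score: {avg_score}).",
        "Revisit the prerequisite unit before moving on.",
        "Complete two practice quizzes on the weakest topic.",
        "Retake a practice quiz and aim to beat {avg_score}.",
    ],
}

def generate_plan(root_cause: str, signals: dict) -> list[str]:
    """Fill the templates for one root cause with a student's signal values."""
    templates = PLAN_TEMPLATES[root_cause]
    return [
        f"Step {number}: {template.format(**signals)}"
        for number, template in enumerate(templates, start=1)
    ]

# Invented signal values for one student.
signals = {"active_days": 5, "total_clicks": 40, "avg_score": 35}
plan = generate_plan("NO_ENGAGEMENT", signals)
for step in plan:
    print(step)
```

Keeping the wording in a data table rather than in branching code makes it easy to add a root cause or revise a step without touching the generator.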

The 5 Root Causes

NO_ENGAGEMENT

Student barely interacts with the platform. Attendance problem, not knowledge.

KNOWLEDGE_GAP

Engaged but scoring low despite effort. A foundational concept is missing.

DECLINING

Was performing OK but scores are dropping. Something changed recently.

EXAM_ANXIETY

Good coursework scores but poor exam performance. Performance-under-pressure issue.

NEEDS_SUPPORT

Consistently near but below threshold. Small targeted push needed.
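To make the five descriptions concrete, they can be read as ordered threshold rules over a few signals. The sketch below is only one possible reading: the signal names, every threshold (including the pass mark of 40), and the rule order are assumptions, and in Traceback the assignment is made by the trained classifier rather than by hand-written rules like these.

```python
from typing import Optional

PASS_MARK = 40  # assumed threshold for this sketch

def diagnose(signals: dict) -> Optional[str]:
    """Map a student's signals to a root cause, or None if no rule fires."""
    coursework = (signals["early_score"] + signals["recent_score"]) / 2

    # Barely any platform activity: an attendance problem first.
    if signals["engagement"] < 0.1:
        return "NO_ENGAGEMENT"
    # Strong coursework but a failing exam.
    if coursework >= 60 and signals["exam_score"] < PASS_MARK:
        return "EXAM_ANXIETY"
    # Started above the pass mark, then dropped sharply.
    if (signals["early_score"] >= PASS_MARK
            and signals["recent_score"] <= signals["early_score"] - 15):
        return "DECLINING"
    # Just under the pass mark.
    if PASS_MARK - 10 <= coursework < PASS_MARK:
        return "NEEDS_SUPPORT"
    # Engaged, but well below the pass mark.
    if coursework < PASS_MARK - 10:
        return "KNOWLEDGE_GAP"
    return None

# Invented example students.
examples = {
    "a": {"engagement": 0.05, "early_score": 50, "recent_score": 50, "exam_score": 50},
    "b": {"engagement": 0.80, "early_score": 75, "recent_score": 70, "exam_score": 30},
    "c": {"engagement": 0.60, "early_score": 65, "recent_score": 42, "exam_score": 45},
    "d": {"engagement": 0.70, "early_score": 37, "recent_score": 38, "exam_score": 39},
    "e": {"engagement": 0.70, "early_score": 20, "recent_score": 22, "exam_score": 18},
}
results = {name: diagnose(s) for name, s in examples.items()}
print(results)
```

Rule order matters in a scheme like this: checking engagement first means a disengaged student is never labelled with a knowledge problem.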

Tech Stack

Built entirely in Python on Google Colab using open-source libraries and the OULAD dataset.

Python 3.12, pandas, numpy, scikit-learn, XGBoost, matplotlib, OULAD Dataset, Google Colab, HTML/CSS/JS

Key Results

CLASSIFIER ACCURACY