Default Dataset for Final Project (Section 001)
Note: If you do not want to use this dataset, you will need to propose your preferred dataset(s) to the instructor for approval before proceeding with your project. The goal is to select a dataset (or combination of datasets) that is complex enough to support a wide range of methods covered in the course.
The Data: Student Academic Performance
A zip archive containing the CSV datasets can be downloaded here.
A copy of the dataset can be downloaded from Kaggle here:
https://www.kaggle.com/datasets/ayeshasiddiqa123/student-perfirmance
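Once the zip archive is downloaded, a quick first step is to list the CSV files it contains and check their sizes and column names. The sketch below is one possible way to do that with the Python standard library only. The function names `read_csvs_from_zip` and `summarize` are illustrative, and the archive filename in the usage note is a placeholder, since neither the archive name nor the column names are stated on this page.

```python
import csv
import io
import zipfile


def read_csvs_from_zip(zip_path):
    """Return {member_name: list of row dicts} for every .csv file in the archive."""
    tables = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip folders and non-CSV members
            with archive.open(name) as raw:
                # utf-8-sig drops a byte-order mark if the file starts with one
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                tables[name] = list(csv.DictReader(text))
    return tables


def summarize(tables):
    """Return {member_name: (row_count, column_names)}."""
    summary = {}
    for name, rows in tables.items():
        # With no data rows, the column list is left empty
        columns = list(rows[0].keys()) if rows else []
        summary[name] = (len(rows), columns)
    return summary
```

For example, `summarize(read_csvs_from_zip("student_performance.zip"))` (a hypothetical filename) would report the row count and column names of each CSV. All values are read as strings, so numeric columns still need to be converted before modeling.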
What Is It About?
This dataset contains student academic performance data and related demographic and behavioral factors. The goal is to understand which variables are most strongly associated with student performance, and to build predictive models for academic outcomes.
The dataset includes multiple student attributes (such as study habits, attendance, and other background information) and one or more performance-related outcomes that can be used for both regression and classification tasks.
This dataset is well-suited for the final project because it allows students to apply a range of modeling approaches, including linear and nonlinear methods, model evaluation, feature engineering, and cross-validation.
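To make the cross-validation idea concrete, here is a minimal sketch of k-fold cross-validation for a one-predictor linear regression, written with the standard library only. It is not a prescribed project workflow. The helper names are illustrative, and the toy values (a made-up "study hours" predictor and "score" outcome) are synthetic and not taken from the dataset.

```python
import random
import statistics


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k nearly equal folds."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cross_validated_rmse(xs, ys, k=5, seed=0):
    """Average held-out RMSE of the fitted line across k folds."""
    scores = []
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        # Fit on the training rows only, then score on the held-out fold
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_err = [(ys[i] - (a + b * xs[i])) ** 2 for i in fold]
        scores.append(statistics.fmean(sq_err) ** 0.5)
    return statistics.fmean(scores)


# Synthetic toy values for illustration only (not from the dataset)
hours = [float(h) for h in range(20)]
score = [50.0 + 2.0 * h for h in hours]
print(cross_validated_rmse(hours, score, k=5))
```

Because each fold is scored by a model that never saw it during fitting, the averaged error estimates out-of-sample performance rather than training fit. The same loop can compare a linear model against a nonlinear one by swapping out `fit_line`.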
Citation
You should cite the Kaggle dataset source in your project:
Ayesha Siddiqa. Student Performance Dataset. Kaggle.
https://www.kaggle.com/datasets/ayeshasiddiqa123/student-perfirmance
You may also read the Kaggle dataset page to better understand the dataset.