Default Dataset for Final Project (Section 001)

Note: if you do not want to use this dataset, you must propose an alternative dataset (or datasets) to Dr. Guanqun Cao for approval before you start developing your project. The goal is to have a dataset (or a combination of multiple datasets) that is complex enough to accommodate a wide range of the methods covered in class.

The data: Risk Factor Prediction of Chronic Kidney Disease

A CSV version of the dataset can be downloaded here.

The data were originally downloaded from the UC Irvine Machine Learning Repository.

What is it about

Chronic kidney disease (CKD) is a growing medical problem in which kidney function progressively declines and the kidneys are eventually damaged. This dataset consists of real patient data collected at Enam Medical College, Savar, Dhaka, Bangladesh. It contains 28 features and 200 instances.
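Before modeling, it can help to confirm that the file you downloaded roughly matches the stated size (200 instances, 28 features) and to see how much is missing. The sketch below is one minimal way to do that with the Python standard library. The file name `ckd_dataset.csv`, the helper `summarize_csv`, and the choice of blank cells and `?` as missing-value markers are assumptions made for illustration, not properties confirmed by the dataset documentation. The counts you get may also differ slightly from the stated ones depending on how the file stores its header rows and whether the class column is counted as a feature.

```python
import csv
from pathlib import Path

# Hypothetical file name: replace with wherever you saved the downloaded CSV.
CSV_PATH = Path("ckd_dataset.csv")

# Assumption: blank cells and "?" mark missing values; adjust after inspecting the file.
MISSING_MARKERS = {"", "?"}


def summarize_csv(stream):
    """Count data rows, columns, and missing cells per column in a CSV stream."""
    reader = csv.reader(stream)
    header = next(reader)  # the first row is assumed to hold the column names
    missing = {name: 0 for name in header}
    n_rows = 0
    for row in reader:
        if not row:  # skip completely blank lines
            continue
        n_rows += 1
        for name, cell in zip(header, row):
            if cell.strip() in MISSING_MARKERS:
                missing[name] += 1
    return {"n_rows": n_rows, "n_cols": len(header), "missing": missing}


if __name__ == "__main__":
    if CSV_PATH.exists():
        with CSV_PATH.open(newline="") as f:
            summary = summarize_csv(f)
        print(summary["n_rows"], "rows,", summary["n_cols"], "columns")
        print("Missing cells per column:", summary["missing"])
    else:
        print(f"{CSV_PATH} not found; download the CSV first.")
```

If the reported numbers are far from 200 rows, open the file in a text editor and check for extra descriptive rows above or below the data before going further.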

A preliminary analysis of these data was published in Islam et al. (2020).

Citation

You will need to cite the publication below in your project:

  • M. Islam, S. Akter, M. Hossen, Sadia Ahmed Keya, Sadia Afrin Tisha, Shahed Hossain. (2020). [Risk Factor Prediction of Chronic Kidney Disease based on Machine Learning Algorithms](./ProjectData/Chronic kidney disease prediction based on machine learning algorithms.pdf). International Conferences on Information Science and System.

You may also read it to better understand the dataset.