How to do it:

Submission: Submit the GitHub link to your assignment on Canvas.


  1. Use the Adult Census Income dataset. We will predict whether an adult's income is more than 50k. Import the dataset. Partition the data into 80% training and 20% testing.

  2. Practice Decision Tree. Do the following:

  1. Create 3 more trees and compare their testing accuracy. Which tree gives the highest testing accuracy?

  2. Practice Random Forest. Do the following:

  1. Create 3 more forests and compare their testing accuracy. Which forest gives the highest testing accuracy?

  2. What is the best model (in terms of testing accuracy) among all the models (including trees and forests) you have trained?
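
The steps above could be sketched roughly as follows. This is only an illustration in Python with scikit-learn, which the assignment does not prescribe. To keep it runnable without a download, it uses synthetic data from make_classification as a stand-in for the Adult Census Income dataset; with the real data you would load the file yourself and encode the categorical columns before fitting. The function name compare_models, the model names, and the hyperparameter values are hypothetical choices, not required settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, seed=0):
    """Fit a few trees and forests; return {model name: testing accuracy}."""
    # 80% training / 20% testing, keeping the class balance in both parts.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.20, random_state=seed, stratify=y
    )

    # Hypothetical candidates: the trees differ in depth,
    # the forests differ in the number of trees they contain.
    models = {
        "tree_depth_3": DecisionTreeClassifier(max_depth=3, random_state=seed),
        "tree_depth_6": DecisionTreeClassifier(max_depth=6, random_state=seed),
        "tree_depth_10": DecisionTreeClassifier(max_depth=10, random_state=seed),
        "forest_25": RandomForestClassifier(n_estimators=25, random_state=seed),
        "forest_100": RandomForestClassifier(n_estimators=100, random_state=seed),
        "forest_200": RandomForestClassifier(n_estimators=200, random_state=seed),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # score() returns the mean accuracy on the held-out testing data.
        scores[name] = model.score(X_test, y_test)
    return scores


# Stand-in data (an assumption): replace with the encoded census features
# and the binary income label (more than 50k or not).
X, y = make_classification(
    n_samples=1000, n_features=12, n_informative=5, random_state=0
)

scores = compare_models(X, y)
best_name = max(scores, key=scores.get)

# Print the models from highest to lowest testing accuracy.
for name, acc in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {acc:.3f}")
print("Best model by testing accuracy:", best_name)
```

Keeping every model in one dictionary means the trees and forests are all scored on the same testing data, so the final comparison across all models is a single lookup.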