Assignment 4
Instructions: Organize your answers in a Word document and submit it to Canvas. Alternatively, you can photograph or scan your handwritten answers and submit them to Canvas.
Problem 1
Given the following training data, you want to build a regression tree to predict \(y\) from \(x_1\) and \(x_2\).
Points | \(x_1\) | \(x_2\) | \(y\) |
---|---|---|---|
A | 3 | 0 | 3.5 |
B | 2 | 5 | 3.7 |
C | 2 | 3 | 4.0 |
D | 0 | 0 | 2.5 |
E | 0 | 1 | 3.0 |
- Use the residual sum of squares (RSS) to decide which of the following two splits is better:
Split 1: Region 1: \(x_1 \leq 1\), Region 2: \(x_1 > 1\)
Split 2: Region 1: \(x_2 \leq 2\), Region 2: \(x_2 > 2\)
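
As a way to check a hand calculation, the RSS of a candidate split can be sketched in Python. The sketch assumes each region predicts the mean of its training \(y\) values, so a split's RSS is the sum of squared residuals over both regions. The helper name `split_rss` and its parameters are illustrative assumptions, not part of the assignment.

```python
# Training data from the table above, as (x1, x2, y) for points A-E.
train = [
    (3, 0, 3.5),  # A
    (2, 5, 3.7),  # B
    (2, 3, 4.0),  # C
    (0, 0, 2.5),  # D
    (0, 1, 3.0),  # E
]


def split_rss(points, feature, threshold):
    """RSS of one split (hypothetical helper).

    feature is the column index (0 for x1, 1 for x2). Region 1 holds the
    points with value <= threshold, Region 2 the rest. Each region
    predicts the mean y of the points it contains.
    """
    region_1 = [p[2] for p in points if p[feature] <= threshold]
    region_2 = [p[2] for p in points if p[feature] > threshold]
    rss = 0.0
    for ys in (region_1, region_2):
        if ys:  # skip an empty region
            mean_y = sum(ys) / len(ys)
            rss += sum((y - mean_y) ** 2 for y in ys)
    return rss


rss_split_1 = split_rss(train, feature=0, threshold=1)  # x1 <= 1 vs x1 > 1
rss_split_2 = split_rss(train, feature=1, threshold=2)  # x2 <= 2 vs x2 > 2
print("Split 1 RSS:", rss_split_1)
print("Split 2 RSS:", rss_split_2)
```

The split with the smaller RSS is the better one under this criterion.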
Suppose that your regression tree contains only one split, namely the better split from the previous question. Calculate the \(R^2\) of this regression tree on the training data.
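
For reference, \(R^2\) is commonly defined from the residual and total sums of squares. The symbols \(\hat{y}_i\) (the tree's prediction for point \(i\)) and \(\bar{y}\) (the mean of the observed \(y\) values in the data set being evaluated) are introduced here for illustration and do not appear in the assignment.

\[
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
\]

For a tree with a single split, \(\hat{y}_i\) is the mean training \(y\) of the region that contains point \(i\).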
Use your regression tree to predict \(y\) for the test data below, then calculate the \(R^2\) of the tree on this test data.
Points | \(x_1\) | \(x_2\) | \(y\) |
---|---|---|---|
F | 3 | 1 | 3.0 |
G | 1 | 5 | 3.6 |
H | 2 | 1 | 4.0 |
I | 0 | 2 | 3.9 |
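
The prediction and \(R^2\) steps can be checked with a minimal Python sketch. It assumes a one-split tree that predicts the mean training \(y\) of each region, and it assumes that \(R^2\) on the test data uses the mean of the test \(y\) values as the baseline; some courses use the training mean instead, so confirm which convention applies. The function names (`fit_stump`, `predict`, `r_squared`, `evaluate`) are hypothetical.

```python
# Data from the two tables above, as (x1, x2, y).
train = [(3, 0, 3.5), (2, 5, 3.7), (2, 3, 4.0), (0, 0, 2.5), (0, 1, 3.0)]  # A-E
test = [(3, 1, 3.0), (1, 5, 3.6), (2, 1, 4.0), (0, 2, 3.9)]                # F-I


def fit_stump(points, feature, threshold):
    """Return the mean training y of Region 1 (<= threshold) and Region 2."""
    region_1 = [p[2] for p in points if p[feature] <= threshold]
    region_2 = [p[2] for p in points if p[feature] > threshold]
    return sum(region_1) / len(region_1), sum(region_2) / len(region_2)


def predict(region_means, feature, threshold, point):
    """Predict y for one point from the region it falls into."""
    mean_1, mean_2 = region_means
    return mean_1 if point[feature] <= threshold else mean_2


def r_squared(points, predictions):
    """R^2 = 1 - RSS / TSS, with TSS taken around the mean y of `points`."""
    ys = [p[2] for p in points]
    mean_y = sum(ys) / len(ys)
    rss = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, predictions))
    tss = sum((y - mean_y) ** 2 for y in ys)
    return 1 - rss / tss


def evaluate(feature, threshold):
    """Fit on the training data, then return (train R^2, test R^2)."""
    means = fit_stump(train, feature, threshold)
    train_pred = [predict(means, feature, threshold, p) for p in train]
    test_pred = [predict(means, feature, threshold, p) for p in test]
    return r_squared(train, train_pred), r_squared(test, test_pred)


# Evaluate both candidate splits; report the one chosen by RSS.
for name, (feature, threshold) in {"Split 1": (0, 1), "Split 2": (1, 2)}.items():
    train_r2, test_r2 = evaluate(feature, threshold)
    print(name, "train R^2:", train_r2, "test R^2:", test_r2)
```

Note that \(R^2\) on test data can be negative, which happens when the tree's predictions fit the test points worse than the baseline mean does.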