How to do it?:
Open the Rmarkdown file of this assignment (link) in Rstudio.
Right under each question, insert a code chunk
(you can use the hotkey Ctrl + Alt + I
to add a code chunk)
and code the solution for the question.
Knit
the rmarkdown file (hotkey:
Ctrl + Alt + K
) to export an html.
Publish the html file to your Githiub Page.
Submission: Submit the link on Github of the assignment to Canvas
Input: a number, x, (year born)
Output: Print out “You are r age”. Where r is the age of the person, i.e. 2020 - x.
Hint: Similar Function
Input: a number
Output: print out: “You input an even number!” if the number is event, or “You input an odd number!” otherwise.
Hint: Similar Function
Input: a numeric vector
Output:
if the input vector has missing values: return the input vector with missing values replaced by mean
if the input vector has no missing value: return the same input vector
Hint: Similar Function
Input: a vector x
Output: The vector x where the missing values replaced by the mean (if x is numeric) or the mode (if x is non-numeric). If x does not have missing value, return the same vector x.
Hint: Use If-statement to combine the function in Question 3 and this function
Input: A data frame of two variables x and y
Output:
A boxplot of x by y if x is numeric and y is non-numeric
A boxplot of y by x if y is numeric and x is non-numeric
print out ‘This function cannot visualize your data’ otherwise
Hint:
You can refer to this slide to plot a boxplot: https://bryantstats.github.io/math421/slides/6_viz.html#36
Input:
input_data: a clean data frame with a variable name
target
. The target
variable is also
binary.
train_percent: a number presenting a proportion of training data. The default train_percent is .8
Output: the accuracy of the decision model rpart
where the training data is train_percent.
Input:
input_data: a clean data frame with a variable name
target
. The target
variable is also
binary.
train_percent: a number presenting a proportion of training data. The default train_percent is .8
Output: the plot of variable important by random forest trained by caret.
Input:
a data frame that has a text column
the name of the text column in the data
Output: the word cloud plot of the text column
Sample codes