Assignment 4 - Extra Credits

How to do it:

Open the R Markdown file of this assignment (link) in RStudio.
Directly under each question, insert a code chunk (hotkey: Ctrl + Alt + I) and write the code that answers the question.
Note that if eval=FALSE appears in the first line of a code chunk, the chunk will not be executed.
Knit the R Markdown file (hotkey: Ctrl + Shift + K) to export an HTML file.
Publish the HTML file to your GitHub Pages site.

Submission: Submit the GitHub link of your assignment to Canvas.

This assignment works with the IMDB Top 1000 data. Find out more information about this data at this link. Import the data and answer the following questions.
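
The import step can be sketched as follows. This is only an illustration: the file name imdb_top_1000.csv is hypothetical, the column names Series_Title and Released_Year are assumptions, and the three rows are invented so that the snippet runs without the real file.

```r
library(readr)   # read_csv()

# Toy stand-in for the real data. The rows and values are invented
# for illustration; Series_Title and Released_Year are assumed names.
toy_csv <- "Series_Title,Released_Year,IMDB_Rating,Gross
Movie A,1994,9.1,1000
Movie B,2010,8.2,2500
Movie C,2010,7.9,
"

# With the real data, replace I(toy_csv) with the path to your file,
# e.g. read_csv("imdb_top_1000.csv") (hypothetical file name).
df <- read_csv(I(toy_csv), show_col_types = FALSE)

names(df)   # column names
dim(df)     # number of rows and columns
```

Empty fields, such as the missing Gross value in the last toy row, are read as NA, which is why several of the questions below may need na.rm = TRUE.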

List the names of all the columns in the data.
Which movies have the highest money earned (Gross)?
What is the lowest rating (IMDB_Rating)? List five movies that have this lowest rating.
Which year has the most movies released in the list? What is the total money earned in that year?
What is the average money earned per movie?
Calculate the average number of votes by year. Calculate the average number of votes of movies that have an IMDB rating greater than 9.
Calculate the average Meta score in 2020 of movies whose number of votes is in the third quartile.
(Optional - Challenging). The current Runtime variable is not numeric. Use the str_remove function to remove min from the variable, then use as.numeric to convert the variable to numeric. Calculate the average running time in the 2010s. Calculate the correlation between running time and rating (add use="complete.obs" in the cor function to ignore the missing values).
We can use select_if to select columns satisfying a condition and summarise_if to do calculations on columns satisfying a condition. Try the following to understand these functions.

# Select only character columns
df %>% select_if(is.character)

# Calculate the mean of all numeric columns
df %>% summarise_if(is.numeric, mean, na.rm=TRUE)
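
To see how these helpers interact with the Runtime conversion from the optional question, here is a small sketch on invented data. The Title column and all values are assumptions made up for illustration. It also shows the across() and where() spelling, which newer versions of dplyr recommend in place of the _if variants.

```r
library(dplyr)
library(stringr)

# Toy data, invented for illustration. Title is an assumed column name.
toy <- tibble(
  Title       = c("Movie A", "Movie B", "Movie C", "Movie D"),
  Runtime     = c("142 min", "175 min", NA, "96 min"),
  IMDB_Rating = c(9.0, 8.5, 8.0, 7.5)
)

# Runtime is character here, so a numeric-only summary skips it.
toy %>% summarise_if(is.numeric, mean, na.rm = TRUE)

# Strip the " min" suffix, then convert to numeric (NA stays NA).
toy2 <- toy %>%
  mutate(Runtime = as.numeric(str_remove(Runtime, " min")))

# Runtime now appears in the numeric summary.
means_old <- toy2 %>% summarise_if(is.numeric, mean, na.rm = TRUE)

# Same calculation written with across() and where().
means_new <- toy2 %>%
  summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))

# Select only character columns, in both spellings.
toy2 %>% select_if(is.character)
toy2 %>% select(where(is.character))

# Correlation that ignores the row with a missing Runtime.
r <- cor(toy2$Runtime, toy2$IMDB_Rating, use = "complete.obs")
```

Without use = "complete.obs", cor would return NA here because one Runtime value is missing.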

Implement the following functions or combos. Draw a comment or summary from each calculation. The code in this question should be different from the code used in other questions.

select
filter
mutate
summarise
arrange
count
count + arrange
filter + count + arrange
group_by + summarise
filter + group_by + summarise
filter + group_by + summarise + arrange
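
As a rough illustration of how such combos chain together, the sketch below uses invented data. The Genre and Released_Year column names and all values are assumptions, so treat it as a pattern to adapt rather than an answer for the real data.

```r
library(dplyr)

# Toy data, invented for illustration. Genre and Released_Year are
# assumed column names.
toy <- tibble(
  Genre         = c("Drama", "Drama", "Action", "Action", "Comedy"),
  Released_Year = c(1994, 2010, 2010, 2014, 2001),
  IMDB_Rating   = c(9.0, 8.0, 8.5, 8.1, 7.0)
)

# count + arrange: movies per genre, largest group first.
# Genre is a second sort key so that ties have a defined order.
genre_counts <- toy %>%
  count(Genre) %>%
  arrange(desc(n), Genre)

# filter + group_by + summarise + arrange: average rating per genre
# for movies released in 2000 or later, highest average first.
genre_ratings <- toy %>%
  filter(Released_Year >= 2000) %>%
  group_by(Genre) %>%
  summarise(mean_rating = mean(IMDB_Rating), n_movies = n()) %>%
  arrange(desc(mean_rating))

genre_counts
genre_ratings
```

Each step receives the data frame produced by the previous one, so the order matters: filtering before group_by changes which rows enter the averages.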