Note: This assignment practices working with data frames using base R.
How to do it:
Open the R Markdown file of this assignment (link) in RStudio.
Right under each question, insert a code chunk (you can use the
hotkey Ctrl + Alt + I to add a code chunk) and code the
solution for the question.
Knit the R Markdown file (hotkey:
Ctrl + Shift + K) to export an HTML file.
Publish the HTML file to your GitHub Page.
Submission: Submit the GitHub link to the assignment on Canvas under Assignment 3.
| Rank | Age | Name |
|---|---|---|
| 0 | 28 | Tom |
| 1 | 34 | Jack |
| 2 | 29 | Steve |
| 3 | 42 | Ricky |
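
As a minimal sketch, a table like the one above can be built with the base R `data.frame` function, one vector per column. The object name `df` is an assumption made for illustration.

```r
# Each named vector becomes one column of the data frame.
df <- data.frame(
  Rank = 0:3,
  Age  = c(28, 34, 29, 42),
  Name = c("Tom", "Jack", "Steve", "Ricky")
)

print(df)  # show the rows
str(df)    # show the type of each column
```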
Use read.csv to import the Covid19 Vaccination data
from WHO: link.
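
The sketch below shows how `read.csv` is typically called. To keep it self-contained it reads from an inline string through the `text` argument. The column names (`date`, `state`, `cases`, `death`) and the values are made up for illustration and may not match the real file.

```r
# Hypothetical CSV content standing in for the downloaded file.
csv_text <- "date,state,cases,death
2021-06-01,Rhode Island,10,0
2021-06-02,Rhode Island,NA,1
2021-06-01,Texas,25,2"

# With a real file, pass its path or URL as the first argument
# instead of using `text =`.
df <- read.csv(text = csv_text)

head(df)  # first rows of the imported data
```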
Show the names of the variables in the data
How many columns and rows does the data have?
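
One possible way to inspect variable names and dimensions, shown on a small made-up data frame (the columns are assumptions, not the real data):

```r
# Toy data frame used only for illustration.
df <- data.frame(
  date  = c("2021-06-01", "2021-06-02"),
  state = c("Rhode Island", "Texas"),
  cases = c(10, 25)
)

names(df)  # variable (column) names
dim(df)    # number of rows, then number of columns
nrow(df)   # rows only
ncol(df)   # columns only
```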
How many missing values are there? Show the missing values by column. Which variable has the most missing values?
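
A common base R approach is to combine `is.na` with `sum` and `colSums`. The sketch below uses a toy data frame with invented values.

```r
# Toy data with a few missing values (NA).
df <- data.frame(
  cases = c(10, NA, 25, NA),
  death = c(0, 1, NA, 2),
  state = c("RI", "RI", "TX", "TX")
)

total_na  <- sum(is.na(df))              # missing values in the whole data frame
na_by_col <- colSums(is.na(df))          # missing values per column
most_na   <- names(which.max(na_by_col)) # column with the most missing values

total_na
na_by_col
most_na
```

Note that `which.max` returns only the first column when several columns tie for the maximum.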
What is the class of the date column? Change the
date column to date type using the
as.Date function. Show the new class of the
date column.
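
A minimal sketch of the conversion, assuming the dates are stored as text in `YYYY-MM-DD` form (the real file may use another format, in which case the `format` string must change):

```r
# Toy data; stringsAsFactors = FALSE keeps the dates as plain character strings.
df <- data.frame(
  date  = c("2021-06-01", "2021-06-02"),
  cases = c(10, 25),
  stringsAsFactors = FALSE
)

class(df$date)  # class before the conversion

# Convert the text to Date objects.
df$date <- as.Date(df$date, format = "%Y-%m-%d")

class(df$date)  # class after the conversion
```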
Capitalize the names of all the variables
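
If "capitalize" is read as converting every name to upper case (an assumption; capitalizing only the first letter is another reading), `toupper` on `names` is one option:

```r
# Toy data frame with lower-case column names.
df <- data.frame(date = c("2021-06-01", "2021-06-02"), cases = c(10, 25))

# Replace the column names with their upper-case versions.
names(df) <- toupper(names(df))

names(df)
```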
Find the average number of cases per day. Find the maximum cases a day.
How many states are there in the data?
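
The two questions above on averages, maxima, and the number of states can be sketched with `mean`, `max`, and `unique`. The data and the column names `cases` and `state` are invented for illustration.

```r
# Toy data; one value is missing on purpose.
df <- data.frame(
  state = c("Rhode Island", "Rhode Island", "Texas", "Texas"),
  cases = c(10, 30, NA, 20)
)

avg_cases <- mean(df$cases, na.rm = TRUE)  # na.rm = TRUE skips missing values
max_cases <- max(df$cases, na.rm = TRUE)
n_states  <- length(unique(df$state))      # number of distinct states

avg_cases
max_cases
n_states
```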
Create a new variable weekdays to store the weekday
for each row.
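
Base R has a `weekdays` function that works on Date objects. A small sketch, assuming the `date` column has already been converted with `as.Date`:

```r
# Toy data with dates already stored as Date objects.
df <- data.frame(date = as.Date(c("2021-06-01", "2021-06-05")))

# weekdays() returns the day name for each date;
# the language of the names depends on the system locale.
df$weekdays <- weekdays(df$date)

df
```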
Create the categorical variable death2
taking the values as follows
has_death if there is a death that day
no_death if there is no death that day
Find the frequency and relative frequency of no_death
and has_death.
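
One way to sketch this uses `ifelse` for the new variable, `table` for frequencies, and `prop.table` for relative frequencies. The `death` values below are made up.

```r
# Toy data: number of deaths on five days.
df <- data.frame(death = c(0, 3, 0, 1, 5))

# Label each day according to whether any death was recorded.
df$death2 <- ifelse(df$death > 0, "has_death", "no_death")

freq     <- table(df$death2)  # counts per category
rel_freq <- prop.table(freq)  # proportions per category

freq
rel_freq
```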
Find the first quartile (Q1), second quartile (Q2) and third
quartile (Q3) of the variable death. (Hint: Use the
summary function)
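
As an illustration on invented values, `summary` prints the quartiles together with the minimum, mean, and maximum, while `quantile` returns only the requested ones:

```r
# Toy vector standing in for the death variable.
death <- c(0, 1, 2, 3, 4, 5, 6, 7, 8)

summary(death)  # Min, 1st Qu., Median, Mean, 3rd Qu., Max

# Q1, Q2 and Q3 as a numeric vector (default quantile type 7).
q <- quantile(death, probs = c(0.25, 0.5, 0.75))
q
```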
Create the categorical variable death3
taking the values as follows
low_death if the number of deaths is smaller than the
25th percentile (Q1)
mid_death if the number of deaths is from Q1 to
Q3
high_death if the number of deaths is greater than
Q3
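
The three rules above can be sketched with nested `ifelse` calls. The data are invented, and treating both Q1 and Q3 as part of `mid_death` is an assumption about how the boundaries should be handled.

```r
# Toy data standing in for the death variable.
df <- data.frame(death = c(0, 1, 2, 3, 4, 5, 6, 7, 8))

# names = FALSE returns plain numbers without the "25%" / "75%" labels.
q1 <- quantile(df$death, 0.25, names = FALSE)
q3 <- quantile(df$death, 0.75, names = FALSE)

# Below Q1 -> low, Q1 up to and including Q3 -> mid, above Q3 -> high.
df$death3 <- ifelse(df$death < q1, "low_death",
             ifelse(df$death <= q3, "mid_death", "high_death"))

counts <- table(df$death3)
counts
```

The base R function `cut` is an alternative when there are many categories.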
Find the average cases in Rhode Island in 2021
Find the median cases by weekdays in Rhode Island in 2021
Compare the median cases in Rhode Island in June, July, August and September in 2021.
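
The three Rhode Island questions above share one pattern: filter rows with a logical condition, then summarize, optionally by group with `tapply`. The sketch below uses made-up rows and assumes columns named `date`, `state`, and `cases`.

```r
# Toy data; only four rows are Rhode Island in 2021.
df <- data.frame(
  date  = as.Date(c("2020-12-31", "2021-06-01", "2021-06-15",
                    "2021-07-01", "2021-07-20", "2021-06-01")),
  state = c("Rhode Island", "Rhode Island", "Rhode Island",
            "Rhode Island", "Rhode Island", "Texas"),
  cases = c(100, 10, 30, 40, 60, 999)
)

# Keep Rhode Island rows whose year is 2021.
ri_2021 <- df[df$state == "Rhode Island" & format(df$date, "%Y") == "2021", ]

avg_ri <- mean(ri_2021$cases)  # average cases in Rhode Island in 2021

# Median cases by weekday.
ri_2021$weekdays <- weekdays(ri_2021$date)
median_by_weekday <- tapply(ri_2021$cases, ri_2021$weekdays, median)

# Median cases by month ("06" = June, "07" = July, and so on).
ri_2021$month <- format(ri_2021$date, "%m")
median_by_month <- tapply(ri_2021$cases, ri_2021$month, median)

avg_ri
median_by_weekday
median_by_month
```

To compare only June through September, the month vector can be filtered first, for example with `ri_2021$month %in% c("06", "07", "08", "09")`.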
Find your own dataset, import it and implement the following functions on the data
If you do not have a dataset, you can use the titanic dataset, which can be downloaded at this link.