Note: This assignment practices working with data frames using base R.
How to do it:
Open the R Markdown file of this assignment (link) in RStudio.
Right under each question, insert a code chunk (you can use the hotkey Ctrl + Alt + I to add a code chunk) and code the solution for the question.
Knit the R Markdown file (hotkey: Ctrl + Shift + K) to export an HTML file.
Publish the HTML file to your GitHub Pages site.
Submission: Submit the GitHub link to the assignment on Canvas under Assignment 3.
Rank | Age | Name |
---|---|---|
0 | 28 | Tom |
1 | 34 | Jack |
2 | 29 | Steve |
3 | 42 | Ricky |
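A table like the one above can be built in base R with `data.frame()`. The following is a minimal sketch, assuming the column names and values shown in the table; the object name `df` is only illustrative.

```r
# Build the data frame column by column; names and values follow the table above.
df <- data.frame(
  Rank = 0:3,
  Age  = c(28, 34, 29, 42),
  Name = c("Tom", "Jack", "Steve", "Ricky"),
  stringsAsFactors = FALSE   # keep Name as character in older R versions too
)

str(df)   # compact view of the column types
df        # print the whole data frame
```

Each argument to `data.frame()` becomes one column, so all vectors must have the same length.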
Use read.csv to import the Covid19 Vaccination data from WHO: link.
Show the names of the variables in the data
How many columns and rows does the data have?
How many missing values are there? Show the missing values by column. Which variable has the largest number of missing values?
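The import and inspection steps above might look like the sketch below. Because the real file is behind a link that is not reproduced here, the example reads a small inline CSV through the `text` argument of `read.csv`; the data and the column names (`date`, `state`, `cases`, `deaths`) are hypothetical stand-ins, not the actual dataset.

```r
# Toy CSV standing in for the real file; values and column names are hypothetical.
csv_text <- "date,state,cases,deaths
2021-06-01,Rhode Island,10,0
2021-06-02,Rhode Island,NA,1
2021-06-01,Ohio,25,NA
2021-06-02,Ohio,NA,2"

# With a real file, replace `text = csv_text` by the file path or URL.
df <- read.csv(text = csv_text)

names(df)                          # variable names
dim(df)                            # number of rows, then number of columns
sum(is.na(df))                     # total number of missing values

na_by_col <- colSums(is.na(df))    # missing values counted per column
na_by_col
names(which.max(na_by_col))        # column with the most missing values
```

`is.na(df)` returns a logical matrix, so `sum()` counts every missing cell and `colSums()` counts them column by column.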
What is the class of the date column? Change the date column to date type using the as.Date function. Show the new class of the date column.
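A short sketch of the conversion, assuming the dates are stored as text in `YYYY-MM-DD` form (the toy data frame and the format string are assumptions; adjust `format` if the real file uses another layout):

```r
# Toy data; the date column starts out as plain text.
df <- data.frame(
  date  = c("2021-06-01", "2021-06-02"),
  cases = c(10, 12),
  stringsAsFactors = FALSE
)

class(df$date)                                     # class before conversion

df$date <- as.Date(df$date, format = "%Y-%m-%d")   # parse text into Date objects

class(df$date)                                     # class after conversion
```

Once the column is of class `Date`, date arithmetic and functions such as `weekdays()` and `format()` work on it directly.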
Capitalize the names of all the variables
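"Capitalize" can be read in two ways, so this sketch shows both: converting the names to all upper case, and capitalizing only the first letter. The toy data frame and the helper `first_cap` are hypothetical, introduced only for illustration.

```r
df <- data.frame(date = "2021-06-01", state = "Ohio", cases = 25)

# Reading 1: every letter in upper case.
names(df) <- toupper(names(df))
names(df)

# Reading 2: only the first letter in upper case (hypothetical helper).
first_cap <- function(x) {
  paste0(toupper(substring(x, 1, 1)), substring(x, 2))
}
first_cap(c("date", "state", "cases"))
```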
Find the average number of cases per day. Find the maximum number of cases in a day.
How many states are there in the data?
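These summaries can be sketched with `mean`, `max`, and `unique`. The example assumes one row per state per day and hypothetical column names `state` and `cases`; `na.rm = TRUE` is included because the real data may contain missing values.

```r
# Toy data with one missing value; column names are hypothetical.
df <- data.frame(
  state = c("Ohio", "Ohio", "Rhode Island", "Texas"),
  cases = c(25, NA, 10, 40),
  stringsAsFactors = FALSE
)

avg_cases <- mean(df$cases, na.rm = TRUE)   # average over the non-missing rows
max_cases <- max(df$cases, na.rm = TRUE)    # largest single-row value
n_states  <- length(unique(df$state))       # number of distinct states

avg_cases
max_cases
n_states
```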
Create a new variable weekdays to store the weekday for each row.
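One way to do this is the base R function `weekdays()`, which expects a `Date` column. A minimal sketch on toy dates (the day names it returns depend on the session locale):

```r
# Toy data; the date column must already be of class Date.
df <- data.frame(date = as.Date(c("2021-06-01", "2021-06-05", "2021-06-06")))

df$weekdays <- weekdays(df$date)   # one weekday name per row
df
```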
Create the categorical variable death2 taking the values as follows:
has_death if there is a death that day
no_death if there is no death that day
Find the frequency and relative frequency of no_death and has_death.
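A two-level variable like this can be sketched with `ifelse`, and the frequencies with `table` and `prop.table`. The toy values are made up, and the column name `death` is taken from the question text.

```r
# Toy data; `death` holds the number of deaths on each day.
df <- data.frame(death = c(0, 2, 0, 5, 1))

# Label each row according to whether any death was recorded.
df$death2 <- ifelse(df$death > 0, "has_death", "no_death")

freq <- table(df$death2)       # frequency of each category
rel_freq <- prop.table(freq)   # relative frequency (proportions summing to 1)

freq
rel_freq
```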
Find the first quartile (Q1), second quartile (Q2) and third quartile (Q3) of the variable death. (Hint: Use the summary function)
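As a sketch of the hint, `summary` prints the quartiles together with the minimum, mean, and maximum, while `quantile` returns them as a vector that can be reused later. The toy vector is made up; `quantile` is shown with its default interpolation method.

```r
death <- c(0, 1, 2, 3, 4, 5, 6, 7, 8)   # toy values

summary(death)   # Min, 1st Qu., Median, Mean, 3rd Qu., Max

# Q1, Q2 (median) and Q3 as a named numeric vector.
q <- quantile(death, probs = c(0.25, 0.50, 0.75), na.rm = TRUE)
q
```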
Create the categorical variable death3 taking the values as follows:
low_death if the number of deaths is smaller than the 25th percentile (Q1)
mid_death if the number of deaths is from Q1 to Q3
high_death if the number of deaths is greater than Q3
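The three-level variable can be sketched with nested `ifelse` calls and the quartiles from `quantile`. The toy data are made up; values exactly equal to Q1 or Q3 are placed in `mid_death`, which is one reading of "from Q1 to Q3".

```r
# Toy data; `death` holds the number of deaths on each day.
df <- data.frame(death = c(0, 1, 2, 3, 4, 5, 6, 7, 8))

q1 <- quantile(df$death, 0.25, na.rm = TRUE)   # first quartile
q3 <- quantile(df$death, 0.75, na.rm = TRUE)   # third quartile

# Below Q1 is low, between Q1 and Q3 (inclusive) is mid, above Q3 is high.
df$death3 <- ifelse(df$death < q1, "low_death",
                    ifelse(df$death <= q3, "mid_death", "high_death"))

tab <- table(df$death3)
tab
```

`cut()` with `breaks = c(-Inf, q1, q3, Inf)` is a common alternative, but its interval boundaries need care to match the inclusive middle category used here.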
Find the average number of cases in Rhode Island in 2021.
Find the median number of cases by weekday in Rhode Island in 2021.
Compare the median number of cases in Rhode Island in June, July, August and September of 2021.
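The three Rhode Island questions share one pattern: filter the rows with logical indexing, then summarise by group with `tapply`. The sketch below uses made-up data and hypothetical column names (`date`, `state`, `cases`); the helper columns `year` and `month` are also assumptions added for the filtering.

```r
# Toy data; values and column names are hypothetical.
df <- data.frame(
  date  = as.Date(c("2021-06-01", "2021-06-08", "2021-07-06",
                    "2021-07-07", "2020-12-31", "2021-06-01")),
  state = c(rep("Rhode Island", 5), "Ohio"),
  cases = c(10, 20, 30, 50, 99, 70),
  stringsAsFactors = FALSE
)

df$weekdays <- weekdays(df$date)       # weekday name (locale dependent)
df$year     <- format(df$date, "%Y")   # four-digit year as text
df$month    <- format(df$date, "%m")   # two-digit month as text

# Keep only Rhode Island rows from 2021.
ri21 <- df[df$state == "Rhode Island" & df$year == "2021", ]

avg_ri21 <- mean(ri21$cases, na.rm = TRUE)   # average cases

# Median cases for each weekday.
by_weekday <- tapply(ri21$cases, ri21$weekdays, median, na.rm = TRUE)

# Median cases for June to September, one value per month.
summer   <- ri21[ri21$month %in% c("06", "07", "08", "09"), ]
by_month <- tapply(summer$cases, summer$month, median, na.rm = TRUE)

avg_ri21
by_weekday
by_month
```

Months that have no rows in the filtered data simply do not appear in the `tapply` result.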
Find your own dataset, import it and implement the following functions on the data
If you do not have a dataset, you can use the Titanic dataset, which can be downloaded at this link.