Note: This assignment practices working with data frames using base R.

How to do it?:

Submission: Submit the GitHub link to your assignment on Canvas under Assignment 3.


Problems


  1. Create the following data frame:

     Rank  Age  Name
     0     28   Tom
     1     34   Jack
     2     29   Steve
     3     42   Ricky
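
As a hint, a data frame like the one above can be built with data.frame, passing one vector per column. This is only a minimal sketch; the object name df is an arbitrary choice, not something the assignment requires.

```r
# Build the data frame column by column; each vector becomes one column.
df <- data.frame(
  Rank = 0:3,
  Age  = c(28, 34, 29, 42),
  Name = c("Tom", "Jack", "Steve", "Ricky"),
  stringsAsFactors = FALSE  # keep Name as character (the default in R >= 4.0)
)

print(df)  # show the table
str(df)    # show the type of each column
```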
  1. Use read.csv to import the Covid19 Vaccination data from WHO: link.

  2. Show the names of the variables in the data

  3. How many columns and rows does the data have?

  4. How many missing values are there? Show the number of missing values by column. Which variable has the most missing values?

  5. What is the class of the date column? Change the date column to the Date type using the as.Date function. Show the new class of the date column.

  6. Capitalize the names of all the variables

  7. Find the average number of cases per day. Find the maximum number of cases in a day.

  8. How many states are there in the data?

  9. Create a new variable weekdays to store the weekday for each row.

  10. Create the categorical variable death2 taking the values as follows

Find the frequency and relative frequency of no_death and has_death.
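
Hint: the base R tools needed for the problems up to this point can be tried on a small made-up table before running them on the real data. The sketch below is not the required solution. The object name covid, the column names (date, state, cases, deaths) and the toy values are assumptions for illustration, and the rule used for death2 (has_death when deaths is greater than 0, otherwise no_death) is also an assumption, since the definition is not spelled out here.

```r
# Hypothetical stand-in for the real file: a tiny CSV with ASSUMED column names.
csv_text <- "date,state,cases,deaths
2021-06-01,Rhode Island,10,0
2021-06-02,Rhode Island,NA,1
2021-06-01,Texas,30,NA
2021-06-02,Texas,40,NA
2021-06-03,Texas,50,2"

# read.csv accepts a file path or URL; text = lets this sketch run offline.
covid <- read.csv(text = csv_text, stringsAsFactors = FALSE)

names(covid)                                   # variable names
dim(covid)                                     # number of rows, then columns

n_missing_total  <- sum(is.na(covid))          # all missing values
n_missing_by_col <- colSums(is.na(covid))      # missing values per column
most_missing     <- names(which.max(n_missing_by_col))  # first column on ties

class(covid$date)                              # "character" before conversion
covid$date <- as.Date(covid$date)              # yyyy-mm-dd needs no format argument
class(covid$date)                              # "Date" after conversion

# toupper upper-cases the whole name (one reading of "capitalize").
names(covid) <- toupper(names(covid))

avg_cases <- mean(covid$CASES, na.rm = TRUE)   # average, ignoring NA
max_cases <- max(covid$CASES, na.rm = TRUE)    # maximum, ignoring NA
n_states  <- length(unique(covid$STATE))       # number of distinct states

covid$WEEKDAYS <- weekdays(covid$DATE)         # day names depend on the locale

# ASSUMED rule for death2; rows with missing deaths stay NA.
covid$DEATH2 <- ifelse(covid$DEATHS > 0, "has_death", "no_death")
freq     <- table(covid$DEATH2)                # frequency (NA rows are dropped)
rel_freq <- prop.table(freq)                   # relative frequency
```

Passing na.rm = TRUE matters here: without it, mean and max return NA as soon as a column contains a missing value.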

  11. Find the first quartile (Q1), second quartile (Q2) and third quartile (Q3) of the variable death. (Hint: Use the summary function)

  12. Create the categorical variable death3 taking the values as follows
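
Hint: quartiles can be read off summary or computed directly with quantile, and a numeric variable can be split into categories with cut. The sketch below is illustrative only. The vector death holds made-up values, and the cut points (Q1 and Q3) and the labels low_death, mid_death and high_death are assumptions, because the intended definition of death3 is not shown here.

```r
# Hypothetical death counts standing in for the real column.
death <- c(0, 0, 1, 2, 3, 5, 8, 13, 21)

summary(death)  # reports Min, 1st Qu., Median, Mean, 3rd Qu., Max

# quantile returns the same quartiles as a named vector ("25%", "50%", "75%").
q <- quantile(death, probs = c(0.25, 0.5, 0.75), na.rm = TRUE)

# ASSUMED categories: at or below Q1, between Q1 and Q3, above Q3.
# cut needs strictly increasing breaks, so this fails if Q1 equals Q3.
death3 <- cut(
  death,
  breaks = c(-Inf, q[["25%"]], q[["75%"]], Inf),
  labels = c("low_death", "mid_death", "high_death")
)

table(death3)  # number of observations in each category
```

By default cut uses intervals that are closed on the right, so a value exactly equal to Q1 falls into the lowest category.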

  13. Find the average number of cases in Rhode Island in 2021

  14. Find the median cases by weekday in Rhode Island in 2021

  15. Compare the median cases in Rhode Island in June, July, August and September in 2021.
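
Hint: the Rhode Island questions come down to subsetting rows with a logical condition and then summarising by group. The sketch below uses a made-up data frame; the object names (covid, ri21, summer) and the column names (date, state, cases) are assumptions for illustration and may differ from the real data.

```r
# Hypothetical rows standing in for the real data.
covid <- data.frame(
  date  = as.Date(c("2021-06-01", "2021-06-08", "2021-07-06", "2021-08-03",
                    "2021-09-07", "2020-12-29", "2021-06-01")),
  state = c(rep("Rhode Island", 6), "Texas"),
  cases = c(10, 20, 30, 40, 50, 999, 500)
)

# Keep only Rhode Island rows dated in 2021.
ri21 <- covid[covid$state == "Rhode Island" &
                format(covid$date, "%Y") == "2021", ]

# Average cases in Rhode Island in 2021.
avg_ri <- mean(ri21$cases, na.rm = TRUE)

# Median cases for each weekday (day names depend on the locale).
ri21$weekday <- weekdays(ri21$date)
med_by_weekday <- aggregate(cases ~ weekday, data = ri21, FUN = median)

# Median cases for June to September, keyed by two-digit month number.
ri21$month <- format(ri21$date, "%m")
summer <- ri21[ri21$month %in% c("06", "07", "08", "09"), ]
med_by_month <- tapply(summer$cases, summer$month, median, na.rm = TRUE)
med_by_month
```

Using format with "%Y" and "%m" keeps the year and month comparisons independent of the locale, unlike month names.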

  16. Find your own dataset, import it, and apply the following functions to the data

  17. On the dataset from #16, practice the following. You can reuse the code from #16.

If you do not have a dataset, you can use the Titanic dataset, which can be downloaded at this link.