class: center, middle, inverse, title-slide

# Writing Functions - Loop
### Son Nguyen
---

<style>
.remark-slide-content { background-color: #FFFFFF; border-top: 80px solid #F9C389; font-size: 14px; font-weight: 300; line-height: 1.5; padding: 1em 1em 1em 1em }
.inverse { background-color: #696767; border-top: 80px solid #696767; text-shadow: none; background-image: url(https://github.com/goodekat/presentations/blob/master/2019-isugg-gganimate-spooky/figures/spider.png?raw=true); background-position: 50% 75%; background-size: 150px; }
.your-turn{ background-color: #8C7E95; border-top: 80px solid #F9C389; text-shadow: none; background-image: url(https://github.com/goodekat/presentations/blob/master/2019-isugg-gganimate-spooky/figures/spider.png?raw=true); background-position: 95% 90%; background-size: 75px; }
.title-slide { background-color: #F9C389; border-top: 80px solid #F9C389; background-image: none; }
.title-slide > h1 { color: #111111; font-size: 40px; text-shadow: none; font-weight: 400; text-align: left; margin-left: 15px; padding-top: 80px; }
.title-slide > h2 { margin-top: -25px; padding-bottom: -20px; color: #111111; text-shadow: none; font-weight: 300; font-size: 35px; text-align: left; margin-left: 15px; }
.title-slide > h3 { color: #111111; text-shadow: none; font-weight: 300; font-size: 25px; text-align: left; margin-left: 15px; margin-bottom: -30px; }
</style>

<style type="text/css">
.left-code { color: #777; width: 40%; height: 92%; float: left; }
.right-plot { width: 59%; float: right; padding-left: 1%; }
.pull-left { color: #777; width: 40%; height: 92%; float: left; }
.pull-right { width: 58%; float: right; padding-left: 2%; }
.left-column { color: #777; width: 40%; height: 92%; float: left; }
.right-column { width: 58%; float: right; padding-left: 2%; }
</style>

---
class: inverse, center, middle

# Loop

---
# Task

.pull-left[

|Name | Age|Sex |Survived |
|:---------------------------------------------------|---:|:------|:--------|
|Braund, Mr. Owen Harris | 22|male |0 |
|Cumings, Mrs. John Bradley (Florence Briggs Thayer) | 38|female |1 |
|Heikkinen, Miss. Laina | 26|female |1 |
|Futrelle, Mrs. Jacques Heath (Lily May Peel) | 35|female |1 |
|Allen, Mr. William Henry | 35|male |0 |

- For each passenger, print a sentence reporting the name and the age.

]

.pull-right[]

---
# Task

.pull-left[

|Name | Age|Sex |Survived |
|:---------------------------------------------------|---:|:------|:--------|
|Braund, Mr. Owen Harris | 22|male |0 |
|Cumings, Mrs. John Bradley (Florence Briggs Thayer) | 38|female |1 |
|Heikkinen, Miss. Laina | 26|female |1 |
|Futrelle, Mrs. Jacques Heath (Lily May Peel) | 35|female |1 |
|Allen, Mr. William Henry | 35|male |0 |

- For each passenger, print a sentence reporting the name and the age.

]

.pull-right[

```r
for (i in 1:5) {
  print(paste0(df$Name[i], ' is ', df$Age[i], ' years old.'))
}
```

]

---
# Task

.pull-left[

|Name | Age|Sex |Survived |
|:---------------------------------------------------|---:|:------|:--------|
|Braund, Mr. Owen Harris | 22|male |0 |
|Cumings, Mrs. John Bradley (Florence Briggs Thayer) | 38|female |1 |
|Heikkinen, Miss. Laina | 26|female |1 |
|Futrelle, Mrs. Jacques Heath (Lily May Peel) | 35|female |1 |
|Allen, Mr. William Henry | 35|male |0 |

- For each passenger, print a sentence reporting the name and the age.

]

.pull-right[

```r
for (i in 1:5) {
  print(paste0(df$Name[i], ' is ', df$Age[i], ' years old.'))
}
```

```
## [1] "Braund, Mr. Owen Harris is 22 years old."
## [1] "Cumings, Mrs. John Bradley (Florence Briggs Thayer) is 38 years old."
## [1] "Heikkinen, Miss. Laina is 26 years old."
## [1] "Futrelle, Mrs. Jacques Heath (Lily May Peel) is 35 years old."
## [1] "Allen, Mr. William Henry is 35 years old."
```

]

---
# Loop through a vector

.pull-left[

- Loop through a vector and print out all elements of the vector.

]

.pull-right[

]

---
# Loop through a vector

.pull-left[

- Loop through a vector and print out all elements of the vector.
]

.pull-right[

```r
x <- c('b','r','y','a','n','t')
for (i in x) {
  print(i)
}
```

]

---
# Loop through a vector

.pull-left[

- Loop through a vector and print out all elements of the vector.

]

.pull-right[

```r
x <- c('b','r','y','a','n','t')
for (i in x) {
  print(i)
}
```

```
## [1] "b"
## [1] "r"
## [1] "y"
## [1] "a"
## [1] "n"
## [1] "t"
```

]

---
# Loop through a data frame

.pull-left[

- Loop through the variables of a data frame and print the number of missing values for each variable.

]

.pull-right[

]

---
# Loop through a data frame

.pull-left[

- Loop through the variables of a data frame and print the number of missing values for each variable.

]

.pull-right[

```r
# iterating over a data frame visits one column at a time
for (i in df) {
  print(sum(is.na(i)))
}
```

]

---
# Loop through a data frame

.pull-left[

- Loop through the variables of a data frame and print the number of missing values for each variable.

]

.pull-right[

```r
# iterating over a data frame visits one column at a time
for (i in df) {
  print(sum(is.na(i)))
}
```

```
## [1] 0
## [1] 0
## [1] 0
## [1] 0
## [1] 177
## [1] 0
## [1] 0
## [1] 2
```

]

---
# Loop through a data frame

.pull-left[

- Print out the type of each column in a data frame.

]

.pull-right[

]

---
# Loop through a data frame

.pull-left[

- Print out the type of each column in a data frame.

]

.pull-right[

```r
# seq_along(df) gives the column indices 1, 2, ..., ncol(df)
for (i in seq_along(df)) {
  print(class(df[[i]]))
}
```

]

---
# Loop through a data frame

.pull-left[

- Print out the type of each column in a data frame.

]

.pull-right[

```r
# seq_along(df) gives the column indices 1, 2, ..., ncol(df)
for (i in seq_along(df)) {
  print(class(df[[i]]))
}
```

```
## [1] "factor"
## [1] "factor"
## [1] "factor"
## [1] "numeric"
## [1] "numeric"
## [1] "numeric"
## [1] "factor"
```

]

---
class: center, inverse, middle

# Map functions

---
# Map function

.pull-left[

- When you apply the same function to all columns of a data frame,
- you could use a map function in the `purrr` package to make the code simpler

]

.pull-right[

]

---
# Map function

.pull-left[

- When you apply the same function to all columns of a data frame,
- you could use a map function in the `purrr`
package to make the code simpler

]

.pull-right[

```r
for (i in seq_along(df)) {
  class(df[[i]])
}
```

- The same computation with a map function:

```r
library(purrr)
map_df(df, class)
```

]

---
# Map function

.pull-left[

- When you apply the same function to all columns of a data frame,
- you could use a map function in the `purrr` package to make the code simpler

]

.pull-right[

- Rewrite the following code using a `map` function

```r
for (i in seq_along(df)) {
  length(df[[i]])
}
```

]

---
# Map function

.pull-left[

- When you apply the same function to all columns of a data frame,
- you could use a map function in the `purrr` package to make the code simpler

]

.pull-right[

- Rewrite the following code using a `map` function

```r
for (i in seq_along(df)) {
  length(df[[i]])
}
```

- Rewrite:

```r
library(purrr)
map_df(df, length)
```

]

---
# Map function

.pull-left[

- When you apply the same function to all columns of a data frame,
- you could use a map function in the `purrr` package to make the code simpler

]

.pull-right[

- Rewrite the following code using a `map` function

```r
for (i in seq_along(df)) {
  mean(df[[i]])
}
```

]

---
# Map function

.pull-left[

- When you apply the same function to all columns of a data frame,
- you could use a map function in the `purrr` package to make the code simpler

]

.pull-right[

- Rewrite the following code using a `map` function

```r
for (i in seq_along(df)) {
  mean(df[[i]])
}
```

- Rewrite:

```r
library(purrr)
map_df(df, mean)
```

]

---
# Map function

.pull-left[

- When you apply the same function to all columns of a data frame,
- you could use a map function in the `purrr` package to make the code simpler

]

.pull-right[

- Rewrite the following code using a `map` function

```r
for (i in seq_along(df)) {
  sum(is.na(df[[i]]))
}
```

]

---
# Map function

.pull-left[

- When you apply the same function to all columns of a data frame,
- you could use a map function in the `purrr` package to make the code simpler

]

.pull-right[

- Rewrite the following code
using a `map` function

```r
for (i in seq_along(df)) {
  sum(is.na(df[[i]]))
}
```

- Rewrite:

```r
library(purrr)
# ~ creates an anonymous function; . stands for the current column
map_df(df, ~sum(is.na(.)))
```

]

---
# Map function

.pull-left[

- When you apply the same function to all columns of a data frame,
- you could use a map function in the `purrr` package to make the code simpler

]

.pull-right[

- Rewrite the following code using a `map` function

```r
for (i in seq_along(df)) {
  sum(is.na(df[[i]]))
}
```

- Rewrite:

```r
library(purrr)
# ~ creates an anonymous function; . stands for the current column
map_df(df, ~sum(is.na(.)))
```

- Notice that `map_df(df, sum(is.na))` would not work because `sum(is.na)` is not a function!

]

---
class: inverse, center, middle

# Write functions with Loop

---
# Example

.pull-left[

Write the following function. Give an example to test your function.

- Input:
  - a data frame
- Output:
  - a data frame with all missing values of non-numeric variables replaced by the corresponding modes.

]

.pull-right[

]

---
# Example

.pull-left[

Write the following function. Give an example to test your function.

- Input:
  - a data frame
- Output:
  - a data frame with all missing values of non-numeric variables replaced by the corresponding modes.

]

.pull-right[

- We use the `mode_impute` function from last class

```r
numeric_impute <- function(d) {
  # apply mode_impute to every column in turn
  for (i in seq_along(d)) {
    d[[i]] <- mode_impute(d[[i]])
  }
  return(d)
}
```

]

---
# Example

.pull-left[

Write the following function. Give an example to test your function.

- Input:
  - a data frame
- Output:
  - a data frame with all missing values of non-numeric variables replaced by the corresponding modes.
]

.pull-right[

- We use the `mode_impute` function from last class

```r
numeric_impute <- function(d) {
  # apply mode_impute to every column in turn
  for (i in seq_along(d)) {
    d[[i]] <- mode_impute(d[[i]])
  }
  return(d)
}
```

- Test the function

```r
colSums(is.na(df))
```

```
## Survived   Pclass      Sex      Age    SibSp     Fare Embarked
##        0        0        0      177        0        0        2
```

```r
d1 <- numeric_impute(df)
colSums(is.na(d1))
```

```
## Survived   Pclass      Sex      Age    SibSp     Fare Embarked
##        0        0        0      177        0        0        0
```

]

---
# Map function

---
# Map function

.pull-left[

- When you apply the same function to all columns of a data frame,
- you could use a map function in the `purrr` package to make the code simpler

]

.pull-right[

```r
numeric_impute <- function(d) {
  for (i in seq_along(d)) {
    d[[i]] <- mode_impute(d[[i]])
  }
  return(d)
}
```

- The same function written with a map function:

```r
numeric_impute <- function(d) {
  d <- map_df(d, mode_impute)
  return(d)
}
```

]

---
# Example

.pull-left[

Write the following function. Give an example to test your function.

- Input:
  - a data frame
- Output:
  - density plots of all numeric variables.

]

.pull-right[

]

---
# Example

.pull-left[

Write the following function. Give an example to test your function.

- Input:
  - a data frame
- Output:
  - density plots of all numeric variables.

]

.pull-right[

```r
density_plot <- function(d) {
  library(ggplot2)
  for (i in seq_along(d)) {
    # only numeric columns get a density plot
    if (is.numeric(d[[i]])) {
      print(ggplot(d, aes(x = d[[i]])) +
              geom_density() +
              labs(x = names(d)[i]))
    }
  }
}
```

]

---
# Example

.pull-left[

Write the following function. Give an example to test your function.

- Input:
  - a data frame
- Output:
  - density plots of all numeric variables.

]

.pull-right[

- Test the function

```r
density_plot(df)
```

<!-- Plot output: 3 density plots (images not included) -->

]

---
class: inverse, center, middle

# Double Loop

---
# Example

.pull-left[

Write the following function. Give an example to test your function.

- Input:
  - a data frame
- Output:
  - all possible scatter plots.

]

.pull-right[

]

---
# Example

.pull-left[

Write the following function.
Give an example to test your function.

- Input:
  - a data frame
- Output:
  - all possible scatter plots.

]

.pull-right[

```r
scatter_plot <- function(d) {
  library(ggplot2)
  l <- length(d)
  # visit every pair of columns (i < j) exactly once
  for (i in 1:(l-1))
    for (j in (i+1):l) {
      if (is.numeric(d[[i]]) && is.numeric(d[[j]])) {
        print(ggplot(d, aes(x = d[[i]], y = d[[j]])) +
                geom_point() +
                labs(x = names(d)[i], y = names(d)[j]))
      }
    }
}
```

]

---
# Example

.pull-left[

Write the following function. Give an example to test your function.

- Input:
  - a data frame
- Output:
  - all possible scatter plots.

]

.pull-right[

- Test the function

```r
scatter_plot(df)
```

<!-- Plot output: 3 scatter plots (images not included) -->

]

---
# Example

.pull-left[

Write the following function. Give an example to test your function.

- Input:
  - a data frame
- Output:
  - all possible density plots of a numeric variable colored by a non-numeric variable.

]

.pull-right[

]

---
# Example

.pull-left[

Write the following function. Give an example to test your function.

- Input:
  - a data frame
- Output:
  - all possible density plots of a numeric variable colored by a non-numeric variable.

]

.pull-right[

```r
density_plot2 <- function(d) {
  library(ggplot2)
  l <- length(d)
  # visit every pair of columns (i < j) exactly once
  for (i in 1:(l-1))
    for (j in (i+1):l) {
      if (is.numeric(d[[i]]) && (!is.numeric(d[[j]]))) {
        print(ggplot(d, aes(x = d[[i]], color = d[[j]])) +
                geom_density() +
                labs(x = names(d)[i], color = names(d)[j]))
      }
    }
}
```

]

---
# Example

.pull-left[

Write the following function. Give an example to test your function.

- Input:
  - a data frame
- Output:
  - all possible density plots of a numeric variable colored by a non-numeric variable.
]

.pull-right[

```r
density_plot2 <- function(d) {
  library(ggplot2)
  l <- length(d)
  # visit every pair of columns (i < j) exactly once
  for (i in 1:(l-1))
    for (j in (i+1):l) {
      if (is.numeric(d[[i]]) && (!is.numeric(d[[j]]))) {
        print(ggplot(d, aes(x = d[[i]], color = d[[j]])) +
                geom_density() +
                labs(x = names(d)[i], color = names(d)[j]))
      }
    }
}
```

- Test the function

```r
density_plot2(df)
```

]

---
# Test the function

```r
density_plot2(df)
```

<!-- Plot output: 3 density plots (images not included) -->

---
# Example

.pull-left[

Write the following function. Give an example to test your function.

- Input:
  - a data frame
- Output:
  - all possible bar plots of a non-numeric variable filled by a non-numeric variable.

]

.pull-right[

```r
bar_plot2 <- function(d) {
  library(ggplot2)
  l <- length(d)
  # visit every pair of columns (i < j) exactly once
  for (i in 1:(l-1))
    for (j in (i+1):l) {
      if ((!is.numeric(d[[i]])) && (!is.numeric(d[[j]]))) {
        print(ggplot(d, aes(x = d[[i]], fill = d[[j]])) +
                geom_bar(position = 'dodge') +
                labs(x = names(d)[i], fill = names(d)[j]))
      }
    }
}
```

]

---
# Example

.pull-left[

Write the following function. Give an example to test your function.

- Input:
  - a data frame
- Output:
  - all possible bar plots of a non-numeric variable filled by a non-numeric variable.

]

.pull-right[

```r
bar_plot2 <- function(d) {
  library(ggplot2)
  l <- length(d)
  # visit every pair of columns (i < j) exactly once
  for (i in 1:(l-1))
    for (j in (i+1):l) {
      if ((!is.numeric(d[[i]])) && (!is.numeric(d[[j]]))) {
        print(ggplot(d, aes(x = d[[i]], fill = d[[j]])) +
                geom_bar(position = 'dodge') +
                labs(x = names(d)[i], fill = names(d)[j]))
      }
    }
}
```

- Test the function

```r
bar_plot2(df)
```

]

---
# Test the function

```r
bar_plot2(df)
```

<!-- Plot output: 6 bar plots (images not included) -->

---
exclude: true

# Example

.pull-left[

Write the following function. Give an example to test your function.

- Input:
  - a data frame
- Output:
  - all possible bar plots of a non-numeric variable filled by a non-numeric variable.
  - all possible density plots of a numeric variable colored by a non-numeric variable.
  - all possible scatter plots.
]

.pull-right[

]

---
exclude: true

# Example

.pull-left[

Write the following function. Give an example to test your function.

- Input:
  - a data frame
- Output:
  - all possible bar plots of a non-numeric variable filled by a non-numeric variable.
  - all possible density plots of a numeric variable colored by a non-numeric variable.
  - all possible scatter plots.

]

.pull-right[

```r
# combine the three plotting functions defined earlier
quickplot <- function(d) {
  density_plot2(d)
  scatter_plot(d)
  bar_plot2(d)
}
```

]

---
# Test the function

```r
library(readr)  # provides read_csv()
df2 <- read_csv('insurance.csv')
quickplot(df2)
```

<!-- Plot output: 16 plots (images not included) -->

---
# Test the function

```r
quickplot(df)
```

<!-- Plot output: 12 plots (images not included) -->
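
---
# Sketch: a self-contained imputation example

The imputation slides rely on `mode_impute` from a previous class, which this deck does not show. The sketch below is one possible version of such a helper, not the original course code. The function body, the wrapper name `impute_all`, and the toy data frame `toy` are all assumptions made for illustration. It uses only base R, so it runs without `purrr`.

```r
# Hypothetical helper (assumed, not the course's version):
# replace NAs in a non-numeric vector with its most frequent value.
# Numeric vectors and all-NA vectors are returned unchanged.
mode_impute <- function(x) {
  if (is.numeric(x) || all(is.na(x))) {
    return(x)
  }
  counts <- table(x)  # table() drops NA by default
  x[is.na(x)] <- names(counts)[which.max(counts)]
  x
}

# Loop-based wrapper, following the pattern used in the deck
impute_all <- function(d) {
  for (i in seq_along(d)) {
    d[[i]] <- mode_impute(d[[i]])
  }
  d
}

# Made-up data: one numeric, one character, one factor column
toy <- data.frame(
  age  = c(22, NA, 26, 35),
  sex  = c("male", "female", NA, "male"),
  port = factor(c("S", NA, "S", "C")),
  stringsAsFactors = FALSE
)

clean <- impute_all(toy)
colSums(is.na(toy))    # missing values before imputation
colSums(is.na(clean))  # missing values after imputation
```

In this sketch the numeric column `age` keeps its `NA`, because the helper only touches non-numeric vectors. This is consistent with the earlier test output, where `Age` still has 177 missing values after imputation. If two values tie for the mode, `which.max` picks the first one in `table()` order, which is a design choice of this sketch.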