class: center, middle, inverse, title-slide

# R Basics
## Getting up and Running
### Christopher Callaghan - CORE Lab
### 2019-08-07

---
# Motivation

--

R is HOT! 🔥

--

.pull-left[
The community

- One of the fastest growing programming languages (behind heavyweights such as Python and C)
- 450+ user groups worldwide
- 300,000 questions on Stack Overflow
- Social media community under #rstats
]

--

.pull-right[
Extensibility

- 14,500 packages on CRAN
- Thriving community on GitHub | GitLab
- RStudio
]

---
# The R Environment

.center[  ]

--

In R, all data types and functions are considered objects.

---
# Quirks to Watch Out For

1. Capitalization

```r
Sys.time()
```

```
## [1] "2019-08-07 11:11:06 PDT"
```

vs.

```r
sys.time()
```

`Error in sys.time() : could not find function "sys.time"`

---
# Quirks to Watch Out For

2. Spacing

```r
x<-1   # parsed as assignment: x gets 1
x
```

```
## [1] 1
```

vs.

```r
x < -1   # parsed as comparison: is x less than -1?
```

```
## [1] FALSE
```

---
# Data types

```r
character_type <- "three"
logical_type <- FALSE
numeric_type <- 3
```

Inspect each object's type with `typeof()`:

```r
typeof(character_type)
```

```
## [1] "character"
```

```r
typeof(logical_type)
```

```
## [1] "logical"
```

```r
typeof(numeric_type)
```

```
## [1] "double"
```

---
# Structures

|    | Homogeneous   | Heterogeneous |
|----|---------------|---------------|
| 1D | Atomic Vector | List          |
| 2D | Matrix        | Data Frame    |

One-dimensional:

- *Atomic vectors* - All observations are the same type
- *Lists* - Each observation can be a different type

Two-dimensional:

- *Matrix* - All variables are the same type
- *Data Frame* - Each variable can be a different type

---
# Understanding Atomic Vectors

.pull-left[
Four (relevant) flavors:

- `logical`
- `double`
- `integer`
- `character`
]

.pull-right[
<br>
<center>
]

```r
lgl_vector <- c(TRUE, FALSE, TRUE)
dbl_vector <- c(0x1, 2.0, 3e0)          # hexadecimal, decimal, and scientific notation are all doubles
int_vector <- c(1L, 2L, 1:3L)           # the L suffix marks integers; 1:3L expands to 1, 2, 3
chr_vector <- c("Hello", "world", "!")
```

---
# Understanding Lists

.pull-left[
A named (or not) sequence of objects.
]

.pull-right[
<center>
]

```r
my_list <- list(lgl = lgl_vector, dbl = dbl_vector, int = int_vector, chr = chr_vector)
my_list
```

```
## $lgl
## [1]  TRUE FALSE  TRUE
##
## $dbl
## [1] 1 2 3
##
## $int
## [1] 1 2 1 2 3
##
## $chr
## [1] "Hello" "world" "!"
```

---
# Understanding Matrices

.pull-left[
A homogeneous two-dimensional structure.
]

.pull-right[
<center>
]

```r
my_matrix <- matrix(dbl_vector, nrow = length(dbl_vector), ncol = length(dbl_vector))
my_matrix
```

```
##      [,1] [,2] [,3]
## [1,]    1    1    1
## [2,]    2    2    2
## [3,]    3    3    3
```

---
# Understanding Data Frames

.pull-left[
A heterogeneous two-dimensional structure.
]

.pull-right[
<center>
]

```r
my_df <- data.frame(lgl_vector, dbl_vector, chr_vector)
my_df
```

```
##   lgl_vector dbl_vector chr_vector
## 1       TRUE          1      Hello
## 2      FALSE          2      world
## 3       TRUE          3          !
```

---
# Using Data Frames

```r
library(tidyverse)  # attaches readr (read_csv), purrr (map_dfr), tibble (as_tibble), and %>%

url <- "https://raw.githubusercontent.com/fivethirtyeight/russian-troll-tweets/master/IRAhandle_tweets_1.csv"
map_dfr(url, read_csv) %>% as_tibble()
```

```
## # A tibble: 243,891 x 21
##    external_author… author content region language publish_date
##              <dbl> <chr>  <chr>   <chr>  <chr>    <chr>
##  1         9.06e17 10_GOP "\"We … Unkno… English  10/1/2017 1…
##  2         9.06e17 10_GOP Marsha… Unkno… English  10/1/2017 2…
##  3         9.06e17 10_GOP Daught… Unkno… English  10/1/2017 2…
##  4         9.06e17 10_GOP JUST I… Unkno… English  10/1/2017 2…
##  5         9.06e17 10_GOP 19,000… Unkno… English  10/1/2017 2…
##  6         9.06e17 10_GOP "Dan B… Unkno… English  10/1/2017 2…
##  7         9.06e17 10_GOP 🐝🐝🐝 ht… Unkno… English  10/1/2017 2…
##  8         9.06e17 10_GOP '@Sena… Unkno… English  10/1/2017 2…
##  9         9.06e17 10_GOP As muc… Unkno… English  10/1/2017 3…
## 10         9.06e17 10_GOP After … Unkno… English  10/1/2017 3…
## # … with 243,881 more rows, and 15 more variables: harvested_date <chr>,
## #   following <dbl>, followers <dbl>, updates <dbl>, post_type <chr>,
## #   account_type <chr>, retweet <dbl>, account_category <chr>,
## #   new_june_2018 <dbl>, alt_external_id <dbl>, tweet_id <dbl>,
## #   article_url <chr>, tco1_step1 <chr>, tco2_step1 <chr>,
## #   tco3_step1 <lgl>
```

---
# Functions

Repeatable instructions for the program to execute.

    name <- function(args_list){
      body
    }

For instance:

```r
purrr::is_empty
```

```
## function (x)
## length(x) == 0
## <bytecode: 0x7fe978c3ec38>
## <environment: namespace:rlang>
```

---
# Functions Example

```r
1 + 2
```

```
## [1] 3
```

```r
sum(1, 2)
```

```
## [1] 3
```

```r
sum
```

```
## function (..., na.rm = FALSE)  .Primitive("sum")
```

```r
my_sum <- function(x, y) {
  return(x + y)
}
my_sum(1, 2)
```

```
## [1] 3
```

---
# Parting Thoughts

- [DataCamp](https://www.datacamp.com/), an online platform for learning data science. The early modules on R are free.
- [R for Data Science](https://r4ds.had.co.nz/) by Hadley Wickham and Garrett Grolemund. An essential read for those invested in learning data science with R. Wickham is the Chief Scientist at RStudio and the principal author of many R packages.
- [Stat 545](http://stat545.com/) from the University of British Columbia, an online course on data wrangling, exploration, and analysis with R.

Happy R learning! 😄
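---
# Appendix: Homogeneous vs. Heterogeneous in Practice

The Structures slide says that atomic vectors hold a single type, while lists and data frames can mix types. Below is a minimal sketch of what that looks like at the console, using only base R. The object names (`mixed_dbl`, `mixed_chr`, `mixed_list`, `small_df`) are made up for this illustration and are not part of the original deck.

```r
# Atomic vectors are homogeneous: mixing types coerces every element
# to the most flexible type present (logical < integer < double < character).
mixed_dbl <- c(TRUE, 2L, 3.5)
typeof(mixed_dbl)            # "double"; TRUE is stored as 1

mixed_chr <- c(1, "two", FALSE)
typeof(mixed_chr)            # "character"; every element is now a string

# Lists are heterogeneous: each element keeps its own type.
mixed_list <- list(TRUE, 2L, 3.5, "four")
sapply(mixed_list, typeof)   # "logical" "integer" "double" "character"

# Data frames are heterogeneous by column; each column is an atomic vector.
small_df <- data.frame(id = 1:3,
                       label = c("a", "b", "c"),
                       stringsAsFactors = FALSE)
sapply(small_df, typeof)     # id: "integer", label: "character"
```

The practical upshot: if a numeric vector unexpectedly reports `"character"`, a single stray string somewhere in the input is the likely cause.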