class: center, middle, inverse, title-slide

# Social Network Analysis in R
## CORE Lab
### Department of Defense Analysis
### 2019-06-14

---

# Background

--

Social network analysis (SNA) is commonly used to understand groups and social formations.

--

**The focus of this methodology is the relationships among individuals, which influence a person's behavior above and beyond the influence of his or her individual attributes (Valente 2010).**

--

As such, SNA enables analysts to understand how social ties help to define, enable, and constrain the knowledge, reach, and capacities of actors within groups (Cunningham, Everton, and Murphy 2016).

--

While social network research is **not** exclusively dependent on software applications, these do increase the efficiency of researchers. Here we will focus on **R**.

---

# Goals

Here we will explore some of the key SNA features of the open-source programming language **R**, and a variety of packages - primarily [**igraph**](https://igraph.org/r/) and [**visNetwork**](https://datastorm-open.github.io/visNetwork/) - developed to work with relational data. Specifically, we will:

--

* Explore, structure, visualize, and analyze relational data with the **igraph** library.

--

* Build processes in **R** with **igraph** to streamline analysis.

--

* Create interactive visualizations with the **visNetwork** package.

---

# Supplemental Packages

* `here` - A package to make it easier to find your files by constructing paths to your project's files.
* `tidyverse` - A collection of packages designed for data science. Users get packages such as dplyr, ggplot2, purrr, etc.
* `purrr` - Enhances R's functional programming toolkit.
* `DT` - A "wrapper" of the JavaScript library "DataTables". We will use it to build interactive data tables.
* `kableExtra` - A package to help us build tables using HTML.
* `emo` - A package that allows users to insert emoji into RMarkdown documents (like this one!).
* `xaringan` - You're looking at it!
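
The later chunks assume these packages are already attached. A minimal sketch of one way to do that, assuming the packages are installed from your usual repository; the vector name `pkgs` and the inclusion of `scales` (which supplies the `rescale()` call used later in the deck) are choices made for this illustration:

```r
# Packages used by the later chunks (illustrative list; adjust as needed).
pkgs <- c("here", "tidyverse", "DT", "kableExtra", "scales")

# Check which packages are available without attaching them yet.
available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!available]

# Report anything that still needs install.packages().
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
}

# Attach only the packages that were found.
invisible(lapply(pkgs[available], library, character.only = TRUE))
```

Checking availability first keeps the chunk from stopping at the first missing package, which is convenient when knitting a whole deck.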
---

# Getting Started: Loading Data

Let's bring in data from an edgelist:

```r
here("/docs/community-of-interest/sna-in-R/data/Familial.csv") %>%
  read_csv()
#familial <- read.csv(file="/docs/community-of-interest/sna-in-R/data/Familial.csv", header=TRUE)
```
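
For readers new to the format, an edgelist is simply a table with one row per tie. The sketch below builds a tiny one in memory and round-trips it through a temporary CSV. The column names `from` and `to` and the object names are assumptions for illustration, and base `read.csv()` stands in for `read_csv()` so the chunk runs without extra packages:

```r
# A hypothetical three-row edgelist; `from` and `to` are illustrative names.
toy_edges <- data.frame(
  from         = c("Adonis", "Adonis", "Boxcar"),
  to           = c("Boxcar", "Ghost", "Ghost"),
  Relationship = "Familial",
  stringsAsFactors = FALSE
)

# Write to a temporary CSV and read it back, mimicking a file on disk.
tmp <- tempfile(fileext = ".csv")
write.csv(toy_edges, tmp, row.names = FALSE)
toy_read <- read.csv(tmp, stringsAsFactors = FALSE)

nrow(toy_read)   # one row per tie
names(toy_read)  # two actor columns plus the tie type
```

The first two columns identify the actors at each end of a tie; any further columns travel along as edge attributes when the table is turned into a graph.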
---

# Relationships Codebook

* **Familial**: (person-to-person; i.e., one-mode) - Defined as any family connection through blood, adoption, or marriage.
* **Financial**: (person-to-person) - Defined as two actors who are explicitly stated, in reporting or intelligence, as transferring funds between one another for any purpose, legal or illegal.
* **Friendship**: (person-to-person) - Defined as two individuals who are explicitly stated as friends, or who are explicitly known as trusted confidants in reports or in intelligence documentation.
* **Hierarchy**: (person-to-person) - Defined as relationships between immediate superiors and subordinates in an organization.

---

# Relationships Import

```r
# Familial:
familial <- here("/docs/community-of-interest/sna-in-R/data/Familial.csv") %>%
  read_csv()
# Financial:
financial <- here("/docs/community-of-interest/sna-in-R/data/Financial.csv") %>%
  read_csv()
# Friendship:
friendship <- here("/docs/community-of-interest/sna-in-R/data/Friendship.csv") %>%
  read_csv()
# Hierarchy:
hierarchy <- here("/docs/community-of-interest/sna-in-R/data/Hierarchy.csv") %>%
  read_csv()
```

---

# Building Networks

First, install the package:

```r
install.packages("igraph")
```

Now load the package:

```r
library(igraph)
```

Create an **igraph** graph:

```r
familial_net <- graph_from_data_frame(familial, directed = FALSE)
```

---

# Familial Network Object

```r
familial_net
```

```
## IGRAPH 83a5933 UN-- 22 30 --
## + attr: name (v/c), Relationship (e/c)
## + edges from 83a5933 (vertex names):
##  [1] Adonis   --Boxcar    Adonis   --Ghost     Adonis   --Jelly
##  [4] Bananas  --Blue Eyes Bananas  --Brains    Bananas  --Slingshot
##  [7] Bananas  --Gremlin   Bat G.   --Big G.    Bat G.   --Blaze
## [10] Big G.   --Blaze     Boots    --Fat Boy   Boots    --Freckles
## [13] Boots    --Icepick   Boots    --Repo Girl Boxcar   --Ghost
## [16] Boxcar   --Jelly     Brains   --Slingshot Brains   --Gremlin
## [19] Fat Boy  --Freckles  Fat Boy  --Icepick   Fat Boy  --Repo Girl
## [22] Slingshot--Gremlin   Freckles --Icepick   Freckles --Repo Girl
## + ... omitted several edges
```

--

* `name` is listed as `(v/c)`, which denotes a vertex-level character attribute.

--

* `Relationship` is listed as `(e/c)`, which denotes an edge-level character attribute.

---

# Familial Network Edges

```r
E(familial_net)
```

```
## + 30/30 edges from 83a5933 (vertex names):
##  [1] Adonis   --Boxcar      Adonis   --Ghost       Adonis   --Jelly
##  [4] Bananas  --Blue Eyes   Bananas  --Brains      Bananas  --Slingshot
##  [7] Bananas  --Gremlin     Bat G.   --Big G.      Bat G.   --Blaze
## [10] Big G.   --Blaze       Boots    --Fat Boy     Boots    --Freckles
## [13] Boots    --Icepick     Boots    --Repo Girl   Boxcar   --Ghost
## [16] Boxcar   --Jelly       Brains   --Slingshot   Brains   --Gremlin
## [19] Fat Boy  --Freckles    Fat Boy  --Icepick     Fat Boy  --Repo Girl
## [22] Slingshot--Gremlin     Freckles --Icepick     Freckles --Repo Girl
## [25] Ghost    --Jelly       Goldie   --Pookey      Goldie   --O.G.
## [28] Pookey   --O.G.        Icepick  --Repo Girl   Snake    --Sonny Black
```

```r
ecount(familial_net)
```

```
## [1] 30
```

---

# Familial Network Nodes

```r
V(familial_net)
```

```
## + 22/22 vertices, named, from 83a5933:
##  [1] Adonis      Bananas     Bat G.      Big G.      Boots
##  [6] Boxcar      Brains      Fat Boy     Slingshot   Freckles
## [11] Ghost       Goldie      Pookey      Icepick     Snake
## [16] Jelly       Blue Eyes   Gremlin     Blaze       Repo Girl
## [21] O.G.        Sonny Black
```

```r
vcount(familial_net)
```

```
## [1] 22
```

```r
V(familial_net)$name
```

```
##  [1] "Adonis"      "Bananas"     "Bat G."      "Big G."      "Boots"
##  [6] "Boxcar"      "Brains"      "Fat Boy"     "Slingshot"   "Freckles"
## [11] "Ghost"       "Goldie"      "Pookey"      "Icepick"     "Snake"
## [16] "Jelly"       "Blue Eyes"   "Gremlin"     "Blaze"       "Repo Girl"
## [21] "O.G."        "Sonny Black"
```

---

# Familial Network Visualization

```r
plot.igraph(familial_net)
```

.center[
![](SNAinR_files/figure-html/unnamed-chunk-14-1.png)<!-- -->
]

---

# Familial Network Visualization

.pull-left[
```r
plot.igraph(familial_net,
            # Nodes ====
            vertex.label = NA,
            vertex.color = "blue",
            vertex.size = 5,
            # Edge ====
            edge.color = "black",
            edge.arrow.size = 0,
            edge.curved = TRUE,
            # Other ====
            margin = .01,
            frame = FALSE,
            main = "Familial Network"
            )
```

.center[
Arguments = 😄
]
]

.pull-right[
![](SNAinR_files/figure-html/unnamed-chunk-16-1.png)<!-- -->
]

---

# So What?

--

.center[
Automation! Automation! Automation!
]

--

.pull-left[
```r
here("/docs/community-of-interest/sna-in-R/data/Financial.csv") %>%
  read_csv() %>%
  graph_from_data_frame(directed = FALSE) %>%
  plot.igraph(vertex.label=NA,
              vertex.color="blue",
              vertex.size=5,
              edge.color="black",
              edge.arrow.size=0,
              edge.curved=TRUE,
              margin = .01,
              frame = FALSE
              )
```

.center[
💵
]
]

.pull-right[
![](SNAinR_files/figure-html/unnamed-chunk-18-1.png)<!-- -->
]

---

# Automation! Automation! Automation!

.center[
![](images/multipleCSVs.png)
]

---

# Automation! Automation! Automation!

.pull-left[
```r
files <- list.files(path=here("/docs/community-of-interest/sna-in-R/data/onemode/"),
                    pattern = "\\.csv$",
                    full.names = TRUE)
files %>%
  map_dfr(read_csv) %>%
  graph_from_data_frame(directed = FALSE) %>%
  set_edge_attr("color",
                value = case_when(
                  E(.)$Relationship == "Familial" ~ "#d7191c",
                  E(.)$Relationship == "Financial" ~ "#fdae61",
                  E(.)$Relationship == "Friendship" ~ "#abd9e9",
                  E(.)$Relationship == "Hierarchy" ~ "#2c7bb6"
                )) %>%
  plot.igraph(vertex.label=NA,
              vertex.color="grey",
              vertex.size=5,
              edge.arrow.size=0,
              edge.curved=TRUE,
              margin = .01
              )
```
]

.pull-right[
![](SNAinR_files/figure-html/unnamed-chunk-20-1.png)<!-- -->
]

---

# Metrics in Igraph

Q: Now that you have data, how can you analyze it?

--

A: Start with network-level measures.

Metric | Explanation | Command
--------|-------------|---------
Density | Number of observed ties divided by possible number of ties | `edge_density()`
Average Degree | Sum of each actor's ties divided by number of actors | `mean(degree())`
Average Clustering | Sum of each actor's local clustering coefficient divided by number of actors | `transitivity(type = "average")`

---

# Quick Note on Commands

```r
g <- list.files(path=here("/docs/community-of-interest/sna-in-R/data/onemode/"),
                pattern = "\\.csv$",
                full.names = TRUE) %>%
  map_dfr(read_csv) %>%
  graph_from_data_frame(directed = FALSE)
```

```r
edge_density(g, loops = FALSE) # Simple Command
```

```
## [1] 0.1188406
```

```r
g_density <- edge_density(g, loops = FALSE) # Assigning object
```

```r
g_density # Calling "g_density" object
```

```
## [1] 0.1188406
```

---

# Network-Level Measures

```r
g <- list.files(path=here("/docs/community-of-interest/sna-in-R/data/onemode/"),
                pattern = "\\.csv$",
                full.names = TRUE) %>%
  map_dfr(read_csv) %>%
  graph_from_data_frame(directed = FALSE) %>%
  set.graph.attribute("density", edge_density(.)) %>%
  set.graph.attribute("avg_degree", mean(degree(.))) %>%
  set.graph.attribute("avg_clu_coef", transitivity(., "average"))
```

```r
graph_attr(g, "density")
```

```
## [1] 0.1188406
```

```r
graph_attr(g, "avg_degree")
```

```
## [1] 5.347826
```

```r
graph_attr(g, "avg_clu_coef")
```

```
## [1] 0.6436332
```

---

# Network-Level Measures Report

```r
data.frame(
  "Density" = graph_attr(g, "density"),
  "Avg. Degree" = graph_attr(g, "avg_degree"),
  "Avg. Clustering Coefficient" = graph_attr(g, "avg_clu_coef")
  ) %>%
  kable(format = "html", digits = 3, caption = "Demo Table") %>%
  kable_styling(bootstrap_options = c("striped", "condensed")) %>%
  add_footnote(label = "table footnote", notation = "number")
```

<table class="table table-striped table-condensed" style="margin-left: auto; margin-right: auto;">
<caption>Demo Table</caption>
 <thead>
  <tr>
   <th style="text-align:right;"> Density </th>
   <th style="text-align:right;"> Avg..Degree </th>
   <th style="text-align:right;"> Avg..Clustering.Coefficient </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 0.119 </td>
   <td style="text-align:right;"> 5.348 </td>
   <td style="text-align:right;"> 0.644 </td>
  </tr>
</tbody>
<tfoot>
<tr>
<td style = 'padding: 0; border:0;' colspan='100%'><sup>1</sup> table footnote</td>
</tr>
</tfoot>
</table>

---

# Metrics in Igraph

Q: Now that I've looked at network-level measures, what do I do?

--

A: Calculate vertex-level metrics.

Metric | Explanation | Command
--------|-------------|---------
Degree | Count of actor's ties | `degree()`
Eigenvector | Weights an actor's centrality by the centrality of its neighbors | `evcent()`
Betweenness | How often each actor lies on the shortest path between all other actors | `betweenness()`

---

# Vertex-Level Measures

```r
g %>%
  set.vertex.attribute("degree", value=degree(.)) %>%
  set.vertex.attribute("eigenvector", value=round(evcent(.)$vector, 2)) %>%
  set.vertex.attribute("betweenness", value=round(betweenness(.), 2)) %>%
  get.data.frame("vertices") %>%
  datatable(rownames = FALSE, options = list(dom="tp", pageLength=2))
```
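
To see what these vertex-level metrics report, it can help to run them on a network small enough to check by hand. The sketch below uses a hypothetical four-actor network (a triangle A-B-C plus an actor D tied only to C); the actor names and the object `toy` are invented for illustration, and `eigen_centrality()` is used as the current igraph name for the function the slides call `evcent()`:

```r
library(igraph)

# Hypothetical toy network: triangle A-B-C with a pendant actor D tied to C.
toy <- graph_from_data_frame(
  data.frame(from = c("A", "A", "B", "C"),
             to   = c("B", "C", "C", "D")),
  directed = FALSE
)

# Degree: count of each actor's ties (A = 2, B = 2, C = 3, D = 1).
degree(toy)

# Betweenness: only C sits on shortest paths between others (A-D and B-D).
betweenness(toy)

# Eigenvector centrality, scaled so the most central actor scores 1.
round(eigen_centrality(toy)$vector, 2)
```

C comes out on top of all three measures here, but that agreement is a feature of this tiny example; in larger networks the measures often rank actors differently.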
---

# Interactive Visuals

First, install the package:

```r
install.packages("visNetwork")
```

Now load the package:

```r
library(visNetwork)
```

---

# visNetwork with Igraph

```r
visIgraph(g)
```

.center[
]

---

# visNetwork with Igraph Visualization Arguments

```r
g <- g %>%
  # Node attributes ====
  set.vertex.attribute("color.background", value = "grey") %>%
  set.vertex.attribute("color.border", value = "black") %>%
  set.vertex.attribute("borderWidth", value = 2) %>%
  set.vertex.attribute("size", value = degree(.)) %>%
  set.vertex.attribute("label", value = V(.)$name) %>%
  # Edge attributes ====
  set.edge.attribute("width", value = rescale(edge_betweenness(.), c(2,10))) %>%
  set.edge.attribute("color", value = "slategrey") %>%
  set.edge.attribute("smooth", value = FALSE) %>%
  set.edge.attribute("shadow", value = TRUE)
```

---

# visNetwork with Igraph Visualization

.center[
]

---

<br>

.center[
### Questions?
]

![](images/wanna_see_the_code.png)

Chris Callaghan | cjcallag@nps.edu

Dan Cunningham | dtcunnin@nps.edu

---

<br>

```
## R version 3.5.0 (2018-04-23)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS 10.14.4
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base
##
## other attached packages:
##  [1] xaringanthemer_0.2.0 igraph_1.2.4.1       visNetwork_2.0.7
##  [4] kableExtra_0.9.0     knitr_1.23           emo_0.0.0.9000
##  [7] scales_1.0.0         DT_0.7               forcats_0.4.0
## [10] stringr_1.4.0        dplyr_0.8.1          purrr_0.3.2
## [13] readr_1.3.1          tidyr_0.8.3          tibble_2.1.3
## [16] ggplot2_3.1.1        tidyverse_1.2.1      here_0.1
```