What is Social Network Analysis (SNA)?

Introduction
R Packages
Data
Data Preparation
Basic Exploration and Visualization
Topography
Cohesive Subgroups
Centrality
Interactive Visualizations with visNetwork
Conclusion and Other Resources
Resources
- SNA (Concepts, Measures, and Theory)
- SNA in R (Online Guides)
Appendix
- Converting Two-Mode Networks to One-Mode Networks

Introduction

As highlighted in this section’s short video, social networks permeate our daily lives and they play an important role in who and what we know, where we live, our beliefs and preferences, and the opportunities and the constraints with which we are presented in our lives. It is therefore unsurprising that many practitioners are looking for new ways to understand a variety of social networks in a more efficient and more effective manner. Social network analysis (SNA), especially when implemented in R, provides us with many of the tools needed to address this real-world challenge.

Formally, SNA is a set of theories and techniques used to understand social structures. Most practitioners use visualizations and SNA-based statistics to examine their social network data. Everton’s (2012) “Four Metric Families” provides us with a useful way to conceptualize various aspects of social networks:

Network Topography - describes the overall structure of a social network, which allows us to assess its strengths and vulnerabilities.
Cohesive Subgroups - highlights clusters of actors who interact relatively more frequently with one another than with others.
Centrality - identifies actors who are located in structurally advantageous positions and who can diffuse information and/or whose removal may disrupt a social network.
Brokers and Bridges - a focus on brokerage is similar to centrality in that it helps us identify actors in structurally advantageous positions; however, in this case we focus on control over the flow of information and resources. Bridges are crucial relationships in a network; formally, they are ties whose removal would disconnect part of the network.

The purpose of this document is to provide readers with a basic, practical understanding of SNA. We highly encourage users to check out the “Resources” section for some of our favorite references regarding relevant theories, concepts, functions, packages, and coding.

In this tutorial, we will use some basic SNA techniques in R to explore a data set pertaining to the Noordin Top terrorist network. Specifically, we will look at the network at a single snapshot in time right before the network’s first attack in August 2003 on the JW Marriott Hotel in Jakarta. Using Everton’s (2012) “Four Metric Families” as our guide, we will explore the following questions:

Topography

How interconnected was Noordin’s network prior to the attack?
Did Noordin’s network appear to be structured around him? Or was the network decentralized?

Cohesive Subgroups

Noordin’s network was comprised of individuals from various militant groups across the region. Were there clusters consisting of individuals from various groups? How can we describe them?

We will put all actor-focused questions, including brokerage, under the umbrella of centrality for this tutorial.

Centrality (and brokerage)

Besides Noordin, who were key individuals in the operation from a structural perspective? Who were the key facilitators of information during the operation?

From our “answers” to the questions above, we can develop hypotheses that we can test using more sophisticated techniques and statistical models. For demonstration purposes, however, we will keep things simple and limit ourselves to data exploration and some basic informative techniques. Thus, we will explore our data and then describe our results to a hypothetical audience.

R Packages

We will leverage four packages in the tutorial. The functions listed in Table 1 are the primary functions we will use for each package, but they do not represent an exhaustive list of the functions and arguments provided below (or that each package offers). We do not list igraph functions in Table 1 because of the large number of them used in this tutorial.

We will not leverage statnet even though it is a commonly used and excellent package for SNA. ¹ The goal of Table 1 is simply to provide you with a quick preview; that is, our brief descriptions do not “do justice” to these excellent packages, so we recommend you check out their websites.

*Table 1: Summary of Chapter Packages and Functions*
Package	Function	Short Description
igraph	See each metric family section	A straightforward package to estimate most social network analysis statistics. You can calculate measures related to network topography, centrality, subgroups, and brokers and bridges, among others. ²
visNetwork	`visIgraph()`, `visNetwork()`, `visPhysics()`, & `visOptions()`	A package to build interactive network visualizations. ³ The functions in column 2 allow us to send igraph objects directly to visNetwork, visualize our data, control the “physics” of a network, and customize interactive features.
dplyr	`arrange()`, `select()`, & `one_of()`	A tidyverse package for data manipulation. The functions listed in column 2 allow us to sort observations and to select the variables we want to keep. ⁴
DT	`datatable()`	The R package DT provides an R interface to the JavaScript library DataTables. ⁵ R data objects (matrices or data frames) can be displayed as tables on HTML pages, and DataTables provides filtering, pagination, sorting, and many other features in the tables. The function in column 2 allows us to create an HTML widget to display R data objects with DataTables.

You will need to make sure those packages are installed (install.packages) before calling them from your library.
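
A minimal sketch of that check, assuming the four packages named in Table 1 and a working CRAN mirror. The object names (`required_pkgs`, `missing_pkgs`) are ours and purely illustrative:

```r
# Packages used in this tutorial (from Table 1).
required_pkgs <- c("igraph", "visNetwork", "dplyr", "DT")

# requireNamespace() returns TRUE when a package is installed, without attaching it.
is_installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  # Uncomment the next line to install whatever is missing.
  # install.packages(missing_pkgs)
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  # Attach the packages so their functions can be called directly.
  invisible(lapply(required_pkgs, library, character.only = TRUE))
}
```

The install line is left commented out so the sketch never downloads anything without you asking it to.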

Data

According to Cunningham, Everton, and Murphy (2016), the Noordin data set can be described as follows:

“The foundation of the Noordin Top Terrorist network data were extracted from two International Crisis Group (ICG) reports (International Crisis Group 2006; International Crisis Group 2009b), which contain rich one- and two-mode data on a variety of relations and affiliations (friendship, kinship, meetings, etc.) along with significant attribute data (education, group membership, physical status, etc.). Because a single source for any network data raises the possibility of bias, the data were supplemented with additional open source literature in order fill gaps in the data and in order to generate monthly time codes from January 2001 through December 2010, which allow us to account for when actors enter and leave the network and examine the network longitudinally. The data were initially structured and analyzed by Defense Analysis students at the Naval Postgraduate School in the course ‘Tracking and Disrupting Dark Networks’ under the direction of Professors Sean Everton and Professor Nancy Roberts. Dan Cunningham reviewed, cleaned, and updated the data, in particular the time code information.”

In this tutorial we will examine a binary, one-mode aggregation/combination of operational, communication, and trust-based ties among 30 individuals involved in the August 2003 attack on the JW Marriott in Jakarta. Together, these relationships, which we will refer to as our combined network, are undirected and stored in an edge list (Noordin_Edgelist.csv). The original data set contained 139 individuals, from which we extracted only those who were alive and active during August 2003. Furthermore, we extracted the largest component of the structure.

Finally, we will work with a single attribute, namely each individual’s militant group affiliation (i.e., “Primary.Group.Affiliation”), to demonstrate selected techniques in igraph. The attributes are stored in Noordin_Attribute.csv.

For more information on the comprehensive Noordin data set, see Cunningham, Everton, and Murphy (2016).

Data Preparation

Let’s go ahead and import our network data and convert it to a “graph object” so that we can work with it in igraph. As previously stated, the data set is stored as an edge list, so we’ll bring it in using the following:

noordin_df <- read.csv(file = "data/Noordin_Edgelist.csv", header = TRUE) # read.csv() already returns a data frame

As we did in other tutorials, we can use head() to take a look at the first few observations in the newly created data frame.

head(noordin_df)

##          from                      to Relationship
## 1 Abdul Rohim    Noordin Mohammed Top     Combined
## 2 Abu Dujanah                  Amrozi     Combined
## 3 Abu Dujanah            Azhari Husin     Combined
## 4 Abu Dujanah                Dulmatin     Combined
## 5 Abu Dujanah Fathur Rahman Al- Ghozi     Combined
## 6 Abu Dujanah                 Hambali     Combined

The noordin_df data frame has three columns: “from”, “to”, and “Relationship.” The first observation (row one) represents the existence of a Combined (i.e., operational, communication, or trust) relationship between Abdul Rohim and Noordin Top.

We will use the first two columns (hence the [1:2] below) to create a social network graph using igraph’s graph_from_edgelist() function. This function creates an igraph object that can be used for graphing or statistical analysis (here we call the object noordin_g). Also note the embedded as.matrix() function; we use it to convert the edge list into a matrix first because graph_from_edgelist() requires a matrix.

library(igraph) # Load igraph before calling its functions
noordin_g <- graph_from_edgelist(as.matrix(noordin_df[1:2]), directed = FALSE)

We now have our network data stored as a graph object and can start exploring it.
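
To see what graph_from_edgelist() does without the Noordin files, here is a small sketch on a made-up edge list. The objects `toy_df`, `toy_mat`, and `toy_g` and the three actors are illustrative only:

```r
library(igraph)

# A made-up edge list with the same column layout as noordin_df.
toy_df <- data.frame(from = c("A", "A", "B"),
                     to   = c("B", "C", "C"),
                     Relationship = "Combined")

# Keep the first two columns and convert them to a character matrix.
toy_mat <- as.matrix(toy_df[1:2])

# Build an undirected graph; vertex names come from the matrix entries.
toy_g <- graph_from_edgelist(toy_mat, directed = FALSE)

vcount(toy_g)  # number of nodes
ecount(toy_g)  # number of ties
```

An alternative is `graph_from_data_frame(toy_df, directed = FALSE)`, which accepts the data frame directly and keeps the extra “Relationship” column as an edge attribute.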

Basic Exploration and Visualization

Before diving into our questions, let’s get acquainted with the data set by calculating some rudimentary statistics as well as creating some basic visualizations in igraph. In terms of the former, the summary() function gives us a basic description of the network. In the output, the “U” tells us our data are undirected, the “N” tells us the nodes are named, and the “30 194” tells us we have 30 nodes and 194 relations among them. We will import attributes shortly.

summary(noordin_g)

## IGRAPH e845537 UN-- 30 194 -- 
## + attr: name (v/c)

We can get additional information using the E() and V() functions, which give us a list of all the relationships (i.e., “E” is for edges) and nodes (i.e., “V” is for vertices) in the data set.

E(noordin_g)

## + 194/194 edges from e845537 (vertex names):
##  [1] Abdul Rohim         --Noordin Mohammed Top   
##  [2] Abu Dujanah         --Amrozi                 
##  [3] Abu Dujanah         --Azhari Husin           
##  [4] Abu Dujanah         --Dulmatin               
##  [5] Abu Dujanah         --Fathur Rahman Al- Ghozi
##  [6] Abu Dujanah         --Hambali                
##  [7] Abu Dujanah         --Ismail1                
##  [8] Noordin Mohammed Top--Abu Dujanah            
##  [9] Noordin Mohammed Top--Ahmad Basyir           
## [10] Ali                 --Aris Munandar          
## + ... omitted several edges

V(noordin_g)

## + 30/30 vertices, named, from e845537:
##  [1] Abdul Rohim                   Noordin Mohammed Top         
##  [3] Abu Dujanah                   Amrozi                       
##  [5] Azhari Husin                  Dulmatin                     
##  [7] Fathur Rahman Al- Ghozi       Hambali                      
##  [9] Ismail1                       Ahmad Basyir                 
## [11] Ali                           Aris Munandar                
## [13] Dani Chandra                  Hilman                       
## [15] Muchtar                       Salman                       
## [17] Umar2                         Zulkarnaen                   
## [19] Umar Patek                    Apuy                         
## + ... omitted several vertices

Let’s plot the network in igraph with a few basic aesthetics. The plot() function allows us to visualize the network, while vertex.color customizes the color of the nodes, vertex.label.color changes the node label color, and edge.curved depicts ties in a curved format. We can add a title with the main parameter.

plot(noordin_g, vertex.color = "lightblue", vertex.label.color = "black",
            edge.curved = 0.2, main = "Noordin Top Network (Aug 2003)" )

Figure 1: Simple plot of Noordin’s Network in August 2003

The parameters seen in the previous step are a bit confusing at first if you’re not quite comfortable with R. Table 2 provides a summary of commonly used plotting parameters. See ?igraph.plotting, Katya Ognyanova’s excellent tutorial on SNA in igraph (https://kateto.net/networks-r-igraph), or igraph’s website (https://igraph.org/r/) for an exhaustive list.

*Table 2: Summary of Commonly Used Plotting Parameters*
Parameter	Short Description
`vertex.color`	Adjusts node color.
`vertex.size`	Parameter for node size. Default is 15.
`vertex.shape`	Parameter for node shape (e.g., “sphere”)
`vertex.label`	Parameter for adjusting and setting node labels.
`vertex.label.font`	Parameter for node font. Font: 1=plain, 2=bold, 3=italic, 4=bold italic, 5=symbol
`vertex.label.family`	Adjusts font family.
`vertex.label.cex`	Parameter for changing font size.
`edge.color`	Parameter for setting edge color.
`edge.width`	Sets edge width (default = 1).
`edge.arrow.size`	Sets edge arrow size (default = 1).
`edge.arrow.mode`	Sets arrow aesthetics: 0=no arrow, 1=back, 2=forward, 3=both.
`edge.curved`	Edge curvature (0 = straight edges; `TRUE` is equivalent to 0.5).

Let’s do a few more aesthetic changes using some of the parameters shown in Table 2.

plot(noordin_g, vertex.color = "lightblue", vertex.size = 10, vertex.shape = "sphere", vertex.label = NA,
     vertex.label.cex=0.75, edge.color = "gray", edge.arrow.size = 0, edge.curved = 0, main = "Noordin Network (Aug 2003)")

Figure 2: Simple plot of Noordin’s Network in August 2003

Now, let’s change the layout so we can see the structure a bit differently. Here’s a circular layout, which, like all layouts, can be specified in two different ways. The first way is to create a layout object separately and then pass that object to the plot() function.

lay1<-layout_in_circle(noordin_g)
plot(noordin_g, vertex.color = "lightblue", vertex.size = 10, vertex.shape = "sphere", vertex.label = NA,
          vertex.label.cex=0.75, edge.color = "gray", edge.arrow.size = 0, edge.curved = 0,
            main = "Noordin Network (Aug 2003)", layout=lay1)

Figure 3: Circular plot of Noordin’s Network in August 2003

The second option is to set the layout type directly within the plot() function.

plot(noordin_g, vertex.color = "lightblue", vertex.size = 10, vertex.shape = "sphere", vertex.label = NA,
            vertex.label.cex=0.75, edge.color = "gray", edge.arrow.size = 0, edge.curved = 0,
            main = "Noordin Network (Aug 2003)", layout = layout_in_circle)

Figure 4: Circular plot of Noordin’s Network in August 2003

We can compare various layouts using the code below. The par(mfrow = c(2, 2)) call tells R to arrange the next four plots in a grid with two rows and two columns.

par(mfrow=c(2,2))

plot(noordin_g, vertex.color = "lightblue", vertex.size = 10, vertex.shape = "sphere", vertex.label = NA,
            vertex.label.cex=0.75, edge.color = "gray", edge.arrow.size = 0, edge.curved = 0,
            main = "Noordin Network (Aug 2003)", layout = layout_in_circle)# A circular layout

plot(noordin_g, vertex.color = "lightblue", vertex.size = 10, vertex.shape = "sphere", vertex.label = NA,
            vertex.label.cex=0.75, edge.color = "gray", edge.arrow.size = 0, edge.curved = 0,
            main = "Noordin Network (Aug 2003)", layout = layout_on_sphere)#A spherical layout

plot(noordin_g, vertex.color = "lightblue", vertex.size = 10, vertex.shape = "sphere", vertex.label = NA,
            vertex.label.cex=0.75, edge.color = "gray", edge.arrow.size = 0, edge.curved = 0,
            main = "Noordin Network (Aug 2003)", layout = layout_with_kk)# A spring embedded layout (kk = kamada kawai)

plot(noordin_g, vertex.color = "lightblue", vertex.size = 10, vertex.shape = "sphere", vertex.label = NA,
            vertex.label.cex=0.75, edge.color = "gray", edge.arrow.size = 0, edge.curved = 0,
            main = "Noordin Network (Aug 2003)", layout = layout_with_fr)# A force directed layout (fr = fruchterman and reingold)

Figure 5: Multiple Plots of Noordin’s Network in August 2003
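
Force-directed layouts such as Fruchterman-Reingold start from random positions, so two calls can place the same nodes differently. When plots need to be compared side by side, one option is to compute the coordinates once and reuse them. The sketch below assumes a toy ring graph (`toy_g`) in place of noordin_g, and the seed value is arbitrary:

```r
library(igraph)

toy_g <- make_ring(10)            # stand-in for noordin_g

set.seed(2003)                    # arbitrary seed so the layout can be reproduced
lay_fr <- layout_with_fr(toy_g)   # matrix with one row of x/y coordinates per node

op <- par(mfrow = c(1, 2))        # one row, two columns
plot(toy_g, vertex.label = NA, vertex.color = "lightblue",
     main = "First plot", layout = lay_fr)
plot(toy_g, vertex.label = NA, vertex.color = "orange",
     main = "Same coordinates", layout = lay_fr)
par(op)                           # restore the previous plotting parameters
```

Because both plots receive the same coordinate matrix, any visual difference between them comes from the aesthetics rather than from node placement.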

Let’s bring in the “Primary Group Affiliation” attribute to give a little context to our data set.

atts <- read.csv(file = "data/Noordin_Attribute.csv", header = TRUE) # read.csv() already returns a data frame

Take a quick look at the data if you haven’t already done so.

head(atts)

##                     id Primary.Group.Affiliation
## 1          Abdul Rohim              Unaffiliated
## 2 Noordin Mohammed Top          Jemaah Islamiyah
## 3          Abu Dujanah          Jemaah Islamiyah
## 4               Amrozi          Jemaah Islamiyah
## 5         Azhari Husin          Jemaah Islamiyah
## 6             Dulmatin          Jemaah Islamiyah

We will create a new node attribute called “Group” to represent our “Primary.Group.Affiliation” attribute. We can do this by extracting each value within the “Primary.Group.Affiliation” column when the name in the “id” column matches a node’s name in our network object (i.e., noordin_g).

V(noordin_g)$Group<-as.character(atts$Primary.Group.Affiliation[match(V(noordin_g)$name,atts$id)])

Print out the new vertex attribute using the following code.

V(noordin_g)$Group

##  [1] "Unaffiliated"     "Jemaah Islamiyah" "Jemaah Islamiyah"
##  [4] "Jemaah Islamiyah" "Jemaah Islamiyah" "Jemaah Islamiyah"
##  [7] "Jemaah Islamiyah" "Jemaah Islamiyah" "Unaffiliated"    
## [10] "KOMPAK"           "Unaffiliated"     "KOMPAK"          
## [13] "KOMPAK"           "Darul Islam"      "Jemaah Islamiyah"
## [16] "KOMPAK"           "Darul Islam"      "Jemaah Islamiyah"
## [19] "Jemaah Islamiyah" "Darul Islam"      "Darul Islam"     
## [22] "Unaffiliated"     "Jemaah Islamiyah" "Jemaah Islamiyah"
## [25] "Jemaah Islamiyah" "Jemaah Islamiyah" "Darul Islam"     
## [28] "Unaffiliated"     "Unaffiliated"     "Unaffiliated"
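
The match() step is easy to get wrong when the attribute file is sorted differently from the graph or is missing an actor. The sketch below walks through it on made-up data; `toy_g`, `toy_atts`, and `idx` are illustrative names, and actor “C” is deliberately left out of the attribute table:

```r
library(igraph)

toy_g <- graph_from_edgelist(cbind(c("A", "A", "B"),
                                   c("B", "C", "C")), directed = FALSE)

# Attribute table listed in a different order than the graph's nodes,
# with one actor ("C") missing on purpose.
toy_atts <- data.frame(id = c("B", "A"),
                       Primary.Group.Affiliation = c("KOMPAK", "Jemaah Islamiyah"))

# match() returns, for each node name, its row position in toy_atts$id (or NA).
idx <- match(V(toy_g)$name, toy_atts$id)
V(toy_g)$Group <- as.character(toy_atts$Primary.Group.Affiliation[idx])

# Actors without a matching attribute row end up with NA.
V(toy_g)$name[is.na(V(toy_g)$Group)]
```

Running the last line on a real network is a quick way to spot names that are spelled differently in the two files.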

We are now ready to build upon our previous visualizations by coloring the nodes by their group affiliation. We will assign colors to each group category and create a “color” attribute to which we can refer back when we want to color nodes by militant group affiliation. The gsub() function helps us do this.

V(noordin_g)$color<-V(noordin_g)$Group #First, assign the "Primary.Group.Affiliation" attribute as the vertex color.
V(noordin_g)$color<-gsub("Unaffiliated","orange",V(noordin_g)$color) #Unaffiliated will be orange.
V(noordin_g)$color<-gsub("KOMPAK","red",V(noordin_g)$color) #KOMPAK nodes will be red.
V(noordin_g)$color<-gsub("Jemaah Islamiyah","blue",V(noordin_g)$color) #JI nodes will be blue.
V(noordin_g)$color<-gsub("Darul Islam","green",V(noordin_g)$color) #DI nodes will be green.
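
An alternative to chaining gsub() calls is a named-vector lookup, which keeps the whole group-to-color mapping in one place. This is only a sketch: the `group` vector below is made up to stand in for V(noordin_g)$Group, and `group_colors` and `node_colors` are illustrative names:

```r
# Made-up affiliations standing in for V(noordin_g)$Group.
group <- c("Unaffiliated", "Jemaah Islamiyah", "KOMPAK",
           "Darul Islam", "Jemaah Islamiyah")

# One named vector holds the same group-to-color mapping used above.
group_colors <- c("Unaffiliated"     = "orange",
                  "KOMPAK"           = "red",
                  "Jemaah Islamiyah" = "blue",
                  "Darul Islam"      = "green")

# Indexing by name looks up each actor's color in a single step.
node_colors <- unname(group_colors[group])
node_colors
```

With the tutorial’s graph, the equivalent assignment would be `V(noordin_g)$color <- unname(group_colors[V(noordin_g)$Group])`. A group that is missing from the mapping shows up as NA, which makes gaps easy to notice.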

Here we will keep things simple and use a single layout, Kamada-Kawai, like we did in Figure 5.

plot(noordin_g,vertex.size = 10, vertex.shape = "sphere", vertex.label = NA,
            vertex.label.cex=0.75, edge.color = "gray", edge.arrow.size = 0, edge.curved = 0,
            main = "Noordin Network (Aug 2003)", layout = layout_with_kk)

Figure 6: Noordin’s Network in August 2003, Node Color by Group

Now, plot the same visualization but with a legend.

plot(noordin_g,vertex.size = 10, vertex.shape = "sphere", vertex.label = NA,
            vertex.label.cex=0.75, edge.color = "gray", edge.arrow.size = 0, edge.curved = 0,
            main = "Noordin Network (Aug 2003)", layout = layout_with_kk)

colrs<-c("orange", "red", "blue", "green")# We will use these colors in the legend; they match the colors in Figure 6. 

legend(x=-1.5, y=-1.1, c("Unaffiliated","KOMPAK", "JI", "DI"), pch=21,
       pt.bg=colrs, pt.cex=2, cex=.8, bty="n", ncol=1) # The pt.bg assigns the colors to the appropriate group while other parameters set the size, text, and location of the legend.

Figure 7: Noordin’s Network in August 2003, Node Color by Group

Topography

The first set of questions we hope to explore is, how interconnected was Noordin’s network prior to the attack? Did Noordin’s network appear to be structured around him? Or was the network decentralized?

Based on these questions, we will utilize a handful of metrics that fall under the umbrella of network topography. Table 3 (adapted from Cunningham, Everton, and Murphy (2016)) outlines relevant measures, including a definition, potential ways to interpret the results, and caveats to keep in mind when using them.

*Table 3: Sample of Network Topographic Measures/Statistics*
Measure	Definition	Interpretation	Caveat
Density	The total number of ties in a network divided by the total possible number of ties in that network. Scores range from 0 to 1.	Indicates how interconnected a network is, which sheds light on potential trade-offs a network may have to consider (e.g., efficiency vs. operational security). For example, a dense network, with a focus on strong ties, may have a hard time getting resources from the outside.	Should not be used to compare networks of different sizes.
Average Degree	The sum of all actors’ ties (i.e., their degree scores) divided by the number of actors in the network.	May indicate how interconnected a network is, which sheds light on potential trade-offs.	Networks that adopt a cell-like structure can be locally dense but globally sparse.
Centralization	The ratio of the actual sum of differences in actor centrality to the theoretical maximum, yielding a score between 0 and 1.	Centralization indicates how centralized, or decentralized, a network is. A network with high degree centralization could indicate that one or a few actors are relatively active compared to the rest of the actors.	Can be confused with centrality, which is a node-level measure.

We can calculate all of these with igraph functions. First, let’s calculate graph density.

edge_density(noordin_g)

## [1] 0.445977
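
The density score above can be reproduced by hand from the node and tie counts that summary(noordin_g) reported, using the definition in Table 3. The variable names below are ours:

```r
n <- 30   # nodes reported by summary(noordin_g)
m <- 194  # ties reported by summary(noordin_g)

# Possible ties in an undirected network without self-loops: n * (n - 1) / 2.
possible_ties <- n * (n - 1) / 2

density_by_hand <- m / possible_ties
density_by_hand  # approximately 0.445977, matching edge_density(noordin_g)
```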

A graph density of 0.445977 suggests the network is neither too sparse, nor too dense. We can see a relatively dense core of the network but several peripheral actors maintained only a few ties to others during that period of time. Now, let’s take a look at average degree. While there is no function for that, R is pretty good at calculating averages. Therefore, we need to calculate the degree for all nodes and then take an average. This step is easily done in R.

mean(degree(noordin_g))

## [1] 12.93333
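
Because each undirected tie adds one to the degree of two actors, the average degree can also be derived directly from the counts reported by summary(noordin_g). A short check, with illustrative variable names:

```r
n <- 30   # nodes
m <- 194  # ties

# Each tie contributes to two degree scores, so the degrees sum to 2 * m.
avg_degree <- 2 * m / n
avg_degree            # approximately 12.93333, matching mean(degree(noordin_g))

# Dividing by the maximum possible degree (n - 1) recovers the density score.
avg_degree / (n - 1)  # approximately 0.445977
```

The second line shows why density and average degree tell a similar story for a single network: they differ only by the factor n - 1.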

This result tells us that, on average, actors have about 12.9 connections to others in the network. From these two measures of interconnectedness/cohesion (i.e., density and average degree), it appears Noordin kept many folks close for operational purposes but he also maintained direct and indirect connections to “outsiders” who could provide resources and operational guidance to him and his close associates. Existing research into this network suggests a similar pattern (Everton and Roberts (2011); Everton and Cunningham (2016)).

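To explore our final question regarding centralization, we can use igraph’s centralization.degree() function (named centr_degree() in more recent versions of igraph). Note we included the mode="total" argument to count all of a node’s ties; because our network is undirected, the argument has no effect here. The results list each node’s number of connections (i.e., degree centrality) under $res, the graph-level centralization score under $centralization, and the theoretical maximum used to normalize it under $theoretical_max.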

centralization.degree(noordin_g, mode="total", normalized=TRUE)

## $res
##  [1]  2 28 14 18 26 26 20 22 20  2 14 16 14 14 14 22 14 24 14  6  8  2  8
## [24]  8 10 14  2  2  2  2
## 
## $centralization
## [1] 0.5195402
## 
## $theoretical_max
## [1] 870
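
To connect this output to the definition in Table 3, the centralization score can be recomputed from the degree scores under $res and the value under $theoretical_max. The sketch copies both from the output above rather than recomputing them, and the variable names are ours:

```r
# Degree scores copied from the $res output above.
deg <- c(2, 28, 14, 18, 26, 26, 20, 22, 20, 2, 14, 16, 14, 14, 14,
         22, 14, 24, 14, 6, 8, 2, 8, 8, 10, 14, 2, 2, 2, 2)

# Sum of the gaps between the most central actor and every actor.
observed <- sum(max(deg) - deg)

# Theoretical maximum copied from $theoretical_max.
theoretical_max <- 870

observed / theoretical_max  # approximately 0.5195402
```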

The results indicate the network was moderately centralized prior to its first attack. As with all topographic measures, interpreting the results can be tricky without comparing them to similar structures, such as a comparison of the same network over time. Everton and Cunningham (2016) did just that and found that while Noordin’s network was not substantially centralized in August 2003, it increasingly became centralized over time and prior to its major terrorist attacks.
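
One thing worth checking before leaning on these numbers: every degree score under $res is even. That pattern would be consistent with each tie appearing twice in the edge list (once in each direction), although the output alone cannot confirm it. The sketch below shows, on a made-up edge list, how duplicated ties inflate degree and density and how igraph’s any_multiple() and simplify() can detect and collapse them. The objects `toy_el`, `toy_g`, and `toy_simple` are illustrative:

```r
library(igraph)

# Made-up edge list in which every tie is listed in both directions.
toy_el <- cbind(c("A", "B", "A", "C"),
                c("B", "A", "C", "A"))
toy_g <- graph_from_edgelist(toy_el, directed = FALSE)

any_multiple(toy_g)            # TRUE when parallel ties are present
degree(toy_g)                  # every score is doubled by the duplicates
edge_density(toy_g)            # inflated for the same reason

toy_simple <- simplify(toy_g)  # collapse parallel ties (and drop self-loops)
degree(toy_simple)
edge_density(toy_simple)
```

If `any_multiple(noordin_g)` returns TRUE for your copy of the data, it may be worth rerunning the topographic measures on `simplify(noordin_g)` and comparing the results.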

Cohesive Subgroups

The next question we want to explore is, were there clusters consisting of individuals from various groups? How can we describe them? This question is interesting because we know that Noordin’s network was comprised of individuals from various militant groups across the region.

As with network topography, we have many options to choose from in terms of subgroup analysis. Here we will look at a sample of those available. Table 4 (adapted from Cunningham, Everton, and Murphy (2016)) outlines relevant measures, including a definition, potential ways to interpret the results, and caveats to keep in mind when using them.

*Table 4: Sample of Clustering Measures/Statistics*
Measure	Definition	Interpretation	Caveat
Walktrap	An agglomerative clustering approach that models a random walker who would tend to remain in dense parts of a network (i.e., communities) since there are fewer paths out than within. Actors are merged into subgroups according to their similarity, estimated through random walks.	Helps analysts identify larger communities, or relatively dense clusters, within dark networks, which highlights potential seams, or vulnerabilities, between those communities.	Tends to exhibit better sensitivity in dense networks than other community detection models, except for Spinglass.
Girvan-Newman	Similar to faction analysis in that subgroups are defined as having more ties within and fewer ties between groups than would be expected in a random graph of the same size with the same number of ties. Focuses on edge betweenness.	Helps analysts identify larger communities, or relatively dense clusters, within dark networks, which highlights potential seams, or vulnerabilities, between those communities.	Calculated differently than other community detection algorithms because it begins an iterative process by calculating edge betweenness and subsequently removing the tie with the highest score. Although the approach is intuitive, it tends to exhibit poor sensitivity with dense networks.

In terms of writing scripts, running these algorithms is fairly straightforward. For both types of subgroups, we will use the appropriate function and then estimate a modularity score, which compares the ties within and across subgroups (i.e., clusters) to what one would expect in a random graph of the same size and having the same number of ties.

cw<-cluster_walktrap(noordin_g)#Walktrap
modularity(cw)

## [1] 0.3814964

eb <- cluster_edge_betweenness(noordin_g)#Girvan Newman
modularity(eb)

## [1] 0.3593899
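
To make the modularity score less of a black box, the sketch below computes it by hand for a toy network with an obvious partition and compares the result with igraph’s modularity(). The toy graph (two triangles joined by one tie) and all object names are made up for illustration:

```r
library(igraph)

# Two triangles (1-2-3 and 4-5-6) joined by a single tie between nodes 3 and 4.
toy_g <- make_graph(c(1, 2, 2, 3, 1, 3,
                      4, 5, 5, 6, 4, 6,
                      3, 4), directed = FALSE)

# A partition that puts each triangle in its own subgroup.
two_groups <- c(1, 1, 1, 2, 2, 2)

# Modularity by hand: for each subgroup, the share of ties that fall inside it
# minus the share expected by chance given the subgroup's total degree.
m <- ecount(toy_g)                                      # 7 ties
ties_inside <- c(3, 3)                                  # ties within each triangle
degree_sum  <- tapply(degree(toy_g), two_groups, sum)   # 7 and 7
q_by_hand <- sum(ties_inside / m - (degree_sum / (2 * m))^2)

q_by_hand                                   # 5/14, about 0.357
modularity(toy_g, membership = two_groups)  # same value from igraph
```

Scores above zero mean more ties fall inside the subgroups than a random graph with the same degrees would produce.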

We will add one item here for the Girvan-Newman algorithm, namely membership(), which tells us the “community” to which each actor belongs according to the number of clusters that provides the highest possible modularity score.

membership(eb)

There is some debate as to what constitutes a “good” modularity score. A discussion of this topic is beyond the scope of this tutorial, so we will go with walktrap because its modularity suggests it does a bit better on this network than Girvan-Newman, but not by much.

Let’s take a look at the network’s subgroups based on walktrap using convex hulls, which can provide us with a nice visual depiction of them. Here we can compare our visual in Figure 6 depicting each individual’s affiliation with the subgroup to which they belong.

op<-par(mfrow = c(1,2))# Multiple plots with 1 row and 2 columns 

## Subgroups
plot(cw, noordin_g,vertex.size = 10, vertex.color = "lightgray",vertex.shape = "sphere", vertex.label = NA, vertex.label.cex=0.75, edge.color = "gray", edge.arrow.size = 0, edge.curved = 0, main = "Subgroups", layout = layout_with_kk)

## Figure 6: Group Affiliation
plot(noordin_g,vertex.size = 10, vertex.shape = "sphere", vertex.label = NA,
            vertex.label.cex=0.75, edge.color = "gray", edge.arrow.size = 0, edge.curved = 0, main = "Group Affiliation", layout = layout_with_kk)

colrs<-c("orange", "red", "blue", "green") 

legend(x=-1.5, y=-1.1, c("Unaffiliated","KOMPAK", "JI", "DI"), pch=21,
       pt.bg=colrs, pt.cex=2, cex=.8, bty="n", ncol=1)

par(op) # Restore the plotting parameters saved in op

Figure 8: Noordin’s Network in August 2003, Subgroups vs. Affiliations

Figure 8 suggests we have clusters containing individuals from different militant groups. For instance, when we compare the two plots, we can see the nodes within the blue convex hull (i.e., the top left subgroup) include representatives from all four affiliations (i.e., Unaffiliated, KOMPAK, JI, and DI). At the same time, the subgroup represented by the green convex hull (i.e., middle) is made up mostly of JI members who appear to make up the core of the network.
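
A visual comparison like this can be backed up with a simple cross-tabulation of detected subgroup against affiliation. The sketch below uses a made-up six-node network and made-up affiliations; `toy_g`, `toy_cw`, and `tab` are illustrative names:

```r
library(igraph)

# Two triangles joined by one tie, with made-up affiliations.
toy_g <- make_graph(c(1, 2, 2, 3, 1, 3,
                      4, 5, 5, 6, 4, 6,
                      3, 4), directed = FALSE)
V(toy_g)$Group <- c("JI", "JI", "KOMPAK", "KOMPAK", "DI", "DI")

toy_cw <- cluster_walktrap(toy_g)

# Rows are detected subgroups, columns are affiliations.
tab <- table(Subgroup    = as.integer(membership(toy_cw)),
             Affiliation = V(toy_g)$Group)
tab
```

For the tutorial’s objects, the equivalent call would be `table(membership(cw), V(noordin_g)$Group)`. A row with counts spread across several columns indicates a subgroup that mixes affiliations.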

Centrality

The final set of questions we want to examine leads us to centrality and brokerage. Specifically, who were key individuals in the operation from a structural perspective besides Noordin? Also, who were the key facilitators of information during the operation?

The centrality metric family is perhaps the most intuitive and the most commonly used. The basic idea is to identify structurally “important” actors. However, the variety of interpretations of what it means to be central or “important” means that no single measure can be used as a “silver bullet” in SNA. Instead, analysts should focus on using these measures to describe the potential importance of each actor in the network.

Some of the most relevant measures are:

*Table 5: Sample of Centrality Measures/Statistics*
Measure	Definition	Interpretation	Caveat
Degree	Count of an actor’s ties.	Actor activity; Direct power or influence, or ability to be influenced by others	In some cases, well-connected actors are the result of biased collection.
Betweenness	How often each actor lies on the shortest path between all pairs of actors.	Brokerage potential; Gatekeepers; Boundary Spanners	Betweenness assumes a desire for efficiency. Actors, resources, and information may not always follow shortest paths.
Closeness	The inverse of the average shortest path (i.e., geodesic) distance from an actor to every other actor in the network.	Actor levels of accessibility to others, and to material and non-material goods.	Not designed for use with disconnected networks.
Eigenvector	Weights an actor’s centrality by the centrality of its neighbors.	Indirect influence or power; Potential social capital.	In well-connected networks (or sub-networks, such as cliques), it is often difficult to identify a single, or a few, potentially powerful actors.

We can calculate each measure individually using the following scripts. Note we did not provide the outputs for each function but we will for the dynamic table.

degree(noordin_g, mode = "total", loops=F)# Active Individuals

betweenness(noordin_g, directed = F, normalized = T)# Potential Brokers

eigen_centrality(noordin_g, directed = F, weights = NULL)# People connected to well-connected others

closeness(noordin_g, mode = "all", weights = NULL, normalized = T)# People with potential access to others, materials, etc.
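
To build intuition for what these four functions return, it can help to run them on a network where the answer is obvious. The sketch below uses a five-node star (`star_g`, an illustrative object) in which node 1 is tied to everyone else and no other ties exist:

```r
library(igraph)

star_g <- make_star(5, mode = "undirected")  # node 1 is the hub

degree(star_g)                                             # hub: 4, leaves: 1
betweenness(star_g, directed = FALSE, normalized = TRUE)   # hub: 1, leaves: 0
closeness(star_g, normalized = TRUE)                       # hub: 1, leaves: 4/7
eigen_centrality(star_g)$vector                            # largest score is scaled to 1
```

Note that eigen_centrality() returns a list, so the scores themselves have to be pulled out with `$vector`.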

Another option is to calculate each measure, attach it to a data frame, and render the measures as variables in a dynamic table using the DT package. First, let’s recalculate the centrality measures and put them into a data frame.

metrics<-data.frame(id = V(noordin_g)$name,
              Degree = degree(noordin_g,
                              mode="total",
                              loops=FALSE,
                              normalized = FALSE),
              Betweenness = round(betweenness(noordin_g,
                                        directed = FALSE,
                                        weights = NULL,
                                        normalized = TRUE),
                                  digits = 2),
              # eigen_centrality() returns a list; the scores are stored in $vector
              Eigenvector = eigen_centrality(noordin_g,
                                             directed = FALSE,
                                             weights = NULL)$vector,
              Closeness = round(closeness(noordin_g,
                                    mode="total",
                                    weights = NULL,
                                    normalized = TRUE),
                                digits = 2))

Now let’s put them in a dynamic table so we can interact with the results.

library(dplyr) # Provides %>%, arrange(), select(), and one_of()

DT::datatable(metrics %>%
  arrange(desc(Degree))%>%
  select(one_of(c("id","Degree", "Betweenness", "Eigenvector", "Closeness"))),
  class = 'cell-border stripe',
  rownames = FALSE,
  filter="top",
  selection="multiple",
  escape=FALSE,
  options=list(scrollX=TRUE,
               pageLength=10,
               sDom='<"top">lrt<"bottom">ip')
  )

Table 6: Noordin Network August 2003 - Centrality

The centrality results suggest several individuals were key actors in the network during mid-2003. According to multiple measures, we can see that people such as Azhari Husin, Dulmatin, Zulkarnaen, Ismail1, and Fathur Rahman Al-Ghozi (to name a few) all were structurally important individuals. In fact, much has been written about these individuals and the roles they played in numerous terrorist attacks and plots. Because this is a simple demonstration, we will limit our interpretation to that. However, centrality is by no means the end of an analysis but rather a set of indicators about which actors we should analyze more deeply.
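One simple way to find actors who stand out "according to multiple measures" is to count how many measures place each actor in the top k. The sketch below uses hypothetical scores (metrics_toy) and an arbitrary cutoff (top_k) purely for illustration; it is not the procedure used to produce the interpretation above.

```r
# Hypothetical scores standing in for a data frame of centrality measures
metrics_toy <- data.frame(
  id          = c("a", "b", "c", "d", "e"),
  Degree      = c(5, 3, 4, 1, 2),
  Betweenness = c(0.40, 0.10, 0.35, 0.00, 0.05),
  Closeness   = c(0.80, 0.60, 0.70, 0.40, 0.50)
)

top_k    <- 2  # arbitrary cutoff chosen for illustration
measures <- c("Degree", "Betweenness", "Closeness")

# For each measure, flag the actors ranked in the top k (ties share a rank)
in_top <- sapply(measures, function(m) {
  rank(-metrics_toy[[m]], ties.method = "min") <= top_k
})

# Count how many measures place each actor in the top k
metrics_toy$n_top <- rowSums(in_top)
metrics_toy[order(-metrics_toy$n_top), c("id", "n_top")]
```

Actors with a high count are reasonable candidates for the deeper analysis mentioned above.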

Interactive Visualizations with visNetwork

With R you have many options to produce interactive visualizations for your reports (e.g., R Markdown, which is what you’re looking at)⁶, briefs (e.g., Reveal JS⁷ and Xaringan⁸), and/or interactive tools/dashboards (e.g., R Shiny⁹ and flexdashboard¹⁰). An in-depth tutorial of these options is beyond the scope of this write-up, but we highly recommend that you explore them as you become more comfortable with R.

One package that is useful for interactive social network visualizations is visNetwork. Unfortunately, this package offers few statistics for analyzing networks of interest. In fact, it does not offer users the ability to calculate the metric families we’ve discussed here; however, we can run actor-level measures (e.g., centrality and brokerage) in other programs, store the results as actor attributes, and then use visual properties (e.g., size) to interact with our data.

As with the other packages we’ve used so far, we will keep things simple and build only a few of the same visualizations from above. This time, however, we will include some interactivity in our visualizations.

Several ways exist to get your data into visNetwork, depending on your starting point. In this example, we can take our igraph object and send it directly to visNetwork. We can do this in at least two ways (i.e., toVisNetworkData() or visIgraph()). Each option has tradeoffs; for example, visIgraph(), which we use here, maintains the colors we produced in igraph.

visIgraph(noordin_g)

Figure 9: Noordin’s Network in August 2003, igraph to visNetwork
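The other route, toVisNetworkData(), converts an igraph object into plain data frames that you can edit before rendering. The following is a minimal sketch on a hypothetical three-actor network (toy_g is illustrative only), assuming igraph and visNetwork are installed.

```r
library(igraph)
library(visNetwork)

# Hypothetical three-actor network used only for illustration
toy_g <- graph_from_literal(A - B, B - C)

# Convert the igraph object into a list of two data frames
vis_data <- toVisNetworkData(toy_g)
names(vis_data)        # the list holds a nodes and an edges data frame
head(vis_data$nodes)   # node ids taken from the vertex names

# Pass the data frames to visNetwork() and style them as usual
visNetwork(nodes = vis_data$nodes, edges = vis_data$edges)
```

The tradeoff is that this route hands you data frames to restyle by adding or changing columns, whereas visIgraph() renders the igraph object in a single call.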

We recognize that you may not start from igraph but rather begin with a node list and an edge list in CSV format. We will focus on this approach here.

Let’s re-import our Noordin network edge list and attribute file one more time, as if we were starting from scratch. Note that we will bring in a slightly different version of our attribute file for this portion of the tutorial; we’ve added degree centrality scores to it so we can work with some additional visual properties.

edges <- as.data.frame(read.csv(file="data/Noordin_Edgelist.csv", header=TRUE))

nodes <- as.data.frame(read.csv(file="data/Noordin_Attribute (visNet).csv", header=TRUE))
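visNetwork expects the nodes data frame to contain an id column and the edges data frame to contain from and to columns. The sketch below uses a hypothetical edge list (edges_toy) to show that layout, along with one way degree scores could be derived from an edge list and stored as a node attribute. It assumes an undirected network with no duplicate ties or loops, and it is not necessarily how our attribute file was prepared.

```r
# Hypothetical edge list with the from/to columns that visNetwork expects
edges_toy <- data.frame(from = c("A", "A", "A", "B"),
                        to   = c("B", "C", "D", "C"))

# In an undirected network without duplicate ties or loops, an actor's
# degree is the number of times the actor appears at either end of an edge
deg <- table(c(edges_toy$from, edges_toy$to))

# Node table with the id column that visNetwork expects, plus Degree
nodes_toy <- data.frame(id     = names(deg),
                        Degree = as.integer(deg))
nodes_toy
```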

Let’s visualize the network using a few basic lines of script in which we tell visNetwork what the nodes and edges are as well as the layout type.

visNetwork::visNetwork(nodes=nodes,
                       edges=edges) %>%
  visNetwork::visPhysics(enabled = TRUE,
                         solver = "forceAtlas2Based")

Figure 10: Noordin’s Network in August 2003

We can adjust each node’s color, shape, size, label, and title. To do so, we add these variables to the nodes data frame.

# First, resize all nodes to reflect degree centrality.
nodes$size <- nodes$Degree
# Now set the shape of all nodes.
nodes$shape <- "dot"
# Color nodes by group affiliation.
nodes$color[nodes$Primary.Group.Affiliation == "Unaffiliated"] <- "orange"
nodes$color[nodes$Primary.Group.Affiliation == "KOMPAK"] <- "red"
nodes$color[nodes$Primary.Group.Affiliation == "Jemaah Islamiyah"] <- "blue"
nodes$color[nodes$Primary.Group.Affiliation == "Darul Islam"] <- "green"
# To reduce the number of colors in the visualization, give all edges the same color.
edges$color <- "slategrey"
# Remove labels and use the node ids as hover titles.
nodes$label <- ""
nodes$title <- nodes$id
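As the number of groups grows, a named lookup vector can replace the line-per-group color assignments. This is a minimal sketch on a hypothetical node table (nodes_toy); the "Other Group" value and the grey fallback color are illustrative assumptions.

```r
# Hypothetical node table with the same affiliation column name
nodes_toy <- data.frame(
  id = c("n1", "n2", "n3", "n4"),
  Primary.Group.Affiliation = c("KOMPAK", "Darul Islam",
                                "Unaffiliated", "Other Group")
)

# Named vector that maps each affiliation to a color
group_colors <- c("Unaffiliated"     = "orange",
                  "KOMPAK"           = "red",
                  "Jemaah Islamiyah" = "blue",
                  "Darul Islam"      = "green")

# Look up every node's color in one step; unmatched groups come back as NA
nodes_toy$color <- unname(group_colors[as.character(nodes_toy$Primary.Group.Affiliation)])

# Give unmatched groups an explicit fallback color (an arbitrary choice here)
nodes_toy$color[is.na(nodes_toy$color)] <- "grey"
nodes_toy
```

Keeping the mapping in one vector also makes it easier to reuse the same colors when building a legend.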

We can now re-render the network with the edits we just made.

visNetwork::visNetwork(nodes=nodes,
                       edges=edges) %>%
  visNetwork::visPhysics(enabled = TRUE,
                         solver = "forceAtlas2Based")

Figure 11: Noordin’s Network in August 2003

Let’s add the ability to see actor attributes when we hover over the nodes.

nodes$title <- paste("<b>Name: </b>", nodes$id, "<br>",
                     "<b>Affiliation: </b>", nodes$Primary.Group.Affiliation, "<br>")

Also, let’s use the width = "100%" argument to maximize the visualization window, the main = argument to add a title, and the visLegend() function to add a legend.

visNetwork::visNetwork(nodes=nodes,
                       edges=edges, width = "100%", main = "Noordin Network (Aug 2003)") %>%
  visNetwork::visPhysics(enabled = TRUE,
                         solver = "forceAtlas2Based") %>%
  visNetwork::visLegend(addNodes=list(
    list(label="Unaffiliated", shape="square", size=5, color="orange"),
    list(label="KOMPAK", shape="square", size=5, color="red"),
    list(label="JI", shape="square", size=5, color="blue"),
    list(label="DI", shape="square", size=5, color="green")),
  useGroups=FALSE,
  position = "left")

Figure 12: Noordin’s Network in August 2003, Affiliations

Finally, let’s add a drop-down menu to select nodes based on id (visOptions(nodesIdSelection = TRUE)).

visNetwork::visNetwork(nodes=nodes,
                       edges=edges, width = "100%", main = "Noordin Network (Aug 2003)") %>%
  visNetwork::visPhysics(enabled = TRUE,
                         solver = "forceAtlas2Based") %>%
  visNetwork::visLegend(addNodes=list(
    list(label="Unaffiliated", shape="square", size=10, color="orange"),
    list(label="KOMPAK", shape="square", size=10, color="red"),
    list(label="JI", shape="square", size=10, color="blue"),
    list(label="DI", shape="square", size=10, color="green")),
  useGroups=FALSE,
  position = "left") %>%
  visNetwork::visOptions(nodesIdSelection = TRUE, highlightNearest = TRUE)

Figure 13: Noordin’s Network in August 2003, Affiliations

Conclusion and Other Resources

Remember, this tutorial is very basic and is designed to get you interested in using R for SNA. Many useful resources exist that go into far more depth than this document; a few recent ones, including several that pertain to SNA in R, are listed here.

Resources

Below is a list of useful resources for those who want to learn SNA and who are seeking additional information. This list is certainly not exhaustive but it is a great place to start.

SNA (Concepts, Measures, and Theory)

Borgatti, Stephen P., Martin G. Everett, and Jeffrey C. Johnson. 2013. Analyzing Social Networks. Los Angeles and London: SAGE Publications.
Cunningham, Daniel, Sean F. Everton, and Philip J. Murphy. 2016. Understanding Dark Networks: A Strategic Framework for the Use of Social Network Analysis. Lanham, MD: Rowman and Littlefield.
Everton, Sean F. 2012. Disrupting Dark Networks. Cambridge: Cambridge University Press.
McCulloh, Ian, Helen Armstrong, and Anthony Johnson. 2013. Social Network Analysis with Applications. Hoboken, NJ: Wiley.
Prell, Christina. 2011. Social Network Analysis: History, Theory & Methodology. London and Thousand Oaks, CA: SAGE Publications.
Robins, Garry L. 2015. Doing Social Network Research: Network-based Research Design for Social Scientists. London: SAGE.

SNA in R (Online Guides)

Shizuka Lab at University of Nebraska-Lincoln (http://www.shizukalab.com/toolkits/sna)
Katya Ognyanova at Rutgers University (http://kateto.net/network-visualization)
Stanford University’s SNA (https://sna.stanford.edu/rlabs.php)
Jesse Sadler’s site (https://www.jessesadler.com/post/network-analysis-with-r/)

Appendix

Converting Two-Mode Networks to One-Mode Networks

This tutorial did not cover an important process (well, several actually) that you may need to follow as you leverage SNA. Converting two-mode data (e.g., people connected to organizations, or accounts connected to comment threads) to one-mode data is an important step in many SNA-based investigations.

We will use a different data set to demonstrate this process, namely a two-mode network of hypothetical gang members and their affiliations to fictional gangs. The file is an edge list called “Affiliation.csv”.

We can convert this two-mode network to a one-mode network rather easily using igraph’s bipartite.mapping() and bipartite.projection() functions. First, create an affiliation graph using the graph_from_edgelist() function used previously.

affiliation <- as.data.frame(read.csv(file="data/Affiliation.csv", header=TRUE))

affiliationNet <- graph_from_edgelist(as.matrix(affiliation[1:2]), directed=F)

Now use the bipartite.mapping() function to evaluate whether the vertices of the network can be divided into two sets such that every edge connects a vertex in one set to a vertex in the other. In essence, this function checks whether or not a graph is bipartite (i.e., two-mode).

bipartite.mapping(affiliationNet)

## $res
## [1] TRUE
## 
## $type
##  [1] FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
## [12] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
## [23] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
## [34] FALSE FALSE FALSE FALSE

The bipartite.mapping() function returns two elements:

A logical scalar ($res), where TRUE indicates that the graph is bipartite and FALSE indicates that it is not.
A logical vector ($type) indicating which of the two node classes each node falls into.
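To see both elements on a small case, the sketch below builds a hypothetical two-mode edge list (people p1 to p3, groups g1 and g2) and contrasts it with a triangle, which cannot be bipartite. It uses bipartite_mapping(), the underscore spelling of the same function in recent igraph releases; the toy data are assumptions for illustration only.

```r
library(igraph)

# Hypothetical two-mode edge list: people (p*) tied to groups (g*)
two_mode <- graph_from_edgelist(
  rbind(c("p1", "g1"), c("p2", "g1"), c("p2", "g2"), c("p3", "g2")),
  directed = FALSE
)

mapping <- bipartite_mapping(two_mode)
mapping$res                             # TRUE: the vertices split into two sets
split(V(two_mode)$name, mapping$type)   # members of each set

# A triangle cannot be split this way, so it is not bipartite
bipartite_mapping(make_ring(3))$res     # FALSE
```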

Once we have determined that a graph is two-mode, we can assign the “type” attribute to each node in the network as follows:

V(affiliationNet)$type <- bipartite.mapping(affiliationNet)$type

Now that “type” has been assigned as an attribute of each node in the graph, we can begin to manipulate the two-mode network. First, let’s graph it.

plot(affiliationNet,
     layout=layout.bipartite,
     vertex.size=5,
     vertex.label=NA)

Figure 14: Gang Two-Mode/Bipartite Network

Now that we have mapped the network, we can use the bipartite.projection() function to calculate the actual one-mode projections. The function returns two of them: one projection for person-to-person coaffiliation ties ($proj1, built from the nodes whose type is FALSE) and another for organization-to-organization comembership ties ($proj2, built from the nodes whose type is TRUE).

coaffiliation <- bipartite.projection(affiliationNet)$proj1
comembership <- bipartite.projection(affiliationNet)$proj2
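By default, the projection stores the number of shared affiliations as an edge weight. The sketch below illustrates this on a hypothetical two-mode network (aff_toy) in which two people share two groups; it uses the underscore spelling bipartite_projection() and exports the edges with igraph::as_data_frame() so that the weight column is kept. All names in it are illustrative assumptions.

```r
library(igraph)

# Hypothetical two-mode network: p1 and p2 share two groups, p3 shares one
aff_toy <- graph_from_edgelist(
  rbind(c("p1", "g1"), c("p1", "g2"),
        c("p2", "g1"), c("p2", "g2"),
        c("p3", "g2")),
  directed = FALSE
)

# Set the type attribute explicitly: FALSE for people, TRUE for groups
V(aff_toy)$type <- V(aff_toy)$name %in% c("g1", "g2")

projections <- bipartite_projection(aff_toy)
people <- projections$proj1   # built from the nodes whose type is FALSE

# Keep the weight column when exporting the projected edges
people_edges <- igraph::as_data_frame(people, what = "edges")
people_edges
```

Here a weight of 2 on the tie between p1 and p2 means that they share two groups. Exporting only the two endpoint columns would discard that information.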

Let’s examine the coaffiliation network:

coaffiliation

## IGRAPH c08b108 UNW- 32 196 -- 
## + attr: name (v/c), weight (e/n)
## + edges from c08b108 (vertex names):
##  [1] All City--Blood Messiah  All City--Bat G.        
##  [3] All City--Big G.         All City--Blaze         
##  [5] All City--Bloodhound     All City--Brains        
##  [7] All City--Clown          All City--Droopy        
##  [9] All City--Fast Trigger   All City--Goldie        
## [11] All City--O.G.           All City--Smiley        
## [13] All City--Sharpie        All City--Baby Face     
## [15] All City--Bananas        All City--Book Collector
## + ... omitted several edges

The resulting network contains 32 nodes and 196 edges. These new edges can be extracted from the graph and saved into a new data frame.

coaffiliation_edgelist <- as.data.frame(get.edgelist(coaffiliation))

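Again, we need to create a graph from this edge list using the graph_from_edgelist() function. Note that the edge weights are not carried over, as the summary line of the output shows (UN-- rather than UNW-).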

coaff_net <- graph_from_edgelist(as.matrix(coaffiliation_edgelist[1:2]),
                                  directed = FALSE)
coaff_net

## IGRAPH 453d14b UN-- 32 196 -- 
## + attr: name (v/c)
## + edges from 453d14b (vertex names):
##  [1] All City--Blood Messiah  All City--Bat G.        
##  [3] All City--Big G.         All City--Blaze         
##  [5] All City--Bloodhound     All City--Brains        
##  [7] All City--Clown          All City--Droopy        
##  [9] All City--Fast Trigger   All City--Goldie        
## [11] All City--O.G.           All City--Smiley        
## [13] All City--Sharpie        All City--Baby Face     
## [15] All City--Bananas        All City--Book Collector
## + ... omitted several edges

We can plot the one-mode co-affiliation network as we did before.

plot(coaff_net,
     layout=layout_with_kk,
     vertex.size=5,
     vertex.label=NA)

Figure 15: Gang One-Mode Projection

Footnotes:

A suite or “wrapper” of several SNA packages ranging from descriptive measures (e.g., centrality) to advanced modeling (http://www.statnet.org/). Each package provides users with unique functionality. You can get access to all of these packages by installing statnet.↩
https://igraph.org/redirect.html ↩
https://datastorm-open.github.io/visNetwork/↩
See dplyr at tidyverse’s website, https://dplyr.tidyverse.org/.↩
https://cran.r-project.org/web/packages/DT/index.html ↩
https://rmarkdown.rstudio.com/.↩
https://cran.r-project.org/web/packages/revealjs/index.html.↩
https://github.com/yihui/xaringan.↩
https://shiny.rstudio.com/.↩
https://rmarkdown.rstudio.com/flexdashboard/.↩