Interactive Network Visualization using R

This chapter describes two key R packages for creating interactive network graphs:

  • visNetwork (Almende B.V., Thieurmel, and Robert 2017). Creates interactive network visualizations using the vis.js JavaScript library (http://visjs.org/).
  • networkD3 (Allaire et al. 2017). Creates D3 JavaScript network graphs from R.

You’ll learn how to:

  • Create a classic interactive network graph
  • Make an interactive Sankey diagram, useful for visualizing network flows
  • Interactively visualize classification and regression trees

Load demo data sets and R packages

We’ll use the phone.call2 data [in the navdata R package], a list containing the nodes and edges lists prepared from the phone.call data in the chapter on network visualization essentials.

Start by loading the tidyverse R package and the phone.call2 demo data sets:

library(tidyverse)
library("navdata")
data("phone.call2")
nodes <- phone.call2$nodes
edges <- phone.call2$edges

networkD3 R package

Key features

Can be used to easily create an interactive Sankey diagram, as well as other network layouts such as dendrogram, radial and diagonal networks.

Key R functions and options

Key R functions:

forceNetwork(). Creates a D3 JavaScript force-directed network graph.

forceNetwork(Links, Nodes, Source, Target,
             Value, NodeID, Nodesize, Group)

Key Arguments:

  • Links: edges list. The source and target IDs should start with 0.
  • Nodes: nodes list. Node IDs should start with 0, and the rows should be ordered so that the row position matches the zero-based ID used in Links.
  • Source, Target: the names of the columns, in the edges data, containing the network source and target variables, respectively.
  • Value: the name of the column, in the edges data, containing the edge weights. Used to set the width of the links.
  • NodeID: the name of the column, in the nodes data, containing the node IDs. Used for labeling the nodes.
  • Nodesize: the name of the column, in the nodes data, containing the values used to scale the node radius.
  • Group: the name of the column, in the nodes data, specifying the group of each node.

Prepare nodes and edges data

As specified above, the IDs in the nodes and edges lists should be numeric values starting with 0. This can be easily done by subtracting 1 from the existing IDs in the two data frames.

  1. Prepare the nodes and the edges data:
nodes_d3 <- mutate(nodes, id = id - 1)
edges_d3 <- mutate(edges, from = from - 1, to = to - 1)
  2. Create the interactive network:
library(networkD3)
forceNetwork(
  Links = edges_d3, Nodes = nodes_d3,  
  Source = "from", Target = "to",      # columns holding each edge's source and target
  NodeID = "label", Group = "id", Value = "weight", 
  opacity = 1, fontSize = 16, zoom = TRUE
  )
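
A common pitfall with the zero-based requirement is an edge that points at a node index that does not exist. The sketch below shows one way to check this before plotting. It uses a small made-up data set (nodes_toy and edges_toy are hypothetical names, not part of navdata), so it runs without the demo data.

```r
library(networkD3)

# Toy data (hypothetical): three nodes and two weighted edges with 1-based IDs
nodes_toy <- data.frame(id = 1:3, label = c("A", "B", "C"))
edges_toy <- data.frame(from = c(1, 2), to = c(2, 3), weight = c(5, 2))

# Shift to zero-based IDs, as in step 1 above
nodes_toy$id   <- nodes_toy$id - 1
edges_toy$from <- edges_toy$from - 1
edges_toy$to   <- edges_toy$to - 1

# Every edge endpoint must lie between 0 and (number of nodes - 1)
endpoints <- c(edges_toy$from, edges_toy$to)
ids_ok <- min(endpoints) >= 0 && max(endpoints) <= nrow(nodes_toy) - 1
stopifnot(ids_ok)

# Build the widget only once the IDs have been validated
p <- forceNetwork(
  Links = edges_toy, Nodes = nodes_toy,
  Source = "from", Target = "to",
  NodeID = "label", Group = "id", Value = "weight",
  opacity = 1, zoom = TRUE
)
p
```

If the check fails, the usual cause is that the subtraction was applied to only one of the two data frames.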

Note that a color is assigned to each group. Here, because we specified the column “id” as the node Group value, each individual node gets a different color.
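
To color nodes by a shared category instead, Group can point at any column of the nodes data. The following is a minimal sketch on made-up data; the region column and its values are hypothetical and only serve to illustrate the idea.

```r
library(networkD3)

# Toy nodes with a hypothetical grouping column called "region"
nodes_toy <- data.frame(
  label  = c("France", "Belgium", "Germany", "Denmark"),
  region = c("West", "West", "Central", "North")
)

# Zero-based edges: 0 = France, 1 = Belgium, 2 = Germany, 3 = Denmark
edges_toy <- data.frame(
  from   = c(0, 0, 2),
  to     = c(1, 2, 3),
  weight = c(9, 4, 2)
)

# Nodes sharing a region share a color; legend = TRUE shows the mapping
p_grouped <- forceNetwork(
  Links = edges_toy, Nodes = nodes_toy,
  Source = "from", Target = "to",
  NodeID = "label", Group = "region", Value = "weight",
  opacity = 1, legend = TRUE, zoom = TRUE
)
p_grouped
```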

Create a Sankey diagram

You can create a D3-styled Sankey diagram. A Sankey diagram is a good fit for the phone call data: there are not too many nodes, which makes the flow of phone calls easy to follow.

Create a Sankey diagram:

sankeyNetwork(
  Links = edges_d3, Nodes = nodes_d3, 
  Source = "from", Target = "to", 
  NodeID = "label", Value = "weight", 
  fontSize = 16, unit = "Call(s)")

Other network layouts

The networkD3 package also provides hierarchical layouts for visualizing tree-like graphs. In the example below, we start by computing hierarchical clustering using a sample of the USArrests data set:

set.seed(123)
hc <- USArrests %>% sample_n(15) %>%
  scale() %>% dist() %>%
  hclust(method = "complete")

  • dendroNetwork:
dendroNetwork(hc, fontSize = 15)

Other alternatives are:

  • radialNetwork:
radialNetwork(
  as.radialNetwork(hc), fontSize = 15
  )
  • diagonalNetwork:
diagonalNetwork(
  as.radialNetwork(hc), fontSize = 15
  )

visNetwork R package

Key features

  • Creates interactive network graphs.
  • Nodes and edges can be customized as you want.
  • Can be used to interactively visualize a network generated with the igraph package.
  • Can be used to visualize recursive partitioning and regression trees generated with the rpart package.
  • Images and icons can be used for node shapes.
  • Supports igraph layouts.

Key R function and options

Key R function:

visNetwork( 
  nodes = NULL, edges = NULL,
  width = NULL, height = NULL, 
  main = NULL, submain = NULL, footer = NULL
  )

Key Arguments:

  • nodes: nodes list information. Should contain at least the column “id”. See visNodes() for more options to control nodes. Other columns can be included in the data, such as:
    • “id” : id of the node, needed in edges information
    • “label” : label of the node
    • “group” : group of the node. Groups can be configured with visGroups().
    • “value” : size of the node
    • “title” : tooltip of the node
  • edges: edges list information. Requires at least the columns “from” and “to”. See visEdges() for more options to control edges.
    • “from” : id of the node where the edge begins
    • “to” : id of the node where the edge ends
    • “label” : label of the edge
    • “value” : size of the edge
    • “title” : tooltip of the edge
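
As an illustration of the columns listed above, here is a minimal sketch that builds the two data frames by hand. The data are made up (nodes_toy and edges_toy are hypothetical names); only the column names follow the visNetwork conventions described in the list.

```r
library(visNetwork)

# Nodes: "id" is required; the other columns are optional
nodes_toy <- data.frame(
  id    = 1:3,
  label = c("A", "B", "C"),                 # text drawn on the node
  group = c("g1", "g1", "g2"),              # used for group coloring
  value = c(10, 5, 1),                      # relative node size
  title = c("Node A", "Node B", "Node C")   # tooltip shown on hover
)

# Edges: "from" and "to" are required and must match node ids
edges_toy <- data.frame(
  from  = c(1, 1),
  to    = c(2, 3),
  label = c("a-b", "a-c"),                  # text drawn on the edge
  value = c(3, 1),                          # relative edge size
  title = c("3 calls", "1 call")            # tooltip shown on hover
)

vn <- visNetwork(nodes_toy, edges_toy, main = "Toy network")
vn
```

Unlike networkD3, the ids here do not need to start at 0; they only need to be consistent between the two data frames.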

Create a classic network graph

Note that the function labels the nodes using the “label” column in the node list.

You can move the nodes and the graph will use an algorithm to keep the nodes properly spaced. You can also zoom in and out on the plot and move it around to re-center it.

To always get the same layout, you can use the function visLayout(randomSeed = 12):

library("visNetwork")
visNetwork(nodes, edges) %>%
  visLayout(randomSeed = 12) 

Note that,

  • visNetwork can use igraph layouts, which include a large variety of possible layouts.
  • you can use visIgraph() to directly visualize an igraph network object.
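
As a sketch of the second point, the code below builds a small igraph object and passes it straight to visIgraph(). The edge list is made up for illustration, and layout_in_circle is just one of the igraph layouts that could be chosen.

```r
library(igraph)
library(visNetwork)

# Hypothetical edge list; vertex names are taken from the two columns
edges_toy <- data.frame(
  from = c("A", "A", "B"),
  to   = c("B", "C", "C")
)

# Build a directed igraph object from the edge list
g <- graph_from_data_frame(edges_toy, directed = TRUE)

# Visualize the igraph object directly, using an igraph layout by name
vg <- visIgraph(g, layout = "layout_in_circle")
vg
```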

To control the width of edges according to a variable, include a “width” column in the edge list data. You have to calculate and scale the edge widths manually.

In the following R code, we’ll customize the visNetwork() output by using an igraph layout and changing the edge widths.

First, add the column width to the edges list data frame, setting the minimum width to 1:

edges <- mutate(edges, width = 1 + weight/5)

Create the network graph with the variable edge widths and the igraph layout “layout_with_fr”:

visNetwork(nodes, edges) %>% 
  visIgraphLayout(layout = "layout_with_fr") %>% 
  visEdges(arrows = "middle") %>%
  visLayout(randomSeed = 1234)  
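
The 1 + weight/5 formula used above has no upper bound, so a few very heavy edges can dominate the plot. One alternative, sketched here on made-up weights, is to rescale the weights linearly into a fixed range. The helper rescale_width() and its default range of 1 to 8 are hypothetical choices, not part of visNetwork.

```r
library(dplyr)

# Hypothetical edge list with widely varying weights
edges_toy <- data.frame(
  from   = c(1, 1, 2),
  to     = c(2, 3, 3),
  weight = c(2, 10, 50)
)

# Linear min-max rescaling of x into the interval [min_w, max_w]
rescale_width <- function(x, min_w = 1, max_w = 8) {
  rng <- range(x, na.rm = TRUE)
  # If all weights are equal, fall back to the minimum width
  if (rng[1] == rng[2]) return(rep(min_w, length(x)))
  min_w + (x - rng[1]) / (rng[2] - rng[1]) * (max_w - min_w)
}

# The smallest weight maps to width 1 and the largest to width 8
edges_toy <- mutate(edges_toy, width = rescale_width(weight))
edges_toy
```

The resulting width column can be passed to visNetwork() in the same way as the one computed above.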

Visualize classification and regression trees

As mentioned above, you can visualize classification and regression trees generated using the rpart package.

Key function: visTree() [in visNetwork version >= 2.0.0].

For example, to visualize a classification tree, type the following R code:

# Compute
library(rpart)
res <- rpart(Species ~ ., data = iris)
# Visualize
visTree(res, main = "Iris classification Tree",
        width = "80%",  height = "400px")
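
The same function can be used for a regression tree. The sketch below fits a tree with a numeric response on the iris data; the choice of Petal.Length as the outcome is only for illustration.

```r
library(rpart)
library(visNetwork)

# Compute: a numeric response makes rpart fit a regression ("anova") tree
res_reg <- rpart(Petal.Length ~ ., data = iris)

# Visualize
vt <- visTree(res_reg, main = "Iris regression tree",
              width = "80%", height = "400px")
vt
```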

References

Allaire, J.J., Christopher Gandrud, Kenton Russell, and CJ Yetman. 2017. networkD3: D3 JavaScript Network Graphs from R. https://CRAN.R-project.org/package=networkD3.

Almende B.V., Benoit Thieurmel, and Titouan Robert. 2017. visNetwork: Network Visualization Using ’vis.js’ Library. https://CRAN.R-project.org/package=visNetwork.