Sankey Diagram with networkD3 package will not plot - r

I am using the sankeyNetwork function from the networkD3 package in R, following the example code found here. However, all I get is a blank screen. The diagram is supposed to show the flow of infections between age groups (by gender). My code is below:
library(RCurl)
library(networkD3)
edges <- read.csv(curl("https://raw.githubusercontent.com/kilimba/data/master/infection_flows.csv"),stringsAsFactors = FALSE )
nodes = data.frame(ID = unique(c(edges$Source, edges$Target)))
nodes$indx =0
for (i in 1:nrow(nodes)){
nodes[i,]["indx"] = i - 1
}
edges2 <- merge(edges,nodes,by.x = "Source",by.y = "ID")
edges2$Source <-NULL
names(edges2) <- c("target","value","source")
edges2 <- merge(edges2,nodes,by.x = "target",by.y = "ID")
edges2$target <- NULL
names(edges2) <- c("value","source","target")
nodes$indx <- NULL
# Plot
sankeyNetwork(Links = edges2, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "ID",
width = 700, fontsize = 12, nodeWidth = 30)

Are you sure there are no errors printed in your R console?
This works for me with two small modifications:
Load the curl package as well at the beginning
library("curl")
The fontsize parameter apparently does not work and should be removed.
# Plot
sankeyNetwork(Links = edges2, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "ID",
width = 700, #fontsize = 12,
nodeWidth = 30)

Adjusting the font size does work, but the argument name needs a capital S: fontSize
sankeyNetwork(Links = edges2, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "ID",
width = 700, fontSize = 12,
nodeWidth = 30)

You do not need RCurl; read.csv can read directly from a URL.
It is probably safer to use the stringsAsFactors = FALSE option when creating the nodes data.frame.
As others have pointed out, the source and target variables in the links data must be numeric, and they must be zero-indexed.
As others have pointed out, the font size parameter is properly named fontSize.
The code below shows a more direct way of creating the links data with numeric indexes of the nodes in the nodes data.frame:
library(networkD3)
edges <- read.csv("https://raw.githubusercontent.com/kilimba/data/master/infection_flows.csv",stringsAsFactors = FALSE)
nodes = data.frame(ID = unique(c(edges$Source, edges$Target)), stringsAsFactors = FALSE)
edges$Source <- match(edges$Source, nodes$ID) - 1
edges$Target <- match(edges$Target, nodes$ID) - 1
sankeyNetwork(Links = edges, Nodes = nodes,
Source = "Source", Target = "Target",
Value = "Value", NodeID = "ID",
width = 700, fontSize = 12, nodeWidth = 30)
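To make the "numeric and zero-indexed" requirement concrete, here is a minimal base R sketch. The toy edge list and the helper name links_look_valid are made up for illustration (they are not part of networkD3), and the checks are only basic sanity checks, not a guarantee that the plot will render.

```r
# Toy edge list standing in for the CSV in the question (made-up values)
edges <- data.frame(
  Source = c("F.15-19", "F.15-19", "M.20-24"),
  Target = c("M.20-24", "M.25-29", "F.20-24"),
  Value  = c(5, 3, 2),
  stringsAsFactors = FALSE
)

nodes <- data.frame(ID = unique(c(edges$Source, edges$Target)),
                    stringsAsFactors = FALSE)

# Convert node names to zero-based row positions in `nodes`
links <- data.frame(
  source = match(edges$Source, nodes$ID) - 1,
  target = match(edges$Target, nodes$ID) - 1,
  value  = as.numeric(edges$Value)
)

# Hypothetical helper: basic sanity checks on the links data
links_look_valid <- function(links, nodes) {
  idx <- c(links$source, links$target)
  is.numeric(idx) && is.numeric(links$value) &&
    !anyNA(idx) && min(idx) >= 0 && max(idx) <= nrow(nodes) - 1
}

links_look_valid(links, nodes)
```

If the helper returns FALSE, the usual suspects are character or factor columns, one-based indexes, or node names in the links that are missing from the nodes data.frame (which show up as NA after match).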

I solved this by making sure that the source, target and value columns were all numeric.
For example:
Energy$links$value <- as.numeric(Energy$links$value)

Related

Isolating a "branch" in a sankey diagram using networkd3

I am using sankeyNetwork() from the networkD3 package for visualizing some data. I was wondering if there's a way to "isolate" a branch from start to finish, ignoring the irrelevant links.
Example: I've got this: SankeyGot
And I want to extract this: SankeyWant
reproducible example:
set.seed(9)
df <- tibble(
source = sample(stringr::words, 5) %>% rep(2),
target = c(sample(words, 7), source[1:3]),
values = rnorm(10, 10, 7) %>% round(0) %>% abs)
nodes <- data.frame(names = unique(c(df$source, df$target)))
links <- tibble(
source = match(
df$source, nodes$names) -1,
target = match(
df$target, nodes$names) -1,
value = df$values
)
sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
Target = "target", Value = "value", NodeID = "names",
iterations = 64, sinksRight = F, fontSize = 14)
I'd like to be able to pick a node, "name" for example, and get everything that connects to it at all levels upstream and downstream. How would I go about doing this?
Calculating the paths from a node in a graph is non-trivial, but the igraph package can help with its all_simple_paths() function. However, heed this warning in the help file:
Note that potentially there are exponentially many paths between two
vertices of a graph, and you may run out of memory when using this
function, if your graph is lattice-like.
(I don't know what your words vector is, so I recreated the links data.frame manually)
library(dplyr)
library(networkD3)
set.seed(9)
df <- read.csv(header = TRUE, text = "
source,target
summer,obvious
summer,structure
however,either
however,match
obvious,about
obvious,non
either,contract
either,produce
contract,paint
contract,name
")
df$values <- rnorm(10, 10, 7) %>% round(0) %>% abs()
# use graph to calculate the paths from a node
library(igraph)
graph <- graph_from_data_frame(df)
start_node <- "name"
# get nodes along a uni-directional path going IN to the start_node
connected_nodes_in <-
all_simple_paths(graph, from = start_node, mode = "in") %>%
unlist() %>%
names() %>%
unique()
# get nodes along a uni-directional path going OUT of the start_node
connected_nodes_out <-
all_simple_paths(graph, from = start_node, mode = "out") %>%
unlist() %>%
names() %>%
unique()
# combine them
connected_nodes <- unique(c(connected_nodes_in, connected_nodes_out))
# filter your data frame so it only includes links/edges that start and
# end at connected nodes
df <- df %>% filter(source %in% connected_nodes & target %in% connected_nodes)
nodes <- data.frame(names = unique(c(df$source, df$target)))
links <- tibble(
source = match(
df$source, nodes$names) -1,
target = match(
df$target, nodes$names) -1,
value = df$values
)
sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
Target = "target", Value = "value", NodeID = "names",
iterations = 64, sinksRight = F, fontSize = 14)
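Since only the set of connected nodes is needed here, not the individual paths, one possible way to sidestep the memory warning is igraph's subcomponent(), which returns the vertices reachable from a start vertex without enumerating every path. This is a sketch of an alternative under that assumption, not the method used in the answer above; the name df_branch is made up for illustration.

```r
library(igraph)

# Same edge list as in the answer above
df <- read.csv(header = TRUE, stringsAsFactors = FALSE, text = "
source,target
summer,obvious
summer,structure
however,either
however,match
obvious,about
obvious,non
either,contract
either,produce
contract,paint
contract,name
")

graph <- graph_from_data_frame(df)
start_node <- "name"

# subcomponent() returns every vertex reachable from start_node
# (including start_node itself) in the given direction
upstream   <- names(subcomponent(graph, start_node, mode = "in"))
downstream <- names(subcomponent(graph, start_node, mode = "out"))
connected_nodes <- unique(c(upstream, downstream))

# keep only the links that start and end at connected nodes
df_branch <- df[df$source %in% connected_nodes &
                  df$target %in% connected_nodes, ]
df_branch
```

The filtered df_branch can then be turned into nodes and links with match() exactly as in the code above.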
If you assign the result of sankeyNetwork() to an object, you can use str(object) to see that it is a list with an element called x, which holds your input data (the links and nodes data frames).
list_sankey <- sankeyNetwork(Links = links, Nodes = nodes, Source = "source", Target = "target", Value = "value", NodeID = "names", iterations = 64, sinksRight = F, fontSize = 14)
str(list_sankey)
You can then filter the links data frame in x so that it only contains your desired source and target nodes:
list_sankey_filter <- list_sankey
list_sankey_filter$x$links <- list_sankey_filter$x$links %>% filter(source %in% c(4, 2, 0), target %in% c(4, 2, 0, 10))
This then gives you the object below.

sankeyD3 > sankeyNetwork implementation of 'NodePosX'

I have been using the sankeyD3 package to create SankeyNetworks and the 'NodePosX' feature isn't working for me yet. The 'NodePosX' feature is not in the 'networkD3' package but it is in the 'sankeyD3' package.
To help illustrate the problem that I am having, I have edited the example from akraemer007 that was posted here to include the X positions of the nodes (see below) but it's still not working in the way that he had originally wanted, with manual control over the x-position of the 'Opted-Out' node.
We're aiming for something like this, but without the small line from 'Opted-Out' to 'Activated':
library(devtools)
devtools::install_github("fbreitwieser/sankeyD3")
library(sankeyD3)
name <- c('Enrolled', 'Opted-Out', 'Invited', 'Activated')
xpos <- c(0, 1, 1, 2)
nodes <- data.frame(name, xpos)
source <- c(0, 0, 2, 1)
target <- c(1, 2, 3, 3)
value <- c(20, 80, 60, 0)
links <- data.frame(source, target, value)
sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
Target = "target", Value = "value", NodeID = "name",NodePosX = "xpos",
units = "TWh", fontSize = 12, nodeWidth = 30)
Assuming the last row in your links data frame is only there to force the plot to look the way you want and is not part of the actual data you want to plot, you can achieve this with networkD3 using the sinksRight = FALSE parameter.
library(networkD3)
name <- c('Enrolled', 'Opted-Out', 'Invited', 'Activated')
xpos <- c(0, 1, 1, 2)
nodes <- data.frame(name, xpos)
source <- c(0, 0, 2)
target <- c(1, 2, 3)
value <- c(20, 80, 60)
links <- data.frame(source, target, value)
sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
Target = "target", Value = "value", NodeID = "name",
units = "TWh", fontSize = 12, nodeWidth = 30, sinksRight = FALSE)

Is it possible to show node id always with forcenetwork? (R)

I am using the forceNetwork function from the networkD3 package and can't get the node ID name to always show. I can see it only when I hover the mouse over a node. Is there any way to make it always show?
I am using this code from the user guide:
# Load data
data(MisLinks)
data(MisNodes)
# Plot
forceNetwork(Links = MisLinks, Nodes = MisNodes,
Source = "source", Target = "target",
Value = "value", NodeID = "name",
Group = "group", opacity = 0.8)
Edit: it works really well with @CJ Yetman's answer.
One more question...
Is it possible to put weights on the edges and show them?
Use the opacityNoHover = 1 argument:
library(networkD3)
data(MisLinks)
data(MisNodes)
# Plot
forceNetwork(Links = MisLinks, Nodes = MisNodes,
Source = "source", Target = "target",
Value = "value", NodeID = "name",
Group = "group", opacity = 0.8,
opacityNoHover = 1)
The Value column in your links data frame determines the weight.
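As a rough illustration of how the Value column feeds into the drawing, the sketch below passes a custom linkWidth function to forceNetwork(). The scaling d.value / 2 is an arbitrary choice for demonstration. Note that this only changes how thick each edge is drawn; it does not print the numbers on the edges.

```r
library(networkD3)
data(MisLinks)
data(MisNodes)

fn <- forceNetwork(Links = MisLinks, Nodes = MisNodes,
                   Source = "source", Target = "target",
                   Value = "value", NodeID = "name",
                   Group = "group", opacity = 0.8,
                   opacityNoHover = 1,
                   # edge width in pixels, computed from the Value column
                   linkWidth = JS("function(d) { return d.value / 2; }"))
fn
```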

Saving networkD3 Sankey diagram using code only

I've created a Sankey diagram in R, using the networkD3 package, that I'd like to save as a static image, using code instead of clicking on 'Export' --> 'Save as Image...'.
The current code I've tried (using this Sankey diagram as an example) is:
library(networkD3)
URL <- paste0(
"https://cdn.rawgit.com/christophergandrud/networkD3/",
"master/JSONdata/energy.json")
Energy <- jsonlite::fromJSON(URL)
# Plot
jpeg( filename = "Sankey.jpg", width = 4000, height = 4000)
sankeyNetwork(Links = Energy$links, Nodes = Energy$nodes, Source = "source",
Target = "target", Value = "value", NodeID = "name",
units = "TWh", fontSize = 12, nodeWidth = 30)
dev.off()
All I'm getting, though, is a blank white box when I open the image.
Simplest working solution I've found so far is:
Install PhantomJS, for example using Homebrew on OS X: brew install phantomjs
Install rbokeh - install.packages("rbokeh")
Then:
library(rbokeh)
sn <- sankeyNetwork(Links = Energy$links, Nodes = Energy$nodes, Source = "source",
Target = "target", Value = "value", NodeID = "name",
units = "TWh", fontSize = 12, nodeWidth = 30)
widget2png(sn, "sankey.png")
The result doesn't look great, but this might serve as a starting point for research and improvements.
EDIT: here's another potential solution using the webshot package.
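A minimal sketch of what the webshot route might look like: sankeyNetwork() returns an htmlwidget rather than drawing on an R graphics device (which is why jpeg() and dev.off() capture nothing), so the widget is first saved as HTML and then screenshotted. The helper name save_widget_png and the delay value are assumptions for illustration; webshot needs PhantomJS to be installed.

```r
# Hypothetical helper: save an htmlwidget to HTML, then screenshot the page
save_widget_png <- function(widget, png_file, delay = 1) {
  html_file <- tempfile(fileext = ".html")
  # selfcontained = FALSE avoids the pandoc dependency
  htmlwidgets::saveWidget(widget, html_file, selfcontained = FALSE)
  # delay gives the D3 layout some time to finish before the capture
  webshot::webshot(html_file, file = png_file, delay = delay)
  invisible(png_file)
}

# Usage (requires PhantomJS, e.g. via webshot::install_phantomjs()):
# sn <- sankeyNetwork(Links = Energy$links, Nodes = Energy$nodes,
#                     Source = "source", Target = "target",
#                     Value = "value", NodeID = "name")
# save_widget_png(sn, "sankey.png")
```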

linking a node in networkD3 to a website using clickAction = NULL

Is there a way to use a node as a link to an external website using the function forceNetwork() in the networkD3 package in r? I was thinking maybe modifying the clickAction?
Example data:
library(networkD3)
data(MisLinks)
data(MisNodes)
# Create a random URL in the nodes dataset
MisNodes$URL <- paste0("http://www.RANDOMLINK_", sample(1:100, NROW(MisNodes)), ".com")
head(MisNodes)
MyClickScript <- 'alert(d.index)'
forceNetwork(Links = MisLinks, Nodes = MisNodes,
Source = "source", Target = "target",
Value = "value", NodeID = "name",
Group = "group", opacity = 0.8,
clickAction = MyClickScript)
Desired outcome: When a user clicks on a node, a new tab will open (e.g. window.open) pointing to the associated URL for the node - How can I get clickAction to point to MisNodes$URL[d.index]?
The design of networkD3 does not make this easy. Here is one way to do it. I'll comment inline to explain what we are doing in each step.
library(networkD3)
# example from ?forceNetwork
data(MisLinks)
data(MisNodes)
# Create graph
fn <- forceNetwork(
Links = MisLinks, Nodes = MisNodes, Source = "source",
Target = "target", Value = "value", NodeID = "name",
Group = "group", opacity = 0.4, zoom = TRUE
)
# let's look at our forceNetwork
# nodes are provided to JavaScript
# in a nodes data.frame
str(fn$x$nodes)
# make up some links to demonstrate
# how we can add them to our nodes df
fn$x$nodes$hyperlink <- paste0(
'http://en.wikipedia.org/wiki/Special:Search?search=',
MisNodes$name
)
# then with our hyperlinks in our data
# we can define a click action to open
# the hyperlink for each node in a new window
fn$x$options$clickAction = 'window.open(d.hyperlink)'
fn

Resources