sankeyD3 > sankeyNetwork implementation of 'NodePosX' - r

I have been using the sankeyD3 package to create SankeyNetworks and the 'NodePosX' feature isn't working for me yet. The 'NodePosX' feature is not in the 'networkD3' package but it is in the 'sankeyD3' package.
To illustrate the problem, I have edited akraemer007's example (posted here) to include the x positions of the nodes (see below), but it still does not give manual control over the x-position of the 'Opted-Out' node, which is what he originally wanted.
We're aiming for something like this, but without the small line from 'Opted-Out' to 'Activated':
library(devtools)
devtools::install_github("fbreitwieser/sankeyD3")
library(sankeyD3)
name <- c('Enrolled', 'Opted-Out', 'Invited', 'Activated')
xpos <- c(0, 1, 1, 2)
nodes <- data.frame(name, xpos)
source <- c(0, 0, 2, 1)
target <- c(1, 2, 3, 3)
value <- c(20, 80, 60, 0)
links <- data.frame(source, target, value)
sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
Target = "target", Value = "value", NodeID = "name",NodePosX = "xpos",
units = "TWh", fontSize = 12, nodeWidth = 30)

Assuming the last row in your links data frame is only there to force the plot to look the way you want, and is not part of the actual data you want to plot, you can achieve this with networkD3 by dropping that row and using the sinksRight = FALSE parameter.
library(networkD3)
name <- c('Enrolled', 'Opted-Out', 'Invited', 'Activated')
xpos <- c(0, 1, 1, 2)
nodes <- data.frame(name, xpos)
source <- c(0, 0, 2)
target <- c(1, 2, 3)
value <- c(20, 80, 60)
links <- data.frame(source, target, value)
sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
Target = "target", Value = "value", NodeID = "name",
units = "TWh", fontSize = 12, nodeWidth = 30, sinksRight = FALSE)
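If the links data frame contains more than one such placeholder row, they can be dropped in one step before plotting. This is only a minimal base R sketch, and it assumes that every zero-value link is a placeholder rather than real data; links_clean is a name made up for this illustration.

```r
# Links from the question, including the zero-value placeholder row
links <- data.frame(
  source = c(0, 0, 2, 1),
  target = c(1, 2, 3, 3),
  value  = c(20, 80, 60, 0)
)

# Assumption: links with a value of 0 exist only to influence the layout,
# so keep only the links that carry an actual flow
links_clean <- links[links$value > 0, ]
rownames(links_clean) <- NULL
```

links_clean can then be passed as the Links argument in the sankeyNetwork() call above.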

Related

Converting a class of sankeyNetwork into a grob

I am trying to make a panel plot of 3 Sankey diagrams in R using the networkD3 library. The Sankey diagrams are generated correctly and I have assigned the plots to three variables (p1, p2 and p3). Now I want to combine them into one single plot with ggarrange, using the code below:
ggarrange(p1, p2, p3, labels = c("A", "B", "C"), ncol = 1, nrow = 3)
But, I keep getting the following error.
Warning message:
In as_grob.default(plot) :
Cannot convert object of class sankeyNetworkhtmlwidget into a grob.
Apparently, ggarrange does not accept the format of my plots. How can I convert them into a format that ggarrange will accept?
Here is the code that I used to generate the first Sankey diagram (p1). The other two (p2, p3) are generated using the same code.
nodes = data.frame("name" = c("Rocket 1", "Kerosene","Carbon IV Oxide",
"Aluminium IV Oxide","Sulphur IV Oxide"))
links = as.data.frame(matrix(c(0, 1, 488.4, 1, 2, 576.7, 1, 3, 24.4,
1, 4, 0.3), byrow = TRUE, ncol = 3))
names(links) = c("source", "target", "value")
p1 = sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "name",
fontSize= 12, nodeWidth = 20)
I would really appreciate some help with this. Thank you.

Isolating a "branch" in a sankey diagram using networkd3

I am using sankeyNetwork() from the networkD3 package to visualize some data. I was wondering if there's a way to "isolate" a branch from start to finish, ignoring the irrelevant links.
Example: I've got this: SankeyGot
And I want to extract this: SankeyWant
reproducible example:
set.seed(9)
df <- tibble(
source = sample(stringr::words, 5) %>% rep(2),
target = c(sample(words, 7), source[1:3]),
values = rnorm(10, 10, 7) %>% round(0) %>% abs)
nodes <- data.frame(names = unique(c(df$source, df$target)))
links <- tibble(
source = match(
df$source, nodes$names) -1,
target = match(
df$target, nodes$names) -1,
value = df$values
)
sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
Target = "target", Value = "value", NodeID = "names",
iterations = 64, sinksRight = F, fontSize = 14)
I'd like to be able to filter on "name", for example, and get everything that connects to it on all levels upstream and downstream. How would I go about doing this?
Calculating the paths from a node in a graph is non-trivial, but the igraph package can help with its all_simple_paths() function. However, heed the warning in its help file...
Note that potentially there are exponentially many paths between two
vertices of a graph, and you may run out of memory when using this
function, if your graph is lattice-like.
(I don't know what your words vector is, so I recreated the links data.frame manually)
library(dplyr)
library(networkD3)
set.seed(9)
df <- read.csv(header = TRUE, text = "
source,target
summer,obvious
summer,structure
however,either
however,match
obvious,about
obvious,non
either,contract
either,produce
contract,paint
contract,name
")
df$values <- rnorm(10, 10, 7) %>% round(0) %>% abs()
# use graph to calculate the paths from a node
library(igraph)
graph <- graph_from_data_frame(df)
start_node <- "name"
# get nodes along a uni-directional path going IN to the start_node
connected_nodes_in <-
all_simple_paths(graph, from = start_node, mode = "in") %>%
unlist() %>%
names() %>%
unique()
# get nodes along a uni-directional path going OUT of the start_node
connected_nodes_out <-
all_simple_paths(graph, from = start_node, mode = "out") %>%
unlist() %>%
names() %>%
unique()
# combine them
connected_nodes <- unique(c(connected_nodes_in, connected_nodes_out))
# filter your data frame so it only includes links/edges that start and
# end at connected nodes
df <- df %>% filter(source %in% connected_nodes & target %in% connected_nodes)
nodes <- data.frame(names = unique(c(df$source, df$target)))
links <- tibble(
source = match(
df$source, nodes$names) -1,
target = match(
df$target, nodes$names) -1,
value = df$values
)
sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
Target = "target", Value = "value", NodeID = "names",
iterations = 64, sinksRight = F, fontSize = 14)
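If the exponential-paths warning is a concern, the set of connected nodes can also be found without enumerating paths. The sketch below swaps all_simple_paths() for a plain breadth-first search in base R; reachable() is a hypothetical helper written for this illustration, not an igraph or networkD3 function, and the edge list is copied from the data frame above.

```r
# Edge list copied from the answer above
edges <- data.frame(
  source = c("summer", "summer", "however", "however", "obvious",
             "obvious", "either", "either", "contract", "contract"),
  target = c("obvious", "structure", "either", "match", "about",
             "non", "contract", "produce", "paint", "name"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: breadth-first search that follows edges from
# `from_col` to `to_col`, returning every node reached (including `start`)
reachable <- function(edges, start, from_col, to_col) {
  visited <- character(0)
  frontier <- start
  while (length(frontier) > 0) {
    visited <- union(visited, frontier)
    next_nodes <- edges[[to_col]][edges[[from_col]] %in% frontier]
    frontier <- setdiff(unique(next_nodes), visited)
  }
  visited
}

start_node <- "name"
# follow edges forwards (downstream) and backwards (upstream)
downstream <- reachable(edges, start_node, "source", "target")
upstream   <- reachable(edges, start_node, "target", "source")
connected_nodes <- union(upstream, downstream)

# keep only the links that start and end at connected nodes
branch <- edges[edges$source %in% connected_nodes &
                  edges$target %in% connected_nodes, ]
```

Each node is visited at most once per direction, so the cost grows with the number of edges rather than the number of paths. The branch data frame can then go through the same match() step as above to build the zero-indexed links.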
If you assign the result of sankeyNetwork() to an object, you can use str(object) to see that it is a list with an element called x, which holds your input data (the links and nodes data frames).
list_sankey <- sankeyNetwork(Links = links, Nodes = nodes, Source = "source", Target = "target", Value = "value", NodeID = "names", iterations = 64, sinksRight = F, fontSize = 14)
str(list_sankey)
You can then filter x$links so that it only contains your desired source and target nodes:
list_sankey_filter <- list_sankey
list_sankey_filter$x$links <- list_sankey_filter$x$links %>% filter(source %in% c(4, 2, 0), target %in% c(4, 2, 0, 10))
Printing list_sankey_filter then renders the filtered diagram.

labeling values over the link and nodes in Sankey Diagram with networkD3 [duplicate]

Background
I was trying to create a Sankey graph like the following figure. Specifically, I wanted an output where the values (10, 20, 30, 40) are displayed on the paths (from one node to another node).
How Did I Try?
At first, I tried using the Plotly library for Python. However, I read somewhere that it is not possible to display the values on the links or paths of a Sankey graph in Plotly (for Python). Later, I switched to R (for other reasons as well), where I think more resources are available. However, I am facing the same problem here. I have checked many R tutorials (e.g., this one) and Q&As on SO (e.g., 1, 2, 3). Still, I could not find a tutorial or resource where the values are displayed on the paths!
My Question
How can I display the values on the links/paths of Sankey Graph, in R?
Note: This and this question on SO seem to be similar. However, I failed to understand how to incorporate those answers into my code.
Example Code (collected from here)
# install.packages('networkD3')
library(networkD3)
nodes = data.frame("name" =
c("Node A", # Node 0
"Node B", # Node 1
"Node C", # Node 2
"Node D"))# Node 3
links = as.data.frame(matrix(c(
0, 1, 10, # Each row represents a link. The first number
0, 2, 20, # represents the node being connected from.
1, 3, 30, # The second number represents the node connected to.
2, 3, 40),# The third number is the value of the link.
byrow = TRUE, ncol = 3))
names(links) = c("source", "target", "value")
sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "name",
fontSize= 50, nodeWidth = 30)
This can be achieved by injecting custom JavaScript code when it's rendered using htmlwidgets::onRender(). The example below will initially position the link labels appropriately, but if the nodes are manually moved, the link labels will not automatically update accordingly. To achieve that, you would probably have to also override the default dragmove behaviour.
library(htmlwidgets)
library(networkD3)
nodes <-
data.frame(
name = c("Node A", "Node B", "Node C", "Node D")
)
links <-
data.frame(
source = c(0, 0, 1, 2),
target = c(1, 2, 3, 3),
value = c(10, 20, 30, 40)
)
p <- sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "name",
fontSize= 20, nodeWidth = 30)
htmlwidgets::onRender(p, '
function(el) {
var nodeWidth = this.sankey.nodeWidth();
var links = this.sankey.links();
links.forEach((d, i) => {
var startX = d.source.x + nodeWidth;
var endX = d.target.x;
var startY = d.source.y + d.sy + d.dy / 2;
var endY = d.target.y + d.ty + d.dy / 2;
d3.select(el).select("svg g")
.append("text")
.attr("text-anchor", "middle")
.attr("alignment-baseline", "middle")
.attr("x", startX + ((endX - startX) / 2))
.attr("y", startY + ((endY - startY) / 2))
.text(d.value);
})
}
')

Interactivity in Sankey chart in R using networkD3

I wish to implement an onClick handler on this Sankey diagram so that clicking on a link shows the details of the link between the two nodes, similar to the plotly_click event in plotly.
library(networkD3)
nodes = data.frame("name" =
c("r1", # Node 0
"r2", # Node 1
"r3", # Node 2
"r4", # Node 3
"r5", # Node 4
"r6", # Node 5
"r7", # Node 6
"Blood Test", # Node 7
"Check Out", # Node 8
"Discuss Results", # Node 9
"MRI Scan", # Node 10
"Registration", # Node 11
"Triage and Assessment", # Node 12
"X-ray"))# Node 13
links = as.data.frame(matrix(c(
0, 11, 500, # Each row represents a link. The first number
1, 12, 500, # represents the node being connected from.
2, 7, 237, # The second number represents the node connected to.
3, 10, 236,
4, 13, 261,
5, 9, 495,
6, 8, 492),# The third number is the value of the link.
byrow = TRUE, ncol = 3))
names(links) = c("source", "target", "value")
sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "name",
fontSize= 12, nodeWidth = 30)
You can add click events using htmlwidgets::onRender function. It's not clear what details you want to see, but this, for example, will show a link's value in an alert box when you click it...
library(htmlwidgets)
sn <- sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "name",
fontSize = 12, nodeWidth = 30)
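# onRender() expects a JavaScript function expression, so wrap the handler in one
clickJS <- 'function(el, x) { d3.select(el).selectAll(".link").on("click", function(d) { alert(d.value); }); }'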
htmlwidgets::onRender(sn, clickJS)
Here is an interesting alternative based on the parsetR package:
devtools::install_github("timelyportfolio/parsetR")
library(parsetR)
links$source <- as.character(factor(links$source, labels=nodes[1:7,1]))
links$target <- as.character(factor(links$target, labels=nodes[8:14,1]))
parset(links, dimensions = c('source', 'target'),
value = htmlwidgets::JS("function(d) {return d.value}"),
tension = 0.5)

Sankey Diagram with networkD3 package will not plot

I am using the sankeyNetwork function in the networkD3 package in R using as an example the code found here. However, all I get is a blank screen. The diagram is supposed to show the flow of infections between age groups (by gender). My code is as below:
library(RCurl)
library(networkD3)
edges <- read.csv(curl("https://raw.githubusercontent.com/kilimba/data/master/infection_flows.csv"),stringsAsFactors = FALSE )
nodes = data.frame(ID = unique(c(edges$Source, edges$Target)))
nodes$indx =0
for (i in 1:nrow(nodes)){
nodes[i,]["indx"] = i - 1
}
edges2 <- merge(edges,nodes,by.x = "Source",by.y = "ID")
edges2$Source <-NULL
names(edges2) <- c("target","value","source")
edges2 <- merge(edges2,nodes,by.x = "target",by.y = "ID")
edges2$target <- NULL
names(edges2) <- c("value","source","target")
nodes$indx <- NULL
# Plot
sankeyNetwork(Links = edges2, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "ID",
width = 700, fontsize = 12, nodeWidth = 30)
Are you sure there are no errors printed in your R console?
This works for me with two small modifications:
Load the curl package as well at the beginning
library("curl")
The fontsize parameter apparently does not work and should be removed.
# Plot
sankeyNetwork(Links = edges2, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "ID",
width = 700, #fontsize = 12,
nodeWidth = 30)
Adjusting fontsize does work, but your argument is missing a capitalization: fontSize
sankeyNetwork(Links = edges2, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "ID",
width = 700, fontSize = 12,
nodeWidth = 30)
You do not need RCurl; read.csv is able to read directly from a URL.
It's probably safer to use the stringsAsFactors = FALSE option when creating the nodes data.frame.
As others have pointed out, you must make sure that the source and target variables in the links data are numeric, and that they are zero-indexed.
As others have pointed out, the font size parameter is properly named fontSize.
I have provided a more direct way of creating the links data with numeric indexes of the nodes in the nodes data.frame:
library(networkD3)
edges <- read.csv("https://raw.githubusercontent.com/kilimba/data/master/infection_flows.csv",stringsAsFactors = FALSE)
nodes = data.frame(ID = unique(c(edges$Source, edges$Target)), stringsAsFactors = FALSE)
edges$Source <- match(edges$Source, nodes$ID) - 1
edges$Target <- match(edges$Target, nodes$ID) - 1
sankeyNetwork(Links = edges, Nodes = nodes,
Source = "Source", Target = "Target",
Value = "Value", NodeID = "ID",
width = 700, fontSize = 12, nodeWidth = 30)
I solved this in my case by making sure that source, target and value were all numeric.
For example:
Energy$links$value <- as.numeric(Energy$links$value)
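The checks mentioned in these answers (numeric columns, zero-based node indices) can be run before plotting. The following is a rough base R sketch; check_sankey_links() is a hypothetical helper made up for this illustration and is not part of networkD3, and its default column names are assumptions.

```r
# Hypothetical helper, not part of networkD3: returns a character vector
# describing the problems found in a links/nodes pair (empty if none found)
check_sankey_links <- function(links, nodes, source = "source",
                               target = "target", value = "value") {
  problems <- character(0)
  # a missing column also ends up here, because is.numeric(NULL) is FALSE
  for (col in c(source, target, value)) {
    if (!is.numeric(links[[col]])) {
      problems <- c(problems, paste0("column '", col, "' is not numeric"))
    }
  }
  # only compare indices once all columns are known to be numeric
  if (length(problems) == 0) {
    idx <- c(links[[source]], links[[target]])
    if (min(idx) < 0 || max(idx) > nrow(nodes) - 1) {
      problems <- c(problems,
                    "node indices fall outside 0 .. nrow(nodes) - 1")
    }
  }
  problems
}

# Made-up example data
nodes <- data.frame(name = c("A", "B", "C"))
good_links      <- data.frame(source = c(0, 1), target = c(1, 2), value = c(5, 3))
one_based_links <- data.frame(source = c(1, 2), target = c(2, 3), value = c(5, 3))

good_result      <- check_sankey_links(good_links, nodes)
one_based_result <- check_sankey_links(one_based_links, nodes)
```

Note that the range check only catches one-based indexing when a link points at the last node; links that are off by one but still inside the valid range will pass unnoticed.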

Resources