Interactivity in Sankey chart in R using networkD3

I want to add an onClick handler to this Sankey diagram so that clicking a link shows the details of the link between the two nodes, similar to plotly_click in plotly.
library(networkD3)
nodes = data.frame("name" =
c("r1", # Node 0
"r2", # Node 1
"r3", # Node 2
"r4", # Node 3
"r5", # Node 4
"r6", # Node 5
"r7", # Node 6
"Blood Test", # Node 7
"Check Out", # Node 8
"Discuss Results", # Node 9
"MRI Scan", # Node 10
"Registration", # Node 11
"Triage and Assessment", # Node 12
"X-ray"))# Node 13
links = as.data.frame(matrix(c(
0, 11, 500, # Each row represents a link. The first number
1, 12, 500, # is the node the link comes from (source),
2, 7, 237, # the second number is the node it goes to (target).
3, 10, 236,
4, 13, 261,
5, 9, 495,
6, 8, 492),# The third number is the value of the link
byrow = TRUE, ncol = 3))
names(links) = c("source", "target", "value")
sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "name",
fontSize= 12, nodeWidth = 30)

You can add click events using the htmlwidgets::onRender() function. It's not clear what details you want to see, but this example shows a link's value in an alert box when you click it:
library(htmlwidgets)
sn <- sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "name",
fontSize = 12, nodeWidth = 30)
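# onRender() expects a JavaScript function; el is the widget's container element
clickJS <- 'function(el, x) { d3.select(el).selectAll(".link").on("click", function(d) { alert(d.value); }); }'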
htmlwidgets::onRender(sn, clickJS)
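If the goal is to show which two nodes a link connects, the handler can read more than the value. The following is a minimal sketch, assuming the link data bound by sankeyNetwork exposes source.name and target.name (the fields its default hover tooltip appears to use). The variable names link_click_js and sn_clickable are introduced here for illustration only, and the data is a cut-down version of the question's.

```r
# JavaScript handler: on click, report source, target and value of the link.
# Assumption: each link datum carries source.name and target.name.
link_click_js <- '
function(el, x) {
  d3.select(el).selectAll(".link").on("click", function(d) {
    alert(d.source.name + " -> " + d.target.name + ": " + d.value);
  });
}'

# Small example data (subset of the nodes and links in the question)
nodes <- data.frame(name = c("r1", "r2", "Registration", "Triage and Assessment"))
links <- data.frame(source = c(0, 1), target = c(2, 3), value = c(500, 500))

# Build the widget only if the packages are installed
if (requireNamespace("networkD3", quietly = TRUE) &&
    requireNamespace("htmlwidgets", quietly = TRUE)) {
  sn <- networkD3::sankeyNetwork(Links = links, Nodes = nodes,
                                 Source = "source", Target = "target",
                                 Value = "value", NodeID = "name",
                                 fontSize = 12, nodeWidth = 30)
  sn_clickable <- htmlwidgets::onRender(sn, link_click_js)
}
```

Evaluating sn_clickable at the console should display the diagram; an alert box is only one option, and the same handler could write the text into another element on the page instead.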

Here is an alternative solution based on the parsetR package:
devtools::install_github("timelyportfolio/parsetR")
library(parsetR)
links$source <- as.character(factor(links$source, labels=nodes[1:7,1]))
links$target <- as.character(factor(links$target, labels=nodes[8:14,1]))
parset(links, dimensions = c('source', 'target'),
value = htmlwidgets::JS("function(d) {return d.value}"),
tension = 0.5)

Related

Is there a way to visualize a partially directed graph using R forceNetwork() function?

I am currently working with the R forceNetwork function of the networkD3 package and I have properly validated the correctness of the Nodes and Links data frames for my graph.
My Nodes data frame (node_df) is like this:
node_id node_type node_size
0 T054717 irrelevant 10
1 T095006 irrelevant 10
2 T088658 irrelevant 10
3 T069179 irrelevant 10
4 T009515 irrelevant 10
5 T152167 irrelevant 10
6 T100447 irrelevant 10
7 T150659 irrelevant 10
...
and My Links dataframe (links_df) is like this:
tid1 tid2 edge_dir
0 37 36 10
1 37 0 10
2 37 1 10
3 37 3 10
...
147 34 35 5
148 7 47 5
149 34 47 5
150 35 47 5
151 36 48 5
152 1 48 5
I set the forceNetwork function like this:
network <- forceNetwork (Links = links_df,
Nodes = node_df,
Source = "tid1",
Target = "tid2",
Value = "edge_dir",
NodeID = "node_id",
Nodesize = "node_size",
Group = "node_type",
colourScale = JS("d3.scaleOrdinal(d3.schemeCategory20);"),
fontSize = 10,
linkDistance = 100,
radiusCalculation = JS(" Math.sqrt(d.nodesize)+6"),
charge = -30,
linkColour = ifelse(links_df$edge_dir == 10, "black","red"),
opacity = 1,
zoom = TRUE,
arrows = ifelse(links_df$edge_dir == 10, TRUE, FALSE),
opacityNoHover = TRUE,
clickAction = NULL)
I am struggling with the arrows parameter of the function. In fact I would like to specify if there should be a directed edge (arrows = TRUE) or an undirected edge (arrows = FALSE) for each link, by checking the Value parameter.
In my case Value refers to a column of the Links data frame named edge_dir, which specifies if the edge should be directed (edge_dir = 10) or not (edge_dir = 5).
After reading the Stack Overflow question on specifying colors for each link in a force-directed network (networkD3), I tried setting the parameter like this:
arrows = ifelse(links_df$edge_dir == 10, TRUE, FALSE)
but the output graph has arrows where there should be undirected lines.
Using the same structure in the linkColour parameter, linkColour = ifelse(links_df$edge_dir == 10, "black","red"), works fine: directed edges are black, and the edges that should be undirected (but are still drawn with arrows) are red.
Is it possible to display a graph which has directed and undirected edges by modifying the arrows parameter?
Thank you!
This is an imperfect solution, but it's easier than making substantial modifications to the underlying JavaScript yourself. You can use htmlwidgets::onRender() to inject some JavaScript to run just after the plot is generated. Here's an example of how you could turn off the arrows for only those links that have a value equal to 5...
library(tibble)
library(networkD3)
library(htmlwidgets)
node_df <- tibble::tribble(
~node_id, ~node_type, ~node_size, ~directed,
"T054717", "irrelevant", 10, TRUE,
"T095006", "irrelevant", 10, FALSE,
"T088658", "irrelevant", 10, TRUE,
"T069179", "irrelevant", 10, FALSE,
"T009515", "irrelevant", 10, TRUE,
"T152167", "irrelevant", 10, FALSE,
"T100447", "irrelevant", 10, TRUE,
"T150659", "irrelevant", 10, FALSE
)
links_df <- tibble::tribble(
~tid1, ~tid2, ~edge_dir,
0, 1, 10,
0, 2, 10,
0, 3, 10,
1, 3, 10,
2, 4, 5,
2, 5, 5,
3, 6, 5,
4, 5, 5,
5, 6, 5,
5, 6, 5
)
network <- forceNetwork (Links = links_df,
Nodes = node_df,
Source = "tid1",
Target = "tid2",
Value = "edge_dir",
NodeID = "node_id",
Nodesize = "node_size",
Group = "node_type",
colourScale = JS("d3.scaleOrdinal(d3.schemeCategory20);"),
fontSize = 10,
linkDistance = 100,
radiusCalculation = JS(" Math.sqrt(d.nodesize)+6"),
charge = -30,
linkColour = ifelse(links_df$edge_dir == 10, "black","red"),
opacity = 1,
zoom = TRUE,
arrows = TRUE, # a single logical: draw arrows on all links, then remove some below
opacityNoHover = TRUE,
clickAction = NULL)
htmlwidgets::onRender(network, '
function(el) {
d3.select("svg")
.selectAll(".link")
.filter(function(d) { return d.value == 5; })
.style("marker-end", null);
}')

Labeling values over the links and nodes in a Sankey diagram with networkD3 [duplicate]

Background
I was trying to create a Sankey graph like the one in the following figure. Specifically, I wanted the values (10, 20, 30, 40) to be displayed on the paths (from one node to another).
What Have I Tried?
At first, I tried using the Plotly library in Python. However, I read somewhere that Plotly (in Python) cannot display values on the links or paths of a Sankey graph. Later, I switched to R (partly for other reasons), where I think more resources are available. However, I am facing the same problem here. I have checked many R tutorials and several Q&As on SO, but I still could not find a tutorial or resource where the values are displayed on the paths!
My Question
How can I display the values on the links/paths of a Sankey graph in R?
Note: Two other SO questions seem similar. However, I could not work out how to incorporate their answers into my code.
Example Code (collected from here)
# install.packages('networkD3')
library(networkD3)
nodes = data.frame("name" =
c("Node A", # Node 0
"Node B", # Node 1
"Node C", # Node 2
"Node D"))# Node 3
links = as.data.frame(matrix(c(
0, 1, 10, # Each row represents a link. The first number
0, 2, 20, # is the node the link comes from (source),
1, 3, 30, # the second number is the node it goes to (target).
2, 3, 40),# The third number is the value of the link
byrow = TRUE, ncol = 3))
names(links) = c("source", "target", "value")
sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "name",
fontSize= 50, nodeWidth = 30)
This can be achieved by injecting custom JavaScript code when it's rendered using htmlwidgets::onRender(). The example below will initially position the link labels appropriately, but if the nodes are manually moved, the link labels will not automatically update accordingly. To achieve that, you would probably have to also override the default dragmove behaviour.
library(htmlwidgets)
library(networkD3)
nodes <-
data.frame(
name = c("Node A", "Node B", "Node C", "Node D")
)
links <-
data.frame(
source = c(0, 0, 1, 2),
target = c(1, 2, 3, 3),
value = c(10, 20, 30, 40)
)
p <- sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "name",
fontSize= 20, nodeWidth = 30)
htmlwidgets::onRender(p, '
function(el) {
var nodeWidth = this.sankey.nodeWidth();
var links = this.sankey.links();
links.forEach((d, i) => {
var startX = d.source.x + nodeWidth;
var endX = d.target.x;
var startY = d.source.y + d.sy + d.dy / 2;
var endY = d.target.y + d.ty + d.dy / 2;
d3.select(el).select("svg g")
.append("text")
.attr("text-anchor", "middle")
.attr("alignment-baseline", "middle")
.attr("x", startX + ((endX - startX) / 2))
.attr("y", startY + ((endY - startY) / 2))
.text(d.value);
})
}
')

Thin gray lines as links plotting sankeyNetwork in R

Disclaimer: I'm an R newbie, so I may be overlooking something really obvious here...
I am currently working on a sankeyNetwork diagram using R, and I am facing a problem that almost seems to be a bug, but I'm completely clueless...
I've googled extensively, and haven't been able to find anybody else reporting the same...
The problem is that in my code I currently have 7 nodes, and 5 links. When I plot the diagram, everything works fine:
Plot 1, everything working fine
This is the code for Plot 1:
library(networkD3)
# List of nodes (portfolios & targets)
nodes = data.frame("trialnodes" =
c("portfolio1", # 0
"portfolio2", # 1
"portfolio3", # 2
"portfolio4", # 3
"target1", # 4
"target2", # 5
"target3" # 6
))
# List of links
links = as.data.frame(matrix(c(
0,4,2,
1,6,1,
2,3,1,
2,6,1,
3,5,1),
byrow = TRUE, ncol = 3))
# Column names of data frame
names(links) = c("source", "target", "value")
# check
links
# Sankey Diagram
# Colour scale
colourScale = JS("d3.scaleOrdinal(d3.schemeCategory20);")
# Diagram
sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target", Value = "value", NodeID = "trialnodes",
fontSize = 14, nodeWidth = 10, nodePadding = 140, iterations = 0,
colourScale = colourScale)
However, as soon as I add one more node and one more link, the plot's format breaks completely: the links between nodes are shown as thin gray lines (not representing the Value), and the nodeWidth and nodePadding settings are ignored as well :(
Plot 2, links as thin gray lines
This is the code for Plot 2:
library(networkD3)
# List of nodes (portfolios & targets)
nodes = data.frame("trialnodes" =
c("portfolio1", # 0
"portfolio2", # 1
"portfolio3", # 2
"portfolio4", # 3
"target1", # 4
"target2", # 5
"target3", # 6
"target4" # 7
))
# List of links
links = as.data.frame(matrix(c(
0,4,2,
0,7,1,
1,6,1,
2,3,1,
2,6,1,
3,5,1),
byrow = TRUE, ncol = 3))
# Column names of data frame
names(links) = c("source", "target", "value")
# check
links
# Sankey Diagram
# Colour scale
colourScale = JS("d3.scaleOrdinal(d3.schemeCategory20);")
# Diagram
sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target", Value = "value", NodeID = "trialnodes",
fontSize = 14, nodeWidth = 10, nodePadding = 140, iterations = 0,
colourScale = colourScale)
Can anybody spot what's going on? I hope someone can help... I'm desperate D: Thank you very much in advance! :)
Either reduce your nodePadding value to something reasonable, or make the viewer/browser window tall enough to fit the maximum number of nodes you have in a column multiplied by 140 pixels (plus some for the nodes themselves), and then refresh. In your second example that comes out to ~600px. The code below uses nodePadding = 14:
library(networkD3)
# List of nodes (portfolios & targets)
nodes = data.frame("trialnodes" =
c("portfolio1", # 0
"portfolio2", # 1
"portfolio3", # 2
"portfolio4", # 3
"target1", # 4
"target2", # 5
"target3", # 6
"target4" # 7
))
# List of links
links = as.data.frame(matrix(c(
0,4,2,
0,7,1,
1,6,1,
2,3,1,
2,6,1,
3,5,1),
byrow = TRUE, ncol = 3))
# Column names of data frame
names(links) = c("source", "target", "value")
# check
links
# Sankey Diagram
# Colour scale
colourScale = JS("d3.scaleOrdinal(d3.schemeCategory20);")
# Diagram
sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target", Value = "value", NodeID = "trialnodes",
fontSize = 14, nodeWidth = 10, nodePadding = 14, iterations = 0,
colourScale = colourScale)
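The sizing rule in this answer can be turned into a quick back-of-the-envelope check. This is a rough sketch, not networkD3's actual layout calculation: estimate_min_height and its node_allowance argument are hypothetical names, and the 10 px allowance per node is an assumption chosen to match the "plus some for the actual node" remark.

```r
# Rough estimate of the plot height needed so the padding fits.
# max_nodes_in_column: largest number of nodes stacked in one column
# node_padding:        the nodePadding value passed to sankeyNetwork()
# node_allowance:      assumed extra pixels per node for the node itself
estimate_min_height <- function(max_nodes_in_column, node_padding,
                                node_allowance = 10) {
  max_nodes_in_column * (node_padding + node_allowance)
}

# The second example has at most four nodes in one column
estimate_min_height(4, 140)  # 600
estimate_min_height(4, 14)   # 96
```

If a larger padding is really wanted, the estimate could be passed as the height argument of sankeyNetwork() instead of resizing the viewer by hand.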

sankeyD3 > sankeyNetwork implementation of 'NodePosX'

I have been using the sankeyD3 package to create SankeyNetworks and the 'NodePosX' feature isn't working for me yet. The 'NodePosX' feature is not in the 'networkD3' package but it is in the 'sankeyD3' package.
To help illustrate the problem that I am having, I have edited the example from akraemer007 that was posted here to include the X positions of the nodes (see below) but it's still not working in the way that he had originally wanted, with manual control over the x-position of the 'Opted-Out' node.
We're aiming for something like this, but without the small line from 'Opted-Out' to 'Activated':
library(devtools)
devtools::install_github("fbreitwieser/sankeyD3")
library(sankeyD3)
name <- c('Enrolled', 'Opted-Out', 'Invited', 'Activated')
xpos <- c(0, 1, 1, 2)
nodes <- data.frame(name, xpos)
source <- c(0, 0, 2, 1)
target <- c(1, 2, 3, 3)
value <- c(20, 80, 60, 0)
links <- data.frame(source, target, value)
sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
Target = "target", Value = "value", NodeID = "name",NodePosX = "xpos",
units = "TWh", fontSize = 12, nodeWidth = 30)
Assuming the last row in your links data frame is only there to force the plot to look the way you want and is not part of the actual data you want to plot, you can achieve this with networkD3 using the sinksRight = FALSE parameter.
library(networkD3)
name <- c('Enrolled', 'Opted-Out', 'Invited', 'Activated')
xpos <- c(0, 1, 1, 2)
nodes <- data.frame(name, xpos)
source <- c(0, 0, 2)
target <- c(1, 2, 3)
value <- c(20, 80, 60)
links <- data.frame(source, target, value)
sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
Target = "target", Value = "value", NodeID = "name",
units = "TWh", fontSize = 12, nodeWidth = 30, sinksRight = FALSE)

Sankey Diagram with networkD3 package will not plot

I am using the sankeyNetwork function in the networkD3 package in R using as an example the code found here. However, all I get is a blank screen. The diagram is supposed to show the flow of infections between age groups (by gender). My code is as below:
library(RCurl)
library(networkD3)
edges <- read.csv(curl("https://raw.githubusercontent.com/kilimba/data/master/infection_flows.csv"),stringsAsFactors = FALSE )
nodes = data.frame(ID = unique(c(edges$Source, edges$Target)))
nodes$indx =0
for (i in 1:nrow(nodes)){
nodes[i,]["indx"] = i - 1
}
edges2 <- merge(edges,nodes,by.x = "Source",by.y = "ID")
edges2$Source <-NULL
names(edges2) <- c("target","value","source")
edges2 <- merge(edges2,nodes,by.x = "target",by.y = "ID")
edges2$target <- NULL
names(edges2) <- c("value","source","target")
nodes$indx <- NULL
# Plot
sankeyNetwork(Links = edges2, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "ID",
width = 700, fontsize = 12, nodeWidth = 30)
Are you sure there are no errors printed in your R console?
This works for me with two small modifications:
Load the curl package as well at the beginning
library("curl")
The fontsize parameter apparently does not work and should be removed.
# Plot
sankeyNetwork(Links = edges2, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "ID",
width = 700, #fontsize = 12,
nodeWidth = 30)
Adjusting the font size does work, but your argument name is missing a capital letter: it should be fontSize.
sankeyNetwork(Links = edges2, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "ID",
width = 700, fontSize = 12,
nodeWidth = 30)
You do not need RCurl; read.csv can read directly from a URL.
It's probably safer to use the stringsAsFactors = FALSE option when creating the nodes data.frame.
As others have pointed out, you must make sure that the source and target variables in the links data are numeric and zero-indexed.
As others have pointed out, the font size parameter is properly named fontSize.
Below is a more direct way of creating the links data with numeric indexes of the nodes in the nodes data.frame:
library(networkD3)
edges <- read.csv("https://raw.githubusercontent.com/kilimba/data/master/infection_flows.csv",stringsAsFactors = FALSE)
nodes = data.frame(ID = unique(c(edges$Source, edges$Target)), stringsAsFactors = FALSE)
edges$Source <- match(edges$Source, nodes$ID) - 1
edges$Target <- match(edges$Target, nodes$ID) - 1
sankeyNetwork(Links = edges, Nodes = nodes,
Source = "Source", Target = "Target",
Value = "Value", NodeID = "ID",
width = 700, fontSize = 12, nodeWidth = 30)
I solved it by making sure that source, target and value were all numeric.
For example:
Energy$links$value <- as.numeric(Energy$links$value)
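The requirements mentioned in these answers (numeric columns, zero-indexed source and target) can be checked before plotting. The helper below is a hypothetical sketch, not part of networkD3; check_sankey_links and its arguments are names made up for this example, and the toy data stands in for the CSV used in the question.

```r
# Return a character vector describing problems with the links data;
# an empty vector means the basic checks passed.
check_sankey_links <- function(links, n_nodes, source = "source",
                               target = "target", value = "value") {
  problems <- character(0)
  for (col in c(source, target, value)) {
    if (!is.numeric(links[[col]])) {
      problems <- c(problems, sprintf("column '%s' is not numeric", col))
    }
  }
  idx <- c(links[[source]], links[[target]])
  # Only check the index range when both columns are numeric
  if (is.numeric(idx) && length(idx) > 0) {
    if (min(idx) < 0 || max(idx) > n_nodes - 1) {
      problems <- c(problems,
                    "source/target indices fall outside 0..(n_nodes - 1)")
    }
  }
  problems
}

# Toy data with character columns, as read from a CSV
nodes <- data.frame(ID = c("a", "b", "c"), stringsAsFactors = FALSE)
edges <- data.frame(Source = c("a", "b"), Target = c("b", "c"),
                    Value = c("5", "3"), stringsAsFactors = FALSE)

# Before conversion: all three columns are reported as non-numeric
check_sankey_links(edges, nrow(nodes), "Source", "Target", "Value")

# Convert to zero-based numeric indices and numeric values
edges$Source <- match(edges$Source, nodes$ID) - 1
edges$Target <- match(edges$Target, nodes$ID) - 1
edges$Value <- as.numeric(edges$Value)

# After conversion: no problems reported
check_sankey_links(edges, nrow(nodes), "Source", "Target", "Value")
```

A check like this catches the common case where one-based indices slip through, since the largest index then equals the number of nodes.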
