Plotting a Sankey Diagram using R - r

Due to a mysterious problem discussed on this thread, I currently can't plot a Sankey diagram using R.
Since some people reported being unable to reproduce my problem in their R environment, and at this point I'm willing to do anything for my diagram, it occurred to me that maybe somebody could help me by running the code I need and sending me the image...
So, this is the code:
library(networkD3)
#List of nodes (portfolios & targets)
nodes = data.frame("finalnodes" =
c("portfolio1", #0
"portfolio2", #1
"portfolio3", #2
"portfolio4", #3
"target1", #4
"target2", #5
"target3", #6
"target4", #7
"target5", #8
"target6", #9
"target7", #10
"target8" #11
))
#List of links
links = as.data.frame(matrix(c(
0,4,1,
2,10,1,
2,6,1,
3,4,1,
3,5,1,
3,7,1,
3,8,1,
3,9,1,
3,11,1),
byrow = TRUE, ncol = 3))
#Column names of data frame
names(links) = c("source", "target", "value")
#check
links
#Sankey Diagram
#Colour scale
colourScale = JS("d3.scaleOrdinal(d3.schemeCategory20);")
#Diagram
sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target", Value = "value", NodeID = "finalnodes",
fontSize = 14, nodeWidth = 10, nodePadding = 140, iterations = 0,
colourScale = colourScale)
Needless to say, when I plot it, the format is all messed up.
The plot I get:

Related

Converting a class of sankeyNetwork into a grob

I am trying to make a panel plots of 3 sankey diagram in R using networkD3 library. The Sankey diagrams are well generated and I have assigned three variables (p1, p2 and p3) to the plots. Now I want to combine them using ggarrange into one single plot using the code below;
ggarrange(p1, p2, p3, labels = c("A", "B", "C"), ncol = 1, nrow = 3)
But, I keep getting the following error.
Warning message:
In as_grob.default(plot) :
Cannot convert object of class sankeyNetworkhtmlwidget into a grob.
Apparently, ggarrange does not accept the format of my plots. How can I convert them into a format that ggarrange will accept?
Here is the code that I used to generate the final Sankey diagram for the first one (p1). The other two (p2, p3) are generated using the same code.
nodes = data.frame("name" = c("Rocket 1", "Kerosene","Carbon IV Oxide",
"Aluminium IV Oxide","Sulphur IV Oxide"))
links = as.data.frame(matrix(c(0, 1, 488.4, 1, 2, 576.7, 1, 3, 24.4,
1, 4, 0.3), byrow = TRUE, ncol = 3))
names(links) = c("source", "target", "value")
p1 = sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "name",
fontSize= 12, nodeWidth = 20)
I would really appreciate some help with this. Thank you.

Sort multiple nodes into rows in DiagrammeR in R

I am building a large map where a number of observed variables map onto latent variables. When using DiagrammeR to generate the map, all the observed items (indicated by squares) appear in a single row. I was wondering if there was a way to sort them into multiple rows, say 5 or 10 in a row? The example below is based on actual data where there are 30+ items that map to one latent variable. The final map will have multiple sets of these, but right now, I am working on getting one to display correctly:
library(tidyverse)
library(DiagrammeR)
nodes <- create_node_df(
n=33,
nodes = 33,
label = make.unique(rep(letters, length.out = 33), sep=''),
tooltip = make.unique(rep(letters, length.out = 33), sep=''),
fontsize= 7,
shape = c("ellipse", "ellipse", rep("square", 31)),
fillcolor = "white",
width = 1,
height = 1,
)
# lines
edges <- create_edge_df(
from=c(1,rep(2,32)),
to = c(2,3,seq(3, 33, 1)),
)
create_graph(nodes_df = nodes,
edges_df = edges,
) %>%
render_graph(layout="tree")
This renders as:
However, what I am going for is more like:

Thin gray lines as links plotting sankeyNetwork in R

Disclaimer: I'm an R newbie, so I may be overlooking something really obvious here...
I am currently working on a sankeyNetwork diagram using R, and I am facing a problem that almost seems to be a bug, but I'm completely clueless...
I've googled extensively, and haven't been able to find anybody else reporting the same...
The problem is that in my code I currently have 7 nodes, and 5 links. When I plot the diagram, everything works fine:
Plot 1, everything working fine
This is the code for Plot 1:
library(networkD3)
# List of nodes (portfolios & targets)
nodes = data.frame("trialnodes" =
c("portfolio1", # 0
"portfolio2", # 1
"portfolio3", # 2
"portfolio4", # 3
"target1", # 4
"target2", # 5
"target3" # 6
))
# List of links
links = as.data.frame(matrix(c(
0,4,2,
1,6,1,
2,3,1,
2,6,1,
3,5,1),
byrow = TRUE, ncol = 3))
# Column names of data frame
names(links) = c("source", "target", "value")
# check
links
# Sankey Diagram
# Colour scale
colourScale = JS("d3.scaleOrdinal(d3.schemeCategory20);")
# Diagram
sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target", Value = "value", NodeID = "trialnodes",
fontSize = 14, nodeWidth = 10, nodePadding = 140, iterations = 0,
colourScale = colourScale)
However, as soon as I add one more node and one more link, the plot's format becomes completely broken, showing the links between nodes as thin gray lines that no longer represent the Value. The nodeWidth and nodePadding specifications also get ignored :(
Plot 2, links as thin gray lines
This is the code for Plot 2:
library(networkD3)
# List of nodes (portfolios & targets)
nodes = data.frame("trialnodes" =
c("portfolio1", # 0
"portfolio2", # 1
"portfolio3", # 2
"portfolio4", # 3
"target1", # 4
"target2", # 5
"target3", # 6
"target4" # 7
))
# List of links
links = as.data.frame(matrix(c(
0,4,2,
0,7,1,
1,6,1,
2,3,1,
2,6,1,
3,5,1),
byrow = TRUE, ncol = 3))
# Column names of data frame
names(links) = c("source", "target", "value")
# check
links
# Sankey Diagram
# Colour scale
colourScale = JS("d3.scaleOrdinal(d3.schemeCategory20);")
# Diagram
sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target", Value = "value", NodeID = "trialnodes",
fontSize = 14, nodeWidth = 10, nodePadding = 140, iterations = 0,
colourScale = colourScale)
Can anybody spot what's going on? I hope someone can help... I'm desperate D: Thank you very much in advance! :)
Either reduce your nodePadding value to something reasonable, or make the viewer/browser window tall enough to fit the maximum number of nodes you have in one column multiplied by 140 pixels (plus some for the nodes themselves), and then refresh. In your second example that comes out to ~600px.
library(networkD3)
# List of nodes (portfolios & targets)
nodes = data.frame("trialnodes" =
c("portfolio1", # 0
"portfolio2", # 1
"portfolio3", # 2
"portfolio4", # 3
"target1", # 4
"target2", # 5
"target3", # 6
"target4" # 7
))
# List of links
links = as.data.frame(matrix(c(
0,4,2,
0,7,1,
1,6,1,
2,3,1,
2,6,1,
3,5,1),
byrow = TRUE, ncol = 3))
# Column names of data frame
names(links) = c("source", "target", "value")
# check
links
# Sankey Diagram
# Colour scale
colourScale = JS("d3.scaleOrdinal(d3.schemeCategory20);")
# Diagram
sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target", Value = "value", NodeID = "trialnodes",
fontSize = 14, nodeWidth = 10, nodePadding = 14, iterations = 0,
colourScale = colourScale)
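The sizing rule in the answer above can be turned into a quick back-of-the-envelope check. This is only a rough sketch: estimate_min_height is a hypothetical helper (not part of networkD3), the 10px per-node allowance is an assumption standing in for "plus some for the actual node", and the count of four nodes in the densest column is inferred from the ~600px figure quoted in the answer.

```r
# Hypothetical helper: rough minimum plot height (in pixels) needed so that
# the densest column of nodes fits with the requested nodePadding.
# node_allowance is an assumed per-node height, not a networkD3 setting.
estimate_min_height <- function(max_nodes_in_column, node_padding,
                                node_allowance = 10) {
  max_nodes_in_column * (node_padding + node_allowance)
}

# Assuming four nodes in the densest column:
tall  <- estimate_min_height(4, node_padding = 140)  # nodePadding from the question
short <- estimate_min_height(4, node_padding = 14)   # nodePadding from the answer

print(c(nodePadding_140 = tall, nodePadding_14 = short))
```

If the estimate is larger than the viewer or browser window, either lower nodePadding or pass a larger height to sankeyNetwork.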

How to add title to a networkD3 visualisation when saving as a web page?

I have created an interactive visualisation using the following code:
library(networkD3)
nodes = data.frame("name" = c("node1", "node2","node3", "node4", "node5", "node6", "node7"))
links = as.data.frame(matrix(c(
0,1,7937,
0,2,6990,
0,3,2483,
1,4,2120,
2,4,666,
3,4,282,
1,5,4583,
2,5,5657,
3,5,731,
1,6,1234,
2,6,756,
3,6,1470), byrow = TRUE, ncol = 3))
names(links) = c("source", "target", "value")
sankey <- sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "name",
fontSize= 12, nodeWidth = 15)
This is my first time using the networkD3 package (or any interactive package, for that matter). From playing around, I found that to keep the plot interactive it has to be published as a web page (or is there another way?), but looking through the package documentation I can't see a way to add a title or a caption/comments. I want to share this piece of work, so ideally I need to explain on the published web page what each level means.
There is no feature built into networkD3 to add titles or captions, but you can use functions in the htmlwidgets package to prepend or append content to an htmlwidget. There are numerous options; for example:
library(htmlwidgets)
library(htmltools)
sankey <- htmlwidgets::prependContent(sankey, htmltools::tags$h1("Title"))
sankey <- htmlwidgets::appendContent(sankey, htmltools::tags$p("Caption"))
Responding to the comment, "I ended up using this to add a title, but it keeps pushing my viz down and cutting the bottom off. This stays off even when saving as a webpage. Is there anyway I can stop this from happening?"
I tried the suggested reply of adding sankey$sizingPolicy$viewer$fill <- FALSE; however, it made my sankey smaller than I wanted. I found that you can set the width and height of the Sankey before adding the HTML content by passing width = (desired width) and height = (desired height) to sankeyNetwork. This creates the space to then add the title and the caption, as suggested by CJ Yetman.
library(networkD3)
library(htmlwidgets)
library(htmltools)
nodes = data.frame("name" = c("node1", "node2","node3", "node4", "node5", "node6", "node7"))
links = as.data.frame(matrix(c(
0,1,7937,
0,2,6990,
0,3,2483,
1,4,2120,
2,4,666,
3,4,282,
1,5,4583,
2,5,5657,
3,5,731,
1,6,1234,
2,6,756,
3,6,1470), byrow = TRUE, ncol = 3))
names(links) = c("source", "target", "value")
sankey <- sankeyNetwork(Links = links, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "name",
fontSize= 12, nodeWidth = 15,
width= 900, height=600)
sankey <- htmlwidgets::prependContent(sankey, htmltools::tags$h1("Title"))
sankey <- htmlwidgets::appendContent(sankey, htmltools::tags$p("Caption"))
sankey

Sankey Diagram with networkD3 package will not plot

I am using the sankeyNetwork function in the networkD3 package in R using as an example the code found here. However, all I get is a blank screen. The diagram is supposed to show the flow of infections between age groups (by gender). My code is as below:
library(RCurl)
library(networkD3)
edges <- read.csv(curl("https://raw.githubusercontent.com/kilimba/data/master/infection_flows.csv"),stringsAsFactors = FALSE )
nodes = data.frame(ID = unique(c(edges$Source, edges$Target)))
nodes$indx =0
for (i in 1:nrow(nodes)){
nodes[i,]["indx"] = i - 1
}
edges2 <- merge(edges,nodes,by.x = "Source",by.y = "ID")
edges2$Source <-NULL
names(edges2) <- c("target","value","source")
edges2 <- merge(edges2,nodes,by.x = "target",by.y = "ID")
edges2$target <- NULL
names(edges2) <- c("value","source","target")
nodes$indx <- NULL
# Plot
sankeyNetwork(Links = edges2, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "ID",
width = 700, fontsize = 12, nodeWidth = 30)
Are you sure there are no errors printed in your R console?
This works for me with two small modifications:
Load the curl package as well at the beginning
library("curl")
The fontsize parameter apparently does not work and should be removed.
# Plot
sankeyNetwork(Links = edges2, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "ID",
width = 700, #fontsize = 12,
nodeWidth = 30)
Adjusting fontsize does work, but your argument is missing a capitalization: fontSize
sankeyNetwork(Links = edges2, Nodes = nodes,
Source = "source", Target = "target",
Value = "value", NodeID = "ID",
width = 700, fontSize = 12,
nodeWidth = 30)
You do not need RCurl; read.csv is able to read directly from a URL.
It's probably safer to use the stringsAsFactors = FALSE option when creating the nodes data.frame.
As others have pointed out, you must make sure that the source and target variables in the links data are numeric, and that they are zero-indexed.
As others have pointed out, the font size parameter is properly named fontSize.
I have provided a more direct way of creating the links data with numeric indexes of the nodes in the nodes data.frame:
library(networkD3)
edges <- read.csv("https://raw.githubusercontent.com/kilimba/data/master/infection_flows.csv",stringsAsFactors = FALSE)
nodes = data.frame(ID = unique(c(edges$Source, edges$Target)), stringsAsFactors = FALSE)
edges$Source <- match(edges$Source, nodes$ID) - 1
edges$Target <- match(edges$Target, nodes$ID) - 1
sankeyNetwork(Links = edges, Nodes = nodes,
Source = "Source", Target = "Target",
Value = "Value", NodeID = "ID",
width = 700, fontSize = 12, nodeWidth = 30)
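To see what the match() step in the answer above does without downloading the CSV, here is a minimal base-R sketch. The three-row edge list is made-up illustration data, not the infection_flows data.

```r
# Hypothetical edge list that refers to nodes by name
edges <- data.frame(
  Source = c("A", "A", "B"),
  Target = c("B", "C", "C"),
  Value  = c(5, 3, 2),
  stringsAsFactors = FALSE
)

# One row per unique node, in order of first appearance
nodes <- data.frame(ID = unique(c(edges$Source, edges$Target)),
                    stringsAsFactors = FALSE)

# match() returns 1-based row positions in nodes;
# subtracting 1 gives the zero-based indices sankeyNetwork expects
edges$Source <- match(edges$Source, nodes$ID) - 1
edges$Target <- match(edges$Target, nodes$ID) - 1

print(nodes)
print(edges)
```

Because the indices are derived from the row order of nodes, the nodes data.frame must not be reordered after this step.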
I solved it for me by making sure that source, target and values were all numeric.
For example:
Energy$links$value <- as.numeric(Energy$links$value)
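The conditions mentioned in these answers (numeric source, target and value columns, and zero-based indices) can be checked before plotting. The following is only a sketch: check_links is a hypothetical helper written for illustration, not a networkD3 function, and the default column names are assumptions.

```r
# Hypothetical helper: returns a character vector describing problems in a
# links data frame; an empty vector means no problems were found.
check_links <- function(links, n_nodes,
                        source = "source", target = "target", value = "value") {
  cols <- c(source, target, value)
  missing_cols <- setdiff(cols, names(links))
  if (length(missing_cols) > 0) {
    return(paste("missing column:", missing_cols))
  }
  problems <- character(0)
  # All three columns must be numeric
  for (col in cols) {
    if (!is.numeric(links[[col]])) {
      problems <- c(problems, paste(col, "is not numeric"))
    }
  }
  # Indices must be zero-based, i.e. within 0 .. n_nodes - 1
  idx <- c(links[[source]], links[[target]])
  if (is.numeric(idx) && (any(idx < 0) || any(idx > n_nodes - 1))) {
    problems <- c(problems, "source/target indices fall outside 0..(n_nodes - 1)")
  }
  problems
}

# Made-up example data: one valid set of links, one with character values
good <- data.frame(source = c(0, 1), target = c(1, 2), value = c(1, 1))
bad  <- data.frame(source = c(0, 1), target = c(1, 2), value = c("1", "1"),
                   stringsAsFactors = FALSE)

print(check_links(good, n_nodes = 3))
print(check_links(bad, n_nodes = 3))
```

A non-numeric column reported here can then be converted with as.numeric, as in the line above.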
