Colouring by variable when using tidy graph in R?

I am trying to find a way to colour multiple tidygraph plots consistently. The issue is that when I draw several plots at once, each plot assigns colours to the variables independently, so the same variable can end up with a different colour in each plot. Hopefully my example below will explain the issue.
To begin, I create some data, turn them into tidygraph objects, and put them together into a list:
library(tidygraph)
library(ggraph)
library(gridExtra)
# create some data for the tbl_graph
nodes <- data.frame(name = c("x4", NA, NA),
label = c("x4", 5, 2))
nodes1 <- data.frame(name = c("x4", "x2", NA, NA, "x1", NA, NA),
label = c("x4", "x2", 2, 1, "x1", 2, 7))
edges <- data.frame(from = c(1,1), to = c(2,3))
edges1 <- data.frame(from = c(1, 2, 2, 1, 5, 5),
to = c(2, 3, 4, 5, 6, 7))
# create the tbl_graphs
tg <- tbl_graph(nodes = nodes, edges = edges)
tg_1 <- tbl_graph(nodes = nodes1, edges = edges1)
# put into list
myList <- list(tg, tg_1)
Then I have a plotting function that allows me to display all the plots at once. I do this using grid.arrange from the gridExtra package, like so:
plotFun <- function(List){
ggraph(List, "partition") +
geom_node_tile(aes(fill = name), size = 0.25) +
geom_node_label(aes(label = label, color = name)) +
scale_y_reverse() +
theme_void() +
theme(legend.position = "none")
}
# Display all plots
allPlots <- lapply(myList, plotFun)
n <- length(allPlots)
nRow <- floor(sqrt(n))
do.call("grid.arrange", c(allPlots, nrow = nRow))
This will produce something like this:
As you can see, it colours by the variable label for each individual plot. This results in the same variable label being coloured differently in each plot. For example, x4 in the first plot is red and in the second plot is blue.
I'm trying to find a way to make the colours for the variable's label consistent across all plots. Maybe using grid.arrange isn't the best solution!?
Any help is appreciated.

Since each plot doesn't know anything about the other plots, it's best to assign colors yourself. First you can extract all the node names and assign each of them a color:
nodenames <- unique(na.omit(unlist(lapply(myList, .%>%activate(nodes) %>% pull(name) ))))
nodecolors <- setNames(scales::hue_pal(c(0,360)+15, 100, 64, 0, 1)(length(nodenames)), nodenames)
nodecolors
# x4 x2 x1
# "#F5736A" "#00B734" "#5E99FF"
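We use scales::hue_pal to get colors close to the "default" ggplot colors (the luminance of 64 passed here is one unit below hue_pal's default of 65, so the hex codes differ slightly from ggplot's own), but you could use whatever you like. Then we just need to customize the color/fill scales for the plots with these colors.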
plotFun <- function(List, colors=NULL){
plot <- ggraph(List, "partition") +
geom_node_tile(aes(fill = name), size = 0.25) +
geom_node_label(aes(label = label, color = name)) +
scale_y_reverse() +
theme_void() +
theme(legend.position = "none")
if (!is.null(colors)) {
plot <- plot + scale_fill_manual(values=colors) +
scale_color_manual(values=colors, na.value="grey")
}
plot
}
allPlots <- lapply(myList, plotFun, colors=nodecolors)
n <- length(allPlots)
nRow <- floor(sqrt(n))
do.call("grid.arrange", c(allPlots, nrow = nRow))
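The idea of a shared lookup can be seen without any plotting at all. The following is a minimal base R sketch, not the answer's exact code: it swaps scales::hue_pal for grDevices::hcl.colors with the "Dark 3" palette (an arbitrary choice made here), and the two name vectors simply mirror the node names from the question.

```r
# Node names as in the question's two graphs; NA marks terminal nodes.
names_a <- c("x4", NA, NA)
names_b <- c("x4", "x2", NA, NA, "x1", NA, NA)

# One shared lookup: every distinct non-missing name gets exactly one colour.
# hcl.colors() and "Dark 3" are stand-ins chosen for this sketch.
all_names <- unique(na.omit(c(names_a, names_b)))
palette <- setNames(grDevices::hcl.colors(length(all_names), "Dark 3"), all_names)

# Looking colours up by name gives the same answer for both graphs,
# no matter how many other names each graph contains.
colour_a <- unname(palette[names_a[!is.na(names_a)]])
colour_b <- unname(palette[names_b[!is.na(names_b)]])
```

Because the lookup is built once from the union of all names, x4 resolves to the same colour in both graphs, which is what passing the named vector to scale_fill_manual and scale_color_manual relies on.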

Related

Set / Link point and shape options for variables in ggplot2

I would like to link variables I have in a dataframe i.e. ('prop1', 'prop2', 'prop3') to specific colours and shapes in the plot. However, I also want to exclude data (using dplyr::filter) to customise the plot display WITHOUT changing the points and shapes used for a specific variable. A minimal example is given below.
library(ggplot2)
library(dplyr)
library(magrittr)
obj <- c("cmpd 1","cmpd 1","cmpd 1","cmpd 2","cmpd 2")
x <- c(1, 2, 4, 7, 3)
var <- c("prop1","prop2","prop3","prop2","prop3")
y <- c(1, 2, 3, 2.5, 4)
col <- c("#E69F00","#9E0142","#56B4E9","#9E0142","#56B4E9")
shp <- c(0,1,2,1,2)
df2 <- cbind.data.frame(obj,x,var,y,col,shp)
plot <- ggplot(data = df2 %>%
filter(obj %in% c(
"cmpd 1",
"cmpd 2"
)),
aes(x = x,
y = y,
colour = as.factor(var),
shape = as.factor(var))) +
geom_point(size=2) +
#scale_shape_manual(values=shp) +
#scale_color_manual(values=col) +
facet_grid(.~obj)
plot
However, when I exclude cmpd 1 (by commenting it out in the filter), the colour and shape of prop2 and prop3 for cmpd 2 change (please see plot2).
To address this, I tried adding scale_shape_manual and scale_color_manual (currently commented out) and linking them to the col and shp columns in the data frame (df2), but the same problem arises: both the shape and colour of these variables change when one of the conditions is excluded.
Any and all help appreciated.

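Try something like this, using named vectors so that each value of var is matched to its colour and shape by name rather than by position: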
library(tidyverse)
obj <- c("cmpd 1","cmpd 1","cmpd 1","cmpd 2","cmpd 2")
x <- c(1, 2, 4, 7, 3)
var <- c("prop1","prop2","prop3","prop2","prop3")
y <- c(1, 2, 3, 2.5, 4)
df2 <- cbind.data.frame(obj,x,var,y)
col <- c("prop1" = "#E69F00",
"prop2" = "#9E0142",
"prop3" = "#56B4E9")
shp <- c("prop1" = 0,
"prop2" = 1,
"prop3" = 2)
plot <- ggplot(data = df2 %>%
filter(obj %in% c(
"cmpd 1",
"cmpd 2"
)),
aes(x = x,
y = y,
colour = var,
shape = var)) +
geom_point(size=2) +
scale_shape_manual(values=shp) +
scale_color_manual(values=col) +
facet_grid(.~obj)
plot
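To check that the named vectors really do pin the colours, one option is to inspect the built layer data with and without the filter. This is only a sketch: make_plot and colours_used are hypothetical helper names introduced here, and only the colour scale is shown (the shape scale behaves the same way).

```r
library(ggplot2)

df2 <- data.frame(
  obj = c("cmpd 1", "cmpd 1", "cmpd 1", "cmpd 2", "cmpd 2"),
  x   = c(1, 2, 4, 7, 3),
  var = c("prop1", "prop2", "prop3", "prop2", "prop3"),
  y   = c(1, 2, 3, 2.5, 4)
)
col <- c(prop1 = "#E69F00", prop2 = "#9E0142", prop3 = "#56B4E9")

# Hypothetical helper: the same plot code, whatever rows are passed in
make_plot <- function(data) {
  ggplot(data, aes(x = x, y = y, colour = var)) +
    geom_point(size = 2) +
    scale_colour_manual(values = col)
}

# Hypothetical helper: colours actually drawn, matched back to the
# input rows via their (unique) x values and labelled by var
colours_used <- function(data) {
  built <- layer_data(make_plot(data))
  setNames(built$colour[match(data$x, built$x)], data$var)
}

all_rows   <- colours_used(df2)
cmpd2_only <- colours_used(df2[df2$obj == "cmpd 2", ])
```

Comparing all_rows["prop2"] with cmpd2_only["prop2"] shows the same hex code in both cases, whereas an unnamed values vector would be assigned by position among the levels that remain after filtering.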

Colouring nodes using graph and tidygraph in R?

I recently asked this question about how to colour nodes by variable. And the code works great. However, I'm back trying to colour the terminal nodes separately. For example, if I create some data, then turn them into tidygraph objects and plot them using ggraph then I get something like this:
library(tidygraph)
library(ggraph)
library(gridExtra)
pal = colorspace::sequential_hcl(palette = "Purples 3", n = 100)
# create some data for the tbl_graph
nodes <- data.frame(name = c("x4", NA, NA),
label = c("x4", 5, 2),
value = c(10, 5, 2))
nodes1 <- data.frame(name = c("x4", "x2", NA, NA, "x1", NA, NA),
label = c("x4", "x2", 2, 1, "x1", 13, 7),
value = c(10, 8, 2, 1, 10, 13, 7))
edges <- data.frame(from = c(1,1), to = c(2,3))
edges1 <- data.frame(from = c(1, 2, 2, 1, 5, 5),
to = c(2, 3, 4, 5, 6, 7))
# create the tbl_graphs
tg <- tbl_graph(nodes = nodes, edges = edges)
tg_1 <- tbl_graph(nodes = nodes1, edges = edges1)
# put into list
myList <- list(tg, tg_1)
# set colours for variables
nodenames <- unique(na.omit(unlist(lapply(myList, .%>%activate(nodes) %>% pull(name) ))))
nodecolors <- setNames(scales::hue_pal(c(0,360)+15, 100, 64, 0, 1)(length(nodenames)), nodenames)
nodecolors
# plot function
plotFun <- function(List, colors=NULL){
plot <- ggraph(List, "partition") +
geom_node_tile(aes(fill = name), size = 0.25) +
geom_node_label(aes(label = label, color = name)) +
scale_y_reverse() +
theme_void() +
theme(legend.position = "none")
if (!is.null(colors)) {
plot <- plot + scale_fill_manual(values=colors) +
scale_fill_manual(values=colors, na.value= 'grey40')
}
plot
}
# create grid of plots
allPlots <- lapply(myList, plotFun, colors=nodecolors)
n <- length(allPlots)
nRow <- floor(sqrt(n))
do.call("grid.arrange", c(allPlots, nrow = nRow))
As you can see, the named nodes are all coloured correctly, but the terminal nodes are coloured grey. I am trying to colour the terminal nodes by the corresponding value in the value column of the data. I have tried altering the scale_fill_manual call, but I can't seem to get it to work.
Any suggestions as to how I could do this?

If I understand correctly, you want to apply a different colour mapping to
the terminal nodes, mapping value to colour rather than name, and using
a different colour scale altogether. ggplot2 doesn’t support that directly,
but you can use e.g. ggnewscale to apply a different scale for the rest
of the plot.
I simplified your example a bit to focus on the new scale application:
library(tidygraph)
library(ggraph)
nodes <- data.frame(
name = c("x4", "x2", NA, NA, "x1", NA, NA),
label = c("x4", "x2", 2, 1, "x1", 13, 7),
value = c(10, 8, 2, 1, 10, 13, 7)
)
edges <- data.frame(
from = c(1, 2, 2, 1, 5, 5),
to = c(2, 3, 4, 5, 6, 7)
)
tg <- tbl_graph(nodes = nodes, edges = edges)
ggraph(tg, "partition") +
geom_node_tile(aes(fill = name)) +
geom_node_label(aes(label = label, color = name)) +
# Apply different colour/fill scales to terminal nodes
ggnewscale::new_scale_fill() +
ggnewscale::new_scale_color() +
geom_node_tile(
data = . %>% filter(is.na(name)),
aes(fill = value)
) +
geom_node_label(
data = . %>% filter(is.na(name)),
aes(label = label, color = value)
)

Use plot_grid to arrange plots where plot information is stored in an R data frame

I generate a series of plots stored in a matrix as part of a for loop much like in the MWE below. This same matrix also stores two other columns of information (Colour and Animal in this example). I then want to be able to create a grid of plots, where I identify the plot based on the corresponding Colour and Animal.
I tried creating a data frame and then using row names to call out the plots I needed, but got the common error "Cannot convert object of class list into a grob". If I call from the matrix directly this works; however, I want to avoid doing that in case the order of the data changes in the input files. Is it possible to work directly from the data frame? I've seen similar examples, but couldn't apply them to my case. I want to stick with cowplot and change as little as possible in the data generation stage.
MWE
library(ggplot2) # ggplot(), aes() etc. are used below
library(cowplot)
p <- vector('list', 15)
p <-
matrix(
p,
nrow = 5,
ncol = 3
)
myColours = c("Yellow", "Red", "Blue", "Green", "Orange")
myAnimals = c("Kangaroo", "Emu", "Echidna", "Platypus", "Cassowary")
x = seq(1,10)
it = 1
for (i in seq(0,4)){ # generate example data and plots
y = x^i
t = runif(5)
df <- data.frame("X" = x, "Y" = y, "T" = t)
theanimal = myAnimals[i+1]
thecolour = myColours[i+1]
p[[it,1]] = thecolour
p[[it,2]] = theanimal
p[[it,3]] = ggplot(data = df, mapping = aes(x = X, y = Y)) +
geom_point(aes(color = T)) +
ggtitle(paste(thecolour, theanimal, sep = " "))
it = it+ 1
}
# turn into df
pltdf<- as.data.frame(p)
colnames(pltdf) <- c("Colour", "Animal", "plot")
rownames(pltdf) <- do.call(paste, c(pltdf[c("Colour", "Animal")], sep="-"))
pltdf[[1,3]] # this is what I expect for a single plot
plot1 = vector('list', 4)
plot1 <-
matrix(
plot1,
nrow = 2,
ncol = 2
)
plot1[[1,1]] = pltdf["Red-Emu", "plot"] # also tried with just plot[[1]] = etc.
plot1[[1,2]] = pltdf["Blue-Echidna", "plot"]
plot1[[2,1]] = pltdf["Orange-Cassowary", "plot"]
plot1[[2,2]] = pltdf["Green-Platypus", "plot"]
plot_grid(plotlist = t(plot1), ncol = 2)
plot_grid(plotlist = list(plot1), ncol = 2) # suggested solution on a dif problem
plot2 = vector('list', 4) # what I want plots to look like in the end
plot2[[1]] = p[[1,3]]
plot2[[2]] = p[[4,3]]
plot2[[3]] = p[[2, 3]]
plot2[[4]] = p[[5, 3]]
plot_grid(plotlist = t(plot2), ncol = 2)

You can specify the order that you want the plots to be in and subset the data frame accordingly; the resulting list can be passed straight to plot_grid.
library(cowplot)
order <- c("Red-Emu", "Blue-Echidna", "Orange-Cassowary", "Green-Platypus")
plot_grid(plotlist = pltdf[order, 'plot'], ncol = 2)
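As for why the original attempt failed, here is a small base R sketch of how a list column behaves under the two kinds of indexing. Strings stand in for the ggplot objects so it runs without any plotting packages, and the data frame is built directly rather than through the question's matrix; both are simplifications made here.

```r
# Strings stand in for ggplot objects in this sketch
pltdf <- data.frame(
  Colour = c("Yellow", "Red", "Blue"),
  Animal = c("Kangaroo", "Emu", "Echidna")
)
pltdf$plot <- list("plot A", "plot B", "plot C")  # list column, one entry per row
rownames(pltdf) <- paste(pltdf$Colour, pltdf$Animal, sep = "-")

# Several rows with single brackets: a plain list in the requested order,
# which is the shape that plot_grid(plotlist = ...) expects
wanted   <- c("Blue-Echidna", "Yellow-Kangaroo")
selected <- pltdf[wanted, "plot"]

# One cell with single brackets is still a list of length 1 ...
one_cell <- pltdf["Red-Emu", "plot"]
# ... whereas double brackets return the stored object itself
one_plot <- pltdf[["Red-Emu", "plot"]]
```

Storing one_cell inside another list gives a list nested in a list, which would explain the "Cannot convert object of class list into a grob" error; either subset several rows at once as in the answer, or use double brackets for a single plot.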

Combine multiple facet strips across columns in ggplot2 facet_wrap

I am trying to combine facet strips across two adjacent panels (there is always two adjacent ones with the same first ID variable, but with two different scenarios, let's call them "A" and "B"). I am not particularly wedded to the gtable + grid solution I tried, but sadly I cannot use the facet_nested() from the ggh4x package (I cannot install it on my company's server due to various restrictions that are in place and needed dependencies - I looked at using only the relevant code, but that again is not easy due to the dependencies).
A minimum viable example of the basic plot I want to make easier to read by indicating which panels "belong together" by combining the top facet strips looks like this:
library(tidyverse)
library(gtable)
library(grid)
idx = 1:16
p1 = expand_grid(id=idx, id2=c("A", "B"), x=1:10) %>%
mutate(y=rnorm(n=n())) %>%
ggplot(aes(x=x,y=y)) +
geom_jitter() +
facet_wrap(~id + id2, nrow = 4, ncol=8)
The strips with the "1"s, the ones with the "2"s etc. should be combined (in reality it's a somewhat longer text, but this is just for illustration). I tried to adapt an answer for a similar scenario (https://stackoverflow.com/a/40316170/7744356 - thank you @markus for finding it again); my attempt is below. As you can see, the height of what I produce seems wrong. I assume this must be some trivial thing I am overlooking or not understanding.
# Combine strips for a ID
g <- ggplot_gtable(ggplot_build(p1))
strip <- gtable_filter(g, "strip-t", trim = FALSE)
stript <- which(grepl('strip-t', g$layout$name))
stript2 = stript[idx*2-1]
top <- strip$layout$t[idx*2-1]
# # Using the $b below instead of b = top[i]+1, also seems not to work
#bot <- strip$layout$b[idx*2-1]
l <- strip$layout$l[idx*2-1]
r <- strip$layout$r[idx*2]
mat <- matrix(vector("list",
length = length(idx)*3),
nrow = length(idx))
mat[] <- list(zeroGrob())
res <- gtable_matrix("toprow", mat,
unit(c(1, 0, 1), "null"),
unit( rep(1, length(idx)),
"null"))
for (i in 1:length(stript2)){
if (i==1){
zz <- res %>%
gtable_add_grob(g$grobs[[stript2[i]]]$grobs[[1]], 1, 1, 1, 3) %>%
gtable_add_grob(g, .,
t = top[i],
l = l[i],
b = top[i]+1,
r = r[i],
name = c("add-strip"))
} else {
zz <- res %>%
gtable_add_grob(g$grobs[[stript2[i]]]$grobs[[1]], 1, 1, 1, 3) %>%
gtable_add_grob(zz, .,
t = top[i],
l = l[i],
b = top[i]+1,
r = r[i],
name = c("add-strip"))
}
}
grid::grid.draw(zz)
------------ Update with a ggh4x implementation -----------------
This may solve this type of problem for many, but has its downsides (e.g. axes alignment across rows gets a bit manual, you probably need to manually remove x-axes and ensure the limits are the same, add a unified y-axis label, it requires installing a specific version of a package from GitHub: devtools::install_github("teunbrand/ggh4x@v0.1"), plus cowplot interacts badly with e.g. ggtern). So I'd love it if someone still managed to do a pure gtable + grid version.
library(tidyverse)
library(ggh4x)
library(cowplot)
plots = expand_grid(id=idx, id2=c("A", "B"), x=1:10) %>%
mutate(y=rnorm(n=n()),
plotrow=(id-1)%/%4+1) %>%
group_by(plotrow) %>%
group_map( ~ ggplot(data=.,
aes(x=x,y=y)) +
geom_jitter() +
facet_nested( ~ id + id2))
plot_grid(plotlist = plots, nrow = 4, ncol=1)

I'm a bit late to this game, but ggh4x now has a facet_nested_wrap() implementation that should greatly simplify this problem (disclaimer: I wrote ggh4x).
library(tidyverse)
library(ggh4x)
idx = 1:16
p1 = expand_grid(id=idx, id2=c("A", "B"), x=1:10) %>%
mutate(y=rnorm(n=n())) %>%
ggplot(aes(x=x,y=y)) +
geom_jitter() +
facet_nested_wrap(~id + id2, nrow = 4, ncol=8)
p1
Created on 2020-08-12 by the reprex package (v0.3.0)
Keep in mind that there might still be a few bugs in this. Also, I'm aware that this doesn't help the OP because his package versions are constrained, but I thought I'd mention this here anyway.

Here's a reprex of a somewhat pedestrian way to do it in grid. I have made the "parent" facet somewhat darker to emphasise the nesting, but if you prefer the color to match just change the rectGrob fill color to "gray85".
# Set up plot as per example
library(tidyverse)
library(gtable)
library(grid)
idx = 1:16
p1 = expand_grid(id=idx, id2=c("A", "B"), x=1:10) %>%
mutate(y=rnorm(n=n())) %>%
ggplot(aes(x=x,y=y)) +
geom_jitter() +
facet_wrap(~id + id2, nrow = 4, ncol=8)
g <- ggplot_gtable(ggplot_build(p1))
# Code to produce facet strips
stript <- grep("strip", g$layout$name)
grid_cols <- sort(unique(g$layout[stript,]$l))
t_vals <- rep(sort(unique(g$layout[stript,]$t)), each = length(grid_cols)/2)
l_vals <- rep(grid_cols[seq_along(grid_cols) %% 2 == 1], length = length(t_vals))
r_vals <- rep(grid_cols[seq_along(grid_cols) %% 2 == 0], length = length(t_vals))
labs <- levels(as.factor(p1$data$id))
for(i in seq_along(labs))
{
filler <- rectGrob(y = 0.7, height = 0.6, gp = gpar(fill = "gray80", col = NA))
tg <- textGrob(label = labs[i], y = 0.75, gp = gpar(cex = 0.8))
g <- gtable_add_grob(g, filler, t = t_vals[i], l = l_vals[i], r = r_vals[i],
name = paste0("filler", i))
g <- gtable_add_grob(g, tg, t = t_vals[i], l = l_vals[i], r = r_vals[i],
name = paste0("textlab", i))
}
grid.newpage()
grid.draw(g)
And to demonstrate changing the rectGrob to 50% height and "gray85":
Or if you wanted you could assign a different fill for each cycle of the loop:
Obviously the above method might take a few tweaks to fit other plots with different numbers of levels etc.
Created on 2020-07-04 by the reprex package (v0.3.0)

This may not solve the issue directly, but I am posting it because it presents the results in a different plot while keeping the same structure. You have to define the number of columns for the plot in plot_layout(ncol = 4). This code uses the patchwork package. I hope it is useful.
library(tidyverse)
library(gtable)
library(grid)
library(patchwork)
idx = 1:16
#Data
p1 = expand_grid(id=idx, id2=c("A", "B"), x=1:10) %>%
mutate(y=rnorm(n=n()))
#Split data
List <- split(p1,p1$id)
#Sketch function
myplot <- function(x)
{
d <- ggplot(x,aes(x=x,y=y)) +
geom_jitter() +
facet_wrap(~id2, nrow = 1, ncol=2)+
ggtitle(unique(x$id))+
theme(plot.title = element_text(hjust = 0.5))
return(d)
}
#List of plots
Lplots <- lapply(List,myplot)
#Concatenate plots
#Create chain for plots
chain <- paste0('Lplots[[',1:length(Lplots),']]',collapse = '+')
#Evaluate the object and create the plot
Plot <- eval(parse(text = chain))+plot_layout(ncol = 4)+
plot_annotation(title = 'A nice plot')&theme(plot.title = element_text(hjust=0.5))
#Display
Plot
You will end up with a plot like this:
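The string-building and eval(parse()) step above can probably be avoided, since patchwork's wrap_plots() accepts a list of plots directly. The sketch below is a variation under that assumption, not the answer's own code; it uses a smaller stand-in data set (four ids instead of sixteen) and two columns so that it stays quick to run.

```r
library(ggplot2)
library(patchwork)

# Small stand-in data with the same columns as the example above
dat <- expand.grid(id = 1:4, id2 = c("A", "B"), x = 1:10)
dat$y <- rnorm(nrow(dat))

# One two-panel plot per id, titled with that id
plots <- lapply(split(dat, dat$id), function(d) {
  ggplot(d, aes(x = x, y = y)) +
    geom_jitter() +
    facet_wrap(~id2, nrow = 1, ncol = 2) +
    ggtitle(unique(d$id))
})

# wrap_plots() takes the list as is; `&` applies the theme to every panel
combined <- wrap_plots(plots, ncol = 2) +
  plot_annotation(title = "A nice plot") &
  theme(plot.title = element_text(hjust = 0.5))
```

Printing combined draws the grid. Passing the list keeps the plots as ordinary R objects instead of round-tripping through a string of code.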

How to set the distance between discrete values in ggplot2?

I am attempting to use grid.arrange to plot several graphs in one column, as the x axis is the same for all graphs. However, the graphs have different numbers of discrete values, so the Samples in the top graph are spaced further apart than those in the graph below. Is there a way to set the distance between discrete values on an axis so the distance between the Sample1 and Sample2 lines is the same for both graphs? Thanks!
Here is an example:
library(reshape2)
library(tidyverse)
library(gridExtra)
library(scales) # provides trans_new(), used below
#Data frame 1
a <- c(1,2,3,4,5)
b <- c(10,20,30,40,50)
Species <- factor(c("Species1","Species2","Species3","Species4","Species5"))
bubba <- data.frame(Sample1=a,Sample2=b,Species=Species)
bubba$Species=factor(bubba$Species, levels=bubba$Species)
xm=melt(bubba,id.vars = "Species", variable.name="Samples", value.name = "Size")
#Data frame 2
c <- c(1,2,3,4,5)
d <- c(10,20,30,40,50)
e <- c(1,2,3,4,5)
f <- c(10,20,30,40,50)
bubban <- data.frame(Sample1=c,Sample2=d,Sample3=e,Sample4=f,Species=Species)
xn=melt(bubban,id.vars = "Species", variable.name="Samples", value.name = "Size")
#Not related, but part of my original script i am using
shrink_10s_trans = trans_new("shrink_10s",
transform = function(y){
yt = ifelse(y >= 10, y*0.1, y)
return(yt)
},
inverse = function(yt){
return(yt) # Not 1-to-1 function, picking one possibility
}
)
#Make plot 1
p1=ggplot(xm,aes(x= Species,y= fct_rev(Samples), fill = Size < 10))+
geom_point(aes(size=Size), shape = 21)+
scale_size_area(trans = shrink_10s_trans, max_size = 10,
breaks = c(1,3,5,10,20,30,40,50),
labels = c(1,3,5,10,20,30,40,50)) +
scale_fill_manual(values = c(rgb(136,93,100, maxColorValue = 255),
rgb(236,160,172, maxColorValue = 255))) +
theme_bw()+theme(axis.text.x = element_text(angle = -45, hjust = 1))+scale_x_discrete(position = "top")
#Make plot 2
p2=ggplot(xn,aes(x= Species,y= fct_rev(Samples), fill = Size < 10))+
geom_point(aes(size=Size), shape = 21)+
scale_size_area(trans = shrink_10s_trans, max_size = 10,
breaks = c(1,3,5,10,20,30,40,50),
labels = c(1,3,5,10,20,30,40,50)) +
scale_fill_manual(values = c(rgb(136,93,100, maxColorValue = 255),
rgb(236,160,172, maxColorValue = 255))) +
theme_bw()+theme(axis.text.x = element_blank())
#arrange the plots
grid.arrange(p1,p2,nrow=2)

Instead of gridExtra::grid.arrange, use the ggpubr::ggarrange function. It lets you specify the height of each plot and use a shared legend.
# Using plots generated with OPs code
ggpubr::ggarrange(p1, p2, nrow = 2, heights = c(1.3, 2),
common.legend = TRUE, legend = "right")
With argument heights you can set relative heights of each provided plot.
