I have been really struggling to position the components of my heatmap.2 output.
I found this old answer from @IanSudbery explaining how the element positioning works. It seemed really clear and I thought it had given me the understanding I needed, but I'm still not grasping something.
I understand that the elements are all essentially put in a lattice of windows but they aren't behaving in a way I understand.
Here is my code and the current output (at the very bottom is the bit of interest which orders the figure elements):
for(i in 1:length(ConditionsAbbr)) {
# creates its own colour palette
my_palette <- colorRampPalette(c("snow", "yellow", "darkorange", "red"))(n = 399)
# (optional) defines the colour breaks manually for a "skewed" colour transition
col_breaks = c(seq(0,0.09,length=100), #white 'snow'
seq(0.1,0.19,length=100), # for yellow
seq(0.2,0.29,length=100), # for orange 'darkorange'
seq(0.3,1,length=100)) # for red
# creates a 5 x 5 inch image
png(paste(SourceDir, "Heatmap_", ConditionsAbbr[i], "XYZ.png"), # create PNG for the heat map
width = 5*600, # 5 x 600 pixels
height = 5*600,
res = 300, # 300 pixels per inch
pointsize = 8) # smaller font size
heatmap.2(ConditionsMtx[[ConditionsAbbr[i]]],
cellnote = ConditionsMtx[[ConditionsAbbr[i]]], # same data set for cell labels
main = paste(ConditionsAbbr[i], "XYZ"), # heat map title
notecol="black", # change font color of cell labels to black
density.info="none", # turns off density plot inside color legend
trace="none", # turns off trace lines inside the heat map
margins =c(12,9), # widens margins around plot
col=my_palette, # use on color palette defined earlier
breaks=col_breaks, # enable color transition at specified limits
dendrogram="none", # No dendogram
srtCol = 0 , #correct angle of label numbers
asp = 1 , #this overrides layout methinks and for some reason makes it square
adjCol = c(NA, -35) ,
adjRow = c(53, NA) ,
keysize = 1.2 ,
Colv = FALSE , #turn off column clustering
Rowv = FALSE , # turn off row clustering
key.xlab = paste("Correlation") ,
lmat = rbind( c(0, 3), c(2,1), c(0,4) ),
lhei = c(0.9, 4, 0.5) )
dev.off() # close the PNG device
}
This gives:
As you can see, the key is right of the matrix, there are huge amounts of white space between the matrix, the title above and key below, and it's not even as if the title and matrix are centred in the PNG?
I think to myself "well I'll just create a 3x3 that is easy to understand and edit" e.g.
| |
| | (3)
| |
--------------------------
| (1) |
(2) | Matrix |
| |
--------------------------
| (4) |
| Key |
| |
And then I can get rid of the white space so it's more like this.
| |(3)
------------------
| (1) |
(2)| Matrix |
| |
------------------
|(4) Key |
I do this using:
lmat = rbind( c(0, 0, 3), c(2, 1, 0), c(0, 4, 0) ),
lhei = c(0.9, 4, 0.5) ,
lwid = c(1, 4, 1))
This is what it looks like:
As great as it is to see my matrix in the centre, my key is still aligned to the right of my matrix and my title is taking the Silk Road East? Not to mention all the excess white space?
How do I get these to align and to all move together so the figure components fit snugly together?
EDIT: reducing my margins helped to reduce the whitespace but it's still excessive.
Here are the final changes I made to get my results, however, I would recommend using the advice of Maurits Evers if you aren't too invested in heatmap.2. Don't overlook the changes I made to the image dimensions.
# creates my own colour palette
my_palette <- colorRampPalette(c("snow", "yellow", "darkorange", "red"))(n = 399)
# (optional) defines the colour breaks manually for a "skewed" colour transition
col_breaks = c(seq(0,0.09,length=100), #white 'snow'
seq(0.1,0.19,length=100), # for yellow
seq(0.2,0.29,length=100), # for orange 'darkorange'
seq(0.3,1,length=100)) # for red
# creates an image
png(paste(SourceDir, "Heatmap_XYZ.png" ),
# create PNG for the heat map
width = 5*580, # 5 x 580 pixels
height = 5*420, # 5 x 420 pixels
res = 300, # 300 pixels per inch
pointsize =11) # smaller font size
heatmap.2(ConditionsMtx[[ConditionsAbbr[i]]],
cellnote = ConditionsMtx[[ConditionsAbbr[i]]], # same data set for cell labels
main = "XYZ", # heat map title
notecol="black", # change font color of cell labels to black
density.info="none", # turns off density plot inside color legend
trace="none", # turns off trace lines inside the heat map
margins=c(0,0), # removes the margins around the plot
col=my_palette, # use on color palette defined earlier
breaks=col_breaks, # enable color transition at specified limits
dendrogram="none", # no dendrogram
srtCol = 0 , #correct angle of label numbers
asp = 1 , #this overrides layout methinks and for some reason makes it square
adjCol = c(NA, -38.3) , #shift column labels
adjRow = c(77.5, NA) , #shift row labels
keysize = 2 , #alter key size
Colv = FALSE , #turn off column clustering
Rowv = FALSE , # turn off row clustering
key.xlab = paste("Correlation") , #add label to key
cexRow = (1.8) , # alter row label font size
cexCol = (1.8) , # alter column label font size
notecex = (1.5) , # Alter cell font size
lmat = rbind( c(0, 3, 0), c(2, 1, 0), c(0, 4, 0) ) ,
lhei = c(0.43, 2.6, 0.6) , # Alter dimensions of display array cell heights
lwid = c(0.6, 4, 0.6) , # Alter dimensions of display array cell widths
key.par=list(mar=c(4.5,0, 1.8,0) ) ) #tweak specific key parameters
dev.off()
Here is the output, which I will continue to refine until all spacing and font sizes suit my aesthetic preference. I would tell you exactly what I've done, but I'm not 100% sure. Frankly, it all feels like it's held together with old gum and baler twine, but don't kick a gift horse in the code, as they say.
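For anyone else tuning lmat, lhei and lwid, it may help to preview the lattice with base R's layout() before drawing anything. This is only a minimal sketch: it assumes heatmap.2 passes these three arguments on to layout(), and that panel 1 is the heatmap, 2 the row dendrogram, 3 the column dendrogram and 4 the key, which is how I read the gplots documentation. n_panels is just an illustrative name.

```r
# Layout values copied from the final heatmap.2 call above
lmat <- rbind(c(0, 3, 0),
              c(2, 1, 0),
              c(0, 4, 0))
lhei <- c(0.43, 2.6, 0.6)  # relative row heights
lwid <- c(0.6, 4, 0.6)     # relative column widths

# layout() returns the number of panels it defined
n_panels <- layout(lmat, widths = lwid, heights = lhei)

# Draw an outline of every numbered panel; cells set to 0 stay empty
layout.show(n_panels)
```

If the title is drawn together with panel 3, as seems to be the case, that would explain why putting 3 in the top-right cell pushed the title off to the right.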
I don't know if you're open to non-heatmap.2-based solutions. In my opinion ggplot offers greater flexibility and with a bit of tweaking you can reproduce a heatmap similar to the one you're showing quite comfortably while maximising plotting "real-estate" and avoiding excessive whitespace.
I'm happy to remove this post if you're only looking for heatmap.2 solutions.
That aside, a ggplot2 solution may look like this:
First off, let's generate some sample data
library(tidyverse)
set.seed(2018)
df <- as_tibble(matrix(runif(7*10), ncol = 10), .name_repair = ~ as.character(1:10))
Prior to plotting we need to reshape df from wide to long
df <- df %>%
rowid_to_column("row") %>%
gather(col, Correlation, -row) %>%
mutate(col = as.integer(col))
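As a side note, gather() is superseded in current tidyr releases. A roughly equivalent reshape with pivot_longer() might look like the sketch below. The names wide and long are illustrative, the toy matrix is made up, and names_transform assumes tidyr 1.1 or later.

```r
library(tidyverse)

# Toy 2 x 3 matrix with column names "1", "2", "3" (made-up values)
wide <- as_tibble(matrix(1:6, nrow = 2,
                         dimnames = list(NULL, c("1", "2", "3"))))

long <- wide %>%
  rowid_to_column("row") %>%
  pivot_longer(-row,
               names_to = "col",
               values_to = "Correlation",
               names_transform = list(col = as.integer))  # "1" -> 1L
```

The rows come out ordered by row and then by col, which differs from gather() but makes no difference to geom_tile().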
Then to plot
ggplot(df, aes(row, col, fill = Correlation)) +
geom_tile() +
scale_fill_gradientn(colours = my_palette) + # Use your custom colour palette
theme_void() + # Minimal theme
labs(title = "Main title") +
geom_text(aes(label = sprintf("%2.1f", Correlation)), size = 2) +
theme(
plot.title = element_text(hjust = 1), # Right-aligned text
legend.position="bottom") + # Legend at the bottom
guides(fill = guide_colourbar(
title.position = "bottom", # Legend title below bar
barwidth = 25, # Extend bar length
title.hjust = 0.5))
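If you also want the "skewed" colour transition that col_breaks produced in heatmap.2, the values argument of scale_fill_gradientn() may get you close. The following is only a sketch on made-up data: the stop positions 0.13 and 0.26 are my rough reading of where the 399-colour palette meets the breaks, not values taken from the question, and demo and p_skew are hypothetical names.

```r
library(ggplot2)

# Made-up data on the same 7 x 10 grid
set.seed(2018)
demo <- expand.grid(row = 1:7, col = 1:10)
demo$Correlation <- runif(nrow(demo))

p_skew <- ggplot(demo, aes(row, col, fill = Correlation)) +
  geom_tile() +
  scale_fill_gradientn(
    colours = c("snow", "yellow", "darkorange", "red"),
    values = c(0, 0.13, 0.26, 1),  # position of each colour on a 0-1 scale (assumed)
    limits = c(0, 1)               # with these limits, positions equal data values
  ) +
  theme_void()

p_skew
```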
An example with multiple heatmaps in a grid layout via facet_wrap
First off, let's generate more complex data.
set.seed(2018)
df <- replicate(
4,
as_tibble(matrix(runif(7*10), ncol = 10), .name_repair = ~ as.character(1:10)), simplify = FALSE) %>%
setNames(., paste("data", 1:4, sep = "")) %>%
map(~ .x %>% rowid_to_column("row") %>%
gather(col, Correlation, -row) %>%
mutate(col = as.integer(col))) %>%
bind_rows(.id = "data")
Then the plotting is identical to what we did before plus an additional facet_wrap(~data, ncol = 2) statement
ggplot(df, aes(row, col, fill = Correlation)) +
geom_tile() +
scale_fill_gradientn(colours = my_palette) + # Use your custom colour palette
theme_void() + # Minimal theme
labs(title = "Main title") +
geom_text(aes(label = sprintf("%2.1f", Correlation)), size = 2) +
facet_wrap(~ data, ncol = 2) +
theme(
plot.title = element_text(hjust = 1), # Right-aligned text
legend.position="bottom") + # Legend at the bottom
guides(fill = guide_colourbar(
title.position = "bottom", # Legend title below bar
barwidth = 25, # Extend bar length
title.hjust = 0.5))
One final update
I thought it'd be fun/interesting to see how far we can get towards a complex heatmap similar to the one you link to from the paper.
The sample data is included at the end, as this takes up a bit of space.
We first construct three different ggplot2 plot objects that show the main heatmap (gg3), an additional smaller heatmap with missing values (gg2), and a bar denoting group labels for every row (gg1).
gg3 <- ggplot(df.cor, aes(col, row, fill = Correlation)) +
geom_tile() +
scale_fill_distiller(palette = "RdYlBu") +
theme_void() +
labs(title = "Main title") +
geom_text(aes(label = sprintf("%2.1f", Correlation)), size = 2) +
scale_y_discrete(position = "right") +
theme(
plot.title = element_text(hjust = 1),
legend.position="bottom",
axis.text.y = element_text(color = "black", size = 10)) +
guides(fill = guide_colourbar(
title.position = "bottom",
barwidth = 10,
title.hjust = 0.5))
gg2 <- ggplot(df.flag, aes(col, row, fill = Correlation)) +
geom_tile(colour = "grey") +
scale_fill_distiller(palette = "RdYlBu", guide = "none", na.value = "white") +
theme_void() +
scale_x_discrete(position = "top") +
theme(
axis.text.x = element_text(color = "black", size = 10, angle = 90, hjust = 1, vjust = 0.5))
gg1 <- ggplot(df.bar, aes(1, row, fill = grp)) +
geom_tile() +
scale_fill_manual(values = c("grp1" = "orange", "grp2" = "green")) +
theme_void() +
theme(legend.position = "left")
We can now use egg::ggarrange to position all three plots such that the y axis ranges are aligned.
library(egg)
ggarrange(gg1, gg2, gg3, ncol = 3, widths = c(0.1, 1, 3))
Sample data
library(tidyverse)
set.seed(2018)
nrow <- 7
ncol <- 20
df.cor <- matrix(runif(nrow * ncol, min = -1, max = 1), nrow = nrow) %>%
as_tibble(.name_repair = ~ as.character(1:ncol)) %>%
rowid_to_column("row") %>%
gather(col, Correlation, -row) %>%
mutate(
row = factor(
paste("row", row, sep = ""),
levels = paste("row", 1:nrow, sep = "")),
col = factor(
paste("col", col, sep = ""),
levels = paste("col", 1:ncol, sep = "")))
nrow <- 7
ncol <- 10
df.flag <- matrix(runif(nrow * ncol, min = -1, max = 1), nrow = nrow) %>%
as_tibble(.name_repair = ~ as.character(1:ncol)) %>%
rowid_to_column("row") %>%
gather(col, Correlation, -row) %>%
mutate(
row = factor(
paste("row", row, sep = ""),
levels = paste("row", 1:nrow, sep = "")),
col = factor(
paste("col", col, sep = ""),
levels = paste("col", 1:ncol, sep = ""))) %>%
mutate(Correlation = ifelse(abs(Correlation) < 0.5, NA, Correlation))
df.bar <- data.frame(
row = 1:nrow,
grp = paste("grp", c(rep(1, nrow - 3), rep(2, 3)), sep = "")) %>%
mutate(
row = factor(
paste("row", row, sep = ""),
levels = paste("row", 1:nrow, sep = "")))
I am attempting to use grid.arrange to plot several graphs in one column, as the x axis is the same for all graphs. However, the graphs have different numbers of discrete values, so the Samples in the top graph are spaced further apart than in the graph below. Is there a way to set the distance between discrete values on an axis so that the distance between the Sample1 and Sample2 lines is the same in both graphs? Thanks!
Here is an example:
library(reshape2)
library(tidyverse)
library(gridExtra)
library(scales) # provides trans_new(), used below
#Data frame 1
a <- c(1,2,3,4,5)
b <- c(10,20,30,40,50)
Species <- factor(c("Species1","Species2","Species3","Species4","Species5"))
bubba <- data.frame(Sample1=a,Sample2=b,Species=Species)
bubba$Species=factor(bubba$Species, levels=bubba$Species)
xm=melt(bubba,id.vars = "Species", variable.name="Samples", value.name = "Size")
#Data frame 2
c <- c(1,2,3,4,5)
d <- c(10,20,30,40,50)
e <- c(1,2,3,4,5)
f <- c(10,20,30,40,50)
bubban <- data.frame(Sample1=c,Sample2=d,Sample3=e,Sample4=f,Species=Species)
xn=melt(bubban,id.vars = "Species", variable.name="Samples", value.name = "Size")
#Not related, but part of my original script i am using
shrink_10s_trans = trans_new("shrink_10s",
transform = function(y){
yt = ifelse(y >= 10, y*0.1, y)
return(yt)
},
inverse = function(yt){
return(yt) # Not 1-to-1 function, picking one possibility
}
)
#Make plot 1
p1=ggplot(xm,aes(x= Species,y= fct_rev(Samples), fill = Size < 10))+
geom_point(aes(size=Size), shape = 21)+
scale_size_area(trans = shrink_10s_trans, max_size = 10,
breaks = c(1,3,5,10,20,30,40,50),
labels = c(1,3,5,10,20,30,40,50)) +
scale_fill_manual(values = c(rgb(136,93,100, maxColorValue = 255),
rgb(236,160,172, maxColorValue = 255))) +
theme_bw()+theme(axis.text.x = element_text(angle = -45, hjust = 1))+scale_x_discrete(position = "top")
#Make plot 2
p2=ggplot(xn,aes(x= Species,y= fct_rev(Samples), fill = Size < 10))+
geom_point(aes(size=Size), shape = 21)+
scale_size_area(trans = shrink_10s_trans, max_size = 10,
breaks = c(1,3,5,10,20,30,40,50),
labels = c(1,3,5,10,20,30,40,50)) +
scale_fill_manual(values = c(rgb(136,93,100, maxColorValue = 255),
rgb(236,160,172, maxColorValue = 255))) +
theme_bw()+theme(axis.text.x = element_blank())
#arrange the plots
grid.arrange(p1,p2,nrow=2)
Instead of gridExtra's grid.arrange, use the ggpubr::ggarrange function. It lets you specify the height of each plot and use a shared legend.
# Using plots generated with OPs code
ggpubr::ggarrange(p1, p2, nrow = 2, heights = c(1.3, 2),
common.legend = TRUE, legend = "right")
With the heights argument you can set the relative height of each provided plot.
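One way to pick the heights, rather than guessing, is to make them proportional to the number of discrete y values in each plot. The sketch below does this with gridExtra::arrangeGrob and two made-up data frames; d1, d2, q1, q2 and n_rows are all hypothetical names. Treat it as a starting point: axis text and legends take up a fixed amount of space, so the spacing will usually be close rather than exact, which is probably why the top plot above needs 1.3 rather than 1.

```r
library(ggplot2)
library(gridExtra)

# Made-up data: two samples in the first plot, four in the second
d1 <- expand.grid(Species = paste0("Species", 1:5),
                  Samples = paste0("Sample", 1:2))
d2 <- expand.grid(Species = paste0("Species", 1:5),
                  Samples = paste0("Sample", 1:4))

q1 <- ggplot(d1, aes(Species, Samples)) + geom_point()
q2 <- ggplot(d2, aes(Species, Samples)) + geom_point()

# Relative heights proportional to the number of discrete y values
n_rows <- c(length(unique(d1$Samples)), length(unique(d2$Samples)))

# Numeric heights are treated as relative proportions
g <- arrangeGrob(q1, q2, nrow = 2, heights = n_rows)
grid::grid.newpage()
grid::grid.draw(g)
```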
I have a dataframe:
gene_symbol<-c("DADA","SDAASD","SADDSD","SDADD","ASDAD","XCVXCVX","EQWESDA","DASDADS","SDASDASD","DADADASD","sdaadfd","DFSD","SADADDAD","SADDADADA","DADSADSASDWQ","SDADASDAD","ASD","DSADD")
panel<-c("growth","growth","growth","growth","big","big","big","small","small","dfgh","DF","DF","DF","DF","DF","gh","DF","DF")
ASDDA<-c("normal","over","low","over","normal","over","low","over","normal","over","DF","DF","DF","DF","DF","DF","DF","DF")
ASDDb<-c("normal","over","low","over","normal","over","low","over","normal","over","DF","DF","DF","DF","DF","DF","DF","DF")
ASDDAf<-c("normal","over","low","over","normal","over","low","over","normal","over","DF","DF","DF","DF","DF","DF","DF","DF")
ASDDAf1<-c("normal","over","low","over","normal","over","low","over","normal","over","DF","DF","DF","DF","DF","DF","DF","DF")
ASDDAf2<-c("normal","over","low","over","normal","over","low","over","normal","over","DF","DF","DF","DF","DF","DF","DF","DF")
ASDDAf3<-c("normal","over","low","over","normal","over","low","over","normal","over","DF","DF","DF","DF","DF","DF","DF","DF")
ASDDAf4<-c("normal","over","low","over","normal","over","low","over","normal","over","DF","DF","DF","DF","DF","DF","DF","DF")
ASDDAf5<-c("normal","over","low","over","normal","over","low","over","normal","over","DF","DF","DF","DF","DF","DF","DF","DF")
ASDDA1<-c("normal","over","low","over","normal","over","low","over","normal","over","DF","DF","DF","DF","DF","DF","DF","DF")
ASDDb1<-c("normal","over","low","over","normal","over","low","over","normal","over","DF","DF","DF","DF","DF","DF","DF","DF")
ASDDAf1<-c("normal","over","low","over","normal","over","low","over","normal","over","DF","DF","DF","DF","DF","DF","DF","DF")
ASDDAf11<-c("normal","over","low","over","normal","over","low","over","normal","over","DF","DF","DF","DF","DF","DF","DF","DF")
ASDDAf21<-c("normal","over","low","over","normal","over","low","over","normal","over","DF","DF","DF","DF","DF","DF","DF","DF")
ASDDAf31<-c("normal","over","low","over","normal","over","low","over","normal","over","DF","DF","DF","DF","DF","DF","DF","DF")
ASDDAf41<-c("normal","over","low","over","normal","over","low","over","normal","over","DF","DF","DF","DF","DF","DF","DF","DF")
ASDDAf51<-c("normal","over","low","over","normal","over","low","over","normal","over","DF","DF","DF","DF","DF","DF","DF","DF")
Gene_states22 <- data.frame(gene_symbol, panel, ASDDA, ASDDb, ASDDAf, ASDDAf1, ASDDAf2,
ASDDAf3, ASDDAf4, ASDDAf5, ASDDA1, ASDDb1, ASDDAf1, ASDDAf11,
ASDDAf21, ASDDAf31, ASDDAf41, ASDDAf51)
And I create a heatmap with:
library(ggplot2); library(reshape2)
HG3 <- split(Gene_states22[,1:15], Gene_states22$panel)
HG4 <- melt(HG3, id.vars= c("gene_symbol","panel"))
HG4 <- HG4[,-5]
pp <- ggplot(HG4, aes(gene_symbol,variable)) +
geom_tile(aes(fill = value),
colour = "grey50") +
facet_grid(~panel, scales = "free" ,space = "free") +
scale_fill_manual(values = c("white", "red", "blue", "black", "yellow", "green", "brown"))
As you can see I use facet_grid to separate my heatmap into groups based on panel value. The problem is that when I use ggplotly(pp) the column width differs from group to group and my plot seems ugly.
To fix the issue, I adapted the answer to "Plotly and ggplot with facet_grid in R: How to to get yaxis labels to use ticktext value instead of range value?":
library(plotly)
library(ggplot2)
library(data.table)
library(datasets)
#add fake model for use in facet
dt<-data.table(HG4[1:50,])
dt[,variable:=rownames(HG4)]
dt[,panel:=substr(variable,1,regexpr(" ",variable)-1)][panel=="",panel:=variable]
ggplot.test<-ggplot(dt,aes(gene_symbol,variable))+facet_grid(panel~.,scales="free_y",space="free",drop=TRUE)+
geom_tile(aes(fill = value),
colour = "grey50") +
scale_fill_manual(values = c("white", "red", "blue", "black", "yellow", "green", "brown")) +
labs(title = "Heatmap", x = "gene_symbol", y = "sample", fill = "value") +
guides(fill = FALSE)+
theme(panel.background = element_rect(fill = NA),
panel.spacing = unit(0.5, "lines"), ## It was here where you had a 0 for distance between facets. I replaced it by 0.5 .
strip.placement = "outside")
p <- ggplotly(ggplot.test)
len <- length(unique(HG4$panel))
total <- 1
for (i in 2:len) {
total <- total + length(p[['x']][['layout']][[paste('yaxis', i, sep='')]][['ticktext']])
}
spacer <- 0.01 #space between the horizontal plots
total_length = total + len * spacer
end <- 1
start <- 1
for (i in c('', seq(2, len))) {
tick_l <- length(p[['x']][['layout']][[paste('yaxis', i, sep='')]][['ticktext']]) + 1
#fix the y-axis
p[['x']][['layout']][[paste('yaxis', i, sep='')]][['tickvals']] <- seq(1, tick_l)
p[['x']][['layout']][[paste('yaxis', i, sep='')]][['ticktext']][[tick_l]] <- ''
end <- start - spacer
start <- start - (tick_l - 1) / total_length
v <- c(start, end)
#fix the size
p[['x']][['layout']][[paste('yaxis', i, sep='')]]$domain <- v
}
p[['x']][['layout']][['annotations']][[3]][['y']] <- (p[['x']][['layout']][['yaxis']]$domain[2] + p[['x']][['layout']][['yaxis']]$domain[1]) /2
p[['x']][['layout']][['shapes']][[2]][['y0']] <- p[['x']][['layout']][['yaxis']]$domain[1]
p[['x']][['layout']][['shapes']][[2]][['y1']] <- p[['x']][['layout']][['yaxis']]$domain[2]
#fix the annotations
for (i in 3:len + 1) {
#fix the y position
p[['x']][['layout']][['annotations']][[i]][['y']] <- (p[['x']][['layout']][[paste('yaxis', i - 2, sep='')]]$domain[1] + p[['x']][['layout']][[paste('yaxis', i - 2, sep='')]]$domain[2]) /2
#trim the text
p[['x']][['layout']][['annotations']][[i]][['text']] <- substr(p[['x']][['layout']][['annotations']][[i]][['text']], 1, length(p[['x']][['layout']][[paste('yaxis', i - 2, sep='')]][['ticktext']]) * 3 - 3)
}
#fix the rectangle shapes in the background
for (i in seq(0,(len - 2) * 2, 2)) {
p[['x']][['layout']][['shapes']][[i+4]][['y0']] <- p[['x']][['layout']][[paste('yaxis', i /2 + 2, sep='')]]$domain[1]
p[['x']][['layout']][['shapes']][[i+4]][['y1']] <- p[['x']][['layout']][[paste('yaxis', i /2 + 2, sep='')]]$domain[2]
}
p
But the heatmap is still not correct:
So first things first:
In your case I am not even sure whether a plotly heatmap is what you need. In addition, you should never convert a complicated ggplot to plotly: in 90% of cases it will fail. Try recreating your plot directly in plotly or wherever you want it to end up. Anything else ends in coding hell.
I started by doing some research:
Here is a good description how to create heatmaps with different colors in plotly
This explains how you can create titles in subplots.
From post 1 I know that I have to create a matrix for each level in your data. So I wrote a function for that:
mymat<-as.matrix(Gene_states22[,-1:-2])
### Creates a 1-NA dummy matrix for each level. The output is stored in a list
dummy_mat<-function(mat,levels,names_col){
mat_list<-lapply(levels,function(x){
mat[mat!=x]=NA
mat[mat==x]=1
mymat=t(apply(mat,2,as.numeric))
colnames(mymat)=names_col
return(mymat)
})
names(mat_list)=levels
return(mat_list)
}
my_mat_list<-dummy_mat(mymat,c('DF','low','normal','over'),Gene_states22$gene_symbol)
### Optional: The heatmap type is peculiar - I created a text-NA matrix for each category as well
text_mat<-function(mat,levels,names_col){
mat_list<-lapply(levels,function(x){
mat[mat!=x]=NA
mat=t(mat)
colnames(mat)=names_col
return(mat)
})
names(mat_list)=levels
return(mat_list)
}
my_mat_list_t<-text_mat(mymat,c('DF','low','normal','over'),as.character(Gene_states22$gene_symbol))
In addition I needed colors for each level. These colors are created using some dataframe. You may write a similar (lapply-)loop here as well:
DF_Color <- data.frame(x = c(0,1), y = c("#DEDEDE", "#DEDEDE"))
colnames(DF_Color) <- NULL
lowColor <- data.frame(x = c(0,1), y = c("#00CCFF", "#00CCFF"))
colnames(lowColor) <- NULL
normColor <- data.frame(x = c(0,1), y = c("#DEDE00", "#DEDE00"))
colnames(normColor) <- NULL
overColor <- data.frame(x = c(0,1), y = c("#DE3333", "#DE3333"))
colnames(overColor) <- NULL
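As a rough sketch of the lapply version mentioned above: it builds the same two-row, unnamed data frames from a named vector of colours. level_colors and color_scales are names I made up for illustration; the hex codes are the ones used above.

```r
# One colour per level (hex codes copied from the data frames above)
level_colors <- c(DF = "#DEDEDE", low = "#00CCFF",
                  normal = "#DEDE00", over = "#DE3333")

# Build a constant colorscale (same colour at 0 and at 1) for every level
color_scales <- lapply(level_colors, function(col) {
  cs <- data.frame(x = c(0, 1), y = c(col, col))
  colnames(cs) <- NULL  # drop the column names, as in the code above
  cs
})

# color_scales$DF then plays the role of DF_Color, and so on
```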
In addition we need the columns in the matrix for each panel-category:
mycols<-lapply(levels(Gene_states22$panel),function(x) grep(x,Gene_states22$panel))
I stored this in a list as well.
Next I use lapply-loop to plot. I store the values in a list and use subplot to put everything together:
library(plotly)
p_list<-lapply(1:length(mycols),function(j){
columns<-mycols[[j]]
p<-plot_ly(
type = "heatmap"
) %>% add_trace(
y=rownames(my_mat_list$DF),x=colnames(my_mat_list$DF)[columns],
z = my_mat_list$DF[,columns],
xgap=3,ygap=3, text=my_mat_list_t$DF[,columns],hoverinfo="x+y+text",
colorscale = DF_Color,
colorbar = list(
len = 0.3,
y = 0.3,
yanchor = 'top',
title = 'DF series',
tickvals = ''
)
) %>% add_trace(
y=rownames(my_mat_list$low),x=colnames(my_mat_list$low)[columns],
z = my_mat_list$low[,columns],
xgap=3,ygap=3,text=my_mat_list_t$low[,columns],hoverinfo="x+y+text",
colorscale = lowColor,
colorbar = list(
len = 0.3,
y = 0.3,
yanchor = 'top',
title = 'low series',
tickvals = ''
)
) %>% add_trace(
y=rownames(my_mat_list$normal),x=colnames(my_mat_list$normal)[columns],
z = my_mat_list$normal[,columns],
xgap=3,ygap=3,text=my_mat_list_t$normal[,columns],hoverinfo="x+y+text",
colorscale = normColor,
colorbar = list(
len = 0.3,
y = 1,
yanchor = 'top',
title = 'normal series',
tickvals = ''
)
) %>% add_trace(
y=rownames(my_mat_list$over),x=colnames(my_mat_list$over)[columns],
z = my_mat_list$over[,columns],
xgap=3,ygap=3,text=my_mat_list_t$over[,columns],hoverinfo="x+y+text",
colorscale = overColor,
colorbar = list(
len = 0.3,
y = 1,
yanchor = 'top',
title = 'over series',
tickvals = ''
)
)
return(p)
})
subplot(p_list[[1]],p_list[[2]],shareY = TRUE) %>%
layout(annotations = list(
list(x = 0.2 , y = 1.05, text = levels(Gene_states22$panel)[1], showarrow = F, xref='paper', yref='paper'),
list(x = 0.8 , y = 1.05, text = levels(Gene_states22$panel)[2], showarrow = F, xref='paper', yref='paper'))
)
POSSIBLE ISSUES:
You have to be careful with categories like dfgh, which occur only once. If only one column is selected in R, the result is automatically simplified to a (numeric or character) vector, so you may need to wrap all z and text arguments in as.matrix().
The hover text doesn't really work yet, but plotly has good documentation on this, so you should be able to figure it out.
You also have to specify the widths in the subplot function. That will be fiddly if you have more than 10 categories.
Interactivity doesn't really work: you can't toggle traces. I don't know why; I guess it is connected to the plot type, so do some research if you need it.
I recommend specifying the extent of the plot(s) in px. That might make the tiles more similar in size.
Finally, you will need some reference for the (subplot) titles, and you will need to adjust the margins of your plot so that the titles are visible.
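On the single-column issue, an alternative to as.matrix() is to subset with drop = FALSE, which keeps the result a matrix and preserves the row and column names. A small base R sketch with a made-up matrix (m, one_col and kept are illustrative names):

```r
# Made-up 2 x 3 matrix with dimnames
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("s1", "s2"), c("g1", "g2", "g3")))

one_col <- m[, 2]                # simplified to a plain vector, dim is lost
kept    <- m[, 2, drop = FALSE]  # still a 2 x 1 matrix, column name kept

dim(one_col)
dim(kept)
```

In the plotting code this would mean writing, for example, my_mat_list$DF[, columns, drop = FALSE] for z.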
I'm using facet_grid() to display some data, and I have facet labels that span multiple lines of text (they contain the "\n" character).
require(ggplot2)
#Generate example data
set.seed(3)
df = data.frame(facet_label_text = rep(c("Label A",
"Label B\nvery long label",
"Label C\nshort",
"Label D"),
each = 5),
time = rep(c(0, 4, 8, 12, 16), times = 4),
value = runif(20, min=0, max=100))
#Plot test data
ggplot(df, aes(x = time, y = value)) +
geom_line() +
facet_grid(facet_label_text ~ .) +
theme(strip.text.y = element_text(angle = 0, hjust = 0))
So by using the hjust = 0 argument, I can left-align facet label text as a unit.
What I would like to do is left-align each individual line of text. So "Label B" and "very long label" are both aligned along the left side, rather than centered relative to each other (ditto for "Label C" and "short"). Is this possible in ggplot2?
This is fairly straightforward using grid's grid.gedit function to edit the strips.
library(ggplot2) # v2.1.0
library(grid)
# Your data
set.seed(3)
df = data.frame(facet_label_text = rep(c("Label A",
"Label B\nvery long label",
"Label C\nshort",
"Label D"),
each = 5),
time = rep(c(0, 4, 8, 12, 16), times = 4),
value = runif(20, min=0, max=100))
# Your plot
p = ggplot(df, aes(x = time, y = value)) +
geom_line() +
facet_grid(facet_label_text ~ .) +
theme(strip.text.y = element_text(angle = 0, hjust = 0))
p
# Get a list of grobs in the plot
grid.ls(grid.force())
# It looks like we need the GRID.text grobs.
# But some care is needed:
# There are GRID.text grobs that are children of the strips;
# but also there are GRID.text grobs that are children of the axes.
# Therefore, a gPath should be set up
# to get to the GRID.text grobs in the strips
# The edit
grid.gedit(gPath("GRID.stripGrob", "GRID.text"),
just = "left", x = unit(0, "npc"))
Or, a few more lines of code to work with a grob object (in place of editing on screen as above):
# Get the ggplot grob
gp = ggplotGrob(p)
grid.ls(grid.force(gp))
# Edit the grob
gp = editGrob(grid.force(gp), gPath("GRID.stripGrob", "GRID.text"), grep = TRUE, global = TRUE,
just = "left", x = unit(0, "npc"))
# Draw it
grid.newpage()
grid.draw(gp)
Until someone comes along with a real solution, here's a hack: Add space in the labels to get the justification you want.
require(ggplot2)
#Generate example data
set.seed(3)
df = data.frame(facet_label_text = rep(c("Label A",
"Label B \nvery long label",
"Label C\nshort ",
"Label D"),
each = 5),
time = rep(c(0, 4, 8, 12, 16), times = 4),
value = runif(20, min=0, max=100))
#Plot test data
ggplot(df, aes(x = time, y = value)) +
geom_line() +
facet_grid(facet_label_text ~ .) +
theme(strip.text.y = element_text(angle = 0, hjust = 0))
There may be a cleaner way to do this but I didn't find a way to do this within ggplot2. The padwrap function could be more generalized as it basically does just what you requested. To get the justification right, I had to use a mono-spaced font.
# Wrap text with embedded newlines: space padded and left justified.
# There may be a cleaner way to do this but this works on the one
# example. If using for ggplot2 plots, make the font `family`
# a monospaced font (e.g. 'Courier')
padwrap <- function(x) {
# Operates on one string
padwrap_str <- function(s) {
sres <- strsplit(s, "\n")
max_len <- max(nchar(sres[[1]]))
paste( sprintf(paste0('%-', max_len, 's'), sres[[1]]), collapse = "\n" )
}
# Applies 'padwrap_str' to a vector of strings
unlist(lapply(x, padwrap_str))
}
require(ggplot2)
facet_label_text = rep(c("Label A",
"Label B\nvery long label",
"Label C\nshort",
"Label D"), 5)
new_facet_label_text <- padwrap(facet_label_text)
#Generate example data
set.seed(3)
df = data.frame(facet_label_text = new_facet_label_text,
time = rep(c(0, 4, 8, 12, 16), times = 4),
value = runif(20, min=0, max=100))
#Plot test data
ggplot(df, aes(x = time, y = value)) +
geom_line() +
facet_grid(facet_label_text ~ .) +
theme(strip.text.y = element_text(angle = 0, hjust = 0, family = 'Courier'))
The strip text is left justified in the image below
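For what it's worth, the padding step can probably be written more compactly with base R's formatC(), whose flag = "-" left-justifies within a fixed width. This is just a sketch of the same idea (pad_lines is a hypothetical name, not part of any package), and it still relies on a monospaced font to line up.

```r
# Pad every line of each label to the width of that label's longest line
pad_lines <- function(x) {
  vapply(strsplit(x, "\n", fixed = TRUE), function(lines) {
    width <- max(nchar(lines))
    # flag = "-" left-justifies, so the padding goes on the right
    paste(formatC(lines, width = width, flag = "-"), collapse = "\n")
  }, character(1))
}

pad_lines(c("Label A", "Label C\nshort"))
```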
From a data frame I want to plot a pie chart for five categories with their percentages as labels in the same graph in order from highest to lowest, going clockwise.
My code is:
League<-c("A","B","A","C","D","E","A","E","D","A","D")
data<-data.frame(League) # I have more variables
p<-ggplot(data,aes(x="",fill=League))
p<-p+geom_bar(width=1)
p<-p+coord_polar(theta="y")
p<-p+geom_text(data,aes(y=cumsum(sort(table(data)))-0.5*sort(table(data)),label=paste(as.character(round(sort(table(data))/sum(table(data)),2)),rep("%",5),sep="")))
p
I use
cumsum(sort(table(data)))-0.5*sort(table(data))
to place the label in the corresponding portion and
label=paste(as.character(round(sort(table(data))/sum(table(data)),2)),rep("%",5),sep="")
for the labels which is the percentages.
I get the following output:
Error: ggplot2 doesn't know how to deal with data of class uneval
I've preserved most of your code. I found this pretty easy to debug by leaving out the coord_polar... easier to see what's going on as a bar graph.
The main thing was to reorder the factor from highest to lowest to get the plotting order correct, then just playing with the label positions to get them right. I also simplified your code for the labels (you don't need the as.character or the rep, and paste0 is a shortcut for sep = "".)
League<-c("A","B","A","C","D","E","A","E","D","A","D")
data<-data.frame(League) # I have more variables
data$League <- reorder(data$League, X = data$League, FUN = function(x) -length(x))
at <- nrow(data) - as.numeric(cumsum(sort(table(data)))-0.5*sort(table(data)))
label=paste0(round(sort(table(data))/sum(table(data)),2) * 100,"%")
p <- ggplot(data,aes(x="", fill = League)) +
geom_bar(width = 1) +
coord_polar(theta="y") +
annotate(geom = "text", y = at, x = 1, label = label)
p
The at calculation is finding the centers of the wedges. (It's easier to think of them as the centers of bars in a stacked bar plot, just run the above plot without the coord_polar line to see.) The at calculation can be broken out as follows:
table(data) is the number of rows in each group, and sort(table(data)) puts them in the order they'll be plotted. Taking the cumsum() of that gives us the edges of each bar when stacked on top of each other, and multiplying by 0.5 gives us half the height of each bar in the stack (or half the width of each wedge of the pie).
as.numeric() simply ensures we have a numeric vector rather than an object of class table.
Subtracting the half-heights from the cumulative heights gives the center of each bar when stacked up. But ggplot will stack the bars with the biggest on the bottom, whereas all our sort()ing puts the smallest first, so we need to do nrow - everything because what we've actually calculated are the label positions relative to the top of the bar, not the bottom. (And, with the original disaggregated data, nrow() is the total number of rows, hence the total height of the bar.)
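To make the arithmetic concrete, here is the same calculation split into steps on the example data, using only base R. The intermediate names (counts, edges, centres_from_top) are mine and only for illustration.

```r
League <- c("A","B","A","C","D","E","A","E","D","A","D")

counts <- sort(table(League))   # group sizes, smallest first: 1 1 2 3 4
edges  <- cumsum(counts)        # upper edge of each stacked bar: 1 2 4 7 11

# Centre of each bar, measured from the top of the stack
centres_from_top <- as.numeric(edges - 0.5 * counts)  # 0.5 1.5 3 5.5 9

# Flip so the positions are measured from the bottom instead
at <- length(League) - centres_from_top               # 10.5 9.5 8 5.5 2
at
```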
Preface: I did not make pie charts of my own free will.
Here's a modification of the ggpie function that includes percentages:
library(ggplot2)
library(dplyr)
#
# df$main should contain observations of interest
# df$condition can optionally be used to facet wrap
#
# labels should be a character vector of same length as group_by(df, main) or
# group_by(df, condition, main) if facet wrapping
#
pie_chart <- function(df, main, labels = NULL, condition = NULL) {
# convert the data into percentages. group by conditional variable if needed
df <- group_by_(df, .dots = c(condition, main)) %>%
summarize(counts = n()) %>%
mutate(perc = counts / sum(counts)) %>%
arrange(desc(perc)) %>%
mutate(label_pos = cumsum(perc) - perc / 2,
perc_text = paste0(round(perc * 100), "%"))
# reorder the category factor levels to order the legend
df[[main]] <- factor(df[[main]], levels = unique(df[[main]]))
# if labels haven't been specified, use what's already there
if (is.null(labels)) labels <- as.character(df[[main]])
p <- ggplot(data = df, aes_string(x = factor(1), y = "perc", fill = main)) +
# make stacked bar chart with black border
geom_bar(stat = "identity", color = "black", width = 1) +
# add the percents to the interior of the chart
geom_text(aes(x = 1.25, y = label_pos, label = perc_text), size = 4) +
# add the category labels to the chart
# increase x / play with label strings if labels aren't pretty
geom_text(aes(x = 1.82, y = label_pos, label = labels), size = 4) +
# convert to polar coordinates
coord_polar(theta = "y") +
# formatting
scale_y_continuous(breaks = NULL) +
scale_fill_discrete(name = "", labels = unique(labels)) +
theme(text = element_text(size = 22),
axis.ticks = element_blank(),
axis.text = element_blank(),
axis.title = element_blank())
# facet wrap if that's happening
if (!is.null(condition)) p <- p + facet_wrap(condition)
return(p)
}
Example:
# sample data
resps <- c("A", "A", "A", "F", "C", "C", "D", "D", "E")
cond <- c(rep("cat A", 5), rep("cat B", 4))
example <- data.frame(resps, cond)
Just like a typical ggplot call:
ex_labs <- c("alpha", "charlie", "delta", "echo", "foxtrot")
pie_chart(example, main = "resps", labels = ex_labs) +
labs(title = "unfacetted example")
ex_labs2 <- c("alpha", "charlie", "foxtrot", "delta", "charlie", "echo")
pie_chart(example, main = "resps", labels = ex_labs2, condition = "cond") +
labs(title = "facetted example")
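Note that group_by_() is deprecated in recent dplyr versions. If it causes trouble, the data preparation step could probably be rewritten along these lines. This is a sketch for the unfacetted case only, with the column name hard-coded rather than passed as a string; pie_data is a hypothetical name.

```r
library(dplyr)

# Same sample responses as in the example above
resps <- c("A", "A", "A", "F", "C", "C", "D", "D", "E")

pie_data <- data.frame(resps) %>%
  count(resps, name = "counts") %>%            # one row per category
  mutate(perc = counts / sum(counts)) %>%      # share of each category
  arrange(desc(perc)) %>%                      # largest slice first
  mutate(label_pos = cumsum(perc) - perc / 2,  # centre of each slice
         perc_text = paste0(round(perc * 100), "%"))

pie_data
```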
The following worked for me with everything wrapped in a single function, greatly inspired by the one here:
ggpie <- function (data)
{
# prepare name
deparse( substitute(data) ) -> name ;
# prepare percents for legend
table( factor(data) ) -> tmp.count1
prop.table( tmp.count1 ) * 100 -> tmp.percent1 ;
paste( tmp.percent1, " %", sep = "" ) -> tmp.percent2 ;
as.vector(tmp.count1) -> tmp.count1 ;
# find breaks for legend
rev( tmp.count1 ) -> tmp.count2 ;
rev( cumsum( tmp.count2 ) - (tmp.count2 / 2) ) -> tmp.breaks1 ;
# prepare data
data.frame( vector1 = tmp.count1, names1 = names(tmp.percent1) ) -> tmp.df1 ;
# plot data
tmp.graph1 <- ggplot(tmp.df1, aes(x = 1, y = vector1, fill = names1 ) ) +
geom_bar(stat = "identity", color = "black" ) +
guides( fill = guide_legend(override.aes = list( colour = NA ) ) ) +
coord_polar( theta = "y" ) +
theme(axis.ticks = element_blank(),
axis.text.y = element_blank(),
axis.text.x = element_text( colour = "black"),
axis.title = element_blank(),
plot.title = element_text( hjust = 0.5, vjust = 0.5) ) +
scale_y_continuous( breaks = tmp.breaks1, labels = tmp.percent2 ) +
ggtitle( name ) +
scale_fill_grey( name = "") ;
return( tmp.graph1 )
} ;
An example :
sample( LETTERS[1:6], 200, replace = TRUE) -> vector1 ;
ggpie(vector1)
Output