plotting labels outside of plot in ggplot - r

I have a plot made from the following code:
variable=c("A","B","C","D","E")
value=c(1,2,3,4,5);
type=c("A","B","A","A","B")
temp<-data.frame(var=factor(variable),val=value,type=factor(type))
p<-ggplot(temp,aes(var,val,color=type))+geom_point(aes(colour="type"))
p<-p+coord_flip()+theme(plot.margin = unit(c(1,5,1,1), "lines"),legend.position = "none")
How can I add labels for the values (now on the x-axis) on the right-hand side of the plot, at the correct level? That is, I want it to say "5 4 3 2 1" vertically on the right side, with each number at the height of its corresponding variable.
Thanks

If you put the values on the y-axis and label them with the entries of "variable", you can add a secondary axis with sec_axis() as a 1:1 transformation, so that the right-hand side shows the numeric values:
temp <- data.frame(val = value, var = value, type = type)
p <- ggplot(temp, aes(var, val, color = type)) +
  geom_point() +
  theme(plot.margin = unit(c(1,5,1,1), "lines"), legend.position = "none")
# breaks = value keeps the five labels aligned with the five points
p <- p + scale_y_continuous(breaks = value, labels = variable, sec.axis = sec_axis(~ . * 1))
p
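When the values are not simply 1 to 5 in order, the same idea can be sketched by plotting against row positions and labelling both axes explicitly. This is a minimal sketch rather than part of the answer above: the `pos` column is a hypothetical helper, and the values 3, 1, 4, 2, 5 are made up for illustration.

```r
library(ggplot2)

temp <- data.frame(var  = c("A", "B", "C", "D", "E"),
                   val  = c(3, 1, 4, 2, 5),            # made-up values
                   type = c("A", "B", "A", "A", "B"))
temp$pos <- seq_len(nrow(temp))   # hypothetical helper: one row position per variable

p <- ggplot(temp, aes(val, pos, colour = type)) +
  geom_point() +
  scale_y_continuous(breaks = temp$pos, labels = temp$var, name = "var",
                     # duplicate the axis on the right, labelled with the values
                     sec.axis = dup_axis(labels = as.character(temp$val),
                                         name = "val")) +
  theme(legend.position = "none")
p
```

Because the right-hand labels are supplied explicitly, they stay level with their variables regardless of the order of the values.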

Related

How do you recenter y-axis at 0.01 using ggplot2

I am trying to make a grouped bar plot comparing the concentration of contaminants (in ug/g) from 2 different locations (site comparison figure), with the y axis in log scale. I have some values that are below one but greater than zero. The y axis is centering my bars at y=1, making some of the bars look negative. Is there a way to make my bars start at 0.01 instead of 1?
I tried coord_cartesian( ylim=c(0.01,100), expand = FALSE ) but that didn't do it. I also tried to use coord_trans(y='log10') to log transform the y axis instead of scale_y_continuous(trans='log10'), but I was getting the error message "Transformation introduced infinite values in y-axis" even though I have no zero values.
Any help would be much appreciated,
Thank you.
my code is below:
malcomp2 %>%
ggplot(aes(x= contam, y= ug_values, fill= Location))+
geom_col(data= malcomp2,
mapping = aes(x= contam, y= ug_values, fill = Location),
position = position_dodge(.9), #makes the bars grouped
stat_identity(),
colour = "black", #adds black lines around bars
width = 0.8,
size = 0.3)+
ylab(expression(Concentration~(mu*g/g)))+ #adds mu character to y axis
coord_cartesian( ylim=c(0.01,100), expand = FALSE ) + #force bars to start at 0
theme_classic()+ #get rid of grey grid
scale_y_continuous(trans='log10', #change to log scale
labels = c(0.01,0.1,1,10,100))+ #change axis to not sci notation
annotation_logticks(sides = 'l')+ #add log ticks to y axis only
scale_x_discrete(name = NULL, #no x axis label
limits = c('Dieldrin','Mirex','PBDEs','CHLDs','DDTs','PCBtri','PCBquad',
'PCBhept'), #changes order of x axis
labels = c('Dieldrin','Mirex','PBDEs','CHLDs','DDTs','PCB 3','PCB 4-6','PCB7+'))+
scale_fill_manual("Location", #rename legend
values = c('turquoise2','gold'), #change colors
labels = c( 'St.Andrew Bay', 'Sapelo'))+ #change names on legend
theme(legend.title = NULL,
legend.key.size = unit(15, "pt"),
legend.position = c(0.10,0.95)) #places legend in upper left corner
People have asked this question before on SO, and a more satisfactory approach might be to use geom_rect instead, like here:
Setting where y-axis bisects when using log scale in ggplot2 geom_bar
One hack would be to just scale your data so that the baseline is at 1 (here, multiplying by 100) and relabel the axis with the original values:
ggplot(data.frame(contam = 1:5, ug_values = 10^(-2:2)*100),
aes(contam, ug_values)) +
geom_col() +
scale_y_continuous(trans = 'log10', limits = c(1,10000),
breaks = c(1,10,100,1000,10000),
labels = c(0.01,0.1,1,10,100))
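The geom_rect idea mentioned above could look roughly like the following. This is a sketch under assumptions, not the code from the linked question: the data are made up, and the `x` and `base` columns are hypothetical helpers that give each bar a numeric position and an explicit lower edge at 0.01.

```r
library(ggplot2)

# Made-up concentrations, some below 1
d <- data.frame(contam    = c("Dieldrin", "Mirex", "PBDEs"),
                ug_values = c(0.05, 2, 40))
d$x    <- seq_len(nrow(d))  # hypothetical helper: numeric bar positions
d$base <- 0.01              # hypothetical helper: baseline each bar starts from

p <- ggplot(d) +
  # ymin is mapped inside aes() so the log scale transforms it like ymax
  geom_rect(aes(xmin = x - 0.4, xmax = x + 0.4,
                ymin = base, ymax = ug_values),
            fill = "turquoise2", colour = "black") +
  scale_x_continuous(breaks = d$x, labels = d$contam, name = NULL) +
  scale_y_log10(limits = c(0.01, 100),
                breaks = c(0.01, 0.1, 1, 10, 100),
                labels = c("0.01", "0.1", "1", "10", "100"))
p
```

Drawing rectangles with an explicit lower edge avoids rescaling the data, at the cost of handling bar positions and dodging by hand.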

Add new geom as new row in ggplot2, preventing layering of plots

I am pretty sure that this is easy to do, but I can't seem to find a proper way to phrase this question for Google or Stack Overflow, so here we are:
I have a plot made in ggplot2 which makes use of geom_jitter(), efficiently creating one row for each element in a factor and plotting its values.
I would like to add a complementary geom_violin() to the plot, but just adding the extra geom_ function to the plot code returns two layers: the jitter and the violin, one on top of the other (as usually expected).
EDIT:
This is what the plot looks like:
How can I have the violin as a separate row, without generating a second plot?
Side quest: how can I have the jitter and the violin geoms interleaved? (i.e. element A jitter row followed by element A violin row, and then element B jitter row followed by element B violin row)
This is the minimum required code to make it (without all the theme() embellishments):
P1 <- ggplot(data=TEST_STACK_SUB, aes(x=E, y=C, col=A)) +
theme(... , aspect.ratio=0.3) +
geom_point(position = position_jitter(w = 0.30, h = 0), alpha=0.2, size=0.5) +
geom_violin(data=TEST_STACK_SUB, mapping=aes(x=E, y=C), position="dodge") +
scale_x_discrete() +
scale_y_continuous(limits=c(0,1), breaks=seq(0,1,0.1),
labels=c(seq(0,1,0.1))) +
scale_color_gradient2(breaks=seq(0,100,20),
limits=c(0,100),
low="green3",
high="darkorchid4",
midpoint=50,
name="") +
coord_flip()
options(repr.plot.width=8, repr.plot.height=2)
plot(P1)
Here is a subset of the data to generate it (for you to try):
data
How about manipulating your factor as a continuous variable and nudging the entries across the aes() calls like so:
library(dplyr)
library(ggplot2)
set.seed(42)
tibble(x = rep(c(1, 3), each = 10),
y = c(rnorm(10, 2), rnorm(10))) -> plot_data
ggplot(plot_data) +
geom_jitter(aes(x = x - 0.5, y = y), width = 0.25) +
geom_violin(aes(x = x + 0.5, y = y, group = x), width = 0.5) +
coord_flip() +
labs(x = "x") +
scale_x_continuous(breaks = c(1, 3),
labels = paste("Level", 1:2),
trans = scales::reverse_trans())
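The same nudging can be sketched starting from an actual factor, which is closer to the question's setup. The data here are made up, and the `pos` column is a hypothetical helper that spaces the factor levels two units apart so that each level has room for a jitter row and a violin row.

```r
library(ggplot2)

set.seed(42)
# Made-up data: a three-level factor E and a value C in [0, 1]
d <- data.frame(E = factor(rep(c("A", "B", "C"), each = 20)),
                C = runif(60))
d$pos <- 2 * as.integer(d$E)   # hypothetical helper: levels at 2, 4, 6

p <- ggplot(d) +
  # jitter row just below each level position, violin row just above it
  geom_jitter(aes(x = pos - 0.5, y = C), width = 0.3, height = 0, alpha = 0.5) +
  geom_violin(aes(x = pos + 0.5, y = C, group = E), width = 0.8) +
  scale_x_continuous(breaks = 2 * seq_along(levels(d$E)),
                     labels = levels(d$E), name = "E") +
  coord_flip()
p
```

After coord_flip(), each level gets its jitter row immediately followed by its violin row, which is the interleaving asked for in the side quest.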

Manipulating the legend of scale_fill_gradient2

I have data which comes from a statistical test (gene set enrichment analysis, but that's not important), so I obtain p-values for statistics that are normally distributed, i.e., that take both positive and negative values.
The test is run on several categories:
set.seed(1)
df <- data.frame(col = rep(1,7),
category = LETTERS[1:7],
stat.sign = sign(rnorm(7)),
p.value = runif(7, 0, 1),
stringsAsFactors = TRUE)
I want to present these data in a geom_tile ggplot such that I color code the df$category by their df$p.value multiplied by their df$stat.sign (i.e., the sign of the statistic).
For that I first take the -log10 of df$p.value and multiply it by the sign:
df$sig <- df$stat.sign*(-1*log10(df$p.value))
Then I order the df by df$sig for each sign of df$sig:
library(dplyr)
df <- rbind(dplyr::filter(df, sig < 0)[order(dplyr::filter(df, sig < 0)$sig), ],
dplyr::filter(df, sig > 0)[order(dplyr::filter(df, sig > 0)$sig), ])
And then I ggplot it:
library(ggplot2)
df$category <- factor(df$category, levels=df$category)
ggplot(data = df,
aes(x = col, y = category)) +
geom_tile(aes(fill=sig)) +
scale_fill_gradient2(low='darkblue', mid='white', high='darkred') +
theme_minimal() +
xlab("") + ylab("") + labs(fill="-log10(P-Value)") +
theme(axis.text.y = element_text(size=12, face="bold"),
axis.text.x = element_blank())
which gives me:
Is there a way to manipulate the legend such that the values of df$sig are represented by their absolute value but everything else remains unchanged? That way I still get both red and blue shades and maintain the order I want.
If you check ggplot's documentation, scale_fill_gradient2, like other continuous scales, accepts one of the following for its labels argument:
- NULL for no labels
- waiver() for the default labels computed by the transformation object
- a character vector giving labels (must be the same length as breaks)
- a function that takes the breaks as input and returns labels as output
Since you only want the legend values to be absolute, I assume you're satisfied with the default breaks in the legend colour bar (-0.1 to 0.4 in increments of 0.1), so all you really need is to add a function that manipulates the labels.
I.e. instead of this:
scale_fill_gradient2(low = 'darkblue', mid = 'white', high = 'darkred') +
Use this:
scale_fill_gradient2(low = 'darkblue', mid = 'white', high = 'darkred',
labels = abs) +
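Put together with the data from the question, a minimal runnable sketch might look like this. The name `fill_scale` is only introduced here for illustration, and the ordering of the categories from the question is left out to keep the example short.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(col = 1,
                 category = LETTERS[1:7],
                 stat.sign = sign(rnorm(7)),
                 p.value = runif(7))
df$sig <- df$stat.sign * -log10(df$p.value)

# labels = abs is applied to the legend breaks only;
# the fill mapping itself still uses the signed values
fill_scale <- scale_fill_gradient2(low = "darkblue", mid = "white",
                                   high = "darkred", labels = abs)

p <- ggplot(df, aes(x = col, y = category, fill = sig)) +
  geom_tile() +
  fill_scale +
  labs(fill = "-log10(P-Value)")
p
```

Any function that maps a vector of breaks to a vector of labels of the same length can be used in place of abs, for example one that also rounds the values.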
I'm not sure I understood what you're looking for. Do you mean that you want to change the labels in the legend? If so, setting the breaks and labels arguments of scale_fill_gradient2() should do it. Here the breaks are the sorted sig values and the labels are their absolute values, rounded for readability:
ggplot(data=df,aes(x=col,y=category)) +
  geom_tile(aes(fill=sig)) +
  scale_fill_gradient2(low='darkblue',mid='white',high='darkred',
                       breaks = sort(unique(df$sig)),
                       labels = round(abs(sort(unique(df$sig))), 2)) +
  theme_minimal()+xlab("")+ylab("")+labs(fill="-log10(P-Value)") +
  theme(axis.text.y=element_text(size=12,face="bold"),axis.text.x=element_blank())
Alternatively, you could display text inside the figure to show the values. Try stacking stat_bin_2d() like this:
ggplot(data=df,aes(x=col,y=category)) +
  geom_tile(aes(fill=sig)) +
  scale_fill_gradient2(low='darkblue',mid='white',high='darkred',
                       breaks = sort(unique(df$sig)),
                       labels = round(abs(sort(unique(df$sig))), 2)) +
  theme_minimal()+xlab("")+ylab("")+labs(fill="-log10(P-Value)") +
  stat_bin_2d(geom = 'text', aes(label = sig), colour = 'black', size = 16) +
  theme(axis.text.y=element_text(size=12,face="bold"),axis.text.x=element_blank())
You may need to experiment with the size and colour arguments.

HeatMap not displaying correctly using ggplot()

I am having a strange situation when I am trying to plot a heatmap on a dataset that I have which can be found here.
I am using the following code to plot the heat map:
xaxis<-c('density')
midrange<-range(red[,xaxis])
xaxis <- c(xaxis,'quality')
molten<-melt(red[,xaxis],'quality')
p <- ggplot(molten, aes(x = value, y = quality))
p <- p + geom_tile(aes(fill = value), colour = "white")
p <- p + theme_minimal()
# turn y-axis text 90 degrees (optional, saves space)
p <- p + theme(axis.text.y = element_text(angle = 90, hjust = 0.5))
# remove axis titles, tick marks, and grid
p <- p + theme(axis.title = element_blank())
p <- p + theme(axis.ticks = element_blank())
p <- p + theme(panel.grid = element_blank())
p <- p + scale_y_discrete(expand = c(0, 0))
# optionally remove row labels (not useful depending on molten)
p <- p + theme(axis.text.x = element_blank())
# get diverging color scale from colorbrewer
# #008837 is green, #7b3294 is purple
palette <- c("#008837", "#b7f7f4", "#b7f7f4", "#7b3294")
if(midrange[1] == midrange[2]) {
# use a 3 color gradient instead
p <- p + scale_fill_gradient2(low = palette[1], mid = palette[2], high = palette[4], midpoint = midrange[1]) +
xlim(midrange[1],midrange[2])
}else{
# use a 4 color gradient (with a swath of white in the middle)
p <- p + scale_fill_gradientn(colours = palette, values = c(0, midrange[1], midrange[2], 1)) +
xlim(midrange[1],midrange[2])
}
p
I am trying to plot the heat map on the variable Density and would like to use the variable quality as separation in my heat map. When I use the above code, I get the following plot:
It can be clearly seen that it is a blank image. This is happening because the range of the variable Density is very low, it doesn't happen if I change the variable to the one having a wider range (pH for example).
Should ggplot automatically adjust to this? If not, how can I get ggplot to show the real plot?
Any help in this regard will be much appreciated.
So there are (at least) two problems here.
First, you have almost 1600 tiles in the x-direction, so specifying colour="white" for the outline means that all you see is the outline, hence a white plot. Try taking this out.
Second, the values=c(...) argument to scale_fill_gradientn(...) expects positions between 0 and 1 along the colour range, but you pass midrange[1] and midrange[2] in data units, and midrange[2] = 1.003.
After taking out color="white" from the call to geom_tile(...), I get this:
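For the second problem, one option is to convert data-unit positions to the 0 to 1 range with scales::rescale() before passing them as values. The following is a sketch on made-up data standing in for the wine dataset. The middle band (`mid_band`, taken here as the 40% and 60% quantiles) is a hypothetical choice for illustration, not something from the question.

```r
library(ggplot2)
library(scales)

set.seed(1)
# Made-up stand-in for the data: a narrow-range variable and a quality score
d <- data.frame(density = runif(200, 0.990, 1.004),
                quality = factor(sample(3:8, 200, replace = TRUE)))

rng      <- range(d$density)
mid_band <- unname(quantile(d$density, c(0.4, 0.6)))  # hypothetical middle band
# Convert data-unit positions to the 0-1 positions that `values` expects
vals <- rescale(c(rng[1], mid_band, rng[2]))

palette <- c("#008837", "#b7f7f4", "#b7f7f4", "#7b3294")
p <- ggplot(d, aes(x = density, y = quality, fill = density)) +
  geom_tile() +   # no white outline, so the fill stays visible
  scale_fill_gradientn(colours = palette, values = vals)
p
```

With this, the first and last colours sit at the ends of the data range and the two middle colours span the chosen band, however narrow the variable's range is.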

How to force the x-axis tick marks to appear at the end of bar in heatmap graph?

I created a simple heatmap graph with ggplot2, but I need to force the x-axis tick marks to appear at the end of each bar of my x variable, rather than at its center. For example, I would expect 1 to appear at the position where 1.5 is now. I believe a heatmap done in base R would do that.
library(car) #initialize libraries
library(ggplot2) #initialize libraries
library(reshape)
df=read.table(text= "x y fill
1 1 B
2 1 A
3 1 B
1 2 A
2 2 C
3 2 A
", header=TRUE, sep="" )
#plot data
qplot(x=x, y=y,
fill=fill,
data=df,
geom="tile")+
scale_x_continuous(breaks=seq(1:3) )
The idea is to create a simple heatmap which looks like this:
The tick marks in this graph are placed at the end of the bars instead of their centers
What about this?
# a single x scale: breaks at the tile edges, labelled 0 to 3
object <- qplot(x = x, y = y,
                fill = fill,
                data = df,
                geom = "tile")
object + scale_x_continuous(breaks = seq(.5, 3.5, 1), labels = 0:3)
geom_tile centres each tile on the coordinates it is given, so the ticks in your plot fall at the tile centres, which is the expected behaviour.
If you instead give ggplot the centre of each cell (rather than its top-right corner), by shifting the coordinates by half a tile, the ticks line up with the tile edges:
ggplot(df, aes(x = x-0.5, y = y-0.5, fill = fill)) +
geom_tile() +
scale_x_continuous(expand = c(0,0), breaks = 0:3) +
scale_y_continuous(expand = c(0,0), breaks = 0:3) +
ylab('y') +
xlab('x')
Or, using qplot():
qplot(data = df, x= x-0.5, y = y-0.5, fill = fill, geom = 'tile') +
scale_x_continuous(expand = c(0,0), breaks = 0:3) +
scale_y_continuous(expand = c(0,0), breaks = 0:3) +
ylab('y') +
xlab('x')
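To check where the tile edges end up after the shift, one can inspect the data ggplot2 computes for the layer. This is a small sketch using layer_data() on the question's data. The `edges` name is only introduced here for illustration.

```r
library(ggplot2)

df <- data.frame(x = rep(1:3, times = 2),
                 y = rep(1:2, each = 3),
                 fill = c("B", "A", "B", "A", "C", "A"))

p <- ggplot(df, aes(x = x - 0.5, y = y - 0.5, fill = fill)) +
  geom_tile()

# geom_tile stores the computed tile edges alongside the centres
edges <- layer_data(p, 1)[, c("xmin", "xmax", "ymin", "ymax")]
range(c(edges$xmin, edges$xmax))   # x extent of the tiles
range(c(edges$ymin, edges$ymax))   # y extent of the tiles
```

With the half-tile shift, the tiles span whole numbers, so integer breaks such as 0:3 land on the tile edges.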
