Dodged geom_errorbar with geom_bar using fill, colour, and alpha - r

I am attempting to make a facet_wrap bar graph with error bars (se) that clearly shows three different categorical variables (Treatment, Horizon, Enzyme) with one response variable (AbundChangetoAvgCtl). Below is the code for some dummy data followed by the ggplot code I have so far. The graphs I've made can be seen at this link:
bargraph figures
Enzyme <- c("Arabinosides","Arabinosides","Arabinosides","Arabinosides","Arabinosides","Arabinosides","Cellulose","Cellulose","Cellulose","Cellulose","Cellulose","Cellulose","Chitin","Chitin","Chitin","Chitin","Chitin","Chitin","Lignin","Lignin","Lignin","Lignin","Lignin","Lignin")
Treatment <- c("Deep","Deep","Int","Int","Low","Low","Deep","Deep","Int","Int","Low","Low","Deep","Deep","Int","Int","Low","Low","Deep","Deep","Int","Int","Low","Low")
Horizon <- c("Org","Min","Org","Min","Org","Min","Org","Min","Org","Min","Org","Min","Org","Min","Org","Min","Org","Min","Org","Min","Org","Min","Org","Min")
AbundChangetoAvgCtl <- rnorm(24,mean=0,sd=1)
se <- rnorm(24, mean=0.5, sd=0.25)
notrans_noctl_enz_toCtl_summary <- data.frame(Enzyme,Treatment,Horizon,AbundChangetoAvgCtl,se)
ggplot(notrans_noctl_enz_toCtl_summary, aes(x=Horizon, y=AbundChangetoAvgCtl, fill=Horizon, alpha=Treatment)) +
geom_bar(position=position_dodge(), colour="black", stat="identity", aes(fill=Horizon)) +
geom_errorbar(aes(ymin=AbundChangetoAvgCtl-se, ymax=AbundChangetoAvgCtl+se),
width=.2,
position=position_dodge(.9)) +
scale_fill_brewer(palette = "Set1") + theme_bw() +
geom_hline(yintercept=0) +
labs(y = "Rel Gene Abundance Change / Control", x="") +
theme(axis.ticks = element_blank(),
axis.text.x = element_blank(),
strip.text.x = element_text(size=20),
plot.title = element_text(size=22, vjust=2, face="bold"),
axis.title.y = element_text(size=18),
legend.key.size = unit(.75, "in"),
legend.text = element_text(size = 15),
legend.title = element_text(size = 18)) +
facet_wrap(~Enzyme, scales="free")
(figure 1)
So this is close to what I want, however for some reason, the "alpha=Treatment" call in ggplot causes my errorbars to fade (which I don't want) as well as the bar_fill (which I do want). I've tried moving the "alpha=Treatment" to the geom_bar call, as well as adding "alpha=1" to geom_bar, but when I do that, the error bars all move to a single location and overlap (figure 2).
I initially wanted to cluster the bars within facet_wrap, but found the alpha option on this site, which seems to accomplish what I'm looking for as well. Any help would be appreciated. If there is a better way to represent all of this, those ideas are welcome as well.
Also, if there is a way to condense and clarify my legend, that would be extra bonus!
Thanks in advance for your help!
Mike

You need to assign Treatment to the group option in the ggplot() command and then move the alpha=Treatment option to the geom_bar() command. Then the alpha value of geom_errorbar won't be affected by the global option and will be black. Like this:
ggplot(notrans_noctl_enz_toCtl_summary, aes(x=Horizon, y=AbundChangetoAvgCtl, fill=Horizon, group = Treatment)) +
geom_bar(position=position_dodge(), colour="black", stat="identity", aes(fill=Horizon, alpha = Treatment))
Also, I would check how alpha=Treatment maps the treatment levels to transparency: intuitively, the more transparent bars should correspond to the low treatment and the more opaque bars to the high treatment. At least that would be my reading, without any background on the research design or data.
For information about formatting legends, see here.
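For reference, here is a minimal sketch of the complete plot with the error bar layer included. It assumes a regenerated dummy data frame (called d here, a hypothetical name) with the same columns as in the question; abs() is applied to se only so the dummy standard errors are never negative, and the scale_alpha_discrete() range is an arbitrary choice.

```r
library(ggplot2)

set.seed(42)
# Dummy data with the same structure as in the question (hypothetical name `d`)
d <- expand.grid(Horizon = c("Org", "Min"),
                 Treatment = c("Deep", "Int", "Low"),
                 Enzyme = c("Arabinosides", "Cellulose", "Chitin", "Lignin"))
d$AbundChangetoAvgCtl <- rnorm(nrow(d))
d$se <- abs(rnorm(nrow(d), mean = 0.5, sd = 0.25))  # abs() keeps se non-negative

p <- ggplot(d, aes(x = Horizon, y = AbundChangetoAvgCtl,
                   fill = Horizon, group = Treatment)) +
  # alpha is mapped only in this layer, so only the bars fade
  geom_bar(aes(alpha = Treatment), stat = "identity",
           position = position_dodge(0.9), colour = "black") +
  # the inherited group = Treatment keeps the error bars dodged with the bars
  geom_errorbar(aes(ymin = AbundChangetoAvgCtl - se,
                    ymax = AbundChangetoAvgCtl + se),
                width = 0.2, position = position_dodge(0.9)) +
  scale_alpha_discrete(range = c(0.3, 1)) +
  facet_wrap(~Enzyme, scales = "free")
```

Printing p draws the figure. Because alpha is no longer a global aesthetic, the error bars keep their default opaque black.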

Related

ggplot2 dodged boxplot with geom_point dodging and unequal number of subgroups

I am attempting to plot a dodged boxplot but I have run into a couple of difficulties. The x-axis has two levels of grouping: the "letter groups" (A, B, C, etc.) are the main groups, which I specify as my x aesthetic (X_main_group). Within each main group I have subgroups called "X_group", and the boxes are coloured by subgroup. The problem is that each letter group has a different number of subgroups, e.g. for x=A I have 4 subgroups but for x=B I have only one. First, the dodging of the plotted points no longer works (see the example plot below): the points do not align with the dodged boxplots. Second, the boxes are no longer centred on the x-axis tick, which is clearest for x=B. How can I fix this?
I would also like to achieve small x-axis ticks below each subgroup (so 4 ticks for x=A, 1 tick for x=B, 3 for x=C etc..) but this has less priority. I have attached the figure, and in red I drew some examples of what I hope to achieve with the tick-marks. ggplot2 code is shown below. I would like to provide a reproducible piece of code, but I can not manage to create a piece of code that creates a dataframe with unequal amounts of subgroups so people that want to help can run it. I can only make "symmetrical" dataframes...
cbpallette <- c("#999999", "#666666", "#333333", "#000000", "#003300")
p1 <- ggplot(data=df, aes(x=X_main_group,y=Intensity, colour=factor(X_group))) + stat_boxplot(geom = "errorbar", width=.4, position = position_dodge(0.5, preserve="single")) + geom_boxplot(width=0.5, outlier.shape=NA, position=position_dodge(preserve = "single")) + theme_classic() + geom_point(position=position_jitterdodge(), alpha=0.3)
p2 <- p1 + scale_colour_manual(values = cbpallette) + theme(legend.position = "none") + theme(axis.ticks.length = unit(-0.1, "cm"), axis.text.x = element_text(size=30, vjust=-0.4), axis.text.y=element_text(size=35, hjust = 0.5, angle=45), axis.title = element_blank())
p3 <- p2 + theme(axis.text.x = element_text(margin = margin(t = .5, unit = "cm")), axis.text.y = element_text(margin = margin(r = .5, unit = "cm")))
p3
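On the reproducibility point, a data frame with unequal numbers of subgroups can be built by listing the wanted main-group/subgroup pairs explicitly and then repeating each pair. This is only a sketch: the layout (4 subgroups for A, 1 for B, 3 for C), the 10 observations per box, and the Intensity values are all made up for illustration.

```r
set.seed(1)

# Hypothetical layout: A has 4 subgroups, B has 1, C has 3
layout <- data.frame(
  X_main_group = c("A", "A", "A", "A", "B", "C", "C", "C"),
  X_group      = c(1, 2, 3, 4, 1, 1, 2, 3)
)

# Repeat every (main group, subgroup) pair 10 times to get 10 points per box
df <- layout[rep(seq_len(nrow(layout)), each = 10), ]
rownames(df) <- NULL

# Made-up response variable
df$Intensity <- rnorm(nrow(df), mean = 10 + df$X_group)

# Cross-tabulation showing the unbalanced design
table(df$X_main_group, df$X_group)
```

The resulting df has the column names used in the plotting code above, so that code can be run against it as is.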

Add values to the bars, reorder bars and remove grid in horizontal grouped bar plot in ggplot2

Hi, this is my fake data:
year<-c(rep(2010,3),rep(2011,3),rep(2012,3))
value<-rnorm(9,10,5)
nation<-rep(c("a","b","c"),3)
fake<-data.frame(year,value,nation)
And this is the code I already have:
ggplot(data=fake, aes(x=year, y=value, fill=nation)) +
geom_bar(stat="identity",width=0.5, position=position_dodge())+
scale_fill_manual(values=c("#006600", "#007f00","#009900"),name="Experimental\nCondition")+
scale_x_continuous(breaks=seq(2010,2015,1),name="year")+
scale_y_continuous(breaks=F,name="value")+
theme(legend.title=element_blank(),plot.title = element_text(lineheight=0.9, face="bold"))+
ggtitle("arbitrary nation values in arbitrary years")+
coord_flip()
What I am trying to do is add the values inside each bar, remove the grid in the background, and reorder the bars within each year from the highest to the lowest value, so the order of the bars will differ between years. One last thing is the little zero in the bottom-left corner. The first version of the graph had numbers on the x axis; I decided to remove them, but the zero persists. It could be erased in a graphics editor, but I guess it can be done through code.
Thank you very much for any suggestions.
I almost got what you want; here is the code. You could use reorder() to order the bars within each year.
ggplot(data=fake, aes(x=year, y=value, fill=nation)) +
geom_bar(stat="identity",width=0.5, position=position_dodge())+
scale_fill_manual(values=c("#006600", "#007f00","#009900"),name="Experimental\nCondition")+
scale_x_continuous(breaks=seq(2010,2015,1),name="year")+
scale_y_continuous(breaks=NULL,name="value")+ # breaks=NULL removes all y breaks, including the stray 0 that breaks=F leaves
theme(legend.title=element_blank(),plot.title = element_text(lineheight=0.9, face="bold"))+
geom_text(aes(label = round(value, digits=2)), size = 5, position = position_dodge(width = 0.5), hjust=1)+
ggtitle("arbitrary nation values in arbitrary years")+
coord_flip()+
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"))
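The per-year reordering that the answer leaves open can be sketched as follows. The idea, which is an assumption of this sketch rather than something stated in the thread, is that position_dodge() places bars in group order, so mapping group to a within-year rank changes the bar order year by year. The rank column is a hypothetical helper.

```r
library(ggplot2)

set.seed(1)
fake <- data.frame(year = rep(2010:2012, each = 3),
                   value = rnorm(9, 10, 5),
                   nation = rep(c("a", "b", "c"), 3))

# Hypothetical helper column: rank within each year, 1 = largest value
fake$rank <- ave(-fake$value, fake$year, FUN = rank)

p <- ggplot(fake, aes(x = year, y = value, fill = nation,
                      group = factor(rank))) +  # dodge order follows the group
  geom_bar(stat = "identity", width = 0.5,
           position = position_dodge(width = 0.5)) +
  geom_text(aes(label = round(value, 2)),
            position = position_dodge(width = 0.5), hjust = 1) +
  coord_flip()
```

The fill colours still identify the nation, so the legend stays valid even though the bar order differs from year to year. If the order comes out reversed for your layout, rank on value instead of -value.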

How to align labels with coord_polar?

I am trying to produce a circular "heatmap" in R, and found a solution with coord_polar, and how to distribute the labels around the plot.
My problem is that the labels around the plot seem to be centred and the long names are overlapping the plot. I can't use hjust and vjust to align the text to the edge of the plot.
My code and a subset of my data:
library(reshape)
library(ggplot2)
data <- data.frame(id=c("S_subsp_houtenae_str_ATCC_BAA-1581","S_Heidelberg_S_1_7","S_Haifa_S_11_3","S_Infantis_S_2_3","S_Newport_S_1_4","S_Bredeney_S_1_3","S_Saint_Paul_S_1_5","S_Bovismorbificans_S_3_8","S_Saintpaul_str_SARA26","S_London_S_6_7","S_Mbandaka_S_7_5","S_Corvallis_S_5_6","S_San_Diego_S_9_5","S_Javiana_str_10721"),
A.C2=c(0,0,0,0,0,0,0,0,0,0,0,2,0,0),Col156=c(0,0,0,0,0,4,0,0,0,0,0,0,0,0),
ColRNAI=c(0,8,0,0,8,8,8,0,8,0,0,0,0,0),FIB=c(0,0,0,0,10,0,0,10,10,0,0,0,0,0),
FII=c(0,0,0,0,0,0,0,12,12,0,0,0,0,0),HI2=c(0,15,0,0,15,15,0,0,0,0,0,0,0,0),
HI2A=c(0,15,0,0,15,15,0,0,0,0,0,0,0,0),I1=c(0,17,17,17,0,0,0,0,0,0,0,17,17,0),
I2=c(0,0,0,0,0,0,0,0,0,0,0,18,18,18),N=c(0,0,0,0,0,0,0,19,19,19,19,0,0,0),
P=c(20,20,20,20,20,20,20,0,0,0,0,0,0,0),Q1=c(0,22,0,0,22,0,0,0,0,0,0,22,0,0))
data <- transform(data,id=factor(id,levels=unique(id)))
data.m <- melt(data)
data.m$var2 = as.numeric(data.m$variable) + 15
y_labels = levels(data.m$variable)
y_breaks = seq_along(y_labels) + 15
sequence_length = length(unique(data.m$id))
first_sequence = c(1:(sequence_length%/%2))
second_sequence = c((sequence_length%/%2+1):sequence_length)
first_angles =c(90 - 180/length(first_sequence) * first_sequence)
second_angles = c(-90 - 180/length(second_sequence) * second_sequence)
Palette <- c("#f1f1f1","#302013","#614126","#58DB41","#638A5C","#62D585","#579134","#B8DD95","#9ED84D","#4B6FC8","#2A344D","#47689B","#315CEE","#D9AB68","#E09B33","#FE9E2A","#D97B0C","#6A2F45","#A02A77","#E1C73E","#D16F60","#C13420","#DA435C","#E20338","#000000","#999999")
p = ggplot(data.m, aes(x=id, y=var2, fill=factor(value))) +
geom_tile(colour="white") +
scale_fill_manual(values=Palette) +
scale_y_discrete(breaks=y_breaks, labels=y_labels) +
theme(panel.background=element_blank(),
axis.title=element_blank(),
panel.grid=element_blank(),
axis.text.x=element_text(angle= c(first_angles,second_angles),size=8),
axis.ticks=element_blank(),
axis.text.y=element_blank(),
legend.position="none")
p = p + coord_polar()
plot(p)
I've had similar issues in coord_polar() with labels not responding to either hjust= or vjust= and therefore not aligning as I'd like.
The solution to this, shown here https://stackoverflow.com/a/28846989/4340137, is to use geom_text() to manually label the data.
The example at the link provided does everything you need. Unfortunately, I just can't get it working quickly with your more complicated data structure and SO won't let me leave this as a comment.
Someone else may be able to edit to include the exact code.
In RStudio, when I run the following and zoom, all the labels are outside the circle except the longest one, which may mean the plot margin at the top is too tight (or you might consider shortening the name or using \n for a new line). I changed the axis.text.y argument to theme. I also couldn't get the odd legend in the top left to go away. Even so, the inserted plot suffers from the overlap problem you described.
ggplot(data.m, aes(x=id, y=var2, fill=factor(value))) +
geom_tile(colour="white") +
scale_fill_manual(values=Palette) +
scale_y_discrete(breaks=y_breaks, labels=y_labels) +
theme(panel.background=element_blank(), axis.title=element_blank(), panel.grid=element_blank(),
axis.text.x=element_text(angle= c(first_angles,second_angles),size=8, vjust=-1), # vjust=-1
axis.ticks=element_blank(),
axis.text.y=element_text(vjust = -2), legend.position="none") + # legend.position given once; a repeated argument errors in current ggplot2
coord_polar()
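The geom_text() approach mentioned above can be sketched on a small stand-in data set. Everything here is illustrative: the ids, the three rings, the label radius of 3.7 and the y limits are made-up values, and the angle formula assumes the default coord_polar() layout, where the first category starts at 12 o'clock and categories run clockwise.

```r
library(ggplot2)

# Hypothetical stand-in data: 8 long labels, 3 rings
ids <- paste0("Sample_with_long_name_", 1:8)
d <- expand.grid(id = factor(ids, levels = ids), ring = 1:3)
d$value <- factor((as.integer(d$id) + d$ring) %% 2)

n <- length(ids)
lab <- data.frame(id = factor(ids, levels = ids), pos = seq_len(n))

# Angle of each tile centre, measured clockwise from 12 o'clock
theta <- 360 * (lab$pos - 0.5) / n
right <- theta <= 180

# Text angles are counter-clockwise from horizontal; labels on the left half
# are flipped by 180 degrees so they are not upside down
lab$angle <- ifelse(right, 90 - theta, 270 - theta)
# Anchor each label at the end nearest the plot so it grows outwards
lab$hjust <- ifelse(right, 0, 1)

p <- ggplot(d, aes(x = id, y = ring)) +
  geom_tile(aes(fill = value), colour = "white") +
  geom_text(data = lab,
            aes(x = id, y = 3.7, label = id, angle = angle, hjust = hjust),
            size = 3) +
  scale_y_continuous(limits = c(-2, 6)) +  # inner hole plus room for labels
  coord_polar() +
  theme_void() +
  theme(legend.position = "none")
```

Because the labels are ordinary text layers, hjust and angle behave as expected, which is not the case for axis.text.x under coord_polar(). Long names may still need a larger upper y limit or a wider plot margin.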

R: ggplot2: Adding count labels to histogram with density overlay

I have a time series that I'm examining for data heterogeneity, and I wish to explain some important facets of this to some data analysts. I have a density histogram overlaid with a KDE plot (so that both are visible). However, the original data are counts, and I want to place the count values as labels above the histogram bars.
Here is some code:
tix_hist <- ggplot(tix, aes(x=Tix_Cnt)) +
geom_histogram(aes(y = ..density..), colour="black", fill="orange", binwidth=50) +
xlab("Bin") + ylab("Density") + geom_density(aes(y = ..density..),fill=NA, colour="blue") +
scale_x_continuous(breaks=seq(1,1700,by=100))
# theme() and element_text() replace the older opts() and theme_text()
tix_hist + ggtitle("Ticket Density To-Date") + theme(
plot.title = element_text(face="bold", size=18),
axis.title.x = element_text(face="bold", size=16),
axis.title.y = element_text(face="bold", size=14, angle=90),
axis.text.x = element_text(face="bold", size=14),
axis.text.y = element_text(face="bold", size=14)
)
I thought about extrapolating count values using the KDE bandwidth, etc. Is it possible to put the numeric output of a ggplot frequency histogram into a data frame and add it as a 'layer'? I'm not savvy with the layer() function yet, but any ideas would be helpful. Many thanks!
If you want the y-axis to show the bin counts while also overlaying a density curve on the histogram, draw geom_histogram() first and note the binwidth value (this is very important), then add a geom_density() layer whose y is scaled by that binwidth.
If you don't know which binwidth to choose, you can calculate one:
my_binwidth = (max(tix$Tix_Cnt)-min(tix$Tix_Cnt))/30;
(This is roughly what geom_histogram does by default, which is to split the range into 30 bins.)
The code is given below (suppose the binwidth you just calculated is 0.001):
tix_hist <- ggplot(tix, aes(x=Tix_Cnt)) ;
tix_hist<- tix_hist + geom_histogram(aes(y=..count..),colour="blue",fill="white",binwidth=0.001);
tix_hist<- tix_hist + geom_density(aes(y=0.001*..count..),alpha=0.2,fill="#FF6666",adjust=4);
print(tix_hist);
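Neither snippet above puts the counts on the bars. One way to do that, sketched here on made-up data, is a stat_bin() layer drawn with geom = "text" and the same binwidth as the histogram. The tix data frame below is a simulated stand-in, the bin width of 10 is arbitrary, and after_stat() assumes ggplot2 3.3.0 or later (older versions use the ..count.. form seen above).

```r
library(ggplot2)

set.seed(1)
# Simulated stand-in for the asker's `tix` data frame
tix <- data.frame(Tix_Cnt = rpois(500, lambda = 400))
bw <- 10  # bin width chosen for illustration

p <- ggplot(tix, aes(x = Tix_Cnt)) +
  geom_histogram(aes(y = after_stat(count)), binwidth = bw,
                 colour = "black", fill = "orange") +
  # Same binning as the histogram, drawn as text: one label per bin
  stat_bin(aes(y = after_stat(count), label = after_stat(count)),
           geom = "text", binwidth = bw, vjust = -0.5) +
  # Density count scaled by the bin width so it sits on the count axis
  geom_density(aes(y = after_stat(count) * bw), colour = "blue")

# The binned counts are also available as a data frame
bin_counts <- layer_data(p, 1)[, c("xmin", "xmax", "count")]
```

layer_data() returns what the histogram layer computed, which answers the question of getting the histogram's numeric output into a data frame.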

How can I remove the legend title in ggplot2?

I have a question concerning the legend in ggplot2.
Say I have a hypothetical dataset about mean carrot length for two different colours at two farms:
carrots<-NULL
carrots$Farm<-rep(c("X","Y"),2)
carrots$Type<-rep(c("Orange","Purple"),each=2)
carrots$MeanLength<-c(10,6,4,2)
carrots<-data.frame(carrots)
I make a simple bar plot:
require(ggplot2)
p<-ggplot(carrots,aes(y=MeanLength,x=Farm,fill=Type)) +
geom_bar(position="dodge") +
opts(legend.position="top")
p
My question is: is there a way to remove the title ('Type') from the legend?
Thanks!
I found that the best option is to use + theme(legend.title = element_blank()) as user "gkcn" noted.
For me (on 03/26/15), using the previously suggested labs(fill="") and scale_fill_discrete("") removes one title, only to add in another legend, which is not useful.
You can modify the legend title by passing it as the first parameter to a scale. For example:
ggplot(carrots, aes(y=MeanLength, x=Farm, fill=Type)) +
geom_bar(stat="identity", position="dodge") + # stat="identity" is needed because y is mapped
theme(legend.position="top", legend.direction="horizontal") +
scale_fill_discrete("")
There is also a shortcut for this, i.e. labs(fill="")
Since your legend is at the top of the chart, you may also wish to modify the legend orientation. You can do this using theme(legend.direction="horizontal") (opts(legend.direction="horizontal") in older ggplot2 versions).
You can use labs:
p + labs(fill="")
The only way that worked for me was using legend.title = theme_blank() (element_blank() in current ggplot2, where theme() replaces opts()), and I think it is the most convenient variant compared with labs(fill="") and scale_fill_discrete(""), which could also be useful in some cases.
ggplot(carrots,aes(y=MeanLength,x=Farm,fill=Type)) +
geom_bar(stat="identity", position="dodge") +
theme(
legend.position="top",
legend.direction="horizontal",
legend.title = element_blank()
)
P.S. There are more useful options in documentation.
You've got two good options already, so here's another using scale_fill_manual(). Note this also lets you specify the colors of the bars easily:
ggplot(carrots,aes(y=MeanLength,x=Farm,fill=Type)) +
geom_bar(position="dodge") +
opts(legend.position="top") +
scale_fill_manual(name = "", values = c("Orange" = "orange", "Purple" = "purple"))
If you are using the up-to-date (As of January 2015) version of ggplot2 (version 1.0), then the following should work:
ggplot(carrots, aes(y = MeanLength, x = Farm, fill = Type)) +
geom_bar(stat = "identity", position = "dodge") +
theme(legend.position="top") +
scale_fill_manual(name = "", values = c("Orange" = "orange", "Purple" = "purple"))
@pascal's solution in a comment to set the name argument of a scale function, such as scale_fill_discrete, to NULL, is the best option for me. It allows removing the title together with the blank space that would remain if you used "", while at the same time allowing the user to selectively remove titles, which is not possible with the theme(legend.title = element_blank()) approach.
Since it is buried in a comment, I am posting it as an answer to potentially increase its visibility, with kudos to @pascal.
TL;DR (for the copy-pasters):
scale_fill_discrete(name = NULL)
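For completeness, here is a minimal runnable sketch of the name = NULL approach on the carrots data from the question. It assumes a current ggplot2, so it uses geom_col() and theme() rather than the opts() calls seen in the older answers; fill_scale is a hypothetical variable name.

```r
library(ggplot2)

carrots <- data.frame(Farm = rep(c("X", "Y"), 2),
                      Type = rep(c("Orange", "Purple"), each = 2),
                      MeanLength = c(10, 6, 4, 2))

# name = NULL drops the legend title and the space it would occupy
fill_scale <- scale_fill_discrete(name = NULL)

p <- ggplot(carrots, aes(x = Farm, y = MeanLength, fill = Type)) +
  geom_col(position = "dodge") +   # geom_col() is geom_bar(stat = "identity")
  theme(legend.position = "top") +
  fill_scale
```

With more than one legend, the same argument on each scale lets you remove titles selectively, whereas theme(legend.title = element_blank()) removes them all.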

Resources