I am trying to plot a dodged boxplot but have run into a couple of difficulties. The x-axis has two levels of grouping: the "letter groups" (A, B, C, etc.) are the main groups, which I map to the x aesthetic (X_main_group). Within each main group I have subgroups called "X_group", and the boxes are coloured by subgroup. The trouble is that the number of subgroups differs between letter groups, e.g. x=A has 4 subgroups but x=B has only one. This causes two problems. First, the dodging of the plotted points no longer works (see the example plot below): the points do not align with the dodged boxplots. Second, the boxes are no longer centred on the x-axis tick, which is most obvious for x=B. How can I fix this?
I would also like to have small x-axis ticks below each subgroup (so 4 ticks for x=A, 1 tick for x=B, 3 for x=C, etc.), but this has lower priority. I have attached the figure, and in red I drew some examples of what I hope to achieve with the tick marks. The ggplot2 code is shown below. I would like to provide a reproducible example, but I cannot manage to write code that creates a dataframe with unequal numbers of subgroups for people who want to help to run. I can only make "symmetrical" dataframes...
cbpallette <- c("#999999", "#666666", "#333333", "#000000", "#003300")
p1 <- ggplot(data=df, aes(x=X_main_group,y=Intensity, colour=factor(X_group))) + stat_boxplot(geom = "errorbar", width=.4, position = position_dodge(0.5, preserve="single")) + geom_boxplot(width=0.5, outlier.shape=NA, position=position_dodge(preserve = "single")) + theme_classic() + geom_point(position=position_jitterdodge(), alpha=0.3)
p2 <- p1 + scale_colour_manual(values = cbpallette) + theme(legend.position = "none") + theme(axis.ticks.length = unit(-0.1, "cm"), axis.text.x = element_text(size=30, vjust=-0.4), axis.text.y=element_text(size=35, hjust = 0.5, angle=45), axis.title = element_blank())
p3 <- p2 + theme(axis.text.x = element_text(margin = margin(t = .5, unit = "cm")), axis.text.y = element_text(margin = margin(r = .5, unit = "cm")))
p3
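A data frame with unequal numbers of subgroups can be built by listing the main-group/subgroup combinations explicitly and then repeating each row. This is only a sketch: the column names follow the plotting code above, while the group labels, the number of observations per subgroup (n_obs) and the simulated intensities are assumptions made for illustration.

```r
set.seed(1)

# Hypothetical layout: A has 4 subgroups, B has 1, C has 3
combos <- data.frame(
  X_main_group = c("A", "A", "A", "A", "B", "C", "C", "C"),
  X_group      = c(1, 2, 3, 4, 1, 1, 2, 3)
)

n_obs <- 10  # assumed number of observations per subgroup

# Repeat every combination n_obs times and attach simulated intensities
df <- combos[rep(seq_len(nrow(combos)), each = n_obs), ]
df$Intensity <- rnorm(nrow(df), mean = 10 + df$X_group, sd = 2)
rownames(df) <- NULL

# Number of distinct subgroups per main group
tapply(df$X_group, df$X_main_group, function(g) length(unique(g)))
```

Because the combinations are written out by hand, any unbalanced layout can be produced by adding or removing rows in combos.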
Hi this is my fake data:
year<-c(rep(2010,3),rep(2011,3),rep(2012,3))
value<-rnorm(9,10,5)
nation<-rep(c("a","b","c"),3)
fake<-data.frame(year,value,nation)
and this is the code I already have:
ggplot(data=fake, aes(x=year, y=value, fill=nation)) +
geom_bar(stat="identity",width=0.5, position=position_dodge())+
scale_fill_manual(values=c("#006600", "#007f00","#009900"),name="Experimental\nCondition")+
scale_x_continuous(breaks=seq(2010,2015,1),name="year")+
scale_y_continuous(breaks=F,name="value")+
theme(legend.title=element_blank(),plot.title = element_text(lineheight=0.9, face="bold"))+
ggtitle("arbitrary nation values in arbitrary years")+
coord_flip()
What I am trying to do is add the values inside each bar, remove the grid in the background, and reorder the bars within each year from the highest to the lowest value, so the order of the bars will differ between years. One last thing is the little zero in the bottom left corner. In the first version of the graph there were numbers on the X axis; I decided to remove them, and did, but the zero persists. It is possible to erase it in a graphics editor, but I guess it could be done through code.
Thank you very much for any suggestions.
I almost got what you want; here is the code. You could use reorder() to reorder the bars by year.
ggplot(data=fake, aes(x=year, y=value, fill=nation)) +
geom_bar(stat="identity",width=0.5, position=position_dodge())+
scale_fill_manual(values=c("#006600", "#007f00","#009900"),name="Experimental\nCondition")+
scale_x_continuous(breaks=seq(2010,2015,1),name="year")+
scale_y_continuous(breaks=F,name="value")+
theme(legend.title=element_blank(),plot.title = element_text(lineheight=0.9, face="bold"))+
geom_text(aes(label = round(value, digits=2)), size = 5, position = position_dodge(width = 0.5), hjust=1)+
ggtitle("arbitrary nation values in arbitrary years")+
coord_flip()+
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"))
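On the remaining zero in the corner: a likely cause (an assumption, since the behaviour depends on the ggplot2 version) is that breaks=F in the question's code is coerced to the number 0, which leaves a single break at zero. Passing breaks = NULL asks the scale for no breaks at all. A minimal sketch using the fake data from the question, with geom_col() standing in for geom_bar(stat="identity"):

```r
library(ggplot2)

set.seed(42)
fake <- data.frame(
  year   = rep(2010:2012, each = 3),
  value  = rnorm(9, 10, 5),
  nation = rep(c("a", "b", "c"), 3)
)

p <- ggplot(fake, aes(x = year, y = value, fill = nation)) +
  geom_col(width = 0.5, position = position_dodge()) +
  scale_x_continuous(breaks = 2010:2012, name = "year") +
  # breaks = NULL removes every tick and label on this axis, including the 0
  scale_y_continuous(breaks = NULL, name = "value") +
  coord_flip()
p
```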
I am trying to produce a circular "heatmap" in R, and found a solution with coord_polar, and how to distribute the labels around the plot.
My problem is that the labels around the plot seem to be centred and the long names are overlapping the plot. I can't use hjust and vjust to align the text to the edge of the plot.
My code and a subset of my data:
library(reshape)
library(ggplot2)
data <- data.frame(id=c("S_subsp_houtenae_str_ATCC_BAA-1581","S_Heidelberg_S_1_7","S_Haifa_S_11_3","S_Infantis_S_2_3","S_Newport_S_1_4","S_Bredeney_S_1_3","S_Saint_Paul_S_1_5","S_Bovismorbificans_S_3_8","S_Saintpaul_str_SARA26","S_London_S_6_7","S_Mbandaka_S_7_5","S_Corvallis_S_5_6","S_San_Diego_S_9_5","S_Javiana_str_10721"),
A.C2=c(0,0,0,0,0,0,0,0,0,0,0,2,0,0),Col156=c(0,0,0,0,0,4,0,0,0,0,0,0,0,0),
ColRNAI=c(0,8,0,0,8,8,8,0,8,0,0,0,0,0),FIB=c(0,0,0,0,10,0,0,10,10,0,0,0,0,0),
FII=c(0,0,0,0,0,0,0,12,12,0,0,0,0,0),HI2=c(0,15,0,0,15,15,0,0,0,0,0,0,0,0),
HI2A=c(0,15,0,0,15,15,0,0,0,0,0,0,0,0),I1=c(0,17,17,17,0,0,0,0,0,0,0,17,17,0),
I2=c(0,0,0,0,0,0,0,0,0,0,0,18,18,18),N=c(0,0,0,0,0,0,0,19,19,19,19,0,0,0),
P=c(20,20,20,20,20,20,20,0,0,0,0,0,0,0),Q1=c(0,22,0,0,22,0,0,0,0,0,0,22,0,0))
data <- transform(data,id=factor(id,levels=unique(id)))
data.m <- melt(data)
data.m$var2 = as.numeric(data.m$variable) + 15
y_labels = levels(data.m$variable)
y_breaks = seq_along(y_labels) + 15
sequence_length = length(unique(data.m$id))
first_sequence = c(1:(sequence_length%/%2))
second_sequence = c((sequence_length%/%2+1):sequence_length)
first_angles =c(90 - 180/length(first_sequence) * first_sequence)
second_angles = c(-90 - 180/length(second_sequence) * second_sequence)
Palette <- c("#f1f1f1","#302013","#614126","#58DB41","#638A5C","#62D585","#579134","#B8DD95","#9ED84D","#4B6FC8","#2A344D","#47689B","#315CEE","#D9AB68","#E09B33","#FE9E2A","#D97B0C","#6A2F45","#A02A77","#E1C73E","#D16F60","#C13420","#DA435C","#E20338","#000000","#999999")
p = ggplot(data.m, aes(x=id, y=var2, fill=factor(value))) +
geom_tile(colour="white") +
scale_fill_manual(values=Palette) +
scale_y_discrete(breaks=y_breaks, labels=y_labels) +
theme(panel.background=element_blank(),
axis.title=element_blank(),
panel.grid=element_blank(),
axis.text.x=element_text(angle= c(first_angles,second_angles),size=8),
axis.ticks=element_blank(),
axis.text.y=element_blank(),
legend.position="none")
p = p + coord_polar()
plot(p)
I've had similar issues in coord_polar() with labels not responding to either hjust= or vjust= and therefore not aligning as I'd like.
The solution to this, shown here https://stackoverflow.com/a/28846989/4340137, is to use geom_text() to manually label the data.
The example at the link provided does everything you need. Unfortunately, I just can't get it working quickly with your more complicated data structure and SO won't let me leave this as a comment.
Someone else may be able to edit to include the exact code.
In RStudio, when I run the following and zoom, all the labels are outside the circle except the longest one, which may mean the plot margin at the top is too tight (or you might consider shortening the name or using \n for a new line). I changed the axis.text.y argument to theme. I also couldn't get the odd legend in the top left to go away. Even so, the inserted plot suffers from the overlap problem you described.
ggplot(data.m, aes(x=id, y=var2, fill=factor(value))) +
geom_tile(colour="white") +
scale_fill_manual(values=Palette) +
scale_y_discrete(breaks=y_breaks, labels=y_labels) +
theme(panel.background=element_blank(), axis.title=element_blank(), panel.grid=element_blank(),
axis.text.x=element_text(angle= c(first_angles,second_angles),size=8, vjust=-1), # vjust=-1
axis.ticks=element_blank(),
axis.text.y=element_text(vjust = -2), legend.position="none") +
coord_polar()
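The geom_text() idea mentioned earlier in this thread can be sketched on toy data as follows. This is not the code from the linked answer; the data, the label offset (y = 2.6) and the y limits are assumptions chosen for illustration. Each label is rotated to lie along the radius of its sector, and hjust anchors the end of the text that faces the plot, so long names grow outward instead of overlapping the tiles.

```r
library(ggplot2)

# Hypothetical toy data: six ids of mixed length, two rings of tiles
ids <- c("short", "a_much_longer_label", "mid_label",
         "x", "another_long_label", "z")
toy <- expand.grid(id = factor(ids, levels = ids), ring = 1:2)
toy$value <- factor((as.integer(toy$id) + toy$ring) %% 2)

n <- length(ids)
# Direction of each sector's centre, in degrees clockwise from 12 o'clock
theta <- 360 * (seq_len(n) - 0.5) / n

lab_df <- data.frame(
  id    = factor(ids, levels = ids),
  y     = 2.6,  # just outside the outer ring
  # Rotate text so it reads along the radius; flip the left half of the circle
  angle = ifelse(theta <= 180, 90 - theta, 270 - theta),
  # Anchor the end of the text that faces the plot
  hjust = ifelse(theta <= 180, 0, 1)
)

p <- ggplot(toy, aes(x = id, y = ring)) +
  geom_tile(aes(fill = value), colour = "white") +
  geom_text(data = lab_df,
            aes(y = y, label = id, angle = angle, hjust = hjust),
            size = 3) +
  scale_y_continuous(limits = c(0, 5)) +  # extra room so labels are not clipped
  coord_polar() +
  theme_void() +
  theme(legend.position = "none")
p
```

With real data the upper y limit and the text size would need tuning so that the longest name still fits inside the panel.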
I have a time series that I'm examining for data heterogeneity, and I wish to explain some important facets of this to some data analysts. I have a density histogram overlaid with a KDE plot (so that both can be seen on the same scale). However, the original data are counts, and I want to place the count values as labels above the histogram bars.
Here is some code:
tix_hist <- ggplot(tix, aes(x=Tix_Cnt)) +
geom_histogram(aes(y = ..density..), colour="black", fill="orange", binwidth=50) +
xlab("Bin") + ylab("Density") + geom_density(aes(y = ..density..),fill=NA, colour="blue") +
scale_x_continuous(breaks=seq(1,1700,by=100))
tix_hist + opts(
title = "Ticket Density To-Date",
plot.title = theme_text(face="bold", size=18),
axis.title.x = theme_text(face="bold", size=16),
axis.title.y = theme_text(face="bold", size=14, angle=90),
axis.text.x = theme_text(face="bold", size=14),
axis.text.y = theme_text(face="bold", size=14)
)
I thought about extrapolating the count values using the KDE bandwidth, etc. Is it possible to put the numeric output of a ggplot frequency histogram into a data frame and add this as a 'layer'? I'm not savvy with the layer() function yet, but any ideas would be helpful. Many thanks!
If you want the y-axis to show the bin count and, at the same time, add a density curve to the histogram,
you can use geom_histogram() first and record the binwidth value (this is very important!), then add a geom_density() layer to show the fitted curve.
If you don't know how to choose the binwidth value, you can calculate it:
my_binwidth = (max(tix$Tix_Cnt)-min(tix$Tix_Cnt))/30;
(This is essentially what geom_histogram does by default, since it uses 30 bins.)
The code is given below:
(suppose the binwidth value you just calculated is 0.001)
tix_hist <- ggplot(tix, aes(x=Tix_Cnt)) ;
tix_hist<- tix_hist + geom_histogram(aes(y=..count..),colour="blue",fill="white",binwidth=0.001);
tix_hist<- tix_hist + geom_density(aes(y=0.001*..count..),alpha=0.2,fill="#FF6666",adjust=4);
print(tix_hist);
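If the y-axis should stay on the density scale, as in the question, the counts can be drawn as text by running the same binning a second time with a text geom. This is a sketch under assumptions: the tix data below is simulated, and after_stat() requires ggplot2 3.3.0 or later (older versions would use ..density.. and ..count..).

```r
library(ggplot2)

set.seed(123)
# Hypothetical stand-in for the ticket data
tix <- data.frame(Tix_Cnt = round(rgamma(500, shape = 2, scale = 150)))

bw <- 50  # one binwidth shared by the bars and the labels

p <- ggplot(tix, aes(x = Tix_Cnt)) +
  geom_histogram(aes(y = after_stat(density)),
                 binwidth = bw, colour = "black", fill = "orange") +
  # Same binning again, drawn as text: height from density, label from count
  stat_bin(aes(y = after_stat(density), label = after_stat(count)),
           geom = "text", binwidth = bw, vjust = -0.5, size = 3) +
  geom_density(colour = "blue") +
  xlab("Bin") + ylab("Density")
p
```

Both layers must use the same binwidth, otherwise the labels will not line up with the bars.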
I have a question concerning the legend in ggplot2.
Say I have a hypothetical dataset about mean carrot length for two different colours at two farms:
carrots<-NULL
carrots$Farm<-rep(c("X","Y"),2)
carrots$Type<-rep(c("Orange","Purple"),each=2)
carrots$MeanLength<-c(10,6,4,2)
carrots<-data.frame(carrots)
I make a simple bar plot:
require(ggplot2)
p<-ggplot(carrots,aes(y=MeanLength,x=Farm,fill=Type)) +
geom_bar(position="dodge") +
opts(legend.position="top")
p
My question is: is there a way to remove the title ('Type') from the legend?
Thanks!
I found that the best option is to use + theme(legend.title = element_blank()) as user "gkcn" noted.
For me (on 03/26/15) using the previously suggested labs(fill="") and scale_fill_discrete("") remove one title, only to add in another legend, which is not useful.
You can modify the legend title by passing it as the first parameter to a scale. For example:
ggplot(carrots, aes(y=MeanLength, x=Farm, fill=Type)) +
geom_bar(stat="identity", position="dodge") +
theme(legend.position="top", legend.direction="horizontal") +
scale_fill_discrete("")
There is also a shortcut for this, i.e. labs(fill="")
Since your legend is at the top of the chart, you may also wish to modify the legend orientation. You can do this using theme(legend.direction="horizontal").
You can use labs:
p + labs(fill="")
The only way that worked for me was using legend.title = theme_blank(); I think it is the most convenient variant compared with labs(fill="") and scale_fill_discrete(""), which could also be useful in some cases.
ggplot(carrots,aes(y=MeanLength,x=Farm,fill=Type)) +
geom_bar(position="dodge") +
opts(
legend.position="top",
legend.direction="horizontal",
legend.title = theme_blank()
)
P.S. There are more useful options in the documentation.
You've got two good options already, so here's another using scale_fill_manual(). Note this also lets you specify the colors of the bars easily:
ggplot(carrots,aes(y=MeanLength,x=Farm,fill=Type)) +
geom_bar(position="dodge") +
opts(legend.position="top") +
scale_fill_manual(name = "", values = c("Orange" = "orange", "Purple" = "purple"))
If you are using the up-to-date (as of January 2015) version of ggplot2 (version 1.0), then the following should work:
ggplot(carrots, aes(y = MeanLength, x = Farm, fill = Type)) +
geom_bar(stat = "identity", position = "dodge") +
theme(legend.position="top") +
scale_fill_manual(name = "", values = c("Orange" = "orange", "Purple" = "purple"))
@pascal's solution in a comment, setting the name argument of a scale function such as scale_fill_discrete to NULL, is the best option for me. It removes the title together with the blank space that would remain if you used "", while still letting you remove titles selectively, which is not possible with the theme(legend.title = element_blank()) approach.
Since it is buried in a comment, I am posting it as an answer to potentially increase its visibility, with kudos to @pascal.
TL;DR (for the copy-pasters):
scale_fill_discrete(name = NULL)
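For reference, a self-contained sketch comparing the two approaches discussed in this thread, assuming a recent ggplot2 where geom_col() is available (it stands in for geom_bar(stat = "identity")):

```r
library(ggplot2)

carrots <- data.frame(
  Farm       = rep(c("X", "Y"), 2),
  Type       = rep(c("Orange", "Purple"), each = 2),
  MeanLength = c(10, 6, 4, 2)
)

p <- ggplot(carrots, aes(x = Farm, y = MeanLength, fill = Type)) +
  geom_col(position = "dodge") +
  theme(legend.position = "top")

# Drops the title of the fill legend only
p_scale <- p + scale_fill_discrete(name = NULL)

# Blanks the title of every legend in the plot
p_theme <- p + theme(legend.title = element_blank())

p_scale
```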