removing a layer legend in ggplot - r

Another ggplot legend question!
I have a dataset of the form
test <- data.frame(
cond = factor(rep(c("A", "B"), each=200)),
value = c(rnorm(200), rnorm(200, mean=0.8))
)
So there are two groups with some values, and I want to plot the density. I also want to add a line indicating the mean for each group to the plot, so I compute the group means first:
library(plyr)  # provides ddply()
test.cdf <- ddply(test, .(cond), summarise, value.mean=mean(value))
Then in the ggplot call:
ggplot(test, aes(value, fill=cond)) +
geom_density(alpha=0.5) +
labs(x='Energy', y='Density', fill='Group') +
opts(
panel.background=theme_blank(),
panel.grid.major=theme_blank(),
panel.grid.minor=theme_blank(),
panel.border=theme_blank(),
axis.line=theme_segment()
) +
geom_vline(data=test.cdf, aes(xintercept=value.mean, colour=cond),
linetype='dashed', size=1)
If you run the above code, you get a legend indicating each group, but also one for the mean indicator vline. My question is how can I get rid of the legend for the geom_vline()?

Whether you get this problem depends on the version of ggplot2 you are using. With ggplot2 version 0.9.0 on R 2.14.1, the resulting graph does not include the legend for the vline. In this version of ggplot2 you can control whether the legend appears using show_guide:
ggplot(test, aes(value, fill=cond)) +
geom_density(alpha=0.5) +
labs(x='Energy', y='Density', fill='Group') +
opts(
panel.background=theme_blank(),
panel.grid.major=theme_blank(),
panel.grid.minor=theme_blank(),
panel.border=theme_blank(),
axis.line=theme_segment()
) +
geom_vline(data=test.cdf, aes(xintercept=value.mean, colour=cond),
linetype='dashed', size=1, show_guide = TRUE)
which reproduces your problem. By default, show_guide = FALSE. In older versions, you can add legend = FALSE to geom_vline in order to omit the legend. Adding legend = FALSE still works in the current version, but it throws a warning:
Warning message:
In get(x, envir = this, inherits = inh)(this, ...) :
"legend" argument in geom_XXX and stat_XXX is deprecated. Use show_guide = TRUE or show_guide = FALSE for display or suppress the guide display.
I would recommend upgrading ggplot2.
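For readers on a current ggplot2 release, the same idea might look like the sketch below. It assumes a recent ggplot2 in which show_guide has been replaced by show.legend, and opts(), theme_blank() and theme_segment() by theme(), element_blank() and element_line(). The group means are computed with base R's aggregate() instead of ddply() purely to keep the example self-contained.

```r
library(ggplot2)

set.seed(1)  # only so the example is reproducible
test <- data.frame(
  cond  = factor(rep(c("A", "B"), each = 200)),
  value = c(rnorm(200), rnorm(200, mean = 0.8))
)

# One mean per group (base R stand-in for the ddply() call)
test.cdf <- aggregate(value ~ cond, data = test, FUN = mean)
names(test.cdf)[names(test.cdf) == "value"] <- "value.mean"

p <- ggplot(test, aes(value, fill = cond)) +
  geom_density(alpha = 0.5) +
  labs(x = "Energy", y = "Density", fill = "Group") +
  theme(
    panel.background = element_blank(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    panel.border     = element_blank(),
    axis.line        = element_line()
  ) +
  # show.legend = FALSE keeps this layer out of the legend
  geom_vline(data = test.cdf, aes(xintercept = value.mean, colour = cond),
             linetype = "dashed", show.legend = FALSE)
```

Calling print(p) draws the plot; only the fill legend for the two groups should remain.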

Related

Arranging plots using grid.arrange R

I have six plots obtained with ggplot2 for normality analysis: 2 histograms, 2 qqplots and 2 boxplots.
I want to display them together ordered by type of plot: so the histograms in the first row, the qqplots in the second row and the boxplots in the third row. For this I use the grid.arrange function from gridExtra package as follows:
grid.arrange(grobs= list(plot1, plot2, qqplot1, qqplot2, boxplot1, boxplot2),
ncol=2, nrow=3,
top = ("Histograms + Quantile Graphics + Boxplots"))
But this error message pops up:
Error: stat_bin() requires an x or y aesthetic.
Any idea how to solve this?
As people said in the comments, the error was in the aes() of one of the plots. The confusion arose because R lets you create a plot object even when it cannot be drawn, I guess because it can be modified later. This is the code for the plot:
ggplot(data = mtcars, aes(sample=mtcars$mpg)) +
geom_histogram(aes(y = ..density.., fill = ..count..), binwidth = 1) +
geom_density(alpha=.2) +
scale_fill_gradient(low = "#6ACE78", high = "#0D851D") +
stat_function(fun = dnorm, colour = "firebrick",
args = list(mean = mean(mtcars$mpg),
sd = sd(mtcars$mpg))) +
labs(x = "Tiempo de seguimiento", y = "")+
theme_bw()
As you can see, the mistake is the first aes() argument, as I wrote sample= instead of x=. Already solved.
Thanks
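The behaviour described above, where a plot object is created without complaint and only fails when it is drawn, can be illustrated with a small sketch. It assumes ggplot2 3.3.0 or later (for after_stat()); the object names broken, fixed and result are made up for the example.

```r
library(ggplot2)

# Creating the object succeeds even though the histogram has no x aesthetic
broken <- ggplot(mtcars, aes(sample = mpg)) +
  geom_histogram(binwidth = 1)

# The error only surfaces when the plot is built, which is what
# print() and grid.arrange() trigger behind the scenes
result <- tryCatch(
  { ggplot_build(broken); "built without error" },
  error = function(e) "error at build time"
)

# Mapping x instead of sample gives a plot that builds cleanly
fixed <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 1)
```

Printing each plot on its own before passing it to grid.arrange() is a quick way to find out which one is at fault.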

R ggplot2 could not add legend to graph

I'm using Visual Studio with R version 3.5.1, and I tried to add a legend to a graph.
library(pracma)  # assumed source of cumtrapz()
f1 = function(x) {
return(x+1)}
x1 = seq(0, 1, by = 0.01)
data1 = data.frame(x1 = x1, f1 = f1(x1), F1 = cumtrapz(x1, f1(x1)) )
However, when I tried to plot it, it never gave me a legend!
For example, I used the same code as in this question (Missing legend with ggplot2 and geom_line):
ggplot(data = data1, aes(x1)) +
geom_line(aes(y = f1), color = "1") +
geom_line(aes(y = F1), color = "2") +
scale_color_manual(values = c("red", "blue"))
I also looked into (How to add legend to ggplot manually? - R) and many other pages on Stack Overflow, and I have tried every single function in https://www.rstudio.com/wp-content/uploads/2016/11/ggplot2-cheatsheet-2.1.pdf
i.e.
theme(legend.position = "bottom")
scale_fill_discrete(...)
group
guides()
show.legend=TRUE
I even tried to use the original plot() and legend() function. Neither worked.
I thought there might be something wrong with the data frame, but when I split the columns (x2, f1, F1) apart, it still didn't work.
I thought there might be something wrong with the IDE, but the code given by kohske actually plotted a legend!
d<-data.frame(x=1:5, y1=1:5, y2=2:6)
ggplot(d, aes(x)) +
geom_line(aes(y=y1, colour="1")) +
geom_line(aes(y=y2, colour="2")) +
scale_colour_manual(values=c("red", "blue"))
What's wrong with the code?
As far as I know, you only have X and Y variables in your aesthetics; the colours are set as constants outside aes(). Therefore no legend is drawn. You have xlab and ylab to describe your two lines. If you want to have legends, you should put the grouping in the aesthetics, which might require recoding your dataset:
d<- data.frame(x=c(1:5, 1:5), y=c(1:5, 2:6), colorGroup = c(rep("redGroup", 5),
rep("blueGroup", 5)))
ggplot(d, aes(x, y, color = colorGroup )) + geom_line()
This should give you two lines and a legend.
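Applied to the data from the question, the same principle might look like the sketch below. The only change that matters is that colour moves inside aes(), as in kohske's working example. F1 is written in closed form here rather than with cumtrapz() so that the sketch has no extra dependency, and the labels "f1", "F1" and the legend title "Curve" are illustrative choices.

```r
library(ggplot2)

x1 <- seq(0, 1, by = 0.01)
# f1(x) = x + 1 and its antiderivative x^2/2 + x (closed form, standing in
# for the numerical cumtrapz() result used in the question)
data1 <- data.frame(x1 = x1, f1 = x1 + 1, F1 = x1^2 / 2 + x1)

p <- ggplot(data1, aes(x1)) +
  # colour is mapped inside aes(), so ggplot2 creates a scale and a legend
  geom_line(aes(y = f1, colour = "f1")) +
  geom_line(aes(y = F1, colour = "F1")) +
  scale_colour_manual(name = "Curve", values = c(f1 = "red", F1 = "blue"))
```

Naming the entries of values ties each colour to its label explicitly, so the result does not depend on the alphabetical order of the labels.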

Non-missing rows were removed in geom_point in ggplot

Why are the rows in this data reported as missing and removed from the plot even though the x-scale isn't out of range? I have tried to include xlim without success. What am I missing here? In the resulting figure, Gp2 (geom_point) isn't included in the plot. The code I used is as follows:
df1 <- data.frame(x=c(2,4:8),
y=c(1.030928,4.123711,3.092784,8.247423,9.278351,4.123711))
df2 <- data.frame(x=3:8,
y=c(1.700680,1.360544,4.081633,3.401361,3.061224,9.183673))
require(ggplot2)
ggplot(NULL, aes(x=x, y=y)) +
geom_bar(data = df1, aes(fill="Gp1", shape="Gp1"),
stat= "identity") +
geom_point(data = df2, stat= "identity", size = 5,
aes(shape="Gp2", fill="Gp2")) +
ylab("%") + xlab("grades") +
ggtitle("Test figure") +
scale_shape_manual(values = c(23, NA)) +
scale_fill_manual(values = c("#6699CC","#000099")) +
guides(fill = guide_legend(reverse = TRUE),
shape = guide_legend(override.aes = list(shape=0), reverse = TRUE))
This gives warning message:
Removed 6 rows containing missing values (geom_point).
Running your code piece by piece, we can easily find that scale_shape_manual is the culprit here; everything before that works fine. (If you had made a minimal example, you would have easily found that.)
You have told ggplot that all the shapes for geom_point should be Gp2, which is the second shape level you have mapped. So it will look at the second entry in values and find an NA there. In other words, you yourself told ggplot to give NA shapes to all points.
(Note that you mapped shape Gp1 in geom_bar, but geom_bar doesn't take that aesthetic.)
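One possible way to avoid this kind of positional mix-up is to name the entries of values, so each shape is tied to its label rather than to its position. The sketch below is a simplified take on the question's plot under that assumption: shape is mapped only in the point layer, the point fill is set as a constant, and geom_col() stands in for geom_bar(stat = "identity").

```r
library(ggplot2)

df1 <- data.frame(x = c(2, 4:8),
                  y = c(1.030928, 4.123711, 3.092784, 8.247423, 9.278351, 4.123711))
df2 <- data.frame(x = 3:8,
                  y = c(1.700680, 1.360544, 4.081633, 3.401361, 3.061224, 9.183673))

p <- ggplot(NULL, aes(x = x, y = y)) +
  geom_col(data = df1, aes(fill = "Gp1")) +
  # shape is mapped only here, where the geom actually uses it
  geom_point(data = df2, aes(shape = "Gp2"), size = 5, fill = "#000099") +
  # named values: "Gp2" gets shape 23 regardless of level order
  scale_shape_manual(values = c(Gp2 = 23)) +
  scale_fill_manual(values = c(Gp1 = "#6699CC")) +
  ylab("%") + xlab("grades") + ggtitle("Test figure")
```

With the shape resolved to 23 for every point, none of the six rows in df2 should be dropped.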

Creating a density histogram in ggplot2?

I want to create the following histogram density plot with ggplot2. In the "normal" way (base packages) it is really easy:
set.seed(46)
vector <- rnorm(500)
breaks <- quantile(vector,seq(0,1,by=0.1))
labels = 1:(length(breaks)-1)
den = density(vector)
hist(vector,
breaks=breaks,
col=rainbow(length(breaks)),
probability=TRUE)
lines(den)
With ggplot I have reached this so far:
seg <- cut(vector,breaks,
labels=labels,
include.lowest = TRUE, right = TRUE)
df = data.frame(vector=vector,seg=seg)
ggplot(df) +
geom_histogram(breaks=breaks,
aes(x=vector,
y=..density..,
fill=seg)) +
geom_density(aes(x=vector,
y=..density..))
But the "y" scale has the wrong dimension. I have noticed that the following version, without fill=seg, gets the "y" scale right:
ggplot(df) +
geom_histogram(breaks=breaks,
aes(x=vector,
y=..density..)) +
geom_density(aes(x=vector,
y=..density..))
I just do not understand it. y=..density.. is there, and that should be the height. So why on earth does my scale get modified when I try to fill it?
I do need the colours. I just want a histogram where the breaks and the colours of each block are directionally set according to the default ggplot fill colours.
Manually, I added colors to your percentile bars. See if this works for you.
library(ggplot2)
ggplot(df, aes(x=vector)) +
geom_histogram(breaks=breaks,aes(y=..density..),colour="black",fill=c("red","orange","yellow","lightgreen","green","darkgreen","blue","darkblue","purple","pink")) +
geom_density(aes(y=..density..)) +
scale_x_continuous(breaks=c(-3,-2,-1,0,1,2,3)) +
ylab("Density") + xlab("df$vector") + ggtitle("Histogram of df$vector") +
theme_bw() + theme(plot.title=element_text(size=20),
axis.title.y=element_text(size = 16, vjust=+0.2),
axis.title.x=element_text(size = 16, vjust=-0.2),
axis.text.y=element_text(size = 14),
axis.text.x=element_text(size = 14),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank())
fill=seg results in grouping. You are actually getting a different histogram for each value of seg. If you don't need the colours, you could use this:
ggplot(df) +
geom_histogram(breaks=breaks,aes(x=vector,y=..density..), position="identity") +
geom_density(aes(x=vector,y=..density..))
If you need the colours, it might be easiest to calculate the density values outside of ggplot2.
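A minimal sketch of that last suggestion, assuming base R's hist(..., plot = FALSE) is an acceptable way to get the per-bin densities: the bins are then drawn as rectangles, so the fill no longer affects how the density is computed. The data frame name bins and its column names are made up for the example.

```r
library(ggplot2)

set.seed(46)
vector <- rnorm(500)
breaks <- quantile(vector, seq(0, 1, by = 0.1))

# hist() with plot = FALSE returns the density of each bin for the given breaks
h   <- hist(vector, breaks = breaks, plot = FALSE)
brk <- unname(h$breaks)

bins <- data.frame(
  xmin    = head(brk, -1),
  xmax    = tail(brk, -1),
  density = h$density,
  seg     = factor(seq_along(h$density))
)

p <- ggplot(bins) +
  # one rectangle per bin, coloured by bin index
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = 0, ymax = density, fill = seg),
            colour = "black") +
  # kernel density of the raw values drawn on top
  geom_density(data = data.frame(vector = vector), aes(x = vector))
```

Because the densities are computed once over the whole sample, the bar areas sum to 1 and the y scale matches the base hist() version.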
Or an option with ggpubr
library(ggpubr)
gghistogram(df, x = "vector", add = "mean", rug = TRUE, fill = "seg",
palette = c("#00AFBB", "#E7B800", "#E5A800", "#00BFAB", "#01ADFA",
"#00FABA", "#00BEAF", "#01AEBF", "#00EABA", "#00EABB"), add_density = TRUE)
The confusion in interpreting the y-axis might be because density is plotted rather than count. The height of each bar is the proportion of the sample in that bin divided by the bin width, so it is the areas of the bars, not their heights, that sum to 1.

ggplot2: Multiple color scales or shift colors systematically on different layers?

When I make box plots, I like to also show the raw data in the background, like this:
library(ggplot2)
library(RColorBrewer)
cols = brewer.pal(9, 'Set1')
n=10000
dat = data.frame(value=rnorm(n, 1:4), group=factor(1:4))
ggplot(dat, aes(x=group, y=value, color=group, group=group)) +
geom_point(position=position_jitter(width=0.3), alpha=0.1) +
scale_color_manual(values=cols) +
geom_boxplot(fill=0, outlier.size=0)
However, I don't like how my box plots completely disappear when the points get too dense. I know I can adjust alpha, which is fine in some cases, but not when my groups have varying densities (for example, when the lightest group would completely disappear if I were to decrease alpha enough so that the darkest group doesn't obscure the box plot). What I'm trying to do is systematically shift the colors for the box plots, making them a bit darker perhaps, so that they show up even when the background points max out the alpha. For example:
plot(1:9, rep(1, 9), pch=19, cex=2, col=cols)
cols_dk = rgb2hsv(col2rgb(brewer.pal(9, 'Set1'))) - c(0, 0, 0.2)
cols_dk = hsv(cols_dk[1,], cols_dk[2,], cols_dk[3,])
points(1:9, rep(1.2, 9), pch=19, cex=2, col=cols_dk)
So far I haven't found a way to fake in a different scale_color for the geom_boxplot layer (which would seem the simplest route if there's a way to do it). Nor have I been able to find a simple syntax to systematically adjust the colors the same way you can easily offset a continuous aesthetic like aes(x=x+1).
The closest thing I've been able to get is to completely duplicate the levels of the factor...
ggplot(dat, aes(x=group, y=value, color=group, group=group)) +
geom_point(position=position_jitter(width=0.3), alpha=0.1) +
scale_color_manual(values=c(cols[1:4], cols_dk[1:4])) +
geom_boxplot(aes(color=factor(as.numeric(group)+4)), fill=0, outlier.size=0)
but then I have to deal with that ugly legend. Any better ideas?
Late answer added Nov 2012:
Since some of these terrific answers require older ggplot2 versions and people are still referring to this page, I'll update it with the ridiculously simple solution that I've been using with ggplot2 0.9.0+.
We just add a second geom_boxplot layer that is identical to the first one except we assign a constant color using scales::alpha() so the first boxplot shows through.
library(scales) # for alpha function
ggplot(dat, aes(x=group, y=value, color=group, group=group)) +
geom_point(position=position_jitter(width=0.3), alpha=0.2) +
geom_boxplot(size=1.4,fill=0, outlier.size=0)+
geom_boxplot(size=1.4,fill=0, outlier.size=0, color=alpha("black",0.3))
edit: TobiO points out that fill=0 has stopped working. Instead, fill=NA or alpha=0 can be substituted. This seems to be due to a change in col2rgb() starting in R 3.0.0.
For now, you could define your own version of GeomBoxplot (calling it, say, GeomBoxplotDark), differing from the original only in that it first 'darkens' the colors before plotting them.
With proto, you can do this by creating a proto object, GeomBoxplotDark, that inherits from GeomBoxplot, and differs only in its draw function. Most of the draw function's definition is taken from the GeomBoxplot sources; I have annotated the lines I changed with comments like this # ** ... **:
require(ggplot2)
GeomBoxplotDark <- proto(ggplot2:::GeomBoxplot,
draw <- function(., data, ..., outlier.colour = "black", outlier.shape = 16, outlier.size = 2) {
defaults <- with(data, { # ** OPENING "{" ADDED **
cols_dk <- rgb2hsv(col2rgb(colour)) - c(0, 0, 0.2) # ** LINE ADDED **
cols_dk <- hsv(cols_dk[1,], cols_dk[2,], cols_dk[3,]) # ** LINE ADDED **
data.frame(x = x, xmin = xmin, xmax = xmax,
colour = cols_dk, # ** EDITED, PASSING IN cols_dk **
size = size,
linetype = 1, group = 1, alpha = 1,
fill = alpha(fill, alpha),
stringsAsFactors = FALSE
)}) # ** CLOSING "}" ADDED **
defaults2 <- defaults[c(1,1), ]
if (!is.null(data$outliers) && length(data$outliers[[1]] >= 1)) {
outliers_grob <- with(data,
GeomPoint$draw(data.frame(
y = outliers[[1]], x = x[rep(1, length(outliers[[1]]))],
colour=I(outlier.colour), shape = outlier.shape, alpha = 1,
size = outlier.size, fill = NA), ...
)
)
} else {
outliers_grob <- NULL
}
with(data, ggname(.$my_name(), grobTree(
outliers_grob,
GeomPath$draw(data.frame(y=c(upper, ymax), defaults2), ...),
GeomPath$draw(data.frame(y=c(lower, ymin), defaults2), ...),
GeomRect$draw(data.frame(ymax = upper, ymin = lower, defaults), ...),
GeomRect$draw(data.frame(ymax = middle, ymin = middle, defaults), ...)
)))
}
)
Then create a geom_boxplot_dark() to be called by the user, and which appropriately wraps the call to GeomBoxplotDark$new():
geom_boxplot_dark <- function (mapping = NULL, data = NULL, stat = "boxplot", position = "dodge",
outlier.colour = "black", outlier.shape = 16, outlier.size = 2,
...)
GeomBoxplotDark$new(mapping = mapping, data = data, stat = stat,
position = position, outlier.colour = outlier.colour, outlier.shape = outlier.shape,
outlier.size = outlier.size, ...)
Finally, try it out with code almost identical to your original call, just substituting a call to geom_boxplot_dark() for the call to geom_boxplot():
library(ggplot2)
library(RColorBrewer)
cols = brewer.pal(9, 'Set1')
n=10000
dat = data.frame(value=rnorm(n, 1:4), group=factor(1:4))
ggplot(dat, aes(x=group, y=value, color=group, group=group)) +
geom_point(position=position_jitter(width=0.3), alpha=0.1) +
scale_color_manual(values=cols) +
geom_boxplot_dark(fill=0, outlier.size=0)
I think the resulting plot looks pretty nifty. With a bit of tweaking, and viewed directly (not as an uploaded file), it'll look awesome.
You can hack the legend grob, but it seems difficult to place it.
g = ggplotGrob(p)
grid.draw(g)
legend = editGrob(getGrob(g, gPath("guide-box","guide"), grep=TRUE), vp=viewport())
new = removeGrob(legend, gPath("-7|-8|-9|-10"), grep=TRUE, glob=T)
## grid.set(gPath("guide-box"), legend, grep=TRUE) # fails for some reason
grid.remove(gPath("guide-box"), grep=TRUE, global=TRUE)
grid.draw(editGrob(new, vp=viewport(x=unit(1.4,"npc"), y=unit(0.1,"npc"))))
The ggplot2 syntax seems to have changed, and since it took me a while to figure out, here is what I found: fill=0 no longer has any effect (for me, at least). It has to be changed to alpha=0 in order to make the box transparent:
library(scales) # for alpha function
ggplot(dat, aes(x=group, y=value, color=group, group=group)) +
geom_point(position=position_jitter(width=0.3), alpha=0.2) +
geom_boxplot(size=1.4,alpha=0, outlier.size=0)+
geom_boxplot(size=1.4,alpha=0, outlier.size=0, color=alpha("black",0.3))
edit: I just found out that changing fill=0 to fill=NA also does the trick.
This has been implemented in ggplot2 3.3.0 (released 2020-03):
The new stage function allows you to control aesthetics after mapping of the data by a stat or a scale:
ggplot(dat, aes(x=group, y=value, color=group, group=group)) +
geom_point(position=position_jitter(width=0.3), alpha=0.1) +
scale_color_manual(values=cols) +
geom_boxplot(aes(color=stage(start=group, after_scale = colorspace::darken(color, 0.1))), fill=NA, outlier.size=0)
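If the colorspace package is not available, the HSV shift from the question can be plugged into after_scale instead. The sketch below assumes ggplot2 3.3.0 or later; darken_hsv() is a hypothetical helper written for this example, not part of any package, and it clamps the value channel at 0 so that dark colours do not make hsv() fail.

```r
library(ggplot2)

# Hypothetical helper: lower the HSV value channel by `amount`, clamped at 0
darken_hsv <- function(cols, amount = 0.2) {
  x <- rgb2hsv(col2rgb(cols))
  hsv(x["h", ], x["s", ], pmax(x["v", ] - amount, 0))
}

set.seed(1)
dat <- data.frame(value = rnorm(400, 1:4), group = factor(1:4))

p <- ggplot(dat, aes(x = group, y = value, colour = group)) +
  geom_point(position = position_jitter(width = 0.3), alpha = 0.1) +
  # the colour scale maps group to a colour first; after_scale then darkens it
  geom_boxplot(aes(colour = stage(group, after_scale = darken_hsv(colour))),
               fill = NA, outlier.shape = NA)
```

Since both layers still share one colour scale, there is a single legend and no duplicated factor levels to clean up.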
