Why does geom_jitter replicate black points when we use aes colour?

I came across this issue while analyzing my data, and I was able to reproduce it with an example from the official ggplot2 reference.
This code creates black points that seem to be the original points, before the jitter was applied with colours:
ggplot(mpg, aes(cyl, hwy)) +
geom_point() +
geom_jitter(aes(colour = class))
However, this code works fine and doesn't show the black points:
p <- ggplot(mpg, aes(cyl, hwy))
p + geom_point()
p + geom_jitter(aes(colour = class))
I was thinking it may be related to geom_point printing the black dots before geom_jitter, but if this is the case, why does it work fine in the second example, which follows the same order?
This is the image of the black points

geom_jitter is merely a convenience function: it is a shortcut for geom_point(position = "jitter"). Because of this, your use of ggplot(...) + geom_point() + geom_jitter(aes(colour = class)) is actually the same as
ggplot(ggplot2::mpg, aes(cyl, hwy)) +
geom_point() +
geom_point(aes(color = class), position = "jitter")
which plots every observation twice: once at its original position (in black) and once jittered (coloured by class). You can make this easier to see by changing the colour of the original points:
ggplot(ggplot2::mpg, aes(cyl, hwy)) +
geom_point(color = "red") +
geom_jitter(aes(color = class))
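If you want jittered points, use either geom_point(position = "jitter") or geom_jitter(), not both. Your second example shows no black points because p + geom_point() and p + geom_jitter(aes(colour = class)) are two separate plots, and the second one contains only the jittered layer.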
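The equivalence described above can be checked without drawing anything. The following is a minimal sketch, assuming a reasonably recent ggplot2 in which layer objects expose `geom` and `position` fields; the variable names are illustrative only.

```r
library(ggplot2)

# Build (but do not draw) one layer of each kind.
jitter_layer <- geom_jitter(aes(colour = class))
point_layer  <- geom_point(aes(colour = class), position = "jitter")

# Both layers use the point geom and the jitter position adjustment.
same_geom <- inherits(jitter_layer$geom, "GeomPoint") &&
  inherits(point_layer$geom, "GeomPoint")
same_position <- inherits(jitter_layer$position, "PositionJitter") &&
  inherits(point_layer$position, "PositionJitter")

# The plot from the question has two layers, so every row is drawn twice.
p_both <- ggplot(mpg, aes(cyl, hwy)) +
  geom_point() +
  geom_jitter(aes(colour = class))

# Dropping geom_point() leaves a single, jittered layer.
p_jitter_only <- ggplot(mpg, aes(cyl, hwy)) +
  geom_jitter(aes(colour = class))

n_layers_both   <- length(p_both$layers)
n_layers_jitter <- length(p_jitter_only$layers)
```

Printing `p_jitter_only` should show only the coloured, jittered points.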

Related

Different colours for facet_grid strip backgrounds

I would like to change the colours of the strip backgrounds to a predefined order.
This code generates the plot, and changes the strip backgrounds to red:
p <- ggplot(mpg, aes(displ, cty)) + geom_point() + facet_grid(. ~ cyl) +
theme(strip.background = element_rect(fill="red"))
However, I'd like to do something like the code below, which ideally would specify a different colour for each strip:
p <- ggplot(mpg, aes(displ, cty)) + geom_point() + facet_grid(. ~ cyl) +
theme(strip.background = element_rect(fill=c("red","green","blue","yellow")))
Which just makes them all red...
This was asked in similar questions years ago, the answer was to manipulate grobs. I was hoping that there was a simpler solution in the years since?

Smooth line to the full dataset (faceting) ggplot2

I want to add a smooth line fitted to the full dataset to each facet.
However, the following code adds a different smooth line to each facet, fitted only to that facet's data.
ggplot(mpg2,aes(displ,hwy)) + geom_point() + facet_wrap(~class) + geom_smooth(se = FALSE)
How can I fix that?
You can add the geom_smooth layer using a dataset that doesn't contain the faceting variable, so remove class from the data passed to that layer. Here class is the 11th and last column of mpg, so mpg[,1:10] drops it.
ggplot(mpg, aes(displ, hwy)) +
geom_point() +
facet_wrap(~class) +
geom_smooth(data = mpg[,1:10], se = FALSE)
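Dropping the column by position works for mpg, but it depends on the column order. As a sketch of the same idea, the faceting variable can be removed by name instead; `mpg_no_class` is a hypothetical helper name, not something from the original answer.

```r
library(ggplot2)

# Remove the faceting variable by name rather than by position.
mpg_no_class <- mpg[, setdiff(names(mpg), "class")]

# The smooth layer has no `class` column, so it cannot be split by facet
# and the same full-data curve is repeated in every panel.
p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(~class) +
  geom_smooth(data = mpg_no_class, se = FALSE)
```

Printing `p` should show the per-class points in each panel with one identical curve over all of them.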

How to place multiple boxplots in the same column with ggplot(geom_boxplot)

I would like to build a boxplot in which the 4 factors (N1:N4) are overlaid in the same column. For example with the following data:
Q<-c("C1","C1","C2","C3","C3","C1","C1","C2","C2","C3","C3","Q1","Q1","Q1","Q1","Q3","Q3","Q4","Q4","Q1","Q1","Q1","Q1","Q3","Q3","Q4","Q4")
N<-c("N2","N3","N3","N2","N3","N2","N3","N2","N3","N2","N3","N0","N1","N2","N3","N1","N3","N0","N1","N0","N1","N2","N3","N1","N3","N0","N1")
Value<-c(4.7,8.61,8.34,5.89,8.36,1.76,2.4,5.01,2.12,1.88,3.01,2.4,7.28,4.34,5.39,11.61,10.14,3.02,9.45,8.8,7.4,6.93,8.44,7.37,7.81,6.74,8.5)
# Build the data frame only after N and Value have been defined
df<-data.frame(N=N,Value=Value)
With the following (usual) code, the output is 4 boxplots displayed in 4 columns for the 4 variables:
ggplot(df, aes(x=N, y=Value,color=N)) + theme_bw(base_size = 20)+ geom_boxplot()
many thanks
Updated Answer
Based on your comment, here's a way to add marginal boxplots. We'll use the built-in mtcars data frame.
First, some set-up:
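library(ggplot2)  # recent cowplot versions no longer attach ggplot2
library(cowplot)
# Common theme elements
thm = list(theme_bw(),
guides(colour="none", fill="none"),  # "none" replaces the deprecated FALSE
theme(plot.margin=unit(rep(0,4),"lines")))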
Now, create the three plots:
# Main plot
p1 = ggplot(mtcars, aes(wt, mpg, colour=factor(cyl), fill=factor(cyl))) +
geom_smooth(method="lm") + labs(colour="Cyl", fill="Cyl") +
scale_y_continuous(limits=c(10,35)) +
thm[-2] +
theme(legend.position = c(0.85,0.8))
# Top margin plot
p2 = ggplot(mtcars, aes(factor(cyl), wt, colour=factor(cyl))) +
geom_boxplot() + thm + coord_flip() + labs(x="Cyl", y="")
# Right margin plot
p3 = ggplot(mtcars, aes(factor(cyl), mpg, colour=factor(cyl))) +
geom_boxplot() + thm + labs(x="Cyl", y="") +
scale_y_continuous(limits=c(10,35))
Lay out the plots and add the legend:
plot_grid(plotlist=list(p2, ggplot(), p1, p3), ncol=2,
rel_widths=c(5,1), rel_heights=c(1,5), align="hv")
Original Answer
You can overlay all four boxplots in a single column, but the plot will be unreadable. The first example below removes N as the x coordinate, but keeps N as the colour aesthetic. This results in the four levels of N being plotted at a single tick mark (which I've removed by setting breaks to NULL). However, the plots are still dodged. To plot them one on top of the other, set the dodge width to zero, as I've done in the second example. However, the plots are not readable when they are overlaid.
ggplot(df, aes(x="", y=Value,color=N)) +
theme_bw(base_size = 20) +
geom_boxplot() +
scale_x_discrete(breaks=NULL) +
labs(x="")
ggplot(df, aes(x="", y=Value,color=N)) +
theme_bw(base_size = 20) +
geom_boxplot(position=position_dodge(0)) +
scale_x_discrete(breaks=NULL) +
labs(x="")

Incorrect linetype in geom_vline with legend in R ggplot2

I am trying to display the xvar median as a dotted line & show it in the legend. Here's my code:
require(ggplot2)
require(scales)
medians_mtcars <- data.frame("wt.median"=median(mtcars$wt))
# legend shows but linetype is wrong (solid)
p <- ggplot(mtcars, aes(wt, mpg))
p <- p + geom_point()
p <- p + geom_vline(aes(xintercept=wt.median, linetype="dotted"),
data=medians_mtcars, show_guide=TRUE)
p
I also tried:
# linetype is correct but legend does not show
p <- ggplot(mtcars, aes(wt, mpg))
p <- p + geom_point()
p <- p + geom_vline(aes(xintercept=wt.median),
data=medians_mtcars, show_guide=TRUE, linetype="dotted")
p
Would have liked to post the plot images, but haven't crossed the reputation threshold yet.
Two other posts on this forum come close to this topic but do not offer a solution to this problem:
Add vline to existing plot and have it appear in ggplot2 legend?
Incorrect linetype in legend, ggplot2 in R
I am using ggplot2 version 1.0.0
What am I doing wrong ?
Thanks in advance
If you need to show the linetype in the legend and also control it, map linetype inside aes() to a label of your choice (you have only one line), then set the actual line type with scale_linetype_manual().
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
geom_vline(aes(xintercept=wt.median, linetype="media"),
data=medians_mtcars, show_guide=TRUE)+
scale_linetype_manual(values="dotted")
If you really want to write the linetype itself inside aes() and still get a correct legend, use scale_linetype_identity() with the argument guide="legend".
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
geom_vline(aes(xintercept=wt.median, linetype="dotted"),
data=medians_mtcars,show_guide=TRUE)+
scale_linetype_identity(guide="legend")
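The code above targets ggplot2 1.0.0. In ggplot2 2.0.0 and later, show_guide was deprecated in favour of show.legend. The following sketch shows the first approach with the newer argument name; the legend label "median" and the named `values` vector are illustrative choices, not part of the original answer.

```r
library(ggplot2)

# One-row data frame holding the median of wt.
medians_mtcars <- data.frame(wt.median = median(mtcars$wt))

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  # Map linetype to a label so that the line gets a legend entry.
  geom_vline(aes(xintercept = wt.median, linetype = "median"),
             data = medians_mtcars, show.legend = TRUE) +
  # Translate the label into the line type that is actually drawn.
  scale_linetype_manual(name = NULL, values = c(median = "dotted"))
```

Printing `p` should draw the dotted vertical line together with a matching legend key.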

Seemingly incorrect guide color when geom_line() and geom_segment() are used together

I'm drawing some line segments on top of a plot that uses geom_line(). Surprisingly, the guide (legend) colors for geom_line() are drawn as the color of the last element I add to the plot - even if it is not the geom_line(). This looks like a bug to me, but it could be expected behavior for some reason I don't understand.
#Set up the data
require(ggplot2)
x <- rep(1:10, 2)
y <- c(1:10, 1:10+5)
fac <- gl(2, 10)
df <- data.frame(x=x, y=y, fac=fac)
#Draw the plot with geom_segment second, and the guide is the color of the segment
ggplot(df, aes(x=x, y=y, linetype=fac)) +
geom_line() +
geom_segment(aes(x=2, y=7, xend=7, yend=7), colour="red")
Whereas if I add the geom_segment first, the colors on the guide are black as I would expect:
ggplot(df, aes(x=x, y=y, linetype=fac)) +
geom_segment(aes(x=2, y=7, xend=7, yend=7), colour="red") +
geom_line()
Feature or bug? If the first, can someone explain what's happening?
Feature(ish). The guide that is drawn is a guide for linetype. But, it has to be drawn in some color to be seen. When the color is not specified by an aesthetic mapping, ggplot2 draws it in a color that is consistent with the plot. I'm speculating that the default is whatever last color was used. That is why you are seeing differences when you plot them in a different order.
However, you can control these details of the legend.
ggplot(df, aes(x=x, y=y, linetype=fac)) +
geom_line() +
geom_segment(aes(x=2, y=7, xend=7, yend=7), colour="red") +
scale_linetype_discrete(guide=guide_legend(override.aes=aes(colour="blue")))
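A likely explanation, offered as a reading of how ggplot2 builds legends rather than something confirmed in the answer above: geom_segment() inherits the linetype = fac mapping from ggplot(), so it adds its own red key to the linetype legend, and the layer order decides which key ends up on top. Under that assumption, the sketch below shows three ways to keep the legend black; the names `base`, `p1`, `p2` and `p3` are illustrative.

```r
library(ggplot2)

df <- data.frame(x = rep(1:10, 2), y = c(1:10, 1:10 + 5), fac = gl(2, 10))

base <- ggplot(df, aes(x = x, y = y, linetype = fac)) +
  geom_line()

# Option 1: keep geom_segment() but leave it out of the legend.
p1 <- base +
  geom_segment(aes(x = 2, y = 7, xend = 7, yend = 7),
               colour = "red", show.legend = FALSE)

# Option 2: annotate() draws one segment that is not tied to the data
# or to any legend.
p2 <- base +
  annotate("segment", x = 2, y = 7, xend = 7, yend = 7, colour = "red")

# Option 3: override the key colour, passing a plain list to override.aes.
p3 <- base +
  geom_segment(aes(x = 2, y = 7, xend = 7, yend = 7), colour = "red") +
  guides(linetype = guide_legend(override.aes = list(colour = "black")))
```

Option 2 also avoids drawing the same segment once per row of `df`, which is what happens when constant positions are placed inside aes().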
