Here is my code. I get the plot I want, but it has no legend for the colors. Grade.7.ELA.4s...White is drawn in green and Grade.7.Math.4s...White in blue. I thought scale_color_manual would create a legend, but none appears when I plot.
PerW.vs.7ELA4s <- ggplot(Explore, aes(Explore$Percent.White, value)) +
geom_point(aes(Percent.White, Grade.7.ELA.4s...White),
color="green", alpha=.55) +
geom_point(aes(Percent.White, Grade.7.Math.4s...White),
color="blue", alpha=.5) +
geom_smooth(aes(Percent.White, Grade.7.ELA.4s...White),
color="green", alpha=.55, se=F) +
geom_smooth(aes(Percent.White, Grade.7.Math.4s...White),
color="blue", alpha=.5, se=F) +
scale_color_manual(name="", values=c("ELA"="green", "Math"="blue")) +
labs(y="# 7th Grade ELA 4's", x="Percent White")
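For what it's worth, ggplot2 only builds a legend for aesthetics that are mapped inside aes(); a color set as a constant outside aes() never reaches scale_color_manual. Below is a minimal sketch of that idea. The Explore data frame here is a made-up stand-in that only reuses the column names from the question, so treat the values as placeholders.

```r
library(ggplot2)

# Hypothetical stand-in for the Explore data frame (values are made up)
set.seed(1)
Explore <- data.frame(
  Percent.White = runif(50, 0, 100),
  Grade.7.ELA.4s...White = rpois(50, 10),
  Grade.7.Math.4s...White = rpois(50, 12)
)

# Map color to a label inside aes() so scale_color_manual() can build a legend
p <- ggplot(Explore, aes(x = Percent.White)) +
  geom_point(aes(y = Grade.7.ELA.4s...White, color = "ELA"), alpha = .55) +
  geom_point(aes(y = Grade.7.Math.4s...White, color = "Math"), alpha = .5) +
  geom_smooth(aes(y = Grade.7.ELA.4s...White, color = "ELA"), se = FALSE) +
  geom_smooth(aes(y = Grade.7.Math.4s...White, color = "Math"), se = FALSE) +
  scale_color_manual(name = "", values = c("ELA" = "green", "Math" = "blue")) +
  labs(y = "# 7th Grade 4's", x = "Percent White")
p
```

The strings "ELA" and "Math" are just labels; the names in values= have to match them for the colors to be picked up.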
I have some data where x is categorical, y is numeric, and color.var is another categorical variable that I would like to color by. My goal is to plot all of the points using position_jitterdodge(), and then highlight a couple of the points, draw a line between them, and add labels, while making sure these highlighted points line up with the corresponding strips of points that were plotted using position_jitterdodge(). The highlighted points are aligned properly when all factors are present in the variable used to dodge, but it does not work well when some factors are missing.
Minimal (non-)working example
library(ggplot2)
Generate some data
d = data.frame(x = c(rep('x1', 1000), rep('x2', 1000)),
y = runif(n=2000, min=0, max=1),
color.var= rep(c('color1', 'color2'), 1000),
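facet.var = rep(c('facet1', 'facet1', 'facet2', 'facet2'), 500),
stringsAsFactors = TRUE) ## factors (the default before R 4.0) are needed for as.numeric(x) and complete() below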
head(d)
dd = d[c(1,2,3,4,1997,1998, 1999,2000),]
dd
df1 = dd[dd$color.var=='color1',] ## data for first set of points, labels, and the line connecting them
df2 = dd[dd$color.var=='color2',] ## data for second set of points, labels, and the line connecting them
df1
dw = .75 ## Define the dodge.width
Plot all points
Here are all of the points, separated using position_jitterdodge() and the aesthetic fill.
ggplot() +
geom_point(data=d, aes(x=x, y=y, fill=color.var), position=position_jitterdodge(dodge.width=dw), size=3, alpha=1, shape=21, color='darkgray') +
facet_wrap(~facet.var) +
scale_fill_manual(values=c( 'lightblue','gray'))+
theme(axis.title = element_blank()) +
theme(legend.position="top")
That works well.
Additional highlighted points.
Here is the same plot, with additional points in dd added.
ggplot() +
geom_point(data=d, aes(x=x, y=y, fill =color.var), position=position_jitterdodge(dodge.width=dw), size=3, alpha=1, shape=21, color='darkgray') +
geom_point(data=dd, aes(x=x, y=y, color=color.var ), position=position_dodge(width=.75), size=4 ) +
geom_line(data=dd, aes(x=x, y=y, color=color.var, group=color.var ), position=position_dodge(width=.75), size=1 ) +
geom_label(data=dd, aes(x=x, y=y, color=color.var, group=color.var, label=round(y,1)), position=position_dodge(width=.75), vjust=-.5) +
facet_wrap(~facet.var) +
scale_fill_manual(values=c( 'lightblue','gray'))+
scale_color_manual(values=c( 'blue', 'gray40')) +
theme(axis.title = element_blank())+
theme(legend.position="top")
This is what I want it to look like. However, this only works properly if both factors of the color.var variable are in the set of points to highlight.
If both factors aren't present in the new data, the horizontal alignment fails.
Highlight points, only one factor present
Here is an example where only the 'color1' factor (blue) is present. Note that data=dd was replaced with data=df1 (data that only contains blue highlighted dots) in this code.
ggplot() +
geom_point(data=d, aes(x=x, y=y, fill =color.var), position=position_jitterdodge(dodge.width=dw), size=3, alpha=1, shape=21, color='darkgray') +
geom_point(data=df1, aes(x=x, y=y, color=color.var ), position=position_dodge(width=.75), size=4 ) +
geom_line(data=df1, aes(x=x, y=y, color=color.var, group=color.var ), position=position_dodge(width=.75), size=1 ) +
geom_label(data=df1, aes(x=x, y=y, color=color.var, group=color.var, label=round(y,1)), position=position_dodge(width=.75), vjust=-.5) +
facet_wrap(~facet.var) +
scale_fill_manual(values=c( 'lightblue','gray'))+
scale_color_manual(values=c( 'blue', 'gray40')) +
theme(axis.title = element_blank())+
theme(legend.position="top") +
scale_x_discrete(drop=F)
The highlighted blue dots appear between the blue and gray dots, instead of aligned with the blue dots. Note that adding scale_x_discrete(drop=F) had no apparent effect on the alignment.
A manual solution
One possible fix is to edit the x coordinate manually, like this
ggplot(data=d, aes(x=x, y=y)) +
geom_point(aes(fill=color.var), position=position_jitterdodge(dodge.width=dw), size=3, alpha=1, shape=21, color='darkgray') +
geom_point(data=df1, aes(x=as.numeric(x)-dw/4, y=y), alpha=.9, size=4 , color='blue') + ## first set of points
geom_line( data=df1, aes(x=as.numeric(x)-dw/4, y=y , group=color.var ), color='blue', size=1) + ## first line
geom_label(data=df1, aes(x=as.numeric(x)-dw/4, y=y , label=round(y,1)), color='blue', vjust=-.25)+ ## first set of labels
facet_wrap(~facet.var) +
scale_fill_manual(values=c( 'lightblue','gray'))+
theme(axis.title = element_blank()) +
theme(legend.position="top")
An adjustment of 1/4 of the dodge.width seems to work. This works fine, but it seems like there should be a better way, especially since I will eventually want to do this with 4-5 sets of highlighted points/lines, which may all be the same color.var, like the blue 'color1' factor above. Repeating this 4-5 times would be cumbersome. I will also eventually want to do this with 5-10 different figures. I suppose dodge.width*1/4 will always work, and copying and pasting might do the trick, but I would like to know if there is a better way.
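If the manual route is kept, the shift can at least be computed instead of hard-coded. The sketch below assumes the dodged groups are spread evenly across the dodge width, which is consistent with the dw/4 shift observed above for two groups. dodge_offset is a hypothetical helper name, not a ggplot2 function.

```r
# Hypothetical helper: horizontal offset of group i out of n dodged groups.
# Assumes the n groups are spread evenly across the dodge width, which
# matches the -dw/4 shift that works for the first of two groups.
dodge_offset <- function(i, n, dodge.width) {
  dodge.width * ((i - 0.5) / n - 0.5)
}

dw <- .75
dodge_offset(1, 2, dw)  # first of two groups: equals -dw/4
dodge_offset(2, 2, dw)  # second of two groups: equals +dw/4
```

It could then be used as aes(x = as.numeric(x) + dodge_offset(1, 2, dw)), so the same expression still applies if the number of color.var levels changes.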
Here is a solution based on @aosmith's comment. Basically, you just need to add this code before calling ggplot:
library(dplyr) ## needed for group_by()
library(tidyr) ## needed for complete()
df1 = df1 %>% group_by(facet.var, x) %>% complete(color.var)
That adds extra rows to the data so that all the levels of color.var are present. Then the code given in the question, along with a couple of small edits that fix the legend, can be used:
ggplot() +
geom_point(data=d , aes(x=x, y=y, fill =color.var), position=position_jitterdodge(dodge.width=dw), size=3, alpha=1, shape=21, color='darkgray', show.legend=T) +
geom_point(data=df1, aes(x=x, y=y, color=color.var ), position=position_dodge(width=.75), size=4, show.legend=T ) +
geom_line( data=df1, aes(x=x, y=y, color=color.var, group=color.var ), position=position_dodge(width=.75), size=1, show.legend=F ) +
geom_label(data=df1, aes(x=x, y=y, color=color.var, group=color.var, label=round(y,1)), position=position_dodge(width=.75), vjust=-.5, show.legend=F) +
facet_wrap(~facet.var) +
scale_fill_manual( values=c( 'lightblue','gray'), name='Background dots', guide=guide_legend(override.aes = list(color=c('lightblue', 'gray')))) +
scale_color_manual(values=c( 'blue', 'gray40') , name='Highlighted dots') +
theme(axis.title = element_blank())+
theme(legend.position="top")+
scale_x_discrete(drop=F)
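To see what complete() actually adds, here is a small made-up example (toy and toy_full are illustrative names only). Note the assumption that color.var is a factor carrying both levels. With a plain character column, complete() would only know about the values present in the data, so nothing would be added.

```r
library(dplyr)
library(tidyr)

# Small made-up example; color.var must be a factor carrying both levels,
# otherwise complete() has no way to know that 'color2' is missing
toy <- data.frame(
  x = factor(c("x1", "x2")),
  y = c(0.2, 0.8),
  color.var = factor(c("color1", "color1"), levels = c("color1", "color2")),
  facet.var = factor(c("facet1", "facet1"))
)

# Within each facet.var/x group, add a row for every missing color.var level
toy_full <- toy %>%
  group_by(facet.var, x) %>%
  complete(color.var) %>%
  ungroup()

toy_full  # the added 'color2' rows have y = NA
```

The NA rows are never drawn, but they make position_dodge see both groups, so the highlighted points get the same offset as the background strips.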
I have this plot made in R with ggplot2, which is drawn by the following code:
ggplot(mtcars) +
geom_smooth(fill='grey', alpha=0.3, span=0.1, aes(x=mpg, y=hp, color='AAA',linetype='AAA')) +
geom_smooth(fill='grey', alpha=0.3, span=0.9, aes(x=mpg, y=hp, color='BBB',linetype='BBB')) +
scale_colour_manual(name='test', values=c('AAA'='chocolate', 'BBB'='yellow')) +
scale_linetype_manual(name='test', values=c('AAA'='dashed','BBB'='solid')) +
theme_minimal() +theme(legend.position = "top")
Problem: from the legend, it is not easy to understand that the "AAA" line is dashed, since the box is too small.
How can I enlarge it?
I would love to have something similar to:
Try
# create your legend guide
myguide <- guide_legend(keywidth = unit(3, "cm"))
# now create your graph
ggplot(mtcars) +
geom_smooth(fill='grey', alpha=0.3, span=0.1,
aes(x=mpg, y=hp, color='AAA',linetype='AAA')) +
geom_smooth(fill='grey', alpha=0.3, span=0.9,
aes(x=mpg, y=hp, color='BBB',linetype='BBB')) +
scale_colour_manual(name='test',
values=c('AAA'='chocolate', 'BBB'='yellow'),
guide = myguide) +
scale_linetype_manual(name='test',
values=c('AAA'='dashed','BBB'='solid'),
guide = myguide) +
theme_minimal() + theme(legend.position = "top")
See ?guide_legend for the full list of options.
This gives a legend whose keys are wide enough to show the line type.
You can use keywidth and keyheight to control how far the key stretches in each direction. With title.position, direction, etc. you can fine-tune the legend further.
Note that since you have multiple legends that are merged, you need to specify the guide for all merged scales. I simplified this by creating the guide as an object first.
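As an alternative sketch, the key width can also be set once through theme(legend.key.width = ...), which avoids passing a guide to each merged scale. This is the same mtcars plot as above with only the theme call changed; the 3 cm value is carried over from the guide example.

```r
library(ggplot2)

# Same plot as above, but the key width is set once via theme()
# instead of passing a guide object to every merged scale
p <- ggplot(mtcars) +
  geom_smooth(fill = 'grey', alpha = 0.3, span = 0.1,
              aes(x = mpg, y = hp, color = 'AAA', linetype = 'AAA')) +
  geom_smooth(fill = 'grey', alpha = 0.3, span = 0.9,
              aes(x = mpg, y = hp, color = 'BBB', linetype = 'BBB')) +
  scale_colour_manual(name = 'test',
                      values = c('AAA' = 'chocolate', 'BBB' = 'yellow')) +
  scale_linetype_manual(name = 'test',
                        values = c('AAA' = 'dashed', 'BBB' = 'solid')) +
  theme_minimal() +
  theme(legend.position = "top", legend.key.width = unit(3, "cm"))
p
```

The theme setting applies to every legend in the plot, so the guide approach is still the better fit when only one legend should be widened.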
I'm trying to customize the look of multiple loess plots within the same graph for different levels of a group var. I looked at this post, but wasn't able to make it work:
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species, linetype=Species)) +
stat_smooth(method = "loess")
I'd like to change the color of each band and line.
You can set the colors, fills, and line types with manual scales such as scale_color_manual. In the example below I also used override.aes within guides to get a cleaner legend:
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species, linetype=Species)) +
stat_smooth(aes(fill=Species), method = "loess", size=1) +
scale_color_manual(values = c("green","blue","red")) +
scale_fill_manual(values = c("green","blue","red")) +
scale_linetype_manual(values = c("dashed","dotted","solid")) +
theme_bw() +
guides(fill=guide_legend(override.aes = list(fill="white",size=1.2)))
this gives:
Other alternatives to the manual scales are the hue and brewer scales.
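For instance, a brewer version might look like the sketch below. The "Dark2" palette is just an example choice; any qualitative ColorBrewer palette should work the same way.

```r
library(ggplot2)

# Same iris plot, but colors and fills come from a ColorBrewer palette
# instead of hand-picked values ("Dark2" is only an example choice)
p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length,
                      color = Species, linetype = Species)) +
  stat_smooth(aes(fill = Species), method = "loess") +
  scale_color_brewer(palette = "Dark2") +
  scale_fill_brewer(palette = "Dark2") +
  theme_bw()
p
```

Using the same palette for color and fill keeps each band matched to its line.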
I added size = 2 so you can see that the line type is different for each line:
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species, linetype=Species)) +
stat_smooth(method = "loess", aes(fill = Species), size= 2)
set.seed(586)
data<-data.frame(x=sort(runif(20)),y=sort((rnorm(1)*1:20*.234)),grp=factor(sample(c(0,1),20,replace = T)))
ggplot(data, aes(x=x, y=y, shape=grp)) +
geom_point() +
theme_classic() +
scale_shape_manual("PT Status",
values=c(1,3),
breaks=c(0,1),
labels=c("No","Yes"))+
scale_x_continuous("My x") +
scale_y_continuous("My y")+
geom_abline(intercept=.12, slope=.98,linetype=1,show.legend = TRUE)+
geom_abline(intercept=.05, slope=(-.3+.98),linetype=3,show.legend = TRUE)+
theme(legend.position="bottom")
Lines look right, and I like how the lines are integrated into the legend. But obviously in my code I'm not specifying what data go to which abline. Is there a way to simply indicate this in the geom_abline code, or some other way to recode it to make it so that the first one is solid and the second is dashed, in terms of the legend?
Using @Joran's suggestion I created
smDf<-data.frame(intercept=c(.12,.05),slope=c(.98, (-.3+.98)),linetype=factor(c(1,3)))
Then the new code is:
ggplot(data, aes(x=x, y=y, shape=grp)) +
geom_point() +
theme_classic() +
scale_shape_manual("PT Status",
values=c(1,3),
breaks=c(0,1),
labels=c("No","Yes"))+
scale_x_continuous("My x") +
scale_y_continuous("My y")+
geom_abline(aes(intercept=intercept, slope=slope, linetype=linetype), data=smDf, show.legend = TRUE)
Which gives me:
So now how do I integrate these two?
This could be better documented, I think, but you basically need to give ggplot manual scales of the same basic structure for them to be merged:
ggplot() +
geom_point(data = data,aes(x=x, y=y, shape=grp)) +
geom_abline(aes(intercept=intercept, slope=slope, linetype=linetype), data=smDf,show.legend = TRUE) +
theme_classic() +
scale_shape_manual(name = "PT Status",
values=c(1,3),
breaks=c(0,1),
labels=c("No","Yes"))+
scale_linetype_manual(name = "PT Status",
values=c(1,3),
labels=c("No","Yes"))+
scale_x_continuous("My x") +
scale_y_continuous("My y")