Create different colours for regression lines - r

I am creating a scatterplot (around 1,500,000 points) and running regressions through it, one per level of a factor variable ("halfs", see below). This is the picture:
As you can see it is difficult to see the "red" regression lines.
This is the data:
For_Cov Per_chg halfs
1 0.8372001 0.002400000 upper half
2 0.7236001 0.002800111 upper half
3 0.6036000 0.000800000 upper half
4 0.8540000 0.000000000 upper half
5 0.9080001 0.003200000 upper half
6 0.8248000 0.000000000 upper half
7 0.1132000 0.000000000 upper half
8 0.2044000 0.007600000 upper half
9 0.2476001 0.085200000 upper half
10 0.2368000 0.003600000 upper half
This is the code:
ggplot(grid_gdp_full, aes(x = For_Cov, y = Def_Chg, group = factor(halfs))) +
geom_point(aes(colour = halfs), alpha = 0.1) +
stat_smooth(method = "lm", formula = y ~ x + I(x^2), size = 4, fullrange=TRUE, aes(group = halfs, colour = halfs), alpha = 1) +
xlab("Initial") +
ylab("Percent") +
ylim(0,0.10) +
scale_x_reverse() +
theme_bw() +theme(
axis.text=element_text(size=12)
,axis.title=element_text(size=14,face="bold")
,plot.background = element_blank()
,panel.grid.major = element_blank()
,panel.grid.minor = element_blank()
,panel.border = element_blank()
,panel.background = element_blank()
) +
theme(axis.line = element_line(color = 'black'))
Does anyone know how to change the code so that the regression lines (created by stat_smooth) have different colors, possibly grey and blue for the two factor levels, without changing the initial color of the points?
Here is the code I am using at the moment and I am still not able to get the two different colored lines:
ggplot(grid_gdp_full, aes(x = For_Cov, y = Def_Chg, fill = factor(halfs))) +
geom_point(aes(colour = halfs), alpha = 0.1, colour="transparent",shape=21) +
stat_smooth(method = "lm", formula = y ~ x + I(x^2), size = 2, fullrange=TRUE, aes(group = factor(halfs)), alpha = 1) +
xlab("Initial") +
ylab("Percent") +
ylim(0,0.1) +
scale_x_reverse() +
theme_bw() +theme(
axis.text=element_text(size=12)
,axis.title=element_text(size=14,face="bold")
,plot.background = element_blank()
,panel.grid.major = element_blank()
,panel.grid.minor = element_blank()
,panel.border = element_blank()
,panel.background = element_blank()
) +
theme(axis.line = element_line(color = 'black')) +
scale_colour_manual(values = c("red","blue")) +
scale_fill_manual(values = c("grey","green"))
dev.off()

Taking advantage of point shape 21, we can use fill for the points and colour for the lines, then set each palette separately with scale_fill_manual and scale_colour_manual. Note colour="transparent" in geom_point, which removes the coloured borders around the points.
ggplot(mtcars, aes(x=cyl,y=mpg,fill=factor(gear))) +
geom_point(aes(fill=factor(gear)),colour="transparent",shape=21) +
stat_smooth(aes(colour=factor(gear)),method="lm",se = FALSE) +
scale_colour_manual(values = c("red","blue", "green")) +
scale_fill_manual(values = c("orange","pink", "red"))
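Applied to the question's data, the missing piece in the second attempt appears to be that stat_smooth maps only group, not colour, so both lines fall back to the default line colour. A minimal sketch, assuming simulated data in place of grid_gdp_full (the data frame `d` and its values are made up for illustration):

```r
library(ggplot2)

# Simulated stand-in for grid_gdp_full (assumption: not the asker's real data)
set.seed(1)
d <- data.frame(For_Cov = runif(200))
d$halfs   <- ifelse(d$For_Cov < 0.5, "lower half", "upper half")
d$Per_chg <- 0.02 + 0.05 * d$For_Cov^2 + rnorm(200, sd = 0.005)

p <- ggplot(d, aes(x = For_Cov, y = Per_chg)) +
  # shape 21 takes its interior from `fill`; the border is hidden
  geom_point(aes(fill = halfs), shape = 21, colour = "transparent", alpha = 0.3) +
  # mapping `colour` here is what gives each regression line its own colour
  stat_smooth(aes(colour = halfs), method = "lm",
              formula = y ~ x + I(x^2), se = FALSE) +
  scale_fill_manual(values = c("red", "green")) +
  scale_colour_manual(values = c("grey40", "blue"))
p
```

Because points use the fill scale and lines use the colour scale, the two palettes can be changed independently.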

OK. I see what you want now (I think...): you want the line groups to be a different color than the point groups. In essence you need four colors: two for point groups and two for the line groups.
This worked on your sample, after recoding halfs so that rows with For_Cov < 0.5 become "lower half" (the sample only contained "upper half"), which splits the data into two groups.
grid_gdp_full$halfs <- ifelse(grid_gdp_full$For_Cov<0.5,"lower half","upper half")
grid_gdp_full <- cbind(grid_gdp_full,col =
ifelse(grid_gdp_full$halfs=="lower half","lm: lower","lm: upper"))
ggplot(grid_gdp_full, aes(x = For_Cov, y = Per_chg)) +
geom_point(aes(colour = factor(halfs)), alpha = 0.9, size=5) +
stat_smooth(method = "lm", formula = y ~ x + I(x^2), size = 4, fullrange=TRUE,
aes(group = col, colour = factor(col)), alpha = 1) +
xlab("Initial") +
ylab("Percent") +
ylim(0,0.10) +
scale_x_reverse() +
theme_bw() +theme(
axis.text=element_text(size=12)
,axis.title=element_text(size=14,face="bold")
,plot.background = element_blank()
,panel.grid.major = element_blank()
,panel.grid.minor = element_blank()
,panel.border = element_blank()
,panel.background = element_blank()
) +
theme(axis.line = element_line(color = 'black'))
Produces this:
I made the points bigger and set alpha=0.9 so you can see the points in this small sample. Also you reversed the x axis, which makes "lower" and "upper" kind of confusing.

Related

How to adjust both the line and point shape size in a legend to be different sizes?

I'm trying to adjust the point size and line size independently in my legend. I want to be able to tell the dashed and solid lines apart while keeping the point shapes distinctive enough on top of the line in the legend. Right now I can't figure out how to make the line smaller, although I have figured out how to adjust the point size. Any help or thoughts are appreciated; I'm definitely still a newbie. Thanks!
I'll post an image and code below:
[Image: figure with points enhanced; the line in the legend is still not sized correctly]
- Chase
ggplot(df, aes(x = Psych.Distress.Sum, y = Likelihood.Max, color = Feedback)) +
geom_smooth(method = "lm", se = T, aes(linetype = Feedback)) +
geom_jitter(aes(shape = Feedback)) +
labs(x = "Psychological Distress", y = "Endorsement of Max's Feedback Strategy") +
apatheme +
theme(axis.title = element_text(face="bold")) +
theme(legend.text = element_text(face="bold")) +
theme(legend.position = c(0.87, 0.13)) +
scale_color_grey(end = .5) +
theme(legend.key.height= unit(.5, 'cm'),
legend.key.width= unit(1, 'cm')) +
guides(colour = guide_legend(override.aes = list(size= 3, linetype=0)))
This is tricky. Here's a hacky way using two plots that are overlaid on top of each other using patchwork. To prove that the data are aligned, I made the 2nd plot's text be semi-transparent red, but we could make it totally transparent with color #FF000000. This method is a little brittle, since the plots will come out of alignment if they have different ranges or different formats. But if we adjust for that, they line up perfectly with no extra fuss.
Your question didn't include any sample data so I used the mtcars data set.
library(patchwork)
library(ggplot2)
# This layer has the `geom_smooth` and black axis text
(a <- ggplot(mtcars, aes(x = wt, y = mpg, color = as.factor(am))) +
geom_smooth(method = "lm", se = T, aes(linetype = as.factor(am))) +
scale_color_grey(end = .5) +
guides(linetype = guide_legend(override.aes = list(size = 1.2))) +
labs(x = "Psychological Distress",
y = "Endorsement of Max's Feedback Strategy",
linetype = "Line legend", color = "Line legend") +
coord_cartesian(ylim = c(10, 35)) +
theme_classic() +
theme(axis.title = element_text(face="bold")) +
theme(legend.text = element_text(face="bold")) +
theme(legend.position = c(0.7, 0.8)))
# This layer has the `geom_jitter` and red semi-clear axis text
(b <- ggplot(mtcars, aes(x = wt, y = mpg, color = as.factor(am))) +
geom_jitter(aes(shape = as.factor(am))) +
scale_color_grey(end = .5) +
guides(shape = guide_legend(override.aes = list(size = 3))) +
coord_cartesian(ylim = c(10, 35)) +
labs(x = "", y = "",
color = "Point legend", shape = "Point legend") +
theme_classic() +
theme(plot.background = element_blank(),
panel.background = element_blank(),
axis.text = element_text(color = "#FF000055")) +
theme(legend.position = c(0.7, 0.55)))
a + inset_element(b, 0, 0, 1, 1, align_to = "full")
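A possibly simpler route, assuming ggplot2 3.4.0 or later: from that version on, line thickness is controlled by linewidth rather than size, so the two can be overridden independently within one merged legend. A sketch on mtcars (not tested against the original data or apatheme):

```r
library(ggplot2)

# Assumes ggplot2 >= 3.4.0, where lines use `linewidth` instead of `size`
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(am))) +
  geom_smooth(aes(linetype = factor(am)), method = "lm",
              formula = y ~ x, se = TRUE) +
  geom_jitter(aes(shape = factor(am))) +
  scale_colour_grey(end = 0.5) +
  # identical titles let the colour, linetype and shape legends merge
  labs(colour = "am", linetype = "am", shape = "am") +
  theme_classic() +
  theme(legend.key.width = unit(1, "cm")) +
  guides(colour = guide_legend(
    # `size` affects the point glyph, `linewidth` the line in the key
    override.aes = list(size = 3, linewidth = 0.5)
  ))
p
```

If this works in your version, it avoids overlaying two plots and the alignment issues that come with it.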

How to draw an inclined line and parallelly put the text on it in ggplot?

Now, I'm making some hypothesis graph and I made this graph.
x<- c(1,2,3,4,5,6,7,8,9,10,11)
y<- c(100,90,80,70,60,50,40,30,20,10,1)
a<- c(1,2,3,4,5,6,7,8,9,10)
b<- c(1,4,9,16,25,36,49,64,81,100)
dataA<- data.frame (x,y)
dataB<- data.frame (a,b)
ggplot() +
geom_line(data=dataA, aes(x=x, y=y), col="Dark red", size=1) +
geom_line(data=dataB, aes(x=a, y=b), col="Dark blue", size=1) +
scale_x_continuous(breaks = seq(0,12,1), limits = c(0,12)) +
scale_y_continuous(breaks = seq(0,120,10), limits = c(0,120)) +
geom_hline(yintercept=70, linetype="dashed", color = "Black", size=1) +
geom_hline(yintercept=50, linetype="dashed", color = "Black", size=1) +
#geom_text(aes(fontface=6), x=11, y=110, label=paste("% distal\n","grains"), size=6, col="Dark blue") +
geom_text(aes(fontface=6), x=10, y=75, label="AGW (90th percentile)", size=5, col="Black") +
geom_text(aes(fontface=6), x=10, y=55, label="AGW (10th percentile)", size=5, col="Black") +
xlab(bquote('x ('~m^2*')')) +
ylab(bquote('y (mg '~grain^-1*')')) +
theme(axis.title = element_text (face = "plain", size = 18, color = "black"),
axis.text.x = element_blank(), #element_blank()) element_text(size= 14)
axis.text.y = element_blank(),
axis.ticks.x = element_blank(),
axis.ticks.y = element_blank(),
axis.line = element_line(size = 0.5, colour = "black"))+
windows(width=5.5, height=5)
As an alternative, I need to present a graph like the one below, but I don't know how to draw an inclined line starting at a specific point on the y-axis. I would also like to add text along the line, parallel to it. Could you tell me how I can do this?
Many thanks!!
Here is a concrete example based on @Waldi's comment:
library(ggplot2)
# Change line to what you want
line <- data.frame(x = c(10,30), y = c(300,50))
line = lm(formula = line$y ~ line$x)$coefficients
# ratio required for unequal axis scales
ratio <- 1/15
# get angle of the line
angle <- atan(line[2] * ratio) * 180 / pi
# plot it
ggplot(mtcars, aes(x = mpg, y = hp)) +
geom_point() +
geom_abline(slope = line[2], intercept = line[1], color = "blue") +
coord_equal(ratio = ratio) +
annotate(
geom = "text",
x = 20,
y = 20 * line[2] + line[1],
label = "My Line",
color = "blue",
angle = angle,
vjust = -1 # offset so text isn't directly on the line
)
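The angle calculation can be isolated in a small helper. This is only a sketch (text_angle is a hypothetical name, not part of ggplot2): the on-screen slope is the data slope multiplied by the coord_equal ratio, and atan converts it to an angle.

```r
# Hypothetical helper: on-screen angle (in degrees) of a line with
# data-space slope `slope`, drawn under coord_equal(ratio = ratio)
text_angle <- function(slope, ratio = 1) {
  atan(slope * ratio) * 180 / pi
}

# Line through (10, 300) and (30, 50), as in the example above
slope <- (50 - 300) / (30 - 10)    # -12.5
text_angle(slope, ratio = 1 / 15)  # about -39.8 degrees
```

This only holds when the aspect ratio is fixed, as with coord_equal; with a free aspect ratio the apparent slope changes with the size of the plotting device.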

How to add a second legend using different `geom_line`?

I am plotting the relationship between two variables (X and Y) for different individuals (IDs). This relationship is shown both with the real values (geom_point) and with lines representing the predictions from Linear Mixed Effect Models (LME) for the different individuals. On top of that, the linear relationship between the two variables for the different individuals is computed at three levels of a second quantitative predictor (Z).
Thus, what I do is to use geom_point() for showing the relationship between raw values of X and Y. Then, I use three geom_line() for three LME with different levels of Z. Thus, each geom_line() draws the six lines for the six IDs for a fixed Z. So, since I have 3 Z levels and I have 3 geom_line(), I have 18 lines.
I tried this (note: code is simplified):
Plot_legend <- ggplot(df, aes(x=X, y=Y, colour=ID)) +
geom_point(size=1.5,alpha=0.2) +
geom_line(aes(y=predict(model,df.Z_low), group=ID, linetype = c("1")), size=1.5, alpha=0.6, color = line_colors[3]) +
geom_line(aes(y=predict(model,df.Z_medium), group=ID, linetype = c("2")), size=1.5, alpha=0.6, color = line_colors[2]) +
geom_line(aes(y=predict(model,df.Z_high), group=ID, linetype = c("3")), size=1.5, alpha=0.6, color = line_colors[1]) +
geom_abline(aes(slope=1,intercept=0),linetype="dashed",color="grey52",size=1.5) +
theme_bw() +
theme(legend.text=element_text(size=18),
legend.title = element_text(size=19, face = "bold",hjust = 0.5),
legend.key=element_blank(),
legend.background = element_rect(colour = 'black', fill = 'white', size = 1, linetype='solid')) +
guides(color=guide_legend(override.aes=list(fill=NA)))
However, as you can see, the legend for the three geom_line() layers is not what I want. I would like its title to be Z instead of c("10th"). Also, the colours in that legend do not match the actual colours of the geom_line() layers, and some lines are dashed.
Does anyone know how to solve this?
Plot using Duck's advice
Try this approach. As no data was shared I can't test it, but it should point you in the right direction:
library(ggplot2)
#Code
Plot_legend <- ggplot(df, aes(x=X, y=Y, colour=ID)) +
geom_point(size=1.5,alpha=0.2) +
geom_line(aes(y=predict(model,df.Z_low), group=ID, linetype = c("1")),
size=1.5, alpha=0.6, color = line_colors[3]) +
geom_line(aes(y=predict(model,df.Z_medium), group=ID, linetype = c("2")),
size=1.5, alpha=0.6, color = line_colors[2]) +
geom_line(aes(y=predict(model,df.Z_high), group=ID, linetype = c("3")),
size=1.5, alpha=0.6, color = line_colors[1]) +
geom_abline(aes(slope=1,intercept=0),linetype="dashed",color="grey52",size=1.5) +
theme_bw() +
scale_linetype_manual(values=c('solid','solid','solid'))+
scale_color_manual(values=c(line_colors[3],line_colors[2],line_colors[1]))+
labs(linetype='Z') +
theme(legend.text=element_text(size=18),
legend.title = element_text(size=19, face = "bold",hjust = 0.5),
legend.key=element_blank(),
legend.background = element_rect(colour = 'black', fill = 'white', size = 1, linetype='solid')) +
guides(color=guide_legend(override.aes=list(fill=NA)))
In the end I used the following code:
Plot_legend <- ggplot(df, aes(x=X, y=Y, colour=ID)) +
geom_point(size=1.5,alpha=0.2) +
geom_abline(aes(slope=1,intercept=0),linetype="dashed",color="grey52",size=1.5) +
theme_bw() +
theme(legend.text=element_text(size=18),
legend.title = element_text(size=19, face = "bold",hjust = 0.5),
legend.key=element_blank(),
legend.background = element_rect(colour = 'black', fill = 'white', size = 1, linetype='solid')) +
guides(color=guide_legend(override.aes=list(fill=NA)))
Plot_legend
Plot_legend_2 <- Plot_legend +
geom_line(aes(y=predict(model,df.Z_low), group=ID, linetype = "m1"), size=1.5, alpha=0.6, color = line_colors[3]) +
geom_line(aes(y=predict(model,df.Z_medium), group=ID, linetype ="m2"), size=1.5, alpha=0.6, color = line_colors[2]) +
geom_line(aes(y=predict(model,df.Z_high), group=ID, linetype ="m3"), size=1.5, alpha=0.6, color = line_colors[1]) +
scale_linetype_manual(values = c(m1 = "solid", m2 = "solid", m3 = "solid"),labels = c(m1 = "1", m2 = "2", m3 = "3")) +
labs(color = "ID", linetype = expression(Z)) +
guides(linetype = guide_legend(override.aes = list(color = line_colors)))
Plot_legend_2
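The pattern in the final code, stripped to its essentials: a constant string inside aes(linetype = ...) creates one legend entry per layer, scale_linetype_manual relabels the entries, and override.aes colours the keys. A self-contained sketch with made-up data (df, the offsets standing in for the Z levels, and line_colors are all illustrative assumptions). Legend keys are ordered m1, m2, m3, so the override colours have to be listed in that order; here that means rev(line_colors), because m1 is drawn with line_colors[3].

```r
library(ggplot2)

# Illustrative data and colours (assumptions, not the asker's model output)
line_colors <- c("firebrick", "darkorange", "steelblue")
df <- data.frame(X = rep(1:10, 2), ID = rep(c("a", "b"), each = 10))
df$Y <- df$X + ifelse(df$ID == "a", 0, 2)

p <- ggplot(df, aes(x = X, y = Y, colour = ID)) +
  geom_point() +
  # constant strings in aes() create the linetype legend entries
  geom_line(aes(y = Y - 1, group = ID, linetype = "m1"), colour = line_colors[3]) +
  geom_line(aes(y = Y,     group = ID, linetype = "m2"), colour = line_colors[2]) +
  geom_line(aes(y = Y + 1, group = ID, linetype = "m3"), colour = line_colors[1]) +
  scale_linetype_manual(
    values = c(m1 = "solid", m2 = "solid", m3 = "solid"),
    labels = c(m1 = "low", m2 = "medium", m3 = "high")
  ) +
  labs(colour = "ID", linetype = "Z") +
  # keys appear in the order m1, m2, m3, hence rev()
  guides(linetype = guide_legend(override.aes = list(colour = rev(line_colors))))
p
```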

ggplot2: increasing space between categorical axis ticks with geom_point

I have a geom_point plot with a large number of categorical levels and a size parameter mapped to a continuous variable. When I make the plot, the categories are too close together, and the large points within each overlap. Is there any way to give a little breathing room to the axis so that this doesn't happen? I'm aware that an alternative solution is simply to use scale_size_area(max_size = 3) to narrow the range of point sizes, but I'd prefer not to do this as it makes the points too difficult to tell apart.
Here's the code:
plot <- ggplot(allcazfull, aes(x = Family, y = ifelse(Percentage==0,NA, Percentage), fill = Treatment, size = ifelse(Number == 0, NA,Number))) +
facet_wrap(~ Pathogen, scales = "free_x") +
geom_point(shape = 21) +
scale_fill_manual(values = alpha(c("#98fb98","#f77e17","#0d5a0d","#8d0707"),.6)) +
theme_bw() +
theme(panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
aspect.ratio = 4/1,
strip.background = element_rect(fill="white", linetype = "blank"),
strip.text = element_blank()) +
scale_x_discrete(limits = rev(levels(allcazfull$Family))) +
xlab("") +
ylab("") +
guides(fill = FALSE, size = FALSE) +
coord_flip()
plot
And here's the resulting figure:

Merging two plots into one, each with a separate legend using R

I have made two separate scatter plots using ggplot2 and I need to combine them into one single plot. Each plot is for a population of lizards under three different treatments (backgrounds).
For each plot I have the following:
csMS = data.frame()
ellMS = data.frame()
centroidsMS = data.frame()
csplotMS = ggplot(csMS, aes(x = RG, y = GB, colour = Background)) + geom_point(size = 3, shape = 17) + #colour by background, circles size 3
geom_path(data = ell.AS, aes(x = RG, y = GB ,colour = Background), size = 1, linetype = 2) + #adding the ellipses
geom_point(data = centroidsMS, size = 3, shape = 17) + #added centroids
geom_errorbar(data = centroidsMS, aes(ymin = GB - se.GB, ymax = GB + se.GB), width = 0) + #add y error bars
geom_errorbarh(data = centroidsMS, aes(xmin = RG - se.RG, xmax = RG + se.RG), height = 0) +
theme_bw() + #white background
theme(axis.title.y = element_text(vjust = 2), axis.title.x = element_text(vjust = -0.3)) + #distance of axis titles from axis
theme(panel.border = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), #no grids
axis.line = element_line(colour = "black")) + #black axes
theme(text = element_text(size = 30)) + #font size
ylab("(G-B)/(G+B)") + xlab("(R-G)/(R+G)") + # Set text for axes labels
scale_colour_manual(values = c("black","#FF6600", "yellow1")) + #changed default colours
labs(colour = "Murray Sunset NP") +
theme(legend.title = element_text(size = "20")) + #changes the legend title
theme(legend.text = element_text(size = "20")) + #changes the legend title
theme(legend.key = element_blank()) + #removed little squares around legend symbols
theme(legend.direction = "horizontal", legend.position = c(.5, .85))
I tried
csASMS = csplotAS + csplotMS
but I get an error message: "Error in p + o : non-numeric argument to binary operator In addition: Warning message: Incompatible methods ("+.gg", "Ops.data.frame") for "+" "
I also tried
csASMS = grid.arrange(csplotAS, csplotMS)
but this stacks one plot above the other. I need to combine both plots so that they are basically one plot with two separate legends, since each plot uses different conventions to indicate the different lizard populations.
Any help will be greatly appreciated.
****EDIT**** Dec 12/ 2014
I have managed to combine the two plots into one but still have the problem of the separate legends. To try to simplify the question and as per cdeterman's request I'm adding a simpler form of the code with some sample data:
data frames: p1 and p2
> p1
treatment x y
1 Black 1 1
2 Orange 2 2
3 Yellow 3 3
> p2
treatment x y
1 Black 4 4
2 Orange 5 5
3 Yellow 6 6
I used the following code to make a plot that includes both data frames:
plot = ggplot(p1, aes(x = x, y = y, colour = treatment)) + geom_point(size = 3) + #colour by background, circles size 3
theme_bw() + #white background
theme(axis.title.y = element_text(vjust = 2), axis.title.x = element_text(vjust = -0.3)) + #distance of axis titles from axis
theme(panel.border = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), #no grids
axis.line = element_line(colour = "black")) + #black axes
theme(text = element_text(size = 30)) + #font size
scale_colour_manual(values = c("black","#FF6600", "yellow1")) + #changed default colours
labs(colour = "p1") +
theme(legend.title = element_text(size = "20")) + #changes the legend title
theme(legend.text = element_text(size = "20")) + #changes the legend title
theme(legend.key = element_blank()) + #removed little squares around legend symbols
theme(legend.direction = "horizontal", legend.position = c(.33, 1)) +
# Now to add the second plot/ No need to code for axis titles, titles positions,etc b/c it's already coded in the first plot
geom_point(data = p2, aes(x = x, y = y, colour = treatment), size = 3, shape = 17)
This produces a graph with each data frame represented by a different symbol (circles for p1 and triangles for p2), but with only one combined legend (triangles superimposed over circles). How can I get two separate legends, one for each data frame?
Thank you!
After doing some research and trying different things I was able to solve PART of my problem. To add two plots together, one needs to be plotted first and the other drawn on top of it using
geom_point()
My new code looks like this:
csplotASMS = ggplot(csAS, aes(x = RG, y = GB, colour = Background)) + geom_point(size = 3) + #colour by background, circles size 3
geom_path(data = ell.AS, aes(x = RG, y = GB ,colour = Background), size = 1, linetype = 1) + #adding the ellipses
geom_point(data = centroidsAS, size = 4) + #added centroids
geom_errorbar(data = centroidsAS, aes(ymin = GB - se.GB, ymax = GB + se.GB), width = 0) + #add y error bars
geom_errorbarh(data = centroidsAS, aes(xmin = RG - se.RG, xmax = RG + se.RG), height = 0) +
theme_bw() + #white background
theme(axis.title.y = element_text(vjust = 2), axis.title.x = element_text(vjust = -0.3)) + #distance of axis titles from axis
theme(panel.border = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), #no grids
axis.line = element_line(colour = "black")) + #black axes
theme(text = element_text(size = 30)) + #font size
ylab("(G-B)/(G+B)") + xlab("(R-G)/(R+G)") + # Set text for axes labels
scale_colour_manual(values = c("black","#FF6600", "yellow1")) + #changed default colours
labs(colour = "Alice Springs") +
theme(legend.title = element_text(size = "20")) + #changes the legend title
theme(legend.text = element_text(size = "20")) + #changes the legend title
theme(legend.key = element_blank()) + #removed little squares around legend symbols
theme(legend.direction = "horizontal", legend.position = c(.33, 1)) +
# Now to add the second plot/ No need to code for axis titles, titles positions,etc b/c it's already coded in the first plot
geom_point(data = csMS, aes(x = RG, y = GB, colour = Background), size = 3, shape = 17) +
geom_path(data = ell.MS, aes(x = RG, y = GB ,colour = Background), size = 1, linetype = 2) + #adding the ellipses
geom_point(data = centroidsMS, size = 4, shape = 17) + #added centroids
geom_errorbar(data = centroidsMS, aes(ymin = GB - se.GB, ymax = GB + se.GB), width = 0) + #add y error bars
geom_errorbarh(data = centroidsMS, aes(xmin = RG - se.RG, xmax = RG + se.RG), height = 0) #add x error bars
The graph depicts a scatterplot for two populations, each with three treatments. Because the treatments are the same for both populations, I want to use the same colours but different symbols to distinguish the populations: circles for one and triangles for the other.
Now, the part I can't answer yet is how to have two separate legends, one for each "plot", i.e. one for the circles and one for the triangles. At the moment there is a combined legend showing triangles superimposed on circles. Each legend should have its own title.
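One possible way to get a separate legend per population, borrowing the fill/colour split used with shape 21 earlier on this page: map treatment to colour for one data frame and to fill for the other, using a fillable shape for the second (24 is a filled triangle). A sketch on the p1/p2 sample data; whether it carries over to the full plot with ellipses and error bars is untested.

```r
library(ggplot2)

# Sample data as printed in the question
p1 <- data.frame(treatment = c("Black", "Orange", "Yellow"), x = 1:3, y = 1:3)
p2 <- data.frame(treatment = c("Black", "Orange", "Yellow"), x = 4:6, y = 4:6)
cols <- c(Black = "black", Orange = "#FF6600", Yellow = "yellow1")

p <- ggplot(mapping = aes(x = x, y = y)) +
  # population 1: circles, coloured through the colour scale
  geom_point(data = p1, aes(colour = treatment), size = 3, shape = 16) +
  # population 2: fillable triangles, coloured through the fill scale
  geom_point(data = p2, aes(fill = treatment), size = 3, shape = 24,
             colour = "transparent") +
  scale_colour_manual(values = cols) +
  scale_fill_manual(values = cols) +
  # one legend title per scale, hence per population
  labs(colour = "p1", fill = "p2") +
  theme_bw()
p
```

Each layer only appears in the legend of the aesthetic it maps, so the colour legend shows circles and the fill legend shows triangles.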
