R / ggplot2: Multiple regression lines on same axes

I need to plot two regressions on the same axes. For this I have 3 columns in my dataset (let's call them A, B & C). I want to plot B against A and then C against A, and have these as different colour regression lines, with the data points being the same colour as their corresponding lines.
To be more specific, to create individual plots I used the following code for the first one:
P1 <- ggplot(data=volboth, aes(x=control, y=vol30)) +
geom_point(alpha=1, size=4, color="maroon") +
ggtitle("Correlation Plot: Ground Survey (Control) vs 30m UAV Survey") +
labs(x = expression(paste("Volume - Control Data - ", m^{3})),
y = expression(paste("Volume - Aerial Data - ", m^{3}))) +
xlim(0, 5) +
ylim(0, 5) +
geom_smooth(method=lm, se=FALSE, fullrange=TRUE)
And then the following for the second plot:
P2 <- ggplot(data=volboth, aes(x=vol10, y=control)) +
geom_point(alpha=1, size=4, color="maroon") +
ggtitle("Correlation Plot: Ground Survey (Control) vs 10m UAV Survey") +
labs(x = expression(paste("Volume - Aerial Data - ", m^{3})),
y = expression(paste("Volume - Control Data - ", m^{3}))) +
xlim(0, 5) +
ylim(0, 5) +
geom_smooth(method=lm, se=FALSE, fullrange=TRUE)
Any ideas on how to combine both plots onto the same axes, and to apply corresponding visual themes? I'm open to using standard R (not ggplot2) if that makes things easier.

library(ggplot2)
# first, some sample data
volboth <- data.frame(control=(0:100)/20, vol10=(50:150)/50, vol30=(120:20)/30)
# next, make a plot
P1 <- ggplot(data=volboth, aes(x=control, y=vol30)) +
geom_point(alpha=1, size=4, color="maroon") +
geom_smooth(method=lm, se=FALSE, fullrange=TRUE, color="maroon") +
# now add a second layer with the same x but another y (blue for clarity)
geom_point(aes(y=vol10), alpha=1, size=4, color="blue") +
geom_smooth(aes(y=vol10), method=lm, se=FALSE, fullrange=TRUE, color="blue") +
ggtitle("Correlation Plot: Ground Survey (Control) vs 10m and 30m UAV Surveys") +
labs(x = expression(paste("Volume - Control Data - ", m^{3})),
y = expression(paste("Volume - Aerial Data - ", m^{3}))) +
xlim(0, 5) +
ylim(0, 5)
print(P1)
Which gives me this graph:
I have used geom_point here but, as you have worked out yourself, if your points are close together, geom_jitter might be a better alternative.
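If a legend is wanted as well, one possible variation is to stack the two survey columns into a single column and map colour to the survey name. This is only a sketch under assumptions: the names long, volume, survey and p_long are made up for illustration, and the data is the same sample data as above rather than the real measurements.

```r
library(ggplot2)

# Same sample data as in the answer above
volboth <- data.frame(control = (0:100) / 20,
                      vol10   = (50:150) / 50,
                      vol30   = (120:20) / 30)

# stack() puts vol10 and vol30 into one column and records the source column
stacked <- stack(volboth[c("vol10", "vol30")])
long <- data.frame(control = rep(volboth$control, times = 2),
                   volume  = stacked$values,
                   survey  = stacked$ind)

# One colour mapping drives both the points and the regression lines
p_long <- ggplot(long, aes(x = control, y = volume, colour = survey)) +
  geom_point(size = 4) +
  geom_smooth(method = "lm", se = FALSE, fullrange = TRUE) +
  scale_colour_manual(values = c(vol10 = "blue", vol30 = "maroon")) +
  xlim(0, 5) +
  ylim(0, 5)
print(p_long)
```

Because colour is mapped inside aes(), ggplot2 draws the legend itself and fits one line per survey.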

Related

ggplot scatterplot with missing x values, trendlines won't connect

I am new to R, and I'm trying to use ggplot2 to plot some data as a scatterplot. I'm missing a day in my samples, and the trendline I made won't connect all of the data together. Below is the code I have and what the graph looks like.
ggplot(SiExptTEPa, aes(x=Timepoint..dpi., y=(TEPcells),group=(Treatment))) +
geom_point(size=5,aes(colour=Nutrient)) +
scale_color_manual(values=c('yellow','light blue')) +
geom_errorbar(aes(ymin=TEPcells-se, ymax=TEPcells+se), width=.1) +
facet_wrap(~Nutrient, scales="free") +
scale_y_continuous(labels = scientific) +
theme_classic() +
xlab("Time Post Infection (Days)") +
ylab("TEP by Total Cells") +
ylim(3e-08,2e-07) +
geom_line(aes(linetype=Treatment)) +
scale_linetype_manual(values=c("solid", "dashed"))
Incorrect graph
Please tell me how to connect the gap in the middle of both sides of the graph so that there is one continuous line?

Adding Legends in Graphs without tidy data

#Plot the in sample forecasts against the actual values
#Build the confidence interval
Upper95 <- fcast1 + 1.96*sqrt(Var1)
Lower95 <- fcast1 - 1.96*sqrt(Var1)
Upper80 <- fcast1 + 1.28*sqrt(Var1)
Lower80 <- fcast1 - 1.28*sqrt(Var1)
#Create a data frame
dfb <- data.frame(TeslaWeeklyPrices$Date,fcast1,TeslaWeeklyPrices$TeslaPrices,Upper95,Lower95,Upper80,Lower80)
#Make the Plot
Plot1 <- ggplot(dfb, aes(x=TeslaWeeklyPrices.Date, y=TeslaWeeklyPrices.TeslaPrices))+
geom_ribbon(data=dfb,aes(ymin=Upper95,ymax=Lower95),fill = "slategray2")+
geom_ribbon(data=dfb,aes(ymin=Upper80,ymax=Lower80),fill = "bisque")+
geom_line(data=dfb, aes(x=TeslaWeeklyPrices.Date, y=fcast1),size=1, color="red1")+
geom_point(shape = 19, fill = "white", colour = "blue" ,size = 1)+
theme_light(base_size = 11) +
ylab("Tesla Stock price ($)") + xlab("Date (weeks)")
Plot1
That is my code for my graph.
That is how it looks. I want to add legends in my graph without having to tidy my data. Because then I can not format my graph as I want.
After the useful comment I got.
Upper95 <- fcast1 + 1.96*sqrt(Var1)
Lower95 <- fcast1 - 1.96*sqrt(Var1)
Upper80 <- fcast1 + 1.28*sqrt(Var1)
Lower80 <- fcast1 - 1.28*sqrt(Var1)
dfb <- data.frame(TeslaWeeklyPrices$Date,fcast1,TeslaWeeklyPrices$TeslaPrices,Upper95,Lower95,Upper80,Lower80)
Plot1 <- ggplot(dfb, aes(x=TeslaWeeklyPrices.Date, y=TeslaWeeklyPrices.TeslaPrices))+
geom_ribbon(aes(ymin=Upper95, ymax=Lower95, fill='95% prediction level')) +
geom_ribbon(aes(ymin=Upper80, ymax=Lower80, fill='80% prediction level')) +
geom_line(data=dfb, aes(x=TeslaWeeklyPrices.Date, y=fcast1,
color="Predicted Values"),size=1)+
geom_point(shape = 19, aes(color = "Observed Values"),
fill = "white", size = 1)+
scale_fill_manual(values=c('95% prediction level'='slategray2', '80% prediction level'="bisque"), breaks=c('95% prediction level', '80% prediction level')) +
scale_color_manual(values=c("Predicted Values"="red","Observed Values"= "blue"), breaks=c('Predicted Values', 'Observed Values'))+
guides(color=guide_legend(title=NULL),fill=guide_legend(title=NULL) ) +
theme(legend.margin = margin(b=0, t=-1000))+
theme_light(base_size = 12)
Plot1
That is my new code.
So how can my blue points look like points in the legend and not like a line? And how can I set the margin between my 2 legends to 0?
Can I format the background color of this so it looks like an independent part and not as part of the graph?
That is an example I saw in one paper.
First of all, I do have a bit of an issue with the comment:
I want to add legends in my graph without having to tidy my data.
Because then I can not format my graph as I want.
Tidying your data is often the best solution to do just that (format the graph as you want), but I kind of agree it might be more straightforward in this case to just "brute force" the legend into place. I'll show you how.
Since I don't have your data, I made up my own to mirror what you shared:
set.seed(1234)
time <- 1:50
Var1 <- unlist(lapply(time, function(x) rnorm(1, 1, 0.01)))
fcast1 <- unlist(lapply(time, function(x) { x * rnorm(1, 0.1, 0.01)}))
Upper95 <- fcast1 + 1.96*sqrt(Var1)
Lower95 <- fcast1 - 1.96*sqrt(Var1)
Upper80 <- fcast1 + 1.28*sqrt(Var1)
Lower80 <- fcast1 - 1.28*sqrt(Var1)
dfb <- data.frame(time, fcast1, Upper95, Lower95, Upper80, Lower80)
And the plot:
ggplot(dfb, aes(time, fcast1)) +
geom_ribbon(aes(ymin=Upper95, ymax=Lower95), fill='slategray2') +
geom_ribbon(aes(ymin=Upper80, ymax=Lower80), fill='bisque') +
geom_line(size=1, color='red1') +
theme_light()
To create the legend without tidy data, you need to build it piece by piece, but still use ggplot to do so. ggplot2 creates a legend for every aesthetic that is mapped inside aes() and is not a position (x or y). Therefore, to make the legend appear, you only need to put the aesthetic modifiers fill and color inside the aes() part of each geom_*() function.
It's not quite that simple, but it becomes clearer once you understand how it works. The value you assign to fill= or color= inside aes() is used as the label in the legend, not as the color. You specify the actual colors with a scale_*() function.
ggplot(dfb, aes(time, fcast1)) +
geom_ribbon(aes(ymin=Upper95, ymax=Lower95, fill='Upper')) +
geom_ribbon(aes(ymin=Upper80, ymax=Lower80, fill='Lower')) +
geom_line(size=1, aes(color="Forecast")) +
scale_fill_manual(values=c('Upper'='slategray2', 'Lower'='bisque'), breaks=c('Upper', 'Lower')) +
scale_color_manual(values='red1') +
theme_light()
That looks more like it, but it's not perfect. Perhaps you would want the line and fill boxes in the legend to become "one" legend box instead? If that's the case, you can't really do that (because they span two different aesthetic modifiers, fill and color); however, we can make the same effect if we do a few things:
Remove the title for the color legend
Change the title for the fill legend
Use the theme elements and margins to move the legends closer together to look as one
Here you can see how you might do that:
p + # p holds the plot from above (assign it first with p <- ggplot(...) + ...)
guides(
color=guide_legend(title=NULL),
fill=guide_legend(title='Legend')
) +
theme(legend.margin = margin(b=0, t=-13))
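The order of the theme calls matters here. In the question's code, theme(legend.margin = ...) comes before theme_light(); a complete theme such as theme_light() replaces theme settings added before it, so the margin tweak is lost. A minimal sketch of the two orderings, using the built-in mtcars data as a stand-in (the names base_plot, lost and kept are made up for illustration):

```r
library(ggplot2)

# Stand-in plot with a legend, built from the built-in mtcars data
base_plot <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point()

# Tweak first, complete theme second: theme_light() replaces the tweak
lost <- base_plot +
  theme(legend.margin = margin(b = 0, t = -13)) +
  theme_light()

# Complete theme first, tweak second: the tweak is applied on top
kept <- base_plot +
  theme_light() +
  theme(legend.margin = margin(b = 0, t = -13))
```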
EDIT:
OP asked if the points could appear on the chart as well. They certainly can, and you have to use a similar method to do that. You can just add color= into aes() for geom_point() like before:
ggplot(dfb, aes(time, fcast1)) +
geom_ribbon(aes(ymin=Upper95, ymax=Lower95, fill='Upper')) +
geom_ribbon(aes(ymin=Upper80, ymax=Lower80, fill='Lower')) +
geom_line(size=1, aes(color="Forecast")) +
geom_point(size=1, aes(color='Actual Values'), shape=19) +
scale_fill_manual(values=c('Upper'='slategray2', 'Lower'='bisque'), breaks=c('Upper', 'Lower')) +
scale_color_manual(values=c('Forecast'='red1', 'Actual Values'='blue')) +
theme_light() +
guides(
color=guide_legend(title=NULL),
fill=guide_legend(title='Legend')
) +
theme(legend.margin = margin(b=0, t=-13))
One small problem there... you'll notice the icon (called "glyph") next to "Actual Values" and "Forecast" is a line + a point. I think you'd prefer to have a point be the glyph for the point and a line be the glyph for the line. Getting that right inside a single legend takes extra work (both entries are part of the color legend), so a simple fix is to separate "Actual Values" into another legend. In this case, we'll just use the shape aesthetic modifier and have a third legend that also has no title.
ggplot(dfb, aes(time, fcast1)) +
geom_ribbon(aes(ymin=Upper95, ymax=Lower95, fill='Upper')) +
geom_ribbon(aes(ymin=Upper80, ymax=Lower80, fill='Lower')) +
geom_line(size=1, aes(color="Forecast")) +
geom_point(size=1, aes(shape='Actual Values'), color='blue') +
scale_fill_manual(values=c('Upper'='slategray2', 'Lower'='bisque'), breaks=c('Upper', 'Lower')) +
scale_color_manual(values=c('Forecast'='red1')) +
scale_shape_manual(values=19) +
theme_light() +
guides(
color=guide_legend(title=NULL),
shape=guide_legend(title=NULL),
fill=guide_legend(title='Legend')
) +
theme(legend.margin = margin(b=0, t=-13))
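Should a single color legend be preferred over the extra shape legend, guide_legend() also accepts an override.aes list that changes how each key is drawn. The following is a sketch under assumptions, not part of the approach above: the data is made up again, the ribbons are left out for brevity, and p_legend is a hypothetical name.

```r
library(ggplot2)

# Made-up data in the spirit of the example above
set.seed(1234)
time <- 1:50
fcast1 <- time * rnorm(50, 0.1, 0.01)
dfb <- data.frame(time, fcast1)

p_legend <- ggplot(dfb, aes(time, fcast1)) +
  geom_line(aes(color = "Forecast")) +
  geom_point(aes(color = "Actual Values"), size = 1, shape = 19) +
  scale_color_manual(
    values = c("Forecast" = "red1", "Actual Values" = "blue"),
    breaks = c("Actual Values", "Forecast")  # fixes the key order used below
  ) +
  guides(color = guide_legend(
    title = NULL,
    # first key ("Actual Values"): point only; second key ("Forecast"): line only
    override.aes = list(linetype = c("blank", "solid"), shape = c(19, NA))
  )) +
  theme_light()
print(p_legend)
```

The vectors in override.aes follow the order given in breaks, which is why the breaks are set explicitly.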
Now you have all the information needed to become a ggplot master :).

Change tickmark labels in ggplot2 [duplicate]

I would like to show a short time series showing heterogeneity of heroin seizures in Europe over the span of 22 years. However, a different number of countries is included in some of the years. I would like to display this in the graph by putting "n=xx" for each year on the x-axis. Does anyone know how I should do this?
across_time<- ggplot(by_year, aes(year, value)) +
geom_errorbar(aes(ymin=value-se, ymax=value+se), width=.4) +
geom_line(colour="black", size= 2) +
geom_point(size=4, shape=21, fill="white") + # 21 is filled circle
xlab("Year") +
ylab("Seizures") +
ggtitle("Heterogeneity Across Time") +
scale_x_continuous(breaks = seq(1990, 2012, by=2))
across_time
Here is a link to what the graph looks like:
http://imgur.com/XWhBqqi
I found this as a solution:
#make a list of the labels you want
lab<- c("1990\nn=26", "1991\nn=29", "1992\nn=30", "1993\nn=32", "1994\nn=36", "1995\nn=35", "1996\nn=33", "1997\nn=38", "1998\nn=36", "1999\nn=39", "2000\nn=39", "2001\nn=40", "2002\nn=38", "2003\nn=40", "2004\nn=39", "2005\nn=41", "2006\nn=42", "2007\nn=43", "2008\nn=44", "2009\nn=41", "2010\nn=41", "2011\nn=41", "2012\nn=42")
lab<- as.factor(lab)
#bind our label list to our table
by_year<-cbind(lab, by_year)
#make a column of zeros to group by for the line
by_year$g<- 0
# Standard error of the mean
across_time<- ggplot(by_year, aes(x=lab, y=value)) +
geom_errorbar(aes(ymin=value-se, ymax=value+se), width=.4) +
geom_line(aes(group=g), colour="black", size= 2) + #notice the grouping
geom_point(size=4, shape=21, fill="white") + # 21 is filled circle
scale_x_discrete(labels = by_year$lab) + # discrete not continuous
xlab("Year & Number of Reporting Countries") +
ylab("Total Annual Seizures") +
ggtitle("Heterogeneity of Heroin Seizures in Europe")
across_time
Here is the final result:
Have you tried using the labels argument in scale_x_continuous? If you have a vector with the "n=xx" values you want as labels, this should work.
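A minimal sketch of that suggestion, keeping year numeric so no factor conversion or grouping column is needed. The by_year data frame here is a made-up stand-in: the n values for 1990 to 1995 are taken from the label list above, while value and se are invented, and axis_labels is a hypothetical name.

```r
library(ggplot2)

# Hypothetical stand-in for by_year; value and se are invented numbers
by_year <- data.frame(year  = 1990:1995,
                      value = c(5, 7, 6, 9, 8, 10),
                      se    = 0.5,
                      n     = c(26, 29, 30, 32, 36, 35))

# One label per break: the year, a line break, then the country count
axis_labels <- paste0(by_year$year, "\nn=", by_year$n)

across_time <- ggplot(by_year, aes(year, value)) +
  geom_errorbar(aes(ymin = value - se, ymax = value + se), width = 0.4) +
  geom_line(colour = "black") +
  geom_point(size = 4, shape = 21, fill = "white") +
  # breaks and labels must have the same length
  scale_x_continuous(breaks = by_year$year, labels = axis_labels) +
  xlab("Year & Number of Reporting Countries")
print(across_time)
```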

How to ggplot with pre calculated quantiles?

I am using a model to predict some numbers. My prediction also includes a confidence interval for each number. I need to plot the actual numbers + predicted numbers and their quantile values on the same plot. Here is a simple example:
actualVals = c(12,20,15,30)
lowQuantiles = c(19,15,12,18)
midQuantiles = c(22,22,17,25)
highQuantiles = c(30,25,25,30)
and I'm looking for something like this, perhaps by using ggplot():
You can use geom_errorbar; related geoms are listed at ?geom_errorbar. I created a data.frame, dat, from your variables and added dat$x <- 1:4.
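The construction of dat is not shown; a straightforward way to build it from the question's vectors might look like this (a sketch, assuming the column names simply reuse the vector names):

```r
# Vectors copied from the question
actualVals    <- c(12, 20, 15, 30)
lowQuantiles  <- c(19, 15, 12, 18)
midQuantiles  <- c(22, 22, 17, 25)
highQuantiles <- c(30, 25, 25, 30)

# data.frame() takes the column names from the variable names
dat <- data.frame(actualVals, lowQuantiles, midQuantiles, highQuantiles)

# x position for each prediction
dat$x <- 1:4
```

With dat in place, the plot is: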
ggplot(dat) +
geom_errorbar(aes(x, y=midQuantiles, ymax=highQuantiles, ymin=lowQuantiles, width=0.2), lwd=2, color="blue") +
geom_point(aes(x, midQuantiles), cex=4, shape=22, fill="grey", color="black") +
geom_line(aes(x, actualVals), color="maroon", lwd=2) +
geom_point(aes(x, actualVals), shape=21, cex=4, fill="white", color='maroon') +
ylim(0, 30) +
theme_bw()

Plot two regression lines (calculated on subset of the same data frame) on the same graph with ggplot

I have this kind of data frame:
df<-data.frame(x=c(1,2,3,4,5,6,7,8,9,10),y=c(2,11,24,30,45,65,90,110,126,145), a=c(0.2,0.2,0.3,0.4,0.1,0.8,0.7,0.6,0.8,0.9))
Using ggplot, I would like to plot on the same figure two regression lines, calculated for a subset of my data frame under condition (a > or < 0.5).
Visually, I would like both of these regression lines:
df_a<-subset(df, df$a<0.5)
ggplot(df_a,aes(x,y))+
geom_point(aes(color = a), size=3.5) +
geom_smooth(method="lm", size=1, color="black") +
ylim(-5,155) +
xlim(0,11)
df_b<-subset(df, df$a>0.5)
ggplot(df_b,aes(x,y)) +
geom_point(aes(color = a), size=3.5) +
geom_smooth(method="lm", size=1, color="black") +
ylim(-5,155) +
xlim(0,11)
Appear on this figure:
ggplot(df,aes(x,y))+ geom_point(aes(color = a), size=3.5)
I've tried with par(new=TRUE) without success.
Make a flag variable, and use group:
df$small=df$a<0.5
ggplot(df,aes(x,y,group=small))+geom_point() + stat_smooth(method="lm")
and have yourself pretty colours and a legend if you want:
ggplot(df,aes(x,y,group=small,colour=small))+geom_point() + stat_smooth(method="lm")
Or maybe you want to colour the dots:
ggplot(df,aes(x,y,group=small)) +
stat_smooth(method="lm")+geom_point(aes(colour=a))
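To check which lines stat_smooth draws for each group, the same two regressions can be fitted directly with lm(). This is only a sketch: fits is a made-up name, and it assumes the same df and small flag as above.

```r
# Data frame from the question
df <- data.frame(x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                 y = c(2, 11, 24, 30, 45, 65, 90, 110, 126, 145),
                 a = c(0.2, 0.2, 0.3, 0.4, 0.1, 0.8, 0.7, 0.6, 0.8, 0.9))

# Same flag as in the answer: TRUE where a < 0.5
df$small <- df$a < 0.5

# Fit y ~ x separately within each group and keep the coefficients
fits <- lapply(split(df, df$small), function(d) coef(lm(y ~ x, data = d)))
print(fits)
```

The list has one element per value of small ("FALSE" and "TRUE"), each holding the intercept and slope of the corresponding line.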
