I am using the following code in R to plot a linear regression with 95% confidence interval bands around the regression line.
Average <- c(0.298,0.783429,0.2295,0.3725,0.598,0.892,2.4816,2.79975,
1.716368,0.4845,0.974133,0.824,0.936846,1.54905,0.8166,1.83535,
1.6902,1.292667,0.2325,0.801,0.516,2.06645,2.64965,2.04785,0.55075,
0.698615,1.285,2.224118,2.8576,2.42905,1.138143,1.94225,2.467357,0.6615,
0.75,0.547,0.4518,0.8002,0.5936,0.804,0.7,0.6415,0.702182,0.7662,0.847)
Area <-c(8.605,16.079,4.17,5.985,12.419,10.062,50.271,61.69,30.262,11.832,25.099,
8.594,17.786,36.995,7.473,33.531,30.97,30.894,4.894,8.572,5.716,45.5,69.431,
40.736,8.613,14.829,4.963,33.159,66.32,37.513,27.302,47.828,39.286,9.244,19.484,
11.877,9.73,11.542,12.603,9.988,7.737,9.298,14.918,17.632,15)
lm.out <- lm (Area ~ Average)
newx = seq(min(Average), by = 0.05)
conf_interval <- predict(lm.out, newdata = data.frame(Average = newx), interval ="confidence",
level = 0.95)
plot(Average, Area, xlab ="Average", ylab = "Area", main = "Regression")
abline(lm.out, col = "lightblue")
lines(newx, conf_interval[,2], col = "blue", lty ="dashed")
lines(newx, conf_interval[,3], col = "blue", lty ="dashed")
I am stuck because the graph I get shows the bands for only the first part of the line and leaves out the rest (the link to the image is at the bottom of the message). What is going wrong? I would also like to shade the area of the confidence interval (not just draw the lines at its limits), but I can't work out how to do it.
Any help would be really appreciated, I am completely new to R.
This is very easy with the ggplot2 library. Here is the code:
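library(ggplot2)
data <- data.frame(Average, Area)
# Area is the response (lm(Area ~ Average)), so it goes on the y axis
ggplot(data = data, aes(x = Average, y = Area)) +
  geom_smooth(method = "lm", level = 0.95) +  # fitted line with shaded 95% band
  geom_point()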
Code to install the library:
install.packages("ggplot2")
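On the "what is going wrong" part in base R: seq(min(Average), by = 0.05) has no upper limit, and seq()'s to argument defaults to 1, so newx stops at about 1 and the bands stop with it. Here is a minimal sketch of one way to fix that and to shade the band with polygon(). The simulated Average and Area values are stand-ins I made up so the snippet runs on its own; swap in the vectors from the question.

```r
# Stand-in data (assumption): replace with the Average and Area vectors from the question
set.seed(1)
Average <- runif(45, 0.2, 2.9)
Area <- 3 + 20 * Average + rnorm(45, sd = 6)

lm.out <- lm(Area ~ Average)

# Give seq() an explicit upper limit so the grid spans the whole x range
newx <- seq(min(Average), max(Average), by = 0.05)
conf_interval <- predict(lm.out, newdata = data.frame(Average = newx),
                         interval = "confidence", level = 0.95)

plot(Average, Area, xlab = "Average", ylab = "Area", main = "Regression")
# Shade the band: trace the lower limit left to right, then the upper limit back
polygon(c(newx, rev(newx)),
        c(conf_interval[, "lwr"], rev(conf_interval[, "upr"])),
        col = adjustcolor("lightblue", alpha.f = 0.4), border = NA)
abline(lm.out, col = "blue")
lines(newx, conf_interval[, "lwr"], col = "blue", lty = "dashed")
lines(newx, conf_interval[, "upr"], col = "blue", lty = "dashed")
points(Average, Area)  # redraw the points so they sit on top of the shading
```

The polygon is drawn before the line and points so that it does not hide them.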
I am trying to plot an NMDS plot of species community composition data with ellipses which represent 95% confidence intervals. I generated the data for my NMDS plot using metaMDS and successfully have ordinations generated using the basic plot functions in R (see code below). However, I am struggling to get my data to plot successfully using ggplot2 and this is the only way I have seen 95% CIs plotted on NMDS plots. I am hoping someone is able to help me correct my code so the ellipses show 95% CIs, or could point me in the right direction for achieving this using other methods?
My basic code for plotting my NMDS plot:
orditorp(dung.families.mds, display = "sites", labels = F, pch = c(16, 8, 17, 18) [as.numeric(group.variables$Heating)], col = c("green", "blue", "orange", "black") [as.numeric(group.variables$Dungfauna)], cex = 1.3)
ordiellipse(dung.families.mds, groups = group.variables$Dungfauna, draw = "polygon", lty = 1, col = "grey90")
legend("topleft", "stress = 0.1329627", bty = "n", cex = 1)
My ordination:
I realize this question is old, but I found the following post useful for plotting confidence ellipses in my own work, and maybe it will help you: "Plotting ordiellipse function from vegan package onto NMDS plot created in ggplot2".
Edit: Below I have copied the code from the second part of Didzis Elferts's answer on the link above.
Where "sol" is the metaMDS object:
First, make NMDS data frame with group column.
NMDS = data.frame(MDS1 = sol$points[,1], MDS2 = sol$points[,2], group = MyMeta$amt)
Next, save result of function ordiellipse() as some object.
ord <- ordiellipse(sol, MyMeta$amt, display = "sites", kind = "se", conf = 0.95, label = T)
Data frame df_ell contains the values needed to draw the ellipses. It is calculated with the function veganCovEllipse, which is hidden in (not exported from) the vegan package. The function is applied to each level of the group column of NMDS, using the arguments stored in the ord object: the cov, center and scale of each level.
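df_ell <- data.frame()
# NMDS$group must be a factor for levels() to return the group names
for(g in levels(NMDS$group)){
  # veganCovEllipse is not exported, so reach it with vegan:::
  df_ell <- rbind(df_ell, cbind(as.data.frame(with(NMDS[NMDS$group==g,],
                  vegan:::veganCovEllipse(ord[[g]]$cov,ord[[g]]$center,ord[[g]]$scale)))
                  ,group=g))
}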
Plotting is done the same way as in the previous example. Because the ellipse coordinates are calculated from the object returned by ordiellipse(), this solution will work with whatever parameters you pass to that function.
ggplot(data = NMDS, aes(MDS1, MDS2)) + geom_point(aes(color = group)) +
geom_path(data=df_ell, aes(x=NMDS1, y=NMDS2,colour=group), size=1, linetype=2)
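For intuition about what the hidden veganCovEllipse step does, here is a minimal base R sketch. It assumes the ellipse is built by mapping a unit circle through the Cholesky factor of a covariance matrix, and that a 95% standard-error ellipse uses the covariance of the group mean (cov / n) with a radius of sqrt(qchisq(0.95, 2)). That is my understanding of the approach, not a copy of vegan's source. The function name cov_ellipse and the simulated scores are hypothetical.

```r
# Hypothetical helper: points on the ellipse defined by a covariance matrix
cov_ellipse <- function(cov, center = c(0, 0), scale = 1, npoints = 100) {
  theta <- seq(0, 2 * pi, length.out = npoints + 1)
  circle <- cbind(cos(theta), sin(theta))   # unit circle, one point per row
  ell <- circle %*% chol(cov) * scale       # stretch and rotate by the covariance
  sweep(ell, 2, center, "+")                # shift to the group centre
}

# Simulated stand-in for the NMDS site scores of one group (assumption)
set.seed(1)
scores <- cbind(NMDS1 = rnorm(30), NMDS2 = rnorm(30, sd = 0.5))

center <- colMeans(scores)
se_cov <- cov(scores) / nrow(scores)        # covariance of the group mean
radius <- sqrt(qchisq(0.95, df = 2))        # 95% region for a bivariate normal

ell <- cov_ellipse(se_cov, center, radius)
plot(scores, asp = 1)
lines(ell, lty = 2)
```

Every point on the resulting curve sits at the same Mahalanobis distance (the chosen radius) from the group centre, which is what makes it a confidence ellipse for the centroid rather than an outline of the data.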
I have some data about temperature percentages for different time periods, and I want to create a barplot showing those percentages and then add a linear regression line showing the trend. Although I manage to get the first graph, I fail to add a straight regression line.
Basically I try to make a barplot with these tx_1 data
tx_1<-c(0.055,0.051,0.057,0.049,0.061,0.045)
mypath<-file.path("C:\\tx5\\1.jpeg")
jpeg(file = mypath,width = 1200, height = 600)
plot.dim<-barplot(tx_1,
space= 2,
ylim=c(0,0.15),
main = "Percentage of days when Tmax < 5th percentile",
xlab = "Time Periods",
ylab = "Percentage",
names.arg = c("1975-1984", "1985-1990", "1991-1996", "1997-2002", "2003-2008", "2009-2014"),
col = "darkred",
horiz = FALSE)
dev.off()
I tried using ggplot also, but with no luck
Here I have included both a line connecting the observations and an overall best linear fit line. Hope this helps.
library(tidyverse)
year <- tribble(~ Year,~ Percent,
94,0.055,
95,0.051,
96,0.057,
97,0.049,
98,0.061,
99,0.045)
ggplot(year,aes(Year,Percent)) +
geom_bar(stat = "identity") +
geom_line() +
geom_smooth(method = "lm",se = F)
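If you want to stay with base barplot(), here is a minimal sketch of one way to overlay the trend. It relies on barplot() returning the x positions of the bar midpoints, and it assumes the periods can be treated as equally spaced (the index 1 to 6 is my simplification; the first period is actually longer than the others). The names mids, idx and fit are just illustrative.

```r
tx_1 <- c(0.055, 0.051, 0.057, 0.049, 0.061, 0.045)
periods <- c("1975-1984", "1985-1990", "1991-1996",
             "1997-2002", "2003-2008", "2009-2014")

# barplot() returns the midpoint of each bar on the x axis
mids <- barplot(tx_1, space = 2, ylim = c(0, 0.15),
                main = "Percentage of days when Tmax < 5th percentile",
                xlab = "Time Periods", ylab = "Percentage",
                names.arg = periods, col = "darkred")

# Regress on the period index (assumption: equally spaced periods)
idx <- seq_along(tx_1)
fit <- lm(tx_1 ~ idx)

# The midpoints are evenly spaced, so the fitted values form a straight line
lines(mids, fitted(fit), lwd = 2)
```

Call this between jpeg() and dev.off() if you want the result written to a file as in the question.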
I would like to add the median spline and corresponding confidence interval bands to a ggplot2 scatter plot. I am using the 'quantreg'-package, more specifically the rqss function (Additive Quantile Regression Smoothing).
In ggplot2 I am able to add the median spline, but not the confidence interval bands:
fig = ggplot(dd, aes(y = MeanEst, x = N, colour = factor(polarization)))
fig + stat_quantile(quantiles=0.5, formula = y ~ qss(x), method = "rqss") +
geom_point()
The quantreg package comes with its own plot function, plot.rqss, with which I am able to add the confidence bands (bands=TRUE):
plot(1, type="n", xlab="", ylab="", xlim=c(2, 12), ylim=c(-3, 0)) # empty plot
plotfigs = function(df) {
rqss_model = rqss(df$MeanEst ~ qss(df$N))
plot(rqss_model, bands=TRUE, add=TRUE, rug=FALSE, jit=FALSE)
return(NULL)
}
figures = lapply(split(dd, as.factor(dd$polarization)), plotfigs)
However, the plot function that comes with the quantreg package is not very flexible or well suited to my needs. Is it possible to get the confidence bands in a ggplot2 plot, perhaps by mimicking the method used in the quantreg package, or simply by copying them from the plot?
Data: pastebin.
You almost have it. When you call
plot(rqss_model, bands=TRUE, add=TRUE, rug=FALSE, jit=FALSE)
the function very helpfully returns the plotted data, so all we need to do is grab it as a data frame. First, a minor tweak to your function so that it returns the data in a sensible way:
plotfigs = function(df) {
rqss_model = rqss(df$MeanEst ~ qss(df$N))
band = plot(rqss_model, bands=TRUE, add=TRUE, rug=FALSE, jit=FALSE)
data.frame(x=band[[1]]$x, low=band[[1]]$blo, high=band[[1]]$bhi,
pol=unique(df$polarization))
}
Next call the function and condense
figures = lapply(split(dd, as.factor(dd$polarization)), plotfigs)
bands = Reduce("rbind", figures)
Then use geom_ribbon to plot
## We inherit y and color, so have to set them to NULL
fig + geom_ribbon(data=bands,
aes(x=x, ymin=low, ymax=high,
y=NULL, color=NULL, group=factor(pol)),
alpha=0.3)
Say I have some data, d, and I fit nls models to two subsets of the data.
x<- seq(0,4,0.1)
y1<- (x*2 / (0.2 + x))
y1<- y1+rnorm(length(y1),0,0.2)
y2<- (x*3 / (0.2 + x))
y2<- y2+rnorm(length(y2),0,0.4)
d<-data.frame(x,y1,y2)
m.y1<-nls(y1~v*x/(k+x),start=list(v=1.9,k=0.19),data=d)
m.y2<-nls(y2~v*x/(k+x),start=list(v=2.9,k=0.19),data=d)
I then want to plot the fitted model regression line over data, and shade the prediction interval. I can do this with the package investr and get nice plots for each subset individually:
require(investr)
plotFit(m.y1,interval="prediction",ylim=c(0,3.5),pch=19,col.pred='light blue',shade=T)
plotFit(m.y2,interval="prediction",ylim=c(0,3.5),pch=19,col.pred='pink',shade=T)
However, if I plot them together I have a problem. The shading of the second plot covers the points and shading of the first plot:
1: How can I make sure the points on the first plot end up on top of the shading of the second plot?
2: How can I make the region where the shaded prediction intervals overlap a new color (like purple, or any fusion of the two colors that are overlapping)?
Use adjustcolor to add transparency like this:
plotFit(m.y1, interval = "prediction", ylim = c(0,3.5), pch = 19,
col.pred = adjustcolor("lightblue", 0.5), shade = TRUE)
par(new = TRUE)
plotFit(m.y2, interval = "prediction", ylim = c(0,3.5), pch = 19,
col.pred = adjustcolor("light pink", 0.5), shade = TRUE)
Depending on what you want you can play around with the two transparency values (here both set to 0.5) and possibly make only one of them transparent.
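To see how the overlap colour and the drawing order work without investr, here is a minimal base R sketch. The bands are purely illustrative (each curve plus or minus a fixed half-width that I chose), not the prediction intervals plotFit computes, and the helper name band is hypothetical. The idea it shows is that translucent fills blend where they overlap, and that whatever is drawn last ends up on top.

```r
# adjustcolor() returns the colour with an alpha channel appended
blue <- adjustcolor("lightblue", alpha.f = 0.5)
pink <- adjustcolor("lightpink", alpha.f = 0.5)

x  <- seq(0, 4, 0.1)
f1 <- x * 2 / (0.2 + x)   # curves from the question
f2 <- x * 3 / (0.2 + x)
set.seed(1)
y1 <- f1 + rnorm(length(x), 0, 0.2)
y2 <- f2 + rnorm(length(x), 0, 0.4)

# Illustrative band only (assumption): curve +/- a fixed half-width
band <- function(mid, half, col) {
  polygon(c(x, rev(x)), c(mid - half, rev(mid + half)), col = col, border = NA)
}

plot(x, f1, type = "n", ylim = c(-1, 4.5), xlab = "x", ylab = "y")
band(f1, 0.4, blue)       # both shaded regions first
band(f2, 0.8, pink)
lines(x, f1)
lines(x, f2)
points(x, y1, pch = 19)   # points last, so they sit on top of both bands
points(x, y2, pch = 19, col = "grey40")
```

Drawing all shading before any points is what keeps the first data set visible; with par(new = TRUE) the second call paints over everything from the first.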
I wish to add regression lines to a plot that has multiple data series that are colour coded by a factor. Using a brewer.pal palette, I created a plot with the data points coloured by factor (plant$ID). Below is an example of the code:
palette(brewer.pal(12,"Paired"))
plot(x=plant$TL, y=plant$d15N, xlab="Total length (mm)", ylab="d15N", col=plant$ID, pch=16)
legend(locator(1), legend=levels(factor(plant$ID)), text.col="black", pch=16, col=c(brewer.pal(12,"Paired")), cex=0.6)
Is there an easy way to add linear regression lines to the graph for each of the different data series (factors)? I also wish to colour the lines according to the factor plant$ID?
I can achieve this by adding each of the data series to the plot separately and then using the abline function (as below), but in cases with multiple data series it can be very time consuming matching up colours.
plot(y=plant$d15N[plant$ID=="Sm"], x=plant$TL[plant$ID=="Sm"], xlab="Total length (mm)", ylab="d15N", col="green", pch=16, xlim=c(50,300), ylim=c(8,15))
points(y=plant$d15N[plant$ID=="Md"], x=plant$TL[plant$ID=="Md"], type="p", pch=16, col="blue")
points(y=plant$d15N[plant$ID=="Lg"], x=plant$TL[plant$ID=="Lg"], type="p", pch=16, col="orange")
abline(lm(plant$d15N[plant$ID=="Sm"]~plant$TL[plant$ID=="Sm"]), col="green")
abline(lm(plant$d15N[plant$ID=="Md"]~plant$TL[plant$ID=="Md"]), col="blue")
abline(lm(plant$d15N[plant$ID=="Lg"]~plant$TL[plant$ID=="Lg"]), col="orange")
legend.text<-c("Sm","Md","Lg")
legend(locator(1), legend=legend.text, col=c("green", "blue", "orange"), pch=16, bty="n", cex=0.7)
There must be a quicker way! Any help would be greatly appreciated.
Or you can use ggplot2 and let it do all the hard work. Unfortunately, your example is not reproducible, so I have to create some data myself:
plant = data.frame(d15N = runif(1000),
TL = runif(1000),
ID = sample(c("Sm","Md","Lg"), size = 1000, replace = TRUE))
plant = within(plant, {
d15N[ID == "Sm"] = d15N[ID == "Sm"] + 0.5
d15N[ID == "Lg"] = d15N[ID == "Lg"] - 0.5
})
> head(plant)
        d15N         TL ID
1  0.6445164 0.14393597 Sm
2  0.2098778 0.62502205 Lg
3 -0.1599300 0.85331376 Lg
4 -0.3173119 0.60537491 Lg
5  0.8197111 0.01176013 Sm
6  1.0374742 0.68668317 Sm
The trick is to use the geom_smooth geometry, which fits the lm and draws it. Because we map color = ID, ggplot2 groups the data by ID and fits and draws a separate line for each unique value of ID.
library(ggplot2)
ggplot(plant, aes(x = TL, y = d15N, color = ID)) +
geom_point() + geom_smooth(method = "lm")
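If you would rather keep the base graphics from the question, here is a minimal sketch of one way to avoid matching colours by hand: keep a named colour vector, fit one lm per level with split() and lapply(), and loop over the fits with abline(). The simulated plant data and the names cols and fits are my own stand-ins.

```r
# Simulated stand-in for the plant data (assumption)
set.seed(42)
plant <- data.frame(d15N = runif(300), TL = runif(300),
                    ID = factor(sample(c("Sm", "Md", "Lg"), 300, replace = TRUE)))

# One colour per factor level, looked up by name everywhere below
cols <- c("green", "blue", "orange")
names(cols) <- levels(plant$ID)

plot(plant$TL, plant$d15N, col = cols[as.character(plant$ID)], pch = 16,
     xlab = "Total length (mm)", ylab = "d15N")

# Fit a separate regression for each level of ID
fits <- lapply(split(plant, plant$ID), function(d) lm(d15N ~ TL, data = d))

# Draw each line in the colour of its level
for (id in names(fits)) {
  abline(fits[[id]], col = cols[id], lwd = 2)
}

legend("topright", legend = names(cols), col = cols, pch = 16, lty = 1,
       bty = "n", cex = 0.7)
```

Because the points, lines and legend all index the same named vector, adding another level only means adding one more colour.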