ggplot(data = wheatX,
aes(x = No.of.species,
y = Weight.of.weed,
color = Treatment)) +
geom_point(shape = 1) +
scale_colour_hue(l = 50) +
geom_smooth(method = glm,
se = FALSE)
This draws a straight line.
But the species number will decrease at some point, so I want the line to curve. How can I do that? Thanks.
This is going to depend on what you mean by "smooth"
One thing you can do is apply a loess curve:
ggplot() + ... + stat_smooth(method = "loess", formula = y ~ x, size = 1)
Or you can manually build a polynomial model using the regular lm method:
ggplot() + ... + stat_smooth(method = "lm", formula = y ~ x + I(x^2), size = 1)
Note that the formula refers to the x and y aesthetics rather than the column names. For the second case you'll need to decide on the exact model you want, which is what I meant above by the definition of the term "smooth".
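One way to choose between a straight line and a curve is to fit both models outside ggplot and compare them. The sketch below is only an illustration: it uses the built-in mtcars data as a stand-in for the wheat data (which is not available here), and the object names are made up for the example.

```r
# Stand-in data: mtcars replaces the question's wheatX, which is not available.
fit_linear    <- lm(mpg ~ wt, data = mtcars)
fit_quadratic <- lm(mpg ~ wt + I(wt^2), data = mtcars)

# A lower AIC suggests a better trade-off between fit and complexity.
aic_table <- AIC(fit_linear, fit_quadratic)
print(aic_table)

# F-test for whether the squared term improves on the straight line.
print(anova(fit_linear, fit_quadratic))
```

Whichever model looks more reasonable can then be passed to stat_smooth through its formula argument, written in terms of y and x.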
I have a data frame that includes two groups, each with four points, and I want to plot them with smooth lines in R.
The dataframe is:
df <- data.frame(x=c(12,25,50,85,12,25,50,85), y=c(1.02, 1.05, 0.99, 1.07, 1.03, 1.06, 1.09, 1.10), Type=c("AD","AD","AD","AD","WT","WT","WT","WT"))
I used the code:
ggplot(df) +
geom_point(aes(x=x, y=y, color=Type, group=Type), size = 3) +
geom_line(aes(y=y, x=x, group = Type, color=Type)) +
stat_smooth(aes(y=y, x=x), method = "loose", formula = y~ poly(x, 21), se = FALSE)
However, the plot I got is not as smooth as I expected.
How should I change the code?
Is it because of the limited number of points?
Thanks a lot in advance!
There are a couple of problems with your code. Firstly, there is no method called "loose" for regression (did you mis-spell "loess"?). Secondly, if you want a polynomial regression you probably want method = lm. Thirdly, if you have four points in each series, you can have at most a degree-3 polynomial.
Using lm with y ~ poly(x, 3) works quite well here:
ggplot(df, aes(x, y, color = Type)) +
geom_point(size = 3) +
stat_smooth(method = lm, formula = y ~ poly(x, 3), se = FALSE)
Or even just a loess with y ~ x:
ggplot(df, aes(x=x, y=y, color=Type)) +
geom_point(size = 3) +
stat_smooth(method = loess, formula = y ~ x, se = FALSE)
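The degree limit mentioned above can be checked directly with poly(), which refuses a degree that is not smaller than the number of distinct x values. This is a small sketch using the four x values from the question; the object names are illustrative.

```r
x <- c(12, 25, 50, 85)   # the four distinct x values in each group

# Degree 3 is the highest polynomial that four distinct points can support.
basis3 <- poly(x, 3)
dim(basis3)              # 4 rows, 3 columns

# Asking for more fails; tryCatch captures the error message instead of stopping.
msg <- tryCatch(
  {
    poly(x, 4)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

The same check explains why poly(x, 21) in the original code cannot work with this data.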
Posting an alternative way of doing it:
ggplot(df, aes(x, y, color = Type)) +
geom_point(size = 3) +
geom_smooth(method = "loess", span = 0.75, se = FALSE)
You can adjust the span parameter. The example plots used span = 1 and span = 0.75 (figures not shown).
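To see the effect of span without the original figures, two loess layers with different spans can be drawn on the same plot. This is a rough sketch on the built-in mtcars data (a stand-in, since four points per group are too few to show much difference); the colours and span values are arbitrary choices.

```r
library(ggplot2)

# Stand-in data (mtcars). A smaller span follows the points more closely,
# a larger span gives a smoother curve.
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x, span = 0.5,
              se = FALSE, colour = "firebrick") +
  geom_smooth(method = "loess", formula = y ~ x, span = 1,
              se = FALSE, colour = "steelblue")
print(p)
```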
I have produced a glm interaction plot using ggplot2. I have attached the code I used and the plot.
I know that the grey shaded areas represent the 95% confidence interval, but I am wondering if there is a way to get the exact values of the grey shaded areas, and therefore of the 95% confidence interval?
#bind data together
Modern_EarlyHolocene<-rbind(FladenF30, FladenB30, Early_Holocene)
#Build modern vs Holocene model
Modern_EarlyHolocene<-glm(Max_Height~Age+Time_period, data=Modern_EarlyHolocene,family = gaussian)
#Produce gg interaction plot
Modern_EarlyHolocene_plot<-ggplot(data=Modern_EarlyHolocene) +
aes(x = Age, y = Max_Height, group = Time_period, color = Time_period) +
geom_point( alpha = .7) +
stat_smooth(method = "glm", level=0.95) +
expand_limits(y=c(0,90), x=c(0,250))
#add axis labels
Modern_EarlyHolocene_plot + labs(x = "Age (years)", y = 'Maximum height (mm)') +
theme(legend.text = element_text(size = 14, colour = "Black"),
legend.title=element_blank()) +
theme(axis.text=element_text(size=14),
axis.title=element_text(size=16,face="bold"))
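You can access the plot data with layer_data(Modern_EarlyHolocene_plot, i), where i is the index of the layer to return, in the order the layers were added to the plot. For the stat_smooth layer, the ymin and ymax columns hold the bounds of the shaded band.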
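As a sketch of the layer_data approach, the example below builds a similar plot on the built-in iris data (a stand-in, since the original data are not available) and pulls out the values behind the shaded band. The object names are illustrative.

```r
library(ggplot2)

# Stand-in plot built on iris, since the original data are not available.
p <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point(alpha = 0.7) +
  stat_smooth(method = "glm", formula = y ~ x, level = 0.95)

# Layer 2 is stat_smooth (layer 1 is geom_point).
smooth_data <- layer_data(p, 2)

# ymin and ymax are the lower and upper edges of the shaded band at each x.
head(smooth_data[, c("group", "x", "y", "ymin", "ymax", "se")])
```

The values are evaluated on the grid of x positions that stat_smooth uses for drawing, not at the original observations.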
You are effectively fitting a different regression line for each Time_period, so your glm has to include an interaction term. It should be:
Modern_EarlyHolocene<-glm(Max_Height~Age*Time_period, data=Modern_EarlyHolocene)
I do not have your data, so see below for an example with iris:
fit = glm(Sepal.Width ~ Sepal.Length * Species,data=iris)
g1 = ggplot(iris,aes(x=Sepal.Length,y=Sepal.Width,color=Species)) +
geom_point( alpha = .7) + stat_smooth(method = "glm", level=0.95)
To get the se of the predictions, you do:
pred = predict(fit,iris,se.fit = TRUE)
df_pred = data.frame(iris,pred=pred$fit,se=pred$se.fit)
We can plot this; the upper and lower bounds of the 95% interval are the prediction plus and minus 1.96 times the standard error:
g2 = ggplot(df_pred,aes(x=Sepal.Length,y=Sepal.Width,color=Species)) +
geom_point( alpha = .7) +
geom_ribbon(aes(ymin=pred-1.96*se,ymax=pred+1.96*se,fill=Species),alpha=0.1)
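If the goal is a table of the interval values rather than a plot, the bounds can be stored as columns. The following is a minimal sketch on the same iris example; the column names lower and upper are my own choice. Because stat_smooth fits each group separately while this model pools the residual variance, the numbers may differ slightly from the band drawn by ggplot.

```r
fit  <- glm(Sepal.Width ~ Sepal.Length * Species, data = iris)
pred <- predict(fit, iris, se.fit = TRUE)

# Normal quantile for a two-sided 95% interval (about 1.96).
z <- qnorm(0.975)

ci <- data.frame(
  iris[, c("Sepal.Length", "Species")],
  fit   = pred$fit,
  lower = pred$fit - z * pred$se.fit,
  upper = pred$fit + z * pred$se.fit
)
head(ci)
```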
I have 50 data points of temperature and humidity that I would like to plot with geom_point, and I want to add a linear model to my ggplot. However, I am unable to do so. I have tried abline, geom_line, geom_smooth and lm.
temp_humidity_data <- dplyr::select(data, temperature:humidity)
lm(formula = humidity ~ temperature, data = temp_humidity_data)
ggplot(temp_humidity_data) +
geom_point(aes (x = temperature , y = humidity))
geom_smooth()
How can I go about adding an lm to my ggplot? Any help is appreciated. Thank you. And how could I differentiate the temperature and humidity points by colour on the plot as well?
As mentioned in the comment section, you missed a + sign after geom_point. Besides that, you are also missing a few arguments in geom_smooth:
library(ggplot2)
ggplot(iris) +
geom_point(aes(x = Petal.Length , y = Petal.Width)) +
geom_smooth(aes(x = Petal.Length, y = Petal.Width),
method = "lm", formula = y ~ x)
You need to supply "aesthetics" for x and y, otherwise you would get the following error:
Error: stat_smooth requires the following missing aesthetics: x, y
method = "lm" tells geom_smooth to fit a linear model, while formula specifies the model formula to plot. If we don't specify the method, geom_smooth chooses one automatically, which is "loess" for fewer than 1,000 observations (as stated by @Lyngbakr), and prints the message:
geom_smooth() using method = 'loess' and formula 'y ~ x'
Since we have to supply the same aesthetics in both geom_point and geom_smooth, a more convenient way would be to write:
ggplot(iris, aes(x = Petal.Length , y = Petal.Width)) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ x)
To answer OP's second question of "how could i differentiate the temperature and humidity points by colour as well on the plot?", we can add the color and size aesthetics to geom_point like the following:
ggplot(iris, aes(x = Petal.Length , y = Petal.Width)) +
geom_point(aes(color = Petal.Length, size = Petal.Width)) +
geom_smooth(method = "lm", formula = y ~ x)
To change the range of sizes and colors, we use scale_fill_continuous (or scale_color_continuous for color) and scale_size_continuous:
ggplot(iris, aes(x = Petal.Length , y = Petal.Width)) +
geom_point(aes(fill = Petal.Length, size = Petal.Width), pch = 21) +
geom_smooth(method = "lm", formula = y ~ x) +
scale_fill_continuous(low = "red", high = "blue") +
scale_size_continuous(range = c(1, 10))
Notice that as you increase the size range, some points start to overlap with each other. To make it less confusing, I've used fill instead of color and added pch = 21 ("plot character" of a circle) to wrap around each point. This gives a nice border that separates each point.
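Since the question also mentions trying abline, here is a hedged sketch of that route: fit the model once with lm and draw the line from its coefficients with geom_abline. It uses iris as stand-in data, as in the answer above, and the object names are illustrative.

```r
library(ggplot2)

# Fit the model once, outside the plot (iris as stand-in data).
fit   <- lm(Petal.Width ~ Petal.Length, data = iris)
coefs <- coef(fit)

p <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width)) +
  geom_point() +
  # Draw the fitted line from the stored intercept and slope.
  geom_abline(intercept = coefs[["(Intercept)"]],
              slope = coefs[["Petal.Length"]],
              colour = "blue")
print(p)
```

Unlike geom_smooth, this draws no confidence band, and the line extends across the whole panel.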
I am trying to reproduce the base R code below with ggplot, but it yields an incorrect result.
Base code:
model1 <- lm(wgt ~ 1, data = bdims)
model1_null <- augment(model1)
plot(bdims$hgt, bdims$wgt)
abline(model1, lwd = 2, col = "blue")
pre_null <- predict(model1)
segments(bdims$hgt, bdims$wgt, bdims$hgt, pre_null, col = "red")
ggplot code:
bdims %>%
ggplot(aes(hgt, wgt)) +
geom_point() +
geom_smooth(method = "lm", formula = bdims$hgt ~ 1) +
segments(bdims$hgt, bdims$wgt, bdims$hgt, pre_null, col = "red")
Here's an example using the built-in mtcars data:
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ 1) +
geom_segment(aes(xend = wt, yend = mean(mpg)), col = "firebrick2")
The formula references the aesthetic dimensions, not the variable names. And you need to use geom_segment not the base graphics segments. In a more complicated case you would pre-compute the model's predicted values for the segments, but for a null model it's easy enough to just use mean inline.
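For the more complicated case, a minimal sketch of pre-computing the predicted values might look like this. It assumes a simple mpg ~ wt model on mtcars, and the column name fitted_mpg is made up for the example.

```r
library(ggplot2)

fit <- lm(mpg ~ wt, data = mtcars)

# Store the fitted values next to the data so geom_segment can map to them.
cars <- transform(mtcars, fitted_mpg = fitted(fit))

p <- ggplot(cars, aes(wt, mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # Each segment runs vertically from the observed point to the fitted line.
  geom_segment(aes(xend = wt, yend = fitted_mpg), colour = "firebrick2")
print(p)
```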
I am pretty new to modelling. I have three groups of data (by period), which I want to display as lines over a scatter plot.
I figured out how to put my method and formula in geom_smooth, and I am able to display a single line.
However, when I try to add one line per group, which should be possible with ggplot(.., aes(.., group = period)), I get a warning:
Warning message:
Computation failed in `stat_smooth()`:
number of iterations exceeded maximum of 50
and the line is not displayed.
My working code:
ggplot(tab, aes(x=distance, y=grad)) + #
geom_point() + theme_bw() +
geom_smooth(method = "nls",
formula = y ~ a*x^(-b),
method.args = list(start=c(a=20, b=0.01)), #
se = F)
Code that produces the warning (with group = period added in aes) and does not display lines per group:
ggplot(tab, aes(x=distance, y=grad, group = period)) + #
geom_point() + theme_bw() +
geom_smooth(method = "nls",
formula = y ~ a*x^(-b),
method.args = list(start=c(a=20, b=0.01)), #
se = F)
Do you have any ideas how I can increase the number of iterations used by geom_smooth in ggplot2?
I found that base R modelling lets you increase the number of iterations with control=nls.control(maxiter=200) (https://stat.ethz.ch/pipermail/r-help/2006-June/107606.html), but I can't find a solution or any pointers for ggplot2.
Based on @Axeman's comment, I added control=nls.control(maxiter=200) to the method.args list:
method.args = list(start=c(a=20, b=0.01),
control=nls.control(maxiter=200))
The whole script is thus:
ggplot(tab, aes(x=distance, y=grad, group = period, col = period)) + #
geom_point(col = "grey") + theme_bw() +
geom_smooth(method = "nls",
formula = y ~ a*x^(-b),
method.args = list(start=c(a=20, b=0.01),
control=nls.control(maxiter=200)), #
se = F)
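When a group still fails to converge, it can help to run the same fit outside ggplot and inspect the convergence information. The sketch below is an assumption-laden illustration: the data are simulated (not the tab data from the question), and the starting values are chosen to suit the simulated curve.

```r
set.seed(1)

# Hypothetical data: a power-law decay with multiplicative noise.
# These values are invented for illustration, not taken from the question.
sim <- data.frame(distance = 1:50)
sim$grad <- 20 * sim$distance^(-0.5) * exp(rnorm(50, sd = 0.05))

fit <- nls(grad ~ a * distance^(-b),
           data = sim,
           start = list(a = 15, b = 0.3),
           control = nls.control(maxiter = 200))

coef(fit)             # estimates of a and b
fit$convInfo$isConv   # TRUE when the fit converged
fit$convInfo$finIter  # number of iterations actually used
```

Running this on each period's subset separately shows which group needs more iterations or better starting values.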