I have a data frame with two groups, each with four points, and I want to plot them with a smooth line in R.
The dataframe is:
df <- data.frame(x=c(12,25,50,85,12,25,50,85), y=c(1.02, 1.05, 0.99, 1.07, 1.03, 1.06, 1.09, 1.10), Type=c("AD","AD","AD","AD","WT","WT","WT","WT"))
I used the code:
ggplot(df) +
geom_point(aes(x=x, y=y, color=Type, group=Type), size = 3) +
geom_line(aes(y=y, x=x, group = Type, color=Type)) +
stat_smooth(aes(y=y, x=x), method = "loose", formula = y~ poly(x, 21), se = FALSE)
However, the plot I got is not as smooth as I expected.
How should I change the code?
Is it because of the small number of points?
Thanks a lot in advance!
There are a couple of problems with your code. Firstly, there is no method called "loose" for regression (did you mis-spell "loess"?). Secondly, if you want a polynomial regression you probably want method = lm. Thirdly, if you have four points in each series, you can have at most a degree-3 polynomial.
Using lm with y ~ poly(x, 3) works quite well here:
ggplot(df, aes(x, y, color = Type)) +
geom_point(size = 3) +
stat_smooth(method = lm, formula = y ~ poly(x, 3), se = FALSE)
Or even just a loess with y ~ x:
ggplot(df, aes(x=x, y=y, color=Type)) +
geom_point(size = 3) +
stat_smooth(method = loess, formula = y ~ x, se = FALSE)
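The degree limit mentioned above can be checked directly in base R. The sketch below reuses the question's data, fits only the AD group, and relies on nothing beyond lm and poly from the stats package. The names ad, fit, max_abs_resid and too_high are made up for illustration.

```r
# Data from the question
df <- data.frame(
  x = c(12, 25, 50, 85, 12, 25, 50, 85),
  y = c(1.02, 1.05, 0.99, 1.07, 1.03, 1.06, 1.09, 1.10),
  Type = rep(c("AD", "WT"), each = 4)
)

ad <- df[df$Type == "AD", ]

# A cubic has four coefficients, so it passes through all four points
fit <- lm(y ~ poly(x, 3), data = ad)
max_abs_resid <- max(abs(resid(fit)))

# poly() rejects any higher degree: it must be less than the number of unique x values
too_high <- tryCatch(poly(ad$x, 4), error = function(e) conditionMessage(e))
```

Because the cubic interpolates the four points exactly, it leaves no residual degrees of freedom, so a confidence band cannot be estimated for it.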
Here is an alternative way of doing it.
ggplot(df, aes(x = x, y = y, color = Type)) +
geom_point(size = 3) +
geom_smooth(method = "loess", span = 0.75, se = FALSE)
You can adjust the span parameter to control the amount of smoothing. See some examples below:
[Figure: fit with span = 1]
[Figure: fit with span = 0.75]
Related
I'm trying to fit a curve to my data points in R, but geom_smooth is just drawing an ugly line through all the points. I'm looking for a way to make a smooth curve that doesn't necessarily go through all the points.
Here is the code I used to make the plot:
data <- data.frame(thickness = c(0.25, 0.50, 0.75, 1.00),
capacitance = c(1.844, 0.892, 0.586, 0.422))
ggplot(data, aes(x = thickness, y = capacitance)) +
geom_point() +
geom_smooth(method = "loess", se = F, formula = (y ~ (1/x)))
When I say fitted curve, I mean something like
The "loess" method of smoothing a line in geom_smooth has a "span" argument which you can use for this purpose, e.g.
library(tidyverse)
data <- data.frame(thickness = c(0.25, 0.50, 0.75, 1.00),
capacitance = c(1.844, 0.892, 0.586, 0.422))
ggplot(data, aes(x = thickness, y = capacitance)) +
geom_point() +
geom_smooth(method = "loess", se = F,
formula = (y ~ (1/x)), span = 2)
Created on 2021-07-21 by the reprex package (v2.0.0)
For more details, see "What does the span argument control in geom_smooth?"
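Separately from the span setting, the question's data look roughly inversely proportional to thickness. As a sketch under that assumption (it is not what the answer above does), a reciprocal term can be fitted with lm. Within an R formula, / is a formula operator rather than division, so the term is wrapped in I(). The object names fit, grid and r2 are illustrative.

```r
data <- data.frame(thickness = c(0.25, 0.50, 0.75, 1.00),
                   capacitance = c(1.844, 0.892, 0.586, 0.422))

# I() makes 1 / thickness ordinary arithmetic inside the formula
fit <- lm(capacitance ~ I(1 / thickness), data = data)

# Evaluate the fitted curve on a fine grid so it can be drawn smoothly
grid <- data.frame(thickness = seq(0.25, 1, length.out = 100))
grid$capacitance <- predict(fit, newdata = grid)

# Share of variance explained by the reciprocal model
r2 <- summary(fit)$r.squared
```

In ggplot2 the same model could be requested with geom_smooth(method = "lm", formula = y ~ I(1/x), se = FALSE), where x and y refer to the plot's aesthetics.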
I have a question about ggplot2.
I want to connect each data point to the OLS fit with a vertical line, as in the code below.
Can I pass ..y.., the value calculated by stat_smooth, to geom_linerange directly?
I tried stat_smooth(..., geom = "linerange", mapping(aes(ymin=pmin(myy, ..y..), ymax=pmax(myy,..y..)) but it did not give the result I want.
library(ggplot2)
df <- data.frame(myx = 1:10,
myy = c(1:10) * 5 + 2 * rnorm(10, 0, 1))
lm.fit <- lm("myy~myx", data = df)
pred <- predict(lm.fit)
ggplot(df, aes(myx, myy)) +
geom_point() +
geom_smooth(method = "lm", se = FALSE) +
geom_linerange(mapping = aes(ymin = pmin(myy, pred),
ymax = pmax(myy, pred)))
stat_smooth evaluates the values at n evenly spaced points, with n = 80 by default. These points may not coincide with the original x values in your data frame.
Since you are calculating predicted values anyway, it would probably be more straightforward to add that back to your data frame and plot all geom layers based on that as your data source, for example:
df$pred <- pred
ggplot(df, aes(myx, myy)) +
geom_point() +
geom_smooth(method = "lm", se = FALSE) +
geom_linerange(aes(ymin = myy, ymax = pred))
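The idea of storing the predictions in the data frame can be checked without plotting. The following is a minimal sketch in base R, assuming a fixed seed (added here only for reproducibility) and a hypothetical helper name seg_length. It shows that each vertical segment is as long as the absolute residual at that point.

```r
set.seed(42)  # fixed seed, an assumption made so the sketch is reproducible
df <- data.frame(myx = 1:10)
df$myy <- df$myx * 5 + 2 * rnorm(10)

lm.fit <- lm(myy ~ myx, data = df)
df$pred <- predict(lm.fit)

# Each vertical segment runs from the observed value to the fitted value,
# so its length is the absolute residual at that x
seg_length <- abs(df$myy - df$pred)
```

Since pred is now an ordinary column, any layer in the plot can map it in aes() alongside myx and myy.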
I am trying to fit a quadratic curve over my spaghetti plot. In the beginning I did it only with ggplot like this:
library(ggplot2)
library(reshape2)
GCIP <- data_head$GCIP
Patient.ID <- data_head$Patient.ID
Eye <-data_head$Eye
Visit <-data_head$Visit
Patient<-data_head$Patient
data_head$time_since_on <- as.numeric(as.character(data_head$time_since_on))
ggplot(data = data_head, aes(x= time_since_on, y=GCIP)) +
geom_point(alpha=1, size=2) +
aes(colour=Patient.ID) +
geom_path(aes(group='Patient.ID'))
ggplot(data= data_head, aes(x = time_since_on, y = GCIP)) +
geom_point(size = 2, alpha= 1, aes(color = Patient.ID)) + #colour points by group
geom_path(aes(group = Patient.ID)) + #spaghetti plot
stat_smooth(method = "lm", formula = y ~ poly(x,2)) + #line of best fit by group
ylab("GCIP (volume)") + xlab("time_since_on (months)") +
theme_bw()
The problem is that I am not sure this code takes into account that each line contains the different time points of one patient; the fitted line should account for that as well.
Could you please tell me if this is correct?
Here you can see the graph I get
I am not sure, and maybe it is better to fit an lme model (but in that case I don't know how to introduce the quadratic term in the model).
I also did this:
data_head <- read.csv("/Users/adrianaroca-fernandez/Desktop/Analysis/Long_100418_2/N=lit.csv", sep=";", dec=",")
library(ggplot2)
library(reshape2)
library(lme4)
library(lsmeans)
GCIP <- data_head$GCIP
Patient.ID <- data_head$Patient.ID
Eye <-data_head$Eye
Visit <-data_head$Visit
Patient<-data_head$Patient
data_head$time_since_on <- as.numeric(as.character(data_head$time_since_on))
time_since_on <-data_head$time_since_on
time_since_on2 <- time_since_on^2
quadratic.model <-lm(GCIP ~ time_since_on + time_since_on2)
summary(quadratic.model)
time_since_onvalues <- seq(0, 250, 0.1)
predictedGCIP <- predict(quadratic.model,list(time_since_on=time_since_onvalues, time_since_on2=time_since_onvalues^2))
plot(time_since_on, GCIP, pch=16, xlab = "time_since_on (months)", ylab = "GCIP", cex.lab = 1.3, col = "blue")
lines(time_since_onvalues, predictedGCIP, col = "darkgreen", lwd = 3)
The problem is that I am still unable to introduce (1|Patient.ID) as a mixed effect. I also lose my spaghetti plot in this case and get just the dots. Here is the result:
Which approach do you think is better, and how should I code it?
Thanks.
lili
I am trying to reproduce the base R code below using ggplot2, but it yields an incorrect result.
Base R code:
model1 <- lm(wgt ~ 1, data = bdims)
model1_null <- augment(model1)
plot(bdims$hgt, bdims$wgt)
abline(model1, lwd = 2, col = "blue")
pre_null <- predict(model1)
segments(bdims$hgt, bdims$wgt, bdims$hgt, pre_null, col = "red")
ggplot2 code:
bdims %>%
ggplot(aes(hgt, wgt)) +
geom_point() +
geom_smooth(method = "lm", formula = bdims$hgt ~ 1) +
segments(bdims$hgt, bdims$wgt, bdims$hgt, pre_null, col = "red")
Here's an example using the built-in mtcars data:
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ 1) +
geom_segment(aes(xend = wt, yend = mean(mpg)), col = "firebrick2")
The formula references the aesthetic dimensions (x and y), not the variable names. You also need to use geom_segment, not the base graphics segments function. In a more complicated case you would pre-compute the model's predicted values for the segments, but for a null model it's easy enough to just use mean inline.
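The reason mean can stand in for the model's predictions is that an intercept-only model fits the sample mean. A short check in base R on the same built-in mtcars data (the names null_fit, intercept and fitted_vals are illustrative):

```r
# Intercept-only ("null") model on the built-in mtcars data
null_fit <- lm(mpg ~ 1, data = mtcars)

# Its single coefficient is the sample mean of the response
intercept <- unname(coef(null_fit))

# Every fitted value is that same mean, which is why mean(mpg) can be
# used directly as yend in geom_segment
fitted_vals <- unname(fitted(null_fit))
```

For a model with predictors, the fitted values differ from row to row, so they would need to be computed first and stored as a column before being mapped to yend.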
ggplot(data = wheatX,
aes(x = No.of.species,
y = Weight.of.weed,
color = Treatment)) +
geom_point(shape = 1) +
scale_colour_hue(l = 50) +
geom_smooth(method = glm,
se = FALSE)
This draws a straight line.
But the number of species will decrease at some point, so I want the line to curve. How can I do that? Thanks.
This is going to depend on what you mean by "smooth".
One thing you can do is apply a loess curve (the formula is written in terms of the x and y aesthetics, not the column names):
ggplot() + ... + stat_smooth(method = "loess", formula = y ~ x, size = 1)
Or you can manually build a polynomial model using the regular lm method:
ggplot() + ... + stat_smooth(method = "lm", formula = y ~ x + I(x^2), size = 1)
For the second case you'll need to work out the exact model you want to use, which is what I meant by asking how you define "smooth".
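To illustrate the difference between a straight-line and a quadratic fit, here is a small base R sketch. The data are hypothetical (a made-up hump-shaped series, not the asker's wheatX data), and the names linear_fit, quad_fit, quad_coef and rss are illustrative.

```r
# Hypothetical hump-shaped data (not the asker's wheatX data)
x <- 1:10
y <- 20 + 6 * x - 0.5 * x^2 +
  c(0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.3, -0.3, 0.1)

linear_fit <- lm(y ~ x)
quad_fit   <- lm(y ~ x + I(x^2))

# A negative coefficient on I(x^2) means the curve rises and then falls
quad_coef <- unname(coef(quad_fit)["I(x^2)"])

# Residual sum of squares for each model
rss <- c(linear = deviance(linear_fit), quadratic = deviance(quad_fit))
```

Comparing the two entries of rss gives a rough sense of how much the squared term improves the fit on data of this shape; whether a quadratic is appropriate for the real data is a modelling decision.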