slope, intercept, ggplot2, R - r

I plotted time series data with ggplot2, with Year on the x axis and rain on the y axis.
I would like to overlay a trend line on this plot (my equation for this trend line is rain = 2.6*Year + 23). The slope was computed using the Theil-Sen method.
How can I overlay this line on my plot?
My code thus far is:
ggplot(data = Datarain, aes(x = year, y = rain)) +
geom_smooth(color="red", formula = y ~ x) +
geom_smooth(method = "lm", se = FALSE, color = "blue", formula = y ~ x) +
geom_line() + scale_x_continuous("Year")
I am not sure how to add my own equation to my plot, or how to add a Theil-Sen line in ggplot.
Any ideas would be appreciated.

You can use geom_abline() to draw a line from your own intercept and slope:
ggplot(data = Datarain, aes(x = year, y = rain)) +
geom_smooth(color="red", formula = y ~ x) +
geom_smooth(method = "lm", se = FALSE, color = "blue", formula = y ~ x) +
geom_line() + scale_x_continuous("Year") +
geom_abline(intercept = 23, slope = 2.6)
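On the second part of the question (drawing a Theil-Sen line when you do not already have the coefficients), here is a minimal sketch. It assumes the usual definition of the estimator: the slope is the median of the slopes over all pairs of points, and the intercept is the median of y - slope * x. The helper theil_sen() and the made-up Datarain values are hypothetical and not from the question. Keep in mind that geom_abline() treats the intercept as the value of y at x = 0, so an equation fitted on calendar years needs its intercept expressed on that scale.

```r
library(ggplot2)

# Made-up data standing in for Datarain (hypothetical values)
set.seed(42)
Datarain <- data.frame(year = 1990:2019)
Datarain$rain <- 23 + 2.6 * (Datarain$year - 1990) + rnorm(30, sd = 8)

# Hypothetical helper: Theil-Sen estimate of intercept and slope
theil_sen <- function(x, y) {
  idx <- combn(length(x), 2)            # all pairs of observations
  dx <- x[idx[2, ]] - x[idx[1, ]]
  dy <- y[idx[2, ]] - y[idx[1, ]]
  slopes <- (dy / dx)[dx != 0]          # skip pairs with identical x
  slope <- median(slopes)
  intercept <- median(y - slope * x)    # intercept is y at x = 0
  c(intercept = intercept, slope = slope)
}

ts_fit <- theil_sen(Datarain$year, Datarain$rain)

p <- ggplot(Datarain, aes(x = year, y = rain)) +
  geom_line() +
  geom_abline(intercept = ts_fit[["intercept"]],
              slope = ts_fit[["slope"]],
              colour = "darkgreen") +
  scale_x_continuous("Year")
p
```

Because the intercept is computed on the same x scale that is plotted, the line lands where the data are.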

Related

R - Adding legend to ggplot graph for regression lines

I am running a multiple linear regression in R and want to add a simple legend to a graph (ggplot). The legend should show the points and fitted lines with their corresponding colors. So far it works fine (without a legend):
ggplot() +
geom_point(aes(x = training_set$R.D.Spend, y = training_set$Profit),
col = 'red') +
geom_line(aes(x = training_set$R.D.Spend, y = predict(regressor, newdata = training_set)),
col = 'blue') +
geom_line(aes(x = training_set$R.D.Spend, y = predict(regressor_sig, newdata = training_set)),
col = 'green') +
ggtitle('Multiple Linear Regression (Training set)') +
xlab('R.D.Spend [k$]') +
ylab('Profit of Venture [k$]')
How can I add a legend here most easily?
I tried the solutions from similar questions, but did not succeed (add legend to ggplot2 | Add legend for multiple regression lines from different datasets to ggplot).
So I modified my original plot like this:
ggplot() +
geom_point(aes(x = training_set$R.D.Spend, y = training_set$Profit),
col = 'p1') +
geom_line(aes(x = training_set$R.D.Spend, y = predict(regressor, newdata = training_set)),
col = 'p2') +
geom_line(aes(x = training_set$R.D.Spend, y = predict(regressor_sig, newdata = training_set)),
col = 'p3') +
scale_color_manual(
name='My lines',
values=c('blue', 'orangered', 'green')) +
ggtitle('Multiple Linear Regression (Training set)') +
xlab('R.D.Spend [k$]') +
ylab('Profit of Venture [k$]')
But here I get the error "Unknown colour name: p1", which makes some sense, as I do not define p1 anywhere above. How can I make ggplot recognise my intended legend?
Move col into the aes and then you can set the color using scale_color_manual:
library(ggplot2)
set.seed(1)
x <- 1:30
y <- rnorm(30) + x
fit <- lm(y ~ x)
ggplot2::ggplot(data.frame(x, y)) +
geom_point(aes(x = x, y = y)) +
geom_line(aes(x = x, y = predict(fit), col = "Regression")) +
scale_color_manual(name = "My Lines",
values = c("blue"))
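To cover the case in the question (points plus two fitted lines in one legend), the same idea can be extended as in the sketch below. It uses the built-in mtcars data as a stand-in for training_set, and the two models and the legend labels are made up for illustration. Naming the entries of values ties each colour to its label explicitly rather than relying on alphabetical order.

```r
library(ggplot2)

# Two hypothetical models standing in for regressor and regressor_sig
fit_full  <- lm(mpg ~ wt + hp, data = mtcars)
fit_small <- lm(mpg ~ wt, data = mtcars)

plot_df <- data.frame(
  wt         = mtcars$wt,
  mpg        = mtcars$mpg,
  pred_full  = predict(fit_full),
  pred_small = predict(fit_small)
)

# Each layer maps colour to a label inside aes(); the scale turns labels into colours
p <- ggplot(plot_df, aes(x = wt)) +
  geom_point(aes(y = mpg, colour = "Observed")) +
  geom_line(aes(y = pred_full, colour = "Full model")) +
  geom_line(aes(y = pred_small, colour = "Reduced model")) +
  scale_colour_manual(
    name = "My lines",
    values = c("Observed" = "red", "Full model" = "blue", "Reduced model" = "green")
  )
p
```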

How to plot raw data but use predicted values for line fit in ggplot2 R?

I have a data set (dat) with raw data (raw_x and raw_y). I have fitted a model, and the predictions from the model are stored in dat$predict.
I wish to plot the raw data but overlay the data with a geom_smooth (here a quadratic function) but using the predicted data. This is my attempt at the basic code. I am not sure how to use predicted values in the geom_smooth yet.
ggplot(dat, aes(x = raw_x, y = raw_y, colours = "red")) +
geom_point() +
theme_bw() +
geom_smooth(method = "lm", formula = y ~ x + I(x^2))
The following plots the original points, the fitted quadratic curve and the fitted points. I use made-up data since you have posted none.
set.seed(1234)
x <- cumsum(rnorm(100))
y <- x + x^2 + rnorm(100, sd = 50)
dat <- data.frame(raw_x = x, raw_y = y)
fit <- lm(raw_y ~ raw_x + I(raw_x^2), data = dat)  # use the columns of dat, not the global x and y
dat$predict <- predict(fit)
ggplot(dat, aes(x = raw_x, y = raw_y)) +
geom_point(colour = "blue") +
theme_bw() +
geom_smooth(method = "lm", formula = y ~ x + I(x^2), colour = "red") +
geom_point(aes(y = predict), colour = "black")
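If the goal is to draw the line from the stored predictions themselves rather than have geom_smooth() refit the model, one option is geom_line() on the prediction column. This is a sketch on the same kind of made-up data, and it assumes the predictions are stored in dat$predict as described in the question.

```r
library(ggplot2)

# Made-up data, as above
set.seed(1234)
dat <- data.frame(raw_x = cumsum(rnorm(100)))
dat$raw_y <- dat$raw_x + dat$raw_x^2 + rnorm(100, sd = 50)

# Fit the model once and store its predictions
fit <- lm(raw_y ~ raw_x + I(raw_x^2), data = dat)
dat$predict <- predict(fit)

# Raw points, with a line through the stored predictions
p <- ggplot(dat, aes(x = raw_x, y = raw_y)) +
  geom_point(colour = "blue") +
  geom_line(aes(y = predict), colour = "red") +
  theme_bw()
p
```

geom_line() connects observations in order of x, so the rows do not need to be sorted first.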

Scatter plot with horizontal lines representing averages with R and ggplot

The below code produces a scatter plot with regression lines for each group. Instead of the sloped regression lines is it possible to plot horizontal lines that represent the average of each group's y values? I tried modifying the formula parameter to "y ~ 0 *x" but can't think of anything else that's obvious to use.
Thanks
ggplot(data = iris, aes(y = Sepal.Length, x = Sepal.Width, colour = Species)) + geom_point() +
geom_smooth(method = 'lm', formula = y ~ x , se = F)
We can specify the formula as y ~ 1.
library(ggplot2)
ggplot(data = iris, aes(y = Sepal.Length, x = Sepal.Width, colour = Species)) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ 1)
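An alternative sketch, in case you would rather compute the averages yourself: aggregate the group means first and draw them with geom_hline(). The name group_means is made up for this example. Unlike the geom_smooth() version, these lines span the full width of the panel and have no confidence band.

```r
library(ggplot2)

# Mean Sepal.Length for each species
group_means <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)

p <- ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length, colour = Species)) +
  geom_point() +
  # One horizontal line per species, coloured to match the points
  geom_hline(data = group_means,
             aes(yintercept = Sepal.Length, colour = Species))
p
```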

Fitting a quadratic curve in ggplot

This is my sample data. I want to plot both y1 and y2 against x1 in a single plot. This is what I did:
library(ISLR)
library(ggplot2)
y1<-scale(Auto$horsepower,scale = T,center=T)
y2<-scale(Auto$weight,scale = T,center=T)
x1<-Auto$mpg
df<-data.frame(y1,y2,x1)
p<-ggplot(df,aes(x=x1)) +
geom_point(aes(y = y1), shape = 16) +
geom_point(aes(y = y2), shape = 2)
I want to insert a quadratic line for both y1 and y2 against x. I did this:
p + stat_smooth(method = "lm", formula = y ~ x + I(x^2), size = 1)
It throws up an error:
Warning message:
Computation failed in `stat_smooth()`:
variable lengths differ (found for 'x')
Apart from this, the stat_smooth command would only draw one quadratic line, while I need two quadratic lines, one each for y1 and y2.
How can I achieve this in R?
Thanks
You should add two stat_smooth() calls and add aes() to show which y to use.
ggplot(df,aes(x=x1)) +
geom_point(aes(y = y1), shape = 16) +
geom_point(aes(y = y2), shape = 2) +
stat_smooth(aes(y = y1),method = "lm", formula = y ~ x + I(x^2), size = 1) +
stat_smooth(aes(y = y2),method = "lm", formula = y ~ x + I(x^2), size = 1, color = "red")
Or reshape the data to long format; then you need just one call each of stat_smooth() and geom_point().
library(tidyr)
df_long <- df %>% gather(variable, value, y1:y2)
ggplot(df_long, aes(x1, value, color = variable)) +
geom_point() +
stat_smooth(method = "lm", formula = y ~ x + I(x^2), size = 1)
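For what it is worth, the tidyr documentation now describes gather() as superseded by pivot_longer(). A sketch of the same long-format approach with pivot_longer() follows. It uses the built-in mtcars data as a stand-in for ISLR's Auto, which is an assumption made only so the example runs without extra packages.

```r
library(ggplot2)
library(tidyr)

# Stand-in data: scaled horsepower and weight against mpg
df <- data.frame(
  y1 = as.numeric(scale(mtcars$hp)),   # as.numeric() drops the matrix attributes from scale()
  y2 = as.numeric(scale(mtcars$wt)),
  x1 = mtcars$mpg
)

# One row per (x1, variable) pair
df_long <- pivot_longer(df, cols = c(y1, y2),
                        names_to = "variable", values_to = "value")

# One quadratic fit per variable, because colour defines the groups
p <- ggplot(df_long, aes(x = x1, y = value, colour = variable)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x + I(x^2), se = FALSE)
p
```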

ggplot2: How to add linebreak to horizontal legend

Please consider the following R script (taken and slightly modified from here):
require(ggplot2)
x <- 1:10
y <- jitter(x^2)
DF <- data.frame(x, y)
p <- ggplot(DF, aes(x = x, y = y)) + geom_point() +
stat_smooth(method = 'lm', aes(colour = 'linear')) +
stat_smooth(method = 'lm', formula = y ~ poly(x,2),
aes(colour = 'polynomial')) +
stat_smooth(method = 'nls', formula = y ~ a * log(x) +b,
aes(colour = 'logarithmic')) +
stat_smooth(method = 'nls', formula = y ~ a*exp(b *x),
aes(colour = 'Exponential')) +
theme(legend.position = "top")
p <- p + guides(guide_legend(ncol=2,nrow=2,byrow=TRUE))
p
The legend is displayed at the top of the plot. I want to break this legend into two lines, with two keys in each line. Is this possible?
Please note that, as you may see, I already tried
p+guides(guide_legend(ncol=2,nrow=2,byrow=TRUE))
as suggested here and here, but it did not work for me. This suggestion basically displays the data and the legends of the linear and polynomial models and completely hides the logarithmic and exponential models.
As explained by eipi10,
You need to specify which legend to change, in this case the colour legend: guides(colour=guide_legend(ncol=2,nrow=2,byrow=TRUE)).
To clarify, the colour aesthetic defines the colour of each line, so the legend belongs to colour. If fill were used instead, the call would be guides(fill=guide_legend(ncol=2,nrow=2,byrow=TRUE)).
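A small runnable sketch of the fix is below. To keep it self-contained, all four curves are fitted with lm() rather than nls(), and the cubic fit is made up for illustration, so the models differ from those in the question. The legend handling is the part that matters.

```r
library(ggplot2)

set.seed(1)
DF <- data.frame(x = 1:10)
DF$y <- jitter(DF$x^2)

p <- ggplot(DF, aes(x = x, y = y)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE,
              aes(colour = "linear")) +
  stat_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE,
              aes(colour = "polynomial")) +
  stat_smooth(method = "lm", formula = y ~ log(x), se = FALSE,
              aes(colour = "logarithmic")) +
  stat_smooth(method = "lm", formula = y ~ poly(x, 3), se = FALSE,
              aes(colour = "cubic")) +
  theme(legend.position = "top") +
  # Name the aesthetic whose legend should be laid out in two rows
  guides(colour = guide_legend(nrow = 2, byrow = TRUE))
p
```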
