The code below produces a scatter plot with a regression line for each group. Instead of the sloped regression lines, is it possible to plot horizontal lines that represent the average of each group's y values? I tried changing the formula parameter to "y ~ 0 *x" but can't think of anything else obvious to try.
Thanks
ggplot(data = iris, aes(y = Sepal.Length, x = Sepal.Width, colour = Species)) + geom_point() +
geom_smooth(method = 'lm', formula = y ~ x , se = F)
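We can specify the formula as y ~ 1. This is an intercept-only model, so the fitted value for each group is simply the mean of that group's y values.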
library(ggplot2)
ggplot(data = iris, aes(y = Sepal.Length, x = Sepal.Width, colour = Species)) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ 1)
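As a quick check that formula = y ~ 1 really places each line at the group mean, the intercept-only fit can be compared with the plain group averages. This is a minimal base R sketch; the names group_means and intercepts are only illustrative and do not come from the answer above.

```r
# Mean of Sepal.Length within each species
group_means <- tapply(iris$Sepal.Length, iris$Species, mean)

# Intercept-only linear model fitted separately to each species;
# its single coefficient is the constant value of the fitted line
intercepts <- sapply(
  split(iris, iris$Species),
  function(d) unname(coef(lm(Sepal.Length ~ 1, data = d)))
)

# TRUE if the horizontal lines sit exactly at the group means
all.equal(as.numeric(group_means), as.numeric(intercepts))
```

Since geom_smooth fits the model once per colour group, the same reasoning should carry over to the lines drawn in the plot.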
I'm trying to plot individual regression lines for all of my experimental subjects (n=40) on the same plot where I show the overall regression line.
I can do the plots separately with ggplot, but I haven't found a way to superpose them on the same graph.
I can illustrate what I did with the iris data frame:
#first plot
ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length)) +
geom_point() +
stat_smooth(method = lm, se = FALSE) +
theme_classic()
# second plot, grouped by species
ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length, colour =Species)) +
geom_point() +
stat_smooth(method = lm, se = FALSE) +
theme_classic()
# and I've been trying things like this:
ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length)) +
geom_point() +
stat_smooth(method = lm, se = FALSE) +
theme_classic() +
geom_point(aes(x = Sepal.Width, y = Sepal.Length, colour =Species))) +
stat_smooth(method = lm, se = FALSE) +
theme_classic()
which returns the message "Error: Cannot add ggproto objects together. Did you forget to add this object to a ggplot object?", so I get that this is not the right way to combine them, but what is?
How can I combine both graphs in one?
Thanks in advance!
Duplicate the whole data set and, in the copy, set Species to a new value ("Together" in the example below). Append the copy to the original data and then use the code from your second plot.
d1 = iris
d2 = rbind(d1, transform(d1, Species = "Together"))
ggplot(d2, aes(x = Sepal.Width, y = Sepal.Length, colour =Species)) +
stat_smooth(method = lm, se = FALSE) +
geom_point(data = d1) +
theme_classic()
Similar to @d.b's answer, consider expanding the data frame with rbind, assigning an "All" category to Species and adjusting the factor levels (so that "All" appears first in the legend):
new_species_level <- c("All", unique(as.character(iris$Species)))
iris_expanded <- rbind(transform(iris, Species=factor("All", levels=new_species_level)),
transform(iris, Species=factor(Species, levels=new_species_level)))
ggplot(iris_expanded, aes(x=Sepal.Width, y=Sepal.Length, colour=Species)) +
geom_point() +
stat_smooth(method = lm, se = FALSE) +
theme_classic()
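To see what the expanded data frame looks like before it is plotted, it can help to inspect its size and factor levels. The following is a small base R sketch that rebuilds the same object and checks it; nothing here goes beyond the answer above except the inspection calls.

```r
# Put "All" first so it is listed first in the legend
new_species_level <- c("All", unique(as.character(iris$Species)))

# Stack a copy labelled "All" on top of the original rows
iris_expanded <- rbind(
  transform(iris, Species = factor("All", levels = new_species_level)),
  transform(iris, Species = factor(Species, levels = new_species_level))
)

nrow(iris_expanded)            # twice the number of rows in iris
levels(iris_expanded$Species)  # "All" comes first
table(iris_expanded$Species)   # "All" holds one copy of every row
```

One consequence worth noting: because every observation now appears twice, geom_point() on iris_expanded draws each point twice (once as "All", once under its own species). If that is not wanted, the points can be drawn from the original data only, as in the geom_point(data = d1) call of the previous answer.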
I have a data set (dat), with raw data (raw_x and raw_y). I have predicted a model and the predictions from the model are stored in dat$predict.
I wish to plot the raw data but overlay the data with a geom_smooth (here a quadratic function) but using the predicted data. This is my attempt at the basic code. I am not sure how to use predicted values in the geom_smooth yet.
ggplot(dat, aes(x = raw_x, y = raw_y, colours = "red")) +
geom_point() +
theme_bw() +
geom_smooth(method = "lm", formula = y ~ x + I(x^2))
The following plots the original points, the fitted quadratic curve and the fitted points. I use made-up data since you have posted none.
set.seed(1234)
x <- cumsum(rnorm(100))
y <- x + x^2 + rnorm(100, sd = 50)
dat <- data.frame(raw_x = x, raw_y = y)
fit <- lm(raw_y ~ raw_x + I(raw_x^2), data = dat)  # fit on the columns of dat, not the global x and y
dat$predict <- predict(fit)
ggplot(dat, aes(x = raw_x, y = raw_y)) +
geom_point(colour = "blue") +
theme_bw() +
geom_smooth(method = "lm", formula = y ~ x + I(x^2), colour = "red") +
geom_point(aes(y = predict), colour = "black")
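If the goal is a smooth curve drawn from the model's own predictions rather than a refit by geom_smooth, one option is to predict on an evenly spaced grid and draw that with geom_line. This is a sketch under assumptions: the data are the same made-up values as above, and grid is a hypothetical helper data frame that is not part of the original answer.

```r
set.seed(1234)
x <- cumsum(rnorm(100))
dat <- data.frame(raw_x = x, raw_y = x + x^2 + rnorm(100, sd = 50))

# Quadratic model fitted on the columns of dat
fit <- lm(raw_y ~ raw_x + I(raw_x^2), data = dat)

# Evenly spaced x values over the observed range, with model predictions
grid <- data.frame(raw_x = seq(min(dat$raw_x), max(dat$raw_x), length.out = 200))
grid$predict <- predict(fit, newdata = grid)

# Plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dat, aes(x = raw_x, y = raw_y)) +
    geom_point(colour = "blue") +
    geom_line(data = grid, aes(y = predict), colour = "red") +
    theme_bw()
  print(p)
}
```

Predicting on a grid keeps the curve smooth even where the raw x values are sparse, and it works the same way for models that geom_smooth cannot fit directly.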
I am using ggplot and geoms to show my data, but the plot sidebar area just shows a gray box with the x and y axis correctly labeled.
Here is the output image:
The code which made the plot:
ggplot(Wc, aes(y = popsafe, x = rnground)) +
geom_jitter(aes(col = me)) +
geom_smooth(method = "lm", se = FALSE, col = "black")
It looks like your dataset is empty. We don't know what your dataset contains, so here is an example with the built-in iris dataset. First, a proper plot using the same geoms and mappings you use:
library(ggplot2)
ggplot(iris, aes(y = Sepal.Length, x = Sepal.Width)) +
geom_jitter(aes(col = Species)) +
geom_smooth(method = "lm", se = FALSE, col = "black")
Now I remove all the data from the dataset and replot:
library(dplyr)
iris_empty <- filter(iris, Sepal.Length < 0)
ggplot(iris_empty, aes(y = Sepal.Length, x = Sepal.Width)) +
geom_jitter(aes(col = Species)) +
geom_smooth(method = "lm", se = FALSE, col = "black")
A simple head(Wc) would confirm whether your dataset actually contains any data.
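Beyond head(), a slightly more thorough check is to count the rows and the rows that are complete for the plotted variables, since rows with missing values are dropped by the geoms. The helper check_plot_data below is hypothetical (it is not part of ggplot2) and is shown on iris because the contents of Wc are unknown.

```r
# Hypothetical helper: how many rows exist, and how many have no NA
# in the columns that the plot maps to aesthetics
check_plot_data <- function(df, vars) {
  missing_cols <- setdiff(vars, names(df))
  if (length(missing_cols) > 0) {
    stop("columns not found: ", paste(missing_cols, collapse = ", "))
  }
  n_total <- nrow(df)
  n_complete <- if (n_total == 0) {
    0L
  } else {
    sum(stats::complete.cases(df[, vars, drop = FALSE]))
  }
  c(rows = n_total, complete_rows = n_complete)
}

# An empty data set, built without dplyr
iris_empty <- iris[iris$Sepal.Length < 0, ]

check_plot_data(iris, c("Sepal.Length", "Sepal.Width", "Species"))
check_plot_data(iris_empty, c("Sepal.Length", "Sepal.Width", "Species"))
```

For the data in the question, the equivalent call would presumably be check_plot_data(Wc, c("popsafe", "rnground", "me")).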
Please consider the following R script (taken and slightly modified from here):
require(ggplot2)
x <- 1:10
y <- jitter(x^2)
DF <- data.frame(x, y)
p <- ggplot(DF, aes(x = x, y = y)) + geom_point() +
stat_smooth(method = 'lm', aes(colour = 'linear')) +
stat_smooth(method = 'lm', formula = y ~ poly(x,2),
aes(colour = 'polynomial')) +
stat_smooth(method = 'nls', formula = y ~ a * log(x) +b,
aes(colour = 'logarithmic')) +
stat_smooth(method = 'nls', formula = y ~ a*exp(b *x),
aes(colour = 'Exponential')) +
theme(legend.position = "top")
p <- p + guides(guide_legend(ncol=2,nrow=2,byrow=TRUE))
p
The legend is displayed at the top of the plot. I want to break this legend into two lines, with two keys in each line. Is this possible?
Please note that, as you may see, I already tried
p+guides(guide_legend(ncol=2,nrow=2,byrow=TRUE))
as suggested here and here, but it did not work for me. This suggestion basically displays the data and the legends of the linear and polynomial models and completely hides the logarithmic and exponential models.
As explained by eipi10,
You need to specify which legend, in this case the colour legend: guides(colour=guide_legend(ncol=2,nrow=2,byrow=TRUE)).
To clarify, the colour aesthetic defines the colour of each line, so that is the legend to modify. If fill were used instead, the call would be guides(fill=guide_legend(ncol=2,nrow=2,byrow=TRUE)).
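Put together, a complete plot with a four-key legend wrapped onto two rows might look like the sketch below. It is an illustration under assumptions: the two nls fits from the question are replaced by lm fits (a cubic and a log term) so that every layer fits without starting values, and se = FALSE is set to keep the plot uncluttered.

```r
library(ggplot2)

set.seed(1)
DF <- data.frame(x = 1:10)
DF$y <- jitter(DF$x^2)

p <- ggplot(DF, aes(x = x, y = y)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE,
              aes(colour = "linear")) +
  stat_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE,
              aes(colour = "quadratic")) +
  stat_smooth(method = "lm", formula = y ~ poly(x, 3), se = FALSE,
              aes(colour = "cubic")) +
  stat_smooth(method = "lm", formula = y ~ log(x), se = FALSE,
              aes(colour = "logarithmic")) +
  theme(legend.position = "top") +
  # Name the aesthetic whose legend should be laid out in two rows
  guides(colour = guide_legend(nrow = 2, byrow = TRUE))

p
```

Naming the aesthetic (colour = ...) is the important part; an unnamed guide_legend() passed to guides() is not tied to any legend.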
I plotted time series data with ggplot, with Year on the x axis and rain on the y axis.
I would like to overlay a trend line on this plot (my equation for this trend line is rain = 2.6*Year + 23). The slope was computed using the Theil-Sen method.
How can I overlay this on my plot?
My code thus far is
ggplot(data = Datarain, aes(x = year, y = rain)) +
geom_smooth(color="red", formula = y ~ x) +
geom_smooth(method = "lm", se=FALSE, color="blue", formula = y ~ x) +
geom_line() + scale_x_continuous("Year")
I am not sure how to add my own equation to the plot or how to add a Theil-Sen line in ggplot.
Any ideas would be appreciated.
You can use geom_abline to draw your linear equation by passing its intercept and slope:
ggplot(data = Datarain, aes(x = year, y = rain)) +
geom_smooth(color="red", formula = y ~ x) +
geom_smooth(method = "lm", se=FALSE, color="blue", formula = y ~ x) +
geom_line() + scale_x_continuous("Year") +
geom_abline(intercept = 23, slope = 2.6)
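If the Theil-Sen coefficients are not already known, they can be computed in base R and passed to geom_abline. The sketch below assumes the usual definition (slope as the median of all pairwise slopes, intercept as the median of y - slope * x). The function theil_sen and the values in Datarain are made up for illustration, since the original data were not posted.

```r
# Hypothetical helper: Theil-Sen slope and intercept
theil_sen <- function(x, y) {
  idx <- utils::combn(length(x), 2)      # all pairs of observations
  dx <- x[idx[2, ]] - x[idx[1, ]]
  dy <- y[idx[2, ]] - y[idx[1, ]]
  keep <- dx != 0                        # skip pairs with identical x
  slope <- stats::median(dy[keep] / dx[keep])
  intercept <- stats::median(y - slope * x)
  c(intercept = intercept, slope = slope)
}

# Made-up data standing in for Datarain
set.seed(42)
Datarain <- data.frame(year = 1990:2019)
Datarain$rain <- 23 + 2.6 * (Datarain$year - 1990) + rnorm(30, sd = 10)

ts_fit <- theil_sen(Datarain$year, Datarain$rain)

# Plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(Datarain, aes(x = year, y = rain)) +
    geom_line() +
    geom_abline(intercept = ts_fit[["intercept"]],
                slope = ts_fit[["slope"]],
                colour = "darkgreen")
  print(p)
}
```

Note that geom_abline interprets the intercept as the value of the line at x = 0. If the equation was estimated with years counted from the start of the series rather than as calendar years, the intercept has to be shifted accordingly before plotting.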