I'm trying to plot individual regression lines for all of my experimental subjects (n=40) on the same plot where I show the overall regression line.
I can do the plots separately with ggplot, but I haven't found a way to superpose them on the same graph.
I can illustrate what I did with the iris data frame:
#first plot
ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length)) +
geom_point() +
stat_smooth(method = lm, se = FALSE) +
theme_classic()
# second plot, grouped by species
ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length, colour =Species)) +
geom_point() +
stat_smooth(method = lm, se = FALSE) +
theme_classic()
# and I've been trying things like this:
ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length)) +
geom_point() +
stat_smooth(method = lm, se = FALSE) +
theme_classic() +
geom_point(aes(x = Sepal.Width, y = Sepal.Length, colour =Species))) +
stat_smooth(method = lm, se = FALSE) +
theme_classic()
which returns the message "Error: Cannot add ggproto objects together. Did you forget to add this object to a ggplot object?", so I get that this is not the right way to combine them, but what is?
How can I combine both graphs in one?
Thanks in advance!
Duplicate the whole data set and, in the copy, set Species to something else ("Together" in the example below). Append the copy to the original data and then just call your second plot.
d1 = iris
d2 = rbind(d1, transform(d1, Species = "Together"))
ggplot(d2, aes(x = Sepal.Width, y = Sepal.Length, colour =Species)) +
stat_smooth(method = lm, se = FALSE) +
geom_point(data = d1) +
theme_classic()
Similar to @d.b's answer, consider expanding the data frame with rbind, assigning an "All" category for Species and adjusting the factor levels (so that All shows at the top of the legend):
new_species_level <- c("All", unique(as.character(iris$Species)))
iris_expanded <- rbind(transform(iris, Species=factor("All", levels=new_species_level)),
transform(iris, Species=factor(Species, levels=new_species_level)))
ggplot(iris_expanded, aes(x=Sepal.Width, y=Sepal.Length, colour=Species)) +
geom_point() +
stat_smooth(method = lm, se = FALSE) +
theme_classic()
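An alternative that avoids duplicating the data is to add two smoothing layers: one that maps colour to Species and one that sets a fixed colour, so it is fitted to all points at once. This is only a sketch; the black colour for the overall line and the explicit formula = y ~ x are choices made here, not part of the answers above.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length)) +
  # points coloured by group
  geom_point(aes(colour = Species)) +
  # one regression line per species (colour is mapped, so it also defines the groups)
  stat_smooth(aes(colour = Species), method = lm, formula = y ~ x, se = FALSE) +
  # one overall regression line (colour is set, not mapped, so there is a single group)
  stat_smooth(method = lm, formula = y ~ x, se = FALSE, colour = "black") +
  theme_classic()

p
```

The overall line does not appear in the legend with this approach, which is the main trade-off compared with the rbind versions.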
I would appreciate any help applying transparent background colours to divide the plot area into two parts based on x-values (a vertical division), as illustrated in the plot below.
Here are my sample data and code:
mtcars$cyl <- as.factor(mtcars$cyl)
ggplot(mtcars, aes(x=wt, y=mpg, color=cyl)) +
geom_point() +
theme(legend.position="none")+
geom_smooth(method=lm, se=FALSE, fullrange=TRUE)
Here is the plot I would like to replicate, and the legend illustrates the change I want to implement:
Thank you in advance.
I think you want something like this. You'll have to designate groups and fill by that group in your geom_ribbon, and set your ymin and ymax as you like.
library(tidyverse)
library(data.table)  # provides rleid()
library(zoo)         # provides na.locf()
mtcars$group <- ifelse(mtcars$wt <= 3.5, "<= 3.5", "> 3.5")
mtcars <- arrange(mtcars, wt)
mtcars$group2 <- rleid(mtcars$group)
mtcars_plot <- head(do.call(rbind, by(mtcars, mtcars$group2, rbind, NA)), -1)
mtcars_plot[,c("group2","group")] <- lapply(mtcars_plot[,c("group2","group")], na.locf)
mtcars_plot[] <- lapply(mtcars_plot, na.locf, fromLast = TRUE)
ggplot(mtcars_plot, aes(x = wt, y = mpg)) +
geom_point() +
geom_smooth(aes(), method=lm, se=F, fullrange=TRUE) +
geom_ribbon(aes(ymin = mpg *.75, ymax = mpg * 1.25, fill = group), alpha = .25) +
labs(fill = "Weight Class")
Edit:
To map confidence intervals using geom_ribbon you'll have to calculate them beforehand using lm and predict.
mtmodel <- lm(mpg ~ wt, data = mtcars)
mtcars$Low <- predict(mtmodel, newdata = mtcars, interval = "confidence")[,2]
mtcars$High <- predict(mtmodel, newdata = mtcars, interval = "confidence")[,3]
Then rerun the earlier code that builds mtcars_plot from mtcars, so that it includes the Low and High columns, and plot with the calculated bounds.
ggplot(mtcars_plot, aes(x = wt, y = mpg)) +
geom_point() +
geom_smooth(aes(), method=lm, se=F, fullrange=TRUE) +
geom_ribbon(aes(ymin = Low, ymax = High, fill = group), alpha = .25) +
labs(fill = "Weight Class") +
scale_fill_manual(values = c("red", "orange"), name = "fill")
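If the goal is only to shade the whole panel to the left and right of a cutoff on the x-axis, a simpler route may be two rectangles that extend to infinity. The sketch below assumes the same 3.5 cutoff as above; the regions data frame and its column names are made up for illustration.

```r
library(ggplot2)

cutoff <- 3.5  # assumed threshold, matching the grouping used above

# One row per shaded region; -Inf/Inf stretch the rectangle to the panel edges
regions <- data.frame(
  xmin  = c(-Inf, cutoff),
  xmax  = c(cutoff, Inf),
  group = c("<= 3.5", "> 3.5")
)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # draw the background first so points and line sit on top of it
  geom_rect(data = regions,
            aes(xmin = xmin, xmax = xmax, ymin = -Inf, ymax = Inf, fill = group),
            alpha = 0.25, inherit.aes = FALSE) +
  geom_point() +
  geom_smooth(method = lm, formula = y ~ x, se = FALSE, fullrange = TRUE) +
  labs(fill = "Weight Class")

p
```

Because the rectangles use their own data frame, mtcars does not need to be reordered or padded with NA rows.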
The code below produces a scatter plot with a regression line for each group. Instead of sloped regression lines, is it possible to plot horizontal lines that represent the average of each group's y values? I tried changing the formula parameter to "y ~ 0 *x" but can't think of anything else obvious to try.
Thanks
ggplot(data = iris, aes(y = Sepal.Length, x = Sepal.Width, colour = Species)) + geom_point() +
geom_smooth(method = 'lm', formula = y ~ x , se = F)
We can specify the formula as y ~ 1.
library(ggplot2)
ggplot(data = iris, aes(y = Sepal.Length, x = Sepal.Width, colour = Species)) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ 1)
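This works because an intercept-only linear model estimates a single coefficient, and its least-squares estimate is the mean of y. The sketch below checks that against the per-species means and shows geom_hline as a possible alternative; group_means and intercepts are names introduced here for illustration.

```r
library(ggplot2)

# Per-species means computed directly
group_means <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)

# Intercepts of intercept-only models fitted per species
intercepts <- sapply(split(iris, iris$Species),
                     function(d) coef(lm(Sepal.Length ~ 1, data = d))[[1]])

# The two should agree up to floating-point tolerance
all.equal(unname(intercepts), group_means$Sepal.Length)

# Alternative: draw the means as horizontal reference lines
p <- ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length, colour = Species)) +
  geom_point() +
  geom_hline(data = group_means,
             aes(yintercept = Sepal.Length, colour = Species))

p
```

Note that geom_hline spans the full width of the panel, whereas geom_smooth only draws each line over that group's range of x values.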
I am using ggplot and geoms to show my data, but the plot sidebar area just shows a gray box with the x and y axis correctly labeled.
Here is the output image:
The code which made the plot:
ggplot(Wc, aes(y = popsafe, x = rnground)) +
geom_jitter(aes(col = me)) +
geom_smooth(method = "lm", se = FALSE, col = "black")
Looks like your dataset is empty. We don't know what your dataset contains, so here is an example with the built-in iris dataset. First a proper plot, using the same geoms and mappings you use:
library(ggplot2)
ggplot(iris, aes(y = Sepal.Length, x = Sepal.Width)) +
geom_jitter(aes(col = Species)) +
geom_smooth(method = "lm", se = FALSE, col = "black")
Now I remove all the data from the dataset and replot:
library(dplyr)
iris_empty <- filter(iris, Sepal.Length < 0)
ggplot(iris_empty, aes(y = Sepal.Length, x = Sepal.Width)) +
geom_jitter(aes(col = Species)) +
geom_smooth(method = "lm", se = FALSE, col = "black")
A simple head(Wc) would confirm whether your dataset actually contains any data.
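A few quick checks of this kind, shown here on an emptied copy of iris since Wc itself is not available:

```r
# Build an empty data frame the same way as above, using base R only
iris_empty <- iris[iris$Sepal.Length < 0, ]

nrow(iris_empty)   # number of rows; zero means there is nothing to plot
head(iris_empty)   # prints only the column headers when the data frame is empty
str(iris_empty)    # column types are still there even with no rows
```

If the row count is zero, the next step would be to look at whichever filtering or import step produced the data frame.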
ggplot() +
geom_point(aes(x = Africa_set$Africa_Predict, y = Africa_set$Africa_Real), color ="red") +
geom_line(aes(x = Africa_set$Africa_Predict, y = predict(simplelm, newdata = Africa_set)),color="blue") +
labs(title = "Africa Population",fill="") +
xlab("Africa_set$Africa_Predict") +
ylab("Africa_set$Africa_Real")
This then shows the error message:
Error: Found object is not a stat
How can I fix this error?
It looks like you are trying to plot points with a fitted regression line on top. You can do this using:
library(ggplot2)
ggplot(iris, aes(Petal.Length, Petal.Width)) +
geom_point() +
geom_smooth(method = "lm")
Or, if you really do want to use the model you've stored ahead of time in a simplelm object like you have in your example, you could use augment from the broom package:
library(ggplot2)
library(broom)
simplelm <- lm(Petal.Width ~ Petal.Length, data = iris)
ggplot(data = augment(simplelm),
aes(Petal.Length, Petal.Width)) +
geom_point() +
geom_line(aes(Petal.Length, .fitted), color = "blue")
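If you would rather not depend on broom, a similar result can be had with base R by storing the predictions as a column and passing the data frame to ggplot(), instead of referring to columns with $ inside aes(). This is a sketch on iris; iris_fit and the fitted column are names chosen here.

```r
library(ggplot2)

simplelm <- lm(Petal.Width ~ Petal.Length, data = iris)

# Add the model predictions as an ordinary column
iris_fit <- transform(iris, fitted = predict(simplelm, newdata = iris))

p <- ggplot(iris_fit, aes(x = Petal.Length, y = Petal.Width)) +
  geom_point(colour = "red") +
  # the line layer only overrides y, x is inherited from ggplot()
  geom_line(aes(y = fitted), colour = "blue")

p
```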
I'll use violin plots here as an example, but the question extends to many other ggplot types.
I know how to subset my data along the x-axis by a factor:
ggplot(iris, aes(x = Species, y = Sepal.Length)) +
geom_violin() +
geom_point(position = "jitter")
And I know how to plot only the full dataset:
ggplot(iris, aes(x = 1, y = Sepal.Length)) +
geom_violin() +
geom_point(position = "jitter")
My question is: is there a way to plot the full data AND a subset-by-factor side-by-side in the same plot? In other words, for the iris data, could I make a violin plot that has both "full data" and "setosa" along the x-axis?
This would enable a comparison of the distribution of a full dataset and a subset of that dataset. If this isn't possible, any recommendations on better way to visualise this would also be welcome :)
Thanks for any ideas!
Using:
ggplot(iris, aes(x = "All", y = Sepal.Length)) +
geom_violin() +
geom_point(aes(color="All"), position = "jitter") +
geom_violin(data=iris, aes(x = Species, y = Sepal.Length)) +
geom_point(data=iris, aes(x = Species, y = Sepal.Length, color = Species),
position = "jitter") +
scale_color_manual(values = c("black","#F8766D","#00BA38","#619CFF")) +
theme_minimal(base_size = 16) +
theme(axis.title.x = element_blank(), legend.title = element_blank())
gives a plot with a violin for the full data ("All") alongside one violin per species.
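Another option, sketched here under the assumption that duplicating the data is acceptable, is to stack a relabelled copy of the data on top of the original so that a single pair of layers draws everything. iris_all is a name introduced for this example.

```r
library(ggplot2)

# Stack a copy labelled "All" on top of the original rows
iris_all <- rbind(
  transform(iris, Species = "All"),
  transform(iris, Species = as.character(Species))
)

# Put "All" first on the x-axis, followed by the original species order
iris_all$Species <- factor(iris_all$Species,
                           levels = c("All", levels(iris$Species)))

p <- ggplot(iris_all, aes(x = Species, y = Sepal.Length)) +
  geom_violin() +
  # jitter horizontally only, so the y values stay exact
  geom_jitter(aes(colour = Species), width = 0.2, height = 0) +
  theme_minimal(base_size = 16) +
  theme(axis.title.x = element_blank(), legend.title = element_blank())

p
```

To show only the full data next to one subset, as in the question, the same plot can be built from subset(iris_all, Species %in% c("All", "setosa")).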