trendline in R in ggplot - r

mpg %>%
mutate(Color=ifelse(class=='2seater','2seater','Other')) %>%
ggplot(aes(displ, hwy, colour = Color)) +
geom_point() +
scale_color_manual(values = c("2seater" = "#992399", "Other" = "#000000"))
I am trying to add a single trend line fitted to all categories. If I add geom_smooth(method="lm"), it draws a separate line for the 2seater group, which I don't want.

The colour aesthetic also acts as the grouping aesthetic, so geom_smooth fits one line per colour. Override the grouping with group = 1 in the call to geom_smooth.
library(tidyverse)
mpg %>%
mutate(Color=ifelse(class=='2seater','2seater','Other')) %>%
ggplot(aes(displ, hwy, colour = Color)) +
geom_point() +
scale_color_manual(values = c("2seater" = "#992399", "Other" = "#000000")) +
geom_smooth(aes(group = 1),
method = "lm", formula = y ~ x)
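One way to check what group = 1 changes is to count the groups that the smooth layer actually fits. This is only a sketch, assuming ggplot2 and dplyr are installed; the names d, p_split and p_single are made up for illustration. The first plot should report two groups and the second one.

```r
library(ggplot2)
library(dplyr)

# Same recoding as in the question
d <- mpg %>%
  mutate(Color = ifelse(class == "2seater", "2seater", "Other"))

# Smooth inherits colour, so it is fitted once per colour group
p_split <- ggplot(d, aes(displ, hwy, colour = Color)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# group = 1 puts every row into a single group for the smooth layer
p_single <- ggplot(d, aes(displ, hwy, colour = Color)) +
  geom_point() +
  geom_smooth(aes(group = 1), method = "lm", formula = y ~ x)

# layer_data() returns the computed data for a layer; layer 2 is the smooth
n_split  <- length(unique(layer_data(p_split, 2)$group))
n_single <- length(unique(layer_data(p_single, 2)$group))
print(c(split = n_split, single = n_single))
```

Recent ggplot2 versions may warn that the colour aesthetic was dropped for the single-group smooth, since colour is no longer constant within the group.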

The geom_smooth layer inherits the aes mappings from ggplot(). You can move colour into geom_point, or pass inherit.aes = FALSE to geom_smooth and give it its own mapping.
mpg %>%
mutate(Color=ifelse(class=='2seater','2seater','Other')) %>%
ggplot(aes(displ, hwy)) +
geom_point(aes(colour = Color)) +
scale_color_manual(values = c("2seater" = "#992399", "Other" = "#000000")) +
geom_smooth(method = 'lm')
#or:
mpg %>%
mutate(Color=ifelse(class=='2seater','2seater','Other')) %>%
ggplot(aes(displ, hwy, colour = Color)) +
geom_point() +
scale_color_manual(values = c("2seater" = "#992399", "Other" = "#000000")) +
geom_smooth(method = 'lm', inherit.aes = FALSE, aes(displ, hwy))

Related

Override legend for linechart with points

My current legend displays the shapes of the points in the chart, crossed out by the line. I'd like to remove this line from the legend and display only the shapes.
The code looks like this:
p <- ggplot(data=cumdf, aes(x=quarters2)) +
geom_line(aes(y = mean_cumsum, colour='Platform participants'), size = 1.5)+
geom_point(aes(y = mean_cumsum, colour='Platform participants', shape='Platform participants'), size=3) +
geom_line(aes(y = mean_interventions, colour='Actions'), size=1.5) +
geom_point(aes(y = mean_interventions, colour='Actions', shape='Actions'), size=3) +
geom_line(aes(y = mean_sales, colour="Adopters"), size=1.5) +
geom_point(aes(y = mean_sales, colour='Adopters', shape='Adopters'), size=3) +
xlab("Quarters") +
ylab("Cumulative occurrences") +
scale_shape_manual("", values=c("Platform participants" = 16, "Actions" = 17, "Adopters"=15)) +
scale_colour_manual("",breaks = c("Platform participants", "Actions", "Adopters"),
values = c ("#C80000", "#696969", "#4E33FF")) +
theme_stata(base_size = 15, base_family = "sans", scheme = "s2color") +
scale_x_continuous(n.breaks=14) +
geom_vline(xintercept=3, linetype='dashed', size=1.7)
p
Add show.legend to your geom_line. Since you have multiple calls to geom_line, you need to add it to all of them.
I'll demonstrate using mtcars, updated for a factor.
dat <- transform(mtcars, cyl = factor(cyl))
Before the change:
ggplot(dat, aes(mpg, disp, group = cyl, color = cyl, shape = cyl)) +
geom_line() +
geom_point()
Add show.legend=FALSE:
ggplot(dat, aes(mpg, disp, group = cyl, color = cyl, shape = cyl)) +
geom_line(show.legend = FALSE) +
geom_point()
The answer was found in the comments: add show.legend=FALSE to each geom_line().
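Applied to a plot shaped like the one in the question, the fix might look like the sketch below. The data frame toy is invented for illustration (cumdf is not shown in the question), only two of the three series are kept, and theme_stata and the size settings on the lines are left out to keep it short.

```r
library(ggplot2)

# Hypothetical stand-in for cumdf; the values are made up
toy <- data.frame(
  quarters2   = 1:6,
  mean_cumsum = c(1, 3, 6, 10, 15, 21),
  mean_sales  = c(0, 1, 1, 2, 4, 7)
)

p <- ggplot(toy, aes(x = quarters2)) +
  # every geom_line gets show.legend = FALSE, so only points reach the legend
  geom_line(aes(y = mean_cumsum, colour = "Platform participants"),
            show.legend = FALSE) +
  geom_point(aes(y = mean_cumsum, colour = "Platform participants",
                 shape = "Platform participants"), size = 3) +
  geom_line(aes(y = mean_sales, colour = "Adopters"),
            show.legend = FALSE) +
  geom_point(aes(y = mean_sales, colour = "Adopters", shape = "Adopters"),
             size = 3) +
  scale_shape_manual("", values = c("Platform participants" = 16,
                                    "Adopters" = 15)) +
  scale_colour_manual("", values = c("Platform participants" = "#C80000",
                                     "Adopters" = "#4E33FF"))
p
```

The lines are still drawn in the plot itself; only their legend keys are suppressed.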

ggplot single-value factor remove slashes from legend

I would like to show a simple geom_point, geom_smooth, and geom_abline with helpful legends. Unfortunately, the simple combination of geom_point and geom_smooth places a horizontal line across the point legend, and adding geom_abline places a diagonal slash across all legends.
How can I create a simple visual where legend boxes include only a "point", a "line", and a "dashed line"?
Thanks
Examples:
Geom_point and geom_smooth
mtcars %>%
ggplot() +
geom_point(aes(x = carb, y = mpg, color = "Points")) +
geom_smooth(aes(x = carb, y = mpg, color = "Trendline")) +
theme(legend.position="bottom") +
labs(x = "carb",
y = "mpg",
color = "LEGEND")
Geom_point, geom_smooth, and geom_abline
mtcars %>%
ggplot() +
geom_point(aes(x = carb, y = mpg, color = "Points")) +
geom_smooth(aes(x = carb, y = mpg, color = "Trendline")) +
geom_abline(aes(slope = 1, intercept = 10, color = "ZCustom"), linetype = "dashed") +
theme(legend.position="bottom") +
labs(x = "carb",
y = "mpg",
color = "LEGEND")
Fixed geom_point legend, but slashes remain on other legends
mtcars %>%
ggplot() +
geom_point(aes(x = carb, y = mpg, color = "Points")) +
geom_smooth(aes(x = carb, y = mpg, color = "Trendline")) +
geom_abline(aes(slope = 1, intercept = 10, color = "ZCustom"), linetype = "dashed") +
scale_color_manual(values = c("red", "blue", "black"),
label = c("Points", "Trendline", "Custom"),
guide = guide_legend(override.aes = list(
linetype = c("blank", "solid", "dashed"),
shape = c(16, NA, NA)))) +
theme(legend.position="bottom") +
labs(x = "carb",
y = "mpg",
color = "LEGEND")
I have looked at these questions but did not understand how to apply them to my situation:
ggplot legend slashes
ggplot2 legend for abline and stat_smooth
Include manually-added lines to ggplot2 guide legend
To me, the easiest way to get around this is to simply not list everything as color. You can use size, shape, alpha, etc. to break up the legend.
mtcars %>%
ggplot() +
geom_point(aes(x = carb, y = mpg, shape = "")) +
geom_smooth(aes(x = carb, y = mpg, alpha = "")) +
geom_abline(aes(slope = 1, intercept = 10, color = ""), linetype = "dashed") +
theme(legend.position="bottom") +
labs(x = "carb",
y = "mpg",
shape = "Points",
alpha = "Trendline",
color = "ZCustom")
My guess: you are using color for both geom_point and geom_smooth, so the legend attempts to combine both geoms.
When you use a different aes, the legend takes them as separate attribute/layers.
mtcars %>%
ggplot( aes(x = carb, y = mpg) ) +
geom_point( aes(fill = "Points") ) + # use 'fill' rather than 'color'
geom_smooth( aes(color = "Trendline") ) +
theme(legend.position = "bottom") +
labs(x = "carb", y = "mpg", color = "", fill = "")
Hope it helps!
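If everything has to stay in a single colour legend, another option is to change the key glyph of the abline layer. This is a sketch rather than one of the answers above: it assumes ggplot2 3.2.0 or later (where layers accept key_glyph), and it swaps the default smoother for method = "lm" only to keep the example quiet. The diagonal slash comes from the default abline key, so drawing that layer's key as a horizontal path and then reusing the question's override.aes should leave a point, a solid line, and a dashed line.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = carb, y = mpg)) +
  geom_point(aes(color = "Points")) +
  # se = FALSE keeps the grey ribbon box out of the legend keys
  geom_smooth(aes(color = "Trendline"), method = "lm", formula = y ~ x,
              se = FALSE) +
  # key_glyph = "path" replaces the diagonal abline key with a flat line
  geom_abline(aes(slope = 1, intercept = 10, color = "ZCustom"),
              linetype = "dashed", key_glyph = "path") +
  scale_color_manual(
    values = c(Points = "red", Trendline = "blue", ZCustom = "black"),
    labels = c("Points", "Trendline", "Custom"),
    guide = guide_legend(override.aes = list(
      linetype = c("blank", "solid", "dashed"),
      shape = c(16, NA, NA)))) +
  theme(legend.position = "bottom") +
  labs(x = "carb", y = "mpg", color = "LEGEND")
p
```

Note that se = FALSE also removes the confidence band from the plot itself; keeping the band means the legend keys keep a grey background unless that is overridden too.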

How to change the transparency of a continuous scale legend when calling the alpha argument in a geom in ggplot2?

I need the alpha for the legend of the continuous scale colourbar to match that of the call in the geom.
mpg %>% ggplot(aes(x = displ, y = cty)) +
geom_point(aes(colour = hwy), alpha = 0.33)
You can use a colour-gradient scale with a built-in alpha. For example, in the code below, the 54 tacked onto the end of each colour value sets its alpha: the last two hex digits are the alpha channel on a 00 to FF (0 to 255) scale, and 0.33 * 255 is about 84, which is 54 in hexadecimal. (A suffix of 85 would be hexadecimal for 133, an alpha of about 0.52.)
mpg %>% ggplot(aes(x = displ, y = cty)) +
geom_point(aes(colour = hwy), alpha = 0.33) +
scale_colour_gradient(low = "#132B4354", high = "#56B1F754")
Compare:
theme_set(theme_classic())
gridExtra::grid.arrange(
mpg %>% ggplot(aes(x = displ, y = cty)) +
geom_point(aes(colour = hwy), alpha = 0.33),
mpg %>% ggplot(aes(x = displ, y = cty)) +
geom_point(aes(colour = hwy), alpha = 0.33) +
scale_colour_gradient(low = "#132B4354", high = "#56B1F754"),
ncol=2
)
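The two-digit suffix is just the alpha channel written in hexadecimal, so it can be computed from the alpha passed to the geom instead of being typed by hand. A minimal sketch, where alpha_value, alpha_hex, low and high are illustrative names and the two base colours are the ones used above:

```r
library(ggplot2)

alpha_value <- 0.33

# Scale alpha to 0-255 and format it as two hex digits
alpha_hex <- sprintf("%02X", as.integer(round(alpha_value * 255)))

low  <- paste0("#132B43", alpha_hex)
high <- paste0("#56B1F7", alpha_hex)

# Base R's adjustcolor() builds the same kind of string in one step
low_adj <- grDevices::adjustcolor("#132B43", alpha.f = alpha_value)

p <- ggplot(mpg, aes(x = displ, y = cty)) +
  geom_point(aes(colour = hwy), alpha = alpha_value) +
  scale_colour_gradient(low = low, high = high)
p
```

Keeping the alpha in one variable means the points and the colourbar stay in step if the transparency is changed later.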

adding summary statistics to two factor boxplot

I would like to add summary statistics (e.g. the mean) to a boxplot that has two factors. I have tried this:
library(ggplot2)
ggplot(ToothGrowth, aes(x = factor(dose), y = len)) +
stat_boxplot(geom = "errorbar", aes(col = supp, fill=supp), position = position_dodge(width = 0.85)) +
geom_boxplot(aes(col = supp, fill=supp), notch=T, notchwidth = 0.5, outlier.size=2, position = position_dodge(width = 0.85)) +
stat_summary(fun.y=mean, aes(supp,dose), geom="point", shape=20, size=7, color="violet", fill="violet") +
scale_color_manual(name = "SUPP", values = c("blue", "darkgreen")) +
scale_fill_manual(name = "SUPP", values = c("lightblue", "green"))
I got this picture:
Is it possible to somehow add the sample size of each box (e.g. at the top of the whiskers)? I have tried this:
ggplot(ToothGrowth, aes(x = factor(dose), y = len)) +
stat_boxplot(geom = "errorbar", aes(col = supp, fill=supp), position = position_dodge(width = 0.85)) +
geom_boxplot(aes(col = supp, fill=supp), notch=T, notchwidth = 0.5, outlier.size=2, position = position_dodge(width = 0.85)) +
stat_summary(fun.y=mean,aes(supp,dose),geom="point", shape=20, size=7, color="violet", fill="violet") +
scale_color_manual(name = "SUPP", values = c("blue", "darkgreen")) +
scale_fill_manual(name = "SUPP", values = c("lightblue", "green")) +
geom_text(data = ToothGrowth,
group_by(dose, supp),
summarize(Count = n(),
q3 = quantile(ToothGrowth, 0.75),
iqr = IQR(ToothGrowth),
aes(x= dose, y = len,label = paste0("n = ",Count, "\n")), position = position_dodge(width = 0.75)))
You can state the aesthetics just once by putting them in the main ggplot call and then they will apply to all of the geom layers: ggplot(ToothGrowth, aes(x = factor(dose), y = len, color=supp, fill=supp))
For the count of observations: The data summary step in geom_text isn't coded properly. Also, to set len (the y-value) for the text placement, the summarize function needs to output values for len.
To add the mean values in the correct locations on the x-axis, use stat_summary with the exact same aesthetics as the other geoms and stats. I've overridden the color aesthetic by setting the color to yellow so that the point markers will be visible on top of the box plot fill colors.
The code to implement the plot is below:
library(tidyverse)
pd = position_dodge(0.85)
ggplot(ToothGrowth, aes(x = factor(dose), y = len, color=supp, fill=supp)) +
stat_boxplot(geom = "errorbar", position = pd) +
geom_boxplot(notch=TRUE, notchwidth=0.5, outlier.size=2, position=pd) +
stat_summary(fun=mean, geom="point", shape=3, size=2, colour="yellow", stroke=1.5,
position=pd, show.legend=FALSE) +
scale_color_manual(name = "SUPP", values = c("blue", "darkgreen")) +
scale_fill_manual(name = "SUPP", values = c("lightblue", "green")) +
geom_text(data = ToothGrowth %>% group_by(dose, supp) %>%
summarize(Count = n(),
len=max(len) + 0.05 * diff(range(ToothGrowth$len))),
aes(label = paste0("n = ", Count)),
position = pd, size=3, show.legend = FALSE) +
theme_bw()
Note that the notch goes outside the hinges for all of the box plots. Also, having the sample size just above the maximum of each boxplot seems distracting and unnecessary to me. You could place all of the text annotations at the bottom of the plot like this:
geom_text(data = ToothGrowth %>% group_by(dose, supp) %>%
summarize(Count = n()) %>%
ungroup %>%
mutate(len=min(ToothGrowth$len) - 0.05 * diff(range(ToothGrowth$len))),
aes(label = paste0("n = ", Count)),
position = pd, size=3, show.legend = FALSE) +
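It can help to build the label data on its own first and inspect it before handing it to geom_text. The sketch below assumes dplyr 1.0.0 or later for the .groups argument; counts and y_range are illustrative names. It produces one row per dose and supp combination, with the count and a len value slightly above that group's maximum.

```r
library(dplyr)

# Vertical offset is 5% of the overall range of len
y_range <- diff(range(ToothGrowth$len))

counts <- ToothGrowth %>%
  group_by(dose, supp) %>%
  summarise(Count = n(),
            len = max(len) + 0.05 * y_range,  # y position for the label
            .groups = "drop")

print(counts)
```

Because the summary keeps the dose, supp and len columns, geom_text(data = counts, ...) can inherit the x, y, colour and fill mappings from the main ggplot call and only needs the label aesthetic.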

R - ggplot2 geom_line dodge

I am having trouble drawing "dodged" lines on "dodged" stacked bars.
dt = mtcars %>% group_by(am, cyl) %>% summarise(m = mean(disp))
dt0 = dt[dt$am == 0, ]
dt1 = dt[dt$am == 1, ]
dt0 %>% ggplot(aes(factor(cyl), m, fill = factor(cyl))) + geom_bar(stat = 'identity', position = 'dodge') +
geom_point(data = dt1, aes(factor(cyl), m, colour = factor(cyl)), position=position_dodge(width=0.9), colour = 'black')
What I would like is to draw a line from the top of the stacked bar to the black points of each cyl.
dt0 %>% ggplot(aes(factor(cyl), m, fill = factor(cyl))) + geom_bar(stat = 'identity', position = 'dodge') +
geom_point(data = dt1, aes(factor(cyl), m, colour = factor(cyl)), position=position_dodge(width=0.9), colour = 'black') +
geom_line(data = dt1, aes(factor(cyl), m, colour = factor(cyl), group = 1), position=position_dodge(width=0.9), colour = 'black')
However, position=position_dodge(width=0.9) has no effect on the line here.
Any ideas?
This is much easier to accomplish if you reshape your summary data:
dt <- mtcars %>%
group_by(am, cyl) %>%
summarise(m = mean(disp)) %>%
spread(am, m)
    cyl        0        1
* <dbl>    <dbl>    <dbl>
1     4 135.8667  93.6125
2     6 204.5500 155.0000
3     8 357.6167 326.0000
While "0" and "1" are poor column names, they can still be used in aes() if you quote them in backticks. The calls to position_dodge() also become unnecessary:
dt %>% ggplot(aes(x = factor(cyl), y = `0`, fill = factor(cyl))) +
geom_bar(stat = 'identity') +
geom_point(aes(x = factor(cyl), y = `1`), colour = 'black') +
geom_segment(aes(x = factor(cyl), xend = factor(cyl), y = `0`, yend = `1`))
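The same reshape can be written with tidyr's pivot_wider, which also lets you give the new columns friendlier names so no backticks are needed. This is a sketch assuming tidyr 1.0.0 or later and dplyr 1.0.0 or later; the am_ prefix and the name dt_wide are arbitrary choices for illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

dt_wide <- mtcars %>%
  group_by(am, cyl) %>%
  summarise(m = mean(disp), .groups = "drop") %>%
  # one column per value of am, named am_0 and am_1
  pivot_wider(names_from = am, values_from = m, names_prefix = "am_")

p <- ggplot(dt_wide, aes(x = factor(cyl), y = am_0, fill = factor(cyl))) +
  geom_col() +
  geom_point(aes(y = am_1), colour = "black") +
  # vertical segment from the top of each bar to its point
  geom_segment(aes(xend = factor(cyl), yend = am_1))
p
```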
