I know that this question has been asked before but the solutions don't seem to work for me.
What I want to do is represent my median, mean, upper and lower quantiles on a histogram in different colours and then add a legend to the plot. This is what I have so far and I have tried to use scale_color_manual and scale_color_identity to give me a legend. Nothing seems to be working.
quantile_1 <- quantile(sf$Unit.Sales, probs = 0.25)
quantile_2 <- quantile(sf$Unit.Sales, probs = 0.75)
ggplot(aes(x = Unit.Sales), data = sf) +
geom_histogram(color = 'black', fill = NA) +
geom_vline(aes(xintercept=median(Unit.Sales)),
color="blue", linetype="dashed", size=1) +
geom_vline(aes(xintercept=mean(Unit.Sales)),
color="red", linetype="dashed", size=1) +
geom_vline(aes(xintercept=quantile_1), color="yellow", linetype="dashed", size=1)
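You need to map the color inside aes(). ggplot2 only draws a legend for aesthetics that are mapped, not for ones set to a constant outside aes():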
ggplot(aes(x = Sepal.Length), data = iris) +
geom_histogram(color = 'black', fill = NA) +
geom_vline(aes(xintercept=median(iris$Sepal.Length),
color="median"), linetype="dashed",
size=1) +
geom_vline(aes(xintercept=mean(iris$Sepal.Length),
color="mean"), linetype="dashed",
size=1) +
scale_color_manual(name = "statistics", values = c(median = "blue", mean = "red"))
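The same idea extends to the quartiles from the question. The following is only a sketch, assuming the iris data from the answer above; the `stats` helper table, its column names, and the yellow/green colour choices are illustrative and not part of the original answer. Putting one row per statistic in a small data frame lets a single geom_vline() layer produce all four legend entries.

```r
library(ggplot2)

x <- iris$Sepal.Length
q <- unname(quantile(x, probs = c(0.25, 0.75)))

# Hypothetical helper table: one row per statistic to draw
stats <- data.frame(
  statistic = c("lower quartile", "median", "mean", "upper quartile"),
  value = c(q[1], median(x), mean(x), q[2])
)

p <- ggplot(iris, aes(x = Sepal.Length)) +
  geom_histogram(color = "black", fill = NA, bins = 30) +
  # colour is mapped inside aes(), so each statistic gets a legend key
  geom_vline(data = stats, aes(xintercept = value, color = statistic),
             linetype = "dashed") +
  # named values tie each colour to its statistic regardless of legend order
  scale_color_manual(
    name = "statistics",
    values = c("lower quartile" = "yellow", "median" = "blue",
               "mean" = "red", "upper quartile" = "green")
  )
p
```

Naming the entries in `values` avoids relying on the alphabetical order in which ggplot2 sorts the legend keys.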
I am trying to add a legend for the mean and median to my histogram. I am also trying to change the scale on the y-axis that is labeled count. It is currently showing the density scale. I want the density plot but the count scale. Alternatively, I would be fine with a second scale or the counts at the end of the histogram. I am just not sure how to go about it. Below is some data and the current code. Thank you in advance.
studyData=data.frame(X=rchisq(1:100000, df=3))
colnames(studyData) <- "hoursstudying"
mu <- data.frame(mean(studyData$hoursstudying))
colnames(mu) <- "Mean"
med <- data.frame(median(studyData$hoursstudying))
colnames(med) <- "Median"
p <- ggplot(studyData, aes(x = hoursstudying)) +
geom_histogram(aes(y=(..density..)), binwidth = 1, colour = "black", fill = "lightblue") +
geom_density(alpha=.2, fill="#FF6666") +
geom_vline(data = mu, aes(xintercept = Mean),
color = "red", linetype = "dashed", size = 1) +
geom_vline(data = med, aes(xintercept = median(Median)),
color = "purple", size = 1) +
labs(title = "Hours Spent Completing Course Work") +
ylab("Count") +
xlab("Hours Studying") +
theme(plot.title = element_text(hjust = 0.5))
p
You can put the count instead of the density on the y axis in much the same way you reference the internally computed density, using the "..XXXX.." notation. In this case, use ..count.. in place of ..density.. (ggplot2 3.3.0 and later also accept after_stat(count)).
You will need to change both y aesthetics for geom_histogram() and geom_density():
ggplot(studyData, aes(x = hoursstudying)) +
geom_histogram(aes(y=(..count..)), binwidth = 1, colour = "black", fill = "lightblue") +
geom_density(aes(y=..count..), alpha=.2, fill="#FF6666") +
# ... everything else is the same
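For reference, here is a self-contained sketch of the same change written with after_stat(), which ggplot2 3.3.0 introduced as the replacement for the dot-dot notation. The smaller sample size and the fixed seed are assumptions made only to keep the example quick and reproducible.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the example is reproducible
studyData <- data.frame(hoursstudying = rchisq(1000, df = 3))

p <- ggplot(studyData, aes(x = hoursstudying)) +
  # after_stat(count) is the newer spelling of ..count..
  geom_histogram(aes(y = after_stat(count)), binwidth = 1,
                 colour = "black", fill = "lightblue") +
  # the density layer also exposes a computed `count` (density * n)
  geom_density(aes(y = after_stat(count)), alpha = .2, fill = "#FF6666") +
  ylab("Count") +
  xlab("Hours Studying")
p
```

The two layers line up here because binwidth = 1. With another bin width, the density curve would need to be multiplied by that width to stay on the same scale as the bars.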
Note: I also echo the comment from u/Limey. The median and mean values in the plot you originally shared look clearly wrong, yet when I run the code the values look correct. Not sure what that's about, OP, but perhaps that's a different question.
Since @chemdork123 answered the question about the y-axis scale, I won't say anything about it. To add the median/mean values to the legend you need to map their colour as an aesthetic inside aes().
p <- ggplot(studyData, aes(x = hoursstudying)) +
geom_histogram(aes(y=(..density..)), binwidth = 1, colour = "black", fill = "lightblue") +
geom_density(alpha=.2, fill="#FF6666") +
geom_vline(data = mu, aes(xintercept = Mean,
color = "red"),
linetype = "dashed", size = 1) +
geom_vline(data = med, aes(xintercept = Median,
color = "purple"),
size = 1) +
scale_color_manual(values = c("purple", "red"),
labels = c("Median", "Mean")) +
labs(title = "Hours Spent Completing Course Work") +
ylab("Count") +
xlab("Hours Studying") +
theme(plot.title = element_text(hjust = 0.5))
I have the following code which yields the figure below:
ggplot(data=data.frame(x=x, y=y, mass=mass)) +
geom_line(mapping = aes(x=x, y=y, linetype='Gompertz predicted mass', col='Gompertz predicted mass')) +
geom_point(mapping = aes(x=x, y=mass, shape='Actual mass',col='Actual mass')) +
theme_bw() +
ylab('Mass') +
xlab('t') +
scale_color_manual(name='',values = c("black",'red')) +
scale_linetype_manual(name='',values = c("solid")) +
scale_shape_manual(name='', values = c(19)) +
scale_x_continuous(breaks=seq(4,26,2)) +
ylim(c(0, 20000)) +
ggtitle('Problem 3: Plot of tumor mass with time')
Notice how the legend is separated. I'd like to merge it for shape and color. When the geoms are the same, the technique of using scale_something_manual works perfectly fine to merge the legends. However, I'm having trouble with it here since I have two different geoms.
The problem is similar to the one described in https://github.com/tidyverse/ggplot2/issues/3648. There is no elegant solution at the moment. Because you haven't included any data, I've presumed that your problem is conceptually similar to the plot below:
library(ggplot2)
ggplot(mpg, aes(displ, hwy)) +
geom_point(aes(shape = "Point", colour = "Point")) +
geom_smooth(aes(linetype = "Line", colour = "Line"),
formula = y ~ x, se = FALSE, method = "loess") +
scale_colour_manual(values = c("red", "black")) +
scale_linetype_manual(values = "solid") +
scale_shape_manual(values = 19)
The way to fix the problem is to get rid of the linetype and shape aesthetics and scales, and instead override aesthetics at the level of the legend.
ggplot(mpg, aes(displ, hwy)) +
geom_point(aes(colour = "Point")) +
geom_smooth(aes(colour = "Line"),
formula = y ~ x, se = FALSE, method = "loess") +
scale_colour_manual(
values = c("red", "black"),
guide = guide_legend(override.aes = list(shape = c(NA, 19),
linetype = c(1, NA)))
)
Created on 2021-09-04 by the reprex package (v2.0.1)
I have the following data frame and I am plotting the average (Accuracy) per level. But I also want to show the individual data points with shapes (e.g. Accuracy1, Accuracy2, Accuracy3 etc.) on the line. Could anyone help me? Thanks
ggplot(data=Accuracy_means, aes(x=Effort_Level, y=Accuracy,
group=1)) +
geom_errorbar(aes(ymin=Accuracy-se, ymax=Accuracy+se), width=.05, size=1) +
geom_line(size=1)+
geom_hline(yintercept=c(-0.5,0.5), linetype="dashed", colour="black", size=0.5)+
ylim(0,1)+
coord_fixed(ratio = 2.5)+
theme_classic()
It's not clear if you want to change the line type. If so, here is an approach using gather from tidyr.
library(tidyverse)
Accuracy_means %>%
gather(key = accuracy_vars, value = values, -Effort_Level, -Accuracy, -se) %>%
ggplot(aes(x=Effort_Level,
y=values)) +
geom_errorbar(aes(ymin=Accuracy-se, ymax=Accuracy+se), colour = "red", width =0.05, size = 0.5) +
geom_line(aes(linetype = accuracy_vars), size=1) +
geom_line(aes(y = Accuracy), colour = "red")+
coord_fixed(ratio = 2.5)+
theme_classic()
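If the goal is instead to show the individual values as points with different shapes, a variation on the approach above might look like the sketch below. The data frame is a hypothetical stand-in, since the question does not show the real values, and pivot_longer() is used here as tidyr's current replacement for gather().

```r
library(ggplot2)
library(tidyr)

# Hypothetical stand-in for Accuracy_means; the real values are not shown
Accuracy_means <- data.frame(
  Effort_Level = factor(1:3),
  Accuracy  = c(0.60, 0.70, 0.80),
  se        = c(0.05, 0.04, 0.03),
  Accuracy1 = c(0.55, 0.72, 0.78),
  Accuracy2 = c(0.65, 0.68, 0.82)
)

# One row per individual measurement
long <- pivot_longer(
  Accuracy_means,
  cols = c(Accuracy1, Accuracy2),
  names_to = "accuracy_vars",
  values_to = "values"
)

p <- ggplot(Accuracy_means, aes(x = Effort_Level, y = Accuracy, group = 1)) +
  geom_errorbar(aes(ymin = Accuracy - se, ymax = Accuracy + se), width = 0.05) +
  geom_line() +
  # individual values drawn as points, one shape per source column
  geom_point(data = long, aes(y = values, shape = accuracy_vars), size = 2) +
  ylim(0, 1) +
  theme_classic()
p
```

The mean line and error bars come from the wide table, while the points come from the long one, so only the point layer gets a shape legend.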
I have 5 data frames (actually one, plus 4 subsets of it made with some filters), and the idea is to plot them in different colors and have the corresponding colors in the legend. But in the legend the colors are mixed up, while the order of the names is kept. I am not sure how they are ordered, but it is not according to my input.
My code is:
ggplot(regression_with_pred, aes(x= countF, y= countM)) +
geom_point(alpha = 0.5, size=0.1) +
geom_smooth(method = lm, se=F, color ="#e08214") +
geom_ribbon(aes(ymin=lwr, ymax=upr, fill="#e08214"), alpha = 0.2) +
ggtitle(samplename) +
theme_bw() +
geom_point(data=y_mers, aes(x= countF, y= countM, color = "y_mers"), alpha = 0.5, size=0.1) +
geom_point(data=x_mers, aes(x= countF, y= countM, color = "x_mers"), alpha = 0.5, size=0.1) +
geom_point(data=z_mers, aes(x= countF, y= countM, color = "z_mers"), alpha = 0.5, size=0.1) +
geom_point(data=w_mers, aes(x= countF, y= countM, color = "w_mers"), alpha = 0.5, size=0.1) +
scale_color_manual(name = "Kmer association",
values = c("y_mers" = "#053061", "x_mers" = "#b2182b", "z_mers" = "#2166ac", "w_mers"="#d6604d"),
labels = c("y_mers","x_mers","z_mers","w_mers")) +
scale_fill_identity()
The wrong plot:
It's wrong because "y_mers" should have the color of "x_mers", "x_mers" the color of "z_mers", "z_mers" the color of "w_mers", and "w_mers" the color of "y_mers".
Like this:
The correct order
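A likely cause, assuming default ggplot2 behaviour: a discrete colour scale sorts its keys alphabetically (w_mers, x_mers, y_mers, z_mers), while an unnamed labels vector is applied to those keys by position. The labels argument in the code above would therefore rename the keys without moving their colours. The sketch below uses a hypothetical toy data frame in place of the four subsets, drops labels, and sets the legend order with breaks instead.

```r
library(ggplot2)

# Hypothetical toy data standing in for the four subsets in the question
kmers <- data.frame(
  countF = c(1, 2, 3, 4),
  countM = c(2, 1, 4, 3),
  set = c("y_mers", "x_mers", "z_mers", "w_mers")
)

cols <- c(y_mers = "#053061", x_mers = "#b2182b",
          z_mers = "#2166ac", w_mers = "#d6604d")

p <- ggplot(kmers, aes(x = countF, y = countM, color = set)) +
  geom_point() +
  scale_color_manual(
    name = "Kmer association",
    values = cols,        # named, so each key keeps its own colour
    breaks = names(cols)  # legend order; labels default to the breaks
  ) +
  theme_bw()
p
```

With breaks controlling the order, the legend text and the colours cannot drift apart, because both are derived from the same named vector.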
Following guides like ggplot Donut chart I am trying to draw small gauges, doughnuts with a label in the middle, with the intention to put them later on on a map.
If the value reaches a certain threshold I would like the fill of the doughnut to change to red. Is it possible to achieve with if_else (it would be most natural but it does not work).
library(tidyverse)
df <- tibble(ID=c("A","B"),value=c(0.7,0.5)) %>% gather(key = cat,value = val,-ID)
ggplot(df, aes(x = val, fill = cat)) + scale_fill_manual(aes,values = c("red", "yellow"))+
geom_bar(position="fill") + coord_polar(start = 0, theta="y")
ymax <- max(df$val)
ymin <- min(df$val)
p2 = ggplot(df, aes(fill=cat, y=0, ymax=1, ymin=val, xmax=4, xmin=3)) +
geom_rect(colour="black",stat = "identity") +
scale_fill_manual(values = if_else (val > 0.5, "red", "black")) +
geom_text( aes(x=0, y=0, label= scales::percent (1-val)), position = position_dodge(0.9))+
coord_polar(theta="y") +
xlim(c(0, 4)) +
theme_void() +
theme(legend.position="none") +
scale_y_reverse() + facet_wrap(facets = "ID")
The scale_fill_manual(values = if_else(...)) part does not work. The error says: Error in if_else(val > 0.5, "red", "black") : object 'val' not found. Is it my error, or does some other solution exist?
I also realize my code is not optimal, initially gather waited for more variables to be included in the plot, but I failed to stack one variable on top of the other. Now one variable should be enough to indicate the percentage of completion. I realise my code is redundant for the purpose. Can you help me out?
A solution for the color problem is to first create a variable in the data and then use that to map the color in the plot:
df <- tibble(ID=c("A","B"),value=c(0.7,0.5)) %>% gather(key = cat,value = val,-ID) %>%
mutate(color = if_else(val > 0.5, "red", "black"))
p2 = ggplot(df, aes(fill=color, y=0, ymax=1, ymin=val, xmax=4, xmin=3)) +
geom_rect(colour="black",stat = "identity") +
scale_fill_manual(values = c(`red` = "red", `black` = "black")) +
geom_text( aes(x=0, y=0, label= scales::percent (1-val)), position = position_dodge(0.9))+
coord_polar(theta="y") +
xlim(c(0, 4)) +
theme_void() +
theme(legend.position="none") +
scale_y_reverse() + facet_wrap(facets = "ID")
The result would be:
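As a variation on the answer above, when the new column already holds literal colour names, scale_fill_identity() can use them directly without a manual mapping. This is only a sketch: `fill_col` is a hypothetical column name, and the data frame is built without gather since only one variable is plotted.

```r
library(ggplot2)

# Same toy values as in the question; `fill_col` is a hypothetical column name
df <- data.frame(ID = c("A", "B"), val = c(0.7, 0.5))
df$fill_col <- ifelse(df$val > 0.5, "red", "black")

p2 <- ggplot(df, aes(fill = fill_col, ymax = 1, ymin = val, xmax = 4, xmin = 3)) +
  geom_rect(colour = "black") +
  # use the colour names stored in the data as-is; no legend is drawn by default
  scale_fill_identity() +
  geom_text(aes(x = 0, y = 0, label = scales::percent(1 - val))) +
  coord_polar(theta = "y") +
  xlim(c(0, 4)) +
  theme_void() +
  scale_y_reverse() +
  facet_wrap(facets = "ID")
p2
```

Because the threshold test is strict (val > 0.5), the gauge for ID "B" with a value of exactly 0.5 stays black.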