How to make legends independent of aesthetics? - r

I've managed to make the plot below using ggplot2 containing a boxplot, the observations and the weighted mean.
I got the legend I want by using three different aesthetics (fill, color and size), but I would like to produce the legend without relying on the aesthetics. Using the aesthetics makes it impossible to customize the plot's colors, fills and sizes, and if I one day need a fourth element plotted, I will run out of aesthetics.
Is there any way to treat legends individually on a per-geom basis, using the same aesthetics for all geoms?
More specifically, I want the edges of the boxplot colored the same as the fill, the observations colored like the boxplot, and the weighted average colored black, but if I specify these colors outside of aes(), the legend is deleted or altered.
library(dplyr)
library(tidyr)
library(ggplot2)
study <- c(1:10)
observations <- c(seq(10, 100, by = 10))
type <- c("A", "A", "A", "A", "A", "B", "B", "B", "B", "B")
rate <- c(runif(10, 0, 1))
data1 <- data.frame(study, type, observations, rate)
average <- data1 %>%
  group_by(type) %>%
  summarise(rate = weighted.mean(rate, observations))
data1 %>%
  ggplot() +
  geom_boxplot(aes(x = type, y = rate, fill = type), alpha = 0.2) +
  geom_point(aes(x = type, y = rate, size = "Observations")) +
  geom_point(data = average,
             aes(x = type, y = rate, color = "Weighted mean"),
             shape = 18, size = 5) +
  guides(fill = guide_legend(title = "Legend"),
         color = guide_legend(title = ""),
         size = guide_legend(title = ""))

Related

GGplot: Two stacked bar plots side by side (not facets)

I am trying to recreate this solution using ggplot2 in R: Combining two stacked bar plots for a grouped stacked bar plot
diamonds %>%
filter(color=="D"|color=="E"|color=="F") %>%
mutate(dummy=rep(c("a","b"),each=13057)) %>%
ggplot(aes(x=color,y=price))+
geom_bar(aes(fill=clarity),stat="identity",width=.25)+
facet_wrap(~cut)
I added a new variable to the diamonds dataset called dummy. dummy has two values: a and b. Let's say I want to compare these two values by creating a bar graph that has two stacked bars right next to each other (one for each value of dummy) for each value of color. How can I manipulate this such that there are two stacked bars for each value of color?
I think it would involve position dodge and/or a separate legend, but I've been unsuccessful so far. I do not want to add another facet - I want these both on the x-axis within each facet.
Similar to the approach in the post you linked, one option to achieve your desired result is to use two geom_col layers and convert the x-axis variable to numeric, like so. However, doing so requires setting the breaks and labels manually via scale_x_continuous. Additionally, I used the ggnewscale package to add a second fill scale:
library(ggplot2)
library(dplyr)
d <- diamonds %>%
filter(color == "D" | color == "E" | color == "F") %>%
mutate(dummy = rep(c("a", "b"), each = 13057))
ggplot(mapping = aes(y = price)) +
geom_col(data = filter(d, dummy == "a"), aes(x = as.numeric(color) - .15, fill = clarity), width = .3) +
scale_fill_viridis_d(name = "a", guide = guide_legend(order = 1)) +
scale_x_continuous(breaks = seq_along(levels(d$color)), labels = levels(d$color)) +
ggnewscale::new_scale_fill() +
geom_col(data = filter(d, dummy == "b"), aes(x = as.numeric(color) + .15, fill = clarity), width = .3) +
scale_fill_viridis_d(name = "b", option = "B", guide = guide_legend(order = 2)) +
facet_wrap(~cut)
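The numeric-x trick above can be checked in isolation. The following is a minimal base R sketch, assuming a small made-up factor that mimics diamonds$color (an ordered factor with levels D to J); the names color, pos, x_a and x_b are illustrative only. It shows that as.numeric() returns the level index, so D, E and F land on 1, 2 and 3, and that offsets of -.15 and +.15 with width .3 put the two bars flush against each other around each position.

```r
# Hypothetical stand-in for diamonds$color after filtering to D, E, F.
# filter() does not drop unused levels, so all seven levels remain.
color <- factor(c("D", "E", "F", "D", "F"),
                levels = c("D", "E", "F", "G", "H", "I", "J"),
                ordered = TRUE)

pos <- as.numeric(color)   # level index: D = 1, E = 2, F = 3

x_a <- pos - 0.15          # centre of the bar for dummy == "a"
x_b <- pos + 0.15          # centre of the bar for dummy == "b"

# With width = 0.3, bar "a" ends where bar "b" starts (both at pos)
right_edge_a <- x_a + 0.3 / 2
left_edge_b  <- x_b - 0.3 / 2

# Breaks and labels as they would be passed to scale_x_continuous()
breaks <- seq_along(levels(color))
labels <- levels(color)
```

This works without further steps here because D, E and F happen to be the first three levels. For a subset such as G and I, calling droplevels() before the conversion would presumably be needed to get contiguous positions.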

Assign custom colors to each plot of facet_wrap histograms in R - ggplot

I want to use facet_wrap in R to split my plots based on a certain column. Here is a working example I reproduced from here:
set.seed(1)
df <- data.frame(age = runif(500, min = 10, max = 100),
group = rep(c("a", "b", "c", "d", "e"), 100))
#Plotting
ggplot(df, aes(age)) +
geom_histogram(aes(y = (..count..)), binwidth = 5) +
facet_wrap(~group, ncol = 3)
This produces plots that are all grey (shown below). However, I want each plot to be in a specific color, in this order: c("green","orange","blue","black", "red"). All bars in plot (a) should be green, all bars in (b) orange, and so on. These colors match my other plots and preserve consistency.
How can I achieve this task?
Thanks.
ggplot(df, aes(age)) +
geom_histogram(aes(y = (..count..), fill=group), binwidth = 5) +
facet_wrap(~group, ncol = 3) +
scale_fill_manual(values=c("green","orange","blue","black", "red"))
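A slightly more defensive variant of the same idea, offered as a sketch rather than as part of the original answer: a named vector ties each color to its group explicitly, so the mapping does not depend on the order of the factor levels. The name group_colors is made up for this example, and show.legend = FALSE is an optional extra that hides the legend, since the facet strips already label the groups.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(age = runif(500, min = 10, max = 100),
                 group = rep(c("a", "b", "c", "d", "e"), 100))

# Named vector: each group is bound to its color by name, not by position
group_colors <- c(a = "green", b = "orange", c = "blue",
                  d = "black", e = "red")

p <- ggplot(df, aes(age)) +
  geom_histogram(aes(fill = group), binwidth = 5, show.legend = FALSE) +
  facet_wrap(~group, ncol = 3) +
  scale_fill_manual(values = group_colors)
p
```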

Is there an efficient way to draw lines between different elements in a stacked bar plot using ggplot2?

I would like to draw lines between different elements in a stacked bar plot using ggplot2.
I have plotted a stacked barchart using ggplot2 (first figure), but would like to get something like in second figure.
dta <- tribble(
~colA, ~colB, ~colC,
"A", "a", 1,
"A", "b", 3,
"B", "a", 4,
"B", "b", 2); dta
ggplot(dta, aes(x = colA, y = colC, fill = colB)) +
geom_bar(stat = "identity")
The fastest way would probably be to add the lines by drawing them manually into the exported image. However, I would prefer to avoid this.
This Stack Overflow entry (esp. the answer by Henrik) gives a potential solution. However, I was wondering whether there is a more generic solution (i.e. one that does not require manually defining all the start and end points of the segments/lines).
You could use the "factor as numbers" trick to draw lines between the bar centers (shown, e.g., here).
In your case this needs to be combined with stacking in geom_line().
ggplot(dta, aes(x = colA, y = colC, fill = colB)) +
geom_bar(stat = "identity") +
geom_line( aes(x = as.numeric(factor(colA))),
position = position_stack())
Getting the lines to the edges instead of the center would take some manual work. It's OK if you really only have two stacks like this, but would be difficult to easily scale.
In this case you'd want to add .45 to the group that comes first on the x axis and subtract .45 from the second. This might seem magical, but the default width is 90% of the resolution of the data so I used half of 0.9.
dta = transform(dta, colA_num = ifelse(colA == "A",
as.numeric(factor(colA)) + .45,
as.numeric(factor(colA)) - .45) )
ggplot(dta, aes(x = colA, y = colC, fill = colB)) +
geom_bar(stat = "identity") +
geom_line( aes(x = colA_num),
position = position_stack())
This doesn't add a line at 0 because those values aren't in the dataset. This could be added as a segment along the lines of
annotate(geom = "segment", y = 0, yend = 0, x = 1.45, xend = 1.55)
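For more than two bars, one possible generalization is to compute the segment end points from the data instead of hard-coding the offsets. The sketch below is not from the answer above. It assumes the default stacking order of position_stack() (groups stacked in reverse order, so the last level of colB sits at the bottom) and a bar width of 0.9, and the helper names x, top, nxt, segs and half_width are made up for illustration.

```r
library(ggplot2)

dta <- data.frame(
  colA = c("A", "A", "B", "B"),
  colB = factor(c("a", "b", "a", "b")),
  colC = c(1, 3, 4, 2)
)
dta$x <- as.numeric(factor(dta$colA))   # bar positions 1, 2, ...

# Accumulate heights from the bottom of each stack upwards
# (last level of colB first) to get the top of every piece.
dta <- dta[order(dta$x, -as.integer(dta$colB)), ]
dta$top <- ave(dta$colC, dta$x, FUN = cumsum)

# Pair every piece with the same piece in the next bar to the right
nxt  <- data.frame(x = dta$x - 1, colB = dta$colB, top_next = dta$top)
segs <- merge(dta, nxt, by = c("x", "colB"))

half_width <- 0.9 / 2   # half of the bar width used in geom_col() below

p <- ggplot(dta, aes(x = x, y = colC, fill = colB)) +
  geom_col(width = 0.9) +
  geom_segment(data = segs,
               aes(x = x + half_width, xend = x + 1 - half_width,
                   y = top, yend = top_next),
               inherit.aes = FALSE) +
  scale_x_continuous(breaks = unique(dta$x), labels = unique(dta$colA))
p
```

Because the segments are derived by merging each bar with its right-hand neighbour, adding a third or fourth bar to dta should produce the extra connecting lines without further manual offsets.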

Hide legend for a single geom in ggplot2

I map the same variable (color) to color in two different geoms. I want them either to appear in separate legends (DHJ and EFI) or preferably just skip the second legend (for E, F, and I) altogether. Currently, R mixes the two together and gives me a legend that lists DEFHIJ in alphabetical order all mixed together.
Basically, I want to graph today's points onto some smoothed lines that use a standard dataset. I don't want there to be a legend for the smoothed lines - we are all familiar with them and they are standard on all our graphs. I just want a legend for the points only.
I've tried show.legend = FALSE as suggested elsewhere, but that doesn't seem to have an effect. guides(color = FALSE) removes the entire legend.
Reprex:
library(tidyverse)
set1 <- diamonds %>%
filter(color %in% c("D", "H", "J"))
set2 <- diamonds %>%
filter(color %in% c("E", "F", "I"))
ggplot() +
geom_point(data = set1,
aes(x = x, y = y, color = color)) +
geom_smooth(data = set2,
show.legend = FALSE,
aes(x = x, y = y, color = color))
Here is the graph that is produced. It has all 6 letters in the legend, instead of only DHJ.
If you want the legend to show only the colors from one dataset you can do so by setting the breaks in scale_color_discrete() to those values.
... +
scale_color_discrete(breaks = unique(set1$color) )
If you aren't using the colors of the lines, since this is standard background info, you could add the lines by using group in geom_smooth() instead of color. (Also see linetype if you want to be able to tell the lines apart.)
ggplot() +
geom_point(data = set1,
aes(x = x, y = y, color = color)) +
geom_smooth(data = set2,
aes(x = x, y = y, group = color))

How to make frequency barplot in groups?

Suppose my data is two columns, one is "Condition", one is "Stars"
food <- data.frame(Condition = c("A", "B", "A", "B", "A"), Stars=c('good','meh','meh','meh','good'))
How can I make a barplot of the frequency of "Stars", grouped by "Condition"?
I read here but would like to expand that answer to include groups.
For now I have
q <- ggplot(food, aes(x=Stars))
q + geom_bar(aes(y=..count../sum(..count..)))
but that is the proportion of the full data set.
How to make a plot with four bars, that is grouped by 'Condition'?
Eg. 'Condition A' would have 'Good' as 0.66 and 'Meh' as 0.33
I guess this is what you are looking for:
food <- data.frame(Condition = c("A", "B", "A", "B", "A"), Stars=c('good','meh','meh','meh','good'))
library(ggplot2)
library(dplyr)
# Group by Condition first so that freq is the share within each Condition
data <- food %>% group_by(Condition, Stars) %>% summarize(n=n()) %>% mutate(freq=n/sum(n))
ggplot(data, aes(x=Stars, fill = Condition, group = Condition)) + geom_bar(aes(y=freq), stat="identity", position = "dodge")
First I calculated the frequencies with the dplyr package; these are used as the y aesthetic in geom_bar(). Then I used fill = Condition in ggplot(), which splits the bars by Condition. I also set position = "dodge" to place the bars next to each other and stat = "identity" because the frequencies are already calculated.
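As a quick cross-check of the numbers the question expects (0.66 "good" and 0.33 "meh" for Condition A), the within-Condition proportions can also be computed in base R. This is only a sketch for verifying the values, not part of the answer above; the name props is made up for this example.

```r
food <- data.frame(Condition = c("A", "B", "A", "B", "A"),
                   Stars = c("good", "meh", "meh", "meh", "good"))

# Rows are Conditions, columns are Stars; margin = 1 makes each row sum to 1
props <- prop.table(table(food$Condition, food$Stars), margin = 1)
props
```

Each row of props sums to 1, so the values are the share of each Stars level within a Condition rather than within the full data set.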
I used the computed variable ..prop.., the group aesthetic and facet_wrap(). With the group aesthetic, proportions are computed within each group, and facet_wrap() plots each condition separately.
require(ggplot2)
food <- data.frame(Condition = c("A", "B", "A", "B", "A"),
Stars=c('good','meh','meh','meh','good'))
ggplot(food) +
geom_bar(aes(x = Stars, y = ..prop.., group = Condition)) +
facet_wrap(~ Condition)
