How to give different patterns to side-by-side boxplots in R

My code is:
ggplot(my_data, aes(x = factor(inst), y = value, fill = color)) +
geom_boxplot(position = position_dodge(width = 0.75)) +
scale_fill_manual(values = c("blue" = "blue", "green" = "green", "red" = "red", "yellow" =
"yellow")) +
theme_bw()
And it shows a graph like this:
My grouping variable is color. How can I draw the boxplots with different patterns inside, on a plain black-and-white background, instead of colors?

Here's an example which should help you see how it is done:
library(tidyverse)
library(ggpattern)
dat <- expand_grid(a = 1:5, b = c("blue", "green", "red", "yellow"))
my_data <-
pmap_dfr(dat, function(a, b) {
tibble(value = sample(randu$x, size = 20, replace = TRUE)) |>
mutate(inst = a, color = b)
} )
ggplot(my_data, aes(x = factor(inst), y = value)) +
geom_boxplot_pattern(
aes(pattern = color, pattern_angle = color, pattern_spacing = color),
position = position_dodge(width = 0.75)
) +
theme_bw()
Here's the output
{ggpattern} has really good documentation, and an example of what you want can be found here: https://coolbutuseless.github.io/package/ggpattern/articles/geom-gallery-geometry.html#bw-example
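For a strictly black-and-white figure, one possible variation on the example above is to fix the fills and pick the patterns by hand. This is only a sketch: the data frame is a random stand-in with the same columns as my_data, the assignment of patterns to levels is arbitrary, and it assumes the pattern names "stripe", "crosshatch", "circle" and "none" are available in the installed ggpattern version.

```r
library(ggplot2)
library(ggpattern)

# Stand-in data with the same columns as my_data above (values are random).
set.seed(1)
my_data <- expand.grid(
  inst  = 1:5,
  color = c("blue", "green", "red", "yellow"),
  rep   = 1:20
)
my_data$value <- rnorm(nrow(my_data))

p <- ggplot(my_data, aes(x = factor(inst), y = value)) +
  geom_boxplot_pattern(
    aes(pattern = color),
    fill           = "white",  # white box background
    colour         = "black",  # black outlines
    pattern_fill   = "black",
    pattern_colour = "black",
    position = position_dodge(width = 0.75)
  ) +
  # One pattern per level of `color`; which level gets which pattern is an
  # arbitrary choice made for this sketch.
  scale_pattern_manual(
    values = c(blue = "stripe", green = "crosshatch",
               red = "circle", yellow = "none")
  ) +
  theme_bw()

p
```

Mapping only the pattern aesthetic and setting fill, colour, pattern_fill and pattern_colour as constants keeps every box monochrome while the legend still distinguishes the groups.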

Related

ggplot add horizontal line to grouped categorical data and share legend

Here is some code that makes a categorical bar chart and places a mean line on the chart. The problem is that the legends are separate and I can't figure out how to stick them together. I think I have made a dummy variable in the past and included it in the scale_manual arguments but geom_vline doesn't handle the "fill" mappings. Any ideas?
library(tidyverse)
data(mtcars)
y = mean(mtcars$mpg)
x = unique(mtcars$cyl)
meanDf <-
data.frame(x, y )
mtcars$mean = y
mtcars$group = "mean"
mtcars %>%
ggplot(aes(x = factor(cyl), y = mpg, fill = factor(carb))) +
geom_col(position = "dodge") +
geom_hline(data = meanDf, aes(yintercept = y, color = "")) +
scale_fill_manual(name = "", values = c("blue", "red", "green", "white", "black", "yellow"), labels = paste("myLabel", 1:6)) +
scale_color_manual(name = "", values = "red", label = "myLabel") +
theme(panel.background = element_rect(fill = "white")) +
theme(legend.background = element_rect(color = "black", fill = "white"))
One option would be to use only the fill scale and make use of custom key glyph.
Set the color for the geom_hline as an argument instead of mapping it on the color aes, and map a constant, e.g. "", on the fill aes instead. A drawback is that we get a warning.
Add an additional color and label to scale_fill_manual.
To get a line as the key glyph for the geom_hline I make use of a custom key glyph which conditionally on the fill color switches between draw_key_path and the default key glyph for geom_col. To make this work I use a "red2" as the additional fill color for the hline which I switch to "red" inside the custom key glyph function.
library(tidyverse)
data(mtcars)
y = mean(mtcars$mpg)
x = unique(mtcars$cyl)
meanDf <- data.frame(x, y )
mtcars$mean = y
mtcars$group = "mean"
draw_key_cust <- function(data, params, size) {
if (data$fill %in% c("red2")){
data$colour <- "red"
data$fill <- NA
draw_key_path(data, params, size)
} else
GeomCol$draw_key(data, params, size)
}
mtcars %>%
ggplot(aes(x = factor(cyl), y = mpg, fill = factor(carb))) +
geom_hline(data = meanDf, aes(yintercept = y, fill = ""), color = "red") +
geom_col(key_glyph = "cust") +
scale_fill_manual(name = NULL, values = c("red2", "blue", "red", "green", "white", "black", "yellow"), labels = c("label", paste("myLabel", 1:6))) +
theme(panel.background = element_rect(fill = "white")) +
theme(legend.background = element_rect(color = "black", fill = "white"))
#> Warning: Ignoring unknown aesthetics: fill
I think for readability, it's better to keep them separate. However, for formatting purposes, you can bring them as close together as you want by dropping the legend.title (not just assigning it an empty string) and adjusting legend.margin and legend.spacing. For instance,
library(tidyverse)
data(mtcars)
y = mean(mtcars$mpg)
x = unique(mtcars$cyl)
meanDf <-
data.frame(x, y )
mtcars$mean = y
mtcars$group = "mean"
mtcars %>%
ggplot(aes(x = factor(cyl), y = mpg, fill = factor(carb))) +
geom_col(position = "dodge") +
geom_hline(data = meanDf, aes(yintercept = y, color = "")) +
scale_fill_manual(name = "", values = c("blue", "red", "green", "white", "black", "yellow"), labels = paste("myLabel", 1:6)) +
scale_color_manual(name = "", values = "red", label = "myLabel") +
theme(
legend.title = element_blank(),
legend.margin = margin(t = 0, b = 0, r = 2, l = 2),
legend.spacing.y = unit(.5, "pt")
)
Output

Add a legend to geom_point overlaid on geom_boxplot

So I create a boxplot of the data and then add a set of points over it. I want my legend to show what the geom_point data represent. Thanks!
ggplot(data = NULL) +
geom_boxplot(data = discuss_impact_by_county,
aes(x=reorder(State,discuss, FUN = median),y=discuss),
outlier.shape = NA) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
labs(x = "States") +
geom_point(data = by_state,
aes(x = State, y = discuss_happen_difference),
col = "red",
size = 3,
show.legend = TRUE)
If you want a legend you have to map on aesthetics. In your case map something on the color aes, i.e. move col="red" into aes() and use scale_color_manual to set the value and the legend label to be assigned to the color label "red".
As you have only one "category" of points you can simply do scale_color_manual(values = "red", label = "We are red points") to set the color and label. In case you have multiple sets of points with different colors, it's best to use a named vector to assign the colors and legend labels to the right "color labels", i.e. use scale_color_manual(values = c(red = "red"), label = c(red = "We are red points")).
Using some random example data try this:
library(ggplot2)
library(dplyr)
set.seed(42)
discuss_impact_by_county <- data.frame(
State = sample(LETTERS[1:4], 100, replace = TRUE),
discuss = runif(100, 1, 5)
)
by_state <- discuss_impact_by_county %>%
group_by(State) %>%
summarise(discuss_happen_difference = mean(discuss))
#> `summarise()` ungrouping output (override with `.groups` argument)
ggplot(data = NULL) +
geom_boxplot(data = discuss_impact_by_county,
aes(x=reorder(State,discuss, FUN = median),y=discuss),
outlier.shape = NA) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
labs(x = "States") +
geom_point(data = by_state,
aes(x = State, y = discuss_happen_difference, col = "red_points"),
size = 3,
show.legend = TRUE) +
scale_color_manual(values = "red", label = "We are red points")
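The named-vector variant mentioned above could look roughly like the following sketch. The second set of points (a per-state maximum), the column names mean_discuss and max_discuss, and the keys "mean_pts" and "max_pts" are all made up for illustration; they are not part of the original question.

```r
library(ggplot2)
library(dplyr)

set.seed(42)
discuss_impact_by_county <- data.frame(
  State   = sample(LETTERS[1:4], 100, replace = TRUE),
  discuss = runif(100, 1, 5)
)

# Two summaries per state, so the legend needs two entries.
by_state <- discuss_impact_by_county %>%
  group_by(State) %>%
  summarise(mean_discuss = mean(discuss), max_discuss = max(discuss))

p <- ggplot() +
  geom_boxplot(data = discuss_impact_by_county,
               aes(x = State, y = discuss),
               outlier.shape = NA) +
  # Each layer maps a constant string on colour; that string is the key
  # used to look up the colour and label below.
  geom_point(data = by_state,
             aes(x = State, y = mean_discuss, colour = "mean_pts"), size = 3) +
  geom_point(data = by_state,
             aes(x = State, y = max_discuss, colour = "max_pts"), size = 3) +
  # Names tie each colour and label to its key, whatever the order.
  scale_colour_manual(
    name   = NULL,
    values = c(mean_pts = "red", max_pts = "blue"),
    labels = c(mean_pts = "State mean", max_pts = "State maximum")
  )

p
```

Because both vectors are named, reordering the layers or the entries does not change which colour or label ends up on which set of points.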

Legend geom_hline not in right order

I've made a barplot in ggplot, and added a couple of lines. What happens is that the color and description of the lines don't correspond:
The yellow line should have the description 'Median Member', but is displayed as 'avg Member'. What happens here? The code I used:
library(ggplot2)
library(dplyr)
MemberID=c(1,1,1, 2, 2, 2)
ClientCode = c(10,100,1000, 20, 200, 2000)
Duration = c(2356, 1560, 9000, 4569, 3123, 8000)
df <- data.frame(MemberID, ClientCode, Duration)
dr <- df %>%
filter(MemberID == 1)
dr_avg <- df
ggplot(dr, aes(reorder(as.character(ClientCode), -Duration), Duration, fill=-Duration)) +
geom_bar(stat="identity") + # the height of the bar will represent the value in a column of the data frame
xlab('ClientCode') +
ylab('Duration (Minutes)') +
geom_hline(data=dr, aes(yintercept=mean(Duration), linetype = 'Avg Member'), color = 'red', show.legend = TRUE) +
geom_hline(data=dr, aes(yintercept=median(Duration), linetype = 'Median Member'), color = 'orange', show.legend = TRUE) +
geom_hline(data=dr_avg, aes(yintercept=mean(Duration), linetype = 'Avg all data'), color = 'blue', show.legend = TRUE) +
scale_linetype_manual(name = "Line", values = c(2, 2, 2), guide = guide_legend(override.aes = list(color = c("red", "orange", "blue")))) +coord_flip()
Don't create a geom_hline for every line you want to insert. What if you have a hundred of them? Create a separate data frame d that holds the values and their names, then map both linetype and color to the name: geom_hline(data = d, aes(yintercept = value, linetype = name, color = name)). To specify the colors, use scale_colour_manual() with a values vector, e.g. values = c("red", "orange", "blue").
# 'Avg Member' and 'Median Member' describe member 1 only, so summarise dr
d1 <- summarize(dr, mean(Duration), median(Duration))
# 'Avg all data' uses every member
d2 <- summarize(dr_avg, mean(Duration))
d <- data.frame(value = as.numeric(c(d1, d2)),
name = c('Avg Member', 'Median Member', 'Avg all data'))
ggplot(dr, aes(reorder(as.character(ClientCode), -Duration),
Duration,
fill = factor(-Duration))) +
geom_bar(stat = "identity") +
labs(x = "ClientCode",
y = "Duration (Minutes)") +
geom_hline(data = d, aes(yintercept = value, linetype = name, color = name)) +
scale_fill_brewer(palette = "Dark2") +
# named values tie each colour to its label regardless of level order
scale_colour_manual(values = c('Avg Member' = "red", 'Median Member' = "orange", 'Avg all data' = "blue")) +
coord_flip() +
theme_bw()
PS.: If the member statistics are computed from df instead of dr, 'Avg Member' and 'Avg all data' are identical and those two lines overlap.

Labeling bars in a grouped barplot in ggplot

The main goal of this plot is to compare A and B in three groups, but I also want to show one, two, and three beside them. Using the code below, I can make a grouped barplot that is almost what I want. But I need the name of each bar printed below it, since the legend is so ugly.
How can I do it?
m.names <- c("A1","B1","one","A2","B2","two","A3","B3","three")
m.group <- c(1,1,1,2,2,2,3,3,3)
m.value <- c(5,10,1,20,15,2,10,20,3)
df <- data.frame(m.names, m.group, m.value)
df
ggplot(df, aes(x = m.group, y = m.value)) +
geom_bar(aes(fill = m.names), position = "dodge", stat = "identity") +
scale_fill_manual(values=c("gray75", "gray75","gray75", "gray40","gray40","gray40", "blue", "red", "green" ))
Adding geom_text and making sure it's dodged in the same way as the bars:
# width = 0.9 should be the default for dodged bars but set
# it explicitly to be sure
dodger = position_dodge(width = 0.9)
ggplot(df, aes(x = m.group, y = m.value)) +
geom_bar(aes(fill = m.names), position = dodger, stat = "identity") +
scale_fill_manual(values=c("gray75", "gray75","gray75", "gray40","gray40","gray40", "blue", "red", "green" ),
guide = "none") +
geom_text(aes(x = m.group, group = m.names, label = m.names, y = 0),
position = dodger,
vjust = 1, colour = "black")
Faceting by group may work for this case as well:
fill.values = c("gray75", "gray75","gray75",
"gray40","gray40","gray40",
"blue", "red", "green")
# factor() is needed because data.frame() no longer converts strings to factors (R >= 4.0)
names(fill.values) = levels(factor(df$m.names))
> fill.values
A1 A2 A3 B1 B2 B3 one three two
"gray75" "gray75" "gray75" "gray40" "gray40" "gray40" "blue" "red" "green"
ggplot(df,
aes(x = m.names, y = m.value, fill = m.names)) +
geom_col() +
# guide = "none" replaces the deprecated guide = F
scale_fill_manual(values = fill.values, guide = "none") +
facet_wrap(~m.group, scales = "free_x") +
theme_bw()
Seems like you might want this:
require(ggplot2)
ggplot(df, aes(x = m.names, y = m.value)) +
geom_bar(aes(fill = m.names), stat = "identity") +
scale_fill_manual(values=c("gray75", "gray75","gray75", "gray40",
"gray40","gray40", "blue", "red", "green" )) +
facet_grid(~m.group, scales = "free_x", space = "free_x") +
theme(strip.text.x = element_blank(),
panel.spacing = unit(0, "lines"))
Output:
The trick is to put m.names on the x axis here instead of m.group. Then we can facet the bars by m.group to keep them grouped the way you want.
We could use geom_label:
dodger = position_dodge(width = 0.9)
ggplot(df, aes(x = m.group, y = m.value)) +
geom_bar(aes(fill = m.names), position = dodger, stat = "identity") +
scale_fill_manual(values=c("gray75", "gray75","gray75",
"gray40","gray40","gray40", "blue", "red", "green" ),
guide = "none") +
theme(axis.text.x=element_blank(),
axis.ticks.x=element_blank()) +
geom_label(aes(x = m.group, group = m.names, label = m.names, y = 0),
position = dodger,
vjust = 1, colour = "black")

Create legend with manual shapes and colours

I use bars and line to create my plot. The demo code is:
timestamp <- seq(as.Date('2010-01-01'),as.Date('2011-12-01'),by="1 mon")
data1 <- rnorm(length(timestamp), 3000, 30)
data2 <- rnorm(length(timestamp), 30, 3)
df <- data.frame(timestamp, data1, data2)
p <- ggplot()
p <- p + geom_histogram(data=df,aes(timestamp,data1),colour="black",stat="Identity",bindwidth=10)
p <- p + geom_line(data=df,aes(timestamp,y=data2*150),colour="red")
p <- p + scale_y_continuous(sec.axis = sec_axis(~./150, name = "data2"))
p <- p + scale_colour_manual(name="Parameter", labels=c("data1", "data2"), values = c('black', 'red'))
p <- p+ scale_shape_manual(name="Parameter", labels=c("data1", "data2"), values = c(15,95))
p
This results in a plot like this:
This figure does not have a legend. I followed this answer to create a customized legend but it is not working in my case. I want a square and line shape in my legend corresponding to bars and line. How can we get it?
I want legend as shown in below image:
For the type of data you want to display, geom_bar is a better fit than geom_histogram. When you want to manipulate the appearance of the legend(s), you need to place the colour = ... parts inside the aes. To get the desired result it is probably best to use different types of legend for the line and the bars. That way you are better able to change the appearance of the legends with guide_legend and override.aes.
A proposal for your problem:
ggplot(data = df) +
geom_bar(aes(x = timestamp, y = data1, colour = "black"),
stat = "Identity", fill = NA) +
geom_line(aes(x = timestamp, y = data2*150, linetype = "red"), colour = "red", size = 1) +
scale_y_continuous(sec.axis = sec_axis(~./150, name = "data2")) +
scale_linetype_manual(labels = "data2", values = "solid") +
scale_colour_manual(name = "Parameter\n", labels = "data1", values = "black") +
guides(colour = guide_legend(override.aes = list(colour = "black", size = 1),
order = 1),
linetype = guide_legend(title = NULL,
override.aes = list(linetype = "solid",
colour = "red",
size = 1),
order = 2)) +
theme_minimal() +
theme(legend.key = element_rect(fill = "white", colour = NA),
legend.spacing = unit(0, "lines"))
which gives:
