I've created a dodged bar chart in ggplot2 with geom_col(). The code looks like this:
library(ggplot2)
cat <- c("A", "A", "A", "A", "B", "B", "B", "B")
var <- c("X", "Y", "Z", "T", "X", "Y", "Z", "T")
val <- c(35, 25, 20, 20, 40, 10, 15, 35)
df <- data.frame(var, cat, val)
ggplot(data = df) +
  geom_col(aes(x = var, y = val, fill = cat), position = "dodge")
This produces the following plot:
I would like each variable to have a different fill colour (for example T = green, X = blue, etc.) while still keeping a colour separation between the categories, for example T-A = dark green, T-B = light green, X-A = dark blue, X-B = light blue, etc.
Is there an easy way to add this feature?
Thanks!
I think the easiest way to do what you're asking is to use the alpha scale:
ggplot(data = df) +
  geom_col(aes(x = var, y = val, fill = var, alpha = cat),
           position = "dodge") +
  scale_alpha_discrete(range = c(0.5, 1), guide = guide_none()) +
  theme_classic()
If you really want to use a grid in the background and don't want to see lines through the pale bars, make sure you plot some white bars of the same dimension underneath:
ggplot(data = df) +
  geom_col(aes(x = var, y = val, group = cat),
           position = "dodge", fill = "white", alpha = 1) +
  geom_col(aes(x = var, y = val, fill = var, alpha = cat),
           position = "dodge") +
  scale_alpha_discrete(range = c(0.5, 1), guide = guide_none())
Maybe this can be useful:
library(ggplot2)
#Data
cat <- c("A", "A", "A", "A","B", "B", "B", "B")
var <- c("X", "Y", "Z", "T", "X", "Y", "Z", "T")
val <- c(35, 25, 20, 20, 40, 10, 15, 35)
df <- data.frame(var, cat, val)
#Plot
ggplot(data = df) +
  geom_col(aes(x = var, y = val, fill = interaction(var, cat)), position = "dodge") +
  labs(fill = 'Var')
Output:
You can customize the colors with scale_fill_*(). Here is an example using a fill scale from the ggsci package:
#Plot 2
ggplot(data = df) +
  geom_col(aes(x = var, y = val, fill = interaction(var, cat)), position = "dodge") +
  labs(fill = 'Var') +
  ggsci::scale_fill_futurama()
Output:
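To get the exact dark/light pairs asked for in the question, one option is to pass a named vector to scale_fill_manual(). This is only a sketch: the names assume the default "var.cat" level format that interaction() produces, and the palette name shade_colours and the specific colour choices are illustrative, not part of the original answer.

```r
library(ggplot2)

cat <- c("A", "A", "A", "A", "B", "B", "B", "B")
var <- c("X", "Y", "Z", "T", "X", "Y", "Z", "T")
val <- c(35, 25, 20, 20, 40, 10, 15, 35)
df <- data.frame(var, cat, val)

# Hypothetical palette: one dark/light pair per variable.
# Names follow interaction()'s default "var.cat" level format.
shade_colours <- c(
  "T.A" = "darkgreen",     "T.B" = "lightgreen",
  "X.A" = "darkblue",      "X.B" = "lightblue",
  "Y.A" = "darkgoldenrod", "Y.B" = "lightgoldenrod",
  "Z.A" = "darkred",       "Z.B" = "lightcoral"
)

p_shades <- ggplot(data = df) +
  geom_col(aes(x = var, y = val, fill = interaction(var, cat)),
           position = "dodge") +
  scale_fill_manual(values = shade_colours, name = "Var.Cat")

p_shades
```

Because the vector is named, each colour is matched to its level by name rather than by position, so the order of the entries does not matter.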
I'm working on a larger project for which I am creating several plots in ggplot2. The plots are concerned with plotting several different outcomes across several different discrete categories (think: countries, species, types). I would like to completely fix the mapping of discrete types to colors such that Type=A is always displayed in red, Type=B is always displayed in blue, and so on across all plots, irrespective of what other factors are present. I know about scale_fill_manual(), where I can provide color values manually and then work with drop = FALSE, which helps in dealing with unused factor levels. However, I find this extremely cumbersome, since every plot will need some manual work to deal with sorting the factors in the right way, sorting color values to match factor sorting, dropping unused levels, etc.
What I am looking for is a way where I can map once and globally factor levels to specific colors (A=green, B=blue, C=red, ...) and then just go about plotting whatever I please and ggplot picking the right colors.
Here is some code to illustrate the point.
# Full set with 4 categories
df1 <- data.frame(Value = c(40, 20, 10, 60),
Type = c("A", "B", "C", "D"))
ggplot(df1, aes(x = Type, y = Value, fill = Type)) + geom_bar(stat = "identity")
# Colors change completely because only 3 factor levels are present
df2 <- data.frame(Value = c(40, 20, 60),
Type = c("A", "B", "D"))
ggplot(df2, aes(x = Type, y = Value, fill = Type)) + geom_bar(stat = "identity")
# Colors change because factor is sorted differently
df3 <- data.frame(Value = c(40, 20, 10, 60),
Type = c("A", "B", "C", "D"))
df3$Type <- factor(df3$Type, levels = c("D", "C", "B", "A"), ordered = TRUE)
ggplot(df3, aes(x = Type, y = Value, fill = Type)) + geom_bar(stat = "identity")
You could define your own custom scale, if you like. If you look at the source for scale_fill_manual,
scale_fill_manual
#> function (..., values)
#> {
#> manual_scale("fill", values, ...)
#> }
#> <environment: namespace:ggplot2>
it's actually quite simple:
library(ggplot2)
scale_fill_chris <- function(...){
  ggplot2:::manual_scale(
    'fill',
    values = setNames(c('green', 'blue', 'red', 'orange'), LETTERS[1:4]),
    ...
  )
}
df1 <- data.frame(Value = c(40, 20, 10, 60),
Type = c("A", "B", "C", "D"))
ggplot(df1, aes(x = Type, y = Value, fill = Type)) +
geom_col() +
scale_fill_chris()
df2 <- data.frame(Value = c(40, 20, 60),
Type = c("A", "B", "D"))
ggplot(df2, aes(x = Type, y = Value, fill = Type)) +
geom_col() +
scale_fill_chris()
df3 <- data.frame(Value = c(40, 20, 10, 60),
Type = c("A", "B", "C", "D"))
df3$Type <- factor(df3$Type, levels = c("D", "C", "B", "A"), ordered = TRUE)
ggplot(df3, aes(x = Type, y = Value, fill = Type)) +
geom_col() +
scale_fill_chris()
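Since manual_scale() is an unexported ggplot2 function (hence the :::), a variant that wraps the exported scale_fill_manual() may be less likely to break across ggplot2 versions. The following is a minimal sketch under that assumption; the names type_colours and scale_fill_type are made up for illustration.

```r
library(ggplot2)

# Hypothetical project-wide palette, keyed by factor level
type_colours <- c(A = "green", B = "blue", C = "red", D = "orange")

# Thin wrapper around the exported scale_fill_manual()
scale_fill_type <- function(...) {
  scale_fill_manual(values = type_colours, ...)
}

# Only three of the four levels are present
df2 <- data.frame(Value = c(40, 20, 60),
                  Type = c("A", "B", "D"))

p2 <- ggplot(df2, aes(x = Type, y = Value, fill = Type)) +
  geom_col() +
  scale_fill_type()

# Inspect the fill colour actually assigned to each bar
ggplot_build(p2)$data[[1]]$fill
```

Because the values are named, they are matched to levels by name, so a missing level or a reordered factor should not shift the colours.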
You could make a custom plot function (including scale_fill_manual and reasonable default colours) in order to avoid repeating code:
library(ggplot2)
custom_plot <- function(.data,
                        colours = c("A" = "green", "B" = "blue", "C" = "red", "D" = "grey")) {
  ggplot(.data, aes(x = Type, y = Value, fill = Type)) +
    geom_bar(stat = "identity") +
    scale_fill_manual(values = colours)
}
df1 <- data.frame(Value=c(40, 20, 10, 60), Type=c("A", "B", "C", "D"))
df2 <- data.frame(Value=c(40, 20, 60), Type=c("A", "B", "D"))
df3 <- data.frame(Value=c(40, 20, 10, 60), Type=c("A", "B", "C", "D"))
df3$Type <- factor(df3$Type, levels=c("D", "C", "B", "A"), ordered=TRUE)
custom_plot(df1)
custom_plot(df2)
custom_plot(df3)
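Another option is to make drop = FALSE the default by redefining the default discrete colour scales. Note that scale_*_manual() still needs values, so a named colour vector (here an example palette) has to be supplied:
type_colours <- c(A = "green", B = "blue", C = "red", D = "orange")
scale_colour_discrete <- function(...)
  scale_colour_manual(..., values = type_colours, drop = FALSE)
scale_fill_discrete <- function(...)
  scale_fill_manual(..., values = type_colours, drop = FALSE)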
That way colours are always consistent for different factors.
Make sure you convert that column to a factor first, and then create a named vector that stores the colour for each level:
# factor(), not as.factor(), accepts a levels argument
df$color <- factor(df$color, levels = c(1, 0))
cbPalette <- c("1" = "green", "0" = "red")
# Map columns inside aes(); geom_col() plots y values as given
ggplot(data = df, aes(x = x, y = y, fill = color)) +
  geom_col() +
  scale_fill_manual(values = cbPalette)
Below is a simple example. I wish to draw only the outline of a bar chart.
The linked question shows what the desired plot looks like:
Outlines bars in bar graph ggplot
library(tidyverse)
level <- c("a", "b", "c", "d", "e", "f", "g")
value <- c(8.1, 5.6, 3.2, 4.4, 3.5, 2.5, 1.8)
tbl <- tibble(level = level,
value = value)
# create plot using geom_step()
ggplot(data = tbl,
aes(x = level,
y = value)) +
geom_step(col = "black") +
theme_bw()
Modifying the linked answer to apply to your data frame, we get:
ggplot(tbl,
aes(x = level,
y = value)) +
geom_col(width = 1, fill = "#e0a0e8", alpha = 0.5) +
geom_step(data = tbl %>%
mutate(level = as.numeric(factor(level)) - 0.5) %>%
summarise(level = c(level[1], level, rep(last(level) + 1, 2)),
value = c(0, value, last(value), 0)),
aes(group = 1), col = "black") +
theme_bw(base_size = 20)
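Returning more than one row per group from summarise() is deprecated in recent dplyr releases. Assuming dplyr 1.1.0 or later is installed, the same outline coordinates can be built with reframe() instead. This is a sketch of that substitution, not the linked answer's code; the names outline and x are introduced here for readability.

```r
library(dplyr)
library(ggplot2)

tbl <- tibble(level = c("a", "b", "c", "d", "e", "f", "g"),
              value = c(8.1, 5.6, 3.2, 4.4, 3.5, 2.5, 1.8))

# Left edge of each bar (width = 1), plus extra points so the outline
# starts at 0, covers the last bar, and drops back to 0 at the end
outline <- tbl %>%
  mutate(x = as.numeric(factor(level)) - 0.5) %>%
  reframe(x = c(x[1], x, rep(last(x) + 1, 2)),
          value = c(0, value, last(value), 0))

p_outline <- ggplot(tbl, aes(x = level, y = value)) +
  geom_col(width = 1, fill = "#e0a0e8", alpha = 0.5) +
  geom_step(data = outline, aes(x = x, y = value, group = 1),
            col = "black", inherit.aes = FALSE) +
  theme_bw(base_size = 20)

p_outline
```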
Given a data frame and a plot as follows:
library(dplyr)
library(ggplot2)
dat <- data.frame(grp = c("a", "b", "c"),
val = c(30, 20, 10),
avg = c(25, 15, 5))
dat %>%
ggplot(aes(x = grp, y = val)) +
geom_bar(stat = "identity")
How do I amend the code above to place a unique horizontal reference line (avg) on each bar as shown below:
This could be achieved via geom_segment like so: I first convert grp to a numeric and then, since the default width of a bar is .9, put x at .45 to the left and xend at .45 to the right of each bar's centre:
library(ggplot2)
dat <- data.frame(grp = c("a", "b", "c"),
val = c(30, 20, 10),
avg = c(25, 15, 5))
ggplot(dat, aes(x = grp, y = val)) +
geom_bar(stat = "identity") +
geom_segment(aes(y = avg, yend = avg,
x = as.numeric(factor(grp)) - .45,
xend = as.numeric(factor(grp)) + .45), color = "red")
EDIT: Thanks to a comment by @tjebo: as hard-coding is rarely a good idea, one could set the width via a variable:
w <- .9
...
geom_segment(aes(y = avg, yend = avg,
x = as.numeric(factor(grp)) - w/2,
xend = as.numeric(factor(grp)) + w/2), color = "red")
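A different technique that avoids converting grp to numeric is geom_errorbar() with ymin and ymax both set to avg, which draws a flat line centred on each bar. This is an alternative sketch rather than the answer's method; the object name p_ref is illustrative, and w is reused from the snippet above.

```r
library(ggplot2)

dat <- data.frame(grp = c("a", "b", "c"),
                  val = c(30, 20, 10),
                  avg = c(25, 15, 5))

w <- 0.9  # shared width for the bars and the reference lines

p_ref <- ggplot(dat, aes(x = grp, y = val)) +
  geom_col(width = w) +
  # ymin == ymax collapses the error bar into a single horizontal line
  geom_errorbar(aes(ymin = avg, ymax = avg), width = w, colour = "red")

p_ref
```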
I recognize that this has been an issue that's been asked in many other instances, but none of the solutions provided worked for my particular problem.
Here, I have the following data:
library(tidyverse)
library(scales)
mydata <- tibble(Category = c("A", "B", "C", "D"),
Result = c(0.442, 0.537, 0.426, 0.387),
A = c(NA, "A", NA, NA),
B = rep(NA, 4),
C = c(NA, "C", NA, NA),
D = c("D", "D", NA, NA))
mydata$Category <- factor(mydata$Category)
And I have the following vector for the colors:
colors_vct <- c(A = "#0079c0", B = "#cc9900", C = "#252525", D = "#c5120e")
With this information, I can create the following plot:
p <- ggplot(data = mydata , aes(x = Category, y = Result, fill = Category)) +
geom_bar(stat = "identity") + geom_text(aes(label = percent(Result), color = Category), hjust = -.25) +
coord_flip() + scale_y_continuous(limits = c(0,1), labels = percent) +
scale_colour_manual(values = colors_vct) + scale_fill_manual(values = colors_vct)
p
And I'd like to have little triangles appear after the labels based on whether a certain category is mentioned in the last 4 columns of mydata, colored by that category's color, as so:
p <- p + geom_text(data = filter(mydata, mydata[,3] == "A"), aes(label = sprintf("\u25b2")), colour = colors_vct["A"], hjust = -4)
#p <- p + geom_text(data = filter(mydata, mydata[,4] == "B"), aes(label = sprintf("\u25b2")), colour = colors_vct["B"], hjust = -5) #This is commented out because there are no instances where the layer ends up being applied.
p <- p + geom_text(data = filter(mydata, mydata[,5] == "C"), aes(label = sprintf("\u25b2")), colour = colors_vct["C"], hjust = -6)
p <- p + geom_text(data = filter(mydata, mydata[,6] == "D"), aes(label = sprintf("\u25b2")), colour = colors_vct["D"], hjust = -7)
p
This is what I want the final chart to look like (more or less, see bonus question below). Now, I'd like to iterate the last bit of code using a for loop. And this is where I'm running into trouble. It just ends up adding one layer only. How do I make this work? Here is my attempt:
#Set the colors into another table for matching:
colors_tbl <- tibble(Category = levels(mydata$Category),
colors = c("#0079c0", "#cc9900", "#252525", "#c5120e"))
for (i in seq_along(mydata$Category)) {
if (is_character(mydata[[i]])) { #This makes the loop skip if there is nothing to be applied, as with category B.
#Filters to just the specific categories I need to have the triangles shown.
triangles <- filter(mydata, mydata[,(i+2)] == levels(mydata$Category)[i])
#Matches up with the colors_tbl to determine which color to use for that triangle.
triangles <- mutate(triangles, colors = colors_tbl$colors[match(levels(triangles$Category)[i], colors_tbl$Category)])
#Sets a particular position for that triangle for the hjust argument below.
pos <- -(i+3)
#Adding the layer to the plot object
p <- p + geom_text(data = triangles, aes(label = sprintf("\u25b2")), color = triangles$colors, hjust = pos)
}
}
p
:(
Bonus question: Is there a way I can avoid gaps in between the triangles, as per the 2nd chart?
EDIT: As per @baptiste's suggestion, I re-processed the data as such:
mydata2 <- mydata %>% gather(key = comp, value = Present, -Result, -Category)
mydata2 <- mydata2 %>% mutate(colors = colors_tbl$colors[match(mydata2$Present, colors_tbl$Category)]) %>%
filter(!is.na(mydata2$Present)) %>% select(-comp)
mydata2 <- mydata2 %>% mutate(pos = if_else(Present == "A", -4, if_else(Present == "B", -5, if_else(Present == "C", -6, -7))))
p <- p + geom_text(data = mydata2, aes(x = Category, label = sprintf("\u25b2")), colour = mydata2$colors, hjust = mydata2$pos)
p
OK, I got it to work. My bonus question still stands.
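For the bonus question, one possible approach is to derive each triangle's offset from its rank within its own bar rather than from the category it refers to, so the triangles sit next to each other without gaps. The sketch below makes a few assumptions: it uses tidyr's pivot_longer() in place of gather(), types column B as NA_character_ so every flag column is character, and introduces the names triangles and p_tri for illustration. The starting offset of -4 is taken from the code above.

```r
library(dplyr)
library(tidyr)
library(ggplot2)
library(scales)

mydata <- tibble(Category = factor(c("A", "B", "C", "D")),
                 Result = c(0.442, 0.537, 0.426, 0.387),
                 A = c(NA, "A", NA, NA),
                 B = rep(NA_character_, 4),  # assumption: typed NA keeps columns character
                 C = c(NA, "C", NA, NA),
                 D = c("D", "D", NA, NA))

colors_vct <- c(A = "#0079c0", B = "#cc9900", C = "#252525", D = "#c5120e")

p <- ggplot(mydata, aes(x = Category, y = Result, fill = Category)) +
  geom_col() +
  geom_text(aes(label = percent(Result), colour = Category), hjust = -.25) +
  coord_flip() +
  scale_y_continuous(limits = c(0, 1), labels = percent) +
  scale_colour_manual(values = colors_vct) +
  scale_fill_manual(values = colors_vct)

# One row per triangle; pos counts triangles within each bar (-4, -5, ...)
triangles <- mydata %>%
  pivot_longer(cols = A:D, names_to = "comp", values_to = "Present") %>%
  filter(!is.na(Present)) %>%
  group_by(Category) %>%
  mutate(pos = -(3 + row_number())) %>%
  ungroup()

p_tri <- p +
  geom_text(data = triangles, aes(label = "\u25b2"),
            colour = unname(colors_vct[triangles$Present]),
            hjust = triangles$pos)

p_tri
```

With this layout, bar B gets three adjacent triangles (for A, C and D), and bar A gets a single triangle in the first slot.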