This question already has an answer here:
Manually setting group colors for ggplot2
(1 answer)
Closed 3 years ago.
So I am plotting a number of stacked bar charts on antibiotic use. The antibiotics included in each chart will differ between each individual dataset being plotted. So for example, in chart 1, 'antibiotic x' may be plotted as red, but in chart 2, it may be plotted as orange.
I was wondering if there is any way in which to standardise the colouring so that antibiotic x is the same colour across all charts I am plotting? This way it would make it easier to visualise if all antibiotic classes were the same across all charts.
The code I have used to plot one of the stacked bar charts is as follows:
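library(dplyr)      # mutate, group_by, summarise_if, %>%
library(tidyr)      # gather
library(lubridate)  # dmy, quarter
library(ggplot2)
library(zoo)        # scale_x_yearqtr

datastack016 %>%
  mutate(Date = dmy(Date),
         Q = quarter(Date, with_year = TRUE)) %>%
  group_by(Q) %>%
  # total of every numeric (antibiotic) column per quarter
  summarise_if(is.numeric, sum, na.rm = TRUE) %>%
  # long format: one row per quarter and antibiotic
  gather(Key, Total, -Q) %>%
  ggplot(aes(Q, Total, fill = Key)) +
  geom_bar(stat = "identity") +
  scale_x_yearqtr(format = "%Y") +
  ylab("Antibiotic Total (Grams)") +
  xlab("Date (Quarters/Year)")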
Any help would be much appreciated! :)
You could define the colours as a named vector and pass that to scale_fill_manual. See code below (just substitute scale_colour_manual with scale_fill_manual):
cols <- setNames(c("dodgerblue", "limegreen", "tomato"),
levels(iris$Species))
ggplot(iris, aes(Sepal.Width, Sepal.Length, colour = Species)) +
geom_point() +
scale_colour_manual(values = cols)
And now without one of the factors:
ggplot(iris[iris$Species != "setosa", ],
aes(Sepal.Width, Sepal.Length, colour = Species)) +
geom_point() +
scale_colour_manual(values = cols)
Note that cols still contains dodgerblue, but this is dropped from the legend since we don't have the first group (setosa).
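Applied to the stacked bar charts in the question, the same idea could look roughly like the sketch below. The antibiotic names, totals and the helper plot_abx() are made up for illustration; the only real point is that one named vector is shared by every chart, so a given antibiotic always gets the same fill.

```r
library(ggplot2)

# Hypothetical antibiotic names and colours, for illustration only
abx_cols <- c(amoxicillin  = "dodgerblue",
              doxycycline  = "limegreen",
              enrofloxacin = "tomato")

# Two made-up datasets that contain different subsets of antibiotics
chart1 <- data.frame(
  Q     = c("2016 Q1", "2016 Q1", "2016 Q2", "2016 Q2"),
  Key   = c("amoxicillin", "doxycycline", "amoxicillin", "doxycycline"),
  Total = c(10, 5, 8, 7)
)
chart2 <- data.frame(
  Q     = c("2016 Q1", "2016 Q1", "2016 Q2", "2016 Q2"),
  Key   = c("amoxicillin", "enrofloxacin", "amoxicillin", "enrofloxacin"),
  Total = c(4, 9, 6, 3)
)

# Hypothetical helper: every chart uses the same named colour vector
plot_abx <- function(d) {
  ggplot(d, aes(Q, Total, fill = Key)) +
    geom_col() +
    scale_fill_manual(values = abx_cols)
}

p1 <- plot_abx(chart1)
p2 <- plot_abx(chart2)
```

Because the colours are matched by name rather than by position, amoxicillin is drawn in dodgerblue in both p1 and p2 even though the two datasets contain different antibiotics.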
This question already has answers here:
Plotting two variables as lines using ggplot2 on the same graph
(5 answers)
Closed last year.
I have a dataframe that contains information on gene expression level.
Column names are gene names and row names are patient IDs.
How can I make a bar plot where the x axis shows the gene names and the y axis shows the expression level?
I cannot find a way to label Gene A and Gene B in ggplot2 without merging their values into a single column and putting the gene names in a separate column.
There has to be a simple way to do this without changing the structure of data, but I cannot find it.
#Example Data
df <- data.frame(1:5,3:7)
colnames(df) <- c("Gene_A","Gene_B")
row.names(df) <- c("Pat_A","Pat_B","Pat_C","Pat_D","Pat_E")
I have tried the "one discrete, one continuous" approach that the ggplot2 cheat sheet suggests.
f <- ggplot(mpg, aes(class, hwy)) + geom_...()
In the code above, class would be the gene names, but I cannot work out what to use for hwy.
Here is an example graph of what I want to get
You can do something like this with tidyverse, where you pipe everything into ggplot so that you do not change the original dataframe. First, I summarize the data, then pipe the summary dataframe into ggplot to plot the two bars and error.
library(tidyverse)
df %>%
pivot_longer(everything()) %>%
group_by(name) %>%
summarise(mean = mean(value),
sd = sd(value)) %>%
ggplot(., aes(name, mean)) +
geom_col(fill=c("black", "grey"), colour="black") +
geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd), width=0.2) +
xlab("Gene") +
ylab("Expression Level") +
theme_classic()
Output
Or if you do not want to pivot at all, then you can just manually build each bar.
ggplot(df) +
  # one layer per gene: the x value is a constant label, the bar height is the column mean
  geom_bar(aes(x = "Gene_A", y = Gene_A), stat = "summary", fun = "mean", fill = "black", colour = "black") +
  geom_bar(aes(x = "Gene_B", y = Gene_B), stat = "summary", fun = "mean", fill = "grey", colour = "black") +
  xlab("Gene") +
  ylab("Expression Level") +
  theme_classic()
Output
Say I make a ggplot2 plot like the following, with several facets:
ggplot(iris) +
geom_tile(aes(x = Petal.Width, fill = Sepal.Width, y = Petal.Length)) +
facet_wrap(~Species)
Note that there is one colourbar for all three plots, but each facet could potentially have very different values. Is it possible to have a separate colourbar for each facet?
I agree with Alex's answer, but against my better scientific and design judgment, I took a stab at it.
require(gridExtra)
require(dplyr)
iris %>% group_by(Species) %>%
do(gg = {ggplot(., aes(Petal.Width, Petal.Length, fill = Sepal.Width)) +
geom_tile() + facet_grid(~Species) +
guides(fill = guide_colourbar(title.position = "top")) +
theme(legend.position = "top")}) %>%
.$gg %>% arrangeGrob(grobs = ., nrow = 1) %>% grid.arrange()
Of course, then you're duplicating lots of labels, which is annoying. Additionally, you lose the x and y scale information by plotting each species as a separate plot, instead of facets of a single plot. You could fix the axes by adding ... + coord_cartesian(xlim = range(iris$Petal.Width), ylim = range(iris$Petal.Length)) + ... within that ggplot call.
To be honest, the only way this makes sense at all is if it's comparing two different variables for the fill, which is why you don't care about comparing their true value between plots. A good alternative would be rescaling them to percentiles within a facet using dplyr::group_by() and dplyr::percent_rank.
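As a rough sketch of that percentile idea for the single-variable plot from the question (the column name Sepal.Width.pct is my own choice, not something from the original code):

```r
library(dplyr)
library(ggplot2)

# Rank Sepal.Width within each species, so that every facet
# spans the same 0-1 scale regardless of its absolute values
iris_ranked <- iris %>%
  group_by(Species) %>%
  mutate(Sepal.Width.pct = percent_rank(Sepal.Width)) %>%
  ungroup()

p <- ggplot(iris_ranked, aes(Petal.Width, Petal.Length, fill = Sepal.Width.pct)) +
  geom_tile() +
  facet_wrap(~Species) +
  scale_fill_continuous(limits = c(0, 1), labels = scales::percent)
```

This keeps a single shared colourbar, but the fill now shows where a value sits within its own species rather than its absolute size, so absolute comparisons between facets are lost.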
Edited to update:
In the two-different-variables case, you have to first "melt" the data, which I assume you've already done. Here I'm repeating it with the iris data. Then you can look at the relative values by examining the percentiles, rather than the absolute values of the two variables.
iris %>%
tidyr::gather(key = Sepal.measurement,
value = value,
Sepal.Length, Sepal.Width) %>%
group_by(Sepal.measurement) %>%
mutate(percentilevalue = percent_rank(value)) %>%
ggplot(aes(Petal.Length, Petal.Width)) +
geom_tile(aes(fill = percentilevalue)) +
facet_grid(Sepal.measurement ~ Species) +
scale_fill_continuous(limits = c(0,1), labels = scales::percent)
Separate palettes for facets in ggplot facet_grid
It has been asked before. This is the best solution I have seen so far; however, I think a common palette is better from a visualization standpoint.
If this is what you want then there is a simple hack to it.
tf1 <- iris
tf1$COL <- rep(1:50, each=3)
ggplot(tf1) +
geom_tile(aes(x = Petal.Width, fill = interaction(Petal.Length,COL), y = Petal.Length)) +
facet_wrap(~Species, scales = "free") + theme(legend.position="none")
I have created a stacked bar plot with the counts of a variable. I want to keep these as counts, so that the different bar sizes represent different group sizes. However, inside the bars I would like to add labels that show the proportion of each stack as a percentage.
I managed to create the stacked plot of counts for every group. I have also created the labels, and they are placed correctly. What I struggle with is how to calculate the percentage there.
I have tried this, but I get an error:
dataex <- iris %>%
dplyr::group_by(group, Species) %>%
dplyr::summarise(N = n())
names(dataex)
dataex <- as.data.frame(dataex)
str(dataex)
ggplot(dataex, aes(x = group, y = N, fill = factor(Species))) +
geom_bar(position="stack", stat="identity") +
geom_text(aes(label = ifelse((..count..)==0,"",scales::percent((..count..)/sum(..count..)))), position = position_stack(vjust = 0.5), size = 3) +
theme_pubclean()
Error in (count) == 0 : comparison (1) is possible only for atomic and list types
desired result:
Well, I just found an answer, or at least a workaround. Maybe this will help someone in the future: calculate the percentage before calling ggplot and then just use that column as the labels.
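# Assumes iris has been given a `group` column beforehand (base iris has none)
dataex <- iris %>%
  dplyr::group_by(group, Species) %>%
  dplyr::summarise(N = n()) %>%
  # summarise() drops the Species grouping, so sum(N) is the total per group (per bar)
  dplyr::mutate(pct = paste0(round(N / sum(N) * 100, 2), " %"))
names(dataex)
dataex <- as.data.frame(dataex)
str(dataex)
# theme_pubclean() comes from the ggpubr package
ggplot(dataex, aes(x = group, y = N, fill = factor(Species))) +
  geom_bar(position = "stack", stat = "identity") +
  geom_text(aes(label = pct), position = position_stack(vjust = 0.5), size = 3) +
  theme_pubclean()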
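Since base iris has no group column, here is a self-contained sketch of the same workaround that can be run as is. The group column (g1/g2) is invented purely so the bars have different sizes, and I use dplyr::count() and scales::percent() in place of the summarise/paste0 steps, which is just a different way of writing the same calculation.

```r
library(dplyr)
library(ggplot2)

# Hypothetical grouping column: the first 90 rows are "g1", the remaining 60 are "g2"
iris_grouped <- iris
iris_grouped$group <- rep(c("g1", "g2"), times = c(90, 60))

dataex <- iris_grouped %>%
  count(group, Species, name = "N") %>%   # counts per group and species
  group_by(group) %>%
  mutate(share = N / sum(N),              # proportion within each bar
         pct   = scales::percent(share, accuracy = 0.1)) %>%
  ungroup()

p <- ggplot(dataex, aes(x = group, y = N, fill = Species)) +
  geom_col() +                            # bar heights stay as counts
  geom_text(aes(label = pct),
            position = position_stack(vjust = 0.5), size = 3)
```

The y axis still shows counts, so the bars keep their different heights, while the labels show each segment's share of its own bar.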
This question already has answers here:
pie chart with ggplot2 with specific order and percentage annotations
(2 answers)
Closed 5 years ago.
I'm trying to add some percent labels to a pie chart, but none of the solutions I have found works. The thing is that the chart displays the number of tasks completed, grouped by category.
output$plot2 <- renderPlot({
  ggplot(data = data[data$status == '100% completed', ], aes(x = factor(1), fill = category)) +
    geom_bar(width = 1) +
    coord_polar("y")
})
Using geom_text with position_stack to adjust the label locations would work.
library(ggplot2)
library(dplyr)
# Create a data frame which is able to replicate your plot
plot_frame <- data.frame(category = c("A", "B", "B", "C"))
# Get counts of categories
plot_frame <- plot_frame %>%
group_by(category) %>%
summarise(counts = n()) %>%
mutate(percentages = counts/sum(counts)*100)
# Plot
ggplot(plot_frame, aes(x = factor(1), y = counts)) +
geom_col(aes(fill = category), width = 1) +
geom_text(aes(label = percentages), position = position_stack(vjust = 0.5)) +
coord_polar("y")
The code above generates this:
You might want to change the y-axis from counts to percentages since you are labeling the latter. In that case, change the values passed to ggplot accordingly.
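A minimal sketch of that variant, using the same toy data as above. The explicit group = category in geom_text() and the "%" suffix on the labels are my additions: the first is meant to keep the labels stacked in the same order as the slices, the second is cosmetic.

```r
library(ggplot2)
library(dplyr)

# Same toy data as above
plot_frame <- data.frame(category = c("A", "B", "B", "C")) %>%
  group_by(category) %>%
  summarise(counts = n()) %>%
  mutate(percentages = counts / sum(counts) * 100)

# y is now the percentage, so the axis and the labels agree
p <- ggplot(plot_frame, aes(x = factor(1), y = percentages)) +
  geom_col(aes(fill = category), width = 1) +
  geom_text(aes(label = paste0(percentages, "%"), group = category),
            position = position_stack(vjust = 0.5)) +
  coord_polar("y")
```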