Question about ggplot2 bar graph transformation method - r

I wanted the bar plot to show two sets of bars, so I created a data frame in which each sample is repeated once per group, and used that as input.
I wrote the following code to plot it:
Select <- "Mbp"
if(Select == "Mbp"){
Select <- "Amount of sequence (Mbp)"
} else if (Select == "Gbp"){
Select <- "Amount of sequence (Gbp)"
}
ggplot(G4, aes(x = INDV, y = Bp, fill = Group)) + theme_light() +
geom_bar(stat = 'identity', position = 'dodge', width = 0.6) + coord_flip() +
scale_x_discrete(limits = rev(unname(unlist(RAW_TRIM[1])))) +
scale_fill_discrete(breaks = c("Raw data","Trimmed data"))+
scale_y_continuous(labels = scales::comma, position = "right") +
theme(axis.text = element_text(colour = "black", face = "bold", size = 15)) +
theme(legend.position = "bottom", legend.text = element_text(face = "bold", size = 15),
legend.title = element_blank()) + ggtitle(Select) + xlab("") + ylab("") +
theme(plot.title = element_text(size = 25, face = "bold", hjust = 0.5))
This gives a plot like the one below. I want the red bars to be on top of the green bars.
I tried changing the order of the data, and I tried several solutions from Stack Overflow and other sites, but none of them worked.
If you know a solution, please let me know how to modify the code or change the data.
Thank you.

You seem to be asking more than one question at once here, but the main one is: why do the bars for Raw appear under those for Trimmed? The short answer is: factor levels and the behaviour of coord_flip().
Let's make a toy dataset:
library(tidyverse)
G4 <- data.frame(INDV = c("C_01", "C_01", "C_41", "C_41"),
Group = c("Raw data", "Trimmed data", "Raw data", "Trimmed data"),
Bp = c(200, 100, 500, 400))
A simple dodged bar chart. Note that Raw comes before Trimmed, because R is before T in the alphabet:
G4 %>%
ggplot(aes(INDV, Bp)) +
geom_col(aes(fill = Group),
position = "dodge")
Now we coord_flip:
G4 %>%
ggplot(aes(INDV, Bp)) +
geom_col(aes(fill = Group),
position = "dodge") +
coord_flip()
Flipping moves the x positions onto the vertical axis, which runs from bottom to top, so the first level ends up at the bottom: Raw is now below Trimmed.
We can fix that by altering factor levels. As there are only two groups we can just reverse them using fct_rev() from the forcats package:
G4 %>%
ggplot(aes(INDV, Bp)) +
geom_col(aes(fill = fct_rev(Group)),
position = "dodge") +
coord_flip()
The bar for Raw is now on top but unfortunately, the colours are now reversed so that Raw bars are green. We can fix that using scale_fill_manual():
G4 %>%
ggplot(aes(INDV, Bp)) +
geom_col(aes(fill = fct_rev(Group)),
position = "dodge") +
coord_flip() +
scale_fill_manual(values = c("#00BFC4", "#F8766D"))
Now the Raw bars are on top, and they are red.
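A variation on the same fix, sketched on the toy data and not taken from the answer above: set the level order once on the data and give scale_fill_manual() a named vector, so each colour stays attached to its group whatever the level order is. The names group_cols and p are only illustrative.

```r
library(ggplot2)

G4 <- data.frame(
  INDV  = c("C_01", "C_01", "C_41", "C_41"),
  Group = c("Raw data", "Trimmed data", "Raw data", "Trimmed data"),
  Bp    = c(200, 100, 500, 400)
)

# Put "Trimmed data" first so that, after coord_flip(), "Raw data" is drawn on top.
G4$Group <- factor(G4$Group, levels = c("Trimmed data", "Raw data"))

# Named colours are matched to levels by name, not by position.
group_cols <- c("Raw data" = "#F8766D", "Trimmed data" = "#00BFC4")

p <- ggplot(G4, aes(INDV, Bp, fill = Group)) +
  geom_col(position = "dodge") +
  coord_flip() +
  # breaks only controls the order of the legend keys.
  scale_fill_manual(values = group_cols,
                    breaks = c("Raw data", "Trimmed data"))
p
```

With this approach fct_rev() is not needed inside aes(), and the legend can still list Raw data first.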

Related

Why are colours appearing in the labels of my gganimate sketch?

I have a gganimate sketch in R and I would like to have the percentages of my bar chart appear as labels.
But for some bizarre reason, I am getting seemingly random colours in place of the labels that I'm requesting.
If I run the ggplot part without animating then it's a mess (as it should be), but it's obvious that the percentages are appearing correctly.
Any ideas? The colour codes don't correspond to the colours of the bars which I have chosen separately. The codes displayed also cycle through about half a dozen different codes, at a rate different to the frame rate that I selected. And while the bars are the same height (they grow until they reach the chosen height displayed in the animation) then they display the same code until they stop and it gets frozen.
Code snippet:
df_new <- data.frame(index, rate, year, colour)
df_new$rate_label <- ifelse(round(df_new$rate, 1) %% 1 == 0,
paste0(round(df_new$rate, 1), ".0%"), paste0(round(df_new$rate, 1), "%"))
p <- ggplot(df_new, aes(x = year, y = rate, fill = year)) +
geom_bar(stat = "identity", position = "dodge") +
scale_fill_manual(values = colour) +
#geom_text(aes(y = rate, label = paste0(rate, "%")), vjust = -0.7) +
geom_shadowtext(aes(y = rate, label = rate_label),
bg.colour='white',
colour = 'black',
size = 9,
fontface = "bold",
vjust = -0.7,
alpha = 1
) +
coord_cartesian(clip = 'off') +
ggtitle("% population belonging to 'No religion', England and Wales census") +
theme_minimal() +
xlab("") + ylab("") +
theme(legend.position = "none") +
theme(plot.title = element_text(size = 18, face = "bold")) +
theme(axis.text = element_text(size = 14)) +
scale_y_continuous(limits = c(0, 45), breaks = 10*(0:4))
p
p <- p + transition_reveal(index) + view_follow(fixed_y = T)
animate(p, renderer = gifski_renderer(), nframes = 300, fps = frame_rate, height = 500, width = 800,
end_pause = 0)
anim_save("atheism.gif")
I think you have missed a few subtle points about ggplot2, which I will try to describe. First, discrete values should be supplied as a factor (or integer): use as.factor() before plotting, or factor() inside the aesthetic. You should also round the percentages as needed. Here is an example:
set.seed(2023)
df_new <- data.frame(index=1:10, rate=runif(10), year=2001:2010, colour=1:10)
df_new$rate_label <- ifelse(round(df_new$rate, 1) %% 1 == 0,
paste0(round(df_new$rate, 1), ".0%"),
paste0(round(df_new$rate, 1), "%"))
The ggplot for this data is:
library(ggplot2)
p <- ggplot(df_new, aes(x = factor(year), y = rate, fill = factor(colour))) +
geom_bar(stat = "identity", position = "dodge") +
geom_text(aes(y = rate, label = paste0(round(rate,2), "%")), vjust = -0.7) +
coord_cartesian(clip = 'off') +
ggtitle("% population belonging to 'No religion', England and Wales census") +
theme_minimal() +
xlab("") + ylab("") +
theme(legend.position = "none",
plot.title = element_text(size = 18, face = "bold"),
axis.text = element_text(size = 14))
p
You can also combine all theme elements in a single theme() call, as I did. The output is:
And you can easily animate the plot using the following code:
library(gganimate)
p + transition_reveal(index)
And the output is as below:
Hope it helps.
So it was answered here, although I don't know why the fix works.
For some reason, labels need to go into gganimate as factors, via as.factor(). I just had to add the line:
df_new$rate_label <- as.factor(df_new$rate_label)
and it works fine.
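As a side note, the rate_label column from the question can probably be built more simply. The sketch below assumes the goal is "always one decimal place, then a percent sign", and uses made-up rates, not the census figures. It then applies the as.factor() conversion described above.

```r
# Hypothetical rates for illustration only.
df_new <- data.frame(index = 1:3, rate = c(25, 14.8, 37.2))

# "%.1f" always prints one decimal place; "%%" is a literal percent sign.
df_new$rate_label <- sprintf("%.1f%%", df_new$rate)

# Convert the labels to a factor, as in the fix described above.
df_new$rate_label <- as.factor(df_new$rate_label)

print(as.character(df_new$rate_label))
```

This replaces the ifelse() / paste0() construction with a single format string. Whether it also avoids the gganimate issue without the factor conversion is not something this sketch checks.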

Set color of the boxes in a boxplot

hist <- ggplot(df, aes(x = A,fill = ("red"))) +
geom_bar() +
theme_minimal() +
ggtitle("Treatment") + theme(legend.position = "bottom", plot.title = element_text(hjust=0.5), text = element_text(size = 20)) +
scale_fill_manual("A", values = c("0" = "dodgerblue4", "1" = "chocolate"))
hist
I need help setting the colors of the boxes in this plot from the ggplot2 package, which I am learning right now. I want the left one to be blue and the right one to be the other color. Unfortunately, I cannot figure out how: this code sets the right colors in the legend but leaves the boxes unchanged.
Like Rene said, it is a bit hard to help without seeing the dataset.
Looking at the given code, your A contains two values that you want to color individually.
Map fill to A itself, converted to a factor so that its values match the names in scale_fill_manual():
hist <- ggplot(df, aes(x = A, fill = factor(A))) +
geom_bar() +
theme_minimal() +
ggtitle("Treatment") + theme(legend.position = "bottom", plot.title = element_text(hjust=0.5), text = element_text(size = 20)) +
scale_fill_manual("A", values = c("0" = "dodgerblue4", "1" = "chocolate"))
Not enough code for me to know exactly what you're doing but I think if you remove:
mapping = aes(fill = "red")
it will solve your problem. A reprex would help others better understand your problem. I think instead you might need something like:
ggplot(data = your_data) +
geom_bar(mapping = aes(x = variable_1, y = variable_2, fill = variable_3), position = "dodge")
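Since the original data is not shown, here is a small reproducible sketch with a made-up data frame. It assumes A is a 0/1 treatment indicator. The point it illustrates is that fill must be mapped to a variable in the data, whose values are then matched by name in scale_fill_manual().

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data: A is a 0/1 indicator.
df <- data.frame(A = c(0, 0, 0, 1, 1))

p <- ggplot(df, aes(x = factor(A), fill = factor(A))) +
  geom_bar() +
  theme_minimal() +
  # The names "0" and "1" are matched against the levels of factor(A).
  scale_fill_manual("A", values = c("0" = "dodgerblue4", "1" = "chocolate")) +
  labs(x = "A")
p
```

Writing aes(fill = "red") instead maps every bar to the single constant value "red", which matches neither "0" nor "1", so the manual colours are never applied to the bars.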

Order of categories in ggplot with fill

I've run this code:
#setup
library(tidyverse)
library(skimr)
library(scales)
#import data
tuesdata <- tidytuesdayR::tt_load('2021-05-25')
records <- tuesdata$records
records_tt <- records %>%
mutate(track = factor(track),
type = factor(type))
#let's create a boxplot
records_tt %>%
ggplot(aes(x=record_duration, y=track, fill=type)) +
geom_boxplot(alpha=0.6) +
theme_minimal() +
scale_x_continuous(labels = comma) +
labs(x = "Record duration", y = "") +
theme(
axis.title.x = element_text(
margin = margin(t = 15)),
legend.title = element_blank())
Which gives me this plot:
I have three questions though:
What is ggplot using to decide which level of type to place on the top (i.e. why is it currently plotting three lap on the top and single lap on the bottom)?
How do I flip the order of the display, without changing the order of the legend keys (as it makes sense that single is listed on top of three)?
How do I reorder the values of track from lowest mean value of record_duration (independent of type) to highest mean value, instead of the default reverse alphabetical which is currently displayed?
For question 2: here is a solution, borrowed from another answer. Replace the call to geom_boxplot() with the following line:
geom_boxplot(alpha=0.6, position=position_dodge2(reverse = TRUE)) +
For question 3: here's a solution:
records_tt %>%
group_by(track) %>%
mutate(mean_duration = mean(record_duration)) %>%
ggplot(aes(x = record_duration, y = reorder(track, -mean_duration), fill = type)) +
geom_boxplot(alpha=0.6, position=position_dodge2(reverse = T)) +
theme_minimal() +
scale_x_continuous(labels = comma) +
labs(x = "Record duration", y = "") +
theme(
axis.title.x = element_text(
margin = margin(t = 15)),
legend.title = element_blank())
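The two ideas can be tried without downloading the TidyTuesday data. The sketch below uses a small made-up data frame (toy, with invented durations) and base R's reorder() with FUN = mean, which sorts the levels by ascending group mean. It is an illustration under those assumptions, not the exact code from the answer.

```r
library(ggplot2)

# Made-up data: three tracks, two record types, two records per combination.
toy <- data.frame(
  track = factor(rep(c("A", "B", "C"), each = 4)),
  type  = factor(rep(c("Single Lap", "Three Lap"), times = 6)),
  record_duration = c(10, 12, 11, 13, 1, 2, 1, 3, 5, 6, 5, 7)
)

# Order track levels by mean record_duration, ignoring type (ascending).
toy$track_ordered <- reorder(toy$track, toy$record_duration, FUN = mean)
print(levels(toy$track_ordered))

p <- ggplot(toy, aes(x = record_duration, y = track_ordered, fill = type)) +
  # reverse = TRUE flips the order of the boxes within each track,
  # leaving the legend order unchanged.
  geom_boxplot(alpha = 0.6, position = position_dodge2(reverse = TRUE)) +
  theme_minimal() +
  labs(x = "Record duration", y = "")
p
```

On a discrete y axis the first level is drawn at the bottom, so ascending order puts the lowest mean at the bottom. Negating the value, as the answer does with -mean_duration, puts it at the top instead.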

Creating manually a legend in ggplot R [duplicate]

So this is the graph and the code I have so far:
library(fivethirtyeight)
library(tidyverse)
bad_drivers$num_drivers
bad_drivers$perc_speeding
mytable <- bad_drivers %>%
mutate(SpeedPerBilion = (num_drivers * perc_speeding)/100)
ggplot(data = mytable, aes(x = state, y = SpeedPerBilion, fill='red')) +
xlab("") +
ylab("") +
coord_flip() +
geom_bar(stat = "identity")+
geom_bar(data= mytable, aes(x = state, y=num_drivers), alpha=0.5,stat="identity") +
theme(plot.title = element_text(face = "bold"), legend.position = "none")+
scale_y_continuous(sec.axis = dup_axis())+
labs(title = "Drivers Involved In Fatal Collisions While Speeding",
subtitle = "As a share of the number of fatal collisions per billion miles, 2009")
So my questions are:
How do I add a legend to this graph?
How do I remove the lower y axis, so that only the upper one is shown?
Thank you in advance! :)
Below is the code for what I believe is your desired plot. You will have to tweak the labels to match what you need; I used placeholder names. The key is to give scale_fill_manual() a named vector of colors and to refer to those names in the aes() of each layer that should use them. A neat trick is to use alpha() to build the transparency into the color itself rather than using a separate scale. Finally, to move the y axis, use position = "right", which ends up on top after coord_flip().
library(fivethirtyeight) # for data
library(tidyverse)
bad_drivers %>%
mutate(SpeedPerBilion = (num_drivers * perc_speeding)/100) %>%
ggplot(aes(x = state, y = SpeedPerBilion)) +
xlab("") +
ylab("") +
coord_flip() +
geom_bar(stat = "identity", aes(fill = "Speeding")) +
geom_bar(aes(x = state, y = num_drivers, fill = "All"),
stat = "identity") +
theme(plot.title = element_text(face = "bold")) +
scale_y_continuous(position = "right") +
scale_fill_manual(name = "Speeding Involved",
values = c("Speeding" = alpha("red", 1), "All" = alpha("red", 0.5))) +
labs(title = "Drivers Involved In Fatal Collisions While Speeding",
subtitle = "As a share of the number of fatal collisions per billion miles, 2009") +
guides(fill = guide_legend(override.aes = list(color = "red", alpha = c(0.25, 1))))
Created on 2022-10-11 by the reprex package (v2.0.1)
Note: For some reason the transparency in the legend doesn't look the same as in the plot so I manually set the legend to alpha = 0.25 so to my eye it matches the plot. Please confirm the result on your own computer.
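The legend technique can be checked without the fivethirtyeight package. The sketch below uses a tiny made-up data frame (toy, with invented numbers) and draws the "All" layer first so the "Speeding" bars sit on top of it. That layer order is my own choice and differs from the answer's code.

```r
library(ggplot2)

# Made-up numbers standing in for bad_drivers.
toy <- data.frame(
  state         = c("A", "B", "C"),
  num_drivers   = c(18, 12, 20),
  perc_speeding = c(40, 25, 50)
)
toy$speeding <- toy$num_drivers * toy$perc_speeding / 100

p <- ggplot(toy, aes(x = state)) +
  # A constant string inside aes() creates a legend entry for the layer.
  geom_col(aes(y = num_drivers, fill = "All")) +
  geom_col(aes(y = speeding, fill = "Speeding")) +
  coord_flip() +
  # After coord_flip(), position = "right" places the value axis on top.
  scale_y_continuous(position = "right") +
  # Transparency is baked into the colour with alpha().
  scale_fill_manual(name = "Drivers",
                    values = c("All" = alpha("red", 0.5), "Speeding" = "red")) +
  labs(x = NULL, y = NULL)
p
```

Because the transparency is part of the fill colour, the legend keys use the same colours as the bars and no override.aes adjustment is needed in this version.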
Maybe this:
library(fivethirtyeight)
library(tidyverse)
mytable <- bad_drivers %>%
mutate(SpeedPerBilion = (num_drivers * perc_speeding)/100)
ggplot(data = mytable, aes(x = state, y = SpeedPerBilion, fill='red')) +
xlab("") +
ylab("") +
coord_flip() +
geom_bar(stat = "identity")+
geom_bar(data= mytable, aes(x = state, y=num_drivers), alpha=0.5,stat="identity") +
scale_y_continuous(sec.axis = dup_axis())+
theme(plot.title = element_text(face = "bold"),
axis.text.x.bottom = element_blank(),
axis.ticks.x.bottom = element_blank())+
labs(title = "Drivers Involved In Fatal Collisions While Speeding",
subtitle = "As a share of the number of fatal collisions per billion miles, 2009")
Output:

Manually change order of y axis items on complicated stacked bar chart in ggplot2

I've been stuck on an issue and can't find a solution. I've tried many suggestions on Stack Overflow and elsewhere about manually ordering a stacked bar chart, since that should be a pretty simple fix, but those suggestions don't work with the huge complicated mess of code I plucked from many places. My only issue is y-axis item ordering.
I'm making a series of stacked bar charts, and ggplot2 changes the ordering of the items on the y-axis depending on which dataframe I am trying to plot. I'm trying to make 39 of these plots and want them to all have the same ordering. I think ggplot2 only wants to plot them in ascending order of their numeric mean or something, but I'd like all of the bar charts to first display the group "Bird Advocates" and then "Cat Advocates." (This is also the order they appear in my data frame, but that ordering is lost at the coord_flip() point in plotting.)
I think that taking the data frame through so many changes is why I can't just add something simple at the end or use the reorder() function. Adding things into aes() also doesn't work, since the stacked bar chart I'm creating seems to depend on those items being exactly a certain way.
Here's one of my data frames where ggplot2 is ordering my y-axis items incorrectly, plotting "Cat Advocates" before "Bird Advocates":
Group,Strongly Opposed,Opposed,Slightly Opposed,Neutral,Slightly Support,Support,Strongly Support
Bird Advocates,0.005473026,0.010946052,0.012509773,0.058639562,0.071149335,0.31118061,0.530101642
Cat Advocates,0.04491726,0.07013396,0.03624901,0.23719464,0.09141056,0.23404255,0.28605201
And here's all the code that takes that and turns it into a plot:
library(ggplot2)
library(reshape2)
library(plotly)
#Importing data from a .csv file
data <- read.csv("data.csv", header=TRUE)
data$s.Strongly.Opposed <- 0-data$Strongly.Opposed-data$Opposed-data$Slightly.Opposed-.5*data$Neutral
data$s.Opposed <- 0-data$Opposed-data$Slightly.Opposed-.5*data$Neutral
data$s.Slightly.Opposed <- 0-data$Slightly.Opposed-.5*data$Neutral
data$s.Neutral <- 0-.5*data$Neutral
data$s.Slightly.Support <- 0+.5*data$Neutral
data$s.Support <- 0+data$Slightly.Support+.5*data$Neutral
data$s.Strongly.Support <- 0+data$Support+data$Slightly.Support+.5*data$Neutral
#to percents
data[,2:15]<-data[,2:15]*100
#melting
mdfr <- melt(data, id=c("Group"))
mdfr<-cbind(mdfr[1:14,],mdfr[15:28,3])
colnames(mdfr)<-c("Group","variable","value","start")
#remove dot in level names
mylevels<-c("Strongly Opposed","Opposed","Slightly Opposed","Neutral","Slightly Support","Support","Strongly Support")
mdfr$variable<-droplevels(mdfr$variable)
levels(mdfr$variable)<-mylevels
pal<-c("#bd7523", "#e9aa61", "#f6d1a7", "#999999", "#c8cbc0", "#65806d", "#334e3b")
ggplot(data=mdfr) +
geom_segment(aes(x = Group, y = start, xend = Group, yend = start+value, colour = variable,
text=paste("Group: ",Group,"<br>Percent: ",value,"%")), size = 5) +
geom_hline(yintercept = 0, color =c("#646464")) +
coord_flip() +
theme(legend.position="top") +
theme(legend.key.width=unit(0.5,"cm")) +
guides(col = guide_legend(ncol = 12)) + #has 7 real columns, using to adjust legend position
scale_color_manual("Response", labels = mylevels, values = pal, guide="legend") +
theme(legend.title = element_blank()) +
theme(axis.title.x = element_blank()) +
theme(axis.title.y = element_blank()) +
theme(axis.ticks = element_blank()) +
theme(axis.text.x = element_blank()) +
theme(legend.key = element_rect(fill = "white")) +
scale_y_continuous(breaks=seq(-100,100,100), limits=c(-100,100)) +
theme(panel.background = element_rect(fill = "#ffffff"),
panel.grid.major = element_line(colour = "#CBCBCB"))
The plot:
I think this works; you may need to adjust the axis limits and breaks:
library(dplyr)
mdfr <- mdfr %>%
mutate(group_n = as.integer(case_when(Group == "Bird Advocates" ~ 2,
Group == "Cat Advocates" ~ 1)))
ggplot(data=mdfr) +
geom_segment(aes(x = group_n, y = start, xend = group_n, yend = start + value, colour = variable,
text=paste("Group: ",Group,"<br>Percent: ",value,"%")), size = 5) +
scale_x_continuous(limits = c(0,3), breaks = c(1, 2), labels = c("Cat", "Bird")) +
geom_hline(yintercept = 0, color =c("#646464")) +
theme(legend.position="top") +
theme(legend.key.width=unit(0.5,"cm")) +
coord_flip() +
guides(col = guide_legend(ncol = 12)) + #has 7 real columns, using to adjust legend position
scale_color_manual("Response", labels = mylevels, values = pal, guide="legend") +
theme(legend.title = element_blank()) +
theme(axis.title.x = element_blank()) +
theme(axis.title.y = element_blank()) +
theme(axis.ticks = element_blank()) +
theme(axis.text.x = element_blank()) +
theme(legend.key = element_rect(fill = "white"))+
scale_y_continuous(breaks=seq(-100,100,100), limits=c(-100,100)) +
theme(panel.background = element_rect(fill = "#ffffff"),
panel.grid.major = element_line(colour = "#CBCBCB"))
produces this plot:
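You want to convert the 'Group' variable to a factor whose levels are in the order in which you want the bars to appear. Note that after coord_flip() the first level is drawn at the bottom, so reverse the order (for example with rev()) if the first group should be on top.
mdfr$Group <- factor(mdfr$Group, levels = c("Bird Advocates", "Cat Advocates"))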
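Since the same ordering has to hold across 39 plots, it may help to wrap the conversion in a small helper and apply it to every data frame before plotting. The sketch below uses two toy data frames and a hypothetical helper named set_group_order. It assumes the plots use coord_flip(), so the levels are stored in reverse of the desired top-to-bottom order.

```r
# Desired top-to-bottom order on the flipped axis.
top_to_bottom <- c("Bird Advocates", "Cat Advocates")

# Hypothetical helper: coord_flip() draws the first level at the bottom,
# so the factor levels are the reverse of the top-to-bottom order.
set_group_order <- function(d) {
  d$Group <- factor(d$Group, levels = rev(top_to_bottom))
  d
}

# Two toy data frames standing in for the 39 real ones.
frames <- list(
  data.frame(Group = c("Bird Advocates", "Cat Advocates"), value = c(53, 29)),
  data.frame(Group = c("Cat Advocates", "Bird Advocates"), value = c(10, 40))
)
frames <- lapply(frames, set_group_order)

# Every data frame now carries the same level order.
print(lapply(frames, function(d) levels(d$Group)))
```

Each element of frames can then be passed to the same ggplot() call, and the axis order no longer depends on the row order or the values in any one data frame.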
