I am attempting to create a diverging stacked bar like here, and am experiencing a similar issue to this SO question. My approach is slightly different, though: I am managing it all via a single dataset rather than two, and my colours are independent of my data.
Reprex as follows:
library(tidyverse)
library(RColorBrewer)
x <- tribble(
~response, ~count,
0, -27,
1, -9,
2, -41,
3, -43,
4, -58,
5, -120,
5, 120,
6, 233,
7, 379,
8, 388,
9, 145,
10, 61
) %>%
mutate(response = factor(response))
ggplot(x, aes(x = 1, y = count, fill = response)) +
geom_col() +
scale_fill_brewer(palette = "RdBu") +
coord_flip()
This gives me an image like this:
The issue is the ordering of the stacked data on the right-hand side of zero, where the stacking appears to be in descending order. Any thoughts on how to fix this would be greatly appreciated (expected ordering would be 0-10, not 0-5, 10-5).
A tough one! I played with the ordering, and it seems that geom_bar and geom_col don't like it when you combine positive and negative values in the same stack. So I split your data into positive and negative values inside the data frame, generated a color for every response value, and used two geoms, one for the negative and one for the positive values:
library(tidyverse)
library(RColorBrewer)
x <- tribble(
~response, ~count,
0, -27,
1, -9,
2, -41,
3, -43,
4, -58,
5, -120,
5, 120,
6, 233,
7, 379,
8, 388,
9, 145,
10, 61
) %>%
# Get absolute values and add a dummy to distinguish positive and negative values
mutate(subzero = count < 0,
count = abs(count))
# Generate variable with colors from ColorBrewer for every response level (ugly but works)
colors <- brewer.pal(length(unique(x$response)),"RdBu")
x$colors <- NA
for (i in 1:nrow(x)){
x$colors[i] <- colors[x$response[i]+1]
}
ggplot() +
geom_bar(data = x[x$subzero==T,], aes(x = "", y = -count, fill = reorder(colors, response)), position="stack", stat="identity") +
geom_bar(data = x[x$subzero==F,], aes(x = "", y = count, fill = reorder(colors, -response)), position="stack", stat="identity") +
geom_hline(yintercept = 0, color =c("black")) +
scale_fill_identity("Response", labels = unique(x$response), breaks=unique(x$colors), guide="legend") +
coord_flip() +
labs(y="",x="") +
theme(legend.position = "bottom", legend.direction = "horizontal") +
scale_y_continuous(breaks=seq(-1400,1400,200), limits=c(-1400,1400))
UPD: made the y-scale symmetric so it looks clearer.
Although not intuitive (for me), use:
ggplot(x, aes(x = 1, y = order(count), fill = response)) +
geom_col() +
scale_fill_brewer(palette = "RdBu",direction=1) +
coord_flip()
It takes into account the ordering based on response (rather than order(response))
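To see what `y = order(count)` actually hands to ggplot, it may help to look at `order()` on its own. The sketch below uses only base R and the `count` column from the question; `idx` is a name introduced here for illustration. `order()` returns the row positions that would sort the vector, so the plotted bar lengths are those positions rather than the original counts.

```r
# Counts from the question's reprex
count <- c(-27, -9, -41, -43, -58, -120, 120, 233, 379, 388, 145, 61)

# order() returns the indices that sort `count`, not the counts themselves
idx <- order(count)  # `idx` is an illustrative name, not from the original post
print(idx)

# Indexing by those positions reproduces the sorted vector
print(count[idx])

# Every index is positive, so all bars end up on the positive side of zero
print(range(idx))
```

If the bar lengths need to stay equal to the counts, this is worth checking before adopting the approach.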
You can use position_stack(reverse=TRUE):
ggplot(x, aes(x = 1, y = count, fill = response)) +
geom_col(position = position_stack(reverse=TRUE)) +
scale_fill_brewer(palette = "RdBu") +
coord_flip()
Related
I am creating time series plot for the following data:
# Creating data set
year <- c(rep(2018,4), rep(2019,4), rep(2020,4))
month_1 <- c(2, 3, 7, 8, 6, 10, 11, 12, 5, 7, 8, 12)
avg_dlt_calc <- c(10, 20, 11, 21, 13, 7, 10, 15, 9, 14, 16, 32)
data_to_plot <- data.frame(cbind(year,month_1,avg_dlt_calc ))
ggplot(data_to_plot, aes(x = month_1)) +
geom_line(aes(y = avg_dlt_calc), size = 0.5) +
scale_x_discrete(name = "months", limits = data_to_plot$month_1) +
facet_grid(~year, scales = "free")
I am ok with the plot itself, but x-axis labels are messed up:
How can I fix it?
It is ok not to have labels for missing months (for example, for 2018 it will be only 2,3,7,8 - so it will be clear, that there is data only for those months).
A remedy is to coerce month_1 to a factor and group the observations by year like so:
ggplot(data_to_plot, aes(x = as.factor(month_1), y = avg_dlt_calc, group = year)) +
geom_line(size = 0.5) +
scale_x_discrete(name = "months") +
facet_grid(~year, scales = "free")
Note that I've moved y = avg_dlt_calc inside aes() in ggplot() which is more idiomatic than your approach. You may use the breaks argument in scale_x_discrete() to set breaks manually, see ?scale_x_discrete.
I think a fixed x-axis and adding points is more suitable for conveying the information that data is only available for some periods:
ggplot(data_to_plot, aes(x = as.factor(month_1), y = avg_dlt_calc, group = year)) +
geom_line(size = 0.5) +
geom_point() +
scale_x_discrete(name = "months") +
facet_grid(~year, scales = "free_y")
This is a modified version of this question.
I need to create time series plot for 2 lines for the following data:
# Creating data set
year <- c(rep(2018,4), rep(2019,4), rep(2020,4))
month_1 <- c(2, 3, 7, 8, 6, 10, 11, 12, 5, 7, 8, 12)
avg_dlt_calc <- c(10, 20, 11, 21, 13, 7, 10, 15, 9, 14, 16, 32)
avg_dlt_standard <- c(rep(9,12))
data_to_plot <- data.frame(cbind(year,month_1,avg_dlt_calc,avg_dlt_standard ))
data_to_plot$month_1 <- factor(data_to_plot$month_1, levels=unique(data_to_plot$month_1))
ggplot(data_to_plot,aes(x = as.factor(month_1))) +
geom_line(aes(y = avg_dlt_calc, group = year, colour = "DLT Calculated"), size = 0.5) +
geom_line(aes(y = avg_dlt_standard, group = year, colour = "DLT standard"), size = 0.5) +
geom_point(aes(y = avg_dlt_calc, colour = "DLT Calculated")) +
scale_x_discrete(name = "months", limits = data_to_plot$month_1) +
facet_grid(~year, scales = "free")+
scale_color_manual(name="",
labels = c("DLT Calculated",
"DLT standard"),
values = c( "blue",
"red")) +
theme(legend.position="top",
legend.text = element_text(size = 8))
But x-axis looks wrong:
If I plot the data without this line:
data_to_plot$month_1 <- factor(data_to_plot$month_1, levels=unique(data_to_plot$month_1))
Then x-axis will still be messy:
I am setting limits for x-axis, but looks like it is not working.
How can I fix it?
I've skipped some lines and features of your plot, but in essence, this is what needs to be changed:
ggplot(data_to_plot, aes(x=month_1))+ # no as.factor
geom_point(aes(y=avg_dlt_calc)) +
geom_line(aes(y=avg_dlt_calc)) +
geom_line(aes(y=avg_dlt_standard), colour='red') +
scale_x_continuous(breaks=1:12, limits=c(1,12)) + # do *not* use scale_x_discrete,
# your x-axis is *continuous*; use breaks-argument to set the ticks.
# note, limits should only have 2 values - upper and lower limit.
facet_grid(~year)
In your code, you used limits = data_to_plot$month_1, but ggplot2 only used the first two elements of month_1; it did not interpret the vector as a set of acceptable values.
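When month_1 is kept as a factor, a discrete axis follows the factor's level order, which is one way to inspect why the axis in the question looks scrambled. As a small base-R check (the names `by_appearance` and `by_value` are introduced here for illustration), `levels = unique(...)` keeps the months in order of first appearance, whereas calling `factor()` on the numeric vector sorts the levels numerically:

```r
# Months from the question's data set
month_1 <- c(2, 3, 7, 8, 6, 10, 11, 12, 5, 7, 8, 12)

# Levels in order of first appearance (what the question's factor() call does)
by_appearance <- factor(month_1, levels = unique(month_1))
print(levels(by_appearance))

# Levels sorted by numeric value (default behaviour for a numeric input)
by_value <- factor(month_1)
print(levels(by_value))
```

With the first variant, month 5 comes last and month 6 sits after month 8, so the axis is not in calendar order.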
I am creating a grouped bar chart in ggplot2, where I have the x-axis as direction of gaze, y-axis percentage of time, and grouped by condition (reliability of robot). I have created a reproducible example of the dataset below.
install.packages("tidyverse")
library(tidyverse)
library(ggplot2)
library(reshape2)
robot_reliability <- c("reliable", "reliable", "reliable", "reliable", "unreliable", "unreliable", "unreliable", "unreliable")
percent_robot <- c(5, 10, 15, 20, 25, 30, 35, 40)
percent_game <- c(20, 30, 40, 50, 15, 25, 35, 45)
percent_others <- c(6, 8, 10, 12, 11, 9, 7, 5)
data <- data.frame(robot_reliability, percent_robot, percent_game, percent_others)
gg <- melt(data, id = "robot_reliability")
ggplot(gg, aes(x = reorder(variable, -value), y = value, fill = factor(robot_reliability))) +
stat_summary(fun = mean, geom = "bar", position = position_dodge(1)) +
stat_summary(fun.min = min, fun.max = max, geom = "errorbar",
colour="grey40", position=position_dodge(1), width=.2) +
scale_fill_discrete(name = "Robot Reliability", labels = c("Reliable", "Unreliable")) +
xlab("Direction of Gaze") +
ylab("Percentage of Overall Interaction Time") +
ggtitle("Percentage of Time Spent Gazing") +
scale_x_discrete(labels = c("Game", "Robot","Others"))
I have ordered my graph from high-low values on the y-axis (percentage of overall interaction time) using the reorder function
ggplot(gg, aes(x = reorder(variable, -value), y = value, fill = factor(robot_reliability)))
Later on, I have relabelled the axis using scale_x_discrete:
scale_x_discrete(labels = c("Game", "Robot", "Others"))
However, this appears to fix the labels to those positions. For example, if you remove the '-' from '-value' in reorder, the bar graph rearranges to go from low to high, but the labels on the x-axis stay in the same positions, meaning the labels are incorrectly matched to the data. Is there a way to combine the labels on the x-axis with the reorder function so that they are permanently attached to the correct data columns?
There are two options to achieve this. First, use a named vector to map the categories of your variable to labels, as I did in the code below. Second, simply rename the categories in your df or add a column with the labels. Either way you will get the correct labels automatically.
library(tidyverse)
library(ggplot2)
library(reshape2)
robot_reliability <- c("reliable", "reliable", "reliable", "reliable", "unreliable", "unreliable", "unreliable", "unreliable")
percent_robot <- c(5, 10, 15, 20, 25, 30, 35, 40)
percent_game <- c(20, 30, 40, 50, 15, 25, 35, 45)
percent_others <- c(6, 8, 10, 12, 11, 9, 7, 5)
data <- data.frame(robot_reliability, percent_robot, percent_game, percent_others)
gg <- melt(data, id = "robot_reliability")
ggplot(gg, aes(x = reorder(variable, value), y = value, fill = factor(robot_reliability))) +
stat_summary(fun = mean, geom = "bar", position = position_dodge(1)) +
stat_summary(fun.min = min, fun.max = max, geom = "errorbar",
colour="grey40", position=position_dodge(1), width=.2) +
scale_fill_discrete(name = "Robot Reliability", labels = c("Reliable", "Unreliable")) +
xlab("Direction of Gaze") +
ylab("Percentage of Overall Interaction Time") +
ggtitle("Percentage of Time Spent Gazing") +
scale_x_discrete(labels = c(percent_game = "Game", percent_robot = "Robot", percent_others = "Others"))
Created on 2020-04-24 by the reprex package (v0.3.0)
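The reason the named vector stays attached to the right bars is that the lookup happens by name, not by position. A minimal base-R sketch of that idea, using the same values as the example above (the names `axis_labels`, `ascending` and `descending` are introduced here for illustration, and the long-format vectors are built by hand instead of with melt()):

```r
# Long-format data equivalent to the melted example, built without reshape2
variable <- factor(rep(c("percent_robot", "percent_game", "percent_others"), each = 8))
value <- c(5, 10, 15, 20, 25, 30, 35, 40,
           20, 30, 40, 50, 15, 25, 35, 45,
           6, 8, 10, 12, 11, 9, 7, 5)

# Named vector: names are the factor levels, values are the display labels
axis_labels <- c(percent_game = "Game", percent_robot = "Robot", percent_others = "Others")

# reorder() sorts the levels by the mean of `value` within each level
ascending  <- levels(reorder(variable, value))
descending <- levels(reorder(variable, -value))

# Looking labels up by level name gives the right label in either ordering
print(unname(axis_labels[ascending]))
print(unname(axis_labels[descending]))
```

An unnamed vector such as c("Game", "Robot", "Others") is matched by position instead, which is why it goes out of sync when the ordering changes.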
I have created a graph using this code:
library(reshape2) # provides melt()
library(ggplot2)
df.1 <- data.frame(
Month = c("Dec-17", "Jan-18", "Feb-18", "Mar-18", "Apr-18", "May-18"),
Total_1 = c(25, 14, 8, 16, 137, 170),
Total_2 = c(3, 2, 3, 2, 18, 27),
Total_3 = c(5, 4, 3, 2, 16, 54)
)
df.1 <- melt(df.1,id.vars = "Month")
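#reorder the month column so it isn't alphabetical
#(explicit levels also work when Month is a character column rather than a factor)
df.1$Month <- factor(df.1$Month, levels = c("Dec-17", "Jan-18", "Feb-18", "Mar-18", "Apr-18", "May-18"))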
#partition my data into the 2 different graphs I need
df.1.1 <- df.1[7:18,]
df.1.2 <- df.1[1:6,]
ggplot(data = df.1.1, aes(x = Month, y = value)) +
geom_bar(aes(fill = variable), position = position_dodge(),stat = 'identity') +
geom_line(data = df.1.2, aes(x=Month, y=value, group=1), size =1.25, color = "#380B61") +
theme(axis.title.x=element_blank(), axis.title.y = element_blank(), legend.position="bottom", legend.direction="horizontal")
Which created this graph:
Example Graph
As you can see, only the bar chart is showing in the legend. How can I get the line part (Total_1) to show in the legend as well?
EDIT: To be clear I want the finished chart to look as close to this as possible:
Example Graph
Using this dummy data.frame
ts <- data.frame(x=1:3, y=c("blue", "white", "white"), z=c("one", "one", "two"))
I try and plot with category "blue" on top.
ggplot(ts, aes(z, x, fill=factor(y, levels=c("blue","white" )))) + geom_bar(stat = "identity")
gives me "white" on top. and
ggplot(ts, aes(z, x, fill=factor(y, levels=c("white", "blue")))) + geom_bar(stat = "identity")
reverses the colors, but still gives me "white" on top. How can I get "blue" on top?
For what it is worth, in ggplot2 version 2.2.1 the order of the stack is no longer determined by the row order in the data.frame. Instead, it matches the order of the legend as determined by the order of levels in the factor.
d <- data.frame(
y=c(0.1, 0.2, 0.7),
cat = factor(c('No', 'Yes', 'NA'), levels = c('NA', 'Yes', 'No')))
# Original order
p1 <- ggplot(d, aes(x=1, y=y, fill=cat)) +
geom_bar(stat='identity')
# Change order of rows
p2 <- ggplot(d[c(2, 3, 1), ], aes(x=1, y=y, fill=cat)) +
geom_bar(stat='identity')
# Change order of levels
d$cat2 <- relevel(d$cat, 'Yes')
p3 <- ggplot(d, aes(x=1, y=y, fill=cat2)) +
geom_bar(stat='identity')
library(gridExtra) # provides grid.arrange()
grid.arrange(p1, p2, p3, ncol=3)
It results in the below plot:
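The level order that drives the stacking can be inspected without plotting. A short base-R sketch using the same factor as above (the variable names `cat_f` and `cat_f2` are introduced here to avoid masking base R's cat() function):

```r
# Same factor as in the example: explicit level order NA, Yes, No
cat_f <- factor(c("No", "Yes", "NA"), levels = c("NA", "Yes", "No"))
print(levels(cat_f))

# relevel() moves the chosen level to the front and keeps the rest in order
cat_f2 <- relevel(cat_f, "Yes")
print(levels(cat_f2))

# Reordering the rows does not touch the levels
print(levels(cat_f[c(2, 3, 1)]))
```

This matches the observation that changing the row order (p2) leaves the stack unchanged while changing the levels (p3) does not.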
I've struggled with the same issue before. It appears that ggplot stacks the bars based on their appearance in the dataframe. So the solution to your problem is to sort your data by the fill factor in the reverse order you want it to appear in the legend: bottom item on top of the dataframe, and top item on bottom:
ggplot(ts[order(ts$y, decreasing = T),],
aes(z, x, fill=factor(y, levels=c("blue","white" )))) +
geom_bar(stat = "identity")
Edit: More illustration
Using sample data, I created three plots with different orderings of the dataframe; I thought that more fill variables would make things a bit clearer.
set.seed(123)
library(gridExtra)
df <- data.frame(x=rep(c(1,2),each=5),
fill_var=rep(LETTERS[1:5], 2),
y=1)
#original order
p1 <- ggplot(df, aes(x=x,y=y,fill=fill_var))+
geom_bar(stat="identity") + labs(title="Original dataframe")
#random order
p2 <- ggplot(df[sample(1:10),],aes(x=x,y=y,fill=fill_var))+
geom_bar(stat="identity") + labs(title="Random order")
#legend checks out, sequence weird
#reverse order
p3 <- ggplot(df[order(df$fill_var,decreasing=T),],
aes(x=x,y=y,fill=fill_var))+
geom_bar(stat="identity") + labs(title="Reverse sort by fill")
plots <- list(p1,p2,p3)
do.call(grid.arrange,plots)
Use the group aesthetic in the ggplot() call. This ensures that all layers are stacked in the same way.
series <- data.frame(
time = c(rep(1, 4),rep(2, 4), rep(3, 4), rep(4, 4)),
type = rep(c('a', 'b', 'c', 'd'), 4),
value = rpois(16, 10)
)
ggplot(series, aes(time, value, group = type)) +
geom_col(aes(fill = type)) +
geom_text(aes(label = type), position = "stack")
Messing with your data in order to make a graph look nice seems like a bad idea. Here's an alternative that works for me when using position_fill():
ggplot(data, aes(x, fill = fill)) + geom_bar(position = position_fill(reverse = TRUE))
The reverse = TRUE argument flips the order of the stacked bars. This works in position_stack also.
I had exactly the same problem today. You can get blue on top by using order=-as.numeric():
ggplot(ts,
aes(z, x, fill=factor(y, levels=c("blue","white")), order=-as.numeric(y))) +
geom_bar(stat = "identity")
I had a similar issue and got around it by changing the levels of the factor. Thought I'd share the code:
library(reshape2)
library(ggplot2)
group <- c(
"1",
"2-4",
"5-9",
"10-14",
"15-19",
"20-24",
"25-29",
"30-34",
"35-39",
"40-44",
"45-49"
)
xx <- factor(group, levels(factor(group))[c(1, 4, 11, 2, 3, 5:10)])
method.1 <- c(36, 14, 8, 8, 18, 1, 46, 30, 62, 34, 34)
method.2 <- c(21, 37, 45, 42, 68, 41, 16, 81, 51, 62, 14)
method.3 <- c(37, 46, 18, 9, 16, 79, 46, 45, 70, 42, 28)
elisa.neg <- c(12, 17, 18, 6, 19, 14, 13, 13, 7, 4, 1)
elisa.eq <- c(3, 6, 3, 14, 1, 4, 11, 13, 5, 3, 2)
test <- data.frame(person = xx,
"Mixture Model" = method.1,
"Censoring" = method.3,
"ELISA neg" = elisa.neg,
"ELISA eqiv" = elisa.eq)
melted <- melt(test, "person")
melted$cat <- ifelse(melted$variable == "Mixture.Model", "1",
ifelse(melted$variable == "Censoring", "2", "3"))
melted$variable = factor(melted$variable, levels = levels(melted$variable)[c(1, 2, 4,3 )]) ## This did the trick of changing the order
ggplot(melted, aes(x = cat, y = value, fill = variable)) +
geom_bar(stat = 'identity') + facet_wrap(~ person) +
theme(axis.ticks.x=element_blank(),
axis.text.x=element_blank()) +
labs(title = "My Title",
y = "Per cent", x = "Age Group", fill = "")
(Sorry, this is my data, I didn't reproduce using the data from the original post, hope it's ok!)
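As a possible simplification of the level reordering for the age groups: since `group` is already written in the intended order, it can be passed directly as the levels instead of indexing into the alphabetically sorted ones. A minimal sketch in base R, assuming the same `group` vector as above:

```r
# Age groups, already listed in the intended display order
group <- c("1", "2-4", "5-9", "10-14", "15-19", "20-24",
           "25-29", "30-34", "35-39", "40-44", "45-49")

# Use the vector itself as the level order; no index juggling needed
xx <- factor(group, levels = group)
print(levels(xx))
```

This avoids relying on how the strings happen to sort, which can vary between locales.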