I need some help with a Likert-scale bar chart that I created using ggplot2. Here is the data frame:
structure(list(Q4_ROLE = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L,
3L), levels = c("Civilian Analyst", "Military Analyst", "Operations/Admin Specialist"
), class = "factor"), Year = structure(c(1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L,
1L, 2L), levels = c("2021", "2022"), class = "factor"), Q20_A8 = structure(c(1L,
2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 2L, 3L, 4L, 5L, 1L, 2L, 3L,
4L, 5L, 3L, 4L, 5L, 3L), levels = c("1", "2", "3", "4", "5"), class = "factor"),
n = c(1L, 4L, 12L, 25L, 17L, 7L, 16L, 16L, 16L, 7L, 1L, 2L,
4L, 8L, 5L, 8L, 1L, 2L, 1L, 3L, 2L, 1L, 3L), perc = c(1.69491525423729,
6.77966101694915, 20.3389830508475, 42.3728813559322, 28.8135593220339,
11.2903225806452, 25.8064516129032, 25.8064516129032, 25.8064516129032,
11.2903225806452, 6.66666666666667, 13.3333333333333, 26.6666666666667,
53.3333333333333, 29.4117647058824, 47.0588235294118, 5.88235294117647,
11.7647058823529, 5.88235294117647, 50, 33.3333333333333,
16.6666666666667, 100), percent_answers = c(-0.0169491525423729,
-0.0677966101694915, 0.203389830508475, 0.423728813559322,
0.288135593220339, -0.112903225806452, -0.258064516129032,
0.258064516129032, 0.258064516129032, 0.112903225806452,
-0.0666666666666667, 0.133333333333333, 0.266666666666667,
0.533333333333333, -0.294117647058824, -0.470588235294118,
0.0588235294117647, 0.117647058823529, 0.0588235294117647,
0.5, 0.333333333333333, 0.166666666666667, 1), percent_answers_label = c("-2%",
"-7%", "20%", "42%", "29%", "-11%", "-26%", "26%", "26%",
"11%", "-7%", "13%", "27%", "53%", "-29%", "-47%", "6%",
"12%", "6%", "50%", "33%", "17%", "100%")), row.names = c(NA,
-23L), class = c("tbl_df", "tbl", "data.frame"))
Created on 2022-08-28 by the reprex package (v2.0.1)
I have five levels and I want them to be ordered correctly, but because the chart is divergent I need two different orderings. Using:
position_stack(reverse = TRUE)
works just fine when the plot is not divergent. I basically need Neutral, Agree and Strongly Agree to be stacked with reverse = TRUE, and Strongly Disagree and Disagree with reverse = FALSE, so that everything is in the right order on the divergent scale.
I have tried to filter inside geom_col() so that 3-5 are stacked in a different direction than 1-2, but the second call overwrites the first one, making the filtering useless.
Q20_A8 is the Answer variable:
Factor w/ 5 levels "1","2","3","4","5"
count_8 %>%
ggplot(aes(x = Year, y = percent_answers, fill = Q20_A8)) +
geom_col(count_8 = filter(count_8, Q20_A8 %in% c("3","4","5")), position = position_stack(reverse = TRUE )) +
geom_col(count_8 = filter(count_8, Q20_A8 %in% c("1","2")), aes( y = percent_answers), position = position_stack(reverse = FALSE )) +
geom_text(aes(label = percent_answers_label), size = 2.4,
position = position_stack(reverse = FALSE, vjust = 0.5),
color = "black",
fontface = "bold") +
facet_wrap(~ Q4_ROLE, nrow=3) +
coord_flip() +
theme_minimal() +
theme(legend.title = element_text(size=8),
legend.key.size = unit(0.3, 'cm'),
legend.text = element_text(size = 6),
axis.title.y = element_text(vjust = +3),
legend.position="bottom") +
scale_fill_manual(name="Response:",
values=c("#C0392B","#F5B7B1","#E5E7E9", "#85C1E9", "#2874A6"),
labels=c("Strongly Disagree", "Disagree", "Neither Agree/Disagree", "Agree", "Strongly Agree")) +
xlab("") +
ylab("") +
ggtitle("Test") +
scale_y_continuous(limits = c(-0.5,1), labels = ylabs)
Any help is appreciated! Thank you.
You should define breaks in your scale_fill_manual according to the specific order, and set the order of the factor levels in the relevant column, for example with fct_relevel from the forcats package (loaded with the tidyverse). You can also use a single geom_bar(position = "stack", stat = "identity") instead of two separate bar layers. Here is a reproducible example:
library(tidyverse)
library(scales)
count_8 %>%
group_by(Q4_ROLE, Year) %>%
mutate(Q20_A8 = fct_relevel(Q20_A8,"1","2","3","4","5")) %>%
ggplot(aes(x = Year, y = percent_answers, fill = Q20_A8)) +
geom_bar(position="stack", stat="identity") +
geom_text(aes(label = percent_answers_label), size = 2.4,
position = position_stack(reverse = FALSE, vjust = 0.5),
color = "black",
fontface = "bold") +
facet_wrap(~ Q4_ROLE, nrow=3) +
coord_flip() +
theme_minimal() +
theme(legend.title = element_text(size=8),
legend.key.size = unit(0.3, 'cm'),
legend.text = element_text(size = 6),
axis.title.y = element_text(vjust = +3),
legend.position="bottom") +
scale_fill_manual(name="Response:",
values=c("#C0392B","#F5B7B1","#E5E7E9", "#85C1E9", "#2874A6"),
breaks = c("1", "2", "5", "4", "3"),
labels=c("Strongly Disagree", "Disagree", "Neither Agree/Disagree", "Agree", "Strongly Agree")) +
xlab("") +
ylab("") +
ggtitle("Test")
Created on 2022-08-28 with reprex v2.0.2
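If you do want the two-layer approach from the question, note that a layer takes its subset through the data argument (geom_col(data = ...)), not through an argument named after the data frame. Below is a minimal sketch of that idea, not a drop-in replacement: toy, neg and pos are made-up names and values standing in for count_8, and I am assuming that ggplot2 stacks negative and positive values separately, so each side can get its own reverse setting.

```r
library(ggplot2)

# Toy data standing in for count_8 (a single bar); the values are made up.
toy <- data.frame(
  Year   = factor("2021"),
  Q20_A8 = factor(c("1", "2", "3", "4", "5")),
  percent_answers = c(-0.1, -0.2, 0.2, 0.4, 0.1)
)
toy$percent_answers_label <- paste0(round(100 * toy$percent_answers), "%")

neg <- subset(toy, Q20_A8 %in% c("1", "2"))       # negative side of the axis
pos <- subset(toy, Q20_A8 %in% c("3", "4", "5"))  # positive side of the axis

# Named colours, so the mapping does not depend on which layer is drawn first.
fill_cols <- c("1" = "#C0392B", "2" = "#F5B7B1", "3" = "#E5E7E9",
               "4" = "#85C1E9", "5" = "#2874A6")

p <- ggplot(mapping = aes(x = Year, y = percent_answers, fill = Q20_A8)) +
  # positive side: reverse = TRUE puts "3" next to zero and "5" outermost
  geom_col(data = pos, position = position_stack(reverse = TRUE)) +
  geom_text(data = pos, aes(label = percent_answers_label),
            position = position_stack(vjust = 0.5, reverse = TRUE)) +
  # negative side: the default order puts "2" next to zero and "1" outermost
  geom_col(data = neg, position = position_stack(reverse = FALSE)) +
  geom_text(data = neg, aes(label = percent_answers_label),
            position = position_stack(vjust = 0.5, reverse = FALSE)) +
  scale_fill_manual(values = fill_cols, limits = names(fill_cols)) +
  coord_flip()
```

Each geom_text layer reuses the same subset and the same reverse setting as its bars, which is what keeps the labels centred in the matching segments.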
I need some help with stat_compare_means and multiple groups.
Here is what my data look like.
> head(df_annot)
Row.names Diversity_sh Diversity_si Evenness Chao1 Location Bean Fungi Insect
1 R-B1 1.314181 0.6040213 0.3053349 91.00000 Root Bean M- NI
2 R-B2 1.323718 0.6117602 0.3075507 77.43750 Root Bean M- NI
3 R-B3 1.249950 0.5737293 0.2877545 81.50000 Root Bean M- NI
4 R-BF-1 1.177111 0.5414276 0.2693958 92.33333 Root Bean M+ NI
5 R-BF-2 1.191254 0.5252688 0.2742420 79.54545 Root Bean M+ NI
6 R-BF-3 1.397233 0.6285945 0.3179540 85.50000 Root Bean M+ NI
Here is a graph and I would like ALL comparisons labelled.
Here is some code. I know that my_comparisons is not correct, but I don't know where to start with two grouping variables. I want to compare M+/Insect to M-/Insect, M+/Insect to M+/NI, and so on: all pairwise comparisons. Any suggestions would be great. Thanks.
my_comparisons<- list( c("M+", "M-"), c("Insect", "NI"))
ggplot(df_annot,aes_string(x="Insect",y=index,fill="Fungi"))+
geom_boxplot(alpha=0.8)+
geom_point(aes(fill=Fungi),size = 3, shape = 21,position = position_jitterdodge(jitter.width = 0.02,jitter.height = 0))+
stat_compare_means(comparison=my_comparisons,label="p.format",method="wilcox.test")+
#ggtitle(df_name)+
ylab(paste(index))+
xlab("")+
# scale_x_discrete(labels= c("M+","M-","soil alone"))+
theme(plot.title = element_text(size = 18, face = "bold"))+
theme(axis.text=element_text(size=14),
axis.title=element_text(size=14)) +
theme(legend.text=element_text(size=14),
legend.title=element_text(size=14)) +
theme(strip.text.x = element_text(size = 14))
dput(df_annot)
structure(list(Row.names = structure(c("R-B1", "R-B2", "R-B3",
"R-BF-1", "R-BF-2", "R-BF-3", "R-BFi-1", "R-BFi-2", "R-Bi-1",
"R-Bi-2", "R-Bi-3"), class = "AsIs"), Diversity_sh = c(1.31418133185869,
1.32371839350534, 1.24994951615418, 1.17711111336449, 1.19125374868316,
1.39723272927515, 1.34145146126423, 1.21674449259962, 1.20721660188555,
1.17245529262564, 1.20912937911657), Diversity_si = c(0.604021268328531,
0.611760247980402, 0.573729285531772, 0.541427625516077, 0.525268755766239,
0.628594506768001, 0.597250229879166, 0.554646956896473, 0.548992316400345,
0.531291238688503, 0.583806537719818), Evenness = c(0.305334910927276,
0.307550737463383, 0.287754490536268, 0.269395848882803, 0.274241968272787,
0.317954009728278, 0.305260435164649, 0.276882141486585, 0.273949061455415,
0.269914321375221, 0.275929262855007), Chao1 = c(91, 77.4375,
81.5, 92.3333333333333, 79.5454545454545, 85.5, 87.5, 90.5454545454545,
89.3333333333333, 88.6666666666667, 88.0769230769231), Location = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Root", "Rhizospheric Soil"
), class = "factor"), Bean = structure(c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L), .Label = "Bean", class = "factor"),
Fungi = structure(c(2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L), .Label = c("M+", "M-"), class = "factor"), Insect = structure(c(2L,
2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L), .Label = c("Insect",
"NI"), class = "factor")), row.names = c(NA, -11L), class = "data.frame")
facet_wrap() might help you as discussed here
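library(ggplot2)
library(ggpubr)  # provides stat_compare_means()
# Use bare column names in aes() and facet_wrap(); df_annot$... bypasses the per-panel data
ggplot(df_annot, aes(x = Insect, y = Evenness)) +
facet_wrap(~ Fungi) +
geom_boxplot(alpha = 0.8) +
geom_point() +
stat_compare_means(comparisons = list(c("Insect", "NI")), label = "p.format", method = "wilcox.test")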
EDIT
OK, here is a (not too elegant) solution without faceting.
Create a new variable containing Insect info and Fungi status:
df_annot$var <- paste(df_annot$Insect,df_annot$Fungi, sep = "_" )
Then build the contrasts
my_comparisons <- rev(list(c("Insect_M-","Insect_M+"),c("NI_M-","Insect_M-"),c("NI_M+","Insect_M-"),
c("Insect_M+", "NI_M-"), c("Insect_M+", "NI_M+"), c("NI_M-","NI_M+")))
and plot your graph
ggplot(df_annot,aes_string(x="var",y="Evenness",fill="Fungi"))+
geom_boxplot(alpha=0.8)+
geom_point(aes(fill=Fungi),size = 3, shape = 21,position = position_jitterdodge(jitter.width = 0.02,jitter.height = 0))+
stat_compare_means(comparisons = my_comparisons, label = "p.format", method = "wilcox.test") +
#ggtitle(df_name)+
ylab(paste("Evenness"))+
xlab("")+
# scale_x_discrete(labels= c("M+","M-","soil alone"))+
theme(plot.title = element_text(size = 18, face = "bold"))+
theme(axis.text=element_text(size=14),
axis.title=element_text(size=14)) +
theme(legend.text=element_text(size=14),
legend.title=element_text(size=14)) +
theme(strip.text.x = element_text(size = 14))
You might want to choose better names and so on, but this could be what you are looking for.
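Instead of typing the six pairs by hand, the list of contrasts can be generated. This is a small sketch using base R's combn(); the df_annot below is a made-up four-row stand-in that only carries the two grouping columns, and var is the combined column introduced above.

```r
# Toy stand-in for df_annot: only the two grouping columns are needed here.
df_annot <- data.frame(
  Insect = factor(c("NI", "NI", "Insect", "Insect")),
  Fungi  = factor(c("M-", "M+", "M+", "M-"))
)

# One combined group per Insect/Fungi pair, e.g. "Insect_M+"
df_annot$var <- paste(df_annot$Insect, df_annot$Fungi, sep = "_")

# combn() enumerates every unordered pair of groups;
# simplify = FALSE returns a list of length-2 character vectors
my_comparisons <- combn(sort(unique(df_annot$var)), 2, simplify = FALSE)

length(my_comparisons)  # 6 pairs for 4 groups
```

The resulting list has the same shape as the hand-written my_comparisons, so it should be usable in the comparisons argument of stat_compare_means in the same way.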
I have a bar chart in which I also want to include some lines that show the percentage difference between the bars, as in the following figure:
The lines in the figure are drawn just to make my point of what I ideally want.
Can someone help me with this?
Here is the dataframe to replicate the figure:
structure(list(shares = c(0.39, 3.04, 9.32, 22.29, 64.97, 0.01,
0.11, 5.83, 21.4, 72.64), quantile = structure(c(4L, 1L, 2L,
3L, 5L, 4L, 1L, 2L, 3L, 5L), .Label = c("2nd Quantile", "3rd Quantile",
"4nd Quantile", "Poorest 20%", "Richest 20%"), class = "factor"),
case = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L
), .Label = c("No Debt", "With Debt"), class = "factor")), row.names = c(NA,
-10L), class = "data.frame")
And here is my code used to make the bar plot:
ggplot(df_cum, aes(fill = case , quantile, shares)) + geom_bar(position =
"dodge", stat = "identity") +
scale_x_discrete(limits = c(
"Poorest 20%",
"2nd Quantile",
"3rd Quantile",
"4nd Quantile",
"Richest 20%"
)) +
theme_minimal()
Your data unchanged:
library(tidyverse)
df_cum<-structure(list(shares = c(0.39, 3.04, 9.32, 22.29, 64.97, 0.01,0.11, 5.83, 21.4, 72.64),
quantile = structure(c(4L, 1L, 2L, 3L, 5L, 4L, 1L, 2L, 3L, 5L),
.Label = c("2nd Quantile", "3rd Quantile", "4nd Quantile", "Poorest 20%", "Richest 20%"), class = "factor"),
case = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L), .Label = c("No Debt", "With Debt"), class = "factor")), row.names = c(NA, -10L), class = "data.frame")
Your graph unchanged:
p <- ggplot(df_cum, aes(fill = case , quantile, shares)) +
geom_bar(position = "dodge", stat = "identity") +
scale_x_discrete(limits = c("Poorest 20%", "2nd Quantile", "3rd Quantile", "4nd Quantile", "Richest 20%")) +
theme_minimal()
I used the horizontal error bar to do the trick. Here is my solution:
y = rep(c(3, 5, 13, 25, 75),2)
x = rep(c(1:5), 2)
label = rep(c("-3%", "-5%", "-2%", "-1%", "10%"), 2)
p1 <- p + geom_text(x=x, y=y+2, label=label)
p1 + geom_errorbarh(aes(xmax = (x + 0.3), xmin = (x - 0.3), y = y), height = 0.5)
Now, you get:
You can also adjust both height and width if you like.
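The labels and heights above are typed in by hand. If you would rather derive them from the data, here is one possible sketch in base R. It assumes that "percentage difference" means the gap in percentage points between the With Debt and No Debt shares; diffs, quantile_order and the "pp" label format are my own names and choices, not something from the question.

```r
# Same values as df_cum in the question, rebuilt as a plain data frame.
quantile_order <- c("Poorest 20%", "2nd Quantile", "3rd Quantile",
                    "4nd Quantile", "Richest 20%")
df_cum <- data.frame(
  shares   = c(0.39, 3.04, 9.32, 22.29, 64.97, 0.01, 0.11, 5.83, 21.4, 72.64),
  quantile = rep(quantile_order, times = 2),
  case     = rep(c("No Debt", "With Debt"), each = 5)
)

# Put the two cases side by side, one row per quantile.
diffs <- merge(df_cum[df_cum$case == "No Debt", ],
               df_cum[df_cum$case == "With Debt", ],
               by = "quantile", suffixes = c("_no_debt", "_with_debt"))

# Assumption: the difference is expressed in percentage points.
diffs$diff  <- diffs$shares_with_debt - diffs$shares_no_debt
diffs$label <- sprintf("%+.1f pp", diffs$diff)

# Place each annotation slightly above the taller of the two bars.
diffs$y <- pmax(diffs$shares_no_debt, diffs$shares_with_debt) + 2

# Restore the plotting order of the quantiles.
diffs <- diffs[order(match(diffs$quantile, quantile_order)), ]
```

With this, something like geom_text(data = diffs, aes(x = quantile, y = y, label = label), inherit.aes = FALSE) could replace the hard-coded x, y and label vectors; inherit.aes = FALSE is there because diffs has no case column for the global fill mapping.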
I tried to make the title self-explanatory, but here goes - data first:
dtf <- structure(list(variable = structure(c(1L, 1L, 2L, 2L, 3L, 3L,
4L, 4L, 5L, 5L), .Label = c("vma", "vla", "ia", "fma", "fla"), class = "factor"),
ustanova = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L), .Label = c("srednja škola", "fakultet"), class = "factor"),
`(all)` = c(42.9542857142857, 38.7803203661327, 37.8996138996139,
33.7672811059908, 29.591439688716, 26.1890660592255, 27.9557692307692,
23.9426605504587, 33.2200772200772, 26.9493087557604)), .Names = c("variable",
"ustanova", "(all)"), row.names = c(NA, 10L), class = c("cast_df",
"data.frame"), idvars = c("variable", "ustanova"), rdimnames = list(
structure(list(variable = structure(c(1L, 1L, 2L, 2L, 3L,
3L, 4L, 4L, 5L, 5L), .Label = c("vma", "vla", "ia", "fma",
"fla"), class = "factor"), ustanova = structure(c(1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("srednja škola",
"fakultet"), class = "factor")), .Names = c("variable", "ustanova"
), row.names = c("vma_srednja škola", "vma_fakultet", "vla_srednja škola",
"vla_fakultet", "ia_srednja škola", "ia_fakultet", "fma_srednja škola",
"fma_fakultet", "fla_srednja škola", "fla_fakultet"), class = "data.frame"),
structure(list(value = structure(1L, .Label = "(all)", class = "factor")), .Names = "value", row.names = "(all)", class = "data.frame")))
And I'd like to create a dodged barplot, do the coord_flip and put some text labels inside the bars:
ggplot(bar) + geom_bar(aes(variable, `(all)`, fill = ustanova), position = "dodge") +
geom_text(aes(variable, `(all)`, label = sprintf("%2.1f", `(all)`)), position = "dodge") +
coord_flip()
you can see output here.
I reckon I'm asking for something trivial. I want the text labels to "follow" stacked bars. Labels are placed correctly on the y-axis, but how to position them correctly on x-axis?
Is this what you want?
library(ggplot2)
ggplot(bar) +
geom_col(aes(variable, `(all)`, fill = ustanova), position = "dodge") +
geom_text(aes(variable, `(all)`, label = sprintf("%2.1f", `(all)`), group = ustanova),
position = position_dodge(width = .9)) +
coord_flip()
The key is to use position = position_dodge(width = .9) (where .9 is the default width of the bars) instead of position = "dodge", which is just a shortcut without any parameters. Additionally, you have to set the group = ustanova aesthetic in geom_text so that the labels are dodged by ustanova. (A second option would be to make fill = ustanova a global aesthetic via ggplot(bar, aes(fill = ustanova)).)
As of ggplot2 2.0.0, ?geom_text contains several examples of how to position geom_text on dodged or stacked bars (the code chunk named "# Aligning labels and bars"). The Q&A What is the width argument in position_dodge? provides a more thorough description of the topic.
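For completeness, here is a rough sketch of the second option (fill as a global aesthetic). The data frame is a shortened, rounded stand-in for the question's data, with a made-up value column in place of `(all)`, and hjust = 1.1 is just one way to pull the text inside the bar ends.

```r
library(ggplot2)

# Toy stand-in for the question's data; `value` replaces the `(all)` column.
bar <- data.frame(
  variable = factor(rep(c("vma", "vla", "ia"), each = 2),
                    levels = c("vma", "vla", "ia")),
  ustanova = factor(rep(c("srednja škola", "fakultet"), times = 3),
                    levels = c("srednja škola", "fakultet")),
  value    = c(43.0, 38.8, 37.9, 33.8, 29.6, 26.2)
)

# One position object shared by bars and labels, so both use the same width.
dodge <- position_dodge(width = 0.9)

# fill is mapped globally, so geom_text inherits the grouping by ustanova
# and no explicit group aesthetic is needed.
p <- ggplot(bar, aes(variable, value, fill = ustanova)) +
  geom_col(position = dodge) +
  geom_text(aes(label = sprintf("%2.1f", value)),
            position = dodge, hjust = 1.1) +  # hjust > 1 moves text inside the bar end
  coord_flip()
```

Sharing a single dodge object is a small convenience: if you later change the bar width, the labels follow without a second edit.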