I am trying to use ggsignif to display significance stars on top of paired bar graphs using facet_wrap. However, I can't find a way to display one significance bar per facet. Here is what I mean:
dat <- data.frame(Group = c("S1", "S1", "S2", "S2"),
Sub = c("A", "B", "A", "B"),
Value = c(3,5,7,8))
ggplot(dat, aes(Group, Value)) +
geom_bar(aes(fill = Sub), stat="identity", position="dodge", width=.5) +
geom_signif(y_position=c(5.3, 8.3), xmin=c(0.8, 1.8), xmax=c(1.2, 2.2),
annotation=c("**", "NS"), tip_length=0) +
scale_fill_manual(values = c("grey80", "grey20")) +
facet_grid(~ Group, scales = "free")
Is there a way of making sure that each facet has its individual significance label?
The main problem, it seems to me, is that the geom_signif layer doesn't know which panel each annotation belongs to, since it has no data argument provided.
I'm not that familiar with the package, but the documentation seems to suggest that manual = TRUE is recommended for plotting it in different facets. Doing that and making some adjustments for the errors that were thrown, I got the following to work:
ggplot(dat, aes(Group, Value)) +
geom_bar(aes(fill = Sub), stat="identity", position="dodge", width=.5) +
geom_signif(data = data.frame(Group = c("S1","S2")),
aes(y_position=c(5.3, 8.3), xmin=c(0.8, 0.8), xmax=c(1.2, 1.2),
annotations=c("**", "NS")), tip_length=0, manual = TRUE) +
scale_fill_manual(values = c("grey80", "grey20")) +
facet_grid(~ Group, scales = "free")
The key seemed to be to provide a data argument from which the facetting code could deduce what bit goes in what panel.
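A tidier variant of the same idea, sketched under the assumption that ggsignif's manual mode reads the bracket positions from aesthetics mapped to columns of the annotation data frame (the pattern shown in the package documentation). The data frame name annot and its column names start, end, y and label are illustrative, not part of the package:

```r
library(ggplot2)
library(ggsignif)

dat <- data.frame(Group = c("S1", "S1", "S2", "S2"),
                  Sub   = c("A", "B", "A", "B"),
                  Value = c(3, 5, 7, 8))

# One row per bracket; the Group column tells the facet which panel it belongs to.
annot <- data.frame(Group = c("S1", "S2"),
                    start = c(0.8, 0.8),
                    end   = c(1.2, 1.2),
                    y     = c(5.3, 8.3),
                    label = c("**", "NS"))

p <- ggplot(dat, aes(Group, Value)) +
  geom_bar(aes(fill = Sub), stat = "identity", position = "dodge", width = 0.5) +
  geom_signif(data = annot,
              aes(xmin = start, xmax = end, annotations = label, y_position = y),
              tip_length = 0, manual = TRUE) +
  scale_fill_manual(values = c("grey80", "grey20")) +
  facet_grid(~ Group, scales = "free")
```

Printing p draws the plot; ggplot2 may warn about unknown aesthetics in manual mode. Keeping the annotations in a data frame means a new facet only needs a new row.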
Have you considered using ggpubr and stat_compare_means?
https://rpkgs.datanovia.com/ggpubr/reference/stat_compare_means.html
Since your example contains only one observation per bar, it does not work, but if you include multiple observations you can get what you want.
Rewrite the test data:
library(dplyr) # provides %>%
dat <- data.frame(A_S1 = sample(rnorm(20, 3, 1)),
B_S1 = sample(rnorm(20, 5, 1)),
A_S2 = sample(rnorm(20, 7, 1)),
B_S2 = sample(rnorm(20, 8, 1))) %>%
tidyr::gather("G", "value") %>%
tidyr::separate("G", c("Sub", "Group"))
Plot the data using the ggpubr package:
library(ggpubr)
ggerrorplot(dat, x = "Sub", y = "value",
facet.by = "Group",
error.plot = "pointrange") +
stat_compare_means(aes(label = ..p.signif..),
method = "t.test", ref.group = "A")
I have the following geom_bar dodged plot and think the single bars for Ages 8, 17, 26 and 27 would look better centralized rather than off to the left. I am not sure what to add to the script to achieve this. Any assistance would be greatly appreciated.
This is the script:
ggplot(data = combo1, aes(x = Age_Year, fill = Tactic)) +
geom_bar(position = position_dodge(preserve = 'single')) +
theme_classic() +
labs(x = "Age (years)", y = "Counts of Fish", show.legend = FALSE)+
theme(legend.position = "none")+
scale_fill_manual("legend", values = c("Migr" = "skyblue", "OcRes" = "pale green", "EstRes" = "pink"))
OP, use position_dodge2(preserve="single") in place of position_dodge(preserve="single"). For some reason, centering bars/columns doesn't quite work correctly with position_dodge(), but it does with position_dodge2(). Note that the spacing changes slightly when you switch the position function, but overall this should fix your problem.
Reproducible Example for OP's question
library(ggplot2)
set.seed(8675309)
df <- data.frame(
x=c("A", "A", "A", "B", "C", "C"),
grouping_var = c("Left", "Middle", "Right", "Middle", "Left", "Right"),
values = sample(1:100, 6))
Basic plot with position_dodge():
ggplot(df, aes(x=x, y=values, fill=grouping_var)) +
geom_col(position=position_dodge(preserve = "single")) +
theme_classic()
When you use position_dodge2():
ggplot(df, aes(x=x, y=values, fill=grouping_var)) +
geom_col(position=position_dodge2(preserve = "single")) +
theme_classic()
I would like to plot a line + point plot, but my data contain "<" values. Is it possible to use a special point for the values marked with "<"? Any suggestions on how to better present this information?
Sample data:
df<-structure(list(Day = c(1, 3, 6, 7, 9, 12, 15), Score = c("0.1",
"0.5", "<1.3", "0.2", "<1.55", "0.8", "1.2")), row.names = c(NA,
-7L), class = c("tbl_df", "tbl", "data.frame"))
Here is my plot code and sample:
df<- df %>%
mutate(Score1=gsub("<", "", Score))
ggplot(data=df, aes(x=Day,y=Score1, group=1)) +
geom_line()+
geom_point()
BTW, your Score1 is still in character type, so it is not plotting proportional to its value. Here's one approach to use the value without "<" but the label including the "<".
There are lots of options here. A few are below:
1. Add the "<" to the axis labels.
2. Add a visual indicator (color, text, an arrow, etc.) to mark the "smaller than" values.
3. Color differently and use a legend. I like ggtext for this as you can use markup to color in specific words, which is great for incorporating color legends into explanatory text.
4. Perhaps "<1.3" could be interpreted, based on situational knowledge, as meaning that the measurement was somewhere below 1.3 but not below 1.2. Then we could show simulated possibilities.
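Whichever option is used, the score first has to be numeric. A minimal base R sketch, assuming every score is either a plain number or a number prefixed with "<"; the object name scores and the column names value and censored are made up for illustration:

```r
scores <- data.frame(
  Day   = c(1, 3, 6, 7, 9, 12, 15),
  Score = c("0.1", "0.5", "<1.3", "0.2", "<1.55", "0.8", "1.2")
)

# Strip the "<" and convert to numeric so the y axis is proportional.
scores$value <- as.numeric(sub("<", "", scores$Score, fixed = TRUE))

# Remember which rows were reported as "less than" values.
scores$censored <- grepl("<", scores$Score, fixed = TRUE)
```

The censored flag can then drive color, shape, or labels without re-parsing the original strings.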
ggplot(data=df, aes(x=Day,y=as.numeric(Score1), group=1)) +
geom_line()+
geom_point() +
scale_y_continuous(breaks = as.numeric(df$Score1), labels = df$Score,
minor_breaks = NULL)
Or you might indicate visually that the values are smaller, esp. if there's some plausible range that they might be lower.
ggplot(data=df, aes(x=Day,y=as.numeric(Score1), group=1)) +
geom_line()+
geom_point() +
geom_segment(data = . %>% filter(Score1 != Score),
aes(xend = Day, yend = as.numeric(Score1) - 0.2),
arrow = arrow(length = unit(0.02, "npc")), color = "gray60") +
scale_y_continuous(breaks = as.numeric(df$Score1), labels = df$Score, minor_breaks = NULL)
library(ggtext)
ggplot(data=df, aes(x=Day, y=as.numeric(Score1), group = 1,
shape = Score1 == Score)) +
geom_line()+
geom_point(aes(color = Score1 == Score)) +
scale_shape_discrete(guide = "none") +
scale_color_manual(values = c("red", "black"), guide = "none") +
labs(caption = "<span style = 'color:#FF0000'>Red dots</span> were recorded with a '<'") +
theme(plot.caption = element_markdown())
Another idea is we might show possibilities that are consistent with the measurement based on our situational understanding of what "<1.3" means -- ie maybe it means the value was "somewhere between 1.2 and 1.3."
library(tidyr) # for uncount()
df_possibilities <- df %>%
filter(Score1 != Score) %>%
uncount(10) %>%
rowwise() %>%
mutate(adjusted = as.numeric(Score1) - runif(1, max = 0.1))
ggplot(data=df, aes(x=Day,y=as.numeric(Score1), group=1)) +
geom_line()+
geom_point() +
scale_y_continuous(breaks = as.numeric(df$Score1), labels = df$Score,
minor_breaks = NULL) +
geom_point(data = df_possibilities,
aes(y = adjusted), alpha = 0.1)
A couple of alternatives, included in the same graph:
by a key using a coloured geom_point, or
by annotation with geom_text
This is just to give an impression, both methods can be enhanced and modified to provide the appearance you think provides the best visualisation.
library(ggplot2)
library(dplyr)
library(stringr)
df1 <-
df%>%
mutate(y = as.numeric(str_extract(Score, "\\d\\.\\d{1,2}")), # escape the dot so it matches a literal "."
less_than = str_detect(Score, "<"))
ggplot(df1, aes(Day, y))+
geom_point(aes(colour = less_than))+
geom_line()+
geom_text(aes(label = Score), hjust = -0.2)
Created on 2021-04-15 by the reprex package (v2.0.0)
UPDATE
Labels idea from Peter. Thanks.
You can use the shape argument to give those points a different shape.
ggpubr offers more sophisticated options. Here is an overview of the shape numbers:
ggplot(data=df, aes(x=factor(Day),y=Score1, group=1)) +
geom_line()+
geom_point() +
geom_point(data=df[c(3,5),], aes(x=factor(Day), y=Score1), colour="red", size=5, shape=25) +
geom_text(aes(label = Score), hjust = -0.2)+
theme_bw()
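Instead of overplotting selected rows by index, the shape can also be mapped to a flag column. A rough sketch, assuming the sample data from the question; the object name scores, the censored column and the legend labels are made up for this example:

```r
library(ggplot2)

scores <- data.frame(
  Day   = c(1, 3, 6, 7, 9, 12, 15),
  Score = c("0.1", "0.5", "<1.3", "0.2", "<1.55", "0.8", "1.2")
)
scores$value    <- as.numeric(sub("<", "", scores$Score, fixed = TRUE))
scores$censored <- grepl("<", scores$Score, fixed = TRUE)

p <- ggplot(scores, aes(Day, value)) +
  geom_line() +
  # shape 16 = solid circle, shape 25 = downward-pointing triangle
  geom_point(aes(shape = censored), size = 3) +
  scale_shape_manual(values = c("FALSE" = 16, "TRUE" = 25),
                     labels = c("measured", "reported with '<'"),
                     name = NULL) +
  theme_bw()
```

Mapping the flag keeps the legend in sync with the data, so nothing has to change if other rows turn out to carry a "<".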
I'm plotting 3 columns/character vectors in a faceted bar graph and would like to be able to plot "smoker" as the stacked bar graph inside each bar graph.
I'm using ggplot2. I've managed to plot "edu" and "sex" already, but I'd also like to be able to see the count of each "y" and "n" inside each bar graph of "sex" (divided along the x-axis by "edu"). I have attached an image of my graph,
which I achieved by entering the following code:
I tried entering the "fill=smoker" argument in aes, but this didn't work.
If anyone has any suggestions on how to clean up the code I used to turn the graph into a faceted one and express it as percentages, I would also be very grateful, as I took it from somewhere else.
test <- read.csv('test.csv', header = TRUE)
library(ggplot2)
ggplot(test, aes(x= edu, group=sex)) +
geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count", show.legend = FALSE) +
geom_text(aes( label = scales::percent(..prop..),
y= ..prop.. ), stat= "count", vjust = -.5, size = 3) +
labs(y = NULL, x="education") +
facet_grid(~sex) +
scale_y_continuous(labels = scales::percent)
Not sure if this is what you are looking for but I attempted my best at answering your question.
library(tidyverse)
library(lubridate)
library(scales)
test <- tibble(
edu = c(rep("hs", 5), rep("bsc", 3), rep("msc", 3)),
sex = c(rep("m", 3), rep("f", 4), rep("m", 4)),
smoker = c("y", "n", "n", "y", "y", rep("n", 3), "y", "n", "n"))
test %>%
count(sex, edu, smoker) %>%
group_by(sex) %>%
mutate(percentage = n/sum(n)) %>%
ggplot(aes(edu, percentage, fill = smoker)) +
geom_col() +
geom_text(aes(label = percent(percentage)),
position = position_stack(vjust = 0.5)) +
facet_wrap(~sex) +
scale_y_continuous(labels = scales::percent) +
scale_fill_manual(values = c("#A0CBE8", "#F28E2B"))
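To see what the plot is built from, here is a sketch of the intermediate table, assuming the same toy data. Because the counts are normalised within each sex, the shares in each facet add up to 100%. The name shares is only for this example:

```r
library(dplyr)

test <- data.frame(
  edu    = c(rep("hs", 5), rep("bsc", 3), rep("msc", 3)),
  sex    = c(rep("m", 3), rep("f", 4), rep("m", 4)),
  smoker = c("y", "n", "n", "y", "y", rep("n", 3), "y", "n", "n")
)

shares <- test %>%
  count(sex, edu, smoker) %>%          # one row per sex/edu/smoker combination
  group_by(sex) %>%
  mutate(percentage = n / sum(n)) %>%  # share within each sex
  ungroup()

# Sanity check: the shares within each sex sum to 1.
shares %>%
  group_by(sex) %>%
  summarise(total = sum(percentage))
```

If the percentages should instead add up to 100% within each bar, grouping by both sex and edu before the mutate() would be the change to make.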
I have a ggplot2 linegraph with two lines featuring significant overlap. I'm trying to use position_jitterdodge() so that they are more visible, but I can't get the lines and points to both jitter in the same way. I'm trying to jitter the points and line horizontally only (as I don't want to suggest any change on the y-axis). Here is an MWE:
## Create data frames
dimension <- factor(c("A", "B", "C", "D"))
df <- data.frame("dimension" = rep(dimension, 2),
"value" = c(20, 21, 34, 32,
20, 21, 36, 29),
"Time" = c(rep("First", 4), rep("Second", 4)))
## Plot it
ggplot(data = df, aes(x = dimension, y = value,
shape = Time, linetype = Time, group = Time)) +
geom_line(position = position_jitterdodge(dodge.width = 0.45)) +
geom_point(position = position_jitterdodge(dodge.width = 0.45)) +
xlab("Dimension") + ylab("Value")
Which produces the ugly:
I've obviously got something fundamentally wrong here: What should I do to make the geom_point jitter follow the geom_line jitter?
Another option for horizontal only would be to specify position_dodge and pass this to the position argument for each geom.
pd <- position_dodge(0.4)
ggplot(data = df, aes(x = dimension, y = value,
shape = Time, linetype = Time, group = Time)) +
geom_line(position = pd) +
geom_point(position = pd) +
xlab("Dimension") + ylab("Value")
One solution is to manually jitter the points:
df$value_j <- jitter(df$value)
ggplot(df, aes(dimension, value_j, shape=Time, linetype=Time, group=Time)) +
geom_line() +
geom_point() +
labs(x="Dimension", y="Value")
The horizontal solution for your discrete X axis isn't as clean (it's clean under the covers when ggplot2 does it since it handles the axis and point transformations for you quite nicely) but it's doable:
df$dim_j <- jitter(as.numeric(factor(df$dimension)))
ggplot(df, aes(dim_j, value, shape=Time, linetype=Time, group=Time)) +
geom_line() +
geom_point() +
scale_x_continuous(labels=dimension) +
labs(x="Dimension", y="Value")
In July 2017, the developers of ggplot2 added a seed argument to the position_jitter function (https://github.com/tidyverse/ggplot2/pull/1996).
So, now (here: ggplot2 3.2.1) you can pass the argument seed to position_jitter in order to have the same jitter effect in geom_point and geom_line (see the official documentation: https://ggplot2.tidyverse.org/reference/position_jitter.html)
Note that this seed argument does not exist (yet) in geom_jitter.
ggplot(data = df, aes(x = dimension, y = value,
shape = Time, linetype = Time, group = Time)) +
geom_line(position = position_jitter(width = 0.25, seed = 123)) +
geom_point(position = position_jitter(width = 0.25, seed = 123)) +
xlab("Dimension") + ylab("Value")
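Note that position_jitter() also jitters vertically by default. Since the question asks for horizontal jitter only, a variant with height = 0 may be closer to what is wanted. A sketch reusing the question's data; the object names pj and p are illustrative:

```r
library(ggplot2)

df <- data.frame(dimension = factor(rep(c("A", "B", "C", "D"), 2)),
                 value     = c(20, 21, 34, 32,
                               20, 21, 36, 29),
                 Time      = c(rep("First", 4), rep("Second", 4)))

# Same seed for both layers so lines and points get identical offsets;
# height = 0 leaves the y values untouched.
pj <- position_jitter(width = 0.25, height = 0, seed = 123)

p <- ggplot(df, aes(x = dimension, y = value,
                    shape = Time, linetype = Time, group = Time)) +
  geom_line(position = pj) +
  geom_point(position = pj) +
  xlab("Dimension") + ylab("Value")
```

Storing the position in one object and passing it to both geoms avoids the two calls drifting apart when the width or seed is changed later.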
This seems like the simplest thing to do, but I have not been able to figure it out in R. For descriptive purposes, I want to create one bar graph that shows the means and error bars of multiple questions/variables. My data are based on anonymous responses, so there are no grouping variables.
Is there a way to do this in R? Below is an example of what my data looks like. I would like to plot the mean and standard deviation of each variable next to each other in the same bar graph.
dat <- data.frame(satisfaction = c(1, 2, 3, 4),
engaged = c(2, 3, 4, 2),
relevant = c(4, 1, 3, 2),
recommend = c(4, 1, 3, 3))
What you could do is reshape the data into long format with reshape2 (or data.table or tidyr) without specifying an id-variable and using all columns as measure variables. After that you can create a plot with for example ggplot2. Using:
library(reshape2)
library(ggplot2)
# reshape into long format
dat2 <- melt(dat, measure.vars = 1:4) # or just: melt(dat)
# create the plot
ggplot(dat2, aes(x = variable, y = value)) +
stat_summary(geom = 'bar', fun = 'mean', width = 0.7, fill = 'grey') + # fun replaces fun.y, deprecated since ggplot2 3.3.0
stat_summary(geom = 'errorbar', width = 0.2, size = 1.5) +
theme_minimal(base_size = 14) +
theme(axis.title = element_blank())
gives:
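Note that stat_summary() falls back to mean_se() when no summary function is given, so the error bars above show the mean plus or minus one standard error. If the standard deviation is wanted, as the question asks, a small helper can be passed as fun.data. A minimal sketch, where mean_sd is a hypothetical helper name and base R's stack() stands in for melt():

```r
library(ggplot2)

dat <- data.frame(satisfaction = c(1, 2, 3, 4),
                  engaged      = c(2, 3, 4, 2),
                  relevant     = c(4, 1, 3, 2),
                  recommend    = c(4, 1, 3, 3))

# stack() gives long format with columns `values` and `ind`.
dat_long <- stack(dat)

# Hypothetical helper: mean with +/- one standard deviation.
mean_sd <- function(x) {
  m <- mean(x)
  s <- sd(x)
  data.frame(y = m, ymin = m - s, ymax = m + s)
}

# `fun` assumes ggplot2 >= 3.3.0; older versions use `fun.y`.
p <- ggplot(dat_long, aes(x = ind, y = values)) +
  stat_summary(geom = "bar", fun = mean, width = 0.7, fill = "grey") +
  stat_summary(geom = "errorbar", fun.data = mean_sd, width = 0.2) +
  theme_minimal(base_size = 14) +
  theme(axis.title = element_blank())
```

Swapping mean_sd for another function that returns y, ymin and ymax (for example a confidence interval) changes the error bars without touching the rest of the plot.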
Update: As @GavinSimpson pointed out in his answer, a barplot is not the best choice for visualizing means and standard errors. As an alternative, you could use geom_pointrange:
ggplot(dat2, aes(x = variable, y = value)) +
stat_summary(geom = 'pointrange', fatten = 5, size = 1.2) +
theme_minimal(base_size = 14) +
theme(axis.title = element_blank())
which gives:
Whilst I know you asked for a barplot, a dotplot of the data is an alternative visualisation that focuses on the means and standard errors. If the drawing of a bar all the way to 0 is not that informative, the dotplot is a good alternative.
Reusing the objects and code from @Procrastinatus Maximus' answer we have:
ggplot(dat2, aes(x = variable, y = value)) +
stat_summary(geom = 'point', fun = 'mean', size = 2) + # fun replaces fun.y, deprecated since ggplot2 3.3.0
stat_summary(geom = 'errorbar', width = 0.2) +
xlab(NULL) +
theme_bw()
which produces