Errorbars look like pointrange (ggplot2)

I have the following data frame:
> df <- read.table("throughputOverallSummary.txt", header = TRUE)
> df
ExperimentID clients connections msgSize Mean Deviation Error
1 77 100 50 1999 142.56427 8.368127 0.4710121
2 78 200 50 1999 284.22705 13.575943 0.3832827
3 79 400 50 1999 477.48997 44.820831 0.7538666
4 80 600 50 1999 486.87102 49.916391 0.8240869
5 81 800 50 1999 488.84899 51.422070 0.8462216
6 82 10 50 1999 15.23667 1.995150 1.0498722
7 83 50 50 1999 71.94000 5.197893 0.5793057
and some code that processes the dataframe df above:
msg_1999 = subset(df, df$msgSize == 1999)
if (nrow(msg_1999) > 0) {
limits = aes(ymax = msg_1999$Mean + msg_1999$Deviation, ymin = msg_1999$Mean -
msg_1999$Deviation)
ggplot(data = msg_1999, aes(clients, Mean, color = as.factor(connections), group =
as.factor(connections))) +
geom_point() + geom_line() +
geom_errorbar(limits, width = 0.25) +
xlab("Number of Clients") +
ylab("Throughput (in messages/second)") +
labs(title = "Message size 1999 bytes", color = "Connections")
ggsave(file = "throughputMessageSize1999.png")
}
My problem is that the error bars in the plot look like pointrange. The horizontal bars at the upper and lower end of the error bars are missing.
Ideally, the error bars should have looked something like this:
Why do errorbars from my code look different?

The width parameter is on the same scale as x. You have given width = 0.25, whereas the x axis ranges from 0 to 800, so a bar 0.25 units wide is not going to be visible on this graph. If you don't set the width value, something reasonably sensible is guessed.
ggplot(data = df, aes(clients, Mean, color = as.factor(connections), group =
as.factor(connections))) +
geom_point() + geom_line() +
geom_errorbar(aes(ymax = Mean + Deviation, ymin=Mean-Deviation)) +
xlab("Number of Clients") +
ylab("Throughput (in messages/second)") +
labs(title = "Message size 1999 bytes", color = "Connections")
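As a rough illustration of the point about scale, one option is to derive the cap width from the x range instead of hard-coding it. This is only a sketch: it assumes the clients, Mean and Deviation values from the question, and the 3% fraction and the name bar_width are arbitrary choices for illustration, not anything ggplot2 prescribes.

```r
library(ggplot2)

# Values copied from the data frame in the question
df <- data.frame(
  clients     = c(100, 200, 400, 600, 800, 10, 50),
  connections = 50,
  Mean        = c(142.56427, 284.22705, 477.48997, 486.87102,
                  488.84899, 15.23667, 71.94000),
  Deviation   = c(8.368127, 13.575943, 44.820831, 49.916391,
                  51.422070, 1.995150, 5.197893)
)

# Assumption: make the caps 3% of the x range wide (illustrative choice)
bar_width <- 0.03 * diff(range(df$clients))

p <- ggplot(df, aes(clients, Mean,
                    color = as.factor(connections),
                    group = as.factor(connections))) +
  geom_point() + geom_line() +
  geom_errorbar(aes(ymin = Mean - Deviation, ymax = Mean + Deviation),
                width = bar_width)
```

With x running from 10 to 800, bar_width comes out at about 24 x units, which is wide enough for the caps to be drawn visibly, whereas 0.25 is far below one pixel at typical plot sizes.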
Note that if you want to predefine your mapping argument, you should still specify the variables as you would within a call to geom_xxxx. aes (and ggplot) does some fancy footwork to ensure that this will be evaluated within the correct environment at the time of plotting.
Thus the following will work:
limits <- aes(ymax = Mean + Deviation, ymin=Mean-Deviation)
ggplot(data = df, aes(clients, Mean, color = as.factor(connections), group =
as.factor(connections))) +
geom_point() + geom_line() +
geom_errorbar(limits) +
xlab("Number of Clients") +
ylab("Throughput (in messages/second)") +
labs(title = "Message size 1999 bytes", color = "Connections")
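To make the remark about evaluation a little more concrete, the small sketch below shows that aes() stores the expressions rather than their values, so the mapping can be created before any data frame is involved. The use of rlang::quo_get_expr() to peek inside the mapping is just one way to inspect it, chosen here for illustration.

```r
library(ggplot2)

# No data frame with Mean or Deviation columns exists at this point,
# yet this does not fail: aes() only captures the expressions.
limits <- aes(ymax = Mean + Deviation, ymin = Mean - Deviation)

# Inspect what was captured (rlang is installed alongside ggplot2)
captured_ymax <- deparse(rlang::quo_get_expr(limits$ymax))
captured_ymin <- deparse(rlang::quo_get_expr(limits$ymin))
print(captured_ymax)
print(captured_ymin)
```

By contrast, writing msg_1999$Mean + msg_1999$Deviation inside aes() ties the mapping to one particular object in the workspace, which tends to break as soon as the layer is given different data or the plot is faceted.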

Related

How can I use the ggplot function to visualise grouped data?

I have a data set which has the time taken for individuals to read a sentence (response_time) under the experimental factors of the condition of the sentence (normal or visually degraded) and the number of cups of coffee (caffeine) that an individual has drunk. I want to visualise the data using ggplot, but with the data grouped according to the condition of the sentence and the coffee drunk - e.g. the response times recorded for individuals reading a normal sentence and having drunk one cup of coffee.
This is what I have tried so far, but the graph comes up as one big blob (not separated by group) and has over 15 warnings!!
participant condition response_time caffeine
<dbl> <fct> <dbl> <fct>
1 1 Normal 984 1
2 2 Normal 1005 1
3 3 Normal 979 3
4 4 Normal 1040 2
5 5 Normal 1008 2
6 6 Normal 979 3
tidied_data_2 %>%
ggplot(aes(x = condition:caffeine, y = response_time, colour = condition:caffeine)) +
geom_violin() +
geom_jitter(width = .1, alpha = .25) +
guides(colour = FALSE) +
stat_summary(fun.data = "mean_cl_boot", colour = "black") +
theme_minimal() +
theme(text = element_text(size = 13)) +
labs(x = "Condition X Caffeine", y = "Response Time (ms)")
Any suggestions on how to better code what I want would be great.
As a wiki answer because too long for a comment.
I'm not sure what you intend with condition:caffeine; I've never seen that syntax in ggplot. Try aes(x = as.character(caffeine), y = ..., color = as.character(caffeine)) instead (or, because it is a factor in your case anyway, you can just use aes(x = caffeine, y = ..., color = caffeine)).
If your idea is to separate by condition, you could just use aes(x = caffeine, y = ..., color = condition), as they are going to be separated by x anyway.
On another note: why not plot an actual scatter plot, making this a proper two-dimensional graph? Suggestion below.
library(ggplot2)
library(dplyr)
tidied_data_2 <- read.table(text = "participant condition response_time caffeine
1 1 Normal 984 1
2 2 Normal 1005 1
3 3 Normal 979 3
4 4 Normal 1040 2
5 5 Normal 1008 2
6 6 Normal 979 3", head = TRUE)
tidied_data_2 %>%
ggplot(aes(x = as.character(caffeine), y = response_time, colour = as.character(caffeine))) +
## geom_violin does not make sense with so few observations
# geom_violin() +
## I've removed alpha so you can see the dots better
geom_jitter(width = .1) +
## "none" replaces the deprecated FALSE in recent ggplot2 versions
guides(colour = "none") +
stat_summary(fun.data = "mean_cl_boot", colour = "black") +
theme_minimal() +
theme(text = element_text(size = 13)) +
labs(x = "Condition X Caffeine", y = "Response Time (ms)")
What I would rather do:
tidied_data_2 %>%
## in this example as.integer(as.character(x)) is unnecessary, but it is necessary for your data sample
ggplot(aes(x = as.integer(as.character(caffeine)), y = response_time)) +
geom_jitter(width = .1) +
theme_minimal()
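If the goal really is one position on the x axis per combination of condition and caffeine, a possible sketch is to build the combined factor explicitly with interaction(). The data frame toy below is entirely made up for illustration (the question only shows rows for the Normal condition), and the column name cell is likewise hypothetical.

```r
library(ggplot2)

# Hypothetical data: 2 conditions x 3 caffeine levels, 2 readers per cell
toy <- data.frame(
  condition     = factor(rep(c("Normal", "Degraded"), each = 6)),
  caffeine      = factor(rep(c(1, 2, 3), times = 4)),
  response_time = c(980, 1000, 990, 1040, 1010, 985,
                    1120, 1090, 1075, 1150, 1100, 1065)
)

# One factor level per condition/caffeine combination
toy$cell <- interaction(toy$condition, toy$caffeine, sep = " x ")

p <- ggplot(toy, aes(x = cell, y = response_time, colour = condition)) +
  geom_jitter(width = 0.1) +
  theme_minimal() +
  labs(x = "Condition x Caffeine", y = "Response Time (ms)")
```

Each of the six levels of cell then gets its own position on the x axis, while colour still separates the two conditions.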

How do I fix error message "Must request at least one colour from a hue palette"?

I'm working on a stats project and I'm trying to make a categorical-by-categorical interaction plot between my response and explanatory variables but I keep getting the error message mentioned above. What's going wrong in my code that's causing this error?
mymeans <- summary(emmeans(mymodel1, pairwise ~ Opp_Rank | Location)$emmeans)
mymeans_plot <- mymeans[c("Ranked Status", "Location", "emmean", "SE"),]
ggplot(mymeans_plot, aes(x = Opp_Rank, y = emmean, col = Location)) +
geom_point(position = position_dodge(width = 0.4)) +
geom_errorbar(aes(ymin = emmean - SE, ymax = emmean + SE),
width = 0.4, size = 0.7,
position = position_dodge(width = 0.4)) +
xlab("Ranked Status") +
ylab("Mean Points Scored +/- SE") +
ggtitle("Mean Points Scored by Ranked Status and Location") +
theme_classic()
Points_Scored Location Opp_Rank Year
1 6 Home No 1936
2 6 Away No 1936
3 18 Home No 1936
4 0 Away No 1936
5 7 Home Yes 1936
6 6 Away No 1936
This is what my interaction plot is supposed to end up looking like.
On this line:
ggplot(mymeans_plot, aes(x = Opp_Rank, y = emmean, col = Location)) +
change col to color.
If you're still getting the error and if you're working in RStudio, you can add the following to your chunk header:
```{r, warning=FALSE, message=FALSE}
#your code here
```
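Separately from the suggestion above, and only as a possibility that the original post does not confirm, it may be worth checking the subsetting line mymeans[c(...), ]. With the comma at the end, the character vector selects rows, not columns, and row names that do not exist give rows filled with NA. The sketch below uses a made-up data frame called means to show the difference; none of its values come from the question.

```r
# Hypothetical stand-in for a summary table of estimated means
means <- data.frame(
  Opp_Rank = c("No", "Yes"),
  Location = c("Home", "Away"),
  emmean   = c(10, 12),
  SE       = c(1, 1.5)
)

# Names placed before the comma are looked up as ROW names
by_row <- means[c("Location", "emmean"), ]

# Names placed after the comma select COLUMNS
by_col <- means[, c("Location", "emmean")]

print(by_row)  # rows that contain only NA
print(by_col)  # the two requested columns
```

A colour variable that consists only of NA leaves a discrete colour scale with no levels to assign colours to, so inspecting mymeans_plot with str() or head() before plotting may help narrow things down.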

Show statistically significant difference in a graph

I have carried out an experiment with six treatments, and each treatment was performed in light and in darkness. I have used ggplot2 to make a bar plot. I would like to add significance letters (e.g. an LSD result) to the graph to show the difference between light and darkness for each treatment, but it gives me an error.
Any suggestion?
data <- read.table(header = TRUE, text =
'T0 T1 T2 T3 T4 T5 LVD
40 62 50 45 45 58 Light
30 60 44 40 30 58 Light
30 68 42 35 32 59 Light
47 75 58 55 50 70 Dark
45 75 52 54 42 78 Dark
50 75 68 48 56 75 Dark
')
gla <- melt(data,id="LVD")
ggplot(gla, aes(x=variable, y=value, fill=as.factor(LVD))) +
stat_summary(fun.y=mean,
geom="bar",position=position_dodge(),colour="black",width=.7,size=.7) +
stat_summary(fun.ymin=min,fun.ymax=max,geom="errorbar",
color="black",position=position_dodge(.7), width=.2) +
scale_fill_manual("Legend", values = c("Light" = "white", "Dark" ="gray46")) +
xlab("Treatments")+
ylab("Germination % ") +
theme(panel.background = element_rect(fill = 'white', colour = 'black'))
Up to here it works perfectly, but when I use geom_text it gives an error:
+ geom_text(aes(label=c("a","b","a","a","a","a, a","b","a","b","a","b")))
The error is:
Error: Aesthetics must be either length 1 or the same as the data (36): label, x, y, fill
The problem is that you have 36 data points, which you summarize to 12. ggplot will only allow mapping to 36 data points in geom_text (which the error tells you). In order to use the summarized 12 points, you do need to use stat_summary once again.
The basic rule is that statistical transformations (like summaries) do *not* transfer between layers (i.e. geoms and stats). So geom_text has no idea what the y values computed by the original stat_summary actually are.
Then you also need to fix the typo in your letters.
We end up with:
ggplot(gla, aes(x=variable, y=value, fill=as.factor(LVD))) +
stat_summary(fun.y=mean,
geom="bar",position=position_dodge(),colour="black",width=.7,size=.7) +
stat_summary(fun.ymin=min,fun.ymax=max,geom="errorbar",
color="black",position=position_dodge(.7), width=.2) +
stat_summary(geom = 'text', fun.y = max, position = position_dodge(.7),
label = c("a","b","a","a","a","a", "a","b","a","b","a","b"), vjust = -0.5) +
scale_fill_manual("Legend", values = c("Light" = "white", "Dark" ="gray46")) +
xlab("Treatments") +
ylab("Germination % ") +
scale_y_continuous(expand = c(0, 0), limits = c(0, 85)) +
theme_bw()
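Because summaries do not transfer between layers, another possible approach is to compute the 12 summary rows yourself and hand that small data frame to geom_text. The sketch below is one way to do it under some assumptions: it reshapes with base R's stack() instead of melt, it uses the fun argument (which needs ggplot2 3.3.0 or later), and the pairing of the letters with the bars (ordered by treatment, Dark before Light) is assumed rather than taken from the question. The names summ and label are made up for illustration.

```r
library(ggplot2)

data <- read.table(header = TRUE, text = "
T0 T1 T2 T3 T4 T5 LVD
40 62 50 45 45 58 Light
30 60 44 40 30 58 Light
30 68 42 35 32 59 Light
47 75 58 55 50 70 Dark
45 75 52 54 42 78 Dark
50 75 68 48 56 75 Dark
")

# Long format: one row per observation (36 rows)
long <- stack(data[, paste0("T", 0:5)])
gla <- data.frame(LVD = rep(data$LVD, times = 6),
                  variable = long$ind, value = long$values)

# One row per bar (12 rows), holding the height at which to put the letter
summ <- aggregate(value ~ variable + LVD, data = gla, FUN = max)
summ <- summ[order(summ$variable, summ$LVD), ]
# Assumed order: T0 Dark, T0 Light, T1 Dark, ...
summ$label <- c("a","b","a","a","a","a", "a","b","a","b","a","b")

p <- ggplot(gla, aes(x = variable, y = value, fill = LVD)) +
  stat_summary(fun = mean, geom = "bar", position = position_dodge(0.7),
               colour = "black", width = 0.7) +
  geom_text(data = summ, aes(label = label, group = LVD),
            position = position_dodge(0.7), vjust = -0.5)
```

Keeping the letters in a column next to the treatment and light condition makes it easier to check that each letter ends up on the intended bar.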
I don't like dynamite plots, so here's my version:
let <- c("a","b","a","a","a","a", "a","b","a","b","a","b")
stars <- ifelse(let[c(TRUE, FALSE)] == let[c(FALSE, TRUE)], '', '*')
ggplot(gla, aes(x = variable, y = value)) +
stat_summary(aes(col = as.factor(LVD)),
fun.y=mean, fun.ymin = min, fun.ymax = max,
position = position_dodge(.3), size = .7) +
stat_summary(geom = 'text', fun.y = max, position = position_dodge(.3),
label = stars, vjust = 0, size = 6) +
scale_color_manual("Legend", values = c("Light" = "black", "Dark" ="gray46")) +
xlab("Treatments") +
ylab("Germination % ") +
scale_y_continuous(expand = c(0.1, 0)) +
theme_bw()
I found this the simplest way to show statistical significance with asterisks and lines. The following adds 'geom_text' and 'annotate' to an existing plot (fig2):
fig2 + geom_text(x=1.5,y=89, label = "***") + annotate("segment", x=c(1,1,2), xend=c(1,2,2), y=c(84,86,86), yend=c(86,86,84), size=1)

geom_lines not linking what they should with error bars plot in ggplot

I have the following dataset ready to plot an error bars and lines graph
> growth
treatment class variable N value sd se ci
1 elevated Dominant RBAI2012 18 0.014127713 0.009739951 0.002295728 0.004843564
2 elevated Dominant RBAI2013 18 0.021869978 0.013578741 0.003200540 0.006752549
3 elevated Codominant RBAI2012 40 0.011564725 0.013718591 0.002169100 0.004387418
4 elevated Codominant RBAI2013 41 0.011471512 0.011091167 0.001732149 0.003500804
5 elevated Subordinate RBAI2012 24 0.004419784 0.009286883 0.001895677 0.003921507
6 elevated Subordinate RBAI2013 24 0.004397105 0.008704831 0.001776866 0.003675728
7 ambient Dominant RBAI2012 13 0.025836265 0.011880315 0.003295007 0.007179203
8 ambient Dominant RBAI2013 13 0.025992636 0.015162901 0.004205432 0.009162850
9 ambient Codominant RBAI2012 26 0.018067329 0.011830940 0.002320238 0.004778620
10 ambient Codominant RBAI2013 26 0.015595275 0.012467140 0.002445007 0.005035587
11 ambient Subordinate RBAI2012 33 0.006073904 0.008287442 0.001442658 0.002938599
12 ambient Subordinate RBAI2013 35 0.003239033 0.006846507 0.001157271 0.002351857
I've tried the following code, resulting in this plot:
p <- ggplot(growth,aes(class,value,colour=treatment,group=variable))
pd<-position_dodge(.9)
# se= standard error; ci=confidence interval
p + geom_errorbar(aes(ymin=value-se,ymax=value+se),width=.1,position=pd,colour="black") + geom_point(position=pd,size=4) + geom_line(position=pd) +
theme_bw() + theme(legend.position=c(1,1),legend.justification=c(1,1))
The lines should link the points of the same color within each x-axis category, but clearly they don't. Could you please help me draw the lines properly (e.g. blue with blue and red with red within the "Dominant" class, and different lines for the "Codominant" class)?
Also, do you know how to include in the x-labels the variables I am grouping by (i.e. "RBAI2012", "RBAI2013")?
Many thanks
To also distinguish between the different levels of 'variable', you may introduce a fourth aesthetic: shape. First define a new grouping variable, a combination of 'treatment' and 'variable', which has four levels. Map group, colour and shape to this variable. Then use scale_colour_manual and scale_shape_manual to set two colours, which correspond to the two levels of 'treatment'. Similarly, define two shapes for the two levels of 'variable'.
growth$grp <- paste0(growth$treatment, growth$variable)
ggplot(data = growth, aes(x = class, y = value, group = grp,
colour = grp, shape = grp)) +
geom_point(size = 4, position = pd) +
geom_line(position = pd) +
geom_errorbar(aes(ymin = value - se, ymax = value + se), colour = "black",
position = pd, width = 0.1) +
scale_colour_manual(name = "Treatment:Variable",
values = c("red", "red","blue", "blue")) +
scale_shape_manual(name = "Treatment:Variable",
values = c(19, 17, 19, 17)) +
theme_bw() +
theme(legend.position = c(1,1), legend.justification = c(1,1))
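The unnamed values vectors above rely on the alphabetical order of the four grp levels. A possible refinement, sketched below, is to build named vectors so that each colour and shape is tied to its level explicitly. The names grp_levels, cols and shapes are made up for this illustration, and the small growth data frame only reproduces the treatment and variable columns from the question.

```r
# Only the two grouping columns are needed to illustrate the idea
growth <- expand.grid(variable  = c("RBAI2012", "RBAI2013"),
                      treatment = c("elevated", "ambient"),
                      stringsAsFactors = FALSE)
growth$grp <- paste0(growth$treatment, growth$variable)

grp_levels <- sort(unique(growth$grp))

# Colour follows treatment, shape follows variable
cols   <- setNames(ifelse(startsWith(grp_levels, "ambient"), "red", "blue"),
                   grp_levels)
shapes <- setNames(ifelse(endsWith(grp_levels, "2012"), 19, 17),
                   grp_levels)

print(cols)
print(shapes)
```

These can then be passed as scale_colour_manual(values = cols) and scale_shape_manual(values = shapes); manual scales match named values to levels by name, so the result does not depend on the level order.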
One option is using a facet plot like so:
p <- ggplot(growth, aes(x = class, y = value, group = treatment, color = treatment))
p + geom_point(size = 4) + facet_grid(. ~ variable) + geom_errorbar(aes(ymin=value-se,ymax=value+se),width=.1,colour="black") + geom_line()
If you want it on one graph, another option is defining a new variable that combines treatment and variable:
growth$treatment_variable <- paste(growth$treatment, growth$variable)
p <- ggplot(growth, aes(x = class, y = value, group = treatment_variable, colour = treatment_variable))
pd<-position_dodge(.2)
p + geom_point(size = 4, position=pd) + geom_errorbar(aes(ymin=value-se, ymax=value+se), width=.1, position=pd, colour="black") + geom_line(position=pd)
You have too many grouping variables (variable and treatment) and including them in a single plot may be a bit confusing. You might want to use faceting, like this:
p <- ggplot(growth,aes(class,value,colour=treatment,group=treatment))
pd<-position_dodge(.9)
p +
geom_errorbar(aes(ymin=value-se,ymax=value+se),width=.1,position=pd,colour="black") +
geom_point(position=pd,size=4) + geom_line(position=pd) +
theme_bw() + theme(legend.position=c(1,1),legend.justification=c(1,1)) +
facet_grid(variable~treatment)
It is possible to do this, but you need to hack it since you're essentially plotting a geom_line() on different groupings (variable + treatment) than with the geom_point() and geom_errorbar() calls.
You need to use ggplot_build() to get back the rendered data and draw a geom_line(), based on the existing points data, grouped by colour:
p <- ggplot(growth) # move the aes() into the individual charts
pd<-position_dodge(.9) # leave dodge as is
se<-0.01 # faked this
p <- p +
geom_point(aes(x=factor(class),y=value,colour=treatment,group=variable),position=pd,size=4) +
theme_bw() + theme(legend.position=c(1,1),legend.justification=c(1,1)) +
geom_errorbar(aes(x=factor(class),ymin=value-se,ymax=value+se,colour=treatment,group=variable),position=pd,width=.1,colour="black")
b<-ggplot_build(p)$data[[1]] # get the ggplot-rendered data for the first layer (the points)
p + geom_line(data=b,aes(x,y,group=colour), color=b$colour) # plot the lines

ggplot2 multiple sub groups of a bar chart

I am trying to produce a bar graph that has multiple groupings of factors. An example from excel of what I am attempting to create, subgrouped by Variety and Irrigation treatment:
I know I could produce multiple graphs using facet_wrap(), but I would like to produce multiple graphs for this same type of data for multiple years of similar data. An example of the data I used in this example:
Year Trt Variety geno yield SE
2010-2011 Irr Variety.2 1 6807 647
2010-2011 Irr Variety.2 2 5901 761
2010-2011 Irr Variety.1 1 6330 731
2010-2011 Irr Variety.1 2 5090 421
2010-2011 Dry Variety.2 1 3953 643
2010-2011 Dry Variety.2 2 3438 683
2010-2011 Dry Variety.1 1 3815 605
2010-2011 Dry Variety.1 2 3326 584
Is there a way to create multiple groupings in ggplot2? I have searched for quite some time and have yet to see an example of something like the example graph above.
Thanks for any help you may have!
This may be a start.
dodge <- position_dodge(width = 0.9)
ggplot(df, aes(x = interaction(Variety, Trt), y = yield, fill = factor(geno))) +
geom_bar(stat = "identity", position = position_dodge()) +
geom_errorbar(aes(ymax = yield + SE, ymin = yield - SE), position = dodge, width = 0.2)
Update: labelling of x axis
I have added:
coord_cartesian, to set limits of y axis, mainly the lower limit to avoid the default expansion of the axis.
annotate, to add the desired labels. I have hard-coded the x positions, which I find OK in this fairly simple example.
theme_classic, to remove the gray background and the grid.
theme, increase lower plot margin to have room for the two-row label, remove default labels.
Last set of code: Because the text is added below the x-axis, it 'disappears' outside the plot area, and we need to remove the 'clipping'. That's it!
library(grid)
g1 <- ggplot(data = df, aes(x = interaction(Variety, Trt), y = yield, fill = factor(geno))) +
geom_bar(stat = "identity", position = position_dodge()) +
geom_errorbar(aes(ymax = yield + SE, ymin = yield - SE), position = dodge, width = 0.2) +
coord_cartesian(ylim = c(0, 7500)) +
annotate("text", x = 1:4, y = - 400,
label = rep(c("Variety 1", "Variety 2"), 2)) +
annotate("text", c(1.5, 3.5), y = - 800, label = c("Irrigated", "Dry")) +
theme_classic() +
theme(plot.margin = unit(c(1, 1, 4, 1), "lines"),
axis.title.x = element_blank(),
axis.text.x = element_blank())
# remove clipping of x axis labels
g2 <- ggplot_gtable(ggplot_build(g1))
g2$layout$clip[g2$layout$name == "panel"] <- "off"
grid.draw(g2)
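In more recent ggplot2 versions (3.0.0 and later), coord_cartesian() has a clip argument, so the gtable step can probably be skipped. The sketch below is one possible variant under that assumption. It also derives the label text and positions from the factor levels instead of hard-coding them, because interaction() on plain character columns sorts the levels alphabetically, which puts the Dry bars first. The names cell, lv, variety_lab, trt_lab and trt_pos are made up for this illustration.

```r
library(ggplot2)

df <- read.table(header = TRUE, text = "
Year Trt Variety geno yield SE
2010-2011 Irr Variety.2 1 6807 647
2010-2011 Irr Variety.2 2 5901 761
2010-2011 Irr Variety.1 1 6330 731
2010-2011 Irr Variety.1 2 5090 421
2010-2011 Dry Variety.2 1 3953 643
2010-2011 Dry Variety.2 2 3438 683
2010-2011 Dry Variety.1 1 3815 605
2010-2011 Dry Variety.1 2 3326 584
")

df$cell <- interaction(df$Variety, df$Trt)
lv <- levels(df$cell)

# Split each level back into its variety and treatment part
variety_lab <- sub("\\.[^.]+$", "", lv)
trt_lab     <- sub("^.*\\.", "", lv)
# Centre of each treatment block on the x axis
trt_pos     <- tapply(seq_along(lv), trt_lab, mean)

dodge <- position_dodge(width = 0.9)
g <- ggplot(df, aes(x = cell, y = yield, fill = factor(geno))) +
  geom_col(position = dodge) +
  geom_errorbar(aes(ymin = yield - SE, ymax = yield + SE),
                position = dodge, width = 0.2) +
  # clip = "off" lets the annotations be drawn below the panel
  coord_cartesian(ylim = c(0, 7500), clip = "off") +
  annotate("text", x = seq_along(lv), y = -400, label = variety_lab) +
  annotate("text", x = as.numeric(trt_pos), y = -800,
           label = names(trt_pos)) +
  theme_classic() +
  theme(plot.margin = margin(1, 1, 4, 1, unit = "lines"),
        axis.title.x = element_blank(),
        axis.text.x = element_blank())
```

Printing lv before plotting is a quick way to confirm which bar sits at which x position.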
