Plotting individual observations and group means with facets with ggplot2 - r

I'm trying to plot data from a study with three within-subjects variables (test item, frame, sample size) in ggplot. I have summarised the data and plotted test item on the x axis, with separate lines for sample size, and used facet_grid to separate the two frame conditions. I summarised the data to create within-subjects 95% CI error bars. I'd also like to underlay individual participants' lines. All the advice I have found so far doesn't explain how to plot individual and grouped data when the data are facetted. Everything I have tried looks messy and doesn't clearly show individuals' curves/lines.
Is there a way to do this?
I've considered splitting the data by the facetted conditions and plotting separately but if there is an easier way I would like to find it!
Here's some of the data:
human_exp1 <- structure(list(sample_size = structure(c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("2", "8", "20"), class = "factor"),
sampling_frame = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L), .Label = c("category", "property"), class = "factor"),
test_item = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L
), .Label = c("1", "2", "3", "4", "5", "6"), class = "factor"),
id = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L,
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L,
17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L,
12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L,
6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L,
19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L,
14L, 15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L), .Label = c("1",
"2", "3", "4", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20", "21", "22", "23",
"24", "25", "26", "27", "28", "29", "30", "31", "32", "33",
"34", "35", "36", "37", "38", "39", "40", "41", "42", "43",
"44", "45", "46", "47", "48", "49", "50", "85", "86", "87",
"88", "89", "90", "91", "92", "93", "94", "95", "96"), class = "factor"),
response = c(0.75, 0.25, 0.4, 0.5, 0.3, 0.55, 0.65, 0.4,
0.3, 0.5, 0, 0.15, 0.65, 0.65, 0.5, 0.65, 0.8, 0.65, 0.65,
0.75, 0.15, 0.35, 0.6, 0.15, 0.3, 0.5, 0.1, 0.3, 0.5, 0,
0.25, 0.45, 0.75, 0.7, 0.45, 0.65, 0.75, 0.75, 0.3, 0.1,
0.25, 0.15, 0.2, 0.3, 0.35, 0.05, 0.3, 0.5, 0, 0.15, 0.5,
0.1, 0.35, 0.25, 0.5, 0.5, 0, 0.25, 0, 0.3, 0.1, 0.15, 0.35,
0.2, 0, 0.3, 0.5, 0, 0.1, 0.5, 0, 0.3, 0.1, 0.7, 0.45, 0,
0.25, 0, 0.35, 0.1, 0.15, 0.3, 0.1, 0, 0.2, 0.25, 0, 0.1,
0.5, 0, 0.15, 0.3, 0.7, 0.4, 0, 0.05, 0.1, 0.3, 0.1, 0, 0.3,
0.05, 0, 0.25, 0.25, 0, 0.15, 0.5, 0, 0.1, 0, 0.75, 0.6,
0, 0.75, 0.3, 0.9, 0.3, 0.2, 0.95, 0.6, 0.7, 0.6, 0.5, 0,
0, 0.5, 0.9, 0.8, 0.9, 0.75, 0.7, 0.8, 0.5, 0.25, 0.1, 0.05,
0, 0.65, 0.5, 0.3, 0.8, 0.5, 0, 0, 0.5, 0.4, 0.85, 0.5, 0.55,
0.55, 0.35, 0.3, 0.2, 0.15, 0.05, 0, 0.3, 0.15, 0.05, 0.45,
0.5, 0, 0, 0.5, 0.45, 0.55, 0.3, 0.35, 0.4, 0.3, 0.15, 0.2,
0.15, 0, 0, 0.3, 0.1, 0, 0.3, 0.5, 0, 0, 0.5, 0.35, 0.35,
0.25, 0.3, 0.5, 0.35, 0.05, 0.2, 0, 0, 0.05, 0.3, 0.05, 0,
0.3, 0.5, 0, 0, 0.5, 0, 0.55, 0, 0.3, 0.35, 0.2, 0.1, 0.2,
0, 0, 0, 0.3, 0.05, 0, 0.25, 0.5, 0, 0, 0.5, 0, 0.55, 0,
0.25, 0.5, 0.25, 0.8, 0.4, 0.75, 0.7, 0.45, 0.95, 0.85, 0.55,
0.7, 0.5, 0, 0.5, 0.8, 0.8, 0.95, 1, 0.8, 0.7, 1, 0.9, 0.2,
0.7, 0.75, 0.25, 0.7, 0.6, 1, 0.7, 0.5, 0, 1, 0.8, 0.9, 0.8,
0.75, 0.8, 0.85, 1, 0.25, 0.1, 0.2, 0.15, 0.25, 0.6, 0.2,
0, 0.45, 0.5, 0, 0.5, 0.7, 0.35, 0.45, 0.25, 0.75, 0.4, 0.2,
0.1, 0.15, 0.65, 0.1, 0.2, 0.55, 0.05, 0, 0.4, 0.5, 0, 0.5,
0.6, 0.35, 0.35, 0, 0.7, 0.45, 0, 0.1, 0.15, 0.15, 0.15,
0.05, 0.55, 0, 0, 0.35, 0.25, 0, 0.5, 0.55, 0.35, 0.2, 0,
0.8, 0.45, 0, 0.05, 0, 0.6, 0.25, 0.1, 0.5, 0, 0, 0.35, 0.25,
0, 0.5, 0.45, 0.35, 0.2, 0, 0.75, 0.4, 0.1, 0.9, 0.5, 0.95,
0.55, 0.4, 1, 0.65, 0.75, 0.6, 0.5, 0, 0.5, 0.75, 0.85, 0.95,
0.9, 0.6, 0.85, 0.75, 0.5, 0.5, 0.95, 0.3, 0.3, 0.55, 0.45,
0.35, 0.9, 0.5, 0, 0, 0.25, 0.65, 0.9, 0.25, 0.75, 0.65,
0.25, 0.2, 0.2, 0.1, 0.05, 0, 0.1, 0.15, 0.05, 0.4, 0.5,
0, 0, 0.45, 0.4, 0.55, 0.1, 0.5, 0.5, 0.2, 0.1, 0.2, 0.4,
0, 0, 0.1, 0.05, 0, 0.2, 0.5, 0, 0, 0.35, 0.35, 0.55, 0.1,
0.35, 0.4, 0.15, 0.1, 0.2, 0, 0, 0, 0.05, 0, 0, 0.2, 0.5,
0, 0, 0.15, 0, 0.55, 0, 0.2, 0.45, 0.15, 0.05, 0.25, 0, 0,
0, 0.05, 0, 0, 0.2, 0.5, 0, 0, 0.3, 0, 0.55, 0, 0.3, 0.35,
0.05, 0.8, 0.15, 0.8, 0.8, 0.75, 1, 0.7, 0.5, 0.95, 0.5,
0, 0.5, 0.9, 0.85, 1, 1, 1, 0.8, 1, 1, 0.15, 0.75, 0.8, 0.4,
1, 0.5, 1, 0.85, 0.5, 0, 1, 0.85, 1, 0.85, 0.9, 0.9, 0.85,
1, 0.1, 0, 0.25, 0.3, 0.4, 0.65, 0, 0, 0.6, 0.5, 0, 0, 0.75,
0.65, 0.65, 0.45, 0.7, 0.5, 0, 0.1, 0, 0.2, 0.3, 0.4, 1,
0, 0, 0.6, 0.5, 0, 0, 0.7, 0.35, 0.55, 0, 0.85, 0.3, 0, 0.1,
0, 0.25, 0.25, 0.1, 0.65, 0, 0, 0.65, 0.25, 0, 0, 0.65, 0.35,
0.3, 0.05, 0.85, 0.3, 0, 0.05, 0, 0.15, 0.25, 0.1, 0.5, 0,
0, 0.45, 0.25, 0, 0, 0.6, 0.35, 0.3, 0, 0.65, 0.25, 0, 0.95,
0.6, 1, 0.75, 0.65, 0.5, 0.55, 0.9, 0.8, 0.5, 0, 1, 0.9,
0.95, 1, 0.95, 0.5, 0.85, 0.8, 0.5, 0.55, 0.95, 0.45, 0.55,
0.5, 0.4, 0.35, 0.8, 0.5, 0, 0, 0.35, 0.65, 1, 0.45, 0.5,
0.55, 0.25, 0.15, 0.3, 0.25, 0.15, 0, 0, 0, 0, 0.35, 0.5,
0, 0, 0.4, 0.35, 0.5, 0.05, 0.25, 0.4, 0, 0.05, 0.2, 0.45,
0, 0, 0, 0, 0, 0.25, 0.5, 0, 0, 0.3, 0.35, 0.5, 0, 0, 0.35,
0, 0.05, 0.25, 0, 0, 0, 0, 0, 0, 0.15, 0.5, 0, 0, 0.15, 0,
0.5, 0, 0, 0.3, 0, 0.05, 0.25, 0, 0, 0, 0, 0, 0, 0.2, 0.5,
0, 0, 0.15, 0, 0.5, 0, 0, 0.35, 0)), row.names = c(NA, -684L
), class = c("tbl_df", "tbl", "data.frame"))
I used summarySEwithin to summarise the data:
within <- Rmisc::summarySEwithin(data = human_exp1, measurevar = "response",
withinvars = c("sample_size", "sampling_frame", "test_item"),
idvar = "id")
I used the summarised data to plot the group means in ggplot, mainly so that I could show within-subjects confidence intervals for the means.
pd <- position_dodge(0.1)
ggplot(within, aes(x=test_item, y=response, colour=factor(sample_size), group=factor(sample_size)))+
geom_point(position=pd, size=5)+
geom_line(position=pd, size = .8)+
facet_grid(cols = vars(sampling_frame))+
geom_errorbar(aes(ymin=response-ci, ymax=response+ci), width=1, position=pd, size=1)+
ylim(0, 1)+
theme_bw()+
scale_x_discrete(
breaks=c("1","2","3", "4", "5", "6"),
labels=c("S1", "S2", "T1", "T2", "T3", "T4")
)+
# theme(legend.position = c(.9, .85))+
labs(x = "Test Item", y = "Generalisation Response")
I then summarised the data and grouped by all the grouping variables including id
gd <- human_exp1 %>%
group_by(id, test_item, sample_size, sampling_frame) %>%
summarise(response = mean(response))%>%
ungroup()
gd
I then tried many different versions of geom_line() with the gd summarised data to add individual lines.
Any help would be much appreciated. I would like the individual lines to appear as faint grey lines behind the group mean lines.
Here is what I have with the within-subjects grouped data
Here is what I get when I try to add individual lines with geom_line(data = human_exp1, aes(x=test_item, y=response, group=id))

Is this what you want? I grouped the individual lines by both id and sample_size to get single lines:
ggplot(within, aes(x=test_item, y=response, colour=factor(sample_size), group=factor(sample_size)))+
geom_point(position=pd, size=5)+
geom_line(position=pd, size = .8)+
facet_grid(cols = vars(sampling_frame))+
geom_errorbar(aes(ymin=response-ci, ymax=response+ci), width=1, position=pd, size=1)+
ylim(0, 1)+
theme_bw()+
scale_x_discrete(
breaks=c("1","2","3", "4", "5", "6"),
labels=c("S1", "S2", "T1", "T2", "T3", "T4")
)+
# theme(legend.position = c(.9, .85))+
labs(x = "Test Item", y = "Generalisation Response") +
geom_line(data=human_exp1, alpha=0.2, color="black", aes(x=test_item, y=response, group=interaction(id,sample_size)))
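ggplot2 draws layers in the order they are added, so a layer added last sits on top of everything before it. To get the faint individual lines behind the group means, as the question asks, one option is to add the individual-level geom_line() first. The following is only a minimal sketch under stated assumptions: toy is a made-up stand-in for human_exp1 with the same column names, and plain cell means from aggregate() replace the Rmisc::summarySEwithin() summary, so there are no error bars here.

```r
library(ggplot2)

# Hypothetical toy data with the same columns as human_exp1 (random values).
set.seed(1)
toy <- expand.grid(
  id = factor(1:5),
  test_item = factor(1:6),
  sample_size = factor(c(2, 8, 20)),
  sampling_frame = factor(c("category", "property"))
)
toy$response <- runif(nrow(toy))

# Plain cell means; the original post uses Rmisc::summarySEwithin() instead.
toy_means <- aggregate(response ~ test_item + sample_size + sampling_frame,
                       data = toy, FUN = mean)

p <- ggplot(toy_means, aes(x = test_item, y = response,
                           colour = sample_size, group = sample_size)) +
  # Individual lines go first so they are drawn underneath the group layers.
  # One line per participant and sample size; the facet variable is in toy too.
  geom_line(data = toy,
            aes(group = interaction(id, sample_size)),
            colour = "grey60", alpha = 0.4) +
  # Group means on top.
  geom_line() +
  geom_point(size = 3) +
  facet_grid(cols = vars(sampling_frame)) +
  ylim(0, 1) +
  theme_bw()

p
```

Because both data frames contain sampling_frame, each individual line ends up in the matching facet without any manual splitting. Mapping colour = sample_size inside the individual layer instead of fixing it to grey would tint each participant's line by condition, which may or may not be less cluttered.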

Is this what you are looking for?
library(dplyr)
library(ggplot2)
within %>%
ungroup() %>%
group_by(test_item, sample_size) %>%
summarise(mean = mean(response), ci = sd(response)) -> smry
pd <- "jitter"
ggplot(within, aes(x = test_item, y = response)) +
geom_point(aes(colour = sample_size), position = pd) +
geom_errorbar(
data = smry,
mapping = aes(y = mean, ymin = mean - ci, ymax = mean + ci),
size = 1
)+
facet_grid(cols = vars(sampling_frame)) +
ylim(0, 1) +
scale_x_discrete(
breaks = c("1","2","3", "4", "5", "6"),
labels = c("S1", "S2", "T1", "T2", "T3", "T4")
) +
labs(x = "Test Item", y = "Generalisation Response") +
theme_bw()
# theme(legend.position = c(.9, .85))+

Related

How to apply p-value for each group of dataframe in R using facet_wrap in ggpubr

I have a data that looks like this:
melted.df <- structure(list(Time = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L,
4L, 4L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L), .Label = c("24",
"36", "48", "72"), class = "factor"), id = c(1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L,
19L, 20L, 21L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L,
12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 1L, 2L, 3L,
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L,
18L, 19L, 20L, 21L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L), Samples = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L), .Label = c("WT_Ago2_800", "WT_Ago2_400", "WT_Ago2_200",
"WT_Ago4_800"), class = "factor"), Size = c(0, 0, 0, 0, 0, 0,
0.3, 0, 0, 0.1, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0.8, 0.5, 0, 0,
0, 0, 0, 0, 0.1, 0.65, 0.2, 0.85, 0.725, 0.575, 0.1, 1.1, 0.9,
1.325, 1, 0.8, 0.5, 2.2, 1.65, 0, 0, 0, 0, 0, 0, 0.825, 1.175,
0.1, 0.55, 0.85, 0.85, 1.1, 1.4, 0.6, 0.95, 1.15, 0.975, 2.35,
1.15, 2.1, 0, 0, 0, 0, 0, 0, 0.65, 1.4, 0.55, 0.1, 0.7, 1.1,
0.95, 1.85, 0.85, 0.1, 1.5, 1.25, 1.8, 1.75, 2.15)), row.names = c(NA,
-84L), class = "data.frame")
This data consists of 4 time points (24, 36, 48 and 72 hours). I want to use the code below to compute the p-values (stat.test) for each level in time.levels and show them in the corresponding facet_wrap panel. For i=1 there is no p-value, so there is nothing to add to the figure; for i=2 you do get p-values on the figure. The problem is that I couldn't get the p-values applied to their respective facets: the same p-values appear in every facet. How can I resolve this?
code:
library(devtools)
# install_github("https://github.com/kassambara/rstatix")
library(rstatix) # https://github.com/kassambara/rstatix
library(stringi)
library(ggpubr)
time.levels <- levels(melted.df$Time)
stat.test <- NULL
for (i in 1:length(time.levels)){
stat.test <- aov(Size ~ Samples, data = melted.df[melted.df$Time == time.levels[i],]) %>%
tukey_hsd()
# stat.test <- rbind(stat.test, tmp.stat)
bp <- ggboxplot(melted.df, x = "Samples", y = "Size") +
facet_wrap(vars(Time))+
stat_pvalue_manual(
stat.test, label = "p.adj",
y.position = c(2, 2.5, 3, 3.5, 3.8, 4)
)
bp
}
Note. All your values in Size for Time == 24L are zero:
> filter(melted.df, Time == 24L) %>% select(Size) %>% summary
Size
Min. :0
1st Qu.:0
Median :0
Mean :0
3rd Qu.:0
Max. :0
If you wish to proceed anyway, you should make the plots individually and then use gridExtra::grid.arrange:
library(gridExtra)
bp <- vector("list", length = length(time.levels))
for (i in seq_along(time.levels)) {
sdf <- melted.df[melted.df$Time == time.levels[i],]
stat.test <- aov(Size ~ Samples, data = sdf) %>%
tukey_hsd()
bp[[i]] <- ggboxplot(sdf, x = "Samples", y = "Size") +
facet_wrap(vars(Time))+
stat_pvalue_manual(
stat.test, label = "p.adj",
y.position = c(2, 2.5, 3, 3.5, 3.8, 4)
)
}
do.call(grid.arrange, bp)
Note that you have to use the subset data.frame sdf as the input for ggboxplot.
You don't need to use gridExtra::grid.arrange.
Here is a clean solution.
library(rstatix) # latest version
library(ggpubr) # latest version
stat.test <- melted.df %>%
group_by(Time) %>%
tukey_hsd(Size ~ Samples)
ggboxplot(melted.df, x = "Samples", y = "Size", facet.by = "Time") +
stat_pvalue_manual(
stat.test, label = "p.adj",
y.position = c(2, 2.5, 3, 3.5, 3.8, 4)
)
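The grouped call is what makes the per-facet placement possible: the result keeps the grouping column, so each row of p-values can be matched to its panel. As a small sketch of that idea, using the built-in ToothGrowth data as a stand-in for melted.df (an assumption for illustration, with supp playing the role of Time and dose the role of Samples):

```r
library(dplyr)
library(rstatix)

# Built-in data standing in for melted.df (illustration only).
df <- ToothGrowth
df$dose <- as.factor(df$dose)

# One Tukey HSD per level of supp, returned as a single table.
stat.test <- df %>%
  group_by(supp) %>%
  tukey_hsd(len ~ dose)

# The grouping column travels with group1, group2 and p.adj.
names(stat.test)
```

If the faceting variable in the plot has the same name as this grouping column, the rows can be assigned to the right panel; a table computed on a single subset has no such column, which is consistent with the same p-values showing up in every facet in the question.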

How to create two y-axis labels with a grid of facets with a single x-axis label

I have been struggling with ggplot to display these plots how I would like. My data have 2 factors, quarter and species. Station will be on the x-axis, value on the y-axis, and the constituent will be used with the facet_wrap. I want quarter differentiated with shapes, and species with colors.
The issue is I'm trying to replicate a figure done in SigmaPlot. It is a 4x4 grid of plots, with the first two rows of the first column left empty to allow for the placement of the legend. My original plan was to have two separate facets made using facet_wrap and combine those; however, this doesn't maintain the 4x4 arrangement. It turns it into a 1x2 layout, which ruins the alignment of the plots and shrinks the larger faceted grid.
My next thought was to create each plot individually, then arrange them in a grid using cowplot. This presents the plots how I'd like them arranged, but I can't figure out how to have two y-axis labels, due to different units. One label would be centered on the two leftmost plots, and one centered on the left of the next column of 4 plots.
I'm trying to use this code (just copy the example data below, and run):
library(ggplot2)
library(gridExtra)
test.data1 <- test.data[1:95, ]
test.data2 <- test.data[96:111, ]
testplot1 <- ggplot(test.data1, aes(Station, value)) +
geom_point(aes(shape = factor(quarter), fill = Species)) +
scale_shape_manual(values = c(21, 22)) +
labs(x = "Station", y = "Unit a", shape = "Sampling Quarter", fill = "Species") +
theme(legend.position = "none", legend.title = element_blank()) +
guides(fill = guide_legend(override.aes = list(shape = 21), nrow = 2, byrow = TRUE), shape = guide_legend(nrow = 2, byrow = TRUE)) +
facet_wrap( ~ constituent, ncol = 3, scales = "free_y")
testplot2 <- ggplot(test.data2, aes(Station, value)) +
geom_point(aes(shape = factor(quarter), fill = Species)) +
scale_shape_manual(values = c(21, 22)) +
labs(x = "Station", y = "Unit b", shape = "Sampling Quarter", fill = "Species") +
theme(legend.position = "top", legend.title = element_blank()) +
guides(fill = guide_legend(override.aes = list(shape = 21), nrow = 2, byrow = TRUE), shape = guide_legend(nrow = 2, byrow = TRUE)) +
facet_wrap( ~ constituent, ncol = 1, scales = "free_y")
grid.arrange(testplot2, testplot1, ncol = 2)
Which generates this:
But I want it to be arranged like this, where the XX and YY plots from above are normalized in size with the other plots (this was done using individual plots, and using plot_grid):
Example data from a larger set:
test.data <- structure(list(Station = structure(c(1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("StA", "StB"), class = "factor"),
CollectionDate = structure(c(3L, 2L, 3L, 1L, 3L, 1L, 3L,
1L, 3L, 2L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 2L, 3L, 1L, 3L, 1L,
3L, 1L, 3L, 2L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 2L, 3L, 1L, 3L,
1L, 3L, 1L, 3L, 2L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 2L, 3L, 1L,
3L, 1L, 3L, 1L, 3L, 2L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 2L, 3L,
1L, 3L, 1L, 3L, 1L, 3L, 2L, 3L, 1L, 3L, 1L, 1L, 3L, 2L, 3L,
1L, 3L, 1L, 3L, 1L, 3L, 2L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 2L,
3L, 1L, 3L, 1L, 3L, 1L, 3L, 2L, 3L, 1L, 3L, 1L, 3L, 1L), .Label = c("10/1/2017",
"10/16/2017", "4/1/2017"), class = "factor"), Species = structure(c(1L,
2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L,
1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L,
3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L,
2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L,
2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L,
1L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L,
1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L,
3L, 1L, 2L, 2L, 3L), .Label = c("SpA", "SpB", "SpC"), class = "factor"),
quarter = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("2017 Q2",
"2017 Q4"), class = "factor"), constituent = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L,
12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 13L, 13L, 13L, 13L,
13L, 13L, 13L, 13L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L
), .Label = c("A", "B", "C", "D", "E", "F", "G", "H", "I",
"J", "K", "L", "XX", "YY"), class = "factor"), value = c(16,
35, 46, 23, 40, 19, 9, 50, 0.2, 1, 0.5698, 0.322, 1, 0.45,
0.322, 0.5, 16, 9, 6, 19, 14, 13, 16, 9, 0, 0.004, 0, 0.004,
1, 0.32, 1, 0.678, 0, 0.39, 0.23, 0, 0, 1.1, 0.5, 0.5, 9,
4.9, 7, 4.768, 9, 8.65, 4.768, 6.54, 195, 195, 46, 46, 124,
124, 218, 218, 2, 1, 1, 1, 1, 2, 1, 1, 0.1, 0.4, 0.22, 0.4,
0.22, 0.4, 0.22, 0.1, 0.99, 0.99, 1.2, 0.45, 0.765, 0.99,
0.99, 0.99, 0.99, 1.2, 4.3, 0.98, 0.99, 1.2, 1.2, 34, 34,
65, 98, 150, 34, 65, 65, 2, 0, 4, 1.3, 5, 3.3, 1.56, 1, 9,
0.36, 4, 4, 11, 2, 2.22, 11)), class = "data.frame", row.names = c(NA,
-111L))

Incorrect shape and fill of ggplot legend

I am trying to make a plot showing the cumulative mortality across different feeding levels and temperatures. I have managed to get the graph to look correct, but the legend does not match. I'm sure my code is overly complicated, but it is the only way I have found to achieve the correct visual. I would like the different temperatures to be represented by different shapes and I would like the different feeding levels to be represented by solid or hollow fill (but the same shape as the corresponding temperature). The feeding level only applies at 2 and 5 degrees.
As seen below, the feeding level on the graph does show solid and hollow points, but on the legend it does not. I would also like all the points shown below 'Temperature' in the legend to be solid.
Here is my code:
ggplot(ac_tank_cumulative_mort_summary, aes(Day, mean, shape = factor(Target_Temp),fill=factor(Feeding))) +
geom_point(stat = "identity",size=3.5,color="black") +
geom_line() +
scale_y_continuous(limits = c(0,100)) +
scale_shape_manual(name="Temperature (ºC)",labels=c("0 ºC","2 ºC","5 ºC","7 ºC","9 ºC"),values=c(21,22,23,24,25)) +
scale_fill_manual(name="Feeding",labels=c("High Food","Low Food"),values=c("black","white")) +
xlab("Day of Experiment") +
ylab("Cumulative Mortality (%)") +
ggtitle("Cumulative Percent Mortality \n of Later Stage Arctic Cod") +
theme_bw() +
theme(axis.text = element_text(size = 16, color='black'), axis.title = element_text(size = 16, face = "bold"), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), legend.text = element_text(size = 16), legend.title = element_text(size = 16, face = "bold"), plot.title = element_text(size = 18, face = "bold",hjust=0.5))
It produces a graph that looks like this:
A reproducible example:
structure(list(Target_Temp = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
9L, 9L, 9L), Feeding = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L), Day = c(0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 0L, 1L, 2L,
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L,
17L, 18L, 19L, 20L, 0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 0L, 1L, 2L,
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L,
17L, 18L, 19L, 0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L,
12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 0L, 1L, 2L, 3L, 4L, 5L,
6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L,
19L, 0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L,
14L, 15L), N = c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), mean = c(0, 5.23026878966667, 15.1164184233333, 25.0941619566667,
31.02208526, 37.00051361, 39.6671802766667, 43.1015237133333,
46.0402328333333, 49.4934086633333, 52.9560006833333, 54.0859441866667,
55.7620270533333, 57.4569423066667, 57.4569423066667, 58.6369757233333,
60.40180446, 60.9865997833333, 60.9865997833333, 61.58183788,
62.77231407, 0, 5.483691307, 9.37154714766667, 18.2054598866667,
19.8012953533333, 23.7363121066667, 26.4029787733333, 31.31040864,
34.08436863, 36.8583286166667, 38.5098588766667, 39.0474932833333,
40.1227621, 40.66039651, 42.91086732, 43.52815127, 43.52815127,
43.52815127, 43.52815127, 44.1454352233333, 44.1454352233333,
0, 5.32153032133333, 12.76963777, 24.6092796066667, 32.63939764,
41.1558811566667, 43.8225478233333, 47.6157916166667, 49.2030932033333,
52.46723647, 53.0227920266667, 53.57834758, 53.57834758, 54.09116809,
54.6759634133333, 58.1083346633333, 58.6931299833333, 59.2779253066667,
60.90874968, 61.4643052366667, 62.0019396466667, 0, 5.84092792033333,
18.993371995, 28.6523059933333, 32.47031207, 32.9397956366667,
34.27312897, 37.6423639866667, 39.56172328, 43.4211543733333,
44.4015465333333, 46.8111019033333, 49.2206572766667, 49.6901408466667,
50.1803369233333, 53.25726, 56.33418308, 58.2949673933333, 61.3040171633333,
64.8485118866667, 0, 13.1614526916667, 22.6657863833333, 31.59793698,
38.60881636, 41.6744537733333, 42.0077871066667, 42.0077871066667,
45.0654117, 47.1327193933333, 47.6455399066667, 48.6711809333333,
49.6534850133333, 54.2688696266667, 55.3529521233333, 57.4086130966667,
59.4730877466667, 60.9999279933333, 61.54637608, 61.54637608,
0, 11.9041826816667, 20.52782238, 29.3914919133333, 31.8773415066667,
32.9526103233333, 34.2859436566667, 34.9395384266667, 34.9395384266667,
35.5931332, 36.8751844833333, 40.96136817, 43.3312574, 45.71371576,
51.05476461, 65.1266928133333, 73.21981707, 77.7511211166667,
79.56133673, 80.8065805933333, 0, 40, 87.5, 90, 90, 92.5, 92.5,
92.5, 92.5, 92.5, 92.5, 92.5, 92.5, 92.5, 92.5, 92.5), sd = c(0,
1.97372461784689, 5.80473942192512, 12.9273738611295, 18.7980078654077,
26.2827168030405, 24.7758104083834, 25.1241212927305, 27.2873150523553,
27.4420059335036, 27.4630003540068, 26.461746133237, 25.0082449931503,
23.6407105265119, 23.6407105265119, 22.6249375824511, 21.0314195173936,
20.4868619144685, 20.4868619144685, 20.0127776006859, 19.1960498751,
0, 0.77902264946986, 1.23794275560703, 3.34864008798032, 3.2335928525848,
5.36060756499091, 5.77844904674583, 4.44080298306709, 4.09533840068484,
3.9968905491584, 4.06201390276555, 3.24181683679238, 1.93161303094676,
1.71565205207569, 2.64834297675624, 2.66829911911964, 2.66829911911964,
2.66829911911964, 2.66829911911964, 3.08417845208611, 3.08417845208611,
0, 0.910533229473384, 2.72480540999204, 6.65751125731073, 8.33148895279841,
9.0007033472894, 9.57771603598199, 11.7084213217991, 12.9037012862438,
14.5648056008555, 15.1859834728865, 15.8412902763179, 15.8412902763179,
14.9818268027929, 15.7149156535738, 18.1464512942252, 18.9773239417533,
19.8251322063165, 18.2151433915608, 18.2852656545651, 17.440878798362,
0, 3.00352826170525, 9.15977839384524, 9.07674417328714, 7.17760460594045,
6.49258187655286, 6.47302852615445, 5.18715642772437, 4.46042901997611,
3.92629364971166, 4.75362293709948, 4.62667657495029, 5.55770752177903,
5.01285084668542, 4.53061089193868, 7.36285237297975, 9.36455421131727,
6.89234052824489, 6.90699192099218, 6.4916961180775, 0, 4.40845912544074,
6.13942103207389, 6.68730640525036, 9.00051763429896, 11.6269296244466,
11.4835770151044, 11.4835770151044, 12.6472892332494, 13.3450448267859,
12.8661920432309, 12.0486504655982, 11.2113054845842, 10.9471632374873,
11.8734160758113, 10.9393380542906, 10.181245628949, 9.03442659821049,
9.64562925636332, 9.64562925636332, 0, 8.39261284702131, 9.64654877195206,
12.3498744833651, 13.5156278842202, 13.1057317249229, 13.3307327422633,
14.420694327421, 14.420694327421, 15.5166855751566, 14.066097040009,
12.5037531860345, 12.3444832323343, 14.1519399447546, 10.2648576647302,
5.60999878946208, 2.3909883546916, 5.27289282704079, 5.52418422785351,
6.39683309409102, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA), se = c(0, 1.13953043942009, 3.3513678678241,
7.46362277863804, 10.8530349013219, 15.1743336212701, 14.3043208086713,
14.5054181915108, 15.7543386909395, 15.8436495128116, 15.8557706471406,
15.2776962532519, 14.4385169787554, 13.6489705863157, 13.6489705863157,
13.0625138036266, 12.1424957198072, 11.8280952411691, 11.8280952411691,
11.5543825349881, 11.0828445627664, 0, 0.449768936376239, 0.714726583191068,
1.93333825621461, 1.86691570388948, 3.09494822066744, 3.33618911263724,
2.56389879769188, 2.36444472805801, 2.30760583447808, 2.34520482021369,
1.87166382338554, 1.1152173033873, 0.990532174101631, 1.52902153053667,
1.54054321470217, 1.54054321470217, 1.54054321470217, 1.54054321470217,
1.78065125954076, 1.78065125954076, 0, 0.525696605142557, 1.57316713694826,
3.84371591654131, 4.81018738964856, 5.1965585004535, 5.529696931596,
6.75986020192626, 7.4499554111554, 8.40899443434848, 8.76763164598026,
9.14597320534316, 9.14597320534316, 8.64976173754481, 9.07301078288308,
10.4768585395573, 10.95656308627, 11.4460454160367, 10.5165179404452,
10.557003047867, 10.0694960691379, 0, 1.73408785041418, 5.28840052140387,
5.2404606918127, 4.14399195137642, 3.74849389416348, 3.7372047620474,
2.99480615987537, 2.57522989538443, 2.26684669557854, 2.74450548236037,
2.67121296600089, 3.20874393377633, 2.89417078574127, 2.61574941805425,
4.25094479954333, 5.40662789474487, 3.97929465932875, 3.98775364487541,
3.74798250126929, 0, 2.54522506278467, 3.54459638553631, 3.8609181532248,
5.19645127900848, 6.71281094852307, 6.63004628093031, 6.63004628093031,
7.30191584333562, 7.70476522309248, 7.42829943960477, 6.95629158968482,
6.47285023949184, 6.32034764202606, 6.85511996757007, 6.31582977040099,
5.87814490455939, 5.21602862845074, 5.56890664766469, 5.56890664766469,
0, 4.84547728643206, 5.56943753023738, 7.13020335742892, 7.80325139722136,
7.56659773931127, 7.69650213724068, 8.32579175183782, 8.32579175183782,
8.95856259374746, 8.12106491249661, 7.21904526783764, 7.12709071719501,
8.17062633665946, 5.92641833592515, 3.23893431124941, 1.38043777021046,
3.04430609310005, 3.18938925100431, 3.69321330883456, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("Target_Temp",
"Feeding", "Day", "N", "mean", "sd", "se"), row.names = c(NA,
-139L), class = "data.frame")
As aosmith pointed out in the comments, you can add the following guides() call so that the legend keys are drawn with a shape that has a fill:
guides(fill = guide_legend(override.aes = list(shape = 21)),
shape = guide_legend(override.aes = list(fill = "black")))
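As far as I understand it, the reason this is needed is that the fill legend draws its keys with the point layer's default shape (19), which ignores fill, so both keys come out solid black. Shapes 21 to 25 are the ones that honour fill. Here is a minimal sketch on made-up data; the toy data frame and its values are hypothetical stand-ins for ac_tank_cumulative_mort_summary, not the real data:

```r
library(ggplot2)

# Hypothetical toy data: 2 temperatures x 2 feeding levels x 4 days
toy <- expand.grid(Day = 0:3, Target_Temp = c(0, 5), Feeding = 1:2)
toy$mean <- with(toy, Day * (Target_Temp + 5) * Feeding)

p <- ggplot(toy, aes(Day, mean,
                     shape = factor(Target_Temp),
                     fill = factor(Feeding))) +
  geom_line() +
  geom_point(size = 3.5, color = "black") +
  scale_shape_manual(name = "Temperature", values = c(21, 22)) +
  scale_fill_manual(name = "Feeding", values = c("black", "white")) +
  # Fill legend: force a fillable shape so black vs. white is visible.
  # Shape legend: give the keys a fill so they are not drawn hollow.
  guides(fill = guide_legend(override.aes = list(shape = 21)),
         shape = guide_legend(override.aes = list(fill = "black")))
```

Printing p with and without the guides() call should make the difference in the two legends easy to see.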

boxplot with multiple factor labels using base R functions

How can I reproduce the ggplot2-based boxplot shown in this answer using the base R boxplot() function instead?
Sample data from the above link:
d<-data.frame(x=rnorm(1500),f1=rep(seq(1:20),75),f2=rep(letters[1:3],500))
# first factor has 20+ levels
d$f1<-factor(d$f1)
# second factor a,b,c
d$f2<-factor(d$f2)
boxplot(x~f2*f1,data=d,col=c("red","blue","green"),frame.plot=TRUE,axes=FALSE)
It would be great if the groups on the x-axis were spaced apart from each other.
I have limited knowledge about ggplot2.
EDIT
While waiting for more suggestions using base R functions, I am making some progress with ggplot2.
Using the sample data below, how can I produce a plot with an x-axis as well aligned as the one in the link above?
The following does not give me the correct alignment (I want the numbers 1:8 aligned at the center of each group):
library(ggplot2)
ggplot(dat3, aes(x = ID, y = values, group=interaction(obs, ID), fill=obs)) +
geom_boxplot() +
scale_fill_manual(values = c("yellow", "orange"))
dat3=structure(list(values = c(0, 0, 0, 0, 0, 0, 0, 0, -0.0169491525423729,
0, 0, 0, 0, 1, 1, 0.64367816091954, 0.64367816091954, 0, 0, -0.0163934426229508,
-0.021978021978022, 0.109195402298851, 0, 0, 0, 0, 0.207650273224044,
0.4375, 0, 0, 0, 0, 0.302325581395349, 0.303370786516854, 0.270588235294118,
-0.0188679245283019, 0.156462585034014, 0.092436974789916, 0.69,
-0.021978021978022, 0.64367816091954, 0.614906832298137, 0.612903225806452,
0.274853801169591, 0, 0.303370786516854, 0, 0, -0.03125, 0.229813664596273,
0.557142857142857, 0, 0.109195402298851, 0.0746268656716418,
0.180616740088106, 0.210526315789474, 0.310344827586207, 1, 1,
0.0825688073394495, 0.294117647058824, 0, 0.4375, 0, 0.230769230769231,
0.347826086956522, -0.0163934426229508, 0.156462585034014, 0,
0, 0, 1, 0, 0, 0, 0.483333333333333, 0.483333333333333, 0, 0,
0, 0, 0, -0.0169491525423729, 0, 0.310344827586207, 0, 0.296875,
0.302325581395349, 0, 0, 0, 0, 0, 0, 0.482758620689655, 0, 0,
0, 0, 0, 0, 0, 0, 0.150684931506849, 0.150684931506849, 0, 0,
-0.021978021978022, -0.021978021978022, 0.270588235294118, 0,
0, 0.482758620689655, 0.482758620689655, 0.272727272727273, 0.272727272727273,
0, 1, 0, 0, 0.642857142857143, 0.211864406779661, 0.156462585034014,
-0.0449438202247191, -0.0449438202247191, 0.389763779527559,
0.389763779527559, -0.021978021978022, 0.211864406779661, 0.213197969543147,
0.213197969543147, 0.358620689655172, -0.0163934426229508, 0.483333333333333,
0, 0, 0.362139917695473, 0.362139917695473, 0.261904761904762,
0.483333333333333, 1, 1, 0.236453201970443, 0.302325581395349,
0.310344827586207, 1, 1, 0.358974358974359, 0.358974358974359,
-0.0606060606060606, 0.0721649484536082, 0.615384615384615, 0.615384615384615,
0.347826086956522, 1, 0, 0, 0, -0.0273972602739726, -0.0273972602739726,
-0.0169491525423729, -0.0256410256410256, 0.107142857142857,
0.107142857142857, 0.302325581395349, -0.0163934426229508, -0.0264900662251656,
0.311111111111111, 0.311111111111111, 0.156462585034014, 0.156462585034014,
-0.0483091787439614, 0.311111111111111, -0.0333333333333333,
-0.0333333333333333, 0.311111111111111), ind = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L), .Label = c("ETS",
"ETS.1", "ETS.2", "ETS.3", "ETS.4", "ETS.5", "ETS.6", "ETS.7"
), class = "factor"), ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L), .Label = c("4", "5",
"6", "7", "8", "9", "10", "11"), class = "factor"), obs = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("obs",
"capa"), class = "factor")), .Names = c("values", "ind", "ID",
"obs"), row.names = c(NA, 176L), class = "data.frame")
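You can specify the positions of the boxes with the at argument of boxplot(). Here every fourth position is left empty, which puts a gap after each group of three boxes, and the axis labels go at the middle box of each group: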
set.seed(1)
d<-data.frame(x=rnorm(1500),f1=rep(seq(1:20),75),f2=rep(letters[1:3],500))
# first factor has 20+ levels
d$f1<-factor(d$f1)
# second factor a,b,c
d$f2<-factor(d$f2)
boxplot(x~f2*f1,data=d, at = (1:80)[-4*(1:20)], col=c("red","blue","green"),frame.plot=TRUE,axes=FALSE)
axis(1,at=seq(2,80,4),labels=1:20,cex.axis=0.7)
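If the number of levels changes, the same positions can be computed instead of hard-coded. This is only a sketch of the arithmetic; the names n_inner, n_outer, box_at and label_at are mine, not part of boxplot():

```r
n_inner <- 3   # levels of f2 (boxes per group)
n_outer <- 20  # levels of f1 (number of groups)

# One slot per box plus one empty slot after each group
slots <- seq_len((n_inner + 1) * n_outer)

# Drop every (n_inner + 1)-th slot to leave a gap between groups
box_at <- slots[slots %% (n_inner + 1) != 0]

# Put each label at the centre of its group of boxes
label_at <- seq((n_inner + 1) / 2, by = n_inner + 1, length.out = n_outer)
```

With these values box_at is the same vector as (1:80)[-4*(1:20)] and label_at matches seq(2,80,4), so they can be passed as at = box_at to boxplot() and at = label_at to axis().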

drawing line segments connecting sets of points

I am trying to connect sets of (two) points at each level of x, in each facet. Here is a reproducible example:
datum <- structure(list(frequency = c(8L, 7L, 6L, 18L, 5L, 11L, 16L, 15L,
9L, 8L, 8L, 10L, 2L, 20L, 14L, 3L, 6L, 2L, 2L, 11L, 10L, 6L,
15L, 19L, 18L, 18L, 8L, 2L, 10L, 15L, 12L, 17L, 1L, 18L, 7L,
8L, 16L, 4L, 9L, 2L, 7L, 3L, 16L, 7L, 18L, 20L, 9L, 10L, 13L,
2L, 15L, 7L, 3L, 20L, 4L, 15L, 5L, 7L, 9L, 16L, 5L, 8L, 10L,
10L, 7L, 10L, 10L, 17L, 7L, 8L, 13L, 13L, 16L, 5L, 20L, 18L,
13L, 19L, 3L, 8L, 14L, 12L, 20L, 2L, 9L, 13L, 7L, 2L, 5L, 5L,
13L, 9L, 13L, 7L, 9L, 4L, 4L, 20L, 1L, 4L), band = structure(c(2L,
4L, 2L, 3L, 2L, 1L, 4L, 1L, 2L, 1L, 3L, 4L, 2L, 4L, 3L, 4L, 3L,
2L, 3L, 2L, 2L, 4L, 2L, 1L, 1L, 2L, 1L, 4L, 4L, 1L, 4L, 4L, 2L,
1L, 4L, 4L, 3L, 4L, 1L, 1L, 3L, 4L, 1L, 3L, 4L, 1L, 2L, 1L, 1L,
2L, 2L, 1L, 3L, 4L, 2L, 1L, 2L, 4L, 2L, 2L, 4L, 4L, 2L, 4L, 4L,
1L, 1L, 4L, 2L, 3L, 4L, 1L, 2L, 4L, 1L, 2L, 4L, 1L, 1L, 3L, 4L,
4L, 2L, 2L, 2L, 1L, 3L, 2L, 2L, 2L, 3L, 3L, 1L, 3L, 4L, 3L, 3L,
1L, 3L, 4L), .Label = c("1", "2", "3", "4"), class = "factor"),
test = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L,
2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L,
1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L,
2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 2L
), .Label = c("1", "2"), class = "factor"), knowledge = structure(c(2L,
3L, 1L, 3L, 1L, 1L, 3L, 3L, 1L, 3L, 1L, 3L, 2L, 2L, 1L, 1L,
1L, 1L, 3L, 3L, 1L, 2L, 3L, 1L, 1L, 2L, 2L, 1L, 1L, 3L, 2L,
3L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 3L, 3L, 1L, 1L, 2L, 3L,
3L, 2L, 2L, 3L, 1L, 1L, 2L, 2L, 2L, 3L, 1L, 3L, 1L, 1L, 2L,
1L, 1L, 2L, 3L, 1L, 1L, 1L, 1L, 3L, 2L, 2L, 1L, 2L, 3L, 2L,
1L, 2L, 3L, 3L, 2L, 1L, 3L, 1L, 3L, 2L, 1L, 3L, 2L, 2L, 3L,
1L, 1L, 2L, 1L, 2L, 3L, 1L, 3L, 1L), .Label = c("1", "2",
"3"), class = "factor")), .Names = c("frequency", "band",
"test", "knowledge"), row.names = c(NA, -100L), class = "data.frame")
Here is the code I have so far:
ggplot(datum, aes(knowledge, frequency, color=test)) +
stat_summary(fun.y='mean', geom='point', position=position_dodge(width=.9), size=3) +
facet_grid(~band) +
labs(y='number of words (max = 20)', x='self-report knowledge') +
scale_x_discrete(labels=c('none', 'form', 'meaning'))
Looking at the left-most facet ('1') in the graph, I would like a line to connect the pretest to posttest in the none column, another line connecting pretest to posttest in the form column, and a line connecting the pretest to the posttest in the meaning column. I would like this done in each facet.
I hope that makes sense, and thanks!
I find relying on ggplot too much for data manipulation/summarizing can hurt more than it helps. I have no idea how to connect the position-dodged points with a line. Instead, I'd do something like this:
library(dplyr)
datsum = datum %>%
group_by(band, knowledge, test) %>%
summarize(mean = mean(frequency)) %>%
ungroup %>%
mutate(knowledge_fac = factor(knowledge, labels = c('none', 'form', 'meaning')))
ggplot(datsum, aes(x = test, y = mean)) +
geom_path(aes(group = band:knowledge)) +
geom_point(aes(color = factor(test))) +
facet_grid(band ~ knowledge_fac) +
labs(y='number of words (max = 20)', x='self-report knowledge')
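A note on the group = band:knowledge part, since it is easy to miss: applied to two factors, the : operator builds a new factor with one level per combination, much like interaction(). That is what gives one path per band and knowledge pair. A small sketch with made-up factor values (not the real data):

```r
# Two hypothetical factors, three observations
band <- factor(c("1", "1", "2"))
knowledge <- factor(c("1", "2", "1"))

# ":" on factors creates the crossed factor used for grouping
g <- band:knowledge

as.character(g)  # one label per observation, e.g. "1:1"
nlevels(g)       # every band x knowledge combination is a level
```

Note that this only works when both columns are factors; for character or numeric columns, interaction(band, knowledge) is the safer choice.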
Borrowing from Gregor's work in munging the data, I think this does what was requested. The mutate() chunk creates Test to be a numeric offset of -0.1 for test 1 and 0.1 for test 2. This is then added to the numeric value of knowledge. The result is the numeric x passed to ggplot2. Gregor correctly defined the groups, so the rest is straightforward.
library(dplyr)
datsum <- datum %>%
group_by(band, knowledge, test) %>%
summarize(mean = mean(frequency)) %>%
mutate(Test = 0.1 * (2 * (test == 2) - 1),
Knowledge = as.numeric(knowledge) + Test) %>%
ungroup
ggplot(datsum, aes(x = Knowledge, y = mean, color = test)) +
geom_path(aes(group = band:knowledge), color = "black") +
geom_point(size = 3) +
facet_wrap(~ band, nrow = 1) +
labs(y='number of words (max = 20)', x='self-report knowledge') +
scale_color_manual(values = c("orange", "blue")) +
scale_x_continuous(limits = c(0.5, 3.5), breaks = 1:3,
labels = c("none", "form", "meaning"))
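To check what the mutate() step produces, the offset arithmetic can be run on its own. This is just a sketch on a hypothetical grid of factor levels (the names grid, offset and x are mine); it assumes test and knowledge are factors with levels "1", "2" and "1", "2", "3" as in the dput above:

```r
# All combinations of the two factors, built without any real data
grid <- expand.grid(test = factor(c("1", "2")),
                    knowledge = factor(c("1", "2", "3")))

# -0.1 for test 1, +0.1 for test 2
grid$offset <- 0.1 * (2 * (grid$test == 2) - 1)

# Numeric x position: knowledge level shifted left or right by the offset
grid$x <- as.numeric(grid$knowledge) + grid$offset

grid
```

Each knowledge level ends up with two x positions 0.2 apart, one on either side of the integer break, which is why scale_x_continuous() can still label the breaks 1:3 as none, form and meaning.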

Resources