Passing an external function (and arguments) to dplyr summarize or mutate - r

I am trying to summarize a large set of data with an external function (sii package).
What I need to do is calculate SII for each subject, with each system, at each presentation level.
Example data:
data <- structure(list(Subject = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L), .Label = c("1", "2"), class = "factor"), Ear = structure(c(1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("Left", "Right"), class = "factor"),
System = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L
), .Label = c("A", "B"), class = "factor"), Pres_Level = structure(c(1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L), .Label = c("55", "65", "75"
), class = "factor"), Frequency = c(125, 125, 125, 125, 125,
125, 125, 125, 125, 125, 125, 125, 125, 125, 125, 125, 125,
125, 125, 125, 125, 125, 125, 125, 160, 160, 160, 160, 160,
160, 160, 160, 160, 160, 160, 160, 160, 160, 160, 160, 160,
160, 160, 160, 160, 160, 160, 160, 200, 200, 200, 200, 200,
200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200,
200, 200, 200, 200, 200, 200, 200, 250, 250, 250, 250, 250,
250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250,
250, 250, 250, 250, 250, 250, 250, 315, 315, 315, 315, 315,
315, 315, 315, 315, 315, 315, 315, 315, 315, 315, 315, 315,
315, 315, 315, 315, 315, 315, 315, 400, 400, 400, 400, 400,
400, 400, 400, 400, 400, 400, 400, 400, 400, 400, 400, 400,
400, 400, 400, 400, 400, 400, 400, 500, 500, 500, 500, 500,
500, 500, 500, 500, 500, 500, 500, 500, 500, 500, 500, 500,
500, 500, 500, 500, 500, 500, 500, 630, 630, 630, 630, 630,
630, 630, 630, 630, 630, 630, 630, 630, 630, 630, 630, 630,
630, 630, 630, 630, 630, 630, 630, 800, 800, 800, 800, 800,
800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 800,
800, 800, 800, 800, 800, 800, 800, 1000, 1000, 1000, 1000,
1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000,
1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000,
1250, 1250, 1250, 1250, 1250, 1250, 1250, 1250, 1250, 1250,
1250, 1250, 1250, 1250, 1250, 1250, 1250, 1250, 1250, 1250,
1250, 1250, 1250, 1250, 1600, 1600, 1600, 1600, 1600, 1600,
1600, 1600, 1600, 1600, 1600, 1600, 1600, 1600, 1600, 1600,
1600, 1600, 1600, 1600, 1600, 1600, 1600, 1600, 2000, 2000,
2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000,
2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000,
2000, 2000, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500,
2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500,
2500, 2500, 2500, 2500, 2500, 2500, 3000, 3000, 3000, 3000,
3000, 3000, 3000, 3000, 3000, 3000, 3000, 3000, 3000, 3000,
3000, 3000, 3000, 3000, 3000, 3000, 3000, 3000, 3000, 3000,
3150, 3150, 3150, 3150, 3150, 3150, 3150, 3150, 3150, 3150,
3150, 3150, 3150, 3150, 3150, 3150, 3150, 3150, 3150, 3150,
3150, 3150, 3150, 3150, 4000, 4000, 4000, 4000, 4000, 4000,
4000, 4000, 4000, 4000, 4000, 4000, 4000, 4000, 4000, 4000,
4000, 4000, 4000, 4000, 4000, 4000, 4000, 4000, 5000, 5000,
5000, 5000, 5000, 5000, 5000, 5000, 5000, 5000, 5000, 5000,
5000, 5000, 5000, 5000, 5000, 5000, 5000, 5000, 5000, 5000,
5000, 5000, 6000, 6000, 6000, 6000, 6000, 6000, 6000, 6000,
6000, 6000, 6000, 6000, 6000, 6000, 6000, 6000, 6000, 6000,
6000, 6000, 6000, 6000, 6000, 6000, 6300, 6300, 6300, 6300,
6300, 6300, 6300, 6300, 6300, 6300, 6300, 6300, 6300, 6300,
6300, 6300, 6300, 6300, 6300, 6300, 6300, 6300, 6300, 6300,
8000, 8000, 8000, 8000, 8000, 8000, 8000, 8000, 8000, 8000,
8000, 8000, 8000, 8000, 8000, 8000, 8000, 8000, 8000, 8000,
8000, 8000, 8000, 8000), REM_SPL = c(43.68, 38.85, 51.43,
48.71, 59.22, 58.62, 38.51, 38.45, 48.33, 48.44, 58.18, 58.34,
52.51, 39.6, 58.89, 49.06, 64.63, 58.7, 40.42, 38.81, 49.03,
48.52, 58.3, 58.31, 54.92, 49.44, 62.59, 59.24, 70.32, 68.97,
48.43, 48.19, 58.21, 58.16, 68.17, 67.99, 63.83, 51.47, 68.49,
59.93, 73.25, 69.09, 48.69, 48.3, 58.35, 58.1, 68.19, 68.01,
60.9, 50.42, 68.57, 61.23, 76.59, 71.69, 53.6, 52.7, 63.7,
62.63, 73.63, 72.59, 69.31, 50.73, 73.51, 60.53, 78.59, 71.52,
52.12, 51.55, 62.76, 61.94, 73.15, 72.33, 60.74, 57.26, 68.37,
66.47, 76.5, 75.78, 52.44, 49.21, 62.49, 58.96, 72.44, 69.11,
68.65, 60.55, 72.73, 67.42, 78.12, 75.97, 50.62, 58.85, 58.9,
61.92, 70.33, 68.8, 55.39, 50.46, 62.96, 59.28, 71.08, 68.47,
48.88, 61.78, 58.9, 71.68, 68.8, 80.89, 64.34, 56.79, 68.32,
61.46, 73.3, 68.84, 62.02, 68.72, 68.09, 74.56, 73.49, 80.8,
56.69, 59.3, 64.04, 67.76, 72.05, 76.68, 51.93, 69.98, 61.96,
79.52, 71.84, 88.99, 67, 65.08, 70.76, 70.07, 75.18, 77.28,
71.03, 77.65, 75.63, 82.24, 79.81, 88.9, 51.92, 57.93, 59.21,
66.1, 67.46, 75.01, 56.59, 68.41, 65.69, 77.82, 75.5, 87.89,
63.7, 64.53, 67.67, 69.39, 71.81, 76.13, 69.27, 76.85, 73.64,
81.51, 78.2, 87.85, 48.87, 53.9, 55.78, 61.65, 63.82, 70.4,
58.82, 65.38, 67.34, 74.02, 76.93, 83.84, 61.42, 61.24, 65.35,
65.97, 69.53, 72.23, 68.71, 74.23, 73.86, 79.7, 79.37, 85.62,
48.01, 50.44, 54.41, 57.41, 61.56, 65.25, 55.58, 60.89, 63.1,
68.84, 71.74, 77.46, 60.05, 58.6, 63.73, 62.9, 67.59, 68.1,
66.15, 68.8, 70.82, 74.09, 76.38, 79.82, 47.18, 48.5, 53.45,
55.01, 60.08, 61.96, 50.95, 55.74, 57.98, 63.25, 65.43, 71.02,
59.17, 56.77, 63.17, 61.08, 67.06, 65.64, 62.25, 64.22, 66.68,
69.38, 71.26, 74.48, 45.35, 46.41, 51.51, 52.74, 57.89, 59.15,
49.19, 51.76, 55.54, 58.63, 61.76, 65.99, 57.58, 54.92, 61.84,
59.05, 65.64, 63.46, 61.35, 60.78, 64.83, 65.47, 68.72, 69.57,
46.65, 47.33, 51.94, 53.13, 57.31, 59, 49.36, 51.67, 55.5,
57.69, 60.82, 63.09, 60.43, 56.86, 64.43, 60.78, 68.14, 64.65,
64.16, 60.72, 66.77, 64.85, 70.22, 68.47, 52.05, 52.4, 57.11,
57.74, 62.04, 63.19, 54.05, 54.49, 58.79, 59.71, 62.52, 63.23,
66.43, 61.57, 70.32, 65.49, 73.94, 69.19, 68.14, 62.49, 70.3,
66.13, 72.44, 68.77, 58.81, 58.52, 63.77, 63.85, 68.05, 68.88,
60.06, 60.46, 64.5, 64.38, 67.55, 67.18, 70.69, 66.81, 74.87,
70.92, 78.38, 74.41, 72.09, 66.25, 74.33, 69.39, 76.22, 71.49,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 60.18, 59.65, 64.79,
64.88, 68.88, 70.03, 67.33, 66.6, 71.46, 71.19, 73.76, 73.17,
69.65, 67.08, 73.98, 71.05, 77.77, 74.67, 75.44, 69.74, 78.49,
73.39, 79.65, 75.22, 58.4, 59.01, 63.43, 64.86, 67.39, 69.91,
68.71, 66.86, 73.05, 75.1, 74.7, 76.83, 65.9, 65.64, 71.34,
70.31, 75.86, 73.99, 71.73, 67.53, 77.7, 75.64, 78.59, 77.33,
58.44, 58.86, 63.4, 64.32, 67.55, 69.36, 66.91, 66.78, 71.71,
75.09, 73.4, 76.7, 66.48, 64.59, 71.47, 68.87, 75.38, 72.33,
68.31, 66.76, 75.47, 75.42, 76.96, 76.82, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, 55.72, 55.8, 60.74, 61.53, 65.31, 66.82,
67.76, 66.76, 71.6, 72.51, 73.49, 74.22, 63.62, 62.1, 68.72,
66.45, 72.44, 69.93, 68.97, 67.19, 75.25, 73, 76.63, 74.49,
52.18, 51.25, 57.97, 57.54, 62.94, 63.07, 67.82, 67.65, 71.4,
72, 73.4, 73.92, 60.18, 58.02, 66.23, 63.48, 70.87, 67.6,
68.92, 68.42, 75.79, 72.7, 77.23, 74.33), Thresh_SPL = c(40,
40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 65, 60, 65, 60,
65, 60, 65, 60, 65, 60, 65, 60, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 33, 33, 33, 33,
33, 33, 33, 33, 33, 33, 33, 33, 58, 53, 58, 53, 58, 53, 58,
53, 58, 53, 58, 53, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, 34.6, 34.6, 34.6, 34.6, 34.6,
34.6, 34.6, 34.6, 34.6, 34.6, 34.6, 34.6, 59.6, 54.6, 59.6,
54.6, 59.6, 54.6, 59.6, 54.6, 59.6, 54.6, 59.6, 54.6, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 35, 30, 35, 30, 35, 30, 35, 30, 35, 30, 35, 30, 50,
50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 44.6,
44.6, 44.6, 44.6, 44.6, 44.6, 44.6, 44.6, 44.6, 44.6, 44.6,
44.6, 69.6, 59.6, 69.6, 59.6, 69.6, 59.6, 69.6, 59.6, 69.6,
59.6, 69.6, 59.6, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 72.5,
77.5, 72.5, 77.5, 72.5, 77.5, 72.5, 77.5, 72.5, 77.5, 72.5,
77.5, 87.5, 77.5, 87.5, 77.5, 87.5, 77.5, 87.5, 77.5, 87.5,
77.5, 87.5, 77.5, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 76.5,
81.5, 76.5, 81.5, 76.5, 81.5, 76.5, 81.5, 76.5, 81.5, 76.5,
81.5, 96.5, 81.5, 96.5, 81.5, 96.5, 81.5, 96.5, 81.5, 96.5,
81.5, 96.5, 81.5, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 77.5,
82.5, 77.5, 82.5, 77.5, 82.5, 77.5, 82.5, 77.5, 82.5, 77.5,
82.5, 82.5, 77.5, 82.5, 77.5, 82.5, 77.5, 82.5, 77.5, 82.5,
77.5, 82.5, 77.5, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 94,
104, 94, 104, 94, 104, 94, 104, 94, 104, 94, 104, 104, 89,
104, 89, 104, 89, 104, 89, 104, 89, 104, 89)), row.names = c(NA,
-504L), class = "data.frame")
The sii function takes several arguments:
sii(speech = speech, threshold = threshold, freq = frequency, method = "one-third octave", interpolate = T)
I want to fix the freq argument as:
freq = c(125, 250, 500, 1000, 2000, 3150, 4000, 6300, 8000)
One particularly difficult part is that I need to subset the speech and threshold arguments on slightly different values of Frequency:
For speech: c(125, 250, 500, 1000, 2000, 3150, 4000, 6300, 8000)
For threshold: c(125, 250, 500, 1000, 2000, 3000, 4000, 6000, 8000)
The other arguments need to be calculated based on the grouping. What I have tried so far doesn't work:
library(tidyverse)
library(sii)
data %>%
group_by(Subject, Ear, System, Pres_Level) %>%
summarize(SII = sii(speech = . %>%
filter(Frequency %in% c(125, 250, 500, 1000, 2000, 3150, 4000, 6300, 8000)) %>%
pull(REM_SPL),
threshold = . %>%
filter(Frequency %in% c(125, 250, 500, 1000, 2000, 3000, 4000, 6000, 8000)) %>%
pull(Thresh_SPL),
freq = frequency, method = "one-third octave", interpolate = T))
Error in sii(speech = . %>% filter(Frequency %in% c(125, 250, 500, 1000, :
`speech' must have the same length as `freq'.
Trying to maintain grouping for the arguments:
data %>%
select(-REM_Level) %>%
filter(Frequency >= 125, Frequency <= 8000) %>%
group_by(Subject, Ear, System, Pres_Level) %>%
mutate(Speech = tibble(REM_SPL) %>%
filter(Frequency %in% c(125, 250, 500, 1000, 2000, 3150, 4000, 6300, 8000))) %>%
pull(REM_SPL)),
threshold = tibble(Thresh_SPL) %>%
filter(Frequency %in% c(125, 250, 500, 1000, 2000, 3000, 4000, 6000, 8000) %>%
pull(Thresh_SPL))) %>%
mutate(SII = sii(speech = speech, threshold = threshold, freq = c(125, 250, 500, 1000, 2000, 3000, 4000, 6000, 8000),
method = "one-third octave", interpolate = T))
Error in mutate_impl(.data, dots) :
Column `Speech` is of unsupported class data.frame
I have attempted using some nested loops, but that hasn't worked at all.
My desired output is something like this (these are fake SII values):
Subject System Pres_Level SII
1 1 A 55 0.65
2 1 B 55 0.60
3 1 C 55 0.60
4 1 A 65 0.70
5 1 B 65 0.75
6 1 C 65 0.80
7 1 A 75 0.76
8 1 B 75 0.78
9 1 C 75 0.74
10 2 A 55 0.55
11 2 B 55 0.58
12 2 C 55 0.57
13 2 A 65 0.74
14 2 B 65 0.72
15 2 C 65 0.82
16 2 A 75 0.80
17 2 B 75 0.82
18 2 C 75 0.76
19 3 A 55 0.58
20 3 B 55 0.62
21 3 C 55 0.64
22 3 A 65 0.74
23 3 B 65 0.76
24 3 C 65 0.78
25 3 A 75 0.80
26 3 B 75 0.76
27 3 C 75 0.74
Can anyone suggest how I might achieve what I'm looking for?

I believe this is what you're looking for... One thing you need to look out for is what sii returns (an object of length 10 and not a vector length 1). You need to further extract the SII value from the result, hence sii(...)$sii in the summarize call.
After edit with the new data:
data %>%
group_by(Subject, Ear, System, Pres_Level) %>%
summarize(SII = sii(speech = REM_SPL[Frequency %in% c(125, 250, 500, 1000, 2000, 3150, 4000, 6300, 8000)],
threshold = Thresh_SPL[Frequency %in% c(125, 250, 500, 1000, 2000, 3000, 4000, 6000, 8000)],
freq = Frequency[Frequency %in% c(125, 250, 500, 1000, 2000, 3150, 4000, 6300, 8000)],
method = "one-third octave",
interpolate = T)$sii)
# A tibble: 24 x 5
# Groups: Subject, Ear, System [?]
Subject Ear System Pres_Level SII
<fct> <fct> <fct> <fct> <dbl>
1 1 Left A 55 0.788
2 1 Left A 65 0.782
3 1 Left A 75 0.759
4 1 Left B 55 0.806
5 1 Left B 65 0.774
6 1 Left B 75 0.742
7 1 Right A 55 0.749
8 1 Right A 65 0.749
9 1 Right A 75 0.737
10 1 Right B 55 0.765
# ... with 14 more rows

Related

"Error in Contrasts" message in glm model after confirming that all Factor variables are longer than 2 unique values [R]:

I am not sure why I am still receiving this message when running a base model with all variables in my dataset:
My data, with anonymized variables:
set.seed(1234)
#dput(df)
structure(list(outcome_1= structure(c(2L, 1L, 1L, 2L,
1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L,
NA, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L,
1L, 2L, 1L, NA, 2L, 1L), .Label = c("0", "1"), class = "factor"),
outcome_2= structure(c(2L, 1L, 1L, 1L, 1L, 2L, 1L,
2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, NA, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, NA, 2L, 1L), .Label = c("0", "1"), class = "factor"),
outcome_3= structure(c(2L, 1L, 1L, 1L, 1L, 2L, 1L,
2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, NA, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L,
1L, 1L, NA, 1L, 1L), .Label = c("0", "1"), class = "factor"),
bl_ep = c(16, 92, 10, 40, 19, 1, 16, 10, 22, 28, 8, 11, 6,
47, 12, 1, 9, 20, 2, 14, 72, 28, 5, 16, 61, 12, 24, 22, 44,
44, 16, 36, 62, 10, 16, 10, 89, 22, 5, 38, 8, 11), bl_days = c(12,
28, 10, 25, 19, 1, 10, 9, 13, 28, 4, 11, 6, 20, 12, 1, 8,
16, 2, 12, 27, 28, 5, 13, 24, 10, 18, 18, 16, 16, 10, 28,
22, 5, 15, 8, 28, 15, 5, 22, 7, 11), score_1 = c(11,
19, 17, 17, 12, 14, 8, 12, 14, 15, 14, 13, 12, 14, 15, 5,
11, 14, 14, 13, 16, 11, 11, 14, 20, 14, 12, 11, 17, 15, 14,
18, 15, 14, 12, 10, 17, 16, 11, 13, 18, 17), score_2 = c(1.1,
1.6, 1.6, 2.8, 1.9, 3.3, 4, 3.8, 1.8, 1.4, 2, 3.55, 1.6,
1.8, 2.4, 3.7, 1.4, 2.9, 3.55, 2.5, 1.6, 3.2, 3.5, 2.4, 3.1,
2.3, 3.8, 3.9, 1.1, 1.7, 2.3, 1.5, 1.9, 3.3, 3, 2.9, 1.6,
3.1, 3.7, 2.8, 1.2, 1.9), score_3 = c(1,
1.22222222222222, 1.11111111111111, 1.88888888888889, 1.44444444444444,
1.44444444444444, 3.22222222222222, 2.77777777777778, 1.11111111111111,
1, 1, 2.83333333333333, 1.22222222222222, 1.875, 1.55555555555556,
2.66666666666667, 1, 2.25, 1.72222222222222, 2.05555555555556,
1.22222222222222, 2, 2, 1.77777777777778, 1.33333333333333,
1.11111111111111, 2.5, 2.55555555555556, 1, 1.22222222222222,
1.77777777777778, 1.22222222222222, 2.44444444444444, 1.55555555555556,
1.77777777777778, 1.66666666666667, 1.11111111111111, 2.33333333333333,
2.88888888888889, 1.55555555555556, 1, 1.25), score_4 = c(1.31428571428571,
1.37142857142857, 1.08571428571429, 1.83809523809524, 1.37142857142857,
1.8952380952381, 4, 3.88571428571429, 3.02857142857143, 2.12222222222222,
1.43333333333333, 3.39047619047619, 1.74285714285714, 1.67619047619048,
2.02857142857143, 3.48571428571429, 1.24761904761905, 3.73333333333333,
3.08571428571429, 2.56666666666667, 1.74285714285714, 2.6952380952381,
3.45714285714286, 2.27619047619048, 1.9047619047619, 2.62857142857143,
3.74285714285714, 3.74285714285714, 1.24761904761905, 1.39047619047619,
1.83809523809524, 2.74285714285714, 4, 1.77142857142857,
3.42857142857143, 3.2, 1.65714285714286, 2.55238095238095,
2.38095238095238, 2.40952380952381, 2.07619047619048, 2.56666666666667
), score_5 = c(1, 1, 1, 1, 1.33333333333333,
1, 3.33333333333333, 3.66666666666667, 1.66666666666667,
1.66666666666667, 2, 2.5, 1.66666666666667, 1, 1.33333333333333,
3, 1, 1.66666666666667, 2.16666666666667, 2.16666666666667,
1.33333333333333, 2.66666666666667, 3, 2.66666666666667,
1.33333333333333, 2.66666666666667, 3, 1.33333333333333,
1, 1, 1, 1, 1, 1.33333333333333, 3, 3.66666666666667, 1.66666666666667,
1.33333333333333, 2.33333333333333, 1.66666666666667, 2,
2), sex = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L,
1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = c("F", "M"), class = "factor"), age = c(64,
66, 51, 69, 60, 65, 65, 69, 50, 78, 75, 78, 35, 77, 69, 48,
65, 72, 60, 64, 78, 71, 58, 55, 55, 57, 81, 76, 56, 71, 56,
73, 69, 51, 43, 77, 31, 64, 69, 63, 38, 71), childbirth = structure(c(2L,
2L, 2L, 1L, 2L, 2L, 2L, NA, 2L, 2L, 2L, 2L, NA, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, NA, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L), .Label = c("N",
"Y"), class = "factor"), x1= c(3, 2, 2, NA,
3, 2, 3, NA, 3, 3, 2, 2, NA, 2, 5, 2, 2, 2, 4, 3, 2, 2, 3,
NA, 2, 3, NA, NA, 2, 2, 2, 2, 2, 2, 3, 2, 1, NA, 2, 2, 1,
3), x2= c(0, 0, 0, NA, 1, 0, 0, NA, 0, 0,
0, 0, NA, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, NA, NA,
0, 0, 0, 0, 0, 0, 0, 0, 1, NA, 0, 0, 0, 0), x3= structure(c(4L,
1L, 1L, 2L, 1L, 1L, 1L, NA, 4L, 1L, 1L, 4L, NA, 4L, 1L, 4L,
4L, 4L, 4L, 3L, 1L, 1L, 1L, 2L, 4L, 1L, NA, 2L, 1L, 4L, 1L,
1L, 4L, 4L, 1L, 4L, 4L, 2L, 4L, 4L, 4L, 1L), .Label = c("N",
"NA", "UNK", "Y"), class = "factor"), x4= structure(c(4L,
1L, 1L, 2L, 1L, 1L, 1L, NA, 1L, 1L, 4L, 1L, NA, 1L, 1L, 4L,
3L, 1L, 4L, 4L, 1L, 4L, 4L, 2L, 1L, 4L, NA, 2L, 4L, 1L, 4L,
1L, 1L, 4L, 4L, 1L, 4L, 2L, 4L, 1L, 4L, 4L), .Label = c("N",
"NA", "UNK", "Y"), class = "factor"), x5= structure(c(2L,
2L, 2L, 2L, 2L, 2L, 2L, NA, 2L, 2L, 2L, 2L, NA, 2L, 2L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, NA, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L), .Label = c("N",
"Y"), class = "factor"), x6= structure(c(2L, 2L, 2L, 1L,
1L, 2L, 2L, NA, 1L, 1L, 1L, 2L, NA, 2L, 2L, 1L, 2L, 2L, 1L,
2L, 2L, 2L, 1L, 1L, 2L, 2L, NA, 2L, 2L, 2L, 1L, 2L, 1L, 1L,
1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L), .Label = c("N", "Y"), class = "factor"),
x7= structure(c(1L, 1L, 1L, 1L, 1L,
1L, 1L, NA, 1L, 1L, 1L, 1L, NA, 1L, 3L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 3L, 1L, NA, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 3L,
3L, 1L, 1L, 1L, 1L, 2L, 3L), .Label = c("N", "NA", "Y"), class = "factor"),
x8= structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, NA, 1L,
2L, 2L, 2L, NA, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L,
1L, 1L, NA, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L,
2L, 2L, 2L), .Label = c("N", "Y"), class = "factor"), x9= structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L,
1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("N",
"Y"), class = "factor"), x10= structure(c(1L, 2L, 2L,
1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 1L,
1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L,
1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L), .Label = c("N", "Y"), class = "factor"),
x11= structure(c(1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L,
2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L,
2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L), .Label = c("N", "Y"), class = "factor"),
x12= structure(c(1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L,
1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
1L, 1L, 1L, 1L), .Label = c("N", "Y"), class = "factor"),
x13= structure(c(2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L,
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 1L, 1L), .Label = c("N", "Y"), class = "factor"),
x14= c(41, 7, 8, 9, 7, 2, 1, 5, 9, 6, 6, 8,
14, 2, 4, NA, 11, 9, 31, 13, 8, 2, 11, 20, 8, 7, 6, 8, 2,
12, 32, 1, 2, 38, 10, 17, 5, 28, 31, 10, 3, 6), x15= structure(c(3L,
4L, 2L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 3L, 5L, 1L, 3L, 3L,
3L, 3L, 3L, 3L, 2L, 2L, 3L, 2L, 3L, 1L, 2L, 2L, 3L, 3L, 3L,
2L, 3L, 3L, 3L, 3L, 3L, 2L, 3L, 3L, 3L, 3L), .Label = c("IATRO",
"IDIO", "OBST", "OBST/IDIO", "TRAUM"), class = "factor"),
x16= structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 2L,
1L, 1L, 1L, 1L, 1L), .Label = c("N", "Y"), class = "factor"),
x17= structure(c(2L, 2L, 1L, 2L, 1L, 2L,
2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L,
2L, 1L, 1L, 1L, 1L, 1L), .Label = c("N", "Y"), class = "factor"),
x18= c(31.8, 20, 30.9, 23.3, 22.5, 23.1, 23.6, 25.9, 22.8,
25.2, 30.2, 23.4, 22.2, 29, 24.8, 32.7, 20.8, 28.5, 24.6,
23, 23.4, 21.1, 24.9, 18, 21.7, 27.6, 27, 29, 32.9, 26, 29.3,
27.1, 22.7, 19.7, 25, 22.3, 21.3, 17.5, 20.9, 20.1, 25.1,
22.1), x19= structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L), .Label = c("No", "Yes"), class = "factor"),
x20 = structure(c(2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L,
1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L,
1L, 1L, 1L, 2L, 2L), .Label = c("NO", "YES"), class = "factor"),
x21= structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L,
1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L,
1L, 1L, 1L, 1L, 1L), .Label = c("NO", "YES"), class = "factor")), row.names = c(NA,
-42L), class = c("tbl_df", "tbl", "data.frame"))
logit1 <-glm(outcome_1~., data = df, family = "binomial")
Which yielded the classic error message for a logit model:
#Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
# contrasts can be applied only to factors with 2 or more levels
Ok, so I went to double check that all factor variables indeed have more than 1 unique value, and can verify:
sapply(lapply(df, unique), length)
returned all variables showing 2 or more unique values. Still same error message when I ran the model again.
I even attempted to run one solution I found online:
values_count <- sapply(lapply(df, unique), length)
logit1 <-
lm(outcome_1~ ., df[ , values_count > 1])
What's going on? Am I blind in seeing some variable that is secretly saying it has more than one unique value and does not?
Thank you!
The regression works on the supplied data for simple models, such as
logit1 <-glm(outcome_1~ sex + age, data = df, family = "binomial")
It's a small data set with lots of variables, the computer is not going to be able to pull out the meaningful relationships even if they are there. Start with some exploratory data plots, and think about how the (biological) relationship between your outcomes and other variables in order to come up with hypotheses you can test with you data. Realistically, which measurements do you think actually affect patient outcomes?

How to un-filter or clear filters in r dplyr?

A diagram showing what I am trying to achieve:
Hello community. I am trying to mutate a new variable depending on the mean function of one specific group (filtered and grouped). As I try to create the new variable I ungroup the set to operate in all groups. I tried running this R code.
However, the mutate function applies only to the filtered group and can't find an un-filter function. I dput() a sample of my data frame below (df01). Thank you very much in advance for all your comments and suggestions. Regards. M.
R code:
df01 %>% #summary table of the means to be used.
filter(GFPimg == "WT") %>%
group_by(Demineralization, Cond, Temp) %>%
summarise(Mean2 = mean(Mean)) %>%
arrange(desc(Demineralization))
print()
df01 %>%
filter(GFPimg == "WT") %>%
group_by(Demineralization, Cond, Temp) %>%
mutate(mean2 = mean(Mean)) %>%
arrange(desc(Demineralization, Cond)) %>%
ungroup() %>%
group_by(Demineralization, Cond) %>%
mutate(submean = Mean - mean2) %>%
print(n=200)
Data frame sample df01:
df01 <- structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54),
GFPimg = structure(c(1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L), .Label = c("HT", "MT", "WT"), class = "factor"),
Cond = structure(c(1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("EC", "EI"), class = "factor"),
Temp = structure(c(2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L), .Label = c("37c", "RT"), class = "factor"),
Side = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("L", "R"), class = "factor"),
Mean = c(62.435, 64.537, 102.447, 92.608, 103.277, 104.711, 67.017, 61.748, 68.921, 59.962, 63.368, 60.435, 69.54, 67.886, 51.71, 50.291, 50.881, 54.865, 80.538, 84.05, 92.223, 87.337, 90.444, 90.728, 29.951, 28.574, 30.896, 30.399, 29.773, 30.715, 31.498, 30.385, 99.004, 83.644, 95.962, 83.451, 22.649, 22.5, 53.066, 51.368, 55.459, 57.203, 54.444, 58.504, 76.518, 95.81, 23.43, 24.736, 28.86, 28.347, 28.386, 29.319, 58.017, 63.064, 80.293, 89.194, 70.52, 63.989, 71.436, 59.379, 75.986, 80.22, 71.583, 76.589, 77.138, 95.998, 77.193, 71.384, 75.614, 83.061, 73.062, 71.833, 71.83, 55.783, 77.376, 64, 96.14, 99.876, 40.972, 53.465, 36.25, 47.626, 40.619, 39.697, 34.34, 36.855, 77.131, 75.35, 67.014, 58.811, 39.237, 49.357, 74.333, 79.921, 62.631, 63.119, 60.207, 65.171, 77.563, 82.078, 39.115, 45.988, 42.65, 55.806, 33.534, 41.271, 62.359, 67.092),
Demineralization = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("After", "Before"), class = "factor")), class = "data.frame", row.names = c(NA, -108L))
Since other users might have a similar question, I post the answer I arrived to.
Lessons learned:
The pipe operator (%>%) pass on a modified dataframe from the
previous operation to the next one.
Subset() and filter() effectively produce a new dataframe that keeps
only those rows that satisfy the condition, and drops all other
rows.
There is no function to un-filter or clear filters or un-subset in dplyr. Instead, to keep all rows and perform a calculation on another specific
subset, you can apply a conditional with ifelse().
Code that solved my problem:
df01 %>%
group_by(Demineralization, Cond, Temp) %>%
mutate(Mean2 = mean(ifelse(GFPimg == 'WT', Mean, NA), na.rm=T)) %>%
ungroup() %>%
mutate(submean = Mean - Mean2) %>%
print(n=200)

How to transform data for bar graph based on multi choice?

So I'm learning to use R/GGplot, it was simple enough to create a single bar chart but I'm struggling to understand how to properly manipulate the data to get the chart I want.
So I have a basic example data file that looks like this in RStudio:
Basically, I wanted to make a bar for each option, which counts the "Yes" options. The Y axis would then be equal to the total number of records, with the scale measured in %.
Here is where I think I went completely wrong:
data_Q1 <- data.frame(Q1 = c("Red", "Blue", "Green", "Yellow", "Pink"))
I believe here I might need to remove the "No" level, then rename the "Yes" level to the colour name before I can work with it, but I seem to be greatly misunderstanding how I do this.
I've tried using droplevels() and raw_data$Q1_1[grepl("Yes", raw_data$Q1, ignore.case=T)] <- "Red" but neither seem to achieve the goal I want.
Here is the code I'm using for the graph:
ggplot(
data_Q1,
aes(
x=Q1,
y=sum(
complete.cases(raw_data)
)
)
)
+geom_bar(
aes(
fill=Q1
),
colour="black",
stat="identity"
)
+labs(
title="Colours respondents liked",
subtitle="Q1. Select all the colours you like",
caption="source: example data"
)
+ylab("Total completes")
+scale_y_continuous(labels = scales::percent)
structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44,
45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76,
77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92,
93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106,
107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119,
120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132,
133, 134, 135, 136, 137, 138, 139), Q1_1 = structure(c(2L, 1L,
2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L,
2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L,
2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L), .Label = c("No", "Yes"), class = "factor"),
Q1_2 = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L,
2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L,
2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L,
2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L,
2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L), .Label = c("No", "Yes"
), class = "factor"), Q1_3 = structure(c(2L, 2L, 2L, 2L,
2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L,
1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L,
2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 1L,
2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L,
2L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L,
2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L
), .Label = c("No", "Yes"), class = "factor"), Q1_4 = structure(c(1L,
2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L,
2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L,
2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L,
1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L,
1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L,
1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L,
1L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 2L, 1L), .Label = c("No", "Yes"), class = "factor"),
Q1_5 = structure(c(2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L,
1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L,
2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L,
2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L,
1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L,
2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L,
1L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L,
2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L), .Label = c("No", "Yes"
), class = "factor")), row.names = c(NA, -139L), variable.labels = c(id = "id",
Q1_1 = "[Red] Q1. Select all the colours you like", Q1_2 = "[Blue] Q1. Select all the colours you like",
Q1_3 = "[Green] Q1. Select all the colours you like", Q1_4 = "[Yellow] Q1. Select all the colours you like",
Q1_5 = "[Pink] Q1. Select all the colours you like"), codepage = 65001L, class = "data.frame")
Okay, I am assuming this is from some survey, so your data is (probably) a bit messy.
An approach with tidyverse and transposing (t()) would look something like this:
library(tidyverse)
df <- raw_data
df2 <- data.frame(t( df %>% summarise(Q1 = sum(Q1_1=="Yes") / length(complete.cases(df)) ,
Q2 = sum(Q1_2=="Yes") / length(complete.cases(df)),
Q3 = sum(Q1_3=="Yes") / length(complete.cases(df)),
Q4 = sum(Q1_4=="Yes") / length(complete.cases(df)),
Q5 = sum(Q1_5=="Yes") / length(complete.cases(df)),) ))
names(df2) <- ("sum_yes")
df2$q <- rownames(df2)
Output of df2:
> df2
sum_yes q
Q1 0.8417266 Q1
Q2 0.7338129 Q2
Q3 0.7122302 Q3
Q4 0.4820144 Q4
Q5 0.7122302 Q5
>
Then, we plot the results:
library(ggplot2)
ggplot(
df2,
aes(
x=q,
y = sum_yes,
fill = q
)
) +
geom_bar(
colour="black",
stat="identity"
) + labs(
title="Colours respondents liked",
subtitle="Q1. Select all the colours you like",
caption="source: example data"
) + ylab("Total completes")+scale_y_continuous(labels = scales::percent) +
scale_fill_manual(values= c("Red", "Blue", "Green", "Yellow", "Pink") )
with scale_fill_manual, you can just specify what the colors of your results should be.
If I understood the data correctly, you would need to replace the names (Q1 to Q5 ) with the respective colors.

How to extract random effects and variance components from lme4 wrapped in dlply

This post How can I extract elements from lists of lists in R? answers some of my questions but that still doesn't quite work for me and what I need to do goes beyond my R knowledge.
I have data from field trials in 2 environments (=trials), 2 years and 5 traits of interest (defined by trait_id). GID is the unique line identifier. My model in lme4 is:
mods <- dlply(data,.(trial,trait_id),
function(d)
lmer(phenotype_value ~(1|GID)+(1|year)+(1|year:GID)+(1|year:rep),
na.action = na.omit,data=d))
Running this returns a large list of 10 elements and I would like to store the random effects for GID for all traits per trial in a data frame. I tried several things:
blup=lapply(mods,ranef, drop = FALSE)
blup1=blup[[1]]
blup2=blup1$GID
will give me a df with the random effects for one trait per trial, I was hoping for something more streamlined that will preserve some of info like $irrigation.GRYLD in the column names.
Here is a reproducible example with only two traits (GRYLD, PTHT), 2 years (11OBR, 12OBR), and two reps:
structure(list(GID = structure(c(1L, 2L, 3L, 4L, 5L, 5L, 1L,
2L, 4L, 3L, 1L, 2L, 3L, 4L, 5L, 5L, 1L, 2L, 4L, 3L, 1L, 2L, 3L,
4L, 5L, 5L, 2L, 1L, 4L, 3L, 1L, 2L, 3L, 4L, 5L, 5L, 2L, 1L, 4L,
3L, 1L, 2L, 3L, 4L, 5L, 5L, 1L, 2L, 4L, 3L, 1L, 2L, 3L, 4L, 5L,
5L, 1L, 2L, 4L, 3L, 1L, 2L, 3L, 4L, 5L, 5L, 2L, 1L, 4L, 3L, 1L,
2L, 3L, 4L, 5L, 5L, 2L, 1L, 4L, 3L), .Label = c("A", "B", "C",
"D", "E"), class = "factor"), year = structure(c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("11OBR",
"12OBR"), class = "factor"), trial = structure(c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("heat",
"irrigation"), class = "factor"), rep = c(1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L), trait_id = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("GRYLD",
"PTHT"), class = "factor"), phenotype_value = c(3.93, 3.38, 1.65,
4.33, 2.45, 2.48, 3.98, 3.3, 4.96, 1.53, 87.5, 69.5, 65.5, 84.5,
77, 81, 94.5, 84.5, 89, 81, 6.56, 4.3, 5.76, 7.3, 5.73, 4.14,
5.93, 6.96, 8.43, 5.81, 114.5, 100, 104.5, 110, 110, 106, 99,
97.5, 105, 100, 0.119, 0.131, 0.681, 0.963, 0.738, 1.144, 0.194,
0.731, 0.895, 0.648, 35, 50, 45, 50, 45, 50, 55, 45, 50, 55,
2.79, 3.73, 3.96, 4.64, 5.03, 2.94, 3.78, 4.14, 3.89, 3.21, 90,
95, 105, 100, 105, 85, 95, 100, 100, 95)), .Names = c("GID",
"year", "trial", "rep", "trait_id", "phenotype_value"), class = "data.frame", row.names = c(NA,
-80L))
I'm not quite sure what you want as an output format, but how about:
all_ranef <- function(object) {
rr <- ranef(object)
ldply(rr,function(x) data.frame(group=rownames(x),x,check.names=FALSE))
}
ldply(mods,all_ranef)
## trial trait_id .id group (Intercept)
## 1 heat GRYLD year:GID 11OBR:A 7.935352e-01
## 2 heat GRYLD year:GID 11OBR:B 1.960487e-01
## 3 heat GRYLD year:GID 11OBR:C -1.504116e+00
## ...
## 82 irrigation PTHT year:rep 12OBR:2 -1.595022e+00
## 83 irrigation PTHT year 11OBR 2.915033e+00
## 84 irrigation PTHT year 12OBR -2.915033e+00
this works reasonably well because all of your random effects are intercept-only. If you had some random-slopes terms in the models you might either want to reshape2:::melt() the individual random effects, or use rbind.fill() to combine data frames with different random-effects columns.
library("ggplot2"); theme_set(theme_bw())
ggplot(vals, aes(y=group,x=`(Intercept)`))+
geom_point(aes(colour=interaction(trial,trait_id)))+
facet_wrap(~.id,scale="free")
By the way, it's usually inadvisable to use a factor with only 2 levels (YEAR) as a grouping variable ...

R: Calculate moving maximum slope by week accounting for factors

I have a data.frame that includes heating degree day (HDD) below.
structure(list(WinterID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L), .Label = c("2002", "2002_2003", "2003", "2003_2004",
"2004", "2004_2005", "2005", "2005_2006", "2006", "2006_2007",
"2007", "2007_2008", "2008"), class = "factor"), Date = structure(c(11968,
11969, 11970, 11971, 11972, 11973, 11974, 11975, 11976, 11977,
11978, 11979, 11980, 11981, 11982, 11983, 11984, 11985, 11986,
11987, 11988, 11989, 11990, 11991, 11992, 11993, 11994, 11995,
11996, 11997, 11998, 11999, 12000, 12001, 12002, 12003, 12004,
12005, 12006, 12007, 12008, 12009, 12010, 12011, 12012, 12013,
12014, 12015, 12016, 12017, 12018, 12019, 12020, 12021, 12022,
12023, 12024, 12025, 12026, 12027, 12028, 12029, 12030, 12031,
12032, 12033, 12034, 12035, 12036, 12037, 12038, 12039, 12040,
12041, 12042, 12043, 12044, 12045, 12046, 12047, 12048, 12049,
12050, 12051, 12052, 12053, 12054, 12055, 12056, 12057, 12058,
12059, 12060, 12061, 12062, 12063, 12064, 12065, 12066, 12067,
12068, 12069, 12070, 12071, 12072, 12073, 12074, 12075, 12076,
12077, 12078, 12079, 12080, 12081, 12082, 12083, 12084, 12085,
12086, 12087, 12088, 12089, 12090, 12091, 12092, 12093, 12094,
12095, 12096, 12097, 12098, 12099, 12100, 12101, 12102, 12103,
12104, 12105, 12106, 12107, 12108, 12109, 12110, 12111, 12112,
12113, 12114, 12115, 12116, 12117, 12118, 12119, 12120, 12121,
12122, 12123, 12124, 12125, 12126, 12127, 12128, 12129, 12130,
12131, 12132, 12133, 12134, 12135, 12136, 12137, 12138, 12139,
12140, 12141, 12142, 12010, 12011, 12014, 12015, 12017, 12023,
12024, 12025, 12026, 12027, 12028, 12029, 12030, 12042, 12070,
12071, 12075, 12076, 12077, 12078, 12079, 12080, 12082, 12083,
12084), class = "Date"), SiteID = structure(c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L), .Label = "NW_SB", class = "factor"), SubstrateConcat = structure(c(2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("B_A", "B_B", "B_E"), class = "factor"),
HDD = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0.246666666666667, 7.12666666666667, 10.6133333333333,
2.96666666666667, 0, 0.0933333333333337, 7.31333333333334,
10.7133333333333, 6.20000000000001, 2.70666666666667, 6.20000000000001,
3.88666666666667, 16.5866666666667, 28.3933333333333, 12.98,
21.6133333333333, 19.14, 12.6666666666667, 7.52, 3.33333333333334,
18.2933333333333, 4.14666666666667, 2.17333333333334, 26.08,
1.38, 7.48000000000001, 36.5733333333333, 53.4666666666667,
98.4533333333333, 109.093333333333, 104.14, 80.2466666666667,
47.0333333333333, 14.7133333333333, 15.7266666666667, 21.1066666666667,
5.07333333333334, 0.613333333333334, 6.18000000000001, 29.5666666666667,
45.5333333333333, 59.5666666666667, 91.44, 85.38, 51.1, 25.9666666666667,
14.8266666666667, 34.48, 79.16, 90.08, 66.3533333333333,
75.14, 97.1733333333333, 83.3066666666667, 50.0133333333333,
37.2733333333333, 88.9133333333334, 101.926666666667, 100.56,
99.2933333333334, 97.66, 89.6466666666667, 110.613333333333,
79.1466666666667, 92.6066666666667, 71.7133333333333, 31.32,
27.02, 39.02, 98.14, 62.5866666666667, 46.7933333333333,
47.5133333333333, 48.3666666666667, 25.5333333333333, 13.6,
17.9133333333333, 14.16, 7.98666666666667, 3.44, 1.86666666666667,
12.66, 0, 7.09333333333334, 21.3266666666667, 40.52, 18.8466666666667,
37.8466666666667, 33.42, 33.7133333333333, 15.6133333333333,
0.720000000000001, 2.31333333333334, 12.3066666666667, 8.48666666666667,
2.86, 0, 0, 0, 6.98666666666667, 6.67333333333334, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6.58000000000001, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 13.42, 30.5266666666667,
1.12, 28.5066666666667, 6.82666666666667, 10.3933333333333,
3.18, 11.0466666666667, 0, 0, 0)), .Names = c("WinterID",
"Date", "SiteID", "SubstrateConcat", "HDD"), row.names = c(NA,
200L), class = "data.frame")
I'm trying to calculate the moving maximum slope over 7 days beginning on 4 November of each year without using a loop. This moving maximum slope needs to account for WinterID, SiteID, and SubstrateConcat.
For clarification, the calculation I'm trying to obtain is this:
Slope=(max-min)/7, where:
Max= (i-3)+(i-2)+(i-1)+i+(i+1)+(i+2)+(i+3)
Min= (i-3)
(((i-3)+(i-2)+(i-1)+i+(i+1)+(i+2)+(i+3)) - (i-3))/7
So, using a real example starting with 2002-11-19 as i:
(0+0.24+7.13+10.61+2.97+0+0.97) - 0)/7 = 3.13
I tried using zoo package rollmean, however, I could not figure out how to account for WinterID, SiteID, and SubstrateConcat. This gave me an "order.by" error where my Date values were not unique, since I have dates with different SubstrateConcat and WinterID criteria. As I enter more data into the database, there will eventually be dates with multiple SiteID criteria as well.
I thought maybe xts, TTR and ROC would be what I could use as in this question: Maximum slope for a given interval each day. But again, I don't understand how to specify the multiple group factors, as well as going three days forward and three days back as in align=center with rollmean.
Will someone please point me in the right direction here? Will one of the above functions combined with ddply work?
Thank you!
EDITED to include the answer after the answer supplied by #eddi.
dt <- data.table(df)
dt[, MaxSlope := if(length(HDD)<7) {rep(NA_real_, length(HDD))} else {filter(HDD, c(1,1,1,1,1,1,0)/7)}, by=list(Winter, Site, Substrate)]
This code works perfectly for dates that are continuous. Can anyone recommend how to tweak this code for data that has missing dates? For instance, I have:
Date Temp
Nov 21 14
Nov 23 10
Nov 24 12
Nov 27 11
Nov 28 7
Nov 29 9
Nov 30 10
Dec 01 12
Dec 02 8
Dec 03 7
I don't want the Max Slope calculated for Nov 21, Nov 23 and Nov 24 because there isn't consecutive data for the calculation. Instead, I want "NA" inserted. Can the existing code above, be modified to accommodate this?
Sounds like you need filter (or you could also use one of the rolling mean/sum functions). And the grouping part is easiest to do with data.table:
library(data.table)
dt = data.table(your_df)
dt[, filter(HDD, c(1,1,1,1,1,1,0))/7,
by = list(WinterID, SiteID, SubstrateConcat)]
I couldn't get working solution with ddply, though I didn't spend much time debugging. Here's a solution using base functions (assuming your object is named hdd).
# split your object into groups
shdd <- split(hdd, hdd[,c("WinterID","SiteID","SubstrateConcat")], drop=TRUE)
# create a function to apply to each group
f <- function(d) transform(d, MaxSlopeHDD=rollmax(c(NA,diff(d$HDD)),7,fill=NA))
# apply the function to each group and rbind the results together
shdd <- do.call(rbind, lapply(shdd, f))

Resources