R function doesn't recognize a column name passed as a string

I have a function in R that should return the mean of any column of a data frame, given the column name:
mean_anything <- function(directory, anything){
  files_full <- list.files(directory, full.names=TRUE)
  seethis <- lapply(files_full, read.csv)
  output <- do.call(rbind, seethis)
  mean(output$anything, na.rm=TRUE)
}
If I invoke mean_anything("diet_data", "Age") I get
[1] NA
Warning message:
In mean.default(output$anything, na.rm = TRUE) :
argument is not numeric or logical: returning NA
However, if I replace
mean(output$anything, na.rm=TRUE)
with
mean(output$Age, na.rm=TRUE)
then the function outputs [1] 36.4.
I tried single and double quotes around anything, and I tried output[anything]. How can I fix this?
dput(output)
structure(list(Patient.Name = structure(c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L), .Label = c("Andy", "David", "John", "Mike", "Steve"), class = "factor"),
Age = c(30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 35L, 35L, 35L, 35L,
35L, 35L, 35L, 35L, 35L, 35L, 35L, 35L, 35L, 35L, 35L, 35L,
35L, 35L, 35L, 35L, 35L, 35L, 35L, 35L, 35L, 35L, 35L, 35L,
35L, 35L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L,
22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L,
22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 40L, 40L, 40L, 40L,
40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L,
40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L,
40L, 40L, 55L, 55L, 55L, 55L, 55L, 55L, 55L, 55L, 55L, 55L,
55L, 55L, 55L, 55L, 55L, 55L, 55L, 55L, 55L, 55L, 55L, 55L,
55L, 55L, 55L, 55L, 55L, 55L, 55L, 55L), Weight = c(140L,
140L, 140L, 139L, 138L, 138L, 138L, 138L, 138L, 138L, 138L,
138L, 137L, 137L, 138L, 139L, 139L, 137L, 137L, 137L, 137L,
137L, 137L, 135L, 135L, 135L, 135L, 135L, 135L, 135L, 210L,
209L, 209L, 209L, 209L, 209L, 209L, 208L, 208L, 208L, 208L,
208L, 208L, 207L, 206L, 206L, 206L, 205L, 205L, 205L, 205L,
204L, 204L, 204L, 203L, 203L, 202L, 202L, 202L, 201L, 175L,
175L, 175L, 175L, 175L, 175L, 175L, 175L, 175L, 175L, 175L,
175L, 175L, 175L, 175L, 175L, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, 177L, 188L, 188L, 188L, 188L, 189L,
189L, 189L, 189L, 189L, 189L, 189L, 189L, 189L, 189L, 190L,
190L, 190L, 190L, 190L, 190L, 190L, 190L, 190L, 192L, 192L,
192L, 192L, 192L, 192L, 192L, 225L, 225L, 225L, 224L, 224L,
224L, 223L, 223L, 223L, 223L, 223L, 222L, 221L, 221L, 221L,
220L, 220L, 219L, 219L, 219L, 218L, 217L, 217L, 217L, 216L,
215L, 215L, 214L, 214L, 214L), Day = c(1L, 2L, 3L, 4L, 5L,
6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L,
19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L,
27L, 28L, 29L, 30L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L,
23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 1L, 2L, 3L, 4L, 5L,
6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L,
19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L,
27L, 28L, 29L, 30L)), .Names = c("Patient.Name", "Age", "Weight",
"Day"), row.names = c(NA, 150L), class = "data.frame")

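I solved it. The $ operator takes the name after it literally, so output$anything looks for a column that is actually called anything. Indexing with the variable works:
mean(output[,anything], na.rm=TRUE)
i.e.
mean_anything <- function(directory, anything){
  files_full <- list.files(directory, full.names=TRUE)
  seethis <- lapply(files_full, read.csv)
  output <- do.call(rbind, seethis)
  mean(output[,anything], na.rm=TRUE)
}
Here anything is a string naming any column in the data frame.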
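The difference between the indexing forms can be seen on a small example. This is a minimal sketch with a made-up data frame standing in for the combined CSV data; the values are hypothetical. It also shows why output[anything] did not work: single brackets with one index return a one-column data frame rather than a vector.

```r
# Toy data frame standing in for the combined CSV data (hypothetical values).
output <- data.frame(Age = c(30, 35, 22, 40, 55),
                     Weight = c(140, 210, 175, NA, 225))
anything <- "Age"

# `$` does not evaluate its right-hand side: this looks for a column
# literally called "anything", finds none, and returns NULL.
is.null(output$anything)

# Single brackets with one index keep the data frame structure,
# so mean() cannot use the result directly.
is.data.frame(output[anything])

# `[[` and `[, ]` evaluate the variable and return the "Age" vector.
mean(output[[anything]], na.rm = TRUE)
mean(output[, anything], na.rm = TRUE)
```

Both of the last two calls return 36.4 here. output[[anything]] is often preferred inside functions because it always returns a single column, whereas output[, anything] returns a data frame if anything names more than one column.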

Related

Is there a way to produce multiple x-y scatterplots at once based on grouping value, ordered by a third variable?

I have multi-level data. The group level is individual persons, which are designated by id. The variable index indicates different time points. Is there a way to make a separate scatterplot (x vs. y) for each individual, all displayed in the same output, and ordered based on a third variable (z)? If so, can color then be added to indicate degree of third variable (z)? Data below, Thanks.
> dput(dat1.1)
structure(list(id = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L), index = c(1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 20L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
16L, 17L, 18L, 19L, 20L), x = c(7.443917, 7.520429, 7.446833,
8.07893, 8.534033, 8.263931, 7.598647, 6.902987, 7.672617, 7.739256,
7.591341, 8.101125, 7.811751, 6.596834, 6.637652, 8.467165, 7.835399,
6.500149, 7.083198, 7.531798, 6.110208, 6.368534, 5.26318, 6.735778,
5.580152, 5.460161, 5.844303, 6.258181, 7.191627, 5.105033, 6.760193,
5.857215, 5.866264, 6.769086, 6.547294, 5.623804, 4.675815, 6.153901,
6.040519, 6.236045, 8.216397, 6.097841, 5.491311, 5.831432, 6.297337,
6.655688, 5.553445, 6.37449, 6.271961, 6.959645, 7.080341, 6.46092,
6.476955, 7.221111, 6.219023, NA, NA, NA, NA, NA, 8.21752, 7.589581,
8.363739, 8.849697, 7.78645, 7.494006, 7.827766, 9.11352, 7.80884,
6.701855, 6.259061, 5.523358, 6.186617, 6.548538, 6.6937, 7.213297,
5.243428, 7.510827, 7.054297, 7.603241), y = c(106L, 114L, 50L,
50L, 56L, 46L, 50L, 52L, 114L, 50L, 56L, 26L, 48L, 52L, 48L,
54L, 54L, 56L, 52L, 50L, 84L, 86L, 88L, 86L, 82L, 84L, 88L, 84L,
86L, 84L, 86L, 86L, 84L, 84L, 88L, 88L, 88L, 84L, 86L, 120L,
106L, 168L, 116L, 56L, 108L, 68L, 68L, 70L, 74L, 76L, 76L, 76L,
72L, 70L, 118L, NA, NA, NA, NA, NA, 60L, 62L, 52L, 90L, 50L,
50L, 54L, 56L, 52L, 30L, 78L, 30L, 52L, 54L, 52L, 80L, 86L, 46L,
54L, 84L), z = c(33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L,
33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 54L, 54L,
54L, 54L, 54L, 54L, 54L, 54L, 54L, 54L, 54L, 54L, 54L, 54L, 54L,
54L, 54L, 54L, 54L, 54L, 56L, 56L, 56L, 56L, 56L, 56L, 56L, 56L,
56L, 56L, 56L, 56L, 56L, 56L, 56L, 56L, 56L, 56L, 56L, 56L, 50L,
50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L,
50L, 50L, 50L, 50L, 50L, 50L)), class = "data.frame", row.names = c(NA,
-80L))
Does this come close to giving you what you want?
library(tidyverse)
# Assumes d holds the data frame from the question, e.g. d <- dat1.1
d %>%
  group_by(id) %>%
  mutate(z=as.factor(z)) %>%
  group_map(
    function(.x, .y) {
      .x %>%
        ggplot() +
        geom_point(aes(x=x, y=y, colour=z)) +
        facet_wrap(vars(z)) +
        scale_colour_manual(drop=FALSE, values=d %>% distinct(z) %>% pull(z)) +
        labs(title=.x$id[1])
    },
    .keep=TRUE
  )
Points to note:
group_map applies a function to each group of a grouped data frame. .x refers to the data in the current group, and .y is a one-row tibble identifying the group. .keep requests that the grouping variables are kept in .x.
drop=FALSE in the call to scale_colour_manual() ensures that unused factor levels are retained in the legend (and hence different levels of z are distinguishable between plots).
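If a single figure with one panel per person is preferred over a list of separate plots, faceting is an alternative to the approach above. The following is only a sketch on made-up data (the toy data frame and the id_by_z column are hypothetical names, not from the question). It orders the panels by z with reorder() and maps z to colour.

```r
library(ggplot2)

# Toy data (hypothetical values): three ids, each with a constant z.
toy <- data.frame(
  id = rep(c(2L, 3L, 4L), each = 4),
  x  = c(7.4, 7.5, 8.1, 8.5,  6.1, 6.4, 5.3, 6.7,  8.2, 6.1, 5.5, 5.8),
  y  = c(106, 114, 50, 56,    84, 86, 88, 86,      106, 168, 116, 56),
  z  = rep(c(54L, 33L, 56L), each = 4)
)

# reorder() sorts the id levels by the mean z within each id,
# so the facets appear in increasing order of z.
toy$id_by_z <- reorder(factor(toy$id), toy$z)

# One panel per id, with point colour showing the value of z.
p <- ggplot(toy, aes(x = x, y = y, colour = z)) +
  geom_point() +
  facet_wrap(vars(id_by_z)) +
  labs(colour = "z")
p
```

Because z is left numeric here, ggplot2 uses a continuous colour gradient, which shows the degree of z directly; converting z to a factor would give discrete colours instead.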

How to identify (not remove) SETS of data that are duplicated? Dplyr or other solution?

I have data about Sites, nested within Class. Each Site has a Time (timepoint) variable. The data of interest are Count1, Total1, Count2 and Total2.
I know that, within a Class, whole sets of Count1, Total1, Count2 and Total2 values over Time are duplicated across Sites.
Here's what I mean. Take Class 1, with the first Site:
Class Site Time Count1 Total1 Count2 Total2
1 a0QjvO281o1 1 8 64 4 34
1 a0QjvO281o1 2 16 64 8 34
1 a0QjvO281o1 3 16 64 8 34
1 a0QjvO281o1 4 16 64 8 34
1 a0QjvO281o1 6 8 64 4 34
I've noticed there are several other Sites with this EXACT pattern (or with other repeated patterns):
Class Site Time Count1 Total1 Count2 Total2
1 zlG1VmpE6QQ 1 8 64 4 34
1 zlG1VmpE6QQ 2 16 64 8 34
1 zlG1VmpE6QQ 3 16 64 8 34
1 zlG1VmpE6QQ 4 16 64 8 34
1 zlG1VmpE6QQ 6 8 64 4 34
Within each Class, I want to identify how many Sites share the same pattern, either by marking them or by reducing the data to the first Site of each unique pattern. Either way, I would like to be able to say how many Sites fit each pattern found.
Here's the partial data:
df <-
structure(list(Class = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), Site = structure(c(3L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 6L, 6L, 6L, 6L, 6L, 9L, 9L,
9L, 9L, 9L, 17L, 17L, 17L, 17L, 17L, 19L, 19L, 19L, 19L, 19L,
30L, 30L, 30L, 30L, 30L, 49L, 49L, 49L, 49L, 49L, 54L, 54L, 54L,
54L, 54L, 56L, 56L, 56L, 56L, 56L, 62L, 62L, 62L, 62L, 62L, 66L,
66L, 66L, 66L, 66L, 86L, 86L, 86L, 86L, 86L, 88L, 88L, 88L, 88L,
88L, 98L, 98L, 98L, 98L, 98L, 33L, 33L, 33L, 33L, 33L, 128L,
128L, 128L, 128L, 128L, 141L, 141L, 141L, 141L, 141L, 153L, 153L,
153L, 153L, 153L, 154L, 154L, 154L, 154L, 154L, 274L, 274L, 274L,
274L, 274L, 291L, 291L, 291L, 291L, 291L, 306L, 306L, 306L, 306L,
306L, 309L, 309L, 309L, 309L, 309L, 336L, 336L, 336L, 336L, 336L,
342L, 342L, 342L, 342L, 342L, 396L, 396L, 396L, 396L, 396L, 413L,
413L, 413L, 413L, 413L, 418L, 418L, 418L, 418L, 418L, 435L, 435L,
435L, 435L, 435L, 451L), .Label = c("~", "A0e3A15Lh1d", "a0QjvO281o1",
"A0R2gEqRbTv", "A4J3Jp6KNz2", "A757EHpLOya", "A8kkDgEvEZV", "ab5F7MfRxZW",
"AcjfpLUXjwt", "admxsO3fTtq", "aEBm7REs6XS", "AEZgWxwdbd9", "AezXCsZxd2U",
"AFjm1YmnfyO", "AFTwI0xBM6e", "aGw7PyLMEkl", "aHNXoYj7uNJ", "AibLRYCSE4P",
"aitNX6Qxkon", "ajEqsuhE9fV", "aJFDh98Iahb", "AKG4BvCUVsF", "AMtGkXGugJb",
"aNczAtKAJsv", "aoY0wrz6qBF", "aOz3ikxG7qM", "aPWuF0rDfuJ", "aQrGXlhzEJB",
"ARu0wnYDkam", "As7tGowP84e", "AsqolR3dfgv", "atj39UeK8N9", "atmjKVCRnzw",
"aUhP7zZ7LPU", "aUMEQzUKI0K", "AuP8NAgS7Th", "aUyy9i4fwhS", "AVFW2vlGxds",
"awoAlwC06Go", "awxCmxmeea2", "AWYFb5fwcYb", "Ax2Q16uPW55", "AXO6R085bth",
"Ay6W05BTgDV", "aZMeFIlkevS", "B08adcYOEl7", "b5MVFPi1inY", "B7fffQm5omx",
"ba3kFfcKXNk", "bCK7hWM4bnK", "BDlYKSCaOIG", "BE3TZDysXuQ", "bErpy9bSZAV",
"Beu6pmpSDJE", "BgfNJiJlDrF", "bGUeQEEpq7q", "bgWDDBsRLIL", "bHwo17fsILI",
"bifefa8JnfN", "bIQ3gsw51RH", "bisxDvmwluW", "biy6fHoOcZp", "bK7yQP8LNkJ",
"Bke0tWeJyBr", "bKMNhuIYaYW", "blkWvfFDVm6", "bnaDFC8EVAo", "BNDeQ6sJctI",
"Bokks2ESodd", "BoKlS77F7Il", "BqLRDDu69ic", "bqoZAzbsajz", "BRlA0HkkMGM",
"bT501IhkxV9", "BTliRZoJs4i", "bTTf1R7zgRn", "bTZAPQPXgI5", "BUtglXWCjkf",
"BvcJEyVWsGG", "bVHpRZguCL2", "BVymUZcbCuf", "BwkVolONMBn", "bWtq9NnOoCU",
"c2YR2oDyx7t", "c3dhvyZuPum", "c3LYcysugey", "c46Q9ExLocA", "C52gwcl9fmp",
"c5IYnQ3M7dj", "c6yCKEAemfr", "C8uv1qapHmC", "Ca2rjTu7g6A", "cAsHVMiIVHT",
"cB7mNM1MNm0", "Cbboq0XBHn1", "cbUfMWJl9sK", "ccixNtjWLkf", "ccL7Esacksn",
"CgmvbI2pkyK", "cGvhZR5kDxQ", "chFA8wLA953", "cIb00kbYPgm", "cjoj6MxgfxE",
"cJrxpXipqCm", "cMR1ECoHpE4", "CmRKRa25mZu", "cnCuI3VeJKt", "cNUlz8NllVu",
"CoySgwRgeRE", "CpZyeEzz39h", "CqIH5ytvqTS", "cRbK3weaIO6", "cs2MtDT1y17",
"CSVVXoe0xGC", "ctEZrxoEucg", "CxCDdfOd0Nj", "cXzO64qne5O", "CZq12nSSyn9",
"CzTmTRr0krx", "d3F3FBUFtWi", "d3f8P40FxnS", "d3thFMLEOGr", "d3UA2wZLHlM",
"D3wXzwwrBE7", "D4Bb0bZE5eK", "D5BprGY8EIU", "D5F054OKtW4", "D9nOWZAX3yT",
"DAcTRfO0CNG", "DbjU3iBZtGx", "Dd4sp3zIfSJ", "DDC8Dws74Zz", "DEFzmar1QtJ",
"dEoQWkLavTj", "deVhoPko4Bh", "DFBDO1gXQwf", "DfdvXXyNSoV", "dGCqYO3Zi6p",
"DGDkUV76OgX", "Dgt3VcFh8rl", "DHdEugYqcEI", "Dhku9zrZoJe", "dHokR5oLiIl",
"DhPZWGceA1Q", "DiKXevYOYNB", "DJIgnE1QQbB", "dkR7YOB6UT6", "dKy3aHycCap",
"dl9g8UYxk20", "DLmEBtWqO9S", "DLza3NSQYUI", "dmUHnTHgfYg", "dnRXJOdEzdw",
"doRK8OhG0kd", "DQaEryfraV6", "dQk8ubXxXLX", "dQOwWKXxFeq", "DrHlSXIHalR",
"DrLeENdZwxX", "DRUaAOrybxb", "dSJcUkmJWvZ", "dSuHNzaRaSf", "dtDftsTowRA",
"DVF2BNdSzV9", "DW7NajJs9ry", "dw94DZyrpUZ", "Dxa8RiDlXB6", "dXBB3LIqhd8",
"dY1ATXbywBu", "DY3V0E6pUYD", "dYIdx3HoWbL", "DZMyvdZEDeB", "dZrjKdqCi1w",
"e2cMNKCnHOw", "E2g3H9rUdML", "e59NHDOFTWC", "E6KoR8hXk7P", "E6vLBntf9QE",
"E8PnLO9QRcE", "e9NQxtBNruk", "e9QjFd6fZ4I", "EAdX1JPb4Dm", "eCGBeD0uz0D",
"ECHaJeidpTR", "edLdPyMjbaz", "EefeXxr8yDS", "ef6tzAcpMeF", "eFB6BfJ2BTY",
"EjFYleP5G9K", "eLGdmsoRjWn", "ElmgbenqYn7", "EM5PauW0KWg", "EmhBF1JUw3i",
"enR40fiMtoo", "EpxhEmcMVXh", "EQpPsVwWvqz", "EQtHhnAYjJp", "erfgs35WGXU",
"eRNEYF9OfA8", "ERqjIjzKnNm", "EsdcJsyJTJG", "ESNgljw6VvC", "eSZjKIwHPYi",
"etyPfIkrlrM", "Eu1JrO8bBkB", "euFWewBZ5Xr", "EVaNkH5nz1s", "eXgA6Zfn6KQ",
"EXIi96SW1Bm", "eYPdhvwFirr", "eZ2NazTVbb6", "EzN8D82lOTp", "F03oK0VRgyk",
"f0WCSs2fwvv", "F3CHKWYM2Pb", "f3FoF8cpKiH", "F42k81lXXMO", "F8ZvmoAy2bh",
"fd5zuIbL3Qd", "fDN9KAuRv2o", "FdqK3U8rDRX", "fG2ws21A6Lj", "fgDQSAYp5pj",
"FGjbxwib4q5", "FgLXwaIGGbn", "FiqXUXkRHXr", "fiuesJ8f3xw", "fJAqAOFzB2b",
"fJmQ6P38mHh", "fJy2O3xh1fV", "FjZuMxKuYvb", "FKe5fQHbu8l", "FKuw35vjqRz",
"FmAQ159jI3w", "FMGmKkEOmV4", "FmuzZuFFMzD", "FMX7RNQIwYu", "FNUYBvpbWaA",
"fnzDrz05g0T", "FO80di9Jxuk", "FOKfyVchS21", "fP0XmUTTfks", "fpCA3TMnMA3",
"FPkj0JvlmyK", "FPSoejJAWSU", "FqkwtkM7eXB", "FqlQZiGKxpr", "FThJa71HEEs",
"FVaQ3fSHtT5", "FvQrsd2gVeu", "fx7bCRgdYic", "FxrH3E1ge0f", "fYtsyMj84LY",
"G0EID1cpxEB", "g4jJZ1SNP4I", "g7AYmMzlRL5", "G7hnxrBDXd2", "GCQVHCnV25O",
"GcYMteoIkw9", "GDCM1IWa7Zh", "gdsUTJnwdzb", "Ge7oZ5R4iBk", "gEff10Pq35y",
"GFPi9bpW3sN", "ggMEnqgD9kD", "gKR0a28tTp5", "gKRGyOXbpzj", "glEuzcZNWIM",
"GlWdTuycHxs", "glyDmwEFzrr", "gmmjFqs7MFB", "GMWgNQ8JB1r", "gqFwQOY1wSE",
"gqNh2d7WJva", "grKa7EwswRX", "gsIY3JD3iHh", "GSWPAgMxhy2", "gsX0auFXP9m",
"Gtef53Qyxrj", "GTQqEhUUV1F", "guGv3PY445Z", "gUve5bZAut8", "gVZ58EQOH6K",
"GwXv8OX78AT", "GXIQmznIdQe", "GxIVLRDNmVF", "GziA2Vc0HX4", "h0RMK448nhs",
"H0vjaO76Wg8", "H1G7wWYemSm", "H3mOm6sbODE", "h4IQGhyYAQp", "H6LR8zRVQLW",
"hAoSAyLR3I6", "HB6ZBS6kyJ8", "HcKIEHFgpDb", "hCuRPOStRLU", "HdTW2XJg3IO",
"HdxFUpXFp2O", "hFwwNnFm1B8", "hHMHykeQBua", "HI3Z2eSmWYl", "hiRGzSqrLx5",
"hjeei4JLTiF", "HjwC2LDSWHK", "HlElMRh1t6W", "hlIZJlEsd7B", "hLwLFwQgUdb",
"HmIC1eI4aEQ", "HmuBn2Tdutx", "HN6AdgqShbf", "hoSu28MRYPv", "hq6x4qBOYsg",
"HQHoA9YKMAI", "hqvimuJJhKL", "hrpWiEmnynY", "hsLoXTDJDib", "htJFOM9EYmH",
"HU4RdTNlezp", "hWWRAoV26mI", "HXA0U1WlIhx", "hxckGietsww", "Hy4Uo9AjrnA",
"hy52ywnDIAM", "Hy5stTfQzCG", "HZd0k5dqZ9h", "hZV0CekLNni", "i0rzEGmhViY",
"i0UbyVCIMMY", "I21MUYJoVMy", "I2G30Bxw2BX", "I2tQnsS7wn6", "I3n104WlitM",
"i3UCGccuhCZ", "i4KTQ0RGK3T", "i5GWQwiObW1", "I5NWo4ucWB3", "I6v4GYaXpQC",
"i7xMMyJ6A6E", "IAvpgvgrG0f", "iBB477oQopG", "IBhZ0h4Ap4D", "IbltT4i4TK6",
"Icts0NC4qAd", "IesVnrPQeSZ", "IFINQSPg4YM", "IFTZCzzniHQ", "IFvY9G1PHAV",
"igDf6uUnTYe", "iHIs3hIFf0i", "IHWMvXnrYmQ", "Ii0xFlLHHXz", "iI2i5pPbl5B",
"Iiwy3Zv7iLb", "iJax0w1KHEN", "ijl4gbKzr3X", "IJwB2CRmy7D", "IKMMHGYtcDC",
"ikpa1wjF92j", "iL8UKqtpf9G", "ILiQ2JLmcLT", "ILJAF0UeEJj", "in5GYhicsOP",
"INcVgc44sm9", "ioVTytF5utn", "iPY8yPbKyA0", "IQIfv1gEqzC", "iqKq6QyUII5",
"iqopOI7y0N3", "ITafa9GjY9I", "ITzEvGOU2GR", "IuymlqNZCLI", "ivq1Bh0PvUd",
"iwrIeTg1XFz", "iWvqk82htTQ", "IxcUubx1fw5", "j2k93SJevE1", "j2X8kPMcchC",
"j6UnkDFKZc1", "j7218NqxjYe", "j8DdqpZn2qc", "j8FYrPT09Sd", "J9JOpPQB23Y",
"jaDbDaXw0Pc", "JcZ2R7KZzTq", "jdswhtT866l", "JE6sdkvuc9S", "JeSc2hThLHY",
"JEWdR4I9TIm", "jf0RxRXJQD0", "jFFOiUs7WoZ", "jhngb8KdYU1", "jiIV8o3C0qx",
"jJ1tYGFTuaR", "JJD60zjyHFp", "jKg6rpNATKH", "jlaaYySSxTv", "JlEPa3N6EgO",
"jlZ6LAYKEo9", "JMhFN7V0B1r", "JMr6AvPnW1M", "JnJtmnGCY95", "JnsP1SLvvsw",
"jOl9gZtASeV", "Jq4XG5c63t1", "JqfwjhLrHs7", "JrxejHLYDML", "JTNDUJAu3DA",
"jUtaZ7I8azt", "juWqrHQgdew", "jVb0CSg6sIR", "JVHpkK4exDw", "JVk9m9vVA1D",
"jWFefvuCwnA", "jXoQbHS18G7", "JYfu3Ld3AuN", "K2Lh8hkI6ST", "K3RIalye4fw",
"K3rIsFyLwv7", "k6fqIh47UYc", "K7re2lFVRfv", "K9HNTtT80IM", "kAQQIuh4eZr",
"KbEhvcWmvAf", "KBMxpwB6DCO", "KBybjbIp9VK", "kCdAI1b02G6", "KCPICjUZcE4",
"kCQMO6wkkV5", "KCtzRrOqmal", "kdDCRlEWqYr", "kdUL3XxL1bF", "kdXwwhZfS7V",
"kEeOSZheoND", "kEhPOqEXXk0", "kGE4jAoYn5L", "KHXn2gzpI0j", "KjMGcLd3XXd",
"kK6NYM3jZkd", "kKsL2QkNR4K", "kl6QWeL9RDW", "klThMLasoQV", "KmfuUMQ7T93",
"Kn9F1mXO0GV", "KNU8WQL2zSc", "KP6O1BkuoPX", "KPF6QKOADPR", "KpV6xl78isl",
"KqyKD3POUbS", "KQYxmgQNUSD", "KRQ61nuKa1b", "KtDVkM6bDeW", "ktTYjYLEW3v",
"kubDpNzUTG7", "KujnNfVcY2N", "kVJ0jf7P7Wf", "kWBZ1e0JH5h", "Kwts2m2rUUp",
"KxEa3dXzAYv", "kyGz0JzX3Z0", "kzHnYcum1wX", "L3iJ4hZ2ypn", "l7dBO27dhA6",
"l7RKRoGgmlq", "L7xlpOoRnWm", "LaH8j5yWJZ1", "lawU1EpVZVc", "LBEkbl9SzHf",
"lbvPWYrpTPw", "LcWVIO0Jsqj", "LDmpwdWKomn", "leQOMrPQiqf", "LFOfMnjCDvJ",
"lgEnN00o6mZ", "lGgWFnakeII", "LHie5mY8Uj8", "lIEtVHeJ086", "LiLYwGv2WWN",
"lJ41xkkb1jI", "LJFDVm4S9HF", "LJzqA45qmSZ", "llQAyMkWXID", "LmBKIXa2mSL",
"LmwBbNZehh2", "lnkTmWmupfH", "lPAr5SfstTF", "lpCdKHJgyDr", "lQfxQMSOVqP",
"lS1XvFsr6no", "lUDMkJxSxHL", "Lw70k8Wjzp4", "LXWKW1xwmoZ", "lYNYlzUvgos",
"LYZ27cymGw5", "LZ1OWhYhPiZ", "m4ue4ZOdIep", "m6E2SxuEKtc", "m7fmNp4WilZ",
"m8FGZ1tP0UE", "M8kI8XD6qF9", "Ma2YKDqULAr", "MA3CYGbUEaG", "MAk4KZRu1L9",
"MAtmMxsNpeZ", "mC01s0xdGEm", "MCE5Y33BYDN", "MCT0SGxhkuU", "MdmyzozNJ02",
"mDNJnXJ3Bap", "ME541MEplIz", "ME9FWjRMe4e", "mePQU0trYhJ", "MFT0CnzHbgk",
"MFy31o7euAb", "mfZwiJJpZcR", "MgptQftlksp", "mgUgOViogq7", "MI2vOsP8NSo",
"MjCkEceL336", "mJY0L6TiTId", "MkU5WMbgI4U", "mKYg307awDr", "MM5BhvP1qVK",
"MM5CMbf9hxl", "mnshKO7lVDt", "moicbsA41fH", "mOSub2ULY1O", "Mpi4Xzop4kw",
"mPQwmRVhsKK", "mpxTG4BSHvb", "mR9nchmQZXC", "mruhLKuBF86", "mVZB3R5M66F",
"MW1EtjyMl5d", "MXHQSQfyHl2", "My1mHzVMqV6", "Mynld4Vekod", "N1giIHXfzhb",
"N23VxXj21Wv", "N2gVM6xHjXX", "n33C6ztvpqu", "N3LQS3eat8p", "n46vbqoLchh",
"N4rlgJRGUs3", "n5H2FaL7kap", "N5PPLwwES0c", "N6CPQoLRnz6", "N8nfcWXZtit",
"NawPD8q2KC8", "nChFLgqqH0w", "NCqjtm01Y4E", "NdMiR2VVel6", "nfR5nCiNHMC",
"nfwoSSAiWjg", "nfWs6WgmRC2", "nG7qJqJR13Z", "NGHkoHvBwF0", "nH6JZBFhCXs",
"nhfdWznpsqJ", "nhnQpVPQ7zK", "nhsj9HCnhEs", "nHtTsUMZoVG", "nIhIdZmXLXS",
"NIsmtALRuS5", "nj2KML2oqvV", "NJKcpotvrAQ", "nkXtOreJnSJ", "NLBLC0uWFuB",
"nmdSUueCjti", "NP8pgYnty0q", "NQxDKw6jGTj", "NSZxDwLVCeC", "nUanptGavqT",
"Nv5WX50ktwr", "nvJQYEQIFFM", "nvXHNeXXvJ5", "nwbO0NqAg7S", "nWJFiQq1vDL",
"nx2J294i6hk", "nxgu0uT1tLT", "NxKCqlm0eTG", "NYEpdnELJ54", "nYIBsKHueFr",
"nYnOM20f4fb", "NZxaguajfAY", "O1U2KTQp7RW", "O2p0zdfIFmP", "o3nzTkLC1Pl",
"o3pKyi7ckFO", "o4gtcJidna5", "O4slz8eLLn6", "o79rSRM0UlM", "O7qGvpaAt2w",
"oByIGUGsrgx", "od9Sosf2Y0V", "oDTFc2FqImi", "OdyuvCVU9Hz", "oEFK7vjkTU0",
"oEXOZcbaHxA", "OgLIyzin181", "OHtxRBRAzYs", "oJNbeCd6bvb", "oJsgj7WMDkq",
"OLEt9ovMHrz", "OlkZe7ivV0p", "oN0anW8xCpq", "oNDzB1D5as4", "oNfV9ntBJ9u",
"oNttkuJFbwC", "ooElCfPc54o", "OpEVn6IiULE", "OQ3BQRswMx7", "oTB157EY3jY",
"otmVyzT3xRC", "oUWkMygGP2W", "owxf1XoQ3Lu", "oYgWYWUVt2h", "OYjhvD7DqIP",
"oZfnfo46pS4", "p1NV2hE2fCZ", "p25NocgpHkc", "P2eQdjxbuZo", "p3T3oB4tfNN",
"P3Uob5UKAoM", "p4hBFnI8WIp", "p5L7w9Tjay3", "p7C2DczQikw", "P8tFheT6TtS",
"P91Rf8wCj7Z", "p9J64kFu5Fd", "PDOfJJdpbob", "pdRTIO2JqPL", "PDWC7RxX4t9",
"pEAFBcOJIVF", "PEfq6d3TONP", "PeNS8yHqYH1", "pEvaEn24SR1", "pg9F69FU9fh",
"Pi6v7zcA26e", "PibIwh4xKHI", "PicYz4ZaEkF", "PIm96jtkVB5", "pIVjHCsQgJI",
"PJI3sARzQAG", "PK027w8aZ5K", "PKfz9RYfKzF", "pl8h1HdqpFW", "pl9IGnhmOJc",
"PlISiBPN3db", "pMiRPEvyleJ", "pMtEAU5iVTB", "PnB0GLiMdBm", "PPb3XMcCAf3",
"PSdLvfFlDRF", "pthlRKVLgNp", "PTZfXfOkUR1", "pWmPB9No5RJ", "PWXwPbUM2DB",
"pxPQCkuJZrl", "PxXh1I86blw", "Pz198xRjRHD", "q2UUKkPtvll", "q4hyZcb2pgA",
"q6ke2WlwbWr", "Q75pcfnDLwr", "Q86baYhZPOB", "q8fmqtJVDhh", "qBrBhSbFC0d",
"qc9eMgI8Y95", "QCY2lUMpt7f", "QDkCAOGVng6", "QdYKp8ivavV", "qeBFicifeNz",
"QeKGz2D6wNe", "qEt7nmwua6v", "QGJz6Rv3qHU", "Qgzh7S5pLc3", "qHaaYvuNGIB",
"qiBueINJbti", "qimfq5GL5mV", "qJsVouyMqE8", "qlnxDl1BOrw", "Qlt1DOyb7iP",
"qm0fcx7VGOQ", "QMT77ObrHQa", "QOyCdSRSUXL", "Qpj3LVa0kMf", "qQ84fCTxdGh",
"QRaKmOedEZx", "qs9EipoiiBD", "qsPQEZph59z", "QTFJClMfP8c", "QtJyTjN5faU",
"qU7z54bY9jA", "QvByLV2hsHo", "QVFUUfes7vc", "QvQ5bpVOJDj", "Qwzbgh4Flmx",
"qx2DdF2CKFL", "qXdueHJNqcv", "QxSfgx5QfT7", "qxuRrLWQmXL", "Qztk8cjmz1e",
"r0ehsy1jjxa", "r2w7bZu3FsL", "R3ac44RpwRG", "r4mXVpHUWC7", "r6p12UeHOyg",
"r9efDheFtk3", "rakWSnvNhWr", "rbBZoYFr4DM", "rBtlT7YCRKx", "RdbYAXOnm2S",
"RdM4hjZsFRg", "Re2M8SlCc98", "RfmkqgjDUPL", "rgAmPaAHmNU", "rGbQXTyOdmW",
"RHpQbDCZK5O", "rhxxSbYXZRR", "RiIZqF2hfqY", "rIR8cwAz0sf", "rJ3tipUjVQ4",
"rlAmYWNUTnR", "rLiYzJJRiBA", "rLOyzoOdZqC", "RMKAo2HcVkM", "rnGH1Q5IyIU",
"robJRJuEFfM", "RovRnV9RWFd", "rpmWXDmHjsq", "rPPdTvv1QoY", "RqLdtXwHdGO",
"rR1aDWav3z1", "RrjHJQJDQSr", "rrZEwHEjjy8", "rsM3sdDc3Lk", "RsmDQZSmpD7",
"RtK3aS9WP2H", "ru8BHTnYxI3", "RU8DlKBg48x", "rUysfjKrKqk", "Rv3o89GkqWH",
"rVC8KePJHu3", "RvCLp5qbvtz", "RvQqAbOcEfA", "RW617O0UjQJ", "RWvmueaioAl",
"RxADuUq1Ba1", "RxHTSbz8VN5", "RxND5KsxzvW", "RyRJf2UHJL1", "S1Rh4YnCAAZ",
"s28njgt1wYe", "s4eb8Spa5TC", "S6gaiIWGmh9", "S6X4d5WHA1H", "sAnH4cWV41G",
"SATZgjyfpdZ", "sAyk7hwXEbV", "sBu9GwU5IKe", "SdlDgZMNxqX", "SegMIAP4dhw",
"SfB5NwJXaot", "SfPGp94cYZa", "SG0QMcMgRRq", "sG2EfH7UYLQ", "SgK14sd0Fq1",
"sgNOxONNZIv", "sGXcrRdwzAk", "shkTq2LdpXw", "si6qmHhCV9F", "skR6XpFhu3u",
"skuXY545bae", "skVM9VC2v6H", "sLkylFDaonQ", "sLQ3GDMCRSz", "sMVuTESYbpd",
"soMCF3RbHqt", "sQhxc449PV4", "sRbFOoSk7qZ", "srTptJGYtcK", "SSS1hmwqHOR",
"StSjQheznIv", "SvLVieXqQT7", "SVR6pSBhbCb", "sWlH85siDIT", "SWTCBn32M8D",
"sYBdL54a73r", "t0V5NCdjdPi", "T15MpYA7f51", "t3snPDHuVBW", "T5LdflE3Peq",
"T6RUeMH9KP0", "T6VbSgxjG4o", "t9Fl7c8SJbm", "TafeAKXESCA", "TBMPJiR0PKA",
"tcjz9dmJW4y", "tDDh1EjIZkh", "TE62MxBLgne", "TE7dhvcKVwp", "tEiDKptkacd",
"tEr481bYdow", "tfEtbnUgkGv", "TgpNd1eUCH5", "tGV21Z1HgXN", "thQZhxRh887",
"TJp862VOKlS", "TK7P7QXIDOA", "TKb3FP8mXY0", "TL5cvVAN3cA", "TM31sX4CThP",
"tMpwPcDzIfU", "tNf8m963xKK", "tnR9XvFJ5d7", "tNu7AdZ5358", "tOEYJ1EgIkn",
"TqSXqCuyodR", "tRgUTgCKu4J", "troIuBzxemz", "TSQWaAvOer5", "TsSlV9eE7Mz",
"ttKnsfno2BN", "tvoTu4cpYbh", "TWJPFfCeHES", "twyDPmlDNjH", "TyjDUvHkCAx",
"TYYPCGssY7i", "tz00ETYw78Y", "TZ307ap3HvE", "TzPwGs1AcCL", "TZxEGcWjbdk",
"u0ezFwC4OLL", "U3DjjRVyEun", "u45lZujojLF", "U6Mo4GsQKwT", "U7jt55boMwC",
"U8feQBluEhj", "UBe2SLdSmxV", "uBjjsyieqtr", "UccWk7OAtZ2", "uDXFpf8Ko6P",
"uE4KejhmDyk", "uGfkThgxZsI", "Uih0KGtvZeo", "UIyI4hkq7Bx", "UjoXPWJKPXb",
"uKFFT93nPmp", "UKSoohp2vBC", "UL70316n0C2", "UlD5QNXAW40", "uLDFnAy4ro0",
"UNxoCz1KXnW", "uOmh6keHjf6", "uormVxMEerw", "Upe0kYdbeUy", "UPSbASHNQmU",
"UQ1K5VqXqcZ", "uQvg5rWo87I", "usFB6MgBB6t", "uTeZmtXQzSN", "utgv86YyClH",
"UTmdWR44H5x", "uUmAJIXkmsO", "UUsAfkqIPhV", "Uv6Baj6YaG1", "UV9ZR51T6Ts",
"UvVxiC7b1jZ", "UW4ZNlm05Jq", "UxEq7311Xzd", "UXhcOzwv9o5", "UXSSmcXoWR8",
"v15yxuZyGjR", "V1MbBFGqwbB", "V5LD5oYeZys", "v6BprVsEEt2", "V7Hl62C5Wgz",
"Vah8YYh5HI5", "vbDOTEMQjfW", "vBjsjEqsmWL", "vBym1l507tA", "vf1kkxsjkB1",
"vFSbE4W5Kg6", "VfZPt9kXxL9", "vGLQ19KWuBv", "VHK1T5sygmw", "VLuN2iZ9oZp",
"VmwVU8HFDBn", "vnaUuR9C4FH", "vniKeY4S1Ru", "vOu023c0Snx", "vOuGO9bkEUa",
"vrQvRBzXiLv", "vRRoviRJVgX", "VS1mxlo1mVx", "VsFXXXagVmp", "VSHfWQyhzUu",
"vugElbQMtcL", "vvaX4oKLyKo", "vw87QIZ7dhk", "VWLVmvtDCSI", "vxXQe9jxSPE",
"vy0hyVTrTom", "Vy1JFQbsNBB", "vzGc2nPWraO", "vzVRv2jtJxL", "w0aCC4wNNzW",
"W1wtZLbWuY0", "w2yXiR4CyWt", "w539HzekPQh", "w55gRgLikEN", "WBhss2tvLa8",
"WcPEy9epMgd", "WCSGolF5yhy", "wdcS5ORWZte", "WDyq0ryAjpn", "webeuXrveDi",
"WeSJR8GDPmC", "WFApCUf18Lp", "wfFCmvMEGOQ", "WFiPvuGJf9O", "WggRnJplCQI",
"wgqFTVU7Iky", "wIMmZwl1gpX", "WjCGPzMzLVr", "WJfiDULf7ZC", "wkl1yyAzga3",
"wlspYUyDoQM", "wm060hpEM7g", "wMPB6u0GZDL", "Wn07Tbv74qp", "wNha3idA7l6",
"WnZVpXq5XCO", "wOe4JHkqbUm", "Wog7gclb7TJ", "wq4bmXnJK45", "Wq4O1nlYk1C",
"wqUwUpMD2mJ", "wrGYa8E94Yc", "WSAfRmiEJOF", "wSP90pEfCng", "wSW662GVwZP",
"wtoXU3G9YIy", "WtPSqPwjH2f", "wtV2TtEPCCZ", "Wtw2jbyaHz2", "WUChzooYWJ1",
"WUFgPdTN02g", "wUQiuRjZxiO", "Ww9Rq2KLlqV", "WWabB2sc4B7", "wxKEHpSLvib",
"wXnoTA2MDy9", "WYk4A1fVYD7", "WYMXHupBG7P", "wzD83xmvR3b", "WzemydwRD0R",
"X4ZVDdDd2xa", "X6efCWparbb", "X6uv3PName4", "X7deWPhTiIy", "X8TsrtMQFiu",
"X8UmaBiq1yy", "xbJCVaOZWp5", "XbjRzgMPN24", "XbubJh2yjOw", "XcqBCAaLcq5",
"xd8LIlN7N8h", "XdKVljaiZ9j", "xeEUMp35d5m", "XeUDpg1CTKf", "xf9Q4yYDlq5",
"XFEHZnnEGkT", "xFO9GKAXi1n", "xfxtwRZ7Ejp", "xhOpIbHQy8I", "XjBkSXvZLOZ",
"xjfIPJ04cET", "XLt8l1uPicg", "xlYle4v5GZ8", "xmJNiAbmSfe", "XnkRi1jTMKr",
"XPhxWI0fDyq", "XqDQsrhQ7W5", "Xsd3yzbnFOf", "XTF6vymtG8J", "xuovzIjWZUG",
"Xv1I8z1cK76", "XvVmyn071HT", "XxBMueAFsnk", "xxVZKlzMYJJ", "xyr4dO4G3tW",
"y4rr2PbfufS", "yaa2uBLsdRa", "YBG39jGSV17", "yDcnCB4aZEX", "YDuoFIKpONe",
"YdWxRCaQR2D", "yfgSogitBGX", "YFi06xiFHWs", "YFi2V7qfmJf", "yfpM2zJ3Zuc",
"ygTl7hih5qi", "YGtrgJxKWiU", "yIcfnuZhejK", "YIxt0WtezdT", "yJ014QFEqru",
"yJO8QTnBF3o", "yKfdWuLsdDx", "ylMgcLnwgce", "YNy9ymD2A8p", "yONz8gph9A7",
"YowwYq8CIXJ", "YPsxC0bl7T2", "YQP6diqjJAl", "YqR6LoSk2Ed", "yqwh11CvYXU",
"YRemZ3p9bFA", "ySxRSgTOeqD", "yTvx2IJ0w0z", "ytwga9hKjVj", "YtyO06HBaVr",
"YvEkkZlNeCK", "yVFdJkYsLK5", "yvoQHXHGvbT", "YVT9zsaVBzp", "YWbmL6VK8R6",
"Ywm8eA9tZHe", "yXady1QV27H", "yY7MHufA6C9", "yYG52aLO1GK", "yYgG4h097xR",
"YyhPAO5yx22", "Yz5yhyHf7Ul", "z2cGjpx37Mw", "Z42m6cWsI9m", "z4DptoHrJnb",
"z4kLOdnL1Op", "z5tZes2s49Z", "z5WklS85YjT", "z6bId6qlNk4", "Z6ZZLw50mAM",
"z8MwD6T43n2", "z8UkGdr2xNs", "Z90jET09ZrD", "zaeb1Zos2Mu", "ZBkpY2KdibX",
"Zc0BcScQDBU", "zCjn57zZQVN", "ZcrdEBruDka", "ZCT4YbaBFUb", "ZdVIx83rdI7",
"zEQXA689E4a", "ZfjQmCjVKRF", "zfutn6ulVcO", "zFzYdXMnPoP", "zG4JqtM8wHO",
"ZGyAErBl5PS", "ZifoCg4OvIj", "ZJ6MAab9PJE", "ZKVzRmYkKzQ", "zlG1VmpE6QQ",
"zN6xXPgmzqK", "zOfDRrZmbQO", "zOGa9wLHDFE", "zQmuipEUYbz", "zR7UekDUG3X",
"zrs6iFpEtF1", "ZrUjQFzR1gM", "zTnxsAMqHRP", "Zu7gpmcwfqY", "zvOkAI9ewwE",
"zvv07VAowTS", "ZWAdop7zYgJ", "ZWAEE8DrywN", "zxIlF5RwQFi", "ZXONCt7P01p"
), class = "factor"), Time = c(1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L,
4L, 6L, 1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L,
6L, 1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L, 6L,
1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L, 6L, 1L,
2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L, 6L, 1L, 2L,
3L, 4L, 6L, 1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L,
4L, 6L, 1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L,
6L, 1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L, 6L,
1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L, 6L, 1L,
2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L, 6L, 1L, 2L, 3L, 4L, 6L, 1L),
Count1 = c(8L, 16L, 16L, 16L, 8L, 12L, 24L, 24L, 24L, 12L,
8L, 16L, 16L, 16L, 8L, 8L, 16L, 16L, 16L, 8L, 8L, 16L, 16L,
16L, 8L, 8L, 16L, 16L, 16L, 8L, 8L, 16L, 16L, 16L, 8L, 12L,
24L, 24L, 24L, 12L, 8L, 16L, 16L, 16L, 8L, 8L, 16L, 16L,
16L, 8L, 12L, 24L, 24L, 24L, 12L, 8L, 16L, 16L, 16L, 8L,
12L, 24L, 24L, 24L, 12L, 8L, 16L, 16L, 16L, 8L, 8L, 16L,
16L, 16L, 8L, 8L, 16L, 16L, 16L, 8L, 8L, 16L, 16L, 16L, 8L,
8L, 16L, 16L, 16L, 8L, 8L, 16L, 16L, 16L, 8L, 8L, 16L, 16L,
16L, 8L, 8L, 16L, 16L, 16L, 8L, 8L, 16L, 16L, 16L, 8L, 8L,
16L, 16L, 16L, 8L, 8L, 16L, 16L, 16L, 8L, 8L, 16L, 16L, 16L,
8L, 8L, 16L, 16L, 16L, 8L, 8L, 16L, 16L, 16L, 8L, 8L, 16L,
16L, 16L, 8L, 8L, 16L, 16L, 16L, 8L, 8L, 16L, 16L, 16L, 8L,
8L), Total1 = c(64L, 64L, 64L, 64L, 64L, 96L, 96L, 96L, 96L,
96L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L,
64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L,
64L, 64L, 96L, 96L, 96L, 96L, 96L, 64L, 64L, 64L, 64L, 64L,
64L, 64L, 64L, 64L, 64L, 96L, 96L, 96L, 96L, 96L, 64L, 64L,
64L, 64L, 64L, 96L, 96L, 96L, 96L, 96L, 64L, 64L, 64L, 64L,
64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L,
64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L,
64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L,
64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L,
64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L,
64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L,
64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L), Count2 = c(4L,
8L, 8L, 8L, 4L, 4L, 8L, 8L, 8L, 4L, 3L, 7L, 8L, 8L, 4L, 4L,
8L, 8L, 8L, 4L, 3L, 8L, 8L, 8L, 4L, 3L, 7L, 8L, 8L, 4L, 2L,
4L, 4L, 4L, 2L, 3L, 5L, 8L, 8L, 4L, 4L, 8L, 8L, 8L, 4L, 4L,
8L, 8L, 8L, 4L, 4L, 8L, 8L, 8L, 4L, 3L, 6L, 8L, 8L, 4L, 4L,
8L, 8L, 8L, 4L, 3L, 4L, 6L, 6L, 2L, 2L, 4L, 4L, 4L, 2L, 4L,
8L, 8L, 8L, 4L, 4L, 8L, 8L, 8L, 4L, 4L, 8L, 8L, 8L, 4L, 4L,
8L, 8L, 8L, 4L, 4L, 8L, 8L, 8L, 4L, 4L, 8L, 8L, 8L, 4L, 4L,
8L, 8L, 8L, 4L, 4L, 8L, 8L, 8L, 4L, 4L, 8L, 8L, 8L, 4L, 4L,
8L, 8L, 8L, 4L, 3L, 8L, 8L, 8L, 4L, 4L, 8L, 8L, 8L, 4L, 3L,
8L, 8L, 8L, 4L, 3L, 5L, 7L, 8L, 3L, 4L, 8L, 8L, 8L, 4L, 4L
), Total2 = c(34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L,
34L, 32L, 32L, 32L, 32L, 32L, 34L, 34L, 34L, 34L, 34L, 33L,
33L, 33L, 33L, 33L, 32L, 32L, 32L, 32L, 32L, 16L, 16L, 16L,
16L, 16L, 30L, 30L, 30L, 30L, 30L, 34L, 34L, 34L, 34L, 34L,
34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 31L, 31L,
31L, 31L, 31L, 34L, 34L, 34L, 34L, 34L, 22L, 22L, 22L, 22L,
22L, 16L, 16L, 16L, 16L, 16L, 34L, 34L, 34L, 34L, 34L, 34L,
34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L,
34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L,
34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L,
34L, 34L, 34L, 34L, 34L, 34L, 34L, 34L, 33L, 33L, 33L, 33L,
33L, 34L, 34L, 34L, 34L, 34L, 33L, 33L, 33L, 33L, 33L, 28L,
28L, 28L, 28L, 28L, 34L, 34L, 34L, 34L, 34L, 34L)), row.names = c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L,
29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L,
42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L,
55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L,
68L, 69L, 70L, 71L, 72L, 73L, 74L, 75L, 1041L, 1042L, 1043L,
1044L, 1045L, 1046L, 1047L, 1048L, 1049L, 1050L, 1051L, 1052L,
1053L, 1054L, 1055L, 1056L, 1057L, 1058L, 1059L, 1060L, 1061L,
1062L, 1063L, 1064L, 1065L, 1066L, 1067L, 1068L, 1069L, 1070L,
1071L, 1072L, 1073L, 1074L, 1075L, 1076L, 1077L, 1078L, 1079L,
1080L, 1081L, 1082L, 1083L, 1084L, 1085L, 1086L, 1087L, 1088L,
1089L, 1090L, 1091L, 1092L, 1093L, 1094L, 1095L, 1096L, 1097L,
1098L, 1099L, 1100L, 1101L, 1102L, 1103L, 1104L, 1105L, 1106L,
1107L, 1108L, 1109L, 1110L, 1111L, 1112L, 1113L, 1114L, 1115L,
1116L), class = "data.frame")
One option is to group by 'Class' and 'Site' and paste (str_c) the values of each column except 'Time' into a single string per column. Then group by 'Class', 'Count1', ..., 'Total2', use the group indices to create the 'ind' column, and left_join the result with the original dataset.
library(dplyr)
library(stringr)
df %>%
  group_by(Class, Site) %>%
  summarise_at(vars(-Time), str_c, collapse="") %>%
  group_by(Class, Count1, Total1, Count2, Total2) %>%
  mutate(ind = group_indices()) %>%
  ungroup %>%
  select(Class, Site, ind) %>%
  left_join(df)
Or a similar logic with data.table
library(data.table)
setDT(df)[df[, lapply(.SD, paste, collapse=""),
.(Class, Site), .SDcols = patterns('Count|Total')][,
ind := .GRP, by = c('Class', 'Count1', 'Total1', 'Count2', 'Total2')
][, .(Class, Site, ind)], on = .(Class, Site)]
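The same signature idea can be sketched in base R, with a count of Sites per pattern added. This is a minimal illustration on made-up data, not the answer's code. The toy data frame and the row_key, pattern_id and n_sites columns are hypothetical names. A separator is used when pasting so that values such as (8, 16) and (81, 6) cannot produce the same string.

```r
# Toy data (hypothetical): three sites in one class; s1 and s3 share a pattern.
toy <- data.frame(
  Class  = 1L,
  Site   = rep(c("s1", "s2", "s3"), each = 3),
  Time   = rep(1:3, times = 3),
  Count1 = c(8, 16, 8,    12, 24, 12,   8, 16, 8),
  Total1 = c(64, 64, 64,  96, 96, 96,   64, 64, 64)
)

# One key per row, built from the timepoint and the values of interest.
toy$row_key <- paste(toy$Time, toy$Count1, toy$Total1, sep = "_")

# Collapse the row keys of each site into one signature string per site.
sig <- aggregate(row_key ~ Class + Site, data = toy,
                 FUN = paste, collapse = "|")

# Number the distinct signatures within each class.
sig$pattern_id <- ave(seq_len(nrow(sig)), sig$Class,
                      FUN = function(i) match(sig$row_key[i],
                                              unique(sig$row_key[i])))

# Count how many sites share each pattern within a class.
sig$n_sites <- ave(seq_len(nrow(sig)), sig$Class, sig$pattern_id,
                   FUN = length)

sig[, c("Class", "Site", "pattern_id", "n_sites")]
```

Here s1 and s3 receive the same pattern_id with n_sites equal to 2, and s2 gets its own pattern. Merging sig back onto the original rows by Class and Site marks every observation with its pattern.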

Making left-skewed distribution normal using log transformation?

I have a dataset with a variable that has a left-skewed distribution (the tail is on the left).
variable <- c(rep(35, 2), rep(36, 4), rep(37, 16), rep(38, 44), rep(39, 72), rep(40, 30))
I just want to make this data have a more normal distribution so I can perform an ANOVA, but using log10 or log2 still leaves it strongly left-skewed. What transformation can I use to make this data more normal?
EDIT: My model is: mod <- lme(response ~ variable*variable2, random=~group, data=data), so Kruskal-Wallis would work except that it handles neither the random effect nor more than one predictor term. I did a Shapiro-Wilk test, and my data is definitely non-normal. If justifiable, I would like to transform my data to give the ANOVA a better chance of detecting a significant result. Either that, or a mixed-effects test for non-normal data.
@Ben Bolker - Thank you for your reply; I appreciate it. I did read your answer, but I'm still reading up on exactly what some of your suggestions mean (I'm very new to statistics). My p-value is fairly close to significant and I don't want to p-hack, but I also want to give my data the best chance I can of being significant. If I can't justify transforming my data or using something besides ANOVA, then so be it.
I've provided a dataframe snapshot below. My response variable is "temp.max", the maximum temperature at which a plant dies. My predictor variables are "growth.chamber" (either a 29 or 21 degree growth chamber) and "environment" (either field or forest). My random variable is "groupID" (the group the plants were raised in, consisting of 5-10 individuals). This is a reciprocal transplant experiment, so I raised both forest and field plants in both 21 and 29 degree chambers. What I want to know is if "temp.max" differs between field and forest populations, whether "temp.max" differs between growth chambers, and whether there is any interaction between environment and growth chamber in regards to temp.max. I would very, very much appreciate any help. Thank you.
> dput(data)
structure(list(groupID = structure(c(12L, 12L, 12L, 12L, 12L,
12L, 12L, 12L, 12L, 12L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L,
14L, 14L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 16L, 16L,
16L, 16L, 16L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L,
18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 17L, 17L, 17L,
17L, 17L, 17L, 17L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 7L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 13L, 13L, 13L,
13L, 13L, 13L, 13L, 13L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 11L, 11L,
11L, 11L, 11L, 8L, 8L, 8L, 8L, 8L), .Label = c("GRP_104", "GRP_111",
"GRP_132", "GRP_134", "GRP_137", "GRP_142", "GRP_145", "GRP_147",
"GRP_182", "GRP_192", "GRP_201", "GRP_28", "GRP_31", "GRP_40",
"GRP_68", "GRP_70", "GRP_78", "GRP_83", "GRP_92", "GRP_98"), class = "factor"),
individual = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L,
6L, 7L, 8L, 9L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 6L,
7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 16L, 17L, 1L, 2L, 3L,
4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 1L, 2L,
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L,
7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 15L, 16L, 20L, 1L, 2L, 3L,
4L, 5L, 6L, 7L, 16L, 1L, 2L, 3L, 4L, 5L, 11L, 12L, 14L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 18L, 19L, 20L, 1L, 2L, 3L, 4L, 5L,
16L, 17L, 18L, 19L, 20L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L,
4L, 5L), temp.max = c(39L, 35L, 39L, 39L, 35L, 40L, 40L,
40L, 40L, 39L, 39L, 39L, 39L, 39L, 39L, 39L, 39L, 38L, 38L,
38L, 39L, 39L, 40L, 38L, 40L, 39L, 39L, 40L, 40L, 39L, 39L,
39L, 39L, 39L, 39L, 39L, 39L, 39L, 39L, 39L, 39L, 40L, 38L,
40L, 40L, 40L, 40L, 40L, 40L, 39L, 40L, 39L, 39L, 40L, 39L,
39L, 39L, 39L, 38L, 38L, 38L, 38L, 40L, 39L, 39L, 38L, 38L,
39L, 39L, 37L, 39L, 39L, 37L, 39L, 39L, 39L, 39L, 37L, 39L,
39L, 38L, 37L, 38L, 38L, 38L, 36L, 36L, 36L, 37L, 37L, 40L,
39L, 40L, 39L, 39L, 37L, 37L, 38L, 38L, 38L, 37L, 38L, 38L,
38L, 37L, 38L, 38L, 37L, 38L, 40L, 38L, 38L, 38L, 38L, 37L,
38L, 39L, 38L, 38L, 38L, 38L, 38L, 40L, 38L, 40L, 39L, 39L,
39L, 39L, 39L, 39L, 39L, 39L, 39L, 40L, 40L, 39L, 39L, 38L,
37L, 39L, 37L, 39L, 39L, 39L, 39L, 39L, 39L, 40L, 39L, 39L,
40L, 40L, 38L, 40L, 40L, 36L, 38L, 38L, 38L, 38L, 37L, 37L,
38L, 38L, 38L, 39L, 39L), environment = structure(c(1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L), .Label = c("field", "forest"), class = "factor"), growth.chamber = c(29L,
29L, 29L, 29L, 29L, 29L, 29L, 29L, 29L, 29L, 21L, 21L, 21L,
21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L,
21L, 21L, 21L, 21L, 29L, 29L, 29L, 29L, 29L, 21L, 21L, 21L,
21L, 21L, 21L, 21L, 21L, 21L, 21L, 29L, 29L, 29L, 29L, 29L,
29L, 29L, 29L, 29L, 29L, 29L, 29L, 29L, 29L, 29L, 29L, 29L,
21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 29L, 29L, 29L,
29L, 29L, 29L, 29L, 29L, 29L, 29L, 21L, 21L, 21L, 21L, 21L,
21L, 21L, 21L, 21L, 21L, 29L, 29L, 29L, 29L, 29L, 21L, 21L,
21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 29L, 29L, 29L, 29L,
29L, 29L, 29L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L,
21L, 29L, 29L, 29L, 29L, 29L, 29L, 29L, 29L, 21L, 21L, 21L,
21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L,
21L, 21L, 21L, 29L, 29L, 29L, 29L, 29L, 29L, 29L, 29L, 29L,
29L, 21L, 21L, 21L, 21L, 21L, 29L, 29L, 29L, 29L, 29L)), .Names = c("groupID",
"individual", "temp.max", "environment", "growth.chamber"), row.names = c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 21L, 22L, 23L, 24L, 25L,
26L, 27L, 28L, 29L, 30L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L,
49L, 58L, 59L, 60L, 61L, 62L, 68L, 69L, 70L, 71L, 72L, 73L, 74L,
75L, 76L, 77L, 88L, 89L, 90L, 91L, 92L, 93L, 94L, 95L, 96L, 97L,
108L, 109L, 110L, 111L, 112L, 113L, 114L, 122L, 123L, 124L, 125L,
126L, 127L, 128L, 129L, 130L, 139L, 140L, 141L, 142L, 143L, 144L,
145L, 146L, 147L, 148L, 158L, 159L, 160L, 161L, 162L, 163L, 164L,
165L, 166L, 167L, 178L, 179L, 180L, 181L, 182L, 188L, 189L, 190L,
191L, 192L, 193L, 194L, 195L, 196L, 197L, 208L, 209L, 210L, 211L,
212L, 213L, 214L, 222L, 223L, 224L, 225L, 226L, 227L, 228L, 229L,
230L, 231L, 242L, 243L, 244L, 245L, 246L, 247L, 248L, 249L, 258L,
259L, 260L, 261L, 262L, 263L, 264L, 265L, 272L, 273L, 274L, 275L,
276L, 277L, 278L, 279L, 280L, 281L, 292L, 293L, 294L, 295L, 296L,
297L, 298L, 299L, 300L, 301L, 312L, 313L, 314L, 315L, 316L, 322L,
323L, 324L, 325L, 326L), class = "data.frame")
tl;dr you probably don't actually need to worry about the skew here.
There are a few issues here, and since they're mostly statistical rather than programming-related, this question is probably more relevant for CrossValidated.
If I copied your data correctly, they're equivalent to this:
dd <- rep(35:40,c(2,4,16,44,72,30))
plot(table(dd))
Your data are discrete - that's why the density plot that @user113156 posts has distinct peaks.
Here are the issues:
the most important is that for most statistical purposes you're not actually interested in the Normality of the marginal distribution, which is what you're showing here. Rather, you want to know whether the distribution of the residuals from a model is Normal or not; for an ANOVA, this is equivalent to asking whether the distribution of values within each group is Normal (and the groups have similar within-group variances).
Normality is not very important; ANOVA is robust to moderate degrees of non-Normality (e.g. see here).
Log transformation modifies your data in the wrong direction (i.e. it will tend to increase the left skewness). In general fixing this kind of left-skewed data requires a transformation like raising to a power >1 (the opposite direction from log- or square root-transformation), but when the values are far from zero it doesn't usually help very much anyway.
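The direction of the effect can be checked numerically. The sketch below uses the data from the question and a hand-written moment-based skewness function; skew() is a hypothetical helper introduced here, not something from the original answer or from a package.

```r
# Data from the question: integer values 35..40 with a tail on the left
x <- rep(35:40, c(2, 4, 16, 44, 72, 30))

# Moment-based sample skewness (hypothetical helper for illustration)
skew <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

# Compare the raw data with a concave (log) and a convex (square) transformation
skews <- c(raw = skew(x), log10 = skew(log10(x)), squared = skew(x^2))
print(round(skews, 3))
```

The log-transformed values should come out more negatively skewed than the raw data and the squared values less so, but the differences are small because the values are far from zero, which is consistent with the remark above that such transformations do not help much here.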
Some statistical options if you are worried:
a non-parametric, rank-based test like the Kruskal-Wallis test (the rank-based analogue of 1-way ANOVA)
do an ANOVA, but use a permutation-based approach to test statistical significance.
use an ordinal model
use hierarchical bootstrapping (resample with replacement within and between clusters) to derive more robust confidence intervals on parameters
Your variable follows a discrete distribution. You have integer values ranging from 35 (n=2) to 40 (n=30). I think you need to carry out an ordinal analysis, collapsing the values 35 to 37, which have few observations, into one category. Otherwise you could perform a non-parametric analysis using the kruskal.test() function.
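A minimal sketch of both suggestions, assuming the vector from the question and a made-up two-level grouping factor (the random assignment to "field" and "forest" is purely illustrative and not the real design). Note that kruskal.test() handles a single grouping factor only, so it ignores the random effect and the second predictor.

```r
set.seed(1)
variable <- c(rep(35, 2), rep(36, 4), rep(37, 16), rep(38, 44), rep(39, 72), rep(40, 30))

# Collapse the sparse values 35-37 into one ordered category
collapsed <- cut(variable,
                 breaks = c(-Inf, 37, 38, 39, Inf),
                 labels = c("35-37", "38", "39", "40"),
                 ordered_result = TRUE)
counts <- table(collapsed)
print(counts)

# Hypothetical grouping factor, assigned at random for illustration only
group <- factor(sample(c("field", "forest"), length(variable), replace = TRUE))

# Rank-based test of whether the response differs between the two groups
kt <- kruskal.test(variable ~ group)
print(kt$p.value)
```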
I have bad news and good news.
the bad news is that I don't see statistically significant patterns in your data.
the good news is that, given the structure of your experimental design, you can analyze your data much more simply (you don't need mixed models)
load packages, adjust defaults
library(ggplot2); theme_set(theme_bw())
library(dplyr)
check structure of data
Tabulating the data confirms that this is a nested design; each group occurs within a single environment/growth chamber combination.
tt <- with(dd,table(groupID,
interaction(environment,growth.chamber)))
## exactly one non-zero entry per group
all(rowSums(tt>0)==1)
aggregate data
Convert growth.chamber to a categorical variable; collapse each group to its mean temp.max value (and record the number of observations per group)
dda <-(dd
%>% mutate(growth.chamber=factor(growth.chamber))
%>% group_by(groupID,environment,growth.chamber)
%>% summarise(n=n(),temp.max=mean(temp.max))
)
ggplot(dda,aes(growth.chamber,temp.max,
colour=environment))+
geom_boxplot(aes(fill=environment),alpha=0.2)+
geom_point(position=position_dodge(width=0.75),
aes(size=n),alpha=0.5)+
scale_size(range=c(3,7))
Analysis
Now that we've aggregated (without losing any information we care about), we can use a linear regression with weights specifying the number of samples per observation:
m1 <- lm(temp.max~growth.chamber*environment,weights=n,
data=dda)
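Why the weights matter can be seen on simulated data. The sketch below is an assumption-laden toy (made-up group sizes, random responses, hypothetical object names toy, agg, fit_raw, fit_agg): when the predictors are constant within each group, a weighted fit on group means gives the same coefficient estimates as an unweighted fit on the individual rows, while the residual degrees of freedom reflect the number of groups rather than the number of individuals.

```r
set.seed(42)

# Hypothetical nested design: 8 groups, each in exactly one treatment combination
sizes <- c(5, 7, 6, 10, 8, 5, 9, 6)
design <- data.frame(
  groupID        = paste0("g", 1:8),
  environment    = factor(rep(c("field", "forest"), each = 4)),
  growth.chamber = factor(rep(c("21", "29"), times = 4))
)

# Expand to one row per individual and simulate a response
toy <- design[rep(seq_len(nrow(design)), times = sizes), ]
toy$temp.max <- rnorm(nrow(toy), mean = 38.5, sd = 1)

# One row per group: mean response plus the number of individuals
agg <- aggregate(temp.max ~ groupID + environment + growth.chamber,
                 data = toy, FUN = mean)
agg$n <- as.vector(table(toy$groupID)[as.character(agg$groupID)])

fit_raw <- lm(temp.max ~ growth.chamber * environment, data = toy)
fit_agg <- lm(temp.max ~ growth.chamber * environment, data = agg, weights = n)

# Same point estimates, different residual degrees of freedom
print(all.equal(coef(fit_raw), coef(fit_agg)))
print(c(raw = df.residual(fit_raw), aggregated = df.residual(fit_agg)))
```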
Checking distribution etc. of residuals:
plot(m1)
This all looks fine; no indication of serious bias, heteroscedasticity, non-Normality, or outliers ...
summary(m1)
## Coefficients:
##                                    Estimate Std. Error t value Pr(>|t|)
## (Intercept)                         38.2558     0.2858 133.845   <2e-16 ***
## growth.chamber29                     0.3339     0.4144   0.806    0.432
## environmentforest                    0.2442     0.3935   0.620    0.544
## growth.chamber29:environmentforest   0.3240     0.5809   0.558    0.585
##
## Residual standard error: 1.874 on 16 degrees of freedom
## Multiple R-squared: 0.2364, Adjusted R-squared: 0.09318
## F-statistic: 1.651 on 3 and 16 DF, p-value: 0.2174
Or a coefficient plot (dotwhisker::dwplot(m1))
While the plot of the data doesn't look like it's just noise, the statistical analysis suggests that we can't really distinguish it from noise ...

Box-Plot for Groups in R

I am having trouble making box plots for different groups side by side.
dput(df)
structure(list(UserName = structure(c(20L, 20L, 20L, 20L, 20L,
20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L,
20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L,
20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 21L, 21L, 21L, 21L, 21L,
21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L,
21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L,
21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L,
21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L,
21L, 21L, 21L, 3L, 1L, 1L, 3L, 3L, 26L, 3L, 29L, 2L, 29L, 7L,
10L, 2L, 10L, 10L, 6L, 30L, 2L, 2L, 1L, 1L, 3L, 16L, 10L, 10L,
6L, 10L, 2L, 6L, 29L, 6L, 1L, 4L, 17L, 5L, 5L, 5L, 5L, 14L, 5L,
14L, 5L, 24L, 23L, 23L, 28L, 25L, 28L, 28L, 28L, 28L, 28L, 28L,
28L, 28L, 28L, 28L, 28L, 28L, 28L, 28L, 28L, 28L, 28L, 31L, 31L,
4L, 27L, 27L, 27L, 12L, 12L, 12L, 12L, 19L, 19L, 22L, 12L, 11L,
11L, 11L, 9L, 22L, 12L, 15L, 22L, 22L, 22L, 11L, 9L, 11L, 12L,
11L, 18L, 18L, 22L, 22L, 18L, 18L, 19L, 22L, 22L, 19L, 19L, 22L,
19L, 11L, 19L, 15L, 22L, 19L, 19L, 9L, 19L, 19L, 9L, 18L, 12L,
18L, 22L, 8L, 13L, 13L, 13L), .Label = c("CYL", "FAL1",
"GS", "HA1", "HX", "HURRT", "KWY", "LEI", "L1",
"LIGYR", "LYC", "LJ", "LQI", "LIC", "LOK", "MDA",
"NMZ", "NGK", "OXJ", "P_PT", "P_SH", "PDI",
"PONN", "PEHMB", "TGT1", "TNS", "THOLH", "TOT",
"WAN1", "WAK", "YH"), class = "factor"), Division = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 7L, 7L, 2L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L), .Label = c("BATCH",
"BTR", "IIT", "POL", "PTC", "PTP", "PTQ", "SPL", "TM"), class = "factor"),
SpoolUsage_max = structure(c(20L, 21L, 22L, 25L, 26L, 27L,
29L, 33L, 34L, 39L, 41L, 43L, 47L, 48L, 49L, 51L, 52L, 53L,
55L, 57L, 58L, 59L, 60L, 61L, 81L, 82L, 83L, 87L, 99L, 102L,
108L, 108L, 141L, 143L, 155L, 158L, 160L, 5L, 8L, 90L, 94L,
96L, 98L, 104L, 110L, 111L, 112L, 113L, 114L, 116L, 117L,
118L, 120L, 122L, 124L, 126L, 127L, 128L, 129L, 130L, 131L,
132L, 134L, 135L, 136L, 137L, 138L, 139L, 140L, 142L, 144L,
145L, 146L, 147L, 148L, 149L, 150L, 151L, 152L, 153L, 154L,
156L, 157L, 199L, 201L, 203L, 204L, 205L, 206L, 69L, 70L,
71L, 72L, 73L, 74L, 75L, 77L, 78L, 80L, 9L, 16L, 16L, 17L,
23L, 36L, 42L, 46L, 46L, 46L, 50L, 56L, 63L, 65L, 89L, 97L,
101L, 125L, 172L, 174L, 174L, 184L, 185L, 186L, 191L, 196L,
207L, 4L, 6L, 68L, 106L, 107L, 35L, 10L, 37L, 95L, 175L,
175L, 188L, 189L, 198L, 3L, 24L, 91L, 92L, 40L, 40L, 44L,
45L, 103L, 133L, 178L, 194L, 195L, 200L, 7L, 66L, 164L, 165L,
166L, 167L, 168L, 169L, 170L, 13L, 14L, 35L, 100L, 119L,
123L, 18L, 54L, 109L, 79L, 9L, 11L, 15L, 18L, 19L, 30L, 31L,
32L, 38L, 54L, 62L, 64L, 84L, 85L, 86L, 88L, 93L, 109L, 115L,
121L, 159L, 161L, 161L, 162L, 162L, 173L, 176L, 177L, 179L,
180L, 181L, 182L, 183L, 187L, 190L, 192L, 193L, 202L, 208L,
1L, 2L, 67L, 76L, 79L, 105L, 163L, 171L, 12L, 28L, 197L), .Label = c("1,002.12",
"1,027.99", "1,207.40", "1,368.90", "1,599.16", "1,616.11",
"1,804.20", "1,804.28", "106.09", "106.49", "106.5", "110.59",
"118.37", "119.12", "122.69", "123.19", "123.3", "123.49",
"125.19", "126.54", "128.72", "128.94", "132.43", "132.51",
"132.55", "135.45", "137.26", "141.87", "142.59", "145.93",
"146.11", "146.52", "147.22", "149.04", "149.27", "151.42",
"154.7", "155.61", "155.9", "156.07", "156.23", "157.8",
"158.92", "159.41", "160.22", "162.84", "163.45", "166.11",
"166.63", "170.96", "171.19", "172.73", "173.24", "176.51",
"176.56", "176.94", "177.75", "181.23", "184.5", "190.34",
"190.7", "193.7", "197.78", "199.66", "199.95", "2,007.44",
"2,009.54", "2,030.52", "2,273.26", "2,440.88", "2,473.26",
"2,633.03", "2,663.28", "2,706.98", "2,723.36", "2,755.44",
"2,759.55", "2,821.46", "2,829.16", "2,835.27", "200.27",
"204.97", "206.63", "208.96", "212.89", "216.38", "217.45",
"232.67", "234.05", "251.6", "253.61", "258.98", "262.16",
"266.48", "266.88", "268.92", "271.27", "276.31", "279.41",
"283.22", "289.51", "292.47", "292.67", "298.71", "3,003.51",
"3,184.47", "3,885.86", "305.69", "307.59", "308.38", "309.54",
"310.48", "313.8", "313.91", "314.72", "317.51", "319.85",
"321.54", "321.57", "321.63", "322.46", "327.56", "328.57",
"331.06", "331.85", "333.85", "333.9", "333.98", "334.28",
"335.22", "335.89", "336.63", "337.3", "337.74", "339.74",
"341.78", "345.12", "345.54", "347.99", "348", "348.13",
"348.48", "348.49", "349.3", "350.18", "350.53", "353.08",
"353.74", "353.98", "354.59", "355.55", "358.47", "359.14",
"359.59", "359.98", "361.84", "362.86", "370.08", "373.83",
"376.4", "394.45", "395.48", "4,166.39", "4,667.87", "4,696.73",
"4,708.79", "4,729.34", "4,731.65", "4,757.80", "4,760.75",
"4,769.30", "415.37", "421.52", "423.58", "428.34", "487.35",
"491.12", "495.1", "495.91", "495.94", "499.07", "517.68",
"527.29", "536.62", "550.83", "572.71", "574.75", "576.42",
"605.69", "613.56", "632.1", "668.87", "669.68", "686.88",
"688.05", "762.93", "770.16", "781.07", "858.09", "858.68",
"864.56", "868.03", "874.65", "879.09", "886.68", "890.64",
"911.58", "954.76"), class = "factor")), .Names = c("UserName",
"Division", "SpoolUsage_max"), class = "data.frame", row.names = c(NA,
-223L))
I am trying to get a box-plot for each Division (each division with its own users) side by side.
I have tried the following:
library(reshape2)
library(ggplot2)
p <- ggplot(melt(df), aes(variable, value)) + geom_boxplot()
p <- p + geom_boxplot(fill = "grey80", colour = "#3366FF")
p <- p +xlab("UserName")+ylab("SpoolUsage_Max")+ggtitle("Spool Usage Analysis by Users")
p <- p +coord_flip()
p
I cannot produce a single side-by-side box plot grouped by division (each with its users), with a different color for each division.
Here you go:
library(dplyr)
# SpoolUsage_max is a factor with thousands separators: strip the commas, then convert
df <- df %>% mutate(val = as.numeric(gsub(",", "", SpoolUsage_max)))
ggplot(df, aes(Division, val, fill=UserName)) + geom_boxplot()
It may be neater to use the facet_wrap option.
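The conversion step deserves care because SpoolUsage_max is stored as a factor. A small sketch with made-up values in the same format (spool, codes, and values are hypothetical names) shows the difference between converting the factor directly and converting its labels:

```r
# Hypothetical values in the same format as SpoolUsage_max
spool <- factor(c("1,002.12", "106.09", "954.76"))

# Converting a factor directly returns its internal level codes, not the numbers shown
codes <- as.numeric(spool)
print(codes)

# Convert to character first, drop the thousands separators, then parse
values <- as.numeric(gsub(",", "", as.character(spool)))
print(values)
```

gsub() coerces its input to character on its own, so the explicit as.character() is only there to make the intent visible.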

Internal ordering of facets ggplot2

I'm trying to plot facets in ggplot2 but I struggle to get the internal ordering of the different facets right. The data looks like this:
head(THAT_EXT)
ID FILE GENRE NODE
1 CKC_1823_01 CKC Novels better
2 CKC_1824_01 CKC Novels better
3 EW9_192_03 EW9 Popular Science better
4 H0B_265_01 H0B Popular Science sad
5 CS2_231_03 CS2 Academic Prose desirable
6 FED_8_05 FED Academic Prose certain
str(THAT_EXT)
'data.frame': 851 obs. of 4 variables:
$ ID : Factor w/ 851 levels "A05_122_01","A05_277_07",..: 345 346 439 608 402 484 319 395 228 5 ...
$ FILE : Factor w/ 241 levels "A05","A06","A0K",..: 110 110 127 169 120 135 105 119 79 2 ...
$ GENRE: Factor w/ 5 levels "Academic Prose",..: 4 4 5 5 1 1 1 5 1 5 ...
$ NODE : Factor w/ 115 levels "absurd","accepted",..: 14 14 14 89 23 16 59 59 18 66 ...
Part of the problem is that I can't get the sorting right. Here is the code for the sorting of NODE that I use:
THAT_EXT <- within(THAT_EXT,
NODE <- factor(NODE,
levels=names(sort(table(NODE),
decreasing=TRUE))))
When I plot this with the code below I get a graph in which NODE is not correctly sorted within the individual GENREs, since different NODEs are more frequent in different GENREs:
p1 <-
ggplot(THAT_EXT, aes(x=NODE)) +
geom_bar() +
scale_x_discrete("THAT_EXT", breaks=NULL) + # suppress tick marks on x axis
facet_wrap(~GENRE)
What I want is for every facet to have NODE sorted in decreasing order for that particular GENRE. Can anyone help with this?
structure(list(ID = structure(c(1L, 2L, 3L, 4L, 10L, 133L, 137L,
138L, 139L, 140L, 141L, 142L, 143L, 144L, 145L, 146L, 147L, 148L,
149L, 150L, 151L, 152L, 153L, 154L, 155L, 156L, 157L, 158L, 159L,
160L, 161L, 162L, 163L, 164L, 165L, 166L, 167L, 168L, 169L, 170L,
171L, 172L, 173L, 174L, 175L, 176L, 177L, 178L, 179L, 180L, 181L,
182L, 183L, 184L, 185L, 186L, 187L, 188L, 189L, 190L, 191L, 192L,
193L, 194L, 195L, 196L, 197L, 198L, 199L, 200L, 201L, 202L, 203L,
204L, 205L, 206L, 207L, 208L, 212L, 213L, 214L, 215L, 216L, 217L,
218L, 219L, 220L, 221L, 222L, 223L, 224L, 225L, 226L, 227L, 228L,
229L, 230L, 231L, 232L, 233L, 234L, 235L, 236L, 237L, 238L, 239L,
240L, 241L, 267L, 268L, 269L, 270L, 271L, 272L, 273L, 274L, 275L,
276L, 277L, 278L, 279L, 280L, 281L, 282L, 283L, 284L, 290L, 291L,
298L, 299L, 300L, 303L, 304L, 305L, 306L, 307L, 308L, 309L, 310L,
313L, 314L, 315L, 316L, 317L, 318L, 319L, 327L, 328L, 329L, 330L,
331L, 332L, 333L, 334L, 335L, 336L, 337L, 338L, 339L, 340L, 341L,
342L, 343L, 344L, 345L, 346L, 347L, 348L, 352L, 353L, 354L, 355L,
356L, 357L, 358L, 359L, 360L, 349L, 350L, 351L, 361L, 362L, 363L,
364L, 365L, 366L, 367L, 368L, 369L, 370L, 371L, 372L, 373L, 374L,
375L, 376L, 377L, 378L, 379L, 380L, 381L, 12L, 13L, 14L, 15L,
16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L,
29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 41L, 42L, 43L, 44L, 45L,
46L, 50L, 54L, 72L, 73L, 74L, 75L, 76L, 90L, 91L, 92L, 97L, 98L,
102L, 115L, 125L, 126L, 127L, 128L, 129L, 130L, 131L, 132L, 209L,
210L, 211L, 242L, 243L, 244L, 245L, 246L, 289L, 292L, 293L, 294L,
295L, 296L, 297L, 301L, 302L, 311L, 312L, 320L, 321L, 322L, 323L,
324L, 325L, 326L, 382L, 383L, 384L, 385L, 386L, 387L, 388L, 5L,
6L, 7L, 8L, 9L, 11L, 37L, 38L, 39L, 40L, 47L, 48L, 49L, 51L,
52L, 53L, 55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L, 64L, 65L,
66L, 67L, 68L, 69L, 70L, 71L, 77L, 78L, 79L, 80L, 81L, 82L, 83L,
84L, 85L, 86L, 87L, 88L, 89L, 93L, 94L, 95L, 96L, 99L, 100L,
101L, 103L, 104L, 105L, 106L, 107L, 108L, 109L, 110L, 111L, 112L,
113L, 114L, 116L, 117L, 118L, 119L, 120L, 121L, 122L, 123L, 124L,
134L, 135L, 136L, 247L, 248L, 249L, 250L, 251L, 252L, 253L, 254L,
255L, 256L, 257L, 258L, 259L, 260L, 261L, 262L, 263L, 264L, 265L,
266L, 285L, 286L, 287L, 288L), .Label = c("A05_122_01", "A05_277_07",
"A05_400_01", "A05_99_01", "A06_1283_02", "A06_1389_01", "A06_1390_01",
"A06_1441_02", "A06_884_03", "A0K_1190_03", "A77_1684_01", "A8K_525_03",
"A8K_582_01", "A8K_645_01", "A8K_799_01", "A90_341_02", "A90_496_01",
"A94_217_01", "A94_472_01", "A94_477_03", "A9M_164_01", "A9M_259_03",
"A9N_199_01", "A9N_489_01", "A9N_591_01", "A9R_173_01", "A9R_425_02",
"A9W_536_02", "AA5_121_01", "AAE_203_01", "AAE_243_01", "AAE_412_01",
"AAW_14_03", "AAW_244_02", "AAW_297_04", "AAW_365_04", "ADG_1398_01",
"ADG_1500_01", "ADG_1507_01", "ADG_1516_01", "AHB_336_01", "AHB_421_01",
"AHJ_1090_02", "AHJ_619_01", "AR3_340_01", "AR3_91_03", "ARF_879_01",
"ARF_985_01", "ARF_991_02", "ARK_1891_01", "ASL_33_04", "ASL_43_01",
"ASL_9_01", "AT7_1031_01", "B09_1162_01", "B09_1475_01", "B09_1493_01",
"B09_1539_01", "B0G_197_01", "B0G_320_01", "B0N_1037_01", "B0N_624_01",
"B0N_645_02", "B0N_683_01", "B3G_313_04", "B3G_320_03", "B3G_398_02",
"B7M_1630_01", "B7M_1913_01", "BNN_746_02", "BNN_895_01", "BP7_2426_01",
"BP7_2777_01", "BP7_2898_01", "BP9_410_01", "BP9_599_01", "BPK_829_01",
"C93_1407_02", "C9A_181_01", "C9A_196_01", "C9A_365_01", "C9A_82_02",
"C9A_9_01", "CB9_306_02", "CB9_63_04", "CB9_86_01", "CBJ_439_01",
"CBJ_702_02", "CBJ_705_01", "CCM_320_01", "CCM_665_01", "CCM_669_02",
"CCN_1036_02", "CCN_1078_01", "CCN_1119_01", "CCN_784_01", "CCW_2284_02",
"CCW_2349_03", "CE7_242_02", "CE7_284_01", "CE7_39_01", "CEB_1675_01",
"CER_145_03", "CER_23_01", "CER_235_02", "CER_378_10", "CET_1056_02",
"CET_680_01", "CET_705_01", "CET_797_01", "CET_838_01", "CET_879_05",
"CET_946_03", "CET_986_01", "CEY_2977_01", "CJ3_107_02", "CJ3_114_03",
"CJ3_20_01", "CJ3_81_01", "CK2_112_01", "CK2_22_01", "CK2_392_01",
"CK2_42_01", "CK2_75_01", "CKC_1776_01", "CKC_1777_01", "CKC_1823_01",
"CKC_1824_01", "CKC_1860_01", "CKC_1883_01", "CKC_1883_02", "CKC_2127_01",
"CMN_1439_02", "CRM_5767_01", "CRM_5770_03", "CRM_5789_01", "CS2_110_01",
"CS2_131_01", "CS2_139_01", "CS2_187_01", "CS2_187_03", "CS2_231_03",
"CS2_249_02", "CS2_301_01", "CS2_35_01", "CS2_58_02", "EV6_16_01",
"EV6_206_02", "EV6_240_01", "EV6_244_02", "EV6_28_01", "EV6_30_01",
"EV6_32_01", "EV6_450_01", "EV6_69_01", "EV6_80_01", "EV6_91_01",
"FAC_1019_01", "FAC_1026_01", "FAC_1027_01", "FAC_1235_01", "FAC_1269_05",
"FAC_1270_05", "FAC_1393_01", "FAC_1406_03", "FAC_933_01", "FAC_950_01",
"FAC_960_01", "FED_105_01", "FED_120_02", "FED_21_02", "FED_281_02",
"FED_302_02", "FED_53_01", "FED_8_05", "FEF_498_03", "FEF_674_03",
"FR2_410_01", "FR2_557_02", "FR2_593_01", "FR2_691_01", "FR4_232_01",
"FR4_331_01", "FR4_346_01", "FS7_818_01", "FS7_919_01", "FU0_368_02",
"FYT_1138_01", "FYT_1183_01", "FYT_901_05", "G08_1336_01", "G1E_385_01",
"G1N_824_01", "G1N_860_01", "G1N_868_01", "G1N_975_01", "GU5_854_01",
"GUJ_423_01", "GUJ_501_01", "GUJ_611_01", "GUJ_629_03", "GUJ_700_01",
"GV0_10_01", "GV0_104_01", "GV0_111_01", "GV0_122_01", "GV0_160_01",
"GV0_232_02", "GV2_1465_01", "GV2_1899_01", "GV6_2683_01", "GW6_297_01",
"GW6_306_05", "GW6_307_01", "GW6_322_01", "GW6_330_02", "GW6_335_01",
"GW6_338_01", "GW6_367_02", "GW6_373_01", "GW6_407_01", "GW6_411_01",
"GW6_413_01", "GW6_421_01", "GW6_423_01", "GW6_424_01", "GW6_428_01",
"GW6_447_01", "GWM_480_01", "GWM_533_02", "GWM_554_02", "GWM_554_03",
"GWM_609_01", "GWM_609_04", "GWM_610_01", "GWM_730_01", "GWM_731_01",
"GWM_738_01", "GWM_804_06", "GWM_815_01", "GWM_832_03", "GVP_179_01",
"GVP_211_01", "GVP_393_02", "GVP_443_02", "GVP_710_01", "H0B_171_04",
"H0B_216_01", "H0B_265_01", "H0B_32_01", "H0B_361_03", "H0B_365_01",
"H0B_369_01", "H0B_74_01", "H0B_93_01", "H10_1002_01", "H10_1032_04",
"H10_653_01", "H10_803_01", "H10_824_01", "H10_825_03", "H10_881_01",
"H10_986_01", "H78_851_04", "H78_891_01", "H78_946_04", "H79_1959_19",
"H7S_110_05", "H7S_130_06", "H7S_131_03", "H7S_131_04", "H7S_146_01",
"H7S_148_01", "H7S_164_01", "H7S_179_01", "H7S_54_01", "H7S_56_05",
"H7S_62_03", "H7S_79_01", "H7S_8_01", "H7S_81_01", "H7S_83_01",
"H7S_87_01", "H7S_92_03", "H7X_1028_02", "H7X_1091_01", "H7X_691_01",
"H7X_695_01", "H8H_2917_01", "H8K_153_01", "H8K_55_01", "H8M_1897_01",
"H8M_2104_02", "H8T_3316_03", "H98_3204_01", "H98_3410_01", "H98_3490_02",
"H9R_130_02", "H9R_39_01", "H9S_1297_01", "HA2_3107_02", "HA2_3284_01",
"HPY_754_04", "HPY_785_09", "HPY_799_03", "HPY_807_04", "HPY_830_04",
"HPY_838_02", "HPY_843_01", "HPY_869_11", "HR7_190_01", "HR7_440_01",
"HTP_540_01", "HTP_585_01", "HTP_588_05", "HTP_593_01", "HTP_601_01",
"HTP_613_01", "HTP_648_02", "HTW_197_01", "HTW_494_01", "HTW_750_01",
"HWL_2770_01", "HWL_2919_01", "HWM_45_01", "HWM_45_02", "HXY_1047_03",
"HXY_701_01", "HXY_781_01", "HXY_783_01", "HXY_784_01", "HXY_836_01",
"HXY_931_01", "HXY_963_01", "HXY_972_01", "HXY_985_03", "HY6_1024_01",
"HY6_1025_01", "HY6_1164_01", "HY6_1223_01", "HY6_988_03", "HY6_989_01",
"HY8_160_01", "HY8_164_01", "HY8_292_03", "HY8_316_01", "HY9_778_03",
"HY9_845_02", "HYX_235_08", "HYX_245_01", "HYX_88_01", "J12_1474_02",
"J12_1492_01", "J12_1571_01", "J12_1845_01", "J14_341_01", "J18_597_04",
"J18_698_02", "J18_759_01", "J18_828_01", "J3R_197_01", "J3R_219_02",
"J3R_277_04", "J3T_267_01", "J3T_269_02", "J3T_57_02", "J41_41_02",
"J41_58_03", "J9B_133_03", "J9B_341_02", "J9B_341_03", "J9D_147_05",
"J9D_218_01", "J9D_411_01", "J9D_616_01", "J9D_616_02", "JNB_563_02",
"JT7_118_01", "JT7_129_02", "JT7_218_02", "JT7_344_02", "JXS_3663_01",
"JXU_407_01", "JXU_468_02", "JXU_559_01", "JXV_1439_04", "JXV_1592_01",
"JY1_100_01"), class = "factor"), GENRE = structure(c(1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L), .Label = c("Academic Prose", "Conversation", "News",
"Novels", "Popular Science"), class = "factor"), NODE = structure(c(9L,
10L, 10L, 10L, 4L, 10L, 71L, 35L, 49L, 6L, 5L, 15L, 28L, 44L,
64L, 64L, 28L, 28L, 18L, 18L, 32L, 18L, 58L, 10L, 72L, 28L, 18L,
10L, 64L, 10L, 35L, 64L, 64L, 69L, 8L, 10L, 50L, 69L, 49L, 49L,
15L, 69L, 10L, 49L, 8L, 64L, 49L, 10L, 69L, 18L, 61L, 67L, 67L,
61L, 57L, 69L, 11L, 10L, 64L, 10L, 59L, 61L, 49L, 10L, 59L, 1L,
61L, 35L, 54L, 54L, 39L, 44L, 61L, 64L, 69L, 1L, 23L, 49L, 49L,
8L, 69L, 49L, 69L, 49L, 49L, 69L, 35L, 49L, 49L, 49L, 35L, 10L,
49L, 48L, 10L, 49L, 11L, 44L, 50L, 11L, 50L, 69L, 49L, 10L, 59L,
68L, 47L, 69L, 49L, 35L, 29L, 8L, 49L, 50L, 35L, 10L, 35L, 8L,
35L, 8L, 10L, 35L, 10L, 10L, 10L, 35L, 44L, 61L, 35L, 44L, 28L,
47L, 39L, 39L, 49L, 61L, 43L, 60L, 19L, 10L, 10L, 10L, 44L, 44L,
62L, 44L, 10L, 59L, 10L, 61L, 1L, 53L, 33L, 10L, 8L, 8L, 64L,
64L, 10L, 57L, 61L, 64L, 66L, 19L, 61L, 64L, 10L, 10L, 8L, 19L,
35L, 28L, 10L, 61L, 35L, 42L, 35L, 28L, 32L, 64L, 10L, 18L, 28L,
25L, 35L, 35L, 10L, 18L, 10L, 22L, 55L, 28L, 10L, 1L, 55L, 51L,
1L, 38L, 28L, 28L, 33L, 10L, 44L, 29L, 16L, 8L, 28L, 69L, 32L,
10L, 61L, 20L, 35L, 10L, 28L, 10L, 32L, 10L, 46L, 59L, 64L, 35L,
66L, 2L, 35L, 28L, 30L, 18L, 69L, 32L, 10L, 28L, 17L, 36L, 64L,
61L, 10L, 64L, 33L, 3L, 37L, 26L, 28L, 64L, 44L, 28L, 64L, 64L,
6L, 6L, 64L, 50L, 32L, 8L, 64L, 50L, 28L, 24L, 18L, 47L, 35L,
40L, 24L, 55L, 44L, 22L, 1L, 49L, 44L, 18L, 45L, 63L, 64L, 35L,
12L, 35L, 10L, 35L, 10L, 10L, 10L, 44L, 44L, 44L, 65L, 44L, 55L,
32L, 49L, 64L, 39L, 69L, 1L, 60L, 7L, 14L, 44L, 33L, 10L, 19L,
10L, 70L, 53L, 8L, 61L, 61L, 44L, 61L, 65L, 28L, 68L, 69L, 27L,
61L, 28L, 72L, 34L, 61L, 32L, 10L, 49L, 35L, 49L, 10L, 10L, 69L,
39L, 40L, 19L, 59L, 53L, 49L, 49L, 44L, 49L, 35L, 49L, 61L, 61L,
1L, 10L, 28L, 49L, 35L, 49L, 61L, 50L, 69L, 35L, 61L, 35L, 50L,
10L, 28L, 69L, 61L, 21L, 69L, 29L, 35L, 35L, 35L, 11L, 69L, 8L,
41L, 56L, 35L, 61L, 69L, 49L, 49L, 49L, 1L, 13L, 64L, 64L, 52L,
44L, 64L, 64L, 50L, 49L, 69L, 11L, 59L, 49L, 31L), .Label = c("apparent",
"appropriate", "awful", "axiomatic", "best", "better", "breathtaking",
"certain", "characteristic", "clear", "conceivable", "convenient",
"crucial", "cruel", "desirable", "disappointing", "emphatic",
"essential", "evident", "expected", "extraordinary", "fair",
"fortunate", "Funny", "good", "great", "imperative", "important",
"impossible", "incredible", "inescapable", "inevitable", "interesting",
"ironic", "likely", "Likely", "lucky", "ludicrous", "natural",
"necessary", "needful", "notable", "noteworthy", "obvious", "odd",
"paradoxical", "plain", "plausible", "possible", "probable",
"proper", "relevant", "remarkable", "revealing", "right", "Sad",
"self-evident", "sensible", "significant", "striking", "surprising",
"symptomatic", "terrible", "true", "typical", "understandable",
"unexpected", "unfortunate", "unlikely", "unreasonable", "untrue",
"vital"), class = "factor")), .Names = c("ID", "GENRE", "NODE"
), class = "data.frame", row.names = c(NA, -388L))
As I mentioned already: facet_wrap is not intended for having individual scales. At least I didn't find a solution. Hence, setting the labels in scale_x_discrete did not bring the desired result.
But this is my workaround:
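library(plyr)
library(ggplot2)
# df is assumed to be the data frame from the dput() output above (THAT_EXT in the question)
# count rows per GENRE/NODE combination; ddply names the count column V1
nodeCount <- ddply( df, c("GENRE", "NODE"), nrow )
# build one combined label per GENRE/NODE pair
nodeCount$factors <- paste( nodeCount$GENRE, nodeCount$NODE, sep ="." )
# sort by GENRE, then by count, both decreasing
nodeCount <- nodeCount[ order( nodeCount$GENRE, nodeCount$V1, decreasing=TRUE ), ]
# fix the factor levels in the sorted order so ggplot keeps it
nodeCount$factors <- factor( nodeCount$factors, levels=nodeCount$factors )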
head(nodeCount)
GENRE NODE V1 factors
121 Popular Science possible 14 Popular Science.possible
128 Popular Science surprising 11 Popular Science.surprising
116 Popular Science likely 9 Popular Science.likely
132 Popular Science unlikely 9 Popular Science.unlikely
103 Popular Science clear 7 Popular Science.clear
129 Popular Science true 5 Popular Science.true
g <- ggplot( nodeCount, aes( y=V1, x = factors ) ) +
geom_bar( stat="identity" ) + # counts are precomputed in V1, so plot them as-is
scale_x_discrete( breaks=NULL ) + # suppress tick marks on x axis
facet_wrap( ~GENRE, scales="free_x" ) +
geom_text( aes( label = NODE, y = V1+2 ), angle = 45, vjust = 0, hjust=0, size=3 )
Which gives:
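The core of the workaround is the combined GENRE/NODE factor whose levels are fixed in per-genre frequency order. A base R sketch of just that step, on a made-up data frame (toy, counts, and key are hypothetical names, and the rows are invented for illustration):

```r
# Hypothetical toy data: "likely" dominates in News, "clear" in Novels
toy <- data.frame(
  GENRE = c("News", "News", "News", "News", "Novels", "Novels", "Novels"),
  NODE  = c("likely", "likely", "likely", "clear", "clear", "clear", "likely")
)

# Count each GENRE/NODE combination and drop combinations that never occur
counts <- as.data.frame(table(GENRE = toy$GENRE, NODE = toy$NODE),
                        stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]

# Sort by genre, then by decreasing frequency within each genre
counts <- counts[order(counts$GENRE, -counts$Freq), ]

# One level per GENRE/NODE pair, in the sorted order
counts$key <- paste(counts$GENRE, counts$NODE, sep = ".")
counts$key <- factor(counts$key, levels = counts$key)
print(levels(counts$key))
```

Because every level belongs to exactly one genre, plotting key on the x axis with free x scales per facet lets each panel show only its own levels in its own order.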

Resources