How can I add missing sequence values? - r

I have a data frame like this:
structure(list(x = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L,
24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L,
37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L,
50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L,
63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L, 72L, 73L, 74L, 75L,
76L, 77L, 78L, 79L, 80L, 81L, 82L, 83L, 84L, 85L, 86L, 87L, 88L,
89L, 90L, 91L, 92L, 93L, 94L, 95L, 96L, 97L, 98L, 99L, 100L,
101L, 102L, 103L, 104L, 105L, 106L, 107L, 108L, 109L, 110L, 112L,
113L, 114L, 115L, 116L, 117L, 118L, 119L, 120L, 121L, 123L, 124L,
125L, 127L, 128L, 129L, 130L, 132L, 133L, 134L, 135L, 136L, 137L,
138L, 139L, 140L, 141L, 142L, 143L, 145L, 146L, 147L, 148L, 149L,
150L, 151L, 152L, 153L, 154L, 155L, 158L, 160L, 163L, 164L, 166L,
167L, 169L, 170L, 173L, 174L, 178L, 179L, 181L, 182L, 183L, 186L,
187L, 191L, 192L, 193L, 194L, 197L, 198L, 200L, 205L, 208L, 209L,
213L, 214L, 216L, 217L, 220L, 222L, 223L, 225L, 229L, 233L, 235L,
237L, 242L, 243L, 244L, 251L, 253L, 254L, 255L, 261L, 262L, 263L,
264L, 267L, 268L, 269L, 270L, 276L, 281L, 282L, 284L, 285L, 287L,
289L, 293L, 295L, 297L, 299L, 301L, 306L, 308L, 315L, 317L, 318L,
320L, 327L, 330L, 336L, 337L, 345L, 346L, 355L, 359L, 376L, 377L,
379L, 384L, 387L, 388L, 402L, 405L, 408L, 415L, 420L, 421L, 427L,
428L, 429L, 430L, 437L, 438L, 439L, 440L, 446L, 448L, 453L, 456L,
469L, 472L, 476L, 478L, 481L, 483L, 486L, 487L, 488L, 497L, 500L,
502L, 504L, 507L, 512L, 525L, 530L, 531L, 543L, 546L, 550L, 578L,
581L, 598L, 601L, 680L, 689L, 693L, 712L, 728L, 746L, 768L, 790L,
794L, 840L, 851L, 861L, 928L, 969L, 1010L, 1180L, 1698L), freq = c(29186L,
12276L, 5851L, 3938L, 3133L, 1894L, 1157L, 820L, 597L, 481L,
398L, 297L, 269L, 251L, 175L, 176L, 153L, 130L, 117L, 108L, 93L,
83L, 58L, 84L, 60L, 43L, 59L, 51L, 57L, 53L, 38L, 38L, 32L, 35L,
28L, 27L, 29L, 22L, 24L, 29L, 30L, 23L, 26L, 19L, 19L, 25L, 14L,
22L, 16L, 12L, 15L, 14L, 11L, 13L, 18L, 10L, 17L, 20L, 7L, 9L,
2L, 8L, 12L, 8L, 7L, 10L, 10L, 9L, 6L, 6L, 9L, 5L, 11L, 4L, 5L,
5L, 10L, 4L, 6L, 1L, 4L, 7L, 3L, 4L, 3L, 2L, 3L, 5L, 7L, 2L,
2L, 3L, 2L, 4L, 7L, 1L, 3L, 5L, 5L, 3L, 5L, 2L, 2L, 2L, 3L, 2L,
5L, 7L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 3L, 2L, 2L, 1L,
3L, 4L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 1L, 4L, 3L, 1L, 2L, 2L, 1L,
1L, 1L, 1L, 2L, 3L, 1L, 1L, 3L, 2L, 1L, 1L, 1L, 4L, 4L, 1L, 2L,
2L, 4L, 2L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 2L, 3L, 1L, 1L, 1L, 1L,
3L, 2L, 1L, 3L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 3L, 2L, 1L, 1L, 2L, 1L, 1L,
2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L,
1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 4L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L)), .Names = c("x",
"freq"), row.names = c(NA, -296L), class = "data.frame")
After the x value of 130, there are missing values. Is there a way I can make this a continuous data frame in increments of 1, i.e. from 1 to 1698, populating the entire sequence and setting freq to 0 for the x values that do not appear here? What I mean is:
1,2
4,5
5,7
should be converted to:
1,2
2,0
3,0
4,5
5,7
Any suggestions?

You can also use merge (assuming your data is stored in l):
l <- merge(l,data.frame(x = 1:1698),all = TRUE,by = "x")
l$freq[is.na(l$freq)] <- 0
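To see what the merge does, here is a minimal sketch on the three-row example from the question rather than the full data. The toy data frame and the name full are only for illustration, and the upper bound is taken from max(l$x) instead of the hard-coded 1698.

```r
# Toy version of the question's example: x = 2 and x = 3 are missing.
l <- data.frame(x = c(1L, 4L, 5L), freq = c(2L, 5L, 7L))

# Outer join against the full sequence; x values absent from l get freq = NA.
full <- merge(l, data.frame(x = seq_len(max(l$x))), by = "x", all = TRUE)

# Replace the NAs introduced by the join with 0.
full$freq[is.na(full$freq)] <- 0L
full
```

Because merge sorts on the by column by default, the result should already be ordered by x.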

I'd build a data frame of the x values that are missing from column x, assign a freq of 0 to all of them, then rbind it to the original data and order by x.
#I called your data dat
y <- 1:max(dat$x)
dat2 <- data.frame(x=y[!y%in%dat$x], freq=0)
dat3 <- rbind(dat, dat2)
dat4 <- dat3[order(dat3$x), ] #could stop here
rownames(dat4) <- NULL #but I hate non sequential row names
dat4
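A variation on the same idea, not the code above but a sketch of an alternative: start from a data frame that is all zeros and copy the known frequencies into place. This assumes the x values are unique positive integers; the toy data and the name filled are made up for illustration.

```r
# Toy data with gaps at x = 2 and x = 3.
dat <- data.frame(x = c(1L, 4L, 5L), freq = c(2L, 5L, 7L))

# Start with every x from 1 to max(x) and a frequency of 0.
filled <- data.frame(x = seq_len(max(dat$x)), freq = 0L)

# Copy the observed frequencies into the rows whose x matches.
filled$freq[match(dat$x, filled$x)] <- dat$freq
filled
```

This avoids the rbind and the re-ordering step, since the rows are created in order to begin with.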

Related

Widening Data and Changing Columns

I have managed to delete a little bit of code that did the below task and can't for the life of me figure out how I did it before.
I want to widen the data that has two factors spread over 8 different 'waves'. There are four 'Paper' factors, each with the same four internal factors 'Response'. The output from a previously required function gives the following dataframe:
And I would like to make it look like this:
The single column of the first tibble has become the single row of the second tibble.
As you can see, the second tibble has extra factors of Paper but these can just be joined row wise.
I really wasn't sure how to attack this, but thought it would be done using the pivot_wider function. When I tried
times_correct <- times_19 %>%
pivot_wider( id_cols = c('Stay/remain in the EU', 'Leave the EU', 'I would/will not vote', 'Don\'t know'), names_from = eurrefcolnames)
I got the error that I can't subset columns that don't exist which makes sense: I need to manually add the correct 'Waves'. I think this is relatively simple, but can't for the life of me figure out how I did it!
Here is the dput of the various tibbles:
structure(list(resp = structure(c(3L, 2L, 4L, 1L, NA, NA, NA,
NA), .Label = c("Don't Know", "Leave", "Remain", "Will Not Vote"
), class = "factor"), `Stay/remain in the EU` = c(316L, 290L,
313L, 324L, 338L, 320L, 325L, 335L), `Leave the EU` = c(157L,
123L, 159L, 154L, 134L, 189L, 187L, 181L), `I would/will not vote` = c(2L,
3L, 3L, 3L, 2L, 2L, 2L, 0L), `Don't know` = c(56L, 51L, 55L,
50L, 57L, 20L, 17L, 0L), Paper = structure(c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L), .Label = "Times", class = "factor")), row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame"))
structure(list(resp = structure(c(3L, 2L, 4L, 1L, 3L, 2L, 4L,
1L, 3L, 2L, 4L, 1L, 3L, 2L, 4L, 1L, 3L, 2L, 4L, 1L), .Label = c("Don't Know",
"Leave", "Remain", "Will Not Vote"), class = "factor"), euRefVoteW1 = c(316L,
157L, 2L, 56L, 190L, 339L, 4L, 70L, 819L, 79L, 9L, 71L, 1294L,
1311L, 150L, 523L, 1715L, 2587L, 133L, 630L), euRefVoteW2 = c(290L,
123L, 3L, 51L, 175L, 282L, 3L, 62L, 777L, 74L, 5L, 62L, 1091L,
925L, 80L, 371L, 1528L, 2044L, 83L, 517L), euRefVoteW3 = c(313L,
159L, 3L, 55L, 199L, 334L, 4L, 69L, 835L, 81L, 10L, 57L, 1348L,
1289L, 139L, 508L, 1766L, 2563L, 156L, 586L), euRefVoteW4 = c(324L,
154L, 3L, 50L, 215L, 328L, 2L, 61L, 848L, 70L, 10L, 55L, 1397L,
1267L, 128L, 492L, 1853L, 2494L, 143L, 583L), euRefVoteW6 = c(338L,
134L, 2L, 57L, 241L, 286L, 2L, 77L, 853L, 68L, 5L, 57L, 1519L,
1133L, 112L, 520L, 2017L, 2284L, 106L, 667L), euRefVoteW7 = c(320L,
189L, 2L, 20L, 186L, 384L, 2L, 34L, 832L, 109L, 8L, 34L, 1449L,
1456L, 87L, 292L, 1906L, 2785L, 55L, 328L), euRefVoteW8 = c(325L,
187L, 2L, 17L, 187L, 384L, 1L, 34L, 836L, 118L, 5L, 24L, 1462L,
1522L, 72L, 228L, 1898L, 2852L, 56L, 268L), euRefVoteW9 = c(335L,
181L, 0L, 0L, 206L, 385L, 0L, 6L, 844L, 102L, 0L, 4L, 1572L,
1462L, 0L, 21L, 2018L, 2827L, 0L, 20L), Paper = structure(c(1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L,
5L, 5L, 5L), .Label = c("Times", "Telegraph", "Control", "No_Paper",
"Rest"), class = "factor")), row.names = c(NA, -20L), class = c("tbl_df",
"tbl", "data.frame"))
eurrefcolnames = c('euRefVoteW1','euRefVoteW2', 'euRefVoteW3', 'euRefVoteW4', 'euRefVoteW6', 'euRefVoteW7', 'euRefVoteW8', 'euRefVoteW9')
EDIT:
Here is the function that creates the initial dataframes. Is there an edit I could make here, perhaps?
tally_reader_number <- function(input_dataframe,newspaper_name) {
# function takes the input of in_all_waves, tallies the number of different eu ref responses using map_df for a given newspaper factor (defined above)
# and returns a dataframe of responses for each wave with the newspaper factor as a column
returned_dataframe <- input_dataframe %>%
filter(Paper == newspaper_name) %>%
ungroup() %>% #function refuses to work without this
select(-Paper) %>%
map_df(table) %>% # use map_df from the purrr package to "table" each column
rownames_to_column("response") %>% #convert the rownames to a column named response
mutate(resp = case_when(response == 1 ~ "Remain", #change the resulting numbers to the correct responses
response == 2 ~ "Leave",
response ==3 ~ "Will Not Vote",
response == 4 ~ "Don't Know")) %>%
select(resp, everything(), -response) %>% #reorder the columns with resp at the front, removing response
mutate(Paper = newspaper_name)
returned_dataframe$Paper <- as.factor(returned_dataframe$Paper)
returned_dataframe$resp <- as.factor(returned_dataframe$resp)
returned_dataframe
}
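Judging from the two dputs, the first tibble looks like the transpose of the Times rows of the second (the first row 316, 157, 2, 56 matches euRefVoteW1 for Times). Under that assumption, a base R sketch of the reshape could look like the following. It uses only the first three waves, copied from the dput, and the names times_long, wave_names and wide are invented for the example. It is not the pivot_wider call that was lost, just one way to get the same shape.

```r
# First three waves of the Times tibble (values copied from the dput).
times_long <- data.frame(
  `Stay/remain in the EU` = c(316L, 290L, 313L),
  `Leave the EU`          = c(157L, 123L, 159L),
  `I would/will not vote` = c(2L, 3L, 3L),
  `Don't know`            = c(56L, 51L, 55L),
  check.names = FALSE
)
wave_names <- c("euRefVoteW1", "euRefVoteW2", "euRefVoteW3")

# t() turns each response column into a row and each wave (row) into a column.
wide <- as.data.frame(t(as.matrix(times_long)))
names(wide) <- wave_names
rownames(wide) <- NULL

# Add the response labels and the newspaper, in the order of the columns above.
times_correct <- cbind(
  resp = c("Remain", "Leave", "Will Not Vote", "Don't Know"),
  wide,
  Paper = "Times"
)
times_correct
```

With all eight waves, wave_names would be eurrefcolnames, and the per-paper results could then be stacked with rbind.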

order geom_point by specific facet

I have a ggplot related question, which should be easy but I could not find the answer yet. I am trying to plot a faceted plot with the code below and this dataset (11 kB).
ggplot(plot.dat, aes(x = estimate, y = reorder(countryyear, estimate))) +
geom_point() +
geom_segment(aes(x=conf.low, xend=conf.high, yend=countryyear)) +
facet_grid(. ~ facet) +
xlab("Random Effect Estimate") +
ylab("") + scale_x_continuous(breaks=c(seq(0, 5, 1)), limits=c(0, 5)) +
ggtitle("Random Slopes in Country*Year Groups from Northwestern Europe") +
theme_minimal() + theme(plot.title = element_text(hjust = 0.5))
I would like countryyear to be organized by the values of estimate in the Extreme Right facet. Not quite sure how to order by values of a specific facet. Any ideas are welcome! Thanks.
Update: Here is the dput structure of a random subset of the dataset. It has some missing values, but it should work for the sake of the example. I also updated the download link above, that has the full version.
structure(list(estimate = c(1.41056902925372, 0.854859208455895,
1.16012834593894, 0.871339033194504, 0.803272289946221, 1.17540386134493,
0.996313357490551, 1.49940694539732, 1.33773365908762, 2.7318703090905,
1.19131935418045, 1.12765907711738, 0.746741192261761, 0.985847015192172,
0.912357310925342, 1.11582763712164, 1.21854572824977, 0.675712547978394,
0.566955524699616, 1.32611743759365, 0.519648352294682, 0.591013596394243,
1.30944973684044, 0.613722269599125, 1.13293279727271, 0.950788678552604,
1.1599446923567, 1.11493952112913, 0.95336321045095, 1.39002327097034,
0.794207546872633, 0.788545101449259, 1.01096883872495, 0.897407203907834,
1.38391605229103, 1.35754760293107, 1.0718508539761, 0.542191158958878,
0.757132752456427, 1.44172863221312, 1.04842251986171, 0.77260404885379,
0.879288027642055, 1.09372353598088, 0.745484830381145, 1.21211217249353,
0.628009608902132, 1.34864488674734), countryyear = structure(c(1L,
2L, 4L, 5L, 7L, 9L, 10L, 12L, 13L, 26L, 28L, 29L, 31L, 32L, 34L,
36L, 37L, 39L, 40L, 57L, 59L, 60L, 62L, 63L, 65L, 67L, 68L, 70L,
71L, 73L, 75L, 76L, 89L, 90L, 92L, 94L, 95L, 103L, 104L, 106L,
108L, 109L, 111L, 128L, 130L, 132L, 133L, 135L), .Label = c("AT02",
"AT04", "AT06", "AT14", "AT16", "BE02", "BE04", "BE06", "BE08",
"BE10", "BE12", "BE14", "BE16", "BG06", "BG08", "BG10", "BG12",
"CH14", "CZ02", "CZ04", "CZ08", "CZ10", "CZ12", "CZ14", "CZ16",
"DE02", "DE04", "DE06", "DE08", "DE10", "DE12", "DE14", "DE16",
"DK02", "DK04", "DK06", "DK08", "DK10", "DK12", "DK14", "EE04",
"EE06", "EE08", "EE10", "EE12", "EE14", "EE16", "ES02", "ES04",
"ES06", "ES08", "ES10", "ES12", "ES14", "ES16", "FI02", "FI04",
"FI06", "FI08", "FI10", "FI12", "FI14", "FI16", "FR06", "FR08",
"FR10", "FR12", "FR14", "FR16", "GB02", "GB04", "GB06", "GB08",
"GB10", "GB12", "GB14", "GB16", "GR02", "GR04", "GR08", "GR10",
"HU02", "HU06", "HU08", "HU10", "HU12", "HU14", "HU16", "IE02",
"IE04", "IE06", "IE08", "IE10", "IE12", "IE14", "IE16", "IT04",
"IT12", "IT16", "LT10", "LT12", "LT14", "NL02", "NL04", "NL06",
"NL08", "NL10", "NL12", "NL14", "NL16", "NO14", "PL02", "PL04",
"PL06", "PL08", "PL10", "PL12", "PL14", "PL16", "PT02", "PT04",
"PT06", "PT08", "PT10", "PT12", "PT14", "PT16", "SE02", "SE04",
"SE06", "SE08", "SE10", "SE12", "SE14", "SE16", "SI02", "SI04",
"SI06", "SI08", "SI10", "SI12", "SI14", "SI16", "SK04", "SK06",
"SK08", "SK10", "SK12"), class = "factor"), facet = structure(c(1L,
3L, 1L, 4L, 5L, 3L, 4L, 1L, 1L, 1L, 5L, 5L, 4L, 5L, 3L, 1L, 2L,
4L, 5L, 2L, 1L, 4L, 2L, 5L, 2L, 3L, 4L, 3L, 2L, 5L, 5L, 4L, 2L,
5L, 4L, 5L, 3L, 1L, 4L, 5L, 3L, 5L, 4L, 1L, 5L, 2L, 4L, 1L), .Label = c("Intercept",
"Extreme Left", "Center", "Right", "Extreme Right"), class = "factor"),
conf.low = c(1.16824810706745, 0.686215051613965, 0.910277310292764,
0.591705078386698, 0.37357342399703, 0.947951001435781, 0.663296044193037,
1.18794112232166, 1.06645119085865, 2.33578182814618, 0.580210898576738,
0.564235690522211, 0.530859530342114, 0.516191258265551,
0.730992343373883, 0.862424540370486, 0.827891784352444,
0.427638276259852, 0.275692447335368, 0.829763907986328,
0.370078643492081, 0.321852705445509, 0.83550621863293, 0.289836810427436,
0.847226120408727, 0.780056160572728, 0.873143885861924,
0.869757467125519, 0.615741777890997, 0.649483531741787,
0.349657606457465, 0.523294407847395, 0.670109418373736,
0.36656743494149, 0.952201390937053, 0.777207016700884, 0.888128473009524,
0.397085597526946, 0.479828726362257, 0.614533313431094,
0.813336887981082, 0.3129232351085, 0.61435321820328, 0.854801028643867,
0.346698059397102, 0.805414039007076, 0.434676644041643,
1.07780736338027), conf.high = c(1.70315275860739, 1.06494933995261,
1.47855797769819, 1.28312522319126, 1.7272277157504, 1.45743211956315,
1.49652679976667, 1.8925358720741, 1.67802460909168, 3.19512520208851,
2.44607918797515, 2.25369471581694, 1.05041423643869, 1.8828182806291,
1.13872035780431, 1.44368725318228, 1.79353596677755, 1.06769546329854,
1.16593171156554, 2.11938292490653, 0.729667639003753, 1.08526995489865,
2.05223919950836, 1.29954170985538, 1.51498719434776, 1.15888977865399,
1.54095070825389, 1.4292376699955, 1.47610807594453, 2.97492484321718,
1.80395225460704, 1.18824770090216, 1.52521060717706, 2.19697554354282,
2.01136404338166, 2.37122858469145, 1.29357889999432, 0.740322123703373,
1.19469713534712, 3.38237391450413, 1.35145693795059, 1.90755095606211,
1.25847381058047, 1.39942645489832, 1.60297301142912, 1.82417470710871,
0.907332092210651, 1.68753999308876)), row.names = c(1L,
9L, 17L, 25L, 33L, 41L, 49L, 57L, 65L, 128L, 136L, 144L, 152L,
160L, 168L, 176L, 184L, 192L, 200L, 283L, 291L, 299L, 307L, 315L,
323L, 331L, 339L, 347L, 355L, 363L, 371L, 379L, 442L, 450L, 458L,
466L, 474L, 512L, 520L, 528L, 536L, 544L, 552L, 640L, 648L, 656L,
664L, 672L), class = "data.frame")
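One possible approach, sketched on a made-up miniature of plot.dat: compute the order of countryyear from the Extreme Right rows only, then set the factor levels before plotting. This assumes each countryyear appears at most once per facet; the three groups and their estimates below are hypothetical.

```r
# Hypothetical miniature of plot.dat with two facets and three groups.
plot.dat <- data.frame(
  countryyear = factor(c("AT02", "BE04", "DE06", "AT02", "BE04", "DE06")),
  facet = factor(rep(c("Intercept", "Extreme Right"), each = 3)),
  estimate = c(1.4, 0.8, 1.1, 0.9, 1.5, 0.6)
)

# Order the groups by their estimate within the Extreme Right facet only.
er <- plot.dat[plot.dat$facet == "Extreme Right", ]
ordered_levels <- as.character(er$countryyear[order(er$estimate)])

# Groups with no Extreme Right row are appended at the end (an arbitrary choice).
all_levels <- c(ordered_levels,
                setdiff(levels(plot.dat$countryyear), ordered_levels))
plot.dat$countryyear <- factor(plot.dat$countryyear, levels = all_levels)
levels(plot.dat$countryyear)
```

In the ggplot call, y = reorder(countryyear, estimate) would then become y = countryyear, so that the axis follows the factor levels set here.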

Merging error in R

For two example dataframes:
gp <- structure(list(gp.code = structure(c(1L, 3L, 5L, 13L, 6L, 20L,
10L, 19L, 17L, 12L, 2L, 18L, 7L, 16L, 15L, 4L, 8L, 143L, 14L,
9L, 11L, 33L, 23L, 113L, 102L, 97L, 83L, 122L, 77L, 111L, 29L,
68L, 142L, 56L, 118L, 115L, 78L, 58L, 104L, 71L, 43L, 121L, 32L,
110L, 53L, 70L, 123L, 61L, 87L, 48L, 73L, 100L, 37L, 141L, 114L,
34L, 89L, 81L, 98L, 92L, 63L, 50L, 60L, 47L, 125L, 145L, 145L,
93L, 93L, 99L, 99L, 138L, 138L, 137L, 86L, 139L, 91L, 146L, 79L,
103L, 31L, 124L, 22L, 76L, 26L, 108L, 105L, 116L, 84L, 136L,
67L, 106L, 52L, 95L, 51L, 27L, 82L, 130L, 101L, 107L, 133L, 62L,
42L, 117L, 112L, 85L, 69L, 49L, 46L, 45L, 120L, 38L, 39L, 55L,
96L, 80L, 75L, 44L, 35L, 109L, 41L, 24L, 59L, 54L, 144L, 65L,
28L, 25L, 119L, 66L, 74L, 36L, 57L, 21L, 135L, 134L, 132L, 140L,
64L, 127L, 129L, 128L, 131L, 72L, 88L, 40L, 30L, 94L, 90L, 126L
), .Label = c("E82002", "E82014", "E82018", "E82019", "E82023",
"E82031", "E82037", "E82040", "E82041", "E82055", "E82058", "E82059",
"E82060", "E82062", "E82071", "E82077", "E82084", "E82095", "E82107",
"E82113", "M85001", "M85002", "M85005", "M85007", "M85008", "M85009",
"M85011", "M85013", "M85015", "M85019", "M85020", "M85021", "M85024",
"M85025", "M85030", "M85031", "M85037", "M85041", "M85042", "M85043",
"M85047", "M85048", "M85051", "M85052", "M85055", "M85056", "M85058",
"M85059", "M85062", "M85064", "M85065", "M85070", "M85074", "M85076",
"M85077", "M85078", "M85079", "M85084", "M85086", "M85088", "M85092",
"M85097", "M85098", "M85107", "M85111", "M85113", "M85115", "M85116",
"M85118", "M85127", "M85128", "M85134", "M85136", "M85141", "M85142",
"M85145", "M85146", "M85153", "M85154", "M85156", "M85167", "M85171",
"M85174", "M85176", "M85177", "M85178", "M85179", "M85600", "M85611",
"M85624", "M85634", "M85642", "M85652", "M85655", "M85669", "M85671",
"M85679", "M85684", "M85686", "M85693", "M85694", "M85699", "M85701",
"M85713", "M85715", "M85716", "M85717", "M85721", "M85730", "M85733",
"M85735", "M85736", "M85749", "M85753", "M85756", "M85757", "M85770",
"M85774", "M85776", "M85782", "M85783", "M85794", "M85797", "M85801",
"M88020", "M88021", "M89001", "M89002", "M89008", "M89009", "M89012",
"M89013", "M89021", "M89026", "M89027", "Y00412", "Y00471", "Y00492",
"Y01680", "Y02567", "Y02571", "Y02620", "Y02639", "Y02893", "Y02961",
"Y02963"), class = "factor"), cqc.rating = structure(c(1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 5L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 2L, 1L, 1L, 1L, 1L,
5L, 1L, 5L, 1L, 5L, 1L, 5L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 5L, 1L, 1L, 1L, 5L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 1L,
1L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L), .Label = c("good", "inadequate", "not rated",
"oustanding", "requires improvement"), class = "factor")), .Names = c("gp.code",
"cqc.rating"), row.names = c(NA, 150L), class = "data.frame")
df <- structure(list(gp.code = structure(c(1L, 4L, 6L, 14L, 7L, 2L,
21L, 11L, 20L, 18L, 13L, 3L, 19L, 22L, 8L, 17L, 16L, 5L, 9L,
148L, 15L, 10L, 12L, 37L, 25L, 127L, 114L, 109L, 77L, 135L, 98L,
87L, 125L, 79L, 32L, 147L, 64L, 132L, 129L, 88L, 67L, 118L, 68L,
93L, 49L, 82L, 134L, 26L, 35L, 124L, 61L, 81L, 136L, 33L, 71L,
54L, 102L, 46L, 84L, 112L, 43L, 146L, 128L, 24L, 38L, 103L, 95L,
110L, 105L, 74L, 57L, 70L, 53L, 138L, 117L, 39L, 94L, 116L, 149L,
111L, 144L, 106L, 143L, 145L, 101L, 104L, 150L, 89L, 115L, 34L,
137L, 23L, 29L, 86L, 28L, 75L, 83L, 122L, 60L, 66L, 119L, 99L,
130L, 142L, 65L, 78L, 59L, 107L, 120L, 56L, 31L, 58L, 30L, 72L,
96L, 139L, 113L, 121L, 140L, 73L, 48L, 131L, 126L, 42L, 100L,
76L, 80L, 141L, 55L, 52L, 36L, 51L, 133L, 44L, 45L, 63L, 40L,
92L, 108L, 90L, 85L, 50L, 41L, 123L, 91L, 47L, 27L, 69L, 62L,
97L), .Label = c("E82002", "E82004", "E82014", "E82018", "E82019",
"E82023", "E82031", "E82037", "E82040", "E82041", "E82055", "E82058",
"E82059", "E82060", "E82062", "E82071", "E82077", "E82084", "E82095",
"E82107", "E82113", "E82663", "M85002", "M85003", "M85005", "M85006",
"M85007", "M85009", "M85010", "M85011", "M85014", "M85015", "M85018",
"M85020", "M85021", "M85023", "M85024", "M85025", "M85028", "M85029",
"M85030", "M85036", "M85037", "M85041", "M85042", "M85045", "M85047",
"M85048", "M85051", "M85052", "M85055", "M85056", "M85058", "M85059",
"M85062", "M85063", "M85064", "M85065", "M85070", "M85072", "M85074",
"M85076", "M85077", "M85078", "M85081", "M85082", "M85084", "M85085",
"M85086", "M85088", "M85092", "M85094", "M85097", "M85098", "M85100",
"M85105", "M85108", "M85115", "M85116", "M85118", "M85127", "M85128",
"M85133", "M85136", "M85142", "M85145", "M85146", "M85153", "M85154",
"M85156", "M85159", "M85163", "M85164", "M85166", "M85167", "M85171",
"M85172", "M85174", "M85176", "M85177", "M85178", "M85179", "M85611",
"M85634", "M85642", "M85652", "M85669", "M85671", "M85679", "M85684",
"M85686", "M85693", "M85694", "M85699", "M85701", "M85706", "M85711",
"M85713", "M85715", "M85716", "M85717", "M85721", "M85730", "M85733",
"M85735", "M85736", "M85749", "M85753", "M85756", "M85757", "M85770",
"M85774", "M85782", "M85783", "M85794", "M85797", "M85801", "M88020",
"M89009", "M89021", "Y00159", "Y00412", "Y00471", "Y00492", "Y01680",
"Y02571", "Y02620", "Y02639", "Y02961", "Y02963"), class = "factor"),
antibiotic = c(1.23248149651249, 1.19804465710497, 0.753794802511325,
0.85669917849255, 0.806766970145873, 1.2944351625755, 0.79749081458912,
0.949915803767271, 1.28676136005656, 0.861894948337942, 0.98944777231592,
0.77976175611218, 1.0802092104795, 1.18992427754597, 0.922230847446508,
1.00968448247105, 1.00925275017575, 1.13856339619023, 1.29658868391219,
3.43992412181159, 0.9405259515181, 1.04536664449872, 0.857195681526592,
1.36040902899291, 1.1555007762595, 1.23099411388522, 1.2921619764172,
1.20896911806371, 0.90601414991211, 1.48026866615811, 0.865283503864064,
1.34285564503446, 0.919419926661631, 1.41915312988514, 1.2330635342805,
3.66834851140276, 1.2803964023984, 0.777309332259057, 1.16760007845018,
0.903108177347766, 1.07415817045842, 1.76503145582347, 0.662906258393768,
1.11922205065869, 1.45743378132416, 1.40338387936522, 1.56356764856955,
1.21554707497369, 0.765459254266153, 1.02985290952772, 0.747988215118069,
1.28199535302764, 0.791630491986821, 1.45457105212014, 1.5360908424018,
1.36219759497743, 1.2823181822961, 1.16445352400409, 0.867251210987798,
0.93449947713661, 0.972235945064716, 0.952976072770419, 1.01713285255742,
1.0094222885861, 0.875833539680039, 0.618892154842347, 0.472595751806604,
0.496879988390655, 1.50731245234776, 1.04907441178441, 0.894164623526121,
0.658261298693029, 0.726078998206472, 1.02776752877325, 1.19666179452119,
0.97476267236602, 0.0127648710748021, 1.17439331625073, 5.8393330107237,
1.59645232815965, 0.487542408650236, 1.14865894544346, 0.729495610858418,
0.475652186678803, 0.810665743225695, 1.55727483921682, 0.509032628956674,
1.08248967413256, 0.829656197645062, 0.883813971368163, 1.1606344950849,
0.643888106444113, 0.658542420310134, 0.788100265873058,
0.999993653251755, 0.549776366766276, 1.00900222339709, 0.759174545084884,
0.732601429257463, 0.811032584239922, 0.992078825347759,
0.916336303170667, 0.924425842068231, 0.833487920775124,
1.2048401786876, 1.0710312446967, 1.15996384388112, 0.802575397465166,
0.827940641127218, 0.988964351312201, 0.810501627167164,
0.972188732451928, 1.21663117141513, 0.648182525899754, 1.24597821683072,
1.25013278566623, 1.16685772173495, 0.878810966942241, 1.21188990166584,
1.05209718360933, 0.928089616209815, 1.51726626492982, 0.955522092040987,
1.14598540145985, 0.992072220256482, 1.17856657930143, 0.487420516416757,
1.12018266962542, 0.999491890919433, 1.10449907263643, 1.38308178076077,
1.0848078324396, 0.735665641476272, 0.815600508556523, 1.04175344119065,
1.63317262657807, 0.941009543029732, 0.945643608300648, 0.785026349264038,
1.11186113789778, 0.931541465655869, 0.950426305389678, 1.12222589692599,
1.75509240895922, 1.39836663546273, 1.11387374264761, 1.42177823010633,
0.957155370021804, 1.48242155040868, 1.1388984391116)), .Names = c("gp.code",
"antibiotic"), row.names = c(NA, 150L), class = "data.frame")
I wish to merge the data in gp to df. This is a sample of my data, the full version is about 8000 records.
I normally use the code:
new <- merge(df, gp, by=c("gp.code"), all.x=T)
But when I run this, you can see it returns 154 records in the 'new' dataframe. As I understand it, all.x=TRUE refers to all of the records in the df dataframe, so why is it picking up more rows of data? If I change it to all.y=TRUE it gets 150 records. When I run this on my full dataset, I cannot get back to the number of rows in df (using all.x or all.y = T), just with the additional merged column.
What am I doing wrong? Is there another function which is more appropriate?
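A likely cause, judging from the gp dput above where a few factor codes are listed twice: merge returns one row for every matching pair, so a gp.code that occurs twice in gp yields two rows for the single matching row in df. A small sketch with invented codes (A1, A2, A3 are not from the real data) shows the effect and one way to check for it:

```r
# Invented example: code A2 appears twice in gp with different ratings.
df <- data.frame(gp.code = c("A1", "A2", "A3"),
                 antibiotic = c(1.2, 0.9, 1.1))
gp <- data.frame(gp.code = c("A1", "A2", "A2"),
                 cqc.rating = c("good", "good", "requires improvement"))

# all.x = TRUE keeps every row of df, but A2 matches two rows of gp.
merged <- merge(df, gp, by = "gp.code", all.x = TRUE)
nrow(merged)  # 4, not 3

# List the codes that occur more than once in gp.
dup_codes <- unique(gp$gp.code[duplicated(gp$gp.code)])
dup_codes

# One option (an assumption about what is wanted): keep the first row per code.
gp_unique <- gp[!duplicated(gp$gp.code), ]
merged_unique <- merge(df, gp_unique, by = "gp.code", all.x = TRUE)
nrow(merged_unique)  # 3
```

Keeping the first row per code is an arbitrary choice here. If the repeated codes carry different ratings, as they do in this toy example, which rating to keep is a decision about the data rather than about merge.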

Change the value in a column of a dataframe depending on how many of each possible value there are

I have a dataframe looking like this:
chr <- c(1,1,1,1,1)
b1 <- c('HP', 'HP', 'CP', 'CP', 'KP')
b2 <- c('HP', 'HP', 'CP', 'CP', 'KP')
b3 <- c('CP', 'KP', 'CP', 'HP', 'CP')
b4 <- c('CP', 'KP', 'CP', 'HP', 'CP')
b5 <- c('CP', 'CP', 'KP', 'KP', 'HP')
b6 <- c('CP', 'CP', 'KP', 'KP', 'HP')
b7 <- c('CP', 'KP', 'HP', 'CP', 'CP')
b8 <- c('CP', 'KP', 'HP', 'CP', 'CP')
df <- data.frame(chr, b1,b2,b3,b4,b5,b6,b7,b8)
I want to write a function that looks at each 'b' column and asks if it contains the value 'HP'. If it does, and the other six 'b' columns contain 'CP' or 'KP', I want to change the value 'HP' into 'CP' or 'KP' depending on which is the majority. If CP is the majority, change the HP to CP. If KP is the majority, change HP to KP.
(note that the value of b1 and b2, b3 and b4 etc is always the same, so really only 4 columns need to be looked at, b1, b3, b5, and b7).
To clarify, if the columns are e.g. HP HP CP CP CP CP KP KP, I want to change the two HPs into CPs (and leave the other columns the same).
So, the example I gave would become:
chr <- c(1,1,1,1,1)
b1 <- c('CP', 'KP', 'CP', 'CP', 'KP')
b2 <- c('CP', 'KP', 'CP', 'CP', 'KP')
b3 <- c('CP', 'KP', 'CP', 'CP', 'CP')
b4 <- c('CP', 'KP', 'CP', 'CP', 'CP')
b5 <- c('CP', 'CP', 'KP', 'KP', 'CP')
b6 <- c('CP', 'CP', 'KP', 'KP', 'CP')
b7 <- c('CP', 'KP', 'CP', 'CP', 'CP')
b8 <- c('CP', 'KP', 'CP', 'CP', 'CP')
df <- data.frame(chr, b1,b2,b3,b4,b5,b6,b7,b8)
df
I have written a function (just for df$b1) with if statements, but it doesn't work.
(note the rules for whether the HP changes to KP or CP depend on how many other CPs or KPs there are):
fun <- function(df){
if(df$b1 == 'HP' && df$b3 == 'CP' && df$b5 == 'CP' && df$b7 == 'CP') {df$b1 <- 'KP'}
if(df$b1 == 'HP' && df$b3 == 'KP' && df$b5 == 'CP' && df$b7 == 'CP') {df$b1 <- 'CP'}
if(df$b1 == 'HP' && df$b3 == 'CP' && df$b5 == 'KP' && df$b7 == 'CP') {df$b1 <- 'CP'}
if(df$b1 == 'HP' && df$b3 == 'CP' && df$b5 == 'CP' && df$b7 == 'KP') {df$b1 <- 'CP'}
if(df$b1 == 'HP' && df$b3 == 'KP' && df$b5 == 'KP' && df$b7 == 'CP') {df$b1 <- 'KP'}
if(df$b1 == 'HP' && df$b3 == 'KP' && df$b5 == 'CP' && df$b7 == 'KP') {df$b1 <- 'KP'}
if(df$b1 == 'HP' && df$b3 == 'CP' && df$b5 == 'KP' && df$b7 == 'KP') {df$b1 <- 'KP'}
if(df$b1 == 'HP' && df$b3 == 'KP' && df$b5 == 'KP' && df$b7 == 'KP') {df$b1 <- 'CP'}
df$b2 <-df$b1
}
Thanks very much for any help. I'm really stuck on this one.
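One way to express the majority rule without enumerating every combination is to work row by row: count the CP and KP values in the row and overwrite any HP with whichever is more common. The sketch below is not a fix of the function above. The helper name replace_hp is made up, and leaving a row unchanged on a tie is an assumption, since the question does not say what should happen then.

```r
# Example data from the question, kept as character columns.
df <- data.frame(
  chr = c(1, 1, 1, 1, 1),
  b1 = c('HP', 'HP', 'CP', 'CP', 'KP'), b2 = c('HP', 'HP', 'CP', 'CP', 'KP'),
  b3 = c('CP', 'KP', 'CP', 'HP', 'CP'), b4 = c('CP', 'KP', 'CP', 'HP', 'CP'),
  b5 = c('CP', 'CP', 'KP', 'KP', 'HP'), b6 = c('CP', 'CP', 'KP', 'KP', 'HP'),
  b7 = c('CP', 'KP', 'HP', 'CP', 'CP'), b8 = c('CP', 'KP', 'HP', 'CP', 'CP'),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace HP in one row with the majority of CP/KP.
replace_hp <- function(row_values) {
  n_cp <- sum(row_values == "CP")
  n_kp <- sum(row_values == "KP")
  if (n_cp == n_kp) return(row_values)  # tie: left unchanged (assumption)
  row_values[row_values == "HP"] <- if (n_cp > n_kp) "CP" else "KP"
  row_values
}

# Apply the helper to every row of the b columns and write the result back.
b_cols <- paste0("b", 1:8)
df[b_cols] <- t(apply(df[b_cols], 1, replace_hp))
df
```

Because the paired columns (b1 and b2, b3 and b4, and so on) always hold the same value, counting over all eight columns gives the same majority as counting over b1, b3, b5 and b7.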
EDIT: This is a sample of my actual data which is more complex than the example I gave above.
structure(list(chr = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L), pos_c = c(2373L, 2406L, 2418L, 2419L,
2447L, 2450L, 2468L, 2524L, 2533L, 2535L, 2536L, 2542L, 2623L,
2709L, 3942L, 11716L, 11893L, 11898L, 12190L, 12396L, 26639L,
26640L, 26643L, 26646L, 26655L, 26657L, 26661L, 26667L, 26670L,
26676L, 26679L, 26684L, 26685L, 26688L, 26694L, 26703L, 26710L,
26712L, 26713L, 26723L, 26733L, 26737L, 26738L, 26739L, 26742L,
26743L, 26748L, 26761L, 26765L, 26766L, 26778L, 26781L, 26790L,
26792L, 26796L, 26802L, 26805L, 26811L, 26814L, 26819L, 26820L,
26823L, 26829L, 26838L, 26846L, 26847L, 26848L, 26872L, 26873L,
26874L, 26877L, 26878L, 26883L, 26889L, 26901L, 26904L, 26907L,
26916L, 26923L, 26925L, 26927L, 26931L, 26937L, 26940L, 26946L,
26954L, 26958L, 26961L, 26963L, 26964L, 26970L, 26981L, 26982L,
26983L, 26991L, 26994L, 26997L, 27007L, 27008L, 27009L, 27012L,
27015L, 27018L, 27027L, 202471L, 203660L, 203668L, 203669L, 203670L,
203672L, 203678L, 203683L, 203686L, 203687L, 203690L, 203704L,
203705L, 203711L, 203714L, 203732L, 203749L, 203752L, 203754L,
203755L, 203903L, 203910L, 203911L, 203912L, 203913L, 203914L,
203915L, 203922L, 203924L, 203933L, 203937L, 203939L, 203945L,
203948L, 203951L, 203957L, 203960L, 203961L, 203963L, 203969L,
203972L, 203973L, 203974L, 203975L, 203981L, 203991L, 204220L,
204227L, 204230L, 204232L, 204242L, 204245L, 204262L, 204272L,
204278L, 204282L, 204290L), c1 = c(101L, 60L, 63L, 64L, 100L,
97L, 94L, 83L, 80L, 48L, 46L, 51L, 69L, 46L, 23L, 79L, 63L, 59L,
53L, 85L, 13L, 12L, 1L, 9L, 11L, 13L, 9L, 14L, 14L, 12L, 15L,
9L, 15L, 14L, 14L, 2L, 2L, 8L, 3L, 0L, 0L, 4L, 2L, 1L, 4L, 4L,
8L, 39L, 7L, 5L, 2L, 41L, 69L, 79L, 89L, 120L, 128L, 90L, 134L,
107L, 169L, 120L, 103L, 48L, 58L, 132L, 62L, 19L, 9L, 13L, 12L,
12L, 17L, 251L, 8L, 367L, 367L, 264L, 5L, 170L, 113L, 234L, 134L,
143L, 189L, 224L, 255L, 296L, 448L, 239L, 169L, 80L, 312L, 84L,
403L, 397L, 430L, 529L, 544L, 556L, 565L, 549L, 555L, 4L, 11L,
0L, 18L, 18L, 19L, 19L, 18L, 18L, 17L, 17L, 15L, 15L, 16L, 15L,
13L, 14L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 2L, 3L, 2L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 1L, 13L, 2L, 10L, 4L, 10L, 24L, 33L, 33L, 63L, 42L), c2 = c(101L,
60L, 63L, 64L, 100L, 97L, 94L, 83L, 80L, 48L, 46L, 51L, 69L,
46L, 23L, 79L, 63L, 59L, 53L, 85L, 13L, 12L, 1L, 9L, 11L, 13L,
9L, 14L, 14L, 12L, 15L, 9L, 15L, 14L, 14L, 2L, 2L, 8L, 3L, 0L,
0L, 4L, 2L, 1L, 4L, 4L, 8L, 39L, 7L, 5L, 2L, 41L, 69L, 79L, 89L,
120L, 128L, 90L, 134L, 107L, 169L, 120L, 103L, 48L, 58L, 132L,
62L, 19L, 9L, 13L, 12L, 12L, 17L, 251L, 8L, 367L, 367L, 264L,
5L, 170L, 113L, 234L, 134L, 143L, 189L, 224L, 255L, 296L, 448L,
239L, 169L, 80L, 312L, 84L, 403L, 397L, 430L, 529L, 544L, 556L,
565L, 549L, 555L, 4L, 11L, 0L, 18L, 18L, 19L, 19L, 18L, 18L,
17L, 17L, 15L, 15L, 16L, 15L, 13L, 14L, 0L, 1L, 0L, 0L, 0L, 0L,
0L, 2L, 3L, 2L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 13L, 2L, 10L, 4L, 10L, 24L,
33L, 33L, 63L, 42L), c3 = c(37L, 0L, 0L, 0L, 42L, 46L, 46L, 21L,
26L, 6L, 2L, 7L, 11L, 4L, 0L, 4L, 1L, 0L, 0L, 2L, 29L, 29L, 0L,
22L, 23L, 23L, 26L, 27L, 29L, 24L, 32L, 26L, 35L, 32L, 32L, 3L,
3L, 10L, 1L, 5L, 1L, 6L, 1L, 0L, 5L, 11L, 6L, 81L, 15L, 14L,
0L, 92L, 157L, 174L, 168L, 236L, 221L, 143L, 228L, 251L, 292L,
273L, 281L, 33L, 39L, 260L, 57L, 53L, 24L, 22L, 26L, 37L, 37L,
484L, 16L, 721L, 724L, 436L, 7L, 367L, 163L, 411L, 167L, 373L,
275L, 599L, 637L, 773L, 866L, 615L, 223L, 63L, 531L, 59L, 878L,
868L, 911L, 939L, 975L, 995L, 980L, 931L, 958L, 12L, 16L, 0L,
12L, 13L, 12L, 11L, 9L, 12L, 11L, 11L, 10L, 1L, 0L, 0L, 0L, 1L,
1L, 2L, 1L, 0L, 1L, 1L, 0L, 2L, 2L, 2L, 0L, 0L, 0L, 0L, 0L, 1L,
0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 2L, 28L,
5L, 28L, 3L, 12L, 39L, 40L, 50L, 90L, 80L), c4 = c(37L, 0L, 0L,
0L, 42L, 46L, 46L, 21L, 26L, 6L, 2L, 7L, 11L, 4L, 0L, 4L, 1L,
0L, 0L, 2L, 29L, 29L, 0L, 22L, 23L, 23L, 26L, 27L, 29L, 24L,
32L, 26L, 35L, 32L, 32L, 3L, 3L, 10L, 1L, 5L, 1L, 6L, 1L, 0L,
5L, 11L, 6L, 81L, 15L, 14L, 0L, 92L, 157L, 174L, 168L, 236L,
221L, 143L, 228L, 251L, 292L, 273L, 281L, 33L, 39L, 260L, 57L,
53L, 24L, 22L, 26L, 37L, 37L, 484L, 16L, 721L, 724L, 436L, 7L,
367L, 163L, 411L, 167L, 373L, 275L, 599L, 637L, 773L, 866L, 615L,
223L, 63L, 531L, 59L, 878L, 868L, 911L, 939L, 975L, 995L, 980L,
931L, 958L, 12L, 16L, 0L, 12L, 13L, 12L, 11L, 9L, 12L, 11L, 11L,
10L, 1L, 0L, 0L, 0L, 1L, 1L, 2L, 1L, 0L, 1L, 1L, 0L, 2L, 2L,
2L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L,
1L, 1L, 0L, 1L, 1L, 2L, 28L, 5L, 28L, 3L, 12L, 39L, 40L, 50L,
90L, 80L), c5 = c(96L, 77L, 74L, 72L, 96L, 96L, 92L, 80L, 79L,
79L, 76L, 76L, 66L, 55L, 64L, 78L, 110L, 100L, 165L, 171L, 38L,
41L, 2L, 38L, 33L, 37L, 21L, 40L, 41L, 21L, 37L, 19L, 45L, 30L,
22L, 22L, 28L, 34L, 30L, 31L, 25L, 40L, 34L, 33L, 34L, 46L, 41L,
96L, 48L, 51L, 38L, 93L, 152L, 155L, 155L, 193L, 195L, 189L,
222L, 213L, 284L, 248L, 230L, 56L, 70L, 208L, 82L, 85L, 67L,
64L, 64L, 83L, 71L, 495L, 77L, 570L, 577L, 499L, 55L, 292L, 236L,
352L, 244L, 296L, 351L, 391L, 440L, 483L, 653L, 417L, 194L, 57L,
460L, 57L, 538L, 520L, 573L, 731L, 753L, 770L, 772L, 757L, 761L,
35L, 73L, 66L, 70L, 70L, 71L, 70L, 74L, 79L, 82L, 83L, 85L, 69L,
68L, 71L, 71L, 70L, 73L, 72L, 72L, 74L, 103L, 107L, 106L, 107L,
109L, 106L, 106L, 105L, 106L, 105L, 108L, 104L, 105L, 106L, 106L,
103L, 112L, 112L, 113L, 112L, 109L, 114L, 114L, 115L, 120L, 114L,
97L, 125L, 103L, 124L, 107L, 116L, 145L, 139L, 138L, 177L, 139L
), c6 = c(96L, 77L, 74L, 72L, 96L, 96L, 92L, 80L, 79L, 79L, 76L,
76L, 66L, 55L, 64L, 78L, 110L, 100L, 165L, 171L, 38L, 41L, 2L,
38L, 33L, 37L, 21L, 40L, 41L, 21L, 37L, 19L, 45L, 30L, 22L, 22L,
28L, 34L, 30L, 31L, 25L, 40L, 34L, 33L, 34L, 46L, 41L, 96L, 48L,
51L, 38L, 93L, 152L, 155L, 155L, 193L, 195L, 189L, 222L, 213L,
284L, 248L, 230L, 56L, 70L, 208L, 82L, 85L, 67L, 64L, 64L, 83L,
71L, 495L, 77L, 570L, 577L, 499L, 55L, 292L, 236L, 352L, 244L,
296L, 351L, 391L, 440L, 483L, 653L, 417L, 194L, 57L, 460L, 57L,
538L, 520L, 573L, 731L, 753L, 770L, 772L, 757L, 761L, 35L, 73L,
66L, 70L, 70L, 71L, 70L, 74L, 79L, 82L, 83L, 85L, 69L, 68L, 71L,
71L, 70L, 73L, 72L, 72L, 74L, 103L, 107L, 106L, 107L, 109L, 106L,
106L, 105L, 106L, 105L, 108L, 104L, 105L, 106L, 106L, 103L, 112L,
112L, 113L, 112L, 109L, 114L, 114L, 115L, 120L, 114L, 97L, 125L,
103L, 124L, 107L, 116L, 145L, 139L, 138L, 177L, 139L), c7 = c(28L,
3L, 1L, 1L, 52L, 50L, 60L, 49L, 50L, 3L, 2L, 2L, 37L, 11L, 0L,
1L, 2L, 2L, 0L, 1L, 28L, 30L, 1L, 17L, 23L, 28L, 11L, 30L, 32L,
13L, 32L, 19L, 39L, 18L, 17L, 23L, 29L, 46L, 37L, 25L, 21L, 42L,
32L, 29L, 30L, 41L, 44L, 141L, 72L, 64L, 25L, 93L, 219L, 234L,
218L, 294L, 277L, 184L, 294L, 273L, 382L, 293L, 280L, 131L, 132L,
386L, 157L, 99L, 77L, 75L, 68L, 66L, 88L, 615L, 55L, 746L, 740L,
685L, 27L, 305L, 158L, 511L, 151L, 326L, 371L, 605L, 650L, 727L,
886L, 623L, 314L, 170L, 734L, 162L, 937L, 908L, 987L, 964L, 997L,
1002L, 1007L, 960L, 980L, 28L, 75L, 61L, 96L, 98L, 97L, 96L,
93L, 101L, 99L, 100L, 98L, 91L, 90L, 90L, 89L, 87L, 76L, 75L,
75L, 76L, 88L, 92L, 87L, 86L, 88L, 87L, 85L, 87L, 87L, 83L, 86L,
87L, 86L, 86L, 89L, 83L, 83L, 84L, 84L, 86L, 83L, 86L, 88L, 87L,
88L, 84L, 81L, 118L, 90L, 120L, 90L, 101L, 127L, 134L, 140L,
172L, 160L), c8 = c(28L, 3L, 1L, 1L, 52L, 50L, 60L, 49L, 50L,
3L, 2L, 2L, 37L, 11L, 0L, 1L, 2L, 2L, 0L, 1L, 28L, 30L, 1L, 17L,
23L, 28L, 11L, 30L, 32L, 13L, 32L, 19L, 39L, 18L, 17L, 23L, 29L,
46L, 37L, 25L, 21L, 42L, 32L, 29L, 30L, 41L, 44L, 141L, 72L,
64L, 25L, 93L, 219L, 234L, 218L, 294L, 277L, 184L, 294L, 273L,
382L, 293L, 280L, 131L, 132L, 386L, 157L, 99L, 77L, 75L, 68L,
66L, 88L, 615L, 55L, 746L, 740L, 685L, 27L, 305L, 158L, 511L,
151L, 326L, 371L, 605L, 650L, 727L, 886L, 623L, 314L, 170L, 734L,
162L, 937L, 908L, 987L, 964L, 997L, 1002L, 1007L, 960L, 980L,
28L, 75L, 61L, 96L, 98L, 97L, 96L, 93L, 101L, 99L, 100L, 98L,
91L, 90L, 90L, 89L, 87L, 76L, 75L, 75L, 76L, 88L, 92L, 87L, 86L,
88L, 87L, 85L, 87L, 87L, 83L, 86L, 87L, 86L, 86L, 89L, 83L, 83L,
84L, 84L, 86L, 83L, 86L, 88L, 87L, 88L, 84L, 81L, 118L, 90L,
120L, 90L, 101L, 127L, 134L, 140L, 172L, 160L), k1 = c(39L, 64L,
68L, 69L, 38L, 38L, 41L, 51L, 54L, 84L, 83L, 84L, 57L, 50L, 43L,
58L, 72L, 71L, 29L, 35L, 0L, 0L, 10L, 1L, 1L, 0L, 3L, 0L, 0L,
1L, 0L, 3L, 0L, 0L, 0L, 14L, 14L, 9L, 15L, 18L, 24L, 20L, 20L,
27L, 28L, 10L, 28L, 27L, 59L, 64L, 73L, 43L, 19L, 7L, 27L, 5L,
23L, 30L, 29L, 65L, 10L, 46L, 27L, 160L, 168L, 95L, 175L, 255L,
265L, 271L, 270L, 76L, 269L, 77L, 14L, 12L, 11L, 118L, 382L,
204L, 220L, 181L, 290L, 290L, 114L, 209L, 89L, 159L, 7L, 144L,
95L, 9L, 180L, 411L, 105L, 125L, 97L, 19L, 3L, 3L, 2L, 12L, 1L,
540L, 1L, 32L, 14L, 14L, 13L, 13L, 15L, 14L, 12L, 11L, 12L, 11L,
12L, 13L, 13L, 9L, 18L, 17L, 8L, 18L, 6L, 2L, 1L, 2L, 1L, 2L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 4L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 0L, 2L, 1L, 21L, 28L, 49L, 50L, 54L, 45L,
44L), k2 = c(39L, 64L, 68L, 69L, 38L, 38L, 41L, 51L, 54L, 84L,
83L, 84L, 57L, 50L, 43L, 58L, 72L, 71L, 29L, 35L, 0L, 0L, 10L,
1L, 1L, 0L, 3L, 0L, 0L, 1L, 0L, 3L, 0L, 0L, 0L, 14L, 14L, 9L,
15L, 18L, 24L, 20L, 20L, 27L, 28L, 10L, 28L, 27L, 59L, 64L, 73L,
43L, 19L, 7L, 27L, 5L, 23L, 30L, 29L, 65L, 10L, 46L, 27L, 160L,
168L, 95L, 175L, 255L, 265L, 271L, 270L, 76L, 269L, 77L, 14L,
12L, 11L, 118L, 382L, 204L, 220L, 181L, 290L, 290L, 114L, 209L,
89L, 159L, 7L, 144L, 95L, 9L, 180L, 411L, 105L, 125L, 97L, 19L,
3L, 3L, 2L, 12L, 1L, 540L, 1L, 32L, 14L, 14L, 13L, 13L, 15L,
14L, 12L, 11L, 12L, 11L, 12L, 13L, 13L, 9L, 18L, 17L, 8L, 18L,
6L, 2L, 1L, 2L, 1L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 4L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 0L, 2L, 1L, 21L,
28L, 49L, 50L, 54L, 45L, 44L), k3 = c(84L, 122L, 120L, 120L,
92L, 88L, 90L, 107L, 98L, 114L, 120L, 117L, 91L, 64L, 59L, 100L,
113L, 109L, 56L, 136L, 1L, 0L, 29L, 7L, 4L, 6L, 5L, 6L, 6L, 9L,
7L, 11L, 7L, 10L, 9L, 44L, 46L, 38L, 51L, 60L, 79L, 75L, 80L,
83L, 80L, 41L, 97L, 61L, 133L, 135L, 180L, 100L, 50L, 28L, 75L,
18L, 79L, 94L, 100L, 117L, 47L, 74L, 68L, 393L, 390L, 191L, 416L,
504L, 532L, 545L, 545L, 181L, 556L, 175L, 19L, 24L, 19L, 312L,
766L, 389L, 416L, 418L, 639L, 475L, 239L, 293L, 70L, 135L, 37L,
122L, 84L, 42L, 408L, 886L, 93L, 115L, 65L, 67L, 35L, 37L, 47L,
50L, 54L, 942L, 9L, 43L, 29L, 29L, 29L, 29L, 28L, 27L, 25L, 25L,
26L, 32L, 33L, 32L, 33L, 30L, 26L, 23L, 24L, 23L, 8L, 1L, 2L,
2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L, 3L, 4L, 3L, 3L, 3L, 3L, 3L, 3L,
4L, 4L, 3L, 3L, 4L, 3L, 2L, 2L, 0L, 7L, 3L, 65L, 73L, 111L, 98L,
133L, 107L, 64L), k4 = c(84L, 122L, 120L, 120L, 92L, 88L, 90L,
107L, 98L, 114L, 120L, 117L, 91L, 64L, 59L, 100L, 113L, 109L,
56L, 136L, 1L, 0L, 29L, 7L, 4L, 6L, 5L, 6L, 6L, 9L, 7L, 11L,
7L, 10L, 9L, 44L, 46L, 38L, 51L, 60L, 79L, 75L, 80L, 83L, 80L,
41L, 97L, 61L, 133L, 135L, 180L, 100L, 50L, 28L, 75L, 18L, 79L,
94L, 100L, 117L, 47L, 74L, 68L, 393L, 390L, 191L, 416L, 504L,
532L, 545L, 545L, 181L, 556L, 175L, 19L, 24L, 19L, 312L, 766L,
389L, 416L, 418L, 639L, 475L, 239L, 293L, 70L, 135L, 37L, 122L,
84L, 42L, 408L, 886L, 93L, 115L, 65L, 67L, 35L, 37L, 47L, 50L,
54L, 942L, 9L, 43L, 29L, 29L, 29L, 29L, 28L, 27L, 25L, 25L, 26L,
32L, 33L, 32L, 33L, 30L, 26L, 23L, 24L, 23L, 8L, 1L, 2L, 2L,
2L, 2L, 2L, 4L, 4L, 4L, 4L, 3L, 4L, 3L, 3L, 3L, 3L, 3L, 3L, 4L,
4L, 3L, 3L, 4L, 3L, 2L, 2L, 0L, 7L, 3L, 65L, 73L, 111L, 98L,
133L, 107L, 64L), k5 = c(0L, 14L, 14L, 14L, 1L, 0L, 0L, 8L, 7L,
5L, 5L, 5L, 0L, 3L, 0L, 8L, 2L, 3L, 18L, 15L, 0L, 2L, 38L, 3L,
5L, 1L, 18L, 1L, 2L, 2L, 3L, 21L, 2L, 15L, 1L, 26L, 22L, 17L,
27L, 33L, 41L, 39L, 42L, 45L, 51L, 14L, 50L, 31L, 82L, 84L, 108L,
55L, 24L, 16L, 51L, 33L, 44L, 55L, 54L, 87L, 15L, 20L, 27L, 285L,
297L, 151L, 293L, 343L, 363L, 374L, 376L, 57L, 382L, 24L, 25L,
10L, 8L, 103L, 551L, 301L, 320L, 276L, 364L, 340L, 49L, 272L,
171L, 195L, 24L, 180L, 161L, 11L, 254L, 663L, 188L, 229L, 158L,
26L, 3L, 3L, 6L, 10L, 6L, 708L, 0L, 9L, 0L, 3L, 0L, 1L, 0L, 2L,
0L, 0L, 1L, 9L, 9L, 9L, 10L, 10L, 6L, 6L, 1L, 6L, 2L, 0L, 5L,
3L, 2L, 3L, 4L, 2L, 3L, 2L, 2L, 1L, 3L, 0L, 0L, 4L, 1L, 0L, 1L,
5L, 2L, 0L, 1L, 2L, 0L, 2L, 5L, 1L, 3L, 3L, 43L, 50L, 78L, 75L,
87L, 78L, 59L), k6 = c(0L, 14L, 14L, 14L, 1L, 0L, 0L, 8L, 7L,
5L, 5L, 5L, 0L, 3L, 0L, 8L, 2L, 3L, 18L, 15L, 0L, 2L, 38L, 3L,
5L, 1L, 18L, 1L, 2L, 2L, 3L, 21L, 2L, 15L, 1L, 26L, 22L, 17L,
27L, 33L, 41L, 39L, 42L, 45L, 51L, 14L, 50L, 31L, 82L, 84L, 108L,
55L, 24L, 16L, 51L, 33L, 44L, 55L, 54L, 87L, 15L, 20L, 27L, 285L,
297L, 151L, 293L, 343L, 363L, 374L, 376L, 57L, 382L, 24L, 25L,
10L, 8L, 103L, 551L, 301L, 320L, 276L, 364L, 340L, 49L, 272L,
171L, 195L, 24L, 180L, 161L, 11L, 254L, 663L, 188L, 229L, 158L,
26L, 3L, 3L, 6L, 10L, 6L, 708L, 0L, 9L, 0L, 3L, 0L, 1L, 0L, 2L,
0L, 0L, 1L, 9L, 9L, 9L, 10L, 10L, 6L, 6L, 1L, 6L, 2L, 0L, 5L,
3L, 2L, 3L, 4L, 2L, 3L, 2L, 2L, 1L, 3L, 0L, 0L, 4L, 1L, 0L, 1L,
5L, 2L, 0L, 1L, 2L, 0L, 2L, 5L, 1L, 3L, 3L, 43L, 50L, 78L, 75L,
87L, 78L, 59L), k7 = c(0L, 36L, 42L, 44L, 0L, 0L, 0L, 3L, 3L,
49L, 50L, 51L, 0L, 0L, 0L, 0L, 0L, 0L, 31L, 158L, 0L, 1L, 28L,
14L, 11L, 9L, 27L, 14L, 12L, 14L, 14L, 28L, 14L, 32L, 19L, 41L,
37L, 26L, 39L, 57L, 85L, 75L, 82L, 87L, 87L, 37L, 91L, 54L, 124L,
138L, 206L, 150L, 44L, 18L, 92L, 38L, 76L, 95L, 101L, 155L, 20L,
90L, 48L, 375L, 344L, 135L, 379L, 519L, 537L, 549L, 563L, 67L,
557L, 91L, 43L, 30L, 35L, 125L, 784L, 491L, 519L, 324L, 627L,
503L, 215L, 296L, 68L, 203L, 42L, 173L, 58L, 43L, 222L, 812L,
64L, 98L, 36L, 65L, 36L, 45L, 42L, 50L, 43L, 962L, 0L, 36L, 0L,
0L, 0L, 1L, 1L, 0L, 0L, 0L, 1L, 15L, 17L, 15L, 13L, 12L, 25L,
27L, 8L, 26L, 7L, 2L, 5L, 5L, 4L, 5L, 5L, 5L, 5L, 6L, 5L, 4L,
6L, 0L, 0L, 5L, 0L, 1L, 0L, 5L, 3L, 0L, 0L, 4L, 0L, 1L, 4L, 2L,
9L, 3L, 59L, 77L, 123L, 107L, 144L, 119L, 79L), k8 = c(0L, 36L,
42L, 44L, 0L, 0L, 0L, 3L, 3L, 49L, 50L, 51L, 0L, 0L, 0L, 0L,
0L, 0L, 31L, 158L, 0L, 1L, 28L, 14L, 11L, 9L, 27L, 14L, 12L,
14L, 14L, 28L, 14L, 32L, 19L, 41L, 37L, 26L, 39L, 57L, 85L, 75L,
82L, 87L, 87L, 37L, 91L, 54L, 124L, 138L, 206L, 150L, 44L, 18L,
92L, 38L, 76L, 95L, 101L, 155L, 20L, 90L, 48L, 375L, 344L, 135L,
379L, 519L, 537L, 549L, 563L, 67L, 557L, 91L, 43L, 30L, 35L,
125L, 784L, 491L, 519L, 324L, 627L, 503L, 215L, 296L, 68L, 203L,
42L, 173L, 58L, 43L, 222L, 812L, 64L, 98L, 36L, 65L, 36L, 45L,
42L, 50L, 43L, 962L, 0L, 36L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L,
1L, 15L, 17L, 15L, 13L, 12L, 25L, 27L, 8L, 26L, 7L, 2L, 5L, 5L,
4L, 5L, 5L, 5L, 5L, 6L, 5L, 4L, 6L, 0L, 0L, 5L, 0L, 1L, 0L, 5L,
3L, 0L, 0L, 4L, 0L, 1L, 4L, 2L, 9L, 3L, 59L, 77L, 123L, 107L,
144L, 119L, 79L), b1 = structure(c(7L, 3L, 3L, 3L, 7L, 7L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 7L, 1L, 1L, 7L,
7L, 7L, 1L, 7L, 1L, 1L, 1L, 1L, 7L, 1L, 1L, 1L, 7L, 7L, 7L, 7L,
5L, 5L, 7L, 7L, 5L, 5L, 7L, 7L, 3L, 5L, 5L, 5L, 3L, 7L, 7L, 7L,
1L, 7L, 7L, 7L, 3L, 1L, 7L, 7L, 7L, 7L, 3L, 7L, 5L, 5L, 5L, 5L,
7L, 5L, 7L, 7L, 1L, 1L, 3L, 5L, 3L, 7L, 3L, 3L, 3L, 7L, 3L, 7L,
3L, 1L, 7L, 7L, 7L, 3L, 5L, 7L, 7L, 7L, 1L, 1L, 1L, 1L, 1L, 1L,
5L, 1L, 5L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 5L, 5L, 7L, 5L, 5L, 6L, 6L, 2L, 6L, 2L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 6L, 6L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 2L,
1L, 7L, 7L, 7L, 7L, 3L, 7L, 7L, 3L, 7L), .Label = c("CP", "HF",
"HP", "KF", "KP", "NF", "NP"), class = "factor"), b2 = structure(c(7L,
3L, 3L, 3L, 7L, 7L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 7L, 1L, 1L, 7L, 7L, 7L, 1L, 7L, 1L, 1L, 1L, 1L, 7L, 1L,
1L, 1L, 7L, 7L, 7L, 7L, 5L, 5L, 7L, 7L, 5L, 5L, 7L, 7L, 3L, 5L,
5L, 5L, 3L, 7L, 7L, 7L, 1L, 7L, 7L, 7L, 3L, 1L, 7L, 7L, 7L, 7L,
3L, 7L, 5L, 5L, 5L, 5L, 7L, 5L, 7L, 7L, 1L, 1L, 3L, 5L, 3L, 7L,
3L, 3L, 3L, 7L, 3L, 7L, 3L, 1L, 7L, 7L, 7L, 3L, 5L, 7L, 7L, 7L,
1L, 1L, 1L, 1L, 1L, 1L, 5L, 1L, 5L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 5L, 5L, 7L, 5L, 5L, 6L, 6L, 2L, 6L,
2L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 6L, 6L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 2L, 1L, 7L, 7L, 7L, 7L, 3L, 7L, 7L, 3L, 7L
), .Label = c("CP", "HF", "HP", "KF", "KP", "NF", "NP"), class = "factor"),
b3 = structure(c(3L, 5L, 5L, 5L, 3L, 3L, 3L, 5L, 7L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 5L, 7L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 1L, 7L, 7L, 5L, 5L, 7L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 7L, 5L, 3L, 5L, 5L, 5L, 3L, 7L, 7L, 3L,
7L, 7L, 7L, 3L, 3L, 7L, 7L, 7L, 5L, 5L, 3L, 5L, 5L, 5L, 5L,
5L, 7L, 5L, 7L, 7L, 1L, 1L, 3L, 5L, 3L, 7L, 3L, 7L, 3L, 7L,
3L, 7L, 1L, 1L, 7L, 7L, 7L, 3L, 5L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 5L, 3L, 5L, 7L, 3L, 7L, 7L, 7L, 3L, 3L, 3L, 7L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 2L, 2L, 2L,
6L, 4L, 4L, 4L, 4L, 6L, 4L, 6L, 6L, 6L, 6L, 6L, 6L, 4L, 4L,
6L, 6L, 4L, 6L, 2L, 7L, 1L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 3L,
7L), .Label = c("CP", "HF", "HP", "KF", "KP", "NF", "NP"), class = "factor"),
b4 = structure(c(3L, 5L, 5L, 5L, 3L, 3L, 3L, 5L, 7L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 5L, 7L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 1L, 7L, 7L, 5L, 5L, 7L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 7L, 5L, 3L, 5L, 5L, 5L, 3L, 7L, 7L, 3L,
7L, 7L, 7L, 3L, 3L, 7L, 7L, 7L, 5L, 5L, 3L, 5L, 5L, 5L, 5L,
5L, 7L, 5L, 7L, 7L, 1L, 1L, 3L, 5L, 3L, 7L, 3L, 7L, 3L, 7L,
3L, 7L, 1L, 1L, 7L, 7L, 7L, 3L, 5L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 5L, 3L, 5L, 7L, 3L, 7L, 7L, 7L, 3L, 3L, 3L, 7L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 2L, 2L, 2L,
6L, 4L, 4L, 4L, 4L, 6L, 4L, 6L, 6L, 6L, 6L, 6L, 6L, 4L, 4L,
6L, 6L, 4L, 6L, 2L, 7L, 1L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 3L,
7L), .Label = c("CP", "HF", "HP", "KF", "KP", "NF", "NP"), class = "factor"),
b5 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 1L, 3L, 1L, 4L,
1L, 2L, 1L, 1L, 4L, 1L, 4L, 1L, 4L, 4L, 2L, 4L, 2L, 2L, 2L,
4L, 2L, 2L, 2L, 2L, 4L, 2L, 4L, 2L, 2L, 4L, 2L, 1L, 1L, 4L,
4L, 4L, 4L, 4L, 4L, 1L, 4L, 4L, 4L, 4L, 2L, 4L, 4L, 4L, 3L,
3L, 4L, 4L, 1L, 4L, 1L, 1L, 1L, 3L, 2L, 4L, 2L, 2L, 2L, 4L,
2L, 4L, 4L, 1L, 4L, 4L, 4L, 2L, 3L, 4L, 2L, 4L, 1L, 1L, 1L,
1L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 1L, 4L, 2L, 2L, 4L, 4L, 2L,
4L), .Label = c("CP", "HP", "KP", "NP"), class = "factor"),
b6 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 1L, 3L, 1L, 4L,
1L, 2L, 1L, 1L, 4L, 1L, 4L, 1L, 4L, 4L, 2L, 4L, 2L, 2L, 2L,
4L, 2L, 2L, 2L, 2L, 4L, 2L, 4L, 2L, 2L, 4L, 2L, 1L, 1L, 4L,
4L, 4L, 4L, 4L, 4L, 1L, 4L, 4L, 4L, 4L, 2L, 4L, 4L, 4L, 3L,
3L, 4L, 4L, 1L, 4L, 1L, 1L, 1L, 3L, 2L, 4L, 2L, 2L, 2L, 4L,
2L, 4L, 4L, 1L, 4L, 4L, 4L, 2L, 3L, 4L, 2L, 4L, 1L, 1L, 1L,
1L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 1L, 4L, 2L, 2L, 4L, 4L, 2L,
4L), .Label = c("CP", "HP", "KP", "NP"), class = "factor"),
b7 = structure(c(2L, 4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 4L,
4L, 4L, 2L, 2L, 5L, 1L, 1L, 1L, 4L, 4L, 2L, 2L, 4L, 3L, 6L,
6L, 6L, 3L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 3L, 3L, 3L, 3L, 6L,
6L, 3L, 6L, 6L, 6L, 6L, 3L, 6L, 3L, 3L, 4L, 3L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 3L, 2L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 4L, 4L,
4L, 6L, 4L, 2L, 6L, 2L, 2L, 2L, 4L, 3L, 6L, 3L, 6L, 3L, 6L,
3L, 6L, 6L, 2L, 6L, 6L, 6L, 6L, 4L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 4L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 6L, 6L, 6L, 6L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 6L, 2L, 6L, 3L, 3L, 6L, 3L, 3L,
6L), .Label = c("CF", "CP", "HP", "KP", "NF", "NP"), class = "factor"),
b8 = structure(c(2L, 4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 4L,
4L, 4L, 2L, 2L, 5L, 1L, 1L, 1L, 4L, 4L, 2L, 2L, 4L, 3L, 6L,
6L, 6L, 3L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 3L, 3L, 3L, 3L, 6L,
6L, 3L, 6L, 6L, 6L, 6L, 3L, 6L, 3L, 3L, 4L, 3L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 3L, 2L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 4L, 4L,
4L, 6L, 4L, 2L, 6L, 2L, 2L, 2L, 4L, 3L, 6L, 3L, 6L, 3L, 6L,
3L, 6L, 6L, 2L, 6L, 6L, 6L, 6L, 4L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 4L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 6L, 6L, 6L, 6L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 6L, 2L, 6L, 3L, 3L, 6L, 3L, 3L,
6L), .Label = c("CF", "CP", "HP", "KP", "NF", "NP"), class = "factor")), .Names = c("chr",
"pos_c", "c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8", "k1",
"k2", "k3", "k4", "k5", "k6", "k7", "k8", "b1", "b2", "b3", "b4",
"b5", "b6", "b7", "b8"), class = "data.frame", row.names = c(NA,
-161L))
You can try:
t(apply(df[, -1], 1, function(rg) {
  occ_rg <- table(rg)                                     # count each value in the row
  rg[grep("HP", rg)] <- names(occ_rg)[which.max(occ_rg)]  # replace "HP" by the most frequent value
  return(rg)
}))
So, to get your new df:
df <- data.frame(chr = df[, 1],
                 t(apply(df[, -1], 1, function(rg) {
                   occ_rg <- table(rg)
                   rg[grep("HP", rg)] <- names(occ_rg)[which.max(occ_rg)]
                   return(rg)
                 })),
                 stringsAsFactors = FALSE)
# chr b1 b2 b3 b4 b5 b6 b7 b8
#1 1 CP CP CP CP CP CP CP CP
#2 1 KP KP KP KP CP CP KP KP
#3 1 CP CP CP CP KP KP CP CP
#4 1 CP CP CP CP KP KP CP CP
#5 1 KP KP CP CP CP CP CP CP
EDIT
If you have other columns and the columns you want to change are the only ones whose names begin with "b", you can do:
df[, grepl("^b", colnames(df))] <- t(apply(df[, grepl("^b", colnames(df))],
                                           1,
                                           function(rg) {
                                             occ_rg <- table(rg)
                                             rg[grep("HP", rg)] <- names(occ_rg)[which.max(occ_rg)]
                                             return(rg)
                                           }))
Example:
With this df:
# chr c1 b1 b2 b3 b4 b5 b6 b7 b8 c2
#1 1 1 HP HP CP CP CP CP CP CP 11
#2 1 2 HP HP KP KP CP CP KP KP 12
#3 1 3 CP CP CP CP KP KP HP HP 13
#4 1 4 CP CP HP HP KP KP CP CP 14
#5 1 5 KP KP CP CP HP HP CP CP 15
You get:
# chr c1 b1 b2 b3 b4 b5 b6 b7 b8 c2
#1 1 1 CP CP CP CP CP CP CP CP 11
#2 1 2 KP KP KP KP CP CP KP KP 12
#3 1 3 CP CP CP CP KP KP CP CP 13
#4 1 4 CP CP CP CP KP KP CP CP 14
#5 1 5 KP KP CP CP CP CP CP CP 15
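For anyone who wants to run the example above, here is a minimal reproducible sketch. The data frame is typed in by hand from the printed table, and the name `b_cols` is only introduced here for readability; the replacement step itself is the code from the EDIT.

```r
# Example data frame from above (chr, c1 and c2 must stay untouched)
df <- data.frame(
  chr = 1L, c1 = 1:5,
  b1 = c("HP", "HP", "CP", "CP", "KP"),
  b2 = c("HP", "HP", "CP", "CP", "KP"),
  b3 = c("CP", "KP", "CP", "HP", "CP"),
  b4 = c("CP", "KP", "CP", "HP", "CP"),
  b5 = c("CP", "CP", "KP", "KP", "HP"),
  b6 = c("CP", "CP", "KP", "KP", "HP"),
  b7 = c("CP", "KP", "HP", "CP", "CP"),
  b8 = c("CP", "KP", "HP", "CP", "CP"),
  c2 = 11:15,
  stringsAsFactors = FALSE
)

# Logical index of the columns whose names begin with "b"
b_cols <- grepl("^b", colnames(df))

# Row by row: replace "HP" by the most frequent value of that row
df[, b_cols] <- t(apply(df[, b_cols], 1, function(rg) {
  occ_rg <- table(rg)
  rg[grep("HP", rg)] <- names(occ_rg)[which.max(occ_rg)]
  rg
}))

print(df)
```

Note that `table(rg)` still counts "HP" itself, so a row in which "HP" is the most frequent value would be left unchanged by this version.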
EDIT 2
If you have values other than "HP", "CP" and "KP" and want to replace "HP" with either "CP" or "KP", depending on which occurs most often in the row, you can do:
df[, grepl("^b", colnames(df))] <- t(apply(df[, grepl("^b", colnames(df))],
                                           1,
                                           function(rg) {
                                             occ_rg <- table(rg)
                                             occ_rg <- occ_rg[grepl("KP|CP", names(occ_rg))]
                                             rg[grep("HP", rg)] <- names(occ_rg)[which.max(occ_rg)]
                                             return(rg)
                                           }))
Explanation (for EDIT 2):
df[, grepl("^b", colnames(df))] <-  # only the columns beginning with "b" are modified (the other ones remain untouched)
  t(  # apply returns each row's result as a column, so the result is transposed back
    apply(df[, grepl("^b", colnames(df))],  # apply on df restricted to the columns beginning with "b"
          1,  # by row
          function(rg) {  # a function that takes a vector "rg" as input
            occ_rg <- table(rg)  # compute the table of occurrences
            occ_rg <- occ_rg[grepl("KP|CP", names(occ_rg))]  # keep only the occurrences of "KP" or "CP"
            rg[grep("HP", rg)] <- names(occ_rg)[which.max(occ_rg)]  # replace the "HP" elements of rg by "KP" or "CP", whichever occurs most often
            return(rg)  # finally return the vector rg
          }))
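
The row function from EDIT 2 can also be written as a named helper, which makes it easier to try on a single row. This is only a sketch: the name `replace_hp` and its `candidates` argument are made up for illustration, it matches values exactly with `%in%` instead of `grepl`, and it adds one guard that the code above does not have. If a row contains "HP" but neither "KP" nor "CP", `which.max` returns a zero-length result and the assignment fails with a "replacement has length zero" error; leaving such a row unchanged is an assumption about what is wanted.

```r
# Hypothetical helper: replace "HP" in one row by the more frequent candidate
replace_hp <- function(rg, candidates = c("KP", "CP")) {
  rg <- as.character(rg)
  # count only the candidate values, ignoring everything else in the row
  occ_rg <- table(rg[rg %in% candidates])
  # assumption: if no candidate occurs in the row, leave the row as it is
  if (length(occ_rg) == 0L) return(rg)
  rg[rg == "HP"] <- names(occ_rg)[which.max(occ_rg)]
  rg
}

# "NP" is the most frequent value here, but only "KP" and "CP" are considered
replace_hp(c("HP", "NP", "NP", "NP", "KP", "CP", "CP"))

# no candidate present: the row comes back unchanged
replace_hp(c("HP", "NP", "NF"))
```

It would be applied to the data frame in the same way as before, for example `t(apply(df[, grepl("^b", colnames(df))], 1, replace_hp))`. On a tie, `which.max` picks the first maximum, and since `table` sorts its names, "CP" wins over "KP".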

Internal ordering of facets ggplot2

I'm trying to plot facets in ggplot2, but I struggle to get the internal ordering of the different facets right. The data looks like this:
head(THAT_EXT)
           ID FILE           GENRE      NODE
1 CKC_1823_01  CKC          Novels    better
2 CKC_1824_01  CKC          Novels    better
3  EW9_192_03  EW9 Popular Science    better
4  H0B_265_01  H0B Popular Science       sad
5  CS2_231_03  CS2  Academic Prose desirable
6    FED_8_05  FED  Academic Prose   certain
str(THAT_EXT)
'data.frame': 851 obs. of 4 variables:
$ ID : Factor w/ 851 levels "A05_122_01","A05_277_07",..: 345 346 439 608 402 484 319 395 228 5 ...
$ FILE : Factor w/ 241 levels "A05","A06","A0K",..: 110 110 127 169 120 135 105 119 79 2 ...
$ GENRE: Factor w/ 5 levels "Academic Prose",..: 4 4 5 5 1 1 1 5 1 5 ...
$ NODE : Factor w/ 115 levels "absurd","accepted",..: 14 14 14 89 23 16 59 59 18 66 ...
Part of the problem is that I can't get the sorting right. Here is the code that I use to sort NODE:
THAT_EXT <- within(THAT_EXT,
NODE <- factor(NODE,
levels=names(sort(table(NODE),
decreasing=TRUE))))
When I plot this with the code below, I get a graph in which NODE is not correctly sorted within the individual GENREs, since different NODEs are more frequent in different GENREs:
p1 <-
ggplot(THAT_EXT, aes(x=NODE)) +
geom_bar() +
scale_x_discrete("THAT_EXT", breaks=NULL) + # suppress tick marks on x axis
facet_wrap(~GENRE)
What I want is for every facet to have NODE sorted in decreasing order for that particular GENRE. Can anyone help with this?
structure(list(ID = structure(c(1L, 2L, 3L, 4L, 10L, 133L, 137L,
138L, 139L, 140L, 141L, 142L, 143L, 144L, 145L, 146L, 147L, 148L,
149L, 150L, 151L, 152L, 153L, 154L, 155L, 156L, 157L, 158L, 159L,
160L, 161L, 162L, 163L, 164L, 165L, 166L, 167L, 168L, 169L, 170L,
171L, 172L, 173L, 174L, 175L, 176L, 177L, 178L, 179L, 180L, 181L,
182L, 183L, 184L, 185L, 186L, 187L, 188L, 189L, 190L, 191L, 192L,
193L, 194L, 195L, 196L, 197L, 198L, 199L, 200L, 201L, 202L, 203L,
204L, 205L, 206L, 207L, 208L, 212L, 213L, 214L, 215L, 216L, 217L,
218L, 219L, 220L, 221L, 222L, 223L, 224L, 225L, 226L, 227L, 228L,
229L, 230L, 231L, 232L, 233L, 234L, 235L, 236L, 237L, 238L, 239L,
240L, 241L, 267L, 268L, 269L, 270L, 271L, 272L, 273L, 274L, 275L,
276L, 277L, 278L, 279L, 280L, 281L, 282L, 283L, 284L, 290L, 291L,
298L, 299L, 300L, 303L, 304L, 305L, 306L, 307L, 308L, 309L, 310L,
313L, 314L, 315L, 316L, 317L, 318L, 319L, 327L, 328L, 329L, 330L,
331L, 332L, 333L, 334L, 335L, 336L, 337L, 338L, 339L, 340L, 341L,
342L, 343L, 344L, 345L, 346L, 347L, 348L, 352L, 353L, 354L, 355L,
356L, 357L, 358L, 359L, 360L, 349L, 350L, 351L, 361L, 362L, 363L,
364L, 365L, 366L, 367L, 368L, 369L, 370L, 371L, 372L, 373L, 374L,
375L, 376L, 377L, 378L, 379L, 380L, 381L, 12L, 13L, 14L, 15L,
16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L,
29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 41L, 42L, 43L, 44L, 45L,
46L, 50L, 54L, 72L, 73L, 74L, 75L, 76L, 90L, 91L, 92L, 97L, 98L,
102L, 115L, 125L, 126L, 127L, 128L, 129L, 130L, 131L, 132L, 209L,
210L, 211L, 242L, 243L, 244L, 245L, 246L, 289L, 292L, 293L, 294L,
295L, 296L, 297L, 301L, 302L, 311L, 312L, 320L, 321L, 322L, 323L,
324L, 325L, 326L, 382L, 383L, 384L, 385L, 386L, 387L, 388L, 5L,
6L, 7L, 8L, 9L, 11L, 37L, 38L, 39L, 40L, 47L, 48L, 49L, 51L,
52L, 53L, 55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L, 64L, 65L,
66L, 67L, 68L, 69L, 70L, 71L, 77L, 78L, 79L, 80L, 81L, 82L, 83L,
84L, 85L, 86L, 87L, 88L, 89L, 93L, 94L, 95L, 96L, 99L, 100L,
101L, 103L, 104L, 105L, 106L, 107L, 108L, 109L, 110L, 111L, 112L,
113L, 114L, 116L, 117L, 118L, 119L, 120L, 121L, 122L, 123L, 124L,
134L, 135L, 136L, 247L, 248L, 249L, 250L, 251L, 252L, 253L, 254L,
255L, 256L, 257L, 258L, 259L, 260L, 261L, 262L, 263L, 264L, 265L,
266L, 285L, 286L, 287L, 288L), .Label = c("A05_122_01", "A05_277_07",
"A05_400_01", "A05_99_01", "A06_1283_02", "A06_1389_01", "A06_1390_01",
"A06_1441_02", "A06_884_03", "A0K_1190_03", "A77_1684_01", "A8K_525_03",
"A8K_582_01", "A8K_645_01", "A8K_799_01", "A90_341_02", "A90_496_01",
"A94_217_01", "A94_472_01", "A94_477_03", "A9M_164_01", "A9M_259_03",
"A9N_199_01", "A9N_489_01", "A9N_591_01", "A9R_173_01", "A9R_425_02",
"A9W_536_02", "AA5_121_01", "AAE_203_01", "AAE_243_01", "AAE_412_01",
"AAW_14_03", "AAW_244_02", "AAW_297_04", "AAW_365_04", "ADG_1398_01",
"ADG_1500_01", "ADG_1507_01", "ADG_1516_01", "AHB_336_01", "AHB_421_01",
"AHJ_1090_02", "AHJ_619_01", "AR3_340_01", "AR3_91_03", "ARF_879_01",
"ARF_985_01", "ARF_991_02", "ARK_1891_01", "ASL_33_04", "ASL_43_01",
"ASL_9_01", "AT7_1031_01", "B09_1162_01", "B09_1475_01", "B09_1493_01",
"B09_1539_01", "B0G_197_01", "B0G_320_01", "B0N_1037_01", "B0N_624_01",
"B0N_645_02", "B0N_683_01", "B3G_313_04", "B3G_320_03", "B3G_398_02",
"B7M_1630_01", "B7M_1913_01", "BNN_746_02", "BNN_895_01", "BP7_2426_01",
"BP7_2777_01", "BP7_2898_01", "BP9_410_01", "BP9_599_01", "BPK_829_01",
"C93_1407_02", "C9A_181_01", "C9A_196_01", "C9A_365_01", "C9A_82_02",
"C9A_9_01", "CB9_306_02", "CB9_63_04", "CB9_86_01", "CBJ_439_01",
"CBJ_702_02", "CBJ_705_01", "CCM_320_01", "CCM_665_01", "CCM_669_02",
"CCN_1036_02", "CCN_1078_01", "CCN_1119_01", "CCN_784_01", "CCW_2284_02",
"CCW_2349_03", "CE7_242_02", "CE7_284_01", "CE7_39_01", "CEB_1675_01",
"CER_145_03", "CER_23_01", "CER_235_02", "CER_378_10", "CET_1056_02",
"CET_680_01", "CET_705_01", "CET_797_01", "CET_838_01", "CET_879_05",
"CET_946_03", "CET_986_01", "CEY_2977_01", "CJ3_107_02", "CJ3_114_03",
"CJ3_20_01", "CJ3_81_01", "CK2_112_01", "CK2_22_01", "CK2_392_01",
"CK2_42_01", "CK2_75_01", "CKC_1776_01", "CKC_1777_01", "CKC_1823_01",
"CKC_1824_01", "CKC_1860_01", "CKC_1883_01", "CKC_1883_02", "CKC_2127_01",
"CMN_1439_02", "CRM_5767_01", "CRM_5770_03", "CRM_5789_01", "CS2_110_01",
"CS2_131_01", "CS2_139_01", "CS2_187_01", "CS2_187_03", "CS2_231_03",
"CS2_249_02", "CS2_301_01", "CS2_35_01", "CS2_58_02", "EV6_16_01",
"EV6_206_02", "EV6_240_01", "EV6_244_02", "EV6_28_01", "EV6_30_01",
"EV6_32_01", "EV6_450_01", "EV6_69_01", "EV6_80_01", "EV6_91_01",
"FAC_1019_01", "FAC_1026_01", "FAC_1027_01", "FAC_1235_01", "FAC_1269_05",
"FAC_1270_05", "FAC_1393_01", "FAC_1406_03", "FAC_933_01", "FAC_950_01",
"FAC_960_01", "FED_105_01", "FED_120_02", "FED_21_02", "FED_281_02",
"FED_302_02", "FED_53_01", "FED_8_05", "FEF_498_03", "FEF_674_03",
"FR2_410_01", "FR2_557_02", "FR2_593_01", "FR2_691_01", "FR4_232_01",
"FR4_331_01", "FR4_346_01", "FS7_818_01", "FS7_919_01", "FU0_368_02",
"FYT_1138_01", "FYT_1183_01", "FYT_901_05", "G08_1336_01", "G1E_385_01",
"G1N_824_01", "G1N_860_01", "G1N_868_01", "G1N_975_01", "GU5_854_01",
"GUJ_423_01", "GUJ_501_01", "GUJ_611_01", "GUJ_629_03", "GUJ_700_01",
"GV0_10_01", "GV0_104_01", "GV0_111_01", "GV0_122_01", "GV0_160_01",
"GV0_232_02", "GV2_1465_01", "GV2_1899_01", "GV6_2683_01", "GW6_297_01",
"GW6_306_05", "GW6_307_01", "GW6_322_01", "GW6_330_02", "GW6_335_01",
"GW6_338_01", "GW6_367_02", "GW6_373_01", "GW6_407_01", "GW6_411_01",
"GW6_413_01", "GW6_421_01", "GW6_423_01", "GW6_424_01", "GW6_428_01",
"GW6_447_01", "GWM_480_01", "GWM_533_02", "GWM_554_02", "GWM_554_03",
"GWM_609_01", "GWM_609_04", "GWM_610_01", "GWM_730_01", "GWM_731_01",
"GWM_738_01", "GWM_804_06", "GWM_815_01", "GWM_832_03", "GVP_179_01",
"GVP_211_01", "GVP_393_02", "GVP_443_02", "GVP_710_01", "H0B_171_04",
"H0B_216_01", "H0B_265_01", "H0B_32_01", "H0B_361_03", "H0B_365_01",
"H0B_369_01", "H0B_74_01", "H0B_93_01", "H10_1002_01", "H10_1032_04",
"H10_653_01", "H10_803_01", "H10_824_01", "H10_825_03", "H10_881_01",
"H10_986_01", "H78_851_04", "H78_891_01", "H78_946_04", "H79_1959_19",
"H7S_110_05", "H7S_130_06", "H7S_131_03", "H7S_131_04", "H7S_146_01",
"H7S_148_01", "H7S_164_01", "H7S_179_01", "H7S_54_01", "H7S_56_05",
"H7S_62_03", "H7S_79_01", "H7S_8_01", "H7S_81_01", "H7S_83_01",
"H7S_87_01", "H7S_92_03", "H7X_1028_02", "H7X_1091_01", "H7X_691_01",
"H7X_695_01", "H8H_2917_01", "H8K_153_01", "H8K_55_01", "H8M_1897_01",
"H8M_2104_02", "H8T_3316_03", "H98_3204_01", "H98_3410_01", "H98_3490_02",
"H9R_130_02", "H9R_39_01", "H9S_1297_01", "HA2_3107_02", "HA2_3284_01",
"HPY_754_04", "HPY_785_09", "HPY_799_03", "HPY_807_04", "HPY_830_04",
"HPY_838_02", "HPY_843_01", "HPY_869_11", "HR7_190_01", "HR7_440_01",
"HTP_540_01", "HTP_585_01", "HTP_588_05", "HTP_593_01", "HTP_601_01",
"HTP_613_01", "HTP_648_02", "HTW_197_01", "HTW_494_01", "HTW_750_01",
"HWL_2770_01", "HWL_2919_01", "HWM_45_01", "HWM_45_02", "HXY_1047_03",
"HXY_701_01", "HXY_781_01", "HXY_783_01", "HXY_784_01", "HXY_836_01",
"HXY_931_01", "HXY_963_01", "HXY_972_01", "HXY_985_03", "HY6_1024_01",
"HY6_1025_01", "HY6_1164_01", "HY6_1223_01", "HY6_988_03", "HY6_989_01",
"HY8_160_01", "HY8_164_01", "HY8_292_03", "HY8_316_01", "HY9_778_03",
"HY9_845_02", "HYX_235_08", "HYX_245_01", "HYX_88_01", "J12_1474_02",
"J12_1492_01", "J12_1571_01", "J12_1845_01", "J14_341_01", "J18_597_04",
"J18_698_02", "J18_759_01", "J18_828_01", "J3R_197_01", "J3R_219_02",
"J3R_277_04", "J3T_267_01", "J3T_269_02", "J3T_57_02", "J41_41_02",
"J41_58_03", "J9B_133_03", "J9B_341_02", "J9B_341_03", "J9D_147_05",
"J9D_218_01", "J9D_411_01", "J9D_616_01", "J9D_616_02", "JNB_563_02",
"JT7_118_01", "JT7_129_02", "JT7_218_02", "JT7_344_02", "JXS_3663_01",
"JXU_407_01", "JXU_468_02", "JXU_559_01", "JXV_1439_04", "JXV_1592_01",
"JY1_100_01"), class = "factor"), GENRE = structure(c(1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L), .Label = c("Academic Prose", "Conversation", "News",
"Novels", "Popular Science"), class = "factor"), NODE = structure(c(9L,
10L, 10L, 10L, 4L, 10L, 71L, 35L, 49L, 6L, 5L, 15L, 28L, 44L,
64L, 64L, 28L, 28L, 18L, 18L, 32L, 18L, 58L, 10L, 72L, 28L, 18L,
10L, 64L, 10L, 35L, 64L, 64L, 69L, 8L, 10L, 50L, 69L, 49L, 49L,
15L, 69L, 10L, 49L, 8L, 64L, 49L, 10L, 69L, 18L, 61L, 67L, 67L,
61L, 57L, 69L, 11L, 10L, 64L, 10L, 59L, 61L, 49L, 10L, 59L, 1L,
61L, 35L, 54L, 54L, 39L, 44L, 61L, 64L, 69L, 1L, 23L, 49L, 49L,
8L, 69L, 49L, 69L, 49L, 49L, 69L, 35L, 49L, 49L, 49L, 35L, 10L,
49L, 48L, 10L, 49L, 11L, 44L, 50L, 11L, 50L, 69L, 49L, 10L, 59L,
68L, 47L, 69L, 49L, 35L, 29L, 8L, 49L, 50L, 35L, 10L, 35L, 8L,
35L, 8L, 10L, 35L, 10L, 10L, 10L, 35L, 44L, 61L, 35L, 44L, 28L,
47L, 39L, 39L, 49L, 61L, 43L, 60L, 19L, 10L, 10L, 10L, 44L, 44L,
62L, 44L, 10L, 59L, 10L, 61L, 1L, 53L, 33L, 10L, 8L, 8L, 64L,
64L, 10L, 57L, 61L, 64L, 66L, 19L, 61L, 64L, 10L, 10L, 8L, 19L,
35L, 28L, 10L, 61L, 35L, 42L, 35L, 28L, 32L, 64L, 10L, 18L, 28L,
25L, 35L, 35L, 10L, 18L, 10L, 22L, 55L, 28L, 10L, 1L, 55L, 51L,
1L, 38L, 28L, 28L, 33L, 10L, 44L, 29L, 16L, 8L, 28L, 69L, 32L,
10L, 61L, 20L, 35L, 10L, 28L, 10L, 32L, 10L, 46L, 59L, 64L, 35L,
66L, 2L, 35L, 28L, 30L, 18L, 69L, 32L, 10L, 28L, 17L, 36L, 64L,
61L, 10L, 64L, 33L, 3L, 37L, 26L, 28L, 64L, 44L, 28L, 64L, 64L,
6L, 6L, 64L, 50L, 32L, 8L, 64L, 50L, 28L, 24L, 18L, 47L, 35L,
40L, 24L, 55L, 44L, 22L, 1L, 49L, 44L, 18L, 45L, 63L, 64L, 35L,
12L, 35L, 10L, 35L, 10L, 10L, 10L, 44L, 44L, 44L, 65L, 44L, 55L,
32L, 49L, 64L, 39L, 69L, 1L, 60L, 7L, 14L, 44L, 33L, 10L, 19L,
10L, 70L, 53L, 8L, 61L, 61L, 44L, 61L, 65L, 28L, 68L, 69L, 27L,
61L, 28L, 72L, 34L, 61L, 32L, 10L, 49L, 35L, 49L, 10L, 10L, 69L,
39L, 40L, 19L, 59L, 53L, 49L, 49L, 44L, 49L, 35L, 49L, 61L, 61L,
1L, 10L, 28L, 49L, 35L, 49L, 61L, 50L, 69L, 35L, 61L, 35L, 50L,
10L, 28L, 69L, 61L, 21L, 69L, 29L, 35L, 35L, 35L, 11L, 69L, 8L,
41L, 56L, 35L, 61L, 69L, 49L, 49L, 49L, 1L, 13L, 64L, 64L, 52L,
44L, 64L, 64L, 50L, 49L, 69L, 11L, 59L, 49L, 31L), .Label = c("apparent",
"appropriate", "awful", "axiomatic", "best", "better", "breathtaking",
"certain", "characteristic", "clear", "conceivable", "convenient",
"crucial", "cruel", "desirable", "disappointing", "emphatic",
"essential", "evident", "expected", "extraordinary", "fair",
"fortunate", "Funny", "good", "great", "imperative", "important",
"impossible", "incredible", "inescapable", "inevitable", "interesting",
"ironic", "likely", "Likely", "lucky", "ludicrous", "natural",
"necessary", "needful", "notable", "noteworthy", "obvious", "odd",
"paradoxical", "plain", "plausible", "possible", "probable",
"proper", "relevant", "remarkable", "revealing", "right", "Sad",
"self-evident", "sensible", "significant", "striking", "surprising",
"symptomatic", "terrible", "true", "typical", "understandable",
"unexpected", "unfortunate", "unlikely", "unreasonable", "untrue",
"vital"), class = "factor")), .Names = c("ID", "GENRE", "NODE"
), class = "data.frame", row.names = c(NA, -388L))
As I mentioned already, facet_wrap is not intended for individual scales per facet; at least I didn't find a solution. Hence, setting the labels in scale_x_discrete did not give the desired result.
But this is my workaround:
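library(plyr)
library(ggplot2)
# df is the data frame from the question (THAT_EXT)
# count the rows for each GENRE/NODE combination (the count ends up in column V1)
nodeCount <- ddply( df, c("GENRE", "NODE"), nrow )
# build one key per GENRE/NODE combination
nodeCount$factors <- paste( nodeCount$GENRE, nodeCount$NODE, sep ="." )
# sort by GENRE, then by count, both in decreasing order
nodeCount <- nodeCount[ order( nodeCount$GENRE, nodeCount$V1, decreasing=TRUE ), ]
# fix the level order of the key to the sorted row order
nodeCount$factors <- factor( nodeCount$factors, levels=nodeCount$factors )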
head(nodeCount)
              GENRE       NODE V1                    factors
121 Popular Science   possible 14   Popular Science.possible
128 Popular Science surprising 11 Popular Science.surprising
116 Popular Science     likely  9     Popular Science.likely
132 Popular Science   unlikely  9   Popular Science.unlikely
103 Popular Science      clear  7      Popular Science.clear
129 Popular Science       true  5       Popular Science.true
g <- ggplot( nodeCount, aes( y=V1, x = factors ) ) +
  geom_bar( stat="identity" ) + # the counts are already in V1, so plot them as they are
  scale_x_discrete( breaks=NULL ) + # suppress tick marks on x axis
  facet_wrap( ~GENRE, scales="free_x" ) + # each facet only shows its own GENRE.NODE keys
  geom_text( aes( label = NODE, y = V1+2 ), angle = 45, vjust = 0, hjust=0, size=3 )
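Which gives one panel per GENRE, with the bars in each panel sorted by decreasing count and labelled with their NODE (plot image not included).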
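
The same idea can be sketched without plyr. The toy data, the names `toy`, `node_count` and `key`, and the use of `geom_col()` with axis labels instead of `geom_text()` are assumptions made for illustration, not part of the answer above. The label function also assumes that GENRE contains no "." character.

```r
# Toy data standing in for THAT_EXT (hypothetical values)
toy <- data.frame(
  GENRE = c("News", "News", "News", "Novels", "Novels", "Novels", "Novels"),
  NODE  = c("clear", "clear", "likely", "likely", "likely", "clear", "true"),
  stringsAsFactors = FALSE
)

# Count each GENRE/NODE combination and drop combinations that never occur
node_count <- as.data.frame(table(GENRE = toy$GENRE, NODE = toy$NODE),
                            stringsAsFactors = FALSE)
node_count <- node_count[node_count$Freq > 0, ]

# Sort by GENRE, then by decreasing count within each GENRE
node_count <- node_count[order(node_count$GENRE, -node_count$Freq), ]

# One key per GENRE/NODE pair; the level order is the plotting order
node_count$key <- paste(node_count$GENRE, node_count$NODE, sep = ".")
node_count$key <- factor(node_count$key, levels = node_count$key)

# Plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(node_count, aes(x = key, y = Freq)) +
    geom_col() +
    # show only the NODE part of the key on the x axis
    scale_x_discrete(labels = function(k) sub("^[^.]*\\.", "", k)) +
    facet_wrap(~GENRE, scales = "free_x")
}
```

Because every key belongs to exactly one GENRE, `scales = "free_x"` leaves each panel with only its own keys, in the order fixed by the factor levels.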
