Summary with label names with dplyr

I have imported a .sav file with haven, but I can't work out how to print the label names in place of, or alongside, the label codes. Labels: 1 = unemployed, 2 = looking, etc.
Employment <- select(well_being_df, EmploymentStatus, Gender) %>%
  group_by(EmploymentStatus) %>%
  summarise_all(funs(mean, n = n(), sd, min(., is.na = TRUE), max(., is.na = TRUE)))
# A tibble: 5 x 6
  EmploymentStatus  mean     n    sd   min   max
  <dbl+lbl>        <dbl> <int> <dbl> <dbl> <dbl>
1 1                 1.67    12 0.492     1     2
2 2                 1.17     6 0.408     1     2
3 3                 1.8     85 0.431     1     3
4 4                 1.5     62 0.504     1     2
5 5                 1.5      4 0.577     1     2
Ideally:
# A tibble: 5 x 6
  EmploymentStatus  mean     n    sd   min   max
  <dbl+lbl>        <dbl> <int> <dbl> <dbl> <dbl>
1 1 Unemployed      1.67    12 0.492     1     2
2 2 Looking         1.17     6 0.408     1     2
3 3 Etc             1.8     85 0.431     1     3
4 4                 1.5     62 0.504     1     2
5 5                 1.5      4 0.577     1     2
dput(head(well_being_df, 10))
structure(list(Age = c(22, 20, 23, 20, 25, 18, 24, 21, 21, 30.7344197070233
), Gender = structure(c(2, 2, 1, 2, 1, 2, 2, 2, 2, 1), labels = c(Male = 1,
Female = 2, Transgender = 3), class = "labelled"), EmploymentStatus = structure(c(3,
1, 4, 3, 3, 3, 3, 4, 3, 4), labels = c(`Unemployed but not looking` = 1,
`Unemployed and looking` = 2, `Part-time` = 3, `Full-time` = 4,
Retired = 5), class = "labelled"), Cognition1 = structure(c(6,
3, 6, 5, 9, 6, 4, 4, 7, 5), labels = c(`Provides nothing that you want` = 0,
`Provides half of what you want` = 5, `Provides all that you want` = 10
), class = "labelled"), Cognition2 = structure(c(7, 3, 8,
5, 8, 5, 5, 7, 7, 3), labels = c(`Far below average` = 0,
`About Average` = 5, `Far above average` = 10), class = "labelled"),
Cognition3 = structure(c(6, 5, 4, 5, 6, 5, 5, 5, 5, 5), labels = c(`Far less than you deserve` = 0,
`About what you deserve` = 5, `Far more than you deserve` = 10
), class = "labelled"), Cognition4 = structure(c(7, 3, 6,
2, 8, 3, 3, 5, 6, 2), labels = c(`Far less than you need` = 0,
`About what you need` = 5, `Far more than you need` = 10), class = "labelled"),
Cognition5 = structure(c(10, 9, 6, 3, 7, 2, 2, 0, 4, 0), labels = c(`Far less than expected` = 0,
`About as expected` = 5, `Far more than expected` = 10), class = "labelled"),
Cognition6 = structure(c(8, 6, 0, 3, 3, 8, 9, 10, 5, 10), labels = c(`Far more than it will in the future` = 0,
`About what you expect in the future` = 5, `Far less than what the future will offer` = 10
), class = "labelled"), Cognition7 = structure(c(9, 7, 10,
5, 6, 2, 3, 0, 8, 3), labels = c(`Far below previous best` = 0,
`Equals previous best` = 5, `Far above previous best` = 10
), class = "labelled")), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame"))

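library(dplyr)

Employment <- select(well_being_df, EmploymentStatus, Gender) %>%
  mutate(EmploymentStatus = labelled::to_factor(EmploymentStatus)) %>% # use the labelled package to turn codes into label names
  group_by(EmploymentStatus) %>%
  # funs() is deprecated, so the summaries are spelled out; na.rm = TRUE (not is.na = TRUE) drops missing values
  summarise(mean = mean(Gender), n = n(), sd = sd(Gender),
            min = min(Gender, na.rm = TRUE), max = max(Gender, na.rm = TRUE))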
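To see what the conversion step does, here is a minimal base R sketch of the same idea. It assumes the value labels sit in a "labels" attribute, as in the dput above; the vector employment is a hypothetical stand-in for the EmploymentStatus column, and this is not how labelled::to_factor is implemented internally.

```r
# Hypothetical stand-in for the EmploymentStatus column from the dput above
employment <- structure(
  c(3, 1, 4, 3),
  labels = c(`Unemployed but not looking` = 1, `Unemployed and looking` = 2,
             `Part-time` = 3, `Full-time` = 4, Retired = 5)
)

# The named "labels" attribute maps label text to numeric codes
value_labels <- attr(employment, "labels")

# Use the codes as factor levels and the label text as the displayed labels
employment_fct <- factor(as.numeric(employment),
                         levels = unname(value_labels),
                         labels = names(value_labels))

print(employment_fct)
```

Grouping by a factor built this way prints the label text instead of the numeric code in the summary table.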

Related

Cross validation returning 0 for train-merror and test-merror

Here is a sample from my data:
data
## q6 q7 q8 q9 q10 q11 q12 q13 q14 q15 q16 q17 q18 q19 q20 q21 q22 q23 w
## 1 1.73 54.43 2 5 5 1 1 1 1 1 1 2 3 2 2 2 2 2 0
## 2 1.50 51.26 2 5 1 1 1 1 1 1 1 1 1 1 1 2 1 1 0
## 3 1.90 66.68 1 5 1 1 1 1 1 1 3 NA NA NA NA NA 1 NA 0
## 4 NA NA 2 5 1 2 4 4 1 1 1 1 2 1 2 2 1 1 0
## 5 1.63 68.49 1 4 3 1 1 1 1 1 1 1 1 1 1 1 4 5 1
## 6 1.70 59.88 2 5 1 1 1 1 1 1 1 1 1 1 1 2 2 2 0
## 7 1.73 70.76 2 5 1 2 8 1 1 1 1 1 1 1 1 2 2 2 1
## 8 1.75 90.72 NA 5 1 NA NA 1 1 1 1 2 1 1 1 2 2 2 0
## 9 1.50 40.82 2 4 2 1 1 3 1 1 1 1 1 1 1 2 3 2 0
## 10 1.68 49.90 1 5 1 1 1 1 1 1 1 1 1 1 1 2 2 2 0
## 11 1.50 86.18 1 3 2 NA 6 4 1 1 2 8 5 1 1 1 NA NA 1
## 12 1.88 79.83 3 5 1 2 2 1 1 1 1 2 1 1 1 2 1 1 1
## 13 1.78 68.49 2 4 1 1 1 1 1 1 1 1 1 1 1 2 2 2 1
## 14 1.73 54.43 2 4 1 1 1 1 1 1 1 1 3 1 1 2 2 2 0
## 15 1.80 72.58 2 4 1 1 1 1 1 1 1 1 3 1 1 2 2 2 0
Then I cross validated this data:
xgb.cv(data=as.matrix(data),label=data$w, num_class=2, nrounds=20, nfold=5, eval_metric="merror", lambda=1, objective = "multi:softmax")
My label is the "w" column. num_class is 2 because "w" takes only the values 0 or 1. My final goal is to build a classifier with xgboost that predicts the label w from the data. However, when I ran the xgb.cv call above, it returned this:
## [1] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [2] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [3] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [4] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [5] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [6] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [7] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [8] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [9] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [10] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [11] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [12] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [13] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [14] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [15] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [16] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [17] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [18] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [19] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
## [20] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
All train and test merror are 0. Why? How do I solve this?
Here is the requested dput(head(data)):
## structure(list(q6 = c(1.73, 1.5, 1.9, NA, 1.63, 1.7), q7 = c(54.43,
## 51.26, 66.68, NA, 68.49, 59.88), q8 = c(2, 2, 1, 2, 1, 2), q9 = c(5,
## 5, 5, 5, 4, 5), q10 = c(5, 1, 1, 1, 3, 1), q11 = c(1, 1, 1, 2,
## 1, 1), q12 = c(1, 1, 1, 4, 1, 1), q13 = c(1, 1, 1, 4, 1, 1),
## q14 = c(1, 1, 1, 1, 1, 1), q15 = c(1, 1, 1, 1, 1, 1), q16 = c(1,
## 1, 3, 1, 1, 1), q17 = c(2, 1, NA, 1, 1, 1), q18 = c(3, 1,
## NA, 2, 1, 1), q19 = c(2, 1, NA, 1, 1, 1), q20 = c(2, 1, NA,
## 2, 1, 1), q21 = c(2, 2, NA, 2, 1, 2), q22 = c(2, 1, 1, 1,
## 4, 2), q23 = c(2, 1, NA, 1, 5, 2), q24 = c(1, 2, 1, 2, 1,
## 1), q25 = c(1, 2, 1, 2, 2, 1), q26 = c(2, 2, 1, 1, 1, 1),
## q27 = c(2, 2, 1, 2, 1, 1), q28 = c(2, 2, 2, 2, 1, 1), q29 = c(1,
## 1, NA, 1, 1, 3), q30 = c(1, 1, NA, 1, 1, 3), q31 = c(1, 2,
## NA, 1, 1, 1), q32 = c(6, 1, NA, 6, 6, 1), q33 = c(NA, 1,
## NA, 2, 5, 1), q34 = c(NA, 1, NA, 2, 4, 1), q35 = c(NA, 1,
## NA, 5, 5, 1), q36 = c(2, 1, NA, 3, 3, 1), q37 = c(1, 1, NA,
## 1, 1, 1), q38 = c(6, 1, NA, 4, 1, 1), q39 = c(1, 2, 2, 1,
## 1, 2), q40 = c(3, 1, NA, 2, 7, 1), q41 = c(6, 1, 2, 5, 6,
## 3), q42 = c(5, 1, 5, 5, 5, 6), q43 = c(1, 1, 1, 2, 2, 2),
## q44 = c(1, 1, 1, 2, 2, NA), q45 = c(1, 1, 1, 5, 7, 4), q46 = c(1,
## 1, 1, 6, 5, 7), q47 = c(7, 1, NA, 7, 7, 6), q48 = c(6, 1,
## 7, 5, 5, 6), q49 = c(4, 1, NA, 6, 1, 4), q50 = c(1, 1, 1,
## 2, 3, 1), q51 = c(1, 1, 1, 1, 1, 1), q52 = c(1, 1, 1, 1,
## 1, 1), q53 = c(1, 1, 1, 2, 3, 1), q54 = c(1, 1, 1, 1, 2,
## 1), q55 = c(1, 1, 1, 2, 1, 1), q56 = c(1, 1, 1, 1, 1, 1),
## q57 = c(1, 1, 1, 4, 4, 2), q58 = c(1, 1, 1, 1, 1, 1), q59 = c(1,
## 2, 2, 2, 1, 1), q60 = c(1, 2, 1, 1, 1, 1), q61 = c(7, 1,
## 2, 5, 6, 6), q62 = c(3, 1, 3, 5, 7, 5), q63 = c(3, 1, 3,
## 2, 4, 5), q64 = c(3, 1, 3, 3, 3, 2), q65 = c(2, 1, 2, 2,
## 2, 3), q66 = c(4, 1, NA, 4, 4, 2), q67 = c(2, 3, 3, 2, 3,
## 2), q68 = c(1, 1, 2, 1, 1, 1), q69 = c(2, 3, 3, 2, 3, 3),
## q70 = c(2, 4, 4, 2, 1, 1), q71 = c(3, 2, 3, 1, 3, 2), q72 = c(4,
## 4, 4, 2, 3, 2), q73 = c(1, 2, 1, 1, 1, 2), q74 = c(2, 2,
## 3, 2, 2, 2), q75 = c(2, 2, 2, 2, 2, 1), q76 = c(7, 2, 2,
## 2, 2, 1), q77 = c(3, 3, 4, 4, 2, 7), q78 = c(1, 2, 4, 2,
## 1, 3), q79 = c(4, 8, 6, 3, 1, 2), q80 = c(6, 4, 4, 3, 1,
## 4), q81 = c(5, NA, 1, 4, 2, 1), q82 = c(7, 1, 6, 5, 2, 7),
## q83 = c(1, 1, 1, 6, 1, 6), q84 = c(1, 1, 1, 2, 1, 2), q85 = c(2,
## 2, 1, 2, 2, 2), q86 = c(1, 1, NA, 1, 1, 1), q87 = c(2, 2,
## NA, 2, 2, 1), q88 = c(4, 5, 5, 3, 1, 2), q89 = c(4, 2, 2,
## 4, 2, 4), q90 = c(2, 1, NA, NA, 1, 2), q91 = c(1, 1, 1, 3,
## 3, 1), q92 = c(1, 1, 1, 2, 2, 5), q93 = c(4, 5, 7, 4, 7,
## 2), q94 = c(3, 3, 2, 2, 3, 2), q95 = c(1, 4, 1, 1, 1, 4),
## q96 = c(1, 1, 1, 1, 1, 1), q97 = c(1, 1, 3, 1, 2, 3), q98 = c(1,
## 2, 2, 1, 1, 1), q99 = c(1, 1, 1, 1, 1, 2), w = c(0, 0, 0,
## 0, 0, 0)), row.names = c(NA, 6L), class = "data.frame")
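One likely cause, visible in the call itself: as.matrix(data) still contains the w column, so the label is handed to the model as a feature and every fold can be predicted perfectly. The sketch below uses a made-up toy data frame (toy, with hypothetical random features), drops the label before building the feature matrix, and only runs the cross-validation if xgboost is installed. The binary objective and the "error" metric are a substitution for the multi:softmax setup in the question, chosen because the label has two classes.

```r
# Toy stand-in for the question's data: two features plus the 0/1 label `w`
set.seed(1)
toy <- data.frame(q6 = rnorm(40), q7 = rnorm(40), w = rep(c(0, 1), 20))

# as.matrix(toy) keeps `w`, so the label leaks into the features
leaky <- as.matrix(toy)
print("w" %in% colnames(leaky))

# Drop the label column before building the feature matrix
features <- as.matrix(toy[, setdiff(names(toy), "w")])
label <- toy$w

if (requireNamespace("xgboost", quietly = TRUE)) {
  dtrain <- xgboost::xgb.DMatrix(data = features, label = label)
  cv <- xgboost::xgb.cv(
    params = list(objective = "binary:logistic", eval_metric = "error"),
    data = dtrain, nrounds = 20, nfold = 5, verbose = FALSE
  )
}
```

With the label removed from the features, the reported error should no longer be exactly zero on every round unless the remaining columns really do separate the classes.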

How to sort each column of a df in descending order regardless of the row order?

I am trying to sort my data in descending or ascending order regardless of the data in the rows. I made a dummy example below:
A <- c(9,9,5,4,6,3,2,NA)
B <- c(9,5,3,4,1,4,NA,NA)
C <- c(1,4,5,6,7,4,2,4)
base <- data.frame(A,B,C)
df <- base
df$A <- sort(df$A,na.last = T)
df$B <- sort(df$B,na.last = T)
df$C <- sort(df$C)
We get this
structure(list(A = c(2, 3, 3, 4, 4, 4, 5, 5, 6, 9, 9, NA), B = c(1,
2, 3, 4, 4, 4, 5, 5, 9, 10, NA, NA), C = c(1, 2, 3, 4, 4, 4,
5, 5, 6, 7, 8, 8)), row.names = c(NA, -12L), class = "data.frame")
I want to get something similar to df, but my data have hundreds of columns. Is there an easier way to do it?
I tried arrange_all(), but the result is not what I want.
library(tidyverse)
test <- base%>%
arrange_all()
Obtaining this:
structure(list(A = c(2, 3, 3, 4, 4, 4, 5, 5, 6, 9, 9, NA), B = c(NA,
2, 4, 4, 5, 10, 3, 4, 1, 5, 9, NA), C = c(2, 3, 4, 6, 8, 5, 5,
8, 7, 4, 1, 4)), class = "data.frame", row.names = c(NA, -12L
))

You can sort each column individually:
library(dplyr)
base %>% mutate(across(everything(), ~ sort(.x, na.last = TRUE)))
# A B C
#1 2 1 1
#2 3 3 2
#3 4 4 4
#4 5 4 4
#5 6 5 4
#6 9 9 5
#7 9 NA 6
#8 NA NA 7
Or in base R:
base[] <- lapply(base, sort, na.last = TRUE)
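Both versions sort in ascending order. Since the title asks for descending order, a small variation is to pass decreasing = TRUE to sort as well. The sketch below rebuilds the dummy data from the question and stores the result in sorted_desc, a name chosen here for illustration.

```r
# Dummy data from the question
base <- data.frame(
  A = c(9, 9, 5, 4, 6, 3, 2, NA),
  B = c(9, 5, 3, 4, 1, 4, NA, NA),
  C = c(1, 4, 5, 6, 7, 4, 2, 4)
)

# Sort every column independently, largest first, keeping NAs at the bottom
sorted_desc <- base
sorted_desc[] <- lapply(base, sort, decreasing = TRUE, na.last = TRUE)

print(sorted_desc)
```

Assigning into sorted_desc[] keeps the data frame structure, and na.last = TRUE keeps each column at its original length so the columns still line up.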

Can I create many categories of one variable based in two other conditions in r? [duplicate]

I am doing a statistical analysis on a large data frame (more than 48,000,000 rows) in R. Here is an example of the data:
structure(list(herd = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3), cows = c(1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 13, 14, 15, 16), `date` = c("11/03/2013",
"12/03/2013", "13/03/2013", "14/03/2013", "15/03/2013", "16/03/2013",
"13/05/2012", "14/05/2012", "15/05/2012", "16/05/2012", "17/05/2012",
"18/05/2012", "10/07/2016", "11/07/2016", "12/07/2016", "13/07/2016",
"11/03/2013", "12/03/2013", "13/03/2013", "14/03/2013", "15/03/2013",
"16/03/2013", "13/05/2012", "14/05/2012", "15/05/2012", "16/05/2012",
"17/05/2012", "18/05/2012", "10/07/2016", "11/07/2016", "12/07/2016",
"13/07/2016", "11/03/2013", "12/03/2013", "13/03/2013", "14/03/2013",
"15/03/2013", "16/03/2013", "13/05/2012", "14/05/2012", "15/05/2012",
"16/05/2012", "17/05/2012", "18/05/2012", "10/07/2016", "11/07/2016",
"12/07/2016", "13/07/2016"), glicose = c(240666, 23457789, 45688688,
679, 76564, 6574553, 78654, 546432, 76455643, 6876, 7645432,
876875, 98654, 453437, 98676, 9887554, 76543, 9775643, 986545,
240666, 23457789, 45688688, 679, 76564, 6574553, 78654, 546432,
76455643, 6876, 7645432, 876875, 98654, 453437, 98676, 9887554,
76543, 9775643, 986545, 240666, 23457789, 45688688, 679, 76564,
6574553, 78654, 546432, 76455643, 6876)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -48L))
I need to count how many cows fall into each of the following glicose categories, by herd and by date:
<=100000
>100000 and <=150000
>150000 and <=200000
>200000 and <=250000
>250000 and <=400000
>400000
I tried the functions filter() and select(), but could not categorize the variable like that.
I also tried to make a vector for each category, but it did not work:
ht <- df %>% group_by(herd, date) %>%
filter(glicose < 100000)
Actually, I do not have a clue how I could do this. Please help!
I expect to get the number of cows in each category of each herd based on each date in a table like this:

Calling your data df,
library(dplyr)

df %>%
  mutate(glicose_group = cut(glicose, breaks = c(0, seq(1e5, 2.5e5, by = 0.5e5), 4e5, Inf)),
         date = as.Date(date, format = "%d/%m/%Y")) %>%
  group_by(herd, date, glicose_group) %>%
  count()
# # A tibble: 48 x 4
# # Groups: herd, date, glicose_group [48]
# herd date glicose_group n
# <dbl> <date> <fct> <int>
# 1 1 2012-05-13 (0,1e+05] 1
# 2 1 2012-05-14 (4e+05,Inf] 1
# 3 1 2012-05-15 (4e+05,Inf] 1
# 4 1 2012-05-16 (0,1e+05] 1
# 5 1 2012-05-17 (4e+05,Inf] 1
# 6 1 2012-05-18 (4e+05,Inf] 1
# 7 1 2013-03-11 (2e+05,2.5e+05] 1
# 8 1 2013-03-12 (4e+05,Inf] 1
# 9 1 2013-03-13 (4e+05,Inf] 1
# 10 1 2013-03-14 (0,1e+05] 1
# # ... with 38 more rows
I also threw in a conversion to Date class, which is probably a good idea.
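To make the binning step easier to follow on its own, here is a small base R sketch of cut() with the same breaks. The sample values and the bin_labels vector are made up for illustration. By default the intervals are closed on the right, which matches the "<=" boundaries in the question.

```r
# A few hypothetical glicose values, including one exactly on a boundary
glicose <- c(679, 100000, 100001, 240666, 300000, 23457789)

# Same breaks as above: 0, 100000, 150000, 200000, 250000, 400000, Inf
breaks <- c(0, seq(1e5, 2.5e5, by = 0.5e5), 4e5, Inf)

# Readable labels, one per interval (illustrative names)
bin_labels <- c("<=100000", "100000-150000", "150000-200000",
                "200000-250000", "250000-400000", ">400000")

glicose_group <- cut(glicose, breaks = breaks, labels = bin_labels)
print(table(glicose_group))
```

Note that a value of exactly 0 would fall outside the first interval and become NA; adding include.lowest = TRUE to the cut() call would cover that case.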

When mutate_all and lapply disagree ... How to replace lapply with mutate_all

I'm here again to ask for your help!
I'm trying to figure out what's happening with mutate_all (or with me...).
Let's say I have this dataset:
ds <- structure(list(Q1 = structure(c(5, 4, 5, 5, 5, 5, 5, 5, 5, 5,
5, 4, 3, 5, 5, 5, 5, 5, 1, 4, 5, 5, 3, 4, 5, 5, 5, 5, 5, 2, 5,
5, 4, 5, 5, 3, 5, 5, 4, 3, 3, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4,
5, 4), label = "1 Para mim é igual se os meus amigos são heterossexuais ou homossexuais.", format.spss = "F1.0", display_width = 3L, class = "labelled", labels = c(`discordo totalmente` = 1,
discordo = 2, indiferente = 3, concordo = 4, `concordo totalmente` = 5
)), Q2 = structure(c(1, 1, 1, 1, 1, 1, 3, 1, 2, 3, 1, 4, 4, 4,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 1, 1, 1, 1, 1, 1, 1, 1, 3, 2,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1, 2), label = "A homossexualidade é uma perturbação psicológica/biológica.", format.spss = "F1.0", display_width = 5L, class = "labelled", labels = c(`discordo totalmente` = 1,
discordo = 2, indiferente = 3, concordo = 4, `concordo totalmente` = 5
)), Q3 = structure(c(5, 2, 5, 4, 5, 4, 5, 5, 5, 4, 5, 5, 2, 3,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4, 5, 4, 5, 4), label = "Acredito que os pais e as mães homossexuais são tão competentes como os pais e mães heterossexuais.", format.spss = "F1.0", display_width = 5L, class = "labelled", labels = c(`discordo totalmente` = 1,
discordo = 2, indiferente = 3, concordo = 4, `concordo totalmente` = 5
)), Q4 = structure(c(1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 2,
1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 5, 1, 1, 2, 1, 3), label = "4 Todas as Lésbicas, Gays, Bissexuais, Transexuais, Transgêneros e Intersexuais (LGBTI) me deixam irritado.", format.spss = "F1.0", display_width = 4L, class = "labelled", labels = c(`discordo totalmente` = 1,
discordo = 2, indiferente = 3, concordo = 4, `concordo totalmente` = 5
)), Q5 = structure(c(1, 4, 1, 1, 1, 1, 3, 1, 2, 1, 1, 1, 3, 3,
1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 3, 2,
1, 1, 1, 2, 2, 5, 1, 4, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 3), label = "A legalização do casamento entre pessoas do mesmo sexo é muito errada.", format.spss = "F1.0", display_width = 5L, class = "labelled", labels = c(`discordo totalmente` = 1,
discordo = 2, indiferente = 3, concordo = 4, `concordo totalmente` = 5
))), row.names = c(NA, -54L), class = c("tbl_df", "tbl", "data.frame"
))
Then I need to transform all variables into factors to plot them. I really like the dplyr approach:
ds_mutate <- ds %>% mutate_all(., factor, levels=1:5)
likert(ds_mutate)
But this error is coming up:
Error in likert(ds_mutate) :
All items (columns) must have the same number of levels
When I use lapply (nobody will convince me the 'apply' functions are intuitive...), it works pretty well:
> ds_apply <- lapply(ds, factor, levels=1:5) %>% as.data.frame()
> likert(ds_apply)
Item 1 2 3 4 5
1 Q1 1.851852 1.851852 9.259259 14.814815 72.222222
2 Q2 77.777778 9.259259 5.555556 7.407407 0.000000
3 Q3 0.000000 3.703704 1.851852 14.814815 79.629630
4 Q4 79.629630 14.814815 3.703704 0.000000 1.851852
5 Q5 72.222222 7.407407 14.814815 3.703704 1.851852
But as far as I can see, the str output of the two objects is the same...
I'm looking forward to hearing from you!!
Thank you!

There is one difference:
class(ds_mutate)
# [1] "tbl_df" "tbl" "data.frame"
class(ds_apply)
# [1] "data.frame"
The issue then arises from the fact that, in the call of likert, we have
nlevels = length(levels(items[, 1]))
where, in the former case,
length(levels(ds_mutate[, 1]))
# [1] 0
since
ds_mutate[, 1]
# A tibble: 54 x 1
# Q1
# <fct>
# 1 5
# 2 4
# 3 5
# 4 5
# 5 5
# 6 5
# 7 5
# 8 5
# 9 5
# 10 5
# … with 44 more rows
i.e., the result is a tibble. Also,
methods("levels")
# [1] levels.default
so that there is no levels method for tibbles. Notice also that
class(ds_mutate) <- c("data.frame", "tbl_df", "tbl")
ds_mutate[, 1]
# [1] 5 4 5 5 5 5 5 5 5 5 5 4 3 5 5 5 5 5 1 4 5 5 3 4 5 5 5 5 5 2 5 5 4 5 5 3 5 5 4 3 3 5 5 5
# [45] 5 5 5 5 5 5 5 4 5 4
# Levels: 1 2 3 4 5
in which case
likert(ds_mutate)
starts to work too. Without modifying classes you may also use
likert(data.frame(ds_mutate))
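The same effect can be reproduced without tibble or likert. In the sketch below, a plain data frame indexed with drop = FALSE stands in for the tibble case, since a tibble never drops to a vector when you select a single column. The data frame df is a made-up three-row example.

```r
# Made-up example: one factor column with five declared levels
df <- data.frame(Q1 = factor(c(5, 4, 5), levels = 1:5))

# Stays a one-column data frame, like ds_mutate[, 1] on a tibble
one_col_df <- df[, 1, drop = FALSE]

# A plain data frame drops to the factor itself
one_col_vec <- df[, 1]

# levels() reads the "levels" attribute, which a data frame does not have
print(length(levels(one_col_df)))   # 0
print(length(levels(one_col_vec)))  # 5
```

This is why the level count computed inside likert comes out as 0 for the tibble and 5 for the plain data frame.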
Extra: lapply in
lapply(ds, factor, levels = 1:5)
actually is really intuitive once we understand one thing: a data frame is a special case of a list where each list element is of the same length. Now, the way sapply or lapply works is that it goes over each element of its first argument: once we see ds as a data frame whose elements (since it's a list) are columns, it becomes clear how it operates. For the same reason, since the results of factor in this case are all of the same length, the list resulting from the call to lapply can be converted neatly to a data frame.
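A short sketch of that point, using a made-up two-column data frame (ds_small) rather than the survey data:

```r
# Made-up data: two numeric columns coded 1 to 5
ds_small <- data.frame(Q1 = c(5, 4, 1), Q2 = c(1, 3, 2))

# A data frame is a list whose elements are its columns
print(is.list(ds_small))

# lapply visits each column in turn and returns a plain list of factors
as_factors <- lapply(ds_small, factor, levels = 1:5)
print(names(as_factors))

# Equal-length elements convert back to a data frame without trouble
ds_factors <- as.data.frame(as_factors)
print(sapply(ds_factors, nlevels))
```

Every column ends up with the same five levels, which is the condition likert checks for.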

I have never used the likert package, but it looks like it doesn't accept an object of class tibble. This works for me:
likert(as.data.frame(ds_mutate))

Reorganize a List with Lists in a new list in R

This is my list:
mylist=list(list(a = c(2, 3, 4, 5), b = c(3, 4, 5, 5), c = c(3, 7, 5,
5), d = c(3, 4, 9, 5), e = c(3, 4, 5, 9), f = c(3, 4, 1, 9),
g = c(3, 1, 5, 9), h = c(3, 3, 5, 9), i = c(3, 17, 3, 9),
j = c(3, 17, 3, 9)), list(a = c(2, 5, 48, 4), b = c(7, 4,
5, 5), c = c(3, 7, 35, 5), d = c(3, 843, 9, 5), e = c(3, 43,
5, 9), f = c(3, 4, 31, 39), g = c(3, 1, 5, 9), h = c(3, 3, 5,
9), i = c(3, 17, 3, 9), j = c(3, 17, 3, 9)), list(a = c(2, 3,
4, 35), b = c(3, 34, 5, 5), c = c(3, 37, 5, 5), d = c(38, 4,
39, 5), e = c(3, 34, 5, 9), f = c(33, 4, 1, 9), g = c(3, 1, 5,
9), h = c(3, 3, 35, 9), i = c(3, 17, 33, 9), j = c(3, 137, 3,
9)), list(a = c(23, 3, 4, 85), b = c(3, 4, 53, 5), c = c(3, 7,
5, 5), d = c(3, 4, 9, 5), e = c(3, 4, 5, 9), f = c(3, 34, 1,
9), g = c(38, 1, 5, 9), h = c(3, 3, 5, 9), i = c(3, 137, 3, 9
), j = c(3, 17, 3, 9)), list(a = c(2, 3, 48, 5), b = c(3, 4,
5, 53), c = c(3, 73, 53, 5), d = c(3, 43, 9, 5), e = c(33, 4,
5, 9), f = c(33, 4, 13, 9), g = c(3, 81, 5, 9), h = c(3, 3, 5,
9), i = c(3, 137, 3, 9), j = c(3, 173, 3, 9)))
As you can see, my list has 5 entries. Each entry has 10 further entries, each holding 4 elements.
> mylist[[4]][[1]]
[1] 23 3 4 85
I want to create another list with only one entry.
I want to put all entries of type mylist[[i]][[1]] in the first position of a new list: mynewlist[[1]][[1]] will be filled by the mylist[[1]][[1]], mylist[[2]][[1]], mylist[[3]][[1]], mylist[[4]][[1]], mylist[[5]][[1]] elements.
The second position of mynewlist (mynewlist[[2]][[1]]) will be the mylist[[1]][[2]], mylist[[2]][[2]], mylist[[3]][[2]], mylist[[4]][[2]], mylist[[5]][[2]] elements.
And so on, until:
The fifth position of mynewlist (mynewlist[[5]][[1]]) will be the mylist[[1]][[5]], mylist[[2]][[5]], mylist[[3]][[5]], mylist[[4]][[5]], mylist[[5]][[5]] elements.
In other words, I want to put every mylist[[i]][[1]]$a in the mynewlist[[1]][[1]] position, every mylist[[i]][[1]]$b in the mynewlist[[1]][[2]] position, and so on until mylist[[i]][[1]]$j in mynewlist[[1]][[10]].
This should be my output for the first position of mynewlist:
#[[1]]
#[1] 2 3 4 5
2 5 48 4
2 3 4 35
23 3 4 85
2 3 48 5
Any help?

We can use transpose
library(dplyr)
out <- mylist %>%
purrr::transpose(.)
out[[1]]
#[[1]]
#[1] 2 3 4 5
#[[2]]
#[1] 2 5 48 4
#[[3]]
#[1] 2 3 4 35
#[[4]]
#[1] 23 3 4 85
#[[5]]
#[1] 2 3 48 5
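If you would rather avoid the purrr dependency, the same reshaping can be sketched in base R with two nested lapply calls. The list nested below is a shortened, made-up version of mylist, and the sketch assumes every inner list uses the same names.

```r
# Shortened stand-in for mylist: 3 entries, each with elements a and b
nested <- list(
  list(a = c(2, 3), b = c(3, 4)),
  list(a = c(2, 5), b = c(7, 4)),
  list(a = c(23, 3), b = c(3, 53))
)

# For each inner name, collect that element from every outer entry
inner_names <- names(nested[[1]])
flipped <- lapply(inner_names, function(nm) lapply(nested, `[[`, nm))
names(flipped) <- inner_names

# All the "a" vectors, one per original entry
print(flipped$a)
```

Here flipped$a holds the a vectors from all three entries, which is the layout the question asks for in the first position of mynewlist.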
