How to count the number of response values by time thresholds in R

I have a student dataset that includes responses to questions scored as right or wrong. There is also a time variable in seconds. I would like to create time flags that record the number of correct and incorrect responses at the 1-minute, 2-minute, and 3-minute thresholds. Here is a sample dataset:
df <- data.frame(id = c(1,2,3,4,5),
                 gender = c("m","f","m","f","m"),
                 age = c(11,12,12,13,14),
                 i1 = c(1,0,NA,1,0),
                 i2 = c(0,1,0,"1]",1),
                 i3 = c("1]",1,"1]",0,"0]"),
                 i4 = c(0,"0]",1,1,0),
                 i5 = c(1,1,NA,"0]","1]"),
                 i6 = c(0,0,"0]",1,1),
                 i7 = c(1,"1]",1,0,0),
                 i8 = c(0,0,0,"1]","1]"),
                 i9 = c(1,1,1,0,NA),
                 time = c(115,138,148,195,225))
> df
id gender age i1 i2 i3 i4 i5 i6 i7 i8 i9 time
1 1 m 11 1 0 1] 0 1 0 1 0 1 115
2 2 f 12 0 1 1 0] 1 0 1] 0 1 138
3 3 m 12 NA 0 1] 1 <NA> 0] 1 0 1 148
4 4 f 13 1 1] 0 1 0] 1 0 1] 0 195
5 5 m 14 0 1 0] 0 1] 1 0 1] NA 225
The minute thresholds are marked by a ] sign on the right side of a score.
For example, for id = 3 the 1-minute threshold falls at item i3 and the 2-minute threshold at item i6. Each student can have different time thresholds.
I need to create flag variables that count the number of correct and incorrect responses within the 1-min, 2-min, and 3-min thresholds.
How can I achieve the desired dataset below?
> df1
id gender age i1 i2 i3 i4 i5 i6 i7 i8 i9 time one_true one_false two_true two_false three_true three_false
1 1 m 11 1 0 1] 0 1 0 1 0 1 115 2 1 NA NA NA NA
2 2 f 12 0 1 1 0] 1 0 1] 0 1 138 2 2 4 3 NA NA
3 3 m 12 NA 0 1] 1 <NA> 0] 1 0 1 148 1 1 2 2 NA NA
4 4 f 13 1 1] 0 1 0] 1 0 1] 0 195 2 0 3 2 5 3
5 5 m 14 0 1 0] 0 1] 1 0 1] NA 225 1 2 2 3 4 4

library(tidyverse)
df %>%
  pivot_longer(i1:i9, values_transform = as.character) %>%
  group_by(id) %>%
  # number the items backwards from the last "]" marker, so items after the
  # final threshold get vs = 0 and can be dropped
  mutate(vs = rev(cumsum(replace_na(str_detect(rev(value), ']'), 0)))) %>%
  filter(vs > 0) %>%
  # flip the numbering so vs = 1 is the 1-minute segment, vs = 2 the 2-minute one, ...
  mutate(vs = max(vs) - vs + 1) %>%
  group_by(vs, .add = TRUE) %>%
  summarise(true = sum(str_detect(value, '1'), na.rm = TRUE),
            false = sum(str_detect(value, '0'), na.rm = TRUE),
            .groups = "drop_last") %>%
  # running totals: each threshold's counts include all earlier segments
  mutate(across(c(true, false), cumsum)) %>%
  pivot_wider(id_cols = id, names_from = vs, values_from = c(true, false))
# A tibble: 5 x 7
# Groups: id [5]
id true_1 true_2 true_3 false_1 false_2 false_3
<dbl> <int> <int> <int> <int> <int> <int>
1 1 2 NA NA 1 NA NA
2 2 2 4 NA 2 3 NA
3 3 1 2 NA 1 2 NA
4 4 2 3 5 0 2 3
5 5 1 2 4 2 3 4
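To see what the vs bookkeeping does, here is a minimal trace for id = 3 (values copied from df above; tidyverse loaded as before):

value <- c(NA, "0", "1]", "1", NA, "0]", "1", "0", "1")
rev(cumsum(replace_na(str_detect(rev(value), ']'), 0)))
# [1] 2 2 2 1 1 1 0 0 0

Items i7:i9 come after the last "]" marker, get vs = 0, and are dropped by the filter; after flipping the numbering, i1:i3 form the 1-minute segment and i4:i6 the 2-minute segment.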

You could also accomplish the same in base R:
fun <- function(x){
  # lengths of the segments delimited by each "]" marker
  a <- diff(c(0, which(grepl("]", x))))
  f_sum <- function(x, y) sum(na.omit(grepl(x, y)))
  fn <- function(x) c(true = f_sum('1', x), false = f_sum('0', x))
  # per-segment counts, then cumulative sums across segments
  y <- tapply(x[seq(sum(a))], rep(seq_along(a), a), fn)
  s <- do.call(rbind, Reduce("+", y, accumulate = TRUE))
  nms <- do.call(paste, c(sep = '_', expand.grid(colnames(s), seq(nrow(s)))))
  setNames(c(t(s)), nms)
}
fun2 <- function(x){
  # pad the shorter result vectors with NA so the rows can be bound together
  ln <- lengths(x)
  nms <- names(x[[which.max(ln)]])
  do.call(rbind, lapply(x, function(x) setNames(`length<-`(x, max(ln)), nms)))
}
fun2(apply(df[4:12], 1, fun))
true_1 false_1 true_2 false_2 true_3 false_3
[1,] 2 1 NA NA NA NA
[2,] 2 2 4 3 NA NA
[3,] 1 1 2 2 NA NA
[4,] 2 0 3 2 5 3
[5,] 1 2 2 3 4 4
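To arrive at the desired df1 layout, the counts can simply be bound back onto the original data (the column names will be the matrix's, e.g. true_1 rather than one_true):

cbind(df, fun2(apply(df[4:12], 1, fun)))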

Related

Drop columns when there is a lot of missingness in R

I am trying to drop columns that have fewer than 5 valid values. Here is an example dataset:
df <- data.frame(id = c(1,2,3,4,5,6,7,8,9,10),
                 i1 = c(0,1,1,1,1,0,0,1,NA,1),
                 i2 = c(1,0,0,1,0,1,1,0,0,NA),
                 i3 = c(NA,NA,NA,NA,NA,NA,NA,NA,NA,0),
                 i4 = c(NA,1,NA,NA,NA,NA,NA,NA,1,NA))
> df
id i1 i2 i3 i4
1 1 0 1 NA NA
2 2 1 0 NA 1
3 3 1 0 NA NA
4 4 1 1 NA NA
5 5 1 0 NA NA
6 6 0 1 NA NA
7 7 0 1 NA NA
8 8 1 0 NA NA
9 9 NA 0 NA 1
10 10 1 NA 0 NA
In this case, columns i3 and i4 need to be dropped from the data frame.
How can I get the desired dataset below?
> df
id i1 i2
1 1 0 1
2 2 1 0
3 3 1 0
4 4 1 1
5 5 1 0
6 6 0 1
7 7 0 1
8 8 1 0
9 9 NA 0
10 10 1 NA
You can keep columns with at least 5 non-missing values with:
df[colSums(!is.na(df)) >= 5]
You can use discard from the purrr package:
library(purrr)
df <- data.frame(id = c(1,2,3,4,5,6,7,8,9,10),
                 i1 = c(0,1,1,1,1,0,0,1,NA,1),
                 i2 = c(1,0,0,1,0,1,1,0,0,NA),
                 i3 = c(NA,NA,NA,NA,NA,NA,NA,NA,NA,0),
                 i4 = c(NA,1,NA,NA,NA,NA,NA,NA,1,NA))
df %>%
  discard(~ sum(!is.na(.)) < 5)
#> id i1 i2
#> 1 1 0 1
#> 2 2 1 0
#> 3 3 1 0
#> 4 4 1 1
#> 5 5 1 0
#> 6 6 0 1
#> 7 7 0 1
#> 8 8 1 0
#> 9 9 NA 0
#> 10 10 1 NA
Created on 2022-11-10 with reprex v2.0.2
While this is likely slower than base R methods (for datasets with very many columns, say > 1000), I generally find the readability of the code far superior. In addition, it is easy to write more complicated statements, as sketched below.
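For example, a sketch of a richer predicate (the second criterion here is hypothetical, purely to illustrate): drop columns that are mostly missing or that carry only a single constant value:

df %>%
  discard(~ sum(!is.na(.)) < 5 || length(unique(na.omit(.))) <= 1)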
Using base R, another approach:
> df[, sapply(df, function(x) sum(!is.na(x))) >= 5]
id i1 i2
1 1 0 1
2 2 1 0
3 3 1 0
4 4 1 1
5 5 1 0
6 6 0 1
7 7 0 1
8 8 1 0
9 9 NA 0
10 10 1 NA
A performance comparison of the different answers given in this post:
library(tidyverse)

funs = list(
  colSums = function(df){df[colSums(!is.na(df)) >= nrow/10]},
  sapply = function(df){df[, sapply(df, function(x) sum(!is.na(x))) >= nrow/10]},
  discard = function(df){df %>% discard(~ sum(!is.na(.)) < nrow/10)},
  mutate = function(df){df %>% mutate(across(where(~ sum(!is.na(.)) < nrow/10), ~ NULL))},
  select = function(df){df %>% select(where(~ sum(!is.na(.)) >= nrow/10))})

ncol = 10000  # note: these shadow base::ncol/nrow, but only within this script
nrow = 100
df = replicate(ncol, sample(c(1:9, NA), nrow, TRUE)) %>%
  as_tibble(.name_repair = "unique")

avrtime = map_dbl(funs, function(f){
  duration = c()
  for(i in 1:10){
    t1 = Sys.time()
    f(df)
    t2 = Sys.time()
    duration[i] = as.numeric(t2 - t1)
  }
  return(mean(duration))})

avrtime[order(avrtime)]
The average time taken by each (in seconds):
colSums sapply discard select mutate
0.04510500 0.04692972 0.29207475 0.29451160 0.31755514
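As an aside, hand-rolled Sys.time() loops are fairly coarse; a sketch of the same comparison with the microbenchmark package (assuming the funs list and df from above) would be:

library(microbenchmark)
microbenchmark(colSums = funs$colSums(df),
               discard = funs$discard(df),
               times = 10L)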
Using select
library(dplyr)
df %>%
  select(where(~ sum(complete.cases(.x)) >= 5))
-output
id i1 i2
1 1 0 1
2 2 1 0
3 3 1 0
4 4 1 1
5 5 1 0
6 6 0 1
7 7 0 1
8 8 1 0
9 9 NA 0
10 10 1 NA
Or in base R
Filter(\(x) sum(complete.cases(x)) >= 5, df)

R: data.table converts to logical in conditional by group operation (in seemingly random way)

I have the following problem:
library(data.table)

test <- data.table(v = ceiling(runif(20, 0, 5)), g = ceiling(runif(20, 0, 2)))
setorder(test, g)
test[, (paste0("n", 1:5)) := lapply(1:5, function(x) sum(v == x)), by = g]
test[, (paste0("foo", 1:3)) := lapply(1:3, function(x){
  ifelse(get(paste0("n", x + 1)) != 0,
         get(paste0("n", x))/get(paste0("n", x + 1)), NA)}), by = g]
test
If you run this code several times, then from time to time one of the "foo" variables is converted to logical, which makes very little sense.
Thank you for your help!
The reason is the use of NA: plain NA is a logical constant, so if the condition leaves a group with only NA, the column becomes logical; otherwise the NA is coerced to the type of the other values. This can be resolved by using the NA_real_ constant, as mentioned in ?NA:
NA is a logical constant of length 1 which contains a missing value indicator. NA can be coerced to any other vector type except raw. There are also constants NA_integer_, NA_real_, NA_complex_ and NA_character_ of the other atomic vector types which support missing values: all of these are reserved words in the R language.
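A quick way to see the coercion at work:

class(ifelse(c(FALSE, FALSE), 1.5, NA))       # "logical" -- no TRUE case, NA stays logical
class(ifelse(c(TRUE, FALSE), 1.5, NA))        # "numeric" -- the TRUE branch forces coercion
class(ifelse(c(FALSE, FALSE), 1.5, NA_real_)) # "numeric" -- the typed NA fixes it

Applied to the example: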
test[, (paste0("foo", 1:3)) :=
lapply(1:3, function(x){
ifelse(get(paste0("n", x + 1)) != 0,
get(paste0("n", x))/get(paste0("n", x + 1)), NA_real_)}), by = g]
Instead of using ifelse and specifying the correct NA for the column type, an option is to use case_when (from dplyr) or data.table::fcase, which by default return an NA of the appropriate type:
test[, paste0("foo", 1:3) := lapply(1:3,
function(x) fcase(.SD[[paste0("n", x + 1)]] !=0,
.SD[[paste0("n", x)]]/.SD[[paste0("n", x + 1)]])), by = g]
-testing
lst1 <- replicate(10, {
  test <- data.table(v = ceiling(runif(20, 0, 5)),
                     g = ceiling(runif(20, 0, 2)))
  setorder(test, g)
  test[, (paste0("n", 1:5)) := lapply(1:5, function(x) sum(v == x)), by = g]
  test[, paste0("foo", 1:3) := lapply(1:3,
         function(x) fcase(.SD[[paste0("n", x + 1)]] != 0,
                           .SD[[paste0("n", x)]]/.SD[[paste0("n", x + 1)]])), by = g]
}, simplify = FALSE)
-checking the one element with only NA
> lst1[[9]]
v g n1 n2 n3 n4 n5 foo1 foo2 foo3
<num> <num> <int> <int> <int> <int> <int> <num> <num> <num>
1: 4 1 3 1 0 2 4 3.00 NA 0
2: 5 1 3 1 0 2 4 3.00 NA 0
3: 1 1 3 1 0 2 4 3.00 NA 0
4: 4 1 3 1 0 2 4 3.00 NA 0
5: 5 1 3 1 0 2 4 3.00 NA 0
6: 1 1 3 1 0 2 4 3.00 NA 0
7: 5 1 3 1 0 2 4 3.00 NA 0
8: 2 1 3 1 0 2 4 3.00 NA 0
9: 1 1 3 1 0 2 4 3.00 NA 0
10: 5 1 3 1 0 2 4 3.00 NA 0
11: 2 2 1 4 0 1 4 0.25 NA 0
12: 1 2 1 4 0 1 4 0.25 NA 0
13: 2 2 1 4 0 1 4 0.25 NA 0
14: 5 2 1 4 0 1 4 0.25 NA 0
15: 5 2 1 4 0 1 4 0.25 NA 0
16: 2 2 1 4 0 1 4 0.25 NA 0
17: 5 2 1 4 0 1 4 0.25 NA 0
18: 4 2 1 4 0 1 4 0.25 NA 0
19: 2 2 1 4 0 1 4 0.25 NA 0
20: 5 2 1 4 0 1 4 0.25 NA 0
v g n1 n2 n3 n4 n5 foo1 foo2 foo3
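A quick check (a sketch) that the foo columns stayed numeric even in this run where foo2 is all NA:

sapply(lst1[[9]][, .(foo1, foo2, foo3)], class)
#      foo1      foo2      foo3
# "numeric" "numeric" "numeric"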

Repeating loop and adding columns in R

I am trying to build R code that will run my loop 20 times, each time adding a column to the existing data frame. Here I tried it by copying the code 3 times, but I feel like there must be an easier way to automate this. I am very grateful for any help.
My original data file (called "igel") contains two columns ("Year" and "Grid") and 1096 rows. In the loop I pick a random number from the column "Grid" and check whether it has been picked before. If so, it adds 0 to a new column; if not, it adds 1.
Here is the code:
library(dplyr)  # for add_row() and %>%

a <- data.frame(matrix(ncol = 2, nrow = 0))
x <- c("number", "count")
colnames(a) <- x
for (i in 1:1096) {
  num_i <- sample(igel$Grid, 1)
  count_i <- c(if (num_i %in% a$number == TRUE) {0} else {1})
  a <- a %>% add_row(number = num_i, count = count_i)
}

b <- data.frame(matrix(ncol = 2, nrow = 0))
x <- c("number", "count")
colnames(b) <- x
for (i in 1:1096) {
  num_i <- sample(igel$Grid, 1)
  count_i <- c(if (num_i %in% b$number == TRUE) {0} else {1})
  b <- b %>% add_row(number = num_i, count = count_i)
}

c <- data.frame(matrix(ncol = 2, nrow = 0))
x <- c("number", "count")
colnames(c) <- x
for (i in 1:1096) {
  num_i <- sample(igel$Grid, 1)
  count_i <- c(if (num_i %in% c$number == TRUE) {0} else {1})
  c <- c %>% add_row(number = num_i, count = count_i)
}

df.total <- cbind(a$count, b$count, c$count)
Consider sapply, and even its wrapper replicate, and calculate number and count separately with vectorized operations instead of growing an object row by row inside a loop.
# RUNS 3 SAMPLES OF igel$Grid 1,096 TIMES (ADJUST 3 TO ANY POSITIVE INT LIKE 20)
grid_number <- data.frame(replicate(3, replicate(1096, sample(igel$Grid, 1))))

# RUNS ACROSS 3 COLUMNS TO CHECK IF CURRENT ROW VALUE APPEARED IN ANY EARLIER ROW
# (seq_len(i-1) rather than 1:(i-1) so the first row compares against nothing)
grid_count <- sapply(grid_number, function(col)
  sapply(seq_along(col), function(i)
    ifelse(col[i] %in% col[seq_len(i-1)], 0, 1)
  )
)
While the above does not exactly reproduce your output, df.total (a matrix, not a data frame), due to the random sampling within iterations, the two objects maintain a similar structure:
dim(df.total)
# [1] 1096 3
dim(grid_count)
# [1] 1096 3
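If a single combined object is wanted, one sketch for stitching the number/count pairs together (assuming grid_number and grid_count from above):

df.total <- do.call(cbind, Map(function(n, k) data.frame(number = n, count = k),
                               grid_number, as.data.frame(grid_count)))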
Try to avoid iterating through rows. It is rarely necessary, if ever. Here is one approach (replace n with 1096 and elem with igel$Grid):
n = 20
elem = 1:5
df.total = list()
for (i in 1:5) {
  a = data.frame(number = sample(elem, n, replace = TRUE))
  # duplicated() flags repeats with 1; use !duplicated(a$number) if you want
  # the question's convention of flagging the *first* pick with 1 instead
  a$count = as.numeric(duplicated(a$number))
  df.total[[i]] = a
}
df.total = as.data.frame(df.total)
df.total
## number count number.1 count.1 number.2 count.2 number.3 count.3 number.4 count.4
## 1 4 0 2 0 5 0 4 0 1 0
## 2 3 0 5 0 3 0 4 1 3 0
## 3 5 0 3 0 4 0 2 0 4 0
## 4 5 1 1 0 2 0 5 0 3 1
## 5 2 0 4 0 2 1 5 1 5 0
## 6 4 1 2 1 2 1 5 1 5 1
## 7 5 1 1 1 3 1 2 1 4 1
## 8 5 1 2 1 5 1 5 1 4 1
## 9 2 1 1 1 1 0 1 0 1 1
## 10 3 1 1 1 5 1 4 1 1 1
## 11 5 1 3 1 1 1 3 0 5 1
## 12 2 1 1 1 2 1 5 1 1 1
## 13 3 1 5 1 4 1 5 1 4 1
## 14 1 0 4 1 2 1 4 1 1 1
## 15 4 1 4 1 2 1 5 1 1 1
## 16 4 1 2 1 5 1 2 1 5 1
## 17 3 1 1 1 1 1 3 1 2 0
## 18 2 1 2 1 2 1 2 1 2 1
## 19 2 1 3 1 1 1 2 1 1 1
## 20 1 1 3 1 2 1 1 1 3 1

New variable that indicates the first occurrence of a specific value

I want to create a new variable that indicates the first occurrence of a specific value of a variable.
In the following example dataset I want a new variable "firstna" that is 1 at the first observation of NA for each player.
game_data <- data.frame(player = c(1,1,1,1,2,2,2,2),
                        level = c(1,2,3,4,1,2,3,4),
                        points = c(20,NA,NA,NA,20,40,NA,NA))
game_data
player level points
1 1 1 20
2 1 2 NA
3 1 3 NA
4 1 4 NA
5 2 1 20
6 2 2 40
7 2 3 NA
8 2 4 NA
The resulting dataframe should look like this:
game_data_new <- data.frame(player = c(1,1,1,1,2,2,2,2),
                            level = c(1,2,3,4,1,2,3,4),
                            points = c(20,NA,NA,NA,20,40,NA,NA),
                            firstna = c(0,1,0,0,0,0,1,0))
game_data_new
player level points firstna
1 1 1 20 0
2 1 2 NA 1
3 1 3 NA 0
4 1 4 NA 0
5 2 1 20 0
6 2 2 40 0
7 2 3 NA 1
8 2 4 NA 0
To be honest, I don't know how to do this. It would be perfect if there were a dplyr option.
A base R solution:
ave(game_data$points, game_data$player,
FUN = function(x) seq_along(x) == match(NA, x, nomatch = 0))
Another ave option to find the first NA by group (player):
game_data$firstna <- ave(game_data$points, game_data$player,
FUN = function(x) cumsum(is.na(x)) == 1)
game_data
# player level points firstna
#1 1 1 20 0
#2 1 2 NA 1
#3 1 3 NA 0
#4 1 4 NA 0
#5 2 1 20 0
#6 2 2 40 0
#7 2 3 NA 1
#8 2 4 NA 0
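One subtlety worth knowing about this version: cumsum(is.na(x)) == 1 stays TRUE until a second NA appears, so a non-NA value that follows the first NA would be flagged as well. That cannot happen here (the NAs form a trailing block), but in general:

cumsum(is.na(c(20, NA, 5, NA))) == 1
# [1] FALSE  TRUE  TRUE FALSE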
Here is a solution with data.table:
library("data.table")
game_data <- data.table(player = c(1,1,1,1,2,2,2,2), level = c(1,2,3,4,1,2,3,4), points = c(20,NA,NA,NA,20,40,NA,NA))
game_data[, firstna := is.na(points) & !is.na(shift(points)), player][]
# player level points firstna
# 1: 1 1 20 FALSE
# 2: 1 2 NA TRUE
# 3: 1 3 NA FALSE
# 4: 1 4 NA FALSE
# 5: 2 1 20 FALSE
# 6: 2 2 40 FALSE
# 7: 2 3 NA TRUE
# 8: 2 4 NA FALSE
You can do this by grouping by player and then mutating to check whether a row has an NA value while the previous row doesn't:
game_data %>%
  group_by(player) %>%
  mutate(firstna = ifelse(is.na(points) & lag(!is.na(points)), 1, 0)) %>%
  ungroup()
Result:
# A tibble: 8 x 4
player level points firstna
<dbl> <dbl> <dbl> <dbl>
1 1 1 20 0
2 1 2 NA 1
3 1 3 NA 0
4 1 4 NA 0
5 2 1 20 0
6 2 2 40 0
7 2 3 NA 1
8 2 4 NA 0
library(tidyverse)
library(data.table)
data.frame(
  player = c(1,1,1,1,2,2,2,2),
  level = c(1,2,3,4,1,2,3,4),
  points = c(20,NA,NA,NA,20,40,NA,NA)
) -> game_data

game_data_base1 <- game_data
game_data_dt <- data.table(game_data)

microbenchmark::microbenchmark(
  better_base = game_data$first_na <- ave(
    game_data$points,
    game_data$player,
    FUN = function(x) seq_along(x) == match(NA, x, nomatch = 0)
  ),
  brute_base = do.call(
    rbind.data.frame,
    lapply(
      split(game_data, game_data$player),
      function(x) {
        x$firstna <- 0
        na_loc <- which(is.na(x$points))
        if (length(na_loc) > 0) x$firstna[na_loc[1]] <- 1
        x
      }
    )
  ),
  tidy = game_data %>%
    group_by(player) %>%
    mutate(firstna = as.numeric(is.na(points) & !duplicated(points))) %>%
    ungroup(),
  dt = game_data_dt[, firstna := as.integer(is.na(points) & !is.na(shift(points))), player]
)
## Unit: microseconds
## expr min lq mean median uq max neval
## better_base 125.188 156.861 362.9829 191.6385 355.6675 3095.958 100
## brute_base 366.642 450.002 2782.6621 658.0380 1072.6475 174373.974 100
## tidy 998.924 1119.022 2528.3687 1509.0705 2516.9350 42406.778 100
## dt 330.428 421.211 1031.9978 535.8415 1042.1240 9671.991 100
game_data %>%
  group_by(player) %>%
  mutate(firstna = as.numeric(is.na(points) & !duplicated(points)))
Group by player, then flag rows that are both NA and not duplicates of any previous row; duplicated() treats NAs as equal, so only the first NA in each group passes.
# A tibble: 8 x 4
# Groups: player [2]
player level points firstna
<dbl> <dbl> <dbl> <dbl>
1 1 1 20 0
2 1 2 NA 1
3 1 3 NA 0
4 1 4 NA 0
5 2 1 20 0
6 2 2 40 0
7 2 3 NA 1
8 2 4 NA 0
If you want the 1s on the last non-NA line before an NA, replace the mutate line with this:
mutate(lastnonNA=as.numeric(!is.na(points) & is.na(lead(points))))
First row of a block of NAs that runs all the way to the end of the player's group:
game_data %>%
  group_by(player) %>%
  mutate(firstna = as.numeric(is.na(points) &
                                !duplicated(cbind(points, cumsum(!is.na(points))))))
Another way using base:
game_data$firstna <-
  unlist(
    tapply(game_data$points, game_data$player, function(x) {
      i <- which(is.na(x))[1]; x[] <- 0; x[i] <- 1; x
    })
  )
or as another ?ave clone:
ave(game_data$points, game_data$player, FUN = function(x) {
  i <- which(is.na(x))[1]; x[] <- 0; x[i] <- 1; x
})
An option using diff (see the caveat after the output below):
transform(game_data, firstna = ave(is.na(points), player, FUN = function(x) c(0,diff(x))))
# player level points firstna
# 1 1 1 20 0
# 2 1 2 NA 1
# 3 1 3 NA 0
# 4 1 4 NA 0
# 5 2 1 20 0
# 6 2 2 40 0
# 7 2 3 NA 1
# 8 2 4 NA 0
And its dplyr equivalent:
library(dplyr)
game_data %>% group_by(player) %>% mutate(firstna = c(0,diff(is.na(points))))
# # A tibble: 8 x 4
# # Groups: player [2]
# player level points firstna
# <dbl> <dbl> <dbl> <dbl>
# 1 1 1 20 0
# 2 1 2 NA 1
# 3 1 3 NA 0
# 4 1 4 NA 0
# 5 2 1 20 0
# 6 2 2 40 0
# 7 2 3 NA 1
# 8 2 4 NA 0
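A caveat for both diff() versions: if a player's very first value is NA, there is no 0-to-1 step for diff() to detect, so that NA goes unflagged:

c(0, diff(is.na(c(NA, 5))))
# [1]  0 -1

The match()-based ave() solution above handles that case.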

How to format describeBy table in R?

I have this data set:
Defects.I Defects.D Treatment
1 2 A
1 3 B
And I'm trying to produce descriptive statistics for defects detected and isolated, grouped by treatment.
After searching for a while I found a nice function in the psych library called describeBy().
With the following code:
describeBy(myData[1:2], myData$Treatment)
I got this output:
Treatment A
          Mean. Median. Trimmed.
Defects.I x     x       x
Defects.D x     x       x
Treatment B
          Mean. Median. Trimmed.
Defects.I x     x       x
Defects.D x     x       x
But in reality I was looking for something like:
          Mean.  Median. Trimmed.
          A B    A B     A B
Defects.I x x    x x     x x
Defects.D x x    x x     x x
Data
myData <- structure(list(Defects.I = c(1L, 1L), Defects.D = 2:3,
                         Treatment = c("A", "B")),
                    .Names = c("Defects.I", "Defects.D", "Treatment"),
                    class = "data.frame", row.names = c(NA, -2L))
Since describeBy returns a list of data frames, we could just cbind them all, but that doesn't give the right order. Instead we can interleave the columns.
myData <- structure(list(Defects.I = c(1L, 1L), Defects.D = 2:3,
                         Treatment = c("A", "B")),
                    .Names = c("Defects.I", "Defects.D", "Treatment"),
                    class = "data.frame", row.names = c(NA, -2L))
l <- psych::describeBy(myData[1:2], myData$Treatment)
So interleave using this order
order(sequence(c(ncol(l$A), ncol(l$B))))
# [1] 1 14 2 15 3 16 4 17 5 18 6 19 7 20 8 21 9 22 10 23 11 24 12 25 13 26
rather than what cbind alone would do
c(1:13, 1:13)
# [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 1 2 3 4 5 6 7 8 9 10 11 12 13
so this
do.call('cbind', l)[, order(sequence(lengths(l)))]
# A.vars B.vars A.n B.n A.mean B.mean A.sd B.sd A.median B.median A.trimmed B.trimmed A.mad B.mad
# Defects.I 1 1 1 1 1 1 NA NA 1 1 1 1 0 0
# Defects.D 2 2 1 1 2 3 NA NA 2 3 2 3 0 0
# A.min B.min A.max B.max A.range B.range A.skew B.skew A.kurtosis B.kurtosis A.se B.se
# Defects.I 1 1 1 1 0 0 NA NA NA NA NA NA
# Defects.D 2 3 2 3 0 0 NA NA NA NA NA NA
or as a function
interleave <- function(l, how = c('cbind', 'rbind')) {
  how <- match.arg(how)
  if (how %in% 'rbind')
    do.call(how, l)[order(sequence(sapply(l, nrow))), ]
  else
    do.call(how, l)[, order(sequence(sapply(l, ncol)))]
}
interleave(l)
# A.vars B.vars A.n B.n
# Defects.I 1 1 1 1
# Defects.D 2 2 1 1 ...
# ...
interleave(l, 'r')
# vars n mean sd median trimmed mad min max range skew kurtosis se
# A.Defects.I 1 1 1 NA 1 1 0 1 1 0 NA NA NA
# B.Defects.I 1 1 1 NA 1 1 0 1 1 0 NA NA NA
# A.Defects.D 2 1 2 NA 2 2 0 2 2 0 NA NA NA
# B.Defects.D 2 1 3 NA 3 3 0 3 3 0 NA NA NA
You can try the mat = TRUE argument. It's not exactly what you're looking for, but it's closer:
library(psych)
mydata = data.frame(Defects.I = c(1,1), Defects.D = c(2,3), Treatment = c('A','B'))
describeBy(mydata[1:2], mydata$Treatment, mat = TRUE)
gives
item group1 vars n mean sd median trimmed mad min max range skew kurtosis se
Defects.I1 1 A 1 1 1 NA 1 1 0 1 1 0 NA NA NA
Defects.I2 2 B 1 1 1 NA 1 1 0 1 1 0 NA NA NA
Defects.D1 3 A 2 1 2 NA 2 2 0 2 2 0 NA NA NA
Defects.D2 4 B 2 1 3 NA 3 3 0 3 3 0 NA NA NA
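From the mat = TRUE output it is a short step (a sketch using tidyr; the variable names have to be recovered from the row names) to the side-by-side layout asked for:

m <- describeBy(mydata[1:2], mydata$Treatment, mat = TRUE)
m$variable <- sub("\\d+$", "", rownames(m))
tidyr::pivot_wider(m[c("variable", "group1", "mean", "median", "trimmed")],
                   names_from = group1,
                   values_from = c(mean, median, trimmed))

This yields one row per variable with mean_A, mean_B, median_A, ... columns.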
