Why does do.call(rbind, ...) output only one row?

I have a situation like this:
set.seed(2014)
df<-data.frame(
group=rbinom(100,1,0.6),
y1=rbinom(100,1,0.3),
y2=rbinom(100,1,0.8))
for (y in c("y1","y1")){
test<-summary(table(df[,"group"],df[,y]))
output<-do.call(rbind,list(cbind(test$statistic,test$p.value)))
}
output
[,1] [,2]
[1,] 1.066 0.3019
I'm wondering why the output is not what I expected:
output
[,1] [,2]
[1,] 1.066 0.3019
[2,] 0.00011 1

In each iteration of the loop (you've used y1 twice) output is overwritten by a new value. Presumably you were aiming for something like:
set.seed(2014)
df<-data.frame(
group=rbinom(100,1,0.6),
y1=rbinom(100,1,0.3),
y2=rbinom(100,1,0.8))
output <- NULL
for (y in c("y1","y2")){
test<-summary(table(df[,"group"],df[,y]))
output<-rbind(output,cbind(test$statistic,test$p.value))
}
output
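As a sketch of a closely related pattern (not part of the original answer), the per-column results can also be stored in a list and bound once after the loop, which is the situation do.call(rbind, ...) is meant for. The names cols and results are illustrative only.

```r
set.seed(2014)
df <- data.frame(
  group = rbinom(100, 1, 0.6),
  y1 = rbinom(100, 1, 0.3),
  y2 = rbinom(100, 1, 0.8)
)

cols <- c("y1", "y2")                      # hypothetical name: columns to test
results <- vector("list", length(cols))    # hypothetical name: one slot per column
names(results) <- cols

for (y in cols) {
  test <- summary(table(df[, "group"], df[, y]))
  # store each result in its own slot instead of overwriting one object
  results[[y]] <- c(statistic = test$statistic, p.value = test$p.value)
}

# bind all stored results in a single call: one row per column tested
output <- do.call(rbind, results)
output
```

Filling a list and binding once also avoids growing a matrix inside the loop, which tends to matter more as the number of iterations increases.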

Two issues: you are looping over y1 twice, and you are not appending your new result to the older one. I think you want to loop using lapply and rbind that list:
do.call(rbind,lapply(c("y1","y2"),
function (y) summary(table(df[,"group"],df[,y]))))[,c("statistic","p.value")]
statistic p.value
[1,] 1.065739 0.30191
[2,] 0.000106695 0.9917585

You basically do this twice:
y <- "y1"
test<-summary(table(df[,"group"],df[,y]))
myList <- list(cbind(test$statistic,test$p.value))
#[[1]]
# [,1] [,2]
#[1,] 1.065739 0.30191
See how there is only one element in the list? This element is passed to rbind:
do.call(rbind, myList)
# [,1] [,2]
#[1,] 1.065739 0.30191
rbind(myList[[1]])
# [,1] [,2]
#[1,] 1.065739 0.30191
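To make the point concrete, here is a small sketch with made-up stand-in values (a and b are not real test results): do.call(rbind, ...) returns as many rows as the list supplies, so a one-element list can only ever give one row.

```r
a <- cbind(1.07, 0.30)      # stand-in for one iteration's result
b <- cbind(0.0001, 0.99)    # stand-in for a second iteration's result

one <- do.call(rbind, list(a))       # equivalent to rbind(a)
two <- do.call(rbind, list(a, b))    # equivalent to rbind(a, b)

nrow(one)   # 1
nrow(two)   # 2
```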

Apply() cannot be applied to this list?

I have created an example below, where I am trying to make a list of each row of a matrix, then use apply().
mat<-matrix(rexp(9, rate=.1), ncol=3)
my_list2 <- list()
for(i in 1:nrow(mat)) {
my_list2[[i]] <- mat[i,]
}
#DO NOT CHANGE THIS:
apply(my_list2[[i]],2,sum)
However the apply() function does not work, giving a dimension error. I understand that apply() is not the best function to use here but it is present in a function that I need so I cannot change that line.
Does anyone have any idea how I can change my "my_list2" to work better? Thank you!
Edit:
Here is an example that works (not reproducible):
[Image: Example]
Note that both the example above and this example have type "list".
This answer addresses "how to properly get a list of matrices", not how to resolve the use of apply.
By default in R, when you subset a matrix to a single column or a single row, it reduces the dimensionality. For instance,
mtx <- matrix(1:6, nrow = 2)
mtx
# [,1] [,2] [,3]
# [1,] 1 3 5
# [2,] 2 4 6
mtx[1,]
# [1] 1 3 5
mtx[,3]
# [1] 5 6
If you want a single row or column but to otherwise retain dimensionality, add the drop=FALSE argument to the [-subsetting:
mtx[1,,drop=FALSE]
# [,1] [,2] [,3]
# [1,] 1 3 5
mtx[,3,drop=FALSE]
# [,1]
# [1,] 5
# [2,] 6
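A quick way to see the difference is to inspect dim() directly. The sketch below reuses mtx from above; col_sums is just an illustrative name.

```r
mtx <- matrix(1:6, nrow = 2)

dim(mtx[1, ])                   # NULL: the result is a plain vector
dim(mtx[1, , drop = FALSE])     # 1 3: still a matrix
dim(mtx[, 3, drop = FALSE])     # 2 1

# apply() needs an object with dimensions, so only the drop = FALSE
# version can be passed to it
col_sums <- apply(mtx[1, , drop = FALSE], 2, sum)
```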
In this way, your code to produce sample data can be adjusted to be:
set.seed(42) # important for reproducibility in questions on SO
mat<-matrix(rexp(9, rate=.1), ncol=3)
my_list2 <- list()
for(i in 1:nrow(mat)) {
my_list2[[i]] <- mat[i,,drop=FALSE]
}
my_list2
# [[1]]
# [,1] [,2] [,3]
# [1,] 1.983368 0.381919 3.139846
# [[2]]
# [,1] [,2] [,3]
# [1,] 6.608953 4.731766 4.101296
# [[3]]
# [,1] [,2] [,3]
# [1,] 2.83491 14.63627 11.91598
And then you can use akrun's most recent code to get the column-wise sums (apply with MARGIN = 2) within each list element, i.e., one of
lapply(my_list2, apply, 2, sum)
lapply(my_list2, function(z) apply(z, 2, sum))
lapply(my_list2, \(z) apply(z, 2, sum)) # R-4.1 or later
In your screenshot it works because the list element ex[[1]] is an array, whereas in your example the elements of your list are vectors. You could try the following:
mat<-matrix(rexp(9, rate=.1), ncol=3)
my_list2 <- list()
for(i in 1:nrow(mat)) {
my_list2[[i]] <- as.matrix(mat[i,])
}
#DO NOT CHANGE THIS:
apply(my_list2[[1]],2,sum)
apply(my_list2[[2]],2,sum)
apply(my_list2[[3]],2,sum)
Note that apply cannot be applied to all three elements of the list in one line. To do it in one line, that line would have to be changed.
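One thing worth checking before choosing between the two conversions is the shape each one produces, since apply(x, 2, sum) behaves differently on them. A minimal sketch, assuming the same mat as in the question (row_keep and row_asmat are illustrative names):

```r
set.seed(42)
mat <- matrix(rexp(9, rate = 0.1), ncol = 3)

row_keep  <- mat[1, , drop = FALSE]   # 1 x 3 matrix: the row stays a row
row_asmat <- as.matrix(mat[1, ])      # 3 x 1 matrix: the row becomes a column

dim(row_keep)
dim(row_asmat)

s_keep  <- apply(row_keep, 2, sum)    # three values: the row entries themselves
s_asmat <- apply(row_asmat, 2, sum)   # one value: the total of the row
```

Which of the two is right depends on what the fixed apply() line is supposed to return, which the question does not state.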

Efficiently finding minimum cells values from a set of matrices in R

I have a list of matrices (size n*n), and I need to create a new matrix giving the minimum value observed for each cell, based on my list.
For instance, with the following matrices list:
> a = list(matrix(rexp(9), 3), matrix(rexp(9), 3), matrix(rexp(9), 3))
> a
[[1]]
[,1] [,2] [,3]
[1,] 0.5220069 0.39643016 0.04255687
[2,] 0.4464044 0.66029350 0.34116609
[3,] 2.2495949 0.01705576 0.08861866
[[2]]
[,1] [,2] [,3]
[1,] 0.3823704 0.271399 0.7388449
[2,] 0.1227819 1.160775 1.2131681
[3,] 0.1914548 1.004209 0.7628437
[[3]]
[,1] [,2] [,3]
[1,] 0.2125612 0.45379057 1.5987420
[2,] 0.3242311 0.02736743 0.4372894
[3,] 0.6634098 1.15401347 0.9008529
The output should be:
[,1] [,2] [,3]
[1,] 0.2125612 0.271399 0.04255687
[2,] 0.1227819 0.02736743 0.34116609
[3,] 0.1914548 0.01705576 0.08861866
I tried using apply loop with the following code (using melt and dcast from reshape2 library):
library(reshape2)
all = melt(a)
allComps = unique(all[,c(1:2)])
allComps$min=apply(allComps, 1, function(x){
g1 = x[1]
g2 = x[2]
b = unlist(lapply(a, function(y){
return(y[g1,g2])
}))
return(b[which(b==min(b))])
})
dcast(allComps, Var1~Var2)
It works but it is taking a very long time to run when applied on large matrices (6000*6000). I am looking for a faster way to do this.
Use Reduce with pmin:
Reduce(pmin, a)
# [,1] [,2] [,3]
#[1,] 0.02915345 0.03157736 0.3142273
#[2,] 0.57661027 0.05621098 0.1452668
#[3,] 0.48021473 0.18828404 0.4787604
data
set.seed(123)
a = list(matrix(rexp(9), 3), matrix(rexp(9), 3), matrix(rexp(9), 3))
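To see what Reduce is doing here, the sketch below uses small hand-picked matrices (not the data above) so the result can be checked by eye. Reduce folds pmin over the list pairwise, and pmin keeps the dimensions of its first argument.

```r
a <- list(
  matrix(c(1, 5, 3, 4), 2),
  matrix(c(2, 0, 9, 1), 2),
  matrix(c(7, 6, 2, 8), 2)
)

folded  <- Reduce(pmin, a)
by_hand <- pmin(pmin(a[[1]], a[[2]]), a[[3]])   # the same fold written out

folded
#      [,1] [,2]
# [1,]    1    2
# [2,]    0    1
```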
It may be worth storing the matrices in an array instead of a list. This can be done with simplify2array. In an array, the minimum over specific dimensions can be found using min in apply.
A <- simplify2array(a)
apply(A, 1:2, min)
We can use
apply(array(unlist(a), c(3, 3, 3)), 1:2, min)
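As a sanity check, the three approaches can be compared on the same data. This is only a sketch; the variable names are illustrative, and the array dimensions are derived from the list rather than hard-coded as c(3, 3, 3).

```r
set.seed(123)
a <- list(matrix(rexp(9), 3), matrix(rexp(9), 3), matrix(rexp(9), 3))

m_reduce <- Reduce(pmin, a)
m_array  <- apply(simplify2array(a), 1:2, min)
# dimensions taken from the data: rows x cols x number of matrices
m_unlist <- apply(array(unlist(a), c(dim(a[[1]]), length(a))), 1:2, min)

all.equal(m_reduce, m_array)
all.equal(m_reduce, m_unlist)
```

All three should agree; they differ mainly in how much work is done per cell, with apply calling min once for every cell of the result.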

Paste together two data tables in R [duplicate]

I want to paste the cells of a matrix together, but when I use paste() it returns a vector. Is there a direct function for this in R?
mat <- matrix(1:4,2,2)
paste(mat,mat,sep=",")
I want the output as
[,1] [,2]
[1,] 1,1 2,2
[2,] 3,3 4,4
A matrix in R is just a vector with an attribute specifying the dimensions. When you paste them together you are simply losing the dimension attribute.
So,
matrix(paste(mat,mat,sep=","),2,2)
Or, e.g.
mat1 <- paste(mat,mat,sep=",")
> mat1
[1] "1,1" "2,2" "3,3" "4,4"
> dim(mat1) <- c(2,2)
> mat1
[,1] [,2]
[1,] "1,1" "3,3"
[2,] "2,2" "4,4"
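The attribute loss can be inspected directly. A short sketch (pasted is an illustrative name):

```r
mat <- matrix(1:4, 2, 2)

attributes(mat)                        # $dim is 2 2
pasted <- paste(mat, mat, sep = ",")
attributes(pasted)                     # NULL: paste() returned a bare character vector

# restoring the attribute is enough to get a matrix back
dim(pasted) <- dim(mat)
is.matrix(pasted)                      # TRUE
```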
Here's just one example of how you might write a simple function to do this:
paste_matrix <- function(...,sep = " ",collapse = NULL){
n <- max(sapply(list(...),nrow))
p <- max(sapply(list(...),ncol))
matrix(paste(...,sep = sep,collapse = collapse),n,p)
}
...but the specific function you want will depend on how you want it to handle more than two matrices, matrices of different dimensions or possibly inputs that are totally unacceptable (random objects, NULL, etc.).
This particular function recycles the vector and outputs a matrix with the dimension matching the largest of the various inputs.
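For illustration, here is how that function behaves on two inputs of the same size and on a 1 x 1 matrix that gets recycled. The function is repeated so the example runs on its own; the inputs are made up.

```r
paste_matrix <- function(..., sep = " ", collapse = NULL) {
  n <- max(sapply(list(...), nrow))
  p <- max(sapply(list(...), ncol))
  matrix(paste(..., sep = sep, collapse = collapse), n, p)
}

mat <- matrix(1:4, 2, 2)

# two matrices of the same size: element-wise paste
same_size <- paste_matrix(mat, mat, sep = ",")

# a 1 x 1 matrix is recycled across every cell of the larger one
recycled <- paste_matrix(mat, matrix("x", 1, 1), sep = "")
```

Note that every argument has to be a matrix for this version to work, since nrow and ncol return NULL for plain vectors.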
An alternative to joran's approach is to use [] instead of reconstructing the matrix. That way you also keep attributes such as the colnames:
truc <- matrix(c(1:3, LETTERS[3:1]), ncol=2)
colnames(truc) <- c("A", "B")
truc[] <- paste(truc, truc, sep=",")
truc
# A B
# [1,] "1,1" "C,C"
# [2,] "2,2" "B,B"
# [3,] "3,3" "A,A"
Or use sprintf with dim<-:
`dim<-`(sprintf('%d,%d', mat, mat), dim(mat))
# [,1] [,2]
#[1,] "1,1" "3,3"
#[2,] "2,2" "4,4"
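The '%d' format only accepts whole numbers. As a hedged variation on the same idea, '%s' can be used when the matrix holds non-integer or character values (m below is a made-up example):

```r
m <- matrix(c(1.5, 2, 3.25, 4), 2, 2)

# '%s' formats each value as text, so non-integer entries are accepted
out <- `dim<-`(sprintf("%s,%s", m, m), dim(m))
out
```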
The ascii library has a function paste.matrix for element-wise paste across matrices. The output is the transpose of the desired outcome, but that's easy to address with t().
library(ascii)
mat <- matrix(1:4,2,2)
t(paste.matrix(mat,mat,sep=","))
[,1] [,2]
[1,] "1,1" "2,2"
[2,] "3,3" "4,4"

Extract elements from matrix diagonal saved in multiple lists in R

I'm trying to get different elements from multiple diagonals of matrices saved in a list. My data looks something like this:
res <- list()
res[[1]] <- matrix(c(0.04770856,0.02854005,0.02854005,0.03260190), nrow=2, ncol=2)
res[[2]] <- matrix(c(0.05436957,0.04887182,0.04887182, 0.10484454), nrow=2, ncol=2)
> res
[[1]]
[,1] [,2]
[1,] 0.04770856 0.02854005
[2,] 0.02854005 0.03260190
[[2]]
[,1] [,2]
[1,] 0.05436957 0.04887182
[2,] 0.04887182 0.10484454
> diag(res[[1]])
[1] 0.04770856 0.03260190
> diag(res[[2]])
[1] 0.05436957 0.10484454
I would like to save the first and second elements of each diagonal of a given list into a vector similar to this:
d.1st.el <- c(0.04770856, 0.05436957)
d.2nd.el <- c(0.03260190, 0.10484454)
My issue is to write the function that runs for all given lists and get the diagonals. For some reason, when I use unlist() to extract the values of each matrix for a given level, it doesn't get me the number but the full matrix.
Does anyone have a simple solution?
sapply(res, diag)
[,1] [,2]
[1,] 0.04770856 0.05436957
[2,] 0.03260190 0.10484454
# or
lapply(res, diag)
[[1]]
[1] 0.04770856 0.03260190
[[2]]
[1] 0.05436957 0.10484454
If you want the vectors for some reason in your global environment:
alld <- lapply(res, diag)
names(alld) <- sprintf("d.%d.el", 1:length(alld))
list2env(alld, globalenv())
In two steps you can do:
# Step 1 - Get the diagonals
all_diags <- sapply(res, function(x) diag(t(x)))
print(all_diags)
[,1] [,2]
[1,] 0.04770856 0.05436957
[2,] 0.03260190 0.10484454
# Step 2 - Append to vectors
d.1st.el <- all_diags[1,]
d.2nd.el <- all_diags[2,]
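If only particular diagonal positions are needed, a small helper can pull the k-th diagonal entry from every matrix. This is a sketch; diag_entry is a hypothetical helper name, not something from the answers above.

```r
res <- list(
  matrix(c(0.04770856, 0.02854005, 0.02854005, 0.03260190), nrow = 2, ncol = 2),
  matrix(c(0.05436957, 0.04887182, 0.04887182, 0.10484454), nrow = 2, ncol = 2)
)

# hypothetical helper: k-th diagonal entry of every matrix, as a numeric vector
diag_entry <- function(k) vapply(res, function(m) diag(m)[k], numeric(1))

d.1st.el <- diag_entry(1)
d.2nd.el <- diag_entry(2)
```

vapply is used instead of sapply so that the result is guaranteed to be a numeric vector with one value per matrix.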

How to use some apply function to solve what requires two for-loops in R

I have a matrix, named "mat", and a smaller matrix, named "center".
temp = c(1.8421,5.6586,6.3526,2.904,3.232,4.6076,4.8,3.2909,4.6122,4.9399)
mat = matrix(temp, ncol=2)
[,1] [,2]
[1,] 1.8421 4.6076
[2,] 5.6586 4.8000
[3,] 6.3526 3.2909
[4,] 2.9040 4.6122
[5,] 3.2320 4.9399
center = matrix(c(3, 6, 3, 2), ncol=2)
[,1] [,2]
[1,] 3 3
[2,] 6 2
I need to compute the (squared Euclidean) distance between each row of mat and every row of center. For example, the distance between mat[1,] and center[1,] can be computed as
diff = mat[1,]-center[1,]
t(diff)%*%diff
[,1]
[1,] 3.92511
Similarly, I can find the distance of mat[1,] and center[2,]
diff = mat[1,]-center[2,]
t(diff)%*%diff
[,1]
[1,] 24.08771
Repeating this process for each row of mat, I end up with
[,1] [,2]
[1,] 3.925110 24.087710
[2,] 10.308154 7.956554
[3,] 11.324550 1.790750
[4,] 2.608405 16.408805
[5,] 3.817036 16.304836
I know how to implement it with for-loops. I was really hoping someone could tell me how to do it with some kind of an apply() function, maybe mapply() I guess.
Thanks
apply(center, 1, function(x) colSums((x - t(mat)) ^ 2))
# [,1] [,2]
# [1,] 3.925110 24.087710
# [2,] 10.308154 7.956554
# [3,] 11.324550 1.790750
# [4,] 2.608405 16.408805
# [5,] 3.817036 16.304836
If you want apply for the expressiveness of the code, that's one thing, but it is still looping, just with different syntax. This can be done without any loops, or with a very small one across center instead of mat. I'd just transpose first, because it's wise to get into the habit of moving as much work as possible out of the apply call. (BrodieG's answer is pretty much identical in function.) These work because R automatically recycles the smaller vector along the matrix, and does so much faster than apply or for.
tm <- t(mat)
apply(center, 1, function(m){
colSums((tm - m)^2) })
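The recycling step can be checked on a single centre. In the sketch below, shifted is an illustrative name; the data are the same as in the question.

```r
temp <- c(1.8421, 5.6586, 6.3526, 2.904, 3.232, 4.6076, 4.8, 3.2909, 4.6122, 4.9399)
mat <- matrix(temp, ncol = 2)
center <- matrix(c(3, 6, 3, 2), ncol = 2)

tm <- t(mat)        # 2 x 5: one column per row of mat
m <- center[1, ]    # length-2 vector

# tm - m recycles m down each column, so column i holds mat[i, ] - m
shifted <- tm - m
all.equal(shifted[, 1], mat[1, ] - m)

# squared distances from every row of mat to this one centre
colSums(shifted^2)
```

This only lines up because the length of m equals the number of rows of tm; with mat untransposed, the recycling would pair the wrong coordinates.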
Use dist and then extract the relevant submatrix:
ix <- 1:nrow(mat)
as.matrix( dist( rbind(mat, center) )^2 )[ix, -ix]
#           6         7
# 1 3.925110 24.087710
# 2 10.308154 7.956554
# 3 11.324550 1.790750
# 4 2.608405 16.408805
# 5 3.817036 16.304836
You could use outer as well
d <- function(i, j) sum((mat[i, ] - center[j, ])^2)
outer(1:nrow(mat), 1:nrow(center), Vectorize(d))
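Vectorize is needed here because outer calls the function once with the full expanded index vectors and expects one value back per pair. A sketch showing the difference (bad and good are illustrative names):

```r
temp <- c(1.8421, 5.6586, 6.3526, 2.904, 3.232, 4.6076, 4.8, 3.2909, 4.6122, 4.9399)
mat <- matrix(temp, ncol = 2)
center <- matrix(c(3, 6, 3, 2), ncol = 2)

d <- function(i, j) sum((mat[i, ] - center[j, ])^2)

# without Vectorize, d() collapses all index pairs into a single number,
# which outer() cannot reshape into a 5 x 2 result
bad <- try(outer(1:nrow(mat), 1:nrow(center), d), silent = TRUE)
inherits(bad, "try-error")   # TRUE

# Vectorize() wraps d() so it is evaluated once per (i, j) pair
good <- outer(1:nrow(mat), 1:nrow(center), Vectorize(d))
dim(good)                    # 5 2
```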
This will solve it
t(apply(mat,1,function(row){
d1<-sum((row-center[1,])^2)
d2<-sum((row-center[2,])^2)
return(c(d1,d2))
}))
Result:
[,1] [,2]
[1,] 3.925110 24.087710
[2,] 10.308154 7.956554
[3,] 11.324550 1.790750
[4,] 2.608405 16.408805
[5,] 3.817036 16.304836
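As a cross-check, the approaches above can be compared against a plain double loop. This is a sketch with illustrative variable names; by_rows generalises the row-wise apply so the number of centres is not hard-coded.

```r
temp <- c(1.8421, 5.6586, 6.3526, 2.904, 3.232, 4.6076, 4.8, 3.2909, 4.6122, 4.9399)
mat <- matrix(temp, ncol = 2)
center <- matrix(c(3, 6, 3, 2), ncol = 2)

# reference result from two explicit for-loops
loop_res <- matrix(NA_real_, nrow(mat), nrow(center))
for (i in seq_len(nrow(mat))) {
  for (j in seq_len(nrow(center))) {
    loop_res[i, j] <- sum((mat[i, ] - center[j, ])^2)
  }
}

# loop over the rows of center
by_apply <- apply(center, 1, function(x) colSums((x - t(mat))^2))

# dist() on the stacked rows, keeping only the mat-versus-center block
ix <- 1:nrow(mat)
by_dist <- unname(as.matrix(dist(rbind(mat, center))^2)[ix, -ix])

# loop over the rows of mat, handling any number of centres
by_rows <- t(apply(mat, 1, function(row) colSums((t(center) - row)^2)))

all.equal(loop_res, by_apply)
all.equal(loop_res, by_dist)
all.equal(loop_res, by_rows)
```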
