I'm trying to apply a function to each element of a list using the apply family, but I'm having trouble. I want to calculate the earth mover's distance using the emdist package. Every element of the list contains two sub-elements (matrices), and I want to calculate the earth mover's distance between the two for each element in turn (the real list has thousands of elements). The problem is that RStudio crashes each time I run the code on a test dataset. An example of the test dataset:
set.seed(42)
output1 <- list(list(matrix(0,8,11),matrix(0,8,11)), list(matrix(rnorm(80),8,10),matrix(rnorm(80),8,10)))
[[1]]
[[1]][[1]]
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
[1,] 0 0 0 0 0 0 0 0 0 0 0
[2,] 0 0 0 0 0 0 0 0 0 0 0
[3,] 0 0 0 0 0 0 0 0 0 0 0
[4,] 0 0 0 0 0 0 0 0 0 0 0
[5,] 0 0 0 0 0 0 0 0 0 0 0
[6,] 0 0 0 0 0 0 0 0 0 0 0
[7,] 0 0 0 0 0 0 0 0 0 0 0
[8,] 0 0 0 0 0 0 0 0 0 0 0
[[1]][[2]]
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
[1,] 0 0 0 0 0 0 0 0 0 0 0
[2,] 0 0 0 0 0 0 0 0 0 0 0
[3,] 0 0 0 0 0 0 0 0 0 0 0
[4,] 0 0 0 0 0 0 0 0 0 0 0
[5,] 0 0 0 0 0 0 0 0 0 0 0
[6,] 0 0 0 0 0 0 0 0 0 0 0
[7,] 0 0 0 0 0 0 0 0 0 0 0
[8,] 0 0 0 0 0 0 0 0 0 0 0
Now when I do this:
library(emdist)
sapply(output1,function(x) {emd2d(x[[seq_along(x)[1]]],x[[seq_along(x)[2]]]) })
RStudio simply crashes. I have also tried:
mapply(emd2d,sapply(output1,`[`,1),sapply(output1,`[`,2))
But to no avail. Any ideas? I'm running this on a 2013 MacBook Air with 2 GB of RAM.
This works fine:
> emd2d(output1[[2]][[1]],output1[[2]][[2]])
[1] -6.089909
This does not:
emd2d(output1[[1]][[1]],output1[[1]][[2]])
It seems emd2d() cannot cope with comparing two all-zero matrices.
At least that is the case for me on OS X, since this succeeds:
set.seed(666)
output2 <- list(list(matrix(5,8,11),matrix(5,8,11)),
list(matrix(rnorm(80),8,10),matrix(rnorm(80),8,10)))
sapply(output2,function(x) {emd2d(x[[1]],x[[2]]) })
#[1] 0.000000 -7.995288
# note: I removed your seq_along() because I don't think you really want it
as does this:
> set.seed(666)
> output2 <- list(list(matrix(0,8,11),matrix(5,8,11)), list(matrix(rnorm(80),8,10),matrix(rnorm(80),8,10)))
> sapply(output2,function(x) {emd2d(x[[1]],x[[2]]) })
[1] NaN -7.995288
Maybe you need to contact the package maintainer about this. In the meantime, you could write a function that checks whether both matrices are all zeros and skips the emd2d() call in that case, e.g.
foo <- function(z) {
  if (sum(length(z[[1]][z[[1]] != 0]),
          length(z[[2]][z[[2]] != 0])) > 0) {
    emd2d(z[[1]], z[[2]])
  } else {
    0
  }
}
# I use length and subsetting, not just sum(), in case
# the two matrices sum to zero because they contain negative values
> sapply(output1, foo)
[1] 0.000000 -6.089909
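The same guard can be written a little more generally. The following is only a sketch: `safe_pairwise` and `toy_dist` are hypothetical names, and `toy_dist` is a stand-in so the example runs without emdist installed (it is not the earth mover's distance).

```r
# Hypothetical helper: apply a distance function to each pair in a list,
# returning `fallback` when both matrices contain only zeros.
safe_pairwise <- function(pairs, dist_fun, fallback = 0) {
  vapply(pairs, function(p) {
    a <- p[[1]]
    b <- p[[2]]
    if (!any(a != 0) && !any(b != 0)) {
      return(fallback)  # both all-zero: never call dist_fun
    }
    dist_fun(a, b)
  }, numeric(1))
}

# Stand-in distance for demonstration only (NOT the earth mover's distance)
toy_dist <- function(a, b) sum(abs(a - b))

pairs <- list(
  list(matrix(0, 2, 2), matrix(0, 2, 2)),  # guarded case
  list(matrix(1, 2, 2), matrix(3, 2, 2))   # normal case
)
res <- safe_pairwise(pairs, toy_dist)
res
```

With emdist loaded, the call would presumably be safe_pairwise(output1, emd2d). Using vapply() instead of sapply() means the result is guaranteed to be a numeric vector, which helps when the real list has thousands of elements.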
Related
I have tried to write a function for this part of my code, but I cannot manage it because I am new to R.
Can someone help me?
I created a matrix like this:
m <- matrix(0, nrow=10, ncol=10) # Create an adjacency matrix
and I changed some of its elements like this:
m[1,2] <- m[2,3] <- m[3,4] <-m[4,5]<-m[5,6]<-m[6,7]<-m[7,8] <-m[8,9]<-m[9,10]<-m[1,10] <- 1
But how can I do this automatically inside a function, so that it iterates and changes the values for me?
I am not sure of the logic behind assigning 1 to m[1,10]. For the others, you can do:
m <- matrix(0, nrow=10, ncol=10)
m[row(m) == col(m)-1] <- 1
m
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 1 0 0 0 0 0 0 0 0
[2,] 0 0 1 0 0 0 0 0 0 0
[3,] 0 0 0 1 0 0 0 0 0 0
[4,] 0 0 0 0 1 0 0 0 0 0
[5,] 0 0 0 0 0 1 0 0 0 0
[6,] 0 0 0 0 0 0 1 0 0 0
[7,] 0 0 0 0 0 0 0 1 0 0
[8,] 0 0 0 0 0 0 0 0 1 0
[9,] 0 0 0 0 0 0 0 0 0 1
[10,] 0 0 0 0 0 0 0 0 0 0
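Since the question asks for a function, here is one possible sketch. `make_chain` and its `close` argument are hypothetical names; `close = TRUE` simply reproduces the extra m[1,10] assignment from the question, whatever its intended meaning.

```r
# Hypothetical helper: n x n adjacency matrix linking node i to node i + 1.
# `close = TRUE` also sets m[1, n], matching the question's manual assignment.
make_chain <- function(n, close = TRUE) {
  m <- matrix(0, nrow = n, ncol = n)
  if (n > 1) {
    # two-column (row, col) index matrix: (1,2), (2,3), ..., (n-1,n)
    m[cbind(seq_len(n - 1), seq_len(n - 1) + 1)] <- 1
    if (close) m[1, n] <- 1
  }
  m
}

m_auto <- make_chain(10)
```

make_chain(10, close = FALSE) gives the same matrix as the row(m) == col(m) - 1 approach above.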
How to do a row-wise replacement of values using R?
I have a matrix and I would like to replace some of its values using an index vector. The problem is that R interprets the indices column-wise rather than row-wise.
You will find my code and results below:
Matrix=matrix(rep(0,42),nrow=6,ncol=7,byrow=TRUE)
v=c(1,7,11,16,18)
Matrix[v]=1
Matrix
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 1 1 0 0 0 0 0
[2,] 0 0 0 0 0 0 0
[3,] 0 0 0 0 0 0 0
[4,] 0 0 1 0 0 0 0
[5,] 0 1 0 0 0 0 0
[6,] 0 0 1 0 0 0 0
What I actually want is the row-wise version of this, meaning:
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 1 0 0 0 0 0 1
[2,] 0 0 0 1 0 0 0
[3,] 0 1 0 1 0 0 0
[4,] 0 0 0 0 0 0 0
[5,] 0 0 0 0 0 0 0
[6,] 0 0 0 0 0 0 0
Apparently R does a column-wise replacement of values by default.
What is the best way to obtain a row-wise replacement of the values?
Thanks!
You could convert the one-dimensional indices into row and column indices. If Ind is a two-column matrix with the row indices in its first column and the column indices in its second column, then Matrix[Ind] <- 1 sets exactly those cells:
Matrix <- matrix(rep(0,42),nrow=6,ncol=7,byrow=TRUE)
v <- c(1,7,11,16,18)
Row <- (v-1) %/% ncol(Matrix) +1
Col <- (v-1) %% ncol(Matrix) +1
Matrix[cbind(Row,Col)] <- 1
Matrix
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
# [1,] 1 0 0 0 0 0 1
# [2,] 0 0 0 1 0 0 0
# [3,] 0 1 0 1 0 0 0
# [4,] 0 0 0 0 0 0 0
# [5,] 0 0 0 0 0 0 0
# [6,] 0 0 0 0 0 0 0
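To see why the integer division and modulo work, it may help to print the intermediate values. This is just the same arithmetic as above, with `n_col` as an assumed name for ncol(Matrix):

```r
n_col <- 7                      # number of columns in the 6 x 7 matrix
v <- c(1, 7, 11, 16, 18)        # row-wise (row-major) positions

# Position k sits in row ((k - 1) %/% n_col) + 1 and
# column ((k - 1) %% n_col) + 1 when counting along rows.
Row <- (v - 1) %/% n_col + 1
Col <- (v - 1) %% n_col + 1
cbind(v, Row, Col)
#       v Row Col
# [1,]  1   1   1
# [2,]  7   1   7
# [3,] 11   2   4
# [4,] 16   3   2
# [5,] 18   3   4
```

For example, position 11 is the 4th cell of the 2nd row, because the first row uses up positions 1 to 7.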
We can do
+(matrix(seq_along(Matrix) %in% v, ncol=ncol(Matrix), nrow=nrow(Matrix), byrow=TRUE))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#[1,] 1 0 0 0 0 0 1
#[2,] 0 0 0 1 0 0 0
#[3,] 0 1 0 1 0 0 0
#[4,] 0 0 0 0 0 0 0
#[5,] 0 0 0 0 0 0 0
#[6,] 0 0 0 0 0 0 0
You could recompute your indices so they are row-wise, or you can transpose the matrix, assign, and transpose back:
Matrix=matrix(rep(0,42),nrow=6,ncol=7,byrow=TRUE)
v=c(1,7,11,16,18)
Matrix<-t(Matrix)
Matrix[v]=1
Matrix<-t(Matrix)
Matrix
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 1 0 0 0 0 0 1
[2,] 0 0 0 1 0 0 0
[3,] 0 1 0 1 0 0 0
[4,] 0 0 0 0 0 0 0
[5,] 0 0 0 0 0 0 0
[6,] 0 0 0 0 0 0 0
I am trying to create a 10x10 matrix using the code below, where column j contains 10 values from a normal distribution with a standard deviation of 0.5 and a mean equal to j, for j in 1:10. Instead, the code produces the matrix shown below, where only the final column is filled with values. What am I doing wrong? Thank you.
for(j in 1:10){
y<-matrix(0,ncol=10,nrow=10)
y[,j]<-rnorm(n=10,mean=j,sd=.5)
}
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 0 0 0 0 0 0 0 0 10.857520
[2,] 0 0 0 0 0 0 0 0 0 10.490549
[3,] 0 0 0 0 0 0 0 0 0 9.888620
[4,] 0 0 0 0 0 0 0 0 0 9.495205
[5,] 0 0 0 0 0 0 0 0 0 9.674356
[6,] 0 0 0 0 0 0 0 0 0 10.810197
[7,] 0 0 0 0 0 0 0 0 0 10.337517
[8,] 0 0 0 0 0 0 0 0 0 9.715229
[9,] 0 0 0 0 0 0 0 0 0 9.902603
[10,] 0 0 0 0 0 0 0 0 0 8.972656
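One likely cause, judging from the code as posted: y is re-created as an all-zero matrix at the start of every iteration, so only the column written in the last pass (j = 10) survives. A minimal sketch of the fix is to allocate the matrix once, before the loop. The seed is arbitrary and only there to make the run reproducible.

```r
set.seed(1)  # arbitrary seed, not from the question

y <- matrix(0, ncol = 10, nrow = 10)  # allocate once, outside the loop
for (j in 1:10) {
  y[, j] <- rnorm(n = 10, mean = j, sd = 0.5)
}

# Loop-free alternative: sapply() binds the 10 vectors as columns
y2 <- sapply(1:10, function(j) rnorm(n = 10, mean = j, sd = 0.5))
```

In both versions, colMeans() of the result should be close to 1, 2, ..., 10.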
I have a matrix (initialized to zeros) and a set of indices. If the i-th value in indices is j, then I want to set the (j,i)-th entry of the matrix to 1.
For example:
> m = matrix(0, 10, 7)
> indices
[1] 2 9 3 4 5 1 10
And the result should be
> result
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 0 0 0 0 0 1 0
[2,] 1 0 0 0 0 0 0
[3,] 0 0 1 0 0 0 0
[4,] 0 0 0 1 0 0 0
[5,] 0 0 0 0 1 0 0
[6,] 0 0 0 0 0 0 0
[7,] 0 0 0 0 0 0 0
[8,] 0 0 0 0 0 0 0
[9,] 0 1 0 0 0 0 0
[10,] 0 0 0 0 0 0 1
I asked a somewhat related question a little while back, which used a vector instead of a matrix. Is there a similar simple solution to this problem?
## OP's example data
m = matrix(0, 10, 7)
j <- c(2, 9, 3, 4, 5, 1, 10)
## Construct a two column matrix of indices (1st column w. rows & 2nd w. columns)
ij <- cbind(j, seq_along(j))
## Use it to subassign into the matrix
m[ij] <- 1
m
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
# [1,] 0 0 0 0 0 1 0
# [2,] 1 0 0 0 0 0 0
# [3,] 0 0 1 0 0 0 0
# [4,] 0 0 0 1 0 0 0
# [5,] 0 0 0 0 1 0 0
# [6,] 0 0 0 0 0 0 0
# [7,] 0 0 0 0 0 0 0
# [8,] 0 0 0 0 0 0 0
# [9,] 0 1 0 0 0 0 0
# [10,] 0 0 0 0 0 0 1
For the record, the answer in your linked question can easily be adapted to suit this scenario too by using sapply:
indices <- c(2, 9, 3, 4, 5, 1, 10)
sapply(indices, tabulate, nbins = 10)
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
# [1,] 0 0 0 0 0 1 0
# [2,] 1 0 0 0 0 0 0
# [3,] 0 0 1 0 0 0 0
# [4,] 0 0 0 1 0 0 0
# [5,] 0 0 0 0 1 0 0
# [6,] 0 0 0 0 0 0 0
# [7,] 0 0 0 0 0 0 0
# [8,] 0 0 0 0 0 0 0
# [9,] 0 1 0 0 0 0 0
# [10,] 0 0 0 0 0 0 1
For small datasets you might not notice the performance difference, but Josh's answer, which uses matrix indexing, would definitely be much faster, even if you changed my answer here to use vapply instead of sapply.
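For reference, the vapply variant mentioned above might look like the sketch below, together with a check that it agrees with the matrix-indexing approach. The names `by_tabulate` and `by_index` are only illustrative.

```r
indices <- c(2, 9, 3, 4, 5, 1, 10)

# vapply variant: each call returns an integer vector of length 10,
# and the results are bound together as columns
by_tabulate <- vapply(indices, tabulate, FUN.VALUE = integer(10), nbins = 10)

# matrix-indexing variant, using integers so the two results are comparable
by_index <- matrix(0L, nrow = 10, ncol = length(indices))
by_index[cbind(indices, seq_along(indices))] <- 1L

all(by_tabulate == by_index)
```

The vapply version still calls tabulate() once per index, whereas matrix indexing does a single vectorized assignment, which is presumably where the speed difference comes from.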
How can I randomly add a value to a matrix?
Say I have:
mat <- matrix(0, 10, 10)
v = 5
How can I add v to mat at random, 2 positions at a time? The output should look like this after a single iteration:
out
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 0 0 0 0 0 0 0 0 0
[2,] 5 0 0 0 0 0 0 0 0 0
[3,] 0 0 0 0 0 0 0 0 0 0
[4,] 0 0 0 0 5 0 0 0 0 0
[5,] 0 0 0 0 0 0 0 0 0 0
[6,] 0 0 0 0 0 0 0 0 0 0
[7,] 0 0 0 0 0 0 0 0 0 0
[8,] 0 0 0 0 0 0 0 0 0 0
[9,] 0 0 0 0 0 0 0 0 0 0
[10,] 0 0 0 0 0 0 0 0 0 0
After another iteration, mat should have 2 more positions filled with the value in 'v'.
You could use ?sample to randomly index your matrix:
idx <- sample(length(mat), size=2)
mat[idx] <- mat[idx] + v
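Note that sampling from all positions can hit a cell that was already filled in an earlier iteration. If every iteration has to fill 2 new positions, one option is to sample only among the cells that are still zero. This is a sketch: `add_to_empty` is a hypothetical helper, and it assumes that "empty" means exactly 0 and that v is non-zero.

```r
# Hypothetical helper: add v to k randomly chosen cells that are still 0
add_to_empty <- function(mat, v, k = 2) {
  empty <- which(mat == 0)
  stopifnot(length(empty) >= k)
  # sample.int() on the count avoids the sample(x) surprise
  # when only one empty cell is left
  idx <- empty[sample.int(length(empty), k)]
  mat[idx] <- mat[idx] + v
  mat
}

mat <- matrix(0, 10, 10)
v <- 5
for (i in 1:3) mat <- add_to_empty(mat, v)
sum(mat != 0)  # number of filled cells after 3 iterations
```

After three iterations exactly 6 cells hold the value 5, since no cell is ever picked twice.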