Translate for loop into parallel foreach loop in R

I am trying to optimize parts of a simulation script by running them in parallel, which is how I discovered the foreach function.
As an example, I would like the following to run in parallel:
Wiederholungen<-4
assign("A",array(0,c(Wiederholungen,10,3)))
assign("B",array(0,c(Wiederholungen,10,3)))
assign("C",array(0,c(Wiederholungen,10,3)))
Old Code:
for(m in 1:Wiederholungen){
A[m,,]<-m
B[m,,]<-m
C[m,,]<-m
}
Resulting in: (for A)
A
, , 1
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 1 1 1 1 1 1 1 1 1
[2,] 2 2 2 2 2 2 2 2 2 2
[3,] 3 3 3 3 3 3 3 3 3 3
[4,] 4 4 4 4 4 4 4 4 4 4
, , 2
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 1 1 1 1 1 1 1 1 1
[2,] 2 2 2 2 2 2 2 2 2 2
[3,] 3 3 3 3 3 3 3 3 3 3
[4,] 4 4 4 4 4 4 4 4 4 4
, , 3
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 1 1 1 1 1 1 1 1 1
[2,] 2 2 2 2 2 2 2 2 2 2
[3,] 3 3 3 3 3 3 3 3 3 3
[4,] 4 4 4 4 4 4 4 4 4 4
This should put the value of m in the m-th row of the arrays A, B and C. In my script, m indexes the replicates (Wiederholungen) the script should run.
Foreach Code:
foreach(m=1:Wiederholungen)%dopar%{
A[m,,]<-m
B[m,,]<-m
C[m,,]<-m
}
Resulting in:
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
[[4]]
[1] 4
But this does not produce the result shown above for the arrays A, B and C. I know that foreach has a .combine option, but it seems to work only if you return the result for one matrix, not for several as in my example. How do I get the same result as with my old for loop using a parallel approach? I am using Ubuntu and R 3.2.2.

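I know this is a very old post, but I thought an answer might help someone in the future.
foreach returns a list that holds, for each iteration, the value of the last expression evaluated in the loop body. So if you want A, B, and C returned, you need to list them as the final step of the loop. Note that with a parallel backend the assignments inside the loop happen on the workers' copies, so A, B and C in the calling session stay unchanged; the results have to be taken from W.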
This code works for me.
W<- foreach(m = 1:Wiederholungen) %dopar% {
A[m,,]<-m
B[m,,]<-m
C[m,,]<-m
list(A, B, C)
}
For future reference, the foreach vignette can be very helpful with things like this.
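One way to rebuild the three arrays from the returned list, sketched under the assumption that each iteration only needs to produce its own slice: return the slices from the loop body and fill A, B and C afterwards in the calling session. registerDoSEQ() is used here only so the sketch runs without extra packages; for real use a parallel backend such as doParallel::registerDoParallel() would be registered in its place.

```r
library(foreach)

# Sequential stand-in backend so the sketch runs anywhere; swap in a
# parallel backend (e.g. doParallel::registerDoParallel()) for real use.
registerDoSEQ()

Wiederholungen <- 4

# Each iteration returns only the 10 x 3 slices it computed.
res <- foreach(m = 1:Wiederholungen) %dopar% {
  list(A = array(m, c(10, 3)),
       B = array(m, c(10, 3)),
       C = array(m, c(10, 3)))
}

# Assemble the full arrays in the calling session.
A <- B <- C <- array(0, c(Wiederholungen, 10, 3))
for (m in seq_along(res)) {
  A[m, , ] <- res[[m]]$A
  B[m, , ] <- res[[m]]$B
  C[m, , ] <- res[[m]]$C
}
```

Returning small slices rather than the full arrays also keeps the amount of data sent back from each worker down.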

Related

Combinations of vector with sub-vector length n

Given a vector, 1:4, and a sequence length, 2, I would like to separate the vector into 'sub-vectors', each with a length of 2, and generate a matrix of all possible combinations of these sub-vectors.
Output would look like this:
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 3 4 1 2
Another example. With vector 1:8 and sub-vector length of 4, output would look like this:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 1 2 3 4 5 6 7 8
[2,] 5 6 7 8 1 2 3 4
With a vector 1:9 and sub-vector length of 3, output would look like this:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 2 3 4 5 6 7 8 9
[2,] 1 2 3 7 8 9 4 5 6
[3,] 4 5 6 1 2 3 7 8 9
[4,] 4 5 6 7 8 9 1 2 3
[5,] 7 8 9 4 5 6 1 2 3
[6,] 7 8 9 1 2 3 4 5 6
It's a given that the vector length must be divisible by the sub-vector length.
I can answer the whole question, but it will take a bit longer. This should give you the flavour of the answer.
The package combinat has a function called permn which gives you all the permutations of a vector. You want this, but not quite. What you need is the permutations of the blocks. So in your first example you have two blocks of length two, and in your third example you have three blocks of length three. If we look at the first, and think about ordering the blocks:
> library(combinat)
> numBlocks = 2
> permn(1:numBlocks)
[[1]]
[1] 1 2
[[2]]
[1] 2 1
So I hope you can see that the first permutation would take the blocks b1 = c(1,2), and b2 = c(3,4) and order them c(b1,b2), and the second would order them c(b2,b1).
Equally if you had three blocks, b1 = 1:3; b2 = 4:6; b3 = 7:9 then
permn(1:3)
[[1]]
[1] 1 2 3
[[2]]
[1] 1 3 2
[[3]]
[1] 3 1 2
[[4]]
[1] 3 2 1
[[5]]
[1] 2 3 1
[[6]]
[1] 2 1 3
gives you the ordering of these blocks. The more general solution is figuring out how to move the blocks around, but that isn't too hard.
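The step of moving the blocks around could be sketched as follows. This is a base-R illustration, not the permn-based code itself: allPerms and blockPerms are hypothetical helper names, and allPerms is a small recursive stand-in for permn so that the sketch needs no packages.

```r
# Hypothetical helper: all permutations of x, returned as a list.
allPerms <- function(x) {
  if (length(x) <= 1) return(list(x))
  out <- list()
  for (i in seq_along(x)) {
    for (p in allPerms(x[-i])) {
      out[[length(out) + 1]] <- c(x[i], p)
    }
  }
  out
}

# Hypothetical helper: one row per ordering of the blocks of v.
blockPerms <- function(v, blockLength) {
  stopifnot(length(v) %% blockLength == 0)
  numBlocks <- length(v) / blockLength
  # Cut v into consecutive blocks of length blockLength.
  blocks <- split(v, rep(seq_len(numBlocks), each = blockLength))
  perms <- allPerms(seq_len(numBlocks))
  # Concatenate the blocks in each permuted order.
  t(sapply(perms, function(p) unlist(blocks[p], use.names = FALSE)))
}

blockPerms(1:4, 2)
blockPerms(1:9, 3)
```

The number of rows grows as the factorial of the number of blocks, so this is only practical for a handful of blocks.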
Update: Using my multicool package. Note that the cool-lex order it produces isn't the order you'd come up with by yourself.
library(multicool)
combs = function(v, blockLength){
  if(length(v) %% blockLength != 0){
    stop("vector length must be divisible by blockLength")
  }
  numBlocks = length(v) / blockLength
  # one block per row
  blockWise = matrix(v, ncol = blockLength, byrow = TRUE)
  m = initMC(1:numBlocks)
  Perms = allPerm(m)
  # reorder the block rows for each permutation and flatten to one row
  t(apply(Perms, 1, function(p) as.vector(t(blockWise[p,]))))
}
> combs(1:4, 2)
[,1] [,2] [,3] [,4]
[1,] 3 4 1 2
[2,] 1 2 3 4
> combs(1:9, 3)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 7 8 9 4 5 6 1 2 3
[2,] 1 2 3 7 8 9 4 5 6
[3,] 7 8 9 1 2 3 4 5 6
[4,] 4 5 6 7 8 9 1 2 3
[5,] 1 2 3 4 5 6 7 8 9
[6,] 4 5 6 1 2 3 7 8 9

replace NA of a matrix with some values

I have this matrix:
mat=matrix(c(1,1,1,2,2,2,3,4,NA,
4,4,4,4,4,3,5,6,4,
3,3,5,5,6,8,0,9,NA,
1,1,1,1,1,4,5,6,1),nrow=4,byrow=TRUE)
print(mat)
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
# [1,] 1 1 1 2 2 2 3 4 NA
# [2,] 4 4 4 4 4 3 5 6 4
# [3,] 3 3 5 5 6 8 0 9 NA
# [4,] 1 1 1 1 1 4 5 6 1
I need to replace the NA values with other values, as follows.
I have another matrix:
mat2=matrix(c(24,1,3,2, 4,4,4,4, 3,2,2,5, 1,3,5,1),nrow=4,byrow=TRUE)
[,1] [,2] [,3] [,4]
[1,] 24 1 3 2
[2,] 4 4 4 4
[3,] 3 2 2 5
[4,] 1 3 5 1
and the subset with the index of the rows with NA of the first matrix "mat":
subset=c(1,3)
I want to replace each NA in the matrix with the column index of the maximum value in the corresponding row of mat2.
In this case, I will get "1" for the first row and "4" for the third one; I don't care about rows 2 and 4.
To get the column index of the row maximum, which is what you asked for, use this:
mat[subset,9] <- apply(mat2[subset,],1,which.max)
If you instead want every NA replaced with the maximum value from the same row in mat2, use this:
mat[which(is.na(mat))] <- apply(mat2,1,max)[which(is.na(mat), arr.ind = TRUE)[, "row"]]
I could not run this when I wrote it, so I hope it works. If you have any questions or it fails, just comment.
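As a self-contained check of the which.max approach on the question's data, here is a minimal sketch. It assumes the NAs sit in column 9 and finds the affected rows itself rather than relying on a hand-written subset; naRows is a name introduced only for this illustration.

```r
mat <- matrix(c(1,1,1,2,2,2,3,4,NA,
                4,4,4,4,4,3,5,6,4,
                3,3,5,5,6,8,0,9,NA,
                1,1,1,1,1,4,5,6,1), nrow = 4, byrow = TRUE)
mat2 <- matrix(c(24,1,3,2, 4,4,4,4, 3,2,2,5, 1,3,5,1), nrow = 4, byrow = TRUE)

# Rows whose last column is missing (here rows 1 and 3).
naRows <- which(is.na(mat[, 9]))

# Column index of the largest value in the matching rows of mat2;
# drop = FALSE keeps a matrix even if only one row is selected.
mat[naRows, 9] <- apply(mat2[naRows, , drop = FALSE], 1, which.max)

mat[, 9]  # 1 4 4 1
```

which.max returns the first position when a row has several equal maxima, which may or may not be what is wanted for rows such as row 2 of mat2.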

Add a column with the value of the object with max frequency

I have this matrix:
mat=matrix(c(1,1,1,2,2,2,3,4,
4,4,4,4,4,3,5,6,
3,3,5,5,6,8,0,9,
1,1,1,1,1,4,5,6),nrow=4,byrow=TRUE)
print(mat)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 1 1 1 2 2 2 3 4
[2,] 4 4 4 4 4 3 5 6
[3,] 3 3 5 5 6 8 0 9
[4,] 1 1 1 1 1 4 5 6
and a subset with the index of the row I want to apply my function:
subset=c(2,4)
I would like to add a new column in the matrix "mat" which contains, only for the subset I specified, the value of the object with the max frequency in the row.
In this case:
for row number 1, I would like to have an empty cell in the new column,
for row number 2, I would like to have the value "4" in the new column,
for row number 3, I would like to have an empty cell in the new column,
for row number 4, I would like to have the value "1" in the new column.
EDIT:
Thanks for the code in the answer!
Now I need to replace the NA values with other values.
I have another matrix:
mat2=matrix(c(24,1,3,2, 4,4,4,4, 3,2,2,5, 1,3,5,1),nrow=4,byrow=TRUE)
[,1] [,2] [,3] [,4]
[1,] 24 1 3 2
[2,] 4 4 4 4
[3,] 3 2 2 5
[4,] 1 3 5 1
and the subset:
subset=c(1,3)
I want to replace the NAs in the matrix (the rows left out of the first subset) with the column index of the maximum value in the corresponding row of mat2.
In this case, I will get "1" for the first row and "4" for the third one.
You are looking for the mode. Unfortunately, R doesn't provide a built-in function for the statistical mode (the existing mode() returns an object's storage mode), but it is not hard to write your own:
## create mode function
modeValue <- function(x) {
ux <- unique(x)
ux[which.max(tabulate(match(x, ux)))]
}
## add new column with NA
smat <- cbind(mat, NA)
## calculate mode for subset
smat[subset, ncol(smat)] <- apply(smat[subset, , drop=FALSE], 1, modeValue)
smat
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
# [1,] 1 1 1 2 2 2 3 4 NA
# [2,] 4 4 4 4 4 3 5 6 4
# [3,] 3 3 5 5 6 8 0 9 NA
# [4,] 1 1 1 1 1 4 5 6 1
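One detail worth knowing about this function is how it handles ties: when several values share the highest count, it returns the one that appears first in the input. A quick check, using row 1 of mat from the question (where 1 and 2 each occur three times) and the same values in a different order:

```r
modeValue <- function(x) {
  ux <- unique(x)
  # count occurrences of each unique value, keep the first with the top count
  ux[which.max(tabulate(match(x, ux)))]
}

modeValue(c(1, 1, 1, 2, 2, 2, 3, 4))  # 1: first of the tied values
modeValue(c(2, 2, 2, 1, 1, 1, 3, 4))  # 2: ties follow order of appearance
```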
Here is a function that will work. It calculates the mode for every row, then sets the result to NA for the rows outside the requested subset:
myFunc <- function(x, myRows) {
  # mode of each row of x (the smallest value wins ties, since table() sorts)
  myModes <- apply(x, 1, FUN = function(i) {
    temp <- table(i)
    as.numeric(names(temp)[which.max(temp)])
  })
  # blank out the rows that were not requested
  myModes[setdiff(seq.int(nrow(x)), myRows)] <- NA
  myModes
}
For the example, this returns
myFunc(mat, c(2,4))
[1] NA 4 NA 1
To add this to your matrix, just use cbind:
cbind(mat, myFunc(mat, c(2,4)))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 1 1 2 2 2 3 4 NA
[2,] 4 4 4 4 4 3 5 6 4
[3,] 3 3 5 5 6 8 0 9 NA
[4,] 1 1 1 1 1 4 5 6 1

How to combine matrices with sapply in R

Each iteration of my sapply function outputs an n*m matrix. n is fixed, m is not.
For example, if I run this in R:
sapply(1:3, function(x) {matrix(1:9, 3)})
and it will output:
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 2 2 2
[3,] 3 3 3
[4,] 4 4 4
[5,] 5 5 5
[6,] 6 6 6
[7,] 7 7 7
[8,] 8 8 8
[9,] 9 9 9
However, what I want is something like this:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 4 7 1 4 7 1 4 7
[2,] 2 5 8 2 5 8 2 5 8
[3,] 3 6 9 3 6 9 3 6 9
Any idea for this? Thanks
One solution is:
do.call(cbind, lapply(1:3, function(x) {matrix(1:9, 3)}))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 4 7 1 4 7 1 4 7
[2,] 2 5 8 2 5 8 2 5 8
[3,] 3 6 9 3 6 9 3 6 9
We can use replicate
`dim<-`(replicate(3, matrix(1:9, 3)), c(3, 3*3))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
#[1,] 1 4 7 1 4 7 1 4 7
#[2,] 2 5 8 2 5 8 2 5 8
#[3,] 3 6 9 3 6 9 3 6 9

Repeat values in a matrix (R)

Here is the code I am working with:
A <- matrix(1:9, nrow = 3)
A
cbind(A,A,A)
This gives an output:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 4 7 1 4 7 1 4 7
[2,] 2 5 8 2 5 8 2 5 8
[3,] 3 6 9 3 6 9 3 6 9
The desired output is...
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 1 1 4 4 4 7 7 7
[2,] 2 2 2 5 5 5 8 8 8
[3,] 3 3 3 6 6 6 9 9 9
In addition I tried this...
test <- (sapply(A , function(maybe) rep(maybe,each=3)))
test
Which gives an output of:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 2 3 4 5 6 7 8 9
[2,] 1 2 3 4 5 6 7 8 9
[3,] 1 2 3 4 5 6 7 8 9
The help is much appreciated.
Use rep with column indexing: A[,rep(1:3, each=3)]
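A slightly more general sketch of the same idea, where k (a name introduced here) is the number of copies and the column count is taken from the matrix instead of being hard-coded. Using `each` repeats every column in place, while `times` would reproduce the cbind(A,A,A) layout from the question.

```r
A <- matrix(1:9, nrow = 3)
k <- 3  # number of copies of each column

# each = k: column 1 k times, then column 2 k times, and so on.
B <- A[, rep(seq_len(ncol(A)), each = k)]
B[1, ]  # 1 1 1 4 4 4 7 7 7

# times = k: the whole block of columns repeated, same as cbind(A, A, A).
C <- A[, rep(seq_len(ncol(A)), times = k)]
```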
