Rank values of an array in R with apply

I need to rank the values of an array along its third axis.
I have an array like this:
a <- array(c(1:9,11:19,21:29),dim = c(3,3,3))
The expected result is a matrix holding the ranks of the first slice, which I expected to compute like this: apply(X = a, MARGIN = c(1,2), FUN = rank)[, ,1]
# expected result, because a[,,1] holds the smallest value at every position of my array
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 1 1 1
[3,] 1 1 1
However, apply returns another array, which is strange because it returns a matrix when I use other functions like mean, max or sort.
I've checked with apply(a, c(1,2), print), and it prints the vector I'm interested in ranking (for example, it prints 1 11 21 on the first iteration).
I can code this with a for loop, but I'm curious what exactly the issue is in this case.
# The for loop that gives the correct result
a2 <- a[,,1]
for(i in seq(dim(a)[1])){
for(j in seq(dim(a)[2])){
a2[i,j] <- rank(a[i,j,])[1]
}
}
Thanks in advance !

apply puts the vector returned by FUN along the first dimension of its result, so you can use [1,,] instead of [,,1], i.e.,
apply(X = a, MARGIN = c(1,2), FUN = rank)[1,,]
such that
> apply(X = a, MARGIN = c(1,2), FUN = rank)[1,,]
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 1 1 1
[3,] 1 1 1
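To see why the index moves to the front, here is a minimal sketch using the array from the question. The names r and ranks are only illustrative. When FUN returns a vector of length n, apply returns an array of dimension c(n, dim(X)[MARGIN]), so the ranks along the third axis end up in the first dimension; aperm can move them back if the original layout is wanted.

```r
a <- array(c(1:9, 11:19, 21:29), dim = c(3, 3, 3))

# rank() returns a length-3 vector for every (i, j) cell,
# so the result has dimension c(3, 3, 3) with the ranks in dimension 1
r <- apply(a, c(1, 2), rank)
dim(r)
r[, 1, 1]   # ranks of a[1, 1, ]

# Move the rank dimension to the back so that
# ranks[i, j, k] is the rank of a[i, j, k] within a[i, j, ]
ranks <- aperm(r, c(2, 3, 1))
ranks[, , 1]
```

With the layout restored, ranks[, , 1] is the matrix the question was after.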

Try
apply(a, c(1,2), function(x) rank(x)[1])
# [,1] [,2] [,3]
# [1,] 1 1 1
# [2,] 1 1 1
# [3,] 1 1 1


How to vectorize this operation?

I have an n x 3 x m array, call it I. It contains 3 columns, n rows (say n=10), and m slices. I have a computation that must be done to replace the third column in each slice based on the other 2 columns in the slice.
I've written a function insertNewRows(I[,,simIndex]) that takes a given slice and replaces the third column. The following for-loop does what I want, but it's slow. Is there a way to speed this up by using one of the apply functions? I cannot figure out how to get them to work in the way I'd like.
for(simIndex in 1:m){
I[,, simIndex] = insertNewRows(I[,,simIndex])
}
I can provide more details on insertNewRows if needed, but the short version is that it takes a probability based on the columns I[,1:2, simIndex] of a given slice of the array, and generates a binomial RV based on the probability.
It seems like one of the apply functions should work just by using
I = apply(FUN = insertNewRows, MARGIN = c(1,2,3)) but that just produces gibberish..?
Thank you in advance!
IK
The question does not define the input, the transformation, or the expected result, so we can't really answer it, but here is an example of adding a row of ones to a[,,i] for each i; maybe that will suggest how you could solve the problem yourself.
This is how you could use sapply, apply, plyr::aaply, reshaping using matrix/aperm and abind::abind.
# input array and function
a <- array(1:24, 2:4)
f <- function(x) rbind(x, 1) # append a row of 1's
aa <- array(sapply(1:dim(a)[3], function(i) f(a[,,i])), dim(a) + c(1,0,0))
aa2 <- array(apply(a, 3, f), dim(a) + c(1,0,0))
aa3 <- aperm(plyr::aaply(a, 3, f), c(2, 3, 1))
aa4 <- array(rbind(matrix(a, dim(a)[1]), 1), dim(a) + c(1,0,0))
aa5 <- abind::abind(a, array(1, dim(a)[2:3]), along = 1)
dimnames(aa3) <- dimnames(aa5) <- NULL
sapply(list(aa2, aa3, aa4, aa5), identical, aa)
## [1] TRUE TRUE TRUE TRUE
aa[,,1]
## [,1] [,2] [,3]
## [1,] 1 3 5
## [2,] 2 4 6
## [3,] 1 1 1
aa[,,2]
## [,1] [,2] [,3]
## [1,] 7 9 11
## [2,] 8 10 12
## [3,] 1 1 1
aa[,,3]
## [,1] [,2] [,3]
## [1,] 13 15 17
## [2,] 14 16 18
## [3,] 1 1 1
aa[,,4]
## [,1] [,2] [,3]
## [1,] 19 21 23
## [2,] 20 22 24
## [3,] 1 1 1
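For the slice-wise replacement described in the question, the same apply-then-reshape pattern might look like the sketch below. Since insertNewRows was never shown, insert_new_rows here is a hypothetical stand-in, and the rule it uses (probability equal to the product of the first two columns) is purely an assumption for illustration.

```r
set.seed(1)
n <- 10
m <- 4
I <- array(runif(n * 3 * m), dim = c(n, 3, m))

# Hypothetical stand-in for insertNewRows: replaces column 3 of one slice
# with Bernoulli draws whose probability comes from columns 1 and 2 (assumed rule)
insert_new_rows <- function(slice) {
  p <- slice[, 1] * slice[, 2]
  slice[, 3] <- rbinom(nrow(slice), size = 1, prob = p)
  slice
}

# apply() flattens each returned n x 3 matrix into a column, so reshape afterwards
I2 <- array(apply(I, 3, insert_new_rows), dim = dim(I))

# If the rule itself is vectorized, no apply is needed at all
I3 <- I
I3[, 3, ] <- rbinom(n * m, size = 1, prob = I[, 1, ] * I[, 2, ])
```

Note that apply over slices is still a loop internally, so any real speed-up is more likely to come from vectorizing the computation itself, as in the last two lines.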

R: How do I translate solve_LSAP into a dataframe?

I'm trying to do a large assignment problem, and am using LSAP. It works, but I am trying to get the output into a dataframe so that I can do more with it. However, the documentation on the function says "An object of class "solve_LSAP" with the optimal assignment of rows to columns", with no further information on the data. I cannot seem to crack open the class to break the data out into a more usable form.
I've provided their example code.
library(clue) # provides solve_LSAP
x <- matrix(c(5, 1, 4, 3, 5, 2, 2, 4, 4), nrow = 3)
y <- solve_LSAP(x, maximum = FALSE)
y
Output:
Optimal assignment:
1 => 3, 2 => 1, 3 => 2
I have 200+ assignments, and the baseline output is simply not usable for me. How can I translate it into a dataframe, or at least a matrix that looks something like the below?
Row Column
1 3
2 1
3 2
solve_LSAP returns, for each row, the index of the column assigned to it, which is all you need to reconstruct the assignment:
len <- length(y)
parsedMat <- cbind(
1:len,
as.integer(y)
)
parsedMat
[,1] [,2]
[1,] 1 3
[2,] 2 1
[3,] 3 2
This can be turned into a solved matrix by:
solvedMat <- matrix(0, nrow = len, ncol = len)
solvedMat[parsedMat] <- 1
solvedMat
[,1] [,2] [,3]
[1,] 0 0 1
[2,] 1 0 0
[3,] 0 1 0
You can also turn this into a function that will return both outputs in form of a list, e.g.:
parseClueOutput <- function(x) {
len <- length(x)
parsedMat <- cbind(
1:len,
as.integer(x)
)
solvedMat <- matrix(0, nrow = len, ncol = len)
solvedMat[parsedMat] <- 1
return(
list(
parsedMat = parsedMat,
solvedMat = solvedMat
)
)
}
And use it as:
parseClueOutput(y)
$parsedMat
[,1] [,2]
[1,] 1 3
[2,] 2 1
[3,] 3 2
$solvedMat
[,1] [,2] [,3]
[1,] 0 0 1
[2,] 1 0 0
[3,] 0 1 0
As for the structure, a solve_LSAP object is not really complicated; it's essentially a numeric vector with a class attribute, which you can see with:
is.numeric(y)
[1] TRUE
Or:
str(y)
'solve_LSAP' num [1:3] 3 1 2
You can also turn the solvedMat or parsedMat into a dataframe easily - for example, the parsedMat:
setNames(as.data.frame(parsedMat), c('Row', 'Column'))
Row Column
1 1 3
2 2 1
3 3 2
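Because the result is just a vector of column indices, the data frame can also be built directly, and matrix indexing can pull out the cost of each assigned pair. In this sketch y_idx is a stand-in for as.integer(y) from the example above, so that it runs without the clue package; the Cost column is an addition for illustration and was not part of the question.

```r
x <- matrix(c(5, 1, 4, 3, 5, 2, 2, 4, 4), nrow = 3)

# Stand-in for as.integer(y), where y <- solve_LSAP(x) as in the example above
y_idx <- c(3L, 1L, 2L)

assignment <- data.frame(
  Row    = seq_along(y_idx),
  Column = y_idx,
  # a two-column index matrix picks x[Row, Column] for every pair
  Cost   = x[cbind(seq_along(y_idx), y_idx)]
)
assignment

total_cost <- sum(assignment$Cost)
total_cost
```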

For-loop and Lapply: Same function gives different results

I want to apply a function to each element of a list of vectors. I'm trying to use lapply, but it gives unwanted results, whereas a for loop with the same arguments gives the correct results:
Reproducible example:
library(gtools) # for 'permutations' function
exampleList <- list(c("RETURN", "COMBINATIONS"), c(1,2,3), c("PLEASE WORK") )
Desired output (what the for-loop returns):
for (i in 1:length(exampleList)) {
print( permutations(n = length(exampleList[[i]]), r = length(exampleList[[i]]), v = exampleList[[i]]))
}
[,1] [,2]
[1,] "COMBINATIONS" "RETURN"
[2,] "RETURN" "COMBINATIONS"
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 1 3 2
[3,] 2 1 3
[4,] 2 3 1
[5,] 3 1 2
[6,] 3 2 1
[,1]
[1,] "PLEASE WORK"
What the lapply version currently returns:
lapply(exampleList, permutations, n = length(exampleList), r = length(exampleList))
Error in FUN(X[[i]], ...) : v is either non-atomic or too short
If I understand correctly, lapply iterates through each exampleList[[i]], so the 'v' argument doesn't need to be specified (note that I still get an error when trying to specify it). What is causing my results to be inconsistent?
In your attempt, n and r are set to length(exampleList), the number of elements in the list, and that same value is passed to every call. Instead, they should equal the length of each individual element of the list.
lapply(exampleList, function(x)
gtools::permutations(n = length(x), r = length(x), v = x))
#[[1]]
# [,1] [,2]
#[1,] "COMBINATIONS" "RETURN"
#[2,] "RETURN" "COMBINATIONS"
#[[2]]
# [,1] [,2] [,3]
#[1,] 1 2 3
#[2,] 1 3 2
#[3,] 2 1 3
#[4,] 2 3 1
#[5,] 3 1 2
#[6,] 3 2 1
#[[3]]
# [,1]
#[1,] "PLEASE WORK"
You can also write this with Map:
Map(function(x, y) gtools::permutations(n = y, r = y, v = x),
exampleList, lengths(exampleList))
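The difference between the two calls can be checked without gtools. The sketch below uses a hypothetical probe function in place of permutations, only to show which values each call receives: extra arguments given to lapply are evaluated once and reused for every element, while lengths() reports one length per element.

```r
exampleList <- list(c("RETURN", "COMBINATIONS"), c(1, 2, 3), c("PLEASE WORK"))

length(exampleList)    # number of list elements
lengths(exampleList)   # length of each element

# Hypothetical probe: records the n it was given and the length of its element
probe <- function(v, n) c(n = n, len_v = length(v))

# n is fixed at length(exampleList) for every element, as in the failing call
res <- lapply(exampleList, probe, n = length(exampleList))
do.call(rbind, res)
```

For the first and third elements n is larger than the element's length, which is the situation that makes permutations complain that v is too short.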

Apply function on each element of a list of matrices

I have a list of matrices.
(below is a simplified example, I actually have a list of 3 matrices, the first one being in 2D, while the second and third ones are in 3D)
> a <- matrix(-1:2, ncol = 2)
> b <- array(c(-2:5), dim=c(2, 2, 2))
> c_list <- list(a,b)
> c_list
[[1]]
[,1] [,2]
[1,] -1 1
[2,] 0 2
[[2]]
, , 1
[,1] [,2]
[1,] -2 0
[2,] -1 1
, , 2
[,1] [,2]
[1,] 2 4
[2,] 3 5
I'd like to apply the function max(0,c_list) to each and every element (without a loop), in order to have the same type of object as "c_list" but with the negative values replaced by zeros.
> output
[[1]]
[,1] [,2]
[1,] 0 1
[2,] 0 2
[[2]]
, , 1
[,1] [,2]
[1,] 0 0
[2,] 0 1
, , 2
[,1] [,2]
[1,] 2 4
[2,] 3 5
I've managed to do it for a single matrix or for a plain list with mapply or lapply, but not for a list of matrices.
Answer: either Sotos' answer
output <- lapply(c_list, function(i)replace(i, i < 0, 0))
or Moody_Mudskipper's answer
output <- lapply(c_list,pmax,0)
You can use pmax: it preserves the dimensions of the source matrix or array and is vectorized, so it is faster than looping with max.
lapply(c_list,pmax,0)
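As a quick check on the list from the question, the sketch below runs both the pmax and the replace variants and compares them. The names out_pmax and out_replace are only illustrative.

```r
a <- matrix(-1:2, ncol = 2)
b <- array(-2:5, dim = c(2, 2, 2))
c_list <- list(a, b)

# pmax keeps the dim attribute of its first argument, so arrays stay arrays
out_pmax <- lapply(c_list, pmax, 0)

# replace() overwrites the negative entries in place, which also keeps the shape
out_replace <- lapply(c_list, function(m) replace(m, m < 0, 0))

lapply(out_pmax, dim)
isTRUE(all.equal(out_pmax, out_replace))
```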
Using apply and lapply (this version assumes every element of the list is a 2-D matrix):
a <- matrix(-1:2, ncol = 2)
b <- matrix(-3:0, ncol = 2)
c <- list(a,b)
d <- lapply(c, function(m) {
apply(m, c(1, 2), function(x) max(0, x))
})
Output:
> d
[[1]]
[,1] [,2]
[1,] 0 1
[2,] 0 2
[[2]]
[,1] [,2]
[1,] 0 0
[2,] 0 0

how to replace for loop with function in R

I would like to replace the loops in the following code.
Test<-function(j){
  card<-5
  #matrix s is to hold the results
  s <- matrix(rep(0,j*card),nrow=j,ncol=card,byrow=TRUE)
  # Loop1
  for (k in 1:j)
  {
    #A vector should be drawn from another matrix,
    #for simplicity, I define a vector "sol" to be modified in Loop2
    sol<-rep(1,card)
    #Given the vector "sol", select a vector position randomly
    #for a given no. of times (i.e. steps), say 10.
    step<-10
    # Loop2 - Modify value in sol
    for (i in seq_len(step))
    {
      #Draw a position
      r<-sample(seq_len(card),1)
      #Each position has specific probabilities for
      #assignment of possible values, meaning p is related to
      #the position.
      #For simplicity, just define the probabilities by random here.
      p<-runif(3,0,1) # just create p for each step
      p<-p/sum(p) #
      #Finally, draw a value for the selected position and
      #value of sol within this loop is kept changing.
      sol[r]<-sample(1:3,1,prob=p)
    }
    # keep the result in matrix s.
    s[k,]<-sol
  }
  return(s)
}
Given an input vector
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
It is expected to output a matrix like this:
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 3 2 3
[2,] 1 1 1 1 3
[3,] 2 2 2 2 3
[4,] 2 1 2 2 1
[5,] 1 1 3 1 1
Each step in Loop2 depends on a probability vector, which is then used to change a value in sol. I tried to replace Loop2 with sapply as follows:
sapply(seq_len(steps), function(x){
r<-runif(seq_len(card),1)
sol[r]<-sample(1:3,1,prob=p) #Try to modify value in sol
})
s[k,]<-sol #Actually, no change in sol.
However, the values in sol are not changed; they all stay at 1, i.e. 1,1,1,1,1.
How can Loop2 be replaced by a function from the apply family or by other functions?
Thank you.
If I understand correctly what you're trying to achieve, you don't need apply() functions for this:
Test <- function(j) {
  card <- 5
  # one probability vector shared by all draws
  p <- runif(3,0,1)
  p <- p/sum(p)
  out <- matrix(sample(1:3, j*card, replace=TRUE, prob=p), ncol=card, nrow=j)
  return(out)
}
Test(5)
[,1] [,2] [,3] [,4] [,5]
[1,] 2 2 2 1 1
[2,] 1 2 3 2 2
[3,] 2 3 1 1 2
[4,] 1 2 1 2 1
[5,] 2 1 1 2 2
In order to refactor this function, notice that all the r <- sample(card,1) calls are independent draws, so the number of times each position gets selected follows a multinomial distribution. This can be pulled out of the loop.
The second thing to note is that, given r, s[i,j] is 1 if the multinomial count for that cell is zero; otherwise it is distributed as sample(3,1,prob=runif(3)). (The distribution does not change if a cell is selected repeatedly.)
Put those two facts together, and we have this:
Test2 <- function(j,card=5,step=10) {
  r <- t(rmultinom(j,step,rep(1,card)))
  s <- apply(r, 1:2, function(x) if(x > 0) sample(3,1,prob=runif(3)) else 1)
  return(s)
}
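To make the two steps in Test2 easier to inspect, here is the same computation unrolled, with the intermediate count matrix kept. The names counts and s and the values j = 4, card = 5, step = 10 are chosen only for illustration.

```r
set.seed(42)
j <- 4
card <- 5
step <- 10

# rmultinom(n, size, prob) returns a length(prob) x n matrix, one draw per column;
# transposing gives one row per replicate and one column per position
counts <- t(rmultinom(j, size = step, prob = rep(1, card)))
dim(counts)
rowSums(counts)   # every row sums to step

# Positions that were never selected keep the initial value 1;
# all others get a single fresh draw from 1:3
s <- apply(counts, 1:2, function(x) if (x > 0) sample(3, 1, prob = runif(3)) else 1)
s
```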
What about this:
test2 <- function(j) {
  card <- 5
  # Create a matrix where each of the j*card rows is a p as defined in your original function.
  p <- matrix(runif(3*j*card), ncol=3)
  p <- t(apply(p, 1, function(x) x/sum(x)))
  # For each row of p, draw a single value at random
  draws <- apply(p, 1, function(x) sample(1:3, 1, prob=x))
  # Format the output as a j*card matrix
  out <- matrix(draws, ncol=card, byrow=TRUE)
  return(out)
}
If test2() does what you want, it's roughly 300 times faster than Test() on my machine.
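As for why the sapply attempt in the question left sol untouched: an assignment such as sol[r] <- ... inside a function modifies a copy that is local to that call, so the outer sol never changes. The minimal sketch below shows this, along with one possible way to carry state from step to step using Reduce; the toy update (setting position i to 9) is only a placeholder for the random draw.

```r
sol <- rep(1, 5)

# The assignment inside the anonymous function changes a local copy only
invisible(sapply(1:3, function(i) {
  sol[i] <- 9
}))
sol   # still all 1s

# Reduce passes the accumulated vector from one step to the next
sol2 <- Reduce(function(acc, i) {
  acc[i] <- 9
  acc
}, 1:3, rep(1, 5))
sol2
```

Since each step depends on the result of the previous one, a plain for loop is also a perfectly reasonable choice here.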
