Imagine I have an array in R with N dimensions (a matrix would be an array with 2 dimensions) and I want to select rows 1 to n of my array. I was wondering whether there is a syntax for this in R that does not require knowing the number of dimensions.
Indeed, I can do
x = matrix(0, nrow = 10, ncol = 2)
x[1:5, ] # to take the 5 first rows of a matrix
x = array(0, dim = c(10, 2, 3))
x[1:5, , ] # to take the 5 first rows of a 3D array
So far I haven't found a way to use this kind of writing to extract rows of an array without knowing its number of dimensions (obviously if I knew the number of dimensions I would just have to put as many commas as needed). The following snippet works but does not seem to be the most native way to do it:
x = array(0, dim = c(10, 2, 3, 4))
apply(x, 2:length(dim(x)), function(y) y[1:5])
Is there a more R way to achieve this?
Your apply solution is the best, actually.
apply(x, 2:length(dim(x)), `[`, 1:5)
or even better, as @RuiBarradas pointed out (please upvote his comment too!):
apply(x, -1, `[`, 1:5)
Coming from Lisp, I can say that R is very lispy, and the apply solution is a very lispy solution.
That also makes it very R-ish: a solution in the functional programming paradigm.
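If you would rather build the call x[1:n, , ..., ] directly, another base-R trick is to pass one "missing" argument per remaining dimension through do.call. A minimal sketch, with first_rows as a hypothetical helper name:
# Index the first dimension of an array with any number of dimensions
# by supplying empty (missing) arguments for all the other dimensions.
first_rows <- function(x, n) {
  args <- rep(list(quote(expr = )), length(dim(x)))  # one missing arg per dim
  args[[1]] <- seq_len(n)                            # rows 1:n in dimension 1
  do.call(`[`, c(list(x), args))                     # evaluates x[1:n, , ..., ]
}
x <- array(0, dim = c(10, 2, 3, 4))
dim(first_rows(x, 5))
# [1] 5 2 3 4
The abind package also provides asub() for this kind of dimension-agnostic subsetting, e.g. abind::asub(x, 1:5, dims = 1).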
Function slice.index() is easily overlooked (as I know to my cost! see magic::arow()) but can be useful in this case:
x <- array(runif(60), dim = c(10, 2, 3))
array(x[slice.index(x, 1) %in% 1:5], c(5, dim(x)[-1]))
HTH, Robin
I am trying to create a matrix of coordinates (indexes) from which I randomly pick one using the sample function. I then use the picked coordinates to select a cell in another matrix. What is the best way to do this? The trouble is how to store the integers in the matrix so that they are easy to separate. Right now I have them stored as strings with a comma, which I then split. Someone suggested I use a pair, or a string, but I cannot seem to get these to work with a matrix. Thanks!
EDIT: What I currently have looks like this (changed a little to make sense out of context):
probs <- matrix(c(0,0,0.6,0,0,
0,0.7,1,0.7,0,
0.6,1,0,1,0.6,
0,0.7,1,0.7,0,
0,0,0.6,0,0),5,5)
cordsMat <- matrix("",5,5)
for (x in 1:5){
for (y in 1:5){
cordsMat[x,y] = paste(x,y,sep=",")
}
}
cords <- sample(cordsMat,1,,probs)
cordsVec <- unlist(strsplit(cords,split = ","))
cordX <- as.numeric(cordsVec[1])
cordY <- as.numeric(cordsVec[2])
otherMat[cordX,cordY]
It sort of works, but I would also be interested in a better way, as this will be repeated a lot.
If you want to set the probabilities, you can do so by passing them to sample:
# creating the matrix
matrix(sample(rep(1:6, 15:20), 25), 5) -> other.mat
# set the probs vec
probs <- c(0,0,0.6,0,0,
0,0.7,1,0.7,0,
0.6,1,0,1,0.6,
0,0.7,1,0.7,0,
0,0,0.6,0,0)
# the coordinates matrix
mat <- as.matrix(expand.grid(1:nrow(other.mat), 1:ncol(other.mat)))
# sampling a row index randomly
sample(nrow(mat), 1, prob = probs) -> rand
# getting the value
other.mat[mat[rand, 1], mat[rand, 2]]
# [1] 6
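A sketch of a shorter variant: sample one linear index, weighted by probs, and convert it with base arrayInd(), which avoids building a coordinates matrix altogether (this assumes probs is laid out in the same column-major order as other.mat):
idx <- sample(length(probs), 1, prob = probs)  # one weighted linear index
coords <- arrayInd(idx, dim(other.mat))        # convert to a (row, col) pair
other.mat[coords]                              # index with the pair
# or, since R also supports linear indexing directly: other.mat[idx]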
I have a rather simple problem and was wondering whether some of you guys know a very efficient (=fast) solution for this:
I have two matrices mat and arr and want to accomplish the following: take every column of arr and subtract it from mat. Then take the logarithm of one minus the absolute value of the difference. That's it. Right now I'm using sapply (see below), but I'm pretty sure it's possible to do this faster (maybe using sweep?).
Code:
mat <- matrix(.3, nrow=10, ncol = 4)
arr <- matrix(.1, nrow=10, ncol = 10000)
i <- ncol(arr)
result <- sapply(1:i, function(ii) log(1 - abs(mat - arr[, ii])))
Thanks for any ideas!
We could replicate both objects to a common length and then take the difference: repeat mat once per column of arr, and repeat the rows of arr once per column of mat, so that the elements line up column by column.
result2 <- matrix(log(1 - abs(rep(mat, ncol(arr)) -
                              arr[rep(seq_len(nrow(arr)), ncol(mat)), ])), ncol = i)
identical(result, result2)
#[1] TRUE
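Another fully vectorized sketch relies on column recycling instead of rep(): subtract arr from each column of mat (the length-10 column is recycled across all columns of arr) and stack the pieces with rbind. This should give an identical result:
result3 <- do.call(rbind, lapply(seq_len(ncol(mat)),
                                 function(j) log(1 - abs(mat[, j] - arr))))
identical(result, result3)
#[1] TRUE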
I would like to convert a for loop into a faster operation such as apply.
Here is my code
for (a in 1:dim(k)[1]) {
  for (b in 1:dim(k)[2]) {
    if ((k[a, b, 1, 1] == 0) & (k[a, b, 1, 2] == 0) & (k[a, b, 1, 3] == 0)) {
      k[a, b, 1, 1] <- 1
      k[a, b, 1, 2] <- 1
      k[a, b, 1, 3] <- 1
    }
  }
}
It's simple code that checks each position of the multidimensional array k: if the three elements k[a, b, 1, 1:3] are all equal to 0, it assigns them the value 1.
Is there a way to make it faster? The array k has 1,444,000 elements and the loop takes too long to run. Can anyone help?
Thanks
With apply you can extract each length-3 vector along the fourth dimension and check it against your condition:
# This creates an array with the same properties as yours
array <- array(data = sample(c(0, 1), 81, replace = TRUE,
                             prob = c(0.9, 0.1)), c(3, 3, 3, 3))
# This loops over all vectors in the fourth dimension and returns a
# vector of ones if your condition is met
apply(array, MARGIN = c(1, 2, 3), FUN = function(x) {
  if (sum(x) == 0 && length(unique(x)) == 1)
    return(c(1, 1, 1))
  else
    return(x)
})
Note that the MARGIN argument specifies the dimensions to hold fixed while looping. You want the vectors along the fourth dimension, so you specify c(1, 2, 3).
One caveat: apply returns the function's values along a new first dimension, so the result here is 3 x 3 x 3 x 3 but oriented differently from the input. Permute it back with aperm(res, c(2, 3, 4, 1)) (where res is the apply result) before assigning it to the old array; then you have replaced all vectors meeting the condition with ones.
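For an array of this size, a fully vectorized replacement may well be faster than apply. A minimal sketch that touches only the slices the question's loop reads (k[, , 1, 1:3]):
# Logical mask of the (a, b) positions where all three values are 0
zero <- k[, , 1, 1] == 0 & k[, , 1, 2] == 0 & k[, , 1, 3] == 0
# Overwrite those positions with 1 in each of the three slices
for (d in 1:3) {
  slice <- k[, , 1, d]
  slice[zero] <- 1
  k[, , 1, d] <- slice
}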
You could first use a filter function twice (composed), and then apply (or lapply?) on the filtered array. Maybe you can also drop the third dimension of the array, since you don't seem very interested in it (you always access the 1st item). You should probably do some reading about functional programming in R here: http://adv-r.had.co.nz/Functionals.html
Note that I'm not an R programmer, but I'm quite familiar with functional programming (Haskell etc.), so this might give you an idea. It might be faster, but that depends a bit on how R is designed (lazy or eager evaluation etc.).
I just saw what seemed like a perfectly good question get deleted, and since, like the original questioner, I couldn't find a duplicate, I'm posting it again.
Assume that I have a simple matrix ("m"), which I want to index with another logical matrix ("i"), keeping the original matrix structure intact. Something like this:
# original matrix
m <- matrix(1:12, nrow = 3, ncol = 4)
# logical matrix
i <- matrix(c(rep(FALSE, 6), rep(TRUE, 6)), nrow = 3, ncol = 4)
m
i
# Desired output:
matrix(c(rep(NA,6), m[i]), nrow(m), ncol(m))
# however this seems bad programming...
Using m[i] returns a vector and not a matrix. What is the correct way to achieve this?
The original poster added a comment saying he'd figured out a solution, then almost immediately deleted it:
m[ !i ] <- NA
I had started an answer that offered a slightly different solution using the is.na<- function:
is.na(m) <- !i
Both solutions seem to be reasonable R code that rely upon logical indexing. (The i matrix structure is not actually relied upon. A vector of the proper length and entries would also have preserved the matrix structure of m.)
Both solutions provided above work and are fine. Here is another solution, which produces a new matrix without modifying the original one. Make sure that your matrix of logical values is actually stored as logical, and not as character.
vm <- as.vector(m)  # flatten the data matrix
vi <- as.vector(i)  # flatten the logical mask
new_v <- ifelse(vi, vm, NA)  # keep the value where TRUE, NA where FALSE
new_mat <- matrix(new_v, nrow = nrow(m), ncol = ncol(m))  # restore the shape
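A sketch of an even more compact non-mutating variant uses base replace(), which performs the assignment on a copy of m and returns it:
new_mat2 <- replace(m, !i, NA)  # NA wherever i is FALSE, m left untouched
identical(new_mat, new_mat2)    # should be TRUE when i is stored as logical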
This question is similar to questions that have been asked regarding floating-point error in other languages (for example here), however I haven't found a satisfactory solution.
I'm working on a project that involves investigating matrices that share certain characteristics. As part of that, I need to know how many matrices in a list are unique.
D <- as.matrix(read.table("datasource",...))
mat_list <- vector('list', length = length(samples_list))
mat_list <- lapply(1:length(samples_list), function(i) matrix(data = 0, nrow(D), ncol(D)))
This list is then populated by computations from the data, based on the elements of samples_list. After mat_list has been populated, I need to remove duplicates. Running
mat_list <- unique(mat_list)
narrows things down quite a bit; however, many of the remaining elements are within machine error of each other. The unique function does not allow one to specify a precision, and I was unable to find its source code to modify.
One idea I had was this:
ErrorReduction <- function(mat_list, tol = 2){
  len <- length(mat_list)
  for (i in 1:(len - 1)) {
    diff <- mat_list[[i]] - mat_list[[i + 1]]
    if (norm(diff, "i") < tol) {
      mat_list[[i + 1]] <- mat_list[[i]]
    }
  }
  mat_list <- unique(mat_list)
  return(mat_list)
}
but this only looks at differences between consecutive pairs. It would be simple, but most likely inefficient, to compare all pairs with nested for loops.
What methods do you know of, or what ideas do you have, of handling the problem of identifying and removing matrices that are within machine error of being duplicates?
Here is a function that applies all.equal to every pair using outer and removes all duplicates:
approx.unique <- function(l) {
  is.equal.fun <- function(i, j) isTRUE(all.equal(norm(l[[i]] - l[[j]], "M"), 0))
  is.equal.mat <- outer(seq_along(l), seq_along(l), Vectorize(is.equal.fun))
  is.duplicate <- colSums(is.equal.mat * upper.tri(is.equal.mat)) > 0
  l[!is.duplicate]
}
An example:
a <- matrix(runif(12), 4, 3)
b <- matrix(runif(12), 4, 3)
c <- matrix(runif(12), 4, 3)
all <- list(a1 = a, b1 = b, a2 = a, a3 = a, b2 = b, c1 = c)
names(approx.unique(all))
# [1] "a1" "b1" "c1"
I believe you are looking for all.equal which compares objects 'within machine error'. Check out ?all.equal.
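A minimal illustration of the difference:
x <- 0.1 + 0.2
x == 0.3                   # exact comparison fails
# [1] FALSE
isTRUE(all.equal(x, 0.3))  # comparison within tolerance succeeds
# [1] TRUE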