I am trying to assign rows of a 3D array, but I don't know exactly how.
I have a 2D index array where each row corresponds to the first and second index of the 3D array, and a 2D value array which I want to insert into the 3D array. The simplest way I found to do this was
indexes <- cbind(1:30, rep(c(1, 2), 15))
rows <- cbind(1:20, 31:50, 71:90)
for (i in 1:nrow(indexes)) for (j in 1:3)
data[indexes[i,1], indexes[i,2], j] <- rows[i, j]
But this is hard to read, because it uses nested indexing, so I was hoping there was a simpler way, like
data[indexes,] <- rows
(this does not work)
What I've tried:
this question shows how to index the array (without assignment)
apply(data, 3, `[`, indexes)
but this doesn't allow assignment
apply(data, 3, `[`, indexes) <- rows #: could not find function "apply<-"
nor does using [<- work:
apply(data, 3, `[<-`, indexes, rows)
because it treats rows as a vector.
Neither of the following works either
data[indexes[1], indexes[2],] <- rows #: subscript out of bounds
data[indexes,] <- rows #: incorrect number of subscripts on matrix
So is there a simpler way of assigning to a multidimensional array?
Your indexes variable implies that data has first dim of 30, but rows[30,j] doesn't exist. So your problem isn't well posed, and I'll change it.
The basic idea is that you can index a 3 way array by an n x 3 matrix. Each row of the matrix corresponds to a location in the 3 way array, so if you want to set entry data[1,2,3] to 4, and entry data[5,6,7] to 8, you'd use
index <- rbind(c(1,2,3), c(5,6,7))
data[index] <- c(4,8)
You will need to expand your indexes variable to replicate each row 3 times, then read the rows matrix as a vector, and then this works:
data <- array(NA, dim=c(30, 2, 3))
indexes <- cbind(1:30, rep(c(1, 2), 15))
rows <- cbind(1:30, 31:60, 71:100)
indexes1 <- indexes[rep(1:nrow(indexes), each = 3),]
indexes2 <- cbind(indexes1, 1:3)
data[indexes2] <- t(rows) # Transpose because R reads down columns first
I don't think this is any simpler than what you had with the for loops, but maybe you'll find it preferable.
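To check that the matrix-index assignment really does the same thing as the nested loops, the two results can be compared directly. This is a minimal sketch that assumes the 30-row version of rows used in this answer; the names data_loop, data_mat and idx are introduced here for illustration only.

```r
# Setup as in the answer above (rows extended to 30 rows)
indexes <- cbind(1:30, rep(c(1, 2), 15))
rows <- cbind(1:30, 31:60, 71:100)

# Reference result: the nested loops from the question
data_loop <- array(NA_real_, dim = c(30, 2, 3))
for (i in seq_len(nrow(indexes))) for (j in 1:3)
  data_loop[indexes[i, 1], indexes[i, 2], j] <- rows[i, j]

# Matrix indexing: one row of idx per target cell (i, k, j)
data_mat <- array(NA_real_, dim = c(30, 2, 3))
idx <- cbind(indexes[rep(seq_len(nrow(indexes)), each = 3), ], 1:3)
data_mat[idx] <- t(rows)  # t() so the values are read row by row

same <- identical(data_loop, data_mat)
print(same)
```

If the two approaches agree, same is TRUE.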
After reading @user2554330's answer, I found a slightly simpler solution:
# initialize as in user2554330's answer
data <- ...
indexes <- ...
rows <- ...
indexes3 <- as.matrix(merge(indexes, 1:3))
data[indexes3] <- rows
Comparison of indexes2 and indexes3 (using fewer elements):
# print(indexes2)
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 1 1 2
[3,] 1 1 3
[4,] 2 2 1
[5,] 2 2 2
[6,] 2 2 3
[7,] 3 1 1
[8,] 3 1 2
[9,] 3 1 3
[10,] 4 2 1
[11,] 4 2 2
[12,] 4 2 3
# print(indexes3)
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 2 2 1
[3,] 3 1 1
[4,] 4 2 1
[5,] 1 1 2
[6,] 2 2 2
[7,] 3 1 2
[8,] 4 2 2
[9,] 1 1 3
[10,] 2 2 3
[11,] 3 1 3
[12,] 4 2 3
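The two index matrices list the same cells in a different order, so each one has to be paired with the values in the matching order: t(rows) for indexes2 and rows for indexes3. A small sketch of that pairing, assuming a hypothetical 4-row example (the arrays a and b and the values in rows are made up for illustration):

```r
indexes <- cbind(1:4, rep(c(1, 2), 2))
rows <- cbind(1:4, 11:14, 21:24)

# Third index varies fastest, so values must be taken row by row
indexes2 <- cbind(indexes[rep(seq_len(nrow(indexes)), each = 3), ], 1:3)
a <- array(NA_real_, dim = c(4, 2, 3))
a[indexes2] <- t(rows)

# First two indexes vary fastest, so values are taken column by column
indexes3 <- as.matrix(merge(indexes, 1:3))
b <- array(NA_real_, dim = c(4, 2, 3))
b[indexes3] <- rows

same <- identical(a, b)
print(same)
```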
I have a problem about how to select neighboring elements in a vector and put them into a list or matrix in R.
For example:
vl <- c(1,2,3,4,5)
I want to get the results like this:
1,2
2,3
3,4
4,5
The results can be in a list or matrix
I know we can use a loop to get the results, like this:
pl <- list()
for (p in seq_len(length(vl) - 1)) {
# stop at the second-to-last element so that vl[p + 1] always exists
pl[[p]] <- sort(c(vl[p], vl[p + 1]))}
But my data set is large, and the loop is relatively slow.
Is there any function to get results directly?
Many thanks!
We can use head and tail to ignore the last and first element respectively.
data.frame(a = head(vl, -1), b = tail(vl, -1))
# a b
#1 1 2
#2 2 3
#3 3 4
#4 4 5
EDIT
If the data needs to be sorted we can use apply row-wise to sort it.
vl <- c(2,5,3,1,6,4)
t(apply(data.frame(a = head(vl, -1), b = tail(vl, -1)), 1, sort))
# [,1] [,2]
#[1,] 2 5
#[2,] 3 5
#[3,] 1 3
#[4,] 1 6
#[5,] 4 6
You can do:
matrix(c(vl[-length(vl)], vl[-1]), ncol = 2)
[,1] [,2]
[1,] 1 2
[2,] 2 3
[3,] 3 4
[4,] 4 5
If you want to sort two columns rowwise, then you can use pmin() and pmax() which will be faster than using apply(x, 1, sort) with a large number of rows.
sapply(c(pmin, pmax), do.call, data.frame(vl[-length(vl)], vl[-1]))
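The same pmin()/pmax() idea can be written without do.call, which may be easier to read. A minimal sketch, assuming the unsorted example vector from the edit above; the names left, right and sorted_pairs are introduced here for illustration:

```r
vl <- c(2, 5, 3, 1, 6, 4)

left <- head(vl, -1)   # every element except the last
right <- tail(vl, -1)  # every element except the first

# Row-wise sort of two columns: smaller value first, larger value second
sorted_pairs <- cbind(pmin(left, right), pmax(left, right))
print(sorted_pairs)
```

This only works for windows of length 2; for longer windows a row-wise sort is still needed.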
The problem can also be solved by applying the sort() function on a rolling window of length 2:
vl <- c(2,5,3,1,6,4)
zoo::rollapply(vl, 2L, sort)
which returns a matrix as requested:
[,1] [,2]
[1,] 2 5
[2,] 3 5
[3,] 1 3
[4,] 1 6
[5,] 4 6
Note that this uses the modified input vector vl that the OP posted in the comments.
Besides zoo, there are also other packages which offer rollapply functions, e.g.,
t(rowr::rollApply(vl, sort, 2L, 2L))
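For the unsorted pairs, base R's embed() is another option that needs no extra package. A minimal sketch, assuming the same vl as above; the column reversal is needed because embed() puts the later element of each window first:

```r
vl <- c(2, 5, 3, 1, 6, 4)

# embed(vl, 2) has one row per window of length 2, ordered (vl[i + 1], vl[i]),
# so the columns are swapped to get (vl[i], vl[i + 1])
pairs <- embed(vl, 2)[, 2:1]
print(pairs)
```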
I have this code and can't understand how rbind.fill.matrix is used.
dtmat is a matrix with the documents on rows and words on columns.
word <- do.call(rbind.fill.matrix,lapply(1:ncol(dtmat), function(i) {
t(rep(1:length(dtmat[,i]), dtmat[,i]))
}))
I read the description of the function. It says that it binds matrices and fills missing columns with NA, but I cannot understand which matrices are being bound here.
From what I understand, the function fills the columns that are missing from a matrix with NA.
Let's say I have two matrices: A with two columns, col1 and col2, and B with three columns, col1, col2 and colA. I want to bind these two matrices by row, but rbind only binds matrices with an equal number of columns. rbind.fill.matrix binds them anyway and puts NA wherever a column exists in one matrix but not in the other. The code below shows this more clearly.
> a <- matrix(c(1,1,2,2), nrow = 2, byrow = T)
> a
[,1] [,2]
[1,] 1 1
[2,] 2 2
>
> b <- matrix(c(1,1,1,2,2,2,3,3,3), nrow = 3, byrow = T)
> b
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 2 2 2
[3,] 3 3 3
>
> library(plyr)
> r <- rbind.fill.matrix(a,b)
> r
1 2 3
[1,] 1 1 NA
[2,] 2 2 NA
[3,] 1 1 1
[4,] 2 2 2
[5,] 3 3 3
>
>
The documentation also mentions column names, which I think you can also understand from the example.
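Applied to the code in the question, the matrices being bound are the one-row matrices that lapply builds, one per word column. The sketch below uses base R only and pads by hand, to approximate what rbind.fill.matrix would produce for unnamed columns; the small dtmat and the names pieces, width and word are hypothetical, chosen for illustration.

```r
# Hypothetical document-term matrix: 2 documents (rows) x 3 words (columns)
dtmat <- matrix(c(2, 0,    # word 1: twice in document 1
                  1, 1,    # word 2: once in each document
                  0, 3),   # word 3: three times in document 2
                nrow = 2)

# For each word, a one-row matrix that repeats each document index
# as many times as the word occurs in that document
pieces <- lapply(seq_len(ncol(dtmat)), function(i) {
  t(rep(seq_len(nrow(dtmat)), dtmat[, i]))
})

# The pieces have 2, 2 and 3 columns; pad the short ones with NA and stack them
width <- max(sapply(pieces, ncol))
word <- do.call(rbind, lapply(pieces, function(p) {
  c(p, rep(NA, width - ncol(p)))
}))
print(word)
```

Each row of word then lists the documents in which that word occurs, one entry per occurrence, with NA as padding.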
I'm looking for a fast way to return the indices of columns of a matrix that match values provided in a vector (ideally of length 1 or the same as the number of rows in the matrix)
for instance:
mat <- matrix(1:100,10)
values <- c(11,2,23,12,35,6,97,3,9,10)
the desired function, which I call rowMatches() would return:
rowMatches(mat, values)
[1] 2 1 3 NA 4 1 10 NA 1 1
Indeed, value 11 is first found at the 2nd column of the first row, value 2 appears at the 1st column of the 2nd row, value 23 is at the 3rd column of the 3rd row, value 12 is not in the 4th row... and so on.
Since I haven't found any solution in package matrixStats, I came up with this function:
rowMatches <- function(mat,values) {
res <- integer(nrow(mat))
matches <- mat == values
for (col in ncol(mat):1) {
res[matches[,col]] <- col
}
res[res==0] <- NA
res
}
For my intended use, there will be millions of rows and few columns. So splitting the matrix into rows (in a list called, say, rows) and calling Map(match, as.list(values), rows) would be way too slow.
But I'm not satisfied by my function because there is a loop, which may be slow if there are many columns. It should be possible to use apply() on columns, but it won't make it faster.
Any ideas?
res <- arrayInd(match(values, mat), .dim = dim(mat))
res[res[, 1] != seq_len(nrow(res)), 2] <- NA
# [,1] [,2]
# [1,] 1 2
# [2,] 2 1
# [3,] 3 3
# [4,] 2 NA
# [5,] 5 4
# [6,] 6 1
# [7,] 7 10
# [8,] 3 NA
# [9,] 9 1
#[10,] 10 1
Roland's answer is good, but I'll post an alternative solution:
res <- which(mat==values, arr.ind = T)
res <- res[match(seq_len(nrow(mat)), res[,1]), 2]
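The which()-based approach can be wrapped in a function and checked against the example from the question. This is a sketch only; the name rowMatches2 and the second test matrix m2 are introduced here for illustration. The second check covers a value that occurs in more than one row, a case where a single match() over the whole matrix sees only the first occurrence.

```r
rowMatches2 <- function(mat, values) {
  # values is recycled down the columns, so it should have length 1 or nrow(mat)
  hits <- which(mat == values, arr.ind = TRUE)
  # hits is ordered column by column, so the first hit per row is the leftmost column
  unname(hits[match(seq_len(nrow(mat)), hits[, 1]), 2])
}

mat <- matrix(1:100, 10)
values <- c(11, 2, 23, 12, 35, 6, 97, 3, 9, 10)
res <- rowMatches2(mat, values)
print(res)

# The value 2 occurs in both rows here: column 2 of row 1, column 1 of row 2
m2 <- matrix(c(1, 2, 2, 3), 2)
res2 <- rowMatches2(m2, c(2, 2))
print(res2)
```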
I have a single vector (call it t1) with a series of observations. I want to create a set of new vectors by popping the first observation from t1 (and so on for subsequent near-copies). But I want to keep the vectors the same length so I can add them to a data frame later.
I was able to make it work as follows:
t1 <- c(1, 2, 3)
t2 <- t1[-1]
t3 <- t2[-1]
t2[length(t2)+1] <- 0
t3[length(t3)+1] <- 0
t3[length(t3)+1] <- 0
t.all <- cbind(as.data.frame(t1), as.data.frame(t2), as.data.frame(t3))
t.all
t1 t2 t3
1 1 2 3
2 2 3 0
3 3 0 0
But this is clumsy and it's going to be tedious if I want to create a large number of columns. How can I keep the vectors the same length (or solve this problem another way)?
Here is a loop version of what you are trying to do, using do.call and lapply:
cbind(t1,do.call(cbind,lapply(seq_along(t1)-1,
function(x)c(tail(t1,-x),rep(0,x)))))
t1
[1,] 1 2 3
[2,] 2 3 0
[3,] 3 0 0
> t.all <- sapply(0:2, function(x) c( t1[(x+1):3], rep(0,x) ) )
> t.all
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 2 3 0
[3,] 3 0 0
If you need it to be a data.frame it would be a lot more efficient to build as a matrix first and then wrap as.data.frame around the final result.
Here's another way using vector indexing:
t1 <- c(2,5,3)
mm <- do.call(rbind, lapply(seq_along(t1), function(x) t1[x:length(t1)][1:length(t1)]))
# [,1] [,2] [,3]
# [1,] 2 5 3
# [2,] 5 3 NA
# [3,] 3 NA NA
mm[is.na(mm)] <- 0
# [,1] [,2] [,3]
# [1,] 2 5 3
# [2,] 5 3 0
# [3,] 3 0 0
Another way without using apply family:
t1 <- c(2,5,4,6)
len <- length(t1)
matrix(t1[outer(1:len, 0:(len-1), '+')], ncol=len)
# [,1] [,2] [,3] [,4]
# [1,] 2 5 4 6
# [2,] 5 4 6 NA
# [3,] 4 6 NA NA
# [4,] 6 NA NA NA
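The outer() indexing above leaves NA where the index runs past the end of t1. A small sketch of a wrapper that fills those cells with 0 instead, as the question asks; the function name shift_matrix and its fill argument are hypothetical:

```r
shift_matrix <- function(x, fill = 0) {
  n <- length(x)
  # idx[i, j] = i + (j - 1): column j holds x shifted up by j - 1 positions
  idx <- outer(seq_len(n), seq_len(n) - 1L, `+`)
  out <- matrix(x[idx], nrow = n)
  # Only cells whose index ran past the end are filled,
  # so NA values already present in x are left alone
  out[idx > n] <- fill
  out
}

m <- shift_matrix(c(1, 2, 3))
print(m)
```

Wrapping the result in as.data.frame() gives a data frame with one column per shift.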
How about creating a matrix row-by-row, by recycling t1 as desired:
tmat <- rbind(t1,t1,t1,t1,....) # as many as needed, so that each row is a copy of t1
Then just use a matrix triangle function
newmat<- tmat * upper.tri(tmat,diag=TRUE)
That's offset from your sample, but contains the same info per row.
Most of the other answers focus on creating the final data.frame. If that is your ultimate goal, then they provide good approaches. This answer instead focuses narrowly on your question of how to take the first element off and preserve the length. In order to keep things tidy, it is best to do the whole thing in one function.
shift <- function(tx) {append(tx[-1],0)}
Then you can have
t1 <- c(1, 2, 3)
t2 <- shift(t1)
t3 <- shift(t2)
t.all <- data.frame(t1, t2, t3)
which gives you the same result you had.
> t.all
t1 t2 t3
1 1 2 3
2 2 3 0
3 3 0 0
If you want to combine this function with a looping construct to create the data.frame, it is easiest to go through a matrix first.
t.all <- matrix(t1, nrow=length(t1), ncol=length(t1))
lapply(seq(length=length(t1))[-1], function(i) {
t.all[,i] <<- shift(t.all[,(i-1)])
})
t.all <- as.data.frame(t.all)
which gives the same data.frame, but with slightly different column names
> t.all
V1 V2 V3
1 1 2 3
2 2 3 0
3 3 0 0
As fast as possible, I would like to replace the first zeros in some rows of a matrix with values stored in another vector.
There is a numeric matrix where each row is a vector with some zeros.
I also have two vectors, one containing the rows in which to replace and another containing the new values: replace.in.these.rows and new.values. I can also generate the vector of first-zero positions with sapply:
mat <- matrix(1,5,5)
mat[c(1,8,10,14,16,22,14)] <- 0
replace.in.these.rows <- c(1,2,3)
new.values <- c(91,92,93)
corresponding.poz.of.1st.zero <- sapply(replace.in.these.rows,
function(x) which(mat [x,] == 0)[1] )
Now I would like something that iterates over the index vectors, but without a for loop possibly:
mat[replace.in.these.rows, corresponding.poz.of.1st.zero] <- new.values
Is there a trick for indexing with something more than simple vectors? I could not use a list or an array (e.g. by column) as an index.
By default, R matrices are stored as a set of column vectors. Do I gain anything if I store the data in transposed form? That would mean working on columns instead of rows.
Context:
This matrix stores the contact IDs of a network. It is not an n x n adjacency matrix, but rather an n x max.number.of.partners (or n*=30) matrix.
The network uses an edgelist by default, but I wanted to store "all links from X" together.
I assumed, but am not sure, that this is more efficient than always extracting the information from the edgelist (multiple times each round in a simulation).
I also assumed that this linearly growing matrix form is faster than storing the same information in a list of the same format.
Some comments on these contextual assumptions are also welcome.
Edit: If only the first zeros are to be replaced, then this approach works:
first0s <-apply(mat[replace.in.these.rows, ] , 1, function(x) which(x==0)[1])
mat[cbind(replace.in.these.rows, first0s)] <- new.values
> mat
[,1] [,2] [,3] [,4] [,5]
[1,] 91 1 1 0 1
[2,] 1 1 1 1 92
[3,] 1 93 1 1 1
[4,] 1 1 0 1 1
[5,] 1 0 1 1 1
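The apply() call that finds the first zeros can also be replaced by max.col(), which avoids calling an R function once per row. This is a sketch under the same setup as the question; the names is_zero and has_zero are introduced here, and the has_zero guard is an addition for rows that contain no zero at all (max.col would return column 1 for them).

```r
mat <- matrix(1, 5, 5)
mat[c(1, 8, 10, 14, 16, 22)] <- 0
replace.in.these.rows <- c(1, 2, 3)
new.values <- c(91, 92, 93)

# TRUE where the selected rows contain a zero
is_zero <- mat[replace.in.these.rows, , drop = FALSE] == 0

# Column of the first TRUE per row (TRUE becomes 1, FALSE becomes 0, ties go to the first)
first0s <- max.col(is_zero * 1, ties.method = "first")

# Skip rows without any zero
has_zero <- rowSums(is_zero) > 0

mat[cbind(replace.in.these.rows[has_zero], first0s[has_zero])] <- new.values[has_zero]
print(mat)
```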
Original answer: I first thought that the goal was to replace all zeros in the chosen rows, and this was the approach for that case. It is completely vectorized:
idxs <- which(mat==0, arr.ind=TRUE)
# This returns the rows and columns that identify the zero elements
# idxs[,"row"] %in% replace.in.these.rows
# [1] TRUE TRUE FALSE FALSE TRUE TRUE
# That isolates the ones you want.
# idxs[ idxs[,"row"] %in% replace.in.these.rows , ]
# that shows what you will supply as the two column argument to "["
# row col
#[1,] 1 1
#[2,] 3 2
#[3,] 1 4
#[4,] 2 5
chosen.ones <- idxs[ idxs[,"row"] %in% replace.in.these.rows , ]
mat[chosen.ones] <- new.values[chosen.ones[,"row"]]
# Replace the zeros with the values chosen (and duplicated if necessary) by "row".
mat
#---------
[,1] [,2] [,3] [,4] [,5]
[1,] 91 1 1 91 1
[2,] 1 1 1 1 92
[3,] 1 93 1 1 1
[4,] 1 1 0 1 1
[5,] 1 0 1 1 1
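Indexing new.values by the row number works in the example because replace.in.these.rows happens to be 1, 2, 3. For an arbitrary set of rows, match() can look up the position of each row in replace.in.these.rows first. A minimal sketch, assuming a hypothetical choice of rows 5 and 1 and made-up replacement values:

```r
mat <- matrix(1, 5, 5)
mat[c(1, 8, 10, 14, 16, 22)] <- 0
replace.in.these.rows <- c(5, 1)   # hypothetical: not 1, 2, 3 and not sorted
new.values <- c(95, 91)            # hypothetical values for rows 5 and 1

idxs <- which(mat == 0, arr.ind = TRUE)

# Position of each zero's row within replace.in.these.rows (NA if not selected)
pos <- match(idxs[, "row"], replace.in.these.rows)
keep <- !is.na(pos)

mat[idxs[keep, , drop = FALSE]] <- new.values[pos[keep]]
print(mat)
```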