Diff function transposes matrix when used with apply over rows - r

I need to find the difference between columns in a matrix. The diff() function will work with a matrix for differences between rows for each column, but I need differences between columns for each row. I tried using apply() with diff() over the row index, but this returns a transposed result. Why does it return the transpose and am I using it incorrectly?
# make a small sample matrix
m <- matrix(seq(1, 20, by = 1), nrow = 2, ncol = 10)
# apply the diff function to the rows, I expect a 2 by 10 matrix here, should I just transpose it?
apply(m, 1, diff)
[,1] [,2]
[1,] 2 2
[2,] 2 2
[3,] 2 2
[4,] 2 2
[5,] 2 2
[6,] 2 2
[7,] 2 2
[8,] 2 2
[9,] 2 2

To elaborate on the "transpose effect" of apply:
According to ?apply, apply applies a function to the row vectors (MARGIN = 1) or column vectors (MARGIN = 2) of an array (e.g. a matrix) and returns
an array of dimension ‘c(n, dim(X)[MARGIN])’ if ‘n > 1’
where n is the length of the vector returned by an individual call of the function to either the row or column vector.
So in your case dim(m) is 2 10 (i.e. a 2x10 matrix) and MARGIN = 1, so the array dimension of the return object is 9 2, which means a 9x2 matrix (as diff returns a vector of length n=9).
You can see the same "transpose effect" when you do
apply(m, 1, c)
# [,1] [,2]
# [1,] 1 2
# [2,] 3 4
# [3,] 5 6
# [4,] 7 8
# [5,] 9 10
# [6,] 11 12
# [7,] 13 14
# [8,] 15 16
# [9,] 17 18
#[10,] 19 20

If all what's confusing you is the format 9 2 instead of 2 9, then you are right, just transpose it.
Because you very well are already computing the differences over the columns, i.e.
for(k in 1:nrow(m)) diff(m[k,])
yields exactly what you already have. A simple t(apply(m, 1, diff)) will do the trick.
> t(apply(m, 1, diff))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 2 2 2 2 2 2 2 2 2
[2,] 2 2 2 2 2 2 2 2 2
PS You can't get a 10 2/2 10 structure because diff will obviously return an element of size n - 1.

Related

Subtract vector from one column of a matrix

I'm a complete R novice, and I'm really struggling on this problem. I need to take a vector, evens, and subtract it from the first column of a matrix, top_secret. I tried to call up only that column using top_secret[,1] and subtract the vector from that, but then it only returns the column. Is there a way to do this inside the matrix so that I can continue to manipulate the matrix without creating a bunch of separate columns?
Sure, you can. Here is an example:
m <- matrix(c(1,2,3,4),4,4, byrow = TRUE)
> m
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 1 2 3 4
[3,] 1 2 3 4
[4,] 1 2 3 4
m[,4] <- m[,4] - c(5,5,5,5)
which gives:
> m
[,1] [,2] [,3] [,4]
[1,] 1 2 3 -1
[2,] 1 2 3 -1
[3,] 1 2 3 -1
[4,] 1 2 3 -1
Or another option is replace
replace(m, cbind(seq_len(nrow(m)), 4), m[,4] - 5)
data
m <- matrix(c(1,2,3,4),4,4, byrow = TRUE)

extract every two elements in matrix row in r in sequence to calculate euclidean distance

How to extract every two elements in sequence in a matrix and return the result as a matrix so that I could feed the answer in a formula for calculation:
For example, I have a one row matrix with 6 columns:
[,1][,2][,3][,4][,5][,6]
[1,] 2 1 5 5 10 1
I want to extract column 1 and two in first iteration, 3 and 4 in second iteration and so on. The result has to be in the form of matrix.
[1,] 2 1
[2,] 5 5
[3,] 10 1
My original codes:
data <- matrix(c(1,1,1,2,2,1,2,2,5,5,5,6,10,1,10,2,11,1,11,2), ncol = 2)
Center Matrix:
[,1][,2][,3][,4][,5][,6]
[1,] 2 1 5 5 10 1
[2,] 1 1 2 1 10 1
[3,] 5 5 5 6 11 2
[4,] 2 2 5 5 10 1
[5,] 2 1 5 6 5 5
[6,] 2 2 5 5 11 1
[7,] 2 1 5 5 10 1
[8,] 1 1 5 6 11 1
[9,] 2 1 5 5 10 1
[10,] 5 6 11 1 10 2
objCentroidDist <- function(data, centers) {
resultMatrix <- matrix(NA, nrow=dim(data)[1], ncol=dim(centers)[1])
for(i in 1:nrow(centers)) {
resultMatrix [,i] <- sqrt(rowSums(t(t(data)-centers[i, ])^2))
}
resultMatrix
}
objCentroidDist(data,centers)
I want the Result matrix to be as per below:
[1,][,2][,3]
[1,]
[2,]
[3,]
[4,]
[5,]
[7,]
[8,]
[9,]
[10]
My concern is, how to calculate the data-centers distance if the dimensions of the data matrix are two, and centers matrix are six. (to calculate the distance from the data matrix and every two columns in centers matrix). Each row of the centers matrix has three centers.
Something like this maybe?
m <- matrix(c(2,1,5,5,10,1), ncol = 6)
list.seq.pairs <- lapply(seq(1, ncol(m), 2), function(x) {
m[,c(x, x+1)]
})
> list.seq.pairs
[[1]]
[1] 2 1
[[2]]
[1] 5 5
[[3]]
[1] 10 1
And, in case you're wanting to iterate over multiple rows in a matrix,
you can expand on the above like this:
mm <- matrix(1:18, ncol = 6, byrow = TRUE)
apply(mm, 1, function(x) {
lapply(seq(1, length(x), 2), function(y) {
x[c(y, y+1)]
})
})
EDIT:
I'm really not sure what you're after exactly. I think, if you want each row transformed into a 2 x 3 matrix:
mm <- matrix(1:18, ncol = 6, byrow = TRUE)
list.mats <- lapply(1:nrow(mm), function(x){
a = matrix(mm[x,], ncol = 2, byrow = TRUE)
})
> list.mats
[[1]]
[,1] [,2]
[1,] 1 2
[2,] 3 4
[3,] 5 6
[[2]]
[,1] [,2]
[1,] 7 8
[2,] 9 10
[3,] 11 12
[[3]]
[,1] [,2]
[1,] 13 14
[2,] 15 16
[3,] 17 18
If, however, you want to get to your results matrix- I think it's probably easiest to do whatever calculations you need to do while you're dealing with each row:
results <- t(apply(mm, 1, function(x) {
sapply(seq(1, length(x), 2), function(y) {
val1 = x[y] # Get item one
val2 = x[y+1] # Get item two
val1 / val2 # Do your calculation here
})
}))
> results
[,1] [,2] [,3]
[1,] 0.5000000 0.7500 0.8333333
[2,] 0.8750000 0.9000 0.9166667
[3,] 0.9285714 0.9375 0.9444444
That said, I don't understand what you're trying to do so this may miss the mark. You may have more luck if you ask a new question where you show example input and the actual expected output that you're after, with the actual values you expect.

Preserve structure, when indexing a matrix with another matrix in R

Dear StackOverflowers,
I have an integer matrix in R and I would like to subset it so that I remove 1 specified cell in each column. So that, for instance, a 4x3 matrix becomes a 3x3 matrix. I have tried doing it by creating the second logical matrix of the same dimensions.
(subject.matrix <- matrix(1:12, nrow = 4))
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
(query.matrix <- matrix(c(T, T, F, T, T, F, T, T, T, T, T, F), nrow = 4))
[,1] [,2] [,3]
[1,] TRUE TRUE TRUE
[2,] TRUE FALSE TRUE
[3,] FALSE TRUE TRUE
[4,] TRUE TRUE FALSE
The problem is that, when I index the first matrix by the second one, it is simplified to an integer vector.
subject.matrix[query.matrix]
[1] 1 2 4 5 7 8 9 10 11
I've tried adding drop=F, but to no avail. I know, I can just wrap the resulting vector into a 3x3 matrix. So the expected outcome would be:
matrix(subject.matrix[query.matrix], nrow = 3)
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 7 10
[3,] 4 8 11
But I wonder if there's a more elegant/direct solution. I'm also not attached to using a logical matrix as the index, if that means a simpler solution. Perhaps, I could subset it with a vector of indices for the rows to be removed in each column, which in this case would translate into c(3, 2, 4).
Many thanks!
Edit based on #LyzandeR suggestion: My final goal was to take column sums of the resulting matrix. So replacing the redundant values with NA's seems to be the best way to go.
I think that the only way you can preserve the matrix structure would be to use a more general way of your question edit i.e.:
matrix(subject.matrix[query.matrix], ncol = ncol(subject.matrix))
You could even convert it into a function if you plan on using it multiple times:
subset.mat <- function(mat, index, cols=ncol(mat)) {
matrix(mat[index], ncol = cols)
}
Output:
> subset.mat(subject.matrix, query.matrix)
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 7 10
[3,] 4 8 11
Also (sorry just read your updated comment) you might consider using NAs in the matrix instead of subsetting them out, which will allow you to calculate the column sums as you say:
subject.matrix[!query.matrix] <- NA
subject.matrix
# [,1] [,2] [,3]
#[1,] 1 5 9
#[2,] 2 NA 10
#[3,] NA 7 11
#[4,] 4 8 NA
This is a little brute-forceish, but I think you'll be able to extrapolate it into something more general:
new.matrix = matrix(ncol = ncol(subject.matrix), nrow = nrow(subject.matrix) - 1)
for(i in 1:ncol(subject.matrix)){
new.matrix[,i] = subject.matrix[,i][query.matrix[,i] == TRUE]
}
new.matrix
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 7 10
[3,] 4 8 11
Essentially, I just initialized an empty matrix, and then iterated through each column of subject.matrix taking only the TRUE values for query.matrix.

Data structure to hold multiple matrices

I have an array of strings which are actually names of datasets. I perform several measures on each dataset and get result of each measure in a matrix.
I want to save the results of one dataset in some data structure.
So, for example:
We have a string "glass".
From measurements on dataset "glass" I get 3 matrices a,b,c.
How could I save a,b,c in one structure?
Thanks.
Use a list.
> mydata <- list()
> mydata[[1]] <- matrix(1:4, 2, 2)
> mydata[[2]] <- matrix(1:10, 5, 2)
> mydata[[3]] <- matrix(1:16, 4, 4)
> mydata
[[1]]
[,1] [,2]
[1,] 1 3
[2,] 2 4
[[2]]
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
[4,] 4 9
[5,] 5 10
[[3]]
[,1] [,2] [,3] [,4]
[1,] 1 5 9 13
[2,] 2 6 10 14
[3,] 3 7 11 15
[4,] 4 8 12 16
>
> # To access the first matrix in the list...
> mydata[[1]]
[,1] [,2]
[1,] 1 3
[2,] 2 4
See ?list for more information.
Since they are the same size you can choose either list or a array. Dason showed the list option.
a=matrix(rnorm(16),nrow=4)
b=matrix(rnorm(16),nrow=4)
d=matrix(rnorm(16),nrow=4)
glass=array(c(a,b,d),dim=c(4,4,3))

Form matrix from rows in 3-dimensional array

I have X, a three-dimensional array in R. I want to take a vector of indices indx (length equal to dim(X)[1]) and form a matrix where the first row is the first row of X[ , , indx[1]], the second row is the second row of X[ , , indx[2]], and so on.
For example, I have:
R> X <- array(1:18, dim = c(3, 2, 3))
R> X
, , 1
[,1] [,2]
[1,] 1 4
[2,] 2 5
[3,] 3 6
, , 2
[,1] [,2]
[1,] 7 10
[2,] 8 11
[3,] 9 12
, , 3
[,1] [,2]
[1,] 13 16
[2,] 14 17
[3,] 15 18
R> indx <- c(2, 3, 1)
My desired output is
R> rbind(X[1, , 2], X[2, , 3], X[3, , 1])
[,1] [,2]
[1,] 7 10
[2,] 14 17
[3,] 3 6
As of now I'm using the inelegant (and slow) sapply(1:dim(X)[2], function(x) X[cbind(1:3, x, indx)]). Is there any way to do this using the built-in indexing functions? I had no luck experimenting with the matrix indexing methods described in ?Extract, but I may just be doing it wrong.
Maybe like this:
t(sapply(1:3,function(x) X[,,idx][x,,x]))
I may be answering the wrong question (I can't reconcile your first description and your sample output)... This produces your sample output, but I can't say that it's much faster without running it on your data.
do.call(rbind, lapply(1:dim(X)[1], function(i) X[i, , indx[i]]))
Matrix indexing to the rescue! No applys needed.
Figure out which indices you want:
n <- dim(X)[2]
foo <- cbind(rep(seq_along(indx),n),
rep(seq.int(n), each=length(indx)),
rep(indx,n))
(the result is this)
[,1] [,2] [,3]
[1,] 1 1 2
[2,] 2 1 3
[3,] 3 1 1
[4,] 1 2 2
[5,] 2 2 3
[6,] 3 2 1
and use it as index, converting back to a matrix to make it look like your output.
> matrix(X[foo],ncol=n)
[,1] [,2]
[1,] 7 10
[2,] 14 17
[3,] 3 6

Resources