I have a matrix full of integers and I need to create an index where for each of these integers I get the numbers of columns containing it (using R).
For instance, suppose I have this table:
           [,1]       [,2]      [,3]       [,4]       [,5]       [,6]
[1,]      31738 3136023010 777150982 2318301701         44 3707934113
[2,] 1687741813         44     31738 1284682632  462137835  445275140
[3,]         44        123       123      31738 1215490197        123
In my case, 31738 is present in columns 1, 3 and 4 (elements [1,1], [2,3] and [3,4]),
and 44 is present in columns 1, 2 and 5 (elements [3,1], [2,2] and [1,5]).
So for all elements in my table, I need to have an index like
31738 = 1 3 4
3136023010 = 2
777150982 = 3
44 = 1 2 5
....
123 = 2 3 6
etc
Edit: I corrected my mistake pointed out in the comment below.
We can do
vals <- unique(c(m1))  # unique() on a matrix returns unique rows, so flatten first
setNames(lapply(vals, function(i)
  as.vector(which(m1 == i, arr.ind = TRUE)[, 2])), vals)
Or another option is
split(col(m1), m1)
data
m1 <- structure(c(31738, 1687741813, 44, 3136023010, 44, 123, 777150982,
31738, 123, 2318301701, 1284682632, 31738, 44, 462137835, 1215490197,
3707934113, 445275140, 123), .Dim = c(3L, 6L))
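As a quick check of the split(col(m1), m1) approach, the sketch below rebuilds m1 and looks up a few values. The name idx is introduced here for illustration only, and the unique() step is an addition that matters only if a value can occur more than once in the same column.

```r
m1 <- structure(c(31738, 1687741813, 44, 3136023010, 44, 123, 777150982,
                  31738, 123, 2318301701, 1284682632, 31738, 44, 462137835,
                  1215490197, 3707934113, 445275140, 123), .Dim = c(3L, 6L))

# col(m1) holds the column number of every cell; split() groups those
# numbers by the cell values, and the list names are the values as strings.
idx <- lapply(split(col(m1), m1), unique)

idx[["31738"]]  # 1 3 4
idx[["44"]]     # 1 2 5
idx[["123"]]    # 2 3 6
```

Note that the list is ordered by value (smallest first), not by first appearance in the matrix.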
Related
I have a numerical matrix of size 17 (rows) x 6 (columns). It looks like this:
Now I want to transform this matrix into an array of size 2 (rows) x 3 (columns) x 17 (slices), so that every row of the matrix becomes one slice of the new array: columns 1-3 go to the first row of the slice and columns 4-6 go to the second row.
I have used the numbers from the example to show what slice 1 of this new array looks like (it contains the values of the first row):
How can I transform this matrix to the array I would like to have?
m <- matrix(c(1:12), ncol = 6)
#     [,1] [,2] [,3] [,4] [,5] [,6]
#[1,]    1    3    5    7    9   11
#[2,]    2    4    6    8   10   12
a <- array(t(m), dim = c(3, 2, length(m)/6))
a <- aperm(a, c(2, 1, 3)) #switch first and second dimension
#, , 1
#
#     [,1] [,2] [,3]
#[1,]    1    3    5
#[2,]    7    9   11
#
#, , 2
#
#     [,1] [,2] [,3]
#[1,]    2    4    6
#[2,]    8   10   12
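The same two steps can be wrapped in a small helper and tried on a matrix with the 17 x 6 shape from the question. This is only a sketch: rows_to_slices is a hypothetical name, and the test matrix is made-up data (1 to 102, filled row by row), not the asker's values.

```r
# Hypothetical helper: turn each row of m into one nr x nc slice of an array
rows_to_slices <- function(m, nr, nc) {
  stopifnot(ncol(m) == nr * nc)
  # t(m) puts each original row into one column, so array() reads it row by row
  a <- array(t(m), dim = c(nc, nr, nrow(m)))
  # swap the first two dimensions so each slice is nr x nc
  aperm(a, c(2, 1, 3))
}

m <- matrix(1:(17 * 6), nrow = 17, byrow = TRUE)  # row 1 is 1, 2, ..., 6
a <- rows_to_slices(m, nr = 2, nc = 3)

dim(a)     # 2 3 17
a[, , 1]   # first row 1 2 3, second row 4 5 6
```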
I need to find the difference between columns in a matrix. The diff() function will work with a matrix for differences between rows for each column, but I need differences between columns for each row. I tried using apply() with diff() over the row index, but this returns a transposed result. Why does it return the transpose and am I using it incorrectly?
# make a small sample matrix
m <- matrix(seq(1, 20, by = 1), nrow = 2, ncol = 10)
# apply the diff function to the rows, I expect a 2 by 10 matrix here, should I just transpose it?
apply(m, 1, diff)
[,1] [,2]
[1,] 2 2
[2,] 2 2
[3,] 2 2
[4,] 2 2
[5,] 2 2
[6,] 2 2
[7,] 2 2
[8,] 2 2
[9,] 2 2
To elaborate on the "transpose effect" of apply:
According to ?apply, apply applies a function to the row vectors (MARGIN = 1) or column vectors (MARGIN = 2) of an array (e.g. a matrix) and returns
an array of dimension ‘c(n, dim(X)[MARGIN])’ if ‘n > 1’
where n is the length of the vector returned by an individual call of the function to either the row or column vector.
So in your case dim(m) is 2 10 (i.e. a 2x10 matrix) and MARGIN = 1, so the array dimension of the return object is 9 2, which means a 9x2 matrix (as diff returns a vector of length n=9).
You can see the same "transpose effect" when you do
apply(m, 1, c)
# [,1] [,2]
# [1,] 1 2
# [2,] 3 4
# [3,] 5 6
# [4,] 7 8
# [5,] 9 10
# [6,] 11 12
# [7,] 13 14
# [8,] 15 16
# [9,] 17 18
#[10,] 19 20
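The dimension rule quoted from ?apply can be checked directly on the sample matrix. The calls below are a small sketch; range and sum are used only as convenient examples of functions that return 2 values and 1 value respectively.

```r
m <- matrix(1:20, nrow = 2, ncol = 10)

dim(apply(m, 1, diff))    # 9 2: n = 9 values per row, 2 rows
dim(apply(m, 2, range))   # 2 10: n = 2 values per column, 10 columns
length(apply(m, 1, sum))  # 2: with n = 1 the result is a plain vector
```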
If all that's confusing you is the 9 x 2 shape instead of 2 x 9, then you are right: just transpose it.
You are in fact already computing the differences between columns for each row, i.e.
for (k in seq_len(nrow(m))) print(diff(m[k, ]))
prints exactly the values you already have. A simple t(apply(m, 1, diff)) will do the trick.
> t(apply(m, 1, diff))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 2 2 2 2 2 2 2 2 2
[2,] 2 2 2 2 2 2 2 2 2
PS: You can't get a 10 x 2 or 2 x 10 result, because diff applied to a vector of length n returns a vector of length n - 1.
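If the transpose step feels awkward, the same row-wise differences can also be computed without apply by subtracting shifted copies of the matrix. This is a different technique from the one used in the answers, shown only as a sketch; d1 and d2 are illustrative names.

```r
m <- matrix(1:20, nrow = 2, ncol = 10)

# apply-based version, transposed back to one row per original row
d1 <- t(apply(m, 1, diff))

# vectorised version: columns 2..10 minus columns 1..9
d2 <- m[, -1, drop = FALSE] - m[, -ncol(m), drop = FALSE]

dim(d2)        # 2 9
all(d1 == d2)  # TRUE
```

The drop = FALSE arguments keep the result a matrix even when m has a single row.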
I am trying to map matching columns between 2 matrices. For simplicity, I have 2 simple matrices, a and b:
a <- matrix(c(1, 2), nrow = 2, ncol = 2)
b <- matrix(c(1,2,1,2,3:8), nrow = 2, ncol = 5)
> a
[,1] [,2]
[1,] 1 1
[2,] 2 2
> b
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 3 5 7
[2,] 2 2 4 6 8
I want to create a vector of length length(a[, 1]) = 2, ie
> out
[1] 1 2
Where the first element of out is the column number in b that matches the first column of a, and the second element of out is the column number in b that matches the second column in a. I have tried
> match(data.frame(a), data.frame(b))
[1] 1 1
but I need each element of the resulting vector to be unique. Probably simple solution, but I am not seeing it. Thanks!
Maybe you are looking for something like intersect.
a <- matrix(c(10, 20), nrow = 2, ncol = 2)
b <- matrix(c(10,20,1,2,3:6,10,20), nrow = 2, ncol = 5)
#> b
# [,1] [,2] [,3] [,4] [,5]
#[1,] 10 1 3 5 10
#[2,] 20 2 4 6 20
# Find the columns of b that match a column of a. Only the 1st column of a is considered.
# a[, 1] is recycled down each column of b, so every column is compared with it.
matched <- b == a[, 1]
#> matched
# [,1] [,2] [,3] [,4] [,5]
#[1,] TRUE FALSE FALSE FALSE TRUE
#[2,] TRUE FALSE FALSE FALSE TRUE
desired <- which(matched[1,], arr.ind = TRUE)
#> desired
#[1] 1 5
The matching columns, 1 and 5, are returned.
I guess I'm not allowed to comment on here. Anyhoo... the above answer by MKR looks good, but I would add this line before creating the "desired" object. This is to ensure every element of a column matches (instead of testing the first row only). Since matched then becomes a logical vector rather than a matrix, desired should be computed as which(matched).
matched<-sapply(1:ncol(matched),function(x) all(matched[,x]))
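To get the unique mapping the question asks for (out = 1 2), one possible sketch is to compare every column of a with every column of b and then hand out each column of b at most once. The names eq, used and out are illustrative, and taking the first unused match is an assumption about what "unique" should mean here.

```r
a <- matrix(c(1, 2), nrow = 2, ncol = 2)
b <- matrix(c(1, 2, 1, 2, 3:8), nrow = 2, ncol = 5)

# eq[i, j] is TRUE when column i of a equals column j of b in every row
eq <- outer(seq_len(ncol(a)), seq_len(ncol(b)),
            Vectorize(function(i, j) all(a[, i] == b[, j])))

out  <- integer(ncol(a))
used <- logical(ncol(b))
for (i in seq_len(ncol(a))) {
  j <- which(eq[i, ] & !used)[1]   # first unused matching column, NA if none
  out[i] <- j
  if (!is.na(j)) used[j] <- TRUE
}
out  # 1 2
```

A column of a with no remaining match in b gets NA in out.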
I have a data frame like this:
V1 V2 V3
1 1 1 0.0000000
2 2 1 1.6646331
3 3 1 1.6649136
4 4 1 1.7420642
5 5 1 1.4441743
6 6 1 0.7544465
7 7 1 1.9796860
8 8 1 1.0744425
9 9 1 2.1503288
10 10 1 1.0408388
11 11 1 2.0822162
....
841 29 29 0.0000000
I want to convert this data frame to a matrix. In this matrix, V2 should be the row index and V1 the column index:
[1] [2] [3] [4] ....
[1] 0.0000000 1.6646331 1.6649136 1.7420642...
How can I do that in R?
Assuming your matrix has no gaps and the values are stored in row order (i.e. row 1, columns 1:10; row 2, columns 1:10; and so on), then...
Take the values in the appropriate column (V3 in your case) and reshape them according to your matrix size parameters:
m <- matrix( df$V3 , ncol = max(df$V1) , nrow = max(df$V2) , byrow = TRUE )
#[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
#[1,] 0 1.664633 1.664914 1.742064 1.444174 0.7544465 1.979686 1.074442 2.150329 1.040839 2.082216
If you need to match the values (i.e. running order is not continuous) then you can take advantage of vectorised matrix subscripting like so....
# Create empty matrix
m <- matrix( NA, ncol = max(df$V1) , nrow = max(df$V2) )
# Then fill with values at defined locations
m[ cbind( df$V2 , df$V1 ) ] <- df$V3
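A tiny example may make the matrix-subscripting approach easier to follow. The data frame below is made up for illustration (2 x 2, with rows deliberately out of order), and m2 is an illustrative name for the result of the byrow approach after sorting.

```r
# Made-up long-format data: V1 = column index, V2 = row index, V3 = value
df <- data.frame(V1 = c(2, 1, 2, 1),
                 V2 = c(1, 1, 2, 2),
                 V3 = c(0.5, 0, 0, 0.5))

# Fill by position: each row of cbind(row, col) addresses one cell
m <- matrix(NA_real_, nrow = max(df$V2), ncol = max(df$V1))
m[cbind(df$V2, df$V1)] <- df$V3

# Same result via byrow = TRUE, but only after sorting by row, then column
ord <- df[order(df$V2, df$V1), ]
m2 <- matrix(ord$V3, nrow = max(df$V2), byrow = TRUE)

m
all(m == m2)  # TRUE
```

Cells that have no entry in the data frame stay NA with the subscripting approach, whereas the byrow approach would silently shift values.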
You need to do two things:
1. Reorder your columns
2. Transpose (using the t function)
So first create some example data
d = data.frame(V1=runif(5), V2= 5+runif(5), V3 = 10+runif(5))
then
t(d[, ncol(d):1])
or
t(d)[ncol(d):1, ]
This should do the job (where df is your data frame)
m <- do.call(cbind, df)
I have data in a CSV file in a single column with 6954 values. I want to split this column into multiple columns so that each column holds 122 values: the first column has the first 122 values, the next column has the next 122, and so on. I believe the final matrix will have 122 rows and 57 columns. Any help will be appreciated.
Thanks
Like this?
x <- rep(1:122, 5)
xx <- matrix(x, nrow=122)
xx[1:5, ]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 2 2 2 2 2
[3,] 3 3 3 3 3
[4,] 4 4 4 4 4
[5,] 5 5 5 5 5
Or this will do the trick as well:
x = 1:6954
dim(x) <- c(122, 57)
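For the full 6954-value case, a sketch along the same lines is shown below. The vector x stands in for the CSV column; the file name in the comment is hypothetical, and the divisibility check is an added safeguard rather than part of the answers above.

```r
# x <- read.csv("data.csv", header = FALSE)[[1]]  # hypothetical file name
x <- seq_len(6954)                  # stand-in for the single CSV column

stopifnot(length(x) %% 122 == 0)    # matrix() would recycle values otherwise
xx <- matrix(x, nrow = 122)         # filled column by column

dim(xx)    # 122 57
xx[1, 2]   # 123, the first value of the second block of 122
```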
A column can also be split with the colsplit function, which is part of the reshape package:
http://r.ramganalytics.com/r/split-a-column-by-a-character-using-colsplit-function/