Indexing a vector by an array in R - r

In MATLAB and numpy, you can index a vector by an array of indices and get a result of the same shape out, e.g.
A = [1 1 2 3 5 8 13];
B = [1 2; 2 6; 7 1; 4 4];
A(B)
## ans =
##
## 1 1
## 1 8
## 13 1
## 3 3
or
import numpy as np
a = np.array([1, 1, 2, 3, 5, 8, 13])
b = np.reshape(np.array([0, 1, 1, 5, 6, 0, 3, 3]), (4, 2))
a[b]
## array([[ 1, 1],
## [ 1, 8],
## [13, 1],
## [ 3, 3]])
However, in R, indexing a vector by an array of indices returns a vector:
a <- c(1, 1, 2, 3, 5, 8, 13)
b <- matrix(c(1, 2, 7, 4, 2, 6, 1, 4), nrow = 4)
a[b]
## [1] 1 1 13 3 1 8 1 3
Is there an idiomatic way in R to perform vectorized lookup that preserves array shape?

You can't specify dimensions through subsetting alone in R (AFAIK). Here is a workaround:
`dim<-`(a[b], dim(b))
Produces:
[,1] [,2]
[1,] 1 1
[2,] 1 8
[3,] 13 1
[4,] 3 3
Calling `dim<-`(...) as an ordinary function lets us use the dimension-setting function dim<- for its return value rather than for its side effect, which is how it is normally used.
You can also do stuff like:
t(apply(b, 1, function(idx) a[idx]))
but that will be slow.
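As a minimal sketch of the same idea, the reshaping step can be wrapped in a small helper. The name index_keep_shape is made up for illustration; it relies only on base R's array(), which rebuilds the looked-up values with the dimensions of the index array, and it assumes the index actually has a dim attribute (a matrix or higher-rank array).

```r
# Hypothetical helper (not from the answers above): look up x at idx
# and give the result the same dimensions as idx.
# Assumes idx is a matrix or array, i.e. dim(idx) is not NULL.
index_keep_shape <- function(x, idx) {
  array(x[idx], dim = dim(idx))
}

a <- c(1, 1, 2, 3, 5, 8, 13)
b <- matrix(c(1, 2, 7, 4, 2, 6, 1, 4), nrow = 4)

res <- index_keep_shape(a, b)
res
```

Because the dimensions are taken from the index rather than hard-coded, the same call should also cover three-dimensional index arrays.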

This is not very elegant, but it works
matrix(a[b],nrow=nrow(b))

Option 1: if we do not need to keep the original values in b, we can simply assign into it.
Caveat: the values in b will be overwritten.
b[] = a[b]
b
# [,1] [,2]
# [1,] 1 1
# [2,] 1 8
# [3,] 13 1
# [4,] 3 3
Option 2: if we want to retain the values in b, an easy workaround is to work on a copy:
c = b # copy b to c
c[] = a[c]
c
# [,1] [,2]
# [1,] 1 1
# [2,] 1 8
# [3,] 13 1
# [4,] 3 3
I find Option 2 clean and easy to follow.
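One property of the empty-bracket assignment is worth illustrating: because only the contents are replaced, attributes such as dimnames carry over from the index matrix. A small sketch (the row and column names are invented for the example, and out is just an illustrative variable name):

```r
a <- c(1, 1, 2, 3, 5, 8, 13)
b <- matrix(c(1, 2, 7, 4, 2, 6, 1, 4), nrow = 4,
            dimnames = list(c("r1", "r2", "r3", "r4"), c("x", "y")))

out <- b          # copy, so b itself is left untouched
out[] <- a[b]     # replace the contents only; dim and dimnames are kept
out
```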

Related

Converting a list of lists into a data frame

I have a function that first generates a list of vectors (generated by using lapply), and then cbinds it to a column vector. I thought this would produce a dataframe. However, it produces a list of lists.
The cbind function isn't working as I thought it would.
Here's a small example of what the function is generating
col_test <- c(1, 2, 1, 1, 2)
lst_test <- list(c(1, 2 , 3), c(2, 2, 2), c(1, 1, 2), c(1, 2, 2), c(1, 1, 1))
a_df <- cbind(col_test, lst_test)
Typing
> a_df[1,]
gives the output
$`col_test`
[1] 1
$lst_test
[1] 1 2 3
I'd like the data frame to be
[,1] [,2] [,3] [,4]
[1,] 1 1 2 3
[2,] 2 2 2 2
[3,] 1 1 1 2
[4,] 1 1 2 2
[5,] 2 1 1 1
How do I get it into this form?
data.frame(col_test,t(as.data.frame(lst_test)))
do.call(rbind, Map(c, col_test, lst_test))
# [,1] [,2] [,3] [,4]
#[1,] 1 1 2 3
#[2,] 2 2 2 2
#[3,] 1 1 1 2
#[4,] 1 1 2 2
#[5,] 2 1 1 1
col_test <- c(1, 2, 1, 1, 2)
lst_test <- list(c(1, 2 , 3), c(2, 2, 2), c(1, 1, 2), c(1, 2, 2), c(1, 1, 1))
Name the sublists so we can use bind_rows (from dplyr):
library(dplyr)
names(lst_test) <- 1:length(lst_test)
lst_test1 <- bind_rows(lst_test)
bind_rows binds by columns in this case, so we need to transpose the result:
lst_test_pivot <- t(lst_test1)
But this gives us a matrix, so we need to convert it back to a data frame:
lst_test_pivot_df <- as.data.frame(lst_test_pivot)
Now cbind works as intended:
cbind(col_test, lst_test_pivot_df)
which produces
col_test V1 V2 V3
1 1 1 2 3
2 2 2 2 2
3 1 1 1 2
4 1 1 2 2
5 2 1 1 1
This should do the trick. Note that we are using do.call so that the individual elements of lst_test are sent as parameters to cbind, which prevents cbind from creating a list-of-lists. t is used to transpose the resulting matrix to your preferred orientation, and finally, one more cbind with col_test inserts that data as well.
library(tidyverse)
mat.new <- do.call(cbind, lst_test) %>%
t %>%
cbind(col_test, .) %>%
unname
[,1] [,2] [,3] [,4]
[1,] 1 1 2 3
[2,] 2 2 2 2
[3,] 1 1 1 2
[4,] 1 1 2 2
[5,] 2 1 1 1
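For comparison, a base-R sketch of the same result without the pipe. It assumes, as in the example, that every element of lst_test has the same length; with unequal lengths rbind would recycle the shorter vectors and warn.

```r
col_test <- c(1, 2, 1, 1, 2)
lst_test <- list(c(1, 2, 3), c(2, 2, 2), c(1, 1, 2), c(1, 2, 2), c(1, 1, 1))

# Stack the list elements as rows, then prepend col_test as the first column.
mat <- unname(cbind(col_test, do.call(rbind, lst_test)))
mat

# Wrap in as.data.frame() if a data frame rather than a matrix is wanted.
df <- as.data.frame(mat)
```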

Selecting all the row combinations from a matrix

I have a matrix consisting of 10 rows.
I would like to form combinations of these rows using R. For example:
M= matrix(c(
1,2,3,4,
5,6,7,3,
5,5,4,8,
5,2,7,8,
4,8,7,8,
2,6,7,9,
5,6,7,4,
5,6,7,2,
5,6,7,3,
5,6,7,0),nrow=10, byrow=TRUE)
First step:
choose 3 rows out of the 10.
This gives 120 matrices of size 3x4, each built from rows of M.
Second step:
choose 6 rows out of the 10.
This gives 210 matrices of size 6x4, also built from rows of M.
You can split the matrix into a list of rows with apply and then use the combn function as below:
M <- structure(c(1, 5, 5, 5, 4, 2, 5, 5, 5, 5, 2, 6, 5, 2, 8, 6, 6,
6, 6, 6, 3, 7, 4, 7, 7, 7, 7, 7, 7, 7, 4, 3, 8, 8, 8, 9, 4, 2,
3, 0), .Dim = c(10L, 4L))
x <- apply(M, 1, list)
# combinations for three rows
cmbs3 <- combn(x, 3)
ncol(cmbs3)
# 120
cmbs3[, 2]
# second combination
# [[1]]
# [[1]][[1]]
# [1] 1 2 3 4
#
#
# [[2]]
# [[2]][[1]]
# [1] 5 6 7 3
#
#
# [[3]]
# [[3]][[1]]
# [1] 5 2 7 8
# combinations for six rows
cmbs6 <- combn(x, 6)
ncol(cmbs6)
# 210
EDIT:
Or use the elegant solution provided by nicola, subsetting by the row indices generated by combn (I like it much more :):
lapply(combn(10, 3, simplify = FALSE), function(x) M[x, ])
Output:
[[1]]
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 5 6 7 3
[3,] 5 5 4 8
[[2]]
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 5 6 7 3
[3,] 5 2 7 8
...
[[119]]
[,1] [,2] [,3] [,4]
[1,] 5 6 7 4
[2,] 5 6 7 3
[3,] 5 6 7 0
[[120]]
[,1] [,2] [,3] [,4]
[1,] 5 6 7 2
[2,] 5 6 7 3
[3,] 5 6 7 0
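The same idea generalises to any subset size. The sketch below wraps it in a helper whose name, row_combinations, is made up here; drop = FALSE is added so that a subset of size one would still come back as a matrix.

```r
M <- matrix(c(
  1, 2, 3, 4,
  5, 6, 7, 3,
  5, 5, 4, 8,
  5, 2, 7, 8,
  4, 8, 7, 8,
  2, 6, 7, 9,
  5, 6, 7, 4,
  5, 6, 7, 2,
  5, 6, 7, 3,
  5, 6, 7, 0), nrow = 10, byrow = TRUE)

# Hypothetical helper: one k-row sub-matrix per combination of row indices.
row_combinations <- function(M, k) {
  lapply(combn(nrow(M), k, simplify = FALSE),
         function(rows) M[rows, , drop = FALSE])
}

sub3 <- row_combinations(M, 3)
sub6 <- row_combinations(M, 6)

length(sub3)  # choose(10, 3) sub-matrices
length(sub6)  # choose(10, 6) sub-matrices
```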

How to compare single values of a vector with matrix and if they occur take values from another matrix with the same position?

I'm a programming beginner and I'm not able to solve this problem:
I have a vector length 132 and two matrices A and B with the size of 132x24. I would like to take every single value of the vector and compare it rowwise with matrix A. If the value occurs in A I want to have the index of the column to go to matrix B and pick the value from the column with the same position (row and column indices) as in matrix A. The results should be given back as a vector with the same length of 132.
How to do this? Do I need a for loop or are there some smart ways to work with packages?
Unfortunately I can not give example data.
Thank you for your help!
# vector v contains values that I want to compare with matrix A
> v
[1] 5 1 10 1 7
# each value v[i] occurs exactly once in the corresponding row A[i, ]
# I want to have the position of this value in matrix A
> A
[,1] [,2] [,3] [,4]
[1,] 5 7 4 1
[2,] 14 1 3 3
[3,] 13 3 1 10
[4,] 2 1 5 8
[5,] 13 2 5 7
# the position in matrix A equals the position in matrix B
# now the values of B have to be returned as a vector
> B
[,1] [,2] [,3] [,4]
[1,] 6 3 4 3
[2,] 5 2 5 5
[3,] 4 6 3 1
[4,] 3 6 1 5
[5,] 2 4 6 3
# vector with fitting values of B
> x
[1] 6 2 1 6 3
v <- c(5, 1, 10, 1, 7)
A <- matrix(c(
5, 7, 4, 1,
14, 1, 3, 3,
13, 3, 1, 10,
2, 1, 5, 8,
13, 2, 5, 7), 5, byrow = TRUE)
B <- matrix(c(
6, 3, 4, 3,
5, 2, 5, 5,
4, 6, 3, 1,
3, 6, 1, 5,
2, 4, 6, 3), 5, byrow = TRUE)
myfun <- function(i) which(v[i]==A[i,])
ii <- 1:length(v)
B[cbind(ii, sapply(ii, myfun))]
The function myfun() is quick and dirty: it assumes that v[i] occurs exactly once in row A[i, ].
To test if your data are ok you can calculate how often the value v[i] is found in the row A[i,]
countv <- function(i) sum(v[i]==A[i,])
all(sapply(ii, countv)==1) ### should be TRUE
If you get FALSE then inspect:
which(sapply(ii, countv)!=1)
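A loop-free variant is also possible. This is a sketch under the same assumption as above (v[i] occurs exactly once in row i of A, and length(v) equals nrow(A)): comparing A == v recycles v down the columns, so row i is compared with v[i], and which(..., arr.ind = TRUE) returns the matching row/column pairs. The names hit and pos are illustrative.

```r
v <- c(5, 1, 10, 1, 7)
A <- matrix(c(
   5, 7, 4,  1,
  14, 1, 3,  3,
  13, 3, 1, 10,
   2, 1, 5,  8,
  13, 2, 5,  7), 5, byrow = TRUE)
B <- matrix(c(
  6, 3, 4, 3,
  5, 2, 5, 5,
  4, 6, 3, 1,
  3, 6, 1, 5,
  2, 4, 6, 3), 5, byrow = TRUE)

hit <- A == v                      # v is recycled column-wise: row i vs v[i]
stopifnot(rowSums(hit) == 1)       # exactly one match per row is assumed

pos <- which(hit, arr.ind = TRUE)  # (row, col) pairs, in column-major order
pos <- pos[order(pos[, "row"]), , drop = FALSE]  # reorder to follow v
x <- B[pos]                        # matrix indexing picks B[row, col] pairs
x
```

The reordering step matters because which() walks the matrix column by column, so without it the values would not line up with v.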
Alright, I'm not sure how you pictured your output, but I've got something that comes near.
Example data:
x <- 1:132
set.seed(123)
A <- matrix(sample(1:1000, size = 132*24, replace = TRUE), nrow = 132, ncol = 24)
B <- matrix(rnorm(132*24), nrow = 132, ncol = 24)
Now we check for every value of vector x if and where it occurs in every row of matrix A:
x.vs.A <- sapply(x, function(x){
apply(A, 1, function(y) {
match(x, y)
})
})
This gives us a matrix x.vs.A with 132 rows (the rows of A) and 132 columns (the values of x). Within the cells of this matrix, we will find either NA, if the combination of one value of x and one row of A was unsuccessful, or the column position within A of the FIRST match of the value of x.
And now we extract the row positions and bind them together with the cell values, which give the second (column) dimension of each match. Thus we create, for every value of x, a matrix of row/column positions of matches in matrix A:
x.in.A <- apply(x.vs.A, 2, function(x) cbind(which(!is.na(x)), x[!is.na(x)]))
Example:
> x.in.A[[1]]
[,1] [,2]
[1,] 12 17
[2,] 42 17
[3,] 73 12
[4,] 123 21
This would show that the first value in vector x can be found in A[12, 17], in A[42, 17] and so on.
Now access these values in B, returning vectors for each value of x, and bind them to the matrices in the list:
x.in.B <- lapply(x.in.A, function(x){
apply(x, 1, function(y){
B[y[1], y[2]]
})
})
x.in.AB <- mapply(function(x, y) cbind(x, y),
x.in.A, x.in.B)
> x.in.AB[[1]]
y
[1,] 12 17 -0.2492526
[2,] 42 17 -0.7985330
[3,] 73 12 0.1253824
[4,] 123 21 -0.9704919

All possible combinations of two vectors while keeping the order in R

I have a vector, say vec1, and another vector named vec2 as follows:
vec1 = c(4,1)
# [1] 4 1
vec2 = c(5,3,2)
# [1] 5 3 2
What I'm looking for is all possible combinations of vec1 and vec2 while the order of the vectors' elements is kept. That is, the resultant matrix should be like this:
> res
[,1] [,2] [,3] [,4] [,5]
[1,] 4 1 5 3 2
[2,] 4 5 1 3 2
[3,] 4 5 3 1 2
[4,] 4 5 3 2 1
[5,] 5 4 1 3 2
[6,] 5 4 3 1 2
[7,] 5 4 3 2 1
[8,] 5 3 4 1 2
[9,] 5 3 4 2 1
[10,] 5 3 2 4 1
# res=structure(c(4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 1, 5, 5, 5, 4, 4, 4,
# 3, 3, 3, 5, 1, 3, 3, 1, 3, 3, 4, 4, 2, 3, 3, 1, 2, 3, 1, 2, 1,
# 2, 4, 2, 2, 2, 1, 2, 2, 1, 2, 1, 1), .Dim = c(10L, 5L))
There is no repetition allowed for two vectors. That is, all rows of the resultant matrix have unique elements.
I'm actually looking for the most efficient way. One way to tackle this problem is to generate all possible permutations of length n which grows factorially (n=5 here) and then apply filtering. But it's time-consuming as n grows.
Is there an efficient way to do that?
Try this one:
nv1 <- length(vec1)
nv2 <- length(vec2)
n <- nv1 + nv2
result <- combn(n,nv1,function(v) {z=integer(n);z[v]=vec1;z[-v]=vec2;z})
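The idea is to produce all combinations of indices at which to put the elements of vec1; the remaining positions are filled with the elements of vec2 in order. Each column of result is one arrangement, so t(result) gives the row-wise layout shown in the question.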
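Written out in full, with the transpose that turns each arrangement into a row, the approach might look like the sketch below. The function name interleave_ordered is made up for illustration, and both input vectors are assumed to be non-empty.

```r
vec1 <- c(4, 1)
vec2 <- c(5, 3, 2)

# Hypothetical wrapper: all ways to merge x and y while keeping the
# internal order of each vector. Assumes x and y are both non-empty.
interleave_ordered <- function(x, y) {
  n <- length(x) + length(y)
  # Each combination gives the positions taken by x; y fills the rest.
  cols <- combn(n, length(x), function(pos) {
    z <- numeric(n)
    z[pos] <- x
    z[-pos] <- y
    z
  })
  t(cols)  # one arrangement per row
}

res <- interleave_ordered(vec1, vec2)
nrow(res)  # choose(5, 2) arrangements
```

Only choose(n, length(vec1)) candidates are generated, so nothing has to be filtered out afterwards, unlike the permutation-based approach mentioned in the question.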
Not as elegant as Marat Talipov's solution, but you can do:
# get the ordering per vector
cc <- c(order(vec1,decreasing = T), order(vec2, decreasing = T)+length(vec1))
cc
[1] 1 2 3 4 5
# permutation to get all "order-combinations"
library(combinat)
m <- do.call(rbind, permn(cc))
# keep only the permutations in which the indices of both vectors stay in order:
gr <- apply(m, 1, function(x){
!is.unsorted(x[x < (length(vec1)+1)]) & !is.unsorted(x[x > (length(vec1))])
})
# result, exchange the order index with the vector elements:
t(apply(m[gr, ], 1, function(x, y) y[x], c(vec1, vec2)))
[,1] [,2] [,3] [,4] [,5]
[1,] 4 1 5 3 2
[2,] 4 5 3 1 2
[3,] 4 5 3 2 1
[4,] 4 5 1 3 2
[5,] 5 4 1 3 2
[6,] 5 4 3 2 1
[7,] 5 4 3 1 2
[8,] 5 3 4 1 2
[9,] 5 3 4 2 1
[10,] 5 3 2 4 1

Combine multiple cells into one cell in R

I want to combine vectors of values, each currently saved as a row in a matrix, into single cells, with values separated by commas.
My current code creates random vectors.
For instance,
## Group 1
N <- 10
set.seed(06510)
grp1 <- t(replicate(N, sample(seq(1:4), 4, replace = FALSE)) )
The results look like
Table 1:
[,1] [,2] [,3] [,4]
[1,] 2 4 3 1
[2,] 4 2 1 3
[3,] 2 4 1 3
[4,] 1 4 3 2
[5,] 1 3 2 4
[6,] 2 1 3 4
[7,] 4 3 2 1
[8,] 4 1 3 2
[9,] 2 4 3 1
[10,] 1 4 2 3
But I want the results to look like:
Table 2:
[,1]
[1,] 2,4,3,1
[2,] 4,2,1,3
[3,] 2,4,1,3
[4,] 1,4,3,2
[5,] 1,3,2,4
[6,] 2,1,3,4
[7,] 4,3,2,1
[8,] 4,1,3,2
[9,] 2,4,3,1
[10,] 1,4,2,3
I'm creating a randomization table and each cell represents the ordering of 4 survey questions for each survey respondent. Ultimately, I want to create multiple columns like the one above, so maintaining 4 columns for every randomization item will make for a big hard-to-read randomization table.
Your main problem is that you need to use the function I() to protect the list structure. Your second problem is that you need a list from replicate(), which by default returns a matrix (because the vectors all have equal length). Set simplify = FALSE and note where the transpose operation t occurs:
grp1 <- replicate(N, t( sample(seq(1:4), 4, replace = FALSE ) ) , simplify = FALSE )
as.data.frame( I(grp1) )
# I(grp1)
#1 2, 4, 3, 1
#2 4, 2, 1, 3
#3 2, 4, 1, 3
#4 1, 4, 3, 2
#5 1, 3, 2, 4
#6 2, 1, 3, 4
#7 4, 3, 2, 1
#8 4, 1, 3, 2
#9 2, 4, 3, 1
#10 1, 4, 2, 3
# And just to check...
sapply( as.data.frame( I(grp1) ) , mode )
I(grp1)
"list"
However, I don't know why this is more useful to you than a plain old data.frame or probably even better for your use-case, a list of matrices.
Whatever you do, you will end up with characters, I hope you are not surprised by that.
apply(grp1,1,paste,collapse=",")
gives you a vector result. You can turn that into a matrix like this:
matrix(apply(grp1,1,paste,collapse=","),ncol=1)
See ?apply. apply() is enormously useful.
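For instance, on a small hand-written matrix (used here instead of the random grp1 so that the result does not depend on the random number generator; the names m, rows_as_text and one_col are illustrative):

```r
m <- matrix(c(2, 4, 3, 1,
              4, 2, 1, 3,
              2, 4, 1, 3), nrow = 3, byrow = TRUE)

# One comma-separated string per row.
rows_as_text <- apply(m, 1, paste, collapse = ",")
rows_as_text

# The same strings as a one-column character matrix.
one_col <- matrix(rows_as_text, ncol = 1)
one_col
```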
