R: which samples were drawn randomly from a matrix?

I am sampling columns at random within each row of a matrix, and I am trying to determine which columns were sampled. The function sample does not appear to report which positions were actually drawn. A simple matching routine would solve the problem if all values were unique, but in my case they are not, so matching will not work.
x <- c(2,3,5,1,6,7,2,3,5,6,3,5)
y <- matrix(x,ncol=4,nrow=3)
random <- t(apply(y,1,sample,2,replace=FALSE))
y
[,1] [,2] [,3] [,4]
[1,] 2 1 2 6
[2,] 3 6 3 3
[3,] 5 7 5 5
random
[,1] [,2]
[1,] 2 6
[2,] 3 3
[3,] 5 5
With repeated values in the original matrix, I cannot tell if random[1,1] was sampled from column 1 or column 3, since they both have a value of 2. Hence, matching won't work here.
Alongside the matrix "random" I would also like an identically sized matrix that gives the column from which each value was sampled. For example:
[,1] [,2]
[1,] 1 4
[2,] 1 3
[3,] 3 4
Thanks!

You need to save your random selections from sample separately so you don't have to worry about matching later. E.g., using y again:
y
# [,1] [,2] [,3] [,4]
#[1,] 2 1 2 6
#[2,] 3 6 3 3
#[3,] 5 7 5 5
set.seed(42)
randkey <- t(replicate(nrow(y),sample(1:ncol(y),2)))
# [,1] [,2]
#[1,] 4 3
#[2,] 2 3
#[3,] 3 2
random <- matrix(y[cbind(c(row(randkey)), c(randkey))], nrow(y))
# [,1] [,2]
#[1,] 6 2
#[2,] 6 3
#[3,] 5 7
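The two steps above can be wrapped in one helper that returns both matrices. This is only a sketch: the function name sample_cols and its argument k are made up for illustration, and matrix(..., byrow = TRUE) is used instead of t(replicate(...)) so that k = 1 still yields a one-column matrix.

```r
# Hypothetical helper: sample k columns per row of y (without replacement)
# and record which columns were drawn.
sample_cols <- function(y, k) {
  # replicate() gives one set of k column indices per row of y;
  # filling by row puts the indices for row i of y into row i of cols
  cols <- matrix(replicate(nrow(y), sample.int(ncol(y), k)),
                 ncol = k, byrow = TRUE)
  # Matrix indexing: pair every row number with its sampled column number
  values <- matrix(y[cbind(c(row(cols)), c(cols))], nrow(y))
  list(values = values, cols = cols)
}

y <- matrix(c(2,3,5,1,6,7,2,3,5,6,3,5), ncol = 4, nrow = 3)
set.seed(42)
res <- sample_cols(y, 2)
res$cols    # column each value was sampled from
res$values  # the sampled values themselves
```

Because the column indices are generated first and the values are looked up from them, repeated values in y never cause ambiguity.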

Related

Duplicating rows in R matrix

I have a small matrix, say
x <- matrix(1:10, nrow = 5) # values 1:10 across 5 rows and 2 columns
The result is
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
[4,] 4 9
[5,] 5 10
What I want to be able to do now is duplicate random rows in x; for example, producing
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 5 10
[4,] 4 9
[5,] 5 10
I believe the R function 'rep()' is the solution and also 'sample()', but I don't want to have to specify the size argument in sample(); i.e., I want an arbitrary number of rows to be duplicated each time.
Is there a simple way of accomplishing this using rep() and sample()?
We can use the sample function. I've used set.seed for reproducibility; if you remove that line, the results will change from run to run.
set.seed(1848) # reproducibility
x[sample(x = nrow(x), size = nrow(x), replace = TRUE), ]
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 5 10
[4,] 1 6
[5,] 5 10
Another option is to sample one row number and overwrite that row with another randomly sampled row:
x[sample(1:nrow(x),1),] <- x[sample(1:nrow(x),1),]
x
# [,1] [,2]
#[1,] 5 10
#[2,] 2 7
#[3,] 3 8
#[4,] 4 9
#[5,] 5 10
Or, to overwrite up to 3 random rows, a solution could be:
x[sample(1:nrow(x),3),] <- x[sample(1:nrow(x),3),]
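Note that neither snippet above guarantees a duplicate, because the two independent sample() calls can pick the same rows. Here is a sketch that draws the number of overwritten rows at random, assuming at least one original row has to survive. The names n, k, target, keep, src and dup are only illustrative.

```r
x <- matrix(1:10, nrow = 5)
set.seed(1848)

n <- nrow(x)
k <- sample.int(n - 1, 1)            # how many rows to overwrite (1 to n - 1)
target <- sample.int(n, k)           # rows that will be overwritten
keep <- setdiff(seq_len(n), target)  # rows left untouched
# Index into keep rather than calling sample(keep, ...): when keep has
# length 1, sample() would treat that number as the upper bound of 1:keep.
src <- keep[sample.int(length(keep), k, replace = TRUE)]

dup <- x
dup[target, ] <- x[src, ]            # each target row becomes a copy of a kept row
dup
```

Every overwritten row is a copy of a row that is still present, so the result always contains exactly k duplicated rows.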

R: How to permute only column in a data frame/matrix

I want to create x randomised matrices in which only the columns are permuted and the rows are kept constant. I already took a look at permatfull() in the vegan package. Nevertheless, I was not able to generate the desired result, even though I am quite sure that this should be possible somehow.
df = matrix(c(2,3,1,4,5,1,3,6,2,4,1,3), ncol=3)
This is (one possible) desired result
[,1] [,2] [,3]
[1,] 2 5 2
[2,] 3 1 4
[3,] 1 3 1
[4,] 4 6 3
v
v permutation
v
[,1] [,2] [,3]
[1,] 5 2 2
[2,] 1 4 3
[3,] 3 1 1
[4,] 6 3 4
I tried something like permatfull(df, times=1, fixedmar = "rows", shuffle = "samp") which results in
[,1] [,2] [,3]
[1,] 5 2 2
[2,] 1 4 3
[3,] 3 1 1
[4,] 3 4 6
Now column 1 (originally column 2) has changed from 5,1,3,6 to 5,1,3,3.
Does anyone have an idea why I do not get the expected result?
Thanks in Advance,
Christian
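If the goal is only to reorder whole columns while leaving the contents of each column intact, plain column indexing in base R may be enough. The sketch below does not use vegan and makes no claim about how permatfull() works internally; n_perm is a made-up name for the number of randomised matrices wanted.

```r
df <- matrix(c(2,3,1,4,5,1,3,6,2,4,1,3), ncol = 3)

set.seed(1)
n_perm <- 5  # hypothetical: how many randomised matrices to generate

# Each element of perms is df with its columns in a random order;
# values never move between rows because only the column index changes
perms <- replicate(n_perm,
                   df[, sample.int(ncol(df)), drop = FALSE],
                   simplify = FALSE)
perms[[1]]
```

simplify = FALSE keeps the results as a list of matrices rather than collapsing them into one array.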

How to concatenate column repetitions of a matrix without a for loop

Let's say I have the below matrix:
[,1] [,2]
[1,] 1 2
[2,] 3 4
[3,] 5 6
I want to generate a matrix which is the concatenation (by column) of matrices that are generated by repetition of each column k times. For example, when k=3, below is what I want to get:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 1 1 2 2 2
[2,] 3 3 3 4 4 4
[3,] 5 5 5 6 6 6
How can I do that without a for loop?
You can do this with column indexing. A convenient way to repeat each column number the correct number of times is the rep function:
mat[,rep(seq_len(ncol(mat)), each=3)]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 1 1 1 2 2 2
# [2,] 3 3 3 4 4 4
# [3,] 5 5 5 6 6 6
In the above expression, seq_len(ncol(mat)) is the sequence from 1 through the number of columns in the matrix (you could think of it like 1:ncol(mat), except it deals nicely with some special cases like 0-column matrices).
Data:
(mat <- matrix(1:6, nrow=3, byrow = TRUE))
# [,1] [,2]
# [1,] 1 2
# [2,] 3 4
# [3,] 5 6
We can repeat each element of the matrix k times and fit the resulting vector into a matrix whose number of columns is k times that of the original.
k <- 3
matrix(rep(t(mat), each = k), ncol = ncol(mat) * k, byrow = TRUE)
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 1 1 1 2 2 2
#[2,] 3 3 3 4 4 4
#[3,] 5 5 5 6 6 6
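As a quick check, the two approaches above can be compared directly, and the zero-column case mentioned for seq_len can be tried out. This is just a sketch; the names by_index, by_rep and empty are illustrative.

```r
mat <- matrix(1:6, nrow = 3, byrow = TRUE)
k <- 3

# Approach 1: repeat the column indices
by_index <- mat[, rep(seq_len(ncol(mat)), each = k), drop = FALSE]
# Approach 2: repeat the elements and refill a wider matrix by row
by_rep <- matrix(rep(t(mat), each = k), ncol = ncol(mat) * k, byrow = TRUE)
identical(by_index, by_rep)

# Zero-column edge case: seq_len(0) is empty, so the indexing still works,
# whereas 1:ncol(empty) would be c(1, 0) and index a column that is not there
empty <- matrix(integer(0), nrow = 3, ncol = 0)
dim(empty[, rep(seq_len(ncol(empty)), each = k), drop = FALSE])
```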

Vectorized subsetting of matrices with varying index length

I am trying to convert the rows of the matrices below into indices to subset from another matrix. The first matrix would generate four indices to be used for subsetting from a matrix called data, shown at the bottom. The second matrix would generate six indices, each of length two, and so on.
library(gtools) # library for combinations()
matrix(match(combinations(n=4,r=1,v=LETTERS[1:4]),LETTERS),ncol=1)
[,1]
[1,] 1
[2,] 2
[3,] 3
[4,] 4
matrix(match(combinations(n=4,r=2,v=LETTERS[1:4]),LETTERS),ncol=2)
[,1] [,2]
[1,] 1 2
[2,] 1 3
[3,] 1 4
[4,] 2 3
[5,] 2 4
[6,] 3 4
matrix(match(combinations(n=4,r=3,v=LETTERS[1:4]),LETTERS),ncol=3)
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 1 2 4
[3,] 1 3 4
[4,] 2 3 4
matrix(match(combinations(n=4,r=4,v=LETTERS[1:4]),LETTERS),ncol=4)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
data <- matrix(rnorm(16,0),ncol=4)
data[,1]
data[,2]
data[,3]
data[,4]
The second matrix would generate indices to be used like:
data[,c(1,2)]
data[,c(1,3)]
data[,c(1,4)]
data[,c(2,3)]
etc.
What would a generic, vectorized function look like to accomplish the above for varying n?
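One possible shape for this, sketched with base R's combn() in place of gtools::combinations(): combn() accepts a function that is applied to each combination, so the subsetting can happen in the same call. The name subsets is made up, and the seed is only there to make the example repeatable.

```r
set.seed(1)
data <- matrix(rnorm(16), ncol = 4)
n <- ncol(data)

# subsets[[r]] is a list holding one sub-matrix per combination of r columns
subsets <- lapply(seq_len(n), function(r) {
  combn(n, r,
        FUN = function(idx) data[, idx, drop = FALSE],  # keep matrix shape for r = 1
        simplify = FALSE)
})

lengths(subsets)   # number of combinations for r = 1, ..., n
subsets[[2]][[3]]  # third pair of columns, i.e. data[, c(1, 4)]
```

drop = FALSE matters for the single-column case, where indexing would otherwise return a plain vector.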

Shuffle column values but keep the same matrix column ordering

How can I shuffle the values of matrix m1 across each column:
Initial:
m1=cbind(c(1,2,3),c(4,5,6),c(7,8,9))
Do something and:
m1=cbind(c(7,5,3),c(4,2,9),c(1,8,6))
Thanks
You can call the sample function on each column of your matrix to shuffle it:
set.seed(100)
apply(m1, 2, sample)
# [,1] [,2] [,3]
# [1,] 1 5 8
# [2,] 3 4 9
# [3,] 2 6 7
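One caveat with calling sample directly: when it is given a single number v, it samples from 1:v, so a one-row matrix would not come back unchanged. A slightly more defensive sketch indexes with sample.int() instead; the function name shuffle_within_cols is made up for illustration.

```r
m1 <- cbind(c(1,2,3), c(4,5,6), c(7,8,9))

shuffle_within_cols <- function(m) {
  # Permute positions rather than values, so a length-1 column stays as it is
  out <- apply(m, 2, function(v) v[sample.int(length(v))])
  dim(out) <- dim(m)  # apply() returns a plain vector for a one-row matrix
  out
}

set.seed(100)
shuffled <- shuffle_within_cols(m1)
shuffled
# Each column still holds exactly its original values
identical(apply(shuffled, 2, sort), apply(m1, 2, sort))
```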
ehm, you mean by row in your example?!
shuffle a vector:
# create a vector from 1 to 9
x <- seq(1,9)
# shuffle
x[order(runif(length(x)))]
shuffle rows/columns of a matrix:
# example matrix
m1 <- matrix(x,ncol=3)
# shuffle by row
for (i in 1:nrow(m1)) m1[i,] <- m1[i,order(runif(length(m1[i,])))]
# shuffle by col
for (i in 1:ncol(m1)) m1[,i] <- m1[order(runif(length(m1[,i]))),i]
edit: maybe sample is better... http://stat.ethz.ch/R-manual/R-devel/library/base/html/sample.html
You can also put sample in the matrix indices to permute whole rows and whole columns.
To permute both the rows and the columns,
m1[sample(nrow(m1)), sample(ncol(m1))]
# [,1] [,2] [,3]
#[1,] 6 9 3
#[2,] 5 8 2
#[3,] 4 7 1
Or permute only the rows
m1[sample(nrow(m1)), ]
# [,1] [,2] [,3]
#[1,] 3 6 9
#[2,] 1 4 7
#[3,] 2 5 8
Or permute only the columns
m1[,sample(ncol(m1))]
# [,1] [,2] [,3]
#[1,] 7 4 1
#[2,] 8 5 2
#[3,] 9 6 3
