I am trying to accomplish two things. First, if I have a vector 1:5, I want to get a matrix (or two vectors) indicating the unique combinations of these elements, including a number paired with itself but excluding repetitions.
Right now I can do this using a matrix:
foo <- matrix(1:5,5,5)
cbind(foo[upper.tri(foo,diag=TRUE)],foo[lower.tri(foo,diag=TRUE)])
[,1] [,2]
[1,] 1 1
[2,] 1 2
[3,] 2 3
[4,] 1 4
[5,] 2 5
[6,] 3 2
[7,] 1 3
[8,] 2 4
[9,] 3 5
[10,] 4 3
[11,] 1 4
[12,] 2 5
[13,] 3 4
[14,] 4 5
[15,] 5 5
But there has to be a simpler way. I tried to use Vectorize on seq but this gives me an error:
cbind(Vectorize(seq,"from")(1:5,5),Vectorize(seq,"to")(5,1:5))
Error in Vectorize(seq, "from") :
must specify formal argument names to vectorize
A second thing I want to do: if I have a list of vectors, bar, I want a vector in which the index of each list element is repeated as many times as that element has entries. I can do this with:
unlist(apply(rbind(1:length(bar),sapply(bar,length)),2,function(x)rep(x[1],x[2])))
[1] 1 1 1 1 1 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
But again there must be an easier way. I tried Vectorize again here but with the same error:
Vectorize(rep,"each")(1:length(bar),each=sapply(bar,length))
Error in Vectorize(rep, "each") :
must specify formal argument names to vectorize
To your first question: what about the simple combn() function, which ships with R (in the utils package):
> combn(1:5,2)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 1 1 1 2 2 2 3 3 4
[2,] 2 3 4 5 3 4 5 4 5 5
If you need a matrix arranged like the one you made up, just transpose it with t(), as in t(combn(1:5,2)).
Note: this will not give you back the combinations of repeated elements of your seq, but you may add those easily to the matrix.
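If the pairs of an element with itself are needed too, one possible way to add them is to bind them onto the transposed combn() result. This is only a sketch; the names v, distinct_pairs, self_pairs and pair_mat are made up for illustration.

```r
v <- 1:5

# All pairs of distinct elements, one pair per row
distinct_pairs <- t(combn(v, 2))

# Each element paired with itself (1 1, 2 2, ...);
# deparse.level = 0 keeps the columns unnamed
self_pairs <- cbind(v, v, deparse.level = 0)

pair_mat <- rbind(distinct_pairs, self_pairs)

# Optional: sort the rows by first, then second element
pair_mat <- pair_mat[order(pair_mat[, 1], pair_mat[, 2]), ]
```

For a vector of length 5 this gives 10 + 5 = 15 rows, the same count as in the question.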
> unlist(lapply(1:5, seq, from=1))
[1] 1 1 2 1 2 3 1 2 3 4 1 2 3 4 5
> unlist(lapply(1:5, seq, 5))
[1] 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5
and
> bar = lapply(1:5, seq, from=1)
> rep(seq_along(bar), sapply(bar, length))
[1] 1 2 2 3 3 3 4 4 4 4 5 5 5 5 5
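As for the Vectorize() error in the question: as far as I can tell, Vectorize() only accepts argument names that appear among the formal arguments of the function, and neither seq() (a generic whose only formal is ...) nor rep() (a primitive) exposes "from" or "each" that way. One possible workaround is to wrap seq() in a function with named arguments. The wrapper name seq_from_to is hypothetical.

```r
# Wrapper with explicit formals, so Vectorize() can find "from"
seq_from_to <- function(from, to) seq(from, to)

vseq <- Vectorize(seq_from_to, "from")

# The sequences differ in length, so the result is a list; flatten it
ends <- unlist(vseq(1:5, 5))
ends
# [1] 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5

# mapply() does the same without Vectorize()
ends2 <- unlist(mapply(seq, 1:5, 5))
```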
A faster variation of Martin Morgan's solution to the first part:
rep(1:5,5:1)
[1] 1 1 1 1 1 2 2 2 2 3 3 3 4 4 5
unlist(lapply(1:5,function(x) x:5))
[1] 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5
Roughly 7 and 3 times faster respectively.
I'm not sure I follow what you mean in the second part, but the following seems to fit your description:
lapply(bar,function(x) rep(x,length(x)))
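If the goal is the output shown in the question (each list index repeated once per entry of that element), a shorter sketch uses lengths(), which is available in R 3.2.0 and later. The example list bar below is an assumption, chosen to have 5, 7 and 10 entries so that it reproduces that output.

```r
# Hypothetical example list with 5, 7 and 10 entries
bar <- list(1:5, 1:7, 1:10)

# Repeat each index as many times as its element has entries
idx <- rep(seq_along(bar), lengths(bar))
idx
# [1] 1 1 1 1 1 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
```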
I've been trying to apply a function using the last two values of rows in a data frame, and I want it to repeat this process using the last two values of every row. Here's the function I'm trying to apply.
Basically, I create a function that uses a Kish grid. A Kish grid is a way of randomly selecting participants in a household survey. It's a 10x8 matrix that looks like this:
kishvalues <- c(1,1,1,1,1,1,1,1,
1,2,2,2,2,2,2,2,
1,1,3,3,3,3,3,3,
1,2,1,4,4,4,4,4,
1,1,2,1,5,5,5,5,
1,2,3,2,1,6,6,6,
1,1,1,3,2,1,7,7,
1,2,2,4,3,2,1,8,
1,1,3,1,4,3,2,1,
1,2,1,2,5,4,3,2)
kishtable <- matrix(kishvalues, nrow=10, ncol=8, byrow=T); kishtable
> kishtable <- matrix(kishvalues, nrow=10, ncol=8, byrow=T); kishtable
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 1 1 1 1 1 1 1 1
[2,] 1 2 2 2 2 2 2 2
[3,] 1 1 3 3 3 3 3 3
[4,] 1 2 1 4 4 4 4 4
[5,] 1 1 2 1 5 5 5 5
[6,] 1 2 3 2 1 6 6 6
[7,] 1 1 1 3 2 1 7 7
[8,] 1 2 2 4 3 2 1 8
[9,] 1 1 3 1 4 3 2 1
[10,] 1 2 1 2 5 4 3 2
If I'm doing household interviews, let's say I visit my 7th house of the day (rows) and there are 4 eligible participants for the interview (columns). I use the Kish table to select which of the four participants, ordered from youngest to oldest, I have to choose in order to somewhat maintain randomness. In this case I would have to select the 3rd participant of that household for the interview.
> kishtable <- matrix(kishvalues, nrow=10, ncol=8, byrow=T); kishtable
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 1 1 1 1 1 1 1 1
[2,] 1 2 2 2 2 2 2 2
[3,] 1 1 3 3 3 3 3 3
[4,] 1 2 1 4 4 4 4 4
[5,] 1 1 2 1 5 5 5 5
[6,] 1 2 3 2 1 6 6 6
[7,] 1 1 1 (3) 2 1 7 7
[8,] 1 2 2 4 3 2 1 8
[9,] 1 1 3 1 4 3 2 1
[10,] 1 2 1 2 5 4 3 2
Now, here is the function I'm using
kish <- function(house,ep){
x <- kishtable[house,ep]
print(x)
}
kish(house=7, ep=4)
[1] 3
How can I apply this function, but instead of doing it on two single values, do it on two vectors, one vector is the sequential number of the homes visited (1 - 10) and another vector representing the number of eligible participants which can vary from 1 - 8?
Hope I made sense, let me know if you need anything else to better understand the problem.
Index the matrix with a two-column matrix of (row, column) pairs, built with cbind:
homes <- c(7,8,10)
participants <- c(4,6,5)
kishtable[cbind(homes, participants)]
# [1] 3 2 5
Mapping to the following (selected cells in parentheses, in result order from top to bottom):
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 1 1 1 1 1 1 1 1
[2,] 1 2 2 2 2 2 2 2
[3,] 1 1 3 3 3 3 3 3
[4,] 1 2 1 4 4 4 4 4
[5,] 1 1 2 1 5 5 5 5
[6,] 1 2 3 2 1 6 6 6
[7,] 1 1 1 (3) 2 1 7 7
[8,] 1 2 2 4 3 (2) 1 8
[9,] 1 1 3 1 4 3 2 1
[10,] 1 2 1 2 (5) 4 3 2
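The matrix-index lookup can be wrapped in a small function that takes two vectors. The following is a sketch only: the name kish_lookup and the bounds checks are additions, not part of the original answer.

```r
kishtable <- matrix(c(1,1,1,1,1,1,1,1,
                      1,2,2,2,2,2,2,2,
                      1,1,3,3,3,3,3,3,
                      1,2,1,4,4,4,4,4,
                      1,1,2,1,5,5,5,5,
                      1,2,3,2,1,6,6,6,
                      1,1,1,3,2,1,7,7,
                      1,2,2,4,3,2,1,8,
                      1,1,3,1,4,3,2,1,
                      1,2,1,2,5,4,3,2),
                    nrow = 10, ncol = 8, byrow = TRUE)

# Hypothetical helper: one lookup per (house, ep) pair
kish_lookup <- function(house, ep, table = kishtable) {
  stopifnot(length(house) == length(ep),
            all(house >= 1 & house <= nrow(table)),
            all(ep >= 1 & ep <= ncol(table)))
  # Each row of the index matrix is one (row, column) position
  table[cbind(house, ep)]
}

picked <- kish_lookup(c(7, 8, 10), c(4, 6, 5))
picked
# [1] 3 2 5

# Equivalent element-by-element version using mapply()
picked2 <- mapply(function(h, e) kishtable[h, e], c(7, 8, 10), c(4, 6, 5))
```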
If there were a matrix, example:
m <- matrix(c(1:4), nrow = 4, ncol=4)
how would I go about multiplying only the odd rows by an arbitrary scalar while still keeping the even rows in the same place and the same value?
In this example, the matrix with only odd rows being multiplied by two would become:
1 1 1 1          2 2 2 2
2 2 2 2          2 2 2 2
3 3 3 3  ---->   6 6 6 6
4 4 4 4          4 4 4 4
You can index the rows with a logical vector. If it is shorter than the number of rows, R recycles it, so c(TRUE,FALSE) selects every odd row.
mat <- matrix(1:4,4,4)
mat[c(TRUE,FALSE),] <- mat[c(TRUE,FALSE),] * 2
[,1] [,2] [,3] [,4]
[1,] 2 2 2 2
[2,] 2 2 2 2
[3,] 6 6 6 6
[4,] 4 4 4 4
Here is a base R approach using diag() + rep() (as written, it assumes an even number of rows):
diag(rep(c(2,1),nrow(mat)/2))%*% mat
such that
> diag(rep(c(2,1),nrow(mat)/2))%*% mat
[,1] [,2] [,3] [,4]
[1,] 2 2 2 2
[2,] 2 2 2 2
[3,] 6 6 6 6
[4,] 4 4 4 4
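For an arbitrary scalar and a matrix that may have an odd number of rows, one option is to index the odd rows explicitly with seq(). This is a sketch; the function name scale_odd_rows and the 5-row example m5 are made up for illustration, and the function assumes the matrix has at least one row.

```r
# Hypothetical helper: multiply rows 1, 3, 5, ... by k, leave the rest alone
scale_odd_rows <- function(m, k) {
  odd <- seq(1, nrow(m), by = 2)
  m[odd, ] <- m[odd, ] * k
  m
}

m5 <- matrix(1:5, nrow = 5, ncol = 4)
res <- scale_odd_rows(m5, 2)
res[, 1]
# [1]  2  2  6  4 10
```

The diag() approach can be adapted in the same spirit with rep(c(2,1), length.out = nrow(m5)).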
My problem is as follows and I have considered many derivations:
I have an array, say with dimensions dims = c(10000, 5, 2), that is, 10000 rows, 5 columns and 2 subarrays.
I would like to be able to use the sample() function to sample a given proportion (say m) of rows in EACH subarray and move them BETWEEN subarrays.
So, say swap Row 5 subarray 1 with Row 10 of subarray 2 (see example below).
I asked a similar question, "Moving rows between subarrays", and got some great help.
The solution is useful but restricted to random sampling of rows (meaning that not all subarrays will be sampled).
, , 1
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 4 3 4 4 3 4 5 2 4 4
[2,] 1 4 3 5 4 5 4 5 2 4
[3,] 1 5 2 1 1 2 1 4 5 1
[4,] 3 1 1 3 5 4 2 4 4 4
[5,] 3 2 5 1 2 2 5 5 4 3 <-- e.g., switch this row
[6,] 4 5 5 2 3 4 1 3 5 5
[7,] 5 5 5 5 1 4 3 1 2 5
[8,] 3 4 3 1 3 3 4 3 2 3
[9,] 1 1 3 2 4 4 1 4 2 3
[10,] 1 4 4 2 4 2 4 2 2 1
, , 2
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 5 5 1 1 5 2 1 4 3 1
[2,] 4 3 2 4 3 5 5 5 4 3
[3,] 2 4 1 1 4 2 2 2 3 4
[4,] 5 1 4 5 4 4 3 4 4 5
[5,] 1 5 5 4 3 3 5 2 2 2
[6,] 2 2 2 2 5 5 3 4 3 5
[7,] 5 2 1 1 2 5 3 4 4 2
[8,] 3 4 3 3 1 3 3 2 3 5
[9,] 2 1 4 4 3 2 4 5 5 2
[10,] 5 3 4 5 4 3 5 1 2 3 <-- with this row
In the above example, m = 0.10, that is 10% of the rows (1 row) in each subarray are sampled and then swapped.
Any ideas on how to force sample() to sample within ALL subarrays? Ideally, the number of rows in each subarray will be very large (10000 or more).
Though I have only included 2 subarrays, where a random row or rows swap(s) with a random row or rows in subarray 2 (dictated by m), I need a routine that is generalizable to k subarrays. So if k = 3, then sampling occurs within ALL subarrays and random rows are swapped with neighbouring subarrays.
So, a random row or rows in subarray 1 has an equal chance of moving to either subarray 2 or subarray 3 (it doesn't matter which subarray rows go to, so long as they are always moving between subarrays). Then, the corresponding row or rows from subarray 2 or 3 will go to subarray 1.
The number of rows must remain constant. For example, there cannot be 11 rows in subarray 1 and only 9 in subarray 2 -- it has to be 10 and 10.
I don't know of any packages that will do this. The goal here is to simulate movement of animals.
Any ideas are greatly appreciated.
This is verbose and could be cleaned up, but should get you there.
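set.seed(4)
# Example array: 10 rows, 10 columns, 2 subarrays
x <- array(sample(1:10, 200, replace = TRUE), dim = c(10, 10, 2))
nrows_x <- dim(x)[1]
proportion <- 0.10
# Row indices to swap (the same indices are used in both subarrays)
idx <- sample(1:nrows_x, nrows_x * proportion)
# Keep copies of the selected rows before overwriting them
rows_dim_1 <- x[idx, , 1]
rows_dim_2 <- x[idx, , 2]
# Swap the selected rows between subarray 1 and subarray 2
x[idx, , 1] <- rows_dim_2
x[idx, , 2] <- rows_dim_1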
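The snippet above swaps the same row indices in both subarrays. A possible generalisation to k subarrays, with rows sampled independently in each one, is sketched below. It rests on assumptions that are not in the question: the names swap_rows, n_move and dest are hypothetical, and the sampled rows are passed on cyclically (subarray 1 to 2, 2 to 3, ..., k to 1) rather than to a randomly chosen subarray. Every subarray gives up and receives the same number of rows, so the row counts stay constant.

```r
# Hypothetical sketch: move a proportion m of rows between k subarrays
swap_rows <- function(x, m) {
  n_rows <- dim(x)[1]
  k <- dim(x)[3]
  stopifnot(k >= 2)
  n_move <- max(1, round(n_rows * m))

  # One column of independently sampled row indices per subarray
  idx <- vapply(seq_len(k),
                function(i) sample.int(n_rows, n_move),
                integer(n_move))
  idx <- matrix(idx, nrow = n_move, ncol = k)

  # Assumed destination rule: cyclic shift to the next subarray
  dest <- c(2:k, 1)

  out <- x
  for (i in seq_len(k)) {
    # Rows leaving subarray i fill the slots vacated in dest[i]
    out[idx[, dest[i]], , dest[i]] <- x[idx[, i], , i]
  }
  out
}

set.seed(1)
x3 <- array(seq_len(10 * 5 * 3), dim = c(10, 5, 3))
y3 <- swap_rows(x3, 0.10)
```

With m = 0.10 and 10 rows, exactly one row per subarray is replaced by a row from another subarray.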
I have a list of 6 with 10 values in each. I would like to fill a 10x6 (10 rows, 6 columns) matrix with these values. I've tried some things but it's not working. I'm sure there must be an easy way to do it, but I haven't found it yet. Could anyone please help?
Here is some example data:
l = lapply(1:6, rep, 10)
then use ?do.call and cbind to paste the list elements as columns:
do.call(cbind, l)
and you get a matrix:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 2 3 4 5 6
[2,] 1 2 3 4 5 6
[3,] 1 2 3 4 5 6
[4,] 1 2 3 4 5 6
[5,] 1 2 3 4 5 6
[6,] 1 2 3 4 5 6
[7,] 1 2 3 4 5 6
[8,] 1 2 3 4 5 6
[9,] 1 2 3 4 5 6
[10,] 1 2 3 4 5 6
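An alternative sketch, assuming every list element has the same length, is to flatten the list and let matrix() fill the columns. Because R fills matrices column by column, each list element ends up in its own column.

```r
l <- lapply(1:6, rep, 10)

# Flatten the list, then fill one column per list element
m <- matrix(unlist(l), ncol = length(l))

dim(m)
# [1] 10  6
```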
I'm not sure what you call this, but the default 'flow' of matrices is downwards (as seen below)
matrix(1,7,5)*(1:7)
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
4 4 4 4 4
5 5 5 5 5
6 6 6 6 6
7 7 7 7 7
What if your intention is to multiply the vector to the right instead of downwards? Is there a better way to write the command below? Is there a toggle for column instead of row? The same goes for replicate(7,1:7), which assumes downwards flow (paste row vectors downwards instead of column vectors to the right). Is transpose the solution?
t(t(matrix(1,7,5))*(1:5))
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
If you really want to do this a lot after defining the matrix you can always make an operator yourself:
'%mat%'<- function(x,y)t(t(x)*y)
matrix(1,7,5)%mat%1:5
[,1] [,2] [,3] [,4] [,5]
[1,] 1 2 3 4 5
[2,] 1 2 3 4 5
[3,] 1 2 3 4 5
[4,] 1 2 3 4 5
[5,] 1 2 3 4 5
[6,] 1 2 3 4 5
[7,] 1 2 3 4 5
But I think it easier to just transpose twice as you said in the question:
t(t(matrix(1,7,5))*1:5)
Or of course opt to transpose the matrix once in the beginning, do everything you need to do with it and then transpose it back.
As far as I know there is no way to change the default behaviour of *, nor would you probably want to.
A matrix is simply a vector with a dim attribute. The elements of the matrix are stored in the vector in column-major order and there is no way to change this. * is an element-by-element operator that recycles its arguments as necessary. You can see the recycling rule at work via:
> x <- matrix(1,7,5)
> x*1:5
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 5 2 4
[2,] 2 4 1 3 5
[3,] 3 5 2 4 1
[4,] 4 1 3 5 2
[5,] 5 2 4 1 3
[6,] 1 3 5 2 4
[7,] 2 4 1 3 5
You can see the multiplication is taking place by column and the vector (1:5) is being recycled to be the same length as the matrix. Rather than transposing, you could use the matrix function to refill the result by row.
> matrix(x*1:5,nrow(x),ncol(x),byrow=TRUE)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 2 3 4 5
[2,] 1 2 3 4 5
[3,] 1 2 3 4 5
[4,] 1 2 3 4 5
[5,] 1 2 3 4 5
[6,] 1 2 3 4 5
[7,] 1 2 3 4 5
I'm not sure that's the most efficient solution, but it's the best I can think of at the moment and it's slightly faster than using t twice.
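A few other ways to multiply each column by the matching element of a vector, sketched here without any claim about relative speed: sweep() over the column margin, repeating each vector element nrow(x) times so that recycling lines up with the columns, and post-multiplying by a diagonal matrix. The variable names are illustrative only.

```r
x <- matrix(1, 7, 5)
v <- 1:5

# sweep() applies "*" along margin 2 (columns)
by_sweep <- sweep(x, 2, v, "*")

# Repeat each element of v once per row so recycling matches the columns
by_rep <- x * rep(v, each = nrow(x))

# Post-multiplying by diag(v) scales column j by v[j]
by_diag <- x %*% diag(v)

by_sweep[1, ]
# [1] 1 2 3 4 5
```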
Do you mean this?
> matrix(rep(1:7,5), nrow=7, ncol=5)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 2 2 2 2 2
[3,] 3 3 3 3 3
[4,] 4 4 4 4 4
[5,] 5 5 5 5 5
[6,] 6 6 6 6 6
[7,] 7 7 7 7 7
> matrix(rep(1:7,5), nrow=5, ncol=7, byrow=TRUE)
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 1 2 3 4 5 6 7
[2,] 1 2 3 4 5 6 7
[3,] 1 2 3 4 5 6 7
[4,] 1 2 3 4 5 6 7
[5,] 1 2 3 4 5 6 7