I would like to create 1000 random lists of 1652 genes from a universe of 44,400 genes.
I initially sampled with replacement, using the following instruction to create the random lists:
randomMatrix<-replicate(1000, sample(gene_list, 1652, replace = T))
The problem is that genes are repeated within each list. For my study, a gene may appear in more than one list, but not more than once within the same list. How can I prevent duplicates within each individual list?
Thanks in advance
It should work with replace = FALSE:
randomMatrix<-replicate(1000, sample(gene_list, 1652, replace = FALSE))
This, of course, requires at least 1652 unique values in gene_list.
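As a quick sanity check, the sketch below builds a hypothetical gene universe (the `gene1`, `gene2`, ... identifiers are made up for illustration; `gene_list` would be your own vector) and confirms that no column of the result contains a repeated gene. Wrapping the universe in unique() is an extra precaution in case the input itself lists a gene twice.

```r
set.seed(42)  # only to make this illustration reproducible

# Hypothetical universe of 44,400 gene identifiers (an assumption; use your own vector)
gene_list <- paste0("gene", seq_len(44400))

# unique() ensures a gene listed twice in the universe cannot be drawn twice
universe <- unique(gene_list)

# Each column is one random list of 1652 distinct genes
randomMatrix <- replicate(1000, sample(universe, 1652, replace = FALSE))

dim(randomMatrix)  # 1652 rows (genes) by 1000 columns (lists)

# anyDuplicated() returns 0 when a vector has no duplicates
dup_per_list <- apply(randomMatrix, 2, anyDuplicated)
all(dup_per_list == 0)
```

Genes can still appear in several columns, which is what the question asks for; only repeats within a column are ruled out.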
A reproducible example would help to illustrate your problem. Since you didn't provide one, I assumed a List and made some replications:
List <- list(c(2,1,3,4,5,6), c(1,4,5,7,0,6), c(2,4,7,9,3,1))
set.seed(001)
replicate(3, lapply(List, sample, 7, replace=TRUE), simplify = FALSE)
which produces
[[1]]
[[1]][[1]]
[1] 1 3 4 6 1 6 6
[[1]][[2]]
[1] 7 7 1 4 4 0 5
[[1]][[3]]
[1] 3 7 3 1 7 3 1
[[2]]
[[2]][[1]]
[1] 1 4 2 1 3 2 3
[[2]][[2]]
[1] 6 5 5 7 5 4 0
[[2]][[3]]
[1] 3 3 2 3 7 3 9
[[3]]
[[3]][[1]]
[1] 5 4 4 5 2 3 5
[[3]][[2]]
[1] 0 5 6 5 4 1 1
[[3]][[3]]
[1] 4 9 9 7 1 4 7
Note that this approach resamples (with replacement) each element of your original list, which is why each replication is itself a list of three elements.
If you use sapply instead of lapply inside replicate(...), each replication is returned as a matrix, which is easier to read.
set.seed(001)
replicate(3, sapply(List, sample, 7, replace=TRUE), simplify = FALSE)
[[1]]
[,1] [,2] [,3]
[1,] 1 7 3
[2,] 3 7 7
[3,] 4 1 3
[4,] 6 4 1
[5,] 1 4 7
[6,] 6 0 3
[7,] 6 5 1
[[2]]
[,1] [,2] [,3]
[1,] 1 6 3
[2,] 4 5 3
[3,] 2 5 2
[4,] 1 7 3
[5,] 3 5 7
[6,] 2 4 3
[7,] 3 0 9
[[3]]
[,1] [,2] [,3]
[1,] 5 0 4
[2,] 4 5 9
[3,] 4 6 9
[4,] 5 5 7
[5,] 2 4 1
[6,] 3 1 4
[7,] 5 1 7
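If duplicates within each resample are not wanted, as in the original question, the same pattern can be run with replace = FALSE. This is only a sketch on the toy List above. Note that the sample size can then not exceed the length of each element (6 here), otherwise sample() raises an error.

```r
List <- list(c(2,1,3,4,5,6), c(1,4,5,7,0,6), c(2,4,7,9,3,1))

set.seed(1)
# Each replication is a 6 x 3 matrix; column j is a random permutation of List[[j]]
res <- replicate(3, sapply(List, sample, size = 6, replace = FALSE), simplify = FALSE)

# Check: no column in any replication contains a repeated value
no_dups <- sapply(res, function(mat) all(apply(mat, 2, anyDuplicated) == 0))
no_dups
```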
Assume I have to generate 1000 Sample Pairs (Y1,Y2) (from a Normal Distribution with replacement). Each of the pairs should have 20 observations.
y1 <- rep(sample(c(1:10),10, replace = TRUE))
y2 <- rep(sample(c(1:10),10, replace = TRUE))
How would I now generate 1000 of these pairs, so that they are easy to access for further computations?
I had the idea of looping 1000 times and saving the results in a data frame, but this may get chaotic.
Is there a simpler/nicer way to do this? A package or a function that I am missing?
Help would be appreciated!
One way is to use replicate, i.e.
replicate(5, rep(sample(c(1:10), 10, replace = TRUE)))
# [,1] [,2] [,3] [,4] [,5]
# [1,] 3 9 2 4 5
# [2,] 4 1 10 8 1
# [3,] 5 6 1 3 7
# [4,] 1 9 9 6 5
# [5,] 5 3 4 7 9
# [6,] 4 5 4 4 5
# [7,] 2 10 9 4 9
# [8,] 3 1 10 5 3
# [9,] 7 3 10 9 10
#[10,] 10 3 10 10 1
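To keep each pair together, one option is to have replicate return a list whose elements each hold both samples. The sketch below assumes the draws come from a standard normal distribution via rnorm, with 20 observations each as described in the question; the names `sample_pairs`, `y1` and `y2` are illustrative.

```r
set.seed(123)

# simplify = FALSE keeps one list element per pair instead of collapsing to an array
sample_pairs <- replicate(
  1000,
  list(y1 = rnorm(20), y2 = rnorm(20)),
  simplify = FALSE
)

length(sample_pairs)     # number of pairs
str(sample_pairs[[1]])   # first pair: two numeric vectors of length 20

# Example of a further computation: difference in means for every pair
mean_diff <- sapply(sample_pairs, function(p) mean(p$y1) - mean(p$y2))
length(mean_diff)
```

Individual pairs are then reachable as `sample_pairs[[i]]$y1` and `sample_pairs[[i]]$y2`, which avoids bookkeeping with column names in a wide data frame.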
I have this toy matrix:
m
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 1 3 1 6 8 8 8
[2,] 2 2 5 7 9 7 4
[3,] 1 2 3 4 5 6 7
[4,] 1 2 3 4 5 6 7
[5,] 1 2 3 4 5 6 7
and I want to keep only the rows with 4 unique elements.
My initial strategy was to use...
apply(m, 1, function(x) unique(x))
[[1]]
[1] 1 3 6 8
[[2]]
[1] 2 5 7 9 4
[[3]]
[1] 1 2 3 4 5 6 7
[[4]]
[1] 1 2 3 4 5 6 7
[[5]]
[1] 1 2 3 4 5 6 7
... which would inform R of the unique elements in each row.
Then it would be as simple as m[length(apply(m, 1, function(x) unique(x)))==4, ].
But it is not that easy: the output of apply(), at least in this case, is a list, so I can't use this trick.
Can you help?
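One possible approach, sketched here on the toy matrix from the question: count the unique elements inside the function passed to apply(), so that every row yields a single number and apply() returns a plain vector that can be used for logical indexing.

```r
# Toy matrix from the question
m <- rbind(
  c(1, 3, 1, 6, 8, 8, 8),
  c(2, 2, 5, 7, 9, 7, 4),
  1:7, 1:7, 1:7
)

# One count per row, so apply() returns a vector rather than a list
n_unique <- apply(m, 1, function(x) length(unique(x)))
n_unique  # 4 5 7 7 7

# drop = FALSE keeps the result a matrix even when only one row matches
keep <- m[n_unique == 4, , drop = FALSE]
keep
```

Here only the first row has exactly 4 unique values, so `keep` is a 1 x 7 matrix.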
I have a list of 6 with 10 values in each. I would like to fill a 10x6 (10 rows, 6 columns) matrix with these values. I've tried some things but it's not working. I'm sure there must be an easy way to do it, but I haven't found it yet. Could anyone please help?
Here is some example data:
l = lapply(1:6, rep, 10)
Then use do.call (see ?do.call) with cbind to bind the list elements together as columns:
do.call(cbind, l)
and you get a matrix:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 2 3 4 5 6
[2,] 1 2 3 4 5 6
[3,] 1 2 3 4 5 6
[4,] 1 2 3 4 5 6
[5,] 1 2 3 4 5 6
[6,] 1 2 3 4 5 6
[7,] 1 2 3 4 5 6
[8,] 1 2 3 4 5 6
[9,] 1 2 3 4 5 6
[10,] 1 2 3 4 5 6
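Because every column in the example above is constant, it is hard to see which list element ends up where. The following sketch uses a hypothetical list with distinct values (`l2` is made up for illustration) to show that element j becomes column j, and that matrix(unlist(...)) gives the same layout when all elements have equal length.

```r
# Hypothetical list of 6 vectors with 10 distinct values each
l2 <- lapply(1:6, function(i) i * 10 + 0:9)

# Each list element becomes one column
m1 <- do.call(cbind, l2)
dim(m1)  # 10 6

# Alternative: flatten and fill column by column (the default for matrix())
m2 <- matrix(unlist(l2), nrow = 10)

all(m1 == m2)            # same values in the same positions
all(m1[, 2] == l2[[2]])  # column 2 is list element 2
```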
I don't understand why I cannot order a matrix based on a vector using the order function
I have the following:
m
[,1] [,2]
[1,] 1 5
[2,] 2 5
[3,] 3 5
[4,] 4 5
[5,] 5 5
[6,] 6 5
v
[[1]]
[1] 3 1 2 4 5 6
When I use:
m[order(unlist(v)),]
I get the following, incorrectly ordered matrix.
[,1] [,2]
[1,] 2 5
[2,] 3 5
[3,] 1 5
[4,] 4 5
[5,] 5 5
[6,] 6 5
whereas the order that I want is the one given in v:
[,1] [,2]
[1,] 3 5
[2,] 1 5
[3,] 2 5
[4,] 4 5
[5,] 5 5
[6,] 6 5
Why do you guys think this is happening and how can I fix it?
Instead of
m[order(unlist(v)),]
Try
temp <- unlist(v)
m[ temp , ]
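This happens because order() returns the indices that would sort its argument, not the values themselves. Since v already contains the row indices in the desired order, you can use it directly as the index. E.g.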
> x = c(3,1,2)
> order(x)
[1] 2 3 1
> x[order(x)]
[1] 1 2 3
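Applied to the matrix from the question, the difference looks like this. The sketch rebuilds m and v as they appear above; `idx` is an illustrative name.

```r
m <- cbind(1:6, 5)
v <- list(c(3, 1, 2, 4, 5, 6))
idx <- unlist(v)

# order(idx) gives the positions that would sort idx, not idx itself
order(idx)         # 2 3 1 4 5 6

# Indexing with order(idx) therefore reproduces the unwanted result
m[order(idx), 1]   # 2 3 1 4 5 6

# Indexing with idx directly puts the rows in the order stored in v
m[idx, 1]          # 3 1 2 4 5 6
```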
I want to create a function that produces a matrix containing several lags of a variable. A simple example that works is
a <- ts(1:10)
cbind(a, lag(a, -1))
To do this for multiple lags, I have
lagger <- function(var, lags) {
  ### Create list of lags
  lagged <- lapply(1:lags, function(x){
    lag(var, -x)
  })
  ### Join lags together
  do.call(cbind, list(var, lagged))
}
Using the above example gives unexpected results;
lagger(a, 1)
gives a length 20 list with the original time series broken out into separate list slots and the final 10 each being a replication of the lagged series.
Any suggestions for getting this to work? Thanks!
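One likely cause, offered as a sketch rather than a definitive diagnosis: list(var, lagged) is a two-element list whose second element is itself a list, so cbind is never handed the individual lagged series. Flattening with c(list(var), lagged) passes each series as its own argument. The names `lag0`, `lag1`, ... are illustrative additions used as column labels.

```r
lagger <- function(var, lags) {
  # One lagged copy of the series per requested lag
  lagged <- lapply(seq_len(lags), function(k) stats::lag(var, -k))
  # Flat list: the original series followed by each lagged series
  series <- c(list(var), lagged)
  names(series) <- paste0("lag", 0:lags)
  # Each list element is now a separate argument to cbind
  do.call(cbind, series)
}

a <- ts(1:10)
res <- lagger(a, 2)
dim(res)  # 12 rows (times 1 to 12) and 3 columns
res
```

As with cbind(a, lag(a, -1)), the result covers the union of the time ranges, so the leading and trailing rows contain NA where a series has no value.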
This gives a lag of 0 and of 1.
library(zoo)
a <- ts(11:13)
lags <- -(0:1)
a.lag <- as.ts(lag(as.zoo(a), lags))
Now a.lag is this:
> a.lag
Time Series:
Start = 1
End = 4
Frequency = 1
lag0 lag-1
1 11 NA
2 12 11
3 13 12
4 NA 13
If you don't want the NA entries then use: as.ts(na.omit(lag(as.zoo(a), lags))) .
Based on @Joshua Ulrich's answer.
I think embed is the right tool, but it returns the columns the other way around: the lagged series are not in the expected order, as the following shows.
lagged <- embed(a,4)
colnames(lagged) <- paste('t', 3:0, sep='-')
lagged
t-3 t-2 t-1 t-0
[1,] 4 3 2 1
[2,] 5 4 3 2
[3,] 6 5 4 3
[4,] 7 6 5 4
[5,] 8 7 6 5
[6,] 9 8 7 6
[7,] 10 9 8 7
This gives you the correct values but not in the correct order, since the lags are in descending order.
But if you reverse the columns like this:
lagged_OK <- lagged[,ncol(lagged):1]
colnames(lagged_OK) <- paste('t', 0:3, sep='-')
lagged_OK
t-0 t-1 t-2 t-3
[1,] 1 2 3 4
[2,] 2 3 4 5
[3,] 3 4 5 6
[4,] 4 5 6 7
[5,] 5 6 7 8
[6,] 6 7 8 9
[7,] 7 8 9 10
Then, you get the right lagged matrix.
I added the colnames only for explanation; you can simply do:
embed(a,4)[ ,4:1]
If you really want a lagger function, try this
lagger <- function(x, lag=1){
  lag <- lag+1
  Lagged <- embed(x,lag)[ ,lag:1]
  colnames(Lagged) <- paste('lag', 0:(lag-1), sep='.')
  return(Lagged)
}
lagger(a, 4)
lag.0 lag.1 lag.2 lag.3 lag.4
[1,] 1 2 3 4 5
[2,] 2 3 4 5 6
[3,] 3 4 5 6 7
[4,] 4 5 6 7 8
[5,] 5 6 7 8 9
[6,] 6 7 8 9 10
lagger(a, 1)
lag.0 lag.1
[1,] 1 2
[2,] 2 3
[3,] 3 4
[4,] 4 5
[5,] 5 6
[6,] 6 7
[7,] 7 8
[8,] 8 9
[9,] 9 10
I'm not sure what's wrong with your function, but you can probably use embed instead.
> embed(a,4)
[,1] [,2] [,3] [,4]
[1,] 4 3 2 1
[2,] 5 4 3 2
[3,] 6 5 4 3
[4,] 7 6 5 4
[5,] 8 7 6 5
[6,] 9 8 7 6
[7,] 10 9 8 7