Related
There are 9 treatments and we want to have 7 blocks. In each block, the treatment should be repeated once.
The 9 treatments are marked as follows:
-Treatment 1 (1-7)
-Treatment 2 (8-14)
-Treatment 3 (15-21)
-Treatment 4 (22-28)
-Treatment 5 (29-35)
-Treatment 6 (36-42)
-Treatment 7 (43-49)
-Treatment 8 (50-56)
-Treatment 9 (57-63)
Each number represents a pot. We want these pots randomised in 7 blocks (columns) but we don't want two pot of the same treatment adjacent to each other - highlighted in grey:
How would I go about this in R?
If I'm interpreting it correctly, this should work.
We'll do a two-step sampling:
First, sample the treatment group itself, making it much easier to determine if a particular row in the block is in the same treatment group as the same row, previous block.
Second, sample one from each of the proven-safe groups.
I'll use a random seed here for reproducibility, do not use set.seed(.) in production.
set.seed(42)
nBlocks <- 7
treatments <- list(1:7, 8:14, 15:21, 22:28, 29:35, 36:42, 43:49, 50:56, 57:63)
blocks <- Reduce(function(prev, ign) {
while (TRUE) {
this <- sample(length(treatments))
if (!any(this == prev)) break
}
this
}, seq.int(nBlocks)[-1], init = sample(length(treatments)), accumulate = TRUE)
blocks <- do.call(cbind, blocks)
blocks
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
# [1,] 1 3 4 2 8 2 1
# [2,] 5 1 2 4 5 7 9
# [3,] 9 8 9 3 1 3 5
# [4,] 7 9 3 6 7 9 3
# [5,] 2 4 8 5 4 1 4
# [6,] 4 7 1 9 6 4 2
# [7,] 8 6 5 7 2 6 8
# [8,] 3 5 6 8 9 5 6
# [9,] 6 2 7 1 3 8 7
Here each column is a "block", and each number represents the treatment group assigned to each row. You can see that no rows contain the same group in subsequent columns.
For instance, the first column ("block 1") will have something from the Treatment 1 group in the first row, Treatment 5 group in row two, etc. Further, inspection will show that all treatments are included in each block column, an inferred requirement of the experimental design.
(FYI, it is theoretically possible that this will take a while based on the random conditions. Because it repeats per-column, it should be relatively efficient, though. I have no safeguards here for too-long-execution, but I don't think it is required: the conditions here do not lend to a high likelihood of "failure" requiring much repetition.)
The next step is to convert each of these group numbers into a number from the respective treatment group.
apply(blocks, 1:2, function(ind) sample(treatments[[ind]], 1))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
# [1,] 6 17 22 11 54 14 3
# [2,] 30 3 13 22 33 48 58
# [3,] 63 55 61 15 4 21 33
# [4,] 49 60 21 36 43 58 21
# [5,] 12 25 55 32 27 7 25
# [6,] 24 46 4 58 38 28 11
# [7,] 53 38 35 49 11 36 56
# [8,] 16 29 36 56 63 29 40
# [9,] 36 8 47 3 19 50 43
To verify, in the first matrix, our first three rows (block 1) were 1, 5, and 9, which should translate into 1-7, 29-35, and57-63, respectively. "6" is within 1-7, "30" is within 29-35, and "63" is within 59-63. Inspection will show the remainder to be correct.
Because of the step of determining treatment groups first, it is much simpler to verify/guarantee that you will not repeat treatment groups in a row between two adjacent blocks.
EDIT
Rules:
The same treatment group may not be on the same row in adjacent columns; and
The same treatment (not group) may not be in any row in adjacent columns.
We can use the same methodology as before. Note that as any groups become smaller, the iteration time may increase but I do not expect it likely to get into an infinite loop. (However, if you inadvertently have a group of length 1, then ... this will never end.)
nBlocks <- 7
treatments <- list(1:7, 8:14, 15:21, 22:28, 29:35, 36:42, 43:49, 50:56, 57:63)
# helper function for randomized selection of treatments given groups
func <- function(grp) cbind(grp, sapply(treatments[grp], sample, size = 1))
set.seed(42)
func(c(1,3,5))
# grp
# [1,] 1 1
# [2,] 3 19
# [3,] 5 29
And then the same Reduce mindset:
set.seed(42)
blocks <- Reduce(function(prev, ign) {
while (TRUE) {
this1 <- sample(length(treatments))
if (!any(this1 == prev[,1])) break
}
while (TRUE) {
this2 <- func(this1)
if (!any(this2[,2] %in% prev[,2])) break
}
this2
}, seq.int(nBlocks-1), init = func(sample(length(treatments))), accumulate = TRUE)
blocks <- do.call(cbind, blocks)
groups <- blocks[, seq(1, by = 2, length.out = nBlocks)]
treats <- blocks[, seq(2, by = 2, length.out = nBlocks)]
From this, we have two products (though you will likely only care about the second):
The treatment groups, good to verify rule 1 above: no group may be in the same row in adjacent columns:
groups
# grp grp grp grp grp grp grp
# [1,] 1 3 1 7 8 5 1
# [2,] 5 1 2 8 2 7 3
# [3,] 9 8 5 2 1 4 6
# [4,] 7 9 6 3 4 8 5
# [5,] 2 4 7 9 3 9 4
# [6,] 4 7 4 5 7 1 2
# [7,] 8 6 9 1 9 6 7
# [8,] 3 5 8 6 5 2 9
# [9,] 6 2 3 4 6 3 8
The treatments themselves, for rule 2 above, where no treatment may be in adjacent columns:
treats
#
# [1,] 7 19 2 47 51 33 3
# [2,] 35 4 12 50 8 44 15
# [3,] 60 51 35 10 1 22 41
# [4,] 43 58 41 21 26 55 31
# [5,] 12 24 43 57 17 57 26
# [6,] 27 49 26 34 48 6 11
# [7,] 53 36 62 6 62 36 47
# [8,] 16 33 54 42 32 10 62
# [9,] 37 9 15 27 37 18 56
Edit 2:
Another rule:
Each treatment group must be seen exactly once in each row and column (requiring a square experimental design).
I think this is effectively generating a sudoku-like matrix of treatment groups, and once that is satisfied, backfill rule #2 (no repeat treatments in adjacent columns). One way (though it is hasty) is suggested by https://gamedev.stackexchange.com/a/138228:
set.seed(42)
vec <- sample(9)
ind <- sapply(cumsum(c(0, 3, 3, 1, 3, 3, 1, 3, 3)), rot, x = vec)
apply(ind, 1, function(z) all(1:9 %in% z)) # all rows have all 1-9, no repeats
# [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
apply(ind, 1, function(z) all(1:9 %in% z)) # ... columns ...
# [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
ind
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
# [1,] 1 7 8 3 5 2 4 6 9
# [2,] 5 2 3 6 9 4 8 1 7
# [3,] 9 4 6 1 7 8 3 5 2
# [4,] 7 8 1 5 2 3 6 9 4
# [5,] 2 3 5 9 4 6 1 7 8
# [6,] 4 6 9 7 8 1 5 2 3
# [7,] 8 1 7 2 3 5 9 4 6
# [8,] 3 5 2 4 6 9 7 8 1
# [9,] 6 9 4 8 1 7 2 3 5
This makes a rather fixed-style of random group arrangements given the constraints on groups. Since this is a design of experiments, if you're going to use this method (and proximity between blocks is at all a concern), then you should likely randomize columns and/or rows of the ind matrix before sampling the treatments themselves. (You can do columns and rows, just do them piece-wise, and it should preserve the constraints.)
ind <- ind[sample(9),][,sample(9)]
ind
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
# [1,] 2 3 8 1 4 7 9 6 5
# [2,] 7 8 4 6 2 9 5 3 1
# [3,] 1 7 9 4 5 6 3 2 8
# [4,] 8 1 6 9 3 4 2 5 7
# [5,] 5 2 7 8 9 1 6 4 3
# [6,] 3 5 1 7 6 8 4 9 2
# [7,] 4 6 3 5 8 2 7 1 9
# [8,] 6 9 5 2 1 3 8 7 4
# [9,] 9 4 2 3 7 5 1 8 6
From here, we can enact rule 2:
treatments <- list(1:7, 8:14, 15:21, 22:28, 29:35, 36:42, 43:49, 50:56, 57:63)
mtx <- do.call(rbind, Reduce(function(prev, ind) {
while (TRUE) {
this <- sapply(treatments[ind], sample, size = 1)
if (!any(prev %in% this)) break
}
this
}, asplit(ind, 2)[-1],
init = sapply(treatments[ind[,1]], sample, size = 1),
accumulate = TRUE))
mtx
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
# [1,] 11 44 4 52 30 15 23 41 59
# [2,] 16 56 49 3 12 33 39 57 27
# [3,] 52 24 60 40 46 2 20 29 13
# [4,] 1 37 23 63 56 48 32 12 17
# [5,] 24 10 30 16 58 39 50 2 47
# [6,] 49 57 41 25 6 52 11 17 34
# [7,] 59 31 19 14 38 23 47 51 7
# [8,] 41 17 11 33 24 61 5 43 54
# [9,] 29 4 51 45 20 8 58 28 40
Matlab can do this task. I cannot get it right so far by using matrix(), t(), and reShape().
My intention is to transpose a series to a matrix of fixed 10 rows and the number of column varies based on the length of the data series. If these are some remains left, they can be discarded.
For example:
Row #1 1 2 3 4
Row #2 5 6 7 8
Row #3 9 10 11 12
Row #4 13 14 15 16
Row #5 17 18 19 20
Row #6 21 22 23 24
Row #7 25 26 27 28
Row #8 29 30 31 32
Row #9 33 34 35 36
Row #10 37 38 39 40
If there are any remains left (i.e, 41~49), these data can be just discarded.
Any suggestions?
This is what I think you are asking for. A vector of arbitrary length and data. To be turned into a matrix with nrow 10 and ncol based on data length.
#your series of arbitrary length
data = 1:49
#calculate number of columns based on length
col = as.integer(length(data)/10)
#max index
maxIndx = 10*col
#create and transpose matrix
yourMtx = t(matrix(data[0:maxIndx],col,10))
#your matrix
> [,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 5 6 7 8
[3,] 9 10 11 12
[4,] 13 14 15 16
[5,] 17 18 19 20
[6,] 21 22 23 24
[7,] 25 26 27 28
[8,] 29 30 31 32
[9,] 33 34 35 36
[10,] 37 38 39 40
#create reverse matrix
revMtx = yourMtx[,rev(seq_len(ncol(yourMtx)))]
#reverse matrix
> [,1] [,2] [,3] [,4]
[1,] 4 3 2 1
[2,] 8 7 6 5
[3,] 12 11 10 9
[4,] 16 15 14 13
[5,] 20 19 18 17
[6,] 24 23 22 21
[7,] 28 27 26 25
[8,] 32 31 30 29
[9,] 36 35 34 33
[10,] 40 39 38 37
If I understand your question correctly, this looks to be an approach you could use.
# generate my series
myseries <- 1:49
# specify number of columns and rows
ncols <- 4
nrows <- 10
# create a matrix with the first ncols*nrows elements and fill by row
mymatrix <- matrix(myseries[1:(ncols*nrows)],
ncol = ncols, nrow = nrows, byrow = TRUE)
mymatrix
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 5 6 7 8
[3,] 9 10 11 12
[4,] 13 14 15 16
[5,] 17 18 19 20
[6,] 21 22 23 24
[7,] 25 26 27 28
[8,] 29 30 31 32
[9,] 33 34 35 36
[10,] 37 38 39 40
Is there a faster way to take a vector and turn it into a 10 columns matrix, like in the following output?
I have a vector of 9000 elements and I am trying to create 200 columns, from the most recent observations to its previous 200 observations, going backward for each columns.
In the example below, the number 10 represent the 10th obs in the vector, the 9 represents the 9th obs,..., the number 1 represents the first observation in the vector.
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 10 9 8 7 6 5 4 3 2 1
[2,] 11 10 9 8 7 6 5 4 3 2
[3,] 12 11 10 9 8 7 6 5 4 3
[4,] 13 12 11 10 9 8 7 6 5 4
[5,] 14 13 12 11 10 9 8 7 6 5
[6,] 15 14 13 12 11 10 9 8 7 6
[7,] 16 15 14 13 12 11 10 9 8 7
[8,] 17 16 15 14 13 12 11 10 9 8
[9,] 18 17 16 15 14 13 12 11 10 9
[10,] 19 18 17 16 15 14 13 12 11 10
[11,] 20 19 18 17 16 15 14 13 12 11
a<-1:20
z<-cbind(
a1<-a[-c(1:9)],
a2<-a[-c(1:8,length(a))],
a3<-a[-c(1:7,length(a)-1,length(a))],
a4<-a[-c(1:6,(length(a)-2):length(a))],
a5<-a[-c(1:5,(length(a)-3):length(a))],
a6<-a[-c(1:4,(length(a)-4):length(a))],
a7<-a[-c(1:3,(length(a)-5):length(a))],
a8<-a[-c(1:2,(length(a)-6):length(a))],
a9<-a[-c(1,(length(a)-7):length(a))],
a10<-a[-c((length(a)-8):length(a))]
)
z
I did the same thing for 40 columns, but I can't imagine doing the same thing for 200 columns.
Any help would be greatly appreciated. Thank you in advance
a<-1:100
z<-cbind(
a1<-a[-c(1:39)],
a2<-a[-c(1:38,length(a))],
a3<-a[-c(1:37,length(a)-1,length(a))],
a4<-a[-c(1:36,(length(a)-2):length(a))],
a5<-a[-c(1:35,(length(a)-3):length(a))],
a6<-a[-c(1:34,(length(a)-4):length(a))],
a7<-a[-c(1:33,(length(a)-5):length(a))],
a8<-a[-c(1:32,(length(a)-6):length(a))],
a9<-a[-c(1:31,(length(a)-7):length(a))],
a10<-a[-c(1:30,(length(a)-8):length(a))],
a11<-a[-c(1:29,(length(a)-9):length(a))],
a12<-a[-c(1:28,(length(a)-10):length(a))],
a13<-a[-c(1:27,(length(a)-11):length(a))],
a14<-a[-c(1:26,(length(a)-12):length(a))],
a15<-a[-c(1:25,(length(a)-13):length(a))],
a16<-a[-c(1:24,(length(a)-14):length(a))],
a17<-a[-c(1:23,(length(a)-15):length(a))],
a18<-a[-c(1:22,(length(a)-16):length(a))],
a19<-a[-c(1:21,(length(a)-17):length(a))],
a20<-a[-c(1:20,(length(a)-18):length(a))],
a21<-a[-c(1:19,(length(a)-19):length(a))],
a22<-a[-c(1:18,(length(a)-20):length(a))],
a23<-a[-c(1:17,(length(a)-21):length(a))],
a24<-a[-c(1:16,(length(a)-22):length(a))],
a25<-a[-c(1:15,(length(a)-23):length(a))],
a26<-a[-c(1:14,(length(a)-24):length(a))],
a27<-a[-c(1:13,(length(a)-25):length(a))],
a28<-a[-c(1:12,(length(a)-26):length(a))],
a29<-a[-c(1:11,(length(a)-27):length(a))],
a30<-a[-c(1:10,(length(a)-28):length(a))],
a31<-a[-c(1:9,(length(a)-29):length(a))],
a32<-a[-c(1:8,(length(a)-30):length(a))],
a33<-a[-c(1:7,(length(a)-31):length(a))],
a34<-a[-c(1:6,(length(a)-32):length(a))],
a35<-a[-c(1:5,(length(a)-33):length(a))],
a36<-a[-c(1:4,(length(a)-34):length(a))],
a37<-a[-c(1:3,(length(a)-35):length(a))],
a38<-a[-c(1:2,(length(a)-36):length(a))],
a39<-a[-c(1,(length(a)-37):length(a))],
a40<-a[-c((length(a)-38):length(a))]
)
z
Here's one possibility:
m <- matrix(nrow=11, ncol=10)
ncol(m) - col(m) + row(m)
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] 10 9 8 7 6 5 4 3 2 1
# [2,] 11 10 9 8 7 6 5 4 3 2
# [3,] 12 11 10 9 8 7 6 5 4 3
# [4,] 13 12 11 10 9 8 7 6 5 4
# [5,] 14 13 12 11 10 9 8 7 6 5
# [6,] 15 14 13 12 11 10 9 8 7 6
# [7,] 16 15 14 13 12 11 10 9 8 7
# [8,] 17 16 15 14 13 12 11 10 9 8
# [9,] 18 17 16 15 14 13 12 11 10 9
# [10,] 19 18 17 16 15 14 13 12 11 10
# [11,] 20 19 18 17 16 15 14 13 12 11
I think embed() will be useful:
x <- 1:9000
m <- embed(x,200)
m <- m[,rev(seq(ncol(m)))] ## reverse columns
x=1:20
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
rep(x,2)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
View(rep(x,2))
Having a problem with generating a 20 by 2 vector using the rep() function in R.
Instead of creating two columns, each running from 1 to 20, when I view the data in the R workspace, it is displayed as 40X1 vector i.e. 1-20 1-20.
How do you use the rep() function to create a repeated column vector of 20X2? Thank you.
rep will return an atomic vector. If you want a matrix, use matrix on the results, with the appropriate dimensions.
eg.
x <- 1:20
matrix(rep(x,2), ncol = 2)
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 3 3
[4,] 4 4
[5,] 5 5
[6,] 6 6
[7,] 7 7
[8,] 8 8
[9,] 9 9
[10,] 10 10
[11,] 11 11
[12,] 12 12
[13,] 13 13
[14,] 14 14
[15,] 15 15
[16,] 16 16
[17,] 17 17
[18,] 18 18
[19,] 19 19
[20,] 20 20
How can I melt a lower half triangle plus diagonal matrix ?
11 NA NA NA NA
12 22 NA NA NA
13 23 33 NA NA
14 24 34 44 NA
15 25 35 45 55
A <- t(matrix (c(11, NA, NA, NA, NA, 12, 22, NA, NA, NA,
13, 23, 33, NA, NA, 14, 24, 34, 44, NA,15, 25,
35, 45, 55), ncol = 5))
> A
[,1] [,2] [,3] [,4] [,5]
[1,] 11 NA NA NA NA
[2,] 12 22 NA NA NA
[3,] 13 23 33 NA NA
[4,] 14 24 34 44 NA
[5,] 15 25 35 45 55
To data.frame in row and col (preserving the following order)
col row value
1 1 11
1 2 12
1 3 13
1 4 14
1 5 15
2 2 22
2 3 23
2 4 24
2 5 25
3 3 33
3 4 34
3 5 35
4 4 44
4 5 45
5 5 55
If you want the indices as columns as well, this should work:
m <- matrix(1:25,5,5)
m[upper.tri(m)] <- NA
m
[,1] [,2] [,3] [,4] [,5]
[1,] 1 NA NA NA NA
[2,] 2 7 NA NA NA
[3,] 3 8 13 NA NA
[4,] 4 9 14 19 NA
[5,] 5 10 15 20 25
cbind(which(!is.na(m),arr.ind = TRUE),na.omit(as.vector(m)))
row col
[1,] 1 1 1
[2,] 2 1 2
[3,] 3 1 3
[4,] 4 1 4
[5,] 5 1 5
[6,] 2 2 7
[7,] 3 2 8
[8,] 4 2 9
[9,] 5 2 10
[10,] 3 3 13
[11,] 4 3 14
[12,] 5 3 15
[13,] 4 4 19
[14,] 5 4 20
[15,] 5 5 25
I guess I'll explain this a bit. I'm using three "tricks":
The arr.ind argument to which to get the indices
The very useful na.omit function to avoid some extra typing
The fact that R stores matrices in column major form, hence as.vector returns the values in the right order.
My one liner.
reshape2::melt(A, varnames = c('row', 'col'), na.rm = TRUE)
Here's my first solution:
test <- rbind(c(11,NA,NA,NA,NA),
c(12,22,NA,NA,NA),
c(13,23,33,NA,NA),
c(14,24,34,44,NA),
c(15,25,35,45,55)) ## Load the matrix
test2 <- as.vector(test) ## "melt" it into a vector
test <- cbind( test2[!is.na(test2)] ) ## get rid of NAs, cbind it into a column
Results are:
> test
[,1]
[1,] 11
[2,] 12
[3,] 13
[4,] 14
[5,] 15
[6,] 22
[7,] 23
[8,] 24
[9,] 25
[10,] 33
[11,] 34
[12,] 35
[13,] 44
[14,] 45
[15,] 55
Alternatively, you can use the matrix command:
test <- rbind(c(11,NA,NA,NA,NA),
c(12,22,NA,NA,NA),
c(13,23,33,NA,NA),
c(14,24,34,44,NA),
c(15,25,35,45,55)) ## Load the matrix
test2 <- matrix(test, ncol=1)
test <- cbind( test2[!is.na(test2), ] )
## same as above, except now explicitly noting rows to replace.
Here is my attempt:
# enter the data
df <- c(11,12,13,14,15,NA,22,23,24,25,NA,NA,33,34,35,NA,NA,NA,44,45,NA,NA,NA,NA,55)
dim(df) <- c(5,5)
df
# make new data frame with rows and column indicators
melteddf <- data.frame(
value=df[lower.tri(df,diag=T)],
col=rep(1:ncol(df),ncol(df):1),
row=unlist(sapply(1:nrow(df),function(x) x:nrow(df)))
)
I wish I knew about the arr.ind part of cbind which before now though.
Here is a method using arrayInd which is basically the same as #joran's but might be useful in other settings:
na.omit( data.frame(arrayInd(1:prod(dim(A)), dim(A)), value=c(A)) )
X1 X2 value
1 1 1 11
2 2 1 12
3 3 1 13
4 4 1 14
5 5 1 15
7 2 2 22
8 3 2 23
9 4 2 24
10 5 2 25
13 3 3 33
14 4 3 34
15 5 3 35
19 4 4 44
20 5 4 45
25 5 5 55