I have set up a game that rolls a pair of dice using the code below, but the final part does not work. That part is supposed to calculate the winning percentage by taking the average over the throws vector. The player wins if double sixes come up.
throws <- NULL
for( i in 1:24 ){
Die1 <-sample(c(1,2,3,4,5,6),size=1,replace = TRUE,prob = NULL)
Die2 <-sample(c(1,2,3,4,5,6),size=1,replace = TRUE,prob = NULL)
throw <- Die1+Die2
throws[i] <- throw
}
throws
game <- any( throws == 12 )
game
for (i in 24){
if (Die1==6 & Die2==6){
throws/2 * 100}
R is vectorized and your problem can be solved with vectorized instructions only.
To repeat the same code a certain number of times, use replicate, not a loop (neither for nor *apply loops);
now that throws are stored in a matrix, use the optimized colSums, it's coded in C, it's simple and it's fast;
a proportion can be computed as the mean value of a logical vector.
set.seed(2023) # make the results reproducible
nthrows <- 24L
throws <- replicate(nthrows, sample(6L, 2L, replace = TRUE))
throws
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
#> [1,] 5 3 4 1 5 1 5 3 4 1 6 6 5 2
#> [2,] 1 2 2 1 1 5 2 5 5 1 2 6 1 6
#> [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24]
#> [1,] 6 5 6 6 6 2 3 6 5 2
#> [2,] 1 4 1 4 6 2 4 1 6 4
game <- any(colSums(throws) == 12L)
game
#> [1] TRUE
mean(colSums(throws) == 12L) * 100
#> [1] 8.333333
Created on 2023-02-16 with reprex v2.0.2
Edit
The following code gives the matrix throws row and column names. The rest of the code in the original answer remains the same.
set.seed(2023) # make the results reproducible
nthrows <- 24L
throws <- replicate(nthrows, sample(6L, 2L, replace = TRUE))
dimnames(throws) <- list(c("Die1", "Die2"),
paste0("throw", seq.int(nthrows)))
throws
#> throw1 throw2 throw3 throw4 throw5 throw6 throw7 throw8 throw9 throw10
#> Die1 5 3 4 1 5 1 5 3 4 1
#> Die2 1 2 2 1 1 5 2 5 5 1
#> throw11 throw12 throw13 throw14 throw15 throw16 throw17 throw18 throw19
#> Die1 6 6 5 2 6 5 6 6 6
#> Die2 2 6 1 6 1 4 1 4 6
#> throw20 throw21 throw22 throw23 throw24
#> Die1 2 3 6 5 2
#> Die2 2 4 1 6 4
Created on 2023-02-17 with reprex v2.0.2
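The percentage above is the share of double sixes within one game of 24 throws. If "winning percentage" instead means how often a whole game is won, the same vectorized idea can be applied across many simulated games. This is only a minimal sketch under that assumption; n_games, die1, die2, wins, win_pct and theory_pct are names made up for the illustration.

```r
set.seed(2023)            # make the results reproducible
n_games  <- 10000L        # hypothetical number of simulated games
n_throws <- 24L

# One column per game, one row per throw
die1 <- matrix(sample(6L, n_games * n_throws, replace = TRUE), nrow = n_throws)
die2 <- matrix(sample(6L, n_games * n_throws, replace = TRUE), nrow = n_throws)

# A game is won if any of its throws is a double six
wins <- colSums(die1 + die2 == 12L) > 0

win_pct    <- mean(wins) * 100                 # simulated winning percentage
theory_pct <- (1 - (35 / 36)^n_throws) * 100   # exact value, about 49.1
```

With enough games, win_pct should settle close to theory_pct.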
I was able to write a function in R to "shift" the columns of a matrix to the right by one:
shift <- function(disc){
mat <- matrix(nrow = 4, ncol = 12)
mat[,1] <- disc[,12]
for(i in 1:11){
mat[,i+1] <- disc[,i]
}
return(mat)
}
So to see how that works:
> disc0
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,] 2 5 10 7 16 8 7 8 8 3 4 12
[2,] 3 3 14 14 21 21 9 9 4 4 6 6
[3,] 8 9 10 11 12 13 14 15 4 5 6 7
[4,] 14 11 14 14 11 14 11 14 11 11 14 11
> shift(disc0)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,] 12 2 5 10 7 16 8 7 8 8 3 4
[2,] 6 3 3 14 14 21 21 9 9 4 4 6
[3,] 7 8 9 10 11 12 13 14 15 4 5 6
[4,] 11 14 11 14 14 11 14 11 14 11 11 14
What if I wanted to shift over 3 times, for example? I could do this manually:
> x <- disc0
> x <- shift(x)
> x <- shift(x)
> x <- shift(x)
> x
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,] 3 4 12 2 5 10 7 16 8 7 8 8
[2,] 4 6 6 3 3 14 14 21 21 9 9 4
[3,] 5 6 7 8 9 10 11 12 13 14 15 4
[4,] 11 14 11 14 11 14 14 11 14 11 14 11
So the original first column (2,3,8,14) is now in the 4th column.
But how can I automate this? I want to write a function that will repeat my shift function n times. Thanks in advance
You could write a function that takes in the shift parameter:
shift <- function(x, num = 1){
n <- ncol(x)
x[, c((n - num +1):n, 1:(n - num))]
}
mat
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 1 2 3 4 5 6 7 8
[2,] 1 2 3 4 5 6 7 8
[3,] 1 2 3 4 5 6 7 8
[4,] 1 2 3 4 5 6 7 8
shift(mat)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 8 1 2 3 4 5 6 7
[2,] 8 1 2 3 4 5 6 7
[3,] 8 1 2 3 4 5 6 7
[4,] 8 1 2 3 4 5 6 7
shift(mat,2)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 7 8 1 2 3 4 5 6
[2,] 7 8 1 2 3 4 5 6
[3,] 7 8 1 2 3 4 5 6
[4,] 7 8 1 2 3 4 5 6
shift(mat,3)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 6 7 8 1 2 3 4 5
[2,] 6 7 8 1 2 3 4 5
[3,] 6 7 8 1 2 3 4 5
[4,] 6 7 8 1 2 3 4 5
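The index expression above assumes 0 < num < ncol(x). It breaks for num = 0 and for num equal to ncol(x), because 1:0 counts down in R. A possible variant, sketched here under the hypothetical name shift_mod, builds the column index with modular arithmetic so that any integer shift wraps around:

```r
# Hypothetical variant of shift(): rotate columns right by num positions,
# wrapping around for num = 0, num >= ncol(x) and negative num.
shift_mod <- function(x, num = 1) {
  n <- ncol(x)
  idx <- (seq_len(n) - num - 1) %% n + 1   # source column for each target column
  x[, idx, drop = FALSE]
}

mat <- matrix(rep(1:8, each = 4), nrow = 4)
shift_mod(mat, 3)[1, ]
# [1] 6 7 8 1 2 3 4 5
```

A negative num shifts to the left, and drop = FALSE keeps the result a matrix even when x has a single column.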
You may use a for loop with your original one-step shift function:
n <- 3
for(i in seq_len(n)) {
disc0 <- shift(disc0)
}
disc0
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
#[1,] 3 4 12 2 5 10 7 16 8 7 8 8
#[2,] 4 6 6 3 3 14 14 21 21 9 9 4
#[3,] 5 6 7 8 9 10 11 12 13 14 15 4
#[4,] 11 14 11 14 11 14 14 11 14 11 14 11
I have the following data frame
Name <- c("Jhon", "Lee", "Suzan", "Abhinav",
"Brain", "Ron","Cat","Mike","Bob","Sue","Carl")
Vote <- rep(letters[1:21],each=10, len=230)
z <- as.data.frame (cbind(Name, Vote))
I want to create a list of data frames representing all possible combinations of 6 names (out of the 11 that I have) with their respective votes, with the other 5 names appended as well.
The following gives me all possible combinations of 6 names, of which there are 462:
comb<-combn(unique(as.character(z$Name)), 6)
comb has 462 columns, so it is the correct output.
The following code creates the list of all data frames across the combinations.
combdf <- apply(comb, 2, function(vec) z[ z$Name %in% vec, ] )
The following code should create the output that I want
output <- z %>%
pull(Name) %>%
unique %>%
combn(., 3, FUN = function(vec)
z %>%
filter(Name %in% vec) %>%
bind_rows(z %>%
filter(!Name %in% vec) %>%
rename(Name2 = Name, Vote2 = Vote)) %>%
mutate(across(c(Name2, Vote2),
~ .[order(is.na(.))])), simplify = FALSE)
My problem is that output has 165 data frames and I expect 462. In addition, each data frame in combdf, if I am not wrong, should have 230 rows (as does my original data frame, z). However, this is not the case. For example, number 1 has 226, number 4 has 229, and number 18 has 228 (checked at random).
The m argument of combn sets the size of each combination, and therefore also determines the number of combinations:
> combn(1:6, 2)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
[1,] 1 1 1 1 1 2 2 2 2 3 3 3 4 4 5
[2,] 2 3 4 5 6 3 4 5 6 4 5 6 5 6 6
> combn(1:6, 3)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20]
[1,] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 4
[2,] 2 2 2 2 3 3 3 4 4 5 3 3 3 4 4 5 4 4 5 5
[3,] 3 4 5 6 4 5 6 5 6 6 4 5 6 5 6 6 5 6 6 6
Note the difference in the number of columns. Similarly, in the OP's post, 'combdf' was created with m = 6, while the tidyverse code uses m = 3. That accounts for the difference between 462 and 165.
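The two counts can be checked directly with choose, which gives the number of columns combn will return. A small sketch for the 11 names in the question (n_names is just an illustrative variable):

```r
n_names <- 11L

choose(n_names, 6L)        # combinations of 6 out of 11
# [1] 462
choose(n_names, 3L)        # combinations of 3 out of 11
# [1] 165

# combn returns one column per combination
n_cols <- ncol(combn(n_names, 6L))
```

So changing m from 3 to 6 in the tidyverse code should yield the expected 462 data frames.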
Each iteration of my sapply call outputs an n*m matrix; n is fixed, m is not.
For example, if I run this in R:
sapply(1:3, function(x) {matrix(1:9, 3)})
and it will output:
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 2 2 2
[3,] 3 3 3
[4,] 4 4 4
[5,] 5 5 5
[6,] 6 6 6
[7,] 7 7 7
[8,] 8 8 8
[9,] 9 9 9
However, what I want is something like this:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 4 7 1 4 7 1 4 7
[2,] 2 5 8 2 5 8 2 5 8
[3,] 3 6 9 3 6 9 3 6 9
Any idea for this? Thanks
One solution is:
do.call(cbind, lapply(1:3, function(x) {matrix(1:9, 3)}))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 4 7 1 4 7 1 4 7
[2,] 2 5 8 2 5 8 2 5 8
[3,] 3 6 9 3 6 9 3 6 9
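Since the question says m is not fixed, it may help to see that the same lapply plus do.call(cbind, ...) pattern also works when the matrices have different numbers of columns. A minimal sketch with made-up matrices of 1, 2 and 3 columns (mats and res are illustrative names):

```r
# Each iteration returns a 3-row matrix with m columns (m varies)
mats <- lapply(1:3, function(m) matrix(seq_len(3 * m), nrow = 3))

# cbind only needs the row counts to agree
res <- do.call(cbind, mats)
dim(res)
# [1] 3 6
```

lapply always returns a list, so the result does not depend on whether sapply would have been able to simplify the pieces.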
We can use replicate
`dim<-`(replicate(3, matrix(1:9, 3)), c(3, 3*3))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
#[1,] 1 4 7 1 4 7 1 4 7
#[2,] 2 5 8 2 5 8 2 5 8
#[3,] 3 6 9 3 6 9 3 6 9
I am trying to optimize parts of a simulation script by running them in parallel, which is how I discovered the foreach function.
As an example, I am trying to run the following in parallel:
Wiederholungen<-4
assign("A",array(0,c(Wiederholungen,10,3)))
assign("B",array(0,c(Wiederholungen,10,3)))
assign("C",array(0,c(Wiederholungen,10,3)))
Old Code:
for(m in 1:Wiederholungen){
A[m,,]<-m
B[m,,]<-m
C[m,,]<-m
}
Resulting in: (for A)
A
, , 1
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 1 1 1 1 1 1 1 1 1
[2,] 2 2 2 2 2 2 2 2 2 2
[3,] 3 3 3 3 3 3 3 3 3 3
[4,] 4 4 4 4 4 4 4 4 4 4
, , 2
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 1 1 1 1 1 1 1 1 1
[2,] 2 2 2 2 2 2 2 2 2 2
[3,] 3 3 3 3 3 3 3 3 3 3
[4,] 4 4 4 4 4 4 4 4 4 4
, , 3
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 1 1 1 1 1 1 1 1 1
[2,] 2 2 2 2 2 2 2 2 2 2
[3,] 3 3 3 3 3 3 3 3 3 3
[4,] 4 4 4 4 4 4 4 4 4 4
This should put the value of m in the m-th row of the arrays A, B and C. In my script, m is the number of replicates (Wiederholungen) the script should run.
Foreach Code:
foreach(m=1:Wiederholungen)%dopar%{
A[m,,]<-m
B[m,,]<-m
C[m,,]<-m
}
Resulting in:
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
[[4]]
[1] 4
But this does not produce the result shown above for the arrays A, B and C. I know that foreach has a combine option, but it seems to work only if you have the result for one matrix and not for several, as in my example. How do I get the same result as with my old for loop using a parallel processing approach? I am using Ubuntu and 3.2.2.
I know this is a very old post, but thought an answer might help someone in the future.
foreach returns a list containing, for each iteration, the value of the last expression evaluated in the loop body. So if you want A, B, and C returned, you need to list them as the final step of the loop.
This code works for me.
W<- foreach(m = 1:Wiederholungen) %dopar% {
A[m,,]<-m
B[m,,]<-m
C[m,,]<-m
list(A, B, C)
}
For future reference, the foreach vignette can be very helpful with things like this.
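Note that with a parallel backend each iteration typically works on its own copy of A, B and C, so every element of W holds arrays in which only row m was filled in. One way to recover the result of the original for loop is to return just the slice computed in each iteration and assemble the arrays afterwards. The following is a minimal sketch, not a definitive implementation: it assumes the foreach package is installed, uses %do% so that it runs without a registered backend (swap in %dopar% once one is registered), and the names res and fill are made up for the illustration.

```r
library(foreach)

Wiederholungen <- 4

# Each iteration returns its own 10 x 3 slices instead of modifying A, B, C
res <- foreach(m = 1:Wiederholungen) %do% {
  list(A = matrix(m, 10, 3), B = matrix(m, 10, 3), C = matrix(m, 10, 3))
}

# Hypothetical helper: put slice m of the named element into row m of an array
fill <- function(name) {
  out <- array(0, c(Wiederholungen, 10, 3))
  for (m in seq_along(res)) out[m, , ] <- res[[m]][[name]]
  out
}

A <- fill("A")
B <- fill("B")
C <- fill("C")
```

The assembly step is cheap compared with the simulation itself, so it can stay sequential.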