R - Loop with If statement and tracking the number of successes - r

I want to simulate different poker hands. Through painful trial and error, I got the ranks, suits, the deck and a function to draw any given number of cards as:
suits <- c("spd","hrt","dimd","clbs")
ranks <- c(1:10,"J","Q","K")
deck <- as.matrix(expand.grid('rank' = ranks, 'suit' = suits))
draw <- function (n) deck[sample(nrow(deck), n), ]
draw(5) # Drawing 5 cards from the deck...
Output:
rank suit
[1,] "4" "dimd"
[2,] "6" "dimd"
[3,] "8" "spd"
[4,] "K" "hrt"
[5,] "8" "clbs"
Now I want to find out through simulation the probability of getting different hands. I did come up with some possible loops with a counter for the number of successes but I am stuck.
Here is an example. Let me try to figure out how many full houses I get in 1000 simulations. Since a full house is defined as "three matching cards of one rank and two matching cards of another rank", I figured that the key part of the function would be a boolean inside an if statement that checks length(unique()) == 2 on the ranks, meaning 2 unique ranks. With 5 cards dealt, 2 unique ranks could be a full house (the other possibility is four of a kind plus one card of another rank).
iterations <- 1000
counter <- 0
for (i in iterations){
s <- draw(5)
if(length(unique(s[,1])) == 2) counter <- counter + 1
}
counter
Output: [1] 0
I have tried multiple other things, including counter[i] <- 1 for successful cases, and with the idea of running a sum(counter) at the end, but all without getting the loop to work.

In your code you have:
for(i in 1000) {
print(i)
} # 1000
The loop body runs only once, because for (i in 1000) iterates over the single value 1000. To loop 1000 times, iterate over 1:1000 (in your code, 1:iterations).
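To make the difference concrete, here is a minimal sketch that counts how many times each loop body executes. The counter names runs_scalar and runs_seq are made up for illustration.

```r
iterations <- 1000

# Iterating over the single number 1000: the body runs once, with i == 1000
runs_scalar <- 0
for (i in iterations) runs_scalar <- runs_scalar + 1

# Iterating over the sequence 1, 2, ..., 1000: the body runs 1000 times
runs_seq <- 0
for (i in seq_len(iterations)) runs_seq <- runs_seq + 1

runs_scalar  # 1
runs_seq     # 1000
```

seq_len(iterations) behaves like 1:iterations here, but it yields an empty sequence when iterations is 0, which is usually the safer behaviour.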
Here's an alternative approach using rle.
iterations <- 10000
draws <- list()
for (i in 1:iterations){
s <- draw(5)
draws[[i]] <- all(rle(sort(s[,1]))$lengths %in% c(2,3))
if(draws[[i]]) {
print(s)
}
}
summary(unlist(draws))
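As a small illustration of why the rle() test works, the sketch below applies it to three hand-written rank vectors. The helper name is_full_house and the example hands are assumptions for illustration, not part of the code above. After sorting, rle() reports how many times each rank occurs; in a five-card hand, counts that are all 2 or 3 can only be one pair plus one triple.

```r
# Assumes exactly five ranks are passed in
is_full_house <- function(hand_ranks) {
  counts <- rle(sort(hand_ranks))$lengths  # occurrences of each distinct rank
  all(counts %in% c(2, 3))
}

full_house <- c("8", "K", "8", "K", "8")  # counts 3 and 2
four_kind  <- c("8", "8", "K", "8", "8")  # counts 4 and 1
two_pair   <- c("8", "8", "K", "K", "2")  # counts 1, 2 and 2

is_full_house(full_house)  # TRUE
is_full_house(four_kind)   # FALSE
is_full_house(two_pair)    # FALSE
```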

Using a data frame as follows, it seems to produce the result you are looking for:
suits <- c("spd","hrt","dimd","clbs")
ranks <- c(1:10,"J","Q","K")
deck <- as.data.frame(expand.grid('rank' = ranks, 'suit' = suits))
draw <- function (n) deck[sample(nrow(deck), n), ]
counter <- 0
for (i in 1:1000) {
df <- draw(5)
counter <- counter + (length(unique(df$rank)) == 2)
}
counter
[1] 156

suits <- c("spd","hrt","dimd","clbs")
ranks <- c(1:10,"J","Q","K")
deck <- as.data.frame(expand.grid('rank' = ranks, 'suit' = suits))
draw <- function (n) deck[sample(nrow(deck), n), ]
iterations <- 1000
counter <- 0
for (i in 1:iterations) {
hand <- draw(5)
rank_table <- table(as.character(hand[, 1])) # as.character() drops unused factor levels, so only dealt ranks are counted
if (length(names(rank_table)) == 2 && min(rank_table) > 1) counter <- counter + 1
# could have four of a rank, one of another;
# need to ensure two of a rank, three of another
}
counter
[1] 1
This result is not far from what is expected: the probability of a full house is about 0.00144, i.e. roughly 1.4 per 1000 hands. See http://www.math.hawaii.edu/~ramsey/Probability/PokerHands.html
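For comparison with the simulated count, the exact probability can be worked out with choose(). This is a short sketch using the standard counting argument (pick the rank of the triple and three of its four suits, then the rank of the pair and two of its four suits); the variable names are illustrative.

```r
# Number of distinct full-house hands
full_house_hands <- 13 * choose(4, 3) * 12 * choose(4, 2)  # 3744

# Number of distinct five-card hands from a 52-card deck
total_hands <- choose(52, 5)                               # 2598960

p_full_house <- full_house_hands / total_hands
p_full_house         # about 0.00144
1000 * p_full_house  # expected full houses in 1000 hands, about 1.44
```

With an expected count this small, 1000 iterations give a very noisy estimate, so a larger number of iterations is needed for a stable comparison.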

Related

Finding all combinations using recursion in R

I'm having issue with returning values from recursive functions, hoping you could help me out. I have a list with a bunch of matrices, each matrix representing a set of possible combinations and generated using combn(). As an example, this could be 3 matrices inside the list:
# set 1 has 4 elements, do nCk = 4C1:
set1 <- c("b2","b3","b4","b5")
set1 <- combn(set1,1,simplify = T)
# set 2 has 3 elements, choose 2:
set2 <- c("c1","c2","b2")
set2 <- combn(set2,2,simplify = T)
# set 3 has 10 elements, choose 1:
set3 <- combn(c(1:10),1, simplify = T)
If we were to print set2, for instance, it would have 2 rows (choose 2), and 3 columns (3C2 = 3):
> set2
[,1] [,2] [,3]
[1,] "c1" "c1" "c2"
[2,] "c2" "b2" "b2"
I need to get all possible 4-element combinations (one combination from each set above). I can do this using a while loop and simulating a state machine, but that solution is clunky and makes for long code. I know this can be done using recursion, as I was able to print the 120 combinations correctly (code below), but when trying to return them or save them in a variable, I either get a subscript out of bounds error or the results repeat thousands of times. I also want to avoid global variables: this will be embedded in a rather large project, so I'd prefer not to bloat my workspace with more variables than needed.
Of course, when deployed the number of sets will be dynamic, and the elements per set will change too. The sets aren't too big either, so I would love to implement a recursive approach!
Working code to print:
combb <- function(allsets, number, carry){
if(number>length(allsets)){
print(carry)
return()
} else{
for(j in 1:length(allsets[[number]][1,])){
newcarry <- c(carry, allsets[[number]][,j])
number2 <- number + 1
combb(allsets, number2, newcarry)
}
}
}
Thank you!
I found that it was very hard to carry the results back and forth, as it required flags and lists or a different solution. What I did instead was create a wrapper function where I created a local variable. The recursive function is defined inside, and accesses ("globally") the variable mentioned above. Then, that variable is returned by the wrapper:
combb <- function(allsets){
carry <- integer(0)
height <- 0L
for (j in 1:length(allsets)) {
height <- height + length(allsets[[j]][, 1])
}
output <- matrix(allsets[[1]][0, 1], nrow = height, ncol = 0)
combb1 <- function(allsets, number, carry) {
if(number > length(allsets)){
output <<- cbind(output, carry, deparse.level = 0)
return()
} else{
for (j in 1:length(allsets[[number]][1,])) {
# Only add unique combinations (some combinations are vectors)
if((TRUE %in% (allsets[[number]][, j] %in% carry)) == FALSE) {
newcarry <- c(carry, allsets[[number]][, j], use.names = FALSE)
number2 <- number + 1
combb1(allsets, number2, newcarry)
} else{
next()
}
}
}
}
combb1(allsets, 1, carry)
return(output)
}
As you can see from that solution, recursion is neat (combb1 function) and doesn't clutter any of the global/workspace variables.
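An alternative that avoids the <<- assignment altogether is to have each recursive call return its combinations and let the caller bind them together. The sketch below is one possible way to do that, not the approach used above: the function name combine_sets is hypothetical, and it mirrors the printing version from the question, so it does not apply the uniqueness filter.

```r
combine_sets <- function(allsets, number = 1, carry = character(0)) {
  # Base case: every set has contributed, so return the finished combination
  if (number > length(allsets)) {
    return(matrix(carry, ncol = 1))
  }
  # Recurse once per column of the current set and collect the results
  pieces <- lapply(seq_len(ncol(allsets[[number]])), function(j) {
    combine_sets(allsets, number + 1, c(carry, allsets[[number]][, j]))
  })
  do.call(cbind, pieces)
}

set1 <- combn(c("b2", "b3", "b4", "b5"), 1)
set2 <- combn(c("c1", "c2", "b2"), 2)
set3 <- combn(1:10, 1)

result <- combine_sets(list(set1, set2, set3))
dim(result)   # 4 120
result[, 1]   # "b2" "c1" "c2" "1"
```

Each column of result is one combination. Because the values are returned rather than assigned to an enclosing variable, no wrapper function is needed.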

Running random operations over vector, conditional

I am doing some modelling and wish to simulate randomness.
I have a total number of runs run_times which is 5 in this example.
A vector run_lengths holds the possible run lengths; a run prints that many 1's, so a run length of 3 prints three 1's.
sample_data contains a sequence of 1's and 0's. A run of 1's can only start at a position where sample_data == 1, but not every such position should be used: the starting positions are picked at random, and a run may be started at most run_times (5) times in total.
There are a few moving parts, for sure.
I am tackling the problem in this manner:
I am able to select run_lengths at random with sample(run_lengths, 1). I am unsure how to select sample_data at random and I'm trying to keep a counter in order to stay under run_times:
run_lengths <- c(2,4,5,6,7,8,1)
run_times <- 5
sample_data <- c(0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0)
# Randomly select 1's from sample_data, when find 1, randomly print 1's along run_lengths
# Only print a certain amount of times (run_times)
# Pick run_lengths at random == sample(run_lengths,1)
# Pick df$sample 1's at random, how to randomly select????
count <- 0 # keep track of how many random run_lengths is being applied
res <- NULL
while (length(res) < length(sample_data)) {
if (sample_data[length(res)+1] == 1 & count < run_times) { # not sure how to pick sample_data == 1?
res <- c(res, rep(1,sample(run_lengths,1))) # if signal == 1 (randomly) then randomly rep a run_length
count <- count +1 # count how many random reps, run_lengths have been applied
} else {
res <- c(res, 0) # Note if condition is not true, we print 0 vs 1
}
}
res <- res[1:length(sample_data)]
res
I think I'm about 60% of the way there. I'm not sure what the best approach is for choosing random 1's from sample_data, or how to keep the number of applied run_lengths under the run_times maximum. I am attempting to keep a count of how many times the condition was true; once the maximum is reached, any further true conditions would be ignored.
OK, time to put down some code, though I'm still not sure whether it's right:
sample_data <- c(0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0)
# take indices of sampled data where value == 1
i <- which(sample_data %in% 1)
# now shuffle them all, no replacement - random positions with 1s
p <- sample(i, length(i), replace=FALSE)
print(sample_data[p[1]])
print(sample_data[p[2]])
print(sample_data[p[3]])
...
Is this what you want?
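Building on the indices of the 1's, here is one possible reading of the full task as a sketch. It assumes that exactly run_times start positions are drawn at random from the positions where sample_data == 1, and that each start gets an independently drawn run length. The shortened sample_data, the fixed seed, and the variable names ones, starts and res are stand-ins for illustration.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

run_lengths <- c(2, 4, 5, 6, 7, 8, 1)
run_times   <- 5
# Shortened stand-in for the sample_data vector in the question
sample_data <- rep(c(0, 1, 0, 1, 0), times = c(7, 13, 11, 11, 6))

# Positions holding a 1, then a random subset of at most run_times of them
ones   <- which(sample_data == 1)
starts <- sort(ones[sample.int(length(ones), min(run_times, length(ones)))])

res <- numeric(length(sample_data))
for (s in starts) {
  # Draw a run length by index, which stays safe if run_lengths has length 1
  len <- run_lengths[sample.int(length(run_lengths), 1)]
  end <- min(s + len - 1, length(res))  # do not run past the end of the data
  res[s:end] <- 1
}
res
```

Runs started close together can overlap in this sketch; if overlapping runs are not wanted, the start positions would need an extra spacing check.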

Storing output of nested loop in R

I am new to R but trying desperately to learn the ropes. In fact I feel a little stupid asking this question as I have gone through a number of similar problems but have not been able to get the desired results. My code is as shown below :
## Initializing Parameters
fstart <- 960 ## Start frequency in MHz
fstop <- 1240 ## Stop Frequency In MHz
bw <- 5.44 ## IF Bandwidth in MHz
offset <- 100 ## Max. Variation in TOD in milliseconds
f_dwell <- 1 ## Time spent on each search frequency in millisecond
iterations <- 100 ## No. of iterations to run
## No. of possible frequencies
f <- seq((fstart + bw/2), (fstop - bw/2), by=bw)
## Initializing the frequency table
freq_table <- matrix (NA, nrow=(2*offset +1), ncol=offset)
## Fill frequency table row wise with random values of possible frequencies
for (i in 1:(2*offset + 1)){
row_value <- c(sample(f), sample(f, offset-length(f)))
freq_table[i, ] <- row_value
}
## Assign a row from freq_table to unknown node
unknown_node <- freq_table[sample(1:(2*offset + 1), 1), ]
t = numeric(iterations)
## Calculate number of repetitions of frequencies
for(k in 1:iterations){
for(j in 1:offset){
y <- (sort(table(freq_table[, j]), decreasing=TRUE))
x <- as.vector(y) ## Number of repetitions of each frequency
y <- names(y)
## Search Frequencies
sf1 <- as.numeric(y[1])
sf2 <- as.numeric(y[2])
if (unknown_node[j] == sf1){
t[k] <- ((j-1)*f_dwell)*2 + f_dwell
break
}
else {
if (unknown_node[j] == sf2){
t[k] <- ((j-1)*f_dwell)*2 + 2*f_dwell
break
}
}
## Delete rows from freq_table that have sf1 & sf2
freq_table <- subset(freq_table, freq_table[, 1]!=sf1 & freq_table[, 1]!=sf2 )
}
}
print(t)
If I run this without the k for loop, I get different values of variable t every time. However, I wanted to run the inner for loop iteratively and get a vector of t values each time the inner for loop runs. I do get the length of t as 100, but the values are repeating. The first few values (2 or 3 or sometimes 4) are different, but the rest keep repeating. I can't figure out why.
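One likely reason for the repeating values, judging only from the code shown, is that freq_table and unknown_node are created once before the k loop: nothing random is redrawn per iteration, and rows deleted from freq_table in one iteration stay deleted in the next. A minimal sketch of the general pattern follows, where each trial builds its own table inside a function. The function one_trial, its arguments, and the toy search rule are simplifications invented for illustration, not a reimplementation of the code above.

```r
set.seed(42)  # assumption: fixed seed for reproducibility

one_trial <- function(n_rows = 21, n_cols = 10, values = 1:5) {
  # Fresh random table and fresh unknown node for every trial
  tab  <- matrix(sample(values, n_rows * n_cols, replace = TRUE), nrow = n_rows)
  node <- tab[sample(n_rows, 1), ]
  for (j in seq_len(n_cols)) {
    # Most frequent value among the remaining rows of column j
    top <- as.numeric(names(sort(table(tab[, j]), decreasing = TRUE))[1])
    if (node[j] == top) return(j)              # found at step j
    tab <- tab[tab[, j] != top, , drop = FALSE]  # discard rows ruled out
  }
  NA_integer_                                   # not found within n_cols steps
}

# Each call starts from its own table, so no state leaks between iterations
found_at <- vapply(seq_len(100), function(k) one_trial(), integer(1))
table(found_at, useNA = "ifany")
```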

Store values in For Loop

I have a for loop in R in which I want to store the result of each calculation (for all the values looped through). In the for loop a function is called and the output is stored in a variable r in the moment. However, this is overwritten in each successive loop. How could I store the result of each loop through the function and access it afterwards?
Thanks,
example
for (par1 in 1:n) {
var<-function(par1,par2)
c(var,par1)->var2
print(var2)
}
So print shows every instance of var2, but var2 itself only holds the value for the last n. Is there any way to get an array of the data or something?
Initialise an object of the right length and then assign each value by index:
a <- numeric(10)
for (i in 1:10) {
a[i] <- mean(rnorm(50))
}
print(a)
EDIT:
To include an example with two output variables, in the most basic case, create an empty matrix with the number of columns corresponding to your output parameters and the number of rows matching the number of iterations. Then save the output in the matrix, by indexing the row position in your for loop:
n <- 10
mat <- matrix(ncol=2, nrow=n)
for (i in 1:n) {
var1 <- function_one(i,par1)
var2 <- function_two(i,par2)
mat[i,] <- c(var1,var2)
}
print(mat)
The iteration number i corresponds to the row number in the mat object. So there is no need to explicitly keep track of it.
However, this is just to illustrate the basics. Once you understand the above, it is more efficient to use the elegant solution given by @eddi, especially if you are handling many output variables.
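Since function_one and function_two are placeholders, here is a runnable variant of the same pattern as a sketch. It assumes the two outputs per iteration are the mean and standard deviation of a random sample; the column names and the seed are illustrative choices.

```r
set.seed(123)  # assumption: fixed seed for reproducibility
n <- 10

# One row per iteration, one column per output value
mat <- matrix(NA_real_, nrow = n, ncol = 2,
              dimnames = list(NULL, c("mean", "sd")))

for (i in seq_len(n)) {
  x <- rnorm(50)
  mat[i, ] <- c(mean(x), sd(x))  # row i holds the results of iteration i
}

mat
```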
To get a list of results:
n = 3
lapply(1:n, function(par1) {
# your function and whatnot, e.g.
par1*par1
})
Or sapply if you want a vector instead.
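For instance, a minimal sketch of the same squaring example with sapply, where the result name squares is made up for illustration:

```r
# sapply simplifies the list of results to a plain vector
squares <- sapply(1:3, function(par1) par1 * par1)
squares  # 1 4 9
```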
A bit more complicated example:
n = 3
some_fn = function(x, y) { x + y }
par2 = 4
lapply(1:n, function(par1) {
var = some_fn(par1, par2)
return(c(var, par1)) # don't have to type return, but I chose to make it explicit here
})
#[[1]]
#[1] 5 1
#
#[[2]]
#[1] 6 2
#
#[[3]]
#[1] 7 3
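If a matrix is more convenient than a list, the list elements can be stacked afterwards. This is a small sketch that repeats the example and binds the results with do.call(rbind, ...); the names res and res_mat are illustrative.

```r
some_fn <- function(x, y) x + y
par2 <- 4

res <- lapply(1:3, function(par1) c(some_fn(par1, par2), par1))

# Stack the list elements as rows: one row per iteration
res_mat <- do.call(rbind, res)
res_mat
#      [,1] [,2]
# [1,]    5    1
# [2,]    6    2
# [3,]    7    3
```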

Faster solution to looped grouped RLE calculation

I have a working solution to my problem, but I will not be able to use it because it is so slow (my calculations predict that the whole simulation will take 2-3 years!). Thus I am looking for a better (faster) solution. This is (in essence) the code I am working with:
N=4
x <-NULL
for (i in 1:N) { #first loop
v <-sample(0:1, 1000000, 1/2) #generate data
v <-as.data.frame(v) #convert to dataframe
v$t <-rep(1:2, each=250) #group
v$p <-rep(1:2000, each=500) #p.number
# second loop
for (j in 1:2000) { #second loop
#count rle for group 1 for each pnumber
x <- rbind(x, table(rle(v$v[v$t==1&v$p==j])))
#count rle for group 2 for each pnumber
x <- rbind(x, table(rle(v$v[v$t==2&v$p==j])))
} #end second loop
} #end first loop
#total rle counts for both group 1 & 2
y <-aggregate(x, list(as.numeric(rownames(x))), sum)
In words: The code generates a coin-flip simulation (v). A group factor is generated (1 & 2). A p.number factor is generated (1:2000). The run lengths are recorded for each p.number (1:2000) for both groups 1 & group 2 (each p.number has runs in both groups). After N loops (the first loop), the total run lengths are presented as a table (aggregate) (that is, the run lengths for each group, for each p.number, over N loops as a total).
I need the first loop because the data that I am working with comes in individual files (so I'm loading the file, calculating various statistics etc and then loading the next file and doing the same). I am much less attached to the second loop, but can't figure out how to replace it with something faster.
What can be done to the second loop to make it (hopefully, a lot) faster?
You are committing the cardinal sin of growing an object within a for() loop in R. Don't (I repeat don't) do this. Allocate sufficient storage for x at the beginning and then fill in x as you go.
x <- matrix(nrow = N * (2000 * 2), ncol = ??)
Then in the inner loop
x[ii, ] <- table(rle(....))
where ii is a loop counter that you initialise to 1 before the first loop and increment within the second loop:
x <- matrix(nrow = N * (2000 * 2), ncol = ??)
ii <- 1
for(i in 1:N) {
.... # stuff here
for(j in 1:2000) {
.... # stuff here
x[ii, ] <- table(rle(....))
## increment ii
ii <- ii + 1
x[ii, ] <- table(rle(....))
## increment ii
ii <- ii + 1
} ## end inner loop
} ## end outer loop
Also note that you are reusing index i in both for() loops, which will not work. i is just a normal R object, and so both for() loops will be overwriting it as they progress. Use j for the second loop as I did above.
Try that simple optimisation first and see if that will allow the real simulation to complete in an acceptable amount of time. If not, come back with a new Q showing the latest code and we can think about other optimisations. The optimisation above is simple to do, optimising table() and rle() might take a lot more work. Noting that, you might look at the tabulate() function which does the heavy lifting in table(), which might be one avenue for optimising that particular step.
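As a rough sketch of how preallocation and tabulate() could fit together, the example below counts run lengths per group into a matrix that is allocated once. The sizes, the seed, and the names n_groups, group_size, max_run and counts are assumptions for illustration; using the group size as the number of columns is just one way to guarantee every row has the same width, and it counts runs of 0s and 1s together.

```r
set.seed(1)                # assumption: fixed seed for reproducibility
n_groups   <- 4            # hypothetical number of groups
group_size <- 250          # hypothetical flips per group
max_run    <- group_size   # a run cannot be longer than its group

flips <- sample(0:1, n_groups * group_size, replace = TRUE)
group <- rep(seq_len(n_groups), each = group_size)

# Allocate the full result once: one row per group, one column per run length
counts <- matrix(0L, nrow = n_groups, ncol = max_run)

for (g in seq_len(n_groups)) {
  runs <- rle(flips[group == g])
  # tabulate() counts how many runs have length 1, 2, ..., max_run
  counts[g, ] <- tabulate(runs$lengths, nbins = max_run)
}

counts[, 1:6]  # number of runs of length 1 to 6 in each group
```

Because tabulate() always returns nbins entries, every row has the same width, which is what makes filling a preallocated matrix straightforward.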
If you just want to run rle and table for each combination of the values of v$t and v$p separately, there is no need for the second loop. It is much faster in this way:
values <- v$v + v$t * 10 + v$p * 100
runlength <- rle(values)
runlength$values <- runlength$values %% 2
x <- table(runlength)
y <- aggregate(unclass(x), list(as.numeric(rownames(x))), sum)
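To see why the combined key keeps runs from spilling across groups, here is a tiny hand-made sketch. The vectors v, grp and pnum are toy stand-ins for v$v, v$t and v$p. Because the group terms are multiples of 10 and therefore even, taking the key modulo 2 recovers the original 0/1 value.

```r
v    <- c(1, 1, 1, 1, 0, 0)  # coin flips
grp  <- c(1, 1, 2, 2, 2, 2)  # group changes after the second flip
pnum <- c(1, 1, 1, 1, 1, 1)

# Ignoring the groups, the four 1s look like a single run
rle(v)$lengths               # 4 2

# The key changes whenever the flip or either grouping variable changes
key  <- v + grp * 10 + pnum * 100   # 111 111 121 121 120 120
runs <- rle(key)
runs$values <- runs$values %% 2     # recover the flip value from the key

runs$lengths  # 2 2 2
runs$values   # 1 1 0
table(lengths = runs$lengths, values = runs$values)
```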
The whole code will look like this. If N is as low as 4, the growing object x will not be a severe problem. But generally I agree with @GavinSimpson that it is not a good programming technique.
N=4
x <-NULL
for (i in 1:N) { #first loop
v <-sample(0:1, 1000000, 1/2) #generate data
v <-as.data.frame(v) #convert to dataframe
v$t <-rep(1:2, each=250) #group
v$p <-rep(1:2000, each=500) #p.number
values <- v$v + N * 10 + v$t * 100 + v$p * 1000
runlength <- rle(values)
runlength$values <- runlength$values %% 2
x <- rbind(x, table(runlength))
} #end first loop
y <-aggregate(x, list(as.numeric(rownames(x))), sum) #total
