I am doing some modelling and wish to simulate randomness.
I have a total number of runs run_times which is 5 in this example.
A vector run_lengths holds the possible lengths of a run of 1's, so if the chosen run length is 3, a 1 is printed 3 times.
sample_data contains a sample of 1's and 0's. A run of 1's (with a length taken from run_lengths) is started at random where sample_data == 1, but not every position where sample_data == 1 should be picked, only a random subset, and a run may be started at most run_times (5) times in total.
There are a few moving parts for sure.
I am tackling the problem in this manner:
I am able to select run_lengths at random with sample(run_lengths, 1). I am unsure how to select sample_data at random and I'm trying to keep a counter in order to stay under run_times:
run_lengths <- c(2,4,5,6,7,8,1)
run_times <- 5
sample_data <- c(0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0)
# Randomly select 1's from sample_data, when find 1, randomly print 1's along run_lengths
# Only print a certain amount of times (run_times)
# Pick run_lengths at random == sample(run_lengths,1)
# Pick df$sample 1's at random, how to randomly select????
count <- 0 # keep track of how many random run_lengths is being applied
res <- NULL
while (length(res) < length(sample_data)) {
if (sample_data[length(res)+1] == 1 && count < run_times) { # not sure how to pick sample_data == 1?
res <- c(res, rep(1,sample(run_lengths,1))) # if signal == 1 (randomly) then randomly rep a run_length
count <- count +1 # count how many random reps, run_lengths have been applied
} else {
res <- c(res, 0) # Note if condition is not true, we print 0 vs 1
}
}
res <- res[1:length(sample_data)]
res
I think I am maybe 60% of the way there. I'm not sure what the best approach is for choosing random 1's from sample_data. I'm also not sure how to keep the number of applied run_lengths under the run_times maximum. I am attempting to keep a count of how often the condition was true; once the maximum is reached, any further true conditions should be ignored.
OK, time to put down some code, though I'm still not sure whether it is right.
sample_data <- c(0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0)
# take indices of sampled data where value == 1
i <- which(sample_data %in% 1)
# now shuffle them all, no replacement - random positions with 1s
p <- sample(i, length(i), replace=FALSE)
print(sample_data[p[1]])
print(sample_data[p[2]])
print(sample_data[p[3]])
...
Is this what you want?
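One way to combine the two ideas (random positions of 1's, and a cap of run_times runs) is sketched below. It is only a sketch under assumptions: sample_data here is a shortened stand-in for the vector in the question, runs are allowed to overlap, and a run that would pass the end of the vector is cut off. The names ones, starts and res are just illustrative.

```r
set.seed(1)  # only so the sketch is reproducible

run_lengths <- c(2, 4, 5, 6, 7, 8, 1)
run_times   <- 5
# shortened stand-in for the sample_data vector in the question
sample_data <- c(0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0)

# positions where a run is allowed to start
ones <- which(sample_data == 1)

# pick at most run_times of those positions at random, without replacement;
# indexing through sample.int() avoids the sample(x, ...) surprise when x has length 1
k      <- min(run_times, length(ones))
starts <- sort(ones[sample.int(length(ones), k)])

# start from all zeros and write a random-length run of 1's at each chosen start
res <- integer(length(sample_data))
for (s in starts) {
  len <- run_lengths[sample.int(length(run_lengths), 1)]
  end <- min(s + len - 1, length(res))  # cut the run off at the end of the vector
  res[s:end] <- 1L
}
res
```

Because the starts are drawn up front, no counter is needed: the cap is enforced by drawing only k positions.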
I am new to R and I'm trying to see how many iterations are needed to fill a vector with the numbers 1 to 55 (no duplicates) from a random sample generated with runif.
At the moment, the vector has a lot of duplicates in it, and the number of iterations being returned is just the size of the vector, so I'm not sure if my logic is correct.
The aim of the if statement is to check whether the value from the sample already exists in the vector and, if it does, to choose the next one. But I'm not sure this is correct, since the next number could also already exist in the vector. Any help would be much appreciated.
numbers=as.integer(runif(800, min=1, max=55)) ## my sample from runif
i=sample(numbers, 1)
## setting up my vector to store 55 unique values (1 to 55)
p=rep(0,55)
## my counters
j=0
n=1
## my while loop
while (p[n] %in% 0){
## if the sample value already exists in the vector, choose the next value from the sample
if (numbers[n] %in% p) {
p[n]=numbers[n+1]
}
else {
p[n] = numbers[n]
}
n = n + 1
j = j + 1
}
I believe that the following is what you want. Instead of a while loop on p, the while loop should search for a new value in numbers.
set.seed(2021) # make the results reproducible
numbers <- sample(55, 800, TRUE)
## setting up my vector to store 55 unique values (1 to 55)
p <- integer(55)
# assign the elements of p one by one
for(j in seq_along(p)){
## if the sample value already exists in the vector,
## choose the next value from the sample
n <- 1
while (numbers[n] %in% p) {
n <- n + 1
}
if(n <= length(numbers)){
p[j] <- numbers[n]
}
}
j
#[1] 55
length(unique(p)) == length(p)
#[1] TRUE
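If the goal is only to count how many random draws it takes until every value from 1 to 55 has appeared, a pre-generated sample is not strictly needed. The following is a minimal sketch of that idea, assuming one value is drawn at a time with sample.int; the names seen and draws are made up for the illustration.

```r
set.seed(2021)  # make the result reproducible

seen  <- logical(55)  # seen[k] becomes TRUE once the value k has been drawn
draws <- 0L           # number of draws made so far

while (!all(seen)) {
  draws <- draws + 1L
  seen[sample.int(55, 1)] <- TRUE  # draw one value from 1..55 and mark it
}

draws  # number of draws needed to see all 55 values at least once
```

The loop always terminates with at least 55 draws, since each draw can mark at most one new value.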
I am working through the Euler Problems, and the problem is to sum the even terms in a Fibonacci sequence up to the length where the last term is < 4e6. I got it eventually but the following method of counting the even numbers did not work, and I am curious as to why.
First, this method of counting even numbers from a sequence works:
numbers <- 1:32
N <- length(numbers)
total <- rep(0,N)
for (i in numbers){
if(i %% 2 == 0) total[i] <-i
}
sum(total) #272
Then, this Fibb sequence works:
Fibb<-function(x){
y <- 1:x
y[1] = 1
y[2] = 2
for (i in 3:x){
y[i] <- y[i-2] + y[i-1]
}
return(y)
}
but the same sum function I used on the first sequence doesn't work:
numbers <- as.integer(Fibb(32)) # 1, 2, 3, 5, 8, 13, 21...
N <- length(numbers)
total <- rep(0,N)
for (i in numbers){
if(i %% 2 == 0) total[i] <-i
}
sum(total) #NA
The total of the third chunk is a large numeric, mostly composed of NAs.
EDIT: What I'd like to know is why the loop in the first block of code runs correctly and not the one in the third; I copied and pasted lines 6-7 from the first chunk to the third, and the only difference is the "numbers" sequence.
Has anyone encountered a problem like this?
Thanks!
It is because you are using elements of numbers as your index into total.
See how you have for (i in numbers). So (for example) when considering the Fibonacci number 2584 in numbers, you are setting total[2584] <- 2584.
Your eventual total vector is 3524578 elements long (!!) when it only needs to be 32 long. All the other elements that you don't store a result in are set to NA, and the sum of NA is NA.
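The same effect can be seen on a tiny vector. This is just an illustration with made-up sizes (3 and 6), not the asker's data:

```r
total <- rep(0, 3)  # starts with 3 elements
total[6] <- 6       # assigning past the end grows the vector ...
total               # ... and the skipped positions 4 and 5 are filled with NA

length(total)             # 6
sum(total)                # NA, because NA propagates through sum()
sum(total, na.rm = TRUE)  # 6
```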
Separate out your Fibonacci number (which can be arbitrarily large) from your index into total (which only goes up to 32). To make the index, you can use seq_along(numbers) which is essentially 1:length(numbers). Then use numbers[i] to get that Fibonacci number.
for (i in seq_along(numbers)) {
if(numbers[i] %% 2 == 0) total[i] <- numbers[i]
}
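As an aside, the loop and the total vector can be avoided altogether with logical indexing. A minimal sketch, reusing the Fibb function from the question (rewritten here only so the block is self-contained, and assuming x is at least 3):

```r
# Fibonacci-style sequence starting 1, 2, as in the question (assumes x >= 3)
Fibb <- function(x) {
  y <- numeric(x)
  y[1] <- 1
  y[2] <- 2
  for (i in 3:x) {
    y[i] <- y[i - 2] + y[i - 1]
  }
  y
}

numbers  <- Fibb(32)
# keep only the even terms and add them up
even_sum <- sum(numbers[numbers %% 2 == 0])
even_sum  # 4613732
```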
How can the following be accomplished with R?
Connect a constantly changing data source (e.g. https://goo.gl/XCM6yG) into R,
Measure time once prices start to rise consistently from initial baseline range to peak (represented by the green horizontal line),
Measure time from peak back to baseline range (the teal line)
Note: "Departure from baseline range" (unless there is a better mathematical way) defined as at least the most recent 5 prices all being over 3 standard deviations above the mean of the latest 200 prices
This is a really vague question with an unknown use case but... here we go.
Monitoring in what way? The length? That's what I did
Once the vector has more than 200 values we can take the mean of the latest 200, so we need control flow for that part.
I added some noise that forces the behavior you want to detect (ifelse(i %in% 996:1000, 100, 0) means: if the iterator is in 996 to 1000, add 100 to the random normal value I generated). We set a counter and check whether each value is more than 3 sd above the rolling mean; once the last 5 values all are, we record the time.
At each input of the data...check if the current value is the max value... now this is more tricky since we would have to look at the trend. This is beyond the scope of my assistance.
Up to you to figure out since I don't really understand
vec <- vecmean <- val5 <- c()
counter <- 0
for(i in 1:1000){
vec[i] <- rnorm(1) + ifelse(i %in% 996:1000, 100, 0)
Sys.sleep(.001) # change to 1 second
#1
cat('The vector has',length(vec),'values within...\n')
#2
if(length(vec)>200){
vecmean <- c(vecmean, mean(vec[(i-199):i])) # exactly the last 200 observations
cat('The mean of the last 200 observations is ',
format(vecmean[length(vecmean)], digits =2),'\n')
#3
upr <- vecmean[length(vecmean)] + 3*sd(vec)
if(vec[i] > upr){
counter <- counter + 1
} else{
counter <- 0
}
if(counter > 4){
cat('Last 5 values greater than 3sd above the rolling mean!\n')
val5 <- Sys.time()
cat("Timestamp:",as.character(val5),'\n')
}
}
# 4
theMax <- max(vec)
if(vec[i] == theMax && !is.null(val5) ){
valMax <- Sys.time()
valDiff <- valMax - val5
cat('The time difference between the first flag and second is', as.character(valDiff),'\n')
}
}
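The "departure from baseline" rule from the question can also be written as a small standalone check. This is a sketch under assumptions: the function name departed and its arguments are made up, and the mean and sd are both taken over the latest 200 prices including the 5 most recent ones, which is a literal reading of the rule in the question.

```r
# TRUE when the latest k prices are all more than n_sd standard deviations
# above the mean of the latest `window` prices (hypothetical helper)
departed <- function(prices, window = 200, k = 5, n_sd = 3) {
  if (length(prices) < window) return(FALSE)  # not enough history yet
  recent    <- tail(prices, window)
  threshold <- mean(recent) + n_sd * sd(recent)
  all(tail(prices, k) > threshold)
}

flat  <- rep(c(-1, 1), 100)                    # 200 values, no departure
spike <- c(rep(c(-1, 1), 98)[1:195], rep(100, 5))  # last 5 values jump to 100

departed(flat)   # FALSE
departed(spike)  # TRUE
```

Calling such a check on each new price, and recording Sys.time() the first time it returns TRUE, gives the same kind of timestamp as val5 in the loop above.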
I am trying to write a program that sets a state from A to state B and vice versa.
rnumbers <- data.frame(replicate(5,runif(2000, 0, 1)))
I am imagining this data frame of random numbers in a uniform distribution, except it has 10000 rows instead of 20 rows.
Setting the probability of going to state A and state B :
dt <- c(.02)
A <- dt*1
B <- dt*.5
Making a function that goes through data frame rnumbers and putting in a 0 if the number is less than B and a 1 if the number is less than A.
step_generator <- function(x){
step <- ifelse ( x < B, 0, ifelse(x < A, 1, NA))
return(step)
}
state <- apply(rnumbers, 2, step_generator)
This essentially gives me what I want - a data frame with columns that contain 0, 1, or NA depending on the value of the random number in rnumbers. However, I am missing a couple of things--
1) I would like to keep track of how long each state lasts. What I mean by that, is if you imagine each row as a change in time as above (dt <- c(.02)). I want to be able to plot "state vs. time". In order to address this, this is what I tried :
state1 <- transform(state, time = rep(dt))
state2 <- transform(state1, cumtime = cumsum(time))
This gets me close to what I want, cumtime goes from .02 to .4. However, I want the clock to start at 0 in the 1st row and add .02 to every subsequent row.
2) I need to know how long each state lasts for. Essentially, I want to be able to go through each column, and ask for how much time (cumsum) does each state last. This would then give me a distribution of times for state A and state B. I want this stored in another data frame.
I think this makes sense, if anything is unclear please let me know and I will clarify.
Thanks for any and all help!
The range between "number is less than .02*1 and greater than .02*.5" is very narrow, so when you set this simulation up, most of the first row will most probably be zero. You cannot really hope to succeed with ifelse when the conditions have any look-back features: that function doesn't allow "back-indexing".
rstate <- rnumbers # copy the structure
rstate[] <- NA # preserve structure with NA's
# Init:
rstate[1, ] <- rnumbers[1, ] < .02 & rnumbers[1, ] > 0.01
step_generator <- function(col, rnum){
for (i in 2:length(col) ){
if( rnum[i] < B) { col[i] <- 0 }
else { if (rnum[i] < A) {col[i] <- 1 }
else {col[i] <- col[i-1] } }
}
return(col)
}
# Run for each column index:
for(cl in 1:5){ rstate[ , cl] <-
step_generator(rstate[,cl], rnumbers[,cl]) }
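For the two follow-up points in the question (a clock that starts at 0, and how long each state lasts), rle may be enough once a column no longer contains NA. A minimal sketch, where state_col is a hypothetical stand-in for one filled column of rstate:

```r
dt <- 0.02

# hypothetical stand-in for one filled column of rstate (no NA values)
state_col <- c(0, 0, 1, 1, 1, 0)

# clock that starts at 0 in the first row and advances by dt per row
time <- (seq_along(state_col) - 1) * dt

# run-length encoding: consecutive identical states collapse into one run
runs <- rle(state_col)
durations <- data.frame(state    = runs$values,
                        duration = runs$lengths * dt)
durations
```

Applying this per column (for example with lapply over the columns) would give one data frame of durations per column; note that rle treats every NA as its own run, so it should be applied after the states have been carried forward.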
I want to simulate different poker hands. Through painful trial and error, I got the ranks, suits, the deck and a function to draw any given number of cards as:
suits <- c("spd","hrt","dimd","clbs")
ranks <- c(1:10,"J","Q","K")
deck <- as.matrix(expand.grid('rank' = ranks, 'suit' = suits))
draw <- function (n) deck[sample(nrow(deck), n), ]
draw(5) # Drawing 5 cards from the deck...
Output:
rank suit
[1,] "4" "dimd"
[2,] "6" "dimd"
[3,] "8" "spd"
[4,] "K" "hrt"
[5,] "8" "clbs"
Now I want to find out through simulation the probability of getting different hands. I did come up with some possible loops with a counter for the number of successes but I am stuck.
Here is an example... Let me try to figure out how many full houses I get in 1000 simulations. Since a full house is defined as "three matching cards of one rank and two matching cards of another rank", I figured that the key part of the function would be to have a boolean within an if statement that takes advantage of the R function unique()==2, meaning 2 unique ranks - with 5 cards dealt, 2 unique ranks could be a full house (another possibility is four-of-a-kind with any other rank).
iterations <- 1000
counter <- 0
for (i in iterations){
s <- draw(5)
if(length(unique(s[,1])) == 2) counter <- counter + 1
}
counter
Output: [1] 0
I have tried multiple other things, including counter[i] <- 1 for successful cases, and with the idea of running a sum(counter) at the end, but all without getting the loop to work.
In your code you have:
for(i in 1000) {
print(i)
} # 1000
It only prints once because i takes the single value 1000, so the loop body runs exactly once.
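The difference is easy to check by counting how often each loop body runs. The counters n_single and n_full are just names for this illustration:

```r
iterations <- 1000

n_single <- 0
for (i in iterations) {           # iterates over the single value 1000
  n_single <- n_single + 1
}

n_full <- 0
for (i in seq_len(iterations)) {  # iterates over 1, 2, ..., 1000
  n_full <- n_full + 1
}

c(n_single, n_full)  # 1 1000
```

seq_len(iterations) is used rather than 1:iterations because it yields an empty sequence when iterations is 0.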
Here's an alternative approach using rle.
iterations <- 10000
draws <- list()
for (i in 1:iterations){
s <- draw(5)
draws[[i]] <- all(rle(sort(s[,1]))$lengths %in% c(2,3))
if(draws[[i]]) {
print(s)
}
}
summary(unlist(draws))
Using a data frame as follows, it seems to produce the result you are looking for:
suits <- c("spd","hrt","dimd","clbs")
ranks <- c(1:10,"J","Q","K")
deck <- as.data.frame(expand.grid('rank' = ranks, 'suit' = suits))
draw <- function (n) deck[sample(nrow(deck), n), ]
counter <- 0;
for (i in 1:1000) {
df <- draw(5);
counter <- counter + (length(unique(df$rank)) == 2)
}
counter
[1] 156
suits <- c("spd","hrt","dimd","clbs")
ranks <- c(1:10,"J","Q","K")
deck <- as.data.frame(expand.grid('rank' = ranks, 'suit' = suits))
draw <- function (n) deck[sample(nrow(deck), n), ]
iterations <- 1000
counter <- 0
for (i in 1:iterations) {
hand <- draw(5)
rank_table <- table(hand[, 1])
if (length(names(rank_table)) == 2 && min(rank_table) > 1) counter <- counter + 1
# could have four of a rank, one of another;
# need to ensure two of a rank, three of another
}
counter
[1] 1
This result is not far from what is expected http://www.math.hawaii.edu/~ramsey/Probability/PokerHands.html
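For comparison, the exact probabilities can be computed by counting hands with choose. This is standard combinatorics for a 52-card deck rather than anything from the answers above; the variable names are made up:

```r
total_hands <- choose(52, 5)  # all 5-card hands: 2598960

# full house: rank of the triple, 3 of its 4 suits, rank of the pair, 2 of its 4 suits
full_house <- 13 * choose(4, 3) * 12 * choose(4, 2)  # 3744

# four of a kind: rank of the four, then any of the 48 remaining cards
four_kind <- 13 * choose(4, 4) * 48                  # 624

p_full_house <- full_house / total_hands
p_two_ranks  <- (full_house + four_kind) / total_hands  # what unique() == 2 counts

# expected counts per 1000 simulated hands: roughly 1.44 and 1.68
c(p_full_house, p_two_ranks) * 1000
```

With so few expected successes, 1000 iterations give a very noisy estimate; a much larger number of iterations is needed to get close to these values.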