I am trying to write a function that generates simulated data. If a simulated dataset does not meet a condition, I need to skip it; if it does meet the condition, I apply the function summary to it.
I would like to loop until I have 10 valid datasets and then stop (in practice I need 10000). Here is the code. It sort of works, except that it never stops. I think I probably placed next and break in the wrong place. I hope someone can help me put this together.
Another way I could approach this is to generate all the valid data first and then apply the function find_MLE (summary) to the final list.
Edit: I put break inside repeat and edited the code to make it reproducible. The code still keeps generating data and does not break.
Here is a reproducible version:
validData <- function(GM, GSD,sampleSize, p) {
count=0
for (i in 1:n) {
repeat {
lod <- quantile(rlnorm(1000000, log(GM), log(GSD)), p = p)
X_before <- rlnorm(sampleSize, log(GM), log(GSD))
Xs <- ifelse(X_before <= lod, lod, X_before)
delta <- ifelse(X_before <= lod, 1, 0)
pct_cens <- sum(delta)/length(delta)
print(pct_cens)
if (pct_cens == 0 & pct_cens ==1) next
else {
sumStats <- summary(Xs)
Med <- sumStats[3]
Ave <- sumStats[4]
}
count<- count+1
if (count == 10) break
}}
return(c(pct_cens, Med, Ave))
}
validData(GM=1,GSD=2,sampleSize=10,p=0.1)
Thanks for your help. I was able to write a function without using break! I am posting it here in case other people find it helpful.
dset <- function (GM, GSD, n, p) {
Mean <- array()
Median <- array()
count = 0
while(count < 10) {
lod <- quantile(rlnorm(1000000, log(GM), log(GSD)), p = p)
X_before <- rlnorm(n, log(GM), log(GSD))
Xs <- ifelse(X_before <= lod, lod, X_before)
delta <- ifelse(X_before <= lod, 1, 0)
pct_cens <- sum(delta)/length(delta)
print(pct_cens)
if (pct_cens == 0 | pct_cens == 1 ) next
else {count <- count +1
if (pct_cens > 0 & pct_cens < 1) {
sumStats <- summary(Xs)
Median[count] <- sumStats[3]
Mean [count]<- sumStats[4]
print(list(pct_cens=pct_cens,Xs=Xs, delta=delta, Median=Median,Mean=Mean))
}
}
}
return(data.frame( Mean=Mean, Median=Median)) }
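The alternative mentioned in the question (collect all valid datasets first, then summarise them afterwards) could be sketched roughly as follows. The helper names draw_censored and collect_valid are hypothetical, and the sketch swaps the simulated quantile for the exact lognormal quantile from qlnorm, which is an assumption about what lod is meant to be.

```r
# Hypothetical helper: draw one censored sample, or NULL if it is invalid
draw_censored <- function(GM, GSD, n, p) {
  # assumption: exact quantile instead of simulating 1e6 draws
  lod <- qlnorm(p, meanlog = log(GM), sdlog = log(GSD))
  x <- rlnorm(n, log(GM), log(GSD))
  delta <- as.integer(x <= lod)          # 1 = censored at the limit
  pct_cens <- mean(delta)
  if (pct_cens == 0 || pct_cens == 1) return(NULL)
  list(Xs = pmax(x, lod), delta = delta, pct_cens = pct_cens)
}

# Hypothetical helper: keep drawing until `target` valid datasets exist
collect_valid <- function(GM, GSD, n, p, target = 10) {
  valid <- vector("list", target)
  count <- 0
  while (count < target) {
    d <- draw_censored(GM, GSD, n, p)
    if (is.null(d)) next                 # skip invalid datasets
    count <- count + 1
    valid[[count]] <- d
  }
  valid
}

set.seed(1)
datasets <- collect_valid(GM = 1, GSD = 2, n = 10, p = 0.1)

# Second step: apply the summary to every collected dataset
stats <- t(sapply(datasets, function(d) c(Median = median(d$Xs), Mean = mean(d$Xs))))
```

Separating generation from summarising keeps the loop condition simple, since the counter only moves when a dataset is actually stored.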
Since your code isn't reproducible, I cannot fully test and debug it, but here is roughly how I would set it up, without being able to run it against your MLE function. Check the documentation on break, next, and for/while loops in R when testing your code.
validData <- function(GM, GSD, Size, p) {
  results <- list()
  count <- 0
  repeat {
    lod <- quantile(rlnorm(1000000, log(GM), log(GSD)), p = p)
    X_before <- rlnorm(Size, log(GM), log(GSD))
    Xs <- ifelse(X_before <= lod, lod, X_before)
    delta <- ifelse(X_before <= lod, 1, 0)
    pct_cens <- sum(delta)/length(delta)
    # invalid dataset (nothing or everything censored): skip it
    if (pct_cens == 0 | pct_cens == 1) next
    mles <- find_MLE(c(0,0), Xs, delta) # your function goes here
    GM_est <- mles[1]
    GSD_est <- mles[2]
    AM_est <- exp(log(GM_est) + 1 )
    SD_est <- sqrt((AM_est)^2*exp(log(GSD_est)^2))
    D95th_est <- GM_est*(GSD_est^1.645)
    count <- count + 1
    results[[count]] <- c(GM_est, GSD_est, AM_est, SD_est, D95th_est)
    if (count == 10) break # stop once 10 valid datasets are collected
  }
  do.call(rbind, results) # one row per valid dataset
}
To skip back to the outer loop based on a condition, simply use break().
Here's a simple example where the inner loop tries to run 10 times, but a randomly met condition usually stops it early:
# OUTER LOOP
for(i in 1:2) {
print(paste("Outer loop iteration", i))
# INNER LOOP (will run max 10 times)
for(j in 1:10) {
print(paste("Inner loop iteration", j))
if (runif(1) > 0.4) { # Randomly break the inner loop
print(paste("Breaking inner loop", j))
break()
}
}
}
If you want to skip to the outer loop when there's an error (rather than based on a condition), see here
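For the error case, one possible approach (an assumption here, not necessarily what the linked answer shows) is to wrap the risky step in tryCatch, record whether it failed, and call break outside the handler. The simulated failure at j == 2 is just a stand-in for real work.

```r
completed <- character(0)

for (i in 1:2) {
  for (j in 1:3) {
    failed <- tryCatch({
      if (j == 2) stop("simulated failure")  # stand-in for real work
      FALSE                                  # no error occurred
    }, error = function(e) TRUE)             # an error occurred

    if (failed) break                        # leave the inner loop, go on to the next i
    completed <- c(completed, paste(i, j))
  }
}

completed
```

The break sits outside tryCatch because the handler is an ordinary function and cannot itself break the enclosing loop.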
Related
I am trying to write a function to calculate the h-point. The function is defined over a rank-frequency data frame.
Consider the following data.frame:
DATA <-data.frame(frequency=c(49,48,46,38,29,24,23,22,15,12,12,10,10,9,9), rank=c(seq(1, 15)))
and the formula for the h-point is:
if {there is an r = f(r), h-point = r }
else { h-point = f(i)j-f(j)i / j-i+f(i)-f(j) }
where f(i) and f(j) are the frequencies of the ith and jth ranks, and i and j are adjacent ranks such that i<f(i) and j>f(j).
Now, I have tried the following code:
fr <-function(x){d <-DATA$frequency[x]
return(d)}
for (i in 1:length(DATA$rank)) {
j <- i+1
if (i==fr(i))
return(i)
else(i<fr(i) && j>fr(j)) {
s <-fr(i)*j-fr(j)*i/j-i+fr(i)-fr(j)
return(s)
}}
I also tried:
for (i in 1:length(DATA$rank)) {
j <- i+1
if (i==fr(i))
return(i)
if (i<fr(i) while(j>fr(j))) {
s <-fr(i)*j-fr(j)*i/j-i+fr(i)-fr(j)
return(s)
}}
Neither of them works. For DATA, the desired result would be i=11 and j=12, so:
h-point=12×12 - 10×11 / 12 - 11 + 12 - 10
Can you please tell me what I'm doing wrong here?
You could do:
h_point <- function(data){
  x <- seq(nrow(data))
  f_x <- data[["frequency"]][x]
  h <- which(x == f_x)
  if(length(h) > 0) h  # some rank equals its frequency
  else{
    i <- which(x<f_x)
    j <- which(x>f_x)
    s <- which(outer(i,j,"-") == -1, TRUE)  # adjacent pairs with j = i + 1
    i <- i[s[,1]]
    j <- j[s[,2]]
    cat("i: ",i, "j: ", j,"\n")
    # numerator and denominator are parenthesised explicitly
    (f_x[i]*j - f_x[j]*i) / (j-i + f_x[i]-f_x[j])
  }
}
h_point(DATA)
i: 11 j: 12
[1] 11.33333
I think I have figured out what you are trying to achieve. My loop goes through DATA and breaks as soon as rank == frequency for a given row. It might be more prudent to test this explicitly with DATA$rank[i] == fr(i) rather than relying on i, in case of tied ranks etc.
The second if statement calculates the h-point (s) for rows i and j if row i has a rank that is lower than its frequency and row j has a rank that is higher.
Is this what you wanted?
DATA <-data.frame(frequency=c(49,48,46,38,29,24,23,22,15,12,12,10,10,9,9), rank=c(seq(1, 15)))
fr <-function(x){d <-DATA$frequency[x]
return(d)}
for(i in 1:nrow(DATA)){
j <- i+1
if (i==fr(i)){
s <- list(ij=c(i=i,j=j), h=i)
break
}else if(i <fr(i) && j>fr(j)){
s <-list(ij=c(i=i,j=j),h=fr(i)*j-fr(j)*i/j-i+fr(i)-fr(j))
}}
I am not sure the formula is correct: in your loop you had j-i, but in the explanation it was i-j. I am also not sure whether the entire i-j+fr(i)-fr(j) is the denominator, and similarly for the numerator. These are simple fixes.
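On the precedence question: assuming the intended reading is that f(i)j - f(j)i is the whole numerator and j - i + f(i) - f(j) is the whole denominator, a loop version with explicit parentheses might look like the sketch below. The function name h_point_loop is hypothetical, and the reading of the formula is an assumption.

```r
DATA <- data.frame(frequency = c(49,48,46,38,29,24,23,22,15,12,12,10,10,9,9),
                   rank = 1:15)

# Hypothetical helper; assumes freq is ordered by rank 1, 2, 3, ...
h_point_loop <- function(freq) {
  n <- length(freq)
  for (i in seq_len(n)) {
    if (i == freq[i]) return(i)          # exact match: h-point is the rank itself
    j <- i + 1
    if (j <= n && i < freq[i] && j > freq[j]) {
      # assumed reading: (f(i)j - f(j)i) / (j - i + f(i) - f(j))
      return((freq[i] * j - freq[j] * i) / (j - i + freq[i] - freq[j]))
    }
  }
  NA_real_                               # no crossing found
}

h <- h_point_loop(DATA$frequency)        # i = 11, j = 12 gives 34 / 3
```

Under this reading the result lies between ranks 11 and 12, which is where the rank and frequency curves cross in DATA.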
I am working with a time-series raster brick. The brick has 365 layers representing a value for each day of the year.
I want to create a new layer in which each cell holds the day of year on which a certain condition is met.
My current approach is the following (APHRO being the raster brick), but returns the error message below:
r <- raster(ncol=40, nrow=20)
r[] <- rnorm(n=ncell(r))
APHRO <- brick(x=c(r, r*2, r))
NewLayer <- calc(APHRO, fun=FindOnsetDate(APHRO))
Returning this error:
Error in .local(x, ...) : not a valid subset
And the function being passed:
FindOnsetDate <- function (s) {
x=0
repeat {
x+1
if(s[[x]] >= 20 | s[[x]] + s[[x+1]] >= 20 & ChkFalseOnset() == FALSE)
{break}
}
return(x);
}
With the function for the 3rd condition being:
ChkFalseOnset <- function (x) {
for (i in 0:13){
if (sum(APHRO[[x+i:x+i+7]]) >= 5)
{return(FALSE); break}
return(TRUE)
}
}
Thank you in advance!!!!
And please let me know if I should provide more information - tried to keep it parsimonious.
The problem is that your function does not work on its own: x+1 never assigns to x, so x stays 0 and s[[x]] fails:
FindOnsetDate <- function (s) {
x=0
repeat {
x+1
if(s[[x]] >= 20 | s[[x]] + s[[x+1]] >= 20)
{break}
}
return(x);
}
FindOnsetDate(1:100)
#Error in s[[x]] :
# attempt to select less than one element in get1index <real>
Perhaps something like this:
FindOnsetDate <- function (s) {
j <- s + c(s[-1], 0)
sum(j > 20 | s > 20)
# if all values are positive, just do sum(j > 20)
}
FindOnsetDate(1:20)
#10
This works now:
r <- calc(APHRO, FindOnsetDate)
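The function above counts the days on which the condition holds. If the goal is instead the first day of year on which it holds, as the question's wording suggests, a per-cell function along these lines might serve. The name first_onset is hypothetical, the >= 20 thresholds are taken from the question, and the ChkFalseOnset condition is deliberately left out of this sketch.

```r
# Hypothetical helper: index of the first day meeting the condition, NA if none
first_onset <- function(s) {
  next_day <- c(s[-1], 0)                    # value of the following day (0 after the last)
  hit <- s >= 20 | (s + next_day) >= 20      # condition from the question
  idx <- which(hit)                          # which() drops NA values
  if (length(idx) > 0) idx[1] else NA_integer_
}

onset_example <- first_onset(1:20)           # day 10, because 10 + 11 >= 20
no_onset      <- first_onset(c(5, 5, 5))     # NA, condition never met
```

Because it takes a plain numeric vector and returns a single value, it has the same shape as the function passed to calc above, though that use is untested here.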
I would suggest a basic two-step process. With a 365-days example:
set.seed(123)
r <- raster(ncol=40, nrow=20)
r_list <- list()
for(i in 1:365){
r_list[[i]] <- setValues(r,rnorm(n=ncell(r),mean = 10,sd = 5))
}
APHRO <- brick(r_list)
Use a basic logic test for each iteration:
r_list2 <- list()
for(i in 1:365){
if(i != 365){
r_list2[[i]] <- APHRO[[i]] >= 20 | APHRO[[i]] + APHRO[[i+1]] >= 20
}else{
r_list2[[i]] <- APHRO[[i]] >= 20
}
}
Compute sum by year:
NewLayer <- calc(brick(r_list2), fun=sum)
plot(NewLayer)
I've got the following code in R:
func.time <- function(n){
times <- c()
for(i in 1:n){
r <- 1 #x is the room the mouse is in
X <- 0 #time, starting at 0
while(r != 5){
if(r == 1){
r <- sample(c(2,3),1) }
else if(r == 2){
r <- sample(c(1,3), 1) }
else if(r == 3){
r <- sample(c(1,2,4,5), 1) }
else if (r == 4){
r <- sample(c(3,5), 1) }
X <- X + 1
}
times <- c(X, times)
}
mean(times)
}
func.time(10000)
It works fine, but I've been told that using switch() can speed it up, seeing as I've got so many if else statements. I can't seem to get it to work; any help is appreciated.
Edit
I've tried this:
func.time <- function(n) {
times <- c()
for(i in 1:n) {
r <- 1 #x is the room the mouse is in
X <- 0 #time, starting at 0
while(r != 5) {
switch(r, "1" = sample(c(2,3), 1),
"2" = sample(c(1,3), 1),
"3" = sample(c(1,2,4,5), 1),
"4" = sample(c(3,5)))
X <- X + 1
}
times <- c(X, times)
}
mean(times)
}
func.time(10000)
But it was a basic attempt, I'm not sure I've understood the switch() method properly.
I thought Dominic's assessment was very useful, but when I went to examine the edit it was being held up on what I thought was an incorrect basis, so I decided to just fix the code. When using a numeric argument for the EXPR parameter, you do not use the item=value formalism but rather just put in the expressions:
func.time <- function(n){
  times <- c()
  for(i in 1:n){
    r <- 1
    X <- 0
    while(r != 5){
      r <- switch(r,
                  sample(c(2,3), 1),     # r=1
                  sample(c(1,3), 1),     # r=2
                  sample(c(1,2,4,5), 1), # r=3
                  sample(c(3,5), 1))     # r=4
      X <- X + 1
    }
    times <- c(X, times)
  }
  mean(times)
}
func.time(1000)
#[1] 7.999
For another example of how to use switch with a numeric argument to EXPR, consider my answer to this question: R switch statement with varying outputs throwing error
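As a small illustration of the two calling styles of switch(), here is a sketch using made-up values. With a numeric EXPR the alternatives are picked by position, with a character EXPR they are picked by name, and in either case the result has to be assigned to be kept.

```r
# Numeric EXPR: the n-th alternative is chosen by position
by_position <- switch(2, "first", "second", "third")

# Character EXPR: the alternative is chosen by name
by_name <- switch("b", a = "first", b = "second", c = "third")

# Numeric EXPR outside the range of alternatives: switch() returns NULL
out_of_range <- switch(4, "first", "second", "third")

# The result must be assigned, otherwise it is simply discarded
r <- 1
r <- switch(r, 2, 3)   # first alternative is chosen, so r becomes 2
```

This is also why the loop in the question never ended: the value produced by switch() was not assigned back to r.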
I'd like to perform this function on a matrix 100 times. How can I do this?
v = 1
m <- matrix(0,10,10)
rad <- function(x) {
idx <- sample(length(x), size=1)
flip = sample(0:1,1,rep=T)
if(flip == 1) {
x[idx] <- x[idx] + v
} else if(flip == 0) {
x[idx] <- x[idx] - v
return(x)
}
}
This is what I have so far, but it doesn't work.
for (i in 1:100) {
rad(m)
}
I also tried this, which seemed to work, but gave me an output of like 5226 rows for some reason. The output should just be a 10X10 matrix with changed values depending on the conditions of the function.
reps <- unlist(lapply(seq_len(100), function(x) rad(m)))
Ok I think I got it.
The return statement in your function sits inside only one branch of the if statement, so the function returns the matrix with a probability of ~50%, while in the other cases it does not return the matrix at all. You should change the function to this:
rad <- function(x) {
idx <- sample(length(x), size=1)
flip = sample(0:1,1,rep=T)
if(flip == 1) {
x[idx] <- x[idx] + v
} else if(flip == 0) {
x[idx] <- x[idx] - v
}
return(x)
}
Then you can do:
for (i in 1:n) {
m <- rad(m)
}
Note that this is semantically equal to:
for (i in 1:n) {
tmp <- rad(m) # return a modified version of m (m is not changed yet)
# and put it into tmp
m <- tmp # set m equal to tmp, then in the next iteration we will
# start from a modified m
}
Running rad(m) on its own does not change m.
Why?
The function works on a local copy of the matrix m, and that copy disappears when the function ends.
So you need to save what the function returns.
As @digEmAll wrote, the right code is:
for (i in 1:100) {
m <- rad(m)
}
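The copy behaviour described above can be seen in a minimal sketch. The function bump_first is a hypothetical stand-in for rad, chosen so that the effect is deterministic.

```r
# Hypothetical stand-in for rad(): adds 1 to the first element
bump_first <- function(x) {
  x[1] <- x[1] + 1   # modifies the function's local copy only
  x                  # the modified copy is the return value
}

m <- matrix(0, 2, 2)

bump_first(m)        # the result is not stored, so m is untouched
unchanged <- m[1]    # still 0

m <- bump_first(m)   # assign the result back to keep the change
changed <- m[1]      # now 1
```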
You don't need a loop here. The whole operation can be vectorized.
v <- 1
m <- matrix(0,10,10)
n <- 100 # number of random replacements
idx <- sample(length(m), n, replace = TRUE) # indices
flip <- sample(c(-1, 1), n, replace = TRUE) # subtract or add
newVal <- aggregate(v * flip ~ idx, FUN = sum) # calculate new values for indices
m[newVal[[1]]] <- m[newVal[[1]]] + newVal[[2]] # add new values
I have generated an infinite loop and don't know how to fix it.
I essentially want to go through the data frame rnumbers and generate rstate2 with 1, -1, or 0 depending on what is in rnumbers
The function step_generator is getting stuck at the repeat function. I am not sure how to make the code put -1 in rstate2 if rnumber is less than C and then repeat an ifelse function for the next rows until a value of D or greater is obtained. Once D is obtained exit the repeat function and go back into the original for loop.
Here is my code:
rnumbers <- data.frame(replicate(5,runif(20000, 0, 1)))
dt <- c(.01)
A <- .01
B <- .0025
C <- .0003
D <- .003
E <- .05
rstate <- rnumbers # copy the structure
rstate[] <- NA # preserve structure with NA's
# Init:
rstate[1, ] <- c(0)
step_generator <- function(col, rnum){
for (i in 2:length(col) ){
if( rnum[i] < C) {
col[i] <- -1
repeat {
ifelse(rnum[i] < E, -1, if(rnum[i] >= D) {break})
}
}
else { if (rnum[i] < B) {col[i] <- -1 }
else {ifelse(rnum[i] < A, 1, 0) } }
}
return(col)
}
# Run for each column index:
for(cl in 1:5){ rstate[ , cl] <-
step_generator(rstate[,cl], rnumbers[,cl]) }
Thanks for any help.
The problem is that you are not increasing i inside the repeat loop, so you keep testing the same i. Because rnum[i] < C (from the if condition) and C < E, rnum[i] < E is always true, and the loop never breaks.
However, even if you increase i inside repeat, it is reset to the value from the for loop on the next iteration, so you have to do it a different way, for example with a while loop. I'm not sure I fully understand what you are trying to do, but based on your description I've written this function:
step_generator <- function(col, rnum){
i <- 2
while (i <= length(col)){
if (rnum[i] < C) {
col[i] <- -1
while ((i < length(col)) && (rnum[i + 1] < D)){
i <- i + 1
col[i] <- -1
}
} else if (rnum[i] < B){
col[i] <- -1
} else if (rnum[i] < A){
col[i] <- 1
} else {
col [i] <- 0
}
i <- i + 1
}
return(col)
}
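The point about the for loop resetting i can be checked with a toy example. The vectors visited and visited_while are illustrative names only.

```r
# In a for loop, changing i inside the body does not affect the next iteration
visited <- integer(0)
for (i in 1:3) {
  visited <- c(visited, i)
  i <- i + 10          # overwritten by the for loop on the next pass
}
# visited holds 1, 2, 3

# In a while loop the index is under manual control, so steps can be skipped
visited_while <- numeric(0)
i <- 1
while (i <= 6) {
  visited_while <- c(visited_while, i)
  i <- i + 2           # this change persists
}
# visited_while holds 1, 3, 5
```

This is why the rewritten step_generator uses while for both the outer and the inner loop.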