I want to write a for loop that iterates over a vector or list, where i'm adding values to them in each iteration. i came up with the following code, it's not iterating more than 1 iteration. i don't want to use a while loop to write this program. I want to know how can i control for loops iterator. thanks in advance.
steps <- 1
random_number <- c(sample(20, 1))
for (item in random_number){
if(item <18){
random_number <- c(random_number,sample(20, 1))
steps <- steps + 1
}
}
print(paste0("It took ", steps, " steps."))
It depends really on what you want to achieve. Either way, I am afraid you cannot change the iterator on the fly. while seems resonable in this context, or perhaps knowing the plausible maximum number of iterations, you could proceed with those, and deal with needless iterations via an if statement. Based on your code, something more like:
steps <- 1
for (item in 1:100){
random_number <- c(sample(20, 1))
if(random_number < 18){
random_number <- c(random_number,sample(20, 1))
steps <- steps + 1
}
}
print(paste0("It took ", steps, " steps."))
Which to be honest is not really different from a while() combined with an if statement to make sure it doesn't run forever.
This can't be done. The vector used in the for loop is evaluated at the start of the loop and any changes to that vector won't affect the loop. You will have to use a while loop or some other type of iteration.
Related
Inspired by the leetcode challenge for two sum, I wanted to solve it in R. But while trying to solve it by brute-force I run in to an issue with my for loop.
So the basic idea is that given a vector of integers, which two integers in the vector, sums up to a set target integer.
First I create 10000 integers:
set.seed(1234)
n_numbers <- 10000
nums <- sample(-10^4:10^4, n_numbers, replace = FALSE)
The I do a for loop within a for loop to check every single element against eachother.
# ensure that it is actually solvable
target <- nums[11] + nums[111]
test <- 0
for (i in 1:(length(nums)-1)) {
for (j in 1:(length(nums)-1)) {
j <- j + 1
test <- nums[i] + nums[j]
if (test == target) {
print(i)
print(j)
break
}
}
}
My problem is that it starts wildly printing numbers before ever getting to the right condition of test == target. And I cannot seem to figure out why.
I think there are several issues with your code:
First, you don't have to increase your j manually, you can do this within the for-statement. So if you really want to increase your j by 1 in every step you can just write:
for (j in 2:(length(nums)))
Second, you are breaking only the inner-loop of the for-loop. Look here Breaking out of nested loops in R for further information on that.
Third, there are several entries in nums that gave the "right" result target. Therefore, your if-condition works well and prints all combination of nums[i]+nums[j] that are equal to target.
I'm trying to write a simple population model using nested for loops. I want to project the population 10yrs, and I want to run this projection 100 times. I need the output of each time step and include a counter so I know which year of which iteration the results correspond to. I have an example running using this code, but I was wondering if:
1) There was a more elegant solution to than using the rows<-rows+1 command to reset and advance the counter each time?
2) There is a more elegant solution than for loops to accomplish this?
library(VGAM)
popdata<-matrix(nrow=1000,ncol=3)
dimnames(popdata)[[2]]<-c('iteration','year','popsize')
rows<-1
for (iteration in 1:100){
pop<-50
for(year in 1:10){
popdata[rows,1]<-iteration
popdata[rows,2]<-year
pop<-rbetabinom(1,pop,0.6)
popdata[rows,3]<-pop
rows<-rows+1
}
}
You can replace the row counter by (iteration - 1) * 10 + year.
You can also clean up the work done in the loop by assigning the iteration and year data outside as you know in advance what these results will be.
This gives something like
popdata<-matrix(nrow=1000,ncol=3)
dimnames(popdata)[[2]]<-c('iteration','year','popsize')
# Do the deterministic stuff first
popdata[, 1] <- rep(1:100, each = 10)
popdata[, 2] <- rep(1:10, times = 100)
for (iteration in 1:100){
pop<-50
for(year in 1:10){
rows <- (iteration - 1) * 10 + yea
pop<-rbetabinom(1,pop,0.6)
popdata[rows,3]<-pop
}
}
Because pop depends on the previous pop it is less easy to simplify the inner loop entirely.
I think the approach would be to generate all your binomial results as a long vector of 1s and 0s.
Then in each loop you would subset the next pop elements of that vector.
In the end I don't think that would actually be neater or more readable though.
I am working with R and my script is taking a very long time. I was thinking I can stop it and then start it again by changing my counters.
My code is this
NC <- MLOA
for (i in 1:313578){
len_mods <- length(MLOA[[i]])
for (j in 1:2090){
for(k in 1:len_mods){
temp_match <- matchv[j]
temp_rep <- replacev[j]
temp_mod <- MLOA[[i]][k]
is_found <- match(temp_mod,temp_match, nomatch = 0, incomparables = 0)
if(is_found[1] == 1) NC[[i]][k] <- temp_rep
rm(temp_match,temp_rep,temp_mod)
}
}
}
I am thinking that I can stop my script, then re-start it by checking what values of i,j and k are and changing the counts to start at their current values. So instead of counting "for (i in 1:313578)" if i is up to 100,000 I could do (i in 100000:313578).
I don't want to stop my script though before checking that my logic about restarting it is solid.
Thanks in anticipation
I'm a bit confused what you are doing. Generally on this forum it is a good idea to greatly simplify your code, and only present the core of the problem in a very simple example. That withstanding, this might help. Put your for loop in a function whose parameters are the first elements of the sequence of numbers you loop over. For example:
myloop <- function(x,...){
for (i in seq(x,313578,1)){
...
This way you can easily manipulate were your loop starts.
The more important question is, however, why are you using for loops in the first place? In R, for loops should be avoided at all costs. By vectorizing your code you can greatly increase its speed. I have realized speed increases of a factor of 500!
In general, the only reason you use a for loop in R is if current iterations of the for loop depend on previous iterations. If this is the case then you are likely bound to the slow for loop.
Depending on your computer skills, however, even for loops can be made faster in R. If you know C, or are willing to learn a bit, interfacing with C can dramatically increase the speed of your code.
An easier way to increase the speed of your code, which unfortunately will not yield the same speed up as interfacing with C, is using R's Byte Complier. Check out the cmpfun function.
One final thing on speeding up code: The following line of codetemp_match <- matchv[j] looks innocuous enough, however, this can really slow things down. This is because every time you assign matchv[j] to temp_match you make a copy of temp_match. That means that your computer needs to find some were to store this copy in RAM. R is smart, as you make more and more copies, it will clean up after you and throw away those copies you are no longer using with the garbage collect function. However finding places to store your copies as well as calling the garbage collect function take time. Read this if you want to learn more: http://adv-r.had.co.nz/memory.html.
You could also use while loops for your 3 loops to maintain a counter. In the following, you can stop the script at any time (and view the intermediate results) and restart by changing continue=TRUE or simply running the loop part of the script:
n <- 6
res <- array(NaN, dim=rep(n,3))
continue = FALSE
if(!continue){
i <- 1
j <- 1
k <- 1
}
while(k <= n){
while(j <= n){
while(i <= n){
res[i,j,k] <- as.numeric(paste0(i,j,k))
Sys.sleep(0.1)
i <- i+1
}
j <- j+1
i <- 1
}
k <- k+1
j <- 1
}
i;j;k
res
This is what I got to....
for(i in 1:313578)
{
mp<-match(MLOA[[i]],matchv,nomatch = 0, incomparables=0)
lgic<- which(as.logical(mp),arr.ind = FALSE, useNames = TRUE)
NC[[i]][lgic]<-replacev[mp]}
Thanks to those who responded, Jacob H, you are right, I am definitely a newby with R, your response was useful. Frank - your pointers helped.
My solution probably still isn't an optimal one. All I wanted to do was a find and replace. Matchv was the vector in which I was searching for a match for each MLOA[i], with replacev being the vector of replacement information.
I'm currently looping through a large data set and what I discovered is that the higher loop index, the slowlier the loop is. It goes pretty fast at the beginning, but it's incredibly slow at the end. What's the reason for this? Is there any way how to bypass it?
Remarks:
1) I can't use plyr because the calculation is recursive.
2) The length of output vector is not known in advance.
My code looks rougly like this:
for (i in 1:20000){
if(i == 1){
temp <- "some function"(input data[i])
output <- temp
} else {
temp <- "some function"(input data[i], temp)
out <- rbind(out, temp)
}
}
The problem is that you are growing the object out at each iteration, which will entail larger and larger amounts of copying as the size of out increases (as your loop index increases).
In this case, you know the loop needs a vector of 20000 elements, so create one initially and fill in that object as you loop. Doing this will also remove the need for the if() ... else() which is also slowing down your loop and will become appreciable as the size of the loop increases.
For example, you could do:
out <- numeric(20000)
out[1] <- foo(data[1])
for (i in 2:length(out)) {
out[i] <- foo(data[i], out[i-1])
}
What out needs to be when you create it will depend on what foo() returns. Adjust creation of out accordingly.
I have got a column with different numbers (from 1 to tt) and would like to use looping to perform a count on the occurrence of these numbers in R.
count = matrix(ncol=1,nrow=tt) #creating an empty matrix
for (j in 1:tt)
{count[j] = 0} #initiate count at 0
for (j in 1:tt)
{
for (i in 1:N) #for each observation (1 to N)
{
if (column[i] == j)
{count[j] = count[j] + 1 }
}
}
Unfortunately I keep getting this error.
Error in if (column[i] == j) { :
missing value where TRUE/FALSE needed
So I tried:
for (i in 1:N) #from obs 1 to obs N
if (column[i] = 1) print("Test")
I basically got the same error.
Tried to do abit research on this kind of error and alot have to said about "debugging" which I'm not familiar with.
Hopefully someone can tell me what's happening here. Thanks!
As you progress with your learning of R, one feature you should be aware of is vectorisation. Many operations that (in C say) would have to be done in a loop, can be don all at once in R. This is particularly true when you have a vector/matrix/array and a scalar, and want to perform an operation between them.
Say you want to add 2 to the vector myvector. The C/C++ way to do it in R would be to use a loop:
for ( i in 1:length(myvector) )
myvector[i] = myvector[i] + 2
Since R has vectorisation, you can do the addition without a loop at all, that is, add a scalar to a vector:
myvector = myvector + 2
Vectorisation means the loop is done internally. This is much more efficient than writing the loop within R itself! (If you've ever done any Matlab or python/numpy it's much the same in this sense).
I know you're new to R so this is a bit confusing but just keep in mind that often loops can be eliminated in R.
With that in mind, let's look at your code:
The initialisation of count to 0 can be done at creation, so the first loop is unnecessary.
count = matrix(0,ncol=1,nrow=tt)
Secondly, because of vectorisation, you can compare a vector to a scalar.
So for your inner loop in i, instead of looping through column and doing if column[i]==j, you can do idx = (column==j). This returns a vector that is TRUE where column[i]==j and FALSE otherwise.
To find how many elements of column are equal to j, we just count how many TRUEs there are in idx. That is, we do sum(idx).
So your double-loop can be rewritten like so:
for ( j in 1:tt ) {
idx = (column == j)
count[j] = sum(idx) # no need to add
}
Now it's even possible to remove the outer loop in j by using the function sapply:
sapply( 1:tt, function(j) sum(column==j) )
The above line of code means: "for each j in 1:tt, return function(j)", an returns a vector where the j'th element is the result of the function.
So in summary, you can reduce your entire code to:
count = sapply( 1:tt, function(j) sum(column==j) )
(Although this doesn't explain your error, which I suspect is to do with the construction or class of your column).
I suggest to not use for loops, but use the count function from the plyr package. This function does exactly what you want in one line of code.