Iterating over a changing sequence in a for loop - r

Suppose we have syntax as follows:
for (i in sequence) {
...
sequence <- append(sequence, i, after = 3)
}
But the code does not seem to work properly, because sequence is not updated inside the loop body. I have come up with a similar solution using a while loop instead. Is it possible to use a for loop anyway?

The for loop in R evaluates the seq value (sequence in your example) only once, at the beginning. So changing sequence inside your loop has no effect on how many times the loop runs.
For example,
sequence <- 1:2
for (i in sequence) {
print(i)
sequence <- 0
}
will print the numbers 1 and 2, and then will finish, with sequence containing a single zero.
This is described in the help page ?"for":
The seq in a for loop is evaluated at the start of the loop; changing
it subsequently does not affect the loop. If seq has length zero the
body of the loop is skipped. Otherwise the variable var is assigned in
turn the value of each element of seq. You can assign to var within
the body of the loop, but this will not affect the next iteration.
When the loop terminates, var remains as a variable containing its
latest value.
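Both points from the help page can be checked with a small sketch. The names s, seen and v are made up for illustration:

```r
# The loop sequence is fixed at the moment the loop starts.
s <- 1:3
seen <- integer(0)
for (v in s) {
  seen <- c(seen, v)   # record the value the loop handed us
  s <- c(s, 99L)       # growing `s` does not add iterations
  v <- -1L             # reassigning `v` does not affect the next iteration
}
length(seen)  # 3: one iteration per element of the original 1:3
length(s)     # 6: `s` itself did change, the loop just never looked at it again
v             # -1: `v` keeps the last value assigned to it
```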

No. A for loop in R behaves much like a function call that receives its input "by value" (i.e. as if it worked on a copy of the input). Any change to the iterated sequence after the loop has started leaves the iteration unaffected. This is generally a good thing, because it guards against accidental infinite loops. A simple example illustrates it:
idx <- 1:15
for(i in idx){
idx <- head(idx, -1)
cat('idx: ', idx, '\n')
cat('i: ', i, '\n')
}
If you want to update the iterator, your best bet is either repeat or while, but be careful: both increase the potential for errors and unexpected infinite loops.
idx <- 1:15
while(length(idx) != 0){
i <- head(idx, 1)
idx <- tail(idx, -1)
cat('idx: ', idx, '\n')
cat('i: ', i, '\n')
}
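For the case in the question, where the sequence grows during iteration, a position-based while loop is one option. This is only a sketch: the data, the rule for appending, and the names queue, pos, max_iter and visited are assumptions made for illustration, and the cap is there to guard against the infinite loops mentioned above.

```r
# Hypothetical example: the work queue may grow while we walk through it.
queue <- c(5, 2, 8)
pos <- 1
max_iter <- 100            # safety cap so a mistake cannot loop forever
visited <- numeric(0)
while (pos <= length(queue) && pos <= max_iter) {
  item <- queue[pos]
  visited <- c(visited, item)
  if (item > 4) {
    # append a smaller follow-up item; the condition re-reads length(queue)
    queue <- append(queue, item - 4)
  }
  pos <- pos + 1
}
visited   # 5 2 8 1 4
```

Unlike for, the while condition is re-evaluated before every iteration, so the appended elements are picked up.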

Related

Problem with checking logical within for loop

Inspired by the LeetCode two sum challenge, I wanted to solve it in R. But while trying to solve it by brute force I ran into an issue with my for loop.
The basic idea: given a vector of integers, find which two integers in the vector sum to a given target integer.
First I create 10000 integers:
set.seed(1234)
n_numbers <- 10000
nums <- sample(-10^4:10^4, n_numbers, replace = FALSE)
Then I run a for loop within a for loop to check every element against every other.
# ensure that it is actually solvable
target <- nums[11] + nums[111]
test <- 0
for (i in 1:(length(nums)-1)) {
for (j in 1:(length(nums)-1)) {
j <- j + 1
test <- nums[i] + nums[j]
if (test == target) {
print(i)
print(j)
break
}
}
}
My problem is that it starts wildly printing numbers before ever getting to the right condition of test == target. And I cannot seem to figure out why.
I think there are several issues with your code:
First, you don't have to increase your j manually, you can do this within the for-statement. So if you really want to increase your j by 1 in every step you can just write:
for (j in 2:(length(nums)))
Second, break only exits the inner loop, so the outer loop keeps running. See the question "Breaking out of nested loops in R" for further information on that.
Third, several pairs of entries in nums sum to target. Your if condition therefore works as written: for each i, it prints the first j for which nums[i] + nums[j] equals target.
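Putting the three points together, one possible sketch uses a flag to leave both loops at the first match. The flag name found is made up for illustration, and starting j at i + 1 is a choice made here so that each pair is checked only once:

```r
set.seed(1234)
nums <- sample(-10^4:10^4, 10000, replace = FALSE)
target <- nums[11] + nums[111]   # guarantees at least one solution

found <- FALSE
n <- length(nums)
for (i in 1:(n - 1)) {
  for (j in (i + 1):n) {         # j runs over the elements after i
    if (nums[i] + nums[j] == target) {
      found <- TRUE
      break                      # leaves the inner loop only
    }
  }
  if (found) break               # leaves the outer loop as well
}
c(i, j)                          # indices of the first matching pair
```

After the loops, i and j keep their last values, which is what makes the final line work.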

Iteration in r for loop

I want to write a for loop that iterates over a vector or list while I add values to it in each iteration. I came up with the following code, but it does not run for more than one iteration. I don't want to use a while loop for this program; I want to know how I can control the for loop's iterator. Thanks in advance.
steps <- 1
random_number <- c(sample(20, 1))
for (item in random_number){
if(item <18){
random_number <- c(random_number,sample(20, 1))
steps <- steps + 1
}
}
print(paste0("It took ", steps, " steps."))
It really depends on what you want to achieve. Either way, I am afraid you cannot change the iterator on the fly. A while loop seems reasonable in this context. Alternatively, if you know a plausible maximum number of iterations, you could loop over that many and deal with the needless iterations via an if statement. Based on your code, something more like:
steps <- 1
for (item in 1:100){
random_number <- c(sample(20, 1))
if(random_number < 18){
random_number <- c(random_number,sample(20, 1))
steps <- steps + 1
}
}
print(paste0("It took ", steps, " steps."))
Which to be honest is not really different from a while() combined with an if statement to make sure it doesn't run forever.
This can't be done. The vector used in the for loop is evaluated at the start of the loop and any changes to that vector won't affect the loop. You will have to use a while loop or some other type of iteration.
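If the goal is to keep drawing until a number of 18 or more comes up (an assumption about the intent of the question), repeat with break is one form of "some other type of iteration". A minimal sketch, where the seed and the name draws are made up for illustration:

```r
set.seed(42)                 # arbitrary seed, only to make the run repeatable
steps <- 0
draws <- integer(0)
repeat {
  x <- sample(20, 1)         # one random number between 1 and 20
  steps <- steps + 1
  draws <- c(draws, x)
  if (x >= 18) break         # stop at the first draw of 18 or more
}
print(paste0("It took ", steps, " steps."))
```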

Query for loops in R

I am working with some data in R.
a<-rep(NA,400)
for(i in 1:10){for(j in 0:40){print(dat$V2[i]-j)}}
Instead of printing, I want to store each value in an empty array (a). I would be thankful if someone could help me with this.
In a case like this (nested loops), often the easiest way is to add a counter to keep track of positions in the array:
a <- rep(NA, 400)
counter <- 1
for(i in 1:10){
for(j in 0:40){
a[counter] <- dat$V2[i]-j
counter <- counter + 1
}
}
Here is one way:
#a<-rep(NA,400)
#for(i in 1:10){for(j in 0:40){print(dat$V2[i]-j)}}
a <- as.numeric( sapply( 1:10, function(i){
sapply( 0:40, function(j) {
dat$V2[i]-j
})
}))
sapply is useful here because, contrary to a for loop, it returns something from each iteration. So in this case I first loop over 1:10, like your for loop, with the useful difference that a value comes back each time.
In each iteration it runs a nested block, also using sapply, this time looping over 0:40 and returning the expression you have innermost in your for loop.
So for each of 1:10 it loops over 0:40, each time calculating your expression and returning it, which gives you all the values you wanted.
as.numeric is wrapped around the result to flatten it into one long vector, which seems to be what you want.
A way only using loops:
for(i in 1:10){
for(j in 0:39){
print(i*40 - 39 + j)
a[i*40 - 39 + j] = dat$V2[i]-j}}
PS: as you want to create a vector with 400 observations, and i goes from 1 to 10, j needs to have "length" 40, so you want it to be 0:39.
When you want to fill a vector (1-dimensional) with a double loop, and j starts at 0, the following formula normally applies:
index of the vector = i*length(j) - (length(j)-1) + j
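The index formula can be checked against a loop-free version. This is a sketch with a made-up stand-in for dat$V2, since the real data is not shown in the question, and n_j is a hypothetical name for length(j):

```r
# Hypothetical stand-in for the data in the question.
dat <- data.frame(V2 = seq(100, 1000, by = 100))
n_j <- 40                                   # number of j values (0:39)

a <- rep(NA_real_, 10 * n_j)
for (i in 1:10) {
  for (j in 0:(n_j - 1)) {
    # index of the vector = i*length(j) - (length(j)-1) + j
    a[i * n_j - (n_j - 1) + j] <- dat$V2[i] - j
  }
}

# Same values without loops: repeat each V2 value 40 times,
# then subtract 0:39 repeated once per V2 value.
b <- rep(dat$V2[1:10], each = n_j) - rep(0:(n_j - 1), times = 10)
all.equal(a, b)   # TRUE
```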

Speed of for loop is decreasing

I'm currently looping through a large data set and have discovered that the higher the loop index, the slower the loop gets. It goes pretty fast at the beginning, but it's incredibly slow at the end. What's the reason for this? Is there any way to avoid it?
Remarks:
1) I can't use plyr because the calculation is recursive.
2) The length of output vector is not known in advance.
My code looks roughly like this:
# some_function and input_data are placeholders for the real function and data
for (i in 1:20000){
if(i == 1){
temp <- some_function(input_data[i])
out <- temp
} else {
temp <- some_function(input_data[i], temp)
out <- rbind(out, temp)
}
}
The problem is that you are growing the object out at each iteration, which will entail larger and larger amounts of copying as the size of out increases (as your loop index increases).
In this case, you know the loop needs a vector of 20000 elements, so create one initially and fill in that object as you loop. Doing this will also remove the need for the if() ... else() which is also slowing down your loop and will become appreciable as the size of the loop increases.
For example, you could do:
out <- numeric(20000)
out[1] <- foo(data[1])
for (i in 2:length(out)) {
out[i] <- foo(data[i], out[i-1])
}
What out needs to be when you create it will depend on what foo() returns. Adjust creation of out accordingly.
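If each step returns a whole row rather than a single number, one common alternative is to store the rows in a list and bind them once at the end. The sketch below rests on assumptions: foo and data are hypothetical stand-ins (here foo keeps a running total), and results and prev are made-up names.

```r
# Hypothetical recursive step: each row depends on the previous row.
foo <- function(x, prev = NULL) {
  if (is.null(prev)) {
    c(value = x, total = x)
  } else {
    c(value = x, total = unname(prev["total"]) + x)
  }
}
data <- 1:5

results <- vector("list", length(data))   # one slot per input element
prev <- NULL
for (i in seq_along(data)) {
  prev <- foo(data[i], prev)
  results[[i]] <- prev                    # filling a slot copies nothing else
}
out <- do.call(rbind, results)            # bind all rows in a single call
out[, "total"]                            # 1 3 6 10 15
```

When the number of rows really is unknown in advance, the list can still be grown slot by slot and bound once afterwards, which avoids calling rbind on the whole result in every iteration.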

missing value where TRUE/FALSE needed error in R

I have got a column with different numbers (from 1 to tt) and would like to use looping to perform a count on the occurrence of these numbers in R.
count = matrix(ncol=1,nrow=tt) #creating an empty matrix
for (j in 1:tt)
{count[j] = 0} #initiate count at 0
for (j in 1:tt)
{
for (i in 1:N) #for each observation (1 to N)
{
if (column[i] == j)
{count[j] = count[j] + 1 }
}
}
Unfortunately I keep getting this error.
Error in if (column[i] == j) { :
missing value where TRUE/FALSE needed
So I tried:
for (i in 1:N) #from obs 1 to obs N
if (column[i] = 1) print("Test")
I basically got the same error.
I tried to do a bit of research on this kind of error, and a lot of it talks about "debugging", which I'm not familiar with.
Hopefully someone can tell me what's happening here. Thanks!
As you progress with your learning of R, one feature you should be aware of is vectorisation. Many operations that (in C, say) would have to be done in a loop can be done all at once in R. This is particularly true when you have a vector/matrix/array and a scalar, and want to perform an operation between them.
Say you want to add 2 to the vector myvector. The C/C++ way to do it in R would be to use a loop:
for ( i in 1:length(myvector) )
myvector[i] = myvector[i] + 2
Since R has vectorisation, you can do the addition without a loop at all, that is, add a scalar to a vector:
myvector = myvector + 2
Vectorisation means the loop is done internally. This is much more efficient than writing the loop within R itself! (If you've ever done any Matlab or python/numpy it's much the same in this sense).
I know you're new to R so this is a bit confusing but just keep in mind that often loops can be eliminated in R.
With that in mind, let's look at your code:
The initialisation of count to 0 can be done at creation, so the first loop is unnecessary.
count = matrix(0,ncol=1,nrow=tt)
Secondly, because of vectorisation, you can compare a vector to a scalar.
So for your inner loop in i, instead of looping through column and doing if column[i]==j, you can do idx = (column==j). This returns a vector that is TRUE where column[i]==j and FALSE otherwise.
To find how many elements of column are equal to j, we just count how many TRUEs there are in idx. That is, we do sum(idx).
So your double-loop can be rewritten like so:
for ( j in 1:tt ) {
idx = (column == j)
count[j] = sum(idx) # no need to add
}
Now it's even possible to remove the outer loop in j by using the function sapply:
sapply( 1:tt, function(j) sum(column==j) )
The above line of code means: "for each j in 1:tt, return function(j)", and it returns a vector where the j'th element is the result of the function.
So in summary, you can reduce your entire code to:
count = sapply( 1:tt, function(j) sum(column==j) )
(Although this doesn't explain your error, which I suspect is to do with the construction or class of your column).
I suggest not using for loops here, but the count function from the plyr package instead. This function does exactly what you want in one line of code.
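As for the error itself: one common cause is an NA reaching the if condition, either because column contains missing values or because N is larger than the length of column (indexing past the end returns NA). Whether that applies here is an assumption, since the data is not shown. A sketch with made-up data:

```r
# Hypothetical data: the real `column` is not shown in the question.
column <- c(1, 3, NA, 2, 3)
tt <- 3

# NA == 1 evaluates to NA, and if (NA) raises an error.
res <- tryCatch(
  if (column[3] == 1) "match" else "no match",
  error = function(e) conditionMessage(e)
)
res   # the message from the question (in an English locale)

# Counting while ignoring the NAs, using base R only.
counts <- sapply(1:tt, function(j) sum(column == j, na.rm = TRUE))
counts                                             # 1 1 2
as.vector(table(factor(column, levels = 1:tt)))    # same counts
```

A quick check such as anyNA(column) or comparing N with length(column) would show whether this is what is happening.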

Resources