missing value where TRUE/FALSE needed error in R

I have a column containing different numbers (from 1 to tt) and would like to use a loop to count the occurrences of each number in R.
count = matrix(ncol=1,nrow=tt) #creating an empty matrix
for (j in 1:tt)
{count[j] = 0} #initiate count at 0
for (j in 1:tt)
{
for (i in 1:N) #for each observation (1 to N)
{
if (column[i] == j)
{count[j] = count[j] + 1 }
}
}
Unfortunately I keep getting this error.
Error in if (column[i] == j) { :
missing value where TRUE/FALSE needed
So I tried:
for (i in 1:N) #from obs 1 to obs N
if (column[i] = 1) print("Test")
I basically got the same error.
I tried to do a bit of research on this kind of error, and a lot of what I found talks about "debugging", which I'm not familiar with.
Hopefully someone can tell me what's happening here. Thanks!

As you progress with your learning of R, one feature you should be aware of is vectorisation. Many operations that (in C, say) would have to be done in a loop can be done all at once in R. This is particularly true when you have a vector/matrix/array and a scalar, and want to perform an operation between them.
Say you want to add 2 to the vector myvector. The C/C++ way to do it in R would be to use a loop:
for ( i in 1:length(myvector) )
myvector[i] = myvector[i] + 2
Since R has vectorisation, you can do the addition without a loop at all, that is, add a scalar to a vector:
myvector = myvector + 2
Vectorisation means the loop is done internally. This is much more efficient than writing the loop within R itself! (If you've ever done any Matlab or python/numpy it's much the same in this sense).
I know you're new to R so this is a bit confusing but just keep in mind that often loops can be eliminated in R.
With that in mind, let's look at your code:
The initialisation of count to 0 can be done at creation, so the first loop is unnecessary.
count = matrix(0,ncol=1,nrow=tt)
Secondly, because of vectorisation, you can compare a vector to a scalar.
So for your inner loop in i, instead of looping through column and doing if column[i]==j, you can do idx = (column==j). This returns a vector that is TRUE where column[i]==j and FALSE otherwise.
To find how many elements of column are equal to j, we just count how many TRUEs there are in idx. That is, we do sum(idx).
So your double-loop can be rewritten like so:
for ( j in 1:tt ) {
idx = (column == j)
count[j] = sum(idx) # no need to add
}
Now it's even possible to remove the outer loop in j by using the function sapply:
sapply( 1:tt, function(j) sum(column==j) )
The above line of code means: "for each j in 1:tt, call the function on j", and it returns a vector where the j'th element is the result of the function.
So in summary, you can reduce your entire code to:
count = sapply( 1:tt, function(j) sum(column==j) )
(Although this doesn't explain your error, which I suspect is to do with the construction or class of your column).
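One common cause of this message is a missing value (NA) in the data: comparing NA with a number gives NA, and if() cannot decide between TRUE and FALSE. The same happens if N is larger than the length of column, because indexing past the end returns NA. Here is a minimal sketch, assuming a made-up column (the data below is hypothetical, not the asker's):

```r
# Hypothetical data with one missing value (not the asker's real column)
column <- c(1, 2, NA, 2, 3)
tt <- 3

# Comparing NA with a number yields NA rather than TRUE or FALSE
is.na(column[3] == 2)

# if() on that NA raises an error; capture its message instead of stopping
msg <- tryCatch(
  if (column[3] == 2) "match",
  error = function(e) conditionMessage(e)
)

# Check for missing values before looping
anyNA(column)

# The vectorised count can skip NAs explicitly with na.rm = TRUE
count <- sapply(1:tt, function(j) sum(column == j, na.rm = TRUE))
```

If anyNA(column) is TRUE, either drop or handle the missing values before comparing, or guard the condition with !is.na(column[i]).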

I suggest not using for loops and instead using the count function from the plyr package. This function does exactly what you want in one line of code.
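For reference, base R can produce the same counts without extra packages. This is a small sketch, assuming a made-up column and tt (both hypothetical). It uses the base functions tabulate() and table() as an alternative; it is not a description of how plyr's count works:

```r
# Hypothetical example data: values between 1 and tt
column <- c(1, 2, 2, 3, 3, 3)
tt <- 4

# tabulate() counts occurrences of each integer 1..nbins, including zeros
counts <- tabulate(column, nbins = tt)

# table() on a factor with fixed levels gives the same counts, with labels
counts_tbl <- table(factor(column, levels = 1:tt))
```

Both keep a zero for values that never occur (here 4), which matches the count vector built by the loop in the question.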

Related

Problem with checking logical within for loop

Inspired by the LeetCode two sum challenge, I wanted to solve it in R. But while trying to solve it by brute force I ran into an issue with my for loop.
The basic idea is: given a vector of integers, find the two integers in the vector that sum to a given target integer.
First I create 10000 integers:
set.seed(1234)
n_numbers <- 10000
nums <- sample(-10^4:10^4, n_numbers, replace = FALSE)
Then I do a for loop within a for loop to check every single element against each other.
# ensure that it is actually solvable
target <- nums[11] + nums[111]
test <- 0
for (i in 1:(length(nums)-1)) {
for (j in 1:(length(nums)-1)) {
j <- j + 1
test <- nums[i] + nums[j]
if (test == target) {
print(i)
print(j)
break
}
}
}
My problem is that it starts wildly printing numbers before ever getting to the right condition of test == target. And I cannot seem to figure out why.
I think there are several issues with your code:
First, you don't have to increase your j manually, you can do this within the for-statement. So if you really want to increase your j by 1 in every step you can just write:
for (j in 2:(length(nums)))
Second, break only exits the inner for loop, so the outer loop keeps running. See the question "Breaking out of nested loops in R" for further information on that.
Third, there are several pairs of entries in nums that sum to target. Therefore, your if-condition works as written and prints every combination of i and j for which nums[i]+nums[j] equals target.
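The three points can be combined in a short sketch. It assumes you only want the first matching pair and that an element may not be paired with itself; find_pair is a hypothetical helper name, not something from the question. Wrapping the loops in a function lets return() leave both loops at once:

```r
# Hypothetical helper: brute-force search for the first pair summing to target
find_pair <- function(nums, target) {
  n <- length(nums)
  if (n < 2) return(NULL)
  for (i in seq_len(n - 1)) {
    # Start j after i so each pair is checked once and i never equals j
    for (j in (i + 1):n) {
      if (nums[i] + nums[j] == target) {
        return(c(i, j))  # exits both loops immediately
      }
    }
  }
  NULL  # no pair found
}

# Small made-up example
pair <- find_pair(c(2, 7, 11, 15), 9)
none <- find_pair(c(1, 2, 3), 100)
```

With the question's data you would call find_pair(nums, target) and get the indices of the first pair found, which may differ from 11 and 111 if an earlier pair also sums to target.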

Query for loops in R

I am working with some data in R.
a<-rep(NA,400)
for(i in 1:10){for(j in 0:40){print(dat$V2[i]-j)}}
Instead of printing, I want to store each value in an empty array (a). I would be thankful if someone could help me with this.
In a case like this (nested loops), often the easiest way is to add a counter to keep track of positions in the array:
a <- rep(NA, 400)
counter <- 1
for(i in 1:10){
for(j in 0:40){
a[counter] <- dat$V2[i]-j
counter <- counter + 1
}
}
Here is one way:
#a<-rep(NA,400)
#for(i in 1:10){for(j in 0:40){print(dat$V2[i]-j)}}
a <- as.numeric( sapply( 1:10, function(i){
sapply( 0:40, function(j) {
dat$V2[i]-j
})
}))
sapply is useful here because, contrary to a for loop, it returns something on each iteration. So in this case I first loop over 1:10, like your for loop, with the useful difference that each iteration returns a value.
Each iteration runs a nested sapply that loops over 0:40 and returns the expression you have innermost in your for loop.
So for each of the 10 values in 1:10, it loops over 0:40, calculating and returning your expression each time, which gives you all the values you want.
The nested sapply returns a matrix (one column per value of i); as.numeric is wrapped around it to flatten that into one long vector, which seems to be what you want.
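The same values can also be built without any loop or sapply by letting rep() line up the two indices. This is a sketch under assumptions: dat below is a hypothetical stand-in for the asker's data frame, and the result has 10 * 41 = 410 elements because 0:40 contains 41 values:

```r
# Hypothetical stand-in for the asker's data (only V2 is used)
dat <- data.frame(V2 = seq(100, 1000, by = 100))

# Reference result from the nested loop with a counter
a_loop <- numeric(10 * 41)
counter <- 1
for (i in 1:10) {
  for (j in 0:40) {
    a_loop[counter] <- dat$V2[i] - j
    counter <- counter + 1
  }
}

# Vectorised version: repeat each V2 value 41 times, and repeat 0:40 ten times
a_vec <- rep(dat$V2[1:10], each = 41) - rep(0:40, times = 10)

isTRUE(all.equal(a_loop, a_vec))
```

The each and times arguments reproduce the loop order: i changes slowly, j changes quickly.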
A way only using loops:
for(i in 1:10){
for(j in 0:39){
print(i*40 - 39 + j)
a[i*40 - 39 + j] = dat$V2[i]-j}}
PS: as you want to create a vector with 400 observations, and i goes from 1 to 10, j needs to have "length" 40, so you want it to be 0:39.
When you want to fill a one-dimensional vector with a double loop, the following formula normally applies (with i starting at 1, j starting at 0, and length(j) being the number of values j takes):
index of the vector = i*length(j) - (length(j)-1) + j
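A quick way to convince yourself that the formula visits every position exactly once is to collect the indices it generates. This is a small check, assuming 10 values of i and 40 values of j as in this answer; n_i, n_j and idx are names chosen here for illustration:

```r
n_i <- 10   # i runs over 1:10
n_j <- 40   # j runs over 0:39, so it takes 40 values

# Collect the index produced for every (i, j) combination, in loop order
idx <- numeric(0)
for (i in seq_len(n_i)) {
  for (j in 0:(n_j - 1)) {
    idx <- c(idx, i * n_j - (n_j - 1) + j)
  }
}

# The indices should be exactly 1, 2, ..., 400 in order
all(idx == seq_len(n_i * n_j))
```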

Which loop to use, R language?

We have to create a function of K that returns a vector containing all items of the Fibonacci sequence that are smaller than or equal to K. We can assume K is itself a Fibonacci number. For example, if K is 3 the function would return the vector (1,1,2,3).
In general, a for loop is used when you know how many iterations you need to do, and a while loop is used when you want to keep going until a condition is met.
For this case, it sounds like you get an input K and you want to keep going until you find a Fibonacci term > K, so use a while loop.
ans <- function(n) {
x <- c(1,1)
while (length(x) <= n) {
position <- length(x)
new <- x[position] + x[position-1]
x <- c(x,new)
}
return(x[x<=n])
}
I tried many different loops, and this is the closest I got. It works with every other number, but ans(3) gives 1,1,2 even though it should give 1,1,2,3. I can't see what is wrong with this.
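Following the advice to keep going until a Fibonacci term exceeds K, one option is to make the while condition test the values themselves rather than the length of the vector. This is a minimal sketch, assuming K is at least 1; fib_up_to is a hypothetical name:

```r
# Hypothetical helper: all Fibonacci terms <= K (assumes K >= 1)
fib_up_to <- function(K) {
  x <- c(1, 1)
  # Keep appending while the next term would still be <= K
  while (x[length(x)] + x[length(x) - 1] <= K) {
    x <- c(x, x[length(x)] + x[length(x) - 1])
  }
  x[x <= K]
}

fib_up_to(3)   # 1 1 2 3
fib_up_to(8)   # 1 1 2 3 5 8
```

Because the stopping rule depends on the size of the terms, the number of iterations never has to be known in advance, which is the case where a while loop fits better than a for loop.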

Logical comparison of two vectors with binary (0/1) result

For an assignment I had to create a random vector theta, a vector p containing for each element of theta the associated probability, and another random vector u. No problems thus far, but I'm stuck with the next instruction which I report below:
Generate a vector r1 that has a 1 in position i if p_i ≥ u_i and 0 if p_i < u_i. The
vector r1 is a Rasch item given the latent variable theta.
theta=rnorm(1000,0,1)
p=(exp(theta-1))/(1+exp(theta-1))
u=runif(1000,0,1)
I tried the following code, but it doesn't work.
r1<-for(i in 1:1000){
if(p[i]<u[i]){
return("0")
} else {
return("1")}
}
You can use the ifelse function:
r1 <- ifelse(p >= u, 1, 0)
Or you can simply convert the logical comparison into a numeric vector, which turns TRUE into 1 and FALSE into 0:
r1 <- as.numeric(p >= u)
@DavidRobinson gave a nice working solution, but let's look at why your attempt didn't work:
r1<-for(i in 1:1000){
if(p[i]<u[i]){
return("0")
} else {
return("1")}
}
We've got a few problems, the biggest of which is that you're confusing for loops with functions, both by assigning the loop to r1 and by using return(). return() is used when you are writing your own function, as in f <- function(...) { ... }. Inside a for loop it isn't needed. A for loop just runs the code inside it a certain number of times; it can't return something the way a function does.
You do need a way to store your results. This is best done by pre-allocating a results vector, and then filling it inside the for loop.
r1 <- rep(NA, length(p)) # create a vector as long as p
for (i in 1:1000) {
if (p[i] < u[i]) { # compare the ith element of p and u
r1[i] <- 0 # put the answer in the ith element of r1
} else {
r1[i] <- 1
}
}
We could simplify this a bit. Rather than bothering with the if and the else, you could start r1 as all 0's, and then only change an element to 1 if p[i] >= u[i]. Just to be safe, I think it's better to write the for statement as something like for (i in 1:length(p)), or best yet for (i in seq_along(p)). But the beauty of R is how few for loops are necessary, and @DavidRobinson's vectorized suggestions are far cleaner.
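The simplification described above could look like the sketch below. The seed value is an arbitrary choice made here so the example is reproducible; everything else follows the question's setup:

```r
set.seed(1)  # arbitrary seed, only for reproducibility of this sketch
theta <- rnorm(1000, 0, 1)
p <- exp(theta - 1) / (1 + exp(theta - 1))
u <- runif(1000, 0, 1)

# Start with all zeros, then flip to 1 only where p[i] >= u[i]
r1 <- numeric(length(p))
for (i in seq_along(p)) {
  if (p[i] >= u[i]) r1[i] <- 1
}

# The vectorised version gives the same result
r1_vec <- as.numeric(p >= u)
identical(r1, r1_vec)
```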

Explaining a for loop in R

I'm very new to R, and even newer to programming in R. I have the following question and its answer (which is not mine). I'm trying to understand the values in it: where they come from, why they are used, etc.
Question: Make the vector 3 5 7 9 11 13 15 17 with a for loop. Start
with x=numeric() and fill this vector with the for loop
I know I have to create x=numeric() so I can fill it with the result obtained from the loop.
The answer from a classmate was:
x <- numeric()
for(i in 1:8){
if(i==1){ ## Why ==1 and not 0, or any other value
x[i] <- 3
}else{
x[i] <- x[i-1]+2 ### And why i-1
}
}
I'm having similar problems in questions like:
Make a for loop that adds the second element of a vector to the first,
subtracts the third element from the result, adds the fourth again and
so on for the entire length of the vector
So far, I created the vector and the empty vector
y = c(5, 10, 15, 20, 25, 30)
answer <- 0
And then, when I try to do the for loop, I get stuck here:
for(i in 1:length(y)){
if(i...){ ### ==1? ==0?
answer = y[i] ###and here I really don't know how to continue.
}else if()
}
Believe me when I tell you I've read several replies to questions here, such as "How to make a vector using a for loop", plus pages and pages about for loops, but I cannot really figure out how to solve these (and other) problems.
I repeat, I'm very new, so I'm struggling trying to understand it. Any help would be much appreciated.
First, I will annotate the loop to answer what the loop is doing.
# Initialize the vector
x <- numeric()
for(i in 1:8){
# Initialize the first element of the vector, x[1]. Remember, R indexes start at 1, not 0.
if(i==1){
x[i] <- 3
} else {
# Define each additional element in terms of the previous one
# (x[i - 1] is the element of x before the current one).
x[i] <- x[i-1]+2
}
}
A better solution that uses a loop and grows the vector (as the instructions state) is something like this:
x <- numeric()
for(i in 1:8){
x[i] <- 2 * i + 1
}
This is still not a good way to do things because growing a vector inside a loop is very slow. To fix this, you can preallocate the vector by telling numeric the length of the vector you want:
x <- numeric(8)
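Put together with the loop, the preallocated version might look like this sketch (seq_along(x) is used here so the loop bounds follow the vector's length):

```r
# Preallocate all 8 slots, then fill each one in place
x <- numeric(8)
for (i in seq_along(x)) {
  x[i] <- 2 * i + 1
}
x  # 3 5 7 9 11 13 15 17
```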
The best way to solve this would be:
2 * 1:8 + 1
using vectorized operations.
To help you solve your other problem, I suggest writing out each step of the loop as a table. For example, for my solution, the table would be
i | x[i]
------------------
1 | 2 * 1 + 1 = 3
2 | 2 * 2 + 1 = 5
and so on. This will give you an idea of what the for loop is doing at each iteration.
This is intentionally not an answer because there are better ways to solve the alternating sign summation problem than a for-loop. I suppose there could be value in getting comfortable with for-loops but the vectorized approaches in R should be learned as well. R has "argument recycling" for many of its operations, including the "*" (multiplication) operation: Look at:
(1:10)*c(1,-1)
Then take an arbitrary vector, say vec and try:
sum( vec*c(1,-1) )
The more correct answer after looking at that result would be:
vec[1] + sum( vec[-1]*c(1,-1) )
Which has the educational advantage of illustrating R's negative indexing. Look up "argument recycling" in your documentation. The shorter objects are automagically duplicated/triplicated/however-many-needed-cated to exactly match the length of the longest vector in the mathematical or logical expression.
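For comparison, here is one way the for-loop version of the alternating sum could be written, next to the vectorised form. It is a sketch using the y from the question. The signs vector is built with rep(..., length.out = ...), a choice made here so that recycling does not warn when the lengths do not divide evenly:

```r
y <- c(5, 10, 15, 20, 25, 30)

# Loop version: start from the first element, then alternate + and -
answer <- y[1]
for (i in seq_along(y)[-1]) {   # i = 2, 3, ..., length(y)
  if (i %% 2 == 0) {
    answer <- answer + y[i]     # even positions are added
  } else {
    answer <- answer - y[i]     # odd positions (from the third on) are subtracted
  }
}
answer  # 25

# Vectorised version with an explicit sign vector of matching length
signs <- rep(c(1, -1), length.out = length(y) - 1)
answer_vec <- y[1] + sum(y[-1] * signs)
```

Writing out the table of i against the running total (15, 0, 20, -5, 25) is a useful check that the if condition picks the right sign at each step.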
