Logical comparison of two vectors with binary (0/1) result - r

For an assignment I had to create a random vector theta, a vector p containing for each element of theta the associated probability, and another random vector u. No problems thus far, but I'm stuck with the next instruction which I report below:
Generate a vector r1 that has a 1 in position i if p_i ≥ u_i and 0 if p_i < u_i. The vector r1 is a Rasch item given the latent variable theta.
theta=rnorm(1000,0,1)
p=(exp(theta-1))/(1+exp(theta-1))
u=runif(1000,0,1)
I tried the following code, but it doesn't work.
r1<-for(i in 1:1000){
if(p[i]<u[i]){
return("0")
} else {
return("1")}
}

You can use the ifelse function:
r1 <- ifelse(p >= u, 1, 0)
Or you can simply convert the logical comparison into a numeric vector, which turns TRUE into 1 and FALSE into 0:
r1 <- as.numeric(p >= u)
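As a quick check, the sketch below regenerates the vectors from the question and confirms that the two approaches give the same 0/1 vector. The seed is an assumption added only so the run is reproducible; it is not part of the original assignment.

```r
set.seed(1)                        # assumed seed, only for reproducibility
theta <- rnorm(1000, 0, 1)
p <- exp(theta - 1) / (1 + exp(theta - 1))
u <- runif(1000, 0, 1)

r1_ifelse  <- ifelse(p >= u, 1, 0)   # explicit 1/0 for TRUE/FALSE
r1_numeric <- as.numeric(p >= u)     # TRUE becomes 1, FALSE becomes 0

identical(r1_ifelse, r1_numeric)     # compare the two results
table(r1_numeric)                    # how many 0s and 1s were generated
```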

@DavidRobinson gave a nice working solution, but let's look at why your attempt didn't work:
r1<-for(i in 1:1000){
if(p[i]<u[i]){
return("0")
} else {
return("1")}
}
We've got a few problems, the biggest of which is that you're confusing for loops with functions, both by assigning the loop to a variable and by using return(). return() is used when you are writing your own function, with name <- function(...) { ... }. Inside a for loop it isn't needed: a for loop just runs the code inside it a certain number of times, and it can't return a value the way a function does.
You do need a way to store your results. This is best done by pre-allocating a results vector, and then filling it inside the for loop.
r1 <- rep(NA, length(p)) # create a vector as long as p
for (i in 1:1000) {
if (p[i] < u[i]) { # compare the ith element of p and u
r1[i] <- 0 # put the answer in the ith element of r1
} else {
r1[i] <- 1
}
}
We could simplify this a bit. Rather than bothering with the if and the else, you could start r1 as all 0's and then only change an element to 1 if p[i] >= u[i]. To be safe, I think it's better to write the for statement as for (i in 1:length(p)), or best of all for (i in seq_along(p)). But the beauty of R is how few for loops are necessary, and @DavidRobinson's vectorized suggestions are far cleaner.
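The simplification described above might look like the following sketch. The data generation repeats the question's setup, with an assumed seed added for reproducibility.

```r
set.seed(1)                       # assumed seed, only for reproducibility
theta <- rnorm(1000)
p <- exp(theta - 1) / (1 + exp(theta - 1))
u <- runif(1000)

r1 <- numeric(length(p))          # start with all 0s
for (i in seq_along(p)) {         # safe even if p were empty
  if (p[i] >= u[i]) {
    r1[i] <- 1                    # only overwrite where the condition holds
  }
}
```

The loop result can be compared against the vectorized version with identical(r1, as.numeric(p >= u)).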

Related

Problem with checking logical within for loop

Inspired by the LeetCode two sum challenge, I wanted to solve it in R. But while trying to solve it by brute force I ran into an issue with my for loop.
So the basic idea is: given a vector of integers, find which two integers in the vector sum to a given target integer.
First I create 10000 integers:
set.seed(1234)
n_numbers <- 10000
nums <- sample(-10^4:10^4, n_numbers, replace = FALSE)
Then I use a for loop within a for loop to check every element against every other element.
# ensure that it is actually solvable
target <- nums[11] + nums[111]
test <- 0
for (i in 1:(length(nums)-1)) {
for (j in 1:(length(nums)-1)) {
j <- j + 1
test <- nums[i] + nums[j]
if (test == target) {
print(i)
print(j)
break
}
}
}
My problem is that it starts wildly printing numbers before ever getting to the right condition of test == target. And I cannot seem to figure out why.
I think there are several issues with your code:
First, you don't have to increase your j manually, you can do this within the for-statement. So if you really want to increase your j by 1 in every step you can just write:
for (j in 2:(length(nums)))
Second, break only exits the inner loop, so the outer loop keeps running. See "Breaking out of nested loops in R" for further information on that.
Third, there are several pairs of entries in nums that give the "right" result target. Therefore, your if condition works as written and prints every combination for which nums[i] + nums[j] equals target.
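One way to act on these three points is to wrap the search in a function, so that return() leaves both loops at once. This is only a sketch: the function name two_sum_first is hypothetical, and it stops at the first matching pair rather than listing all of them.

```r
# Hypothetical helper: returns the first index pair (i, j), i < j,
# whose elements sum to target, or NULL if there is no such pair.
two_sum_first <- function(nums, target) {
  n <- length(nums)
  if (n < 2) return(NULL)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {          # j is set by the for statement itself
      if (nums[i] + nums[j] == target) {
        return(c(i, j))             # exits both loops immediately
      }
    }
  }
  NULL
}

two_sum_first(c(2, 7, 11, 15), 9)
```

Starting j at i + 1 also avoids pairing an element with itself and checking each pair twice.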

Finding the value of infinite sums in r

I'm very new to R and programming, so please bear with me :)
I am trying to use iteration to find the value of an infinite sum to the 4th decimal place, i.e. the point where the 4th decimal no longer changes. For example, with 1.4223, once the 3 does not change anymore, the result to 3 decimal places is 1.422.
The link above shows an example of a similar problem to the one I am faced with. My question is: how do I create a loop that runs indefinitely and finds the value at which the 4th decimal place stops changing?
I have tried using while loops but I am not sure how to stop it from just looping forever. I need some if statement like below:
result <- 0
i <- 1
d <- 1e-4
while(TRUE)
{
result <- result + (1/(i^2))
if(abs(result) < d)
{
break
}
i <- i + 1
}
result
Here's an example: to do the infinite loop, use while(TRUE) {}, and as you suggested use an if clause and break to stop when necessary.
## example equation shown
## fun <- function(x,n) {
## (x-1)^(2*n)/(n*(2*n-1))
## }
## do it for f(x)=1/x^2 instead
## doesn't have any x-dependence, but leave it in anyway
fun <- function(x,n) {
1/n^2
}
n <- 1
## x <- 0.6
tol <- 1e-4
ans <- 0
while (TRUE) {
next_term <- fun(x,n)
ans <- ans + next_term
if (abs(next_term)<tol) break
n <- n+1
}
When run this gives ans=1.635082, n=101.
R also has a rarely used repeat { } keyword, but while(TRUE) will probably be clearer to readers.
There are more efficient ways to do this (e.g. calculating the numerator by multiplying it by (x-1)^2 each time).
It's generally a good idea to test for a maximum number of iterations as well, so that you don't set up a truly infinite loop if your series doesn't converge or if you have a bug in your code.
I haven't solved your exact problem (chose a smaller value of tol), but you should be able to adjust this to get an answer.
As discussed in the answer to your previous question, stopping when the next term falls below the tolerance isn't guaranteed to work, but should generally be OK; you can check (I haven't) to be sure that the particular series you want to evaluate has well-behaved convergence.
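The maximum-iteration safeguard mentioned above could be added along these lines. This is a sketch, not the answerer's code: the function name sum_series, its arguments, and the default max_iter are all assumptions made for illustration.

```r
# Hypothetical helper: adds term(1), term(2), ... until a term is smaller
# than tol in absolute value, or until max_iter terms have been added.
sum_series <- function(term, tol = 1e-4, max_iter = 1e6) {
  total <- 0
  n <- 0
  converged <- FALSE
  while (n < max_iter) {
    n <- n + 1
    next_term <- term(n)
    total <- total + next_term
    if (abs(next_term) < tol) {
      converged <- TRUE             # stopped because the term got small
      break
    }
  }
  list(value = total, n = n, converged = converged)
}

res <- sum_series(function(n) 1 / n^2)
res$value
res$n
```

With the default tolerance this reproduces the numbers quoted above for the 1/n^2 series; the converged flag lets the caller distinguish a normal stop from hitting the iteration cap.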

Which loop to use, R language?

We have to create a function(K) that returns a vector containing all items of the Fibonacci sequence that are smaller than or equal to K. We can assume K is a Fibonacci number. For example, if K is 3 the function would return the vector (1,1,2,3).
In general, a for loop is used when you know how many iterations you need to do, and a while loop is used when you want to keep going until a condition is met.
For this case, it sounds like you get an input K and you want to keep going until you find a Fibonacci term > K, so use a while loop.
ans <- function(n) {
x <- c(1,1)
while (length(x) <= n) {
position <- length(x)
new <- x[position] + x[position-1]
x <- c(x,new)
}
return(x[x<=n])
}
I tried many different loops, and this is the closest I got. It works with every other number, but ans(3) gives 1,1,2 even though it should give 1,1,2,3. I couldn't see what is wrong with this.
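Following the while-loop advice above, one possible sketch makes the loop condition depend on the next Fibonacci value rather than on the length of the vector. The function name fib_upto is hypothetical.

```r
# Hypothetical helper: all Fibonacci numbers (starting 1, 1) that are <= K.
fib_upto <- function(K) {
  x <- c(1, 1)
  # keep going while the next term would still be <= K
  while (x[length(x)] + x[length(x) - 1] <= K) {
    x <- c(x, x[length(x)] + x[length(x) - 1])
  }
  x[x <= K]                         # also handles K smaller than 1
}

fib_upto(3)
```

Because the condition tests the next value directly, the loop stops as soon as a term would exceed K and never generates extra terms.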

How to create a function that finds the indexed slope of two vectors

I have looked everywhere for this answer, but I am having a hard time even figuring out how to ask this question. I am trying to create a function that builds a vector from two other vectors, where I use a for loop to index values at k and k+1. Here is an example of my code, which does not work:
x <- 1:10
y <- x^2
d <- data.frame(x,y)
invSlope <- NULL
invSlope.f <- function(X,Y){
for(k in 1:length(X)-1){
invSlope[k] = (X[k+1] - X[k])/ (Y[k+1] - Y[k])
invSlope[length(X)] = 0
return(invSlope)
}
}
d$invSlope <- invSlope.f(d$x,d$y)
What I am trying to accomplish is at d$invSlope[1] I have the inverse of the slope of the line that comes after it (delta x/delta y). The last value of the vector would just be 0. I can accomplish this with a for loop (or even nested for loops), but I would like to generalize this to a function.
Thanks
The diff function is a vectorized approach... we don't need no steenkin' loops:
finvslope <- function(xseq, yseq) { c( diff(xseq)/diff(yseq) , 0) }
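For comparison, here is a sketch that applies the diff version to the question's data next to a corrected loop. Two things stand out in the original attempt: 1:length(X)-1 is parsed as (1:length(X)) - 1, so the index starts at 0, and return() sits inside the loop, so the function exits on the first iteration. The name inv_slope_loop is hypothetical, and the loop assumes X has at least one element.

```r
finvslope <- function(xseq, yseq) { c(diff(xseq) / diff(yseq), 0) }

# Hypothetical loop equivalent, with the indexing and return() fixed
inv_slope_loop <- function(X, Y) {
  n <- length(X)
  out <- numeric(n)                 # pre-allocated; last element stays 0
  for (k in seq_len(n - 1)) {
    out[k] <- (X[k + 1] - X[k]) / (Y[k + 1] - Y[k])
  }
  out                               # returned once, after the loop
}

x <- 1:10
y <- x^2
d <- data.frame(x, y)
d$invSlope <- finvslope(d$x, d$y)
all.equal(d$invSlope, inv_slope_loop(d$x, d$y))
```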

missing value where TRUE/FALSE needed error in R

I have got a column with different numbers (from 1 to tt) and would like to use looping to perform a count on the occurrence of these numbers in R.
count = matrix(ncol=1,nrow=tt) #creating an empty matrix
for (j in 1:tt)
{count[j] = 0} #initiate count at 0
for (j in 1:tt)
{
for (i in 1:N) #for each observation (1 to N)
{
if (column[i] == j)
{count[j] = count[j] + 1 }
}
}
Unfortunately I keep getting this error.
Error in if (column[i] == j) { :
missing value where TRUE/FALSE needed
So I tried:
for (i in 1:N) #from obs 1 to obs N
if (column[i] = 1) print("Test")
I basically got the same error.
I tried to do a bit of research on this kind of error, and a lot has been said about "debugging", which I'm not familiar with.
Hopefully someone can tell me what's happening here. Thanks!
As you progress with your learning of R, one feature you should be aware of is vectorisation. Many operations that (in C, say) would have to be done in a loop can be done all at once in R. This is particularly true when you have a vector/matrix/array and a scalar, and want to perform an operation between them.
Say you want to add 2 to the vector myvector. The C/C++ way to do it in R would be to use a loop:
for ( i in 1:length(myvector) )
myvector[i] = myvector[i] + 2
Since R has vectorisation, you can do the addition without a loop at all, that is, add a scalar to a vector:
myvector = myvector + 2
Vectorisation means the loop is done internally. This is much more efficient than writing the loop within R itself! (If you've ever done any Matlab or python/numpy it's much the same in this sense).
I know you're new to R so this is a bit confusing but just keep in mind that often loops can be eliminated in R.
With that in mind, let's look at your code:
The initialisation of count to 0 can be done at creation, so the first loop is unnecessary.
count = matrix(0,ncol=1,nrow=tt)
Secondly, because of vectorisation, you can compare a vector to a scalar.
So for your inner loop in i, instead of looping through column and doing if column[i]==j, you can do idx = (column==j). This returns a vector that is TRUE where column[i]==j and FALSE otherwise.
To find how many elements of column are equal to j, we just count how many TRUEs there are in idx. That is, we do sum(idx).
So your double-loop can be rewritten like so:
for ( j in 1:tt ) {
idx = (column == j)
count[j] = sum(idx) # no need to add
}
Now it's even possible to remove the outer loop in j by using the function sapply:
sapply( 1:tt, function(j) sum(column==j) )
The above line of code means: "for each j in 1:tt, evaluate function(j)", and it returns a vector where the j'th element is the result of the function.
So in summary, you can reduce your entire code to:
count = sapply( 1:tt, function(j) sum(column==j) )
(Although this doesn't explain your error, which I suspect is to do with the construction or class of your column).
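One common way to get "missing value where TRUE/FALSE needed" is for the if() condition to evaluate to NA, for example when column contains a missing value or when i runs past the end of column. Whether that is what happened here is an assumption, since the asker's data isn't shown; the sketch below uses made-up data to illustrate the idea.

```r
column <- c(1, 2, NA, 2, 1)   # hypothetical data containing a missing value
tt <- 2

is.na(column[3] == 1)         # comparing NA gives NA, which if() cannot use
is.na(column[6] == 1)         # indexing past the end also gives NA

# sum() can skip the missing values instead of failing
count <- sapply(1:tt, function(j) sum(column == j, na.rm = TRUE))
count
```

Checking anyNA(column) and length(column) against N before looping is a cheap way to rule this cause in or out.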
I suggest not using for loops at all and instead using the count function from the plyr package. This function does exactly what you want in one line of code.
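If adding a package is not an option, base R's tabulate() is a possible substitute (named here as an alternative, not as what the answer above recommends). The sketch below uses made-up data and checks it against the sapply version.

```r
column <- c(3, 1, 2, 3, 3, 1)   # hypothetical example data
tt <- 3                         # largest value that can occur

count_sapply   <- sapply(1:tt, function(j) sum(column == j))
count_tabulate <- tabulate(column, nbins = tt)   # counts for 1, 2, ..., tt

identical(count_sapply, count_tabulate)
```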
