Error using next in for loop R - r

I keep getting this error in my For loop:
Error in FUN(X[[i]], ...) : no loop for break/next, jumping to top level
autoAnal <- function(x){
if(!is.numeric(x)){
next
}
m <- median(x, na.rm = T)
a <- mean(x, na.rm = T)
s <- sd(x, na.rm = T)
q <- quantile(x, na.rm = T)
q3 <- q[4]
q1 <- q[2]
outhigh <- (1.5 * q3) + IQR(x, na.rm = T)
outlow <- (1.5 * q1) - IQR(x, na.rm = T)
data.table(Median = m, Average = a, StDev = s,
Outhigh = outhigh, Outlow = outlow)
}
Channel Data
Channel june july december
ATM 666 783 333
VISA 893 321 321
DEBIT 931 134 123
MERCHANT 913 111 134
The first variable, "Channel", is not numeric, so I want R to skip it and go to the next variable. However, I get the error above. I can run the function successfully on a numeric vector, but when I use a data frame or data.table it doesn't work.
When I run this code I get the error:
test <- lapply(channnel, autoAnal)
What am I doing wrong with the 'next' statement in my for loop?
Thank you.

In order to understand why next doesn't work, we have to break down the two parts of your code that you've attempted. To do this, it makes more sense to work backwards and examine how lapply works. Let's start with a basic variable called i and set it equal to the numbers 1 through 10. Once we have that, we'll use lapply to see how the function handles an input vector.
i <- 1:10
tmp <- lapply(i, as.character)
If you run that code and look at tmp, you'll see that we get a list with 10 elements, where each element in the list is a single number stored as a string. What this should show you is that lapply works as a loop going through each element of an object that you've passed to it and feeding that into whatever function you've chosen. So let's use that to look at what your code is doing.
test <- lapply(channnel, autoAnal)
This takes the first element of channnel and passes it into your function autoAnal. At this point, the code leaves the loop that lapply is running and steps into a single function call. The first thing your function does is check whether the element passed in is numeric. If it's not, you use next to tell R to skip that variable.
However, next only works in loops, such as while and for. So let's go back to our i variable. Let's say we wanted to loop over i and print a message if the number is even, otherwise, skip to the next number. We can do that like this:
for (i in 1:10) {
if (i %% 2 == 0) {
print("i is even!")
} else {
next
}
}
In this instance, using next doesn't throw an error because it's used within a for loop. That's why in my comment I mentioned that there is no for loop, at least not where you think there is. My intention behind saying this was to point out that, yes, lapply is a loop, but not for the sake of what you're trying to do with next.
So, what if we just tried to use next with an if statement?
i <- 5
if (i %% 2 == 0) {
print("i is even")
} else {
next
}
This will throw an error because your code isn't a loop. There's just one iteration to go through: check whether i is even, then move on. In the same way, next doesn't work in your code because the if statement only checks whether the element passed to it is numeric. There's nothing for it to "next" to, if that makes sense. For this reason, next is only used with loops.
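Inside a function, the closest equivalent to next is an early return. Here is a minimal sketch, assuming a hypothetical helper called double_if_numeric (it is not part of the question's code):

```r
# Hypothetical helper: return(NULL) leaves the function early,
# which is the function-level analogue of `next` in a loop.
double_if_numeric <- function(x) {
  if (!is.numeric(x)) {
    return(NULL)
  }
  x * 2
}

# lapply keeps a NULL placeholder for the skipped (non-numeric) element.
out <- lapply(list(1:3, "a", 4), double_if_numeric)
str(out)
```

The second element of out is NULL, while the first and third hold the doubled values.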
In my opinion, next is never even really needed. You can always just use if/else statements to run code if needed or otherwise just ignore the variable. As an example, let's rewrite your code to take advantage of if/else and get rid of our next call.
autoAnal <- function(x){
  if(is.numeric(x)){
    m <- median(x, na.rm = TRUE)
    a <- mean(x, na.rm = TRUE)
    s <- sd(x, na.rm = TRUE)
    q <- quantile(x, na.rm = TRUE)
    q3 <- q[4]
    q1 <- q[2]
    outhigh <- (1.5 * q3) + IQR(x, na.rm = TRUE)
    outlow <- (1.5 * q1) - IQR(x, na.rm = TRUE)
    data.table(Median = m, Average = a, StDev = s,
               Outhigh = outhigh, Outlow = outlow)
  } else {
    print("Skipping this element")
  }
}
By using if/else, we tell the computer to perform those calculations only on numeric variables and otherwise print a message saying we're skipping a variable. In normal code, I'd advise dropping the else branch altogether. A function whose if condition is false and that has no else branch returns NULL, so lapply will simply store NULL for the first variable, since it's not numeric. In the end, you have an error-free function that only operates on numeric data.
Hopefully that helps illustrate why next doesn't work in your current context.
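To see the NULL behaviour in practice, here is a small sketch using base R only. The data frame mirrors the table in the question, and summarise_column is a hypothetical, simplified stand-in for autoAnal (it returns a data.frame rather than a data.table so that no extra package is needed):

```r
# Data copied from the table in the question.
channel <- data.frame(
  Channel  = c("ATM", "VISA", "DEBIT", "MERCHANT"),
  june     = c(666, 893, 931, 913),
  july     = c(783, 321, 134, 111),
  december = c(333, 321, 123, 134)
)

# Hypothetical simplified summary: there is no else branch,
# so a non-numeric column yields NULL.
summarise_column <- function(x) {
  if (is.numeric(x)) {
    data.frame(Median  = median(x, na.rm = TRUE),
               Average = mean(x, na.rm = TRUE),
               StDev   = sd(x, na.rm = TRUE))
  }
}

results <- lapply(channel, summarise_column)

# Drop the NULL entries left behind by the skipped columns.
results <- Filter(Negate(is.null), results)
names(results)
# [1] "june"     "july"     "december"
```

Filtering afterwards is one option. Another is to select the numeric columns first, for example with Filter(is.numeric, channel), and then call lapply on the result.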

Related

Function does not work with lubridate/mutate/across but works with a loop

I try to fix dates (years) using a function
change_century <- function(x){
a <- year(x)
ifelse(test = a >2020,yes = year(x) <- (year(x)-100),no = year(x) <- a)
return(x)
}
The function works for specific row or using a loop for one column (here date of birth)
for (i in c(1:nrow(Df))){
Df_recode$DOB[i] <- change_century(Df$DOB[i])
}
Then I try to use mutate/across
Df_recode <- Df %>% mutate(across(list_variable_date,~change_century(.)))
It does not work. Is there something I am getting wrong? thank you !
Try:
change_century <- function(x){
  a <- year(x)
  newx <- ifelse(test = a > 2020, yes = a - 100, no = a)
  return(newx)
}
(Frankly, the use of newx as a temporary storage and then returning it was done that way solely to introduce minimal changes in your code. In general, in this case one does not need return, in fact theoretically it adds an unnecessary function to the evaluation stack. I would tend to have two lines in that function: a <- year(x) and ifelse(..), without assignment. The default behavior in R is to return the value of the last expression, which in my case would be the results of ifelse, which is what we want. Assigning it to newx and then return(newx) or even just newx as the last expression has exactly the same effect.)
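Note that the function above returns year numbers rather than dates. If the column should remain a date, one possible sketch is shown here. It uses base R only so that it runs without lubridate, change_century_date is a hypothetical name, and it assumes the input is a Date vector:

```r
# Hypothetical variant that returns Dates instead of bare year numbers.
# Assumes `x` is a Date vector.
change_century_date <- function(x) {
  lt <- as.POSIXlt(x)
  yr <- lt$year + 1900                      # POSIXlt stores years since 1900
  new_yr <- ifelse(yr > 2020, yr - 100, yr) # capture ifelse's return value
  lt$year <- as.integer(new_yr - 1900)
  as.Date(lt)
}

dates <- as.Date(c("2045-03-01", "1999-12-31"))
fixed <- change_century_date(dates)
format(fixed)
# [1] "1945-03-01" "1999-12-31"
```

The key point is the same as in the answer: the result of ifelse is assigned once, outside the call, rather than assigning inside yes= or no=.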
Rationale
ifelse cannot have variable assignment within it. That's not to say that it is a syntax error (it is not), but that it is counter to its intent. You are asking the function to go through each condition found in test=, and return a value based on it. Regardless of the condition, both yes= and no= are evaluated completely, and then ifelse joins them together as needed.
For demonstration,
ifelse(test = c(TRUE, FALSE, TRUE), yes = 1:3, no = 11:13)
The return value is something like:
c(
if (test[1]) yes[1] else no[1],
if (test[2]) yes[2] else no[2],
if (test[3]) yes[3] else no[3]
)
# c(1, 12, 3)
To capture the results of the zipped-together yeses and nos c(1, 12, 3), one must capture the return value from ifelse itself, not inside of the call to ifelse.
Another point that may be relevant: ifelse(cond, yes, no) is not at all a shortcut for if (cond) { yes } else { no }. Some key differences:
In if, the cond must always be exactly length 1, no more, no less.
In R < 4.2, length 0 returns the error "argument is of length zero" (see ref), while length 2 or more produces the warning "the condition has length > 1 and only the first element will be used" (see ref1, ref2).
In R >= 4.2, both conditions (should) produce an error (no warnings).
ifelse is intended to be vectorized, so the cond can be any length. yes= and no= should either be the same length or length 1 (recycling is in effect here); cond= should really be the same length as the longer of yes= and no=.
if does short-circuiting, meaning that if (TRUE || stop("quux")) 1 will never attempt to evaluate stop. This can be very useful when one condition will fail (logically or with a literal error) if attempted on a NULL object, such as if (!is.null(quux) && quux > 5) ....
Conversely, ifelse always evaluates all three of cond=, yes=, and no=, and all values in each; there is no short-circuiting.
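The differences listed above can be seen in a short sketch (the variable names x, res, quux and ok are only for illustration):

```r
x <- c(4, -9, 16)

# ifelse is vectorised, but sqrt(x) is computed for every element,
# including the negative one, before the pieces are stitched together.
# That is why a "NaNs produced" warning appears unless it is suppressed.
res <- suppressWarnings(ifelse(x >= 0, sqrt(x), NA))
res
# [1]  2 NA  4

# if with && short-circuits: `quux > 5` is never evaluated when quux is NULL.
quux <- NULL
ok <- if (!is.null(quux) && quux > 5) "big" else "not big"
ok
# [1] "not big"
```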

Assigning objects to arguments in R functions - switches, if else statements or functions?

I am new to R and programming in general and am trying to write a very basic function where the input is 2 numbers and a selection from one of 3 operations. The output is supposed to be the result of a further calculation (divide the result of the input by 3*pi) and then a character string to confirm what operation was selected/performed. I want the default operation to be addition.
I've read up a little on the switch function and if... else type statements but not sure what is the most efficient way to achieve what I am trying to do and so far I haven't been able to get anything to work anyway. I seem to be getting a massive matrix as the output or an error to say I can't return multiple arguments in my current attempt. Can someone help with where I am going wrong? Thank you in advance.
basiccalc <- function(x, y, operation = addition){
addition <- x + y
subtraction <- x - y
multiplication <- x * y
calculation <- operation/(3*pi)
return(calculation, "operation")
}
switch would be useful
basiccalc <- function(x, y, operation = addition) {
operation <- deparse(substitute(operation))
op <- switch(operation,
addition = x + y,
subtraction = x - y,
multiplication = x * y)
return(op/(3 *pi))
}
-testing
> basiccalc(3, 5)
[1] 0.8488264
> 8/(3 * pi)
[1] 0.8488264
> basiccalc(3, 5, operation = subtraction)
[1] -0.2122066
> (3- 5)/(3 * pi)
[1] -0.2122066
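The question also asked for a character string confirming which operation was performed. One possible variation, sketched under the assumption that passing the operation as a string is acceptable, uses match.arg for the default and returns a list; basiccalc2 is a hypothetical name:

```r
# Hypothetical variant: `operation` is a string, the first choice is the default,
# and both the numeric result and the operation name are returned in a list.
basiccalc2 <- function(x, y,
                       operation = c("addition", "subtraction", "multiplication")) {
  operation <- match.arg(operation)
  op <- switch(operation,
               addition = x + y,
               subtraction = x - y,
               multiplication = x * y)
  list(result = op / (3 * pi), operation = operation)
}

basiccalc2(3, 5)$operation
# [1] "addition"
basiccalc2(3, 5, "subtraction")$operation
# [1] "subtraction"
```

A function in R returns a single object, so a list (or a named vector) is the usual way to hand back more than one value. match.arg also raises an error for an unknown operation name instead of silently returning NULL.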

Infinite loop using WHILE even though condition is met in R

I am trying to learn how to implement control structures such as FOR and while loops.
I created a function that simulates betting in a famous Brazilian lotto.
In the lotto, a player bets on 6 unique integers from a 1:60 vector (called your_bet).
The function samples 6 values from the 1 to 60 universe ("result") and tests how many values in result match your_bet, printing out:
your_bet
result
total points (out of 6 possible)
one of three possible commentaries on the result of the bet.
Code as follows:
```
LOTTO<-function(your_bet=sample(1:60, size=6, replace=FALSE)){
result<-sample(1:60, size=6, replace=FALSE)
logical_vector<-(your_bet %in% result)
total_points<-sum(as.integer(logical_vector))
print(paste(c("Your bet:", as.character(your_bet))), collapse="")
print(paste(c("Result", as.character(result))), collapse="")
print(paste(c("Total points", as.character(total_points))), collapse="")
if (total_points==6)
print("You are a millonaire")
else if (total_points==5)
print("5 points, you are rich!")
else print("good luck next time")
}
```
I then tried to implement a loop that would make the function run over and over again until total points >= a given target (here target_points), modifying the function as below.
```
LOTTO<-function(your_bet=sample(1:60, size=6, replace=FALSE), stubborn_until_x_points=FALSE,
target_points)#inserted stubborn_until_x_points and target_points arguments{
result<-sample(1:60, size=6, replace=FALSE)
logical_vector<-(your_bet %in% result)
total_points<-sum(as.integer(logical_vector))
print(paste(c("Your bet:", as.character(your_bet))), collapse="")
print(paste(c("Result", as.character(result))), collapse="")
print(paste(c("Total points", as.character(total_points))), collapse="")
if (total_points==6)
print("You are a millonaire")
else if (total_points==5)
print("5 points, you are rich!")
else print("good luck next time")
if (stubborn_until_x_points==TRUE)#Inserted WHILE loop here{
while(total_points < target_points){
LOTTO(your_bet, stubborn_until_x_points=TRUE, target_points)}
}
}
```
This did make the function repeat in a loop, but for some reason, it keeps looping even when the condition is met.
Observation - I realized that when the condition is met on the first run, it actually stops, but after entering a loop, it goes on forever.
I can't find what is wrong. Any ideas?
Thanks in advance.
I haven't dug very deep into the function, but the only reason the loop should never break is if target_points > n_samples (6 here).
That problem is simple to fix: reduce target_points to at most 6, or add n_samples as an argument (currently 6) and require it to be at least target_points. However, I suspect the main problem lies in the recursion. R is rather restrictive when it comes to recursion. If one tries a simple recursion, for example
i <- 0
f <- function(){
i <<- i + 1
f()
}
f()
i
# 896 on my pc
one can see that we cannot use recursion for very deep recursive functions (in R). This throws the very unhelpful error
Error: C stack usage 7974196 is too close to the limit
To alleviate this, one simply has to remove the recursion (simply in italics, because sometimes it is not simple). In this case we just move the while loop to contain the main body of the function, and use an if statement to break early if necessary.
Below is a slightly modified version of the function (note that sample_n and number_range has been added as arguments).
In this function the while loop has been moved to contain the main body, and the result is instead printed at the end (using the variable res to figure out the result). At the end of the loop I use an if(isFALSE(stubborn_until_x_points)) break statement to exit early if necessary.
LOTTO <- function(your_bet,
                  sample_n = 6,
                  number_range = 1:60,
                  stubborn_until_x_points = FALSE,
                  target_points){
  if(missing(target_points) || target_points > sample_n)
    stop('missing target_points or target_points too large')
  total_points <- -Inf # Always smaller than target_points
  i <- 0
  res <- 0
  # If you want a new bet every iteration.
  # Add this at the end of the loop, but remove the 'if'
  if(missing(your_bet))
    your_bet <- sample(number_range, size=sample_n, replace=FALSE)
  while(total_points < target_points){
    result <- sample(number_range, size=sample_n, replace=FALSE)
    logical_vector <- your_bet %in% result
    total_points <- sum(logical_vector)
    if (total_points==6){
      res <- 1
    }else if (total_points==5){
      res <- 2
    }
    i <- i + 1
    if(isFALSE(stubborn_until_x_points))
      break
  }
  if(res == 1)
    cat(res <- 'You\'re a millionaire!\n', sep = '\n')
  else if(res == 2)
    cat(res <- '5 points, you are rich!\n', sep = '\n')
  else
    cat(res <- 'Better luck next time.', sep = '\n')
  c(result = res, number_of_tries = i)
}
The function is called as before, but now also returns the number of attempts and the result obtained from trials as shown below.
LOTTO(target_points = 6, stubborn_until_x_points = TRUE)
You're a millionaire!
#Output:
result number_of_tries
"You're a millionaire!\n" "8297820"
Removing the recursion, moving the body of the function inside the while loop, assigning -Inf to the initial total_points and adding the break statement were all very useful.
Adapting the answer by @Oliver yielded exactly what I was looking for:
LOTTO<-function(your_bet=sample(1:60, size=6, replace=FALSE), stubborn_until_x_points=FALSE,
target_points=0){
total_points<--Inf
while(total_points < target_points){
result<-sample(1:60, size=6, replace=FALSE)
logical_vector<-(your_bet %in% result)
total_points<-sum(as.integer(logical_vector))
print(paste(c("Your bet:", as.character(your_bet))), collapse="")
print(paste(c("Result", as.character(result))), collapse="")
print(paste(c("Total points", as.character(total_points))), collapse="")
if (total_points==6)
print("You are a millonaire")
else if (total_points==5)
print("5 points, you are rich!")
else print("good luck next time")
if (isFALSE(stubborn_until_x_points==TRUE))
break
}
}
His answer, however, yields interesting additional results and gives better control over invalid argument inputs.
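For completeness, a likely reason the original recursive version never stopped is scoping: every function call has its own local variables, so the total_points assigned inside the nested LOTTO call is not the total_points tested by the outer while condition. A minimal sketch with a hypothetical function f shows the effect:

```r
# Hypothetical miniature of the recursive pattern: each call to f()
# has its own local `points`; the nested call cannot change the caller's copy.
f <- function(level = 1) {
  points <- level            # local to this particular call
  if (level == 1) {
    f(level = 2)             # the nested call sets *its own* points to 2
  }
  points                     # the caller's value is unchanged
}

f()
# [1] 1
```

In the original function the outer total_points therefore stays below target_points forever, which is why moving the sampling inside a single while loop fixes it.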

Can't store return value from which function in R

Hi everyone so I'm very new to R - installed RStudio last week.
I'm trying to store the coordinate of an element in the dataframe other_list below into the variable beta1, but I keep getting the error: "object 'beta1' not found".
for (val in 1:length(sampledata)) {
if (sampledata[val] %in% other_list)
beta1 = which(other_list == sampledata[val])
add_values = dataset1[c(beta1), c("name", "gender", "age")]
rbind(dataset2, add_values)
}
Anyone have an explanation for why beta1 doesn't get recognized in the dataset1 brackets?
Thank you!
Some suggestions.
First, you don't define what beta1 should be if the condition is not met, yet you try to use it anyway. This is bad in both the initial case ('beta1' not found) and in the next case where beta1 was defined on a previous loop. Your data will be wrong. I suggest you only use beta1 if relevant, try this:
for (val in 1:length(sampledata)) {
beta1 = which(other_list == sampledata[val])
if (length(beta1)) {
add_values = dataset1[c(beta1), c("name", "gender", "age")]
rbind(dataset2, add_values)
}
}
(This is still wrong.)
Second, you call rbind but ignore its return value. Most functions in R are functional in that they have no side-effect. rbind does not change any of the data, it returns the concatenated data. So you need to capture it:
for (val in 1:length(sampledata)) {
beta1 = which(other_list == sampledata[val])
if (length(beta1)) {
add_values = dataset1[c(beta1), c("name", "gender", "age")]
dataset2 = rbind(dataset2, add_values)
}
}
(This is still bad.)
Slightly-third ... this is not strictly broken but can be if you intend to "automate" this. The premise of 1:length(x) is that it should allow you to iterate over indices of a vector ... but if x is empty, then 1:length(x) reduces to 1:0 which does not return an empty vector:
1:length(c())
# [1] 1 0
which means that your loop will try to find both sampledata[1] (which does not exist) and sampledata[0] (which is also length 0, and will break many functions).
It is better to use either for (i in seq_len(length(x))) or for (i in seq_along(x)), both will do nothing if x is empty,
for (i in seq_along(10:11)) print(i)
# [1] 1
# [1] 2
for (i in seq_along(integer(0))) print(i)
#### nothing done
Fourth is how rbind works functionally: it makes a complete copy of the entire data.frame. This means that if you start with (say) 1000 rows and then add 2 more rows, then you have those 1000 rows in memory twice. Eventually the old copy is garbage-collected (cleared from memory), but that takes time. Now on the next iteration of the for loop, you want to add another 3 rows; this time it makes a complete copy of the now-current 1002 rows so that they now occur in memory twice. Do this enough times and your time to calculate is mostly spent in copying all of that data around. (This is precisely the problem discussed in Chapter 2: Growing objects in the R Inferno.)
The fix is generally to take all of the rows to be appended and only concatenate them once. I think this would change your code to something like:
list_of_frames = list()
for (val in seq_along(sampledata)) {
  beta1 = which(other_list == sampledata[val])
  if (length(beta1)) {
    add_values = dataset1[c(beta1), c("name", "gender", "age")]
    # wrap in list() so the whole frame is appended as one element,
    # not split into its columns
    list_of_frames = c(list_of_frames, list(add_values))
  }
}
dataset2 = do.call(rbind, list_of_frames)
If you're using other packages, this might be easier with one of these lines of code instead:
dataset2 = dplyr::bind_rows(list_of_frames)
dataset2 = data.table::rbindlist(list_of_frames)
This should be "good enough".
You could go one step further (to a Fifth point) to use an R-idiomatic lapply, though, something like:
list_of_frames = lapply(seq_along(sampledata), function(ind) {
beta1 = which(other_list == sampledata[ind])
if (length(beta1)) dataset1[beta1, c("name", "gender", "age")]
})
### or better yet
list_of_frames = lapply(sampledata, function(sampdat) {
beta1 = which(other_list == sampdat)
if (length(beta1)) dataset1[beta1, c("name", "gender", "age")]
})
### then
dataset2 = do.call(rbind, list_of_frames)
Yes, there will likely be elements of list_of_frames that are NULL, and the do.call(rbind, ...) part deals with that just fine.
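Here is a self-contained sketch of that last pattern. The data are made up for illustration, and it assumes other_list is a vector with one entry per row of dataset1 (the question does not show its structure):

```r
# Made-up data; `other_list` is assumed to line up with the rows of dataset1.
dataset1 <- data.frame(
  name   = c("Ann", "Bob", "Cy", "Di"),
  gender = c("F", "M", "M", "F"),
  age    = c(31, 45, 27, 52)
)
other_list <- c("a", "b", "c", "d")
sampledata <- c("b", "z", "d")   # "z" has no match, so its element is NULL

list_of_frames <- lapply(sampledata, function(sampdat) {
  beta1 <- which(other_list == sampdat)
  if (length(beta1)) dataset1[beta1, c("name", "gender", "age")]
})

# rbind ignores the NULL element and stacks the two matching rows.
dataset2 <- do.call(rbind, list_of_frames)
as.character(dataset2$name)
# [1] "Bob" "Di"
```

Under the same assumption, dataset1[other_list %in% sampledata, c("name", "gender", "age")] selects the same rows without a loop, although it returns them in the order of dataset1 rather than the order of sampledata.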

Function raise error with return statement

I want to apply a custom function to every cell using the calc function of the "raster" package.
Everything works perfectly when I print the "final" result of the function (the value I want to return), but when I use a return statement, I get an error:
Error in .local(x, values, ...) :
values must be numeric, integer or logical.
Here is the code leading to that error
inR <- 'D://test/TS_combined_clipped.tif'
outR <- 'D://test/R_test3.tif'
rasterB <- brick(inR)
fun1 <-function(x){
years = seq(1, 345)
na_idx = which(is.na(x))
years = years[-na_idx]
x <- na.omit(x)
idx = detectChangePoint(x, cpmType='Student', ARL0=500)$changePoint
return(years[idx]) # this raises error
# print(years[idx]) # This does *not* raises any error
}
r <- calc(rasterB, fun=fun1, filename=outR, overwrite=TRUE)
How can a return statement make it fail?
Some of my tests suggest that the process fails just after calc has run on the very last cell of the rasterBrick.
But I have no clue where to start to fix this.
Input image is available here
[EDIT]
I just noticed that if I use return(idx) instead of return(years[idx]), the process works without raising an error.
So it seems that the problem lies in fetching the value from the years variable.
Is there anything in particular that I missed about using indexes in R?
A comment from user2554330 put me on the right track: the issue was that calc cannot handle a "numeric(0)" result.
The updated code is:
inR <- 'D://test/TS_combined_clipped.tif'
outR <- 'D://test/R_test3.tif'
rasterB <- brick(inR)
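fun1 <-function(x){
  years = seq(1, 345)
  na_idx = which(is.na(x))
  # years[-integer(0)] would drop every element, so only subset when NAs exist
  if (length(na_idx) > 0) {
    years = years[-na_idx]
  }
  x <- na.omit(x)
  idx = detectChangePoint(x, cpmType='Student', ARL0=500)$changePoint
  if (idx==0){
    # years[0] evaluates to numeric(0), which calc cannot handle
    return(0)
  } else {
    return(as.integer(years[idx]))
  }
}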
r <- calc(rasterB, fun=fun1, filename=outR, overwrite=TRUE)
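The indexing behaviour behind this can be checked in plain R without the raster package. The sketch below uses a short made-up years vector to show the two zero-length cases worth guarding against:

```r
years <- 1:5

# Indexing with 0 selects nothing, so the result has length zero.
years[0]
# integer(0)

# which() on a vector without NAs returns integer(0) ...
na_idx <- which(is.na(c(1, 2, 3, 4, 5)))

# ... and negative indexing with integer(0) also selects nothing,
# rather than keeping everything.
dropped <- years[-na_idx]
length(dropped)
# [1] 0

# A logical mask keeps all elements when there are no NAs.
kept <- years[!is.na(c(1, 2, 3, 4, 5))]
length(kept)
# [1] 5
```

Either a length check on the which() result or a logical mask avoids silently emptying the vector when a cell has no missing values.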

Resources