How do I continue running the function when errors happen? [duplicate] - r

I have an example function below that reads in a date as a string and returns it as a date object. If it reads a string that it cannot convert to a date, it returns an error.
testFunction <- function (date_in) {
return(as.Date(date_in))
}
testFunction("2010-04-06") # this works fine
testFunction("foo") # this returns an error
Now, I want to use lapply and apply this function over a list of dates:
dates1 = c("2010-04-06", "2010-04-07", "2010-04-08")
lapply(dates1, testFunction) # this works fine
But if I want to apply the function over a list when one string in the middle of two good dates returns an error, what is the best way to deal with this?
dates2 = c("2010-04-06", "foo", "2010-04-08")
lapply(dates2, testFunction)
I presume that I want a try catch in there, but is there a way to catch the error for the "foo" string whilst asking lapply to continue and read the third date?

Use a tryCatch expression around the function that can throw the error message:
testFunction <- function (date_in) {
return(tryCatch(as.Date(date_in), error=function(e) NULL))
}
The nice thing about the tryCatch function is that you can decide what to do in the case of an error (in this case, return NULL).
> lapply(dates2, testFunction)
[[1]]
[1] "2010-04-06"
[[2]]
NULL
[[3]]
[1] "2010-04-08"
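If the failures need to be dropped or tracked afterwards, the NULL entries can be filtered out of the result. The following is only a minimal sketch, assuming the tryCatch version of testFunction shown above; the names parsed, ok, good_dates and failed_inputs are made up for illustration.

```r
# Same wrapper as above: return NULL instead of stopping on an error
testFunction <- function(date_in) {
  tryCatch(as.Date(date_in), error = function(e) NULL)
}

dates2 <- c("2010-04-06", "foo", "2010-04-08")
parsed <- lapply(dates2, testFunction)

# TRUE where parsing succeeded, FALSE where the handler returned NULL
ok <- !vapply(parsed, is.null, logical(1))

# Combine the successful results back into one Date vector
good_dates <- do.call(c, parsed[ok])

# Keep the inputs that could not be parsed, e.g. for logging
failed_inputs <- dates2[!ok]
```

Keeping the logical index ok around means the original inputs and the parsed results can still be matched up after filtering.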

One could try to keep it simple rather than to make it complicated:
Use the vectorised date parsing
R> as.Date( c("2010-04-06", "foo", "2010-04-08") )
[1] "2010-04-06" NA "2010-04-08"
You can trivially wrap na.omit() or whatever around it. Or find the index of NAs and extract accordingly from the initial vector, or use the complement of the NAs to find the parsed dates, or, or, or. It is all here already.
You can make your testFunction() do something. Use the test there -- if the returned (parsed) date is NA, do something.
Add a tryCatch() block or a try() to your date parsing.
The whole thing is a little odd, as you go from a one-type data structure (a vector of chars) to something else, but you can't easily mix types unless you keep them in a list type. So maybe you need to rethink this.
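The "find the index of NAs" idea could look roughly like the sketch below. It assumes the inputs are meant to be in %Y-%m-%d form (the format is passed explicitly here), and the names parsed and bad are illustrative only.

```r
dates2 <- c("2010-04-06", "foo", "2010-04-08")

# Vectorised parsing: unparseable entries become NA, no error is raised
parsed <- as.Date(dates2, format = "%Y-%m-%d")

# Logical index of the entries that failed to parse
bad <- is.na(parsed)

bad_inputs <- dates2[bad]    # the original strings that failed
good_dates <- parsed[!bad]   # the successfully parsed dates
```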

You can also accomplish this kind of task with the purrr helper functions map and possibly. For example
library(purrr)
map(dates2, possibly(testFunction, NA))
Here possibly will return NA (or whatever value you specified) if an error occurs.

Assuming the testFunction() is not trivial and/or that one cannot alter it, it can be wrapped in a function of your own, with a tryCatch() block. For example:
> FaultTolerantTestFunction <- function(date_in) {
+ # On error, the handler's value (NA) becomes the result of tryCatch()
+ tryCatch(testFunction(date_in), error = function(e) NA)
+ }
> FaultTolerantTestFunction('bozo')
[1] NA
> FaultTolerantTestFunction('2010-03-21')
[1] "2010-03-21"

Related

How do you solve "could not find function "deparse<-" | "as.name<-" | "eval<-"" errors when trying to dynamically name dataframes in R? [duplicate]

I am using R to parse a list of strings in the form:
original_string <- "variable_name=variable_value"
First, I extract the variable name and value from the original string and convert the value to numeric class.
parameter_value <- as.numeric("variable_value")
parameter_name <- "variable_name"
Then, I would like to assign the value to a variable with the same name as the parameter_name string.
variable_name <- parameter_value
What is/are the function(s) for doing this?
assign is what you are looking for.
assign("x", 5)
x
[1] 5
but buyer beware.
See R FAQ 7.21
http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-turn-a-string-into-a-variable_003f
You can use do.call:
do.call("<-",list(parameter_name, parameter_value))
There is another simple solution found there:
http://www.r-bloggers.com/converting-a-string-to-a-variable-name-on-the-fly-and-vice-versa-in-r/
To convert a string to a variable:
x <- 42
eval(parse(text = "x"))
[1] 42
And the opposite:
x <- 42
deparse(substitute(x))
[1] "x"
The function you are looking for is get():
assign ("abc",5)
get("abc")
Confirming that the memory address is identical:
getabc <- get("abc")
pryr::address(abc) == pryr::address(getabc)
# [1] TRUE
Reference: R FAQ 7.21 How can I turn a string into a variable?
Use x=as.name("string"). You can then use x to refer to the variable with name string.
I don't know if it answers your question correctly.
Use strsplit to parse your input and, as Greg mentioned, assign to assign the variables.
original_string <- c("x=123", "y=456")
pairs <- strsplit(original_string, "=")
lapply(pairs, function(x) assign(x[1], as.numeric(x[2]), envir = globalenv()))
ls()
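As an alternative to creating global variables (R FAQ 7.21, referenced above, discusses this), the parsed pairs can be kept in a named list. This is only a sketch under that assumption; the name values is hypothetical.

```r
original_string <- c("x=123", "y=456")

# Split each "name=value" string at the literal "="
pairs <- strsplit(original_string, "=", fixed = TRUE)

# Build a list of numeric values, named by the left-hand sides
values <- lapply(pairs, function(p) as.numeric(p[2]))
names(values) <- vapply(pairs, function(p) p[1], character(1))

# Look a value up by a name held in a string
var_name <- "x"
values[[var_name]]
```

Lookups and updates by string then work with ordinary list indexing, e.g. values[[var_name]] <- 6, without assign() or get().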
assign is good, but I have not found a function for referring back to the variable you've created in an automated script. (as.name seems to work the opposite way). More experienced coders will doubtless have a better solution, but this solution works and is slightly humorous perhaps, in that it gets R to write code for itself to execute.
Say I have just assigned value 5 to x (var.name <- "x"; assign(var.name, 5)) and I want to change the value to 6. If I am writing a script and don't know in advance what the variable name (var.name) will be (which seems to be the point of the assign function), I can't simply put x <- 6 because var.name might have been "y". So I do:
var.name <- "x"
#some other code...
assign(var.name, 5)
#some more code...
#write a script file (1 line in this case) that works with whatever variable name
write(paste0(var.name, " <- 6"), "tmp.R")
#source that script file
source("tmp.R")
#remove the script file for tidiness
file.remove("tmp.R")
x will be changed to 6, and if the variable name was anything other than "x", that variable will similarly have been changed to 6.
I was working with this a few days ago, and noticed that sometimes you will need to use the get() function to print the results of your variable.
i.e.:
varnames = c('jan', 'feb', 'march')
file_names = list.files('path to multiple csv files saved on drive', full.names = TRUE)
assign(varnames[1], read.csv(file_names[1])) # This will assign the variable
From there, if you try to print the variable varnames[1], it returns 'jan'.
To work around this, you need to do
print(get(varnames[1]))
If you want to convert a string to a variable inside the body of a function, but want the variable to be global:
test <- function() {
do.call("<<-",list("vartest","xxx"))
}
test()
vartest
[1] "xxx"
Maybe I didn't understand your problem right, because of the simplicity of your example. To my understanding, you have a series of instructions stored in character vectors, and those instructions are very close to being properly formatted, except that you'd like to cast the right member to numeric.
If my understanding is right, I would like to propose a slightly different approach, that does not rely on splitting your original string, but directly evaluates your instruction (with a little improvement).
original_string <- "variable_name=\"10\"" # Your original instruction, but with an actual numeric on the right, stored as character.
library(magrittr) # Or library(tidyverse), but that seems like overkill if the point is just to import the pipe operator
eval(parse(text=paste(eval(original_string), "%>% as.numeric")))
print(variable_name)
#[1] 10
Basically, what we are doing is that we 'improve' your instruction variable_name="10" so that it becomes variable_name="10" %>% as.numeric, which is an equivalent of variable_name=as.numeric("10") with magrittr pipe-stream syntax. Then we evaluate this expression within current environment.
Hope that helps someone who'd wander around here 8 years later ;-)
Besides assign, another way to assign a value to an object named by a string is to access .GlobalEnv directly.
# Equivalent
assign('abc',3)
.GlobalEnv$'abc' = 3
Accessing .GlobalEnv gives some flexibility, and my use case was assigning values to a string-named list. For example,
.GlobalEnv$'x' = list()
.GlobalEnv$'x'[[2]] = 5 # works
var = 'x'
.GlobalEnv[[glue::glue('{var}')]][[2]] = 5 # programmatic names from glue()

Warning message: By Converting Dates to Numeric Values

date_to_numeric<- function(x)#function for construction of date
{
strptime(x,format = "%Y-%m-%d")->t
if(is.na(t)==TRUE)
strptime(x,format = "%Y%m%d")->t
as.numeric(format(t, "%Y"))->t1
as.numeric(format(t, "%m"))->t2
as.numeric(format(t, "%d"))->t3
d<-c(0,0.08493150685,0.1616438356,0.2465753425,0.3287671233,0.4136986301,0.495890411,0.5808219178,0.6657534247,0.7479452055,0.8328767123,0.9150684932)
d[t2]->t2
t3<-t3/365
result<-t1+t2+t3
return(result)
}
time(d)->t
t<-date_to_numeric(t)
Warning message:
In if (is.na(t) == TRUE) t <- strptime(x, format = "%yyyy%mm%dd") :
the condition has length > 1 and only the first element will be used
Can someone please explain to me why I get this error message? I used the same code in January last year and it worked fine! Any help is highly appreciated!
As @Sotos mentioned, the reason you are receiving this warning message is that your function uses an if statement, but the object t is likely a vector of dates. Since if is not vectorized, your function will only check whether the first element of t is missing (in if (is.na(t))), and that is exactly what the warning says. Note that your code will still run; however, it probably won't return what you are expecting.
The simplest way to fix this without editing your function is using sapply(). You can do something like this:
t <- time(d)
t2 <- sapply(t, FUN = date_to_numeric)
You can also edit your date_to_numeric function to allow for proper vectorized calculations, which I would recommend for the long run.
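A vectorised version might look like the sketch below. It keeps the arithmetic and the month-offset table from the question and only replaces the scalar if with a logical index; the name date_to_numeric_vec is hypothetical. Note also that since R 4.2.0 a condition of length greater than one in if() is an error rather than a warning, so the original function will stop on a vector input in current versions of R.

```r
date_to_numeric_vec <- function(x) {
  x <- as.character(x)

  # Try the dashed format first; entries that do not match become NA
  t <- as.Date(x, format = "%Y-%m-%d")

  # Re-parse only the failed entries with the compact format
  missing <- is.na(t)
  t[missing] <- as.Date(x[missing], format = "%Y%m%d")

  year  <- as.numeric(format(t, "%Y"))
  month <- as.numeric(format(t, "%m"))
  day   <- as.numeric(format(t, "%d"))

  # Month offsets as fractions of a year, copied from the question
  d <- c(0, 0.08493150685, 0.1616438356, 0.2465753425, 0.3287671233,
         0.4136986301, 0.495890411, 0.5808219178, 0.6657534247,
         0.7479452055, 0.8328767123, 0.9150684932)

  # Entries that match neither format stay NA
  year + d[month] + day / 365
}

res <- date_to_numeric_vec(c("2010-01-01", "20100101", "foo"))
```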

Preserving data structure when returning values from function in R

I currently have a basic script written in R, which has two functions embedded within another:
FunctionA <- Function() {
results_from_B <- FunctionB()
results_from_C <- FunctionC()
}
Function B generates some data which is then analysed in Function C.
If I stop the code within function A, I can see the structure of results_from_C - this appears under 'values' and I can refer to different elements using the syntax results_from_C$column_name1.
I achieved this within Function C by specifying the returned values using:
return(list(column_name_1 = value1, column_name_2 = value2))
However, I cannot work out how I can return these same values (in the same structure) from Function A - everything I try returns a list which is formatted as 'Data' rather than 'Values' and cannot be indexed using the syntax results_from_A$column_name1.
Can anyone help me to understand what I need to do in order to extract results from Function C outside of Function A?
Thanks in advance
I don't understand what you mean by formatted as 'Data' rather than 'Values'.
There's nothing wrong with the setup you describe; I use functions inside functions every now and then, and it's perfectly OK.
(Note that R is case sensitive, it's function not Function.)
FunctionA <- function() {
FunctionB <- function() 1:2*pi
FunctionC <- function(x)
list(column_name_1 = x[1], column_name_2 = x[2])
results_from_B <- FunctionB()
results_from_C <- FunctionC(results_from_B)
results_from_C
}
result <- FunctionA()
result
$column_name_1
[1] 3.141593
$column_name_2
[1] 6.283185
result$column_name_1
[1] 3.141593
Is this it? If not, please clarify your question.

If my function doesn't work on every object, how do I skip those objects?

I am trying to write a function and apply it to a list. Inside my function is a function written by someone else. If I make my list very easy, everything works fine. But if I use all the real data I have, there are some bad objects, the outside function fails on them, and my whole function won't go through.
What do I type to say "If the outside function doesn't work, skip that object and move to the next one in the list."? With or without NA, doesn't matter.
I cannot figure out how to write a reproducible example that would result in a list of dataframes, which is what happens inside this function. I'm willing to take any help to improve this question.
My function is something like this:
do_this<- function(x){
outside_function(x)%>% #this returns a dataframe for each object
filter()%>%
select()%>%
summarise_each(funs(mean(., na.rm = TRUE))) #by the end the df is down to one row
}
This is how I apply the function to the list to come up with my final dataframe.
df<-bind_rows(lapply(my_list, do_this))
An example:
myfun <- function(x) {if (x == 1) {stop("bad")} else x}
throws error on input of 1:
lapply(1:4, myfun) # stops from error
Just wrap it in try (as long as you don't need more complex error handling):
L <- lapply(1:4, function(x) try(myfun(x)))
And then you can use Filter to get rid of the "bad" cases:
Filter(function(x) !inherits(x, "try-error"), L)
Although you may want to just make your wrapper function more robust, or return NULL (or some other appropriate value) under the condition that makes the inner function fail.
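The "return NULL" variant mentioned above could be sketched with tryCatch as follows. The wrapper name safe_myfun is made up for illustration, and myfun stands in for whatever function fails on the bad objects.

```r
myfun <- function(x) {
  if (x == 1) stop("bad") else x
}

# Wrapper that returns NULL for any input on which myfun() fails
safe_myfun <- function(x) {
  tryCatch(myfun(x), error = function(e) NULL)
}

L <- lapply(1:4, safe_myfun)

# Drop the NULL placeholders left by the failed inputs
L <- Filter(Negate(is.null), L)
```

Compared with try(), this leaves no "try-error" objects in the list and prints no error messages while the loop runs.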

R Function - Print When Not Assigned

I see there is another related question, but the answer isn't what I am looking for. I want a function that can be assigned to an object but will still print the output even when assigned, without double printing it.
In this case:
fun <- function(x) {
print(x+1)
x+1
}
a <- fun(3)
In this case, it would both save to a, and it would print to console, which is what I want.
But in this case:
fun(3)
It would print to the console twice. Is there a way to get the desired result from case 1, without double printing on case 2?
Assuming that you still want your function to return the 'x+1' value, you could just wrap it in the invisible function:
fun <- function(x) {
print(x+1)
invisible(x+1)
}
> fun(3)
[1] 4
> a = fun(3)
[1] 4
> a
[1] 4
This will only print it out once, while still retaining the 'x+1' value.
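One way to check that the return value really is invisible is base R's withVisible(), which reports both the value and its visibility flag. A small sketch using the fun defined above:

```r
fun <- function(x) {
  print(x + 1)
  invisible(x + 1)
}

# withVisible() evaluates the call and records whether the result
# would have been auto-printed at the top level
res <- withVisible(fun(3))
res$value    # the returned value
res$visible  # FALSE when the function used invisible()
```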
You can accomplish this same behavior with any function in R that visibly returns an object by wrapping it in parentheses.
fun <- function(x) {
x+1
}
> (fun(3))
[1] 4
> (a = fun(3))
[1] 4
> a
[1] 4
Or, equivalently, you may simply call print on your assignment.
> print(fun(3))
[1] 4
I'm not sure rolling this functionality into a function has any benefits over using the existing print method outside of a function.
R functions shouldn't have unasked for "side effects". This isn't a rule, but a strong recommendation. As evidence for it being good practice, there are plenty of questions on SO like this one where a poorly behaved function prints output using print or cat that the end user struggles to disable.
Based on this, I would strongly encourage you to either use message() rather than print() or to add an argument that can disable the printing. message() is the "right" way to print to the console during execution, but it won't format the result nicely depending on the data structure.
Thus, if your function expects simple outputs then I would recommend doing it like this:
fun <- function(x) {
result = x + 1
message(result)
invisible(result)
}
If it might have more complicated output, you could try something like this (demoing on mtcars):
fun <- function(x) {
result = head(mtcars)
sapply(capture.output(print(result)), message)
invisible(result)
}
Messages can easily be suppressed by wrapping the call in suppressMessages(), and message = F is a handy argument for knitr code chunks to say "ignore the messages".
The other option is to add an argument
fun <- function(x, quietly = FALSE) {
result = x + 1
if (!quietly) print(result)
invisible(result)
}
And I'd also think long and hard about whether this strange behavior is really necessary. Usually, having functions that behave as expected is better than having special cases that throw expectations.
