R language: Unexpected behaviour with function arguments in lapply

I am attempting to create a list of matrices containing iid Normal numbers. For the sake of a simple example, let the matrices be 4 by 2 and consider a list of length 3. The following code seemed like it should work (to me):
MyMatrix <- lapply(1:3, function() {matrix(rnorm(8), 4, 2)})
But it failed, with the following error:
Error in FUN(1:3[[1L]], ...) : unused argument (1:3[[1]])
On a whim, I tried:
MyMatrix <- lapply(1:3, function(x) {matrix(rnorm(8), 4, 2)})
And it worked! But why? x is not used anywhere in the function, and on experimentation, the behaviour of the expression is not affected by whether x already exists in the workspace or not. It appears to be entirely superfluous.
I am new to R, so I would be very grateful if an experienced user could explain what is going on here and why my first line fails.

You can't define a function that takes no arguments and then pass it arguments, which is exactly what happens when you run lapply: each value is passed in turn as the first argument to the function. E.g.
out <- lapply(1:3, function(x) x)
str(out)
#List of 3
# $ : int 1
# $ : int 2
# $ : int 3
Simple example throwing an error:
test <- function() {"woot"}
test()
#[1] "woot"
test(1)
#Error in test(1) : unused argument (1)
lapply(1:3, test)
#Error in FUN(1:3[[1L]], ...) : unused argument (1:3[[1]])
It is a good thing that R raises an error here: passing an argument suggests you expect the function's result to depend on it, and it wouldn't. Base R includes functions like this, such as Sys.time(), which fails if you pass it arguments, even ones that might otherwise seem sensible:
Sys.time()
#[1] "2014-07-07 13:22:11 EST"
Sys.time(tz="UTC")
#Error in Sys.time(tz = "UTC") : unused argument (tz = "UTC")
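For the original task, a minimal sketch of the options this leaves: the function handed to lapply can accept its argument and ignore it (as a named x or as ...), or, when the index is never needed, replicate() with simplify = FALSE repeats an expression and returns a list. The names by_lapply, by_dots and by_replicate are illustrative only.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# lapply passes each element of 1:3 as the first argument, so the function
# must accept one argument even if it never uses it.
by_lapply <- lapply(1:3, function(x) matrix(rnorm(8), 4, 2))

# `...` absorbs the unused argument just as well.
by_dots <- lapply(1:3, function(...) matrix(rnorm(8), 4, 2))

# replicate() re-evaluates the expression n times; simplify = FALSE keeps
# the results as a list instead of collapsing them into an array.
by_replicate <- replicate(3, matrix(rnorm(8), 4, 2), simplify = FALSE)
```

All three produce a list of three 4 by 2 matrices; replicate() arguably states the intent most directly because no dummy argument is involved.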

Related

methods for an object with length(class)>1

This question suggests using methods(class=class(obj)) to extract the list of methods that are available for an object.
If used on an object where length(class(obj))>1, this leads to lots of warnings, e.g.
set.seed(101)
a <- matrix(rnorm(20), nrow = 10)
b <- a + rnorm(length(a))
obj <- lm(b ~ a)
gives class(obj) as c("mlm","lm"); methods(class=class(obj)) gives
(many times)
Warning in grep(pattern, all.names, value = TRUE) :
argument 'pattern' has length > 1 and only the first element will be used
Warning in gsub(name, "", S3reg) :
argument 'pattern' has length > 1 and only the first element will be used
Warning in sub(paste0("\\.", class, "$"), "", row.names(info)) :
argument 'pattern' has length > 1 and only the first element will be used
followed by the results.
It seems (?) that applying methods(class=...) to the last element of class(obj) would work, but I'm interested in alternatives or discussion as to why that is (or is not) correct ...
To clarify, I would like the returned value to be a (preferably unique) character vector, so that I can use something like if ("foo" %in% allmethods(obj)) to test for the availability of a specified method for the object ...
Is this what you're looking for?
sapply(class(obj), function(x) methods(class = x))
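Note that the code below gives an error, because each class name is matched positionally to the first argument of methods(), generic.function, so methods() looks for a generic called mlm. The value must be passed as the class argument, as above.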
sapply(class(obj), methods)
Error in .S3methods(generic.function, class, parent.frame()) :
no function 'mlm' is visible
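To get the single, duplicate-free character vector the question asks for, the per-class results can be flattened. The following is only a sketch: all_methods is a hypothetical helper name, and it assumes that stripping the attributes of the value returned by methods() with as.character() leaves the method names, which is my understanding of recent R versions.

```r
# Hypothetical helper: collect the methods registered for every class of `obj`.
all_methods <- function(obj) {
  per_class <- lapply(class(obj), function(cl) {
    # One class at a time avoids the "length > 1" warnings.
    as.character(methods(class = cl))
  })
  unique(unlist(per_class))
}

set.seed(101)
a <- matrix(rnorm(20), nrow = 10)
b <- a + rnorm(length(a))
obj <- lm(b ~ a)

found <- all_methods(obj)
```

A test such as "summary.lm" %in% found should then be possible, on the assumption that entries take the form generic.class; it is worth checking the exact form of the names in your R version before relying on it.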

do.call() doesn't like base function "c" with a list

I have a larger section of code, but I've narrowed the problem down to this: I want to return a concatenated list.
do.call(c,"X")
Error in do.call(c, "X") : second argument must be a list
So above it complains about the SECOND argument not being a list.
asimplelist=list(2,3,4)
class(asimplelist)
[1] "list"
do.call(c,asimplelist)
Error in do.call(c, asimplelist) :
'what' must be a function or character string
Why will this not return a concatenated list? c is a legitimate function, and it's being passed a list.
args(do.call)
function (what, args, quote = FALSE, envir = parent.frame())
NULL
So "what" is the function argument it is complaining about.
I will answer "stealing" my answer from this comment by Nick Kennedy:
It might be better to put the c in double quotes.
If the user has a non-function named c in the global environment, do.call(c, dates) will fail with the error "Error in do.call(c, list(1:3)) : 'what' must be a character string or a function".
Clearly it may not be best practice to define c, but it's quite common for people to do a <- 1; b <- 2; c <- 3.
For most purposes, R still works fine in this scenario; c(1, 2) will still work, but do.call(c, x) won't.
Of course if the user has redefined c to be a function (e.g. c <- sum), then do.call will use the redefined function.
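A small sketch that reproduces the situation described above. The variable names (x, res_quoted, res_base, unquoted_fails) are illustrative; base::c is an alternative to quoting that I am adding here, not something from the comment.

```r
x <- list(2, 3, 4)

c <- 3  # a non-function named c, as in a <- 1; b <- 2; c <- 3

# Passing the bare name hands do.call the number 3, not the function.
unquoted_fails <- inherits(try(do.call(c, x), silent = TRUE), "try-error")

# A character string makes do.call look up a *function* of that name.
res_quoted <- do.call("c", x)

# Naming the namespace explicitly also sidesteps the masking variable.
res_base <- do.call(base::c, x)

rm(c)  # remove the masking variable again
```

Both res_quoted and res_base hold the concatenated values 2, 3, 4, while the unquoted call fails for as long as the numeric c exists.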

R: pass a vector of symbols into a function instead of a long argument list

I'm writing an R package that calls C code with the .C() function. I need to pass over 50 arguments into .C(), which seriously bloats my code and is prone to error. Rather than type
output <- .C("my_dynlib", arg2, arg3, arg4, arg5, arg6, ..., arg53)
I would rather have a character vector myNames of additional arguments to send to .C() and do something like
f <- Vectorize(as.symbol, "x")
mySymbols <- f(myNames)
output <- .C("my_dynlib", mySymbols)
But this isn't quite what I want because mySymbols is a list. Is there a way to compactly pass a collection of arguments to a function that I don't want to rewrite?
Footnote: I'm guessing that some of you will suggest passing more complicated arguments with .Call, .External, or Rcpp, but that's more overhaul than I want to deal with right now.
Edit: what if I want myNames to point to slots in an s3 or s4 object?
Suppose I want to use symbols that refer to class slots in a data frame.
> y = data.frame(a = 1:4, b = 5:8)
> myNames = paste("y@", slotNames(y), sep = "")
> mySymbols = lapply(myNames, as.symbol)
> str(mySymbols)
List of 4
$ : symbol y@.Data
$ : symbol y@names
$ : symbol y@row.names
$ : symbol y@.S3Class
I use do.call like @josilber said, but do.call doesn't recognize the symbols
> do.call(print, mySymbols)
Error in (function (x, ...) : object 'y@.Data' not found
even though I can use those symbols manually.
> y@.Data
[[1]]
[1] 1 2 3 4
[[2]]
[1] 5 6 7 8
If you have a list of objects that you want to pass to a function, you can do so with the do.call function. In your case, it sounds like the following should work:
do.call(".C", c("my_dynlib", mySymbols))
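Since .C() cannot be run without a compiled library, here is a sketch of the same pattern with an ordinary function standing in for it. The variables arg2 to arg4 and the use of sum() are placeholders. Instead of symbols, it uses mget() to look up the values by name, which is one way to build the argument list.

```r
# Placeholder variables standing in for the real arguments.
arg2 <- 1
arg3 <- 2
arg4 <- 3
myNames <- c("arg2", "arg3", "arg4")

# mget() looks up each name and returns the values as a named list;
# unname() drops the names so the arguments are passed positionally.
arg_values <- unname(mget(myNames, envir = environment()))

# do.call() spreads the list out as individual arguments: sum(1, 2, 3).
total <- do.call(sum, arg_values)

# Extra leading arguments can be prepended with c(), as in the answer above.
pasted <- do.call(paste, c(list("values:"), arg_values))
```

Note that a symbol such as `y@.Data` names a variable literally called "y@.Data", which does not exist. That is why the lookup in the edited question fails: the slot has to be extracted first and its value put in the list.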

Concerning R, when defining a Replacement Function, do the arguments have to be named as/like "x" and "value"?

By "replacement functions" I mean those mentioned in this thread What are Replacement Functions in R?, ones that look like 'length<-'(x, value). When I was working with such functions I encountered something weird. It seems that a replacement function only works when variables are named according to a certain rule.
Here is my code:
a <- c(1,2,3)
I will try to change the first element of a, using one of the 3 replacement functions below.
'first0<-' <- function(x, value){
x[1] <- value
x
}
first0(a) <- 5
a
# returns [1] 5 2 3.
The first one works pretty well... but then when I change the name of arguments in the definition,
'first1<-' <- function(somex, somevalue){
somex[1] <- somevalue
somex
}
first1(a) <- 9
# Error in `first1<-`(`*tmp*`, value = 9) : unused argument (value = 9)
a
# returns [1] 5 2 3
It fails to work, though the following code is OK:
a <- 'first1<-'(a, 9)
a
# returns [1] 9 2 3
Some other names work well, too, if they are similar to x and value, it seems:
'first2<-' <- function(x11, value11){
x11[1] <- value11
x11
}
first2(a) <- 33
a
# returns [1] 33 2 3
This doesn't make sense to me. Do the names of variables actually matter or did I make some mistakes?
There are two things going on here. First, the only real rule of replacement functions is that the new value will be passed as a parameter named value and it will be the last parameter. That's why when you specify the signature function(somex, somevalue), you get the error unused argument (value = 9) and the assignment doesn't work.
Secondly, things work with the signature function(x11, value11) thanks to partial matching of parameter names in R. Consider this example
f<-function(a, value1234=5) {
print(value1234)
}
f(value=5)
# [1] 5
Note that 5 is printed, because the argument name value partially matches the parameter value1234. This behavior is described under argument matching in the R language definition.
Another way to see what's going on is to print the call signature of what's actually being called.
'first0<-' <- function(x, value){
print(sys.call())
x[1] <- value
x
}
a <- c(1,2,3)
first0(a) <- 5
# `first0<-`(`*tmp*`, value = 5)
So the first parameter is actually passed as an unnamed positional parameter, and the new value is passed as the named parameter value=. This is the only parameter name that matters.
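To make both points concrete, here is a short sketch. The function names g and `nth<-` are made up for illustration. The first part uses a supplied value that differs from the default, so the partial match is visible. The second shows a replacement function with an extra argument between the object and value.

```r
# Partial matching: `value` is a unique prefix of `value1234`.
g <- function(a, value1234 = 5) value1234
matched <- g(value = 9)  # 9 is supplied, so the default of 5 is not used

# A replacement function may take extra arguments, but the new value
# must arrive through a final parameter named `value`.
`nth<-` <- function(x, n, value) {
  x[n] <- value
  x
}

a <- c(1, 2, 3)
nth(a, 2) <- 10  # evaluated as a <- `nth<-`(a, 2, value = 10)
```

After this, matched is 9 and a is 1 10 3. The first parameter can have any name because it is passed positionally.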

Debugging lapply/sapply calls

Code written using lapply and friends is usually easier on the eyes and more Rish than loops. I love lapply just as much as the next guy, but how do I debug it when things go wrong? For example:
> ## a list composed of numeric elements
> x <- as.list(-2:2)
> ## turn one of the elements into characters
> x[[2]] <- "what?!?"
>
> ## using sapply
> sapply(x, function(x) 1/x)
Error in 1/x : non-numeric argument to binary operator
Had I used a for loop:
> y <- rep(NA, length(x))
> for (i in 1:length(x)) {
+ y[i] <- 1/x[[i]]
+ }
Error in 1/x[[i]] : non-numeric argument to binary operator
But I would know where the error happened:
> i
[1] 2
What should I do when using lapply/sapply?
Use the standard R debugging techniques to stop exactly when the error occurs:
options(error = browser)
or
options(error = recover)
When done, revert to standard behaviour:
options(error = NULL)
If you wrap your inner function with a try() statement, you get more information:
> sapply(x, function(x) try(1/x))
Error in 1/x : non-numeric argument to binary operator
[1] "-0.5"
[2] "Error in 1/x : non-numeric argument to binary operator\n"
[3] "Inf"
[4] "1"
[5] "0.5"
In this case, you can see which index fails.
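A variation on the same idea, sketched under the assumption that recording the failing position is enough: iterate over seq_along(x) so the index is available inside the function, and let tryCatch() report it. Returning NA_real_ for failures is a choice made here for illustration.

```r
x <- as.list(-2:2)
x[[2]] <- "what?!?"

# Iterating over indices rather than elements keeps `i` in scope for reporting.
results <- lapply(seq_along(x), function(i) {
  tryCatch(
    1 / x[[i]],
    error = function(e) {
      message("Failed at index ", i, ": ", conditionMessage(e))
      NA_real_  # placeholder so the remaining elements are still computed
    }
  )
})

# Positions whose result is the NA placeholder.
failed <- which(vapply(results, function(r) is.na(r), logical(1)))
```

Here failed is 2, matching what the for loop reveals through i, and the other four results are still computed.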
Use the plyr package, with .inform = TRUE:
library(plyr)
laply(x, function(x) 1/x, .inform = TRUE)
Like geoffjentry said:
> sapply(x, function(x) {
res <- tryCatch(1 / x,
error=function(e) {
cat("Failed on x = ", x, "\n", sep="") ## browser()
stop(e)
})
})
Also, your for loop could be rewritten to be much cleaner (possibly a little slower):
> y <- NULL
> for (xi in x)
y <- c(y, 1 / xi)
Error in 1/xi : non-numeric argument to binary operator
For loops are slow in R, but unless you really need the speed I'd go with a simple iterative approach over a confusing list comprehension.
If I need to figure out some code on the fly, I'll always go:
sapply(x, function(x) {
browser()
...
})
And write the code from inside the function so I see what I'm getting.
-- Dan
Using debug or browser isn't a good idea in this case, because it will stop your code so frequently. Use try or tryCatch instead, and deal with the situation when it arises.
You can debug() the function, or put a browser() inside the body. This is only particularly useful if you don't have a gajillion iterations to work through.
Also, I've not personally done this, but I suspect you could put a browser() in as part of a tryCatch(), such that when the error is generated you can use the browser() interface.
I've faced the same problem and have tended to make my calls with (l)(m)(s)(t)apply to be functions that I can debug().
So, instead of blah<-sapply(x,function(x){ x+1 })
I'd say,
myfn<-function(x){x+1}
blah<-sapply(x,function(x){myfn(x)})
and use debug(myfn) with options(error=recover).
I also like the advice about sticking print() lines here and there to see what is happening.
Even better is to design a test that myfn(x) has to pass, and to be sure it passes that test before subjecting it to sapply. I only have the patience to do this about half the time.
