Why does do.call not match a direct calculation in R?

In the process of doing something more complicated, I've found the following:
If I convert a numeric vector to a list and pass it to do.call, I get a different value than when I call the function directly, and I'm not sure why.
x <- rnorm(30)
median(x) # -0.01192347
# does not match:
do.call("median",as.list(x)) # -1.912244
Why?
Note: I'm trying to run various functions using a vector of function names. This works with do.call, but only if I get the correct output from do.call.
Thanks for any suggestions.

do.call expects its args argument to be a list of arguments, one list element per argument, so technically we'd want to pass list(x = x):
> set.seed(123)
> x <- rnorm(10)
> median(x)
[1] -0.07983455
> do.call(median,list(x = x))
[1] -0.07983455
> do.call(median,as.list(x))
[1] -0.5604756
Calling as.list on the vector x turns it into a list of length 10, so do.call calls median as though you had passed it 10 separate arguments. But really, we want to pass just one argument, x. In the end, only the first element of the vector is matched to the argument x (the remaining elements are matched to na.rm and ...), so the result is just the first element.
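A minimal sketch of the difference, extended to the vector-of-function-names use case mentioned in the question. The names first_only, fun_names, and results are illustrative and not from the original post:

```r
set.seed(123)
x <- rnorm(10)

# as.list(x) spreads the 10 values across median's parameters,
# so only the first value is matched to the argument x
first_only <- do.call(median, as.list(x))
isTRUE(all.equal(first_only, x[1]))

# list(x) passes the whole vector as a single argument
fun_names <- c("median", "mean", "sd")  # hypothetical vector of function names
results <- sapply(fun_names, function(f) do.call(f, list(x)))
results
```

Wrapping the vector in list() rather than as.list() is what keeps it together as one argument, whichever function name is used.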

Related

What does the small x mean in lapply?

I have the variables:
trims<- c(0,0.1,0.2,0.5)
x<-rcauchy(100)
and the following operation:
lapply(trims, mean, x=x)
What does the small x refer to in this case? The documentation for lapply does not explain it well either. I do know that lapply takes a function and applies it to each element of a list, which I believe is trims in this case. How does x come in then?
If we use an anonymous function, it becomes clear.
res <- lapply(trims, function(y) mean(x, trim=y))
res1 <- lapply(trims, mean, x=x)
identical(res, res1)
#[1] TRUE
lapply loops through each element of 'trims'. mean takes x as its first argument and trim as its second. Since the first argument is already supplied by name with x=x (i.e. the object created with rcauchy), each value of 'trims' is naturally matched to the second argument, trim.
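As a small check of that argument matching, the sketch below compares the implicit and explicit forms. The seed and the names implicit and explicit are assumptions added for reproducibility and illustration:

```r
set.seed(1)                      # assumed seed, only for reproducibility
x <- rcauchy(100)
trims <- c(0, 0.1, 0.2, 0.5)

# x is supplied by name, so each element of trims fills the next
# unmatched argument of mean, which is trim
implicit <- sapply(trims, mean, x = x)

# the same calls with every argument spelled out
explicit <- sapply(trims, function(tr) mean(x = x, trim = tr))

all.equal(implicit, explicit)

# with trim = 0.5 the trimmed mean is the median
all.equal(implicit[4], median(x))
```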

Iterating over the argument of a function (grep) passed to lapply

I currently am doing an operation similar to below:
v<-c("my","pig","is","big","with","a","name")
s<-c("m","g")
for(i in c(1:length(s))){
print(grep(v,pattern=s[i]))
}
Which prints
[1] 1 7
[1] 2 4
I would like to instead vectorize this operation where the return values are stored in a vector. I tried
mynewvector<-lapply(v,grep,pattern=s,x=v)
but the problem is that I don't know how to get lapply to iterate over the elements passed as arguments (e.g. iterating over s). I saw this answer, but I don't think mapply works here because I am trying to hold one argument constant (x=v) and iterate over the other argument (pattern=s).
How would I do this?
Following up on d.b.'s response, the clearest solution is
lapply(s, function(a) grep(pattern = a, x = v))
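A short sketch of how the results could be stored and labelled by pattern. It also shows that mapply can hold one argument constant through its MoreArgs parameter, which may address the concern in the question. The names hits and hits2 are illustrative:

```r
v <- c("my", "pig", "is", "big", "with", "a", "name")
s <- c("m", "g")

# iterate over the patterns while x stays fixed;
# naming s first makes the result a list named by pattern
hits <- lapply(setNames(s, s), function(p) grep(pattern = p, x = v))

# mapply varies pattern and keeps x constant via MoreArgs
hits2 <- mapply(grep, pattern = s, MoreArgs = list(x = v), SIMPLIFY = FALSE)

hits$m  # 1 7, the same indices the loop printed
hits$g  # 2 4
```

The result is kept as a list rather than a vector because each pattern may match a different number of elements.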

From a list of objects, get a character vector of their names

I have a function that takes as an argument a list of functions.
library(moments)
library(plyr)
tests <- list(mean, varience, skewness, kurtosis)
f <- function(X, tests){
out <- each(... = tests)(X) #each from plyr
names(out) <- GetNames(tests)
out
}
I want GetNames to take the list of objects, in this case functions, and return the names of the objects as text. Ideally, I'd like GetNames to work with any list of named objects:
> GetNames(tests)
[1] "mean" "varience" "skewness" "kurtosis"
as.character(tests) returns the text of the code of each function, not their names.
I tried:
GN <- function(X) deparse(substitute(X))
GetNames <- function(X) lapply(tests, GN)
GetNames(tests)
But this returns:
[[1]]
[1] "X[[i]]"
[[2]]
[1] "X[[i]]"
[[3]]
[1] "X[[i]]"
[[4]]
[1] "X[[i]]"
I have some version of this problem frequently when writing R code. I want a function to evaluate its argument some number of steps, here one step from tests to the names of its objects, and then stop and let me do something to the result, here convert them to strings, rather than going on to get the referents of the names before I can grab them (the names).
Once you run
tests <- list(mean, varience, skewness, kurtosis)
those symbols are evaluated and discarded. If you look at
tests[[2]]
or something similar, you can see there really isn't an original reference to varience; rather, the function that the symbol varience pointed to is now stored in a list. (Things work a bit differently when passing parameters to functions, thanks to promises and the call stack, but that's not what you're doing here.) There is no lazy evaluation for list() after you've run it.
If you want to keep the names of the functions, it's probably best to work with a named list. You can make a helper function like
nlist <- function(...) { dots<-substitute(...()); setNames(list(...), sapply(dots, deparse))}
tests <- nlist(mean, var, skewness, kurtosis)
Now the values are preserved as names
names(tests)
# [1] "mean" "var" "skewness" "kurtosis"
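For illustration, here is a sketch of the same idea using only base functions, so it runs without the moments package. The helper name nlist2 is hypothetical, and it captures the argument expressions with match.call instead of the substitute(...()) idiom:

```r
# hypothetical variant of the helper: name each element after the
# expression that was typed for it
nlist2 <- function(...) {
  exprs <- match.call(expand.dots = FALSE)$...  # unevaluated argument expressions
  setNames(list(...), vapply(exprs, deparse, character(1)))
}

stats_funs <- nlist2(mean, var, median)
names(stats_funs)

# the names carry through to the output
sapply(stats_funs, function(f) f(c(1, 2, 3, 10)))
```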
I confess to being a bit baffled by this question, which makes me think there's some pieces of information you haven't shared with us.
For instance, you say:
I'd like GetNames to work with any list of named objects
Um...well for a named list of objects, such a function already exists, and it's called names().
The "sane" way to do this sort of thing is just name the list in the first place:
tests <- list("mean" = mean, "variance" = var,
              "skewness" = skewness, "kurtosis" = kurtosis)
or you can set the names programmatically via setNames.
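A minimal sketch of the programmatic route, assuming the function names are available as strings. The variable names are illustrative, and only base and stats functions are used so that it runs without moments:

```r
fun_names <- c("mean", "var", "median")  # hypothetical set of function names

# attach the names to an existing list of functions
tests <- setNames(list(mean, var, median), fun_names)

# or look the functions up by name; mget returns a named list
tests2 <- mget(fun_names, mode = "function", inherits = TRUE)

names(tests)
sapply(tests2, function(f) f(c(1, 2, 3, 10)))
```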

How do I access function names in R?

I am writing a function that receives two parameters: a data frame, and a function, and, after processing the data frame, summarizes it using the function parameter (e.g. mean, sd,...). My question is, how can I get the name of the function received as a parameter?
How about:
f <- function(x) deparse(substitute(x))
f(mean)
# [1] "mean"
f(sd)
# [1] "sd"
do.call may be what you want here. You can get a function name as character value, and then pass that and a list of arguments to do.call for evaluation. For example:
X<-"mean"
do.call(X,args=list(c(1:5)) )
[1] 3
Perhaps I'm misunderstanding the question, but it seems like you could simply have the function name as a parameter, and evaluate the function like normal within your function. This approach works fine for me. The ellipsis is for added parameters to your function of interest.
myFunc=function(data,func,...){return(func(data,...))}
myFunc(runif(100), sd)
And if you'd want to apply it to every column or row of a data.frame, you could simply use an apply statement in myFunc.
Here's my try; perhaps you want to return both the result and the function name:
y <- 1:10
myFunction <- function(x, param) {
return(paste(param(x), substitute(param)))
}
myFunction(y, mean)
# [1] "5.5 mean"
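Putting these pieces together for the data frame case in the question, a sketch might look like the following. The function summarise_cols and its output naming scheme are assumptions for illustration, and it assumes the summary function returns a single number per column:

```r
# hypothetical helper: summarise every column and label the results
# with the name of the function that was passed in
summarise_cols <- function(df, fun, ...) {
  fun_name <- deparse(substitute(fun))       # name as typed by the caller
  values <- vapply(df, fun, numeric(1), ...) # one number per column
  setNames(values, paste(names(df), fun_name, sep = "_"))
}

df <- data.frame(a = 1:4, b = c(2, 4, 6, 8))
summarise_cols(df, mean)
# a_mean b_mean
#    2.5    5.0
```

Note that deparse(substitute(fun)) returns whatever expression the caller typed, so passing an anonymous function would give its source text rather than a short name.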

R apply statement does not work with a matrix

I'd like to know the reason why the following does not work on the matrix structure I have posted here (I've used the dput command).
When I try running:
apply(mymatrix, 2, sum)
I get:
Error in FUN(newX[, i], ...) : invalid 'type' (list) of argument
However, when I check to make sure it's a matrix I get the following:
is.matrix(mymatrix)
[1] TRUE
I realize that I can get around this problem by unlisting the data into a temp variable and then just recreating the matrix, but I'm curious why this is happening.
?is.matrix says:
'is.matrix' returns 'TRUE' if 'x' is a vector and has a '"dim"'
attribute of length 2 and 'FALSE' otherwise.
Your object is a list with a dim attribute. A list is a type of vector (even though it is not an atomic type, which is what most people think of as vectors), so is.matrix returns TRUE. For example:
> l <- as.list(1:10)
> dim(l) <- c(10,1)
> is.matrix(l)
[1] TRUE
To convert mymatrix to an atomic matrix, you need to do something like this:
mymatrix2 <- unlist(mymatrix, use.names=FALSE)
dim(mymatrix2) <- dim(mymatrix)
# now your apply call will work
apply(mymatrix2, 2, sum)
# but you should really use (if you're really just summing columns)
colSums(mymatrix2)
The elements of your matrix are not numeric; instead, they are lists. To see this, you can do:
apply(m,2, class) # here m is your matrix
So if you want the column sums, you have to coerce the elements to numeric and then apply colSums, which is a shortcut for apply(x, 2, sum):
colSums(apply(m, 2, as.numeric)) # this will give you the sum you want.
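Since the original dput output is not shown here, the sketch below builds a small list matrix (a stand-in, not the asker's data) to reproduce the behaviour and the fix:

```r
# a list with a dim attribute: is.matrix is TRUE, but the type is "list"
m <- matrix(as.list(1:6), nrow = 2)
is.matrix(m)
typeof(m)

# sum() cannot handle the list columns, so apply fails;
# tryCatch captures the error message instead of stopping
err <- tryCatch(apply(m, 2, sum), error = function(e) conditionMessage(e))
err

# rebuild as an atomic matrix with the same dimensions
m2 <- matrix(unlist(m, use.names = FALSE), nrow = nrow(m), ncol = ncol(m))
colSums(m2)  # 3 7 11
```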
