Purpose of this R idiom (match.call followed by eval(parent.frame()))

I've inherited some code that uses what I think is a common R idiom for libraries, but I'm not sure what is achieved by writing it in such a verbose way. Ultimately I intend to rewrite it, but I would first like to know why it is written this way before I do something stupid.
ecd <-
function(formula, data, subset, weights, offset, ...) {
cl = match.call()
mf = match.call(expand.dots = FALSE)
m =
match(c("formula", "data", "subset", "weights", "offset"),
names(mf),
0L)
mf = mf[c(1L, m)]
mf$drop.unused.levels = TRUE
mf[[1L]] = quote(stats::model.frame)
mf = eval(mf, parent.frame())
mt = attr(mf, "terms")
y = stats::model.response(mf, "numeric")
w = as.vector(stats::model.weights(mf))
offset = as.vector(stats::model.offset(mf))
x = stats::model.matrix(mt, mf, contrasts)
z = ecd.fit(x, y, w, offset, ...)
My current understanding is that it constructs a function call object (type?) from the original arguments to the function and then manually calls it, rather than just calling stats::model.frame directly. Any insights would be appreciated.

I think this should answer everything; the explanations are in the code:
# for later
FOO <- function(x) 1000 * x
y <- 1
foo <- function(...) {
cl = match.call()
message("cl")
print(cl)
message("as.list(cl)")
print(as.list(cl))
message("class(cl)")
print(class(cl))
# we can modify the call as if it were a list
cl[[1]] <- quote(FOO)
message("modified call")
print(cl)
y <- 2
# now I want to call it; evaluating it here or in parent.frame() might
# give a different output
message("evaluate it locally")
print(eval(cl))
message("evaluate it in the parent environment")
print(eval(cl, parent.frame()))
message("eval.parent is equivalent and more idiomatic")
print(eval.parent(cl))
invisible(NULL)
}
foo(y)
# cl
# foo(y)
# as.list(cl)
# [[1]]
# foo
#
# [[2]]
# y
#
# class(cl)
# [1] "call"
# modified call
# FOO(y)
# evaluate it locally
# [1] 2000
# evaluate it in the parent environment
# [1] 1000
# eval.parent is equivalent and more idiomatic
# [1] 1000
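To connect this back to the question: the following is a minimal sketch of why the rewritten call is evaluated in the caller's frame instead of calling stats::model.frame directly. The names fit_direct, fit_idiom and dat are hypothetical and exist only for this illustration. Arguments such as subset are meant to be evaluated inside data, which only works if model.frame receives them as unevaluated expressions.

```r
# Hypothetical example data and wrapper names, for illustration only.
dat <- data.frame(y = c(1, 2, 3, 4), x = c(10, 20, 30, 40), g = c(1, 1, 2, 2))

# Naive wrapper: forwards its arguments like ordinary values.
# model.frame() then sees the symbol `subset`, not the expression g == 2.
fit_direct <- function(formula, data, subset) {
  stats::model.frame(formula, data = data, subset = subset)
}

# The idiom: turn the captured call into a model.frame() call and evaluate it
# in the caller's frame, so `g == 2` is still an expression when it arrives.
fit_idiom <- function(formula, data, subset) {
  mf <- match.call(expand.dots = FALSE)
  m <- match(c("formula", "data", "subset"), names(mf), 0L)
  mf <- mf[c(1L, m)]
  mf[[1L]] <- quote(stats::model.frame)
  eval(mf, parent.frame())
}

direct <- try(fit_direct(y ~ x, dat, g == 2), silent = TRUE)  # fails
idiom <- fit_idiom(y ~ x, dat, g == 2)                        # rows where g == 2
```

Here the direct version fails because g is a column of dat, not a variable in scope, while the idiom lets model.frame resolve g inside dat.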

Related

R Call changing depending on whether variable is named or not in S3 Methods

I'm looking to deal with call evaluations but am out of my depth when it comes to S3 methods. Basically, I am wondering why a variable that I pass to a function call is not evaluated but instead remains the name of the variable rather than its value. All of this depends on whether I name the argument in the function call or not.
Let me illustrate with a short example:
I first create a quick function to create a sample class to be used with S3 Methods:
create_myS3 <- function(a, b){
out <- list()
out$a <- a
out$b <- b
class(out) <- "myS3"
return(out)
}
Now the set-up that I am interested in features a number of functions within each other. I first create an S3 method for this myS3 class, let's call it m and we define a specific routine for the myS3 class as well as a default method. Note that the myS3 version calls the default version.
m <- function(x, ...){UseMethod("m")}
m.myS3 <- function(x, estimator = NULL){
y <- list()
y$a <- x$a + 1
y$b <- x$b + 1
out <- m.default(y,
estimator)
return(out)
}
m.default <- function(x, estimator = NULL, ...){
out <- list()
out$call <- sys.call()
out$result <- x$a - x$b
out$aux$estimator <- estimator
return(out)
}
Now that we have defined the functions, we can look at the results function that I'm interested in:
h <- function(x){
out <- list()
out$result_call <- if(is.null(x$call$estimator)){"Success"}else{"Fail"}
out$result_list <- if(is.null(x$aux$estimator)){"Success"}else{"Fail"}
return(out)
}
Its entire purpose is to check whether the estimator element is present in the object passed to it and to give a message based on that.
Ok, now let's put it all together:
g <- function(x){
object <- m(x)
out <- h(object)
return(out)
}
initial <- create_myS3(10,5)
g(initial)
The g() function now calls m() on the input, which was created with the create_myS3 function - so is of class myS3 and is therefore passed to m.myS3 before it is passed to m.default. The resulting object is then passed to h() - in all cases we have not set the estimator argument, which then defaults to NULL and both my check statements in h() return Success.
Now all I do is change one tiny thing: I modify m.myS3 to call m.default with named arguments rather than relying on argument order, which in my mind is the more robust way. To clarify, I change m.default(y, estimator) to m.default(x = y, estimator = estimator).
This changes my results from h(): the check if(is.null(x$call$estimator)){"Success"}else{"Fail"} now gives Fail, while if(is.null(x$aux$estimator)){"Success"}else{"Fail"} still gives Success.
The reason for this is that the estimator element of the stored call is the symbol estimator rather than its true value NULL.
Is there an easy way to evaluate this call element to its true value (I have tried eval and deparse)? Or, even better, is there a way to ensure that in m.myS3 the value is always passed rather than the variable?
Here below is the total code for convenience:
create_myS3 <- function(a, b){
out <- list()
out$a <- a
out$b <- b
class(out) <- "myS3"
return(out)
}
m <- function(x, ...){UseMethod("m")}
m.myS3 <- function(x, estimator = NULL){
y <- list()
y$a <- x$a + 1
y$b <- x$b + 1
out <- m.default(y,
estimator)
return(out)
}
m.default <- function(x, estimator = NULL, ...){
out <- list()
out$call <- sys.call()
out$result <- x$a - x$b
out$aux$estimator <- estimator
return(out)
}
h <- function(x){
out <- list()
out$result_call <- if(is.null(x$call$estimator)){"Success"}else{"Fail"}
out$result_list <- if(is.null(x$aux$estimator)){"Success"}else{"Fail"}
return(out)
}
g <- function(x){
object <- m(x)
out <- h(object)
return(out)
}
initial <- create_myS3(10,5)
g(initial)
$result_call
[1] "Success"
$result_list
[1] "Success"
## Changing m.myS3 (only change is to name the option of function m.default)
m.myS3 <- function(x, estimator = NULL){
y <- list()
y$a <- x$a + 1
y$b <- x$b + 1
out <- m.default(x = y,
estimator = estimator)
return(out)
}
g(initial)
$result_call
[1] "Fail"
$result_list
[1] "Success"
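A minimal sketch of what seems to be going on, using hypothetical functions rec, by_position and by_name that are not part of the question's code: sys.call() stores the call as it was written, so a named element estimator exists only when the caller named the argument, and it then holds the unevaluated symbol. match.call() names the arguments in both cases, and the evaluated argument itself is the same either way.

```r
# Hypothetical recorder: keeps both forms of the call plus the evaluated value.
rec <- function(x, estimator = NULL, ...) {
  written <- sys.call()    # the call exactly as the caller wrote it
  matched <- match.call()  # the call with arguments matched to formal names
  list(sys = written, matched = matched, value = estimator)
}

by_position <- function(estimator = NULL) rec(1, estimator)
by_name <- function(estimator = NULL) rec(x = 1, estimator = estimator)

p <- by_position()
n <- by_name()

is.null(p$sys$estimator)        # TRUE: no element is named "estimator"
is.symbol(n$sys$estimator)      # TRUE: the symbol `estimator`, not NULL
is.symbol(p$matched$estimator)  # TRUE: match.call() names it in both cases
is.null(n$value)                # TRUE: the evaluated argument is NULL
```

Under that reading, testing the evaluated value (as x$aux$estimator does) is the robust check, while elements of a stored call are expressions that depend on how the call was written.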

Unpack dots provided from another function with missing named arguments

Similar to the question here. Given a function f with named arguments and a function g taking any number of arguments through ..., how would one
f <- function(a)
g(a = a)
g <- function(...)
list(...)
f()
Error in g(a = a) : argument "a" is missing, with no default
rlang::dots_list sadly did not provide an answer
f2 <- function(a)
h(a = a)
h <- function(...)
rlang::dots_list(..., .ignore_empty = 'all')
f2()
Error in eval(expr, p) : argument "a" is missing, with no default
Edit:
To make the problem more clear, the function g may be called by a myriad of functions, and I'm looking for a way to handle the missing arguments within g and not f.
You can forward ... to subfunctions to multiple depths without evaluating them, as long as the subfunctions don't perform any evaluation themselves. So you don't have to handle this in every function that receives ..., but at the point where the arguments are evaluated you will need to deal with it somehow.
Assuming that f() should return an empty list, handle the missing argument separately within g:
f <- function(a) g(a = a)
g <- function(..., default = list()) if (missing(..1)) default else list(...)
f()
## list()
or the following which checks each element of ... :
g <- function(..., default = list()) {
L <- list()
for(i in seq_len(...length())) {
x <- try(eval.parent(list(...)[[i]]), silent = TRUE)
L[[i]] <- if (inherits(x, "try-error")) default else x
}
names(L) <- names(substitute(alist(...))[-1])
L
}
f()
## $a
## list()
or within f:
f <- function(a) if (missing(a)) g() else g(a = a)
g <- function(...) list(...)
f()
## list()
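The point about forwarding without evaluation can be sketched as follows. All function names here (f_fwd, mid_fwd, count_dots and the *_eval variants) are hypothetical. The missing argument travels through two levels of ... untouched, and the error appears only in the function that finally evaluates it.

```r
# Forwarding only: nothing along this chain forces the argument `a`.
count_dots <- function(...) ...length()   # counts arguments without evaluating them
mid_fwd <- function(...) count_dots(...)  # passes ... straight through
f_fwd <- function(a) mid_fwd(a = a)

n_args <- f_fwd()  # no error, even though `a` is missing

# Same chain, but the last function evaluates the arguments via list(...).
collect_eval <- function(...) list(...)
mid_eval <- function(...) collect_eval(...)
f_eval <- function(a) mid_eval(a = a)

res <- try(f_eval(), silent = TRUE)  # fails: argument "a" is missing
```

This is why the check has to live wherever the evaluation happens, which in the question is g.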
Your code seems to be OK, except that you call f() without an argument at the end. Try this:
f <- function(a)
g(a = a)
g <- function(...)
list(...)
f("example")
Or you have to provide a default value for a:
f <- function(a = "example")
g(a = a)
g <- function(...)
list(...)
f()
So the problem is not a missing argument in g(...), but a missing argument value in f() when calling g(a = a) without having a.

How to get match.call() from a united function?

I have three functions: a generic that uses UseMethod() and two methods it dispatches to.
logReg <- function(x, ...) UseMethod("logReg")
logReg.numeric <- function(x, y) {
print(x)
}
logReg.formula <- function(formula, data) {
print(formula)
}
My functions are a bit more complex, but that does not matter for my question. I want logReg to additionally give me the original function call as output (not the function call of logReg.numeric or logReg.formula). My first try was:
logReg <- function(x, ...) {
out <- list()
out$call <- match.call()
out
UseMethod("logReg")
}
But it does not work. Can someone give me a hint how to solve my problem?
Here's another way :
logReg <- function(x, ...) {
logReg <- function(x, ...) UseMethod("logReg")
list(logReg(x,...), call=match.call())
}
res <- logReg(1,2)
# [1] 1
res
# [[1]]
# [1] 1
#
# $call
# logReg(x = 1, 2)
#
You can make it work with attributes too if you prefer.
Try evaluating it explicitly. Note that this preserves the caller as the parent frame of the method.
logReg <- function(x, ...) {
cl <- mc <- match.call()
cl[[1]] <- as.name("logReg0")
out <- structure(eval.parent(cl), call = mc)
out
}
logReg0 <- function(x, ...) UseMethod("logReg0")
logReg0.numeric <- function(x, ...) print(x)
logReg0.formula <- function(x, ...) print(x)
result <- logReg(c(1,2))
## [1] 1 2
result
## [1] 1 2
## attr(,"call")
## logReg(x = c(1, 2))
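For context, the first attempt in the question cannot work because UseMethod() never returns control to the generic, so nothing built in the generic's body survives dispatch. Another option, sketched here with a hypothetical generic logReg2 rather than the asker's functions, is to call match.call() inside each method: the call recorded there still carries the generic's name, not the method's.

```r
# Hypothetical generic and method, for illustration only.
logReg2 <- function(x, ...) UseMethod("logReg2")

logReg2.numeric <- function(x, y, ...) {
  # Inside a method reached via UseMethod(), the recorded call is the one
  # made to the generic, with arguments matched to this method's formals.
  list(value = x, call = match.call())
}

res <- logReg2(1, 2)
res$call  # logReg2(x = 1, y = 2)
```

The trade-off is that every method has to store the call itself, whereas the wrapper approaches above do it once.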

List of quosures as input of a set of functions

This question refers to "Programming with dplyr"
I want to slice the ... argument of a function and use each element as an argument for a corresponding function.
foo <- function(...){
<some code>
}
should evaluate for example foo(x, y, z) in this form:
list(bar(~x), bar(~y), bar(~z))
so that x, y, z remain quoted till they get evaluated in bar.
I tried this:
foo <- function(...){
arguments <- quos(...)
out <- map(arguments, ~bar(UQ(.)))
out
}
I have two intentions:
Learn better how tidyeval/rlang works and when to use it.
Turn future::futureOf() into a function that gets me more than one future at once.
This approach might be overly complicated, because I don't fully understand the underlying concepts of tidyeval yet.
You don't really need any packages for this. match.call can be used.
foo <- function(..., envir = parent.frame()) {
cl <- match.call()
cl$envir <- NULL
cl[[1L]] <- as.name("bar")
lapply(seq_along(cl)[-1], function(i) eval(cl[c(1L, i)], envir))
}
# test
bar <- function(...) match.call()
foo(x = 1, y = 2, z = 3)
giving:
[[1]]
bar(x = 1)
[[2]]
bar(y = 2)
[[3]]
bar(z = 3)
Another test
bar <- function(...) ..1^2
foo(x = 1, y = 2, z = 3)
giving:
[[1]]
[1] 1
[[2]]
[1] 4
[[3]]
[1] 9
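The tests above pass constants, so they do not show whether the arguments stay quoted. A small base R variant, sketched with hypothetical names foo_quoted and bar_quote, captures the expressions with substitute() and builds one call per argument, so each expression reaches the inner function unevaluated.

```r
# Hypothetical stand-in for bar(): returns its argument as an unevaluated expression.
bar_quote <- function(x) substitute(x)

foo_quoted <- function(...) {
  env <- parent.frame()                          # where the calls will be evaluated
  exprs <- as.list(substitute(list(...)))[-1L]   # argument expressions, unevaluated
  lapply(exprs, function(e) {
    cl <- as.call(list(as.name("bar_quote"), e)) # builds bar_quote(<expr>)
    eval(cl, env)
  })
}

# a, b and c need not exist, because nothing ever evaluates them.
out <- foo_quoted(a + b, c)
```

Here out[[1]] is the call a + b and out[[2]] is the symbol c.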

How to reliably get dependent variable name from formula object?

Let's say I have the following formula:
myformula<-formula("depVar ~ Var1 + Var2")
How to reliably get dependent variable name from formula object?
I failed to find any built-in function that serves this purpose.
I know that as.character(myformula)[[2]] works, as does
sub("^(\\w*)\\s~\\s.*$","\\1",deparse(myformula))
These methods just seem to me more like hackery than a reliable, standard way to do it.
Does anyone know what method e.g. lm uses, exactly? I've seen its code, but it is a little too cryptic for me. Here is a quote for your convenience:
> lm
function (formula, data, subset, weights, na.action, method = "qr",
model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE,
contrasts = NULL, offset, ...)
{
ret.x <- x
ret.y <- y
cl <- match.call()
mf <- match.call(expand.dots = FALSE)
m <- match(c("formula", "data", "subset", "weights", "na.action",
"offset"), names(mf), 0L)
mf <- mf[c(1L, m)]
mf$drop.unused.levels <- TRUE
mf[[1L]] <- as.name("model.frame")
mf <- eval(mf, parent.frame())
if (method == "model.frame")
return(mf)
else if (method != "qr")
warning(gettextf("method = '%s' is not supported. Using 'qr'",
method), domain = NA)
mt <- attr(mf, "terms")
y <- model.response(mf, "numeric")
w <- as.vector(model.weights(mf))
if (!is.null(w) && !is.numeric(w))
stop("'weights' must be a numeric vector")
offset <- as.vector(model.offset(mf))
if (!is.null(offset)) {
if (length(offset) != NROW(y))
stop(gettextf("number of offsets is %d, should equal %d (number of observations)",
length(offset), NROW(y)), domain = NA)
}
if (is.empty.model(mt)) {
x <- NULL
z <- list(coefficients = if (is.matrix(y)) matrix(, 0,
3) else numeric(), residuals = y, fitted.values = 0 *
y, weights = w, rank = 0L, df.residual = if (!is.null(w)) sum(w !=
0) else if (is.matrix(y)) nrow(y) else length(y))
if (!is.null(offset)) {
z$fitted.values <- offset
z$residuals <- y - offset
}
}
else {
x <- model.matrix(mt, mf, contrasts)
z <- if (is.null(w))
lm.fit(x, y, offset = offset, singular.ok = singular.ok,
...)
else lm.wfit(x, y, w, offset = offset, singular.ok = singular.ok,
...)
}
class(z) <- c(if (is.matrix(y)) "mlm", "lm")
z$na.action <- attr(mf, "na.action")
z$offset <- offset
z$contrasts <- attr(x, "contrasts")
z$xlevels <- .getXlevels(mt, mf)
z$call <- cl
z$terms <- mt
if (model)
z$model <- mf
if (ret.x)
z$x <- x
if (ret.y)
z$y <- y
if (!qr)
z$qr <- NULL
z
}
Try using all.vars:
all.vars(myformula)[1]
I suppose you could also cook your own function to work with terms():
getResponse <- function(formula) {
tt <- terms(formula)
vars <- as.character(attr(tt, "variables"))[-1] ## [1] is the list call
response <- attr(tt, "response") # index of response var
vars[response]
}
R> myformula <- formula("depVar ~ Var1 + Var2")
R> getResponse(myformula)
[1] "depVar"
It is just as hacky as as.character(myformula)[[2]], but you have the assurance that you get the correct variable, as the ordering of the call parse tree isn't going to change any time soon.
This isn't so good with multiple dependent variables:
R> myformula <- formula("depVar1 + depVar2 ~ Var1 + Var2")
R> getResponse(myformula)
[1] "depVar1 + depVar2"
as they'll need further processing.
I found a useful package, 'formula.tools', which is suitable for your task.
Code example:
library(formula.tools)
f <- as.formula(a1 + a2~a3 + a4)
lhs.vars(f) #get dependent variables
[1] "a1" "a2"
rhs.vars(f) #get independent variables
[1] "a3" "a4"
Based on your edit to get the actual response, not just its name, we can use the nonstandard evaluation idiom employed by lm() and most other modelling functions with a formula interface in base R
form <- formula("depVar ~ Var1 + Var2")
dat <- data.frame(depVar = rnorm(10), Var1 = rnorm(10), Var2 = rnorm(10))
# the first argument must be named formula so that match() below can find it
getResponse <- function(formula, data) {
mf <- match.call(expand.dots = FALSE)
m <- match(c("formula", "data"), names(mf), 0L)
mf <- mf[c(1L, m)]
mf$drop.unused.levels <- TRUE
mf[[1L]] <- as.name("model.frame")
mf <- eval(mf, parent.frame())
y <- model.response(mf, "numeric")
y
}
> getResponse(form, dat)
1 2 3 4 5
-0.02828573 -0.41157817 2.45489291 1.39035938 -0.31267835
6 7 8 9 10
-0.39945771 -0.09141438 0.81826105 0.37448482 -0.55732976
As you see, this gets the actual response variable data from the supplied data frame.
How this works is that the function first captures the function call without expanding the ... argument as that contains things not needed for the evaluation of the data for the formula.
Next, the "formula" and "data" arguments are matched with the call. The line mf[c(1L, m)] selects the function name from the call (1L) and the locations of the two matched arguments. The drop.unused.levels argument of model.frame() is set to TRUE in the next line, and then the call is updated to switch the function name in the call (getResponse here, lm in the lm() source) to model.frame. All the above code does is take the original call and process it into a call to the model.frame() function.
This modified call is then evaluated in the calling environment of the function (parent.frame()), which in this case is the global environment.
The last line uses the model.response() extractor function to take the response variable from the model frame.
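The steps described above can be made visible by returning the call object after each stage instead of evaluating it. This is only a sketch: show_steps is a hypothetical helper, and the symbols y, x and d do not need to exist because nothing is evaluated.

```r
# Hypothetical helper that exposes the intermediate call objects.
show_steps <- function(formula, data, ...) {
  mf <- match.call(expand.dots = FALSE)            # capture the call, keep ... unexpanded
  captured <- mf
  m <- match(c("formula", "data"), names(mf), 0L)  # positions of the wanted arguments
  mf <- mf[c(1L, m)]                               # function name plus those arguments
  trimmed <- mf
  mf$drop.unused.levels <- TRUE                    # add an argument for model.frame()
  mf[[1L]] <- as.name("model.frame")               # swap the function being called
  list(captured = captured, trimmed = trimmed, final = mf)
}

steps <- show_steps(y ~ x, data = d, verbose = TRUE)
steps$captured  # still a call to show_steps, with verbose held under `...`
steps$trimmed   # show_steps(formula = y ~ x, data = d)
steps$final     # a call to model.frame with formula, data, drop.unused.levels
```

Evaluating steps$final in the caller's frame is then the only remaining step of the idiom.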
This should always give you all dependent vars:
myformula<-formula("depVar1 + depVar2 ~ Var1 + Var2")
as.character(myformula[[2]])[-1]
#[1] "depVar1" "depVar2"
And I wouldn't consider this particularly "hacky".
Edit:
Something strange happens with 3 dependents:
myformula<-formula("depVar1 + depVar2 + depVar3 ~ Var1 + Var2")
as.character(myformula[[2]])
#[1] "+" "depVar1 + depVar2" "depVar3"
So this might not be as reliable as I thought.
Edit2:
Okay, myformula[[2]] is a language object and as.character seems to do something similar to languageEl.
length(myformula[[2]])
#[1] 3
languageEl(myformula[[2]],which=1)
#`+`
languageEl(myformula[[2]],which=2)
#depVar1 + depVar2
languageEl(myformula[[2]],which=3)
#depVar3
languageEl(languageEl(myformula[[2]],which=2),which=2)
#depVar1
If you check the length of each element, you could create your own extraction function. But this is probably too much of a hack.
Edit3:
Based on the answer by @seancarmody, all.vars(myformula[[2]]) is the way to go.
Using all.vars is risky because it cannot tell that a one-sided formula has no response. For example:
all.vars(~x+1)
[1] "x"
which is wrong, because this formula has no response.
Here is a more reliable way of getting the response:
getResponseFromFormula = function(formula) {
if (attr(terms(as.formula(formula)) , which = 'response'))
all.vars(formula)[1]
else
NULL
}
getResponseFromFormula(~x+1)
NULL
getResponseFromFormula(y~x+1)
[1] "y"
Note that you can replace all.vars(formula)[1] in the function with formula[[2]] if the formula contains more than one variable for the response; this returns the whole left-hand side as an unevaluated expression.
I know this question is quite old, but I thought I'd add a base R answer which doesn't require indexing, doesn't depend on the order of the variables listed in a call to all.vars, and which gives the response variables as separate elements when there is more than one:
myformula <- formula("depVar1 + depVar2 ~ Var1 + Var2")
all_vars <- all.vars(myformula)
response <- all_vars[!(all_vars %in% labels(terms(myformula)))]
> response
[1] "depVar1" "depVar2"
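Combining the points raised in the answers above (one-sided formulas have no response, and all.vars() applied to the left-hand side returns each response variable separately), a minimal sketch could look like this. response_vars is a hypothetical helper name, not a base R function.

```r
# Hypothetical helper: variable names appearing on the left-hand side of a formula.
response_vars <- function(formula) {
  formula <- as.formula(formula)
  # A two-sided formula has length 3 (`~`, lhs, rhs); a one-sided one has length 2.
  if (length(formula) < 3L) return(character(0))
  all.vars(formula[[2L]])
}

r_two <- response_vars(depVar1 + depVar2 ~ Var1 + Var2)  # "depVar1" "depVar2"
r_none <- response_vars(~ x + 1)                         # character(0)
r_log <- response_vars(log(y) ~ log(x))                  # "y"
```

Because it only looks at the left-hand side, this version is not affected by transformed predictors such as log(x) on the right-hand side.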
