I'm trying to write a function with dynamic arguments (i.e. the function argument names are not determined beforehand). Inside the function, I can generate a list of possible argument names as strings and try to extract the function argument with the corresponding name (if given). I tried using match.arg, but that does not work.
As a (massively stripped-down) example, consider the following attempt:
# Override column in the dataframe. Dots arguments can be any
# of the column names of the data.frame.
dataframe.override = function(frame, ...) {
for (n in names(frame)) {
# Check whether this col name was given as an argument to the function
if (!missing(n)) {
vl = match.arg(n);
# DO something with that value and assign it as a column:
newval = vl
frame[,n] = newval
}
}
frame
}
AA = data.frame(a = 1:5, b = 6:10, c = 11:15)
dataframe.override(AA, b = c(5,6,6,6,6)) # Should override column b
Unfortunately, the match.arg apparently does not work:
Error in match.arg(n) : 'arg' should be one of
So, my question is: Inside a function, how can I check whether the function was called with a given argument and extract its value, given the argument name as a string?
Thanks,
Reinhold
PS: In reality, the "Do something..." part is quite complicated, so simply assigning the vector to the dataframe column directly without such a function is not an option.
You probably want to review the chapter on Non Standard Evaluation in Advanced-R. I also think Hadley's answer to a related question might be useful.
So: let's start from that other answer. The most idiomatic way to get the arguments to a function is like this:
get_arguments <- function(...){
match.call(expand.dots = FALSE)$`...`
}
That provides a list of the arguments with names:
> get_arguments(one, test=2, three=3)
[[1]]
one
$test
[1] 2
$three
[1] 3
You could simply call names() on the result to get the names.
Note that if you want the values as strings you'll need to use deparse, e.g.
deparse(get_arguments(one, test=2, three=3)[[2]])
[1] "2"
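To tie this back to the original question (checking by name whether an argument was supplied and extracting its value), here is a minimal sketch. It assumes the values can be evaluated right away, so it uses list(...) rather than match.call. The helper name get_dot_arg is made up for illustration.

```r
# Hypothetical helper (not from the answer above): look up one of the
# ... arguments by its name, given as a string.
get_dot_arg <- function(name, ...) {
  args <- list(...)  # evaluates every argument passed through ...
  # Return the value if an argument with that name was supplied, else NULL
  if (name %in% names(args)) args[[name]] else NULL
}

supplied <- get_dot_arg("b", a = 1:5, b = c(5, 6, 6, 6, 6))  # argument was given
missing_one <- get_dot_arg("c", a = 1:5)                     # argument was not given
```

Returning NULL for a missing argument is only one possible convention; it cannot distinguish "not supplied" from "supplied as NULL".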
P.S. Instead of looping through all columns, you might want to use intersect or setdiff, e.g.
dataframe.override = function(frame, ...) {
  # Collect the values passed through ... as a named list
  args = list(...)
  # Keep only the argument names that are also column names
  matching.cols <- intersect(names(frame), names(args))
  for (n in matching.cols) {
    # Extract the value that was given for this column name
    vl = args[[n]]
    # DO something with that value and assign it as a column:
    newval = vl
    frame[, n] = newval
  }
  frame
}
P.P.S: I'm assuming there's a reason you're not using dplyr::mutate for this.
Related
I have the following double loop:
indexnames = c(a, b, c, d, etc.)
# with
# length(indexnames) = 87
# class(indexnames) = "character"
# (indexnames = indexes I want to add in a column)
files = c(aname, bname, cname, dname, etc.)
# with
# length(files) = 87
# class(files) = "character"
# (files = name of files in the global environment)
Now I want to loop through the two lists and add to files[1] a column named "index" containing indexnames[1], and so on for each pair. I implemented this the following way:
for(i in files){
for(j in indexnames){
files[i] = cbind(Index = indexnames[j], files[i])
}
}
When I run this, I get a message reporting 50 or more warnings.
What am I doing wrong?
Appreciating any help, thanks.
You need to use get() and assign() functions to get the behavior you want.
You also don't have to name your loop variables i or j; a loop is easier to debug when its variables have human-readable names. Still, let's look at the inner part of your loop.
files[i]
Given that files is a character vector, you cannot select a specific element by its value this way (nor would you want to, since it's just a vector holding the names of objects). Instead, make "i" cycle through a numeric vector with 'for(i in 1:87)':
for (i in 1:87) {
  assign(files[i], `[[<-`(get(files[i]), 'index', value = indexnames[i]))
}
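A self-contained sketch of the get()/assign() approach, using two small made-up data frames (aname, bname) in place of the 87 real ones. It adds the column with $<- rather than `[[<-`, which should be an equivalent choice here.

```r
# Made-up stand-ins for the real objects
aname <- data.frame(x = 1:2)
bname <- data.frame(x = 3:4)
files <- c("aname", "bname")
indexnames <- c("a", "b")

for (i in seq_along(files)) {
  current <- get(files[i])        # fetch the data frame by its name
  current$Index <- indexnames[i]  # the single value is recycled to every row
  assign(files[i], current)       # store the modified copy under the same name
}
```

Keeping the data frames in a named list and looping over that list is often easier to maintain than get()/assign(), but that would change how the objects are stored.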
I found some help in this answer:
How to use `assign()` or `get()` on specific named column of a dataframe?
I'm trying to create a function that can evaluate multiple independent expressions. My goal is to input many expressions at once like myfunction(x = 2, y = c(5,10,11) , z = 10, ...), and use each expression's name and value to feed other functions inside of it. The transform() function works kind of like that: transform(someData, x = x*2, y = y + 1).
I know I can get the name and the value of an expression using:
> names(expression(x=2))
[1] "x"
> eval(expression(x=2))
[1] 2
However, I don't know how to pass those expressions through a function. Here is some of my work so far.
With an unquoted expression (x=2), I could not pass it through the dots (...).
> myfunction <- function(...) { names(expression(...)) }
> myfunction(x=2)
expression(...)
Now, using quotes. It gets the value but not the name. The structure returned by parse() is different from that of a traditional expression. See class(expression(x=2)) and class(parse(text="x=2")), then str(expression(x=2)) and str(parse(text="x=2")).
> myfunction <- function(...) {
assign("temp",...)
results <- parse(text=temp)
cat(names(results))
cat(eval(results))
}
> myfunction("x=2")
> 2
So, any ideas?
It's unclear exactly what you want the return of your function to be. You can get the names and expressions passed to a function using
myfunction <- function(...) {
x<-substitute(...())
#names(x)
x
}
myfunction(x = 2, y = c(5,10,11) , z = 10)
Here you get a named list and each of the items is an unevaluated expression or language object that you can evaluate later if you like.
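As a sketch of "evaluate later", the following captures the expressions and then evaluates each one in the caller's environment. It uses as.list(substitute(list(...)))[-1] as an alternative way to capture the dots, and the function name capture_exprs is made up for illustration.

```r
# Hypothetical function: capture the ... arguments unevaluated,
# then evaluate each of them in the caller's environment.
capture_exprs <- function(...) {
  caller <- parent.frame()
  # substitute() yields the call list(x = 2, ...); drop the leading `list`
  exprs <- as.list(substitute(list(...)))[-1]
  values <- lapply(exprs, eval, envir = caller)
  list(names = names(exprs), values = values)
}

res <- capture_exprs(x = 2, y = c(5, 10, 11), z = 10)
res$names     # the argument names
res$values$y  # the evaluated value of the second expression
```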
I have written a function that produces as output 2 matrices, say A & B, and I have used list() in order to separate them in my output. Now I would like to re-write my function so that the displayed output is ONLY matrix B unless I specify it when calling the function (however, my function has still to compute both matrices.) Basically, I would like to hide matrix A from the output unless I say otherwise.
Can I do this in R?
Yes.
Here's an example:
myfun <- function(a, b, Bonly=TRUE) {
# calculations
result <- list(a, b)
if (Bonly) return(result[2]) else return(result)
}
Basically you set a variable that has a default in the function with the notation x=DEFAULT in the set of arguments passed to the function. The variable does not need to be specified for the function to run. If the variable has the default value then return just B, otherwise return both.
> myfun(1,2)
[[1]]
[1] 2
> myfun(1,2, FALSE)
[[1]]
[1] 1
[[2]]
[1] 2
You can add an argument whose default value says that matrix A should be hidden unless the user specifies that it should be part of the result:
myFunction <- function(<your arguments>, hideA = TRUE){
#your computations
...
output <- list(A = <matrix A>, B = <matrix B>)
#your result
if(hideA) output <- output$B #hide A
return(output)
}
#calling the function
myFunction(<your args>) #A will be hidden by default
myFunction(<your args>, hideA = FALSE) #the list of matrix will be returned
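A runnable version of the template above, as a sketch only: the computations (an identity matrix for A and a simple sequence for B) and the name compute_matrices are placeholders, not the asker's real calculations.

```r
# Hypothetical example: both matrices are always computed,
# but A is only returned when hideA = FALSE.
compute_matrices <- function(n, hideA = TRUE) {
  A <- diag(n)                            # placeholder computation for A
  B <- matrix(seq_len(n * n), nrow = n)   # placeholder computation for B
  output <- list(A = A, B = B)
  if (hideA) output$B else output
}

only_b <- compute_matrices(2)                 # just matrix B
both <- compute_matrices(2, hideA = FALSE)    # list with A and B
```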
I'm working with dplyr and created code to compute new data that is plotted with ggplot.
I want to turn this code into a function. It should take the name of a column of the data frame that is manipulated by dplyr. However, passing column names this way does not work. Please consider the minimal example below:
df <- data.frame(A = seq(-5, 5, 1), B = seq(0,10,1))
library(dplyr)
foo <- function (x) {
df %>%
filter(x < 1)
}
foo(B)
Error in filter_impl(.data, dots(...), environment()) :
object 'B' not found
Is there any solution to use the name of a column as a function argument?
If you want to create a function which accepts the string "B" as an argument (as in your question's title):
foo_string <- function (x) {
eval(substitute(df %>% filter(xx < 1),list(xx=as.name(x))))
}
foo_string("B")
If you want to create a function which captures B unevaluated as an argument (as dplyr does):
foo_nse <- function (x) {
# capture the argument without evaluating it
x <- substitute(x)
eval(substitute(df %>% filter(xx < 1),list(xx=x)))
}
foo_nse(B)
You can find more information in Advanced R
Edit
dplyr makes things easier in version 0.3. Functions with suffixes "_" accept a string or an expression as an argument
foo_string <- function (x) {
# construct the string
string <- paste(x,"< 1")
# use filter_ instead of filter
df %>% filter_(string)
}
foo_string("B")
foo_nse <- function (x) {
# capture the argument without evaluating it
x <- substitute(x)
# construct the expression
expression <- lazyeval::interp(quote(xx < 1), xx = x)
# use filter_ instead of filter
df %>% filter_(expression)
}
foo_nse(B)
You can find more information in this vignette
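For comparison, the string case can also be handled without any non-standard evaluation by indexing the column with [[ in base R. This is a different technique from the dplyr approaches above, shown only as a sketch; the helper name filter_below and its limit parameter are made up for illustration.

```r
df <- data.frame(A = seq(-5, 5, 1), B = seq(0, 10, 1))

# Hypothetical helper: keep the rows where the column named `col`
# (given as a string) is below `limit`.
filter_below <- function(data, col, limit = 1) {
  data[data[[col]] < limit, , drop = FALSE]
}

res <- filter_below(df, "B")  # rows of df where B < 1
```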
I remember a similar question which was answered by @Richard Scriven. I think you need to write something like this.
foo <- function(x,...)filter(x,...)
What @Richard Scriven mentioned was that you need to use ... here. If you type ?dplyr, you will be able to find this: filter(.data, ...). I think you can replace .data with x or whatever. If you want to pick up the rows of your df that have values smaller than 1 in B, it will look like this.
foo <- function (x,...) filter(x,...)
foo(df, B < 1)
I'm using data.table package and trying to write a function (shown below):
require(data.table)
# Function definition
f = function(path, key) {
table = data.table(read.delim(path, header=TRUE))
e = substitute(key)
setkey(table, e) # <- Error in setkeyv(x, cols, verbose = verbose) : some columns are not in the data.table: e
return(table)
}
# Usage
f("table.csv", ID)
Here I try to pass an expression to the function. Why doesn't this code work?
I've already tried different combinations of substitute(), quote() and eval(). So, it'd be great if you could also explain how to get this to work.
First, let's look at how the setkey function does things from the data.table package:
# setkey function
function (x, ..., verbose = getOption("datatable.verbose"))
{
if (is.character(x))
stop("x may no longer be the character name of the data.table. The possibility was undocumented and has been removed.")
cols = getdots()
if (!length(cols))
cols = colnames(x)
else if (identical(cols, "NULL"))
cols = NULL
setkeyv(x, cols, verbose = verbose)
}
So, when you do:
require(data.table)
dt <- data.table(ID=c(1,1,2,2,3), y = 1:5)
setkey(dt, ID)
It calls the function getdots which is internal to data.table (that is, it's not exported). Let's have a look at that function:
# data.table:::getdots
function ()
{
as.character(match.call(sys.function(-1), call = sys.call(-1),
expand.dots = FALSE)$...)
}
So, what does this do? It takes the parameter you entered in setkey and it uses match.call to extract the arguments separately. That is, the match.call argument for this example case would be:
setkey(x = dt, ... = list(ID))
and since it's a list, you can access the ... parameter with $... to get a list of one element with the value ID. Converting this list to a character with as.character results in "ID" (a character vector). setkey then passes this to setkeyv internally to set the keys.
Now why doesn't this work when you write setkey(table, key) inside your function?
This is precisely because of the way setkey/getdots works. The setkey function is designed to take any argument after the first argument (which is a data.table) and then return the ... argument as a character.
That is, if you give setkey(dt, key) then it'll return cols <- "key". If you give setkey(dt, e), it'll give back cols <- "e". It doesn't check whether "key" is an existing variable and, if so, substitute the value of that variable. All it does is convert the value you provide (whether it be a symbol or character) back to a character.
Of course this won't work in your case because you want the value in key = ID to be provided in setkey. At least I can't think of a way to do this.
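The behaviour described above can be reproduced in base R without data.table. The sketch below mimics the idea of getdots with a made-up function dots_as_names (it is not data.table's actual code) and shows that a wrapper passing its own argument through yields the argument's name rather than its value.

```r
# Hypothetical stand-in for setkey/getdots: return the ... arguments
# as character strings without ever evaluating them.
dots_as_names <- function(x, ...) {
  as.character(match.call(expand.dots = FALSE)$`...`)
}

# Hypothetical wrapper, analogous to calling setkey(table, key) inside f()
wrapper <- function(x, key) dots_as_names(x, key)

direct <- dots_as_names(NULL, ID)  # sees the symbol ID
wrapped <- wrapper(NULL, ID)       # sees the symbol key, not ID
```

Because nothing is evaluated, the symbol ID never has to exist, which is also why the wrapper cannot forward it this way.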
How to get around this?
As @agstudy already mentions, the best/easiest way is to pass "ID" and use setkeyv. But if you really insist on using f("table.csv", ID), then this is what you could do:
f <- function(path, key) {
table = data.table(read.delim(path, header=TRUE))
e = as.character(match.call(f)$key)
setkeyv(table, e)
return(table)
}
Here, you first use match.call to get the value corresponding to argument key and then convert it to a character and then pass that to setkeyv.
In short, setkey internally uses setkeyv. And imho, setkey is a convenient function to be used when you already know the column name of the data.table for which you need to set the key. Hope this helps.
I can't tell from your code what you're trying to achieve, so I'll answer the question the title asks instead; "How to pass an expression through a function?"
If you want to do this (this should be avoided where possible), you can do the following:
f <- function(expression) {
return(eval(parse(text=expression)))
}
For example:
f("a <- c(1,2,3); sum(a)")
# [1] 6