I wonder how I can pass arguments of some function to a subpart of the function that uses with / within. e.g.:
myfunction <- function(dataframe,col1,col2){
res <- within(dataframe, somenewcol <-paste(col1,"-","col2",sep=""))
return(res)
}
where col1 and col2 are columns contained in the dataframe. What's the correct way to pass the arguments col1 and col2 to the within expression? When I just try to use them, I get:
Error in paste(col1, "-", , :
object 'Some_passed_col' not found
Here's an example:
dataset <- data.frame(rnorm(20),2001:2020,rep(1:10,2))
names(dataset) <- c("mydata","col1","col2")
myfunction <- function(dataframe,arg1,arg2){
res <- with(dataframe, onesinglecol <- paste(arg1,"-","arg2",sep=""))
return(res)
}
# call function
myfunction(dataset,col1,col2)
EDIT:
The following works for me now, but I cannot completely understand why, so any further explanation is appreciated:
myfunction(dataset,arg1="col1",arg2="col2")
if I adjust the function body to:
res <- with(dataframe, onesinglecol <- paste(get(arg1),"-",get(arg2),sep=""))
Try
myfunction <- function(dataframe,arg1,arg2){
dataframe["onesinglecol"] <- dataframe[[arg1]] -dataframe[[arg2]]
return(dataframe)
}
And call it with character-valued column names rather than object names that are nowhere defined:
myfunction(dataset,"col1","col2")
mydata col1 col2 onesinglecol
1 0.6834402 2001 1 2000
2 1.6623748 2002 2 2000
3 -0.5769926 2003 3 2000 .... etc
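The function above computes a difference. If the pasted string from the question is what is wanted, the same [[ indexing can be combined with paste(). A minimal sketch, in which the function name paste_cols and the small data frame are only illustrative and not part of the original answer:

```r
# Illustrative data; the column names mirror the question's example.
dataset <- data.frame(mydata = c(0.1, 0.2, 0.3), col1 = 2001:2003, col2 = 1:3)

# paste_cols is a hypothetical name; arg1 and arg2 are column names given as strings.
paste_cols <- function(dataframe, arg1, arg2) {
  # [[ looks the column up by the string stored in arg1 / arg2
  dataframe[["onesinglecol"]] <- paste(dataframe[[arg1]], "-", dataframe[[arg2]], sep = "")
  dataframe
}

res <- paste_cols(dataset, "col1", "col2")
res$onesinglecol
```

Because the columns are looked up by string, no with() or within() is needed here.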
I think this is done via the ... directive:
E.g.:
myfunction <- function(dataframe, ...){
var <- anotherfunction( arg1= 1, arg2 = 2 , ...)
return(var)
}
... is a placeholder for additional arguments passed through to "anotherfunction".
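As a concrete, runnable version of that pattern, here is a small sketch in which forward_mean is a made-up wrapper and mean() stands in for "anotherfunction". Note that ... only forwards extra arguments; it does not change how names are looked up inside with() or within().

```r
# forward_mean is a hypothetical wrapper; everything in ... goes straight to mean().
forward_mean <- function(x, ...) {
  mean(x, ...)
}

forward_mean(c(1, 2, NA))                # NA, because na.rm is not set
forward_mean(c(1, 2, NA), na.rm = TRUE)  # na.rm is passed through to mean()
```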
You are missing the fact that col1 and col2 do not exist in dataframe (from your example) nor in the user workspace.
Basically, with() and within() work like this:
> foo <- 10
> bar <- data.frame(FOO = 10)
> FOO + foo
Error: object 'FOO' not found
> with(bar, FOO + foo)
[1] 20
In the first case, FOO was not found as it is inside bar. In the second case, we set up an environment within which our expression is evaluated. Inside that environment FOO does exist. foo is also found in the workspace.
In your first example (please don't edit error messages etc.; show us exactly what code you ran and what error was produced), either col1 or col2 didn't exist in the environment within which your expression was evaluated.
Further, you appear to want to store in col1 and col2 the name of a column (component) of your dataframe. DWin has shown you one way to use this information. An alternative maintaining the use of within() is to use get() like this:
res <- within(dataframe, somenewcol <- paste(get(col1), "-", get(col2), sep=""))
Why this works, as per your extra edit and quandary, is that get() returns the object named by its first argument. get("foo") will return the object named foo (continuing from my example above):
get("foo") ## finds foo and returns it
[1] 10
In your example, you have a data frame with names inter alia "col1" and "col2". You changed your code to get(arg1) (where arg1 <- "col1"), where you are asking get() to return the object with name "col1" from the evaluation environment visible at the time your function is being evaluated. As your dataframe contained a col1 component, which was visible because within() had made it available, get() was able to find an object with the required name and include it in the expression.
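Put together as a complete function, the get() approach might look like the sketch below. The function name add_pasted and the small data frame are illustrative. The arguments are deliberately called arg1 and arg2 rather than col1 and col2: inside within(), a data frame column named col1 would be found before a function argument of the same name, so get(col1) would be handed the column's values instead of the string.

```r
# Illustrative data with the same column names as in the question.
dataset <- data.frame(mydata = c(0.1, 0.2, 0.3), col1 = 2001:2003, col2 = 1:3)

# add_pasted is a hypothetical name; arg1 and arg2 hold column names as strings.
add_pasted <- function(dataframe, arg1, arg2) {
  # arg1/arg2 are found in the function's frame; get() then finds the
  # column of that name in the environment that within() sets up.
  within(dataframe, somenewcol <- paste(get(arg1), "-", get(arg2), sep = ""))
}

res <- add_pasted(dataset, "col1", "col2")
res$somenewcol
```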
But at this point you are trying to jump through too many hoops because your questions haven't been specific. I presume you are asking this because of my answer here to your previous Q. That answer suggested a better alternative than attach(). But you weren't clear there what some arguments were or what you really wanted to do. If I had known then what you really wanted to do, I would have suggested you use DWin's Answer above.
You seem to not want to hard code the column/component names. If you could hard code, this would be the solution:
res <- within(dataframe, somenewcol <- paste(col1, "-", col2, sep = ""))
But seeing as you don't want to hard code you need a get() version or DWin's solution.
Related
I have two fields:
FirstVisit
SecondVisit
I am building a function to pull data from either field depending on user input (heavily reduced yet relevant version of function):
pullData <- function(visit){
# Do something
}
What I am looking to do is for the function to take the user's input and use it to form part of the call to the data frame field.
For example, when the user runs:
pullData(First)
The function will run like this:
print(df$FirstVisit)
Conversely, when the user runs:
pullData(Second)
The function will run:
print(df$SecondVisit)
My function is considerably more complex than this, but this basic example relates to just the specific aspect of it that I am trying to work out.
So far I have tried something like:
print(paste0(df["df$", visit, "Visit", ])
# The intention is to result in df$FirstVisit or df$SecondVisit depending on the input
And this:
print(paste0(df[df$", visit, "Visit, ])
# Again, intended result should be df$FirstVisit or df$SecondVisit, depending on the input
among other alternatives (some with paste()), yet nothing has worked so far.
I suspect that it is possible and feel that I am close.
How can I achieve this?
If you really want to run the function like pullData(First), you need to use metaprogramming (to get the name of the argument instead of the argument's value), like
pullData <- function(...) {
arg <- rlang::ensyms(...)
if(length(arg)!=1) stop("invalid argument in pullData")
dataName <- paste0(as.character(arg[[1]]),"Visit")
print(df[[dataName]])
}
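The same idea can be sketched in base R without rlang, using substitute() and deparse(). The names pull_visit and visits are illustrative, and the data frame is passed in as an argument rather than taken from the global df:

```r
# Illustrative data frame with the two fields from the question.
visits <- data.frame(FirstVisit = 11:12, SecondVisit = 21:22)

pull_visit <- function(data, visit) {
  # substitute() captures the unevaluated argument; deparse() turns it into a string
  name <- paste0(deparse(substitute(visit)), "Visit")
  data[[name]]
}

pull_visit(visits, First)
pull_visit(visits, Second)
```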
If you can manage to call the function with a character-argument like pullData("First"), you can simply do:
pullData <- function(choice = "First") {
dataName <- paste0(choice,"Visit")
print(df[[dataName]])
}
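If the set of valid choices is known in advance, the character version can also validate its input with match.arg(). This is a sketch under that assumption; pull_choice and visits are illustrative names:

```r
# Illustrative data frame with the two fields from the question.
visits <- data.frame(FirstVisit = 11:12, SecondVisit = 21:22)

pull_choice <- function(data, choice = c("First", "Second")) {
  # match.arg() checks choice against the default vector and errors otherwise
  choice <- match.arg(choice)
  data[[paste0(choice, "Visit")]]
}

pull_choice(visits, "Second")
```

An unknown value such as pull_choice(visits, "Third") stops with an error instead of silently returning NULL.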
I am not quite sure if this is what you're going for, but here's a possible solution:
pullData <- function(visit){
visit <- rlang::quo_text(rlang::enquo(visit))
visit <- tolower(visit)
if (visit %in% c("first", "firstvisit")){
data <- df$FirstVisit
}
if (visit %in% c("second", "secondvisit")){
data <- df$SecondVisit
}
data
}
Using this sample data:
df <- data.frame(FirstVisit = c("first value"),
SecondVisit = c("second value"))
Gets us:
> pullData(first)
[1] "first value"
> pullData(second)
[1] "second value"
For the sake of completeness, R allows for partial matching when subsetting with character indices; see help("$").
df <- data.frame(FirstVisit = 11:12, SecondVisit = 21:22)
For interactive use:
df$F
[1] 11 12
df$S
[1] 21 22
For programming on computed indices, the [[ operator has to be used, e.g.,
df[["F", exact = FALSE]]
[1] 11 12
This can be wrapped in a function call:
pullData <- function(x) df[[x, exact = FALSE]]
Thus,
pullData("F")
pullData("Fi")
pullData("First")
pullData("FirstVisit")
return all
[1] 11 12
while
pullData("S")
pullData("Second")
return
[1] 21 22
But watch out when dealing with user-supplied input, as typos might lead to unexpected results:
pullData("f")
pullData("first")
pullData("Frist")
NULL
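One way to keep prefix matching while turning such typos into errors is to resolve the prefix with pmatch() first. A minimal sketch, with pull_prefix and visits as illustrative names:

```r
# Illustrative data frame with the two fields from the question.
visits <- data.frame(FirstVisit = 11:12, SecondVisit = 21:22)

pull_prefix <- function(data, prefix) {
  # pmatch() returns NA when the prefix matches no column name or is ambiguous
  idx <- pmatch(prefix, names(data))
  if (is.na(idx)) {
    stop("no unique column name starts with '", prefix, "'")
  }
  data[[idx]]
}

pull_prefix(visits, "Fi")
```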
Imagine you have a simple function that specifies which statistical tests to run for each variable. Its syntax, simplified for the purposes of this question is as follows:
test <- function(...) {
x <- list(...)
return(x)
}
which takes argument pairs such as Gender = 'Tukey', and intends to pass its result to other functions down the line. The output of test() is as follows:
test(Gender = 'Tukey')
# $Gender
# [1] "Tukey"
What is desired is the ability to replace the literal Gender by a dynamically assigned variable varname (e.g., for looping purposes). Currently what happens is:
varname <- 'Gender'
test(varname = 'Tukey')
# $varname
# [1] "Tukey"
but what is desired is this:
varname <- 'Gender'
test(varname = 'Tukey')
# $Gender
# [1] "Tukey"
I tried tinkering with functions such as eval() and parse(), but to no avail. In practice, I resolved the issue by simply renaming the resulting list, but it is an ugly solution and I am sure there is an elegant R way to achieve it. Thanks in advance for the educational value of your answer.
NB: This question occurred to me while trying to program a custom function which uses mcp() from the multcomp package in its internals. The said mcp() function is the real-world counterpart of test().
EDIT1: Perhaps it needs to be clarified that (for educational purposes) changing test() is not an option. The question is about how to pass the tricky argument to test(). If you take a look at NB, it becomes clear why: the real world counterpart of test(), namely mcp(), comes with a package. And while it is possible to create a modified copy of it, I am really curious whether there exists a simple solution in somehow 'converting' the dynamically assigned variable to a literal in the context of dot-arguments.
This works:
test <- function(...) {
x = list(...)
names(x) <- sapply(names(x),
function(p) eval(as.symbol(p)))
return(x)
}
apple = "orange"
test(apple = 5)
We can use
test <- function(...) {
x <- list(...)
if(exists(names(x))) names(x) <- get(names(x))
x
}
test(Gender = 'Tukey')
#$Gender
#[1] "Tukey"
test(varname = 'Tukey')
#$Gender
#[1] "Tukey"
What about this:
varname <- "Gender"
args <- list()
args[[varname]] <- "Tukey"
do.call(test, args)
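Since the question mentions looping, the do.call() idea can be sketched for several variable names at once. Here test() is the unmodified function from the question, and "Age" is a made-up second variable name used only for illustration:

```r
# test() exactly as given in the question
test <- function(...) {
  x <- list(...)
  return(x)
}

varnames <- c("Gender", "Age")  # "Age" is an invented example name

# setNames() builds a one-element argument list whose name comes from the variable
specs <- lapply(varnames, function(v) do.call(test, setNames(list("Tukey"), v)))
names(specs[[1]])
names(specs[[2]])
```

This leaves test() untouched, which matters when it stands in for a packaged function.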
I am trying to print the "result" of using table function, but when I tried to use the code here, I got something very strange:
for (i in 1:4){
print (table(paste("group",i,"$", "BMI_obese",sep=""), paste("group",i,"$","A1.1", sep="")))
}
This is the result in R output:
group1$A1.1
group1$BMI_obese 1
group2$A1.1
group2$BMI_obese 1
group3$A1.1
group3$BMI_obese 1
group4$A1.1
group4$BMI_obese 1
But when I type out the statement without typing inside the loop:
table(group2$BMI_obese, group2$A1.1)
I got what I want:
     1  2  3  4  5
  0 51 20  9  8  0
  1 37 20 15  6  4
Does anyone know which part of my for loop code is not correct or can be modified to fit my purpose of printing the loop table result?
Hi all, but now I have another problem. I am trying to add an inner loop that takes the column name as an argument, because I would like to loop through multiple columns for each group's data (i.e. for group1, I would like tables of BMI_obese vs A1.1, BMI_obese vs A1.2, ..., BMI_obese vs A1.15). This is my code, but it is not working. I think this is because A1.1, A1.2, ... are not recognized as columns of group1, group2, group3, group4, but are treated as strings instead. I am not sure how to fix it:
for (i in 2:4) {
for (j in c("A1.1","A1.2"))
{
print(with(get(paste0("group", i)),table(BMI_obese,j)))
}
}
I keep getting this error message:
Error in table(BMI_obese, j) : all arguments must have the same length
Okay, you are trying to construct a variable name using paste and then do a table. You are simply passing the name of the variable to table, not the variable object itself. For this sort of approach you want to use get():
for (i in 1:4) {
print(with(get(paste0("group", i)), table(BMI_obese, A1.1)))
}
#example saving as a list (using lapply rather than for loop)
group1 <- data.frame(x=LETTERS[1:10], y=(1:10)[sample(10, replace=TRUE)])
group2 <- data.frame(x=LETTERS[1:10], y=(1:10)[sample(10, replace=TRUE)])
result <- lapply(1:2, function(i) with(get(paste0("group", i)), table(x, y)))
#look at first six rows of each:
head(result[[1]])
head(result[[2]])
#example illustrating fetching objects from a string name
data(mtcars)
head(with(get("mtcars"), table(disp, cyl)))
head(with(get("mtcars"), table(disp, "cyl")))
#Error in table(disp, "cyl") : all arguments must have the same length
head(with(get("mtcars"), table(disp, get("cyl"))))
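For the nested loop in the follow-up question, where the column name arrives as a string, [[ can be used in place of with(). The sketch below uses two small made-up data frames in place of the real group1 to group4, and collects the tables in a list (tabs, an illustrative name) as well as printing them:

```r
# Made-up stand-ins for the real group data
group1 <- data.frame(BMI_obese = c(0, 1, 0, 1), A1.1 = c(1, 2, 1, 2), A1.2 = c(3, 3, 4, 4))
group2 <- data.frame(BMI_obese = c(1, 1, 0, 0), A1.1 = c(2, 2, 1, 3), A1.2 = c(5, 4, 4, 5))

tabs <- list()
for (i in 1:2) {
  grp <- get(paste0("group", i))          # fetch the data frame by name
  for (j in c("A1.1", "A1.2")) {
    # grp[[j]] fetches the column whose name is stored in j; dnn labels the table
    tab <- table(grp$BMI_obese, grp[[j]], dnn = c("BMI_obese", j))
    tabs[[paste0("group", i, "_", j)]] <- tab
    print(tab)
  }
}
```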
You could also use a combination of eval and parse like this:
x1 <- c(sample(10, 100, replace = TRUE))
y1 <- c(sample(10, 100, replace = TRUE))
table(eval(parse(text = paste0("x", 1))),
eval(parse(text = paste0("y", 1))))
But I'd also say it is not the nicest practice to access variables that way...
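A possibly cleaner alternative is to collect the data frames into a named list once and then iterate over the list, so that no names have to be pasted together inside the loop. A sketch with made-up data, using mget() to gather existing objects by name:

```r
# Made-up stand-ins for the real group data
group1 <- data.frame(x = c("A", "B", "A"), y = c(1, 1, 2))
group2 <- data.frame(x = c("A", "B", "B"), y = c(2, 2, 1))

# mget() returns a named list holding the objects called group1 and group2
groups <- mget(paste0("group", 1:2))

# one table per data frame, named after the data frame
tabs <- lapply(groups, function(g) table(g$x, g$y))
names(tabs)
```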
Your types are used wrong. See the difference:
table(group2$BMI_obese, group2$A1.1)
and
table(paste(...),paste(...))
So what type does paste return? Certainly some string.
EDIT:
paste(...) was not meant to be syntactically correct but an abbreviation for paste("group",i,"$", "BMI_obese",sep=""), or whatever you paste together.
paste(...) returns a string. If you pass that result to table(), you get a table of strings (the unexpected result that you got). What you want is to access the variables or fields whose names are returned by your paste(...). Just add an eval() around your paste, as Daniel said; note that the string has to go through parse(text = ...) first, because eval() on a bare string simply returns the string:
for (i in 1:4){
print(table(eval(parse(text = paste("group",i,"$", "BMI_obese",sep=""))), eval(parse(text = paste("group",i,"$","A1.1", sep="")))))
}
It seems possible to assign a vector of functions in R like this:
F <- c(function(){return(0)},function(){return(1)})
so that they can be invoked like this (for example): F[[1]]().
This gave me the impression I could do this:
DF <- data.frame(F=c(function(){return(0)}))
which results in the following error
Error in as.data.frame.default(x[[i]], optional = TRUE) : cannot
coerce class ""function"" to a data.frame
Does this mean it is not possible to put functions into a data frame? Or am I doing something wrong?
No, you cannot directly put a function into a data-frame.
You can, however, define the functions beforehand and put their names in the data frame.
foo <- function(bar) { return( 2 + bar ) }
foo2 <- function(bar) { return( 2 * bar ) }
df <- data.frame(c('foo', 'foo2'), stringsAsFactors = FALSE)
Then use do.call() to use the functions:
do.call(df[1, 1], list(4))
# 6
do.call(df[2, 1], list(4))
# 8
EDIT
The above work around will work as long as you have a named function.
The issue seems to be that R sees the class of the object as function, looks up the appropriate method for as.data.frame() (i.e. as.data.frame.function()), but can't find it. That causes a call to as.data.frame.default(), which is pretty much a wrapper for a stop() call with the message you reported.
In short, they just seem not to have implemented it for that class.
While you can't put a function or other object directly into a data.frame, you can make it work if you go via a matrix.
foo <- function() {print("qux")}
m <- matrix(c("bar", foo), nrow=1, ncol=2)
df <- data.frame(m)
df$X2[[1]]()
Yields:
[1] "qux"
And the contents of df look like:
X1 X2
1 bar function () , {, print("qux"), }
Quite why this works while the direct path does not, I don't know. I suspect that doing this in any production code would be a "bad thing".
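Another workaround, offered as a sketch rather than a recommendation: a data frame column can itself be a list, so functions can be stored by assigning a list to a column after the data frame has been created. The column names below are illustrative.

```r
# Create the data frame first, then add the functions as a list column
DF <- data.frame(name = c("zero", "one"))
DF$F <- list(function() 0, function() 1)  # one function per row

DF$F[[1]]()
DF$F[[2]]()
```

This avoids the coercion step inside data.frame() that produces the error in the question.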
Thanks in advance, and sorry if this question has been answered previously; I have looked pretty extensively. I have a dataset containing a row with concatenated information, specifically: name, color code, some function expression. For example, one value may be:
cost#FF0033#log(x)+6.
I have all of the code to extract the information, and I end up with a vector of expressions that I would like to convert to a list of actual functions.
For example:
func.list <- list()
test.func <- c("x","x+1","x+2","x+3","x+4")
where test.func is the vector of expressions. What I would like is:
func.list[[3]]
To be equivalent to
function(x){x+2}
I know that I can create a function using:
somefunc <- function(x){eval(parse(text="x+1"))}
to convert a character value into a function. The problem comes when I try and loop through to make multiple functions. For an example of something I tried that didn't work:
for(i in 1:length(test.func)){
temp <- test.func[i]
f <- assign(function(x){eval(expr=parse(text=temp))})
func.list[[i]] <- f
}
Based on another post (http://stats.stackexchange.com/questions/3836/how-to-create-a-vector-of-functions) I also tried this:
makefunc <- function(y){y;function(x){y}}
for(i in 1:length(test.func)){
func.list[[i]] <- assign(x=paste("f",i,sep=""),value=makefunc(eval(parse(text=test.func[i]))))
}
Which gives the following error: Error in eval(expr, envir, enclos) : object 'x' not found
The eventual goal is to take the list of functions and apply the jth function to the jth column of the data.frame, so that the user of the script can specify how to normalize each column within the concatenated information given by the column header.
Maybe initialize your list with a single generic function, and then update them using:
> foo <- function(x){x+3}
> body(foo) <- quote(x+4)
> foo
function (x)
x + 4
More specifically, starting from a character, you'd probably do something like:
body(foo) <- parse(text = "x+5")
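Applied to a whole vector of expressions, this could be wrapped in a small factory function. A sketch, with make_fun as an illustrative name and the assumption that every string is a single expression in x:

```r
# make_fun is a hypothetical helper: it builds function(x) <expression in txt>
make_fun <- function(txt) {
  f <- function(x) NULL
  body(f) <- parse(text = txt)[[1]]  # first (and only) parsed expression
  f
}

test.func <- c("x", "x+1", "x+2", "x+3", "x+4")
func.list <- lapply(test.func, make_fun)

func.list[[3]](10)  # evaluates x+2 with x = 10
```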
Just to add onto joran's answer, this is what finally worked:
test.data <- matrix(data=rep(1,25),5,5)
test.data <- data.frame(test.data)
test.func <- c("x","x+1","x+2","x+3","x+4")
func.list <- list()
for(i in 1:length(test.func)){
func.list[[i]] <- function(x){}
body(func.list[[i]]) <- parse(text=test.func[i])
}
processed <- mapply(do.call,func.list,lapply(test.data,list))
Thanks again, joran.
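The final step, applying the jth function to the jth column, can also be written with Map(), which some may find easier to read than mapply(do.call, ...). A sketch on a small made-up data frame:

```r
# Made-up data and functions, one function per column
test.data <- data.frame(a = c(1, 2), b = c(3, 4))
func.list <- list(function(x) x + 1, function(x) x * 2)

# Map() pairs the jth function with the jth column;
# assigning into processed[] keeps the data frame shape and column names
processed <- test.data
processed[] <- Map(function(f, column) f(column), func.list, test.data)
processed
```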
This is what I do:
f <- list(identity="x",plus1 = "x+1", square= "x^2")
funCreator <- function(snippet){
txt <- snippet
function(x){
exprs <- parse(text = txt)
eval(exprs)
}
}
listOfFunctions <- lapply(setNames(f,names(f)),function(x){funCreator(x)}) # I like to have some control of the names of the functions
listOfFunctions[[1]] # try to see what the actual function looks like?
library(pryr)
unenclose(listOfFunctions[[3]]) # good way to see the actual function http://adv-r.had.co.nz/Functional-programming.html
# Call your functions
listOfFunctions[[2]](3) # 3+1 = 4
do.call(listOfFunctions[[3]],list(3)) # 3^2 = 9
attach(listOfFunctions) # you can also attach your list of functions and call them by name
square(3) # 3^2 = 9
identity(7) # 7 ## masked object identity, better detach it now!
detach(listOfFunctions)