How does function(x) work in R?

data =data.frame(sapply(data,function(x) ifelse((x==999),NA,x)))
Could you explain how function(x) ..., x works? I know the input and output of this function: it just replaces 999 values with NA in a data frame. But I want to know how function(x) works in general.

This is your function with the name my_func. It takes a value x and checks whether it is 999: if not, it returns the value of x; if so, it returns NA:
my_func <- function(x) {
ifelse(x==999, NA, x)
}
You can now use this function as follows.
As a standalone named function:
my_func(mtcars$mpg)
As a named function with lapply or sapply:
sapply does the same job as lapply, but lapply always returns a list, while sapply simplifies the result to a vector (or matrix) where possible:
lapply(X, FUN)
Arguments:
-X: A vector or an object
-FUN: Function applied to each element of X
lapply(mtcars$mpg, my_func)
sapply(X, FUN)
Arguments:
-X: A vector or an object
-FUN: Function applied to each element of X
sapply(mtcars$mpg, my_func)
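To make the difference in return type concrete, here is a minimal sketch. The vector values is made up for illustration and is not part of the question.

```r
my_func <- function(x) ifelse(x == 999, NA, x)

# Hypothetical input, not from the question
values <- c(10, 999, 30)

as_list <- lapply(values, my_func)    # always a list, one element per input
as_vector <- sapply(values, my_func)  # simplified to an atomic vector

is.list(as_list)       # TRUE
is.numeric(as_vector)  # TRUE
is.na(as_vector[2])    # TRUE: the 999 was replaced by NA
```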
Now your question:
You can also use a function directly, without defining and naming it first. This is called an anonymous function (a function without a name). In this case, instead of my_func, you write:
lapply(mtcars$mpg, function(x) ifelse(x==999, NA, x))
Note: what follows function(x) is simply the body of your named function my_func.
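Applied to a whole data frame, as in the question, sapply hands each column to the function as x. The sketch below uses a small made-up data frame df (an assumption, not the asker's data) to show that the named and anonymous forms give the same result.

```r
my_func <- function(x) ifelse(x == 999, NA, x)

# Hypothetical data frame with 999 used as a missing-value code
df <- data.frame(a = c(1, 999, 3), b = c(999, 5, 6))

# sapply passes each column to the function as x
named <- data.frame(sapply(df, my_func))
anonymous <- data.frame(sapply(df, function(x) ifelse(x == 999, NA, x)))

identical(named, anonymous)  # TRUE
```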

Related

Non-standard evaluation in a user-defined function with lapply or with in R

I wrote a wrapper around ftable because I need to compute flat tables with frequency and percentage for many variables. As ftable method for class "formula" uses non-standard evaluation, the wrapper relies on do.call and match.call to allow the use of the subset argument of ftable (more details in my previous question).
mytable <- function(...) {
do.call(what = ftable,
args = as.list(x = match.call()[-1]))
# etc
}
However, I cannot use this wrapper with lapply nor with:
# example 1: error with "lapply"
lapply(X = warpbreaks[c("breaks",
"wool",
"tension")],
FUN = mytable,
row.vars = 1)
Error in (function (x, ...) : object 'X' not found
# example 2: error with "with"
with(data = warpbreaks[warpbreaks$tension == "L", ],
expr = mytable(wool))
Error in (function (x, ...) : object 'wool' not found
These errors seem to be due to match.call not being evaluated in the right environment.
As this question is closely linked to my previous one, here is a summary of my problems:
The wrapper with do.call and match.call cannot be used with lapply or with.
The wrapper without do.call and match.call cannot use the subset argument of ftable.
And a summary of my questions:
How can I write a wrapper that can both use the subset argument of ftable and be called from lapply and with? I have ideas for avoiding lapply and with altogether, but I would like to understand and correct these errors to improve my knowledge of R.
Is the error with lapply related to the following note from ?lapply?
For historical reasons, the calls created by lapply are unevaluated,
and code has been written (e.g., bquote) that relies on this. This
means that the recorded call is always of the form FUN(X[[i]], ...),
with i replaced by the current (integer or double) index. This is not
normally a problem, but it can be if FUN uses sys.call or match.call
or if it is a primitive function that makes use of the call. This
means that it is often safer to call primitive functions with a
wrapper, so that e.g. lapply(ll, function(x) is.numeric(x)) is
required to ensure that method dispatch for is.numeric occurs
correctly.
The problem with using match.call with lapply is that match.call returns the literal call that was passed into it, without any interpretation. To see what's going on, let's make a simpler function which shows exactly how your function is interpreting the arguments passed into it:
match_call_fun <- function(...) {
call = as.list(match.call()[-1])
print(call)
}
When we call it directly, match.call correctly gets the arguments and puts them in a list that we can use with do.call:
match_call_fun(iris['Species'], 9)
[[1]]
iris["Species"]
[[2]]
[1] 9
But watch what happens when we use lapply (I've only included the output of the internal print statement):
lapply('Species', function(x) match_call_fun(iris[x], 9))
[[1]]
iris[x]
[[2]]
[1] 9
Since match.call gets the literal arguments passed to it, it receives iris[x], not the properly interpreted iris['Species'] that we want. When we pass those arguments into ftable with do.call, it looks for an object x in the current environment, and then returns an error when it can't find it. We need to interpret iris[x] in the environment where x is defined.
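The point that match.call captures an unevaluated expression can be checked directly. In this sketch, show_args and wrapper are hypothetical helper names: wrapper stands in for the anonymous function passed to lapply.

```r
# Returns the unevaluated arguments of the call as a list
show_args <- function(...) as.list(match.call()[-1])

# Hypothetical wrapper standing in for the anonymous function in lapply
wrapper <- function(x) show_args(iris[x], 9)

captured <- wrapper("Species")
deparse(captured[[1]])  # "iris[x]"

# The captured expression can only be evaluated where x is defined
names(eval(captured[[1]], list(x = "Species")))  # "Species"
```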
As you've seen, adding envir = parent.frame() fixes the problem. This is because adding that argument tells do.call to evaluate iris[x] in the parent frame, which is the anonymous function in lapply where x has its proper meaning. To see this in action, let's make another simple function that uses do.call to print ls from 3 different environmental levels:
z <- function(...) {
print(do.call(ls, list()))
print(do.call(ls, list(), envir = parent.frame()))
print(do.call(ls, list(), envir = parent.frame(2)))
}
When we call z() from the global environment, we see the empty environment inside the function, then the Global Environment:
z()
character(0) # Interior function environment
[1] "match_call_fun" "y" "z" # GlobalEnv
[1] "match_call_fun" "y" "z" # GlobalEnv
But when we call it from within lapply, we see that one level up from z is now lapply's own frame:
lapply(1, z)
character(0) # Interior function environment
[1] "FUN" "i" "X" # lapply
[1] "match_call_fun" "y" "z" # GlobalEnv
So, by adding envir = parent.frame(), do.call evaluates iris[x] in the frame of the anonymous function passed to lapply, where x is 'Species', and the call evaluates correctly.
mytable_envir <- function(...) {
tab <- do.call(what = ftable,
args = as.list(match.call()[-1]),
envir = parent.frame())
prop <- prop.table(x = tab,
margin = 2) * 100
bind <- cbind(as.matrix(x = tab),
as.matrix(x = prop))
margin <- addmargins(A = bind,
margin = 1)
round(x = margin,
digits = 1)
}
# This works!
lapply(X = c("breaks","wool","tension"),
FUN = function(x) mytable_envir(warpbreaks[x],row.vars = 1))
As for why adding envir = parent.frame() makes a difference even though it appears to be the default: I'm not 100% sure, but my guess is that when the default argument is used, parent.frame() is evaluated inside do.call, returning the environment from which do.call was called (the body of the wrapper). What we're doing, however, is calling parent.frame() outside do.call, in the wrapper itself, which means it returns one level higher than the default version.
Here's a test function that takes parent.frame() as a default value:
fun <- function(y=parent.frame()) {
print(y)
print(parent.frame())
print(parent.frame(2))
print(parent.frame(3))
}
Now look at what happens when we call it from within lapply both with and without passing in parent.frame() as an argument:
lapply(1, function(y) fun())
<environment: 0x12c5bc1b0> # y argument
<environment: 0x12c5bc1b0> # parent.frame called inside
<environment: 0x12c5bc760> # 1 level up = lapply
<environment: R_GlobalEnv> # 2 levels up = globalEnv
lapply(1, function(y) fun(y = parent.frame()))
<environment: 0x104931358> # y argument
<environment: 0x104930da8> # parent.frame called inside
<environment: 0x104931358> # 1 level up = lapply
<environment: R_GlobalEnv> # 2 levels up = globalEnv
In the first example, the value of y is the same as what you get when you call parent.frame() inside the function. In the second example, the value of y is the same as the environment one level up (inside lapply). So, while they look the same, they're actually doing different things. In the first example, parent.frame is evaluated inside the function when it sees that there is no y= argument; in the second, parent.frame is evaluated in the lapply anonymous function first, before calling fun, and is then passed into it.
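The same distinction can be tested without reading environment addresses. This is a minimal sketch with hypothetical helper names (where, caller): a default argument is evaluated inside the called function, while a supplied argument is evaluated in the caller.

```r
# Returns whatever environment its argument evaluates to
where <- function(env = parent.frame()) env

caller <- function() {
  own <- environment()
  c(
    # default: parent.frame() runs inside where(), so it is caller()'s frame
    default = identical(where(), own),
    # supplied: parent.frame() runs in caller(), so it is one level higher
    explicit = identical(where(env = parent.frame()), own)
  )
}

res <- caller()
res
```

Here res["default"] is TRUE and res["explicit"] is FALSE, which matches the behaviour described for do.call.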
As you only want to pass all the arguments on to ftable, you do not need do.call().
mytable <- function(...) {
tab <- ftable(...)
prop <- prop.table(x = tab,
margin = 2) * 100
bind <- cbind(as.matrix(x = tab),
as.matrix(x = prop))
margin <- addmargins(A = bind,
margin = 1)
return(round(x = margin,
digits = 1))
}
The following lapply creates a separate table for every variable. I don't know if that is what you want.
lapply(X = c("breaks",
"wool",
"tension"),
FUN = function(x) mytable(warpbreaks[x],
row.vars = 1))
If you want all 3 variables in 1 table:
warpbreaks$newVar <- LETTERS[3:4]
lapply(X = list(c("breaks", "wool", "tension"),
                c("newVar", "tension", "wool")),
       FUN = function(x) mytable(warpbreaks[, x],
                                 row.vars = 1))
Thanks to this issue, the wrapper became:
# function 1
mytable <- function(...) {
do.call(what = ftable,
args = as.list(x = match.call()[-1]),
envir = parent.frame())
# etc
}
Or:
# function 2
mytable <- function(...) {
mc <- match.call()
mc[[1]] <- quote(expr = ftable)
eval.parent(expr = mc)
# etc
}
I can now use the subset argument of ftable, and use the wrapper in lapply:
lapply(X = warpbreaks[c("wool",
"tension")],
FUN = function(x) mytable(formula = x ~ breaks,
data = warpbreaks,
subset = breaks < 15))
However I do not understand why I have to supply envir = parent.frame() to do.call as it is a default argument.
More importantly, these methods do not resolve another issue: I can not use the subset argument of ftable with mapply.
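For reference, the match.call() plus eval.parent() idiom from function 2 can be exercised in isolation. This is a sketch only: count_table and column_sets are hypothetical names, and the post-processing from the original wrapper is left out.

```r
# Re-dispatch the caller's expression to ftable, evaluated in the caller's frame
count_table <- function(...) {
  mc <- match.call()
  mc[[1]] <- quote(ftable)
  eval.parent(mc)
}

# Hypothetical column selections from the built-in warpbreaks data
column_sets <- list(c("wool", "tension"), c("tension", "wool"))

# cols exists only inside the anonymous function; eval.parent() finds it there
tables <- lapply(column_sets, function(cols) count_table(warpbreaks[cols]))
tables[[1]]
```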

R: get the name of a function that is stored in a variable

In my R program, I have a function that gets a function as an argument and inside this function I would like to get its name as a string (i.e. the function name of this argument that is supposed to be a function).
You might be looking for substitute:
f <- function(x) { substitute(x) }
f(mean)
Yields:
mean
which is a symbol. To get it as a string instead, add deparse:
f <- function(x) { deparse(substitute(x)) }
f(mean)
Yields:
[1] "mean"
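One caveat worth seeing in code: substitute() reports the expression written at the call site, so the name is lost if the function is passed through another function first. The names fun_name and via_wrapper below are hypothetical.

```r
fun_name <- function(f) deparse(substitute(f))

fun_name(mean)  # "mean"

# Hypothetical wrapper: substitute() now sees the wrapper's own argument name
via_wrapper <- function(g) fun_name(g)
via_wrapper(mean)  # "g"
```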

Update first value in each list element in R

I would like to replace the first value in each list element with the second value from the same element.
For example I would like a function to transform lst into lst2
lst<-list(c(0:4),c(5:9))
lst
lst2<-list(c(1, c(1:4)),c(6,c(6:9)))
lst2
I know that I can do
lst[[1]][1]=lst[[1]][2]
lst[[2]][1]=lst[[2]][2]
But I would like a function to iterate over all list elements. I have tried various things (all unsuccessful) with lapply such as:
lapply(list, function(x) x[1]=x[2])
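We can use lapply to loop over the list. With an anonymous function we need to return x explicitly, because the assignment x[1] <- x[2] on its own evaluates to the assigned value rather than to the modified vector.
lstN <- lapply(lst, function(x) {
  x[1] <- x[2]
  x
})
all.equal(lst2, lstN)
#[1] TRUE
Note that identical(lst2, lstN) is FALSE here, because lst2 was built with c(1, 1:4) and therefore holds doubles, while lstN holds integers.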
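An alternative sketch uses base R's replace(), which returns the modified vector, so no explicit return line is needed. The name lst_replaced is an assumption for illustration.

```r
lst <- list(0:4, 5:9)

# replace(x, 1, x[2]) copies x, sets element 1 to x[2], and returns the copy
lst_replaced <- lapply(lst, function(x) replace(x, 1, x[2]))

lst_replaced[[1]]  # 1 1 2 3 4
lst_replaced[[2]]  # 6 6 7 8 9
```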

How to make the elements of a matrix callable functions

I want to make a matrix of functions (that I wrote). Then access them element wise and call.
So I have func1(x), func2(y), func3(z) and func4(t), which are four R functions I wrote that work fine. They return numerics.
Now if I do:
a_matrix <- matrix(c(a=func1,b=func2,c=func3,d=func4),2,2)
a_func<-a_matrix[1,1]
a_func(x)
I get the following error:
error:attempt to call non-function.
If I use a list instead of a matrix,
a_list<-list(a=func1,b=func2,c=func3,d=func4)
a_func<-a_list$a
a_func(x)
gives the expected result:
typeof(a_list$a)
[1] "closure"
If I do :
typeof(a_matrix)
[1] "list"
typeof(a_matrix[1,1])
[1] "list"
(I am using R 3.1.1.)
When you create non-atomic matrices like that, they are basically made into fancy lists. Similar rules apply to these lists as to regular lists when it comes to indexing; namely that [ ] will always return another list, and [[ ]] will extract the element without the list wrapper. You really want
func1 <- function(x) x+1
func2 <- function(x) x+2
func3 <- function(x) x+3
func4 <- function(x) x+4
a_matrix <- matrix(c(a=func1,b=func2,c=func3,d=func4),2,2)
a_func <- a_matrix[[1,1]]
a_func(5)
# [1] 6
You'd get the same results with your standard list syntax if you did
a_list <- list(a=func1,b=func2,c=func3,d=func4)
a_func <- a_list["a"]
a_func(5)
# Error: could not find function "a_func"
a_func <- a_list[["a"]]
a_func(5)
# [1] 6
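Building on this, every function in the matrix can be called in one pass. This is a sketch under the assumption that each function takes one numeric argument and returns a single number; funs and out are hypothetical names.

```r
# Hypothetical functions, stored in a 2 x 2 list-matrix
funs <- matrix(list(function(x) x + 1, function(x) x + 2,
                    function(x) x + 3, function(x) x + 4), 2, 2)

is.function(funs[1, 1])    # FALSE: a list of length 1
is.function(funs[[1, 1]])  # TRUE: the function itself

# Call each function with the same input and restore the matrix shape
out <- matrix(vapply(funs, function(f) f(5), numeric(1)),
              nrow(funs), ncol(funs))
out
```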

Positional Matching Trickery

I wrote this nifty function to apply a function for every combination of vectorized arguments:
require(plyr)
require(ggplot2)
###eapply accepts a function and a call to expand.grid
###where columns created by expand.grid must correspond to arguments of fun
##each row created by expand.grid will be called by fun independently
###arguments
##fun either a function or a non-empty character string naming the function to be called.
###... vectors, factors, or a list containing these
###value
###a data frame
##Details
##at this time, elements of ... must be at least partially named to match args of fun
##positional matching does not work
###from the ddply documentation page:
###The most unambiguous behaviour is achieved when fun returns a data frame - in that case pieces will
###be combined with rbind.fill. If fun returns an atomic vector of fixed length, it will be rbinded
###together and converted to a data frame. Any other values will result in an error.
eapply <- function(fun,...){
if(!is.character(fun)) fun <- as.character(substitute(fun))
adply(
expand.grid(...),
1,
function(x,fun) do.call(fun,x),
fun
)
}
##example use:
m <- function(n,visit.cost){
if(n*visit.cost < 250){
c("total.cost"=n*visit.cost)
}else{
c("total.cost"=250 + (n*visit.cost-250)*.25)
}
}
d <- eapply(m, n=1:30, visit.cost=c(40,60,80,100))
ggplot(d,aes(x=n,y=total.cost,color=as.factor(visit.cost),group=visit.cost)) + geom_line()
How can I rewrite the function such that the arguments passed to expand.grid need not be named:
d <- eapply(m, 1:30, c(40,60,80,100))
Alternatively, are there any existing functions that have similar functionality?
Not the most elegant, but this works. Most importantly, it allows you to pass variables to expand.grid without naming them. Note that, as written, it is hard-coded for functions of exactly two arguments.
eeyore <- function(fun, ...){
if(!is.character(fun)) fun <- as.character(substitute(fun))
f <- match.fun(fun)
args <- as.list(substitute(list(...)))[-1]
foo <- expand.grid(llply(args, eval))
foo$F <- apply(foo, 1, function(x) { f(x[[1]], x[[2]])})
foo
}
d <- eeyore(m, 1:30, c(40,60,80,100))
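A base R sketch that is not limited to two arguments is shown below. It is one possible approach, not the answerer's method: grid_apply is a hypothetical name, it uses mapply instead of plyr, and it assumes fun returns a single value per combination.

```r
# Example function from the question
m <- function(n, visit.cost) {
  if (n * visit.cost < 250) {
    c("total.cost" = n * visit.cost)
  } else {
    c("total.cost" = 250 + (n * visit.cost - 250) * .25)
  }
}

grid_apply <- function(fun, ...) {
  f <- match.fun(fun)
  grid <- expand.grid(..., KEEP.OUT.ATTRS = FALSE)
  # Drop column names so each column is matched to fun by position
  cols <- unname(as.list(grid))
  grid$value <- unname(do.call(mapply, c(list(FUN = f), cols)))
  grid
}

d <- grid_apply(m, 1:3, c(40, 100))
d
```

Because the column names are dropped before the call, named arguments such as n = 1:3 also work, but they are then matched by position, not by name.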
