R mapply with named arguments

One fear I have when using mapply in R is that I may mix up the order of the arguments and unknowingly generate garbage results.
mydata<-data.frame(Temperature=foobar,Pressure=foobar2)
myfunction<-function(P,T)
{
....
}
mapply(FUN = myfunction,mydata$Temperature,mydata$Pressure)
Is there a way to utilize named arguments to avoid this sort of error via mapply?

To match the function's parameters by name rather than by position, name the arguments passed to Map/mapply after the function's parameters:
mapply(FUN = myfunction,T=mydata$Temperature,P=mydata$Pressure)
Because the example function below is already vectorized, we can also call it directly instead of going through mapply:
do.call(myfunction, unname(mydata[2:1]))
data
mydata <- data.frame(Temperature = 1:5, Pressure = 16:20)
myfunction <- function(P, T) {P*5 + T*10}
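As a quick check of the point above, here is a minimal sketch using the same data and function. The result names (named, swapped, positional) are only illustrative. It suggests that with named arguments the order in which the vectors are passed no longer matters, while the positional call from the question silently binds Temperature to P.

```r
mydata <- data.frame(Temperature = 1:5, Pressure = 16:20)
myfunction <- function(P, T) {P*5 + T*10}

# Named arguments: matched to P and T by name, whatever the order
named   <- mapply(FUN = myfunction, T = mydata$Temperature, P = mydata$Pressure)
swapped <- mapply(FUN = myfunction, P = mydata$Pressure, T = mydata$Temperature)

# Positional arguments: the first vector is bound to P, the second to T,
# so Temperature ends up in P and Pressure in T
positional <- mapply(FUN = myfunction, mydata$Temperature, mydata$Pressure)

print(named)
print(positional)
```

Here named and swapped agree, while positional differs because the two columns are swapped.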

Related

combining do.call() and debug() prints all arguments contents

I am using R and trying to debug a function that I call with do.call() for convenience.
Combining do.call() and browser() is problematic. Basically, all the elements of the list of arguments passed to do.call() are printed, which, if the list contains for example a very large data table, is not sustainable.
Here is a reprex. I create a simple getsum() function that sums elements of a vector. I create a func() function that calls getsum() for a list of vectors.
#getsum returns the sum of a vector's elements
#func returns the vector of the sum of a list of vectors
func <- function(vec_list){
browser()
sums = lapply(FUN=getsum, X=vec_list)
sum = unlist(sums)
return(sum)
}
getsum <- function(vec){
sum = sum(vec)
return(sum)
}
args = list(vec_list=list(rnorm(5), rnorm(5)))
do.call(func, args)
That's the output I get :
Called from: (function(vec_list){
browser()
sums = lapply(FUN=getsum, X=vec_list)
sum = unlist(sums)
return(sum)
})(vec_list = list(c(-0.0801864335418185, 0.448324209935905,
-2.86518616779484, -0.359284963520417, -0.620062639582574), c(1.74835180362954,
-0.904288222154223, 0.746007117029027, 0.625889703799832, -0.908748727897187
)))
Browse[1]>
One might tell me "why are you using do.call()?". Indeed, if I simply call the function myself, the problem does not arise (see below). In this example I don't need to use do.call(), but sometimes it's extremely convenient.
#instead of do.call() use :
func(vec_list=args$vec_list)
The output is then :
Called from: func(vec_list = args$vec_list)
Browse[1]>
EDIT :
I have tried browser(skipCalls=TRUE), which solves the problem but defeats the purpose of browser(): it makes R execute all of the function's commands at once. Other suggestions are welcome.
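One possible workaround, offered as a sketch rather than a confirmed fix: pass do.call() the function's name as a string and the arguments as quoted symbols, so the constructed call refers to the objects by name instead of embedding their values. The helper show_call below is hypothetical. It uses sys.call() as a non-interactive stand-in for what browser() prints after "Called from:", and big is an illustrative variable name.

```r
# Hypothetical stand-in for a function containing browser():
# it returns the call as R sees it, as a single string
show_call <- function(vec_list) {
  paste(deparse(sys.call()), collapse = "")
}

big <- list(rnorm(5), rnorm(5))

# Values are embedded in the call, as in the question
by_value <- do.call(show_call, list(vec_list = big))

# Function given by name, argument given as a quoted symbol:
# the call only mentions the names
by_name <- do.call("show_call", list(vec_list = quote(big)))

print(by_name)
```

The quoted symbol is evaluated in the environment given by do.call()'s envir argument (the caller by default), so big must be visible there.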

R: Use (m?)apply with a function that returns a list as the argument to the function to be applied over?

I have a function of two arguments foo(a,b). As the input to this function, I want to use every row of the output of combinations(10,2) from the gtools library. I've tried to get it to work with mapply and I really had high hopes for apply(combinations(10,2),1,foo), but everything that I've attempted throws the error "argument "b" is missing, with no default". How can I correct this without storing combinations(10,2) in memory and dividing it up? I suspect that I'm missing a trick with Vectorize.
For a simple reproducible example, use beta(a,b) in place of foo(a,b).
What I very specifically do not want to do is anything like:
a<-combinations(10,2)
mapply(foo,a[,1],a[,2])
because I do not want to store combinations(10,2) in memory.
Here we can use do.call with mapply or Map
do.call(mapply, c(FUN = foo, asplit(combinations(10, 2), 2)))
Or with Map (returns a list)
do.call(Map, c(f = foo, asplit(combinations(10, 2), 2)))
As a reproducible example, we can use beta
do.call(Map, c(f = beta, asplit(combinations(10, 2), 2)))
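For readers without gtools installed, the following sketch uses base R's combn() as a stand-in for combinations(). That substitution is an assumption, not part of the answer above. combn() also accepts a FUN argument which, as far as I understand, is applied to each combination as it is generated, so only the results are kept. The names pairs, via_mapply and via_combn are illustrative.

```r
# combn() returns one combination per column (a 2 x 45 matrix here)
pairs <- combn(10, 2)

# Splitting the matrix by row and handing the pieces to mapply
via_mapply <- mapply(beta, pairs[1, ], pairs[2, ])

# Letting combn() call the function on each pair directly;
# p is the length-2 vector holding one combination
via_combn <- combn(10, 2, FUN = function(p) beta(p[1], p[2]))

print(head(via_combn))
```

The wrapper function(p) is what turns the single length-2 vector into the two separate arguments that beta expects, which is also why apply(combinations(10,2),1,foo) complains about a missing b.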

How can I create a function using variables in a data frame

I'm sure the question is a bit dumb (sorry)... I'm trying to create a function using different variables I have stored in a data frame. The function looks like this:
mlr_turb <- function(Cond_in, Flow_in, pH_in, pH_out, Turb_in, nm250_i, nm400_i, nm250_o, nm400_o){
Coag = (+0.032690 + 0.090289*Cond_in + 0.003229*Flow_in - 0.021980*pH_in - 0.037486*pH_out
+0.016031*Turb_in -0.026006*nm250_i +0.093138*nm400_o - 0.397858*nm250_o - 0.109392*nm400_o)/0.167304
return(Coag)
}
m4_turb <- mlr_turb(dataset)
The problem appears when I try to run my function on a data frame (whose columns have the same names as the function's arguments). It doesn't detect my variables and shows this message:
Error in mlr_turb(dataset) :
argument "Flow_in" is missing, with no default
But that variable is actually there, as are all the others.
I think I have misplaced or left out something in the function that would let it take the variables from the dataset. I have searched a lot but have not found an answer...
No dumb questions!
I think you're looking for do.call. This function allows you to unpack values into a function as arguments. Here's a really simple example.
# a simple function that takes x, y and z as arguments
myFun <- function(x, y, z){
result <- (x + y)/z
return(result)
}
# a simple data frame with columns x, y and z
myData <- data.frame(x=1:5,
y=(1:5)*pi,
z=(11:15))
# unpack the values into the function using do.call
do.call('myFun', myData)
Output:
[1] 0.3765084 0.6902654 0.9557522 1.1833122 1.3805309
You have run into a standard problem in R: standard evaluation (SE) versus non-standard evaluation (NSE). If you need more details, you can have a look at this blog post I wrote.
I think the most convenient way to write a function that uses variables is to use the variable names as arguments of the function.
Let's take @Muon's example again.
# a simple function that takes x, y and z as arguments
myFun <- function(x, y, z){
result <- (x + y)/z
return(result)
}
The question is where R finds the values behind the names x, y and z. Inside a function, R looks first in the function's own environment (here x, y and z are defined as parameters), then in the environment where the function was defined (typically the global environment), and finally in the attached packages.
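A small sketch of that lookup order, using hypothetical function names (uses_outer_z, uses_param_z) that are not part of the answer:

```r
z <- 100  # defined outside any function

# No local z: R falls back to the environment where the function was defined
uses_outer_z <- function(x) x + z

# z is a parameter: the local value wins over the outer one
uses_param_z <- function(x, z) x + z

outer_result <- uses_outer_z(1)
param_result <- uses_param_z(1, z = 2)

print(c(outer_result, param_result))
```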
In myFun, R expects vectors. If you pass a column name instead, you will get an error. What if you want to pass a column name? You must tell R that the name should be looked up within the data frame. For instance, you can do something like this:
myFun <- function(df, col1 = "x", col2 = "y", col3 = "z"){
result <- (df[,col1] + df[,col2])/df[,col3]
return(result)
}
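A usage sketch for this column-name version follows. It uses df[[col]] instead of df[, col], a small variation that should behave the same for ordinary data frames, and the second data frame (renamed, with columns a, b and c) is hypothetical.

```r
# Column names are passed as strings and looked up inside the data frame
myFun <- function(df, col1 = "x", col2 = "y", col3 = "z") {
  (df[[col1]] + df[[col2]]) / df[[col3]]
}

myData <- data.frame(x = 1:5, y = (1:5) * pi, z = 11:15)
default_cols <- myFun(myData)

# Hypothetical data frame with different column names
renamed <- data.frame(a = 1:5, b = (1:5) * pi, c = 11:15)
other_cols <- myFun(renamed, col1 = "a", col2 = "b", col3 = "c")

print(default_cols)
```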
You can go much further in this direction with the data.table package. If you start writing functions that need to use variables from a data frame, I recommend having a look at this package.
I like Muon's answer, but I couldn't get it to work when the data.frame contains columns that are not arguments of the function. Using the with() function is a simple way to make this work as well...
#Code from Muon:
# a simple function that takes x, y and z as arguments
myFun <- function(x, y, z){
result <- (x + y)/z
return(result)
}
# a simple data frame with columns x, y and z
myData <- data.frame(x=1:5,
y=(1:5)*pi,
z=(11:15),
a=6:10) #adding a var not used in myFun
# unpack the values into the function using do.call
do.call('myFun', myData)
#generates an error for the unused "a" column
#using with() function:
with(myData, myFun(x, y, z))
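Another option, sketched here as a variation on the two answers above, is to keep do.call() but first select only the columns that the function declares as arguments. names(formals()) returns those argument names. The variable names needed, via_docall and via_with are illustrative.

```r
myFun <- function(x, y, z) {
  (x + y) / z
}

myData <- data.frame(x = 1:5, y = (1:5) * pi, z = 11:15,
                     a = 6:10)  # extra column not used by myFun

# Argument names declared by myFun: "x" "y" "z"
needed <- names(formals(myFun))

# Pass only the matching columns, so the unused "a" column causes no error
via_docall <- do.call(myFun, myData[needed])
via_with   <- with(myData, myFun(x, y, z))

print(via_docall)
```

This assumes every argument of the function has a column of the same name; a missing column would raise an error at the subsetting step.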

Applying a function with two arguments using Lapply

I have created a test function, called testFunc which expects two arguments.
testFunc<-function(x,y){
length(x)
nrow(y)
}
Now I want to use lapply to apply this function to a list, keeping the y argument fixed.
Consider a test list, testList:
testList<-list(a=c(1,2,3,4,5,5,6),b=c(1,2,4,5,6,7,8))
Can we use lapply to run testFunc on testList$a and testList$b with same value of y?
I tried this call:
lapply(X = testList, FUN = testFunc, someDataFrame)
But I always get the length of someDataFrame as the output. Am I missing something obvious?
Change your function to
testFunc<-function(x,y){
return(c(length(x), nrow(y)))
}
By default, an R function returns the value of the last evaluated expression, so the original testFunc returned only nrow(y).
The simplest way is to pass y as a named argument:
lapply(X = testList, FUN=testFunc, y=someDataFrame)
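Putting both answers together gives a runnable sketch. The three-row data frame assigned to someDataFrame is a hypothetical stand-in, since the question does not show it. Extra arguments given to lapply are forwarded to FUN after the list element, and an anonymous wrapper function is an equivalent alternative.

```r
testFunc <- function(x, y) {
  c(length(x), nrow(y))
}

testList <- list(a = c(1, 2, 3, 4, 5, 5, 6), b = c(1, 2, 4, 5, 6, 7, 8))
someDataFrame <- data.frame(id = 1:3)  # hypothetical example data

# y passed by name through lapply's ... argument
by_name <- lapply(X = testList, FUN = testFunc, y = someDataFrame)

# Equivalent: an anonymous function that fixes y explicitly
by_wrapper <- lapply(testList, function(v) testFunc(v, someDataFrame))

print(by_name)
```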

How do I call a namespaced function without evaluating the parameters you are giving it?

In R, the idiomatic way to call another function without evaluating the parameters you give it is apparently as follows:
Call <- match.call(expand.dots = TRUE)
# Modify parameters here as needed and set unneeded ones to NULL.
Call[[1L]] <- as.name("name.of.function.to.be.called.here")
eval.parent(Call)
However, when I put a namespaced name (e.g. utils::write.csv) in the as.name() call, I get an error:
could not find function "utils::write.csv"
What is the proper way of using this R idiom to call a namespaced function?
Here is a solution using do.call(), which both constructs and evaluates the function call.
Like the approach you started with, this one uses the fact that R calls behave like lists in which: (a) the first element is the function (or its name); and (b) all following elements are arguments to that function.
j <- function(x, file) {
Call <- match.call(expand.dots = TRUE)
arglist <- as.list(Call)[-1]
do.call(utils::write.csv, arglist)
}
dat <- data.frame(x=1:10, y=rnorm(10))
j(dat, file="outfilename.csv")
EDIT: FWIW, here's an example from plot.formula in base R, which uses a construct similar to the one above:
{
m <- match.call(expand.dots = FALSE)
eframe <- parent.frame()
. . .
. . .
m <- as.list(m)
m[[1L]] <- stats::model.frame.default
m <- as.call(c(m, list(na.action = NULL)))
mf <- eval(m, eframe)
. . .
. . .
}
The function uses the do.call() construct later on. Going a bit deeper into the weeds, my reading is that the snippet shown here instead uses several steps mostly because it needs to add na.action=NULL to the list of arguments.
In any case, it looks like the do.call() option is as close to canonical as could be desired.
As @Josh O'Brien answered, do.call is much more straightforward to use.
The first argument to do.call can be either a function name or an actual function.
A function name given as a string cannot contain the namespace qualifier. The :: operator is itself a function that takes the names on both sides and finds the corresponding function, so it must be evaluated separately to work.
So, with do.call, you need something like:
# ...Stuff from Josh's answer goes here
# And then:
do.call(utils::write.csv, arglist)
And with eval:
Call <- match.call(expand.dots = TRUE)
# Modify parameters here as needed and set unneeded ones to NULL.
Call[[1L]] <- utils::write.csv
eval.parent(Call)
Note the lack of quotes around the function name. That evaluates to the function closure.
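A related variation, shown as a sketch and not taken from the answers above, is to put the unevaluated expression quote(pkg::fun) in the function position, which keeps the constructed call short and readable. The wrapper first_rows and the use of utils::head are illustrative choices made to avoid writing a file.

```r
# Hypothetical wrapper that forwards its call, unevaluated, to utils::head
first_rows <- function(x, n) {
  Call <- match.call(expand.dots = TRUE)
  # The function position holds the expression utils::head, not a string
  Call[[1L]] <- quote(utils::head)
  eval.parent(Call)
}

dat <- data.frame(x = 1:10, y = (1:10)^2)
res <- first_rows(dat, n = 3)

print(res)
```

This works because a call written as utils::head(x) is parsed into exactly this structure: a call whose first element is itself the call to ::.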
Another way of getting the function from a namespace-qualified name:
eval(parse(text="utils::write.csv"))
Again, this calls the :: function, which finds the correct function.
Another more manual way is to extract the namespace name & function name and then do the lookup yourself:
x <- strsplit("utils::write.csv", "::")[[1]]
get(x[2], asNamespace(x[1]))
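The lookups described above can be compared side by side. This sketch assumes only base R; getExportedValue() is one more base function that performs the same lookup, and the names f_direct, f_operator, f_exported and f_manual are illustrative.

```r
# Direct use of the :: operator
f_direct <- utils::write.csv

# The same operator called as an ordinary function
f_operator <- `::`("utils", "write.csv")

# Explicit lookup of an exported object
f_exported <- getExportedValue("utils", "write.csv")

# Manual split of the qualified name, then lookup in the namespace
parts <- strsplit("utils::write.csv", "::", fixed = TRUE)[[1]]
f_manual <- get(parts[2], envir = asNamespace(parts[1]))

print(identical(f_direct, f_manual))
```

Note that get() on asNamespace() can also reach unexported objects, whereas :: and getExportedValue() are limited to exports.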
