Using clusterApply with an unknown number of arguments

I want to be able to generalise the behavior of clusterApply() so that I can parallelise functions with different numbers of arguments.
Normally, I use clusterApply() like this:
clusterApply(cl=cl,seq_len(nsim),FUN=runsim,arg1,arg2,arg3)
But what if I don't know how many arguments function runsim has? I was thinking of using do.call("runsim",listofArguments), but I don't know if I can use it inside of clusterApply.
Any suggestions?

The main issue seems to be the fact that do.call wants the function (or name thereof) as first argument while clusterApply, like all functions from the apply family, passes the iterated over object as the first argument to the function it calls. Consequently one solution could be:
clusterApply(cl=cl,seq_len(nsim),FUN=function(x) do.call(runsim, args = list(...)))
Here ... is a placeholder: fill it with whatever arguments the function needs, which can include x (i.e., the current element of the iterated-over object, in this case the simulation number).
I do not see the need to also wrap clusterApply into do.call as you know which function to call (clusterApply).
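As a minimal sketch of this idea, the argument list can be built separately and handed to do.call inside the worker function. The body of runsim, the names sim_fun and sim_args, and the two-worker cluster are assumptions made up for illustration. The function and its arguments are passed to the workers explicitly rather than looked up, since objects in the master's global environment are not automatically visible on the workers.

```r
library(parallel)

# Hypothetical simulation function; the real runsim may take any number of arguments.
runsim <- function(i, mean, sd) mean + sd * i

# Arguments collected in a list, so their number does not need to be known in advance.
sim_args <- list(mean = 10, sd = 2)

cl <- makeCluster(2)
res <- clusterApply(
  cl, seq_len(3),
  # The iterated element i is prepended to the argument list before the call.
  function(i, sim_fun, sim_args) do.call(sim_fun, c(list(i), sim_args)),
  sim_fun = runsim, sim_args = sim_args
)
stopCluster(cl)
```

The names sim_fun and sim_args were chosen so that they do not partially match clusterApply's own fun argument.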

Related

Is attributes() a function in R?

Help files call attributes() a function. Its syntax looks like a function call. Even class(attributes) calls it a function.
But I see I can assign something to attributes(myobject), which seems unusual. For example, I cannot assign anything to log(myobject).
So what is the proper name for "functions" like attributes()? Are there any other examples of it? How do you tell them apart from regular functions? (Other than trying supposedfunction(x)<-0, that is.)
Finally, I guess attributes() implementation overrides the assignment operator, in order to become a destination for assignments. Am I right? Is there any usable guide on how to do it?
Very good observation indeed. This is an example of a replacement function. If you type apropos('attributes') in your R console, it will return
"attributes"
"attributes<-"
along with other outputs.
So, basically, when attributes() appears on the left side of the assignment operator, you are not calling attributes; you are actually calling attributes<-. There are many functions like that in R, for example names(), colnames(), length(), etc. In your example, log doesn't have a replacement counterpart, hence it doesn't work the way you anticipated.
Definition (from the Advanced R book, linked below):
Replacement functions act like they modify their arguments in place,
and have the special name xxx<-. They typically have two arguments (x
and value), although they can have more, and they must return the
modified object.
If you want to see a list of these functions, you can run
apropos('<-$')
and check out the other functions that behave in the same way.
You can read about it here and here
I am hopeful that this solves your problem.
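To illustrate how to write one yourself, here is a small sketch of a user-defined replacement function. The name second<- and the vector x are made up for this example; the only requirements, per the definition above, are the name ending in <-, an argument called value, and returning the modified object.

```r
# A replacement function: the name ends in <- and the last argument is called value.
`second<-` <- function(x, value) {
  x[2] <- value
  x  # must return the modified object
}

x <- 1:5
second(x) <- 10L   # R evaluates this as x <- `second<-`(x, value = 10L)
```

After the assignment, the second element of x has been replaced, exactly as with attributes(x) <- ... or names(x) <- ....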

Can we use apply function along with some user defined function or if/while loops in R to conditionally work it on selective rows?

I know that while and if in R are not vectorised. They let us selectively work on some rows based on a condition. I also know that the apply function in R is applied over the columns, and hence it operates on all rows of the columns we apply it to. Can I use apply() along with user-defined functions and/or with while/if to conditionally use it on some rows rather than on all rows, as apply usually does?
Note: the core issue here is to work around the lack of vectorisation of while/if in R.
You can supply a user-defined function to apply through its FUN argument, e.g. FUN = function(x) user_defined_function(x). apply is "vectorized" only in the sense that it passes vectors, not scalars, to that function; its implementation relies heavily on for and if (type apply without parentheses in your console to see the source). So for and apply have about the same performance.
You can also break out of a user-defined function by throwing an exception with stop and wrapping the call in tryCatch, but this technique is not recommended: it affects environments, call stacks and scopes, makes debugging difficult, and leads to errors that are hard to identify.
It is better to use for and if, which is very often the easiest and most effective way. The alternatives (writing a recursive function, bearing in mind that (tail) recursion is not really optimized in R, or fully refactoring your algorithm) are quite difficult and time consuming.
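As a rough sketch of what the question asks for, a condition can either be tested inside the function given to apply, or used to subset the rows before apply is called. The matrix m and the condition (first column greater than 1) are assumptions chosen only for illustration.

```r
# Hypothetical data: a 3 x 2 matrix with rows (1, 4), (2, 5), (3, 6).
m <- matrix(1:6, nrow = 3)

# Option 1: test the condition inside the function; skipped rows yield NA.
inside <- apply(m, 1, function(row) if (row[1] > 1) sum(row) else NA)

# Option 2: select the rows first, then apply to the subset only.
keep <- m[, 1] > 1
subset_only <- apply(m[keep, , drop = FALSE], 1, sum)
```

The second form never visits the rows that fail the condition, and drop = FALSE keeps the subset a matrix even when only one row is selected.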

How to use apply() with my function

bmi<-function(x,y){
(x)/((y/100)^2)
}
bmi(70,177) works,
but with apply() it doesn't work:
apply(Student,1:2,bmi(Student$weight,Student$height))
Error in match.fun(FUN) :
'bmi(Student$weight, Student$height)' is not a function, character or symbol
It's a bit unclear what the goal is. If it's just to get an answer, then the comments do answer it. If on the other hand, the goal is to understand what you are doing wrong, then read on. I'd say the first error going from left to right is passing the whole dataframe. I would have only passed the 'height' and 'weight' columns.
The next error, again going from left to right, is the use of 1:2 as the second argument to apply. You obviously want to do this "by rows", which means you should use only 1, i.e. the first dimension of the dataframe.
And the third error is using a function call rather than the function name. Functions with arguments in parentheses don't work when an R function (meaning apply in this case) is expecting a function name or an anonymous function as illustrated in comments.
The fourth error is not assigning the value to a column in your dataframe. So this probably would have succeeded in making the desired extra column via the apply method. But, as noted in comments, this is not the most efficient method:
Student$bmi_val <- apply(Student[ , c("weight", "height")], 1, function(r) bmi(r[["weight"]], r[["height"]]))
# didn't want my column name to be the same as the function name
The apply function was actually designed to work with matrices and arrays, so for many purposes it is ill-suited when used with dataframes. In this case, where all the arguments to the bmi function are numeric and you can control the order of the columns in the first argument to match the x and y positions, it's arguably an acceptable strategy, but not the most R-ish method. When working with dates or factor variables, you should definitely avoid apply.
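For comparison, here is a short sketch of the vectorised alternative. Because bmi uses only arithmetic operators, it can be called directly on whole columns. The contents of Student are invented for this example; only the column names weight and height come from the question.

```r
bmi <- function(x, y) {
  x / ((y / 100)^2)
}

# Hypothetical data with the column names used in the question.
Student <- data.frame(weight = c(70, 55), height = c(177, 160))

# Arithmetic in R is vectorised, so no apply call is needed.
Student$bmi_val <- bmi(Student$weight, Student$height)
```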

Naming columns of coefficient matrix in a VAR

I am searching for a fast and simple way to give comprehensible names to the columns of a VAR-coefficient matrix.
What I would like to use is the function VAR.names, which is used in the function VAR.est() in the VAR.etp-package. When I use the function VAR.est(), this works perfectly, but as soon as I modify VAR.est (by adding another element to the list of values which are returned), I receive an error message stating "could not find function VAR.names".
I could not find any information on the function VAR.names.
Example:
library(VAR.etp)
data(dat)
M=VAR.est(dat,p=2,type="const")
M$coef
Another possibility would be to use a loop as in the function VAR() from the vars package, but if VAR.names actually worked, this would be a lot more elegant!

How does one pass multiple data types in llply?

I have a function that requires both a S4 object and a data frame as arguments.
But functions like lapply and llply will only allow one list and one function.
example: new_list=llply(list, function)
I could make a single list with alternating S4 object and data but llply will push one list item at a time which means that it will either be the S4 object or the data (the function cannot perform with just one or the other).
In some sense what I am looking for is akin to a 2D list (where each row has the S4 obj and data and a row gets pushed through at a time).
So how would I make this work?
Here's a more general version of my problem. If I have a function like so:
foobar <- function(dat, threshold=0.5, max=.99)
{
...
}
and I wanted to push a list through this function, I could do:
new_list=llply(list, foobar)
but if I also wanted to pass a non-default value for threshold or max, how would I do so in this context?
Functions like lapply typically have a ... parameter whose arguments get passed on to the function. E.g.:
lapply(list, foobar, somearg='nondefaultvalue')
If you have multiple varying parameters (eg a different somearg value for each element in list), then you would either pack them as pairs in a list, or turn to a function like mapply:
mapply(foobar, list, somearg=c('vectorof', 'nondefault', 'values'))
Maybe you can try this:
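A small runnable sketch of both cases follows. The body of foobar and the input list are invented for illustration, since the question elides them; only the signature (dat, threshold, max) comes from the question.

```r
# Hypothetical body: count the values strictly between threshold and max.
foobar <- function(dat, threshold = 0.5, max = 0.99) {
  sum(dat > threshold & dat < max)
}

inputs <- list(c(0.1, 0.6, 0.995), c(0.3, 0.7, 0.9))

# Same non-default threshold for every element: pass it through ...
fixed <- lapply(inputs, foobar, threshold = 0.2)

# A different threshold per element: mapply walks both in parallel.
varying <- mapply(foobar, inputs, threshold = c(0.5, 0.8))
```

plyr's llply forwards extra arguments through ... in the same way as lapply.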
Make each list item itself a list, which contains a S4 object and a data frame.
Just a suggestion, I'm not quite sure if this works.
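This suggestion does work in base R. Here is a minimal sketch, assuming a made-up S4 class Params and a made-up function process that needs both an S4 object and a data frame:

```r
library(methods)

# Hypothetical S4 class and function, for illustration only.
setClass("Params", slots = c(scale = "numeric"))
process <- function(obj, df) obj@scale * sum(df$x)

# Each list item is itself a list holding one S4 object and one data frame.
pairs <- list(
  list(obj = new("Params", scale = 2),  df = data.frame(x = 1:3)),
  list(obj = new("Params", scale = 10), df = data.frame(x = 4:5))
)

# One pair is pushed through at a time and unpacked inside the anonymous function.
results <- lapply(pairs, function(p) process(p$obj, p$df))
```

If the S4 objects and data frames already live in two separate lists of equal length, Map(process, objects, frames) would give the same result without building the pairs.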
