Can we use apply function along with some user defined function or if/while loops in R to conditionally work it on selective rows? - r

I know that while and if in R are not vectorised; they let us work selectively on some rows based on a condition. I also know that apply is used to apply a function over the columns, so it operates on every row of the columns it is applied to. Can I use apply() together with user-defined functions and/or while/if to work conditionally on some rows, rather than on all rows as apply usually does?
Note: The core issue here is to work around the fact that while and if are not vectorised in R.

You can supply a user-defined function to apply through the FUN argument, e.g. FUN = function(x) user_defined_function(x). apply is "vectorized" only in the sense that it accepts vectors rather than scalars as arguments; its implementation relies heavily on for and if (type apply without parentheses in your console to see the source). So for and apply have roughly the same performance.
You can also break out of a user-defined function by throwing an exception with stop and wrapping the call in tryCatch, but this technique is not recommended: it affects environments, call stacks, scopes etc., makes debugging difficult and leads to errors that are hard to identify.
It is better to use for and if, which is very often the easiest and most effective approach; writing a recursive function (bearing in mind that (tail) recursion is not really optimized in R) or fully refactoring your algorithm is quite difficult and time consuming.
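As a minimal sketch of working on selected rows only (the matrix m, the helper row_range and the "first column is even" condition are all made up for illustration), one option is to subset the rows with a logical index before calling apply, and another is to branch with if inside the function passed as FUN:

```r
# Hypothetical data: a 4 x 3 integer matrix
m <- matrix(1:12, nrow = 4)

# Hypothetical user-defined function that works on a single row
row_range <- function(r) max(r) - min(r)

# Logical condition selecting the rows to work on (first column is even)
keep <- m[, 1] %% 2 == 0

# Option 1: subset first, then apply over the remaining rows.
# drop = FALSE keeps a matrix even if only one row is selected.
res <- apply(m[keep, , drop = FALSE], 1, row_range)

# Option 2: apply over all rows and branch inside FUN;
# rows that fail the condition get NA.
res_all <- apply(m, 1, function(r) {
  if (r[1] %% 2 == 0) row_range(r) else NA_real_
})
```

Here if is safe inside FUN because each call receives a single row, so the condition r[1] %% 2 == 0 has length one.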

Related

Is attributes() a function in R?

Help files call attributes() a function. Its syntax looks like a function call. Even class(attributes) calls it a function.
But I see I can assign something to attributes(myobject), which seems unusual. For example, I cannot assign anything to log(myobject).
So what is the proper name for "functions" like attributes()? Are there any other examples of it? How do you tell them apart from regular functions? (Other than trying supposedfunction(x)<-0, that is.)
Finally, I guess attributes() implementation overrides the assignment operator, in order to become a destination for assignments. Am I right? Is there any usable guide on how to do it?
Very good observation indeed. It's an example of a replacement function. If you look closely and type apropos('attributes') in your R console, it will return
"attributes"
"attributes<-"
along with other outputs.
So, wherever you are able to assign on the left side of the assignment operator, you are not calling attributes; you are actually calling attributes<-. There are many functions like that in R, for example names(), colnames(), length() etc. In your example, log doesn't have a replacement counterpart, hence it doesn't work the way you anticipated.
Definition (from the Advanced R book, link given below):
Replacement functions act like they modify their arguments in place,
and have the special name xxx<-. They typically have two arguments (x
and value), although they can have more, and they must return the
modified object
If you want to see the list of these functions, you can run
apropos('<-$') and check out the other functions that have this kind of property.
You can read about it here and here
I am hopeful that this solves your problem.
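For the "how do you do it" part, a small sketch may help. The function name second<- below is made up for illustration; the pattern (a function named xxx<- that takes x and value and returns the modified object) is the one described in the definition quoted above:

```r
# Hypothetical replacement function: sets the second element of a vector.
# The backticks are needed because `second<-` is not a syntactic name.
`second<-` <- function(x, value) {
  x[2] <- value
  x  # a replacement function must return the modified object
}

v <- 1:5
second(v) <- 10L  # R evaluates this as: v <- `second<-`(v, value = 10L)

# Checking whether a replacement counterpart exists:
has_attr_repl <- exists("attributes<-")  # attributes() has one
has_log_repl  <- exists("log<-")         # log() does not
```

So nothing overrides the assignment operator itself; R simply looks up a function called xxx<- when it sees xxx(obj) <- value.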

Need to pre-process input to functions

I am writing a package with a suite of functions that take objects fit to a model (e.g., output from the "lmt", "lavaan", or "mirt" packages) and compute relevant indices based on those models.
The first thing EVERY function in this suite does is convert the input into a standardized form, so all of my functions look like this:
fooIndex <- function(x) {
x <- standardizerFunction(x)
# Now, compute the fooIndex
}
Here, standardizerFunction is an S3 generic function that has methods for all the supported input classes.
Is there a better way to accomplish this functionality than calling standardizerFunction inside of each of the functions computing indices?
EDIT: I just wanted to specify that my "problem" is that copying and pasting the same line of code into ~20 different functions seems like a poor programming style, and I am hoping for a better solution.
Based on what iod and Gregor wrote, the two ways to handle this are:
(1) Require the user to apply the standardizerFunction before running any of the main functions. The functions will then throw an error if the input is of the wrong class.
(2) Since our functions will be checking the input to make sure it is of the right class anyway, just fold standardizerFunction into the input checking part using something like:
if (!inherits(x, what = "YourClass")) x <- standardizerFunction(x)
In my particular setting, since most of my users are uncomfortable with R, asking them to pre-apply the standardizerFunction is not the best choice, so I am going with option 2.
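A rough sketch of option 2 follows. The class name "stdFit", the helper asStdFit, the lm method and the body of fooIndex are all assumptions made up for illustration; only standardizerFunction and fooIndex are names from the question. Putting the check in one small helper keeps each index function to a single, uniform first line:

```r
# S3 generic with one illustrative method; real methods would exist for
# each supported model class.
standardizerFunction <- function(x, ...) UseMethod("standardizerFunction")

standardizerFunction.lm <- function(x, ...) {
  # Hypothetical standardized form: a list of coefficients with class "stdFit"
  structure(list(coefs = unname(coef(x))), class = "stdFit")
}

# Hypothetical helper: convert only if the input is not already standardized
asStdFit <- function(x) {
  if (inherits(x, "stdFit")) x else standardizerFunction(x)
}

fooIndex <- function(x) {
  x <- asStdFit(x)
  sum(x$coefs^2)  # placeholder computation, not a real index
}

fit <- lm(mpg ~ wt, data = mtcars)
raw_result <- fooIndex(fit)                        # converted internally
std_result <- fooIndex(standardizerFunction(fit))  # already standardized
```

Both calls give the same value, so users who pre-standardize and users who pass a raw model object are handled alike.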

Accessing index inside *apply

I have two containers, conty and contx. The values of both are tied to each other: conty[1] relates to contx[1] etc. While using apply on contx, I want to access the index inside the apply call so I can put the value from the corresponding element of conty into contz, depending on the index of x.
lapply(contx, function(x) {
if (x==1) append(contz,conty[xindex])
})
I could easily do this in a for loop but everybody insists that using the apply is better. And I tried to look for examples but the only thing I could find was mostly stuff for generating maps where it wasn't entirely clear how I could adapt to my problem.
There are a few issues here.
"everybody insists that using the apply is better". Sorry, but they're wrong; it's not necessarily better. See the old-school Burns Inferno ("If you are using R and you think you’re in hell, this is a map for you"), chapter 4 ("Overvectorization"):
A common reflex is to use a function in the apply family. This is not vectorization, it is loop-hiding. The apply function has a for loop in its definition. The lapply function buries the loop, but execution times tend to be roughly equal to an explicit for loop ... Base your decision of using an apply function on Uwe’s Maxim (page 20). The issue is of human time rather than silicon chip time. Human time can be wasted by taking longer to write the code, and (often much more importantly) by taking more time to understand subsequently what it does.
However, what you are doing that's bad is growing an object (also covered in the Inferno). Assuming that in your example contz started as an empty list, this should work (is my example reflective of your use case?)
x <- c(1,2,3,1)
conty <- list("a","b","c","d")
contz <- conty[which(x==1)]
Alternatively, if you want to use both the value and the index in your function, you can write a two-variable function f(val,index) and then use Map(f,my_list,seq_along(my_list))
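To make the Map variant concrete, here is a small sketch using the same toy data; the function f and the NULL-then-Filter step are just one possible way to write it, not the only one:

```r
contx <- c(1, 2, 3, 1)
conty <- list("a", "b", "c", "d")

# Two-argument function: receives the value and its position
f <- function(val, index) {
  if (val == 1) conty[[index]] else NULL
}

# Map walks over the values and their indices in parallel
picked <- Map(f, contx, seq_along(contx))

# Drop the NULL placeholders for elements that did not match
contz <- Filter(Negate(is.null), picked)
```

The result is built in one go rather than grown with append, and for this simple case it matches conty[which(contx == 1)].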

using clusterApply with unknown number of arguments

I want to be able to generalise the behavior of clusterApply() so that I can parallelise functions with different number of arguments.
Normally, I use clusterApply() like this:
clusterApply(cl=cl,seq_len(nsim),FUN=runsim,arg1,arg2,arg3)
But what if I don't know how many arguments function runsim has? I was thinking of using do.call("runsim",listofArguments), but I don't know if I can use it inside of clusterApply.
Any suggestions?
The main issue seems to be the fact that do.call wants the function (or name thereof) as first argument while clusterApply, like all functions from the apply family, passes the iterated over object as the first argument to the function it calls. Consequently one solution could be:
clusterApply(cl=cl,seq_len(nsim),fun=function(x) do.call(runsim, args = list(...)))
... can now be filled with whatever different arguments there are including the possibility to hand over x (i.e., the slice of the iterated over object, in this case the simulation number).
I do not see the need to also wrap clusterApply into do.call as you know which function to call (clusterApply).
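A runnable sketch of this idea, assuming the parallel package and a made-up runsim with two extra parameters (mean and sd are hypothetical, as is the two-worker local cluster). The function and the argument list are passed to the workers as extra arguments of clusterApply, so they do not have to exist in the workers' global environments:

```r
library(parallel)

# Hypothetical simulation function; the first argument is the simulation number
runsim <- function(i, mean, sd) mean + sd * i

# Arguments collected in a list whose length is not known in advance
listofArguments <- list(mean = 10, sd = 2)
nsim <- 3

cl <- makeCluster(2)  # small local cluster, for illustration only

res <- clusterApply(
  cl, seq_len(nsim),
  function(i, f, args) do.call(f, c(list(i), args)),  # prepend i to the list
  runsim, listofArguments  # forwarded to the wrapper as f and args
)

stopCluster(cl)
```

The wrapper stays the same whatever runsim's signature is; only listofArguments changes.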

R - where can vectorize happen?

So clearly one way to vectorize a function is WITHIN the function - either explicitly iterate over inputs or utilize other functions that have been vectorized. Is there a way to mark or tag a function as being/treated as vectorized so that the iteration is managed by the R platform? The analogy would be attributes in c# or annotations in Java. I tell R that this function should be treated as vectorized and it feeds that input one at a time into the function, constructing the vector output? Or am I just thinking about this whole thing incorrectly?
You can use the Vectorize function (http://stat.ethz.ch/R-manual/R-patched/library/base/html/mapply.html), to make the function take vectors.
But here it just uses the mapply function to do the vectorization. As Gavin said, you are just hiding the loop.
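A short sketch of what that looks like in practice; sign_label is a made-up scalar function that would not work on a vector as written, because if expects a length-one condition:

```r
# Hypothetical scalar-only function
sign_label <- function(x) if (x < 0) "neg" else "non-neg"

# Vectorize returns a wrapper that calls mapply under the hood
v_sign_label <- Vectorize(sign_label)
labels_vec <- v_sign_label(c(-1, 0, 2))

# The same loop written out explicitly with vapply
labels_loop <- vapply(c(-1, 0, 2), sign_label, character(1))

# A truly vectorized alternative for this particular function
labels_ifelse <- ifelse(c(-1, 0, 2) < 0, "neg", "non-neg")
```

All three give the same labels; only the last one avoids calling the function once per element.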
