The Art of R Programming: Where else could I find the information?

I came across the editorial review of the book The Art of R Programming, and found this
The Art of R Programming takes you on a guided tour of software development with R, from basic types and data structures to advanced topics like closures, recursion, and anonymous functions
I immediately became fascinated by the idea of anonymous functions, something I had come across in Python in the form of lambda functions but could not make the connection in the R language.
I searched in the R manual and found this
Generally functions are assigned to symbols but they don't need to be. The value returned by the call to function is a function. If this is not given a name it is referred to as an anonymous function. Anonymous functions are most frequently used as arguments to other functions such as the apply family or outer.
For a fairly new programmer like me, these things are "quirky" in a very interesting sort of way.
Where can I find more of these for R (without having to buy a book)?
Thank you for sharing your suggestions.

Functions don't have names in R. Whether you happen to put a function into a variable or not is not a property of the function itself, so there are not two sorts of functions, anonymous and named. The best we can do is to agree to call a function which has never been assigned to a variable anonymous.
A function f can be regarded as a triple consisting of its formal arguments, its body and its environment accessible individually via formals(f), body(f) and environment(f). The name is not any part of that triple. See the function objects part of the language definition manual.
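As a small sketch of the triple described above (the function f and the second name g are made up for illustration), the three parts can be inspected directly, and binding the same function to another name changes none of them:

```r
# A throwaway function for illustration
f <- function(x, y = 2) x + y

arg_names <- names(formals(f))  # the formal arguments: "x" "y"
fn_body   <- body(f)            # the body: the call x + y
fn_env    <- environment(f)     # the environment in which f was created

# Binding the same function to a second variable does not change the function
g <- f
same_function <- identical(f, g)
```

Nothing in formals, body or environment records that the function is reachable through f or g.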
Note that if we want a function to call itself then we can use Recall to avoid knowing whether or not the function was assigned to a variable. The alternative is that the function body must know that the function has been assigned to a particular variable and what the name of that variable is. That is, if the function is assigned to variable f, say, then the body can refer to f in order to call itself. Recall is limited to self-calling functions. If we have two functions which mutually call each other then a counterpart to Recall does not exist -- each function must name the other which means that each function must have been assigned to a variable and each function body must know the variable name that the other function was assigned to.
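A minimal sketch of the Recall point (the names fact and g are made up for illustration): because the body never mentions its own variable name, the function keeps working after it is rebound, or when it is never bound at all.

```r
# Factorial that calls itself through Recall rather than by name
fact <- function(n) if (n <= 1) 1 else n * Recall(n - 1)

g <- fact   # bind the same function to a different variable
rm(fact)    # remove the original binding
renamed_result <- g(5)

# The same idea with a function that is never assigned to a variable
anonymous_result <- (function(n) if (n <= 1) 1 else n * Recall(n - 1))(5)
```

Had the body called fact(n - 1) instead, g(5) would fail after rm(fact) because the name fact no longer exists.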

There's not a lot to say about anonymous functions in R. Unlike Python, where lambda functions require special syntax, in R an anonymous function is simply a function without a name.
For example:
function(x,y) { x+y }
whereas a normal, named, function would be
add <- function(x,y) { x+y }
Functions are first-class objects, so you can pass them (regardless of whether they're anonymous) as arguments to other functions. Examples of functions that take other functions as arguments include apply, lapply and sapply.
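To illustrate the first-class point, here is a short sketch in base R; the variable names are arbitrary. An anonymous function can be called on the spot or handed to sapply, and a named function can be passed around in exactly the same way.

```r
# Call an anonymous function immediately, without ever naming it
direct <- (function(x, y) { x + y })(1, 2)

# Pass an anonymous function as an argument
squares <- sapply(1:3, function(x) x^2)

# A named function is passed in exactly the same way
add <- function(x, y) { x + y }
total <- Reduce(add, 1:4)
```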

Get Patrick Burns' "The R Inferno" at his site
There are several good web sites with basic introductions to R usage.
I also like Zoonekynd's manual

Great answers about style so far. Here's an answer about a typical use of anonymous functions in R:
# Make some data up
my.list <- list()
for( i in seq(100) ) {
x <- runif(10); y <- runif(10)  # separate predictor and response
my.list[[i]] <- lm( y ~ x )
}
# Do something with the data
sapply( my.list, function(x) x$qr$rank )
We could have named the function, but for simple data extractions and so forth it's really handy not to have to.

Related

R: Not to look for variables outside a function if they do not exist within it

This function is OK in R:
f <- function(x) {
x + y
}
That is because if the variable y is not defined inside the function f(), R looks for it in the function's enclosing environment (the environment where f was defined) and then further up the chain of parent environments.
Apart from the fact that this behavior can be a bug generator, what is then the point of functions having input parameters? After all, any variable used inside a function can be looked up outside of it.
Is there any way not to look for variables outside a function if they do not exist within the function?
Some reasons for using parameters that came to my mind:
Without parameters, users have to define variables before using the function, and these variable names need to match the variable names used within the function -- this is impractical.
How is anyone supposed to know/remember the names of the variables within a function? How do I know which variables within a function are purely local, and which variables have to exist outside of the function?
Input parameters can be passed directly as values or as a variable (and the variable name does not matter).
Input parameters communicate the intended usage of the function; it is clear what data is needed to operate it (or at the very least: how many values need to be inserted by the user of the function)
Input parameters can be documented properly using Rd files (or roxygen syntax)
I am sure there are many other reasons to use input parameters.
M. Papenberg provides a very good explanation.
Here's a quick addendum on how to make a function not look for objects in parent environments:
Just provide them in the parameter list! This might sound stupid, but that's what you should always do unless you have good reason to do otherwise. In your example only x is passed to the function. So, if the idea here is that x should be returned if y doesn't exist, you can go for default parameters. In this case this could be done as
f <- function(x, y = 0) {
x + y
}
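A short sketch of the default-parameter advice, followed by a crude base R check for names that a function would have to find outside itself. The helper free_names and the function g are hypothetical, invented for this illustration; the check is only a heuristic, since it would also flag purely local variables that are assigned inside the body. The codetools package offers findGlobals for a more careful analysis.

```r
f <- function(x, y = 0) {
  x + y
}
with_default <- f(1)     # y falls back to its default of 0
with_both    <- f(1, 2)  # y supplied by the caller

# Hypothetical helper: variable names used in the body that are not parameters
free_names <- function(fn) setdiff(all.vars(body(fn)), names(formals(fn)))

g <- function(x) {
  x + y
}
free_in_g <- free_names(g)  # y is not a parameter of g
free_in_f <- free_names(f)  # nothing is left over for f
```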

Is attributes() a function in R?

Help files call attributes() a function. Its syntax looks like a function call. Even class(attributes) calls it a function.
But I see I can assign something to attributes(myobject), which seems unusual. For example, I cannot assign anything to log(myobject).
So what is the proper name for "functions" like attributes()? Are there any other examples of it? How do you tell them apart from regular functions? (Other than trying supposedfunction(x)<-0, that is.)
Finally, I guess attributes() implementation overrides the assignment operator, in order to become a destination for assignments. Am I right? Is there any usable guide on how to do it?
Very good observation indeed. It is an example of a replacement function. If you type apropos('attributes') in your R console, it will return
"attributes"
"attributes<-"
along with other outputs.
So when you assign to attributes(myobject) on the left side of the assignment operator, you are not calling attributes; you are actually calling attributes<-. There are many functions in R like that, for example names(), colnames(), length(), etc. In your example, log doesn't have a replacement counterpart, hence it doesn't work the way you anticipated.
Definition (from the Advanced R book, linked below):
Replacement functions act like they modify their arguments in place,
and have the special name xxx<-. They typically have two arguments (x
and value), although they can have more, and they must return the
modified object
If you want to see the list of these functions, you can run
apropos('<-$') and look through the other functions that have the same kind of property.
You can read about it here and here
I am hopeful that this solves your problem.
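As a sketch of how one might write such a function oneself, here is a made-up replacement function called second<- (not part of base R) that follows the definition quoted above: it takes x and value and returns the modified object.

```r
# Hypothetical replacement function: sets the second element of a vector
`second<-` <- function(x, value) {
  x[2] <- value
  x  # a replacement function must return the modified object
}

v <- c(1, 2, 3)
second(v) <- 10  # R rewrites this as v <- `second<-`(v, value = 10)

# attributes has a replacement counterpart; log does not
has_attributes_setter <- exists("attributes<-")
has_log_setter <- exists("log<-")
```

The last argument has to be named value, because that is the name R uses when it rewrites the assignment.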

Need to pre-process input to functions

I am writing a package with a suite of functions that take objects fit to a model (e.g., output from the "lmt", "lavaan", or "mirt" packages) and compute relevant indices based on those models.
The first thing EVERY function in this suite does is convert the input into a standardized form, so all of my functions look like this:
fooIndex <- function(x) {
x <- standardizerFunction(x)
# Now, compute the fooIndex
}
Here, standardizerFunction is an S3 generic function that has methods for all the supported input classes.
Is there a better way to accomplish this functionality than calling standardizerFunction inside of each of the functions computing indices?
EDIT: I just wanted to specify that my "problem" is that copying and pasting the same line of code into ~20 different functions seems like a poor programming style, and I am hoping for a better solution.
Based on what iod and Gregor wrote, the two ways to handle this are:
(1) Require the user to apply the standardizerFunction before running any of the main functions. The functions will then throw an error if the input is of the wrong class.
(2) Since our functions will be checking the input to make sure it is of the right class anyway, just fold standardizerFunction into the input checking part using something like:
if(!inherits(x, what="YourClass")) x <- standardizerFunction(x)
In my particular setting, since most of my users are uncomfortable with R, asking them to pre-apply the standardizerFunction is not the best choice, so I am going with option 2.
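Another way to avoid repeating the line, offered only as a sketch and not something proposed in the thread, is a small function factory that wraps each index function so the standardizing call happens in one place. Everything below is hypothetical: the wrapper withStandardInput, the class name "standardFit", and the toy methods stand in for the real package code.

```r
# Toy S3 generic standing in for the real standardizerFunction
standardizerFunction <- function(x, ...) UseMethod("standardizerFunction")
standardizerFunction.default <- function(x, ...) {
  stop("unsupported input class: ", class(x)[1])
}
standardizerFunction.lm <- function(x, ...) {
  structure(list(coefs = coef(x)), class = "standardFit")
}
# Already standardized input passes through unchanged
standardizerFunction.standardFit <- function(x, ...) x

# Hypothetical wrapper: returns a version of f that standardizes its input first
withStandardInput <- function(f) {
  force(f)
  function(x, ...) f(standardizerFunction(x), ...)
}

# Each index function is then written against the standardized form only
fooIndex <- withStandardInput(function(x) length(x$coefs))

fit <- lm(dist ~ speed, data = cars)  # cars ships with base R's datasets
n_coefs <- fooIndex(fit)
```

The trade-off is that the wrapped functions show (x, ...) as their signature, which may make the package documentation a little less direct.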

lapply-ing with the "$" function

I was going through some examples in hadley's guide to functionals, and came across an unexpected problem.
Suppose I have a list of model objects,
x=1:3;y=3:1; bah <- list(lm(x~y),lm(y~x))
and want to extract something from each (as suggested in hadley's question about a list called "trials"). I was expecting one of these to work:
lapply(bah,`$`,i='call') # or...
lapply(bah,`$`,call)
However, these return nulls. It seems like I'm not misusing the $ function, as these things work:
`$`(bah[[1]],i='call')
`$`(bah[[1]],call)
Anyway, I'm just doing this as an exercise and am curious where my mistake is. I know I could use an anonymous function, but think there must be a way to use syntax similar to my initial non-solution. I've looked through the places $ is mentioned in ?Extract, but didn't see any obvious explanation.
I just realized that this works:
lapply(bah,`[[`,i='call')
and this
lapply(bah,function(x)`$`(x,call))
Maybe this just comes down to some lapply voodoo that demands anonymous functions where none should be needed? I feel like I've heard that somewhere on SO before.
This is documented in ?lapply, in the "Note" section (emphasis mine):
For historical reasons, the calls created by lapply are unevaluated,
and code has been written (e.g. bquote) that relies on this. This
means that the recorded call is always of the form FUN(X[[0L]],
...), with 0L replaced by the current integer index. This is not
normally a problem, but it can be if FUN uses sys.call or
match.call or if it is a primitive function that makes use of the
call. This means that it is often safer to call primitive functions
with a wrapper, so that e.g. lapply(ll, function(x) is.numeric(x))
is required in R 2.7.1 to ensure that method dispatch for is.numeric
occurs correctly.
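Following the wrapper advice in that note, here is a sketch of the two forms reported in the question as working, using the same bah list. It makes no claim about what passing `$` directly returns in any particular R version.

```r
x <- 1:3; y <- 3:1
bah <- list(lm(x ~ y), lm(y ~ x))

# `[[` evaluates its index argument, so the string "call" is passed through as is
via_bracket <- lapply(bah, `[[`, "call")

# Wrapping `$` in an anonymous function keeps the literal name next to the operator
via_wrapper <- lapply(bah, function(m) m$call)

same_result <- identical(via_bracket, via_wrapper)
```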

R - accessing variables created within a user-defined function after the end of the function

Take a basic function
fun<-function(){
x<-c(1,2,3,4,5)
y<-c(1,2,3,4,5)
t<-x+y
return(t)
}
After I have run the function, is there a way I can access any of the variables created within the function? Either by specifying the variable, something like this:
fun$y
or
fun$t
or is there some way of asking R to save the variable within the function for use during my current R session (I'm not looking to save it permanently)? That is, something along the lines of:
fun<-function(){
x<-c(1,2,3,4,5)
y<-c(1,2,3,4,5)
t<-x+y
Y<-save y for later use
T<-save t for later use
return(t)
}
Thanks!
You can't use a variable outside of its scope.
What you can do is use a list to return multiple values from your function.
Here's a good example.
Yes and no.
Yes, it is technically possible to make assignments to variables outside the scope of your function, so that they are accessible elsewhere. Typically this is done using either <<-, which searches the enclosing environments for an existing variable of that name and assigns in the global environment if none is found, or by calling assign and specifying an environment directly.
But...
No, you should probably not be doing this. R is a functional language, which means that it is intended to be used such that its functions do not create side effects. If you violate this principle too much, you will inevitably run into serious problems that will be difficult, if not impossible, to debug.
If you create variables within a function that you will need later, it is considered best practice to return them all in a list, as Benito describes.
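A minimal sketch of the list approach applied to the function from the question; the result name res is arbitrary.

```r
fun <- function() {
  x <- c(1, 2, 3, 4, 5)
  y <- c(1, 2, 3, 4, 5)
  t <- x + y
  # Return everything that is needed later as one named list
  list(x = x, y = y, t = t)
}

res <- fun()
y_values <- res$y  # the y created inside the function
t_values <- res$t  # the sum computed inside the function
```

Note that it is the returned value that carries the variables, so the access is res$y on the result rather than fun$y on the function.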
