Why quote function names - r

What is the reason behind quoting the name of your function? I have seen this in a couple of packages (for example line number 2 here in the quantmod package). Instead of writing
f <- function(x)
they write
"f" <- function(x)
Another example is from the gratia (line 88) package where functions are back quoted:
`f` <- function(x)

Transferred from Allan Cameron's comment.
It doesn't make any difference for the functions in the link you shared. It appears to be more of a stylistic choice from the developer to allow the top of function declarations to stand out. Sometimes it is necessary to wrap function names in quotes if they contain illegal characters.
The most frequently seen ones in R are the [<- type operators. That is, if you want to define a function that writes to a subset of a custom class so the user can do x[y] <- z, then you need to write a function like "[<-.myclass" <- function(x, y, value) {...}. The first argument is the object being modified, and R passes the right-hand side as an argument named value.
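To make this concrete, here is a minimal sketch. The class name myclass and the helper names f1, f2, f3 and obj are made up for illustration; nothing here is taken from quantmod or gratia.

```r
# All three spellings create an ordinary function binding.
f1 <- function(x) x + 1
"f2" <- function(x) x + 1
`f3` <- function(x) x + 1
same <- identical(f1(1), f2(1)) && identical(f2(1), f3(1))

# "[<-.myclass" is not a syntactic name, so it must be quoted or backquoted.
# "myclass" is a hypothetical class used only for this example.
"[<-.myclass" <- function(x, i, value) {
  y <- unclass(x)        # drop the class so the default `[<-` is used below
  y[i] <- value
  structure(y, class = "myclass")
}

obj <- structure(1:5, class = "myclass")
obj[2] <- 10L            # dispatches to `[<-.myclass`
unclass(obj)
```

For f1, f2 and f3 the quoting changes nothing; for the replacement method it is the only way to write the name on the left of the assignment.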

Related

R: What exactly does \ mean in R? [duplicate]

I’m required to use/learn R for a new lecture at uni and I’m currently struggling a bit with its syntax. I want to plot (via curve) a simple function, but I can’t seem to get it working with an inline lambda-like function.
I’ve tried the following:
> curve( function(x) x^2 )
Error in curve(function(x) x^2) :
'expr' did not evaluate to an object of length 'n'
When I however store the function in a variable first, it works:
> quad <- function(x) x^2
> curve( quad )
Is such an inline use not allowed in R? Is there any other way to make this work without defining an extra function? Thanks!
Just for completeness. You can use "lambda-like" (anonymous) functions in R but if you want to put them to immediate use, you need to enclose the function definition in parentheses or curly braces:
(function (x) x+1) (1)
{function (x,y) x^y} (2,3)
In the case of curve, the first argument is either an expression or a function name, but if it is a function name then it is first converted to an expression. (See the first few lines in the source code of curve.) So if it's not a function name, you'll need an expression, which may contain a "lambda" function:
curve((function (x) x^2)(x))
If you want to use a function (as opposed to its name) as the argument, you can use plot.function:
plot(function(x) x^2)
From R 4.1 on, you can use \(x) lambda-like shorthand:
R now provides a shorthand notation for creating anonymous functions,
e.g. \(x) x + 1 is parsed as function(x) x + 1.
With function(x) x^2:
(\(x) x^2)(2)
#[1] 4
This can be used with curve:
curve((\(x) x^2)(x))
But as stated in comments, in this case an expression is more straightforward:
curve(x^2)
You have to look at the source of curve to appreciate what is happening (just type curve at the prompt and press enter).
There you can find how the expression passed is parsed.
The only way a function is discovered as being just that, is when only its name is passed along (see the is.name part). If that is not the case, the expression is called for every x. In your case: for every x, the result is a function, which is not a happy thought for plotting...
So in short: no, you cannot do what you tried, but as @ROLO indicated, you can immediately pass the function body, which will be parsed as an expression (and should contain x). If this holds multiple statements, just enclose them in curly braces.
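The name-versus-expression check can be sketched without plotting anything. The helper describe_arg below is hypothetical; it only imitates the first step that curve takes (substitute followed by is.name) and is not the actual source of curve.

```r
# Hypothetical helper: look at the argument as written, not at its value.
describe_arg <- function(expr) {
  sexpr <- substitute(expr)
  if (is.name(sexpr)) "name" else "call"
}

quad <- function(x) x^2

describe_arg(quad)               # a bare name: curve would turn it into quad(x)
describe_arg(function(x) x^2)    # a call to `function`: evaluating it yields a function, not numbers
describe_arg(x^2)                # a call in x: evaluating it yields the values to plot
```

Only the first form is recognised as a function; the other two are treated as expressions in x, which is why curve(x^2) works and curve(function(x) x^2) does not.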

Using dplyr `enquo` after calling argument

I have a function that roughly follows this structure:
TestFunc <- function(dat, point) {
  if (!(point %in% c("NW", "SE", "SW")))
    stop("point must be one of 'SW', 'SE', 'NW'")
  point <- enquo(point)
  return(dat %>% filter(point == !!point))
}
The issue is that I get the following error when I include the check for values:
Error in (function (x, strict = TRUE) :
the argument has already been evaluated
The error disappears when I remove the check. How can I keep both?
The thing to remember about the quosure framework is that it's a very clever, sophisticated piece of code that undoes another very clever, sophisticated piece of code to get you back where you started from. What you want is achievable in a very simple fashion, without going to NSE and coming back again.
TestFunc <- function(dat, point)
{
    if(!(point %in% c("NW", "SE", "SW")))
        stop("point must be one of 'SW', 'SE', 'NW'")
    dat[dat$point == point, ]
}
(The difference between this and using match.arg, as @Frank suggests in a comment, is that match.arg will use the first value as the default if no input is supplied whereas this will fail.)
If you want to call other dplyr/tidyverse verbs, just do that after filtering the rows.
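For comparison, a base R sketch of the match.arg variant mentioned above. The names TestFunc2 and dat and the value column are invented for the example; drop = FALSE is added so that the result stays a data frame even if dat has a single column.

```r
# Hypothetical variant: the allowed values live in the default argument.
TestFunc2 <- function(dat, point = c("NW", "SE", "SW")) {
  point <- match.arg(point)   # errors on anything outside the choices
  dat[dat$point == point, , drop = FALSE]
}

dat <- data.frame(point = c("SE", "NW", "SE"), value = 1:3)

se_rows      <- TestFunc2(dat, "SE")   # rows where point is "SE"
default_rows <- TestFunc2(dat)         # no input: falls back to "NW"
bad          <- try(TestFunc2(dat, "XX"), silent = TRUE)   # a try-error
```

Whether the silent fallback to the first choice is wanted depends on the caller; the explicit %in% check refuses a missing argument instead.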
Because of technical reasons that have to do with how R optimises code, you can only capture arguments that have not been evaluated yet.
So you first have to capture with enquo() and then proceed to check the value. However, if you have to mix both quoting and value-based code, it often indicates a design problem.
As Hong suggested, it seems that in your case you can directly unquote the value without capturing it. Unquoting will ensure the right value is found (since you gave the same name to that variable as the column in your data frame).
Evaluate point so filter can tell the difference between the argument and the data's column
aosmith has a good idea, so I'm putting it in answer form, along with a reproducible example:
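library(dplyr)
f <- function(dat, point) {
  if (!(point %in% c("NW", "SE", "SW")))
    stop("point must be one of 'SW', 'SE', 'NW'")
  filter(dat, point == !!point)
}
tbl <- tibble(point = c('SE', 'NW'))
f(tbl, 'SE')  # keeps the row where the column point is "SE"
f(tbl, 'XX')  # stops with the "point must be one of" error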
If you're passing point in as a string, you only need to differentiate the argument point ( = "SE", in this case) and the column dat$point (or tbl$point, depending if you're inside or outside the function). From the dplyr programming doc:
In dplyr (and in tidyeval in general) you use !! to say that you want to unquote an input so that it’s evaluated, not quoted.
You want to evaluate the argument point so that "SE" is passed to filter, that way filter can tell the difference between the column point and the argument SE.
(I also tried using filter(dat, .data$point == point), but the RHS point still refers to the column, not the f argument.)
I hope this example helps with your real code. 🍟

Anonymous function in lapply

I am reading Wickham's Advanced R book. This question is relating to solving Question 5 in chapter 12 - Functionals. The exercise asks us to:
Implement a version of lapply() that supplies FUN with both the name and value of each component.
Now, when I run below code, I get expected answer for one column.
c(class(iris[1]),names(iris[1]))
Output is:
"data.frame" "Sepal.Length"
Building upon above code, here's what I did:
lapply(iris,function(x){c(class(x),names(x))})
However, I only get the output from class(x) and not from names(x). Why is this the case?
I also tried paste() to see whether it works.
lapply(iris,function(x){paste(class(x),names(x),sep = " ")})
I only get class(x) in the output. I don't see names(x) being returned.
Why is this the case? Also, how do I fix it?
Can someone please help me?
Instead of going over the data frame directly you could switch things around and have lapply go over a vector of the column names,
data(iris)
lapply(colnames(iris), function(x) c(class(iris[[x]]), x))
or over an index for the columns, referencing the data frame.
lapply(1:ncol(iris), function(x) c(class(iris[[x]]), names(iris[x])))
Notice the use of both single and double square brackets.
iris[[n]] extracts the nth element of the list iris itself (a data frame is just a particular kind of list): a bare column vector that no longer carries the column name, making something like mean(iris[[1]]) possible.
iris[n] returns a one-column data frame containing the nth element, with the column name intact, making something like names(iris[1]) possible.
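This also explains the original symptom: lapply(iris, ...) hands FUN each column the way iris[[n]] would, so names(x) is NULL and c() silently drops it. A minimal sketch of what the exercise asks for, using Map; the name lapply_named and the argument order FUN(name, value) are my own choices, and X is assumed to have names.

```r
# Inside lapply(iris, ...) each x is a bare column, so it has no names.
no_names <- names(iris[[1]])

# Hypothetical helper: like lapply(), but FUN receives the name and the value.
# Assumes X is a named list (a data frame qualifies).
lapply_named <- function(X, FUN, ...) {
  Map(function(value, name) FUN(name, value, ...), X, names(X))
}

res <- lapply_named(iris, function(name, value) c(class(value), name))
res$Sepal.Length
res$Species
```

Map keeps the names of its first list argument, so the result is indexed by column name just like the output of lapply.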

Using values from a dataframe to apply a function to a vector

I'll start off by admitting that I'm terrible at the apply functions, and function writing in general, in R. I am working on a course project to clean and model some text data, and I would like to include a step that cleans up contractions.
The qdapDictionaries package includes a contractions data frame with two columns, the first column is the contraction and the second is the expanded version. For example:
  contraction expanded
5      aren't  are not
I want to use the values in here to run a gsub function on my text, which I still have in a large character element. Something like gsub(contr,expd,text).
Here's an example vector that I am using to test things out:
vct <- c("I've got a problem","it shouldn't be that hard","I'm having trouble 'cause I'm dumb")
I'm stumped on how to loop through the data frame (without actually writing a loop, because it seems like the least efficient way to do it) so I can run all the gsubs that I need.
There's probably a simple answer, but here's what I tried: first, I created a function that would return the expanded version if passed a contraction:
expand <- function(contr) {
  expd <- contractions[which(contractions[1]==contr),2]
}
I can use sapply with this and it does work, more or less; looping over the first column in contractions, sapply(contractions[,1],expand) returns a named vector of characters with the expanded phrases.
I can't figure out how to combine this vector with gsub though. I tried writing a second function gsub_expand and changing the expand function to return both the contraction and the expansion:
gsub_expand <- function(list, text) {
  text <- gsub(list[[1]],list[[2]],text)
  return(text)
}
When I ran gsub_expand(sapply(contractions[,1],expand),vct) it only corrected a portion of my vector.
[1] "I've got a problem" "it shouldn't be that hard" "I'm having trouble because I'm dumb"
The first entry in the contractions data frame is 'cause and because, so the interior sapply doesn't seem to actually be looping. I'm stuck in the logic of what I want to pass to what, and what I'm supposed to loop over.
Thanks for any help.
Two options:
stringr::str_replace_all
The stringr package does mostly the same things you can do with base regex functions, but sometimes in a dramatically simpler way. This is one of those times. You can pass str_replace_all a named list or character vector, and it will use the names as patterns and the values as replacements, so all you need is
library(stringr)
contractions <- c("I've" = 'I have', "shouldn't" = 'should not', "I'm" = 'I am')
str_replace_all(vct, contractions)
and you get
[1] "I have got a problem" "it should not be that hard"
[3] "I am having trouble 'cause I am dumb"
No muss, no fuss, just works.
lapply/mapply/Map and gsub
You can, of course, use lapply or a for loop to repeat gsub. You can formulate this call in a few ways, depending on how your data is stored, and how you want to get it out. Let's first make a copy of vct, because we're going to overwrite it:
vct2 <- vct
Now we can use any of these three:
lapply(1:length(contractions),
       function(x){vct2 <<- gsub(names(contractions[x]), contractions[x], vct2)})
# `mapply` is a multivariate version of `sapply`
mapply(function(x, y){vct2 <<- gsub(x, y, vct2)}, names(contractions), contractions)
# `Map` is a multivariate version of `lapply`
Map(function(x, y){vct2 <<- gsub(x, y, vct2)}, names(contractions), contractions)
each of which will return slightly different useless data, but will also save the changes to vct2, which now looks the same as the results of str_replace_all above.
These are a little complicated, mostly because you need to save the internal version of vct2 as you go with each change made. The vct2 <<- writes to the initialized vct2 outside the function's environment, allowing us to capture the successive changes. Be a little careful with <<-; it's powerful. See ?assignOps for more info.
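If you would rather avoid <<- altogether, one alternative (not part of the options above, just a sketch) is base R's Reduce, which threads the text through each substitution and returns the final result. The data are repeated here so the snippet runs on its own; fixed = TRUE is my addition, so that the contractions are matched literally rather than as regular expressions.

```r
vct <- c("I've got a problem", "it shouldn't be that hard",
         "I'm having trouble 'cause I'm dumb")
contractions <- c("I've" = "I have", "shouldn't" = "should not", "I'm" = "I am")

# Reduce(f, x, init): start from vct and apply one gsub per contraction.
expanded <- Reduce(
  function(text, i) gsub(names(contractions)[i], contractions[[i]], text, fixed = TRUE),
  seq_along(contractions),
  vct
)
expanded
```

Nothing outside the call is modified, so there is no copy of vct to initialise first.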

The Art of R Programming : Where else could I find the information?

I came across the editorial review of the book The Art of R Programming, and found this
The Art of R Programming takes you on a guided tour of software development with R, from basic types and data structures to advanced topics like closures, recursion, and anonymous functions
I immediately became fascinated by the idea of anonymous functions, something I had come across in Python in the form of lambda functions but could not make the connection in the R language.
I searched in the R manual and found this
Generally functions are assigned to symbols but they don't need to be. The value returned by the call to function is a function. If this is not given a name it is referred to as an anonymous function. Anonymous functions are most frequently used as arguments to other functions such as the apply family or outer.
These things for a not-very-long-time programmer like me are "quirky" in a very interesting sort of way.
Where can I find more of these for R (without having to buy a book) ?
Thank you for sharing your suggestions
Functions don't have names in R. Whether you happen to put a function into a variable or not is not a property of the function itself, so there do not exist two sorts of functions: anonymous and named. The best we can do is to agree to call a function which has never been assigned to a variable anonymous.
A function f can be regarded as a triple consisting of its formal arguments, its body and its environment accessible individually via formals(f), body(f) and environment(f). The name is not any part of that triple. See the function objects part of the language definition manual.
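A quick way to see the triple for yourself; the function f and the second binding g are throwaway examples.

```r
f <- function(x, y = 2) x + y

arg_names <- names(formals(f))     # the formal arguments
body_text <- deparse(body(f))      # the body, shown as text
env       <- environment(f)        # the environment where f was created

# Binding the same function to a second variable changes none of the three.
g <- f
same_function <- identical(f, g)
```

Neither f nor g appears anywhere inside the function object; they are just two variables that happen to hold it.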
Note that if we want a function to call itself then we can use Recall to avoid knowing whether or not the function was assigned to a variable. The alternative is that the function body must know that the function has been assigned to a particular variable and what the name of that variable is. That is, if the function is assigned to variable f, say, then the body can refer to f in order to call itself. Recall is limited to self-calling functions. If we have two functions which mutually call each other then a counterpart to Recall does not exist -- each function must name the other which means that each function must have been assigned to a variable and each function body must know the variable name that the other function was assigned to.
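A small sketch of the difference, using factorial as a stand-in; the names fact, fact_r, g and k are arbitrary.

```r
# Anonymous and recursive: Recall() calls the function currently running.
fact5 <- (function(n) if (n <= 1) 1 else n * Recall(n - 1))(5)

# Recursion by name depends on the variable still pointing at the function.
fact <- function(n) if (n <= 1) 1 else n * fact(n - 1)
g <- fact
rm(fact)
by_name <- try(g(3), silent = TRUE)   # fails: the body looks up "fact", which is gone

# The Recall() version survives the same treatment.
fact_r <- function(n) if (n <= 1) 1 else n * Recall(n - 1)
k <- fact_r
rm(fact_r)
by_recall <- k(3)
```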
There's not a lot to say about anonymous functions in R. Unlike Python, where lambda functions require special syntax, in R an anonymous function is simply a function without a name.
For example:
function(x,y) { x+y }
whereas a normal, named, function would be
add <- function(x,y) { x+y }
Functions are first-class objects, so you can pass them (regardless of whether they're anonymous) as arguments to other functions. Examples of functions that take other functions as arguments include apply, lapply and sapply.
Get Patrick Burns' "The R Inferno" at his site
There are several good web sites with basic introductions to R usage.
I also like Zoonekynd's manual
Great answers about style so far. Here's an answer about a typical use of anonymous functions in R:
# Make some data up
my.list <- list()
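for( i in seq(100) ) {
  # separate x and y, so the predictor is not dropped as a copy of the response
  d <- data.frame( x = runif(10), y = runif(10) )
  my.list[[i]] <- lm( y ~ x, data = d )
}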
# Do something with the data
sapply( my.list, function(x) x$qr$rank )
We could have named the function, but for simple data extractions and so forth it's really handy not to have to.

Resources