If function(x) can work, why would we need function()? - r

I understand how "function(x)" works, but what is the role of "function()" here?
z <- function() {
  y <- 2
  function(x) {
    x + y
  }
}

function is a keyword which is part of the creation of a function (in the programming sense that Gilles describes in his answer). The other parts are the argument list (in parentheses) and the function body (in braces).
In your example, z is a function which takes no arguments. It returns a function which takes 1 argument (named x) (since R returns the last evaluated statement as the return value by default). That function returns its argument x plus 2.
When z is called (with no arguments: z()) it assigns 2 to y (inside the function's variable scope, an additional concept that I'm not going to get into). Then it creates a function (without a name) which takes a single argument named x, which, when itself called, returns its argument x plus 2. That anonymous function is returned from the call to z and, presumably, stored so that it can be called later.
See https://github.com/hadley/devtools/wiki/Functions and https://github.com/hadley/devtools/wiki/Functionals for more discussion on passing around functions as objects.

The word “function” means somewhat different things in mathematics and in programming. In mathematics, a function is a correspondence between each possible value of the parameters and a result. In programming, a function is a sequence of instructions to compute the result from the parameters.
In mathematics, a function with no argument is a constant. In programming, this is not the case, because functions can have side effects, such as printing something. So you will encounter many functions with no arguments in programs.
Here the function function(x) { x + y } depends on the variable y. There are no side effects, so this function is very much like the mathematical function defined by $f(x) = x + y$. However, this definition is only complete for a given value of y. The previous instruction sets y to 2, so
function() {
  y <- 2
  function(x) {
    x + y
  }
}
is equivalent to
function () {
  function(x) {
    x + 2
  }
}
in the sense that both definitions produce the same results when applied to the same value. They are, however, computed in slightly different ways.
That function is given the name z. When you call z (with no argument, so you write z()), this builds the function function (x) { x + 2 }, or something equivalent: z() is a function of one argument that adds 2 to its argument. So you can write something like z()(3) — the result is 5.
This is obviously a toy example. As you progress in your lectures, you'll see progressively more complex examples where such function building is mixed with other features to achieve useful things.
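As a minimal sketch of the example discussed above: calling z() builds the inner function, and the result can be stored and called later. The names add_two, make_adder and add_ten are illustrative only and do not come from the question.

```r
z <- function() {
  y <- 2
  function(x) {
    x + y
  }
}

add_two <- z()        # z() returns the inner function; y = 2 travels with it
five <- add_two(3)    # same computation as z()(3)

# Hypothetical generalisation: let the caller choose y instead of fixing it
make_adder <- function(y) {
  function(x) x + y
}
add_ten <- make_adder(10)
thirteen <- add_ten(3)
```

Here z takes no arguments because it has nothing to vary. Once the value of y becomes a parameter, as in make_adder, the outer function needs an argument after all.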

With some help I've picked out a few examples of functions without formal arguments to help you understand why they could be useful.
Functions which have side-effects
plot.new() for instance, initializes a graphics device.
Want to update the console buffer? flush.console() has your back.
Functions which have a narrow purpose
This is probably the majority of the cases.
Want to know the date/time? Call date().
Want to know the version of R? Call getRversion().
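A quick sketch of why such calls need no arguments: the result depends on the state of the session or machine, not on any input. The variable names are just for illustration.

```r
now   <- date()            # character string with the current date and time
r_ver <- getRversion()     # version of the running R, as a version object

n_formals <- length(formals(date))   # 0: date() has no formal arguments
```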

Related

Function taking Vectors and Scalars

I have a function that takes a vector
function foo(x::Vector{Int64})
    x
end
How can I make it work also for scalars (i.e. turn them into one-element vectors)?
I know that I can do this:
foo(x::Int64) = foo([x])
but this stops being cool when there are more arguments, because you're writing multiple methods to achieve only one thing.
I think I need something like foo(x::Union{Int64, Vector{Int64}}), but I don't know where or how it works or if it is the right thing to do.
Can anyone help?
You can make a helper function which either converts or does nothing. Then the main function can accept any combination:
_vec(x::Number) = [x]
_vec(x::AbstractVector) = x
function f(x, y, z) # could specify ::Union{Number, AbstractVector}
    xv = _vec(x)
    yv = _vec(y)
    ...
end
The ... could do the actual work, or could call f(xv, yv, zv) where another method f(x::AbstractVector, y::AbstractVector, z::AbstractVector) does the work --- whichever seems cleaner.
The main time this comes up is if the version of your function for vectors does the same thing for all of its elements. In this case, what you want to do is define f(x::Int), and use broadcasting f.([1,2,3]) for the vector case.

Accessing variables in parent function

I'm working on a script that finds text in large PDFs, and I have the bare bones script written out. I'm trying to refactor my code to encapsulate the main while loop in a function, so I can run sapply() on it with a list of the PDFs. Some of the functions that I call within the main loop require values from that main loop: here's a stripped down, pseudo-version of my code:
pdfParse <- function() {
  N <- sample(1:50, 1)*2
  n = N/2; i = 0
  while (i <= N) {
    what <- whatP(n)
    i = i + length(what)
    if (!length(what)) {break}
    else {n <- N/2 - i}
  }
  n
}
res <- sample(0:1, N)
r = 1
whatP <- function(t) {
  r = r*2
  if (t%%3) {
    if (t%%5) {
      return(res[(n/r):n])
    } else {
      whatP((rev(t)[1]):(rev(t)[1] + r))
    }
  } else {return(rep(NaN, 2))}
}
So my question is, how do I access the variable n that I've defined in the pdfParse function within the function it calls? Even if it's possible, I'd like to avoid assigning it as a global variable. I've read a bit into closures, but I'm not sure if that's an applicable solution here.
Edit: For clarification, whatP(n) starts out with n as its initial argument, but it's recursive, so depending on whether certain conditions are fulfilled, it may end up operating on a vector that doesn't even include n. I still want to return something that depends on the original n I defined in pdfParse.
The simplest (and probably safest, given that your whatP function is recursive) is to make n an argument of whatP.
whatP <- function(t,n) {
...
}
and then call it from pdfParse with two arguments instead of one.
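A minimal sketch of this idea, using hypothetical stand-ins (walk_down for whatP, parse_one for pdfParse) instead of the question's pseudo-code: the first argument changes on every recursive call, while n is simply handed down.

```r
# Hypothetical stand-in for whatP: t changes on each recursive call,
# while n is passed along unchanged.
walk_down <- function(t, n) {
  if (t <= 0) {
    return(n)               # the base case still sees the original n
  }
  walk_down(t - 1, n)
}

# Hypothetical stand-in for pdfParse
parse_one <- function() {
  n <- 10                   # local to parse_one, never a global
  walk_down(3, n)
}

result <- parse_one()
```

Every level of the recursion receives n explicitly, so nothing depends on where the helper happens to be defined.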
If for some reason you don't want to do this, then you have two options
(a) you can define whatP inside pdfParse, after which it can use n as though it were in scope. R's rules for where it looks for a variable are very different from, say, C(++), but they follow where a function was defined, not where it was called from. In order, R searches in
the environment of the current function
the environment in which that function was defined (its enclosing environment)
that environment's own enclosing environment, and so on
the global environment
the environments of loaded packages, in the same order they appear in search().
As written, whatP is defined at top level, so the lookup goes straight from its own environment to the global one and never sees the n inside pdfParse. If whatP is defined inside pdfParse instead, it will find the appropriate value under the second bullet, however deep the recursion goes.
(b) you can use get with a suitable (negative) value of pos, corresponding to the parent. (Alternatively, use sys.frame). Not recommended here as it's tricky to get right with recursive functions, but can be useful in other situations (and it will bypass any n you might have redefined in the meantime in another, closer, scope).
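The lookup rules can be checked with a small sketch. All names here (n_demo, outside, via_frame, caller, inside) are made up for illustration. R resolves a free variable through the environment where the function was defined, so a helper defined at top level does not see the caller's locals unless it asks for the calling frame explicitly.

```r
# Defined at top level: its enclosing environment is not caller()'s frame
outside <- function() n_demo

# Also top level, but explicitly looks in the frame of whoever called it
via_frame <- function() get("n_demo", envir = parent.frame())

caller <- function() {
  n_demo <- 42
  inside <- function() n_demo     # defined here, so it encloses n_demo

  from_inside  <- inside()
  from_frame   <- via_frame()
  # outside() cannot see n_demo; the error is turned into NA for the demo
  from_outside <- tryCatch(outside(), error = function(e) NA)

  list(inside = from_inside, via_frame = from_frame, outside = from_outside)
}

res_scope <- caller()
```

This assumes no variable called n_demo exists in the global environment. If one did, outside() would silently return that value instead of failing.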

Function doesn't change value (R)

I have written a function that takes two arguments, a number between 0:16 and a vector which contains four parameter values.
The output of the function does change if I change the parameters in the vector, but it does not change if I change the number between 0:16.
I should add that the function I'm having trouble with calls another function (called 'pi') which takes the same arguments.
I have checked that the 'pi' function does actually change values if I change the value from 0:16 (and it does also change if I change the values of the parameters).
First, here is my code:
pterm_ny <- function(x, theta){
  (1-sum(theta[1:2]))*(theta[4]^(x))*exp((-1)*theta[4])/pi(x, theta)
}
pi <- function(x, theta){
  theta[1]*1*(x==0)+theta[2]*(theta[3]^(x))*exp((-1)*(theta[3]))+
    (1-sum(theta[1:2]))*(theta[4]^(x))*exp((-1)*(theta[4]))
}
This returns 0.75 for pterm_ny(i,c(0.2,0.2,2,2)), where i = 1,...,16, and 0.2634 for i = 0, which tells me that the indicator function part in 'pi' does work.
With respect to raising a number to a certain power, I have been told that one should wrap the exponent in I(), for example:
x^I(2)
I have tried to do that in my code, but that didn't help either.
I can't remember the argument for doing it, but I expect that it's to ensure that the number in parentheses is interpreted as an integer.
My end goal is to get 17 different values of the 'pterm' and to accomplish that, I was thinking of using the sapply function like this;
sapply(c(0:16),pterm_ny,theta = c(0.2,0.2,2,2))
I really hope that someone can point out what I'm missing here.
In advance, thank you!
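You have a theta[4]^x term both in your main expression and in your pi() function. With theta[3] equal to theta[4] (both are 2 in your example), every term of pi() for x > 0 is a multiple of 2^x * exp(-2), so that factor cancels against the numerator and leaves the result invariant to changes in x.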
Also:
you might want to avoid using pi as your function name, as it's also a built-in variable (3.14159...) - this can sometimes cause confusion
the advice about using the "as is" function I() to protect powers is only relevant within formulas, e.g. as used in lm() (linear regression). (It would be used as I(x^2), not x^I(2).)
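To see the cancellation, here is a sketch that reuses the question's formulas with the helper renamed to mix_prob (a hypothetical name, chosen so the built-in constant pi is not masked). The second parameter vector is an arbitrary choice with theta[3] different from theta[4].

```r
# mix_prob is a hypothetical new name for the question's pi()
mix_prob <- function(x, theta) {
  theta[1] * (x == 0) +
    theta[2] * theta[3]^x * exp(-theta[3]) +
    (1 - sum(theta[1:2])) * theta[4]^x * exp(-theta[4])
}
pterm_ny <- function(x, theta) {
  (1 - sum(theta[1:2])) * theta[4]^x * exp(-theta[4]) / mix_prob(x, theta)
}

# theta[3] == theta[4]: the same value for every x > 0
same_rate <- sapply(0:16, pterm_ny, theta = c(0.2, 0.2, 2, 2))
# theta[3] != theta[4]: the value now changes with x
diff_rate <- sapply(0:16, pterm_ny, theta = c(0.2, 0.2, 2, 3))
```

With equal rates the ratio reduces to (1 - theta[1] - theta[2]) / (1 - theta[1]) for x > 0, which is 0.6 / 0.8 = 0.75 here. Only x = 0 differs, because of the indicator term.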

Using outer() with a multivariable function

Suppose you have a function f<- function(x,y,z) { ... }. How would you go about passing a constant to one argument, but letting the other ones vary? In other words, I would like to do something like this:
output <- outer(x,y,f(x,y,z=2))
This code doesn't evaluate, but is there a way to do this?
outer(x, y, f, z=2)
The arguments after the function are additional arguments to it, see ... in ?outer. This syntax is very common in R, the whole apply family works the same for instance.
Update:
I can't tell exactly what you want to accomplish in your follow-up question, but I think a solution of this form is probably what you should use.
outer(sigma_int, theta_int, function(s,t)
dmvnorm(y, rep(0, n), y_mat(n, lambda, t, s)))
This calculates a variance matrix for each combination of the values in sigma_int and theta_int, uses that matrix to define a density and evaluates it at the point(s) defined in y. I haven't been able to test it though, since I don't know the types and dimensions of the variables involved.
outer (along with the apply family of functions and others) will pass along extra arguments to the functions which they call. However, if you are dealing with a case where this is not supported (optim being one example), then you can use the more general approach of currying. To curry a function is to create a new function which has (some of) the variables fixed and therefore has fewer parameters.
library("functional")
output <- outer(x,y,Curry(f,z=2))
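For comparison, a base R sketch of the two approaches that need no extra package. The function f and the vectors x and y are made up for illustration. Note that outer() calls the function once on whole vectors, so f must be vectorised.

```r
# Illustrative f, x and y; any vectorised three-argument function would do
f <- function(x, y, z) x + y * z
x <- 1:3
y <- 1:2

# Extra argument passed through ... to f
via_dots <- outer(x, y, f, z = 2)

# z fixed by a wrapper function, which is what currying amounts to
via_closure <- outer(x, y, function(x, y) f(x, y, z = 2))

same_result <- identical(via_dots, via_closure)
```

The wrapper form also works with functions that do not forward extra arguments.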

How to specify FUN used in by( ) or related apply( ) functions

In a by() function, I will use cor (correlation) to be the FUN there. However, I'd like to setup use="complete.obs" too.
I don't know how to pass this argument in the FUN = cor part.
For example,
by(data, INDICES=list(data$Age), FUN=cor)
probably
by(data, INDICES=list(data$Age), FUN=cor, use = "complete.obs")
will work.
Any additional arguments given to by are passed on to FUN.
If you start looking around at various R help files for functions like by, you may start to notice a curious 'argument' popping up over and over again: .... You're going to see an ellipsis listed along with all the other arguments to a function.
This is actually an argument itself. It will collect any other arguments you pass and hand them off to subsequent functions called later. The documentation will usually tell you what function these arguments will be handed to.
In this case, in ?by we see this:
... further arguments to FUN.
This means that any other arguments you pass to by that don't match the ones listed will be handed off to the function you pass to FUN.
Another common instance can be found in plot, where the documentation only lists two specific arguments, x and y. Then there's the ... which gathers up anything else you pass to plot and hands it off to methods or to par to set graphical parameter settings.
So in @kohske's example, use = "complete.obs" will be automatically passed on to cor, since it doesn't match any of the other arguments for by.
@kohske and @joran give equivalent answers showing built-in features of by (which are also present in apply and the entire plyr family) for passing additional arguments to the supplied function, since this is a common application/problem. @Tomas also shows another way: specifying an anonymous function which just calls the "real" function with certain parameters fixed. Fixing parameters to a function call (to effectively make a function with fewer arguments) is a common approach, especially in functional approaches to programming; in that context it is called currying or partial application.
library("functional")
by(data, INDICES=list(data$Age), FUN=Curry(cor, use = "complete.obs"))
This approach can be used when one function does not use ... to "pass along" arguments, and you want to indicate the only reason that an anonymous function is needed is to specify certain arguments.
In general, you have 2 possibilities:
1) specify the arguments in the calling function (tapply() or by() in this case). This also works even if the key argument to fun() is not the first one:
fun <- function(arg1, arg2, arg3) { ... } # just to show what fun() looks like
tapply(var1, var2, fun, arg1 = something, arg3 = something2)
# arg2 will be filled by tapply
2) you may write your wrapper function (sometimes this is needed):
tapply(var1, var2, function (x) { fun(something, x, something2) })
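Both possibilities can be tried on a small made-up data frame (df and its columns a, b and g are assumptions for illustration, not the asker's data). The sketch also runs the by() call with use = "complete.obs" on two numeric columns.

```r
# Made-up data: two groups, one missing value in column a
df <- data.frame(
  a = c(1, 2, 3, NA, 1, 2, 3, 4),
  b = c(2, 4, 6, 8, 4, 3, 2, 1),
  g = rep(c("u", "v"), each = 4)
)

# 1) extra argument handed through ... to the function
m_dots <- tapply(df$a, df$g, mean, na.rm = TRUE)

# 2) the same computation with a wrapper function
m_wrapper <- tapply(df$a, df$g, function(v) mean(v, na.rm = TRUE))

# by() forwards use = "complete.obs" to cor for each group
cors <- by(df[c("a", "b")], INDICES = list(df$g), FUN = cor, use = "complete.obs")
```

Each element of cors is the correlation matrix of a and b within one group, computed on the complete rows only.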
