Function taking Vectors and Scalars - julia

I have a function that takes a vector
function foo(x::Vector{Int64})
x
end
How can I make it work also for scalars (i.e. turn them into one-element vectors)?
I know that I can do this:
foo(x::Int64) = foo([x])
but this stops being cool when there are more arguments, because you're writing multiple methods to achieve only one thing.
I think I something like foo(x::Union{Int64, Vector{Int64}}), but I don't know where or how it works or if it is the right thing to do.
Can anyone help?

You can make a helper function which either converts or does nothing. Then the main function can accept any combination:
_vec(x::Number) = [x]
_vec(x::AbstractVector) = x
function f(x, y, z) # could specify ::Union{Number, AbstractVector}
xv = _vec(x)
yv = _vec(y)
...
end
The ... could do the actual work, or could call f(xv, yv, zv) where another method f(x::AbstractVector, y::AbstractVector, z::AbstractVector) does the work --- whichever seems cleaner.

The main time this comes up is if the version of your function for vectors does the same thing for all of it's elements. In this case, what you want to do is define f(x::Int), and use broadcasting f.([1,2,3]) for the vector case.

Related

getindex for specific dims in Julia

Suppose I want to write a dynamic function that gets an object subtype of AbstractMatrix and shuffles the values along a specified dimension. Surely there can be various approaches and ways to do this, but suppose the following way:
import Random.shuffle
function shuffle(data::AbstractMatrix; dims=1)
n = size(data, dims)
shuffled_idx = shuffle(1:n)
data[shuffled_idx, :] #This line is wrong. It's not dynamic
A wrong way is to use several (actually indefinite) if-else statements like if dims==1 do... if dims==2 do. But it isn't the way to do these kinds of things. I could write data::AbstractArray then the input could have various dimensions. So this came to my mind that this can be possible if I can do something like getindex(data, [idxs]; dims). But I checked for the dims keyword argument (or even positional one) in the dispatches of getindex, but there isn't such a definition. So how can I get values by specified indexes and along a dim?
You are looking for selectdim:
help?> selectdim
search: selectdim
selectdim(A, d::Integer, i)
Return a view of all the data of A where the index for dimension d equals i.
Equivalent to view(A,:,:,...,i,:,:,...) where i is in position d.
Here's a code example:
function myshuffle(data::AbstractMatrix; dim=1)
inds = shuffle(axes(data, dim))
return selectdim(data, dim, inds)
end
Make sure not to use 1:n as indices for AbstractArrays, as they may have non-standard indices. Use axes instead.
BTW, selectdim apparently returns a view, so you may or may not need to use collect on it.

Julia convention for optional arguments

Say I have a function such as f(x,y) but the y parameter is optional. What is the preferred way to set y as an optional argument? One option that works for me:
function f(x, y=nothing)
# do stuff
if y == nothing
# do stuff
else
# do stuff
end
# do stuff
end
But is this the preferred way? I can't set y to a single default value to use in calculation, as there are a number of calculations that are done differently when y is nothing. I could also just have separate functions f(x) and f(x,y) but that seems like too much code duplication.
This is fine. Note that optional arguments cause dispatch. This means that the if y == nothing (or equivalently, if typeof(y) <: Void), will actually compile away. You'll get two different functions which depend on whether the user gives a value or not. So in the end, the if statement compiles away and it's perfectly efficient to do this.
I will note that the same is not currently true for keyword arguments.
Another way to do this is to have two methods:
f(x)
and
f(x,y)
Whether two methods is nicer than 1 method with an if depends on the problem. Since the if will compile away using type information, there's no difference between these two approaches other than code organization.

If function(x) can work, why would we need function()?

I understand how "function(x)" works, but what is the role of "function()" here?
z <- function() {
y <- 2
function(x) {
x + y
}
}
function is a keyword which is part of the creation of a function (in the programming sense that Gilles describes in his answer). The other parts are the argument list (in parentheses) and the function body (in braces).
In your example, z is a function which takes no arguments. It returns a function which takes 1 argument (named x) (since R returns the last evaluated statement as the return value by default). That function returns its argument x plus 2.
When z is called (with no arguments: z()) it assigns 2 to y (inside the functions variable scope, an additional concept that I'm not going to get into). Then it creates a function (without a name) which takes a single argument named x, which, when itself called, returns its argument x plus 2. That anonymous function is returned from the call to z and, presumably, stored so that it can be called later.
See https://github.com/hadley/devtools/wiki/Functions and https://github.com/hadley/devtools/wiki/Functionals for more discussion on passing around functions as objects.
The word “function” means somewhat different things in mathematics and in programming. In mathematics, a function is a correspondence between each possible value of the parameters and a result. In programming, a function is a sequence of instructions to compute the result from the parameters.
In mathematics, a function with no argument is a constant. In programming, this is not the case, because functions can have side effects, such as printing something. So you will encounter many functions with no arguments in programs.
Hre the function function(x) { x + y } depends on the variable y. There are no side effects, so this function is very much like the mathematical function defined by $f(x) = x + y$. However, this definition is only complete for a given value of y. The previous instruction sets y to 2, so
function() {
y <- 2
function(x) {
x + y
}
}
is equivalent to
function () {
function(x) {
x + 2
}
}
in the sense that both definitions produce the same results when applied to the same value. They are, however, computed in slightly different ways.
That function is given the name z. When you call z (with no argument, so you write z()), this builds the function function (x) { x + 2 }, or something equivalent: z() is a function of one argument that adds 2 to its argument. So you can write something like z()(3) — the result is 5.
This is obviously a toy example. As you progress in your lectures, you'll see progressively more complex examples where such function building is mixed with other features to achieve useful things.
With some help I've picked out a few examples of functions without formal arguments to help you understand why they could be useful.
Functions which have side-effects
plot.new() for instance, initializes a graphics device.
Want to update the console buffer? flush.console() has your back.
Functions which have a narrow purpose
This is probably the majority of the cases.
Want to know the date/time? Call date().
Want to know the version of R? Call getRversion().

Using outer() with a multivariable function

Suppose you have a function f<- function(x,y,z) { ... }. How would you go about passing a constant to one argument, but letting the other ones vary? In other words, I would like to do something like this:
output <- outer(x,y,f(x,y,z=2))
This code doesn't evaluate, but is there a way to do this?
outer(x, y, f, z=2)
The arguments after the function are additional arguments to it, see ... in ?outer. This syntax is very common in R, the whole apply family works the same for instance.
Update:
I can't tell exactly what you want to accomplish in your follow up question, but think a solution on this form is probably what you should use.
outer(sigma_int, theta_int, function(s,t)
dmvnorm(y, rep(0, n), y_mat(n, lambda, t, s)))
This calculates a variance matrix for each combination of the values in sigma_int and theta_int, uses that matrix to define a dennsity and evaluates it in the point(s) defined in y. I haven't been able to test it though since I don't know the types and dimensions of the variables involved.
outer (along with the apply family of functions and others) will pass along extra arguments to the functions which they call. However, if you are dealing with a case where this is not supported (optim being one example), then you can use the more general approach of currying. To curry a function is to create a new function which has (some of) the variables fixed and therefore has fewer parameters.
library("functional")
output <- outer(x,y,Curry(f,z=2))

How to specify FUN used in by( ) or related apply( ) functions

In a by() function, I will use cor (correlation) to be the FUN there. However, I'd like to setup use="complete.obs" too.
I don't know how to pass this argument in the FUN = cor part.
For example,
by(data, INDICES=list(data$Age), FUN=cor)
probably
by(data, INDICES=list(data$Age), FUN=cor, use = "complete.obs")
will work.
the arguments to by are passed to FUN.
If you start looking around at various R help files for functions like by, you may start to notice a curious 'argument' popping up over and over again: .... You're going to see an ellipsis listed along with all the other arguments to a function.
This is actually an argument itself. It will collect any other arguments you pass and hand them off to subsequent functions called later. The documentation will usually tell you what function these arguments will be handed to.
In this case, in ?by we see this:
... further arguments to FUN.
This means that any other arguments you pass to by that don't match the ones listed will be handed off to the function you pass to FUN.
Another common instance can be found in plot, where the documentation only lists two specific arguments, x and y. Then there's the ... which gathers up anything else you pass to plot and hands it off to methods or to par to set graphical parameter settings.
So in #kohske's example, use = "complete.obs" will be automatically passed on the cor, since it doesn't match any of the other arguments for by.
#kohske and #joran give equivalent answers showing built in features of by (which are also present in apply and the entire plyr family) for passing additional arguments to the supplied function since this is a common application/problem. #Tomas also shows another way to specify an anonymous function which is just a function that calls the "real" function with certain parameters fixed. Fixing parameters to a function call (to effectively make a function with fewer arguments) is a common approach, especially in functional approaches to programming; in that context it is called currying or partial application.
library("functional")
by(data, INDICES=list(data$Age), FUN=Curry(cor, use = "complete.obs"))
This approach can be used when one function does not use ... to "pass along" arguments, and you want to indicate the only reason that an anonymous function is needed is to specify certain arguments.
In general, you have 2 possibilities:
1) specify the arguments in the calling function (tapply() or by() in this case). This also works even if the key argument to fun() is not the first one:
fun <- function(arg1, arg2, arg3) { ... } # just to see how fun() looks like
tapply(var1, var2, fun, arg1 = something, arg3 = something2)
# arg2 will be filled by tapply
2) you may write your wrapper function (sometimes this is needed):
tapply(var1, var2, function (x) { fun(something, x, something2) })

Resources