Using argument of outer function as global variable for a function defined inside outer function - "function factory" - r

Is this bad practice? It seems like a lot could go wrong here.*
I am setting the argument of an outer function to be a global variable for a function defined inside it. I am just doing this to work around some existing code.
f = function(a, b) {
  h = function(c) { print(b); b + c }  # the last evaluated value, h, is what f returns
}
myh = f(1, 2)
myh(7)
# [1] 2
# [1] 9
*On the other hand, it's perfectly acceptable to write something like
h = function(c){print(7);7+c}

Creating a function that creates functions (a "function factory") is a perfectly acceptable code practice. See https://adv-r.hadley.nz/function-factories.html for more details on the technical implementation in R.
It is most often used when you need to create functions at runtime or when you need many similar functions.
The factory you have written is similar to one that creates counters with different step sizes, each reporting how much it increments by.
It is important, however, to keep track of the functions you create this way.
Let me know if you'd like more clarification on anything.
(One possible bad practice in the function you have written, though, is the unused argument a.)

Related

Julia type conversion best practices

I have a function which requires a DateTime argument. A possibility is that a user might provide a ZonedDateTime argument. As far as I can tell there are three possible ways to catch this without breaking:
Accept both arguments in a single method, and perform a type conversion if necessary via an if... statement
function ofdatetime(dt::AbstractDateTime)
    if dt isa ZonedDateTime
        dt = DateTime(dt, UTC)
    end
    ...
end
Define a second method which simply converts the type and calls the first method
function ofdatetime(dt::DateTime)
    ...
end
function ofdatetime(dt::ZonedDateTime)
    dt = DateTime(dt, UTC)
    return ofdatetime(dt)
end
Redefine the entire function body for the second method
function ofdatetime(dt::DateTime)
    ...
end
function ofdatetime(dt::ZonedDateTime)
    dt = DateTime(dt, UTC)
    ...
end
Of course, this doesn't apply when a different argument type implies that the function should actually do something different - that's the whole point of multiple dispatch - but this is a toy example. I'm wondering what best practice is in these cases? It needn't be exclusively to do with time zones; this is just the example I'm working with. Perhaps a relevant question is 'how does Julia do multiple dispatch under the hood?' i.e. are arguments dispatched to the relevant methods by something like an if... else/switch... case block, or is it more clever than that?
The answer in the comments is correct that, ideally, you would write your ofdatetime function such that all operations on dt within the function body are general to any AbstractDateTime; in any case where the difference between DateTime and ZonedDateTime matters, you can use dispatch at that point within your function to take care of the details.
Failing that, either of 2 or 3 is generally preferable to 1 in your question, since for either of those the branch can be elided when the type of dt is known at compile time. Of the latter two, 2 is probably preferable to 3 as written in your example in terms of general code style ("DRY"), but if you can avoid the type conversion by writing entirely different function bodies, then 3 could actually perform better whenever the type conversion is at all expensive.
In general, though, the best of all worlds is to keep most of your code generic to either type, and only dispatch at the last possible moment.
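For illustration, a minimal sketch of that approach (assuming TimeZones.jl provides ZonedDateTime; the helper name to_utc and the year call are just placeholders):
using Dates, TimeZones   # TimeZones.jl assumed for ZonedDateTime

# Push the DateTime/ZonedDateTime difference into one tiny helper
to_utc(dt::DateTime) = dt                      # already zone-free, nothing to do
to_utc(dt::ZonedDateTime) = DateTime(dt, UTC)  # strip the zone only where it matters

function ofdatetime(dt::AbstractDateTime)
    d = to_utc(dt)        # dispatch happens here, at the last possible moment
    return Dates.year(d)  # placeholder for the real body, which only sees a DateTime
end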

Pipe with additional Arguments

I read in several places that pipes in Julia only work with functions that take only one argument. This is not true, since I can do the following:
function power(a, b = 2) a^b end
3 |> power
> 9
and it works fine.
However, I can't completely get my head around the pipe. E.g., why is this not working?
3 |> power()
> MethodError: no method matching power()
What I would actually like to do is use a pipe and define additional arguments, e.g. keyword arguments, so that it is clear which argument to pass when piping (namely the only positional one):
function power(a; b = 2) a^b end
3 |> power(b = 3)
Is there any way to do something like this?
I know I could do a work-around with the Pipe package, but to be honest it feels kind of clunky to write @pipe at the start of half of the lines.
In R, the magrittr package has a convincing logic (in my opinion): by default it passes what's on the left of the pipe as the first argument to the function on the right - I'm looking for something similar.
power as defined in the first snippet has two methods: one with one argument, one with two. So the point about |> working only with one-argument methods still holds.
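For instance:
julia> power(a, b = 2) = a^b
power (generic function with 2 methods)

julia> 3 |> power   # |> supplies exactly one argument, so this calls power(3)
9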
The kind of thing you want to do is called "partial application", and very common in functional languages. You can always write
3 |> (a -> power(a, 3))
but that gets clunky quickly. Other languages have syntax like power(%1, 3) to denote that lambda. There's discussion about adding something similar to Julia, but it's difficult to get right. Pipe is exactly the macro-based fix for it.
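With the Pipe package that looks roughly like this (assuming Pipe.jl's @pipe macro, where _ stands for the piped value):
using Pipe            # assumed: the Pipe.jl package

power(a, b = 2) = a^b # as in the first snippet
@pipe 3 |> power(_, 3)   # rewritten to power(3, 3) == 27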
If you have control over the defined method, you can also implement methods with an interface that return partially applied versions as you like -- many predicates in Base do this already, e.g., ==(1). There's also the option of Base.Fix2(power, 3), but that's not really an improvement, if you ask me (apart from maybe being nicer to the compiler).
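For example, with the two-argument power from the first snippet:
3 |> Base.Fix2(power, 3)     # Fix2 fixes the second argument, so this is power(3, 3) == 27
filter(==(1), [1, 2, 1])     # ==(1) is itself a partially applied predicate; keeps the 1s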
And note that magrittr's pipes are also "macro"-based. The difference is that argument passing in R is way more complicated, and you can't see from the outside whether an argument is used as a value or as an expression (essentially, R passes a thunk containing the expression and a pointer to the parent environment, and automatically evaluates and caches it if you use it as a value; see substitute).

Best practice to access variables several function layers deep

I am often confronted with the following situation when I debug my Julia code:
I suspect that a certain variable (often a large matrix) deep inside my code is not what I intended it to be and I want to have a closer look at it. Ideally, I want to have access to it in the REPL so I can play around with it.
What is the best practice to get access to variables several function layers deep without passing them up the chain, i.e. changing the function returns?
Example:
function multiply(u)
    v = 2*u
    w = subtract(v)
    return w
end
function subtract(x)
    i = x - 5
    t = 10
    return i - 3t
end
multiply(10)
If I run multiply() and suspect that the intermediate variable i is not what I assume it should be, how would I gain access to it in the REPL?
I know that I could just write a test function and test that i has the intended properties right inside subtract(), but sometimes it would just be quicker to use the REPL.
This is the same in any programming language. You can use debugging tools like ASTInterpreter2 (which has good Juno integration) to step through your code and get an interactive REPL in the current environment, or you can use println debugging, where you run the code with @show commands in there to print out values.
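For example, with the multiply/subtract from the question, dropping an @show into subtract makes the intermediate value visible in the REPL when you re-run:
function subtract(x)
    i = x - 5
    @show i          # prints "i = 15" when multiply(10) runs
    t = 10
    return i - 3t
end

multiply(10)   # i = 15 is shown, then -15 is returned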

How to restore an overwritten built-in function in Julia

The question may apply for other languages, too.
If I use a built-in function name as a variable name,
I can restore the function by doing:
all = 123
all = Base.all
But if I define, for example, a custom function sum() and then do:
sum = Base.sum
I get an error saying "invalid redefinition of constant sum".
Is there a way to restore a built-in function if I over-wrote it? Or is that impossible by design?
For this example, you could just redefine sum as Base.sum:
sum(x) = Base.sum(x)
Is that what you would like?
NB: this may not "overwrite" your definition of sum. If it uses type parameters (e.g. sum(x::Vector)), it may still be dispatched in preference to the general sum(x), in which case you would need to repeat the above for those specific methods.
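For example, in a fresh REPL session (a hypothetical sketch):
julia> sum(x::Vector) = 0        # a custom method that shadows Base.sum for Vectors
sum (generic function with 1 method)

julia> sum(x) = Base.sum(x)      # general fallback that forwards to Base
sum (generic function with 2 methods)

julia> sum((1, 2, 3))            # tuples fall through to Base.sum
6

julia> sum([1, 2, 3])            # the specific Vector method still wins dispatch
0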
If this is simply a problem for you when you are working in the REPL, and you don't mind losing all your other definitions, you could do workspace() to reset Main.

How to get a function from a symbol without using eval?

I've got a symbol that represents the name of a function to be called:
julia> func_sym = :tanh
I can use that symbol to get the tanh function and call it using:
julia> eval(func_sym)(2)
0.9640275800758169
But I'd rather avoid the 'eval' there as it will be called many times and it's expensive (and func_sym can have several different values depending on context).
IIRC in Ruby you can say something like:
obj.send(func_sym, args)
Is there something similar in Julia?
EDIT: some more details on why I have functions represented by symbols:
I have a type (from a neural network) that includes the activation function; originally I included it as a function:
type NeuralLayer
    weights::Matrix{Float32}
    biases::Vector{Float32}
    a_func::Function
end
However, I needed to serialize these things to files using JLD, but it's not possible to serialize a Function, so I went with a symbol:
type NeuralLayer
    weights::Matrix{Float32}
    biases::Vector{Float32}
    a_func::Symbol
end
And currently I use the eval approach above to call the activation function. There are collections of NeuralLayers and each can have its own activation function.
@Isaiah's answer is spot-on; perhaps even more so after the edit to the original question. To elaborate and make this more specific to your case: I'd change your NeuralLayer type to be parametric:
type NeuralLayer{func_type}
    weights::Matrix{Float32}
    biases::Vector{Float32}
end
Since func_type doesn't appear in the types of the fields, the constructor will require you to explicitly specify it: layer = NeuralLayer{:excitatory}(w, b). One restriction here is that you cannot modify a type parameter.
Now, func_type could be a symbol (like you're doing now) or it could be a more functionally relevant parameter (or parameters) that tunes your activation function. Then you define your activation functions like this:
# If you define your NeuralLayer with just one parameter:
activation(layer::NeuralLayer{:inhibitory}) = …
activation(layer::NeuralLayer{:excitatory}) = …
# Or if you want to use several physiological parameters instead:
activation{g_K,g_Na,g_l}(layer::NeuralLayer{g_K,g_Na,g_l}) = f(g_K, g_Na, g_l)
The key point is that functions and behavior are external to the data. Use type definitions and abstract type hierarchies to define behavior, as is coded in the external functions… but only store data itself in the types. This is dramatically different from Python or other strongly object-oriented paradigms, and it takes some getting used to.
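Putting it together, here is a hypothetical sketch in more recent Julia syntax (struct rather than the old type keyword; the :tanh and :relu parameter values and the activation bodies are made up for illustration):
struct NeuralLayer{func_type}
    weights::Matrix{Float32}
    biases::Vector{Float32}
end

activation(layer::NeuralLayer{:tanh}, x) = tanh.(layer.weights * x .+ layer.biases)
activation(layer::NeuralLayer{:relu}, x) = max.(layer.weights * x .+ layer.biases, 0f0)

layer = NeuralLayer{:tanh}(rand(Float32, 4, 3), zeros(Float32, 4))
activation(layer, rand(Float32, 3))   # dispatches on the type parameter, no eval needed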
But I'd rather avoid the 'eval' there as it will be called many times and it's expensive (and func_sym can have several different values depending on context).
This sort of dynamic dispatch is possible in Julia, but not recommended. Changing the value of 'func_sym' based on context defeats type inference as well as method specialization and inlining. Instead, the recommended approach is to use multiple dispatch, as detailed in the Methods section of the manual.

Resources