I am working on a meta-interpreter for a language fragment that needs to be rich enough to support higher-order functions, and running into a problem with closures.
Specifically, I need all values to be representable as finite terms: no infinite recurrence, no objects pointing to each other. This is fine for most kinds of values: numbers, finite lists, maps, and abstract syntax trees representing program code. The problem is closures. A closure contains a reference to its containing environment, but if the closure is stored in a local variable, then that containing environment also contains a reference to the closure. This is fine if you are working with mutable pointers, but it's an infinite recurrence if you are trying to work with finite terms.
Is there a known technique for representing closures as finite terms, or some technique I am missing that bypasses the problem?
I once had a similar problem when I wrote a functional language that used reference counting GC, and therefore had to avoid any cyclic references.
My solution was that the environment didn't actually include a pointer to the closure. Instead, it stored a function that would produce the closure when you pass it a pointer to the environment.
The process of looking up a value from the environment would include calling that function to produce the closure if necessary. That function, of course, doesn't need a cyclic pointer to the environment, because the environment will be passed in.
This trick is basically making recursive data structures in the same way that the Y combinator is used to make recursive functions.
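A minimal sketch of that idea in Python, under some loud assumptions: the names `Delayed`, `Closure`, `lookup` and `apply` are hypothetical, and a Python function stands in for the body where a real meta-interpreter would store an AST. The environment holds only a recipe for the closure, and `lookup` builds the closure by handing the recipe the environment it was found in, so no term ever contains itself.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Delayed:
    """A recipe for a closure: parameter and body, but no environment."""
    param: str
    body: Callable  # stand-in for an AST; receives the call environment

@dataclass(frozen=True)
class Closure:
    """A closure built on demand; its env holds recipes, never closures."""
    param: str
    body: Callable
    env: dict

def lookup(env, name):
    # Build the closure at lookup time by pairing the recipe with the
    # environment it was found in, instead of storing a cyclic pointer.
    entry = env[name]
    if isinstance(entry, Delayed):
        return Closure(entry.param, entry.body, env)
    return entry

def apply(closure, arg):
    # Extend a copy of the captured environment with the argument.
    call_env = dict(closure.env)
    call_env[closure.param] = arg
    return closure.body(call_env)

def fact_body(env):
    # Recursive call goes through lookup, which rebuilds the closure.
    n = lookup(env, "n")
    if n == 0:
        return 1
    return n * apply(lookup(env, "fact"), n - 1)

env = {"fact": Delayed("n", fact_body)}
result = apply(lookup(env, "fact"), 5)
```

The cost of this design is that a closure is rebuilt on every lookup, which is cheap here because building one only pairs existing pieces together.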
I've noticed that defining unnecessary partial derivatives can significantly slow down the optimizer. Therefore I'm trying to understand: how can I know whether I should define the partial derivative for a certain input/output relationship?
When you say "unnecessary" do you mean partial derivatives that are always zero?
Using declare_partials('*', '*') when a component is really more sparse than that will significantly slow down your model. Anywhere a partial derivative is always zero, you should simply not declare it.
Furthermore, if you have a vectorized operation, then your Jacobian is actually a diagonal matrix. In that case, you should declare a sparse partial derivative by giving rows and cols arguments to the declare_partials call. This will often substantially speed up your code.
Technically speaking, if you follow the data path from all of your design variables, through each component, to the objective and constraints, then any variable you pass through would need to have its partials defined. But practically speaking you should declare and specify all the partials for every output w.r.t. every input (unless they are zero), so that changes to model connectivity don't break your derivatives.
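To see why a vectorized operation gives a diagonal Jacobian, here is a small pure-Python sketch that does not call OpenMDAO at all. The function `f` and the helper `dense_jacobian` are hypothetical illustrations. The `rows`, `cols` and `vals` lists at the end show the kind of index lists and values that a sparse declaration describes: only the n diagonal entries instead of all n*n.

```python
def f(x):
    # Elementwise (vectorized) operation: y[i] depends only on x[i].
    return [xi ** 2 for xi in x]

def dense_jacobian(func, x, h=1e-6):
    # Forward-difference estimate of J[i][j] = d y[i] / d x[j].
    y0 = func(x)
    jac = [[0.0] * len(x) for _ in range(len(y0))]
    for j in range(len(x)):
        xp = list(x)
        xp[j] += h
        yp = func(xp)
        for i in range(len(y0)):
            jac[i][j] = (yp[i] - y0[i]) / h
    return jac

x = [1.0, 2.0, 3.0, 4.0]
n = len(x)
J = dense_jacobian(f, x)

# Every off-diagonal entry is exactly zero, so storing them is wasted work.
off_diagonal_is_zero = all(
    J[i][j] == 0.0 for i in range(n) for j in range(n) if i != j
)

# Sparse description of the same Jacobian: entry k sits at (rows[k], cols[k]).
rows = list(range(n))
cols = list(range(n))
vals = [2.0 * xi for xi in x]  # analytic derivative of x**2
```

For n inputs the dense form stores n*n numbers while the sparse form stores n, which is where the speed-up for large vectorized components comes from.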
It takes a little bit more time to declare your partials more sparsely, but the performance speed up is well worth it.
I think they need to be defined if they are ever relevant to a response (constraint or objective) in the optimization, or as part of a nonlinear solve within a group. My personal practice is to always define them. Should I ever change my optimization problem, which I do often, I don't want to have to go back and make sure I'm always defining the appropriate derivatives.
The master-branch of OpenMDAO contains some jacobian-coloring techniques which can significantly improve performance if your problem is particularly sparse in nature. This method is enabled by setting the following options on the driver:
p.driver.options['dynamic_simul_derivs'] = True
p.driver.options['dynamic_simul_derivs_repeats'] = 5
This method works by filling in the user-described sparsity pattern (specified with rows and cols in declare_partials) with random numbers and computing the total jacobian. The repeat option is there to improve confidence in the results, since it's possible but unlikely that a single pass will result in an "incidental zero" in the jacobian that is not truly part of the sparsity structure.
With this technique, and by doing things like vectorizing my calculations instead of using nested for loops, I've been able to get very good performance in a lot of situations. Of course, the effectiveness of these methods is going to change from model to model.
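The random-fill idea can be sketched in a few lines of pure Python. This is only an illustration of the principle described above, not OpenMDAO's implementation. The two-component chain, the function names and the tolerance are all assumptions made for the example.

```python
import random

def random_fill(pattern, rng):
    # Put a random number wherever the declared pattern allows a nonzero.
    return [[rng.uniform(-1.0, 1.0) if nz else 0.0 for nz in row]
            for row in pattern]

def matmul(a, b):
    # Plain matrix product, used here as the chain rule for two components.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def total_sparsity(outer, inner, repeats=5, seed=0, tol=1e-12):
    # An entry counts as structurally nonzero if ANY pass makes it nonzero,
    # so an incidental zero in one pass is corrected by the other passes.
    rng = random.Random(seed)
    n_rows, n_cols = len(outer), len(inner[0])
    nonzero = [[False] * n_cols for _ in range(n_rows)]
    for _ in range(repeats):
        total = matmul(random_fill(outer, rng), random_fill(inner, rng))
        for i in range(n_rows):
            for j in range(n_cols):
                if abs(total[i][j]) > tol:
                    nonzero[i][j] = True
    return nonzero

# Hypothetical chain x -> y -> z with declared partial sparsity patterns.
dy_dx = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # vectorized, so diagonal
dz_dy = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
pattern = total_sparsity(dz_dy, dy_dx)
```

Once the total sparsity is known, columns that never share a nonzero row can be grouped and solved for together, which is what makes coloring pay off on sparse problems.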
Can someone give me an explanation of the sentence below, highlighted in bold?
"Primitive functions are only found in the base package, and since they operate at a low level, they can be more efficient (primitive replacement functions don’t have to make copies), and can have different rules for argument matching (e.g., switch and call). This, however, comes at a cost of behaving differently from all other functions in R. Hence the R core team generally avoids creating them unless there is no other option."
Source Link: http://adv-r.had.co.nz/Functions.html#lexical-scoping
I have to write an Excel add-in in F#. It does some pretty heavy computations in order to calibrate some curves as a first step in some User Defined Functions.
As a second step, I need to re-use the representation of the universe (the curves calibrated in the first step) as an argument for other functions.
When I was doing this in a procedural language with state, I would just return a string handle on the universe, which would be an object that I would store in memory. If I am doing this in F#, am I breaking the functional language paradigm?
Is there an elegant way to do a similar thing without having to redo the calibration from the first step? Here I am using Excel, but this is a more general question.
Do you mean that you have user-defined function A and UDF B, and both of them require calling another function to calibrate? If that's the case, then it sounds like you should memoize the calibration function and have A and B use the memoized function.
As a side note, you should consider disregarding typical academic implementations of memoization and consider one with limits on the upper bound of inputs.
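As a rough illustration of bounded memoization, here is a sketch in Python rather than F#, reading "limits on the upper bound" as a cap on the number of cached results. The functions `calibrate`, `udf_a` and `udf_b` are hypothetical stand-ins, and the arithmetic inside `calibrate` is a placeholder for the real curve calibration. `functools.lru_cache` with `maxsize` evicts the least recently used entry once the cap is reached, so memory use stays bounded.

```python
from functools import lru_cache

@lru_cache(maxsize=32)  # keep at most 32 calibrated universes in memory
def calibrate(market_quotes):
    # Placeholder for the expensive calibration step. The argument must
    # be hashable, so quotes are passed as a tuple rather than a list.
    return tuple(q * 1.01 for q in market_quotes)

def udf_a(market_quotes):
    # First user-defined function: triggers the calibration if needed.
    return sum(calibrate(market_quotes))

def udf_b(market_quotes):
    # Second user-defined function: reuses the cached calibration.
    return max(calibrate(market_quotes))

quotes = (0.01, 0.02, 0.03)
a = udf_a(quotes)
b = udf_b(quotes)
info = calibrate.cache_info()  # one miss (first call), one hit (second)
```

Because the cache is keyed on the inputs, the UDFs stay pure from the caller's point of view: the same quotes always give the same result, and no string handle has to be passed around.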
As a side, side note: Excel is one of the most widely used functional programming environments.
I am trying to understand the difference between defined and undefined values of recursive functions. Is it similar to programming a loop in an imperative language, like Java, but there is an unseen mistake in the loop structure which causes variables to not contain the values that you were expecting them to contain?
I've just started learning Common Lisp--and rapidly falling in love with it--and I've just moved onto the type system. I seem to be developing a particular fondness for applicative programming.
As I understand it, in CL strings and lists are both sequences, but there don't seem to be any standard functions for mapping over a sequence, only lists. I can see why they would be supplied for lists, what with them being the fundamental datatype and all, but why was it not designed to work with sequences? As they are a more general type, it would seem more useful to target applicative functions at them rather than lists. Or am I completely misunderstandimatifying how it works?
Edit:
What I was feeling particularly confused about was the way that sequences (the abstraction) and lists (an implementation) seem to be muddled up in CL. The consensus seems to be that this is for historical reasons: Lisp has been around so long that you can pretty much map out the development of software engineering practices through its functions and macros. Which functions apply to sequences and which to lists seems arbitrary at first glance, because CL has a mixture of pre-sequence-abstraction functions that operate only on lists, and functions that do the same thing in a more general way on sequences. As someone who is just learning CL at the moment, I think it would be useful if authors introduced sequences first as the cleaner abstraction, and then brought in lists as the most fundamental implementation of that abstraction. Lists would still be needed as syntax of course, but by the time it is necessary to state this explicitly many readers would have worked this out by themselves, which would be quite an ego boost when starting out.
Why, there are a lot of functions working on sequences. Mapping over a sequence is done with MAP or MAP-INTO.
Look at the sequences section of the CLHS to find out more.
There is also a quick reference that is nicely organized.
Well, you are generally correct. Many functions do indeed focus on lists (mapcar, append, nth etc.). For a few of these there are equivalent functions for sequences (map, concatenate and elt come to mind; elt, for instance, works on all sequences where nth works only on lists). Other functions simply work on sequences (length, find, count and remove, for example).
CL is a bit of a mess. It's a big language, as in huge. Over 700 functions, AFAIK. And it's old. Some of these functions are deprecated by convention, and others are rarely, if ever, used.
Yes, it would be more useful to have the mapping functions be methods that applied as intended to all sequences. CL was simply not built that way. If it were to be built again today, I'm sure this would be considered, and it would look very different.
That said, you are not left completely in the cold. The loop macro works on sequences, as does iterate (a separate looping macro, which I happen to like more). This will get you far. For most practical purposes you will be using lists, and this won't be more than a pragmatic problem. If you do happen to lack a mapping function for vectors (or sequences in general), who's to stop you from writing it?