I'm doing an exercise using Isabelle recently and don't know the difference between fold and unfold. What I know is to apply the rules which contain inference symbol, we should use rule and those without inference symbol, we should use unfold. But when should we use fold? Some examples will be much appreciated.
Related
I would like to use some user defined contrasts in R in addition to the default contrast codes (contr.treatment / contr.sum / contr.helmert). However, the guidance I have read indicates that these need to be provided in an inverse matrix. Could anybody explain why?
Ie, the guidance here: https://stats.idre.ucla.edu/r/library/r-library-contrast-coding-systems-for-categorical-variables/ states:
The final contrast matrix (or coding scheme) turns out to be the inverse of mat transposed.
Similarly, this site: https://rstudio-pubs-static.s3.amazonaws.com/65059_586f394d8eb84f84b1baaf56ffb6b47f.html writes:
While there are a handful of automatic contrast functions in R (what I’ve been using so far), you will sometimes find yourself wanting to run comparisons that are not included there. When that happens, you can specify them yourself. You need to be careful, though, because the contrasts() function is a sneaky little bastard, as noted above. To apply contrast weights, you’ll need to give it the inverse of your matrix of weights.
Neither explains why. In addition, computing an inverse matrix messes up the contrast coefficients so that they become difficult to interpret as they are no longer a standard unit apart, so I'd like to know why it's necessary.
I am testing performance of different solvers on minimizing an objective function derived from simulated method of moments. Given that my objective function is not differentiable, I wonder if automatic differentiation would work in this case? I tried my best to read some introduction on this method, but I couldn't figure it out.
I am actually trying to use Ipopt+JuMP in Julia for this test. Previously, I have tested it using BlackBoxoptim in Julia. I will also appreciate if you could provide some insights on optimization of non-differentiable functions in Julia.
It seems that I am not clear on "non-differentiable". Let me give you an example. Consider the following objective function. X is dataset, B is unobserved random errors which will be integrated out, \theta is parameters. However, A is discrete and therefore not differentiable.
I'm not exactly an expert on optimization, but: it depends on what you mean by "nondifferentiable".
For many mathematical functions that are used, "nondifferentiable" will just mean "not everywhere differentiable" -- but that's still "differentiable almost everywhere, except on countably many points" (e.g., abs, relu). These functions are not a problem at all -- you can just chose any subgradient and apply any normal gradient method. That's what basically all AD systems for machine learning do. The case for non-singular subgradients will happen with low probability anyway. An alternative for certain forms of convex objectives are proximal gradient methods, which "smooth" the objective in an efficient way that preserves optima (cf. ProximalOperators.jl).
Then there's those functions that seem like they can't be differentiated at all, since they seem "combinatoric" or discrete, but are in fact piecewise differentiable (if seen from the correct point of view). This includes sorting and ranking. But you have to find them, and describing and implementing the derivative is rather complicated. Whether such functions are supported by an AD system depends on how sophisticated its "standard library" is. Some variants of this, like "permute", can just fall out AD over control structures, while move complex ones require the primitive adjoints to be manually defined.
For certain kinds of problems, though, we just work in an intrinsically discrete space -- like, integer parameters of some probability distributions. In these case, differentiation makes no sense, and hence AD libraries define their primitives not to work on these parameters. Possible alternatives are to use (mixed) integer programming, approximations, search, and model selection. This case also occurs for problems where the optimized space itself depends on the parameter in question, like the second argument of fill. We also have things like the ℓ0 "norm" or the rank of a matrix, for which there exist well-known continuous relaxations, but that's outside of the scope of AD).
(In the specific case of MCMC for discrete or dimensional parameters, there's other ways to deal with that, like combining HMC with other MC methods in a Gibbs sampler, or using a nonparametric model instead. Other tricks are possible for VI.)
That being said, you will rarely encounter complicated nowhere differentiable continuous functions in optimization. They are already complicated to describe, are just unlikely to arise in the kind of math we use for modelling.
I'm using Sage math to do some calculation where I found the numerical evaluation quite different from that of python's.
For example, the evalf() nolonger works, instead it uses n() and gp().
My questions are:
what are different ways of numerical evaluation in Sage and what's their difference?
what's difference between n() and gp()? why the later seemed to be much slower?
I assume you are referring to Sympy when you say evalf. Anyway, n() or numerical_approx() is the equivalent. See the documentation. The default is 53 bits of accuracy.
You shouldn't be thinking of gp() though, unless you really want to use the GP/Pari interpreter or convert something to GP.
Basically I have created two MATLAB functions which involve some basic signal processing and I need to describe how these functions work in a written report. It specifically requires me to describe the algorithms using mathematical notation.
Maths really isn't my strong point at all, in fact I'm quite surprised I've even been able to develop the functions in the first place. I'm quite worried about the situation at the moment, it's the last section of writing I need to complete but it is crucially important.
What I want to know is whether I'm going to have to grab a book and teach myself mathematical notation in a very short space of time or is there possibly an easier/quicker way to learn? (Yes I know reading a book should be simple enough, but maths + short time frame = major headache + stress)
I've searched through some threads on here already but I really don't know where to start!
Although your question is rather vague, and I have no idea what sorts of algorithms you have coded that you are trying to describe in equation form, here are a few pointers that may help:
Check the MATLAB documentation: If you are using built-in MATLAB functions, they will sometimes give an equation in the documentation that describes what they are doing internally. Some examples are the functions CONV, CORRCOEF, and FFT. If the function is rather complicated, it may not have an equation but instead have links to some papers describing the algorithm, which may themselves have equations for the algorithm. An example is the function HILBERT (which you can also find equations for on Wikipedia).
Find some lists of common mathematical symbols: Some standard symbols used to represent common mathematical operations can be found here.
Look at some sample pseudocode to see how it's done: For algorithms you yourself have coded up, you'll have to write them out in equation or pseudocode form. A paper that I've used often in my work is Templates for the Solution of Linear Systems, and it has some examples of pseudocode that may be helpful to you. I would suggest first looking at the list of symbols used in that paper (on page iv) to see some typical notations used to represent various mathematical operations. You can then look at some of the examples of pseudocode throughout the rest of the document, such as in the box on page 8.
I suggest that you learn a little bit of LaTeX and investigate Matlab's publish feature. You only need to learn enough LaTeX to write mathematical expressions. Then you have to write Matlab comments in your source file in LaTeX, but only for the bits you want to look like high-quality maths. Finally, open the Matlab editor on your .m file, and select File | Publish.
See Very Quick Intro to LaTeX and check your Matlab documentation for publish.
In addition to the answers already here, I would strongly advise using words in addition to forumlae in your report to describe the maths that you are presenting.
If I were marking a student's report and they explained the concepts of what they were doing correctly, but had poor or incorrect mathematical notation to back it up: this would lose them some marks, but would hopefully not impede my understanding of the hard work they've put in.
If they had poor/wrong maths, with no explanation of what they meant to say, this could jeapordise my understanding of their entire project and cost them a passing grade.
The reason you haven't found any useful threads is because most of the time, people are trying to turn maths into algorithms, not vice versa!
Starting from an arbitrary algorithm, sometimes pseudo-code, along with suitable comments, is the clearest (and possibly only) representation.
I have a number of small algorithms that I would like to write up in a paper. They are relatively short, and concise. However, instead of writing them in pseudo-code (à la Cormen or even Knuth), I would like to write an algebraic representation of them (more linear and better LaTeX rendering) . However, I cannot find resources as to the best notation for this, if there is anything: e.g. how do I represent a loop? If? The addition of a tuple to a list?
Has any of you encountered this problem, and somehow solved it?
Thanks.
EDIT: Thanks, people. I think I did a poor job at phrasing the question. Here goes again, hoping I make it clearer: what is the common notation for talking about loops and if-then clauses in a mathematical notation? For instance, I can use $acc \leftarrow acc \cup \langle i,i+1 \rangle$ to represent the "add" method of a list.
Don't do this. You are deviating from what people expect to see when they read a paper about algorithms. You should follow expected practices; your ideas are more likely to get the attention that they deserve. When in Rome, do as the Romans do.
Formatting code (or pseudocode as it may be) in a LaTeXed paper is very easy. See, for example, Formatting code in LaTeX.
I see if-expressions in mathematical notation fairly often. The usual thing for a loop is a recurrence relation, or equivalently, a function defined recursively.
Here's how the Ackermann function is defined on Wikipedia, for instance:
This picture is nice because it feels mathematical in flavor and yet you could clearly type it in almost exactly as written and have an implementation. It is not always possible to achieve that.
Other mathematical notations that correspond to loops include ∑-notation for summation and set-builder notation.
I hope this answers your question! But if your aim is to describe how something is done and have someone understand, I think it is probably a mistake to assume that mathematicians would prefer to see equations. I don't think they're interchangeable tools (despite Turing equivalence). If your algorithm involves mutable data structures, procedural code is probably going to be better than equations for explaining it.
I'd copy Knuth. Few know how to communicate better than him in a computer science setting.
A symbol for general loops does not exist; usually you will use the summation operator. "if" is represented using implications, and to "add a tuple to a list" you would use union.
However, in general, a bit of verbosity is not necessarily a bad thing - sometimes, especially for complex algorithms, it is best to spell it out in plain English, using examples and diagrams. This is doubly-true for non-coders.
Think about it: when you read a math text-book on Euclid's algorithm for GCD, or the sieve of Eratosthenes, how is it written? Usually, the algorithm itself is in prose, while the proof of the algorithm is where the mathematical symbols lie.
You might take a look at Haskell. Haskell formats well in latex, has a nice algebraic syntax, and you can even compile a latex file with Haskell in it, provided the code is wrapped in \begin{code} and \end{code}. See here: http://www.haskell.org/haskellwiki/Literate_programming. There are probably literate programming tools for other languages.
Lisp started out as a mathematical notation of a computing model so that the lecturer would have a better tool than turing machines. By accident, it turns out that it can be implemented in assembly - thus lisp, the programming language was born.
But I don't think this is really what you are looking for since the computing model that lisp describes doesn't have loops: recursion is used instead. The syntax derives from algebra where braces denote evaluate-this-and-substitute-the-result. Indeed, lisp's model of computing is basically substitution - what algebra essentially is.
Indeed, most functional languages like Lisp, Haskell and Erlang are derived from mathematics. Haskell is actually a result of proving that lambda calculus can be used to implement type systems. So Haskell, like Lisp was born out of pure mathematics. But again, the syntax is not what you would probably be used to.
You can certainly explain Lisp and Haskell syntax to mathematicians and they would treat it as a "game". Language constructs like loops, recursion and conditionals can be proven out of the rules of the game rather than blindly implemented like in other languages. This would lead you into the realms of combinatronics, another branch of mathematics. Indeed, in combinatronics, even the concept of numbers can be constructed out of the rules of the game rather than being a native part of the language (google Church Numerals).
So have a look at Lisp/Scheme, Erlang and Haskell if you want. Erlang especially has syntax close to what you want:
add(a,b) -> a + b
But my recommendation is to write in C-like pseudocode. It's sort of the lowest common denominator in programming languages. Has a syntax that is fairly easy to understand and clean. And the function syntax even derives from functions in mathematics. Remember f(x)?
As a plus, mathematicians are used to writing C, statisticians are used to writing C (though generally they prefer R), physicists are used to writing C, programmers are used to at least looking at C (I know a few who've never touched C).
Actually, scratch that. You mention that your target audience are statisticians. Write in R
Something like this website describes?
APL? The only problem is that few people can read it.