How to perform a vectorized division in Julia?

Given a 252×3 Array{Float64,2}, why can't I do something Python-esque like:
normalized_data = vals / vals[1,:]
to have every element divided by the first item in its respective column? This works in Python (albeit with vals[0,:] in the denominator). In Julia, I had to use:
normalized_data = [(vals[:,1] / vals[1,1]) (vals[:,2] / vals[1,2]) (vals[:,3] / vals[1,3])]
This seems really limiting, and it isn't generic: it only works for exactly three columns of data!

It can.
normalized_data = vals ./ vals[1,:]
(Edit: For v0.5 or higher, it needs to be vals ./ vals[1,:].' due to the dropped dimensions. See comments. In Julia 1.0 and later the .' operator no longer exists; vals ./ vals[1:1,:] or vals ./ transpose(vals[1,:]) does the same job.)
Or even better, if normalized_data is already allocated:
normalized_data .= vals ./ vals[1,:]
writes into the existing array instead of allocating a new one for the result. This form of vectorization syntax is partially derived from MATLAB. I would suggest looking through the manual. One place to look if you're just starting is the differences from other languages:
http://docs.julialang.org/en/release-0.4/manual/noteworthy-differences/
For more on broadcasting and understanding what we gain by making the . explicit, see the following blog post:
http://julialang.org/blog/2017/01/moredots
Essentially, because the .'s are explicit, the parser can fuse the expressions and make it much more efficient than doing "vectorized" computing a la NumPy (or at least, it will always be as efficient as possible, instead of sometimes being efficient due to compiler magic).
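To make explicit what the broadcast computes, here is a minimal pure-Python sketch. The function name normalize_by_first_row is made up for this illustration, and the data is stored as a list of rows rather than a Julia array. The point is only that the operation is generic in the number of columns.

```python
def normalize_by_first_row(rows):
    """Divide every element by the first element of its column.

    `rows` is a list of equal-length rows (a hypothetical stand-in
    for the 252x3 array in the question).
    """
    first = rows[0]
    return [[value / first[j] for j, value in enumerate(row)] for row in rows]


vals = [[2.0, 4.0, 5.0],
        [4.0, 2.0, 10.0]]
normalized = normalize_by_first_row(vals)
```

The first row of the result is all ones, and the same code works for any number of columns.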

Related

Struggling with building an intuition for recursion

Though I have studied recursion and am able to understand some recursive programs, I am still not able to come up with a recursive solution as intuitively as I can with iteration. Is there any course or track available to build an intuition for recursion? How can one master the concept of recursion?

If you want to gain a thorough understanding of how recursion works, I highly recommend that you start by understanding mathematical induction, as the two are very closely related, if not arguably identical.
Recursion is a way of breaking down seemingly complicated problems into smaller bits. Consider the trivial example of the factorial function.
def factorial(n):
    if n < 2:
        return 1
    return n * factorial(n - 1)
To calculate factorial(100), for example, all you need is to calculate factorial(99) and multiply it by 100. This follows from the familiar definition of the factorial.
Here are some tips for coming up with a recursive solution:
Assume you know the result returned by the immediately preceding recursive call (e.g. in calculating factorial(100), assume you already know the value of factorial(99). How do you go from there?)
Consider the base case (i.e. when should the recursion come to a halt?)
The first bullet point might seem rather abstract, but all it means is this: a large portion of the work has already been done. How do you go from there to complete the task? In the case of the factorial, factorial(99) constituted this large portion of work. In many cases, you will find that identifying this portion of work simply amounts to examining the argument to the function (e.g. n in factorial), and assuming that you already have the answer to func(n - 1).
Here's another example for concreteness. Let's say we want to reverse a string without using in-built functions. In using recursion, we might assume that string[:-1], or the substring until the very last character, has already been reversed. Then, all that is needed is to put the last remaining character in the front. Using this inspiration, we might come up with the following recursive solution:
def my_reverse(string):
    if not string:  # base case: empty string
        return string  # return empty string, nothing to reverse
    return string[-1] + my_reverse(string[:-1])
With all of this said, recursion is built on mathematical induction, and these two are inseparable ideas. In fact, one can easily prove that recursive algorithms work using induction. I highly recommend that you check out this lecture.
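As one more illustration of the two tips (this example is not from the original answer, and power is a hypothetical helper): assume power(base, exponent - 1) is already known, which mirrors the inductive step, and stop at exponent 0, which mirrors the base case of an induction proof.

```python
def power(base, exponent):
    """Compute base ** exponent for a non-negative integer exponent."""
    if exponent == 0:
        # Base case: anything to the power 0 is 1.
        return 1
    # Inductive step: trust the answer for exponent - 1,
    # then do the one remaining multiplication.
    return base * power(base, exponent - 1)


result = power(2, 10)
```

The correctness argument is the induction itself: the base case holds, and if power(base, k) is right then power(base, k + 1) is right.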

How to speed up writing to a matrix in a reference class in R

Here is a piece of R code that writes to each element of a matrix in a reference class. It runs incredibly slowly, and I’m wondering if I’ve missed a simple trick that will speed this up.
nx = 2000
ny = 10
ref_matrix <- setRefClass(
  "ref_matrix", fields = list(data = "matrix"),
)
out <- ref_matrix(data = matrix(0.0, nx, ny))
#tracemem(out$data)
for (iy in 1:ny) {
  for (ix in 1:nx) {
    out$data[ix, iy] <- ix + iy
  }
}
It seems that each write to an element of the matrix triggers a check that involves a copy of the entire matrix. (Uncommenting the tracemem() call shows this.) Now, I’ve found a discussion that seems to confirm this:
https://r-devel.r-project.narkive.com/8KtYICjV/rd-copy-on-assignment-to-large-field-of-reference-class
and this also seems to be covered by Speeding up field access in R reference classes
but in both of these this behaviour can be bypassed by not declaring a class for the field, and this works for the example in the first link, which uses a 1D vector, b, which can just be set as b <<- 1:10000. But I’ve not found an equivalent way of creating a 2D array without using an explicit “matrix” instance.
Am I just missing something simple, or is this actually not possible?
Let me add a couple of things. First, I’m very new to R, so could easily have missed something. Second, I’m really just curious about the way reference classes work in this case and whether there’s a simple way to use them efficiently; I’m not looking for a really fast way to set the elements of a matrix - I can do that by not having the matrix in a reference class at all, and if I really care about speed I can write a C routine to do it and call it from R.
Here’s some background that might explain why I’m interested in this, which you’re welcome to ignore.
I got here by wanting to see how different languages, and even different compiler options and different ways of coding the same operation, compared for efficiency when accessing 2D rectangular arrays. I’ve been playing with a test program that creates two 2D arrays of the same size, and calls a subroutine that sets the first to the elements of the second plus their index values. (Almost any operation would do, but this one isn’t completely trivial to optimise.) I have this in a number of languages now, C, C++, Julia, Tcl, Fortran, Swift, etc., even hand-coded assembler (spoiler alert: assembler isn’t worth the effort any more) and thought I’d try R. The obvious implementation in R passes the two arrays to a subroutine that does the work, but because R doesn’t normally pass by reference, that routine has to make a copy of the modified array and return that as the function value. I thought using a reference class would avoid the relatively minor overhead of that copy, so I tried that and was surprised to discover that, far from speeding things up, it slowed them down enormously.

Use outer:
out$data <- outer(1:nx, 1:ny, `+`)
Also, don't use reference classes (or R6 classes) unless you actually need reference semantics. KISS and all that.

matrix multiplication order PVM vs MVP in graphics programming

Hi there. I was wondering why most tutorials and code use MVP to describe the Model-View-Projection matrix, instead of PVM, which is the actual order of multiplication in the code:
mat4 MVP = ProjectionMatrix * ViewMatrix * ModelMatrix;
gl_Position = MVP * VertexInModelSpace;
It seems much more understandable to me to write PVM instead of MVP.

Matrices don't actually have a fixed meaning, just relations between rows and columns. The meaning is freely definable by the developers. The MVP order follows from standard mathematical conventions. But since nothing says you cannot define the vectors as columns instead of rows, nothing precludes the PVM ordering.
Clarification: changing the notation transposes the meaning, so the following applies:
$(M_{MVP})^T = M_{PVM}$
By the definition of matrix multiplication, the following rule kicks in:
$(AB)^T = B^T A^T$
Since B can itself be another matrix product, the rule applies recursively to chains of any length. This means that, by changing notation, you have essentially reversed the multiplication order.
It's a bit like looking at the problem from the outside versus from the inside. In this case you're thinking as an outside observer, whereas the other way around one observes things from the standpoint of the first operator in the chain. Personally I think the notation you use may be more intuitive for this specific task; the other is just far more common, mainly because all the mathematics books I have ever seen use this convention, so blame the mathematicians.
So it is better to stick with the more common way, since it makes things more generally understandable. For example: nothing stops me from typing the answer in Finnish, but the convention of stackoverflow is to answer in English, making answers more understandable to most users. Use the more common form, since others may not grasp the difference, and that leads to errors.

The other problem is that matrix multiplication is not necessarily commutative:
AB != BA
So it's a good idea to stick with the convention.
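The transpose rule and the non-commutativity can be checked numerically. The following is a small sketch with arbitrary 2×2 integer matrices standing in for P, V and M (the values are made up, and matmul and transpose are helper names introduced here, not part of any graphics API).

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Swap rows and columns."""
    return [list(row) for row in zip(*m)]


# Arbitrary stand-ins for projection, view and model matrices.
P = [[1, 2], [0, 1]]
V = [[0, 1], [1, 0]]
M = [[2, 0], [1, 3]]
v_col = [[1], [2]]  # vertex as a column vector

# Column-vector convention: P * V * M * v
column_result = matmul(matmul(matmul(P, V), M), v_col)

# Row-vector convention: v^T * M^T * V^T * P^T
row_result = matmul(
    transpose(v_col),
    matmul(matmul(transpose(M), transpose(V)), transpose(P)),
)

same_vertex = transpose(column_result) == row_result
order_matters = matmul(P, V) != matmul(V, P)
```

Both conventions yield the same transformed vertex, one as a column and one as a row, while swapping two factors within a single convention changes the product.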

Best practices for coding simple mathematical calculations in Python

I need to perform simple mathematical calculations in Python 2.7 with sums, subtractions, divisions, multiplications, sums over lists of numbers etc.
I want to write elegant, bullet-proof, and efficient code but I must admit I got confused by several things, for example:
if I have 1/(N-1)*x in my equation should I just code 1/(N-1)*x or maybe 1.0/(N-1)*x, 1.0/(N-1.0)*x or any other combination of these?
for division, should I use // or / with from __future__ import division?
what practices such as "using math.fsum() for concatenating a list of floats" are out there?
should I assume that input numbers are float or do the conversion just in case (maybe risking drop of efficiency on many float(x) operations)?
So what are the best practices for writing code for simple mathematical calculations in Python that is
elegant/Pythonic,
efficient,
bullet-proof against issues like uncertainty about the exact number type of the input data (float vs integer)?

If you use Python 2.7, ALWAYS use from __future__ import division. It removes a hell of a lot of confusion and bugs.
With this you should never have to worry whether a division is a float or not: / will always be true division and return a float, and // will always be floor division (an int for int operands, a whole-valued float if either operand is a float).
You should convert your input with float(). You will do it only once, and it won't be much of a performance hit.
I would get the sum of a list of floats like this: sum(li, 0.0), but if precision is required, use math.fsum which is specifically created for this.
And finally, your final statement was confusing. Did you mean 1/((N-1)*x) or (1/(N-1))*x? In the first case I would write it as 1 / (x * (N-1)) and in the second case x / (N-1). Both assume 3.x style division.
Also, look into numpy if you want some real performance.
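A short sketch of these recommendations follows. It assumes true-division semantics, which is what from __future__ import division enables on 2.7 and what Python 3 does by default; the helper name scaled is made up for this example.

```python
import math

# On Python 2.7 the first line of the module would be:
#   from __future__ import division


def scaled(x, n):
    """Return x / (n - 1); float() guards against integer inputs."""
    return float(x) / (n - 1)


ratio = scaled(3, 3)          # true division, not truncated
floored = 7 // 2              # floor division of ints gives an int
floored_float = 7.0 // 2      # floor division with a float operand gives a float

values = [0.1] * 10
plain_total = sum(values, 0.0)      # may carry rounding error, depending on the interpreter
accurate_total = math.fsum(values)  # tracks partial sums to avoid loss of precision
```

Converting once at the boundary with float() keeps the arithmetic itself free of type checks.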

If you want great performance for numerical code in Python, you should consider PyPy. Numpy and scipy are convenient for dealing with arrays, and they give good performance if you use linear algebra algorithms that they provide. But if your numerical operations are in pure Python code, PyPy can give significant improvements in performance. I have seen speedups above 20x. And when you use PyPy, the best way to write your mathematical expressions is the simplest way. It will optimize your code better than you could, so make it as simple and readable as possible.

Efficiency of stack-based expression evaluation for math parsing

I have to write, for academic purposes, an application that plots user-input expressions like: f(x) = 1 - exp(3^(5*ln(cosx)) + x)
The approach I've chosen to write the parser is to convert the expression to RPN with the Shunting-Yard algorithm, treating primitive functions like "cos" as unary operators. This means the function written above would be converted into a series of tokens like:
1, 3, 5, x, cos, ln, *, ^, x, +, exp, -
The problem is that to plot the function I have to evaluate it LOTS of times, so applying the stack evaluation algorithm for each input value would be very inefficient.
How can I solve this? Do I have to forget the RPN idea?

How much is "LOTS of times"? A million?
What kind of functions could be input? Can we assume they are continuous?
Did you try measuring how well your code performs?
(Sorry, started off with questions!)
You could try one of the two approaches (or both) described briefly below (there are probably many more):
1) Parse Trees.
You could create a Parse Tree. Then do what most compilers do to optimize expressions, constant folding, common subexpression elimination (which you could achieve by linking together the common expression subtrees and caching the result), etc.
Then you could use lazy evaluation techniques to avoid whole subtrees. For instance if you have a tree
  *
 / \
A   B
where A evaluates to 0, you could completely avoid evaluating B as you know the result is 0. With RPN you would lose out on the lazy evaluation.
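A minimal sketch of that short-circuit on a tree node, in Python. The class names Leaf and Mul are invented for this illustration, and the shortcut assumes the skipped subtree would have produced a finite number (under IEEE arithmetic 0 times infinity is NaN, not 0).

```python
class Leaf:
    """Leaf node wrapping a function of x; counts its evaluations."""

    def __init__(self, fn):
        self.fn = fn
        self.calls = 0

    def evaluate(self, x):
        self.calls += 1
        return self.fn(x)


class Mul:
    """Product node that skips its right subtree when the left is 0."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def evaluate(self, x):
        left_value = self.left.evaluate(x)
        if left_value == 0:
            # Assumption: the right subtree is finite, so the product is 0.
            return 0.0
        return left_value * self.right.evaluate(x)


a = Leaf(lambda x: x - 1.0)
b = Leaf(lambda x: x * x)
tree = Mul(a, b)

at_one = tree.evaluate(1.0)    # a is 0 here, so b is never evaluated
calls_after_first = b.calls
at_three = tree.evaluate(3.0)  # a is 2 here, so b is evaluated once
```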
2) Interpolation
Assuming your function is continuous, you could approximate your function to a high degree of accuracy using Polynomial Interpolation. This way you can do the complicated calculation of the function a few times (based on the degree of polynomial you choose), and then do fast polynomial calculations for the rest of the time.
To create the initial set of data, you could just use approach 1 or just stick to using your RPN, as you would only be generating a few values.
So if you use Interpolation, you could keep your RPN...
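For illustration, here is a sketch of polynomial interpolation using the Lagrange form at Chebyshev nodes. Both choices are assumptions of this example rather than something the answer prescribes, and the function names are made up. math.exp stands in for the expensive user function.

```python
import math


def chebyshev_nodes(a, b, count):
    """Return `count` Chebyshev nodes on the interval [a, b]."""
    return [0.5 * (a + b)
            + 0.5 * (b - a) * math.cos((2 * k + 1) * math.pi / (2 * count))
            for k in range(count)]


def lagrange_interpolant(xs, ys):
    """Return a function evaluating the polynomial through (xs, ys)."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return p


# Evaluate the "expensive" function only at the 8 nodes.
xs = chebyshev_nodes(0.0, 1.0, 8)
ys = [math.exp(x) for x in xs]
approx = lagrange_interpolant(xs, ys)

error = abs(approx(0.3) - math.exp(0.3))
```

Evaluating the Lagrange form directly costs O(n^2) per point; for plotting many points one would likely convert to a cheaper form first. The accuracy also depends on the function being smooth on the plotted interval.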
Hope that helps!

Why reinvent the wheel? Use a fast scripting language instead.
Integrating something like lua into your code will take very little time and be very fast.
You'll usually be able to byte-compile your expression, and that should result in code that runs very fast, certainly fast enough for simple 1D graphs.
I recommend lua as it's fast, and integrates with C/C++ more easily than any other scripting language. Another good option would be python, but while it's better known I found it trickier to integrate.

Why not keep around a parse tree (I use "tree" loosely, in your case it's a sequence of operations), and mark input variables accordingly? (e.g. for inputs x, y, z, etc. annotate "x" with 0 to signify the first input variable, "y" with 1 to signify the 2nd input variable, etc.)
That way you can parse the expression once, keep the parse tree, take in an array of inputs, and apply the parse tree to evaluate.
If you're worrying about the performance aspects of the evaluation step (vs. the parsing step), I don't think you'd do much better unless you get into vectorizing (applying your parse tree on a vector of inputs at once) or hard-coding the operations into a fixed function.

What I do is use the shunting algorithm to produce the RPN. I then "compile" the RPN into a tokenised form that can be executed (interpretively) repeatedly without re-parsing the expression.
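A rough Python sketch of that idea, assuming the RPN arrives as a list of string tokens. The names compile_rpn and run are invented here, and the token string is a hand-written RPN form of the question's example expression. Each token is resolved to a function or constant once, so the per-point loop does no string comparison or table lookup.

```python
import math
import operator

UNARY = {"cos": math.cos, "ln": math.log, "exp": math.exp}
BINARY = {"+": operator.add, "-": operator.sub,
          "*": operator.mul, "^": operator.pow}


def compile_rpn(tokens):
    """Resolve each RPN token once into a (kind, payload) pair."""
    program = []
    for tok in tokens:
        if tok in UNARY:
            program.append(("unary", UNARY[tok]))
        elif tok in BINARY:
            program.append(("binary", BINARY[tok]))
        elif tok == "x":
            program.append(("var", None))
        else:
            program.append(("const", float(tok)))
    return program


def run(program, x):
    """Evaluate a compiled program for one value of x."""
    stack = []
    for kind, payload in program:
        if kind == "const":
            stack.append(payload)
        elif kind == "var":
            stack.append(x)
        elif kind == "unary":
            stack.append(payload(stack.pop()))
        else:
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(payload(lhs, rhs))
    return stack.pop()


# f(x) = 1 - exp(3^(5*ln(cos x)) + x)
program = compile_rpn("1 3 5 x cos ln * ^ x + exp -".split())
ys = [run(program, i / 100.0) for i in range(100)]
```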

Michael Anderson suggested Lua. If you want to try Lua for just this task, see my ae library.

Inefficient in what sense? There's machine time and programmer time. Is there a standard for how fast it needs to run with a particular level of complexity? Is it more important to finish the assignment and move on to the next one (perfectionists sometimes never finish)?
All those steps have to happen for each input value. Yes, you could have a heuristic that scans the list of operations and cleans it up a bit. Yes, you could compile some of it down to assembly instead of calling +, * etc. as high level functions. You can compare vectorization (doing all the +'s then all the *'s etc, with a vector of values) to doing the whole procedure for one value at a time. But do you need to?
I mean, what do you think happens if you plot a function in gnuplot or Mathematica?

Your simple interpretation of RPN should work just fine, especially since it contains
math library functions like cos, exp, and ^(pow, involving logs)
symbol table lookup
Hopefully, your symbol table (with variables like x in it) will be short and simple.
The library functions will most likely be your biggest time-takers, so unless your interpreter is poorly written, it will not be a problem.
If, however, you really gotta go for speed, you could translate the expression into C code, compile and link it into a dll on-the-fly and load it (takes about a second). That, plus memoized versions of the math functions, could give you the best performance.
P.S. For parsing, your syntax is pretty vanilla, so a simple recursive-descent parser (about a page of code, O(n) same as shunting-yard) should work just fine. In fact, you might just be able to compute the result as you parse (if math functions are taking most of the time), and not bother with parse trees, RPN, any of that stuff.

I think this RPN based library can serve the purpose: http://expressionoasis.vedantatree.com/
I used it with one of my calculator project and it works well. It is small and simple, but extensible.

One optimization would be to replace the stack with an array of values and implement the evaluator as a three-address machine, where each operation loads from two (or one) locations and saves to a third. This can make for very tight code:
typedef struct {
    enum {
        OP_ADD, OP_SUB, OP_MUL, OP_DIV,
        OP_COS, OP_SIN, OP_TAN,
        //....
    } op;
    int a, b, d;  /* source slots a and b, destination slot d */
} Op;
void go(const Op* ops, int n, float* v) {
    for (int i = 0; i < n; i++) {
        switch (ops[i].op) {
        case OP_ADD: v[ops[i].d] = v[ops[i].a] + v[ops[i].b]; break;
        case OP_SUB: v[ops[i].d] = v[ops[i].a] - v[ops[i].b]; break;
        case OP_MUL: v[ops[i].d] = v[ops[i].a] * v[ops[i].b]; break;
        case OP_DIV: v[ops[i].d] = v[ops[i].a] / v[ops[i].b]; break;
        //...
        default: break;
        }
    }
}
The conversion from RPN to 3-address should be easy as 3-address is a generalization.
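One way the conversion could look, sketched in Python for brevity. The function names and the tuple layout (op, a, b, d) are assumptions of this example, and only binary operators are handled. The trick is to run the usual RPN stack algorithm with slot indices on the stack instead of values.

```python
import operator

BINARY = {"+": operator.add, "-": operator.sub,
          "*": operator.mul, "/": operator.truediv}


def rpn_to_three_address(tokens, variables):
    """Return (ops, values, result_slot) for an RPN token list.

    Slots 0..len(variables)-1 hold the inputs; constants and
    temporaries get fresh slots after them.
    """
    values = [0.0] * len(variables)
    slot_of = {name: i for i, name in enumerate(variables)}
    ops = []
    stack = []  # holds slot indices, not values
    for tok in tokens:
        if tok in BINARY:
            b = stack.pop()
            a = stack.pop()
            d = len(values)
            values.append(0.0)  # fresh temporary slot
            ops.append((tok, a, b, d))
            stack.append(d)
        elif tok in slot_of:
            stack.append(slot_of[tok])
        else:
            values.append(float(tok))  # constant gets its own slot
            stack.append(len(values) - 1)
    return ops, values, stack.pop()


def run(ops, values, result_slot, inputs):
    """Execute the three-address program for one set of inputs."""
    v = list(values)
    v[:len(inputs)] = inputs
    for name, a, b, d in ops:
        v[d] = BINARY[name](v[a], v[b])
    return v[result_slot]


# x * 2 + y
ops, values, result_slot = rpn_to_three_address("x 2 * y +".split(), ["x", "y"])
answer = run(ops, values, result_slot, [3.0, 1.0])
```

This sketch allocates a new slot for every temporary; reusing slots whose values are no longer needed would shrink the value array.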
