Inclusivity of Julia ranges - julia

I hate that ranges include the end. Here is an example where I've deliberately removed the end of the range.
N = 100
for x in 0.0 : 2*pi/N : 2*pi*(N-1)/N
    println(x)
end
Is there any way to avoid the ugliness of this for loop?

Yes, there is:
N = 100
for x in range(0; step=2π/N, length=N)
    println(x)
end

Maybe not the most elegant way, but you can build the full range and take every element except the last:
r = 0.0 : 2*pi/N : 2*pi
r = Iterators.take(r,length(r)-1)

Unfortunately, inclusive ranges (and 1-based indexing) are baked into the idioms of Julia at a fundamental level.
However, for this specific case, do note that stepping with floating-point values can be problematic: after adding the step N times, the accumulated value might be less than, equal to, or greater than the final value, which changes how many iterations the for loop performs. Although Julia tries really hard, there's no way to do the right thing in all circumstances. As a bonus, working with integer values only for the ranges simplifies things. You might want to consider:
for ix in 0:N-1
    x = ix * 2 * pi / N
    println(x)
end
Alternatively, the range() function has a form with a length keyword argument:
for x in range(0, 2*pi*(N-1)/N, length=N)
    println(x)
end
Or, indeed, combining this with the other answer, which drops the last element of the full range, could work.
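As a rough comparison of the approaches mentioned in these answers, the sketch below builds the same N points on [0, 2π) three ways. The variable names are made up for illustration, and it assumes a Julia 1.x version where range accepts a positional stop argument.

```julia
N = 100

# Approach 1: integer range, scaled afterwards (no floating-point stepping).
xs_int = [2π * i / N for i in 0:N-1]

# Approach 2: ask for N+1 evenly spaced points on [0, 2π] and drop the last one.
xs_take = collect(Iterators.take(range(0, 2π, length=N + 1), N))

# Approach 3: give the step and the number of points explicitly.
xs_step = collect(range(0; step=2π / N, length=N))

# All three should have N elements and agree up to rounding error.
println(length(xs_int), " ", length(xs_take), " ", length(xs_step))
println(maximum(abs.(xs_int .- xs_take)))
println(maximum(abs.(xs_int .- xs_step)))
```

Specifying the number of points (approaches 2 and 3) or iterating over integers (approach 1) avoids relying on how a floating-point step lines up with the endpoint.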

You could actually define your own operator such as:
▷(a,b) = a:b-1
Now you can write:
julia> 3▷6
3:5
Julia also natively supports custom indices for arrays. There is a package, CustomUnitRanges, though it is probably overkill here.

Related

Cumulative Integration Options With Julia

I have two 1-D arrays in which I would like to calculate the approximate cumulative integral of 1 array with respect to the scalar spacing specified by the 2nd array. MATLAB has a function called cumtrapz that handles this scenario. Is there something similar that I can try within Julia to accomplish the same thing?
The expected result is another 1-D array with the integral calculated for each element.
There is a numerical integration package for Julia (see the link) that defines cumul_integrate(X, Y) and uses the trapezoidal rule by default.
If this package didn't exist, though, you could easily write the function yourself and have a very efficient implementation out of the box because the loop does not come with a performance penalty.
Edit: Added an @assert to check matching vector dimensions and fixed a typo.
function cumtrapz(X::T, Y::T) where {T <: AbstractVector}
    # Check matching vector length
    @assert length(X) == length(Y)
    # Initialize Output
    out = similar(X)
    out[1] = 0
    # Iterate over arrays
    for i in 2:length(X)
        out[i] = out[i-1] + 0.5*(X[i] - X[i-1])*(Y[i] + Y[i-1])
    end
    # Return output
    out
end
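To check a hand-written cumulative trapezoid routine like this, one option is to integrate something with a known antiderivative. The sketch below uses a hypothetical variant, cumtrapz_sketch, which is not part of any package; unlike the function above it accepts X and Y of different vector types and always returns floating-point output, which is an assumption made here so that integer inputs also work.

```julia
# Hypothetical variant of the answer's function; the name and the
# float-promoting output type are assumptions made for this example.
function cumtrapz_sketch(X::AbstractVector, Y::AbstractVector)
    @assert length(X) == length(Y)
    # Promote so that integer inputs still produce a floating-point result.
    T = float(promote_type(eltype(X), eltype(Y)))
    out = zeros(T, length(X))
    for i in 2:length(X)
        # Area of the trapezoid between sample i-1 and sample i.
        out[i] = out[i-1] + (X[i] - X[i-1]) * (Y[i] + Y[i-1]) / 2
    end
    return out
end

x = range(0, 1, length=1001)
y = 2 .* x                        # integrand 2x, whose exact integral is x^2
approx = cumtrapz_sketch(x, y)

# The trapezoidal rule is exact for linear integrands, up to rounding.
println(maximum(abs.(approx .- x .^ 2)))
```

The first output element is 0 by construction, matching the convention MATLAB's cumtrapz uses.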

What is the simplest way to iterate over an array of arrays?

Let x::Vector{Vector{T}}. What is the best way to iterate over all the elements of each inner vector (that is, all elements of type T)? The best I can come up with is a double iteration using the single-line notation, ie:
for n in eachindex(x), m in eachindex(x[n])
    x[n][m]
end
but I'm wondering if there is a single iterator, perhaps in the Iterators package, designed specifically for this purpose, e.g. for i in some_iterator(x) ; x[i] ; end.
More generally, what about iterating over the inner-most elements of any array of arrays (that is, arrays of any dimension)?
Your way
for n in eachindex(x), m in eachindex(x[n])
    x[n][m]
end
is pretty fast. If you want best speed, use
for n in eachindex(x)
    y = x[n]
    for m in eachindex(y)
        y[m]
    end
end
which avoids dereferencing twice (the first dereference is hard to optimize out because arrays are mutable, and so getindex isn't pure). Alternatively, if you don't need m and n, you could just use
for y in x, z in y
    z
end
which is also fast.
Note that column-major storage is irrelevant, since all arrays here are one-dimensional.
To answer your general question:
If the number of dimensions is a compile-time constant, see Base.Cartesian
If the number of dimensions is not a compile-time constant, use recursion
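For the recursive case, one possible sketch relies on dispatch: one method handles arrays by descending into them, and a fallback method treats anything else as a leaf. The name foreach_leaf is hypothetical and not part of Base or any package.

```julia
# Hypothetical helper: apply `f` to every innermost (non-array) element.
foreach_leaf(f, x) = f(x)                     # base case: x is a leaf

function foreach_leaf(f, x::AbstractArray)    # recursive case: descend into x
    for y in x
        foreach_leaf(f, y)
    end
    return nothing
end

# Nesting depth differs from branch to branch, so it is only known at run time.
nested = [[1, 2], [[3, 4], [5]], 6]
leaves = Int[]
foreach_leaf(v -> push!(leaves, v), nested)
println(leaves)
```

Because the depth is discovered at run time, this uses dynamic dispatch at each level and will generally be slower than the explicit nested loops shown above for a fixed depth.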
And finally, as Dan Getz mentioned in a comment:
using Iterators
for z in chain(x...)
    z
end
also works. This however has a bit of a performance penalty.
I'm wondering if there is a single iterator, perhaps in the Iterators package, designed specifically for this purpose, e.g. for i in some_iterator(x) ; x[i] ; end
Today (in Julia 1.x versions), Iterators.flatten is exactly this.
help?> Iterators.flatten
flatten(iter)
Given an iterator that yields iterators, return an iterator that
yields the elements of those iterators. Put differently, the
elements of the argument iterator are concatenated.
julia> x = [1:5, [π, ℯ, 42], 'a':'e']
3-element Vector{AbstractVector}:
1:5
[3.141592653589793, 2.718281828459045, 42.0]
'a':1:'e'
julia> for el in Iterators.flatten(x)
           print(el, " ")
       end
1 2 3 4 5 3.141592653589793 2.718281828459045 42.0 a b c d e
julia>

what is the purpose of 'NULL' in processing loops?

sqr = seq(1, 100, by=2)
sqr.squared = NULL
for (n in 1:50)
{
  sqr.squared[n] = sqr[n]^2
}
I came across the loop above; for a beginner this was simple enough. To further understand R, what is the precise purpose of the second line? From my research I gather it has something to do with resetting the vector. If someone could elaborate it'd be much appreciated.
sqr.squared <- NULL
is one of many ways to initialize the empty vector sqr.squared prior to running it through a loop. In general, when the length of the resulting vector is known, it is much better practice to allocate the vector at its full length up front. So here,
sqr.squared <- vector("numeric", 50)
would be much better practice. And faster too. This way you are not growing the vector on every pass through the loop. But since ^ is vectorized, you could also simply do
sqr[1:50] ^ 2
and ditch the loop altogether.
Another way to think about it is to remember that everything in R is a function call, and functions (usually) need input.
Say you calculated y and want to store that value somewhere. You can do x <- y without initializing an x object (R does this for you, unlike other languages such as C), but say you want to store it in a specific place in x.
Note that <- (or = in your example) is a function:
y <- 1
x[2] <- y
# Error in x[2] <- y : object 'x' not found
This is a different function than <-. Since you want to put y at x[2], you need the function [<-
`[<-`(x, 2, y)
# Error: object 'x' not found
But this still doesn't work because we need the object x to use this function, so initialize x to something.
(x <- numeric(5))
# [1] 0 0 0 0 0
# and now use the function
`[<-`(x, 2, y)
# [1] 0 1 0 0 0
This prefix notation is easier for computers to parse (e.g., + 1 1) but harder for humans (me at least), so we prefer infix notation (e.g., 1 + 1). R lets you write x[2] <- y instead of calling the function the way I did above.
The first answer is correct: when you assign NULL to a variable, the purpose is to initialize it as an empty vector. In many cases, when you are working with numbers or with different types of variables, you will need to initialize arrays, matrices, and so on with NULL in the same way.
For example, if you want to create some type of element, in some cases you need to put something inside it first. That is the purpose of NULL. In addition, sometimes you will need NA instead of NULL.

Define Piecewise Functions in Julia

I have an application in which I need to define a piecewise function, i.e., f(x) = g(x) for [x in some range], f(x) = h(x) for [x in some other range], etc.
Is there a nice way to do this in Julia? I'd rather not use if-else because it seems that I'd have to check every range for large values of x. The way that I was thinking was to construct an array of functions and an array of bounds/ranges, then when f(x) is called, do a binary search on the ranges to find the appropriate index and use the corresponding function (i.e., h(x), g(x), etc.).
It seems as though such a mathematically friendly language might have some functionality for this, but the documentation doesn't mention piecewise in this manner. Hopefully someone else has given this some thought, thanks!
With a Heaviside function you can build an interval function:
function heaviside(t)
    0.5 * (sign(t) + 1)
end
and
function interval(t, a, b)
    heaviside(t-a) - heaviside(t-b)
end
function piecewise(t)
    sinc(t) .* interval(t,-3,3) + cos(t) .* interval(t, 4,7)
end
I think one could also implement an Interval subtype, which would be much more elegant.
I tried to implement a piecewise function for Julia, and this is the result:
function piecewise(x::Symbol,c::Expr,f::Expr)
    n=length(f.args)
    @assert n==length(c.args)
    @assert c.head==:vect
    @assert f.head==:vect
    vf=Vector{Function}(undef, n)  # Julia 1.x requires `undef` here
    for i in 1:n
        vf[i]=@eval $x->$(f.args[i])
    end
    return @eval ($x)->($(vf)[findfirst($c)])($x)
end
pf=piecewise(:x,:([x>0, x==0, x<0]),:([2*x,-1,-x]))
pf(1) # => 2
pf(-2) # => 2
pf(0) # => -1
Why not something like this?
function piecewise(x::Float64, breakpts::Vector{Float64}, f::Vector{Function})
    @assert(issorted(breakpts))
    @assert(length(f) == length(breakpts)+1)
    b = searchsortedfirst(breakpts, x)
    return f[b](x)
end
piecewise(X::Vector{Float64}, bpts, f) = [ piecewise(x,bpts,f) for x in X ]
Here you have a list of (sorted) breakpoints, and you can use the optimized searchsortedfirst to find the index b of the first breakpoint greater than or equal to x. The edge case when no breakpoint is greater than or equal to x is also handled appropriately, since length(breakpts)+1 is returned, so b is always a valid index into the vector of functions f, which therefore needs one more entry than breakpts.
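A small usage sketch of the breakpoint-lookup idea follows. The function name piecewise_sketch, the looser argument types, and the example pieces are all assumptions made for illustration.

```julia
# Hypothetical names; a sketch of the breakpoint-lookup idea described above.
function piecewise_sketch(x::Real, breakpts::AbstractVector{<:Real}, fs::AbstractVector)
    @assert issorted(breakpts)
    @assert length(fs) == length(breakpts) + 1   # one piece per gap, plus both tails
    # Index of the first breakpoint >= x, or length(breakpts)+1 if there is none.
    b = searchsortedfirst(breakpts, x)
    return fs[b](x)
end

breakpts = [0.0, 1.0]
# Pieces, in order: x <= 0, then 0 < x <= 1, then x > 1.
fs = Function[x -> -x, x -> x^2, x -> 1.0]

println(piecewise_sketch(-2.0, breakpts, fs))
println(piecewise_sketch(0.5, breakpts, fs))
println(piecewise_sketch(3.0, breakpts, fs))
```

Note that with searchsortedfirst a value exactly equal to a breakpoint is assigned to the piece on its left; searchsortedlast could be used instead if the opposite convention is wanted, with the index shifted by one.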

dynamic programming pseudocode for Travelling Salesman

This is dynamic programming pseudocode for the TSP (Travelling Salesman Problem). I understand its optimal substructure, but I can't figure out what the code in the red brackets does.
I am not asking anyone to write the actual code; I just need an explanation of what is happening so I can write my own. Thanks :)
Here is a link to the pseudocode; I couldn't upload it here.
http://www.imagechicken.com/viewpic.php?p=1266328410025325200&x=jpg
Here is some less mathematical pseudo-code. I don't know if this will explain what's happening, but it may help you read it. This isn't a functional algorithm (lots of := all over), so I'm going to use Python pseudo-code.
# I have no idea where 'i' comes from. It's not defined anywhere
for k in range(2,n):
    C[set(i,k), k] = d(1,k)
shortest_path = VERY_LARGE_NUMBER
# I have to assume that n is the number of nodes in the graph G
# other things that are not defined:
# d_i,j -- I will assume it's the distance from i to j in G
for subset_size in range(3,n):
    for index_subset in subsets_of_size(subset_size, range(1,n)):
        for k in index_subset:
            C[S,k] = argmin(lambda m: C[S-k,m] + d(G,m,k), S - k)
shortest_path = argmin(lambda k: C[set(range(1,n)),k] + d(G,1,k), range(2,n))
return shortest_path
# also needed....
def d(G, i, j):
    return G[i][j]
def subsets_of_size(n, s): # returns a list of sets
    # complicated code goes here
    pass
def argmin(f, l):
    best = l[0]
    bestVal = f(best)
    for x in l[1:]:
        newVal = f(x)
        if newVal < bestVal:
            best = x
            bestVal = newVal
    return best
Some notes:
The source algorithm is not complete. At least, its formatting is weird in the inner loop, and it rebinds k in the second argmin. So the whole thing is probably wrong; I've not tried to run this code.
Arguments to range should probably all be increased by 1, since Python counts from 0, not 1 (and in general counting from 1 is a bad idea).
I assume that G is a dictionary of type { from : { to : length } }. In other words, an adjacency list representation.
I inferred that C is a dictionary of type { (set(int),int) : int }. I could be wrong.
I use a set as keys to C. In real Python, you must convert to a frozenset first. The conversion is just busywork so I left it out.
I couldn't remember the set operators in Python offhand. It uses | for union and & for intersection; - does work for set difference, but + is not defined for sets.
I didn't write subsets_of_size. It's fairly complicated.
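For comparison, here is a runnable sketch of what the pseudocode appears to describe, the subset dynamic program usually attributed to Held and Karp. It is written in Julia and is not a translation of the linked image. It assumes d is an n-by-n matrix of pairwise distances and that the tour starts and ends at city 1. The function name and the bitmask encoding of subsets (used here instead of explicit sets) are choices made for this example.

```julia
# Sketch of the subset dynamic program (Held-Karp), assuming `d` is an
# n-by-n distance matrix and the tour starts and ends at city 1.
# A subset of cities 2..n is encoded as a bitmask: bit (k-2) set means city k is in it.
function tsp_length(d::AbstractMatrix)
    n = size(d, 1)
    n == 1 && return zero(eltype(d))
    nsets = 1 << (n - 1)
    # C[S+1, k-1] = cheapest path that starts at city 1, visits every city
    # in subset S exactly once, and ends at city k (which must be in S).
    C = fill(Inf, nsets, n - 1)
    for k in 2:n
        C[(1 << (k - 2)) + 1, k - 1] = d[1, k]           # base case: S = {k}
    end
    for S in 1:nsets-1                                    # increasing masks, so smaller subsets come first
        for k in 2:n
            bitk = 1 << (k - 2)
            (S & bitk) == 0 && continue                   # k must belong to S
            prev = S ⊻ bitk                               # S without k
            prev == 0 && continue                         # base case, already filled in
            for m in 2:n
                (prev & (1 << (m - 2))) == 0 && continue  # m must belong to S without k
                C[S+1, k-1] = min(C[S+1, k-1], C[prev+1, m-1] + d[m, k])
            end
        end
    end
    full = nsets - 1
    # Close the tour by returning from the last city k to city 1.
    return minimum(C[full+1, k-1] + d[k, 1] for k in 2:n)
end

# Four cities on the corners of a unit square.
square = [0 1 √2 1; 1 0 1 √2; √2 1 0 1; 1 √2 1 0]
println(tsp_length(square))
```

The inner loop over m is the step the question asks about: the best path ending at k through the set S is found by trying every possible second-to-last city m and reusing the already computed answer for S without k. Time and memory grow exponentially with n, so this is only practical for a few dozen cities at most.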
