How to update all elements of a nested map in Elixir

I'm trying to implement in Elixir some of the maze generation algorithms from the excellent book Mazes for Programmers by Jamis Buck. In imperative languages like Go or V it's a piece of cake but with Elixir I'm stuck.
A maze is a grid of cells. Each cell records the directions in which we can move from it. It is represented as a struct with boolean members (north: true, east: false, etc.). A grid is a map whose keys are tuples {col, row} and whose values are Cells. If mz is a maze, mz.grid[{0, 0}] is the cell in the upper left corner.
One of the basic operations is to open a path from one cell c1 to another cell c2. Most of the time, if we can go from c1 to c2 we can also go from c2 to c1, which means this operation modifies both cells. To implement this, I have a function open_to(maze, x, y, direction) which returns a tuple of two cells c1_new and c2_new in which the direction information has been changed. Then I can update the grid with Map.put(maze.grid, {x, y}, c1_new), and the same for c2_new.
One of the simplest algorithms, the binary tree algorithm, needs to visit all cells one by one and open a bidirectional link with one of the neighbors. Bidirectional means that both cells need to be updated, and the second cell may only be visited later. I'm stuck at this step, as I can't find how to update the grid with the cells returned by open_to(). My Elixir pseudo code is as follows:
def generate(mz) do
  Enum.map(mz.grid, fn({{x, y}, c1}) ->
    neighbors = [Grid.cell_to(mz, x, y, :north), Grid.cell_to(mz, x, y, :east)]
    c2_dir = select_the_neighbor(neighbors) # returns one of :north, :east or nil
    # Here! open_to returns the updated cells but what to do with them?
    {c1_new, c2_new} = if c2_dir != nil, do: Grid.open_to(mz, x, y, c2_dir)
  end)
end
I believe the issue comes from the data structure I've chosen and from the way I go through it, but I can't find another way. Any help is appreciated

If I'm understanding the question correctly, it's "how can each step in Enum.map/2 update the maze and have that update visible to the other steps and in the final result?".
Because data structures in Elixir are immutable, you can't change the data another variable points to.
As a simple example, putting a key/value pair into a map creates an entirely new map:
iex(1)> map = %{a: 3}
%{a: 3}
iex(2)> Map.put(map, :a, 4)
%{a: 4}
iex(3)> map
%{a: 3}
In a similar fashion, Enum.map/2 isn't intended for modifying anything except the value currently being operated on (and even then only in the new list, not the original). If you want to update some value based on each cell, you may be looking for Enum.reduce/3. It enumerates things like Enum.map/2, but it also takes an initial value, the "accumulator". The reducer function you pass in is called with an item and the accumulator, and ought to return the updated value of the accumulator. The final value of that accumulator is what Enum.reduce/3 returns.
So your pseudo code might look something like this:
def generate(mz) do
  # don't use the c1 from enumerating the grid--it could be out of date
  # you could also just enumerate the grid coordinates:
  # - for x <- 0..width-1, y <- 0..height-1, do: {x, y}
  # - Map.keys(mz.grid)
  Enum.reduce(mz.grid, mz, fn {{x, y}, _dont_use_this_c1}, mz ->
    neighbors = [Grid.cell_to(mz, x, y, :north), Grid.cell_to(mz, x, y, :east)]
    if c2_dir = select_the_neighbor(neighbors) do
      {c1_new, c2_new} = Grid.open_to(mz, x, y, c2_dir)
      # the cells live in mz.grid, so update that map and put it back into the maze
      # find_x_y/3 stands for a helper that returns the neighbor's {x, y} coordinates
      grid =
        mz.grid
        |> Map.put({x, y}, c1_new)
        |> Map.put(find_x_y(x, y, c2_dir), c2_new)
      %{mz | grid: grid}
    else
      # you have to return a maze, or other iterations will try adding cells to nil
      mz
    end
  end)
end
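For readers who want to try the accumulator idea outside Elixir, here is a minimal Python sketch of the same fold. It is an illustration only, not the book's or the asker's code: cells are modelled as frozensets of open directions, the grid dict is never mutated, and the names open_to, generate, OFFSET, OPPOSITE and the choose parameter are all assumptions made for this example (in a real maze, choose would typically be random.choice).

```python
from functools import reduce

# Assumed conventions for this sketch: (0, 0) is the upper-left cell,
# "north" decreases y and "east" increases x.
OFFSET = {"north": (0, -1), "east": (1, 0)}
OPPOSITE = {"north": "south", "east": "west"}


def open_to(grid, pos, direction):
    """Return a NEW grid in which pos and its neighbour are linked both ways."""
    x, y = pos
    dx, dy = OFFSET[direction]
    other = (x + dx, y + dy)
    return {
        **grid,
        pos: grid[pos] | {direction},
        other: grid[other] | {OPPOSITE[direction]},
    }


def generate(width, height, choose):
    """Binary tree maze: fold over the coordinates, threading the grid through."""
    start = {(x, y): frozenset() for x in range(width) for y in range(height)}

    def step(grid, pos):
        x, y = pos
        options = [d for d in ("north", "east")
                   if (x + OFFSET[d][0], y + OFFSET[d][1]) in grid]
        # Always return a grid, even when no neighbour can be opened.
        return open_to(grid, pos, choose(options)) if options else grid

    return reduce(step, list(start), start)


maze = generate(2, 2, choose=lambda options: options[0])
```

Each call to step receives the grid produced by the previous call, so a link opened while visiting one cell is already visible when its neighbour is visited later.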

Related

Julia: Apply 1 dimensional Julia function to multi-dimensional array

I'm a "write Fortran in all languages" kind of person trying to learn modern programming practices. I have a one dimensional function ft(lx)=HT(x,f(x),lx), where x, and f(x) are one dimensional arrays of size nx, and lx is the size of output array ft. I want to apply HT on a multidimensional array f(x,y,z).
Basically I want to apply HT on all three dimensions to go from f(x,y,z) defined on (nx,ny,nz) dimensional grid, to ft(lx,ly,lz) defined on (lx,ly,lz) dimensional grid:
ft(lx,y,z) = HT(x,f(x,y,z) ,lx)
ft(lx,ly,z) = HT(y,ft(lx,y,z) ,ly)
ft(lx,ly,lz) = HT(z,ft(lx,ly,z),lz)
In f95 style I would tend to write something like:
FTx=zeros((lx,ny,nz))
for k=1:nz
    for j=1:ny
        FTx[:,j,k]=HT(x,f[:,j,k],lx)
    end
end
FTxy=zeros((lx,ly,nz))
for k=1:nz
    for i=1:lx
        FTxy[i,:,k]=HT(y,FTx[i,:,k],ly)
    end
end
FTxyz=zeros((lx,ly,lz))
for j=1:ly
    for i=1:lx
        FTxyz[i,j,:]=HT(z,FTxy[i,j,:],lz)
    end
end
I know idiomatic Julia would require using something like mapslices. I was not able to understand how to go about doing this from the mapslices documentation.
So my question is: what would be the idiomatic Julia code, along with proper type declarations, equivalent to the Fortran style version?
A follow up sub-question would be: Is it possible to write a function
FT = HTnD((Tuple of x,y,z etc.),f(x,y,z), (Tuple of lx,ly,lz etc.))
that works with arbitrary dimensions? I.e. it would automatically adjust computation for 1,2,3 dimensions based on the sizes of input tuples and function?
I have a piece of code here which is fairly close to what you want. The key tool is Base.Cartesian.@nexprs, which you can read up on in the linked documentation.
The three essential lines in my code are Lines 30 to 32. Here is a verbal description of what they do.
Line 30: reshape an n1 x n2 x ... nN-sized array C_{k-1} into an n1 x prod(n2,...,nN) matrix tmp_k.
Line 31: Apply the function B[k] to each column of tmp_k. In my code, there are some indirections here since I want to allow for B[k] to be a matrix or a function, but the basic idea is as described above. This is the part where you would want to bring in your HT function.
Line 32: Reshape tmp_k back into an N-dimensional array and circularly permute the dimensions such that the second dimension of tmp_k ends up as the first dimension of C_k. This makes sure that the next iteration of the "loop" implied by @nexprs operates on the second dimension of the original array, and so on.
As you can see, my code avoids forming slices along arbitrary dimensions by permuting such that we only ever need to slice along the first dimension. This makes programming much easier, and it can also have some performance benefits. For example, computing the matrix-vector products B * C[i1,:,i3] for all i1,i3 can be done easily and very efficiently by moving the second dimension of C into the first position of tmp and using gemm to compute B * tmp. Doing the same efficiently without the permutation would be much harder.
Following @gTcV's code, your function would look like:
using Base.Cartesian
ht(x,F,d) = mapslices(f -> HT(x, f, d), F, dims = 1)
@generated function HTnD(
        xx::NTuple{N,Any},
        F::AbstractArray{<:Any,N},
        newdims::NTuple{N,Int}
    ) where {N}
    quote
        F_0 = F
        Base.Cartesian.@nexprs $N k->begin
            tmp_k = reshape(F_{k-1},(size(F_{k-1},1),prod(Base.tail(size(F_{k-1})))))
            tmp_k = ht(xx[k], tmp_k, newdims[k])
            F_k = Array(reshape(permutedims(tmp_k),(Base.tail(size(F_{k-1}))...,size(tmp_k,1))))
            # https://github.com/JuliaLang/julia/issues/30988
        end
        return $(Symbol("F_",N))
    end
end
A simpler version, which shows the usage of mapslices, would look like this:
function simpleHTnD(
        xx::NTuple{N,Any},
        F::AbstractArray{<:Any,N},
        newdims::NTuple{N,Int}
    ) where {N}
    for k = 1:N
        F = mapslices(f -> HT(xx[k], f, newdims[k]), F, dims = k)
    end
    return F
end
You could even use foldl if you are a friend of one-liners ;-)
fold_HTnD(xx, F, newdims) = foldl((F, k) -> mapslices(f -> HT(xx[k], f, newdims[k]), F, dims = k), 1:length(xx), init = F)
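The "swap the axis to the front, transform, swap back" idea described above can also be sketched without any array library. The following Python version works on plain nested lists and is only an illustration: apply_along_axis, transform_all_axes and the stand-in sum_and_max (used in place of HT, which is not shown in the question) are names invented for this example.

```python
def apply_along_axis(transform, array, axis, ndim):
    """Apply a 1-D transform to every slice of a nested-list array along `axis`."""
    if ndim == 1:
        return transform(array)
    if axis > 0:
        # The target axis is deeper: recurse into each sub-array.
        return [apply_along_axis(transform, sub, axis - 1, ndim - 1)
                for sub in array]
    # axis == 0: swap the first two axes, transform, then swap them back.
    swapped = [apply_along_axis(transform, list(column), 0, ndim - 1)
               for column in zip(*array)]
    return [list(row) for row in zip(*swapped)]


def transform_all_axes(transforms, array):
    """Apply transforms[k] along axis k, one axis after another."""
    ndim = len(transforms)
    for axis, transform in enumerate(transforms):
        array = apply_along_axis(transform, array, axis, ndim)
    return array


def sum_and_max(values):
    """Stand-in for HT: maps a 1-D slice to a new 1-D slice of length 2."""
    return [sum(values), max(values)]


result = transform_all_axes([sum_and_max, sum_and_max],
                            [[1, 2, 3], [4, 5, 6]])
print(result)
```

Because the number of axes is taken from the length of the transforms list, the same function handles 1, 2 or 3 dimensions, which is the behaviour the follow-up question asks about.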

Get the mapping from each element of input to the bin of the histogram in Julia

Matlab's [n,mapx] = histc(x, bin_edged) returns the counts of x in each bin as n, and also returns a map of the same length as x that holds the index of the bin each element of x was placed into.
I can do the same thing in Julia as follows:
using StatsBase
x = rand(1000)
bin_e = 0:0.1:1
h = fit(Histogram, x, bin_e)
yx = map((z) -> findnext(z.<=h.edges[1],1),x) .- 1
Is this the "right way" to do this? It seems a bit kludgy.
Inspired by this python question you should be able to define a small function that delivers the desired mapping (modulo conventions):
binindices(edges, data) = searchsortedlast.(Ref(edges), data)
Note that the bin edges are sorted, so we can use searchsortedlast to get the last bin edge smaller than or equal to a data point. Broadcasting this over all of the data, we obtain the mapping. Note that Ref(edges) indicates that edges is a scalar under broadcasting (that means that the full array is considered in each call).
Although conceptually identical to your solution, this approach is about 13x faster on my machine.
I filed an issue over at StatsBase.jl's github page suggesting to add this as a feature.
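The same binary-search idea can be sketched in Python with the standard bisect module. This is an analogue for illustration, not part of StatsBase: bisect_right(edges, v) counts the edges that are less than or equal to v, which is the 1-based index that searchsortedlast returns, and the name bin_indices is made up for this example.

```python
from bisect import bisect_right


def bin_indices(edges, data):
    """For each value, return the 1-based index of the last edge <= value."""
    return [bisect_right(edges, value) for value in data]


edges = [0, 10, 20, 30]
indices = bin_indices(edges, [5, 10, 29, 0])
print(indices)
```

As in the Julia version, values below the first edge get index 0 and values at or above the last edge get len(edges), so out-of-range data still needs to be handled according to your own convention.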
After looking through the code for Histogram.jl I found that they already included a function binindex. So this solution is probably the best:
x = 0:0.001:10
h1 = fit(Histogram,x,0:10,closed=:left)
xmap1 = StatsBase.binindex.(Ref(h1), x)
h2 = fit(Histogram,x,0:10,closed=:right)
xmap2 = StatsBase.binindex.(Ref(h2), x)
I stumbled across this question when I was trying to figure out how many occurrences of each value I had in a list of values. If each value is in its own bin (as for categorical data, or integer data with a small number of unique values), this is what one would be plotting in a histogram.
If that is what you want, then countmap() in the StatsBase package is just what you need.
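For comparison, the one-value-per-bin case looks like this in Python, where collections.Counter plays roughly the role that countmap() plays in StatsBase. The sample data is made up for this sketch.

```python
from collections import Counter

values = ["a", "b", "a", "c", "a", "b"]
counts = Counter(values)  # maps each distinct value to its number of occurrences
print(counts)
```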

Allocating Subarrays in Mergesort

What's happening, folks.
So, I've done a fair amount of research on merge sort, and in spite of getting the "gist" of it, I am still baffled by how one is supposed to store the subarrays in order to merge them back together—in other words, save them somewhere so that they would "know" each other, as you would otherwise—in classic recursive fashion—have all these independent function calls returning data that I would assume would go out of scope.
Here's what I first thought: create a new array named "subs" to store the subarrays in upon each division (I also considered using a closure to do this and would like to know whether this is advisable). But, as you proceed to the next division, what are you gonna do—replace each element in subs with its subarrays? Then, you would be facing more costly work, especially once you consider how you're gonna move things around in subs in order to ensure that each subarray has its own index.
Heh—I have a bad feeling that this might be a far cry from what's actually supposed to be done. I understand that this algorithm is a classic example of the divide-and-conquer approach, but it's just strange to me that one couldn't just cut to the chase by splitting the array into all of its elements right off the bat (after all, that's the base case, and what would be wrong with throwing in a greedy approach to solving the problem?).
Thanks!
EDIT:
Alright, so I figured it out.
To sum it up: I used indices to track where to place elements (and obviate the need for built-in list functions that may slow down runtime).
By using nested functions and a (hidden) pointer to the new array, I kept data in scope. An auxiliary array buffers data from the subarrays.
In retrospect, what I originally had in mind, which vaguely resembled insertion sort, was in fact bottom-up merge sort. Having previously questioned the efficiency and purpose of top-down merge sort, I now understand that breaking the problem down expedites comparisons and swaps (especially on larger lists, which insertion sort would sort less efficiently). I did not attempt to implement my initial idea because I did not have a clear enough picture of recursion and how data is passed.
#!/usr/bin/env python3
def merge_sort(arr):
    def merge(*indices):  # indices = first, last, and pivot indices, respectively
        head, tail = indices[0], indices[1]
        pivot = indices[2]
        i = head
        j = pivot + 1
        k = 0
        # merge new[head..pivot] and new[pivot+1..tail] into aux
        while i <= pivot and j <= tail:
            if new[i] <= new[j]:
                aux[k] = new[i]
                i += 1
                k += 1
            else:
                aux[k] = new[j]
                j += 1
                k += 1
        while i <= pivot:
            aux[k] = new[i]
            i += 1
            k += 1
        while j <= tail:
            aux[k] = new[j]
            j += 1
            k += 1
        # copy the merged run back into place
        for x in range(head, tail + 1):
            new[x] = aux[x - head]
    # end merge

    def split(a, *indices):  # indices = first and last indices, respectively
        head, tail = indices[0], indices[1]
        pivot = (head + tail) // 2
        if head < tail:
            # the slices are copies that merge() never reads; only the indices matter
            l_sub = a[head:pivot + 1]
            r_sub = a[pivot + 1:tail + 1]
            split(l_sub, head, pivot)
            split(r_sub, pivot + 1, tail)
            merge(head, tail, pivot)
    # end split

    new = arr  # same list object: the input is sorted in place
    aux = list(new)
    tail = len(new) - 1
    split(new, 0, tail)
    return new
# end merge_sort

if __name__ == "__main__":
    loops = int(input().strip())
    for _ in range(loops):
        arr = list(map(int, input().strip().split(' ')))
        result = merge_sort(arr)
        print(result)
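The bottom-up variant mentioned in the edit, where the list is split into single elements right away and neighbouring runs are merged pairwise, might look like the following. This is a rough sketch rather than the poster's implementation; it builds new lists for clarity instead of using index arithmetic and a shared buffer, and the function names are chosen for this example.

```python
def merge(left, right):
    """Merge two already sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def bottom_up_merge_sort(items):
    """Start from runs of length one and merge neighbouring runs pairwise."""
    runs = [[item] for item in items]
    while len(runs) > 1:
        # An odd run at the end is carried over unchanged to the next pass.
        runs = [merge(runs[k], runs[k + 1]) if k + 1 < len(runs) else runs[k]
                for k in range(0, len(runs), 2)]
    return runs[0] if runs else []


print(bottom_up_merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))
```

The runs list is the "somewhere to store the subarrays" that the question asks about: each pass replaces it with a list half as long, so nothing has to be shuffled around inside it.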

Elixir loop over a matrix

I have a list of elements and I am converting it into a list of lists using the Enum.chunk_every function.
The code is something like this:
matrix = Enum.chunk_every(list_1d, num_cols)
Now I want to loop over the matrix and access the neighbors of each element.
For example, if I have the list [1,2,3,4,5,6,1,2,3], it is converted to a 3x3 matrix like:
[[1,2,3], [4,5,6], [1,2,3]]
Now how do I loop over this matrix? And what if I want to access the neighbors of the elements? For example the neighbors of 5 are 2,4,6 and 2.
I can see that recursion is a way to go but how will that work here?
There are many ways to solve this, and I think that you should consider first what is your use case (size of the matrix, number of matrices, number of accesses...) and adapt your data structure accordingly.
Nevertheless, here is a simple implementation (in the Erlang shell; I'll let you adapt it to Elixir):
1> L = [[1,2,3], [4,5,6], [1,2,3]].
[[1,2,3],[4,5,6],[1,2,3]]
2> Get = fun(I,J,L) ->
       try
           V = lists:nth(I,lists:nth(J,L)),
           {ok,V}
       catch
           _:_ -> {error,out_of_bound}
       end
   end.
#Fun<erl_eval.18.99386804>
3> Get(1,2,L).
{ok,4}
4> Get(2,3,L).
{ok,2}
5> Get(2,4,L).
{error,out_of_bound}
6> Neighbor = fun(I,J,L) ->
       [ V || {I1,J1} <- [{I,J-1},{I-1,J},{I+1,J},{I,J+1}],
              {ok,V} <- [Get(I1,J1,L)]
       ]
   end.
#Fun<erl_eval.18.99386804>
7> Neighbor(2,2,L).
[2,4,6,2]
8> Neighbor(1,2,L).
[1,5,1]
9>
Remark: I like list comprehensions; you may prefer to use lists:map in this case. This code is not efficient, since it traverses the list four times to get the neighbors. Its only advantage is that it is straightforward, so it should be easy to read.
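If neighbor lookups are frequent, one alternative data structure is a map keyed by coordinates, so each lookup no longer walks a list. The sketch below shows the idea in Python with 0-based {row, col} keys; to_grid and neighbors are names chosen for this example, and in Elixir the same shape could be a map with {row, col} tuple keys.

```python
def to_grid(flat, num_cols):
    """Turn a flat list into a dict keyed by (row, col), both 0-based."""
    return {divmod(index, num_cols): value for index, value in enumerate(flat)}


def neighbors(grid, row, col):
    """Values above, left of, right of and below (row, col), skipping cells off the edge."""
    candidates = [(row - 1, col), (row, col - 1), (row, col + 1), (row + 1, col)]
    return [grid[pos] for pos in candidates if pos in grid]


grid = to_grid([1, 2, 3, 4, 5, 6, 1, 2, 3], 3)
print(neighbors(grid, 1, 1))
```

Looping over the matrix then becomes iterating over the keys of the map, and positions outside the matrix are simply absent instead of raising an error.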

Add two matrices with recursion

In my programming course, we were asked to add two matrices together using recursion only. Apparently our professor intended us to write a recursive method but still use one "for" loop to solve it.
However, I'm still convinced that everything that can be done with a for loop can be done with recursion. So I was trying to do it:
The method should look like this:
public static int[][] addMatrix(int[][] matrix1, int[][] matrix2)
No additional parameters may be passed.
I'm having a really hard time solving this. Since this is a very easy exercise with a for loop, I thought it would be pretty doable with recursion.
Any help ?
--UPDATE
So far my course of thought was like this:
Always take the first matrix at its full size, and break the second matrix down step by step by taking matrix2[matrix2.length] and calling the method on it. That way I would know where to store the values calculated by the method without using an index variable.
Illustration:
X X Y Y
X X Y Y
X X Y Y
X are the elements of matrix1, Y those of matrix2
X X
X X
X X Y Y
Take the last "row":
X X
X X
X X Y
And if matrix2 is only 1x1, add it to the appropriate index in matrix1.
This is the best that I could have come up with.
With "matrix2[matrix2.length]" you'd be going out of range by the way since indexes start from 0.
Here is some pseudo code in no particular language that will give you the idea:
function addMatrices(mat1, mat2, result_mat /*pass by reference*/, i=0, j=0){
    if (j >= mat1[0].size)
        return result_mat;
    result_mat[i][j] = mat1[i][j] + mat2[i][j];
    i ++;
    if (i == mat1.size){
        i = 0;
        j ++;
    }
    return addMatrices(mat1, mat2, result_mat, i, j);
}
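If the signature really must take only the two matrices, one option is to recurse on the structure itself instead of on indices: add the first rows, then recurse on the remaining rows. The Python sketch below illustrates that idea with no loops at all; add_row and add_matrix are names chosen here, and translating it to Java's int[][] would need array copying (for example with Arrays.copyOfRange) in place of the slices.

```python
def add_row(row1, row2):
    """Add two rows element by element, recursing on the tail of each row."""
    if not row1:
        return []
    return [row1[0] + row2[0]] + add_row(row1[1:], row2[1:])


def add_matrix(matrix1, matrix2):
    """Add two matrices of the same shape, recursing on the remaining rows."""
    if not matrix1:
        return []
    return [add_row(matrix1[0], matrix2[0])] + add_matrix(matrix1[1:], matrix2[1:])


print(add_matrix([[1, 2], [3, 4]], [[10, 20], [30, 40]]))
```

The slicing makes this quadratic in the number of elements, so it is a way to satisfy the exercise's constraint rather than an efficient way to add matrices.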
