Trying to use ConcatLayer with different shape inputs - lasagne

I am trying to work with nolearn and use the ConcatLayer to combine multiple inputs. It works great as long as every input has the same type and shape. I have three different types of inputs that will eventually produce a single scalar output value.
The first input is an image of dimensions (288,1001)
The second input is a vector of length 87
The third is a single scalar value
I am using Conv2DLayer(s) on the first input.
The second input utilizes Conv1DLayer or DenseLayer (not sure which would be better since I can't get it far enough to see what happens)
I'm not even sure how the third input should be set up since it is only a single value I want to feed into the network.
The code blows up at the ConcatLayer with:
'Mismatch: input shapes must be the same except in the concatenation axis'
I would be forever grateful if someone could write out a super simple network structure that can take these types of inputs and output a single scalar value. I have been googling all day and simply cannot figure this one out.
The fit function looks like this if it is helpful to know, as you can see I am inputting a dictionary with an item for each type of input:
X = {'base_input': X_base, 'header_input': X_headers, 'time_input':X_time}
net.fit(X, y)

It is hard to properly answer the question, because - it depends.
Without having information on what you are trying to do and what data you are working on, we are playing the guessing game here and thus I have to fall back to giving general tips.
First, it is entirely reasonable that ConcatLayer complains. It does not make much sense to append a scalar to the pixel values of an image. So you should think about what you actually want, which is most likely to combine the information from the three sources.
You are right to suggest processing the image with 2D convolutions and the sequence data with 1D convolutions. If you want to generate a scalar value, you probably want to use dense layers later on to condense the information.
So it would be natural to keep the low-level processing of the three branches independent and concatenate them later on.
Something along the lines of:
Image -> conv -> ... -> conv -> dense -> ... -> dense -> imValues
Timeseries -> conv -> ... -> conv -> dense ... -> dense -> seriesValues
concatLayer([imValues, seriesValues, Scalar]) -> dense -> ... -> dense with num_units=1
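To see why this layout satisfies the rule quoted in the error message, here is a small pure-Python sketch of the shape check. It is not Lasagne code: concat_shape is a hypothetical helper that only mimics the rule, and the branch sizes 64 and 32 are made-up values for illustration.

```python
def concat_shape(shapes, axis=1):
    """Return the output shape of concatenating `shapes` along `axis`.

    Mimics the rule from the error message: all inputs must have the same
    number of dimensions and the same size everywhere except `axis`.
    None stands for the unknown batch size.
    """
    first = shapes[0]
    for shape in shapes[1:]:
        if len(shape) != len(first) or any(
            a != b for i, (a, b) in enumerate(zip(first, shape)) if i != axis
        ):
            raise ValueError(
                "Mismatch: input shapes must be the same except in the concatenation axis"
            )
    out = list(first)
    out[axis] = sum(shape[axis] for shape in shapes)
    return tuple(out)


# Raw inputs: image as (batch, channels, height, width), vector, scalar.
raw = [(None, 1, 288, 1001), (None, 87), (None, 1)]

# Hypothetical branch outputs once each branch ends in a dense layer.
branch_outputs = [(None, 64), (None, 32), (None, 1)]

try:
    concat_shape(raw)
except ValueError as err:
    print("raw inputs:", err)

print("branch outputs:", concat_shape(branch_outputs))  # (None, 97)
```

The raw inputs fail because they do not even have the same number of dimensions. Once every branch has been reduced to (batch, features), only the feature axis differs, and that is the axis being concatenated.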
Another option, less often reasonable, would be to add the information at the low-level processing of the image. This might make sense if the local processing is much easier given knowledge of the scalar/time series.
This architecture might look like:
concatLayer([seriesValues, scalar]) -> dense -> ... -> reshape((-1, N, 1, 1))
-> Upscale2DLayer(Image.shape[2:4]) -> globalInformation
concatLayer([globalInformation, Image]) -> 2D conv filtersize=1 -> conv -> ... -> conv
Note that you will almost certainly want to go with the first option.
One unrelated thing I noticed is the huge size of your input image. You should reduce it (by resizing or by using patches). Unless you have a gigantic amount of data and plenty of memory and computing power, you will otherwise either overfit or waste hardware.

bulk adding to a map, in F#

I've a simple type:
type Token =
    {
        Symbol: string
        Address: string
        Decimals: int
    }
and a memory cache (they're in a db):
let mutable private tokenCache : Map<string, Token> = Map.empty
part of the Tokens module.
Sometimes I get a few new entries to add, in the form of a Token array, and I want to update the cache.
It happens very rarely (less than once per million reads).
When I update the database with the new batch, I want to update the cache map as well and I just wrote this:
tokenCache <- tokens |> Seq.fold (fun m i -> m.Add(i.Symbol, i)) tokenCache
Since this is happening rarely, I don't really care about the performance so this question is out of curiosity:
When I do this, the map will be recreated once per entry in the tokens array: 10 new tokens, 10 map re-creations. I assumed this was the most 'F#' way to deal with this. It got me thinking: wouldn't converting the map to a list of KVPs, getting the output of distinct and re-creating a map be more efficient? Or is there another method I haven't thought about?
This is not an answer to the question as stated, but a clarification to something you asked in the comments.
This premise that you have expressed is incorrect:
the map will be recreated once per entry in the tokens array
The map doesn't actually get completely recreated for every insertion. But at the same time, another hypothesis that you have expressed in the comments is also incorrect:
so the immutability is from the language's perspective, the compiler doesn't recreate the object behind the scenes?
Immutability is real. But the map also doesn't get recreated every time. Sometimes it does, but not every time.
I'm not going to describe exactly how Map works, because that's too involved. Instead, I'll illustrate the principle on a list.
F# lists are "singly linked lists", which means each list consists of two things: (1) first element (called "head") and (2) a reference (pointer) to the rest of elements (called "tail"). The crucial thing to note here is that the "rest of elements" part is also itself a list.
So if you declare a list like this:
let x = [1; 2; 3]
It would be represented in memory something like this:
x -> 1 -> 2 -> 3 -> []
The name x is a reference to the first element, and then each element has a reference to the next one, and the last one - to empty list. So far so good.
Now let's see what happens if you add a new element to this list:
let y = 42 :: x
Now the list y will be represented like this:
y -> 42 -> 1 -> 2 -> 3 -> []
But this diagram shows only half the picture. If we look at the memory in a wider scope than just y, we'll see this:
x -----> 1 -> 2 -> 3 -> []
         ^
         |
y -> 42 -+
So you see that the y list consists of two things (as all lists do): first element 42 and a reference to the rest of the elements 1->2->3. But the "rest of the elements" bit is not exclusive to y, it has its own name x.
And so it is that you have two lists x and y, 3 and 4 elements respectively, but together they occupy just 4 cells of memory, not 7.
And another thing to note is that when I created the y list, I did not have to recreate the whole list from scratch, I did not have to copy 1, 2, and 3 from x to y. Those cells stayed right where they are, and y only got a reference to them.
And a third thing to note is that this means that prepending an element to a list is an O(1) operation. No copying of the list involved.
And a fourth (and hopefully final) thing to note is that this approach is only possible because of immutability. It is only because I know that the x list will never change that I can take a reference to it. If it was subject to change, I would be forced to copy it just in case.
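The four points above can be checked with a small sketch. It is written in Python rather than F#, and it models a list cell as an immutable (head, tail) tuple; cons and to_list are hypothetical helper names, not part of any library.

```python
# A list is either None (empty) or an immutable (head, tail) pair.
EMPTY = None

def cons(head, tail):
    """Prepend one element: allocates a single new cell, copies nothing."""
    return (head, tail)

def to_list(cell):
    """Walk the cells and collect the elements into a Python list."""
    items = []
    while cell is not EMPTY:
        head, cell = cell
        items.append(head)
    return items

x = cons(1, cons(2, cons(3, EMPTY)))
y = cons(42, x)          # O(1): one new cell pointing at the existing x

print(to_list(x))        # [1, 2, 3]
print(to_list(y))        # [42, 1, 2, 3]
print(y[1] is x)         # True: the tail of y is the very same object as x
```

Because tuples cannot be modified, nothing done through y can ever change what x sees, which is what makes the sharing safe.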
This sort of arrangement, where each iteration of a data structure is built "on top of" the previous one is called "persistent data structure" (well, to be more precise, it's one kind of a persistent data structure).
The way it works is very easy to see for linked lists, but it also works for more involved data structures, including maps (which are represented as trees).
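For trees the same principle is usually called path copying: an insert creates new nodes only along the path from the root to the changed spot and shares every other subtree with the old version. The sketch below is in Python and uses a plain unbalanced binary search tree; it is an illustration of the principle under that simplification, not the actual implementation of F#'s Map, and Node and insert are hypothetical names.

```python
from typing import NamedTuple, Optional

class Node(NamedTuple):
    key: str
    value: int
    left: Optional["Node"]
    right: Optional["Node"]

def insert(node, key, value):
    """Return a new tree containing key; the old tree is left untouched."""
    if node is None:
        return Node(key, value, None, None)
    if key < node.key:
        return node._replace(left=insert(node.left, key, value))
    if key > node.key:
        return node._replace(right=insert(node.right, key, value))
    return node._replace(value=value)

old = None
for k, v in [("m", 1), ("f", 2), ("t", 3), ("b", 4)]:
    old = insert(old, k, v)

new = insert(old, "z", 5)

print(new is old)               # False: the root was copied
print(new.left is old.left)     # True: the whole left subtree is shared
print(old.right.right is None)  # True: the old tree never sees "z"
```

Only the nodes for "m" and "t" were copied and one node for "z" was added; the subtree under "f" is reused as-is by both versions.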

Saving multiple sparse arrays in one big sparse array

I have been trying to implement some code in Julia JuMP. The idea of my code is that I have a for loop inside my while loop that runs S times. In each of these loops I solve a subproblem and get some variables as well as opt=1 if the subproblem was optimal or opt=0 if it was not optimal. Depending on the value of opt, I have two types of constraints, either optimality cuts (if opt=1) or feasibility cuts (if opt=0). So the intention with my code is that I only add all of the optimality cuts if there are no feasibility cuts for s=1:S (i.e. we get opt=1 in every iteration from 1:S).
What I am looking for is a better way to save the values of ubar, vbar and wbar. Currently I am saving them one at a time with the for-loop, which is quite expensive.
So the problem is that my values of ubar,vbar and wbar are sparse axis arrays. I have tried to save them in other ways like making a 3d sparse axis array, which I could not get to work, since I couldn't figure out how to initialize it.
The below code works (with the correct code inserted inside my <>'s of course), but does not perform as well as I wish. So if there is some way to save the values of 2d sparse axis arrays more efficiently, I would love to know it! Thank you in advance!
ubar2=zeros(nV,nV,S)
vbar2=zeros(nV,nV,S)
wbar2=zeros(nV,nV,S)
while <some condition>
    opts=0
    for s=1:S
        <solve a subproblem, get new ubar,vbar,wbar and opt=1 if optimal or 0 if not>
        opts+=opt
        if opt==1
            # Add opt cut Constraints
            for i=1:nV
                for k=1:nV
                    if i!=k
                        ubar2[i,k,s]=ubar[i,k]
                    end
                end
                for j=i:nV
                    if links[i,j]==1
                        vbar2[i,j,s]=vbar[i,j]
                        wbar2[i,j,s]=wbar[i,j]
                    end
                end
            end
        else
            # Add feas cut Constraints
            #constraint(mas, <constraint from ubar,vbar,wbar> <= 0)
            break
        end
    end
    if opts==S
        for s=1:S
            #constraint(mas, <constraint from ubar2,vbar2,wbar2> <= <some variable>)
        end
    end
end
A SparseAxisArray is simply a thin wrapper on top of a Dict.
It was defined so that when the user creates a container in a JuMP macro, whether they get an Array, a DenseAxisArray or a SparseAxisArray, the containers behave as similarly as possible, so the user does not need to care which one they obtained for most operations.
For this reason we could not just use a Dict, as it behaves differently from an array. For instance, you cannot call getindex with multiple indices, as in x[2, 2].
Here you can use either a Dict or a SparseAxisArray, as you prefer.
Both have O(1) average complexity for setting and getting elements, and sparse storage, which seems adequate for what you need.
If you choose SparseAxisArray, you can initialize it with
ubar2 = JuMP.Containers.SparseAxisArray(Dict{Tuple{Int,Int,Int},Float64}())
and set it with
ubar2[i,k,s]=ubar[i,k]
If you choose Dict, you can initialize it with
ubar2 = Dict{Tuple{Int,Int,Int},Float64}()
and set it with
ubar2[(i,k,s)]=ubar[i,k]
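The idea behind the Dict variant is language-neutral, so here is a minimal sketch of it in Python, with a dict keyed by (i, k, s) tuples. The contents of ubar and the value of s are made-up sample data, and the single loop over stored entries is one possible way to avoid visiting every (i, k) pair of the dense index range.

```python
# Sparse storage keyed by (i, k, s); only entries that are set take memory.
ubar2 = {}

# Hypothetical subproblem result for scenario s = 2: a sparse 2-D mapping.
ubar = {(1, 2): 0.5, (3, 1): 1.25}
s = 2

# Copy the whole scenario in one pass over the stored entries only.
for (i, k), value in ubar.items():
    ubar2[(i, k, s)] = value

print(ubar2[(1, 2, 2)])           # 0.5
print(ubar2.get((1, 1, 2), 0.0))  # 0.0: missing entries fall back to a default
print(len(ubar2))                 # 2
```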

How exactly does the Evaluate function work in Wolfram Mathematica? How does it make a difference in the two plots below?

I'm trying to understand exactly how the Evaluate function works. Here I have two examples, and the only difference between them is the use of Evaluate.
First plot with Evaluate.
ReliefPlot[
Table[Evaluate[Sum[Sin[RandomReal[9, 2].{x, y}], {20}]], {x, 1, 2, .02},
{y, 1, 2, .02}],
ColorFunction ->
(Blend[{Darker[Green, .8], Lighter[Brown, .2],White}, #] &),
Frame -> False, Background -> None, PlotLegends -> Automatic]
https://imgur.com/itBRYEv.png "plot1"
Second plot without Evaluate.
ReliefPlot[
Table[Sum[Sin[RandomReal[9, 2].{x, y}], {20}], {x, 1, 2, .02},
{y, 1,2, .02}],
ColorFunction ->
(Blend[{Darker[Green, .8], Lighter[Brown, .2], White}, #] &),
Frame -> False, Background -> None,
PlotLegends -> Automatic]
https://i.imgur.com/fvdiSCm.png "plot2"
Please explain how Evaluate makes a difference here.
Compare this
count=0;
ReliefPlot[Table[Sum[Sin[count++;RandomReal[9,2].{x,y}],{20}],{x,1,2,.02},{y,1,2,.02}]]
count
which should display your plot followed by 52020=51*51*20 because you have a 51*51 Table and each entry needs to evaluate the 20 iterations of your Sum
with this
count=0;
ReliefPlot[Table[Evaluate[Sum[Sin[count++;RandomReal[9,2].{x,y}],{20}]],{x,1,2,.02},{y,1,2,.02}]]
count
which should display your plot followed by 20 because the Evaluate needed to do the 20 iterations of your Sum only once, even though you do see 51*51 blocks of different colors on the screen.
You will get the same counts displayed, without the graphics, if you remove the ReliefPlot from each of these, so that seems to show it isn't the ReliefPlot that is responsible for the number of times your RandomReal is calculated, it is the Table.
So Evaluate forces your Table entry to be evaluated once, before Table starts iterating and while x and y are still symbolic. Table then only substitutes values for x and y into that already-evaluated expression, instead of re-evaluating the Sum (and its RandomReal calls) for every entry.
What you put and see on the screen is the front end of Mathematica. Hidden behind that is the back end where most of the actual calculations are done. The front and back ends communicate with each other during your input, calculations, output and display.
That also explains why the two plots look so different. In both cases ReliefPlot just receives a 51*51 array of numbers and does not re-evaluate anything. Without Evaluate, every entry of the Table draws its own 20 random vectors, so neighbouring entries are unrelated to each other and the plot looks like noise. With Evaluate, the 20 random vectors are drawn only once, every entry is the same smooth function of x and y evaluated at a different point, and the plot looks like a continuous landscape.
As with almost all of Mathematica, the details of the algorithms used for each of the thousands of different functions are not available. Sometimes the Options and Details tab in the help page for a given function can give you some additional information. Experimentation can sometimes help you guess what is going on behind the code. Sometimes other very bright people have figured out parts of the behavior and posted descriptions. But that is probably all there is.
Table has the HoldAll attribute
Attributes[Table]
(* {HoldAll, Protected} *)
Read this and this to learn more about evaluation in the WL.
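The two counts, 52020 and 20, can be reproduced outside Mathematica. The following Python sketch is only a rough analogy of held versus pre-evaluated arguments, not a model of the Wolfram evaluator; build_table and its evaluate_first flag are hypothetical names, and random.uniform(0, 9) stands in for RandomReal[9, 2].

```python
import math
import random

GRID = [1 + 0.02 * i for i in range(51)]  # 51 values from 1 to 2

def build_table(evaluate_first):
    """Build the 51x51 table and count how many random pairs were drawn."""
    draws = 0

    def random_pair():
        nonlocal draws
        draws += 1
        return random.uniform(0, 9), random.uniform(0, 9)

    def entry(x, y, coeffs):
        if coeffs is None:
            # "Held" case: the sum is re-evaluated for this cell,
            # so it draws 20 fresh random pairs.
            coeffs = [random_pair() for _ in range(20)]
        return sum(math.sin(a * x + b * y) for a, b in coeffs)

    # "Evaluate" case: draw the 20 pairs once, before iterating.
    fixed = [random_pair() for _ in range(20)] if evaluate_first else None
    table = [[entry(x, y, fixed) for y in GRID] for x in GRID]
    return table, draws

_, held_draws = build_table(evaluate_first=False)
_, evaluated_draws = build_table(evaluate_first=True)
print(held_draws)       # 52020 = 51 * 51 * 20
print(evaluated_draws)  # 20
```

In the second case every cell uses the same 20 coefficient pairs, so the table samples one fixed function of x and y.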

ℝ³ -> ℕ mapping for a finite number of values

I am looking for an algorithm that is capable of mapping a finite but large number of 3 dimensional positions (about 10^11) to indices (so a mapping ℝ³ -> ℕ)
I know that it's possible and fairly simple to make an ℕ -> ℝ³ mapping, and that's essentially what I want to do, but ℕ -> ℝ³ would be an impractical way of figuring out which indices of ℕ are near a certain position.
Ideally I would also like to ensure that my finite subset of ℕ contains no duplicates.
Some background on how this would be implemented to give a better idea on the constraints and problems with some naive solutions to this problem:
I'm trying to think of a way to map stars in a galaxy to a unique ID that I can then use as a "seed" for a random number generator. An ℕ -> ℝ³ mapping would require me to iterate over all of ℕ to find the values of ℝ³ that are near a given location, which is obviously not a practical approach.
I've already found some information about the Cantor pairing function and dovetailing, but those cause problems because they mainly apply to ℕⁿ and not ℝⁿ.
It's not guaranteed that my ℝ³ values follow a grid. If they did, I could map ℝ³ -> ℕ³ by figuring out which "box" the value is in, and then use Cantor's pairing function to figure out which ℕ belongs to that box, but in my situation a box might contain multiple values, or none.
Thanks in advance for any help
You could use a k-d tree to spatially partition your set of points. To map onto a natural number, treat the path through the tree to each point as a string of binary digits, where 0 is the left branch and 1 is the right branch. This might not get you exactly what you're looking for, since some points which are spatially close to each other may lie on different branches, and are therefore numerically distant from each other. However, if two points share a long common prefix of path digits, they lie in the same small region and are therefore close to each other spatially.
Alternatively, you could also use an octree, in which case you get three bits at a time for each level you descend into the tree. You can completely partition the space so each region contains at most one point of interest.
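A minimal sketch of the octree variant, assuming a fixed depth and a known bounding box: octree_index is a hypothetical helper that halves the box once per axis at each level and appends one bit per axis. With a fixed depth, two points that fall into the same cell get the same index, so the depth has to be large enough (or the tree adaptive) if the IDs must be unique.

```python
def octree_index(point, lo, hi, depth):
    """Path from the octree root to the cell containing `point`,
    packed into an integer with three bits (x, y, z) per level."""
    lo, hi = list(lo), list(hi)
    index = 0
    for _ in range(depth):
        for axis in range(3):
            mid = (lo[axis] + hi[axis]) / 2
            bit = 1 if point[axis] >= mid else 0
            # Shrink the box to the half that contains the point.
            if bit:
                lo[axis] = mid
            else:
                hi[axis] = mid
            index = (index << 1) | bit
    return index

lo, hi = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
print(octree_index((0.1, 0.1, 0.1), lo, hi, 2))  # 0
print(octree_index((0.9, 0.9, 0.9), lo, hi, 2))  # 63
print(octree_index((0.6, 0.2, 0.2), lo, hi, 1))  # 4
```

Indices that agree in their leading bits belong to cells nested inside the same parent cell, which is what makes nearby-index lookups practical.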

Erlang Recursive end loop

I just started learning Erlang and since I found out there is no for loop I tried recreating one with recursion:
display(Rooms, In) ->
    Room = array:get(In, Rooms)
    io:format("~w", [Room]),
    if
        In < 59 -> display(Rooms, In + 1);
        true -> true
    end.
With this code I need to display the content (false or true) of each entry in Rooms until the number 59 is reached. However, this creates weird output which displays all of Rooms' contents about 60 times (?). When I drop the if statement and only put in the recursive call it works, except for an exception error: Bad Argument.
So basically my question is how do I put a proper end to my "for loop".
Thanks in advance!
Hmm, this code looks retyped rather than pasted. It is missing a comma after Room = array:get(In, Rooms). The bad argument error is probably this:
exception error: bad argument
in function array:get/2 (array.erl, line 633)
in call from your_module_name:display/2
This means, that you called array:get/2 with bad arguments: either Rooms is not an array or you used index out of range. The second one is more likely the cause. You are checking if:
In < 59
and then calling display again, so it will get to 58, evaluate to true and call:
display(Rooms, 59)
which is too much.
There are also a couple of other things:
In io:format/2 it is usually better to use ~p instead of ~w. It does exactly the same, but with pretty printing, so it is easier to read.
In Erlang, if is unnatural, because it evaluates guards and one of them has to match or you get an error... It is just really weird.
case is much more readable:
case In < 59 of
    false -> do_something();
    true -> ok
end
In a case you usually write something that always matches:
case Something of
    {One, Two} -> do_stuff(One, Two);
    [Head | RestOfList] -> do_other_stuff(Head, RestOfList);
    _ -> none_of_the_previous_matched()
end
The underscore is really useful in pattern matching.
In functional languages you should never worry about details like indexes! The array module has a map function, which takes a function and an array as arguments and calls the given function on each array element.
So you can write your code this way:
display(Rooms) ->
    DisplayRoom = fun(Index, Room) -> io:format("~p ~p~n", [Index, Room]) end,
    array:map(DisplayRoom, Rooms).
This isn't perfect though, because apart from calling io:format/2 and displaying the contents, it will also construct a new array. io:format returns the atom ok after completion, so you will get an array of 58 ok atoms. There is also array:foldl/3, which doesn't have that problem.
If you don't have to have random access, it would be best to simply use lists.
Rooms = lists:duplicate(58, false),
DisplayRoom = fun(Room) -> io:format("~p~n", [Room]) end,
lists:foreach(DisplayRoom, Rooms)
If you are not comfortable with higher-order functions, lists allow you to easily write recursive algorithms with function clauses:
display([]) -> % always start with the base case, where you don't need recursion
    ok; % you have to return something
display([Room | RestRooms]) -> % pattern match on the list, splitting it into first element and tail
    io:format("~p~n", [Room]), % do something with the first element
    display(RestRooms). % recursive call on the rest (RestRooms is quite a funny name :D)
To summarize - don't write for loops in Erlang :)
This is a general misunderstanding of recursive loop definitions. What you are trying to check for is called the "base condition" or "base case". This is easiest to deal with by matching:
display(0, _) ->
    ok;
display(In, Rooms) ->
    Room = array:get(In, Rooms),
    io:format("~w~n", [Room]),
    display(In - 1, Rooms).
This is, however, rather unidiomatic. Instead of using a hand-made recursive function, something like a fold or map is more common.
Going a step beyond that, though, most folks would probably have chosen to represent the rooms as a set or list, and iterated over it using list operations. When hand-written the "base case" would be an empty list instead of a 0:
display([]) ->
    ok;
display([Room | Rooms]) ->
    io:format("~w~n", [Room]),
    display(Rooms).
Which would have been avoided in favor, once again, of a list operation like foreach:
display(Rooms) ->
    lists:foreach(fun(Room) -> io:format("~w~n", [Room]) end, Rooms).
Some folks really dislike reading lambdas in-line this way. (In this case I find it readable, but the larger they get the more likely they are to become genuinely distracting.) An alternative representation of the exact same function:
display(Rooms) ->
    Display = fun(Room) -> io:format("~w~n", [Room]) end,
    lists:foreach(Display, Rooms).
Which might itself be passed up in favor of using a list comprehension as a shorthand for iteration:
_ = [io:format("~w~n", [Room]) || Room <- Rooms].
When only trying to get a side effect, though, I really think that lists:foreach/2 is the best choice for semantic reasons.
I think part of the difficulty you are experiencing is that you have chosen to use a rather unusual structure as your base data for your first Erlang program that does anything (arrays are not used very often, and are not very idiomatic in functional languages). Try working with lists a bit first -- it's not scary -- and some of the idioms and other code examples and general discussions about list processing and functional programming will make more sense.
Wait! There's more...
I didn't deal with the case where you have an irregular room layout. The assumption was always that everything was laid out in a nice even grid -- which is never the case when you get into the really interesting stuff (either because the map is irregular or because the topology is interesting).
The main difference here is that instead of simply carrying a list of [Room] where each Room value is a single value representing the Room's state, you would wrap the state value of the room in a tuple which also contained some extra data about that state such as its location or coordinates, name, etc. (You know, "metadata" -- which is such an overloaded, buzz-laden term today that I hate saying it.)
Let's say we need to maintain coordinates in a three-dimensional space in which the rooms reside, and that each room has a list of occupants. In the case of the array we would have divided the array by the dimensions of the layout. A 10*10*10 space would have an array index from 0 to 999, and each location would be found by an operation similar to
locate({X, Y, Z}) -> (1 * X) + (10 * Y) + (100 * Z).
and the value of each Room would be [Occupant1, Occupant2, ...].
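The index arithmetic for the 10*10*10 case can be checked with a short sketch. It is in Python rather than Erlang; locate follows the one-liner above, while coordinates (the inverse) and the size parameter are additions for illustration.

```python
def locate(x, y, z, size=10):
    """Flatten (x, y, z) into a single array index."""
    return x + size * y + size * size * z

def coordinates(index, size=10):
    """Inverse of locate: recover (x, y, z) from the flat index."""
    z, rest = divmod(index, size * size)
    y, x = divmod(rest, size)
    return x, y, z

print(locate(0, 0, 0))    # 0
print(locate(9, 9, 9))    # 999
print(coordinates(345))   # (5, 4, 3)
```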
It would be a real annoyance to define such an array and then mark arbitrarily large regions of it as "unusable" to give the impression of irregular layout, and then work around that trying to simulate a 3D universe.
Instead we could use a list (or something like a list) to represent the set of rooms, but the Room value would now be a tuple: Room = {{X, Y, Z}, [Occupants]}. You may have an additional element (or ten!), like the "name" of the room or some other status information or whatever, but the coordinates are the most certain real identity you're likely to get. To get the room status you would do the same as before, but mark what element you are looking at:
display(Rooms) ->
    Display =
        fun({ID, Occupants}) ->
            io:format("ID ~p: Occupants ~p~n", [ID, Occupants])
        end,
    lists:foreach(Display, Rooms).
To do anything more interesting than printing sequentially, you could replace the internals of Display with a function that uses the coordinates to plot the room on a chart, check for empty or full lists of Occupants (use pattern matching, don't do it procedurally!), or whatever else you might dream up.
