Mutating a Julia dictionary through aliases

Let's say I have a dictionary like so:
my_object = Dict{Symbol, Any}(
:foo => Dict{Symbol, Any}(
:foo_items => ["item_a", "item_b", "item_c"],
:bar => Dict{Symbol, Any}(
:bar_name => ["name_a", "name_b", "name_c"],
:type_before => ["Float32", "Float64", "String"],
:type_after => ["Int32", "Int64", "Int8"]
)
)
)
I want to convert these arrays, each with a different function, for example turning them into vectors of Symbol rather than String. I could mutate the dictionary directly, like this:
# Need to check these keys are present
if haskey(my_object, :foo)
if haskey(my_object[:foo], :foo_items)
my_object[:foo][:foo_items] = Symbol.(my_object[:foo][:foo_items])
...
end
This, however, quickly becomes tedious and repetitive, and is therefore error-prone. I was hoping to use aliasing to make it simpler and more readable, especially since containers like Dict are passed by reference:
if haskey(my_object, :foo)
foo = my_object[:foo]
if haskey(foo, :foo_items)
foo_items = foo[:foo_items]
foo_items = Symbol.(foo_items)
...
end
This, however, does not work: my_object remains unchanged. That is strange, because === shows that the aliases refer to the same objects right up until the change is made:
julia> foo = my_object[:foo];
julia> foo === my_object[:foo]
true
julia> foo_items = foo[:foo_items];
julia> foo_items === my_object[:foo][:foo_items]
true
Is this a case of copy-on-write? Why can't I mutate the dictionary this way? And what can I do instead if I want to mutate elements of a nested dictionary in a simpler way?

I would do it recursively:
function conversion!(dict, keyset)
    for (k, v) in dict
        if v isa Dict
            conversion!(v, keyset)
        elseif k in keyset
            dict[k] = converted(Val(k), v)
        end
    end
end
converted(::Val{:foo_items}, value) = Symbol.(value)
# converted(::Val{:bar_name}, value) = ...
my_object = Dict{Symbol, Any}(
:foo => Dict{Symbol, Any}(
:foo_items => ["item_a", "item_b", "item_c"],
:bar => Dict{Symbol, Any}(
:bar_name => ["name_a", "name_b", "name_c"],
:type_before => ["Float32", "Float64", "String"],
:type_after => ["Int32", "Int64", "Int8"]
)
)
)
toconvert = Set([:foo_items]) #, :bar_name, :type_before, :type_after])
@show my_object
conversion!(my_object, toconvert)
@show my_object
my_object = Dict{Symbol, Any}(:foo => Dict{Symbol, Any}(:foo_items => ["item_a", "item_b", "item_c"], :bar => Dict{Symbol, Any}(:type_before => ["Float32", "Float64", "String"], :bar_name => ["name_a", "name_b", "name_c"], :type_after => ["Int32", "Int64", "Int8"])))
my_object = Dict{Symbol, Any}(:foo => Dict{Symbol, Any}(:foo_items => [:item_a, :item_b, :item_c], :bar => Dict{Symbol, Any}(:type_before => ["Float32", "Float64", "String"], :bar_name => ["name_a", "name_b", "name_c"], :type_after => ["Int32", "Int64", "Int8"])))
I feel this code may be more type-stable.
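The recursive answer does not spell out why the alias version in the question leaves my_object unchanged, so here is a minimal sketch of the difference, assuming a trimmed-down copy of the question's my_object. The name items is only illustrative. Plain assignment (items = ...) rebinds the local name to a new vector and never touches the Dict; it is not copy-on-write. Indexed assignment through the alias (foo[:foo_items] = ...) does mutate the shared Dict.

```julia
my_object = Dict{Symbol, Any}(
    :foo => Dict{Symbol, Any}(
        :foo_items => ["item_a", "item_b", "item_c"],
    ),
)

# `foo` is an alias: it names the very same Dict stored under :foo.
foo = my_object[:foo]

# Rebinding: `items` first names the stored vector, then a brand-new one.
# The Dict still holds the original vector of Strings.
items = foo[:foo_items]
items = Symbol.(items)
unchanged = copy(my_object[:foo][:foo_items])

# Mutation through the alias: indexed assignment calls setindex! on the
# shared Dict, so the change is visible through my_object as well.
foo[:foo_items] = Symbol.(foo[:foo_items])
```

So aliasing the nested Dict works fine for shortening the code, as long as the final write is an indexed assignment into that Dict rather than an assignment to a local name.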

Related

How can I rename columns when using combine in Julia DataFrame for many functions?

What's wrong with the following syntax:
combine(gpd, :SepalWidth .=> [mean, sum] => [:mymean, :mysum] )
given that gpd is a GroupedDataFrame and that I want the output columns to be named :mymean and :mysum?

You are missing a dot in broadcasting. The following should work:
combine(gpd, :SepalWidth .=> [mean, sum] .=> [:mymean, :mysum])
EDIT
A crucial part of learning to debug complex expressions in the DataFrames.jl mini-language is understanding that you can always check, on its own, how broadcasting will handle the expression you pass.
In this case you have:
julia> :SepalWidth .=> [mean, sum] .=> [:mymean, :mysum]
2-element Vector{Pair{Symbol}}:
:SepalWidth => (Statistics.mean => :mymean)
:SepalWidth => (sum => :mysum)
so as you can see the result is a vector of two correct transformation operations.
Now let us have a look at:
julia> [:SepalWidth, :SepalLength] .=> [mean] => [:mymean1, :mymean2]
2-element Vector{Pair{Symbol, Pair{Vector{typeof(mean)}, Vector{Symbol}}}}:
:SepalWidth => ([Statistics.mean] => [:mymean1, :mymean2])
:SepalLength => ([Statistics.mean] => [:mymean1, :mymean2])
This is clearly incorrect, as you are trying to store the result of mean as two columns. If instead you write, e.g.:
julia> [:SepalWidth, :SepalLength] .=> mean .=> [:mymean1, :mymean2]
2-element Vector{Pair{Symbol, Pair{typeof(mean), Symbol}}}:
:SepalWidth => (Statistics.mean => :mymean1)
:SepalLength => (Statistics.mean => :mymean2)
all is correct again.
Interestingly, in some cases you can omit a dot in broadcasting (but this is rare). For instance:
julia> [:SepalWidth, :SepalLength] .=> mean .=> identity
2-element Vector{Pair{Symbol, Pair{typeof(mean), typeof(identity)}}}:
:SepalWidth => (Statistics.mean => identity)
:SepalLength => (Statistics.mean => identity)
julia> [:SepalWidth, :SepalLength] .=> mean => identity
2-element Vector{Pair{Symbol, Pair{typeof(mean), typeof(identity)}}}:
:SepalWidth => (Statistics.mean => identity)
:SepalLength => (Statistics.mean => identity)
give you exactly the same result (in this case the identity part means that you reuse the input column name as the output column name). The result with => and .=> in the second position is the same because the second and third parts of the whole expression are each a single element.
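The same check can be run without DataFrames.jl at all, since the pairs are built by Base broadcasting. A small sketch, using stand-in names (:a, :x, :y, and the Base functions sum and prod are placeholders, not the columns from the question):

```julia
# With both dots: one  source => (function => target)  pair per function.
fixed = :a .=> [sum, prod] .=> [:x, :y]

# With the second dot missing, the right-hand Pair is built first and then
# treated as a single scalar, so the result is one Pair holding two vectors.
broken = :a .=> [sum, prod] => [:x, :y]
```

Here fixed is a two-element vector of transformations, while broken is a single Pair whose second element is a Pair of vectors, which is the shape that caused the problem in the question.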

Julia: Writing a function "paramvalues" that returns a dictionary of parameter values

I am working on an Economics assignment and need to create a function in Julia that takes keyword arguments and returns them in the form of a dictionary. So far this is my code, but it is not running because there is "no method matching getindex(::Nothing, ::Symbol)".
function paramvalues(a; Nb = 100, Ns = 100, Minval = 0, Maxval = 200, Mincost = 0, Maxcost = 200)
d = Dict(:Nb => Nb, :Ns => Ns, :Minval => Minval, :Maxval => Maxval, :Mincost => Mincost, :Maxcost => Maxcost)
for values in a
d[values] = get(d, values, 0)
end
d
end

Dict can directly take the keyword-arguments:
julia> f(;kw...) = Dict(kw)
f (generic function with 1 method)
julia> f(a=5, b=3)
Dict{Symbol, Int64} with 2 entries:
:a => 5
:b => 3

This is probably closer to what you want:
function paramvalues(a; kwargs...)
    @show kwargs
    dict = collect(kwargs)
    @show dict
    return dict
end
The magic here is that kwargs (short for keyword arguments) followed by ... "slurps" all the keyword arguments into one variable.
It is an iterator over key => value pairs. To turn an iterator into a standard collection you can use collect, which here gives a Vector of Pairs; passing the same iterator to Dict gives a dictionary instead.
Trying it out I get this:
paramvalues(123;Nb = 100, Ns = 100, Minval = 0, Maxval = 200, Mincost = 0, Maxcost = 200)
kwargs = Base.Iterators.Pairs(:Nb => 100,:Ns => 100,:Minval => 0,:Maxval => 200,:Mincost => 0,:Maxcost => 200)
dict = Pair{Symbol,Int64}[:Nb => 100, :Ns => 100, :Minval => 0, :Maxval => 200, :Mincost => 0, :Maxcost => 200]
6-element Array{Pair{Symbol,Int64},1}:
:Nb => 100
:Ns => 100
:Minval => 0
:Maxval => 200
:Mincost => 0
:Maxcost => 200
I don't know what a is supposed to be.
It sounds like your error comes from indexing a variable that holds nothing instead of a Dict.
Also, iterating over a dictionary yields key => value pairs, which you can destructure as (key, value). So use something like this to walk a dictionary:
for (k,v) in my_dict
    @show k,v
end
Also, don't forget to use ? in the Julia REPL to look these things up, e.g. ?Dict, ?for, ?enumerate, ?@show or ?collect.
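Putting the two answers together, one possible shape for paramvalues is sketched below. It assumes the positional argument a from the question is not needed (its purpose is unclear) and that callers should get the question's defaults unless they override them. DEFAULTS is a hypothetical name introduced only for this sketch.

```julia
# Hypothetical defaults table, copied from the keyword defaults in the question.
const DEFAULTS = Dict{Symbol, Any}(
    :Nb => 100, :Ns => 100,
    :Minval => 0, :Maxval => 200,
    :Mincost => 0, :Maxcost => 200,
)

# Keywords passed by the caller override (or extend) the defaults,
# because merge lets entries from later dictionaries win.
paramvalues(; kwargs...) = merge(DEFAULTS, Dict{Symbol, Any}(kwargs))

params = paramvalues(Nb = 50)
```

With this, params[:Nb] is the overridden value and every other key keeps its default. merge returns a new Dict, so DEFAULTS itself is never modified.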

Proper way to merge nested values within maps?

Given the following:
M1 = #{ "Robert" => #{"Scott" => #{}} },
M2 = #{ "Robert" => #{"Adams" => #{}} }
Merged should be:
M3 = #{ "Robert" => #{ "Scott" => #{}, "Adams" => #{} } }
Now if we merge in the following:
M4 = #{ "William" => #{ "Robert" => #{ "Scott" => #{} }}}
M5 = #{ "William" => #{ "Robert" => #{ "Fitzgerald" => #{} }}}
We should get the following:
M6 = #{ "Robert" => #{ "Scott" => #{}, "Adams" => #{} },
"William" => #{ "Robert" => #{ "Fitzgerald" => #{}, "Scott" => #{} }}}
I had the idea of iterating over each level's keys: check whether they are the same and merge the maps if not; check whether the value is a map, and if it is not, stop and merge, otherwise have the function call itself again. The problem I'm having is that the function keeps crashing. Is there a better way to do this?
This is the code I have so far:
merger(M1, M2) ->
M1_Keys = maps:keys(M1),
M2_Keys = maps:keys(M2),
do_merge(M1, M2, M1_Keys).
do_merge(M1, M2, [Head|Tail]) ->
Check = check_if_same(M1, M2),
io:fwrite("Check is: ~p\n", [Check]),
case Check of
{ok, true} ->
io:fwrite("true\n");
{ok, false} ->
io:fwrite("false\n")
end,
do_merge(M1, M2, Tail);
% P1 = maps:get(Head, M1),
% P2 = maps:get(Head, M2),
% P3 = maps:merge(P1, P2),
% M4 = maps:update(Head, P3, M1),
% io:fwrite("~p \n", [M4]),
% do_merge(M1, M2, Tail);
do_merge(M1, M2, []) ->
ok.
check_if_same(M1, M2) ->
{ok, lists:sort( maps:keys(M1) ) == lists:sort( maps:keys(M2) )}.
However, it crashes with the following error:
$erlc *.erl
helloworld.erl:10: Warning: variable 'M2_Keys' is unused
helloworld.erl:13: Warning: variable 'Head' is unused
helloworld.erl:30: Warning: variable 'M1' is unused
helloworld.erl:30: Warning: variable 'M2' is unused
$erl -noshell -s helloworld start -s init stop
Check is: {ok,true}
true
{"init terminating in do_boot",{{badmap,ok},[{maps,keys,[ok],[]},{helloworld,merger,2,[{file,"helloworld.erl"},{line,10}]},{init,start_em,1,[]},{init,do_boot,3,[]}]}}
init terminating in do_boot ()
Crash dump is being written to: erl_crash.dump...done

As I answered in a previous post, I can't see why you get this result; I need more information about how you start the shell, what command you type, and the complete output.
Unfortunately, I don't have enough time to go into detail and comment on your code, so here is code that does what you want; if I can, I'll add comments later:
-module (merger).
-compile(export_all).
% yourType = maps(Key :: term() => Value :: yourType()) | #{}.
% merge operation:
% get all keys from 2 inputs
% if a key belongs to one input only, insert key => value in the result
% if a key belongs to 2 inputs, insert key => merge(Value1,value2) in the result
%
% lets write this
merger(M1, M2) ->
Keys = lists:usort(maps:keys(M1) ++ maps:keys(M2)), % will produce a list containing all the keys without repetition
lists:foldl(fun(Key,Acc) -> do_merge(Key,M1,M2,Acc) end,#{},Keys).
do_merge(Key, M1, M2, Acc) ->
case {maps:is_key(Key, M1),maps:is_key(Key, M2)} of
{true, true} ->
maps:put(Key, merger(maps:get(Key, M1),maps:get(Key, M2)), Acc);
{true, false} ->
maps:put(Key,maps:get(Key, M1),Acc);
{false, true} ->
maps:put(Key,maps:get(Key, M2),Acc)
end.
test() ->
R1 = merger(#{ "Robert" => #{"Scott" => #{}} },#{ "Robert" => #{"Adams" => #{}} }),
R2 = merger(R1,#{ "William" => #{ "Robert" => #{ "Scott" => #{} }}}),
merger(R2,#{ "William" => #{ "Robert" => #{ "Fitzgerald" => #{} }}}).
Which gives in the shell:
1> c(merger).
merger.erl:3: Warning: export_all flag enabled - all functions will be exported
{ok,merger}
2> merger:test().
#{"Robert" => #{"Adams" => #{},"Scott" => #{}},
"William" =>
#{"Robert" => #{"Fitzgerald" => #{},"Scott" => #{}}}}
3>
[EDIT]
Here is a commented version with 2 methods for the merge
-module (merger).
-compile(export_all).
% yourType = maps(Key :: term() => Value :: yourType()) | #{}.
% This first version sticks to the description in natural language
% merge operation:
% get all keys from 2 inputs
% if a key belongs to one input only, insert key => value in the result
% if a key belongs to 2 inputs, insert key => merge(Value1,value2) in the result
%
% let's write this
merger(M1, M2) ->
Keys = lists:usort(maps:keys(M1) ++ maps:keys(M2)), % will produce a list containing all the keys without repetition
lists:foldl(fun(Key,Acc) -> do_merge(Key,M1,M2,Acc) end,#{},Keys).
% will execute the do_merge function for each element in the Keys list and accumulate the result in Acc.
% The initial value of the accumulator is set to #{}
% https://erlang.org/doc/man/lists.html#foldl-3
% This function is the direct translation of the description above.
do_merge(Key, M1, M2, Acc) ->
% The case statement returns the result of the matching case.
case {maps:is_key(Key, M1),maps:is_key(Key, M2)} of
{true, true} ->
maps:put(Key, merger(maps:get(Key, M1),maps:get(Key, M2)), Acc);
{true, false} ->
maps:put(Key,maps:get(Key, M1),Acc);
{false, true} ->
maps:put(Key,maps:get(Key, M2),Acc)
end.
% the previous algorithm does a lot of useless operations: extract and combine the key lists, unique sort
% and uses 3 maps to build the result.
% a more efficient method is to break the symmetry of M1 and M2, and consider that you merge M2 into M1,
% so M1 is the initial value of the algorithm.
% then, rather than extract the keys from M2, it is more direct to use the maps:foldl function.
% https://erlang.org/doc/man/maps.html#fold-3
% now the merge operation is :
% insert {key, Value} in the accumulator.
% If the key already exists in the accumulator, then the new value is the merge of the accumulator value and of the parameter value,
% If not then simply put Key,Value in the accumulator
% fold will call do_merge2 with each Key and Value from M2, the result of previous operations
% and the Value for Key in the accumulator (undefined if Key does not exist in the accumulator).
% The initial value is M1.
merger2(M1,M2) ->
maps:fold(fun(Key,Value,AccIn) -> do_merge2(Key,Value,AccIn,maps:get(Key,AccIn,undefined)) end, M1, M2).
% To the parameters I have added the result of maps:get/3, which returns either the Value if the key exists
% or a default value (here: undefined) if it does not. This allows pattern matching (the more Erlang way) rather than a case or if statement.
do_merge2(Key,Value,Acc,undefined) ->
% the Key was not present in ACC
maps:put(Key, Value, Acc);
do_merge2(Key,Value1,Acc,Value2) ->
% the Key was present in ACC associated to Value2
maps:put(Key,merger2(Value1,Value2),Acc).
% The nice thing is now the whole code needs only 3 function declarations containing 1 line of code each.
% It is pretty neat, maybe less easy to start with.
% For the test, I now pass the merger function name to use as a parameter
test(Merger) ->
R1 = Merger(#{ "Robert" => #{"Scott" => #{}} },#{ "Robert" => #{"Adams" => #{}}}),
R2 = Merger(R1,#{ "William" => #{ "Robert" => #{ "Scott" => #{}}}}),
Merger(R2,#{ "William" => #{ "Robert" => #{ "Fitzgerald" => #{}}}}).
test1() ->
io:format("using merger :~n~p~n~n",[test(fun merger:merger/2)]),
io:format("using merger2 :~n~p~n~n",[test(fun merger:merger2/2)]).
In the shell, it gives:
$ erl
Erlang/OTP 22 [erts-10.6] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1]
Eshell V10.6 (abort with ^G)
1> c(merger).
merger.erl:3: Warning: export_all flag enabled - all functions will be exported
{ok,merger}
2> merger:test(fun merger:merger/2).
#{"Robert" => #{"Adams" => #{},"Scott" => #{}},
"William" =>
#{"Robert" => #{"Fitzgerald" => #{},"Scott" => #{}}}}
3> merger:test(fun merger:merger2/2).
#{"Robert" => #{"Adams" => #{},"Scott" => #{}},
"William" =>
#{"Robert" => #{"Fitzgerald" => #{},"Scott" => #{}}}}
4>
or invoked from PowerShell window:
PS C:\git\test_area\src> erlc merger.erl
merger.erl:3: Warning: export_all flag enabled - all functions will be exported
PS C:\git\test_area\src> erl -noshell -s merger test1 -s init stop
using merger :
#{"Robert" => #{"Adams" => #{},"Scott" => #{}},
"William" => #{"Robert" => #{"Fitzgerald" => #{},"Scott" => #{}}}}
using merger2 :
#{"Robert" => #{"Adams" => #{},"Scott" => #{}},
"William" => #{"Robert" => #{"Fitzgerald" => #{},"Scott" => #{}}}}
PS C:\git\test_area\src>
As for why you get a crash dump, I have to guess (you do not provide the start function :o). I think you run a test like mine, which combines several evaluations. The problem in that case is that at the end of the recursion for the first evaluation (R1 = Merger(#{ "Robert" => #{"Scott" => #{}} },#{ "Robert" => #{"Adams" => #{}}}) in my case), you get the return value ok (do_merge(M1, M2, []) -> ok in your code). This result is then reused for the next evaluation, and the program fails when it invokes maps:keys(ok), reporting a badmap: ok.

Your do_merge always returns ok (the base case of the recursion).
Here are two solutions; the first one is more readable, but I'd go with the second one:
deep_map_merge(M1, M2) when is_map(M1), is_map(M2) ->
% Merge both as if they had no common keys
FlatMerge = maps:merge(M1, M2),
% Get common keys (This is O(N^2), there are better ways)
CommonKeys = [K || K <- maps:keys(M1), K2 <- maps:keys(M2), K == K2],
% Update the merged map with the merge of the common keys
lists:foldl(fun(K, MergeAcc) ->
MergeAcc#{K => deep_map_merge(maps:get(K, M1), maps:get(K, M2))}
end, FlatMerge, CommonKeys);
deep_map_merge(_, Override) ->
Override.
deep_map_merge2(M1, M2) when is_map(M1), is_map(M2) ->
maps:fold(fun(K, V2, Acc) ->
case Acc of
#{K := V1} ->
Acc#{K => deep_map_merge2(V1, V2)};
_ ->
Acc#{K => V2}
end
end, M1, M2);
deep_map_merge2(_, Override) ->
Override.

Unpack dict entries inside a function

I want to unpack parameters that are stored in a dictionary. They should be available inside the local scope of a function afterwards. The name should be the same as the key which is a symbol.
macro unpack_dict()
code = :()
for (k,v) in dict
ex = :($k = $v)
code = quote
$code
$ex
end
end
return esc(code)
end
function assign_parameters(dict::Dict{Symbol, T}) where T<:Any
@unpack_dict
return a + b - c
end
dict = Dict(:a => 1,
:b => 5,
:c => 6)
assign_parameters(dict)
However, this code throws:
LoadError: UndefVarError: dict not defined
If I define the dictionary before the macro, it works because the dictionary is already defined.
Does anyone have an idea how to solve this? Using eval() works, but it evaluates in the global scope, which I want to avoid.

If you want to unpack them then the best method is to simply unpack them directly:
function actual_fun(d)
a = d[:a]
b = d[:b]
c = d[:c]
a+b+c
end
This will be type stable, relatively fast and readable.
You could, for instance, do something like this (I present you two options to avoid direct assignment to a, b, and c variables):
called_fun(d) = helper(;d...)
helper(;kw...) = actual_fun(;values(kw)...)
actual_fun(;a,b,c, kw...) = a+b+c
function called_fun2(d::Dict{T,S}) where {T,S}
actual_fun(;NamedTuple{Tuple(keys(d)), NTuple{length(d), S}}(values(d))...)
end
and now you can write something like:
julia> d = Dict(:a=>1, :b=>2, :c=>3, :d=>4)
Dict{Symbol,Int64} with 4 entries:
:a => 1
:b => 2
:d => 4
:c => 3
julia> called_fun(d)
6
julia> called_fun2(d)
6
But I would not recommend it - it is not type stable and not very readable.
AFAICT, other possibilities will have similar shortcomings, since at compile time Julia knows only the types of variables, not their values.
EDIT: You can do something like this:
function unpack_dict(dict)
ex = :()
for (k,v) in dict
ex = :($ex; $k = $v)
end
return :(myfun() = ($ex; a+b+c))
end
runner(d) = eval(unpack_dict(d))
and then run:
julia> d = Dict(:a=>1, :b=>2, :c=>3, :d=>4)
Dict{Symbol,Int64} with 4 entries:
:a => 1
:b => 2
:d => 4
:c => 3
julia> runner(d)
myfun (generic function with 1 method)
julia> myfun()
6
but again - I feel this is a bit messy.

In Julia 1.7, you can simply unpack named tuples into the local scope, and you can easily "spread" a dict into a named tuple.
julia> dict = Dict(:a => 1, :b => 5, :c => 6)
Dict{Symbol, Int64} with 3 entries:
:a => 1
:b => 5
:c => 6
julia> (; a, b, c) = (; sort(dict)...)
(a = 1, b = 5, c = 6)
julia> a, b, c
(1, 5, 6)
(The dict keys are sorted so that the named tuple produced is type-stable; if the keys were produced in arbitrary order then this would result in a named tuple with fields in arbitrary order as well.)
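Applied to the function from the question, the same idea might look like the sketch below. It assumes Julia 1.7 or later for the (; a, b, c) destructuring syntax. It skips the sort call, because property destructuring looks fields up by name and so does not depend on the order of the keys.

```julia
function assign_parameters(dict::Dict{Symbol})
    # Splat the Dict into a NamedTuple, then pull out the fields by name.
    # Extra keys in the Dict are simply ignored; a missing key raises an error.
    (; a, b, c) = (; dict...)
    return a + b - c
end

result = assign_parameters(Dict(:a => 1, :b => 5, :c => 6))
```

The names a, b and c exist only inside the function body, so nothing leaks into global scope and no eval is needed. The names still have to be written out in the source, which is the same compile-time limitation the earlier answer describes.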

Flow: annotation of a function variable

What is the proper Flow type annotation for the variable
const a = x => x
using generics?
const a: (<T> T => T) = x => x
is failing.

Flow implicitly types the expression x => x as (mixed) => mixed (according to flow type-at-pos). Thus the problem with
const a: (<T> T => T) = x => x // failing
is that the type of the right-hand side ((mixed) => mixed) doesn't match the type of the left-hand side (<T> (T) => T).
A possible solution is to set the type of the right-hand side explicitly:
const a: (<T> (T) => T) = <U> (x: U): U => x
If generics are not required, the annotated definition of a could look like:
const a: (mixed) => mixed = x => x
