Efficiently convert Dict to NamedTuple in Julia

I would like to have an interface that accepts a Dict or a NamedTuple as input, but then always converts the input to a NamedTuple.
Given a Dict
julia> dd = Dict(:a => 1, :b => 2)
Dict{Symbol,Int64} with 2 entries:
:a => 1
:b => 2
I can convert it to a NamedTuple with
julia> (; dd...)
(a = 1, b = 2)
However, this both allocates a surprising (to me) amount of memory
julia> using BenchmarkTools
julia> @btime (; $dd...);
1.033 μs (12 allocations: 896 bytes)
And it does not apply to nested Dicts, which I would like to convert to nested NamedTuples
julia> dd_nested = Dict(:a => 1, :b => Dict(:x => 3, :y => 4))
Dict{Symbol,Any} with 2 entries:
:a => 1
:b => Dict(:y=>4,:x=>3)
julia> (; dd_nested...)
(a = 1, b = Dict(:y => 4,:x => 3))
where the desired output is equal to
julia> (a = 1, b = (x = 3, y = 4))
(a = 1, b = (x = 3, y = 4))

What about:
unzip(d::Dict) = (;(p.first => unzip(p.second) for p in d)...)
unzip(d) = d
Sample test (on the nested Dict from the question):
julia> unzip(dd_nested)
(a = 1, b = (y = 4, x = 3))
Regarding memory allocations, NamedTupleTools.jl seems to have a slightly smaller memory footprint. In either case you are building quite a bit of data structure here, so most likely you will not be able to do it much cheaper.
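To cover the interface from the question (accept a Dict or a NamedTuple, always return a NamedTuple), here is a minimal sketch using multiple dispatch. The name to_namedtuple is hypothetical, and sorting the keys is an assumption made only so that the field order does not depend on Dict iteration order:

```julia
# Hypothetical helper: normalise a Dict or NamedTuple (possibly nested)
# into a NamedTuple. Anything else is returned unchanged.
to_namedtuple(x) = x

# NamedTuples are kept, but their values are converted recursively.
to_namedtuple(nt::NamedTuple) = map(to_namedtuple, nt)

function to_namedtuple(d::AbstractDict)
    # Assumption: keys are sortable (e.g. Symbols or Strings); sorting
    # makes the resulting field order deterministic.
    ks = sort!(collect(keys(d)))
    vals = Tuple(to_namedtuple(d[k]) for k in ks)
    return NamedTuple{Tuple(Symbol.(ks))}(vals)
end

dd_nested = Dict(:a => 1, :b => Dict(:x => 3, :y => 4))
nt = to_namedtuple(dd_nested)
```

With the sorted keys, nt.b.x and nt.b.y are reachable as plain fields. This sketch makes no claim about being faster than the splatting version above.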

Related

Subdimensional array from multidimensional array in Julia

With NumPy I can access a subdimensional array from a multidimensional array without knowing the dimension of the original array:
import numpy as np
a = np.zeros((2, 3, 4)) # A 2-by-3-by-4 array of zeros
a[0] # A 3-by-4 array of zeros
but with Julia I am at a loss. It seems that I must know the dimension of a to do this:
a = zeros(2, 3, 4) # A 2-by-3-by-4 array of zeros
a[1, :, :] # A 3-by-4 array of zeros
What should I do if I don't know the dimension of a?
selectdim gives a view of what you are looking for,
a = zeros(2, 3, 4)
selectdim(a,1,1)
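A small sketch of how this behaves for arrays of different dimensionality. The helper name first_slice is hypothetical; selectdim, ndims and Colon are from Base:

```julia
# Hypothetical helper: first slice along dimension 1, whatever ndims(a) is.
first_slice(a::AbstractArray) = selectdim(a, 1, 1)

a = zeros(2, 3, 4)
s = first_slice(a)        # a view with size (3, 4)

# The view shares memory with `a`: writing to it changes `a`.
s[1, 1] = 5.0

# If an independent copy is wanted, index with one Colon per trailing
# dimension (or simply call copy(s)).
c = a[1, ntuple(_ -> Colon(), ndims(a) - 1)...]
```

Because selectdim returns a view, it avoids copying; whether that or an explicit copy is preferable depends on whether the slice should track later changes to a.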
If you want to iterate over each "subdimensional array" in order, you can also use eachslice:
julia> a = reshape(1:24, (2, 3, 4));
julia> eachslice(a, dims = 1) |> first
3×4 view(reshape(::UnitRange{Int64}, 2, 3, 4), 1, :, :) with eltype Int64:
1 7 13 19
3 9 15 21
5 11 17 23
julia> for a2dims in eachslice(a, dims = 1)
           @show size(a2dims)
       end
size(a2dims) = (3, 4)
size(a2dims) = (3, 4)

Using Iterators.product on a variable number of lists

I'm trying to make a function that uses Iterators.product() on a variable number of arrays.
This is my code:
function make_feature_space(dictionary, k)
    dict_chars = collect(dictionary)
    #Change bottom line
    dict_product = collect(Iterators.product(dict_chars))
    return dict_product
end
The behavior I'd like would be something along the lines of calling make_feature_space(dictionary, 3) would return Iterators.product(dict_chars, dict_chars, dict_chars), but calling make_feature_space(dictionary, 2) would get Iterators.product(dict_chars, dict_chars).
Thanks!!
Here's a solution that uses Iterators.repeated and splatting:
using Base.Iterators
make_feature_space(x, n) = product(repeated(x, n)...)
Here it is in action:
julia> x = 1:2;
julia> make_feature_space(x, 2) |> collect
2×2 Array{Tuple{Int64,Int64},2}:
(1, 1) (1, 2)
(2, 1) (2, 2)
julia> make_feature_space(x, 3) |> collect
2×2×2 Array{Tuple{Int64,Int64,Int64},3}:
[:, :, 1] =
(1, 1, 1) (1, 2, 1)
(2, 1, 1) (2, 2, 1)
[:, :, 2] =
(1, 1, 2) (1, 2, 2)
(2, 1, 2) (2, 2, 2)
Note that this implementation can use any iterator in the first argument of make_feature_space. Further note that a dictionary is an iterator of pairs, so you can do this:
julia> d = Dict(:a => 1, :b => 2)
Dict{Symbol,Int64} with 2 entries:
:a => 1
:b => 2
julia> make_feature_space(d, 2) |> collect
2×2 Array{Tuple{Pair{Symbol,Int64},Pair{Symbol,Int64}},2}:
(:a=>1, :a=>1) (:a=>1, :b=>2)
(:b=>2, :a=>1) (:b=>2, :b=>2)
Though it's not clear from your question if you're looking for a product over the keys, values, or pairs of the dictionary.
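If the product is meant to run over the keys or the values only, the same function can be reused, since keys(d) and values(d) are iterators as well. A short sketch (the variable names key_space and value_space are only illustrative):

```julia
using Base.Iterators: product, repeated

make_feature_space(x, n) = product(repeated(x, n)...)

d = Dict(:a => 1, :b => 2)

# Product over the keys only, and over the values only.
key_space = make_feature_space(keys(d), 2)
value_space = make_feature_space(values(d), 2)

# Each is lazy; collect materialises a 2×2 array of tuples.
collect(key_space)
```

The order of the elements follows the Dict's iteration order, which should not be relied upon.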

How to broadcast vectors (lists) into tuples in Julia?

Is there a generator/iterator function that will turn
a = [1,2]
b = [3,4]
into [(1,3),(2,4)] and
a = 1
b = [3,4]
into [(1,3),(1,4)] using the same expression?
Is there a similar way to create a named tuple such as [(a=1,b=3),(a=1,b=4)]?
You can use broadcasting with Julia's dot syntax for this:
julia> tuple.(a, b)
2-element Array{Tuple{Int64,Int64},1}:
(1, 3)
(2, 4)
tuple here is a function that just creates a tuple from its arguments.
For NamedTuples you can either call the lower-level constructor directly on tuples with
julia> NamedTuple{(:a, :b)}.(tuple.(a, b))
2-element Array{NamedTuple{(:a, :b),Tuple{Int64,Int64}},1}:
(a = 1, b = 3)
(a = 2, b = 4)
where :a and :b are the sorted key names, or equivalently, using an anonymous function:
julia> broadcast((a_i, b_i) -> (a=a_i, b=b_i), a, b)
2-element Array{NamedTuple{(:a, :b),Tuple{Int64,Int64}},1}:
(a = 1, b = 3)
(a = 2, b = 4)
Hope that helps!
Just broadcast the tuple function.
julia> a = [1,2]; b=[3,4];
julia> tuple.(a,b)
2-element Array{Tuple{Int64,Int64},1}:
(1, 3)
(2, 4)
julia> tuple.(1, b)
2-element Array{Tuple{Int64,Int64},1}:
(1, 3)
(1, 4)
Second question - broadcast the constructor:
julia> NamedTuple{(:a, :b)}.(tuple.(1, b))
2-element Array{NamedTuple{(:a, :b),Tuple{Int64,Int64}},1}:
(a = 1, b = 3)
(a = 1, b = 4)
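To stress the "same expression" part of the question, the broadcast can be wrapped in one function that handles both the vector/vector and the scalar/vector case. The name as_rows is hypothetical:

```julia
# Hypothetical wrapper: one expression for both input shapes.
# Broadcasting expands a scalar `a` against a vector `b` automatically.
as_rows(a, b) = NamedTuple{(:a, :b)}.(tuple.(a, b))

rows_vv = as_rows([1, 2], [3, 4])   # vector with vector
rows_sv = as_rows(1, [3, 4])        # scalar with vector
```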

Julia: All possible sums of `n` entries of a Vector with unique integers, (with repetition)

Let's say I have a vector of unique integers, for example [1, 2, 6, 4] (sorting doesn't really matter).
Given some n, I want to get all possible values of summing n elements of the set, including summing an element with itself. It is important that the list I get is exhaustive.
For example, for n = 1 I get the original set.
For n = 2 I should get all values of summing 1 with all other elements, 2 with all others etc. Some kind of memory is also required, in the sense that I have to know from which entries of the original set did the sum I am facing come from.
For a given, specific n, I know how to solve the problem. I want a concise way of being able to solve it for any n.
EDIT: This question is for Julia 0.7 and above...
This is a typical task where you can use a dictionary in a recursive function (I am annotating types for clarity):
function nsum!(x::Vector{Int}, n::Int, d=Dict{Int,Set{Vector{Int}}}(),
               prefix::Vector{Int}=Int[])
    if n == 1
        for v in x
            seq = [prefix; v]
            s = sum(seq)
            if haskey(d, s)
                push!(d[s], sort!(seq))
            else
                d[s] = Set([sort!(seq)])
            end
        end
    else
        for v in x
            nsum!(x, n-1, d, [prefix; v])
        end
    end
end
function genres(x::Vector{Int}, n::Int)
    n < 1 && error("n must be positive")
    d = Dict{Int, Set{Vector{Int}}}()
    nsum!(x, n, d)
    d
end
Now you can use it e.g.
julia> genres([1, 2, 4, 6], 3)
Dict{Int64,Set{Array{Int64,1}}} with 14 entries:
16 => Set(Array{Int64,1}[[4, 6, 6]])
11 => Set(Array{Int64,1}[[1, 4, 6]])
7 => Set(Array{Int64,1}[[1, 2, 4]])
9 => Set(Array{Int64,1}[[1, 4, 4], [1, 2, 6]])
10 => Set(Array{Int64,1}[[2, 4, 4], [2, 2, 6]])
8 => Set(Array{Int64,1}[[2, 2, 4], [1, 1, 6]])
6 => Set(Array{Int64,1}[[2, 2, 2], [1, 1, 4]])
4 => Set(Array{Int64,1}[[1, 1, 2]])
3 => Set(Array{Int64,1}[[1, 1, 1]])
5 => Set(Array{Int64,1}[[1, 2, 2]])
13 => Set(Array{Int64,1}[[1, 6, 6]])
14 => Set(Array{Int64,1}[[4, 4, 6], [2, 6, 6]])
12 => Set(Array{Int64,1}[[4, 4, 4], [2, 4, 6]])
18 => Set(Array{Int64,1}[[6, 6, 6]])
EDIT: In the code I use sort! and Set to avoid duplicate entries (remove them if you want duplicates). Also you could keep track how far in the index on vector x in the loop you reached in outer recursive calls to avoid generating duplicates at all, which would speed up the procedure.
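The index-tracking idea from the EDIT could look roughly like the sketch below. It is an illustration rather than a tuned implementation; the names nsum_nodup! and genres_nodup are hypothetical. Each recursive call only considers elements at or after the index chosen by its caller, so every combination is generated exactly once and no sort! or Set is needed:

```julia
# Sketch: generate each multiset of n elements once by never stepping
# back to an earlier index of x.
function nsum_nodup!(d, x::Vector{Int}, n::Int, start::Int = 1,
                     prefix::Vector{Int} = Int[])
    if n == 0
        # prefix is complete: file a copy of it under its sum.
        bucket = get!(() -> Vector{Int}[], d, sum(prefix))
        push!(bucket, copy(prefix))
        return d
    end
    for i in start:length(x)
        push!(prefix, x[i])
        nsum_nodup!(d, x, n - 1, i, prefix)  # `i`, not `i + 1`: repetition allowed
        pop!(prefix)
    end
    return d
end

function genres_nodup(x::Vector{Int}, n::Int)
    n < 1 && error("n must be positive")
    return nsum_nodup!(Dict{Int,Vector{Vector{Int}}}(), x, n)
end

r = genres_nodup([1, 2, 4, 6], 3)
```

For the example input this visits 20 combinations instead of the 64 ordered sequences that the version above generates and then deduplicates.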
I want a concise way of being able to solve it for any n.
Here is a concise solution using IterTools.jl:
Julia 0.6
using IterTools
n = 3
summands = [1, 2, 6, 4]
myresult = map(x -> (sum(x), x), reduce((x1, x2) -> vcat(x1, collect(product(fill(summands, x2)...))), [], 1:n))
(IterTools.jl is required for product())
Julia 0.7
using Base.Iterators
n = 3
summands = [1, 2, 6, 4]
map(x -> (sum(x), x), reduce((x1, x2) -> vcat(x1, vec(collect(product(fill(summands, x2)...)))), 1:n; init = Vector{Tuple{Int, NTuple{n, Int}}}[]))
(In Julia 0.7, the neutral element is no longer the 2nd positional argument of reduce; it is passed as the init keyword argument.)
How does this work?
Let's indent the one-liner (using the Julia 0.7 version, the idea is the same for the Julia 0.6 version):
map(
    # Map the possible combinations of `1:n` entries of `summands` to a tuple containing their sum and the summands used.
    x -> (sum(x), x),
    # Generate all possible combinations of `1:n` summands of `summands`.
    reduce(
        # Concatenate previously generated combinations with the new ones
        (x1, x2) -> vcat(
            x1,
            vec(
                collect(
                    # Cartesian product of all arguments.
                    product(
                        # Use `summands` for `x2` arguments.
                        fill(
                            summands,
                            x2)...)))),
        # Specify for what lengths we want to generate combinations.
        1:n;
        # Neutral element (empty array).
        init = Vector{Tuple{Int, NTuple{n, Int}}}[]))
Julia 0.6
This is really just to get a free critique from the experts as to why my method is inferior to theirs!
using Combinatorics, BenchmarkTools
function nsum(a::Vector{Int}, n::Int)::Vector{Tuple{Int, Vector{Int}}}
    r = Vector{Tuple{Int, Vector{Int}}}()
    s = with_replacement_combinations(a, n)
    for i in s
        push!(r, (sum(i), i))
    end
    return sort!(r, by = x -> x[1])
end
@btime nsum([1, 2, 6, 4], 3)
It runs in circa 4.154 μs on my 1.8 GHz processor for n = 3. It produces a sorted array showing the sum (which may appear more than once) and how it is made up (which is unique to each instance of the sum).

Does there exist an iterative implementation of fibonacci that is not a tabulation?

All the iterative implementations that I have found seem to necessarily be using a tabulation method.
Is dynamic programming an alternative only to recursion, while it is a mandatory solution for iteration?
An iterative implementation of fibonacci that does not use tabulation, memoization, or a swap variable – this code from my answer here on getting hung-up learning functional style.
const append = (xs, x) =>
  xs.concat ([x])
const fibseq = n => {
  let seq = []
  let a = 0
  let b = 1
  while (n >= 0) {
    n = n - 1
    seq = append (seq, a)
    a = a + b
    b = a - b
  }
  return seq
}
console.log (fibseq (500))
// [ 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ... ]
And remember, procedure differs from process, i.e. recursive procedures can spawn iterative processes. Below, even though fibseq is defined recursively, the process that it spawns is iterative; again, no tabulation, memoization, or any other made-up terms going on
const recur = (...values) =>
  ({ type: recur, values })
const loop = f =>
{
  let acc = f ()
  while (acc && acc.type === recur)
    acc = f (...acc.values)
  return acc
}
const fibseq = x =>
  loop ((n = x, seq = [], a = 0, b = 1) =>
    n === 0
      ? seq.concat ([a])
      : recur (n - 1, seq.concat ([a]), a + b, a))
console.time ('loop/recur')
console.log (fibseq (500))
console.timeEnd ('loop/recur')
// [ 0,
// 1,
// 1,
// 2,
// 3,
// 5,
// 8,
// 13,
// 21,
// 34,
// ... 490 more items ]
// loop/recur: 5ms
There is a definition of Fibonacci numbers as a sum of binomial coefficients, which themselves can be calculated iteratively, as a representation of all the compositions of (n-1) from 1s and 2s.
In Haskell, we could write:
-- https://rosettacode.org/wiki/Evaluate_binomial_coefficients#Haskell
Prelude> choose n k = product [k+1..n] `div` product [1..n-k]
Prelude> fib n = sum [(n-k-1) `choose` k | k <- [0..(n-1) `div` 2]]
Prelude> fib 100
354224848179261915075
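For comparison with the Julia questions above, here is a rough Julia sketch of both ideas: a loop that keeps only two running values (no table), and the binomial-sum formula from the Haskell snippet. The function names fib_iter and fib_binom are hypothetical, and fib_binom is only meant for n ≥ 1:

```julia
# Iterative Fibonacci with two running values; nothing is tabulated.
function fib_iter(n::Integer)
    a, b = big(0), big(1)
    for _ in 1:n
        a, b = b, a + b
    end
    return a
end

# Binomial-sum form, mirroring the Haskell definition above.
# Assumption: n >= 1 (the range would need special handling for n == 0).
fib_binom(n::Integer) = sum(binomial(big(n - k - 1), k) for k in 0:div(n - 1, 2))

fib_iter(100)
fib_binom(100)
```

BigInt is used so that values such as the 100th Fibonacci number do not overflow Int64.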
