Are values and keys guaranteed to be in a consistent order? - dictionary

When applied to a Dict, will values(...) and keys(...) return items in matching order?
In other words, is zip(keys(d), values(d)) guaranteed to contain precisely the key-value pairs of the dictionary d?

Option 1
The current Julia source code indicates that the keys and vals of a Dict() object are stored as Array objects, which are ordered. Thus, you could just use values() and keys() separately, as in your question formulation. But, it is dangerous to rely on under the hood implementation details that aren't documented, since they might be changed without notice.
Option 2
An OrderedDict from the DataStructures package (along with the functions values() and keys()) is probably the simplest and safest way to be certain of consistent ordering. It works fine even if you don't specifically need the insertion ordering it provides.
Option 3
If you don't want to deal with the added hassle of installing and loading the DataStructures package, you could just use Julia's built in syntax for handling this kind of thing, e.g.
Mydict = Dict("a" => 1, "b" => 2, "c" => 1)
a = [(key, val) for (key, val) in Mydict]
The use of zip() as given in the question formulation just adds complexity and risk in this situation.
If you want the entities separate, you could then use:
Keys = [key for (key, val) in Mydict]
Values = [val for (key, val) in Mydict]
or just refer to a[idx][1] for the idx-th element of Keys when you need it.

Currently your assertion seems to be true:
julia> let
           d = Dict(i => i^2 for i in 1:10_000)
           z = zip(keys(d), values(d))
           for (pair, tupl) in zip(d, z)
               @assert pair[1] == tupl[1] && pair[2] == tupl[2]
           end
           @info "Success"
       end
[ Info: Success
But that is an undocumented implementation detail as Michael Ohlrogge explains.
Stefan Karpinski's comment on "show(dict) now sorted by key" in #16743:
This has performance implications for printing very large Dicts. I don't think it's a good idea. I do, however, think that making Dict ordered is a good idea that we should go ahead with.
See also:
#10116 WIP: try ordered Dict representation.
Most importantly, what are you trying to do? Perhaps an OrderedDict is what you need?

Yes, keys and values return items in matching order. Unless, as Dan Getz pointed out above, the dictionary is modified in between using the two iterators.
I think it would be relatively perverse for a dictionary not to have this behavior. It was obvious to us that the order should match, to the point that it didn't even occur to us to mention this explicitly in the documentation.
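A quick way to see this for a particular dictionary is to compare the zipped iterators with direct iteration. The sketch below assumes a recent Julia 1.x and uses an arbitrary example dictionary d; it is a spot check on one dictionary, not a proof of the guarantee.

```julia
d = Dict("a" => 1, "b" => 2, "c" => 1)

# Pair up keys and values taken from the two separate iterators.
zipped = collect(zip(keys(d), values(d)))

# Iterate the dictionary directly and collect (key, value) tuples.
direct = [(k, v) for (k, v) in d]

# Both traversals should agree position by position,
# as long as `d` is not modified in between.
@assert zipped == direct

# Rebuilding a dictionary from the zipped pairs gives back the original.
@assert Dict(zipped) == d
```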

Another way to ensure corresponding order between keys and values is using imap from the Iterators package in the following way:
using Iterators
d = Dict(1=>'a',2=>'b',3=>'c')
# keys iterator is `imap(first,d)`
julia> collect(imap(first,d))
3-element Array{Any,1}:
2
3
1
# values iterator is `imap(last,d)`
julia> collect(imap(last,d))
3-element Array{Any,1}:
'b'
'c'
'a'
This method can potentially be adapted for other structures. All the other comments and answers are also good.
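On newer Julia versions the separate Iterators package is not needed for this. A minimal sketch, assuming Julia 1.6 or later, where Iterators.map is available from Base:

```julia
d = Dict(1 => 'a', 2 => 'b', 3 => 'c')

# Lazily project each key => value pair onto its key or its value.
ks = collect(Iterators.map(first, d))
vs = collect(Iterators.map(last, d))

# Both projections come from iterating `d` itself,
# so position i of `ks` and `vs` belongs to the same pair.
@assert all(d[k] == v for (k, v) in zip(ks, vs))
```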

Related

How to set dictionary values to zero in Julia

I would like to set all the values of a dictionary to zero. In Python you can use the .fromkeys() function to do this (see this SO question). How can this be done in Julia?
The replace function comes to mind. It takes a function as first argument that takes a pair as argument and returns the modified pair:
d_new = replace(kv -> kv[1] => 0, d)
Here, each pair kv from d is replaced by kv[1] => 0, where kv[1] is the key and 0 the new associated value.
Note that the mutating variant replace! is also available for in-place replacement.
EDIT: another possibility when the dictionary has to be mutated is:
map!(x->0, values(d))
Not quite the answer to your question, but it may be interesting for you to consider using the Dictionaries.jl package, which provides more convenient means to work with dictionaries than Base Dict.
For example, in Dictionaries.jl notation your question can be solved as follows
using Dictionaries
d = dictionary(['a' => 1, 'b' => 2])
map(zero, d)
I found the answer to my own question. As mentioned in the Julia docs, dictionaries can be created using generators. So you can copy the keys of the original Dict and assign all the values to be 0 like so
d_new = Dict(i => 0 for i in keys(d))
See also this forum post.
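The approaches above hard-code the integer 0. If the value type should be preserved (for example Float64), one variation is to call zero on each value. This is a sketch with a made-up dictionary d; the in-place form relies on map! over values(d), which as far as I know needs Julia 1.2 or later.

```julia
d = Dict("a" => 1.5, "b" => 2.5)

# New dictionary: same keys, each value replaced by the zero of its own type.
d_new = Dict(k => zero(v) for (k, v) in d)

# In-place variant: overwrite the values of `d` itself.
map!(zero, values(d))
```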

dropping singleton dimensions in julia

Just playing around with Julia (1.0) and one thing that I need to use a lot in Python/numpy/matlab is the squeeze function to drop the singleton dimensions.
I found out that one way to do this in Julia is:
a = rand(3, 3, 1);
a = dropdims(a, dims = tuple(findall(size(a) .== 1)...))
The second line seems a bit cumbersome and not easy to read and parse instantly (this could also be my bias that I bring from other languages). However, I wonder if this is the canonical way to do this in Julia?
The actual answer to this question surprised me. What you are asking could be rephrased as:
why doesn't dropdims(a) remove all singleton dimensions?
I'm going to quote Tim Holy from the relevant issue here:
it's not possible to have squeeze(A) return a type that the compiler
can infer---the sizes of the input matrix are a runtime variable, so
there's no way for the compiler to know how many dimensions the output
will have. So it can't possibly give you the type stability you seek.
Type stability aside, there are also some other surprising implications of what you have written. For example, note that:
julia> f(a) = dropdims(a, dims = tuple(findall(size(a) .== 1)...))
f (generic function with 1 method)
julia> f(rand(1,1,1))
0-dimensional Array{Float64,0}:
0.9939103383167442
In summary, including such a method in Base Julia would encourage users to use it, resulting in potentially type-unstable code that, under some circumstances, will not be fast (something the core developers are strenuously trying to avoid). In languages like Python, rigorous type-stability is not enforced, and so you will find such functions.
Of course, nothing stops you from defining your own method as you have. And I don't think you'll find a significantly simpler way of writing it. For example, the proposition for Base that was not implemented was the method:
function squeeze(A::AbstractArray)
singleton_dims = tuple((d for d in 1:ndims(A) if size(A, d) == 1)...)
return squeeze(A, singleton_dims)
end
Just be aware of the potential implications of using it.
Let me simply add that "uncontrolled" dropdims (drop any singleton dimension) is a frequent source of bugs. For example, suppose you have some loop that asks for a data array A from some external source, and you run R = sum(A, dims=2) on it and then get rid of all singleton dimensions. But then suppose that one time out of 10000, your external source returns A for which size(A, 1) happens to be 1: boom, suddenly you're dropping more dimensions than you intended and perhaps at risk for grossly misinterpreting your data.
If you specify those dimensions manually instead (e.g., dropdims(R, dims=2)) then you are immune from bugs like these.
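The failure mode described above can be reproduced in a few lines. The helper name squeeze_all is hypothetical and only stands in for the "drop every singleton dimension" approach from the question.

```julia
# Usual case: several rows. Rare case: the source returns a single row.
A_usual = rand(3, 4)
A_rare  = rand(1, 4)

# Hypothetical helper: drop every singleton dimension (the "uncontrolled" squeeze).
squeeze_all(R) = dropdims(R, dims = Tuple(findall(size(R) .== 1)))

for A in (A_usual, A_rare)
    R = sum(A, dims = 2)               # size (3, 1) or (1, 1)
    explicit = dropdims(R, dims = 2)   # always one-dimensional
    squeezed = squeeze_all(R)          # one-dimensional, or 0-dimensional in the rare case
    println(size(A), " -> explicit: ", size(explicit), ", squeeze_all: ", size(squeezed))
end
```

With the explicit dims=2 the result keeps one dimension in both cases, while the uncontrolled version silently changes dimensionality for the single-row input.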
You can get rid of tuple in favor of a comma ,:
dropdims(a, dims = (findall(size(a) .== 1)...,))
I'm a bit surprised at Colin's revelation; surely something relying on 'reshape' is type stable? (plus, as a bonus, returns a view rather than a copy).
julia> function squeeze( A :: AbstractArray )
keepdims = Tuple(i for i in size(A) if i != 1);
return reshape( A, keepdims );
end;
julia> a = randn(2,1,3,1,4,1,5,1,6,1,7);
julia> size( squeeze(a) )
(2, 3, 4, 5, 6, 7)
No?
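One way to probe this is to ask the compiler what it infers. The sketch below repeats the squeeze definition above and uses Base.return_types, an internal reflection helper, so the exact type it prints may differ between Julia versions. Following Tim Holy's argument, the expectation is that the inferred result is not a concrete type, because the length of keepdims depends on the run-time sizes.

```julia
function squeeze(A::AbstractArray)
    keepdims = Tuple(i for i in size(A) if i != 1)
    return reshape(A, keepdims)
end

# Ask inference for the return type, given only the argument type.
inferred = Base.return_types(squeeze, (Array{Float64,3},))

# The number of output dimensions is not known from the input type alone.
println(inferred)
println(all(isconcretetype, inferred))

# The reshaped result shares memory with the input rather than copying it.
a = zeros(2, 1, 3)
b = squeeze(a)
b[1, 1] = 42.0
println(a[1, 1, 1])
```

So the reshape version does avoid a copy, but the dimensionality of its result is still decided at run time.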

How to check if a variable is scalar in julia

I would like to check if a variable is scalar in julia, such as Integer, String, Number, but not AbstractArray, Tuple, type, struct, etc. Is there a simple method to do this (i.e. isscalar(x))?
The notion of what is, or is not a scalar is under-defined without more context.
Mathematically, a scalar is defined as follows (Wikipedia):
A scalar is an element of a field which is used to define a vector space.
That is to say, you need to define a vector space, based on a field, before you can determine if something is, or is not, a scalar (relative to that vector space).
For the right vector space, tuples could be a scalar.
Of course we are not looking for a mathematically rigorous definition.
Just a pragmatic one.
Base it off what Broadcasting considers to be scalar
I suggest that the only meaningful way in which a scalar can be defined in julia is in terms of the behavior of broadcast.
As of Julia 1:
using Base.Broadcast
isscalar(x::T) where T = isscalar(T)
isscalar(::Type{T}) where T = BroadcastStyle(T) isa Broadcast.DefaultArrayStyle{0}
See the docs for Broadcast.
In julia 0.7, Scalar is the default. So it is basically anything that doesn't have specific broadcasting behavior, i.e. it knocks out things like array and tuples etc.:
using Base.Broadcast
isscalar(x::T) where T = isscalar(T)
isscalar(::Type{T}) where T = BroadcastStyle(T) isa Broadcast.Scalar
In julia 0.6 this is a bit more messy, but similar:
isscalar(x::T) where T = isscalar(T)
isscalar(::Type{T}) where T = Base.Broadcast._containertype(T)===Any
The advantage of using the methods for Broadcast to determine if something is scalar, over using your own methods, is that anyone making a new type that is going to act in a scalar way must make sure it works with those methods correctly (or actually in a nonscalar way, since scalar is the default).
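As a usage sketch of the Julia 1 definition, here it is again with fully qualified names, applied to a few values. Only numbers, arrays, and tuples are exercised; how other types (strings, for instance) are classified depends on Base's fallback rules, so check them on your Julia version before relying on this.

```julia
# Repeats the Julia 1 definition, with fully qualified names.
isscalar(x::T) where T = isscalar(T)
isscalar(::Type{T}) where T =
    Base.Broadcast.BroadcastStyle(T) isa Base.Broadcast.DefaultArrayStyle{0}

println(isscalar(1))          # numbers broadcast as 0-dimensional
println(isscalar(2.5))
println(isscalar([1, 2, 3]))  # arrays have a 1-dimensional array style
println(isscalar((1, 2)))     # tuples have their own broadcast style
```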
Structs are not not scalar
That is to say: sometimes structs are scalar and sometimes they are not and it depends on the struct.
Note, however, that these methods do not consider structs to be non-scalar.
I think you are mistaken in wanting them to.
Julia structs are not (necessarily or usually) a collection type.
Consider that BigInt, BigFloat, Complex128, etc. are all defined using structs.
I was tempted to say that having a start method makes a type nonscalar, but that would be incorrect as start(::Number) is defined.
(This has been debated a few times)
For completeness, I am copying Tasos Papastylianou's answer from the comments to here. If all you want to do is distinguish scalars from arrays you can use:
isa(x, Number)
This will output true if x is a Number (like a float or an int), and output false if x is an Array (vector, matrix, etc.)
I recently found myself needing to capture the notion of whether something is scalar in MultiResolutionIterators.jl.
I found that the broadcasting-based rules from the other answer did not meet my needs.
In particular, I wanted to consider strings as nonscalar.
I defined a trait based on method_exists(start, (T,)), with some exceptions, as mentioned, e.g. for Number.
abstract type Scalarness end
struct Scalar <: Scalarness end
struct NotScalar <: Scalarness end
isscalar(::Type{Any}) = NotScalar() # if we don't know the type we can't really know if scalar or not
isscalar(::Type{<:AbstractString}) = NotScalar() # We consider strings to be nonscalar
isscalar(::Type{<:Number}) = Scalar() # We consider Numbers to be scalar
isscalar(::Type{Char}) = Scalar() # We consider Characters to be scalar
isscalar(::Type{T}) where T = method_exists(start, (T,)) ? NotScalar() : Scalar()
Something similar is also done by AbstractTrees.jl
isscalar(x) == applicable(start, x) && !isa(x, Integer) && !isa(x, Char) && !isa(x, Task)
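Both start and method_exists were removed in Julia 1.0: iteration now goes through iterate, and the reflection function is hasmethod. A rough port of the trait idea might look like the following. The name scalarness is made up for this sketch, and this is not the actual code of MultiResolutionIterators.jl or AbstractTrees.jl.

```julia
abstract type Scalarness end
struct Scalar <: Scalarness end
struct NotScalar <: Scalarness end

# Explicit exceptions first, as in the original trait.
scalarness(::Type{<:AbstractString}) = NotScalar()  # strings count as nonscalar
scalarness(::Type{<:Number}) = Scalar()             # numbers are iterable, but scalar
scalarness(::Type{Char}) = Scalar()                 # characters count as scalar

# Fallback: anything with a one-argument `iterate` method is treated as a collection.
scalarness(::Type{T}) where T = hasmethod(iterate, Tuple{T}) ? NotScalar() : Scalar()

# Convenience method for values rather than types.
scalarness(x) = scalarness(typeof(x))

println(scalarness(1))
println(scalarness("abc"))
println(scalarness([1, 2]))
println(scalarness(:sym))
```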

Return const dictionary

In Julia, suppose I have a function that returns a dictionary:
function f()
d = Dict(i => 2i for i = 1:10)
return d
end
I would like to return the dictionary as const. That is, keys cannot be added or removed, and existing keys cannot be reassigned. Is it possible to modify f so that the returned dictionary is const?
Julia's standard library does not provide an immutable associative type. You could implement such a type yourself and not define any setindex! method for it. It might be easier to simply not mutate the returned dictionary, however.
Although Julia doesn't have a readonly Dict in its standard library (there is the unexported ImmutableDict, but that only prevents deletions, not sets), nor in the DataStructures.jl package, it could fairly easily be added as a package.
There are a number of advantages of a readonly Dict, for example, a perfect hash function can be generated so that entries are found (or not) with only a single probe. (https://www.gnu.org/software/gperf/manual/gperf.html describes a tool to generate a perfect hash).
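To illustrate the "implement such a type yourself" suggestion, here is a minimal sketch of a wrapper that supports lookup and iteration but defines no mutating methods. The type name ReadOnlyDict is hypothetical. Note that the wrapped field is still reachable as d.data, so this guards against accidental mutation rather than enforcing immutability.

```julia
# Hypothetical wrapper type: exposes lookup and iteration, but defines no
# setindex!, delete!, or other mutating methods.
struct ReadOnlyDict{K,V} <: AbstractDict{K,V}
    data::Dict{K,V}
end

Base.length(r::ReadOnlyDict) = length(r.data)
Base.iterate(r::ReadOnlyDict, state...) = iterate(r.data, state...)
Base.get(r::ReadOnlyDict, key, default) = get(r.data, key, default)
Base.getindex(r::ReadOnlyDict, key) = r.data[key]
Base.haskey(r::ReadOnlyDict, key) = haskey(r.data, key)

function f()
    d = Dict(i => 2i for i = 1:10)
    return ReadOnlyDict(d)
end

d = f()
println(d[3])    # lookups work
# d[3] = 0       # would fail, since no setindex! method is defined for ReadOnlyDict
```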

"Adding" a value to a tuple?

I am attempting to represent dice rolls in Julia. I am generating all the rolls of n dice with a given number of sides (n d sides) with
sort(collect(product(repeated(1:sides, n)...)), by=sum)
This produces something like:
[(1,1),(2,1),(1,2),(3,1),(2,2),(1,3),(4,1),(3,2),(2,3),(1,4) … (6,3),(5,4),(4,5),(3,6),(6,4),(5,5),(4,6),(6,5),(5,6),(6,6)]
I then want to be able to reasonably modify those tuples to represent things like dropping the lowest value in the roll or adding a constant number, etc., e.g., converting (2,5) into (10,2,5) or (5,).
Does Julia provide nice functions to easily modify (not necessarily in-place) n-tuples or will it be simpler to move to a different structure to represent the rolls?
Thanks.
Tuples are immutable, so you can't modify them in-place. There is very good support for other mutable data structures, so there aren't many methods that take a tuple and return a new, slightly modified copy. One way to do this is by splatting a section of the old tuple into a new tuple, so, for example, to create a new tuple like an existing tuple t but with the first element set to 5, you would write: tuple(5, t[2:end]...). But that's awkward, and there are much better solutions.
As spencerlyon2 suggests in his comment, a one dimensional Array{Int,1} is a great place to start. You can take a look at the Data Structures manual page to get an idea of the kinds of operations you can use; one-dimensional Arrays are iterable, indexable, and support the dequeue interface.
Depending upon how important performance is and how much work you're doing, it may be worthwhile to create your own data structure. You'll be able to add your own, specific methods (e.g., reroll!) for that type. And by taking advantage of some of the domain restrictions (e.g., if you only ever want to have a limited number of dice rolls), you may be able to beat the performance of the general Array.
You can construct a new tuple based on spreading or slicing another:
julia> b = (2,5)
(2, 5)
julia> (10, b...)
(10, 2, 5)
julia> b[2:end]
(5,)
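Putting this together for the dice use case, here is a small sketch using only Base (Iterators.product instead of the separate Iterators package). The helper drop_lowest is a hypothetical name, and it sorts the roll, so the original order of the remaining dice is not preserved.

```julia
# All rolls of n dice with `sides` sides, ordered by their sum.
sides, n = 6, 2
rolls = sort(vec(collect(Iterators.product(ntuple(_ -> 1:sides, n)...))), by = sum)

roll = (2, 5)

# Prepend a constant to a roll by splatting into a new tuple.
with_bonus = (10, roll...)

# Add a constant to every die; broadcasting over a tuple returns a tuple.
shifted = roll .+ 1

# Hypothetical helper: drop the lowest die.
drop_lowest(t::Tuple) = Tuple(sort(collect(t))[2:end])

println(with_bonus)
println(shifted)
println(drop_lowest(roll))
```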
