How to set dictionary values to zero in Julia - julia

I would like to set all the values of a dictionary to zero. In Python you can use the .fromkeys() function to do this (see this SO question). How can this be done in Julia?

The replace function comes to mind. It takes a function as first argument that takes a pair as argument and returns the modified pair:
d_new = replace(kv -> kv[1] => 0, d)
Here, each pair kv from d is replaced by kv[1] => 0, where kv[1] is the key and 0 the new associated value.
Note that the mutating variant replace! is also available for in-place replacement.
EDIT: another possibility when the dictionary has to be mutated is:
map!(x->0, values(d))
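As a quick check of the two approaches above, here is a minimal sketch (the dictionary contents are made up for illustration). It shows that replace builds a new dictionary and leaves d alone, while map! over values(d) overwrites the values in place.

```julia
# Example data (hypothetical, for illustration only)
d = Dict("a" => 1, "b" => 2)

# Non-mutating: returns a new Dict, d keeps its values
d_new = replace(kv -> kv[1] => 0, d)
d_unchanged = (d == Dict("a" => 1, "b" => 2))

# Mutating: overwrite every value of d in place
map!(x -> 0, values(d))
```

If the values are not integers, passing zero instead of x -> 0 should keep the value type, assuming zero is defined for it.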

Not quite the answer to your question, but it may be interesting for you to consider using the Dictionaries.jl package, which provides more convenient means of working with dictionaries than Base Dict.
For example, in Dictionaries.jl notation your question can be solved as follows:
using Dictionaries
d = dictionary(['a' => 1, 'b' => 2])
map(zero, d)

I found the answer to my own question. As mentioned in the Julia docs, dictionaries can be created using generators. So you can copy the keys of the original Dict and assign all the values to be 0 like so
d_new = Dict(i => 0 for i in keys(d))
See also this forum post.

Related

getindex for specific dims in Julia

Suppose I want to write a dynamic function that gets an object subtype of AbstractMatrix and shuffles the values along a specified dimension. Surely there can be various approaches and ways to do this, but suppose the following way:
import Random.shuffle
function shuffle(data::AbstractMatrix; dims=1)
    n = size(data, dims)
    shuffled_idx = shuffle(1:n)
    data[shuffled_idx, :] # This line is wrong: it ignores dims
end
One wrong way is to chain an open-ended series of if-else branches such as if dims==1 ... elseif dims==2 ..., but that is not how this kind of thing should be done. I could also declare data::AbstractArray, so that the input can have any number of dimensions. It occurred to me that this would be possible with something like getindex(data, [idxs]; dims). However, I checked the methods of getindex for a dims keyword (or even positional) argument, and there is no such definition. So how can I get the values at specified indices along a given dimension?
You are looking for selectdim:
help?> selectdim
search: selectdim
selectdim(A, d::Integer, i)
Return a view of all the data of A where the index for dimension d equals i.
Equivalent to view(A,:,:,...,i,:,:,...) where i is in position d.
Here's a code example:
using Random # provides shuffle
function myshuffle(data::AbstractMatrix; dim=1)
    inds = shuffle(axes(data, dim))
    return selectdim(data, dim, inds)
end
Make sure not to use 1:n as indices for AbstractArrays, as they may have non-standard indices. Use axes instead.
BTW, selectdim returns a view (as its docstring says), so you may need to call collect or copy on the result if you want an independent array.
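Since selectdim works for any number of dimensions, the same idea extends beyond matrices. The following is a minimal sketch under that assumption; shuffle_along is a made-up name, and the copy at the end is only there to return an independent array instead of a view.

```julia
using Random

# Hypothetical helper: shuffle the slices of an N-dimensional array along `dim`.
function shuffle_along(data::AbstractArray; dim::Integer=1)
    inds = shuffle(axes(data, dim))            # permuted indices of that dimension
    return copy(selectdim(data, dim, inds))    # materialize the view
end

A = reshape(1:12, 3, 4)
B = shuffle_along(A; dim=2)   # same columns as A, in random order
```

Dropping the copy gives a view into data, which avoids the allocation but stays tied to the original array.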

dropping singleton dimensions in julia

Just playing around with Julia (1.0) and one thing that I need to use a lot in Python/numpy/matlab is the squeeze function to drop the singleton dimensions.
I found out that one way to do this in Julia is:
a = rand(3, 3, 1);
a = dropdims(a, dims = tuple(findall(size(a) .== 1)...))
The second line seems a bit cumbersome and not easy to read and parse instantly (this could also be my bias that I bring from other languages). However, I wonder if this is the canonical way to do this in Julia?
The actual answer to this question surprised me. What you are asking could be rephrased as:
why doesn't dropdims(a) remove all singleton dimensions?
I'm going to quote Tim Holy from the relevant issue here:
it's not possible to have squeeze(A) return a type that the compiler
can infer---the sizes of the input matrix are a runtime variable, so
there's no way for the compiler to know how many dimensions the output
will have. So it can't possibly give you the type stability you seek.
Type stability aside, there are also some other surprising implications of what you have written. For example, note that:
julia> f(a) = dropdims(a, dims = tuple(findall(size(a) .== 1)...))
f (generic function with 1 method)
julia> f(rand(1,1,1))
0-dimensional Array{Float64,0}:
0.9939103383167442
In summary, including such a method in Base Julia would encourage users to use it, resulting in potentially type-unstable code that, under some circumstances, will not be fast (something the core developers are strenuously trying to avoid). In languages like Python, rigorous type-stability is not enforced, and so you will find such functions.
Of course, nothing stops you from defining your own method as you have. And I don't think you'll find a significantly simpler way of writing it. For example, the proposal for Base that was not implemented was the method:
function squeeze(A::AbstractArray)
singleton_dims = tuple((d for d in 1:ndims(A) if size(A, d) == 1)...)
return squeeze(A, singleton_dims)
end
Just be aware of the potential implications of using it.
Let me simply add that "uncontrolled" dropdims (drop any singleton dimension) is a frequent source of bugs. For example, suppose you have some loop that asks for a data array A from some external source, and you run R = sum(A, dims=2) on it and then get rid of all singleton dimensions. But then suppose that one time out of 10000, your external source returns A for which size(A, 1) happens to be 1: boom, suddenly you're dropping more dimensions than you intended and perhaps at risk for grossly misinterpreting your data.
If you specify those dimensions manually instead (e.g., dropdims(R, dims=2)) then you are immune from bugs like these.
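The failure mode described above can be reproduced in a few lines. This is a small sketch with made-up array sizes; drop_all is a hypothetical helper that mimics an "uncontrolled" squeeze.

```julia
# Hypothetical helper: drop every singleton dimension
drop_all(R) = dropdims(R, dims=Tuple(findall(size(R) .== 1)))

# The usual case: several rows
A = rand(3, 5)
R = sum(A, dims=2)                    # size (3, 1)
usual_all      = drop_all(R)          # size (3,)
usual_explicit = dropdims(R, dims=2)  # size (3,)

# The rare case: the source returns a single row
A1 = rand(1, 5)
R1 = sum(A1, dims=2)                  # size (1, 1)
rare_all      = drop_all(R1)          # 0-dimensional, both dimensions are gone
rare_explicit = dropdims(R1, dims=2)  # size (1,), still a vector
```

With explicit dims the result has the same number of dimensions in both cases, which is what downstream code usually expects.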
You can get rid of tuple in favor of a comma ,:
dropdims(a, dims = (findall(size(a) .== 1)...,))
I'm a bit surprised at Colin's revelation; surely something relying on 'reshape' is type stable? (plus, as a bonus, returns a view rather than a copy).
julia> function squeeze( A :: AbstractArray )
keepdims = Tuple(i for i in size(A) if i != 1);
return reshape( A, keepdims );
end;
julia> a = randn(2,1,3,1,4,1,5,1,6,1,7);
julia> size( squeeze(a) )
(2, 3, 4, 5, 6, 7)
No?
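One way to probe the question is to ask the compiler what it infers. The sketch below redefines the reshape-based function under the made-up name squeeze_reshape and uses Base.return_types; the expectation is that the inferred type is not concrete, because the number of remaining dimensions depends on the runtime sizes.

```julia
# The reshape-based version from above, under a hypothetical name
function squeeze_reshape(A::AbstractArray)
    keepdims = Tuple(i for i in size(A) if i != 1)
    return reshape(A, keepdims)
end

# What the compiler infers for a 3-dimensional Float64 array
inferred = first(Base.return_types(squeeze_reshape, (Array{Float64,3},)))

# Same input type, different output types depending on the sizes
t1 = typeof(squeeze_reshape(rand(2, 1, 3)))   # Array{Float64,2}
t2 = typeof(squeeze_reshape(rand(2, 3, 4)))   # Array{Float64,3}
```

So the reshape does share memory with the input, but the output dimensionality is still only known at run time.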

How to check if a variable is scalar in julia

I would like to check if a variable is scalar in Julia, such as Integer, String, Number, but not AbstractArray, Tuple, type, struct, etc. Is there a simple method to do this (i.e. isscalar(x))?
The notion of what is, or is not a scalar is under-defined without more context.
Mathematically, a scalar is defined as follows (Wikipedia):
A scalar is an element of a field which is used to define a vector space.
That is to say, you need to define a vector space, based on a field, before you can determine if something is, or is not, a scalar (relative to that vector space).
For the right vector space, tuples could be a scalar.
Of course, we are not looking for a mathematically rigorous definition.
Just a pragmatic one.
Base it off what Broadcasting considers to be scalar
I suggest that the only meaningful way to define a scalar in Julia is in terms of the behavior of broadcast.
As of Julia 1:
using Base.Broadcast
isscalar(x::T) where T = isscalar(T)
isscalar(::Type{T}) where T = BroadcastStyle(T) isa Broadcast.DefaultArrayStyle{0}
See the docs for Broadcast.
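A variant of the same idea is to classify values rather than types, passing each value through Broadcast.broadcastable first, since broadcasting itself does that (it wraps strings in a Ref, for instance). This is only a sketch: isscalar_bc is a made-up name, and values that are neither iterable nor covered by a broadcastable method may raise an error here.

```julia
using Base.Broadcast: BroadcastStyle, DefaultArrayStyle, broadcastable

# Hypothetical helper: a value counts as scalar if broadcasting
# treats it as a 0-dimensional container.
isscalar_bc(x) = BroadcastStyle(typeof(broadcastable(x))) isa DefaultArrayStyle{0}

results = (isscalar_bc(1), isscalar_bc(2.5), isscalar_bc("abc"),
           isscalar_bc([1, 2]), isscalar_bc((1, 2)))
```

Under this definition numbers and strings are scalar, while vectors and tuples are not.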
In julia 0.7, Scalar is the default. So it is basically anything that doesn't have specific broadcasting behavior, i.e. it knocks out things like array and tuples etc.:
using Base.Broadcast
isscalar(x::T) where T = isscalar(T)
isscalar(::Type{T}) where T = BroadcastStyle(T) isa Broadcast.Scalar
In julia 0.6 this is a bit more messy, but similar:
isscalar(x::T) where T = isscalar(T)
isscalar(::Type{T}) where T = Base.Broadcast._containertype(T)===Any
The advantage of using the methods for Broadcast to determine if something is scalar, over using your own methods, is that anyone making a new type that is going to act in a scalar way must make sure it works with those methods correctly
(or, actually, in a nonscalar way, since scalar is the default).
Structs are not necessarily nonscalar
That is to say: sometimes structs are scalar and sometimes they are not and it depends on the struct.
Note, however, that these methods do not consider structs to be nonscalar.
I think you are mistaken in wanting them to.
Julia structs are not (necessarily or usually) a collection type.
Consider that BigInt, BigFloat, Complex128 (ComplexF64 since Julia 0.7), etc.
are all defined using structs.
I was tempted to say that having a start method makes a type nonscalar, but that would be incorrect as start(::Number) is defined.
(This has been debated a few times)
For completeness, I am copying Tasos Papastylianou's answer from the comments to here. If all you want to do is distinguish scalars from arrays you can use:
isa(x, Number)
This will output true if x is a Number (like a float or an int), and output false if x is an Array (vector, matrix, etc.)
I recently found myself needing to capture the notion of whether something is scalar in MultiResolutionIterators.jl.
I found that the broadcasting-based rules from the other answer
did not meet my needs.
In particular, I wanted to consider strings as nonscalar.
I defined a trait
based on method_exists(start, (T,)),
with some exceptions as mentioned, e.g. for Number.
abstract type Scalarness end
struct Scalar <: Scalarness end
struct NotScalar <: Scalarness end
isscalar(::Type{Any}) = NotScalar() # if we don't know the type we can't really know if scalar or not
isscalar(::Type{<:AbstractString}) = NotScalar() # We consider strings to be nonscalar
isscalar(::Type{<:Number}) = Scalar() # We consider Numbers to be scalar
isscalar(::Type{Char}) = Scalar() # We consider characters to be scalar
isscalar(::Type{T}) where T = method_exists(start, (T,)) ? NotScalar() : Scalar()
Something similar is also done by AbstractTrees.jl
isscalar(x) == applicable(start, x) && !isa(x, Integer) && !isa(x, Char) && !isa(x, Task)
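The trait definitions above use Julia 0.6 names. In Julia 0.7 and later, method_exists was renamed to hasmethod and the start/next/done iteration protocol was replaced by iterate, so a port might look like the sketch below. The function name scalarness is a made-up choice for this example, not something taken from MultiResolutionIterators.jl.

```julia
abstract type Scalarness end
struct Scalar <: Scalarness end
struct NotScalar <: Scalarness end

# Hypothetical trait function, ported to the iterate-based protocol
scalarness(::Type{Any}) = NotScalar()               # unknown type: cannot tell
scalarness(::Type{<:AbstractString}) = NotScalar()  # strings count as nonscalar
scalarness(::Type{<:Number}) = Scalar()             # numbers count as scalar
scalarness(::Type{Char}) = Scalar()                 # characters count as scalar
# Fallback: anything that can be iterated is treated as nonscalar
scalarness(::Type{T}) where T = hasmethod(iterate, Tuple{T}) ? NotScalar() : Scalar()

examples = (scalarness(Int), scalarness(String), scalarness(Vector{Int}), scalarness(Symbol))
```

Returning trait objects rather than Bool allows dispatching on the result, e.g. process(x) = process(scalarness(typeof(x)), x).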

values and keys guaranteed to be in the consistent order?

When applied to a Dict, will values(...) and keys(...) return items in matching order?
In other words, is zip(keys(d), values(d)) guaranteed to contain precisely the key-value pairs of the dictionary d?
Option 1
The current Julia source code indicates that the keys and vals of a Dict() object are stored as Array objects, which are ordered. Thus, you could just use values() and keys() separately, as in your question formulation. However, it is dangerous to rely on undocumented under-the-hood implementation details, since they might change without notice.
Option 2
An OrderedDict from the DataStructures package (along with the functions values() and keys()) is probably the simplest and safest way to be certain of consistent ordering. It works just as well if you don't specifically need the ordering.
Option 3
If you don't want to deal with the added hassle of installing and loading the DataStructures package, you could just use Julia's built in syntax for handling this kind of thing, e.g.
Mydict = Dict("a" => 1, "b" => 2, "c" => 1)
a = [(key, val) for (key, val) in Mydict]
The use of zip() as given in the question formulation just adds complexity and risk in this situation.
If you want the entities separate, you could then use:
Keys = [key for (key, val) in Mydict]
Values = [val for (key, val) in Mydict]
or just refer to a[idx][1] for the idx element of Keys when you need it.
Currently your assertion seems to be true:
julia> let
           d = Dict(i => i^2 for i in 1:10_000)
           z = zip(keys(d), values(d))
           for (pair, tupl) in zip(d, z)
               @assert pair[1] == tupl[1] && pair[2] == tupl[2]
           end
           @info "Success"
       end
[ Info: Success
But that is an undocumented implementation detail as Michael Ohlrogge explains.
Stefan Karpinski's comment on "show(dict) now sorted by key" in #16743:
This has performance implications for printing very large Dicts. I don't think it's a good idea. I do, however, think that making Dict ordered is a good idea that we should go ahead with.
See also:
#10116 WIP: try ordered Dict representation.
Most importantly, what are you trying to do? Perhaps an OrderedDict is what you need?
Yes, keys and values return items in matching order, unless, as Dan Getz pointed out above, the dictionary is modified between uses of the two iterators.
I think it would be relatively perverse for a dictionary not to have this behavior. It was obvious to us that the order should match, to the point that it didn't even occur to us to mention this explicitly in the documentation.
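For a concrete dictionary, the matching order is easy to check directly. The snippet below is a small sketch with made-up data; as far as I can tell, recent versions of the documentation for keys also state that keys(a), values(a) and pairs(a) iterate in the same order, but that is worth confirming for the Julia version in use.

```julia
# Example data (hypothetical)
d = Dict("a" => 1, "b" => 2, "c" => 1)

ks = collect(keys(d))
vs = collect(values(d))

# Every zipped (key, value) should be an actual entry of d ...
matches = all(d[k] == v for (k, v) in zip(ks, vs))
# ... and zipping should reproduce plain iteration over d
same_as_pairs = collect(zip(ks, vs)) == [(k, v) for (k, v) in d]
```

This only holds if d is not modified between the calls to keys and values.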
Another way to ensure corresponding order between keys and values is using imap from the Iterators package in the following way:
using Iterators
d = Dict(1=>'a',2=>'b',3=>'c')
# keys iterator is `imap(first,d)`
julia> collect(imap(first,d))
3-element Array{Any,1}:
2
3
1
# values iterator is `imap(last,d)`
julia> collect(imap(last,d))
3-element Array{Any,1}:
'b'
'c'
'a'
This method can potentially be adapted for other structures. All the other comments and answers are also good.

Return const dictionary

In Julia, suppose I have a function that returns a dictionary:
function f()
    d = Dict(i => 2i for i = 1:10)
    return d
end
I would like to return the dictionary as const. That is, keys cannot be added or removed, and existing keys cannot be reassigned. Is it possible to modify f so that the returned dictionary is const?
Julia's standard library does not provide an immutable associative type. You could implement such a type yourself and not define any setindex! method for it. It might be easier to simply not mutate the returned dictionary, however.
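To illustrate the "implement such a type yourself" route, here is a minimal sketch of a read-only wrapper. ReadOnlyDict is a made-up name, not something in Base or a package, and it is assumed that defining length, iterate, get and getindex is enough for ordinary lookups and iteration. Since no setindex! or delete! method is defined, attempts to modify it raise a MethodError.

```julia
# Hypothetical read-only wrapper around a Dict
struct ReadOnlyDict{K,V} <: AbstractDict{K,V}
    data::Dict{K,V}
end

# Read access is forwarded to the wrapped Dict
Base.length(d::ReadOnlyDict) = length(d.data)
Base.iterate(d::ReadOnlyDict, state...) = iterate(d.data, state...)
Base.get(d::ReadOnlyDict, key, default) = get(d.data, key, default)
Base.getindex(d::ReadOnlyDict, key) = d.data[key]
# Deliberately no setindex!, delete!, etc.

function f()
    return ReadOnlyDict(Dict(i => 2i for i in 1:10))
end

d = f()
value = d[3]   # lookups work as usual
```

Note that this is protection by convention only: the wrapped Dict is still reachable through the data field.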
Although Julia doesn't have a read-only Dict in its standard library (there is the unexported Base.ImmutableDict, but it only rules out deletions; a key can still be shadowed by constructing a new ImmutableDict with the same key), nor is there one in the DataStructures.jl package, one could fairly easily be added as a package.
A read-only Dict has a number of advantages. For example, a perfect hash function can be generated so that entries are found (or not) with only a single probe. (https://www.gnu.org/software/gperf/manual/gperf.html describes a tool to generate a perfect hash).
