Given two vectors a = [1, 2] and b = [3, 4], how do I obtain the concatenated vector c = [1, 2, 3, 4]? It seems that hcat or vcat could be used as they work on arrays, but when using vectors to store collections of elements it seems unfitting to first think about the orientation of the data; it's just supposed to be a list of values.
You can write
[a; b]
Under the hood this is the same as vcat, but it's terser, looks better, and is easier to remember, as it's also consistent with literal matrix construction syntax.
An alternative for concatenating multiple vectors is
reduce(vcat, (a, b))
Most Array methods treat arrays as general "tensors" of arbitrary ranks ("data cubes"), so you do need to think about the orientation. In the general case, there's cat(a, b; dims), of which hcat and vcat are special cases.
There is another class of methods that treat Vectors as list-like. Among those, append! is the method that, well, appends one vector to another. The problem is that it mutates its first argument. So you can, for example, write append!(copy(a), b), or use something like BangBang.NoBang.append (which just selects the right method internally, though).
For the case of more than two vectors to be concatenated, I like the pattern of
reduce(append!, (a, b), init=Int[])
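As a quick check, the options mentioned above can be run side by side. This is only a sketch using the a and b from the question; the names c1 through c6 are made up for illustration.

```julia
a = [1, 2]
b = [3, 4]

c1 = [a; b]                                # literal concatenation syntax
c2 = vcat(a, b)                            # what [a; b] does under the hood
c3 = reduce(vcat, (a, b))                  # scales to any number of vectors
c4 = cat(a, b; dims=1)                     # general form; dims=1 matches vcat
c5 = append!(copy(a), b)                   # list-style; copy(a) leaves `a` untouched
c6 = reduce(append!, (a, b); init=Int[])   # appends into a fresh accumulator

# All six results are equal, and `a` is not modified by any of them.
println(c1 == c2 == c3 == c4 == c5 == c6)
println(a)
```

The init=Int[] accumulator matters in the last form: without it, reduce would call append! on a itself and mutate it.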
using ShiftedArrays
mutable struct CircularMatrix{T} <: AbstractArray{T,2}
    data::Array{T,2}
    view::CircShiftedArray
    currentIndex::Int
    function CircularMatrix{T}(dims...) where T
        data = zeros(T, dims...)
        new{T}(data, ShiftedArrays.circshift(data, (0, -1)), 1)
    end
end
Base.size(M::CircularMatrix) = size(M.data)
Base.eltype(::Type{CircularMatrix{T}}) where {T} = T
function shift_forward!(M::CircularMatrix)
    shift_forward!(M, 1)
end
function shift_forward!(M::CircularMatrix, n)
    # replace the view with a view shifted forwards.
    M.currentIndex += n
    M.view = ShiftedArrays.circshift(M.data, (n, M.currentIndex))
end
@inline Base.@propagate_inbounds Base.getindex(M::CircularMatrix, i) = M.view[i]
@inline Base.@propagate_inbounds Base.setindex!(M::CircularMatrix, data, i) = M.view[i] = data
How can I make CircularMatrix act just like a regular matrix, so that I can access it like this?
m = CircularMatrix{Int}(4,4)
m[1, 1] = 5
x = view(m, 1, :)
Your matrix type is declared as a subtype of AbstractArray{T, 2}. For the functions and features that work on AbstractArray{T, 2} to also work on your custom type, you need to implement a few methods from Julia's informal array interface. That makes your CircularMatrix an iterable, indexable, fully functioning matrix.
The methods to implement are
1. size(M::CircularMatrix)
2. getindex(M::CircularMatrix, i::Int)
3. getindex(M::CircularMatrix, I::Vararg{Int, N})
4. setindex!(M::CircularMatrix, v, i::Int)
5. setindex!(M::CircularMatrix, v, I::Vararg{Int, N})
You already implement 1, 2 and 4 but have not yet set your indexing style. You do not need 3 and 5 if you choose the linear indexing style: set IndexStyle to IndexLinear(), make a few small modifications, and everything should just work for your matrix.
1. size(M::CircularMatrix)
The first one is size. size(M::CircularMatrix) returns a Tuple of the dimensions of M. For your matrix, the definition you already have should do:
Base.size(M::CircularMatrix) = size(M.data)
2. getindex(M::CircularMatrix, i::Int)
This method is needed if you choose linear indexing style. getindex(M, i::Int) should give you the value at linear index i. You already implement it in your code. If you choose linear indexing, you need to set IndexStyle for your type and then you simply skip 3 and 5. Julia will automatically convert multiple index accesses, e.g. a[3, 5], to a linear index access.
Base.IndexStyle(::Type{<:CircularMatrix}) = IndexLinear()
Base.@propagate_inbounds function Base.getindex(M::CircularMatrix, i::Int)
    @boundscheck checkbounds(M, i)
    @inbounds M.view[i]
end
Using @inbounds on the second line is reasonable here: if the caller doesn't use @inbounds, the @boundscheck line has already validated i, which should make the subsequent bounds check inside M.view[i] unnecessary. You might want to omit @inbounds during development, though.
3. getindex(M::CircularMatrix, I::Vararg{Int, N})
The third one is for Cartesian indexing style. If you choose this style you need to implement this method. Vararg{Int, N} in the signature stands for "exactly N Int arguments". Here N should be equal to the dimensionality of CircularMatrix. Since this is a matrix, N should be two. If you choose this style, you need to define something like the following
Base.@propagate_inbounds function Base.getindex(A::CircularMatrix, I::Vararg{Int, 2})
    @boundscheck checkbounds(A, I...)
    @inbounds A.view[I...]  # or convert I[1] and I[2] to a linear index in `view`
end
or since your dimensionality is not parametric and a matrix is 2D, simply
Base.@propagate_inbounds function Base.getindex(A::CircularMatrix, i::Int, j::Int)
    @boundscheck checkbounds(A, i, j)
    @inbounds A.view[i, j]  # or convert i and j to a linear index in `view`
end
4. setindex!(M::CircularMatrix, v, i::Int)
The fourth one is similar to the second. This method should set the value at linear index i, if you choose linear indexing style.
5. setindex!(M::CircularMatrix, v, I::Vararg{Int, N})
The fifth one should be similar to the third, if you choose Cartesian indexing style.
After the implementations for 1, 2, and 4 and setting IndexStyle, you should have a custom matrix type that just works.
m[1, 1] = 5
x = view(m, 1, :)
for e in m
...
end
for i in eachindex(m)
...
end
display(m)
println(m)
length(m)
ndims(m)
map(f, A)
....
These should all work.
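To see the linear-indexing recipe end to end without depending on ShiftedArrays, here is a minimal self-contained sketch. RingMatrix, its offset field, and the physical helper are hypothetical names invented for this example, not the asker's type: the wrapper rotates the columns of a plain Matrix by remapping linear indices, and only size, IndexStyle, getindex and setindex! are defined.

```julia
# Hypothetical example type: reads and writes go through a circular column offset.
mutable struct RingMatrix{T} <: AbstractMatrix{T}
    data::Matrix{T}
    offset::Int          # number of columns the logical matrix is rotated by
end

Base.size(M::RingMatrix) = size(M.data)
Base.IndexStyle(::Type{<:RingMatrix}) = IndexLinear()

# Map a logical linear index to its position in `data` (storage is column-major,
# so shifting by one column means shifting by size(data, 1) linear positions).
physical(M::RingMatrix, i::Int) = mod1(i + M.offset * size(M.data, 1), length(M.data))

Base.@propagate_inbounds function Base.getindex(M::RingMatrix, i::Int)
    @boundscheck checkbounds(M, i)
    @inbounds M.data[physical(M, i)]
end

Base.@propagate_inbounds function Base.setindex!(M::RingMatrix, v, i::Int)
    @boundscheck checkbounds(M, i)
    @inbounds M.data[physical(M, i)] = v
    return M
end

m = RingMatrix{Int}(zeros(Int, 4, 4), 0)
m[1, 1] = 5              # Cartesian access is converted to a linear index
m.offset = 1             # rotate by one column: the old column 1 is now column 4
row = view(m, 1, :)      # views come for free from the AbstractArray fallbacks
println(collect(row))
println((length(m), ndims(m), sum(m)))
```

Iteration, views, display, length, ndims and map are all inherited from the generic AbstractArray fallbacks; nothing beyond the four definitions above was written for them.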
A few notes
The Julia manual documents the AbstractArray interface, with a few examples, and also lists the optional methods you can implement.
There is a JuliaArrays organization on GitHub that provides many useful custom array implementations, including StaticArrays and OffsetArrays, and also a JuliaMatrices organization that provides custom matrix types. You might want to take a look at their implementations.
@inline is redundant if you use Base.@propagate_inbounds.
@propagate_inbounds
Tells the compiler to inline a function while retaining the caller's
inbounds context.
You do not need to define eltype for your matrix, since there is already a definition for AbstractArray{T, N} which returns T.
Julia's higher-order function map looks very useful. But while it is easy to understand how it can be used with functions that have one input, it is not obvious how map can be used when the function has multiple inputs, and when each of these may be an array. I would like to discover how map is used in that situation.
Suppose I have the following function:
function randomSample(items, weights)
sample(items, Weights(weights))
end
Example:
using Pkg
Pkg.add("StatsBase")
using StatsBase
randomSample([1,0],[0.5, 0.5])
How can map be used here? I have tried something like:
items = [1 0;1 0;1 0]
weights = [1 0;0.5 0.5;0.75 0.25]
map(randomSample(items,weights))
In the example above, I would expect Julia to output a 3 by 1 array of integers (from the items), each row being either 0 or 1 depending on the corresponding weights.
In your case, where items and weights are matrices, you can use the eachrow function like this:
map(randomSample, eachrow(items), eachrow(weights))
If you are on Julia version earlier than 1.1 you can write:
map(i -> randomSample(items[i, :], weights[i, :]), axes(items, 1))
or
map(i -> randomSample(view(items,i, :), view(weights, i, :)), axes(items, 1))
(the latter avoids allocations)
However, in practice I would probably define items and weights as vectors of vectors:
items = [[1, 0],[1, 0],[1, 0]]
weights = [[1, 0], [0.5, 0.5], [0.75, 0.25]]
and then you can simply write:
map(randomSample, items, weights)
or
randomSample.(items, weights)
The reason for my preference is the following:
it is conceptually clearer what is the structure of your data
vector of vectors is easier to mutate (e.g. you can push! a new entry at the end)
vector of vectors can be ragged if needed
in some cases it might be a bit faster (iterating by rows in Julia is not optimal because arrays are stored in column-major order; of course you can fix this in your Matrix approach by storing your data columnwise rather than rowwise as you currently do)
(this is not a very strong preference and you can probably choose whatever is more convenient to you)
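The variants above can be compared with a deterministic stand-in, so the results are easy to verify by hand. In this sketch weightedSum is a hypothetical replacement for randomSample (it needs no StatsBase and involves no randomness), and the itemsVV and weightsVV names are made up; eachrow requires Julia 1.1 or later.

```julia
# Hypothetical stand-in for randomSample: a plain weighted sum of one row.
weightedSum(items, weights) = sum(items .* weights)

items = [1 0; 1 0; 1 0]
weights = [1 0; 0.5 0.5; 0.75 0.25]

# map walks both iterators in lockstep, passing one row of each per call.
byRow = map(weightedSum, eachrow(items), eachrow(weights))

# Index-based form that also works before Julia 1.1; view avoids copying rows.
byIndex = map(i -> weightedSum(view(items, i, :), view(weights, i, :)), axes(items, 1))

# Vector-of-vectors layout: map and broadcasting both apply directly.
itemsVV = [[1, 0], [1, 0], [1, 0]]
weightsVV = [[1, 0], [0.5, 0.5], [0.75, 0.25]]
byMap = map(weightedSum, itemsVV, weightsVV)
byDot = weightedSum.(itemsVV, weightsVV)

println(byRow)
```

Each form returns a 3-element vector, one result per row, which is the shape the question expected.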
Here's some toy code:
mutable struct MyType
x::Int
end
vec = [MyType(1), MyType(2), MyType(3), MyType(4)]
ids = [2, 1, 3, 1]
vec = vec[ids]
julia> vec
4-element Array{MyType,1}:
MyType(2)
MyType(1)
MyType(3)
MyType(1)
That looks fine, except for this behavior:
julia> vec[2].x = 60
60
julia> vec
4-element Array{MyType,1}:
MyType(2)
MyType(60)
MyType(3)
MyType(60)
I want to be able to rearrange the contents of a vector, with the possibility that I eliminate some values and duplicate others. But when I duplicate values, I don't want this copy behavior. Is there an "elegant" way to do this? Something like this works, but yeesh:
vec = [deepcopy(vec[ids[i]]) for i in 1:4]
The issue is that you're creating mutable types, and your vector therefore contains references to the instantiated data - so when you create a vector based on ids, you're creating what amounts to a vector of pointers to the structures. This further means that the elements in the vector with the same id are actually pointers to the same object.
There's no good way to do this without ensuring that your references are different. That either means 1) immutable types, which means you can't reassign x, or 2) copy/deepcopy.
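The two options can be sketched as follows. Item and FrozenItem are hypothetical stand-ins for MyType, and the variable names are made up. Note that each element is copied individually: calling deepcopy on the whole indexed vector at once would preserve the sharing between duplicated entries.

```julia
mutable struct Item          # hypothetical mutable stand-in for MyType
    x::Int
end

items = [Item(1), Item(2), Item(3), Item(4)]
ids = [2, 1, 3, 1]

# Plain indexing copies references, so slots 2 and 4 are the same object.
shared = items[ids]
println(shared[2] === shared[4])

# Option 2: copy each element separately so every slot owns its object.
independent = [deepcopy(items[i]) for i in ids]
independent[2].x = 60
println(independent[4].x)    # unaffected by the change to slot 2

# Option 1: an immutable type. Fields cannot be reassigned, so an update
# replaces the whole element and cannot leak into other slots.
struct FrozenItem
    x::Int
end
frozen = [FrozenItem(1), FrozenItem(2)][[1, 1]]
frozen[1] = FrozenItem(60)
println(frozen[2].x)
```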
In R, the function outer structurally allows you to take the outer product of two vectors x and y while providing a number of options for the actual function applied to each combination. For example outer(x,y,'-') creates an "outer product" matrix of the elementwise differences between x and y. Does Julia have something similar?
Broadcasting is the Julia operation that occurs when you add dots to an operator or function call. When the two containers have the same size, it is an element-wise operation; for example, x .* y is element-wise if size(x) == size(y). When the shapes don't match, broadcasting really comes into effect. If one argument is a column vector and the other is a row vector, the output is 2D, with out[i, j] combining the ith element of the column vector with the jth element of the row vector. This means x .* y is a way to write the outer product if one is a column vector and the other is a row vector.
In general, what broadcast is doing is:
This is wasteful when dimensions get large, so Julia offers broadcast(), which expands singleton dimensions in array arguments to match the corresponding dimension in the other array without using extra memory
(This is from the Julia Manual)
But this generalizes to all of the other binary operators, so x .- y' is what you're looking for.
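A small sketch of that idea with concrete numbers follows. The outer helper is a hypothetical convenience function written for this example, not something Base provides; it uses permutedims rather than ' so that it also works for non-numeric elements.

```julia
x = [1, 2, 3]
y = [10, 20]

# y' is a 1×2 row, so broadcasting pairs every x[i] with every y[j].
diffs = x .- y'          # 3×2 matrix with diffs[i, j] == x[i] - y[j]
prods = x .* y'          # the ordinary outer product

# Hypothetical helper mirroring R's outer(x, y, f).
outer(f, x, y) = broadcast(f, x, permutedims(y))

println(diffs)
println(outer(-, x, y) == diffs)
```

Any two-argument function can be passed to the helper, for example outer(max, x, y) or outer((p, q) -> p^2 + q, x, y).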
For dealing with two-dimensional matrices, rbind and cbind are useful functions. Are there more generic functions to perform the same operation in more dimensions? Suppose I have data like this:
data <- lapply(c(11,22,33), function(i) matrix(i, nrow=2, ncol=4))
What I'd like to obtain is this:
data <- do.call(c, data)
dim(data) <- c(2, 4, 3)
but without having to work out all the dimensions myself.
Is there a function providing this functionality, either built-in or as part of a reasonably common package? Or do you want to share your own ideas of how such a function could be implemented most elegantly?
Bonus points:
If the function gives some control over the order of dimensions, then a subsequent call to aperm could be avoided.
It would be nice if it could operate by either passing multiple function arguments or a list of arguments, although using do.call or list, either one will suffice.
I'd like to use such a function as the .combine argument to a foreach call. So it should be able to construct multi-dimensional matrices using calls of the form f(f(f(a, b), c), d) (each call takes exactly two arguments, the first usually the result of the previous call) or even f(f(a, b), c, d) (more than two arguments, the first still might be the result of the previous call), with a, b, c, d all of the same size, resulting in a matrix with a dimension 1 higher than the dimensions of these and a size of 4 in that dimension, corresponding to the 4 elements a through d.
The abind package has precisely this function, with most of the features you mention, although I haven't checked all of them in detail.
At the very least, it would give you a start on how one would implement something along these lines.