vector-of vs. vector in Clojure - vector

When would I use vector-of to create a vector, instead of the vector function. Is the guideline to use vector most of the time and only for performance reason switch to vector-of?
I could not find good info on when to use vector-of.

vector-of is used for creating a vector of a single primitive type, :int, :long, :float, :double, :byte, :short, :char, or :boolean. It doesn't allow other types as it stores the values unboxed internally. So, if your vector need to include other types than those primitive types, you cannot use vector-of. But if you are sure that the vector will have data of a single primitive type, you can use vector-of for better performance.
user=> (vector-of :int 1 2 3 4 5)
[1 2 3 4 5]
user=> (vector-of :double 1.0 2.0)
[1.0 2.0]
user=> (vector-of :string "hello" "world")
Execution error (IllegalArgumentException) at user/eval5 (REPL:1).
Unrecognized type :string
As you can see, you should specify primitive type as an argument.
vector can be used to create a vector of any type.
user=> (vector 1 2.0 "hello")
[1 2.0 "hello"]
You can put any type when you use vector.
Also, there's another function vec, which is used for creating a new vector containing the contents of coll.
user=> (vec '(1 2 3 4 5))
[1 2 3 4 5]
Usually, you can get the basic information of a function/macro from the repl, like the following.
user=> (doc vector-of)
-------------------------
clojure.core/vector-of
([t] [t & elements])
Creates a new vector of a single primitive type t, where t is one
of :int :long :float :double :byte :short :char or :boolean. The
resulting vector complies with the interface of vectors in general,
but stores the values unboxed internally.
Optionally takes one or more elements to populate the vector.
Reference:
https://clojuredocs.org/clojure.core/vector-of
https://clojuredocs.org/clojure.core/vector
https://clojuredocs.org/clojure.core/vec

Nobody really ever uses vector-of. If you don't super care about performance, vector is fine, and if you do super care about performance you usually want a primitive array or some other java type. Honestly I would expect occasional weird snags when passing a vector-of to anything that expects an ordinary vector or sequence - maybe it works fine, but it's just such a rare thing to see that it wouldn't surprise me if it caused issues.

Related

append! Vs push! in Julia

In Julia, you can permanently append elements to an existing vector using append! or push!. For example:
julia> vec = [1,2,3]
3-element Vector{Int64}:
1
2
3
julia> push!(vec, 4,5)
5-element Vector{Int64}:
1
2
3
4
5
# or
julia> append!(vec, 4,5)
7-element Vector{Int64}:
1
2
3
4
5
But, what is the difference between append! and push!? According to the official doc it's recommended to:
"Use push! to add individual items to a collection which are not already themselves in another collection. The result
of the preceding example is equivalent to push!([1, 2, 3], 4, 5, 6)."
So this is the main difference between these two functions! But, in the example above, I appended individual elements to an existing vector using append!. So why do they recommend using push! in these cases?
append!(v, x) will iterate x, and essentially push! the elements of x to v. push!(v, x) will take x as a whole and add it at the end of v. In your example there is no difference, since in Julia you can iterate a number (it behaves like an iterator with length 1). Here is a better example illustrating the difference:
julia> v = Any[]; # Intentionally using Any just to show the difference
julia> x = [1, 2, 3]; y = [4, 5, 6];
julia> push!(v, x, y);
julia> append!(v, x, y);
julia> v
8-element Vector{Any}:
[1, 2, 3]
[4, 5, 6]
1
2
3
4
5
6
In this example, when using push!, x and y becomes elements of v, but when using append! the elements of x and y become elements of v.
Since Julia is still in its early phase it'd be best if you follow the community standards, and one of the community standards, is your code making "sense" to other developers at first sight - I should know what your "intents" are immediately I read your code.
About append!, the doc says:
"For an ordered container collection, add the elements of each
collections to the end of it. !!! compat "Julia 1.6" Specifying
multiple collections to be appended requires at least Julia 1.6."
The append! method was added and requires Julia 1.6 to use for multiple collections; so in a sense it is the method that's going to be used in the future as Julia gets adopted a lot, Python uses it too, so adopters from there would likely use it too.
About push!, the doc says:
"Insert one or more items in collection. If collection is an ordered
container, the items are inserted at the end (in the given order). If
collection is ordered, use append! to add all the elements of another
collection to it."
The doc advises you use "append!" over "push!" when your collection is ordered. So as a Julia user, if I see append! on your code, I should know the collections its making changes on is in some way "ordered". That's just it. Otherwise, push! and append! does same things(something that might change in the future), but please follow community standards, it will help.
So use append! when you care about order and use push! when order doesn't matter in your collections. This way, any Julia user reading your code, will know your intents right away; but please don't mix them up

Confused about evaluation of lazy sequences

I am experimenting with clojure's lazy sequences. In order to see when the evaluation of an item would occur, I created a function called square that prints the result before returning it. I then apply this function to a vector using map.
(defn square [x]
(let [result (* x x)]
(println "printing " result)
result))
(def s (map square [1 2 3 4 5])) ; outputs nothing
Here in my declaration of s, the REPL does not output anything. This signals that the computation has not started yet. This appears to be correct. I then do:
(first s)
The function "first" takes only the first item. So I expect that only 1 will be evaluated. My expectation is the REPL will output the following:
printing 1
1
However, the REPL outputted the following instead.
printing 1
printing 4
printing 9
printing 16
printing 25
1
So rather than evaluating only the first item, it seems it evaluates all items, even though I am accessing just the first item.
If the state of a lazy sequence can only be either all values computed and no values computed, then how can it gain the advantages of lazy evaluation? I came from a scheme background and I was expecting more like the behavior of streams. Looks like I am mistaken. Can anyone explain what is going on?
Laziness isn't all-or-nothing, but some implementations of seq operate on 'chunks' of the input sequence (see here for an explanation). This is the case for vector which you can test for with chunked-seq?:
(chunked-seq? (seq [1 2 3 4 5]))
When given a collection map checks to see if the underlying sequence is chunked, and if so evaluates the result on a per-chunk basis instead of an element at a time.
The chunk size is usually 32 so you can see this behaviour by comparing the result of
(first (map square (vec (range 35))))
This should only display a message for the first 32 items and not the entire sequence.

Julia: short syntax for creating (1,n) array

In Matlab I can write:
[0:n]
to get an array (1,n). For n=2, I get:
0 1 2
How to do the same in Julia? The purpose is to get the same type of array (1,3).
I know I can write [0 1 2], but I want something general like in Matlab.
In julia, the colon operator (in this context, anyway) returns a UnitRange object. This is an iterable object; that means you can use it with a for loop, or you can collect all its contents, etc. If you collect its contents, what you get here is a Vector.
If what you're after is explicitly a RowVector, then you can collect the contents of the UnitRange, and reshape the resulting vector accordingly (which in this case can be done via a simple transpose operation).
julia> collect(1:3).'
1×3 RowVector{Int64,Array{Int64,1}}:
1 2 3
The .' transpose operator is also defined for UnitRange arguments:
julia> (1:3).'
1×3 RowVector{Int64,UnitRange{Int64}}:
1 2 3
However, note the difference in the resulting type; if you apply .' again, you get a UnitRange object back again.
If you don't particularly like having a "RowVector" object, and want a straightforward array, use that in an Array constructor:
julia> Array((1:3).')
1×3 Array{Int64,2}:
1 2 3
(above as of latest julia 0.7 dev version)

What is the best way to form inner products?

I was delighted to learn that Julia allows a beautifully succinct way to form inner products:
julia> x = [1;0]; y = [0;1];
julia> x'y
1-element Array{Int64,1}:
0
This alternative to dot(x,y) is nice, but it can lead to surprises:
julia> #printf "Inner product = %f\n" x'y
Inner product = ERROR: type: non-boolean (Array{Bool,1}) used in boolean context
julia> #printf "Inner product = %f\n" dot(x,y)
Inner product = 0.000000
So while i'd like to write x'y, it seems best to avoid it, since otherwise I need to be conscious of pitfalls related to scalars versus 1-by-1 matrices.
But I'm new to Julia, and probably I'm not thinking in the right way. Do others use this succinct alternative to dot, and if so, when is it safe to do so?
There is a conceptual problem here. When you do
julia> x = [1;0]; y = [0;1];
julia> x'y
0
That is actually turned into a matrix * vector product with dimensions of 2x1 and 1 respectively, resulting in a 1x1 matrix. Other languages, such as MATLAB, don't distinguish between a 1x1 matrix and a scalar quantity, but Julia does for a variety of reasons. It is thus never safe to use it as alternative to the "true" inner product function dot, which is defined to return a scalar output.
Now, if you aren't a fan of the dots, you can consider sum(x.*y) of sum(x'y). Also keep in mind that column and row vectors are different: in fact, there is no such thing as a row vector in Julia, more that there is a 1xN matrix. So you get things like
julia> x = [ 1 2 3 ]
1x3 Array{Int64,2}:
1 2 3
julia> y = [ 3 2 1]
1x3 Array{Int64,2}:
3 2 1
julia> dot(x,y)
ERROR: `dot` has no method matching dot(::Array{Int64,2}, ::Array{Int64,2})
You might have used a 2d row vector where a 1d column vector was required.
Note the difference between 1d column vector [1,2,3] and 2d row vector [1 2 3].
You can convert to a column vector with the vec() function.
The error message suggestion is dot(vec(x),vec(y), but sum(x.*y) also works in this case and is shorter.
julia> sum(x.*y)
10
julia> dot(vec(x),vec(y))
10
Now, you can write x⋅y instead of dot(x,y).
To write the ⋅ symbol, type \cdot followed by the TAB key.
If the first argument is complex, it is conjugated.
Now, dot() and ⋅ also work for matrices.
Since version 1.0, you need
using LinearAlgebra
before you use the dot product function or operator.

lazy sequence depending on previous elements

Learning clojure, trying to create a lazy infinite sequence of all prime numbers.
I'm aware that there are more efficient algorithms; I'm doing the following more as a POC/lesson than as the ideal solution.
I have a function that, given a sequence of primes, tells me what the next prime is:
(next-prime [2 3 5]) ; result: 7
My lazy sequence must therefore pass itself to this function, then take the result and add that to itself.
My first attempt:
(def lazy-primes
(lazy-cat [2 3 5] (find-next-prime lazy-primes)))
..which results in an IllegalArgumentException: Don't know how to create ISeq from: java.lang.Integer
My second attempt:
(def lazy-primes
(lazy-cat [2 3 5] [(find-next-prime lazy-primes)]))
..which gives me [2 3 5 7] when asked for 10 elements.
Attempt 3:
(def lazy-primes
(lazy-cat [2 3 5]
(conj lazy-primes (find-next-prime lazy-primes))))
(take 10 lazy-primes) ; result: (2 3 5 7 2 3 5 7 2 3)
All of these seem like they should work (or at least, should work given that the preceding didn't work). Why am I getting the bogus output for each case?
Reasons why your initial attempts don't work:
(find-next-prime lazy-primes) returns an integer but lazy-cat needs a sequence
[(find-next-prime lazy-primes)] creates a vector (and is hence seqable) but it only gets evaluated once when it is first accessed
conj is adding new primes to the start of the sequence (since lazy-cat and hence lazy-primes returns a sequence)... which is probably not what you want! It's also possibly confusing find-next-prime depending on how that is implemented, and there might be a few subtle issues around chunked sequences as well.....
You might instead want to use something like:
(defn seq-fn [builder-fn num ss]
(cons
(builder-fn (take num ss))
(lazy-seq (seq-fn builder-fn (inc num) ss))))
(def lazy-primes
(lazy-cat [2 3 5] (seq-fn next-prime 3 lazy-primes)))
A bit complicated, but basically what I'm doing is using the higher-order helper function to provide a closure over a set of parameters that includes the number of primes created so far, so that it can generate the next prime incrementally at each step.
p.s. as I'm sure you are aware there are faster algorithms for generating primes! I'm assuming that this is intended primarily as an exercise in Clojure and the use of lazy sequences, in which case all well and good! But if you really care about generating lots of primes I'd recommend taking a look at the Sieve of Atkin
Alternatively, you could use iterate: the built-in function that lazily takes the output of a function and applies that to the function again
clojure.core/iterate
([f x])
Returns a lazy sequence of x, (f x), (f (f x)) etc.
f must be free of side-effects
in order for you to make it work, the next-prime function should concatenate its result to its input, and return the concatenation.
Then you can just call (take 100 (iterate list-primes [1])) to get a list of the first 100
primes.
With your next-prime function you can generate a lazy sequence of all primes with the following snippet of code:
(def primes (map peek (iterate #(conj % (next-prime %)) [2])))
The combination you are looking for is concat + lazy-seq + local fn.
Take a look at the implementation of Erathostenes' Sieve in the Clojure Contrib libraries: https://github.com/richhickey/clojure-contrib/blob/78ee9b3e64c5ac6082fb223fc79292175e8e4f0c/src/main/clojure/clojure/contrib/lazy_seqs.clj#L66
One more word, though: this implementation uses a more sophisticated algorithm for the Sieve in a functional language.
Another implementation for Clojure can be found in Rosetta code. However, I don't like that one as it uses atoms, which you don't need for the solution of this algo in Clojure.

Resources