Julia: Turn Vector into multiple m x n matrices without a loop - julia

Let's say I have a vector V, and I want to either turn this vector into multiple m x n matrices, or get multiple m x n matrices from this Vector V.
For the most basic example: Turn V = collect(1:75) into 3 5x5 matrices.
As far as I am aware this can be done by first using reshape reshape(V, 5, :) and then looping through it. Is there a better way in Julia without using a loop?
If possible, a solution that can easily change between row-major and column-major results is preferrable.

TL:DR
m, n, n_matrices = 4, 2, 5
V = collect(1:m*n*n_matrices)
V = reshape(V, m, n, :)
V = permutedims(V, [2,1,3])
display(V)
From my limited knowledge about Julia:
When doing V = collect(1:m*n), you initialize a contiguous array in memory. From V you wish to create a container of m by n matrices. You can achieve this by doing reshape(V, m, n, :), then you can access the first matrix with V[:,:,1]. The "container" in this case is just another array (thus you have a three dimensional array), which in this case we interpret as "an array of matrices" (but you could also interpret it as a box). You can then transpose every matrix in your array by swapping the first two dimensions like this: permutedims(V, [2,1,3]).
How this works
From what I understand; n-dimensional arrays in Julia are contiguous arrays in memory when you don't do any "skipping" (e.g. V[1:2:end]). For example the 2 x 4 matrix A:
1 3 5 7
2 4 6 8
is in memory just 1 2 3 4 5 6 7 8. You simply interpret the data in a specific way, where the first two numbers makes up the first column, then the second two numbers makes the next column so on so forth. The reshape function simply specifies how you want to interpret the data in memory. So if we did reshape(A, 4, 2) we basically interpret the numbers in memory as "the first four values makes the first column, the second four values makes the second column", and we would get:
1 5
2 6
3 7
4 8
We are basically doing the same thing here, but with an extra dimension.
From my observations it also seems to be that permutedims in this case reallocates memory. Also, feel free to correct me if I am wrong.
Old answer:
I don't know much about Julia, but in Python using NumPy I would have done something like this:
reshape(V, :, m, n)
EDIT: As #BatWannaBe states, the result is technically one array (but three dimensional). You can always interpret a three dimensional array as a container of 2D arrays, which from my understanding is what you ask for.

Related

Is there a way to swap columns in O(1) in Julia?

I picked up Julia to do some numerical analysis stuff and was trying to implement a full pivot LU decomposition (as in, trying to get an LU decomposition that is as stable as possible). I thought that the best way of doing so was finding the maximum value for each column and then resorting the columns in descending order of their maximum values.
Is there a way of avoiding swapping every element of two columns and instead doing something like changing two references/pointers?
Following up on #longemen3000's answer, you can use views to swap columns. For example:
julia> A = reshape(1:12, 3, 4)
3×4 reshape(::UnitRange{Int64}, 3, 4) with eltype Int64:
1 4 7 10
2 5 8 11
3 6 9 12
julia> V = view(A, :, [3,2,4,1])
3×4 view(reshape(::UnitRange{Int64}, 3, 4), :, [3, 2, 4, 1]) with eltype Int64:
7 4 10 1
8 5 11 2
9 6 12 3
That said, whether this is a good strategy depends on access patterns. If you'll use elements of V once or a few times, this view strategy is a good one. In contrast, if you access elements of V many times, you may be better off making a copy or moving values in-place, since that's a price you pay once whereas here you pay an indirection cost every time you access a value.
Just for "completeness", in case you actually want to swap columns in-place,
function swapcols!(X::AbstractMatrix, i::Integer, j::Integer)
#inbounds for k = 1:size(X,1)
X[k,i], X[k,j] = X[k,j], X[k,i]
end
end
is simple and fast.
In fact, in an individual benchmark for small matrices this is even faster than the view approach mentioned in the other answers (views aren't always free):
julia> A = rand(1:10,4,4);
julia> #btime view($A, :, $([3,2,1,4]));
31.919 ns (3 allocations: 112 bytes)
julia> #btime swapcols!($A, 1,3);
8.107 ns (0 allocations: 0 bytes)
in julia there is the #view macro, that allows you to create an array that is just a reference to another array, for example:
A = [1 2;3 4]
Aview = #view A[:,1] #view of the first column
Aview[1,1] = 10
julia> A
2×2 Array{Int64,2}:
10 2
3 4
with that said, when working with concrete number types (Float64,Int64,etc), julia uses contiguous blocks of memory with the direct representation of the number type. that is, a julia array of numbers is not an array of pointers were each element of an array is a pointer to a value. if the values of an array can be represented by a concrete binary representation (an array of structs, for example) then an array of pointers is used.
I'm not a computer science expert, but i observed that is better to have your data tightly packed that using a lot of pointers when doing number crunching.
Another different case is Sparse Arrays. the basic julia representation of an sparse array is an array of indices and an array of values. here you can simply swap the indices instead of copying the values

Generate Unique Combinations of Integers

I am looking for help with pseudo code (unless you are a user of Game Maker 8.0 by Mark Overmars and know the GML equivalent of what I need) for how to generate a list / array of unique combinations of a set of X number of integers which size is variable. It can be 1-5 or 1-1000.
For example:
IntegerList{1,2,3,4}
1,2
1,3
1,4
2,3
2,4
3,4
I feel like the math behind this is simple I just cant seem to wrap my head around it after checking multiple sources on how to do it in languages such as C++ and Java. Thanks everyone.
As there are not many details in the question, I assume:
Your input is a natural number n and the resulting array contains all natural numbers from 1 to n.
The expected output given by the combinations above, resembles a symmetric relation, i. e. in your case [1, 2] is considered the same as [2, 1].
Combinations [x, x] are excluded.
There are only combinations with 2 elements.
There is no List<> datatype or dynamic array, so the array length has to be known before creating the array.
The number of elements in your result is therefore the binomial coefficient m = n over 2 = n! / (2! * (n - 2)!) (which is 4! / (2! * (4 - 2)!) = 24 / 4 = 6 in your example) with ! being the factorial.
First, initializing the array with the first n natural numbers should be quite easy using the array element index. However, the index is a property of the array elements, so you don't need to initialize them in the first place.
You need 2 nested loops processing the array. The outer loop ranges i from 1 to n - 1, the inner loop ranges j from 2 to n. If your indexes start from 0 instead of 1, you have to take this into consideration for the loop limits. Now, you only need to fill your target array with the combinations [i, j]. To find the correct index in your target array, you should use a third counter variable, initialized with the first index and incremented at the end of the inner loop.
I agree, the math behind is not that hard and I think this explanation should suffice to develop the corresponding code yourself.

Piecewise / Noncontiguous Ranges?

Is there any kind of object class for piecewise / noncontiguous ranges in Julia? For instance, I can create a regular range:
a = UnitRange(1:5)
But, if I wanted to combine this with other ranges:
b = UnitRange([1:5, 8:10, 4:7])
I cannot currently find an object or method. There is a PiecewiseIncreasingRanges module (https://github.com/simonster/PiecewiseIncreasingRanges.jl) that would be just what I want in this situation, except that it, as the name implies, requires the ranges be monotonically increasing.
The context for this is that I am looking for a way to create a compressed, memory efficient version of the SparseMatrixCSC type for sparse matrices with repeating rows. The RLEVectors module will work well to save space on the nonzerovalue vector in the sparse matrix class. Now though I am trying to find something to save space for the rowvalue vector that also defines the sparse matrix, since series of repeating rows will result in ranges of values in that vector (e.g. if the first 10 rows, or even certain columns in the first ten rows, of a sparse matrix are identical, then there will be a lot of 1:10 patterns in the row value vector).
More generally, I'd like a range such as the b object that I try to create above over which I could do an iterated loop, getting:
for (idx, item) in enumerate(hypothetical_object)
println("idx: $idx, item: $item")
end
idx: 1, item: 1
idx: 2, item: 2
...
idx: 5, item: 5
idx: 6, item: 8
idx: 7, item: 9
idx: 8, item: 10
idx: 9, item: 4
idx: 10, item: 5
...
Update: One thing I'm considering, and will probably try implementing if I don't hear other suggestions here, will be to just create an array of PiecewiseIncreasingRange objects, one for each column in my sparse matrix. (I would probably also then break the nonzero value vector into an array of separate pieces, one for each column of my sparse matrix as well). This would at least be relatively simple to implement. I don't have a good sense off the bat how this would compare in terms of computational efficiency to the kind of object I am searching for in this question. I suspect that memory requirements would be about the same.
To loop over a sequence of ranges (or other iterators), you can use the chain function in the Iterators.jl package.
For example:
using Iterators
b = chain(1:5, 8:10, 4:7)
for i in b
println(i)
end
outputs the elements of each range.

How to vectorize complex iterative loop in r

I usually have no problem with vectorization in r, but I am having a tough time in the example below where there are both iterative and non-iterative components in the for loop.
In the code below, I have a calculation that I have to perform based on a set of constants (Dini), a vector of values (Xs), where the ith value of the output vector (Ys) is also dependent on i-1 value:
Dini=128 #constant
Xs=c(6.015, 5.996, 5.989, 5.911, 5.851, 5.851, 5.858, 5.851)
Y0=125.73251 #starting Y value
Ys=c(Y0) #starting of output vector, first value is known
for (Vi in Xs[2:length(Xs)]){
ytm1=Ys[length(Ys)]
y=(955.74301-2*((Dini+ytm1-Vi)^2-ytm1^2)^0.5+2*ytm1*acos(ytm1/(Dini+ytm1-Vi)))/pi/2
Ys=c(Ys, y)
}
df=data.frame(Xs, Ys)
df
Xs Ys
1 6.015 125.7325
2 5.996 125.7273
3 5.989 125.7251
4 5.911 125.7036
5 5.851 125.6859
6 5.851 125.6849
7 5.858 125.6868
8 5.851 125.6850
For this case, where there is a mix of both iterative and non iterative components in the for loop, my mind has got twisted in a non-vectorized knot.
Any suggestions?
You might want to look into use Reduce in this case. For example
Ys<-Reduce(function(prev, cur) {
(955.74301-2*((Dini+prev-cur)^2-prev^2)^0.5 + 2*prev*acos(prev/(Dini+prev-cur)))/pi/2
}, Xs, init=Y0, accumulate=T)[-1]
From the ?Reduce help page: "Reduce uses a binary function to successively combine the elements of a given vector and a possibly given initial value." This makes it easier to create vectors where a given value depends on a previous value.

How to reshape Arrays quickly

In the following code I am using the Julia Optim package for finding an optimal matrix with respect to an objective function.
Unfortunately the provided optimize function only supports vectors, so I have to transform the matrix to a vector before passing it to the optimize function, and also transform it back when using it in the objective function.
function opt(A0,X)
I1(A) = sum(maximum(X*A,1))
function transform(A)
# reshape matrix to vector
return reshape(A,prod(size(A)))
end
function transformback(tA)
# reshape vector to matrix
return reshape(tA, size(A0))
end
obj(tA) = -I1(transformback(tA))
result = optimize(obj, transform(A0), method = :nelder_mead)
return transformback(result.minimum)
end
I think Julia is allocating new space for this every time and it feels slow, so what would be a more efficient way to tackle this problem?
So long as arrays contain elements that are considered immutable, which includes all primitives, then elements of an array are contained in 1 big contiguous blob of memory. So you can break dimension rules and simply treat a 2 dimensional array as a 1-dimensional array, which is what you want to do. So you don't need to reshape, but I don't think reshape is your problem
Arrays are column major and contiguous
Consider the following function
function enumerateArray(a)
for i = 1:*(size(a)...)
print(a[i])
end
end
This function multiplies all of the dimensions of a together and then loops from 1 to that number assuming a is one dimensional.
When you define a as the following
julia> a = [ 1 2; 3 4; 5 6]
3x2 Array{Int64,2}:
1 2
3 4
5 6
The result is
julia> enumerateArray(a)
135246
This illustrates a couple of things.
Yes it actually works
Matrices are stored in column-major format
reshape
So, the question is why doesn't reshape use that fact? Well it does. Here's the julia source for reshape in array.c
a = (jl_array_t*)allocobj((sizeof(jl_array_t) + sizeof(void*) + ndimwords*sizeof(size_t) + 15)&-16);
So yes a new array is created, but the only the new dimension information is created, it points back to the original data which is not copied. You can verify this simply like this:
b = reshape(a,6);
julia> size(b)
(6,)
julia> size(a)
(3,2)
julia> b[4]=100
100
julia> a
3x2 Array{Int64,2}:
1 100
3 4
5 6
So setting the 4th element of b sets the (1,2) element of a.
As for overall slowness
I1(A) = sum(maximum(X*A,1))
will create a new array.
You can use a couple of macros to track this down #profile and #time. Time will additionally record the amount of memory allocated and can be put in front of any expression.
For example
julia> A = rand(1000,1000);
julia> X = rand(1000,1000);
julia> #time sum(maximum(X*A,1))
elapsed time: 0.484229671 seconds (8008640 bytes allocated)
266274.8435928134
The statistics recorded by #profile are output using Profile.print()
Also, most methods in Optim actually allow you to supply Arrays, not just Vectors. You could generalize the nelder_mead function to do the same.

Resources