Torch - Query matrix with another matrix

I have an m x n tensor (Tensor 1) and another k x 2 tensor (Tensor 2), and I wish to extract values from Tensor 1 using the (row, column) indices stored in the rows of Tensor 2. For example:
Tensor1
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
[torch.DoubleTensor of size 4x5]
Tensor2
2 1
3 5
1 1
4 3
[torch.DoubleTensor of size 4x2]
And the function would yield:
6
15
1
18

The first solution that comes to mind is to simply loop through the indices and pick the corresponding values:
function get_elems_simple(tensor, indices)
    local res = torch.Tensor(indices:size(1)):typeAs(tensor)
    local i = 0
    res:apply(
        function ()
            i = i + 1
            return tensor[indices[i]:clone():storage()]
        end)
    return res
end
Here tensor[indices[i]:clone():storage()] is just a generic way to pick an element from a multi-dimensional tensor. In the k-dimensional case it is exactly analogous to tensor[{indices[i][1], ... , indices[i][k]}].
This method works fine if you don't have to extract many values. The bottleneck is the :apply method, which cannot benefit from optimizations such as SIMD instructions because the function it executes is a black box. The job can be done far more efficiently: the :index method does exactly what you need, but only with a one-dimensional index tensor. Multi-dimensional target and index tensors therefore need to be flattened first:
function flatten_indices(sp_indices, shape)
    sp_indices = sp_indices - 1
    local n_elem, n_dim = sp_indices:size(1), sp_indices:size(2)
    local flat_ind = torch.LongTensor(n_elem):fill(1)
    local mult = 1
    for d = n_dim, 1, -1 do
        flat_ind:add(sp_indices[{{}, d}] * mult)
        mult = mult * shape[d]
    end
    return flat_ind
end
function get_elems_efficient(tensor, sp_indices)
    local flat_indices = flatten_indices(sp_indices, tensor:size())
    local flat_tensor = tensor:view(-1)
    return flat_tensor:index(1, flat_indices)
end
The difference is drastic:
n = 500000
k = 100
a = torch.rand(n, k)
ind = torch.LongTensor(n, 2)
ind[{{}, 1}]:random(1, n)
ind[{{}, 2}]:random(1, k)
elems1 = get_elems_simple(a, ind) -- 4.53 sec
elems2 = get_elems_efficient(a, ind) -- 0.05 sec
print(torch.all(elems1:eq(elems2))) -- true
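The flattening idea is independent of Torch, so it can be sketched in plain Python. This is only an illustration under stated assumptions: the matrix is a list of lists, indices are 1-based as in the question, and the names flatten_indices and get_elems are hypothetical helpers, not part of any library.

```python
def flatten_indices(indices, shape):
    """Convert 1-based multi-dimensional indices to 0-based row-major offsets."""
    flat = []
    for idx in indices:
        offset = 0
        # Horner-style accumulation: offset = (r - 1) * n_cols + (c - 1) in 2-D
        for pos, dim_size in zip(idx, shape):
            offset = offset * dim_size + (pos - 1)
        flat.append(offset)
    return flat


def get_elems(matrix, indices):
    """Pick matrix[r][c] for every 1-based (r, c) pair via a flattened view."""
    shape = (len(matrix), len(matrix[0]))
    flat_matrix = [value for row in matrix for value in row]
    return [flat_matrix[offset] for offset in flatten_indices(indices, shape)]


# Data from the question (Tensor1 and Tensor2)
tensor1 = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15],
    [16, 17, 18, 19, 20],
]
tensor2 = [(2, 1), (3, 5), (1, 1), (4, 3)]
result = get_elems(tensor1, tensor2)
print(result)  # [6, 15, 1, 18]
```

The speed-up in the Torch version comes from doing the offset arithmetic and the final lookup as whole-tensor operations. The pure-Python loop above only shows the index arithmetic, not the performance benefit.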

Related

How is it possible to make a matrix with special random elements?

Is there a way in Julia to make a matrix with special random elements?
For example, a matrix in which each row has random elements, but every element must appear at least once in each row:
n = zeros(Int, 3, 5)
for i in axes(n, 1)
    for j in axes(n, 2)
        n[i,j] = rand(0:3)
    end
end
n=
3×5 Array{Int64,2}:
1 2 1 1 2
3 3 2 2 0
3 2 1 0 0
But in the second row there is no 1. Could you please help me with how to build such a matrix?
Thanks.
You can use this function:
using Random
function randfill!(m::AbstractMatrix, s::AbstractVector)
    n1 = length(s)
    n2 = size(m, 2)
    @assert n2 >= n1
    for i in 1:size(m,1)
        m[i, 1:n1] .= s
        for j in n1+1:n2
            m[i,j] = rand(s)
        end
        shuffle!(view(m, i, :))
    end
    m
end
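The function above works row by row: it first writes every required element once, fills the remaining columns with random picks from the same set, and then shuffles the row. A minimal Python sketch of the same idea follows. The name rand_fill and its parameters are hypothetical, and it builds a list of lists instead of filling a matrix in place.

```python
import random


def rand_fill(n_rows, n_cols, values, rng=random):
    """Build rows that contain every element of `values` at least once."""
    values = list(values)
    if n_cols < len(values):
        raise ValueError("each row needs at least len(values) columns")
    rows = []
    for _ in range(n_rows):
        # one copy of every required element, then random extras
        extras = [rng.choice(values) for _ in range(n_cols - len(values))]
        row = values + extras
        rng.shuffle(row)  # randomize positions within the row
        rows.append(row)
    return rows


matrix = rand_fill(3, 5, range(4))
for row in matrix:
    print(row)
```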

Dijkstra's algorithm with adjacency matrix

I'm trying to implement the following code from here but it won't work correctly.
What I want is the shortest path distances from a source to all nodes and also the predecessors. Also, I want the input of the graph to be an adjacency matrix which contains all of the edge weights.
I'm trying to make it work in just one function so I have to rewrite it. If I'm right the original code calls other functions (from graph.jl for example).
I don't quite understand how to rewrite the for loop which calls the adj() function.
Also, I'm not sure if the input is correct in the way the code is for now.
function dijkstra(graph, source)
    node_size = size(graph, 1)
    dist = ones(Float64, node_size) * Inf
    dist[source] = 0.0
    Q = Set{Int64}() # visited nodes
    T = Set{Int64}(1:node_size) # unvisited nodes
    pred = ones(Int64, node_size) * -1
    while condition(T)
        # node selection
        untraversed_nodes = [(d, k) for (k, d) in enumerate(dist) if k in T]
        if minimum(untraversed_nodes)[1] == Inf
            break # Break if remaining nodes are disconnected
        end
        node_ind = untraversed_nodes[argmin(untraversed_nodes)][2]
        push!(Q, node_ind)
        delete!(T, node_ind)
        # distance update
        curr_node = graph.nodes[node_ind]
        for (neigh, edge) in adj(graph, curr_node)
            t_ind = neigh.index
            weight = edge.cost
            if dist[t_ind] > dist[node_ind] + weight
                dist[t_ind] = dist[node_ind] + weight
                pred[t_ind] = node_ind
            end
        end
    end
    return dist, pred
end
So if I'm trying it with the following matrix
A = [0 2 1 4 5 1; 1 0 4 2 3 4; 2 1 0 1 2 4; 3 5 2 0 3 3; 2 4 3 4 0 1; 3 4 7 3 1 0]
and source 2, I would like to get the distances in a vector dist and the predecessors in another vector pred.
Right now I'm getting
ERROR: type Array has no field nodes
Stacktrace: [1] getproperty(::Any, ::Symbol) at .\sysimg.jl:18
I guess I have to rewrite it a bit more.
I'm thankful for any help.
Assuming that graph[i,j] is the length of the edge from i to j (your graph is directed, judging by your data), and that graph is a Matrix with non-negative entries where 0 indicates no edge from i to j, a minimal rewrite of your code would be something like:
function dijkstra(graph, source)
    @assert size(graph, 1) == size(graph, 2)
    node_size = size(graph, 1)
    dist = fill(Inf, node_size)
    dist[source] = 0.0
    T = Set{Int}(1:node_size) # unvisited nodes
    pred = fill(-1, node_size)
    while !isempty(T)
        min_val, min_idx = minimum((dist[v], v) for v in T)
        if isinf(min_val)
            break # Break if remaining nodes are disconnected
        end
        delete!(T, min_idx)
        # distance update
        for nei in 1:node_size
            if graph[min_idx, nei] > 0 && nei in T
                possible_dist = dist[min_idx] + graph[min_idx, nei]
                if possible_dist < dist[nei]
                    dist[nei] = possible_dist
                    pred[nei] = min_idx
                end
            end
        end
    end
    return dist, pred
end
(I have not tested it extensively, so please report if you find any bugs)
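For comparison, here is a rough Python sketch of the same adjacency-matrix approach, under the same assumptions (non-negative weights, 0 meaning "no edge"). Note that it uses 0-based indices, so source 2 from the question becomes index 1. It mirrors the simple O(n^2) node selection above and does not use a priority queue.

```python
import math


def dijkstra(graph, source):
    """Shortest distances and predecessors from `source` (0-based index)."""
    n = len(graph)
    dist = [math.inf] * n
    dist[source] = 0.0
    pred = [-1] * n  # -1 marks "no predecessor"
    unvisited = set(range(n))
    while unvisited:
        # pick the unvisited node with the smallest tentative distance
        node = min(unvisited, key=lambda v: dist[v])
        if math.isinf(dist[node]):
            break  # remaining nodes are unreachable
        unvisited.remove(node)
        # relax every edge leaving the selected node
        for nei in unvisited:
            weight = graph[node][nei]
            if weight > 0 and dist[node] + weight < dist[nei]:
                dist[nei] = dist[node] + weight
                pred[nei] = node
    return dist, pred


# Matrix A from the question
A = [
    [0, 2, 1, 4, 5, 1],
    [1, 0, 4, 2, 3, 4],
    [2, 1, 0, 1, 2, 4],
    [3, 5, 2, 0, 3, 3],
    [2, 4, 3, 4, 0, 1],
    [3, 4, 7, 3, 1, 0],
]
dist, pred = dijkstra(A, 1)  # index 1 is node 2 in the question's numbering
print(dist)
print(pred)
```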

Summation inside summation inside production in R

I have problems coding a function to optimize in which there are two summations and one product, all with different indexing. I split the code into two functions for simplicity.
In the first function j goes from 0 to k:
w = function(n,k,gam){
  j = 0:k
  w = (1 / factorial(k)) * n * sum(choose(k, j * gam))
  return(w)
}
In the second function k goes from 0 to n (which is fixed at 10), while the product goes from 1 to length(x):
f = function(gam,del){
  x = mydata #vector of 500 elements
  n = 10
  k = 0:10
  for (i in 0:10)
    pdf = prod( sum( w(n, k[i], gam) * (1 / del + (n/x)^(n+1))
  return(-pdf)}
When I try the function I obtain the following error:
Error in 0:k : argument of length 0
Edit: This is what I am trying to code
where I want to maximize L(d,g) using optim and:
and n is fixed to a specific value.
Solution
Change for (i in 0:10) to for ( i in 1:11 ). Note: When I copied and ran your code I also noticed some unrelated bracket/parentheses omissions you may need to fix also.
Explanation
Your problem is that R uses a 1-based indexing system rather than a 0-based one like many other programming languages or some mathematical formulae. If you run the following code you'll get the same error, and it pinpoints the problem:
k = 0:10
for ( i in 0:10 ) {
print(0:k[i])
}
Error in 0:k[i] : argument of length 0
You get an error on the first iteration because there is no 0 element of k. Compare that to the following loop:
k = 0:10
for ( i in 1:11 ) {
print(0:k[i])
}
[1] 0
[1] 0 1
[1] 0 1 2
[1] 0 1 2 3
[1] 0 1 2 3 4
[1] 0 1 2 3 4 5
[1] 0 1 2 3 4 5 6
[1] 0 1 2 3 4 5 6 7
[1] 0 1 2 3 4 5 6 7 8
[1] 0 1 2 3 4 5 6 7 8 9
[1] 0 1 2 3 4 5 6 7 8 9 10
Update
Your comment to the answer clarifies some additional information you need:
Just to full understand everything, how do I know in a situation like
this that R is indexing the production on x and the summation on k?
The short answer is that it depends on how you nest your loops and function calls. In more detail:
When you call f(), you start a for loop over the elements of k, so R is indexing the block of code within the for loop (everything in the braces in my re-formatted version of f() below) "on" k. For every element in k, you assign prod(...) to pdf. (Side note: I don't know why you're overwriting pdf in every iteration of this loop.)
sum( w(n, k[i], gam) * gamma(1 / del + k[i]) * s^(n + 1)) returns a single number, so prod(sum( w(n, k[i], gam) * gamma(1 / del + k[i]) * s^(n + 1))) is a product over just that one value.
w(n, k[i], gam) * gamma(1 / del + k[i]) * s^(n + 1) produces a vector of length max(length(w(n, k[i], gam)), length(s)) (side note: beware of recycling! See Section 2.2 of "An Introduction to R"); sum( w(n, k[i], gam) * gamma(1 / del + k[i]) * s^(n + 1)) effectively indexes over the elements of that vector.
Etc.
What you're indexing over, explicitly or implicitly through vectorized operations, depends on which level of nested loops or function calls you're talking about. You may need some careful thinking and planning about when you want to index over what, which will tell you how you need to nest things. Put the operation whose indices should vary fastest on the innermost call. For example, in effect, prod(1:3 + sum(1:3)) will index over sum(1:3) to produce that sum first then index over 1:3 + sum(1:3) to produce the product. I.e., sum(1:3) = 1 + 2 + 3 = 6, then prod(1:3 + sum(1:3)) = (1 + 6) * (2 + 6) * (3 + 6) = 7 * 8 * 9 = 504. It's just like how parentheses work in mathematics.
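The prod(1:3 + sum(1:3)) example can be spelled out step by step. The sketch below uses Python instead of R purely for illustration; the variable names are made up, and math.prod requires Python 3.8 or later.

```python
import math

values = [1, 2, 3]

# Innermost call first: the sum collapses the vector to one number.
inner = sum(values)  # 1 + 2 + 3 = 6

# The outer expression then ranges over the original elements again.
shifted = [v + inner for v in values]  # [7, 8, 9]

result = math.prod(shifted)  # 7 * 8 * 9 = 504
print(result)
```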
Also, another side note, I wouldn't refer to global variables from within a function as you do in f() -- I've highlighted below in your code where you do that and offered an alternative that doesn't do it.
f = function(gam, del){
  x = mydata # don't refer to a global variable "mydata", make it an argument
  n = 10
  s = n / x
  k = 1:11
  for (i in 1:11){
    pdf = prod( sum( w(n, k[i], gam) * gamma(1 / del + k[i]) * s^(n + 1)))
  }
  return(-pdf)
}
# Do this instead
# (though there are still other things to fix,
# like re-writing over "pdf" eleven times and only using the last value)
f = function(gam, del, x, n = 10) {
  s = n / x
  k = 0:10
  for (i in 1:11){
    pdf = prod( sum( w(n, k[i], gam) * gamma(1 / del + k[i]) * s^(n + 1)))
  }
  return(-pdf)
}

Calculating Total Number of Times of Loops

I'm trying to calculate the total number of times the innermost statement is executed.
count = 0;
for i = 1 to n
    for j = 1 to n - i
        count = count + 1
I figured that the inner statement executes at most O(n*(n-i)) = O(n^2) times. I wanted to prove this using a double summation, but I'm having trouble setting up the equation because of the inner loop starting at j = 1.
Can someone help explain this to me?
Thanks
For each i, the inner loop executes n - i times (n is constant). Therefore (since i ranges from 1 to n), to determine the total number of times the innermost statement is executed, we must evaluate the sum
(n - 1) + (n - 2) + (n - 3) + ... + (n - n)
By rearranging the terms (grouping all the ns that appear first), we can see that this is equal to
n*n - (1 + 2 + 3 + ... + n) = n*n - n(n+1)/2 = n*(n-1)/2 = n*n/2 - n/2
Here's a simple implementation in Python to verify this:
def f(n):
    count = 0
    for i in range(1, n + 1):
        for _ in range(1, n - i + 1):
            count = count + 1
    return count

for n in range(1, 11):
    # integer division (//) reproduces the integer values shown below
    print(n, f(n), n*n//2 - n//2, sep='\t')
Output:
1 0 0
2 1 1
3 3 3
4 6 6
5 10 10
6 15 15
7 21 21
8 28 28
9 36 36
10 45 45
The first column is n, the second is the number of times that inner statement is executed, and the third is n*n/2 - n/2.
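Since the question asks about a double summation, the same count can be written in that form. The inner loop contributes 1 for each j from 1 to n - i, so the inner sum is just n - i:

```latex
\[
\sum_{i=1}^{n} \sum_{j=1}^{n-i} 1
  = \sum_{i=1}^{n} (n - i)
  = n^2 - \frac{n(n+1)}{2}
  = \frac{n(n-1)}{2}
\]
```

The leading term is n^2/2, which is why the count is O(n^2).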

Summing elems of array using binary recursion

I was starting to understand linear recursion, and then I thought I'd practice on sorting algorithms, but quicksort was where I had trouble with recursion. So I decided to work with a simpler example, a binary sum that I found online. I understand that recursive calls, like all function calls, are executed one at a time and not simultaneously (which is what multi-threading does, but that is not my concern when tracing). So I need to execute all of recursive call A before recursive call B, but I get lost in the mix. Would anyone mind tracing it completely? The example I used has size n = 9, where the elements are all 1's to keep it simple.
/**
 * Sums an integer array using binary recursion.
 * @param arr, an integer array
 * @param i starting index
 * @param n size of the array
 * floor(x) is largest integer <= x
 * ceil(x) is smallest integer >= x
 */
public int binarySum(int arr[], int i, int n) {
    if (n == 1)
        return arr[i];
    return binarySum(arr, i, ceil(n/2)) + binarySum(arr,i + ceil(n/2), floor(n/2));
}
What I personally do is start with an array of size 2. There are two elements.
return binarySum(arr, i, ceil(n/2)) + binarySum(arr,i + ceil(n/2), floor(n/2)) will do nothing but split the array in two and add the two elements. Call this case 1.
This trivial starting point will be the lowest level of the recursion for the larger cases.
Now increase n to 4. The array is split in two: indices 0-2 and 2-4 (end index exclusive).
The 2 elements in indices 0-2 are added as in case 1, and so are the 2 elements in indices 2-4.
Then these two results are added.
Now we can make more sense of the recursion technique. Sometimes understanding it bottom-up is easier, as in this case!
Now to your question. Consider an array of 9 elements: 1 2 3 4 5 6 7 8 9
n = 9 => ceil(9/2) = 5, floor(9/2) = 4
The first (top) call is binarySum(array, 0, 9).
Here n is not 1, hence the recursive call:
return binarySum(array, 0, 5) + binarySum(array, 5, 4)
The first call, binarySum(array, 0, 5), operates on the first 5 elements of the array, and the second, binarySum(array, 5, 4), operates on the last 4 elements.
Hence the array division can be seen like this: 1 2 3 4 5 | 6 7 8 9
The first call finds the sum of the elements 1 2 3 4 5,
the second call finds the sum of the elements 6 7 8 9,
and these two are added together and returned as the answer to the top call!
How are 1+2+3+4+5 and 6+7+8+9 computed? We recurse again,
so the trace looks like this:
[1 2 3 4 5] | [6 7 8 9]
[1 2 3] | [4 5]      [6 7] | [8 9]
[1 2] | [3]   [4] | [5]   [6] | [7]   [8] | [9]
[1] | [2]
Up to this point we are just calling the function recursively.
But now we hit the base case, and the partial sums are combined on the way back up:
if (n == 1)
return arr[i];
[1 + 2]   [3]   [4 + 5]   [6 + 7]   [8 + 9]
[3 + 3]   [9]   [13 + 17]
[6 + 9]   [30]
[15 + 30]
[45]
which is the sum.
So, to understand the recursion, look at what is done to the large instance of the problem, and you can be sure that the same thing happens to each smaller instance.
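To check the order of the calls yourself, the recursion can be instrumented. The following is a hedged Python sketch, not the original Java: ceil(n/2) and floor(n/2) are written with integer arithmetic, and the depth and trace parameters are additions made only for tracing.

```python
def binary_sum(arr, i, n, depth=0, trace=None):
    """Sum arr[i:i+n] by binary recursion, optionally recording each call."""
    if trace is not None:
        trace.append((depth, i, n))  # (recursion depth, start index, size)
    if n == 1:
        return arr[i]
    left = (n + 1) // 2  # ceil(n / 2)
    right = n // 2       # floor(n / 2)
    # The whole left half is evaluated before the right half is started.
    return (binary_sum(arr, i, left, depth + 1, trace)
            + binary_sum(arr, i + left, right, depth + 1, trace))


calls = []
total = binary_sum(list(range(1, 10)), 0, 9, trace=calls)
print(total)  # 45
for depth, start, size in calls:
    print("  " * depth + f"binary_sum(arr, {start}, {size})")
```

The indented listing shows that every call under binary_sum(arr, 0, 5) finishes before binary_sum(arr, 5, 4) begins, which is the "all of A before B" ordering the question asks about.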
This example shows a binary sum in Java.
The trace is based on array indices, where 0 is the starting index and 8 is the last index of the array (so the array has 9 elements):
int sum(int[] arr, int p, int k) {
    if (p == k)
        return arr[k];
    int s = (p + k) / 2;
    return sum(arr, p, s) + sum(arr, s + 1, k);
}

Resources