Does slicing or indexing a chainer.Variable to get an item support backward? - chainer

Does a chainer.Variable in the following code still hold the computational graph and support backward (gradient flow) after a slice (a[0, 1]) or an index (a[0])?
>>> a = chainer.Variable(np.array([[1,2,3],[10,11,12]]))
>>> a
variable([[ 1,  2,  3],
          [10, 11, 12]])
>>> a[0]
variable([1, 2, 3])
>>> a[0, 1]
variable([1])

Yes. Indexing of chainer.Variable supports backprop (it is routed through the differentiable chainer.functions.get_item).
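For example, a minimal sketch (the float dtype and the toy sum-of-squares loss are my own additions so a gradient can be computed):

import numpy as np
import chainer
import chainer.functions as F

a = chainer.Variable(np.array([[1., 2., 3.], [10., 11., 12.]], dtype=np.float32))
b = a[0]             # slicing goes through F.get_item, which is differentiable
loss = F.sum(b * b)  # toy scalar loss
loss.backward()
print(a.grad)
# only the selected row receives a gradient:
# [[ 2.  4.  6.]
#  [ 0.  0.  0.]]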

Related

Using gather() to retrieve rows from 3d tensor with 2d tensor in Pytorch

I'm new to Pytorch and having an issue with the gather() function:
I have a 3d tensor, x[i,j,k]:
x = tensor([[[ 1,  2,  3],
             [ 4,  5,  6],
             [ 7,  8,  9]],
            [[10, 11, 12],
             [13, 14, 15],
             [16, 17, 18]]])
I have an index tensor:
index=tensor([[1,2,0]])
I want to use the values of index to iterate over x[j] and fetch the (complete) rows. I've tried gather() with all dims, squeezing, unsqueezing and it never seems to get the output I'm looking for, which would be:
output = tensor([[[ 4,  5,  6],
                  [ 7,  8,  9],
                  [ 1,  2,  3]],
                 [[13, 14, 15],
                  [16, 17, 18],
                  [10, 11, 12]]])
I've also tried repeating the values of index to get the same shape as x but it did not work.
I know I can do this with a loop, but I'm pretty sure I can do it with gather() as well. Thanks for the help.
Let us set up the two tensors x and index:
>>> x = torch.arange(1,19).view(2,3,3)
>>> index = torch.tensor([[1,2,0]])
What you are looking for is the torch.gather operation:
out[i][j][k] = x[i][index[i][j][k]][k]
In order to apply this function, you need to expand index to the same shape as out. Additionally, a transpose is required to flip your original index tensor.
>>> i = index.T.expand_as(x)
>>> i
tensor([[[1, 1, 1],
         [2, 2, 2],
         [0, 0, 0]],

        [[1, 1, 1],
         [2, 2, 2],
         [0, 0, 0]]])
If you compare with the pseudo code line above, you can see how every element of i represents the row of the original tensor x the operator will gather values from.
Applying the function gets us to the desired result:
>>> x.gather(dim=1, index=index.T.expand_as(x))
tensor([[[ 4,  5,  6],
         [ 7,  8,  9],
         [ 1,  2,  3]],

        [[13, 14, 15],
         [16, 17, 18],
         [10, 11, 12]]])

Graph convolutions in Keras

How can we implement graph convolutions in Keras?
Ideally in the form of a layer accepting 2 inputs: the set of nodes (as a time sequence) and, with the same time-dimension length, the set of integer indexes (into the time dimension) of each node's neighbours.
If we could gather items into the layout and shape that Conv layers expect, we could use normal convolutions.
The gather can be done using this Keras layer which uses tensorflow's gather.
import numpy as np
import tensorflow as tf
from keras.layers import Layer


class GatherFromIndices(Layer):
    """
    To have a graph convolution (over a fixed/fixed-degree kernel) from a given sequence of nodes, we need to gather
    the data of each node's neighbours before running a simple Conv1D/Conv2D,
    which would effectively be a defined convolution (or even TimeDistributed(Dense()) can be used - depending only
    on the data format we want to output).
    This layer should do exactly that.
    Does not support non-integer values; values less than 0 are automatically masked.
    """
    def __init__(self, mask_value=0, include_self=True, flatten_indices_features=False, **kwargs):
        Layer.__init__(self, **kwargs)
        self.mask_value = mask_value
        self.include_self = include_self
        self.flatten_indices_features = flatten_indices_features

    def get_config(self):
        config = {'mask_value': self.mask_value,
                  'include_self': self.include_self,
                  'flatten_indices_features': self.flatten_indices_features,
                  }
        base_config = super(GatherFromIndices, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

    #def build(self, input_shape):
    #    self.built = True

    def compute_output_shape(self, input_shape):
        inp_shape, inds_shape = input_shape
        indices = inds_shape[-1]
        if self.include_self:
            indices += 1
        features = inp_shape[-1]
        if self.flatten_indices_features:
            return tuple(list(inds_shape[:-1]) + [indices * features])
        else:
            return tuple(list(inds_shape[:-1]) + [indices, features])

    def call(self, inputs, training=None):
        inp, inds = inputs
        # assumes input in the shape of (inp=[..., batches, sequence_len, features],
        #  inds=[..., batches, sequence_ind_len, neighbours] ... indexing into inp)
        # for output we want to get [..., batches, sequence_ind_len, indices, features]
        assert_shapes = tf.Assert(tf.reduce_all(tf.equal(tf.shape(inp)[:-2], tf.shape(inds)[:-2])), [inp])
        assert_positive_ins_shape = tf.Assert(tf.reduce_all(tf.greater(tf.shape(inds), 0)), [inds])
        # the shapes need to be the same (with the exception of the last dimension)
        with tf.control_dependencies([assert_shapes, assert_positive_ins_shape]):
            inp_shape = tf.shape(inp)
            inds_shape = tf.shape(inds)

            features_dim = -1
            # ^^ todo for future variability of the last dimension, because maybe it can be made to take not the
            # last dimension as features, but something else.

            inp_p = tf.reshape(inp, [-1, inp_shape[features_dim]])
            ins_p = tf.reshape(inds, [-1, inds_shape[features_dim]])

            # we have lost the batch dimension by reshaping, so we save it by adding the size to the respective indexes
            # we do it because we use gather_nd as non-batched (so we do not need to provide batch indices)
            resized_range = tf.range(tf.shape(ins_p)[0])
            different_seqs_ids_float = tf.scalar_mul(1.0 / tf.to_float(inds_shape[-2]), tf.to_float(resized_range))
            different_seqs_ids = tf.to_int32(tf.floor(different_seqs_ids_float))
            different_seqs_ids_packed = tf.scalar_mul(inp_shape[-2], different_seqs_ids)
            thseq = tf.expand_dims(different_seqs_ids_packed, -1)

            # in case there are negative indices, make them all equal to -1
            # and append the masking value to the end of inp_p - that way, everything that should be masked
            # will get the masking value as features.
            mask = tf.greater_equal(ins_p, 0)  # extract where the negatives are, because they will all default to the mask value
            # .. before the mod operation, if greater id numbers are provided, to wrap small sequences correctly
            offset_ins_p = tf.mod(ins_p, inp_shape[-2]) + thseq  # broadcast to ins_p
            minus_1 = tf.scalar_mul(tf.shape(inp_p)[0], tf.ones_like(mask, dtype=tf.int32))
            '''
            On GPU, if we use index = -1 anywhere it would throw a warning:
            OP_REQUIRES failed at gather_nd_op.cc:50 : Invalid argument:
            flat indices = [-1] does not index into param.
            This is only a warning that there are -1s. We are using that as a feature and know about it.
            '''
            offset_ins_p = tf.where(mask, offset_ins_p, minus_1)
            # also possible to do something like tf.multiply(offset_ins_p, mask) + tf.scalar_mul(-1, mask)

            mask_value_last = tf.zeros((inp_shape[-1],))
            if self.mask_value != 0:
                mask_value_last += tf.constant(self.mask_value)  # broadcasting if needed
            inp_p = tf.concat([inp_p, tf.expand_dims(mask_value_last, 0)], axis=0)

            # expand dims so that it would slice n times instead of taking one slice of length n indices
            neighb_p = tf.gather_nd(inp_p, tf.expand_dims(offset_ins_p, -1))  # [-1, indices, features]

            out_shape = tf.concat([inds_shape, inp_shape[features_dim:]], axis=-1)
            neighb = tf.reshape(neighb_p, out_shape)
            # ^^ [..., batches, sequence_len, indices, features]

            if self.include_self:  # if set, add self at the 0th position
                self_originals = tf.expand_dims(inp, axis=features_dim - 1)
                # ^^ [..., batches, sequence_len, 1, features]
                neighb = tf.concat([neighb, self_originals], axis=features_dim - 1)

            if self.flatten_indices_features:
                neighb = tf.reshape(neighb, tf.concat([inds_shape[:-1], [-1]], axis=-1))

            return neighb
With a debuggable interactive test:
def allow_tf_debug(func):
    """
    Decorator for tests that use tensorflow, to make them more breakpoint-friendly, i.e. to be able to call .eval()
    on tensors immediately.
    """
    def interactive_wrapper():
        sess = tf.InteractiveSession()
        ret = func()
        sess.close()
        return ret
    return interactive_wrapper
@allow_tf_debug  # the test below uses tf.get_default_session(), so an active session is needed
def test_gather_from_indices():
    gat = GatherFromIndices(include_self=False, flatten_indices_features=False)
    # test for include_self=True is not included
    # test for flatten_indices_features not included

    seq = [  # batch of sequences
        # sequences of 2d features
        [[0, 1], [1, 2], [2, 3], [3, 4], [4, 5], [5, 6], [6, 7], [7, 8]],
        [[10, 1], [11, 2], [12, 3], [13, 4], [14, 5], [15, 6], [16, 7], [17, 8]]
    ]
    ids = [  # batch of sequences
        # sequences of 3 ids of each item in the sequence
        [[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3], [5, 5, 5], [6, 6, 6], [7, 7, 7]],
        [[0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5], [5, 6, 7], [6, 7, 0], [7, 0, -1]]
        # minus one should mean masking
    ]

    def compute_assert_2ways_gathers(seq, ids):
        seq = np.array(seq, dtype=np.float32)
        ids = np.array(ids, dtype=np.int32)
        # intended look
        result_np = None
        if len(ids.shape) == 3:  # classical batches
            result_np = np.empty(list(ids.shape) + [seq.shape[-1]])
            for b, seq_in_batch in enumerate(ids):
                for i, sid in enumerate(seq_in_batch):
                    for c, copyid in enumerate(sid):
                        assert ids[b, i, c] == copyid
                        if ids[b, i, c] < 0:
                            result_np[b, i, c, :] = 0
                        else:
                            result_np[b, i, c, :] = seq[b, ids[b, i, c], :]
        elif len(ids.shape) == 4:  # some other batching format...
            result_np = np.empty(list(ids.shape) + [seq.shape[-1]])
            for mb, mseq_in_batch in enumerate(ids):
                for b, seq_in_batch in enumerate(mseq_in_batch):
                    for i, sid in enumerate(seq_in_batch):
                        for c, copyid in enumerate(sid):
                            assert ids[mb, b, i, c] == copyid
                            if ids[mb, b, i, c] < 0:
                                result_np[mb, b, i, c, :] = 0
                            else:
                                result_np[mb, b, i, c, :] = seq[mb, b, ids[mb, b, i, c], :]

        output_shape_kerascomputed = gat.compute_output_shape([seq.shape, ids.shape])
        assert isinstance(output_shape_kerascomputed, tuple)
        assert list(output_shape_kerascomputed) == list(result_np.shape)
        # with tf.get_default_session() as sess:
        sess = tf.get_default_session()
        gat.build(seq.shape)
        result = gat.call([tf.constant(seq), tf.constant(ids)])
        tf_result = sess.run(result)

        assert list(tf_result.shape) == list(output_shape_kerascomputed)
        assert np.all(np.equal(tf_result, result_np))

    compute_assert_2ways_gathers(seq, ids)
    compute_assert_2ways_gathers(seq * 5, ids * 5)
    compute_assert_2ways_gathers([seq] * 3, [ids] * 3)
And a usage example for 5 neighbours per node:
from keras.layers import Input, Conv1D

fields_input = Input(shape=(None, 10), name='nodedata')
neighbours_ids_input = Input(shape=(None, 5), name='nodes_neighbours_ids', dtype='int32')

fields_input_with_neighbours = GatherFromIndices(mask_value=0,
                                                 include_self=True, flatten_indices_features=True)\
    ([fields_input, neighbours_ids_input])

fields = Conv1D(128, kernel_size=5, padding='same',
                activation='relu')(fields_input_with_neighbours)  # data_format="channels_last"
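One possible way to finish the model off, purely as an illustration (the pooling/output head below is a hypothetical addition of mine, not part of the original answer):

from keras.models import Model
from keras.layers import GlobalAveragePooling1D, Dense

pooled = GlobalAveragePooling1D()(fields)             # collapse the node/time dimension
prediction = Dense(1, activation='sigmoid')(pooled)   # hypothetical binary target
model = Model(inputs=[fields_input, neighbours_ids_input], outputs=prediction)
model.compile(optimizer='adam', loss='binary_crossentropy')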

How to do outer product as a layer with chainer?

How can I include an outer product (of the previous feature vector and itself) as a layer in chainer, especially in a way that's compatible with batching?
F.matmul is also very handy.
Depending on the input shapes, you can combine it with F.expand_dims (of course F.reshape works, too) or use transa/transb arguments.
For details, refer to the official documentation of these functions.
Code
import chainer.functions as F
import numpy as np
print("---")
x = np.array([[[1], [2], [3]], [[4], [5], [6]]], 'f')
y = np.array([[[1, 2, 3]], [[4, 5, 6]]], 'f')
print(x.shape)
print(y.shape)
z = F.matmul(x, y)
print(z)
print("---")
x = np.array([[[1], [2], [3]], [[4], [5], [6]]], 'f')
y = np.array([[[1], [2], [3]], [[4], [5], [6]]], 'f')
print(x.shape)
print(y.shape)
z = F.matmul(x, y, transb=True)
print(z)
print("---")
x = np.array([[1, 2, 3], [4, 5, 6]], 'f')
y = np.array([[1, 2, 3], [4, 5, 6]], 'f')
print(x.shape)
print(y.shape)
z = F.matmul(
    F.expand_dims(x, -1),
    F.expand_dims(y, -1),
    transb=True)
print(z)
Output
---
(2, 3, 1)
(2, 1, 3)
variable([[[  1.   2.   3.]
           [  2.   4.   6.]
           [  3.   6.   9.]]

          [[ 16.  20.  24.]
           [ 20.  25.  30.]
           [ 24.  30.  36.]]])
---
(2, 3, 1)
(2, 3, 1)
variable([[[  1.   2.   3.]
           [  2.   4.   6.]
           [  3.   6.   9.]]

          [[ 16.  20.  24.]
           [ 20.  25.  30.]
           [ 24.  30.  36.]]])
---
(2, 3)
(2, 3)
variable([[[  1.   2.   3.]
           [  2.   4.   6.]
           [  3.   6.   9.]]

          [[ 16.  20.  24.]
           [ 20.  25.  30.]
           [ 24.  30.  36.]]])
You can use F.reshape and F.broadcast_to to handle the arrays explicitly.
Assume you have a 2-dim array h with shape (minibatch, feature).
If you want to calculate the outer product of h with itself, try the code below.
Is this what you want to do?
import numpy as np
from chainer import functions as F
def outer_product(h):
    s0, s1 = h.shape
    h1 = F.reshape(h, (s0, s1, 1))
    h1 = F.broadcast_to(h1, (s0, s1, s1))
    h2 = F.reshape(h, (s0, 1, s1))
    h2 = F.broadcast_to(h2, (s0, s1, s1))
    h_outer = h1 * h2
    return h_outer
# test code
h = np.arange(12).reshape(3, 4).astype(np.float32)
h_outer = outer_product(h)
print(h.shape)
print(h_outer.shape, h_outer.data)

Find all cycles of given length (networkx)

Given an undirected graph, how do you go about finding all cycles of length n (using networkx if possible)? So the input would be the graph and n, and the function would return all cycles of that length.
You can use networkx.cycle_basis.
>>> G = networkx.Graph()
>>> networkx.add_cycle(G, [0, 1, 2, 3])
>>> networkx.add_cycle(G, [0, 3, 4, 5])
>>> print(networkx.cycle_basis(G))
[[3, 4, 5, 0], [1, 2, 3, 0]]
>>> print(networkx.cycle_basis(G, root = 2))
[[1, 2, 3, 0]]
Then, you can check the length of each list as you see fit.
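For instance, a small helper (the name cycles_of_length is my own) that keeps only the basis cycles with exactly n nodes; note that cycle_basis returns a cycle basis, so it does not necessarily enumerate every cycle in the graph:

def cycles_of_length(G, n):
    # filter the basis cycles returned by networkx by their number of nodes
    return [cycle for cycle in networkx.cycle_basis(G) if len(cycle) == n]

>>> cycles_of_length(G, 4)
[[3, 4, 5, 0], [1, 2, 3, 0]]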

Sum of column elements of dictionary values made of lists in Python

I have a dictionary as follows
d = {0:[1,2,3], 1:[2,3,4], 2:[3,4,5], 3:[4,5,6]}
What would be the most compact way in Python to sum the same-column elements of the dictionary values (which are lists), i.e. how can I get the following result out of the dictionary values?
[(1+2+3+4), (2+3+4+5), (3+4+5+6)]=[10,14,18]
Without NumPy
>>> d
{0: [1, 2, 3], 1: [2, 3, 4], 2: [3, 4, 5], 3: [4, 5, 6]}
>>> map(sum, zip(*d.values()))
[10, 14, 18]
With NumPy
>>> import numpy as np
>>> d
{0: [1, 2, 3], 1: [2, 3, 4], 2: [3, 4, 5], 3: [4, 5, 6]}
>>> map(np.sum, zip(*d.values()))
[10, 14, 18]
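Note that the output above is from Python 2; in Python 3, map returns an iterator, so wrap it in list() to get the same list:
>>> list(map(sum, zip(*d.values())))
[10, 14, 18]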
