how to give a fixed embedding matrix to EmbeddingLayer in Lasagne?

I have implemented a deep learning architecture which uses Lasagne's EmbeddingLayer.
I have word vectors already learned using word2vec, and I do not want them to be trainable parameters of my network.
After reading the documentation, I believe it specifies that the numpy array provided to the 'W' parameter is only the initial value for the embedding matrix.
How can I declare/specify the EmbeddingLayer in the code so that it uses the input weight matrix as a fixed matrix of word vectors?

The above problem can be solved by passing trainable=False for the weight parameter of the custom layer defined to work as the embedding layer.
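For reference, a minimal sketch of one way to do the same with the standard EmbeddingLayer rather than a custom layer (the vocabulary size, embedding dimension, and random matrix below are placeholders for your real word2vec vectors): build the layer with W set to the pretrained matrix, then remove the 'trainable' tag from that parameter.

import numpy as np
import theano.tensor as T
import lasagne

# Placeholder for the real word2vec matrix: (vocab_size, embedding_dim).
embeddings = np.random.randn(10000, 100).astype('float32')

x = T.imatrix('x')  # batches of word indices
l_in = lasagne.layers.InputLayer((None, None), input_var=x)
l_emb = lasagne.layers.EmbeddingLayer(l_in, input_size=10000, output_size=100, W=embeddings)

# Strip the 'trainable' tag so get_all_params(..., trainable=True),
# the usual input to the update rules, no longer returns this matrix.
l_emb.params[l_emb.W].remove('trainable')

With the tag removed (or, in a custom layer, with trainable=False passed to add_param), the update rules never touch the embedding matrix, so the word vectors stay fixed during training.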

Related

How to convert .dicom (slices) to a single (volume) image?

I have n slices. Is it possible to convert them into a single file (with the correct slice arrangement) and parse it using ImageIO or any other Python package?
I'm not sure what ImageIO is, but for parsing a set of slices (by which I assume you mean a single CT- or MR-type series that is meant to form a single 3D volume), check out SimpleITK.
I think it will do exactly what you want: it's a very complete, 3D-aware DICOM library (and very fast, as it's wrapped around C libraries). In particular, it will read a complete multi-file series and create a single 3D representation of it.
Its representation is based on extended numpy objects: in particular, it will have a 3D numpy array for the series, but in addition it knows the 3D location/orientation of the series relative to the DICOM patient coordinate system.
So once you have that, you've got all the spatial/3D info you need to use it with any other Python libraries.
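A minimal sketch of reading a series this way (the directory path is a placeholder, and this assumes all slices of one series live in that directory):

import SimpleITK as sitk

# Find the slice files of the series and read them as one 3D image,
# sorted into the correct slice order.
reader = sitk.ImageSeriesReader()
file_names = reader.GetGDCMSeriesFileNames('/path/to/dicom/slices')
reader.SetFileNames(file_names)
volume = reader.Execute()

print(volume.GetSize())       # (x, y, number_of_slices)
print(volume.GetSpacing())    # voxel spacing, including slice spacing
print(volume.GetOrigin())     # position in the patient coordinate system
print(volume.GetDirection())  # orientation cosines

array = sitk.GetArrayFromImage(volume)  # numpy array with shape (z, y, x)

# Optionally write the whole volume to a single file, e.g. NIfTI:
sitk.WriteImage(volume, 'volume.nii.gz')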

keras embedding vector back to one-hot

I am using Keras on an NLP problem. A question about word embeddings comes up when I try to predict the next word from the previous words. I have already turned the one-hot words into word vectors via the Keras Embedding layer, like this:
word_vector = Embedding(input_dim=2000, output_dim=100)(word_one_hot)
I then use this word_vector to do something, and the model finally outputs another word_vector. But I have to see what the predicted word really is. How can I turn the word_vector back into word_one_hot?
This question is old but seems to be linked to a common point of confusion about what embeddings are and what purpose they serve.
First off, you should never convert to one-hot if you're going to embed afterwards. This is just a wasted step.
Starting with your raw data, you need to tokenize it. This is simply the process of assigning a unique integer to each element in your vocabulary (the set of all possible words/characters [your choice] in your data). Keras has convenience functions for this:
from keras.preprocessing.sequence import pad_sequences
from keras.preprocessing.text import Tokenizer

# max_words is just an example value: the number of most frequently
# occurring words in your data set that you want to use in your model.
max_words = 100
tokenizer = Tokenizer(num_words=max_words)

# This builds the word index.
tokenizer.fit_on_texts(df['column'])

# This turns strings into lists of integer indices.
train_sequences = tokenizer.texts_to_sequences(df['column'])

# This is how you can recover the word index that was computed.
print(tokenizer.word_index)
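The pad_sequences import comes into play when you batch these sequences, since they have different lengths. A short continuation of the snippet above (maxlen is just an example cutoff):

# Pad/truncate all sequences to the same length so they can be batched.
maxlen = 20
train_data = pad_sequences(train_sequences, maxlen=maxlen)
print(train_data.shape)  # (num_samples, maxlen)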
Embeddings generate a representation. Later layers in your model use earlier representations to generate more abstract representations. The final representation is used to generate a probability distribution over the number of possible classes (assuming classification).
When your model makes a prediction, it provides a probability estimate for each of the integers in the word_index. So if 'cat' is the most likely next word and your word_index contains something like {'cat': 666}, ideally the model would assign a high probability to 666 (not to the string 'cat'). Does this make sense? The model never predicts an embedding vector; the embedding vectors are intermediary representations of the input data that are (hopefully) useful for predicting the integer associated with a word/character/class.
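As a minimal sketch of that last step, assuming your model ends in a softmax over the num_words classes and preds holds its output for a single sample (preds is an illustrative name, not something Keras provides):

import numpy as np

# Invert the tokenizer's word index so integers map back to words.
index_to_word = {index: word for word, index in tokenizer.word_index.items()}

predicted_index = int(np.argmax(preds))  # the most likely class
predicted_word = index_to_word.get(predicted_index, '<unknown>')
print(predicted_word)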

Implement BidirectionalGridLSTM

I'm implementing a chatbot using TensorFlow's seq2seq model [1], feeding it data from the Ubuntu Dialogue Corpus. I want to compare an RNN using standard LSTM cells with one using the Grid LSTM cells described in Kalchbrenner et al. [2].
I'm trying to implement the Grid LSTM cell in the translation model described in section 4.4 of [2], but I'm struggling with the bidirectional part.
I have tried using BidirectionalGridLSTMCell, but I'm not sure what is meant by num_frequency_blocks; the paper does not mention it. Does anyone know what num_frequency_blocks means? The API docs say:
num_frequency_blocks: [required] A list of frequency blocks needed to cover the whole input feature splitting defined by start_freqindex_list and end_freqindex_list.
Further, I have tried to create my own cell. First I do the forward processing with the inputs; then I reverse the inputs and do the backward processing. But when I concatenate these results, the shape changes. E.g., when I try to run the network with a batch size of 32, I get this error:
ValueError: Dimensions must be equal, but are 64 and 32
How can I concatenate the results without changing the shape? Is that even possible?
Does anyone have any other tips, on how I can implement Bidirectional Grid LSTM?
[1] https://www.tensorflow.org/tutorials/seq2seq/
[2] https://arxiv.org/abs/1507.01526
TensorFlow has bidirectional LSTMs built in: https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/bidirectional_rnn.ipynb
Here's a tutorial on using bidirectional LSTMs for intent matching: https://blog.themusio.com/2016/07/18/musios-intent-classifier-2/
You're missing your second [2] reference link.
Is this a helpful baseline, even if they don't provide grids?
May I ask what you are using it for?
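For what it's worth, a minimal sketch of the built-in bidirectional wrapper with plain LSTM cells (TF 1.x API; all sizes are made up). Note that the forward and backward outputs are concatenated along the feature axis; concatenating along axis 0 is what turns a batch of 32 into 64, as in the error above.

import tensorflow as tf

batch_size, max_time, num_features, num_units = 32, 20, 128, 256
inputs = tf.placeholder(tf.float32, [batch_size, max_time, num_features])

fw_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
bw_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)

# The wrapper reverses the inputs for the backward cell internally.
(fw_out, bw_out), _ = tf.nn.bidirectional_dynamic_rnn(
    fw_cell, bw_cell, inputs, dtype=tf.float32)

# Concatenate along the feature axis (axis=2), not the batch axis (axis=0).
outputs = tf.concat([fw_out, bw_out], axis=2)  # (batch, time, 2 * num_units)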

matrix can contain elements of different classes

I am a newbie to R programming. In a tutorial on the R language, I read that a matrix cannot have elements of different classes.
But I am able to create a matrix as follows:
x <- matrix(list(1, "2", TRUE, 1+1i), nrow = 2, ncol = 2)
Please explain what I am missing.
A matrix is implemented as a vector with a dim attribute. A list is technically a type of vector, so what you created is "legal" in that sense.
But it's not very useful, because most functions that take a matrix as input expect the matrix to be of an atomic type (a list is a recursive type).

Generating a SequenceFile

Given data in the following format (tag_uri image_uri image_uri image_uri ...), I need to turn it into Hadoop SequenceFile format for further processing by Mahout (e.g. clustering):
http://flickr.com/photos/tags/100commentgroup http://flickr.com/photos/34254318#N06/4019040356 http://flickr.com/photos/46857830#N03/5651576112
http://flickr.com/photos/tags/100faves http://flickr.com/photos/21207178#N07/5441742937
...
Before this, I would turn the input into CSV (or ARFF) as follows:
http://flickr.com/photos/tags/100commentgroup,http://flickr.com/photos/tags/100faves,...
0,1,...
1,1,...
...
with each row describing one tag. The ARFF file is then converted into a vector file that Mahout uses for further processing. I am trying to skip the ARFF generation step and generate a SequenceFile instead. If I am not mistaken, to represent my data as a SequenceFile, I would need to store each row of the data with $tag_uri as the key and $image_vector as the value. What is the proper way of doing this (and, if possible, can the tag_uri for each row be included somewhere in the SequenceFile)?
Some references that I found, though I am not sure whether they are relevant:
Writing a SequenceFile
Formatting input matrix for svd matrix factorization (can I store my matrix in this form?)
RandomAccessSparseVector (considering I only list images that are assigned with a given tag instead of all the images in a line, is it possible to represent it using this vector?)
SequenceFile write
SequenceFile explanation
You just need a SequenceFile.Writer, which is explained in your link #4. This lets you write key-value pairs to the file. What the key and value are depends on your use case, of course. It's not at all the same for clustering versus matrix decomposition versus collaborative filtering. There's not one SequenceFile format.
Chances are that the key or value will be a Mahout Vector. The thing that knows how to write a Vector is VectorWritable. This is the class you would use to wrap a Vector and write it with SequenceFile.Writer.
You would need to look at the job that will consume it to make sure you're passing what it expects. For clustering, for example, I think the key is ignored and the value is a Vector.
