Looking for an example of how to use the ConvLSTM2D layer in TensorFlow
I'm trying to implement a convolutional LSTM model, and the ConvLSTM2D layer seems to be what I'm looking for, but I couldn't find any examples of how to use it.
This is the documentation on the TensorFlow website: https://www.tensorflow.org/versions/master/api_docs/python/tf/contrib/keras/layers/ConvLSTM2D
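For illustration, here is a minimal sketch of how the layer is typically wired up. It uses the tf.keras API (the linked contrib.keras docs expose the same layer); the shapes, filter counts, and the small classification head are all invented for the example, not taken from any particular model:

```python
import numpy as np
import tensorflow as tf

# ConvLSTM2D expects 5D input: (samples, time, rows, cols, channels).
model = tf.keras.Sequential([
    tf.keras.layers.ConvLSTM2D(
        filters=8, kernel_size=(3, 3), padding='same',
        return_sequences=False,           # emit only the last time step
        input_shape=(10, 32, 32, 1)),     # 10 frames of 32x32, 1 channel
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')

# One dummy batch of 4 sequences, just to show the expected shapes.
x = np.random.rand(4, 10, 32, 32, 1).astype('float32')
y_pred = model.predict(x, verbose=0)
print(y_pred.shape)  # (4, 2)
```

With return_sequences=True the layer would instead emit a (4, 10, 32, 32, 8) tensor, one feature map per time step, which is what you would feed into a second stacked ConvLSTM2D.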
See also questions close to this topic

ForEach loop in Tensorflow?
When I want to write a loss function for TensorFlow, I need to stick to TensorFlow operators, as far as I understand the problem.
As an easy example, I wanted to let TensorFlow place circles without intersections. The data model would just be a matrix with 3 x num_items entries of the form x, z, radius. In pure Python I would then have a for loop which skips 3 elements in each iteration and compares the distance between one center and another with the sum of the radii of both circles. For a simple loss function I would return 0 on the first collision and 1 if there was no collision. I could also return the distance when that helps to create gradients.
Now I am trying to find out how to do something like this with TensorFlow operators. There is a
while_loop
operator, but there seems to be no foreach_loop, and I would even need operators which allow nesting (at least for the naive approach to the circle test).
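One common way around an explicit foreach is to vectorize the pairwise test with broadcasting instead of looping; the same pattern carries over to TensorFlow, where broadcasting, reduce ops, and boolean ops replace the Python loop. A numpy sketch of the all-pairs circle test (the example circle data is made up):

```python
import numpy as np

def any_overlap(circles):
    """circles: (num_items, 3) array of (x, z, radius) rows.
    Returns True if any two distinct circles intersect."""
    centers = circles[:, :2]                 # (n, 2)
    r = circles[:, 2]                        # (n,)
    # Pairwise center distances via broadcasting: (n, 1, 2) - (1, n, 2).
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    overlap = d < (r[:, None] + r[None, :])  # (n, n) boolean matrix
    np.fill_diagonal(overlap, False)         # ignore a circle vs. itself
    return bool(overlap.any())

# Two overlapping circles plus one far away -> True; two distant -> False.
print(any_overlap(np.array([[0., 0., 1.], [1., 0., 1.], [10., 0., 1.]])))  # True
print(any_overlap(np.array([[0., 0., 1.], [5., 0., 1.]])))                 # False
```

In TensorFlow the same shape gymnastics work with tf.norm, tf.reduce_any, and broadcasting, so no while_loop (let alone a nested one) is needed for this test; it is O(n^2) in memory, which is usually fine for moderate num_items.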
Convert 2d tensor to 3d in tensorflow
I need to convert a 2d tensor to a 3d tensor. How can I do this in TensorFlow?
[[30, 29, 19, 17, 12, 11], [30, 27, 20, 16, 5, 1], [28, 25, 17, 14, 7, 2], [28, 26, 21, 14, 6, 4]]
to this
[[[0,30], [0,29], [0,19], [0,17], [0,12], [0,11]], [[1,30], [1,27], [1,20], [1,16],[1,5], [1,1]], [[2,28], [2,25], [2,17], [2,14], [2,7], [2,2]], [[3,28], [3,26], [3,21], [3,14], [3,6], [3,4]]]
Thanks! I am doing this to implement what was asked in How to select rows from a 3D Tensor in TensorFlow? @kom
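One way to read the target shape: each element of row i becomes the pair [i, value]. In numpy that is a stack of a broadcast row index with the data, and TensorFlow has the same building blocks (tf.range, tf.tile or broadcasting, tf.stack). A sketch with the question's data:

```python
import numpy as np

a = np.array([[30, 29, 19, 17, 12, 11],
              [30, 27, 20, 16, 5, 1],
              [28, 25, 17, 14, 7, 2],
              [28, 26, 21, 14, 6, 4]])

rows, cols = a.shape
# Row index for every element, broadcast to a's shape: (4, 6).
idx = np.broadcast_to(np.arange(rows)[:, None], (rows, cols))
# Pair each index with its value along a new last axis: (4, 6, 2).
result = np.stack([idx, a], axis=-1)
print(result.shape)  # (4, 6, 2); e.g. result[1] pairs are [1,30], [1,27], ...
```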

Retrain neural network after loading from pb file
I have figured out how to load a pb file and use the model for inference:
import tensorflow as tf
import os
import sys
from tensorflow.python.platform import gfile
import numpy as np
from scipy.misc import imread, imresize

class ImageFolderDataSource:
    def __init__(self, folder, batch_size, labels):
        if not os.path.exists(folder):
            raise Exception("Folder doesn't exist: {}".format(folder))
        if set(labels) != set(os.listdir(folder)):
            raise Exception("The labels are not consistent with folder structure.")
        self.labels = labels
        self.n_classes = len(self.labels)
        self.labels_index = np.arange(self.n_classes)
        self.index_map = dict(zip(self.labels, self.labels_index))
        self.label_map = dict(zip(self.labels_index, self.labels))
        self.folder = folder
        self.batch_size = batch_size
        self.label_pool = []
        self.file_pool = []
        for label in self.labels:
            label_one_hot = self.one_hot(self.index_map[label])
            label_folder = os.path.join(folder, label)
            label_files = list(map(
                lambda f: os.path.join(label_folder, f),
                os.listdir(label_folder)
            ))
            self.file_pool.extend(label_files)
            self.label_pool.extend(np.repeat([label_one_hot], len(label_files), axis=0))
        self.n_files = len(self.file_pool)
        self.label_pool = np.array(self.label_pool)
        self.file_pool = np.array(self.file_pool)

    def one_hot(self, index):
        res = np.repeat(0, self.n_classes).astype(np.float32)
        res[index] = 1.0
        return res

    def rand_index(self):
        return np.random.choice(np.arange(self.n_files), self.batch_size)

    def get_batch(self):
        index = self.rand_index()
        batch_labels = self.label_pool[index]
        batch_files = self.file_pool[index]
        batch_data = np.array(list(map(lambda f: imread(f), batch_files)))
        return batch_labels, batch_data, batch_files

with tf.Session() as persisted_sess:
    def run_inference_on_image():
        with open("latest_labels.txt") as fh:
            label_names = [x.strip() for x in fh.readlines()]
        batch_reader = ImageFolderDataSource(
            "/home/kaiyin/PycharmProjects/demoloadpbtensorflow/images/validate",
            5, label_names)
        labels, data, files = batch_reader.get_batch()
        print(files)
        data = np.divide(data, np.float32(255.0))
        answer = None
        # # Print all operators in the graph
        # for op in persisted_sess.graph.get_operations():
        #     print(op)
        # # Print all tensors produced by each operator in the graph
        # for op in persisted_sess.graph.get_operations():
        #     print(op.values())
        # tensor_names = [[v.name for v in op.values()] for op in persisted_sess.graph.get_operations()]
        # tensor_names = np.squeeze(tensor_names)
        # print(tensor_names)
        softmax_tensor = persisted_sess.graph.get_tensor_by_name('import/final_result:0')

        def predict(img):
            predictions = persisted_sess.run(softmax_tensor, {'import/input:0': img})
            predictions = np.squeeze(predictions)
            print("##################")
            top_k = predictions.argsort()[:][::-1]  # predictions sorted by score, highest first
            for node_id in top_k:
                human_string = batch_reader.labels[node_id]
                score = predictions[node_id]
                print('%s (score = %.5f)' % (human_string, score))
            answer = labels[top_k[0]]
            return answer

        for index in range(len(labels)):
            img = data[index]
            label_index = np.nonzero(labels[index])[0][0]
            label_name = batch_reader.label_map[label_index]
            if img.shape != (224, 224, 3):
                img = imresize(img, (224, 224, 3))
            predict(np.expand_dims(img, axis=0))

    with gfile.FastGFile("latest.pb", 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    persisted_sess.graph.as_default()
    tf.import_graph_def(graph_def)
    run_inference_on_image()
In this case the model is mobilenet v1.0. I am wondering how I can retrain the weights of this model (or any other neural network model).
I know there is a retrain utility from TensorFlow, but I would like to do it in Python to allow more flexibility. Any suggestions?
Here is the full project: https://github.com/kindlychung/demoloadpbtensorflow

depthwise_conv2d_native in TensorFlow
What is the difference between depthwise_conv2d and depthwise_conv2d_native in TensorFlow?
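As far as I know, depthwise_conv2d is a Python convenience wrapper (handling things like dilation) that ultimately calls the raw registered op depthwise_conv2d_native, so both compute the same operation. For reference, that operation filters each input channel independently instead of mixing channels; a numpy sketch of what both ops compute (stride 1, 'VALID' padding, channel multiplier 1):

```python
import numpy as np

def depthwise_conv2d(x, w):
    """Depthwise convolution, 'VALID' padding, stride 1.
    x: (H, W, C) input; w: (k, k, C), one k x k filter per input channel."""
    H, W, C = x.shape
    k = w.shape[0]
    out = np.zeros((H - k + 1, W - k + 1, C))
    for i in range(H - k + 1):
        for j in range(W - k + 1):
            # Each channel is filtered independently -- no mixing across
            # channels, unlike a regular convolution.
            out[i, j] = np.sum(x[i:i + k, j:j + k] * w, axis=(0, 1))
    return out

out = depthwise_conv2d(np.ones((3, 3, 2)), np.ones((2, 2, 2)))
print(out.shape)  # (2, 2, 2): output channels == input channels
```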

Bottleneck architecture (CNN) in reinforcement learning
In the original DQN paper, they used relatively big (8x8, 4x4) filter sizes. This is undesirable because of the huge number of learnable parameters and the big computational cost. In OpenAI's implementation of the A3C algorithm, they use a 3x3 filter size. If you choose the number of filters and the number of layers carefully, you can have fewer learnable parameters, less computational cost, and more nonlinearity with these 3x3 filters. But I couldn't find any paper where they use the "bottleneck" architecture (usage of 1x1 sized kernels). Is there any particular reason for that?
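The parameter-count argument can be made concrete. A k x k conv layer with c_in input channels and c_out filters has k*k*c_in*c_out weights (ignoring biases); both stacked 3x3 layers and a 1x1 bottleneck cut this sharply versus one big kernel. A quick check with a made-up channel width:

```python
def conv_params(k, c_in, c_out):
    """Weights of a k x k conv layer, ignoring biases."""
    return k * k * c_in * c_out

c = 64  # hypothetical channel width

one_8x8 = conv_params(8, c, c)        # a single big-kernel layer
three_3x3 = 3 * conv_params(3, c, c)  # three stacked 3x3 layers (7x7 receptive field)
bottleneck = (conv_params(1, c, c // 4)        # 1x1 reduce to c/4 channels ...
              + conv_params(3, c // 4, c // 4)  # ... cheap 3x3 in the narrow space ...
              + conv_params(1, c // 4, c))      # ... 1x1 expand back to c
print(one_8x8, three_3x3, bottleneck)  # 262144 110592 4352
```

So a ResNet-style bottleneck is dramatically cheaper still, which makes its absence from the RL papers you mention a fair question.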

Understanding the Mechanics of Training an LSTM
For the past month or so, I have been frantically trying to understand the mechanics of LSTMs (Long Short-Term Memory neural networks). I am fairly new to the field of AI, so excuse me if the question is too primitive.
Let me begin with what I have done so far. I have read these very helpful blog posts on Neural Networks and NLP and on LSTMs. I have installed TensorFlow and successfully run some tutorials, including the one on Recurrent Neural Networks, which I am trying to understand. In a further attempt to understand how to properly build and train LSTMs, I have read the series of tutorials by Erik Hallstrom on Medium (unfortunately I cannot post more than two links, because my reputation is still 1).
So, I have understood how to build the computation graph in TensorFlow and how to run it, how to create an LSTM cell and how to stack cells together, what dropout is, how to calculate the loss and how to use optimizers (which perform backpropagation automatically). I still have some issues understanding the way we split our training data into batches. In
reader.py
I see that we split the training dataset into batches, and Hallstrom's article illustrates how the different parts must be synchronized. But since Python is a dynamic language, I cannot easily see the shape of the tensors, and I have no idea how to identify (print) their contents. So, the question has four branches:
Why don't we just generate a moving window that takes num_steps words at a time? In each epoch we could move it forward by one. Why do we leave that awkward "space" between the various "windows" in the raw data? If they had to be synchronized in some way, that would make sense; in NLP, however, I cannot see any obvious type of synchronization (the windows fall in the middle of sentences).
How does the epoch_size relate to the size of the input data, the batch_size and num_steps? I mean, it is calculated as
(data_len // batch_size - 1) // num_steps
but why?
The tutorial uses the same code to generate the training and the test graph. The difference between the two is that the test graph lacks the Gradient Descent Optimizer and Dropout nodes. But, the way I understand it, when you run the test you should give the LSTM one word at a time and expect to get a hint on what the next word should be. I think that the tutorial outputs the probabilities (logits, -inf to +inf) in a vector for each word in the dictionary (output[1] is the probability for word 1, output[2] for word 2, etc.). If that is the case, why is the y vector x >> 1 when we run the test graph?
Why is the operation that performs the lookup for word embeddings bound to CPU:0? Is that a temporary bugfix, or is there another reason why the lookup cannot be done in parallel?
All of the above questions refer to the code sample from the TensorFlow RNN Tutorial. I understand that the questions might sound silly, but I have read too many articles (including two questions on Stack Overflow on the same tutorial) and want to be able to understand the code in full so as to build an implementation of my own. Thank you all in advance for your help.
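Regarding the epoch_size formula, a worked toy example may help. As I read reader.py, it first reshapes the flat word sequence into batch_size rows of length batch_len = data_len // batch_size; since y is x shifted by one, only batch_len - 1 positions can serve as inputs, and cutting those into windows of num_steps gives (data_len // batch_size - 1) // num_steps iterations per epoch. A pure-Python sketch with toy numbers, mimicking that logic:

```python
import numpy as np

data = np.arange(23)          # toy "word ids"
batch_size, num_steps = 4, 2

batch_len = len(data) // batch_size               # 23 // 4 = 5
grid = data[:batch_size * batch_len].reshape(batch_size, batch_len)
epoch_size = (len(data) // batch_size - 1) // num_steps  # (5 - 1) // 2 = 2

for i in range(epoch_size):
    x = grid[:, i * num_steps:(i + 1) * num_steps]           # inputs
    y = grid[:, i * num_steps + 1:(i + 1) * num_steps + 1]   # same window, shifted by one
    assert x.shape == y.shape == (batch_size, num_steps)

print(epoch_size)  # 2
```

The "- 1" exists purely so the shifted y window never runs past the end of a row.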

Many to one LSTM, multiclass classification
I am trying to train an LSTM RNN with 64 hidden units. My data is the following:
input: numpy array with dimensions (170000, 50, 500) -> (examples, time steps, number of features)
output: numpy array with dimensions (170000, 10)
The output is a categorical variable with 10 classes (e.g., class 1 corresponds to the vector [1,0,0,0,0,0,0,0,0,0]).
So far I have tried this code, but an error appears stating that the Dense layer should have a 3D input.
model = Sequential()
model.add(LSTM(64, input_shape=(50, 500), return_sequences=True,
               dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(units=10, activation='sigmoid'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(input, output, epochs=1, batch_size=64)
The only thing that seems to work is changing the output so it has the following form: (170000, 50, 10), which is basically all zeros except at the 50th timestep.
Is this a correct approach? If so, is there something more efficient? I am concerned that expanding the output's shape might make the code less efficient.
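For what it's worth, the usual many-to-one pattern keeps the targets at (170000, 10) and instead drops return_sequences=True, so the LSTM emits only its final state and the Dense layer sees a 2D input; a softmax also fits 10 mutually exclusive classes better than a sigmoid. A hedged sketch of that variant (a tiny toy batch stands in for the real data so it runs quickly):

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    # return_sequences=False (the default): the layer outputs (batch, 64),
    # which Dense accepts directly -- no 3D/2D mismatch, no padded targets.
    tf.keras.layers.LSTM(64, input_shape=(50, 500),
                         dropout=0.2, recurrent_dropout=0.2),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='adam')

x = np.random.rand(8, 50, 500).astype('float32')  # 8 toy examples
y_pred = model.predict(x, verbose=0)
print(y_pred.shape)  # (8, 10)
```

This avoids the (170000, 50, 10) workaround entirely, which also wastes memory on 49 all-zero steps per example.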

LSTM Variational Autoencoder in Tensorflow
I am developing an LSTM-based Variational Autoencoder as a part of my project. Here is my code. Encoder and decoder are LSTMs with 2 layers and 2048 units. I do not understand what I should do with the LSTM output in both decoder and encoder. Should I take only the last output, or the output at every time step? I am passing video frames, i.e. (batch_size, 1024) -> (batch_size, 32, 32). 32 represents the time steps, and the number of corresponding features is also 32. The LSTM output is (32, batch_size, 2048). Should I encode further in the case of the encoder, as the encoder should reduce the size? And what should I do in the case of the decoder?
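On the "last output vs. every time step" point: a common choice (an assumption about your setup, not the only valid design) is for the encoder to keep only the final time step of its time-major output as the clip summary feeding the latent mean/log-variance, while the decoder emits an output at every step to reconstruct each frame. A numpy sketch of the slicing, with shapes copied from the question:

```python
import numpy as np

batch_size = 4
lstm_out = np.random.rand(32, batch_size, 2048)  # (time, batch, units), time-major

last_step = lstm_out[-1]                 # (batch_size, 2048): one vector per clip
all_steps = lstm_out.transpose(1, 0, 2)  # (batch_size, 32, 2048), batch-major

print(last_step.shape, all_steps.shape)
```

A further projection from 2048 down to the latent size (as you suspected, the encoder should reduce dimensionality) would then be a plain dense layer on last_step.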

Basic RNN in TensorFlow, weights not being updated?
I am trying to implement a basic Recurrent Neural Network in TensorFlow; I am new to both RNNs and TensorFlow. The idea is to get comfortable with how both TensorFlow and RNNs work. Code below. Data: [1., 1., 0.], label: [0., 1., 1.].
import tensorflow as tf
import numpy as np

tf.reset_default_graph()
s = tf.Session()
data = np.asarray([1.0, 1.0, 0.0])
label = np.asarray([0.0, 1.0, 1.0])

with tf.variable_scope('rnn_cell'):
    wx = tf.get_variable('wx', [1])
    wy = tf.get_variable('wy', [1])
    wh = tf.get_variable('wh', [1])

init = tf.global_variables_initializer()
s.run(init)

def rnn_cell(rnn_input, curr_state):
    with tf.variable_scope('rnn_cell', reuse=True):
        wx = tf.get_variable('wx', [1])
        wy = tf.get_variable('wy', [1])
        wh = tf.get_variable('wh', [1])
    next_state = tf.tanh((rnn_input * wx) + curr_state * wh)
    yin = tf.nn.softmax(next_state * wy)
    return next_state, yin

yin_h = []
for i in range(3):
    next_state, yin = rnn_cell(data[i], 0.0)
    yin_h.append(yin)

print(s.run([wx, wh, wy]))
loss = tf.nn.softmax_cross_entropy_with_logits(labels=label, logits=tf.reshape(yin_h, (1, 3)))
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(loss)
print(s.run([train, loss]))
print(s.run([wx, wh, wy]))
However, when I run this program the weights don't update. Can someone help me with this? I get the output below.
[array([ 1.66055334], dtype=float32), array([ 0.91633427], dtype=float32), array([ 1.51226103], dtype=float32)]
[None, array([ 2.19722462], dtype=float32)]
[array([ 1.66055334], dtype=float32), array([ 0.91633427], dtype=float32), array([ 1.51226103], dtype=float32)]
The weights remain the same.
Secondly, is this the right way to create the RNN cell?
Third, if I change the code as below
train = tf.train.GradientDescentOptimizer(learning_rate=0.1)
opt = train.compute_gradients(loss)
print(s.run(opt))
I get the below output
[(array([0.], dtype=float32), array([0.71624851], dtype=float32)), (array([0.], dtype=float32), array([0.57606542], dtype=float32)), (array([0.], dtype=float32), array([0.33180201], dtype=float32))]
It seems like compute_gradients is only calculating 2 of the 3 variables wx, wh, wy. Any idea?
Thanks.
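One likely culprit (an observation from reading the code, not from running your exact graph): rnn_cell applies tf.nn.softmax to a single scalar, and the softmax of a one-element vector is always 1.0 no matter what the input is, so the loss cannot depend on wx or wh and their gradients come out as 0, which matches the zeros in the compute_gradients output. Separately, softmax_cross_entropy_with_logits expects raw logits, so applying softmax yourself beforehand softmaxes twice. A small numerical illustration of the first point:

```python
import numpy as np

def softmax(v):
    # Standard numerically stable softmax.
    e = np.exp(v - np.max(v))
    return e / e.sum()

# Softmax over a single element is 1.0 regardless of the input value,
# so its derivative w.r.t. that input is 0 -- no gradient flows back.
for x in [-3.0, 0.0, 7.5]:
    print(softmax(np.array([x])))  # [1.]

# Over a real vector it does depend on the inputs:
print(softmax(np.array([1.0, 2.0])))
```

Removing the per-scalar softmax (feed next_state * wy directly as the logit for each step) would let gradients reach all three weights.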

TypeError: strided_slice() missing 1 required positional argument: 'strides'
In my repository, when I run the file comp.py I get the following error:
TypeError: strided_slice() missing 1 required positional argument: 'strides'
I used strided_slice in reader.py. I don't know why this error is arising. Any help would be great.
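For context, tf.strided_slice(input_, begin, end, strides) mirrors Python's extended slicing, and in the TF version raising this error, strides appears to be a required positional argument, so every call needs all three of begin, end and strides (the exact call site in your reader.py is not shown, so this is an assumption about it). The Python/numpy equivalent of begin/end/strides:

```python
import numpy as np

t = np.arange(10)

# begin=1, end=8, strides=2 -- the same semantics as a TensorFlow call
# like tf.strided_slice(t, [1], [8], [2]) on a 1-D tensor.
print(t[1:8:2].tolist())  # [1, 3, 5, 7]
```

So the likely fix is simply adding the missing strides list (e.g. [1] per dimension for a plain slice) to each strided_slice call.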

Many to one LSTM time sequence model configuration using Keras
I have a dataset where each row corresponds to an hour of the day and a quantity (besides other features).
My objective is to model an LSTM network to predict these quantities given a set of features fed to the RNN (one of them being the hour of the day, obviously), but I can't get it to work. I'm trying to use this kind of RNN because they seem to be good for sequences; in my case, time sequences related to a target value (the one I want to predict later).
I have read some of the "Many to one" questions here on SE, but they were not of much help to me. Could anyone experienced with this type of network give any starting tips on how to first model this kind of network in a many-to-one fashion, and then predict the output for a future date? An example would help greatly and would be really appreciated.
P.S. After some more thought, I am having doubts about the root of the problem. Could it be better to use a one-to-one rather than a many-to-one model in my case? It seems that what I'm seeking (predicting an int from some simple features that are just ints and floats) falls into the one-to-one problem, but I could be wrong.
Cheers.
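As a starting point, "many to one" in Keras usually means feeding windows of the last few hours as (samples, timesteps, features) and predicting a single value per window; leaving return_sequences off means only the last step reaches the output layer. A hedged sketch with made-up window size and feature count (your real data would replace the random arrays):

```python
import numpy as np
import tensorflow as tf

timesteps, n_features = 24, 5   # hypothetical: 24 past hours, 5 features per hour
model = tf.keras.Sequential([
    # return_sequences=False (the default): many timesteps in, one vector out.
    tf.keras.layers.LSTM(32, input_shape=(timesteps, n_features)),
    tf.keras.layers.Dense(1),   # one predicted quantity per window
])
model.compile(loss='mse', optimizer='adam')

x = np.random.rand(16, timesteps, n_features).astype('float32')  # 16 toy windows
y_pred = model.predict(x, verbose=0)
print(y_pred.shape)  # (16, 1)
```

To forecast a future hour you would build the input window from the hours leading up to it; if a single row of features really suffices to predict the target, a plain dense (one-to-one) model is indeed the simpler first thing to try, as your P.S. suspects.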