Looking for an example on how to use ConvLSTM2D layer in tensorflow
I'm trying to implement a convolutional LSTM model, and the ConvLSTM2D layer seems to be what I'm looking for, but I couldn't find any examples of how to use it.
This is the documentation in tensorflow website: https://www.tensorflow.org/versions/master/api_docs/python/tf/contrib/keras/layers/ConvLSTM2D
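In case it helps, here is a minimal sketch of how the layer can be wired up (the shapes, filter counts, and the Conv2D head are illustrative assumptions, not taken from any official example):

```python
import tensorflow as tf

# ConvLSTM2D consumes 5-D input: (batch, time, rows, cols, channels).
# Here: sequences of 10 frames, each 40x40 with 1 channel.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10, 40, 40, 1)),
    tf.keras.layers.ConvLSTM2D(
        filters=16,
        kernel_size=(3, 3),
        padding="same",
        return_sequences=False,  # emit only the final step's feature map
    ),
    tf.keras.layers.Conv2D(1, (3, 3), padding="same", activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```

With `return_sequences=True` the layer would instead emit a feature map per timestep, which is what you'd want when stacking several ConvLSTM2D layers.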
See also questions close to this topic

Tensorflow Checkpoint
Why is my tf.keras.Model not updated after I restore the checkpoint?
Below is the example code that I am running to understand the behavior of tfe.Checkpoint.
import os

import tensorflow as tf
tf.enable_eager_execution()
import tensorflow.contrib.eager as tfe

# Create
inputs = tf.keras.Input(shape=(2,))
outputs = tf.keras.layers.Dense(3, activation=tf.nn.relu, use_bias=False)(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
print(model.variables)

# Modify
print('Modify model:')
model.variables[0].assign([[1., 2., 3.], [1., 2., 3.]])
print(model.variables)

# Save
this_checkpoint = tfe.Checkpoint(model=model)
checkpoint_directory = './tmp3'
checkpoint_prefix = os.path.join(checkpoint_directory, "ckpt")
save_path = this_checkpoint.save(checkpoint_prefix)
print('Saved model as %s' % save_path)

# Restore
print('\nNew model:')
inputs2 = tf.keras.Input(shape=(2,))
outputs2 = tf.keras.layers.Dense(3, activation=tf.nn.relu, use_bias=False)(inputs2)
model2 = tf.keras.Model(inputs=inputs2, outputs=outputs2)
print(model2.variables)

ckpt_to_restore = tf.train.latest_checkpoint(checkpoint_directory)
print('\nRestoring model from %s' % ckpt_to_restore)
new_checkpoint = tfe.Checkpoint(model=model2)
new_checkpoint.restore(ckpt_to_restore)
print(model2.variables)
The output is:
[<tf.Variable 'dense/kernel:0' shape=(2, 3) dtype=float32, numpy=
array([[0.72318137, 1.05397141, 0.79438519],
       [0.64990056, 0.88330412, 1.07289195]], dtype=float32)>]
Modify model:
[<tf.Variable 'dense/kernel:0' shape=(2, 3) dtype=float32, numpy=
array([[1., 2., 3.],
       [1., 2., 3.]], dtype=float32)>]
Saved model as ./tmp3/ckpt-1

New model:
[<tf.Variable 'dense_1/kernel:0' shape=(2, 3) dtype=float32, numpy=
array([[0.74109459, 0.93528509, 0.05304849],
       [0.87230837, 0.74049175, 0.26022822]], dtype=float32)>]

Restoring model from ./tmp3/ckpt-1
[<tf.Variable 'dense_1/kernel:0' shape=(2, 3) dtype=float32, numpy=
array([[0.74109459, 0.93528509, 0.05304849],
       [0.87230837, 0.74049175, 0.26022822]], dtype=float32)>]
I was expecting the dense layer in model2 to get the values [[1,2,3],[1,2,3]]. Why is the layer not updated to the values saved in the checkpoint?
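For comparison, here is a self-contained round trip with `tf.train.Checkpoint` on a bare variable that does restore the saved value (a sketch assuming TF 2.x eager semantics; restores are deferred and matched by attribute name in the object graph, so checking the returned status can surface silent mismatches):

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

ckpt_dir = tempfile.mkdtemp()
v = tf.Variable([[1., 2., 3.], [1., 2., 3.]])
ckpt = tf.train.Checkpoint(v=v)
path = ckpt.save(os.path.join(ckpt_dir, "ckpt"))

# A fresh object graph with the SAME attribute name ("v") is matched on restore.
v2 = tf.Variable(tf.zeros([2, 3]))
ckpt2 = tf.train.Checkpoint(v=v2)
status = ckpt2.restore(tf.train.latest_checkpoint(ckpt_dir))
status.assert_existing_objects_matched()  # raises if a tracked object found no saved value
print(v2.numpy())  # restored to the saved [[1., 2., 3.], [1., 2., 3.]]
```

If the equivalent status check fails on the model-based code above, that would point to the two object graphs not lining up the way the checkpoint expects.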

Installing TensorFlow with GPU support: nvcuda.dll error
I am trying to install tensorflow with gpu support.
I followed this instruction
https://docs.nvidia.com/deeplearning/sdk/cudnninstall/index.html#installwindows
Yet I receive this error:
Could not find 'nvcuda.dll'. TensorFlow requires that this DLL be installed in a directory that is named in your %PATH% environment variable.
I have added the variable as stated in the guide.
What could be the problem? Thanks for the help.

Keras LSTM go_backwards usage
I have a question regarding the usage of the go_backwards argument in the Keras LSTM model layer. The documentation for this layer can be found here: https://keras.io/layers/recurrent/#lstm.
Question 1: If I set the go_backwards flag to True, do I still feed the input data "forwards" during training? For example, if an input sentence in English normally reads "I fell", and its German translation reads "Ich fiel", would I feed them forwards ("I fell", "Ich fiel") or backwards ("fell I", "fiel Ich") during training?
Question 2: The same question for making model predictions: is the data fed forward ("I fell") or reversed ("fell I")?
I think I should feed all data forward, but I can't find any documentation that convinces me this is correct.
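In the absence of explicit documentation, one way to convince yourself is to check that an LSTM with go_backwards=True fed forward-ordered input matches a plain LSTM fed the manually reversed sequence, with shared weights (a sketch assuming the tf.keras implementation; the layer names are illustrative):

```python
import numpy as np
import tensorflow as tf

x = np.random.rand(4, 6, 3).astype("float32")  # (batch, timesteps, features)

fw = tf.keras.layers.LSTM(5)                     # ordinary forward LSTM
bw = tf.keras.layers.LSTM(5, go_backwards=True)  # reads the sequence in reverse
fw.build(x.shape)
bw.build(x.shape)
bw.set_weights(fw.get_weights())                 # share weights for the comparison

out_bw = bw(x)                  # forward-ordered input, processed back-to-front
out_fw_rev = fw(x[:, ::-1, :])  # manually reversed input, processed front-to-back

match = np.allclose(np.asarray(out_bw), np.asarray(out_fw_rev), atol=1e-5)
print(match)
```

If `match` is True on your install, the two are equivalent, i.e. you keep feeding the data in its natural forward order and let go_backwards handle the reversal internally.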

ValueError: Error when checking : expected dense_1_input to have shape (9,) but got array with shape (1,)
I have this dataset:
   step  pos_x     pos_y     vel_x     vel_y     ship_lander_angle  ship_lander_angular_vel  leg_1_ground_contact  leg_2_ground_contact  action
0  0     0.004053  0.937387  0.410560  0.215127  0.004703           0.092998                 0.0                   0.0                   3
1  1     0.008040  0.933774  0.401600  0.240878  0.007613           0.058204                 0.0                   0.0                   3
2  2     0.011951  0.929763  0.392188  0.267401  0.008632           0.020372                 0.0                   0.0                   3
3  3     0.015796  0.925359  0.383742  0.293582  0.007955           0.013536                 0.0                   0.0                   3
4  4     0.019576  0.920563  0.375744  0.319748  0.005674           0.045625                 0.0                   0.0                   3
I split it as follows:
X = dataset[dataset.columns.difference(["action"])]
Y = dataset["action"]

# Use range scaling to scale all variables to between 0 and 1
min_max_scaler = preprocessing.MinMaxScaler()
cols = X.columns
X = pd.DataFrame(min_max_scaler.fit_transform(X), columns=cols)  # watch out for putting the columns back in here

# Perform the split into train, validation, test
x_train_plus_valid, x_test, y_train_plus_valid, y_test = \
    train_test_split(X, Y, random_state=0, test_size=0.30, train_size=0.7)
x_train, x_valid, y_train, y_valid = \
    train_test_split(x_train_plus_valid, y_train_plus_valid, random_state=0,
                     test_size=0.199/0.7, train_size=0.5/0.7)

# Convert the target classes to binary (one-hot) numpy arrays
y_train_wide = keras.utils.to_categorical(np.asarray(y_train))
y_train_plus_valid_wide = keras.utils.to_categorical(np.asarray(y_train_plus_valid))
y_valid_wide = keras.utils.to_categorical(np.asarray(y_valid))
And I use this neural network to train on my data:
model_mlp = Sequential()
model_mlp.add(Dense(input_dim=9, units=32))
model_mlp.add(Activation('relu'))
model_mlp.add(Dropout(0.2))
model_mlp.add(Dense(32))
model_mlp.add(Activation('relu'))
model_mlp.add(Dropout(0.2))
model_mlp.add(Dense(4))
model_mlp.add(Activation('softmax'))
#model.add(Dense(num_classes, activation='softmax'))

model_mlp.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model_mlp.fit(np.asfarray(x_train), np.asfarray(y_train_wide),
              epochs=20, batch_size=32, verbose=1,
              validation_data=(np.asfarray(x_valid), np.asfarray(y_valid_wide)))
I get almost 93% accuracy. I save the model as follows:
filepath = "first_model.mod"
model_mlp.save(filepath)
In another file, where I need to load the model and calculate the reward, I get the above-mentioned error:
if __name__ == "__main__":
    # Load the Lunar Lander environment
    env = LunarLander()
    s = env.reset()

    # Load and initialise the control model
    ROWS = 64
    COLS = 64
    CHANNELS = 1
    model = keras.models.load_model("first_model.mod")

    # Run the game loop
    total_reward = 0
    steps = 0
    while True:
        # Get the model to make a prediction
        a = model.predict_classes(s)
        a = a[0]

        # Step on the game
        s, r, done, info = env.step(a)
        env.render()
        total_reward += r
        if steps % 20 == 0 or done:
            print(["{:+0.2f}".format(x) for x in s])
            print("step {} total_reward {:+0.2f}".format(steps, total_reward))
        steps += 1
        if done:
            break
The error is at the following line:
a = model.predict_classes(s)
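A likely culprit (an assumption, since the environment code isn't shown) is that env.reset() returns a 1-D state vector, while Keras prediction methods expect a 2-D batch of shape (n_samples, n_features); a 1-D array of 9 values is then read as 9 samples of 1 feature each. Reshaping the state into a single-row batch is the usual fix, sketched here with numpy only (the state values are made up):

```python
import numpy as np

# A hypothetical 9-feature state vector as returned by env.reset()
s = np.array([0.004, 0.937, 0.41, 0.215, 0.0047, 0.093, 0.0, 0.0, 3.0])

batch = s.reshape(1, -1)  # one sample, nine features
print(batch.shape)        # (1, 9)
# model.predict_classes(batch) would then see the expected (9,) per-sample shape
```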

Image Classification - Tensorflow
I am building an image classifier for 10 categories using TensorFlow, but my test accuracy is only around 25%. My parameters are:

params = {"input_shape": [200, 300, 3],
          "conv_layers": 5,
          "filters": [40, 50, 60, 60, 60],
          "kernel_size": [5, 5],
          "hidden_units": 9000,
          "drop_rate": 0.4,
          "n_classes": 10}

with the ReLU activation function. I have 1000 training images (300x200) per category. What can I do to get more accuracy? What alterations can be made to the params? How many epochs would be efficient?

Low validation accuracy in parallel DenseNet
I've taken the code from https://github.com/flyyufelix/cnn_finetune and remodeled it so that there are now two DenseNet121s in parallel, with the layers after each model's last global-average-pooling layer removed.
Both models were joined together like this:
print("Begin model 1")
model = densenet121_model(img_rows=img_rows, img_cols=img_cols, color_type=channel, num_classes=num_classes)

print("Begin model 2")
model2 = densenet121_nw_model(img_rows=img_rows, img_cols=img_cols, color_type=channel, num_classes=num_classes)

mergedOut = Add()([model.output, model2.output])
#mergedOut = Flatten()(mergedOut)
mergedOut = Dense(num_classes, name='cmb_fc6')(mergedOut)
mergedOut = Activation('softmax', name='cmb_prob')(mergedOut)
newModel = Model([model.input, model2.input], mergedOut)

adam = Adam(lr=1e-3, decay=1e-6, amsgrad=True)
newModel.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])

# Start fine-tuning
newModel.fit([X_train, X_train], Y_train,
             batch_size=batch_size,
             nb_epoch=nb_epoch,
             shuffle=True,
             verbose=1,
             validation_data=([X_valid, X_valid], Y_valid))
The first model has its layers frozen, and the one in parallel is supposed to learn additional features on top of the first model to improve accuracy.
However, even at 100 epochs, the training accuracy is almost 100% but validation floats around 9%.
I'm not quite sure what the reason could be or how to fix it, considering I've already changed the optimizer from SGD (same concept: two DenseNets, with the first pretrained on ImageNet and the second starting with no weights; same results) to Adam (two DenseNets, both pretrained on ImageNet).
Epoch 101/1000
1000/1000 [==============================] - 1678s 2s/step - loss: 0.0550 - acc: 0.9820 - val_loss: 12.9906 - val_acc: 0.0900
Epoch 102/1000
1000/1000 [==============================] - 1703s 2s/step - loss: 0.0567 - acc: 0.9880 - val_loss: 12.9804 - val_acc: 0.1100

Lstm Autoencoder for time series
I am trying to build an LSTM autoencoder to predict time-series data. Since I am new to Python, I have mistakes in the decoding part. I tried to build it up like the examples here and in Keras, but I could not understand the difference between the given examples at all. The code that I have right now looks like this:
Question 1: How do I choose the batch_size and input dimension when each sample has 2000 values?
Question 2: How do I get this LSTM autoencoder working, so that it predicts from, let's say, sample 10 on until the end of the data?
My data has 1500 samples in total; I would go with 10 time steps (or more if better), and each sample has 2000 values. If you need more information, I can include it later.
from keras.layers import Input, LSTM, RepeatVector
from keras.models import Model

trainX = np.reshape(data, (1500, 10, 2000))
Parameters:

timesteps = 10
input_dim = 2000
units = 100       # unit number chosen randomly
batch_size = 2000
epochs = 20
Model:

inpE = Input((timesteps, input_dim))
outE = LSTM(units=units, return_sequences=True)(inpE)
encoder = Model(inpE, outE)
inpD = RepeatVector(timesteps)(outE)
outD = LSTM(input_dim, return_sequences=True)(inpD)
decoder = Model(inpD, outD)
autoencoder = Model(inpE, outD)
autoencoder.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
autoencoder.fit(trainX, trainX, batch_size=batch_size, epochs=epochs)
encoderPredictions = encoder.predict(trainX)
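For reference, here is a minimal sketch of the usual Keras LSTM-autoencoder wiring (an assumption about what was intended: the encoder compresses each sequence to a single vector, so it must not use return_sequences; RepeatVector then re-expands that vector over the timesteps for the decoder; mse is also a more natural loss than binary crossentropy for real-valued data):

```python
from tensorflow.keras.layers import Input, LSTM, RepeatVector
from tensorflow.keras.models import Model

timesteps, input_dim, units = 10, 2000, 100

inp = Input(shape=(timesteps, input_dim))
encoded = LSTM(units)(inp)                  # (batch, units): no return_sequences
decoded = RepeatVector(timesteps)(encoded)  # (batch, timesteps, units)
decoded = LSTM(input_dim, return_sequences=True)(decoded)  # reconstruct every step

autoencoder = Model(inp, decoded)
encoder = Model(inp, encoded)
autoencoder.compile(loss='mse', optimizer='adam')
# autoencoder.fit(trainX, trainX, batch_size=32, epochs=epochs) on (1500, 10, 2000) data
```

On the batch_size question: it is the number of sequences per gradient update, independent of the 2000 features per step, so something far smaller than 2000 (e.g. 16-64) would be typical for 1500 samples.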

Tensorflow RNN LM loss not decreasing
I'm trying to train a basic unidirectional LSTM RNN language model on the Penn Treebank. My neural network runs, but the loss on the test set is not decreasing at all. I'm wondering why that is.
Network parameters:
V = 10000
batch_size = 20
hidden_size = 650
embed_size = hidden_size
num_unrollings = 35
max_epoch = 6
learning_rate = 1.0
Graph definition:
graph = tf.Graph()
with graph.as_default():
    cell_state = tf.placeholder(tf.float32, shape=(batch_size, hidden_size), name="CellState")
    hidden_state = tf.placeholder(tf.float32, shape=(batch_size, hidden_size), name="HiddenState")
    curr_batch = tf.placeholder(tf.int32, shape=[num_unrollings + 1, batch_size])

    lstm = tf.contrib.rnn.BasicLSTMCell(hidden_size)
    embeddings = tf.Variable(tf.truncated_normal([V, embed_size], 0.1, 0.1), trainable=True, dtype=tf.float32)
    W = tf.Variable(tf.truncated_normal([hidden_size, V], 0.1, 0.1))
    b = tf.Variable(tf.zeros(V))

    inputs = curr_batch[:num_unrollings, :]  # num_unrollings x batch_size
    labels = curr_batch[1:, :]               # num_unrollings x batch_size

    input_list = list()
    for t in range(num_unrollings):
        emb = tf.nn.embedding_lookup(embeddings, inputs[t, :])
        input_list.append(emb)

    # outputs: num_unrollings x batch_size x hidden
    outputs, states = tf.nn.static_rnn(lstm, input_list, initial_state=[cell_state, hidden_state])
    cell_state, hidden_state = states

    # outputs_flat: (num_unrollings x batch_size) x hidden
    outputs_flat = tf.reshape(outputs, [-1, lstm.output_size])
    # logits_tensor: (num_unrollings x batch_size) x V
    logits = tf.nn.softmax(tf.matmul(outputs_flat, W) + b)
    logits_tensor = tf.reshape(logits, [batch_size, num_unrollings, V])

    targets = tf.transpose(labels)                   # batch_size x num_unrollings
    weights = tf.ones([batch_size, num_unrollings])  # batch_size x num_unrollings

    loss = tf.reduce_sum(tf.contrib.seq2seq.sequence_loss(
        logits_tensor, targets, weights,
        average_across_timesteps=False, average_across_batch=True))

    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
Session:
with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    cstate = np.zeros([batch_size, hidden_size]).astype(np.float32)
    hstate = np.zeros([batch_size, hidden_size]).astype(np.float32)

    for epoch in range(max_epoch):
        CURSOR_train = 0
        epoch_over = False
        steps = 0
        average_loss = 0.0

        while not epoch_over:
            new_batch, epoch_over = nextBatch()
            feed_data = {curr_batch: new_batch, "CellState:0": cstate, "HiddenState:0": hstate}
            _, l, new_cell_state, new_hidden_state = session.run(
                [optimizer, loss, cell_state, hidden_state], feed_dict=feed_data)
            cstate = new_cell_state
            hstate = new_hidden_state
            average_loss += l

            PRINT_INTERVAL = 200
            if steps % PRINT_INTERVAL == 0:
                print("Avg loss for last {0} batches: {1}".format(PRINT_INTERVAL, average_loss / PRINT_INTERVAL))
                average_loss = 0

            TEST_INTERVAL = 600
            if steps % TEST_INTERVAL == 0:
                # Evaluate the model
                test_over = False
                test_loss = 0.0
                test_batch_num = 0
                print("Testing ... ")
                while not test_over:
                    test_batch_num += 1
                    test_batch, test_over = nextBatch(setup='test')
                    feed_data_test = {curr_batch: test_batch, "CellState:0": cstate, "HiddenState:0": hstate}
                    tl, d1, d2 = session.run([loss, cell_state, hidden_state], feed_dict=feed_data_test)
                    test_loss += tl
                test_loss = test_loss / test_batch_num
                print("Avg loss on test set: {0}".format(test_loss))

            steps += 1
            sys.stdout.write('\rStep: {0}'.format(steps))
The loss on the test set is always 320.2430792614422, no matter how long I train. The loss on the training set does change. Thanks in advance!

Tensorflow.js/Keras LSTM with multiple sequences?
I am trying to train an LSTM model with TensorFlow.js using the Layers API, which is built on Keras, and I am having trouble getting correct predictions back. I am feeding the model arrays of NBA players' per-season career production scores (e.g. [20, 30, 40, 55, 60, 55, 33, 23]), one player per row, with the next season's production score as the y.
var data = tf.tensor([
  [[100], [86], [105], [122], [118], [96], [107], [118], [100], [85]],
  [[30], [53], [74], [85], [96], [87], [98], [99], [110], [101]],
  [[30], [53], [74], [85], [96], [87], [98], [99], [110], [101]],
  [[30], [53], [74], [85], [96], [87], [98], [99], [110], [101]],
  [[30], [53], [74], [85], [96], [87], [98], [99], [110], [101]],
  [[30], [53], [74], [85], [96], [87], [98], [99], [110], [101]],
  [[30], [53], [74], [85], [96], [87], [98], [99], [110], [101]],
  [[30], [53], [74], [85], [96], [87], [98], [99], [110], [101]]
]);
var y = tf.tensor([[100], [90], [90], [90], [90], [90], [90], [90]]);

const model = tf.sequential();
model.add(tf.layers.lstm({ units: 1, inputShape: [10, 1] }));
model.compile({ loss: "meanSquaredError", optimizer: "adam" });

model.fit(data, y, { epochs: 1000 }).then(() => {
  // Use the model to do inference on a data point the model hasn't seen before:
  model
    .predict(tf.tensor([[[30], [53], [74], [85], [96], [87], [98], [99], [110], [101]]]))
    .print();
});
It is predicting something like this: [[0],]
When I am expecting something like this: [[90]]

Batch size, epoch, and iteration in a many-to-one RNN
I am having trouble mainly with defining the batch size when it comes to a many-to-one RNN.
I have been looking at the following article: https://medium.com/@erikhallstrm/helloworldrnn83cd7105b767
It is about a many-to-many RNN, but the definition of the batch still puzzles me.
What is the batch size when it comes to a many-to-one RNN?
This would probably also resolve my other two confusions, about epochs and iterations.
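Whether an RNN is many-to-many or many-to-one, the batch size is just the number of (sequence, target) pairs processed per gradient update; only the target shape differs. A numpy sketch of the shapes involved (all the concrete numbers here are illustrative assumptions):

```python
import numpy as np

n_samples, timesteps, features = 128, 20, 3
batch_size = 32  # sequences per gradient update; unrelated to the timestep count

X = np.zeros((n_samples, timesteps, features))  # inputs: one row per sequence
y = np.zeros((n_samples, 1))                    # many-to-one: one target per sequence
# (many-to-many would instead use y of shape (n_samples, timesteps, ...))

# An epoch is one full pass over all samples; each epoch then contains
# n_samples / batch_size iterations (gradient updates).
iterations_per_epoch = n_samples // batch_size
print(iterations_per_epoch)  # 4
```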

Python - Pattern prediction using LSTM Recurrent Neural Networks with Keras
I am dealing with pattern prediction from a formatted CSV dataset with three columns (time_stamp, X and Y, where Y is the actual value). I want to predict the value of X from Y based on the time index of past values, and here is how I approached the problem with LSTM recurrent neural networks in Python with Keras.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.preprocessing.sequence import TimeseriesGenerator
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split

np.random.seed(7)

df = pd.read_csv('test32_C_data.csv')
n_features = 100
values = df.values
for i in range(0, n_features):
    df['X_t'+str(i)] = df['X'].shift(i)
    df['X_tp'+str(i)] = (df['X'].shift(i) - df['X'].shift(i+1)) / (df['X'].shift(i))
print(df)

pd.set_option('use_inf_as_null', True)
#df.replace([np.inf, -np.inf], np.nan).dropna(axis=1)
df.dropna(inplace=True)

X = df.drop('Y', axis=1)
y = df['Y']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.40)

X_train = X_train.drop('time', axis=1)
X_train = X_train.drop('X_t1', axis=1)
X_train = X_train.drop('X_t2', axis=1)
X_test = X_test.drop('time', axis=1)
X_test = X_test.drop('X_t1', axis=1)
X_test = X_test.drop('X_t2', axis=1)

sc = MinMaxScaler()
X_train = np.array(df['X'])
X_train = X_train.reshape(-1, 1)
X_train = sc.fit_transform(X_train)
y_train = np.array(df['Y'])
y_train = y_train.reshape(-1, 1)
y_train = sc.fit_transform(y_train)

model_data = TimeseriesGenerator(X_train, y_train, 100, batch_size=10)

# Initialising the RNN
model = Sequential()
# Adding the input layer and the LSTM layer
model.add(LSTM(4, input_shape=(None, 1)))
# Adding the output layer
model.add(Dense(1))
# Compiling the RNN
model.compile(loss='mse', optimizer='rmsprop')
# Fitting the RNN to the training set
model.fit_generator(model_data)

# Evaluate the model
#scores = model.evaluate(X_train, y_train)
#print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

# Getting the predicted values
predicted = X_test
predicted = sc.transform(predicted)
predicted = predicted.reshape((-1, 1, 1))
y_pred = model.predict(predicted)
y_pred = sc.inverse_transform(y_pred)
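As a sanity check on what TimeseriesGenerator feeds the network, its (window, target) pairing can be re-created with plain numpy (a sketch; the tiny arrays and window length stand in for the scaled X/Y columns and the length of 100 used above):

```python
import numpy as np

X = np.arange(12, dtype=float).reshape(-1, 1)  # stand-in for the scaled X column
y = np.arange(12, dtype=float).reshape(-1, 1)  # stand-in for the scaled Y column
length = 3  # window length (100 in the question)

# Each sample is a window of `length` consecutive X rows; its target is the
# y row that immediately follows the window, mirroring TimeseriesGenerator.
windows = np.stack([X[i:i + length] for i in range(len(X) - length)])
targets = y[length:]

print(windows.shape, targets.shape)  # (9, 3, 1) (9, 1)
```

Seen this way, each prediction depends only on the preceding `length` values of X, which is worth keeping in mind when interpreting the plots below.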
When I plot the prediction like this:

plt.figure()
plt.plot(y_test, color='red', label='Actual')
plt.plot(y_pred, color='blue', label='Predicted')
plt.title('Prediction')
plt.xlabel('Time [Index]')
plt.ylabel('Values')
plt.legend()
plt.show()
The following plot is what I get.
However, if we plot each column separately,
groups = [1, 2]
i = 1
# Plot each column separately
plt.figure()
for group in groups:
    plt.subplot(len(groups), 1, i)
    plt.plot(values[:, group])
    plt.title(df.columns[group], y=0.5, loc='right')
    i += 1
plt.show()
The following plots are what we get.
How can we improve the prediction accuracy and get a plot close to the following?

Modeling multivariate sequential data with variable timesteps in an LSTM
I have multi-feature time-series data, but the timestep is not constant. How can I model such data in an LSTM architecture? E.g.:
SegmentNo / Speed(kmh) / Distance(km) / Time(minutes) / TransportMode
1 / 70 / 30 / 28 / Car
2 / 3 / 1 / 15 / Walk
3 / 40 / 10 / 20 / Bus

In a single trip, the user first travels 28 minutes by car, then walks for 15 minutes, and then travels 20 minutes by bus. I am interested in predicting the next transportation mode in the sequence that the user will take. Can I also predict the set of other features (speed, distance, time)?
I thought of treating a trip as a batch, but the timesteps are variable: 28 minutes, then 15 minutes, then 20 minutes. (The data is sequential, but the timestep is not constant.)
One approach in my mind is to repeat each row a number of times equal to the value in the time column; for example, repeat the first row 28 times because the user travels by car for 28 minutes, and repeat the walk row 15 times. The data could then be handled at a fixed time step of one minute. But I'm afraid this will hurt performance by unnecessarily inflating the data size.
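Rather than repeating rows per minute, a common alternative is to keep one row per segment, pad trips to a common segment count, and let a masking layer skip the padding. A numpy sketch (the features are the speed/distance/time columns from the example above; the second trip's values are made up):

```python
import numpy as np

# Two trips with different numbers of segments; 3 features per segment.
trip_a = np.array([[70, 30, 28], [3, 1, 15], [40, 10, 20]], dtype=float)
trip_b = np.array([[50, 20, 24]], dtype=float)

max_len = max(len(trip_a), len(trip_b))

def pad(trip, max_len, value=0.0):
    """Pad a (segments, features) trip with `value` rows up to max_len."""
    out = np.full((max_len, trip.shape[1]), value)
    out[:len(trip)] = trip
    return out

batch = np.stack([pad(t, max_len) for t in (trip_a, trip_b)])
print(batch.shape)  # (2, 3, 3): (trips, max segments, features per segment)
# In Keras, a Masking(mask_value=0.0) layer placed before the LSTM would make
# the network ignore the all-zero padded rows.
```

Here a "timestep" is a segment rather than a minute, so the duration stays an ordinary input feature instead of blowing up the row count.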