Keras sequence-to-sequence LSTM inference step input shape error
I'm building a seq2seq model with Keras based on this example: https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html
When I try to load the saved model in a Jupyter notebook to test the results, I get the following error:
ValueError: Layer decoder_lstm expects 1 inputs, but it received 3 input tensors. Input received: [Reshape{3}.0, InplaceDimShuffle{0,1}.0, InplaceDimShuffle{0,1}.0]
This is what my model looks like when I train it; there are no errors here.
latent_dim = 256
batch_size = 10
epochs = 1
encoder_inputs = Input(shape=(None, ))
encoder_embedding = Embedding(amount_encoder_tokens, latent_dim, name="encoder_embedding")(encoder_inputs)
encoder_LSTM1 = LSTM(latent_dim, return_sequences=True, name="encoder_lstm1")(encoder_embedding)
x, state_h, state_c = LSTM(latent_dim, return_state=True, name="encoder_lstm2")(encoder_LSTM1)
encoder_states = [state_h, state_c]
decoder_inputs = Input(batch_shape=(batch_size, max_headline_length))
decoder_embedding = Embedding(amount_decoder_tokens, latent_dim,
name="decoder_embedding")(decoder_inputs)
decoder_LSTM = LSTM(latent_dim, return_sequences=True, stateful=True,
name="decoder_lstm", initial_state=encoder_states)
decoder_outputs = decoder_LSTM(decoder_embedding)  # initial_state=encoder_states is passed in the constructor above
decoder_dense = Dense(amount_decoder_tokens, activation="softmax",
name="decoder_dense")
decoder_outputs = decoder_dense(decoder_outputs)
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer=RMSprop(lr=0.0001), loss='categorical_crossentropy')
Any help would be appreciated.
See also questions close to this topic

Use data as input for CNN
I have an EEG data structure as shown in the linked image (3D data).
I want to use it as input for my CNN model. Should I append the data together into one variable, or keep this structure for the input? If the data is appended, will it lose some important information?

Parsing TFRecord when in eager execution
Considering that it is possible to run
tf.data.Dataset
pipelines in eager execution mode, how should I open a TFRecord file in eager execution? I'm mostly concerned about writing the parser, because I'm currently using dataset.make_one_shot_iterator()
as an iterator (over the several images in my container).
Running tf.nn.conv2d_transpose on GPU
I am using TensorFlow to implement a neural network.
The code works perfectly fine without convolutional layers, and the convolutional layers work perfectly fine on CPU, however, when I use
tf.nn.conv2d_transpose
on tensorflow-gpu, I cannot seem to get it to run. The crash log:
2018-04-23 15:06:14.935077: F T:\src\github\tensorflow\tensorflow\core\kernels\conv_ops.cc:712] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo<T>(), &algorithms)
I have been trying to fix this problem for a while now. I have tried the following solution which has been mentioned on a lot of platforms but no gain so far:
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
Any help is highly appreciated. I am on a Windows 10 platform.

How to calculate the amount of connections in neural network
I have exactly the scenario below and need to know how many connections this network has. I searched in several places and I'm not sure of the answer. That is, I do not know how to calculate the number of connections in my network; this is still unclear to me. What I have is exactly as follows:
(There is a bias on every layer except the input.)
Input: 784
First hidden layer output: 400
Second hidden layer output: 200
Output layer output: 10
I would calculate this as follows: ((784 * 400) + bias) + ((400 * 200) + bias) + ((200 * 10) + bias) = XXX
I do not know if this is correct. I need help figuring out how to solve this, and if it's not just something mathematical, what's the theory to do this calculation?
Thank you.
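For the numbers above (assuming one bias per neuron in every layer except the input, which is the usual convention for dense layers), the count can be checked in a few lines of Python:

```python
# Fully connected network: 784 -> 400 -> 200 -> 10,
# with one bias per neuron in every non-input layer.
layer_sizes = [784, 400, 200, 10]

# Weights: each pair of consecutive layers contributes n_in * n_out connections.
weights = sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Biases: one per neuron in every layer after the input.
biases = sum(layer_sizes[1:])  # 400 + 200 + 10

total_params = weights + biases
print(weights, biases, total_params)  # 395600 610 396210
```

So the formula in the question is right in spirit, except that "bias" is not a single number per layer: it is one bias per neuron of the receiving layer.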

inputs for nDCG in sklearn
I'm unable to understand the input format of sklearn's ndcg_score: http://sklearn.apachecn.org/en/0.19.0/modules/generated/sklearn.metrics.ndcg_score.html
Currently I have the following problem: I have multiple queries, and for each of them the ranking probabilities have been calculated successfully. Now I need to calculate nDCG on the test set, for which I would like to use sklearn's ndcg_score. The example given at the link:
>>> y_true = [1, 0, 2]
>>> y_score = [[0.15, 0.55, 0.2], [0.7, 0.2, 0.1], [0.06, 0.04, 0.9]]
>>> ndcg_score(y_true, y_score, k=2)
1.0
According to the site, y_true is the ground truth and y_score contains the probabilities. So my questions are:
Is this example for just one query or for multiple queries?
If it is for just one query, what does y_true represent: the original rankings?
If it is for a single query, why are there multiple probability vectors?
How can this method be applied to multiple queries and their resulting probabilities?
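To make the expected inputs concrete, here is a minimal from-scratch nDCG for a single query (my own sketch, not sklearn's exact implementation): y_true holds the relevance grade of each document, y_score the predicted score for each document, and the documents are ranked by score before discounting.

```python
import math

def dcg(relevances):
    # Discounted cumulative gain of an already-ranked relevance list.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(y_true, y_score, k=None):
    # Rank the documents of ONE query by predicted score, then normalize
    # the DCG of that ranking by the DCG of the ideal (sorted) ranking.
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    ranked = [y_true[i] for i in order][:k]
    ideal = sorted(y_true, reverse=True)[:k]
    return dcg(ranked) / dcg(ideal)

# A ranking that orders documents exactly by relevance scores 1.0.
print(ndcg([3, 2, 1], [0.9, 0.5, 0.1]))  # 1.0
```

Under this reading, each call handles one query; for multiple queries you compute nDCG per query and average the results. Verify this against the exact y_true/y_score shapes your sklearn version expects before relying on it.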

Loss functions in GANs
I'm trying to build a simple MNIST GAN and, needless to say, it didn't work. I've searched a lot and fixed most of my code, though I can't really understand how the loss functions work.
This is what I did:
loss_d = tf.reduce_mean(tf.log(discriminator(real_data)))  # maximize
loss_g = tf.reduce_mean(tf.log(discriminator(generator(noise_input), trainable=False)))  # maximize, because d(g) instead of 1 - d(g)
loss = loss_d + loss_g
train_d = tf.train.AdamOptimizer(learning_rate).minimize(loss_d)
train_g = tf.train.AdamOptimizer(learning_rate).minimize(loss_g)
I get 0.0 as my loss value. Can you explain how to deal with loss functions in GANs?
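One likely culprit (an assumption about the snippet above, worth checking): TensorFlow optimizers minimize, so quantities you want to maximize must be negated before being passed to minimize(). Plain-Python arithmetic on the standard GAN losses makes the signs visible:

```python
import math

def d_loss(d_real, d_fake):
    # The discriminator wants D(x) -> 1 and D(G(z)) -> 0, i.e. it MAXIMIZES
    # log(D(x)) + log(1 - D(G(z))); to hand this to a minimizer, flip the sign.
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def g_loss(d_fake):
    # Non-saturating generator loss: maximize log(D(G(z))),
    # so minimize its negative.
    return -math.log(d_fake)

# As the discriminator improves (d_real up, d_fake down), its loss drops:
print(d_loss(0.6, 0.4))  # ~1.022
print(d_loss(0.9, 0.1))  # ~0.211
```

In the posted code, minimize() is applied directly to the quantities marked "maximize", which drives them the wrong way; negating them (or using tf.nn.sigmoid_cross_entropy_with_logits on the discriminator logits) is the usual fix.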

Has anyone written weldon pooling for keras?
Has the Weldon pooling [1] been implemented in Keras?
I can see that it has been implemented in pytorch by the authors [2] but cannot find a keras equivalent.
[1] T. Durand, N. Thome, and M. Cord. WELDON: Weakly supervised learning of deep convolutional neural networks. In CVPR, 2016.
[2] https://github.com/durandtibo/weldon.resnet.pytorch/tree/master/weldon

Keras - How do I turn a 5D tensor into a 3D one by flattening the last three dimensions?
In Keras, I have a (100, 200, 300, 400, 500) Tensor outputted from a Layer, and want to turn it into a (100, 200, 300*400*500) one before feeding it into a new Layer. How do I do that?
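In plain NumPy (which mirrors what a Keras Reshape layer does), this is a single reshape that multiplies the trailing dimensions together; here is a scaled-down sketch with small stand-in shapes:

```python
import numpy as np

# Small stand-in for the (100, 200, 300, 400, 500) tensor from the question.
x = np.zeros((2, 3, 4, 5, 6))

# Collapse the last three axes into one: (2, 3, 4*5*6) == (2, 3, 120).
# -1 tells reshape to infer that merged size automatically.
y = x.reshape(x.shape[0], x.shape[1], -1)
print(y.shape)  # (2, 3, 120)
```

Inside a Keras model the equivalent is a Reshape layer, whose target shape omits the batch axis, e.g. Reshape((200, 300*400*500)) for the shapes in the question.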

Multi-label classification: Keras custom metrics
Contextualization
I am working on a multi-label classification problem with images. I am trying to predict 39 labels; in other words, I am trying to identify which of the 39 characteristics are present in a given image (many characteristics can be found in one image, which is why this is a multi-label classification situation).
Data
My input data are (X, Y): X is of shape (1814, 204, 204, 3) and Y is of shape (1814, 39). So basically X is the set of images and Y holds the labels associated with each image, which will be used for the supervised learning process.
Model
I am building a convolutional neural network in order to make predictions. For this task, I am using Keras to create my model.
What I have done
In order to validate my model, I need to choose a metric. However, the metrics available in Keras are irrelevant in my case and won't help me validate my model, since I am in a multi-label classification situation. That's why I decided to create my own custom metrics. I created recall and precision metrics applied to the columns of Y and Y_predict; in other words, I calculate recall and precision for each of the 39 classes. Here is the code of my metrics:
def recall(y_true, y_pred):
    # Recall metric.
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)), axis=0)
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)), axis=0)
    recall = true_positives / (possible_positives + K.epsilon())
    return recall

def precision(y_true, y_pred):
    # Precision metric.
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)), axis=0)
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)), axis=1)
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision
My vector Y is of shape (n, 39); that's why I am doing the operations over axis=0. In other words, for each label I am calculating precision and recall.
Next, I passed these two metrics to the Keras compile function. In other words, I used this line of code:
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=[precision,recall])
Code for building, compiling and fitting model:
Here is the code I use for building the model and training it, plus its result. (I am not including the part of the code where I split the data into train and validation: Train on 1269 samples, validate on 545 samples.)
# Model: CNN
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(204, 204, 3), padding='same', activation='relu', kernel_constraint=maxnorm(3)))
model.add(Dropout(0.2))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same', kernel_constraint=maxnorm(3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(512, activation='relu', kernel_constraint=maxnorm(3)))
model.add(Dropout(0.5))
model.add(Dense(39))
model.add(Activation('sigmoid'))
# Compile model
epochs = 5
lrate = 0.001
decay = lrate/epochs
sgd = SGD(lr=lrate, momentum=0.9, decay=decay, nesterov=False)
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=[precision, recall])
# fitting the model
model.fit(X_train, Y_train, epochs=epochs, batch_size=32, validation_data=(X_valid, Y_valid))
Results
Train on 1269 samples, validate on 545 samples
Epoch 1/5
96/1269 [=>............................] - ETA: 6:40 - loss: 0.6668 - precision: 0.1031 - recall: 0.2493
Issues/Questions
Question 1: In the log in the results section, there are precision and recall values. I don't know why I got scalar values instead of vectors of values. The way I constructed my two metrics should give me an array of shape (1, 39) for precision and (1, 39) for recall, containing precision and recall for each class; still, the output is only a single number.
Question 2: Do these precision and recall values in the log represent the metric calculated over one batch? How can I calculate the metric over an epoch (which is more useful information than a calculation over a single batch)? Some may say: just average over all batches. Sure, that's what I am thinking of, but I don't know how to do it, since Keras is somewhat of a black box to me and I don't know exactly what is happening behind the scenes, so I can't follow/modify the adequate part of the code.
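Regarding Question 1: as far as I understand the Keras training loop, whatever tensor a metric function returns is reduced to one scalar per batch by averaging, which is why a (39,)-shaped per-class vector shows up as a single number in the log. A NumPy sketch of the per-class computation and that final reduction:

```python
import numpy as np

# Toy batch: 4 samples, 3 classes, binary ground truth and rounded predictions.
y_true = np.array([[1, 0, 1],
                   [1, 1, 0],
                   [0, 1, 0],
                   [1, 0, 0]])
y_pred = np.array([[1, 0, 0],
                   [1, 1, 0],
                   [0, 0, 0],
                   [0, 0, 1]])

eps = 1e-7
true_positives = np.sum(y_true * y_pred, axis=0)     # one count per class
possible_positives = np.sum(y_true, axis=0)          # one count per class
recall_per_class = true_positives / (possible_positives + eps)

print(recall_per_class)         # one value per class, shape (3,)
print(recall_per_class.mean())  # the single scalar Keras would log
```

For per-class values over a whole epoch, the usual workaround is to compute the metric outside of training, e.g. in a Callback's on_epoch_end using model.predict on the validation set, rather than relying on the logged batch averages.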
Convert Lasagne BatchNormLayer to Keras BatchNormalization layer
I want to convert a pretrained Lasagne (Theano) model to a Keras (Tensorflow) model, so all layers need to have the exact same configuration. From both documentations it is not clear to me how the parameters correspond. Let's assume a Lasagne BatchNormLayer with default settings:
class lasagne.layers.BatchNormLayer(incoming, axes='auto', epsilon=1e-4, alpha=0.1, beta=lasagne.init.Constant(0), gamma=lasagne.init.Constant(1), mean=lasagne.init.Constant(0), inv_std=lasagne.init.Constant(1), **kwargs)
And this is the Keras BatchNormalization layer API:
keras.layers.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True, beta_initializer='zeros', gamma_initializer='ones', moving_mean_initializer='zeros', moving_variance_initializer='ones', beta_regularizer=None, gamma_regularizer=None, beta_constraint=None, gamma_constraint=None)
Most of it is clear, so I'll provide the corresponding parameters for future reference here:
(Lasagne -> Keras)
incoming -> (not needed, automatic)
axes -> axis
epsilon -> epsilon
alpha -> ?
beta -> beta_initializer
gamma -> gamma_initializer
mean -> moving_mean_initializer
inv_std -> moving_variance_initializer
? -> momentum
? -> center
? -> scale
? -> beta_regularizer
? -> gamma_regularizer
? -> beta_constraint
? -> gamma_constraint
I assume Lasagne simply does not support beta_regularizer, gamma_regularizer, beta_constraint and gamma_constraint, so the default in Keras of None is correct. I also assume in Lasagne center and scale are always turned on and can not be turned off.
That leaves Lasagne alpha and Keras momentum. From the Lasagne documentation for alpha:
Coefficient for the exponential moving average of batchwise means and standard deviations computed during training; the closer to one, the more it will depend on the last batches seen
From the Keras documentation for momentum:
Momentum for the moving mean and the moving variance.
They seem to correspond, but by which formula?
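The two descriptions are mirror images of the same exponential moving average, so the mapping should be momentum = 1 - alpha (my reading of both docs; worth verifying on a converted model). A quick numerical check:

```python
# Lasagne: running = (1 - alpha) * running + alpha * batch_stat
# Keras:   running = momentum * running + (1 - momentum) * batch_stat
# The two updates coincide exactly when momentum == 1 - alpha.

def lasagne_update(running, batch_stat, alpha=0.1):
    return (1 - alpha) * running + alpha * batch_stat

def keras_update(running, batch_stat, momentum=0.9):
    return momentum * running + (1 - momentum) * batch_stat

r_lasagne = r_keras = 0.0
for _ in range(5):
    r_lasagne = lasagne_update(r_lasagne, 1.0, alpha=0.1)
    r_keras = keras_update(r_keras, 1.0, momentum=0.9)
print(r_lasagne, r_keras)  # identical values
```

Note that Lasagne's default alpha=0.1 would then correspond to momentum=0.9, not the Keras default of 0.99, so for an exact conversion the Keras momentum should be set explicitly.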

Theano won't run with GPU
I'm trying to run code using a GPU. I've already installed CUDA 9.0 and cuDNN 7.1, and the installation went fine (I guess...), but when I run the test code provided in Deep Learning with Theano:
from theano import function, config, shared, tensor
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], tensor.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
if numpy.any([isinstance(x.op, tensor.Elemwise) and
              ('Gpu' not in type(x.op).__name__)
              for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')
The result says
Used the cpu
and some other information. I'm running the code on Windows 10 at the moment.

Trainable Gaussian Activation Neupy/Theano
How do I implement a custom activation function (an RBF kernel with adjustable means and variances) in Neupy or Theano for use in Neupy?
Quick Background: Gradient Descent works with every parameter in the network. I want to make a specialized features space that contains optimized feature parameters.
I think my problem is in the creation of the parameters, how they are sized, and how they are all connected.
Code:
Activation Function Class
class RBF(layers.ActivationLayer):
    def initialize(self):
        super(RBF, self).initialize()
        self.add_parameter(name='mean', shape=(1,),
                           value=init.Normal(), trainable=True)
        self.add_parameter(name='std_dev', shape=(1,),
                           value=init.Normal(), trainable=True)

    def output(self, input_value):
        return rbf(input_value, self.parameters)
RBF Function
def rbf(input_value, parameters):
    K = _outer_substract(input_value, parameters['mean'])
    return np.exp(-np.linalg.norm(K) / parameters['std_dev'])
Function to shape?
# 1.1: Function to subtract.
def _outer_substract(x, y):
    # x = x.dimshuffle(0, 1, 'x')
    # x = T.addbroadcast(x, 2)
    return (x - y.T).T
Help will be much appreciated, as this will provide great insight into how to customize Neupy networks. The documentation is pathetic, to say the least...
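Leaving the Neupy wiring aside, the intended math (assuming a standard RBF with the minus sign in the exponent) can be sanity-checked in plain NumPy before plugging it into a layer:

```python
import numpy as np

def rbf_numpy(x, mean, std_dev):
    # exp(-||x - mean|| / std_dev): equals 1 when x is at the mean,
    # and decays toward 0 as x moves away from it.
    return np.exp(-np.linalg.norm(x - mean) / std_dev)

center = np.array([0.0, 0.0])
print(rbf_numpy(np.array([0.0, 0.0]), center, 1.0))  # 1.0 (at the mean)
print(rbf_numpy(np.array([3.0, 4.0]), center, 1.0))  # exp(-5), much smaller
```

If the Theano version keeps the minus sign and the same norm, its output should match this reference on the same inputs, which narrows the debugging down to the parameter creation and shapes.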

Keras binary crossentropy on sequence to sequence prediction turns negative
I am trying to predict whether an appliance is turned on from the power signal of the whole household. I built a 1D CNN with 1440 timestamps in and 1440 timestamps out, compiled with a binary_crossentropy loss. I feed the neural network the total power signal and transform the appliance's power signal into a binary sequence: 1 if the appliance power > 0, otherwise 0.
At first I started to train the model on single days from the dataset, but the model just learned at which timestamps the appliance is most likely to be on. Now I give it the training data in a sliding-window fashion, always moving 60 steps forward, to prevent it from learning the timestamps.
The problem I encounter is that after a while the loss turns negative and the model's performance gets worse and worse.
Is it normal that the binary_crossentropy loss turns negative?
Thank you for your help in advance.
Kind regards, Dennis
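Standard binary cross-entropy cannot go negative while the targets stay in [0, 1]; a common cause of negative values (an assumption about this setup, worth checking) is target values that drift outside that range, e.g. through a scaling step. The arithmetic in plain Python:

```python
import math

def bce(y, p, eps=1e-12):
    # Binary cross-entropy for one target y and one prediction p in (0, 1).
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

print(bce(1.0, 0.9))  # ~0.105: valid target, loss is >= 0
print(bce(2.0, 0.9))  # negative: a target outside [0, 1] breaks the bound
```

So it is worth verifying that the appliance target sequence really is strictly 0/1 after the thresholding and any preprocessing, and that the output layer uses a sigmoid activation.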

I do not know why, in my Keras neural network model, the prediction shape is not consistent with the shape of the labels during training
I have built a Keras ConvLSTM neural network, and I want to predict one frame ahead based on a sequence of 10 time steps:
Model:
from keras.models import Sequential
from keras.layers.convolutional import Conv3D
from keras.layers.convolutional_recurrent import ConvLSTM2D
from keras.layers.normalization import BatchNormalization
import numpy as np
import pylab as plt
from keras import layers

# We create a layer which takes as input movies of shape
# (n_frames, width, height, channels) and returns a movie
# of identical shape.
model = Sequential()
model.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
                     input_shape=(None, 64, 64, 1),
                     padding='same', return_sequences=True))
model.add(BatchNormalization())
model.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
                     padding='same', return_sequences=True))
model.add(BatchNormalization())
model.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
                     padding='same', return_sequences=True))
model.add(BatchNormalization())
model.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
                     padding='same', return_sequences=True))
model.add(BatchNormalization())
model.add(Conv3D(filters=1, kernel_size=(3, 3, 3),
                 activation='sigmoid',
                 padding='same', data_format='channels_last'))
model.compile(loss='binary_crossentropy', optimizer='adadelta')
training:
data_train_x = data_4[0:20, 0:10, :, :, :]
data_train_y = data_4[0:20, 10:11, :, :, :]
model.fit(data_train_x, data_train_y, batch_size=10, epochs=1, validation_split=0.05)
and I test the model:
test_x = np.reshape(data_test_x[2, :, :, :, :], [1, 10, 64, 64, 1])
next_frame = model.predict(test_x, batch_size=1, verbose=1, steps=None)
but the problem is that the shape of 'next_frame' is (1, 10, 64, 64, 1), while based on the training data it should be (1, 1, 64, 64, 1).
And this is the results of 'model.summary()':
_________________________________________________________________
Layer (type)                 Output Shape               Param #
=================================================================
conv_lst_m2d_1 (ConvLSTM2D)  (None, None, 64, 64, 40)   59200
_________________________________________________________________
batch_normalization_1 (Batch (None, None, 64, 64, 40)   160
_________________________________________________________________
conv_lst_m2d_2 (ConvLSTM2D)  (None, None, 64, 64, 40)   115360
_________________________________________________________________
batch_normalization_2 (Batch (None, None, 64, 64, 40)   160
_________________________________________________________________
conv_lst_m2d_3 (ConvLSTM2D)  (None, None, 64, 64, 40)   115360
_________________________________________________________________
batch_normalization_3 (Batch (None, None, 64, 64, 40)   160
_________________________________________________________________
conv_lst_m2d_4 (ConvLSTM2D)  (None, None, 64, 64, 40)   115360
_________________________________________________________________
batch_normalization_4 (Batch (None, None, 64, 64, 40)   160
_________________________________________________________________
conv3d_1 (Conv3D)            (None, None, 64, 64, 1)    1081
=================================================================
Total params: 407,001
Trainable params: 406,681
Non-trainable params: 320

In word embedding, how to map the vector to word?
I checked all the APIs and couldn't find a way to map a vector back to a word, whether in word2vec or GloVe. Google doesn't help much either.
Does anybody know how to do this?
Background: I'm training a chatbot using a seq2seq model, but the implementations I have found so far use one-hot encoding. So I want to try GloVe embeddings instead and map the output vectors back to words.
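There is no exact inverse because embeddings are continuous, so the usual approach is a nearest-neighbour lookup over the embedding matrix. A sketch, under the assumption that you hold the embedding matrix and its vocabulary list in memory (the toy vectors below are made up for illustration):

```python
import numpy as np

def nearest_word(vec, embeddings, vocab):
    # embeddings: (vocab_size, dim) matrix; vocab: list of words, same order.
    # Return the word whose embedding has the highest cosine similarity to vec.
    norms = np.linalg.norm(embeddings, axis=1) * np.linalg.norm(vec)
    sims = embeddings @ vec / np.maximum(norms, 1e-12)
    return vocab[int(np.argmax(sims))]

vocab = ["hello", "world", "chatbot"]
embeddings = np.array([[1.0, 0.0],
                       [0.0, 1.0],
                       [0.7, 0.7]])
print(nearest_word(np.array([0.9, 0.1]), embeddings, vocab))  # hello
```

If you use gensim for word2vec, something equivalent is built in (e.g. similar_by_vector on the keyed vectors); for raw GloVe files you load the matrix yourself and run the same search.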