1.0.1 release

fixed TensorBoard callback (#2363)
Fix Dropout in RNNs
2016-04-16 14:11:34 -07:00 · 2016-04-16 13:56:56 -07:00 · 2016-04-16 09:08:15 -07:00 · 2016-04-15 22:40:54 -07:00 · 2016-04-15 22:37:41 -07:00 · 2016-04-15 08:21:46 -07:00
@@ -134,13 +134,15 @@ to pass the learning phase flag to your function:
 get_3rd_layer_output = K.function([model.layers[0].input, K.learning_phase()],
                                  [model.layers[3].output])

-# output in train mode
-layer_output = get_3rd_layer_output([X, 1])[0]
+# output in test mode = 0
+layer_output = get_3rd_layer_output([X, 0])[0]

-# output in test mode
+# output in train mode = 1
 layer_output = get_3rd_layer_output([X, 1])[0]
 ```

+Another more flexible way of getting output from intermediate layers is to use the [functional API](/getting-started/functional-api-guide).
+
 ---

 ### How can I use Keras with datasets that don't fit in memory?
@@ -45,7 +45,7 @@ With the functional API, it is easy to re-use trained models: you can treat any

 ```python
 x = Input(shape=(784,))
-# this works, and returns the 10-way softmax we defined above. 
+# this works, and returns the 10-way softmax we defined above.
 y = model(x)
 ```

@@ -127,12 +127,12 @@ model = Model(input=[main_input, auxiliary_input], output=[main_loss, auxiliary_
 ```

 We compile the model and assign a weight of 0.2 to the auxiliary loss.
-To specify different `loss_weight` or `loss` for each different output, you can use a list or a dictionary.
+To specify different `loss_weights` or `loss` for each different output, you can use a list or a dictionary.
 Here we pass a single loss as the `loss` argument, so the same loss will be used on all outputs.

 ```python
 model.compile(optimizer='rmsprop', loss='binary_crossentropy',
-              loss_weight=[1., 0.2])
+              loss_weights=[1., 0.2])
 ```

 We can train the model by passing it lists of input arrays and target arrays:
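
As a rough illustration of what the loss weights do, the total objective can be thought of as a weighted sum of the per-output losses. The sketch below is plain Python rather than Keras code; the helper name `weighted_total_loss` and the loss values are assumptions made up for the example.

```python
def weighted_total_loss(losses, loss_weights):
    """Combine per-output loss values into one scalar using per-output weights."""
    if len(losses) != len(loss_weights):
        raise ValueError('Expected one weight per output loss.')
    return sum(w * l for w, l in zip(loss_weights, losses))


# Made-up loss values for the main and auxiliary outputs.
main_loss_value, aux_loss_value = 0.5, 1.0
total = weighted_total_loss([main_loss_value, aux_loss_value], [1., 0.2])
print(round(total, 6))
```

With a weight of 0.2, the auxiliary output contributes only a fifth of its loss value to the quantity being minimized.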
@@ -148,7 +148,7 @@ We could also have compiled the model via:
 ```python
 model.compile(optimizer='rmsprop',
              loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
-              loss_weight={'main_output': 1., 'aux_output': 0.2})
+              loss_weights={'main_output': 1., 'aux_output': 0.2})

 # and trained it via:
 model.fit({'main_input': headline_data, 'aux_input': additional_data},
@@ -196,7 +196,7 @@ encoded_b = shared_lstm(tweet_b)
 merged_vector = merge([encoded_a, encoded_b], mode='concat', concat_axis=-1)

 # and add a logistic regression on top
-predictions = Dense(1, activation='sigmoid')(merged_vector) 
+predictions = Dense(1, activation='sigmoid')(merged_vector)

 # we define a trainable model linking the
 # tweet inputs to the predictions
@@ -11,7 +11,7 @@ b = Dense(32)(a)
 model = Model(input=a, output=b)
 ```

-This model will include all layers required in the computation of `a` given `b`.
+This model will include all layers required in the computation of `b` given `a`.

 In the case of multi-input or multi-output models, you can use lists as well:

@@ -29,4 +29,4 @@ For a detailed introduction of what `Model` can do, read [this guide to the Kera

 ## Methods

-{{autogenerated}}
+{{autogenerated}}
@@ -12,7 +12,8 @@ keras.preprocessing.image.ImageDataGenerator(featurewise_center=True,
    height_shift_range=0.,
    shear_range=0.,
    horizontal_flip=False,
-    vertical_flip=False)
+    vertical_flip=False,
+    dim_ordering='th')
 ```

 Generate batches of tensor image data with real-time data augmentation. The data will be looped over (in batches) indefinitely.
@@ -29,6 +30,9 @@ Generate batches of tensor image data with real-time data augmentation. The data
    - __shear_range__: Float. Shear Intensity (Shear angle in counter-clockwise direction as radians)
    - __horizontal_flip__: Boolean. Randomly flip inputs horizontally.
    - __vertical_flip__: Boolean. Randomly flip inputs vertically.
+    - __dim_ordering__: One of {"th", "tf"}.
+        "tf" mode means that the images should have shape `(samples, width, height, channels)`,
+        "th" mode means that the images should have shape `(samples, channels, width, height)`.

 - __Methods__:
    - __fit(X)__: Required if featurewise_center or featurewise_std_normalization or zca_whitening. Compute necessary quantities on some sample data.
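
The two `dim_ordering` conventions described above differ only in where the channel axis sits. A minimal sketch in plain Python, assuming a hypothetical helper `image_batch_shape` that is not part of Keras:

```python
def image_batch_shape(samples, channels, width, height, dim_ordering='th'):
    """Return the batch shape expected for the given dim_ordering."""
    if dim_ordering == 'th':
        # Channels come right after the sample axis.
        return (samples, channels, width, height)
    if dim_ordering == 'tf':
        # Channels come last.
        return (samples, width, height, channels)
    raise ValueError('dim_ordering must be "th" or "tf", got: ' + repr(dim_ordering))


th_shape = image_batch_shape(32, 3, 64, 48, dim_ordering='th')
tf_shape = image_batch_shape(32, 3, 64, 48, dim_ordering='tf')
```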
@@ -1,6 +1,9 @@
 '''This example demonstrates the use of Convolution1D for text classification.

-Gets to 0.835 test accuracy after 2 epochs. 100s/epoch on K520 GPU.
+Gets to 0.88 test accuracy after 2 epochs.
+90s/epoch on Intel i5 2.4GHz CPU.
+10s/epoch on Tesla K40 GPU.
+
 '''

 from __future__ import print_function
@@ -9,17 +12,18 @@ np.random.seed(1337)  # for reproducibility

 from keras.preprocessing import sequence
 from keras.models import Sequential
-from keras.layers.core import Dense, Dropout, Activation, Flatten
+from keras.layers.core import Dense, Dropout, Activation, Lambda
 from keras.layers.embeddings import Embedding
-from keras.layers.convolutional import Convolution1D, MaxPooling1D
+from keras.layers.convolutional import Convolution1D
 from keras.datasets import imdb
+from keras import backend as K


 # set parameters:
 max_features = 5000
-maxlen = 100
+maxlen = 400
 batch_size = 32
-embedding_dims = 100
+embedding_dims = 50
 nb_filter = 250
 filter_length = 3
 hidden_dims = 250
@@ -42,8 +46,10 @@ model = Sequential()

 # we start off with an efficient embedding layer which maps
 # our vocab indices into embedding_dims dimensions
-model.add(Embedding(max_features, embedding_dims, input_length=maxlen))
-model.add(Dropout(0.25))
+model.add(Embedding(max_features,
+                    embedding_dims,
+                    input_length=maxlen,
+                    dropout=0.2))

 # we add a Convolution1D, which will learn nb_filter
 # word group filters of size filter_length:
@@ -52,16 +58,17 @@ model.add(Convolution1D(nb_filter=nb_filter,
                        border_mode='valid',
                        activation='relu',
                        subsample_length=1))
-# we use standard max pooling (halving the output of the previous layer):
-model.add(MaxPooling1D(pool_length=2))

-# We flatten the output of the conv layer,
-# so that we can add a vanilla dense layer:
-model.add(Flatten())
+# we use max over time pooling by defining a python function to use
+# in a Lambda layer
+def max_1d(X):
+    return K.max(X, axis=1)
+
+model.add(Lambda(max_1d, output_shape=(nb_filter,)))

 # We add a vanilla hidden layer:
 model.add(Dense(hidden_dims))
-model.add(Dropout(0.25))
+model.add(Dropout(0.2))
 model.add(Activation('relu'))

 # We project onto a single unit output layer, and squash it with a sigmoid:
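
The max-over-time pooling introduced in this hunk keeps, for each filter, the largest activation across all time steps. The sketch below mimics `K.max(X, axis=1)` on nested Python lists; the function name `max_over_time` and the sample values are assumptions for illustration only.

```python
def max_over_time(batch):
    """Reduce nested lists shaped (samples, steps, filters) to (samples, filters)
    by taking the maximum of each filter across the steps axis."""
    return [[max(column) for column in zip(*sample)] for sample in batch]


# One sample, three time steps, two filters.
features = [[[0.1, 0.9],
             [0.4, 0.2],
             [0.3, 0.5]]]
pooled = max_over_time(features)
```

Because the steps axis disappears, the result has one value per filter, which is why the `Lambda` layer declares `output_shape=(nb_filter,)`.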
@@ -69,7 +76,7 @@ model.add(Dense(1))
 model.add(Activation('sigmoid'))

 model.compile(loss='binary_crossentropy',
-              optimizer='rmsprop',
+              optimizer='adam',
              metrics=['accuracy'])
 model.fit(X_train, y_train,
          batch_size=batch_size,
@@ -28,6 +28,11 @@ def euclidean_distance(vects):
    return K.sqrt(K.sum(K.square(x - y), axis=1, keepdims=True))


+def eucl_dist_output_shape(shapes):
+    shape1, shape2 = shapes
+    return shape1
+
+
 def contrastive_loss(y_true, y_pred):
    '''Contrastive loss from Hadsell-et-al.'06
    http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf
@@ -103,7 +108,7 @@ input_b = Input(shape=(input_dim,))
 processed_a = base_network(input_a)
 processed_b = base_network(input_b)

-distance = Lambda(euclidean_distance)([processed_a, processed_b])
+distance = Lambda(euclidean_distance, output_shape=eucl_dist_output_shape)([processed_a, processed_b])

 model = Model(input=[input_a, input_b], output=distance)
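
A note on shape inference for the distance layer: `euclidean_distance` reduces over `axis=1` with `keepdims=True`, so each sample yields a single value. The sketch below is a hypothetical alternative shape function that reports that shape explicitly. It is not the function added in this commit, and the name `distance_output_shape` is an assumption.

```python
def distance_output_shape(shapes):
    """Shape of a per-sample distance: keep the batch axis, reduce features to 1."""
    shape_a, shape_b = shapes
    if tuple(shape_a) != tuple(shape_b):
        raise ValueError('Both inputs must have the same shape.')
    return (shape_a[0], 1)


inferred = distance_output_shape([(None, 128), (None, 128)])
```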

@@ -1 +1 @@
-__version__ = '1.0.0'
+__version__ = '1.0.1'
@@ -21,6 +21,11 @@ def learning_phase():
    return _LEARNING_PHASE


+def set_learning_phase(value):
+    global _LEARNING_PHASE
+    _LEARNING_PHASE = tf.constant(value, name='keras_learning_phase')
+
+
 def get_session():
    '''Returns the TF session in use by the backend.
    '''
@@ -560,6 +565,17 @@ def set_value(x, value):
    tf.assign(x, np.asarray(value)).op.run(session=get_session())


+def batch_set_value(tuples):
+    '''Sets the values of many tensor variables at once.
+
+    # Arguments
+        tuples: a list of tuples `(tensor, value)`.
+            `value` should be a Numpy array.
+    '''
+    if tuples:
+        ops = [tf.assign(x, np.asarray(value)) for x, value in tuples]
+        get_session().run(ops)
+
 # GRAPH MANIPULATION

 class Function(object):
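
The point of `batch_set_value` is to issue one backend call for many assignments instead of one call per weight. The sketch below illustrates the idea with a hypothetical `CountingBackend` stand-in that only counts invocations; it does not use TensorFlow, and every name in it is an assumption.

```python
class CountingBackend(object):
    """Hypothetical stand-in for a backend session; counts how often it is invoked."""

    def __init__(self):
        self.calls = 0
        self.values = {}

    def run(self, assignments):
        # One call applies every (name, value) assignment it is given.
        self.calls += 1
        for name, value in assignments:
            self.values[name] = value


def set_value(backend, name, value):
    """Assign a single value with its own backend call."""
    backend.run([(name, value)])


def batch_set_value(backend, tuples):
    """Assign many values with a single backend call; do nothing for an empty list."""
    if tuples:
        backend.run(list(tuples))


weights = [('W', 1.0), ('b', 0.0), ('U', 2.0)]

one_by_one = CountingBackend()
for name, value in weights:
    set_value(one_by_one, name, value)

batched = CountingBackend()
batch_set_value(batched, weights)
```

Both backends end up with the same values, but the batched version needs a single call.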
@@ -473,6 +473,11 @@ def set_value(x, value):
    x.set_value(np.asarray(value, dtype=x.dtype))


+def batch_set_value(tuples):
+    for x, value in tuples:
+        x.set_value(np.asarray(value, dtype=x.dtype))
+
+
 # GRAPH MANIPULATION

 class Function(object):
@@ -60,8 +60,7 @@ class CallbackList(object):
            callback.on_batch_end(batch, logs)
        self._delta_ts_batch_end.append(time.time() - t_before_callbacks)
        delta_t_median = np.median(self._delta_ts_batch_end)
-        if self._delta_t_batch > 0. and delta_t_median > 0.95 * \
-           self._delta_t_batch and delta_t_median > 0.1:
+        if self._delta_t_batch > 0. and (delta_t_median > 0.95 * self._delta_t_batch and delta_t_median > 0.1):
            warnings.warn('Method on_batch_end() is slow compared '
                          'to the batch update (%f). Check your callbacks.'
                          % delta_t_median)
@@ -447,7 +446,7 @@ class TensorBoard(Callback):

        self.model = model
        self.sess = KTF.get_session()
-        if self.histogram_freq and not self.merged:
+        if self.histogram_freq and self.merged is None:
            layers = self.model.layers
            for layer in layers:
                if hasattr(layer, 'W'):
@@ -468,8 +467,14 @@ class TensorBoard(Callback):
            if epoch % self.histogram_freq == 0:
                # TODO: implement batched calls to sess.run
                # (current call will likely go OOM on GPU)
-                feed_dict = dict(zip(self.model.inputs,
-                                     self.model.validation_data))
+                if self.model.uses_learning_phase:
+                    cut_v_data = len(self.model.inputs)
+                    val_data = self.model.validation_data[:cut_v_data] + [0]
+                    tensors = self.model.inputs + [K.learning_phase()]
+                else:
+                    val_data = self.model.validation_data
+                    tensors = self.model.inputs
+                feed_dict = dict(zip(tensors, val_data))
                result = self.sess.run([self.merged], feed_dict=feed_dict)
                summary_str = result[0]
                self.writer.add_summary(summary_str, epoch)
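
The feed dictionary logic above can be sketched without TensorFlow. In this simplified illustration strings stand in for tensors and arrays, the helper name `build_feed_dict` is hypothetical, and the validation data is always trimmed to the number of inputs (the diff only does so in the learning-phase branch).

```python
def build_feed_dict(input_tensors, validation_data, uses_learning_phase,
                    learning_phase_tensor='learning_phase'):
    """Pair each input tensor with its validation array and, if the model
    uses the learning phase, feed 0 (test mode) for the phase tensor."""
    n_inputs = len(input_tensors)
    tensors = list(input_tensors)
    values = list(validation_data[:n_inputs])
    if uses_learning_phase:
        tensors.append(learning_phase_tensor)
        values.append(0)
    return dict(zip(tensors, values))


# validation_data typically also carries targets and sample weights.
val_data = ['X_val', 'y_val', 'sample_weights']
feed = build_feed_dict(['x'], val_data, uses_learning_phase=True)
```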
@@ -844,13 +844,17 @@ class Layer(object):
                            '" with a  weight list of length ' + str(len(weights)) +
                            ', but the layer was expecting ' + str(len(params)) +
                            ' weights. Provided weights: ' + str(weights))
+        if not params:
+            return
+        weight_value_tuples = []
        for p, w in zip(params, weights):
            if K.get_value(p).shape != w.shape:
                raise Exception('Layer weight shape ' +
                                str(K.get_value(p).shape) +
                                ' not compatible with '
                                'provided weight shape ' + str(w.shape))
-            K.set_value(p, w)
+            weight_value_tuples.append((p, w))
+        K.batch_set_value(weight_value_tuples)

    def get_weights(self):
        '''Returns the current weights of the layer,
@@ -1259,6 +1263,8 @@ class Merge(Layer):
            if hasattr(self._output_shape, '__call__'):
                output_shape = self._output_shape(input_shape)
                return output_shape
+            elif self._output_shape is not None:
+                return self._output_shape
            else:
                # TODO: consider shape auto-inference with TF
                raise Exception('The Merge layer ' + self.name +
@@ -2285,12 +2291,29 @@ class Container(Layer):
                                ' layers into a model with ' +
                                str(len(flattened_layers)) + ' layers.')

+            # we batch weight value assignments in a single backend call
+            # which provides a speedup in TensorFlow.
+            weight_value_tuples = []
            for k, name in enumerate(layer_names):
                g = f[name]
                weight_names = [n.decode('utf8') for n in g.attrs['weight_names']]
                if len(weight_names):
-                    weights = [g[weight_name] for weight_name in weight_names]
-                    flattened_layers[k].set_weights(weights)
+                    weight_values = [g[weight_name] for weight_name in weight_names]
+                    layer = flattened_layers[k]
+                    symbolic_weights = layer.trainable_weights + layer.non_trainable_weights
+                    if len(weight_values) != len(symbolic_weights):
+                        raise Exception('Layer #' + str(k) +
+                                        ' (named "' + layer.name +
+                                        '" in the current model) was found to '
+                                        'correspond to layer ' + name +
+                                        ' in the save file. '
+                                        'However the new layer ' + layer.name +
+                                        ' expects ' + str(len(symbolic_weights)) +
+                                        ' weights, but the saved weights have ' +
+                                        str(len(weight_values)) +
+                                        ' elements.')
+                    weight_value_tuples += zip(symbolic_weights, weight_values)
+            K.batch_set_value(weight_value_tuples)
        f.close()

    def to_json(self, **kwargs):
@@ -31,8 +31,8 @@ def standardize_input_data(data, names, shapes=None, check_batch_dim=True,
        arrays = []
        for name in names:
            if name not in data:
-                raise Exception('No data provided for input "' +
-                                name + '". Input data keys: ' +
+                raise Exception('No data provided for "' +
+                                name + '". Need data for each key in: ' +
                                str(data.keys()))
            arrays.append(data[name])
    elif type(data) is list:
@@ -224,6 +224,22 @@ def collect_metrics(metrics, output_names):
                        str(metrics))


+def collect_trainable_weights(layer):
+    trainable = getattr(layer, 'trainable', True)
+    if not trainable:
+        return []
+    weights = []
+    if layer.__class__.__name__ in ['Sequential', 'Model']:
+        for sublayer in layer.layers:
+            weights += collect_trainable_weights(sublayer)
+    elif layer.__class__.__name__ == 'Graph':
+        for sublayer in layer._graph_nodes.values():
+            weights += collect_trainable_weights(sublayer)
+    else:
+        weights += layer.trainable_weights
+    return weights
+
+
 def batch_shuffle(index_array, batch_size):
    '''This shuffles an array in a batch-wise fashion.
    Useful for shuffling HDF5 arrays
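
The recursion in `collect_trainable_weights` can be illustrated with toy classes. This sketch detects containers through a `layers` attribute rather than by class name, and `ToyLayer`, `ToyContainer`, and `collect_trainable` are hypothetical names used only here.

```python
class ToyLayer(object):
    """Leaf layer holding a list of weight names."""

    def __init__(self, weights, trainable=True):
        self.trainable_weights = list(weights)
        self.trainable = trainable


class ToyContainer(object):
    """Container holding sublayers, itself possibly frozen."""

    def __init__(self, layers, trainable=True):
        self.layers = list(layers)
        self.trainable = trainable


def collect_trainable(layer):
    """Gather weights recursively, skipping any frozen layer or container."""
    if not getattr(layer, 'trainable', True):
        return []
    if hasattr(layer, 'layers'):
        weights = []
        for sublayer in layer.layers:
            weights += collect_trainable(sublayer)
        return weights
    return list(layer.trainable_weights)


frozen_block = ToyContainer([ToyLayer(['w1', 'b1'])], trainable=False)
model = ToyContainer([frozen_block, ToyLayer(['w2']), ToyLayer(['w3'], trainable=False)])
collected = collect_trainable(model)
```

Freezing a container excludes everything nested inside it, even if the nested layers are themselves marked trainable.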
@@ -369,7 +385,7 @@ def standardize_weights(y, sample_weight=None, class_weight=None,
 def generator_queue(generator, max_q_size=10,
                    wait_time=0.05, nb_worker=1):
    '''Builds a threading queue out of a data generator.
-    Used in `fit_generator`, `evaluate_generator`.
+    Used in `fit_generator`, `evaluate_generator`, `predict_generator`.
    '''
    q = queue.Queue()
    _stop = threading.Event()
@@ -622,13 +638,10 @@ class Model(Container):
            else:
                inputs = self.inputs + self.targets + self.sample_weights

-            # dedupe trainable weights
-            trainable_weights_set = set()
+            # get trainable weights
            trainable_weights = []
-            for w in self.trainable_weights:
-                if w not in trainable_weights_set:
-                    trainable_weights_set.add(w)
-                    trainable_weights.append(w)
+            for layer in self.layers:
+                trainable_weights += collect_trainable_weights(layer)

            training_updates = self.optimizer.get_updates(trainable_weights, self.constraints, self.total_loss)
            updates = self.updates + training_updates
@@ -692,8 +705,6 @@ class Model(Container):
        # Returns
            `History` object.
        '''
-        self.training_data = ins
-        self.validation_data = val_ins
        do_validation = False
        if val_f and val_ins:
            do_validation = True
@@ -710,7 +721,14 @@ class Model(Container):
            callbacks += [cbks.ProgbarLogger()]
        callbacks = cbks.CallbackList(callbacks)

-        callbacks._set_model(self)
+        # it's possible to callback a different model than self
+        # (used by Sequential models)
+        if hasattr(self, 'callback_model') and self.callback_model:
+            callback_model = self.callback_model
+        else:
+            callback_model = self
+
+        callbacks._set_model(callback_model)
        callbacks._set_params({
            'batch_size': batch_size,
            'nb_epoch': nb_epoch,
@@ -720,8 +738,9 @@ class Model(Container):
            'metrics': callback_metrics,
        })
        callbacks.on_train_begin()
+        callback_model.stop_training = False
+        self.validation_data = val_ins

-        self.stop_training = False
        for epoch in range(nb_epoch):
            callbacks.on_epoch_begin(epoch)
            if shuffle == 'batch':
@@ -768,7 +787,7 @@ class Model(Container):
                        for l, o in zip(out_labels, val_outs):
                            epoch_logs['val_' + l] = o
            callbacks.on_epoch_end(epoch, epoch_logs)
-            if self.stop_training:
+            if callback_model.stop_training:
                break
        callbacks.on_train_end()
        return self.history
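
The `callback_model` indirection lets a wrapper model receive the `stop_training` flag even though an inner model runs the loop. A minimal sketch with hypothetical classes `InnerModel` and `Wrapper`, which are not the Keras classes:

```python
class InnerModel(object):
    """Hypothetical inner model that runs the training loop."""

    def __init__(self):
        self.callback_model = None
        self.stop_training = False

    def fit(self, nb_epoch, on_epoch_end):
        # Callbacks act on the wrapper if one is registered, else on self.
        target = self.callback_model if self.callback_model else self
        target.stop_training = False
        epochs_run = 0
        for epoch in range(nb_epoch):
            epochs_run += 1
            on_epoch_end(target, epoch)
            if target.stop_training:
                break
        return epochs_run


class Wrapper(object):
    """Hypothetical outer model that delegates training to an inner model."""

    def __init__(self):
        self.stop_training = False
        self.model = InnerModel()
        self.model.callback_model = self


def stop_after_second_epoch(model, epoch):
    """Toy callback: request a stop at the end of the second epoch."""
    if epoch == 1:
        model.stop_training = True


wrapper = Wrapper()
epochs_run = wrapper.model.fit(10, stop_after_second_epoch)
```

The loop checks the flag on the same object the callback modified, so an early-stopping request made on the wrapper is honored.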
@@ -950,14 +969,6 @@ class Model(Container):
                                                           class_weight=class_weight,
                                                           check_batch_dim=False,
                                                           batch_size=batch_size)
-        # prepare input arrays and training function
-        if self.uses_learning_phase:
-            ins = x + y + sample_weights + [1.]
-        else:
-            ins = x + y + sample_weights
-        self._make_train_function()
-        f = self.train_function
-
        # prepare validation data
        if validation_data:
            do_validation = True
@@ -996,6 +1007,14 @@ class Model(Container):
            val_f = None
            val_ins = None

+        # prepare input arrays and training function
+        if self.uses_learning_phase:
+            ins = x + y + sample_weights + [1.]
+        else:
+            ins = x + y + sample_weights
+        self._make_train_function()
+        f = self.train_function
+
        # prepare display labels
        out_labels = self.metrics_names
        if do_validation:
@@ -1184,7 +1203,7 @@ class Model(Container):
    def fit_generator(self, generator, samples_per_epoch, nb_epoch,
                      verbose=1, callbacks=[],
                      validation_data=None, nb_val_samples=None,
-                      class_weight={}):
+                      class_weight={}, max_q_size=10):
        '''Fits the model on data generated batch-by-batch by
        a Python generator.
        The generator is run in parallel to the model, for efficiency.
@@ -1214,6 +1233,7 @@ class Model(Container):
                at the end of every epoch.
            class_weight: dictionary mapping class indices to a weight
                for the class.
+            max_q_size: maximum size for the generator queue

        # Returns
            A `History` object.
@@ -1261,7 +1281,12 @@ class Model(Container):
            callbacks += [cbks.ProgbarLogger()]
        callbacks = cbks.CallbackList(callbacks)

-        callbacks._set_model(self)
+        # it's possible to callback a different model than self:
+        if hasattr(self, 'callback_model') and self.callback_model:
+            callback_model = self.callback_model
+        else:
+            callback_model = self
+        callbacks._set_model(callback_model)
        callbacks._set_params({
            'nb_epoch': nb_epoch,
            'nb_sample': samples_per_epoch,
@@ -1287,9 +1312,9 @@ class Model(Container):
            self.validation_data = None

        # start generator thread storing batches into a queue
-        data_gen_queue, _stop = generator_queue(generator)
+        data_gen_queue, _stop = generator_queue(generator, max_q_size=max_q_size)

-        self.stop_training = False
+        callback_model.stop_training = False
        while epoch < nb_epoch:
            callbacks.on_epoch_begin(epoch)
            samples_seen = 0
@@ -1322,6 +1347,8 @@ class Model(Container):
                batch_logs = {}
                if type(x) is list:
                    batch_size = len(x[0])
+                elif type(x) is dict:
+                    batch_size = len(list(x.values())[0])
                else:
                    batch_size = len(x)
                batch_logs['batch'] = batch_index
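
The batch size lookup above now handles three input layouts. A standalone sketch, assuming a hypothetical helper `batch_size_of` and plain Python lists in place of Numpy arrays:

```python
def batch_size_of(x):
    """Return the number of samples in a batch given as a list of arrays,
    a dict of named arrays, or a single array."""
    if isinstance(x, list):
        return len(x[0])
    if isinstance(x, dict):
        return len(list(x.values())[0])
    return len(x)


# Tuples stand in for arrays so that the single-array case is not mistaken for a list of arrays.
single = batch_size_of((1, 2, 3, 4))
multi = batch_size_of([(1, 2, 3), (4, 5, 6)])
named = batch_size_of({'main_input': (1, 2), 'aux_input': (3, 4)})
```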
@@ -1358,7 +1385,8 @@ class Model(Container):
                if samples_seen >= samples_per_epoch and do_validation:
                    if val_gen:
                        val_outs = self.evaluate_generator(validation_data,
-                                                           nb_val_samples)
+                                                           nb_val_samples,
+                                                           max_q_size=max_q_size)
                    else:
                        # no need for try/except because
                        # data has already been validated
@@ -1373,14 +1401,14 @@ class Model(Container):

            callbacks.on_epoch_end(epoch, epoch_logs)
            epoch += 1
-            if self.stop_training:
+            if callback_model.stop_training:
                break

        _stop.set()
        callbacks.on_train_end()
        return self.history

-    def evaluate_generator(self, generator, val_samples):
+    def evaluate_generator(self, generator, val_samples, max_q_size=10):
        '''Evaluates the model on a data generator. The generator should
        return the same kind of data as accepted by `test_on_batch`.

@@ -1391,6 +1419,7 @@ class Model(Container):
            val_samples:
                total number of samples to generate from `generator`
                before returning.
+            max_q_size: maximum size for the generator queue

        # Returns
            Scalar test loss (if the model has a single output and no metrics)
@@ -1404,7 +1433,7 @@ class Model(Container):
        wait_time = 0.01
        all_outs = []
        weights = []
-        data_gen_queue, _stop = generator_queue(generator)
+        data_gen_queue, _stop = generator_queue(generator, max_q_size=max_q_size)

        while processed_samples < val_samples:
            generator_output = None
@@ -1438,6 +1467,8 @@ class Model(Container):

            if type(x) is list:
                nb_samples = len(x[0])
+            elif type(x) is dict:
+                nb_samples = len(list(x.values())[0])
            else:
                nb_samples = len(x)
            all_outs.append(outs)
@@ -1456,7 +1487,7 @@ class Model(Container):
                                weights=weights))
            return averages

-    def predict_generator(self, generator, val_samples):
+    def predict_generator(self, generator, val_samples, max_q_size=10):
        '''Generates predictions for the input samples from a data generator.
        The generator should return the same kind of data as accepted by
        `predict_on_batch`.
@@ -1465,6 +1496,7 @@ class Model(Container):
            generator: generator yielding batches of input samples.
            val_samples: total number of samples to generate from `generator`
                before returning.
+            max_q_size: maximum size for the generator queue

        # Returns
            Numpy array(s) of predictions.
@@ -1474,7 +1506,7 @@ class Model(Container):
        processed_samples = 0
        wait_time = 0.01
        all_outs = []
-        data_gen_queue, _stop = generator_queue(generator)
+        data_gen_queue, _stop = generator_queue(generator, max_q_size=max_q_size)

        while processed_samples < val_samples:
            generator_output = None
@@ -1507,6 +1539,8 @@ class Model(Container):

            if type(x) is list:
                nb_samples = len(x[0])
+            elif type(x) is dict:
+                nb_samples = len(list(x.values())[0])
            else:
                nb_samples = len(x)

@@ -76,9 +76,10 @@ class Dropout(Layer):
        - [Dropout: A Simple Way to Prevent Neural Networks from Overfitting](http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf)
    '''
    def __init__(self, p, **kwargs):
-        self.supports_masking = True
-        self.uses_learning_phase = True
        self.p = p
+        if 0. < self.p < 1.:
+            self.uses_learning_phase = True
+        self.supports_masking = True
        super(Dropout, self).__init__(**kwargs)

    def call(self, x, mask=None):
@@ -77,7 +77,6 @@ class Embedding(Layer):
        self.dropout = dropout

        self.W_constraint = constraints.get(W_constraint)
-        self.constraints = [self.W_constraint]

        self.W_regularizer = regularizers.get(W_regularizer)
        self.activity_regularizer = regularizers.get(activity_regularizer)
@@ -93,6 +92,11 @@ class Embedding(Layer):
        self.W = self.init((self.input_dim, self.output_dim),
                           name='{}_W'.format(self.name))
        self.trainable_weights = [self.W]
+
+        self.constraints = {}
+        if self.W_constraint:
+            self.constraints[self.W] = self.W_constraint
+
        self.regularizers = []
        if self.W_regularizer:
            self.W_regularizer.set_param(self.W)
@@ -112,7 +116,11 @@ class Embedding(Layer):
            return K.not_equal(x, 0)

    def get_output_shape_for(self, input_shape):
-        return (input_shape[0], self.input_length, self.output_dim)
+        if not self.input_length:
+            input_length = input_shape[1]
+        else:
+            input_length = self.input_length
+        return (input_shape[0], input_length, self.output_dim)

    def call(self, x, mask=None):
        if 0. < self.dropout < 1.:
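
The output shape fallback above can be written as a small standalone function. This is a sketch with a hypothetical name, `embedding_output_shape`, that mirrors the logic of `get_output_shape_for`:

```python
def embedding_output_shape(input_shape, output_dim, input_length=None):
    """Use the declared input_length when set, else the sequence length
    found in the input shape."""
    length = input_length if input_length else input_shape[1]
    return (input_shape[0], length, output_dim)


with_length = embedding_output_shape((None, 400), 50, input_length=400)
without_length = embedding_output_shape((None, 120), 50)
```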
@@ -196,7 +196,7 @@ class Recurrent(Layer):
    def call(self, x, mask=None):
        # input shape: (nb_samples, time (padded with zeros), input_dim)
        # note that the .build() method of subclasses MUST define
-        # self.input_sepc with a complete input shape.
+        # self.input_spec with a complete input shape.
        input_shape = self.input_spec[0].shape
        if K._BACKEND == 'tensorflow':
            if not input_shape[1]:
@@ -549,17 +549,17 @@ class GRU(Recurrent):
        if 0 < self.dropout_U < 1:
            ones = K.ones_like(K.reshape(x[:, 0, 0], (-1, 1)))
            ones = K.concatenate([ones] * self.output_dim, 1)
-            B_U = [K.dropout(ones, self.dropout_U) for _ in range(3)]
+            B_U = [K.in_train_phase(K.dropout(ones, self.dropout_U), ones) for _ in range(3)]
            constants.append(B_U)
        else:
            constants.append([K.cast_to_floatx(1.) for _ in range(4)])

-        if self.consume_less == 'cpu' and 0 < self.dropout_W < 1:
+        if 0 < self.dropout_W < 1:
            input_shape = self.input_spec[0].shape
            input_dim = input_shape[-1]
            ones = K.ones_like(K.reshape(x[:, 0, 0], (-1, 1)))
            ones = K.concatenate([ones] * input_dim, 1)
-            B_W = [K.dropout(ones, self.dropout_W) for _ in range(3)]
+            B_W = [K.in_train_phase(K.dropout(ones, self.dropout_W), ones) for _ in range(3)]
            constants.append(B_W)
        else:
            constants.append([K.cast_to_floatx(1.) for _ in range(4)])
@@ -715,9 +715,9 @@ class LSTM(Recurrent):
            self.states = [K.zeros((input_shape[0], self.output_dim)),
                           K.zeros((input_shape[0], self.output_dim))]

-    def preprocess_input(self, x, train=False):
+    def preprocess_input(self, x):
        if self.consume_less == 'cpu':
-            if train and (0 < self.dropout_W < 1):
+            if 0 < self.dropout_W < 1:
                dropout = self.dropout_W
            else:
                dropout = 0
@@ -767,17 +767,17 @@ class LSTM(Recurrent):
        if 0 < self.dropout_U < 1:
            ones = K.ones_like(K.reshape(x[:, 0, 0], (-1, 1)))
            ones = K.concatenate([ones] * self.output_dim, 1)
-            B_U = [K.dropout(ones, self.dropout_U) for _ in range(4)]
+            B_U = [K.in_train_phase(K.dropout(ones, self.dropout_U), ones) for _ in range(4)]
            constants.append(B_U)
        else:
            constants.append([K.cast_to_floatx(1.) for _ in range(4)])

-        if self.consume_less == 'cpu' and 0 < self.dropout_W < 1:
+        if 0 < self.dropout_W < 1:
            input_shape = self.input_spec[0].shape
            input_dim = input_shape[-1]
            ones = K.ones_like(K.reshape(x[:, 0, 0], (-1, 1)))
            ones = K.concatenate([ones] * input_dim, 1)
-            B_W = [K.dropout(ones, self.dropout_W) for _ in range(4)]
+            B_W = [K.in_train_phase(K.dropout(ones, self.dropout_W), ones) for _ in range(4)]
            constants.append(B_W)
        else:
            constants.append([K.cast_to_floatx(1.) for _ in range(4)])
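
The fix wraps each dropout mask in `K.in_train_phase`, so that the mask is only applied during training and a mask of ones is used otherwise. The sketch below imitates that selection in plain Python. The functions are hypothetical stand-ins rather than the backend implementations, and the inverted scaling by `1 / (1 - rate)` is an assumption about how the mask is built.

```python
import random


def dropout_mask(size, rate, rng):
    """Mask entries are 0 with probability `rate`, else scaled by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [(1.0 / keep) if rng.random() < keep else 0.0 for _ in range(size)]


def in_train_phase(train_value, test_value, learning_phase):
    """Pick the training value when learning_phase is 1, otherwise the test value."""
    return train_value if learning_phase == 1 else test_value


rng = random.Random(1337)
ones = [1.0] * 8
train_mask = in_train_phase(dropout_mask(8, 0.5, rng), ones, learning_phase=1)
test_mask = in_train_phase(dropout_mask(8, 0.5, rng), ones, learning_phase=0)
```

At test time the mask is all ones, so multiplying by it leaves the recurrent inputs unchanged.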
@@ -533,7 +533,8 @@ class Graph(Model):
    def fit_generator(self, generator, samples_per_epoch, nb_epoch,
                      verbose=1, callbacks=[],
                      validation_data=None, nb_val_samples=None,
-                      class_weight={}, **kwargs):
+                      class_weight={},
+                      max_q_size=10, **kwargs):
        '''Fits a model on data generated batch-by-batch by a Python generator.
        The generator is run in parallel to the model, for efficiency.
        For instance, this allows you to do real-time data augmentation
@@ -641,13 +642,14 @@ class Graph(Model):
                                                   callbacks=callbacks,
                                                   validation_data=validation_data,
                                                   nb_val_samples=nb_val_samples,
-                                                   class_weight=class_weight)
+                                                   class_weight=class_weight,
+                                                   max_q_size=max_q_size)
        self.train_on_batch = self._train_on_batch
        self.evaluate = self._evaluate
        return history

    def evaluate_generator(self, generator, val_samples,
-                           verbose=1, **kwargs):
+                           verbose=1, max_q_size=10, **kwargs):
        '''Evaluates the model on a generator. The generator should
        return the same kind of data with every yield as accepted
        by `evaluate`.
@@ -700,7 +702,8 @@ class Graph(Model):

        generator = fixed_generator()
        history = super(Graph, self).evaluate_generator(generator,
-                                                        val_samples)
+                                                        val_samples,
+                                                        max_q_size=max_q_size)
        self.test_on_batch = self._test_on_batch
        return history

@@ -8,3 +8,8 @@ def binary_accuracy(y_true, y_pred):
 def categorical_accuracy(y_true, y_pred):
    return K.mean(K.equal(K.argmax(y_true, axis=-1),
                  K.argmax(y_pred, axis=-1)))
+
+
+from .utils.generic_utils import get_from_module
+def get(identifier):
+    return get_from_module(identifier, globals(), 'metric')
@@ -178,6 +178,9 @@ class Sequential(Model):
        self.output_names = self.model.output_names
        self.input_names = self.model.input_names

+        # make sure child model callbacks will call the parent Sequential model:
+        self.model.callback_model = self
+
        self.built = True

    @property
@@ -312,7 +315,7 @@ class Sequential(Model):
                model.add(Dense(10, activation='softmax'))
                model.compile(optimizer='rmsprop',
                              loss='categorical_crossentropy',
-                              metrics=['acccuracy'])
+                              metrics=['accuracy'])
            ```
        '''
        # create the underlying model
@@ -375,6 +378,8 @@ class Sequential(Model):
            at successive epochs, as well as validation loss values
            and validation metrics values (if applicable).
        '''
+        if self.model is None:
+            raise Exception('The model needs to be compiled before being used.')
        if 'show_accuracy' in kwargs:
            kwargs.pop('show_accuracy')
            warnings.warn('The "show_accuracy" argument is deprecated, '
@@ -414,6 +419,8 @@ class Sequential(Model):
            The attribute `model.metrics_names` will give you
            the display labels for the scalar outputs.
        '''
+        if self.model is None:
+            raise Exception('The model needs to be compiled before being used.')
        if 'show_accuracy' in kwargs:
            kwargs.pop('show_accuracy')
            warnings.warn('The "show_accuracy" argument is deprecated, '
@@ -441,6 +448,8 @@ class Sequential(Model):
        # Returns
            A Numpy array of predictions.
        '''
+        if self.model is None:
+            raise Exception('The model needs to be compiled before being used.')
        return self.model.predict(x, batch_size=batch_size, verbose=verbose)

    def predict_on_batch(self, x):
@@ -522,6 +531,8 @@ class Sequential(Model):
        # Returns
            A Numpy array of probability predictions.
        '''
+        if self.model is None:
+            raise Exception('The model needs to be compiled before being used.')
        preds = self.predict(x, batch_size, verbose)
        if preds.min() < 0. or preds.max() > 1.:
            warnings.warn('Network returning invalid probability values. '
@@ -543,6 +554,8 @@ class Sequential(Model):
        # Returns
            A numpy array of class predictions.
        '''
+        if self.model is None:
+            raise Exception('The model needs to be compiled before being used.')
        proba = self.predict(x, batch_size=batch_size, verbose=verbose)
        if proba.shape[-1] > 1:
            return proba.argmax(axis=-1)
@@ -552,8 +565,7 @@ class Sequential(Model):
    def fit_generator(self, generator, samples_per_epoch, nb_epoch,
                      verbose=1, callbacks=[],
                      validation_data=None, nb_val_samples=None,
-                      class_weight=None,
-                      **kwargs):
+                      class_weight=None, max_q_size=10, **kwargs):
        '''Fits the model on data generated batch-by-batch by
        a Python generator.
        The generator is run in parallel to the model, for efficiency.
@@ -583,6 +595,7 @@ class Sequential(Model):
                at the end of every epoch.
            class_weight: dictionary mapping class indices to a weight
                for the class.
+            max_q_size: maximum size for the generator queue

        # Returns
            A `History` object.
@@ -604,6 +617,8 @@ class Sequential(Model):
                                samples_per_epoch=10000, nb_epoch=10)
        ```
        '''
+        if self.model is None:
+            raise Exception('The model needs to be compiled before being used.')
        if 'show_accuracy' in kwargs:
            kwargs.pop('show_accuracy')
            warnings.warn('The "show_accuracy" argument is deprecated, '
@@ -629,10 +644,10 @@ class Sequential(Model):
                                        callbacks=callbacks,
                                        validation_data=validation_data,
                                        nb_val_samples=nb_val_samples,
-                                        class_weight=class_weight)
+                                        class_weight=class_weight,
+                                        max_q_size=max_q_size)

-    def evaluate_generator(self, generator, val_samples,
-                           **kwargs):
+    def evaluate_generator(self, generator, val_samples, max_q_size=10, **kwargs):
        '''Evaluates the model on a data generator. The generator should
        return the same kind of data as accepted by `test_on_batch`.

@@ -643,7 +658,10 @@ class Sequential(Model):
            val_samples:
                total number of samples to generate from `generator`
                before returning.
+            max_q_size: maximum size for the generator queue
        '''
+        if self.model is None:
+            raise Exception('The model needs to be compiled before being used.')
        if 'show_accuracy' in kwargs:
            kwargs.pop('show_accuracy')
            warnings.warn('The "show_accuracy" argument is deprecated, '
@@ -658,9 +676,10 @@ class Sequential(Model):
            raise Exception('Received unknown keyword arguments: ' +
                            str(kwargs))
        return self.model.evaluate_generator(generator,
-                                             val_samples)
+                                             val_samples,
+                                             max_q_size=max_q_size)

-    def predict_generator(self, generator, val_samples):
+    def predict_generator(self, generator, val_samples, max_q_size=10):
        '''Generates predictions for the input samples from a data generator.
        The generator should return the same kind of data as accepted by
        `predict_on_batch`.
@@ -669,12 +688,15 @@ class Sequential(Model):
            generator: generator yielding batches of input samples.
            val_samples: total number of samples to generate from `generator`
                before returning.
+            max_q_size: maximum size for the generator queue

        # Returns
            A Numpy array of predictions.
        '''
-
-        return self.model.predict_generator(generator, val_samples)
+        if self.model is None:
+            raise Exception('The model needs to be compiled before being used.')
+        return self.model.predict_generator(generator, val_samples,
+                                            max_q_size=max_q_size)

    def get_config(self):
        '''Returns the model configuration
@@ -16,23 +16,24 @@ from six.moves import range
 import threading


-def random_rotation(x, rg, fill_mode='nearest', cval=0.):
+def random_rotation(x, rg, fill_mode='nearest',
+                    cval=0., axes=(1, 2)):
    angle = np.random.uniform(-rg, rg)
    x = ndimage.interpolation.rotate(x, angle,
-                                     axes=(1, 2),
+                                     axes=axes,
                                     reshape=False,
                                     mode=fill_mode,
                                     cval=cval)
    return x


-def random_shift(x, wrg, hrg, fill_mode='nearest', cval=0.):
+def random_shift(x, wrg, hrg, fill_mode='nearest',
+                 cval=0., row_index=1, col_index=2):
    shift_x = shift_y = 0
-
    if wrg:
-        shift_x = np.random.uniform(-wrg, wrg) * x.shape[2]
+        shift_x = np.random.uniform(-wrg, wrg) * x.shape[col_index]
    if hrg:
-        shift_y = np.random.uniform(-hrg, hrg) * x.shape[1]
+        shift_y = np.random.uniform(-hrg, hrg) * x.shape[row_index]
    x = ndimage.interpolation.shift(x, (0, shift_y, shift_x),
                                    order=0,
                                    mode=fill_mode,
@@ -40,15 +41,10 @@ def random_shift(x, wrg, hrg, fill_mode='nearest', cval=0.):
    return x


-def horizontal_flip(x):
-    for i in range(x.shape[0]):
-        x[i] = np.fliplr(x[i])
-    return x
-
-
-def vertical_flip(x):
-    for i in range(x.shape[0]):
-        x[i] = np.flipud(x[i])
+def flip_axis(x, axis):
+    x = np.asarray(x).swapaxes(axis, 0)
+    x = x[::-1, ...]
+    x = x.swapaxes(0, axis)
    return x


@@ -140,6 +136,8 @@ class ImageDataGenerator(object):
        shear_range: shear intensity (shear angle in radians).
        horizontal_flip: whether to randomly flip images horizontally.
        vertical_flip: whether to randomly flip images vertically.
+        dim_ordering: 'th' or 'tf'. In 'th' mode, the channels dimension
+            (the depth) is at index 1, in 'tf' mode it is at index 3.
    '''
    def __init__(self,
                 featurewise_center=True,
@@ -152,12 +150,25 @@ class ImageDataGenerator(object):
                 height_shift_range=0.,
                 shear_range=0.,
                 horizontal_flip=False,
-                 vertical_flip=False):
+                 vertical_flip=False,
+                 dim_ordering='th'):
        self.__dict__.update(locals())
        self.mean = None
        self.std = None
        self.principal_components = None
        self.lock = threading.Lock()
+        if dim_ordering not in {'tf', 'th'}:
+            raise Exception('dim_ordering should be "tf" (channel after row and \
+            column) or "th" (channel before row and column). Received arg: ', dim_ordering)
+        self.dim_ordering = dim_ordering
+        if dim_ordering == "th":
+            self.channel_index = 1
+            self.row_index = 2
+            self.col_index = 3
+        if dim_ordering == "tf":
+            self.channel_index = 3
+            self.row_index = 1
+            self.col_index = 2

    def _flow_index(self, N, batch_size=32, shuffle=False, seed=None):
        b = 0
@@ -230,9 +241,9 @@ class ImageDataGenerator(object):

    def standardize(self, x):
        if self.samplewise_center:
-            x -= np.mean(x, axis=1, keepdims=True)
+            x -= np.mean(x, axis=self.channel_index, keepdims=True)
        if self.samplewise_std_normalization:
-            x /= (np.std(x, axis=1, keepdims=True) + 1e-7)
+            x /= (np.std(x, axis=self.channel_index, keepdims=True) + 1e-7)

        if self.featurewise_center:
            x -= self.mean
@@ -240,23 +251,29 @@ class ImageDataGenerator(object):
            x /= (self.std + 1e-7)

        if self.zca_whitening:
-            flatx = np.reshape(x, (x.shape[0] * x.shape[1] * x.shape[2]))
+            flatx = np.reshape(x, (x.size))
            whitex = np.dot(flatx, self.principal_components)
            x = np.reshape(whitex, (x.shape[0], x.shape[1], x.shape[2]))

        return x

    def random_transform(self, x):
+        # x is a single image, so it doesn't have image number at index 0
+        img_col_index = self.col_index - 1
+        img_row_index = self.row_index - 1
+
        if self.rotation_range:
-            x = random_rotation(x, self.rotation_range)
+            x = random_rotation(x, self.rotation_range,
+                                axes=(img_row_index, img_col_index))
        if self.width_shift_range or self.height_shift_range:
-            x = random_shift(x, self.width_shift_range, self.height_shift_range)
+            x = random_shift(x, self.width_shift_range, self.height_shift_range,
+                             row_index=img_row_index, col_index=img_col_index)
        if self.horizontal_flip:
            if np.random.random() < 0.5:
-                x = horizontal_flip(x)
+                x = flip_axis(x, img_col_index)
        if self.vertical_flip:
            if np.random.random() < 0.5:
-                x = vertical_flip(x)
+                x = flip_axis(x, img_row_index)
        if self.shear_range:
            x = random_shear(x, self.shear_range)
        # TODO:
@@ -9,7 +9,6 @@ def to_categorical(y, nb_classes=None):
    '''Convert class vector (integers from 0 to nb_classes)
    to binary class matrix, for use with categorical_crossentropy.
    '''
-    y = np.asarray(y, dtype='int32')
    if not nb_classes:
        nb_classes = np.max(y)+1
    Y = np.zeros((len(y), nb_classes))
@@ -51,3 +50,27 @@ def probas_to_classes(y_pred):

 def categorical_probas_to_classes(p):
    return np.argmax(p, axis=1)
+
+
+def convert_kernel(kernel, dim_ordering='th'):
+    '''Converts a kernel matrix (numpy array)
+    from Theano format to TensorFlow format
+    (or reciprocally, since the transformation
+    is its own inverse).
+    '''
+    new_kernel = np.copy(kernel)
+    if dim_ordering == 'th':
+        w = kernel.shape[2]
+        h = kernel.shape[3]
+        for i in range(w):
+            for j in range(h):
+                new_kernel[:, :, i, j] = kernel[:, :, w - i - 1, h - j - 1]
+    elif dim_ordering == 'tf':
+        w = kernel.shape[0]
+        h = kernel.shape[1]
+        for i in range(w):
+            for j in range(h):
+                new_kernel[i, j, :, :] = kernel[w - i - 1, h - j - 1, :, :]
+    else:
+        raise Exception('Invalid dim_ordering: ' + str(dim_ordering))
+    return new_kernel
@@ -3,12 +3,12 @@ from setuptools import find_packages


 setup(name='Keras',
-      version='1.0.0',
+      version='1.0.1',
      description='Deep Learning for Python',
      author='Francois Chollet',
      author_email='francois.chollet@gmail.com',
      url='https://github.com/fchollet/keras',
-      download_url='https://github.com/fchollet/keras/tarball/1.0.0',
+      download_url='https://github.com/fchollet/keras/tarball/1.0.1',
      license='MIT',
      install_requires=['theano', 'pyyaml', 'six'],
      extras_require={
@@ -5,6 +5,7 @@ import numpy as np

 from keras.backend import theano_backend as KTH
 from keras.backend import tensorflow_backend as KTF
+from keras.utils.np_utils import convert_kernel


 def check_single_tensor_operation(function_name, input_shape, **kwargs):
@@ -22,10 +23,12 @@ def check_single_tensor_operation(function_name, input_shape, **kwargs):
 def check_two_tensor_operation(function_name, x_input_shape,
                               y_input_shape, **kwargs):
    xval = np.random.random(x_input_shape) - 0.5
+
    xth = KTH.variable(xval)
    xtf = KTF.variable(xval)

    yval = np.random.random(y_input_shape) - 0.5
+
    yth = KTH.variable(yval)
    ytf = KTF.variable(yval)

@@ -369,42 +372,56 @@ class TestBackend(object):
        check_single_tensor_operation('l2_normalize', (4, 3), axis=-1)
        check_single_tensor_operation('l2_normalize', (4, 3), axis=1)

-    # def test_conv2d(self):
-    #     '''conv2d works "properly" with Theano and TF but outputs different
-    #     values in each case. Cause unclear (input / kernel shape format?)
-    #     '''
-    #     # TH kernel shape: (depth, input_depth, rows, cols)
-    #     check_two_tensor_operation('conv2d', (5, 3, 10, 12), (4, 3, 2, 2),
-    #                                strides=(1, 1), border_mode='valid')
-    #     check_two_tensor_operation('conv2d', (5, 3, 10, 12), (4, 3, 2, 2),
-    #                                strides=(1, 1), border_mode='same')
+    def test_conv2d(self):
+        # TH kernel shape: (depth, input_depth, rows, cols)
+        # TF kernel shape: (rows, cols, input_depth, depth)

-    #     # TF kernel shape: (rows, cols, input_depth, depth)
-    #     check_two_tensor_operation('conv2d', (5, 10, 12, 3), (2, 2, 3, 4),
-    #                                strides=(1, 1), border_mode='valid', dim_ordering='tf')
-    #     check_two_tensor_operation('conv2d', (5, 10, 12, 3), (2, 2, 3, 4),
-    #                                strides=(1, 1), border_mode='same', dim_ordering='tf')
+        for input_shape in [(2, 3, 4, 5), (2, 3, 5, 6)]:
+            for kernel_shape in [(4, 3, 2, 2), (4, 3, 3, 4)]:
+                xval = np.random.random(input_shape)

-    #     check_two_tensor_operation('conv2d', (5, 3, 10, 12), (4, 3, 3, 3),
-    #                                strides=(1, 1), border_mode='valid')
-    #     check_two_tensor_operation('conv2d', (5, 3, 10, 12), (4, 3, 3, 3),
-    #                                strides=(1, 1), border_mode='same')
+                xth = KTH.variable(xval)
+                xtf = KTF.variable(xval)

-    #     check_two_tensor_operation('conv2d', (5, 3, 10, 12), (4, 3, 3, 3),
-    #                                strides=(2, 2), border_mode='valid')
+                kernel_val = np.random.random(kernel_shape) - 0.5

-    # def test_pool2d(self):
-    #     '''pool2d works "properly" with Theano and TF but outputs different
-    #     values in each case. Cause unclear (input shape format?)
-    #     '''
-    #     check_single_tensor_operation('pool2d', (5, 3, 10, 12), pool_size=(2, 2),
-    #                                   strides=(1, 1), border_mode='valid')
+                kernel_th = KTH.variable(convert_kernel(kernel_val))
+                kernel_tf = KTF.variable(kernel_val)

-    #     check_single_tensor_operation('pool2d', (5, 3, 9, 11), pool_size=(2, 2),
-    #                                   strides=(1, 1), border_mode='valid')
+                zth = KTH.eval(KTH.conv2d(xth, kernel_th))
+                ztf = KTF.eval(KTF.conv2d(xtf, kernel_tf))

-    #     check_single_tensor_operation('pool2d', (5, 3, 9, 11), pool_size=(2, 3),
-    #                                   strides=(1, 1), border_mode='valid')
+                assert zth.shape == ztf.shape
+                assert_allclose(zth, ztf, atol=1e-05)
+
+        input_shape = (1, 6, 5, 3)
+        kernel_shape = (3, 3, 3, 2)
+
+        xval = np.random.random(input_shape)
+
+        xth = KTH.variable(xval)
+        xtf = KTF.variable(xval)
+
+        kernel_val = np.random.random(kernel_shape) - 0.5
+
+        kernel_th = KTH.variable(convert_kernel(kernel_val, dim_ordering='tf'))
+        kernel_tf = KTF.variable(kernel_val)
+
+        zth = KTH.eval(KTH.conv2d(xth, kernel_th, dim_ordering='tf'))
+        ztf = KTF.eval(KTF.conv2d(xtf, kernel_tf, dim_ordering='tf'))
+
+        assert zth.shape == ztf.shape
+        assert_allclose(zth, ztf, atol=1e-05)
+
+    def test_pool2d(self):
+        check_single_tensor_operation('pool2d', (5, 3, 10, 12), pool_size=(2, 2),
+                                      strides=(1, 1), border_mode='valid')
+
+        check_single_tensor_operation('pool2d', (5, 3, 9, 11), pool_size=(2, 2),
+                                      strides=(1, 1), border_mode='valid')
+
+        check_single_tensor_operation('pool2d', (5, 3, 9, 11), pool_size=(2, 3),
+                                      strides=(1, 1), border_mode='valid')

    def test_random_normal(self):
        mean = 0.
@@ -1,9 +1,11 @@
 import pytest
 import numpy as np
+from numpy.testing import assert_allclose

 from keras.layers import Dense, Dropout
 from keras.engine.topology import merge, Input
 from keras.engine.training import Model
+from keras.models import Sequential, Graph
 from keras import backend as K


@@ -142,6 +144,18 @@ def test_model_methods():
                              [output_a_np, output_b_np])
    assert len(out) == 2

+    # test with a custom metric function
+    mse = lambda y_true, y_pred: K.mean(K.pow(y_true - y_pred, 2))
+    model.compile(optimizer, loss, metrics=[mse],
+                  sample_weight_mode=None)
+
+    out = model.train_on_batch([input_a_np, input_b_np],
+                               [output_a_np, output_b_np])
+    assert len(out) == 3
+    out = model.test_on_batch([input_a_np, input_b_np],
+                              [output_a_np, output_b_np])
+    assert len(out) == 3
+
    input_a_np = np.random.random((10, 3))
    input_b_np = np.random.random((10, 3))

@@ -153,5 +167,28 @@ def test_model_methods():
    out = model.predict([input_a_np, input_b_np], batch_size=4)


+def test_trainable_argument():
+    x = np.random.random((5, 3))
+    y = np.random.random((5, 2))
+
+    model = Sequential()
+    model.add(Dense(2, input_dim=3, trainable=False))
+    model.compile('rmsprop', 'mse')
+    out = model.predict(x)
+    model.train_on_batch(x, y)
+    out_2 = model.predict(x)
+    assert_allclose(out, out_2)
+
+    # test with nesting
+    input = Input(shape=(3,))
+    output = model(input)
+    model = Model(input, output)
+    model.compile('rmsprop', 'mse')
+    out = model.predict(x)
+    model.train_on_batch(x, y)
+    out_2 = model.predict(x)
+    assert_allclose(out, out_2)
+
+
 if __name__ == '__main__':
    pytest.main([__file__])
@@ -7,9 +7,10 @@ import shutil


 def setup_function(func):
-    os.mkdir('test_images')
-    os.mkdir('test_images/rgb')
-    os.mkdir('test_images/gsc')
+    paths = ['test_images', 'test_images/rgb', 'test_images/gsc']
+    for path in paths:
+        if not os.path.exists(path):
+            os.mkdir(path)

    img_w = img_h = 20
    for n in range(8):
@@ -55,5 +56,34 @@ def test_image_data_generator():
            assert x.shape[1:] == images.shape[1:]
            break

+
+def test_img_flip():
+    x = np.array(range(4)).reshape([1, 1, 2, 2])
+    assert (flip_axis(x, 0) == x).all()
+    assert (flip_axis(x, 1) == x).all()
+    assert (flip_axis(x, 2) == [[[[2, 3], [0, 1]]]]).all()
+    assert (flip_axis(x, 3) == [[[[1, 0], [3, 2]]]]).all()
+
+    dim_ordering_and_col_index = (('tf', 2), ('th', 3))
+    for dim_ordering, col_index in dim_ordering_and_col_index:
+        image_generator_th = ImageDataGenerator(
+            featurewise_center=False,
+            samplewise_center=False,
+            featurewise_std_normalization=False,
+            samplewise_std_normalization=False,
+            zca_whitening=False,
+            rotation_range=0,
+            width_shift_range=0,
+            height_shift_range=0,
+            shear_range=0,
+            horizontal_flip=True,
+            vertical_flip=False,
+            dim_ordering=dim_ordering).flow(x, [1])
+        for i in range(10):
+            potentially_flipped_x, _ = next(image_generator_th)
+            assert ((potentially_flipped_x == x).all() or
+                    (potentially_flipped_x == flip_axis(x, col_index)).all())
+
+
 if __name__ == '__main__':
    pytest.main([__file__])
@@ -126,7 +126,7 @@ def test_LearningRateScheduler():
    assert (float(K.get_value(model.optimizer.lr)) - 0.2) < K.epsilon()


-@pytest.mark.skipif((K._BACKEND != 'tensorflow') or (sys.version_info[0] == 3),
+@pytest.mark.skipif((K._BACKEND != 'tensorflow'),
                    reason="Requires tensorflow backend")
 def test_TensorBoard():
    import shutil
@@ -252,8 +252,4 @@ def test_TensorBoard():
    KTF.set_session(old_session)

 if __name__ == '__main__':
-    # pytest.main([__file__])
-    # test_ModelCheckpoint()
-    # test_EarlyStopping()
-    # test_LearningRateScheduler()
-    test_TensorBoard()
+    pytest.main([__file__])
@@ -53,7 +53,7 @@ def test_graph_fit_generator():
                        validation_data=data_generator_graph(False), nb_val_samples=batch_size * 3)
    graph.fit_generator(data_generator_graph(True), 1000, nb_epoch=4,
                        validation_data=data_generator_graph(False), nb_val_samples=batch_size * 3)
-    gen_loss = graph.evaluate_generator(data_generator_graph(True), 128, verbose=0)    
+    gen_loss = graph.evaluate_generator(data_generator_graph(True), 128, verbose=0)

    loss = graph.evaluate({'input1': X_test_graph, 'output1': y_test_graph}, verbose=0)

@@ -66,6 +66,7 @@ def test_sequential_fit_generator():
    model.fit_generator(data_generator(True), len(X_train), nb_epoch, validation_data=(X_test, y_test))
    model.fit_generator(data_generator(True), len(X_train), nb_epoch,
                        validation_data=data_generator(False), nb_val_samples=batch_size * 3)
+    model.fit_generator(data_generator(True), len(X_train), nb_epoch, max_q_size=2)

    loss = model.evaluate(X_train, y_train)

@@ -100,8 +101,8 @@ def test_sequential():

    loss = model.evaluate(X_test, y_test)

-    prediction = model.predict_generator(data_generator(X_test, y_test), X_test.shape[0])
-    gen_loss = model.evaluate_generator(data_generator(X_test, y_test, 50), X_test.shape[0])
+    prediction = model.predict_generator(data_generator(X_test, y_test), X_test.shape[0], max_q_size=2)
+    gen_loss = model.evaluate_generator(data_generator(X_test, y_test, 50), X_test.shape[0], max_q_size=2)
    pred_loss = K.eval(K.mean(objectives.get(model.loss)(K.variable(y_test), K.variable(prediction))))

    assert(np.isclose(pred_loss, loss))
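
Note on the array helpers added in this diff: the sketch below reproduces `flip_axis` as it appears in the hunk above and pairs it with a slicing-based variant of `convert_kernel`. The name `convert_kernel_sliced` is hypothetical and not part of the commit; it assumes that reversing both spatial axes with a slice is equivalent to the nested loops in the diff. It needs only NumPy.

```python
import numpy as np


def flip_axis(x, axis):
    # Same steps as the helper in the diff: move `axis` to the front,
    # reverse along it, then move it back.
    x = np.asarray(x).swapaxes(axis, 0)
    x = x[::-1, ...]
    x = x.swapaxes(0, axis)
    return x


def convert_kernel_sliced(kernel, dim_ordering='th'):
    # Hypothetical vectorized variant (an assumption, not the committed code):
    # reverse the two spatial axes of the kernel with a slice.
    if dim_ordering == 'th':
        # Theano layout: (depth, input_depth, rows, cols)
        return np.copy(kernel[:, :, ::-1, ::-1])
    elif dim_ordering == 'tf':
        # TensorFlow layout: (rows, cols, input_depth, depth)
        return np.copy(kernel[::-1, ::-1, :, :])
    raise ValueError('Invalid dim_ordering: ' + str(dim_ordering))


# Same 1x1x2x2 array as in test_img_flip.
x = np.arange(4).reshape((1, 1, 2, 2))
flipped_rows = flip_axis(x, 2)
flipped_cols = flip_axis(x, 3)

# Applying the conversion twice should give back the original kernel.
kernel = np.random.random((4, 3, 2, 3))
round_trip = convert_kernel_sliced(convert_kernel_sliced(kernel))
```

Because the conversion only reverses axes, it is its own inverse, which is why the `convert_kernel` docstring can describe a single function for both directions.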
Author	SHA1	Message	Date
Francois Chollet	7a12fd0f85	1.0.1 release	2016-04-16 14:11:34 -07:00
Steven Hugg	9d60126661	fixed TensorBoard callback (#2363 )	2016-04-16 13:56:56 -07:00
Francois Chollet	e341e73c6a	Fix Dropout in RNNs	2016-04-16 09:08:15 -07:00
Francois Chollet	5dad3786f6	Add batch_set_value for faster TF weight loading	2016-04-15 22:40:54 -07:00
Francois Chollet	ed0cd2c60d	Add TF/TH kernel conversion util	2016-04-15 22:37:41 -07:00
Kai Li	2eea3a4c5d	Update model.md (#2348 )	2016-04-15 08:21:46 -07:00
Gijs van Tulder	090a46763e	Fix support for custom metrics functions (#2351 )	2016-04-15 08:21:16 -07:00
Dan Becker	c4ed82cdf6	Fix typo in docs. loss_weight should be loss_weights (#2343 )	2016-04-14 21:53:33 -07:00
Dan Becker	b32248d615	Change error message in standardize_input_data (#2338 )	2016-04-14 19:18:53 -07:00
Francois Chollet	ca7437502b	Shape inference fix for Embedding	2016-04-14 15:44:36 -07:00
Francois Chollet	fe9b797a46	Style fixes in preprocessing/image	2016-04-14 15:44:24 -07:00
Francois Chollet	b8059aeaba	Fix test_image unit test	2016-04-14 13:38:07 -07:00
Daniele Bonadiman	85f80714c2	Max Over Time in imdb_cnn.py (#2320 ) * Max Over Time in imdb_cnn.py Following this issue https://github.com/fchollet/keras/issues/2296 I propose this PR. The major optimisations, apart from the Max Over Time, are: - Dropout in the Embedding layer. - Longer input sequences (400 instead of 100), made possible by the speedup of the Max Over Time. - Adam optimizer. Overall it takes 90 to 100 sec per epoch on my laptop CPU and in two epochs it reaches 0.885 accuracy, which is a 5-point improvement over the previous implementation. Moreover, it requires less memory (300k parameters vs 3M+) since the number of parameters no longer depends on the length of the input sequence. * Update imdb_cnn.py	2016-04-14 13:22:06 -07:00
fchollet	2cc9ebf28b	Add set_learning_phase in TF backend.	2016-04-14 08:32:04 -07:00
fchollet	cb5d69c769	Fix case where output_shape in Merge is tuple	2016-04-13 21:20:21 -07:00
Francois Chollet	3b961a6b7b	Fix Graph generator methods	2016-04-13 19:30:11 -07:00
Francois Chollet	cba3ea9d90	Fix PEP8	2016-04-13 18:32:18 -07:00
Thomas Boquet	57ea065db7	Added learning phase to callbacks (#2297 ) (#2303 ) * added learning phase to callbacks (#2297) * cleaned imports * replaced tabs by spaces * added case where uses_learning_phase is False * fixed pep8 blank line bug	2016-04-13 18:00:49 -07:00
Ben Cook	4f5f88b9ba	Expose max_q_size and other generator_queue args (#2300 ) * [#2287] expose generator_queue args * [#2287] only expose max_q_size	2016-04-13 18:00:25 -07:00
Francois Chollet	c1c2b330a1	Fix "trainable" argument	2016-04-13 17:58:05 -07:00
Francois Chollet	05e1d8e5f4	Fix siamese example	2016-04-13 12:05:42 -07:00
Francois Chollet	3cbca7bdba	Fix validation_split	2016-04-13 11:39:16 -07:00
Francois Chollet	3e3c210f1d	Update preprocessing/image documentation	2016-04-13 10:15:26 -07:00
Dan Becker	cb65139aa8	Allow 'tf' ordering in ImageDataGenerator (#2291 )	2016-04-13 10:14:59 -07:00
Francois Chollet	1206120d10	Fix callback issue with Sequential model	2016-04-12 15:22:05 -07:00
Francois Chollet	80ebe80138	Callback style fix	2016-04-12 15:21:02 -07:00
Francois Chollet	fa1d6b478e	Fix generators methods when passing data as dicts	2016-04-12 13:51:45 -07:00
Francois Chollet	345413fb8c	Merge branch 'master' of https://github.com/fchollet/keras	2016-04-12 12:52:05 -07:00
Kyle McDonald	66ebd2a843	corrected parameter for learning_phase (#2284 )	2016-04-12 10:18:42 -07:00
fchollet	d50f469c09	Fix for invalid dropout values	2016-04-12 09:41:29 -07:00
Charles-Emmanuel	26714bc635	Small typo (#2282 ) Small type	2016-04-12 09:23:20 -07:00
Pasquale Minervini	30208ae08b	Fix for issue #2276 (#2277 )	2016-04-12 09:20:20 -07:00
Francois Chollet	6ea3188971	Fix typo	2016-04-11 22:47:18 -07:00
Kyle McDonald	1db555a530	remove creation of np.asarray in to_categorical (#2268 )	2016-04-11 22:36:37 -07:00
Francois Chollet	0772210dea	Better error messages for Sequential	2016-04-11 16:41:08 -07:00
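
Note on `max_q_size` (commit 4f5f88b9ba): the diff documents the argument only as the "maximum size for the generator queue". As a rough illustration of what bounding such a queue means, here is a minimal sketch using only the Python 3 standard library. The helper `bounded_generator_queue` and the sentinel mechanism are assumptions made for illustration; this is not the Keras implementation, which may differ in how it polls and stops the worker.

```python
import queue
import threading


def bounded_generator_queue(generator, max_q_size=10):
    # Hypothetical helper: a worker thread pulls items from `generator`
    # into a queue holding at most `max_q_size` items, so the producer
    # can run ahead of the consumer by a bounded amount only.
    q = queue.Queue(maxsize=max_q_size)
    sentinel = object()  # marks the end of the generator

    def worker():
        for item in generator:
            q.put(item)  # blocks while the queue is full
        q.put(sentinel)

    thread = threading.Thread(target=worker)
    thread.daemon = True
    thread.start()

    while True:
        item = q.get()  # blocks while the queue is empty
        if item is sentinel:
            break
        yield item


def batches(n):
    # Stand-in for a data generator yielding batches.
    for i in range(n):
        yield i


consumed = list(bounded_generator_queue(batches(5), max_q_size=2))
```

A smaller `max_q_size` caps how many prepared batches sit in memory at once, at the cost of giving the consumer less slack when the generator is slow.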