Prepare new PyPI release.

Mention requests for contribution in CONTRIBUTING.md
fix small inconsistencies in the documentation (#5817)
2017-03-16 11:40:14 -07:00 · 2017-03-16 11:37:06 -07:00 · 2017-03-16 10:56:35 -07:00 · 2017-03-16 07:04:07 -07:00 · 2017-03-16 07:03:45 -07:00 · 2017-03-15 21:13:53 -07:00
@@ -30,9 +30,20 @@ You can also use Github issues to request features you would like to see in Kera

 3. After discussing the feature you may choose to attempt a Pull Request. If you're at all able, start writing some code. We always have more work to do than time to do it. If you can write some code then that will speed the process along.

+
+## Requests for Contributions
+
+[This is the board](https://github.com/fchollet/keras/projects/1) where we list current outstanding issues and features to be added. If you want to start contributing to Keras, this is the place to start.
+
+
 ## Pull Requests

-We love pull requests. Here's a quick guide:
+**Where should I submit my pull request?**
+
+1. **Keras improvements and bugfixes** go to the [Keras `master` branch](https://github.com/fchollet/keras/tree/master).
+2. **New features** such as layers and datasets go to [keras-contrib](https://github.com/farizrahman4u/keras-contrib), unless the feature is listed in [Requests for Contributions](https://github.com/fchollet/keras/projects/1), in which case it belongs in core Keras.
+
+Here's a quick guide to submitting your improvements:

 1. If your PR introduces a change in functionality, make sure you start by opening an issue to discuss whether the change should be made, and how to handle it. This will save you from having your PR closed down the road! Of course, if your PR is a simple bug fix, you don't need to do that.

@@ -17,10 +17,12 @@ In the future, we are likely to add more backend options. Go ask Microsoft about

 If you have run Keras at least once, you will find the Keras configuration file at:

-`~/.keras/keras.json`
+`$HOME/.keras/keras.json`

 If it isn't there, you can create it.

+**NOTE for Windows Users:** Please replace `$HOME` with `%USERPROFILE%`.
+
 The default configuration file looks like this:

 ```
@@ -56,7 +58,7 @@ Using TensorFlow backend.
 }
 ```

-You can change these settings by editing `~/.keras/keras.json`. 
+You can change these settings by editing `$HOME/.keras/keras.json`. 

 * `image_data_format`: string, either `"channels_last"` or `"channels_first"`. It specifies which data format convention Keras will follow. (`keras.backend.image_data_format()` returns it.)
  - For 2D data (e.g. image), `"channels_last"` assumes `(rows, cols, channels)` while `"channels_first"` assumes `(channels, rows, cols)`. 
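As a rough illustration of the path rule described in this hunk, the sketch below locates and reads the configuration file from Python. The helper names `keras_config_path` and `load_keras_config` are hypothetical and not part of Keras. The sketch assumes that `os.path.expanduser('~')` resolves to `$HOME` on POSIX systems and to `%USERPROFILE%` on Windows, which is the usual behaviour.

```python
import json
import os


def keras_config_path(home=None):
    """Return the expected location of keras.json under a home directory.

    Hypothetical helper for illustration; not a Keras function.
    """
    # expanduser('~') normally maps to $HOME (POSIX) or %USERPROFILE% (Windows),
    # so a single code path covers both cases.
    if home is None:
        home = os.path.expanduser('~')
    return os.path.join(home, '.keras', 'keras.json')


def load_keras_config(path):
    """Read keras.json, returning an empty dict if the file does not exist."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)
```

Calling `load_keras_config(keras_config_path()).get('image_data_format')` would then return the configured data format, or `None` if the file has not been created yet.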
@@ -69,14 +71,14 @@ You can change these settings by editing `~/.keras/keras.json`.

 ## Using the abstract Keras backend to write new code

-If you want the Keras modules you write to be compatible with both Theano and TensorFlow, you have to write them via the abstract Keras backend API. Here's an intro.
+If you want the Keras modules you write to be compatible with both Theano (`th`) and TensorFlow (`tf`), you have to write them via the abstract Keras backend API. Here's an intro.

 You can import the backend module via:
 ```python
-from keras import backend as K
+from keras import backend as K
 ```

-The code below instantiates an input placeholder. It's equivalent to `tf.placeholder()` or `T.matrix()`, `T.tensor3()`, etc.
+The code below instantiates an input placeholder. It's equivalent to `tf.placeholder()` or `th.tensor.matrix()`, `th.tensor.tensor3()`, etc.

 ```python
 input = K.placeholder(shape=(2, 4, 5))
@@ -86,9 +88,10 @@ input = K.placeholder(shape=(None, 4, 5))
 input = K.placeholder(ndim=3)
 ```

-The code below instantiates a shared variable. It's equivalent to `tf.variable()` or `theano.shared()`.
+The code below instantiates a shared variable. It's equivalent to `tf.Variable()` or `th.shared()`.

 ```python
+import numpy as np
 val = np.random.random((3, 4, 5))
 var = K.variable(value=val)

@@ -101,11 +104,16 @@ var = K.ones(shape=(3, 4, 5))
 Most tensor operations you will need can be done as you would in TensorFlow or Theano:

 ```python
+# Initializing Tensors with Random Numbers
+b = K.random_uniform_variable(shape=(3, 4), low=0, high=1)  # Uniform distribution
+c = K.random_normal_variable(shape=(3, 4), mean=0, scale=1)  # Gaussian distribution
+d = K.random_normal_variable(shape=(3, 4), mean=0, scale=1)
+# Tensor Arithmetic
 a = b + c * K.abs(d)
 c = K.dot(a, K.transpose(b))
-a = K.sum(b, axis=2)
+a = K.sum(b, axis=1)
 a = K.softmax(b)
-a = concatenate([b, c], axis=-1)
+a = K.concatenate([b, c], axis=-1)
 # etc...
 ```
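To make the shapes in the snippet above concrete, here is a NumPy-only sketch of the same arithmetic. NumPy stands in for the backend here, so this is an analogy rather than backend code; the variable names `prod`, `row_sums`, and `joined` are chosen for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)
b = rng.uniform(0.0, 1.0, size=(3, 4))  # stands in for a uniform variable
c = rng.normal(0.0, 1.0, size=(3, 4))   # stands in for a Gaussian variable
d = rng.normal(0.0, 1.0, size=(3, 4))

a = b + c * np.abs(d)                     # elementwise, shape (3, 4)
prod = np.dot(a, b.T)                     # (3, 4) x (4, 3) -> (3, 3)
row_sums = np.sum(b, axis=1)              # reduces the last axis -> (3,)
joined = np.concatenate([b, c], axis=-1)  # (3, 4) + (3, 4) -> (3, 8)
```

Because `b` has only two axes (0 and 1), a reduction over `axis=2` is not valid for it, which is why the diff changes the `K.sum` call to `axis=1`.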

@@ -17,6 +17,6 @@ model.add(Dense(64, kernel_constraint=max_norm(2.)))

 ## Available constraints

- __max_norm__(m=2): maximum-norm constraint
+- __max_norm__(max_value=2, axis=0): maximum-norm constraint
 - __non_neg__(): non-negativity constraint
 - __unit_norm__(): unit-norm constraint, enforces the matrix to have unit norm along the last axis
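A minimal NumPy sketch of what a maximum-norm constraint with the `max_value` and `axis` arguments does. This is an illustration of the technique, not the Keras source; the function name `apply_max_norm` and the `eps` parameter are assumptions made for this example.

```python
import numpy as np


def apply_max_norm(w, max_value=2.0, axis=0, eps=1e-7):
    """Rescale slices of `w` along `axis` so their L2 norm is at most `max_value`."""
    norms = np.sqrt(np.sum(np.square(w), axis=axis, keepdims=True))
    # Norms already below max_value are left (almost) unchanged;
    # larger norms are scaled down to max_value.
    desired = np.clip(norms, 0.0, max_value)
    return w * (desired / (eps + norms))


w = np.array([[3.0, 0.3],
              [4.0, 0.4]])   # column norms are 5.0 and 0.5
constrained = apply_max_norm(w, max_value=2.0, axis=0)
```

With `axis=0`, the norm is computed per column, so only the first column (norm 5.0) is rescaled.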
@@ -15,6 +15,7 @@
 - [How can I use stateful RNNs?](#how-can-i-use-stateful-rnns)
 - [How can I remove a layer from a Sequential model?](#how-can-i-remove-a-layer-from-a-sequential-model)
 - [How can I use pre-trained models in Keras?](#how-can-i-use-pre-trained-models-in-keras)
+- [How can I use HDF5 inputs with Keras?](#how-can-i-use-hdf5-inputs-with-keras)

 ---

@@ -319,6 +320,7 @@ To use statefulness in RNNs, you need to:

 - explicitly specify the batch size you are using, by passing a `batch_size` argument to the first layer in your model. E.g. `batch_size=32` for a 32-samples batch of sequences of 10 timesteps with 16 features per timestep.
 - set `stateful=True` in your RNN layer(s).
+- specify `shuffle=False` when calling `fit()`.

 To reset the states accumulated:

@@ -403,3 +405,18 @@ The VGG16 model is also the basis for several Keras example scripts:
 - [Style transfer](https://github.com/fchollet/keras/blob/master/examples/neural_style_transfer.py)
 - [Feature visualization](https://github.com/fchollet/keras/blob/master/examples/conv_filter_visualization.py)
 - [Deep dream](https://github.com/fchollet/keras/blob/master/examples/deep_dream.py)
+
+---
+
+### How can I use HDF5 inputs with Keras?
+
+You can use the `HDF5Matrix` class from `keras.utils.io_utils`. See [the documentation](/io_utils/#HDF5Matrix) for details.
+
+You can also directly use an HDF5 dataset:
+
+```python
+import h5py
+with h5py.File('input/file.hdf5', 'r') as f:
+    X_data = f['X_data']
+    model.predict(X_data)
+```
@@ -98,7 +98,7 @@ model.compile(optimizer='rmsprop',

 # Generate dummy data
 import numpy as np
-data = np.random.random((1000, 784))
+data = np.random.random((1000, 100))
 labels = np.random.randint(2, size=(1000, 1))

 # Train the model, iterating on the data in batches of 32 samples
@@ -117,8 +117,8 @@ model.compile(optimizer='rmsprop',

 # Generate dummy data
 import numpy as np
-data = np.random.random((1000, 784))
-labels = np.random.randint(10, size=(1000, 10))
+data = np.random.random((1000, 100))
+labels = np.random.randint(10, size=(1000, 1))

 # Convert labels to categorical one-hot encoding
 binary_labels = keras.utils.to_categorical(labels, num_classes=10)
@@ -199,9 +199,9 @@ from keras.layers import Conv2D, MaxPooling2D
 from keras.optimizers import SGD

 model = Sequential()
-# input: 100x100 images with 3 channels -> (3, 100, 100) tensors.
+# input: 100x100 images with 3 channels -> (100, 100, 3) tensors.
 # this applies 32 convolution filters of size 3x3 each.
-model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(3, 100, 100)))
+model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(100, 100, 3)))
 model.add(Conv2D(32, (3, 3), activation='relu'))
 model.add(MaxPooling2D(pool_size=(2, 2)))
 model.add(Dropout(0.25))
@@ -251,12 +251,12 @@ score = model.evaluate(x_test, y_test, batch_size=16)
 from keras.models import Sequential
 from keras.layers import Dense, Dropout
 from keras.layers import Embedding
-from keras.layers import Conv1D, GlobalAveragePooling1D
+from keras.layers import Conv1D, GlobalAveragePooling1D, MaxPooling1D

 model = Sequential()
 model.add(Conv1D(64, 3, activation='relu', input_shape=(seq_length, 100)))
 model.add(Conv1D(64, 3, activation='relu'))
-model.add(MaxPooling1D((3, 3)))
+model.add(MaxPooling1D(3))
 model.add(Conv1D(128, 3, activation='relu'))
 model.add(Conv1D(128, 3, activation='relu'))
 model.add(GlobalAveragePooling1D())
@@ -358,6 +358,6 @@ x_val = np.random.random((batch_size * 3, timesteps, data_dim))
 y_val = np.random.random((batch_size * 3, num_classes))

 model.fit(x_train, y_train,
-          batch_size=batch_size, epochs=5,
+          batch_size=batch_size, epochs=5, shuffle=False,
          validation_data=(x_val, y_val))
 ```
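The reason `shuffle=False` matters for a stateful model can be sketched in plain Python. With unshuffled batches, position `i` of each consecutive batch forms one continuous stream, and the state carried across batches belongs to that stream. The helper names `make_batches` and `stream_for_position` are hypothetical.

```python
def make_batches(series, batch_size):
    """Split a series into consecutive, unshuffled batches."""
    return [series[k:k + batch_size]
            for k in range(0, len(series), batch_size)]


def stream_for_position(series, batch_size, i):
    """Collect the samples seen at position i of every batch, in order."""
    return [batch[i] for batch in make_batches(series, batch_size)
            if i < len(batch)]


series = list(range(12))
stream = stream_for_position(series, batch_size=4, i=1)
```

Here `stream` equals `series[1::4]`. Shuffling the samples would break this correspondence, so the state carried over from one batch would no longer match the sample that follows it.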
@@ -14,10 +14,10 @@ These layers expose 3 keyword arguments:
 ## Example

 ```python
-from keras.regularizers import l2, activity_l2
+from keras import regularizers
 model.add(Dense(64, input_dim=64,
-                kernel_regularizer=l2(0.01),
-                activity_regularizer=activity_l2(0.01)))
+                kernel_regularizer=regularizers.l2(0.01),
+                activity_regularizer=regularizers.l1(0.01)))
 ```
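For intuition, the penalties that `regularizers.l1(0.01)` and `regularizers.l2(0.01)` add to the loss can be sketched in plain Python as scaled sums over the regularized values. The function names below are hypothetical, and a flat list stands in for a weight or activation tensor.

```python
def l1_penalty(values, l1=0.01):
    """L1 penalty: factor times the sum of absolute values."""
    return l1 * sum(abs(v) for v in values)


def l2_penalty(values, l2=0.01):
    """L2 penalty: factor times the sum of squared values."""
    return l2 * sum(v * v for v in values)


weights = [1.0, -2.0, 3.0, -4.0]
kernel_term = l2_penalty(weights)    # 0.01 * (1 + 4 + 9 + 16)
activity_term = l1_penalty(weights)  # 0.01 * (1 + 2 + 3 + 4)
```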

 ## Available penalties
@@ -14,9 +14,9 @@ Time per epoch: 3s on CPU (core i7).
 '''
 from __future__ import print_function

-from keras.models import Sequential
+from keras.models import Sequential, Model
 from keras.layers.embeddings import Embedding
-from keras.layers import Activation, Dense, Merge, Permute, Dropout
+from keras.layers import Input, Activation, Dense, Permute, Dropout, add, dot, concatenate
 from keras.layers import LSTM
 from keras.utils.data_utils import get_file
 from keras.preprocessing.sequence import pad_sequences
@@ -38,7 +38,8 @@ def tokenize(sent):
 def parse_stories(lines, only_supporting=False):
    '''Parse stories provided in the bAbi tasks format

-    If only_supporting is true, only the sentences that support the answer are kept.
+    If only_supporting is true, only the sentences
+    that support the answer are kept.
    '''
    data = []
    story = []
@@ -68,9 +69,12 @@ def parse_stories(lines, only_supporting=False):


 def get_stories(f, only_supporting=False, max_length=None):
-    '''Given a file name, read the file, retrieve the stories, and then convert the sentences into a single story.
+    '''Given a file name, read the file,
+    retrieve the stories,
+    and then convert the sentences into a single story.

-    If max_length is supplied, any stories longer than max_length tokens will be discarded.
+    If max_length is supplied,
+    any stories longer than max_length tokens will be discarded.
    '''
    data = parse_stories(f.readlines(), only_supporting=only_supporting)
    flatten = lambda data: reduce(lambda x, y: x + y, data)
@@ -85,7 +89,8 @@ def vectorize_stories(data, word_idx, story_maxlen, query_maxlen):
    for story, query, answer in data:
        x = [word_idx[w] for w in story]
        xq = [word_idx[w] for w in query]
-        y = np.zeros(len(word_idx) + 1)  # let's not forget that index 0 is reserved
+        # let's not forget that index 0 is reserved
+        y = np.zeros(len(word_idx) + 1)
        y[word_idx[answer]] = 1
        X.append(x)
        Xq.append(xq)
@@ -93,7 +98,6 @@ def vectorize_stories(data, word_idx, story_maxlen, query_maxlen):
    return (pad_sequences(X, maxlen=story_maxlen),
            pad_sequences(Xq, maxlen=query_maxlen), np.array(Y))

-
 try:
    path = get_file('babi-tasks-v1-2.tar.gz', origin='https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz')
 except:
@@ -116,7 +120,11 @@ print('Extracting stories for the challenge:', challenge_type)
 train_stories = get_stories(tar.extractfile(challenge.format('train')))
 test_stories = get_stories(tar.extractfile(challenge.format('test')))

-vocab = sorted(reduce(lambda x, y: x | y, (set(story + q + [answer]) for story, q, answer in train_stories + test_stories)))
+vocab = set()
+for story, q, answer in train_stories + test_stories:
+    vocab |= set(story + q + [answer])
+vocab = sorted(vocab)
+
 # Reserve 0 for masking via pad_sequences
 vocab_size = len(vocab) + 1
 story_maxlen = max(map(len, (x for x, _, _ in train_stories + test_stories)))
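The vocabulary loop above can be tried on toy data. The two stories below are made up for illustration and are not taken from the bAbI files.

```python
# Each entry is (story tokens, question tokens, answer), as in the script.
stories = [
    (['Mary', 'moved', 'to', 'the', 'bathroom', '.'],
     ['Where', 'is', 'Mary', '?'], 'bathroom'),
    (['John', 'went', 'to', 'the', 'hallway', '.'],
     ['Where', 'is', 'John', '?'], 'hallway'),
]

vocab = set()
for story, q, answer in stories:
    vocab |= set(story + q + [answer])
vocab = sorted(vocab)

# Index 0 is reserved for padding, so word indices start at 1.
vocab_size = len(vocab) + 1
word_idx = dict((c, i + 1) for i, c in enumerate(vocab))
```

Unlike the `reduce`-based one-liner it replaces, the explicit loop needs no import of `reduce` and reads the same on Python 2 and 3.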
@@ -135,8 +143,14 @@ print('-')
 print('Vectorizing the word sequences...')

 word_idx = dict((c, i + 1) for i, c in enumerate(vocab))
-inputs_train, queries_train, answers_train = vectorize_stories(train_stories, word_idx, story_maxlen, query_maxlen)
-inputs_test, queries_test, answers_test = vectorize_stories(test_stories, word_idx, story_maxlen, query_maxlen)
+inputs_train, queries_train, answers_train = vectorize_stories(train_stories,
+                                                               word_idx,
+                                                               story_maxlen,
+                                                               query_maxlen)
+inputs_test, queries_test, answers_test = vectorize_stories(test_stories,
+                                                            word_idx,
+                                                            story_maxlen,
+                                                            query_maxlen)

 print('-')
 print('inputs: integer tensor of shape (samples, max_length)')
@@ -153,13 +167,25 @@ print('answers_test shape:', answers_test.shape)
 print('-')
 print('Compiling...')

+# placeholders
+input_sequence = Input((story_maxlen,))
+question = Input((query_maxlen,))
+
+# encoders
 # embed the input sequence into a sequence of vectors
 input_encoder_m = Sequential()
 input_encoder_m.add(Embedding(input_dim=vocab_size,
-                              output_dim=64,
-                              input_length=story_maxlen))
+                              output_dim=64))
 input_encoder_m.add(Dropout(0.3))
 # output: (samples, story_maxlen, embedding_dim)
+
+# embed the input into a sequence of vectors of size query_maxlen
+input_encoder_c = Sequential()
+input_encoder_c.add(Embedding(input_dim=vocab_size,
+                              output_dim=query_maxlen))
+input_encoder_c.add(Dropout(0.3))
+# output: (samples, story_maxlen, query_maxlen)
+
 # embed the question into a sequence of vectors
 question_encoder = Sequential()
 question_encoder.add(Embedding(input_dim=vocab_size,
@@ -167,44 +193,43 @@ question_encoder.add(Embedding(input_dim=vocab_size,
                               input_length=query_maxlen))
 question_encoder.add(Dropout(0.3))
 # output: (samples, query_maxlen, embedding_dim)
-# compute a 'match' between input sequence elements (which are vectors)
-# and the question vector sequence
-match = Sequential()
-match.add(Merge([input_encoder_m, question_encoder],
-                mode='dot',
-                dot_axes=[2, 2]))
-match.add(Activation('softmax'))
-# output: (samples, story_maxlen, query_maxlen)
-# embed the input into a single vector with size = story_maxlen:
-input_encoder_c = Sequential()
-input_encoder_c.add(Embedding(input_dim=vocab_size,
-                              output_dim=query_maxlen,
-                              input_length=story_maxlen))
-input_encoder_c.add(Dropout(0.3))
-# output: (samples, story_maxlen, query_maxlen)
-# sum the match vector with the input vector:
-response = Sequential()
-response.add(Merge([match, input_encoder_c], mode='sum'))
-# output: (samples, story_maxlen, query_maxlen)
-response.add(Permute((2, 1)))  # output: (samples, query_maxlen, story_maxlen)

-# concatenate the match vector with the question vector,
-# and do logistic regression on top
-answer = Sequential()
-answer.add(Merge([response, question_encoder], mode='concat', concat_axis=-1))
+# encode input sequence and questions (which are indices)
+# to sequences of dense vectors
+input_encoded_m = input_encoder_m(input_sequence)
+input_encoded_c = input_encoder_c(input_sequence)
+question_encoded = question_encoder(question)
+
+# compute a 'match' between the first input vector sequence
+# and the question vector sequence
+# shape: `(samples, story_maxlen, query_maxlen)`
+match = dot([input_encoded_m, question_encoded], axes=(2, 2))
+match = Activation('softmax')(match)
+
+# add the match matrix to the second input vector sequence
+response = add([match, input_encoded_c])  # (samples, story_maxlen, query_maxlen)
+response = Permute((2, 1))(response)  # (samples, query_maxlen, story_maxlen)
+
+# concatenate the match matrix with the question vector sequence
+answer = concatenate([response, question_encoded])
+
 # the original paper uses a matrix multiplication for this reduction step.
 # we choose to use a RNN instead.
-answer.add(LSTM(32))
-# one regularization layer -- more would probably be needed.
-answer.add(Dropout(0.3))
-answer.add(Dense(vocab_size))
-# we output a probability distribution over the vocabulary
-answer.add(Activation('softmax'))
+answer = LSTM(32)(answer)  # (samples, 32)

-answer.compile(optimizer='rmsprop', loss='categorical_crossentropy',
-               metrics=['accuracy'])
-# Note: you could use a Graph model to avoid repeat the input twice
-answer.fit([inputs_train, queries_train, inputs_train], answers_train,
-           batch_size=32,
-           epochs=120,
-           validation_data=([inputs_test, queries_test, inputs_test], answers_test))
+# one regularization layer -- more would probably be needed.
+answer = Dropout(0.3)(answer)
+answer = Dense(vocab_size)(answer)  # (samples, vocab_size)
+# we output a probability distribution over the vocabulary
+answer = Activation('softmax')(answer)
+
+# build the final model
+model = Model([input_sequence, question], answer)
+model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
+              metrics=['accuracy'])
+
+# train
+model.fit([inputs_train, queries_train], answers_train,
+          batch_size=32,
+          epochs=120,
+          validation_data=([inputs_test, queries_test], answers_test))
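The tensor shapes in the functional version can be checked with a NumPy sketch that mirrors the `dot`, `add`, `Permute`, and `concatenate` steps. NumPy stands in for the Keras layers here. The sizes are arbitrary, and an embedding dimension of 64 for the question encoder is an assumption (it must match `input_encoder_m` for the dot product over axis 2 to work).

```python
import numpy as np

samples, story_maxlen, query_maxlen, embedding_dim = 2, 5, 3, 64
rng = np.random.RandomState(0)
input_encoded_m = rng.rand(samples, story_maxlen, embedding_dim)
input_encoded_c = rng.rand(samples, story_maxlen, query_maxlen)
question_encoded = rng.rand(samples, query_maxlen, embedding_dim)

# dot over the embedding axis of both inputs
# -> (samples, story_maxlen, query_maxlen)
match = np.einsum('sie,sje->sij', input_encoded_m, question_encoded)
# softmax over the last axis
match = np.exp(match - match.max(axis=-1, keepdims=True))
match /= match.sum(axis=-1, keepdims=True)

response = match + input_encoded_c            # (samples, story_maxlen, query_maxlen)
response = np.transpose(response, (0, 2, 1))  # (samples, query_maxlen, story_maxlen)

# concatenate along the last axis
# -> (samples, query_maxlen, story_maxlen + embedding_dim)
answer = np.concatenate([response, question_encoded], axis=-1)
```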
@@ -83,7 +83,8 @@ def tokenize(sent):
 def parse_stories(lines, only_supporting=False):
    '''Parse stories provided in the bAbi tasks format

-    If only_supporting is true, only the sentences that support the answer are kept.
+    If only_supporting is true,
+    only the sentences that support the answer are kept.
    '''
    data = []
    story = []
@@ -113,9 +114,11 @@ def parse_stories(lines, only_supporting=False):


 def get_stories(f, only_supporting=False, max_length=None):
-    '''Given a file name, read the file, retrieve the stories, and then convert the sentences into a single story.
+    '''Given a file name, read the file, retrieve the stories,
+    and then convert the sentences into a single story.

-    If max_length is supplied, any stories longer than max_length tokens will be discarded.
+    If max_length is supplied,
+    any stories longer than max_length tokens will be discarded.
    '''
    data = parse_stories(f.readlines(), only_supporting=only_supporting)
    flatten = lambda data: reduce(lambda x, y: x + y, data)
@@ -130,7 +133,8 @@ def vectorize_stories(data, word_idx, story_maxlen, query_maxlen):
    for story, query, answer in data:
        x = [word_idx[w] for w in story]
        xq = [word_idx[w] for w in query]
-        y = np.zeros(len(word_idx) + 1)  # let's not forget that index 0 is reserved
+        # let's not forget that index 0 is reserved
+        y = np.zeros(len(word_idx) + 1)
        y[word_idx[answer]] = 1
        xs.append(x)
        xqs.append(xq)
@@ -140,10 +144,13 @@ def vectorize_stories(data, word_idx, story_maxlen, query_maxlen):
 RNN = recurrent.LSTM
 EMBED_HIDDEN_SIZE = 50
 SENT_HIDDEN_SIZE = 100
-QUERy_HIDDEN_SIZE = 100
+QUERY_HIDDEN_SIZE = 100
 BATCH_SIZE = 32
 EPOCHS = 40
-print('RNN / Embed / Sent / Query = {}, {}, {}, {}'.format(RNN, EMBED_HIDDEN_SIZE, SENT_HIDDEN_SIZE, QUERy_HIDDEN_SIZE))
+print('RNN / Embed / Sent / Query = {}, {}, {}, {}'.format(RNN,
+                                                           EMBED_HIDDEN_SIZE,
+                                                           SENT_HIDDEN_SIZE,
+                                                           QUERY_HIDDEN_SIZE))

 try:
    path = get_file('babi-tasks-v1-2.tar.gz', origin='https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz')
@@ -164,7 +171,11 @@ challenge = 'tasks_1-20_v1-2/en/qa2_two-supporting-facts_{}.txt'
 train = get_stories(tar.extractfile(challenge.format('train')))
 test = get_stories(tar.extractfile(challenge.format('test')))

-vocab = sorted(reduce(lambda x, y: x | y, (set(story + q + [answer]) for story, q, answer in train + test)))
+vocab = set()
+for story, q, answer in train + test:
+    vocab |= set(story + q + [answer])
+vocab = sorted(vocab)
+
 # Reserve 0 for masking via pad_sequences
 vocab_size = len(vocab) + 1
 word_idx = dict((c, i + 1) for i, c in enumerate(vocab))
@@ -203,6 +214,10 @@ model.compile(optimizer='adam',
              metrics=['accuracy'])

 print('Training')
-model.fit([x, xq], y, batch_size=BATCH_SIZE, epochs=EPOCHS, validation_split=0.05)
-loss, acc = model.evaluate([tx, txq], ty, batch_size=BATCH_SIZE)
+model.fit([x, xq], y,
+          batch_size=BATCH_SIZE,
+          epochs=EPOCHS,
+          validation_split=0.05)
+loss, acc = model.evaluate([tx, txq], ty,
+                           batch_size=BATCH_SIZE)
 print('Test loss / test accuracy = {:.4f} / {:.4f}'.format(loss, acc))
@@ -51,20 +51,45 @@ from keras.datasets import mnist
 from keras.models import Model
 from keras.layers import Activation
 from keras.layers import UpSampling2D, Conv2D, MaxPooling2D
-from keras.layers import Input, BatchNormalization
+from keras.layers import Input, BatchNormalization, ELU
 import matplotlib.pyplot as plt
 import keras.backend as K
 from keras import layers


-def convresblock(x, nfeats=8, ksize=3, nskipped=2):
-    ''' The proposed residual block from [4]'''
+def convresblock(x, nfeats=8, ksize=3, nskipped=2, elu=True):
+    """The proposed residual block from [4].
+
+    Running with elu=True will use ELU nonlinearity and running with
+    elu=False will use BatchNorm + RELU nonlinearity.  While ELUs are fast
+    due to the fact they do not suffer from BatchNorm overhead, they may
+    overfit because they do not offer the stochastic element of the batch
+    formation process of BatchNorm, which acts as a good regularizer.
+
+    # Arguments
+        x: 4D tensor, the tensor to feed through the block
+        nfeats: Integer, number of feature maps for conv layers.
+        ksize: Integer, width and height of conv kernels in first convolution.
+        nskipped: Integer, number of conv layers for the residual function.
+        elu: Boolean, whether to use ELU or BN+RELU.
+
+    # Input shape
+        4D tensor with shape:
+        `(batch, channels, rows, cols)`
+
+    # Output shape
+        4D tensor with shape:
+        `(batch, filters, rows, cols)`
+    """
    y0 = Conv2D(nfeats, ksize, padding='same')(x)
    y = y0
    for i in range(nskipped):
-        y = BatchNormalization(axis=1)(y)
-        y = Activation('relu')(y)
-        y = Conv2D(nfeats, ksize, padding='same')(y)
+        if elu:
+            y = ELU()(y)
+        else:
+            y = BatchNormalization(axis=1)(y)
+            y = Activation('relu')(y)
+        y = Conv2D(nfeats, 1, padding='same')(y)
    return layers.add([y0, y])


@@ -143,7 +168,8 @@ y = img_input
 for i in range(nlayers):
    y_prepool = convresblock(y, nfeats=nfeats_all[i + 1], ksize=ksize)
    y = MaxPooling2D(pool_size=(pool_sizes[i], pool_sizes[i]))(y_prepool)
-    wheres[i] = layers.Lambda(getwhere, output_shape=lambda x: x[0])([y_prepool, y])
+    wheres[i] = layers.Lambda(
+        getwhere, output_shape=lambda x: x[0])([y_prepool, y])

 # Now build the decoder, and use the stored 'where' masks to place the features
 for i in range(nlayers):
@@ -62,6 +62,14 @@ model.compile(loss='mse', optimizer='rmsprop')
 print('Training')
 for i in range(epochs):
    print('Epoch', i, '/', epochs)
+
+    # Note that the last state for sample i in a batch will
+    # be used as initial state for sample i in the next batch.
+    # Thus we are simultaneously training on batch_size series with
+    # lower resolution than the original series contained in cos.
+    # Each of these series is offset by one step and can be
+    # extracted with cos[i::batch_size].
+
    model.fit(cos,
              expected_output,
              batch_size=batch_size,
@@ -18,4 +18,4 @@ from . import losses
 from . import optimizers
 from . import regularizers

-__version__ = '2.0.0'
+__version__ = '2.0.1'
@@ -69,5 +69,14 @@ else:
 def backend():
    """Publicly accessible method
    for determining the current backend.
+
+    # Returns
+        String, the name of the backend Keras is currently using.
+
+    # Example
+    ```python
+        >>> keras.backend.backend()
+        'tensorflow'
+    ```
    """
    return _BACKEND
@@ -178,10 +178,10 @@ def is_keras_tensor(x):
 # Legacy methods

 def set_image_dim_ordering(dim_ordering):
-    """Sets the value of the image data format.
+    """Legacy setter for `image_data_format`.

    # Arguments
-        data_format: string. `'channels_first'` or `'channels_last'`.
+        dim_ordering: string. `'tf'` or `'th'`.

    # Example
    ```python
@@ -204,7 +204,7 @@ def set_image_dim_ordering(dim_ordering):


 def image_dim_ordering():
-    """Legacy getter for data format.
+    """Legacy getter for `image_data_format`.
    """
    if _IMAGE_DATA_FORMAT == 'channels_first':
        return 'th'
@@ -630,7 +630,7 @@ def random_uniform_variable(shape, low, high, dtype=None,

    # Arguments
        shape: Tuple of integers, shape of returned Keras variable.
-        low: Float, lower boundary of the output inteval.
+        low: Float, lower boundary of the output interval.
        high: Float, upper boundary of the output interval.
        dtype: String, dtype of returned Keras variable.
        name: String, name of returned Keras variable.
@@ -1906,7 +1906,7 @@ def one_hot(indices, num_classes):


 def reverse(x, axes):
-    """Reverse a tensor along the the specified axes.
+    """Reverse a tensor along the specified axes.

    # Arguments
        x: Tensor to reverse.
@@ -3153,7 +3153,7 @@ def random_binomial(shape, p=0.0, dtype=None, seed=None):

    # Arguments
        shape: A tuple of integers, the shape of tensor to create.
-        p: A float, `0. <= p <= 1`, probability of binomlai distribution.
+        p: A float, `0. <= p <= 1`, probability of binomial distribution.
        dtype: String, dtype of returned tensor.
        seed: Integer, random seed.

@@ -1002,7 +1002,7 @@ def one_hot(indices, num_classes):


 def reverse(x, axes):
-    """Reverse a tensor along the the specified axes
+    """Reverse a tensor along the specified axes
    """
    if isinstance(axes, int):
        axes = [axes]
@@ -581,13 +581,15 @@ class TensorBoard(Callback):

    # Arguments
        log_dir: the path of the directory where to save the log
-            files to be parsed by Tensorboard
+            files to be parsed by Tensorboard.
        histogram_freq: frequency (in epochs) at which to compute activation
            histograms for the layers of the model. If set to 0,
            histograms won't be computed.
        write_graph: whether to visualize the graph in Tensorboard.
            The log file can become quite large when
            write_graph is set to True.
+        write_images: whether to write model weights to visualize as
+            image in Tensorboard.
    """

    def __init__(self, log_dir='./logs',
@@ -269,7 +269,9 @@ class Layer(object):
                          'dtype',
                          'name',
                          'trainable',
-                          'weights'}
+                          'weights',
+                          'input_dtype',  # legacy
+                          }
        for kwarg in kwargs:
            if kwarg not in allowed_kwargs:
                raise TypeError('Keyword argument not understood:', kwarg)
@@ -292,8 +294,15 @@ class Layer(object):
                    batch_size = None
                batch_input_shape = (batch_size,) + tuple(kwargs['input_shape'])
            self.batch_input_shape = batch_input_shape
-            dtype = kwargs.get('dtype', K.floatx())
+
+            # Set dtype.
+            dtype = kwargs.get('dtype')
+            if dtype is None:
+                dtype = kwargs.get('input_dtype')
+            if dtype is None:
+                dtype = K.floatx()
            self.dtype = dtype
+
        if 'weights' in kwargs:
            self._initial_weights = kwargs['weights']
        else:
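The dtype fallback order introduced in this hunk (`dtype`, then the legacy `input_dtype`, then the backend default) can be sketched as a standalone function. The name `resolve_dtype` is hypothetical, and the `default` argument stands in for `K.floatx()`.

```python
def resolve_dtype(kwargs, default='float32'):
    """Pick a dtype from layer kwargs, honouring the legacy `input_dtype` key."""
    dtype = kwargs.get('dtype')
    if dtype is None:
        dtype = kwargs.get('input_dtype')  # legacy argument name
    if dtype is None:
        dtype = default                    # stands in for K.floatx()
    return dtype
```

An explicit `dtype` wins over `input_dtype`, so code that passes both keeps its current behaviour.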
@@ -1594,7 +1594,7 @@ class Model(Container):
                In this case you should make sure to specify
                sample_weight_mode="temporal" in compile().
            class_weight: optional dictionary mapping
-                lass indices (integers) to
+                class indices (integers) to
                a weight (float) to apply to the model's loss for the samples
                from this class during training.
                This can be useful to tell the model to "pay more attention" to
@@ -219,7 +219,7 @@ class _Conv(Layer):
            'activation': activations.serialize(self.activation),
            'use_bias': self.use_bias,
            'kernel_initializer': initializers.serialize(self.kernel_initializer),
-            'bias_initializer': initializers.serialize(self.kernel_initializer),
+            'bias_initializer': initializers.serialize(self.bias_initializer),
            'kernel_regularizer': regularizers.serialize(self.kernel_regularizer),
            'bias_regularizer': regularizers.serialize(self.bias_regularizer),
            'activity_regularizer': regularizers.serialize(self.activity_regularizer),
@@ -27,7 +27,7 @@ class Masking(Layer):

    For each timestep in the input tensor (dimension #1 in the tensor),
    if all values in the input tensor at that timestep
-    are equal to `mask_value`, then the timestep will masked (skipped)
+    are equal to `mask_value`, then the timestep will be masked (skipped)
    in all downstream layers (as long as they support masking).

    If any downstream layer does not support masking yet receives such
@@ -107,9 +107,9 @@ class Dropout(Layer):
            def dropped_inputs():
                return K.dropout(inputs, self.rate, noise_shape,
                                 seed=self.seed)
-            output = K.in_train_phase(dropped_inputs, inputs,
-                                      training=training)
-        return output
+            return K.in_train_phase(dropped_inputs, inputs,
+                                    training=training)
+        return inputs

    def get_config(self):
        config = {'rate': self.rate}
@@ -857,7 +857,7 @@ class Dense(Layer):
            'activation': activations.serialize(self.activation),
            'use_bias': self.use_bias,
            'kernel_initializer': initializers.serialize(self.kernel_initializer),
-            'bias_initializer': initializers.serialize(self.kernel_initializer),
+            'bias_initializer': initializers.serialize(self.bias_initializer),
            'kernel_regularizer': regularizers.serialize(self.kernel_regularizer),
            'bias_regularizer': regularizers.serialize(self.bias_regularizer),
            'activity_regularizer': regularizers.serialize(self.activity_regularizer),
@@ -175,7 +175,7 @@ class LocallyConnected1D(Layer):
            'activation': activations.serialize(self.activation),
            'use_bias': self.use_bias,
            'kernel_initializer': initializers.serialize(self.kernel_initializer),
-            'bias_initializer': initializers.serialize(self.kernel_initializer),
+            'bias_initializer': initializers.serialize(self.bias_initializer),
            'kernel_regularizer': regularizers.serialize(self.kernel_regularizer),
            'bias_regularizer': regularizers.serialize(self.bias_regularizer),
            'activity_regularizer': regularizers.serialize(self.activity_regularizer),
@@ -432,7 +432,7 @@ class LocallyConnected2D(Layer):
            'activation': activations.serialize(self.activation),
            'use_bias': self.use_bias,
            'kernel_initializer': initializers.serialize(self.kernel_initializer),
-            'bias_initializer': initializers.serialize(self.kernel_initializer),
+            'bias_initializer': initializers.serialize(self.bias_initializer),
            'kernel_regularizer': regularizers.serialize(self.kernel_regularizer),
            'bias_regularizer': regularizers.serialize(self.bias_regularizer),
            'activity_regularizer': regularizers.serialize(self.activity_regularizer),
@@ -148,7 +148,7 @@ legacy_gaussiannoise_support = generate_legacy_interface(
    conversions=[('sigma', 'stddev')])


-def lstm_args_preprocessor(args, kwargs):
+def recurrent_args_preprocessor(args, kwargs):
    converted = []
    if 'forget_bias_init' in kwargs:
        if kwargs['forget_bias_init'] == 'one':
@@ -160,6 +160,15 @@ def lstm_args_preprocessor(args, kwargs):
            warnings.warn('The `forget_bias_init` argument '
                          'has been ignored. Use `unit_forget_bias=True` '
                          'instead to intialize with ones.')
+    if 'input_dim' in kwargs:
+        input_length = kwargs.pop('input_length', None)
+        input_dim = kwargs.pop('input_dim')
+        input_shape = (input_length, input_dim)
+        kwargs['input_shape'] = input_shape
+        converted.append(('input_dim', 'input_shape'))
+        warnings.warn('The `input_dim` and `input_length` arguments '
+                      'in recurrent layers are deprecated. '
+                      'Use `input_shape` instead.')
    return args, kwargs, converted

 legacy_recurrent_support = generate_legacy_interface(
@@ -177,7 +186,7 @@ legacy_recurrent_support = generate_legacy_interface(
    value_conversions={'consume_less': {'cpu': 0,
                                        'mem': 1,
                                        'gpu': 2}},
-    preprocessor=lstm_args_preprocessor)
+    preprocessor=recurrent_args_preprocessor)

 legacy_gaussiandropout_support = generate_legacy_interface(
    allowed_positional_args=['rate'],
@@ -66,7 +66,7 @@ class Merge(Layer):
        warnings.warn('The `Merge` layer is deprecated '
                      'and will be removed after 08/2017. '
                      'Use instead layers from `keras.layers.merge`, '
-                      'e.g. `add`, `concatenate`, etc.')
+                      'e.g. `add`, `concatenate`, etc.', stacklevel=2)
        self.layers = layers
        self.mode = mode
        self.concat_axis = concat_axis
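
The `stacklevel=2` argument added above changes which line a warning is attributed to. The sketch below uses only the standard `warnings` module; `legacy_layer` and `user_code` are hypothetical names, not Keras functions.

```python
import warnings


def legacy_layer(stacklevel):
    warnings.warn('`legacy_layer` is deprecated.', stacklevel=stacklevel)


def user_code(stacklevel):
    legacy_layer(stacklevel)


def reported_line(stacklevel):
    # Record the warning instead of printing it, then return the line number
    # that the warnings machinery attributed it to.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        user_code(stacklevel)
    return caught[0].lineno
```

With `stacklevel=1` the reported line is the `warnings.warn` call inside `legacy_layer`; with `stacklevel=2` it is the call site inside `user_code`, which is usually what the person reading the warning wants to see.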
@@ -286,7 +286,7 @@ class Merge(Layer):

        assert hasattr(mask, '__len__') and len(mask) == len(inputs)

-        if self.mode in ['sum', 'mul', 'ave']:
+        if self.mode in ['sum', 'mul', 'ave', 'max']:
            masks = [K.expand_dims(m, 0) for m in mask if m is not None]
            return K.all(K.concatenate(masks, axis=0), axis=0, keepdims=False)
        elif self.mode == 'concat':
@@ -361,6 +361,7 @@ class Merge(Layer):

    @classmethod
    def from_config(cls, config):
+        config = config.copy()
        mode_type = config.pop('mode_type')
        if mode_type == 'function':
            mode = globals()[config['mode']]
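
The `config = config.copy()` line above keeps `from_config` from mutating the dictionary its caller passed in. Here is a minimal sketch of the difference using plain dictionaries; both function names are hypothetical and only the `pop` pattern is taken from the hunk.

```python
def from_config_mutating(config):
    # Pops directly from the caller's dict, so the key is gone afterwards.
    mode_type = config.pop('mode_type')
    return mode_type


def from_config_copying(config):
    # Shallow copy first: the pop only affects the local copy.
    config = config.copy()
    mode_type = config.pop('mode_type')
    return mode_type
```

After `from_config_mutating(cfg)`, a second call with the same `cfg` raises `KeyError`; the copying version can be called repeatedly on the same config.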
@@ -406,7 +407,7 @@ def merge(inputs, mode='sum', concat_axis=-1,
    ```
    # Arguments
        mode: String or lambda/function. If string, must be one
-            of: 'sum', 'mul', 'concat', 'ave', 'cos', 'dot'.
+            of: 'sum', 'mul', 'concat', 'ave', 'cos', 'dot', 'max'.
            If lambda/function, it should take as input a list of tensors
            and return a single tensor.
        concat_axis: Integer, axis to use in mode `concat`.
@@ -429,7 +430,7 @@ def merge(inputs, mode='sum', concat_axis=-1,
    warnings.warn('The `merge` function is deprecated '
                  'and will be removed after 08/2017. '
                  'Use instead layers from `keras.layers.merge`, '
-                  'e.g. `sum`, `concatenate`, etc.')
+                  'e.g. `sum`, `concatenate`, etc.', stacklevel=2)
    all_keras_tensors = True
    for x in inputs:
        if not hasattr(x, '_keras_history'):
@@ -140,10 +140,12 @@ def conv_input_length(output_length, filter_size, padding, stride):


 def deconv_length(dim_size, stride_size, kernel_size, padding):
-    if dim_size is not None:
-        dim_size *= stride_size
-    if padding == 'valid' and dim_size is not None:
-        dim_size += max(kernel_size - stride_size, 0)
-    if padding == 'full' and dim_size is not None:
-        dim_size -= stride_size + kernel_size - 2
+    if dim_size is None:
+        return None
+    if padding == 'valid':
+        dim_size = dim_size * stride_size + max(kernel_size - stride_size, 0)
+    elif padding == 'full':
+        dim_size = dim_size * stride_size - (stride_size + kernel_size - 2)
+    elif padding == 'same':
+        dim_size = dim_size * stride_size
    return dim_size
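
For reference, the rewritten `deconv_length` from the hunk above can be run on its own. The function body is copied from the added lines; the sample arguments in the note below are arbitrary values chosen for illustration.

```python
def deconv_length(dim_size, stride_size, kernel_size, padding):
    # Unknown input length stays unknown.
    if dim_size is None:
        return None
    if padding == 'valid':
        dim_size = dim_size * stride_size + max(kernel_size - stride_size, 0)
    elif padding == 'full':
        dim_size = dim_size * stride_size - (stride_size + kernel_size - 2)
    elif padding == 'same':
        dim_size = dim_size * stride_size
    return dim_size
```

For `dim_size=4`, `stride_size=2`, `kernel_size=3` this gives 9 for `'valid'`, 5 for `'full'`, and 8 for `'same'`.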
@@ -197,6 +197,8 @@ def func_load(code, defaults=None, closure=None, globs=None):
    """
    if isinstance(code, (tuple, list)):  # unpack previous dump
        code, defaults, closure = code
+        if isinstance(defaults, list):
+            defaults = tuple(defaults)
    code = marshal.loads(code.encode('raw_unicode_escape'))
    if globs is None:
        globs = globals()
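
The tuple conversion added to `func_load` above matters because `types.FunctionType` only accepts a tuple (or `None`) for its defaults. The sketch below assumes the list comes from a JSON round trip, which is one way a tuple can turn into a list; `scale` and `rebuild` are hypothetical helpers, not part of Keras.

```python
import json
import types


def scale(x, factor=2):
    return x * factor


def rebuild(func, defaults):
    # Same guard as in the hunk: lists are converted back to tuples.
    if isinstance(defaults, list):
        defaults = tuple(defaults)
    return types.FunctionType(func.__code__, globals(), func.__name__, defaults)


# JSON has no tuple type, so (2,) comes back as [2].
serialized_defaults = json.loads(json.dumps(scale.__defaults__))
rebuilt = rebuild(scale, serialized_defaults)
```

Passing the list straight to `types.FunctionType` without the conversion raises `TypeError`.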
@@ -3,14 +3,14 @@ from setuptools import find_packages


 setup(name='Keras',
-      version='2.0.0',
+      version='2.0.1',
      description='Deep Learning for Python',
      author='Francois Chollet',
      author_email='francois.chollet@gmail.com',
      url='https://github.com/fchollet/keras',
-      download_url='https://github.com/fchollet/keras/tarball/2.0.0',
+      download_url='https://github.com/fchollet/keras/tarball/2.0.1',
      license='MIT',
-      install_requires=['tensorflow', 'pyyaml', 'six'],
+      install_requires=['theano', 'pyyaml', 'six'],
      extras_require={
          'h5py': ['h5py'],
          'visualize': ['pydot-ng'],
@@ -113,6 +113,16 @@ def test_lstm_legacy_interface():
    new_layer = keras.layers.LSTM(2, input_shape=[3, 5], name='d', implementation=1)
    assert json.dumps(old_layer.get_config()) == json.dumps(new_layer.get_config())

+    old_layer = keras.layers.LSTM(input_dim=5, input_length=3,
+                                  output_dim=2, name='d', consume_less='mem')
+    new_layer = keras.layers.LSTM(2, input_shape=[3, 5], name='d', implementation=1)
+    assert json.dumps(old_layer.get_config()) == json.dumps(new_layer.get_config())
+
+    old_layer = keras.layers.LSTM(input_dim=5,
+                                  output_dim=2, name='d', consume_less='mem')
+    new_layer = keras.layers.LSTM(2, input_shape=[None, 5], name='d', implementation=1)
+    assert json.dumps(old_layer.get_config()) == json.dumps(new_layer.get_config())
+
    old_layer = keras.layers.LSTM(input_shape=[3, 5], output_dim=2, name='d', consume_less='gpu')
    new_layer = keras.layers.LSTM(2, input_shape=[3, 5], name='d', implementation=2)
    assert json.dumps(old_layer.get_config()) == json.dumps(new_layer.get_config())
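
The two new test cases above exercise the `input_dim` / `input_length` conversion added to `recurrent_args_preprocessor`. A standalone sketch of just that keyword rewrite follows, with no Keras dependency; `convert_input_dim` is a hypothetical name, and it works on a copy of the kwargs rather than in place.

```python
def convert_input_dim(kwargs):
    # Work on a copy so the caller's dict is left untouched.
    kwargs = dict(kwargs)
    if 'input_dim' in kwargs:
        input_length = kwargs.pop('input_length', None)
        input_dim = kwargs.pop('input_dim')
        kwargs['input_shape'] = (input_length, input_dim)
    return kwargs
```

When `input_length` is omitted, the first entry of `input_shape` is `None`, which matches the `input_shape=[None, 5]` expectation in the second test case.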
Author	SHA1	Message	Date
Francois Chollet	9cf7f816f2	Prepare new PyPI release.	2017-03-16 11:40:14 -07:00
Francois Chollet	6691b9e3fb	Mention requests for contribution in CONTRIBUTING.md	2017-03-16 11:37:06 -07:00
Tony Beltramelli	467de6bb6c	fix small inconsistencies in the documentation (#5817 ) * fix small inconsistencies in the documentation * remove backend-specific details * add line removed by accident (that what happen when you commit your changes too fast) * Fix in doc example	2017-03-16 10:56:35 -07:00
Pokey Rule	62fd5f7ab6	Fix max_norm doc (#5816 )	2017-03-16 07:04:07 -07:00
Kosuke Kusano	1039924245	fix typo in some layers' `get_config()` (#5810 ) * fix typo * fix typo * fix typo	2017-03-16 07:03:45 -07:00
Francois Chollet	3e81b668ea	Merge branch 'farizrahman4u-patch-5'	2017-03-15 21:13:53 -07:00
Francois Chollet	459d7fe3d7	Style fixes in example scripts	2017-03-15 21:13:31 -07:00
Fariz Rahman	1a0792ae13	Update babi_memnn.py	2017-03-16 09:13:58 +05:30
Fariz Rahman	b5a02391e0	Upgrade memory network example to functional api	2017-03-16 09:12:07 +05:30
Francois Chollet	c88a11c378	Accept legacy input_dtype arg	2017-03-15 19:44:34 -07:00
Sean	40e91020dd	Add HDF5 inputs to FAQ (#3932 )	2017-03-15 16:14:19 -07:00
Andrew Hundt	c3472a5488	Keras 2 and keras-contrib pull request process (#5800 ) * Keras 2 and keras-contrib pull request process Explains the new pull request procedures for Keras 1, Keras 2, and keras-contrib, should resolve https://github.com/fchollet/keras/issues/5270. Based on discussion in https://github.com/fchollet/keras/issues/4944 and https://github.com/farizrahman4u/keras-contrib/issues/9. * CONTRIBUTING.md keras-2 release update	2017-03-15 15:42:17 -07:00
allanzelener	02ba149e57	Convert function defaults to tuple if serialized as list. (#5350 )	2017-03-15 15:37:36 -07:00
Ivan Sosnovik	cb841ae079	Fix typo (#5347 )	2017-03-15 15:36:43 -07:00
Valerio Maggio	0703d79606	Improvements in the Documentation of Backend (#5767 ) * Fix and improvements to the `backend` documentation Improved Preamble of the `backend.md` template: - fixed a typo - Added few notes that makes the documentation more self explanatory - Made all code examples running by Copy&Paste Aligned the format of the `backend()` function Fixed docstring of `set_image_dim_ordering()` function * Fixed a Typo in %USERPROFILE% env name for Window Users * Added `_variable` so not to get a different value every time	2017-03-15 12:40:42 -07:00
Adam Basfop Cavendish	5cef75219a	Fix `deconv_length` inference for `padding='same'` in conv_utils (#5796 )	2017-03-15 12:39:11 -07:00
StefOe	8590e086a3	Dropout created UnboundLocalError (#5788 ) * Dropout created UnboundLocalError output was not assigned when if-clause was not entered * remove unneeded else case * realign code with pep8	2017-03-15 12:00:09 -07:00
3h4	aff40d8008	Clarifying comment to stateful LSTM example (#5787 ) * Clarifying comment to stateful LSTM example * Update stateful_lstm.py	2017-03-15 11:04:50 -07:00
Yorwba	7095aca51b	Reapply patches to legacy Merge layer (#5791 ) * Bug fix : Model.from_config (#5730) * Add merge mode 'max' where it was missing (fixes #3486) (#5729)	2017-03-15 11:04:13 -07:00
nzw	a8eb2e97d0	Fix sample codes for keras-2 (#5782 )	2017-03-15 00:32:21 -07:00
Francois Chollet	ce3093a3b2	PEP8 fixes.	2017-03-15 00:16:44 -07:00
Francois Chollet	f448341e55	Merge branch 'keras-2'	2017-03-15 00:15:55 -07:00
Francois Chollet	51c5f3f9c6	Add support for legacy input_dim in recurrent layers.	2017-03-14 23:06:18 -07:00
nzw	0cc52cf251	Fix typos (#5753 )	2017-03-14 22:54:46 -07:00
antonmbk	c45f48eaea	SWWAE Example: Conv kernel size in resid pathway to 1x1 and activation from BN+RELU to ELU (#5756 ) * Changed conv kernel size in resid pathway to 1x1, and changed activation from BN+RELU to ELU. * Added a more informative docstring decsribing elu argument and its two behaviors.	2017-03-14 22:54:25 -07:00
fchollet	36e526edc6	Update regularizers docs	2017-03-14 22:53:16 -07:00
Francois Chollet	e98379b7d9	Remove tf dependency and add theano dependency in setup.py.	2017-03-14 21:53:54 -07:00
Ben	3f7b0ff954	Set stacklevel so warning points to invocation instead of warning (#5775 )	2017-03-14 21:43:16 -07:00
Richard L. Phillips	69a6b1a028	Made minor change to missing arg in docstring (#5768 )	2017-03-14 11:17:28 -07:00