Compare commits
29 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 9cf7f816f2 | |
| | 6691b9e3fb | |
| | 467de6bb6c | |
| | 62fd5f7ab6 | |
| | 1039924245 | |
| | 3e81b668ea | |
| | 459d7fe3d7 | |
| | 1a0792ae13 | |
| | b5a02391e0 | |
| | c88a11c378 | |
| | 40e91020dd | |
| | c3472a5488 | |
| | 02ba149e57 | |
| | cb841ae079 | |
| | 0703d79606 | |
| | 5cef75219a | |
| | 8590e086a3 | |
| | aff40d8008 | |
| | 7095aca51b | |
| | a8eb2e97d0 | |
| | ce3093a3b2 | |
| | f448341e55 | |
| | 51c5f3f9c6 | |
| | 0cc52cf251 | |
| | c45f48eaea | |
| | 36e526edc6 | |
| | e98379b7d9 | |
| | 3f7b0ff954 | |
| | 69a6b1a028 | |
+12
-1
@@ -30,9 +30,20 @@ You can also use Github issues to request features you would like to see in Kera
|
||||
|
||||
3. After discussing the feature you may choose to attempt a Pull Request. If you're at all able, start writing some code. We always have more work to do than time to do it. If you can write some code then that will speed the process along.
|
||||
|
||||
|
||||
## Requests for Contributions
|
||||
|
||||
[This is the board](https://github.com/fchollet/keras/projects/1) where we list current outstanding issues and features to be added. If you want to start contributing to Keras, this is the place to start.
|
||||
|
||||
|
||||
## Pull Requests
|
||||
|
||||
We love pull requests. Here's a quick guide:
|
||||
**Where should I submit my pull request?**
|
||||
|
||||
1. **Keras improvements and bugfixes** go to the [Keras `master` branch](https://github.com/fchollet/keras/tree/master).
|
||||
2. **New features** such as layers and datasets go to [keras-contrib](https://github.com/farizrahman4u/keras-contrib), unless the feature is listed in [Requests for Contributions](https://github.com/fchollet/keras/projects/1), in which case it belongs in core Keras.
|
||||
|
||||
Here's a quick guide to submitting your improvements:
|
||||
|
||||
1. If your PR introduces a change in functionality, make sure you start by opening an issue to discuss whether the change should be made, and how to handle it. This will save you from having your PR closed down the road! Of course, if your PR is a simple bug fix, you don't need to do that.
|
||||
|
||||
|
||||
+16
-8
@@ -17,10 +17,12 @@ In the future, we are likely to add more backend options. Go ask Microsoft about
|
||||
|
||||
If you have run Keras at least once, you will find the Keras configuration file at:
|
||||
|
||||
`~/.keras/keras.json`
|
||||
`$HOME/.keras/keras.json`
|
||||
|
||||
If it isn't there, you can create it.
|
||||
|
||||
**NOTE for Windows Users:** Please replace `$HOME` with `%USERPROFILE%`.
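If you'd rather not hard-code the platform-specific home directory, here is a minimal sketch of how the path could be resolved from Python. The helper names `keras_config_path` and `load_keras_config` are hypothetical and are not part of Keras; the sketch only assumes the `.keras/keras.json` location described above.

```python
import json
import os


def keras_config_path(home=None):
    """Return the expected location of keras.json (hypothetical helper)."""
    if home is None:
        # expanduser('~') typically resolves to $HOME on POSIX systems
        # and to the user profile directory on Windows.
        home = os.path.expanduser('~')
    return os.path.join(home, '.keras', 'keras.json')


def load_keras_config(path):
    """Return the parsed config, or None if the file does not exist yet."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)


path = keras_config_path()
print(path)
print(load_keras_config(path))
```

If the second line printed is `None`, the file has not been created yet and you can create it by hand.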
|
||||
|
||||
The default configuration file looks like this:
|
||||
|
||||
```
|
||||
@@ -56,7 +58,7 @@ Using TensorFlow backend.
|
||||
}
|
||||
```
|
||||
|
||||
You can change these settings by editing `~/.keras/keras.json`.
|
||||
You can change these settings by editing `$HOME/.keras/keras.json`.
|
||||
|
||||
* `image_data_format`: string, either `"channels_last"` or `"channels_first"`. It specifies which data format convention Keras will follow. (`keras.backend.image_data_format()` returns it.)
|
||||
- For 2D data (e.g. image), `"channels_last"` assumes `(rows, cols, channels)` while `"channels_first"` assumes `(channels, rows, cols)`.
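To make the two conventions concrete, here is a small NumPy sketch that converts a batch of images between them. It uses plain NumPy rather than the Keras backend, and the leading batch axis and the array sizes are assumptions chosen for illustration.

```python
import numpy as np

# A batch of 8 RGB images of 32x32 pixels in "channels_last" layout:
# (samples, rows, cols, channels).
x_last = np.zeros((8, 32, 32, 3))

# Move the channel axis right after the batch axis to get "channels_first":
# (samples, channels, rows, cols).
x_first = np.transpose(x_last, (0, 3, 1, 2))

# And back again.
x_back = np.transpose(x_first, (0, 2, 3, 1))

print(x_last.shape, x_first.shape, x_back.shape)
```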
|
||||
@@ -69,14 +71,14 @@ You can change these settings by editing `~/.keras/keras.json`.
|
||||
|
||||
## Using the abstract Keras backend to write new code
|
||||
|
||||
If you want the Keras modules you write to be compatible with both Theano and TensorFlow, you have to write them via the abstract Keras backend API. Here's an intro.
|
||||
If you want the Keras modules you write to be compatible with both Theano (`th`) and TensorFlow (`tf`), you have to write them via the abstract Keras backend API. Here's an intro.
|
||||
|
||||
You can import the backend module via:
|
||||
```python
|
||||
from keras import backend as K
|
||||
*from keras import backend as K*
|
||||
```
|
||||
|
||||
The code below instantiates an input placeholder. It's equivalent to `tf.placeholder()` or `T.matrix()`, `T.tensor3()`, etc.
|
||||
The code below instantiates an input placeholder. It's equivalent to `tf.placeholder()` or `th.tensor.matrix()`, `th.tensor.tensor3()`, etc.
|
||||
|
||||
```python
|
||||
input = K.placeholder(shape=(2, 4, 5))
|
||||
@@ -86,9 +88,10 @@ input = K.placeholder(shape=(None, 4, 5))
|
||||
input = K.placeholder(ndim=3)
|
||||
```
|
||||
|
||||
The code below instantiates a shared variable. It's equivalent to `tf.variable()` or `theano.shared()`.
|
||||
The code below instantiates a shared variable. It's equivalent to `tf.Variable()` or `th.shared()`.
|
||||
|
||||
```python
|
||||
import numpy as np
|
||||
val = np.random.random((3, 4, 5))
|
||||
var = K.variable(value=val)
|
||||
|
||||
@@ -101,11 +104,16 @@ var = K.ones(shape=(3, 4, 5))
|
||||
Most tensor operations you will need can be done as you would in TensorFlow or Theano:
|
||||
|
||||
```python
|
||||
# Initializing Tensors with Random Numbers
|
||||
b = K.random_uniform_variable(shape=(3, 4), low=0, high=1)  # Uniform distribution
|
||||
c = K.random_normal_variable(shape=(3, 4), mean=0, scale=1)  # Gaussian distribution
|
||||
d = K.random_normal_variable(shape=(3, 4), mean=0, scale=1)
|
||||
# Tensor Arithmetics
|
||||
a = b + c * K.abs(d)
|
||||
c = K.dot(a, K.transpose(b))
|
||||
a = K.sum(b, axis=2)
|
||||
a = K.sum(b, axis=1)
|
||||
a = K.softmax(b)
|
||||
a = concatenate([b, c], axis=-1)
|
||||
a = K.concatenate([b, c], axis=-1)
|
||||
# etc...
|
||||
```
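If you want to check the shapes these operations produce without a backend installed, the following is a rough NumPy analog. It is not the Keras backend API, only a sketch under the assumption that `b`, `c` and `d` are `(3, 4)` arrays; the names `prod`, `row_sums`, `soft` and `joined` are introduced here for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)
b = rng.uniform(0.0, 1.0, size=(3, 4))  # uniform distribution
c = rng.normal(0.0, 1.0, size=(3, 4))   # Gaussian distribution
d = rng.normal(0.0, 1.0, size=(3, 4))

a = b + c * np.abs(d)         # elementwise arithmetic, shape (3, 4)
prod = np.dot(a, b.T)         # (3, 4) times (4, 3) gives (3, 3)
row_sums = np.sum(b, axis=1)  # one sum per row, shape (3,)

# Row-wise softmax, shifted by the row maximum for numerical stability.
e = np.exp(b - b.max(axis=-1, keepdims=True))
soft = e / e.sum(axis=-1, keepdims=True)

joined = np.concatenate([b, c], axis=-1)  # shape (3, 8)

print(a.shape, prod.shape, row_sums.shape, soft.shape, joined.shape)
```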
|
||||
|
||||
|
||||
+1
-1
@@ -17,6 +17,6 @@ model.add(Dense(64, kernel_constraint=max_norm(2.)))
|
||||
|
||||
## Available constraints
|
||||
|
||||
- __max_norm__(m=2): maximum-norm constraint
|
||||
- __max_norm__(max_value=2, axis=0): maximum-norm constraint
|
||||
- __non_neg__(): non-negativity constraint
|
||||
- __unit_norm__(): unit-norm constraint, enforces the matrix to have unit norm along the last axis
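For intuition, here is a NumPy sketch of what a maximum-norm and a non-negativity constraint do to a weight matrix. These are stand-alone illustrations, not the Keras implementations; the `eps` term is an assumption added to avoid division by zero.

```python
import numpy as np


def max_norm(w, max_value=2.0, axis=0, eps=1e-7):
    """Rescale slices of w along `axis` so their L2 norm is at most max_value."""
    norms = np.sqrt(np.sum(np.square(w), axis=axis, keepdims=True))
    desired = np.clip(norms, 0.0, max_value)
    return w * (desired / (eps + norms))


def non_neg(w):
    """Zero out negative entries, leaving the rest unchanged."""
    return w * (w >= 0)


# Column norms are 5.0 and 0.5: only the first column exceeds max_value.
w = np.array([[3.0, 0.3],
              [4.0, 0.4]])
constrained = max_norm(w, max_value=2.0, axis=0)
print(np.linalg.norm(constrained, axis=0))
```

Columns whose norm is already below `max_value` are left essentially untouched; only the oversized ones are scaled down.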
|
||||
+17
@@ -15,6 +15,7 @@
|
||||
- [How can I use stateful RNNs?](#how-can-i-use-stateful-rnns)
|
||||
- [How can I remove a layer from a Sequential model?](#how-can-i-remove-a-layer-from-a-sequential-model)
|
||||
- [How can I use pre-trained models in Keras?](#how-can-i-use-pre-trained-models-in-keras)
|
||||
- [How can I use HDF5 inputs with Keras?](#how-can-i-use-hdf5-inputs-with-keras)
|
||||
|
||||
---
|
||||
|
||||
@@ -319,6 +320,7 @@ To use statefulness in RNNs, you need to:
|
||||
|
||||
- explicitly specify the batch size you are using, by passing a `batch_size` argument to the first layer in your model. E.g. `batch_size=32` for a 32-sample batch of sequences of 10 timesteps with 16 features per timestep.
|
||||
- set `stateful=True` in your RNN layer(s).
|
||||
- specify `shuffle=False` when calling fit().
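The reason the sample order matters can be sketched with plain NumPy. With a stateful RNN, the state for sample `i` of one batch is reused as the initial state for sample `i` of the next batch, so each position in the batch follows its own sub-series. The array below is a stand-in for 12 consecutive samples, and the batch size of 4 is an arbitrary choice for illustration.

```python
import numpy as np

batch_size = 4
samples = np.arange(12)  # stand-in for 12 consecutive samples

# Without shuffling, batches are taken in order.
batches = [samples[i:i + batch_size]
           for i in range(0, len(samples), batch_size)]

# Position i in the batch sees sample i of every batch, one after another.
streams = [[int(batch[i]) for batch in batches] for i in range(batch_size)]

for i, stream in enumerate(streams):
    print('position', i, 'sees', stream)
```

Each stream is the same as `samples[i::batch_size]`. Shuffling would break this alignment, so the carried-over state would no longer correspond to the preceding part of the same sequence.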
|
||||
|
||||
To reset the states accumulated:
|
||||
|
||||
@@ -403,3 +405,18 @@ The VGG16 model is also the basis for several Keras example scripts:
|
||||
- [Style transfer](https://github.com/fchollet/keras/blob/master/examples/neural_style_transfer.py)
|
||||
- [Feature visualization](https://github.com/fchollet/keras/blob/master/examples/conv_filter_visualization.py)
|
||||
- [Deep dream](https://github.com/fchollet/keras/blob/master/examples/deep_dream.py)
|
||||
|
||||
---
|
||||
|
||||
### How can I use HDF5 inputs with Keras?
|
||||
|
||||
You can use the `HDF5Matrix` class from `keras.utils.io_utils`. See [the documentation](/io_utils/#HDF5Matrix) for details.
|
||||
|
||||
You can also directly use an HDF5 dataset:
|
||||
|
||||
```python
|
||||
import h5py
|
||||
with h5py.File('input/file.hdf5', 'r') as f:
    X_data = f['X_data']
    model.predict(X_data)
|
||||
```
|
||||
|
||||
@@ -98,7 +98,7 @@ model.compile(optimizer='rmsprop',
|
||||
|
||||
# Generate dummy data
|
||||
import numpy as np
|
||||
data = np.random.random((1000, 784))
|
||||
data = np.random.random((1000, 100))
|
||||
labels = np.random.randint(2, size=(1000, 1))
|
||||
|
||||
# Train the model, iterating on the data in batches of 32 samples
|
||||
@@ -117,8 +117,8 @@ model.compile(optimizer='rmsprop',
|
||||
|
||||
# Generate dummy data
|
||||
import numpy as np
|
||||
data = np.random.random((1000, 784))
|
||||
labels = np.random.randint(10, size=(1000, 10))
|
||||
data = np.random.random((1000, 100))
|
||||
labels = np.random.randint(10, size=(1000, 1))
|
||||
|
||||
# Convert labels to categorical one-hot encoding
|
||||
binary_labels = keras.utils.to_categorical(labels, num_classes=10)
|
||||
@@ -199,9 +199,9 @@ from keras.layers import Conv2D, MaxPooling2D
|
||||
from keras.optimizers import SGD
|
||||
|
||||
model = Sequential()
|
||||
# input: 100x100 images with 3 channels -> (3, 100, 100) tensors.
|
||||
# input: 100x100 images with 3 channels -> (100, 100, 3) tensors.
|
||||
# this applies 32 convolution filters of size 3x3 each.
|
||||
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(3, 100, 100)))
|
||||
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(100, 100, 3)))
|
||||
model.add(Conv2D(32, (3, 3), activation='relu'))
|
||||
model.add(MaxPooling2D(pool_size=(2, 2)))
|
||||
model.add(Dropout(0.25))
|
||||
@@ -251,12 +251,12 @@ score = model.evaluate(x_test, y_test, batch_size=16)
|
||||
from keras.models import Sequential
|
||||
from keras.layers import Dense, Dropout
|
||||
from keras.layers import Embedding
|
||||
from keras.layers import Conv1D, GlobalAveragePooling1D
|
||||
from keras.layers import Conv1D, GlobalAveragePooling1D, MaxPooling1D
|
||||
|
||||
model = Sequential()
|
||||
model.add(Conv1D(64, 3, activation='relu', input_shape=(seq_length, 100)))
|
||||
model.add(Conv1D(64, 3, activation='relu'))
|
||||
model.add(MaxPooling1D((3, 3)))
|
||||
model.add(MaxPooling1D(3))
|
||||
model.add(Conv1D(128, 3, activation='relu'))
|
||||
model.add(Conv1D(128, 3, activation='relu'))
|
||||
model.add(GlobalAveragePooling1D())
|
||||
@@ -358,6 +358,6 @@ x_val = np.random.random((batch_size * 3, timesteps, data_dim))
|
||||
y_val = np.random.random((batch_size * 3, num_classes))
|
||||
|
||||
model.fit(x_train, y_train,
|
||||
batch_size=batch_size, epochs=5,
|
||||
batch_size=batch_size, epochs=5, shuffle=False,
|
||||
validation_data=(x_val, y_val))
|
||||
```
|
||||
|
||||
+3
-3
@@ -14,10 +14,10 @@ These layers expose 3 keyword arguments:
|
||||
## Example
|
||||
|
||||
```python
|
||||
from keras.regularizers import l2, activity_l2
|
||||
from keras import regularizers
|
||||
model.add(Dense(64, input_dim=64,
|
||||
kernel_regularizer=l2(0.01),
|
||||
activity_regularizer=activity_l2(0.01)))
|
||||
kernel_regularizer=regularizers.l2(0.01),
|
||||
activity_regularizer=regularizers.l1(0.01)))
|
||||
```
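For intuition about what these penalties add to the loss, here is a NumPy sketch of an L1 and an L2 penalty with a factor of `0.01`. The function names `l1_penalty` and `l2_penalty` are hypothetical stand-ins, not Keras functions. In the example above, the L2 penalty would be computed on the layer's kernel and the L1 penalty on the layer's output.

```python
import numpy as np


def l1_penalty(x, factor=0.01):
    """Sum of absolute values, scaled by the regularization factor."""
    return factor * np.sum(np.abs(x))


def l2_penalty(x, factor=0.01):
    """Sum of squares, scaled by the regularization factor."""
    return factor * np.sum(np.square(x))


weights = np.array([[1.0, -2.0],
                    [0.5, 0.0]])
print(l1_penalty(weights))
print(l2_penalty(weights))
```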
|
||||
|
||||
## Available penalties
|
||||
|
||||
+74
-49
@@ -14,9 +14,9 @@ Time per epoch: 3s on CPU (core i7).
|
||||
'''
|
||||
from __future__ import print_function
|
||||
|
||||
from keras.models import Sequential
|
||||
from keras.models import Sequential, Model
|
||||
from keras.layers.embeddings import Embedding
|
||||
from keras.layers import Activation, Dense, Merge, Permute, Dropout
|
||||
from keras.layers import Input, Activation, Dense, Permute, Dropout, add, dot, concatenate
|
||||
from keras.layers import LSTM
|
||||
from keras.utils.data_utils import get_file
|
||||
from keras.preprocessing.sequence import pad_sequences
|
||||
@@ -38,7 +38,8 @@ def tokenize(sent):
|
||||
def parse_stories(lines, only_supporting=False):
|
||||
'''Parse stories provided in the bAbi tasks format
|
||||
|
||||
If only_supporting is true, only the sentences that support the answer are kept.
|
||||
If only_supporting is true, only the sentences
|
||||
that support the answer are kept.
|
||||
'''
|
||||
data = []
|
||||
story = []
|
||||
@@ -68,9 +69,12 @@ def parse_stories(lines, only_supporting=False):
|
||||
|
||||
|
||||
def get_stories(f, only_supporting=False, max_length=None):
|
||||
'''Given a file name, read the file, retrieve the stories, and then convert the sentences into a single story.
|
||||
'''Given a file name, read the file,
|
||||
retrieve the stories,
|
||||
and then convert the sentences into a single story.
|
||||
|
||||
If max_length is supplied, any stories longer than max_length tokens will be discarded.
|
||||
If max_length is supplied,
|
||||
any stories longer than max_length tokens will be discarded.
|
||||
'''
|
||||
data = parse_stories(f.readlines(), only_supporting=only_supporting)
|
||||
flatten = lambda data: reduce(lambda x, y: x + y, data)
|
||||
@@ -85,7 +89,8 @@ def vectorize_stories(data, word_idx, story_maxlen, query_maxlen):
|
||||
for story, query, answer in data:
|
||||
x = [word_idx[w] for w in story]
|
||||
xq = [word_idx[w] for w in query]
|
||||
y = np.zeros(len(word_idx) + 1) # let's not forget that index 0 is reserved
|
||||
# let's not forget that index 0 is reserved
|
||||
y = np.zeros(len(word_idx) + 1)
|
||||
y[word_idx[answer]] = 1
|
||||
X.append(x)
|
||||
Xq.append(xq)
|
||||
@@ -93,7 +98,6 @@ def vectorize_stories(data, word_idx, story_maxlen, query_maxlen):
|
||||
return (pad_sequences(X, maxlen=story_maxlen),
|
||||
pad_sequences(Xq, maxlen=query_maxlen), np.array(Y))
|
||||
|
||||
|
||||
try:
|
||||
path = get_file('babi-tasks-v1-2.tar.gz', origin='https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz')
|
||||
except:
|
||||
@@ -116,7 +120,11 @@ print('Extracting stories for the challenge:', challenge_type)
|
||||
train_stories = get_stories(tar.extractfile(challenge.format('train')))
|
||||
test_stories = get_stories(tar.extractfile(challenge.format('test')))
|
||||
|
||||
vocab = sorted(reduce(lambda x, y: x | y, (set(story + q + [answer]) for story, q, answer in train_stories + test_stories)))
|
||||
vocab = set()
|
||||
for story, q, answer in train_stories + test_stories:
|
||||
vocab |= set(story + q + [answer])
|
||||
vocab = sorted(vocab)
|
||||
|
||||
# Reserve 0 for masking via pad_sequences
|
||||
vocab_size = len(vocab) + 1
|
||||
story_maxlen = max(map(len, (x for x, _, _ in train_stories + test_stories)))
|
||||
@@ -135,8 +143,14 @@ print('-')
|
||||
print('Vectorizing the word sequences...')
|
||||
|
||||
word_idx = dict((c, i + 1) for i, c in enumerate(vocab))
|
||||
inputs_train, queries_train, answers_train = vectorize_stories(train_stories, word_idx, story_maxlen, query_maxlen)
|
||||
inputs_test, queries_test, answers_test = vectorize_stories(test_stories, word_idx, story_maxlen, query_maxlen)
|
||||
inputs_train, queries_train, answers_train = vectorize_stories(train_stories,
|
||||
word_idx,
|
||||
story_maxlen,
|
||||
query_maxlen)
|
||||
inputs_test, queries_test, answers_test = vectorize_stories(test_stories,
|
||||
word_idx,
|
||||
story_maxlen,
|
||||
query_maxlen)
|
||||
|
||||
print('-')
|
||||
print('inputs: integer tensor of shape (samples, max_length)')
|
||||
@@ -153,13 +167,25 @@ print('answers_test shape:', answers_test.shape)
|
||||
print('-')
|
||||
print('Compiling...')
|
||||
|
||||
# placeholders
|
||||
input_sequence = Input((story_maxlen,))
|
||||
question = Input((query_maxlen,))
|
||||
|
||||
# encoders
|
||||
# embed the input sequence into a sequence of vectors
|
||||
input_encoder_m = Sequential()
|
||||
input_encoder_m.add(Embedding(input_dim=vocab_size,
|
||||
output_dim=64,
|
||||
input_length=story_maxlen))
|
||||
output_dim=64))
|
||||
input_encoder_m.add(Dropout(0.3))
|
||||
# output: (samples, story_maxlen, embedding_dim)
|
||||
|
||||
# embed the input into a sequence of vectors of size query_maxlen
|
||||
input_encoder_c = Sequential()
|
||||
input_encoder_c.add(Embedding(input_dim=vocab_size,
|
||||
output_dim=query_maxlen))
|
||||
input_encoder_c.add(Dropout(0.3))
|
||||
# output: (samples, story_maxlen, query_maxlen)
|
||||
|
||||
# embed the question into a sequence of vectors
|
||||
question_encoder = Sequential()
|
||||
question_encoder.add(Embedding(input_dim=vocab_size,
|
||||
@@ -167,44 +193,43 @@ question_encoder.add(Embedding(input_dim=vocab_size,
|
||||
input_length=query_maxlen))
|
||||
question_encoder.add(Dropout(0.3))
|
||||
# output: (samples, query_maxlen, embedding_dim)
|
||||
# compute a 'match' between input sequence elements (which are vectors)
|
||||
# and the question vector sequence
|
||||
match = Sequential()
|
||||
match.add(Merge([input_encoder_m, question_encoder],
|
||||
mode='dot',
|
||||
dot_axes=[2, 2]))
|
||||
match.add(Activation('softmax'))
|
||||
# output: (samples, story_maxlen, query_maxlen)
|
||||
# embed the input into a single vector with size = story_maxlen:
|
||||
input_encoder_c = Sequential()
|
||||
input_encoder_c.add(Embedding(input_dim=vocab_size,
|
||||
output_dim=query_maxlen,
|
||||
input_length=story_maxlen))
|
||||
input_encoder_c.add(Dropout(0.3))
|
||||
# output: (samples, story_maxlen, query_maxlen)
|
||||
# sum the match vector with the input vector:
|
||||
response = Sequential()
|
||||
response.add(Merge([match, input_encoder_c], mode='sum'))
|
||||
# output: (samples, story_maxlen, query_maxlen)
|
||||
response.add(Permute((2, 1))) # output: (samples, query_maxlen, story_maxlen)
|
||||
|
||||
# concatenate the match vector with the question vector,
|
||||
# and do logistic regression on top
|
||||
answer = Sequential()
|
||||
answer.add(Merge([response, question_encoder], mode='concat', concat_axis=-1))
|
||||
# encode input sequence and questions (which are indices)
|
||||
# to sequences of dense vectors
|
||||
input_encoded_m = input_encoder_m(input_sequence)
|
||||
input_encoded_c = input_encoder_c(input_sequence)
|
||||
question_encoded = question_encoder(question)
|
||||
|
||||
# compute a 'match' between the first input vector sequence
|
||||
# and the question vector sequence
|
||||
# shape: `(samples, story_maxlen, query_maxlen)`
|
||||
match = dot([input_encoded_m, question_encoded], axes=(2, 2))
|
||||
match = Activation('softmax')(match)
|
||||
|
||||
# add the match matrix with the second input vector sequence
|
||||
response = add([match, input_encoded_c]) # (samples, story_maxlen, query_maxlen)
|
||||
response = Permute((2, 1))(response) # (samples, query_maxlen, story_maxlen)
|
||||
|
||||
# concatenate the match matrix with the question vector sequence
|
||||
answer = concatenate([response, question_encoded])
|
||||
|
||||
# the original paper uses a matrix multiplication for this reduction step.
|
||||
# we choose to use a RNN instead.
|
||||
answer.add(LSTM(32))
|
||||
# one regularization layer -- more would probably be needed.
|
||||
answer.add(Dropout(0.3))
|
||||
answer.add(Dense(vocab_size))
|
||||
# we output a probability distribution over the vocabulary
|
||||
answer.add(Activation('softmax'))
|
||||
answer = LSTM(32)(answer) # (samples, 32)
|
||||
|
||||
answer.compile(optimizer='rmsprop', loss='categorical_crossentropy',
|
||||
metrics=['accuracy'])
|
||||
# Note: you could use a Graph model to avoid repeat the input twice
|
||||
answer.fit([inputs_train, queries_train, inputs_train], answers_train,
|
||||
batch_size=32,
|
||||
epochs=120,
|
||||
validation_data=([inputs_test, queries_test, inputs_test], answers_test))
|
||||
# one regularization layer -- more would probably be needed.
|
||||
answer = Dropout(0.3)(answer)
|
||||
answer = Dense(vocab_size)(answer) # (samples, vocab_size)
|
||||
# we output a probability distribution over the vocabulary
|
||||
answer = Activation('softmax')(answer)
|
||||
|
||||
# build the final model
|
||||
model = Model([input_sequence, question], answer)
|
||||
model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
|
||||
metrics=['accuracy'])
|
||||
|
||||
# train
|
||||
model.fit([inputs_train, queries_train], answers_train,
|
||||
batch_size=32,
|
||||
epochs=120,
|
||||
validation_data=([inputs_test, queries_test], answers_test))
|
||||
|
||||
+24
-9
@@ -83,7 +83,8 @@ def tokenize(sent):
|
||||
def parse_stories(lines, only_supporting=False):
|
||||
'''Parse stories provided in the bAbi tasks format
|
||||
|
||||
If only_supporting is true, only the sentences that support the answer are kept.
|
||||
If only_supporting is true,
|
||||
only the sentences that support the answer are kept.
|
||||
'''
|
||||
data = []
|
||||
story = []
|
||||
@@ -113,9 +114,11 @@ def parse_stories(lines, only_supporting=False):
|
||||
|
||||
|
||||
def get_stories(f, only_supporting=False, max_length=None):
|
||||
'''Given a file name, read the file, retrieve the stories, and then convert the sentences into a single story.
|
||||
'''Given a file name, read the file, retrieve the stories,
|
||||
and then convert the sentences into a single story.
|
||||
|
||||
If max_length is supplied, any stories longer than max_length tokens will be discarded.
|
||||
If max_length is supplied,
|
||||
any stories longer than max_length tokens will be discarded.
|
||||
'''
|
||||
data = parse_stories(f.readlines(), only_supporting=only_supporting)
|
||||
flatten = lambda data: reduce(lambda x, y: x + y, data)
|
||||
@@ -130,7 +133,8 @@ def vectorize_stories(data, word_idx, story_maxlen, query_maxlen):
|
||||
for story, query, answer in data:
|
||||
x = [word_idx[w] for w in story]
|
||||
xq = [word_idx[w] for w in query]
|
||||
y = np.zeros(len(word_idx) + 1) # let's not forget that index 0 is reserved
|
||||
# let's not forget that index 0 is reserved
|
||||
y = np.zeros(len(word_idx) + 1)
|
||||
y[word_idx[answer]] = 1
|
||||
xs.append(x)
|
||||
xqs.append(xq)
|
||||
@@ -140,10 +144,13 @@ def vectorize_stories(data, word_idx, story_maxlen, query_maxlen):
|
||||
RNN = recurrent.LSTM
|
||||
EMBED_HIDDEN_SIZE = 50
|
||||
SENT_HIDDEN_SIZE = 100
|
||||
QUERy_HIDDEN_SIZE = 100
|
||||
QUERY_HIDDEN_SIZE = 100
|
||||
BATCH_SIZE = 32
|
||||
EPOCHS = 40
|
||||
print('RNN / Embed / Sent / Query = {}, {}, {}, {}'.format(RNN, EMBED_HIDDEN_SIZE, SENT_HIDDEN_SIZE, QUERy_HIDDEN_SIZE))
|
||||
print('RNN / Embed / Sent / Query = {}, {}, {}, {}'.format(RNN,
|
||||
EMBED_HIDDEN_SIZE,
|
||||
SENT_HIDDEN_SIZE,
|
||||
QUERY_HIDDEN_SIZE))
|
||||
|
||||
try:
|
||||
path = get_file('babi-tasks-v1-2.tar.gz', origin='https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz')
|
||||
@@ -164,7 +171,11 @@ challenge = 'tasks_1-20_v1-2/en/qa2_two-supporting-facts_{}.txt'
|
||||
train = get_stories(tar.extractfile(challenge.format('train')))
|
||||
test = get_stories(tar.extractfile(challenge.format('test')))
|
||||
|
||||
vocab = sorted(reduce(lambda x, y: x | y, (set(story + q + [answer]) for story, q, answer in train + test)))
|
||||
vocab = set()
|
||||
for story, q, answer in train + test:
|
||||
vocab |= set(story + q + [answer])
|
||||
vocab = sorted(vocab)
|
||||
|
||||
# Reserve 0 for masking via pad_sequences
|
||||
vocab_size = len(vocab) + 1
|
||||
word_idx = dict((c, i + 1) for i, c in enumerate(vocab))
|
||||
@@ -203,6 +214,10 @@ model.compile(optimizer='adam',
|
||||
metrics=['accuracy'])
|
||||
|
||||
print('Training')
|
||||
model.fit([x, xq], y, batch_size=BATCH_SIZE, epochs=EPOCHS, validation_split=0.05)
|
||||
loss, acc = model.evaluate([tx, txq], ty, batch_size=BATCH_SIZE)
|
||||
model.fit([x, xq], y,
|
||||
batch_size=BATCH_SIZE,
|
||||
epochs=EPOCHS,
|
||||
validation_split=0.05)
|
||||
loss, acc = model.evaluate([tx, txq], ty,
|
||||
batch_size=BATCH_SIZE)
|
||||
print('Test loss / test accuracy = {:.4f} / {:.4f}'.format(loss, acc))
|
||||
|
||||
@@ -51,20 +51,45 @@ from keras.datasets import mnist
|
||||
from keras.models import Model
|
||||
from keras.layers import Activation
|
||||
from keras.layers import UpSampling2D, Conv2D, MaxPooling2D
|
||||
from keras.layers import Input, BatchNormalization
|
||||
from keras.layers import Input, BatchNormalization, ELU
|
||||
import matplotlib.pyplot as plt
|
||||
import keras.backend as K
|
||||
from keras import layers
|
||||
|
||||
|
||||
def convresblock(x, nfeats=8, ksize=3, nskipped=2):
|
||||
''' The proposed residual block from [4]'''
|
||||
def convresblock(x, nfeats=8, ksize=3, nskipped=2, elu=True):
|
||||
"""The proposed residual block from [4].
|
||||
|
||||
Running with elu=True will use ELU nonlinearity and running with
|
||||
elu=False will use BatchNorm + RELU nonlinearity. While ELUs are fast
|
||||
due to the fact they do not suffer from BatchNorm overhead, they may
|
||||
overfit because they do not offer the stochastic element of the batch
|
||||
formation process of BatchNorm, which acts as a good regularizer.
|
||||
|
||||
# Arguments
|
||||
x: 4D tensor, the tensor to feed through the block
|
||||
nfeats: Integer, number of feature maps for conv layers.
|
||||
ksize: Integer, width and height of conv kernels in first convolution.
|
||||
nskipped: Integer, number of conv layers for the residual function.
|
||||
elu: Boolean, whether to use ELU or BN+RELU.
|
||||
|
||||
# Input shape
|
||||
4D tensor with shape:
|
||||
`(batch, channels, rows, cols)`
|
||||
|
||||
# Output shape
|
||||
4D tensor with shape:
|
||||
`(batch, filters, rows, cols)`
|
||||
"""
|
||||
y0 = Conv2D(nfeats, ksize, padding='same')(x)
|
||||
y = y0
|
||||
for i in range(nskipped):
|
||||
y = BatchNormalization(axis=1)(y)
|
||||
y = Activation('relu')(y)
|
||||
y = Conv2D(nfeats, ksize, padding='same')(y)
|
||||
if elu:
|
||||
y = ELU()(y)
|
||||
else:
|
||||
y = BatchNormalization(axis=1)(y)
|
||||
y = Activation('relu')(y)
|
||||
y = Conv2D(nfeats, 1, padding='same')(y)
|
||||
return layers.add([y0, y])
|
||||
|
||||
|
||||
@@ -143,7 +168,8 @@ y = img_input
|
||||
for i in range(nlayers):
|
||||
y_prepool = convresblock(y, nfeats=nfeats_all[i + 1], ksize=ksize)
|
||||
y = MaxPooling2D(pool_size=(pool_sizes[i], pool_sizes[i]))(y_prepool)
|
||||
wheres[i] = layers.Lambda(getwhere, output_shape=lambda x: x[0])([y_prepool, y])
|
||||
wheres[i] = layers.Lambda(
|
||||
getwhere, output_shape=lambda x: x[0])([y_prepool, y])
|
||||
|
||||
# Now build the decoder, and use the stored 'where' masks to place the features
|
||||
for i in range(nlayers):
|
||||
|
||||
@@ -62,6 +62,14 @@ model.compile(loss='mse', optimizer='rmsprop')
|
||||
print('Training')
|
||||
for i in range(epochs):
|
||||
print('Epoch', i, '/', epochs)
|
||||
|
||||
# Note that the last state for sample i in a batch will
|
||||
# be used as initial state for sample i in the next batch.
|
||||
# Thus we are simultaneously training on batch_size series with
|
||||
# lower resolution than the original series contained in cos.
|
||||
# Each of these series is offset by one step and can be
|
||||
# extracted with cos[i::batch_size].
|
||||
|
||||
model.fit(cos,
|
||||
expected_output,
|
||||
batch_size=batch_size,
|
||||
|
||||
+1
-1
@@ -18,4 +18,4 @@ from . import losses
|
||||
from . import optimizers
|
||||
from . import regularizers
|
||||
|
||||
__version__ = '2.0.0'
|
||||
__version__ = '2.0.1'
|
||||
|
||||
@@ -69,5 +69,14 @@ else:
|
||||
def backend():
|
||||
"""Publicly accessible method
|
||||
for determining the current backend.
|
||||
|
||||
# Returns
|
||||
String, the name of the backend Keras is currently using.
|
||||
|
||||
# Example
|
||||
```python
|
||||
>>> keras.backend.backend()
|
||||
'tensorflow'
|
||||
```
|
||||
"""
|
||||
return _BACKEND
|
||||
|
||||
@@ -178,10 +178,10 @@ def is_keras_tensor(x):
|
||||
# Legacy methods
|
||||
|
||||
def set_image_dim_ordering(dim_ordering):
|
||||
"""Sets the value of the image data format.
|
||||
"""Legacy setter for `image_data_format`.
|
||||
|
||||
# Arguments
|
||||
data_format: string. `'channels_first'` or `'channels_last'`.
|
||||
dim_ordering: string. `'tf'` or `'th'`.
|
||||
|
||||
# Example
|
||||
```python
|
||||
@@ -204,7 +204,7 @@ def set_image_dim_ordering(dim_ordering):
|
||||
|
||||
|
||||
def image_dim_ordering():
|
||||
"""Legacy getter for data format.
|
||||
"""Legacy getter for `image_data_format`.
|
||||
"""
|
||||
if _IMAGE_DATA_FORMAT == 'channels_first':
|
||||
return 'th'
|
||||
|
||||
@@ -630,7 +630,7 @@ def random_uniform_variable(shape, low, high, dtype=None,
|
||||
|
||||
# Arguments
|
||||
shape: Tuple of integers, shape of returned Keras variable.
|
||||
low: Float, lower boundary of the output inteval.
|
||||
low: Float, lower boundary of the output interval.
|
||||
high: Float, upper boundary of the output interval.
|
||||
dtype: String, dtype of returned Keras variable.
|
||||
name: String, name of returned Keras variable.
|
||||
@@ -1906,7 +1906,7 @@ def one_hot(indices, num_classes):
|
||||
|
||||
|
||||
def reverse(x, axes):
|
||||
"""Reverse a tensor along the the specified axes.
|
||||
"""Reverse a tensor along the specified axes.
|
||||
|
||||
# Arguments
|
||||
x: Tensor to reverse.
|
||||
@@ -3153,7 +3153,7 @@ def random_binomial(shape, p=0.0, dtype=None, seed=None):
|
||||
|
||||
# Arguments
|
||||
shape: A tuple of integers, the shape of tensor to create.
|
||||
p: A float, `0. <= p <= 1`, probability of binomlai distribution.
|
||||
p: A float, `0. <= p <= 1`, probability of binomial distribution.
|
||||
dtype: String, dtype of returned tensor.
|
||||
seed: Integer, random seed.
|
||||
|
||||
|
||||
@@ -1002,7 +1002,7 @@ def one_hot(indices, num_classes):
|
||||
|
||||
|
||||
def reverse(x, axes):
|
||||
"""Reverse a tensor along the the specified axes
|
||||
"""Reverse a tensor along the specified axes
|
||||
"""
|
||||
if isinstance(axes, int):
|
||||
axes = [axes]
|
||||
|
||||
@@ -581,13 +581,15 @@ class TensorBoard(Callback):
|
||||
|
||||
# Arguments
|
||||
log_dir: the path of the directory where to save the log
|
||||
files to be parsed by Tensorboard
|
||||
files to be parsed by Tensorboard.
|
||||
histogram_freq: frequency (in epochs) at which to compute activation
|
||||
histograms for the layers of the model. If set to 0,
|
||||
histograms won't be computed.
|
||||
write_graph: whether to visualize the graph in Tensorboard.
|
||||
The log file can become quite large when
|
||||
write_graph is set to True.
|
||||
write_images: whether to write model weights to visualize as
|
||||
image in Tensorboard.
|
||||
"""
|
||||
|
||||
def __init__(self, log_dir='./logs',
|
||||
|
||||
@@ -269,7 +269,9 @@ class Layer(object):
|
||||
'dtype',
|
||||
'name',
|
||||
'trainable',
|
||||
'weights'}
|
||||
'weights',
|
||||
'input_dtype', # legacy
|
||||
}
|
||||
for kwarg in kwargs:
|
||||
if kwarg not in allowed_kwargs:
|
||||
raise TypeError('Keyword argument not understood:', kwarg)
|
||||
@@ -292,8 +294,15 @@ class Layer(object):
|
||||
batch_size = None
|
||||
batch_input_shape = (batch_size,) + tuple(kwargs['input_shape'])
|
||||
self.batch_input_shape = batch_input_shape
|
||||
dtype = kwargs.get('dtype', K.floatx())
|
||||
|
||||
# Set dtype.
|
||||
dtype = kwargs.get('dtype')
|
||||
if dtype is None:
|
||||
dtype = kwargs.get('input_dtype')
|
||||
if dtype is None:
|
||||
dtype = K.floatx()
|
||||
self.dtype = dtype
|
||||
|
||||
if 'weights' in kwargs:
|
||||
self._initial_weights = kwargs['weights']
|
||||
else:
|
||||
|
||||
@@ -1594,7 +1594,7 @@ class Model(Container):
|
||||
In this case you should make sure to specify
|
||||
sample_weight_mode="temporal" in compile().
|
||||
class_weight: optional dictionary mapping
|
||||
lass indices (integers) to
|
||||
class indices (integers) to
|
||||
a weight (float) to apply to the model's loss for the samples
|
||||
from this class during training.
|
||||
This can be useful to tell the model to "pay more attention" to
|
||||
|
||||
@@ -219,7 +219,7 @@ class _Conv(Layer):
|
||||
'activation': activations.serialize(self.activation),
|
||||
'use_bias': self.use_bias,
|
||||
'kernel_initializer': initializers.serialize(self.kernel_initializer),
|
||||
'bias_initializer': initializers.serialize(self.kernel_initializer),
|
||||
'bias_initializer': initializers.serialize(self.bias_initializer),
|
||||
'kernel_regularizer': regularizers.serialize(self.kernel_regularizer),
|
||||
'bias_regularizer': regularizers.serialize(self.bias_regularizer),
|
||||
'activity_regularizer': regularizers.serialize(self.activity_regularizer),
|
||||
|
||||
@@ -27,7 +27,7 @@ class Masking(Layer):

     For each timestep in the input tensor (dimension #1 in the tensor),
     if all values in the input tensor at that timestep
-    are equal to `mask_value`, then the timestep will masked (skipped)
+    are equal to `mask_value`, then the timestep will be masked (skipped)
     in all downstream layers (as long as they support masking).

     If any downstream layer does not support masking yet receives such
@@ -107,9 +107,9 @@ class Dropout(Layer):
             def dropped_inputs():
                 return K.dropout(inputs, self.rate, noise_shape,
                                  seed=self.seed)
-            output = K.in_train_phase(dropped_inputs, inputs,
-                                      training=training)
-        return output
+            return K.in_train_phase(dropped_inputs, inputs,
+                                    training=training)
+        return inputs

     def get_config(self):
         config = {'rate': self.rate}
@@ -857,7 +857,7 @@ class Dense(Layer):
             'activation': activations.serialize(self.activation),
             'use_bias': self.use_bias,
             'kernel_initializer': initializers.serialize(self.kernel_initializer),
-            'bias_initializer': initializers.serialize(self.kernel_initializer),
+            'bias_initializer': initializers.serialize(self.bias_initializer),
             'kernel_regularizer': regularizers.serialize(self.kernel_regularizer),
             'bias_regularizer': regularizers.serialize(self.bias_regularizer),
             'activity_regularizer': regularizers.serialize(self.activity_regularizer),
@@ -175,7 +175,7 @@ class LocallyConnected1D(Layer):
             'activation': activations.serialize(self.activation),
             'use_bias': self.use_bias,
             'kernel_initializer': initializers.serialize(self.kernel_initializer),
-            'bias_initializer': initializers.serialize(self.kernel_initializer),
+            'bias_initializer': initializers.serialize(self.bias_initializer),
             'kernel_regularizer': regularizers.serialize(self.kernel_regularizer),
             'bias_regularizer': regularizers.serialize(self.bias_regularizer),
             'activity_regularizer': regularizers.serialize(self.activity_regularizer),
@@ -432,7 +432,7 @@ class LocallyConnected2D(Layer):
             'activation': activations.serialize(self.activation),
             'use_bias': self.use_bias,
             'kernel_initializer': initializers.serialize(self.kernel_initializer),
-            'bias_initializer': initializers.serialize(self.kernel_initializer),
+            'bias_initializer': initializers.serialize(self.bias_initializer),
             'kernel_regularizer': regularizers.serialize(self.kernel_regularizer),
             'bias_regularizer': regularizers.serialize(self.bias_regularizer),
             'activity_regularizer': regularizers.serialize(self.activity_regularizer),
@@ -148,7 +148,7 @@ legacy_gaussiannoise_support = generate_legacy_interface(
     conversions=[('sigma', 'stddev')])


-def lstm_args_preprocessor(args, kwargs):
+def recurrent_args_preprocessor(args, kwargs):
     converted = []
     if 'forget_bias_init' in kwargs:
         if kwargs['forget_bias_init'] == 'one':
@@ -160,6 +160,15 @@ def lstm_args_preprocessor(args, kwargs):
             warnings.warn('The `forget_bias_init` argument '
                           'has been ignored. Use `unit_forget_bias=True` '
                           'instead to intialize with ones.')
+    if 'input_dim' in kwargs:
+        input_length = kwargs.pop('input_length', None)
+        input_dim = kwargs.pop('input_dim')
+        input_shape = (input_length, input_dim)
+        kwargs['input_shape'] = input_shape
+        converted.append(('input_dim', 'input_shape'))
+        warnings.warn('The `input_dim` and `input_length` arguments '
+                      'in recurrent layers are deprecated. '
+                      'Use `input_shape` instead.')
     return args, kwargs, converted

 legacy_recurrent_support = generate_legacy_interface(
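As a rough illustration of the `input_dim` / `input_length` conversion added in the hunk above, here is a standalone sketch. The helper name `convert_input_dim` is hypothetical, and unlike the preprocessor in the diff it copies `kwargs` instead of mutating it; the conversion logic itself follows the added lines.

```python
import warnings


def convert_input_dim(kwargs):
    """Rewrite legacy `input_dim`/`input_length` into `input_shape`.

    Hypothetical helper: works on a copy so the caller's dict is untouched.
    """
    kwargs = dict(kwargs)
    converted = []
    if 'input_dim' in kwargs:
        # A missing `input_length` becomes None (variable-length sequences).
        input_length = kwargs.pop('input_length', None)
        input_dim = kwargs.pop('input_dim')
        kwargs['input_shape'] = (input_length, input_dim)
        converted.append(('input_dim', 'input_shape'))
        warnings.warn('The `input_dim` and `input_length` arguments '
                      'in recurrent layers are deprecated. '
                      'Use `input_shape` instead.')
    return kwargs, converted


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    new_kwargs, converted = convert_input_dim({'input_dim': 5, 'input_length': 3})
    print(new_kwargs, converted, len(caught))
```

This matches the new test cases further down the diff, where `input_dim=5, input_length=3` is expected to behave like `input_shape=[3, 5]` and `input_dim=5` alone like `input_shape=[None, 5]`.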
@@ -177,7 +186,7 @@ legacy_recurrent_support = generate_legacy_interface(
     value_conversions={'consume_less': {'cpu': 0,
                                         'mem': 1,
                                         'gpu': 2}},
-    preprocessor=lstm_args_preprocessor)
+    preprocessor=recurrent_args_preprocessor)

 legacy_gaussiandropout_support = generate_legacy_interface(
     allowed_positional_args=['rate'],
@@ -66,7 +66,7 @@ class Merge(Layer):
         warnings.warn('The `Merge` layer is deprecated '
                       'and will be removed after 08/2017. '
                       'Use instead layers from `keras.layers.merge`, '
-                      'e.g. `add`, `concatenate`, etc.')
+                      'e.g. `add`, `concatenate`, etc.', stacklevel=2)
         self.layers = layers
         self.mode = mode
         self.concat_axis = concat_axis
@@ -286,7 +286,7 @@ class Merge(Layer):

         assert hasattr(mask, '__len__') and len(mask) == len(inputs)

-        if self.mode in ['sum', 'mul', 'ave']:
+        if self.mode in ['sum', 'mul', 'ave', 'max']:
             masks = [K.expand_dims(m, 0) for m in mask if m is not None]
             return K.all(K.concatenate(masks, axis=0), axis=0, keepdims=False)
         elif self.mode == 'concat':
@@ -361,6 +361,7 @@ class Merge(Layer):

     @classmethod
     def from_config(cls, config):
+        config = config.copy()
         mode_type = config.pop('mode_type')
         if mode_type == 'function':
             mode = globals()[config['mode']]
@@ -406,7 +407,7 @@ def merge(inputs, mode='sum', concat_axis=-1,
     ```
     # Arguments
         mode: String or lambda/function. If string, must be one
-            of: 'sum', 'mul', 'concat', 'ave', 'cos', 'dot'.
+            of: 'sum', 'mul', 'concat', 'ave', 'cos', 'dot', 'max'.
             If lambda/function, it should take as input a list of tensors
             and return a single tensor.
         concat_axis: Integer, axis to use in mode `concat`.
@@ -429,7 +430,7 @@ def merge(inputs, mode='sum', concat_axis=-1,
     warnings.warn('The `merge` function is deprecated '
                   'and will be removed after 08/2017. '
                   'Use instead layers from `keras.layers.merge`, '
-                  'e.g. `sum`, `concatenate`, etc.')
+                  'e.g. `sum`, `concatenate`, etc.', stacklevel=2)
     all_keras_tensors = True
     for x in inputs:
         if not hasattr(x, '_keras_history'):
@@ -140,10 +140,12 @@ def conv_input_length(output_length, filter_size, padding, stride):


 def deconv_length(dim_size, stride_size, kernel_size, padding):
-    if dim_size is not None:
-        dim_size *= stride_size
-    if padding == 'valid' and dim_size is not None:
-        dim_size += max(kernel_size - stride_size, 0)
-    if padding == 'full' and dim_size is not None:
-        dim_size -= stride_size + kernel_size - 2
+    if dim_size is None:
+        return None
+    if padding == 'valid':
+        dim_size = dim_size * stride_size + max(kernel_size - stride_size, 0)
+    elif padding == 'full':
+        dim_size = dim_size * stride_size - (stride_size + kernel_size - 2)
+    elif padding == 'same':
+        dim_size = dim_size * stride_size
     return dim_size
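To make the three padding cases in the rewritten `deconv_length` concrete, the new version is transcribed below as a standalone function with a few sample calls. The input values (size 4, stride 2, kernel 3) are arbitrary illustrations, not taken from the diff.

```python
def deconv_length(dim_size, stride_size, kernel_size, padding):
    """Output length of a transposed convolution along one dimension."""
    if dim_size is None:
        return None  # unknown input size stays unknown
    if padding == 'valid':
        dim_size = dim_size * stride_size + max(kernel_size - stride_size, 0)
    elif padding == 'full':
        dim_size = dim_size * stride_size - (stride_size + kernel_size - 2)
    elif padding == 'same':
        dim_size = dim_size * stride_size
    return dim_size


for padding in ('valid', 'same', 'full'):
    print(padding, deconv_length(4, 2, 3, padding))
print('unknown size:', deconv_length(None, 2, 3, 'valid'))
```

With these inputs the results are 9 for `'valid'` (4 * 2 + 1), 8 for `'same'`, and 5 for `'full'` (8 - 3). Any other padding string falls through and returns `dim_size` unchanged, which mirrors the code in the hunk.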
@@ -197,6 +197,8 @@ def func_load(code, defaults=None, closure=None, globs=None):
     """
     if isinstance(code, (tuple, list)):  # unpack previous dump
         code, defaults, closure = code
+        if isinstance(defaults, list):
+            defaults = tuple(defaults)
     code = marshal.loads(code.encode('raw_unicode_escape'))
     if globs is None:
         globs = globals()
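The diff does not say why `defaults` can arrive as a list. One plausible cause, assumed here, is that the dumped function passed through JSON, which turns tuples into lists, while `types.FunctionType` only accepts a tuple (or `None`) for its defaults argument. A minimal sketch of that situation, using a made-up function `add`:

```python
import json
import types


def add(a, b=1, c=2):
    return a + b + c


# Assumption: the defaults were serialized through JSON, so the tuple
# (1, 2) comes back as the list [1, 2].
defaults_after_json = json.loads(json.dumps(add.__defaults__))

try:
    types.FunctionType(add.__code__, globals(), 'add', defaults_after_json)
    list_accepted = True
except TypeError:
    list_accepted = False  # CPython rejects a list for the defaults argument

# Converting back to a tuple, as the hunk above does, makes the rebuild work.
rebuilt = types.FunctionType(add.__code__, globals(), 'add',
                             tuple(defaults_after_json))
print(list_accepted, rebuilt(10))
```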
+3
-3
@@ -3,14 +3,14 @@ from setuptools import find_packages


 setup(name='Keras',
-      version='2.0.0',
+      version='2.0.1',
       description='Deep Learning for Python',
       author='Francois Chollet',
       author_email='francois.chollet@gmail.com',
       url='https://github.com/fchollet/keras',
-      download_url='https://github.com/fchollet/keras/tarball/2.0.0',
+      download_url='https://github.com/fchollet/keras/tarball/2.0.1',
       license='MIT',
-      install_requires=['tensorflow', 'pyyaml', 'six'],
+      install_requires=['theano', 'pyyaml', 'six'],
       extras_require={
           'h5py': ['h5py'],
           'visualize': ['pydot-ng'],
@@ -113,6 +113,16 @@ def test_lstm_legacy_interface():
     new_layer = keras.layers.LSTM(2, input_shape=[3, 5], name='d', implementation=1)
     assert json.dumps(old_layer.get_config()) == json.dumps(new_layer.get_config())

+    old_layer = keras.layers.LSTM(input_dim=5, input_length=3,
+                                  output_dim=2, name='d', consume_less='mem')
+    new_layer = keras.layers.LSTM(2, input_shape=[3, 5], name='d', implementation=1)
+    assert json.dumps(old_layer.get_config()) == json.dumps(new_layer.get_config())
+
+    old_layer = keras.layers.LSTM(input_dim=5,
+                                  output_dim=2, name='d', consume_less='mem')
+    new_layer = keras.layers.LSTM(2, input_shape=[None, 5], name='d', implementation=1)
+    assert json.dumps(old_layer.get_config()) == json.dumps(new_layer.get_config())
+
     old_layer = keras.layers.LSTM(input_shape=[3, 5], output_dim=2, name='d', consume_less='gpu')
     new_layer = keras.layers.LSTM(2, input_shape=[3, 5], name='d', implementation=2)
     assert json.dumps(old_layer.get_config()) == json.dumps(new_layer.get_config())