Comparar commits

..

1332 Commits

Autor SHA1 Mensagem Data
Francois Chollet b2e3780e8c Prepare PyPI release 2016-09-19 13:18:22 -07:00
Francois Chollet 0b04ac3117 Fix TF RNN dynamic behavior 2016-09-19 11:01:33 -07:00
Francois Chollet 90d0eb9b88 Regularizers style fixes 2016-09-18 15:27:45 -07:00
fchollet f2aa89f443 Freeze list of trainable weights at compile time 2016-09-18 10:41:37 -07:00
kuza55 2a319c7255 Add exception when trying to reuse regularizers (#3803)
My reading of regularizers is that they cannot be reused, but it doesn't actually fail in any way and seems like it results in only regularizing the last layer. Having an exception prevent this would probably improve the ergonomics.
2016-09-17 20:22:26 -07:00
Francois Chollet 4fb3f1b3f3 Make TF dynamic RNN work without states. 2016-09-16 17:15:18 -07:00
Furiously Curious 072d33599b Added Gitter channel badge (#3744)
* Added Gitter channel badge

Assigned @fchollet as channel admin on Gitter

* Link fix
2016-09-15 18:10:06 -07:00
Seonghyeon Nam 56f3c85b87 Fix ValueError(ndim of gamma and beta) of batch normalization when using Theano (#3740)
* Fix ndim mismatch error when using theano

* Change keras backend call
2016-09-15 18:09:02 -07:00
Francois Chollet 8b42fff90e Fix flaky test 2016-09-14 15:15:00 -07:00
Francois Chollet 1dc5d43d32 Remove deprecated resnet50 example 2016-09-14 15:03:26 -07:00
Francois Chollet ee2d08ff79 Fix activity regularization for wrapper layers 2016-09-14 15:02:05 -07:00
Francois Chollet 305b3bed74 Finalize streamlining of conv1d. 2016-09-14 14:39:47 -07:00
Francois Chollet 9f6acd960c Simplify Conv1D ops. 2016-09-14 14:18:15 -07:00
Flynn, Michael D 672890b1c8 Add AtrousConvolution1D to convolutional layers (#3763)
* Add `AtrousConvolution1D` to convolutional layers

* Add test for `AtrousConvolution1D` layer

* Add AtrousConvolution1D to docs
2016-09-14 11:40:04 -07:00
Francois Chollet c58bcc2c02 Fix deconv test 2016-09-13 16:56:39 -07:00
Francois Chollet 82318263a1 Set default backend to TF 2016-09-13 16:24:43 -07:00
Francois Chollet d90e1db50b Revert default backend to TH 2016-09-13 15:37:38 -07:00
Francois Chollet 8af0264a77 Set TensorFlow as default backend for new installs 2016-09-13 15:19:13 -07:00
Junwei Pan 8193287e08 Update docoment for callbacks.py: add the specification of auto mode (#3758) 2016-09-13 08:52:12 -07:00
fchollet 13bd33e73f Merge branch 'master' of ssh://github.com/fchollet/keras 2016-09-10 22:54:51 -07:00
fchollet b2e8d5ab7c Add support for LR decay in all optimizers 2016-09-10 12:34:05 -07:00
Ardalan a375cb322f fastText: adding n-gram embeddings for higher test_set accuracy (#3733)
* adding bi-gram embeddings for better test accuracy

* - add arbitrary n-gram range
- fix typos

* - fixing white spaces

* - add comment
2016-09-10 10:35:15 -07:00
dolaameng d9c4d8a76a update examples/neural_doodle.py based on issues #3731 (#3741) 2016-09-10 10:24:39 -07:00
kuza55 79edae58d5 Initial Sparse Matrix Support (#3695)
* Minimal SparseTensor support for TensorFlow

* Basic Theano support for Sparse dot product

* Sparse Input for Both + Sparse Concat for TF

* Fixed issue with _keras_shape for sparse Inputs

* pep8

* Cleanup + Theano concat (untested)

* Bug fix & pep8

* Fix Theano concat

* Bugfix & simplification

* Next step: Unit tests

* Basic unit test for sparse dot; TF works, TH fails

* Fix KTH is_sparse

* pep8

* more tests, sparse KTH.eval, pep8

* sparse model test

* address code review comments

* make sparse boolean in K.placeholder

* skip sparse tests when TH.sparse import fails

* pep8

* pep8

* fixed flakey test, auto-dense in KTH.eval

* fixed some more len/shape issues for fit_generator

* fixed some more len/shape issues for prediction

* Added better exceptions when theano.sparse fails to import

* betterer

* pep8
2016-09-09 16:26:37 -07:00
iampat 6675776640 Fix a small typo in help files (#3728)
impoprt --> import
2016-09-08 17:35:38 -07:00
dolaameng 40685c3b2a add examples/neural_doodle.py (#3724) 2016-09-08 10:15:57 -07:00
Francois Chollet 25874ceab2 Update TD wrapper 2016-09-07 19:32:26 -07:00
Tim Shi 4b2093ef67 allow output size different from state size (#3709) 2016-09-07 15:52:06 -07:00
kuza55 9bc2e60fd5 TensorBoard callback improvements (#3656)
* TensorBoard callback improvements

* Removed name improvement in TensorBoard callback

* Fix variables broken by removing name fixups

* Update callbacks.py
2016-09-07 12:59:08 -07:00
antonmbk 685ce7573d Added stacked what where autoencoder. (#3616)
* Added stacked what where autoencoder.

SWWAE uses residual blocks. Trains fast. Creates very good reconstructions.

* Added newline at end for PEP8

* Went through PEP8 errors and corrected all (except for the imports which following the numpy seed, but this should be ok).  Also, for the pool_size of 2, we halved the number of features maps and the number of epochs, and it still trains a net that can very nicely reconstruct the input.

* Added spaces arround - and + when they are used as binary operators (more PEP8).

* In decoder, the index of the features and pool size and wheres are all equal to nlayers-1-i, so set ind variable to this value and passed it to them.

* With ind variable in decoder, don't need two lines for the upsampling layer.

* Added title to plot, got rid of ticks on plot.

* PEP8 for * binary operator. Corrected some grammar issues in the docstring.
2016-09-07 11:05:41 -07:00
dolaameng f5ad1c5753 fix bug in neural_style_transfer example for image_dim_ordering=tf (#3715)
* fix bug in neural_style_transfer example for image_dim_ordering=tf

* fix PEP8 mixed space and tab
2016-09-07 10:57:23 -07:00
Francois Chollet cc92025fdc Make examples agnostic to image_dim_ordering 2016-09-06 15:53:56 -07:00
Fariz Rahman f05cd95fad Dot/cos merge : bug fix (#3708) 2016-09-06 15:04:26 -07:00
kuza55 4325843ef0 Add Matthews correlation coefficient to metrics (#3689)
* Add Matthews correlation coefficient to metrics

I needed this for a Kaggle competition and it seemed useful in general so I thought I'd contribute it back.

* Enabled test for matthews metric

* Remove unnecessary cast garbage

* Addresses code review comments

* Renamed to matthews_corrcoef to be consistent with sklearn

* Update test_metrics.py

* pep8

* rename to mathews_correlation

* Update metrics.py

* Fixed typo
2016-09-06 13:42:56 -07:00
Arel Cordero 607635d2ce Optionally load weights by name (#3488)
* Adding feature to load_weights by name

Squashed commit of the following:

commit fd47e763855c34ed78d26ee441d83e0e63f08119
Author: Arel Cordero <arel@ditto.us.com>
Date:   Thu Aug 18 16:02:14 2016 +0000

    typo

commit d0b06c03080131c55ab4777064a196ff339ad7df
Author: Arel Cordero <arel@ditto.us.com>
Date:   Thu Aug 18 15:52:35 2016 +0000

    update documentation for "load_weights"

commit 844cfc2e8c9c6f267799a22ed54ac4d75807c5ab
Author: Arel Cordero <arel@ditto.us.com>
Date:   Thu Aug 18 02:42:10 2016 +0000

    batch updating weights

commit f361a70da4b40b961f1af9c8f1c3cd26273d0cad
Author: Arel Cordero <arel@ditto.us.com>
Date:   Thu Aug 18 02:29:17 2016 +0000

    removing pudb line

commit 738de4c371503626b4c9dbae6428fb279b368a76
Author: Arel Cordero <arel@ditto.us.com>
Date:   Wed Aug 17 19:56:51 2016 +0000

    adding unit tests for loading weights by name

commit cb0971b3cfe62452ab445e4034098cab2be3031b
Author: Arel Cordero <arel@ditto.us.com>
Date:   Tue Aug 16 23:45:32 2016 +0000

    cleaning up code based on comments

commit ef08fd2c9f5d3c65359cbdf5b090e08733a518de
Author: Arel Cordero <arel@ditto.us.com>
Date:   Tue Aug 16 04:50:46 2016 +0000

    debugging

commit 0d74f0e997960886b1044c26001de6cd6ad90bb9
Author: Arel Cordero <arel@ditto.us.com>
Date:   Tue Aug 16 04:15:43 2016 +0000

    optionally load model by name

* changed random file names to use tempfile module

* clean up documentation strings

* clarifying documentation
2016-09-06 11:42:31 -07:00
Abishek Bhat b8fddc862e Add missing Softmax activation memnn. (#3706)
The implementation of bAbi [End to End Memory
Network](https://arxiv.org/pdf/1503.08895v5.pdf) in the example
seems to be missing the Softmax Layer.

Quoting the paper.

> The query q is also embedded (again, in the simplest case via another embedding matrix
B with the same dimensions as A) to obtain an internal state u. In the embedding space, we compute
the match between u and each memory m<sub>i</sub> by taking the inner product followed by a softmax.

Also, the question encoder
[here](https://github.com/fchollet/keras/blob/0df0177437ce672d654db6d7edfdc653aaf67533/examples/babi_memnn.py#L186) seems to sum over the probabilities and the question vector as suggestted in the original paper.

> Output memory representation: Each x<sub>i</sub> has a corresponding output vector c<sub>i</sub> (given in the
simplest case by another embedding matrix C). The response vector from the memory o is then a
sum over the transformed inputs c<sub>i</sub> , weighted by the probability vector from the input.

I tried running the model(with and without the intermediate softmax)
against _Single Supporting Fact_ en-10k dataset and found that the
network the intermediate softmax trained a lot faster(95% at 100epoch) than the former(67% at 100epoch).

Network without the Softmax activation in the Input Memory Representation at epoch=100
======================================================================================
```
Iteration 10
Train on 10000 samples, validate on 1000 samples
Epoch 1/10
10000/10000 [==============================] - 8s - loss: 0.0549 - acc: 0.9819 - val_loss: 1.8088 - val_acc: 0.6470
Epoch 2/10
10000/10000 [==============================] - 6s - loss: 0.0612 - acc: 0.9802 - val_loss: 1.7839 - val_acc: 0.6650
Epoch 3/10
10000/10000 [==============================] - 6s - loss: 0.0542 - acc: 0.9812 - val_loss: 1.7595 - val_acc: 0.6750
Epoch 4/10
10000/10000 [==============================] - 6s - loss: 0.0538 - acc: 0.9826 - val_loss: 1.8198 - val_acc: 0.6670
Epoch 5/10
10000/10000 [==============================] - 6s - loss: 0.0590 - acc: 0.9790 - val_loss: 1.7891 - val_acc: 0.6650
Epoch 6/10
10000/10000 [==============================] - 6s - loss: 0.0548 - acc: 0.9803 - val_loss: 1.7682 - val_acc: 0.6790
Epoch 7/10
10000/10000 [==============================] - 6s - loss: 0.0455 - acc: 0.9841 - val_loss: 1.8394 - val_acc: 0.6730
Epoch 8/10
10000/10000 [==============================] - 6s - loss: 0.0559 - acc: 0.9797 - val_loss: 1.7764 - val_acc: 0.6650
Epoch 9/10
10000/10000 [==============================] - 6s - loss: 0.0488 - acc: 0.9835 - val_loss: 1.7711 - val_acc: 0.6620
Epoch 10/10
10000/10000 [==============================] - 6s - loss: 0.0502 - acc: 0.9834 - val_loss: 1.8225 - val_acc: 0.6700
```

Network with Softmax Activation in the Input Memory Representation at epoch=100
===============================================================================

```
Iteration 10
Train on 10000 samples, validate on 1000 samples
Epoch 1/10
10000/10000 [==============================] - 6s - loss: 0.0084 - acc: 0.9972 - val_loss: 0.2426 - val_acc: 0.9520
Epoch 2/10
10000/10000 [==============================] - 7s - loss: 0.0152 - acc: 0.9946 - val_loss: 0.2063 - val_acc: 0.9560
Epoch 3/10
10000/10000 [==============================] - 6s - loss: 0.0104 - acc: 0.9969 - val_loss: 0.2010 - val_acc: 0.9540
Epoch 4/10
10000/10000 [==============================] - 6s - loss: 0.0163 - acc: 0.9959 - val_loss: 0.2023 - val_acc: 0.9580
Epoch 5/10
10000/10000 [==============================] - 6s - loss: 0.0136 - acc: 0.9962 - val_loss: 0.2007 - val_acc: 0.9560
Epoch 6/10
10000/10000 [==============================] - 6s - loss: 0.0152 - acc: 0.9953 - val_loss: 0.1989 - val_acc: 0.9570
Epoch 7/10
10000/10000 [==============================] - 7s - loss: 0.0085 - acc: 0.9969 - val_loss: 0.2113 - val_acc: 0.9490
Epoch 8/10
10000/10000 [==============================] - 7s - loss: 0.0116 - acc: 0.9972 - val_loss: 0.2346 - val_acc: 0.9500
Epoch 9/10
10000/10000 [==============================] - 7s - loss: 0.0106 - acc: 0.9970 - val_loss: 0.2052 - val_acc: 0.9550
Epoch 10/10
10000/10000 [==============================] - 7s - loss: 0.0132 - acc: 0.9963 - val_loss: 0.2114 - val_acc: 0.9500
```
2016-09-06 11:33:11 -07:00
dolaameng 0df0177437 make image parameters more consistent (#3672)
* change of variable names in examples/neural_transfer_style for consistency

* add docstring to keras.preprocessing.image.load_img()
2016-09-06 10:29:26 -07:00
Pedro S f90cbcd1e3 Added regularization option to BatchNormalization layer (#3671)
* Added regularization option to BatchNormalization layer

* Update normalization.py

* Added regularization to BN test

* Fixed identation

* Removed trailing whitespace and refixed identation
2016-09-02 08:15:51 -07:00
Fariz Rahman 870d7f7f93 Lambda layer : Allow multiple inputs (#3668) 2016-09-01 13:48:21 -07:00
Roberto de Moura Estevão Filho 799bec66a2 CTC import compatibility with tensorflow 0.10 (#3650)
* CTC import compatibility with tensorflow 0.10
Try except clause to import ctc_loss in new path on tensorflow 0.10.

* Fixed ctc_decode and added tests for tensorflow.
ctc_decode when using beam search decoder has been fixed to conform with
tensorflow API. Function documentation has been updated to reflect the
changes. Two tests, for greedy and beam search decoding, have also been
added to test_backends.py.

* Fix pep8 styling.

* Fixed styling on long lines on ctc_decode tests.
2016-09-01 11:33:11 -07:00
Matt 2321fbbc1d Fix Batch Norm compatibility with 3D inputs (#3666)
* Fix Batch Norm compatibility with 3D inputs

the theano backend now uses dnn_batch_normalization which only supports
up to 4-dimensional input. This breaks any 5-d layers such as 3D
convolutions.

* using intermediate variable
2016-09-01 10:22:23 -07:00
gw0 48ae7217e4 Fix TensorFlow RNN backwards support. (#3662) 2016-09-01 05:56:42 -07:00
Francois Chollet 6f54b233f1 Fix Theano input shape inference in InputLayer 2016-08-31 21:49:43 -07:00
ηzw 1bf1055395 Update docs for SpatialDropouts (#3652) 2016-08-31 13:55:21 -07:00
gw0 6417d90d5c Fix #2814 lambda function serialization and deserialization (#3639)
* Remove old-style function attributes.

* Fix lambda function serialization and deserialization.
2016-08-31 10:05:05 -07:00
fchollet c939cebf0d Theano rnn fix when input_dim = 1 2016-08-30 18:35:22 -07:00
kuza55 7ae36d132a Write TensorBoard Histograms with Tensor names (#3635)
Resolves the acute symptoms in https://github.com/fchollet/keras/issues/3357

Doesn't address the question of having a better __repr__ since that is a much wider change.
2016-08-30 16:55:38 -07:00
Francois Chollet c478409dad Fix weight constraint sharing issue 2016-08-30 12:54:42 -07:00
Frédéric Bastien 109441a708 Small speed up by preventing transfer to CPU or copy on the CPU just to get the shape. (#3631) 2016-08-30 12:41:47 -07:00
Fariz Rahman b267e8293d Update sequential-model-guide.md (#3630) 2016-08-30 09:43:41 -07:00
Francois Chollet 3a4c683d5c Update download path for babi dataset 2016-08-29 13:03:36 -07:00
Sean Löfgren d5649da5f8 update pytest config for pep8 tests (#3617) (#3619) 2016-08-29 11:20:39 -07:00
Fariz Rahman 9c28d21b4f Fix lambda layer docstring (#3604) 2016-08-29 10:44:19 -07:00
kuza55 9e58b8237b Enable colocate_gradients_with_ops=True (#3620)
By default TensorFlow allocates all gradient matricies on gpu:0, which makes it pretty much impossible to do parallelize a large model.

colocate_gradients_with_ops puts these matricies next to the operations, allowing you to split your model across multiple GPUs. I ran into this issue myself and this fixed it for me.

I think it's also meant to set gradient computations to be done on the device where the operations are stored, but my belief about that comes from https://github.com/tensorflow/tensorflow/issues/2441

I'm not sure why this isn't the default in TF, so I'm not sure if this should be behind a flag or something, but having to make my own patches to keras to do multi-GPU training seems like the wrong answer.
2016-08-29 10:44:01 -07:00
Francois Chollet b184c76205 Update docs 2016-08-28 14:29:40 -07:00
fchollet 065fb2a74c Add global pooling layers 2016-08-28 14:22:15 -07:00
Francois Chollet a0a0d42630 Fix example in doc 2016-08-28 13:20:51 -07:00
Francois Chollet 756153899a Merge branch 'master' of https://github.com/fchollet/keras 2016-08-28 13:10:47 -07:00
Francois Chollet e02554412f Fix example in doc 2016-08-28 13:09:33 -07:00
ηzw ca37e806b9 Fix docs (#3609)
* Fix typo

* Fix typo

* Fix docstring

* Remove the unnecessary augument in docstring
2016-08-28 11:22:16 -07:00
Francois Chollet fe0347dbf0 Update docs 2016-08-28 02:33:50 -07:00
Francois Chollet 4984c5fc7c Update documentation 2016-08-28 02:03:14 -07:00
Francois Chollet f605769af9 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-28 01:11:08 -07:00
Francois Chollet 534f6b7975 Remove flaky test 2016-08-28 01:10:53 -07:00
ηzw ee8fd78383 Fix docstring in Locally-connected Layers (#3607) 2016-08-28 01:07:58 -07:00
Francois Chollet fbc4f37037 Example touch-up 2016-08-27 20:28:03 -07:00
Francois Chollet f23f2ff2c9 Add keras.applications, refactor 2 convnet scripts 2016-08-27 20:27:49 -07:00
Francois Chollet d0659327bd Prepare 1.0.8 release. 2016-08-27 17:04:58 -07:00
Nithish deva Divakar 88b301f182 Update io_utils.py (#3577)
* Update io_utils.py

Fix for wrong input dimension when using HDF5matrix for loading data

* Update io_utils.py
2016-08-27 09:45:31 -07:00
Yanush Viktor d5e16807d2 Update image.md (#3600)
Information in docs about optional parameter `shuffle` is not correct. In the code
https://github.com/fchollet/keras/blob/master/keras/preprocessing/image.py#L263
it's `True` by default, but `False` in the docs.
2016-08-27 09:45:10 -07:00
Fariz Rahman 79a2bcd05f TF dynamic RNN : Allow sequences of ndim > 3 (#3603)
Docstring says "at least 3D", but current code is hard coded for 3D input.
2016-08-27 09:44:55 -07:00
dibule e582f9dcac Bug fix, batch_size set instead of default one (#3590) 2016-08-26 14:30:57 -07:00
Carl Thomé 4cefd6136b Add pydot-ng dependency (#3593) 2016-08-26 14:30:25 -07:00
Francois Chollet 1cf04b7a10 Update documentation 2016-08-26 14:22:56 -07:00
Francois Chollet ad3db301f2 Update RNN docstring 2016-08-25 12:34:35 -07:00
Francois Chollet e15eb40317 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-25 11:46:41 -07:00
Felix Lau 8459d0403c Fix Cropping3D InputSpec to be dim=5 (#3570) 2016-08-25 10:29:52 -07:00
François Chollet 08014eea36 Add support for dynamic RNNs in TensorFlow. (#3474)
* Add support for dynamic RNNs in TensorFlow.

* Fix return states

* Add support for go_backwards in dynamic TF RNNs

* Currently broken: TF RNN dropout, go_backwards

* Finalize dynamic RNNs in TF

* Remove unnecessary comment

* Comment out added test

* Comment out functional guide test
2016-08-24 20:47:51 -07:00
Francois Chollet c0b27108d0 Whitespace fix 2016-08-24 16:15:38 -07:00
Francois Chollet 40612facf3 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-24 15:16:39 -07:00
Francois Chollet e418fc6937 Add get_variable_shape backend method 2016-08-24 15:16:22 -07:00
cyrusmaher 045d442fcd Update scikit_learn.py (#3567)
Sklearn (e.g. the Bagging estimator) expects a value error if a parameter isn't recognized.
2016-08-24 14:19:25 -07:00
Dmitry Lukovkin 2370f9f1db Docs update for Deconvolution2D layer (#3549)
* Fix exception message for Deconvolution2D

* Docs update for Deconvolution2D layer (#3540)

* Corrections to Deconvolution2D docs

* References formatted as Markdown links
* Blank lines added
2016-08-24 13:16:02 -07:00
Jurim Lee 519d1e7420 bug fix : cnn tensorflow backend error (#3558) 2016-08-24 08:57:31 -07:00
Francois Chollet 82afc713d0 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-24 04:15:34 -07:00
Francois Chollet a1e9a8addd data_utils update 2016-08-24 04:15:20 -07:00
Fariz Rahman f59736a06c Bug fix : RNNs (#3550) 2016-08-23 17:03:17 -07:00
Francois Chollet d187059596 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-23 14:30:21 -07:00
Francois Chollet 7d2f0b1ba8 Fix trainable_weight lists 2016-08-23 14:29:49 -07:00
Francois Chollet 6a6d939dea Style fix 2016-08-23 14:29:27 -07:00
Keunwoo Choi 090b8b7f99 add cropping1d/2d/3d layers (#3509)
* add cropping1d/2d/3d layers

* fix PEP8 issue, fix incorrect doc strings

* add example code on Cropping2D

* fix init/get_config of crop1d/3d, add test codes for cropping1d/2d/3d

* fix test code - PEP8

It doesnt pass test (only in cropping2d and basic_test), but my laptop setting is not correct (it doesnt pass some other existing layes as well), so committing to test it in a correct way.

* change to follow PEP8 again

* update test_convolutonal.py for PEP8, test code to us K.image_dim_ordering()

* PEP8 for test_convolutional.py - indentation

* fix typo. add assert to check cropping lengths
2016-08-22 19:20:47 -07:00
Dominic Breuker e4dda27de1 allow KerasClassifier.score to accept sparse labels (#3534) 2016-08-22 15:19:26 -07:00
EdwardRaff f2786d9d80 Update theano_backend.py (#3532)
fix batch norm causing NaN for many datasets
2016-08-20 12:50:23 -07:00
Katherine Crowson c6daa24e3c Use Theano's CuDNN implementation of batch normalization (#3529) 2016-08-19 20:03:55 -07:00
François Chollet 2f4eed1f0f Remove lock in fit_generator (#3528) 2016-08-19 16:09:20 -07:00
Francois Chollet acc5c45feb Remove lock in fit_generator 2016-08-19 11:21:47 -07:00
Rodrigo Prado 00c3335071 added language sh to install instructions (#3520) 2016-08-19 08:36:51 -07:00
Ardalan 65e4c8e76e Update lstm_text_generation.py (#3516)
updating comment
2016-08-18 14:03:26 -07:00
Francois Chollet d4f5dff8ee Style fixes 2016-08-18 13:51:45 -07:00
dolaameng 8d9cb782fb add example mnist_net2net.py (#3503)
* add example mnist_net2net.py

* change of mnist_net2net.py based on 1st comments

* typo fixed in examples/mnist_net2net.py
2016-08-18 13:07:16 -07:00
fchollet 02ff1d4462 Fix style in example 2016-08-17 20:00:49 -07:00
Francois Chollet 007d2c2e25 Fix theano tests 2016-08-17 17:39:24 -07:00
Francois Chollet 3bf7637986 Style fixes 2016-08-17 16:32:52 -07:00
Francois Chollet 33ff9dbce2 Speed up bidirectional wrapper tests 2016-08-17 16:30:31 -07:00
Fariz Rahman f25e894558 Bidirectional Wrapper (#3495)
* Add Bidirectional Wrapper

* Fix example

* Update wrappers.py

* Add reverse op

* Update tensorflow_backend.py

* Update wrappers.py

* Update test_wrappers.py

* bug fix

* Update test_wrappers.py

* Update test_wrappers.py

* bug fix

* Add test for reverse op

* Enable reverse along multiple axes

* Update theano_backend.py

* Update theano_backend.py

* Update test_wrappers.py

* Speed up tests

* Validate merge_mode arg, Add None mode

* Update test_wrappers.py

* Update test_wrappers.py

* Add properties; reverse -> backward

* Bug fix

* Resolve naming conflict

* Whitespace fix

* Update imdb_bidirectional_lstm.py

* Fix imports
2016-08-17 16:27:06 -07:00
Junwei Pan 52c1a7456f Add a reference paper for Adagrad (#3473)
Add a reference paper for Adagrad
2016-08-17 13:31:24 -07:00
Junwei Pan b2392413fa Fix a issue when only specify one dot_axes for in the Merge layer in the dot mode (#3470)
* Upload examples/imdb_fasttext.py which implement the fasttext model

* Remove Dropout and unnecessary imports

* Remove Dropout and unnecessary imports

* Remove Dropout and unnecessary imports

* Fix a issue when only specify one dot_axes for in the Merge layer

* Fix a issue when only specify one dot_axes for in the Merge layer
2016-08-17 13:30:45 -07:00
Frederico Tommasi Caroli 1941eaabe0 Using locks instead of ignoring ValuErrors in generators (#3501) 2016-08-17 11:20:51 -07:00
ηzw 3d5bf9753f Fix typo (#3505) 2016-08-17 11:18:52 -07:00
fchollet a4d191d4f9 Only write config file if it didn't exist 2016-08-16 20:53:57 -07:00
Reid Sanders dad54ec211 Minor imbd dataset documentation update, added docstring to reuters (#3494)
* Updated dataset documentation to reflect removal of test_split argument
from imbd dataset. Added docstring to reuters dataset load_data.

* Updated imbd and reuters examples in dataset docs to reflect all
available arguments with current default values.
2016-08-16 14:14:32 -07:00
Francois Chollet b525f5f4d7 Make h5py optional again 2016-08-16 13:31:15 -07:00
Mike Henry e8190a8d8d Added support for CTC in both Theano and Tensorflow along with image OCR example. (#3436)
* Added CTC to Theano and Tensorflow backend along with image OCR example

* Fixed python style issues, made data files remote, and made code more idiomatic to Keras

* Fixed a couple more style issues brought up in the original PR

* Reverted wrappers.py

* Fixed potential training-on-validation issue and removed unused imports

* Fixed PEP8 issue

* Remaining PEP8 issues fixed
2016-08-16 13:25:26 -07:00
Francois Chollet 4e155139ca Style fixes. 2016-08-16 11:17:44 -07:00
stas-sl 458edeed9a implement spatial dropout (#3463) 2016-08-16 11:16:18 -07:00
Katherine Crowson 04d785f4bf Allow multiprocessing on OS X (#3389) 2016-08-14 15:43:43 -07:00
fchollet 28d9c0c511 Style fixes in example. 2016-08-14 15:41:04 -07:00
Antonie Lin 91310971b9 mnist hierarchical rnn example (#3460) 2016-08-14 15:40:13 -07:00
Francois Chollet 5d2acf4897 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-13 11:54:04 -07:00
Francois Chollet dc98019d49 Add get_layer to sequential models 2016-08-13 11:53:48 -07:00
ηzw b008bb35cc Style fixes (#3462) 2016-08-13 11:22:01 -07:00
Junwei Pan 46d5b197e0 Implement a fasttext example (#3446)
* Upload examples/imdb_fasttext.py which implement the fasttext model

* Remove Dropout and unnecessary imports

* Remove Dropout and unnecessary imports

* Remove Dropout and unnecessary imports
2016-08-12 14:36:35 -07:00
Francois Chollet 2c510530b1 Update FAQ 2016-08-10 16:44:35 -07:00
Francois Chollet ec6eda77ad Style fixes in tests 2016-08-10 16:44:29 -07:00
Yann Henon 4805e5856b Upsampling layer fix to work when input shape is None (#3429) 2016-08-09 14:47:23 -07:00
Fariz Rahman 55447cbb3d Bug fix : squeeze (#3433) 2016-08-09 13:59:47 -07:00
Francois Chollet 69d5139b8c [breaking change] Weight ordering now topological. 2016-08-09 10:20:00 -07:00
Francois Chollet 89f1e05147 Functional Model weight loading in layer order 2016-08-08 18:53:36 -07:00
Francois Chollet bc779df8b7 Avoid creating new ops in TF to load weights 2016-08-08 16:13:40 -07:00
Francois Chollet e3c260e7d3 Fix test flakes 2016-08-08 13:00:53 -07:00
Francois Chollet 0af7e004c7 Fix test flakiness in TF 2016-08-08 12:28:35 -07:00
Francois Chollet 447445388e Merge branch 'master' of https://github.com/fchollet/keras 2016-08-08 11:23:38 -07:00
Francois Chollet b2c66816d7 Prepare new PyPI release. 2016-08-08 11:23:25 -07:00
Martin Thoma b6f81c6cc3 Fix documentation of the 1D max pooling layer (#3419) 2016-08-08 11:04:54 -07:00
Yad Faeq 98b289630a Word embdedding example updated (#3417)
* Added Convolution1D instead of Conv1D, which is depreceated

* updated rest of the example using Conv1D

* Python3 fails to decode utf-8 data, thus using encoding='latin-1'

* added condition for Encoding line 65-67

* Conv1D reverted back to the way it was
2016-08-08 10:59:31 -07:00
fchollet d68c0bd795 Update optimizers docs 2016-08-07 19:31:25 -07:00
Santiago Castro 5afda71f74 Fix scikit-learn python file name in docs (#3413) 2016-08-06 19:29:27 -07:00
Ramon de Oliveira 1b08a8d675 format Nadam references (#3404) 2016-08-05 16:41:34 -07:00
Moussa Taifi b508ab64bd Fix documentation for ModelCheckpoint callback params (#3408) 2016-08-05 16:41:15 -07:00
Richard Higgins 84f435e24b all zero arrays no longer get divided by zero in the process of being turned into images (#3401)
Update image.py
2016-08-05 09:10:52 -07:00
Fariz Rahman 984ad34a61 One hot op (#3353)
* One hot op

* tf too

* Update theano_backend.py

* Use built-in theano op

* Update theano_backend.py

* Add test

* Update test_backends.py

* Update test_backends.py

* Generalize for nD tensors

* Fix docstring on TF backend

* Update theano_backend.py

* Update theano_backend.py
2016-08-04 14:16:36 -07:00
Ondřej Filip ad3231c29a Docs update for Merge layer (#3392)
In Merge layer dot_axes param affects also 'cos' mode
2016-08-04 08:54:10 -07:00
fchollet c3d20bbc53 Merge branch 'master' of ssh://github.com/fchollet/keras 2016-08-03 22:11:12 -07:00
fchollet f9c03f183f Make trainable_weights dynamic 2016-08-03 22:02:45 -07:00
Tsukasa ŌMOTO 046a3c8a28 Fix YAML serialization for Advanced Activations (#3391)
Fix https://github.com/fchollet/keras/issues/2871#issuecomment-237365465

The problem occurs because PyYAML can't recognize numpy's data types.
2016-08-03 20:41:52 -07:00
Francois Chollet 05883934f1 Unify BN behavior across backends (fix) 2016-08-03 13:22:21 -07:00
Francois Chollet 97d2a73dd3 Unify BN behavior in TF and Theano 2016-08-03 12:39:24 -07:00
Francois Chollet 5367a44acb Further style fixes in example 2016-08-03 11:54:15 -07:00
Francois Chollet 1deaf71388 revert unnecessary changes in example script 2016-08-03 11:11:35 -07:00
Francois Chollet 99f564e972 Style fixes and small bug fixes 2016-08-03 11:08:48 -07:00
Zhengtao Wang c725f8d354 add resnet50 example (#3266)
* add resnet50 example

* fix PEP8 problems

* fix PEP8 problem....again...

This script get 10 points in PEP8 tests on my computer but.....

* add resnet 50

* fix problem caused by interrupted git push

* fix PEP8 problem..again!

* update weights links and remove load_weights

* fix pep8!

* remove skimage dependency, rename the file

* fix pep8...

* update

* support tf dim_ordering

* fix PEP8 problem
2016-08-03 10:51:18 -07:00
Shuhei Iitsuka 257ace722c Correct the reconstruction loss (#3383) 2016-08-02 21:45:31 -07:00
Wei Ouyang 0cd9d46828 remove keyword for cifar10.load_data (#3379) 2016-08-02 08:20:00 -07:00
Chris Caruso cef9e28a6c Added to rnn statefulness doc concerning func api (#3375)
Added correct information for how to enable rnn statefulness on functional models with 1 or more Input layers.
2016-08-01 20:21:42 -07:00
Francois Chollet 6c42da2abf Fix legacy model loading 2016-08-01 16:26:01 -07:00
Francois Chollet a9fc2bed49 Allow to call load_weights on model save file 2016-07-31 22:58:44 -07:00
Francois Chollet 1855c49d1f Merge branch 'master' of https://github.com/fchollet/keras 2016-07-31 17:45:43 -07:00
Francois Chollet cce65ce34d Update docs. 2016-07-31 17:45:32 -07:00
Jan Wilken Dörrie 70866c0154 Compare versions through pkg_resources.parse_version (#3355) 2016-07-31 10:45:03 -07:00
dolaameng d06e3753b0 update examples/neural_style_transfer.py (#3347) 2016-07-29 12:00:37 -07:00
Fariz Rahman cb4de1f859 Embedding layer - cast float to int implicitly (#3342)
* Embedding layer - cast float to int implicitly

* Cast only if not int
2016-07-29 11:11:46 -07:00
Jérémie b6d23b2e2d remove usage of tf.assign() in part of tensorflow backend (#3316) (#3320)
* remove usage of tf.assign() in the tensorflow backend (#3316)

Usage of the tf.assign() function in the set_value() and batch_set_values() functions creates new nodes on the Tensorflow graph which can eventually overflow the memory.
Therefore, the function has been rewritten using placeholders and feed_dict to avoid allocating additional memory.

* Correction to the set_value() function

Change to the set_value() function that had a bug when the variable "value" was a float.
The *1. dummy multiplication was added to avoid having to deal with tf.float32_ref dtypes.

* update set_value() of the tensorflow backend

Removal of the *1. dummy multiplication, replacement with a split() to avoid creating a new operation in the graph.

* fix to have session.run() called once in batch_set_value()

Rewriting of the batch_set_value() to avoid multiple calls to session.run() to improve speed.
2016-07-28 09:29:57 -07:00
Pradeep Dasigi 6a8815de0c Masked and non-masked merge bug fix (#3218)
* Masked and non-masked merge bug fix

* Masked merge concat logic with an expanded loop

* Cast mask of ones for unmasked input in merge to uint8
2016-07-27 17:50:49 -07:00
Francois Chollet e0179bad2f Refresh IMDB dataset 2016-07-27 17:30:04 -07:00
Francois Chollet 8778add0d6 Clean up test files 2016-07-27 12:09:40 -07:00
Francois Chollet facc823612 Update FAQ 2016-07-27 11:15:04 -07:00
Francois Chollet b91854ea9d Fix flaky test 2016-07-27 10:27:47 -07:00
Francois Chollet 05abe814ac Add keras version in model save files 2016-07-27 10:22:49 -07:00
François Chollet ea561ba6d8 Add model saving functionality (#3314)
* Add model saving functionality

* Update model saving functionality

* Fix py3 bytes/str issue

* Fix tests
2016-07-26 20:45:28 -07:00
Mostafa Abdulhamid df84c69676 Docker image for test and experiment Keras (#3035)
* Docker image for test and experiment Keras

 - Docker image with CUDA support on ubuntu 14.04
 - nvidia-docker script to forward the GPU to the container
 - MakeFile to simplify docker commands for build, run, test, ..etc
 - Add useful tools like jupyter notebook, ipdb, sklearn for experiments

* update nvidia-docker plugin

* use .theanorc in Dockerfile

* Add tensorflow to the docker image

* update Docker image to cuDNN v5

* test fixes

* move docker to sub directory

* README for docker

* Fix typos

* Add visualization to Dockerfile
2016-07-26 18:29:56 -07:00
Sadegh 3726aba2ee use of period fixed, default period set to 1000 (#3312) 2016-07-25 20:41:22 -07:00
Francois Chollet f6bcaffe4a Style fixes 2016-07-25 10:42:37 -07:00
yaringal c689b52dd1 Implemented transposed (de-) convolutions into Keras (#3251)
* theano backend now supports transposed convolutions

* working deconv

* new example file with deconv vae

* merged with #3273, fixed based on comments, pep8 tested

* test fix

* passes theano test

* start fixing deconv test

* fix deconv layer tests

* fix the right test

sorry, I "fixed" the wrong test last time

* clean up

* replace with_None with fixed_batch_size

* with_None --> fixed_batch_size

* comment edit

* fixed comments online
2016-07-25 10:33:03 -07:00
fchollet 09d75a4347 Style fixes 2016-07-24 14:20:36 -07:00
Rui Shu 59bd247603 Fix VAE example (#3220)
A number of changes:
1. Switch from Lambda to merge, otherwise code will not run.
2. Rename z_log_std to z_log_var in order for the objective function to make sense
3. Adjust reparameterization trick to reflect use of z_log_var, not z_log_std
4. Remove epsilon_std, since (standard) VAE uses isotropic gaussian prior.
5. Re-balance the weighting of KL and reconstruction terms
6. Use adam instead of rmsprop
7. Increase hidden unit size to improve model
8. Increase batch size to speed up training
2016-07-24 14:09:34 -07:00
dolaameng f221ef952f make examples/pretrained_word_embeddings.py more memory efficient (#3289)
* make examples/pretrained_word_embeddings.py more memory efficient

* make examples/pretrained_word_embeddings.py more memory efficient

* rename NB_WORDS to nb_words as it is not a global constant
2016-07-23 10:14:28 -07:00
Sebastian N. Fernandez d3c75e1d34 adding print_tensor (#3285) 2016-07-22 12:48:44 -07:00
Felix Lau 3aab55d29f Add Tensorflow support for UpSampling3D and ZeroPadding3D (#3274) 2016-07-22 12:47:59 -07:00
Fariz Rahman f9ef72c38a Logical ops (#3272)
* Logical ops

* Update tensorflow_backend.py

* Add tests

* Update tensorflow_backend.py

* lesser->less

* Update test_backends.py

* less->lesser

* less->lesser

* Update test_backends.py
2016-07-21 20:47:02 -07:00
Francois Chollet 108159ed17 Merge branch 'master' of https://github.com/fchollet/keras 2016-07-21 15:10:10 -07:00
Francois Chollet defa1283c4 Include iterations in optimizer weights 2016-07-21 15:04:30 -07:00
Francois Chollet 2788b60fe6 Revert behavior of regularizers 2016-07-21 15:04:09 -07:00
Olivier Grisel 7e70e1768f Small python3 compat fix in backend doc (#3275) 2016-07-21 10:31:52 -07:00
Kai Li 896ba77061 update residual connection example (#3278) 2016-07-21 10:30:19 -07:00
Francois Chollet c034262b78 Removed test for deprecated graph model 2016-07-20 16:12:49 -07:00
Francois Chollet b7edcf6eea Change behavior of regularizers: use mean, not sum 2016-07-20 15:52:47 -07:00
Francois Chollet 23e1ad2df7 Allow overriding learning phase 2016-07-20 15:48:52 -07:00
Francois Chollet 0a3939883a Style fix 2016-07-19 16:29:27 -07:00
Francois Chollet 3c8f91ee3d Style fix 2016-07-19 16:11:33 -07:00
Francois Chollet efa5b04797 Style fix 2016-07-19 16:09:07 -07:00
Francois Chollet 2da66ed009 Py3 fix 2016-07-19 14:56:50 -07:00
Francois Chollet 2ac6811362 Remove deprecated graph model test 2016-07-19 14:53:50 -07:00
Francois Chollet 74c51f213c Fix flaky test 2016-07-19 14:52:56 -07:00
Francois Chollet 4302d8060d Fix image resizing in preprocessing/image 2016-07-19 14:30:43 -07:00
Francois Chollet 576cf8978b Merge branch 'master' of https://github.com/fchollet/keras 2016-07-19 14:22:48 -07:00
Francois Chollet 3533912016 Native initializations, updates 2016-07-19 14:22:34 -07:00
ekerazha cf9922ff1d Fix broken imdb_cnn example (#3244)
* Fix broken imdb_cnn example

* Update imdb_cnn fix
2016-07-19 12:18:59 -07:00
Zhengtao Wang 4fa65fbb2f add doc for layers/normalization/BatchNormalization (#3248)
* add doc for layers/normalization/BatchNormalization

* fix PEP8 problem

* Update normalization.py
2016-07-19 12:18:39 -07:00
Kai Li f502ee2338 Update sequential-model-guide.md (#3257) 2016-07-19 12:18:25 -07:00
Francois Chollet 7a56925176 Even faster tests 2016-07-19 11:57:57 -07:00
Francois Chollet 0a108b3fb2 Fix tests maybe? 2016-07-19 11:31:22 -07:00
Francois Chollet 381a108e6d Revert Travis config 2016-07-19 11:17:32 -07:00
Francois Chollet 726c9fc8a6 Update tests 2016-07-19 10:49:44 -07:00
Francois Chollet 946ccd3228 Speed up tests (especially with TF) 2016-07-19 10:05:30 -07:00
Francois Chollet 8e1ebbfc11 Get rid of coveralls 2016-07-19 00:40:21 -07:00
Francois Chollet cc0e60c101 TF backend performance improvements 2016-07-19 00:30:05 -07:00
Francois Chollet ff3f00d845 Style fix in example 2016-07-18 23:48:56 -07:00
Francois Chollet 40195c2fa2 TF backend: group update ops, add clear_session 2016-07-18 23:48:56 -07:00
Francois Chollet 7f7300b8cb Minor style fixes 2016-07-18 23:48:56 -07:00
Fariz Rahman 1b158ff4ed Bug fix + test - Sequential.pop() (#3252)
* Bug fix - Sequential.pop()

* Add test for Sequential.pop()

* Update test_sequential_model.py
2016-07-18 18:30:13 -07:00
stas-sl b686b85b52 Use symbolic shape in tensorflow version of batch_flatten (#3253) 2016-07-18 15:37:10 -07:00
Francois Chollet 8fa82ae5cb Merge branch 'master' of https://github.com/fchollet/keras 2016-07-16 17:52:56 -07:00
Francois Chollet 0d5289141e Add pre-trained word embeddings example 2016-07-16 17:51:17 -07:00
Francois Chollet 01d5e7bc47 Fix up a few example 2016-07-16 17:47:52 -07:00
Fariz Rahman cfbaec60c7 Simplify output shape inference for dot/cos merge (#3233)
* Simplify output shape inference for dot/cos merge

* Add extra dim if rank is 1

* Explain output shape inference logic in doc string

* Update tensorflow_backend.py

* Update theano_backend.py

* Update tensorflow_backend.py

* Update tensorflow_backend.py

* Update theano_backend.py

* Update topology.py

* example

* Update theano_backend.py

* Update theano_backend.py

* Update tensorflow_backend.py

* Update tensorflow_backend.py

* Update tensorflow_backend.py

* Update tensorflow_backend.py

* Update theano_backend.py

* Update tensorflow_backend.py

* typo fix

* Update theano_backend.py
2016-07-16 16:35:39 -07:00
Francois Chollet f3e7245910 Use native Theano BN 2016-07-16 13:42:41 -07:00
Francois Chollet 892d9fae84 Prepare 1.0.6 release 2016-07-16 12:25:03 -07:00
Francois Chollet 796f895f01 Start to nativify variable initialization in TF 2016-07-16 11:19:41 -07:00
Francois Chollet 489bb4eb10 Update docs for pooling 2016-07-16 10:28:35 -07:00
Francois Chollet 8f458066bb Update docs for initializations 2016-07-16 10:28:18 -07:00
Francois Chollet 5dd7454260 Merge branch 'master' of https://github.com/fchollet/keras 2016-07-15 17:36:21 -07:00
Francois Chollet 571db82371 Add variable initialization override option in TF 2016-07-15 17:36:00 -07:00
fchollet d971e0cca5 Style fixes in SeparableConv2D 2016-07-14 18:22:33 -07:00
Fariz Rahman fde0aac733 Convnet aliases (#3226)
* Convnet aliases

Not very important, but kinda handy.

* Update convolutional.py

* pep8
2016-07-14 18:16:00 -07:00
Maruan b9d904c12f Add masking support to BatchNormalization layer. (#3228) 2016-07-14 17:02:56 -07:00
Eder Santana aa2ec42da6 stop gradients (#3221)
* stop gradients

* fix stop grad test

* stop gradients
2016-07-14 16:13:55 -07:00
Francois Chollet d90d473104 Add pooling layers page in docs 2016-07-14 15:27:46 -07:00
Francois Chollet 5a1e63990a Refactor batch norm 2016-07-14 15:05:48 -07:00
Francois Chollet e836c10c6f Fast BN for TF 2016-07-14 14:13:06 -07:00
Francois Chollet 47c09d9557 Add SeparableConv2D layer (TF only) 2016-07-14 11:22:27 -07:00
Francois Chollet b35b943364 Add support for None activations 2016-07-14 04:38:53 -07:00
Francois Chollet ca467cc50e Add support for input tensors in InputLayer 2016-07-14 04:30:21 -07:00
Francois Chollet 51f7cf0367 Doc formatting fix 2016-07-14 04:20:12 -07:00
Francois Chollet 642eaca618 Doc formatting fix 2016-07-13 12:38:45 -07:00
Francois Chollet 55e5680535 Update doc generation script 2016-07-13 12:35:11 -07:00
Francois Chollet 52ea31b65c Add FAQ entry on pre-trained models 2016-07-13 12:34:58 -07:00
Wei Ouyang b3a26a5b30 Add AtrousConv2D layer for dilated convolution (#3183) 2016-07-13 08:28:23 -07:00
Dave Challis 98974efa5f Fix to error message in exception (#3213)
Was incorrectly reporting the `loss` argument instead of the `loss_weights` argument when an exception related to loss_weights was thrown.
2016-07-13 08:27:03 -07:00
fchollet b6a776b242 Add Sequential.pop() method 2016-07-12 20:17:03 -07:00
Francois Chollet 1ea3f44f06 Fix dtype consistency issue in random_binomial 2016-07-12 17:05:47 -07:00
Michael Oliver 64e1320ca0 add dtype to zeros and ones allocations (#3187) 2016-07-09 10:10:15 -07:00
Francois Chollet 6e0b50fbdc Merge branch 'master' of https://github.com/fchollet/keras 2016-07-08 18:44:37 -07:00
Francois Chollet 22502a8fe8 Style fixes in regularizers 2016-07-08 18:44:23 -07:00
Jason Yosinski a78ad01bb4 Create initial_state tensor filled with zeros without use of K.zeros (#3123)
* Create initial_state tensor filled with zeros without use of K.zeros

* minor PEP8 fix
2016-07-06 12:47:14 -07:00
Carl Thomé 729e802e85 Added optional field name argument to RemoteMonitor callback (#3157)
* Added optional path argument

* Added optional field name argument
2016-07-06 09:47:03 -07:00
Joshua Chin 3ffff6d579 fix get_output_shape_for in Merge, when mode is callable (#3144) 2016-07-05 09:29:40 -07:00
Francois Chollet 6e5f97fca5 Merge branch 'master' of https://github.com/fchollet/keras 2016-07-04 14:21:13 -07:00
Francois Chollet eaff5bdfd7 Style touch-ups in TF backend 2016-07-04 14:20:54 -07:00
Francois Chollet 28819d36a4 Less frequent dataset tests 2016-07-04 14:17:35 -07:00
lucasmoura f9a4f6f306 Use defaultdict for _UID_PREFIXES (#3087)
The method get_uid on common.py first check if a prefix is in _UID_PREFIXED dict
and if it is not, a variable is added to the dict.

However, using a defaultdict, this check is no longer necessary.
2016-07-04 13:26:44 -07:00
Dmytro Mishkin 27c83c693d Added 'max' operation to Merge layer (#3128)
* Added 'max' operation to Merge layer. It allows to implement convolutional maxout with two (or more) convoluion layers and one Merge.

* Added 'max' to merge test
2016-07-04 11:03:07 -07:00
Eder Santana d106908a57 fix docs bugs (#3142)
* fix docs bugs

* fix docs bugs
2016-07-04 11:02:47 -07:00
Brian McMahan b4adce34dc Lambda output shape (#2680)
* updating the info for lambda

* updated lambda doc a bit more

made it more readable and stuff
2016-07-03 21:57:46 -07:00
Thibault de Boissière 3927505d1a Add multiprocessing for fit generator (#3049)
* Add multiprocessing for fit generator

* Change maxproc to nb_worker and update documentation

* Simplify multiprocessing test, clarify doc replace maxproc by nb_worker

* Replace maxproc by nb_worker in test

* Replace maxproc by nb_worker in test

* Update the doc: specify non picklable arguments should not be used with multiprocessing

* Add multiprocessing as an option with the pickle_safe argument
2016-07-03 21:50:01 -07:00
fchollet ede79f818e Add MIT license badge to README 2016-07-03 21:19:05 -07:00
Francois Chollet 742ac53262 Merge branch 'locally_connected' 2016-07-03 20:52:54 -07:00
Francois Chollet ee17ccc374 Add tests for locally connected layers 2016-07-03 20:51:58 -07:00
joelthchao 835a02c037 locally-connected layer
add unittest, fix output shape

PEP8

flatten weight, improve example

update docstring, remove cifar10 Alex exmaple

improve docstring, remove duplicate func

parallel by batch_dot

fix theano batch_dot

dim_ordering unit test, theano only use dot

dim_ordering unit test

Update locally connected layers
2016-07-03 20:48:22 -07:00
François Chollet ee8ff00a2a New conv ops (#3134)
* New function signature for conv2d in backend

* Clean up stuff

* Touch-up TF deconv op

* More cleanup

* Support for TF 3D conv/pool

* Move pooling layers to their own file

* Update TF version in Travis config

* Fix conv3d tests
2016-07-03 20:33:21 -07:00
fchollet 229f13a864 Lambda should not support masking implicitly 2016-07-02 15:12:46 -07:00
Rompei 0d60d637af TimeDistributedDense -> TimeDistributed(Dense()) in doc example 2016-07-02 12:12:54 -07:00
Francois Chollet c20e34a8b0 Prevent image_dim_ordering from being overwritten 2016-07-02 10:24:48 -07:00
Fariz Rahman 8d3f39852a Validate dot_axes argument in cos mode and fix output shape (#3116)
* Validate dot_axes argument in cos mode

* Update topology.py

* Update topology.py
2016-07-01 11:41:13 -07:00
Carl Thomé aa45dee5a4 Added optional path argument (#3118) 2016-07-01 10:56:17 -07:00
fchollet 885e6e621b Style fix in test 2016-06-30 23:22:06 -07:00
fchollet dc122c31ef Fix masking test 2016-06-30 23:19:51 -07:00
fchollet 3bc80d3db4 Remove unnecessary assert 2016-06-30 23:04:12 -07:00
Francois Chollet 439f2f3b2b Fix issue with multi-io + BatchNorm mask computing 2016-06-30 22:19:17 -07:00
Joshua Chin a1610eb274 model should use binary accuracy for binary crossentropy loss (#3098) 2016-06-28 12:45:34 -07:00
Francois Chollet 6b90eff03c Fix flaky test 2016-06-27 12:16:54 -07:00
Francois Chollet 4d404d1a54 Prepare 1.0.5 PyPI release 2016-06-27 12:01:20 -07:00
Francois Chollet fffa6a80ca Add attribute caching for flattened_layers 2016-06-27 11:51:28 -07:00
Francois Chollet 3f6b38b34f Fix duplicated updates issue 2016-06-27 11:51:12 -07:00
Francois Chollet 8166a55761 Fix flaky test 2016-06-27 11:50:56 -07:00
Shota 89f6d374e9 Fix typo (#3070) 2016-06-25 12:01:20 -07:00
M.Yasoob Ullah Khalid ☺ 9b3c2cf348 A small typo (#3067) 2016-06-24 19:04:28 -07:00
Francois Chollet 76406dd0c2 Remove unnecessary space 2016-06-23 16:17:36 -07:00
Francois Chollet d266b75423 Small fixes in text gen example 2016-06-23 16:17:24 -07:00
Francois Chollet 5bb5eb1657 Fix flaky test 2016-06-23 12:08:25 -07:00
Benjamin Bolte 258cf3b0f7 Support for masking in merged layers (#2413)
* added masking to merge layer (#2413)

* added documentation, fixed stylistic issues

* removed casting

* changed to using K.all
2016-06-23 12:03:55 -07:00
Utkarsh Upadhyay cdb5b09cd7 fix: Sort subdirs before mapping them to classes. (#3052)
The documentation says that [1]: 

> If [classes are] not provided, the list of classes will be automatically inferred (and the order of the classes, which will map to the label indices, will be alphanumeric).

However, the code was adding classes in the order `os.listdir` returned them. This commit alphanumerically sorts the sub-directories before mapping them to label indices.

[1] http://keras.io/preprocessing/image/
2016-06-23 11:32:40 -07:00
MikeAmy cb3469215a Moved epoch_logs = {} before batch loop to avoid UnboundLocalError. (#3019) 2016-06-20 08:18:01 -07:00
ηzw 703c2925b3 Add comment for a note of caution (#3024) 2016-06-19 12:36:05 -07:00
lucasmoura b6ca3ef051 Avoid double key lookup on callback.py (#3018)
On method on_epoch_end, to add new keys to the history dict, first it is
verified if a key is not on the history dict and if that is the case, a new key
is created on the history dict with an empty list as value.

However, this operation search for a key twice in the dict. This same behavior
can be achieved in a single step using dict setdefault method.
2016-06-19 10:48:53 -07:00
André Artelt b235f91cc7 doc: fix example for recurrent layer (#3022) 2016-06-19 10:39:23 -07:00
jkleint 3513472467 Allow re-use of EarlyStopping callback objects. (#3000)
An EarlyStopping callback object has internal state variables to tell it
when it has reached its stopping point.  These were initialized in __init__(),
so attempting to re-use the same object resulted in immediate stopping. This
prevents (for example) performing early stopping during cross-validation with
the scikit-learn wrapper.

This patch initializes the variables in on_train_begin(), so they are re-set
for each training fold.  Tests included.
2016-06-18 15:04:30 -07:00
Carlos Bentes 60e0c96f6c Fix typo in training (#3014) 2016-06-18 10:42:38 -07:00
Tsukasa ŌMOTO e37df7ca85 Fix json serialization in Lambda layer (#3012)
Fix #2582
Fix #3001
2016-06-17 21:26:45 -07:00
Tsukasa ŌMOTO a13a35fe52 Fix json serialization in merge layer with lamda output shape (#3011)
Fix #3008
2016-06-17 21:26:29 -07:00
Zhengping Che f421600218 fix wrong calls of __init__ in callbacks (#2999) 2016-06-16 14:51:28 -07:00
Francois Chollet 2a4492d74f Fix typo in docs 2016-06-16 10:56:34 -07:00
Tsukasa ŌMOTO 10368d867f Fix TF-IDF in Python 2 (#2992)
Fix #2974
2016-06-16 08:49:17 -07:00
Tsukasa ŌMOTO 936360020c Fix tf-idf again (#2986)
Fix 53aaa842ed
Fix #2974
2016-06-15 09:11:01 -07:00
ηzw c53c64d7fa Fix get_word_index (#2981) 2016-06-14 18:59:10 -07:00
Francois Chollet 6b122ba25f Allow arbitrary output shapes for custom losses 2016-06-14 16:40:58 -07:00
Francois Chollet 6501b587c0 Clarify use of two-branch models 2016-06-14 12:12:21 -07:00
Tsukasa ŌMOTO 53aaa842ed Fix tf-idf (#2980)
Fix #2974
2016-06-14 11:39:07 -07:00
Francois Chollet dc569e952d Merge branch 'master' of https://github.com/fchollet/keras 2016-06-13 12:06:45 -07:00
Francois Chollet c9aee4126e Fix issue with cascade of Merge layers 2016-06-13 12:05:48 -07:00
githubnemo 7397f4b0d0 Resolve #2960 (#2961)
* Resolve #2960

Introduce `K.var` so that the standard deviation computation can
be made numerically stable. Instead of

	K.std(x)

the user is able to write

	K.sqrt(K.var(x) + self.epsilon)

avoiding a division by zero in the gradient computation of `sqrt`.

* Fix typos
2016-06-12 21:16:21 -07:00
Rompei 3b83a1b1ac Fix initial variable in Evaluator. (#2955) 2016-06-12 14:50:56 -07:00
fchollet 8f8e4574dc Fix issue with Sequential deserialization 2016-06-12 11:04:09 -07:00
fchollet c30432a665 Nadam optimizer style fixes 2016-06-11 16:49:56 -07:00
Ilya Kulikov c4c2d8bbd4 Nadam optimizer and test for it added (#2764)
* Nadam optimizer and test for it added

* pep8 fix

* add comment in docstring and one more pep8 fix
2016-06-11 16:30:05 -07:00
Francois Chollet e49ba233f9 Convolution1D: apply activation after reshape 2016-06-10 15:05:34 -07:00
mpt5 cea6c1821f Update visualization.md (#2942)
* Update visualization.md

Added show_layer_names argument and its default value to docs

* Update visualization.md
2016-06-09 21:50:32 -07:00
Shaun Harker ab4bf447c6 Fix 1D convolution layers under Theano backend (#2938)
This issue is due to an unexpected loss of dimensionality when
composing the backend tensor operations "reshape" and "squeeze"
when there are dimensions of length 1.

For example, using a Theano backend the following fails with a
complaint about dimension mismatch:

UpSampling1D(2)(MaxPooling1D(2)(Reshape((2,1))(Input(shape=(2,)))))

The issue arises due to the conflict of two behaviors specific
to the Theano backend:

-   Reshape uses Theano's reshape function. Theano's reshape
    automatically makes dimensions with length 1 "broadcastable"

-   MaxPooling1D's implementation class _Pooling1D has a call method
    which uses a dummy dimension which it has to remove. The manner
    in which this dummy method is removed it to call "squeeze(x, axis)"
    from the backend. The squeeze implementation tells Theano to make
    the dummy dimension broadcastable, and then calls Theano's "squeeze",
    which removes ALL the broadcastable dimensions; not just the dummy
    dimension, but also the length 1 dimension flagged as broadcastable
    by reshape. This causes the problem observed above. This behavior
    is distinct from the behavior of the TensorFlow backend, which
    removes only the requested dimension.

This PR addresses this issue in two ways:

First, it introduces a test which checks the composition of "reshape"
and "squeeze" to make sure we get the same result using both Theano
and TensorFlow backends.

Second, it changes the implementation of squeeze(x,axis) so that the
Theano backend should behave similarly to the TensorFlow backend. With
this change the introduced test passes and the above example works.
2016-06-09 12:00:50 -07:00
Oswaldo Ludwig 4e0c8cf25b Eigenvalue Decay regularization (#2846)
* Update regularizers.py

I included a new regularizer named Eigenvalue Decay to the deep learning practitioner that aims at maximum-margin learning. This version approximates the dominant eigenvalue by a soft function given by the power method. For details, see:
Oswaldo Ludwig. "Deep learning with Eigenvalue Decay regularizer." ArXiv eprint arXiv:1604.06985 [cs.LG], (2016). https://www.researchgate.net/publication/301648136_Deep_Learning_with_Eigenvalue_Decay_Regularizer

The syntax for Eigenvalue Decay is similar to the other Keras weight regularizers, e.g.:

 model.add(Dense(100, W_regularizer=EigenvalueRegularizer(0.0005)))

* Example with Eigenvalue Decay regularization.

An example from Keras including regularization with Eigenvalue Decay. After training, you have to save the trained weights, create/compile a similar model without Eingenvalue Decay and save this model. Then, you can use your trained weights with this model, see lines 123-153 of  	CIFAR10_with_Eigenvalue_Decay.py (This is still an open issue).
This example yields a gain in the accuracy by the use of Eigenvalue Decay of 2.71% (averaged over 10 runs).

* Update CIFAR10_with_Eigenvalue_Decay.py

* Update CIFAR10_with_Eigenvalue_Decay.py

* Update CIFAR10_with_Eigenvalue_Decay.py

* Update regularizers.py

* Update regularizers.py

* Delete CIFAR10_with_Eigenvalue_Decay.py

* Update test_regularizers.py

* Update regularizers.py

* Update test_regularizers.py

* Update regularizers.py

* Update regularizers.py

I needed another reading in Keras backend...

* Issue to get shape of a tensor.

Issue to get shape of a tensor in the class EigenvalueRegularizer: the type returned for shape is different for Theano backend (Theano tensor type) and TF backend (TF TensorShape).

* Update regularizers.py

* Update regularizers.py

* Update regularizers.py

* Update regularizers.py

* Update regularizers.py

* Update regularizers.py

* Update regularizers.py
2016-06-08 12:20:25 -07:00
Francois Chollet e4b3a052a4 Small style fixes 2016-06-08 11:56:05 -07:00
Ziheng Jiang 825beb42c4 fix bug: rename duplicated loss name (#2842)
* rename duplicated loss name

* make python3 happy

* rewritten code to make it easy to read
2016-06-08 11:51:34 -07:00
Tammy Yang c8d605db55 Make DirectoryIterator case insensitive (#2932)
* make DirectoryIterator case insensitive

* Also need to make filename case insensitive while appending it into self.filenames
2016-06-08 11:27:19 -07:00
Ryo ASAKURA f6ecab58cb Fix description about parameter output_shape for function merge (#2933) 2016-06-08 11:26:56 -07:00
Tsukasa ŌMOTO d7e39347b9 Add mode=2 option to the docstring in BatchNormalization (#2919)
Fix a tiny typo.
2016-06-07 23:09:03 -07:00
Colin Rofls 25c10af596 fix 2852 (#2927) 2016-06-07 16:48:13 -07:00
Francois Chollet ded23f14c7 Fix typo in docs 2016-06-07 15:51:29 -07:00
Michael Crawford b0d52d930a Fix predict_proba method of KerasClassifier to return probabilites for both classes in case of binary classification. issue:2864 (#2924) 2016-06-07 11:48:04 -07:00
jakeleeme af5c5b6a55 Spellcheck source files (#2907) 2016-06-06 13:29:25 -07:00
ηzw ce51e19970 Fix typos in image preprocessing docs (#2906) 2016-06-06 13:28:39 -07:00
Francois Chollet 97e31b6090 Cleanup docs autogen script 2016-06-06 10:18:09 -07:00
Francois Chollet 62053e68e2 Prepare 1.0.4 PyPI release 2016-06-06 10:17:47 -07:00
Francois Chollet 489c07e748 Docs adjustment 2016-06-06 10:15:22 -07:00
fchollet 604ea8d68a Fix PEP8 BS 2016-06-05 20:41:19 -07:00
fchollet fd3cfb196b Allow no layer names in plot() 2016-06-05 20:20:14 -07:00
fchollet b71f6ba864 Allow absence of labels in flow() 2016-06-05 20:19:55 -07:00
fchollet dfc128b89a Fix some py3 generator issue 2016-06-05 14:52:10 -07:00
fchollet 34b8b57c2f Update image preprocessing docs 2016-06-05 14:01:28 -07:00
fchollet 3bba409d9e Improve docstring in preprocessing/image 2016-06-05 14:01:11 -07:00
fchollet 7869cdccec Merge branch 'master' of ssh://github.com/fchollet/keras 2016-06-05 13:39:55 -07:00
fchollet 0e18e345b0 Refactor ImageDataGenerator, add directory support 2016-06-05 10:24:54 -07:00
fchollet e5b99c7512 Tiny fixes in Sequential methods 2016-06-05 10:24:20 -07:00
aaditya prakash 7d4c85018a MaxoutDense no activation; incorrect docs (#2895)
Since MaxoutDense does not have activation it might be misleading to include "activation" as one of the arguments in the function docs.
2016-06-03 23:45:24 -07:00
fchollet edbec2dbc9 Remove bit of deprecated code 2016-06-03 23:13:46 -07:00
fchollet 973ece9809 Make dim_ordering a global default 2016-06-03 23:13:11 -07:00
lorenzoritter 90f441a6a0 fixed formatting error in the docstring (#2797)
* fixed formatting error in the docstring

* fixed formatting error in TimeDistributedDense of core.py
2016-06-02 19:07:34 -07:00
Andrew Stromnov 5a71090476 limit progress bar update rate (#2860)
* limit progress bar update rate

Limit progress bar update rate in verbose=1 mode. This patch allows to
reduce terminal I/O throughput while keeping reasonable high visual
update rate (defaults to 100 refreshes per second). It helps greatly
when working with large but simple data sets with small batches, which
leads to millions of relatively useless screen updates per second. Also
it helps to keep network traffic at reasonable rates, which
exceptionally useful within laggy networking conditions when using
keras over telnet/ssh, and improve web browser responsibility when
using keras within Jupyter Notebook.

* add docstrings for 'interval' and 'force' arguments
2016-06-02 13:06:37 -07:00
matthewmok 76cae0ec44 fix bug: change seed range for RandomStreams in Theano (#2865)
* bug fixed, numpy randint only output positive numbers ranging from 1 to 10e6

* Update theano_backend.py

changed style and numpy randint range

* Update theano_backend.py

removed extra spaces
2016-06-02 13:05:51 -07:00
talpay 273f0dda9d Added objective: Kullback Leibler Divergence (#2872)
* Added objective: Kullback Leibler Divergence

* KLD: Clip at 1
2016-06-02 10:23:00 -07:00
Tsukasa ŌMOTO 882b5a1d89 Fix YAML serialization when using Regularizers (#2883)
Fix #2871
2016-06-01 21:40:16 -07:00
ηzw 8c84ad1a86 fix typo (#2881)
* fix typo

* Update scikit-learn-api.md
2016-06-01 21:39:46 -07:00
Francois Chollet 80bfec7253 Fix JSON deserialization issue 2016-05-30 22:36:57 -07:00
fchollet 91b930298b Make Merge output_shape consistent with lambda 2016-05-30 20:50:59 -07:00
Tsukasa ŌMOTO 9c56b91548 Fix json serialization in merge layer (#2854)
Fix #2818
2016-05-30 20:30:07 -07:00
fchollet 7b5bab83f4 Merge branch 'master' of ssh://github.com/fchollet/keras 2016-05-29 15:36:21 -07:00
fchollet c5e2116ead Fix typo in doc 2016-05-29 15:36:14 -07:00
mittagessen d9db73a791 s/TimeDistributedDense/TimeDistribute(Dense(.../g (#2843) 2016-05-29 14:12:57 -07:00
Kumar Ayush 01ece4ef7b added required import line (#2839) 2016-05-27 23:56:51 -07:00
Francois Chollet 601f3e7cdb BN only uses learning phase in mode 0 2016-05-27 21:36:08 -07:00
Francois Chollet a9ca2c547f Merge branch 'master' of https://github.com/fchollet/keras 2016-05-27 21:32:53 -07:00
Francois Chollet 594cbed03b Small changes in mask caching 2016-05-27 21:32:43 -07:00
Monami Sharma 3938a905a1 Default values corrected for featurewise_std_normalization and featurewise_center (#2831)
For ImageDataGenerator, False is the default value for for featurewise_std_normalization and featurewise_center.
2016-05-27 08:59:10 -07:00
Francois Chollet 88d523e01b Add stateless batchnorm mode 2016-05-26 16:01:24 -07:00
Francois Chollet 0419fe67fc Merge branch 'master' of https://github.com/fchollet/keras 2016-05-25 16:46:21 -07:00
Francois Chollet 33ddeb5cbe Change way node depth is computed for shared layer 2016-05-25 16:46:06 -07:00
Colin Rofls 1f5d5b391b correctly serialize loss function (#2806) 2016-05-24 21:27:57 -07:00
fchollet 198c515208 Simplify imports in README 2016-05-23 23:59:34 -07:00
fchollet 5156673e17 Fix serialization issue with nested Sequential 2016-05-21 18:11:51 -07:00
fchollet 1529c9c438 Clarify error message 2016-05-21 17:01:39 -07:00
fchollet 32b10a8832 Fix first axis dim validation in multi-input model 2016-05-21 16:06:19 -07:00
fchollet 24501d4361 Fix ActivityReg layer 2016-05-21 15:46:32 -07:00
fchollet bf0c08e24a Add FAQ entry about layer freezing 2016-05-21 13:53:23 -07:00
RyosukeHonda 34d8cce6bc Fixed typo (#2770)
Fixed the year from "7 Apr 201" to "7 Apr 2015".
2016-05-21 10:12:45 -07:00
Joshua Loyal f0bfc24adc Correction to fan_out initializaiton (#2252)
* account for receptive field size in fan_out

* added test for conv layer initializations

* removed old reference to kernel_size
2016-05-19 22:11:41 -07:00
Xingdi (Eric) Yuan 2f8acfe4bf changeable print_summary (#2761)
* use changeable print_summary

* minor
2016-05-19 12:40:06 -07:00
gw0 e2fb8b2786 Add download error suggestion for babi_rnn.py and babi_memnn.py. (#2752) 2016-05-19 10:20:36 -07:00
Francois Chollet ebbc4d9fb8 Fix TB callback with non-standard TF version nums 2016-05-18 10:28:12 -07:00
Francois Chollet 8fc5b90e9a Update bibtex entry 2016-05-16 15:04:49 -07:00
mat kelcey 8a717f5b6c rename z_log_sigma to z_log_std to match z_mean (which is not z_mu) (#2729) 2016-05-16 11:08:00 -07:00
Colin Rofls aa91994166 save keras version & compile args when serializing models (#2690)
* save keras version & compile args when serializing models

* renamed prepare_config -> _updated_config + cleaner implementation
2016-05-15 20:18:31 -07:00
Joel 2091347a71 Fix zero division in merge mode='cos' (#2725)
* fix cos zero division

* use backend epsilon
2016-05-15 20:16:46 -07:00
Mikhail Korobov e0ed174f2c Input: proper error message for missing "shape" argument (#2727) 2016-05-15 15:15:14 -07:00
Francois Chollet 8c2a573ebf Prepare 1.0.3 release 2016-05-15 13:13:19 -07:00
Francois Chollet d7ff7cde92 Add VAE example 2016-05-14 12:06:23 -07:00
Francois Chollet 15d0b0ea08 Add K.tile test 2016-05-14 12:06:02 -07:00
Francois Chollet 3695bc2db5 Remove references to "join" merge mode 2016-05-13 11:06:08 -07:00
Francois Chollet a08995a90d Fix common LaTeX encoding issue 2016-05-12 12:03:20 -07:00
Tsukasa ŌMOTO aea00258e7 Update the reference of Batch Normalization (#2700)
We should refer the paper accepted in ICML 2015, instead of arXiv.
2016-05-12 09:54:46 -07:00
fchollet b581eb3f27 Update RMSprop 2016-05-11 21:35:11 -07:00
Francois Chollet 610ccba9f5 Normalize layer imports in examples 2016-05-11 18:45:37 -07:00
Francois Chollet d5ae6f32dd Fix flaky test 2016-05-11 18:01:01 -07:00
Francois Chollet 5308033936 Update RMSprop, Adagrad, Adadelta 2016-05-11 17:20:27 -07:00
Francois Chollet e2abb5ef2c Fix merge conflicts 2016-05-11 16:07:43 -07:00
Francois Chollet 1b11b4eeb6 Fix shape inference issue with TF.resize_images 2016-05-11 16:06:03 -07:00
Dieuwke Hupkes 39357b3045 Update documentation docstring Embedding (#2693)
From the documentation it is not entirely clear that if mask_zero is set
to True, the input_dim argument should be equal to the size of the
vocabulary + 2, as index 0 cannot be used anymore.

(This behaviour seems a bit strange, as it has as a consequence that the
first column of the weights of the embeddings will never be used or
updated. The resulting network thus has a redundant set of parameters).
2016-05-11 14:10:58 -07:00
Kai Sasaki ed7a5a1418 Residual connection should have the same dimension in case of no projection matrix (#2688) 2016-05-10 21:18:39 -07:00
Kyle McDonald ae682a71f9 functional API intermediate output doc in faq (#2682) 2016-05-10 08:22:53 -07:00
Brian McMahan 8327b37a0b fixed shape typo (#2679)
* fixed shape typo

* pep8
2016-05-09 22:17:12 -07:00
fchollet 973b5570aa Style touch-up 2016-05-06 20:37:46 -07:00
fchollet 7cb41fc5cc Fix weight saving issue 2016-05-06 20:37:35 -07:00
Tsukasa ŌMOTO 595d67ad7d Fix initialization of index_array (#2590)
index_array should be initialized when self.batch_index is zero.
2016-05-06 18:13:11 -07:00
François Chollet bb626c120e Revert "Revert "remove unused import statement in keras dir"" (#2647) 2016-05-06 13:32:43 -07:00
Xingdi (Eric) Yuan ba8fefa8ec Faster GRU (#2633)
* add a simple named entity recognition example

add a simple named entity recognition example

* add fast version of GRU

add fast version of GRU

* remove useless stuff
2016-05-06 11:10:46 -07:00
François Chollet 4b24f6d7b1 Revert "remove unused import statement in keras dir" (#2641) 2016-05-05 23:22:28 -07:00
ηzw 1c460e1e08 remove unused import statement in keras dir (#2638)
* remove unused import statement in keras dir

* rewrite import graph statement
2016-05-05 21:33:04 -07:00
Colin Rofls 7b4e157356 fixed docs for Sequential.get_config, and added a more helpful (#2635)
exception to `model_from_config`.
2016-05-05 15:24:52 -07:00
Dr. Kashif Rasul 5749f1b971 fix soft sign deprecation warning (#2623)
and backward compatible
2016-05-05 13:02:37 -07:00
Francois Chollet 3c57aff85b Style fixes 2016-05-05 11:17:25 -07:00
Carl Thomé 18504bcc86 Faster LSTM (#2523)
* Faster LSTM

* PEP8

* RNN dropout fix

* PEP

* PEP

* Less code duplication

* LSTM benchmark example

* PEP

* Test implementation modes

* Go through Keras backend
2016-05-05 11:01:48 -07:00
Francois Chollet d8864bfe48 Allow use of predict without compilation 2016-05-05 08:24:12 -07:00
Nic Eggert 078b20169b Add batch_get_value to backends (#2615)
* Add function to get multiple values at once

* Change to match existing batch_set_value

* Fix typo
2016-05-04 17:13:17 -07:00
Francois Chollet 5f7e78df65 Improve optimizer configuration 2016-05-04 14:18:06 -07:00
Francois Chollet fc470db7ab Fix typos in layer writing guide 2016-05-03 11:29:50 -07:00
jingzhehu f576f37801 one line fix for TensorBoard callback issue (#2574)
* one line fix for TensorBoard callback issue

Ref: https://github.com/fchollet/keras/issues/2570

* handle SummaryWriter based on tensorflow version

code contributed by @bnaul

https://github.com/bnaul/keras/commit/e04ce5e37ec234debaea8c6482ef90be1f
88286d
2016-05-03 10:51:43 -07:00
Francois Chollet b74118a766 Fix typo in documentation 2016-05-02 15:59:04 -07:00
Brian McMahan 1c7a0248b9 updated for list check bug in predict/predict_on_batch (#2585)
* updated for list check bug in predict/predict_on_batch

* pep fix

I think that's going to be the only pep complain..
2016-05-02 15:33:25 -07:00
Francois Chollet 36a829c20d Add doc page about writing custom layers. 2016-05-02 14:16:09 -07:00
chentingpc 33af75aa39 fix activity regularizer so it can deal with multiple inbound nodes as well (#2573) 2016-05-01 16:36:31 -07:00
jpeg729 844420425e Added softsign activation function (#2097) 2016-04-30 18:29:33 -07:00
Francois Chollet da57a530f9 "total_loss" -> "loss" 2016-04-30 16:38:23 -07:00
fchollet 1f17013949 Misc fixes 2016-04-30 15:09:35 -07:00
fchollet f18899cb36 Merge branch 'master' of https://github.com/commaai/keras into commaai-master 2016-04-30 14:09:56 -07:00
Sasank Chilamkurthy 877f946e24 Improved docs of ImageDataGenerator (#2565) 2016-04-30 11:53:44 -07:00
Francois Chollet a981a8c42c Make bias optional everywhere 2016-04-29 16:54:39 -07:00
Francois Chollet 5467107fc9 Prepare 1.0.2 PyPI release 2016-04-29 10:39:52 -07:00
Gijs van Tulder ad3107073b Re-raise exceptions to preserve stack trace (#2350) 2016-04-28 12:38:36 -07:00
Francois Chollet 8d62f4da6c Minor UX fix 2016-04-27 17:34:33 -07:00
Joel 3779b8a008 Fix test_image path non-exist error in ci-travis (#2531)
* correct inception_v3 network

* store test images in class attribute

* PEP8
2016-04-27 11:35:31 -07:00
Francois Chollet 6ec5e48969 Style touch-ups 2016-04-27 10:53:54 -07:00
fchollet bfa5ca553d Fix docstring 2016-04-27 09:20:19 -07:00
Francois Chollet c9f7d970e9 Style fixes in preprocessing/image 2016-04-26 15:24:05 -07:00
Sasank Chilamkurthy f26ce6e236 Rewriting image augmenter (#2446)
* Much better image data augmentor

* removed unnecessary functions

* shift origin to centre of the image for homographies

* init commit

* change to zoom_range

* Added scikit-image to extras_require in setup.py

* add zoom_range test, exception for invalid zoom_range

* add scikit-image to dependency

* fix fit and retain old functions for unit test

* use ndi insteadskimage in random_transform

* removed buggy code in random_rotations, shears etc  and replaced it with todos.

* remove sci-image, implement ndimage based methods, refactor random_transform

* random_zoom, array_to_img consider dim_ordering

* add random_channel_shift, support fill_mode and cval

* image doc, update test_image, PEP8

* fix channel shift clip

* fix doc, refine code

* detail explain of zoom range

* check coding style
2016-04-26 15:21:14 -07:00
Brian McMahan b001e36f18 adding a disable_b boolean to Dense (#2512)
* adding a disable_b boolean to Dense

* changing 'disable_b' to 'bias' 

Changing the name of the boolean & flipping its behavior so that the default is True and when set to False the bias is not used.

* integrating bias flag fully

changed the bias flag to affect the creation of the self.b variable as well as the output calculation

* fixing a blank line to appease pep8
2016-04-26 14:25:00 -07:00
Francois Chollet 9abb6ef723 Add TF graph management warning 2016-04-26 13:02:39 -07:00
Francois Chollet bfbdbb05bc Add root imports 2016-04-26 13:02:11 -07:00
Francois Chollet bd2bd51b5d Fix typo in README 2016-04-25 19:06:31 -07:00
Francois Chollet 4e547a31ed Improve TF session & variable management 2016-04-25 18:49:19 -07:00
Francois Chollet de8d0defcd Fix PEP8 2016-04-25 18:48:23 -07:00
gw0 344437c491 Fix plot with show_shapes and multiple inputs/outputs. (#2421) 2016-04-25 15:29:16 -07:00
George Hotz ed365e94fd Added simple support for returning a multitarget loss 2016-04-25 14:46:03 -07:00
TobyPDE 5910278ca8 Fixed minor typo in getting-started/sequential-model-guide (#2499) 2016-04-25 09:14:18 -07:00
fchollet 18841fa58d Fix build 2016-04-24 22:18:02 -07:00
Carl Thomé 6fb4e0e441 Add cos and sin to backend (#2493) 2016-04-24 21:43:57 -07:00
Francois Chollet 39051ef3ca Add model_from_config in models.py 2016-04-24 14:33:27 -07:00
fchollet 1f4084870b Add new metrics and metrics tests 2016-04-24 12:10:47 -07:00
fchollet 00e9d5b219 Update regularizer tests 2016-04-24 10:27:45 -07:00
fchollet 7f93747602 Remove outdated comment 2016-04-24 10:27:45 -07:00
Kai Li a7156b8c27 Update antirectifier.py (#2485) 2016-04-24 09:34:20 -07:00
fchollet b1e47f7741 Fix PEP8 2016-04-23 13:55:20 -07:00
Ke Ding 59f8d6ca22 add weights for SGD optimizer (#2478) 2016-04-23 13:33:14 -07:00
Rich P. I. Lewis 5f4019d980 fixed Merge Layer functional API (#2460)
* fixed Merge Layer functional API

* moved test to layers/test_core
2016-04-23 13:32:45 -07:00
Ke Ding f84389da08 fix a benign but wrong range number in GRU's get_constants (#2475) 2016-04-22 18:55:38 -07:00
Jiyuan Qian 63c1757df5 fix accuracy with sparse_categorical_crossentropy (#2471) 2016-04-22 10:41:14 -07:00
Joel d6ab850f45 correct inception_v3 network (#2472) 2016-04-22 10:38:19 -07:00
graham 61dd53e262 allows python3.5 to build alongside < 3.5 (#2457) 2016-04-21 15:31:25 -07:00
Francois Chollet 423a633b5b Update merge tests 2016-04-21 09:44:44 -07:00
Colin Rofls 256d4ef71b clarified usage of sparse_categorical_crossentropy (#2450)
- addressess #2444
2016-04-21 09:36:39 -07:00
Philip Bachman ad49962ba9 fix layer/node topo sort problem (#2433)
* fix layer/node topo sort problem

* fix to only iterate over valid layer/node keys
2016-04-20 21:38:23 -07:00
Brian McMahan 4680d70a78 fixing the constants thing in theano rnn (#2429) 2016-04-20 11:17:05 -07:00
Dapid ee7f056779 DOC: models should be compiled upon loading (#2428) 2016-04-20 10:23:26 -07:00
Francois Chollet 66c8d7baf2 Merge branch 'master' of https://github.com/fchollet/keras 2016-04-20 09:43:12 -07:00
Francois Chollet 9f929999d1 Fix Travis concurrent directory creation issue 2016-04-20 09:43:01 -07:00
fchollet 24f96262ec Add additional input data validation check 2016-04-20 08:53:40 -07:00
Brian McMahan 0e6e7a41f4 adding built check inside TimeDistributed (#2426) 2016-04-20 08:41:27 -07:00
Dan Becker 5cac088d98 Add scikit_learn wrapper example (#2388)
* Add scikit_learn wrapper example

* Extract and evaluate best model in examples/mnist_sklearn_wrapper.py
2016-04-19 21:50:50 -07:00
Francois Chollet 85f0448fee Make merge work with pure TF/TH tensors 2016-04-19 18:46:54 -07:00
Francois Chollet 106c0b753a Merge branch 'master' of https://github.com/fchollet/keras 2016-04-19 11:57:30 -07:00
Francois Chollet c525e634dc Fix loss compatibility validation 2016-04-19 11:57:19 -07:00
Eder Santana c398c0891b add eye to backened (#2407) 2016-04-19 11:38:21 -07:00
Francois Chollet 5ab48ac5d4 Update imagedatagenerator 2016-04-19 11:19:12 -07:00
Jeffery Ye ba29cd8e46 set input_length before reshape (#2410) 2016-04-19 10:49:47 -07:00
chardmeier b61235b77f Fixed typo. (#2401) 2016-04-19 10:20:13 -07:00
Francois Chollet 0ed00e38f0 Add inception v3 example 2016-04-18 21:51:43 -07:00
Francois Chollet 36eef0dd9a Add reset function to ImageDataGenerator 2016-04-18 21:51:43 -07:00
fchollet 1904194c7a Fix wrapper learning phase 2016-04-18 20:07:30 -07:00
Francois Chollet 7ce144881a Fix stateful unrolled RNNs in Theano 2016-04-18 17:09:20 -07:00
Eder Santana 55159cf451 Update topology.py (#2373) 2016-04-17 14:21:31 -07:00
Francois Chollet 7a12fd0f85 1.0.1 release 2016-04-16 14:11:34 -07:00
Steven Hugg 9d60126661 fixed TensorBoard callback (#2363) 2016-04-16 13:56:56 -07:00
Francois Chollet e341e73c6a Fix Dropout in RNNs 2016-04-16 09:08:15 -07:00
Francois Chollet 5dad3786f6 Add batch_set_value for faster TF weight loading 2016-04-15 22:40:54 -07:00
Francois Chollet ed0cd2c60d Add TF/TH kernel conversion util 2016-04-15 22:37:41 -07:00
Kai Li 2eea3a4c5d Update model.md (#2348) 2016-04-15 08:21:46 -07:00
Gijs van Tulder 090a46763e Fix support for custom metrics functions (#2351) 2016-04-15 08:21:16 -07:00
Dan Becker c4ed82cdf6 Fix typo in docs. loss_weight should be loss_weights (#2343) 2016-04-14 21:53:33 -07:00
Dan Becker b32248d615 Change error message in standardize_input_data (#2338) 2016-04-14 19:18:53 -07:00
Francois Chollet ca7437502b Shape inference fix for Embedding 2016-04-14 15:44:36 -07:00
Francois Chollet fe9b797a46 Style fixes in preprocessing/image 2016-04-14 15:44:24 -07:00
Francois Chollet b8059aeaba Fix test_image unit test 2016-04-14 13:38:07 -07:00
Daniele Bonadiman 85f80714c2 Max Over Time in imdb_cnn.py (#2320)
* Max Over Time in imdb_cnn.py

Following this issue https://github.com/fchollet/keras/issues/2296 i propose this PR.
The mayor optimisation a part of the Max over time are:

- Dropout in the Embedding layer.
- Longer input sequences (400 instead of 100), made possible from the speedup of the Max Over Time.
- Adam optimizer.

Overall it takes 90 to 100 sec per epoch on my laptop CPU and in two epochs it reaches 0.885 accuracy that is a 5 points improvement over the previous implementation. Moreover it requires less memory (300k parameters vs 3M+) since the number of parameters do not depend  by the length of the input sequence anymore.

* Update imdb_cnn.py
2016-04-14 13:22:06 -07:00
fchollet 2cc9ebf28b Add set_learning_phase in TF backend. 2016-04-14 08:32:04 -07:00
fchollet cb5d69c769 Fix case where output_shape in Merge is tuple 2016-04-13 21:20:21 -07:00
Francois Chollet 3b961a6b7b Fix Graph generator methods 2016-04-13 19:30:11 -07:00
Francois Chollet cba3ea9d90 Fix PEP8 2016-04-13 18:32:18 -07:00
Thomas Boquet 57ea065db7 Added learning phase to callbacks (#2297) (#2303)
* added learning phase to callbacks (#2297)

* cleaned imports

* replaced tabs by spaces

* added case where uses_learning_phase is False

* fixed pep8 blank line bug
2016-04-13 18:00:49 -07:00
Ben Cook 4f5f88b9ba Expose max_q_size and other generator_queue args (#2300)
* [#2287] expose generator_queue args

* [#2287] only expose max_q_size
2016-04-13 18:00:25 -07:00
Francois Chollet c1c2b330a1 Fix "trainable" argument 2016-04-13 17:58:05 -07:00
Francois Chollet 05e1d8e5f4 Fix siamese example 2016-04-13 12:05:42 -07:00
Francois Chollet 3cbca7bdba Fix validation_split 2016-04-13 11:39:16 -07:00
Francois Chollet 3e3c210f1d Update preprocessing/image documentation 2016-04-13 10:15:26 -07:00
Dan Becker cb65139aa8 Allow 'tf' ordering in ImageDataGenerator (#2291) 2016-04-13 10:14:59 -07:00
Francois Chollet 1206120d10 Fix callback issue with Sequential model 2016-04-12 15:22:05 -07:00
Francois Chollet 80ebe80138 Callback style fix 2016-04-12 15:21:02 -07:00
Francois Chollet fa1d6b478e Fix generators methods when passing data as dicts 2016-04-12 13:51:45 -07:00
Francois Chollet 345413fb8c Merge branch 'master' of https://github.com/fchollet/keras 2016-04-12 12:52:05 -07:00
Kyle McDonald 66ebd2a843 corrected parameter for learning_phase (#2284) 2016-04-12 10:18:42 -07:00
fchollet d50f469c09 Fix for invalid dropout values 2016-04-12 09:41:29 -07:00
Charles-Emmanuel 26714bc635 Small typo (#2282)
Small type
2016-04-12 09:23:20 -07:00
Pasquale Minervini 30208ae08b Fix for issue #2276 (#2277) 2016-04-12 09:20:20 -07:00
Francois Chollet 6ea3188971 Fix typo 2016-04-11 22:47:18 -07:00
Kyle McDonald 1db555a530 remove creation of np.asarray in to_categorical (#2268) 2016-04-11 22:36:37 -07:00
Francois Chollet 0772210dea Better error messages for Sequential 2016-04-11 16:41:08 -07:00
Francois Chollet df42e997b7 Merge branch 'master' into keras-1 2016-04-11 09:20:52 -07:00
Francois Chollet 1e71732600 Replace master branch with Keras 1.0 2016-04-11 09:19:03 -07:00
Francois Chollet 141e05e3a7 Update Theano installation instruction 2016-04-11 08:51:01 -07:00
Francois Chollet cadd3e4e2c Make TimeDistributed accept a mask 2016-04-11 08:18:39 -07:00
Francois Chollet 09034d9e17 Add badges to README 2016-04-11 07:49:29 -07:00
Francois Chollet 5706d1d688 Bugfix 2016-04-10 08:36:00 -07:00
Francois Chollet 41b9777746 Clean up training a bit 2016-04-10 07:45:54 -07:00
Francois Chollet d31fe1ac34 Fix LSTM regularizers 2016-04-10 07:45:54 -07:00
Francois Chollet c0eedfeca0 change optimizer in example 2016-04-10 07:45:54 -07:00
Francois Chollet dc8b5509f3 Small fixes in topology engine 2016-04-10 07:45:54 -07:00
Nic Eggert ddec052dab Remove restriction on strides in theano backend conv2d. (#2238)
Previously, strides were required to be smaller than the convolution
kernel. Usually, this is what a user wants, but there are edge
cases where one might want to do this (for instance, projection
shortcuts in Residual Networks).
2016-04-09 18:48:05 -07:00
Francois Chollet 2e45022c95 Clarify interface of in_train_phase 2016-04-08 14:14:22 -07:00
Francois Chollet cec7f73bca Fix irnn example 2016-04-08 14:11:40 -07:00
Francois Chollet 30989dc997 Fix generator methods. 2016-04-08 12:59:09 -07:00
Fariz Rahman 50a0d1cad4 Update topology.py (#2239) 2016-04-08 10:40:58 -07:00
Fariz Rahman 3db1f132a7 call : remove train arg from doc string (#2235) 2016-04-08 08:59:55 -07:00
Fariz Rahman 3b196feda5 Typo fix (#2234) 2016-04-08 08:59:43 -07:00
berleon dd766c68d9 Fix LeakyReLU return dtype (#2214)
LeakyReLU returns a tensor with float64 dtype.
It is stupid, but this line actually produces a float64 array:

```
    0.5*np.array(0.2, dtype=np.float32)
```

The theano nnet.relu function does something similar like this with the
LeakyReLU alpha parameter, which lead to a float64 tensor.
The solution is to not cast the alpha to float32.

Furthermore I tighten the `test_utils.layer_test`. It is now
required that the layer's output dtype is equal to the input dtype.
2016-04-07 17:02:03 -07:00
Francois Chollet 599e070824 Small fixes to the docs. 2016-04-07 16:52:09 -07:00
Francois Chollet 04c998a742 Fix typos 2016-04-07 14:25:41 -07:00
Francois Chollet cc985c3a9c Update docs, visualization utils 2016-04-07 14:23:34 -07:00
Francois Chollet 36cc508030 New documentation 2016-04-06 17:34:25 -07:00
Francois Chollet 444cd56740 Update README 2016-04-06 17:33:39 -07:00
Francois Chollet 88a86f7e45 Merge branch 'keras-1' of https://github.com/fchollet/keras into keras-1 2016-04-06 17:23:21 -07:00
Francois Chollet d6f94c0bc9 Fix docstring 2016-04-06 17:21:31 -07:00
Francois Chollet 2157aa6172 Make function compilation lazy 2016-04-06 17:21:21 -07:00
Francois Chollet 2013527840 Add docstrings to TF backend 2016-04-06 17:21:00 -07:00
Francois Chollet 8c73c6f218 Simplify lstm example 2016-04-06 17:20:43 -07:00
Michael Oliver 63059f6063 Bug fix to set correct uses_learning_phase flag
```test_on_batch``` and ```predict_on_batch``` had the wrong ```uses_learning_phase``` flag
2016-04-06 16:56:31 -07:00
Francois Chollet e179198410 Cache recursive calls in build_map_of_graph 2016-04-06 13:30:07 -07:00
Francois Chollet ed4a95bdad Slight cleanup of build_map_of_graph 2016-04-06 11:46:18 -07:00
Francois Chollet 1baddb9094 Docstrings cleanup 2016-04-06 11:46:01 -07:00
Francois Chollet 3160a445a8 Merge branch 'keras-1' of https://github.com/fchollet/keras into keras-1 2016-04-06 09:05:04 -07:00
Francois Chollet d627fd8781 Docstrings improvements 2016-04-06 09:04:53 -07:00
Michael Oliver b4bdc5a0fa add in predict_generator and tests
* add in predict_generator and tests

* fix PEP8 details

* Pre-allocate predictions

* make predictions return list if neccessary

* reset batch_size for other tests, make less wonky generator
2016-04-06 09:03:37 -07:00
Francois Chollet 6911fa2cba Fix typos in functional API guide 2016-04-05 10:27:11 -07:00
Francois Chollet 0d7c8711bd Fix JSON serialization issue 2016-04-05 10:26:57 -07:00
Francois Chollet d8b0fe0957 Correct functional API guide 2016-04-04 21:49:16 -07:00
Francois Chollet 263de77a5a Improve README, functional API guide 2016-04-04 21:46:27 -07:00
Francois Chollet d3615e682e Small fixes. 2016-04-04 15:00:06 -07:00
Michael Oliver 35da9d6ef2 Make tensorflow backend fully mimic theano.dot
* Squashed commit of the following:

commit f25b56f3a7547da94ffecee8701da5b34e757104
Author: Michael Oliver <michael.d.oliver@gmail.com>
Date:   Mon Apr 4 19:01:49 2016 +0000

    add proper shape inference

commit 5544f66148676d86cb53309701eee6a4e99a3aeb
Author: Michael Oliver <michael.d.oliver@gmail.com>
Date:   Mon Apr 4 18:03:05 2016 +0000

    Make PEP8 compliant and use int_shape

commit 0fe6e02699e2da553f8f1a866aaaa081c26a1cbd
Author: Michael Oliver <michael.d.oliver@gmail.com>
Date:   Mon Apr 4 03:35:29 2016 +0000

    Make tensorflow backend fully mimic theano dot

* fix None comparison
2016-04-04 14:30:55 -07:00
cmyr dbe7662e72 add sparse_categorical_crossentropy 2016-04-04 14:24:41 -07:00
Francois Chollet 30118bbab0 Improve docstrings, UX of common user mistakes 2016-04-04 14:22:28 -07:00
Francois Chollet 2902149f77 Finish PR backporting 2016-04-04 11:30:24 -07:00
Francois Chollet eb8b40cccd Fix captioning example in docs 2016-04-04 11:09:20 -07:00
Francois Chollet 76da13dff6 Backport of conv2d PRs by @ns 2016-04-04 11:09:06 -07:00
Francois Chollet f7cbdff79c Improve model training tests 2016-04-04 10:25:58 -07:00
Francois Chollet 81233b3cd3 Fix lambda serialization issue in Py3 2016-04-04 10:11:29 -07:00
Fariz Rahman af88e051fa Typo fix 2016-04-04 07:38:54 -07:00
berleon 6ad6b19bd6 Merge layer handle none in input_shapes 2016-04-04 07:34:36 -07:00
Francois Chollet 0242ca59ac Update version numbers 2016-04-03 20:58:13 -07:00
Francois Chollet c064963ef8 Fix Py3 compatibility issue in test 2016-04-03 20:22:29 -07:00
Francois Chollet 4318718769 Fix PEP8 2016-04-03 20:22:17 -07:00
Francois Chollet f981bdb551 Update functional API guide 2016-04-03 20:04:14 -07:00
Francois Chollet 3860e078a5 Small fixes in optimizers docstrings 2016-04-03 20:04:14 -07:00
Eder Santana d09e2a67bb fix batch_dot tests on backend
* Fix merge_dot tests

* Make batch_dot unique

batch_dot is not tensordot! It only accepts one reduce dimension at a
time. Other reduce dimensions should be dome afterwards with K.sum
This means that K.batch_dot will have the same behavior in both
tensorflow and theano. This also means that we have less parenthesis and
less nested lists.

New usage:

merge_mode = 'dot', dot_axes=[axis1, axis2]

Before:

merge_mode = 'dot', dot_axes=[[axis1], [axis2]]

* Backport sign by @the-moliver

* Fix docstrings

* Fix backend batch_dot tests
2016-04-03 20:04:00 -07:00
Francois Chollet 56ba6b9c7e Merge branch 'keras-1' of https://github.com/fchollet/keras into keras-1 2016-04-03 18:58:18 -07:00
berleon a9c6f26412 Add defaults for _gather_[list/dict]_attr
For example the Dense Layer does not have a update attribute, which

results in an error.
2016-04-03 18:58:08 -07:00
Francois Chollet 73817b8b77 More thorough tests for TimeDistributed 2016-04-03 13:38:46 -07:00
Francois Chollet 3f128b9838 Merge branch 'keras-1' of https://github.com/fchollet/keras into keras-1 2016-04-03 13:19:38 -07:00
berleon 9621bb5b8e fix h5py string encoding
When saving the weights a TypeError is raised by h5py.

See this issue https://github.com/h5py/h5py/issues/289 for details.

As it is recommended in the issue, the strings are now encoded as utf8.
2016-04-03 13:19:12 -07:00
Francois Chollet 836fb03aa0 Merge branch 'keras-1' of https://github.com/fchollet/keras into keras-1 2016-04-03 10:32:38 -07:00
Eder Santana c3aaf50b64 Update complete_guide_to_the_keras_functional_api.md
small changes
2016-04-03 10:29:02 -07:00
Francois Chollet fe00f5ff64 Fix learning phase issue with regularizers 2016-04-03 10:28:06 -07:00
Eder Santana 8b3543fca9 Fix merge_dot tests
* Fix merge_dot tests

* Make batch_dot unique

batch_dot is not tensordot! It only accepts one reduce dimension at a
time. Other reduce dimensions should be dome afterwards with K.sum
This means that K.batch_dot will have the same behavior in both
tensorflow and theano. This also means that we have less parenthesis and
less nested lists.

New usage:

merge_mode = 'dot', dot_axes=[axis1, axis2]

Before:

merge_mode = 'dot', dot_axes=[[axis1], [axis2]]

* Backport sign by @the-moliver

* Fix docstrings
2016-04-03 10:03:09 -07:00
Leon Chen a6fe2ae341 test against latest tensorflow in travis 2016-04-03 10:02:31 -07:00
graham 3ca7751445 fixed slice_X import location 2016-04-02 19:00:26 -07:00
Francois Chollet ebfde534c0 Add weight deduping before updates computation 2016-04-02 10:37:57 -07:00
Eder Santana f8c7dbb758 Backport #2123 by @carlthome
Backport #2123 by @carlthome
2016-04-01 21:27:20 -07:00
Francois Chollet 62f9053330 Fix babi_rnn example 2016-04-01 21:14:05 -07:00
Francois Chollet c429e651c1 Fix weight regularizers 2016-04-01 21:13:57 -07:00
Francois Chollet dacf017d38 Fix potential issue in topology engine 2016-04-01 21:13:34 -07:00
Francois Chollet dc3c1488bb Fix unit tests for Merge 2016-04-01 18:01:41 -07:00
Francois Chollet 4d7ff76cfb Fix some more tests 2016-04-01 16:45:05 -07:00
Francois Chollet 8ad6865952 Fix conv3d tests 2016-04-01 16:12:04 -07:00
Francois Chollet 2a4f6b942d Add more tests 2016-04-01 15:58:19 -07:00
Francois Chollet 91a819fb34 Fix engine issue 2016-04-01 15:58:07 -07:00
Francois Chollet 52dbeb1f26 Reintroduce failing siamese test 2016-04-01 15:57:45 -07:00
Francois Chollet 0836e47dfc Theano: warn on unused input 2016-04-01 15:56:30 -07:00
Francois Chollet 75bef59016 Unify dot behavior in TF and Theano 2016-04-01 15:56:19 -07:00
Francois Chollet 337c0c66cf Remove shape infer test (now incl in layers tests) 2016-04-01 13:36:28 -07:00
Francois Chollet f8e2df16f1 Fix TF batch_dot 2016-04-01 13:36:03 -07:00
Francois Chollet 10deb8f267 Add batchnorm unit tests 2016-04-01 13:25:59 -07:00
Francois Chollet efe5916109 Fix convolutional tests 2016-04-01 13:25:48 -07:00
Francois Chollet 64449c196e Fix recurrent tests 2016-04-01 13:25:37 -07:00
Francois Chollet f57128bd3d Fix wrappers 2016-04-01 13:25:13 -07:00
Francois Chollet 8740791d5c Small fixes to core layers 2016-04-01 13:24:48 -07:00
François Chollet 6ddb5a0452 Merge pull request #2158 from EderSantana/keras-1
Add batch_dot to backend
2016-04-01 08:38:29 -07:00
EderSantana 133699c2f3 keras.engine.topology.Merge uses K.batch_dot 2016-04-01 10:56:12 -04:00
EderSantana b0ea92bc12 Add batch_dot to backend 2016-04-01 00:15:28 -04:00
fchollet b587aeee1c Remove badge from README 2016-03-31 19:13:22 -07:00
Francois Chollet e754581ecb Fix noise layers 2016-03-31 19:08:07 -07:00
Francois Chollet fcb6ae8eed Fix activity regularization 2016-03-31 18:46:24 -07:00
Francois Chollet bf4dab3501 Update core layers 2016-03-31 18:17:17 -07:00
Francois Chollet a066cf8680 Fix advanced activations. 2016-03-31 16:43:10 -07:00
Francois Chollet 7448dcea65 Fix optimizers tests 2016-03-31 13:41:30 -07:00
Francois Chollet cf3b3dff32 Fix regularizers 2016-03-31 13:41:21 -07:00
Francois Chollet ca96737b20 Fix callbacks 2016-03-31 13:41:07 -07:00
Francois Chollet 61ade48343 Some cleanup 2016-03-31 11:50:30 -07:00
Francois Chollet 295bfe4e3a Keras 1.0 preview. 2016-03-31 11:35:27 -07:00
Francois Chollet bdd70d06d3 Prepare 0.3.3 release (last release before 1.0). 2016-03-31 11:15:44 -07:00
François Chollet 39a3be60c0 Merge pull request #2109 from the-moliver/sign
add sign operation to backends
2016-03-31 11:07:02 -07:00
François Chollet d81e4127fb Merge pull request #2123 from carlthome/master
Support low-precision floats
2016-03-31 11:04:11 -07:00
Carl Thomé 122be6e30b Support low-precision floats 2016-03-29 20:13:45 +02:00
Michael Oliver 8056f0dd37 add sign operation to backends 2016-03-28 19:43:49 +00:00
François Chollet 3a61ace619 Merge pull request #2098 from kylemcdonald/patch-4
roll jitter instead of pixel jitter for deep dream
2016-03-28 09:49:57 -07:00
Kyle McDonald 8661e78f08 roll jitter instead of pixel jitter for deep dream 2016-03-27 11:21:23 -04:00
François Chollet 90aafca585 Merge pull request #2055 from EderSantana/batch_dot
Batch dot
2016-03-25 17:18:50 -07:00
EderSantana d2ce350657 2D tensor support for batch_dot 2016-03-25 10:12:55 -04:00
EderSantana d4fce4f5f1 batch_dot supports arbitrary sized arrays 2016-03-24 15:35:21 -04:00
François Chollet 5e301d1f63 Merge pull request #2044 from NasenSpray/conv2d
Use T.nnet.conv2d interface
2016-03-24 08:53:42 -07:00
EderSantana 4e5348c5ca Add batch_dot to core.py layers 2016-03-23 18:52:57 -04:00
EderSantana a19db7b672 Add batch_dot to backend 2016-03-23 18:46:00 -04:00
François Chollet 85ebccfcc8 Merge pull request #2049 from grahamannett/patch-1
prevent UnicodeDecodeError
2016-03-23 13:59:30 -07:00
graham 677e9ee8ba prevent UnicodeDecodeError
Fix error that arises on some systems where python tries to read text as ascii
2016-03-23 05:49:16 -07:00
ns 0f4c7864ba fixed weird performance regression with up-to-date Theano 2016-03-23 09:27:29 +01:00
ns 63c099714b PEP8 fix 2016-03-23 08:53:36 +01:00
ns 3dd27c61fb fixed output shape for even kernels 2016-03-23 08:37:11 +01:00
ns 6543b67509 use Theano's new T.nnet.conv2d interface 2016-03-23 04:53:54 +01:00
François Chollet d0b348a55a Merge pull request #2037 from TheRushingWookie/master
Fixed the image captioning example
2016-03-22 19:03:29 -07:00
Quinn Jarrell 9f8d3cb399 Changed the image captioning example so it works with current keras. It was misusing the merge layer and had an incorrect parameter in of the GRU layers. Should fix issue #1522. 2016-03-22 15:40:55 -04:00
François Chollet deb4c06df8 Merge pull request #2013 from carlthome/master
Warn when epoch sees more samples than expected
2016-03-21 22:44:21 -07:00
François Chollet d3cc1de2d7 Merge pull request #2027 from sudeepraja/master
Added fix for training h5py dataset on Graph model
2016-03-21 22:43:20 -07:00
Sudeep Raja 404a30df88 Update models.py
Added fix for training h5py dataset on Graph model
2016-03-22 03:51:02 +05:30
Carl Thomé 3bd7d11170 Don't assume what a batch generator does 2016-03-21 15:59:07 +01:00
Carl Thomé 9ac50e0050 Warn when epoch sees more samples than expected 2016-03-20 19:51:24 +01:00
Francois Chollet 1145fec39f Update backend 2016-03-19 09:06:52 -07:00
François Chollet b0303f03ff Merge pull request #1968 from cadurosar/master
Fix Reproducibility with ImageDataGenerator
2016-03-14 18:34:34 -07:00
François Chollet 2716dcd6ab Merge pull request #1963 from jnphilipp/fix_config
Fixed get_config for SReLU.
2016-03-14 18:33:52 -07:00
François Chollet fef9de0d17 Merge pull request #1961 from giorgiop/typos_examples
typos in examples
2016-03-14 18:33:40 -07:00
Carlos Eduardo Rosar Kos Lassance 2dcafafcf9 change all instances of python random to numpy random in keras/preprocessing/image 2016-03-14 16:55:57 +01:00
jnphilipp 775664fdb8 Fixed nulls in input_shape. 2016-03-14 13:36:05 +01:00
jnphilipp a67034ee7c Fixed get_config for SReLU. 2016-03-14 12:48:08 +01:00
giorgiop e72bb9506a fixed typos 2016-03-14 11:27:20 +11:00
Francois Chollet ecd414d716 Fix RNN regularizers 2016-03-11 21:29:37 -08:00
Francois Chollet 80a831de1a Fix failing Lambda test 2016-03-11 17:54:19 -08:00
Francois Chollet 37fd456a5c Lambda layer improvements 2016-03-11 17:29:37 -08:00
Francois Chollet 0c1af0901d Remove class_mode, add support for acc in Graph 2016-03-11 17:29:17 -08:00
François Chollet 70431a5336 Merge pull request #1953 from tjrileywisc/master
Added os import so that check for weights file runs.
2016-03-11 11:06:10 -08:00
Francois Chollet cf3ab771d3 Cleanup of image processing utils. 2016-03-11 11:01:25 -08:00
Francois Chollet 524090e600 Cleanup of text/sequence preprocessing utils. 2016-03-11 10:50:57 -08:00
tim riley 9c9318ff6b Added os import so that check for weights file runs. 2016-03-11 12:57:14 -05:00
fchollet f75f70a60d Update examples in documentation 2016-03-10 21:31:35 -08:00
fchollet e3c31aa762 Merge branch 'master' of ssh://github.com/fchollet/keras 2016-03-10 20:57:18 -08:00
fchollet ef1e959505 Update visualize_util 2016-03-10 20:49:01 -08:00
François Chollet c5b8a1df80 Merge pull request #1943 from matt-peters/compile_kwargs
Allow passing kwargs down to K.function
2016-03-10 16:43:35 -08:00
Matthew Peters e73cf505a7 Add helpful error messages for invalid kwargs in K.function 2016-03-10 16:18:48 -08:00
Matthew Peters 8606edf3bf Allow passing kwargs down to K.function 2016-03-10 15:20:18 -08:00
Francois Chollet 71d46b7153 Fix docstring 2016-03-10 13:28:21 -08:00
Francois Chollet 7f3b2067bc Merge branch 'sklearn' of https://github.com/ipod825/keras into ipod825-sklearn 2016-03-10 13:22:55 -08:00
François Chollet ed882f4064 Merge pull request #1934 from danieldk/tf-backwards-fix
Tensorflow: reverse mask on go_backwards.
2016-03-09 11:17:06 -08:00
Daniël de Kok 126b820561 Tensorflow: reverse mask on go_backwards. 2016-03-09 10:27:25 +01:00
ipod825 b50624debd Add code to avoid user's typo in sk_params 2016-03-08 12:12:45 +08:00
François Chollet 709390dfdb Merge pull request #1918 from snurkabill/allow_soft_placement
by allowing soft placement into keras, we can run whole keras model u…
2016-03-07 19:27:28 -08:00
Jiří Vahala 56c492cbcc by allowing soft placement into keras, we can run whole keras model using tensorflow's "with("/:gpu42")" syntax 2016-03-08 03:34:17 +01:00
fchollet bde45eff87 Split model tests, fix create_output graph bug 2016-03-07 17:48:54 -08:00
Francois Chollet ce7276bc55 Merge branch 'master' of https://github.com/fchollet/keras 2016-03-07 16:48:30 -08:00
Francois Chollet fc476840fa Style fixes 2016-03-07 16:46:19 -08:00
Francois Chollet cfcb1e8703 Switch to symbolic shapes for TF in K.shape() 2016-03-07 16:46:11 -08:00
fchollet 8ba647c196 Fix PEP8 2016-03-06 19:07:49 -08:00
fchollet a86057d91c Improve error messages in TF backend. 2016-03-06 17:32:07 -08:00
fchollet 1ebeff8ee3 Move data_utils to utils.data_utils. 2016-03-06 17:31:57 -08:00
Francois Chollet be4a86f6dc Fix PEP8 2016-03-05 22:41:25 -08:00
Francois Chollet e5ccf53531 Fix typo 2016-03-05 22:24:18 -08:00
Francois Chollet ef93e2cffd Merge branch 'timedistributed' 2016-03-05 21:48:56 -08:00
Francois Chollet b8c59acd77 Improve tests for TimeDistributed 2016-03-05 21:48:41 -08:00
Francois Chollet 3cc242615d TimeDistributed: +docstring, perf improv, TF warn 2016-03-05 21:48:29 -08:00
Francois Chollet 6596cc79d6 Fix variable-size shape inference in padding layer 2016-03-05 21:18:12 -08:00
Francois Chollet cd28c6d07e Merge branch 'master' of https://github.com/fchollet/keras 2016-03-04 14:11:17 -08:00
Francois Chollet b883761820 Neural style transfer: remove input preprocessing. 2016-03-04 14:10:45 -08:00
François Chollet b2b04b0fff Merge pull request #1893 from kylemcdonald/patch-3
changed doc `(input_dim,)` to `(output_dim,)`
2016-03-04 13:41:56 -08:00
Francois Chollet 9e2628e811 Fix neural style transfer example 2016-03-04 13:41:01 -08:00
Kyle McDonald 537fb1cc01 changed doc (input_dim,) to (output_dim,) 2016-03-04 15:54:11 -05:00
François Chollet 99891c0cc8 Merge pull request #1892 from kylemcdonald/patch-2
corrected documentation of "weights" argument
2016-03-04 12:45:03 -08:00
Kyle McDonald d748db43ae corrected documentation of "weights" argument 2016-03-04 15:29:28 -05:00
fchollet 4ec84541e3 Fixes in imagedatagenerator 2016-03-02 20:44:07 -08:00
François Chollet ca05efc76f Merge pull request #1862 from qdbp/issue_1857
Fixing issue 1857
2016-03-01 11:30:19 -08:00
Francois Chollet 7768ae04a2 actually use nb_[val_]_worker args 2016-03-01 13:48:29 -05:00
Francois Chollet 0daec53acb Style fixes 2016-03-01 10:14:56 -08:00
berleon d9ca798c60 fix initialization of BatchNormalization
Currently the scaling parameter gamma is uniformly sampled between
-0.05 and 0.05. This also brings the std -0.05 and 0.05 and not to
1 as claimed by the docstring of the BatchNormalization class.

This commit fixes it by initializing gamma to 1.  The same
initialization is used by
(lasagne)[http://lasagne.readthedocs.org/en/latest/modules/layers/normalization.html#lasagne.layers.BatchNormLayer]
2016-03-01 13:47:00 +00:00
Francois Chollet 990ef92a60 Merge branch 'master' of https://github.com/fchollet/keras 2016-02-29 20:25:19 -08:00
Francois Chollet becc5f3a2c Add support for dim_ordering in initializations 2016-02-29 20:18:17 -08:00
François Chollet e5d0dc65e0 Merge pull request #1856 from GregorySenay/master
Update Siamese layer set weights function
2016-02-29 18:48:03 -08:00
Francois Chollet 48ce23086b Fix recurrent dropout 2016-02-29 16:41:54 -08:00
Francois Chollet a3c9d2d7c9 Merge branch 'master' of https://github.com/fchollet/keras 2016-02-29 10:50:21 -08:00
Francois Chollet 9efe17aeea Fix RNN get initial states recursion issue 2016-02-29 10:50:08 -08:00
fchollet 763a2a9536 Style and dosctring fixes, sklearn wrapper. 2016-02-28 11:54:00 -08:00
Francois Chollet 4a43567cea Update FAQ 2016-02-28 10:13:52 -08:00
ipod825 be24159959 Refactoring of sklearn 2016-02-28 13:46:20 +08:00
Francois Chollet 943d2d4cf8 Add native names to weight tensors 2016-02-27 17:43:18 -08:00
Francois Chollet c4361d2246 Improve Lambda layer shape inference for TF 2016-02-27 12:10:58 -08:00
Francois Chollet 80927fa958 Fix Theano backend pool 2016-02-27 11:54:48 -08:00
Francois Chollet f3f19146f9 Remove dummy inputs in layers 2016-02-27 11:24:04 -08:00
fchollet 3799660504 Update pool module import for Theano 2016-02-27 10:46:46 -08:00
Francois Chollet 9b4f973d57 Fix lstm regularizers 2016-02-26 13:04:14 -08:00
Francois Chollet 567fdccd0b Add Merge input_shape property 2016-02-26 11:38:36 -08:00
Francois Chollet cf755a9c7c Merge branch 'master' of https://github.com/fchollet/keras 2016-02-26 10:46:40 -08:00
Francois Chollet ff2f8ac69b Fix stateful LSTM example 2016-02-26 10:45:12 -08:00
François Chollet 428f4bfde6 Merge pull request #1832 from NasenSpray/fix_mean
T.mean() on int tensors returns float64; use _FLOATX instead
2016-02-26 09:15:48 -08:00
François Chollet 7552f2c26d Merge pull request #1834 from farizrahman4u/patch-5
Fix imports : bAbI Example
2016-02-26 08:45:37 -08:00
Fariz Rahman 0eea5f8867 Fix imports : bAbI Example 2016-02-26 22:13:06 +05:30
ns 47c67ac19a T.mean() on int tensors returns float64; use _FLOATX instead 2016-02-26 16:50:57 +01:00
François Chollet 55d9374961 Merge pull request #1824 from farizrahman4u/patch-2
bAbI: Doubling the FB LSTM baseline :)
2016-02-25 12:51:04 -08:00
Francois Chollet 045e47174f Add generic TimeDistributed layer. 2016-02-25 12:41:53 -08:00
Fariz Rahman 22c091ae3f bAbI: Doubling the FB LSTM baseline :) 2016-02-26 01:06:52 +05:30
François Chollet d20fe64a69 Merge pull request #1823 from EderSantana/patch-6
Update SiameseHead
2016-02-25 11:33:42 -08:00
Eder Santana c6c150b042 Update SiameseHead
We forgot to pass the train parameter to the layer.__call__ inside get_output_at. This is a bug.
2016-02-25 13:50:08 -05:00
Francois Chollet ababd95210 Fix merge conflicts 2016-02-25 10:36:24 -08:00
François Chollet ff676f10f6 Merge pull request #1811 from fchollet/faster-rnn
Possibly faster RNNs
2016-02-25 10:15:45 -08:00
Fariz Rahman 73e563ecaf Speed up RNNs
Update core.py

Fix TF indexing issue

Update recurrent.py

TF slicing issue workaround

Update recurrent.py

Update core.py

Update core.py

Remove all backend specific code

shape[-1] instead of input_dim

Update core.py

space fix

Update core.py

Fix TF reshape issue

Update recurrent.py

Update core.py

Fix overflow issue in 3D softmax

Improve readability
2016-02-25 23:31:28 +05:30
François Chollet 3f905e4a35 Merge pull request #1817 from AIshb/patch-1
Fix a little bug in pad_sequences
2016-02-25 09:51:48 -08:00
Francois Chollet 98e2789db9 Readability improvements 2016-02-25 09:05:59 -08:00
François Chollet 8a6cf4c13e Merge pull request #1816 from smohsensh/patch-1
fix add_shared_node SiameseHead naming
2016-02-25 08:22:52 -08:00
AIshb f4af11c730 Update sequence.py
fix a little mistake in pad_sequences
2016-02-25 20:31:28 +08:00
mohsen 9f2aa1b6ae fix add_shared_node SiameseHead naming 2016-02-25 15:31:33 +03:30
Francois Chollet abca83373d Possibly faster RNNs
Forgot dropout.

Various fixes

Fix SimpleRNN dropout typo
2016-02-24 22:15:05 -08:00
Fariz Rahman 3aa807a0c8 Speed up softmax too
Update core.py

Update core.py

Update core.py

Update core.py

Update core.py

Update theano_backend.py

Update tensorflow_backend.py
2016-02-25 10:39:36 +05:30
Fariz Rahman 089fa11752 TimeDistributedDense Speed up
Update core.py
2016-02-25 10:38:49 +05:30
Francois Chollet 06a1545645 Style fixes 2016-02-24 13:05:47 -08:00
François Chollet 461573a8d9 Merge pull request #1790 from neggert/call-layer-cache
Enable cache optimization for __call__
2016-02-24 12:45:29 -08:00
François Chollet db8f43128b Merge pull request #1801 from farizrahman4u/patch-26
Lambda layer: Bug Fix
2016-02-23 20:53:56 -08:00
Fariz Rahman 80ddb5b3b8 Update core.py 2016-02-24 03:10:33 +05:30
Gregory 3ffba42466 Update core.py
Change Siamese set_weights function to avoid false nb layers value
2016-02-23 11:07:38 -08:00
François Chollet 2c49115cd3 Merge pull request #1792 from neggert/pool-border-same
Implement border_mode="same" for pool2d in Theano
2016-02-22 15:55:24 -08:00
z001qdp bbaa66c530 Implement border_mode="same" for pool2d in Theano 2016-02-22 16:30:02 -06:00
François Chollet 784d81d2c8 Merge pull request #1623 from oeway/master
Convolutional layers for 3D
2016-02-22 12:47:06 -08:00
z001qdp 523e9845d7 Make layer and shape caches more robust
Move caches to properties so that containers can override the
implementation to ensure that the cache gets propagated correctly
to child layers when it is changed.

Reset instead of disabling layer and shape cache  in __call__

Previously, __call__ did not get the speed benefits from caching
because it disabled it in order to feed the layer new input. This
meant that __call__ could be very slow on complicated structures.
Now, instead of disabling it, we temporarily empty it, then restore
the original when we're done.
2016-02-22 14:21:21 -06:00
François Chollet 47bd0af702 Merge pull request #1721 from neggert/graph-call
Make __call__ work properly for Graph
2016-02-22 11:09:52 -08:00
François Chollet 82d3489764 Merge pull request #1789 from yaringal/BRNN_latest
Update example file `imdb_lstm.py` to use dropout in RNNs
2016-02-22 10:23:03 -08:00
z001qdp 44d558ad7f Refactor implementation of __call__.
This refactor allows the inherited method to work properly for
Sequential and Graph (with single input) containers in addition
to normal layers, so there's no need to override the method.
Previously, __call__ did not work correctly for Graph containers.

Implement Graph.__call__ for multiple inputs

Add option (re-)initialize weights in set_previous

This allows us to use set_previous in places where we previously
manually adjusted the previous layer, which means that layers
that have non-standard set_previous implementations (like Graph)
work properly when they are, for example, the first layer in a
Sequential model.

This commit also adds a clear_previous method.

Add input_shape property to Graph container
2016-02-22 12:21:07 -06:00
Yarin cbd11315b7 Update example file imdb_lstm.py to use dropout in RNNs 2016-02-22 17:55:59 +00:00
Yarin 1588998ee8 Merge https://github.com/fchollet/keras into BRNN_latest 2016-02-22 17:54:23 +00:00
fchollet 58ca064f93 Use MSE in stateful example. 2016-02-22 09:10:28 -08:00
Wei OUYANG 3ecf201aea Add 3D Convolutional layers(Convolution3D, MaxPooling3D, AveragePooling3D, UpSampling3D and ZeroPadding3D, working with theano backend)
---------------------------------------
Squashed from the following commits
add Convolution3D and MaxPooling3D layers
fix 5D tensor in theano, add examples
update conv3d, pool3d, add resize_volumes and spatial_3d_padding
update Convolution3D, MaxPooling3D and AveragePooling3D, add UpSampling3D and ZeroPadding3D
add test functions for Convolution3D, MaxPooling3D, AveragePooling3D, ZeroPadding3D and UpSampling3D
small fix by changing pad_z to pad_t
update comment
skip some tests for tenforflow, @pytest.mark.skipif(K._BACKEND != theano, reason="Requires Theano backend")
use autopep8 to fix the code to match pep8 coding style
small fix (caused by autopep8)
small fix (caused by autopep8)
small fix (caused by autopep8)
fixed the document string for all newly added layers
remove the example and the dataset for 3d
add error messge for tensorflow backend
support stride in pool3d
Rename "params" to "trainable_weights"
change notations and docstrings for 3D layers
fix pep8 error
change variable name in test code
small fix for pep8
add error message and docstring for strides in conv3d
fix test error caused by wrong strides in conv3d
support strides in conv3d by slicing the output
add if statement for stride (1,1,1)
fix get_config according to mdering, and other small fix
fix model_from_json issue by passing a 3d border_mode
fix according to jruales' review
change docstring in Convolution3D
delete docstring about TensorFlow
change docstring in Convolution3D and theano_backend
---------------------------------------

Author:    Wei OUYANG <oeway007@gmail.com>
2016-02-22 16:02:11 +01:00
fchollet 654404c2ed Add a few explanatory comments in recurrent.py 2016-02-21 19:46:06 -08:00
fchollet 7896ef7143 Style standardization 2016-02-21 19:31:22 -08:00
François Chollet 0c75006d12 Merge pull request #1782 from EderSantana/patch-5
Sensible activation for RNNs
2016-02-21 19:20:47 -08:00
Eder Santana c7f7ffe7c4 Sensible activation for RNNs
Have noticed how default GRUs works usually worse than LSTMs? It seems that "tanh" is a more sensible activation choice. Also for GRUs, tanh seems to be the default:
see http://arxiv.org/pdf/1412.3555v1.pdf Section 3.2
2016-02-21 19:59:05 -05:00
Francois Chollet f10c430731 Style fixes, fix flaky test 2016-02-21 16:53:15 -08:00
Francois Chollet ea5cb74414 Merge branch 'BRNN_latest' of https://github.com/yaringal/keras into yaringal-BRNN_latest 2016-02-21 16:34:16 -08:00
Yarin 606a9b6810 Merge https://github.com/fchollet/keras into BRNN_latest 2016-02-21 23:51:52 +00:00
Yarin 35c5fa911d clean up 2016-02-21 23:31:32 +00:00
fchollet d4e9696447 Merge branch 'master' of ssh://github.com/fchollet/keras 2016-02-21 12:16:49 -08:00
fchollet 4cff0623de Update documentation. 2016-02-21 12:16:39 -08:00
François Chollet bc82613eae Merge pull request #1767 from neggert/seq-set-weights
Fix `Sequential.set_weights` for nested containers
2016-02-21 11:49:50 -08:00
fchollet 0ec57f28bc Fix fit_generator docstring 2016-02-21 11:35:13 -08:00
fchollet 860e4e9177 Various fixes in evaluate_generator/fit_generator. 2016-02-21 11:23:07 -08:00
Yarin a2fdc32381 use less memory when dropout is disabled 2016-02-21 17:57:58 +00:00
shmo d1a3842b3d implement validate_generator and the use of validation generators in fit_generator 2016-02-21 10:24:27 -05:00
Yarin 5406bd3ad2 updated dropout test 2016-02-21 03:21:20 +00:00
Yarin 941c3f6ae8 fixes following fchollet's comments 2016-02-21 03:04:37 +00:00
Yarin bec2701214 remove new line at end of file 2016-02-20 06:00:32 +00:00
Yarin bc60832dcf lint 2016-02-20 05:41:50 +00:00
Yarin 96483326d8 fixed tensorflow bug 2016-02-20 05:20:37 +00:00
Yarin fed7cc257e switched from broadcasting to K.expand_dim and from slicing to K.gather, and adapted test_recurrent to test with dropout 2016-02-20 03:56:00 +00:00
François Chollet d03f7768b8 Merge pull request #1769 from tboquet/modelckpoint_deb
ModelCheckPoint references nonexistent attribute
2016-02-19 19:14:39 -08:00
Yarin ce79e0a8ef test random_binomial, fix rnn backend 2016-02-20 03:03:35 +00:00
Yarin 5b3809394c Dropout for LSTM, GRU, SimpleRNN, and Embedding layers.
Squashed commit of the following:

commit 39a59192e96fe4098f1d663384b79b10e3bcc979
Author: Yarin <yaringal@gmail.com>
Date:   Sat Feb 20 02:15:29 2016 +0000

    Squashed commit of the following:

    commit 88faa440d02df8ff356011258e3e89ce44a13e1d
    Author: Yarin <yaringal@gmail.com>
    Date:   Sat Feb 20 02:13:24 2016 +0000

        Clean up

    commit f55245199a11a202857efb1413ffa3b97c1dcfaf
    Author: Yarin <yaringal@gmail.com>
    Date:   Sat Feb 20 01:57:50 2016 +0000

        Ported dropout for LSTM, GRU, SimpleRNN, and Embedding layer to latest Keras (turned off by default).

        Squashed commit of the following:

        commit 574c4549da69f8c0831f02dce1ad05331d8b38ed
        Merge: 19ef51c bdb149d
        Author: Yarin <yaringal@gmail.com>
        Date:   Sat Feb 20 01:23:54 2016 +0000

            Merge branch 'BRNN_latest' of https://github.com/yaringal/keras into BRNN_latest

        commit 19ef51c633544f847cddebeb7a3add0936051f19
        Author: Yarin <yaringal@gmail.com>
        Date:   Sat Feb 20 01:12:23 2016 +0000

            implemented dropout in GRU and SimpleRNN

        commit bdb149d1bbff64cc6b4d694090b905153d28e33a
        Author: Yarin <yaringal@gmail.com>
        Date:   Sat Feb 20 01:12:23 2016 +0000

            implemented dropout in GRU and SimpleLSTM

        commit 72ade3f493dd725fb414cbc65a847259360be138
        Author: Yarin <yaringal@gmail.com>
        Date:   Sat Feb 20 00:52:01 2016 +0000

            clean up

        commit 9f3d213c91906b3be5c876d539819a8577bc438c
        Author: Yarin <yaringal@gmail.com>
        Date:   Sat Feb 20 00:42:58 2016 +0000

            Model test callback

        commit d4ffffc26cf24c8b7927209caad4379aac3db9c5
        Author: Yarin <yaringal@gmail.com>
        Date:   Fri Feb 19 23:47:40 2016 +0000

            removed dependence on theano

        commit 89a4e6576278564ffb882032d5a7ec5758fe00e4
        Author: Yarin <yg279@cam.ac.uk>
        Date:   Fri Feb 19 23:25:13 2016 +0000

            working BayesianLSTM and embedding dropout for theano backend

        commit 1ab4e19dfe9d49defd5575a5c2b0b880b5c46eb5
        Author: Yarin <yg279@cam.ac.uk>
        Date:   Fri Feb 19 16:41:48 2016 +0000

            working BayesianLSTM with dependence on theano

        commit 672c27401ee345a69592771cfc9ab017642b6af3
        Merge: 9360ea6 b8a9f84
        Author: Yarin <yaringal@gmail.com>
        Date:   Fri Feb 19 00:30:44 2016 +0000

            Merge https://github.com/fchollet/keras into BRNN_latest

        commit 9360ea6c25eab90e83aebb32eb187c65ed63c01d
        Author: Yarin <yaringal@gmail.com>
        Date:   Thu Feb 18 23:28:35 2016 +0000

            work in progress on BayesianLSTM

        commit b8a9f84fad
        Merge: a154495 0f3f563
        Author: François Chollet <francois.chollet@gmail.com>
        Date:   Thu Feb 18 11:24:42 2016 -0800

            Merge pull request #1756 from gw0/fix-for-refactor-callbacks

            Fix missing callback refactoring.

        commit 0f3f56327b
        Author: gw0 [http://gw.tnode.com/] <gw.2016@tnode.com>
        Date:   Thu Feb 18 17:01:45 2016 +0100

            Fix missing callback refactoring.
2016-02-20 02:15:59 +00:00
tboquet e501cd664e changed ref to attribute ModelCheckPoint 2016-02-19 20:14:18 -05:00
z001qdp 47aafaaca0 Fix Sequential.set_weights
`Sequential.set_weights` would fail for nested `Sequential`
containers. Borrow the implementation of `Graph.set_weights` to
get it working. Also add tests for triply-nested `Sequential`
models.
2016-02-19 14:23:04 -06:00
François Chollet b8a9f84fad Merge pull request #1756 from gw0/fix-for-refactor-callbacks
Fix missing callback refactoring.
2016-02-18 11:24:42 -08:00
gw0 [http://gw.tnode.com/] 0f3f56327b Fix missing callback refactoring. 2016-02-18 17:01:45 +01:00
François Chollet a154495a2a Merge pull request #1751 from DingKe/master
Support saving/loading sample_weight_mode(s)
2016-02-17 18:26:29 -08:00
Francois Chollet 25e5f7531a Add ISSUE_TEMPLATE.md 2016-02-17 16:56:18 -08:00
Francois Chollet ab179fab89 data_utils cleanup 2016-02-17 16:39:37 -08:00
Francois Chollet 59a714abe8 Style fixes 2016-02-17 16:39:24 -08:00
Ke Ding e432d10be5 Merge branch 'master' of https://github.com/fchollet/keras.git 2016-02-17 18:48:06 +08:00
Ke Ding eeb576b12f Add support for saving/loading sample_weight_mode(s) 2016-02-17 18:47:29 +08:00
François Chollet 1019e50e7f Merge pull request #1626 from MatthieuPerrot/master
retrieve datasets behind http/https proxy
2016-02-16 15:19:22 -08:00
Francois Chollet 1a6cb71732 Make history an attribute of models 2016-02-16 14:22:41 -08:00
Francois Chollet 9048b5cbba Merge branch 'master' of https://github.com/fchollet/keras 2016-02-16 13:23:25 -08:00
Francois Chollet c9642571c2 Refactor callbacks 2016-02-16 13:22:53 -08:00
François Chollet cda80c790b Merge pull request #1719 from barvinograd/general-pad-sequences
generalize pad_sequences
2016-02-16 11:57:34 -08:00
Bar Vinograd 46a5b3cb36 generalize pad_sequences #1718 2016-02-16 10:56:33 +02:00
Francois Chollet 5b23dd8a2f Style fixes 2016-02-15 13:30:28 -08:00
Francois Chollet 2b26389188 Merge branch 'master' of https://github.com/fchollet/keras 2016-02-15 13:29:09 -08:00
Francois Chollet 44e0a7bbf9 Style fixes 2016-02-15 13:27:06 -08:00
Francois Chollet 87cc39d99f Merge branch 'master' of https://github.com/barvinograd/keras into barvinograd-master 2016-02-15 13:18:51 -08:00
François Chollet 55aacd1905 Merge pull request #1736 from bryan-lunt/fix_load_weights
Fixed load_weights to not create empty/corrupt .h5 files
2016-02-15 12:57:41 -08:00
Francois Chollet 3df101cc77 Merge branch 'master' of https://github.com/fchollet/keras 2016-02-15 12:56:59 -08:00
Bryan Lunt c23579e059 Fixed load_weights to not create empty/corrupt .h5 files
Fixes issue #1734 non-existant .h5 files will not be created, IOError will be raised.
2016-02-15 12:13:23 -06:00
François Chollet 6a41ac1c36 Merge pull request #1724 from bmabey/master
updates docstring for Merge to include cos and join
2016-02-14 15:47:34 -08:00
Ben Mabey b78ade7e36 updates docstring for Merge to include cos and join 2016-02-14 16:40:54 -06:00
Matthieu Perrot 61800be9a0 Add http/https proxy management for dataset downloading 2016-02-13 21:14:55 +01:00
Bar Vinograd 3925eabaaf SReLU activation (#1662, #1681) 2016-02-13 12:07:57 +02:00
Francois Chollet 34296ec961 Update FAQ 2016-02-12 17:10:51 -08:00
Francois Chollet 52f48e1f46 Merge branch 'master' of https://github.com/fchollet/keras 2016-02-12 12:17:20 -08:00
Francois Chollet 20728c95fa Make it possible to configure nb of threads in TF 2016-02-12 12:17:00 -08:00
François Chollet 6181ca8aae Merge pull request #1701 from stas-sl/patch-2
more concise random_shift and fix axis order
2016-02-12 11:56:33 -08:00
François Chollet 68115cc25f Merge pull request #1705 from neggert/cache_input_size
Add caching for layer input/output sizes
2016-02-12 11:40:55 -08:00
Francois Chollet c192beaf43 Merge branch 'master' of https://github.com/fchollet/keras 2016-02-12 09:45:21 -08:00
Francois Chollet 27dd1e939c Add dim_ordering tests 2016-02-12 09:44:55 -08:00
Francois Chollet 1e46a5d3ec Improve error message 2016-02-12 09:44:31 -08:00
Francois Chollet 0bf2b1b075 Improve TF backend docstrings 2016-02-12 09:44:16 -08:00
François Chollet ab3ef3efe5 Merge pull request #1704 from brmson/f/cosmerge
Merge(cos): Fix bad import
2016-02-12 09:43:33 -08:00
Petr Baudis 4d3ee897da Merge(cos): Fix bad import 2016-02-12 16:07:19 +01:00
stas-sl 8e591d228c more concise random_shift and fix axis order 2016-02-12 13:34:45 +03:00
z001qdp 6429a57a3c Add caching for layer output sizes
This dramatically speeds up constructing large networks.
Implementation follows that of layer caching introduced in e8e8f7.
2016-02-11 16:48:17 -06:00
Francois Chollet 9ad5ed8103 Fix indentation in code examples in docs 2016-02-11 11:42:34 -08:00
Francois Chollet 209b42c5ee Fix code comments for method example in online doc 2016-02-11 11:00:12 -08:00
Francois Chollet 23147de72b Convs tests touch-ups 2016-02-10 14:35:34 -08:00
Francois Chollet 7b72163073 Theano: support for strides with border_mode=same 2016-02-10 13:57:09 -08:00
Francois Chollet 6279544dc3 0.3.2 PyPI release 2016-02-09 13:57:16 -08:00
François Chollet 657b9fb48e Merge pull request #1651 from tboquet/fit_gen_tensorb
Support + tests for fit_generator + tensorboard
2016-02-08 14:01:02 -08:00
tboquet d68e3316da Support for fit_generator + tensorboard 2016-02-08 14:55:00 -05:00
François Chollet cae797b803 Merge pull request #1668 from jstypka/master
docs: update "pad_sequences()" parameters in the docs
2016-02-08 10:14:17 -08:00
Jan Stypka 21492a292a docs: update "pad_sequences()" parameters in the docs 2016-02-08 15:20:12 +01:00
fchollet f27c5b0500 merge conflict 2016-02-07 22:56:59 -08:00
fchollet 523e24e8ac Simplify Theano RNN when no mask is passed 2016-02-07 22:34:24 -08:00
François Chollet 8d393f766b Merge pull request #1645 from oraac/master
added more detail to the building documentation instructions
2016-02-07 21:17:11 -08:00
Francois Chollet c450089de9 Documentation improvements 2016-02-07 15:42:18 -08:00
Francois Chollet 1b22915f85 Documentation improvements 2016-02-07 15:38:20 -08:00
Francois Chollet 5824f2eb99 Provide None as default value for layer names 2016-02-06 18:45:15 -08:00
David McInnis 27754d2a5f changed to make compatible with more browsers 2016-02-06 14:32:52 -08:00
Francois Chollet 99991779e5 TF fixes and style fixes 2016-02-06 13:49:57 -08:00
Mikael Rousson 652f2eb56d add siamese example
use graph model
take pairs of digits as input
2016-02-06 22:06:22 +01:00
Francois Chollet 359f91ff6c Fix py3 test 2016-02-06 11:21:44 -08:00
Francois Chollet f3fd56db50 Improve input validation in graph.compile 2016-02-06 11:03:02 -08:00
Francois Chollet 6f5f3d3fb6 Fix join mode in Merge / Siamese 2016-02-06 11:02:44 -08:00
Francois Chollet 9b10ab2980 Style fixes 2016-02-05 15:37:36 -08:00
Francois Chollet 113f20e7e5 Merge branch 'rescale-objective' of https://github.com/wxs/keras into wxs-rescale-objective 2016-02-05 15:34:57 -08:00
Francois Chollet f2443de96d Merge branch 'master' of https://github.com/fchollet/keras 2016-02-05 10:34:48 -08:00
Francois Chollet e14deedd78 Temporarily disable image preprocessing tests 2016-02-05 10:34:36 -08:00
François Chollet 7a708c305c Merge pull request #1648 from gw0/fix-fit_generator-terminate
Fix program termination when threads are used in fit_generator().
2016-02-05 10:00:32 -08:00
Xavier Snelgrove 9a4a931d32 Merge remote-tracking branch 'origin/master' into rescale-objective 2016-02-05 11:31:52 -05:00
gw0 [http://gw.tnode.com/] f854dcb83f Fix program termination when threads are used in fit_generator(). 2016-02-05 15:36:14 +01:00
David McInnis b74e4a9f2d added more detail to the building documentation instructions 2016-02-04 15:15:55 -08:00
Francois Chollet 217cdd8b85 Fix objective tests 2016-02-04 14:41:29 -08:00
Xavier Snelgrove 7e678b8315 Objective outputs should rescale based on sample_weights
If sample_weights is to be used as a mask as well as for re-weighting
then it's important that, at least when used as a mask, the output be
rescaled. Otherwise the order of magnitude of your objective changes
purely based on the number of masked entries in your training data.
2016-02-04 15:57:57 -05:00
Francois Chollet 65b5899d06 Remove RMSE objective (use MSE instead). 2016-02-04 10:59:01 -08:00
Francois Chollet 02d5f72be4 Simplify Theano tensor instantiation 2016-02-03 18:40:13 -08:00
Francois Chollet cd2e36392c Rename "params" to "trainable_weights" 2016-02-03 17:52:05 -08:00
Matthias Plappert ff4a9b7b24 fix BN serialization 2016-02-03 17:27:14 -08:00
François Chollet 8acf4de764 Merge pull request #1621 from DingKe/master
fix issue #1363
2016-02-02 20:44:05 -08:00
Ke Ding 7b789c7fa0 fix issue when loading stateful RNNs if they are not the first layer 2016-02-03 10:07:30 +08:00
Francois Chollet ef22fcf548 Fix TF dim check 2016-02-02 11:58:48 -08:00
Francois Chollet de8a0133f0 Improve "repeat" in backends 2016-02-02 11:26:03 -08:00
Xavier Snelgrove 095b6c118c Fix merge conflicts 2016-02-01 14:16:54 -08:00
Francois Chollet 827ec65111 Fix stateful RNN serialization 2016-02-01 13:48:40 -08:00
Xavier Snelgrove c94cf4b32a Auto-expand mask dims. Fix categorical_crossentropy.
- Categorical_crossentropy was taking an extra mean, the function
      already removes the final dimension of your input, so you don't need
      to take a mean as you would with, say, L2 loss.

     - The RNN backend call can now take a mask with or without the same
      number of dimensions as the input data

     - Fix Masking layer for Tensorflow

     - Add some tests to confirm objective function shapes
2016-02-01 14:47:10 -05:00
Francois Chollet b688192cbd 15% more efficient RNNs with TensorFlow 2016-01-31 15:30:20 -08:00
Francois Chollet 64374c98fb Merge branch 'master' of https://github.com/fchollet/keras 2016-01-31 14:10:31 -08:00
Francois Chollet 361c52c527 Merge branch 'udibr-imageDataGen_thread' 2016-01-31 14:10:16 -08:00
Francois Chollet 04d7504537 Style fixes 2016-01-31 14:09:52 -08:00
fchollet a07efd4b5c Update examples in doc 2016-01-31 10:19:39 -08:00
fchollet 0ca4a0fbae Add binary classification example in doc 2016-01-31 10:13:14 -08:00
Ehud Ben-Reuven 6e33b516ef ImageDataGenerator thread safe 2016-01-30 23:43:56 -05:00
fchollet 3d85402ca5 Fix various typos 2016-01-30 18:19:37 -08:00
fchollet 9720db9566 Fix typo 2016-01-30 18:13:25 -08:00
fchollet c5d11f1da3 Fix AutoEncoder; cleaner API. 2016-01-30 18:10:33 -08:00
François Chollet 5d3c267398 Merge pull request #1596 from agitter/master
Minor typos in example commands
2016-01-29 14:23:54 -08:00
Anthony Gitter eb8e8b281e Minor typos in example commands 2016-01-29 15:31:53 -06:00
Francois Chollet 001f29cf54 Merge branch 'master' of https://github.com/fchollet/keras 2016-01-29 10:47:08 -08:00
Francois Chollet bb2b3ada3d Improve docs of conv_filter_vis example 2016-01-29 10:03:07 -08:00
fchollet 92b1948d51 Fix typo 2016-01-28 22:05:25 -08:00
Francois Chollet 14b109072a Merge branch 'master' of https://github.com/fchollet/keras 2016-01-28 19:22:37 -08:00
Francois Chollet 027a018210 Add convolution filter visualization example 2016-01-28 19:22:24 -08:00
Francois Chollet 4341c623ff Test cleanup 2016-01-28 19:21:45 -08:00
François Chollet 7a3122c154 Merge pull request #1583 from awentzonline/fix-random-tests
Fixed backend tests for `random_uniform` and `random_normal`
2016-01-28 15:16:43 -08:00
François Chollet 282e634f3a Merge pull request #1582 from fchollet/temporal_sample_weighting
Temporal sample weighting
2016-01-28 14:18:53 -08:00
Adam Wentz 3882f32f09 Fixed backend tests for random_uniform and random_normal 2016-01-28 15:59:24 -06:00
Francois Chollet efe4fc72e5 Fix loss weighting tests 2016-01-28 13:50:19 -08:00
Francois Chollet 1623ea6166 Support for temporal sample weighting + tests 2016-01-28 11:20:04 -08:00
François Chollet b2681804f8 Merge pull request #1577 from jocicmarko/master
Speed up random shifts in data augmentation
2016-01-28 10:39:38 -08:00
Marko Jocić 9bc8ab169a Speed up random shifts in data augmentation
Currently, polynomial interpolation of 3rd order is done when shifting. However, that is not needed because the images are shifted by integer values (crop_left_pixels, crop_top_pixels), and there is nothing to interpolate.
Setting ```order=0``` will speed up random shifts significantly.
2016-01-28 14:18:05 +01:00
François Chollet 59dcd7ba7a Merge pull request #1566 from tpsatish95/master
added random_shear() to random_transform
2016-01-27 23:07:25 -08:00
tpsatish95 5cf50f6a6c added the random_shear transformation for images. 2016-01-28 11:35:50 +05:30
Francois Chollet ecdce975d3 Fix inaccuracy in FAQ 2016-01-27 19:15:40 -08:00
Francois Chollet aceded7bfb Merge branch 'master' of https://github.com/fchollet/keras 2016-01-27 19:10:45 -08:00
Francois Chollet 45955be120 Update FAQ 2016-01-27 19:10:23 -08:00
François Chollet 829886eb16 Merge pull request #1552 from udibr/ImageDataGenerator_shuffle
In ImageDataGenerator.flow shuffle before every epoch
2016-01-27 18:50:57 -08:00
Francois Chollet 4922a67f09 Add support for temporal sample weights 2016-01-27 18:41:35 -08:00
François Chollet 41d07f5ba9 Merge pull request #1560 from jnphilipp/master
Fixed ImageDataGenerator.flow
2016-01-27 10:56:00 -08:00
jnphilipp 05fe5f564a Fixed image save. 2016-01-26 21:18:04 +01:00
Ehud Ben-Reuven 403b4fc7a2 In ImageDataGenerator.flow shuffle before every epoch 2016-01-26 15:06:06 -05:00
Francois Chollet c61d075abc Fix ImageDataGenerator docs 2016-01-26 09:36:35 -08:00
fchollet 2e90ae18a6 Update Theano install instructions 2016-01-24 16:53:29 -08:00
fchollet ed1297b393 Update Theano install instructions 2016-01-24 16:09:51 -08:00
François Chollet a53577205f Merge pull request #1543 from Joshua-Chin/master
Fix documentation of output shape for Convolution2D
2016-01-24 12:49:38 -08:00
Joshua Chin 26ab219159 fix documentation for Convolution2D 2016-01-24 13:24:16 -05:00
Francois Chollet 399c00c7f8 Update sample_weight documentation 2016-01-22 22:06:49 -08:00
Francois Chollet 36892dba8f Introduce relu exception for old Theano versions 2016-01-22 15:17:03 -08:00
Francois Chollet 1e58b89523 Fix BN tests 2016-01-21 12:17:25 -08:00
Francois Chollet 4f530ab5f8 Merge branch 'master' of https://github.com/fchollet/keras 2016-01-21 11:54:58 -08:00
Francois Chollet 8823fa520b Add axis to BN config 2016-01-21 11:54:46 -08:00
François Chollet 1dab54c1c3 Merge pull request #1527 from the-moliver/master
add axis arguments to constraints
2016-01-21 11:23:16 -08:00
Michael Oliver e2e281e14f add axis arguments to constraints 2016-01-21 10:53:46 -08:00
François Chollet 3f599b9204 Merge pull request #1525 from wxs/fix-masking-3d+
Support >3d mask matrices.
2016-01-21 10:39:21 -08:00
Xavier Snelgrove ea7f400d57 Support >3d mask matrices.
The change to the dimshuffle/transpose call to support >3d inputs was
correct for the inputs array but did not apply to the mask array. This
fixes that.
2016-01-21 12:52:36 -05:00
François Chollet d7f870a47f Merge pull request #1507 from wb14123/char-tokenizer
make Tokenizer supports char level
2016-01-20 20:39:05 -08:00
François Chollet 1f0adb7ed7 Merge pull request #1514 from DingKe/master
Check gpu status correctly when using theano's cuda.use to choose gpu
2016-01-20 18:45:13 -08:00
DingKe 5657a38515 Check gpu correctly when using theano.sandbox.cuda.use to set gpu 2016-01-21 10:19:50 +08:00
Francois Chollet 2143046261 Improve error message in recurrent.py 2016-01-20 17:59:26 -08:00
Francois Chollet 571f1d4fcf Fix py3 compatibility 2016-01-20 15:28:09 -08:00
Francois Chollet f834c59290 Merge branch 'master' of https://github.com/fchollet/keras 2016-01-20 15:04:17 -08:00
Francois Chollet 0300ae2f4c Style fixes 2016-01-20 15:04:12 -08:00
François Chollet a67fb21e56 Merge pull request #1512 from farizrahman4u/patch-26
RNN : Support for nD data
2016-01-20 12:39:30 -08:00
Francois Chollet d9c4c02601 Fix image preprocessing tests 2016-01-20 12:25:36 -08:00
François Chollet 3ed897332f Merge pull request #1501 from farizrahman4u/patch-25
Remove output_dim argument from RNN API
2016-01-20 10:25:14 -08:00
François Chollet 52cb803deb Merge pull request #1511 from farizrahman4u/patch-24
White space fix
2016-01-20 10:18:09 -08:00
Fariz Rahman ace7b7fe7f RNN : Support for nD data
Currently,  only 3D input is supported by the rnn function.

Update theano_backend.py

Fix tf too

Avoid slicing

assert ndim>=3

Update theano_backend.py

typo

Update theano_backend.py
2016-01-20 23:29:57 +05:30
Fariz Rahman d08b0efc80 White space fix 2016-01-20 22:03:00 +05:30
Bin Wang 371dcc9071 add doc 2016-01-20 17:19:04 +08:00
Bin Wang 05dcfe15fe make Tokenizer supports char level 2016-01-20 16:56:57 +08:00
Fariz Rahman 73f1ef2e95 Update activations.py
Update tensorflow_backend.py

Update theano_backend.py

Update core.py

Update recurrent.py

Update test_backends.py
2016-01-20 12:41:41 +05:30
François Chollet 5bcac37553 Merge pull request #1503 from the-moliver/gaussnoisefix
Add sqrt for proper noise sigma calculation
2016-01-19 16:38:02 -08:00
Michael Oliver 34c54cb19c Add sqrt for proper noise sigma calculation 2016-01-19 15:43:36 -08:00
François Chollet fb99e04f86 Merge pull request #1502 from tboquet/from_config_deb
Added the name of the model when loading it from config
2016-01-19 14:36:15 -08:00
Francois Chollet 262c8ce1a0 Merge branch 'cassianokc-master' 2016-01-19 14:19:45 -08:00
Francois Chollet 2091bfe911 Fix stateful LSTM example 2016-01-19 14:18:11 -08:00
Francois Chollet 5e9aafca33 Merge branch 'master' of https://github.com/cassianokc/keras into cassianokc-master 2016-01-19 13:57:26 -08:00
tboquet e31d26b33f + name of th loaded models 2016-01-19 16:44:42 -05:00
Francois Chollet 45714e343f Improve real time data augmentation, cifar10 ex 2016-01-19 13:43:41 -08:00
Fariz Rahman c8f2633109 Merge pull request #6 from fchollet/master
update
2016-01-19 13:49:38 -05:00
Francois Chollet 852fe9cc7b Fix time distributed softmax 2016-01-19 09:34:41 -08:00
Francois Chollet d1af488d10 Minor style fixes 2016-01-17 17:10:04 -08:00
Francois Chollet c59b0e936d Merge branch 'wxs-backend-masking' 2016-01-17 16:48:53 -08:00
Francois Chollet c98633cd7c Fix normalization tests 2016-01-17 16:48:20 -08:00
Xavier Snelgrove ac773ed243 Fix masking in RNNs 2016-01-17 16:29:07 -08:00
Francois Chollet 9d120bf9e0 Fix py3 compatibility 2016-01-17 16:09:00 -08:00
Cassiano Kleinert Casagrande e443527d28 Add example of stateful LSTM sequence prediction 2016-01-17 21:56:21 -02:00
Francois Chollet 8c103559a6 Update tests 2016-01-17 15:41:15 -08:00
Francois Chollet 85a1225cb3 Fix batchnorm 2016-01-17 15:41:08 -08:00
Francois Chollet 160196137f Add support for axis lists in element wise ops 2016-01-17 15:40:46 -08:00
Francois Chollet 040e1ef628 Merge branch 'master' of https://github.com/fchollet/keras 2016-01-17 12:13:27 -08:00
Francois Chollet 0debec1c11 Fix batch normalization 2016-01-17 12:13:15 -08:00
Francois Chollet 0abed354b5 Add/subtract mean in style transfer script 2016-01-17 12:12:24 -08:00
fchollet d00140862e Fix conv2d mode in doc examples 2016-01-15 19:35:57 -08:00
Francois Chollet 5c3839a950 Merge branch 'consciousnesss-add_pep8' 2016-01-15 16:32:41 -08:00
Francois Chollet 2006ec2873 Newline. 2016-01-15 16:32:23 -08:00
Francois Chollet 036d5c8dc2 Merge branch 'add_pep8' of https://github.com/consciousnesss/keras into consciousnesss-add_pep8 2016-01-15 16:30:32 -08:00
Francois Chollet 3bfe4eace9 Normalize layer importing 2016-01-15 16:19:00 -08:00
Francois Chollet 83aaadaa9d Log backend type to stderr 2016-01-15 16:18:42 -08:00
Francois Chollet a18932cb65 Improve neural style example 2016-01-15 16:17:18 -08:00
Francois Chollet 70f0fe515a Improve deep dream example 2016-01-15 16:17:05 -08:00
olegsinyavskiy a563a8446e Add an automatic PEP8 check on the pull request submission:\n - ignore most of the errors to avoid disrupting others\n - add a separate job to avoid confusion that all jobs fail because of a single pep error\n - fix few small pep errors 2016-01-14 08:32:37 -08:00
Oleg Sinyavskiy bd2ff26b37 Merge pull request #5 from fchollet/master
update
2016-01-13 17:34:42 -08:00
Francois Chollet 58a94a9b05 Add deep dream example 2016-01-13 10:46:45 -08:00
Francois Chollet 3d3b8c52e9 Cleanup of neural style transfer example 2016-01-13 10:46:19 -08:00
François Chollet 558605a363 Merge pull request #1422 from jakebian/json-fix
models: when dumping config to json, parse unhandled types correctly
2016-01-12 13:46:11 -08:00
François Chollet a696513cfb Merge pull request #1448 from keunwoochoi/patch-1
Remove cv2 dependency and use scipy.misc
2016-01-11 16:09:11 -08:00
Keunwoo Choi ee6bad63b0 Remove cv2 dependency and use scipy.misc
to read, resize, and save images.
2016-01-11 23:33:25 +00:00
Francois Chollet cb13a33a31 Fix neural style comments 2016-01-11 11:43:10 -08:00
Francois Chollet 1fdcc370b6 Remove unnecessary commented code. 2016-01-11 11:30:10 -08:00
Francois Chollet 94c9183179 Merge branch 'master' of https://github.com/fchollet/keras 2016-01-11 11:23:44 -08:00
Francois Chollet ada6dd2943 Add neural style transfer example 2016-01-11 11:23:38 -08:00
Francois Chollet a5c07d796a Update theano backend 2016-01-11 11:22:29 -08:00
François Chollet d2f7593a35 Merge pull request #1442 from jfsantos/patch-8
FIxed Tensorboard callback for Python 3
2016-01-10 21:20:42 -08:00
João Felipe Santos 314ee54e60 FIxed Tensorboard callback for Python 3
Adding `dict_items` [does not work](https://stackoverflow.com/questions/13361510/typeerror-unsupported-operand-types-for-dict-items-and-dict-items) in Python 3. Workaround is to create a copy of the dict and `update` it with the other dict.
2016-01-10 13:12:00 -05:00
François Chollet 5d1789c805 Merge pull request #1430 from tboquet/fix_custom_obj_load
Added compatibility for custom loss functions
2016-01-08 12:16:12 -08:00
tboquet 42cd4d6b62 Added support for both Sequential and Graph 2016-01-08 14:16:27 -05:00
tboquet 6bb4cbbf5e added compatibility for custom loss functions 2016-01-08 13:19:25 -05:00
Francois Chollet 998efc04ee Merge branch 'master' of https://github.com/fchollet/keras 2016-01-08 10:02:45 -08:00
Francois Chollet 037e592f2b Naming, batch_flatten 2016-01-08 10:02:28 -08:00
fchollet ced84d53bc Fix example docstring 2016-01-07 22:24:40 -08:00
fchollet d0b98a2cb5 Antirectifier example style fixes 2016-01-07 21:18:34 -08:00
fchollet 09d91fccb9 Add antirectifier example 2016-01-07 20:22:33 -08:00
jake e947a56c52 models: when dumping config to json, parse unhandled types correctly
- properly convert numpy types
- handle python 'type' objects
2016-01-07 18:02:02 -08:00
François Chollet 887178bd02 Merge pull request #1417 from farizrahman4u/patch-21
Lambda layer:get_output should use get_input.
2016-01-07 17:24:13 -08:00
Fariz Rahman e21a6a9ebf Fix output_shape too. 2016-01-08 01:13:00 +05:30
François Chollet f31f85a720 Merge pull request #1419 from ozancaglayan/fix-mask-cast
models: Cast the mask to floatX to avoid theano upcasting
2016-01-07 11:14:22 -08:00
Fariz Rahman bf2f64bfd5 Update core.py 2016-01-08 00:27:43 +05:30
Ozan Caglayan f800e448a2 models: Cast the mask to floatX to avoid theano upcasting
Issue #1416
2016-01-07 20:19:24 +02:00
Fariz Rahman fe18ad8dde Lambda layer:get_output should use get_input. 2016-01-07 17:06:43 +05:30
François Chollet 6167d17aeb Merge pull request #1414 from tboquet/conv_same_conv2d
Evaluate the kernel of base Theano conv2d with the same border mode
2016-01-06 19:28:26 -08:00
tboquet f8dd6da08d Eval kernel base Theano conv2d same border mode 2016-01-06 21:59:50 -05:00
François Chollet c6a1c01a08 Merge pull request #1411 from jarfo/patch-1
Update models.py
2016-01-06 10:55:15 -08:00
jarfo d02ea03462 Update models.py
Model evaluation (test) using the _test K.function should be also stateful for stateful recurrent networks
2016-01-06 13:27:28 +01:00
Francois Chollet 13379da81b Fix kernel shape type in theano conv2d 2016-01-05 15:44:15 -08:00
Francois Chollet 87f62dc6cf Merge branch 'master' of https://github.com/fchollet/keras 2016-01-05 10:36:15 -08:00
Francois Chollet 6cb1172668 Add cosine proximity objective 2016-01-05 10:35:29 -08:00
Francois Chollet 458641f33a Add K.l2_normalization 2016-01-05 10:35:10 -08:00
François Chollet e2fa1d56c0 Merge pull request #1396 from julienr/resize_images
Add K.resize_images backend op.
2016-01-03 14:51:44 -08:00
Julien Rebetez 01395f13ed Figure out tensor shape automatically in K.resize_images 2016-01-03 22:36:28 +01:00
Francois Chollet 6445d385ee Update CONTRIBUTING.md 2016-01-03 13:17:36 -08:00
Julien Rebetez 4330fd78e9 Fix theano_backend.resize_images 2016-01-03 19:04:49 +01:00
Francois Chollet f447644900 Update PyPi release to 0.3.1 2016-01-03 09:41:04 -08:00
Francois Chollet 3f623df020 Merge branch 'master' of https://github.com/fchollet/keras 2016-01-03 09:38:25 -08:00
Francois Chollet 69e19b1e03 Improve optimizer tests 2016-01-03 09:38:02 -08:00
François Chollet 0ed465acfa Merge pull request #1397 from stevenxxiu/batch_input_shape_fix
batch_input_shape fix
2016-01-03 09:31:13 -08:00
François Chollet ad9d41f1b0 Merge pull request #1389 from kashif/adamax
adamax optimizer
2016-01-03 09:27:55 -08:00
Steven Xu b17e4c5edf input_shape fix 2016-01-03 23:02:07 +11:00
Julien Rebetez 9d15c96115 Add K.resize_images backend op.
This allows to take advantage of tensorflow’s resize_images operator in UpSampling2D.
2016-01-03 12:37:14 +01:00
Kashif Rasul 5c72e14034 adamax optimizer 2016-01-03 09:40:26 +01:00
Francois Chollet d401bb46dd Doc fixes 2016-01-01 22:31:25 -08:00
Francois Chollet 3a9ffc8ffd Merge branch 'berleon-fix-1275' 2016-01-01 11:21:20 -08:00
Francois Chollet 421a2cdf04 Move batch norm tests to tests/keras/layers/ 2016-01-01 11:07:19 -08:00
Francois Chollet 8f934e0379 Merge branch 'fix-1275' of https://github.com/berleon/keras into berleon-fix-1275 2016-01-01 09:46:23 -08:00
François Chollet bb45991899 Merge pull request #1388 from kylemcdonald/patch-1
typo in doc: batch_input_size => batch_input_shape
2015-12-31 15:51:11 -08:00
Kyle McDonald 582dfc4233 typo in doc: batch_input_size => batch_input_shape 2015-12-31 14:53:11 -08:00
berleon 177f7b6b6e [BatchNormalization] set updates in get_output
This commit fixes the DisconnectedInputError described in issue
the `get_output` method. Before this commit the `updates` member
could would use another input as the `get_output` method, if the
input was changed.
2015-12-31 18:14:02 +01:00
berleon 579a219614 [AutoEncoder] set_previous triggers build
The `params`, `regularizers`, `constraints` and `updates` member of the
AutoEncoder were set in the `__init__` method.
When set_previous was called, the mentioned members were not updated.
This behavior resulted in a DisconnectedInputError.
Now the mentioned members are set in the `build` method and the
`set_previous` method calls the `build` method every time the
input changes. This commit fixes issue #1275.
2015-12-31 16:54:21 +01:00
François Chollet f95e6bada3 Merge pull request #1383 from tboquet/conv_deb
Fixed dnn_conv output size
2015-12-30 19:58:20 -08:00
tboquet c00cf10ef8 * deleted custom padding/replaced by a slice 2015-12-30 22:01:49 -05:00
Francois Chollet be9f7bc62f Documentation fixes 2015-12-30 13:09:16 -08:00
Francois Chollet d49baf1bfb Fix example in FAQ 2015-12-29 16:00:56 -08:00
Francois Chollet 729f0765da Progbar: scientific notation only for small values 2015-12-29 16:00:39 -08:00
François Chollet 161b31dcf3 Merge pull request #1374 from PiranjaF/master
Fix: Loading merge layer from serialized data
2015-12-29 15:16:25 -08:00
PiranjaF 643961723c Update layer_utils.py 2015-12-29 23:18:03 +01:00
François Chollet 7b95359b8e Merge pull request #1354 from easyas314159/master
Fixed handling of negative dimensions in Reshape layers
2015-12-29 11:16:17 -08:00
François Chollet 7555a32d0a Merge pull request #1372 from viirya/dedup
Minor: remove duplicate code
2015-12-29 11:13:45 -08:00
Liang-Chi Hsieh c95f5d10c2 Minor: remove duplicate code. 2015-12-29 18:24:57 +08:00
Kevin Loney 03cd7bf493 Fixed some stylistic issues and expanded the doc string for the
Reshape. _fix_unknown_dimension
2015-12-28 23:59:44 -07:00
François Chollet 308fd87031 Merge pull request #1352 from viirya/check-keras-dir-permission
Check keras_dir writing permission and assign temporary directory
2015-12-27 09:39:48 -08:00
Liang-Chi Hsieh a98eec34f7 Check basedir for dataset path. 2015-12-26 14:46:21 +08:00
Liang-Chi Hsieh b4eb1d9491 Check base dir. 2015-12-26 09:11:16 +08:00
Kevin Loney 186d95ae9c Fixed handling of negative dimensions in Reshape.output_shape and
Reshape.get_output
2015-12-25 11:14:01 -07:00
Liang-Chi Hsieh 58ed77b0d2 Check keras_dir writing permission. 2015-12-25 18:07:20 +08:00
Francois Chollet 16675b98c0 Better input validation in Sequential & Graph. 2015-12-23 13:55:13 -08:00
François Chollet 7a61cc20b9 Merge pull request #1338 from rpinsler/master
Fix typos and minor inconsistencies.
2015-12-23 11:26:23 -08:00
François Chollet 534f68ec77 Merge pull request #1336 from wb14123/loop
fix iteration shadowed in loop
2015-12-23 10:29:45 -08:00
François Chollet 2a4680ec3e Merge pull request #1335 from sjebbara/graph-prediction-output
Output Format of predict_on_batch in Graph() Model as Dictionary
2015-12-23 10:28:45 -08:00
rpinsler 85e51a0f8f Fix typos and minor inconsistencies. 2015-12-23 13:06:03 +01:00
Bin Wang 0695b82f74 fix iteration shadowed in loop 2015-12-23 17:14:51 +08:00
sjebbara 19290c07fd return outputs of predict_on_batch function of the Graph model as a dictionary 2015-12-23 09:52:39 +01:00
Francois Chollet 29e60ab372 Remove print statement in test 2015-12-22 18:09:40 -08:00
Francois Chollet eda1a9e0a4 Add tests for initializations 2015-12-22 17:57:04 -08:00
Francois Chollet 69932604f9 Fix text preprocessing test 2015-12-22 12:03:28 -08:00
Francois Chollet 7f85541785 Add text preprocessing test 2015-12-22 11:26:25 -08:00
Francois Chollet 7f3cd093c0 Fix flaky test 2015-12-22 10:37:09 -08:00
Francois Chollet 18d52e634d Add text preprocessing tests 2015-12-22 10:36:59 -08:00
Francois Chollet 485d451b62 Remove no-longer used util function. 2015-12-22 09:33:32 -08:00
Francois Chollet d5fb5d1f15 Improve callbacks docs. 2015-12-22 09:33:32 -08:00
François Chollet f65b531631 Merge pull request #1328 from phreeza/siamese_tests
Some more testing of Siamese layer
2015-12-22 09:12:47 -08:00
fchollet c8176fd3bc Update FAQ in documentation 2015-12-22 08:10:18 -08:00
fchollet d870e45eb0 Fix flaky test 2015-12-22 08:09:55 -08:00
Francois Chollet 532515cbb0 Merge branch 'master' of https://github.com/fchollet/keras 2015-12-22 07:53:44 -08:00
Francois Chollet f2f4f4ec48 Add helpful error message in Flatten 2015-12-22 07:53:21 -08:00
Thomas McColgan 3d109c6ebe split out theano only part 2015-12-22 14:33:32 +01:00
Thomas McColgan 2a0f3e3dfc further test of siamese layer 2015-12-22 14:33:32 +01:00
François Chollet 7a6a47888c Merge pull request #1324 from Chasego/patch-1
Fix the wrong link
2015-12-21 19:38:26 -08:00
cyc d8e83cc773 Fix the wrong link
Fix the wrong link for "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks"
2015-12-21 22:33:35 -05:00
Francois Chollet 908d866558 Improve unit test coverage 2015-12-21 17:55:26 -08:00
Francois Chollet 13df0bf32a Skip tensorboard test if py3 2015-12-21 14:21:57 -08:00
Francois Chollet fd632b70c5 Fix callback tests 2015-12-21 14:14:41 -08:00
Francois Chollet df612a881f Merge branch 'tboquet-tensorboard' 2015-12-21 13:53:33 -08:00
Francois Chollet b602a93e17 Update TensorBoard callback 2015-12-21 13:52:47 -08:00
tboquet 80eab1bc02 Add TensorBoard visualization callback. 2015-12-21 11:58:23 -08:00
Francois Chollet 49c343f836 Update CONTRIBUTING with info wrt commit squashing 2015-12-21 10:26:31 -08:00
François Chollet 27ee3a6bbc Merge pull request #1252 from gw0/feat-pydot-fallback
Import pydot-ng with pydot as fallback when visualizing.
2015-12-20 20:24:34 -08:00
fchollet 6ec1f7a498 Fix callback tests 2015-12-19 19:46:57 -08:00
fchollet e34f9e6deb Add tests for callbacks 2015-12-19 19:08:03 -08:00
fchollet 8d3b8ff627 Improve callback functionality 2015-12-19 19:07:50 -08:00
Francois Chollet dd58103a3c Better MemNN example 2015-12-18 15:10:52 -08:00
Francois Chollet 80096798fc Fix borked merge in test_models 2015-12-18 10:09:53 -08:00
François Chollet dac6b2f6a5 Merge pull request #1256 from consciousnesss/integration_tests_job
Split unit and IT tests
2015-12-18 09:23:42 -08:00
François Chollet 1fd55f69e5 Merge pull request #1298 from farizrahman4u/patch-21
Remove unnecessary apology :)
2015-12-18 09:20:55 -08:00
François Chollet 332f5c661f Merge pull request #1307 from gw0/fix-shape-tuples
Fix shapes should be tuples.
2015-12-18 09:20:41 -08:00
François Chollet 2f48f056ec Merge pull request #1290 from julienr/backend_from_env
Allows choosing the backend by setting the KERAS_BACKEND environment …
2015-12-18 08:52:57 -08:00
François Chollet 9d0cf9fbfa Merge pull request #1305 from fchollet/fit_generator
Fit generator
2015-12-18 08:52:27 -08:00
gw0 [http://gw.tnode.com/] bdf084e35e Fix shapes should be tuples. 2015-12-18 12:10:24 +01:00
gw0 [http://gw.tnode.com/] e5d3abdf09 Import pydot-ng with pydot as fallback when visualizing. 2015-12-18 11:05:27 +01:00
Francois Chollet 93b01aff15 dem flaky tests 2015-12-18 00:01:23 -08:00
Francois Chollet f9325e8fe5 Fix py3 compatibility. 2015-12-17 23:39:12 -08:00
Francois Chollet 3f67168c44 Fix flaky test 2015-12-17 23:03:23 -08:00
Francois Chollet f9911c10b4 Style fixes in datasets 2015-12-17 22:32:57 -08:00
Francois Chollet 47d074fec3 Add fit_generator methods in models 2015-12-17 22:32:44 -08:00
Francois Chollet 097e46837c Callback robustness fix 2015-12-17 22:14:35 -08:00
Francois Chollet b5f65dfaa4 Merge branch 'master' of https://github.com/fchollet/keras 2015-12-17 12:40:46 -08:00
Francois Chollet 5255b5df54 Style normalization in layers.core 2015-12-17 12:40:33 -08:00
Francois Chollet 58fb2b8af5 Improve BatchNorm documentation 2015-12-17 12:40:07 -08:00
Fariz Rahman c2a7ccd1cc Remove unnecessary apology :) 2015-12-18 00:32:40 +05:30
François Chollet 14d905cb4b Merge pull request #1293 from julienr/fix_upsampling
Modifies UpSampling1D/2D to repeat entries instead of tiling arrays.
2015-12-17 10:31:03 -08:00
François Chollet 6553a2bad9 Merge pull request #1289 from julienr/fix_docs_autogen
Fix docs/autogen.py to create subdirectories for autogenerated MODULES
2015-12-17 09:35:24 -08:00
François Chollet e01b824912 Merge pull request #1292 from julienr/fix_visutil
Remove hardcoded fontname in visualize_util
2015-12-17 09:34:59 -08:00
Julien Rebetez 5d685f4447 Use range instead of xrange to pass py35 tests 2015-12-17 16:32:48 +01:00
Julien Rebetez 50d3fddead Remove hardcoded fontname in visualize_util 2015-12-17 14:48:04 +01:00
Julien Rebetez 8715c70a74 Modify UpSampling1D/2D to turn [0, 1] into [0, 0, 1, 1] instead of [0, 1, 0, 1] 2015-12-17 14:47:04 +01:00
Julien Rebetez 554ed5bfc8 Add a K.repeat_elements function which works like np.repeat 2015-12-17 13:37:57 +01:00
Julien Rebetez ed92c14185 Allows choosing the backend by setting the KERAS_BACKEND environment variable 2015-12-17 11:45:52 +01:00
Julien Rebetez 8c914f793b Fix docs/autogen.py to create subdirectories for autogenerated MODULES
doc

Without this, docs/autogen.py fails with a no such file or directory
'sources/layers/convolutional.md'
2015-12-17 11:40:43 +01:00
Francois Chollet 063dd7d6c5 Merge branch 'lukedeo-highway' 2015-12-16 12:15:19 -08:00
Francois Chollet 827a6323b0 Style fixes 2015-12-16 12:15:09 -08:00
Francois Chollet edbefd7f50 Merge branch 'highway' of https://github.com/lukedeo/keras into lukedeo-highway 2015-12-16 12:12:30 -08:00
olegsinyavskiy 490ba423c4 do not share data between tests 2015-12-16 11:23:27 -08:00
Luke de Oliveira ec663aeda1 Remove TimeDistributedHighway tests 2015-12-16 19:26:11 +01:00
Luke de Oliveira 69cabdf6bc Remove TimeDistributedHighway
[WIP] will be added after `TimeDistributed` generic layer.
2015-12-16 19:10:39 +01:00
Francois Chollet d04cac6526 Fix GRU activation 2015-12-16 08:17:16 -08:00
Oleg Sinyavskiy ca37f96ca3 Merge pull request #4 from fchollet/master
update master
2015-12-15 21:50:02 -08:00
olegsinyavskiy 5e06aa5ef1 update the job 2015-12-15 19:07:24 -08:00
olegsinyavskiy 2f754c79f1 move tests into tests 2015-12-15 19:05:47 -08:00
Francois Chollet 42b3d37a54 Fix flaky test in preprocessing 2015-12-15 16:47:41 -08:00
François Chollet 151a6fbab9 Merge pull request #1279 from EderSantana/sequences
Add sequences tests and fix sequences docs
2015-12-15 16:30:46 -08:00
EderSantana e27587334b fix typo 2015-12-15 18:31:19 -03:00
EderSantana 54d3b9e673 rename test_sequences to test_sequence.py 2015-12-15 18:30:01 -03:00
EderSantana 2e29ef31a7 rename to and add more complete tests 2015-12-15 18:28:17 -03:00
Francois Chollet d68f12bbdf Merge branch 'farizrahman4u-patch-16' 2015-12-15 11:22:48 -08:00
Francois Chollet 3dddabebc4 Fix style, flaky test 2015-12-15 11:22:29 -08:00
EderSantana 4dcb2c04e3 Add sequences tests and fix sequences docs 2015-12-15 15:51:17 -03:00
Francois Chollet 324d8f875f Merge branch 'patch-16' of https://github.com/farizrahman4u/keras into farizrahman4u-patch-16 2015-12-15 10:46:34 -08:00
lukedeo 1872e00bea adding better documentation to highway layers 2015-12-15 19:46:19 +01:00
Fariz Rahman 3c8618bb39 Update core.py 2015-12-16 00:05:26 +05:30
Fariz Rahman 515448f859 Add is_graph to docstring 2015-12-15 23:56:30 +05:30
Fariz Rahman 85ba9aa884 Fix test_batchnorm_config() 2015-12-15 23:44:35 +05:30
François Chollet 1e418e0200 Merge pull request #1277 from farizrahman4u/patch-19
Quotes for default string values in docs
2015-12-15 09:51:58 -08:00
Fariz Rahman 54d332bf63 Quotes for default string values in docs 2015-12-15 23:02:20 +05:30
Fariz Rahman 3d51a26749 Fix json serializing 2015-12-15 22:48:51 +05:30
François Chollet 792b755594 Merge pull request #1258 from jeffzhengye/rnn_masking_bug_fix
correct masking: switch for each example in a batch
2015-12-15 09:15:35 -08:00
Fariz Rahman b2ab55611b Fix weight save 2015-12-15 21:14:42 +05:30
Fariz Rahman 97ba6aaaa1 call base constructor in Lambda layers 2015-12-15 20:32:45 +05:30
Fariz Rahman 794c083f6b Update containers.py 2015-12-15 20:29:04 +05:30
Fariz Rahman 534d81fc10 call base constructor in SiameseHead 2015-12-15 20:25:29 +05:30
Fariz Rahman 3adbb2bd4f Fix tests 2015-12-15 18:50:07 +05:30
Fariz Rahman c2220bff6e call base constructor in Merge and Siamese 2015-12-15 18:48:23 +05:30
Fariz Rahman 775e91c573 Update core.py 2015-12-15 18:09:39 +05:30
Fariz Rahman 3e19b0252f Update test_models.py 2015-12-15 18:08:55 +05:30
Fariz Rahman e0d8c09199 default arg for mask 2015-12-15 17:58:36 +05:30
Fariz Rahman f20383b7f7 Update containers.py 2015-12-15 17:45:55 +05:30
Fariz Rahman 5c83c69936 Update containers.py 2015-12-15 17:27:40 +05:30
Fariz Rahman f5d56fa2f7 Update test_models.py 2015-12-15 16:40:49 +05:30
Fariz Rahman 080a8199f4 Fix bug regarding layer cache 2015-12-15 16:35:06 +05:30
Fariz Rahman 7fb4f4b073 Update core.py 2015-12-15 16:30:10 +05:30
Fariz Rahman 9620a497c7 Update core.py 2015-12-15 16:25:23 +05:30
Fariz Rahman 3dcff62578 add arg cache_enabled 2015-12-15 16:21:12 +05:30
Fariz Rahman 91965e8f85 Update test_models.py 2015-12-15 16:08:50 +05:30
Fariz Rahman 92f2717c99 Update test_models.py 2015-12-15 16:02:44 +05:30
Fariz Rahman 07723ccff4 Update test_models.py 2015-12-15 15:39:41 +05:30
Fariz Rahman 0855541280 Siamese graph tests 2015-12-15 15:37:08 +05:30
Fariz Rahman 810cdc4a33 Test sequential 2015-12-15 15:24:32 +05:30
Fariz Rahman 652d61be46 Update core.py 2015-12-15 15:18:54 +05:30
Fariz Rahman 060fcd3ed0 Fix Siamese layer 2015-12-15 15:18:13 +05:30
Fariz Rahman 1cf7036d1c Fix add_shared_node() 2015-12-15 15:16:09 +05:30
Fariz Rahman f519823595 Tests 2015-12-15 13:22:20 +05:30
Fariz Rahman 4d3c6c9bbe Support nested models 2015-12-15 13:18:03 +05:30
Fariz Rahman f0cba6ec83 Add mask arg 2015-12-15 13:16:36 +05:30
jeffzhengye 090ac0d138 remove incompatible tests 2015-12-15 01:46:04 -05:00
olegsinyavskiy cf202bc5bd Merge remote-tracking branch 'origin/master' into integration_tests_job 2015-12-14 19:29:40 -08:00
Oleg Sinyavskiy 2ee8917836 Merge pull request #3 from fchollet/master
update
2015-12-14 19:29:19 -08:00
lukedeo 52976242f0 adding Highway Network layers (regular and time-distr.) 2015-12-14 20:03:46 +01:00
François Chollet 20d6d2b235 Merge pull request #1260 from chenb67/master
fix model checkpointer string formatter bug
2015-12-14 10:05:05 -08:00
Chen Buskilla 9987b8314e fix model checkpointer string formatter bug 2015-12-14 13:37:20 +02:00
jeffzhengye 53c8cfe47e Merge branch 'master' into rnn_masking_bug_fix 2015-12-13 19:26:03 -05:00
jeffzhengye bab230c0d6 add a test for simple rnn 2015-12-13 19:14:50 -05:00
Francois Chollet fddcf3acd3 Merge branch 'phreeza-test-sprint' 2015-12-13 14:45:59 -08:00
Francois Chollet 90b441c587 Style fixes in tests 2015-12-13 14:45:40 -08:00
Francois Chollet c746529559 Merge branch 'test-sprint' of https://github.com/phreeza/keras into phreeza-test-sprint 2015-12-13 14:28:03 -08:00
jeffzhengye efff160cea correct masking: switch for each example in a batch 2015-12-13 15:56:34 -05:00
Francois Chollet 0f86b45918 Add links to source code in documentation 2015-12-13 12:14:54 -08:00
Max Pumperla fb3df6a6e0 Merge branch 'test-sprint' of git://github.com/phreeza/keras into test-sprint 2015-12-13 10:43:20 +01:00
Max Pumperla 537e2d2123 single quotes in core layer test 2015-12-13 10:43:07 +01:00
Francois Chollet 4135797250 Update README 2015-12-12 13:34:46 -08:00
Francois Chollet ee3af2b8d0 Update docs README 2015-12-12 12:45:16 -08:00
Francois Chollet d229c47845 Add documentation generation script 2015-12-12 12:39:22 -08:00
Francois Chollet 9a93fc51cf Add docstrings in callbacks, models, optimizers 2015-12-12 12:36:23 -08:00
Francois Chollet 06f5f43079 Documentation refactor 2015-12-12 12:36:00 -08:00
olegsinyavskiy 039d15bb55 Split unit and integration tests:
- separate job on travis for IT tests
 - refactor and document IT tests (test_tasks.py)
 - char generation test with stacked LSTM
2015-12-12 11:21:15 -08:00
Oleg Sinyavskiy 8ad725248a Merge pull request #1 from fchollet/master
update master
2015-12-12 10:27:13 -08:00
Thomas McColgan 56aaffd3ea # This is a combination of 6 commits.
# The first commit's message is:
test image preprocessing

# The 2nd commit message will be skipped:

#	add PIL to enable testing of preprocessing code

# The 3rd commit message will be skipped:

#	try a different way to install PIL on travis

# The 4th commit message will be skipped:

#	include PIL only in python 2.7

# The 5th commit message will be skipped:

#	test image preprocessing

# The 6th commit message will be skipped:

#	fall back to Pillow for python 3 image processing
2015-12-12 14:35:59 +01:00
Thomas McColgan d3493fe6c8 test image preprocessing 2015-12-12 14:33:48 +01:00
Max Pumperla 444976552e Increased test threshold in merge overlap, as build breaks bc/ of it 2015-12-12 07:16:02 +01:00
Max Pumperla 671751fce4 Last unittest removed 2015-12-12 07:02:42 +01:00
Francois Chollet 656681e9d8 Docstrings touch-ups 2015-12-11 21:55:43 -08:00
Max Pumperla 19fadb5204 Merge master to resolve conflicts 2015-12-12 06:54:55 +01:00
Francois Chollet 25d76eb124 Merge branch 'master' of https://github.com/fchollet/keras 2015-12-11 21:39:05 -08:00
Francois Chollet 12e0052f43 Quote standardization 2015-12-11 21:38:54 -08:00
Francois Chollet 40864d8997 Add docstrings in layers. 2015-12-11 21:32:39 -08:00
François Chollet 5575f4a5a8 Merge pull request #1249 from bhargav/patch-1
Fix minor typo in recurrent.md
2015-12-11 15:58:11 -08:00
Bhargav Mangipudi af4f889fa6 Fix minor typo in recurrent.md
Fix a minor typo in docs.
2015-12-11 16:15:22 -06:00
François Chollet 40983e289a Merge pull request #1211 from farizrahman4u/patch-18
Bug fix - Cos
2015-12-11 09:37:02 -08:00
Max Pumperla a5d8b6378e Skip sklearn wrapper tests for TF for now 2015-12-11 15:16:19 +01:00
fchollet 71952f29c7 Remove make_reuters_dataset 2015-12-10 23:22:29 -08:00
fchollet febf9b2164 Fix output caching. 2015-12-10 22:11:56 -08:00
fchollet e8e8f7625b Introduce layer output caching. 2015-12-10 22:03:28 -08:00
François Chollet 3afc5ec4f0 Merge pull request #1242 from transcranial/tf-py34
add tensorflow 0.6.0 testing with python 3.4 to travis
2015-12-10 21:32:25 -08:00
fchollet de8e009e98 Merge branch 'mynameisfiber-master' 2015-12-10 21:27:37 -08:00
fchollet 23afa3f3b6 RNN unit tests touch-ups 2015-12-10 21:27:22 -08:00
fchollet b7db44866e Merge branch 'master' of https://github.com/mynameisfiber/keras into mynameisfiber-master 2015-12-10 21:15:23 -08:00
François Chollet 08470363ad Merge pull request #1243 from jeffzhengye/rnn_float32_issue
fix float32 issue: int32 times float32 will generate float64
2015-12-10 15:26:12 -08:00
Leon Chen 314ffececb tf.control_flow_ops -> tf.python.control_flow_ops 2015-12-10 17:54:44 -05:00
jeffzhengye c6f68f8311 use K.cast for compatibility 2015-12-10 17:53:50 -05:00
jeffzhengye f04272e697 fix float32 issue: int32 times float32 will generate float64 2015-12-10 17:11:47 -05:00
Francois Chollet 7401f47da4 Fix Conv2D behavior with dim_ordering='tf' 2015-12-10 14:01:56 -08:00
Leon Chen 67332afc46 get rid of python 2 switch for backends test 2015-12-10 15:53:19 -05:00
Leon Chen b9a5ff41f8 update travis to tensorflow 0.6.0 with additional python 3.4 testing 2015-12-10 15:27:55 -05:00
Francois Chollet 45c5f36399 Remove description in __init__ 2015-12-10 11:18:32 -08:00
Francois Chollet a4fff5aba3 Fix set_input in Sequential container. 2015-12-10 09:38:40 -08:00
Micha Gorelick bf8dc2f61f nit picking 2015-12-10 09:25:59 -05:00
Micha Gorelick bf3ef82fa5 Docstrings, state_updates for non-trainable, tests 2015-12-10 09:23:50 -05:00
Thomas McColgan 8c1c86baf0 remove unused import 2015-12-10 10:06:16 +01:00
Thomas McColgan 7dd3d062ff convert core tests to pytest 2015-12-10 10:05:54 +01:00
Thomas McColgan 8ab149df8a mark skipped tests as skipped, not passed 2015-12-10 09:57:07 +01:00
Thomas McColgan fd70ad6cd2 Merge remote-tracking branch 'origin/test-sprint' into test-sprint
# Conflicts:
#	tests/test_loss_masking.py
#	tests/test_sequential_model.py
2015-12-10 09:54:57 +01:00
Thomas McColgan 6f607f6aa8 some tensorflow tests skipped for now, will raise an issue for these 2015-12-10 09:51:12 +01:00
fchollet 59406a7148 Update callbacks documentation 2015-12-09 20:53:04 -08:00
fchollet 7da1523053 Replace unittest with pytest 2015-12-09 20:41:02 -08:00
Francois Chollet 20ca5befdc Add documentation for summary() feature 2015-12-09 18:32:08 -08:00
Francois Chollet 1ef35e90fc Introduce model.summary() feature 2015-12-09 18:27:53 -08:00
Micha Gorelick e2780cd515 pep8 2015-12-09 17:42:35 -05:00
Micha Gorelick 56fa445ef1 Created new model update list for state updates
In order to propagate state through _predictions_, I created a new
property of the model, `state_updates` that returns any model step
updates that are needed when doing a stateful prediction.  These updates
are identified as *any updates defined by a stateful layer*.
2015-12-09 17:28:45 -05:00
Micha Gorelick 6a80b176e8 Created state_updates for updates on prediction 2015-12-09 16:44:08 -05:00
Francois Chollet c2534964b7 Fix SimpleRNN step function. 2015-12-09 13:11:57 -08:00
Francois Chollet 82353da4dc Remove LRN2D layer. 2015-12-09 13:04:07 -08:00
François Chollet be46766622 Merge pull request #1228 from Sebubu/add_node_doc_update
Updated doc with the missing arguments in Graph
2015-12-09 12:58:07 -08:00
Micha Gorelick 40c48dd174 Run non-optimization updates on predict for stateful RNN's 2015-12-09 15:23:07 -05:00
Severin Bühler 6a498ae420 fixed text 2015-12-09 20:07:07 +01:00
Severin Bühler 1ba7d3d8b8 Updated documentation with the missing arguments 2015-12-09 20:00:06 +01:00
Max Pumperla ca3622cbb7 test softplus 2015-12-09 15:20:36 +01:00
Max Pumperla 3ce80e8e74 Hard sigmoid test 2015-12-09 15:12:44 +01:00
Max Pumperla 02b40ccfe3 activations unit to pytest 2015-12-09 14:49:17 +01:00
Max Pumperla 9e46447080 Sigmoid activation test 2015-12-09 14:43:41 +01:00
Max Pumperla b4cd5fd342 scikit test 2015-12-09 14:34:06 +01:00
Francois Chollet 81787dd2bb Cleanup examples 2015-12-08 18:49:14 -08:00
Francois Chollet 96f3404a57 Fixes wrt RNN statefulness 2015-12-08 16:52:21 -08:00
François Chollet 3620f02258 Merge pull request #1049 from julienr/visutil_improve
utils.visualize_util improvements
2015-12-08 12:35:23 -08:00
Francois Chollet 07ffc76b93 Finalize RNN statefulness functionality 2015-12-08 10:43:06 -08:00
Francois Chollet d400fc4512 Merge branch 'maxpumperla-pooling' 2015-12-08 10:21:39 -08:00
Francois Chollet 6d6481fedd Style fixes 2015-12-08 10:20:04 -08:00
Francois Chollet 93af5e95fd Touch-ups in pooling layers 2015-12-08 10:16:47 -08:00
Francois Chollet 31534bd15e Reference backend page in docs 2015-12-08 10:16:10 -08:00
François Chollet 2586884080 Merge pull request #1216 from farizrahman4u/patch-20
Bug fix - Siamese
2015-12-08 10:10:17 -08:00
Fariz Rahman cee4f72a8e Bug fix - Siamese 2015-12-08 23:11:12 +05:30
François Chollet 36eff96a5d Merge pull request #1213 from farizrahman4u/patch-20
Fix add_shared_node()
2015-12-08 08:51:56 -08:00
François Chollet e7859bf188 Merge pull request #1214 from ozancaglayan/master
doc: Preserve the right order for model.compile
2015-12-08 08:33:40 -08:00
Ozan Caglayan d7d1ee54a5 doc: Preserve the right order for model.compile
Be coherent in the examples of Sequential() and Graph() in terms
of argument ordering.
2015-12-08 16:47:07 +01:00
Fariz Rahman 829dff1866 Fix add_shared_node()
Sorry, there was one more set_name.
2015-12-08 21:12:55 +05:30
Max Pumperla b372bd9dd4 Optimizers and regularizers moved to keras/ 2015-12-08 15:09:33 +01:00
Max Pumperla 36a41720af Remove tests in old structure 2015-12-08 15:07:50 +01:00
Max Pumperla 8b0676cb94 test_models.py used to test all models (Sequential, Graph) 2015-12-08 15:06:34 +01:00
Max Pumperla b864fa974b Renamed local variables in graph model test to fit into test_models.py 2015-12-08 15:06:01 +01:00
Max Pumperla d5b4d04b5b PEP-8 linting for advanced activations 2015-12-08 14:29:38 +01:00
Max Pumperla 57d5a7ca78 Move embedding test to keras/layers 2015-12-08 13:48:59 +01:00
Max Pumperla fe6c554e7a Moved dataset tests to keras/datasets 2015-12-08 13:44:03 +01:00
Max Pumperla bfe113556f Merge branch 'master' into test-sprint 2015-12-08 13:34:13 +01:00
Fariz Rahman de2884af79 Bug fix - Cos 2015-12-08 15:23:02 +05:30
Max Pumperla f46f04d545 Removed self key word in reshaping test 2015-12-08 08:11:12 +01:00
Max Pumperla 7d0ed02cc1 refactored tf backend pooling 2015-12-08 07:45:17 +01:00
Max Pumperla 075c34d037 Removed comment and sum mode 2015-12-08 07:44:56 +01:00
Max Pumperla 313ebae4d2 renaming mean -> average pooling 2015-12-08 07:28:05 +01:00
Max Pumperla ace2d94706 Merge master into pooling 2015-12-08 07:19:58 +01:00
fchollet 31cf6b16f4 Fix Adadelta serialization 2015-12-07 21:27:04 -08:00
François Chollet b126b6328a Merge pull request #1205 from smauq/progbar-precision
Increased Progbar precision
2015-12-07 20:36:30 -08:00
smauq f200b19056 Increased Progbar precision 2015-12-08 03:08:16 +02:00
François Chollet 526c8a58b3 Merge pull request #1197 from farizrahman4u/patch-18
Travis badge
2015-12-07 12:07:39 -08:00
Max Pumperla 51c1c94439 Minor error in conv, some linting & fixing theano backend 2015-12-07 18:31:00 +01:00
Max Pumperla c19cdf4e60 Adapted test for convolutional layers 2015-12-07 14:39:00 +01:00
Max Pumperla db0ae8216c Shape inference test 2015-12-07 14:35:50 +01:00
Max Pumperla 362e0e95aa clean up in convolutional layer 2015-12-07 14:35:37 +01:00
Max Pumperla e99dbf5857 Fixed typo in tensorflow backend 2015-12-07 14:35:14 +01:00
Max Pumperla 4854042813 Fixed formula for reshaping in image2neibs for theano 2015-12-07 14:32:07 +01:00
François Chollet a18755dc67 Merge pull request #1196 from farizrahman4u/patch-11
Fix add_shared_node()
2015-12-06 11:50:51 -08:00
Thomas McColgan ae375dd638 start testing the changed Layer interface 2015-12-06 20:42:22 +01:00
Fariz Rahman f5ff9f07d9 Travis badge 2015-12-07 01:02:32 +05:30
Thomas McColgan 14a57c10e6 tests for advanced activations
thresholded activations

parametric softplus

some bugfixes

fix error caused by calling layer.build on PReLU

seed the rng in every test individually to make them deterministic
2015-12-06 20:02:54 +01:00
Fariz Rahman 9031b77843 Fix add_shared_node()
Remove set_name
2015-12-07 00:02:35 +05:30
François Chollet 64226e4335 Merge pull request #1194 from smauq/master
Fix broken links in README.md
2015-12-06 09:41:37 -08:00
Max Pumperla b34fdbf12a docs for new MeanPooling layers 2015-12-06 18:14:38 +01:00
Max Pumperla 4259071ce0 MeanPooling1D/2D introduced 2015-12-06 18:09:56 +01:00
Max Pumperla 5ef38bd25b renaming in test 2015-12-06 18:09:27 +01:00
Max Pumperla 47a9735a6d generalized pooling in theano 2015-12-06 18:09:09 +01:00
Max Pumperla fc6c6aa6f0 generalized pooling in tf 2015-12-06 18:08:47 +01:00
smauq f18fa4af50 Fix broken links in README.md 2015-12-06 14:32:19 +02:00
François Chollet 470555f10b Merge pull request #1191 from jfsantos/patch-7
Fix typo
2015-12-05 21:49:51 -08:00
João Felipe Santos 7ab44d8d9c Fix typo
idomatic -> idiomatic
2015-12-05 23:51:58 -05:00
fchollet 3ce1883afc Update documentation 2015-12-05 19:32:14 -08:00
fchollet 93718562db One more attempt at fixing indeterminism on Travis 2015-12-05 18:58:23 -08:00
fchollet 5fe40861ce New attempt at fixing test indeterminism 2015-12-05 18:31:42 -08:00
fchollet 182e84d09b Attempt to reduce indeterminism in tests 2015-12-05 17:56:59 -08:00
François Chollet 337f86b40b Merge pull request #1190 from smauq/master
Fixed the progbar in the cifar10 example
2015-12-05 17:21:54 -08:00
Francois Chollet 1427815137 Refresh unit tests. 2015-12-05 17:02:13 -08:00
François Chollet a0c03ad577 Merge pull request #1187 from EderSantana/patch-3
tip for better issue search and FAQ link
2015-12-05 16:34:21 -08:00
François Chollet 4c742737a3 Merge pull request #1189 from EderSantana/call2
Fix tolerance comparing __call__ to np.dot
2015-12-05 16:33:52 -08:00
smauq c7b7fae654 Fix progbar 2015-12-06 02:20:27 +02:00
François Chollet 8e64a6ce77 Merge pull request #1184 from consciousnesss/switch_tasks_to_pytest
Switch tasks to pytest
2015-12-05 16:14:17 -08:00
EderSantana acbae2cebb Fix tolerance comparing __call__ to np.dot 2015-12-05 19:14:04 -05:00
Eder Santana 9789858d0e Update CONTRIBUTING.md
Link to FAQ
2015-12-05 18:52:10 -05:00
Eder Santana d23c02807b Update CONTRIBUTING.md
I'm sure some people forget to do that.
2015-12-05 18:48:02 -05:00
Francois Chollet c11bdd850a Update CONTRIBUTING.md 2015-12-05 15:35:20 -08:00
Francois Chollet 897942783a Small change in CONTRIBUTING.md 2015-12-05 15:27:54 -08:00
Francois Chollet 74b37bf87a Add CONTRIBUTING.md 2015-12-05 15:26:17 -08:00
Francois Chollet 05a82e957c Fix some typos in docs 2015-12-05 15:25:28 -08:00
François Chollet 55e62e587f Merge pull request #1162 from EderSantana/call
Add __call__
2015-12-05 14:33:12 -08:00
EderSantana 614e48e612 fix test problem due to type casting 2015-12-05 17:05:08 -05:00
EderSantana b55f991014 Update docstrings 2015-12-05 17:00:29 -05:00
EderSantana 4aa713b1dc Update docstrings 2015-12-05 16:57:28 -05:00
EderSantana 369fb05436 Fix package names and floatx 2015-12-05 16:47:39 -05:00
olegsinyavskiy cac308e88d few refactoring ideas 2015-12-05 13:36:12 -08:00
olegsinyavskiy 6b7410dc11 few ignores 2015-12-05 13:35:37 -08:00
olegsinyavskiy beb5151b7a Summary of changes:
- switch to using pytest
 - fix determinstic seed
2015-12-05 13:17:55 -08:00
François Chollet e8818c841c Merge pull request #1182 from EderSantana/get_initial_states
Add get_initial_state method to Recurrent
2015-12-05 12:16:15 -08:00
EderSantana b691579fff Add get_initial_state method to Recurrent
With this method, I believe no other recurrent method will need to overwrite
get_output.
2015-12-05 14:38:38 -05:00
Max Pumperla 82971dedc4 Fixed get_config in Pooling1d/2d 2015-12-05 19:38:41 +01:00
Max Pumperla 61e5ea1f87 Fix for 2d pooling 2015-12-05 19:14:41 +01:00
Max Pumperla 6bfb3c648b Removed old MaxPool2D layer 2015-12-05 18:37:25 +01:00
Max Pumperla c600a4450c Removed GPL test part 2015-12-05 18:36:28 +01:00
Max Pumperla c3f3db64af Added Pooling2D base class & restructured MaxPooling2D 2015-12-05 18:35:59 +01:00
Max Pumperla a4f334c1ab Added Pooling1D base class & restructured MaxPooling1D 2015-12-05 18:22:10 +01:00
Thomas McColgan 217ce6f56a use specific version of coverage that works with coveralls 2015-12-05 14:36:23 +01:00
François Chollet ada2f2fa0d Merge pull request #1176 from dsteiner93/master
Fix mistake in backend documentation
2015-12-04 18:54:01 -08:00
dsteiner93 0f42d2db90 Fix mistake in backend documentation
The comments for all-zeros and all-ones were reversed in backend.md
2015-12-04 18:36:31 -08:00
EderSantana 90e9da4093 Fix tensorflow test 2015-12-04 20:30:36 -05:00
Oleg Sinyavskiy 194587e2a5 Merge pull request #3 from fchollet/master
update
2015-12-04 17:03:11 -08:00
Francois Chollet 61b30997eb Fix epsilon in objectives 2015-12-04 15:09:52 -08:00
EderSantana e1c7d287dc Fix tests 2015-12-04 16:44:49 -05:00
EderSantana f2e4e2ddce clean up code 2015-12-04 16:35:47 -05:00
EderSantana 4b40c34c53 Fix test_call.py docstring 2015-12-04 11:22:21 -05:00
EderSantana b002d00347 Add tests 2015-12-04 11:17:30 -05:00
EderSantana 3306f88f6d Merge branch 'call' of https://github.com/edersantana/keras into call 2015-12-04 10:08:58 -05:00
Francois Chollet 7b401f0b99 test_sequential save file cleanup 2015-12-03 22:35:52 -08:00
Francois Chollet 6c1ce0f6e9 Reintroduce image_shape and filter_shape in conv2d 2015-12-03 22:03:28 -08:00
Eder Santana ff647e04ee rebase 2015-12-03 14:50:44 -05:00
EderSantana 5f62723473 Merge branch 'master' of https://github.com/fchollet/keras into call 2015-12-03 14:48:55 -05:00
Francois Chollet f295ecb302 Actually fix floatx encoding 2015-12-03 11:23:51 -08:00
Eder Santana 9c3060d8ca Update common.py
Python 3 compatible
2015-12-03 14:17:25 -05:00
Eder Santana d2a1504060 Update core.py
Inherit from Pass from object
2015-12-03 14:06:47 -05:00
Francois Chollet 2161910a53 Fix floatx encoding on Python3 2015-12-03 11:02:57 -08:00
EderSantana 861c8a8e21 Add __call__ 2015-12-03 13:45:54 -05:00
François Chollet bb17fc7af1 Merge pull request #1147 from tboquet/fix_doc
Description of validation_data in the Graph doc
2015-12-03 10:44:18 -08:00
François Chollet 3e9f5c204a Merge pull request #1150 from consciousnesss/speed_up_keras_test_infrastracture
Speed up keras test infrastructure
2015-12-03 10:34:43 -08:00
François Chollet 39a457cccd Merge pull request #1158 from EderSantana/ascii
convert floatx to ascii
2015-12-03 10:33:30 -08:00
EderSantana 92f66a279a convert floatx to ascii 2015-12-03 10:28:14 -05:00
tboquet 5b1038abed and 2015-12-03 08:37:40 -05:00
olegsinyavskiy 4781f40eb6 These changes speeds up travis testing time 2 times using some pytest and travis configuration options.
Summary of changes:
 - py.test is configured to display test profiling information that shows 10 slowest tests. This would allow additional speed ups if anyone has ideas on some particular test. The slowest test is usually cifar dataset test and tensorflow convolutions. It seems that there are some other IT tests that could be sped up.
 - py.test is configured to run with pytest-xdist with 2 processes in parallel because travis does provide multicore support (1.5 cores) and because the slowest cifar test spends time on download which can run in parallel with other tests.
 - travis is configured to split backend tests into test matrix to make parallel theano vs tensorflow testing as opposed to rerun all the tests twice for python 2.7.
 - pickle filenames in tests are renamed to avoid clashes during multiprocessing
2015-12-02 22:09:59 -08:00
tboquet 58121fa855 deleted out 2015-12-02 22:23:31 -05:00
tboquet 82befe8384 type redo 2015-12-02 22:14:14 -05:00
tboquet f49e52a291 tuple to dict in the graph documentation 2015-12-02 22:12:42 -05:00
François Chollet 7bb897dff1 Merge pull request #1143 from jfsantos/patch-6
Fix typo
2015-12-02 17:56:39 -08:00
João Felipe Santos 664ada1fd2 Fix typo
denses -> dense
2015-12-02 17:27:52 -05:00
Francois Chollet 0933147dc8 Fix typo in recurrent. 2015-12-02 10:00:22 -08:00
Francois Chollet 1c6ab36c63 Fix typo in recurrent. 2015-12-02 09:57:19 -08:00
François Chollet 535af0b17d Merge pull request #1137 from stonebig/patch-1
version adjustement
2015-12-02 09:50:23 -08:00
Francois Chollet 9f917f265c Merge branch 'master' of https://github.com/fchollet/keras 2015-12-02 09:31:05 -08:00
Francois Chollet 54c025ac26 Remove neural turing machine, unviable in 0.3.0 2015-12-02 09:30:41 -08:00
stonebig 6fe65a6a1d version adjustement 2015-12-02 18:28:02 +01:00
François Chollet 5956dbe8fa Merge pull request #1096 from tboquet/master
* fixed validation y size for weighting
2015-12-01 20:54:35 -08:00
Francois Chollet be39e25b86 Fix Theano conv2d on cudnn with border_mode=same 2015-12-01 16:23:13 -08:00
Francois Chollet 905770099c Fix recurrent batch_input_shape error message 2015-12-01 16:17:49 -08:00
Francois Chollet 2553f07c3c Add support for batch_input_shape kwarg. 2015-12-01 15:56:12 -08:00
Francois Chollet af93198bde Documentation update. 2015-12-01 15:55:43 -08:00
Francois Chollet aaa47f0d20 Minor doc fixes 2015-12-01 10:03:37 -08:00
tboquet 1a7c6627ac * fixed validation y size for weighting 2015-11-27 23:17:05 -05:00
Julien Rebetez 98c6c8b3b6 Improve visualize_util to support container layers with optional recursion when plotting.
Also add an option to show layer shapes
2015-11-23 10:09:18 +01:00
farizrahman4u 611c4851e3 Merge pull request #1 from fchollet/master
Update fork
2015-10-25 15:36:42 +05:30
185 arquivos alterados com 29229 adições e 9393 exclusões
+8
Ver Arquivo
@@ -9,3 +9,11 @@ keras/datasets/temp/*
docs/site/*
docs/theme/*
tags
Keras.egg-info
# test-related
.coverage
.cache
# developer environments
.idea
+40 -9
Ver Arquivo
@@ -1,9 +1,20 @@
sudo: required
dist: trusty
language: python
python:
- "2.7"
- "3.4"
matrix:
include:
- python: 3.4
env: KERAS_BACKEND=theano
- python: 3.4
env: KERAS_BACKEND=tensorflow
- python: 2.7
env: KERAS_BACKEND=theano
- python: 2.7
env: KERAS_BACKEND=tensorflow
- python: 2.7
env: KERAS_BACKEND=theano TEST_MODE=INTEGRATION_TESTS
- python: 2.7
env: KERAS_BACKEND=theano TEST_MODE=PEP8
install:
# code below is taken from http://conda.pydata.org/docs/travis.html
# We do this conditionally because it saves us some downloading if the
@@ -23,20 +34,40 @@ install:
- conda create -q -n test-environment python=$TRAVIS_PYTHON_VERSION numpy scipy matplotlib pandas pytest h5py
- source activate test-environment
- pip install pytest-cov python-coveralls
- pip install pytest-cov python-coveralls pytest-xdist coverage==3.7.1 #we need this version of coverage for coveralls.io to work
- pip install pep8 pytest-pep8
- pip install git+git://github.com/Theano/Theano.git
# install PIL for preprocessing tests
- if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
conda install pil;
elif [[ "$TRAVIS_PYTHON_VERSION" == "3.4" ]]; then
conda install Pillow;
fi
- python setup.py install
# install TensorFlow
- if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.5.0-cp27-none-linux_x86_64.whl;
pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.9.0-cp27-none-linux_x86_64.whl;
elif [[ "$TRAVIS_PYTHON_VERSION" == "3.4" ]]; then
pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.9.0-cp34-cp34m-linux_x86_64.whl;
fi
# command to run tests
script:
- PYTHONPATH=$PWD:$PYTHONPATH py.test -v --cov-report term-missing --cov keras tests/
- if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
sed -i -e 's/theano/tensorflow/g' ~/.keras/keras.json;
PYTHONPATH=$PWD:$PYTHONPATH py.test -v --cov-report term-missing --cov keras tests/;
# run keras backend init to initialize backend config
- python -c "import keras.backend"
# create dataset directory to avoid concurrent directory creation at runtime
- mkdir ~/.keras/datasets
# set up keras backend
- sed -i -e 's/"backend":[[:space:]]*"[^"]*/"backend":\ "'$KERAS_BACKEND'/g' ~/.keras/keras.json;
- echo -e "Running tests with the following config:\n$(cat ~/.keras/keras.json)"
- if [[ "$TEST_MODE" == "INTEGRATION_TESTS" ]]; then
PYTHONPATH=$PWD:$PYTHONPATH py.test tests/integration_tests;
elif [[ "$TEST_MODE" == "PEP8" ]]; then
PYTHONPATH=$PWD:$PYTHONPATH py.test --pep8 -m pep8 -n0;
else
PYTHONPATH=$PWD:$PYTHONPATH py.test tests/ --ignore=tests/integration_tests;
fi
after_success:
- coveralls
+65
Ver Arquivo
@@ -0,0 +1,65 @@
# On Github Issues and Pull Requests
Found a bug? Have a new feature to suggest? Want to contribute changes to the codebase? Make sure to read this first.
## Bug reporting
Your code doesn't work, and you have determined that the issue lies with Keras? Follow these steps to report a bug.
1. Your bug may already be fixed. Make sure to update to the current Keras master branch, as well as the latest Theano/TensorFlow master branch.
To easily update Theano: `pip install git+git://github.com/Theano/Theano.git --upgrade`
2. Search for similar issues. Make sure to delete `is:open` on the issue search to find solved tickets as well. It's possible somebody has encountered this bug already. Also remember to check out Keras' [FAQ](http://keras.io/faq/). Still having a problem? Open an issue on Github to let us know.
3. Make sure you provide us with useful information about your configuration: what OS are you using? What Keras backend are you using? Are you running on GPU? If so, what is your version of Cuda, of cuDNN? What is your GPU?
4. Provide us with a script to reproduce the issue. This script should be runnable as-is and should not require external data download (use randomly generated data if you need to run a model on some test data). We recommend that you use Github Gists to post your code. Any issue that cannot be reproduced is likely to be closed.
5. If possible, take a stab at fixing the bug yourself --if you can!
The more information you provide, the easier it is for us to validate that there is a bug and the faster we'll be able to take action. If you want your issue to be resolved quickly, following the steps above is crucial.
## Requesting a Feature
You can also use Github issues to request features you would like to see in Keras, or changes in the Keras API.
1. Provide a clear and detailed explanation of the feature you want and why it's important to add. Keep in mind that we want features that will be useful to the majority of our users and not just a small subset. If you're just targeting a minority of users, consider writing an add-on library for Keras. It is crucial for Keras to avoid bloating the API and codebase.
2. Provide code snippets demonstrating the API you have in mind and illustrating the use cases of your feature. Of course, you don't need to write any real code at this point!
3. After discussing the feature you may choose to attempt a Pull Request. If you're at all able, start writing some code. We always have more work to do than time to do it. If you can write some code then that will speed the process along.
## Pull Requests
We love pull requests. Here's a quick guide:
1. If your PR introduces a change in functionality, make sure you start by opening an issue to discuss whether the change should be made, and how to handle it. This will save you from having your PR closed down the road! Of course, if your PR is a simple bug fix, you don't need to do that.
2. Write the code. This is the hard part!
3. Make sure any new function or class you introduce has proper docstrings. Make sure any code you touch still has up-to-date docstrings and documentation.
4. Write tests. Your code should have full unit test coverage. If you want to see your PR merged promptly, this is crucial.
5. Run our test suite locally. It's easy: from the Keras folder, simply run: `py.test tests/`.
- You will need to install `pytest`, `coveralls`, `pytest-cov`, `pytest-xdist`: `pip install pytest pytest-cov python-coveralls pytest-xdist pep8 pytest-pep8`
6. Make sure all tests are passing:
- with the Theano backend, on Python 2.7 and Python 3.5
- with the TensorFlow backend, on Python 2.7
7. We use PEP8 syntax conventions, but we aren't dogmatic when it comes to line length. Make sure your lines stay reasonably sized, though. To make your life easier, we recommend running a PEP8 linter:
- Install PEP8 packages: `pip install pep8 pytest-pep8 autopep8`
- Run a standalone PEP8 check: `py.test --pep8 -m pep8`
- You can automatically fix some PEP8 error by running: `autopep8 -i --select <errors> <FILENAME>` for example: `autopep8 -i --select E128 tests/keras/backend/test_backends.py`
8. When committing, use appropriate, descriptive commit messages. Make sure that your branch history is not a string of "bug fix", "fix", "oops", etc. When submitting your PR, squash your commits into a single commit with an appropriate commit message, to make sure the project history stays clean and readable. See ['rebase and squash'](http://rebaseandsqua.sh/) for technical help on how to squash your commits.
9. Update the documentation. If introducing new functionality, make sure you include code snippets demonstrating the usage of your new feature.
10. Submit your PR. If your changes have been approved in a previous discussion, and if you have complete (and passing) unit tests, your PR is likely to be merged promptly. Otherwise, well...
## Adding new examples
Even if you don't contribute to the Keras source code, if you have an application of Keras that is concise and powerful, please consider adding it to our collection of examples. [Existing examples](https://github.com/fchollet/keras/tree/master/examples) show idiomatic Keras code: make sure to keep your own script in the same spirit.
+9
Ver Arquivo
@@ -0,0 +1,9 @@
Please make sure that the boxes below are checked before you submit your issue. Thank you!
- [ ] Check that you are up-to-date with the master branch of Keras. You can update with:
pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps
- [ ] If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with:
pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps
- [ ] Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).
+40 -47
Ver Arquivo
@@ -1,20 +1,25 @@
# Keras: Deep Learning library for Theano and TensorFlow
# Keras: Deep Learning library for TensorFlow and Theano
[![Build Status](https://travis-ci.org/fchollet/keras.svg?branch=master)](https://travis-ci.org/fchollet/keras)
[![PyPI version](https://badge.fury.io/py/keras.svg)](https://badge.fury.io/py/keras)
[![license](https://img.shields.io/github/license/mashape/apistatus.svg?maxAge=2592000)](https://github.com/fchollet/keras/blob/master/LICENSE)
[![Join the chat at https://gitter.im/Keras-io/Lobby](https://badges.gitter.im/Keras-io/Lobby.svg)](https://gitter.im/Keras-io/Lobby)
## You have just found Keras.
Keras is a minimalist, highly modular neural networks library, written in Python and capable of running either on top of either [TensorFlow](https://github.com/tensorflow/tensorflow) or [Theano](https://github.com/Theano/Theano). It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.
Keras is a minimalist, highly modular neural networks library, written in Python and capable of running on top of either [TensorFlow](https://github.com/tensorflow/tensorflow) or [Theano](https://github.com/Theano/Theano). It was developed with a focus on enabling fast experimentation. *Being able to go from idea to result with the least possible delay is key to doing good research.*
Use Keras if you need a deep learning library that:
- allows for easy and fast prototyping (through total modularity, minimalism, and extensibility).
- supports both convolutional networks and recurrent networks, as well as combinations of the two.
- supports arbitrary connectivity schemes (including multi-input and multi-output training).
- runs seamlessly on CPU and GPU.
- Allows for easy and fast prototyping (through total modularity, minimalism, and extensibility).
- Supports both convolutional networks and recurrent networks, as well as combinations of the two.
- Supports arbitrary connectivity schemes (including multi-input and multi-output training).
- Runs seamlessly on CPU and GPU.
Read the documentation at [Keras.io](http://keras.io).
Keras is compatible with:
- __Python 2.7-3.5__ with the Theano backend
- __Python 2.7__ with the TensorFlow backend
Keras is compatible with: __Python 2.7-3.5__.
------------------
@@ -36,9 +41,9 @@ Keras is compatible with:
## Getting started: 30 seconds to Keras
The core datastructure of Keras is a __model__, a way to organize layers. There are two types of models: [`Sequential`](/models/#sequential) and [`Graph`](/models/#graph).
The core data structure of Keras is a __model__, a way to organize layers. The main type of model is the [`Sequential`](http://keras.io/getting-started/sequential-model-guide) model, a linear stack of layers. For more complex architectures, you should use the [Keras functional API](http://keras.io/getting-started/functional-api-guide).
Here's the `Sequential` model (a linear pile of layers):
Here's the `Sequential` model:
```python
from keras.models import Sequential
@@ -49,17 +54,17 @@ model = Sequential()
Stacking layers is as easy as `.add()`:
```python
from keras.layers.core import Dense, Activation
from keras.layers import Dense, Activation
model.add(Dense(output_dim=64, input_dim=100, init="glorot_uniform"))
model.add(Dense(output_dim=64, input_dim=100))
model.add(Activation("relu"))
model.add(Dense(output_dim=10, init="glorot_uniform"))
model.add(Dense(output_dim=10))
model.add(Activation("softmax"))
```
Once your model looks good, configure its learning process with `.compile()`:
```python
model.compile(loss='categorical_crossentropy', optimizer='sgd')
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
```
If you need to, you can further configure your optimizer. A core principle of Keras is to make things reasonably simple, while allowing the user to be fully in control when they need to (the ultimate control being the easy extensibility of the source code).
@@ -80,7 +85,7 @@ model.train_on_batch(X_batch, Y_batch)
Evaluate your performance in one line:
```python
objective_score = model.evaluate(X_test, Y_test, batch_size=32)
loss_and_metrics = model.evaluate(X_test, Y_test, batch_size=32)
```
Or generate predictions on new data:
@@ -89,11 +94,14 @@ classes = model.predict_classes(X_test, batch_size=32)
proba = model.predict_proba(X_test, batch_size=32)
```
Building a network of LSTMs, a deep CNN, a Neural Turing Machine, a word2vec embedder or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful?
Building a question answering system, an image classification model, a Neural Turing Machine, a word2vec embedder or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful?
Have a look at these [starter examples](http://keras.io/examples/).
For a more in-depth tutorial about Keras, you can check out:
In the [examples folder](https://github.com/fchollet/keras/tree/master/examples) of the repo, you will find more advanced models: question-answering with memory networks, text generation with stacked LSTMs, neural turing machines, etc.
- [Getting started with the Sequential model](http://keras.io/getting-started/sequential-model-guide)
- [Getting started with the functional API](http://keras.io/getting-started/functional-api-guide)
In the [examples folder](https://github.com/fchollet/keras/tree/master/examples) of the repository, you will find more advanced models: question-answering with memory networks, text generation with stacked LSTMs, etc.
------------------
@@ -108,35 +116,33 @@ Keras uses the following dependencies:
- HDF5 and h5py (optional, required if you use model saving/loading functions)
- Optional but recommended if you use CNNs: cuDNN.
When using the Theano backend:
- Theano
- [See installation instructions](http://deeplearning.net/software/theano/install.html#install).
**Note**: You should use the latest version of Theano, not the PyPI version. Install it with:
```
sudo pip install git+git://github.com/Theano/Theano.git
```
*When using the TensorFlow backend:*
When using the TensorFlow backend:
- TensorFlow
- [See installation instructions](https://github.com/tensorflow/tensorflow#download-and-setup).
To install, `cd` to the Keras folder and run the install command:
```
*When using the Theano backend:*
- Theano
- [See installation instructions](http://deeplearning.net/software/theano/install.html#install).
To install Keras, `cd` to the Keras folder and run the install command:
```sh
sudo python setup.py install
```
You can also install Keras from PyPI:
```
```sh
sudo pip install keras
```
------------------
## Switching from Theano to TensorFlow
## Switching from TensorFlow to Theano
By default, Keras will use Theano as its tensor manipulation library. [Follow these instructions](http://keras.io/backend/) to configure the Keras backend.
By default, Keras will use TensorFlow as its tensor manipulation library. [Follow these instructions](http://keras.io/backend/) to configure the Keras backend.
------------------
@@ -145,20 +151,7 @@ By default, Keras will use Theano as its tensor manipulation library. [Follow th
You can ask questions and join the development discussion on the [Keras Google group](https://groups.google.com/forum/#!forum/keras-users).
------------------
## Contribution Guidelines
Keras welcomes all contributions from the community.
- Keep a pragmatic mindset and avoid bloat. Only add to the source if that is the only path forward.
- New features should be documented. Make sure you update the documentation along with your Pull Request.
- Any new function or class should have a proper docstring.
- The documentation for every new feature should include a usage example in the form of a code snippet.
- All changes should be tested. Make sure any new feature you add has a corresponding unit test.
- Please no Pull Requests about coding style.
- Even if you don't contribute to the Keras source code, if you have an application of Keras that is concise and powerful, please consider adding it to our collection of [examples](https://github.com/fchollet/keras/tree/master/examples).
You can also post bug reports and feature requests in [Github issues](https://github.com/fchollet/keras/issues). Make sure to read [our guidelines](https://github.com/fchollet/keras/blob/master/CONTRIBUTING.md) first.
------------------
@@ -172,4 +165,4 @@ Keras was initially developed as part of the research effort of project ONEIROS
>_"Oneiroi are beyond our unravelling --who can be sure what tale they tell? Not all that men look for comes to pass. Two gates there are that give passage to fleeting Oneiroi; one is made of horn, one of ivory. The Oneiroi that pass through sawn ivory are deceitful, bearing a message that will not be fulfilled; those that come out through polished horn have truth behind them, to be accomplished for men who see them."_ Homer, Odyssey 19. 562 ff (Shewring translation).
------------------
------------------
+46
Ver Arquivo
@@ -0,0 +1,46 @@
FROM nvidia/cuda:7.5-cudnn5-devel
ENV CONDA_DIR /opt/conda
ENV PATH $CONDA_DIR/bin:$PATH
RUN mkdir -p $CONDA_DIR && \
echo export PATH=$CONDA_DIR/bin:'$PATH' > /etc/profile.d/conda.sh && \
apt-get update && \
apt-get install -y wget git libhdf5-dev g++ graphviz && \
wget --quiet https://repo.continuum.io/miniconda/Miniconda3-3.9.1-Linux-x86_64.sh && \
echo "6c6b44acdd0bc4229377ee10d52c8ac6160c336d9cdd669db7371aa9344e1ac3 *Miniconda3-3.9.1-Linux-x86_64.sh" | sha256sum -c - && \
/bin/bash /Miniconda3-3.9.1-Linux-x86_64.sh -f -b -p $CONDA_DIR && \
rm Miniconda3-3.9.1-Linux-x86_64.sh
ENV NB_USER keras
ENV NB_UID 1000
RUN useradd -m -s /bin/bash -N -u $NB_UID $NB_USER && \
mkdir -p $CONDA_DIR && \
chown keras $CONDA_DIR -R && \
mkdir -p /src && \
chown keras /src
USER keras
# Python
ARG python_version=3.5.1
ARG tensorflow_version=0.9.0rc0-cp35-cp35m
RUN conda install -y python=${python_version} && \
pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-${tensorflow_version}-linux_x86_64.whl && \
pip install git+git://github.com/Theano/Theano.git && \
pip install ipdb pytest pytest-cov python-coveralls coverage==3.7.1 pytest-xdist pep8 pytest-pep8 pydot_ng && \
conda install Pillow scikit-learn notebook pandas matplotlib nose pyyaml six h5py && \
pip install git+git://github.com/fchollet/keras.git && \
conda clean -yt
ADD theanorc /home/keras/.theanorc
ENV PYTHONPATH='/src/:$PYTHONPATH'
WORKDIR /src
EXPOSE 8888
CMD jupyter notebook --port=8888 --ip=0.0.0.0
+26
Ver Arquivo
@@ -0,0 +1,26 @@
help:
@cat Makefile
DATA?="${HOME}/Data"
GPU?=0
DOCKER_FILE=Dockerfile
DOCKER=GPU=$(GPU) nvidia-docker
BACKEND=tensorflow
TEST=tests/
SRC=$(shell dirname `pwd`)
build:
docker build -t keras --build-arg python_version=3.5 -f $(DOCKER_FILE) .
bash: build
$(DOCKER) run -it -v $(SRC):/src -v $(DATA):/data --env KERAS_BACKEND=$(BACKEND) keras bash
ipython: build
$(DOCKER) run -it -v $(SRC):/src -v $(DATA):/data --env KERAS_BACKEND=$(BACKEND) keras ipython
notebook: build
$(DOCKER) run -it -v $(SRC):/src -v $(DATA):/data --net=host --env KERAS_BACKEND=$(BACKEND) keras
test: build
$(DOCKER) run -it -v $(SRC):/src -v $(DATA):/data --env KERAS_BACKEND=$(BACKEND) keras py.test $(TEST)
+58
Ver Arquivo
@@ -0,0 +1,58 @@
# Using Keras via Docker
This directory contains `Dockerfile` to make it easy to get up and running with
Keras via [Docker](http://www.docker.com/).
## Installing Docker
General installation instructions are
[on the Docker site](https://docs.docker.com/installation/), but we give some
quick links here:
* [OSX](https://docs.docker.com/installation/mac/): [docker toolbox](https://www.docker.com/toolbox)
* [ubuntu](https://docs.docker.com/installation/ubuntulinux/)
## Running the container
We are using `Makefile` to simplify docker commands within make commands.
Build the container and start a jupyter notebook
$ make notebook
Build the container and start an iPython shell
$ make ipython
Build the container and start a bash
$ make bash
For GPU support install NVidia drivers (ideally latest) and
[nvidia-docker](https://github.com/NVIDIA/nvidia-docker). Run using
$ make notebook GPU=0 # or [ipython, bash]
Switch between Theano and TensorFlow
$ make notebook BACKEND=theano
$ make notebook BACKEND=tensorflow
Mount a volume for external data sets
$ make DATA=~/mydata
Prints all make tasks
$ make help
You can change Theano parameters by editing `/docker/theanorc`.
Note: If you would have a problem running nvidia-docker you may try the old way
we have used. But it is not recommended. If you find a bug in the nvidia-docker report
it there please and try using the nvidia-docker as described above.
$ export CUDA_SO=$(\ls /usr/lib/x86_64-linux-gnu/libcuda.* | xargs -I{} echo '-v {}:{}')
$ export DEVICES=$(\ls /dev/nvidia* | xargs -I{} echo '--device {}:{}')
$ docker run -it -p 8888:8888 $CUDA_SO $DEVICES gcr.io/tensorflow/tensorflow:latest-gpu
+5
Ver Arquivo
@@ -0,0 +1,5 @@
[global]
floatX = float32
optimizer=None
device = gpu
+5 -2
Ver Arquivo
@@ -5,5 +5,8 @@ Our documentation uses extended Markdown, as implemented by [MkDocs](http://mkdo
## Building the documentation
- install MkDocs: `sudo pip install mkdocs`
- `cd` to the `docs/` folder and run: `mkdocs serve`
- install MkDocs: `pip install mkdocs`
- `cd` to the `docs/` folder and run:
- `python autogen.py`
- `mkdocs serve` # Starts a local webserver: [localhost:8000](localhost:8000)
- `mkdocs build` # Builds a static site in "site" directory
+448
Ver Arquivo
@@ -0,0 +1,448 @@
# -*- coding: utf-8 -*-
'''
General documentation architecture:
Home
Index
- Getting started
Getting started with the sequential model
Getting started with the functional api
Examples
FAQ
Installation guide
- Models
About Keras models
explain when one should use Sequential or functional API
explain compilation step
explain weight saving, weight loading
explain serialization, deserialization
Sequential
Model (functional API)
- Layers
About Keras layers
explain common layer functions: get_weights, set_weights, get_config
explain input_shape
explain usage on non-Keras tensors
Core layers
Convolutional
Recurrent
Embeddings
Normalization
Advanced activations
Noise
- Preprocessing
Image preprocessing
Text preprocessing
Sequence preprocessing
Objectives
Optimizers
Activations
Callbacks
Datasets
Backend
Initializations
Regularizers
Constraints
Visualization
Scikit-learn API
'''
from __future__ import print_function
from __future__ import unicode_literals
import re
import inspect
import os
import shutil
import sys
if sys.version[0] == '2':
reload(sys)
sys.setdefaultencoding('utf8')
from keras.layers import convolutional
from keras.layers import pooling
from keras.layers import local
from keras.layers import recurrent
from keras.layers import core
from keras.layers import noise
from keras.layers import normalization
from keras.layers import advanced_activations
from keras.layers import embeddings
from keras.layers import wrappers
from keras import optimizers
from keras import callbacks
from keras import models
from keras.engine import topology
from keras import objectives
from keras import backend
from keras import constraints
from keras import activations
from keras import regularizers
EXCLUDE = {
'Optimizer',
'Wrapper',
'get_session',
'set_session',
'CallbackList',
}
PAGES = [
{
'page': 'models/sequential.md',
'functions': [
models.Sequential.compile,
models.Sequential.fit,
models.Sequential.evaluate,
models.Sequential.predict,
models.Sequential.predict_classes,
models.Sequential.predict_proba,
models.Sequential.train_on_batch,
models.Sequential.test_on_batch,
models.Sequential.predict_on_batch,
models.Sequential.fit_generator,
models.Sequential.evaluate_generator,
models.Sequential.predict_generator,
],
},
{
'page': 'models/model.md',
'functions': [
models.Model.compile,
models.Model.fit,
models.Model.evaluate,
models.Model.predict,
models.Model.train_on_batch,
models.Model.test_on_batch,
models.Model.predict_on_batch,
models.Model.fit_generator,
models.Model.evaluate_generator,
models.Model.predict_generator,
models.Model.get_layer,
]
},
{
'page': 'layers/core.md',
'classes': [
core.Dense,
core.Activation,
core.Dropout,
core.SpatialDropout2D,
core.SpatialDropout3D,
core.Flatten,
core.Reshape,
core.Permute,
core.RepeatVector,
topology.Merge,
core.Lambda,
core.ActivityRegularization,
core.Masking,
core.Highway,
core.MaxoutDense,
core.TimeDistributedDense,
],
},
{
'page': 'layers/convolutional.md',
'classes': [
convolutional.Convolution1D,
convolutional.AtrousConvolution1D,
convolutional.Convolution2D,
convolutional.AtrousConvolution2D,
convolutional.SeparableConvolution2D,
convolutional.Deconvolution2D,
convolutional.Convolution3D,
convolutional.UpSampling1D,
convolutional.UpSampling2D,
convolutional.UpSampling3D,
convolutional.ZeroPadding1D,
convolutional.ZeroPadding2D,
convolutional.ZeroPadding3D,
],
},
{
'page': 'layers/pooling.md',
'classes': [
pooling.MaxPooling1D,
pooling.MaxPooling2D,
pooling.MaxPooling3D,
pooling.AveragePooling1D,
pooling.AveragePooling2D,
pooling.AveragePooling3D,
pooling.GlobalMaxPooling1D,
pooling.GlobalAveragePooling1D,
pooling.GlobalMaxPooling2D,
pooling.GlobalAveragePooling2D,
],
},
{
'page': 'layers/local.md',
'classes': [
local.LocallyConnected1D,
local.LocallyConnected2D,
],
},
{
'page': 'layers/recurrent.md',
'classes': [
recurrent.Recurrent,
recurrent.SimpleRNN,
recurrent.GRU,
recurrent.LSTM,
],
},
{
'page': 'layers/embeddings.md',
'classes': [
embeddings.Embedding,
],
},
{
'page': 'layers/normalization.md',
'classes': [
normalization.BatchNormalization,
],
},
{
'page': 'layers/advanced-activations.md',
'all_module_classes': [advanced_activations],
},
{
'page': 'layers/noise.md',
'all_module_classes': [noise],
},
{
'page': 'layers/wrappers.md',
'all_module_classes': [wrappers],
},
{
'page': 'optimizers.md',
'all_module_classes': [optimizers],
},
{
'page': 'callbacks.md',
'all_module_classes': [callbacks],
},
{
'page': 'backend.md',
'all_module_functions': [backend],
},
]
ROOT = 'http://keras.io/'
def get_earliest_class_that_defined_member(member, cls):
ancestors = get_classes_ancestors([cls])
result = None
for ancestor in ancestors:
if member in dir(ancestor):
result = ancestor
if not result:
return cls
return result
def get_classes_ancestors(classes):
ancestors = []
for cls in classes:
ancestors += cls.__bases__
filtered_ancestors = []
for ancestor in ancestors:
if ancestor.__name__ in ['object']:
continue
filtered_ancestors.append(ancestor)
if filtered_ancestors:
return filtered_ancestors + get_classes_ancestors(filtered_ancestors)
else:
return filtered_ancestors
def get_function_signature(function, method=True):
signature = inspect.getargspec(function)
defaults = signature.defaults
if method:
args = signature.args[1:]
else:
args = signature.args
if defaults:
kwargs = zip(args[-len(defaults):], defaults)
args = args[:-len(defaults)]
else:
kwargs = []
st = '%s.%s(' % (function.__module__, function.__name__)
for a in args:
st += str(a) + ', '
for a, v in kwargs:
if type(v) == str:
v = '\'' + v + '\''
st += str(a) + '=' + str(v) + ', '
if kwargs or args:
return st[:-2] + ')'
else:
return st + ')'
def get_class_signature(cls):
try:
class_signature = get_function_signature(cls.__init__)
class_signature = class_signature.replace('__init__', cls.__name__)
except:
# in case the class inherits from object and does not
# define __init__
class_signature = cls.__module__ + '.' + cls.__name__ + '()'
return class_signature
def class_to_docs_link(cls):
module_name = cls.__module__
assert module_name[:6] == 'keras.'
module_name = module_name[6:]
link = ROOT + module_name.replace('.', '/') + '#' + cls.__name__.lower()
return link
def class_to_source_link(cls):
module_name = cls.__module__
assert module_name[:6] == 'keras.'
path = module_name.replace('.', '/')
path += '.py'
line = inspect.getsourcelines(cls)[-1]
link = 'https://github.com/fchollet/keras/blob/master/' + path + '#L' + str(line)
return '[[source]](' + link + ')'
def code_snippet(snippet):
result = '```python\n'
result += snippet + '\n'
result += '```\n'
return result
def process_class_docstring(docstring):
docstring = re.sub(r'\n # (.*)\n',
r'\n __\1__\n\n',
docstring)
docstring = re.sub(r' ([^\s\\]+):(.*)\n',
r' - __\1__:\2\n',
docstring)
docstring = docstring.replace(' ' * 5, '\t\t')
docstring = docstring.replace(' ' * 3, '\t')
docstring = docstring.replace(' ', '')
return docstring
def process_function_docstring(docstring):
docstring = re.sub(r'\n # (.*)\n',
r'\n __\1__\n\n',
docstring)
docstring = re.sub(r'\n # (.*)\n',
r'\n __\1__\n\n',
docstring)
docstring = re.sub(r' ([^\s\\]+):(.*)\n',
r' - __\1__:\2\n',
docstring)
docstring = docstring.replace(' ' * 6, '\t\t')
docstring = docstring.replace(' ' * 4, '\t')
docstring = docstring.replace(' ', '')
return docstring
print('Cleaning up existing sources directory.')
if os.path.exists('sources'):
shutil.rmtree('sources')
print('Populating sources directory with templates.')
for subdir, dirs, fnames in os.walk('templates'):
for fname in fnames:
new_subdir = subdir.replace('templates', 'sources')
if not os.path.exists(new_subdir):
os.makedirs(new_subdir)
if fname[-3:] == '.md':
fpath = os.path.join(subdir, fname)
new_fpath = fpath.replace('templates', 'sources')
shutil.copy(fpath, new_fpath)
print('Starting autogeneration.')
for page_data in PAGES:
blocks = []
classes = page_data.get('classes', [])
for module in page_data.get('all_module_classes', []):
module_classes = []
for name in dir(module):
if name[0] == '_' or name in EXCLUDE:
continue
module_member = getattr(module, name)
if inspect.isclass(module_member):
cls = module_member
if cls.__module__ == module.__name__:
if cls not in module_classes:
module_classes.append(cls)
module_classes.sort(key=lambda x: id(x))
classes += module_classes
for cls in classes:
subblocks = []
signature = get_class_signature(cls)
subblocks.append('<span style="float:right;">' + class_to_source_link(cls) + '</span>')
subblocks.append('### ' + cls.__name__ + '\n')
subblocks.append(code_snippet(signature))
docstring = cls.__doc__
if docstring:
subblocks.append(process_class_docstring(docstring))
blocks.append('\n'.join(subblocks))
functions = page_data.get('functions', [])
for module in page_data.get('all_module_functions', []):
module_functions = []
for name in dir(module):
if name[0] == '_' or name in EXCLUDE:
continue
module_member = getattr(module, name)
if inspect.isfunction(module_member):
function = module_member
if module.__name__ in function.__module__:
if function not in module_functions:
module_functions.append(function)
module_functions.sort(key=lambda x: id(x))
functions += module_functions
for function in functions:
subblocks = []
signature = get_function_signature(function, method=False)
signature = signature.replace(function.__module__ + '.', '')
subblocks.append('### ' + function.__name__ + '\n')
subblocks.append(code_snippet(signature))
docstring = function.__doc__
if docstring:
subblocks.append(process_function_docstring(docstring))
blocks.append('\n\n'.join(subblocks))
mkdown = '\n----\n\n'.join(blocks)
# save module page.
# Either insert content into existing page,
# or create page otherwise
page_name = page_data['page']
path = os.path.join('sources', page_name)
if os.path.exists(path):
template = open(path).read()
assert '{{autogenerated}}' in template, ('Template found for ' + path +
' but missing {{autogenerated}} tag.')
mkdown = template.replace('{{autogenerated}}', mkdown)
print('...inserting autogenerated content into template:', path)
else:
print('...creating new page with autogenerated content:', path)
subdir = os.path.dirname(path)
if not os.path.exists(subdir):
os.makedirs(subdir)
open(path, 'w').write(mkdown)
+32 -18
Ver Arquivo
@@ -3,8 +3,8 @@ theme: readthedocs
docs_dir: sources
repo_url: http://github.com/fchollet/keras
site_url: http://keras.io/
#theme_dir: theme
site_description: Documentation for fast and lightweight Keras Deep Learning library.
# theme_dir: theme
site_description: 'Documentation for Keras, the Python Deep Learning library.'
dev_addr: '0.0.0.0:8000'
google_analytics: ['UA-61785484-1', 'keras.io']
@@ -12,30 +12,44 @@ google_analytics: ['UA-61785484-1', 'keras.io']
pages:
- Home: index.md
- Index: documentation.md
- Examples: examples.md
- FAQ: faq.md
- Optimizers: optimizers.md
- Objectives: objectives.md
- Models: models.md
- Activations: activations.md
- Initializations: initializations.md
- Regularizers: regularizers.md
- Constraints: constraints.md
- Callbacks: callbacks.md
- Datasets: datasets.md
- Visualization: visualization.md
- Getting started:
- Guide to the Sequential model: getting-started/sequential-model-guide.md
- Guide to the Functional API: getting-started/functional-api-guide.md
- FAQ: getting-started/faq.md
- Models:
- About Keras models: models/about-keras-models.md
- Sequential: models/sequential.md
- Model (functional API): models/model.md
- Layers:
- About Keras layers: layers/about-keras-layers.md
- Core Layers: layers/core.md
- Convolutional Layers: layers/convolutional.md
- Pooling Layers: layers/pooling.md
- Locally-connected Layers: layers/local.md
- Recurrent Layers: layers/recurrent.md
- Advanced Activations Layers: layers/advanced_activations.md
- Normalization Layers: layers/normalization.md
- Embedding Layers: layers/embeddings.md
- Advanced Activations Layers: layers/advanced-activations.md
- Normalization Layers: layers/normalization.md
- Noise layers: layers/noise.md
- Containers: layers/containers.md
- Layer wrappers: layers/wrappers.md
- Writing your own Keras layers: layers/writing-your-own-keras-layers.md
- Preprocessing:
- Sequence Preprocessing: preprocessing/sequence.md
- Text Preprocessing: preprocessing/text.md
- Image Preprocessing: preprocessing/image.md
- Objectives: objectives.md
- Optimizers: optimizers.md
- Activations: activations.md
- Callbacks: callbacks.md
- Datasets: datasets.md
- Applications: applications.md
- Backend: backend.md
- Initializations: initializations.md
- Regularizers: regularizers.md
- Constraints: constraints.md
- Visualization: visualization.md
- Scikit-learn API: scikit-learn-api.md
-40
Ver Arquivo
@@ -1,40 +0,0 @@
# Keras Documentation Index
## Introduction
- [Home](index.md)
- [Index](documentation.md)
- [Examples](examples.md)
- [FAQ](faq.md)
- [Backend](backend.md)
---
## Base functionality
- [Optimizers](optimizers.md)
- [Objectives](objectives.md)
- [Models](models.md)
- [Activations](activations.md)
- [Initializations](initializations.md)
- [Regularizers](regularizers.md)
- [Constraints](constraints.md)
- [Callbacks](callbacks.md)
- [Datasets](datasets.md)
---
## Layers
- [Core](layers/core.md)
- [Convolutional](layers/convolutional.md)
- [Recurrent](layers/recurrent.md)
- [Advanced Activations](layers/advanced_activations.md)
- [Normalization](layers/normalization.md)
- [Embeddings](layers/embeddings.md)
---
## Preprocessing
- [Sequence](preprocessing/sequence.md)
- [Text](preprocessing/text.md)
- [Image](preprocessing/image.md)
-176
Ver Arquivo
@@ -1,176 +0,0 @@
Here are a few examples to get you started!
### Multilayer Perceptron (MLP):
```python
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD
model = Sequential()
# Dense(64) is a fully-connected layer with 64 hidden units.
# in the first layer, you must specify the expected input data shape:
# here, 20-dimensional vectors.
model.add(Dense(64, input_dim=20, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(2, init='uniform'))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
model.fit(X_train, y_train, nb_epoch=20, batch_size=16)
score = model.evaluate(X_test, y_test, batch_size=16)
```
### Alternative implementation of MLP:
```python
model = Sequential()
model.add(Dense(64, input_dim=20, init='uniform', activation='tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, init='uniform', activation='tanh'))
model.add(Dropout(0.5))
model.add(Dense(2, init='uniform', activation='softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
```
### VGG-like convnet:
```python
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.optimizers import SGD
model = Sequential()
# input: 100x100 images with 3 channels -> (3, 100, 100) tensors.
# this applies 32 convolution filters of size 3x3 each.
model.add(Convolution2D(32, 3, 3, border_mode='full', input_shape=(3, 100, 100)))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Convolution2D(64, 3, 3, border_mode='valid'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
# Note: Keras does automatic shape inference.
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(10))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
model.fit(X_train, Y_train, batch_size=32, nb_epoch=1)
```
### Sequence classification with LSTM:
```python
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM
model = Sequential()
model.add(Embedding(max_features, 256, input_length=maxlen))
model.add(LSTM(output_dim=128, activation='sigmoid', inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop')
model.fit(X_train, Y_train, batch_size=16, nb_epoch=10)
score = model.evaluate(X_test, Y_test, batch_size=16)
```
### Architecture for learning image captions with a convnet and a Gated Recurrent Unit:
(word-level embedding, caption of maximum length 16 words).
Note that getting this to work well will require using a bigger convnet, initialized with pre-trained weights.
```python
max_caption_len = 16
vocab_size = 10000
# first, let's define an image model that
# will encode pictures into 128-dimensional vectors.
# it should be initialized with pre-trained weights.
image_model = Sequential()
image_model.add(Convolution2D(32, 3, 3, border_mode='full', input_shape=(3, 100, 100)))
image_model.add(Activation('relu'))
image_model.add(Convolution2D(32, 3, 3))
image_model.add(Activation('relu'))
image_model.add(MaxPooling2D(pool_size=(2, 2)))
image_model.add(Convolution2D(64, 3, 3, border_mode='full'))
image_model.add(Activation('relu'))
image_model.add(Convolution2D(64, 3, 3))
image_model.add(Activation('relu'))
image_model.add(MaxPooling2D(pool_size=(2, 2)))
image_model.add(Flatten())
image_model.add(Dense(128))
# let's load the weights from a save file.
image_model.load_weights('weight_file.h5')
# next, let's define a RNN model that encodes sequences of words
# into sequences of 128-dimensional word vectors.
language_model = Sequential()
language_model.add(Embedding(vocab_size, 256, input_length=max_caption_len))
language_model.add(GRU(output_dim=128, return_sequences=True))
language_model.add(TimeDistributedDense(128))
# let's repeat the image vector to turn it into a sequence.
image_model.add(RepeatVector(max_caption_len))
# the output of both models will be tensors of shape (samples, max_caption_len, 128).
# let's concatenate these 2 vector sequences.
model = Merge([image_model, language_model], mode='concat', concat_axis=-1)
# let's encode this vector sequence into a single vector
model.add(GRU(256, 256, return_sequences=False))
# which will be used to compute a probability
# distribution over what the next word in the caption should be!
model.add(Dense(vocab_size))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
# "images" is a numpy float array of shape (nb_samples, nb_channels=3, width, height).
# "captions" is a numpy integer array of shape (nb_samples, max_caption_len)
# containing word index sequences representing partial captions.
# "next_words" is a numpy float array of shape (nb_samples, vocab_size)
# containing a categorical encoding (0s and 1s) of the next word in the corresponding
# partial caption.
model.fit([images, partial_captions], next_words, batch_size=16, nb_epoch=100)
```
In the examples folder, you will find example models for real datasets:
- CIFAR10 small images classification: Convolutional Neural Network (CNN) with realtime data augmentation
- IMDB movie review sentiment classification: LSTM over sequences of words
- Reuters newswires topic classification: Multilayer Perceptron (MLP)
- MNIST handwritten digits classification: MLP & CNN
- Character-level text generation with LSTM
...and more.
-179
Ver Arquivo
@@ -1,179 +0,0 @@
# Keras FAQ: Frequently Asked Keras Questions
[How can I run Keras on GPU?](#how-can-i-run-keras-on-gpu)
[How can I save a Keras model?](#how-can-i-save-a-keras-model)
[Why is the training loss much higher than the testing loss?](#why-is-the-training-loss-much-higher-than-the-testing-loss)
[How can I visualize the output of an intermediate layer?](#how-can-i-visualize-the-output-of-an-intermediate-layer)
[Isn't there a bug with Merge or Graph related to input concatenation?](#isnt-there-a-bug-with-merge-or-graph-related-to-input-concatenation)
[How can I use Keras with datasets that don't fit in memory?](#how-can-i-use-keras-with-datasets-that-dont-fit-in-memory)
[How can I interrupt training when the validation loss isn't decreasing anymore?](#how-can-i-interrupt-training-when-the-validation-loss-isnt-decreasing-anymore)
[How is the validation split computed?](#how-is-the-validation-split-computed)
[Is the data shuffled during training?](#is-the-data-shuffled-during-training)
[How can I record the training / validation loss / accuracy at each epoch?](#how-can-i-record-the-training-validation-loss-accuracy-at-each-epoch)
---
### How can I run Keras on GPU?
Method 1: use Theano flags.
```bash
THEANO_FLAGS=device=gpu,floatX=float32 python my_keras_script.py
```
The name 'gpu' might have to be changed depending on your device's identifier (e.g. `gpu0`, `gpu1`, etc).
Method 2: set up your `.theanorc`: [Instructions](http://deeplearning.net/software/theano/library/config.html)
Method 3: manually set `theano.config.device`, `theano.config.floatX` at the beginning of your code:
```python
import theano
theano.config.device = 'gpu'
theano.config.floatX = 'float32'
```
---
### How can I save a Keras model?
*It is not recommended to use pickle or cPickle to save a Keras model.*
If you only need to save the architecture of a model, and not its weights, you can do:
```python
# save as JSON
json_string = model.to_json()
# save as YAML
yaml_string = model.to_yaml()
```
You can then build a fresh model from this data:
```python
# model reconstruction from JSON:
from keras.models import model_from_json
model = model_from_json(json_string)
# model reconstruction from YAML
model = model_from_yaml(yaml_string)
```
If you need to save the weights of a model, you can do so in HDF5:
```python
model.save_weights('my_model_weights.h5')
```
Assuming you have code for instantiating your model, you can then load the weights you saved into a model with the same architecture:
```python
model.load_weights('my_model_weights.h5')
```
This leads us to a way to save and reconstruct models from only serialized data:
```python
json_string = model.to_json()
open('my_model_architecture.json', 'w').write(json_string)
model.save_weights('my_model_weights.h5')
# elsewhere...
model = model_from_json(open('my_model_architecture.json').read())
model.load_weights('my_model_weights.h5')
```
---
### Why is the training loss much higher than the testing loss?
A Keras model has two modes: training and testing. Regularization mechanisms, such as Dropout and L1/L2 weight regularization, are turned off at testing time.
Besides, the training loss is the average of the losses over each batch of training data. Because your model is changing over time, the loss over the first batches of an epoch is generally higher than over the last batches. On the other hand, the testing loss for an epoch is computed using the model as it is at the end of the epoch, resulting in a lower loss.
---
### How can I visualize the output of an intermediate layer?
You can build a Theano function that will return the output of a certain layer given a certain input, for example:
```python
# with a Sequential model
get_3rd_layer_output = theano.function([model.layers[0].input],
model.layers[3].get_output(train=False))
layer_output = get_3rd_layer_output(X)
# with a Graph model
get_conv_layer_output = theano.function([model.inputs[i].input for i in model.input_order],
model.outputs['conv'].get_output(train=False),
on_unused_input='ignore')
conv_output = get_conv_output(input_data_dict)
```
---
### Isn't there a bug with Merge or Graph related to input concatenation?
Yes, there was a known bug with tensor concatenation in Thenao that was fixed early 2015.
Please upgrade to the latest version of Theano:
```bash
sudo pip install git+git://github.com/Theano/Theano.git
```
---
### How can I use Keras with datasets that don't fit in memory?
You can do batch training using `model.train_on_batch(X, y)` and `model.test_on_batch(X, y)`. See the [models documentation](models.md).
You can also see batch training in action in our [CIFAR10 example](https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py).
---
### How can I interrupt training when the validation loss isn't decreasing anymore?
You can use an `EarlyStopping` callback:
```python
from keras.callbacks import EarlyStopping
early_stopping = EarlyStopping(monitor='val_loss', patience=2)
model.fit(X, y, validation_split=0.2, callbacks=[early_stopping])
```
Find out more in the [callbacks documentation](callbacks.md).
---
### How is the validation split computed?
If you set the `validation_split` arugment in `model.fit` to e.g. 0.1, then the validation data used will be the *last 10%* of the data. If you set it to 0.25, it will be the last 25% of the data, etc.
---
### Is the data shuffled during training?
Yes, if the `shuffle` argument in `model.fit` is set to `True` (which is the default), the training data will be randomly shuffled at each epoch.
Validation data isn't shuffled.
---
### How can I record the training / validation loss / accuracy at each epoch?
The `model.fit` method returns an `History` callback, which has a `history` attribute containing the lists of successive losses / accuracies.
```python
hist = model.fit(X, y, validation_split=0.2)
print(hist.history)
```
---
-23
Ver Arquivo
@@ -1,23 +0,0 @@
## Usage of initializations
Initializations define the probability distribution used to set the initial random weights of Keras layers.
The keyword arguments used for passing initializations to layers will depend on the layer. Usually it is simply `init`:
```python
model.add(Dense(64, init='uniform'))
```
## Available initializations
- __uniform__
- __lecun_uniform__: Uniform initialization scaled by the square root of the number of inputs (LeCun 98).
- __normal__
- __identity__: Use with square 2D layers (`shape[0] == shape[1]`).
- __orthogonal__: Use with square 2D layers (`shape[0] == shape[1]`).
- __zero__
- __glorot_normal__: Gaussian initialization scaled by fan_in + fan_out (Glorot 2010)
- __glorot_uniform__
- __he_normal__: Gaussian initialization scaled by fan_in (He et al., 2014)
- __he_uniform__
-106
Ver Arquivo
@@ -1,106 +0,0 @@
## LeakyReLU
```python
keras.layers.advanced_activations.LeakyReLU(alpha=0.3)
```
Special version of a Rectified Linear Unit that allows a small gradient when the unit is not active (`f(x) = alpha*x for x < 0`).
- __Input shape__: Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.
- __Output shape__: Same as input.
- __Arguments__:
- __alpha__: float >= 0. Negative slope coefficient.
---
## PReLU
```python
keras.layers.advanced_activations.PReLU()
```
Parametrized linear unit. Similar to a LeakyReLU, where each input unit has its alpha coefficient, and where these coefficients are learned during training.
- __Input shape__: Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.
- __Output shape__: Same as input.
- __References__:
- [Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification](http://arxiv.org/pdf/1502.01852v1.pdf)
---
## ELU
```python
keras.layers.advanced_activations.ELU()
```
Exponential linear unit. Negative values pushes mean unit activations closer to zero, with the advantage of having a noise-robust deactivation state.
- __Input shape__: Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.
- __Output shape__: Same as input.
- __References__:
- [Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)](http://arxiv.org/pdf/1511.07289v1.pdf)
---
## ParametricSoftplus
```python
keras.layers.advanced_activations.ParametricSoftplus()
```
Parametric Softplus of the form: (`f(x) = alpha * (1 + exp(beta * x))`). This is essentially a smooth version of ReLU where the parameters control the sharpness of the rectification. The parameters are initialized to more closely approximate a ReLU than the standard `softplus`: `alpha` initialized to `0.2` and `beta` initialized to `5.0`. The parameters are fit separately for each hidden unit.
- __Input shape__: Arbitrary. Use the keyword argument `input_shape=...` when using this layer as the first layer in a model.
- __Output shape__: Same as input.
- __References__:
- [Inferring Nonlinear Neuronal Computation Based on Physiologically Plausible Inputs](http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003143)
## Thresholded Linear
```python
keras.layers.advanced_activations.ThresholdedLinear(theta)
```
Parametrized linear unit. provides a threshold near zero where values are zeroed.
- __Input shape__: Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.
- __Output shape__: Same as input.
- __Arguments__:
- __theta__: float >= 0. Threshold location of activation
- __References__:
- [Zero-Bias Autoencoders and the Benefits of Co-Adapting Features](http://arxiv.org/pdf/1402.3337.pdf)
## Thresholded ReLu
```python
keras.layers.advanced_activations.ThresholdedReLu(theta)
```
Parametrized rectified linear unit. provides a threshold near zero where values are zeroed.
- __Input shape__: Arbitrary. Use the keyword argument `input_shape=...` when using this layer as the first layer in a model.
- __Output shape__: Same as input.
- __Arguments__:
- __theta__: float >= 0. Threshold location of activation
- __References__:
- [Zero-Bias Autoencoders and the Benefits of Co-Adapting Features](http://arxiv.org/pdf/1402.3337.pdf)
-21
Ver Arquivo
@@ -1,21 +0,0 @@
Containers are ensembles of layers that can be interacted with through the same API as `Layer` objects.
## Sequential
```python
keras.layers.containers.Sequential(layers=[])
```
The Sequential container is a linear stack of layers. Apart from the `add` methods and the `layers` constructor argument, the API is identical to that of the `Layer` class.
This class is also the basis for the `keras.models.Sequential` architecture.
The `layers` constructor argument is a list of Layer instances.
__Methods__:
```python
add(layer)
```
Add a new layer to the stack.
-202
Ver Arquivo
@@ -1,202 +0,0 @@
## Convolution1D
```python
keras.layers.convolutional.Convolution1D(nb_filter, filter_length,
init='uniform',
activation='linear',
weights=None,
border_mode='valid',
subsample_length=1,
W_regularizer=None, b_regularizer=None,
W_constraint=None, b_constraint=None,
input_dim=None, input_length=None)
```
Convolution operator for filtering neighborhoods of one-dimensional inputs. When using this layer as the first layer in a model, either provide the keyword argument `input_dim` (int, e.g. 128 for sequences of 128-dimensional vectors), or `input_shape` (tuple of integers, e.g. (10, 128) for sequences of 10 vectors of 128-dimensional vectors).
- __Input shape__: 3D tensor with shape: `(samples, steps, input_dim)`.
- __Output shape__: 3D tensor with shape: `(samples, new_steps, nb_filter)`. `steps` value might have changed due to padding.
- __Arguments__:
- __nb_filter__: Number of convolution kernels to use (dimensionality of the output).
- __filter_length__: The extension (spatial or temporal) of each filter.
- __init__: name of initialization function for the weights of the layer (see: [initializations](../initializations.md)), or alternatively, Theano function to use for weights initialization. This parameter is only relevant if you don't pass a `weights` argument.
- __activation__: name of activation function to use (see: [activations](../activations.md)), or alternatively, elementwise Theano function. If you don't specify anything, no activation is applied (ie. "linear" activation: a(x) = x).
- __weights__: list of numpy arrays to set as initial weights.
- __border_mode__: 'valid' or 'same'.
- __subsample_length__: factor by which to subsample output.
- __W_regularizer__: instance of [WeightRegularizer](../regularizers.md) (eg. L1 or L2 regularization), applied to the main weights matrix.
- __b_regularizer__: instance of [WeightRegularizer](../regularizers.md), applied to the bias.
- __activity_regularizer__: instance of [ActivityRegularizer](../regularizers.md), applied to the network output.
- __W_constraint__: instance of the [constraints](../constraints.md) module (eg. maxnorm, nonneg), applied to the main weights matrix.
- __b_constraint__: instance of the [constraints](../constraints.md) module, applied to the bias.
- __input_dim__: Number of channels/dimensions in the input. Either this argument or the keyword argument `input_shape` must be provided when using this layer as the first layer in a model.
- __input_length__: Length of input sequences, when it is constant. This argument is required if you are going to connect `Flatten` then `Dense` layers upstream (without it, the shape of the dense outputs cannot be computed).
---
## Convolution2D
```python
keras.layers.convolutional.Convolution2D(nb_filter, nb_row, nb_col,
init='glorot_uniform',
activation='linear',
weights=None,
border_mode='valid',
subsample=(1, 1),
W_regularizer=None, b_regularizer=None,
W_constraint=None,
dim_ordering='th')
```
Convolution operator for filtering windows of two-dimensional inputs. When using this layer as the first layer in a model, provide the keyword argument `input_shape` (tuple of integers, does not include the sample axis), e.g. `input_shape=(3, 128, 128)` for 128x128 RGB pictures.
- __Input shape__: 4D tensor with shape: `(samples, channels, rows, cols)` if dim_ordering='th'
or 4D tensor with shape: `(samples, rows, cols, channels)` if dim_ordering='tf'.
- __Output shape__: 4D tensor with shape: `(samples, nb_filter, nb_row, nb_col)` if dim_ordering='th'
or 4D tensor with shape: `(samples, nb_row, nb_col, nb_filter)` if dim_ordering='tf'.
- __Arguments__:
- __nb_filter__: Number of convolution filters to use.
- __nb_row__: Number of rows in the convolution kernel.
- __nb_col__: Number of columns in the convolution kernel.
- __init__: name of initialization function for the weights of the layer (see: [initializations](../initializations.md)), or alternatively, Theano function to use for weights initialization. This parameter is only relevant if you don't pass a `weights` argument.
- __activation__: name of activation function to use (see: [activations](../activations.md)), or alternatively, elementwise Theano function. If you don't specify anything, no activation is applied (ie. "linear" activation: a(x) = x).
- __weights__: list of numpy arrays to set as initial weights.
- __border_mode__: 'valid' or 'same'.
- __subsample__: tuple of length 2. Factor by which to subsample output. Also called strides elsewhere.
- __W_regularizer__: instance of [WeightRegularizer](../regularizers.md) (eg. L1 or L2 regularization), applied to the main weights matrix.
- __b_regularizer__: instance of [WeightRegularizer](../regularizers.md), applied to the bias.
- __activity_regularizer__: instance of [ActivityRegularizer](../regularizers.md), applied to the network output.
- __W_constraint__: instance of the [constraints](../constraints.md) module (eg. maxnorm, nonneg), applied to the main weights matrix.
- __b_constraint__: instance of the [constraints](../constraints.md) module, applied to the bias.
- __dim_ordering__: 'th' or 'tf'. In 'th' mode, the channels dimension (the depth) is at index 1, in 'tf' mode is it at index 3.
---
## MaxPooling1D
```python
keras.layers.convolutional.MaxPooling1D(pool_length=2, stride=None, border_mode='valid')
```
Max pooling operation for temporal data.
- __Input shape__: 3D tensor with shape: `(samples, steps, features)`.
- __Output shape__: 3D tensor with shape: `(samples, downsampled_steps, features)`.
- __Arguments__:
- __pool_length__: factor by which to downscale. 2 will halve the input.
- __stride__: integer or None. Stride value.
- __border_mode__: 'valid' or 'same'. **Note:** 'same' will only work with TensorFlow for the time being.
---
## MaxPooling2D
```python
keras.layers.convolutional.MaxPooling2D(pool_size=(2, 2), border_mode='valid', dim_ordering='th')
```
Max pooling operation for spatial data.
- __Input shape__: 4D tensor with shape: `(samples, channels, rows, cols)` if dim_ordering='th'
or 4D tensor with shape: `(samples, rows, cols, channels)` if dim_ordering='tf'.
- __Output shape__: 4D tensor with shape: `(nb_samples, channels, pooled_rows, pooled_cols)` if dim_ordering='th'
or 4D tensor with shape: `(samples, pooled_rows, pooled_cols, channels)` if dim_ordering='tf'.
- __Arguments__:
- __pool_size__: tuple of 2 integers, factors by which to downscale (vertical, horizontal). (2, 2) will halve the image in each dimension.
- __strides__: tuple of 2 integers, or None. Strides values.
- __border_mode__: 'valid' or 'same'. **Note:** 'same' will only work with TensorFlow for the time being.
- __dim_ordering__: 'th' or 'tf'. In 'th' mode, the channels dimension (the depth) is at index 1, in 'tf' mode is it at index 3.
---
## UpSampling1D
```python
keras.layers.convolutional.UpSampling1D(length=2)
```
Repeats each temporal step `length` times along the time axis.
- __Input shape__: 3D tensor with shape: `(samples, steps, features)`.
- __Output shape__: 3D tensor with shape: `(samples, upsampled_steps, features)`.
- __Arguments__:
- __length__: integer. Upsampling factor.
---
## UpSampling2D
```python
keras.layers.convolutional.UpSampling2D(size=(2, 2), dim_ordering='th')
```
Repeats the rows and columns of the data by size[0] and size[1] respectively.
- __Input shape__: 4D tensor with shape: `(samples, channels, rows, cols)` if dim_ordering='th'
or 4D tensor with shape: `(samples, rows, cols, channels)` if dim_ordering='tf'.
- __Output shape__: 4D tensor with shape: `(samples, channels, upsampled_rows, upsampled_cols)` if dim_ordering='th'
or 4D tensor with shape: `(samples, upsampled_rows, upsampled_cols, channels)` if dim_ordering='tf'.
- __Arguments__:
- __size__: tuple of 2 integers. The upsampling factors for rows and columns.
- __dim_ordering__: 'th' or 'tf'. In 'th' mode, the channels dimension (the depth) is at index 1, in 'tf' mode is it at index 3.
---
## ZeroPadding1D
```python
keras.layers.convolutional.ZeroPaddding1D(padding=1)
```
Pads the input with zeros left and right along the time axis.
- __Input shape__: 3D tensor with shape: `(nb_samples, steps, dim)`.
- __Output shape__: 3D tensor with shape: `(nb_samples, padded_steps, dim)`.
- __Arguments__:
- __padding__: integer, the size of the padding.
---
## ZeroPadding2D
```python
keras.layers.convolutional.ZeroPaddding2D(padding=(1, 1), dim_ordering='th')
```
Pads the rows and columns of the input with zeros, left and right.
- __Input shape__: 4D tensor with shape: `(samples, channels, rows, cols)` if dim_ordering='th'
or 4D tensor with shape: `(samples, rows, cols, channels)` if dim_ordering='tf'.
- __Output shape__: 4D tensor with shape: `(samples, channels, padded_rows, padded_cols)` if dim_ordering='th'
or 4D tensor with shape: `(samples, padded_rows, padded_cols, channels)` if dim_ordering='tf'.
- __Arguments__:
- __padding__: tuple of 2 integers, the size of the padding for rows and columns respectively.
- __dim_ordering__: 'th' or 'tf'. In 'th' mode, the channels dimension (the depth) is at index 1, in 'tf' mode is it at index 3.
---
-477
Ver Arquivo
@@ -1,477 +0,0 @@
## Base class
```python
keras.layers.core.Layer()
```
__Methods__:
```python
set_previous(previous_layer)
```
Connect the input of the current layer to the output of the argument layer.
- __Return__: None.
- __Arguments__:
- __previous_layer__: Layer object.
```python
get_output(train)
```
Get the output of the layer.
- __Return__: Theano tensor.
- __Arguments__:
- __train__: Boolean. Specifies whether output is computed in training mode or in testing mode, which can change the logic, for instance in there are any `Dropout` layers in the network.
```python
get_input(train)
```
Get the input of the layer.
- __Return__: Theano tensor.
- __Arguments__:
- __train__: Boolean. Specifies whether output is computed in training mode or in testing mode, which can change the logic, for instance in there are any `Dropout` layers in the network.
```python
get_weights()
```
Get the weights of the parameters of the layer.
- __Return__: List of numpy arrays (one per layer parameter).
```python
set_weights(weights)
```
Set the weights of the parameters of the layer.
- __Arguments__:
- __weights__: List of numpy arrays (one per layer parameter). Should be in the same order as what `get_weights(self)` returns.
```python
get_config()
```
- __Return__: Configuration dictionary describing the layer.
---
## Dense
```python
keras.layers.core.Dense(output_dim,
init='glorot_uniform',
activation='linear',
weights=None,
W_regularizer=None, b_regularizer=None, activity_regularizer=None,
W_constraint=None, b_constraint=None,
input_dim=None)
```
Standard 1D fully-connect layer.
- __Input shape__: 2D tensor with shape: `(nb_samples, input_dim)`.
- __Output shape__: 2D tensor with shape: `(nb_samples, output_dim)`.
- __Arguments__:
- __output_dim__: int >= 0.
- __init__: name of initialization function for the weights of the layer (see: [initializations](../initializations.md)), or alternatively, Theano function to use for weights initialization. This parameter is only relevant if you don't pass a `weights` argument.
- __activation__: name of activation function to use (see: [activations](../activations.md)), or alternatively, elementwise Theano function. If you don't specify anything, no activation is applied (ie. "linear" activation: a(x) = x).
- __weights__: list of numpy arrays to set as initial weights. The list should have 1 element, of shape `(input_dim, output_dim)`.
- __W_regularizer__: instance of [WeightRegularizer](../regularizers.md) (eg. L1 or L2 regularization), applied to the main weights matrix.
- __b_regularizer__: instance of [WeightRegularizer](../regularizers.md), applied to the bias.
- __activity_regularizer__: instance of [ActivityRegularizer](../regularizers.md), applied to the network output.
- __W_constraint__: instance of the [constraints](../constraints.md) module (eg. maxnorm, nonneg), applied to the main weights matrix.
- __b_constraint__: instance of the [constraints](../constraints.md) module, applied to the bias.
- __input_dim__: dimensionality of the input (integer). This argument (or alternatively, the keyword argument `input_shape`) is required when using this layer as the first layer in a model.
---
## TimeDistributedDense
```python
keras.layers.core.TimeDistributedDense(output_dim,
init='glorot_uniform',
activation='linear',
weights=None
W_regularizer=None, b_regularizer=None, activity_regularizer=None,
W_constraint=None, b_constraint=None,
input_dim=None, input_length=None)
```
Fully-connected layer distributed over the time dimension. Useful after a recurrent network set to `return_sequences=True`.
- __Input shape__: 3D tensor with shape: `(nb_samples, timesteps, input_dim)`.
- __Arguments__:
- __output_dim__: int >= 0.
- __init__: name of initialization function for the weights of the layer (see: [initializations](../initializations.md)), or alternatively, Theano function to use for weights initialization. This parameter is only relevant if you don't pass a `weights` argument.
- __activation__: name of activation function to use (see: [activations](../activations.md)), or alternatively, elementwise Theano function. If you don't specify anything, no activation is applied (ie. "linear" activation: a(x) = x).
- __weights__: list of numpy arrays to set as initial weights. The list should have 1 element, of shape `(input_dim, output_dim)`.
- __W_regularizer__: instance of [WeightRegularizer](../regularizers.md) (eg. L1 or L2 regularization), applied to the main weights matrix.
- __b_regularizer__: instance of [WeightRegularizer](../regularizers.md), applied to the bias.
- __activity_regularizer__: instance of [ActivityRegularizer](../regularizers.md), applied to the network output.
- __W_constraint__: instance of the [constraints](../constraints.md) module (eg. maxnorm, nonneg), applied to the main weights matrix.
- __b_constraint__: instance of the [constraints](../constraints.md) module, applied to the bias.
- __input_dim__: dimensionality of the input (integer). This argument (or alternatively, the keyword argument `input_shape`) is required when using this layer as the first layer in a model.
- __input_length__: Length of input sequences, when it is constant. This argument is required if you are going to connect `Flatten` then `Dense` layers upstream (without it, the shape of the dense outputs cannot be computed).
- __Example__:
```python
# input shape: (nb_samples, timesteps, 10)
model.add(LSTM(5, return_sequences=True, input_dim=10)) # output shape: (nb_samples, timesteps, 5)
model.add(TimeDistributedDense(15)) # output shape: (nb_samples, timesteps, 15)
```
---
## AutoEncoder
```python
keras.layers.core.AutoEncoder(encoder, decoder, output_reconstruction=True, weights=None):
```
A customizable autoencoder model. If `output_reconstruction = True` then dim(input) = dim(output) else dim(output) = dim(hidden)
- __Input shape__: The layer shape is defined by the encoder definitions
- __Output shape__: The layer shape is defined by the decoder definitions
- __Arguments__:
- __encoder__: A [layer](./) or [layer container](./containers.md).
- __decoder__: A [layer](./) or [layer container](./containers.md).
- __output_reconstruction__: If this is False, then when .predict() is called, the output is the deepest hidden layer's activation. Otherwise, the output of the final decoder layer is presented. Be sure your validation data conforms to this logic if you decide to use any.
- __weights__: list of numpy arrays to set as initial weights. The list should have 1 element, of shape `(input_dim, output_dim)`.
- __Example__:
```python
from keras.layers import containers
# input shape: (nb_samples, 32)
encoder = containers.Sequential([Dense(16, input_dim=32), Dense(8)])
decoder = containers.Sequential([Dense(16, input_dim=8), Dense(32)])
autoencoder = Sequential()
autoencoder.add(AutoEncoder(encoder=encoder, decoder=decoder, output_reconstruction=False))
```
---
## Activation
```python
keras.layers.core.Activation(activation)
```
Apply an activation function to the input.
- __Input shape__: Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.
- __Output shape__: Same as input.
- __Arguments__:
- __activation__: name of activation function to use (see: [activations](../activations.md)), or alternatively, elementwise Theano function.
---
## Dropout
```python
keras.layers.core.Dropout(p)
```
Apply dropout to the input. Dropout consists in randomly setting a fraction `p` of input units to 0 at each update during training time, which helps prevent overfitting. Reference: [Dropout: A Simple Way to Prevent Neural Networks from Overfitting](http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf)
- __Input shape__: Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.
- __Output shape__: Same as input.
- __Arguments__:
- __p__: float (0 <= p < 1). Fraction of the input that gets dropped out at training time.
---
## Reshape
```python
keras.layers.core.Reshape(dims)
```
Reshape the input to a new shape containing the same number of units.
- __Input shape__: Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.
- __Output shape__: `(nb_samples, dims)`.
- __Arguments__:
- dims: tuple of integers. Dimensions of the new shape.
- __Example__:
```python
# input shape: (nb_samples, 10)
model.add(Dense(100, input_dim=10)) # output shape: (nb_samples, 100)
model.add(Reshape(dims=(10, 10))) # output shape: (nb_samples, 10, 10)
```
---
## Flatten
```python
keras.layers.core.Flatten()
```
Convert a nD input to 1D.
- __Input shape__: Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.
- __Output shape__: `(nb_samples, nb_input_units)`.
---
## RepeatVector
```python
keras.layers.core.RepeatVector(n)
```
Repeat the 1D input n times. Dimensions of input are assumed to be `(nb_samples, dim)`. Output will have the shape `(nb_samples, n, dim)`.
Note that the output is still a single tensor; `RepeatVector` does not split the data flow.
- __Input shape__: Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.
- __Output shape__: `(nb_samples, n, input_dims)`.
- __Arguments__:
- __n__: int.
---
## Permute
```python
keras.layers.core.Permute(dims)
```
Permute the dimensions of the input data according to the given tuple. Sometimes useful for connecting RNNs and convnets together.
- __Input shape__: Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.
- __Output shape__: Same as the input shape, but with the dimensions re-ordered according to the ordering specified by the tuple.
- __Argument__: tuple specifying the permutation scheme (e.g. `(2, 1)` permutes the first and second dimension of the input).
- __Example__:
```python
# input shape: (nb_samples, 10)
model.add(Dense(50, input_dim=10)) # output shape: (nb_samples, 50)
model.add(Reshape(dims=(10, 5))) # output shape: (nb_samples, 10, 5)
model.add(Permute(dims=(2, 1))) #output shape: (nb_samples, 5, 10)
```
---
## ActivityRegularization
```python
keras.layers.core.ActivityRegularization(l1=0., l2=0.)
```
Leaves the input unchanged, but adds a term to the loss function based on the input activity. L1 and L2 regularization supported.
This layer can be used, for instance, to induce activation sparsity in the previous layer.
---
## MaxoutDense
```python
keras.layers.core.MaxoutDense(output_dim, nb_feature=4,
init='glorot_uniform',
weights=None,
W_regularizer=None, b_regularizer=None, activity_regularizer=None,
W_constraint=None, b_constraint=None,
input_dim=None)
```
A dense maxout layer. A `MaxoutDense` layer takes the element-wise maximum of `nb_feature` `Dense(input_dim, output_dim)` linear layers. This allows the layer to learn a convex, piecewise linear activation function over the inputs. See [this paper](http://arxiv.org/pdf/1302.4389.pdf) for more details. Note that this is a *linear* layer -- if you wish to apply activation function (you shouldn't need to -- they are universal function approximators), an `Activation` layer must be added after.
- __Input shape__: 2D tensor with shape: `(nb_samples, input_dim)`.
- __Output shape__: 2D tensor with shape: `(nb_samples, output_dim)`.
- __Arguments__:
- __output_dim__: int >= 0.
- __nb_feature__: int >= 0. the number of features to create for the maxout. This is equivalent to the number of piecewise elements to be allowed for the activation function.
- __init__: name of initialization function for the weights of the layer (see: [initializations](../initializations.md)), or alternatively, Theano function to use for weights initialization. This parameter is only relevant if you don't pass a `weights` argument.
- __weights__: list of numpy arrays to set as initial weights. The list should have 1 element, of shape `(input_dim, output_dim)`.
- __W_regularizer__: instance of [WeightRegularizer](../regularizers.md) (eg. L1 or L2 regularization), applied to the main weights matrix.
- __b_regularizer__: instance of [WeightRegularizer](../regularizers.md), applied to the bias.
- __activity_regularizer__: instance of [ActivityRegularizer](../regularizers.md), applied to the network output.
- __W_constraint__: instance of the [constraints](../constraints.md) module (eg. maxnorm, nonneg), applied to the main weights matrix.
- __b_constraint__: instance of the [constraints](../constraints.md) module, applied to the bias.
- __input_dim__: dimensionality of the input (integer). This argument (or alternatively, the keyword argument `input_shape`) is required when using this layer as the first layer in a model.
```python
# input shape: (nb_samples, 10)
model.add(Dense(100, input_dim=10)) # output shape: (nb_samples, 100)
model.add(MaxoutDense(50, nb_feature=10)) # output shape: (nb_samples, 50)
```
## Merge
```python
keras.layers.core.Merge(layers, mode='sum', concat_axis=-1, dot_axes=-1)
```
Merge the output of a list of layers (or containers) into a single tensor.
- __Arguments__:
- __layers__: List of layers or [containers](/layers/containers/).
- __mode__: String, one of `{'sum', 'mul', 'concat', 'ave', 'dot'}`. `sum`, `mul` and `ave` will simply sum/multiply/average the outputs of the layers (therefore all layers should have an output with the same shape). `concat` will concatenate the outputs along the dimension specified by `concate_axis` (therefore all layers should have an output that only differ along this dimension). `dot` will dot tensor contraction on the axes specified by `dot_axes` (see [the Numpy documentation](http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.tensordot.html) for more details).
- __concat_axis__: axis to use in `concat` mode.
- __dot_axes__: axis or axes to use in `dot` mode (see [the Numpy documentation](http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.tensordot.html) for more details).
- __Notes__:
- `dot` mode only works with Theano for the time being.
- __Example__:
```python
left = Sequential()
left.add(Dense(50, input_shape=(784,)))
left.add(Activation('relu'))
right = Sequential()
right.add(Dense(50, input_shape=(784,)))
right.add(Activation('relu'))
model = Sequential()
model.add(Merge([left, right], mode='sum'))
model.add(Dense(10))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
model.fit([X_train, X_train], Y_train, batch_size=128, nb_epoch=20, validation_data=([X_test, X_test], Y_test))
```
## Masking
```python
keras.layers.core.Masking(mask_value=0.)
```
Create a mask for the input data by using `mask_value` as the sentinel value which should be masked out.
Given an input of dimensions `(nb_samples, timesteps, input_dim)`, return the input untouched as output, and supply a mask of shape `(nb_samples, timesteps)` where all timesteps which had *all* their values equal to `mask_value` are masked out.
- __Input shape__: 3D tensor with shape: `(nb_samples, timesteps, features)`.
- __Output shape__: 3D tensor with shape: `(nb_samples, timesteps, features)`.
- __Notes__: Masking only works in Theano for the time being.
## Lambda
```python
keras.layers.core.Lambda(function, output_shape=None)
```
Used for evaluating an arbitrary Theano expression on the output of the previous layer.
- __Input shape__: Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.
- __Output shape__: Specified by the `output_shape` argument.
- __Arguments__:
- __function__: The expression to be evaluated. Takes one argument: the output of the previous layer.
- __output_shape__: Shape of the tensor returned by `function`. Should be a shape tuple (not including the samples dimension) or a function of the full input shape tuple (including samples dimension).
- __Example__:
```python
# custom softmax function
def sharp_softmax(X, beta=1.5):
return theano.tensor.nnet.softmax(X * beta)
def output_shape(input_shape):
# here input_shape includes the samples dimension
return input_shape # shape is unchanged
model = Sequential()
model.add(Dense(input_dim=10, output_dim=10))
model.add(Lambda(sharp_softmax, output_shape))
model.add(Dense(1))
model.add(Activation('sigmoid'))
```
## LambdaMerge
```python
keras.layers.core.LambdaMerge(layers, function, output_shape=None)
```
Merge the output of a list of layers (or containers) into a single tensor, using an arbitrary Theano expression.
- __Arguments__:
- __layers__: List of layers or [containers](/layers/containers/).
- __function__: The expression to be evaluated. Takes one argument: the list of input tensors.
- __output_shape__: Shape of the tensor returned by `function`. Should be a shape tuple (not including samples dimension) or a function of the list of input shape tuples (including samples dimension).
- __Example__:
```python
# root mean square function
def rms(inputs):
# inputs is a list of tensors
s = inputs[0] ** 2
for i in range(1, len(inputs)):
s += inputs[i] ** 2
s /= len(inputs)
s = theano.tensor.sqrt(s)
# return a single tensor
return s
def output_shape(input_shapes):
# return the shape of the first tensor
return input_shapes[0]
left = Sequential()
left.add(Dense(input_dim=10, output_dim=10))
left.add(Activation('sigmoid'))
right = Sequential()
right.add(Dense(input_dim=10, output_dim=10))
right.add(Activation('sigmoid'))
model = Sequential()
model.add(LambdaMerge([left, right], rms, output_shape))
model.add(Dense(1))
model.add(Activation('sigmoid'))
```
---
-31
Ver Arquivo
@@ -1,31 +0,0 @@
## Embedding
```python
keras.layers.embeddings.Embedding(input_dim, output_dim,
init='uniform',
weights=None,
W_regularizer=None, W_constraint=None,
mask_zero=False,
input_length=None)
```
Turn positive integers (indexes) into denses vectors of fixed size,
eg. `[[4], [20]] -> [[0.25, 0.1], [0.6, -0.2]]`
- __Input shape__: 2D tensor with shape: `(nb_samples, sequence_length)`.
- __Output shape__: 3D tensor with shape: `(nb_samples, sequence_length, output_dim)`.
- __Arguments__:
- __input_dim__: int >= 0. Size of the vocabulary, ie. 1+maximum integer index occurring in the input data.
- __output_dim__: int >= 0. Dimension of the dense embedding.
- __init__: name of initialization function for the weights of the layer (see: [initializations](../initializations.md)), or alternatively, Theano function to use for weights initialization. This parameter is only relevant if you don't pass a `weights` argument.
- __weights__: list of numpy arrays to set as initial weights. The list should have 1 element, of shape `(input_dim, output_dim)`.
- __W_regularizer__: instance of the [regularizers](../regularizers.md) module (eg. L1 or L2 regularization), applied to the embedding matrix.
- __W_constraint__: instance of the [constraints](../constraints.md) module (eg. maxnorm, nonneg), applied to the embedding matrix.
- __mask_zero__: Whether or not the input value 0 is a special "padding" value that should be masked out. This is useful for [recurrent layers](recurrent.md) which may take variable length input. If this is `True` then all subsequent layers in the model need to support masking or an exception will be raised.
- __input_length__: Length of input sequences, when it is constant. This argument is required if you are going to connect `Flatten` then `Dense` layers upstream (without it, the shape of the dense outputs cannot be computed).
---
-37
Ver Arquivo
@@ -1,37 +0,0 @@
## GaussianNoise
```python
keras.layers.noise.GaussianNoise(sigma)
```
Apply to the input an additive zero-centred gaussian noise with standard deviation `sigma`. This is useful to mitigate overfitting (you could see it as a kind of random data augmentation). Gaussian Noise (GS) is a natural choice as corruption process for real valued inputs.
Only active at training time.
- __Input shape__: Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.
- __Output shape__: Same as input.
- __Arguments__:
- __sigma__: float, standard deviation of the noise distribution.
---
## GaussianDropout
```python
keras.layers.noise.GaussianDropout(p)
```
Apply to the input an multiplicative one-centred gaussian noise with standard deviation `sqrt(p/(1-p))`. p refers to drop probability to match Dropout layer syntax.
Only active at training time.
- __Input shape__: Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.
- __Output shape__: Same as input.
- __Arguments__:
- __p__: float, drop probability as with Dropout.
-19
Ver Arquivo
@@ -1,19 +0,0 @@
## BatchNormalization
```python
keras.layers.normalization.BatchNormalization(epsilon=1e-6, weights=None)
```
Normalize the activations of the previous layer at each batch.
- __Input shape__: Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.
- __Output shape__: Same as input.
- __Arguments__:
- __epsilon__: small float > 0. Fuzz parameter.
- __weights__: Initialization weights. List of 2 numpy arrays, with shapes: `[(input_shape,), (input_shape,)]`
- __References__:
- [Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift](http://arxiv.org/pdf/1502.03167v3.pdf)
-130
Ver Arquivo
@@ -1,130 +0,0 @@
## SimpleRNN
```python
keras.layers.recurrent.SimpleRNN(output_dim,
init='glorot_uniform', inner_init='orthogonal',
activation='sigmoid',
weights=None,
return_sequences=False,
go_backwards=False,
stateful=False,
input_dim=None, input_length=None)
```
Fully connected RNN where output is to fed back to input.
- __Input shape__: 3D tensor with shape: `(nb_samples, timesteps, input_dim)`.
- __Output shape__:
- if `return_sequences`: 3D tensor with shape: `(nb_samples, timesteps, output_dim)`.
- else: 2D tensor with shape: `(nb_samples, output_dim)`.
- __Masking__: This layer supports masking for input data with a variable number of timesteps To introduce masks to your data, use an [Embedding](embeddings.md) layer with the `mask_zero` parameter set to `True`. **Note:** for the time being, masking in only supported with Theano.
- __Notes__: When using the TensorFlow backend, the number of timesteps used must be fixed. Make sure to pass an `input_length` int argument or a complete `input_shape` tuple argument.
- __Arguments__:
- __output_dim__: dimension of the internal projections and the final output.
- __init__: weight initialization function. Can be the name of an existing function (str), or a Theano function (see: [initializations](../initializations.md)).
- __activation__: activation function. Can be the name of an existing function (str), or a Theano function (see: [activations](../activations.md)).
- __weights__: list of numpy arrays to set as initial weights. The list should have 3 elements, of shapes: `[(input_dim, output_dim), (output_dim, output_dim), (output_dim,)]`.
- __return_sequences__: Boolean. Whether to return the last output in the output sequence, or the full sequence.
- __go_backwards__: Boolean (default False). If True, rocess the input sequence backwards.
- __stateful__: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.
- __input_dim__: dimensionality of the input (integer). This argument (or alternatively, the keyword argument `input_shape`) is required when using this layer as the first layer in a model.
- __input_length__: Length of input sequences, when it is constant. This argument is required if you are going to connect `Flatten` then `Dense` layers upstream (without it, the shape of the dense outputs cannot be computed).
---
## GRU
```python
keras.layers.recurrent.GRU(output_dim,
init='glorot_uniform', inner_init='orthogonal',
activation='sigmoid', inner_activation='hard_sigmoid',
return_sequences=False,
go_backwards=False,
stateful=False,
input_dim=None, input_length=None)
```
Gated Recurrent Unit - Cho et al. 2014.
- __Input shape__: 3D tensor with shape: `(nb_samples, timesteps, input_dim)`.
- __Output shape__:
- if `return_sequences`: 3D tensor with shape: `(nb_samples, timesteps, output_dim)`.
- else: 2D tensor with shape: `(nb_samples, output_dim)`.
- __Masking__: This layer supports masking for input data with a variable number of timesteps To introduce masks to your data, use an [Embedding](embeddings.md) layer with the `mask_zero` parameter set to true. **Note:** for the time being, masking in only supported with Theano.
- __Notes__: When using the TensorFlow backend, the number of timesteps used must be fixed. Make sure to pass an `input_length` int argument or a complete `input_shape` tuple argument.
- __Arguments__:
- __output_dim__: dimension of the internal projections and the final output.
- __init__: weight initialization function for the output cell. Can be the name of an existing function (str), or a Theano function (see: [initializations](../initializations.md)).
- __inner_init__: weight initialization function for the inner cells.
- __activation__: activation function for the output. Can be the name of an existing function (str), or a Theano function (see: [activations](../activations.md)).
- __inner_activation__: activation function for the inner cells.
- __weights__: list of numpy arrays to set as initial weights. The list should have 9 elements.
- __return_sequences__: Boolean. Whether to return the last output in the output sequence, or the full sequence.
- __go_backwards__: Boolean (default False). If True, rocess the input sequence backwards.
- __stateful__: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.
- __input_dim__: dimensionality of the input (integer). This argument (or alternatively, the keyword argument `input_shape`) is required when using this layer as the first layer in a model.
- __input_length__: Length of input sequences, when it is constant. This argument is required if you are going to connect `Flatten` then `Dense` layers upstream (without it, the shape of the dense outputs cannot be computed).
- __References__:
- [On the Properties of Neural Machine Translation: Encoder–Decoder Approaches](http://www.aclweb.org/anthology/W14-4012)
- [Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling](http://arxiv.org/pdf/1412.3555v1.pdf)
---
## LSTM
```python
keras.layers.recurrent.LSTM(output_dim,
init='glorot_uniform', inner_init='orthogonal', forget_bias_init='one',
activation='tanh', inner_activation='hard_sigmoid',
weights=None,
return_sequences=False,
go_backwards=False,
stateful=False,
input_dim=None, input_length=None)
```
Long-Short Term Memory unit - Hochreiter 1997.
- __Input shape__: 3D tensor with shape: `(nb_samples, timesteps, input_dim)`.
- __Output shape__:
- if `return_sequences`: 3D tensor with shape: `(nb_samples, timesteps, output_dim)`.
- else: 2D tensor with shape: `(nb_samples, output_dim)`.
- __Masking__: This layer supports masking for input data with a variable number of timesteps To introduce masks to your data, use an [Embedding](embeddings.md) layer with the `mask_zero` parameter set to true. **Note:** for the time being, masking in only supported with Theano.
- __Notes__: When using the TensorFlow backend, the number of timesteps used must be fixed. Make sure to pass an `input_length` int argument or a complete `input_shape` tuple argument.
- __Arguments__:
- __output_dim__: dimension of the internal projections and the final output.
- __init__: weight initialization function for the output cell. Can be the name of an existing function (str), or a Theano function (see: [initializations](../initializations.md)).
- __inner_init__: weight initialization function for the inner cells.
- __forget_bias_init__: initialization function for the bias of the forget gate. [Jozefowicz et al.](http://www.jmlr.org/proceedings/papers/v37/jozefowicz15.pdf) recommend initializing with ones.
- __activation__: activation function for the output. Can be the name of an existing function (str), or a Theano function (see: [activations](../activations.md)).
- __inner_activation__: activation function for the inner cells.
- __weights__: list of numpy arrays to set as initial weights. The list should have 12 elements.
- __return_sequences__: Boolean. Whether to return the last output in the output sequence, or the full sequence.
- __go_backwards__: Boolean (default False). If True, rocess the input sequence backwards.
- __stateful__: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.
- __input_dim__: dimensionality of the input (integer). This argument (or alternatively, the keyword argument `input_shape`) is required when using this layer as the first layer in a model.
- __input_length__: Length of input sequences, when it is constant. This argument is required if you are going to connect `Flatten` then `Dense` layers upstream (without it, the shape of the dense outputs cannot be computed).
- __References__:
- [Long short-term memory](http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf) (original 1997 paper)
- [Learning to forget: Continual prediction with LSTM](http://www.mitpressjournals.org/doi/pdf/10.1162/089976600300015015)
- [Supervised sequence labelling with recurrent neural networks](http://www.cs.toronto.edu/~graves/preprint.pdf)
---
-211
Ver Arquivo
@@ -1,211 +0,0 @@
## Sequential
Linear stack of layers.
```python
model = keras.models.Sequential()
```
- __Methods__:
- __add__(layer): Add a layer to the model.
- __compile__(optimizer, loss, class_mode="categorical"):
- __Arguments__:
- __optimizer__: str (name of optimizer) or optimizer object. See [optimizers](optimizers.md).
- __loss__: str (name of objective function) or objective function. See [objectives](objectives.md).
- __class_mode__: one of "categorical", "binary". This is only used for computing classification accuracy or using the predict_classes method.
- __fit__(X, y, batch_size=128, nb_epoch=100, verbose=1, validation_split=0., validation_data=None, shuffle=True, show_accuracy=False, callbacks=[], class_weight=None, sample_weight=None): Train a model for a fixed number of epochs.
- __Return__: a history object. It `history` attribute is a record of training loss values at successive epochs, as well as validation loss values (if applicable).
- __Arguments__:
- __X__: data.
- __y__: labels.
- __batch_size__: int. Number of samples per gradient update.
- __nb_epoch__: int.
- __verbose__: 0 for no logging to stdout, 1 for progress bar logging, 2 for one log line per epoch.
- __callbacks__: `keras.callbacks.Callback` list. List of callbacks to apply during training. See [callbacks](callbacks.md).
- __validation_split__: float (0. < x < 1). Fraction of the data to use as held-out validation data.
- __validation_data__: tuple (X, y) to be used as held-out validation data. Will override validation_split.
- __shuffle__: boolean or str (for 'batch'). Whether to shuffle the samples at each epoch. 'batch' is a special option for dealing with the limitations of HDF5 data; it shuffles in batch-sized chunks.
- __show_accuracy__: boolean. Whether to display class accuracy in the logs to stdout at each epoch.
- __class_weight__: dictionary mapping classes to a weight value, used for scaling the loss function (during training only).
- __sample_weight__: list or numpy array with 1:1 mapping to the training samples, used for scaling the loss function (during training only). For time-distributed data, there is one weight per sample *per timestep*, i.e. if your output data is shaped `(nb_samples, timesteps, output_dim)`, your mask should be of shape `(nb_samples, timesteps, 1)`. This allows you to mask out or reweight individual output timesteps, which is useful in sequence to sequence learning.
- __evaluate__(X, y, batch_size=128, show_accuracy=False, verbose=1, sample_weight=None): Show performance of the model over some validation data.
- __Return__: The loss score over the data, or a `(loss, accuracy)` tuple if `show_accuracy=True`.
- __Arguments__: Same meaning as fit method above. verbose is used as a binary flag (progress bar or nothing).
- __predict__(X, batch_size=128, verbose=1):
- __Return__: An array of predictions for some test data.
- __Arguments__: Same meaning as fit method above.
- __predict_classes__(X, batch_size=128, verbose=1): Return an array of class predictions for some test data.
- __Return__: An array of labels for some test data.
- __Arguments__: Same meaning as fit method above. verbose is used as a binary flag (progress bar or nothing).
- __train_on_batch__(X, y, accuracy=False, class_weight=None, sample_weight=None): Single gradient update on one batch.
- __Return__: loss over the data, or tuple `(loss, accuracy)` if `accuracy=True`.
- __test_on_batch__(X, y, accuracy=False, sample_weight=None): Single performance evaluation on one batch.
- __Return__: loss over the data, or tuple `(loss, accuracy)` if `accuracy=True`.
- __save_weights__(fname, overwrite=False): Store the weights of all layers to a HDF5 file. If overwrite==False and the file already exists, an exception will be thrown.
- __load_weights__(fname): Sets the weights of a model, based to weights stored by __save_weights__. You can only __load_weights__ on a savefile from a model with an identical architecture. __load_weights__ can be called either before or after the __compile__ step.
__Examples__:
```python
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD
model = Sequential()
model.add(Dense(2, init='uniform', input_dim=64))
model.add(Activation('softmax'))
model.compile(loss='mse', optimizer='sgd')
'''
Demonstration of verbose modes 1 and 2
'''
model.fit(X_train, y_train, nb_epoch=3, batch_size=16, verbose=1)
# outputs
'''
Train on 37800 samples, validate on 4200 samples
Epoch 0
37800/37800 [==============================] - 7s - loss: 0.0385
Epoch 1
37800/37800 [==============================] - 8s - loss: 0.0140
Epoch 2
10960/37800 [=======>......................] - ETA: 4s - loss: 0.0109
'''
model.fit(X_train, y_train, nb_epoch=3, batch_size=16, verbose=2)
# outputs
'''
Train on 37800 samples, validate on 4200 samples
Epoch 0
loss: 0.0190
Epoch 1
loss: 0.0146
Epoch 2
loss: 0.0049
'''
'''
Demonstration of show_accuracy
'''
model.fit(X_train, y_train, nb_epoch=3, batch_size=16, verbose=2, show_accuracy=True)
# outputs
'''
Train on 37800 samples, validate on 4200 samples
Epoch 0
loss: 0.0190 - acc.: 0.8750
Epoch 1
loss: 0.0146 - acc.: 0.8750
Epoch 2
loss: 0.0049 - acc.: 1.0000
'''
'''
Demonstration of validation_split
'''
model.fit(X_train, y_train, nb_epoch=3, batch_size=16, validation_split=0.1, show_accuracy=True, verbose=1)
# outputs
'''
Train on 37800 samples, validate on 4200 samples
Epoch 0
37800/37800 [==============================] - 7s - loss: 0.0385 - acc.: 0.7258 - val. loss: 0.0160 - val. acc.: 0.9136
Epoch 1
37800/37800 [==============================] - 8s - loss: 0.0140 - acc.: 0.9265 - val. loss: 0.0109 - val. acc.: 0.9383
Epoch 2
10960/37800 [=======>......................] - ETA: 4s - loss: 0.0109 - acc.: 0.9420
'''
```
---
## Graph
Arbitrary connection graph. It can have any number of inputs and outputs, with each output trained with its own loss function. The quantity being optimized by a Graph model is the sum of all loss functions over the different outputs.
```python
model = keras.models.Graph()
```
- __Methods__:
- __add_input__(name, input_shape, dtype='float'): Add an input with shape dimensionality `ndim`.
- __Arguments__:
- __input_shape__: Integer tuple, shape of the expected input (not including the samples axis). E.g. (10,) for 10-dimensional vectors, (None, 128) for sequences (of variable length) of 128-dimensional vectors, (3, 32, 32) for 32x32 images with RGB channels.
- __dtype__: `float` or `int`. Type of the expected input data.
- __add_output__(name, input=None, inputs=[], merge_mode='concat'): Add an output connect to `input` or `inputs`.
- __Arguments__:
- __name__: str. unique identifier of the output.
- __input__: str name of the node that the output is connected to. Only specify *one* of either `input` or `inputs`.
- __inputs__: list of str names of the node that the output is connected to.
- __merge_mode__: "sum" or "concat". Only applicable if `inputs` list is specified. Merge mode for the different inputs.
- __add_node__(layer, name, input=None, inputs=[], merge_mode='concat'): Add an output connect to `input` or `inputs`.
- __Arguments__:
- __layer__: Layer instance.
- __name__: str. unique identifier of the node.
- __input__: str name of the node/input that the node is connected to. Only specify *one* of either `input` or `inputs`.
- __inputs__: list of str names of the node that the node is connected to.
- __merge_mode__: "sum" or "concat". Only applicable if `inputs` list is specified. Merge mode for the different inputs.
- __add_shared_node__(layer, name, inputs=[], merge_mode=None, outputs=[]): Add a shared node connected to `inputs`. A shared node is a layer that will be applied separately to every incoming input, and that uses only one set of weights. The merging operation occurs on the outputs of the layer.
- __Arguments__:
- __layer__: Layer instance.
- __name__: str. unique identifier of the node.
- __inputs__: list of str names of the node that the node is connected to.
- __merge_mode__: Merge mode for the different inputs.
- __outputs__: Optional. List of names for outputs, when merge_mode = None.
- __compile__(optimizer, loss):
- __Arguments__:
- __optimizer__: str (name of optimizer) or optimizer object. See [optimizers](optimizers.md).
- __loss__: dictionary mapping the name(s) of the output(s) to a loss function (string name of objective function or objective function. See [objectives](objectives.md)).
- __fit__(data, batch_size=128, nb_epoch=100, verbose=1, validation_split=0., validation_data=None, shuffle=True, callbacks=[]): Train a model for a fixed number of epochs.
- __Return__: a history object. It `history` attribute is a record of training loss values at successive epochs, as well as validation loss values (if applicable).
- __Arguments__:
- __data__:dictionary mapping input names out outputs names to appropriate numpy arrays. All arrays should contain the same number of samples.
- __batch_size__: int. Number of samples per gradient update.
- __nb_epoch__: int.
- __verbose__: 0 for no logging to stdout, 1 for progress bar logging, 2 for one log line per epoch.
- __callbacks__: `keras.callbacks.Callback` list. List of callbacks to apply during training. See [callbacks](callbacks.md).
- __validation_split__: float (0. < x < 1). Fraction of the data to use as held-out validation data.
- __validation_data__: tuple (X, y) to be used as held-out validation data. Will override validation_split.
- __shuffle__: boolean. Whether to shuffle the samples at each epoch.
- __evaluate__(data, batch_size=128, verbose=1): Show performance of the model over some validation data.
- __Return__: The loss score over the data.
- __Arguments__: Same meaning as fit method above. verbose is used as a binary flag (progress bar or nothing).
- __predict__(data, batch_size=128, verbose=1):
- __Return__: A dictionary mapping output names to arrays of predictions over the data.
- __Arguments__: Same meaning as fit method above. Only inputs need to be specified in `data`.
- __train_on_batch__(data): Single gradient update on one batch.
- __Return__: loss over the data.
- __test_on_batch__(data): Single performance evaluation on one batch.
- __Return__: loss over the data.
- __save_weights__(fname, overwrite=False): Store the weights of all layers to a HDF5 file. If `overwrite==False` and the file already exists, an exception will be thrown.
- __load_weights__(fname): Sets the weights of a model, based to weights stored by __save_weights__. You can only __load_weights__ on a savefile from a model with an identical architecture. __load_weights__ can be called either before or after the __compile__ step.
__Examples__:
```python
# graph model with one input and two outputs
graph = Graph()
graph.add_input(name='input', input_shape=(32,))
graph.add_node(Dense(16), name='dense1', input='input')
graph.add_node(Dense(4), name='dense2', input='input')
graph.add_node(Dense(4), name='dense3', input='dense1')
graph.add_output(name='output1', input='dense2')
graph.add_output(name='output2', input='dense3')
graph.compile('rmsprop', {'output1':'mse', 'output2':'mse'})
history = graph.fit({'input':X_train, 'output1':y_train, 'output2':y2_train}, nb_epoch=10)
```
```python
# graph model with two inputs and one output
graph = Graph()
graph.add_input(name='input1', input_shape=(32,))
graph.add_input(name='input2', input_shape=(32,))
graph.add_node(Dense(16), name='dense1', input='input1')
graph.add_node(Dense(4), name='dense2', input='input2')
graph.add_node(Dense(4), name='dense3', input='dense1')
graph.add_output(name='output', inputs=['dense2', 'dense3'], merge_mode='sum')
graph.compile('rmsprop', {'output':'mse'})
history = graph.fit({'input1':X_train, 'input2':X2_train, 'output':y_train}, nb_epoch=10)
predictions = graph.predict({'input1':X_test, 'input2':X2_test}) # {'output':...}
```
-117
Ver Arquivo
@@ -1,117 +0,0 @@
## Usage of optimizers
An optimizer is one of the two arguments required for compiling a Keras model:
```python
model = Sequential()
model.add(Dense(64, init='uniform', input_dim=10))
model.add(Activation('tanh'))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
```
You can either instantiate an optimizer before passing it to `model.compile()` , as in the above example, or you can call it by its name. In the latter case, the default parameters for the optimizer will be used.
```python
# pass optimizer by name: default parameters will be used
model.compile(loss='mean_squared_error', optimizer='sgd')
```
---
## Base class
```python
keras.optimizers.Optimizer(**kwargs)
```
All optimizers descended from this class support the following keyword argument:
- __clipnorm__: float >= 0.
Note: this is base class for building optimizers, not an actual optimizer that can be used for training models.
---
## SGD
```python
keras.optimizers.SGD(lr=0.01, momentum=0., decay=0., nesterov=False)
```
__Arguments__:
- __lr__: float >= 0. Learning rate.
- __momentum__: float >= 0. Parameter updates momentum.
- __decay__: float >= 0. Learning rate decay over each update.
- __nesterov__: boolean. Whether to apply Nesterov momentum.
---
## Adagrad
```python
keras.optimizers.Adagrad(lr=0.01, epsilon=1e-6)
```
It is recommended to leave the parameters of this optimizer at their default values.
__Arguments__:
- __lr__: float >= 0. Learning rate.
- __epsilon__: float >= 0.
---
## Adadelta
```python
keras.optimizers.Adadelta(lr=1.0, rho=0.95, epsilon=1e-6)
```
It is recommended to leave the parameters of this optimizer at their default values.
__Arguments__:
- __lr__: float >= 0. Learning rate. It is recommended to leave it at the default value.
- __rho__: float >= 0.
- __epsilon__: float >= 0. Fuzz factor.
For more info, see *"Adadelta: an adaptive learning rate method"* by Matthew Zeiler.
---
## RMSprop
```python
keras.optimizers.RMSprop(lr=0.001, rho=0.9, epsilon=1e-6)
```
It is recommended to leave the parameters of this optimizer at their default values.
__Arguments__:
- __lr__: float >= 0. Learning rate.
- __rho__: float >= 0.
- __epsilon__: float >= 0. Fuzz factor.
---
## Adam
```python
keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-8)
```
Adam optimizer, proposed by Kingma and Lei Ba in [Adam: A Method For Stochastic Optimization](http://arxiv.org/pdf/1412.6980v8.pdf). Default parameters are those suggested in the paper.
__Arguments__:
- __lr__: float >= 0. Learning rate.
- __beta_1__, __beta_2__: floats, 0 < beta < 1. Generally close to 1.
- __epsilon__: float >= 0. Fuzz factor.
---
-70
Ver Arquivo
@@ -1,70 +0,0 @@
## ImageDataGenerator
```python
keras.preprocessing.image.ImageDataGenerator(featurewise_center=True,
samplewise_center=False,
featurewise_std_normalization=True,
samplewise_std_normalization=False,
zca_whitening=False,
rotation_range=0.,
width_shift_range=0.,
height_shift_range=0.,
horizontal_flip=False,
vertical_flip=False)
```
Generate batches of tensor image data with real-time data augmentation.
- __Arguments__:
- __featurewise_center__: Boolean. Set input mean to 0 over the dataset.
- __samplewise_center__: Boolean. Set each sample mean to 0.
- __featurewise_std_normalization__: Boolean. Divide inputs by std of the dataset.
- __samplewise_std_normalization__: Boolean. Divide each input by its std.
- __zca_whitening__: Boolean. Apply ZCA whitening.
- __rotation_range__: Int. Degree range for random rotations.
- __width_shift_range__: Float (fraction of total width). Range for random horizontal shifts.
- __height_shift_range__: Float (fraction of total height). Range for random vertical shifts.
- __horizontal_flip__: Boolean. Randomly flip inputs horizontally.
- __vertical_flip__: Boolean. Randomly flip inputs vertically.
- __Methods__:
- __fit(X)__: Required if featurewise_center or featurewise_std_normalization or zca_whitening. Compute necessary quantities on some sample data.
- __Arguments__:
- __X__: sample data.
- __augment__: Boolean (default: False). Whether to fit on randomly augmented samples.
- __rounds__: int (default: 1). If augment, how many augmentation passes over the data to use.
- __flow(X, y)__:
- __Arguments__:
- __X__: data.
- __y__: labels.
- __batch_size__: int (default: 32).
- __shuffle__: boolean (defaut: False).
- __save_to_dir__: None or str. This allows you to optimally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing).
- __save_prefix__: str. Prefix to use for filenames of saved pictures.
- __save_format__: one of "png", jpeg".
- __Example__:
```python
(X_train, y_train), (X_test, y_test) = cifar10.load_data(test_split=0.1)
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
datagen = ImageDataGenerator(
featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True)
# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(X_train)
for e in range(nb_epoch):
print 'Epoch', e
# batch train with realtime data augmentation
for X_batch, Y_batch in datagen.flow(X_train, Y_train):
loss = model.train(X_batch, Y_batch)
```
@@ -14,11 +14,13 @@ is equivalent to:
model.add(Dense(64, activation='tanh'))
```
You can also pass an element-wise Theano function as an activation:
You can also pass an element-wise Theano/TensorFlow function as an activation:
```python
from keras import backend as K
def tanh(x):
return theano.tensor.tanh(x)
return K.tanh(x)
model.add(Dense(64, activation=tanh))
model.add(Activation(tanh))
@@ -28,6 +30,7 @@ model.add(Activation(tanh))
- __softmax__: Softmax applied across inputs last dimension. Expects shape either `(nb_samples, nb_timesteps, nb_dims)` or `(nb_samples, nb_dims)`.
- __softplus__
- __softsign__
- __relu__
- __tanh__
- __sigmoid__
@@ -36,4 +39,4 @@ model.add(Activation(tanh))
## On Advanced Activations
Activations that are more complex than a simple Theano function (eg. learnable activations, configurable activations, etc.) are available as [Advanced Activation layers](layers/advanced_activations.md), and can be found in the module `keras.layers.advanced_activations`. These include PReLU and LeakyReLU.
Activations that are more complex than a simple Theano/TensorFlow function (eg. learnable activations, configurable activations, etc.) are available as [Advanced Activation layers](layers/advanced-activations.md), and can be found in the module `keras.layers.advanced_activations`. These include PReLU and LeakyReLU.
+260
Ver Arquivo
@@ -0,0 +1,260 @@
# Applications
Keras Applications are deep learning models that are made available alongside pre-trained weights.
These models can be used for prediction, feature extraction, and fine-tuning.
Weights are downloaded automatically when instantiating a model. They are stored at `~/.keras/models/`.
## Available models
Models for image classification with weights trained on ImageNet:
- [VGG16](#vgg16)
- [VGG19](#vgg19)
- [ResNet50](#resnet50)
- [InceptionV3](#inceptionv3)
All of these architectures are compatible with both TensorFlow and Theano, and upon instantiation the models will be built according to the image dimension ordering set in your Keras configuration file at `~/.keras/keras.json`. For instance, if you have set `image_dim_ordering=tf`, then any model loaded from this repository will get built according to the TensorFlow dimension ordering convention, "Width-Height-Depth".
-----
## Examples
### Classify ImageNet classes with ResNet50
```python
from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from keras.applications.resnet50 import preprocess_input, decode_predictions
model = ResNet50(weights='imagenet')
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = model.predict(x)
print('Predicted:', decode_predictions(preds))
# print: [[u'n02504458', u'African_elephant']]
```
### Extract features with VGG16
```python
from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
model = VGG16(weights='imagenet', include_top=False)
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
features = model.predict(x)
```
### Extract features from an arbitrary intermediate layer with VGG19
```python
from keras.applications.vgg19 import VGG19
from keras.preprocessing import image
from keras.applications.vgg19 import preprocess_input
from keras.models import Model
base_model = VGG19(weights='imagenet')
model = Model(input=base_model.input, output=base_model.get_layer('block4_pool').output)
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
block4_pool_features = model.predict(x)
```
### Fine-tune InceptionV3 on a new set of classes
```python
from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K
# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)
# this is the model we will train
model = Model(input=base_model.input, output=predictions)
# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
layer.trainable = False
# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
# train the model on the new data for a few epochs
model.fit_generator(...)
# at this point, the top layers are well trained and we can start fine-tuning
# convolutional layers from inception V3. We will freeze the bottom N layers
# and train the remaining top layers.
# let's visualize layer names and layer indices to see how many layers
# we should freeze:
for i, layer in enumerate(base_model.layers):
print(i, layer.name)
# we chose to train the top 2 inception blocks, i.e. we will freeze
# the first 172 layers and unfreeze the rest:
for layer in model.layers[:172]:
layer.trainable = False
for layer in model.layers[172:]:
layer.trainable = True
# we need to recompile the model for these modifications to take effect
# we use SGD with a low learning rate
from keras.optimizers import SGD
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy')
# we train our model again (this time fine-tuning the top 2 inception blocks
# alongside the top Dense layers
model.fit_generator(...)
```
### Build InceptionV3 over a custom input tensor
```python
from keras.applications.inception_v3 import InceptionV3
from keras.layers import Input
# this could also be the output a different Keras model or layer
input_tensor = Input(shape=(224, 224, 3)) # this assumes K.image_dim_ordering() == 'tf'
model = InceptionV3(input_tensor=input_tensor, weights='imagenet', include_top=True)
```
-----
## VGG16
```python
keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', input_tensor=None)
```
### Arguments
- include_top: whether to include the 3 fully-connected layers at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
### Returns
A Keras model instance.
### References
- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556): please cite this paper if you use the VGG models in your work.
### License
These weights are ported from the ones [released by VGG at Oxford](http://www.robots.ox.ac.uk/~vgg/research/very_deep/) under the [Creative Commons Attribution License](https://creativecommons.org/licenses/by/4.0/).
-----
## VGG19
```python
keras.applications.vgg19.VGG19(include_top=True, weights='imagenet', input_tensor=None)
```
### Arguments
- include_top: whether to include the 3 fully-connected layers at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
### Returns
A Keras model instance.
### References
- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)
### License
These weights are ported from the ones [released by VGG at Oxford](http://www.robots.ox.ac.uk/~vgg/research/very_deep/) under the [Creative Commons Attribution License](https://creativecommons.org/licenses/by/4.0/).
-----
## ResNet50
```python
keras.applications.resnet50.ResNet50(include_top=True, weights='imagenet', input_tensor=None)
```
### Arguments
- include_top: whether to include the 3 fully-connected layers at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
### Returns
A Keras model instance.
### References
- [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)
### License
These weights are ported from the ones [released by Kaiming He](https://github.com/KaimingHe/deep-residual-networks) under the [MIT license](https://github.com/KaimingHe/deep-residual-networks/blob/master/LICENSE).
-----
## InceptionV3
```python
keras.applications.inception_v3.InceptionV3(include_top=True, weights='imagenet', input_tensor=None)
```
### Arguments
- include_top: whether to include the 3 fully-connected layers at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
### Returns
A Keras model instance.
### References
- [Rethinking the Inception Architecture for Computer Vision](http://arxiv.org/abs/1512.00567)
### License
These weights are trained by ourselves and are released under the MIT license.
+35 -10
Ver Arquivo
@@ -4,10 +4,14 @@
Keras is a model-level library, providing high-level building blocks for developing deep learning models. It does not handle itself low-level operations such as tensor products, convolutions and so on. Instead, it relies on a specialized, well-optimized tensor manipulation library to do so, serving as the "backend engine" of Keras. Rather than picking one single tensor library and making the implementation of Keras tied to that library, Keras handles the problem in a modular way, and several different backend engines can be plugged seamlessly into Keras.
At this time, Keras has two backend implementations available: the **Theano** backend and the **TensorFlow** backend.
At this time, Keras has two backend implementations available: the **TensorFlow** backend and the **Theano** backend.
- [Theano](http://deeplearning.net/software/theano/) is an open-source symbolic tensor manipulation framework developed by LISA/MILA Lab at Université de Montréal.
- [TensorFlow](http://www.tensorflow.org/) is an open-source symbolic tensor manipulation framework developed by Google, Inc.
- [Theano](http://deeplearning.net/software/theano/) is an open-source symbolic tensor manipulation framework developed by LISA/MILA Lab at Université de Montréal.
In the future, we are likely to add more backend options. If you are interested in developing a new backend, get in touch!
----
## Switching from one backend to another
@@ -17,12 +21,29 @@ If you have run Keras at least once, you will find the Keras configuration file
If it isn't there, you can create it.
It probably looks like this:
The default configuration file looks like this:
`{"epsilon": 1e-07, "floatx": "float32", "backend": "theano"}`
```
{
"image_dim_ordering": "tf",
"epsilon": 1e-07,
"floatx": "float32",
"backend": "tensorflow"
}
```
Simply change the field `backend` to either `"theano"` or `"tensorflow"`, and Keras will use the new configuration next time you run any Keras code.
You can also define the environment variable ``KERAS_BACKEND`` and this will
override what is defined in your config file :
```bash
KERAS_BACKEND=tensorflow python -c "from keras import backend"
Using TensorFlow backend.
```
----
## Using the abstract Keras backend to write new code
If you want the Keras modules you write to be compatible with both Theano and TensorFlow, you have to write them via the abstract Keras backend API. Here's an intro.
@@ -32,7 +53,7 @@ You can import the backend module via:
from keras import backend as K
```
This instantiates an input placeholder. It's equivalent to `tf.placeholder()` or `T.matrix()`, `T.tensor3()`, etc.
The code below instantiates an input placeholder. It's equivalent to `tf.placeholder()` or `T.matrix()`, `T.tensor3()`, etc.
```python
input = K.placeholder(shape=(2, 4, 5))
@@ -42,16 +63,16 @@ input = K.placeholder(shape=(None, 4, 5))
input = K.placeholder(ndim=3)
```
This instantiates a shared variable. It's equivalent to `tf.variable()` or `theano.shared()`.
The code below instantiates a shared variable. It's equivalent to `tf.variable()` or `theano.shared()`.
```python
val = np.random.random((3, 4, 5))
var = K.variable(value=val)
# all-zeros variable:
var = K.ones(shape=(3, 4, 5))
# all-ones:
var = K.zeros(shape=(3, 4, 5))
# all-ones:
var = K.ones(shape=(3, 4, 5))
```
Most tensor operations you will need can be done as you would in TensorFlow or Theano:
@@ -65,8 +86,12 @@ a = concatenate([b, c], axis=-1)
# etc...
```
For more information, see the code at `keras/backend/theano_backend.py` and `keras/backend/tensorflow_backend.py`.
----
## Backend functions
{{autogenerated}}
+2 -41
Ver Arquivo
@@ -4,51 +4,12 @@ A callback is a set of functions to be applied at given stages of the training p
---
## Base class
```python
keras.callbacks.Callback()
```
- __Properties__:
- __params__: dict. Training parameters (eg. verbosity, batch size, number of epochs...).
- __model__: `keras.models.Model`. Reference of the model being trained.
- __Methods__:
- __on_train_begin__(logs={}): Method called at the beginning of training.
- __on_train_end__(logs={}): Method called at the end of training.
- __on_epoch_begin__(epoch, logs={}): Method called at the beginning of epoch `epoch`.
- __on_epoch_end__(epoch, logs={}): Method called at the end of epoch `epoch`.
- __on_batch_begin__(batch, logs={}): Method called at the beginning of batch `batch`.
- __on_batch_end__(batch, logs={}): Method called at the end of batch `batch`.
The `logs` dictionary will contain keys for quantities relevant to the current batch or epoch. Currently, the `.fit()` method of the `Sequential` model class will include the following quantities in the `logs` that it passes to its callbacks:
- __on_epoch_end__: logs optionally include `val_loss` (if validation is enabled in `fit`), and `val_accuracy` (if validation and accuracy monitoring are enabled).
- __on_batch_begin__: logs include `size`, the number of samples in the current batch.
- __on_batch_end__: logs include `loss`, and optionally `accuracy` (if accuracy monitoring is enabled).
---
## Available callbacks
```python
keras.callbacks.ModelCheckpoint(filepath, verbose=0, save_best_only=False)
```
Save the model after every epoch. If `save_best_only=True`, the latest best model according to the validation loss will not be overwritten.
`filepath` can contain named formatting options, which will be filled the value of `epoch` and keys in `logs` (passed in `on_epoch_end`).
For example: if `filepath` is `weights.{epoch:02d}-{val_loss:.2f}.hdf5`, then multiple files will be save with the epoch number and the validation loss.
```python
keras.callbacks.EarlyStopping(monitor='val_loss', patience=0, verbose=0)
```
Stop training after no improvement of the metric `monitor` is seen for `patience` epochs.
{{autogenerated}}
---
## Create a callback
# Create a callback
You can create a custom callback by extending the base class `keras.callbacks.Callback`. A callback has access to its associated model through the class property `self.model`.
+35 -16
Ver Arquivo
@@ -2,13 +2,13 @@
## CIFAR10 small image classification
`keras.datasets.cifar10`
Dataset of 50,000 32x32 color training images, labeled over 10 categories, and 10,000 test images.
### Usage:
```python
from keras.datasets import cifar10
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
```
@@ -21,13 +21,13 @@ Dataset of 50,000 32x32 color training images, labeled over 10 categories, and 1
## CIFAR100 small image classification
`keras.datasets.cifar100`
Dataset of 50,000 32x32 color training images, labeled over 100 categories, and 10,000 test images.
### Usage:
```python
from keras.datasets import cifar100
(X_train, y_train), (X_test, y_test) = cifar100.load_data(label_mode='fine')
```
@@ -44,8 +44,6 @@ Dataset of 50,000 32x32 color training images, labeled over 100 categories, and
## IMDB Movie reviews sentiment classification
`keras.datasets.imdb`
Dataset of 25,000 movies reviews from IMDB, labeled by sentiment (positive/negative). Reviews have been preprocessed, and each review is encoded as a [sequence](preprocessing/sequence.md) of word indexes (integers). For convenience, words are indexed by overall frequency in the dataset, so that for instance the integer "3" encodes the 3rd most frequent word in the data. This allows for quick filtering operations such as: "only consider the top 10,000 most common words, but eliminate the top 20 most common words".
As a convention, "0" does not stand for a specific word, but instead is used to encode any unknown word.
@@ -53,8 +51,16 @@ As a convention, "0" does not stand for a specific word, but instead is used to
### Usage:
```python
(X_train, y_train), (X_test, y_test) = imdb.load_data(path="imdb.pkl", \
nb_words=None, skip_top=0, maxlen=None, test_split=0.1, seed=113)
from keras.datasets import imdb
(X_train, y_train), (X_test, y_test) = imdb.load_data(path="imdb_full.pkl",
nb_words=None,
skip_top=0,
maxlen=None,
seed=113,
start_char=1,
oov_char=2,
index_from=3)
```
- __Return:__
- 2 tuples:
@@ -67,25 +73,38 @@ nb_words=None, skip_top=0, maxlen=None, test_split=0.1, seed=113)
- __nb_words__: integer or None. Top most frequent words to consider. Any less frequent word will appear as 0 in the sequence data.
- __skip_top__: integer. Top most frequent words to ignore (they will appear as 0s in the sequence data).
- __maxlen__: int. Maximum sequence length. Any longer sequence will be truncated.
- __test_split__: float. Fraction of the dataset to be used as test data.
- __seed__: int. Seed for reproducible data shuffling.
- __start_char__: char. The start of a sequence will be marked with this character.
Set to 1 because 0 is usually the padding character.
- __oov_char__: char. words that were cut out because of the `nb_words`
or `skip_top` limit will be replaced with this character.
- __index_from__: int. Index actual words with this index and higher.
---
## Reuters newswire topics classification
`keras.datasets.reuters`
Dataset of 11,228 newswires from Reuters, labeled over 46 topics. As with the IMDB dataset, each wire is encoded as a sequence of word indexes (same conventions).
### Usage:
```python
(X_train, y_train), (X_test, y_test) = reuters.load_data(path="reuters.pkl", \
nb_words=None, skip_top=0, maxlen=None, test_split=0.1, seed=113)
from keras.datasets import reuters
(X_train, y_train), (X_test, y_test) = reuters.load_data(path="reuters.pkl",
nb_words=None,
skip_top=0,
maxlen=None,
test_split=0.2,
seed=113,
start_char=1,
oov_char=2,
index_from=3)
```
The specifications are the same as that of the IMDB dataset.
The specifications are the same as that of the IMDB dataset, with the addition of:
- __test_split__: float. Fraction of the dataset to be used as test data.
This dataset also makes available the word index used for encoding the sequences:
@@ -101,13 +120,13 @@ word_index = reuters.get_word_index(path="reuters_word_index.pkl")
## MNIST database of handwritten digits
`keras.datasets.mnist`
Dataset of 60,000 28x28 grayscale images of the 10 digits, along with a test set of 10,000 images.
### Usage:
```python
from keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
```
+385
Ver Arquivo
@@ -0,0 +1,385 @@
# Keras FAQ: Frequently Asked Keras Questions
- [How should I cite Keras?](#how-should-i-cite-keras)
- [How can I run Keras on GPU?](#how-can-i-run-keras-on-gpu)
- [How can I save a Keras model?](#how-can-i-save-a-keras-model)
- [Why is the training loss much higher than the testing loss?](#why-is-the-training-loss-much-higher-than-the-testing-loss)
- [How can I visualize the output of an intermediate layer?](#how-can-i-visualize-the-output-of-an-intermediate-layer)
- [How can I use Keras with datasets that don't fit in memory?](#how-can-i-use-keras-with-datasets-that-dont-fit-in-memory)
- [How can I interrupt training when the validation loss isn't decreasing anymore?](#how-can-i-interrupt-training-when-the-validation-loss-isnt-decreasing-anymore)
- [How is the validation split computed?](#how-is-the-validation-split-computed)
- [Is the data shuffled during training?](#is-the-data-shuffled-during-training)
- [How can I record the training / validation loss / accuracy at each epoch?](#how-can-i-record-the-training-validation-loss-accuracy-at-each-epoch)
- [How can I "freeze" layers?](#how-can-i-freeze-keras-layers)
- [How can I use stateful RNNs?](#how-can-i-use-stateful-rnns)
- [How can I remove a layer from a Sequential model?](#how-can-i-remove-a-layer-from-a-sequential-model)
- [How can I use pre-trained models in Keras?](#how-can-i-use-pre-trained-models-in-keras)
---
### How should I cite Keras?
Please cite Keras in your publications if it helps your research. Here is an example BibTeX entry:
```
@misc{chollet2015keras,
title={Keras},
author={Chollet, Fran\c{c}ois},
year={2015},
publisher={GitHub},
howpublished={\url{https://github.com/fchollet/keras}},
}
```
### How can I run Keras on GPU?
If you are running on the TensorFlow backend, your code will automatically run on GPU if any available GPU is detected.
If you are running on the Theano backend, you can use one of the following methods:
Method 1: use Theano flags.
```bash
THEANO_FLAGS=device=gpu,floatX=float32 python my_keras_script.py
```
The name 'gpu' might have to be changed depending on your device's identifier (e.g. `gpu0`, `gpu1`, etc).
Method 2: set up your `.theanorc`: [Instructions](http://deeplearning.net/software/theano/library/config.html)
Method 3: manually set `theano.config.device`, `theano.config.floatX` at the beginning of your code:
```python
import theano
theano.config.device = 'gpu'
theano.config.floatX = 'float32'
```
---
### How can I save a Keras model?
*It is not recommended to use pickle or cPickle to save a Keras model.*
You can use `model.save(filepath)` to save a Keras model into a single HDF5 file which will contain:
- the architecture of the model, allowing to re-create the model
- the weights of the model
- the training configuration (loss, optimizer)
- the state of the optimizer, allowing to resume training exactly where you left off.
You can then use `keras.models.load_model(filepath)` to reinstantiate your model.
`load_model` will also take care of compiling the model using the saved training configuration
(unless the model was never compiled in the first place).
Example:
```python
from keras.models import load_model
model.save('my_model.h5') # creates a HDF5 file 'my_model.h5'
del model # deletes the existing model
# returns a compiled model
# identical to the previous one
model = load_model('my_model.h5')
```
If you only need to save the **architecture of a model**, and not its weights or its training configuration, you can do:
```python
# save as JSON
json_string = model.to_json()
# save as YAML
yaml_string = model.to_yaml()
```
The generated JSON / YAML files are human-readable and can be manually edited if needed.
You can then build a fresh model from this data:
```python
# model reconstruction from JSON:
from keras.models import model_from_json
model = model_from_json(json_string)
# model reconstruction from YAML
model = model_from_yaml(yaml_string)
```
If you need to save the **weights of a model**, you can do so in HDF5 with the code below.
Note that you will first need to install HDF5 and the Python library h5py, which do not come bundled with Keras.
```python
model.save_weights('my_model_weights.h5')
```
Assuming you have code for instantiating your model, you can then load the weights you saved into a model with the *same* architecture:
```python
model.load_weights('my_model_weights.h5')
```
If you need to load weights into a *different* architecture (with some layers in common), for instance for fine-tuning or transfer-learning, you can load weights by *layer name*:
```python
model.load_weights('my_model_weights.h5', by_name=True)
```
For example:
```python
"""
Assume original model looks like this:
model = Sequential()
model.add(Dense(2, input_dim=3, name="dense_1"))
model.add(Dense(3, name="dense_2"))
...
model.save_weights(fname)
"""
# new model
model = Sequential()
model.add(Dense(2, input_dim=3, name="dense_1")) # will be loaded
model.add(Dense(10, name="new_dense")) # will not be loaded
# load weights from first model; will only affect the first layer, dense_1.
model.load_weights(fname, by_name=True)
```
---
### Why is the training loss much higher than the testing loss?
A Keras model has two modes: training and testing. Regularization mechanisms, such as Dropout and L1/L2 weight regularization, are turned off at testing time.
Besides, the training loss is the average of the losses over each batch of training data. Because your model is changing over time, the loss over the first batches of an epoch is generally higher than over the last batches. On the other hand, the testing loss for an epoch is computed using the model as it is at the end of the epoch, resulting in a lower loss.
---
### How can I visualize the output of an intermediate layer?
You can build a Keras function that will return the output of a certain layer given a certain input, for example:
```python
from keras import backend as K
# with a Sequential model
get_3rd_layer_output = K.function([model.layers[0].input],
[model.layers[3].output])
layer_output = get_3rd_layer_output([X])[0]
```
Similarly, you could build a Theano and TensorFlow function directly.
Note that if your model has a different behavior in training and testing phase (e.g. if it uses `Dropout`, `BatchNormalization`, etc.), you will need
to pass the learning phase flag to your function:
```python
get_3rd_layer_output = K.function([model.layers[0].input, K.learning_phase()],
[model.layers[3].output])
# output in test mode = 0
layer_output = get_3rd_layer_output([X, 0])[0]
# output in train mode = 1
layer_output = get_3rd_layer_output([X, 1])[0]
```
Another more flexible way of getting output from intermediate layers is to use the [functional API](/getting-started/functional-api-guide). For example, if you have created an autoencoder for MNIST:
```python
inputs = Input(shape=(784,))
encoded = Dense(32, activation='relu')(inputs)
decoded = Dense(784)(encoded)
model = Model(input=inputs, output=decoded)
```
After compiling and training the model, you can get the output of the data from the encoder like this:
```python
encoder = Model(input=inputs, output=encoded)
X_encoded = encoder.predict(X)
```
---
### How can I use Keras with datasets that don't fit in memory?
You can do batch training using `model.train_on_batch(X, y)` and `model.test_on_batch(X, y)`. See the [models documentation](/models/sequential).
Alternatively, you can write a generator that yields batches of training data and use the method `model.fit_generator(data_generator, samples_per_epoch, nb_epoch)`.
You can see batch training in action in our [CIFAR10 example](https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py).
---
### How can I interrupt training when the validation loss isn't decreasing anymore?
You can use an `EarlyStopping` callback:
```python
from keras.callbacks import EarlyStopping
early_stopping = EarlyStopping(monitor='val_loss', patience=2)
model.fit(X, y, validation_split=0.2, callbacks=[early_stopping])
```
Find out more in the [callbacks documentation](/callbacks).
---
### How is the validation split computed?
If you set the `validation_split` argument in `model.fit` to e.g. 0.1, then the validation data used will be the *last 10%* of the data. If you set it to 0.25, it will be the last 25% of the data, etc.
---
### Is the data shuffled during training?
Yes, if the `shuffle` argument in `model.fit` is set to `True` (which is the default), the training data will be randomly shuffled at each epoch.
Validation data is never shuffled.
---
### How can I record the training / validation loss / accuracy at each epoch?
The `model.fit` method returns an `History` callback, which has a `history` attribute containing the lists of successive losses and other metrics.
```python
hist = model.fit(X, y, validation_split=0.2)
print(hist.history)
```
---
### How can I "freeze" Keras layers?
To "freeze" a layer means to exclude it from training, i.e. its weights will never be updated. This is useful in the context of fine-tuning a model, or using fixed embeddings for a text input.
You can pass a `trainable` argument (boolean) to a layer constructor to set a layer to be non-trainable:
```python
frozen_layer = Dense(32, trainable=False)
```
Additionally, you can set the `trainable` property of a layer to `True` or `False` after instantiation. For this to take effect, you will need to call `compile()` on your model after modifying the `trainable` property. Here's an example:
```python
x = Input(shape=(32,))
layer = Dense(32)
layer.trainable = False
y = layer(x)
frozen_model = Model(x, y)
# in the model below, the weights of `layer` will not be updated during training
frozen_model.compile(optimizer='rmsprop', loss='mse')
layer.trainable = True
trainable_model = Model(x, y)
# with this model the weights of the layer will be updated during training
# (which will also affect the above model since it uses the same layer instance)
trainable_model.compile(optimizer='rmsprop', loss='mse')
frozen_model.fit(data, labels) # this does NOT update the weights of `layer`
trainable_model.fit(data, labels) # this updates the weights of `layer`
```
---
### How can I use stateful RNNs?
Making a RNN stateful means that the states for the samples of each batch will be reused as initial states for the samples in the next batch.
When using stateful RNNs, it is therefore assumed that:
- all batches have the same number of samples
- If `X1` and `X2` are successive batches of samples, then `X2[i]` is the follow-up sequence to `X1[i]`, for every `i`.
To use statefulness in RNNs, you need to:
- explicitly specify the batch size you are using, by passing a `batch_input_shape` argument to the first layer in your model. It should be a tuple of integers, e.g. `(32, 10, 16)` for a 32-samples batch of sequences of 10 timesteps with 16 features per timestep.
- set `stateful=True` in your RNN layer(s).
To reset the states accumulated:
- use `model.reset_states()` to reset the states of all layers in the model
- use `layer.reset_states()` to reset the states of a specific stateful RNN layer
Example:
```python
X # this is our input data, of shape (32, 21, 16)
# we will feed it to our model in sequences of length 10
model = Sequential()
model.add(LSTM(32, batch_input_shape=(32, 10, 16), stateful=True))
model.add(Dense(16, activation='softmax'))
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
# we train the network to predict the 11th timestep given the first 10:
model.train_on_batch(X[:, :10, :], np.reshape(X[:, 10, :], (32, 16)))
# the state of the network has changed. We can feed the follow-up sequences:
model.train_on_batch(X[:, 10:20, :], np.reshape(X[:, 20, :], (32, 16)))
# let's reset the states of the LSTM layer:
model.reset_states()
# another way to do it in this case:
model.layers[0].reset_states()
```
Notes that the methods `predict`, `fit`, `train_on_batch`, `predict_classes`, etc. will *all* update the states of the stateful layers in a model. This allows you to do not only stateful training, but also stateful prediction.
---
### How can I remove a layer from a Sequential model?
You can remove the last added layer in a Sequential model by calling `.pop()`:
```python
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=784))
model.add(Dense(32, activation='relu'))
print(len(model.layers)) # "2"
model.pop()
print(len(model.layers)) # "1"
```
---
### How can I use pre-trained models in Keras?
Code and pre-trained weights are available for the following image classification models:
- VGG16
- VGG19
- ResNet50
- Inception v3
They can be imported from the module `keras.applications`:
```python
from keras.applications.vgg16 import VGG16
from keras.applications.vgg19 import VGG19
from keras.applications.resnet50 import ResNet50
from keras.applications.inception_v3 import InceptionV3
model = VGG16(weights='imagenet', include_top=True)
```
For a few simple usage examples, see [the documentation for the Applications module](/applications).
For a detailed example of how to use such a pre-trained model for feature extraction or for fine-tuning, see [this blog post](http://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html).
The VGG16 model is also the basis for several Keras example scripts:
- [Style transfer](https://github.com/fchollet/keras/blob/master/examples/neural_style_transfer.py)
- [Feature visualization](https://github.com/fchollet/keras/blob/master/examples/conv_filter_visualization.py)
- [Deep dream](https://github.com/fchollet/keras/blob/master/examples/deep_dream.py)
+421
Ver Arquivo
@@ -0,0 +1,421 @@
# Getting started with the Keras functional API
The Keras functional API is the way to go for defining complex models, such as multi-output models, directed acyclic graphs, or models with shared layers.
This guide assumes that you are already familiar with the `Sequential` model.
Let's start with something simple.
-----
## First example: fully connected network
The `Sequential` model is probably a better choice to implement such a network, but it helps to start with something really simple.
- A layer instance is callable (on a tensor), and it returns a tensor
- Input tensor(s) and output tensor(s) can then be used to define a `Model`
- Such a model can be trained just like Keras `Sequential` models.
```python
from keras.layers import Input, Dense
from keras.models import Model
# this returns a tensor
inputs = Input(shape=(784,))
# a layer instance is callable on a tensor, and returns a tensor
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)
# this creates a model that includes
# the Input layer and three Dense layers
model = Model(input=inputs, output=predictions)
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['accuracy'])
model.fit(data, labels) # starts training
```
-----
## All models are callable, just like layers
With the functional API, it is easy to re-use trained models: you can treat any model as if it were a layer, by calling it on a tensor. Note that by calling a model you aren't just re-using the *architecture* of the model, you are also re-using its weights.
```python
x = Input(shape=(784,))
# this works, and returns the 10-way softmax we defined above.
y = model(x)
```
This can allow, for instance, to quickly create models that can process *sequences* of inputs. You could turn an image classification model into a video classification model, in just one line.
```python
from keras.layers import TimeDistributed
# input tensor for sequences of 20 timesteps,
# each containing a 784-dimensional vector
input_sequences = Input(shape=(20, 784))
# this applies our previous model to every timestep in the input sequences.
# the output of the previous model was a 10-way softmax,
# so the output of the layer below will be a sequence of 20 vectors of size 10.
processed_sequences = TimeDistributed(model)(input_sequences)
```
-----
## Multi-input and multi-output models
Here's a good use case for the functional API: models with multiple inputs and outputs. The functional API makes it easy to manipulate a large number of intertwined datastreams.
Let's consider the following model. We seek to predict how many retweets and likes a news headline will receive on Twitter. The main input to the model will be the headline itself, as a sequence of words, but to spice things up, our model will also have an auxiliary input, receiving extra data such as the time of day when the headline was posted, etc.
The model will also be supervised via two loss functions. Using the main loss function earlier in a model is a good regularization mechanism for deep models.
Here's what our model looks like:
<img src="https://s3.amazonaws.com/keras.io/img/multi-input-multi-output-graph.png" alt="multi-input-multi-output-graph" style="width: 400px;"/>
Let's implement it with the functional API.
The main input will receive the headline, as a sequence of integers (each integer encodes a word).
The integers will be between 1 and 10,000 (a vocabulary of 10,000 words) and the sequences will be 100 words long.
```python
from keras.layers import Input, Embedding, LSTM, Dense, merge
from keras.models import Model
# headline input: meant to receive sequences of 100 integers, between 1 and 10000.
# note that we can name any layer by passing it a "name" argument.
main_input = Input(shape=(100,), dtype='int32', name='main_input')
# this embedding layer will encode the input sequence
# into a sequence of dense 512-dimensional vectors.
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)
# a LSTM will transform the vector sequence into a single vector,
# containing information about the entire sequence
lstm_out = LSTM(32)(x)
```
Here we insert the auxiliary loss, allowing the LSTM and Embedding layer to be trained smoothly even though the main loss will be much higher in the model.
```python
auxiliary_loss = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)
```
At this point, we feed into the model our auxiliary input data by concatenating it with the LSTM output:
```python
auxiliary_input = Input(shape=(5,), name='aux_input')
x = merge([lstm_out, auxiliary_input], mode='concat')
# we stack a deep fully-connected network on top
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
# and finally we add the main logistic regression layer
main_loss = Dense(1, activation='sigmoid', name='main_output')(x)
```
This defines a model with two inputs and two outputs:
```python
model = Model(input=[main_input, auxiliary_input], output=[main_loss, auxiliary_loss])
```
We compile the model and assign a weight of 0.2 to the auxiliary loss.
To specify different `loss_weights` or `loss` for each different output, you can use a list or a dictionary.
Here we pass a single loss as the `loss` argument, so the same loss will be used on all outputs.
```python
model.compile(optimizer='rmsprop', loss='binary_crossentropy',
loss_weights=[1., 0.2])
```
We can train the model by passing it lists of input arrays and target arrays:
```python
model.fit([headline_data, additional_data], [labels, labels],
nb_epoch=50, batch_size=32)
```
Since our inputs and outputs are named (we passed them a "name" argument),
We could also have compiled the model via:
```python
model.compile(optimizer='rmsprop',
loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
loss_weights={'main_output': 1., 'aux_output': 0.2})
# and trained it via:
model.fit({'main_input': headline_data, 'aux_input': additional_data},
{'main_output': labels, 'aux_output': labels},
nb_epoch=50, batch_size=32)
```
-----
## Shared layers
Another good use for the functional API are models that use shared layers. Let's take a look at shared layers.
Let's consider a dataset of tweets. We want to build a model that can tell whether two tweets are from the same person or not (this can allow us to compare users by the similarity of their tweets, for instance).
One way to achieve this is to build a model that encodes two tweets into two vectors, concatenates the vectors and adds a logistic regression of top, outputting a probability that the two tweets share the same author. The model would then be trained on positive tweet pairs and negative tweet pairs.
Because the problem is symmetric, the mechanism that encodes the first tweet should be reused (weights and all) to encode the second tweet. Here we use a shared LSTM layer to encode the tweets.
Let's build this with the functional API. We will take as input for a tweet a binary matrix of shape `(140, 256)`, i.e. a sequence of 140 vectors of size 256, where each dimension in the 256-dimensional vector encodes the presence/absence of a character (out of an alphabet of 256 frequent characters).
```python
from keras.layers import Input, LSTM, Dense, merge
from keras.models import Model
tweet_a = Input(shape=(140, 256))
tweet_b = Input(shape=(140, 256))
```
To share a layer across different inputs, simply instantiate the layer once, then call it on as many inputs as you want:
```python
# this layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)
# when we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)
# we can then concatenate the two vectors:
merged_vector = merge([encoded_a, encoded_b], mode='concat', concat_axis=-1)
# and add a logistic regression on top
predictions = Dense(1, activation='sigmoid')(merged_vector)
# we define a trainable model linking the
# tweet inputs to the predictions
model = Model(input=[tweet_a, tweet_b], output=predictions)
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
model.fit([data_a, data_b], labels, nb_epoch=10)
```
Let's pause to take a look at how to read the shared layer's output or output shape.
-----
## The concept of layer "node"
Whenever you are calling a layer on some input, you are creating a new tensor (the output of the layer), and you are adding a "node" to the layer, linking the input tensor to the output tensor. When you are calling the same layer multiple times, that layer owns multiple nodes indexed as 0, 1, 2...
In previous versions of Keras, you could obtain the output tensor of a layer instance via `layer.get_output()`, or its output shape via `layer.output_shape`. You still can (except `get_output()` has been replaced by the property `output`). But what if a layer is connected to multiple inputs?
As long as a layer is only connected to one input, there is no confusion, and `.output` will return the one output of the layer:
```python
a = Input(shape=(140, 256))
lstm = LSTM(32)
encoded_a = lstm(a)
assert lstm.output == encoded_a
```
Not so if the layer has multiple inputs:
```python
a = Input(shape=(140, 256))
b = Input(shape=(140, 256))
lstm = LSTM(32)
encoded_a = lstm(a)
encoded_b = lstm(b)
lstm.output
```
```
>> AssertionError: Layer lstm_1 has multiple inbound nodes,
hence the notion of "layer output" is ill-defined.
Use `get_output_at(node_index)` instead.
```
Okay then. The following works:
```python
assert lstm.get_output_at(0) == encoded_a
assert lstm.get_output_at(1) == encoded_b
```
Simple enough, right?
The same is true for the properties `input_shape` and `output_shape`: as long as the layer has only one node, or as long as all nodes have the same input/output shape, then the notion of "layer output/input shape" is well defined, and that one shape will be returned by `layer.output_shape`/`layer.input_shape`. But if, for instance, you apply a same `Convolution2D` layer to an input of shape `(3, 32, 32)`, and then to an input of shape `(3, 64, 64)`, the layer will have multiple input/output shapes, and you will have to fetch them by specifying the index of the node they belong to:
```python
a = Input(shape=(3, 32, 32))
b = Input(shape=(3, 64, 64))
conv = Convolution2D(16, 3, 3, border_mode='same')
conved_a = conv(a)
# only one input so far, the following will work:
assert conv.input_shape == (None, 3, 32, 32)
conved_b = conv(b)
# now the `.input_shape` property wouldn't work, but this does:
assert conv.get_input_shape_at(0) == (None, 3, 32, 32)
assert conv.get_input_shape_at(1) == (None, 3, 64, 64)
```
-----
## More examples
Code examples are still the best way to get started, so here are a few more.
### Inception module
For more information about the Inception architecture, see [Going Deeper with Convolutions](http://arxiv.org/abs/1409.4842).
```python
from keras.layers import merge, Convolution2D, MaxPooling2D, Input
input_img = Input(shape=(3, 256, 256))
tower_1 = Convolution2D(64, 1, 1, border_mode='same', activation='relu')(input_img)
tower_1 = Convolution2D(64, 3, 3, border_mode='same', activation='relu')(tower_1)
tower_2 = Convolution2D(64, 1, 1, border_mode='same', activation='relu')(input_img)
tower_2 = Convolution2D(64, 5, 5, border_mode='same', activation='relu')(tower_2)
tower_3 = MaxPooling2D((3, 3), strides=(1, 1), border_mode='same')(input_img)
tower_3 = Convolution2D(64, 1, 1, border_mode='same', activation='relu')(tower_3)
output = merge([tower_1, tower_2, tower_3], mode='concat', concat_axis=1)
```
### Residual connection on a convolution layer
For more information about residual networks, see [Deep Residual Learning for Image Recognition](http://arxiv.org/abs/1512.03385).
```python
from keras.layers import merge, Convolution2D, Input
# input tensor for a 3-channel 256x256 image
x = Input(shape=(3, 256, 256))
# 3x3 conv with 3 output channels (same as input channels)
y = Convolution2D(3, 3, 3, border_mode='same')(x)
# this returns x + y.
z = merge([x, y], mode='sum')
```
### Shared vision model
This model re-uses the same image-processing module on two inputs, to classify whether two MNIST digits are the same digit or different digits.
```python
from keras.layers import merge, Convolution2D, MaxPooling2D, Input, Dense, Flatten
from keras.models import Model
# first, define the vision modules
digit_input = Input(shape=(1, 27, 27))
x = Convolution2D(64, 3, 3)(digit_input)
x = Convolution2D(64, 3, 3)(x)
x = MaxPooling2D((2, 2))(x)
out = Flatten()(x)
vision_model = Model(digit_input, out)
# then define the tell-digits-apart model
digit_a = Input(shape=(1, 27, 27))
digit_b = Input(shape=(1, 27, 27))
# the vision model will be shared, weights and all
out_a = vision_model(digit_a)
out_b = vision_model(digit_b)
concatenated = merge([out_a, out_b], mode='concat')
out = Dense(1, activation='sigmoid')(concatenated)
classification_model = Model([digit_a, digit_b], out)
```
### Visual question answering model
This model can select the correct one-word answer when asked a natural-language question about a picture.
It works by encoding the question into a vector, encoding the image into a vector, concatenating the two, and training on top a logistic regression over some vocabulary of potential answers.
```python
from keras.layers import Convolution2D, MaxPooling2D, Flatten
from keras.layers import Input, LSTM, Embedding, Dense, merge
from keras.models import Model, Sequential
# first, let's define a vision model using a Sequential model.
# this model will encode an image into a vector.
vision_model = Sequential()
vision_model.add(Convolution2D(64, 3, 3, activation='relu', border_mode='same', input_shape=(3, 224, 224)))
vision_model.add(Convolution2D(64, 3, 3, activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Convolution2D(128, 3, 3, activation='relu', border_mode='same'))
vision_model.add(Convolution2D(128, 3, 3, activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Convolution2D(256, 3, 3, activation='relu', border_mode='same'))
vision_model.add(Convolution2D(256, 3, 3, activation='relu'))
vision_model.add(Convolution2D(256, 3, 3, activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Flatten())
# now let's get a tensor with the output of our vision model:
image_input = Input(shape=(3, 224, 224))
encoded_image = vision_model(image_input)
# next, let's define a language model to encode the question into a vector.
# each question will be at most 100 word long,
# and we will index words as integers from 1 to 9999.
question_input = Input(shape=(100,), dtype='int32')
embedded_question = Embedding(input_dim=10000, output_dim=256, input_length=100)(question_input)
encoded_question = LSTM(256)(embedded_question)
# let's concatenate the question vector and the image vector:
merged = merge([encoded_question, encoded_image], mode='concat')
# and let's train a logistic regression over 1000 words on top:
output = Dense(1000, activation='softmax')(merged)
# this is our final model:
vqa_model = Model(input=[image_input, question_input], output=output)
# the next stage would be training this model on actual data.
```
### Video question answering model
Now that we have trained our image QA model, we can quickly turn it into a video QA model. With appropriate training, you will be able to show it a short video (e.g. 100-frame human action) and ask a natural language question about the video (e.g. "what sport is the boy playing?" -> "football").
```python
from keras.layers import TimeDistributed
video_input = Input(shape=(100, 3, 224, 224))
# this is our video encoded via the previously trained vision_model (weights are reused)
encoded_frame_sequence = TimeDistributed(vision_model)(video_input) # the output will be a sequence of vectors
encoded_video = LSTM(256)(encoded_frame_sequence) # the output will be a vector
# this is a model-level representation of the question encoder, reusing the same weights as before:
question_encoder = Model(input=question_input, output=encoded_question)
# let's use it to encode the question:
video_question_input = Input(shape=(100,), dtype='int32')
encoded_video_question = question_encoder(video_question_input)
# and this is our video question answering model:
merged = merge([encoded_video, encoded_video_question], mode='concat')
output = Dense(1000, activation='softmax')(merged)
video_qa_model = Model(input=[video_input, video_question_input], output=output)
```
+549
Ver Arquivo
@@ -0,0 +1,549 @@
# Getting started with the Keras Sequential model
The `Sequential` model is a linear stack of layers.
You can create a `Sequential` model by passing a list of layer instances to the constructor:
```python
from keras.models import Sequential
from keras.layers import Dense, Activation
model = Sequential([
Dense(32, input_dim=784),
Activation('relu'),
Dense(10),
Activation('softmax'),
])
```
You can also simply add layers via the `.add()` method:
```python
model = Sequential()
model.add(Dense(32, input_dim=784))
model.add(Activation('relu'))
```
----
## Specifying the input shape
The model needs to know what input shape it should expect. For this reason, the first layer in a `Sequential` model (and only the first, because following layers can do automatic shape inference) needs to receive information about its input shape. There are several possible ways to do this:
- pass an `input_shape` argument to the first layer. This is a shape tuple (a tuple of integers or `None` entries, where `None` indicates that any positive integer may be expected). In `input_shape`, the batch dimension is not included.
- pass instead a `batch_input_shape` argument, where the batch dimension is included. This is useful for specifying a fixed batch size (e.g. with stateful RNNs).
- some 2D layers, such as `Dense`, support the specification of their input shape via the argument `input_dim`, and some 3D temporal layers support the arguments `input_dim` and `input_length`.
As such, the following three snippets are strictly equivalent:
```python
model = Sequential()
model.add(Dense(32, input_shape=(784,)))
```
```python
model = Sequential()
model.add(Dense(32, batch_input_shape=(None, 784)))
# note that batch dimension is "None" here,
# so the model will be able to process batches of any size.
```
```python
model = Sequential()
model.add(Dense(32, input_dim=784))
```
And so are the following three snippets:
```python
model = Sequential()
model.add(LSTM(32, input_shape=(10, 64)))
```
```python
model = Sequential()
model.add(LSTM(32, batch_input_shape=(None, 10, 64)))
```
```python
model = Sequential()
model.add(LSTM(32, input_length=10, input_dim=64))
```
----
## The Merge layer
Multiple `Sequential` instances can be merged into a single output via a `Merge` layer. The output is a layer that can be added as first layer in a new `Sequential` model. For instance, here's a model with two separate input branches getting merged:
```python
from keras.layers import Merge
left_branch = Sequential()
left_branch.add(Dense(32, input_dim=784))
right_branch = Sequential()
right_branch.add(Dense(32, input_dim=784))
merged = Merge([left_branch, right_branch], mode='concat')
final_model = Sequential()
final_model.add(merged)
final_model.add(Dense(10, activation='softmax'))
```
<img src="https://s3.amazonaws.com/keras.io/img/two_branches_sequential_model.png" alt="two branch Sequential" style="width: 400px;"/>
Such a two-branch model can then be trained via e.g.:
```python
final_model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
final_model.fit([input_data_1, input_data_2], targets) # we pass one data array per model input
```
The `Merge` layer supports a number of pre-defined modes:
- `sum` (default): element-wise sum
- `concat`: tensor concatenation. You can specify the concatenation axis via the argument `concat_axis`.
- `mul`: element-wise multiplication
- `ave`: tensor average
- `dot`: dot product. You can specify which axes to reduce along via the argument `dot_axes`.
- `cos`: cosine proximity between vectors in 2D tensors.
You can also pass a function as the `mode` argument, allowing for arbitrary transformations:
```python
merged = Merge([left_branch, right_branch], mode=lambda x: x[0] - x[1])
```
Now you know enough to be able to define *almost* any model with Keras. For complex models that cannot be expressed via `Sequential` and `Merge`, you can use [the functional API](/getting-started/functional-api-guide).
----
## Compilation
Before training a model, you need to configure the learning process, which is done via the `compile` method. It receives three arguments:
- an optimizer. This could be the string identifier of an existing optimizer (such as `rmsprop` or `adagrad`), or an instance of the `Optimizer` class. See: [optimizers](/optimizers).
- a loss function. This is the objective that the model will try to minimize. It can be the string identifier of an existing loss function (such as `categorical_crossentropy` or `mse`), or it can be an objective function. See: [objectives](/objectives).
- a list of metrics. For any classification problem you will want to set this to `metrics=['accuracy']`. A metric could be the string identifier of an existing metric (only `accuracy` is supported at this point), or a custom metric function.
```python
# for a multi-class classification problem
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['accuracy'])
# for a binary classification problem
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
# for a mean squared error regression problem
model.compile(optimizer='rmsprop',
loss='mse')
```
----
## Training
Keras models are trained on Numpy arrays of input data and labels. For training a model, you will typically use the `fit` function. [Read its documentation here](/models/sequential).
```python
# for a single-input model with 2 classes (binary):
model = Sequential()
model.add(Dense(1, input_dim=784, activation='sigmoid'))
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
# generate dummy data
import numpy as np
data = np.random.random((1000, 784))
labels = np.random.randint(2, size=(1000, 1))
# train the model, iterating on the data in batches
# of 32 samples
model.fit(data, labels, nb_epoch=10, batch_size=32)
```
```python
# for a multi-input model with 10 classes:
left_branch = Sequential()
left_branch.add(Dense(32, input_dim=784))
right_branch = Sequential()
right_branch.add(Dense(32, input_dim=784))
merged = Merge([left_branch, right_branch], mode='concat')
model = Sequential()
model.add(merged)
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['accuracy'])
# generate dummy data
import numpy as np
from keras.utils.np_utils import to_categorical
data_1 = np.random.random((1000, 784))
data_2 = np.random.random((1000, 784))
# these are integers between 0 and 9
labels = np.random.randint(10, size=(1000, 1))
# we convert the labels to a binary matrix of size (1000, 10)
# for use with categorical_crossentropy
labels = to_categorical(labels, 10)
# train the model
# note that we are passing a list of Numpy arrays as training data
# since the model has 2 inputs
model.fit([data_1, data_2], labels, nb_epoch=10, batch_size=32)
```
----
## Examples
Here are a few examples to get you started!
In the examples folder, you will also find example models for real datasets:
- CIFAR10 small images classification: Convolutional Neural Network (CNN) with realtime data augmentation
- IMDB movie review sentiment classification: LSTM over sequences of words
- Reuters newswires topic classification: Multilayer Perceptron (MLP)
- MNIST handwritten digits classification: MLP & CNN
- Character-level text generation with LSTM
...and more.
### Multilayer Perceptron (MLP) for multi-class softmax classification:
```python
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
model = Sequential()
# Dense(64) is a fully-connected layer with 64 hidden units.
# in the first layer, you must specify the expected input data shape:
# here, 20-dimensional vectors.
model.add(Dense(64, input_dim=20, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(10, init='uniform'))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
optimizer=sgd,
metrics=['accuracy'])
model.fit(X_train, y_train,
nb_epoch=20,
batch_size=16)
score = model.evaluate(X_test, y_test, batch_size=16)
```
### Alternative implementation of a similar MLP:
```python
model = Sequential()
model.add(Dense(64, input_dim=20, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adadelta',
metrics=['accuracy'])
```
### MLP for binary classification:
```python
model = Sequential()
model.add(Dense(64, input_dim=20, init='uniform', activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
```
### VGG-like convnet:
```python
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import SGD
model = Sequential()
# input: 100x100 images with 3 channels -> (3, 100, 100) tensors.
# this applies 32 convolution filters of size 3x3 each.
model.add(Convolution2D(32, 3, 3, border_mode='valid', input_shape=(3, 100, 100)))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Convolution2D(64, 3, 3, border_mode='valid'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
# Note: Keras does automatic shape inference.
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(10))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
model.fit(X_train, Y_train, batch_size=32, nb_epoch=1)
```
### Sequence classification with LSTM:
```python
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import LSTM
model = Sequential()
model.add(Embedding(max_features, 256, input_length=maxlen))
model.add(LSTM(output_dim=128, activation='sigmoid', inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=16, nb_epoch=10)
score = model.evaluate(X_test, Y_test, batch_size=16)
```
### Architecture for learning image captions with a convnet and a Gated Recurrent Unit:
(word-level embedding, caption of maximum length 16 words).
Note that getting this to work well will require using a bigger convnet, initialized with pre-trained weights.
```python
max_caption_len = 16
vocab_size = 10000
# first, let's define an image model that
# will encode pictures into 128-dimensional vectors.
# it should be initialized with pre-trained weights.
image_model = Sequential()
image_model.add(Convolution2D(32, 3, 3, border_mode='valid', input_shape=(3, 100, 100)))
image_model.add(Activation('relu'))
image_model.add(Convolution2D(32, 3, 3))
image_model.add(Activation('relu'))
image_model.add(MaxPooling2D(pool_size=(2, 2)))
image_model.add(Convolution2D(64, 3, 3, border_mode='valid'))
image_model.add(Activation('relu'))
image_model.add(Convolution2D(64, 3, 3))
image_model.add(Activation('relu'))
image_model.add(MaxPooling2D(pool_size=(2, 2)))
image_model.add(Flatten())
image_model.add(Dense(128))
# let's load the weights from a save file.
image_model.load_weights('weight_file.h5')
# next, let's define a RNN model that encodes sequences of words
# into sequences of 128-dimensional word vectors.
language_model = Sequential()
language_model.add(Embedding(vocab_size, 256, input_length=max_caption_len))
language_model.add(GRU(output_dim=128, return_sequences=True))
language_model.add(TimeDistributed(Dense(128)))
# let's repeat the image vector to turn it into a sequence.
image_model.add(RepeatVector(max_caption_len))
# the output of both models will be tensors of shape (samples, max_caption_len, 128).
# let's concatenate these 2 vector sequences.
model = Sequential()
model.add(Merge([image_model, language_model], mode='concat', concat_axis=-1))
# let's encode this vector sequence into a single vector
model.add(GRU(256, return_sequences=False))
# which will be used to compute a probability
# distribution over what the next word in the caption should be!
model.add(Dense(vocab_size))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
# "images" is a numpy float array of shape (nb_samples, nb_channels=3, width, height).
# "captions" is a numpy integer array of shape (nb_samples, max_caption_len)
# containing word index sequences representing partial captions.
# "next_words" is a numpy float array of shape (nb_samples, vocab_size)
# containing a categorical encoding (0s and 1s) of the next word in the corresponding
# partial caption.
model.fit([images, partial_captions], next_words, batch_size=16, nb_epoch=100)
```
### Stacked LSTM for sequence classification
In this model, we stack 3 LSTM layers on top of each other,
making the model capable of learning higher-level temporal representations.
The first two LSTMs return their full output sequences, but the last one only returns
the last step in its output sequence, thus dropping the temporal dimension
(i.e. converting the input sequence into a single vector).
<img src="https://keras.io/img/regular_stacked_lstm.png" alt="stacked LSTM" style="width: 300px;"/>
```python
from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np
data_dim = 16
timesteps = 8
nb_classes = 10
# expected input data shape: (batch_size, timesteps, data_dim)
model = Sequential()
model.add(LSTM(32, return_sequences=True,
input_shape=(timesteps, data_dim))) # returns a sequence of vectors of dimension 32
model.add(LSTM(32, return_sequences=True)) # returns a sequence of vectors of dimension 32
model.add(LSTM(32)) # return a single vector of dimension 32
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# generate dummy training data
x_train = np.random.random((1000, timesteps, data_dim))
y_train = np.random.random((1000, nb_classes))
# generate dummy validation data
x_val = np.random.random((100, timesteps, data_dim))
y_val = np.random.random((100, nb_classes))
model.fit(x_train, y_train,
batch_size=64, nb_epoch=5,
validation_data=(x_val, y_val))
```
### Same stacked LSTM model, rendered "stateful"
A stateful recurrent model is one for which the internal states (memories) obtained after processing a batch
of samples are reused as initial states for the samples of the next batch. This allows to process longer sequences
while keeping computational complexity manageable.
[You can read more about stateful RNNs in the FAQ.](/faq/#how-can-i-use-stateful-rnns)
```python
from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np
data_dim = 16
timesteps = 8
nb_classes = 10
batch_size = 32
# expected input batch shape: (batch_size, timesteps, data_dim)
# note that we have to provide the full batch_input_shape since the network is stateful.
# the sample of index i in batch k is the follow-up for the sample i in batch k-1.
model = Sequential()
model.add(LSTM(32, return_sequences=True, stateful=True,
batch_input_shape=(batch_size, timesteps, data_dim)))
model.add(LSTM(32, return_sequences=True, stateful=True))
model.add(LSTM(32, stateful=True))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# generate dummy training data
x_train = np.random.random((batch_size * 10, timesteps, data_dim))
y_train = np.random.random((batch_size * 10, nb_classes))
# generate dummy validation data
x_val = np.random.random((batch_size * 3, timesteps, data_dim))
y_val = np.random.random((batch_size * 3, nb_classes))
model.fit(x_train, y_train,
batch_size=batch_size, nb_epoch=5,
validation_data=(x_val, y_val))
```
### Two merged LSTM encoders for classification over two parallel sequences
In this model, two input sequences are encoded into vectors by two separate LSTM modules.
These two vectors are then concatenated, and a fully connected network is trained on top of the concatenated representations.
<img src="https://keras.io/img/dual_lstm.png" alt="Dual LSTM" style="width: 600px;"/>
```python
from keras.models import Sequential
from keras.layers import Merge, LSTM, Dense
import numpy as np
data_dim = 16
timesteps = 8
nb_classes = 10
encoder_a = Sequential()
encoder_a.add(LSTM(32, input_shape=(timesteps, data_dim)))
encoder_b = Sequential()
encoder_b.add(LSTM(32, input_shape=(timesteps, data_dim)))
decoder = Sequential()
decoder.add(Merge([encoder_a, encoder_b], mode='concat'))
decoder.add(Dense(32, activation='relu'))
decoder.add(Dense(nb_classes, activation='softmax'))
decoder.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# generate dummy training data
x_train_a = np.random.random((1000, timesteps, data_dim))
x_train_b = np.random.random((1000, timesteps, data_dim))
y_train = np.random.random((1000, nb_classes))
# generate dummy validation data
x_val_a = np.random.random((100, timesteps, data_dim))
x_val_b = np.random.random((100, timesteps, data_dim))
y_val = np.random.random((100, nb_classes))
decoder.fit([x_train_a, x_train_b], y_train,
batch_size=64, nb_epoch=5,
validation_data=([x_val_a, x_val_b], y_val))
```
+33 -46
Ver Arquivo
@@ -2,19 +2,18 @@
## You have just found Keras.
Keras is a minimalist, highly modular neural networks library, written in Python and capable of running either on top of either [TensorFlow](https://github.com/tensorflow/tensorflow) or [Theano](https://github.com/Theano/Theano). It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.
Keras is a minimalist, highly modular neural networks library, written in Python and capable of running on top of either [TensorFlow](https://github.com/tensorflow/tensorflow) or [Theano](https://github.com/Theano/Theano). It was developed with a focus on enabling fast experimentation. *Being able to go from idea to result with the least possible delay is key to doing good research.*
Use Keras if you need a deep learning library that:
- allows for easy and fast prototyping (through total modularity, minimalism, and extensibility).
- supports both convolutional networks and recurrent networks, as well as combinations of the two.
- supports arbitrary connectivity schemes (including multi-input and multi-output training).
- runs seamlessly on CPU and GPU.
- Allows for easy and fast prototyping (through total modularity, minimalism, and extensibility).
- Supports both convolutional networks and recurrent networks, as well as combinations of the two.
- Supports arbitrary connectivity schemes (including multi-input and multi-output training).
- Runs seamlessly on CPU and GPU.
Read the documentation at [Keras.io](http://keras.io).
Keras is compatible with:
- __Python 2.7-3.5__ with the Theano backend
- __Python 2.7__ with the TensorFlow backend
Keras is compatible with: __Python 2.7-3.5__.
------------------
@@ -36,9 +35,9 @@ Keras is compatible with:
## Getting started: 30 seconds to Keras
The core datastructure of Keras is a __model__, a way to organize layers. There are two types of models: [`Sequential`](/models/#sequential) and [`Graph`](/models/#graph).
The core data structure of Keras is a __model__, a way to organize layers. The main type of model is the [`Sequential`](http://keras.io/getting-started/sequential-model-guide) model, a linear stack of layers. For more complex architectures, you should use the [Keras functional API](http://keras.io/getting-started/functional-api-guide).
Here's the `Sequential` model (a linear pile of layers):
Here's the `Sequential` model:
```python
from keras.models import Sequential
@@ -49,17 +48,17 @@ model = Sequential()
Stacking layers is as easy as `.add()`:
```python
from keras.layers.core import Dense, Activation
from keras.layers import Dense, Activation
model.add(Dense(output_dim=64, input_dim=100, init="glorot_uniform"))
model.add(Dense(output_dim=64, input_dim=100))
model.add(Activation("relu"))
model.add(Dense(output_dim=10, init="glorot_uniform"))
model.add(Dense(output_dim=10))
model.add(Activation("softmax"))
```
Once your model looks good, configure its learning process with `.compile()`:
```python
model.compile(loss='categorical_crossentropy', optimizer='sgd')
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
```
If you need to, you can further configure your optimizer. A core principle of Keras is to make things reasonably simple, while allowing the user to be fully in control when they need to (the ultimate control being the easy extensibility of the source code).
@@ -80,7 +79,7 @@ model.train_on_batch(X_batch, Y_batch)
Evaluate your performance in one line:
```python
objective_score = model.evaluate(X_test, Y_test, batch_size=32)
loss_and_metrics = model.evaluate(X_test, Y_test, batch_size=32)
```
Or generate predictions on new data:
@@ -89,11 +88,14 @@ classes = model.predict_classes(X_test, batch_size=32)
proba = model.predict_proba(X_test, batch_size=32)
```
Building a network of LSTMs, a deep CNN, a Neural Turing Machine, a word2vec embedder or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful?
Building a question answering system, an image classification model, a Neural Turing Machine, a word2vec embedder or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful?
Have a look at these [starter examples](http://keras.io/examples/).
For a more in-depth tutorial about Keras, you can check out:
In the [examples folder](https://github.com/fchollet/keras/tree/master/examples) of the repo, you will find more advanced models: question-answering with memory networks, text generation with stacked LSTMs, neural turing machines, etc.
- [Getting started with the Sequential model](http://keras.io/getting-started/sequential-model-guide)
- [Getting started with the functional API](http://keras.io/getting-started/functional-api-guide)
In the [examples folder](https://github.com/fchollet/keras/tree/master/examples) of the repository, you will find more advanced models: question-answering with memory networks, text generation with stacked LSTMs, etc.
------------------
@@ -108,35 +110,33 @@ Keras uses the following dependencies:
- HDF5 and h5py (optional, required if you use model saving/loading functions)
- Optional but recommended if you use CNNs: cuDNN.
When using the Theano backend:
- Theano
- [See installation instructions](http://deeplearning.net/software/theano/install.html#install).
**Note**: You should use the latest version of Theano, not the PyPI version. Install it with:
```
sudo pip install git+git://github.com/Theano/Theano.git
```
*When using the TensorFlow backend:*
When using the TensorFlow backend:
- TensorFlow
- [See installation instructions](https://github.com/tensorflow/tensorflow#download-and-setup).
To install, `cd` to the Keras folder and run the install command:
```
*When using the Theano backend:*
- Theano
- [See installation instructions](http://deeplearning.net/software/theano/install.html#install).
To install Keras, `cd` to the Keras folder and run the install command:
```sh
sudo python setup.py install
```
You can also install Keras from PyPI:
```
```sh
sudo pip install keras
```
------------------
## Switching from Theano to TensorFlow
## Switching from TensorFlow to Theano
By default, Keras will use Theano as its tensor manipulation library. [Follow these instructions](http://keras.io/backend/) to configure the Keras backend.
By default, Keras will use TensorFlow as its tensor manipulation library. [Follow these instructions](http://keras.io/backend/) to configure the Keras backend.
------------------
@@ -145,20 +145,7 @@ By default, Keras will use Theano as its tensor manipulation library. [Follow th
You can ask questions and join the development discussion on the [Keras Google group](https://groups.google.com/forum/#!forum/keras-users).
------------------
## Contribution Guidelines
Keras welcomes all contributions from the community.
- Keep a pragmatic mindset and avoid bloat. Only add to the source if that is the only path forward.
- New features should be documented. Make sure you update the documentation along with your Pull Request.
- Any new function or class should have a proper docstring.
- The documentation for every new feature should include a usage example in the form of a code snippet.
- All changes should be tested. Make sure any new feature you add has a corresponding unit test.
- Please no Pull Requests about coding style.
- Even if you don't contribute to the Keras source code, if you have an application of Keras that is concise and powerful, please consider adding it to our collection of [examples](https://github.com/fchollet/keras/tree/master/examples).
You can also post bug reports and feature requests in [Github issues](https://github.com/fchollet/keras/issues). Make sure to read [our guidelines](https://github.com/fchollet/keras/blob/master/CONTRIBUTING.md) first.
------------------
@@ -172,4 +159,4 @@ Keras was initially developed as part of the research effort of project ONEIROS
>_"Oneiroi are beyond our unravelling --who can be sure what tale they tell? Not all that men look for comes to pass. Two gates there are that give passage to fleeting Oneiroi; one is made of horn, one of ivory. The Oneiroi that pass through sawn ivory are deceitful, bearing a message that will not be fulfilled; those that come out through polished horn have truth behind them, to be accomplished for men who see them."_ Homer, Odyssey 19. 562 ff (Shewring translation).
------------------
------------------
+50
Ver Arquivo
@@ -0,0 +1,50 @@
## Usage of initializations
Initializations define the way to set the initial random weights of Keras layers.
The keyword arguments used for passing initializations to layers will depend on the layer. Usually it is simply `init`:
```python
model.add(Dense(64, init='uniform'))
```
## Available initializations
- __uniform__
- __lecun_uniform__: Uniform initialization scaled by the square root of the number of inputs (LeCun 98).
- __normal__
- __identity__: Use with square 2D layers (`shape[0] == shape[1]`).
- __orthogonal__: Use with square 2D layers (`shape[0] == shape[1]`).
- __zero__
- __glorot_normal__: Gaussian initialization scaled by fan_in + fan_out (Glorot 2010)
- __glorot_uniform__
- __he_normal__: Gaussian initialization scaled by fan_in (He et al., 2014)
- __he_uniform__
An initialization may be passed as a string (must match one of the available initializations above), or as a callable.
If a callable, then it must take two arguments: `shape` (shape of the variable to initialize) and `name` (name of the variable),
and it must return a variable (e.g. output of `K.variable()`):
```python
from keras import backend as K
import numpy as np
def my_init(shape, name=None):
value = np.random.random(shape)
return K.variable(value, name=name)
model.add(Dense(64, init=my_init))
```
You could also use functions from `keras.initializations` in this way:
```python
from keras import initializations
def my_init(shape, name=None):
return initializations.normal(shape, scale=0.01, name=name)
model.add(Dense(64, init=my_init))
```
+27
Ver Arquivo
@@ -0,0 +1,27 @@
# About Keras layers
All Keras layers have a number of methods in common:
- `layer.get_weights()`: returns the weights of the layer as a list of Numpy arrays.
- `layer.set_weights(weights)`: sets the weights of the layer from a list of Numpy arrays (with the same shapes as the output of `get_weights`).
- `layer.get_config()`: returns a dictionary containing the configuration of the layer. The layer can be reinstantiated from its config via:
```python
from keras.utils.layer_utils import layer_from_config
config = layer.get_config()
layer = layer_from_config(config)
```
If a layer has a single node (i.e. if it isn't a shared layer), you can get its input tensor, output tensor, input shape and output shape via:
- `layer.input`
- `layer.output`
- `layer.input_shape`
- `layer.output_shape`
If the layer has multiple nodes (see: [the concept of layer node and shared layers](/getting-started/functional-api-guide/#the-concept-of-layer-node)), you can use the following methods:
- `layer.get_input_at(node_index)`
- `layer.get_output_at(node_index)`
- `layer.get_input_shape_at(node_index)`
- `layer.get_output_shape_at(node_index)`
+34
Ver Arquivo
@@ -0,0 +1,34 @@
# Writing your own Keras layers
For simple, stateless custom operations, you are probably better off using `layers.core.Lambda` layers. But for any custom operation that has trainable weights, you should implement your own layer.
Here is the skeleton of a Keras layer. There are only three methods you need to implement:
- `build(input_shape)`: this is where you will define your weights. Trainable weights should be added to the list `self.trainable_weights`. Other attributes of note are: `self.non_trainable_weights` (list) and `self.updates` (list of update tuples (tensor, new_tensor)). For an example of how to use `non_trainable_weights` and `updates`, see the code for the `BatchNormalization` layer.
- `call(x)`: this is where the layer's logic lives. Unless you want your layer to support masking, you only have to care about the first argument passed to `call`: the input tensor.
- `get_output_shape_for(input_shape)`: in case your layer modifies the shape of its input, you should specify here the shape transformation logic. This allows Keras to do automatic shape inference.
```python
from keras import backend as K
from keras.engine.topology import Layer
import numpy as np
class MyLayer(Layer):
def __init__(self, output_dim, **kwargs):
self.output_dim = output_dim
super(MyLayer, self).__init__(**kwargs)
def build(self, input_shape):
input_dim = input_shape[1]
initial_weight_value = np.random.random((input_dim, output_dim))
self.W = K.variable(initial_weight_value)
self.trainable_weights = [self.W]
def call(self, x, mask=None):
return K.dot(x, self.W)
def get_output_shape_for(self, input_shape):
return (input_shape[0], self.output_dim)
```
The existing Keras layers provide ample examples of how to implement almost anything. Never hesitate to read the source code!
+33
Ver Arquivo
@@ -0,0 +1,33 @@
# About Keras models
There are two types of models available in Keras: [the Sequential model](/models/sequential) and [the Model class used with functional API](/models/model).
These models have a number of methods in common:
- `model.summary()`: prints a summary representation of your model.
- `model.get_config()`: returns a dictionary containing the configuration of the model. The model can be reinstantiated from its config via:
```python
config = model.get_config()
model = Model.from_config(config)
# or, for Sequential:
model = Sequential.from_config(config)
```
- `model.get_weights()`: returns a list of all weight tensors in the model, as Numpy arrays.
- `model.set_weights(weights)`: sets the values of the weights of the model, from a list of Numpy arrays. The arrays in the list should have the same shape as those returned by `get_weights()`.
- `model.to_json()`: returns a representation of the model as a JSON string. Note that the representation does not include the weights, only the architecture. You can reinstantiate the same model (with reinitialized weights) from the JSON string via:
```python
from models import model_from_json
json_string = model.to_json()
model = model_from_json(json_string)
```
- `model.to_yaml()`: returns a representation of the model as a YAML string. Note that the representation does not include the weights, only the architecture. You can reinstantiate the same model (with reinitialized weights) from the YAML string via:
```python
from models import model_from_yaml
yaml_string = model.to_yaml()
model = model_from_yaml(yaml_string)
```
- `model.save_weights(filepath)`: saves the weights of the model as a HDF5 file.
- `model.load_weights(filepath, by_name=False)`: loads the weights of the model from a HDF5 file (created by `save_weights`). By default, the architecture is expected to be unchanged. To load weights into a different architecture (with some layers in common), use `by_name=True` to load only those layers with the same name.
+32
Ver Arquivo
@@ -0,0 +1,32 @@
# Model class API
In the functional API, given an input tensor and output tensor, you can instantiate a `Model` via:
```python
from keras.models import Model
from keras.layers import Input, Dense
a = Input(shape=(32,))
b = Dense(32)(a)
model = Model(input=a, output=b)
```
This model will include all layers required in the computation of `b` given `a`.
In the case of multi-input or multi-output models, you can use lists as well:
```python
model = Model(input=[a1, a2], output=[b1, b3, b3])
```
For a detailed introduction of what `Model` can do, read [this guide to the Keras functional API](/getting-started/functional-api-guide).
## Useful attributes of Model
- `model.layers` is a flattened list of the layers comprising the model graph.
- `model.inputs` is the list of input tensors.
- `model.outputs` is the list of output tensors.
## Methods
{{autogenerated}}
+14
Ver Arquivo
@@ -0,0 +1,14 @@
# The Sequential model API
To get started, read [this guide to the Keras Sequential model](/getting-started/sequential-model-guide).
## Useful attributes of Model
- `model.layers` is a list of the layers added to the model.
----
## Sequential model methods
{{autogenerated}}
+7 -4
Ver Arquivo
@@ -7,10 +7,10 @@ An objective function (or loss function, or optimization score function) is one
model.compile(loss='mean_squared_error', optimizer='sgd')
```
You can either pass the name of an existing objective, or pass a Theano symbolic function that returns a scalar for each data-point and takes the following two arguments:
You can either pass the name of an existing objective, or pass a Theano/TensorFlow symbolic function that returns a scalar for each data-point and takes the following two arguments:
- __y_true__: True labels. Theano tensor.
- __y_pred__: Predictions. Theano tensor of the same shape as y_true.
- __y_true__: True labels. Theano/TensorFlow tensor.
- __y_pred__: Predictions. Theano/TensorFlow tensor of the same shape as y_true.
The actual optimized objective is the mean of the output array across all datapoints.
@@ -19,7 +19,6 @@ For a few examples of such functions, check out the [objectives source](https://
## Available objectives
- __mean_squared_error__ / __mse__
- __root_mean_squared_error__ / __rmse__
- __mean_absolute_error__ / __mae__
- __mean_absolute_percentage_error__ / __mape__
- __mean_squared_logarithmic_error__ / __msle__
@@ -27,3 +26,7 @@ For a few examples of such functions, check out the [objectives source](https://
- __hinge__
- __binary_crossentropy__: Also known as logloss.
- __categorical_crossentropy__: Also known as multiclass logloss. __Note__: using this objective requires that your labels are binary arrays of shape `(nb_samples, nb_classes)`.
- __sparse_categorical_crossentropy__: As above but accepts sparse labels. __Note__: this objective still requires that your labels have the same number of dimensions as your outputs; you may need to add a length-1 dimension to the shape of your labels, e.g with `np.expand_dims(y, -1)`.
- __kullback_leibler_divergence__ / __kld__: Information gain from a predicted probability distribution Q to a true probability distribution P. Gives a measure of difference between both distributions.
- __poisson__: Mean of `(predictions - targets * log(predictions))`
- __cosine_proximity__: The opposite (negative) of the mean cosine proximity between predictions and targets.
+44
Ver Arquivo
@@ -0,0 +1,44 @@
## Usage of optimizers
An optimizer is one of the two arguments required for compiling a Keras model:
```python
model = Sequential()
model.add(Dense(64, init='uniform', input_dim=10))
model.add(Activation('tanh'))
model.add(Activation('softmax'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
```
You can either instantiate an optimizer before passing it to `model.compile()` , as in the above example, or you can call it by its name. In the latter case, the default parameters for the optimizer will be used.
```python
# pass optimizer by name: default parameters will be used
model.compile(loss='mean_squared_error', optimizer='sgd')
```
---
## Parameters common to all Keras optimizers
The parameters `clipnorm` and `clipvalue` can be used with all optimizers to control gradient clipping:
```python
# all parameter gradients will be clipped to
# a maximum norm of 1.
sgd = SGD(lr=0.01, clipnorm=1.)
```
```python
# all parameter gradients will be clipped to
# a maximum value of 0.5 and
# a minimum value of -0.5.
sgd = SGD(lr=0.01, clipvalue=0.5)
```
---
{{autogenerated}}
+153
Ver Arquivo
@@ -0,0 +1,153 @@
## ImageDataGenerator
```python
keras.preprocessing.image.ImageDataGenerator(featurewise_center=False,
samplewise_center=False,
featurewise_std_normalization=False,
samplewise_std_normalization=False,
zca_whitening=False,
rotation_range=0.,
width_shift_range=0.,
height_shift_range=0.,
shear_range=0.,
zoom_range=0.,
channel_shift_range=0.,
fill_mode='nearest',
cval=0.,
horizontal_flip=False,
vertical_flip=False,
rescale=None,
dim_ordering=K.image_dim_ordering())
```
Generate batches of tensor image data with real-time data augmentation. The data will be looped over (in batches) indefinitely.
- __Arguments__:
- __featurewise_center__: Boolean. Set input mean to 0 over the dataset.
- __samplewise_center__: Boolean. Set each sample mean to 0.
- __featurewise_std_normalization__: Boolean. Divide inputs by std of the dataset.
- __samplewise_std_normalization__: Boolean. Divide each input by its std.
- __zca_whitening__: Boolean. Apply ZCA whitening.
- __rotation_range__: Int. Degree range for random rotations.
- __width_shift_range__: Float (fraction of total width). Range for random horizontal shifts.
- __height_shift_range__: Float (fraction of total height). Range for random vertical shifts.
- __shear_range__: Float. Shear Intensity (Shear angle in counter-clockwise direction as radians)
- __zoom_range__: Float or [lower, upper]. Range for random zoom. If a float, `[lower, upper] = [1-zoom_range, 1+zoom_range]`.
- __channel_shift_range__: Float. Range for random channel shifts.
- __fill_mode__: One of {"constant", "nearest", "reflect" or "wrap"}. Points outside the boundaries of the input are filled according to the given mode.
- __cval__: Float or Int. Value used for points outside the boundaries when `fill_mode = "constant"`.
- __horizontal_flip__: Boolean. Randomly flip inputs horizontally.
- __vertical_flip__: Boolean. Randomly flip inputs vertically.
- __rescale__: rescaling factor. Defaults to None. If None or 0, no rescaling is applied,
otherwise we multiply the data by the value provided (before applying
any other transformation).
- __dim_ordering__: One of {"th", "tf"}.
"tf" mode means that the images should have shape `(samples, width, height, channels)`,
"th" mode means that the images should have shape `(samples, channels, width, height)`.
It defaults to the `image_dim_ordering` value found in your
Keras config file at `~/.keras/keras.json`.
If you never set it, then it will be "th".
- __Methods__:
- __fit(X)__: Compute the internal data stats related to the data-dependent transformations, based on an array of sample data.
Only required if featurewise_center or featurewise_std_normalization or zca_whitening.
- __Arguments__:
- __X__: sample data.
- __augment__: Boolean (default: False). Whether to fit on randomly augmented samples.
- __rounds__: int (default: 1). If augment, how many augmentation passes over the data to use.
- __flow(X, y)__: Takes numpy data & label arrays, and generates batches of augmented/normalized data. Yields batches indefinitely, in an infinite loop.
- __Arguments__:
- __X__: data.
- __y__: labels.
- __batch_size__: int (default: 32).
- __shuffle__: boolean (defaut: True).
- __save_to_dir__: None or str (default: None). This allows you to optimally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing).
- __save_prefix__: str (default: `''`). Prefix to use for filenames of saved pictures (only relevant if `save_to_dir` is set).
- __save_format__: one of "png", "jpeg" (only relevant if `save_to_dir` is set). Default: "jpeg".
- __yields__: Tuples of `(x, y)` where `x` is a numpy array of image data and `y` is a numpy array of corresponding labels.
The generator loops indefinitely.
- __flow_from_directory(directory)__: Takes the path to a directory, and generates batches of augmented/normalized data. Yields batches indefinitely, in an infinite loop.
- __Arguments__:
- __directory__: path to the target directory. It should contain one subdirectory per class,
and the subdirectories should contain PNG or JPG images. See [this script](https://gist.github.com/fchollet/0830affa1f7f19fd47b06d4cf89ed44d) for more details.
- __target_size__: tuple of integers, default: `(256, 256)`. The dimensions to which all images found will be resized.
- __color_mode__: one of "grayscale", "rbg". Default: "rgb". Whether the images will be converted to have 1 or 3 color channels.
- __classes__: optional list of class subdirectories (e.g. `['dogs', 'cats']`). Default: None. If not provided, the list of classes will be automatically inferred (and the order of the classes, which will map to the label indices, will be alphanumeric).
- __class_mode__: one of "categorical", "binary", "sparse" or None. Default: "categorical". Determines the type of label arrays that are returned: "categorical" will be 2D one-hot encoded labels, "binary" will be 1D binary labels, "sparse" will be 1D integer labels. If None, no labels are returned (the generator will only yield batches of image data, which is useful to use `model.predict_generator()`, `model.evaluate_generator()`, etc.).
- __batch_size__: size of the batches of data (default: 32).
- __shuffle__: whether to shuffle the data (default: True)
- __seed__: optional random seed for shuffling.
- __save_to_dir__: None or str (default: None). This allows you to optimally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing).
- __save_prefix__: str. Prefix to use for filenames of saved pictures (only relevant if `save_to_dir` is set).
- __save_format__: one of "png", "jpeg" (only relevant if `save_to_dir` is set). Default: "jpeg".
- __Examples__:
Example of using `.flow(X, y)`:
```python
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
datagen = ImageDataGenerator(
featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True)
# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(X_train)
# fits the model on batches with real-time data augmentation:
model.fit_generator(datagen.flow(X_train, Y_train, batch_size=32),
samples_per_epoch=len(X_train), nb_epoch=nb_epoch)
# here's a more "manual" example
for e in range(nb_epoch):
print 'Epoch', e
batches = 0
for X_batch, Y_batch in datagen.flow(X_train, Y_train, batch_size=32):
loss = model.train(X_batch, Y_batch)
batches += 1
if batches >= len(X_train) / 32:
# we need to break the loop by hand because
# the generator loops indefinitely
break
```
Example of using `.flow_from_directory(directory)`:
```python
train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
'data/train',
target_size=(150, 150),
batch_size=32,
class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
'data/validation',
target_size=(150, 150),
batch_size=32,
class_mode='binary')
model.fit_generator(
train_generator,
samples_per_epoch=2000,
nb_epoch=50,
validation_data=validation_generator,
nb_val_samples=800)
```
@@ -4,26 +4,29 @@
keras.preprocessing.sequence.pad_sequences(sequences, maxlen=None, dtype='int32')
```
Transform a list of `nb_samples sequences` (lists of scalars) into a 2D numpy array of shape `(nb_samples, nb_timesteps)`. `nb_timesteps` is either the `maxlen` argument if provided, or the length of the longest sequence otherwise. Sequences that are shorter than `nb_timesteps` are padded with zeros at the end.
Transform a list of `nb_samples sequences` (lists of scalars) into a 2D Numpy array of shape `(nb_samples, nb_timesteps)`. `nb_timesteps` is either the `maxlen` argument if provided, or the length of the longest sequence otherwise. Sequences that are shorter than `nb_timesteps` are padded with zeros at the end.
- __Return__: 2D numpy array of shape `(nb_samples, nb_timesteps)`.
- __Return__: 2D Numpy array of shape `(nb_samples, nb_timesteps)`.
- __Arguments__:
- __sequences__: List of lists of int or float.
- __maxlen__: None or int. Maximum sequence length, longer sequences are truncated and shorter sequences are padded with zeros at the end.
- __dtype__: datatype of the numpy array returned.
- __dtype__: datatype of the Numpy array returned.
- __padding__: 'pre' or 'post', pad either before or after each sequence.
- __truncating__: 'pre' or 'post', remove values from sequences larger than maxlen either in the beginning or in the end of the sequence
- __value__: float, value to pad the sequences to the desired value.
---
## skipgrams
```python
keras.preprocessing.sequence.skipgrams(sequence, vocabulary_size,
window_size=4, negative_samples=1., shuffle=True,
keras.preprocessing.sequence.skipgrams(sequence, vocabulary_size,
window_size=4, negative_samples=1., shuffle=True,
categorical=False, sampling_table=None)
```
Transforms a sequence of word indexes (list of int) into couples of the form:
Transforms a sequence of word indexes (list of int) into couples of the form:
- (word, word in the same window), with label 1 (positive samples).
- (word, random word from the vocabulary), with label 0 (negative samples).
@@ -31,8 +34,8 @@ Transforms a sequence of word indexes (list of int) into couples of the form:
Read more about Skipgram in this gnomic paper by Mikolov et al.: [Efficient Estimation of Word Representations in
Vector Space](http://arxiv.org/pdf/1301.3781v3.pdf)
- __Return__: tuple `(couples, labels)`.
- `couples` is a list of 2-elements lists of int: `[word_index, other_word_index]`.
- __Return__: tuple `(couples, labels)`.
- `couples` is a list of 2-elements lists of int: `[word_index, other_word_index]`.
- `labels` is a list of 0 and 1, where 1 indicates that `other_word_index` was found in the same window as `word_index`, and 0 indicates that `other_word_index` was random.
- if categorical is set to True, the labels are categorical, ie. 1 becomes [0,1], and 0 becomes [1, 0].
@@ -43,7 +46,7 @@ Vector Space](http://arxiv.org/pdf/1301.3781v3.pdf)
- __negative_samples__: float >= 0. 0 for no negative (=random) samples. 1 for same number as positive samples. etc.
- __shuffle__: boolean. Whether to shuffle the samples.
- __categorical__: boolean. Whether to make the returned labels categorical.
- __sampling_table__: numpy array of shape `(vocabulary_size,)` where `sampling_table[i]` is the probability of sampling the word with index i (assumed to be i-th most common word in the dataset).
- __sampling_table__: Numpy array of shape `(vocabulary_size,)` where `sampling_table[i]` is the probability of sampling the word with index i (assumed to be i-th most common word in the dataset).
---
@@ -56,7 +59,7 @@ keras.preprocessing.sequence.make_sampling_table(size, sampling_factor=1e-5)
Used for generating the `sampling_table` argument for `skipgrams`. `sampling_table[i]` is the probability of sampling the word i-th most common word in a dataset (more common words should be sampled less frequently, for balance).
- __Return__: numpy array of shape `(size,)`.
- __Return__: Numpy array of shape `(size,)`.
- __Arguments__:
- __size__: size of the vocabulary considered.
+45
Ver Arquivo
@@ -0,0 +1,45 @@
# Wrappers for the Scikit-Learn API
You can use `Sequential` Keras models (single-input only) as part of your Scikit-Learn workflow via the wrappers found at `keras.wrappers.scikit_learn.py`.
There are two wrappers available:
`keras.wrappers.scikit_learn.KerasClassifier(build_fn=None, **sk_params)`, which implements the Scikit-Learn classifier interface,
`keras.wrappers.scikit_learn.KerasRegressor(build_fn=None, **sk_params)`, which implements the Scikit-Learn regressor interface.
### Arguments
- __build_fn__: callable function or class instance
- __sk_params__: model parameters & fitting parameters
`build_fn` should construct, compile and return a Keras model, which
will then be used to fit/predict. One of the following
three values could be passed to build_fn:
1. A function
2. An instance of a class that implements the __call__ method
3. None. This means you implement a class that inherits from either
`KerasClassifier` or `KerasRegressor`. The __call__ method of the
present class will then be treated as the default build_fn.
`sk_params` takes both model parameters and fitting parameters. Legal model
parameters are the arguments of `build_fn`. Note that like all other
estimators in scikit-learn, 'build_fn' should provide default values for
its arguments, so that you could create the estimator without passing any
values to `sk_params`.
`sk_params` could also accept parameters for calling `fit`, `predict`,
`predict_proba`, and `score` methods (e.g., `nb_epoch`, `batch_size`).
fitting (predicting) parameters are selected in the following order:
1. Values passed to the dictionary arguments of
`fit`, `predict`, `predict_proba`, and `score` methods
2. Values passed to `sk_params`
3. The default values of the `keras.models.Sequential`
`fit`, `predict`, `predict_proba` and `score` methods
When using scikit-learn's `grid_search` API, legal tunable parameters are
those you could pass to `sk_params`, including fitting parameters.
In other words, you could use `grid_search` to search for the best
`batch_size` or `nb_epoch` as well as the model parameters.
@@ -10,11 +10,16 @@ from keras.utils.visualize_util import plot
plot(model, to_file='model.png')
```
`plot` takes two optional arguments:
- `show_shapes` (defaults to False) controls whether output shapes are shown in the graph.
- `show_layer_names` (defaults to True) controls whether layer names are shown in the graph.
You can also directly obtain the `pydot.Graph` object and render it yourself,
for example to show it in an ipython notebook :
```python
from IPython.display import SVG
from keras.utils.visualize_util import to_graph
from keras.utils.visualize_util import model_to_dot
SVG(to_graph(model).create(prog='dot', format='svg'))
SVG(model_to_dot(model).create(prog='dot', format='svg'))
```
+18 -16
Ver Arquivo
@@ -1,13 +1,5 @@
# -*- coding: utf-8 -*-
from __future__ import print_function
from keras.models import Sequential, slice_X
from keras.layers.core import Activation, TimeDistributedDense, RepeatVector
from keras.layers import recurrent
import numpy as np
from six.moves import range
"""
An implementation of sequence to sequence learning for performing addition
'''An implementation of sequence to sequence learning for performing addition
Input: "535+61"
Output: "596"
Padding is handled by using a repeated sentinel character (space)
@@ -32,16 +24,23 @@ Four digits inverted:
Five digits inverted:
+ One layer LSTM (128 HN), 550k training examples = 99% train/test accuracy in 30 epochs
"""
'''
from __future__ import print_function
from keras.models import Sequential
from keras.engine.training import slice_X
from keras.layers import Activation, TimeDistributed, Dense, RepeatVector, recurrent
import numpy as np
from six.moves import range
class CharacterTable(object):
"""
'''
Given a set of characters:
+ Encode them to a one hot integer representation
+ Decode the one hot integer representation to their character output
+ Decode a vector of probabilties to their character output
"""
+ Decode a vector of probabilities to their character output
'''
def __init__(self, chars, maxlen):
self.chars = sorted(set(chars))
self.char_indices = dict((c, i) for i, c in enumerate(self.chars))
@@ -140,17 +139,20 @@ for _ in range(LAYERS):
model.add(RNN(HIDDEN_SIZE, return_sequences=True))
# For each of step of the output sequence, decide which character should be chosen
model.add(TimeDistributedDense(len(chars)))
model.add(TimeDistributed(Dense(len(chars))))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
# Train the model each generation and show predictions against the validation dataset
for iteration in range(1, 200):
print()
print('-' * 50)
print('Iteration', iteration)
model.fit(X_train, y_train, batch_size=BATCH_SIZE, nb_epoch=1, validation_data=(X_val, y_val), show_accuracy=True)
model.fit(X_train, y_train, batch_size=BATCH_SIZE, nb_epoch=1,
validation_data=(X_val, y_val))
###
# Select 10 samples from the validation set at random so we can visualize errors
for i in range(10):
+104
Ver Arquivo
@@ -0,0 +1,104 @@
'''The example demonstrates how to write custom layers for Keras.
We build a custom activation layer called 'Antirectifier',
which modifies the shape of the tensor that passes through it.
We need to specify two methods: `get_output_shape_for` and `call`.
Note that the same result can also be achieved via a Lambda layer.
Because our custom layer is written with primitives from the Keras
backend (`K`), our code can run both on TensorFlow and Theano.
'''
from __future__ import print_function
from keras.models import Sequential
from keras.layers import Dense, Dropout, Layer, Activation
from keras.datasets import mnist
from keras import backend as K
from keras.utils import np_utils
class Antirectifier(Layer):
'''This is the combination of a sample-wise
L2 normalization with the concatenation of the
positive part of the input with the negative part
of the input. The result is a tensor of samples that are
twice as large as the input samples.
It can be used in place of a ReLU.
# Input shape
2D tensor of shape (samples, n)
# Output shape
2D tensor of shape (samples, 2*n)
# Theoretical justification
When applying ReLU, assuming that the distribution
of the previous output is approximately centered around 0.,
you are discarding half of your input. This is inefficient.
Antirectifier allows to return all-positive outputs like ReLU,
without discarding any data.
Tests on MNIST show that Antirectifier allows to train networks
with twice less parameters yet with comparable
classification accuracy as an equivalent ReLU-based network.
'''
def get_output_shape_for(self, input_shape):
shape = list(input_shape)
assert len(shape) == 2 # only valid for 2D tensors
shape[-1] *= 2
return tuple(shape)
def call(self, x, mask=None):
x -= K.mean(x, axis=1, keepdims=True)
x = K.l2_normalize(x, axis=1)
pos = K.relu(x)
neg = K.relu(-x)
return K.concatenate([pos, neg], axis=1)
# global parameters
batch_size = 128
nb_classes = 10
nb_epoch = 40
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
# build the model
model = Sequential()
model.add(Dense(256, input_shape=(784,)))
model.add(Antirectifier())
model.add(Dropout(0.1))
model.add(Dense(256))
model.add(Antirectifier())
model.add(Dropout(0.1))
model.add(Dense(10))
model.add(Activation('softmax'))
# compile the model
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# train the model
model.fit(X_train, Y_train,
batch_size=batch_size, nb_epoch=nb_epoch,
verbose=1, validation_data=(X_test, Y_test))
# next, compare with an equivalent network
# with2x bigger Dense layers and ReLU
+33 -26
Ver Arquivo
@@ -1,30 +1,29 @@
from __future__ import print_function
from keras.models import Sequential
from keras.layers.embeddings import Embedding
from keras.layers.core import Activation, Dense, Merge, Permute, Dropout
from keras.layers.recurrent import LSTM
from keras.datasets.data_utils import get_file
from keras.preprocessing.sequence import pad_sequences
from functools import reduce
import tarfile
import numpy as np
import re
"""
Train a memory network on the bAbI dataset.
'''Trains a memory network on the bAbI dataset.
References:
- Jason Weston, Antoine Bordes, Sumit Chopra, Tomas Mikolov, Alexander M. Rush,
"Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks",
http://arxiv.org/abs/1503.08895
http://arxiv.org/abs/1502.05698
- Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, Rob Fergus,
"End-To-End Memory Networks",
http://arxiv.org/abs/1503.08895
Reaches 93% accuracy on task 'single_supporting_fact_10k' after 70 epochs.
Reaches 98.6% accuracy on task 'single_supporting_fact_10k' after 120 epochs.
Time per epoch: 3s on CPU (core i7).
"""
'''
from __future__ import print_function
from keras.models import Sequential
from keras.layers.embeddings import Embedding
from keras.layers import Activation, Dense, Merge, Permute, Dropout
from keras.layers import LSTM
from keras.utils.data_utils import get_file
from keras.preprocessing.sequence import pad_sequences
from functools import reduce
import tarfile
import numpy as np
import re
def tokenize(sent):
@@ -95,8 +94,13 @@ def vectorize_stories(data, word_idx, story_maxlen, query_maxlen):
pad_sequences(Xq, maxlen=query_maxlen), np.array(Y))
path = get_file('babi-tasks-v1-2.tar.gz',
origin='http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz')
try:
path = get_file('babi-tasks-v1-2.tar.gz', origin='https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz')
except:
print('Error downloading dataset, please download it manually:\n'
'$ wget http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz\n'
'$ mv tasks_1-20_v1-2.tar.gz ~/.keras/datasets/babi-tasks-v1-2.tar.gz')
raise
tar = tarfile.open(path)
challenges = {
@@ -154,25 +158,29 @@ input_encoder_m = Sequential()
input_encoder_m.add(Embedding(input_dim=vocab_size,
output_dim=64,
input_length=story_maxlen))
input_encoder_m.add(Dropout(0.3))
# output: (samples, story_maxlen, embedding_dim)
# embed the question into a sequence of vectors
question_encoder = Sequential()
question_encoder.add(Embedding(input_dim=vocab_size,
output_dim=64,
input_length=query_maxlen))
question_encoder.add(Dropout(0.3))
# output: (samples, query_maxlen, embedding_dim)
# compute a 'match' between input sequence elements (which are vectors)
# and the question vector sequence
match = Sequential()
match.add(Merge([input_encoder_m, question_encoder],
mode='dot',
dot_axes=[(2,), (2,)]))
dot_axes=[2, 2]))
match.add(Activation('softmax'))
# output: (samples, story_maxlen, query_maxlen)
# embed the input into a single vector with size = story_maxlen:
input_encoder_c = Sequential()
input_encoder_c.add(Embedding(input_dim=vocab_size,
output_dim=query_maxlen,
input_length=story_maxlen))
input_encoder_c.add(Dropout(0.3))
# output: (samples, story_maxlen, query_maxlen)
# sum the match vector with the input vector:
response = Sequential()
@@ -186,18 +194,17 @@ answer = Sequential()
answer.add(Merge([response, question_encoder], mode='concat', concat_axis=-1))
# the original paper uses a matrix multiplication for this reduction step.
# we choose to use a RNN instead.
answer.add(LSTM(64))
answer.add(LSTM(32))
# one regularization layer -- more would probably be needed.
answer.add(Dropout(0.25))
answer.add(Dropout(0.3))
answer.add(Dense(vocab_size))
# we output a probability distribution over the vocabulary
answer.add(Activation('softmax'))
answer.compile(optimizer='rmsprop', loss='categorical_crossentropy')
answer.compile(optimizer='rmsprop', loss='categorical_crossentropy',
metrics=['accuracy'])
# Note: you could use a Graph model to avoid repeat the input twice
answer.fit([inputs_train, queries_train, inputs_train], answers_train,
batch_size=32,
nb_epoch=70,
show_accuracy=True,
nb_epoch=120,
validation_data=([inputs_test, queries_test, inputs_test], answers_test))
+45 -33
Ver Arquivo
@@ -1,21 +1,4 @@
from __future__ import absolute_import
from __future__ import print_function
from functools import reduce
import re
import tarfile
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.datasets.data_utils import get_file
from keras.layers.embeddings import Embedding
from keras.layers.core import Dense, Merge
from keras.layers import recurrent
from keras.models import Sequential
from keras.preprocessing.sequence import pad_sequences
'''
Trains two recurrent neural networks based upon a story and a question.
'''Trains two recurrent neural networks based upon a story and a question.
The resulting merged vector is then queried to answer a range of bAbI tasks.
The results are comparable to those for an LSTM model provided in Weston et al.:
@@ -24,8 +7,8 @@ http://arxiv.org/abs/1502.05698
Task Number | FB LSTM Baseline | Keras QA
--- | --- | ---
QA1 - Single Supporting Fact | 50 | 52.1
QA2 - Two Supporting Facts | 20 | 37.0
QA1 - Single Supporting Fact | 50 | 100.0
QA2 - Two Supporting Facts | 20 | 50.0
QA3 - Three Supporting Facts | 20 | 20.5
QA4 - Two Arg. Relations | 61 | 62.9
QA5 - Three Arg. Relations | 70 | 61.9
@@ -51,8 +34,8 @@ https://research.facebook.com/researchers/1543934539189348
Notes:
- With default word, sentence, and query vector sizes, the GRU model achieves:
- 52.1% test accuracy on QA1 in 20 epochs (2 seconds per epoch on CPU)
- 37.0% test accuracy on QA2 in 20 epochs (16 seconds per epoch on CPU)
- 100% test accuracy on QA1 in 20 epochs (2 seconds per epoch on CPU)
- 50% test accuracy on QA2 in 20 epochs (16 seconds per epoch on CPU)
In comparison, the Facebook paper achieves 50% and 20% for the LSTM baseline.
- The task does not traditionally parse the question separately. This likely
@@ -73,6 +56,21 @@ noise to find the relevant statements, improving performance substantially.
This becomes especially obvious on QA2 and QA3, both far longer than QA1.
'''
from __future__ import print_function
from functools import reduce
import re
import tarfile
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.utils.data_utils import get_file
from keras.layers.embeddings import Embedding
from keras.layers import Dense, Merge, Dropout, RepeatVector
from keras.layers import recurrent
from keras.models import Sequential
from keras.preprocessing.sequence import pad_sequences
def tokenize(sent):
'''Return the tokens of a sentence including punctuation.
@@ -140,15 +138,21 @@ def vectorize_stories(data, word_idx, story_maxlen, query_maxlen):
Y.append(y)
return pad_sequences(X, maxlen=story_maxlen), pad_sequences(Xq, maxlen=query_maxlen), np.array(Y)
RNN = recurrent.GRU
RNN = recurrent.LSTM
EMBED_HIDDEN_SIZE = 50
SENT_HIDDEN_SIZE = 100
QUERY_HIDDEN_SIZE = 100
BATCH_SIZE = 32
EPOCHS = 20
EPOCHS = 40
print('RNN / Embed / Sent / Query = {}, {}, {}, {}'.format(RNN, EMBED_HIDDEN_SIZE, SENT_HIDDEN_SIZE, QUERY_HIDDEN_SIZE))
path = get_file('babi-tasks-v1-2.tar.gz', origin='http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz')
try:
path = get_file('babi-tasks-v1-2.tar.gz', origin='https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz')
except:
print('Error downloading dataset, please download it manually:\n'
'$ wget http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz\n'
'$ mv tasks_1-20_v1-2.tar.gz ~/.keras/datasets/babi-tasks-v1-2.tar.gz')
raise
tar = tarfile.open(path)
# Default QA1 with 1000 samples
# challenge = 'tasks_1-20_v1-2/en/qa1_single-supporting-fact_{}.txt'
@@ -180,20 +184,28 @@ print('story_maxlen, query_maxlen = {}, {}'.format(story_maxlen, query_maxlen))
print('Build model...')
sentrnn = Sequential()
sentrnn.add(Embedding(vocab_size, EMBED_HIDDEN_SIZE, mask_zero=True))
sentrnn.add(RNN(SENT_HIDDEN_SIZE, return_sequences=False))
sentrnn.add(Embedding(vocab_size, EMBED_HIDDEN_SIZE,
input_length=story_maxlen))
sentrnn.add(Dropout(0.3))
qrnn = Sequential()
qrnn.add(Embedding(vocab_size, EMBED_HIDDEN_SIZE))
qrnn.add(RNN(QUERY_HIDDEN_SIZE, return_sequences=False))
qrnn.add(Embedding(vocab_size, EMBED_HIDDEN_SIZE,
input_length=query_maxlen))
qrnn.add(Dropout(0.3))
qrnn.add(RNN(EMBED_HIDDEN_SIZE, return_sequences=False))
qrnn.add(RepeatVector(story_maxlen))
model = Sequential()
model.add(Merge([sentrnn, qrnn], mode='concat'))
model.add(Merge([sentrnn, qrnn], mode='sum'))
model.add(RNN(EMBED_HIDDEN_SIZE, return_sequences=False))
model.add(Dropout(0.3))
model.add(Dense(vocab_size, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', class_mode='categorical')
model.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])
print('Training')
model.fit([X, Xq], Y, batch_size=BATCH_SIZE, nb_epoch=EPOCHS, validation_split=0.05, show_accuracy=True)
loss, acc = model.evaluate([tX, tXq], tY, batch_size=BATCH_SIZE, show_accuracy=True)
model.fit([X, Xq], Y, batch_size=BATCH_SIZE, nb_epoch=EPOCHS, validation_split=0.05)
loss, acc = model.evaluate([tX, tXq], tY, batch_size=BATCH_SIZE)
print('Test loss / test accuracy = {:.4f} / {:.4f}'.format(loss, acc))
+41 -52
Ver Arquivo
@@ -1,27 +1,24 @@
from __future__ import absolute_import
'''Train a simple deep CNN on the CIFAR10 small images dataset.
GPU run command:
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python cifar10_cnn.py
It gets down to 0.65 test logloss in 25 epochs, and down to 0.55 after 50 epochs.
(it's still underfitting at that point, though).
Note: the data was pickled with Python 2, and some encoding issues might prevent you
from loading it in Python 3. You might have to load it in Python 2,
save it in a different format, load it in Python 3 and repickle it.
'''
from __future__ import print_function
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.optimizers import SGD, Adadelta, Adagrad
from keras.utils import np_utils, generic_utils
from six.moves import range
'''
Train a (fairly simple) deep CNN on the CIFAR10 small images dataset.
GPU run command:
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python cifar10_cnn.py
It gets down to 0.65 test logloss in 25 epochs, and down to 0.55 after 50 epochs.
(it's still underfitting at that point, though).
Note: the data was pickled with Python 2, and some encoding issues might prevent you
from loading it in Python 3. You might have to load it in Python 2,
save it in a different format, load it in Python 3 and repickle it.
'''
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import SGD
from keras.utils import np_utils
batch_size = 32
nb_classes = 10
@@ -46,7 +43,7 @@ Y_test = np_utils.to_categorical(y_test, nb_classes)
model = Sequential()
model.add(Convolution2D(32, 3, 3, border_mode='same',
input_shape=(img_channels, img_rows, img_cols)))
input_shape=X_train.shape[1:]))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
@@ -69,32 +66,35 @@ model.add(Activation('softmax'))
# let's train the model using SGD + momentum (how original).
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
model.compile(loss='categorical_crossentropy',
optimizer=sgd,
metrics=['accuracy'])
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
if not data_augmentation:
print("Not using data augmentation or normalization")
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch)
score = model.evaluate(X_test, Y_test, batch_size=batch_size)
print('Test score:', score)
print('Not using data augmentation.')
model.fit(X_train, Y_train,
batch_size=batch_size,
nb_epoch=nb_epoch,
validation_data=(X_test, Y_test),
shuffle=True)
else:
print("Using real time data augmentation")
print('Using real-time data augmentation.')
# this will do preprocessing and realtime data augmentation
datagen = ImageDataGenerator(
featurewise_center=True, # set input mean to 0 over the dataset
featurewise_center=False, # set input mean to 0 over the dataset
samplewise_center=False, # set each sample mean to 0
featurewise_std_normalization=True, # divide inputs by std of the dataset
featurewise_std_normalization=False, # divide inputs by std of the dataset
samplewise_std_normalization=False, # divide each input by its std
zca_whitening=False, # apply ZCA whitening
rotation_range=20, # randomly rotate images in the range (degrees, 0 to 180)
width_shift_range=0.2, # randomly shift images horizontally (fraction of total width)
height_shift_range=0.2, # randomly shift images vertically (fraction of total height)
rotation_range=0, # randomly rotate images in the range (degrees, 0 to 180)
width_shift_range=0.1, # randomly shift images horizontally (fraction of total width)
height_shift_range=0.1, # randomly shift images vertically (fraction of total height)
horizontal_flip=True, # randomly flip images
vertical_flip=False) # randomly flip images
@@ -102,20 +102,9 @@ else:
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(X_train)
for e in range(nb_epoch):
print('-'*40)
print('Epoch', e)
print('-'*40)
print("Training...")
# batch train with realtime data augmentation
progbar = generic_utils.Progbar(X_train.shape[0])
for X_batch, Y_batch in datagen.flow(X_train, Y_train):
loss = model.train_on_batch(X_batch, Y_batch)
progbar.add(X_batch.shape[0], values=[("train loss", loss)])
print("Testing...")
# test time!
progbar = generic_utils.Progbar(X_test.shape[0])
for X_batch, Y_batch in datagen.flow(X_test, Y_test):
score = model.test_on_batch(X_batch, Y_batch)
progbar.add(X_batch.shape[0], values=[("test loss", score)])
# fit the model on the batches generated by datagen.flow()
model.fit_generator(datagen.flow(X_train, Y_train,
batch_size=batch_size),
samples_per_epoch=X_train.shape[0],
nb_epoch=nb_epoch,
validation_data=(X_test, Y_test))
+132
Ver Arquivo
@@ -0,0 +1,132 @@
'''Visualization of the filters of VGG16, via gradient ascent in input space.
This script can run on CPU in a few minutes (with the TensorFlow backend).
Results example: http://i.imgur.com/4nj4KjN.jpg
'''
from __future__ import print_function
from scipy.misc import imsave
import numpy as np
import time
from keras.applications import vgg16
from keras import backend as K
# dimensions of the generated pictures for each filter.
img_width = 128
img_height = 128
# the name of the layer we want to visualize
# (see model definition at keras/applications/vgg16.py)
layer_name = 'block5_conv1'
# util function to convert a tensor into a valid image
def deprocess_image(x):
# normalize tensor: center on 0., ensure std is 0.1
x -= x.mean()
x /= (x.std() + 1e-5)
x *= 0.1
# clip to [0, 1]
x += 0.5
x = np.clip(x, 0, 1)
# convert to RGB array
x *= 255
if K.image_dim_ordering() == 'th':
x = x.transpose((1, 2, 0))
x = np.clip(x, 0, 255).astype('uint8')
return x
# build the VGG16 network with ImageNet weights
model = vgg16.VGG16(weights='imagenet', include_top=False)
print('Model loaded.')
model.summary()
# this is the placeholder for the input images
input_img = model.input
# get the symbolic outputs of each "key" layer (we gave them unique names).
layer_dict = dict([(layer.name, layer) for layer in model.layers[1:]])
def normalize(x):
# utility function to normalize a tensor by its L2 norm
return x / (K.sqrt(K.mean(K.square(x))) + 1e-5)
kept_filters = []
for filter_index in range(0, 200):
# we only scan through the first 200 filters,
# but there are actually 512 of them
print('Processing filter %d' % filter_index)
start_time = time.time()
# we build a loss function that maximizes the activation
# of the nth filter of the layer considered
layer_output = layer_dict[layer_name].output
if K.image_dim_ordering() == 'th':
loss = K.mean(layer_output[:, filter_index, :, :])
else:
loss = K.mean(layer_output[:, :, :, filter_index])
# we compute the gradient of the input picture wrt this loss
grads = K.gradients(loss, input_img)[0]
# normalization trick: we normalize the gradient
grads = normalize(grads)
# this function returns the loss and grads given the input picture
iterate = K.function([input_img], [loss, grads])
# step size for gradient ascent
step = 1.
# we start from a gray image with some random noise
if K.image_dim_ordering() == 'th':
input_img_data = np.random.random((1, 3, img_width, img_height))
else:
input_img_data = np.random.random((1, img_width, img_height, 3))
input_img_data = (input_img_data - 0.5) * 20 + 128
# we run gradient ascent for 20 steps
for i in range(20):
loss_value, grads_value = iterate([input_img_data])
input_img_data += grads_value * step
print('Current loss value:', loss_value)
if loss_value <= 0.:
# some filters get stuck to 0, we can skip them
break
# decode the resulting input image
if loss_value > 0:
img = deprocess_image(input_img_data[0])
kept_filters.append((img, loss_value))
end_time = time.time()
print('Filter %d processed in %ds' % (filter_index, end_time - start_time))
# we will stich the best 64 filters on a 8 x 8 grid.
n = 8
# the filters that have the highest loss are assumed to be better-looking.
# we will only keep the top 64 filters.
kept_filters.sort(key=lambda x: x[1], reverse=True)
kept_filters = kept_filters[:n * n]
# build a black picture with enough space for
# our 8 x 8 filters of size 128 x 128, with a 5px margin in between
margin = 5
width = n * img_width + (n - 1) * margin
height = n * img_height + (n - 1) * margin
stitched_filters = np.zeros((width, height, 3))
# fill the picture with our saved filters
for i in range(n):
for j in range(n):
img, loss = kept_filters[i * n + j]
stitched_filters[(img_width + margin) * i: (img_width + margin) * i + img_width,
(img_height + margin) * j: (img_height + margin) * j + img_height, :] = img
# save the result to disk
imsave('stitched_filters_%dx%d.png' % (n, n), stitched_filters)
+233
Ver Arquivo
@@ -0,0 +1,233 @@
'''Deep Dreaming in Keras.
Run the script with:
```
python deep_dream.py path_to_your_base_image.jpg prefix_for_results
```
e.g.:
```
python deep_dream.py img/mypic.jpg results/dream
```
It is preferable to run this script on GPU, for speed.
If running on CPU, prefer the TensorFlow backend (much faster).
Example results: http://i.imgur.com/FX6ROg9.jpg
'''
from __future__ import print_function
from scipy.misc import imread, imresize, imsave
import numpy as np
from scipy.optimize import fmin_l_bfgs_b
import time
import argparse
import h5py
import os
from keras.models import Sequential
from keras.layers import Convolution2D, ZeroPadding2D, MaxPooling2D
from keras import backend as K
parser = argparse.ArgumentParser(description='Deep Dreams with Keras.')
parser.add_argument('base_image_path', metavar='base', type=str,
help='Path to the image to transform.')
parser.add_argument('result_prefix', metavar='res_prefix', type=str,
help='Prefix for the saved results.')
args = parser.parse_args()
base_image_path = args.base_image_path
result_prefix = args.result_prefix
# dimensions of the generated picture.
img_width = 600
img_height = 600
# path to the model weights file.
weights_path = 'vgg16_weights.h5'
# some settings we found interesting
saved_settings = {
'bad_trip': {'features': {'conv4_1': 0.05,
'conv4_2': 0.01,
'conv4_3': 0.01},
'continuity': 0.1,
'dream_l2': 0.8,
'jitter': 5},
'dreamy': {'features': {'conv5_1': 0.05,
'conv5_2': 0.02},
'continuity': 0.1,
'dream_l2': 0.02,
'jitter': 0},
}
# the settings we will use in this experiment
settings = saved_settings['dreamy']
# util function to open, resize and format pictures into appropriate tensors
def preprocess_image(image_path):
img = imresize(imread(image_path), (img_width, img_height))
img = img.transpose((2, 0, 1)).astype('float64')
img = np.expand_dims(img, axis=0)
return img
# util function to convert a tensor into a valid image
def deprocess_image(x):
x = x.transpose((1, 2, 0))
x = np.clip(x, 0, 255).astype('uint8')
return x
# build the VGG16 network
model = Sequential()
model.add(ZeroPadding2D((1, 1), batch_input_shape=(1, 3, img_width, img_height)))
first_layer = model.layers[-1]
# this is a placeholder tensor that will contain our generated images
dream = first_layer.input
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_2'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_2'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
# load the weights of the VGG16 networks
# (trained on ImageNet, won the ILSVRC competition in 2014)
# note: when there is a complete match between your model definition
# and your weight savefile, you can simply call model.load_weights(filename)
assert os.path.exists(weights_path), 'Model weights not found (see "weights_path" variable in script).'
f = h5py.File(weights_path)
for k in range(f.attrs['nb_layers']):
if k >= len(model.layers):
# we don't look at the last (fully-connected) layers in the savefile
break
g = f['layer_{}'.format(k)]
weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
model.layers[k].set_weights(weights)
f.close()
print('Model loaded.')
# get the symbolic outputs of each "key" layer (we gave them unique names).
layer_dict = dict([(layer.name, layer) for layer in model.layers])
# continuity loss util function
def continuity_loss(x):
assert K.ndim(x) == 4
a = K.square(x[:, :, :img_width-1, :img_height-1] - x[:, :, 1:, :img_height-1])
b = K.square(x[:, :, :img_width-1, :img_height-1] - x[:, :, :img_width-1, 1:])
return K.sum(K.pow(a + b, 1.25))
# define the loss
loss = K.variable(0.)
for layer_name in settings['features']:
# add the L2 norm of the features of a layer to the loss
assert layer_name in layer_dict.keys(), 'Layer ' + layer_name + ' not found in model.'
coeff = settings['features'][layer_name]
x = layer_dict[layer_name].output
shape = layer_dict[layer_name].output_shape
# we avoid border artifacts by only involving non-border pixels in the loss
loss -= coeff * K.sum(K.square(x[:, :, 2: shape[2]-2, 2: shape[3]-2])) / np.prod(shape[1:])
# add continuity loss (gives image local coherence, can result in an artful blur)
loss += settings['continuity'] * continuity_loss(dream) / (3 * img_width * img_height)
# add image L2 norm to loss (prevents pixels from taking very high values, makes image darker)
loss += settings['dream_l2'] * K.sum(K.square(dream)) / (3 * img_width * img_height)
# feel free to further modify the loss as you see fit, to achieve new effects...
# compute the gradients of the dream wrt the loss
grads = K.gradients(loss, dream)
outputs = [loss]
if type(grads) in {list, tuple}:
outputs += grads
else:
outputs.append(grads)
f_outputs = K.function([dream], outputs)
def eval_loss_and_grads(x):
x = x.reshape((1, 3, img_width, img_height))
outs = f_outputs([x])
loss_value = outs[0]
if len(outs[1:]) == 1:
grad_values = outs[1].flatten().astype('float64')
else:
grad_values = np.array(outs[1:]).flatten().astype('float64')
return loss_value, grad_values
# this Evaluator class makes it possible
# to compute loss and gradients in one pass
# while retrieving them via two separate functions,
# "loss" and "grads". This is done because scipy.optimize
# requires separate functions for loss and gradients,
# but computing them separately would be inefficient.
class Evaluator(object):
def __init__(self):
self.loss_value = None
self.grad_values = None
def loss(self, x):
assert self.loss_value is None
loss_value, grad_values = eval_loss_and_grads(x)
self.loss_value = loss_value
self.grad_values = grad_values
return self.loss_value
def grads(self, x):
assert self.loss_value is not None
grad_values = np.copy(self.grad_values)
self.loss_value = None
self.grad_values = None
return grad_values
evaluator = Evaluator()
# run scipy-based optimization (L-BFGS) over the pixels of the generated image
# so as to minimize the loss
x = preprocess_image(base_image_path)
for i in range(5):
print('Start of iteration', i)
start_time = time.time()
# add a random jitter to the initial image. This will be reverted at decoding time
random_jitter = (settings['jitter'] * 2) * (np.random.random((3, img_width, img_height)) - 0.5)
x += random_jitter
# run L-BFGS for 7 steps
x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(),
fprime=evaluator.grads, maxfun=7)
print('Current loss value:', min_val)
# decode the dream and save it
x = x.reshape((3, img_width, img_height))
x -= random_jitter
img = deprocess_image(x)
fname = result_prefix + '_at_iteration_%d.png' % i
imsave(fname, img)
end_time = time.time()
print('Image saved as', fname)
print('Iteration %d completed in %ds' % (i, end_time - start_time))
+470
Ver Arquivo
@@ -0,0 +1,470 @@
'''This example uses a convolutional stack followed by a recurrent stack
and a CTC logloss function to perform optical character recognition
of generated text images. I have no evidence of whether it actually
learns general shapes of text, or just is able to recognize all
the different fonts thrown at it...the purpose is more to demonstrate CTC
inside of Keras. Note that the font list may need to be updated
for the particular OS in use.
This starts off with 4 letter words. After 10 or so epochs, CTC
learns translational invariance, so longer words and groups of words
with spaces are gradually fed in. This gradual increase in difficulty
is handled using the TextImageGenerator class which is both a generator
class for test/train data and a Keras callback class. Every 10 epochs
the wordlist that the generator draws from increases in difficulty.
The table below shows normalized edit distance values. Theano uses
a slightly different CTC implementation, so some Theano-specific
hyperparameter tuning would be needed to get it to match Tensorflow.
Norm. ED
Epoch | TF | TH
------------------------
10 0.072 0.272
20 0.032 0.115
30 0.024 0.098
40 0.023 0.108
This requires cairo and editdistance packages:
pip install cairocffi
pip install editdistance
Due to the use of a dummy loss function, Theano requires the following flags:
on_unused_input='ignore'
Created by Mike Henry
https://github.com/mbhenry/
'''
import os
import itertools
import re
import datetime
import cairocffi as cairo
import editdistance
import numpy as np
from scipy import ndimage
import pylab
from keras import backend as K
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.layers import Input, Layer, Dense, Activation, Flatten
from keras.layers import Reshape, Lambda, merge, Permute, TimeDistributed
from keras.models import Model
from keras.layers.recurrent import GRU
from keras.optimizers import SGD
from keras.utils import np_utils
from keras.utils.data_utils import get_file
from keras.preprocessing import image
import keras.callbacks
OUTPUT_DIR = "image_ocr"
np.random.seed(55)
# this creates larger "blotches" of noise which look
# more realistic than just adding gaussian noise
# assumes greyscale with pixels ranging from 0 to 1
def speckle(img):
severity = np.random.uniform(0, 0.6)
blur = ndimage.gaussian_filter(np.random.randn(*img.shape) * severity, 1)
img_speck = (img + blur)
img_speck[img_speck > 1] = 1
img_speck[img_speck <= 0] = 0
return img_speck
# paints the string in a random location the bounding box
# also uses a random font, a slight random rotation,
# and a random amount of speckle noise
def paint_text(text, w, h):
surface = cairo.ImageSurface(cairo.FORMAT_RGB24, w, h)
with cairo.Context(surface) as context:
context.set_source_rgb(1, 1, 1) # White
context.paint()
# this font list works in Centos 7
fonts = ['Century Schoolbook', 'Courier', 'STIX', 'URW Chancery L', 'FreeMono']
context.select_font_face(np.random.choice(fonts), cairo.FONT_SLANT_NORMAL,
np.random.choice([cairo.FONT_WEIGHT_BOLD, cairo.FONT_WEIGHT_NORMAL]))
context.set_font_size(40)
box = context.text_extents(text)
if box[2] > w or box[3] > h:
raise IOError('Could not fit string into image. Max char count is too large for given image width.')
# teach the RNN translational invariance by
# fitting text box randomly on canvas, with some room to rotate
border_w_h = (10, 16)
max_shift_x = w - box[2] - border_w_h[0]
max_shift_y = h - box[3] - border_w_h[1]
top_left_x = np.random.randint(0, int(max_shift_x))
top_left_y = np.random.randint(0, int(max_shift_y))
context.move_to(top_left_x - int(box[0]), top_left_y - int(box[1]))
context.set_source_rgb(0, 0, 0)
context.show_text(text)
buf = surface.get_data()
a = np.frombuffer(buf, np.uint8)
a.shape = (h, w, 4)
a = a[:, :, 0] # grab single channel
a /= 255
a = np.expand_dims(a, 0)
a = speckle(a)
a = image.random_rotation(a, 3 * (w - top_left_x) / w + 1)
return a
def shuffle_mats_or_lists(matrix_list, stop_ind=None):
ret = []
assert all([len(i) == len(matrix_list[0]) for i in matrix_list])
len_val = len(matrix_list[0])
if stop_ind is None:
stop_ind = len_val
assert stop_ind <= len_val
a = range(stop_ind)
np.random.shuffle(a)
a += range(stop_ind, len_val)
for mat in matrix_list:
if isinstance(mat, np.ndarray):
ret.append(mat[a])
elif isinstance(mat, list):
ret.append([mat[i] for i in a])
else:
raise TypeError('shuffle_mats_or_lists only supports '
'numpy.array and list objects')
return ret
def text_to_labels(text, num_classes):
ret = []
for char in text:
if char >= 'a' and char <= 'z':
ret.append(ord(char) - ord('a'))
elif char == ' ':
ret.append(26)
return ret
# only a-z and space..probably not to difficult
# to expand to uppercase and symbols
def is_valid_str(in_str):
search = re.compile(r'[^a-z\ ]').search
return not bool(search(in_str))
# Uses generator functions to supply train/test with
# data. Image renderings are text are created on the fly
# each time with random perturbations
class TextImageGenerator(keras.callbacks.Callback):
def __init__(self, monogram_file, bigram_file, minibatch_size,
img_w, img_h, downsample_width, val_split,
absolute_max_string_len=16):
self.minibatch_size = minibatch_size
self.img_w = img_w
self.img_h = img_h
self.monogram_file = monogram_file
self.bigram_file = bigram_file
self.downsample_width = downsample_width
self.val_split = val_split
self.blank_label = self.get_output_size() - 1
self.absolute_max_string_len = absolute_max_string_len
def get_output_size(self):
return 28
# num_words can be independent of the epoch size due to the use of generators
# as max_string_len grows, num_words can grow
def build_word_list(self, num_words, max_string_len=None, mono_fraction=0.5):
assert max_string_len <= self.absolute_max_string_len
assert num_words % self.minibatch_size == 0
assert (self.val_split * num_words) % self.minibatch_size == 0
self.num_words = num_words
self.string_list = []
self.max_string_len = max_string_len
self.Y_data = np.ones([self.num_words, self.absolute_max_string_len]) * -1
self.X_text = []
self.Y_len = [0] * self.num_words
# monogram file is sorted by frequency in english speech
with open(self.monogram_file, 'rt') as f:
for line in f:
if len(self.string_list) == int(self.num_words * mono_fraction):
break
word = line.rstrip()
if max_string_len == -1 or max_string_len is None or len(word) <= max_string_len:
self.string_list.append(word)
# bigram file contains common word pairings in english speech
with open(self.bigram_file, 'rt') as f:
lines = f.readlines()
for line in lines:
if len(self.string_list) == self.num_words:
break
columns = line.lower().split()
word = columns[0] + ' ' + columns[1]
if is_valid_str(word) and \
(max_string_len == -1 or max_string_len is None or len(word) <= max_string_len):
self.string_list.append(word)
if len(self.string_list) != self.num_words:
raise IOError('Could not pull enough words from supplied monogram and bigram files. ')
for i, word in enumerate(self.string_list):
self.Y_len[i] = len(word)
self.Y_data[i, 0:len(word)] = text_to_labels(word, self.get_output_size())
self.X_text.append(word)
self.Y_len = np.expand_dims(np.array(self.Y_len), 1)
self.cur_val_index = self.val_split
self.cur_train_index = 0
# each time an image is requested from train/val/test, a new random
# painting of the text is performed
def get_batch(self, index, size, train):
if K.image_dim_ordering() == 'th':
X_data = np.ones([size, 1, self.img_h, self.img_w])
else:
X_data = np.ones([size, self.img_h, self.img_w, 1])
labels = np.ones([size, self.absolute_max_string_len])
input_length = np.zeros([size, 1])
label_length = np.zeros([size, 1])
source_str = []
for i in range(0, size):
# Mix in some blank inputs. This seems to be important for
# achieving translational invariance
if train and i > size - 4:
if K.image_dim_ordering() == 'th':
X_data[i, 0, :, :] = paint_text('', self.img_w, self.img_h)
else:
X_data[i, :, :, 0] = paint_text('', self.img_w, self.img_h)
labels[i, 0] = self.blank_label
input_length[i] = self.downsample_width
label_length[i] = 1
source_str.append('')
else:
if K.image_dim_ordering() == 'th':
X_data[i, 0, :, :] = paint_text(self.X_text[index + i], self.img_w, self.img_h)
else:
X_data[i, :, :, 0] = paint_text(self.X_text[index + i], self.img_w, self.img_h)
labels[i, :] = self.Y_data[index + i]
input_length[i] = self.downsample_width
label_length[i] = self.Y_len[index + i]
source_str.append(self.X_text[index + i])
inputs = {'the_input': X_data,
'the_labels': labels,
'input_length': input_length,
'label_length': label_length,
'source_str': source_str # used for visualization only
}
outputs = {'ctc': np.zeros([size])} # dummy data for dummy loss function
return (inputs, outputs)
def next_train(self):
while 1:
ret = self.get_batch(self.cur_train_index, self.minibatch_size, train=True)
self.cur_train_index += self.minibatch_size
if self.cur_train_index >= self.val_split:
self.cur_train_index = self.cur_train_index % 32
(self.X_text, self.Y_data, self.Y_len) = shuffle_mats_or_lists(
[self.X_text, self.Y_data, self.Y_len], self.val_split)
yield ret
def next_val(self):
while 1:
ret = self.get_batch(self.cur_val_index, self.minibatch_size, train=False)
self.cur_val_index += self.minibatch_size
if self.cur_val_index >= self.num_words:
self.cur_val_index = self.val_split + self.cur_val_index % 32
yield ret
def on_train_begin(self, logs={}):
# translational invariance seems to be the hardest thing
# for the RNN to learn, so start with <= 4 letter words.
self.build_word_list(16000, 4, 1)
def on_epoch_begin(self, epoch, logs={}):
# After 10 epochs, translational invariance should be learned
# so start feeding longer words and eventually multiple words with spaces
if epoch == 10:
self.build_word_list(32000, 8, 1)
if epoch == 20:
self.build_word_list(32000, 8, 0.6)
if epoch == 30:
self.build_word_list(64000, 12, 0.5)
# the actual loss calc occurs here despite it not being
# an internal Keras loss function
def ctc_lambda_func(args):
y_pred, labels, input_length, label_length = args
# the 2 is critical here since the first couple outputs of the RNN
# tend to be garbage:
y_pred = y_pred[:, 2:, :]
return K.ctc_batch_cost(labels, y_pred, input_length, label_length)
# For a real OCR application, this should be beam search with a dictionary
# and language model. For this example, best path is sufficient.
def decode_batch(test_func, word_batch):
out = test_func([word_batch])[0]
ret = []
for j in range(out.shape[0]):
out_best = list(np.argmax(out[j, 2:], 1))
out_best = [k for k, g in itertools.groupby(out_best)]
# 26 is space, 27 is CTC blank char
outstr = ''
for c in out_best:
if c >= 0 and c < 26:
outstr += chr(c + ord('a'))
elif c == 26:
outstr += ' '
ret.append(outstr)
return ret
class VizCallback(keras.callbacks.Callback):
def __init__(self, test_func, text_img_gen, num_display_words=6):
self.test_func = test_func
self.output_dir = os.path.join(
OUTPUT_DIR, datetime.datetime.now().strftime('%A, %d. %B %Y %I.%M%p'))
self.text_img_gen = text_img_gen
self.num_display_words = num_display_words
os.makedirs(self.output_dir)
def show_edit_distance(self, num):
num_left = num
mean_norm_ed = 0.0
mean_ed = 0.0
while num_left > 0:
word_batch = next(self.text_img_gen)[0]
num_proc = min(word_batch['the_input'].shape[0], num_left)
decoded_res = decode_batch(self.test_func, word_batch['the_input'][0:num_proc])
for j in range(0, num_proc):
edit_dist = editdistance.eval(decoded_res[j], word_batch['source_str'][j])
mean_ed += float(edit_dist)
mean_norm_ed += float(edit_dist) / len(word_batch['source_str'][j])
num_left -= num_proc
mean_norm_ed = mean_norm_ed / num
mean_ed = mean_ed / num
print('\nOut of %d samples: Mean edit distance: %.3f Mean normalized edit distance: %0.3f'
% (num, mean_ed, mean_norm_ed))
def on_epoch_end(self, epoch, logs={}):
self.model.save_weights(os.path.join(self.output_dir, 'weights%02d.h5' % epoch))
self.show_edit_distance(256)
word_batch = next(self.text_img_gen)[0]
res = decode_batch(self.test_func, word_batch['the_input'][0:self.num_display_words])
for i in range(self.num_display_words):
pylab.subplot(self.num_display_words, 1, i + 1)
if K.image_dim_ordering() == 'th':
the_input = word_batch['the_input'][i, 0, :, :]
else:
the_input = word_batch['the_input'][i, :, :, 0]
pylab.imshow(the_input, cmap='Greys_r')
pylab.xlabel('Truth = \'%s\' Decoded = \'%s\'' % (word_batch['source_str'][i], res[i]))
fig = pylab.gcf()
fig.set_size_inches(10, 12)
pylab.savefig(os.path.join(self.output_dir, 'e%02d.png' % epoch))
pylab.close()
# Input Parameters
img_h = 64
img_w = 512
nb_epoch = 50
minibatch_size = 32
words_per_epoch = 16000
val_split = 0.2
val_words = int(words_per_epoch * (val_split))
# Network parameters
conv_num_filters = 16
filter_size = 3
pool_size_1 = 4
pool_size_2 = 2
time_dense_size = 32
rnn_size = 512
time_steps = img_w / (pool_size_1 * pool_size_2)
if K.image_dim_ordering() == 'th':
input_shape = (1, img_h, img_w)
else:
input_shape = (img_h, img_w, 1)
fdir = os.path.dirname(get_file('wordlists.tgz',
origin='http://www.isosemi.com/datasets/wordlists.tgz', untar=True))
img_gen = TextImageGenerator(monogram_file=os.path.join(fdir, 'wordlist_mono_clean.txt'),
bigram_file=os.path.join(fdir, 'wordlist_bi_clean.txt'),
minibatch_size=32,
img_w=img_w,
img_h=img_h,
downsample_width=img_w / (pool_size_1 * pool_size_2) - 2,
val_split=words_per_epoch - val_words)
act = 'relu'
input_data = Input(name='the_input', shape=input_shape, dtype='float32')
inner = Convolution2D(conv_num_filters, filter_size, filter_size, border_mode='same',
activation=act, name='conv1')(input_data)
inner = MaxPooling2D(pool_size=(pool_size_1, pool_size_1), name='max1')(inner)
inner = Convolution2D(conv_num_filters, filter_size, filter_size, border_mode='same',
activation=act, name='conv2')(inner)
inner = MaxPooling2D(pool_size=(pool_size_2, pool_size_2), name='max2')(inner)
conv_to_rnn_dims = ((img_h / (pool_size_1 * pool_size_2)) * conv_num_filters, img_w / (pool_size_1 * pool_size_2))
inner = Reshape(target_shape=conv_to_rnn_dims, name='reshape')(inner)
inner = Permute(dims=(2, 1), name='permute')(inner)
# cuts down input size going into RNN:
inner = TimeDistributed(Dense(time_dense_size, activation=act, name='dense1'))(inner)
# Two layers of bidirecitonal GRUs
# GRU seems to work as well, if not better than LSTM:
gru_1 = GRU(rnn_size, return_sequences=True, name='gru1')(inner)
gru_1b = GRU(rnn_size, return_sequences=True, go_backwards=True, name='gru1_b')(inner)
gru1_merged = merge([gru_1, gru_1b], mode='sum')
gru_2 = GRU(rnn_size, return_sequences=True, name='gru2')(gru1_merged)
gru_2b = GRU(rnn_size, return_sequences=True, go_backwards=True)(gru1_merged)
# transforms RNN output to character activations:
inner = TimeDistributed(Dense(img_gen.get_output_size(), name='dense2'))(merge([gru_2, gru_2b], mode='concat'))
y_pred = Activation('softmax', name='softmax')(inner)
Model(input=[input_data], output=y_pred).summary()
labels = Input(name='the_labels', shape=[img_gen.absolute_max_string_len], dtype='float32')
input_length = Input(name='input_length', shape=[1], dtype='int64')
label_length = Input(name='label_length', shape=[1], dtype='int64')
# Keras doesn't currently support loss funcs with extra parameters
# so CTC loss is implemented in a lambda layer
loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name="ctc")([y_pred, labels, input_length, label_length])
lr = 0.03
# clipnorm seems to speeds up convergence
clipnorm = 5
sgd = SGD(lr=lr, decay=3e-7, momentum=0.9, nesterov=True, clipnorm=clipnorm)
model = Model(input=[input_data, labels, input_length, label_length], output=[loss_out])
# the loss calc occurs elsewhere, so use a dummy lambda func for the loss
model.compile(loss={'ctc': lambda y_true, y_pred: y_pred}, optimizer=sgd)
# captures output of softmax so we can decode the output during visualization
test_func = K.function([input_data], [y_pred])
viz_cb = VizCallback(test_func, img_gen.next_val())
model.fit_generator(generator=img_gen.next_train(), samples_per_epoch=(words_per_epoch - val_words),
nb_epoch=nb_epoch, validation_data=img_gen.next_val(), nb_val_samples=val_words,
callbacks=[viz_cb, img_gen])
+20 -36
Ver Arquivo
@@ -1,33 +1,25 @@
from __future__ import absolute_import
'''Train a Bidirectional LSTM on the IMDB sentiment classification task.
Output after 4 epochs on CPU: ~0.8146
Time per epoch on CPU (Core i7): ~150s.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.preprocessing import sequence
from keras.utils.np_utils import accuracy
from keras.models import Graph
from keras.layers.core import Dense, Dropout
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM
from keras.models import Sequential
from keras.layers import Dense, Dropout, Embedding, LSTM, Input, Bidirectional
from keras.datasets import imdb
'''
Train a Bidirectional LSTM on the IMDB sentiment classification task.
GPU command:
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python imdb_bidirectional_lstm.py
Output after 4 epochs on CPU: ~0.8146
Time per epoch on CPU (Core i7): ~150s.
'''
max_features = 20000
maxlen = 100 # cut texts after this number of words (among top max_features most common words)
batch_size = 32
print("Loading data...")
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features,
test_split=0.2)
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
@@ -39,25 +31,17 @@ print('X_test shape:', X_test.shape)
y_train = np.array(y_train)
y_test = np.array(y_test)
print('Build model...')
model = Graph()
model.add_input(name='input', input_shape=(maxlen,), dtype=int)
model.add_node(Embedding(max_features, 128, input_length=maxlen),
name='embedding', input='input')
model.add_node(LSTM(64), name='forward', input='embedding')
model.add_node(LSTM(64, go_backwards=True), name='backward', input='embedding')
model.add_node(Dropout(0.5), name='dropout', inputs=['forward', 'backward'])
model.add_node(Dense(1, activation='sigmoid'), name='sigmoid', input='dropout')
model.add_output(name='output', input='sigmoid')
model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen))
model.add(Bidirectional(LSTM(64)))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
# try using different optimizers and different optimizer configs
model.compile('adam', {'output': 'binary_crossentropy'})
model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])
print("Train...")
model.fit({'input': X_train, 'output': y_train},
print('Train...')
model.fit(X_train, y_train,
batch_size=batch_size,
nb_epoch=4)
acc = accuracy(y_test,
np.round(np.array(model.predict({'input': X_test},
batch_size=batch_size)['output'])))
print('Test accuracy:', acc)
nb_epoch=4,
validation_data=[X_test, y_test])
+31 -29
Ver Arquivo
@@ -1,41 +1,40 @@
from __future__ import absolute_import
'''This example demonstrates the use of Convolution1D for text classification.
Gets to 0.89 test accuracy after 2 epochs.
90s/epoch on Intel i5 2.4Ghz CPU.
10s/epoch on Tesla K40 GPU.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.embeddings import Embedding
from keras.layers.convolutional import Convolution1D, MaxPooling1D
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Embedding
from keras.layers import Convolution1D, MaxPooling1D
from keras.datasets import imdb
from keras import backend as K
'''
This example demonstrates the use of Convolution1D
for text classification.
Run on GPU: THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python imdb_cnn.py
Get to 0.835 test accuracy after 2 epochs. 100s/epoch on K520 GPU.
'''
# set parameters:
max_features = 5000
maxlen = 100
maxlen = 400
batch_size = 32
embedding_dims = 100
embedding_dims = 50
nb_filter = 250
filter_length = 3
hidden_dims = 250
nb_epoch = 2
print("Loading data...")
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features,
test_split=0.2)
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
print("Pad sequences (samples x time)")
print('Pad sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
@@ -46,18 +45,20 @@ model = Sequential()
# we start off with an efficient embedding layer which maps
# our vocab indices into embedding_dims dimensions
model.add(Embedding(max_features, embedding_dims, input_length=maxlen))
model.add(Dropout(0.25))
model.add(Embedding(max_features,
embedding_dims,
input_length=maxlen,
dropout=0.2))
# we add a Convolution1D, which will learn nb_filter
# word group filters of size filter_length:
model.add(Convolution1D(nb_filter=nb_filter,
filter_length=filter_length,
border_mode="valid",
activation="relu",
border_mode='valid',
activation='relu',
subsample_length=1))
# we use standard max pooling (halving the output of the previous layer):
model.add(MaxPooling1D(pool_length=2))
# we use max pooling:
model.add(MaxPooling1D(pool_length=model.output_shape[1]))
# We flatten the output of the conv layer,
# so that we can add a vanilla dense layer:
@@ -65,7 +66,7 @@ model.add(Flatten())
# We add a vanilla hidden layer:
model.add(Dense(hidden_dims))
model.add(Dropout(0.25))
model.add(Dropout(0.2))
model.add(Activation('relu'))
# We project onto a single unit output layer, and squash it with a sigmoid:
@@ -73,8 +74,9 @@ model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
class_mode="binary")
model.fit(X_train, y_train, batch_size=batch_size,
nb_epoch=nb_epoch, show_accuracy=True,
optimizer='adam',
metrics=['accuracy'])
model.fit(X_train, y_train,
batch_size=batch_size,
nb_epoch=nb_epoch,
validation_data=(X_test, y_test))
+20 -27
Ver Arquivo
@@ -1,26 +1,20 @@
from __future__ import absolute_import
'''Train a recurrent convolutional network on the IMDB sentiment
classification task.
Gets to 0.8498 test accuracy after 2 epochs. 41s/epoch on K520 GPU.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.preprocessing import sequence
from keras.optimizers import SGD, RMSprop, Adagrad
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM, GRU, SimpleRNN
from keras.layers.convolutional import Convolution1D, MaxPooling1D
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import LSTM, GRU, SimpleRNN
from keras.layers import Convolution1D, MaxPooling1D
from keras.datasets import imdb
'''
Train a recurrent convolutional network on the IMDB sentiment classification task.
GPU command:
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python imdb_lstm.py
Get to 0.8498 test accuracy after 2 epochs. 41s/epoch on K520 GPU.
'''
# Embedding
max_features = 20000
@@ -28,9 +22,9 @@ maxlen = 100
embedding_size = 128
# Convolution
filter_length = 3
filter_length = 5
nb_filter = 64
pool_length = 2
pool_length = 4
# LSTM
lstm_output_size = 70
@@ -45,12 +39,12 @@ batch_size is highly sensitive.
Only 2 epochs are needed as the dataset is very small.
'''
print("Loading data...")
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features, test_split=0.2)
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
print("Pad sequences (samples x time)")
print('Pad sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
@@ -63,8 +57,8 @@ model.add(Embedding(max_features, embedding_size, input_length=maxlen))
model.add(Dropout(0.25))
model.add(Convolution1D(nb_filter=nb_filter,
filter_length=filter_length,
border_mode="valid",
activation="relu",
border_mode='valid',
activation='relu',
subsample_length=1))
model.add(MaxPooling1D(pool_length=pool_length))
model.add(LSTM(lstm_output_size))
@@ -73,12 +67,11 @@ model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
class_mode="binary")
metrics=['accuracy'])
print("Train...")
print('Train...')
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=nb_epoch,
validation_data=(X_test, y_test), show_accuracy=True)
score, acc = model.evaluate(X_test, y_test, batch_size=batch_size,
show_accuracy=True)
validation_data=(X_test, y_test))
score, acc = model.evaluate(X_test, y_test, batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)
+139
Ver Arquivo
@@ -0,0 +1,139 @@
'''This example demonstrates the use of fasttext for text classification
Based on Joulin et al's paper:
Bags of Tricks for Efficient Text Classification
https://arxiv.org/abs/1607.01759
Results on IMDB datasets with uni and bi-gram embeddings:
Uni-gram: 0.8813 test accuracy after 5 epochs. 15s/epoch on i7 cpu.
Bi-gram : 0.9056 test accuracy after 5 epochs. 5s/epoch on GTX 1080 gpu.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Flatten
from keras.layers import Embedding
from keras.layers import AveragePooling1D
from keras.datasets import imdb
def create_ngram_set(input_list, ngram_value=2):
"""
Extract a set of n-grams from a list of integers.
>>> create_ngram_set([1, 4, 9, 4, 1, 4], ngram_value=2)
{(4, 9), (4, 1), (1, 4), (9, 4)}
>>> create_ngram_set([1, 4, 9, 4, 1, 4], ngram_value=3)
[(1, 4, 9), (4, 9, 4), (9, 4, 1), (4, 1, 4)]
"""
return set(zip(*[input_list[i:] for i in range(ngram_value)]))
def add_ngram(sequences, token_indice, ngram_range=2):
"""
Augment the input list of list (sequences) by appending n-grams values.
Example: adding bi-gram
>>> sequences = [[1, 3, 4, 5], [1, 3, 7, 9, 2]]
>>> token_indice = {(1, 3): 1337, (9, 2): 42, (4, 5): 2017}
>>> add_ngram(sequences, token_indice, ngram_range=2)
[[1, 3, 4, 5, 1337, 2017], [1, 3, 7, 9, 2, 1337, 42]]
Example: adding tri-gram
>>> sequences = [[1, 3, 4, 5], [1, 3, 7, 9, 2]]
>>> token_indice = {(1, 3): 1337, (9, 2): 42, (4, 5): 2017, (7, 9, 2): 2018}
>>> add_ngram(sequences, token_indice, ngram_range=3)
[[1, 3, 4, 5, 1337], [1, 3, 7, 9, 2, 1337, 2018]]
"""
new_sequences = []
for input_list in sequences:
new_list = input_list[:]
for i in range(len(new_list)-ngram_range+1):
for ngram_value in range(2, ngram_range+1):
ngram = tuple(new_list[i:i+ngram_value])
if ngram in token_indice:
new_list.append(token_indice[ngram])
new_sequences.append(new_list)
return new_sequences
# Set parameters:
# ngram_range = 2 will add bi-grams features
ngram_range = 1
max_features = 20000
maxlen = 400
batch_size = 32
embedding_dims = 50
nb_epoch = 5
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
print('Average train sequence length: {}'.format(np.mean(list(map(len, X_train)), dtype=int)))
print('Average test sequence length: {}'.format(np.mean(list(map(len, X_test)), dtype=int)))
if ngram_range > 1:
print('Adding {}-gram features'.format(ngram_range))
# Create set of unique n-gram from the training set.
ngram_set = set()
for input_list in X_train:
for i in range(2, ngram_range+1):
set_of_ngram = create_ngram_set(input_list, ngram_value=i)
ngram_set.update(set_of_ngram)
# Dictionary mapping n-gram token to a unique integer.
# Integer values are greater than max_features in order
# to avoid collision with existing features.
start_index = max_features + 1
token_indice = {v: k+start_index for k, v in enumerate(ngram_set)}
indice_token = {token_indice[k]: k for k in token_indice}
# max_features is the highest integer that could be found in the dataset.
max_features = np.max(list(indice_token.keys())) + 1
# Augmenting X_train and X_test with n-grams features
X_train = add_ngram(X_train, token_indice, ngram_range)
X_test = add_ngram(X_test, token_indice, ngram_range)
print('Average train sequence length: {}'.format(np.mean(list(map(len, X_train)), dtype=int)))
print('Average test sequence length: {}'.format(np.mean(list(map(len, X_test)), dtype=int)))
print('Pad sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)
print('Build model...')
model = Sequential()
# we start off with an efficient embedding layer which maps
# our vocab indices into embedding_dims dimensions
model.add(Embedding(max_features,
embedding_dims,
input_length=maxlen))
# we add a AveragePooling1D, which will average the embeddings
# of all words in the document
model.add(AveragePooling1D(pool_length=model.output_shape[1]))
# We flatten the output of the AveragePooling1D layer
model.add(Flatten())
# We project onto a single unit output layer, and squash it with a sigmoid:
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.fit(X_train, y_train,
batch_size=batch_size,
nb_epoch=nb_epoch,
validation_data=(X_test, y_test))
+25 -37
Ver Arquivo
@@ -1,4 +1,15 @@
from __future__ import absolute_import
'''Trains a LSTM on the IMDB sentiment classification task.
The dataset is actually too small for LSTM to be of any advantage
compared to simpler, much faster methods such as TF-IDF + LogReg.
Notes:
- RNNs are tricky. Choice of batch size is important,
choice of loss and optimizer is critical, etc.
Some configurations won't converge.
- LSTM loss decrease patterns during training can be quite different
from what you see with CNNs/MLPs/etc.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
@@ -6,41 +17,20 @@ np.random.seed(1337) # for reproducibility
from keras.preprocessing import sequence
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM
from keras.layers import Dense, Dropout, Activation, Embedding
from keras.layers import LSTM, SimpleRNN, GRU
from keras.datasets import imdb
'''
Train a LSTM on the IMDB sentiment classification task.
The dataset is actually too small for LSTM to be of any advantage
compared to simpler, much faster methods such as TF-IDF+LogReg.
Notes:
- RNNs are tricky. Choice of batch size is important,
choice of loss and optimizer is critical, etc.
Some configurations won't converge.
- LSTM loss decrease patterns during training can be quite different
from what you see with CNNs/MLPs/etc.
GPU command:
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python imdb_lstm.py
'''
max_features = 20000
maxlen = 100 # cut texts after this number of words (among top max_features most common words)
maxlen = 80 # cut texts after this number of words (among top max_features most common words)
batch_size = 32
print("Loading data...")
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features,
test_split=0.2)
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
print("Pad sequences (samples x time)")
print('Pad sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
@@ -48,22 +38,20 @@ print('X_test shape:', X_test.shape)
print('Build model...')
model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen))
model.add(LSTM(128)) # try using a GRU instead, for fun
model.add(Dropout(0.5))
model.add(Embedding(max_features, 128, dropout=0.2))
model.add(LSTM(128, dropout_W=0.2, dropout_U=0.2)) # try using a GRU instead, for fun
model.add(Dense(1))
model.add(Activation('sigmoid'))
# try using different optimizers and different optimizer configs
model.compile(loss='binary_crossentropy',
optimizer='adam',
class_mode="binary")
metrics=['accuracy'])
print("Train...")
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=3,
validation_data=(X_test, y_test), show_accuracy=True)
print('Train...')
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=15,
validation_data=(X_test, y_test))
score, acc = model.evaluate(X_test, y_test,
batch_size=batch_size,
show_accuracy=True)
batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)
-123
Ver Arquivo
@@ -1,123 +0,0 @@
from __future__ import absolute_import
from __future__ import print_function
import numpy as np
import pandas as pd
np.random.seed(1337) # for reproducibility
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.normalization import BatchNormalization
from keras.layers.advanced_activations import PReLU
from keras.utils import np_utils, generic_utils
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler
'''
This demonstrates how to reach a score of 0.4890 (local validation)
on the Kaggle Otto challenge, with a deep net using Keras.
Compatible Python 2.7-3.4. Requires Scikit-Learn and Pandas.
Recommended to run on GPU:
Command: THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python kaggle_otto_nn.py
On EC2 g2.2xlarge instance: 19s/epoch. 6-7 minutes total training time.
Best validation score at epoch 21: 0.4881
Try it at home:
- with/without BatchNormalization (BatchNormalization helps!)
- with ReLU or with PReLU (PReLU helps!)
- with smaller layers, largers layers
- with more layers, less layers
- with different optimizers (SGD+momentum+decay is probably better than Adam!)
Get the data from Kaggle: https://www.kaggle.com/c/otto-group-product-classification-challenge/data
'''
def load_data(path, train=True):
df = pd.read_csv(path)
X = df.values.copy()
if train:
np.random.shuffle(X) # https://youtu.be/uyUXoap67N8
X, labels = X[:, 1:-1].astype(np.float32), X[:, -1]
return X, labels
else:
X, ids = X[:, 1:].astype(np.float32), X[:, 0].astype(str)
return X, ids
def preprocess_data(X, scaler=None):
if not scaler:
scaler = StandardScaler()
scaler.fit(X)
X = scaler.transform(X)
return X, scaler
def preprocess_labels(labels, encoder=None, categorical=True):
if not encoder:
encoder = LabelEncoder()
encoder.fit(labels)
y = encoder.transform(labels).astype(np.int32)
if categorical:
y = np_utils.to_categorical(y)
return y, encoder
def make_submission(y_prob, ids, encoder, fname):
with open(fname, 'w') as f:
f.write('id,')
f.write(','.join([str(i) for i in encoder.classes_]))
f.write('\n')
for i, probs in zip(ids, y_prob):
probas = ','.join([i] + [str(p) for p in probs.tolist()])
f.write(probas)
f.write('\n')
print("Wrote submission to file {}.".format(fname))
print("Loading data...")
X, labels = load_data('train.csv', train=True)
X, scaler = preprocess_data(X)
y, encoder = preprocess_labels(labels)
X_test, ids = load_data('test.csv', train=False)
X_test, _ = preprocess_data(X_test, scaler)
nb_classes = y.shape[1]
print(nb_classes, 'classes')
dims = X.shape[1]
print(dims, 'dims')
print("Building model...")
model = Sequential()
model.add(Dense(512, input_shape=(dims,)))
model.add(PReLU())
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(512))
model.add(PReLU())
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(512))
model.add(PReLU())
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer="adam")
print("Training model...")
model.fit(X, y, nb_epoch=20, batch_size=128, validation_split=0.15)
print("Generating submission...")
proba = model.predict_proba(X_test)
make_submission(proba, ids, encoder, fname='keras-otto.csv')
+83
Ver Arquivo
@@ -0,0 +1,83 @@
'''Compare LSTM implementations on the IMDB sentiment classification task.
consume_less='cpu' preprocesses input to the LSTM which typically results in
faster computations at the expense of increased peak memory usage as the
preprocessed input must be kept in memory.
consume_less='mem' does away with the preprocessing, meaning that it might take
a little longer, but should require less peak memory.
consume_less='gpu' concatenates the input, output and forget gate's weights
into one, large matrix, resulting in faster computation time as the GPU can
utilize more cores, at the expense of reduced regularization because the same
dropout is shared across the gates.
Note that the relative performance of the different `consume_less` modes
can vary depending on your device, your model and the size of your data.
'''
import time
import numpy as np
import matplotlib.pyplot as plt
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Embedding, Dense, LSTM
from keras.datasets import imdb
max_features = 20000
max_length = 80
embedding_dim = 256
batch_size = 128
epochs = 10
modes = ['cpu', 'mem', 'gpu']
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
X_train = sequence.pad_sequences(X_train, max_length)
X_test = sequence.pad_sequences(X_test, max_length)
# Compile and train different models while meauring performance.
results = []
for mode in modes:
print('Testing mode: consume_less="{}"'.format(mode))
model = Sequential()
model.add(Embedding(max_features, embedding_dim, input_length=max_length, dropout=0.2))
model.add(LSTM(embedding_dim, dropout_W=0.2, dropout_U=0.2, consume_less=mode))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
start_time = time.time()
history = model.fit(X_train, y_train,
batch_size=batch_size,
nb_epoch=epochs,
validation_data=(X_test, y_test))
average_time_per_epoch = (time.time() - start_time) / epochs
results.append((history, average_time_per_epoch))
# Compare models' accuracy, loss and elapsed time per epoch.
plt.style.use('ggplot')
ax1 = plt.subplot2grid((2, 2), (0, 0))
ax1.set_title('Accuracy')
ax1.set_ylabel('Validation Accuracy')
ax1.set_xlabel('Epochs')
ax2 = plt.subplot2grid((2, 2), (1, 0))
ax2.set_title('Loss')
ax2.set_ylabel('Validation Loss')
ax2.set_xlabel('Epochs')
ax3 = plt.subplot2grid((2, 2), (0, 1), rowspan=2)
ax3.set_title('Time')
ax3.set_ylabel('Seconds')
for mode, result in zip(modes, results):
ax1.plot(result[0].epoch, result[0].history['val_acc'], label=mode)
ax2.plot(result[0].epoch, result[0].history['val_loss'], label=mode)
ax1.legend()
ax2.legend()
ax3.bar(np.arange(len(results)), [x[1] for x in results],
tick_label=modes, align='center')
plt.tight_layout()
plt.show()
+30 -29
Ver Arquivo
@@ -1,36 +1,36 @@
'''Example script to generate text from Nietzsche's writings.
At least 20 epochs are required before the generated text
starts sounding coherent.
It is recommended to run this script on GPU, as recurrent
networks are quite computationally intensive.
If you try this script on new data, make sure your corpus
has at least ~100k characters. ~1M is better.
'''
from __future__ import print_function
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM
from keras.datasets.data_utils import get_file
from keras.layers import Dense, Activation, Dropout
from keras.layers import LSTM
from keras.optimizers import RMSprop
from keras.utils.data_utils import get_file
import numpy as np
import random
import sys
'''
Example script to generate text from Nietzsche's writings.
At least 20 epochs are required before the generated text
starts sounding coherent.
It is recommended to run this script on GPU, as recurrent
networks are quite computationally intensive.
If you try this script on new data, make sure your corpus
has at least ~100k characters. ~1M is better.
'''
path = get_file('nietzsche.txt', origin="https://s3.amazonaws.com/text-datasets/nietzsche.txt")
text = open(path).read().lower()
print('corpus length:', len(text))
chars = set(text)
chars = sorted(list(set(text)))
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))
# cut the text in semi-redundant sequences of maxlen characters
maxlen = 20
maxlen = 40
step = 3
sentences = []
next_chars = []
@@ -48,24 +48,25 @@ for i, sentence in enumerate(sentences):
y[i, char_indices[next_chars[i]]] = 1
# build the model: 2 stacked LSTM
# build the model: a single LSTM
print('Build model...')
model = Sequential()
model.add(LSTM(512, return_sequences=True, input_shape=(maxlen, len(chars))))
model.add(Dropout(0.2))
model.add(LSTM(512, return_sequences=False))
model.add(Dropout(0.2))
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
def sample(a, temperature=1.0):
def sample(preds, temperature=1.0):
# helper function to sample an index from a probability array
a = np.log(a) / temperature
a = np.exp(a) / np.sum(np.exp(a))
return np.argmax(np.random.multinomial(1, a, 1))
preds = np.asarray(preds).astype('float64')
preds = np.log(preds) / temperature
exp_preds = np.exp(preds)
preds = exp_preds / np.sum(exp_preds)
probas = np.random.multinomial(1, preds, 1)
return np.argmax(probas)
# train the model, output generated text after each iteration
for iteration in range(1, 60):
@@ -86,7 +87,7 @@ for iteration in range(1, 60):
print('----- Generating with seed: "' + sentence + '"')
sys.stdout.write(generated)
for iteration in range(400):
for i in range(400):
x = np.zeros((1, maxlen, len(chars)))
for t, char in enumerate(sentence):
x[0, t, char_indices[char]] = 1.
+35 -27
Ver Arquivo
@@ -1,22 +1,20 @@
from __future__ import absolute_import
'''Trains a simple convnet on the MNIST dataset.
Gets to 99.25% test accuracy after 12 epochs
(there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils
'''
Train a simple convnet on the MNIST dataset.
Run on GPU: THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python mnist_cnn.py
Get to 99.25% test accuracy after 12 epochs (there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.
'''
from keras import backend as K
batch_size = 128
nb_classes = 10
@@ -27,17 +25,24 @@ img_rows, img_cols = 28, 28
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
nb_pool = 2
pool_size = (2, 2)
# convolution kernel size
nb_conv = 3
kernel_size = (3, 3)
# the data, shuffled and split between tran and test sets
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
if K.image_dim_ordering() == 'th':
X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
input_shape = (1, img_rows, img_cols)
else:
X_train = X_train.reshape(X_train.shape[0], img_rows, img_cols, 1)
X_test = X_test.reshape(X_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
@@ -50,13 +55,13 @@ Y_test = np_utils.to_categorical(y_test, nb_classes)
model = Sequential()
model.add(Convolution2D(nb_filters, nb_conv, nb_conv,
border_mode='same',
input_shape=(1, img_rows, img_cols)))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
border_mode='valid',
input_shape=input_shape))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, nb_conv, nb_conv))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Dropout(0.25))
model.add(Flatten())
@@ -66,9 +71,12 @@ model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adadelta')
model.compile(loss='categorical_crossentropy',
optimizer='adadelta',
metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch, show_accuracy=True, verbose=1, validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test, show_accuracy=True, verbose=0)
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
verbose=1, validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
+87
Ver Arquivo
@@ -0,0 +1,87 @@
"""This is an example of using Hierarchical RNN (HRNN) to classify MNIST digits.
HRNNs can learn across multiple levels of temporal hiearchy over a complex sequence.
Usually, the first recurrent layer of an HRNN encodes a sentence (e.g. of word vectors)
into a sentence vector. The second recurrent layer then encodes a sequence of
such vectors (encoded by the first layer) into a document vector. This
document vector is considered to preserve both the word-level and
sentence-level structure of the context.
# References
- [A Hierarchical Neural Autoencoder for Paragraphs and Documents](https://web.stanford.edu/~jurafsky/pubs/P15-1107.pdf)
Encodes paragraphs and documents with HRNN.
Results have shown that HRNN outperforms standard
RNNs and may play some role in more sophisticated generation tasks like
summarization or question answering.
- [Hierarchical recurrent neural network for skeleton based action recognition](http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7298714)
Achieved state-of-the-art results on skeleton based action recognition with 3 levels
of bidirectional HRNN combined with fully connected layers.
In the below MNIST example the first LSTM layer first encodes every
column of pixels of shape (28, 1) to a column vector of shape (128,). The second LSTM
layer encodes then these 28 column vectors of shape (28, 128) to a image vector
representing the whole image. A final Dense layer is added for prediction.
After 5 epochs: train acc: 0.9858, val acc: 0.9864
"""
from __future__ import print_function
from keras.datasets import mnist
from keras.models import Sequential, Model
from keras.layers import Input, Dense, TimeDistributed
from keras.layers import LSTM
from keras.utils import np_utils
# Training parameters.
batch_size = 32
nb_classes = 10
nb_epochs = 5
# Embedding dimensions.
row_hidden = 128
col_hidden = 128
# The data, shuffled and split between train and test sets.
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# Reshapes data to 4D for Hierarchical RNN.
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# Converts class vectors to binary class matrices.
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
row, col, pixel = X_train.shape[1:]
# 4D input.
x = Input(shape=(row, col, pixel))
# Encodes a row of pixels using TimeDistributed Wrapper.
encoded_rows = TimeDistributed(LSTM(output_dim=row_hidden))(x)
# Encodes columns of encoded rows.
encoded_columns = LSTM(col_hidden)(encoded_rows)
# Final predictions and model.
prediction = Dense(nb_classes, activation='softmax')(encoded_columns)
model = Model(input=x, output=prediction)
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# Training.
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epochs,
verbose=1, validation_data=(X_test, Y_test))
# Evaluation.
scores = model.evaluate(X_test, Y_test, verbose=0)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])
+28 -44
Ver Arquivo
@@ -1,32 +1,28 @@
from __future__ import absolute_import
'''This is a reproduction of the IRNN experiment
with pixel-by-pixel sequential MNIST in
"A Simple Way to Initialize Recurrent Networks of Rectified Linear Units"
by Quoc V. Le, Navdeep Jaitly, Geoffrey E. Hinton
arXiv:1504.00941v2 [cs.NE] 7 Apr 2015
http://arxiv.org/pdf/1504.00941v2.pdf
Optimizer is replaced with RMSprop which yields more stable and steady
improvement.
Reaches 0.93 train/test accuracy after 900 epochs
(which roughly corresponds to 1687500 steps in the original paper.)
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.layers import Dense, Activation
from keras.layers import SimpleRNN
from keras.initializations import normal, identity
from keras.layers.recurrent import SimpleRNN, LSTM
from keras.optimizers import RMSprop
from keras.utils import np_utils
'''
This is a reproduction of the IRNN experiment
with pixel-by-pixel sequential MNIST in
"A Simple Way to Initialize Recurrent Networks of Rectified Linear Units "
by Quoc V. Le, Navdeep Jaitly, Geoffrey E. Hinton
arXiv:1504.00941v2 [cs.NE] 7 Apr 201
http://arxiv.org/pdf/1504.00941v2.pdf
Optimizer is replaced with RMSprop which yields more stable and steady
improvement.
Reaches 0.93 train/test accuracy after 900 epochs
(which roughly corresponds to 1687500 steps in the original paper.)
'''
batch_size = 32
nb_classes = 10
nb_epochs = 200
@@ -40,8 +36,8 @@ clip_norm = 1.0
X_train = X_train.reshape(X_train.shape[0], -1, 1)
X_test = X_test.reshape(X_test.shape[0], -1, 1)
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
@@ -55,32 +51,20 @@ Y_test = np_utils.to_categorical(y_test, nb_classes)
print('Evaluate IRNN...')
model = Sequential()
model.add(SimpleRNN(output_dim=hidden_units,
init=lambda shape: normal(shape, scale=0.001),
inner_init=lambda shape: identity(shape, scale=1.0),
activation='relu', input_shape=X_train.shape[1:]))
init=lambda shape, name: normal(shape, scale=0.001, name=name),
inner_init=lambda shape, name: identity(shape, scale=1.0, name=name),
activation='relu',
input_shape=X_train.shape[1:]))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
rmsprop = RMSprop(lr=learning_rate)
model.compile(loss='categorical_crossentropy', optimizer=rmsprop)
model.compile(loss='categorical_crossentropy',
optimizer=rmsprop,
metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epochs,
show_accuracy=True, verbose=1, validation_data=(X_test, Y_test))
verbose=1, validation_data=(X_test, Y_test))
scores = model.evaluate(X_test, Y_test, show_accuracy=True, verbose=0)
scores = model.evaluate(X_test, Y_test, verbose=0)
print('IRNN test score:', scores[0])
print('IRNN test accuracy:', scores[1])
print('Compare to LSTM...')
model = Sequential()
model.add(LSTM(hidden_units, input_shape=X_train.shape[1:]))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
rmsprop = RMSprop(lr=learning_rate)
model.compile(loss='categorical_crossentropy', optimizer=rmsprop)
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epochs,
show_accuracy=True, verbose=1, validation_data=(X_test, Y_test))
scores = model.evaluate(X_test, Y_test, show_accuracy=True, verbose=0)
print('LSTM test score:', scores[0])
print('LSTM test accuracy:', scores[1])
+19 -19
Ver Arquivo
@@ -1,4 +1,10 @@
from __future__ import absolute_import
'''Trains a simple deep NN on the MNIST dataset.
Gets to 98.40% test accuracy after 20 epochs
(there is *a lot* of margin for parameter tuning).
2 seconds per epoch on a K520 GPU.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
@@ -9,25 +15,18 @@ from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD, Adam, RMSprop
from keras.utils import np_utils
'''
Train a simple deep NN on the MNIST dataset.
Get to 98.40% test accuracy after 20 epochs
(there is *a lot* of margin for parameter tuning).
2 seconds per epoch on a K520 GPU.
'''
batch_size = 128
nb_classes = 10
nb_epoch = 20
# the data, shuffled and split between tran and test sets
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print(X_train.shape[0], 'train samples')
@@ -47,14 +46,15 @@ model.add(Dropout(0.2))
model.add(Dense(10))
model.add(Activation('softmax'))
rms = RMSprop()
model.compile(loss='categorical_crossentropy', optimizer=rms)
model.summary()
model.fit(X_train, Y_train,
batch_size=batch_size, nb_epoch=nb_epoch,
show_accuracy=True, verbose=2,
validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test,
show_accuracy=True, verbose=0)
model.compile(loss='categorical_crossentropy',
optimizer=RMSprop(),
metrics=['accuracy'])
history = model.fit(X_train, Y_train,
batch_size=batch_size, nb_epoch=nb_epoch,
verbose=1, validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
+384
Ver Arquivo
@@ -0,0 +1,384 @@
'''This is an implementation of Net2Net experiment with MNIST in
'Net2Net: Accelerating Learning via Knowledge Transfer'
by Tianqi Chen, Ian Goodfellow, and Jonathon Shlens
arXiv:1511.05641v4 [cs.LG] 23 Apr 2016
http://arxiv.org/abs/1511.05641
Notes
- What:
+ Net2Net is a group of methods to transfer knowledge from a teacher neural
net to a student net,so that the student net can be trained faster than
from scratch.
+ The paper discussed two specific methods of Net2Net, i.e. Net2WiderNet
and Net2DeeperNet.
+ Net2WiderNet replaces a model with an equivalent wider model that has
more units in each hidden layer.
+ Net2DeeperNet replaces a model with an equivalent deeper model.
+ Both are based on the idea of 'function-preserving transformations of
neural nets'.
- Why:
+ Enable fast exploration of multiple neural nets in experimentation and
design process,by creating a series of wider and deeper models with
transferable knowledge.
+ Enable 'lifelong learning system' by gradually adjusting model complexity
to data availability,and reusing transferable knowledge.
Experiments
- Teacher model: a basic CNN model trained on MNIST for 3 epochs.
- Net2WiderNet exepriment:
+ Student model has a wider Conv2D layer and a wider FC layer.
+ Comparison of 'random-padding' vs 'net2wider' weight initialization.
+ With both methods, student model should immediately perform as well as
teacher model, but 'net2wider' is slightly better.
- Net2DeeperNet experiment:
+ Student model has an extra Conv2D layer and an extra FC layer.
+ Comparison of 'random-init' vs 'net2deeper' weight initialization.
+ Starting performance of 'net2deeper' is better than 'random-init'.
- Hyper-parameters:
+ SGD with momentum=0.9 is used for training teacher and student models.
+ Learning rate adjustment: it's suggested to reduce learning rate
to 1/10 for student model.
+ Addition of noise in 'net2wider' is used to break weight symmetry
and thus enable full capacity of student models. It is optional
when a Dropout layer is used.
Results
- Tested with 'Theano' backend and 'th' image_dim_ordering.
- Running on GPU GeForce GTX 980M
- Performance Comparisons - validation loss values during first 3 epochs:
(1) teacher_model: 0.075 0.041 0.041
(2) wider_random_pad: 0.036 0.034 0.032
(3) wider_net2wider: 0.032 0.030 0.030
(4) deeper_random_init: 0.061 0.043 0.041
(5) deeper_net2deeper: 0.032 0.031 0.029
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337)
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dense, Flatten
from keras.optimizers import SGD
from keras.utils import np_utils
from keras.datasets import mnist
input_shape = (1, 28, 28) # image shape
nb_class = 10 # number of class
# load and pre-process data
def preprocess_input(x):
return x.reshape((-1, ) + input_shape) / 255.
def preprocess_output(y):
return np_utils.to_categorical(y)
(train_x, train_y), (validation_x, validation_y) = mnist.load_data()
train_x, validation_x = map(preprocess_input, [train_x, validation_x])
train_y, validation_y = map(preprocess_output, [train_y, validation_y])
print('Loading MNIST data...')
print('train_x shape:', train_x.shape, 'train_y shape:', train_y.shape)
print('validation_x shape:', validation_x.shape,
'validation_y shape', validation_y.shape)
# knowledge transfer algorithms
def wider2net_conv2d(teacher_w1, teacher_b1, teacher_w2, new_width, init):
'''Get initial weights for a wider conv2d layer with a bigger nb_filter,
by 'random-padding' or 'net2wider'.
# Arguments
teacher_w1: `weight` of conv2d layer to become wider,
of shape (nb_filter1, nb_channel1, kh1, kw1)
teacher_b1: `bias` of conv2d layer to become wider,
of shape (nb_filter1, )
teacher_w2: `weight` of next connected conv2d layer,
of shape (nb_filter2, nb_channel2, kh2, kw2)
new_width: new `nb_filter` for the wider conv2d layer
init: initialization algorithm for new weights,
either 'random-pad' or 'net2wider'
'''
assert teacher_w1.shape[0] == teacher_w2.shape[1], (
'successive layers from teacher model should have compatible shapes')
assert teacher_w1.shape[0] == teacher_b1.shape[0], (
'weight and bias from same layer should have compatible shapes')
assert new_width > teacher_w1.shape[0], (
'new width (nb_filter) should be bigger than the existing one')
n = new_width - teacher_w1.shape[0]
if init == 'random-pad':
new_w1 = np.random.normal(0, 0.1, size=(n, ) + teacher_w1.shape[1:])
new_b1 = np.ones(n) * 0.1
new_w2 = np.random.normal(0, 0.1, size=(
teacher_w2.shape[0], n) + teacher_w2.shape[2:])
elif init == 'net2wider':
index = np.random.randint(teacher_w1.shape[0], size=n)
factors = np.bincount(index)[index] + 1.
new_w1 = teacher_w1[index, :, :, :]
new_b1 = teacher_b1[index]
new_w2 = teacher_w2[:, index, :, :] / factors.reshape((1, -1, 1, 1))
else:
raise ValueError('Unsupported weight initializer: %s' % init)
student_w1 = np.concatenate((teacher_w1, new_w1), axis=0)
if init == 'random-pad':
student_w2 = np.concatenate((teacher_w2, new_w2), axis=1)
elif init == 'net2wider':
# add small noise to break symmetry, so that student model will have
# full capacity later
noise = np.random.normal(0, 5e-2 * new_w2.std(), size=new_w2.shape)
student_w2 = np.concatenate((teacher_w2, new_w2 + noise), axis=1)
student_w2[:, index, :, :] = new_w2
student_b1 = np.concatenate((teacher_b1, new_b1), axis=0)
return student_w1, student_b1, student_w2
def wider2net_fc(teacher_w1, teacher_b1, teacher_w2, new_width, init):
'''Get initial weights for a wider fully connected (dense) layer
with a bigger nout, by 'random-padding' or 'net2wider'.
# Arguments
teacher_w1: `weight` of fc layer to become wider,
of shape (nin1, nout1)
teacher_b1: `bias` of fc layer to become wider,
of shape (nout1, )
teacher_w2: `weight` of next connected fc layer,
of shape (nin2, nout2)
new_width: new `nout` for the wider fc layer
init: initialization algorithm for new weights,
either 'random-pad' or 'net2wider'
'''
assert teacher_w1.shape[1] == teacher_w2.shape[0], (
'successive layers from teacher model should have compatible shapes')
assert teacher_w1.shape[1] == teacher_b1.shape[0], (
'weight and bias from same layer should have compatible shapes')
assert new_width > teacher_w1.shape[1], (
'new width (nout) should be bigger than the existing one')
n = new_width - teacher_w1.shape[1]
if init == 'random-pad':
new_w1 = np.random.normal(0, 0.1, size=(teacher_w1.shape[0], n))
new_b1 = np.ones(n) * 0.1
new_w2 = np.random.normal(0, 0.1, size=(n, teacher_w2.shape[1]))
elif init == 'net2wider':
index = np.random.randint(teacher_w1.shape[1], size=n)
factors = np.bincount(index)[index] + 1.
new_w1 = teacher_w1[:, index]
new_b1 = teacher_b1[index]
new_w2 = teacher_w2[index, :] / factors[:, np.newaxis]
else:
raise ValueError('Unsupported weight initializer: %s' % init)
student_w1 = np.concatenate((teacher_w1, new_w1), axis=1)
if init == 'random-pad':
student_w2 = np.concatenate((teacher_w2, new_w2), axis=0)
elif init == 'net2wider':
# add small noise to break symmetry, so that student model will have
# full capacity later
noise = np.random.normal(0, 5e-2 * new_w2.std(), size=new_w2.shape)
student_w2 = np.concatenate((teacher_w2, new_w2 + noise), axis=0)
student_w2[index, :] = new_w2
student_b1 = np.concatenate((teacher_b1, new_b1), axis=0)
return student_w1, student_b1, student_w2
def deeper2net_conv2d(teacher_w):
'''Get initial weights for a deeper conv2d layer by net2deeper'.
# Arguments
teacher_w: `weight` of previous conv2d layer,
of shape (nb_filter, nb_channel, kh, kw)
'''
nb_filter, nb_channel, kh, kw = teacher_w.shape
student_w = np.zeros((nb_filter, nb_filter, kh, kw))
for i in xrange(nb_filter):
student_w[i, i, (kh - 1) / 2, (kw - 1) / 2] = 1.
student_b = np.zeros(nb_filter)
return student_w, student_b
def copy_weights(teacher_model, student_model, layer_names):
'''Copy weights from teacher_model to student_model,
for layers with names listed in layer_names
'''
for name in layer_names:
weights = teacher_model.get_layer(name=name).get_weights()
student_model.get_layer(name=name).set_weights(weights)
# methods to construct teacher_model and student_models
def make_teacher_model(train_data, validation_data, nb_epoch=3):
'''Train a simple CNN as teacher model.
'''
model = Sequential()
model.add(Conv2D(64, 3, 3, input_shape=input_shape,
border_mode='same', name='conv1'))
model.add(MaxPooling2D(name='pool1'))
model.add(Conv2D(64, 3, 3, border_mode='same', name='conv2'))
model.add(MaxPooling2D(name='pool2'))
model.add(Flatten(name='flatten'))
model.add(Dense(64, activation='relu', name='fc1'))
model.add(Dense(nb_class, activation='softmax', name='fc2'))
model.compile(loss='categorical_crossentropy',
optimizer=SGD(lr=0.01, momentum=0.9),
metrics=['accuracy'])
train_x, train_y = train_data
history = model.fit(train_x, train_y, nb_epoch=nb_epoch,
validation_data=validation_data)
return model, history
def make_wider_student_model(teacher_model, train_data,
validation_data, init, nb_epoch=3):
'''Train a wider student model based on teacher_model,
with either 'random-pad' (baseline) or 'net2wider'
'''
new_conv1_width = 128
new_fc1_width = 128
model = Sequential()
# a wider conv1 compared to teacher_model
model.add(Conv2D(new_conv1_width, 3, 3, input_shape=input_shape,
border_mode='same', name='conv1'))
model.add(MaxPooling2D(name='pool1'))
model.add(Conv2D(64, 3, 3, border_mode='same', name='conv2'))
model.add(MaxPooling2D(name='pool2'))
model.add(Flatten(name='flatten'))
# a wider fc1 compared to teacher model
model.add(Dense(new_fc1_width, activation='relu', name='fc1'))
model.add(Dense(nb_class, activation='softmax', name='fc2'))
# The weights for other layers need to be copied from teacher_model
# to student_model, except for widened layers
# and their immediate downstreams, which will be initialized separately.
# For this example there are no other layers that need to be copied.
w_conv1, b_conv1 = teacher_model.get_layer('conv1').get_weights()
w_conv2, b_conv2 = teacher_model.get_layer('conv2').get_weights()
new_w_conv1, new_b_conv1, new_w_conv2 = wider2net_conv2d(
w_conv1, b_conv1, w_conv2, new_conv1_width, init)
model.get_layer('conv1').set_weights([new_w_conv1, new_b_conv1])
model.get_layer('conv2').set_weights([new_w_conv2, b_conv2])
w_fc1, b_fc1 = teacher_model.get_layer('fc1').get_weights()
w_fc2, b_fc2 = teacher_model.get_layer('fc2').get_weights()
new_w_fc1, new_b_fc1, new_w_fc2 = wider2net_fc(
w_fc1, b_fc1, w_fc2, new_fc1_width, init)
model.get_layer('fc1').set_weights([new_w_fc1, new_b_fc1])
model.get_layer('fc2').set_weights([new_w_fc2, b_fc2])
model.compile(loss='categorical_crossentropy',
optimizer=SGD(lr=0.001, momentum=0.9),
metrics=['accuracy'])
train_x, train_y = train_data
history = model.fit(train_x, train_y, nb_epoch=nb_epoch,
validation_data=validation_data)
return model, history
def make_deeper_student_model(teacher_model, train_data,
validation_data, init, nb_epoch=3):
'''Train a deeper student model based on teacher_model,
with either 'random-init' (baseline) or 'net2deeper'
'''
model = Sequential()
model.add(Conv2D(64, 3, 3, input_shape=input_shape,
border_mode='same', name='conv1'))
model.add(MaxPooling2D(name='pool1'))
model.add(Conv2D(64, 3, 3, border_mode='same', name='conv2'))
# add another conv2d layer to make original conv2 deeper
if init == 'net2deeper':
prev_w, _ = model.get_layer('conv2').get_weights()
new_weights = deeper2net_conv2d(prev_w)
model.add(Conv2D(64, 3, 3, border_mode='same',
name='conv2-deeper', weights=new_weights))
elif init == 'random-init':
model.add(Conv2D(64, 3, 3, border_mode='same', name='conv2-deeper'))
else:
raise ValueError('Unsupported weight initializer: %s' % init)
model.add(MaxPooling2D(name='pool2'))
model.add(Flatten(name='flatten'))
model.add(Dense(64, activation='relu', name='fc1'))
# add another fc layer to make original fc1 deeper
if init == 'net2deeper':
# net2deeper for fc layer with relu, is just an identity initializer
model.add(Dense(64, init='identity',
activation='relu', name='fc1-deeper'))
elif init == 'random-init':
model.add(Dense(64, activation='relu', name='fc1-deeper'))
else:
raise ValueError('Unsupported weight initializer: %s' % init)
model.add(Dense(nb_class, activation='softmax', name='fc2'))
# copy weights for other layers
copy_weights(teacher_model, model, layer_names=[
'conv1', 'conv2', 'fc1', 'fc2'])
model.compile(loss='categorical_crossentropy',
optimizer=SGD(lr=0.001, momentum=0.9),
metrics=['accuracy'])
train_x, train_y = train_data
history = model.fit(train_x, train_y, nb_epoch=nb_epoch,
validation_data=validation_data)
return model, history
# experiments setup
def net2wider_experiment():
'''Benchmark performances of
(1) a teacher model,
(2) a wider student model with `random_pad` initializer
(3) a wider student model with `Net2WiderNet` initializer
'''
train_data = (train_x, train_y)
validation_data = (validation_x, validation_y)
print('\nExperiment of Net2WiderNet ...')
print('\nbuilding teacher model ...')
teacher_model, _ = make_teacher_model(train_data,
validation_data,
nb_epoch=3)
print('\nbuilding wider student model by random padding ...')
make_wider_student_model(teacher_model, train_data,
validation_data, 'random-pad',
nb_epoch=3)
print('\nbuilding wider student model by net2wider ...')
make_wider_student_model(teacher_model, train_data,
validation_data, 'net2wider',
nb_epoch=3)
def net2deeper_experiment():
'''Benchmark performances of
(1) a teacher model,
(2) a deeper student model with `random_init` initializer
(3) a deeper student model with `Net2DeeperNet` initializer
'''
train_data = (train_x, train_y)
validation_data = (validation_x, validation_y)
print('\nExperiment of Net2DeeperNet ...')
print('\nbuilding teacher model ...')
teacher_model, _ = make_teacher_model(train_data,
validation_data,
nb_epoch=3)
print('\nbuilding deeper student model by random init ...')
make_deeper_student_model(teacher_model, train_data,
validation_data, 'random-init',
nb_epoch=3)
print('\nbuilding deeper student model by net2deeper ...')
make_deeper_student_model(teacher_model, train_data,
validation_data, 'net2deeper',
nb_epoch=3)
# run the experiments
net2wider_experiment()
net2deeper_experiment()
+130
Ver Arquivo
@@ -0,0 +1,130 @@
'''Train a Siamese MLP on pairs of digits from the MNIST dataset.
It follows Hadsell-et-al.'06 [1] by computing the Euclidean distance on the
output of the shared network and by optimizing the contrastive loss (see paper
for mode details).
[1] "Dimensionality Reduction by Learning an Invariant Mapping"
http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf
Gets to 99.5% test accuracy after 20 epochs.
3 seconds per epoch on a Titan X GPU
'''
from __future__ import absolute_import
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
import random
from keras.datasets import mnist
from keras.models import Sequential, Model
from keras.layers import Dense, Dropout, Input, Lambda
from keras.optimizers import SGD, RMSprop
from keras import backend as K
def euclidean_distance(vects):
x, y = vects
return K.sqrt(K.sum(K.square(x - y), axis=1, keepdims=True))
def eucl_dist_output_shape(shapes):
shape1, shape2 = shapes
return (shape1[0], 1)
def contrastive_loss(y_true, y_pred):
'''Contrastive loss from Hadsell-et-al.'06
http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf
'''
margin = 1
return K.mean(y_true * K.square(y_pred) + (1 - y_true) * K.square(K.maximum(margin - y_pred, 0)))
def create_pairs(x, digit_indices):
'''Positive and negative pair creation.
Alternates between positive and negative pairs.
'''
pairs = []
labels = []
n = min([len(digit_indices[d]) for d in range(10)]) - 1
for d in range(10):
for i in range(n):
z1, z2 = digit_indices[d][i], digit_indices[d][i+1]
pairs += [[x[z1], x[z2]]]
inc = random.randrange(1, 10)
dn = (d + inc) % 10
z1, z2 = digit_indices[d][i], digit_indices[dn][i]
pairs += [[x[z1], x[z2]]]
labels += [1, 0]
return np.array(pairs), np.array(labels)
def create_base_network(input_dim):
'''Base network to be shared (eq. to feature extraction).
'''
seq = Sequential()
seq.add(Dense(128, input_shape=(input_dim,), activation='relu'))
seq.add(Dropout(0.1))
seq.add(Dense(128, activation='relu'))
seq.add(Dropout(0.1))
seq.add(Dense(128, activation='relu'))
return seq
def compute_accuracy(predictions, labels):
'''Compute classification accuracy with a fixed threshold on distances.
'''
return labels[predictions.ravel() < 0.5].mean()
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
input_dim = 784
nb_epoch = 20
# create training+test positive and negative pairs
digit_indices = [np.where(y_train == i)[0] for i in range(10)]
tr_pairs, tr_y = create_pairs(X_train, digit_indices)
digit_indices = [np.where(y_test == i)[0] for i in range(10)]
te_pairs, te_y = create_pairs(X_test, digit_indices)
# network definition
base_network = create_base_network(input_dim)
input_a = Input(shape=(input_dim,))
input_b = Input(shape=(input_dim,))
# because we re-use the same instance `base_network`,
# the weights of the network
# will be shared across the two branches
processed_a = base_network(input_a)
processed_b = base_network(input_b)
distance = Lambda(euclidean_distance, output_shape=eucl_dist_output_shape)([processed_a, processed_b])
model = Model(input=[input_a, input_b], output=distance)
# train
rms = RMSprop()
model.compile(loss=contrastive_loss, optimizer=rms)
model.fit([tr_pairs[:, 0], tr_pairs[:, 1]], tr_y,
validation_data=([te_pairs[:, 0], te_pairs[:, 1]], te_y),
batch_size=128,
nb_epoch=nb_epoch)
# compute final accuracy on training and test sets
pred = model.predict([tr_pairs[:, 0], tr_pairs[:, 1]])
tr_acc = compute_accuracy(pred, tr_y)
pred = model.predict([te_pairs[:, 0], te_pairs[:, 1]])
te_acc = compute_accuracy(pred, te_y)
print('* Accuracy on training set: %0.2f%%' % (100 * tr_acc))
print('* Accuracy on test set: %0.2f%%' % (100 * te_acc))
+94
Ver Arquivo
@@ -0,0 +1,94 @@
'''Example of how to use sklearn wrapper
Builds simple CNN models on MNIST and uses sklearn's GridSearchCV to find best model
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.grid_search import GridSearchCV
nb_classes = 10
# input image dimensions
img_rows, img_cols = 28, 28
# load training data and do basic data normalization
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
# convert class vectors to binary class matrices
y_train = np_utils.to_categorical(y_train, nb_classes)
y_test = np_utils.to_categorical(y_test, nb_classes)
def make_model(dense_layer_sizes, nb_filters, nb_conv, nb_pool):
'''Creates model comprised of 2 convolutional layers followed by dense layers
dense_layer_sizes: List of layer sizes. This list has one number for each layer
nb_filters: Number of convolutional filters in each convolutional layer
nb_conv: Convolutional kernel size
nb_pool: Size of pooling area for max pooling
'''
model = Sequential()
model.add(Convolution2D(nb_filters, nb_conv, nb_conv,
border_mode='valid',
input_shape=(1, img_rows, img_cols)))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, nb_conv, nb_conv))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.25))
model.add(Flatten())
for layer_size in dense_layer_sizes:
model.add(Dense(layer_size))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adadelta',
metrics=['accuracy'])
return model
dense_size_candidates = [[32], [64], [32, 32], [64, 64]]
my_classifier = KerasClassifier(make_model, batch_size=32)
validator = GridSearchCV(my_classifier,
param_grid={'dense_layer_sizes': dense_size_candidates,
# nb_epoch is avail for tuning even when not
# an argument to model building function
'nb_epoch': [3, 6],
'nb_filters': [8],
'nb_conv': [3],
'nb_pool': [2]},
scoring='log_loss',
n_jobs=1)
validator.fit(X_train, y_train)
print('The parameters of the best model are: ')
print(validator.best_params_)
# validator.best_estimator_ returns sklearn-wrapped version of best model.
# validator.best_estimator_.model returns the (unwrapped) keras model
best_model = validator.best_estimator_.model
metric_names = best_model.metrics_names
metric_values = best_model.evaluate(X_test, y_test)
for metric, value in zip(metric_names, metric_values):
print(metric, ': ', value)
+167
Ver Arquivo
@@ -0,0 +1,167 @@
'''Trains a stacked what-where autoencoder built on residual blocks on the
MNIST dataset. It exemplifies two influential methods that have been developed
in the past few years.
The first is the idea of properly "unpooling." During any max pool, the
exact location (the "where") of the maximal value in a pooled receptive field
is lost, however it can be very useful in the overall reconstruction of an
input image. Therefore, if the "where" is handed from the encoder
to the corresponding decoder layer, features being decoded can be "placed" in
the right location, allowing for reconstructions of much higher fidelity.
References:
[1]
"Visualizing and Understanding Convolutional Networks"
Matthew D Zeiler, Rob Fergus
https://arxiv.org/abs/1311.2901v3
[2]
"Stacked What-Where Auto-encoders"
Junbo Zhao, Michael Mathieu, Ross Goroshin, Yann LeCun
https://arxiv.org/abs/1506.02351v8
The second idea exploited here is that of residual learning. Residual blocks
ease the training process by allowing skip connections that give the network
the ability to be as linear (or non-linear) as the data sees fit. This allows
for much deep networks to be easily trained. The residual element seems to
be advantageous in the context of this example as it allows a nice symmetry
between the encoder and decoder. Normally, in the decoder, the final
projection to the space where the image is reconstructed is linear, however
this does not have to be the case for a residual block as the degree to which
its output is linear or non-linear is determined by the data it is fed.
However, in order to cap the reconstruction in this example, a hard softmax is
applied as a bias because we know the MNIST digits are mapped to [0,1].
References:
[3]
"Deep Residual Learning for Image Recognition"
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
https://arxiv.org/abs/1512.03385v1
[4]
"Identity Mappings in Deep Residual Networks"
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
https://arxiv.org/abs/1603.05027v3
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.datasets import mnist
from keras.models import Model
from keras.layers import Activation, merge
from keras.layers import UpSampling2D, Convolution2D, MaxPooling2D
from keras.layers import Input, BatchNormalization
import matplotlib.pyplot as plt
import keras.backend as K
def convresblock(x, nfeats=8, ksize=3, nskipped=2):
''' The proposed residual block from [4]'''
y0 = Convolution2D(nfeats, ksize, ksize, border_mode='same')(x)
y = y0
for i in range(nskipped):
y = BatchNormalization(mode=0, axis=1)(y)
y = Activation('relu')(y)
y = Convolution2D(nfeats, ksize, ksize, border_mode='same')(y)
return merge([y0, y], mode='sum')
def getwhere(x):
''' Calculate the "where" mask that contains switches indicating which
index contained the max value when MaxPool2D was applied. Using the
gradient of the sum is a nice trick to keep everything high level.'''
y_prepool, y_postpool = x
return K.gradients(K.sum(y_postpool), y_prepool)
# input image dimensions
img_rows, img_cols = 28, 28
# the data, shuffled and split between train and test sets
(X_train, _), (X_test, _) = mnist.load_data()
X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# The size of the kernel used for the MaxPooling2D
pool_size = 2
# The total number of feature maps at each layer
nfeats = [8, 16, 32, 64, 128]
# The sizes of the pooling kernel at each layer
pool_sizes = np.array([1, 1, 1, 1, 1]) * pool_size
# The convolution kernel size
ksize = 3
# Number of epochs to train for
nb_epoch = 5
# Batch size during training
batch_size = 128
if pool_size == 2:
# if using a 5 layer net of pool_size = 2
X_train = np.pad(X_train, [[0, 0], [0, 0], [2, 2], [2, 2]],
mode='constant')
X_test = np.pad(X_test, [[0, 0], [0, 0], [2, 2], [2, 2]], mode='constant')
nlayers = 5
elif pool_size == 3:
# if using a 3 layer net of pool_size = 3
X_train = X_train[:, :, :-1, :-1]
X_test = X_test[:, :, :-1, :-1]
nlayers = 3
else:
import sys
sys.exit("Script supports pool_size of 2 and 3.")
# Shape of input to train on (note that model is fully convolutional however)
input_shape = X_train.shape[1:]
# The final list of the size of axis=1 for all layers, including input
nfeats_all = [input_shape[0]] + nfeats
# First build the encoder, all the while keeping track of the "where" masks
img_input = Input(shape=input_shape)
# We push the "where" masks to the following list
wheres = [None] * nlayers
y = img_input
for i in range(nlayers):
y_prepool = convresblock(y, nfeats=nfeats_all[i + 1], ksize=ksize)
y = MaxPooling2D(pool_size=(pool_sizes[i], pool_sizes[i]))(y_prepool)
wheres[i] = merge([y_prepool, y], mode=getwhere,
output_shape=lambda x: x[0])
# Now build the decoder, and use the stored "where" masks to place the features
for i in range(nlayers):
ind = nlayers - 1 - i
y = UpSampling2D(size=(pool_sizes[ind], pool_sizes[ind]))(y)
y = merge([y, wheres[ind]], mode='mul')
y = convresblock(y, nfeats=nfeats_all[ind], ksize=ksize)
# Use hard_simgoid to clip range of reconstruction
y = Activation('hard_sigmoid')(y)
# Define the model and it's mean square error loss, and compile it with Adam
model = Model(img_input, y)
model.compile('adam', 'mse')
# Fit the model
model.fit(X_train, X_train, validation_data=(X_test, X_test),
batch_size=batch_size, nb_epoch=nb_epoch)
# Plot
X_recon = model.predict(X_test[:25])
X_plot = np.concatenate((X_test[:25], X_recon), axis=1)
X_plot = X_plot.reshape((5, 10, input_shape[-2], input_shape[-1]))
X_plot = np.vstack([np.hstack(x) for x in X_plot])
plt.figure()
plt.axis('off')
plt.title('Test Samples: Originals/Reconstructions')
plt.imshow(X_plot, interpolation='none', cmap='gray')
plt.savefig('reconstructions.png')
+37 -32
Ver Arquivo
@@ -1,4 +1,16 @@
from __future__ import absolute_import
'''Transfer learning toy example:
1- Train a simple convnet on the MNIST dataset the first 5 digits [0..4].
2- Freeze convolutional layers and fine-tune dense layers
for the classification of digits [5..9].
Run on GPU: THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python mnist_transfer_cnn.py
Get to 99.8% test accuracy after 5 epochs
for the first five digits classifier
and 99.2% for the last five digits after transfer + fine-tuning.
'''
from __future__ import print_function
import numpy as np
import datetime
@@ -7,22 +19,10 @@ np.random.seed(1337) # for reproducibility
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils
'''
Transfer learning toy example:
1- Train a simple convnet on the MNIST dataset the first 5 digits [0..4].
2- Freeze convolutional layers and fine-tune dense layers
for the classification of digits [5..9].
Run on GPU: THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python mnist_cnn.py
Get to 99.8% test accuracy after 5 epochs
for the first five digits classifier
and 99.2% for the last five digits after transfer + fine-tuning.
'''
from keras import backend as K
now = datetime.datetime.now
@@ -35,16 +35,21 @@ img_rows, img_cols = 28, 28
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
nb_pool = 2
pool_size = 2
# convolution kernel size
nb_conv = 3
kernel_size = 3
if K.image_dim_ordering() == 'th':
input_shape = (1, img_rows, img_cols)
else:
input_shape = (img_rows, img_cols, 1)
def train_model(model, train, test, nb_classes):
X_train = train[0].reshape(train[0].shape[0], 1, img_rows, img_cols)
X_test = test[0].reshape(test[0].shape[0], 1, img_rows, img_cols)
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
X_train = train[0].reshape((train[0].shape[0],) + input_shape)
X_test = test[0].reshape((test[0].shape[0],) + input_shape)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
@@ -55,15 +60,17 @@ def train_model(model, train, test, nb_classes):
Y_train = np_utils.to_categorical(train[1], nb_classes)
Y_test = np_utils.to_categorical(test[1], nb_classes)
model.compile(loss='categorical_crossentropy', optimizer='adadelta')
model.compile(loss='categorical_crossentropy',
optimizer='adadelta',
metrics=['accuracy'])
t = now()
model.fit(X_train, Y_train,
batch_size=batch_size, nb_epoch=nb_epoch,
show_accuracy=True, verbose=1,
verbose=1,
validation_data=(X_test, Y_test))
print('Training time: %s' % (now() - t))
score = model.evaluate(X_test, Y_test, show_accuracy=True, verbose=0)
score = model.evaluate(X_test, Y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
@@ -84,13 +91,13 @@ y_test_gte5 = y_test[y_test >= 5] - 5
# define two groups of layers: feature (convolutions) and classification (dense)
feature_layers = [
Convolution2D(nb_filters, nb_conv, nb_conv,
Convolution2D(nb_filters, kernel_size, kernel_size,
border_mode='valid',
input_shape=(1, img_rows, img_cols)),
input_shape=input_shape),
Activation('relu'),
Convolution2D(nb_filters, nb_conv, nb_conv),
Convolution2D(nb_filters, kernel_size, kernel_size),
Activation('relu'),
MaxPooling2D(pool_size=(nb_pool, nb_pool)),
MaxPooling2D(pool_size=(pool_size, pool_size)),
Dropout(0.25),
Flatten(),
]
@@ -103,9 +110,7 @@ classification_layers = [
]
# create complete model
model = Sequential()
for l in feature_layers + classification_layers:
model.add(l)
model = Sequential(feature_layers + classification_layers)
# train model for 5-digit classification [0..4]
train_model(model,
+364
Ver Arquivo
@@ -0,0 +1,364 @@
'''Neural doodle with Keras
Script Usage:
# Arguments:
```
--nlabels: # of regions (colors) in mask images
--style-image: image to learn style from
--style-mask: semantic labels for style image
--target-mask: semantic labels for target image (your doodle)
--content-image: optional image to learn content from
--target-image-prefix: path prefix for generated target images
```
# Example 1: doodle using a style image, style mask
and target mask.
```
python neural_doodle.py --nlabels 4 --style-image Monet/style.png \
--style-mask Monet/style_mask.png --target-mask Monet/target_mask.png \
--target-image-prefix generated/monet
```
# Example 2: doodle using a style image, style mask,
target mask and an optional content image.
```
python neural_doodle.py --nlabels 4 --style-image Renoir/style.png \
--style-mask Renoir/style_mask.png --target-mask Renoir/target_mask.png \
--content-image Renoir/creek.jpg \
--target-image-prefix generated/renoir
```
References:
[Dmitry Ulyanov's blog on fast-neural-doodle](http://dmitryulyanov.github.io/feed-forward-neural-doodle/)
[Torch code for fast-neural-doodle](https://github.com/DmitryUlyanov/fast-neural-doodle)
[Torch code for online-neural-doodle](https://github.com/DmitryUlyanov/online-neural-doodle)
[Paper Texture Networks: Feed-forward Synthesis of Textures and Stylized Images](http://arxiv.org/abs/1603.03417)
[Discussion on parameter tuning](https://github.com/fchollet/keras/issues/3705)
Resources:
Example images can be downloaded from
https://github.com/DmitryUlyanov/fast-neural-doodle/tree/master/data
'''
from __future__ import print_function
import time
import argparse
import numpy as np
from scipy.optimize import fmin_l_bfgs_b
from scipy.misc import imread, imsave
from keras import backend as K
from keras.layers import Input, Convolution2D, MaxPooling2D, AveragePooling2D
from keras.models import Model
from keras.preprocessing.image import load_img, img_to_array
from keras.applications import vgg19
# Command line arguments
parser = argparse.ArgumentParser(description='Keras neural doodle example')
parser.add_argument('--nlabels', type=int,
help='number of semantic labels'
' (regions in differnet colors)'
' in style_mask/target_mask')
parser.add_argument('--style-image', type=str,
help='path to image to learn style from')
parser.add_argument('--style-mask', type=str,
help='path to semantic mask of style image')
parser.add_argument('--target-mask', type=str,
help='path to semantic mask of target image')
parser.add_argument('--content-image', type=str, default=None,
help='path to optional content image')
parser.add_argument('--target-image-prefix', type=str,
help='path prefix for generated results')
args = parser.parse_args()
style_img_path = args.style_image
style_mask_path = args.style_mask
target_mask_path = args.target_mask
content_img_path = args.content_image
target_img_prefix = args.target_image_prefix
use_content_img = content_img_path is not None
nb_labels = args.nlabels
nb_colors = 3 # RGB
# determine image sizes based on target_mask
ref_img = imread(target_mask_path)
img_nrows, img_ncols = ref_img.shape[:2]
total_variation_weight = 50.
style_weight = 1.
content_weight = 0.1 if use_content_img else 0
content_feature_layers = ['block5_conv2']
# To get better generation qualities, use more conv layers for style features
style_feature_layers = ['block1_conv1', 'block2_conv1', 'block3_conv1',
'block4_conv1', 'block5_conv1']
# helper functions for reading/processing images
def preprocess_image(image_path):
img = load_img(image_path, target_size=(img_nrows, img_ncols))
img = img_to_array(img)
img = np.expand_dims(img, axis=0)
img = vgg19.preprocess_input(img)
return img
def deprocess_image(x):
if K.image_dim_ordering() == 'th':
x = x.reshape((3, img_nrows, img_ncols))
x = x.transpose((1, 2, 0))
else:
x = x.reshape((img_nrows, img_ncols, 3))
x = x[:, :, ::-1]
x[:, :, 0] += 103.939
x[:, :, 1] += 116.779
x[:, :, 2] += 123.68
x = np.clip(x, 0, 255).astype('uint8')
return x
def kmeans(xs, k):
assert xs.ndim == 2
try:
from sklearn.cluster import k_means
_, labels, _ = k_means(xs.astype("float64"), k)
except ImportError:
from scipy.cluster.vq import kmeans2
_, labels = kmeans2(xs, k, missing='raise')
return labels
def load_mask_labels():
'''Load both target and style masks.
A mask image (nr x nc) with m labels/colors will be loaded
as a 4D boolean tensor: (1, m, nr, nc) for 'th' or (1, nr, nc, m) for 'tf'
'''
target_mask_img = load_img(target_mask_path,
target_size=(img_nrows, img_ncols))
target_mask_img = img_to_array(target_mask_img)
style_mask_img = load_img(style_mask_path,
target_size=(img_nrows, img_ncols))
style_mask_img = img_to_array(style_mask_img)
if K.image_dim_ordering() == 'th':
mask_vecs = np.vstack([style_mask_img.reshape((3, -1)).T,
target_mask_img.reshape((3, -1)).T])
else:
mask_vecs = np.vstack([style_mask_img.reshape((-1, 3)),
target_mask_img.reshape((-1, 3))])
labels = kmeans(mask_vecs, nb_labels)
style_mask_label = labels[:img_nrows *
img_ncols].reshape((img_nrows, img_ncols))
target_mask_label = labels[img_nrows *
img_ncols:].reshape((img_nrows, img_ncols))
stack_axis = 0 if K.image_dim_ordering() == 'th' else -1
style_mask = np.stack([style_mask_label == r for r in xrange(nb_labels)],
axis=stack_axis)
target_mask = np.stack([target_mask_label == r for r in xrange(nb_labels)],
axis=stack_axis)
return (np.expand_dims(style_mask, axis=0),
np.expand_dims(target_mask, axis=0))
# Create tensor variables for images
if K.image_dim_ordering() == 'th':
shape = (1, nb_colors, img_nrows, img_ncols)
else:
shape = (1, img_nrows, img_ncols, nb_colors)
style_image = K.variable(preprocess_image(style_img_path))
target_image = K.placeholder(shape=shape)
if use_content_img:
content_image = K.variable(preprocess_image(content_img_path))
else:
content_image = K.zeros(shape=shape)
images = K.concatenate([style_image, target_image, content_image], axis=0)
# Create tensor variables for masks
raw_style_mask, raw_target_mask = load_mask_labels()
style_mask = K.variable(raw_style_mask.astype("float32"))
target_mask = K.variable(raw_target_mask.astype("float32"))
masks = K.concatenate([style_mask, target_mask], axis=0)
# index constants for images and tasks variables
STYLE, TARGET, CONTENT = 0, 1, 2
# Build image model, mask model and use layer outputs as features
# image model as VGG19
image_model = vgg19.VGG19(include_top=False, input_tensor=images)
# mask model as a series of pooling
mask_input = Input(tensor=masks, shape=(None, None, None), name="mask_input")
x = mask_input
for layer in image_model.layers[1:]:
name = 'mask_%s' % layer.name
if 'conv' in layer.name:
x = AveragePooling2D((3, 3), strides=(
1, 1), name=name, border_mode="same")(x)
elif 'pool' in layer.name:
x = AveragePooling2D((2, 2), name=name)(x)
mask_model = Model(mask_input, x)
# Collect features from image_model and task_model
image_features = {}
mask_features = {}
for img_layer, mask_layer in zip(image_model.layers, mask_model.layers):
if 'conv' in img_layer.name:
assert 'mask_' + img_layer.name == mask_layer.name
layer_name = img_layer.name
img_feat, mask_feat = img_layer.output, mask_layer.output
image_features[layer_name] = img_feat
mask_features[layer_name] = mask_feat
# Define loss functions
def gram_matrix(x):
assert K.ndim(x) == 3
features = K.batch_flatten(x)
gram = K.dot(features, K.transpose(features))
return gram
def region_style_loss(style_image, target_image, style_mask, target_mask):
'''Calculate style loss between style_image and target_image,
for one common region specified by their (boolean) masks
'''
assert 3 == K.ndim(style_image) == K.ndim(target_image)
assert 2 == K.ndim(style_mask) == K.ndim(target_mask)
if K.image_dim_ordering() == 'th':
masked_style = style_image * style_mask
masked_target = target_image * target_mask
nb_channels = K.shape(style_image)[0]
else:
masked_style = K.permute_dimensions(
style_image, (2, 0, 1)) * style_mask
masked_target = K.permute_dimensions(
target_image, (2, 0, 1)) * target_mask
nb_channels = K.shape(style_image)[-1]
s = gram_matrix(masked_style) / K.mean(style_mask) / nb_channels
c = gram_matrix(masked_target) / K.mean(target_mask) / nb_channels
return K.mean(K.square(s - c))
def style_loss(style_image, target_image, style_masks, target_masks):
'''Calculate style loss between style_image and target_image,
in all regions.
'''
assert 3 == K.ndim(style_image) == K.ndim(target_image)
assert 3 == K.ndim(style_masks) == K.ndim(target_masks)
loss = K.variable(0)
for i in xrange(nb_labels):
if K.image_dim_ordering() == 'th':
style_mask = style_masks[i, :, :]
target_mask = target_masks[i, :, :]
else:
style_mask = style_masks[:, :, i]
target_mask = target_masks[:, :, i]
loss += region_style_loss(style_image,
target_image, style_mask, target_mask)
return loss
def content_loss(content_image, target_image):
return K.sum(K.square(target_image - content_image))
def total_variation_loss(x):
assert 4 == K.ndim(x)
if K.image_dim_ordering() == 'th':
a = K.square(x[:, :, :img_nrows - 1, :img_ncols - 1] -
x[:, :, 1:, :img_ncols - 1])
b = K.square(x[:, :, :img_nrows - 1, :img_ncols - 1] -
x[:, :, :img_nrows - 1, 1:])
else:
a = K.square(x[:, :img_nrows - 1, :img_ncols - 1, :] -
x[:, 1:, :img_ncols - 1, :])
b = K.square(x[:, :img_nrows - 1, :img_ncols - 1, :] -
x[:, :img_nrows - 1, 1:, :])
return K.sum(K.pow(a + b, 1.25))
# Overall loss is the weighted sum of content_loss, style_loss and tv_loss
# Each individual loss uses features from image/mask models.
loss = K.variable(0)
for layer in content_feature_layers:
content_feat = image_features[layer][CONTENT, :, :, :]
target_feat = image_features[layer][TARGET, :, :, :]
loss += content_weight * content_loss(content_feat, target_feat)
for layer in style_feature_layers:
style_feat = image_features[layer][STYLE, :, :, :]
target_feat = image_features[layer][TARGET, :, :, :]
style_masks = mask_features[layer][STYLE, :, :, :]
target_masks = mask_features[layer][TARGET, :, :, :]
sl = style_loss(style_feat, target_feat, style_masks, target_masks)
loss += (style_weight / len(style_feature_layers)) * sl
loss += total_variation_weight * total_variation_loss(target_image)
loss_grads = K.gradients(loss, target_image)
# Evaluator class for computing efficiency
outputs = [loss]
if type(loss_grads) in {list, tuple}:
outputs += loss_grads
else:
outputs.append(loss_grads)
f_outputs = K.function([target_image], outputs)
def eval_loss_and_grads(x):
if K.image_dim_ordering() == 'th':
x = x.reshape((1, 3, img_nrows, img_ncols))
else:
x = x.reshape((1, img_nrows, img_ncols, 3))
outs = f_outputs([x])
loss_value = outs[0]
if len(outs[1:]) == 1:
grad_values = outs[1].flatten().astype('float64')
else:
grad_values = np.array(outs[1:]).flatten().astype('float64')
return loss_value, grad_values
class Evaluator(object):
def __init__(self):
self.loss_value = None
self.grads_values = None
def loss(self, x):
assert self.loss_value is None
loss_value, grad_values = eval_loss_and_grads(x)
self.loss_value = loss_value
self.grad_values = grad_values
return self.loss_value
def grads(self, x):
assert self.loss_value is not None
grad_values = np.copy(self.grad_values)
self.loss_value = None
self.grad_values = None
return grad_values
evaluator = Evaluator()
# Generate images by iterative optimization
if K.image_dim_ordering() == 'th':
x = np.random.uniform(0, 255, (1, 3, img_nrows, img_ncols)) - 128.
else:
x = np.random.uniform(0, 255, (1, img_nrows, img_ncols, 3)) - 128.
for i in range(50):
print('Start of iteration', i)
start_time = time.time()
x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(),
fprime=evaluator.grads, maxfun=20)
print('Current loss value:', min_val)
# save current generated image
img = deprocess_image(x.copy())
fname = target_img_prefix + '_at_iteration_%d.png' % i
imsave(fname, img)
end_time = time.time()
print('Image saved as', fname)
print('Iteration %d completed in %ds' % (i, end_time - start_time))
+259
Ver Arquivo
@@ -0,0 +1,259 @@
'''Neural style transfer with Keras.
Run the script with:
```
python neural_style_transfer.py path_to_your_base_image.jpg path_to_your_reference.jpg prefix_for_results
```
e.g.:
```
python neural_style_transfer.py img/tuebingen.jpg img/starry_night.jpg results/my_result
```
It is preferable to run this script on GPU, for speed.
Example result: https://twitter.com/fchollet/status/686631033085677568
# Details
Style transfer consists in generating an image
with the same "content" as a base image, but with the
"style" of a different picture (typically artistic).
This is achieved through the optimization of a loss function
that has 3 components: "style loss", "content loss",
and "total variation loss":
- The total variation loss imposes local spatial continuity between
the pixels of the combination image, giving it visual coherence.
- The style loss is where the deep learning keeps in --that one is defined
using a deep convolutional neural network. Precisely, it consists in a sum of
L2 distances between the Gram matrices of the representations of
the base image and the style reference image, extracted from
different layers of a convnet (trained on ImageNet). The general idea
is to capture color/texture information at different spatial
scales (fairly large scales --defined by the depth of the layer considered).
- The content loss is a L2 distance between the features of the base
image (extracted from a deep layer) and the features of the combination image,
keeping the generated image close enough to the original one.
# References
- [A Neural Algorithm of Artistic Style](http://arxiv.org/abs/1508.06576)
'''
from __future__ import print_function
from keras.preprocessing.image import load_img, img_to_array
from scipy.misc import imsave
import numpy as np
from scipy.optimize import fmin_l_bfgs_b
import time
import argparse
from keras.applications import vgg16
from keras import backend as K
parser = argparse.ArgumentParser(description='Neural style transfer with Keras.')
parser.add_argument('base_image_path', metavar='base', type=str,
help='Path to the image to transform.')
parser.add_argument('style_reference_image_path', metavar='ref', type=str,
help='Path to the style reference image.')
parser.add_argument('result_prefix', metavar='res_prefix', type=str,
help='Prefix for the saved results.')
args = parser.parse_args()
base_image_path = args.base_image_path
style_reference_image_path = args.style_reference_image_path
result_prefix = args.result_prefix
# these are the weights of the different loss components
total_variation_weight = 1.
style_weight = 1.
content_weight = 0.025
# dimensions of the generated picture.
img_nrows = 400
img_ncols = 400
assert img_ncols == img_nrows, 'Due to the use of the Gram matrix, width and height must match.'
# util function to open, resize and format pictures into appropriate tensors
def preprocess_image(image_path):
img = load_img(image_path, target_size=(img_nrows, img_ncols))
img = img_to_array(img)
img = np.expand_dims(img, axis=0)
img = vgg16.preprocess_input(img)
return img
# util function to convert a tensor into a valid image
def deprocess_image(x):
if K.image_dim_ordering() == 'th':
x = x.reshape((3, img_nrows, img_ncols))
x = x.transpose((1, 2, 0))
else:
x = x.reshape((img_nrows, img_ncols, 3))
x = x[:, :, ::-1]
x[:, :, 0] += 103.939
x[:, :, 1] += 116.779
x[:, :, 2] += 123.68
x = np.clip(x, 0, 255).astype('uint8')
return x
# get tensor representations of our images
base_image = K.variable(preprocess_image(base_image_path))
style_reference_image = K.variable(preprocess_image(style_reference_image_path))
# this will contain our generated image
if K.image_dim_ordering() == 'th':
combination_image = K.placeholder((1, 3, img_nrows, img_ncols))
else:
combination_image = K.placeholder((1, img_nrows, img_ncols, 3))
# combine the 3 images into a single Keras tensor
input_tensor = K.concatenate([base_image,
style_reference_image,
combination_image], axis=0)
# build the VGG16 network with our 3 images as input
# the model will be loaded with pre-trained ImageNet weights
model = vgg16.VGG16(input_tensor=input_tensor,
weights='imagenet', include_top=False)
print('Model loaded.')
# get the symbolic outputs of each "key" layer (we gave them unique names).
outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])
# compute the neural style loss
# first we need to define 4 util functions
# the gram matrix of an image tensor (feature-wise outer product)
def gram_matrix(x):
assert K.ndim(x) == 3
if K.image_dim_ordering() == 'th':
features = K.batch_flatten(x)
else:
features = K.batch_flatten(K.permute_dimensions(x, (2, 0, 1)))
gram = K.dot(features, K.transpose(features))
return gram
# the "style loss" is designed to maintain
# the style of the reference image in the generated image.
# It is based on the gram matrices (which capture style) of
# feature maps from the style reference image
# and from the generated image
def style_loss(style, combination):
assert K.ndim(style) == 3
assert K.ndim(combination) == 3
S = gram_matrix(style)
C = gram_matrix(combination)
channels = 3
size = img_nrows * img_ncols
return K.sum(K.square(S - C)) / (4. * (channels ** 2) * (size ** 2))
# an auxiliary loss function
# designed to maintain the "content" of the
# base image in the generated image
def content_loss(base, combination):
return K.sum(K.square(combination - base))
# the 3rd loss function, total variation loss,
# designed to keep the generated image locally coherent
def total_variation_loss(x):
assert K.ndim(x) == 4
if K.image_dim_ordering() == 'th':
a = K.square(x[:, :, :img_nrows-1, :img_ncols-1] - x[:, :, 1:, :img_ncols-1])
b = K.square(x[:, :, :img_nrows-1, :img_ncols-1] - x[:, :, :img_nrows-1, 1:])
else:
a = K.square(x[:, :img_nrows-1, :img_ncols-1, :] - x[:, 1:, :img_ncols-1, :])
b = K.square(x[:, :img_nrows-1, :img_ncols-1, :] - x[:, :img_nrows-1, 1:, :])
return K.sum(K.pow(a + b, 1.25))
# combine these loss functions into a single scalar
loss = K.variable(0.)
layer_features = outputs_dict['block4_conv2']
base_image_features = layer_features[0, :, :, :]
combination_features = layer_features[2, :, :, :]
loss += content_weight * content_loss(base_image_features,
combination_features)
feature_layers = ['block1_conv1', 'block2_conv1',
'block3_conv1', 'block4_conv1',
'block5_conv1']
for layer_name in feature_layers:
layer_features = outputs_dict[layer_name]
style_reference_features = layer_features[1, :, :, :]
combination_features = layer_features[2, :, :, :]
sl = style_loss(style_reference_features, combination_features)
loss += (style_weight / len(feature_layers)) * sl
loss += total_variation_weight * total_variation_loss(combination_image)
# get the gradients of the generated image wrt the loss
grads = K.gradients(loss, combination_image)
outputs = [loss]
if type(grads) in {list, tuple}:
outputs += grads
else:
outputs.append(grads)
f_outputs = K.function([combination_image], outputs)
def eval_loss_and_grads(x):
if K.image_dim_ordering() == 'th':
x = x.reshape((1, 3, img_nrows, img_ncols))
else:
x = x.reshape((1, img_nrows, img_ncols, 3))
outs = f_outputs([x])
loss_value = outs[0]
if len(outs[1:]) == 1:
grad_values = outs[1].flatten().astype('float64')
else:
grad_values = np.array(outs[1:]).flatten().astype('float64')
return loss_value, grad_values
# this Evaluator class makes it possible
# to compute loss and gradients in one pass
# while retrieving them via two separate functions,
# "loss" and "grads". This is done because scipy.optimize
# requires separate functions for loss and gradients,
# but computing them separately would be inefficient.
class Evaluator(object):
def __init__(self):
self.loss_value = None
self.grads_values = None
def loss(self, x):
assert self.loss_value is None
loss_value, grad_values = eval_loss_and_grads(x)
self.loss_value = loss_value
self.grad_values = grad_values
return self.loss_value
def grads(self, x):
assert self.loss_value is not None
grad_values = np.copy(self.grad_values)
self.loss_value = None
self.grad_values = None
return grad_values
evaluator = Evaluator()
# run scipy-based optimization (L-BFGS) over the pixels of the generated image
# so as to minimize the neural style loss
if K.image_dim_ordering() == 'th':
x = np.random.uniform(0, 255, (1, 3, img_nrows, img_ncols)) - 128.
else:
x = np.random.uniform(0, 255, (1, img_nrows, img_ncols, 3)) - 128.
for i in range(10):
print('Start of iteration', i)
start_time = time.time()
x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(),
fprime=evaluator.grads, maxfun=20)
print('Current loss value:', min_val)
# save current generated image
img = deprocess_image(x.copy())
fname = result_prefix + '_at_iteration_%d.png' % i
imsave(fname, img)
end_time = time.time()
print('Image saved as', fname)
print('Iteration %d completed in %ds' % (i, end_time - start_time))
-168
Ver Arquivo
@@ -1,168 +0,0 @@
from __future__ import absolute_import
from __future__ import print_function
import numpy as np
np.random.seed(123)
import matplotlib.pyplot as plt
from theano import function
from keras.models import Sequential
from keras.layers.core import TimeDistributedDense, Activation
from keras.layers.recurrent import LSTM
from keras.optimizers import Adam
from keras.utils import generic_utils
from keras.layers.ntm import NeuralTuringMachine as NTM
"""
Copy Problem defined in Graves et. al [0]
Training data is made of sequences with length 1 to 20.
Test data are sequences of length 100.
The model is tested every 500 weight updates.
After about 3500 updates, the accuracy jumps from around 50% to >90%.
Estimated compile time: 12 min
Estimated time to train Neural Turing Machine and 3 layer LSTM on an NVidia GTX 680: 2h
[0]: http://arxiv.org/pdf/1410.5401v2.pdf
"""
batch_size = 100
h_dim = 128
n_slots = 128
m_length = 20
input_dim = 8
lr = 1e-3
clipvalue = 10
##### Neural Turing Machine ######
ntm = NTM(h_dim, n_slots=n_slots, m_length=m_length, shift_range=3,
inner_rnn='lstm', return_sequences=True, input_dim=input_dim)
model = Sequential()
model.add(ntm)
model.add(TimeDistributedDense(input_dim))
model.add(Activation('sigmoid'))
sgd = Adam(lr=lr, clipvalue=clipvalue)
model.compile(loss='binary_crossentropy', optimizer=sgd)
# LSTM - Run this for comparison
sgd2 = Adam(lr=lr, clipvalue=clipvalue)
lstm = Sequential()
lstm.add(LSTM(input_dim=input_dim, output_dim=h_dim*2, return_sequences=True))
lstm.add(LSTM(output_dim=h_dim*2, return_sequences=True))
lstm.add(LSTM(output_dim=h_dim*2, return_sequences=True))
lstm.add(TimeDistributedDense(input_dim))
lstm.add(Activation('sigmoid'))
lstm.compile(loss='binary_crossentropy', optimizer=sgd)
###### DATASET ########
def get_sample(batch_size=128, n_bits=8, max_size=20, min_size=1):
# generate samples with random length
inp = np.zeros((batch_size, 2*max_size-1, n_bits))
out = np.zeros((batch_size, 2*max_size-1, n_bits))
sw = np.zeros((batch_size, 2*max_size-1, 1))
for i in range(batch_size):
t = np.random.randint(low=min_size, high=max_size)
x = np.random.uniform(size=(t, n_bits)) > .5
for j,f in enumerate(x.sum(axis=-1)): # remove fake flags
if f>=n_bits:
x[j, :] = 0.
del_flag = np.ones((1, n_bits))
inp[i, :t+1] = np.concatenate([x, del_flag], axis=0)
out[i, t+1:(2*t+1)] = x
sw[i, t+1:(2*t+1)] = 1
return inp, out, sw
def show_pattern(inp, out, sw, file_name='ntm_output.png'):
''' Helper function to visualize results '''
plt.figure(figsize=(10, 10))
plt.subplot(131)
plt.imshow(inp>.5)
plt.subplot(132)
plt.imshow(out>.5)
plt.subplot(133)
plt.imshow(sw>.5)
plt.savefig(file_name)
plt.close()
# Show data example:
inp, out, sw = get_sample(1, 8, 20)
plt.subplot(131)
plt.title('input')
plt.imshow(inp[0], cmap='gray')
plt.subplot(132)
plt.title('desired')
plt.imshow(out[0], cmap='gray')
plt.subplot(133)
plt.title('sample_weight')
plt.imshow(sw[0], cmap='gray')
# training uses sequences of length 1 to 20. Test uses series of length 100.
def test_model(model, file_name, min_size=100):
I, V, sw = get_sample(batch_size=500, n_bits=input_dim, max_size=min_size+1, min_size=min_size)
Y = np.asarray(model.predict(I, batch_size=100) > .5).astype('float64')
acc = (V[:, -min_size:, :] == Y[:, -min_size:, :]).mean() * 100
show_pattern(Y[0], V[0], sw[0], file_name)
return acc
##### TRAIN ######
nb_epoch = 4000
progbar = generic_utils.Progbar(nb_epoch)
for e in range(nb_epoch):
I, V, sw = get_sample(n_bits=input_dim, max_size=20, min_size=1, batch_size=100)
loss1 = model.train_on_batch(I, V, sample_weight=sw)
loss2 = lstm.train_on_batch(I, V, sample_weight=sw)
progbar.add(1, values=[("NTM", loss1), ("LSTM", loss2)])
if e % 500 == 0:
print("")
acc1 = test_model(model, 'ntm.png')
acc2 = test_model(lstm, 'lstm.png')
print("NTM test acc: {}".format(acc1))
print("LSTM test acc: {}".format(acc2))
##### VISUALIZATION #####
X = model.get_input()
Y = ntm.get_full_output()[0:3] # (memory over time, read_vectors, write_vectors)
F = function([X], Y, allow_input_downcast=True)
inp, out, sw = get_sample(1, 8, 21, 20)
mem, read, write = F(inp.astype('float32'))
Y = model.predict(inp)
plt.figure(figsize=(15, 12))
plt.subplot(221)
plt.imshow(write[0])
plt.xlabel('memory location')
plt.ylabel('time')
plt.title('write')
plt.subplot(222)
plt.imshow(read[0])
plt.title('read')
plt.subplot(223)
plt.title('desired')
plt.imshow(out[0])
plt.subplot(224)
plt.imshow(Y[0]>.5)
plt.title('output')
plt.figure(figsize=(15, 10))
plt.subplot(325)
plt.ylabel('time')
plt.xlabel('location')
plt.title('memory evolving in time (avg value per location)')
plt.imshow(mem[0].mean(axis=-1))
+144
Ver Arquivo
@@ -0,0 +1,144 @@
'''This script loads pre-trained word embeddings (GloVe embeddings)
into a frozen Keras Embedding layer, and uses it to
train a text classification model on the 20 Newsgroup dataset
(classication of newsgroup messages into 20 different categories).
GloVe embedding data can be found at:
http://nlp.stanford.edu/data/glove.6B.zip
(source page: http://nlp.stanford.edu/projects/glove/)
20 Newsgroup data can be found at:
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.html
'''
from __future__ import print_function
import os
import numpy as np
np.random.seed(1337)
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.utils.np_utils import to_categorical
from keras.layers import Dense, Input, Flatten
from keras.layers import Conv1D, MaxPooling1D, Embedding
from keras.models import Model
import sys
BASE_DIR = ''
GLOVE_DIR = BASE_DIR + '/glove.6B/'
TEXT_DATA_DIR = BASE_DIR + '/20_newsgroup/'
MAX_SEQUENCE_LENGTH = 1000
MAX_NB_WORDS = 20000
EMBEDDING_DIM = 100
VALIDATION_SPLIT = 0.2
# first, build index mapping words in the embeddings set
# to their embedding vector
print('Indexing word vectors.')
embeddings_index = {}
f = open(os.path.join(GLOVE_DIR, 'glove.6B.100d.txt'))
for line in f:
values = line.split()
word = values[0]
coefs = np.asarray(values[1:], dtype='float32')
embeddings_index[word] = coefs
f.close()
print('Found %s word vectors.' % len(embeddings_index))
# second, prepare text samples and their labels
print('Processing text dataset')
texts = [] # list of text samples
labels_index = {} # dictionary mapping label name to numeric id
labels = [] # list of label ids
for name in sorted(os.listdir(TEXT_DATA_DIR)):
path = os.path.join(TEXT_DATA_DIR, name)
if os.path.isdir(path):
label_id = len(labels_index)
labels_index[name] = label_id
for fname in sorted(os.listdir(path)):
if fname.isdigit():
fpath = os.path.join(path, fname)
if sys.version_info < (3,):
f = open(fpath)
else:
f = open(fpath, encoding='latin-1')
texts.append(f.read())
f.close()
labels.append(label_id)
print('Found %s texts.' % len(texts))
# finally, vectorize the text samples into a 2D integer tensor
tokenizer = Tokenizer(nb_words=MAX_NB_WORDS)
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))
data = pad_sequences(sequences, maxlen=MAX_SEQUENCE_LENGTH)
labels = to_categorical(np.asarray(labels))
print('Shape of data tensor:', data.shape)
print('Shape of label tensor:', labels.shape)
# split the data into a training set and a validation set
indices = np.arange(data.shape[0])
np.random.shuffle(indices)
data = data[indices]
labels = labels[indices]
nb_validation_samples = int(VALIDATION_SPLIT * data.shape[0])
x_train = data[:-nb_validation_samples]
y_train = labels[:-nb_validation_samples]
x_val = data[-nb_validation_samples:]
y_val = labels[-nb_validation_samples:]
print('Preparing embedding matrix.')
# prepare embedding matrix
nb_words = min(MAX_NB_WORDS, len(word_index))
embedding_matrix = np.zeros((nb_words + 1, EMBEDDING_DIM))
for word, i in word_index.items():
if i > MAX_NB_WORDS:
continue
embedding_vector = embeddings_index.get(word)
if embedding_vector is not None:
# words not found in embedding index will be all-zeros.
embedding_matrix[i] = embedding_vector
# load pre-trained word embeddings into an Embedding layer
# note that we set trainable = False so as to keep the embeddings fixed
embedding_layer = Embedding(nb_words + 1,
EMBEDDING_DIM,
weights=[embedding_matrix],
input_length=MAX_SEQUENCE_LENGTH,
trainable=False)
print('Training model.')
# train a 1D convnet with global maxpooling
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
x = Conv1D(128, 5, activation='relu')(embedded_sequences)
x = MaxPooling1D(5)(x)
x = Conv1D(128, 5, activation='relu')(x)
x = MaxPooling1D(5)(x)
x = Conv1D(128, 5, activation='relu')(x)
x = MaxPooling1D(35)(x)
x = Flatten()(x)
x = Dense(128, activation='relu')(x)
preds = Dense(len(labels_index), activation='softmax')(x)
model = Model(sequence_input, preds)
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['acc'])
# happy learning!
model.fit(x_train, y_train, validation_data=(x_val, y_val),
nb_epoch=2, batch_size=128)
+19 -20
Ver Arquivo
@@ -1,28 +1,22 @@
from __future__ import absolute_import
'''Trains and evaluate a simple MLP
on the Reuters newswire topic classification task.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.datasets import reuters
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.normalization import BatchNormalization
from keras.layers import Dense, Dropout, Activation
from keras.utils import np_utils
from keras.preprocessing.text import Tokenizer
'''
Train and evaluate a simple MLP on the Reuters newswire topic classification task.
GPU run command:
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python examples/reuters_mlp.py
CPU run command:
python examples/reuters_mlp.py
'''
max_words = 1000
batch_size = 32
nb_epoch = 5
print("Loading data...")
print('Loading data...')
(X_train, y_train), (X_test, y_test) = reuters.load_data(nb_words=max_words, test_split=0.2)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
@@ -30,20 +24,20 @@ print(len(X_test), 'test sequences')
nb_classes = np.max(y_train)+1
print(nb_classes, 'classes')
print("Vectorizing sequence data...")
print('Vectorizing sequence data...')
tokenizer = Tokenizer(nb_words=max_words)
X_train = tokenizer.sequences_to_matrix(X_train, mode="binary")
X_test = tokenizer.sequences_to_matrix(X_test, mode="binary")
X_train = tokenizer.sequences_to_matrix(X_train, mode='binary')
X_test = tokenizer.sequences_to_matrix(X_test, mode='binary')
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)
print("Convert class vector to binary class matrix (for use with categorical_crossentropy)")
print('Convert class vector to binary class matrix (for use with categorical_crossentropy)')
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
print('Y_train shape:', Y_train.shape)
print('Y_test shape:', Y_test.shape)
print("Building model...")
print('Building model...')
model = Sequential()
model.add(Dense(512, input_shape=(max_words,)))
model.add(Activation('relu'))
@@ -51,9 +45,14 @@ model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
history = model.fit(X_train, Y_train, nb_epoch=nb_epoch, batch_size=batch_size, verbose=1, show_accuracy=True, validation_split=0.1)
score = model.evaluate(X_test, Y_test, batch_size=batch_size, verbose=1, show_accuracy=True)
history = model.fit(X_train, Y_train,
nb_epoch=nb_epoch, batch_size=batch_size,
verbose=1, validation_split=0.1)
score = model.evaluate(X_test, Y_test,
batch_size=batch_size, verbose=1)
print('Test score:', score[0])
print('Test accuracy:', score[1])
+84
Ver Arquivo
@@ -0,0 +1,84 @@
'''Example script showing how to use stateful RNNs
to model long sequences efficiently.
'''
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, LSTM
# since we are using stateful rnn tsteps can be set to 1
tsteps = 1
batch_size = 25
epochs = 25
# number of elements ahead that are used to make the prediction
lahead = 1
def gen_cosine_amp(amp=100, period=1000, x0=0, xn=50000, step=1, k=0.0001):
"""Generates an absolute cosine time series with the amplitude
exponentially decreasing
Arguments:
amp: amplitude of the cosine function
period: period of the cosine function
x0: initial x of the time series
xn: final x of the time series
step: step of the time series discretization
k: exponential rate
"""
cos = np.zeros(((xn - x0) * step, 1, 1))
for i in range(len(cos)):
idx = x0 + i * step
cos[i, 0, 0] = amp * np.cos(2 * np.pi * idx / period)
cos[i, 0, 0] = cos[i, 0, 0] * np.exp(-k * idx)
return cos
print('Generating Data')
cos = gen_cosine_amp()
print('Input shape:', cos.shape)
expected_output = np.zeros((len(cos), 1))
for i in range(len(cos) - lahead):
expected_output[i, 0] = np.mean(cos[i + 1:i + lahead + 1])
print('Output shape')
print(expected_output.shape)
print('Creating Model')
model = Sequential()
model.add(LSTM(50,
batch_input_shape=(batch_size, tsteps, 1),
return_sequences=True,
stateful=True))
model.add(LSTM(50,
batch_input_shape=(batch_size, tsteps, 1),
return_sequences=False,
stateful=True))
model.add(Dense(1))
model.compile(loss='mse', optimizer='rmsprop')
print('Training')
for i in range(epochs):
print('Epoch', i, '/', epochs)
model.fit(cos,
expected_output,
batch_size=batch_size,
verbose=1,
nb_epoch=1,
shuffle=False)
model.reset_states()
print('Predicting')
predicted_output = model.predict(cos, batch_size=batch_size)
print('Plotting Results')
plt.subplot(2, 1, 1)
plt.plot(expected_output)
plt.title('Expected')
plt.subplot(2, 1, 2)
plt.plot(predicted_output)
plt.title('Predicted')
plt.show()
+97
Ver Arquivo
@@ -0,0 +1,97 @@
'''This script demonstrates how to build a variational autoencoder with Keras.
Reference: "Auto-Encoding Variational Bayes" https://arxiv.org/abs/1312.6114
'''
import numpy as np
import matplotlib.pyplot as plt
from keras.layers import Input, Dense, Lambda
from keras.models import Model
from keras import backend as K
from keras import objectives
from keras.datasets import mnist
batch_size = 100
original_dim = 784
latent_dim = 2
intermediate_dim = 256
nb_epoch = 50
x = Input(batch_shape=(batch_size, original_dim))
h = Dense(intermediate_dim, activation='relu')(x)
z_mean = Dense(latent_dim)(h)
z_log_var = Dense(latent_dim)(h)
def sampling(args):
z_mean, z_log_var = args
epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0.)
return z_mean + K.exp(z_log_var / 2) * epsilon
# note that "output_shape" isn't necessary with the TensorFlow backend
z = Lambda(sampling, output_shape=(latent_dim,))([z_mean, z_log_var])
# we instantiate these layers separately so as to reuse them later
decoder_h = Dense(intermediate_dim, activation='relu')
decoder_mean = Dense(original_dim, activation='sigmoid')
h_decoded = decoder_h(z)
x_decoded_mean = decoder_mean(h_decoded)
def vae_loss(x, x_decoded_mean):
xent_loss = original_dim * objectives.binary_crossentropy(x, x_decoded_mean)
kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
return xent_loss + kl_loss
vae = Model(x, x_decoded_mean)
vae.compile(optimizer='rmsprop', loss=vae_loss)
# train the VAE on MNIST digits
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
vae.fit(x_train, x_train,
shuffle=True,
nb_epoch=nb_epoch,
batch_size=batch_size,
validation_data=(x_test, x_test))
# build a model to project inputs on the latent space
encoder = Model(x, z_mean)
# display a 2D plot of the digit classes in the latent space
x_test_encoded = encoder.predict(x_test, batch_size=batch_size)
plt.figure(figsize=(6, 6))
plt.scatter(x_test_encoded[:, 0], x_test_encoded[:, 1], c=y_test)
plt.colorbar()
plt.show()
# build a digit generator that can sample from the learned distribution
decoder_input = Input(shape=(latent_dim,))
_h_decoded = decoder_h(decoder_input)
_x_decoded_mean = decoder_mean(_h_decoded)
generator = Model(decoder_input, _x_decoded_mean)
# display a 2D manifold of the digits
n = 15 # figure with 15x15 digits
digit_size = 28
figure = np.zeros((digit_size * n, digit_size * n))
# we will sample n points within [-15, 15] standard deviations
grid_x = np.linspace(-15, 15, n)
grid_y = np.linspace(-15, 15, n)
for i, yi in enumerate(grid_x):
for j, xi in enumerate(grid_y):
z_sample = np.array([[xi, yi]])
x_decoded = generator.predict(z_sample)
digit = x_decoded[0].reshape(digit_size, digit_size)
figure[i * digit_size: (i + 1) * digit_size,
j * digit_size: (j + 1) * digit_size] = digit
plt.figure(figsize=(10, 10))
plt.imshow(figure)
plt.show()
+124
Ver Arquivo
@@ -0,0 +1,124 @@
'''This script demonstrates how to build a variational autoencoder with Keras and deconvolution layers.
Reference: "Auto-Encoding Variational Bayes" https://arxiv.org/abs/1312.6114
'''
import numpy as np
import matplotlib.pyplot as plt
from keras.layers import Input, Dense, Lambda, Flatten, Reshape
from keras.layers import Convolution2D, Deconvolution2D, MaxPooling2D
from keras.models import Model
from keras import backend as K
from keras import objectives
from keras.datasets import mnist
# input image dimensions
img_rows, img_cols, img_chns = 28, 28, 1
# number of convolutional filters to use
nb_filters = 32
# convolution kernel size
nb_conv = 3
batch_size = 16
original_dim = (img_chns, img_rows, img_cols)
latent_dim = 2
intermediate_dim = 128
epsilon_std = 0.01
nb_epoch = 5
x = Input(batch_shape=(batch_size,) + original_dim)
c = Convolution2D(nb_filters, nb_conv, nb_conv, border_mode='same', activation='relu')(x)
f = Flatten()(c)
h = Dense(intermediate_dim, activation='relu')(f)
z_mean = Dense(latent_dim)(h)
z_log_var = Dense(latent_dim)(h)
def sampling(args):
z_mean, z_log_var = args
epsilon = K.random_normal(shape=(batch_size, latent_dim),
mean=0., std=epsilon_std)
return z_mean + K.exp(z_log_var) * epsilon
# note that "output_shape" isn't necessary with the TensorFlow backend
# so you could write `Lambda(sampling)([z_mean, z_log_var])`
z = Lambda(sampling, output_shape=(latent_dim,))([z_mean, z_log_var])
# we instantiate these layers separately so as to reuse them later
decoder_h = Dense(intermediate_dim, activation='relu')
decoder_f = Dense(nb_filters*img_rows*img_cols, activation='relu')
decoder_c = Reshape((nb_filters, img_rows, img_cols))
decoder_mean = Deconvolution2D(img_chns, nb_conv, nb_conv,
(batch_size, img_chns, img_rows, img_cols),
border_mode='same')
h_decoded = decoder_h(z)
f_decoded = decoder_f(h_decoded)
c_decoded = decoder_c(f_decoded)
x_decoded_mean = decoder_mean(c_decoded)
def vae_loss(x, x_decoded_mean):
# NOTE: binary_crossentropy expects a batch_size by dim for x and x_decoded_mean, so we MUST flatten these!
x = K.flatten(x)
x_decoded_mean = K.flatten(x_decoded_mean)
xent_loss = objectives.binary_crossentropy(x, x_decoded_mean)
kl_loss = - 0.5 * K.mean(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
return xent_loss + kl_loss
vae = Model(x, x_decoded_mean)
vae.compile(optimizer='rmsprop', loss=vae_loss)
vae.summary()
# train the VAE on MNIST digits
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32')[:, None, :, :] / 255.
x_test = x_test.astype('float32')[:, None, :, :] / 255.
vae.fit(x_train, x_train,
shuffle=True,
nb_epoch=nb_epoch,
batch_size=batch_size,
validation_data=(x_test, x_test))
# build a model to project inputs on the latent space
encoder = Model(x, z_mean)
# display a 2D plot of the digit classes in the latent space
x_test_encoded = encoder.predict(x_test, batch_size=batch_size)
plt.figure(figsize=(6, 6))
plt.scatter(x_test_encoded[:, 0], x_test_encoded[:, 1], c=y_test)
plt.colorbar()
plt.show()
# build a digit generator that can sample from the learned distribution
decoder_input = Input(shape=(latent_dim,))
_h_decoded = decoder_h(decoder_input)
_f_decoded = decoder_f(_h_decoded)
_c_decoded = decoder_c(_f_decoded)
_x_decoded_mean = decoder_mean(_c_decoded)
generator = Model(decoder_input, _x_decoded_mean)
# display a 2D manifold of the digits
n = 15 # figure with 15x15 digits
digit_size = 28
figure = np.zeros((digit_size * n, digit_size * n))
# we will sample n points within [-15, 15] standard deviations
grid_x = np.linspace(-15, 15, n)
grid_y = np.linspace(-15, 15, n)
for i, yi in enumerate(grid_x):
for j, xi in enumerate(grid_y):
z_sample = np.array([[xi, yi]])
x_decoded = generator.predict(z_sample)
digit = x_decoded[0].reshape(digit_size, digit_size)
figure[i * digit_size: (i + 1) * digit_size,
j * digit_size: (j + 1) * digit_size] = digit
plt.figure(figsize=(10, 10))
plt.imshow(figure)
plt.show()
+17 -14
Ver Arquivo
@@ -1,15 +1,18 @@
from __future__ import absolute_import
from . import backend
from . import datasets
from . import engine
from . import layers
from . import preprocessing
from . import utils
from . import wrappers
from . import callbacks
from . import constraints
from . import initializations
from . import metrics
from . import models
from . import objectives
from . import optimizers
from . import regularizers
"""
Keras: Theano-based Deep Learning library
==================================
Keras is a minimalist, highly modular neural network library in
the spirit of Torch, written in Python / Theano so as not to have
to deal with the dearth of ecosystem in Lua. It was developed with
a focus on enabling fast experimentation. Being able to go from
idea to result with the least possible delay is key to doing
good research.
See http://keras.io/
"""
__version__ = '0.2.0'
__version__ = '1.1.0'
+10 -6
Ver Arquivo
@@ -7,11 +7,9 @@ def softmax(x):
if ndim == 2:
return K.softmax(x)
elif ndim == 3:
# apply softmax to each timestep
def step(x, states):
return K.softmax(x), []
last_output, outputs, states = K.rnn(step, x, [], masking=False)
return outputs
e = K.exp(x - K.max(x, axis=-1, keepdims=True))
s = K.sum(e, axis=-1, keepdims=True)
return e / s
else:
raise Exception('Cannot apply softmax to a tensor that is not 2D or 3D. ' +
'Here, ndim=' + str(ndim))
@@ -21,6 +19,10 @@ def softplus(x):
return K.softplus(x)
def softsign(x):
return K.softsign(x)
def relu(x, alpha=0., max_value=None):
return K.relu(x, alpha=alpha, max_value=max_value)
@@ -39,11 +41,13 @@ def hard_sigmoid(x):
def linear(x):
'''
The function returns the variable that is passed in, so all types work
The function returns the variable that is passed in, so all types work.
'''
return x
from .utils.generic_utils import get_from_module
def get(identifier):
if identifier is None:
return linear
return get_from_module(identifier, globals(), 'activation function')
+4
Ver Arquivo
@@ -0,0 +1,4 @@
from .vgg16 import VGG16
from .vgg19 import VGG19
from .resnet50 import ResNet50
from .inception_v3 import InceptionV3
+43
Ver Arquivo
@@ -0,0 +1,43 @@
import numpy as np
import json
from ..utils.data_utils import get_file
from .. import backend as K
CLASS_INDEX = None
CLASS_INDEX_PATH = 'https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json'
def preprocess_input(x, dim_ordering='default'):
if dim_ordering == 'default':
dim_ordering = K.image_dim_ordering()
assert dim_ordering in {'tf', 'th'}
if dim_ordering == 'th':
x[:, 0, :, :] -= 103.939
x[:, 1, :, :] -= 116.779
x[:, 2, :, :] -= 123.68
# 'RGB'->'BGR'
x = x[:, ::-1, :, :]
else:
x[:, :, :, 0] -= 103.939
x[:, :, :, 1] -= 116.779
x[:, :, :, 2] -= 123.68
# 'RGB'->'BGR'
x = x[:, :, :, ::-1]
return x
def decode_predictions(preds):
global CLASS_INDEX
assert len(preds.shape) == 2 and preds.shape[1] == 1000
if CLASS_INDEX is None:
fpath = get_file('imagenet_class_index.json',
CLASS_INDEX_PATH,
cache_subdir='models')
CLASS_INDEX = json.load(open(fpath))
indices = np.argmax(preds, axis=-1)
results = []
for i in indices:
results.append(CLASS_INDEX[str(i)])
return results
+312
Ver Arquivo
@@ -0,0 +1,312 @@
# -*- coding: utf-8 -*-
'''Inception V3 model for Keras.
Note that the ImageNet weights provided are from a model that had not fully converged.
Inception v3 should be able to reach 6.9% top-5 error, but our model
only gets to 7.8% (same as a fully-converged ResNet 50).
For comparison, VGG16 only gets to 9.9%, quite a bit worse.
Also, do note that the input image format for this model is different than for
other models (299x299 instead of 224x224), and that the input preprocessing function
is also different.
# Reference:
- [Rethinking the Inception Architecture for Computer Vision](http://arxiv.org/abs/1512.00567)
'''
from __future__ import print_function
from __future__ import absolute_import
import warnings
from ..models import Model
from ..layers import Flatten, Dense, Input, BatchNormalization, merge
from ..layers import Convolution2D, MaxPooling2D, AveragePooling2D
from ..utils.layer_utils import convert_all_kernels_in_model
from ..utils.data_utils import get_file
from .. import backend as K
from .imagenet_utils import decode_predictions
TH_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/inception_v3_weights_th_dim_ordering_th_kernels.h5'
TF_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/inception_v3_weights_tf_dim_ordering_tf_kernels.h5'
TH_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/inception_v3_weights_th_dim_ordering_th_kernels_notop.h5'
TF_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5'
def conv2d_bn(x, nb_filter, nb_row, nb_col,
border_mode='same', subsample=(1, 1),
name=None):
'''Utility function to apply conv + BN.
'''
if name is not None:
bn_name = name + '_bn'
conv_name = name + '_conv'
else:
bn_name = None
conv_name = None
if K.image_dim_ordering() == 'th':
bn_axis = 1
else:
bn_axis = 3
x = Convolution2D(nb_filter, nb_row, nb_col,
subsample=subsample,
activation='relu',
border_mode=border_mode,
name=conv_name)(x)
x = BatchNormalization(axis=bn_axis, name=bn_name)(x)
return x
def InceptionV3(include_top=True, weights='imagenet',
input_tensor=None):
'''Instantiate the Inception v3 architecture,
optionally loading weights pre-trained
on ImageNet. Note that when using TensorFlow,
for best performance you should set
`image_dim_ordering="tf"` in your Keras config
at ~/.keras/keras.json.
The model and the weights are compatible with both
TensorFlow and Theano. The dimension ordering
convention used by the model is the one
specified in your Keras config file.
Note that the default input image size for this model is 299x299.
# Arguments
include_top: whether to include the 3 fully-connected
layers at the top of the network.
weights: one of `None` (random initialization)
or "imagenet" (pre-training on ImageNet).
input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
to use as image input for the model.
# Returns
A Keras model instance.
'''
if weights not in {'imagenet', None}:
raise ValueError('The `weights` argument should be either '
'`None` (random initialization) or `imagenet` '
'(pre-training on ImageNet).')
# Determine proper input shape
if K.image_dim_ordering() == 'th':
if include_top:
input_shape = (3, 299, 299)
else:
input_shape = (3, None, None)
else:
if include_top:
input_shape = (299, 299, 3)
else:
input_shape = (None, None, 3)
if input_tensor is None:
img_input = Input(shape=input_shape)
else:
if not K.is_keras_tensor(input_tensor):
img_input = Input(tensor=input_tensor, shape=input_shape)
else:
img_input = input_tensor
if K.image_dim_ordering() == 'th':
channel_axis = 1
else:
channel_axis = 3
x = conv2d_bn(img_input, 32, 3, 3, subsample=(2, 2), border_mode='valid')
x = conv2d_bn(x, 32, 3, 3, border_mode='valid')
x = conv2d_bn(x, 64, 3, 3)
x = MaxPooling2D((3, 3), strides=(2, 2))(x)
x = conv2d_bn(x, 80, 1, 1, border_mode='valid')
x = conv2d_bn(x, 192, 3, 3, border_mode='valid')
x = MaxPooling2D((3, 3), strides=(2, 2))(x)
# mixed 0, 1, 2: 35 x 35 x 256
for i in range(3):
branch1x1 = conv2d_bn(x, 64, 1, 1)
branch5x5 = conv2d_bn(x, 48, 1, 1)
branch5x5 = conv2d_bn(branch5x5, 64, 5, 5)
branch3x3dbl = conv2d_bn(x, 64, 1, 1)
branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
branch_pool = AveragePooling2D(
(3, 3), strides=(1, 1), border_mode='same')(x)
branch_pool = conv2d_bn(branch_pool, 32, 1, 1)
x = merge([branch1x1, branch5x5, branch3x3dbl, branch_pool],
mode='concat', concat_axis=channel_axis,
name='mixed' + str(i))
# mixed 3: 17 x 17 x 768
branch3x3 = conv2d_bn(x, 384, 3, 3, subsample=(2, 2), border_mode='valid')
branch3x3dbl = conv2d_bn(x, 64, 1, 1)
branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3,
subsample=(2, 2), border_mode='valid')
branch_pool = MaxPooling2D((3, 3), strides=(2, 2))(x)
x = merge([branch3x3, branch3x3dbl, branch_pool],
mode='concat', concat_axis=channel_axis,
name='mixed3')
# mixed 4: 17 x 17 x 768
branch1x1 = conv2d_bn(x, 192, 1, 1)
branch7x7 = conv2d_bn(x, 128, 1, 1)
branch7x7 = conv2d_bn(branch7x7, 128, 1, 7)
branch7x7 = conv2d_bn(branch7x7, 192, 7, 1)
branch7x7dbl = conv2d_bn(x, 128, 1, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 128, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 128, 1, 7)
branch7x7dbl = conv2d_bn(branch7x7dbl, 128, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 1, 7)
branch_pool = AveragePooling2D((3, 3), strides=(1, 1), border_mode='same')(x)
branch_pool = conv2d_bn(branch_pool, 192, 1, 1)
x = merge([branch1x1, branch7x7, branch7x7dbl, branch_pool],
mode='concat', concat_axis=channel_axis,
name='mixed4')
# mixed 5, 6: 17 x 17 x 768
for i in range(2):
branch1x1 = conv2d_bn(x, 192, 1, 1)
branch7x7 = conv2d_bn(x, 160, 1, 1)
branch7x7 = conv2d_bn(branch7x7, 160, 1, 7)
branch7x7 = conv2d_bn(branch7x7, 192, 7, 1)
branch7x7dbl = conv2d_bn(x, 160, 1, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 160, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 160, 1, 7)
branch7x7dbl = conv2d_bn(branch7x7dbl, 160, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 1, 7)
branch_pool = AveragePooling2D(
(3, 3), strides=(1, 1), border_mode='same')(x)
branch_pool = conv2d_bn(branch_pool, 192, 1, 1)
x = merge([branch1x1, branch7x7, branch7x7dbl, branch_pool],
mode='concat', concat_axis=channel_axis,
name='mixed' + str(5 + i))
# mixed 7: 17 x 17 x 768
branch1x1 = conv2d_bn(x, 192, 1, 1)
branch7x7 = conv2d_bn(x, 192, 1, 1)
branch7x7 = conv2d_bn(branch7x7, 192, 1, 7)
branch7x7 = conv2d_bn(branch7x7, 192, 7, 1)
branch7x7dbl = conv2d_bn(x, 160, 1, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 1, 7)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 1, 7)
branch_pool = AveragePooling2D((3, 3), strides=(1, 1), border_mode='same')(x)
branch_pool = conv2d_bn(branch_pool, 192, 1, 1)
x = merge([branch1x1, branch7x7, branch7x7dbl, branch_pool],
mode='concat', concat_axis=channel_axis,
name='mixed7')
# mixed 8: 8 x 8 x 1280
branch3x3 = conv2d_bn(x, 192, 1, 1)
branch3x3 = conv2d_bn(branch3x3, 320, 3, 3,
subsample=(2, 2), border_mode='valid')
branch7x7x3 = conv2d_bn(x, 192, 1, 1)
branch7x7x3 = conv2d_bn(branch7x7x3, 192, 1, 7)
branch7x7x3 = conv2d_bn(branch7x7x3, 192, 7, 1)
branch7x7x3 = conv2d_bn(branch7x7x3, 192, 3, 3,
subsample=(2, 2), border_mode='valid')
branch_pool = AveragePooling2D((3, 3), strides=(2, 2))(x)
x = merge([branch3x3, branch7x7x3, branch_pool],
mode='concat', concat_axis=channel_axis,
name='mixed8')
# mixed 9: 8 x 8 x 2048
for i in range(2):
branch1x1 = conv2d_bn(x, 320, 1, 1)
branch3x3 = conv2d_bn(x, 384, 1, 1)
branch3x3_1 = conv2d_bn(branch3x3, 384, 1, 3)
branch3x3_2 = conv2d_bn(branch3x3, 384, 3, 1)
branch3x3 = merge([branch3x3_1, branch3x3_2],
mode='concat', concat_axis=channel_axis,
name='mixed9_' + str(i))
branch3x3dbl = conv2d_bn(x, 448, 1, 1)
branch3x3dbl = conv2d_bn(branch3x3dbl, 384, 3, 3)
branch3x3dbl_1 = conv2d_bn(branch3x3dbl, 384, 1, 3)
branch3x3dbl_2 = conv2d_bn(branch3x3dbl, 384, 3, 1)
branch3x3dbl = merge([branch3x3dbl_1, branch3x3dbl_2],
mode='concat', concat_axis=channel_axis)
branch_pool = AveragePooling2D(
(3, 3), strides=(1, 1), border_mode='same')(x)
branch_pool = conv2d_bn(branch_pool, 192, 1, 1)
x = merge([branch1x1, branch3x3, branch3x3dbl, branch_pool],
mode='concat', concat_axis=channel_axis,
name='mixed' + str(9 + i))
if include_top:
# Classification block
x = AveragePooling2D((8, 8), strides=(8, 8), name='avg_pool')(x)
x = Flatten(name='flatten')(x)
x = Dense(1000, activation='softmax', name='predictions')(x)
# Create model
model = Model(img_input, x)
# load weights
if weights == 'imagenet':
if K.image_dim_ordering() == 'th':
if include_top:
weights_path = get_file('inception_v3_weights_th_dim_ordering_th_kernels.h5',
TH_WEIGHTS_PATH,
cache_subdir='models',
md5_hash='b3baf3070cc4bf476d43a2ea61b0ca5f')
else:
weights_path = get_file('inception_v3_weights_th_dim_ordering_th_kernels_notop.h5',
TH_WEIGHTS_PATH_NO_TOP,
cache_subdir='models',
md5_hash='79aaa90ab4372b4593ba3df64e142f05')
model.load_weights(weights_path)
if K.backend() == 'tensorflow':
warnings.warn('You are using the TensorFlow backend, yet you '
'are using the Theano '
'image dimension ordering convention '
'(`image_dim_ordering="th"`). '
'For best performance, set '
'`image_dim_ordering="tf"` in '
'your Keras config '
'at ~/.keras/keras.json.')
convert_all_kernels_in_model(model)
else:
if include_top:
weights_path = get_file('inception_v3_weights_tf_dim_ordering_tf_kernels.h5',
TF_WEIGHTS_PATH,
cache_subdir='models',
md5_hash='fe114b3ff2ea4bf891e9353d1bbfb32f')
else:
weights_path = get_file('inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5',
TF_WEIGHTS_PATH_NO_TOP,
cache_subdir='models',
md5_hash='2f3609166de1d967d1a481094754f691')
model.load_weights(weights_path)
if K.backend() == 'theano':
convert_all_kernels_in_model(model)
return model
def preprocess_input(x):
x /= 255.
x -= 0.5
x *= 2.
return x
+235
Ver Arquivo
@@ -0,0 +1,235 @@
# -*- coding: utf-8 -*-
'''ResNet50 model for Keras.
# Reference:
- [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)
Adapted from code contributed by BigMoyan.
'''
from __future__ import print_function
from __future__ import absolute_import
import warnings
from ..layers import merge, Input
from ..layers import Dense, Activation, Flatten
from ..layers import Convolution2D, MaxPooling2D, ZeroPadding2D, AveragePooling2D
from ..layers import BatchNormalization
from ..models import Model
from .. import backend as K
from ..utils.layer_utils import convert_all_kernels_in_model
from ..utils.data_utils import get_file
from .imagenet_utils import decode_predictions, preprocess_input
TH_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_th_dim_ordering_th_kernels.h5'
TF_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels.h5'
TH_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_th_dim_ordering_th_kernels_notop.h5'
TF_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5'
def identity_block(input_tensor, kernel_size, filters, stage, block):
'''The identity_block is the block that has no conv layer at shortcut
# Arguments
input_tensor: input tensor
kernel_size: defualt 3, the kernel size of middle conv layer at main path
filters: list of integers, the nb_filters of 3 conv layer at main path
stage: integer, current stage label, used for generating layer names
block: 'a','b'..., current block label, used for generating layer names
'''
nb_filter1, nb_filter2, nb_filter3 = filters
if K.image_dim_ordering() == 'tf':
bn_axis = 3
else:
bn_axis = 1
conv_name_base = 'res' + str(stage) + block + '_branch'
bn_name_base = 'bn' + str(stage) + block + '_branch'
x = Convolution2D(nb_filter1, 1, 1, name=conv_name_base + '2a')(input_tensor)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2a')(x)
x = Activation('relu')(x)
x = Convolution2D(nb_filter2, kernel_size, kernel_size,
border_mode='same', name=conv_name_base + '2b')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2b')(x)
x = Activation('relu')(x)
x = Convolution2D(nb_filter3, 1, 1, name=conv_name_base + '2c')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2c')(x)
x = merge([x, input_tensor], mode='sum')
x = Activation('relu')(x)
return x
def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
'''conv_block is the block that has a conv layer at shortcut
# Arguments
input_tensor: input tensor
kernel_size: defualt 3, the kernel size of middle conv layer at main path
filters: list of integers, the nb_filters of 3 conv layer at main path
stage: integer, current stage label, used for generating layer names
block: 'a','b'..., current block label, used for generating layer names
Note that from stage 3, the first conv layer at main path is with subsample=(2,2)
And the shortcut should have subsample=(2,2) as well
'''
nb_filter1, nb_filter2, nb_filter3 = filters
if K.image_dim_ordering() == 'tf':
bn_axis = 3
else:
bn_axis = 1
conv_name_base = 'res' + str(stage) + block + '_branch'
bn_name_base = 'bn' + str(stage) + block + '_branch'
x = Convolution2D(nb_filter1, 1, 1, subsample=strides,
name=conv_name_base + '2a')(input_tensor)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2a')(x)
x = Activation('relu')(x)
x = Convolution2D(nb_filter2, kernel_size, kernel_size, border_mode='same',
name=conv_name_base + '2b')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2b')(x)
x = Activation('relu')(x)
x = Convolution2D(nb_filter3, 1, 1, name=conv_name_base + '2c')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2c')(x)
shortcut = Convolution2D(nb_filter3, 1, 1, subsample=strides,
name=conv_name_base + '1')(input_tensor)
shortcut = BatchNormalization(axis=bn_axis, name=bn_name_base + '1')(shortcut)
x = merge([x, shortcut], mode='sum')
x = Activation('relu')(x)
return x
def ResNet50(include_top=True, weights='imagenet',
input_tensor=None):
'''Instantiate the ResNet50 architecture,
optionally loading weights pre-trained
on ImageNet. Note that when using TensorFlow,
for best performance you should set
`image_dim_ordering="tf"` in your Keras config
at ~/.keras/keras.json.
The model and the weights are compatible with both
TensorFlow and Theano. The dimension ordering
convention used by the model is the one
specified in your Keras config file.
# Arguments
include_top: whether to include the 3 fully-connected
layers at the top of the network.
weights: one of `None` (random initialization)
or "imagenet" (pre-training on ImageNet).
input_tensor: optional Keras tensor (i.e. xput of `layers.Input()`)
to use as image input for the model.
# Returns
A Keras model instance.
'''
if weights not in {'imagenet', None}:
raise ValueError('The `weights` argument should be either '
'`None` (random initialization) or `imagenet` '
'(pre-training on ImageNet).')
# Determine proper input shape
if K.image_dim_ordering() == 'th':
if include_top:
input_shape = (3, 224, 224)
else:
input_shape = (3, None, None)
else:
if include_top:
input_shape = (224, 224, 3)
else:
input_shape = (None, None, 3)
if input_tensor is None:
img_input = Input(shape=input_shape)
else:
if not K.is_keras_tensor(input_tensor):
img_input = Input(tensor=input_tensor, shape=input_shape)
else:
img_input = input_tensor
if K.image_dim_ordering() == 'tf':
bn_axis = 3
else:
bn_axis = 1
x = ZeroPadding2D((3, 3))(img_input)
x = Convolution2D(64, 7, 7, subsample=(2, 2), name='conv1')(x)
x = BatchNormalization(axis=bn_axis, name='bn_conv1')(x)
x = Activation('relu')(x)
x = MaxPooling2D((3, 3), strides=(2, 2))(x)
x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')
x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
x = identity_block(x, 3, [128, 128, 512], stage=3, block='b')
x = identity_block(x, 3, [128, 128, 512], stage=3, block='c')
x = identity_block(x, 3, [128, 128, 512], stage=3, block='d')
x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b')
x = identity_block(x, 3, [256, 256, 1024], stage=4, block='c')
x = identity_block(x, 3, [256, 256, 1024], stage=4, block='d')
x = identity_block(x, 3, [256, 256, 1024], stage=4, block='e')
x = identity_block(x, 3, [256, 256, 1024], stage=4, block='f')
x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')
x = AveragePooling2D((7, 7), name='avg_pool')(x)
if include_top:
x = Flatten()(x)
x = Dense(1000, activation='softmax', name='fc1000')(x)
model = Model(img_input, x)
# load weights
if weights == 'imagenet':
if K.image_dim_ordering() == 'th':
if include_top:
weights_path = get_file('resnet50_weights_th_dim_ordering_th_kernels.h5',
TH_WEIGHTS_PATH,
cache_subdir='models',
md5_hash='1c1f8f5b0c8ee28fe9d950625a230e1c')
else:
weights_path = get_file('resnet50_weights_th_dim_ordering_th_kernels_notop.h5',
TH_WEIGHTS_PATH_NO_TOP,
cache_subdir='models',
md5_hash='f64f049c92468c9affcd44b0976cdafe')
model.load_weights(weights_path)
if K.backend() == 'tensorflow':
warnings.warn('You are using the TensorFlow backend, yet you '
'are using the Theano '
'image dimension ordering convention '
'(`image_dim_ordering="th"`). '
'For best performance, set '
'`image_dim_ordering="tf"` in '
'your Keras config '
'at ~/.keras/keras.json.')
convert_all_kernels_in_model(model)
else:
if include_top:
weights_path = get_file('resnet50_weights_tf_dim_ordering_tf_kernels.h5',
TF_WEIGHTS_PATH,
cache_subdir='models',
md5_hash='a7b3fe01876f51b976af0dea6bc144eb')
else:
weights_path = get_file('resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5',
TF_WEIGHTS_PATH_NO_TOP,
cache_subdir='models',
md5_hash='a268eb855778b3df3c7506639542a6af')
model.load_weights(weights_path)
if K.backend() == 'theano':
convert_all_kernels_in_model(model)
return model
+149
Ver Arquivo
@@ -0,0 +1,149 @@
# -*- coding: utf-8 -*-
'''VGG16 model for Keras.
# Reference:
- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)
'''
from __future__ import print_function
from __future__ import absolute_import
import warnings
from ..models import Model
from ..layers import Flatten, Dense, Input
from ..layers import Convolution2D, MaxPooling2D
from ..utils.layer_utils import convert_all_kernels_in_model
from ..utils.data_utils import get_file
from .. import backend as K
from .imagenet_utils import decode_predictions, preprocess_input
TH_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels.h5'
TF_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
TH_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels_notop.h5'
TF_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'
def VGG16(include_top=True, weights='imagenet',
input_tensor=None):
'''Instantiate the VGG16 architecture,
optionally loading weights pre-trained
on ImageNet. Note that when using TensorFlow,
for best performance you should set
`image_dim_ordering="tf"` in your Keras config
at ~/.keras/keras.json.
The model and the weights are compatible with both
TensorFlow and Theano. The dimension ordering
convention used by the model is the one
specified in your Keras config file.
# Arguments
include_top: whether to include the 3 fully-connected
layers at the top of the network.
weights: one of `None` (random initialization)
or "imagenet" (pre-training on ImageNet).
input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
to use as image input for the model.
# Returns
A Keras model instance.
'''
if weights not in {'imagenet', None}:
raise ValueError('The `weights` argument should be either '
'`None` (random initialization) or `imagenet` '
'(pre-training on ImageNet).')
# Determine proper input shape
if K.image_dim_ordering() == 'th':
if include_top:
input_shape = (3, 224, 224)
else:
input_shape = (3, None, None)
else:
if include_top:
input_shape = (224, 224, 3)
else:
input_shape = (None, None, 3)
if input_tensor is None:
img_input = Input(shape=input_shape)
else:
if not K.is_keras_tensor(input_tensor):
img_input = Input(tensor=input_tensor, shape=input_shape)
else:
img_input = input_tensor
# Block 1
x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv1')(img_input)
x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
# Block 2
x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv1')(x)
x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)
# Block 3
x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv1')(x)
x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv2')(x)
x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)
# Block 4
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv1')(x)
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv2')(x)
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)
# Block 5
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv1')(x)
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv2')(x)
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
if include_top:
# Classification block
x = Flatten(name='flatten')(x)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(1000, activation='softmax', name='predictions')(x)
# Create model
model = Model(img_input, x)
# load weights
if weights == 'imagenet':
if K.image_dim_ordering() == 'th':
if include_top:
weights_path = get_file('vgg16_weights_th_dim_ordering_th_kernels.h5',
TH_WEIGHTS_PATH,
cache_subdir='models')
else:
weights_path = get_file('vgg16_weights_th_dim_ordering_th_kernels_notop.h5',
TH_WEIGHTS_PATH_NO_TOP,
cache_subdir='models')
model.load_weights(weights_path)
if K.backend() == 'tensorflow':
warnings.warn('You are using the TensorFlow backend, yet you '
'are using the Theano '
'image dimension ordering convention '
'(`image_dim_ordering="th"`). '
'For best performance, set '
'`image_dim_ordering="tf"` in '
'your Keras config '
'at ~/.keras/keras.json.')
convert_all_kernels_in_model(model)
else:
if include_top:
weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5',
TF_WEIGHTS_PATH,
cache_subdir='models')
else:
weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',
TF_WEIGHTS_PATH_NO_TOP,
cache_subdir='models')
model.load_weights(weights_path)
if K.backend() == 'theano':
convert_all_kernels_in_model(model)
return model
+152
Ver Arquivo
@@ -0,0 +1,152 @@
# -*- coding: utf-8 -*-
'''VGG19 model for Keras.
# Reference:
- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)
'''
from __future__ import print_function
from __future__ import absolute_import
import warnings
from ..models import Model
from ..layers import Flatten, Dense, Input
from ..layers import Convolution2D, MaxPooling2D
from ..utils.layer_utils import convert_all_kernels_in_model
from ..utils.data_utils import get_file
from .. import backend as K
from .imagenet_utils import decode_predictions, preprocess_input
TH_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_th_dim_ordering_th_kernels.h5'
TF_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_tf_dim_ordering_tf_kernels.h5'
TH_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_th_dim_ordering_th_kernels_notop.h5'
TF_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5'
def VGG19(include_top=True, weights='imagenet',
input_tensor=None):
'''Instantiate the VGG19 architecture,
optionally loading weights pre-trained
on ImageNet. Note that when using TensorFlow,
for best performance you should set
`image_dim_ordering="tf"` in your Keras config
at ~/.keras/keras.json.
The model and the weights are compatible with both
TensorFlow and Theano. The dimension ordering
convention used by the model is the one
specified in your Keras config file.
# Arguments
include_top: whether to include the 3 fully-connected
layers at the top of the network.
weights: one of `None` (random initialization)
or "imagenet" (pre-training on ImageNet).
input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
to use as image input for the model.
# Returns
A Keras model instance.
'''
if weights not in {'imagenet', None}:
raise ValueError('The `weights` argument should be either '
'`None` (random initialization) or `imagenet` '
'(pre-training on ImageNet).')
# Determine proper input shape
if K.image_dim_ordering() == 'th':
if include_top:
input_shape = (3, 224, 224)
else:
input_shape = (3, None, None)
else:
if include_top:
input_shape = (224, 224, 3)
else:
input_shape = (None, None, 3)
if input_tensor is None:
img_input = Input(shape=input_shape)
else:
if not K.is_keras_tensor(input_tensor):
img_input = Input(tensor=input_tensor, shape=input_shape)
else:
img_input = input_tensor
# Block 1
x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv1')(img_input)
x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
# Block 2
x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv1')(x)
x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)
# Block 3
x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv1')(x)
x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv2')(x)
x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv3')(x)
x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv4')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)
# Block 4
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv1')(x)
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv2')(x)
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv3')(x)
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv4')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)
# Block 5
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv1')(x)
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv2')(x)
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv3')(x)
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv4')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
if include_top:
# Classification block
x = Flatten(name='flatten')(x)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(1000, activation='softmax', name='predictions')(x)
# Create model
model = Model(img_input, x)
# load weights
if weights == 'imagenet':
if K.image_dim_ordering() == 'th':
if include_top:
weights_path = get_file('vgg19_weights_th_dim_ordering_th_kernels.h5',
TH_WEIGHTS_PATH,
cache_subdir='models')
else:
weights_path = get_file('vgg19_weights_th_dim_ordering_th_kernels_notop.h5',
TH_WEIGHTS_PATH_NO_TOP,
cache_subdir='models')
model.load_weights(weights_path)
if K.backend() == 'tensorflow':
warnings.warn('You are using the TensorFlow backend, yet you '
'are using the Theano '
'image dimension ordering convention '
'(`image_dim_ordering="th"`). '
'For best performance, set '
'`image_dim_ordering="tf"` in '
'your Keras config '
'at ~/.keras/keras.json.')
convert_all_kernels_in_model(model)
else:
if include_top:
weights_path = get_file('vgg19_weights_tf_dim_ordering_tf_kernels.h5',
TF_WEIGHTS_PATH,
cache_subdir='models')
else:
weights_path = get_file('vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5',
TF_WEIGHTS_PATH_NO_TOP,
cache_subdir='models')
model.load_weights(weights_path)
if K.backend() == 'theano':
convert_all_kernels_in_model(model)
return model
+46 -12
Ver Arquivo
@@ -2,38 +2,72 @@ from __future__ import absolute_import
from __future__ import print_function
import os
import json
from .common import epsilon, floatx, set_epsilon, set_floatx
import sys
from .common import epsilon
from .common import floatx
from .common import set_epsilon
from .common import set_floatx
from .common import get_uid
from .common import cast_to_floatx
from .common import image_dim_ordering
from .common import set_image_dim_ordering
from .common import is_keras_tensor
from .common import legacy_weight_ordering
from .common import set_legacy_weight_ordering
_keras_dir = os.path.expanduser(os.path.join('~', '.keras'))
_keras_base_dir = os.path.expanduser('~')
if not os.access(_keras_base_dir, os.W_OK):
_keras_base_dir = '/tmp'
_keras_dir = os.path.join(_keras_base_dir, '.keras')
if not os.path.exists(_keras_dir):
os.makedirs(_keras_dir)
_BACKEND = 'theano'
_config_path = os.path.expanduser(os.path.join('~', '.keras', 'keras.json'))
_BACKEND = 'tensorflow'
_config_path = os.path.expanduser(os.path.join(_keras_dir, 'keras.json'))
if os.path.exists(_config_path):
_config = json.load(open(_config_path))
_floatx = _config.get('floatx', floatx())
assert _floatx in {'float32', 'float64'}
assert _floatx in {'float16', 'float32', 'float64'}
_epsilon = _config.get('epsilon', epsilon())
assert type(_epsilon) == float
_backend = _config.get('backend', _BACKEND)
assert _backend in {'theano', 'tensorflow'}
_image_dim_ordering = _config.get('image_dim_ordering', image_dim_ordering())
assert _image_dim_ordering in {'tf', 'th'}
set_floatx(_floatx)
set_epsilon(_epsilon)
set_image_dim_ordering(_image_dim_ordering)
_BACKEND = _backend
else:
# save config file, for easy edition
# save config file
if not os.path.exists(_config_path):
_config = {'floatx': floatx(),
'epsilon': epsilon(),
'backend': _BACKEND}
json.dump(_config, open(_config_path, 'w'))
'backend': _BACKEND,
'image_dim_ordering': image_dim_ordering()}
with open(_config_path, 'w') as f:
f.write(json.dumps(_config, indent=4))
if 'KERAS_BACKEND' in os.environ:
_backend = os.environ['KERAS_BACKEND']
assert _backend in {'theano', 'tensorflow'}
_BACKEND = _backend
# import backend
if _BACKEND == 'theano':
print('Using Theano backend.')
sys.stderr.write('Using Theano backend.\n')
from .theano_backend import *
elif _BACKEND == 'tensorflow':
print('Using TensorFlow backend.')
sys.stderr.write('Using TensorFlow backend.\n')
from .tensorflow_backend import *
else:
raise Exception('Unknown backend: ' + str(backend))
raise Exception('Unknown backend: ' + str(_BACKEND))
def backend():
'''Publicly accessible method
for determining the current backend.
'''
return _BACKEND
+60 -2
Ver Arquivo
@@ -1,31 +1,89 @@
import numpy as np
from collections import defaultdict
# the type of float to use throughout the session.
_FLOATX = 'float32'
_EPSILON = 10e-8
_UID_PREFIXES = defaultdict(int)
_IMAGE_DIM_ORDERING = 'tf'
_LEGACY_WEIGHT_ORDERING = False
def epsilon():
'''Returns the value of the fuzz
factor used in numeric expressions.
'''
return _EPSILON
def set_epsilon(e):
'''Sets the value of the fuzz
factor used in numeric expressions.
'''
global _EPSILON
_EPSILON = e
def floatx():
'''Returns the default float type, as a string
(e.g. 'float16', 'float32', 'float64').
'''
return _FLOATX
def set_floatx(floatx):
global _FLOATX
if floatx not in {'float32', 'float64'}:
if floatx not in {'float16', 'float32', 'float64'}:
raise Exception('Unknown floatx type: ' + str(floatx))
_FLOATX = floatx
_FLOATX = str(floatx)
def cast_to_floatx(x):
'''Cast a Numpy array to floatx.
'''
return np.asarray(x, dtype=_FLOATX)
def image_dim_ordering():
'''Returns the image dimension ordering
convention ('th' or 'tf').
'''
return _IMAGE_DIM_ORDERING
def set_image_dim_ordering(dim_ordering):
'''Sets the value of the image dimension
ordering convention ('th' or 'tf').
'''
global _IMAGE_DIM_ORDERING
if dim_ordering not in {'tf', 'th'}:
raise Exception('Unknown dim_ordering:', dim_ordering)
_IMAGE_DIM_ORDERING = str(dim_ordering)
def get_uid(prefix=''):
_UID_PREFIXES[prefix] += 1
return _UID_PREFIXES[prefix]
def reset_uids():
global _UID_PREFIXES
_UID_PREFIXES = defaultdict(int)
def is_keras_tensor(x):
if hasattr(x, '_keras_shape'):
return True
else:
return False
def set_legacy_weight_ordering(value):
global _LEGACY_WEIGHT_ORDERING
assert value in {True, False}
_LEGACY_WEIGHT_ORDERING = value
def legacy_weight_ordering():
return _LEGACY_WEIGHT_ORDERING
Diferenças do arquivo suprimidas por serem muito extensas Carregar Diff
Diferenças do arquivo suprimidas por serem muito extensas Carregar Diff
+319 -80
Ver Arquivo
@@ -8,6 +8,8 @@ import warnings
from collections import deque
from .utils.generic_utils import Progbar
from keras import backend as K
from pkg_resources import parse_version
class CallbackList(object):
@@ -43,21 +45,26 @@ class CallbackList(object):
callback.on_batch_begin(batch, logs)
self._delta_ts_batch_begin.append(time.time() - t_before_callbacks)
delta_t_median = np.median(self._delta_ts_batch_begin)
if self._delta_t_batch > 0. and delta_t_median > 0.95 * self._delta_t_batch and delta_t_median > 0.1:
if self._delta_t_batch > 0. and delta_t_median > 0.95 * \
self._delta_t_batch and delta_t_median > 0.1:
warnings.warn('Method on_batch_begin() is slow compared '
'to the batch update (%f). Check your callbacks.' % delta_t_median)
'to the batch update (%f). Check your callbacks.'
% delta_t_median)
self._t_enter_batch = time.time()
def on_batch_end(self, batch, logs={}):
if not hasattr(self, '_t_enter_batch'):
self._t_enter_batch = time.time()
self._delta_t_batch = time.time() - self._t_enter_batch
t_before_callbacks = time.time()
for callback in self.callbacks:
callback.on_batch_end(batch, logs)
self._delta_ts_batch_end.append(time.time() - t_before_callbacks)
delta_t_median = np.median(self._delta_ts_batch_end)
if self._delta_t_batch > 0. and delta_t_median > 0.95 * self._delta_t_batch and delta_t_median > 0.1:
if self._delta_t_batch > 0. and (delta_t_median > 0.95 * self._delta_t_batch and delta_t_median > 0.1):
warnings.warn('Method on_batch_end() is slow compared '
'to the batch update (%f). Check your callbacks.' % delta_t_median)
'to the batch update (%f). Check your callbacks.'
% delta_t_median)
def on_train_begin(self, logs={}):
for callback in self.callbacks:
@@ -69,7 +76,31 @@ class CallbackList(object):
class Callback(object):
'''Abstract base class used to build new callbacks.
# Properties
params: dict. Training parameters
(eg. verbosity, batch size, number of epochs...).
model: instance of `keras.models.Model`.
Reference of the model being trained.
The `logs` dictionary that callback methods
take as argument will contain keys for quantities relevant to
the current batch or epoch.
Currently, the `.fit()` method of the `Sequential` model class
will include the following quantities in the `logs` that
it passes to its callbacks:
on_epoch_end: logs include `acc` and `loss`, and
optionally include `val_loss`
(if validation is enabled in `fit`), and `val_acc`
(if validation and accuracy monitoring are enabled).
on_batch_begin: logs include `size`,
the number of samples in the current batch.
on_batch_end: logs include `loss`, and optionally `acc`
(if accuracy monitoring is enabled).
'''
def __init__(self):
pass
@@ -99,6 +130,36 @@ class Callback(object):
class BaseLogger(Callback):
'''Callback that accumulates epoch averages of
the metrics being monitored.
This callback is automatically applied to
every Keras model.
'''
def on_epoch_begin(self, epoch, logs={}):
self.seen = 0
self.totals = {}
def on_batch_end(self, batch, logs={}):
batch_size = logs.get('size', 0)
self.seen += batch_size
for k, v in logs.items():
if k in self.totals:
self.totals[k] += v * batch_size
else:
self.totals[k] = v * batch_size
def on_epoch_end(self, epoch, logs={}):
for k in self.params['metrics']:
if k in self.totals:
# make value available to next callbacks
logs[k] = self.totals[k] / self.seen
class ProgbarLogger(Callback):
'''Callback that prints metrics to stdout.
'''
def on_train_begin(self, logs={}):
self.verbose = self.params['verbose']
self.nb_epoch = self.params['nb_epoch']
@@ -109,7 +170,6 @@ class BaseLogger(Callback):
self.progbar = Progbar(target=self.params['nb_sample'],
verbose=self.verbose)
self.seen = 0
self.totals = {}
def on_batch_begin(self, batch, logs={}):
if self.seen < self.params['nb_sample']:
@@ -119,82 +179,96 @@ class BaseLogger(Callback):
batch_size = logs.get('size', 0)
self.seen += batch_size
for k, v in logs.items():
if k in self.totals:
self.totals[k] += v * batch_size
else:
self.totals[k] = v * batch_size
for k in self.params['metrics']:
if k in logs:
self.log_values.append((k, logs[k]))
# skip progbar update for the last batch; will be handled by on_epoch_end
# skip progbar update for the last batch;
# will be handled by on_epoch_end
if self.verbose and self.seen < self.params['nb_sample']:
self.progbar.update(self.seen, self.log_values)
def on_epoch_end(self, epoch, logs={}):
for k in self.params['metrics']:
if k in self.totals:
self.log_values.append((k, self.totals[k] / self.seen))
if k in logs:
self.log_values.append((k, logs[k]))
if self.verbose:
self.progbar.update(self.seen, self.log_values)
self.progbar.update(self.seen, self.log_values, force=True)
class History(Callback):
'''Callback that records events
into a `History` object.
This callback is automatically applied to
every Keras model. The `History` object
gets returned by the `fit` method of models.
'''
def on_train_begin(self, logs={}):
self.epoch = []
self.history = {}
def on_epoch_begin(self, epoch, logs={}):
self.seen = 0
self.totals = {}
def on_batch_end(self, batch, logs={}):
batch_size = logs.get('size', 0)
self.seen += batch_size
for k, v in logs.items():
if k in self.totals:
self.totals[k] += v * batch_size
else:
self.totals[k] = v * batch_size
def on_epoch_end(self, epoch, logs={}):
self.epoch.append(epoch)
for k, v in self.totals.items():
if k not in self.history:
self.history[k] = []
self.history[k].append(v / self.seen)
for k, v in logs.items():
if k not in self.history:
self.history[k] = []
self.history[k].append(v)
self.history.setdefault(k, []).append(v)
class ModelCheckpoint(Callback):
def __init__(self, filepath, monitor='val_loss', verbose=0, save_best_only=False, mode='auto'):
'''Save the model after every epoch.
super(Callback, self).__init__()
`filepath` can contain named formatting options,
which will be filled the value of `epoch` and
keys in `logs` (passed in `on_epoch_end`).
For example: if `filepath` is `weights.{epoch:02d}-{val_loss:.2f}.hdf5`,
then multiple files will be save with the epoch number and
the validation loss.
# Arguments
filepath: string, path to save the model file.
monitor: quantity to monitor.
verbose: verbosity mode, 0 or 1.
save_best_only: if `save_best_only=True`,
the latest best model according to
the quantity monitored will not be overwritten.
mode: one of {auto, min, max}.
If `save_best_only=True`, the decision
to overwrite the current save file is made
based on either the maximization or the
minimization of the monitored quantity. For `val_acc`,
this should be `max`, for `val_loss` this should
be `min`, etc. In `auto` mode, the direction is
automatically inferred from the name of the monitored quantity.
save_weights_only: if True, then only the model's weights will be
saved (`model.save_weights(filepath)`), else the full model
is saved (`model.save(filepath)`).
'''
def __init__(self, filepath, monitor='val_loss', verbose=0,
save_best_only=False, save_weights_only=False,
mode='auto'):
super(ModelCheckpoint, self).__init__()
self.monitor = monitor
self.verbose = verbose
self.filepath = filepath
self.save_best_only = save_best_only
self.save_weights_only = save_weights_only
if mode not in ['auto', 'min', 'max']:
warnings.warn("ModelCheckpoint mode %s is unknown, fallback to auto mode" % (self.mode), RuntimeWarning)
warnings.warn('ModelCheckpoint mode %s is unknown, '
'fallback to auto mode.' % (mode),
RuntimeWarning)
mode = 'auto'
if mode == "min":
if mode == 'min':
self.monitor_op = np.less
self.best = np.Inf
elif mode == "max":
elif mode == 'max':
self.monitor_op = np.greater
self.best = -np.Inf
else:
if "acc" in self.monitor:
if 'acc' in self.monitor:
self.monitor_op = np.greater
self.best = -np.Inf
else:
@@ -206,90 +280,255 @@ class ModelCheckpoint(Callback):
if self.save_best_only:
current = logs.get(self.monitor)
if current is None:
warnings.warn("Can save best model only with %s available, skipping." % (self.monitor), RuntimeWarning)
warnings.warn('Can save best model only with %s available, '
'skipping.' % (self.monitor), RuntimeWarning)
else:
if self.monitor_op(current, self.best):
if self.verbose > 0:
print("Epoch %05d: %s improved from %0.5f to %0.5f, saving model to %s"
% (epoch, self.monitor, self.best, current, filepath))
print('Epoch %05d: %s improved from %0.5f to %0.5f,'
' saving model to %s'
% (epoch, self.monitor, self.best,
current, filepath))
self.best = current
self.model.save_weights(filepath, overwrite=True)
if self.save_weights_only:
self.model.save_weights(filepath, overwrite=True)
else:
self.model.save(filepath, overwrite=True)
else:
if self.verbose > 0:
print("Epoch %05d: %s did not improve" % (epoch, self.monitor))
print('Epoch %05d: %s did not improve' %
(epoch, self.monitor))
else:
if self.verbose > 0:
print("Epoch %05d: saving model to %s" % (epoch, filepath))
self.model.save_weights(filepath, overwrite=True)
print('Epoch %05d: saving model to %s' % (epoch, filepath))
if self.save_weights_only:
self.model.save_weights(filepath, overwrite=True)
else:
self.model.save(filepath, overwrite=True)
class EarlyStopping(Callback):
def __init__(self, monitor='val_loss', patience=0, verbose=0):
super(Callback, self).__init__()
'''Stop training when a monitored quantity has stopped improving.
# Arguments
monitor: quantity to be monitored.
patience: number of epochs with no improvement
after which training will be stopped.
verbose: verbosity mode.
mode: one of {auto, min, max}. In `min` mode,
training will stop when the quantity
monitored has stopped decreasing; in `max`
mode it will stop when the quantity
monitored has stopped increasing; in `auto`
mode, the direction is automatically inferred
from the name of the monitored quantity.
'''
def __init__(self, monitor='val_loss', patience=0, verbose=0, mode='auto'):
super(EarlyStopping, self).__init__()
self.monitor = monitor
self.patience = patience
self.verbose = verbose
self.best = np.Inf
self.wait = 0
if mode not in ['auto', 'min', 'max']:
warnings.warn('EarlyStopping mode %s is unknown, '
'fallback to auto mode.' % (self.mode),
RuntimeWarning)
mode = 'auto'
if mode == 'min':
self.monitor_op = np.less
elif mode == 'max':
self.monitor_op = np.greater
else:
if 'acc' in self.monitor:
self.monitor_op = np.greater
else:
self.monitor_op = np.less
def on_train_begin(self, logs={}):
self.wait = 0 # Allow instances to be re-used
self.best = np.Inf if self.monitor_op == np.less else -np.Inf
def on_epoch_end(self, epoch, logs={}):
current = logs.get(self.monitor)
if current is None:
warnings.warn("Early stopping requires %s available!" % (self.monitor), RuntimeWarning)
warnings.warn('Early stopping requires %s available!' %
(self.monitor), RuntimeWarning)
if current < self.best:
if self.monitor_op(current, self.best):
self.best = current
self.wait = 0
else:
if self.wait >= self.patience:
if self.verbose > 0:
print("Epoch %05d: early stopping" % (epoch))
print('Epoch %05d: early stopping' % (epoch))
self.model.stop_training = True
self.wait += 1
class RemoteMonitor(Callback):
def __init__(self, root='http://localhost:9000'):
'''Callback used to stream events to a server.
Requires the `requests` library.
# Arguments
root: root url to which the events will be sent (at the end
of every epoch). Events are sent to
`root + '/publish/epoch/end/'` by default. Calls are
HTTP POST, with a `data` argument which is a
JSON-encoded dictionary of event data.
'''
def __init__(self,
root='http://localhost:9000',
path='/publish/epoch/end/',
field='data'):
super(RemoteMonitor, self).__init__()
self.root = root
def on_epoch_begin(self, epoch, logs={}):
self.seen = 0
self.totals = {}
def on_batch_end(self, batch, logs={}):
batch_size = logs.get('size', 0)
self.seen += batch_size
for k, v in logs.items():
if k in self.totals:
self.totals[k] += v * batch_size
else:
self.totals[k] = v * batch_size
self.path = path
self.field = field
def on_epoch_end(self, epoch, logs={}):
import requests
send = {}
send['epoch'] = epoch
for k, v in self.totals.items():
send[k] = v / self.seen
for k, v in logs.items():
send[k] = v
try:
r = requests.post(self.root + '/publish/epoch/end/', {'data': json.dumps(send)})
requests.post(self.root + self.path,
{self.field: json.dumps(send)})
except:
print('Warning: could not reach RemoteMonitor root server at ' + str(self.root))
print('Warning: could not reach RemoteMonitor '
'root server at ' + str(self.root))
class LearningRateScheduler(Callback):
'''LearningRateScheduler
schedule is a function that gets an epoch number as input and returns a new
learning rate as output.
'''Learning rate scheduler.
# Arguments
schedule: a function that takes an epoch index as input
(integer, indexed from 0) and returns a new
learning rate as output (float).
'''
def __init__(self, schedule):
super(LearningRateScheduler, self).__init__()
self.schedule = schedule
def on_epoch_begin(self, epoch, logs={}):
self.model.optimizer.lr.set_value(self.schedule(epoch))
assert hasattr(self.model.optimizer, 'lr'), \
'Optimizer must have a "lr" attribute.'
lr = self.schedule(epoch)
assert type(lr) == float, 'The output of the "schedule" function should be float.'
K.set_value(self.model.optimizer.lr, lr)
class TensorBoard(Callback):
''' Tensorboard basic visualizations.
This callback writes a log for TensorBoard, which allows
you to visualize dynamic graphs of your training and test
metrics, as well as activation histograms for the different
layers in your model.
TensorBoard is a visualization tool provided with TensorFlow.
If you have installed TensorFlow with pip, you should be able
to launch TensorBoard from the command line:
```
tensorboard --logdir=/full_path_to_your_logs
```
You can find more information about TensorBoard
[here](https://www.tensorflow.org/versions/master/how_tos/summaries_and_tensorboard/index.html).
# Arguments
log_dir: the path of the directory where to save the log
files to be parsed by Tensorboard
histogram_freq: frequency (in epochs) at which to compute activation
histograms for the layers of the model. If set to 0,
histograms won't be computed.
write_graph: whether to visualize the graph in Tensorboard.
The log file can become quite large when
write_graph is set to True.
'''
def __init__(self, log_dir='./logs', histogram_freq=0, write_graph=True, write_images=False):
super(TensorBoard, self).__init__()
if K._BACKEND != 'tensorflow':
raise Exception('TensorBoard callback only works '
'with the TensorFlow backend.')
self.log_dir = log_dir
self.histogram_freq = histogram_freq
self.merged = None
self.write_graph = write_graph
self.write_images = write_images
def _set_model(self, model):
import tensorflow as tf
import keras.backend.tensorflow_backend as KTF
self.model = model
self.sess = KTF.get_session()
if self.histogram_freq and self.merged is None:
for layer in self.model.layers:
for weight in layer.weights:
tf.histogram_summary(weight.name, weight)
if self.write_images:
w_img = tf.squeeze(weight)
shape = w_img.get_shape()
if len(shape) > 1 and shape[0] > shape[1]:
w_img = tf.transpose(w_img)
if len(shape) == 1:
w_img = tf.expand_dims(w_img, 0)
w_img = tf.expand_dims(tf.expand_dims(w_img, 0), -1)
tf.image_summary(weight.name, w_img)
if hasattr(layer, 'output'):
tf.histogram_summary('{}_out'.format(layer.name),
layer.output)
self.merged = tf.merge_all_summaries()
if self.write_graph:
if parse_version(tf.__version__) >= parse_version('0.8.0'):
self.writer = tf.train.SummaryWriter(self.log_dir,
self.sess.graph)
else:
self.writer = tf.train.SummaryWriter(self.log_dir,
self.sess.graph_def)
else:
self.writer = tf.train.SummaryWriter(self.log_dir)
def on_epoch_end(self, epoch, logs={}):
import tensorflow as tf
if self.model.validation_data and self.histogram_freq:
if epoch % self.histogram_freq == 0:
# TODO: implement batched calls to sess.run
# (current call will likely go OOM on GPU)
if self.model.uses_learning_phase:
cut_v_data = len(self.model.inputs)
val_data = self.model.validation_data[:cut_v_data] + [0]
tensors = self.model.inputs + [K.learning_phase()]
else:
val_data = self.model.validation_data
tensors = self.model.inputs
feed_dict = dict(zip(tensors, val_data))
result = self.sess.run([self.merged], feed_dict=feed_dict)
summary_str = result[0]
self.writer.add_summary(summary_str, epoch)
for name, value in logs.items():
if name in ['batch', 'size']:
continue
summary = tf.Summary()
summary_value = summary.value.add()
summary_value.simple_value = value
summary_value.tag = name
self.writer.add_summary(summary, epoch)
self.writer.flush()
+58 -10
Ver Arquivo
@@ -7,39 +7,87 @@ class Constraint(object):
return p
def get_config(self):
return {"name": self.__class__.__name__}
return {'name': self.__class__.__name__}
class MaxNorm(Constraint):
def __init__(self, m=2):
'''Constrain the weights incident to each hidden unit to have a norm less than or equal to a desired value.
# Arguments
m: the maximum norm for the incoming weights.
axis: integer, axis along which to calculate weight norms. For instance,
in a `Dense` layer the weight matrix has shape (input_dim, output_dim),
set `axis` to `0` to constrain each weight vector of length (input_dim).
In a `MaxoutDense` layer the weight tensor has shape (nb_feature, input_dim, output_dim),
set `axis` to `1` to constrain each weight vector of length (input_dim),
i.e. constrain the filters incident to the `max` operation.
In a `Convolution2D` layer with the Theano backend, the weight tensor
has shape (nb_filter, stack_size, nb_row, nb_col), set `axis` to `[1,2,3]`
to constrain the weights of each filter tensor of size (stack_size, nb_row, nb_col).
In a `Convolution2D` layer with the TensorFlow backend, the weight tensor
has shape (nb_row, nb_col, stack_size, nb_filter), set `axis` to `[0,1,2]`
to constrain the weights of each filter tensor of size (nb_row, nb_col, stack_size).
# References
- [Dropout: A Simple Way to Prevent Neural Networks from Overfitting Srivastava, Hinton, et al. 2014](http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf)
'''
def __init__(self, m=2, axis=0):
self.m = m
self.axis = axis
def __call__(self, p):
norms = K.sqrt(K.sum(K.square(p), axis=0))
norms = K.sqrt(K.sum(K.square(p), axis=self.axis, keepdims=True))
desired = K.clip(norms, 0, self.m)
p = p * (desired / (1e-7 + norms))
p = p * (desired / (K.epsilon() + norms))
return p
def get_config(self):
return {"name": self.__class__.__name__,
"m": self.m}
return {'name': self.__class__.__name__,
'm': self.m,
'axis': self.axis}
class NonNeg(Constraint):
'''Constrain the weights to be non-negative.
'''
def __call__(self, p):
p *= K.cast(p >= 0., K.floatx())
return p
class UnitNorm(Constraint):
def __call__(self, p):
return p / K.sqrt(K.sum(K.square(p), axis=-1, keepdims=True))
'''Constrain the weights incident to each hidden unit to have unit norm.
# Arguments
axis: integer, axis along which to calculate weight norms. For instance,
in a `Dense` layer the weight matrix has shape (input_dim, output_dim),
set `axis` to `0` to constrain each weight vector of length (input_dim).
In a `MaxoutDense` layer the weight tensor has shape (nb_feature, input_dim, output_dim),
set `axis` to `1` to constrain each weight vector of length (input_dim),
i.e. constrain the filters incident to the `max` operation.
In a `Convolution2D` layer with the Theano backend, the weight tensor
has shape (nb_filter, stack_size, nb_row, nb_col), set `axis` to `[1,2,3]`
to constrain the weights of each filter tensor of size (stack_size, nb_row, nb_col).
In a `Convolution2D` layer with the TensorFlow backend, the weight tensor
has shape (nb_row, nb_col, stack_size, nb_filter), set `axis` to `[0,1,2]`
to constrain the weights of each filter tensor of size (nb_row, nb_col, stack_size).
'''
def __init__(self, axis=0):
self.axis = axis
def __call__(self, p):
return p / (K.epsilon() + K.sqrt(K.sum(K.square(p), axis=self.axis, keepdims=True)))
def get_config(self):
return {'name': self.__class__.__name__,
'axis': self.axis}
identity = Constraint
maxnorm = MaxNorm
nonneg = NonNeg
unitnorm = UnitNorm
from .utils.generic_utils import get_from_module
def get(identifier, kwargs=None):
return get_from_module(identifier, globals(), 'constraint', instantiate=True, kwargs=kwargs)
return get_from_module(identifier, globals(), 'constraint',
instantiate=True, kwargs=kwargs)
+1 -1
Ver Arquivo
@@ -2,7 +2,7 @@
from __future__ import absolute_import
import sys
from six.moves import cPickle
from six.moves import range
def load_batch(fpath, label_key='labels'):
f = open(fpath, 'rb')
+8 -4
Ver Arquivo
@@ -1,6 +1,7 @@
from __future__ import absolute_import
from .cifar import load_batch
from .data_utils import get_file
from ..utils.data_utils import get_file
from .. import backend as K
import numpy as np
import os
@@ -10,7 +11,6 @@ def load_data():
origin = "http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz"
path = get_file(dirname, origin=origin, untar=True)
nb_test_samples = 10000
nb_train_samples = 50000
X_train = np.zeros((nb_train_samples, 3, 32, 32), dtype="uint8")
@@ -19,8 +19,8 @@ def load_data():
for i in range(1, 6):
fpath = os.path.join(path, 'data_batch_' + str(i))
data, labels = load_batch(fpath)
X_train[(i-1)*10000:i*10000, :, :, :] = data
y_train[(i-1)*10000:i*10000] = labels
X_train[(i - 1) * 10000: i * 10000, :, :, :] = data
y_train[(i - 1) * 10000: i * 10000] = labels
fpath = os.path.join(path, 'test_batch')
X_test, y_test = load_batch(fpath)
@@ -28,4 +28,8 @@ def load_data():
y_train = np.reshape(y_train, (len(y_train), 1))
y_test = np.reshape(y_test, (len(y_test), 1))
if K.image_dim_ordering() == 'tf':
X_train = X_train.transpose(0, 2, 3, 1)
X_test = X_test.transpose(0, 2, 3, 1)
return (X_train, y_train), (X_test, y_test)

Alguns arquivos não foram exibidos porque demasiados arquivos foram alterados neste diff Mostrar Mais