Compare commits

2464 Commits

Author SHA1 Message Date
Francois Chollet 2ddd2bd557 Prepare new PyPI release 2016-11-25 20:52:27 -08:00
Ken Chatfield b2aebb30bf Don't add another header line to CSV logger when appending to an existing file (#4426) 2016-11-25 14:49:50 -08:00
Fariz Rahman 0a9c0ca461 Sequential : Fix trainable arg (#4509) 2016-11-25 11:59:05 -08:00
fchollet c0b32a9a04 Remove reference to legacy Graph model in tests. 2016-11-25 01:20:10 -08:00
fchollet 703d5a1298 Add dynamic trainability lightweight test 2016-11-24 23:59:51 -08:00
fchollet c5cc96a4f4 Saner way to collect trainable weights 2016-11-24 23:59:33 -08:00
fchollet de256cb5d5 Make sure ImageNet predictions are sorted 2016-11-24 23:30:00 -08:00
fchollet ce814302ac Remove support for legacy Graph model 2016-11-24 23:29:45 -08:00
Fariz Rahman 628bc6e03e ACGAN: Remove unnecessary dimension in label input (#4501) 2016-11-24 20:21:56 -08:00
Thomas Pinetz dfb606bb19 Fix border_mode = same for pooling layers documentation. (#4341) 2016-11-24 12:42:59 -08:00
Dontloo 88f3b3f75e fixed variational autoencoder visualization for Gaussian latent space (#4423) 2016-11-23 14:08:19 -08:00
Marzuk Kamal 773d4ce8cb def fbeta_score(y_true, y_pred, beta=1) (#4492)
set the default value of beta=1
2016-11-23 13:25:29 -08:00
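
For reference, the F-beta score named in the commit above combines precision P and recall R as (1 + beta^2) * P * R / (beta^2 * P + R), so a default of beta=1 gives the usual F1 score. A minimal NumPy sketch, assuming binary 0/1 labels; the function body is illustrative only and is not the Keras backend implementation:

```python
import numpy as np


def fbeta_score(y_true, y_pred, beta=1):
    """Illustrative F-beta on binary 0/1 arrays (not the Keras backend code)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    true_positives = np.sum(y_true & y_pred)
    if true_positives == 0:
        # No correct positive prediction: define the score as 0
        # instead of dividing by zero.
        return 0.0
    precision = true_positives / np.sum(y_pred)
    recall = true_positives / np.sum(y_true)
    b2 = beta ** 2
    return float((1 + b2) * precision * recall / (b2 * precision + recall))
```

With beta > 1 the score weights recall more heavily than precision; with beta < 1 it favours precision.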
Fariz Rahman 509d6d8235 Merge : Serialize output mask; Enable user arguments for callable mode (#4445)
* Update topology.py

* Update topology.py

* Update topology.py

* white space fix

* indentation fix

* add tests

* fix all tests

* add arguments arg to merge

* space after period

* add test with arguments

* add test with arguments for lambda layer too

* pep8 fixes

* fix tf test

* try fixing tf test; again

* bug fix

* finally
2016-11-23 13:24:54 -08:00
Taras Boiko 7bd5c862a2 Correctly check the output dimension for None instead of target (#4458) 2016-11-23 13:21:48 -08:00
Angelos Katharopoulos 2878f60634 Add map, foldl, foldr to the backend (#4461) 2016-11-23 13:21:13 -08:00
Luke de Oliveira 50fdb87888 adding mnist acgan example (#4475) 2016-11-23 13:19:29 -08:00
Gijs van Tulder dad7790ec3 Model summary: separate columns with a space. (#4469) 2016-11-23 11:06:30 -08:00
Marzuk Kamal 709bc5e15a tf.global_variables and tf.variables_initializer (#4490)
tf.all_variables and tf.initialize_variables are replaced by tf.global_variables and tf.variables_initializer for the future version of tensorflow
2016-11-23 11:06:10 -08:00
Ken Chatfield 06cc6d7fea Add initial epoch argument to fit functions (#4429)
* Added initial_epoch argument to fit functions in trainer

* Added unit test

* PEP8 fixes
2016-11-19 21:51:57 -08:00
EdwardRaff 97484ec9c1 Finishing Colincsl's SpatialDropout1D (#4416)
* Added SpatialDropout1D

This is a straightforward modification of SpatialDropout2D but for 1D data.

* Added SpatialDropout1D to docs

* SpatialDropout1D test

* Fixed indent issue

* Combined TF and TH dimension conditions

Use the same 1D dimensions for TensorFlow and Theano in SpatialDropout1D.

* trailing whitespace

* Removed dim_ordering variable

* Removing dim_ordering values

removing dim_ordering values as requested
2016-11-19 12:30:05 -08:00
Taras Boiko 6b04add932 Check all output dimensions for compatibility (#4420) 2016-11-19 10:10:08 -08:00
Yu Kobayashi 04ea01f385 Bug fix of Bidirectional(LSTM(..., stateful=True)) (#4424)
* Bug fix of Bidirectional(LSTM(..., stateful=True)) https://github.com/fchollet/keras/issues/4421

* Add Recurrent.from_config() test
2016-11-18 12:19:42 -08:00
Yu Kobayashi 8653060ae6 Update Travis TensorFlow to 0.11.0 (#4367) 2016-11-17 09:55:39 -08:00
Francois Chollet 8df3effa5f Merge branch 'shareable_bn' 2016-11-16 19:07:06 -08:00
Francois Chollet 771010f43b Add shareable BN (per-datastream updates). 2016-11-16 19:06:46 -08:00
Carl Thomé 8d20bac7fa Remove extraneous batch_input_shape (#4393) 2016-11-16 18:59:03 -08:00
Francois Chollet c4c4fac1ae Make BN shareable (not yet working) 2016-11-15 05:16:40 -08:00
Francois Chollet 016d85c9e6 Minor style fixes 2016-11-14 15:09:58 -08:00
Francois Chollet 3ab29205fc Merge branch 'master' of https://github.com/fchollet/keras 2016-11-14 15:08:04 -08:00
Francois Chollet fdd150eb4d Minor style fixes 2016-11-14 15:07:51 -08:00
Anton Chernyavski 789a2be8d9 Fix get_layer() by index (#4376) 2016-11-14 09:47:27 -08:00
Francois Chollet ae7ef37c1b Merge branch 'master' of https://github.com/fchollet/keras 2016-11-09 20:57:43 -08:00
Francois Chollet 94fba3d8f0 Fix Theano tests 2016-11-09 20:57:30 -08:00
Yu Kobayashi 6ac9af0a5a Fix the load_model() bug by sorting weights by names (#4338) 2016-11-09 20:36:45 -08:00
Francois Chollet e916f748db Fix Theano tests 2016-11-09 20:33:42 -08:00
Francois Chollet 92e8a20761 Remove unused set_input method 2016-11-09 18:34:09 -08:00
Francois Chollet cb3de665d1 Simplify tests 2016-11-09 18:01:19 -08:00
Francois Chollet 49a5cdf76d Improve error message 2016-11-09 18:01:06 -08:00
Francois Chollet 08a090de43 Merge branch 'master' of https://github.com/fchollet/keras 2016-11-09 17:33:49 -08:00
Francois Chollet fa3b17cd96 Minor code cleanup 2016-11-09 17:33:31 -08:00
Ken Chatfield 5266fdacf1 Bugfix to CIFAR pickle reading code in Python 3 (#4319) 2016-11-09 17:14:36 -08:00
nagachika b74c5953f0 Print EarlyStopping verbose message on_train_end. (#4332)
The message printed in on_epoch_end would be overwritten by ProgbarLogger.
2016-11-09 16:35:22 -08:00
Yu Kobayashi 00e8d20eae Theano tile() expects Python int, so casting from numpy.int32 to Python int. (#4330) 2016-11-09 16:23:22 -08:00
Gijs van Tulder e8e63e307e Theano: try not to use the old pool_* interface. (#4321) 2016-11-09 16:22:37 -08:00
Uwe Schmidt 7db6de848a Fix for issue #3965 (#4333)
* Fixes issue with resize_images and partially-defined tensors

Disclaimer: I haven't tested this with `dim_ordering == 'th'`

* PEP8 syntax
2016-11-09 16:21:37 -08:00
Matt Gardner 8360ef3a5a Add documentation to set self.built = True in MyLayer.build() (#4315)
* Added documentation to set self.built = True in MyLayer.build()

* Update writing-your-own-keras-layers.md
2016-11-07 18:19:27 -08:00
Francois Chollet d32b8fa4bd Further code cleanup 2016-11-07 17:27:41 -08:00
Francois Chollet c95c32e473 Improve docstrings 2016-11-07 15:36:57 -08:00
Francois Chollet 02fe371839 Merge branch 'master' of https://github.com/fchollet/keras 2016-11-07 12:46:54 -08:00
Francois Chollet b7b7c2ea94 Normalize default argument values 2016-11-07 12:46:41 -08:00
Francois Chollet 105dd031dd Documentation improvements 2016-11-07 12:46:18 -08:00
Joshua Loyal 4fa289166a allow for learning rate dtypes returned by numpy (#4304) 2016-11-07 10:33:11 -08:00
Carl Thomé a8bbcf611f ConvLSTM2D docstring spelling (#4306)
* Spelling

* "convolutionnal" spelling
2016-11-06 12:05:20 -08:00
Francois Chollet d5030b1f8c Add conv_lstm to examples/README 2016-11-05 15:30:33 -07:00
Francois Chollet f127b2f81d Merge branch 'imodpasteur-rebasedconvV1' 2016-11-05 13:46:02 -07:00
Francois Chollet 9d4087a1e9 Style fixes 2016-11-05 13:45:50 -07:00
Francois Chollet fd326ddf1b Merge branch 'rebasedconvV1' of https://github.com/imodpasteur/keras into imodpasteur-rebasedconvV1 2016-11-05 13:32:03 -07:00
Francois Chollet 7f42253f46 Add basic support for TF optimizers, part deux 2016-11-05 13:26:03 -07:00
Francois Chollet 18d7e5e6e4 Style fixes 2016-11-05 13:22:18 -07:00
Francois Chollet 6610880fd4 Merge branch 'master' of https://github.com/fchollet/keras 2016-11-05 13:21:38 -07:00
Arbona 11b73ae6b4 Tf dynamic 2016-11-04 21:20:30 +01:00
Carl Thomé 2b51317be8 Refactor F-score into precision and recall metrics (#4276)
* Refactor f-score into precision and recall metrics

* Docstring consistency

* Add docstring for fmeasure

* Added precision, recall, f-measure tests
2016-11-03 20:28:04 -07:00
Francois Chollet 650c2c8cf9 Add basic support for TF optimizers 2016-11-03 11:38:00 -07:00
Igor Macedo Quintanilha 49386e8da4 Bug fix when target is a SparseTensor. (#4200)
* Bug fix when target is a SparseTensor.
Check for sparsity when creating target placeholder.
Remove shape argument when creating sparse placeholder.

* Fixed ndim behavior for sparse tensor

* Fix sparse variable instantiation.

* Bug fix
2016-11-03 10:04:40 -07:00
Thang Bui 71494ffdbc changed VAE sampling variance to 1 (#4211)
* Update variational_autoencoder.py

fixed sampling bug

* Update variational_autoencoder_deconv.py

fixed variance bug
2016-11-02 15:58:32 -07:00
Francois Chollet a9b6bef062 Improve dynamic TF RNN implementation. 2016-11-02 11:51:29 -07:00
Francois Chollet 4840e435f7 Improve RNN error messages 2016-11-02 10:47:46 -07:00
Arbona 531147c877 Fix review 2016-11-02 12:08:31 +01:00
Francois Chollet 61c21ef9ee Imagenet predictions sorting fix 2016-11-01 17:39:39 -07:00
Francois Chollet 058e54061b Style fixes 2016-11-01 17:39:23 -07:00
Francois Chollet 32be731194 Some backend refactoring 2016-11-01 16:52:25 -07:00
Francois Chollet 9bf55395f1 Simplify 1D pooling implementation 2016-11-01 16:51:54 -07:00
Francois Chollet 114b82a212 Minor TF backend improvements 2016-11-01 15:26:01 -07:00
manelbaradad 7d143370d8 BUG: Deconvolution2D output shape not correctly referenced (#4251) 2016-11-01 11:24:54 -07:00
Gijs van Tulder bc6880fa34 Enable full convolution with the Theano backend. (#4250) 2016-11-01 11:03:50 -07:00
Francois Chollet c6d2ccd453 Prepare 1.1.1 release. 2016-10-31 13:12:59 -07:00
Francois Chollet cdab739471 Merge branch 'master' of https://github.com/fchollet/keras 2016-10-31 13:11:49 -07:00
Taras Boiko fee03bd5a6 Use six for wrapping in keras_test (#4235)
This will allow parameterized tests to work correctly in both 2.7 and
3.4
2016-10-31 10:51:32 -07:00
Aloïs Gruson 6fd2d43bfe Fix Theano Cudnn BatchNorm when axis!=1 (#3968)
* fix batch_norm when axis!=1

* fix dimshuffle for all backends

* moving cudnn bn fix to theano backend

* fix pep8

* dont use cudnn when bn axis is non broadcastable, ie dim=1
2016-10-28 10:51:32 -07:00
Arbona 40fd415409 Changed name example 2016-10-27 10:46:54 +02:00
Laurent Gautier 9c7020f7e7 Only allow the addition to Sequential objects of layers that are instances of Layer (#4184)
* Check that the added object is an instance of class Layer

* Update models.py

* Fix ValueError error message
2016-10-26 11:02:10 -07:00
Sean 556399cc48 Add more util docs (#4154)
* Add more util docs

* Leave out single use utils
2016-10-26 10:40:33 -07:00
Ramanan Balakrishnan bef888c2d8 add new min_delta parameter in EarlyStopping to stop in cases of minimal improvements (#4202) 2016-10-26 10:39:52 -07:00
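
As a rough illustration of what a `min_delta` threshold does, the sketch below treats an epoch as an improvement only if the monitored loss drops by more than `min_delta`. The function name and the exact bookkeeping are assumptions made for this example, not the Keras `EarlyStopping` code:

```python
def first_stop_epoch(losses, min_delta=0.0, patience=0):
    """Return the epoch index at which training would stop, or None.

    Hypothetical helper: an epoch counts as an improvement only when the
    loss drops by more than `min_delta` relative to the best value so far.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(losses):
        if best - loss > min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait > patience:
                return epoch
    return None
```

With losses `[1.0, 0.5, 0.499, 0.4985]`, a `min_delta` of 0.01 stops at epoch 2 because the gain of 0.001 is too small to count, whereas `min_delta=0` never stops.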
Stefan Wunsch a89dabe0cd Enhance doc about usage of sample weights in validation data tuple (#4199) 2016-10-26 10:18:59 -07:00
Alexander Rakhlin 80fbbc3a6a Bug fix in zca_whitening (#4181)
When calculating 'sigma', the denominator is the number of instances (axis=0), not the dimensionality (axis=1)

Proof:
http://ufldl.stanford.edu/wiki/index.php/Implementing_PCA/Whitening
http://ufldl.stanford.edu/wiki/index.php/Exercise:PCA_and_Whitening
Ng uses 2nd dim in denominator because his matrix is features x instances
2016-10-25 10:40:03 -07:00
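
The point of the fix above can be sketched in NumPy, assuming a data matrix laid out as instances x features that has already been mean-centered. The function name and the `eps` value are assumptions for this example; this is not the `ImageDataGenerator` code itself:

```python
import numpy as np


def zca_whiten(x, eps=1e-6):
    """ZCA-whiten x of shape (n_instances, n_features), assumed centered."""
    # Covariance estimate: divide by the number of instances (axis 0),
    # not by the number of features (axis 1).
    sigma = np.dot(x.T, x) / x.shape[0]
    u, s, _ = np.linalg.svd(sigma)
    # Rotate, rescale each principal direction, and rotate back.
    principal = np.dot(np.dot(u, np.diag(1.0 / np.sqrt(s + eps))), u.T)
    return np.dot(x, principal)
```

After whitening, the covariance of the result (computed with the same denominator) should be close to the identity matrix, which is a quick way to check that the denominator is right.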
Carl Thomé 7a6ee934e1 Display wrapped layers in graph visualization (#4169)
* Display wrapped layers in graph visualization

* Check parent class instead of class's module

* Check instance instead for brevity

* More consistent naming
2016-10-25 09:40:14 -07:00
Arbona 8b11f13507 Changed name 2016-10-25 17:45:28 +02:00
Francois Chollet 4401120ca6 Style fixes 2016-10-24 15:49:38 -07:00
Michael Dietz 8dd61c1dc4 Fixed https://github.com/fchollet/keras/issues/4048 : in TensorBoard callback which fails when it is not the only callback (specifically when another cbk is ReduceLROnPlateau). (#4159) 2016-10-24 15:13:39 -07:00
Roberto de Moura Estevão Filho 6849589430 Fix LiL sparse matrix on Tensorflow (#4173)
LiL sparse matrices would not work correctly due to dtype being
different. Using the sparse_coo data fixes it.
2016-10-24 13:33:45 -07:00
Jaye 4cd83631ee Update imdb_cnn.py to use GlobalMaxPooling1D (#4164) 2016-10-24 09:25:08 -07:00
Felix Sonntag 028aae19bf Fixes for Python 3 (#4121)
* Fixed weights.sort for Python 3

In Python 3 weights.sort could throw a TypeError exception, if the
names are all None

* Fixed _flattened_layers under Python 3

If self.layers is empty, an IndexError appears when accessing it. So
it’s necessary to check if it’s non-empty first

* Fixed weight sorting for Theano backend

* Added missing import statement

* Improved backend handling for weight calculation

* Simplified weight sorting and backend check

* Changed behavior of weights sorting

* Removed unnecessary import
2016-10-23 09:01:16 -07:00
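
The first fix in the commit above concerns a Python 3 behaviour: `None` values cannot be ordered, so sorting by a name that may be `None` raises `TypeError`. A minimal sketch of one workaround; the `Weight` tuple and the empty-string fallback are hypothetical stand-ins for this example, not the approach the commit necessarily took:

```python
from collections import namedtuple

# Hypothetical stand-in for a backend variable with an optional name.
Weight = namedtuple("Weight", ["name", "value"])


def sort_by_name(weights):
    """Sort weights by name, treating a missing name as an empty string."""
    return sorted(weights, key=lambda w: w.name if w.name is not None else "")
```

Because `sorted` is stable, weights that all lack a name keep their original relative order.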
jarfo 41741c38e5 Keep shape of the initial (dummy) state (#4146)
tensorflow breaks if the shape of the state changes
https://github.com/fchollet/keras/issues/4008
2016-10-22 20:23:02 -07:00
Thomas Boquet 3feca20c59 + multiprocessing in legacy - unused imports (#4139) 2016-10-21 14:58:28 -07:00
Johan Pauwels f1bc3c03ed Make build_fn argument of sckit-learn wrappers accept class methods (#4107) 2016-10-20 15:33:56 -07:00
Fariz Rahman 66e5944799 Fix Merge layer docstring (#4132) 2016-10-20 15:23:10 -07:00
Francois Chollet 6ffa6f39e6 Fix typo in Merge layer docstring. 2016-10-19 14:10:17 -07:00
Francois Chollet 94ee8e1570 Add Xception model to keras.applications. 2016-10-19 14:06:07 -07:00
happygds 3e95633b1f manually terminate threads process returned by generator_queue() (#4101)
* manually terminate threads process returned by `generator_queue()`

I recently wrote a custom video-sequence DataGenerator (based on ImageDataGenerator) for an experiment. When I use model.fit_generator as follows:
>history = model.fit_generator(train_data_generator, samples_per_epoch=train_data_generator.nb_sample,
                              nb_epoch=nb_epoch, verbose=1, callbacks=[early_stopping, model_checkpoint],
                              validation_data=test_data_generator, nb_val_samples=test_data_generator.nb_sample,
                              max_q_size=10, nb_worker=8, pickle_safe=True)
I found that validation takes much longer than training even though it contains less data.
I read the code and changed the `self.evaluate_generator()` call (line 1482) in `fit_generator` to use a multiprocessing approach, as the training process does. However, memory usage then increases quickly and the run only lasts a few epochs.
After some analysis, I think this happens because the processes are not freed after `evaluate_generator` finishes. I therefore suggest returning `generator_threads` from `generator_queue()` and manually terminating these threads in `fit_generator`, `evaluate_generator`, and `predict_generator`.

* satisfy the PEP style

* correct the PEP8's E128 error
2016-10-18 20:34:50 -07:00
Ramanan Balakrishnan 70ebb15a33 Add documentation about metrics functions (#4024)
* Add documentation about metrics functions

* Add docstrings to metrics.py and auto-generate the docs from these strings
2016-10-18 19:57:42 -07:00
Gijs van Tulder d745d9ee96 Use Theano's pool_3d function. (#4065) 2016-10-16 22:27:15 -07:00
Abishek Bhat b89a93faae Remove unused imports. (#4083) 2016-10-16 21:58:35 -07:00
Vijay Vasudevan 044071f0d5 Switch use of TF cond function to use public function. (#4064)
* Switch use of TF cond function to use public function.

Prior to newer TFs, cond was unavailable and thus was being
imported via private module namespaces.

Newer TFs expose tf.cond as the public interface.  There
are plans to remove private module namespace access so
this fixes keras to first try accessing through the public
namespace, and then going through the private one for older
versions of TF.

* PEP8 fix
2016-10-14 14:27:15 -07:00
ηzw 79c1331432 Remove unused import statement (#4053) 2016-10-14 09:16:56 -07:00
Jayanth Koushik 86f28494a5 Return decay from get_config of all optimizers (#4052) 2016-10-13 15:25:50 -07:00
Yu Kobayashi d53a1cd0c0 Python 3 support of image_ocr.py (#4049)
I fixed to support Python 3.
2016-10-13 13:53:35 -07:00
Arbona 2c96373a41 remove another useless check 2016-10-13 21:30:01 +02:00
Arbona 731e1bb206 remove a useless check 2016-10-13 21:28:51 +02:00
Arbona c1a72b3644 More test and fixed dropout 2016-10-13 20:58:01 +02:00
fchollet e52740f09a Add Gitter link to README 2016-10-12 20:11:43 -07:00
fchollet 5dd8c5c10c Padding style fixes. 2016-10-12 18:02:39 -07:00
Dmitry Lukovkin 169c0896d6 Make ZeroPadding2D optionally asymmetric (#3595)
* Make ZeroPadding2D and ZeroPadding1D optionally asymmetric

* Make padding argument polymorphic.
Add test case for asymmetric padding.
Remove excessive imports.

* Fix layer config saving.

* Duck typing (as soon as test passes tuple as a list)

* Doc update

* Set padding value for the missing keys to 0.
Raise exception if unexpected keys are found in the padding dict.

* Add test for ZeroPadding1D
2016-10-12 17:48:57 -07:00
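
The dictionary handling described in the commit above (missing keys default to 0, unexpected keys raise) can be sketched with `np.pad` on a single 2D array. The key names `top_pad`, `bottom_pad`, `left_pad` and `right_pad` and the helper itself are assumptions for this example, not the layer's actual code:

```python
import numpy as np

# Assumed key names for this sketch.
ALLOWED_KEYS = {"top_pad", "bottom_pad", "left_pad", "right_pad"}


def zero_pad_2d(image, padding):
    """Zero-pad a (rows, cols) array asymmetrically from a padding dict."""
    unexpected = set(padding) - ALLOWED_KEYS
    if unexpected:
        raise ValueError("Unexpected padding keys: %s" % sorted(unexpected))
    # Any side that is not mentioned gets no padding.
    top = padding.get("top_pad", 0)
    bottom = padding.get("bottom_pad", 0)
    left = padding.get("left_pad", 0)
    right = padding.get("right_pad", 0)
    return np.pad(image, ((top, bottom), (left, right)), mode="constant")
```

For example, `zero_pad_2d(np.ones((2, 2)), {"top_pad": 1, "right_pad": 2})` adds one zero row on top and two zero columns on the right and leaves the other two sides untouched.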
ftence 1bc0468ada Applied imagenet mean pixel on BGR instead of RGB. (#4027) 2016-10-12 16:59:56 -07:00
Gijs van Tulder 9a411f367d Use Theano's new theano.nnet.conv3d interface. (#4039) 2016-10-12 16:57:50 -07:00
Jayanth Koushik 6074a18ec4 Fixed typo in Adamax (#4043)
Fixed a typo in Adamax which prevented it from using explicit decay.
2016-10-12 16:57:22 -07:00
Arbona 0e7f3e04b0 pep fixed 2016-10-12 22:11:22 +02:00
Arbona 53552b1d6e Various fix 2016-10-12 22:00:55 +02:00
Taras Boiko d7d1db5d79 Test AveragePooling2D in test_average_pooling2d (#4034) 2016-10-12 08:21:21 -07:00
Fariz Rahman 9d7a2338b4 imdb fasttext speedup (#4026)
* imdb fasttext speedup

* Lambda -> GlobalAveragePooling1D
2016-10-11 11:01:11 -07:00
Taras Boiko 6e42b0e4a7 Added ability to return more than one metric from a function (#3907) 2016-10-11 10:54:02 -07:00
Gijs van Tulder ef7911310d Use Theano's cuDNN batch normalization for training. (#4023) 2016-10-11 10:52:07 -07:00
Ramanan Balakrishnan 999f402829 add KL divergence to metrics (#4025) 2016-10-11 10:50:44 -07:00
Bas Veeling 85c2d28e99 ReduceLROnPlateau fix for cooldown=0 (Fixes #3991) (#4011) 2016-10-10 13:18:58 -07:00
Arbona 6b7421c448 Various fix 2016-10-09 10:46:04 +02:00
fchollet 7df184d3aa Style touch-ups 2016-10-08 15:53:24 -07:00
Abishek Bhat 197005a791 Correct metrics usage in getting started guide. (#3993)
As the code
[here](https://github.com/fchollet/keras/blob/master/keras/engine/training.py#L662) shows, compiling a model with `metrics = [name_of_the_metric_function]` works; however, the documentation suggests that `accuracy` is the only supported string representation.
2016-10-07 23:34:21 -07:00
Ramanan Balakrishnan 52ee2380e4 Add top-k classification accuracy metrics (#3987)
* add categorical accuracy metric which tracks over top-k predictions

* remove top_k_categorical_accuracy from being tested together with other all_metrics

* fix in_top_k to work with batches. correct metrics.py and test_metrics.py appropriately

* style fixes for documentation on in_top_k function

* default to k=5 for top_k_categorical_accuracy metric
2016-10-07 23:32:19 -07:00
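
Top-k categorical accuracy counts a prediction as correct when the true class is among the k highest-scoring classes. A minimal NumPy sketch over a batch, assuming one-hot targets; it is illustrative only and does not use the backend `in_top_k` function mentioned in the commit above:

```python
import numpy as np


def top_k_categorical_accuracy(y_true, y_pred, k=5):
    """Fraction of rows whose true class is among the k largest scores."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    true_classes = np.argmax(y_true, axis=-1)
    # argsort is ascending, so the last k columns hold the top-k classes.
    top_k = np.argsort(y_pred, axis=-1)[:, -k:]
    hits = np.any(top_k == true_classes[:, None], axis=-1)
    return float(np.mean(hits))
```

When k is at least the number of classes, every row counts as a hit and the metric is 1.0.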
Anish Shah 530eff62e5 [issue #3942] Add GlobalMaxPooling3D and GlobalAveragePooling3D (#3983) 2016-10-07 15:06:19 -07:00
Francois Chollet 4de7eaa6a8 Update docs 2016-10-06 15:38:01 -07:00
Francois Chollet 8281988842 Style fixes 2016-10-06 15:01:17 -07:00
Francois Chollet 4ed7138685 Style fixes 2016-10-06 14:55:22 -07:00
Carl Thomé 6689189819 Add F-score metric to metrics.py (#3895)
* Added optional path argument

* Added optional field name argument

* Added LambdaCallback callback

* Fixed on_epoch_begin assignment

* Match default signatures

* Whitespace

* Test LambdaCallback examples

* Only test process termination

* Imports

* Fixed test

* Wait on process to terminate

* Add zero threshold and set F measure to zero if no true samples exist

* Reduce zero threshold

* Flip thresholded non-zero count

* Add F measure test

* Updated test

* Remove lambda, simplify

* Whitespace

* Update docstring

* Update test

* Whitespace
2016-10-06 14:53:53 -07:00
Emad El-Haraty 0ce7e4976a Descriptions of examples as a README.md file, allowing for easier browsing in github (#3982) 2016-10-06 11:17:22 -07:00
Hengkai Guo 6b18a908b8 Fix shape inference error for newly version Tensorflow in ctc_label_dense_to_sparse (#3955) 2016-10-04 11:21:31 -07:00
Gunnar Läthén 570fdf31c5 Python3 fix for deserialization of closures (#3961) 2016-10-04 11:16:44 -07:00
Seonghyeon Nam 929669bd1b Remove a print message when using global pooling (#3963) 2016-10-04 11:15:16 -07:00
Roberto de Moura Estevão Filho 240fd5b68e Fix control_flow_ops import (#3948)
* Fix control_flow_ops import

Old access was not working on new version of tensorflow. This should
work for all versions.

* Fix indentation
2016-10-03 09:42:16 -07:00
Arbona 1d0d79f61a Various fix 2016-10-03 11:43:24 +02:00
Arbona b5dddeb419 Removed notebook and added example in python 2016-10-03 10:45:53 +02:00
Andre Simpelo 9194052a94 Fixed dead link in batch norm documentation (#3937)
Fixed dead link for the references in the Batch Normalization documentation
2016-10-01 20:37:42 -07:00
fchollet e0d871b7dc Restructure docs for Applications module 2016-10-01 15:19:12 -07:00
Sean c455a19f8e Change HDF5Matrix so start and end are optional (#3933) 2016-10-01 12:55:31 -07:00
Francois Chollet d864512631 Fix flaky test 2016-10-01 00:37:21 -07:00
Sean 6ee5d61c91 HDF5Matrix documentation (#3931) 2016-10-01 00:14:39 -07:00
fchollet 04df170bea Merge branch 'master' of ssh://github.com/fchollet/keras 2016-10-01 00:11:45 -07:00
fchollet 5f58a6d2ca Support all backends, dim orderings for music CRNN 2016-10-01 00:11:39 -07:00
Yu Yin ffff5e99aa Fix summary param counting problem (#3661) (#3884)
* Fix summary param counting problem (#3661)

* ...recursively

* Fix default parameter
2016-09-30 22:15:10 -07:00
Francois Chollet 8fab33c245 Make deconv VAE compatible with both dim orderings 2016-09-30 16:26:50 -07:00
Eder Santana 3bf8964355 Keras is TF first. Fix TH first example (#3914)
* Keras is TF first. Fix TH first example

* Use K.set_image_dim_ordering('th')
2016-09-29 10:57:08 -07:00
JM Arbona a3697d097d Added recurrent convolutionnal layer 2016-09-29 10:18:24 +02:00
Thomas Boquet 51c85dd8d6 Bypass shape inference in deconv2d and use the output shape provided by the user (#3838)
* bypass shape inference in deconv2d

* * more doc in deconv layer

* more deconv layers in var autoencoder example

* * typo doc

* replicate deconv example with paper's params

* replicate example with paper's params

* typo doc

* + relus in the deconv

* typo in var autoencoder example

* + mult by ndim

* style fixes

* pep8
2016-09-28 13:40:44 -07:00
Nithish deva Divakar 31f41b9822 typos (#3869)
Added missing numpy imports in examples
2016-09-28 12:30:36 -07:00
M Clark 458576bbe7 List files in alphabetical order (#3871)
`os.listdir` to `sorted(os.listdir)` for alphabetical order instead of arbitrary order. Following PR#3751 this allows mask and images with the same name to be read together.
2016-09-28 12:30:21 -07:00
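
`os.listdir` returns entries in arbitrary order, so two directories holding identically named files only line up once both listings are sorted. A small sketch of the idea; the helper name and directory layout are hypothetical:

```python
import os


def paired_files(image_dir, mask_dir):
    """Pair files from two directories by alphabetical position."""
    images = sorted(os.listdir(image_dir))
    masks = sorted(os.listdir(mask_dir))
    return list(zip(images, masks))
```

This pairing relies on both directories containing the same set of filenames; if one directory has extra files, the pairs silently shift.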
Yu Yin e3a64cc8a7 Choose format according to filename when plotting (#3883) 2016-09-28 11:43:23 -07:00
Francois Chollet 9045616bda Revert adadelta lr 2016-09-27 10:50:35 -07:00
Francois Chollet 25dbe8097f Update adadelta default learning rate 2016-09-27 09:56:58 -07:00
fchollet fb6a2941b9 Fix typos 2016-09-24 22:19:32 -07:00
fchollet ed131973ef Fix music tagger application 2016-09-24 22:12:22 -07:00
Keunwoo Choi 43060d8c7d add audio models: audio_convnet and audio_conv_rnn (#3718)
* add audio models: audio_convnet and audio_conv_rnn

* add audio models: audio_convnet and audio_conv_rnn

* remove white spaces at the end of lines

* add audio_conv_utils.py, update applications.md

* remove useless line in example in application.md

* remove useless line in example in application.md

* rename models (MusicTaggerCNN,CRNN), BN mode=0 weights

* pep8

* remove MusicTaggerCNN, add include_top argument

* update to follow pep8
2016-09-24 19:53:47 -07:00
fchollet d5f1250a8b Update imagenet prediction decoding utilities 2016-09-24 11:46:41 -07:00
Bas Veeling 4c01c0c4d7 ReduceLROnPlateau Callback and CSVLogger Callback (#3780)
* ReduceLROnPlateau Callback and CSVLogger Callback

* Added documentation and cleanup.

* Added examples.

* Added test for ReduceLROnPlateau()

* Minor changes to naming.

* Added epsilon for lr comparison.

* Fix sensitivity issue

* PEP8
2016-09-23 21:16:19 -07:00
danstowell af28101af1 Functional API guide: fix variable names "loss"->"output" (#3856)
Some of the variable names in this guide were misleadingly named. The outputs were named as `*_loss` implying that they held loss values, whereas they in fact held the outputs. It rather confused me; I believe my proposed naming is clearer.
2016-09-23 08:59:36 -07:00
Flynn, Michael D 56aa9f364a Add cropping layers to documentation (#3853)
* Correct documentation for Cropping3D layer

* Add Cropping layers to documentation
2016-09-22 20:46:22 -07:00
Taras Boiko f0d9867d09 Changed ELU implementation to use native ops (#3845) 2016-09-22 11:08:21 -07:00
Carl Thomé cfc9b4d41d LambdaCallback (#3760)
* Added optional path argument

* Added optional field name argument

* Added LambdaCallback callback

* Fixed on_epoch_begin assignment

* Match default signatures

* Whitespace

* Test LambdaCallback examples

* Only test process termination

* Imports

* Fixed test

* Wait on process to terminate
2016-09-22 09:19:51 -07:00
Fariz Rahman de66211afb Set theano as default backend for windows users (#3831)
* Set theano as default backend for windows users

* Update __init__.py
2016-09-21 21:12:06 -07:00
M Clark 414d5f0978 make ImageDataGenerator behaviour fully seedable/repeatable (#3751)
* make ImageDataGenerator behaviour fully seedable/repeatable

This makes ImageDataGenerator fully seedable.
- the seed argument in fit is now used
- the seed argument in flow and flow_from_directory now affects
transforms
- added example to docs of transforming images and masks together
- added test of using two seeded streams at once

* implemented requested changes

- PEP8
- explicit names
- classes=None
- remove test
2016-09-21 21:11:39 -07:00
Fariz Rahman 99bd066f38 TimeDistributed : unroll RNN when using TF backend (#3835)
* TimeDistributed : unroll RNN when using TF backend

TF dynamic rnn not working with ndim > 3

* Update wrappers.py

* Update wrappers.py
2016-09-21 17:31:46 -07:00
ηzw 82a22b20fc Update default dim_ordering (#3832)
* Update default dim_ordering

* Update default dim_ordering
2016-09-21 11:32:08 -07:00
Francois Chollet 25ed701dbd Merge branch 'master' of https://github.com/fchollet/keras 2016-09-20 21:40:07 -07:00
Francois Chollet 875c521413 Update deep dream example 2016-09-20 21:39:51 -07:00
kuza55 7b8363632e Attempted fix for #3801 (#3827) 2016-09-20 14:57:08 -07:00
kuza55 06f18fa1b9 Matthews Correlation fix and test (#3822) 2016-09-20 09:19:00 -07:00
Taras Boiko 54fc646537 Split multitest in test_recurrent (#3818) 2016-09-20 08:43:42 -07:00
Francois Chollet b2e3780e8c Prepare PyPI release 2016-09-19 13:18:22 -07:00
Francois Chollet 0b04ac3117 Fix TF RNN dynamic behavior 2016-09-19 11:01:33 -07:00
Francois Chollet 90d0eb9b88 Regularizers style fixes 2016-09-18 15:27:45 -07:00
fchollet f2aa89f443 Freeze list of trainable weights at compile time 2016-09-18 10:41:37 -07:00
kuza55 2a319c7255 Add exception when trying to reuse regularizers (#3803)
My reading of regularizers is that they cannot be reused, but it doesn't actually fail in any way and seems like it results in only regularizing the last layer. Having an exception prevent this would probably improve the ergonomics.
2016-09-17 20:22:26 -07:00
Francois Chollet 4fb3f1b3f3 Make TF dynamic RNN work without states. 2016-09-16 17:15:18 -07:00
Furiously Curious 072d33599b Added Gitter channel badge (#3744)
* Added Gitter channel badge

Assigned @fchollet as channel admin on Gitter

* Link fix
2016-09-15 18:10:06 -07:00
Seonghyeon Nam 56f3c85b87 Fix ValueError(ndim of gamma and beta) of batch normalization when using Theano (#3740)
* Fix ndim mismatch error when using theano

* Change keras backend call
2016-09-15 18:09:02 -07:00
Francois Chollet 8b42fff90e Fix flaky test 2016-09-14 15:15:00 -07:00
Francois Chollet 1dc5d43d32 Remove deprecated resnet50 example 2016-09-14 15:03:26 -07:00
Francois Chollet ee2d08ff79 Fix activity regularization for wrapper layers 2016-09-14 15:02:05 -07:00
Francois Chollet 305b3bed74 Finalize streamlining of conv1d. 2016-09-14 14:39:47 -07:00
Francois Chollet 9f6acd960c Simplify Conv1D ops. 2016-09-14 14:18:15 -07:00
Flynn, Michael D 672890b1c8 Add AtrousConvolution1D to convolutional layers (#3763)
* Add `AtrousConvolution1D` to convolutional layers

* Add test for `AtrousConvolution1D` layer

* Add AtrousConvolution1D to docs
2016-09-14 11:40:04 -07:00
Francois Chollet c58bcc2c02 Fix deconv test 2016-09-13 16:56:39 -07:00
Francois Chollet 82318263a1 Set default backend to TF 2016-09-13 16:24:43 -07:00
Francois Chollet d90e1db50b Revert default backend to TH 2016-09-13 15:37:38 -07:00
Francois Chollet 8af0264a77 Set TensorFlow as default backend for new installs 2016-09-13 15:19:13 -07:00
Junwei Pan 8193287e08 Update docoment for callbacks.py: add the specification of auto mode (#3758) 2016-09-13 08:52:12 -07:00
fchollet 13bd33e73f Merge branch 'master' of ssh://github.com/fchollet/keras 2016-09-10 22:54:51 -07:00
fchollet b2e8d5ab7c Add support for LR decay in all optimizers 2016-09-10 12:34:05 -07:00
Ardalan a375cb322f fastText: adding n-gram embeddings for higher test_set accuracy (#3733)
* adding bi-gram embeddings for better test accuracy

* - add arbitrary n-gram range
- fix typos

* - fixing white spaces

* - add comment
2016-09-10 10:35:15 -07:00
dolaameng d9c4d8a76a update examples/neural_doodle.py based on issues #3731 (#3741) 2016-09-10 10:24:39 -07:00
kuza55 79edae58d5 Initial Sparse Matrix Support (#3695)
* Minimal SparseTensor support for TensorFlow

* Basic Theano support for Sparse dot product

* Sparse Input for Both + Sparse Concat for TF

* Fixed issue with _keras_shape for sparse Inputs

* pep8

* Cleanup + Theano concat (untested)

* Bug fix & pep8

* Fix Theano concat

* Bugfix & simplification

* Next step: Unit tests

* Basic unit test for sparse dot; TF works, TH fails

* Fix KTH is_sparse

* pep8

* more tests, sparse KTH.eval, pep8

* sparse model test

* address code review comments

* make sparse boolean in K.placeholder

* skip sparse tests when TH.sparse import fails

* pep8

* pep8

* fixed flakey test, auto-dense in KTH.eval

* fixed some more len/shape issues for fit_generator

* fixed some more len/shape issues for prediction

* Added better exceptions when theano.sparse fails to import

* betterer

* pep8
2016-09-09 16:26:37 -07:00
iampat 6675776640 Fix a small typo in help files (#3728)
impoprt --> import
2016-09-08 17:35:38 -07:00
dolaameng 40685c3b2a add examples/neural_doodle.py (#3724) 2016-09-08 10:15:57 -07:00
Francois Chollet 25874ceab2 Update TD wrapper 2016-09-07 19:32:26 -07:00
Tim Shi 4b2093ef67 allow output size different from state size (#3709) 2016-09-07 15:52:06 -07:00
kuza55 9bc2e60fd5 TensorBoard callback improvements (#3656)
* TensorBoard callback improvements

* Removed name improvement in TensorBoard callback

* Fix variables broken by removing name fixups

* Update callbacks.py
2016-09-07 12:59:08 -07:00
antonmbk 685ce7573d Added stacked what where autoencoder. (#3616)
* Added stacked what where autoencoder.

SWWAE uses residual blocks. Trains fast. Creates very good reconstructions.

* Added newline at end for PEP8

* Went through PEP8 errors and corrected all (except for the imports which follow the numpy seed, but this should be ok).  Also, for the pool_size of 2, we halved the number of feature maps and the number of epochs, and it still trains a net that can very nicely reconstruct the input.

* Added spaces around - and + when they are used as binary operators (more PEP8).

* In decoder, the index of the features and pool size and wheres are all equal to nlayers-1-i, so set ind variable to this value and passed it to them.

* With ind variable in decoder, don't need two lines for the upsampling layer.

* Added title to plot, got rid of ticks on plot.

* PEP8 for * binary operator. Corrected some grammar issues in the docstring.
2016-09-07 11:05:41 -07:00
dolaameng f5ad1c5753 fix bug in neural_style_transfer example for image_dim_ordering=tf (#3715)
* fix bug in neural_style_transfer example for image_dim_ordering=tf

* fix PEP8 mixed space and tab
2016-09-07 10:57:23 -07:00
Francois Chollet cc92025fdc Make examples agnostic to image_dim_ordering 2016-09-06 15:53:56 -07:00
Fariz Rahman f05cd95fad Dot/cos merge : bug fix (#3708) 2016-09-06 15:04:26 -07:00
kuza55 4325843ef0 Add Matthews correlation coefficient to metrics (#3689)
* Add Matthews correlation coefficient to metrics

I needed this for a Kaggle competition and it seemed useful in general so I thought I'd contribute it back.

* Enabled test for matthews metric

* Remove unnecessary cast garbage

* Addresses code review comments

* Renamed to matthews_corrcoef to be consistent with sklearn

* Update test_metrics.py

* pep8

* rename to mathews_correlation

* Update metrics.py

* Fixed typo
2016-09-06 13:42:56 -07:00
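For reference, the Matthews correlation coefficient for binary labels can be computed directly from the confusion-matrix counts. The following is a minimal pure-Python sketch, not the code added to `metrics.py`; the name `matthews_corrcoef_binary` and the choice to return 0.0 when the denominator is zero are assumptions made for illustration.

```python
import math


def matthews_corrcoef_binary(y_true, y_pred):
    """Matthews correlation coefficient for 0/1 labels (illustrative helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: undefined cases (e.g. a single predicted class) map to 0.0.
        return 0.0
    return numerator / denominator
```

The result ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction).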
Arel Cordero 607635d2ce Optionally load weights by name (#3488)
* Adding feature to load_weights by name

Squashed commit of the following:

commit fd47e763855c34ed78d26ee441d83e0e63f08119
Author: Arel Cordero <arel@ditto.us.com>
Date:   Thu Aug 18 16:02:14 2016 +0000

    typo

commit d0b06c03080131c55ab4777064a196ff339ad7df
Author: Arel Cordero <arel@ditto.us.com>
Date:   Thu Aug 18 15:52:35 2016 +0000

    update documentation for "load_weights"

commit 844cfc2e8c9c6f267799a22ed54ac4d75807c5ab
Author: Arel Cordero <arel@ditto.us.com>
Date:   Thu Aug 18 02:42:10 2016 +0000

    batch updating weights

commit f361a70da4b40b961f1af9c8f1c3cd26273d0cad
Author: Arel Cordero <arel@ditto.us.com>
Date:   Thu Aug 18 02:29:17 2016 +0000

    removing pudb line

commit 738de4c371503626b4c9dbae6428fb279b368a76
Author: Arel Cordero <arel@ditto.us.com>
Date:   Wed Aug 17 19:56:51 2016 +0000

    adding unit tests for loading weights by name

commit cb0971b3cfe62452ab445e4034098cab2be3031b
Author: Arel Cordero <arel@ditto.us.com>
Date:   Tue Aug 16 23:45:32 2016 +0000

    cleaning up code based on comments

commit ef08fd2c9f5d3c65359cbdf5b090e08733a518de
Author: Arel Cordero <arel@ditto.us.com>
Date:   Tue Aug 16 04:50:46 2016 +0000

    debugging

commit 0d74f0e997960886b1044c26001de6cd6ad90bb9
Author: Arel Cordero <arel@ditto.us.com>
Date:   Tue Aug 16 04:15:43 2016 +0000

    optionally load model by name

* changed random file names to use tempfile module

* clean up documentation strings

* clarifying documentation
2016-09-06 11:42:31 -07:00
Abishek Bhat b8fddc862e Add missing Softmax activation memnn. (#3706)
The implementation of bAbi [End to End Memory
Network](https://arxiv.org/pdf/1503.08895v5.pdf) in the example
seems to be missing the Softmax Layer.

Quoting the paper.

> The query q is also embedded (again, in the simplest case via another embedding matrix
B with the same dimensions as A) to obtain an internal state u. In the embedding space, we compute
the match between u and each memory m<sub>i</sub> by taking the inner product followed by a softmax.

Also, the question encoder
[here](https://github.com/fchollet/keras/blob/0df0177437ce672d654db6d7edfdc653aaf67533/examples/babi_memnn.py#L186) seems to sum over the probabilities and the question vector as suggested in the original paper.

> Output memory representation: Each x<sub>i</sub> has a corresponding output vector c<sub>i</sub> (given in the
simplest case by another embedding matrix C). The response vector from the memory o is then a
sum over the transformed inputs c<sub>i</sub> , weighted by the probability vector from the input.

I tried running the model (with and without the intermediate softmax)
against the _Single Supporting Fact_ en-10k dataset and found that the
network with the intermediate softmax trained a lot faster (95% at 100 epochs) than the one without it (67% at 100 epochs).

Network without the Softmax activation in the Input Memory Representation at epoch=100
======================================================================================
```
Iteration 10
Train on 10000 samples, validate on 1000 samples
Epoch 1/10
10000/10000 [==============================] - 8s - loss: 0.0549 - acc: 0.9819 - val_loss: 1.8088 - val_acc: 0.6470
Epoch 2/10
10000/10000 [==============================] - 6s - loss: 0.0612 - acc: 0.9802 - val_loss: 1.7839 - val_acc: 0.6650
Epoch 3/10
10000/10000 [==============================] - 6s - loss: 0.0542 - acc: 0.9812 - val_loss: 1.7595 - val_acc: 0.6750
Epoch 4/10
10000/10000 [==============================] - 6s - loss: 0.0538 - acc: 0.9826 - val_loss: 1.8198 - val_acc: 0.6670
Epoch 5/10
10000/10000 [==============================] - 6s - loss: 0.0590 - acc: 0.9790 - val_loss: 1.7891 - val_acc: 0.6650
Epoch 6/10
10000/10000 [==============================] - 6s - loss: 0.0548 - acc: 0.9803 - val_loss: 1.7682 - val_acc: 0.6790
Epoch 7/10
10000/10000 [==============================] - 6s - loss: 0.0455 - acc: 0.9841 - val_loss: 1.8394 - val_acc: 0.6730
Epoch 8/10
10000/10000 [==============================] - 6s - loss: 0.0559 - acc: 0.9797 - val_loss: 1.7764 - val_acc: 0.6650
Epoch 9/10
10000/10000 [==============================] - 6s - loss: 0.0488 - acc: 0.9835 - val_loss: 1.7711 - val_acc: 0.6620
Epoch 10/10
10000/10000 [==============================] - 6s - loss: 0.0502 - acc: 0.9834 - val_loss: 1.8225 - val_acc: 0.6700
```

Network with Softmax Activation in the Input Memory Representation at epoch=100
===============================================================================

```
Iteration 10
Train on 10000 samples, validate on 1000 samples
Epoch 1/10
10000/10000 [==============================] - 6s - loss: 0.0084 - acc: 0.9972 - val_loss: 0.2426 - val_acc: 0.9520
Epoch 2/10
10000/10000 [==============================] - 7s - loss: 0.0152 - acc: 0.9946 - val_loss: 0.2063 - val_acc: 0.9560
Epoch 3/10
10000/10000 [==============================] - 6s - loss: 0.0104 - acc: 0.9969 - val_loss: 0.2010 - val_acc: 0.9540
Epoch 4/10
10000/10000 [==============================] - 6s - loss: 0.0163 - acc: 0.9959 - val_loss: 0.2023 - val_acc: 0.9580
Epoch 5/10
10000/10000 [==============================] - 6s - loss: 0.0136 - acc: 0.9962 - val_loss: 0.2007 - val_acc: 0.9560
Epoch 6/10
10000/10000 [==============================] - 6s - loss: 0.0152 - acc: 0.9953 - val_loss: 0.1989 - val_acc: 0.9570
Epoch 7/10
10000/10000 [==============================] - 7s - loss: 0.0085 - acc: 0.9969 - val_loss: 0.2113 - val_acc: 0.9490
Epoch 8/10
10000/10000 [==============================] - 7s - loss: 0.0116 - acc: 0.9972 - val_loss: 0.2346 - val_acc: 0.9500
Epoch 9/10
10000/10000 [==============================] - 7s - loss: 0.0106 - acc: 0.9970 - val_loss: 0.2052 - val_acc: 0.9550
Epoch 10/10
10000/10000 [==============================] - 7s - loss: 0.0132 - acc: 0.9963 - val_loss: 0.2114 - val_acc: 0.9500
```
2016-09-06 11:33:11 -07:00
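The two passages quoted from the paper in the commit above can be sketched in a few lines: the match between the query state u and each memory m<sub>i</sub> is an inner product followed by a softmax, and the response o is the sum of the output vectors c<sub>i</sub> weighted by those probabilities. This is a plain-Python illustration only, not the code in `examples/babi_memnn.py`; the names `softmax` and `memory_response` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memory_response(u, memories, outputs):
    """Return (p, o): attention probabilities and the weighted response vector."""
    # Match between u and each memory m_i: inner product, then softmax.
    scores = [sum(u_k * m_k for u_k, m_k in zip(u, m)) for m in memories]
    p = softmax(scores)
    # Response o: sum over the output vectors c_i, weighted by p_i.
    dim = len(outputs[0])
    o = [sum(p_i * c[k] for p_i, c in zip(p, outputs)) for k in range(dim)]
    return p, o
```

Without the softmax, the raw inner products would be used as weights, which is the variant the commit reports as training more slowly.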
dolaameng 0df0177437 make image parameters more consistent (#3672)
* change of variable names in examples/neural_transfer_style for consistency

* add docstring to keras.preprocessing.image.load_img()
2016-09-06 10:29:26 -07:00
Pedro S f90cbcd1e3 Added regularization option to BatchNormalization layer (#3671)
* Added regularization option to BatchNormalization layer

* Update normalization.py

* Added regularization to BN test

* Fixed indentation

* Removed trailing whitespace and refixed indentation
2016-09-02 08:15:51 -07:00
Fariz Rahman 870d7f7f93 Lambda layer : Allow multiple inputs (#3668) 2016-09-01 13:48:21 -07:00
Roberto de Moura Estevão Filho 799bec66a2 CTC import compatibility with tensorflow 0.10 (#3650)
* CTC import compatibility with tensorflow 0.10
Try except clause to import ctc_loss in new path on tensorflow 0.10.

* Fixed ctc_decode and added tests for tensorflow.
ctc_decode when using beam search decoder has been fixed to conform with
tensorflow API. Function documentation has been updated to reflect the
changes. Two tests, for greedy and beam search decoding, have also been
added to test_backends.py.

* Fix pep8 styling.

* Fixed styling on long lines on ctc_decode tests.
2016-09-01 11:33:11 -07:00
Matt 2321fbbc1d Fix Batch Norm compatibility with 3D inputs (#3666)
* Fix Batch Norm compatibility with 3D inputs

the theano backend now uses dnn_batch_normalization which only supports
up to 4-dimensional input. This breaks any 5-d layers such as 3D
convolutions.

* using intermediate variable
2016-09-01 10:22:23 -07:00
gw0 48ae7217e4 Fix TensorFlow RNN backwards support. (#3662) 2016-09-01 05:56:42 -07:00
Francois Chollet 6f54b233f1 Fix Theano input shape inference in InputLayer 2016-08-31 21:49:43 -07:00
ηzw 1bf1055395 Update docs for SpatialDropouts (#3652) 2016-08-31 13:55:21 -07:00
gw0 6417d90d5c Fix #2814 lambda function serialization and deserialization (#3639)
* Remove old-style function attributes.

* Fix lambda function serialization and deserialization.
2016-08-31 10:05:05 -07:00
fchollet c939cebf0d Theano rnn fix when input_dim = 1 2016-08-30 18:35:22 -07:00
kuza55 7ae36d132a Write TensorBoard Histograms with Tensor names (#3635)
Resolves the acute symptoms in https://github.com/fchollet/keras/issues/3357

Doesn't address the question of having a better __repr__ since that is a much wider change.
2016-08-30 16:55:38 -07:00
Francois Chollet c478409dad Fix weight constraint sharing issue 2016-08-30 12:54:42 -07:00
Frédéric Bastien 109441a708 Small speed up by preventing transfer to CPU or copy on the CPU just to get the shape. (#3631) 2016-08-30 12:41:47 -07:00
Fariz Rahman b267e8293d Update sequential-model-guide.md (#3630) 2016-08-30 09:43:41 -07:00
Francois Chollet 3a4c683d5c Update download path for babi dataset 2016-08-29 13:03:36 -07:00
Sean Löfgren d5649da5f8 update pytest config for pep8 tests (#3617) (#3619) 2016-08-29 11:20:39 -07:00
Fariz Rahman 9c28d21b4f Fix lambda layer docstring (#3604) 2016-08-29 10:44:19 -07:00
kuza55 9e58b8237b Enable colocate_gradients_with_ops=True (#3620)
By default TensorFlow allocates all gradient matrices on gpu:0, which makes it pretty much impossible to parallelize a large model.

colocate_gradients_with_ops puts these matrices next to the operations, allowing you to split your model across multiple GPUs. I ran into this issue myself and this fixed it for me.

I think it's also meant to set gradient computations to be done on the device where the operations are stored, but my belief about that comes from https://github.com/tensorflow/tensorflow/issues/2441

I'm not sure why this isn't the default in TF, so I'm not sure if this should be behind a flag or something, but having to make my own patches to keras to do multi-GPU training seems like the wrong answer.
2016-08-29 10:44:01 -07:00
Francois Chollet b184c76205 Update docs 2016-08-28 14:29:40 -07:00
fchollet 065fb2a74c Add global pooling layers 2016-08-28 14:22:15 -07:00
Francois Chollet a0a0d42630 Fix example in doc 2016-08-28 13:20:51 -07:00
Francois Chollet 756153899a Merge branch 'master' of https://github.com/fchollet/keras 2016-08-28 13:10:47 -07:00
Francois Chollet e02554412f Fix example in doc 2016-08-28 13:09:33 -07:00
ηzw ca37e806b9 Fix docs (#3609)
* Fix typo

* Fix typo

* Fix docstring

* Remove the unnecessary argument in docstring
2016-08-28 11:22:16 -07:00
Francois Chollet fe0347dbf0 Update docs 2016-08-28 02:33:50 -07:00
Francois Chollet 4984c5fc7c Update documentation 2016-08-28 02:03:14 -07:00
Francois Chollet f605769af9 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-28 01:11:08 -07:00
Francois Chollet 534f6b7975 Remove flaky test 2016-08-28 01:10:53 -07:00
ηzw ee8fd78383 Fix docstring in Locally-connected Layers (#3607) 2016-08-28 01:07:58 -07:00
Francois Chollet fbc4f37037 Example touch-up 2016-08-27 20:28:03 -07:00
Francois Chollet f23f2ff2c9 Add keras.applications, refactor 2 convnet scripts 2016-08-27 20:27:49 -07:00
Francois Chollet d0659327bd Prepare 1.0.8 release. 2016-08-27 17:04:58 -07:00
Nithish deva Divakar 88b301f182 Update io_utils.py (#3577)
* Update io_utils.py

Fix for wrong input dimension when using HDF5matrix for loading data

* Update io_utils.py
2016-08-27 09:45:31 -07:00
Yanush Viktor d5e16807d2 Update image.md (#3600)
Information in docs about optional parameter `shuffle` is not correct. In the code
https://github.com/fchollet/keras/blob/master/keras/preprocessing/image.py#L263
it's `True` by default, but `False` in the docs.
2016-08-27 09:45:10 -07:00
Fariz Rahman 79a2bcd05f TF dynamic RNN : Allow sequences of ndim > 3 (#3603)
Docstring says "at least 3D", but current code is hard coded for 3D input.
2016-08-27 09:44:55 -07:00
dibule e582f9dcac Bug fix, batch_size set instead of default one (#3590) 2016-08-26 14:30:57 -07:00
Carl Thomé 4cefd6136b Add pydot-ng dependency (#3593) 2016-08-26 14:30:25 -07:00
Francois Chollet 1cf04b7a10 Update documentation 2016-08-26 14:22:56 -07:00
Francois Chollet ad3db301f2 Update RNN docstring 2016-08-25 12:34:35 -07:00
Francois Chollet e15eb40317 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-25 11:46:41 -07:00
Felix Lau 8459d0403c Fix Cropping3D InputSpec to be dim=5 (#3570) 2016-08-25 10:29:52 -07:00
François Chollet 08014eea36 Add support for dynamic RNNs in TensorFlow. (#3474)
* Add support for dynamic RNNs in TensorFlow.

* Fix return states

* Add support for go_backwards in dynamic TF RNNs

* Currently broken: TF RNN dropout, go_backwards

* Finalize dynamic RNNs in TF

* Remove unnecessary comment

* Comment out added test

* Comment out functional guide test
2016-08-24 20:47:51 -07:00
Francois Chollet c0b27108d0 Whitespace fix 2016-08-24 16:15:38 -07:00
Francois Chollet 40612facf3 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-24 15:16:39 -07:00
Francois Chollet e418fc6937 Add get_variable_shape backend method 2016-08-24 15:16:22 -07:00
cyrusmaher 045d442fcd Update scikit_learn.py (#3567)
Sklearn (e.g. the Bagging estimator) expects a value error if a parameter isn't recognized.
2016-08-24 14:19:25 -07:00
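The behaviour described in the commit above, raising a `ValueError` for an unrecognized parameter so that scikit-learn meta-estimators can react to it, can be sketched as follows. This is an illustrative stand-in, not the code in `scikit_learn.py`; the helper name `check_params` and the explicit `legal_names` argument are assumptions.

```python
def check_params(params, legal_names):
    """Raise ValueError if any key in `params` is not a recognized parameter name."""
    for name in params:
        if name not in legal_names:
            raise ValueError('{} is not a legal parameter'.format(name))
```

A caller such as a bagging wrapper can then catch `ValueError` to detect that an estimator does not accept a given parameter.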
Dmitry Lukovkin 2370f9f1db Docs update for Deconvolution2D layer (#3549)
* Fix exception message for Deconvolution2D

* Docs update for Deconvolution2D layer (#3540)

* Corrections to Deconvolution2D docs

* References formatted as Markdown links
* Blank lines added
2016-08-24 13:16:02 -07:00
Jurim Lee 519d1e7420 bug fix : cnn tensorflow backend error (#3558) 2016-08-24 08:57:31 -07:00
Francois Chollet 82afc713d0 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-24 04:15:34 -07:00
Francois Chollet a1e9a8addd data_utils update 2016-08-24 04:15:20 -07:00
Fariz Rahman f59736a06c Bug fix : RNNs (#3550) 2016-08-23 17:03:17 -07:00
Francois Chollet d187059596 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-23 14:30:21 -07:00
Francois Chollet 7d2f0b1ba8 Fix trainable_weight lists 2016-08-23 14:29:49 -07:00
Francois Chollet 6a6d939dea Style fix 2016-08-23 14:29:27 -07:00
Keunwoo Choi 090b8b7f99 add cropping1d/2d/3d layers (#3509)
* add cropping1d/2d/3d layers

* fix PEP8 issue, fix incorrect doc strings

* add example code on Cropping2D

* fix init/get_config of crop1d/3d, add test codes for cropping1d/2d/3d

* fix test code - PEP8

It doesn't pass the tests (only in cropping2d and basic_test), but my laptop setup is not correct (it doesn't pass for some other existing layers either), so I am committing to test it in a correct environment.

* change to follow PEP8 again

* update test_convolutional.py for PEP8, test code to use K.image_dim_ordering()

* PEP8 for test_convolutional.py - indentation

* fix typo. add assert to check cropping lengths
2016-08-22 19:20:47 -07:00
Dominic Breuker e4dda27de1 allow KerasClassifier.score to accept sparse labels (#3534) 2016-08-22 15:19:26 -07:00
EdwardRaff f2786d9d80 Update theano_backend.py (#3532)
fix batch norm causing NaN for many datasets
2016-08-20 12:50:23 -07:00
Katherine Crowson c6daa24e3c Use Theano's CuDNN implementation of batch normalization (#3529) 2016-08-19 20:03:55 -07:00
François Chollet 2f4eed1f0f Remove lock in fit_generator (#3528) 2016-08-19 16:09:20 -07:00
Francois Chollet acc5c45feb Remove lock in fit_generator 2016-08-19 11:21:47 -07:00
Rodrigo Prado 00c3335071 added language sh to install instructions (#3520) 2016-08-19 08:36:51 -07:00
Ardalan 65e4c8e76e Update lstm_text_generation.py (#3516)
updating comment
2016-08-18 14:03:26 -07:00
Francois Chollet d4f5dff8ee Style fixes 2016-08-18 13:51:45 -07:00
dolaameng 8d9cb782fb add example mnist_net2net.py (#3503)
* add example mnist_net2net.py

* change of mnist_net2net.py based on 1st comments

* typo fixed in examples/mnist_net2net.py
2016-08-18 13:07:16 -07:00
fchollet 02ff1d4462 Fix style in example 2016-08-17 20:00:49 -07:00
Francois Chollet 007d2c2e25 Fix theano tests 2016-08-17 17:39:24 -07:00
Francois Chollet 3bf7637986 Style fixes 2016-08-17 16:32:52 -07:00
Francois Chollet 33ff9dbce2 Speed up bidirectional wrapper tests 2016-08-17 16:30:31 -07:00
Fariz Rahman f25e894558 Bidirectional Wrapper (#3495)
* Add Bidirectional Wrapper

* Fix example

* Update wrappers.py

* Add reverse op

* Update tensorflow_backend.py

* Update wrappers.py

* Update test_wrappers.py

* bug fix

* Update test_wrappers.py

* Update test_wrappers.py

* bug fix

* Add test for reverse op

* Enable reverse along multiple axes

* Update theano_backend.py

* Update theano_backend.py

* Update test_wrappers.py

* Speed up tests

* Validate merge_mode arg, Add None mode

* Update test_wrappers.py

* Update test_wrappers.py

* Add properties; reverse -> backward

* Bug fix

* Resolve naming conflict

* Whitespace fix

* Update imdb_bidirectional_lstm.py

* Fix imports
2016-08-17 16:27:06 -07:00
Junwei Pan 52c1a7456f Add a reference paper for Adagrad (#3473)
Add a reference paper for Adagrad
2016-08-17 13:31:24 -07:00
Junwei Pan b2392413fa Fix an issue when only one dot_axes is specified for the Merge layer in dot mode (#3470)
* Upload examples/imdb_fasttext.py, which implements the fasttext model

* Remove Dropout and unnecessary imports

* Remove Dropout and unnecessary imports

* Remove Dropout and unnecessary imports

* Fix an issue when only one dot_axes is specified for the Merge layer

* Fix an issue when only one dot_axes is specified for the Merge layer
2016-08-17 13:30:45 -07:00
Frederico Tommasi Caroli 1941eaabe0 Using locks instead of ignoring ValueErrors in generators (#3501) 2016-08-17 11:20:51 -07:00
ηzw 3d5bf9753f Fix typo (#3505) 2016-08-17 11:18:52 -07:00
fchollet a4d191d4f9 Only write config file if it didn't exist 2016-08-16 20:53:57 -07:00
Reid Sanders dad54ec211 Minor imdb dataset documentation update, added docstring to reuters (#3494)
* Updated dataset documentation to reflect removal of test_split argument
from imdb dataset. Added docstring to reuters dataset load_data.

* Updated imdb and reuters examples in dataset docs to reflect all
available arguments with current default values.
2016-08-16 14:14:32 -07:00
Francois Chollet b525f5f4d7 Make h5py optional again 2016-08-16 13:31:15 -07:00
Mike Henry e8190a8d8d Added support for CTC in both Theano and Tensorflow along with image OCR example. (#3436)
* Added CTC to Theano and Tensorflow backend along with image OCR example

* Fixed python style issues, made data files remote, and made code more idiomatic to Keras

* Fixed a couple more style issues brought up in the original PR

* Reverted wrappers.py

* Fixed potential training-on-validation issue and removed unused imports

* Fixed PEP8 issue

* Remaining PEP8 issues fixed
2016-08-16 13:25:26 -07:00
Francois Chollet 4e155139ca Style fixes. 2016-08-16 11:17:44 -07:00
stas-sl 458edeed9a implement spatial dropout (#3463) 2016-08-16 11:16:18 -07:00
Katherine Crowson 04d785f4bf Allow multiprocessing on OS X (#3389) 2016-08-14 15:43:43 -07:00
fchollet 28d9c0c511 Style fixes in example. 2016-08-14 15:41:04 -07:00
Antonie Lin 91310971b9 mnist hierarchical rnn example (#3460) 2016-08-14 15:40:13 -07:00
Francois Chollet 5d2acf4897 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-13 11:54:04 -07:00
Francois Chollet dc98019d49 Add get_layer to sequential models 2016-08-13 11:53:48 -07:00
ηzw b008bb35cc Style fixes (#3462) 2016-08-13 11:22:01 -07:00
Junwei Pan 46d5b197e0 Implement a fasttext example (#3446)
* Upload examples/imdb_fasttext.py, which implements the fasttext model

* Remove Dropout and unnecessary imports

* Remove Dropout and unnecessary imports

* Remove Dropout and unnecessary imports
2016-08-12 14:36:35 -07:00
Francois Chollet 2c510530b1 Update FAQ 2016-08-10 16:44:35 -07:00
Francois Chollet ec6eda77ad Style fixes in tests 2016-08-10 16:44:29 -07:00
Yann Henon 4805e5856b Upsampling layer fix to work when input shape is None (#3429) 2016-08-09 14:47:23 -07:00
Fariz Rahman 55447cbb3d Bug fix : squeeze (#3433) 2016-08-09 13:59:47 -07:00
Francois Chollet 69d5139b8c [breaking change] Weight ordering now topological. 2016-08-09 10:20:00 -07:00
Francois Chollet 89f1e05147 Functional Model weight loading in layer order 2016-08-08 18:53:36 -07:00
Francois Chollet bc779df8b7 Avoid creating new ops in TF to load weights 2016-08-08 16:13:40 -07:00
Francois Chollet e3c260e7d3 Fix test flakes 2016-08-08 13:00:53 -07:00
Francois Chollet 0af7e004c7 Fix test flakiness in TF 2016-08-08 12:28:35 -07:00
Francois Chollet 447445388e Merge branch 'master' of https://github.com/fchollet/keras 2016-08-08 11:23:38 -07:00
Francois Chollet b2c66816d7 Prepare new PyPI release. 2016-08-08 11:23:25 -07:00
Martin Thoma b6f81c6cc3 Fix documentation of the 1D max pooling layer (#3419) 2016-08-08 11:04:54 -07:00
Yad Faeq 98b289630a Word embedding example updated (#3417)
* Added Convolution1D instead of Conv1D, which is deprecated

* updated rest of the example using Conv1D

* Python3 fails to decode utf-8 data, thus using encoding='latin-1'

* added condition for Encoding line 65-67

* Conv1D reverted back to the way it was
2016-08-08 10:59:31 -07:00
fchollet d68c0bd795 Update optimizers docs 2016-08-07 19:31:25 -07:00
Santiago Castro 5afda71f74 Fix scikit-learn python file name in docs (#3413) 2016-08-06 19:29:27 -07:00
Ramon de Oliveira 1b08a8d675 format Nadam references (#3404) 2016-08-05 16:41:34 -07:00
Moussa Taifi b508ab64bd Fix documentation for ModelCheckpoint callback params (#3408) 2016-08-05 16:41:15 -07:00
Richard Higgins 84f435e24b all zero arrays no longer get divided by zero in the process of being turned into images (#3401)
Update image.py
2016-08-05 09:10:52 -07:00
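The failure mode fixed by the commit above arises when an array is rescaled to the 0 to 255 range by dividing by its maximum: an all-zero array has a maximum of zero. A minimal sketch of the guard, assuming a flat list of numbers rather than the actual array handling in `image.py` (the function name `scale_to_255` is hypothetical):

```python
def scale_to_255(values):
    """Shift values to start at 0 and rescale them to the range [0, 255]."""
    lowest = min(values)
    shifted = [v - lowest for v in values]
    peak = max(shifted)
    if peak != 0:
        # Only normalize when there is a non-zero range; a constant
        # (for example all-zero) input would otherwise divide by zero.
        shifted = [v / peak for v in shifted]
    return [v * 255 for v in shifted]
```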
Fariz Rahman 984ad34a61 One hot op (#3353)
* One hot op

* tf too

* Update theano_backend.py

* Use built-in theano op

* Update theano_backend.py

* Add test

* Update test_backends.py

* Update test_backends.py

* Generalize for nD tensors

* Fix docstring on TF backend

* Update theano_backend.py

* Update theano_backend.py
2016-08-04 14:16:36 -07:00
Ondřej Filip ad3231c29a Docs update for Merge layer (#3392)
In Merge layer dot_axes param affects also 'cos' mode
2016-08-04 08:54:10 -07:00
fchollet c3d20bbc53 Merge branch 'master' of ssh://github.com/fchollet/keras 2016-08-03 22:11:12 -07:00
fchollet f9c03f183f Make trainable_weights dynamic 2016-08-03 22:02:45 -07:00
Tsukasa ŌMOTO 046a3c8a28 Fix YAML serialization for Advanced Activations (#3391)
Fix https://github.com/fchollet/keras/issues/2871#issuecomment-237365465

The problem occurs because PyYAML can't recognize numpy's data types.
2016-08-03 20:41:52 -07:00
Francois Chollet 05883934f1 Unify BN behavior across backends (fix) 2016-08-03 13:22:21 -07:00
Francois Chollet 97d2a73dd3 Unify BN behavior in TF and Theano 2016-08-03 12:39:24 -07:00
Francois Chollet 5367a44acb Further style fixes in example 2016-08-03 11:54:15 -07:00
Francois Chollet 1deaf71388 revert unnecessary changes in example script 2016-08-03 11:11:35 -07:00
Francois Chollet 99f564e972 Style fixes and small bug fixes 2016-08-03 11:08:48 -07:00
Zhengtao Wang c725f8d354 add resnet50 example (#3266)
* add resnet50 example

* fix PEP8 problems

* fix PEP8 problem....again...

This script gets 10 points in PEP8 tests on my computer but.....

* add resnet 50

* fix problem caused by interrupted git push

* fix PEP8 problem..again!

* update weights links and remove load_weights

* fix pep8!

* remove skimage dependency, rename the file

* fix pep8...

* update

* support tf dim_ordering

* fix PEP8 problem
2016-08-03 10:51:18 -07:00
Shuhei Iitsuka 257ace722c Correct the reconstruction loss (#3383) 2016-08-02 21:45:31 -07:00
Wei Ouyang 0cd9d46828 remove keyword for cifar10.load_data (#3379) 2016-08-02 08:20:00 -07:00
Chris Caruso cef9e28a6c Added to rnn statefulness doc concerning func api (#3375)
Added correct information for how to enable rnn statefulness on functional models with 1 or more Input layers.
2016-08-01 20:21:42 -07:00
Francois Chollet 6c42da2abf Fix legacy model loading 2016-08-01 16:26:01 -07:00
Francois Chollet a9fc2bed49 Allow to call load_weights on model save file 2016-07-31 22:58:44 -07:00
Francois Chollet 1855c49d1f Merge branch 'master' of https://github.com/fchollet/keras 2016-07-31 17:45:43 -07:00
Francois Chollet cce65ce34d Update docs. 2016-07-31 17:45:32 -07:00
Jan Wilken Dörrie 70866c0154 Compare versions through pkg_resources.parse_version (#3355) 2016-07-31 10:45:03 -07:00
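A likely motivation for comparing parsed versions rather than raw strings is that string comparison is lexicographic, so `'0.10.0'` sorts before `'0.9.0'`. The sketch below illustrates the pitfall with a simplified, hypothetical parser, `version_tuple`, which handles purely numeric dotted versions only; the commit itself uses `pkg_resources.parse_version`, which also handles pre-release and other suffixes.

```python
def version_tuple(version):
    """Hypothetical helper: turn 'X.Y.Z' (numeric parts only) into a tuple of ints."""
    return tuple(int(part) for part in version.split('.'))


def is_at_least(installed, required):
    """Return True if the installed version is greater than or equal to the required one."""
    return version_tuple(installed) >= version_tuple(required)
```

With this helper, `is_at_least('0.10.0', '0.9.0')` holds, whereas the plain string comparison `'0.10.0' >= '0.9.0'` does not.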
dolaameng d06e3753b0 update examples/neural_style_transfer.py (#3347) 2016-07-29 12:00:37 -07:00
Fariz Rahman cb4de1f859 Embedding layer - cast float to int implicitly (#3342)
* Embedding layer - cast float to int implicitly

* Cast only if not int
2016-07-29 11:11:46 -07:00
Jérémie b6d23b2e2d remove usage of tf.assign() in part of tensorflow backend (#3316) (#3320)
* remove usage of tf.assign() in the tensorflow backend (#3316)

Usage of the tf.assign() function in the set_value() and batch_set_values() functions creates new nodes on the Tensorflow graph which can eventually overflow the memory.
Therefore, the function has been rewritten using placeholders and feed_dict to avoid allocating additional memory.

* Correction to the set_value() function

Change to the set_value() function that had a bug when the variable "value" was a float.
The *1. dummy multiplication was added to avoid having to deal with tf.float32_ref dtypes.

* update set_value() of the tensorflow backend

Removal of the *1. dummy multiplication, replacement with a split() to avoid creating a new operation in the graph.

* fix to have session.run() called once in batch_set_value()

Rewriting of the batch_set_value() to avoid multiple calls to session.run() to improve speed.
2016-07-28 09:29:57 -07:00
Pradeep Dasigi 6a8815de0c Masked and non-masked merge bug fix (#3218)
* Masked and non-masked merge bug fix

* Masked merge concat logic with an expanded loop

* Cast mask of ones for unmasked input in merge to uint8
2016-07-27 17:50:49 -07:00
Francois Chollet e0179bad2f Refresh IMDB dataset 2016-07-27 17:30:04 -07:00
Francois Chollet 8778add0d6 Clean up test files 2016-07-27 12:09:40 -07:00
Francois Chollet facc823612 Update FAQ 2016-07-27 11:15:04 -07:00
Francois Chollet b91854ea9d Fix flaky test 2016-07-27 10:27:47 -07:00
Francois Chollet 05abe814ac Add keras version in model save files 2016-07-27 10:22:49 -07:00
François Chollet ea561ba6d8 Add model saving functionality (#3314)
* Add model saving functionality

* Update model saving functionality

* Fix py3 bytes/str issue

* Fix tests
2016-07-26 20:45:28 -07:00
Mostafa Abdulhamid df84c69676 Docker image for testing and experimenting with Keras (#3035)
* Docker image for testing and experimenting with Keras

 - Docker image with CUDA support on ubuntu 14.04
 - nvidia-docker script to forward the GPU to the container
 - MakeFile to simplify docker commands for build, run, test, ..etc
 - Add useful tools like jupyter notebook, ipdb, sklearn for experiments

* update nvidia-docker plugin

* use .theanorc in Dockerfile

* Add tensorflow to the docker image

* update Docker image to cuDNN v5

* test fixes

* move docker to sub directory

* README for docker

* Fix typos

* Add visualization to Dockerfile
2016-07-26 18:29:56 -07:00
Sadegh 3726aba2ee use of period fixed, default period set to 1000 (#3312) 2016-07-25 20:41:22 -07:00
Francois Chollet f6bcaffe4a Style fixes 2016-07-25 10:42:37 -07:00
yaringal c689b52dd1 Implemented transposed (de-) convolutions into Keras (#3251)
* theano backend now supports transposed convolutions

* working deconv

* new example file with deconv vae

* merged with #3273, fixed based on comments, pep8 tested

* test fix

* passes theano test

* start fixing deconv test

* fix deconv layer tests

* fix the right test

sorry, I "fixed" the wrong test last time

* clean up

* replace with_None with fixed_batch_size

* with_None --> fixed_batch_size

* comment edit

* fixed comments online
2016-07-25 10:33:03 -07:00
fchollet 09d75a4347 Style fixes 2016-07-24 14:20:36 -07:00
Rui Shu 59bd247603 Fix VAE example (#3220)
A number of changes:
1. Switch from Lambda to merge, otherwise code will not run.
2. Rename z_log_std to z_log_var in order for the objective function to make sense
3. Adjust reparameterization trick to reflect use of z_log_var, not z_log_std
4. Remove epsilon_std, since (standard) VAE uses isotropic gaussian prior.
5. Re-balance the weighting of KL and reconstruction terms
6. Use adam instead of rmsprop
7. Increase hidden unit size to improve model
8. Increase batch size to speed up training
2016-07-24 14:09:34 -07:00
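Points 2 and 3 of the commit above concern parameterizing the latent distribution by its log-variance. Under that convention the standard deviation is `exp(z_log_var / 2)`, and the KL divergence to a standard normal prior has the usual closed form. The following is a plain-Python sketch of both formulas, not the code in the VAE example; the function names are hypothetical, and only `z_mean` and `z_log_var` come from the commit message.

```python
import math
import random


def sample_z(z_mean, z_log_var, rng=random):
    """Reparameterization trick: z = mean + exp(log_var / 2) * epsilon, epsilon ~ N(0, 1)."""
    return [m + math.exp(lv / 2.0) * rng.gauss(0.0, 1.0)
            for m, lv in zip(z_mean, z_log_var)]


def kl_to_standard_normal(z_mean, z_log_var):
    """KL(N(mean, exp(log_var)) || N(0, 1)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(z_mean, z_log_var))
```

If the network output were treated as a log standard deviation instead, the factor of one half in `sample_z` and the `exp` term in the KL expression would both change, which is why the naming matters for the objective.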
dolaameng f221ef952f make examples/pretrained_word_embeddings.py more memory efficient (#3289)
* make examples/pretrained_word_embeddings.py more memory efficient

* make examples/pretrained_word_embeddings.py more memory efficient

* rename NB_WORDS to nb_words as it is not a global constant
2016-07-23 10:14:28 -07:00
Sebastian N. Fernandez d3c75e1d34 adding print_tensor (#3285) 2016-07-22 12:48:44 -07:00
Felix Lau 3aab55d29f Add Tensorflow support for UpSampling3D and ZeroPadding3D (#3274) 2016-07-22 12:47:59 -07:00
Fariz Rahman f9ef72c38a Logical ops (#3272)
* Logical ops

* Update tensorflow_backend.py

* Add tests

* Update tensorflow_backend.py

* lesser->less

* Update test_backends.py

* less->lesser

* less->lesser

* Update test_backends.py
2016-07-21 20:47:02 -07:00
Francois Chollet 108159ed17 Merge branch 'master' of https://github.com/fchollet/keras 2016-07-21 15:10:10 -07:00
Francois Chollet defa1283c4 Include iterations in optimizer weights 2016-07-21 15:04:30 -07:00
Francois Chollet 2788b60fe6 Revert behavior of regularizers 2016-07-21 15:04:09 -07:00
Olivier Grisel 7e70e1768f Small python3 compat fix in backend doc (#3275) 2016-07-21 10:31:52 -07:00
Kai Li 896ba77061 update residual connection example (#3278) 2016-07-21 10:30:19 -07:00
Francois Chollet c034262b78 Removed test for deprecated graph model 2016-07-20 16:12:49 -07:00
Francois Chollet b7edcf6eea Change behavior of regularizers: use mean, not sum 2016-07-20 15:52:47 -07:00
Francois Chollet 23e1ad2df7 Allow overriding learning phase 2016-07-20 15:48:52 -07:00
Francois Chollet 0a3939883a Style fix 2016-07-19 16:29:27 -07:00
Francois Chollet 3c8f91ee3d Style fix 2016-07-19 16:11:33 -07:00
Francois Chollet efa5b04797 Style fix 2016-07-19 16:09:07 -07:00
Francois Chollet 2da66ed009 Py3 fix 2016-07-19 14:56:50 -07:00
Francois Chollet 2ac6811362 Remove deprecated graph model test 2016-07-19 14:53:50 -07:00
Francois Chollet 74c51f213c Fix flaky test 2016-07-19 14:52:56 -07:00
Francois Chollet 4302d8060d Fix image resizing in preprocessing/image 2016-07-19 14:30:43 -07:00
Francois Chollet 576cf8978b Merge branch 'master' of https://github.com/fchollet/keras 2016-07-19 14:22:48 -07:00
Francois Chollet 3533912016 Native initializations, updates 2016-07-19 14:22:34 -07:00
ekerazha cf9922ff1d Fix broken imdb_cnn example (#3244)
* Fix broken imdb_cnn example

* Update imdb_cnn fix
2016-07-19 12:18:59 -07:00
Zhengtao Wang 4fa65fbb2f add doc for layers/normalization/BatchNormalization (#3248)
* add doc for layers/normalization/BatchNormalization

* fix PEP8 problem

* Update normalization.py
2016-07-19 12:18:39 -07:00
Kai Li f502ee2338 Update sequential-model-guide.md (#3257) 2016-07-19 12:18:25 -07:00
Francois Chollet 7a56925176 Even faster tests 2016-07-19 11:57:57 -07:00
Francois Chollet 0a108b3fb2 Fix tests maybe? 2016-07-19 11:31:22 -07:00
Francois Chollet 381a108e6d Revert Travis config 2016-07-19 11:17:32 -07:00
Francois Chollet 726c9fc8a6 Update tests 2016-07-19 10:49:44 -07:00
Francois Chollet 946ccd3228 Speed up tests (especially with TF) 2016-07-19 10:05:30 -07:00
Francois Chollet 8e1ebbfc11 Get rid of coveralls 2016-07-19 00:40:21 -07:00
Francois Chollet cc0e60c101 TF backend performance improvements 2016-07-19 00:30:05 -07:00
Francois Chollet ff3f00d845 Style fix in example 2016-07-18 23:48:56 -07:00
Francois Chollet 40195c2fa2 TF backend: group update ops, add clear_session 2016-07-18 23:48:56 -07:00
Francois Chollet 7f7300b8cb Minor style fixes 2016-07-18 23:48:56 -07:00
Fariz Rahman 1b158ff4ed Bug fix + test - Sequential.pop() (#3252)
* Bug fix - Sequential.pop()

* Add test for Sequential.pop()

* Update test_sequential_model.py
2016-07-18 18:30:13 -07:00
stas-sl b686b85b52 Use symbolic shape in tensorflow version of batch_flatten (#3253) 2016-07-18 15:37:10 -07:00
Francois Chollet 8fa82ae5cb Merge branch 'master' of https://github.com/fchollet/keras 2016-07-16 17:52:56 -07:00
Francois Chollet 0d5289141e Add pre-trained word embeddings example 2016-07-16 17:51:17 -07:00
Francois Chollet 01d5e7bc47 Fix up a few examples 2016-07-16 17:47:52 -07:00
Fariz Rahman cfbaec60c7 Simplify output shape inference for dot/cos merge (#3233)
* Simplify output shape inference for dot/cos merge

* Add extra dim if rank is 1

* Explain output shape inference logic in doc string

* Update tensorflow_backend.py

* Update theano_backend.py

* Update tensorflow_backend.py

* Update tensorflow_backend.py

* Update theano_backend.py

* Update topology.py

* example

* Update theano_backend.py

* Update theano_backend.py

* Update tensorflow_backend.py

* Update tensorflow_backend.py

* Update tensorflow_backend.py

* Update tensorflow_backend.py

* Update theano_backend.py

* Update tensorflow_backend.py

* typo fix

* Update theano_backend.py
2016-07-16 16:35:39 -07:00
Francois Chollet f3e7245910 Use native Theano BN 2016-07-16 13:42:41 -07:00
Francois Chollet 892d9fae84 Prepare 1.0.6 release 2016-07-16 12:25:03 -07:00
Francois Chollet 796f895f01 Start to nativify variable initialization in TF 2016-07-16 11:19:41 -07:00
Francois Chollet 489bb4eb10 Update docs for pooling 2016-07-16 10:28:35 -07:00
Francois Chollet 8f458066bb Update docs for initializations 2016-07-16 10:28:18 -07:00
Francois Chollet 5dd7454260 Merge branch 'master' of https://github.com/fchollet/keras 2016-07-15 17:36:21 -07:00
Francois Chollet 571db82371 Add variable initialization override option in TF 2016-07-15 17:36:00 -07:00
fchollet d971e0cca5 Style fixes in SeparableConv2D 2016-07-14 18:22:33 -07:00
Fariz Rahman fde0aac733 Convnet aliases (#3226)
* Convnet aliases

Not very important, but kinda handy.

* Update convolutional.py

* pep8
2016-07-14 18:16:00 -07:00
Maruan b9d904c12f Add masking support to BatchNormalization layer. (#3228) 2016-07-14 17:02:56 -07:00
Eder Santana aa2ec42da6 stop gradients (#3221)
* stop gradients

* fix stop grad test

* stop gradients
2016-07-14 16:13:55 -07:00
Francois Chollet d90d473104 Add pooling layers page in docs 2016-07-14 15:27:46 -07:00
Francois Chollet 5a1e63990a Refactor batch norm 2016-07-14 15:05:48 -07:00
Francois Chollet e836c10c6f Fast BN for TF 2016-07-14 14:13:06 -07:00
Francois Chollet 47c09d9557 Add SeparableConv2D layer (TF only) 2016-07-14 11:22:27 -07:00
Francois Chollet b35b943364 Add support for None activations 2016-07-14 04:38:53 -07:00
Francois Chollet ca467cc50e Add support for input tensors in InputLayer 2016-07-14 04:30:21 -07:00
Francois Chollet 51f7cf0367 Doc formatting fix 2016-07-14 04:20:12 -07:00
Francois Chollet 642eaca618 Doc formatting fix 2016-07-13 12:38:45 -07:00
Francois Chollet 55e5680535 Update doc generation script 2016-07-13 12:35:11 -07:00
Francois Chollet 52ea31b65c Add FAQ entry on pre-trained models 2016-07-13 12:34:58 -07:00
Wei Ouyang b3a26a5b30 Add AtrousConv2D layer for dilated convolution (#3183) 2016-07-13 08:28:23 -07:00
Dave Challis 98974efa5f Fix to error message in exception (#3213)
Was incorrectly reporting the `loss` argument instead of the `loss_weights` argument when an exception related to loss_weights was thrown.
2016-07-13 08:27:03 -07:00
fchollet b6a776b242 Add Sequential.pop() method 2016-07-12 20:17:03 -07:00
Francois Chollet 1ea3f44f06 Fix dtype consistency issue in random_binomial 2016-07-12 17:05:47 -07:00
Michael Oliver 64e1320ca0 add dtype to zeros and ones allocations (#3187) 2016-07-09 10:10:15 -07:00
Francois Chollet 6e0b50fbdc Merge branch 'master' of https://github.com/fchollet/keras 2016-07-08 18:44:37 -07:00
Francois Chollet 22502a8fe8 Style fixes in regularizers 2016-07-08 18:44:23 -07:00
Jason Yosinski a78ad01bb4 Create initial_state tensor filled with zeros without use of K.zeros (#3123)
* Create initial_state tensor filled with zeros without use of K.zeros

* minor PEP8 fix
2016-07-06 12:47:14 -07:00
Carl Thomé 729e802e85 Added optional field name argument to RemoteMonitor callback (#3157)
* Added optional path argument

* Added optional field name argument
2016-07-06 09:47:03 -07:00
Joshua Chin 3ffff6d579 fix get_output_shape_for in Merge, when mode is callable (#3144) 2016-07-05 09:29:40 -07:00
Francois Chollet 6e5f97fca5 Merge branch 'master' of https://github.com/fchollet/keras 2016-07-04 14:21:13 -07:00
Francois Chollet eaff5bdfd7 Style touch-ups in TF backend 2016-07-04 14:20:54 -07:00
Francois Chollet 28819d36a4 Less frequent dataset tests 2016-07-04 14:17:35 -07:00
lucasmoura f9a4f6f306 Use defaultdict for _UID_PREFIXES (#3087)
The method get_uid in common.py first checks whether a prefix is in the _UID_PREFIXES dict
and, if it is not, adds an entry for it to the dict.

However, using a defaultdict, this check is no longer necessary.
2016-07-04 13:26:44 -07:00
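A minimal sketch of the idea in commit f9a4f6f306 above, assuming a module-level counter table keyed by prefix. The function body and the choice to start counting at 1 are illustrative assumptions, not the actual contents of common.py.

```python
from collections import defaultdict

# Hypothetical stand-in for the module-level table: missing prefixes
# are created on first access with a count of 0, so no explicit
# membership check is needed.
_UID_PREFIXES = defaultdict(int)


def get_uid(prefix=''):
    """Return the next integer id for `prefix` (1, 2, 3, ...)."""
    _UID_PREFIXES[prefix] += 1
    return _UID_PREFIXES[prefix]
```

Each prefix keeps its own counter, so ids for one prefix are unaffected by calls made with another.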
Dmytro Mishkin 27c83c693d Added 'max' operation to Merge layer (#3128)
* Added 'max' operation to Merge layer. It allows implementing convolutional maxout with two (or more) convolution layers and one Merge.

* Added 'max' to merge test
2016-07-04 11:03:07 -07:00
Eder Santana d106908a57 fix docs bugs (#3142)
* fix docs bugs

* fix docs bugs
2016-07-04 11:02:47 -07:00
Brian McMahan b4adce34dc Lambda output shape (#2680)
* updating the info for lambda

* updated lambda doc a bit more

made it more readable and stuff
2016-07-03 21:57:46 -07:00
Thibault de Boissière 3927505d1a Add multiprocessing for fit generator (#3049)
* Add multiprocessing for fit generator

* Change maxproc to nb_worker and update documentation

* Simplify multiprocessing test, clarify doc replace maxproc by nb_worker

* Replace maxproc by nb_worker in test

* Replace maxproc by nb_worker in test

* Update the doc: specify non picklable arguments should not be used with multiprocessing

* Add multiprocessing as an option with the pickle_safe argument
2016-07-03 21:50:01 -07:00
fchollet ede79f818e Add MIT license badge to README 2016-07-03 21:19:05 -07:00
Francois Chollet 742ac53262 Merge branch 'locally_connected' 2016-07-03 20:52:54 -07:00
Francois Chollet ee17ccc374 Add tests for locally connected layers 2016-07-03 20:51:58 -07:00
joelthchao 835a02c037 locally-connected layer
add unittest, fix output shape

PEP8

flatten weight, improve example

update docstring, remove cifar10 Alex example

improve docstring, remove duplicate func

parallel by batch_dot

fix theano batch_dot

dim_ordering unit test, theano only use dot

dim_ordering unit test

Update locally connected layers
2016-07-03 20:48:22 -07:00
François Chollet ee8ff00a2a New conv ops (#3134)
* New function signature for conv2d in backend

* Clean up stuff

* Touch-up TF deconv op

* More cleanup

* Support for TF 3D conv/pool

* Move pooling layers to their own file

* Update TF version in Travis config

* Fix conv3d tests
2016-07-03 20:33:21 -07:00
fchollet 229f13a864 Lambda should not support masking implicitly 2016-07-02 15:12:46 -07:00
Rompei 0d60d637af TimeDistributedDense -> TimeDistributed(Dense()) in doc example 2016-07-02 12:12:54 -07:00
Francois Chollet c20e34a8b0 Prevent image_dim_ordering from being overwritten 2016-07-02 10:24:48 -07:00
Fariz Rahman 8d3f39852a Validate dot_axes argument in cos mode and fix output shape (#3116)
* Validate dot_axes argument in cos mode

* Update topology.py

* Update topology.py
2016-07-01 11:41:13 -07:00
Carl Thomé aa45dee5a4 Added optional path argument (#3118) 2016-07-01 10:56:17 -07:00
fchollet 885e6e621b Style fix in test 2016-06-30 23:22:06 -07:00
fchollet dc122c31ef Fix masking test 2016-06-30 23:19:51 -07:00
fchollet 3bc80d3db4 Remove unnecessary assert 2016-06-30 23:04:12 -07:00
Francois Chollet 439f2f3b2b Fix issue with multi-io + BatchNorm mask computing 2016-06-30 22:19:17 -07:00
Joshua Chin a1610eb274 model should use binary accuracy for binary crossentropy loss (#3098) 2016-06-28 12:45:34 -07:00
Francois Chollet 6b90eff03c Fix flaky test 2016-06-27 12:16:54 -07:00
Francois Chollet 4d404d1a54 Prepare 1.0.5 PyPI release 2016-06-27 12:01:20 -07:00
Francois Chollet fffa6a80ca Add attribute caching for flattened_layers 2016-06-27 11:51:28 -07:00
Francois Chollet 3f6b38b34f Fix duplicated updates issue 2016-06-27 11:51:12 -07:00
Francois Chollet 8166a55761 Fix flaky test 2016-06-27 11:50:56 -07:00
Shota 89f6d374e9 Fix typo (#3070) 2016-06-25 12:01:20 -07:00
M.Yasoob Ullah Khalid ☺ 9b3c2cf348 A small typo (#3067) 2016-06-24 19:04:28 -07:00
Francois Chollet 76406dd0c2 Remove unnecessary space 2016-06-23 16:17:36 -07:00
Francois Chollet d266b75423 Small fixes in text gen example 2016-06-23 16:17:24 -07:00
Francois Chollet 5bb5eb1657 Fix flaky test 2016-06-23 12:08:25 -07:00
Benjamin Bolte 258cf3b0f7 Support for masking in merged layers (#2413)
* added masking to merge layer (#2413)

* added documentation, fixed stylistic issues

* removed casting

* changed to using K.all
2016-06-23 12:03:55 -07:00
Utkarsh Upadhyay cdb5b09cd7 fix: Sort subdirs before mapping them to classes. (#3052)
The documentation says that [1]: 

> If [classes are] not provided, the list of classes will be automatically inferred (and the order of the classes, which will map to the label indices, will be alphanumeric).

However, the code was adding classes in the order `os.listdir` returned them. This commit alphanumerically sorts the sub-directories before mapping them to label indices.

[1] http://keras.io/preprocessing/image/
2016-06-23 11:32:40 -07:00
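A minimal sketch of the fix described in commit cdb5b09cd7 above. The helper name `class_indices_from_directory` is hypothetical and is not the DirectoryIterator code itself; it only illustrates sorting sub-directory names before assigning label indices, since `os.listdir` returns entries in arbitrary order.

```python
import os


def class_indices_from_directory(directory):
    """Map each sub-directory name to a label index in sorted order."""
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    # Plain files are ignored; only sub-directories count as classes.
    return dict(zip(classes, range(len(classes))))
```

Because the names are sorted first, the mapping no longer depends on the order in which the file system happens to list the directory.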
MikeAmy cb3469215a Moved epoch_logs = {} before batch loop to avoid UnboundLocalError. (#3019) 2016-06-20 08:18:01 -07:00
ηzw 703c2925b3 Add comment for a note of caution (#3024) 2016-06-19 12:36:05 -07:00
lucasmoura b6ca3ef051 Avoid double key lookup on callback.py (#3018)
In on_epoch_end, new keys are added to the history dict by first
checking whether the key is missing and, if so, creating it with an
empty list as its value.

However, this looks up the key in the dict twice. The same behavior
can be achieved in a single step using the dict setdefault method.
2016-06-19 10:48:53 -07:00
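A minimal sketch of the single-lookup pattern described in commit b6ca3ef051 above. The function name `record_epoch` and its arguments are hypothetical; the real code lives in a callback's on_epoch_end method.

```python
def record_epoch(history, logs):
    """Append each logged value to the per-key list in `history`."""
    for key, value in logs.items():
        # setdefault returns the existing list, or inserts and returns
        # a new empty list, in one dictionary operation.
        history.setdefault(key, []).append(value)
    return history
```

Keys that first appear in a later epoch simply start their own list at that point.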
André Artelt b235f91cc7 doc: fix example for recurrent layer (#3022) 2016-06-19 10:39:23 -07:00
jkleint 3513472467 Allow re-use of EarlyStopping callback objects. (#3000)
An EarlyStopping callback object has internal state variables to tell it
when it has reached its stopping point.  These were initialized in __init__(),
so attempting to re-use the same object resulted in immediate stopping. This
prevents (for example) performing early stopping during cross-validation with
the scikit-learn wrapper.

This patch initializes the variables in on_train_begin(), so they are re-set
for each training fold.  Tests included.
2016-06-18 15:04:30 -07:00
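A framework-free sketch of the pattern described in commit 3513472467 above: configuration stays in `__init__`, while per-run state is reset in `on_train_begin`. The class `PatienceStopper` and its attributes are hypothetical and are not the Keras EarlyStopping implementation.

```python
class PatienceStopper:
    """Hypothetical stand-in for an early-stopping callback."""

    def __init__(self, patience=0):
        # Configuration only; no per-run state is created here.
        self.patience = patience

    def on_train_begin(self):
        # Per-run state is (re)initialised at the start of every run,
        # so the same object can be reused across training folds.
        self.best = float('inf')
        self.wait = 0
        self.stopped = False

    def on_epoch_end(self, loss):
        if loss < self.best:
            self.best = loss
            self.wait = 0
        else:
            if self.wait >= self.patience:
                self.stopped = True
            self.wait += 1
```

If the state were set only in `__init__`, a second run would start with `stopped` already true and `best` carried over from the previous run.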
Carlos Bentes 60e0c96f6c Fix typo in training (#3014) 2016-06-18 10:42:38 -07:00
Tsukasa ŌMOTO e37df7ca85 Fix json serialization in Lambda layer (#3012)
Fix #2582
Fix #3001
2016-06-17 21:26:45 -07:00
Tsukasa ŌMOTO a13a35fe52 Fix json serialization in merge layer with lambda output shape (#3011)
Fix #3008
2016-06-17 21:26:29 -07:00
Zhengping Che f421600218 fix wrong calls of __init__ in callbacks (#2999) 2016-06-16 14:51:28 -07:00
Francois Chollet 2a4492d74f Fix typo in docs 2016-06-16 10:56:34 -07:00
Tsukasa ŌMOTO 10368d867f Fix TF-IDF in Python 2 (#2992)
Fix #2974
2016-06-16 08:49:17 -07:00
Tsukasa ŌMOTO 936360020c Fix tf-idf again (#2986)
Fix 53aaa842ed
Fix #2974
2016-06-15 09:11:01 -07:00
ηzw c53c64d7fa Fix get_word_index (#2981) 2016-06-14 18:59:10 -07:00
Francois Chollet 6b122ba25f Allow arbitrary output shapes for custom losses 2016-06-14 16:40:58 -07:00
Francois Chollet 6501b587c0 Clarify use of two-branch models 2016-06-14 12:12:21 -07:00
Tsukasa ŌMOTO 53aaa842ed Fix tf-idf (#2980)
Fix #2974
2016-06-14 11:39:07 -07:00
Francois Chollet dc569e952d Merge branch 'master' of https://github.com/fchollet/keras 2016-06-13 12:06:45 -07:00
Francois Chollet c9aee4126e Fix issue with cascade of Merge layers 2016-06-13 12:05:48 -07:00
githubnemo 7397f4b0d0 Resolve #2960 (#2961)
* Resolve #2960

Introduce `K.var` so that the standard deviation computation can
be made numerically stable. Instead of

	K.std(x)

the user is able to write

	K.sqrt(K.var(x) + self.epsilon)

avoiding a division by zero in the gradient computation of `sqrt`.

* Fix typos
2016-06-12 21:16:21 -07:00
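The division by zero mentioned in commit 7397f4b0d0 above can be illustrated without any backend. The derivative of sqrt(v) is 1 / (2 * sqrt(v)), which is undefined at v = 0; adding a small epsilon inside the square root keeps it finite. The function names and the epsilon value below are illustrative assumptions, not Keras code.

```python
import math


def sqrt_grad(v):
    """Derivative of sqrt(v); fails when the variance v is exactly 0."""
    return 1.0 / (2.0 * math.sqrt(v))


def stable_sqrt_grad(v, epsilon=1e-6):
    """Derivative of sqrt(v + epsilon); finite for every v >= 0."""
    return 1.0 / (2.0 * math.sqrt(v + epsilon))
```

A constant input has zero variance, which is exactly the case where the unguarded form breaks down.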
Rompei 3b83a1b1ac Fix initial variable in Evaluator. (#2955) 2016-06-12 14:50:56 -07:00
fchollet 8f8e4574dc Fix issue with Sequential deserialization 2016-06-12 11:04:09 -07:00
fchollet c30432a665 Nadam optimizer style fixes 2016-06-11 16:49:56 -07:00
Ilya Kulikov c4c2d8bbd4 Nadam optimizer and test for it added (#2764)
* Nadam optimizer and test for it added

* pep8 fix

* add comment in docstring and one more pep8 fix
2016-06-11 16:30:05 -07:00
Francois Chollet e49ba233f9 Convolution1D: apply activation after reshape 2016-06-10 15:05:34 -07:00
mpt5 cea6c1821f Update visualization.md (#2942)
* Update visualization.md

Added show_layer_names argument and its default value to docs

* Update visualization.md
2016-06-09 21:50:32 -07:00
Shaun Harker ab4bf447c6 Fix 1D convolution layers under Theano backend (#2938)
This issue is due to an unexpected loss of dimensionality when
composing the backend tensor operations "reshape" and "squeeze"
when there are dimensions of length 1.

For example, using a Theano backend the following fails with a
complaint about dimension mismatch:

UpSampling1D(2)(MaxPooling1D(2)(Reshape((2,1))(Input(shape=(2,)))))

The issue arises due to the conflict of two behaviors specific
to the Theano backend:

-   Reshape uses Theano's reshape function. Theano's reshape
    automatically makes dimensions with length 1 "broadcastable"

-   MaxPooling1D's implementation class _Pooling1D has a call method
    which uses a dummy dimension which it has to remove. The manner
    in which this dummy dimension is removed is to call "squeeze(x, axis)"
    from the backend. The squeeze implementation tells Theano to make
    the dummy dimension broadcastable, and then calls Theano's "squeeze",
    which removes ALL the broadcastable dimensions; not just the dummy
    dimension, but also the length 1 dimension flagged as broadcastable
    by reshape. This causes the problem observed above. This behavior
    is distinct from the behavior of the TensorFlow backend, which
    removes only the requested dimension.

This PR addresses this issue in two ways:

First, it introduces a test which checks the composition of "reshape"
and "squeeze" to make sure we get the same result using both Theano
and TensorFlow backends.

Second, it changes the implementation of squeeze(x,axis) so that the
Theano backend should behave similarly to the TensorFlow backend. With
this change the introduced test passes and the above example works.
2016-06-09 12:00:50 -07:00
Oswaldo Ludwig 4e0c8cf25b Eigenvalue Decay regularization (#2846)
* Update regularizers.py

I added a new regularizer, Eigenvalue Decay, which aims at maximum-margin learning. This version approximates the dominant eigenvalue by a soft function given by the power method. For details, see:
Oswaldo Ludwig. "Deep learning with Eigenvalue Decay regularizer." ArXiv eprint arXiv:1604.06985 [cs.LG], (2016). https://www.researchgate.net/publication/301648136_Deep_Learning_with_Eigenvalue_Decay_Regularizer

The syntax for Eigenvalue Decay is similar to the other Keras weight regularizers, e.g.:

 model.add(Dense(100, W_regularizer=EigenvalueRegularizer(0.0005)))

* Example with Eigenvalue Decay regularization.

An example from Keras including regularization with Eigenvalue Decay. After training, you have to save the trained weights, create/compile a similar model without Eigenvalue Decay and save this model. Then, you can use your trained weights with this model, see lines 123-153 of CIFAR10_with_Eigenvalue_Decay.py (This is still an open issue).
This example yields a gain in the accuracy by the use of Eigenvalue Decay of 2.71% (averaged over 10 runs).

* Update CIFAR10_with_Eigenvalue_Decay.py

* Update CIFAR10_with_Eigenvalue_Decay.py

* Update CIFAR10_with_Eigenvalue_Decay.py

* Update regularizers.py

* Update regularizers.py

* Delete CIFAR10_with_Eigenvalue_Decay.py

* Update test_regularizers.py

* Update regularizers.py

* Update test_regularizers.py

* Update regularizers.py

* Update regularizers.py

I needed another reading in Keras backend...

* Issue to get shape of a tensor.

Issue getting the shape of a tensor in the class EigenvalueRegularizer: the type returned for shape differs between the Theano backend (Theano tensor type) and the TF backend (TF TensorShape).

* Update regularizers.py

* Update regularizers.py

* Update regularizers.py

* Update regularizers.py

* Update regularizers.py

* Update regularizers.py

* Update regularizers.py
2016-06-08 12:20:25 -07:00
Francois Chollet e4b3a052a4 Small style fixes 2016-06-08 11:56:05 -07:00
Ziheng Jiang 825beb42c4 fix bug: rename duplicated loss name (#2842)
* rename duplicated loss name

* make python3 happy

* rewritten code to make it easy to read
2016-06-08 11:51:34 -07:00
Tammy Yang c8d605db55 Make DirectoryIterator case insensitive (#2932)
* make DirectoryIterator case insensitive

* Also need to make filename case insensitive while appending it into self.filenames
2016-06-08 11:27:19 -07:00
Ryo ASAKURA f6ecab58cb Fix description about parameter output_shape for function merge (#2933) 2016-06-08 11:26:56 -07:00
Tsukasa ŌMOTO d7e39347b9 Add mode=2 option to the docstring in BatchNormalization (#2919)
Fix a tiny typo.
2016-06-07 23:09:03 -07:00
Colin Rofls 25c10af596 fix 2852 (#2927) 2016-06-07 16:48:13 -07:00
Francois Chollet ded23f14c7 Fix typo in docs 2016-06-07 15:51:29 -07:00
Michael Crawford b0d52d930a Fix predict_proba method of KerasClassifier to return probabilities for both classes in case of binary classification. issue:2864 (#2924) 2016-06-07 11:48:04 -07:00
jakeleeme af5c5b6a55 Spellcheck source files (#2907) 2016-06-06 13:29:25 -07:00
ηzw ce51e19970 Fix typos in image preprocessing docs (#2906) 2016-06-06 13:28:39 -07:00
Francois Chollet 97e31b6090 Cleanup docs autogen script 2016-06-06 10:18:09 -07:00
Francois Chollet 62053e68e2 Prepare 1.0.4 PyPI release 2016-06-06 10:17:47 -07:00
Francois Chollet 489c07e748 Docs adjustment 2016-06-06 10:15:22 -07:00
fchollet 604ea8d68a Fix PEP8 BS 2016-06-05 20:41:19 -07:00
fchollet fd3cfb196b Allow no layer names in plot() 2016-06-05 20:20:14 -07:00
fchollet b71f6ba864 Allow absence of labels in flow() 2016-06-05 20:19:55 -07:00
fchollet dfc128b89a Fix some py3 generator issue 2016-06-05 14:52:10 -07:00
fchollet 34b8b57c2f Update image preprocessing docs 2016-06-05 14:01:28 -07:00
fchollet 3bba409d9e Improve docstring in preprocessing/image 2016-06-05 14:01:11 -07:00
fchollet 7869cdccec Merge branch 'master' of ssh://github.com/fchollet/keras 2016-06-05 13:39:55 -07:00
fchollet 0e18e345b0 Refactor ImageDataGenerator, add directory support 2016-06-05 10:24:54 -07:00
fchollet e5b99c7512 Tiny fixes in Sequential methods 2016-06-05 10:24:20 -07:00
aaditya prakash 7d4c85018a MaxoutDense no activation; incorrect docs (#2895)
Since MaxoutDense does not have activation it might be misleading to include "activation" as one of the arguments in the function docs.
2016-06-03 23:45:24 -07:00
fchollet edbec2dbc9 Remove bit of deprecated code 2016-06-03 23:13:46 -07:00
fchollet 973ece9809 Make dim_ordering a global default 2016-06-03 23:13:11 -07:00
lorenzoritter 90f441a6a0 fixed formatting error in the docstring (#2797)
* fixed formatting error in the docstring

* fixed formatting error in TimeDistributedDense of core.py
2016-06-02 19:07:34 -07:00
Andrew Stromnov 5a71090476 limit progress bar update rate (#2860)
* limit progress bar update rate

Limit progress bar update rate in verbose=1 mode. This patch reduces
terminal I/O throughput while keeping a reasonably high visual
update rate (defaults to 100 refreshes per second). It helps greatly
when working with large but simple data sets with small batches, which
lead to millions of relatively useless screen updates per second. It
also helps to keep network traffic at reasonable rates, which is
especially useful under laggy network conditions when using
keras over telnet/ssh, and improves web browser responsiveness when
using keras within Jupyter Notebook.

* add docstrings for 'interval' and 'force' arguments
2016-06-02 13:06:37 -07:00
matthewmok 76cae0ec44 fix bug: change seed range for RandomStreams in Theano (#2865)
* bug fixed, numpy randint only outputs positive numbers ranging from 1 to 10e6

* Update theano_backend.py

changed style and numpy randint range

* Update theano_backend.py

removed extra spaces
2016-06-02 13:05:51 -07:00
talpay 273f0dda9d Added objective: Kullback Leibler Divergence (#2872)
* Added objective: Kullback Leibler Divergence

* KLD: Clip at 1
2016-06-02 10:23:00 -07:00
Tsukasa ŌMOTO 882b5a1d89 Fix YAML serialization when using Regularizers (#2883)
Fix #2871
2016-06-01 21:40:16 -07:00
ηzw 8c84ad1a86 fix typo (#2881)
* fix typo

* Update scikit-learn-api.md
2016-06-01 21:39:46 -07:00
Francois Chollet 80bfec7253 Fix JSON deserialization issue 2016-05-30 22:36:57 -07:00
fchollet 91b930298b Make Merge output_shape consistent with lambda 2016-05-30 20:50:59 -07:00
Tsukasa ŌMOTO 9c56b91548 Fix json serialization in merge layer (#2854)
Fix #2818
2016-05-30 20:30:07 -07:00
fchollet 7b5bab83f4 Merge branch 'master' of ssh://github.com/fchollet/keras 2016-05-29 15:36:21 -07:00
fchollet c5e2116ead Fix typo in doc 2016-05-29 15:36:14 -07:00
mittagessen d9db73a791 s/TimeDistributedDense/TimeDistributed(Dense(.../g (#2843) 2016-05-29 14:12:57 -07:00
Kumar Ayush 01ece4ef7b added required import line (#2839) 2016-05-27 23:56:51 -07:00
Francois Chollet 601f3e7cdb BN only uses learning phase in mode 0 2016-05-27 21:36:08 -07:00
Francois Chollet a9ca2c547f Merge branch 'master' of https://github.com/fchollet/keras 2016-05-27 21:32:53 -07:00
Francois Chollet 594cbed03b Small changes in mask caching 2016-05-27 21:32:43 -07:00
Monami Sharma 3938a905a1 Default values corrected for featurewise_std_normalization and featurewise_center (#2831)
For ImageDataGenerator, False is the default value for featurewise_std_normalization and featurewise_center.
2016-05-27 08:59:10 -07:00
Francois Chollet 88d523e01b Add stateless batchnorm mode 2016-05-26 16:01:24 -07:00
Francois Chollet 0419fe67fc Merge branch 'master' of https://github.com/fchollet/keras 2016-05-25 16:46:21 -07:00
Francois Chollet 33ddeb5cbe Change way node depth is computed for shared layer 2016-05-25 16:46:06 -07:00
Colin Rofls 1f5d5b391b correctly serialize loss function (#2806) 2016-05-24 21:27:57 -07:00
fchollet 198c515208 Simplify imports in README 2016-05-23 23:59:34 -07:00
fchollet 5156673e17 Fix serialization issue with nested Sequential 2016-05-21 18:11:51 -07:00
fchollet 1529c9c438 Clarify error message 2016-05-21 17:01:39 -07:00
fchollet 32b10a8832 Fix first axis dim validation in multi-input model 2016-05-21 16:06:19 -07:00
fchollet 24501d4361 Fix ActivityReg layer 2016-05-21 15:46:32 -07:00
fchollet bf0c08e24a Add FAQ entry about layer freezing 2016-05-21 13:53:23 -07:00
RyosukeHonda 34d8cce6bc Fixed typo (#2770)
Fixed the year from "7 Apr 201" to "7 Apr 2015".
2016-05-21 10:12:45 -07:00
Joshua Loyal f0bfc24adc Correction to fan_out initialization (#2252)
* account for receptive field size in fan_out

* added test for conv layer initializations

* removed old reference to kernel_size
2016-05-19 22:11:41 -07:00
Xingdi (Eric) Yuan 2f8acfe4bf changeable print_summary (#2761)
* use changeable print_summary

* minor
2016-05-19 12:40:06 -07:00
gw0 e2fb8b2786 Add download error suggestion for babi_rnn.py and babi_memnn.py. (#2752) 2016-05-19 10:20:36 -07:00
Francois Chollet ebbc4d9fb8 Fix TB callback with non-standard TF version nums 2016-05-18 10:28:12 -07:00
Francois Chollet 8fc5b90e9a Update bibtex entry 2016-05-16 15:04:49 -07:00
mat kelcey 8a717f5b6c rename z_log_sigma to z_log_std to match z_mean (which is not z_mu) (#2729) 2016-05-16 11:08:00 -07:00
Colin Rofls aa91994166 save keras version & compile args when serializing models (#2690)
* save keras version & compile args when serializing models

* renamed prepare_config -> _updated_config + cleaner implementation
2016-05-15 20:18:31 -07:00
Joel 2091347a71 Fix zero division in merge mode='cos' (#2725)
* fix cos zero division

* use backend epsilon
2016-05-15 20:16:46 -07:00
Mikhail Korobov e0ed174f2c Input: proper error message for missing "shape" argument (#2727) 2016-05-15 15:15:14 -07:00
Francois Chollet 8c2a573ebf Prepare 1.0.3 release 2016-05-15 13:13:19 -07:00
Francois Chollet d7ff7cde92 Add VAE example 2016-05-14 12:06:23 -07:00
Francois Chollet 15d0b0ea08 Add K.tile test 2016-05-14 12:06:02 -07:00
Francois Chollet 3695bc2db5 Remove references to "join" merge mode 2016-05-13 11:06:08 -07:00
Francois Chollet a08995a90d Fix common LaTeX encoding issue 2016-05-12 12:03:20 -07:00
Tsukasa ŌMOTO aea00258e7 Update the reference of Batch Normalization (#2700)
We should refer to the paper accepted at ICML 2015 instead of the arXiv version.
2016-05-12 09:54:46 -07:00
fchollet b581eb3f27 Update RMSprop 2016-05-11 21:35:11 -07:00
Francois Chollet 610ccba9f5 Normalize layer imports in examples 2016-05-11 18:45:37 -07:00
Francois Chollet d5ae6f32dd Fix flaky test 2016-05-11 18:01:01 -07:00
Francois Chollet 5308033936 Update RMSprop, Adagrad, Adadelta 2016-05-11 17:20:27 -07:00
Francois Chollet e2abb5ef2c Fix merge conflicts 2016-05-11 16:07:43 -07:00
Francois Chollet 1b11b4eeb6 Fix shape inference issue with TF.resize_images 2016-05-11 16:06:03 -07:00
Dieuwke Hupkes 39357b3045 Update documentation docstring Embedding (#2693)
From the documentation it is not entirely clear that if mask_zero is set
to True, the input_dim argument should be equal to the size of the
vocabulary + 2, as index 0 cannot be used anymore.

(This behaviour seems a bit strange, as it has as a consequence that the
first column of the weights of the embeddings will never be used or
updated. The resulting network thus has a redundant set of parameters).
2016-05-11 14:10:58 -07:00
Kai Sasaki ed7a5a1418 Residual connection should have the same dimension in case of no projection matrix (#2688) 2016-05-10 21:18:39 -07:00
Kyle McDonald ae682a71f9 functional API intermediate output doc in faq (#2682) 2016-05-10 08:22:53 -07:00
Brian McMahan 8327b37a0b fixed shape typo (#2679)
* fixed shape typo

* pep8
2016-05-09 22:17:12 -07:00
fchollet 973b5570aa Style touch-up 2016-05-06 20:37:46 -07:00
fchollet 7cb41fc5cc Fix weight saving issue 2016-05-06 20:37:35 -07:00
Tsukasa ŌMOTO 595d67ad7d Fix initialization of index_array (#2590)
index_array should be initialized when self.batch_index is zero.
2016-05-06 18:13:11 -07:00
François Chollet bb626c120e Revert "Revert "remove unused import statement in keras dir"" (#2647) 2016-05-06 13:32:43 -07:00
Xingdi (Eric) Yuan ba8fefa8ec Faster GRU (#2633)
* add a simple named entity recognition example

add a simple named entity recognition example

* add fast version of GRU

add fast version of GRU

* remove useless stuff
2016-05-06 11:10:46 -07:00
François Chollet 4b24f6d7b1 Revert "remove unused import statement in keras dir" (#2641) 2016-05-05 23:22:28 -07:00
ηzw 1c460e1e08 remove unused import statement in keras dir (#2638)
* remove unused import statement in keras dir

* rewrite import graph statement
2016-05-05 21:33:04 -07:00
Colin Rofls 7b4e157356 fixed docs for Sequential.get_config, and added a more helpful (#2635)
exception to `model_from_config`.
2016-05-05 15:24:52 -07:00
Dr. Kashif Rasul 5749f1b971 fix soft sign deprecation warning (#2623)
and backward compatible
2016-05-05 13:02:37 -07:00
Francois Chollet 3c57aff85b Style fixes 2016-05-05 11:17:25 -07:00
Carl Thomé 18504bcc86 Faster LSTM (#2523)
* Faster LSTM

* PEP8

* RNN dropout fix

* PEP

* PEP

* Less code duplication

* LSTM benchmark example

* PEP

* Test implementation modes

* Go through Keras backend
2016-05-05 11:01:48 -07:00
Francois Chollet d8864bfe48 Allow use of predict without compilation 2016-05-05 08:24:12 -07:00
Nic Eggert 078b20169b Add batch_get_value to backends (#2615)
* Add function to get multiple values at once

* Change to match existing batch_set_value

* Fix typo
2016-05-04 17:13:17 -07:00
Francois Chollet 5f7e78df65 Improve optimizer configuration 2016-05-04 14:18:06 -07:00
Francois Chollet fc470db7ab Fix typos in layer writing guide 2016-05-03 11:29:50 -07:00
jingzhehu f576f37801 one line fix for TensorBoard callback issue (#2574)
* one line fix for TensorBoard callback issue

Ref: https://github.com/fchollet/keras/issues/2570

* handle SummaryWriter based on tensorflow version

code contributed by @bnaul

https://github.com/bnaul/keras/commit/e04ce5e37ec234debaea8c6482ef90be1f88286d
2016-05-03 10:51:43 -07:00
Francois Chollet b74118a766 Fix typo in documentation 2016-05-02 15:59:04 -07:00
Brian McMahan 1c7a0248b9 updated for list check bug in predict/predict_on_batch (#2585)
* updated for list check bug in predict/predict_on_batch

* pep fix

I think that's going to be the only pep complaint.
2016-05-02 15:33:25 -07:00
Francois Chollet 36a829c20d Add doc page about writing custom layers. 2016-05-02 14:16:09 -07:00
chentingpc 33af75aa39 fix activity regularizer so it can deal with multiple inbound nodes as well (#2573) 2016-05-01 16:36:31 -07:00
jpeg729 844420425e Added softsign activation function (#2097) 2016-04-30 18:29:33 -07:00
Francois Chollet da57a530f9 "total_loss" -> "loss" 2016-04-30 16:38:23 -07:00
fchollet 1f17013949 Misc fixes 2016-04-30 15:09:35 -07:00
fchollet f18899cb36 Merge branch 'master' of https://github.com/commaai/keras into commaai-master 2016-04-30 14:09:56 -07:00
Sasank Chilamkurthy 877f946e24 Improved docs of ImageDataGenerator (#2565) 2016-04-30 11:53:44 -07:00
Francois Chollet a981a8c42c Make bias optional everywhere 2016-04-29 16:54:39 -07:00
Francois Chollet 5467107fc9 Prepare 1.0.2 PyPI release 2016-04-29 10:39:52 -07:00
Gijs van Tulder ad3107073b Re-raise exceptions to preserve stack trace (#2350) 2016-04-28 12:38:36 -07:00
Francois Chollet 8d62f4da6c Minor UX fix 2016-04-27 17:34:33 -07:00
Joel 3779b8a008 Fix test_image path non-exist error in ci-travis (#2531)
* correct inception_v3 network

* store test images in class attribute

* PEP8
2016-04-27 11:35:31 -07:00
Francois Chollet 6ec5e48969 Style touch-ups 2016-04-27 10:53:54 -07:00
fchollet bfa5ca553d Fix docstring 2016-04-27 09:20:19 -07:00
Francois Chollet c9f7d970e9 Style fixes in preprocessing/image 2016-04-26 15:24:05 -07:00
Sasank Chilamkurthy f26ce6e236 Rewriting image augmenter (#2446)
* Much better image data augmentor

* removed unnecessary functions

* shift origin to centre of the image for homographies

* init commit

* change to zoom_range

* Added scikit-image to extras_require in setup.py

* add zoom_range test, exception for invalid zoom_range

* add scikit-image to dependency

* fix fit and retain old functions for unit test

* use ndi instead of skimage in random_transform

* removed buggy code in random_rotations, shears, etc. and replaced it with todos.

* remove sci-image, implement ndimage based methods, refactor random_transform

* random_zoom, array_to_img consider dim_ordering

* add random_channel_shift, support fill_mode and cval

* image doc, update test_image, PEP8

* fix channel shift clip

* fix doc, refine code

* detailed explanation of zoom range

* check coding style
2016-04-26 15:21:14 -07:00
Brian McMahan b001e36f18 adding a disable_b boolean to Dense (#2512)
* adding a disable_b boolean to Dense

* changing 'disable_b' to 'bias' 

Changing the name of the boolean & flipping its behavior so that the default is True and when set to False the bias is not used.

* integrating bias flag fully

changed the bias flag to affect the creation of the self.b variable as well as the output calculation

* fixing a blank line to appease pep8
2016-04-26 14:25:00 -07:00
Francois Chollet 9abb6ef723 Add TF graph management warning 2016-04-26 13:02:39 -07:00
Francois Chollet bfbdbb05bc Add root imports 2016-04-26 13:02:11 -07:00
Francois Chollet bd2bd51b5d Fix typo in README 2016-04-25 19:06:31 -07:00
Francois Chollet 4e547a31ed Improve TF session & variable management 2016-04-25 18:49:19 -07:00
Francois Chollet de8d0defcd Fix PEP8 2016-04-25 18:48:23 -07:00
gw0 344437c491 Fix plot with show_shapes and multiple inputs/outputs. (#2421) 2016-04-25 15:29:16 -07:00
George Hotz ed365e94fd Added simple support for returning a multitarget loss 2016-04-25 14:46:03 -07:00
TobyPDE 5910278ca8 Fixed minor typo in getting-started/sequential-model-guide (#2499) 2016-04-25 09:14:18 -07:00
fchollet 18841fa58d Fix build 2016-04-24 22:18:02 -07:00
Carl Thomé 6fb4e0e441 Add cos and sin to backend (#2493) 2016-04-24 21:43:57 -07:00
Francois Chollet 39051ef3ca Add model_from_config in models.py 2016-04-24 14:33:27 -07:00
fchollet 1f4084870b Add new metrics and metrics tests 2016-04-24 12:10:47 -07:00
fchollet 00e9d5b219 Update regularizer tests 2016-04-24 10:27:45 -07:00
fchollet 7f93747602 Remove outdated comment 2016-04-24 10:27:45 -07:00
Kai Li a7156b8c27 Update antirectifier.py (#2485) 2016-04-24 09:34:20 -07:00
fchollet b1e47f7741 Fix PEP8 2016-04-23 13:55:20 -07:00
Ke Ding 59f8d6ca22 add weights for SGD optimizer (#2478) 2016-04-23 13:33:14 -07:00
Rich P. I. Lewis 5f4019d980 fixed Merge Layer functional API (#2460)
* fixed Merge Layer functional API

* moved test to layers/test_core
2016-04-23 13:32:45 -07:00
Ke Ding f84389da08 fix a benign but wrong range number in GRU's get_constants (#2475) 2016-04-22 18:55:38 -07:00
Jiyuan Qian 63c1757df5 fix accuracy with sparse_categorical_crossentropy (#2471) 2016-04-22 10:41:14 -07:00
Joel d6ab850f45 correct inception_v3 network (#2472) 2016-04-22 10:38:19 -07:00
graham 61dd53e262 allows python3.5 to build alongside < 3.5 (#2457) 2016-04-21 15:31:25 -07:00
Francois Chollet 423a633b5b Update merge tests 2016-04-21 09:44:44 -07:00
Colin Rofls 256d4ef71b clarified usage of sparse_categorical_crossentropy (#2450)
- addresses #2444
2016-04-21 09:36:39 -07:00
Philip Bachman ad49962ba9 fix layer/node topo sort problem (#2433)
* fix layer/node topo sort problem

* fix to only iterate over valid layer/node keys
2016-04-20 21:38:23 -07:00
Brian McMahan 4680d70a78 fixing the constants thing in theano rnn (#2429) 2016-04-20 11:17:05 -07:00
Dapid ee7f056779 DOC: models should be compiled upon loading (#2428) 2016-04-20 10:23:26 -07:00
Francois Chollet 66c8d7baf2 Merge branch 'master' of https://github.com/fchollet/keras 2016-04-20 09:43:12 -07:00
Francois Chollet 9f929999d1 Fix Travis concurrent directory creation issue 2016-04-20 09:43:01 -07:00
fchollet 24f96262ec Add additional input data validation check 2016-04-20 08:53:40 -07:00
Brian McMahan 0e6e7a41f4 adding built check inside TimeDistributed (#2426) 2016-04-20 08:41:27 -07:00
Dan Becker 5cac088d98 Add scikit_learn wrapper example (#2388)
* Add scikit_learn wrapper example

* Extract and evaluate best model in examples/mnist_sklearn_wrapper.py
2016-04-19 21:50:50 -07:00
Francois Chollet 85f0448fee Make merge work with pure TF/TH tensors 2016-04-19 18:46:54 -07:00
Francois Chollet 106c0b753a Merge branch 'master' of https://github.com/fchollet/keras 2016-04-19 11:57:30 -07:00
Francois Chollet c525e634dc Fix loss compatibility validation 2016-04-19 11:57:19 -07:00
Eder Santana c398c0891b add eye to backend (#2407) 2016-04-19 11:38:21 -07:00
Francois Chollet 5ab48ac5d4 Update imagedatagenerator 2016-04-19 11:19:12 -07:00
Jeffery Ye ba29cd8e46 set input_length before reshape (#2410) 2016-04-19 10:49:47 -07:00
chardmeier b61235b77f Fixed typo. (#2401) 2016-04-19 10:20:13 -07:00
Francois Chollet 0ed00e38f0 Add inception v3 example 2016-04-18 21:51:43 -07:00
Francois Chollet 36eef0dd9a Add reset function to ImageDataGenerator 2016-04-18 21:51:43 -07:00
fchollet 1904194c7a Fix wrapper learning phase 2016-04-18 20:07:30 -07:00
Francois Chollet 7ce144881a Fix stateful unrolled RNNs in Theano 2016-04-18 17:09:20 -07:00
Eder Santana 55159cf451 Update topology.py (#2373) 2016-04-17 14:21:31 -07:00
Francois Chollet 7a12fd0f85 1.0.1 release 2016-04-16 14:11:34 -07:00
Steven Hugg 9d60126661 fixed TensorBoard callback (#2363) 2016-04-16 13:56:56 -07:00
Francois Chollet e341e73c6a Fix Dropout in RNNs 2016-04-16 09:08:15 -07:00
Francois Chollet 5dad3786f6 Add batch_set_value for faster TF weight loading 2016-04-15 22:40:54 -07:00
Francois Chollet ed0cd2c60d Add TF/TH kernel conversion util 2016-04-15 22:37:41 -07:00
Kai Li 2eea3a4c5d Update model.md (#2348) 2016-04-15 08:21:46 -07:00
Gijs van Tulder 090a46763e Fix support for custom metrics functions (#2351) 2016-04-15 08:21:16 -07:00
Dan Becker c4ed82cdf6 Fix typo in docs. loss_weight should be loss_weights (#2343) 2016-04-14 21:53:33 -07:00
Dan Becker b32248d615 Change error message in standardize_input_data (#2338) 2016-04-14 19:18:53 -07:00
Francois Chollet ca7437502b Shape inference fix for Embedding 2016-04-14 15:44:36 -07:00
Francois Chollet fe9b797a46 Style fixes in preprocessing/image 2016-04-14 15:44:24 -07:00
Francois Chollet b8059aeaba Fix test_image unit test 2016-04-14 13:38:07 -07:00
Daniele Bonadiman 85f80714c2 Max Over Time in imdb_cnn.py (#2320)
* Max Over Time in imdb_cnn.py

Following this issue https://github.com/fchollet/keras/issues/2296 I propose this PR.
The major optimisations, apart from max over time, are:

- Dropout in the Embedding layer.
- Longer input sequences (400 instead of 100), made possible from the speedup of the Max Over Time.
- Adam optimizer.

Overall it takes 90 to 100 seconds per epoch on my laptop CPU, and in two epochs it reaches 0.885 accuracy, a 5-point improvement over the previous implementation. Moreover, it requires less memory (300k parameters vs 3M+), since the number of parameters no longer depends on the length of the input sequence.

* Update imdb_cnn.py
2016-04-14 13:22:06 -07:00
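The claim that the parameter count no longer depends on the input length can be illustrated with a small NumPy sketch. This is not the code from `imdb_cnn.py`; the helper name `max_over_time` and the filter count are hypothetical, chosen only for illustration. Pooling over the whole time axis yields one value per filter, so whatever follows the pooling sees the same shape for 100 or 400 timesteps.

```python
import numpy as np


def max_over_time(feature_maps):
    """Collapse the time axis of a (timesteps, filters) array with a max.

    Hypothetical helper: one value per filter survives, regardless of
    how many timesteps the convolution produced.
    """
    return feature_maps.max(axis=0)


rng = np.random.default_rng(0)
nb_filter = 250  # assumed filter count, for illustration only

# Two inputs of different lengths give outputs of identical shape.
short_out = max_over_time(rng.normal(size=(100, nb_filter)))
long_out = max_over_time(rng.normal(size=(400, nb_filter)))
print(short_out.shape, long_out.shape)
```

A dense layer placed after this pooling would therefore need `nb_filter` input weights per unit, independent of the sequence length.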
fchollet 2cc9ebf28b Add set_learning_phase in TF backend. 2016-04-14 08:32:04 -07:00
fchollet cb5d69c769 Fix case where output_shape in Merge is tuple 2016-04-13 21:20:21 -07:00
Francois Chollet 3b961a6b7b Fix Graph generator methods 2016-04-13 19:30:11 -07:00
Francois Chollet cba3ea9d90 Fix PEP8 2016-04-13 18:32:18 -07:00
Thomas Boquet 57ea065db7 Added learning phase to callbacks (#2297) (#2303)
* added learning phase to callbacks (#2297)

* cleaned imports

* replaced tabs by spaces

* added case where uses_learning_phase is False

* fixed pep8 blank line bug
2016-04-13 18:00:49 -07:00
Ben Cook 4f5f88b9ba Expose max_q_size and other generator_queue args (#2300)
* [#2287] expose generator_queue args

* [#2287] only expose max_q_size
2016-04-13 18:00:25 -07:00
Francois Chollet c1c2b330a1 Fix "trainable" argument 2016-04-13 17:58:05 -07:00
Francois Chollet 05e1d8e5f4 Fix siamese example 2016-04-13 12:05:42 -07:00
Francois Chollet 3cbca7bdba Fix validation_split 2016-04-13 11:39:16 -07:00
Francois Chollet 3e3c210f1d Update preprocessing/image documentation 2016-04-13 10:15:26 -07:00
Dan Becker cb65139aa8 Allow 'tf' ordering in ImageDataGenerator (#2291) 2016-04-13 10:14:59 -07:00
Francois Chollet 1206120d10 Fix callback issue with Sequential model 2016-04-12 15:22:05 -07:00
Francois Chollet 80ebe80138 Callback style fix 2016-04-12 15:21:02 -07:00
Francois Chollet fa1d6b478e Fix generators methods when passing data as dicts 2016-04-12 13:51:45 -07:00
Francois Chollet 345413fb8c Merge branch 'master' of https://github.com/fchollet/keras 2016-04-12 12:52:05 -07:00
Kyle McDonald 66ebd2a843 corrected parameter for learning_phase (#2284) 2016-04-12 10:18:42 -07:00
fchollet d50f469c09 Fix for invalid dropout values 2016-04-12 09:41:29 -07:00
Charles-Emmanuel 26714bc635 Small typo (#2282)
Small type
2016-04-12 09:23:20 -07:00
Pasquale Minervini 30208ae08b Fix for issue #2276 (#2277) 2016-04-12 09:20:20 -07:00
Francois Chollet 6ea3188971 Fix typo 2016-04-11 22:47:18 -07:00
Kyle McDonald 1db555a530 remove creation of np.asarray in to_categorical (#2268) 2016-04-11 22:36:37 -07:00
Francois Chollet 0772210dea Better error messages for Sequential 2016-04-11 16:41:08 -07:00
Francois Chollet df42e997b7 Merge branch 'master' into keras-1 2016-04-11 09:20:52 -07:00
Francois Chollet 1e71732600 Replace master branch with Keras 1.0 2016-04-11 09:19:03 -07:00
Francois Chollet 141e05e3a7 Update Theano installation instruction 2016-04-11 08:51:01 -07:00
Francois Chollet cadd3e4e2c Make TimeDistributed accept a mask 2016-04-11 08:18:39 -07:00
Francois Chollet 09034d9e17 Add badges to README 2016-04-11 07:49:29 -07:00
Francois Chollet 5706d1d688 Bugfix 2016-04-10 08:36:00 -07:00
Francois Chollet 41b9777746 Clean up training a bit 2016-04-10 07:45:54 -07:00
Francois Chollet d31fe1ac34 Fix LSTM regularizers 2016-04-10 07:45:54 -07:00
Francois Chollet c0eedfeca0 change optimizer in example 2016-04-10 07:45:54 -07:00
Francois Chollet dc8b5509f3 Small fixes in topology engine 2016-04-10 07:45:54 -07:00
Nic Eggert ddec052dab Remove restriction on strides in theano backend conv2d. (#2238)
Previously, strides were required to be smaller than the convolution
kernel. Usually, this is what a user wants, but there are edge
cases where larger strides are useful (for instance, projection
shortcuts in Residual Networks).
2016-04-09 18:48:05 -07:00
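A stride larger than the kernel is what a 1x1 projection shortcut with stride 2 amounts to. The NumPy sketch below is only an illustration under that assumption, not the Theano backend code; `conv_output_length` and `projection_shortcut` are hypothetical names.

```python
import numpy as np


def conv_output_length(input_length, kernel_size, stride):
    """Output length of a 'valid' convolution along one axis."""
    return (input_length - kernel_size) // stride + 1


def projection_shortcut(x, w, stride=2):
    """1x1 convolution whose stride exceeds the kernel size.

    x: (channels_in, rows, cols), w: (channels_out, channels_in).
    A 1x1 kernel only mixes channels, so striding reduces to
    subsampling the spatial grid before the channel mix.
    """
    subsampled = x[:, ::stride, ::stride]
    return np.einsum('oc,chw->ohw', w, subsampled)


x = np.arange(3 * 8 * 8, dtype=np.float32).reshape(3, 8, 8)
w = np.ones((16, 3), dtype=np.float32)
y = projection_shortcut(x, w)
print(y.shape, conv_output_length(8, kernel_size=1, stride=2))
```

The spatial size printed for `y` matches the length formula, which is the case the removed restriction used to reject.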
Francois Chollet 2e45022c95 Clarify interface of in_train_phase 2016-04-08 14:14:22 -07:00
Francois Chollet cec7f73bca Fix irnn example 2016-04-08 14:11:40 -07:00
Francois Chollet 30989dc997 Fix generator methods. 2016-04-08 12:59:09 -07:00
Fariz Rahman 50a0d1cad4 Update topology.py (#2239) 2016-04-08 10:40:58 -07:00
Fariz Rahman 3db1f132a7 call : remove train arg from doc string (#2235) 2016-04-08 08:59:55 -07:00
Fariz Rahman 3b196feda5 Typo fix (#2234) 2016-04-08 08:59:43 -07:00
berleon dd766c68d9 Fix LeakyReLU return dtype (#2214)
LeakyReLU returns a tensor with float64 dtype.
It is stupid, but this line actually produces a float64 array:

```
    0.5*np.array(0.2, dtype=np.float32)
```

The theano nnet.relu function does something similar with the
LeakyReLU alpha parameter, which leads to a float64 tensor.
The solution is to not cast the alpha to float32.

Furthermore, I tightened `test_utils.layer_test`. It is now
required that the layer's output dtype is equal to the input dtype.
2016-04-07 17:02:03 -07:00
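The tightened check described above (output dtype must equal input dtype) can be sketched in plain NumPy. This is an assumption-laden illustration, not the Keras `layer_test` or the Theano `nnet.relu` code; `leaky_relu` and `check_dtype_preserved` are hypothetical helpers. Scalar promotion rules differ between NumPy versions, so the result of the line quoted in the commit may vary, but a plain Python float multiplied by a float32 array keeps float32.

```python
import numpy as np


def leaky_relu(x, alpha=0.2):
    """Leaky ReLU with alpha left as a plain Python float (no cast)."""
    return np.where(x > 0, x, alpha * x)


def check_dtype_preserved(fn, x):
    """Fail loudly if fn changes the dtype of its input."""
    out = fn(x)
    assert out.dtype == x.dtype, (out.dtype, x.dtype)
    return out


x = np.array([-1.0, 0.0, 2.0], dtype=np.float32)
out = check_dtype_preserved(leaky_relu, x)
print(out.dtype)
```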
Francois Chollet 599e070824 Small fixes to the docs. 2016-04-07 16:52:09 -07:00
Francois Chollet 04c998a742 Fix typos 2016-04-07 14:25:41 -07:00
Francois Chollet cc985c3a9c Update docs, visualization utils 2016-04-07 14:23:34 -07:00
Francois Chollet 36cc508030 New documentation 2016-04-06 17:34:25 -07:00
Francois Chollet 444cd56740 Update README 2016-04-06 17:33:39 -07:00
Francois Chollet 88a86f7e45 Merge branch 'keras-1' of https://github.com/fchollet/keras into keras-1 2016-04-06 17:23:21 -07:00
Francois Chollet d6f94c0bc9 Fix docstring 2016-04-06 17:21:31 -07:00
Francois Chollet 2157aa6172 Make function compilation lazy 2016-04-06 17:21:21 -07:00
Francois Chollet 2013527840 Add docstrings to TF backend 2016-04-06 17:21:00 -07:00
Francois Chollet 8c73c6f218 Simplify lstm example 2016-04-06 17:20:43 -07:00
Michael Oliver 63059f6063 Bug fix to set correct uses_learning_phase flag
```test_on_batch``` and ```predict_on_batch``` had the wrong ```uses_learning_phase``` flag
2016-04-06 16:56:31 -07:00
Francois Chollet e179198410 Cache recursive calls in build_map_of_graph 2016-04-06 13:30:07 -07:00
Francois Chollet ed4a95bdad Slight cleanup of build_map_of_graph 2016-04-06 11:46:18 -07:00
Francois Chollet 1baddb9094 Docstrings cleanup 2016-04-06 11:46:01 -07:00
Francois Chollet 3160a445a8 Merge branch 'keras-1' of https://github.com/fchollet/keras into keras-1 2016-04-06 09:05:04 -07:00
Francois Chollet d627fd8781 Docstrings improvements 2016-04-06 09:04:53 -07:00
Michael Oliver b4bdc5a0fa add in predict_generator and tests
* add in predict_generator and tests

* fix PEP8 details

* Pre-allocate predictions

* make predictions return list if necessary

* reset batch_size for other tests, make less wonky generator
2016-04-06 09:03:37 -07:00
Francois Chollet 6911fa2cba Fix typos in functional API guide 2016-04-05 10:27:11 -07:00
Francois Chollet 0d7c8711bd Fix JSON serialization issue 2016-04-05 10:26:57 -07:00
Francois Chollet d8b0fe0957 Correct functional API guide 2016-04-04 21:49:16 -07:00
Francois Chollet 263de77a5a Improve README, functional API guide 2016-04-04 21:46:27 -07:00
Francois Chollet d3615e682e Small fixes. 2016-04-04 15:00:06 -07:00
Michael Oliver 35da9d6ef2 Make tensorflow backend fully mimic theano.dot
* Squashed commit of the following:

commit f25b56f3a7547da94ffecee8701da5b34e757104
Author: Michael Oliver <michael.d.oliver@gmail.com>
Date:   Mon Apr 4 19:01:49 2016 +0000

    add proper shape inference

commit 5544f66148676d86cb53309701eee6a4e99a3aeb
Author: Michael Oliver <michael.d.oliver@gmail.com>
Date:   Mon Apr 4 18:03:05 2016 +0000

    Make PEP8 compliant and use int_shape

commit 0fe6e02699e2da553f8f1a866aaaa081c26a1cbd
Author: Michael Oliver <michael.d.oliver@gmail.com>
Date:   Mon Apr 4 03:35:29 2016 +0000

    Make tensorflow backend fully mimic theano dot

* fix None comparison
2016-04-04 14:30:55 -07:00
cmyr dbe7662e72 add sparse_categorical_crossentropy 2016-04-04 14:24:41 -07:00
Francois Chollet 30118bbab0 Improve docstrings, UX of common user mistakes 2016-04-04 14:22:28 -07:00
Francois Chollet 2902149f77 Finish PR backporting 2016-04-04 11:30:24 -07:00
Francois Chollet eb8b40cccd Fix captioning example in docs 2016-04-04 11:09:20 -07:00
Francois Chollet 76da13dff6 Backport of conv2d PRs by @ns 2016-04-04 11:09:06 -07:00
Francois Chollet f7cbdff79c Improve model training tests 2016-04-04 10:25:58 -07:00
Francois Chollet 81233b3cd3 Fix lambda serialization issue in Py3 2016-04-04 10:11:29 -07:00
Fariz Rahman af88e051fa Typo fix 2016-04-04 07:38:54 -07:00
berleon 6ad6b19bd6 Merge layer handle none in input_shapes 2016-04-04 07:34:36 -07:00
Francois Chollet 0242ca59ac Update version numbers 2016-04-03 20:58:13 -07:00
Francois Chollet c064963ef8 Fix Py3 compatibility issue in test 2016-04-03 20:22:29 -07:00
Francois Chollet 4318718769 Fix PEP8 2016-04-03 20:22:17 -07:00
Francois Chollet f981bdb551 Update functional API guide 2016-04-03 20:04:14 -07:00
Francois Chollet 3860e078a5 Small fixes in optimizers docstrings 2016-04-03 20:04:14 -07:00
Eder Santana d09e2a67bb fix batch_dot tests on backend
* Fix merge_dot tests

* Make batch_dot unique

batch_dot is not tensordot! It only accepts one reduce dimension at a
time. Other reduce dimensions should be done afterwards with K.sum.
This means that K.batch_dot will have the same behavior in both
tensorflow and theano. This also means that we have fewer parentheses and
fewer nested lists.

New usage:

merge_mode = 'dot', dot_axes=[axis1, axis2]

Before:

merge_mode = 'dot', dot_axes=[[axis1], [axis2]]

* Backport sign by @the-moliver

* Fix docstrings

* Fix backend batch_dot tests
2016-04-03 20:04:00 -07:00
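The "one reduce dimension at a time" rule can be sketched with NumPy. This is a hypothetical stand-in (`batch_dot_sketch`), not the backend implementation, and it does not try to reproduce Keras's exact output-shape conventions. It contracts a single axis pair per sample, with `axes` given in the flat `[axis1, axis2]` form.

```python
import numpy as np


def batch_dot_sketch(x, y, axes):
    """Contract one axis of x with one axis of y, sample by sample.

    axes is [axis_x, axis_y], counted with the batch axis at 0.
    """
    axis_x, axis_y = axes
    # Per-sample arrays drop the batch axis, so shift each index by one.
    return np.stack([
        np.tensordot(xi, yi, axes=(axis_x - 1, axis_y - 1))
        for xi, yi in zip(x, y)
    ])


x = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
y = np.arange(40, dtype=np.float64).reshape(2, 4, 5)
out = batch_dot_sketch(x, y, axes=[2, 1])
print(out.shape)
```

For these shapes the result coincides with a batched matrix product; any further reduction would be a separate sum over the remaining axes.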
Francois Chollet 56ba6b9c7e Merge branch 'keras-1' of https://github.com/fchollet/keras into keras-1 2016-04-03 18:58:18 -07:00
berleon a9c6f26412 Add defaults for _gather_[list/dict]_attr
For example, the Dense layer does not have an update attribute, which
results in an error.
2016-04-03 18:58:08 -07:00
Francois Chollet 73817b8b77 More thorough tests for TimeDistributed 2016-04-03 13:38:46 -07:00
Francois Chollet 3f128b9838 Merge branch 'keras-1' of https://github.com/fchollet/keras into keras-1 2016-04-03 13:19:38 -07:00
berleon 9621bb5b8e fix h5py string encoding
When saving the weights a TypeError is raised by h5py.

See this issue https://github.com/h5py/h5py/issues/289 for details.

As it is recommended in the issue, the strings are now encoded as utf8.
2016-04-03 13:19:12 -07:00
Francois Chollet 836fb03aa0 Merge branch 'keras-1' of https://github.com/fchollet/keras into keras-1 2016-04-03 10:32:38 -07:00
Eder Santana c3aaf50b64 Update complete_guide_to_the_keras_functional_api.md
small changes
2016-04-03 10:29:02 -07:00
Francois Chollet fe00f5ff64 Fix learning phase issue with regularizers 2016-04-03 10:28:06 -07:00
Eder Santana 8b3543fca9 Fix merge_dot tests
* Fix merge_dot tests

* Make batch_dot unique

batch_dot is not tensordot! It only accepts one reduce dimension at a
time. Other reduce dimensions should be done afterwards with K.sum.
This means that K.batch_dot will have the same behavior in both
tensorflow and theano. This also means that we have fewer parentheses and
fewer nested lists.

New usage:

merge_mode = 'dot', dot_axes=[axis1, axis2]

Before:

merge_mode = 'dot', dot_axes=[[axis1], [axis2]]

* Backport sign by @the-moliver

* Fix docstrings
2016-04-03 10:03:09 -07:00
Leon Chen a6fe2ae341 test against latest tensorflow in travis 2016-04-03 10:02:31 -07:00
graham 3ca7751445 fixed slice_X import location 2016-04-02 19:00:26 -07:00
Francois Chollet ebfde534c0 Add weight deduping before updates computation 2016-04-02 10:37:57 -07:00
Eder Santana f8c7dbb758 Backport #2123 by @carlthome
Backport #2123 by @carlthome
2016-04-01 21:27:20 -07:00
Francois Chollet 62f9053330 Fix babi_rnn example 2016-04-01 21:14:05 -07:00
Francois Chollet c429e651c1 Fix weight regularizers 2016-04-01 21:13:57 -07:00
Francois Chollet dacf017d38 Fix potential issue in topology engine 2016-04-01 21:13:34 -07:00
Francois Chollet dc3c1488bb Fix unit tests for Merge 2016-04-01 18:01:41 -07:00
Francois Chollet 4d7ff76cfb Fix some more tests 2016-04-01 16:45:05 -07:00
Francois Chollet 8ad6865952 Fix conv3d tests 2016-04-01 16:12:04 -07:00
Francois Chollet 2a4f6b942d Add more tests 2016-04-01 15:58:19 -07:00
Francois Chollet 91a819fb34 Fix engine issue 2016-04-01 15:58:07 -07:00
Francois Chollet 52dbeb1f26 Reintroduce failing siamese test 2016-04-01 15:57:45 -07:00
Francois Chollet 0836e47dfc Theano: warn on unused input 2016-04-01 15:56:30 -07:00
Francois Chollet 75bef59016 Unify dot behavior in TF and Theano 2016-04-01 15:56:19 -07:00
Francois Chollet 337c0c66cf Remove shape infer test (now incl in layers tests) 2016-04-01 13:36:28 -07:00
Francois Chollet f8e2df16f1 Fix TF batch_dot 2016-04-01 13:36:03 -07:00
Francois Chollet 10deb8f267 Add batchnorm unit tests 2016-04-01 13:25:59 -07:00
Francois Chollet efe5916109 Fix convolutional tests 2016-04-01 13:25:48 -07:00
Francois Chollet 64449c196e Fix recurrent tests 2016-04-01 13:25:37 -07:00
Francois Chollet f57128bd3d Fix wrappers 2016-04-01 13:25:13 -07:00
Francois Chollet 8740791d5c Small fixes to core layers 2016-04-01 13:24:48 -07:00
François Chollet 6ddb5a0452 Merge pull request #2158 from EderSantana/keras-1
Add batch_dot to backend
2016-04-01 08:38:29 -07:00
EderSantana 133699c2f3 keras.engine.topology.Merge uses K.batch_dot 2016-04-01 10:56:12 -04:00
EderSantana b0ea92bc12 Add batch_dot to backend 2016-04-01 00:15:28 -04:00
fchollet b587aeee1c Remove badge from README 2016-03-31 19:13:22 -07:00
Francois Chollet e754581ecb Fix noise layers 2016-03-31 19:08:07 -07:00
Francois Chollet fcb6ae8eed Fix activity regularization 2016-03-31 18:46:24 -07:00
Francois Chollet bf4dab3501 Update core layers 2016-03-31 18:17:17 -07:00
Francois Chollet a066cf8680 Fix advanced activations. 2016-03-31 16:43:10 -07:00
Francois Chollet 7448dcea65 Fix optimizers tests 2016-03-31 13:41:30 -07:00
Francois Chollet cf3b3dff32 Fix regularizers 2016-03-31 13:41:21 -07:00
Francois Chollet ca96737b20 Fix callbacks 2016-03-31 13:41:07 -07:00
Francois Chollet 61ade48343 Some cleanup 2016-03-31 11:50:30 -07:00
Francois Chollet 295bfe4e3a Keras 1.0 preview. 2016-03-31 11:35:27 -07:00
Francois Chollet bdd70d06d3 Prepare 0.3.3 release (last release before 1.0). 2016-03-31 11:15:44 -07:00
François Chollet 39a3be60c0 Merge pull request #2109 from the-moliver/sign
add sign operation to backends
2016-03-31 11:07:02 -07:00
François Chollet d81e4127fb Merge pull request #2123 from carlthome/master
Support low-precision floats
2016-03-31 11:04:11 -07:00
Carl Thomé 122be6e30b Support low-precision floats 2016-03-29 20:13:45 +02:00
Michael Oliver 8056f0dd37 add sign operation to backends 2016-03-28 19:43:49 +00:00
François Chollet 3a61ace619 Merge pull request #2098 from kylemcdonald/patch-4
roll jitter instead of pixel jitter for deep dream
2016-03-28 09:49:57 -07:00
Kyle McDonald 8661e78f08 roll jitter instead of pixel jitter for deep dream 2016-03-27 11:21:23 -04:00
François Chollet 90aafca585 Merge pull request #2055 from EderSantana/batch_dot
Batch dot
2016-03-25 17:18:50 -07:00
EderSantana d2ce350657 2D tensor support for batch_dot 2016-03-25 10:12:55 -04:00
EderSantana d4fce4f5f1 batch_dot supports arbitrary sized arrays 2016-03-24 15:35:21 -04:00
François Chollet 5e301d1f63 Merge pull request #2044 from NasenSpray/conv2d
Use T.nnet.conv2d interface
2016-03-24 08:53:42 -07:00
EderSantana 4e5348c5ca Add batch_dot to core.py layers 2016-03-23 18:52:57 -04:00
EderSantana a19db7b672 Add batch_dot to backend 2016-03-23 18:46:00 -04:00
François Chollet 85ebccfcc8 Merge pull request #2049 from grahamannett/patch-1
prevent UnicodeDecodeError
2016-03-23 13:59:30 -07:00
graham 677e9ee8ba prevent UnicodeDecodeError
Fix error that arises on some systems where python tries to read text as ascii
2016-03-23 05:49:16 -07:00
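The failure mode named here can be reproduced with the standard library alone. The sketch below is an assumption about the general pattern (state the encoding explicitly when opening a text file), not the change made in this commit.

```python
import io
import os
import tempfile

text = u"Fran\u00e7ois"  # contains a non-ASCII character

fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Reading with an explicit encoding works on any system.
    with io.open(path, "r", encoding="utf-8") as f:
        loaded = f.read()

    # Forcing ASCII reproduces the error the commit describes.
    try:
        with io.open(path, "r", encoding="ascii") as f:
            f.read()
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True
finally:
    os.remove(path)

print(loaded == text, ascii_failed)
```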
ns 0f4c7864ba fixed weird performance regression with up-to-date Theano 2016-03-23 09:27:29 +01:00
ns 63c099714b PEP8 fix 2016-03-23 08:53:36 +01:00
ns 3dd27c61fb fixed output shape for even kernels 2016-03-23 08:37:11 +01:00
ns 6543b67509 use Theano's new T.nnet.conv2d interface 2016-03-23 04:53:54 +01:00
François Chollet d0b348a55a Merge pull request #2037 from TheRushingWookie/master
Fixed the image captioning example
2016-03-22 19:03:29 -07:00
Quinn Jarrell 9f8d3cb399 Changed the image captioning example so it works with current keras. It was misusing the merge layer and had an incorrect parameter in one of the GRU layers. Should fix issue #1522. 2016-03-22 15:40:55 -04:00
François Chollet deb4c06df8 Merge pull request #2013 from carlthome/master
Warn when epoch sees more samples than expected
2016-03-21 22:44:21 -07:00
François Chollet d3cc1de2d7 Merge pull request #2027 from sudeepraja/master
Added fix for training h5py dataset on Graph model
2016-03-21 22:43:20 -07:00
Sudeep Raja 404a30df88 Update models.py
Added fix for training h5py dataset on Graph model
2016-03-22 03:51:02 +05:30
Carl Thomé 3bd7d11170 Don't assume what a batch generator does 2016-03-21 15:59:07 +01:00
Carl Thomé 9ac50e0050 Warn when epoch sees more samples than expected 2016-03-20 19:51:24 +01:00
Francois Chollet 1145fec39f Update backend 2016-03-19 09:06:52 -07:00
François Chollet b0303f03ff Merge pull request #1968 from cadurosar/master
Fix Reproducibility with ImageDataGenerator
2016-03-14 18:34:34 -07:00
François Chollet 2716dcd6ab Merge pull request #1963 from jnphilipp/fix_config
Fixed get_config for SReLU.
2016-03-14 18:33:52 -07:00
François Chollet fef9de0d17 Merge pull request #1961 from giorgiop/typos_examples
typos in examples
2016-03-14 18:33:40 -07:00
Carlos Eduardo Rosar Kos Lassance 2dcafafcf9 change all instances of python random to numpy random in keras/preprocessing/image 2016-03-14 16:55:57 +01:00
jnphilipp 775664fdb8 Fixed nulls in input_shape. 2016-03-14 13:36:05 +01:00
jnphilipp a67034ee7c Fixed get_config for SReLU. 2016-03-14 12:48:08 +01:00
giorgiop e72bb9506a fixed typos 2016-03-14 11:27:20 +11:00
Francois Chollet ecd414d716 Fix RNN regularizers 2016-03-11 21:29:37 -08:00
Francois Chollet 80a831de1a Fix failing Lambda test 2016-03-11 17:54:19 -08:00
Francois Chollet 37fd456a5c Lambda layer improvements 2016-03-11 17:29:37 -08:00
Francois Chollet 0c1af0901d Remove class_mode, add support for acc in Graph 2016-03-11 17:29:17 -08:00
François Chollet 70431a5336 Merge pull request #1953 from tjrileywisc/master
Added os import so that check for weights file runs.
2016-03-11 11:06:10 -08:00
Francois Chollet cf3ab771d3 Cleanup of image processing utils. 2016-03-11 11:01:25 -08:00
Francois Chollet 524090e600 Cleanup of text/sequence preprocessing utils. 2016-03-11 10:50:57 -08:00
tim riley 9c9318ff6b Added os import so that check for weights file runs. 2016-03-11 12:57:14 -05:00
fchollet f75f70a60d Update examples in documentation 2016-03-10 21:31:35 -08:00
fchollet e3c31aa762 Merge branch 'master' of ssh://github.com/fchollet/keras 2016-03-10 20:57:18 -08:00
fchollet ef1e959505 Update visualize_util 2016-03-10 20:49:01 -08:00
François Chollet c5b8a1df80 Merge pull request #1943 from matt-peters/compile_kwargs
Allow passing kwargs down to K.function
2016-03-10 16:43:35 -08:00
Matthew Peters e73cf505a7 Add helpful error messages for invalid kwargs in K.function 2016-03-10 16:18:48 -08:00
Matthew Peters 8606edf3bf Allow passing kwargs down to K.function 2016-03-10 15:20:18 -08:00
Francois Chollet 71d46b7153 Fix docstring 2016-03-10 13:28:21 -08:00
Francois Chollet 7f3b2067bc Merge branch 'sklearn' of https://github.com/ipod825/keras into ipod825-sklearn 2016-03-10 13:22:55 -08:00
François Chollet ed882f4064 Merge pull request #1934 from danieldk/tf-backwards-fix
Tensorflow: reverse mask on go_backwards.
2016-03-09 11:17:06 -08:00
Daniël de Kok 126b820561 Tensorflow: reverse mask on go_backwards. 2016-03-09 10:27:25 +01:00
ipod825 b50624debd Add code to avoid user's typo in sk_params 2016-03-08 12:12:45 +08:00
François Chollet 709390dfdb Merge pull request #1918 from snurkabill/allow_soft_placement
by allowing soft placement into keras, we can run whole keras model u…
2016-03-07 19:27:28 -08:00
Jiří Vahala 56c492cbcc by allowing soft placement into keras, we can run whole keras model using tensorflow's "with("/:gpu42")" syntax 2016-03-08 03:34:17 +01:00
fchollet bde45eff87 Split model tests, fix create_output graph bug 2016-03-07 17:48:54 -08:00
Francois Chollet ce7276bc55 Merge branch 'master' of https://github.com/fchollet/keras 2016-03-07 16:48:30 -08:00
Francois Chollet fc476840fa Style fixes 2016-03-07 16:46:19 -08:00
Francois Chollet cfcb1e8703 Switch to symbolic shapes for TF in K.shape() 2016-03-07 16:46:11 -08:00
fchollet 8ba647c196 Fix PEP8 2016-03-06 19:07:49 -08:00
fchollet a86057d91c Improve error messages in TF backend. 2016-03-06 17:32:07 -08:00
fchollet 1ebeff8ee3 Move data_utils to utils.data_utils. 2016-03-06 17:31:57 -08:00
Francois Chollet be4a86f6dc Fix PEP8 2016-03-05 22:41:25 -08:00
Francois Chollet e5ccf53531 Fix typo 2016-03-05 22:24:18 -08:00
Francois Chollet ef93e2cffd Merge branch 'timedistributed' 2016-03-05 21:48:56 -08:00
Francois Chollet b8c59acd77 Improve tests for TimeDistributed 2016-03-05 21:48:41 -08:00
Francois Chollet 3cc242615d TimeDistributed: +docstring, perf improv, TF warn 2016-03-05 21:48:29 -08:00
Francois Chollet 6596cc79d6 Fix variable-size shape inference in padding layer 2016-03-05 21:18:12 -08:00
Francois Chollet cd28c6d07e Merge branch 'master' of https://github.com/fchollet/keras 2016-03-04 14:11:17 -08:00
Francois Chollet b883761820 Neural style transfer: remove input preprocessing. 2016-03-04 14:10:45 -08:00
François Chollet b2b04b0fff Merge pull request #1893 from kylemcdonald/patch-3
changed doc `(input_dim,)` to `(output_dim,)`
2016-03-04 13:41:56 -08:00
Francois Chollet 9e2628e811 Fix neural style transfer example 2016-03-04 13:41:01 -08:00
Kyle McDonald 537fb1cc01 changed doc (input_dim,) to (output_dim,) 2016-03-04 15:54:11 -05:00
François Chollet 99891c0cc8 Merge pull request #1892 from kylemcdonald/patch-2
corrected documentation of "weights" argument
2016-03-04 12:45:03 -08:00
Kyle McDonald d748db43ae corrected documentation of "weights" argument 2016-03-04 15:29:28 -05:00
fchollet 4ec84541e3 Fixes in imagedatagenerator 2016-03-02 20:44:07 -08:00
François Chollet ca05efc76f Merge pull request #1862 from qdbp/issue_1857
Fixing issue 1857
2016-03-01 11:30:19 -08:00
Francois Chollet 7768ae04a2 actually use nb_[val_]_worker args 2016-03-01 13:48:29 -05:00
Francois Chollet 0daec53acb Style fixes 2016-03-01 10:14:56 -08:00
berleon d9ca798c60 fix initialization of BatchNormalization
Currently the scaling parameter gamma is uniformly sampled between
-0.05 and 0.05. This also brings the std to at most 0.05 rather than to
1, as the docstring of the BatchNormalization class claims.

This commit fixes it by initializing gamma to 1. The same
initialization is used by
[lasagne](http://lasagne.readthedocs.org/en/latest/modules/layers/normalization.html#lasagne.layers.BatchNormLayer)
2016-03-01 13:47:00 +00:00
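The effect of the two gamma initializations on the output std can be checked with a short NumPy sketch. It assumes a plain feature-wise normalization with a hypothetical `batch_norm` helper and made-up data; it is not the Keras layer itself.

```python
import numpy as np


def batch_norm(x, gamma, beta, eps=1e-6):
    """Normalize each feature over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta


rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=5.0, size=(1000, 4))
beta = np.zeros(4)

gamma_old = rng.uniform(-0.05, 0.05, size=4)  # previous initialization
gamma_new = np.ones(4)                        # initialization after the fix

std_old = batch_norm(x, gamma_old, beta).std(axis=0)
std_new = batch_norm(x, gamma_new, beta).std(axis=0)
print(std_old, std_new)
```

With gamma drawn from the small uniform range the output std equals the magnitude of gamma, so it stays below 0.05; with gamma set to 1 it is approximately 1.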
Francois Chollet 990ef92a60 Merge branch 'master' of https://github.com/fchollet/keras 2016-02-29 20:25:19 -08:00
Francois Chollet becc5f3a2c Add support for dim_ordering in initializations 2016-02-29 20:18:17 -08:00
François Chollet e5d0dc65e0 Merge pull request #1856 from GregorySenay/master
Update Siamese layer set weights function
2016-02-29 18:48:03 -08:00
Francois Chollet 48ce23086b Fix recurrent dropout 2016-02-29 16:41:54 -08:00
Francois Chollet a3c9d2d7c9 Merge branch 'master' of https://github.com/fchollet/keras 2016-02-29 10:50:21 -08:00
Francois Chollet 9efe17aeea Fix RNN get initial states recursion issue 2016-02-29 10:50:08 -08:00
fchollet 763a2a9536 Style and docstring fixes, sklearn wrapper. 2016-02-28 11:54:00 -08:00
Francois Chollet 4a43567cea Update FAQ 2016-02-28 10:13:52 -08:00
ipod825 be24159959 Refactoring of sklearn 2016-02-28 13:46:20 +08:00
Francois Chollet 943d2d4cf8 Add native names to weight tensors 2016-02-27 17:43:18 -08:00
Francois Chollet c4361d2246 Improve Lambda layer shape inference for TF 2016-02-27 12:10:58 -08:00
Francois Chollet 80927fa958 Fix Theano backend pool 2016-02-27 11:54:48 -08:00
Francois Chollet f3f19146f9 Remove dummy inputs in layers 2016-02-27 11:24:04 -08:00
fchollet 3799660504 Update pool module import for Theano 2016-02-27 10:46:46 -08:00
Francois Chollet 9b4f973d57 Fix lstm regularizers 2016-02-26 13:04:14 -08:00
Francois Chollet 567fdccd0b Add Merge input_shape property 2016-02-26 11:38:36 -08:00
Francois Chollet cf755a9c7c Merge branch 'master' of https://github.com/fchollet/keras 2016-02-26 10:46:40 -08:00
Francois Chollet ff2f8ac69b Fix stateful LSTM example 2016-02-26 10:45:12 -08:00
François Chollet 428f4bfde6 Merge pull request #1832 from NasenSpray/fix_mean
T.mean() on int tensors returns float64; use _FLOATX instead
2016-02-26 09:15:48 -08:00
François Chollet 7552f2c26d Merge pull request #1834 from farizrahman4u/patch-5
Fix imports : bAbI Example
2016-02-26 08:45:37 -08:00
Fariz Rahman 0eea5f8867 Fix imports : bAbI Example 2016-02-26 22:13:06 +05:30
ns 47c67ac19a T.mean() on int tensors returns float64; use _FLOATX instead 2016-02-26 16:50:57 +01:00
François Chollet 55d9374961 Merge pull request #1824 from farizrahman4u/patch-2
bAbI: Doubling the FB LSTM baseline :)
2016-02-25 12:51:04 -08:00
Francois Chollet 045e47174f Add generic TimeDistributed layer. 2016-02-25 12:41:53 -08:00
Fariz Rahman 22c091ae3f bAbI: Doubling the FB LSTM baseline :) 2016-02-26 01:06:52 +05:30
François Chollet d20fe64a69 Merge pull request #1823 from EderSantana/patch-6
Update SiameseHead
2016-02-25 11:33:42 -08:00
Eder Santana c6c150b042 Update SiameseHead
We forgot to pass the train parameter to the layer.__call__ inside get_output_at. This is a bug.
2016-02-25 13:50:08 -05:00
Francois Chollet ababd95210 Fix merge conflicts 2016-02-25 10:36:24 -08:00
François Chollet ff676f10f6 Merge pull request #1811 from fchollet/faster-rnn
Possibly faster RNNs
2016-02-25 10:15:45 -08:00
Fariz Rahman 73e563ecaf Speed up RNNs
Update core.py

Fix TF indexing issue

Update recurrent.py

TF slicing issue workaround

Update recurrent.py

Update core.py

Update core.py

Remove all backend specific code

shape[-1] instead of input_dim

Update core.py

space fix

Update core.py

Fix TF reshape issue

Update recurrent.py

Update core.py

Fix overflow issue in 3D softmax

Improve readability
2016-02-25 23:31:28 +05:30
François Chollet 3f905e4a35 Merge pull request #1817 from AIshb/patch-1
Fix a little bug in pad_sequences
2016-02-25 09:51:48 -08:00
Francois Chollet 98e2789db9 Readability improvements 2016-02-25 09:05:59 -08:00
François Chollet 8a6cf4c13e Merge pull request #1816 from smohsensh/patch-1
fix add_shared_node SiameseHead naming
2016-02-25 08:22:52 -08:00
AIshb f4af11c730 Update sequence.py
fix a little mistake in pad_sequences
2016-02-25 20:31:28 +08:00
mohsen 9f2aa1b6ae fix add_shared_node SiameseHead naming 2016-02-25 15:31:33 +03:30
Francois Chollet abca83373d Possibly faster RNNs
Forgot dropout.

Various fixes

Fix SimpleRNN dropout typo
2016-02-24 22:15:05 -08:00
Fariz Rahman 3aa807a0c8 Speed up softmax too
Update core.py

Update core.py

Update core.py

Update core.py

Update core.py

Update theano_backend.py

Update tensorflow_backend.py
2016-02-25 10:39:36 +05:30
Fariz Rahman 089fa11752 TimeDistributedDense Speed up
Update core.py
2016-02-25 10:38:49 +05:30
Francois Chollet 06a1545645 Style fixes 2016-02-24 13:05:47 -08:00
François Chollet 461573a8d9 Merge pull request #1790 from neggert/call-layer-cache
Enable cache optimization for __call__
2016-02-24 12:45:29 -08:00
François Chollet db8f43128b Merge pull request #1801 from farizrahman4u/patch-26
Lambda layer: Bug Fix
2016-02-23 20:53:56 -08:00
Fariz Rahman 80ddb5b3b8 Update core.py 2016-02-24 03:10:33 +05:30
Gregory 3ffba42466 Update core.py
Change Siamese set_weights function to avoid false nb layers value
2016-02-23 11:07:38 -08:00
François Chollet 2c49115cd3 Merge pull request #1792 from neggert/pool-border-same
Implement border_mode="same" for pool2d in Theano
2016-02-22 15:55:24 -08:00
z001qdp bbaa66c530 Implement border_mode="same" for pool2d in Theano 2016-02-22 16:30:02 -06:00
François Chollet 784d81d2c8 Merge pull request #1623 from oeway/master
Convolutional layers for 3D
2016-02-22 12:47:06 -08:00
z001qdp 523e9845d7 Make layer and shape caches more robust
Move caches to properties so that containers can override the
implementation to ensure that the cache gets propagated correctly
to child layers when it is changed.

Reset instead of disabling layer and shape cache in __call__

Previously, __call__ did not get the speed benefits from caching
because it disabled it in order to feed the layer new input. This
meant that __call__ could be very slow on complicated structures.
Now, instead of disabling it, we temporarily empty it, then restore
the original when we're done.
2016-02-22 14:21:21 -06:00
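The "temporarily empty it, then restore the original" approach described in commit 523e9845d7 above can be sketched roughly as follows. This is only an illustration under assumptions, not the Keras code: `CachedLayer`, its `cache` attribute, and `compute` are hypothetical names invented for the example.

```python
class CachedLayer:
    """Hypothetical layer that memoizes outputs per input key."""

    def __init__(self):
        self.cache = {}
        self.compute_calls = 0

    def compute(self, x):
        # Return a cached result when available; otherwise compute and store it.
        if x not in self.cache:
            self.compute_calls += 1
            self.cache[x] = x * 2
        return self.cache[x]

    def __call__(self, x):
        # Swap in an empty cache instead of disabling caching, so repeated
        # lookups during this call still hit the (temporary) cache.
        saved = self.cache
        self.cache = {}
        try:
            return self.compute(x) + self.compute(x)
        finally:
            # Restore the original cache even if compute() raises.
            self.cache = saved


layer = CachedLayer()
layer.cache["stale"] = -1
result = layer(3)
```

In this sketch the second `compute(3)` inside `__call__` is served from the temporary cache, and the pre-existing cache contents are back in place afterwards.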
François Chollet 47bd0af702 Merge pull request #1721 from neggert/graph-call
Make __call__ work properly for Graph
2016-02-22 11:09:52 -08:00
François Chollet 82d3489764 Merge pull request #1789 from yaringal/BRNN_latest
Update example file `imdb_lstm.py` to use dropout in RNNs
2016-02-22 10:23:03 -08:00
z001qdp 44d558ad7f Refactor implementation of __call__.
This refactor allows the inherited method to work properly for
Sequential and Graph (with single input) containers in addition
to normal layers, so there's no need to override the method.
Previously, __call__ did not work correctly for Graph containers.

Implement Graph.__call__ for multiple inputs

Add option to (re-)initialize weights in set_previous

This allows us to use set_previous in places where we previously
manually adjusted the previous layer, which means that layers
that have non-standard set_previous implementations (like Graph)
work properly when they are, for example, the first layer in a
Sequential model.

This commit also adds a clear_previous method.

Add input_shape property to Graph container
2016-02-22 12:21:07 -06:00
Yarin cbd11315b7 Update example file imdb_lstm.py to use dropout in RNNs 2016-02-22 17:55:59 +00:00
Yarin 1588998ee8 Merge https://github.com/fchollet/keras into BRNN_latest 2016-02-22 17:54:23 +00:00
fchollet 58ca064f93 Use MSE in stateful example. 2016-02-22 09:10:28 -08:00
Wei OUYANG 3ecf201aea Add 3D Convolutional layers(Convolution3D, MaxPooling3D, AveragePooling3D, UpSampling3D and ZeroPadding3D, working with theano backend)
---------------------------------------
Squashed from the following commits
add Convolution3D and MaxPooling3D layers
fix 5D tensor in theano, add examples
update conv3d, pool3d, add resize_volumes and spatial_3d_padding
update Convolution3D, MaxPooling3D and AveragePooling3D, add UpSampling3D and ZeroPadding3D
add test functions for Convolution3D, MaxPooling3D, AveragePooling3D, ZeroPadding3D and UpSampling3D
small fix by changing pad_z to pad_t
update comment
skip some tests for tensorflow, @pytest.mark.skipif(K._BACKEND != theano, reason="Requires Theano backend")
use autopep8 to fix the code to match pep8 coding style
small fix (caused by autopep8)
small fix (caused by autopep8)
small fix (caused by autopep8)
fixed the document string for all newly added layers
remove the example and the dataset for 3d
add error message for tensorflow backend
support stride in pool3d
Rename "params" to "trainable_weights"
change notations and docstrings for 3D layers
fix pep8 error
change variable name in test code
small fix for pep8
add error message and docstring for strides in conv3d
fix test error caused by wrong strides in conv3d
support strides in conv3d by slicing the output
add if statement for stride (1,1,1)
fix get_config according to mdering, and other small fix
fix model_from_json issue by passing a 3d border_mode
fix according to jruales' review
change docstring in Convolution3D
delete docstring about TensorFlow
change docstring in Convolution3D and theano_backend
---------------------------------------

Author:    Wei OUYANG <oeway007@gmail.com>
2016-02-22 16:02:11 +01:00
fchollet 654404c2ed Add a few explanatory comments in recurrent.py 2016-02-21 19:46:06 -08:00
fchollet 7896ef7143 Style standardization 2016-02-21 19:31:22 -08:00
François Chollet 0c75006d12 Merge pull request #1782 from EderSantana/patch-5
Sensible activation for RNNs
2016-02-21 19:20:47 -08:00
Eder Santana c7f7ffe7c4 Sensible activation for RNNs
Have you noticed how default GRUs usually work worse than LSTMs? It seems that "tanh" is a more sensible activation choice. Also for GRUs, tanh seems to be the default:
see http://arxiv.org/pdf/1412.3555v1.pdf Section 3.2
2016-02-21 19:59:05 -05:00
Francois Chollet f10c430731 Style fixes, fix flaky test 2016-02-21 16:53:15 -08:00
Francois Chollet ea5cb74414 Merge branch 'BRNN_latest' of https://github.com/yaringal/keras into yaringal-BRNN_latest 2016-02-21 16:34:16 -08:00
Yarin 606a9b6810 Merge https://github.com/fchollet/keras into BRNN_latest 2016-02-21 23:51:52 +00:00
Yarin 35c5fa911d clean up 2016-02-21 23:31:32 +00:00
fchollet d4e9696447 Merge branch 'master' of ssh://github.com/fchollet/keras 2016-02-21 12:16:49 -08:00
fchollet 4cff0623de Update documentation. 2016-02-21 12:16:39 -08:00
François Chollet bc82613eae Merge pull request #1767 from neggert/seq-set-weights
Fix `Sequential.set_weights` for nested containers
2016-02-21 11:49:50 -08:00
fchollet 0ec57f28bc Fix fit_generator docstring 2016-02-21 11:35:13 -08:00
fchollet 860e4e9177 Various fixes in evaluate_generator/fit_generator. 2016-02-21 11:23:07 -08:00
Yarin a2fdc32381 use less memory when dropout is disabled 2016-02-21 17:57:58 +00:00
shmo d1a3842b3d implement validate_generator and the use of validation generators in fit_generator 2016-02-21 10:24:27 -05:00
Yarin 5406bd3ad2 updated dropout test 2016-02-21 03:21:20 +00:00
Yarin 941c3f6ae8 fixes following fchollet's comments 2016-02-21 03:04:37 +00:00
Yarin bec2701214 remove new line at end of file 2016-02-20 06:00:32 +00:00
Yarin bc60832dcf lint 2016-02-20 05:41:50 +00:00
Yarin 96483326d8 fixed tensorflow bug 2016-02-20 05:20:37 +00:00
Yarin fed7cc257e switched from broadcasting to K.expand_dim and from slicing to K.gather, and adapted test_recurrent to test with dropout 2016-02-20 03:56:00 +00:00
François Chollet d03f7768b8 Merge pull request #1769 from tboquet/modelckpoint_deb
ModelCheckPoint references nonexistent attribute
2016-02-19 19:14:39 -08:00
Yarin ce79e0a8ef test random_binomial, fix rnn backend 2016-02-20 03:03:35 +00:00
Yarin 5b3809394c Dropout for LSTM, GRU, SimpleRNN, and Embedding layers.
Squashed commit of the following:

commit 39a59192e96fe4098f1d663384b79b10e3bcc979
Author: Yarin <yaringal@gmail.com>
Date:   Sat Feb 20 02:15:29 2016 +0000

    Squashed commit of the following:

    commit 88faa440d02df8ff356011258e3e89ce44a13e1d
    Author: Yarin <yaringal@gmail.com>
    Date:   Sat Feb 20 02:13:24 2016 +0000

        Clean up

    commit f55245199a11a202857efb1413ffa3b97c1dcfaf
    Author: Yarin <yaringal@gmail.com>
    Date:   Sat Feb 20 01:57:50 2016 +0000

        Ported dropout for LSTM, GRU, SimpleRNN, and Embedding layer to latest Keras (turned off by default).

        Squashed commit of the following:

        commit 574c4549da69f8c0831f02dce1ad05331d8b38ed
        Merge: 19ef51c bdb149d
        Author: Yarin <yaringal@gmail.com>
        Date:   Sat Feb 20 01:23:54 2016 +0000

            Merge branch 'BRNN_latest' of https://github.com/yaringal/keras into BRNN_latest

        commit 19ef51c633544f847cddebeb7a3add0936051f19
        Author: Yarin <yaringal@gmail.com>
        Date:   Sat Feb 20 01:12:23 2016 +0000

            implemented dropout in GRU and SimpleRNN

        commit bdb149d1bbff64cc6b4d694090b905153d28e33a
        Author: Yarin <yaringal@gmail.com>
        Date:   Sat Feb 20 01:12:23 2016 +0000

            implemented dropout in GRU and SimpleLSTM

        commit 72ade3f493dd725fb414cbc65a847259360be138
        Author: Yarin <yaringal@gmail.com>
        Date:   Sat Feb 20 00:52:01 2016 +0000

            clean up

        commit 9f3d213c91906b3be5c876d539819a8577bc438c
        Author: Yarin <yaringal@gmail.com>
        Date:   Sat Feb 20 00:42:58 2016 +0000

            Model test callback

        commit d4ffffc26cf24c8b7927209caad4379aac3db9c5
        Author: Yarin <yaringal@gmail.com>
        Date:   Fri Feb 19 23:47:40 2016 +0000

            removed dependence on theano

        commit 89a4e6576278564ffb882032d5a7ec5758fe00e4
        Author: Yarin <yg279@cam.ac.uk>
        Date:   Fri Feb 19 23:25:13 2016 +0000

            working BayesianLSTM and embedding dropout for theano backend

        commit 1ab4e19dfe9d49defd5575a5c2b0b880b5c46eb5
        Author: Yarin <yg279@cam.ac.uk>
        Date:   Fri Feb 19 16:41:48 2016 +0000

            working BayesianLSTM with dependence on theano

        commit 672c27401ee345a69592771cfc9ab017642b6af3
        Merge: 9360ea6 b8a9f84
        Author: Yarin <yaringal@gmail.com>
        Date:   Fri Feb 19 00:30:44 2016 +0000

            Merge https://github.com/fchollet/keras into BRNN_latest

        commit 9360ea6c25eab90e83aebb32eb187c65ed63c01d
        Author: Yarin <yaringal@gmail.com>
        Date:   Thu Feb 18 23:28:35 2016 +0000

            work in progress on BayesianLSTM

        commit b8a9f84fad
        Merge: a154495 0f3f563
        Author: François Chollet <francois.chollet@gmail.com>
        Date:   Thu Feb 18 11:24:42 2016 -0800

            Merge pull request #1756 from gw0/fix-for-refactor-callbacks

            Fix missing callback refactoring.

        commit 0f3f56327b
        Author: gw0 [http://gw.tnode.com/] <gw.2016@tnode.com>
        Date:   Thu Feb 18 17:01:45 2016 +0100

            Fix missing callback refactoring.
2016-02-20 02:15:59 +00:00
tboquet e501cd664e changed ref to attribute ModelCheckPoint 2016-02-19 20:14:18 -05:00
z001qdp 47aafaaca0 Fix Sequential.set_weights
`Sequential.set_weights` would fail for nested `Sequential`
containers. Borrow the implementation of `Graph.set_weights` to
get it working. Also add tests for triply-nested `Sequential`
models.
2016-02-19 14:23:04 -06:00
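One way to make `set_weights` work for nested containers, as commit 47aafaaca0 above describes, is to give each child as many entries from the flat weight list as it reports owning, and let nested containers recurse. The sketch below assumes that strategy; `Leaf` and `Container` are hypothetical stand-ins, not Keras classes, and the actual `Graph.set_weights` implementation may differ.

```python
class Leaf:
    """Hypothetical layer holding a fixed number of weight arrays."""

    def __init__(self, n_weights):
        self.weights = [None] * n_weights

    def get_weights(self):
        return list(self.weights)

    def set_weights(self, weights):
        assert len(weights) == len(self.weights)
        self.weights = list(weights)


class Container:
    """Hypothetical container; may hold leaves or other containers."""

    def __init__(self, layers):
        self.layers = layers

    def get_weights(self):
        flat = []
        for layer in self.layers:
            flat.extend(layer.get_weights())
        return flat

    def set_weights(self, weights):
        # Hand each child as many entries as it reports owning, then move on.
        # Nested containers recurse through their own set_weights.
        weights = list(weights)
        for layer in self.layers:
            n = len(layer.get_weights())
            layer.set_weights(weights[:n])
            weights = weights[n:]


# Triply nested: model -> inner -> innermost container -> leaf.
inner = Container([Leaf(2), Container([Leaf(1)])])
model = Container([Leaf(1), inner])
model.set_weights(["a", "b", "c", "d"])
```

Strings stand in for weight arrays here so the slicing is easy to follow.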
François Chollet b8a9f84fad Merge pull request #1756 from gw0/fix-for-refactor-callbacks
Fix missing callback refactoring.
2016-02-18 11:24:42 -08:00
gw0 [http://gw.tnode.com/] 0f3f56327b Fix missing callback refactoring. 2016-02-18 17:01:45 +01:00
François Chollet a154495a2a Merge pull request #1751 from DingKe/master
Support saving/loading sample_weight_mode(s)
2016-02-17 18:26:29 -08:00
Francois Chollet 25e5f7531a Add ISSUE_TEMPLATE.md 2016-02-17 16:56:18 -08:00
Francois Chollet ab179fab89 data_utils cleanup 2016-02-17 16:39:37 -08:00
Francois Chollet 59a714abe8 Style fixes 2016-02-17 16:39:24 -08:00
Ke Ding e432d10be5 Merge branch 'master' of https://github.com/fchollet/keras.git 2016-02-17 18:48:06 +08:00
Ke Ding eeb576b12f Add support for saving/loading sample_weight_mode(s) 2016-02-17 18:47:29 +08:00
François Chollet 1019e50e7f Merge pull request #1626 from MatthieuPerrot/master
retrieve datasets behind http/https proxy
2016-02-16 15:19:22 -08:00
Francois Chollet 1a6cb71732 Make history an attribute of models 2016-02-16 14:22:41 -08:00
Francois Chollet 9048b5cbba Merge branch 'master' of https://github.com/fchollet/keras 2016-02-16 13:23:25 -08:00
Francois Chollet c9642571c2 Refactor callbacks 2016-02-16 13:22:53 -08:00
François Chollet cda80c790b Merge pull request #1719 from barvinograd/general-pad-sequences
generalize pad_sequences
2016-02-16 11:57:34 -08:00
Bar Vinograd 46a5b3cb36 generalize pad_sequences #1718 2016-02-16 10:56:33 +02:00
Francois Chollet 5b23dd8a2f Style fixes 2016-02-15 13:30:28 -08:00
Francois Chollet 2b26389188 Merge branch 'master' of https://github.com/fchollet/keras 2016-02-15 13:29:09 -08:00
Francois Chollet 44e0a7bbf9 Style fixes 2016-02-15 13:27:06 -08:00
Francois Chollet 87cc39d99f Merge branch 'master' of https://github.com/barvinograd/keras into barvinograd-master 2016-02-15 13:18:51 -08:00
François Chollet 55aacd1905 Merge pull request #1736 from bryan-lunt/fix_load_weights
Fixed load_weights to not create empty/corrupt .h5 files
2016-02-15 12:57:41 -08:00
Francois Chollet 3df101cc77 Merge branch 'master' of https://github.com/fchollet/keras 2016-02-15 12:56:59 -08:00
Bryan Lunt c23579e059 Fixed load_weights to not create empty/corrupt .h5 files
Fixes issue #1734: non-existent .h5 files will not be created; an IOError will be raised instead.
2016-02-15 12:13:23 -06:00
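The behaviour described in commit c23579e059 above (fail on a missing weights file rather than create an empty one) can be illustrated with a plain file check. This is a sketch under assumptions: `open_existing_for_read` is a hypothetical helper, and the actual fix in `load_weights` may instead have changed how the HDF5 file is opened.

```python
import os
import tempfile


def open_existing_for_read(path):
    """Open `path` read-only; raise IOError if it does not exist."""
    if not os.path.isfile(path):
        raise IOError("weights file not found: %s" % path)
    # Read-only mode never creates a file, unlike append or read-write modes.
    return open(path, "rb")


missing = os.path.join(tempfile.mkdtemp(), "weights.h5")
try:
    open_existing_for_read(missing)
    raised = False
except IOError:
    raised = True
file_was_created = os.path.exists(missing)
```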
François Chollet 6a41ac1c36 Merge pull request #1724 from bmabey/master
updates docstring for Merge to include cos and join
2016-02-14 15:47:34 -08:00
Ben Mabey b78ade7e36 updates docstring for Merge to include cos and join 2016-02-14 16:40:54 -06:00
Matthieu Perrot 61800be9a0 Add http/https proxy management for dataset downloading 2016-02-13 21:14:55 +01:00
Bar Vinograd 3925eabaaf SReLU activation (#1662, #1681) 2016-02-13 12:07:57 +02:00
Francois Chollet 34296ec961 Update FAQ 2016-02-12 17:10:51 -08:00
Francois Chollet 52f48e1f46 Merge branch 'master' of https://github.com/fchollet/keras 2016-02-12 12:17:20 -08:00
Francois Chollet 20728c95fa Make it possible to configure nb of threads in TF 2016-02-12 12:17:00 -08:00
François Chollet 6181ca8aae Merge pull request #1701 from stas-sl/patch-2
more concise random_shift and fix axis order
2016-02-12 11:56:33 -08:00
François Chollet 68115cc25f Merge pull request #1705 from neggert/cache_input_size
Add caching for layer input/output sizes
2016-02-12 11:40:55 -08:00
Francois Chollet c192beaf43 Merge branch 'master' of https://github.com/fchollet/keras 2016-02-12 09:45:21 -08:00
Francois Chollet 27dd1e939c Add dim_ordering tests 2016-02-12 09:44:55 -08:00
Francois Chollet 1e46a5d3ec Improve error message 2016-02-12 09:44:31 -08:00
Francois Chollet 0bf2b1b075 Improve TF backend docstrings 2016-02-12 09:44:16 -08:00
François Chollet ab3ef3efe5 Merge pull request #1704 from brmson/f/cosmerge
Merge(cos): Fix bad import
2016-02-12 09:43:33 -08:00
Petr Baudis 4d3ee897da Merge(cos): Fix bad import 2016-02-12 16:07:19 +01:00
stas-sl 8e591d228c more concise random_shift and fix axis order 2016-02-12 13:34:45 +03:00
z001qdp 6429a57a3c Add caching for layer output sizes
This dramatically speeds up constructing large networks.
Implementation follows that of layer caching introduced in e8e8f7.
2016-02-11 16:48:17 -06:00
Francois Chollet 9ad5ed8103 Fix indentation in code examples in docs 2016-02-11 11:42:34 -08:00
Francois Chollet 209b42c5ee Fix code comments for method example in online doc 2016-02-11 11:00:12 -08:00
Francois Chollet 23147de72b Convs tests touch-ups 2016-02-10 14:35:34 -08:00
Francois Chollet 7b72163073 Theano: support for strides with border_mode=same 2016-02-10 13:57:09 -08:00
Francois Chollet 6279544dc3 0.3.2 PyPI release 2016-02-09 13:57:16 -08:00
François Chollet 657b9fb48e Merge pull request #1651 from tboquet/fit_gen_tensorb
Support + tests for fit_generator + tensorboard
2016-02-08 14:01:02 -08:00
tboquet d68e3316da Support for fit_generator + tensorboard 2016-02-08 14:55:00 -05:00
François Chollet cae797b803 Merge pull request #1668 from jstypka/master
docs: update "pad_sequences()" parameters in the docs
2016-02-08 10:14:17 -08:00
Jan Stypka 21492a292a docs: update "pad_sequences()" parameters in the docs 2016-02-08 15:20:12 +01:00
fchollet f27c5b0500 merge conflict 2016-02-07 22:56:59 -08:00
fchollet 523e24e8ac Simplify Theano RNN when no mask is passed 2016-02-07 22:34:24 -08:00
François Chollet 8d393f766b Merge pull request #1645 from oraac/master
added more detail to the building documentation instructions
2016-02-07 21:17:11 -08:00
Francois Chollet c450089de9 Documentation improvements 2016-02-07 15:42:18 -08:00
Francois Chollet 1b22915f85 Documentation improvements 2016-02-07 15:38:20 -08:00
Francois Chollet 5824f2eb99 Provide None as default value for layer names 2016-02-06 18:45:15 -08:00
David McInnis 27754d2a5f changed to make compatible with more browsers 2016-02-06 14:32:52 -08:00
Francois Chollet 99991779e5 TF fixes and style fixes 2016-02-06 13:49:57 -08:00
Mikael Rousson 652f2eb56d add siamese example
use graph model
take pairs of digits as input
2016-02-06 22:06:22 +01:00
Francois Chollet 359f91ff6c Fix py3 test 2016-02-06 11:21:44 -08:00
Francois Chollet f3fd56db50 Improve input validation in graph.compile 2016-02-06 11:03:02 -08:00
Francois Chollet 6f5f3d3fb6 Fix join mode in Merge / Siamese 2016-02-06 11:02:44 -08:00
Francois Chollet 9b10ab2980 Style fixes 2016-02-05 15:37:36 -08:00
Francois Chollet 113f20e7e5 Merge branch 'rescale-objective' of https://github.com/wxs/keras into wxs-rescale-objective 2016-02-05 15:34:57 -08:00
Francois Chollet f2443de96d Merge branch 'master' of https://github.com/fchollet/keras 2016-02-05 10:34:48 -08:00
Francois Chollet e14deedd78 Temporarily disable image preprocessing tests 2016-02-05 10:34:36 -08:00
François Chollet 7a708c305c Merge pull request #1648 from gw0/fix-fit_generator-terminate
Fix program termination when threads are used in fit_generator().
2016-02-05 10:00:32 -08:00
Xavier Snelgrove 9a4a931d32 Merge remote-tracking branch 'origin/master' into rescale-objective 2016-02-05 11:31:52 -05:00
gw0 [http://gw.tnode.com/] f854dcb83f Fix program termination when threads are used in fit_generator(). 2016-02-05 15:36:14 +01:00
David McInnis b74e4a9f2d added more detail to the building documentation instructions 2016-02-04 15:15:55 -08:00
Francois Chollet 217cdd8b85 Fix objective tests 2016-02-04 14:41:29 -08:00
Xavier Snelgrove 7e678b8315 Objective outputs should rescale based on sample_weights
If sample_weights is to be used as a mask as well as for re-weighting
then it's important that, at least when used as a mask, the output be
rescaled. Otherwise the order of magnitude of your objective changes
purely based on the number of masked entries in your training data.
2016-02-04 15:57:57 -05:00
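The rescaling argued for in commit 7e678b8315 above can be sketched in plain Python. The assumption here is that the weighted mean is divided by the fraction of samples with a non-zero weight; `weighted_loss` is a hypothetical function written for illustration, not the Keras objective code.

```python
def weighted_loss(losses, weights, rescale=True):
    """Mean of per-sample losses after weighting.

    With rescale=True, the mean is divided by the fraction of samples
    whose weight is non-zero, so masked samples do not shrink the result.
    """
    weighted = [l * w for l, w in zip(losses, weights)]
    mean = sum(weighted) / len(weighted)
    if not rescale:
        return mean
    # Assumes at least one non-zero weight; otherwise this divides by zero.
    kept = sum(1 for w in weights if w != 0) / len(weights)
    return mean / kept


losses = [1.0, 1.0, 1.0, 1.0]
mask = [1.0, 1.0, 0.0, 0.0]  # half of the samples are masked out
plain = weighted_loss(losses, mask, rescale=False)
rescaled = weighted_loss(losses, mask)
```

Without rescaling, masking half the samples halves the reported loss (0.5). With rescaling, it stays at 1.0, the loss of the unmasked samples.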
Francois Chollet 65b5899d06 Remove RMSE objective (use MSE instead). 2016-02-04 10:59:01 -08:00
Francois Chollet 02d5f72be4 Simplify Theano tensor instantiation 2016-02-03 18:40:13 -08:00
Francois Chollet cd2e36392c Rename "params" to "trainable_weights" 2016-02-03 17:52:05 -08:00
Matthias Plappert ff4a9b7b24 fix BN serialization 2016-02-03 17:27:14 -08:00
François Chollet 8acf4de764 Merge pull request #1621 from DingKe/master
fix issue #1363
2016-02-02 20:44:05 -08:00
Ke Ding 7b789c7fa0 fix issue when loading stateful RNNs if they are not the first layer 2016-02-03 10:07:30 +08:00
Francois Chollet ef22fcf548 Fix TF dim check 2016-02-02 11:58:48 -08:00
Francois Chollet de8a0133f0 Improve "repeat" in backends 2016-02-02 11:26:03 -08:00
Xavier Snelgrove 095b6c118c Fix merge conflicts 2016-02-01 14:16:54 -08:00
Francois Chollet 827ec65111 Fix stateful RNN serialization 2016-02-01 13:48:40 -08:00
Xavier Snelgrove c94cf4b32a Auto-expand mask dims. Fix categorical_crossentropy.
- Categorical_crossentropy was taking an extra mean, the function
  already removes the final dimension of your input, so you don't need
  to take a mean as you would with, say, L2 loss.

- The RNN backend call can now take a mask with or without the same
  number of dimensions as the input data

- Fix Masking layer for Tensorflow

- Add some tests to confirm objective function shapes
2016-02-01 14:47:10 -05:00
Francois Chollet b688192cbd 15% more efficient RNNs with TensorFlow 2016-01-31 15:30:20 -08:00
Francois Chollet 64374c98fb Merge branch 'master' of https://github.com/fchollet/keras 2016-01-31 14:10:31 -08:00
Francois Chollet 361c52c527 Merge branch 'udibr-imageDataGen_thread' 2016-01-31 14:10:16 -08:00
Francois Chollet 04d7504537 Style fixes 2016-01-31 14:09:52 -08:00
fchollet a07efd4b5c Update examples in doc 2016-01-31 10:19:39 -08:00
fchollet 0ca4a0fbae Add binary classification example in doc 2016-01-31 10:13:14 -08:00
Ehud Ben-Reuven 6e33b516ef ImageDataGenerator thread safe 2016-01-30 23:43:56 -05:00
fchollet 3d85402ca5 Fix various typos 2016-01-30 18:19:37 -08:00
fchollet 9720db9566 Fix typo 2016-01-30 18:13:25 -08:00
fchollet c5d11f1da3 Fix AutoEncoder; cleaner API. 2016-01-30 18:10:33 -08:00
François Chollet 5d3c267398 Merge pull request #1596 from agitter/master
Minor typos in example commands
2016-01-29 14:23:54 -08:00
Anthony Gitter eb8e8b281e Minor typos in example commands 2016-01-29 15:31:53 -06:00
Francois Chollet 001f29cf54 Merge branch 'master' of https://github.com/fchollet/keras 2016-01-29 10:47:08 -08:00
Francois Chollet bb2b3ada3d Improve docs of conv_filter_vis example 2016-01-29 10:03:07 -08:00
fchollet 92b1948d51 Fix typo 2016-01-28 22:05:25 -08:00
Francois Chollet 14b109072a Merge branch 'master' of https://github.com/fchollet/keras 2016-01-28 19:22:37 -08:00
Francois Chollet 027a018210 Add convolution filter visualization example 2016-01-28 19:22:24 -08:00
Francois Chollet 4341c623ff Test cleanup 2016-01-28 19:21:45 -08:00
François Chollet 7a3122c154 Merge pull request #1583 from awentzonline/fix-random-tests
Fixed backend tests for `random_uniform` and `random_normal`
2016-01-28 15:16:43 -08:00
François Chollet 282e634f3a Merge pull request #1582 from fchollet/temporal_sample_weighting
Temporal sample weighting
2016-01-28 14:18:53 -08:00
Adam Wentz 3882f32f09 Fixed backend tests for random_uniform and random_normal 2016-01-28 15:59:24 -06:00
Francois Chollet efe4fc72e5 Fix loss weighting tests 2016-01-28 13:50:19 -08:00
Francois Chollet 1623ea6166 Support for temporal sample weighting + tests 2016-01-28 11:20:04 -08:00
François Chollet b2681804f8 Merge pull request #1577 from jocicmarko/master
Speed up random shifts in data augmentation
2016-01-28 10:39:38 -08:00
Marko Jocić 9bc8ab169a Speed up random shifts in data augmentation
Currently, polynomial interpolation of 3rd order is done when shifting. However, that is not needed because the images are shifted by integer values (crop_left_pixels, crop_top_pixels), and there is nothing to interpolate.
Setting ```order=0``` will speed up random shifts significantly.
2016-01-28 14:18:05 +01:00
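The reasoning in commit 9bc8ab169a above (an integer shift leaves nothing to interpolate) can be checked with a small pure-Python sketch. It uses linear interpolation rather than the 3rd-order scheme the commit mentions, purely to keep the example short; `sample_linear` and `shift_nearest` are hypothetical helpers, not the library functions used in Keras.

```python
import math


def sample_linear(row, pos):
    """Linearly interpolate `row` at index `pos`, for 0 <= pos < len(row) - 1."""
    i = int(math.floor(pos))
    frac = pos - i
    # At an integer position frac is 0, so the result is exactly row[i].
    return (1 - frac) * row[i] + frac * row[i + 1]


def shift_nearest(row, k, fill=0):
    """Order-0 style shift by integer k: pure index copying."""
    n = len(row)
    return [row[j - k] if 0 <= j - k < n else fill for j in range(n)]


row = [10, 20, 30, 40]
shifted = shift_nearest(row, 1)
interpolated = [sample_linear(row, j - 1) if j - 1 >= 0 else 0
                for j in range(len(row))]
```

Both lists come out identical, which is why the cheaper index-copying path is enough for whole-pixel shifts.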
François Chollet 59dcd7ba7a Merge pull request #1566 from tpsatish95/master
added random_shear() to random_transform
2016-01-27 23:07:25 -08:00
tpsatish95 5cf50f6a6c added the random_shear transformation for images. 2016-01-28 11:35:50 +05:30
Francois Chollet ecdce975d3 Fix inaccuracy in FAQ 2016-01-27 19:15:40 -08:00
Francois Chollet aceded7bfb Merge branch 'master' of https://github.com/fchollet/keras 2016-01-27 19:10:45 -08:00
Francois Chollet 45955be120 Update FAQ 2016-01-27 19:10:23 -08:00
François Chollet 829886eb16 Merge pull request #1552 from udibr/ImageDataGenerator_shuffle
In ImageDataGenerator.flow shuffle before every epoch
2016-01-27 18:50:57 -08:00
Francois Chollet 4922a67f09 Add support for temporal sample weights 2016-01-27 18:41:35 -08:00
François Chollet 41d07f5ba9 Merge pull request #1560 from jnphilipp/master
Fixed ImageDataGenerator.flow
2016-01-27 10:56:00 -08:00
jnphilipp 05fe5f564a Fixed image save. 2016-01-26 21:18:04 +01:00
Ehud Ben-Reuven 403b4fc7a2 In ImageDataGenerator.flow shuffle before every epoch 2016-01-26 15:06:06 -05:00
Francois Chollet c61d075abc Fix ImageDataGenerator docs 2016-01-26 09:36:35 -08:00
fchollet 2e90ae18a6 Update Theano install instructions 2016-01-24 16:53:29 -08:00
fchollet ed1297b393 Update Theano install instructions 2016-01-24 16:09:51 -08:00
François Chollet a53577205f Merge pull request #1543 from Joshua-Chin/master
Fix documentation of output shape for Convolution2D
2016-01-24 12:49:38 -08:00
Joshua Chin 26ab219159 fix documentation for Convolution2D 2016-01-24 13:24:16 -05:00
Francois Chollet 399c00c7f8 Update sample_weight documentation 2016-01-22 22:06:49 -08:00
Francois Chollet 36892dba8f Introduce relu exception for old Theano versions 2016-01-22 15:17:03 -08:00
Francois Chollet 1e58b89523 Fix BN tests 2016-01-21 12:17:25 -08:00
Francois Chollet 4f530ab5f8 Merge branch 'master' of https://github.com/fchollet/keras 2016-01-21 11:54:58 -08:00
Francois Chollet 8823fa520b Add axis to BN config 2016-01-21 11:54:46 -08:00
François Chollet 1dab54c1c3 Merge pull request #1527 from the-moliver/master
add axis arguments to constraints
2016-01-21 11:23:16 -08:00
Michael Oliver e2e281e14f add axis arguments to constraints 2016-01-21 10:53:46 -08:00
François Chollet 3f599b9204 Merge pull request #1525 from wxs/fix-masking-3d+
Support >3d mask matrices.
2016-01-21 10:39:21 -08:00
Xavier Snelgrove ea7f400d57 Support >3d mask matrices.
The change to the dimshuffle/transpose call to support >3d inputs was
correct for the inputs array but did not apply to the mask array. This
fixes that.
2016-01-21 12:52:36 -05:00
François Chollet d7f870a47f Merge pull request #1507 from wb14123/char-tokenizer
make Tokenizer support char level
2016-01-20 20:39:05 -08:00
François Chollet 1f0adb7ed7 Merge pull request #1514 from DingKe/master
Check gpu status correctly when using theano's cuda.use to choose gpu
2016-01-20 18:45:13 -08:00
DingKe 5657a38515 Check gpu correctly when using theano.sandbox.cuda.use to set gpu 2016-01-21 10:19:50 +08:00
Francois Chollet 2143046261 Improve error message in recurrent.py 2016-01-20 17:59:26 -08:00
Francois Chollet 571f1d4fcf Fix py3 compatibility 2016-01-20 15:28:09 -08:00
Francois Chollet f834c59290 Merge branch 'master' of https://github.com/fchollet/keras 2016-01-20 15:04:17 -08:00
Francois Chollet 0300ae2f4c Style fixes 2016-01-20 15:04:12 -08:00
François Chollet a67fb21e56 Merge pull request #1512 from farizrahman4u/patch-26
RNN : Support for nD data
2016-01-20 12:39:30 -08:00
Francois Chollet d9c4c02601 Fix image preprocessing tests 2016-01-20 12:25:36 -08:00
François Chollet 3ed897332f Merge pull request #1501 from farizrahman4u/patch-25
Remove output_dim argument from RNN API
2016-01-20 10:25:14 -08:00
François Chollet 52cb803deb Merge pull request #1511 from farizrahman4u/patch-24
White space fix
2016-01-20 10:18:09 -08:00
Fariz Rahman ace7b7fe7f RNN : Support for nD data
Currently, only 3D input is supported by the rnn function.

Update theano_backend.py

Fix tf too

Avoid slicing

assert ndim>=3

Update theano_backend.py

typo

Update theano_backend.py
2016-01-20 23:29:57 +05:30
Fariz Rahman d08b0efc80 White space fix 2016-01-20 22:03:00 +05:30
Bin Wang 371dcc9071 add doc 2016-01-20 17:19:04 +08:00
Bin Wang 05dcfe15fe make Tokenizer support char level 2016-01-20 16:56:57 +08:00
Fariz Rahman 73f1ef2e95 Update activations.py
Update tensorflow_backend.py

Update theano_backend.py

Update core.py

Update recurrent.py

Update test_backends.py
2016-01-20 12:41:41 +05:30
François Chollet 5bcac37553 Merge pull request #1503 from the-moliver/gaussnoisefix
Add sqrt for proper noise sigma calculation
2016-01-19 16:38:02 -08:00
Michael Oliver 34c54cb19c Add sqrt for proper noise sigma calculation 2016-01-19 15:43:36 -08:00
François Chollet fb99e04f86 Merge pull request #1502 from tboquet/from_config_deb
Added the name of the model when loading it from config
2016-01-19 14:36:15 -08:00
Francois Chollet 262c8ce1a0 Merge branch 'cassianokc-master' 2016-01-19 14:19:45 -08:00
Francois Chollet 2091bfe911 Fix stateful LSTM example 2016-01-19 14:18:11 -08:00
Francois Chollet 5e9aafca33 Merge branch 'master' of https://github.com/cassianokc/keras into cassianokc-master 2016-01-19 13:57:26 -08:00
tboquet e31d26b33f + name of the loaded models 2016-01-19 16:44:42 -05:00
Francois Chollet 45714e343f Improve real time data augmentation, cifar10 ex 2016-01-19 13:43:41 -08:00
Fariz Rahman c8f2633109 Merge pull request #6 from fchollet/master
update
2016-01-19 13:49:38 -05:00
Francois Chollet 852fe9cc7b Fix time distributed softmax 2016-01-19 09:34:41 -08:00
Francois Chollet d1af488d10 Minor style fixes 2016-01-17 17:10:04 -08:00
Francois Chollet c59b0e936d Merge branch 'wxs-backend-masking' 2016-01-17 16:48:53 -08:00
Francois Chollet c98633cd7c Fix normalization tests 2016-01-17 16:48:20 -08:00
Xavier Snelgrove ac773ed243 Fix masking in RNNs 2016-01-17 16:29:07 -08:00
Francois Chollet 9d120bf9e0 Fix py3 compatibility 2016-01-17 16:09:00 -08:00
Cassiano Kleinert Casagrande e443527d28 Add example of stateful LSTM sequence prediction 2016-01-17 21:56:21 -02:00
Francois Chollet 8c103559a6 Update tests 2016-01-17 15:41:15 -08:00
Francois Chollet 85a1225cb3 Fix batchnorm 2016-01-17 15:41:08 -08:00
Francois Chollet 160196137f Add support for axis lists in element wise ops 2016-01-17 15:40:46 -08:00
Francois Chollet 040e1ef628 Merge branch 'master' of https://github.com/fchollet/keras 2016-01-17 12:13:27 -08:00
Francois Chollet 0debec1c11 Fix batch normalization 2016-01-17 12:13:15 -08:00
Francois Chollet 0abed354b5 Add/subtract mean in style transfer script 2016-01-17 12:12:24 -08:00
fchollet d00140862e Fix conv2d mode in doc examples 2016-01-15 19:35:57 -08:00
Francois Chollet 5c3839a950 Merge branch 'consciousnesss-add_pep8' 2016-01-15 16:32:41 -08:00
Francois Chollet 2006ec2873 Newline. 2016-01-15 16:32:23 -08:00
Francois Chollet 036d5c8dc2 Merge branch 'add_pep8' of https://github.com/consciousnesss/keras into consciousnesss-add_pep8 2016-01-15 16:30:32 -08:00
Francois Chollet 3bfe4eace9 Normalize layer importing 2016-01-15 16:19:00 -08:00
Francois Chollet 83aaadaa9d Log backend type to stderr 2016-01-15 16:18:42 -08:00
Francois Chollet a18932cb65 Improve neural style example 2016-01-15 16:17:18 -08:00
Francois Chollet 70f0fe515a Improve deep dream example 2016-01-15 16:17:05 -08:00
olegsinyavskiy a563a8446e Add an automatic PEP8 check on the pull request submission:
- ignore most of the errors to avoid disrupting others
- add a separate job to avoid confusion that all jobs fail because of a single pep error
- fix few small pep errors
2016-01-14 08:32:37 -08:00
Oleg Sinyavskiy bd2ff26b37 Merge pull request #5 from fchollet/master
update
2016-01-13 17:34:42 -08:00
Francois Chollet 58a94a9b05 Add deep dream example 2016-01-13 10:46:45 -08:00
Francois Chollet 3d3b8c52e9 Cleanup of neural style transfer example 2016-01-13 10:46:19 -08:00
François Chollet 558605a363 Merge pull request #1422 from jakebian/json-fix
models: when dumping config to json, parse unhandled types correctly
2016-01-12 13:46:11 -08:00
François Chollet a696513cfb Merge pull request #1448 from keunwoochoi/patch-1
Remove cv2 dependency and use scipy.misc
2016-01-11 16:09:11 -08:00
Keunwoo Choi ee6bad63b0 Remove cv2 dependency and use scipy.misc
to read, resize, and save images.
2016-01-11 23:33:25 +00:00
Francois Chollet cb13a33a31 Fix neural style comments 2016-01-11 11:43:10 -08:00
Francois Chollet 1fdcc370b6 Remove unnecessary commented code. 2016-01-11 11:30:10 -08:00
Francois Chollet 94c9183179 Merge branch 'master' of https://github.com/fchollet/keras 2016-01-11 11:23:44 -08:00
Francois Chollet ada6dd2943 Add neural style transfer example 2016-01-11 11:23:38 -08:00
Francois Chollet a5c07d796a Update theano backend 2016-01-11 11:22:29 -08:00
François Chollet d2f7593a35 Merge pull request #1442 from jfsantos/patch-8
Fixed Tensorboard callback for Python 3
2016-01-10 21:20:42 -08:00
João Felipe Santos 314ee54e60 Fixed Tensorboard callback for Python 3
Adding `dict_items` [does not work](https://stackoverflow.com/questions/13361510/typeerror-unsupported-operand-types-for-dict-items-and-dict-items) in Python 3. Workaround is to create a copy of the dict and `update` it with the other dict.
2016-01-10 13:12:00 -05:00
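The workaround described in the commit above can be sketched in plain Python. This is only an illustration of the copy-then-`update` idea, not the code from the commit; the names `merge_dicts`, `logs`, and `extra` are hypothetical.

```python
def merge_dicts(first, second):
    """Return a new dict holding the items of both inputs.

    In Python 3, dict.items() returns a view object, so
    `first.items() + second.items()` raises TypeError.
    Copying one dict and updating it works on Python 2 and 3.
    """
    merged = dict(first)   # shallow copy, leaves `first` untouched
    merged.update(second)  # keys in `second` override keys in `first`
    return merged


# Hypothetical example values, chosen only for illustration.
logs = {"loss": 0.25}
extra = {"val_loss": 0.5}
combined = merge_dicts(logs, extra)
```

Copying first keeps both input dicts unchanged, which matters if the caller reuses them afterwards.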
François Chollet 5d1789c805 Merge pull request #1430 from tboquet/fix_custom_obj_load
Added compatibility for custom loss functions
2016-01-08 12:16:12 -08:00
tboquet 42cd4d6b62 Added support for both Sequential and Graph 2016-01-08 14:16:27 -05:00
tboquet 6bb4cbbf5e added compatibility for custom loss functions 2016-01-08 13:19:25 -05:00
Francois Chollet 998efc04ee Merge branch 'master' of https://github.com/fchollet/keras 2016-01-08 10:02:45 -08:00
Francois Chollet 037e592f2b Naming, batch_flatten 2016-01-08 10:02:28 -08:00
fchollet ced84d53bc Fix example docstring 2016-01-07 22:24:40 -08:00
fchollet d0b98a2cb5 Antirectifier example style fixes 2016-01-07 21:18:34 -08:00
fchollet 09d91fccb9 Add antirectifier example 2016-01-07 20:22:33 -08:00
jake e947a56c52 models: when dumping config to json, parse unhandled types correctly
- properly convert numpy types
- handle python 'type' objects
2016-01-07 18:02:02 -08:00
François Chollet 887178bd02 Merge pull request #1417 from farizrahman4u/patch-21
Lambda layer:get_output should use get_input.
2016-01-07 17:24:13 -08:00
Fariz Rahman e21a6a9ebf Fix output_shape too. 2016-01-08 01:13:00 +05:30
François Chollet f31f85a720 Merge pull request #1419 from ozancaglayan/fix-mask-cast
models: Cast the mask to floatX to avoid theano upcasting
2016-01-07 11:14:22 -08:00
Fariz Rahman bf2f64bfd5 Update core.py 2016-01-08 00:27:43 +05:30
Ozan Caglayan f800e448a2 models: Cast the mask to floatX to avoid theano upcasting
Issue #1416
2016-01-07 20:19:24 +02:00
Fariz Rahman fe18ad8dde Lambda layer:get_output should use get_input. 2016-01-07 17:06:43 +05:30
François Chollet 6167d17aeb Merge pull request #1414 from tboquet/conv_same_conv2d
Evaluate the kernel of base Theano conv2d with the same border mode
2016-01-06 19:28:26 -08:00
tboquet f8dd6da08d Eval kernel base Theano conv2d same border mode 2016-01-06 21:59:50 -05:00
François Chollet c6a1c01a08 Merge pull request #1411 from jarfo/patch-1
Update models.py
2016-01-06 10:55:15 -08:00
jarfo d02ea03462 Update models.py
Model evaluation (test) using the _test K.function should be also stateful for stateful recurrent networks
2016-01-06 13:27:28 +01:00
Francois Chollet 13379da81b Fix kernel shape type in theano conv2d 2016-01-05 15:44:15 -08:00
Francois Chollet 87f62dc6cf Merge branch 'master' of https://github.com/fchollet/keras 2016-01-05 10:36:15 -08:00
Francois Chollet 6cb1172668 Add cosine proximity objective 2016-01-05 10:35:29 -08:00
Francois Chollet 458641f33a Add K.l2_normalization 2016-01-05 10:35:10 -08:00
François Chollet e2fa1d56c0 Merge pull request #1396 from julienr/resize_images
Add K.resize_images backend op.
2016-01-03 14:51:44 -08:00
Julien Rebetez 01395f13ed Figure out tensor shape automatically in K.resize_images 2016-01-03 22:36:28 +01:00
Francois Chollet 6445d385ee Update CONTRIBUTING.md 2016-01-03 13:17:36 -08:00
Julien Rebetez 4330fd78e9 Fix theano_backend.resize_images 2016-01-03 19:04:49 +01:00
Francois Chollet f447644900 Update PyPi release to 0.3.1 2016-01-03 09:41:04 -08:00
Francois Chollet 3f623df020 Merge branch 'master' of https://github.com/fchollet/keras 2016-01-03 09:38:25 -08:00
Francois Chollet 69e19b1e03 Improve optimizer tests 2016-01-03 09:38:02 -08:00
François Chollet 0ed465acfa Merge pull request #1397 from stevenxxiu/batch_input_shape_fix
batch_input_shape fix
2016-01-03 09:31:13 -08:00
François Chollet ad9d41f1b0 Merge pull request #1389 from kashif/adamax
adamax optimizer
2016-01-03 09:27:55 -08:00
Steven Xu b17e4c5edf input_shape fix 2016-01-03 23:02:07 +11:00
Julien Rebetez 9d15c96115 Add K.resize_images backend op.
This makes it possible to take advantage of TensorFlow's resize_images operator in UpSampling2D.
2016-01-03 12:37:14 +01:00
Kashif Rasul 5c72e14034 adamax optimizer 2016-01-03 09:40:26 +01:00
Francois Chollet d401bb46dd Doc fixes 2016-01-01 22:31:25 -08:00
Francois Chollet 3a9ffc8ffd Merge branch 'berleon-fix-1275' 2016-01-01 11:21:20 -08:00
Francois Chollet 421a2cdf04 Move batch norm tests to tests/keras/layers/ 2016-01-01 11:07:19 -08:00
Francois Chollet 8f934e0379 Merge branch 'fix-1275' of https://github.com/berleon/keras into berleon-fix-1275 2016-01-01 09:46:23 -08:00
François Chollet bb45991899 Merge pull request #1388 from kylemcdonald/patch-1
typo in doc: batch_input_size => batch_input_shape
2015-12-31 15:51:11 -08:00
Kyle McDonald 582dfc4233 typo in doc: batch_input_size => batch_input_shape 2015-12-31 14:53:11 -08:00
berleon 177f7b6b6e [BatchNormalization] set updates in get_output
This commit fixes the DisconnectedInputError described in issue
the `get_output` method. Before this commit the `updates` member
could use a different input than the `get_output` method, if the
input was changed.
2015-12-31 18:14:02 +01:00
berleon 579a219614 [AutoEncoder] set_previous triggers build
The `params`, `regularizers`, `constraints` and `updates` member of the
AutoEncoder were set in the `__init__` method.
When set_previous was called, the mentioned members were not updated.
This behavior resulted in a DisconnectedInputError.
Now the mentioned members are set in the `build` method and the
`set_previous` method calls the `build` method every time the
input changes. This commit fixes issue #1275.
2015-12-31 16:54:21 +01:00
François Chollet f95e6bada3 Merge pull request #1383 from tboquet/conv_deb
Fixed dnn_conv output size
2015-12-30 19:58:20 -08:00
tboquet c00cf10ef8 * deleted custom padding/replaced by a slice 2015-12-30 22:01:49 -05:00
Francois Chollet be9f7bc62f Documentation fixes 2015-12-30 13:09:16 -08:00
Francois Chollet d49baf1bfb Fix example in FAQ 2015-12-29 16:00:56 -08:00
Francois Chollet 729f0765da Progbar: scientific notation only for small values 2015-12-29 16:00:39 -08:00
François Chollet 161b31dcf3 Merge pull request #1374 from PiranjaF/master
Fix: Loading merge layer from serialized data
2015-12-29 15:16:25 -08:00
PiranjaF 643961723c Update layer_utils.py 2015-12-29 23:18:03 +01:00
François Chollet 7b95359b8e Merge pull request #1354 from easyas314159/master
Fixed handling of negative dimensions in Reshape layers
2015-12-29 11:16:17 -08:00
François Chollet 7555a32d0a Merge pull request #1372 from viirya/dedup
Minor: remove duplicate code
2015-12-29 11:13:45 -08:00
Liang-Chi Hsieh c95f5d10c2 Minor: remove duplicate code. 2015-12-29 18:24:57 +08:00
Kevin Loney 03cd7bf493 Fixed some stylistic issues and expanded the doc string for the
Reshape. _fix_unknown_dimension
2015-12-28 23:59:44 -07:00
François Chollet 308fd87031 Merge pull request #1352 from viirya/check-keras-dir-permission
Check keras_dir writing permission and assign temporary directory
2015-12-27 09:39:48 -08:00
Liang-Chi Hsieh a98eec34f7 Check basedir for dataset path. 2015-12-26 14:46:21 +08:00
Liang-Chi Hsieh b4eb1d9491 Check base dir. 2015-12-26 09:11:16 +08:00
Kevin Loney 186d95ae9c Fixed handling of negative dimensions in Reshape.output_shape and
Reshape.get_output
2015-12-25 11:14:01 -07:00
Liang-Chi Hsieh 58ed77b0d2 Check keras_dir writing permission. 2015-12-25 18:07:20 +08:00
Francois Chollet 16675b98c0 Better input validation in Sequential & Graph. 2015-12-23 13:55:13 -08:00
François Chollet 7a61cc20b9 Merge pull request #1338 from rpinsler/master
Fix typos and minor inconsistencies.
2015-12-23 11:26:23 -08:00
François Chollet 534f68ec77 Merge pull request #1336 from wb14123/loop
fix iteration shadowed in loop
2015-12-23 10:29:45 -08:00
François Chollet 2a4680ec3e Merge pull request #1335 from sjebbara/graph-prediction-output
Output Format of predict_on_batch in Graph() Model as Dictionary
2015-12-23 10:28:45 -08:00
rpinsler 85e51a0f8f Fix typos and minor inconsistencies. 2015-12-23 13:06:03 +01:00
Bin Wang 0695b82f74 fix iteration shadowed in loop 2015-12-23 17:14:51 +08:00
sjebbara 19290c07fd return outputs of predict_on_batch function of the Graph model as a dictionary 2015-12-23 09:52:39 +01:00
Francois Chollet 29e60ab372 Remove print statement in test 2015-12-22 18:09:40 -08:00
Francois Chollet eda1a9e0a4 Add tests for initializations 2015-12-22 17:57:04 -08:00
Francois Chollet 69932604f9 Fix text preprocessing test 2015-12-22 12:03:28 -08:00
Francois Chollet 7f85541785 Add text preprocessing test 2015-12-22 11:26:25 -08:00
Francois Chollet 7f3cd093c0 Fix flaky test 2015-12-22 10:37:09 -08:00
Francois Chollet 18d52e634d Add text preprocessing tests 2015-12-22 10:36:59 -08:00
Francois Chollet 485d451b62 Remove no-longer used util function. 2015-12-22 09:33:32 -08:00
Francois Chollet d5fb5d1f15 Improve callbacks docs. 2015-12-22 09:33:32 -08:00
François Chollet f65b531631 Merge pull request #1328 from phreeza/siamese_tests
Some more testing of Siamese layer
2015-12-22 09:12:47 -08:00
fchollet c8176fd3bc Update FAQ in documentation 2015-12-22 08:10:18 -08:00
fchollet d870e45eb0 Fix flaky test 2015-12-22 08:09:55 -08:00
Francois Chollet 532515cbb0 Merge branch 'master' of https://github.com/fchollet/keras 2015-12-22 07:53:44 -08:00
Francois Chollet f2f4f4ec48 Add helpful error message in Flatten 2015-12-22 07:53:21 -08:00
Thomas McColgan 3d109c6ebe split out theano only part 2015-12-22 14:33:32 +01:00
Thomas McColgan 2a0f3e3dfc further test of siamese layer 2015-12-22 14:33:32 +01:00
François Chollet 7a6a47888c Merge pull request #1324 from Chasego/patch-1
Fix the wrong link
2015-12-21 19:38:26 -08:00
cyc d8e83cc773 Fix the wrong link
Fix the wrong link for "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks"
2015-12-21 22:33:35 -05:00
Francois Chollet 908d866558 Improve unit test coverage 2015-12-21 17:55:26 -08:00
Francois Chollet 13df0bf32a Skip tensorboard test if py3 2015-12-21 14:21:57 -08:00
Francois Chollet fd632b70c5 Fix callback tests 2015-12-21 14:14:41 -08:00
Francois Chollet df612a881f Merge branch 'tboquet-tensorboard' 2015-12-21 13:53:33 -08:00
Francois Chollet b602a93e17 Update TensorBoard callback 2015-12-21 13:52:47 -08:00
tboquet 80eab1bc02 Add TensorBoard visualization callback. 2015-12-21 11:58:23 -08:00
Francois Chollet 49c343f836 Update CONTRIBUTING with info wrt commit squashing 2015-12-21 10:26:31 -08:00
François Chollet 27ee3a6bbc Merge pull request #1252 from gw0/feat-pydot-fallback
Import pydot-ng with pydot as fallback when visualizing.
2015-12-20 20:24:34 -08:00
fchollet 6ec1f7a498 Fix callback tests 2015-12-19 19:46:57 -08:00
fchollet e34f9e6deb Add tests for callbacks 2015-12-19 19:08:03 -08:00
fchollet 8d3b8ff627 Improve callback functionality 2015-12-19 19:07:50 -08:00
Francois Chollet dd58103a3c Better MemNN example 2015-12-18 15:10:52 -08:00
Francois Chollet 80096798fc Fix borked merge in test_models 2015-12-18 10:09:53 -08:00
François Chollet dac6b2f6a5 Merge pull request #1256 from consciousnesss/integration_tests_job
Split unit and IT tests
2015-12-18 09:23:42 -08:00
François Chollet 1fd55f69e5 Merge pull request #1298 from farizrahman4u/patch-21
Remove unnecessary apology :)
2015-12-18 09:20:55 -08:00
François Chollet 332f5c661f Merge pull request #1307 from gw0/fix-shape-tuples
Fix shapes should be tuples.
2015-12-18 09:20:41 -08:00
François Chollet 2f48f056ec Merge pull request #1290 from julienr/backend_from_env
Allows choosing the backend by setting the KERAS_BACKEND environment …
2015-12-18 08:52:57 -08:00
François Chollet 9d0cf9fbfa Merge pull request #1305 from fchollet/fit_generator
Fit generator
2015-12-18 08:52:27 -08:00
gw0 [http://gw.tnode.com/] bdf084e35e Fix shapes should be tuples. 2015-12-18 12:10:24 +01:00
gw0 [http://gw.tnode.com/] e5d3abdf09 Import pydot-ng with pydot as fallback when visualizing. 2015-12-18 11:05:27 +01:00
Francois Chollet 93b01aff15 dem flaky tests 2015-12-18 00:01:23 -08:00
Francois Chollet f9325e8fe5 Fix py3 compatibility. 2015-12-17 23:39:12 -08:00
Francois Chollet 3f67168c44 Fix flaky test 2015-12-17 23:03:23 -08:00
Francois Chollet f9911c10b4 Style fixes in datasets 2015-12-17 22:32:57 -08:00
Francois Chollet 47d074fec3 Add fit_generator methods in models 2015-12-17 22:32:44 -08:00
Francois Chollet 097e46837c Callback robustness fix 2015-12-17 22:14:35 -08:00
Francois Chollet b5f65dfaa4 Merge branch 'master' of https://github.com/fchollet/keras 2015-12-17 12:40:46 -08:00
Francois Chollet 5255b5df54 Style normalization in layers.core 2015-12-17 12:40:33 -08:00
Francois Chollet 58fb2b8af5 Improve BatchNorm documentation 2015-12-17 12:40:07 -08:00
Fariz Rahman c2a7ccd1cc Remove unnecessary apology :) 2015-12-18 00:32:40 +05:30
François Chollet 14d905cb4b Merge pull request #1293 from julienr/fix_upsampling
Modifies UpSampling1D/2D to repeat entries instead of tiling arrays.
2015-12-17 10:31:03 -08:00
François Chollet 6553a2bad9 Merge pull request #1289 from julienr/fix_docs_autogen
Fix docs/autogen.py to create subdirectories for autogenerated MODULES
2015-12-17 09:35:24 -08:00
François Chollet e01b824912 Merge pull request #1292 from julienr/fix_visutil
Remove hardcoded fontname in visualize_util
2015-12-17 09:34:59 -08:00
Julien Rebetez 5d685f4447 Use range instead of xrange to pass py35 tests 2015-12-17 16:32:48 +01:00
Julien Rebetez 50d3fddead Remove hardcoded fontname in visualize_util 2015-12-17 14:48:04 +01:00
Julien Rebetez 8715c70a74 Modify UpSampling1D/2D to turn [0, 1] into [0, 0, 1, 1] instead of [0, 1, 0, 1] 2015-12-17 14:47:04 +01:00
Julien Rebetez 554ed5bfc8 Add a K.repeat_elements function which works like np.repeat 2015-12-17 13:37:57 +01:00
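The difference between repeating entries and tiling arrays, which the two commits above refer to, can be shown with NumPy. This is a small sketch using `np.repeat` and `np.tile` directly; it does not reproduce the backend function `K.repeat_elements` itself, and the variable names are illustrative.

```python
import numpy as np

x = np.array([0, 1])

# Repeating: each element is duplicated in place.
repeated = np.repeat(x, 2)  # [0, 0, 1, 1]

# Tiling: the whole array is copied end to end.
tiled = np.tile(x, 2)       # [0, 1, 0, 1]
```

For upsampling, repeating is the behavior that enlarges each entry, whereas tiling places copies of the full input side by side.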
Julien Rebetez ed92c14185 Allows choosing the backend by setting the KERAS_BACKEND environment variable 2015-12-17 11:45:52 +01:00
Julien Rebetez 8c914f793b Fix docs/autogen.py to create subdirectories for autogenerated MODULES
doc

Without this, docs/autogen.py fails with a no such file or directory
'sources/layers/convolutional.md'
2015-12-17 11:40:43 +01:00
Francois Chollet 063dd7d6c5 Merge branch 'lukedeo-highway' 2015-12-16 12:15:19 -08:00
Francois Chollet 827a6323b0 Style fixes 2015-12-16 12:15:09 -08:00
Francois Chollet edbefd7f50 Merge branch 'highway' of https://github.com/lukedeo/keras into lukedeo-highway 2015-12-16 12:12:30 -08:00
olegsinyavskiy 490ba423c4 do not share data between tests 2015-12-16 11:23:27 -08:00
Luke de Oliveira ec663aeda1 Remove TimeDistributedHighway tests 2015-12-16 19:26:11 +01:00
Luke de Oliveira 69cabdf6bc Remove TimeDistributedHighway
[WIP] will be added after `TimeDistributed` generic layer.
2015-12-16 19:10:39 +01:00
Francois Chollet d04cac6526 Fix GRU activation 2015-12-16 08:17:16 -08:00
Oleg Sinyavskiy ca37f96ca3 Merge pull request #4 from fchollet/master
update master
2015-12-15 21:50:02 -08:00
olegsinyavskiy 5e06aa5ef1 update the job 2015-12-15 19:07:24 -08:00
olegsinyavskiy 2f754c79f1 move tests into tests 2015-12-15 19:05:47 -08:00
Francois Chollet 42b3d37a54 Fix flaky test in preprocessing 2015-12-15 16:47:41 -08:00
François Chollet 151a6fbab9 Merge pull request #1279 from EderSantana/sequences
Add sequences tests and fix sequences docs
2015-12-15 16:30:46 -08:00
EderSantana e27587334b fix typo 2015-12-15 18:31:19 -03:00
EderSantana 54d3b9e673 rename test_sequences to test_sequence.py 2015-12-15 18:30:01 -03:00
EderSantana 2e29ef31a7 rename to and add more complete tests 2015-12-15 18:28:17 -03:00
Francois Chollet d68f12bbdf Merge branch 'farizrahman4u-patch-16' 2015-12-15 11:22:48 -08:00
Francois Chollet 3dddabebc4 Fix style, flaky test 2015-12-15 11:22:29 -08:00
EderSantana 4dcb2c04e3 Add sequences tests and fix sequences docs 2015-12-15 15:51:17 -03:00
Francois Chollet 324d8f875f Merge branch 'patch-16' of https://github.com/farizrahman4u/keras into farizrahman4u-patch-16 2015-12-15 10:46:34 -08:00
lukedeo 1872e00bea adding better documentation to highway layers 2015-12-15 19:46:19 +01:00
Fariz Rahman 3c8618bb39 Update core.py 2015-12-16 00:05:26 +05:30
Fariz Rahman 515448f859 Add is_graph to docstring 2015-12-15 23:56:30 +05:30
Fariz Rahman 85ba9aa884 Fix test_batchnorm_config() 2015-12-15 23:44:35 +05:30
François Chollet 1e418e0200 Merge pull request #1277 from farizrahman4u/patch-19
Quotes for default string values in docs
2015-12-15 09:51:58 -08:00
Fariz Rahman 54d332bf63 Quotes for default string values in docs 2015-12-15 23:02:20 +05:30
Fariz Rahman 3d51a26749 Fix json serializing 2015-12-15 22:48:51 +05:30
François Chollet 792b755594 Merge pull request #1258 from jeffzhengye/rnn_masking_bug_fix
correct masking: switch for each example in a batch
2015-12-15 09:15:35 -08:00
Fariz Rahman b2ab55611b Fix weight save 2015-12-15 21:14:42 +05:30
Fariz Rahman 97ba6aaaa1 call base constructor in Lambda layers 2015-12-15 20:32:45 +05:30
Fariz Rahman 794c083f6b Update containers.py 2015-12-15 20:29:04 +05:30
Fariz Rahman 534d81fc10 call base constructor in SiameseHead 2015-12-15 20:25:29 +05:30
Fariz Rahman 3adbb2bd4f Fix tests 2015-12-15 18:50:07 +05:30
Fariz Rahman c2220bff6e call base constructor in Merge and Siamese 2015-12-15 18:48:23 +05:30
Fariz Rahman 775e91c573 Update core.py 2015-12-15 18:09:39 +05:30
Fariz Rahman 3e19b0252f Update test_models.py 2015-12-15 18:08:55 +05:30
Fariz Rahman e0d8c09199 default arg for mask 2015-12-15 17:58:36 +05:30
Fariz Rahman f20383b7f7 Update containers.py 2015-12-15 17:45:55 +05:30
Fariz Rahman 5c83c69936 Update containers.py 2015-12-15 17:27:40 +05:30
Fariz Rahman f5d56fa2f7 Update test_models.py 2015-12-15 16:40:49 +05:30
Fariz Rahman 080a8199f4 Fix bug regarding layer cache 2015-12-15 16:35:06 +05:30
Fariz Rahman 7fb4f4b073 Update core.py 2015-12-15 16:30:10 +05:30
Fariz Rahman 9620a497c7 Update core.py 2015-12-15 16:25:23 +05:30
Fariz Rahman 3dcff62578 add arg cache_enabled 2015-12-15 16:21:12 +05:30
Fariz Rahman 91965e8f85 Update test_models.py 2015-12-15 16:08:50 +05:30
Fariz Rahman 92f2717c99 Update test_models.py 2015-12-15 16:02:44 +05:30
Fariz Rahman 07723ccff4 Update test_models.py 2015-12-15 15:39:41 +05:30
Fariz Rahman 0855541280 Siamese graph tests 2015-12-15 15:37:08 +05:30
Fariz Rahman 810cdc4a33 Test sequential 2015-12-15 15:24:32 +05:30
Fariz Rahman 652d61be46 Update core.py 2015-12-15 15:18:54 +05:30
Fariz Rahman 060fcd3ed0 Fix Siamese layer 2015-12-15 15:18:13 +05:30
Fariz Rahman 1cf7036d1c Fix add_shared_node() 2015-12-15 15:16:09 +05:30
Fariz Rahman f519823595 Tests 2015-12-15 13:22:20 +05:30
Fariz Rahman 4d3c6c9bbe Support nested models 2015-12-15 13:18:03 +05:30
Fariz Rahman f0cba6ec83 Add mask arg 2015-12-15 13:16:36 +05:30
jeffzhengye 090ac0d138 remove incompatible tests 2015-12-15 01:46:04 -05:00
olegsinyavskiy cf202bc5bd Merge remote-tracking branch 'origin/master' into integration_tests_job 2015-12-14 19:29:40 -08:00
Oleg Sinyavskiy 2ee8917836 Merge pull request #3 from fchollet/master
update
2015-12-14 19:29:19 -08:00
lukedeo 52976242f0 adding Highway Network layers (regular and time-distr.) 2015-12-14 20:03:46 +01:00
François Chollet 20d6d2b235 Merge pull request #1260 from chenb67/master
fix model checkpointer string formatter bug
2015-12-14 10:05:05 -08:00
Chen Buskilla 9987b8314e fix model checkpointer string formatter bug 2015-12-14 13:37:20 +02:00
jeffzhengye 53c8cfe47e Merge branch 'master' into rnn_masking_bug_fix 2015-12-13 19:26:03 -05:00
jeffzhengye bab230c0d6 add a test for simple rnn 2015-12-13 19:14:50 -05:00
Francois Chollet fddcf3acd3 Merge branch 'phreeza-test-sprint' 2015-12-13 14:45:59 -08:00
Francois Chollet 90b441c587 Style fixes in tests 2015-12-13 14:45:40 -08:00
Francois Chollet c746529559 Merge branch 'test-sprint' of https://github.com/phreeza/keras into phreeza-test-sprint 2015-12-13 14:28:03 -08:00
jeffzhengye efff160cea correct masking: switch for each example in a batch 2015-12-13 15:56:34 -05:00
Francois Chollet 0f86b45918 Add links to source code in documentation 2015-12-13 12:14:54 -08:00
Max Pumperla fb3df6a6e0 Merge branch 'test-sprint' of git://github.com/phreeza/keras into test-sprint 2015-12-13 10:43:20 +01:00
Max Pumperla 537e2d2123 single quotes in core layer test 2015-12-13 10:43:07 +01:00
Francois Chollet 4135797250 Update README 2015-12-12 13:34:46 -08:00
Francois Chollet ee3af2b8d0 Update docs README 2015-12-12 12:45:16 -08:00
Francois Chollet d229c47845 Add documentation generation script 2015-12-12 12:39:22 -08:00
Francois Chollet 9a93fc51cf Add docstrings in callbacks, models, optimizers 2015-12-12 12:36:23 -08:00
Francois Chollet 06f5f43079 Documentation refactor 2015-12-12 12:36:00 -08:00
olegsinyavskiy 039d15bb55 Split unit and integration tests:
- separate job on travis for IT tests
 - refactor and document IT tests (test_tasks.py)
 - char generation test with stacked LSTM
2015-12-12 11:21:15 -08:00
Oleg Sinyavskiy 8ad725248a Merge pull request #1 from fchollet/master
update master
2015-12-12 10:27:13 -08:00
Thomas McColgan 56aaffd3ea # This is a combination of 6 commits.
# The first commit's message is:
test image preprocessing

# The 2nd commit message will be skipped:

#	add PIL to enable testing of preprocessing code

# The 3rd commit message will be skipped:

#	try a different way to install PIL on travis

# The 4th commit message will be skipped:

#	include PIL only in python 2.7

# The 5th commit message will be skipped:

#	test image preprocessing

# The 6th commit message will be skipped:

#	fall back to Pillow for python 3 image processing
2015-12-12 14:35:59 +01:00
Thomas McColgan d3493fe6c8 test image preprocessing 2015-12-12 14:33:48 +01:00
Max Pumperla 444976552e Increased test threshold in merge overlap, as build breaks because of it 2015-12-12 07:16:02 +01:00
Max Pumperla 671751fce4 Last unittest removed 2015-12-12 07:02:42 +01:00
Francois Chollet 656681e9d8 Docstrings touch-ups 2015-12-11 21:55:43 -08:00
Max Pumperla 19fadb5204 Merge master to resolve conflicts 2015-12-12 06:54:55 +01:00
Francois Chollet 25d76eb124 Merge branch 'master' of https://github.com/fchollet/keras 2015-12-11 21:39:05 -08:00
Francois Chollet 12e0052f43 Quote standardization 2015-12-11 21:38:54 -08:00
Francois Chollet 40864d8997 Add docstrings in layers. 2015-12-11 21:32:39 -08:00
François Chollet 5575f4a5a8 Merge pull request #1249 from bhargav/patch-1
Fix minor typo in recurrent.md
2015-12-11 15:58:11 -08:00
Bhargav Mangipudi af4f889fa6 Fix minor typo in recurrent.md
Fix a minor typo in docs.
2015-12-11 16:15:22 -06:00
François Chollet 40983e289a Merge pull request #1211 from farizrahman4u/patch-18
Bug fix - Cos
2015-12-11 09:37:02 -08:00
Max Pumperla a5d8b6378e Skip sklearn wrapper tests for TF for now 2015-12-11 15:16:19 +01:00
fchollet 71952f29c7 Remove make_reuters_dataset 2015-12-10 23:22:29 -08:00
fchollet febf9b2164 Fix output caching. 2015-12-10 22:11:56 -08:00
fchollet e8e8f7625b Introduce layer output caching. 2015-12-10 22:03:28 -08:00
François Chollet 3afc5ec4f0 Merge pull request #1242 from transcranial/tf-py34
add tensorflow 0.6.0 testing with python 3.4 to travis
2015-12-10 21:32:25 -08:00
fchollet de8e009e98 Merge branch 'mynameisfiber-master' 2015-12-10 21:27:37 -08:00
fchollet 23afa3f3b6 RNN unit tests touch-ups 2015-12-10 21:27:22 -08:00
fchollet b7db44866e Merge branch 'master' of https://github.com/mynameisfiber/keras into mynameisfiber-master 2015-12-10 21:15:23 -08:00
François Chollet 08470363ad Merge pull request #1243 from jeffzhengye/rnn_float32_issue
fix float32 issue: int32 times float32 will generate float64
2015-12-10 15:26:12 -08:00
Leon Chen 314ffececb tf.control_flow_ops -> tf.python.control_flow_ops 2015-12-10 17:54:44 -05:00
jeffzhengye c6f68f8311 use K.cast for compatibility 2015-12-10 17:53:50 -05:00
jeffzhengye f04272e697 fix float32 issue: int32 times float32 will generate float64 2015-12-10 17:11:47 -05:00
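The upcasting named in the commit above can be reproduced with NumPy arrays, whose type promotion rules give the same result for this case. NumPy stands in for the backend here, which is an assumption; the variable names are hypothetical and this is not the code from the commit.

```python
import numpy as np

mask = np.ones(3, dtype=np.int32)      # e.g. an integer mask
values = np.ones(3, dtype=np.float32)  # float32 values

# int32 * float32 promotes to float64, since float32 cannot
# represent every int32 value exactly.
upcast = mask * values

# Casting the integer array first keeps the result in float32.
kept = mask.astype(np.float32) * values
```

Casting before multiplying is the general idea behind keeping a computation in float32 throughout.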
Francois Chollet 7401f47da4 Fix Conv2D behavior with dim_ordering='tf' 2015-12-10 14:01:56 -08:00
Leon Chen 67332afc46 get rid of python 2 switch for backends test 2015-12-10 15:53:19 -05:00
Leon Chen b9a5ff41f8 update travis to tensorflow 0.6.0 with additional python 3.4 testing 2015-12-10 15:27:55 -05:00
Francois Chollet 45c5f36399 Remove description in __init__ 2015-12-10 11:18:32 -08:00
Francois Chollet a4fff5aba3 Fix set_input in Sequential container. 2015-12-10 09:38:40 -08:00
Micha Gorelick bf8dc2f61f nit picking 2015-12-10 09:25:59 -05:00
Micha Gorelick bf3ef82fa5 Docstrings, state_updates for non-trainable, tests 2015-12-10 09:23:50 -05:00
Thomas McColgan 8c1c86baf0 remove unused import 2015-12-10 10:06:16 +01:00
Thomas McColgan 7dd3d062ff convert core tests to pytest 2015-12-10 10:05:54 +01:00
Thomas McColgan 8ab149df8a mark skipped tests as skipped, not passed 2015-12-10 09:57:07 +01:00
Thomas McColgan fd70ad6cd2 Merge remote-tracking branch 'origin/test-sprint' into test-sprint
# Conflicts:
#	tests/test_loss_masking.py
#	tests/test_sequential_model.py
2015-12-10 09:54:57 +01:00
Thomas McColgan 6f607f6aa8 some tensorflow tests skipped for now, will raise an issue for these 2015-12-10 09:51:12 +01:00
fchollet 59406a7148 Update callbacks documentation 2015-12-09 20:53:04 -08:00
fchollet 7da1523053 Replace unittest with pytest 2015-12-09 20:41:02 -08:00
Francois Chollet 20ca5befdc Add documentation for summary() feature 2015-12-09 18:32:08 -08:00
Francois Chollet 1ef35e90fc Introduce model.summary() feature 2015-12-09 18:27:53 -08:00
Micha Gorelick e2780cd515 pep8 2015-12-09 17:42:35 -05:00
Micha Gorelick 56fa445ef1 Created new model update list for state updates
In order to propagate state through _predictions_, I created a new
property of the model, `state_updates` that returns any model step
updates that are needed when doing a stateful prediction.  These updates
are identified as *any updates defined by a stateful layer*.
2015-12-09 17:28:45 -05:00
Micha Gorelick 6a80b176e8 Created state_updates for updates on prediction 2015-12-09 16:44:08 -05:00
Francois Chollet c2534964b7 Fix SimpleRNN step function. 2015-12-09 13:11:57 -08:00
Francois Chollet 82353da4dc Remove LRN2D layer. 2015-12-09 13:04:07 -08:00
François Chollet be46766622 Merge pull request #1228 from Sebubu/add_node_doc_update
Updated doc with the missing arguments in Graph
2015-12-09 12:58:07 -08:00
Micha Gorelick 40c48dd174 Run non-optimization updates on predict for stateful RNN's 2015-12-09 15:23:07 -05:00
Severin Bühler 6a498ae420 fixed text 2015-12-09 20:07:07 +01:00
Severin Bühler 1ba7d3d8b8 Updated documentation with the missing arguments 2015-12-09 20:00:06 +01:00
Max Pumperla ca3622cbb7 test softplus 2015-12-09 15:20:36 +01:00
Max Pumperla 3ce80e8e74 Hard sigmoid test 2015-12-09 15:12:44 +01:00
Max Pumperla 02b40ccfe3 activations unit to pytest 2015-12-09 14:49:17 +01:00
Max Pumperla 9e46447080 Sigmoid activation test 2015-12-09 14:43:41 +01:00
Max Pumperla b4cd5fd342 scikit test 2015-12-09 14:34:06 +01:00
Francois Chollet 81787dd2bb Cleanup examples 2015-12-08 18:49:14 -08:00
Francois Chollet 96f3404a57 Fixes wrt RNN statefulness 2015-12-08 16:52:21 -08:00
François Chollet 3620f02258 Merge pull request #1049 from julienr/visutil_improve
utils.visualize_util improvements
2015-12-08 12:35:23 -08:00
Francois Chollet 07ffc76b93 Finalize RNN statefulness functionality 2015-12-08 10:43:06 -08:00
Francois Chollet d400fc4512 Merge branch 'maxpumperla-pooling' 2015-12-08 10:21:39 -08:00
Francois Chollet 6d6481fedd Style fixes 2015-12-08 10:20:04 -08:00
Francois Chollet 93af5e95fd Touch-ups in pooling layers 2015-12-08 10:16:47 -08:00
Francois Chollet 31534bd15e Reference backend page in docs 2015-12-08 10:16:10 -08:00
François Chollet 2586884080 Merge pull request #1216 from farizrahman4u/patch-20
Bug fix - Siamese
2015-12-08 10:10:17 -08:00
Fariz Rahman cee4f72a8e Bug fix - Siamese 2015-12-08 23:11:12 +05:30
François Chollet 36eff96a5d Merge pull request #1213 from farizrahman4u/patch-20
Fix add_shared_node()
2015-12-08 08:51:56 -08:00
François Chollet e7859bf188 Merge pull request #1214 from ozancaglayan/master
doc: Preserve the right order for model.compile
2015-12-08 08:33:40 -08:00
Ozan Caglayan d7d1ee54a5 doc: Preserve the right order for model.compile
Be coherent in the examples of Sequential() and Graph() in terms
of argument ordering.
2015-12-08 16:47:07 +01:00
Fariz Rahman 829dff1866 Fix add_shared_node()
Sorry, there was one more set_name.
2015-12-08 21:12:55 +05:30
Max Pumperla b372bd9dd4 Optimizers and regularizers moved to keras/ 2015-12-08 15:09:33 +01:00
Max Pumperla 36a41720af Remove tests in old structure 2015-12-08 15:07:50 +01:00
Max Pumperla 8b0676cb94 test_models.py used to test all models (Sequential, Graph) 2015-12-08 15:06:34 +01:00
Max Pumperla b864fa974b Renamed local variables in graph model test to fit into test_models.py 2015-12-08 15:06:01 +01:00
Max Pumperla d5b4d04b5b PEP-8 linting for advanced activations 2015-12-08 14:29:38 +01:00
Max Pumperla 57d5a7ca78 Move embedding test to keras/layers 2015-12-08 13:48:59 +01:00
Max Pumperla fe6c554e7a Moved dataset tests to keras/datasets 2015-12-08 13:44:03 +01:00
Max Pumperla bfe113556f Merge branch 'master' into test-sprint 2015-12-08 13:34:13 +01:00
Fariz Rahman de2884af79 Bug fix - Cos 2015-12-08 15:23:02 +05:30
Max Pumperla f46f04d545 Removed self key word in reshaping test 2015-12-08 08:11:12 +01:00
Max Pumperla 7d0ed02cc1 refactored tf backend pooling 2015-12-08 07:45:17 +01:00
Max Pumperla 075c34d037 Removed comment and sum mode 2015-12-08 07:44:56 +01:00
Max Pumperla 313ebae4d2 renaming mean -> average pooling 2015-12-08 07:28:05 +01:00
Max Pumperla ace2d94706 Merge master into pooling 2015-12-08 07:19:58 +01:00
fchollet 31cf6b16f4 Fix Adadelta serialization 2015-12-07 21:27:04 -08:00
François Chollet b126b6328a Merge pull request #1205 from smauq/progbar-precision
Increased Progbar precision
2015-12-07 20:36:30 -08:00
smauq f200b19056 Increased Progbar precision 2015-12-08 03:08:16 +02:00
François Chollet 526c8a58b3 Merge pull request #1197 from farizrahman4u/patch-18
Travis badge
2015-12-07 12:07:39 -08:00
Max Pumperla 51c1c94439 Minor error in conv, some linting & fixing theano backend 2015-12-07 18:31:00 +01:00
Max Pumperla c19cdf4e60 Adapted test for convolutional layers 2015-12-07 14:39:00 +01:00
Max Pumperla db0ae8216c Shape inference test 2015-12-07 14:35:50 +01:00
Max Pumperla 362e0e95aa clean up in convolutional layer 2015-12-07 14:35:37 +01:00
Max Pumperla e99dbf5857 Fixed typo in tensorflow backend 2015-12-07 14:35:14 +01:00
Max Pumperla 4854042813 Fixed formula for reshaping in image2neibs for theano 2015-12-07 14:32:07 +01:00
François Chollet a18755dc67 Merge pull request #1196 from farizrahman4u/patch-11
Fix add_shared_node()
2015-12-06 11:50:51 -08:00
Thomas McColgan ae375dd638 start testing the changed Layer interface 2015-12-06 20:42:22 +01:00
Fariz Rahman f5ff9f07d9 Travis badge 2015-12-07 01:02:32 +05:30
Thomas McColgan 14a57c10e6 tests for advanced activations
thresholded activations

parametric softplus

some bugfixes

fix error caused by calling layer.build on PReLU

seed the rng in every test individually to make them deterministic
2015-12-06 20:02:54 +01:00
Fariz Rahman 9031b77843 Fix add_shared_node()
Remove set_name
2015-12-07 00:02:35 +05:30
François Chollet 64226e4335 Merge pull request #1194 from smauq/master
Fix broken links in README.md
2015-12-06 09:41:37 -08:00
Max Pumperla b34fdbf12a docs for new MeanPooling layers 2015-12-06 18:14:38 +01:00
Max Pumperla 4259071ce0 MeanPooling1D/2D introduced 2015-12-06 18:09:56 +01:00
Max Pumperla 5ef38bd25b renaming in test 2015-12-06 18:09:27 +01:00
Max Pumperla 47a9735a6d generalized pooling in theano 2015-12-06 18:09:09 +01:00
Max Pumperla fc6c6aa6f0 generalized pooling in tf 2015-12-06 18:08:47 +01:00
smauq f18fa4af50 Fix broken links in README.md 2015-12-06 14:32:19 +02:00
François Chollet 470555f10b Merge pull request #1191 from jfsantos/patch-7
Fix typo
2015-12-05 21:49:51 -08:00
João Felipe Santos 7ab44d8d9c Fix typo
idomatic -> idiomatic
2015-12-05 23:51:58 -05:00
fchollet 3ce1883afc Update documentation 2015-12-05 19:32:14 -08:00
fchollet 93718562db One more attempt at fixing indeterminism on Travis 2015-12-05 18:58:23 -08:00
fchollet 5fe40861ce New attempt at fixing test indeterminism 2015-12-05 18:31:42 -08:00
fchollet 182e84d09b Attempt to reduce indeterminism in tests 2015-12-05 17:56:59 -08:00
François Chollet 337f86b40b Merge pull request #1190 from smauq/master
Fixed the progbar in the cifar10 example
2015-12-05 17:21:54 -08:00
Francois Chollet 1427815137 Refresh unit tests. 2015-12-05 17:02:13 -08:00
François Chollet a0c03ad577 Merge pull request #1187 from EderSantana/patch-3
tip for better issue search and FAQ link
2015-12-05 16:34:21 -08:00
François Chollet 4c742737a3 Merge pull request #1189 from EderSantana/call2
Fix tolerance comparing __call__ to np.dot
2015-12-05 16:33:52 -08:00
smauq c7b7fae654 Fix progbar 2015-12-06 02:20:27 +02:00
François Chollet 8e64a6ce77 Merge pull request #1184 from consciousnesss/switch_tasks_to_pytest
Switch tasks to pytest
2015-12-05 16:14:17 -08:00
EderSantana acbae2cebb Fix tolerance comparing __call__ to np.dot 2015-12-05 19:14:04 -05:00
Eder Santana 9789858d0e Update CONTRIBUTING.md
Link to FAQ
2015-12-05 18:52:10 -05:00
Eder Santana d23c02807b Update CONTRIBUTING.md
I'm sure some people forget to do that.
2015-12-05 18:48:02 -05:00
Francois Chollet c11bdd850a Update CONTRIBUTING.md 2015-12-05 15:35:20 -08:00
Francois Chollet 897942783a Small change in CONTRIBUTING.md 2015-12-05 15:27:54 -08:00
Francois Chollet 74b37bf87a Add CONTRIBUTING.md 2015-12-05 15:26:17 -08:00
Francois Chollet 05a82e957c Fix some typos in docs 2015-12-05 15:25:28 -08:00
François Chollet 55e62e587f Merge pull request #1162 from EderSantana/call
Add __call__
2015-12-05 14:33:12 -08:00
EderSantana 614e48e612 fix test problem due to type casting 2015-12-05 17:05:08 -05:00
EderSantana b55f991014 Update docstrings 2015-12-05 17:00:29 -05:00
EderSantana 4aa713b1dc Update docstrings 2015-12-05 16:57:28 -05:00
EderSantana 369fb05436 Fix package names and floatx 2015-12-05 16:47:39 -05:00
olegsinyavskiy cac308e88d few refactoring ideas 2015-12-05 13:36:12 -08:00
olegsinyavskiy 6b7410dc11 few ignores 2015-12-05 13:35:37 -08:00
olegsinyavskiy beb5151b7a Summary of changes:
 - switch to using pytest
 - fix deterministic seed
2015-12-05 13:17:55 -08:00
François Chollet e8818c841c Merge pull request #1182 from EderSantana/get_initial_states
Add get_initial_state method to Recurrent
2015-12-05 12:16:15 -08:00
EderSantana b691579fff Add get_initial_state method to Recurrent
With this method, I believe no other recurrent method will need to overwrite
get_output.
2015-12-05 14:38:38 -05:00
Max Pumperla 82971dedc4 Fixed get_config in Pooling1d/2d 2015-12-05 19:38:41 +01:00
Max Pumperla 61e5ea1f87 Fix for 2d pooling 2015-12-05 19:14:41 +01:00
Max Pumperla 6bfb3c648b Removed old MaxPool2D layer 2015-12-05 18:37:25 +01:00
Max Pumperla c600a4450c Removed GPL test part 2015-12-05 18:36:28 +01:00
Max Pumperla c3f3db64af Added Pooling2D base class & restructured MaxPooling2D 2015-12-05 18:35:59 +01:00
Max Pumperla a4f334c1ab Added Pooling1D base class & restructured MaxPooling1D 2015-12-05 18:22:10 +01:00
Thomas McColgan 217ce6f56a use specific version of coverage that works with coveralls 2015-12-05 14:36:23 +01:00
François Chollet ada2f2fa0d Merge pull request #1176 from dsteiner93/master
Fix mistake in backend documentation
2015-12-04 18:54:01 -08:00
dsteiner93 0f42d2db90 Fix mistake in backend documentation
The comments for all-zeros and all-ones were reversed in backend.md
2015-12-04 18:36:31 -08:00
EderSantana 90e9da4093 Fix tensorflow test 2015-12-04 20:30:36 -05:00
Oleg Sinyavskiy 194587e2a5 Merge pull request #3 from fchollet/master
update
2015-12-04 17:03:11 -08:00
Francois Chollet 61b30997eb Fix epsilon in objectives 2015-12-04 15:09:52 -08:00
EderSantana e1c7d287dc Fix tests 2015-12-04 16:44:49 -05:00
EderSantana f2e4e2ddce clean up code 2015-12-04 16:35:47 -05:00
EderSantana 4b40c34c53 Fix test_call.py docstring 2015-12-04 11:22:21 -05:00
EderSantana b002d00347 Add tests 2015-12-04 11:17:30 -05:00
EderSantana 3306f88f6d Merge branch 'call' of https://github.com/edersantana/keras into call 2015-12-04 10:08:58 -05:00
Francois Chollet 7b401f0b99 test_sequential save file cleanup 2015-12-03 22:35:52 -08:00
Francois Chollet 6c1ce0f6e9 Reintroduce image_shape and filter_shape in conv2d 2015-12-03 22:03:28 -08:00
Eder Santana ff647e04ee rebase 2015-12-03 14:50:44 -05:00
EderSantana 5f62723473 Merge branch 'master' of https://github.com/fchollet/keras into call 2015-12-03 14:48:55 -05:00
Francois Chollet f295ecb302 Actually fix floatx encoding 2015-12-03 11:23:51 -08:00
Eder Santana 9c3060d8ca Update common.py
Python 3 compatible
2015-12-03 14:17:25 -05:00
Eder Santana d2a1504060 Update core.py
Inherit from Pass from object
2015-12-03 14:06:47 -05:00
Francois Chollet 2161910a53 Fix floatx encoding on Python3 2015-12-03 11:02:57 -08:00
EderSantana 861c8a8e21 Add __call__ 2015-12-03 13:45:54 -05:00
François Chollet bb17fc7af1 Merge pull request #1147 from tboquet/fix_doc
Description of validation_data in the Graph doc
2015-12-03 10:44:18 -08:00
François Chollet 3e9f5c204a Merge pull request #1150 from consciousnesss/speed_up_keras_test_infrastracture
Speed up keras test infrastructure
2015-12-03 10:34:43 -08:00
François Chollet 39a457cccd Merge pull request #1158 from EderSantana/ascii
convert floatx to ascii
2015-12-03 10:33:30 -08:00
EderSantana 92f66a279a convert floatx to ascii 2015-12-03 10:28:14 -05:00
tboquet 5b1038abed and 2015-12-03 08:37:40 -05:00
olegsinyavskiy 4781f40eb6 These changes speed up Travis testing time by a factor of 2 using some pytest and Travis configuration options.
Summary of changes:
 - py.test is configured to display test profiling information showing the 10 slowest tests. This would allow additional speedups if anyone has ideas about a particular test. The slowest tests are usually the cifar dataset test and the tensorflow convolutions. It seems that some other IT tests could also be sped up.
 - py.test is configured to run with pytest-xdist using 2 processes in parallel, because travis does provide multicore support (1.5 cores) and because the slowest cifar test spends its time on a download, which can run in parallel with other tests.
 - travis is configured to split backend tests into a test matrix, so that theano and tensorflow tests run in parallel instead of rerunning all the tests twice for python 2.7.
 - pickle filenames in tests are renamed to avoid clashes during multiprocessing
2015-12-02 22:09:59 -08:00
tboquet 58121fa855 deleted out 2015-12-02 22:23:31 -05:00
tboquet 82befe8384 type redo 2015-12-02 22:14:14 -05:00
tboquet f49e52a291 tuple to dict in the graph documentation 2015-12-02 22:12:42 -05:00
François Chollet 7bb897dff1 Merge pull request #1143 from jfsantos/patch-6
Fix typo
2015-12-02 17:56:39 -08:00
João Felipe Santos 664ada1fd2 Fix typo
denses -> dense
2015-12-02 17:27:52 -05:00
Francois Chollet 0933147dc8 Fix typo in recurrent. 2015-12-02 10:00:22 -08:00
Francois Chollet 1c6ab36c63 Fix typo in recurrent. 2015-12-02 09:57:19 -08:00
François Chollet 535af0b17d Merge pull request #1137 from stonebig/patch-1
version adjustment
2015-12-02 09:50:23 -08:00
Francois Chollet 9f917f265c Merge branch 'master' of https://github.com/fchollet/keras 2015-12-02 09:31:05 -08:00
Francois Chollet 54c025ac26 Remove neural turing machine, unviable in 0.3.0 2015-12-02 09:30:41 -08:00
stonebig 6fe65a6a1d version adjustment 2015-12-02 18:28:02 +01:00
François Chollet 5956dbe8fa Merge pull request #1096 from tboquet/master
* fixed validation y size for weighting
2015-12-01 20:54:35 -08:00
Francois Chollet be39e25b86 Fix Theano conv2d on cudnn with border_mode=same 2015-12-01 16:23:13 -08:00
Francois Chollet 905770099c Fix recurrent batch_input_shape error message 2015-12-01 16:17:49 -08:00
Francois Chollet 2553f07c3c Add support for batch_input_shape kwarg. 2015-12-01 15:56:12 -08:00
Francois Chollet af93198bde Documentation update. 2015-12-01 15:55:43 -08:00
Francois Chollet aaa47f0d20 Minor doc fixes 2015-12-01 10:03:37 -08:00
Francois Chollet df860fdb94 Update IRNN example 2015-11-29 13:08:45 -08:00
Francois Chollet 361a7cfe41 Update addition example 2015-11-29 12:54:45 -08:00
Francois Chollet 8f2d6d2714 Add support for time-distributed softmax. 2015-11-29 12:54:26 -08:00
Francois Chollet cbee000b66 Fix TF RNN issues 2015-11-29 12:22:41 -08:00
Francois Chollet 7ecd6c3c5f Fix backend tests 2015-11-29 10:38:48 -08:00
Francois Chollet 060ef32ce0 Merge branch 'backend' of https://github.com/fchollet/keras into backend 2015-11-29 10:15:45 -08:00
Francois Chollet 36392c75d7 Fix backend tests 2015-11-29 10:15:37 -08:00
François Chollet 2f01f29995 Merge pull request #1105 from transcranial/backend
fix tensorflow travis config
2015-11-28 21:13:41 -08:00
Leon Chen 0ab3a6c00c use ubuntu 14.04 on travis 2015-11-28 23:42:54 -05:00
Leon Chen 475fa79ec4 fix tensorflow travis config 2015-11-28 23:18:07 -05:00
Francois Chollet e600b0d947 Attempt to fix Travis config 2015-11-28 17:51:35 -08:00
Francois Chollet 677e15cd02 Attempt to fix Travis config 2015-11-28 17:50:23 -08:00
Francois Chollet ea2fd6526b Attempt to fix TF on Travis 2015-11-28 17:43:48 -08:00
Francois Chollet 26b040effe Attempt to fix TF in Travis config 2015-11-28 17:36:52 -08:00
Francois Chollet ef3ef71ee6 Add exception for TF + NTM 2015-11-28 17:19:13 -08:00
Francois Chollet 5c3aea2202 Fix backend init 2015-11-28 17:05:49 -08:00
Francois Chollet b6ea543a46 Fix Travis config. 2015-11-28 16:59:48 -08:00
Francois Chollet e32436144b Add tests 2015-11-28 16:39:42 -08:00
Francois Chollet 2bc900c3d2 Update Travis config. 2015-11-28 16:36:09 -08:00
Francois Chollet 6c458ff281 Update layers. 2015-11-28 16:35:55 -08:00
Francois Chollet 69a8acc05a Update backend functionality. 2015-11-28 16:35:20 -08:00
Francois Chollet 71c6c83e30 Update examples. 2015-11-28 16:34:52 -08:00
Francois Chollet 634aedca1a Update documentation. 2015-11-28 16:34:35 -08:00
Francois Chollet 4b39b5f36b Remove deprecated code. 2015-11-28 16:34:06 -08:00
tboquet 1a7c6627ac * fixed validation y size for weighting 2015-11-27 23:17:05 -05:00
Francois Chollet 25e85b616f Merge master 2015-11-26 12:28:38 -08:00
Francois Chollet 47ed18a3af Update backends with rnn support 2015-11-26 10:42:52 -08:00
François Chollet febc604fc4 Merge pull request #1081 from farizrahman4u/patch-17
Bug fix - Siamese layer
2015-11-25 13:28:59 -08:00
François Chollet 8612da30a9 Merge pull request #1082 from PFischbeck/patch-1
Fix typo in index.md
2015-11-25 13:27:04 -08:00
Fariz Rahman 9c1afbb667 Update containers.py 2015-11-26 02:38:25 +05:30
Philipp Fischbeck 8c9c8ae5ad Fix typo in index.md 2015-11-25 22:06:56 +01:00
Fariz Rahman b92cec3d48 Fix for nested models 2015-11-26 02:21:42 +05:30
Fariz Rahman cbb9d00106 Update core.py 2015-11-26 02:09:43 +05:30
Fariz Rahman aca4a7735e Bug fix 2015-11-26 02:07:51 +05:30
Francois Chollet 6afce5862a Merge branch 'farizrahman4u-patch-16' 2015-11-25 11:53:57 -08:00
Francois Chollet f2a14a9b57 Update doc for add_shared_node in Graph model. 2015-11-25 11:53:28 -08:00
Francois Chollet 4429547354 Merge branch 'patch-16' of https://github.com/farizrahman4u/keras into farizrahman4u-patch-16 2015-11-25 11:45:06 -08:00
François Chollet b733951a19 Merge pull request #1079 from transcranial/elu
add exponential linear units activation layer
2015-11-25 10:30:12 -08:00
Fariz Rahman 78c61c65c2 Update containers.py 2015-11-25 23:38:18 +05:30
Fariz Rahman 242793a223 Update Graph doc 2015-11-25 23:36:16 +05:30
Fariz Rahman 5c1e5d5e40 Update core.py 2015-11-25 23:25:16 +05:30
Fariz Rahman e34022bf12 Docstrings 2015-11-25 23:13:49 +05:30
Fariz Rahman dac1e8ec65 Add add_shared_node() 2015-11-25 23:12:52 +05:30
Fariz Rahman 7867abdd69 Docstrings 2015-11-25 22:53:03 +05:30
Fariz Rahman 2685134832 Add add_shared_layer() 2015-11-25 22:52:29 +05:30
Fariz Rahman 138e77d0d6 Add Siamese layer 2015-11-25 22:50:53 +05:30
François Chollet 59a30e99a3 Merge pull request #1073 from pse1202/master
Typo fix
2015-11-25 09:03:38 -08:00
Leon Chen 71ee360c20 add exponential linear units activation layer 2015-11-25 11:20:49 -05:00
Sangeon Park 88cc300103 Typo fix
tran and test sets -> train and test sets
2015-11-25 07:42:15 +09:00
Julien Rebetez 98c6c8b3b6 Improve visualize_util to support container layers with optional recursion when plotting.
Also add an option to show layer shapes
2015-11-23 10:09:18 +01:00
François Chollet b059945195 Merge pull request #1059 from farizrahman4u/patch-15
Fix misleading comment in babi_memnn.py
2015-11-22 14:57:29 -08:00
Fariz Rahman 819f569ca8 Update babi_memnn.py 2015-11-23 04:16:32 +05:30
Fariz Rahman d322785543 Fix misleading comment in babi_memnn.py
Output of question_encoder is 3D : sequence of vectors, not single vector.
2015-11-23 03:54:32 +05:30
Francois Chollet 05aecbc0bc Fix README, docs captioning example 2015-11-22 12:05:26 -08:00
Francois Chollet 37ebbc3a1c Remove outdated comment 2015-11-22 12:03:53 -08:00
François Chollet 252db48746 Merge pull request #1052 from EderSantana/patch-4
Fix typos and even more informative docs.
2015-11-21 14:39:15 -08:00
Eder Santana 2d3880dc35 Fix typos and even more informative docs.
Even more informative docs. Sorry for not doing all the work at once.
2015-11-20 21:25:24 -05:00
François Chollet 50467e32a2 Merge pull request #990 from EderSantana/ntm
Neural Turing Machines
2015-11-20 17:56:38 -08:00
François Chollet e1cc291a25 Merge pull request #1046 from julienr/visutil_to_graph
Add visualize_util.to_graph and docs
2015-11-20 08:55:34 -08:00
Julien Rebetez 71b258d21e Add visualization section to docs 2015-11-20 11:03:31 +01:00
Julien Rebetez c1b54159b8 Add a new to_graph function in visualize_util that allows one to get the pydot.Graph object directly. This can be used to display the graph inline in a notebook. 2015-11-20 11:03:04 +01:00
François Chollet 6a231f1a24 Merge pull request #1041 from neggert/lambda_enhancements
Lambda layer enhancements
2015-11-19 22:39:28 -08:00
François Chollet b82223b2f2 Merge pull request #1037 from farizrahman4u/patch-9
Fix load_from_json for models with Lambda layer
2015-11-19 22:38:47 -08:00
François Chollet 0e69226546 Merge pull request #1039 from phreeza/patch-3
Update visualize_util.py to fix #1036
2015-11-19 22:38:33 -08:00
EderSantana 89132a6986 Fix typos and update docs 2015-11-19 23:46:55 -05:00
Francois Chollet 8b75182a17 Update backend 2015-11-19 20:14:48 -08:00
Francois Chollet bc5e993ae3 Update tests 2015-11-19 20:14:38 -08:00
Francois Chollet 8f2b5f0458 Continue Keras conversion to dual-backend version 2015-11-19 20:14:29 -08:00
Francois Chollet 34838cd369 Update examples 2015-11-19 20:13:49 -08:00
z001qdp 71b00324d8 Allow Lambda layer to pass arguments to Layer constructor.
Most importantly, the `input_shape` argument. This allows a Lambda
layer to be the first layer in a net, which was previously
impossible.
2015-11-19 16:35:26 -06:00
Nic Eggert 16590ccce5 Fix typo in Lambda layer 2015-11-19 16:17:48 -06:00
Thomas McColgan 2431764fed Update visualize_util.py to fix #1036 2015-11-19 14:53:51 +01:00
Fariz Rahman d444b9190f Update layer_utils.py 2015-11-19 16:15:26 +05:30
Francois Chollet 8ad18ce8f5 Update a number of tests. 2015-11-18 17:46:37 -08:00
Francois Chollet a744b600e9 Convert a number of layers. 2015-11-18 17:46:25 -08:00
Francois Chollet 52dac5e4b3 Update backends. 2015-11-18 17:45:29 -08:00
Francois Chollet e94f29cac4 Fix bug with validation data in Graph model 2015-11-18 13:32:06 -08:00
Francois Chollet 4e519f7aa7 Convert activations, constraints, noise, embedding 2015-11-17 21:39:15 -08:00
Francois Chollet fd05964135 Update TH / TF backends 2015-11-17 21:38:39 -08:00
Eder Santana 747d4a30b1 Update ntm.py
remove commented code
2015-11-17 21:03:06 -05:00
Eder Santana ab19cf7b07 Update ntm.py
clean up documentation
2015-11-17 21:00:51 -05:00
Eder Santana 6b8970abd3 Update neural_turing_machine_copy.py
Fix wrong import and train both models
2015-11-17 20:53:11 -05:00
Francois Chollet 5ed913da11 Convert constraints, initialization, activations 2015-11-15 15:01:58 -08:00
Francois Chollet 6ffa18e390 Convert objectives to Keras backend. 2015-11-15 14:33:20 -08:00
Francois Chollet 33ed943ad5 Convert optimizers to Keras backend. 2015-11-15 14:33:08 -08:00
Francois Chollet 368df8ef04 Convert regularizers to Keras backend. 2015-11-15 14:32:58 -08:00
Francois Chollet 24b5e80667 Fix axis specification in TF. 2015-11-15 11:30:07 -08:00
Francois Chollet 15fea0488a Style fixes 2015-11-14 22:17:13 -08:00
Francois Chollet cda6a998ef Add initial version of Theano/TensorFlow backends 2015-11-14 22:08:21 -08:00
François Chollet 653ee91d35 Merge pull request #999 from dbonadiman/patch-4
Update docs with go_backwards option
2015-11-14 11:04:50 -08:00
François Chollet eb8b37604c Merge pull request #1001 from dbonadiman/patch-5
Fixed error go_backward and return sequences.
2015-11-12 20:44:31 -08:00
Daniele Bonadiman e1bd779463 fixed an error in backward computation of recurrent layers on return sequences. 2015-11-12 13:25:47 +01:00
Daniele Bonadiman 62154fd6c6 Merge pull request #2 from fchollet/master
Up to date 2
2015-11-12 12:17:57 +01:00
Daniele Bonadiman f8b0c7e2e3 Update doc with go_backwards option 2015-11-12 11:48:32 +01:00
François Chollet 7df515b607 Merge pull request #995 from farizrahman4u/patch-9
Include samples dim in output_shape's argument
2015-11-11 10:33:24 -08:00
Fariz Rahman b65c665d2a Include samples dim in output_shape's argument 2015-11-11 23:58:12 +05:30
Francois Chollet dfa6b3a4c3 Merge branch 'farizrahman4u-patch-9' 2015-11-11 08:58:16 -08:00
Francois Chollet 2e6025d7cc Fix coding style in documentation for Lambda 2015-11-11 08:58:04 -08:00
Francois Chollet 1a721b42c7 Merge branch 'patch-9' of https://github.com/farizrahman4u/keras into farizrahman4u-patch-9 2015-11-11 08:47:08 -08:00
François Chollet d0677a1e1f Merge pull request #992 from tzachar/patch.02
Fix containers.graph sav/get_weights
2015-11-11 08:44:13 -08:00
Fariz Rahman f6f8bf643e Update core.md 2015-11-11 21:08:30 +05:30
Fariz Rahman d6aebff5b2 Added documentation for LambdaMerge layer 2015-11-11 21:07:36 +05:30
Fariz Rahman e60b7de064 Update core.md 2015-11-11 20:44:34 +05:30
Fariz Rahman cbd8eb7a55 Added documentation for Lambda layer with example 2015-11-11 20:39:53 +05:30
nir tzachar 20dc637cd6 Fix containers.graph sav/get_weights
The graph container was not using each individual layer's get/set
weights but rather the superclass layer.get_weights, which works on
self.params(), so it was missing some weights in the process. For
example, BatchNormalizationLayer has a custom get_weights that saves the
running mean/std. However, these running computations are not added to
BatchNormalizationLayer.params(), so these weights were lost
after serializing a graph model that uses a BatchNormalizationLayer.

Fixed to use each node's get/set weights.
2015-11-11 11:40:43 +02:00
François Chollet e037e17557 Merge pull request #950 from rushter/autoencoder-fix
Fix AutoEncoder's input_shape
2015-11-10 18:08:38 -08:00
François Chollet 9ad5bf3a7b Merge pull request #986 from tzachar/patch
Fix TimeDistributedMerge
2015-11-10 18:07:54 -08:00
EderSantana 9e4e317bcb Neural Turing Machines 2015-11-10 19:16:26 -05:00
nir tzachar db60707cb8 Fix TimeDistributedMerge
When mode='ave', and the dtype of the input is float32, dividing the sum
by shape[1], which is of dtype int64, results in an output of dtype
float64, which is wrong.

fixed to use theano.tensor.mean instead.
2015-11-10 09:40:37 +02:00
Francois Chollet 91ea495b56 MemNN Python3 compatibility 2015-11-09 18:47:29 -08:00
Francois Chollet d9596ae68d Add memory network example. 2015-11-09 18:47:29 -08:00
Francois Chollet 1dc1d25a58 Add memory network example. 2015-11-09 16:37:44 -08:00
François Chollet 527b1ed0dc Merge pull request #978 from dbonadiman/patch-3
Fix imdb bidirectional
2015-11-09 08:03:07 -08:00
Fariz Rahman 17d2f5f960 del instead of setting to None 2015-11-09 18:54:54 +05:30
Fariz Rahman 2ef4d698ef Whitespace fix 2015-11-09 18:50:52 +05:30
Fariz Rahman bde97055c0 Update core.py 2015-11-09 18:40:01 +05:30
Fariz Rahman 3b0e2b0c79 Del ndim arg,exclude samples dim from output shape 2015-11-09 18:37:32 +05:30
Daniele Bonadiman 3e3c43d3ee fix TypeError: fit() got an unexpected keyword argument 'show_accuracy' 2015-11-09 12:06:18 +01:00
Daniele Bonadiman cb60477548 Merge pull request #1 from fchollet/master
Up to date
2015-11-09 09:25:42 +00:00
Francois Chollet 3d25fae014 Merge branch 'dbonadiman-add-1' 2015-11-08 18:16:33 -08:00
Francois Chollet 4884595e4d Style fixes in bidirectional LSTM example. 2015-11-08 18:16:10 -08:00
Francois Chollet 1760ed6cd7 Merge branch 'add-1' of https://github.com/dbonadiman/keras into dbonadiman-add-1 2015-11-08 18:07:20 -08:00
François Chollet b1bbedfb46 Merge pull request #972 from dbonadiman/patch-1
[Patch] Fixing issue with dot output shape
2015-11-08 18:05:35 -08:00
Francois Chollet 227c300a0a Remove sudo from Travis CI config 2015-11-08 17:04:14 -08:00
Francois Chollet f5819a0d4e Update Travis CI config 2015-11-08 16:59:14 -08:00
Daniele Bonadiman 28882a868d fix https://github.com/fchollet/keras/pull/970#issuecomment-154886091 2015-11-09 00:29:14 +01:00
Daniele Bonadiman b32e60cfa0 fixing issues with dot output shape 2015-11-08 23:26:33 +01:00
Daniele Bonadiman ca360b0d15 bidirectional lstm example added. 2015-11-08 23:19:14 +01:00
Daniele Bonadiman 10852b2529 go_backwards added to recurrent layers. 2015-11-08 22:56:42 +01:00
rushter cc0108097c Fix AutoEncoder's input_shape 2015-11-04 10:12:39 +03:00
François Chollet fe14a845ab Merge pull request #942 from matsuyamax/master
Fixes and input checks for dot mode in merge.
2015-11-02 20:45:51 -08:00
Makoto Matsuyama 5964848bdf Fix Python3 compatibility 2015-11-02 20:14:33 -08:00
Makoto Matsuyama 002a9d5d2b Fixes and input checks for dot mode in merge. 2015-11-02 19:37:00 -08:00
François Chollet eba8530e7b Merge pull request #941 from stephenroller/master
imported TimeDistributedMerge
2015-11-02 14:47:42 -08:00
Stephen Roller 7432034ada TimeDistributedMerge needed to be imported into layer_utils, so it could be serialized to to_json 2015-11-02 16:36:05 -06:00
François Chollet 99331b83f9 Merge pull request #938 from farizrahman4u/patch-12
Avoid recalculation of output in join merge.
2015-11-02 12:37:31 -08:00
François Chollet 0b326d9688 Merge pull request #939 from stephenroller/master
Fix merge concat mode
2015-11-02 11:58:22 -08:00
Stephen Roller 57612707c1 Fix merge concat mode; the assertion checked the opposite of what was desired. 2015-11-02 13:51:10 -06:00
Fariz Rahman 5352e46e09 Avoid recalculation of output in join merge. 2015-11-03 00:10:43 +05:30
François Chollet a5d93bfdc1 Merge pull request #935 from liormagen/patch-1
Handle an empty document
2015-11-02 08:18:05 -08:00
Lior Magen 742ccaa2cf Handle an empty document
In case of an empty sequence (as a result of clearing stop words for example), don't raise an error but continue to the next sequence.
2015-11-02 16:39:18 +02:00
François Chollet 284ef7b495 Merge pull request #927 from LeavesBreathe/patch-2
Update Docs for RMSE in objectives
2015-11-01 10:59:42 -08:00
LeavesBreathe 02180e881d Update Docs for RMSE in objectives 2015-10-31 13:13:21 -04:00
Francois Chollet 4a6b03403b Fix ParametricSoftplus 2015-10-31 10:12:27 -07:00
François Chollet ccdd5d147d Merge pull request #926 from rushter/small-fixes
Small fixes
2015-10-31 10:11:16 -07:00
François Chollet 3c95896415 Merge pull request #922 from LeavesBreathe/patch-2
Added Root Mean Square Error
2015-10-31 09:57:05 -07:00
rushter 66edb6aea0 Fix python 3 compatibility 2015-10-31 18:16:48 +03:00
rushter 5590dc7b0d Fix read-only property can't be changed 2015-10-31 18:11:51 +03:00
rushter 8f2810bfee Remove useless declaration 2015-10-31 17:32:01 +03:00
rushter 42e582b1ba Fix typos 2015-10-31 17:30:00 +03:00
LeavesBreathe 43bf884b5f Added Root Mean Square Error 2015-10-31 08:59:04 -04:00
François Chollet 234408e82e Merge pull request #920 from farizrahman4u/patch-2
Fix dot_axis API
2015-10-30 13:47:51 -07:00
François Chollet 41db741881 Merge pull request #917 from contextvision/dropout
Dropout: rescale data at training time (breaks bw compat.)
2015-10-30 13:47:21 -07:00
François Chollet 46bfa18a57 Merge pull request #911 from jpeg729/custom_objects
Change the custom_layer api to support custom_objects
2015-10-30 13:41:54 -07:00
Francois Chollet 7e85390d8e Merge branch 'master' of https://github.com/fchollet/keras 2015-10-30 13:14:03 -07:00
Francois Chollet a2d01238af Fix flaky Travis test 2015-10-30 13:13:50 -07:00
Fariz Rahman ba982e7ee0 Whitespace fix 2015-10-30 23:28:15 +05:30
Fariz Rahman 18f122e1d9 Fix dot_axis API
Make the default value of dot_axes more useful.
2015-10-30 23:26:52 +05:30
Mikael Rousson bae645b65e rescale data at training time (breaks bw compat.) 2015-10-30 09:38:10 +01:00
jpeg729 66ad0bf736 Change the custom_layer api to support custom_objects
A small change permitting the use of custom loss functions, custom
optimizers, custom activation functions, and so on.
2015-10-29 16:31:17 +01:00
Fariz Rahman 469640bd45 Fix serialization recursion problem 2015-10-29 08:57:05 +05:30
Fariz Rahman 2f8b351c95 Update core.py 2015-10-29 08:23:18 +05:30
Fariz Rahman 8ee2ffd7e1 Update core.py 2015-10-29 08:22:09 +05:30
Fariz Rahman fb24dc1904 Update core.py 2015-10-29 00:02:00 +05:30
Fariz Rahman 79e702ec53 Added test for Lambda layer 2015-10-28 23:42:43 +05:30
François Chollet bd005d66af Merge pull request #906 from farizrahman4u/patch-2
Add dot_axes argument to Graph
2015-10-28 10:57:26 -07:00
Fariz Rahman 0354e64521 Update test_sequential_model.py 2015-10-28 12:39:02 +05:30
Fariz Rahman a4552fb004 Use input shape instead of input layer 2015-10-28 12:38:09 +05:30
Fariz Rahman 455d7d10db Add dot_axes argument to Graph
Add dot_axes argument, used by the recently added dot merge mode.
2015-10-28 12:07:53 +05:30
Francois Chollet 1c7585a563 Fix failing test 2015-10-27 22:08:41 -07:00
Francois Chollet 22566c37ce Fix dot_axes API in Merge 2015-10-27 22:08:25 -07:00
Fariz Rahman 1f0178e793 Whitespace fix 2015-10-28 10:32:51 +05:30
Fariz Rahman fbe2ea6537 Update core.py 2015-10-28 10:31:16 +05:30
Fariz Rahman 7b0be64757 White space/style/comment fix 2015-10-28 10:28:54 +05:30
François Chollet 0edbdcc998 Merge pull request #905 from farizrahman4u/patch-10
Dot and cos merge modes
2015-10-27 21:50:11 -07:00
Fariz Rahman 21dcabb6e8 Update test_sequential_model.py 2015-10-28 09:55:34 +05:30
Fariz Rahman c34578aece Added test for dot merge with tuple and int axes 2015-10-28 09:48:42 +05:30
Fariz Rahman 01a3e298fb Dot and cos merge modes
Cleaner version of #891
2015-10-28 09:44:18 +05:30
Fariz Rahman 42a65d4b83 Update core.py 2015-10-28 08:43:48 +05:30
Fariz Rahman cb836091ec Update core.py 2015-10-28 08:28:16 +05:30
Fariz Rahman 6a05247711 Update core.py 2015-10-28 04:10:43 +05:30
Fariz Rahman c015acb413 Update core.py 2015-10-28 03:43:48 +05:30
Fariz Rahman feac8dba01 Update test_sequential_model.py 2015-10-28 03:38:01 +05:30
Fariz Rahman ba389ffb11 Added LambdaMerge 2015-10-28 03:30:32 +05:30
Fariz Rahman c6c3a0247c Add auto_name to config 2015-10-28 02:48:43 +05:30
Fariz Rahman c140939c5a Update core.py 2015-10-28 02:39:18 +05:30
François Chollet a9cecbbe43 Merge pull request #897 from amitbeka/load-name-consume
utils.layer_utils: don't consume 'name' when loading
2015-10-27 13:48:10 -07:00
François Chollet 0f0d264e55 Merge pull request #903 from EderSantana/patch-2
property input_shape for Sequential
2015-10-27 13:35:36 -07:00
Eder Santana a9d0905897 property input_shape for Sequential
Fix #902
2015-10-27 16:23:59 -04:00
Fariz Rahman c90d2dfc0c Update core.py 2015-10-28 01:36:47 +05:30
Fariz Rahman 315a12ac58 Pass value for auto_name argument 2015-10-28 01:19:11 +05:30
Fariz Rahman f5896236fa Added auto_name argument for merge layer 2015-10-28 01:16:36 +05:30
Fariz Rahman 29e788dc2e Update test_sequential_model.py 2015-10-27 21:49:58 +05:30
Fariz Rahman 07f8cef29c Update core.py 2015-10-27 19:59:35 +05:30
Fariz Rahman a069161d0f Update test_sequential_model.py 2015-10-27 19:58:16 +05:30
François Chollet 26b57a6cdd Merge pull request #860 from jeanpijon/monitor_directions
Monitor directions #2
2015-10-26 10:12:53 -07:00
Fariz Rahman 45577f7959 Added type checking for output_shape return value 2015-10-26 21:54:07 +05:30
Fariz Rahman 25aab5d405 Indentation fix 2015-10-26 21:16:48 +05:30
Fariz Rahman a58e037715 Added doc string 2015-10-26 21:02:34 +05:30
Fariz Rahman eb62075078 Roll back 2015-10-26 20:43:17 +05:30
Pesan Jan 3663f64bb4 Modification of ModelCheckpoint to allow monitoring of increasing
metrics
2015-10-26 11:58:08 +01:00
Amit Beka fcd54ac85b utils.layer_utils: don't consume 'name' when loading
When loading regularizers/constraints from config, and the object isn't
found, don't consume the 'name' key.

This enables expansions to keras to be saved/loaded with dictionaries as
some of their parameters.

Signed-off-by: Amit Beka <amit.beka@gmail.com>
2015-10-26 12:11:07 +02:00
Fariz Rahman fb8e80daf4 Consider the case where Lambda is input layer 2015-10-26 14:52:54 +05:30
Fariz Rahman ac3d135b84 Only allow named inputs to be merged by join mode 2015-10-26 14:46:51 +05:30
Fariz Rahman d876dae965 Named layers before join merging 2015-10-26 14:45:06 +05:30
Fariz Rahman 345d78e27d Added default value for output shape 2015-10-26 08:41:34 +05:30
Fariz Rahman d5827c8917 Whitespace fix 2015-10-26 07:51:27 +05:30
Fariz Rahman 779bf2a8d2 Made compatible with Python 3 2015-10-26 07:50:55 +05:30
Fariz Rahman 0fab8ac373 Update test_sequential_model.py 2015-10-26 07:26:47 +05:30
Fariz Rahman b3594ccd5e Added test for serializing model with Lambda layer 2015-10-26 07:18:20 +05:30
Fariz Rahman 15b84d375f Update core.py 2015-10-26 07:00:02 +05:30
Fariz Rahman 312120272f Add Python3 support 2015-10-26 06:54:46 +05:30
Fariz Rahman 97b27bfd1e Update test_sequential_model.py 2015-10-26 06:24:59 +05:30
Fariz Rahman a0c5f5743c Update core.py 2015-10-26 06:19:18 +05:30
Fariz Rahman 688b898dc8 Update test_sequential_model.py 2015-10-26 05:50:19 +05:30
Fariz Rahman 7def993fe2 Update test_sequential_model.py 2015-10-26 05:40:56 +05:30
Fariz Rahman c33a551f97 Update test_sequential_model.py 2015-10-26 05:20:23 +05:30
Fariz Rahman 3f8bfd2b1c Added test for Lambda layer 2015-10-26 05:04:34 +05:30
Fariz Rahman c6cf4425d3 Update core.py 2015-10-26 04:21:56 +05:30
Fariz Rahman aa6cf91a18 White space/ style fix 2015-10-26 04:00:44 +05:30
Fariz Rahman 5faef2fc56 Changed order of inheritance 2015-10-26 03:58:41 +05:30
Fariz Rahman ad254f99eb Bug fix 2015-10-26 03:47:56 +05:30
Fariz Rahman e6cecb5d12 Separate Lambda and MaskedLambda layers 2015-10-26 03:29:57 +05:30
Fariz Rahman 6106d1fa30 Indentation fix 2015-10-26 02:30:00 +05:30
Fariz Rahman ab6b11d86e Lambda Layer
Lambda layer can be used to implement any arbitrary function when stacked on top of a join merge layer.
2015-10-26 02:20:24 +05:30
François Chollet 6e2e6eff89 Merge pull request #892 from amitbeka/fix-requirements
fix setup.py: add six as a requirement
2015-10-25 10:25:46 -07:00
Amit Beka e60680b12c fix setup.py: add six as a requirement
Signed-off-by: Amit Beka <amit.beka@gmail.com>
2015-10-25 18:06:41 +02:00
farizrahman4u 611c4851e3 Merge pull request #1 from fchollet/master
Update fork
2015-10-25 15:36:42 +05:30
Francois Chollet 47071f9303 Merge branch 'farizrahman4u-patch-8' 2015-10-24 21:08:12 -07:00
Francois Chollet 14486397c5 Whitespace / style fixes 2015-10-24 21:07:47 -07:00
Francois Chollet 0cd30480fb Merge branch 'patch-8' of https://github.com/farizrahman4u/keras into farizrahman4u-patch-8 2015-10-24 20:46:10 -07:00
Francois Chollet 74b590eff8 Whitespace / style fixes only 2015-10-24 20:39:06 -07:00
Francois Chollet 25a2a972b1 Merge branch 'patch-1' of https://github.com/farizrahman4u/keras into farizrahman4u-patch-1 2015-10-24 20:28:51 -07:00
François Chollet 01bb88513d Merge pull request #889 from trungnt13/master
fixed error in reading hdf5 dataset
2015-10-24 14:22:30 -07:00
farizrahman4u 0d233b2512 Update imdb_cnn_lstm.py 2015-10-25 02:43:02 +05:30
farizrahman4u e704e1578e Update imdb_cnn_lstm.py
Style fix
2015-10-25 02:37:47 +05:30
nick_artin 0c0d91df2e Fixed error in reading hdf5 dataset 2015-10-24 23:00:17 +02:00
François Chollet 52fd68f4c5 Merge pull request #886 from alvations/master
Added version information and docstring to __init__
2015-10-24 13:05:34 -07:00
alvations 7b0b324041 Merge pull request #2 from alvations/alvations-patch-1
Removed the unused import
2015-10-24 20:23:11 +02:00
alvations 8b24b3343f Removed the unused import 2015-10-24 20:20:32 +02:00
alvations 084b235c62 Merge pull request #1 from alvations/alvations-patch-1
Added version information and docstring to __init__
2015-10-24 15:29:40 +02:00
alvations 9a384888d7 Added version information and docstring to __init__ 2015-10-24 15:21:34 +02:00
farizrahman4u 6cc827ca55 Bug fix 2015-10-24 15:11:20 +05:30
farizrahman4u d2051a5df1 Added shape checking for concat merge 2015-10-24 14:53:36 +05:30
François Chollet e7c6d598a9 Merge pull request #839 from mmmikael/trainable
added possibility to freeze layers during training
2015-10-23 16:40:38 -07:00
farizrahman4u 1526d81ab3 Faster and better Sentiment Analysis example.
Faster and more accurate sentiment analysis using combination of convolutional and recurrent layers. Better and faster results when compared to using either convnet or rnn alone.

Comparison with other sentiment analysis examples (run on a slow machine so that the time differences are visible):

imdb_lstm.py

Train...
Train on 20000 samples, validate on 5000 samples
Epoch 1/4
20000/20000 [==============================] - 784s - loss: 0.4773 - acc: 0.7769 - val_loss: 0.3613 - val_acc: 0.8396
Epoch 2/4
20000/20000 [==============================] - 788s - loss: 0.2691 - acc: 0.8946 - val_loss: 0.3644 - val_acc: 0.8376
Epoch 3/4
20000/20000 [==============================] - 791s - loss: 0.1770 - acc: 0.9351 - val_loss: 0.3913 - val_acc: 0.8370
Epoch 4/4
20000/20000 [==============================] - 800s - loss: 0.1137 - acc: 0.9612 - val_loss: 0.4621 - val_acc: 0.8308
5000/5000 [==============================] - 48s
Test score: 0.46212145137
Test accuracy: 0.8308

imdb_cnn.py

Train on 20000 samples, validate on 5000 samples
Epoch 1/3
20000/20000 [==============================] - 1414s - loss: 0.6401 - acc: 0.5930 - val_loss: 0.5144 - val_acc: 0.7442
Epoch 2/3
20000/20000 [==============================] - 1411s - loss: 0.3908 - acc: 0.8255 - val_loss: 0.3615 - val_acc: 0.8344
Epoch 3/3
20000/20000 [==============================] - 1416s - loss: 0.3173 - acc: 0.8636 - val_loss: 0.3788 - val_acc: 0.8256



imdb_cnn_lstm.py

Train...
Train on 20000 samples, validate on 5000 samples
Epoch 1/2
20000/20000 [==============================] - 575s - loss: 0.4312 - acc: 0.7900 - val_loss: 0.3457 - val_acc: 0.8456
Epoch 2/2
20000/20000 [==============================] - 580s - loss: 0.2302 - acc: 0.9094 - val_loss: 0.3546 - val_acc: 0.8498
5000/5000 [==============================] - 26s
Test score: 0.354624111649
Test accuracy: 0.8498
2015-10-24 01:49:27 +05:30
farizrahman4u 73c0045815 Bug fix
The base Layer was not returning an output shape.
2015-10-24 01:08:40 +05:30
farizrahman4u 3981a87b31 Assert input shapes of Merge layer are equal
Make sure that only layers with same output_shape are merged using sum, ave and mul modes.
2015-10-23 20:52:43 +05:30
Mikael Rousson a78c43033e added possibility to freeze layers during training 2015-10-23 13:53:26 +02:00
François Chollet 80e85836c1 Merge pull request #876 from farizrahman4u/patch-2
Update comment to include new join merge mode
2015-10-22 15:35:04 -07:00
farizrahman4u 569bb74a08 Update comment to include new join merge mode 2015-10-23 00:04:22 +05:30
François Chollet a8eda69f3e Merge pull request #869 from EderSantana/join
add join to valid merge modes
2015-10-22 09:16:16 -07:00
EderSantana 6f620712b5 add join to valid merge modes 2015-10-21 21:41:34 -04:00
François Chollet a19c9ecfbd Merge pull request #865 from suixudongi8/patch-1
Update optimizers.py
2015-10-21 18:33:14 -07:00
Xsh Disney 11e4c4b90f Update optimizers.py
Bug:
float() argument must be a string or a number, not 'TensorSharedVariable'
fixed!
2015-10-21 21:30:44 +08:00
François Chollet 025cd16854 Merge pull request #859 from r9y9/patch-1
Fix `Exception: Invalid layer: LRN2D`
2015-10-20 10:24:48 -07:00
Ryuichi YAMAMOTO 68f619b9f9 import LRN2D
This should fix the problem `Exception: Invalid layer: LRN2D` while loading a model that includes LRN2D.

```py
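from keras.models import Sequential, model_from_yaml
from keras.layers.convolutional import Convolution2D
from keras.layers.normalization import LRN2D

model = Sequential()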
model.add(Convolution2D(30, 3, 3, input_shape=(1, 28, 28)))
model.add(LRN2D())

model_def = model.to_yaml()

# this line raises Exception: Invalid layer: LRN2D
model_from_yaml(model_def)
```

The code above could reproduce the problem.
2015-10-20 19:28:41 +09:00
Francois Chollet f1ef9895f5 Update README w/ warning about using latest Theano 2015-10-14 11:26:23 -07:00
Francois Chollet a86b381424 Update LICENSE with copyright information. 2015-10-14 09:02:22 -07:00
Francois Chollet 5691912701 Update RNN docs 2015-10-13 17:56:06 -07:00
François Chollet cc26bc24a9 Merge pull request #826 from jfsantos/patch-5
Fix #821
2015-10-13 11:41:15 -07:00
João Felipe Santos f62c03bdea Fix #821
`decay` in `SGD` is not a shared value, so it does not have a `get_value` method.
2015-10-13 14:19:50 -04:00
François Chollet 40a9bd7c2f Merge pull request #819 from xingdi-eric-yuan/master
Bug fixes:
2015-10-12 20:15:13 -07:00
Xingdi (Eric) Yuan 38a3228999 Bug fixes:
“TypeError: Cannot cast ufunc subtract output from dtype('float64') to
dtype('uint8') with casting rule 'same_kind'” in
keras/preprocessing/image.py, line 239, when using data augmentation.
2015-10-12 15:09:25 +08:00
Francois Chollet d2189fef32 Update version number: now 0.2.0 2015-10-10 17:51:59 -07:00
François Chollet 0a35173b33 Merge pull request #814 from matsuyamax/shapeinfer
Fix variable sharing issue with NonNeg constraint.
2015-10-10 17:17:41 -07:00
Makoto Matsuyama 5d2f3101ae Fix variable sharing issue with NonNeg constraint. 2015-10-10 16:50:33 -07:00
François Chollet 73aac1c7c9 Merge pull request #813 from hedeon/keras_hq
Fix: added "Permute" into imports of line 8: keras/keras/utils/layer_utils.py
2015-10-10 13:09:37 -07:00
Haizhou Qu 815f7064a2 Fix: added "Permute" into imports 2015-10-10 20:50:26 +01:00
François Chollet 63f81cafbd Merge pull request #808 from transcranial/typo-fixes
fixes for new API: Embedding layer, additional examples
2015-10-09 22:47:43 -07:00
Leon Chen 4c1a6fc27e fix Embedding config property to include new input_length property, add input_length to all examples with Embedding layer 2015-10-09 16:35:48 -04:00
Leon Chen 9dbf04b699 poolsize -> pool_size 2015-10-09 16:32:34 -04:00
François Chollet 856a99de6c Merge pull request #804 from transcranial/conv2d-bugfix
correct image_shape and filter_shape parameters in Convolution2D
2015-10-08 16:11:44 -07:00
Leon Chen f463d23b38 correct image_shape and filter_shape parameters in Convolution2D 2015-10-08 16:54:15 -04:00
François Chollet 775719983f Merge pull request #791 from matsuyamax/shapeinfer
Add automatic shape inference.
2015-10-06 20:26:20 -07:00
Makoto Matsuyama e839a4bdac Py3 compatibility 2015-10-05 17:42:35 -07:00
Makoto Matsuyama cfd2763514 Simplify test_tasks 2015-10-05 17:04:50 -07:00
Makoto Matsuyama 0b8a52e463 Update documentation to new API. 2015-10-05 16:28:17 -07:00
Makoto Matsuyama cb77f7d7e2 Incorporate image_shape and filter_shape in convs 2015-10-05 13:02:31 -07:00
Makoto Matsuyama e8e56d9013 Add shape inference to Graph containers 2015-10-05 12:01:24 -07:00
Makoto Matsuyama c60e2dfbdb Update (most) automated tests. 2015-10-05 07:09:44 -07:00
François Chollet 9807dcd69b Merge pull request #775 from stephenbalaban/feature/customlayers
First crack at threading `custom_layers` through.
2015-10-05 05:55:27 -07:00
Makoto Matsuyama 2bd4c295d6 Update all examples with new API 2015-10-04 18:44:49 -07:00
Makoto Matsuyama 35d66d672b Add shape inference to existing layers 2015-10-04 16:45:01 -07:00
Stephen A. Balaban 11eaaeb695 Rm custom_layers argument from get_from_module
Removed trailing whitespace in generic_utils.
Shoved variables from `custom_layers` into globals().
2015-10-04 15:57:43 -07:00
Stephen A. Balaban cc1251b307 Merge branch 'master' of github.com:fchollet/keras 2015-10-04 15:44:36 -07:00
Makoto Matsuyama 4564dab62a Allow any layer to accept a 'input_shape' kwarg. 2015-10-04 14:26:12 -07:00
Makoto Matsuyama 0e62ae4eaa fix merge conflict 2015-10-04 13:41:45 -07:00
François Chollet 876bca046f Merge pull request #780 from matsuyamax/master
Fix compatibility with old MaxPooling interface
2015-10-04 13:00:01 -07:00
Makoto Matsuyama d7e0ba1c39 Fix 2015-10-04 12:49:50 -07:00
Makoto Matsuyama e0bcee4963 Revert default stride in MaxPooling to None 2015-10-04 12:28:22 -07:00
Makoto Matsuyama ca4fc2e72f Fix compatibility with old MaxPooling interface 2015-10-04 12:26:48 -07:00
François Chollet 83544cdb41 Merge pull request #777 from matsuyamax/shapeinfer
Add .output_shape attribute in all layers (+tests)
2015-10-04 10:46:56 -07:00
Makoto Matsuyama 37978fcda6 fix merge conflict 2015-10-04 06:57:52 -07:00
Makoto Matsuyama 61d76d4a07 fix merge conflict 2015-10-04 06:39:40 -07:00
Makoto Matsuyama cc6280f34d Fix tests 2015-10-04 06:30:45 -07:00
François Chollet 5bab11eec7 Merge pull request #778 from transcranial/conv1d-cudnn
Use cuDNN in Convolution1D layer if available
2015-10-04 06:29:44 -07:00
Leon Chen 65b048455b use cuDNN in Convolution1D layer if available 2015-10-04 01:52:39 -04:00
Makoto Matsuyama 9be4480eab Add ZeroPadding1D, refactor ZeroPadding2D 2015-10-03 22:16:14 -07:00
Makoto Matsuyama 7219bb4b96 Change API of Reshape layer 2015-10-03 22:15:53 -07:00
Makoto Matsuyama fd2c6dbafd in MaxPooling layers: poolsize -> pool_size 2015-10-03 21:52:35 -07:00
Makoto Matsuyama 19c736a4ca Remove print statement, py3 compatibility. 2015-10-03 19:54:46 -07:00
François Chollet b9bf954f24 Merge pull request #732 from matsuyamax/master
Update relu to use Theano's implementation
2015-10-03 19:27:18 -07:00
Makoto Matsuyama 0bc7b25f59 Upgrade Theano in Travis config 2015-10-03 18:20:00 -07:00
Makoto Matsuyama 9f47903daf Merge remote-tracking branch 'upstream/master' 2015-10-03 18:18:18 -07:00
Makoto Matsuyama 7f1eb97000 Remove whitespace, useless comment 2015-10-03 18:15:46 -07:00
Makoto Matsuyama c506fbda4a Add .output_shape attribute in all layers (+tests) 2015-10-03 17:08:28 -07:00
Stephen A. Balaban aa05c44145 Merge branch 'master' of github.com:fchollet/keras 2015-10-03 12:58:09 -07:00
Stephen A. Balaban 2e0d96d1a2 Merge branch 'bugfix/reshape' 2015-10-03 12:56:26 -07:00
François Chollet 6b62678e90 Merge pull request #619 from amitbeka/non-overwrite-checkpoint
support for multiple files in ModelCheckpoint
2015-10-03 12:10:55 -07:00
François Chollet cc8a901c31 Merge pull request #706 from nehz/nehz-subsample
Should subsample Convolution1D on correct axis
2015-10-03 12:07:06 -07:00
François Chollet 7c44d16a77 Merge pull request #771 from stephenbalaban/bugfix/reshape
Bugfix/reshape
2015-10-03 10:10:03 -07:00
Stephen A. Balaban ee07e6ef74 Added kwargs to Reshape. 2015-10-02 17:00:23 -07:00
Stephen A. Balaban 88a0ab5e93 First crack at threading custom_layers through.
A bit surprised that keras was using globals() to access layers (doesn't work
across modules.) Hacky solution was to pass a dict mapping name -> class.
I called this dict `custom_layers`.

Is there a better way of doing this that I'm not seeing?
2015-10-02 13:31:19 -07:00
Stephen A. Balaban 57bb9e2613 Added **kwargs to to_json and to_yaml.
This allows you to do nice things like save JSON models so that they're human
readable & editable. For example:

>>> with open('output.json', 'w') as f:
...     f.write(model.to_json(indent=4, sort_keys=True))
...
2015-10-02 10:59:13 -07:00
François Chollet c1857cfa66 Merge pull request #757 from matsuyamax/cnn_fix
Fix Reshape and Permute deserialization
2015-09-30 22:18:31 -07:00
Makoto Matsuyama 2d8307622d Fix Reshape and Permute deserialization 2015-09-30 21:16:59 -07:00
François Chollet af932d3480 Merge pull request #752 from jfsantos/patch-4
Fix typo in docstring
2015-09-30 09:07:18 -07:00
João Felipe Santos 4ed53ae5a4 Fix typo in docstring
longuest -> longest
2015-09-30 11:47:38 -04:00
François Chollet 0d798c662b Merge pull request #749 from nebw/fix-sample-weight-doc
add class_weight/sample_weight parameters to doc #736
2015-09-29 20:56:29 -07:00
Benjamin Wild 5f4675bd7f add class_weight/sample_weight parameters to doc #736 2015-09-29 16:51:32 +02:00
François Chollet 14b175c9b0 Merge pull request #745 from blackyang/doc_sample_weight
change sample_weight doc
2015-09-28 22:19:04 -07:00
Xiao Yang 8d4e75894a change sample_weight doc 2015-09-29 00:53:13 -04:00
François Chollet 3d888cbf7e Merge pull request #739 from matsuyamax/cnn_fix
Fix bug in deserialization of convolutional layers
2015-09-28 08:22:14 -07:00
Makoto Matsuyama 3b76158c49 Fix bug in deserialization of convolutional layers 2015-09-27 21:49:20 -07:00
François Chollet 788d838160 Merge pull request #738 from matsuyamax/graph_fix
Fix bug with Graph sample_weights
2015-09-27 21:39:16 -07:00
Makoto Matsuyama 56ae624f12 Fix bug with Graph sample_weights 2015-09-27 20:50:32 -07:00
François Chollet ef43a271ee Merge pull request #714 from eulerreich/patch-1
fixed incorrect comment
2015-09-27 10:48:18 -07:00
François Chollet 0b1a1e9761 Merge pull request #734 from EderSantana/master
Fix order of signs of clipvalue
2015-09-26 14:48:49 -07:00
EderSantana 52e3e2623a Merge branch 'master' of https://github.com/fchollet/keras 2015-09-26 17:43:26 -04:00
EderSantana 46a2fb6fd8 Fix sign order for clipvalue 2015-09-26 17:42:56 -04:00
Makoto Matsuyama b0f2446370 Fix relu 2015-09-25 23:55:25 -07:00
Makoto Matsuyama 7a2e8ce8a2 Update relu to use Theano's implementation 2015-09-25 23:21:35 -07:00
François Chollet 200948c3be Merge pull request #730 from eulerreich/patch-2
minor typo
2015-09-25 20:45:32 -07:00
François Chollet 35612d698a Merge pull request #731 from eulerreich/patch-3
minor typos
2015-09-25 20:45:15 -07:00
eulerreich 8a5767a53e minor typos 2015-09-25 22:18:53 -05:00
eulerreich f4ca4026a3 minor typo 2015-09-25 22:11:57 -05:00
François Chollet e4d0ed5992 Merge pull request #719 from farizrahman4u/patch-1
Update skipgram_word_embeddings.py
2015-09-25 19:52:26 -07:00
François Chollet 1325e73a59 Merge pull request #729 from EderSantana/master
Clip value as in Neural Turing Machines paper
2015-09-25 19:49:23 -07:00
EderSantana b6d8e9dd4e Fix clip value logic 2015-09-25 22:12:15 -04:00
EderSantana 69afdd7ec4 Add clip value as in Neural Turing Machines
Instead of norm clipping they do an elementwise clip. I believe others may want
to try that out too.
2015-09-25 22:10:27 -04:00
farizrahman4u d5cd2687ed Update skipgram_word_embeddings.py
Redundant code line 159 and 161
2015-09-25 11:14:53 +05:30
François Chollet ca60201fe5 Merge pull request #690 from EderSantana/master
Add merge_mode join
2015-09-24 21:16:22 -07:00
EderSantana dd6697738b Raise error if using merge_mode= with unnamed input 2015-09-24 18:14:04 -04:00
EderSantana cccc118225 Raise error if using merge_mode= with unnamed input 2015-09-24 18:12:21 -04:00
eulerreich 36578f8569 fixed incorrect comment 2015-09-23 12:30:48 -05:00
François Chollet c18a9cd405 Merge pull request #684 from jmhessel/mergefixes
Added averaging support in merge and a TimeDistributedMerge layer
2015-09-22 20:12:26 -07:00
Jack Hessel cba5cfa597 Added a very quick config unit test 2015-09-22 16:50:02 -04:00
EderSantana b2048d1d88 Merge branch 'master' of https://github.com/fchollet/keras 2015-09-22 11:43:42 -04:00
EderSantana 8bfafd6d7f Merge join returns OrderedDict instead of list
This makes merge_mode='join' compliant with the keras API. Also, the OrderedDict
allows the user to simply call .values() and use it as a list if they know in which
order the inputs were merged.
2015-09-22 11:37:49 -04:00
Zhen Wang a6521de3e3 Should subsample Convolution1D on correct axis 2015-09-21 11:59:40 +08:00
Zhen Wang 02ddc11858 Merge pull request #1 from fchollet/master
Update
2015-09-21 11:58:23 +08:00
François Chollet 588261acfc Merge pull request #704 from rodrigob/patch-4
Add a bit of flexibility in Progbar.update
2015-09-20 12:33:31 -07:00
François Chollet 61a48d487f Merge pull request #696 from rodrigob/patch-3
"Epoch %d out of %d"
2015-09-20 12:26:49 -07:00
Rodrigo Benenson eee20b4614 Update callbacks.py
fixed +1
2015-09-20 21:23:43 +02:00
Rodrigo Benenson 9827db2c85 Update callbacks.py
following suggestions
2015-09-20 21:18:42 +02:00
Rodrigo Benenson b9403cb262 Add a bit of flexibility in Progbar.update
By allowing sum_values[k] to be something other than a list, it becomes easier for child classes to print "any value" (in my case, a timedelta object).
2015-09-20 15:26:27 +02:00
François Chollet e379fff425 Merge pull request #697 from ndronen/count-params
Parameter counting method for models.
2015-09-18 07:45:44 -07:00
Nicholas Dronen 80c0c762fd Add count_params method to keras.layers.core.Layer and the Sequential and Graph container classes. 2015-09-17 09:19:21 -06:00
Rodrigo Benenson 51818e5b7b "Epoch %d out of %d"
Print "Epoch %d out of %d" instead of just "Epoch %d"
2015-09-17 15:39:56 +02:00
François Chollet 393642df55 Merge pull request #691 from grahamannett/master
added visualization tools to view Sequential and Graph models
2015-09-16 20:52:41 -07:00
graham 6bb9eecd0c added functioning visualization 2015-09-16 00:58:44 -07:00
graham f026bb2f5a added functioning visualization 2015-09-16 00:10:38 -07:00
EderSantana 5c3db2fea6 Add merge_mode join 2015-09-15 21:52:54 -04:00
Jack Hessel 1a953feaf7 added averaging support in merge and a TimeDistributedMerge layer 2015-09-14 13:27:59 -04:00
François Chollet 0733a80297 Merge pull request #677 from jnphilipp/master
Fixed Python 3 Image loading. Closed #676
2015-09-11 19:23:45 -07:00
jnphilipp a5653c245a Fixed Python 3 Image loading. Closed #676 2015-09-11 22:39:07 +02:00
François Chollet 1724fe5882 Merge pull request #662 from gw0/feat-optional-h5py
Remove h5py requirement and make it optional.
2015-09-09 07:58:50 -07:00
gw0 [http://gw.tnode.com/] a582b184c9 Remove h5py requirement and make it optional. 2015-09-08 17:56:44 +02:00
François Chollet 36ef1ca7b4 Merge pull request #661 from jfsantos/patch-3
Updated documentation of Merge layer
2015-09-08 08:42:38 -07:00
João Felipe Santos 27edefe48c Updated documentation of Merge layer
Added 'mul' mode documentation to Merge.
2015-09-08 10:50:11 -04:00
François Chollet 4b1b86783f Merge pull request #659 from amitbeka/support-saving-masking
add Masking layer to utils.layer_utils for saving/loading
2015-09-08 07:46:45 -07:00
François Chollet 7009e80b74 Merge pull request #660 from amitbeka/fix-saving-sgd
fix SGD get_config
2015-09-08 07:46:22 -07:00
Amit Beka cd82deb152 fix SGD get_config
Signed-off-by: Amit Beka <amit.beka@gmail.com>
2015-09-08 17:04:24 +03:00
Amit Beka 65b794957f add Masking layer to utils.layer_utils for saving/loading
Signed-off-by: Amit Beka <amit.beka@gmail.com>
2015-09-08 16:39:06 +03:00
Francois Chollet 7b4e6ef50c Fix typo in FAQ 2015-09-07 20:50:40 -07:00
Francois Chollet f804b19fdc Fix typos in FAQ 2015-09-07 20:48:37 -07:00
Francois Chollet eff8731db4 Fixes in doc FAQ 2015-09-07 18:20:39 -07:00
Francois Chollet 43ddbf4a4f Add Keras FAQ 2015-09-07 17:11:38 -07:00
Francois Chollet c5b3959b42 Fix test_tasks 2015-09-07 15:36:38 -07:00
Francois Chollet 289804c67c Fix theano.tensor.signal import issue 2015-09-07 15:16:06 -07:00
Francois Chollet c6825eb343 Style fixes 2015-09-07 15:06:37 -07:00
Francois Chollet 92b8ad9d02 Merge branch 'master' of https://github.com/Mofef/keras into Mofef-master 2015-09-07 15:02:30 -07:00
Francois Chollet 3dfba0504b Merge branch 'master' of https://github.com/fchollet/keras 2015-09-07 13:59:56 -07:00
François Chollet 4bdb43f244 Merge pull request #639 from rodrigob/patch-1
Added reference for orthogonal initialization
2015-09-07 13:12:46 -07:00
Francois Chollet 83e285fd00 Add on_gpu() check 2015-09-07 13:05:45 -07:00
Francois Chollet 4e1ec93c2f Fix weight saving in BatchNormalization 2015-09-07 12:50:34 -07:00
François Chollet 2224c4cc1e Merge pull request #654 from phreeza/travis-coveralls-support
add travis CI and coveralls support
2015-09-07 10:40:55 -07:00
François Chollet 9f6f206ccd Merge pull request #647 from phreeza/test_convolutional
Add tests for convolutional layers
2015-09-07 10:40:35 -07:00
Francois Chollet f3eeb982d0 Avoid dnn import when not running on GPU 2015-09-07 10:32:13 -07:00
Moritz Münst 2be651dc39 concatenation axis as param for Merge() and Graph.add_output/node() 2015-09-07 19:36:51 +03:00
Thomas McColgan c77ded2eb6 add travis CI and coveralls support 2015-09-07 16:28:03 +02:00
François Chollet 06ab8dbd34 Merge pull request #650 from rodrigob/patch-2
Fix cPickle import for python3 support
2015-09-06 17:27:33 -07:00
François Chollet 8e293db9b5 Merge pull request #644 from anjishnu/issue_643
Fixed import errors with six.moves.cPickle and model.train typo in th…
2015-09-06 12:42:11 -07:00
Rodrigo Benenson 5040aa386d Fix cPickle import for python3 support 2015-09-06 15:58:30 +02:00
Thomas McColgan 8e67b040e8 add tests for border_mode == same 2015-09-06 13:03:18 +02:00
Thomas McColgan 84909a49c2 add upsampling layer tests 2015-09-06 12:55:50 +02:00
Thomas McColgan 0969c569a6 add convolutional layer tests 2015-09-06 12:17:24 +02:00
Francois Chollet cb8f0a83e6 Merge branch 'jfsantos-merge_mul' 2015-09-05 17:49:55 -07:00
Francois Chollet 5648119b66 Remove outdated Merge exception 2015-09-05 17:49:27 -07:00
Francois Chollet 25e9b90550 Merge branch 'merge_mul' of https://github.com/jfsantos/keras into jfsantos-merge_mul 2015-09-05 17:45:13 -07:00
Anjishnu Kumar e98b24a767 changed 'fit' to 'train_on_batch' 2015-09-05 14:19:40 -07:00
Anjishnu Kumar 034822359d Fixed import errors with six.moves.cPickle and model.train typo in the skipgram embeddings example 2015-09-05 13:36:52 -07:00
François Chollet 2e60c99924 Merge pull request #642 from wuaalb/lr-scheduler
Fix typo LearningRateScheduler
2015-09-05 04:38:26 -07:00
wuaalb 4bb6ac0b04 Fix typo LearningRateScheduler 2015-09-05 11:59:29 +02:00
François Chollet c368b86d11 Merge pull request #640 from Smerity/master
Removing magic numbers from MNIST and CIFAR10
2015-09-04 17:30:57 -07:00
Stephen Merity 49335d4345 Remove magic numbers from cifar10_cnn.py (fixes #469) 2015-09-04 16:34:00 -07:00
Stephen Merity 93c1a8c675 Remove magic numbers from mnist_cnn.py (re: #469) 2015-09-04 16:24:47 -07:00
Rodrigo Benenson 5f3bdeb0a3 Added reference for orthogonal initialization 2015-09-05 00:54:24 +02:00
François Chollet ddf908359c Merge pull request #637 from jnphilipp/master
Fix for issue #636
2015-09-04 10:28:41 -07:00
jnphilipp 37f4d11ea9 Merge branch 'master' of github.com:jnphilipp/keras 2015-09-04 13:49:52 +02:00
jnphilipp 94fbbd1c7e Fixed missing import. Closed #636 2015-09-04 13:44:26 +02:00
Francois Chollet 332d43e023 Make Pmat a param of JSZ1-2 2015-09-02 20:18:28 -07:00
Francois Chollet f84fe7ce17 Change cPickle import pattern in datasets 2015-09-02 20:15:14 -07:00
Joao Felipe Santos 16d0e40560 Updated 'mul' mode to support multiple layers 2015-08-31 21:17:09 -04:00
Amit Beka da24be79ab support for multiple files in ModelCheckpoint
enable string formatted filenames (e.g. weights.{epoch:02d}.hdf5), so
every epoch will be saved to a different file without overwriting.

Signed-off-by: Amit Beka <amit.beka@gmail.com>
2015-08-31 11:25:06 +03:00
Joao Felipe Santos ab8642e0ff Added element-wise multiplication as merge mode 2015-08-29 13:24:54 -04:00
fchollet 2c30d503ea Fix sample weights generation for validation data 2015-08-27 15:52:10 -07:00
fchollet 7a86ff7f5b Fixes in loss weighting with validation data 2015-08-27 15:38:26 -07:00
François Chollet 1eb2e6e3f2 Merge pull request #606 from entron/fix_multi_gpu
Multiple GPU works
2015-08-27 13:54:00 -07:00
Cheng Guo a9d437198e Multiple GPU works 2015-08-27 16:25:10 +02:00
fchollet 3c4f0ac609 Revert "Fix sample_weight and class_weight in validation"
This reverts commit 9773e810a5.
2015-08-26 16:50:24 -07:00
fchollet 9773e810a5 Fix sample_weight and class_weight in validation 2015-08-26 15:37:06 -07:00
fchollet 484039d2ce Merge branch 'conv_same_with_cudnn' of https://github.com/entron/keras into entron-conv_same_with_cudnn 2015-08-25 11:58:27 -07:00
Cheng Guo 5a3d2fe204 added check of running device 2015-08-25 10:42:42 +02:00
François Chollet 14e4a2391a Merge pull request #381 from osh/thresh_activ
adding thresholded linear and rectified activation functions
2015-08-24 23:17:07 -07:00
fchollet a701435049 Merge branch 'time-distributed-class-weights' 2015-08-24 15:08:59 -07:00
fchollet 24bf848851 Merge branch 'time-distributed-class-weights' of https://github.com/wxs/keras into time-distributed-class-weights 2015-08-24 15:00:52 -07:00
fchollet bad60eedda Merge branch 'wxs-fix-weight-mask-interaction' 2015-08-24 13:41:42 -07:00
fchollet 21f0bfa239 Fix loss masking test 2015-08-24 13:41:05 -07:00
fchollet 6ef6128bb1 Merge branch 'fix-weight-mask-interaction' of https://github.com/wxs/keras into wxs-fix-weight-mask-interaction 2015-08-24 13:23:15 -07:00
Xavier Snelgrove 9e001ee70c Make the logic more understandable via DRY 2015-08-24 15:27:52 -04:00
Xavier Snelgrove fc6c42e3df I believe this is the correct combination of masks and weights 2015-08-24 15:25:14 -04:00
Xavier Snelgrove 1ee8efde56 Remove a no-op reshape
This reshape happens implicitly via the nonzero() call now.
2015-08-24 15:20:10 -04:00
Xavier Snelgrove 25508e0771 Fix interaction issues with mask and weights in weighted_objective
We used nonzero() on the weights in order to ensure that if there
happened to be a NaN or an Inf in the output that was going to be masked
out by the weights anyway, it wouldn't propagate (because 0*inf = NaN).
However, this was causing interaction issues if you also used a mask,
because that wasn't using nonzero() properly.

This fixes that, and also fixes what I believe was an issue where I was
calling mean() instead of dividing by the sum of the sample weights.
2015-08-24 15:13:24 -04:00
fchollet 34999c8658 Fix Poisson loss when target = 0 2015-08-24 11:10:27 -07:00
fchollet f66b58bb6c Merge branch 'master' of https://github.com/fchollet/keras 2015-08-24 06:06:34 -07:00
fchollet 6cd8d3c37a Fix optimizer serialization 2015-08-24 06:06:03 -07:00
François Chollet 0daf02a96d Merge pull request #586 from jerheff/patch-1
Fix documented form of parametric softplus
2015-08-23 18:42:23 -07:00
Jeremy Heffner b498218d0b Fix documented form of parametric softplus 2015-08-23 17:45:50 -04:00
fchollet aa21a15bd3 Merge branch 'EderSantana-master' 2015-08-21 14:52:30 -07:00
fchollet 93624e10e8 Fix graph container 2015-08-21 14:52:03 -07:00
fchollet d9579e1c08 Learning rate scheduling touch-ups 2015-08-21 14:51:24 -07:00
fchollet 5af36525c5 Merge branch 'master' of https://github.com/EderSantana/keras into EderSantana-master 2015-08-21 14:46:10 -07:00
fchollet eb11ad776e Merge branch 'master' of https://github.com/fchollet/keras 2015-08-21 14:24:25 -07:00
fchollet ff197781b3 Multi-io get_input / get_output in graph container 2015-08-21 14:24:13 -07:00
fchollet af84a6879d Remove irrelevant code in convolutional 2015-08-21 14:23:17 -07:00
EderSantana f4a2323d5e - pep8 + better callback name 2015-08-21 14:56:26 -04:00
EderSantana 6a5a317848 I mean on_epoch_begin 2015-08-20 17:43:03 -04:00
EderSantana cdbbdce934 Make lr and momentum shared_scalars
With lr and momentum being scalars we can change their values without
needing to recompile the model. This PR also includes a Callback called
LrSetter that gets a dict with epoch x lr pairs and sets the values of
the latter at the beginning of the associated epoch.
2015-08-20 17:42:15 -04:00
François Chollet effe128bde Merge pull request #561 from JasonTam/sklearn-init-params
Moved fit/predict params to init. Changed test accordingly.
2015-08-20 14:13:46 -07:00
JasonTam dcbe14cd9a Moved fit/predict params to init. Changed test accordingly. This addresses #558. 2015-08-19 15:35:28 -04:00
François Chollet 103a3da614 Merge pull request #559 from danielforsyth/docs
Add SciPy Link to Conv Layer Docs
2015-08-19 09:46:19 -07:00
forsythd ec450f429e Add SciPy Link to Conv Layer Docs 2015-08-19 12:37:07 -04:00
Cheng Guo a3f3a020a2 added cudnn availability check 2015-08-19 11:10:11 +02:00
François Chollet b958978dde Merge pull request #555 from jdwittenauer/master
Finished scikit-learn wrapper
2015-08-18 20:15:55 -07:00
fchollet a478930d25 Add weight loading to PReLU, ParametricSoftPlus 2015-08-18 18:19:58 -07:00
John Wittenauer 4afb5b60d6 Resolved a few minor issues found during testing. 2015-08-18 21:05:45 -04:00
fchollet 78feed7fa9 Fix autoencoder serialization 2015-08-18 16:49:31 -07:00
François Chollet 5e9579aeac Merge pull request #547 from awentzonline/conn-map-dict
Fix Graph.set_previous with dict connection_map
2015-08-18 07:09:15 -07:00
Adam Wentz c515dc90d4 Fix Graph.set_previous with dict connection_map 2015-08-17 20:41:20 -05:00
John Wittenauer 7c966439fa Updated wrapper test script. 2015-08-17 21:03:00 -04:00
fchollet 818f5d7dc4 Merge branch 'master' of https://github.com/fchollet/keras 2015-08-17 17:57:48 -07:00
fchollet 23b1d7929f Merge branch 'Smerity-master' 2015-08-17 17:57:41 -07:00
fchollet d5455154f2 Touch-ups to addition RNN example 2015-08-17 17:57:20 -07:00
François Chollet d473216a8b Merge pull request #546 from awentzonline/typo
Fixed typo in containers.Graph
2015-08-17 17:22:19 -07:00
Adam Wentz ca0ebcc627 Fixed typo in containers.Graph 2015-08-17 19:01:42 -05:00
Stephen Merity 588ce7a7e2 Example: Sequence to sequence learning for addition using RNNs 2015-08-17 04:42:54 -07:00
fchollet 97174dd298 Fix batch normalization as first layer 2015-08-16 18:09:48 -07:00
fchollet 3dd6dbe5d4 Merge branch 'master' of https://github.com/fchollet/keras 2015-08-16 17:53:53 -07:00
fchollet 4de2b58842 Fix graph model 2015-08-16 17:53:37 -07:00
François Chollet 88ba02ae32 Merge pull request #538 from jfsantos/patch-2
Recent update broke HDF5Matrix
2015-08-16 17:02:40 -07:00
João Felipe Santos af6b50b64c Recent update broke HDF5Matrix
`refs` is a class attribute, not an instance attribute. If you make `refs` an instance attribute, this will cause `HDF5Matrix` to open the same HDF5 file more than once (which should never happen).
2015-08-16 16:51:20 -04:00
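Note on af6b50b64c: the class-attribute versus instance-attribute distinction described in the message above can be sketched in plain Python. `CachedMatrix` and its `opener` argument are hypothetical names used for illustration only; this is not the actual `HDF5Matrix` code.

```python
class CachedMatrix:
    # Class attribute: a single dictionary shared by every instance,
    # so each path is opened at most once per process.
    refs = {}

    def __init__(self, path, opener):
        # `opener` is a hypothetical stand-in for whatever opens the file.
        if path not in CachedMatrix.refs:
            CachedMatrix.refs[path] = opener(path)
        self.handle = CachedMatrix.refs[path]
```

If `refs` were assigned inside `__init__` as `self.refs = {}`, every instance would start with an empty cache and reopen the same file.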
fchollet a02ccc2a78 Change API of ZeroPadding2D 2015-08-16 12:02:07 -07:00
fchollet f494757860 Merge branch 'anayebi-master' 2015-08-16 10:39:58 -07:00
fchollet 965a8cae03 Convolution layers: style cleanup, code re-org 2015-08-16 10:39:27 -07:00
fchollet c61616abf4 Merge branch 'master' of https://github.com/anayebi/keras into anayebi-master 2015-08-16 10:29:20 -07:00
fchollet ca758bef0b Fix RepeatVector in captioning example 2015-08-16 10:27:13 -07:00
anayebi 024a88f986 Removed negative axis indices from UpSample2D 2015-08-16 09:23:13 -07:00
fchollet a1161885d5 Fix text preprocessing 2015-08-17 00:28:46 +09:00
François Chollet 9c58adfe4b Merge pull request #515 from bshickel/patch-1
Fixes IndexError when converting sequences to matrix with nb_words = None
2015-08-16 08:22:11 -07:00
John Wittenauer c9461d7148 Added regressor class. 2015-08-15 20:59:06 -04:00
John Wittenauer dbe948ec97 Added base class for classifier to inherit from. 2015-08-15 20:46:41 -04:00
fchollet f7d4a1e443 Merge branch 'master' of https://github.com/fchollet/keras 2015-08-16 05:29:06 +09:00
fchollet 7115efc37b Remove optional loss masking (now automatic) 2015-08-16 05:28:33 +09:00
François Chollet 2079807ac2 Merge pull request #528 from stephenroller/ndim_tensor
ndim_tensor should support one dimensional tensors.
2015-08-16 05:21:30 +09:00
François Chollet aa2f866083 Merge pull request #533 from lukedeo/fix
fix: json dump to string
2015-08-16 05:21:18 +09:00
lukedeo 9b4fe767ae fix: json dump to string 2015-08-15 13:10:24 +02:00
anayebi fe45d2f002 Added UpSample1D and UpSample2D 2015-08-14 17:06:14 -07:00
EderSantana 9d76926eba Added cost masking to Graph model 2015-08-14 12:29:19 -04:00
Stephen Roller dec67bdb4e ndim_tensor should support one dimensional tensors. 2015-08-14 08:00:56 -05:00
fchollet a6aa7940bf Fix history in graph model. 2015-08-13 23:33:39 +09:00
Cheng Guo ab75f215b6 Use cudnn in Convolution2D to speedup 2015-08-13 11:45:28 +02:00
bshickel 3bc4b5249a Fixes IndexError when converting sequence to matrix
Calling sequences_to_matrix results in an IndexError when nb_words = None. This is caused by a 1-indexed word_index, since sequences_to_matrix expects 0-indexing. Converts word_index to 0-based indexing.
2015-08-10 13:46:39 -04:00
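Note on 3bc4b5249a: the off-by-one described above can be reproduced with a small stand-in. `to_binary_matrix` is a hypothetical helper, not the real `sequences_to_matrix`; it only assumes that word indices are used directly as column positions.

```python
def to_binary_matrix(sequences, vocab_size):
    # One row per sequence, one column per word; indices are used as-is.
    matrix = [[0] * vocab_size for _ in sequences]
    for row, seq in enumerate(sequences):
        for index in seq:
            matrix[row][index] = 1
    return matrix


word_index = {"the": 1, "cat": 2, "sat": 3}  # 1-based, as in the commit message
sequences = [[1, 3], [2]]

# Shifting to 0-based indices keeps every index inside range(len(word_index)).
zero_based = [[i - 1 for i in seq] for seq in sequences]
```

With the 1-based sequences and a width of `len(word_index)`, the largest index falls one past the last column, which is where the IndexError comes from.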
fchollet ae4219e6b1 Fix loss masking tests 2015-08-10 13:34:40 +09:00
fchollet b3cb7f4ef7 Merge branch 'master' of https://github.com/EderSantana/keras into EderSantana-master 2015-08-10 13:14:11 +09:00
fchollet eac3bf8b58 Extend layer API to multi inputs / outputs 2015-08-10 13:13:29 +09:00
fchollet 425f29038a Add create_output option in Graph model 2015-08-10 12:36:41 +09:00
fchollet 1b66e36e25 Merge branch 'master' of https://github.com/fchollet/keras 2015-08-10 12:19:20 +09:00
fchollet b057624707 Add border_mode = same to Convolution1D 2015-08-10 12:16:52 +09:00
François Chollet 69628bb28d Merge pull request #512 from averybigant/fix_conv_layers_yaml_load
Fix conv layers loading for model_from_config
2015-08-10 11:41:36 +09:00
Renbi.YU ed4acfae40 Fix conv layers loading for model_from_config 2015-08-09 23:49:04 +08:00
fchollet 9e7f67b6f9 Fix typo 2015-08-09 15:29:14 +09:00
fchollet c81d6ec93f Fix batch normalization 2015-08-09 15:26:34 +09:00
fchollet 0eea5055c9 Use readthedocs theme in dev doc version 2015-08-09 12:14:51 +09:00
fchollet e1d8b1ba09 Update MaxPooling1D documentation. 2015-08-09 12:12:50 +09:00
fchollet 616fcbaa20 Merge branch 'nehz-MaxPooling1D-fix' of https://github.com/nehz/keras into nehz-nehz-MaxPooling1D-fix 2015-08-09 12:07:23 +09:00
François Chollet 921cf41d24 Merge pull request #494 from wxs/document-sampleweight
Add documentation for time distributed sample weighting
2015-08-07 15:36:48 +09:00
François Chollet f8afa92bbb Merge pull request #489 from dribnet/url_check
data_utils: add error handling on url fetches
2015-08-07 14:26:09 +09:00
François Chollet 1a572b10e8 Merge pull request #501 from Smerity/master
Fixes and full results for bAbi RNN example
2015-08-07 14:18:40 +09:00
Stephen Merity 342f2bc271 babi_rnn: Adding results for all tasks in bAbi tasks dataset 2015-08-06 21:43:05 -07:00
Stephen Merity dbc0c27729 babi_rnn bugfix: Fixing missing Python 3 support
+ reduce function disappeared (requires import from functools)
+ tarfiles and encodings - decoding bytes to ASCII at line level
2015-08-06 21:36:06 -07:00
Stephen Merity 63284a47ff babi_rnn bugfix: QA19 requires vocab from the answer
For all other questions, the full vocab is in the stories and the queries
2015-08-06 12:58:34 -07:00
Zhen Wang 6d17e9994e Fix MaxPooling1D.
Should downsample on `steps`.
2015-08-07 00:46:31 +08:00
Xavier Snelgrove 1d6b60c1ad Allow class_weight's use with time distributed data
As far as I can tell there is no reason not to support class_weight with
time distributed data, rewriting the standardize_weights function with
that in mind.
2015-08-06 11:22:43 -04:00
Xavier Snelgrove 73fdaf6d6f Add documentation for time distributed sample weighting
There was some confusion around whether you could mask individual
timesteps.
2015-08-06 09:59:35 -04:00
Tom White 6fddf15f1f data_utils: add error handling on url fetches
urlretrieve will blindly swallow any 4xx and 5xx responses
and then save the html error response in the local file. This
is probably exactly what we don't want, because not only will
the program crash if there is a network hiccup when the error
file cannot be opened, but it will continue to do so when rerun
until the corrupt cached file is found and manually removed.

Luckily, urlretrieve is just a thin wrapper around
FancyURLopener, so we can make our own thin wrapper
that throws an exception instead of caching the
wrong file.

Tested to be working as before when running cached and
uncached datasets, and also verified to fail loudly
when asked to fetch http://httpstat.us/500
2015-08-06 01:34:18 -07:00
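Note on 6fddf15f1f: the commit wraps Python 2's `FancyURLopener`. The sketch below shows the same "fail loudly, never cache a bad file" idea with a different mechanism, Python 3's `urllib.request.urlopen`, which raises `HTTPError` on 4xx and 5xx responses. `fetch_to_cache` is a hypothetical name, and the temporary-file step is an assumption of this sketch rather than something the commit describes.

```python
import os
import shutil
import tempfile
import urllib.request


def fetch_to_cache(url, dest):
    """Download url to dest; leave nothing behind if the fetch fails."""
    dest_dir = os.path.dirname(os.path.abspath(dest))
    # Write to a temporary file first so a failed fetch never pollutes the cache.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)
    try:
        with os.fdopen(fd, "wb") as tmp, urllib.request.urlopen(url) as resp:
            shutil.copyfileobj(resp, tmp)
        # Only a complete download is moved into its final location.
        os.replace(tmp_path, dest)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return dest
```

Because the destination path only appears after a complete download, rerunning after a network hiccup retries the fetch instead of reading a corrupt cached file.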
François Chollet e42f738d0d Merge pull request #479 from anayebi/poisson-loss
Added Poisson Loss as an objective
2015-08-05 13:33:04 +09:00
anayebi 149d0e8d18 Added Poisson loss to Objectives 2015-08-04 21:06:51 -07:00
fchollet 37965cae6b Style touch-ups in babi example 2015-08-04 22:49:05 +09:00
Stephen Merity de78ddff9c Example: Use RNNs to answer questions from bAbi 2015-08-04 03:16:26 -07:00
fchollet 4ecb5bdd14 Fix Hualos callback 2015-08-02 16:58:04 +09:00
fchollet 1e3d9f7be1 Fix Hualos callback 2015-08-01 21:36:05 +09:00
François Chollet 54dc64736e Merge pull request #457 from kashif/adam
Updated adam solver to v8 of paper
2015-07-31 23:12:44 +09:00
François Chollet 3bf5340f18 Merge pull request #449 from tkipf/master
Proper handling of output values in Masking layer
2015-07-31 19:14:21 +09:00
fchollet 15a3a1f1ce Randomness seeding, small fixes 2015-07-31 10:38:46 +09:00
Kashif Rasul 9c7c52d908 further optimisation 2015-07-29 14:57:18 +02:00
Kashif Rasul 10b767d17e more efficient implementation as per paper 2015-07-29 12:07:46 +02:00
Kashif Rasul c68aaa21af fixed t variable 2015-07-28 20:36:41 +02:00
Kashif Rasul eeb56b9e22 updated adam solver
Updated adam solver to v8 of paper. The kappa (lambda) parameter has no
practical use and has been removed.

Fixed the calculations for beta_1_t and beta_2_t, which were also wrong.
2015-07-28 20:06:21 +02:00
EderSantana f25f894bd9 Corrections in the cost function masking
We can average the vector and batch dimensions with mean; only the time
dimension needs special love.
2015-07-28 10:01:14 -04:00
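Note on f25f894bd9: one way to read "only the time dimension needs special love" is that padded timesteps are excluded from the average over time, while the batch dimension is averaged normally. The following is a minimal sketch under that assumption; `masked_mean_loss` is a hypothetical helper working on nested lists, not the code from the commit.

```python
def masked_mean_loss(step_losses, mask):
    """Average per-timestep losses over valid steps, then over the batch."""
    per_sequence = []
    for losses, flags in zip(step_losses, mask):
        valid = sum(flags)
        total = sum(loss * flag for loss, flag in zip(losses, flags))
        # Divide by the number of real timesteps, not the padded length.
        per_sequence.append(total / valid if valid else 0.0)
    # The batch dimension is a plain mean.
    return sum(per_sequence) / len(per_sequence)
```

A plain mean over all timesteps would count the padded zeros of short sequences, so their loss would be under-weighted relative to long sequences.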
EderSantana a79343f5b0 Test previous commit 2015-07-27 12:39:48 -04:00
EderSantana 53aaa66994 Added masking to cost function
When dealing with sequences of different lengths, this OPTIONALLY
fixes the cost function's bias toward the longest sequences.
2015-07-27 12:27:13 -04:00
Thomas Kipf cb763aa26b Proper handling of output values in Masking layer 2015-07-27 14:24:00 +02:00
fchollet 48381f8af5 Masking layer touch-ups 2015-07-27 14:39:51 +09:00
fchollet 3212cd4ccd Merge branch 'masking-layer' of https://github.com/amitbeka/keras into amitbeka-masking-layer 2015-07-27 14:26:38 +09:00
fchollet 6a1ede1aa3 Merge branch 'master' of https://github.com/fchollet/keras 2015-07-27 14:26:07 +09:00
François Chollet b1631bd836 Merge pull request #447 from amitbeka/gitignore-tags
git-ignore tags file
2015-07-27 12:52:09 +09:00
Tim O'Shea a06e193679 adding thresholded linear and rectified activation functions 2015-07-26 15:06:42 -04:00
Amit Beka 38f7fb1efb git-ignore tags file
Signed-off-by: Amit Beka <amit.beka@gmail.com>
2015-07-26 13:01:46 +00:00
Amit Beka cb1d25fddb layers.core: add Masking layer
Signed-off-by: Amit Beka <amit.beka@gmail.com>
2015-07-26 12:57:57 +00:00
fchollet b64217cdef Small style fixes 2015-07-26 17:00:18 +09:00
François Chollet 59a634d558 Merge pull request #444 from erfannoury/patch-1
Fix LossHistory callback argument in doc
2015-07-26 16:48:59 +09:00
Erfan Noury dd833de568 Fix LossHistory callback argument in doc 2015-07-26 10:34:07 +04:30
fchollet 275f4166e4 Merge branch 'master' of https://github.com/fchollet/keras 2015-07-25 10:37:17 +09:00
fchollet 8675fc5213 Update core layers documentation 2015-07-25 10:34:51 +09:00
François Chollet 68f0677476 Merge pull request #441 from kenterao/master
Fix optimizer get/set state
2015-07-25 09:23:40 +09:00
Ken Terao 3a62cad7f5 fix model_from_json 2015-07-24 17:38:24 -05:00
Ken Terao 1e18983cfb Fix optimizer 2015-07-24 17:34:11 -05:00
fchollet a4def20848 Fix serialization 2015-07-24 22:08:27 +09:00
fchollet f1d60121ed Fix optimizers test 2015-07-24 18:34:57 +09:00
fchollet 0661032509 Merge branch 'master' of https://github.com/kenterao/keras into kenterao-master 2015-07-24 17:53:22 +09:00
fchollet 4f6b1a4dae Add optimizers tests 2015-07-24 17:50:44 +09:00
fchollet 0e295dbda1 Merge branch 'master' of https://github.com/fchollet/keras 2015-07-24 16:38:33 +09:00
fchollet 7c3bf9d02f Add dataset tests 2015-07-24 16:26:49 +09:00
fchollet e06f9df878 Refactor model serialization 2015-07-24 16:26:27 +09:00
fchollet b6aaeb35ee Fix embedding test 2015-07-24 16:26:16 +09:00
fchollet d1387c1e87 Move model_utils to layer_utils 2015-07-24 16:05:39 +09:00
fchollet 3a28da9e54 Cleanup/fix model_utils 2015-07-24 16:02:46 +09:00
fchollet 32fd202805 Merge branch 'print_layer_shapes' of https://github.com/julienr/keras into julienr-print_layer_shapes 2015-07-24 15:32:44 +09:00
fchollet e975f8a691 Remove deprecated methods 2015-07-24 09:32:41 +09:00
François Chollet d2defcae18 Merge pull request #433 from mynameisfiber/master
Fixed parameter passing for preprocessing.text.one_hot
2015-07-24 09:15:53 +09:00
Ken Terao 347e6d01ff Added get_state and set_state method to Optimizer base class. 2015-07-23 19:10:38 -05:00
François Chollet 540a1ceb45 Merge pull request #436 from kenterao/master
typo
2015-07-24 08:56:06 +09:00
Ken Terao 4c83fccced typo 2015-07-23 18:51:29 -05:00
Ken Terao 0974e07a4a typo 2015-07-23 18:49:33 -05:00
Micha Gorelick 570c377623 Fixed parameter passing for preprocessing.text.one_hot 2015-07-23 16:07:24 -04:00
Julien Rebetez 896880f84d Add a test script for model_utils 2015-07-23 18:22:13 +02:00
Julien Rebetez d635a60140 Add model_utils.print_graph_layer_shapes to handle Graph models.
Also handle Merge layers
2015-07-23 18:22:06 +02:00
Julien Rebetez e1df8ca2b1 Merge branch 'master' into print_layer_shapes 2015-07-23 16:48:27 +02:00
Julien Rebetez 1ad453f6d0 Add check to print_layer_shapes to fail explicitly on models connected to other models. 2015-07-23 16:46:35 +02:00
fchollet a08bf38ec6 Extend loss weighting tests 2015-07-23 20:31:35 +09:00
fchollet 6289de3717 Fix & extend loss weighting 2015-07-23 20:14:34 +09:00
fchollet 8a99d6e404 Merge branch 'master' of https://github.com/kenterao/keras into kenterao-master 2015-07-23 19:44:05 +09:00
fchollet 6ed288b11c Merge branch 'the-moliver-Psoftplus' 2015-07-23 18:08:37 +09:00
fchollet 940377302b merge 2015-07-23 18:06:59 +09:00
Ken Terao 9cf5f6f982 adding sample_weights to Graph 2015-07-22 23:50:03 -05:00
Michael Oliver 84a3b5aecb Make initializations flexible 2015-07-22 12:30:51 -07:00
Michael Oliver 0174f21ea5 Change name, change to maskedlayer, add docs 2015-07-22 11:20:55 -07:00
François Chollet 2e204479ad Merge pull request #425 from tleeuwenburg/test_layers
More testing, esp core layers
2015-07-22 23:01:18 +09:00
fchollet 1ebec1e515 merge 2015-07-22 22:47:07 +09:00
François Chollet 828d1876ef Merge pull request #407 from tleeuwenburg/travis_upgrade
Travis upgrade
2015-07-22 22:36:20 +09:00
Tennessee Leeuwenburg 66247a8261 Added h5py to the conda install
Test with some travis steps removed

Testing

Tweaking

Tweaking

Tweaking

Final tweaking
2015-07-22 12:09:47 +02:00
Thomas McColgan 67426a6fa3 Test the constructor, config and params functions of all core layers. 2015-07-22 09:08:39 +02:00
Thomas McColgan 571448f5cd test connecting base layers 2015-07-22 09:08:39 +02:00
Thomas McColgan 319412a62d test elementary input and output of base layer 2015-07-22 09:08:39 +02:00
Thomas McColgan 20921b7b13 add some inline documentation 2015-07-22 09:08:39 +02:00
Thomas McColgan f08f590752 add a test on the output dimensions 2015-07-22 09:08:39 +02:00
Thomas McColgan b5dac6bd64 put dimensions into variables
and

make things a bit more concise
2015-07-22 09:08:39 +02:00
Thomas McColgan 130e5cd8cd run get_config and get_output_mask
typo
2015-07-22 09:08:39 +02:00
Thomas McColgan 529306f9f9 simple test of all recurrent layers 2015-07-22 09:08:38 +02:00
fchollet ed9834c62a Remove dot utils doc 2015-07-22 10:33:39 +09:00
fchollet 3037183c1b Remove dot_utils 2015-07-22 10:32:00 +09:00
fchollet ec8f7f0017 Codebase cleanup 2015-07-22 10:31:49 +09:00
fchollet f392a7800d Merge branch 'master' of https://github.com/fchollet/keras 2015-07-21 15:41:11 +09:00
fchollet 72e73b0a30 Merge branch 'cmyr-batch-shuffle' 2015-07-21 15:40:48 +09:00
fchollet 2fbfbdddf6 Cleanup error msg 2015-07-21 15:40:38 +09:00
fchollet e98b1c2352 Merge branch 'batch-shuffle' of https://github.com/cmyr/keras into cmyr-batch-shuffle 2015-07-21 15:37:16 +09:00
François Chollet 80f04d7b56 Merge pull request #420 from the-moliver/the-moliver-samplingfix
Update lstm_text_generation.py with proper multinomial sampling
2015-07-21 14:12:32 +09:00
Michael Oliver 2ada7d16cb change Psoftplus defaults to be nearer relu 2015-07-20 18:14:41 -07:00
Michael Oliver dc50928c18 add theano import 2015-07-20 17:44:19 -07:00
Michael Oliver 36a9a39473 Add parametric softplus 2015-07-20 17:09:09 -07:00
Michael Oliver 98d49754ed Update lstm_text_generation.py
Modify to use proper multinomial sampling, with temperature to control diversity. This seems to generate qualitatively better results and is technically more correct.
2015-07-20 15:42:14 -07:00
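Note on 98d49754ed: multinomial sampling with a temperature is commonly done by rescaling the log-probabilities and renormalising before drawing. The sketch below uses only the standard library; `reweight` and `sample_index` are hypothetical names, and the `1e-12` floor is an assumption added to avoid `log(0)`. It is not the code from `lstm_text_generation.py`.

```python
import math
import random


def reweight(probs, temperature=1.0):
    """Rescale a probability vector; low temperature sharpens, high flattens."""
    logits = [math.log(p + 1e-12) / temperature for p in probs]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - top) for logit in logits]
    total = sum(weights)
    return [w / total for w in weights]


def sample_index(probs, temperature=1.0, rng=random):
    """Draw one index from the temperature-adjusted distribution."""
    adjusted = reweight(probs, temperature)
    return rng.choices(range(len(adjusted)), weights=adjusted)[0]
```

A temperature of 1.0 leaves the distribution essentially unchanged; values below 1.0 make the most likely character more dominant, and values above 1.0 increase diversity.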
cmyr 3b4b5a654b added documentation + a hint if hdf5/shuffle conflict suspected 2015-07-20 13:44:56 -04:00
Julien Rebetez 62a4f29a71 Add print_layer_shapes function 2015-07-20 17:01:59 +02:00
fchollet d2b5849784 Merge branch 'anayebi-master' 2015-07-19 11:05:19 +09:00
fchollet 4b6bf1dbfe Fix Permute layer 2015-07-19 11:04:58 +09:00
fchollet 84d9171ac2 Merge branch 'master' of https://github.com/anayebi/keras into anayebi-master 2015-07-19 10:53:29 +09:00
fchollet c777cdf812 Update README 2015-07-18 20:01:23 +09:00
fchollet 91a15fdf43 Doc, README touch-ups 2015-07-18 19:49:10 +09:00
anayebi fea95703f4 Added Permute layer as suggested by loyeamen on #401 2015-07-17 13:20:12 -07:00
François Chollet efbf7a27f1 Merge pull request #404 from tleeuwenburg/test_norm
Squashed commit of the following:
2015-07-17 13:26:01 +09:00
François Chollet 0e87f4070e Merge pull request #403 from phreeza/typo
fix a typo in several places (ouput -> output)
2015-07-17 08:46:24 +09:00
Tennessee Leeuwenburg a3052e708a Updated to use new container infrastructure 2015-07-17 09:05:57 +10:00
Tennessee Leeuwenburg e6582c1892 Squashed commit of the following:
Author: Thomas McColgan <thomas.mccolgan@gmail.com>

    test config and weight init in batch normalization
    add tests for batch normalization
2015-07-17 08:42:52 +10:00
Thomas McColgan 71ac4bffd3 fix a typo in several places (ouput -> output) 2015-07-16 22:48:38 +02:00
fchollet 6a4aab453f Fix border mode = same in Conv2D 2015-07-16 23:30:29 +09:00
fchollet 46e19b95d8 Cleanup 2015-07-16 16:31:47 +09:00
fchollet 8824f1b469 Fix yaml serialization support 2015-07-16 16:31:14 +09:00
fchollet 43d84368c6 Fixes in yaml serialization 2015-07-16 15:18:34 +09:00
fchollet 08abc317f2 Merge branch 'yaml' of https://github.com/maxpumperla/keras into maxpumperla-yaml 2015-07-16 14:21:30 +09:00
fchollet 036d968ad6 Merge branch 'master' of https://github.com/fchollet/keras 2015-07-16 14:19:24 +09:00
fchollet 510068b83e Merge branch 'pjadzinsky-master' 2015-07-16 14:18:48 +09:00
fchollet 43736166a4 Touch-ups to 'same' border mode 2015-07-16 14:18:05 +09:00
fchollet 9e25669843 Merge branch 'master' of https://github.com/pjadzinsky/keras into pjadzinsky-master 2015-07-16 14:08:38 +09:00
François Chollet 12eba1333a Merge pull request #398 from kenterao/master
Match get_updates signature
2015-07-16 14:07:07 +09:00
Pablo Jadzinsky 6336567c84 Added border_mode='same' to Convolution2D 2015-07-15 21:36:36 -07:00
Ken Terao a3ebda96c3 Uninitialized progbar when verbose==0 2015-07-15 23:29:43 -04:00
Ken Terao 37fe48b744 Match get_updates signature 2015-07-15 23:24:42 -04:00
fchollet fd1d5908c6 Fix graph tests 2015-07-16 10:37:36 +09:00
fchollet 08b1964b77 Fix doc 2015-07-16 10:37:27 +09:00
Pablo Jadzinsky 6a0bf4833b Merge branch 'master' of github.com:pjadzinsky/keras 2015-07-15 12:48:38 -07:00
Pablo Jadzinsky 7af168d81a Added CropImage layer. Shrinks images in a convolution layer. When
applying a Convolution2D with border_mode='Full', images will grow in
size; this layer allows shrinking them back to their original size (or any
other size)
2015-07-15 12:45:21 -07:00
Max Pumperla 5998b5dcff Extended yaml tests, includes merged sequentials and graphs 2015-07-15 17:07:42 +02:00
Max Pumperla af845d55a4 layer utils, has getter for layers by name and arguments and from_yaml function 2015-07-15 17:06:19 +02:00
Max Pumperla 200a006262 get_from_module available with additional dictionary arguments to initialize objects 2015-07-15 17:05:09 +02:00
Max Pumperla 705694a870 Allow getter of regularizers to take dict args 2015-07-15 17:03:41 +02:00
Max Pumperla 7da91d94ef Allow optimizer getter to take dict args 2015-07-15 17:00:07 +02:00
Max Pumperla 2b7a3cbaa9 to_yaml for Sequential and Graph, as well as model_from_yaml in model agnostic fashion 2015-07-15 16:59:28 +02:00
Max Pumperla c90f98eaec Merge layer to_yaml and None default for reg and constr 2015-07-15 16:58:18 +02:00
Max Pumperla 6c695493c9 Roll back to None as default for reg and constr 2015-07-15 16:56:30 +02:00
Max Pumperla eef82c486a Containers have a layer_to_yaml method | plus fixed a minor typo, inputs were mistakenly added to output_config, not input_config 2015-07-15 16:55:14 +02:00
Max Pumperla 53331e43c2 Allow constraint getter to take parameter dict 2015-07-15 16:52:40 +02:00
fchollet 94c930e99e Remove weight tying in autoencoder 2015-07-15 13:01:48 +09:00
fchollet 2d0e84a857 Merge branch 'mikekestemont-master' 2015-07-15 12:36:20 +09:00
fchollet c8a2f46f79 Rename IMDB CNN example 2015-07-15 12:35:56 +09:00
fchollet 6899a9db16 Revise IMDB conv1d example 2015-07-15 12:35:28 +09:00
fchollet 238d390932 Merge branch 'master' of https://github.com/mikekestemont/keras into mikekestemont-master 2015-07-15 12:18:15 +09:00
fchollet 22bd7cfab6 Update reuters dataset 2015-07-15 10:18:08 +09:00
Mike Kestemont 2603fa37d2 added conv1D example 2015-07-14 22:34:05 +02:00
cmyr 48ea8dfb47 implement batch shuffle 2015-07-14 15:04:23 -04:00
Max Pumperla 319e18ad8f Removed camel case in to_yaml 2015-07-14 16:51:31 +02:00
Max Pumperla 848e1fb5c5 Removed camel case 2015-07-14 16:49:55 +02:00
Max Pumperla 38c4cb2ac2 Manual test for yaml (de-)serialisation 2015-07-14 10:09:05 +02:00
Max Pumperla 493e6d2852 Changed output format of from/to yaml to string, instead of file path 2015-07-14 10:03:19 +02:00
fchollet 03e512a3f4 Batch validation in fit 2015-07-14 13:06:37 +09:00
fchollet a3dadff38f Update documentation 2015-07-12 08:10:13 +09:00
fchollet 65ebd8aa6a Merge branch 'master' of https://github.com/fchollet/keras 2015-07-11 10:20:03 +09:00
fchollet 9f71949d6d Dynamic epsilon in objectives 2015-07-11 10:19:46 +09:00
François Chollet 0bf62784cb Merge pull request #372 from KyotoSunshine/master
Fix uniform initializations equations
2015-07-10 11:21:54 +09:00
KyotoSunshine afc5adbfe1 Fix mistakes in uniform initializations equations
Standard deviation values were being passed as scale values for uniform distributions. 
But the relationship is: scale = standard deviation * sqrt(3).
So, the s values in glorot_uniform, lecun_uniform, and he_uniform should have been multiplied by sqrt(3) before being passed into uniform() function. Now it is fixed.
2015-07-10 02:51:57 +09:00
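Note on afc5adbfe1: a uniform distribution on [-a, a] has standard deviation a / sqrt(3), which is why a target standard deviation has to be multiplied by sqrt(3) to get the bound. The sketch below works this through for the Glorot case, where the target standard deviation is sqrt(2 / (fan_in + fan_out)). The function names are hypothetical and this is not the Keras implementation.

```python
import math
import random
import statistics


def uniform_limit_for_std(std):
    # U(-a, a) has standard deviation a / sqrt(3), so a = std * sqrt(3).
    return std * math.sqrt(3.0)


def glorot_uniform_limit(fan_in, fan_out):
    std = math.sqrt(2.0 / (fan_in + fan_out))
    return uniform_limit_for_std(std)  # equals sqrt(6 / (fan_in + fan_out))


def empirical_std(limit, n=100_000, seed=0):
    # Draw n samples from U(-limit, limit) and measure their spread.
    rng = random.Random(seed)
    return statistics.pstdev(rng.uniform(-limit, limit) for _ in range(n))
```

Passing the standard deviation itself as the bound, as the commit message describes, would produce weights whose spread is too small by a factor of sqrt(3).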
Max Pumperla 94d857f3cb Changed layers to use get_config for regularizers and constraints for (de-)serialization 2015-07-09 11:50:08 +02:00
Max Pumperla d77175fb2e GaussianNoise has to be MaskedLayer instead of Layer 2015-07-09 11:48:20 +02:00
Max Pumperla f142d34ffc serialize sequential models as yaml 2015-07-09 11:47:19 +02:00
Max Pumperla ca22bb7db5 Getter for regularizers 2015-07-09 10:42:38 +02:00
Max Pumperla 3909e783b9 Stored loss function as self.unweighted_loss, before passing it to weighted_objective 2015-07-09 10:41:54 +02:00
Max Pumperla 46f591d4bb Getter for constraints 2015-07-09 10:40:34 +02:00
François Chollet 472f8ba94b Merge pull request #346 from wxs/document-masking
Add some documentation of the masking feature
2015-07-09 06:57:28 +09:00
fchollet 5bf2718580 Fix binary_crossentropy 2015-07-08 14:03:07 -07:00
fchollet 9a649d2b27 Fix GaussianNoise imports 2015-07-08 13:32:54 -07:00
fchollet c06b746704 Merge branch 'patch-1' of https://github.com/the-moliver/keras into the-moliver-patch-1 2015-07-08 13:29:58 -07:00
Max Pumperla b4c62ffbab Constraints, optimizers and regularizers have a get_config() as preliminary to serialisation 2015-07-08 14:31:43 +02:00
Max Pumperla 5cbe62c248 Made theano_mode field of Sequential 2015-07-08 14:30:32 +02:00
Michael Oliver f0aedaf4e8 correct sqrt call 2015-07-07 18:29:52 -07:00
Michael Oliver 1bfbb33c0d change Layer to MaskedLayer bugfix 2015-07-07 18:22:19 -07:00
fchollet bc05a25b1b Fix tests, increase coverage 2015-07-07 11:07:10 -07:00
fchollet 1e73e1bc54 Fix History callback 2015-07-07 11:07:00 -07:00
fchollet 07f2253ff2 Add test for sequential model 2015-07-07 10:11:48 -07:00
fchollet f817e2e71c merge 2015-07-07 07:40:40 -07:00
François Chollet 4b07e77828 Merge pull request #353 from mthrok/master
Modified comment and fixed batch_size
2015-07-07 07:38:06 -07:00
François Chollet c0a44daabc Merge pull request #349 from floydsoft/patch-1
fix doc
2015-07-07 07:37:29 -07:00
Moto H 7ff158de41 Modified comment and fixed batch_size 2015-07-07 14:22:25 +09:00
Michael Oliver 8213e515d3 Update noise.py
Add link
2015-07-06 20:27:19 -07:00
Michael Oliver c353dfc5f9 Update noise.md
Add link
2015-07-06 20:26:34 -07:00
Michael Oliver 0cae63accb Update noise.py
fix code style
2015-07-06 20:08:24 -07:00
Michael Oliver 5ed7f78b42 update docs 2015-07-06 15:03:14 -07:00
Michael Oliver c354b9be5c Add GaussianDropout 2015-07-06 14:13:01 -07:00
floydsoft df331cc5e0 fix doc 2015-07-07 04:59:00 +08:00
François Chollet 14b014883a Merge pull request #348 from floydsoft/master
fix add_node from inputs
2015-07-06 12:51:18 -07:00
fchollet 7094be1b44 Add y standardization to graph model 2015-07-06 12:47:36 -07:00
floydsoft d4f39c8f53 fix add_node from inputs 2015-07-07 03:45:34 +08:00
Xavier Snelgrove 337f39bd02 Add some documentation of the masking feature 2015-07-06 15:28:16 -04:00
fchollet 7c8d9aaf6b Fix gaussian noise doc 2015-07-06 10:16:37 -07:00
fchollet 6c55dbd6d5 Fix callbacks doc 2015-07-06 10:09:32 -07:00
fchollet ab8f7da83f Put GaussianNoise in its own module. 2015-07-06 10:05:55 -07:00
fchollet 1f21e61a40 Merge branch 'master' of https://github.com/Reddine/keras into Reddine-master 2015-07-06 09:58:33 -07:00
fchollet 16f737ffc2 Fix AE example in doc 2015-07-06 09:57:00 -07:00
fchollet 1fe6556ab8 Fix graph doc 2015-07-05 21:05:56 -07:00
fchollet aeb954dd49 Add dtype to graph inputs 2015-07-05 21:03:29 -07:00
fchollet 35bcd5a45a Touch-ups in examples and doc 2015-07-05 15:04:20 -07:00
fchollet b1d9448908 Refactor merge layer 2015-07-05 14:51:25 -07:00
fchollet fe3c4d73eb Update graph tests 2015-07-05 14:28:35 -07:00
Kheir Eddine FARFAR c315b0d7a9 fix GaussianNoise 2015-07-05 22:22:16 +01:00
fchollet ddd5f47640 Fix container merging issue 2015-07-05 14:13:02 -07:00
fchollet 63f9a7955d Fix ModelCheckpoint callback 2015-07-05 11:19:58 -07:00
fchollet 8995b50a96 Remove DenoisingAutoEncoder 2015-07-05 11:09:22 -07:00
Kheir Eddine FARFAR e014b9435f Create Gaussian Noise layer 2015-07-05 17:46:32 +01:00
fchollet b22e547e98 Merge branch 'Reddine-master' 2015-07-04 15:39:54 -07:00
fchollet 8bd8ae1daa Correct MAPE loss 2015-07-04 15:39:37 -07:00
fchollet 1530f31c1d Merge branch 'master' of https://github.com/Reddine/keras into Reddine-master 2015-07-04 15:35:13 -07:00
fchollet f2c97d817b Add Graph model to doc 2015-07-04 15:08:53 -07:00
fchollet 53a05b6e4c Fix metrics issue in evaluate 2015-07-04 14:37:58 -07:00
fchollet dab55518ba Update cifar10 example 2015-07-04 14:28:24 -07:00
fchollet be75548ca3 Add graph config management 2015-07-04 14:27:01 -07:00
Kheir Eddine FARFAR 32f483fe33 Fix mape objective 2015-07-04 21:02:47 +01:00
fchollet 553a7c0265 Graph bugfix, improve tests 2015-07-04 12:02:26 -07:00
Kheir Eddine FARFAR 60daee5674 Add mape and msle objectives to the documentation 2015-07-04 15:24:24 +01:00
Kheir Eddine FARFAR 47fd945cf1 Add MAPE objective 2015-07-04 15:16:52 +01:00
fchollet 66b8f37175 Complete working version of graphs. API needs work 2015-07-04 00:50:24 -07:00
fchollet f1cd436574 Fixes in models, callbacks 2015-07-03 23:43:19 -07:00
fchollet f30223096e merge 2015-07-03 22:43:34 -07:00
fchollet 76b9877ccb Add MSLE objective 2015-07-03 18:04:15 -07:00
fchollet 243d4737d1 Better API for Convolution1D and MaxPooling1D 2015-07-03 15:59:25 -07:00
fchollet 9e4e432822 Merge branch 'pranv-master' 2015-07-03 11:31:27 -07:00
fchollet c7c5372509 Style fixes 2015-07-03 11:31:03 -07:00
fchollet 2baa9a8e57 Merge branch 'master' of https://github.com/pranv/keras into pranv-master 2015-07-03 11:12:21 -07:00
fchollet e63644cf82 Fix denoising autoencoder issue 2015-07-03 11:08:29 -07:00
Pranav Shyam af0899ded7 Added LRN, Convolution with Strides and ZeroPadding2D 2015-07-03 11:53:47 +05:30
fchollet a5f4bd33c9 Merge branch 'master' of https://github.com/fchollet/keras 2015-07-02 22:54:41 -07:00
fchollet 522201ec9e Add test utils 2015-07-02 22:54:27 -07:00
François Chollet 506aaf1721 Merge pull request #328 from samuela/patch-1
Fix parenthesis typo in examples.md
2015-07-02 22:12:21 -07:00
fchollet 8e1e32a906 Fix tests 2015-07-02 22:11:46 -07:00
fchollet 0332b95cdf Remove check in binary_crossentropy 2015-07-02 21:53:12 -07:00
fchollet f53c3195e7 Merge branch 'master' of https://github.com/fchollet/keras 2015-07-02 21:47:25 -07:00
fchollet 940bd47bc9 Add test_loss_weighting 2015-07-02 21:47:10 -07:00
fchollet ee730496ee Add test_tasks 2015-07-02 21:45:31 -07:00
fchollet 06d7c7dab2 Strict handling of incorrect dims in binary_xent 2015-07-02 21:44:57 -07:00
samuela 91d86c355b Fix parenthesis typo in examples.md 2015-07-02 21:17:19 -04:00
François Chollet 98f055074c Merge pull request #327 from tleeuwenburg/master
New pull request -- much cleaner
2015-07-02 17:48:27 -07:00
fchollet bba5379305 Fix constraint tests 2015-07-02 17:47:12 -07:00
Thomas McColgan acef252703 change constraints tests to new constraint api 2015-07-03 10:01:44 +10:00
Thomas McColgan 514ac06c2e missing axis parameter 2015-07-03 10:01:44 +10:00
Thomas McColgan f184db8c53 add exotic inputs to identity test 2015-07-03 10:01:43 +10:00
Thomas McColgan d59f2519fa make some texts a bit more explicit 2015-07-03 10:01:43 +10:00
Thomas McColgan 4956ed9e97 small PEP-8 changes 2015-07-03 10:01:43 +10:00
Thomas McColgan b48e39aafd Add a test for the identity, non-negative, and unit-norm constraints 2015-07-03 10:01:43 +10:00
Thomas McColgan a808ae6ff8 Add a test for the max-norm constraint 2015-07-03 10:01:43 +10:00
fchollet 2805413653 Merge branch 'master' of https://github.com/fchollet/keras 2015-07-02 15:22:18 -07:00
fchollet 2d5b86d7a8 Merge branch 'mthrok-master' 2015-07-02 15:21:57 -07:00
fchollet 12a5c6fe46 Touch-ups in IRNN example 2015-07-02 15:21:37 -07:00
fchollet e50462e71f Merge branch 'master' of https://github.com/mthrok/keras into mthrok-master 2015-07-02 13:50:00 -07:00
François Chollet 2f98bed389 Merge pull request #325 from phreeza/patch-2
Fix unitnorm to be a class, like other constraints
2015-07-02 13:02:29 -07:00
Thomas McColgan 24be9eb88b Fix unitnorm to be a class, like other constraints
brought to you by unittests
2015-07-02 21:29:50 +02:00
François Chollet e15f722558 Merge pull request #323 from wxs/None-on-embedding
Return None when not masking Embedding.
2015-07-02 10:13:20 -07:00
Xavier Snelgrove c9fd2c8a8c Add some clarification comments. 2015-07-02 12:55:58 -04:00
Xavier Snelgrove 42497d9fda Return None when not masking Embedding.
Embedding is supposed to return None for its mask when it's not in
masking mode.
2015-07-02 12:44:55 -04:00
fchollet 278618281c Fix masking issue 2015-07-01 21:32:57 -07:00
Moto H 4ae4c9c8df Corrected execution time 2015-07-02 09:24:09 +09:00
Moto H 89ad3c726c Changed #epochs 2015-07-02 09:02:27 +09:00
Moto H 3af4d2fd33 Added IRNN example. 2015-07-02 08:55:06 +09:00
fchollet a4329aa4b0 Merge branch 'master' of https://github.com/fchollet/keras 2015-06-30 20:03:23 -07:00
fchollet 24df6a9f9b Fix weights in model.test 2015-06-30 20:03:11 -07:00
fchollet 560cb94519 Callback refactor 2015-06-30 17:59:34 -07:00
fchollet 2ab9f0ef61 Add working Graph container 2015-06-30 17:59:19 -07:00
fchollet d148ff11c2 Initial draft of Graph container 2015-06-29 22:26:57 -07:00
François Chollet 256a908250 Merge pull request #301 from ameasure/patch-1
Update convolutional.md
2015-06-29 16:26:59 -07:00
Alexander Measure 11d3335dcc Update convolutional.md
Changed documentation of subsample parameter to match implementation.
2015-06-29 19:10:21 -04:00
François Chollet 1b7ce160e2 Merge pull request #297 from wxs/pre-truncation
Support truncation off beginning of sequence
2015-06-29 14:17:22 -07:00
fchollet c7aac3ce39 ones as default init for LSTM forget gate bias 2015-06-29 14:14:05 -07:00
Xavier Snelgrove 7e06995678 Support truncation off beginning of sequence 2015-06-29 11:57:40 -04:00
fchollet ce659e568b Fix check for mask 2015-06-29 06:02:02 -07:00
fchollet e8a6ae298d Fix regularization in Embedding 2015-06-29 05:07:37 -07:00
fchollet e3d3d62218 Update doc with sample_weight and class_weight 2015-06-28 18:01:30 -07:00
fchollet 877d44740c Merge branch 'master' of https://github.com/fchollet/keras 2015-06-28 17:54:29 -07:00
fchollet cc9edcf472 Fix merge conflicts, add class_weight support 2015-06-28 17:50:42 -07:00
fchollet 5e5bff0510 Simplify IMDB example 2015-06-28 17:09:54 -07:00
François Chollet bc4e47e689 Merge pull request #290 from stonebig/master
create .keras/models dir if needed
2015-06-28 10:53:28 -07:00
stonebig d9357646e2 create keras directories if needed
solves #289
2015-06-28 10:53:36 +02:00
fchollet c4735f59c4 Doc touch-ups 2015-06-27 23:52:03 -07:00
fchollet 3af40ba584 Fix typo in doc 2015-06-27 20:35:16 -07:00
fchollet 36e5a17b29 Update docs 2015-06-27 20:31:09 -07:00
fchollet cbde35fdf5 Improve API of ActivityRegularization 2015-06-27 20:31:03 -07:00
fchollet 76b28524d1 Remove image_shape in conv layers 2015-06-27 20:29:27 -07:00
fchollet 090ada4e77 Fix mkdocs.yml 2015-06-27 20:29:07 -07:00
fchollet 8ef90a03b3 Switch convolutional layers to new regularization 2015-06-27 15:43:49 -07:00
fchollet f04fdec9e6 Incorporate regularization into loss function 2015-06-27 15:39:32 -07:00
fchollet 0d6575c7f9 Merge branch 'master' of https://github.com/fchollet/keras 2015-06-27 13:42:51 -07:00
fchollet cc750adf60 Merge branch 'phreeza-activity_reg' 2015-06-27 13:42:21 -07:00
fchollet 752037c140 Remove loss_update system 2015-06-27 13:42:00 -07:00
fchollet ba11d9ca75 Restore callbacks in fit 2015-06-27 13:30:48 -07:00
fchollet 499a05003b Fix regularisers/constraints tests 2015-06-27 13:30:21 -07:00
fchollet c9addefa28 Simplify regularizers 2015-06-27 13:30:01 -07:00
fchollet 0f12e0119b Switch constraints to OO model 2015-06-27 13:29:48 -07:00
fchollet 4d267f5ba3 Remove manual check for regularizers (use auto) 2015-06-27 13:29:29 -07:00
fchollet 5719322573 Merge branch 'activity_reg' of https://github.com/phreeza/keras into phreeza-activity_reg 2015-06-27 13:02:13 -07:00
fchollet 5b6f56a040 Add mask support to new recurrent layers 2015-06-27 13:01:27 -07:00
François Chollet 3e08814118 Merge pull request #280 from tleeuwenburg/master
Added unit test, travis file
2015-06-26 18:13:59 -07:00
fchollet 9c8e0d43f3 Add support for custom padding value in sequence 2015-06-26 17:44:56 -07:00
fchollet ea0b7d263c Add text dataset support for OOV, start chars 2015-06-26 17:44:21 -07:00
fchollet 99c20e25c9 Make sure key examples are deterministic 2015-06-26 17:41:13 -07:00
fchollet 2abb4b2316 Merge branch 'the-moliver-patch-1' 2015-06-26 15:25:44 -07:00
fchollet 0d6c55d5f0 Style touch-ups 2015-06-26 15:25:30 -07:00
fchollet 45e444cc63 Code style fixes 2015-06-26 15:21:45 -07:00
fchollet 5d1976c46d Update IMDB example 2015-06-26 15:21:10 -07:00
Michael Oliver 3ff19e2a62 Fix memory leak
The scan in get_output TimeDistributedDense leaked memory like crazy. Changing it to match get_output in Dense seems to have fixed the problem and behaves identically.
2015-06-26 14:53:52 -07:00
Thomas McColgan 95a363f3d8 Revert "Make merge layer work with a single model, turning it into a container layer."
This reverts commit f4df2240c1.
2015-06-26 23:16:54 +02:00
fchollet 3995b18f43 Merge branch 'mask_value' of https://github.com/wxs/keras into wxs-mask_value 2015-06-26 14:13:06 -07:00
Xavier Snelgrove d57206dfd9 Disregard objective output dimensions when weighting
Binary cross-entropy and categorical cross-entropy give differently
shaped results, which used to not matter since mean() ignored the shape
2015-06-26 17:09:25 -04:00
Thomas McColgan 71787762db Revert "Add an Autoencoder model and a test to go with it."
This reverts commit 12f7f374c3.

Conflicts:
	keras/models.py
2015-06-26 22:59:15 +02:00
Thomas McColgan da20a4ccd2 remove a remaining instance of old naming 2015-06-26 22:57:37 +02:00
fchollet 90c4edefd3 Merge branch 'mask_value' of https://github.com/wxs/keras into wxs-mask_value 2015-06-26 12:32:36 -07:00
Xavier Snelgrove ad99af7be7 Force mask type to int8 for GPU 2015-06-26 12:00:21 -04:00
Thomas McColgan aaceb11b9e change custom regularizer in autoencoder to 2015-06-26 09:24:51 +02:00
Thomas McColgan 7d7085d523 Further simplification of Regularizer code (dropped flags entirely) 2015-06-26 09:23:26 +02:00
Thomas McColgan 33ca877821 Merge branch 'master' into activity_reg
Conflicts:
	keras/models.py
2015-06-26 09:18:37 +02:00
Tennessee Leeuwenburg 15055483ff Added a docstring to the linear activation function about input checking.
Added some more unit tests
2015-06-26 17:01:09 +10:00
Thomas McColgan c3d90020fc change semantics of Regularizer class to reduce external control flow 2015-06-26 08:50:11 +02:00
Thomas McColgan ed8448df7b Merge branch 'master' of https://github.com/phreeza/keras
Conflicts:
	keras/models.py
2015-06-26 08:30:55 +02:00
Tennessee Leeuwenburg 81822d3e52 Updated test to reflect new return type (list rather than list-of-list)
Added a test for the linear function
2015-06-26 12:37:00 +10:00
Tennessee Leeuwenburg 7a6cec902e Merge branch 'master' of https://github.com/fchollet/keras 2015-06-26 12:25:53 +10:00
Tennessee Leeuwenburg 37e3ffed85 Deleted superfluous test file
Added basic softmax activation test
2015-06-26 12:12:29 +10:00
fchollet 7fafc8ec9d Merge branch 'master' of https://github.com/fchollet/keras 2015-06-25 16:37:07 -07:00
fchollet 91aa190ceb Merge branch 'transcranial-RNN-MUT' 2015-06-25 16:35:57 -07:00
Xavier Snelgrove 4eb8346df5 Handle previous properly, rename things for clarity 2015-06-25 16:27:15 -04:00
Thomas McColgan 9083dd088d add execution lines to unit test 2015-06-25 21:30:29 +02:00
Leon Chen 4e78c9e758 add theano floatX dtype 2015-06-25 15:19:34 -04:00
Leon Chen 93ff2240f3 use sparse random projections for dimensionality changes 2015-06-25 15:03:43 -04:00
Thomas McColgan 867253e5d8 rename cost to loss 2015-06-25 20:53:50 +02:00
Xavier Snelgrove 31a303c38f Remove unused import 2015-06-25 14:10:25 -04:00
Xavier Snelgrove e1a39b80a9 Some code linting 2015-06-25 14:09:27 -04:00
Xavier Snelgrove 1564343c6e Didn't always initialize weight_val 2015-06-25 13:50:47 -04:00
Xavier Snelgrove 62392a4b5e Switch to better masking scheme.
Layers now have get_input_mask() and get_output_mask() functions
which you can use to get an int8 array representing which data
is masked.
2015-06-25 13:44:45 -04:00
Leon Chen 11685f68d5 relax input_dim == output_dim constraint with matrix multiplication, and rename MutatedRNN to JZS 2015-06-25 10:21:39 -04:00
Thomas McColgan b4c7aded69 fix file format 2015-06-25 14:07:12 +02:00
Thomas McColgan 182c827433 move manual text/check to appropriate location 2015-06-25 14:03:07 +02:00
Thomas McColgan 59e345501e refactored regularizers to be objects of a given class 2015-06-25 13:25:03 +02:00
Thomas McColgan 56e09ad736 add tests for regularizers 2015-06-25 11:39:47 +02:00
François Chollet 4d39c08c42 Merge pull request #277 from pdermyer/mergefix
Fixed Merge handling of overlapping Models.
2015-06-24 13:16:27 -07:00
Xavier Snelgrove 4962f18857 Move the objective-weighting decorator out of get() 2015-06-24 15:09:26 -04:00
Xavier Snelgrove 11e961fdf8 Introduce weighting of labels.
This changes objective functions to no longer return scalars, but
rather tensors of dimension one less than y, representing the loss for
each datapoint in y, on which it is expected you will calculate a weighted mean.
2015-06-24 14:58:58 -04:00
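The commit message above describes objectives that return one loss value per datapoint, leaving the caller to take a weighted mean. A rough pure-Python sketch of that idea follows. The function names are hypothetical, and normalising by the sum of the weights is an assumption made here, not something the commit message states.

```python
def squared_error_per_sample(y_true, y_pred):
    """Return one loss per datapoint: mean squared error over the last axis."""
    return [
        sum((t - p) ** 2 for t, p in zip(row_true, row_pred)) / len(row_true)
        for row_true, row_pred in zip(y_true, y_pred)
    ]


def weighted_mean_loss(per_sample_losses, sample_weights):
    """Reduce per-sample losses to a scalar using a weighted mean.

    Assumption: weights are normalised by their sum.
    """
    if len(per_sample_losses) != len(sample_weights):
        raise ValueError("losses and weights must have the same length")
    total_weight = sum(sample_weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted = sum(l * w for l, w in zip(per_sample_losses, sample_weights))
    return weighted / total_weight


y_true = [[1.0, 0.0], [0.0, 1.0]]
y_pred = [[1.0, 0.0], [1.0, 0.0]]
losses = squared_error_per_sample(y_true, y_pred)  # [0.0, 1.0]
print(weighted_mean_loss(losses, [1.0, 3.0]))      # 0.75
```

Returning a per-sample vector instead of a scalar is what makes sample and class weighting possible without changing each objective function.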
pdermyer 413663f03f Fixed Merge handling of overlapping Models. 2015-06-24 11:18:32 -07:00
Xavier Snelgrove 433f56f35d Merge remote-tracking branch 'origin/master' into mask_value 2015-06-24 11:37:58 -04:00
Leon Chen 6420c59f1b update docs 2015-06-24 11:29:47 -04:00
Leon Chen c35780557e implement MUT1, MUT2, MUT3 recurrent NN architectures from Jozefowicz et al 2015 2015-06-24 11:29:28 -04:00
François Chollet 4ba83caef2 Merge pull request #275 from transcranial/init-identity-mat
add identity matrix initialization
2015-06-23 17:24:52 -07:00
Leon Chen 13715c49c3 add identity matrix initialization 2015-06-23 19:16:16 -04:00
Xavier Snelgrove 7ef71ec944 Add mask_val to get_config() 2015-06-23 18:13:18 -04:00
fchollet 4e19cd29c9 Merge branch 'amitbeka-get_from_module_unicode_fix' 2015-06-23 10:51:28 -07:00
fchollet 1d02231a49 Merge branch 'get_from_module_unicode_fix' of https://github.com/amitbeka/keras into amitbeka-get_from_module_unicode_fix 2015-06-23 10:48:16 -07:00
François Chollet 333e85f6b1 Merge pull request #266 from d0ugal/master
Updated the MkDocs config from the deprecated format
2015-06-23 10:35:05 -07:00
fchollet 6d82ba75c9 Merge branch 'wxs-remove-time-distributed-softmax' 2015-06-23 10:22:15 -07:00
fchollet 135dccfeea Coding style 2015-06-23 10:22:03 -07:00
fchollet 4c5f8e364c merge 2015-06-23 10:16:50 -07:00
fchollet 515c430f43 Merge autoencoder fix 2015-06-23 10:07:46 -07:00
fchollet 8a2ae93ba7 Improve autoencoder check 2015-06-23 10:01:50 -07:00
Xavier Snelgrove 4c3495896e Remove time_distributed_softmax in favour of softmax
There is no reason to have two different functions for this! The softmax
function can just be configured to always perform the softmax across the
trailing dimension (i.e. nb_dimensions)
2015-06-23 12:24:08 -04:00
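The reasoning in the commit above is that a softmax applied across the trailing dimension covers both the 2D case and the time-distributed 3D case. A small sketch on nested Python lists, assuming nothing about the actual Theano implementation (the function names are illustrative only):

```python
import math


def softmax(vector):
    """Numerically stable softmax of a flat list of numbers."""
    if not vector:
        return []
    shift = max(vector)  # subtract the max to avoid overflow in exp
    exps = [math.exp(v - shift) for v in vector]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_last_axis(tensor):
    """Apply softmax over the innermost lists of a nested list.

    Works for any nesting depth, so the same function handles
    (samples, classes) and (samples, timesteps, classes) inputs.
    """
    if tensor and isinstance(tensor[0], (list, tuple)):
        return [softmax_last_axis(sub) for sub in tensor]
    return softmax(tensor)


batch_2d = [[0.0, 0.0], [1.0, 1.0]]
batch_3d = [[[0.0, 0.0], [1.0, 1.0]]]   # one sample, two timesteps
print(softmax_last_axis(batch_2d))      # [[0.5, 0.5], [0.5, 0.5]]
print(softmax_last_axis(batch_3d))      # [[[0.5, 0.5], [0.5, 0.5]]]
```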
Xavier Snelgrove 86a9445ffd Remove mask_val from time_distributed_softmax 2015-06-23 12:04:03 -04:00
Xavier Snelgrove 271e2b7f84 Add masking support to the activation layer 2015-06-23 12:00:03 -04:00
Julien Rebetez ce6728d9ce Make Autoencoder.connect call encoder.connect(). This fixes
a DisconnectedInputError if the encoder is a containers.Sequential
instance.
2015-06-23 14:33:06 +02:00
Thomas McColgan 85c915bd7e Merge branch 'master' of https://github.com/fchollet/keras into activity_reg
Conflicts:
	keras/models.py
2015-06-23 13:08:15 +02:00
Dougal Matthews 2603b95c8c Updated the MkDocs config from the deprecated format 2015-06-23 09:14:21 +01:00
Amit Beka d9e76bdeca fix to support python3
Signed-off-by: Amit Beka <amit.beka@gmail.com>
2015-06-23 05:53:53 +00:00
Tennessee Leeuwenburg 28285ceda7 Merge branch 'master' of https://github.com/fchollet/keras 2015-06-23 14:15:47 +10:00
Stephen Merity be22591e32 Decrease memory usage of LSTM text gen example
Both the training features and labels can be represented as numpy
booleans instead of float32 / float64. This enables standard low RAM
machines to scale up to large datasets. Especially important if you
either have many characters (ASCII), long sequences, or a large dataset.
2015-06-23 14:03:32 +10:00
fchollet a9f94c94a7 Revert loss weighting 2015-06-23 14:03:32 +10:00
fchollet 155369711c Update gitignore 2015-06-23 14:03:31 +10:00
fchollet 3dd4b2e8ab Rename test folder 2015-06-23 14:03:31 +10:00
fchollet a1912b0774 Rename lossweights test 2015-06-23 14:03:31 +10:00
François Chollet 99cdd12bf7 Merge pull request #264 from Smerity/master
Decrease memory usage of LSTM text gen example
2015-06-22 17:32:28 -07:00
Xavier Snelgrove 2384aa4c97 Add masking to time_distributed_softmax 2015-06-22 17:33:44 -04:00
Stephen Merity b9fbc458ed Decrease memory usage of LSTM text gen example
Both the training features and labels can be represented as numpy
booleans instead of float32 / float64. This enables standard low RAM
machines to scale up to large datasets. Especially important if you
either have many characters (ASCII), long sequences, or a large dataset.
2015-06-22 14:24:46 -07:00
Xavier Snelgrove e3519221b4 Add masking to GRU and LSTM 2015-06-22 16:44:01 -04:00
Xavier Snelgrove 3f3a1f0e9b Remove redundant masking in _step 2015-06-22 15:49:53 -04:00
Xavier Snelgrove 852f5d977f TimeDistributedDense masking 2015-06-22 15:41:45 -04:00
Xavier Snelgrove a1ab9769ec Rename test to check as per new convention 2015-06-22 15:12:02 -04:00
Xavier Snelgrove ef4250fc05 Move recurrent mask tests into new directory layout 2015-06-22 15:05:01 -04:00
Xavier Snelgrove 76328bedd1 Merge remote-tracking branch 'origin/master' into mask_value 2015-06-22 14:45:48 -04:00
fchollet f8ee81d89e Revert loss weighting 2015-06-22 11:39:22 -07:00
Xavier Snelgrove 83419e644f Dropout masking 2015-06-22 14:16:48 -04:00
Amit Beka c274cb990b utils/generic_utils: fix unicode strings problem
in get_from_module(), a unicode identifier should be treated as a str
type.

Signed-off-by: Amit Beka <amit.beka@gmail.com>
2015-06-22 13:31:03 +00:00
Thomas McColgan b0b3ff13b0 Update models.py 2015-06-22 09:38:45 +02:00
fchollet 384634d321 Update gitignore 2015-06-21 16:05:55 -07:00
Tennessee Leeuwenburg 3e0b09e819 Updated script name 2015-06-22 07:53:32 +10:00
Tennessee Leeuwenburg e0f58c976b Try anaconda with travis 2015-06-22 07:45:23 +10:00
fchollet dd82a3944e Rename test folder 2015-06-21 12:00:57 -07:00
fchollet 9763f81185 Rename lossweights test 2015-06-21 11:54:25 -07:00
François Chollet ae13539a88 Merge pull request #256 from tleeuwenburg/master
Simple refactor of directories to support auto testing
2015-06-21 11:10:10 -07:00
Tennessee Leeuwenburg cbcb3b96ef Added travis file 2015-06-21 20:40:51 +10:00
Tennessee Leeuwenburg 0a21abc810 Rename a manual test to a check 2015-06-21 20:08:54 +10:00
Tennessee Leeuwenburg 1fc3c4b22e Renamed non-automatable tests as "checks"
Moved tests into either manual or auto subdirectories.
2015-06-21 20:06:36 +10:00
fchollet 5a1a00e69e Add RemoteMonitor callback 2015-06-20 19:32:28 -07:00
fchollet 97f23268c2 Refactor test_lossweights into unit test 2015-06-20 17:09:08 -07:00
fchollet fd5b68dbe3 Fix Python3 issue with class_weight 2015-06-20 16:53:57 -07:00
fchollet ac0a9db039 Merge 2015-06-20 15:32:45 -07:00
fchollet 1d586440a8 Rewrite class_weight and sample_weight tests 2015-06-20 15:30:18 -07:00
fchollet b44ecab225 Make RNG deterministic when seeded through numpy 2015-06-20 15:29:40 -07:00
fchollet 34cbc1d401 Merge branch 'instance_weight' of https://github.com/tdhd/keras into tdhd-instance_weight 2015-06-20 14:46:40 -07:00
fchollet 0e6fd3d306 Update callbacks documentation 2015-06-19 16:36:00 -07:00
fchollet ccbe381dcd Add early stopping 2015-06-19 16:31:06 -07:00
fchollet 1f224de9b1 Fix callbacks doc 2015-06-19 15:38:53 -07:00
fchollet e23e86ae7a Improve examples, add new MNIST CNN example 2015-06-19 15:38:33 -07:00
fchollet 06f22db69a Up the batch size in MNIST example 2015-06-19 12:52:43 -07:00
Philipp 824f9f5e80 refactored weighting in models.py 2015-06-19 21:30:23 +02:00
Xavier Snelgrove a12510ba97 Remove mask_value application from _step
I realized that it makes more sense to have _step *apply* a mask, but
then to set the masked entries to mask_value outside of step. This
should be more efficient, but more importantly should make
implementations easier to understand.

Another nice effect: an alternative masking scheme can be introduced
without changing _step at all.
2015-06-19 15:06:57 -04:00
Xavier Snelgrove c13154933e Remove some outdated comments from test 2015-06-19 13:45:56 -04:00
Xavier Snelgrove 981d23a66e Introduce masking to SimpleDeepRNN 2015-06-19 13:41:27 -04:00
Xavier Snelgrove 7278db105d Pass historical masks using the taps of scan
I did not previously know about Theano's "taps" concept.
2015-06-19 11:32:22 -04:00
Philipp 1b9f35f535 added tests for sample weight 2015-06-19 09:40:50 +02:00
Philipp c4f0f8b394 renamed instance weight to sample weight 2015-06-19 09:40:38 +02:00
Xavier Snelgrove d7a612bb8c Was incorrectly fixing the masked weights in Embedding
This led me to realize that I also was not properly passing masks out of
recurrent layers, nor were my tests properly checking for this. I've
resolved this here.
2015-06-19 01:07:35 -04:00
Xavier Snelgrove a8f1a6a11b Was not properly setting the mask row. 2015-06-18 22:08:17 -04:00
Xavier Snelgrove bde9ff8232 By default Embeddings don't have a mask 2015-06-18 21:58:51 -04:00
Xavier Snelgrove 32c507eaf8 Fix for the GPU again 2015-06-18 17:55:21 -04:00
Xavier Snelgrove 36676451c9 Was calculating the mask on transformed x, not X. 2015-06-18 17:41:49 -04:00
Xavier Snelgrove e369559170 Mis-set first value of mask at t-1 2015-06-18 17:34:00 -04:00
Xavier Snelgrove 7f445d9a21 Properly mask inputs from t-1 in recurrent net 2015-06-18 17:28:29 -04:00
Xavier Snelgrove 4cc03dd4ef Missed a bit of whitespace 2015-06-18 17:07:51 -04:00
Xavier Snelgrove 2727198f94 Whitespace changes for readability 2015-06-18 16:59:58 -04:00
Xavier Snelgrove 88b0c8e067 Added .swp files to the gitignore for Vim users 2015-06-18 16:58:34 -04:00
Xavier Snelgrove a75a31381e Added a concept of masking by value to the SimpleRNN 2015-06-18 16:27:30 -04:00
fchollet d550232458 Merge branch 'master' of https://github.com/fchollet/keras 2015-06-18 11:26:24 -07:00
fchollet 6329378ca3 Make pre-padding the default in sequence tensors 2015-06-18 11:26:11 -07:00
Philipp b3402f5011 renamed test file for class weights 2015-06-18 17:09:59 +02:00
Philipp 4cfb5d7971 added instance weight implementation 2015-06-18 17:04:28 +02:00
Thomas McColgan a51d7ae145 typo 2015-06-18 17:01:05 +02:00
Thomas McColgan bd361595be make l1 norm actually be an l1 norm. doh! 2015-06-18 16:43:28 +02:00
Thomas McColgan 668eeebdda Merge branch 'master' of https://github.com/fchollet/keras into activity_reg
Conflicts:
	keras/models.py
2015-06-18 14:47:23 +02:00
Thomas McColgan cf49d59aff add a test for activity regulrisation 2015-06-18 14:41:10 +02:00
Thomas McColgan 31a6ee342b make cost updates propagate all the way, even if layer input is unknown at instantiation 2015-06-18 14:40:46 +02:00
Thomas McColgan fecedb02cf add cost update member to layer 2015-06-18 10:58:38 +02:00
François Chollet 8e5cdd1689 Merge pull request #238 from jfsantos/patch-1
Fix typo
2015-06-17 11:13:11 -07:00
João Felipe Santos 45775e36b1 Fix typo 2015-06-17 14:11:13 -04:00
François Chollet 4830b4be27 Merge pull request #188 from tdhd/classweights
Add support for class_weight in fit
2015-06-17 10:25:34 -07:00
fchollet 702e9d3bec Merge branch 'master' of https://github.com/fchollet/keras 2015-06-16 22:53:13 -07:00
fchollet 1857a6af5e Improve LSTM text generation example 2015-06-16 22:52:06 -07:00
François Chollet 4b065a28a0 Merge pull request #234 from iskandr/master
Added Convolution1D and MaxPooling1D to layers documentation
2015-06-16 13:28:52 -07:00
Alex Rubinsteyn 33334883fa Added Convolution1D and MaxPooling1D to layers documentation 2015-06-16 12:58:52 -04:00
Philipp 2165fd03dc updated testcase for class weights 2015-06-16 11:03:30 +02:00
fchollet 872a9faf18 Fix printing for Python2 in LSTM example 2015-06-15 17:54:59 -07:00
fchollet d2b229df2e Add LSTM text generation example 2015-06-15 17:43:25 -07:00
Philipp 35c2f36759 moved weight multiplication outside of T.max 2015-06-15 15:20:33 +02:00
Philipp 24d735ecb1 merge master 2015-06-15 11:49:09 +02:00
Philipp 8c93ba860a objectives now reshape weight vector to match dims 2015-06-15 11:48:30 +02:00
fchollet bf4822f675 Fix autoencoder issue 2015-06-14 20:58:53 -07:00
Philipp ed6e351953 merge master 2015-06-14 23:10:34 +02:00
Philipp 368ad61236 removed unnecessary check in fit/train 2015-06-14 23:09:06 +02:00
Philipp 829b7ab289 fixed error in binary crossentropy obj 2015-06-14 23:04:37 +02:00
fchollet ce20955379 Update setup.py 2015-06-13 17:54:33 -07:00
Philipp 0a10e20959 merge master 2015-06-13 00:52:41 +02:00
Philipp ffeefb2a1b update test verbosity 2015-06-09 11:02:42 +02:00
Philipp eb33c9a18c updated branch 2015-06-09 10:53:31 +02:00
Philipp 39f05c3436 check for incompatible combination of parameters in fit/train 2015-06-08 09:52:58 +02:00
Philipp c0b7044ce6 add weight to (squared) hinge 2015-06-08 09:36:36 +02:00
Philipp bde147ce3a revert objective changes 2015-06-08 09:33:31 +02:00
Philipp d6eb8a47d7 add weight to all objective functions 2015-06-08 09:30:53 +02:00
Philipp 206f29ea6e fixed calculate_class_weights where it would yield an np-array with shape[1] == 1 which is not mappable. binary crossentropy adapted with weights. 2015-06-06 00:49:00 +02:00
Philipp a6824931e4 moved calculate_class_weights outside of model class 2015-06-05 21:47:32 +02:00
Philipp 7bcc4cb257 added documentation for classweights and a test 2015-06-03 17:39:34 +02:00
Philipp a4af21d588 added class weights to the Model class and one of the objective functions as an optional parameter 2015-06-03 17:39:11 +02:00
Thomas McColgan 12f7f374c3 Add an Autoencoder model and a test to go with it. 2015-05-25 17:17:58 +02:00
Thomas McColgan f4df2240c1 Make merge layer work with a single model, turning it into a container layer. 2015-05-25 17:16:05 +02:00
199 files changed with 36431 additions and 5458 deletions
+14 -1
@@ -1,6 +1,19 @@
*.DS_Store
*.pyc
*.swp
temp/*
dist/*
build/*
keras/datasets/data/*
keras/datasets/temp/*
keras/datasets/temp/*
docs/site/*
docs/theme/*
tags
Keras.egg-info
# test-related
.coverage
.cache
# developer environments
.idea
+73
@@ -0,0 +1,73 @@
sudo: required
dist: trusty
language: python
matrix:
  include:
    - python: 3.4
      env: KERAS_BACKEND=theano
    - python: 3.4
      env: KERAS_BACKEND=tensorflow
    - python: 2.7
      env: KERAS_BACKEND=theano
    - python: 2.7
      env: KERAS_BACKEND=tensorflow
    - python: 2.7
      env: KERAS_BACKEND=theano TEST_MODE=INTEGRATION_TESTS
    - python: 2.7
      env: KERAS_BACKEND=theano TEST_MODE=PEP8
install:
  # code below is taken from http://conda.pydata.org/docs/travis.html
  # We do this conditionally because it saves us some downloading if the
  # version is the same.
  - if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
      wget https://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh -O miniconda.sh;
    else
      wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh;
    fi
  - bash miniconda.sh -b -p $HOME/miniconda
  - export PATH="$HOME/miniconda/bin:$PATH"
  - hash -r
  - conda config --set always_yes yes --set changeps1 no
  - conda update -q conda
  # Useful for debugging any issues with conda
  - conda info -a
  - conda create -q -n test-environment python=$TRAVIS_PYTHON_VERSION numpy scipy matplotlib pandas pytest h5py
  - source activate test-environment
  - pip install pytest-cov python-coveralls pytest-xdist coverage==3.7.1 #we need this version of coverage for coveralls.io to work
  - pip install pep8 pytest-pep8
  - pip install git+git://github.com/Theano/Theano.git
  # install PIL for preprocessing tests
  - if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
      conda install pil;
    elif [[ "$TRAVIS_PYTHON_VERSION" == "3.4" ]]; then
      conda install Pillow;
    fi
  - python setup.py install
  # install TensorFlow
  - if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
      pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.11.0-cp27-none-linux_x86_64.whl;
    elif [[ "$TRAVIS_PYTHON_VERSION" == "3.4" ]]; then
      pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.11.0-cp34-cp34m-linux_x86_64.whl;
    fi
# command to run tests
script:
  # run keras backend init to initialize backend config
  - python -c "import keras.backend"
  # create dataset directory to avoid concurrent directory creation at runtime
  - mkdir ~/.keras/datasets
  # set up keras backend
  - sed -i -e 's/"backend":[[:space:]]*"[^"]*/"backend":\ "'$KERAS_BACKEND'/g' ~/.keras/keras.json;
  - echo -e "Running tests with the following config:\n$(cat ~/.keras/keras.json)"
  - if [[ "$TEST_MODE" == "INTEGRATION_TESTS" ]]; then
      PYTHONPATH=$PWD:$PYTHONPATH py.test tests/integration_tests;
    elif [[ "$TEST_MODE" == "PEP8" ]]; then
      PYTHONPATH=$PWD:$PYTHONPATH py.test --pep8 -m pep8 -n0;
    else
      PYTHONPATH=$PWD:$PYTHONPATH py.test tests/ --ignore=tests/integration_tests;
    fi
after_success:
  - coveralls
+65
@@ -0,0 +1,65 @@
# On Github Issues and Pull Requests
Found a bug? Have a new feature to suggest? Want to contribute changes to the codebase? Make sure to read this first.
## Bug reporting
Your code doesn't work, and you have determined that the issue lies with Keras? Follow these steps to report a bug.
1. Your bug may already be fixed. Make sure to update to the current Keras master branch, as well as the latest Theano/TensorFlow master branch.
To easily update Theano: `pip install git+git://github.com/Theano/Theano.git --upgrade`
2. Search for similar issues. Make sure to delete `is:open` on the issue search to find solved tickets as well. It's possible somebody has encountered this bug already. Also remember to check out Keras' [FAQ](http://keras.io/faq/). Still having a problem? Open an issue on Github to let us know.
3. Make sure you provide us with useful information about your configuration: what OS are you using? What Keras backend are you using? Are you running on GPU? If so, what is your version of Cuda, of cuDNN? What is your GPU?
4. Provide us with a script to reproduce the issue. This script should be runnable as-is and should not require external data download (use randomly generated data if you need to run a model on some test data). We recommend that you use Github Gists to post your code. Any issue that cannot be reproduced is likely to be closed.
5. If possible, take a stab at fixing the bug yourself --if you can!
The more information you provide, the easier it is for us to validate that there is a bug and the faster we'll be able to take action. If you want your issue to be resolved quickly, following the steps above is crucial.
## Requesting a Feature
You can also use Github issues to request features you would like to see in Keras, or changes in the Keras API.
1. Provide a clear and detailed explanation of the feature you want and why it's important to add. Keep in mind that we want features that will be useful to the majority of our users and not just a small subset. If you're just targeting a minority of users, consider writing an add-on library for Keras. It is crucial for Keras to avoid bloating the API and codebase.
2. Provide code snippets demonstrating the API you have in mind and illustrating the use cases of your feature. Of course, you don't need to write any real code at this point!
3. After discussing the feature you may choose to attempt a Pull Request. If you're at all able, start writing some code. We always have more work to do than time to do it. If you can write some code then that will speed the process along.
## Pull Requests
We love pull requests. Here's a quick guide:
1. If your PR introduces a change in functionality, make sure you start by opening an issue to discuss whether the change should be made, and how to handle it. This will save you from having your PR closed down the road! Of course, if your PR is a simple bug fix, you don't need to do that.
2. Write the code. This is the hard part!
3. Make sure any new function or class you introduce has proper docstrings. Make sure any code you touch still has up-to-date docstrings and documentation.
4. Write tests. Your code should have full unit test coverage. If you want to see your PR merged promptly, this is crucial.
5. Run our test suite locally. It's easy: from the Keras folder, simply run: `py.test tests/`.
- You will need to install `pytest`, `coveralls`, `pytest-cov`, `pytest-xdist`: `pip install pytest pytest-cov python-coveralls pytest-xdist pep8 pytest-pep8`
6. Make sure all tests are passing:
- with the Theano backend, on Python 2.7 and Python 3.5
- with the TensorFlow backend, on Python 2.7
7. We use PEP8 syntax conventions, but we aren't dogmatic when it comes to line length. Make sure your lines stay reasonably sized, though. To make your life easier, we recommend running a PEP8 linter:
- Install PEP8 packages: `pip install pep8 pytest-pep8 autopep8`
- Run a standalone PEP8 check: `py.test --pep8 -m pep8`
- You can automatically fix some PEP8 errors by running: `autopep8 -i --select <errors> <FILENAME>`, for example: `autopep8 -i --select E128 tests/keras/backend/test_backends.py`
8. When committing, use appropriate, descriptive commit messages. Make sure that your branch history is not a string of "bug fix", "fix", "oops", etc. When submitting your PR, squash your commits into a single commit with an appropriate commit message, to make sure the project history stays clean and readable. See ['rebase and squash'](http://rebaseandsqua.sh/) for technical help on how to squash your commits.
9. Update the documentation. If introducing new functionality, make sure you include code snippets demonstrating the usage of your new feature.
10. Submit your PR. If your changes have been approved in a previous discussion, and if you have complete (and passing) unit tests, your PR is likely to be merged promptly. Otherwise, well...
## Adding new examples
Even if you don't contribute to the Keras source code, if you have an application of Keras that is concise and powerful, please consider adding it to our collection of examples. [Existing examples](https://github.com/fchollet/keras/tree/master/examples) show idiomatic Keras code: make sure to keep your own script in the same spirit.
+9
@@ -0,0 +1,9 @@
Please make sure that the boxes below are checked before you submit your issue. Thank you!
- [ ] Check that you are up-to-date with the master branch of Keras. You can update with:
pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps
- [ ] If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with:
pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps
- [ ] Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).
+19 -2
@@ -1,6 +1,23 @@
The MIT License (MIT)
COPYRIGHT
Copyright (c) 2015
All contributions by François Chollet:
Copyright (c) 2015, François Chollet.
All rights reserved.
All contributions by Google:
Copyright (c) 2015, Google, Inc.
All rights reserved.
All other contributions:
Copyright (c) 2015, the respective contributors.
All rights reserved.
Each contributor holds copyright over their respective contributions.
The project versioning (Git) records all such contribution source information.
LICENSE
The MIT License (MIT)
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
+116 -154
@@ -1,209 +1,171 @@
# Keras: Theano-based Deep Learning library
# Keras: Deep Learning library for TensorFlow and Theano
[![Build Status](https://travis-ci.org/fchollet/keras.svg?branch=master)](https://travis-ci.org/fchollet/keras)
[![PyPI version](https://badge.fury.io/py/keras.svg)](https://badge.fury.io/py/keras)
[![license](https://img.shields.io/github/license/mashape/apistatus.svg?maxAge=2592000)](https://github.com/fchollet/keras/blob/master/LICENSE)
[![Join the chat at https://gitter.im/Keras-io/Lobby](https://badges.gitter.im/Keras-io/Lobby.svg)](https://gitter.im/Keras-io/Lobby)
## You have just found Keras.
Keras is a minimalist, highly modular neural network library in the spirit of Torch, written in Python / Theano so as not to have to deal with the dearth of ecosystem in Lua. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.
Keras is a high-level neural networks library, written in Python and capable of running on top of either [TensorFlow](https://github.com/tensorflow/tensorflow) or [Theano](https://github.com/Theano/Theano). It was developed with a focus on enabling fast experimentation. *Being able to go from idea to result with the least possible delay is key to doing good research.*
Use Keras if you need a deep learning library that:
- allows for easy and fast prototyping (through total modularity, minimalism, and extensibility).
- supports both convolutional networks (for vision) and recurrent networks (for sequence data). As well as combinations of the two.
- runs seamlessly on the CPU and the GPU.
- Allows for easy and fast prototyping (through total modularity, minimalism, and extensibility).
- Supports both convolutional networks and recurrent networks, as well as combinations of the two.
- Supports arbitrary connectivity schemes (including multi-input and multi-output training).
- Runs seamlessly on CPU and GPU.
Read the documentation at [Keras.io](http://keras.io).
Keras is compatible with __Python 2.7-3.4__.
Keras is compatible with: __Python 2.7-3.5__.
------------------
## Guiding principles
- __Modularity.__ A model is understood as a sequence of standalone, fully-configurable modules that can be plugged together with as little restrictions as possible. In particular, neural layers, cost functions, optimizers, initialization schemes, activation functions and dropout are all standalone modules that you can combine to create new models.
- __Modularity.__ A model is understood as a sequence or a graph of standalone, fully-configurable modules that can be plugged together with as little restrictions as possible. In particular, neural layers, cost functions, optimizers, initialization schemes, activation functions, regularization schemes are all standalone modules that you can combine to create new models.
- __Minimalism.__ Each module should be kept short and simple (<100 lines of code). Every piece of code should be transparent upon first reading. No black magic: it hurts iteration speed and ability to innovate.
- __Minimalism.__ Each module should be kept short and simple. Every piece of code should be transparent upon first reading. No black magic: it hurts iteration speed and ability to innovate.
- __Easy extensibility.__ New features (a new module, per the above definition, or a new way to combine modules together) are dead simple to add (as new classes/functions), and existing modules provide ample examples.
- __Easy extensibility.__ New modules are dead simple to add (as new classes and functions), and existing modules provide ample examples. Being able to easily create new modules allows for total expressiveness, making Keras suitable for advanced research.
- __Work with Python__. No separate model configuration files in a declarative format (like in Caffe or PyLearn2). Models are described in Python code, which is compact, easier to debug, benefits from syntax highlighting, and most of all, allows for ease of extensibility. See for yourself with the examples below.
- __Work with Python__. No separate model configuration files in a declarative format. Models are described in Python code, which is compact, easier to debug, and allows for ease of extensibility.
## Examples
### Multilayer Perceptron (MLP):
------------------
## Getting started: 30 seconds to Keras
The core data structure of Keras is a __model__, a way to organize layers. The main type of model is the [`Sequential`](http://keras.io/getting-started/sequential-model-guide) model, a linear stack of layers. For more complex architectures, you should use the [Keras functional API](http://keras.io/getting-started/functional-api-guide).
Here's the `Sequential` model:
```python
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
model = Sequential()
```
Stacking layers is as easy as `.add()`:
```python
from keras.layers import Dense, Activation
model.add(Dense(output_dim=64, input_dim=100))
model.add(Activation("relu"))
model.add(Dense(output_dim=10))
model.add(Activation("softmax"))
```
Once your model looks good, configure its learning process with `.compile()`:
```python
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
```
If you need to, you can further configure your optimizer. A core principle of Keras is to make things reasonably simple, while allowing the user to be fully in control when they need to (the ultimate control being the easy extensibility of the source code).
```python
from keras.optimizers import SGD
model = Sequential()
model.add(Dense(20, 64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, 64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, 2, init='uniform'))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
model.fit(X_train, y_train, nb_epoch=20, batch_size=16)
score = model.evaluate(X_test, y_test, batch_size=16)
model.compile(loss='categorical_crossentropy', optimizer=SGD(lr=0.01, momentum=0.9, nesterov=True))
```
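To make the optimizer arguments (`lr`, `decay`, `momentum`, `nesterov`) more concrete, here is a minimal pure-Python sketch of a single SGD step on one scalar parameter. The function `sgd_step` and its signature are hypothetical and exist only for this illustration; the code follows the commonly used update rule with time-based decay and is not a copy of the Keras implementation.

```python
def sgd_step(param, grad, velocity, iteration,
             lr=0.01, decay=0.0, momentum=0.0, nesterov=False):
    """Apply one SGD update to a scalar parameter (illustrative only).

    Returns the updated parameter and the updated velocity.
    """
    # Time-based decay shrinks the learning rate as training progresses.
    lr_t = lr / (1.0 + decay * iteration)
    # The velocity accumulates an exponentially weighted sum of past steps.
    velocity = momentum * velocity - lr_t * grad
    if nesterov:
        # Nesterov momentum looks ahead along the freshly updated velocity.
        param = param + momentum * velocity - lr_t * grad
    else:
        param = param + velocity
    return param, velocity


# Minimize f(w) = w ** 2, whose gradient is 2 * w, starting from w = 1.0.
w, v = 1.0, 0.0
for step in range(100):
    w, v = sgd_step(w, 2.0 * w, v, step, lr=0.1, momentum=0.9, nesterov=True)
print(w)
```

With `momentum=0.0` and `decay=0.0` the step reduces to plain gradient descent, which is a convenient sanity check when experimenting with these settings.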
### Alternative implementation of MLP:
You can now iterate on your training data in batches:
```python
model = Sequential()
model.add(Dense(20, 64, init='uniform', activation='tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, 64, init='uniform', activation='tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, 2, init='uniform', activation='softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
model.fit(X_train, Y_train, nb_epoch=5, batch_size=32)
```
### VGG-like convnet:
Alternatively, you can feed batches to your model manually:
```python
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.optimizers import SGD
model = Sequential()
model.add(Convolution2D(32, 3, 3, 3, border_mode='full'))
model.add(Activation('relu'))
model.add(Convolution2D(32, 32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(poolsize=(2, 2)))
model.add(Dropout(0.25))
model.add(Convolution2D(64, 32, 3, 3, border_mode='full'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(poolsize=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(64*8*8, 256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(256, 10))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
model.fit(X_train, Y_train, batch_size=32, nb_epoch=1)
model.train_on_batch(X_batch, Y_batch)
```
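Feeding batches manually amounts to slicing the training data into consecutive chunks and handing each chunk to `train_on_batch`. The following is a minimal sketch of the slicing step only; `iter_batches` is a hypothetical helper (not part of Keras), and plain lists stand in for the training arrays.

```python
def iter_batches(inputs, targets, batch_size):
    """Yield consecutive (inputs, targets) slices of at most batch_size items."""
    if len(inputs) != len(targets):
        raise ValueError('inputs and targets must have the same length')
    for start in range(0, len(inputs), batch_size):
        stop = start + batch_size
        yield inputs[start:stop], targets[start:stop]


X_train = list(range(10))            # stand-in for the training inputs
Y_train = [x % 2 for x in X_train]   # stand-in for the training labels

for X_batch, Y_batch in iter_batches(X_train, Y_train, batch_size=4):
    # In real code, this is where model.train_on_batch(X_batch, Y_batch) goes.
    print(len(X_batch), len(Y_batch))
```

The last batch may be smaller than `batch_size` when the number of samples is not a multiple of it.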
### Sequence classification with LSTM:
Evaluate your performance in one line:
```python
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM
model = Sequential()
model.add(Embedding(max_features, 256))
model.add(LSTM(256, 128, activation='sigmoid', inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(128, 1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop')
model.fit(X_train, Y_train, batch_size=16, nb_epoch=10)
score = model.evaluate(X_test, Y_test, batch_size=16)
loss_and_metrics = model.evaluate(X_test, Y_test, batch_size=32)
```
### Architecture for learning image captions with a convnet and a Gated Recurrent Unit:
(word-level embedding, caption of maximum length 16 words).
Note that getting this to actually "work" will require using a bigger convnet, initialized with pre-trained weights.
Displaying readable results will also require an embedding decoder.
Or generate predictions on new data:
```python
max_caption_len = 16
model = Sequential()
model.add(Convolution2D(32, 3, 3, 3, border_mode='full'))
model.add(Activation('relu'))
model.add(Convolution2D(32, 32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(poolsize=(2, 2)))
model.add(Convolution2D(64, 32, 3, 3, border_mode='full'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(poolsize=(2, 2)))
model.add(Convolution2D(128, 64, 3, 3, border_mode='full'))
model.add(Activation('relu'))
model.add(Convolution2D(128, 128, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(poolsize=(2, 2)))
model.add(Flatten())
model.add(Dense(128*4*4, 256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Repeat(max_caption_len))
# the GRU below returns sequences of max_caption_len vectors of size 256 (our word embedding size)
model.add(GRU(256, 256, return_sequences=True))
model.compile(loss='mean_squared_error', optimizer='rmsprop')
# "images" is a numpy array of shape (nb_samples, nb_channels=3, width, height)
# "captions" is a numpy array of shape (nb_samples, max_caption_len=16, embedding_dim=256)
# captions are supposed already embedded (dense vectors).
model.fit(images, captions, batch_size=16, nb_epoch=100)
classes = model.predict_classes(X_test, batch_size=32)
proba = model.predict_proba(X_test, batch_size=32)
```
In the examples folder, you will find example models for real datasets:
- CIFAR10 small images classification: Convnet with realtime data augmentation
- IMDB movie review sentiment classification: LSTM over sequences of words
- Reuters newswires topic classification: Multilayer Perceptron
- MNIST handwritten digits classification: Multilayer Perceptron
Building a question answering system, an image classification model, a Neural Turing Machine, a word2vec embedder or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful?
For a more in-depth tutorial about Keras, you can check out:
- [Getting started with the Sequential model](http://keras.io/getting-started/sequential-model-guide)
- [Getting started with the functional API](http://keras.io/getting-started/functional-api-guide)
In the [examples folder](https://github.com/fchollet/keras/tree/master/examples) of the repository, you will find more advanced models: question-answering with memory networks, text generation with stacked LSTMs, etc.
## Current capabilities
------------------
For complete coverage of the API, check out [the Keras documentation](http://keras.io).
A few highlights: convnets, LSTM, GRU, word2vec-style embeddings, PReLU, batch normalization...
## Installation
Keras uses the following dependencies:
- numpy, scipy
- Theano
- See installation instructions: http://deeplearning.net/software/theano/install.html#install
- pyyaml
- HDF5 and h5py (optional, required if you use model saving/loading functions)
- Optional but recommended if you use CNNs: cuDNN.
Once you have the dependencies installed, cd to the Keras folder and run the install command:
*When using the TensorFlow backend:*
- TensorFlow
- [See installation instructions](https://github.com/tensorflow/tensorflow#download-and-setup).
*When using the Theano backend:*
- Theano
- [See installation instructions](http://deeplearning.net/software/theano/install.html#install).
To install Keras, `cd` to the Keras folder and run the install command:
```sh
sudo python setup.py install
```
You can also install Keras from PyPI:
```sh
sudo pip install keras
```
------------------
## Switching from TensorFlow to Theano
By default, Keras will use TensorFlow as its tensor manipulation library. [Follow these instructions](http://keras.io/backend/) to configure the Keras backend.
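As a rough illustration of how a backend choice can be resolved, the sketch below reads the `KERAS_BACKEND` environment variable (the same variable the Docker `Makefile` in this changeset passes to the container) and otherwise falls back to a default. The helper `resolve_backend`, the `SUPPORTED_BACKENDS` tuple, and the default argument are assumptions made for this example and are not part of the Keras API; the linked instructions describe the actual configuration mechanism.

```python
import os

# Backend names mentioned in this README (assumed spelling for the example).
SUPPORTED_BACKENDS = ('tensorflow', 'theano')


def resolve_backend(environ=None, default='tensorflow'):
    """Return the backend requested via KERAS_BACKEND, or the default.

    Hypothetical helper for illustration only.
    """
    environ = os.environ if environ is None else environ
    backend = environ.get('KERAS_BACKEND', default)
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError('Unknown backend: ' + str(backend))
    return backend


print(resolve_backend({'KERAS_BACKEND': 'theano'}))
print(resolve_backend({}))
```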
------------------
## Support
You can ask questions and join the development discussion:
- On the [Keras Google group](https://groups.google.com/forum/#!forum/keras-users).
- On the [Keras Gitter channel](https://gitter.im/Keras-io/Lobby).
You can also post bug reports and feature requests in [Github issues](https://github.com/fchollet/keras/issues). Make sure to read [our guidelines](https://github.com/fchollet/keras/blob/master/CONTRIBUTING.md) first.
------------------
## Why this name, Keras?
Keras (κέρας) means _horn_ in Greek. It is a reference to a literary image from ancient Greek and Latin literature, first found in the _Odyssey_, where dream spirits (_Oneiroi_, singular _Oneiros_) are divided between those who deceive men with false visions, who arrive to Earth through a gate of ivory, and those who announce a future that will come to pass, who arrive through a gate of horn. It's a play on the words κέρας (horn) / κραίνω (fulfill), and ἐλέφας (ivory) / ἐλεφαίρομαι (deceive).
Keras was developed as part of the research effort of project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System).
Keras was initially developed as part of the research effort of project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System).
_"Oneiroi are beyond our unravelling --who can be sure what tale they tell? Not all that men look for comes to pass. Two gates there are that give passage to fleeting Oneiroi; one is made of horn, one of ivory. The Oneiroi that pass through sawn ivory are deceitful, bearing a message that will not be fulfilled; those that come out through polished horn have truth behind them, to be accomplished for men who see them."_ Homer, Odyssey 19. 562 ff (Shewring translation).
>_"Oneiroi are beyond our unravelling --who can be sure what tale they tell? Not all that men look for comes to pass. Two gates there are that give passage to fleeting Oneiroi; one is made of horn, one of ivory. The Oneiroi that pass through sawn ivory are deceitful, bearing a message that will not be fulfilled; those that come out through polished horn have truth behind them, to be accomplished for men who see them."_ Homer, Odyssey 19. 562 ff (Shewring translation).
------------------
+46
@@ -0,0 +1,46 @@
FROM nvidia/cuda:7.5-cudnn5-devel
ENV CONDA_DIR /opt/conda
ENV PATH $CONDA_DIR/bin:$PATH
RUN mkdir -p $CONDA_DIR && \
echo export PATH=$CONDA_DIR/bin:'$PATH' > /etc/profile.d/conda.sh && \
apt-get update && \
apt-get install -y wget git libhdf5-dev g++ graphviz && \
wget --quiet https://repo.continuum.io/miniconda/Miniconda3-3.9.1-Linux-x86_64.sh && \
echo "6c6b44acdd0bc4229377ee10d52c8ac6160c336d9cdd669db7371aa9344e1ac3 *Miniconda3-3.9.1-Linux-x86_64.sh" | sha256sum -c - && \
/bin/bash /Miniconda3-3.9.1-Linux-x86_64.sh -f -b -p $CONDA_DIR && \
rm Miniconda3-3.9.1-Linux-x86_64.sh
ENV NB_USER keras
ENV NB_UID 1000
RUN useradd -m -s /bin/bash -N -u $NB_UID $NB_USER && \
mkdir -p $CONDA_DIR && \
chown keras $CONDA_DIR -R && \
mkdir -p /src && \
chown keras /src
USER keras
# Python
ARG python_version=3.5.1
ARG tensorflow_version=0.9.0rc0-cp35-cp35m
RUN conda install -y python=${python_version} && \
pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-${tensorflow_version}-linux_x86_64.whl && \
pip install git+git://github.com/Theano/Theano.git && \
pip install ipdb pytest pytest-cov python-coveralls coverage==3.7.1 pytest-xdist pep8 pytest-pep8 pydot_ng && \
conda install Pillow scikit-learn notebook pandas matplotlib nose pyyaml six h5py && \
pip install git+git://github.com/fchollet/keras.git && \
conda clean -yt
ADD theanorc /home/keras/.theanorc
ENV PYTHONPATH='/src/:$PYTHONPATH'
WORKDIR /src
EXPOSE 8888
CMD jupyter notebook --port=8888 --ip=0.0.0.0
+26
@@ -0,0 +1,26 @@
help:
	@cat Makefile
DATA?="${HOME}/Data"
GPU?=0
DOCKER_FILE=Dockerfile
DOCKER=GPU=$(GPU) nvidia-docker
BACKEND=tensorflow
TEST=tests/
SRC=$(shell dirname `pwd`)
build:
	docker build -t keras --build-arg python_version=3.5 -f $(DOCKER_FILE) .
bash: build
	$(DOCKER) run -it -v $(SRC):/src -v $(DATA):/data --env KERAS_BACKEND=$(BACKEND) keras bash
ipython: build
	$(DOCKER) run -it -v $(SRC):/src -v $(DATA):/data --env KERAS_BACKEND=$(BACKEND) keras ipython
notebook: build
	$(DOCKER) run -it -v $(SRC):/src -v $(DATA):/data --net=host --env KERAS_BACKEND=$(BACKEND) keras
test: build
	$(DOCKER) run -it -v $(SRC):/src -v $(DATA):/data --env KERAS_BACKEND=$(BACKEND) keras py.test $(TEST)
+58
@@ -0,0 +1,58 @@
# Using Keras via Docker
This directory contains a `Dockerfile` to make it easy to get up and running with
Keras via [Docker](http://www.docker.com/).
## Installing Docker
General installation instructions are
[on the Docker site](https://docs.docker.com/installation/), but we give some
quick links here:
* [OSX](https://docs.docker.com/installation/mac/): [docker toolbox](https://www.docker.com/toolbox)
* [ubuntu](https://docs.docker.com/installation/ubuntulinux/)
## Running the container
We use a `Makefile` to wrap the Docker commands in make targets.

Build the container and start a Jupyter notebook

    $ make notebook

Build the container and start an IPython shell

    $ make ipython

Build the container and start a bash shell

    $ make bash

For GPU support, install the NVIDIA drivers (ideally the latest) and
[nvidia-docker](https://github.com/NVIDIA/nvidia-docker). Run using

    $ make notebook GPU=0 # or [ipython, bash]

Switch between Theano and TensorFlow

    $ make notebook BACKEND=theano
    $ make notebook BACKEND=tensorflow

Mount a volume for external data sets

    $ make DATA=~/mydata

Print all make tasks

    $ make help

You can change Theano parameters by editing `/docker/theanorc`.
Note: If you have problems running nvidia-docker, you can fall back to the older
approach shown below, although it is not recommended. If you find a bug in nvidia-docker,
please report it to that project and keep using nvidia-docker as described above.

    $ export CUDA_SO=$(\ls /usr/lib/x86_64-linux-gnu/libcuda.* | xargs -I{} echo '-v {}:{}')
    $ export DEVICES=$(\ls /dev/nvidia* | xargs -I{} echo '--device {}:{}')
    $ docker run -it -p 8888:8888 $CUDA_SO $DEVICES gcr.io/tensorflow/tensorflow:latest-gpu
+5
@@ -0,0 +1,5 @@
[global]
floatX = float32
optimizer=None
device = gpu
+5 -2
@@ -5,5 +5,8 @@ Our documentation uses extended Markdown, as implemented by [MkDocs](http://mkdo
## Building the documentation
- install MkDocs: `sudo pip install mkdocs`
- `cd` to the `docs/` folder and run: `mkdocs serve`
- install MkDocs: `pip install mkdocs`
- `cd` to the `docs/` folder and run:
    - `python autogen.py`
    - `mkdocs serve` # Starts a local webserver: [localhost:8000](http://localhost:8000)
    - `mkdocs build` # Builds a static site in "site" directory
+483
@@ -0,0 +1,483 @@
# -*- coding: utf-8 -*-
'''
General documentation architecture:
Home
Index
- Getting started
Getting started with the sequential model
Getting started with the functional api
Examples
FAQ
Installation guide
- Models
About Keras models
explain when one should use Sequential or functional API
explain compilation step
explain weight saving, weight loading
explain serialization, deserialization
Sequential
Model (functional API)
- Layers
About Keras layers
explain common layer functions: get_weights, set_weights, get_config
explain input_shape
explain usage on non-Keras tensors
Core layers
Convolutional
Recurrent
Embeddings
Normalization
Advanced activations
Noise
- Preprocessing
Image preprocessing
Text preprocessing
Sequence preprocessing
Objectives
Metrics
Optimizers
Activations
Callbacks
Datasets
Backend
Initializations
Regularizers
Constraints
Visualization
Scikit-learn API
'''
from __future__ import print_function
from __future__ import unicode_literals
import re
import inspect
import os
import shutil
import sys
if sys.version[0] == '2':
    reload(sys)
    sys.setdefaultencoding('utf8')
from keras.layers import convolutional
from keras.layers import pooling
from keras.layers import local
from keras.layers import recurrent
from keras.layers import core
from keras.layers import noise
from keras.layers import normalization
from keras.layers import advanced_activations
from keras.layers import embeddings
from keras.layers import wrappers
from keras import optimizers
from keras import callbacks
from keras import models
from keras.engine import topology
from keras import objectives
from keras import metrics
from keras import backend
from keras import constraints
from keras import activations
from keras import regularizers
from keras.utils import data_utils
from keras.utils import io_utils
from keras.utils import layer_utils
from keras.utils import np_utils
EXCLUDE = {
'Optimizer',
'Wrapper',
'get_session',
'set_session',
'CallbackList',
}
PAGES = [
{
'page': 'models/sequential.md',
'functions': [
models.Sequential.compile,
models.Sequential.fit,
models.Sequential.evaluate,
models.Sequential.predict,
models.Sequential.predict_classes,
models.Sequential.predict_proba,
models.Sequential.train_on_batch,
models.Sequential.test_on_batch,
models.Sequential.predict_on_batch,
models.Sequential.fit_generator,
models.Sequential.evaluate_generator,
models.Sequential.predict_generator,
],
},
{
'page': 'models/model.md',
'functions': [
models.Model.compile,
models.Model.fit,
models.Model.evaluate,
models.Model.predict,
models.Model.train_on_batch,
models.Model.test_on_batch,
models.Model.predict_on_batch,
models.Model.fit_generator,
models.Model.evaluate_generator,
models.Model.predict_generator,
models.Model.get_layer,
]
},
{
'page': 'layers/core.md',
'classes': [
core.Dense,
core.Activation,
core.Dropout,
core.SpatialDropout1D,
core.SpatialDropout2D,
core.SpatialDropout3D,
core.Flatten,
core.Reshape,
core.Permute,
core.RepeatVector,
topology.Merge,
core.Lambda,
core.ActivityRegularization,
core.Masking,
core.Highway,
core.MaxoutDense,
core.TimeDistributedDense,
],
},
{
'page': 'layers/convolutional.md',
'classes': [
convolutional.Convolution1D,
convolutional.AtrousConvolution1D,
convolutional.Convolution2D,
convolutional.AtrousConvolution2D,
convolutional.SeparableConvolution2D,
convolutional.Deconvolution2D,
convolutional.Convolution3D,
convolutional.Cropping1D,
convolutional.Cropping2D,
convolutional.Cropping3D,
convolutional.UpSampling1D,
convolutional.UpSampling2D,
convolutional.UpSampling3D,
convolutional.ZeroPadding1D,
convolutional.ZeroPadding2D,
convolutional.ZeroPadding3D,
],
},
{
'page': 'layers/pooling.md',
'classes': [
pooling.MaxPooling1D,
pooling.MaxPooling2D,
pooling.MaxPooling3D,
pooling.AveragePooling1D,
pooling.AveragePooling2D,
pooling.AveragePooling3D,
pooling.GlobalMaxPooling1D,
pooling.GlobalAveragePooling1D,
pooling.GlobalMaxPooling2D,
pooling.GlobalAveragePooling2D,
],
},
{
'page': 'layers/local.md',
'classes': [
local.LocallyConnected1D,
local.LocallyConnected2D,
],
},
{
'page': 'layers/recurrent.md',
'classes': [
recurrent.Recurrent,
recurrent.SimpleRNN,
recurrent.GRU,
recurrent.LSTM,
],
},
{
'page': 'layers/embeddings.md',
'classes': [
embeddings.Embedding,
],
},
{
'page': 'layers/normalization.md',
'classes': [
normalization.BatchNormalization,
],
},
{
'page': 'layers/advanced-activations.md',
'all_module_classes': [advanced_activations],
},
{
'page': 'layers/noise.md',
'all_module_classes': [noise],
},
{
'page': 'layers/wrappers.md',
'all_module_classes': [wrappers],
},
{
'page': 'metrics.md',
'all_module_functions': [metrics],
},
{
'page': 'optimizers.md',
'all_module_classes': [optimizers],
},
{
'page': 'callbacks.md',
'all_module_classes': [callbacks],
},
{
'page': 'backend.md',
'all_module_functions': [backend],
},
{
'page': 'utils/data_utils.md',
'functions': [
data_utils.get_file,
]
},
{
'page': 'utils/io_utils.md',
'classes': [
io_utils.HDF5Matrix
],
},
{
'page': 'utils/layer_utils.md',
'functions': [
layer_utils.layer_from_config,
]
},
{
'page': 'utils/np_utils.md',
'all_module_functions': [np_utils]
},
]
ROOT = 'http://keras.io/'
def get_earliest_class_that_defined_member(member, cls):
    ancestors = get_classes_ancestors([cls])
    result = None
    for ancestor in ancestors:
        if member in dir(ancestor):
            result = ancestor
    if not result:
        return cls
    return result
def get_classes_ancestors(classes):
    ancestors = []
    for cls in classes:
        ancestors += cls.__bases__
    filtered_ancestors = []
    for ancestor in ancestors:
        if ancestor.__name__ in ['object']:
            continue
        filtered_ancestors.append(ancestor)
    if filtered_ancestors:
        return filtered_ancestors + get_classes_ancestors(filtered_ancestors)
    else:
        return filtered_ancestors
def get_function_signature(function, method=True):
    signature = inspect.getargspec(function)
    defaults = signature.defaults
    if method:
        args = signature.args[1:]
    else:
        args = signature.args
    if defaults:
        kwargs = zip(args[-len(defaults):], defaults)
        args = args[:-len(defaults)]
    else:
        kwargs = []
    st = '%s.%s(' % (function.__module__, function.__name__)
    for a in args:
        st += str(a) + ', '
    for a, v in kwargs:
        if type(v) == str:
            v = '\'' + v + '\''
        st += str(a) + '=' + str(v) + ', '
    if kwargs or args:
        return st[:-2] + ')'
    else:
        return st + ')'
def get_class_signature(cls):
    try:
        class_signature = get_function_signature(cls.__init__)
        class_signature = class_signature.replace('__init__', cls.__name__)
    except:
        # in case the class inherits from object and does not
        # define __init__
        class_signature = cls.__module__ + '.' + cls.__name__ + '()'
    return class_signature
def class_to_docs_link(cls):
    module_name = cls.__module__
    assert module_name[:6] == 'keras.'
    module_name = module_name[6:]
    link = ROOT + module_name.replace('.', '/') + '#' + cls.__name__.lower()
    return link
def class_to_source_link(cls):
    module_name = cls.__module__
    assert module_name[:6] == 'keras.'
    path = module_name.replace('.', '/')
    path += '.py'
    line = inspect.getsourcelines(cls)[-1]
    link = 'https://github.com/fchollet/keras/blob/master/' + path + '#L' + str(line)
    return '[[source]](' + link + ')'
def code_snippet(snippet):
    result = '```python\n'
    result += snippet + '\n'
    result += '```\n'
    return result
def process_class_docstring(docstring):
    docstring = re.sub(r'\n    # (.*)\n',
                       r'\n    __\1__\n\n',
                       docstring)
    docstring = re.sub(r'    ([^\s\\]+):(.*)\n',
                       r'    - __\1__:\2\n',
                       docstring)
    docstring = docstring.replace('    ' * 5, '\t\t')
    docstring = docstring.replace('    ' * 3, '\t')
    docstring = docstring.replace('    ', '')
    return docstring
def process_function_docstring(docstring):
    docstring = re.sub(r'\n    # (.*)\n',
                       r'\n    __\1__\n\n',
                       docstring)
    docstring = re.sub(r'\n        # (.*)\n',
                       r'\n        __\1__\n\n',
                       docstring)
    docstring = re.sub(r'    ([^\s\\]+):(.*)\n',
                       r'    - __\1__:\2\n',
                       docstring)
    docstring = docstring.replace('    ' * 6, '\t\t')
    docstring = docstring.replace('    ' * 4, '\t')
    docstring = docstring.replace('    ', '')
    return docstring
print('Cleaning up existing sources directory.')
if os.path.exists('sources'):
    shutil.rmtree('sources')
print('Populating sources directory with templates.')
for subdir, dirs, fnames in os.walk('templates'):
    for fname in fnames:
        new_subdir = subdir.replace('templates', 'sources')
        if not os.path.exists(new_subdir):
            os.makedirs(new_subdir)
        if fname[-3:] == '.md':
            fpath = os.path.join(subdir, fname)
            new_fpath = fpath.replace('templates', 'sources')
            shutil.copy(fpath, new_fpath)
print('Starting autogeneration.')
for page_data in PAGES:
    blocks = []
    classes = page_data.get('classes', [])
    for module in page_data.get('all_module_classes', []):
        module_classes = []
        for name in dir(module):
            if name[0] == '_' or name in EXCLUDE:
                continue
            module_member = getattr(module, name)
            if inspect.isclass(module_member):
                cls = module_member
                if cls.__module__ == module.__name__:
                    if cls not in module_classes:
                        module_classes.append(cls)
        module_classes.sort(key=lambda x: id(x))
        classes += module_classes
    for cls in classes:
        subblocks = []
        signature = get_class_signature(cls)
        subblocks.append('<span style="float:right;">' + class_to_source_link(cls) + '</span>')
        subblocks.append('### ' + cls.__name__ + '\n')
        subblocks.append(code_snippet(signature))
        docstring = cls.__doc__
        if docstring:
            subblocks.append(process_class_docstring(docstring))
        blocks.append('\n'.join(subblocks))
    functions = page_data.get('functions', [])
    for module in page_data.get('all_module_functions', []):
        module_functions = []
        for name in dir(module):
            if name[0] == '_' or name in EXCLUDE:
                continue
            module_member = getattr(module, name)
            if inspect.isfunction(module_member):
                function = module_member
                if module.__name__ in function.__module__:
                    if function not in module_functions:
                        module_functions.append(function)
        module_functions.sort(key=lambda x: id(x))
        functions += module_functions
    for function in functions:
        subblocks = []
        signature = get_function_signature(function, method=False)
        signature = signature.replace(function.__module__ + '.', '')
        subblocks.append('### ' + function.__name__ + '\n')
        subblocks.append(code_snippet(signature))
        docstring = function.__doc__
        if docstring:
            subblocks.append(process_function_docstring(docstring))
        blocks.append('\n\n'.join(subblocks))
    mkdown = '\n----\n\n'.join(blocks)
    # save module page.
    # Either insert content into existing page,
    # or create page otherwise
    page_name = page_data['page']
    path = os.path.join('sources', page_name)
    if os.path.exists(path):
        template = open(path).read()
        assert '{{autogenerated}}' in template, ('Template found for ' + path +
                                                 ' but missing {{autogenerated}} tag.')
        mkdown = template.replace('{{autogenerated}}', mkdown)
        print('...inserting autogenerated content into template:', path)
    else:
        print('...creating new page with autogenerated content:', path)
    subdir = os.path.dirname(path)
    if not os.path.exists(subdir):
        os.makedirs(subdir)
    open(path, 'w').write(mkdown)
+47 -30
@@ -2,42 +2,59 @@ site_name: Keras Documentation
theme: readthedocs
docs_dir: sources
repo_url: http://github.com/fchollet/keras
site_url: /
#theme_dir: theme
site_description: Documentation for fast and lightweight Keras Deep Learning library.
include_404: true
include_search: true
site_url: http://keras.io/
# theme_dir: theme
site_description: 'Documentation for Keras, the Python Deep Learning library.'
dev_addr: '0.0.0.0:8000'
google_analytics: ['UA-61785484-1', 'keras.io']
pages:
- [index.md, Home]
- [documentation.md, Index]
- Home: index.md
- Getting started:
- Guide to the Sequential model: getting-started/sequential-model-guide.md
- Guide to the Functional API: getting-started/functional-api-guide.md
- FAQ: getting-started/faq.md
- Models:
- About Keras models: models/about-keras-models.md
- Sequential: models/sequential.md
- Model (functional API): models/model.md
- Layers:
- About Keras layers: layers/about-keras-layers.md
- Core Layers: layers/core.md
- Convolutional Layers: layers/convolutional.md
- Pooling Layers: layers/pooling.md
- Locally-connected Layers: layers/local.md
- Recurrent Layers: layers/recurrent.md
- Embedding Layers: layers/embeddings.md
- Advanced Activations Layers: layers/advanced-activations.md
- Normalization Layers: layers/normalization.md
- Noise layers: layers/noise.md
- Layer wrappers: layers/wrappers.md
- Writing your own Keras layers: layers/writing-your-own-keras-layers.md
- Preprocessing:
- Sequence Preprocessing: preprocessing/sequence.md
- Text Preprocessing: preprocessing/text.md
- Image Preprocessing: preprocessing/image.md
- Objectives: objectives.md
- Metrics: metrics.md
- Optimizers: optimizers.md
- Activations: activations.md
- Callbacks: callbacks.md
- Datasets: datasets.md
- Applications: applications.md
- Backend: backend.md
- Initializations: initializations.md
- Regularizers: regularizers.md
- Constraints: constraints.md
- Visualization: visualization.md
- Scikit-learn API: scikit-learn-api.md
- Utils:
- Data Utils: utils/data_utils.md
- I/O Utils: utils/io_utils.md
- Layer Utils: utils/layer_utils.md
- Numpy Utils: utils/np_utils.md
- [examples.md, Examples]
- [optimizers.md, Optimizers]
- [objectives.md, Objectives]
- [models.md, Models]
- [activations.md, Activations]
- [initializations.md, Initializations]
- [regularizers.md, Regularizers]
- [constraints.md, Constraints]
- [callbacks.md, Callbacks]
- [datasets.md, Datasets]
- [layers/core.md, Layers, Core Layers]
- [layers/convolutional.md, Layers, Convolutional Layers]
- [layers/recurrent.md, Layers, Recurrent Layers]
- [layers/advanced_activations.md, Layers, Advanced Activations Layers]
- [layers/normalization.md, Layers, Normalization Layers]
- [layers/embeddings.md, Layers, Embedding Layers]
- [layers/containers.md, Layers, Containers]
- [preprocessing/sequence.md, Preprocessing, Sequence Preprocessing]
- [preprocessing/text.md, Preprocessing, Text Preprocessing]
- [preprocessing/image.md, Preprocessing, Image Preprocessing]
- [utils/visualization.md, Utils, Visualization Utilities]
-40
@@ -1,40 +0,0 @@
## Usage of activations
Activations can either be used through an `Activation` layer, or through the `activation` argument supported by all forward layers:
```python
from keras.layers.core import Activation, Dense
model.add(Dense(20, 64, init='uniform'))
model.add(Activation('tanh'))
```
is equivalent to:
```python
model.add(Dense(20, 64, init='uniform', activation='tanh'))
```
You can also pass an element-wise Theano function as an activation:
```python
import theano.tensor

def tanh(x):
    # element-wise hyperbolic tangent applied to a Theano tensor
    return theano.tensor.tanh(x)

model.add(Dense(20, 64, init='uniform', activation=tanh))
model.add(Activation(tanh))
```
## Available activations
- __softmax__: Should only be applied to 2D layers (expected shape: `(nb_samples, nb_dims)`).
- __time_distributed_softmax__: Softmax applied to every sample at every timestep of a layer of shape `(nb_samples, nb_timesteps, nb_dims)`.
- __softplus__
- __relu__
- __tanh__
- __sigmoid__
- __hard_sigmoid__
- __linear__
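
For intuition, the activations listed above can be written as plain scalar functions. The sketch below uses only the Python standard library and the commonly used definitions (for example, `hard_sigmoid` as a clipped line with slope 0.2 and offset 0.5); it illustrates the math and is not the backend implementation that Keras runs on tensors.

```python
import math


def relu(x):
    return max(0.0, x)


def softplus(x):
    # log(1 + exp(x)), arranged to stay stable for large |x|.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x):
    # Branch on the sign so that exp() never receives a large positive value.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def hard_sigmoid(x):
    # Piecewise-linear approximation of the sigmoid, clipped to [0, 1].
    return min(1.0, max(0.0, 0.2 * x + 0.5))


def softmax(row):
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(row)
    shifted = [math.exp(v - peak) for v in row]
    total = sum(shifted)
    return [v / total for v in shifted]


probs = softmax([1.0, 2.0, 3.0])
print(probs)
```

`softmax` operates on one row of scores at a time, which matches the 2D expectation noted above: one probability distribution per sample.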
## On Advanced Activations
Activations that are more complex than a simple Theano function (e.g. learnable activations, configurable activations, etc.) are available as [Advanced Activation layers](layers/advanced_activations.md), and can be found in the module `keras.layers.advanced_activations`. These include PReLU and LeakyReLU.
-91
@@ -1,91 +0,0 @@
## Usage of callbacks
A callback is a set of functions to be applied at given stages of the training procedure. You can use callbacks to get a view on internal states and statistics of the model during training. You can pass a list of callbacks (as the keyword argument `callbacks`) to the `.fit()` method of the `Sequential` model. The relevant methods of the callbacks will then be called at each stage of the training.
---
## Base class
```python
keras.callbacks.Callback()
```
- __Properties__:
- __params__: dict. Training parameters (eg. verbosity, batch size, number of epochs...).
- __model__: `keras.models.Model`. Reference of the model being trained.
- __Methods__:
- __on_train_begin__(logs={}): Method called at the beginning of training.
- __on_train_end__(logs={}): Method called at the end of training.
- __on_epoch_begin__(epoch, logs={}): Method called at the beginning of epoch `epoch`.
- __on_epoch_end__(epoch, logs={}): Method called at the end of epoch `epoch`.
- __on_batch_begin__(batch, logs={}): Method called at the beginning of batch `batch`.
- __on_batch_end__(batch, logs={}): Method called at the end of batch `batch`.
The `logs` dictionary will contain keys for quantities relevant to the current batch or epoch. Currently, the `.fit()` method of the `Sequential` model class will include the following quantities in the `logs` that it passes to its callbacks:
- __on_epoch_end__: logs optionally include `val_loss` (if validation is enabled in `fit`), and `val_accuracy` (if validation and accuracy monitoring are enabled).
- __on_batch_begin__: logs include `size`, the number of samples in the current batch.
- __on_batch_end__: logs include `loss`, and optionally `accuracy` (if accuracy monitoring is enabled).
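The hook order and `logs` contents described above can be sketched with a toy loop. This is not the Keras training loop: `Callback`, `toy_fit` and the loss values are hypothetical stand-ins, used only to show when each method would fire and what it would receive.

```python
class Callback(object):
    """Hypothetical stand-in for the base class: every hook is a no-op."""
    def on_train_begin(self, logs=None): pass
    def on_train_end(self, logs=None): pass
    def on_epoch_begin(self, epoch, logs=None): pass
    def on_epoch_end(self, epoch, logs=None): pass
    def on_batch_begin(self, batch, logs=None): pass
    def on_batch_end(self, batch, logs=None): pass


class LossHistory(Callback):
    def on_train_begin(self, logs=None):
        self.losses = []

    def on_batch_end(self, batch, logs=None):
        self.losses.append((logs or {}).get('loss'))


def toy_fit(batch_losses, nb_epoch, callbacks):
    """Fire the hooks in training order; batch_losses are made-up values."""
    for cb in callbacks:
        cb.on_train_begin({})
    for epoch in range(nb_epoch):
        for cb in callbacks:
            cb.on_epoch_begin(epoch, {})
        for batch, loss in enumerate(batch_losses):
            for cb in callbacks:
                cb.on_batch_begin(batch, {'size': 1})
            for cb in callbacks:
                cb.on_batch_end(batch, {'loss': loss})
        for cb in callbacks:
            cb.on_epoch_end(epoch, {})
    for cb in callbacks:
        cb.on_train_end({})


history = LossHistory()
toy_fit([0.9, 0.7], nb_epoch=2, callbacks=[history])
```

After the call, `history.losses` holds one entry per batch per epoch.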
---
## Create a callback
You can create a custom callback by extending the base class `keras.callbacks.Callback`. A callback has access to its associated model through the class property `self.model`.
Here's a simple example saving a list of losses over each batch during training:
```python
import keras.callbacks

class LossHistory(keras.callbacks.Callback):
    # hooks receive a `logs` dict, so the signature must accept it
    def on_train_begin(self, logs={}):
        self.losses = []

    def on_batch_end(self, batch, logs={}):
        self.losses.append(logs.get('loss'))
```
---
### Example to record the loss history
```python
import keras.callbacks
from keras.models import Sequential
from keras.layers.core import Dense, Activation

class LossHistory(keras.callbacks.Callback):
    def on_train_begin(self, logs={}):
        self.losses = []

    def on_batch_end(self, batch, logs={}):
        self.losses.append(logs.get('loss'))

model = Sequential()
model.add(Dense(784, 10, init='uniform'))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

# X_train and Y_train are assumed to be already loaded numpy arrays
history = LossHistory()
model.fit(X_train, Y_train, batch_size=128, nb_epoch=20, verbose=0, callbacks=[history])
print(history.losses)
# outputs
'''
[0.66047596406559383, 0.3547245744908703, ..., 0.25953155204159617, 0.25901699725311789]
'''
```
---
### Example to checkpoint models
```python
from keras.callbacks import ModelCheckpoint
model = Sequential()
model.add(Dense(784, 10, init='uniform'))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
'''
saves the model weights after each epoch if the validation loss decreased
'''
checkpointer = ModelCheckpoint(filepath="/tmp/weights.hdf5", verbose=1, save_best_only=True)
model.fit(X_train, Y_train, batch_size=128, nb_epoch=20, verbose=0, validation_data=(X_test, Y_test), callbacks=[checkpointer])
```
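The `save_best_only=True` behaviour used above can be pictured as a small comparison against the best validation loss seen so far. The sketch below is an assumption about the decision rule only. `BestOnlyTracker` is a hypothetical helper, not part of Keras, and it does not write any file.

```python
class BestOnlyTracker(object):
    """Hypothetical helper: decides whether an epoch's weights should be saved."""

    def __init__(self):
        self.best = float('inf')

    def should_save(self, val_loss):
        # Save only on strict improvement over the best value seen so far
        if val_loss < self.best:
            self.best = val_loss
            return True
        return False


tracker = BestOnlyTracker()
# Made-up validation losses for four epochs
decisions = [tracker.should_save(v) for v in [0.50, 0.42, 0.45, 0.40]]
```

With these made-up losses, the third epoch is skipped because 0.45 does not improve on 0.42.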
@@ -1,35 +0,0 @@
# Keras Documentation Index
## Introduction
- [Home](index.md)
- [Index](documentation.md)
- [Examples](examples.md)
---
## Base functionality
- [Optimizers](optimizers.md)
- [Objectives](objectives.md)
- [Models](models.md)
- [Activations](activations.md)
- [Initializations](initializations.md)
- [Datasets](datasets.md)
---
## Layers
- [Core](layers/core.md)
- [Convolutional](layers/convolutional.md)
- [Recurrent](layers/recurrent.md)
- [Advanced Activations](layers/advanced_activations.md)
- [Normalization](layers/normalization.md)
- [Embeddings](layers/embeddings.md)
---
## Preprocessing
- [Sequence](preprocessing/sequence.md)
- [Text](preprocessing/text.md)
- [Image](preprocessing/image.md)
@@ -1,161 +0,0 @@
Here are a few examples to get you started!
### Multilayer Perceptron (MLP)
```python
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD
model = Sequential()
model.add(Dense(20, 64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, 64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, 2, init='uniform'))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
model.fit(X_train, y_train, nb_epoch=20, batch_size=16)
score = model.evaluate(X_test, y_test, batch_size=16)
```
---
### Alternative implementation of MLP
```python
model = Sequential()
model.add(Dense(20, 64, init='uniform', activation='tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, 64, init='uniform', activation='tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, 2, init='uniform', activation='softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
```
---
### VGG-like convnet
```python
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.optimizers import SGD
model = Sequential()
model.add(Convolution2D(32, 3, 3, 3, border_mode='full'))
model.add(Activation('relu'))
model.add(Convolution2D(32, 32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(poolsize=(2, 2)))
model.add(Dropout(0.25))
model.add(Convolution2D(64, 32, 3, 3, border_mode='full'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(poolsize=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(64*8*8, 256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(256, 10))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
model.fit(X_train, Y_train, batch_size=32, nb_epoch=1)
```
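The `64*8*8` input size of the first `Dense` layer above follows from the spatial size after the convolution and pooling stages. The sketch below reproduces that arithmetic, assuming 32x32 input images (the input size is not stated in the example) and the usual output sizes for `'valid'` and `'full'` convolutions. The helper names are hypothetical.

```python
def conv_size(n, k, border_mode='valid'):
    # 'valid' keeps only fully overlapping positions; 'full' counts every partial overlap
    if border_mode == 'valid':
        return n - k + 1
    if border_mode == 'full':
        return n + k - 1
    raise ValueError('unknown border_mode: %r' % border_mode)


def pool_size(n, p):
    # Non-overlapping pooling with incomplete windows dropped
    return n // p


def vgg_like_flat_size(side=32, filters=(32, 64)):
    n = side
    for _ in filters:
        n = conv_size(n, 3, 'full')   # 3x3 convolution, border_mode='full'
        n = conv_size(n, 3, 'valid')  # 3x3 convolution, default border mode
        n = pool_size(n, 2)           # 2x2 max pooling
    # Flatten: feature maps of the last stage times the remaining spatial area
    return filters[-1] * n * n
```

Under the same assumption, three such stages with 128 feature maps in the last one give `128*4*4`.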
---
### Sequence classification with LSTM
```python
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM
model = Sequential()
model.add(Embedding(max_features, 256))
model.add(LSTM(256, 128, activation='sigmoid', inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(128, 1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop')
model.fit(X_train, Y_train, batch_size=16, nb_epoch=10)
score = model.evaluate(X_test, Y_test, batch_size=16)
```
---
### Architecture for learning image captions with a convnet and a Gated Recurrent Unit
(word-level embedding, caption of maximum length 16 words).
Note that getting this to actually "work" will require using a bigger convnet, initialized with pre-trained weights.
Displaying readable results will also require an embedding decoder.
```python
max_caption_len = 16
model = Sequential()
model.add(Convolution2D(32, 3, 3, 3, border_mode='full'))
model.add(Activation('relu'))
model.add(Convolution2D(32, 32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(poolsize=(2, 2)))
model.add(Convolution2D(64, 32, 3, 3, border_mode='full'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(poolsize=(2, 2)))
model.add(Convolution2D(128, 64, 3, 3, border_mode='full'))
model.add(Activation('relu'))
model.add(Convolution2D(128, 128, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(poolsize=(2, 2)))
model.add(Flatten())
model.add(Dense(128*4*4, 256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(RepeatVector(max_caption_len))
# the GRU below returns sequences of max_caption_len vectors of size 256 (our word embedding size)
model.add(GRU(256, 256, return_sequences=True))
model.compile(loss='mean_squared_error', optimizer='rmsprop')
# "images" is a numpy array of shape (nb_samples, nb_channels=3, width, height)
# "captions" is a numpy array of shape (nb_samples, max_caption_len=16, embedding_dim=256)
# captions are supposed already embedded (dense vectors).
model.fit(images, captions, batch_size=16, nb_epoch=100)
```
---
In the [examples folder](https://github.com/fchollet/keras/tree/master/examples), you will find example models for real datasets:
- CIFAR10 small images classification: Convnet with realtime data augmentation
- IMDB movie review sentiment classification: LSTM over sequences of words
- Reuters newswires topic classification: Multilayer Perceptron
@@ -1,131 +0,0 @@
# Keras: Theano-based Deep Learning library
## Overview
Keras is a minimalist, highly modular neural network library in the spirit of Torch, written in Python, that uses [Theano](http://deeplearning.net/software/theano/) under the hood for fast tensor manipulation on GPU and CPU. It was developed with a focus on enabling fast experimentation.
Use Keras if you need a deep learning library that:
- allows for easy and fast prototyping (through total modularity, minimalism, and extensibility).
- supports both __convolutional networks__ and __recurrent networks__ (LSTM, GRU, etc), as well as combinations of the two.
- runs seamlessly on the CPU and the GPU.
## Guiding principles
- __Modularity.__ A model is understood as a sequence of standalone, fully-configurable modules that can be plugged together with as few restrictions as possible. In particular, neural layers, cost functions, optimizers, initialization schemes, activation functions and dropout are all standalone modules that you can combine to create new models.
- __Minimalism.__ Each module should be kept short and simple (<100 lines of code). Every piece of code should be transparent upon first reading. No black magic: it hurts iteration speed and ability to innovate.
- __Easy extensibility.__ New features (a new module, per the above definition, or a new way to combine modules together) are dead simple to add (as new classes/functions), and existing modules provide ample examples.
- __Work with Python__. No separate model configuration files in a declarative format (like in Caffe or PyLearn2). Models are described in Python code, which is compact, easier to debug, benefits from syntax highlighting, and most of all, allows for ease of extensibility.
## Code
Find the code on Github: [fchollet/keras](https://github.com/fchollet/keras).
## License
Keras is licensed under the [MIT license](http://opensource.org/licenses/MIT).
## Getting started: 30 seconds to Keras
The core data structure of Keras is a __model__, a way to organize layers. Here's a sequential model (a linear stack of layers).
```python
from keras.models import Sequential
model = Sequential()
```
Stacking layers is as easy as `.add()`:
```python
from keras.layers.core import Dense, Activation
model.add(Dense(input_dim=100, output_dim=64, init="uniform"))
model.add(Activation("relu"))
model.add(Dense(input_dim=64, output_dim=10, init="uniform"))
model.add(Activation("softmax"))
```
Once your model looks good, configure its learning process with `.compile()`:
```python
model.compile(loss='categorical_crossentropy', optimizer='sgd')
```
If you need to, you can further configure your optimizer. A core principle of Keras is to make things reasonably simple, while allowing the user to be fully in control when they need to (the ultimate control being the easy extensibility of the source code).
```python
from keras.optimizers import SGD
model.compile(loss='categorical_crossentropy', optimizer=SGD(lr=0.01, momentum=0.9, nesterov=True))
```
You can now iterate on your training data in batches:
```python
model.fit(X_train, Y_train, nb_epoch=5, batch_size=32)
```
Alternatively, you can feed batches to your model manually:
```python
model.train(X_batch, Y_batch)
```
Evaluate your performance in one line:
```python
objective_score = model.evaluate(X_test, Y_test, batch_size=32)
```
Or generate predictions on new data:
```python
classes = model.predict_classes(X_test, batch_size=32)
proba = model.predict_proba(X_test, batch_size=32)
```
Building a network of LSTMs, a deep CNN, a word2vec embedder or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful?
Have a look at the [examples](examples.md).
## Installation
Keras uses the following dependencies:
- numpy, scipy
- Theano
- See [installation instructions](http://deeplearning.net/software/theano/install.html#install).
- HDF5 and h5py (optional, required if you use model saving/loading functions)
- Optional but recommended if you use CNNs: cuDNN.
Once you have the dependencies installed, clone the repo:
```bash
git clone https://github.com/fchollet/keras.git
```
Go to the Keras folder and run the install command:
```bash
cd keras
sudo python setup.py install
```
## Support
You can ask questions and join the development discussion on the [Keras Google group](https://groups.google.com/forum/#!forum/keras-users).
## Contribution Guidelines
Keras welcomes all contributions from the community.
- Keep a pragmatic mindset and avoid bloat. Only add to the source if that is the only path forward.
- New features should be documented. Make sure you update the documentation along with your Pull Request.
- The documentation for every new feature should include a usage example in the form of a code snippet.
- All changes should be tested. A formal test process will be introduced very soon.
- Even if you don't contribute to the Keras source code, if you have an application of Keras that is concise and powerful, please consider adding it to our collection of [examples](https://github.com/fchollet/keras/tree/master/examples).
## Why this name, Keras?
Keras (κέρας) means _horn_ in Greek. It is a reference to a literary image from ancient Greek and Latin literature, first found in the _Odyssey_, where dream spirits (_Oneiroi_, singular _Oneiros_) are divided between those who deceive men with false visions, who arrive on Earth through a gate of ivory, and those who announce a future that will come to pass, who arrive through a gate of horn. It's a play on the words κέρας (horn) / κραίνω (fulfill), and ἐλέφας (ivory) / ἐλεφαίρομαι (deceive).
Keras was developed as part of the research effort of project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System).
> _"Oneiroi are beyond our unravelling --who can be sure what tale they tell? Not all that men look for comes to pass. Two gates there are that give passage to fleeting Oneiroi; one is made of horn, one of ivory. The Oneiroi that pass through sawn ivory are deceitful, bearing a message that will not be fulfilled; those that come out through polished horn have truth behind them, to be accomplished for men who see them."_
> -- Homer, Odyssey 19. 562 ff (Shewring translation).
@@ -1,23 +0,0 @@
# Initializations
## Usage of initializations
Initializations define the probability distribution used to set the initial random weights of Keras layers.
The keyword arguments used for passing initializations to layers will depend on the layer. Usually it is simply `init`:
```python
model.add(Dense(64, 64, init='uniform'))
```
## Available initializations
- __uniform__
- __lecun_uniform__: Uniform initialization scaled by the square root of the number of inputs (LeCun 98).
- __normal__
- __orthogonal__: Use with square 2D layers (`shape[0] == shape[1]`).
- __zero__
- __glorot_normal__: Gaussian initialization scaled by fan_in + fan_out (Glorot 2010)
- __glorot_uniform__
- __he_normal__: Gaussian initialization scaled by fan_in (He et al., 2014)
- __he_uniform__
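To make the "scaled by fan_in" and "scaled by fan_in + fan_out" descriptions above concrete, here is a pure-Python sketch of the scale factors as they are usually given in the cited papers. Whether this version of Keras uses exactly these constants is an assumption, and all function names are hypothetical.

```python
import math
import random


def lecun_uniform_limit(fan_in):
    # Uniform in [-limit, limit] with limit = sqrt(3 / fan_in)
    return math.sqrt(3.0 / fan_in)


def glorot_normal_std(fan_in, fan_out):
    # Gaussian with std = sqrt(2 / (fan_in + fan_out))
    return math.sqrt(2.0 / (fan_in + fan_out))


def glorot_uniform_limit(fan_in, fan_out):
    # Uniform in [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out))
    return math.sqrt(6.0 / (fan_in + fan_out))


def he_normal_std(fan_in):
    # Gaussian with std = sqrt(2 / fan_in)
    return math.sqrt(2.0 / fan_in)


def sample_uniform(fan_in, fan_out, limit, rng):
    # Weight matrix of shape (fan_in, fan_out) as nested lists
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


rng = random.Random(0)
W = sample_uniform(64, 64, glorot_uniform_limit(64, 64), rng)
```

For a `Dense(64, 64)` layer, `fan_in` and `fan_out` would both be 64, so the Glorot-style limits shrink as the layer gets wider.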
@@ -1,35 +0,0 @@
## LeakyReLU
```python
keras.layers.advanced_activations.LeakyReLU(alpha=0.3)
```
Special version of a Rectified Linear Unit that allows a small gradient when the unit is not active (`f(x) = alpha*x for x < 0`).
- __Input shape__: This layer does not assume a specific input shape. As a result, it cannot be used as the first layer in a model.
- __Output shape__: Same as input.
- __Arguments__:
- __alpha__: float >= 0. Negative slope coefficient.
---
## PReLU
```python
keras.layers.advanced_activations.PReLU(input_shape)
```
Parametric Rectified Linear Unit. Similar to a LeakyReLU, but each input unit has its own alpha coefficient, and these coefficients are learned during training.
- __Input shape__: Same as `input_shape`. This layer cannot be used as first layer in a model.
- __Output shape__: Same as input.
- __Arguments__:
- __input_shape__: tuple.
- __References__:
- [Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification](http://arxiv.org/pdf/1502.01852v1.pdf)
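The element-wise rule behind both layers above can be written in a few lines of plain Python. This is only a sketch of the forward computation. The function names are hypothetical, and in the real PReLU layer the `alphas` would be learned during training, not passed in.

```python
def leaky_relu(x, alpha=0.3):
    # Identity for non-negative inputs, slope alpha for negative inputs
    return x if x >= 0 else alpha * x


def prelu(xs, alphas):
    # Same rule, but each unit has its own coefficient
    return [x if x >= 0 else a * x for x, a in zip(xs, alphas)]
```

With `alpha=0`, `leaky_relu` reduces to a plain rectifier.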
@@ -1,23 +0,0 @@
# Containers
Containers are ensembles of layers that can be interacted with through the same API as `Layer` objects.
## Sequential
```python
keras.layers.containers.Sequential(layers=[])
```
The Sequential container is a linear stack of layers. Apart from the `add` methods and the `layers` constructor argument, the API is identical to that of the `Layer` class.
This class is also the basis for the `keras.models.Sequential` architecture.
The `layers` constructor argument is a list of Layer instances.
__Methods__:
```python
add(layer)
```
Add a new layer to the stack.
@@ -1,20 +0,0 @@
## Convolution2D
```python
keras.layers.convolutional.Convolution2D(nb_filter, stack_size, nb_row, nb_col,
init='glorot_uniform', activation='linear', weights=None,
image_shape=None, border_mode='valid', subsample=(1,1))
```
This is a wrapper for Theano's [conv2d](http://deeplearning.net/software/theano/library/tensor/nnet/conv.html#theano.tensor.nnet.conv.conv2d).
---
## MaxPooling2D
```python
keras.layers.convolutional.MaxPooling2D(poolsize=(2, 2), ignore_border=True)
```
This is a wrapper for Theano's [max_pool_2d](http://deeplearning.net/software/theano/library/tensor/signal/downsample.html).
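As an illustration of what `poolsize` and `ignore_border` control, here is a pure-Python sketch of non-overlapping max pooling over a single 2D feature map. It is not the Theano implementation. The function below is a hypothetical re-implementation on nested lists, assuming that `ignore_border=True` drops incomplete windows at the edges.

```python
def max_pool_2d(image, poolsize=(2, 2), ignore_border=True):
    rows, cols = len(image), len(image[0])
    ph, pw = poolsize
    if ignore_border:
        # Incomplete windows at the bottom/right edges are dropped
        out_rows, out_cols = rows // ph, cols // pw
    else:
        # Ceiling division: incomplete windows are pooled as well
        out_rows, out_cols = -(-rows // ph), -(-cols // pw)
    pooled = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            window = [image[r][c]
                      for r in range(i * ph, min((i + 1) * ph, rows))
                      for c in range(j * pw, min((j + 1) * pw, cols))]
            row.append(max(window))
        pooled.append(row)
    return pooled
```

On a 3x3 input with a 2x2 pool, the output is 1x1 when borders are ignored and 2x2 otherwise.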
@@ -1,355 +0,0 @@
## Base class
```python
keras.layers.core.Layer()
```
__Methods__:
```python
connect(previous_layer)
```
Connect the input of the current layer to the output of the argument layer.
- __Return__: None.
- __Arguments__:
- __previous_layer__: Layer object.
```python
output(train)
```
Get the output of the layer.
- __Return__: Theano tensor.
- __Arguments__:
- __train__: Boolean. Specifies whether output is computed in training mode or in testing mode, which can change the logic, for instance if there are any `Dropout` layers in the network.
```python
get_input(train)
```
Get the input of the layer.
- __Return__: Theano tensor.
- __Arguments__:
- __train__: Boolean. Specifies whether the input is computed in training mode or in testing mode, which can change the logic, for instance if there are any `Dropout` layers in the network.
```python
get_weights()
```
Get the weights of the parameters of the layer.
- __Return__: List of numpy arrays (one per layer parameter).
```python
set_weights(weights)
```
Set the weights of the parameters of the layer.
- __Arguments__:
- __weights__: List of numpy arrays (one per layer parameter). Should be in the same order as what `get_weights(self)` returns.
---
## Dense
```python
keras.layers.core.Dense(input_dim, output_dim, init='glorot_uniform', activation='linear', weights=None, \
W_regularizer=None, b_regularizer=None, W_constraint=None, b_constraint=None)
```
Standard 1D fully-connected layer.
- __Input shape__: 2D tensor with shape: `(nb_samples, input_dim)`.
- __Output shape__: 2D tensor with shape: `(nb_samples, output_dim)`.
- __Arguments__:
- __input_dim__: int >= 0.
- __output_dim__: int >= 0.
- __init__: name of initialization function for the weights of the layer (see: [initializations](../initializations.md)), or alternatively, Theano function to use for weights initialization. This parameter is only relevant if you don't pass a `weights` argument.
- __activation__: name of activation function to use (see: [activations](../activations.md)), or alternatively, elementwise Theano function. If you don't specify anything, no activation is applied (ie. "linear" activation: a(x) = x).
- __weights__: list of numpy arrays to set as initial weights. The list should have 1 element, of shape `(input_dim, output_dim)`.
- __W_regularizer__: instance of the [regularizers](../regularizers.md) module (eg. L1 or L2 regularization), applied to the main weights matrix.
- __b_regularizer__: instance of the [regularizers](../regularizers.md) module, applied to the bias.
- __W_constraint__: instance of the [constraints](../constraints.md) module (eg. maxnorm, nonneg), applied to the main weights matrix.
- __b_constraint__: instance of the [constraints](../constraints.md) module, applied to the bias.
---
## TimeDistributedDense
```python
keras.layers.core.TimeDistributedDense(input_dim, output_dim, init='glorot_uniform', activation='linear', weights=None, \
W_regularizer=None, b_regularizer=None, W_constraint=None, b_constraint=None)
```
Fully-connected layer distributed over the time dimension. Useful after a recurrent network set to `return_sequences=True`.
- __Input shape__: 3D tensor with shape: `(nb_samples, nb_timesteps, input_dim)`.
- __Output shape__: 3D tensor with shape: `(nb_samples, nb_timesteps, output_dim)`.
- __Arguments__:
- __input_dim__: int >= 0.
- __output_dim__: int >= 0.
- __init__: name of initialization function for the weights of the layer (see: [initializations](../initializations.md)), or alternatively, Theano function to use for weights initialization. This parameter is only relevant if you don't pass a `weights` argument.
- __activation__: name of activation function to use (see: [activations](../activations.md)), or alternatively, elementwise Theano function. If you don't specify anything, no activation is applied (ie. "linear" activation: a(x) = x).
- __weights__: list of numpy arrays to set as initial weights. The list should have 1 element, of shape `(input_dim, output_dim)`.
- __W_regularizer__: instance of the [regularizers](../regularizers.md) module (eg. L1 or L2 regularization), applied to the main weights matrix.
- __b_regularizer__: instance of the [regularizers](../regularizers.md) module, applied to the bias.
- __W_constraint__: instance of the [constraints](../constraints.md) module (eg. maxnorm, nonneg), applied to the main weights matrix.
- __b_constraint__: instance of the [constraints](../constraints.md) module, applied to the bias.
- __Example__:
```python
# input shape: (nb_samples, nb_timesteps, 10)
model.add(LSTM(10, 5, return_sequences=True)) # output shape: (nb_samples, nb_timesteps, 5)
model.add(TimeDistributedDense(5, 10)) # output shape: (nb_samples, nb_timesteps, 10)
```
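"Distributed over the time dimension" means the same weights are applied independently at every timestep. The sketch below shows this for a single sample in plain Python. `dense` and `time_distributed_dense` are hypothetical helpers, not the Keras layers, and no activation is applied.

```python
def dense(x, W, b):
    # x: input_dim values; W: input_dim rows of output_dim values; b: output_dim values
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]


def time_distributed_dense(sample, W, b):
    # sample: nb_timesteps vectors of size input_dim; W and b are shared across steps
    return [dense(step, W, b) for step in sample]


W = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]          # input_dim=2, output_dim=3
b = [0.0, 0.0, 1.0]
out = time_distributed_dense([[1.0, 2.0], [3.0, 4.0]], W, b)
```

The number of timesteps is preserved and only the last dimension changes from `input_dim` to `output_dim`.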
---
## AutoEncoder
```python
keras.layers.core.AutoEncoder(encoder, decoder, output_reconstruction=True, tie_weights=False, weights=None)
```
A customizable autoencoder model. If `output_reconstruction = True` then dim(input) = dim(output) else dim(output) = dim(hidden)
- __Input shape__: The layer shape is defined by the encoder definitions
- __Output shape__: The layer shape is defined by the decoder definitions
- __Arguments__:
- __encoder__: A [layer](./) or [layer container](./containers.md).
- __decoder__: A [layer](./) or [layer container](./containers.md).
- __output_reconstruction__: If this is False, then when `.predict()` is called the output is the deepest hidden layer's activation. Otherwise the output of the final decoder layer is returned. Be sure your validation data conforms to this logic if you decide to use any.
- __tie_weights__: If True then the encoder bias is tied to the decoder bias. **Note**: This requires the encoder layer corresponding to this decoder layer to be of the same type, eg: Dense:Dense
- __weights__: list of numpy arrays to set as initial weights. The list should have 1 element, of shape `(input_dim, output_dim)`.
- __Example__:
```python
from keras.models import Sequential
from keras.layers import containers
from keras.layers.core import Dense, AutoEncoder

# input shape: (nb_samples, 32)
encoder = containers.Sequential([Dense(32, 16), Dense(16, 8)])
decoder = containers.Sequential([Dense(8, 16), Dense(16, 32)])

autoencoder = Sequential()
autoencoder.add(AutoEncoder(encoder=encoder, decoder=decoder, output_reconstruction=False, tie_weights=True))
```
---
## DenoisingAutoEncoder
```python
keras.layers.core.DenoisingAutoEncoder(encoder, decoder, output_reconstruction=True, tie_weights=False, weights=None, corruption_level=0.3)
```
A denoising autoencoder model that inherits the base features from autoencoder.
Since this layer uses similar logic to Dropout it cannot be the first layer in a pipeline.
- __Input shape__: The layer shape is defined by the encoder definitions
- __Output shape__: The layer shape is defined by the decoder definitions
- __Arguments__:
- __encoder__: A [layer](./) or [layer container](./containers.md).
- __decoder__: A [layer](./) or [layer container](./containers.md).
- __output_reconstruction__: If this is False, then when `.predict()` is called the output is the deepest hidden layer's activation. Otherwise the output of the final decoder layer is returned. Be sure your validation data conforms to this logic if you decide to use any.
- __tie_weights__: If True then the encoder bias is tied to the decoder bias. **Note**: This requires the encoder layer corresponding to this decoder layer to be of the same type, eg: Dense:Dense
- __weights__: list of numpy arrays to set as initial weights. The list should have 1 element, of shape `(input_dim, output_dim)`.
- __corruption_level__: the amount of binomial noise added to the input layer of the model.
- __Example__:
```python
# input shape: (nb_samples, 32)
autoencoder.add(Dense(32, 32))
autoencoder.add(DenoisingAutoEncoder(encoder=Dense(32, 16),
decoder=Dense(16, 32),
output_reconstruction=False, tie_weights=True,
corruption_level=0.3))
```
---
## Activation
```python
keras.layers.core.Activation(activation)
```
Apply an activation function to the input.
- __Input shape__: This layer does not assume a specific input shape. As a result, it cannot be used as the first layer in a model.
- __Output shape__: Same as input.
- __Arguments__:
- __activation__: name of activation function to use (see: [activations](../activations.md)), or alternatively, elementwise Theano function.
---
## Dropout
```python
keras.layers.core.Dropout(p)
```
Apply dropout to the input. Dropout consists in randomly setting a fraction `p` of input units to 0 at each update during training time, which helps prevent overfitting. Reference: [Dropout: A Simple Way to Prevent Neural Networks from Overfitting](http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf)
- __Input shape__: This layer does not assume a specific input shape.
- __Output shape__: Same as input.
- __Arguments__:
- __p__: float (0 <= p < 1). Fraction of the input that gets dropped out at training time.
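A minimal sketch of the idea, in plain Python: during training each unit is zeroed with probability `p`, and at test time nothing is dropped. The rescaling of the surviving units by `1 / (1 - p)` (so-called inverted dropout) is an assumption made for this sketch. The version of Keras documented here may instead rescale at test time. `dropout` is a hypothetical helper, not the Keras layer.

```python
import random


def dropout(xs, p, train, rng):
    if not train or p == 0.0:
        # Test mode: pass the input through unchanged
        return list(xs)
    keep = 1.0 - p
    # Keep each unit with probability (1 - p) and rescale the survivors
    return [x / keep if rng.random() < keep else 0.0 for x in xs]


rng = random.Random(0)
train_out = dropout([1.0] * 1000, p=0.5, train=True, rng=rng)
test_out = dropout([1.0, 2.0], p=0.5, train=False, rng=rng)
```

With `p=0.5`, roughly half of the training-mode outputs are zero and the rest are doubled, so the expected sum is unchanged.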
---
## Reshape
```python
keras.layers.core.Reshape(*dims)
```
Reshape the input to a new shape containing the same number of units.
- __Input shape__: This layer does not assume a specific input shape.
- __Output shape__: `(nb_samples, *dims)`.
- __Arguments__:
- *dims: integers. Dimensions of the new shape.
- __Example__:
```python
# input shape: (nb_samples, 10)
model.add(Dense(10, 100)) # output shape: (nb_samples, 100)
model.add(Reshape(10, 10)) # output shape: (nb_samples, 10, 10)
```
---
## Flatten
```python
keras.layers.core.Flatten()
```
Convert a nD input to 1D.
- __Input shape__: (nb_samples, *). This layer cannot be used as the first layer in a model.
- __Output shape__: `(nb_samples, nb_input_units)`.
---
## RepeatVector
```python
keras.layers.core.RepeatVector(n)
```
Repeat the 1D input n times. Dimensions of input are assumed to be (nb_samples, dim). Output will have the shape (nb_samples, n, dim).
- __Input shape__: 2D tensor with shape: `(nb_samples, dim)`. This layer cannot be used as the first layer in a model.
- __Output shape__: 3D tensor with shape: `(nb_samples, n, dim)`.
- __Arguments__:
- __n__: int.
- __Example__:
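A minimal pure-Python sketch of the repetition described above (this is not Keras code, and `repeat_vector` is a hypothetical helper operating on nested lists):

```python
def repeat_vector(batch, n):
    # batch: nb_samples vectors of size dim -> nb_samples lists of n copies each
    return [[list(vec) for _ in range(n)] for vec in batch]


# input shape: (2, 3) -> output shape: (2, 2, 3)
repeated = repeat_vector([[1, 2, 3], [4, 5, 6]], 2)
```

Each sample's vector becomes a sequence of `n` identical timesteps, which is what a recurrent layer placed after it would consume.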
---
## MaxoutDense
```python
keras.layers.core.MaxoutDense(input_dim, output_dim, nb_feature=4, init='glorot_uniform', weights=None, \
W_regularizer=None, b_regularizer=None, W_constraint=None, b_constraint=None)
```
A dense maxout layer. A `MaxoutDense` layer takes the element-wise maximum of `nb_feature` `Dense(input_dim, output_dim)` linear layers. This allows the layer to learn a convex, piecewise linear activation function over the inputs. See [this paper](http://arxiv.org/pdf/1302.4389.pdf) for more details. Note that this is a *linear* layer -- if you wish to apply an activation function (you shouldn't need to -- they are universal function approximators), an `Activation` layer must be added after.
- __Input shape__: 2D tensor with shape: `(nb_samples, input_dim)`.
- __Output shape__: 2D tensor with shape: `(nb_samples, output_dim)`.
- __Arguments__:
- __input_dim__: int >= 0.
- __output_dim__: int >= 0.
- __nb_feature__: int >= 0. The number of features to create for the maxout. This is equivalent to the number of piecewise elements to be allowed for the activation function.
- __init__: name of initialization function for the weights of the layer (see: [initializations](../initializations.md)), or alternatively, Theano function to use for weights initialization. This parameter is only relevant if you don't pass a `weights` argument.
- __weights__: list of numpy arrays to set as initial weights. The list should have 1 element, of shape `(input_dim, output_dim)`.
- __W_regularizer__: instance of the [regularizers](../regularizers.md) module (eg. L1 or L2 regularization), applied to the main weights matrix.
- __b_regularizer__: instance of the [regularizers](../regularizers.md) module, applied to the bias.
- __W_constraint__: instance of the [constraints](../constraints.md) module (eg. maxnorm, nonneg), applied to the main weights matrix.
- __b_constraint__: instance of the [constraints](../constraints.md) module, applied to the bias.
```python
# input shape: (nb_samples, 10)
model.add(Dense(10, 100)) # output shape: (nb_samples, 100)
model.add(MaxoutDense(100, 100, nb_feature=10)) # output shape: (nb_samples, 100)
model.add(RepeatVector(2)) # output shape: (nb_samples, 2, 100)
```
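The "element-wise maximum of `nb_feature` linear layers" can be sketched for a single input vector in plain Python. `maxout_dense` is a hypothetical helper, not the Keras layer, and the weights below are made up for illustration.

```python
def maxout_dense(x, Ws, bs):
    # Ws: nb_feature weight matrices (input_dim x output_dim); bs: nb_feature bias vectors
    output_dim = len(bs[0])
    linear = [[sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
               for j in range(output_dim)]
              for W, b in zip(Ws, bs)]
    # Element-wise maximum across the nb_feature linear outputs
    return [max(out[j] for out in linear) for j in range(output_dim)]


# Two features, input_dim=2, output_dim=1: the layer computes max(x[0], x[1])
Ws = [[[1.0], [0.0]],
      [[0.0], [1.0]]]
bs = [[0.0], [0.0]]
```

Because the maximum of linear functions is convex and piecewise linear, increasing `nb_feature` adds more linear pieces to the learned activation.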
## Merge
```python
keras.layers.core.Merge(models, mode='sum')
```
Merge the output of a list of models into a single tensor, following one of two modes: `sum` or `concat`.
- __Arguments__:
- __models__: List of `Sequential` models.
- __mode__: String, one of `{'sum', 'concat'}`. `sum` will simply sum the outputs of the models (therefore all models should have an output with the same shape). `concat` will concatenate the outputs along the last dimension (therefore all models should have outputs that only differ along the last dimension).
- __Example__:
```python
left = Sequential()
left.add(Dense(784, 50))
left.add(Activation('relu'))
right = Sequential()
right.add(Dense(784, 50))
right.add(Activation('relu'))
model = Sequential()
model.add(Merge([left, right], mode='sum'))
model.add(Dense(50, 10))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
model.fit([X_train, X_train], Y_train, batch_size=128, nb_epoch=20, validation_data=([X_test, X_test], Y_test))
```
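The difference between the two modes can be shown on plain lists, for the outputs of each branch on a single sample. `merge` here is a hypothetical helper written for illustration, not the Keras layer.

```python
def merge(outputs, mode='sum'):
    # outputs: one vector per branch, for a single sample
    if mode == 'sum':
        if len(set(len(o) for o in outputs)) != 1:
            raise ValueError('sum mode needs outputs of identical shape')
        return [sum(vals) for vals in zip(*outputs)]
    if mode == 'concat':
        # Join the branch outputs along the last dimension
        return [v for o in outputs for v in o]
    raise ValueError('unknown mode: %r' % mode)


summed = merge([[1, 2], [3, 4]], mode='sum')
joined = merge([[1, 2], [3, 4]], mode='concat')
```

Note that with `mode='concat'`, two branches of 50 units each would produce 100 values, so the `Dense` layer that follows would need an input dimension of 100 and not 50.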
@@ -1,48 +0,0 @@
## Embedding
```python
keras.layers.embeddings.Embedding(input_dim, output_dim, init='uniform', weights=None, W_regularizer=None, W_constraint=None)
```
Turn positive integers (indexes) into dense vectors of fixed size,
eg. `[[4], [20]] -> [[[0.25, 0.1]], [[0.6, -0.2]]]`
- __Input shape__: 2D tensor with shape: `(nb_samples, maxlen)`.
- __Output shape__: 3D tensor with shape: `(nb_samples, maxlen, output_dim)`.
- __Arguments__:
- __input_dim__: int >= 0. Size of the vocabulary, ie. 1+maximum integer index occurring in the input data.
- __output_dim__: int >= 0. Dimension of the dense embedding.
- __init__: name of initialization function for the weights of the layer (see: [initializations](../initializations.md)), or alternatively, Theano function to use for weights initialization. This parameter is only relevant if you don't pass a `weights` argument.
- __weights__: list of numpy arrays to set as initial weights. The list should have 1 element, of shape `(input_dim, output_dim)`.
- __W_regularizer__: instance of the [regularizers](../regularizers.md) module (eg. L1 or L2 regularization), applied to the embedding matrix.
- __W_constraint__: instance of the [constraints](../constraints.md) module (eg. maxnorm, nonneg), applied to the embedding matrix.
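Conceptually, the layer is a row lookup into a weight matrix of shape `(input_dim, output_dim)`. The sketch below shows that lookup on nested lists. `embed` and the table values are hypothetical and only illustrate how the 2D input becomes a 3D output.

```python
def embed(batch, table):
    # batch: nb_samples sequences of integer indices
    # table: input_dim rows, each with output_dim values
    return [[list(table[idx]) for idx in seq] for seq in batch]


table = [[0.0, 0.0],
         [0.1, 0.2],
         [0.3, 0.4]]            # input_dim=3, output_dim=2
# input shape: (2, 2) -> output shape: (2, 2, 2)
vectors = embed([[1, 2], [2, 0]], table)
```

Since every index selects a row, the largest index in the data must be smaller than `input_dim`.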
## WordContextProduct
```python
keras.layers.embeddings.WordContextProduct(input_dim, proj_dim=128,
init='uniform', activation='sigmoid', weights=None)
```
This layer turns a pair of words (a pivot word + a context word, i.e. a word from the same context as the pivot, or a random, out-of-context word), identified by their indices in a vocabulary, into two dense representations (word representation and context representation).
Then it returns `activation(dot(pivot_embedding, context_embedding))`, which can be trained to encode the probability of finding the context word in the context of the pivot word (or reciprocally depending on your training procedure).
For more context, see Mikolov et al.: [Efficient Estimation of Word Representations in Vector Space](http://arxiv.org/pdf/1301.3781v3.pdf)
- __Input shape__: 2D tensor with shape: `(nb_samples, 2)`.
- __Output shape__: 2D tensor with shape: `(nb_samples, 1)`.
- __Arguments__:
- __input_dim__: int >= 0. Size of the vocabulary, i.e. 1 + the maximum integer index occurring in the input data.
- __proj_dim__: int >= 0. Dimension of the dense embedding used internally.
- __init__: name of initialization function for the embeddings (see: [initializations](../initializations.md)), or alternatively, Theano function to use for weights initialization. This parameter is only relevant if you don't pass a `weights` argument.
- __activation__: name of activation function to use (see: [activations](../activations.md)), or alternatively, elementwise Theano function.
- __weights__: list of numpy arrays to set as initial weights. The list should have 2 elements, both of shape `(input_dim, proj_dim)`. The first element is the word embedding weights, the second one is the context embedding weights.
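As a rough illustration of `activation(dot(pivot_embedding, context_embedding))`, the sketch below scores one (pivot, context) pair using two small hand-written embedding tables. The function name `word_context_product` and the toy tables are hypothetical; only the formula comes from the description above.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def word_context_product(word_emb, context_emb, pair, activation=sigmoid):
    """Score one (pivot_index, context_index) pair."""
    pivot_index, context_index = pair
    pivot = word_emb[pivot_index]          # row of the word embedding table
    context = context_emb[context_index]   # row of the context embedding table
    dot = sum(p * c for p, c in zip(pivot, context))
    return activation(dot)


# toy tables of shape (input_dim=3, proj_dim=2)
word_emb = [[0.0, 0.0], [1.0, 2.0], [0.5, -0.5]]
context_emb = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
score = word_context_product(word_emb, context_emb, (1, 2))
neutral = word_context_product(word_emb, context_emb, (0, 1))
```

With the sigmoid activation the result lies in (0, 1), so it can be read as a probability during training.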
-20
@@ -1,20 +0,0 @@
## BatchNormalization
```python
keras.layers.normalization.BatchNormalization(input_shape, epsilon=1e-6, weights=None)
```
Normalize the activations of the previous layer at each batch.
- __Input shape__: Same as `input_shape`. This layer cannot be used as first layer in a model.
- __Output shape__: Same as input.
- __Arguments__:
- __input_shape__: tuple.
- __epsilon__: small float > 0. Fuzz parameter.
- __weights__: Initialization weights. List of 2 numpy arrays, with shapes: `[(input_shape,), (input_shape,)]`
- __References__:
- [Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift](http://arxiv.org/pdf/1502.03167v3.pdf)
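A minimal sketch of per-batch normalization follows, assuming the two weight arrays are a per-feature scale (`gamma`) and shift (`beta`) as in the referenced paper. The function name and the exact placement of `epsilon` (added to the variance before the square root) are assumptions, not confirmed details of this layer.

```python
import math


def batch_normalize(batch, gamma, beta, epsilon=1e-6):
    """Normalize each feature over the batch, then scale and shift.

    batch: list of samples, each a list of feature values.
    """
    n = len(batch)
    n_features = len(batch[0])
    out = [[0.0] * n_features for _ in range(n)]
    for j in range(n_features):
        column = [sample[j] for sample in batch]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        std = math.sqrt(var + epsilon)  # assumption: epsilon guards the sqrt
        for i in range(n):
            out[i][j] = gamma[j] * (column[i] - mean) / std + beta[j]
    return out


batch = [[1.0, 10.0], [3.0, 30.0]]
normalized = batch_normalize(batch, gamma=[1.0, 1.0], beta=[0.0, 0.0])
```

Both features end up on the same scale even though the second one is ten times larger in the raw input.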
-133
@@ -1,133 +0,0 @@
## SimpleRNN
```python
keras.layers.recurrent.SimpleRNN(input_dim, output_dim,
init='glorot_uniform', inner_init='orthogonal', activation='sigmoid', weights=None,
truncate_gradient=-1, return_sequences=False)
```
Fully connected RNN where the output is fed back to the input. Not a particularly useful model, included for demonstration purposes.
- __Input shape__: 3D tensor with shape: `(nb_samples, timesteps, input_dim)`.
- __Output shape__:
- if `return_sequences`: 3D tensor with shape: `(nb_samples, timesteps, output_dim)`.
- else: 2D tensor with shape: `(nb_samples, output_dim)`.
- __Arguments__:
- __input_dim__: dimension of the input.
- __output_dim__: dimension of the internal projections and the final output.
- __init__: weight initialization function. Can be the name of an existing function (str), or a Theano function (see: [initializations](../initializations.md)).
- __activation__: activation function. Can be the name of an existing function (str), or a Theano function (see: [activations](../activations.md)).
- __weights__: list of numpy arrays to set as initial weights. The list should have 3 elements, of shapes: `[(input_dim, output_dim), (output_dim, output_dim), (output_dim,)]`.
- __truncate_gradient__: Number of steps to use in truncated BPTT. See: [Theano "scan"](http://deeplearning.net/software/theano/library/scan.html).
- __return_sequences__: Boolean. Whether to return the last output in the output sequence, or the full sequence.
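The recurrence and the effect of `return_sequences` can be sketched for a single sample in plain Python. The weight shapes follow the `weights` argument above; the zero initial state and the function name `simple_rnn` are assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def simple_rnn(sequence, W, U, b, activation=sigmoid, return_sequences=False):
    """Run one sample of shape (timesteps, input_dim) through the recurrence.

    W: (input_dim, output_dim), U: (output_dim, output_dim), b: (output_dim,)
    """
    output_dim = len(b)
    h = [0.0] * output_dim  # assumption: the state starts at zero
    outputs = []
    for x_t in sequence:
        h_new = []
        for j in range(output_dim):
            total = b[j]
            total += sum(x_t[i] * W[i][j] for i in range(len(x_t)))
            total += sum(h[k] * U[k][j] for k in range(output_dim))
            h_new.append(activation(total))
        h = h_new
        outputs.append(h)
    return outputs if return_sequences else h


W = [[1.0], [-1.0]]  # input_dim=2, output_dim=1
U = [[0.5]]
b = [0.0]
sequence = [[1.0, 1.0], [2.0, 2.0], [0.0, 0.0]]
last = simple_rnn(sequence, W, U, b)                         # shape (output_dim,)
full = simple_rnn(sequence, W, U, b, return_sequences=True)  # shape (timesteps, output_dim)
```

The last element of the full sequence is the same vector that is returned when `return_sequences` is False.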
---
## SimpleDeepRNN
```python
keras.layers.recurrent.SimpleDeepRNN(input_dim, output_dim, depth=3,
init='glorot_uniform', inner_init='orthogonal',
activation='sigmoid', inner_activation='hard_sigmoid',
weights=None, truncate_gradient=-1, return_sequences=False)
```
Fully connected RNN where the output of multiple timesteps (up to "depth" steps in the past) is fed back to the input:
```
output = activation( W.x_t + b + inner_activation(U_1.h_tm1) + inner_activation(U_2.h_tm2) + ... )
```
Not a particularly useful model, included for demonstration purposes.
- __Input shape__: 3D tensor with shape: `(nb_samples, timesteps, input_dim)`.
- __Output shape__:
- if `return_sequences`: 3D tensor with shape: `(nb_samples, timesteps, output_dim)`.
- else: 2D tensor with shape: `(nb_samples, output_dim)`.
- __Arguments__:
- __input_dim__: dimension of the input.
- __output_dim__: dimension of the internal projections and the final output.
- __depth__: int >= 1. Lookback depth (eg. depth=1 is equivalent to SimpleRNN).
- __init__: weight initialization function for the output cell. Can be the name of an existing function (str), or a Theano function (see: [initializations](../initializations.md)).
- __inner_init__: weight initialization function for the inner cells.
- __activation__: activation function for the output. Can be the name of an existing function (str), or a Theano function (see: [activations](../activations.md)).
- __inner_activation__: activation function for the inner cells.
- __weights__: list of numpy arrays to set as initial weights. The list should have depth+2 elements.
- __truncate_gradient__: Number of steps to use in truncated BPTT. See: [Theano "scan"](http://deeplearning.net/software/theano/library/scan.html).
- __return_sequences__: Boolean. Whether to return the last output in the output sequence, or the full sequence.
---
## GRU
```python
keras.layers.recurrent.GRU(input_dim, output_dim=128,
init='glorot_uniform', inner_init='orthogonal',
activation='sigmoid', inner_activation='hard_sigmoid',
weights=None, truncate_gradient=-1, return_sequences=False)
```
Gated Recurrent Unit - Cho et al. 2014.
- __Input shape__: 3D tensor with shape: `(nb_samples, timesteps, input_dim)`.
- __Output shape__:
- if `return_sequences`: 3D tensor with shape: `(nb_samples, timesteps, output_dim)`.
- else: 2D tensor with shape: `(nb_samples, output_dim)`.
- __Arguments__:
- __input_dim__: dimension of the input.
- __output_dim__: dimension of the internal projections and the final output.
- __init__: weight initialization function for the output cell. Can be the name of an existing function (str), or a Theano function (see: [initializations](../initializations.md)).
- __inner_init__: weight initialization function for the inner cells.
- __activation__: activation function for the output. Can be the name of an existing function (str), or a Theano function (see: [activations](../activations.md)).
- __inner_activation__: activation function for the inner cells.
- __weights__: list of numpy arrays to set as initial weights. The list should have 9 elements.
- __truncate_gradient__: Number of steps to use in truncated BPTT. See: [Theano "scan"](http://deeplearning.net/software/theano/library/scan.html).
- __return_sequences__: Boolean. Whether to return the last output in the output sequence, or the full sequence.
- __References__:
- [On the Properties of Neural Machine Translation: Encoder–Decoder Approaches](http://www.aclweb.org/anthology/W14-4012)
- [Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling](http://arxiv.org/pdf/1412.3555v1.pdf)
---
## LSTM
```python
keras.layers.recurrent.LSTM(input_dim, output_dim=128,
init='glorot_uniform', inner_init='orthogonal',
activation='tanh', inner_activation='hard_sigmoid',
weights=None, truncate_gradient=-1, return_sequences=False)
```
Long Short-Term Memory unit - Hochreiter 1997.
- __Input shape__: 3D tensor with shape: `(nb_samples, timesteps, input_dim)`.
- __Output shape__:
- if `return_sequences`: 3D tensor with shape: `(nb_samples, timesteps, output_dim)`.
- else: 2D tensor with shape: `(nb_samples, output_dim)`.
- __Arguments__:
- __input_dim__: dimension of the input.
- __output_dim__: dimension of the internal projections and the final output.
- __init__: weight initialization function for the output cell. Can be the name of an existing function (str), or a Theano function (see: [initializations](../initializations.md)).
- __inner_init__: weight initialization function for the inner cells.
- __activation__: activation function for the output. Can be the name of an existing function (str), or a Theano function (see: [activations](../activations.md)).
- __inner_activation__: activation function for the inner cells.
- __weights__: list of numpy arrays to set as initial weights. The list should have 12 elements.
- __truncate_gradient__: Number of steps to use in truncated BPTT. See: [Theano "scan"](http://deeplearning.net/software/theano/library/scan.html).
- __return_sequences__: Boolean. Whether to return the last output in the output sequence, or the full sequence.
- __References__:
- [Long short-term memory](http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf) (original 1997 paper)
- [Learning to forget: Continual prediction with LSTM](http://www.mitpressjournals.org/doi/pdf/10.1162/089976600300015015)
- [Supervised sequence labelling with recurrent neural networks](http://www.cs.toronto.edu/~graves/preprint.pdf)
-114
@@ -1,114 +0,0 @@
## Sequential
Linear stack of layers.
```python
model = keras.models.Sequential()
```
- __Methods__:
- __add__(layer): Add a layer to the model.
- __compile__(optimizer, loss, class_mode="categorical"):
- __Arguments__:
- __optimizer__: str (name of optimizer) or optimizer object. See [optimizers](optimizers.md).
- __loss__: str (name of objective function) or objective function. See [objectives](objectives.md).
- __class_mode__: one of "categorical", "binary". This is only used for computing classification accuracy or using the predict_classes method.
- __theano_mode__: A `theano.compile.mode.Mode` ([reference](http://deeplearning.net/software/theano/library/compile/mode.html)) instance specifying compilation options.
- __fit__(X, y, batch_size=128, nb_epoch=100, verbose=1, validation_split=0., validation_data=None, shuffle=True, show_accuracy=False, callbacks=[]): Train a model for a fixed number of epochs.
- __Return__: a history dictionary with a record of training loss values at successive epochs, as well as validation loss values (if applicable), accuracy (if applicable), etc.
- __Arguments__:
- __X__: data.
- __y__: labels.
- __batch_size__: int. Number of samples per gradient update.
- __nb_epoch__: int.
- __verbose__: 0 for no logging to stdout, 1 for progress bar logging, 2 for one log line per epoch.
- __validation_split__: float (0. < x < 1). Fraction of the data to use as held-out validation data.
- __validation_data__: tuple (X, y) to be used as held-out validation data. Will override validation_split.
- __shuffle__: boolean. Whether to shuffle the samples at each epoch.
- __show_accuracy__: boolean. Whether to display class accuracy in the logs to stdout at each epoch.
- __callbacks__: `keras.callbacks.Callback` list. List of callbacks to apply during training. See [callbacks](callbacks.md).
- __evaluate__(X, y, batch_size=128, show_accuracy=False, verbose=1): Show performance of the model over some validation data.
- __Return__: The loss score over the data.
- __Arguments__: Same meaning as fit method above. verbose is used as a binary flag (progress bar or nothing).
- __predict__(X, batch_size=128, verbose=1):
- __Return__: An array of predictions for some test data.
- __Arguments__: Same meaning as fit method above.
- __predict_classes__(X, batch_size=128, verbose=1): Return an array of class predictions for some test data.
- __Return__: An array of labels for some test data.
- __Arguments__: Same meaning as fit method above. verbose is used as a binary flag (progress bar or nothing).
- __train__(X, y, accuracy=False): Single gradient update on one batch. If accuracy==True, return tuple (loss_on_batch, accuracy_on_batch). Else, return loss_on_batch.
- __Return__: loss over the data, or tuple `(loss, accuracy)` if `accuracy=True`.
- __test__(X, y, accuracy=False): Single performance evaluation on one batch. If accuracy==True, return tuple (loss_on_batch, accuracy_on_batch). Else, return loss_on_batch.
- __Return__: loss over the data, or tuple `(loss, accuracy)` if `accuracy=True`.
- __save_weights__(fname, overwrite=False): Store the weights of all layers to a HDF5 file. If overwrite==False and the file already exists, an exception will be thrown.
- __load_weights__(fname): Sets the weights of a model, based on weights stored by __save_weights__. You can only __load_weights__ on a savefile from a model with an identical architecture. __load_weights__ can be called either before or after the __compile__ step.
__Examples__:
```python
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD
model = Sequential()
model.add(Dense(64, 2, init='uniform'))
model.add(Activation('softmax'))
model.compile(loss='mse', optimizer='sgd')
'''
Demonstration of verbose modes 1 and 2
'''
model.fit(X_train, y_train, nb_epoch=3, batch_size=16, verbose=1)
# outputs
'''
Train on 37800 samples, validate on 4200 samples
Epoch 0
37800/37800 [==============================] - 7s - loss: 0.0385
Epoch 1
37800/37800 [==============================] - 8s - loss: 0.0140
Epoch 2
10960/37800 [=======>......................] - ETA: 4s - loss: 0.0109
'''
model.fit(X_train, y_train, nb_epoch=3, batch_size=16, verbose=2)
# outputs
'''
Train on 37800 samples, validate on 4200 samples
Epoch 0
loss: 0.0190
Epoch 1
loss: 0.0146
Epoch 2
loss: 0.0049
'''
'''
Demonstration of show_accuracy
'''
model.fit(X_train, y_train, nb_epoch=3, batch_size=16, verbose=2, show_accuracy=True)
# outputs
'''
Train on 37800 samples, validate on 4200 samples
Epoch 0
loss: 0.0190 - acc.: 0.8750
Epoch 1
loss: 0.0146 - acc.: 0.8750
Epoch 2
loss: 0.0049 - acc.: 1.0000
'''
'''
Demonstration of validation_split
'''
model.fit(X_train, y_train, nb_epoch=3, batch_size=16, validation_split=0.1, show_accuracy=True, verbose=1)
# outputs
'''
Train on 37800 samples, validate on 4200 samples
Epoch 0
37800/37800 [==============================] - 7s - loss: 0.0385 - acc.: 0.7258 - val. loss: 0.0160 - val. acc.: 0.9136
Epoch 1
37800/37800 [==============================] - 8s - loss: 0.0140 - acc.: 0.9265 - val. loss: 0.0109 - val. acc.: 0.9383
Epoch 2
10960/37800 [=======>......................] - ETA: 4s - loss: 0.0109 - acc.: 0.9420
'''
```
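The "Train on 37800 samples, validate on 4200 samples" line in the logs above is consistent with holding out 10% of 42000 samples. A minimal sketch of such a split is shown below; the helper name `split_validation` is hypothetical, and taking the held-out fraction from the end of the data is an assumption rather than documented behavior.

```python
def split_validation(X, y, validation_split):
    """Hold out the trailing fraction of the data for validation."""
    split_at = int(len(X) * (1.0 - validation_split))
    train = (X[:split_at], y[:split_at])
    val = (X[split_at:], y[split_at:])
    return train, val


X = list(range(42000))    # stand-in for 42000 samples
y = [i % 10 for i in X]   # stand-in labels
(train_X, train_y), (val_X, val_y) = split_validation(X, y, 0.1)
```

Passing `validation_data` explicitly skips this step, since it overrides `validation_split`.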
-24
@@ -1,24 +0,0 @@
## Usage of objectives
An objective function (or loss function, or optimization score function) is one of the two parameters required to compile a model:
```python
model.compile(loss='mean_squared_error', optimizer='sgd')
```
You can either pass the name of an existing objective, or pass a Theano symbolic function that returns a scalar and takes the following two arguments:
- __y_true__: True labels. Theano tensor.
- __y_pred__: Predictions. Theano tensor of the same shape as y_true.
For a few examples of such functions, check out the [objectives source](https://github.com/fchollet/keras/blob/master/keras/objectives.py).
## Available objectives
- __mean_squared_error__ / __mse__
- __mean_absolute_error__ / __mae__
- __squared_hinge__
- __hinge__
- __binary_crossentropy__: Also known as logloss.
- __categorical_crossentropy__: Also known as multiclass logloss. __Note__: using this objective requires that your labels are binary arrays of shape `(nb_samples, nb_classes)`.
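To show what two of these objectives compute, here is a per-sample sketch in plain Python rather than as Theano symbolic functions. The clipping constant `epsilon` is an assumption added to avoid `log(0)`; the library's own implementations may differ in such details.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average of squared differences for one sample."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def categorical_crossentropy(y_true, y_pred, epsilon=1e-7):
    """Multiclass logloss for one sample; y_true is a binary (one-hot) list."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, epsilon), 1.0 - epsilon)  # assumption: clip to avoid log(0)
        total -= t * math.log(p)
    return total


mse = mean_squared_error([1.0, 0.0], [0.5, 0.5])
cce = categorical_crossentropy([0.0, 1.0, 0.0], [0.1, 0.8, 0.1])
```

The crossentropy example also shows why the labels need to be binary arrays of shape `(nb_samples, nb_classes)`: only the entry for the true class contributes to the loss.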
-118
@@ -1,118 +0,0 @@
## Usage of optimizers
An optimizer is one of the two arguments required for compiling a Keras model:
```python
model = Sequential()
model.add(Dense(20, 64, init='uniform'))
model.add(Activation('tanh'))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
```
You can either instantiate an optimizer before passing it to `model.compile()`, as in the above example, or you can call it by its name. In the latter case, the default parameters for the optimizer will be used.
```python
# pass optimizer by name: default parameters will be used
model.compile(loss='mean_squared_error', optimizer='sgd')
```
---
## Base class
```python
keras.optimizers.Optimizer(**kwargs)
```
All optimizers descended from this class support the following keyword argument:
- __clipnorm__: float >= 0.
Note: this is the base class for building optimizers, not an actual optimizer that can be used for training models.
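The text above does not spell out what `clipnorm` does. One common reading, assumed here, is that gradients are rescaled whenever their L2 norm exceeds the threshold. The sketch below illustrates that reading only; the function name `clip_by_norm` is hypothetical.

```python
import math


def clip_by_norm(gradients, clipnorm):
    """Rescale gradients so their L2 norm is at most clipnorm (assumed behavior)."""
    norm = math.sqrt(sum(g * g for g in gradients))
    if clipnorm > 0 and norm > clipnorm:
        scale = clipnorm / norm
        return [g * scale for g in gradients]
    return list(gradients)  # small gradients pass through unchanged


clipped = clip_by_norm([3.0, 4.0], clipnorm=1.0)    # norm 5 is scaled down to 1
untouched = clip_by_norm([0.3, 0.4], clipnorm=1.0)  # norm 0.5 is left alone
```

Rescaling keeps the direction of the update and only limits its size.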
---
## SGD
```python
keras.optimizers.SGD(lr=0.01, momentum=0., decay=0., nesterov=False)
```
__Arguments__:
- __lr__: float >= 0. Learning rate.
- __momentum__: float >= 0. Parameter updates momentum.
- __decay__: float >= 0. Learning rate decay over each update.
- __nesterov__: boolean. Whether to apply Nesterov momentum.
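How the four arguments interact can be sketched as a single scalar update. This follows a commonly used formulation of SGD with momentum, time-based learning rate decay and Nesterov momentum; the function name `sgd_step` and the exact form of the decay schedule are assumptions, not taken from the text above.

```python
def sgd_step(param, grad, velocity, iteration,
             lr=0.01, momentum=0.0, decay=0.0, nesterov=False):
    """One SGD update for a scalar parameter; returns (new_param, new_velocity)."""
    # assumption: the learning rate shrinks as 1 / (1 + decay * iteration)
    lr_t = lr / (1.0 + decay * iteration)
    new_velocity = momentum * velocity - lr_t * grad
    if nesterov:
        # look ahead along the updated velocity
        new_param = param + momentum * new_velocity - lr_t * grad
    else:
        new_param = param + new_velocity
    return new_param, new_velocity


p, v = 1.0, 0.0
p, v = sgd_step(p, grad=0.5, velocity=v, iteration=0, lr=0.1, momentum=0.9)
```

On the first step the velocity is zero, so the update reduces to plain gradient descent: the parameter moves by `lr * grad`.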
---
## Adagrad
```python
keras.optimizers.Adagrad(lr=0.01, epsilon=1e-6)
```
It is recommended to leave the parameters of this optimizer at their default values.
__Arguments__:
- __lr__: float >= 0. Learning rate.
- __epsilon__: float >= 0.
---
## Adadelta
```python
keras.optimizers.Adadelta(lr=1.0, rho=0.95, epsilon=1e-6)
```
It is recommended to leave the parameters of this optimizer at their default values.
__Arguments__:
- __lr__: float >= 0. Learning rate. It is recommended to leave it at the default value.
- __rho__: float >= 0.
- __epsilon__: float >= 0. Fuzz factor.
For more info, see *"Adadelta: an adaptive learning rate method"* by Matthew Zeiler.
---
## RMSprop
```python
keras.optimizers.RMSprop(lr=0.001, rho=0.9, epsilon=1e-6)
```
It is recommended to leave the parameters of this optimizer at their default values.
__Arguments__:
- __lr__: float >= 0. Learning rate.
- __rho__: float >= 0.
- __epsilon__: float >= 0. Fuzz factor.
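As a hedged illustration of how `rho` and `epsilon` enter the update, the sketch below applies one RMSprop step to a scalar parameter using the usual textbook formulation. The function name `rmsprop_step` and the placement of `epsilon` inside the square root are assumptions.

```python
import math


def rmsprop_step(param, grad, accumulator, lr=0.001, rho=0.9, epsilon=1e-6):
    """One RMSprop update for a scalar; returns (new_param, new_accumulator)."""
    # running average of squared gradients, weighted by rho
    new_acc = rho * accumulator + (1.0 - rho) * grad ** 2
    # assumption: epsilon is added before the square root to avoid division by zero
    new_param = param - lr * grad / math.sqrt(new_acc + epsilon)
    return new_param, new_acc


p, acc = 1.0, 0.0
p, acc = rmsprop_step(p, grad=2.0, accumulator=acc)
```

Dividing by the running root-mean-square makes the step size largely independent of the raw gradient scale, which is one reason the defaults rarely need changing.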
---
## Adam
```python
keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-8, kappa=1-1e-8)
```
Adam optimizer, proposed by Kingma and Lei Ba in [Adam: A Method For Stochastic Optimization](http://arxiv.org/pdf/1412.6980v4.pdf). Default parameters are those suggested in the paper. The parameter "lambda" from the paper has been renamed kappa, for syntactic reasons.
__Arguments__:
- __lr__: float >= 0. Learning rate.
- __beta_1__, __beta_2__: floats, 0 < beta < 1. Generally close to 1.
- __epsilon__: float >= 0. Fuzz factor.
- __kappa__: float 0 < kappa < 1. Lambda parameter in the original paper.
---
-70
@@ -1,70 +0,0 @@
## ImageDataGenerator
```python
keras.preprocessing.image.ImageDataGenerator(featurewise_center=True,
samplewise_center=False,
featurewise_std_normalization=True,
samplewise_std_normalization=False,
zca_whitening=False,
rotation_range=0.,
width_shift_range=0.,
height_shift_range=0.,
horizontal_flip=False,
vertical_flip=False)
```
Generate batches of tensor image data with real-time data augmentation.
- __Arguments__:
- __featurewise_center__: Boolean. Set input mean to 0 over the dataset.
- __samplewise_center__: Boolean. Set each sample mean to 0.
- __featurewise_std_normalization__: Boolean. Divide inputs by std of the dataset.
- __samplewise_std_normalization__: Boolean. Divide each input by its std.
- __zca_whitening__: Boolean. Apply ZCA whitening.
- __rotation_range__: Int. Degree range for random rotations.
- __width_shift_range__: Float (fraction of total width). Range for random horizontal shifts.
- __height_shift_range__: Float (fraction of total height). Range for random vertical shifts.
- __horizontal_flip__: Boolean. Randomly flip inputs horizontally.
- __vertical_flip__: Boolean. Randomly flip inputs vertically.
- __Methods__:
- __fit(X)__: Required if featurewise_center or featurewise_std_normalization or zca_whitening. Compute necessary quantities on some sample data.
- __Arguments__:
- __X__: sample data.
- __augment__: Boolean (default: False). Whether to fit on randomly augmented samples.
- __rounds__: int (default: 1). If augment, how many augmentation passes over the data to use.
- __flow(X, y)__:
- __Arguments__:
- __X__: data.
- __y__: labels.
- __batch_size__: int (default: 32).
- __shuffle__: boolean (default: False).
- __save_to_dir__: None or str. This allows you to optionally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing).
- __save_prefix__: str. Prefix to use for filenames of saved pictures.
- __save_format__: one of "png", "jpeg".
- __Example__:
```python
(X_train, y_train), (X_test, y_test) = cifar10.load_data(test_split=0.1)
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
datagen = ImageDataGenerator(
featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True)
# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(X_train)
for e in range(nb_epoch):
    print('Epoch', e)
    # batch train with realtime data augmentation
    for X_batch, Y_batch in datagen.flow(X_train, Y_train):
        loss = model.train(X_batch, Y_batch)
```
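The reason `fit` must be called before `flow` when featurewise options are enabled is that the statistics come from the dataset as a whole. The sketch below is a deliberately simplified illustration using one scalar mean and standard deviation over all values; the helper names and the `epsilon` guard are assumptions, and the real generator may compute these quantities per pixel or per channel.

```python
import math


def fit_featurewise(samples):
    """Compute a dataset-wide mean and std (simplified to scalars)."""
    values = [v for sample in samples for v in sample]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return mean, std


def standardize(sample, mean, std, epsilon=1e-7):
    """Center a sample and divide by the dataset std."""
    return [(v - mean) / (std + epsilon) for v in sample]


samples = [[0.0, 2.0], [4.0, 6.0]]  # two tiny flattened "images"
mean, std = fit_featurewise(samples)
centered = [standardize(s, mean, std) for s in samples]
```

Samplewise options need no fitting step because each sample supplies its own mean and standard deviation.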
-18
@@ -1,18 +0,0 @@
## Usage of regularizers
Regularizers allow you to apply penalties on network parameters during optimization.
The keyword arguments used for passing penalties to parameters in a layer will depend on the layer.
In the `Dense` layer it is simply `W_regularizer` for the main weights matrix, and `b_regularizer` for the bias.
```python
from keras.regularizers import l2
model.add(Dense(64, 64, W_regularizer = l2(.01)))
```
## Available penalties
- __l1__(l=0.01): L1 regularization penalty, also known as LASSO
- __l2__(l=0.01): L2 regularization penalty, also known as weight decay, or Ridge
- __l1l2__(l1=0.01, l2=0.01): L1-L2 regularization penalty, also known as ElasticNet
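What each penalty adds to the loss can be sketched on a flat list of weights. These are the standard definitions of the L1, L2 and combined penalties; the function names are hypothetical and the code is an illustration, not the library's implementation.

```python
def l1_penalty(weights, l=0.01):
    """L1 (LASSO) penalty: l times the sum of absolute values."""
    return l * sum(abs(w) for w in weights)


def l2_penalty(weights, l=0.01):
    """L2 (weight decay, Ridge) penalty: l times the sum of squares."""
    return l * sum(w * w for w in weights)


def l1l2_penalty(weights, l1=0.01, l2=0.01):
    """ElasticNet-style penalty: the sum of both terms."""
    return l1_penalty(weights, l1) + l2_penalty(weights, l2)


weights = [1.0, -2.0]
p1 = l1_penalty(weights)     # 0.01 * (1 + 2)
p2 = l2_penalty(weights)     # 0.01 * (1 + 4)
p12 = l1l2_penalty(weights)  # sum of the two
```

The penalty is added to the objective during training, so larger values of `l` push the weights harder toward zero.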
-28
@@ -1,28 +0,0 @@
## Grapher
Creates a visualization of the model structure using `pydot`.
```python
grapher = keras.utils.dot_utils.Grapher()
```
- __Methods__:
- __plot__(model, to_file): creates a graph visualizing the structure of `model` and writes it to `to_file`.
- __Arguments__:
- __model__: an instance of a Keras model (e.g. `Sequential`)
- __to_file__: the filename to save the visualization png to.
__Examples__:
```python
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.utils.dot_utils import Grapher
grapher = Grapher()
model = Sequential()
model.add(Dense(64, 2, init='uniform'))
model.add(Activation('softmax'))
grapher.plot(model, 'model.png')
```
+42
@@ -0,0 +1,42 @@
## Usage of activations
Activations can either be used through an `Activation` layer, or through the `activation` argument supported by all forward layers:
```python
from keras.layers.core import Activation, Dense
model.add(Dense(64))
model.add(Activation('tanh'))
```
is equivalent to:
```python
model.add(Dense(64, activation='tanh'))
```
You can also pass an element-wise Theano/TensorFlow function as an activation:
```python
from keras import backend as K
def tanh(x):
    return K.tanh(x)
model.add(Dense(64, activation=tanh))
model.add(Activation(tanh))
```
## Available activations
- __softmax__: Softmax applied across the input's last dimension. Expects shape either `(nb_samples, nb_timesteps, nb_dims)` or `(nb_samples, nb_dims)`.
- __softplus__
- __softsign__
- __relu__
- __tanh__
- __sigmoid__
- __hard_sigmoid__
- __linear__
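For reference, a few of the activations listed above can be written out in plain Python using their standard mathematical definitions. This is an illustrative sketch on Python floats and lists, not the backend code; the max-subtraction in `softmax` is a common numerical-stability trick added here as an assumption.

```python
import math


def softmax(xs):
    """Softmax over a list (the last dimension of one sample)."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def relu(x):
    return max(0.0, x)


def softplus(x):
    return math.log(1.0 + math.exp(x))


def softsign(x):
    return x / (1.0 + abs(x))


probs = softmax([1.0, 2.0, 3.0])
```

Softmax is the only one in this group that needs the whole vector at once; the others act on each element independently.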
## On Advanced Activations
Activations that are more complex than a simple Theano/TensorFlow function (eg. learnable activations, configurable activations, etc.) are available as [Advanced Activation layers](layers/advanced-activations.md), and can be found in the module `keras.layers.advanced_activations`. These include PReLU and LeakyReLU.
+417
@@ -0,0 +1,417 @@
# Applications
Keras Applications are deep learning models that are made available alongside pre-trained weights.
These models can be used for prediction, feature extraction, and fine-tuning.
Weights are downloaded automatically when instantiating a model. They are stored at `~/.keras/models/`.
## Available models
### Models for image classification with weights trained on ImageNet:
- [Xception](#xception)
- [VGG16](#vgg16)
- [VGG19](#vgg19)
- [ResNet50](#resnet50)
- [InceptionV3](#inceptionv3)
All of these architectures (except Xception) are compatible with both TensorFlow and Theano, and upon instantiation the models will be built according to the image dimension ordering set in your Keras configuration file at `~/.keras/keras.json`. For instance, if you have set `image_dim_ordering=tf`, then any model loaded from this repository will get built according to the TensorFlow dimension ordering convention, "Width-Height-Depth".
The Xception model is only available for TensorFlow, due to its reliance on `SeparableConvolution` layers.
### Model for music audio file auto-tagging (taking as input Mel-spectrograms):
- [MusicTaggerCRNN](#musictaggercrnn)
-----
## Usage examples for image classification models
### Classify ImageNet classes with ResNet50
```python
from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np
model = ResNet50(weights='imagenet')
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = model.predict(x)
# decode the results into a list of tuples (class, description, probability)
# (one such list for each sample in the batch)
print('Predicted:', decode_predictions(preds, top=3)[0])
# Predicted: [(u'n02504013', u'Indian_elephant', 0.82658225), (u'n01871265', u'tusker', 0.1122357), (u'n02504458', u'African_elephant', 0.061040461)]
```
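The shape of the decoded result, a list of `(class, description, probability)` tuples sorted by probability, can be imitated with a small standalone helper. The function `top_k_predictions` and the toy class index below are hypothetical and do not reproduce how `decode_predictions` is implemented or where it gets its class names.

```python
def top_k_predictions(probabilities, class_index, top=3):
    """Return the `top` most probable classes as (id, description, probability)."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i],
                    reverse=True)
    return [class_index[i] + (probabilities[i],) for i in ranked[:top]]


# hypothetical 4-class index; the real ImageNet index has 1000 entries
class_index = [('id_a', 'cat'), ('id_b', 'dog'), ('id_c', 'fox'), ('id_d', 'owl')]
probs = [0.1, 0.6, 0.05, 0.25]
decoded = top_k_predictions(probs, class_index, top=3)
```

Sorting in descending order means the first tuple is always the model's best guess.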
### Extract features with VGG16
```python
from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
import numpy as np
model = VGG16(weights='imagenet', include_top=False)
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
features = model.predict(x)
```
### Extract features from an arbitrary intermediate layer with VGG19
```python
from keras.applications.vgg19 import VGG19
from keras.preprocessing import image
from keras.applications.vgg19 import preprocess_input
from keras.models import Model
import numpy as np
base_model = VGG19(weights='imagenet')
model = Model(input=base_model.input, output=base_model.get_layer('block4_pool').output)
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
block4_pool_features = model.predict(x)
```
### Fine-tune InceptionV3 on a new set of classes
```python
from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K
# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)
# this is the model we will train
model = Model(input=base_model.input, output=predictions)
# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
    layer.trainable = False
# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
# train the model on the new data for a few epochs
model.fit_generator(...)
# at this point, the top layers are well trained and we can start fine-tuning
# convolutional layers from inception V3. We will freeze the bottom N layers
# and train the remaining top layers.
# let's visualize layer names and layer indices to see how many layers
# we should freeze:
for i, layer in enumerate(base_model.layers):
    print(i, layer.name)
# we chose to train the top 2 inception blocks, i.e. we will freeze
# the first 172 layers and unfreeze the rest:
for layer in model.layers[:172]:
    layer.trainable = False
for layer in model.layers[172:]:
    layer.trainable = True
# we need to recompile the model for these modifications to take effect
# we use SGD with a low learning rate
from keras.optimizers import SGD
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy')
# we train our model again (this time fine-tuning the top 2 inception blocks
# alongside the top Dense layers)
model.fit_generator(...)
```
### Build InceptionV3 over a custom input tensor
```python
from keras.applications.inception_v3 import InceptionV3
from keras.layers import Input
# this could also be the output of a different Keras model or layer
input_tensor = Input(shape=(224, 224, 3)) # this assumes K.image_dim_ordering() == 'tf'
model = InceptionV3(input_tensor=input_tensor, weights='imagenet', include_top=True)
```
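Since the `Input` shape in the example above is only valid for the "tf" ordering, a small helper can make the choice explicit. This is a sketch: `image_input_shape` is a hypothetical function, and in practice the ordering string would come from `K.image_dim_ordering()`.

```python
def image_input_shape(spatial, channels, dim_ordering):
    """Build an input shape tuple for the given dimension ordering.

    spatial: the two spatial sizes, e.g. (224, 224).
    """
    if dim_ordering == 'tf':
        return tuple(spatial) + (channels,)  # channels last
    if dim_ordering == 'th':
        return (channels,) + tuple(spatial)  # channels first
    raise ValueError("dim_ordering must be 'tf' or 'th'")


tf_shape = image_input_shape((224, 224), 3, 'tf')
th_shape = image_input_shape((224, 224), 3, 'th')
```

The resulting tuple could then be passed as the `shape` argument of `Input`, so the same script works under either configuration.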
-----
# Documentation for individual models
- [Xception](#xception)
- [VGG16](#vgg16)
- [VGG19](#vgg19)
- [ResNet50](#resnet50)
- [InceptionV3](#inceptionv3)
- [MusicTaggerCRNN](#musictaggercrnn)
-----
## Xception
```python
keras.applications.xception.Xception(include_top=True, weights='imagenet', input_tensor=None)
```
Xception V1 model, with weights pre-trained on ImageNet.
On ImageNet, this model gets to a top-1 validation accuracy of 0.790
and a top-5 validation accuracy of 0.945.
Note that this model is only available for the TensorFlow backend,
due to its reliance on `SeparableConvolution` layers. Additionally it only supports
the dimension ordering "tf" (width, height, channels).
The default input size for this model is 299x299.
### Arguments
- include_top: whether to include the fully-connected layer at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
### Returns
A Keras model instance.
### References
- [Xception: Deep Learning with Depthwise Separable Convolutions](https://arxiv.org/abs/1610.02357)
### License
These weights are trained by ourselves and are released under the MIT license.
-----
## VGG16
```python
keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', input_tensor=None)
```
VGG16 model, with weights pre-trained on ImageNet.
This model is available for both the Theano and TensorFlow backends, and can be built with either
"th" dim ordering (channels, width, height) or "tf" dim ordering (width, height, channels).
The default input size for this model is 224x224.
### Arguments
- include_top: whether to include the 3 fully-connected layers at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
### Returns
A Keras model instance.
### References
- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556): please cite this paper if you use the VGG models in your work.
### License
These weights are ported from the ones [released by VGG at Oxford](http://www.robots.ox.ac.uk/~vgg/research/very_deep/) under the [Creative Commons Attribution License](https://creativecommons.org/licenses/by/4.0/).
-----
## VGG19
```python
keras.applications.vgg19.VGG19(include_top=True, weights='imagenet', input_tensor=None)
```
VGG19 model, with weights pre-trained on ImageNet.
This model is available for both the Theano and TensorFlow backend, and can be built both
with "th" dim ordering (channels, width, height) or "tf" dim ordering (width, height, channels).
The default input size for this model is 224x224.
### Arguments
- include_top: whether to include the 3 fully-connected layers at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
### Returns
A Keras model instance.
### References
- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)
### License
These weights are ported from the ones [released by VGG at Oxford](http://www.robots.ox.ac.uk/~vgg/research/very_deep/) under the [Creative Commons Attribution License](https://creativecommons.org/licenses/by/4.0/).
-----
## ResNet50
```python
keras.applications.resnet50.ResNet50(include_top=True, weights='imagenet', input_tensor=None)
```
ResNet50 model, with weights pre-trained on ImageNet.
This model is available for both the Theano and TensorFlow backend, and can be built both
with "th" dim ordering (channels, width, height) or "tf" dim ordering (width, height, channels).
The default input size for this model is 224x224.
### Arguments
- include_top: whether to include the fully-connected layer at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
### Returns
A Keras model instance.
### References
- [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)
### License
These weights are ported from the ones [released by Kaiming He](https://github.com/KaimingHe/deep-residual-networks) under the [MIT license](https://github.com/KaimingHe/deep-residual-networks/blob/master/LICENSE).
-----
## InceptionV3
```python
keras.applications.inception_v3.InceptionV3(include_top=True, weights='imagenet', input_tensor=None)
```
Inception V3 model, with weights pre-trained on ImageNet.
This model is available for both the Theano and TensorFlow backend, and can be built both
with "th" dim ordering (channels, width, height) or "tf" dim ordering (width, height, channels).
The default input size for this model is 299x299.
### Arguments
- include_top: whether to include the fully-connected layer at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
### Returns
A Keras model instance.
### References
- [Rethinking the Inception Architecture for Computer Vision](http://arxiv.org/abs/1512.00567)
### License
We trained these weights ourselves and release them under the MIT license.
-----
## MusicTaggerCRNN
```python
keras.applications.music_tagger_crnn.MusicTaggerCRNN(weights='msd', input_tensor=None, include_top=True)
```
A convolutional-recurrent model that takes as input a vectorized representation of the MelSpectrogram of a music track and is capable of outputting the musical genre of the track. You can use `keras.applications.music_tagger_crnn.preprocess_input` to convert a sound file to a vectorized spectrogram. This requires the [Librosa](http://librosa.github.io/librosa/) library to be installed. See [the usage example](#music-tagging-and-feature-extraction-with-musictaggercrnn).
### Arguments
- weights: one of `None` (random initialization) or "msd" (pre-training on [Million Song Dataset](http://labrosa.ee.columbia.edu/millionsong/)).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as input for the model.
- include_top: whether to include the 1 fully-connected layer (output layer) at the top of the network. If False, the network outputs 32-dim features.
### Returns
A Keras model instance.
### References
- [Convolutional Recurrent Neural Networks for Music Classification](https://arxiv.org/abs/1609.04243)
### License
These weights are ported from the ones [released by Keunwoo Choi](https://github.com/keunwoochoi/music-auto_tagging-keras) under the [MIT license](https://github.com/keunwoochoi/music-auto_tagging-keras/blob/master/LICENSE.md).
### Examples: music tagging and audio feature extraction
```python
from keras.applications.music_tagger_crnn import MusicTaggerCRNN
from keras.applications.music_tagger_crnn import preprocess_input, decode_predictions
import numpy as np
# 1. Tagging
model = MusicTaggerCRNN(weights='msd')
audio_path = 'audio_file.mp3'
melgram = preprocess_input(audio_path)
melgrams = np.expand_dims(melgram, axis=0)
preds = model.predict(melgrams)
print('Predicted:')
print(decode_predictions(preds))
# print: ('Predicted:', [[('rock', 0.097071797), ('pop', 0.042456303), ('alternative', 0.032439161), ('indie', 0.024491295), ('female vocalists', 0.016455274)]])
# 2. Feature extraction
model = MusicTaggerCRNN(weights='msd', include_top=False)
audio_path = 'audio_file.mp3'
melgram = preprocess_input(audio_path)
melgrams = np.expand_dims(melgram, axis=0)
feats = model.predict(melgrams)
print('Features:')
print(feats[0, :10])
# print: ('Features:', [-0.19160545 0.94259131 -0.9991011 0.47644514 -0.19089699 0.99033844 0.1103896 -0.00340496 0.14823607 0.59856361])
```
+99
@@ -0,0 +1,99 @@
# Keras backends
## What is a "backend"?
Keras is a model-level library, providing high-level building blocks for developing deep learning models. It does not itself handle low-level operations such as tensor products, convolutions and so on. Instead, it relies on a specialized, well-optimized tensor manipulation library to do so, serving as the "backend engine" of Keras. Rather than picking a single tensor library and tying the implementation of Keras to it, Keras handles the problem in a modular way, and several different backend engines can be plugged seamlessly into Keras.
At this time, Keras has two backend implementations available: the **TensorFlow** backend and the **Theano** backend.
- [TensorFlow](http://www.tensorflow.org/) is an open-source symbolic tensor manipulation framework developed by Google, Inc.
- [Theano](http://deeplearning.net/software/theano/) is an open-source symbolic tensor manipulation framework developed by LISA/MILA Lab at Université de Montréal.
In the future, we are likely to add more backend options. If you are interested in developing a new backend, get in touch!
----
## Switching from one backend to another
If you have run Keras at least once, you will find the Keras configuration file at:
`~/.keras/keras.json`
If it isn't there, you can create it.
The default configuration file looks like this:
```
{
"image_dim_ordering": "tf",
"epsilon": 1e-07,
"floatx": "float32",
"backend": "tensorflow"
}
```
Simply change the field `backend` to either `"theano"` or `"tensorflow"`, and Keras will use the new configuration next time you run any Keras code.
You can also define the environment variable ``KERAS_BACKEND``, which
overrides what is defined in your config file:
```bash
KERAS_BACKEND=tensorflow python -c "from keras import backend"
Using TensorFlow backend.
```
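The precedence described above (the environment variable wins over the config file) can be sketched in plain Python. This is only an illustration under stated assumptions: `resolve_backend` and `set_backend` are hypothetical helper names, not Keras functions, and this is not the code Keras itself uses to load its configuration.

```python
import json
import os
import tempfile


def resolve_backend(config_path, environ=None, default='tensorflow'):
    """Return the backend name, letting KERAS_BACKEND override the config file."""
    environ = os.environ if environ is None else environ
    backend = default
    if os.path.exists(config_path):
        with open(config_path) as f:
            backend = json.load(f).get('backend', default)
    # The environment variable, when set, takes precedence over the file.
    return environ.get('KERAS_BACKEND', backend)


def set_backend(config_path, backend):
    """Rewrite the `backend` field of the config file, keeping the other fields."""
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as f:
            config = json.load(f)
    config['backend'] = backend
    with open(config_path, 'w') as f:
        json.dump(config, f, indent=4)


# Demonstration against a throwaway file rather than the real ~/.keras/keras.json.
demo_path = os.path.join(tempfile.mkdtemp(), 'keras.json')
set_backend(demo_path, 'theano')
from_file = resolve_backend(demo_path, environ={})
from_env = resolve_backend(demo_path, environ={'KERAS_BACKEND': 'tensorflow'})
```

To edit the real file, you would pass `os.path.expanduser('~/.keras/keras.json')` as `config_path`.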
----
## Using the abstract Keras backend to write new code
If you want the Keras modules you write to be compatible with both Theano and TensorFlow, you have to write them via the abstract Keras backend API. Here's an intro.
You can import the backend module via:
```python
from keras import backend as K
```
The code below instantiates an input placeholder. It's equivalent to `tf.placeholder()` or `T.matrix()`, `T.tensor3()`, etc.
```python
input = K.placeholder(shape=(2, 4, 5))
# also works:
input = K.placeholder(shape=(None, 4, 5))
# also works:
input = K.placeholder(ndim=3)
```
The code below instantiates a shared variable. It's equivalent to `tf.Variable()` or `theano.shared()`.
```python
import numpy as np
val = np.random.random((3, 4, 5))
var = K.variable(value=val)
# all-zeros variable:
var = K.zeros(shape=(3, 4, 5))
# all-ones:
var = K.ones(shape=(3, 4, 5))
```
Most tensor operations you will need can be done as you would in TensorFlow or Theano:
```python
a = b + c * K.abs(d)
c = K.dot(a, K.transpose(b))
a = K.sum(b, axis=2)
a = K.softmax(b)
a = K.concatenate([b, c], axis=-1)
# etc...
```
----
## Backend functions
{{autogenerated}}
+72
@@ -0,0 +1,72 @@
## Usage of callbacks
A callback is a set of functions to be applied at given stages of the training procedure. You can use callbacks to get a view on internal states and statistics of the model during training. You can pass a list of callbacks (as the keyword argument `callbacks`) to the `.fit()` method of the `Sequential` model. The relevant methods of the callbacks will then be called at each stage of the training.
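The idea of hooks being called at each stage can be illustrated without Keras. The sketch below uses a stand-in training loop (`toy_fit`) and a stand-in callback class (`RecordingCallback`); both names are hypothetical, and the real Keras loop calls more hooks than the three shown here.

```python
class RecordingCallback(object):
    """Toy callback that records which hooks were invoked."""

    def __init__(self):
        self.events = []

    def on_train_begin(self, logs=None):
        self.events.append('train_begin')

    def on_epoch_end(self, epoch, logs=None):
        self.events.append(('epoch_end', epoch, logs['loss']))

    def on_train_end(self, logs=None):
        self.events.append('train_end')


def toy_fit(nb_epoch, callbacks):
    """Stand-in for a training loop: calls each hook at its stage."""
    for cb in callbacks:
        cb.on_train_begin()
    for epoch in range(nb_epoch):
        loss = 1.0 / (epoch + 1)  # fake, steadily decreasing loss
        for cb in callbacks:
            cb.on_epoch_end(epoch, logs={'loss': loss})
    for cb in callbacks:
        cb.on_train_end()


recorder = RecordingCallback()
toy_fit(nb_epoch=2, callbacks=[recorder])
```

After the run, `recorder.events` lists the hooks in the order they fired, which is the kind of view on training that a real callback provides.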
---
{{autogenerated}}
---
# Create a callback
You can create a custom callback by extending the base class `keras.callbacks.Callback`. A callback has access to its associated model through the class property `self.model`.
Here's a simple example saving a list of losses over each batch during training:
```python
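import keras

class LossHistory(keras.callbacks.Callback):
    def on_train_begin(self, logs={}):
        # start each training run with an empty record
        self.losses = []

    def on_batch_end(self, batch, logs={}):
        # `logs` holds the metrics of the batch that just finished
        self.losses.append(logs.get('loss'))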
```
---
### Example: recording loss history
```python
import keras
from keras.models import Sequential
from keras.layers import Dense, Activation

class LossHistory(keras.callbacks.Callback):
    def on_train_begin(self, logs={}):
        self.losses = []

    def on_batch_end(self, batch, logs={}):
        self.losses.append(logs.get('loss'))

model = Sequential()
model.add(Dense(10, input_dim=784, init='uniform'))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
history = LossHistory()
# X_train and Y_train are assumed to be already loaded
model.fit(X_train, Y_train, batch_size=128, nb_epoch=20, verbose=0, callbacks=[history])
print(history.losses)
# outputs
'''
[0.66047596406559383, 0.3547245744908703, ..., 0.25953155204159617, 0.25901699725311789]
'''
```
---
### Example: model checkpoints
```python
from keras.callbacks import ModelCheckpoint
from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential()
model.add(Dense(10, input_dim=784, init='uniform'))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

# saves the model weights after each epoch if the validation loss decreased
checkpointer = ModelCheckpoint(filepath="/tmp/weights.hdf5", verbose=1, save_best_only=True)
# X_train, Y_train, X_test and Y_test are assumed to be already loaded
model.fit(X_train, Y_train, batch_size=128, nb_epoch=20, verbose=0, validation_data=(X_test, Y_test), callbacks=[checkpointer])
```
@@ -2,14 +2,17 @@
Functions from the `constraints` module allow setting constraints (e.g. non-negativity) on network parameters during optimization.
The keyword arguments used for passing constraints to parameters in a layer will depend on the layer.
The penalties are applied on a per-layer basis. The exact API will depend on the layer, but the layers `Dense`, `TimeDistributedDense`, `MaxoutDense`, `Convolution1D` and `Convolution2D` have a unified API.
In the `Dense` layer it is simply `W_constraint` for the main weights matrix, and `b_constraint` for the bias.
These layers expose 2 keyword arguments:
- `W_constraint` for the main weights matrix
- `b_constraint` for the bias.
```python
from keras.constraints import maxnorm
model.add(Dense(64, 64, W_constraint = maxnorm(2)))
model.add(Dense(64, W_constraint = maxnorm(2)))
```
## Available constraints
+35 -16
@@ -2,13 +2,13 @@
## CIFAR10 small image classification
`keras.datasets.cifar10`
Dataset of 50,000 32x32 color training images, labeled over 10 categories, and 10,000 test images.
### Usage:
```python
from keras.datasets import cifar10
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
```
@@ -21,13 +21,13 @@ Dataset of 50,000 32x32 color training images, labeled over 10 categories, and 1
## CIFAR100 small image classification
`keras.datasets.cifar100`
Dataset of 50,000 32x32 color training images, labeled over 100 categories, and 10,000 test images.
### Usage:
```python
from keras.datasets import cifar100
(X_train, y_train), (X_test, y_test) = cifar100.load_data(label_mode='fine')
```
@@ -44,8 +44,6 @@ Dataset of 50,000 32x32 color training images, labeled over 100 categories, and
## IMDB Movie reviews sentiment classification
`keras.datasets.imdb`
Dataset of 25,000 movie reviews from IMDB, labeled by sentiment (positive/negative). Reviews have been preprocessed, and each review is encoded as a [sequence](preprocessing/sequence.md) of word indexes (integers). For convenience, words are indexed by overall frequency in the dataset, so that for instance the integer "3" encodes the 3rd most frequent word in the data. This allows for quick filtering operations such as: "only consider the top 10,000 most common words, but eliminate the top 20 most common words".
As a convention, "0" does not stand for a specific word, but instead is used to encode any unknown word.
@@ -53,8 +51,16 @@ As a convention, "0" does not stand for a specific word, but instead is used to
### Usage:
```python
(X_train, y_train), (X_test, y_test) = imdb.load_data(path="imdb.pkl", \
nb_words=None, skip_top=0, maxlen=None, test_split=0.1, seed=113)
from keras.datasets import imdb
(X_train, y_train), (X_test, y_test) = imdb.load_data(path="imdb_full.pkl",
nb_words=None,
skip_top=0,
maxlen=None,
seed=113,
start_char=1,
oov_char=2,
index_from=3)
```
- __Return:__
- 2 tuples:
@@ -67,25 +73,38 @@ nb_words=None, skip_top=0, maxlen=None, test_split=0.1, seed=113)
- __nb_words__: integer or None. Top most frequent words to consider. Any less frequent word will appear as 0 in the sequence data.
- __skip_top__: integer. Top most frequent words to ignore (they will appear as 0s in the sequence data).
- __maxlen__: int. Maximum sequence length. Any longer sequence will be truncated.
- __test_split__: float. Fraction of the dataset to be used as test data.
- __seed__: int. Seed for reproducible data shuffling.
- __start_char__: char. The start of a sequence will be marked with this character.
Set to 1 because 0 is usually the padding character.
- __oov_char__: char. Words that were cut out because of the `nb_words`
or `skip_top` limit will be replaced with this character.
- __index_from__: int. Index actual words with this index and higher.
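The effect of `nb_words`, `skip_top` and `oov_char` on the encoded sequences can be sketched in plain Python. This is a minimal illustration, not the library's code: `filter_sequences` is a hypothetical name, and the exact boundary handling (an index is kept when `skip_top <= index < nb_words`) is an assumption.

```python
def filter_sequences(sequences, nb_words=None, skip_top=0, oov_char=2):
    """Replace out-of-range word indexes with `oov_char`.

    An index is kept only if skip_top <= index < nb_words (assumed boundaries).
    """
    if nb_words is None:
        nb_words = max(max(seq) for seq in sequences) + 1
    return [[w if skip_top <= w < nb_words else oov_char for w in seq]
            for seq in sequences]


# Two toy reviews encoded as word indexes (hypothetical values).
reviews = [[1, 14, 22, 5000, 7], [1, 9, 12000, 30]]
filtered = filter_sequences(reviews, nb_words=10000, skip_top=10)
```

Because words are indexed by frequency, dropping low indexes removes the most common words and dropping high indexes removes the rarest ones.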
---
## Reuters newswire topics classification
`keras.datasets.reuters`
Dataset of 11,228 newswires from Reuters, labeled over 46 topics. As with the IMDB dataset, each wire is encoded as a sequence of word indexes (same conventions).
### Usage:
```python
(X_train, y_train), (X_test, y_test) = reuters.load_data(path="reuters.pkl", \
nb_words=None, skip_top=0, maxlen=None, test_split=0.1, seed=113)
from keras.datasets import reuters
(X_train, y_train), (X_test, y_test) = reuters.load_data(path="reuters.pkl",
nb_words=None,
skip_top=0,
maxlen=None,
test_split=0.2,
seed=113,
start_char=1,
oov_char=2,
index_from=3)
```
The specifications are the same as that of the IMDB dataset.
The specifications are the same as that of the IMDB dataset, with the addition of:
- __test_split__: float. Fraction of the dataset to be used as test data.
This dataset also makes available the word index used for encoding the sequences:
@@ -101,13 +120,13 @@ word_index = reuters.get_word_index(path="reuters_word_index.pkl")
## MNIST database of handwritten digits
`keras.datasets.mnist`
Dataset of 60,000 28x28 grayscale images of the 10 digits, along with a test set of 10,000 images.
### Usage:
```python
from keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
```
+385
@@ -0,0 +1,385 @@
# Keras FAQ: Frequently Asked Keras Questions
- [How should I cite Keras?](#how-should-i-cite-keras)
- [How can I run Keras on GPU?](#how-can-i-run-keras-on-gpu)
- [How can I save a Keras model?](#how-can-i-save-a-keras-model)
- [Why is the training loss much higher than the testing loss?](#why-is-the-training-loss-much-higher-than-the-testing-loss)
- [How can I visualize the output of an intermediate layer?](#how-can-i-visualize-the-output-of-an-intermediate-layer)
- [How can I use Keras with datasets that don't fit in memory?](#how-can-i-use-keras-with-datasets-that-dont-fit-in-memory)
- [How can I interrupt training when the validation loss isn't decreasing anymore?](#how-can-i-interrupt-training-when-the-validation-loss-isnt-decreasing-anymore)
- [How is the validation split computed?](#how-is-the-validation-split-computed)
- [Is the data shuffled during training?](#is-the-data-shuffled-during-training)
- [How can I record the training / validation loss / accuracy at each epoch?](#how-can-i-record-the-training-validation-loss-accuracy-at-each-epoch)
- [How can I "freeze" Keras layers?](#how-can-i-freeze-keras-layers)
- [How can I use stateful RNNs?](#how-can-i-use-stateful-rnns)
- [How can I remove a layer from a Sequential model?](#how-can-i-remove-a-layer-from-a-sequential-model)
- [How can I use pre-trained models in Keras?](#how-can-i-use-pre-trained-models-in-keras)
---
### How should I cite Keras?
Please cite Keras in your publications if it helps your research. Here is an example BibTeX entry:
```
@misc{chollet2015keras,
title={Keras},
author={Chollet, Fran\c{c}ois},
year={2015},
publisher={GitHub},
howpublished={\url{https://github.com/fchollet/keras}},
}
```
### How can I run Keras on GPU?
If you are running on the TensorFlow backend, your code will automatically run on GPU if any available GPU is detected.
If you are running on the Theano backend, you can use one of the following methods:
Method 1: use Theano flags.
```bash
THEANO_FLAGS=device=gpu,floatX=float32 python my_keras_script.py
```
The name 'gpu' might have to be changed depending on your device's identifier (e.g. `gpu0`, `gpu1`, etc).
Method 2: set up your `.theanorc`: [Instructions](http://deeplearning.net/software/theano/library/config.html)
Method 3: manually set `theano.config.device`, `theano.config.floatX` at the beginning of your code:
```python
import theano
theano.config.device = 'gpu'
theano.config.floatX = 'float32'
```
---
### How can I save a Keras model?
*It is not recommended to use pickle or cPickle to save a Keras model.*
You can use `model.save(filepath)` to save a Keras model into a single HDF5 file which will contain:
- the architecture of the model, allowing the model to be re-created
- the weights of the model
- the training configuration (loss, optimizer)
- the state of the optimizer, allowing you to resume training exactly where you left off.
You can then use `keras.models.load_model(filepath)` to reinstantiate your model.
`load_model` will also take care of compiling the model using the saved training configuration
(unless the model was never compiled in the first place).
Example:
```python
from keras.models import load_model
model.save('my_model.h5') # creates an HDF5 file 'my_model.h5'
del model # deletes the existing model
# returns a compiled model
# identical to the previous one
model = load_model('my_model.h5')
```
If you only need to save the **architecture of a model**, and not its weights or its training configuration, you can do:
```python
# save as JSON
json_string = model.to_json()
# save as YAML
yaml_string = model.to_yaml()
```
The generated JSON / YAML files are human-readable and can be manually edited if needed.
You can then build a fresh model from this data:
```python
# model reconstruction from JSON:
from keras.models import model_from_json
model = model_from_json(json_string)
# model reconstruction from YAML:
from keras.models import model_from_yaml
model = model_from_yaml(yaml_string)
```
If you need to save the **weights of a model**, you can do so in HDF5 with the code below.
Note that you will first need to install HDF5 and the Python library h5py, which do not come bundled with Keras.
```python
model.save_weights('my_model_weights.h5')
```
Assuming you have code for instantiating your model, you can then load the weights you saved into a model with the *same* architecture:
```python
model.load_weights('my_model_weights.h5')
```
If you need to load weights into a *different* architecture (with some layers in common), for instance for fine-tuning or transfer-learning, you can load weights by *layer name*:
```python
model.load_weights('my_model_weights.h5', by_name=True)
```
For example:
```python
"""
Assume original model looks like this:
model = Sequential()
model.add(Dense(2, input_dim=3, name="dense_1"))
model.add(Dense(3, name="dense_2"))
...
model.save_weights(fname)
"""
# new model
model = Sequential()
model.add(Dense(2, input_dim=3, name="dense_1")) # will be loaded
model.add(Dense(10, name="new_dense")) # will not be loaded
# load weights from first model; will only affect the first layer, dense_1.
model.load_weights(fname, by_name=True)
```
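The name-matching rule can be pictured with plain dictionaries. The sketch below is only an analogy for what `by_name=True` does: `load_weights_by_name` is a hypothetical helper, and the lists stand in for real weight arrays.

```python
def load_weights_by_name(saved_weights, layer_names):
    """Return {layer_name: weights} for layers whose name appears in the saved file."""
    loaded = {}
    for name in layer_names:
        if name in saved_weights:
            loaded[name] = saved_weights[name]
    return loaded


# Stand-ins for the weights file and the new model's layers (hypothetical values).
saved_weights = {'dense_1': [0.1, 0.2], 'dense_2': [0.3, 0.4, 0.5]}
new_model_layers = ['dense_1', 'new_dense']

loaded = load_weights_by_name(saved_weights, new_model_layers)
skipped = [n for n in new_model_layers if n not in loaded]
```

Layers that share a name with a saved layer receive its weights; layers without a match, such as `new_dense`, keep their initial weights.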
---
### Why is the training loss much higher than the testing loss?
A Keras model has two modes: training and testing. Regularization mechanisms, such as Dropout and L1/L2 weight regularization, are turned off at testing time.
Besides, the training loss is the average of the losses over each batch of training data. Because your model is changing over time, the loss over the first batches of an epoch is generally higher than over the last batches. On the other hand, the testing loss for an epoch is computed using the model as it is at the end of the epoch, resulting in a lower loss.
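A small worked example of the averaging effect, using made-up per-batch losses. The numbers are hypothetical, and the last batch loss is used here only as a stand-in for a loss evaluated with the end-of-epoch weights.

```python
# Hypothetical per-batch training losses within one epoch; the model improves as it trains.
batch_losses = [1.0, 0.75, 0.5, 0.25]

# The reported training loss is the average over all batches of the epoch.
reported_training_loss = sum(batch_losses) / len(batch_losses)

# A loss evaluated once at the end of the epoch reflects only the final weights.
# Here the last batch loss stands in for it.
end_of_epoch_loss = batch_losses[-1]
```

The average (0.625) is pulled up by the early, poorly fitted batches, while the end-of-epoch value (0.25) is not.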
---
### How can I visualize the output of an intermediate layer?
You can build a Keras function that will return the output of a certain layer given a certain input, for example:
```python
from keras import backend as K
# with a Sequential model
get_3rd_layer_output = K.function([model.layers[0].input],
[model.layers[3].output])
layer_output = get_3rd_layer_output([X])[0]
```
Similarly, you could build a Theano or TensorFlow function directly.
Note that if your model has a different behavior in training and testing phase (e.g. if it uses `Dropout`, `BatchNormalization`, etc.), you will need
to pass the learning phase flag to your function:
```python
get_3rd_layer_output = K.function([model.layers[0].input, K.learning_phase()],
[model.layers[3].output])
# output in test mode = 0
layer_output = get_3rd_layer_output([X, 0])[0]
# output in train mode = 1
layer_output = get_3rd_layer_output([X, 1])[0]
```
Another more flexible way of getting output from intermediate layers is to use the [functional API](/getting-started/functional-api-guide). For example, if you have created an autoencoder for MNIST:
```python
from keras.layers import Input, Dense
from keras.models import Model
inputs = Input(shape=(784,))
encoded = Dense(32, activation='relu')(inputs)
decoded = Dense(784)(encoded)
model = Model(input=inputs, output=decoded)
```
After compiling and training the model, you can get the output of the data from the encoder like this:
```python
encoder = Model(input=inputs, output=encoded)
X_encoded = encoder.predict(X)
```
---
### How can I use Keras with datasets that don't fit in memory?
You can do batch training using `model.train_on_batch(X, y)` and `model.test_on_batch(X, y)`. See the [models documentation](/models/sequential).
Alternatively, you can write a generator that yields batches of training data and use the method `model.fit_generator(data_generator, samples_per_epoch, nb_epoch)`.
You can see batch training in action in our [CIFAR10 example](https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py).
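As a rough sketch of such a generator: it loads one chunk of data at a time and loops forever, leaving it to the caller to decide when an epoch ends. `batch_generator` and `fake_load_chunk` are hypothetical names, and plain lists stand in for the Numpy arrays a real model would need.

```python
def batch_generator(load_chunk, nb_chunks, batch_size):
    """Yield (X, y) batches forever, loading one chunk of data at a time."""
    while True:  # loop indefinitely; the caller decides when an epoch ends
        for chunk_index in range(nb_chunks):
            X, y = load_chunk(chunk_index)
            for start in range(0, len(X), batch_size):
                yield X[start:start + batch_size], y[start:start + batch_size]


def fake_load_chunk(chunk_index):
    """Stand-in for reading one file from disk: 4 samples per chunk."""
    base = chunk_index * 4
    X = [[i, i + 0.5] for i in range(base, base + 4)]
    y = [i % 2 for i in range(base, base + 4)]
    return X, y


gen = batch_generator(fake_load_chunk, nb_chunks=2, batch_size=3)
first_batches = [next(gen) for _ in range(5)]
```

Only one chunk is held in memory at any time. With this toy data, one pass over both chunks covers 8 samples, which is the value you would pass as `samples_per_epoch`.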
---
### How can I interrupt training when the validation loss isn't decreasing anymore?
You can use an `EarlyStopping` callback:
```python
from keras.callbacks import EarlyStopping
early_stopping = EarlyStopping(monitor='val_loss', patience=2)
model.fit(X, y, validation_split=0.2, callbacks=[early_stopping])
```
Find out more in the [callbacks documentation](/callbacks).
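The role of `patience` can be sketched in a few lines of plain Python. This is a simplified illustration, not the callback's actual code: `early_stopping_epoch` is a hypothetical name, and the exact epoch at which Keras stops may differ by one depending on the version.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the epoch index at which training would stop, or None if it never stops.

    Training stops once the loss has gone `patience` consecutive epochs without
    improving on the best value seen so far.
    """
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0  # any improvement resets the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


stopped = early_stopping_epoch([1.0, 0.9, 0.95, 0.92, 0.8], patience=2)
never = early_stopping_epoch([1.0, 0.9, 0.8, 0.7], patience=2)
```

In the first run the loss fails to beat 0.9 for two epochs in a row, so training stops before the later improvement to 0.8 is ever seen; a larger `patience` trades longer training for tolerance of such plateaus.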
---
### How is the validation split computed?
If you set the `validation_split` argument in `model.fit` to e.g. 0.1, then the validation data used will be the *last 10%* of the data. If you set it to 0.25, it will be the last 25% of the data, etc.
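A minimal sketch of this slicing, assuming the split point is the sample count times `(1 - validation_split)`, truncated to an integer. `split_validation` is a hypothetical helper, not a Keras function.

```python
def split_validation(samples, validation_split):
    """Hold out the last fraction of `samples`, with no shuffling beforehand."""
    split_at = int(len(samples) * (1.0 - validation_split))
    return samples[:split_at], samples[split_at:]


samples = list(range(8))
train, val = split_validation(samples, validation_split=0.25)
```

Because the held-out samples are simply the last ones, data that is sorted (for instance by class) should be shuffled before calling `fit` with `validation_split`.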
---
### Is the data shuffled during training?
Yes, if the `shuffle` argument in `model.fit` is set to `True` (which is the default), the training data will be randomly shuffled at each epoch.
Validation data is never shuffled.
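Per-epoch shuffling amounts to drawing a fresh permutation of the sample indices at the start of every epoch. The sketch below illustrates this with the standard library; `epoch_orders` and its `seed` parameter are assumptions made for the example, not part of the `fit` API.

```python
import random


def epoch_orders(nb_samples, nb_epoch, shuffle=True, seed=0):
    """Return the order in which sample indices are visited in each epoch."""
    rng = random.Random(seed)
    orders = []
    for _ in range(nb_epoch):
        indices = list(range(nb_samples))
        if shuffle:
            rng.shuffle(indices)  # a fresh permutation every epoch
        orders.append(indices)
    return orders


shuffled = epoch_orders(nb_samples=6, nb_epoch=3)
unshuffled = epoch_orders(nb_samples=6, nb_epoch=3, shuffle=False)
```

Every epoch still visits each sample exactly once; only the order changes.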
---
### How can I record the training / validation loss / accuracy at each epoch?
The `model.fit` method returns a `History` callback, which has a `history` attribute containing the lists of successive losses and other metrics.
```python
hist = model.fit(X, y, validation_split=0.2)
print(hist.history)
```
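As an illustration of working with such a record, the sketch below finds the epoch with the lowest validation loss. The dictionary is a hand-written stand-in for `hist.history` with made-up values; the keys `loss` and `val_loss` are the ones you would expect when training with validation data.

```python
# Hypothetical contents of `hist.history` after 4 epochs with validation data.
history = {
    'loss': [0.9, 0.6, 0.4, 0.3],
    'val_loss': [0.95, 0.7, 0.65, 0.68],
}

# Epoch (0-based) with the lowest validation loss.
best_epoch = min(range(len(history['val_loss'])),
                 key=lambda e: history['val_loss'][e])
best_val_loss = history['val_loss'][best_epoch]
```

Each list has one entry per epoch, so the same index can be used to look up the training loss for that epoch.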
---
### How can I "freeze" Keras layers?
To "freeze" a layer means to exclude it from training, i.e. its weights will never be updated. This is useful in the context of fine-tuning a model, or using fixed embeddings for a text input.
You can pass a `trainable` argument (boolean) to a layer constructor to set a layer to be non-trainable:
```python
frozen_layer = Dense(32, trainable=False)
```
Additionally, you can set the `trainable` property of a layer to `True` or `False` after instantiation. For this to take effect, you will need to call `compile()` on your model after modifying the `trainable` property. Here's an example:
```python
x = Input(shape=(32,))
layer = Dense(32)
layer.trainable = False
y = layer(x)
frozen_model = Model(x, y)
# in the model below, the weights of `layer` will not be updated during training
frozen_model.compile(optimizer='rmsprop', loss='mse')
layer.trainable = True
trainable_model = Model(x, y)
# with this model the weights of the layer will be updated during training
# (which will also affect the above model since it uses the same layer instance)
trainable_model.compile(optimizer='rmsprop', loss='mse')
frozen_model.fit(data, labels) # this does NOT update the weights of `layer`
trainable_model.fit(data, labels) # this updates the weights of `layer`
```
---
### How can I use stateful RNNs?
Making an RNN stateful means that the states for the samples of each batch will be reused as initial states for the samples in the next batch.
When using stateful RNNs, it is therefore assumed that:
- all batches have the same number of samples
- If `X1` and `X2` are successive batches of samples, then `X2[i]` is the follow-up sequence to `X1[i]`, for every `i`.
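One way to satisfy the second assumption is to cut each long sequence into consecutive windows and place window `k` of every sequence in batch `k`. The sketch below shows this with plain lists; `successive_batches` is a hypothetical helper and the data is made up.

```python
def successive_batches(sequences, timesteps):
    """Cut each long sequence into consecutive windows of `timesteps` steps.

    Batch k holds window k of every sequence, so row i of one batch is always
    the follow-up of row i of the previous batch.
    """
    nb_windows = min(len(seq) for seq in sequences) // timesteps
    return [[seq[k * timesteps:(k + 1) * timesteps] for seq in sequences]
            for k in range(nb_windows)]


# Two long sequences (hypothetical data), fed in windows of 3 timesteps.
sequences = [list(range(0, 6)), list(range(100, 106))]
X1, X2 = successive_batches(sequences, timesteps=3)
```

Here `X2[0]` continues `X1[0]` and `X2[1]` continues `X1[1]`, and both batches have the same number of samples, as stateful layers require.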
To use statefulness in RNNs, you need to:
- explicitly specify the batch size you are using, by passing a `batch_input_shape` argument to the first layer in your model. It should be a tuple of integers, e.g. `(32, 10, 16)` for a 32-samples batch of sequences of 10 timesteps with 16 features per timestep.
- set `stateful=True` in your RNN layer(s).
To reset the states accumulated:
- use `model.reset_states()` to reset the states of all layers in the model
- use `layer.reset_states()` to reset the states of a specific stateful RNN layer
Example:
```python
X # this is our input data, of shape (32, 21, 16)
# we will feed it to our model in sequences of length 10
model = Sequential()
model.add(LSTM(32, batch_input_shape=(32, 10, 16), stateful=True))
model.add(Dense(16, activation='softmax'))
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
# we train the network to predict the 11th timestep given the first 10:
model.train_on_batch(X[:, :10, :], np.reshape(X[:, 10, :], (32, 16)))
# the state of the network has changed. We can feed the follow-up sequences:
model.train_on_batch(X[:, 10:20, :], np.reshape(X[:, 20, :], (32, 16)))
# let's reset the states of the LSTM layer:
model.reset_states()
# another way to do it in this case:
model.layers[0].reset_states()
```
Note that the methods `predict`, `fit`, `train_on_batch`, `predict_classes`, etc. will *all* update the states of the stateful layers in a model. This allows you to do not only stateful training, but also stateful prediction.
---
### How can I remove a layer from a Sequential model?
You can remove the last added layer in a Sequential model by calling `.pop()`:
```python
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=784))
model.add(Dense(32, activation='relu'))
print(len(model.layers)) # "2"
model.pop()
print(len(model.layers)) # "1"
```
---
### How can I use pre-trained models in Keras?
Code and pre-trained weights are available for the following image classification models:
- VGG16
- VGG19
- ResNet50
- Inception v3
They can be imported from the module `keras.applications`:
```python
from keras.applications.vgg16 import VGG16
from keras.applications.vgg19 import VGG19
from keras.applications.resnet50 import ResNet50
from keras.applications.inception_v3 import InceptionV3
model = VGG16(weights='imagenet', include_top=True)
```
For a few simple usage examples, see [the documentation for the Applications module](/applications).
For a detailed example of how to use such a pre-trained model for feature extraction or for fine-tuning, see [this blog post](http://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html).
The VGG16 model is also the basis for several Keras example scripts:
- [Style transfer](https://github.com/fchollet/keras/blob/master/examples/neural_style_transfer.py)
- [Feature visualization](https://github.com/fchollet/keras/blob/master/examples/conv_filter_visualization.py)
- [Deep dream](https://github.com/fchollet/keras/blob/master/examples/deep_dream.py)
+421
@@ -0,0 +1,421 @@
# Getting started with the Keras functional API
The Keras functional API is the way to go for defining complex models, such as multi-output models, directed acyclic graphs, or models with shared layers.
This guide assumes that you are already familiar with the `Sequential` model.
Let's start with something simple.
-----
## First example: fully connected network
The `Sequential` model is probably a better choice to implement such a network, but it helps to start with something really simple.
- A layer instance is callable (on a tensor), and it returns a tensor
- Input tensor(s) and output tensor(s) can then be used to define a `Model`
- Such a model can be trained just like Keras `Sequential` models.
```python
from keras.layers import Input, Dense
from keras.models import Model
# this returns a tensor
inputs = Input(shape=(784,))
# a layer instance is callable on a tensor, and returns a tensor
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)
# this creates a model that includes
# the Input layer and three Dense layers
model = Model(input=inputs, output=predictions)
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['accuracy'])
model.fit(data, labels) # starts training
```
-----
## All models are callable, just like layers
With the functional API, it is easy to re-use trained models: you can treat any model as if it were a layer, by calling it on a tensor. Note that by calling a model you aren't just re-using the *architecture* of the model, you are also re-using its weights.
```python
x = Input(shape=(784,))
# this works, and returns the 10-way softmax we defined above.
y = model(x)
```
This makes it possible, for instance, to quickly create models that can process *sequences* of inputs. You could turn an image classification model into a video classification model, in just one line.
```python
from keras.layers import TimeDistributed
# input tensor for sequences of 20 timesteps,
# each containing a 784-dimensional vector
input_sequences = Input(shape=(20, 784))
# this applies our previous model to every timestep in the input sequences.
# the output of the previous model was a 10-way softmax,
# so the output of the layer below will be a sequence of 20 vectors of size 10.
processed_sequences = TimeDistributed(model)(input_sequences)
```
-----
## Multi-input and multi-output models
Here's a good use case for the functional API: models with multiple inputs and outputs. The functional API makes it easy to manipulate a large number of intertwined datastreams.
Let's consider the following model. We seek to predict how many retweets and likes a news headline will receive on Twitter. The main input to the model will be the headline itself, as a sequence of words, but to spice things up, our model will also have an auxiliary input, receiving extra data such as the time of day when the headline was posted, etc.
The model will also be supervised via two loss functions. Using the main loss function earlier in a model is a good regularization mechanism for deep models.
Here's what our model looks like:
<img src="https://s3.amazonaws.com/keras.io/img/multi-input-multi-output-graph.png" alt="multi-input-multi-output-graph" style="width: 400px;"/>
Let's implement it with the functional API.
The main input will receive the headline, as a sequence of integers (each integer encodes a word).
The integers will be between 1 and 9,999, since the `Embedding` layer below is created with `input_dim=10000` and therefore accepts indices from 0 to 9,999. The sequences will be 100 words long.
```python
from keras.layers import Input, Embedding, LSTM, Dense, merge
from keras.models import Model
# headline input: meant to receive sequences of 100 integers, between 1 and 9999.
# note that we can name any layer by passing it a "name" argument.
main_input = Input(shape=(100,), dtype='int32', name='main_input')
# this embedding layer will encode the input sequence
# into a sequence of dense 512-dimensional vectors.
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)
# an LSTM will transform the vector sequence into a single vector,
# containing information about the entire sequence
lstm_out = LSTM(32)(x)
```
Here we insert the auxiliary loss, allowing the LSTM and Embedding layer to be trained smoothly even though the main loss will be much higher in the model.
```python
auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)
```
At this point, we feed into the model our auxiliary input data by concatenating it with the LSTM output:
```python
auxiliary_input = Input(shape=(5,), name='aux_input')
x = merge([lstm_out, auxiliary_input], mode='concat')
# we stack a deep fully-connected network on top
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
# and finally we add the main logistic regression layer
main_output = Dense(1, activation='sigmoid', name='main_output')(x)
```
This defines a model with two inputs and two outputs:
```python
model = Model(input=[main_input, auxiliary_input], output=[main_output, auxiliary_output])
```
We compile the model and assign a weight of 0.2 to the auxiliary loss.
To specify different `loss_weights` or `loss` for each different output, you can use a list or a dictionary.
Here we pass a single loss as the `loss` argument, so the same loss will be used on all outputs.
```python
model.compile(optimizer='rmsprop', loss='binary_crossentropy',
loss_weights=[1., 0.2])
```
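To make the effect of `loss_weights` concrete, here is a plain-Python sketch of the arithmetic: each output gets its own loss value, and the quantity being minimized is their weighted sum. The helpers `binary_crossentropy` and `total_loss` are written by hand for this illustration and are not the Keras implementations. The prediction values are made up.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary crossentropy over paired targets and predictions."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


def total_loss(losses, loss_weights):
    """Weighted sum of the per-output losses."""
    return sum(w * l for w, l in zip(loss_weights, losses))


labels = [1, 0, 1, 1]
main_pred = [0.9, 0.2, 0.8, 0.6]  # made-up predictions of the main output
aux_pred = [0.6, 0.4, 0.5, 0.7]   # made-up predictions of the auxiliary output

main_loss = binary_crossentropy(labels, main_pred)
aux_loss = binary_crossentropy(labels, aux_pred)
combined = total_loss([main_loss, aux_loss], [1., 0.2])
print(main_loss, aux_loss, combined)
```

With a weight of 0.2, the auxiliary output contributes only a fifth of its loss value to the total, so gradients from the main output dominate training.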
We can train the model by passing it lists of input arrays and target arrays:
```python
model.fit([headline_data, additional_data], [labels, labels],
nb_epoch=50, batch_size=32)
```
Since our inputs and outputs are named (we passed them a "name" argument), we could also have compiled the model via:
```python
model.compile(optimizer='rmsprop',
loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
loss_weights={'main_output': 1., 'aux_output': 0.2})
# and trained it via:
model.fit({'main_input': headline_data, 'aux_input': additional_data},
{'main_output': labels, 'aux_output': labels},
nb_epoch=50, batch_size=32)
```
-----
## Shared layers
Another good use for the functional API is building models that use shared layers. Let's take a look at them.
Let's consider a dataset of tweets. We want to build a model that can tell whether two tweets are from the same person or not (this can allow us to compare users by the similarity of their tweets, for instance).
One way to achieve this is to build a model that encodes two tweets into two vectors, concatenates the vectors and adds a logistic regression on top, outputting a probability that the two tweets share the same author. The model would then be trained on positive tweet pairs and negative tweet pairs.
Because the problem is symmetric, the mechanism that encodes the first tweet should be reused (weights and all) to encode the second tweet. Here we use a shared LSTM layer to encode the tweets.
Let's build this with the functional API. We will take as input for a tweet a binary matrix of shape `(140, 256)`, i.e. a sequence of 140 vectors of size 256, where each dimension in the 256-dimensional vector encodes the presence/absence of a character (out of an alphabet of 256 frequent characters).
```python
from keras.layers import Input, LSTM, Dense, merge
from keras.models import Model
tweet_a = Input(shape=(140, 256))
tweet_b = Input(shape=(140, 256))
```
To share a layer across different inputs, simply instantiate the layer once, then call it on as many inputs as you want:
```python
# this layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)
# when we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)
# we can then concatenate the two vectors:
merged_vector = merge([encoded_a, encoded_b], mode='concat', concat_axis=-1)
# and add a logistic regression on top
predictions = Dense(1, activation='sigmoid')(merged_vector)
# we define a trainable model linking the
# tweet inputs to the predictions
model = Model(input=[tweet_a, tweet_b], output=predictions)
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
model.fit([data_a, data_b], labels, nb_epoch=10)
```
Let's pause to take a look at how to read the shared layer's output or output shape.
-----
## The concept of layer "node"
Whenever you are calling a layer on some input, you are creating a new tensor (the output of the layer), and you are adding a "node" to the layer, linking the input tensor to the output tensor. When you are calling the same layer multiple times, that layer owns multiple nodes indexed as 0, 1, 2...
In previous versions of Keras, you could obtain the output tensor of a layer instance via `layer.get_output()`, or its output shape via `layer.output_shape`. You still can (except `get_output()` has been replaced by the property `output`). But what if a layer is connected to multiple inputs?
As long as a layer is only connected to one input, there is no confusion, and `.output` will return the one output of the layer:
```python
a = Input(shape=(140, 256))
lstm = LSTM(32)
encoded_a = lstm(a)
assert lstm.output == encoded_a
```
Not so if the layer has multiple inputs:
```python
a = Input(shape=(140, 256))
b = Input(shape=(140, 256))
lstm = LSTM(32)
encoded_a = lstm(a)
encoded_b = lstm(b)
lstm.output
```
```
>> AssertionError: Layer lstm_1 has multiple inbound nodes,
hence the notion of "layer output" is ill-defined.
Use `get_output_at(node_index)` instead.
```
Okay then. The following works:
```python
assert lstm.get_output_at(0) == encoded_a
assert lstm.get_output_at(1) == encoded_b
```
Simple enough, right?
The same is true for the properties `input_shape` and `output_shape`: as long as the layer has only one node, or as long as all nodes have the same input/output shape, then the notion of "layer output/input shape" is well defined, and that one shape will be returned by `layer.output_shape`/`layer.input_shape`. But if, for instance, you apply the same `Convolution2D` layer to an input of shape `(3, 32, 32)`, and then to an input of shape `(3, 64, 64)`, the layer will have multiple input/output shapes, and you will have to fetch them by specifying the index of the node they belong to:
```python
a = Input(shape=(3, 32, 32))
b = Input(shape=(3, 64, 64))
conv = Convolution2D(16, 3, 3, border_mode='same')
conved_a = conv(a)
# only one input so far, the following will work:
assert conv.input_shape == (None, 3, 32, 32)
conved_b = conv(b)
# now the `.input_shape` property wouldn't work, but this does:
assert conv.get_input_shape_at(0) == (None, 3, 32, 32)
assert conv.get_input_shape_at(1) == (None, 3, 64, 64)
```
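The node bookkeeping described in this section can be mimicked with a small toy class. This is only a sketch of the idea: `ToyLayer` is a hypothetical name, its "tensors" are plain tuples, and the actual Keras implementation differs in its details.

```python
class ToyLayer(object):
    """Minimal stand-in for a layer that records one node per call."""

    def __init__(self, name):
        self.name = name
        self.nodes = []  # each entry: (input, output)

    def __call__(self, x):
        output = (self.name, x)  # placeholder for the output tensor
        self.nodes.append((x, output))
        return output

    def get_output_at(self, node_index):
        return self.nodes[node_index][1]

    @property
    def output(self):
        # only well defined when the layer has exactly one node
        if len(self.nodes) != 1:
            raise AssertionError(
                'Layer %s has %d inbound nodes; use get_output_at(node_index).'
                % (self.name, len(self.nodes)))
        return self.nodes[0][1]


layer = ToyLayer('lstm_1')
encoded_a = layer('a')
assert layer.output == encoded_a  # one node: `.output` is unambiguous

encoded_b = layer('b')  # second call adds a second node
assert layer.get_output_at(0) == encoded_a
assert layer.get_output_at(1) == encoded_b

try:
    layer.output
    ambiguous = False
except AssertionError:
    ambiguous = True
print(ambiguous)
```

Each call appends a node, so node indices follow the order in which the layer was called.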
-----
## More examples
Code examples are still the best way to get started, so here are a few more.
### Inception module
For more information about the Inception architecture, see [Going Deeper with Convolutions](http://arxiv.org/abs/1409.4842).
```python
from keras.layers import merge, Convolution2D, MaxPooling2D, Input
input_img = Input(shape=(3, 256, 256))
tower_1 = Convolution2D(64, 1, 1, border_mode='same', activation='relu')(input_img)
tower_1 = Convolution2D(64, 3, 3, border_mode='same', activation='relu')(tower_1)
tower_2 = Convolution2D(64, 1, 1, border_mode='same', activation='relu')(input_img)
tower_2 = Convolution2D(64, 5, 5, border_mode='same', activation='relu')(tower_2)
tower_3 = MaxPooling2D((3, 3), strides=(1, 1), border_mode='same')(input_img)
tower_3 = Convolution2D(64, 1, 1, border_mode='same', activation='relu')(tower_3)
output = merge([tower_1, tower_2, tower_3], mode='concat', concat_axis=1)
```
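To see what the final `merge` produces here, the shape arithmetic of a concatenation can be sketched in plain Python. `concat_shape` is a hypothetical helper written for this illustration. It assumes the channels-first layout used in the example, in which every tower keeps the 256x256 spatial size and outputs 64 channels.

```python
def concat_shape(shapes, concat_axis=-1):
    """Shape obtained by concatenating tensors of the given shapes."""
    result = list(shapes[0])
    axis = concat_axis % len(result)
    total = 0
    for shape in shapes:
        if len(shape) != len(result):
            raise ValueError('All shapes must have the same rank.')
        for i, (dim, ref) in enumerate(zip(shape, result)):
            # every axis except the concatenation axis has to match
            if i != axis and dim != ref:
                raise ValueError('Shapes differ outside the concat axis.')
        total += shape[axis]
    result[axis] = total
    return tuple(result)


# (batch, channels, rows, cols) for each of the three towers
tower_shape = (None, 64, 256, 256)
inception_shape = concat_shape([tower_shape] * 3, concat_axis=1)
print(inception_shape)
```

Concatenating along axis 1 stacks the channels of the three towers while leaving the batch and spatial dimensions untouched.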
### Residual connection on a convolution layer
For more information about residual networks, see [Deep Residual Learning for Image Recognition](http://arxiv.org/abs/1512.03385).
```python
from keras.layers import merge, Convolution2D, Input
# input tensor for a 3-channel 256x256 image
x = Input(shape=(3, 256, 256))
# 3x3 conv with 3 output channels (same as input channels)
y = Convolution2D(3, 3, 3, border_mode='same')(x)
# this returns x + y.
z = merge([x, y], mode='sum')
```
### Shared vision model
This model re-uses the same image-processing module on two inputs, to classify whether two MNIST digits are the same digit or different digits.
```python
from keras.layers import merge, Convolution2D, MaxPooling2D, Input, Dense, Flatten
from keras.models import Model
# first, define the vision modules
# MNIST digits are 28x28 grayscale images (1 channel)
digit_input = Input(shape=(1, 28, 28))
x = Convolution2D(64, 3, 3)(digit_input)
x = Convolution2D(64, 3, 3)(x)
x = MaxPooling2D((2, 2))(x)
out = Flatten()(x)
vision_model = Model(digit_input, out)
# then define the tell-digits-apart model
digit_a = Input(shape=(1, 28, 28))
digit_b = Input(shape=(1, 28, 28))
# the vision model will be shared, weights and all
out_a = vision_model(digit_a)
out_b = vision_model(digit_b)
concatenated = merge([out_a, out_b], mode='concat')
out = Dense(1, activation='sigmoid')(concatenated)
classification_model = Model([digit_a, digit_b], out)
```
### Visual question answering model
This model can select the correct one-word answer when asked a natural-language question about a picture.
It works by encoding the question into a vector, encoding the image into a vector, concatenating the two, and training on top a logistic regression over some vocabulary of potential answers.
```python
from keras.layers import Convolution2D, MaxPooling2D, Flatten
from keras.layers import Input, LSTM, Embedding, Dense, merge
from keras.models import Model, Sequential
# first, let's define a vision model using a Sequential model.
# this model will encode an image into a vector.
vision_model = Sequential()
vision_model.add(Convolution2D(64, 3, 3, activation='relu', border_mode='same', input_shape=(3, 224, 224)))
vision_model.add(Convolution2D(64, 3, 3, activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Convolution2D(128, 3, 3, activation='relu', border_mode='same'))
vision_model.add(Convolution2D(128, 3, 3, activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Convolution2D(256, 3, 3, activation='relu', border_mode='same'))
vision_model.add(Convolution2D(256, 3, 3, activation='relu'))
vision_model.add(Convolution2D(256, 3, 3, activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Flatten())
# now let's get a tensor with the output of our vision model:
image_input = Input(shape=(3, 224, 224))
encoded_image = vision_model(image_input)
# next, let's define a language model to encode the question into a vector.
# each question will be at most 100 words long,
# and we will index words as integers from 1 to 9999.
question_input = Input(shape=(100,), dtype='int32')
embedded_question = Embedding(input_dim=10000, output_dim=256, input_length=100)(question_input)
encoded_question = LSTM(256)(embedded_question)
# let's concatenate the question vector and the image vector:
merged = merge([encoded_question, encoded_image], mode='concat')
# and let's train a logistic regression over 1000 words on top:
output = Dense(1000, activation='softmax')(merged)
# this is our final model:
vqa_model = Model(input=[image_input, question_input], output=output)
# the next stage would be training this model on actual data.
```
### Video question answering model
Now that we have trained our image QA model, we can quickly turn it into a video QA model. With appropriate training, you will be able to show it a short video (e.g. 100-frame human action) and ask a natural language question about the video (e.g. "what sport is the boy playing?" -> "football").
```python
from keras.layers import TimeDistributed
video_input = Input(shape=(100, 3, 224, 224))
# this is our video encoded via the previously trained vision_model (weights are reused)
encoded_frame_sequence = TimeDistributed(vision_model)(video_input) # the output will be a sequence of vectors
encoded_video = LSTM(256)(encoded_frame_sequence) # the output will be a vector
# this is a model-level representation of the question encoder, reusing the same weights as before:
question_encoder = Model(input=question_input, output=encoded_question)
# let's use it to encode the question:
video_question_input = Input(shape=(100,), dtype='int32')
encoded_video_question = question_encoder(video_question_input)
# and this is our video question answering model:
merged = merge([encoded_video, encoded_video_question], mode='concat')
output = Dense(1000, activation='softmax')(merged)
video_qa_model = Model(input=[video_input, video_question_input], output=output)
```
# Getting started with the Keras Sequential model
The `Sequential` model is a linear stack of layers.
You can create a `Sequential` model by passing a list of layer instances to the constructor:
```python
from keras.models import Sequential
from keras.layers import Dense, Activation
model = Sequential([
    Dense(32, input_dim=784),
    Activation('relu'),
    Dense(10),
    Activation('softmax'),
])
```
You can also simply add layers via the `.add()` method:
```python
model = Sequential()
model.add(Dense(32, input_dim=784))
model.add(Activation('relu'))
```
----
## Specifying the input shape
The model needs to know what input shape it should expect. For this reason, the first layer in a `Sequential` model (and only the first, because following layers can do automatic shape inference) needs to receive information about its input shape. There are several possible ways to do this:
- pass an `input_shape` argument to the first layer. This is a shape tuple (a tuple of integers or `None` entries, where `None` indicates that any positive integer may be expected). In `input_shape`, the batch dimension is not included.
- pass instead a `batch_input_shape` argument, where the batch dimension is included. This is useful for specifying a fixed batch size (e.g. with stateful RNNs).
- some 2D layers, such as `Dense`, support the specification of their input shape via the argument `input_dim`, and some 3D temporal layers support the arguments `input_dim` and `input_length`.
As such, the following three snippets are strictly equivalent:
```python
model = Sequential()
model.add(Dense(32, input_shape=(784,)))
```
```python
model = Sequential()
model.add(Dense(32, batch_input_shape=(None, 784)))
# note that batch dimension is "None" here,
# so the model will be able to process batches of any size.
```
```python
model = Sequential()
model.add(Dense(32, input_dim=784))
```
And so are the following three snippets:
```python
model = Sequential()
model.add(LSTM(32, input_shape=(10, 64)))
```
```python
model = Sequential()
model.add(LSTM(32, batch_input_shape=(None, 10, 64)))
```
```python
model = Sequential()
model.add(LSTM(32, input_length=10, input_dim=64))
```
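The equivalences above can be summarized with a small helper that normalizes each way of specifying the input into a full batch shape. This is a sketch only: `to_batch_input_shape` is a hypothetical function, not part of Keras, and it covers just the argument combinations shown in this section.

```python
def to_batch_input_shape(input_shape=None, batch_input_shape=None,
                         input_dim=None, input_length=None):
    """Normalize the different input specifications to a batch shape tuple."""
    if batch_input_shape is not None:
        # the batch dimension is already included
        return tuple(batch_input_shape)
    if input_shape is not None:
        # prepend an unspecified batch dimension
        return (None,) + tuple(input_shape)
    if input_dim is not None:
        if input_length is not None:
            # 3D temporal layers: (batch, timesteps, features)
            return (None, input_length, input_dim)
        # 2D layers such as Dense: (batch, features)
        return (None, input_dim)
    raise ValueError('No input shape information was provided.')


dense_shapes = [
    to_batch_input_shape(input_shape=(784,)),
    to_batch_input_shape(batch_input_shape=(None, 784)),
    to_batch_input_shape(input_dim=784),
]
lstm_shapes = [
    to_batch_input_shape(input_shape=(10, 64)),
    to_batch_input_shape(batch_input_shape=(None, 10, 64)),
    to_batch_input_shape(input_length=10, input_dim=64),
]
print(dense_shapes)
print(lstm_shapes)
```

All three specifications in each group resolve to the same tuple, which is why the snippets are interchangeable.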
----
## The Merge layer
Multiple `Sequential` instances can be merged into a single output via a `Merge` layer. The output is a layer that can be added as the first layer in a new `Sequential` model. For instance, here's a model with two separate input branches getting merged:
```python
from keras.layers import Merge
left_branch = Sequential()
left_branch.add(Dense(32, input_dim=784))
right_branch = Sequential()
right_branch.add(Dense(32, input_dim=784))
merged = Merge([left_branch, right_branch], mode='concat')
final_model = Sequential()
final_model.add(merged)
final_model.add(Dense(10, activation='softmax'))
```
<img src="https://s3.amazonaws.com/keras.io/img/two_branches_sequential_model.png" alt="two branch Sequential" style="width: 400px;"/>
Such a two-branch model can then be trained via e.g.:
```python
final_model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
final_model.fit([input_data_1, input_data_2], targets) # we pass one data array per model input
```
The `Merge` layer supports a number of pre-defined modes:
- `sum` (default): element-wise sum
- `concat`: tensor concatenation. You can specify the concatenation axis via the argument `concat_axis`.
- `mul`: element-wise multiplication
- `ave`: tensor average
- `dot`: dot product. You can specify which axes to reduce along via the argument `dot_axes`.
- `cos`: cosine proximity between vectors in 2D tensors.
You can also pass a function as the `mode` argument, allowing for arbitrary transformations:
```python
merged = Merge([left_branch, right_branch], mode=lambda x: x[0] - x[1])
```
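As a rough illustration of what the modes compute, here is a plain-Python sketch operating on two vectors of equal length. `merge_vectors` is a hypothetical helper, not the Keras implementation, which works on batched tensors and supports extra options such as `concat_axis` and `dot_axes`.

```python
import math


def merge_vectors(a, b, mode='sum'):
    """Combine two equally long vectors according to `mode`."""
    if callable(mode):
        # arbitrary transformation: the function receives the list of inputs
        return mode([a, b])
    if mode == 'sum':
        return [x + y for x, y in zip(a, b)]
    if mode == 'mul':
        return [x * y for x, y in zip(a, b)]
    if mode == 'ave':
        return [(x + y) / 2.0 for x, y in zip(a, b)]
    if mode == 'concat':
        return list(a) + list(b)
    if mode == 'dot':
        return sum(x * y for x, y in zip(a, b))
    if mode == 'cos':
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)
    raise ValueError('Unknown mode: %r' % (mode,))


left = [1.0, 2.0, 3.0]
right = [4.0, 5.0, 6.0]
merged = {}
for mode in ('sum', 'concat', 'mul', 'ave', 'dot', 'cos'):
    merged[mode] = merge_vectors(left, right, mode)
    print(mode, merged[mode])

# a function can be passed as the mode as well
difference = merge_vectors(
    left, right, lambda x: [p - q for p, q in zip(x[0], x[1])])
print(difference)
```

Note that `concat` is the only element-wise-free mode that grows the output, while `dot` and `cos` reduce the two vectors to a single number.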
Now you know enough to be able to define *almost* any model with Keras. For complex models that cannot be expressed via `Sequential` and `Merge`, you can use [the functional API](/getting-started/functional-api-guide).
----
## Compilation
Before training a model, you need to configure the learning process, which is done via the `compile` method. It receives three arguments:
- an optimizer. This could be the string identifier of an existing optimizer (such as `rmsprop` or `adagrad`), or an instance of the `Optimizer` class. See: [optimizers](/optimizers).
- a loss function. This is the objective that the model will try to minimize. It can be the string identifier of an existing loss function (such as `categorical_crossentropy` or `mse`), or it can be an objective function. See: [objectives](/objectives).
- a list of metrics. For any classification problem you will want to set this to `metrics=['accuracy']`. A metric could be the string identifier of an existing metric or a custom metric function. A custom metric function should return either a single tensor value or a dict `metric_name -> metric_value`. See: [metrics](/metrics).
```python
# for a multi-class classification problem
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# for a binary classification problem
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
# for a mean squared error regression problem
model.compile(optimizer='rmsprop',
              loss='mse')
# for custom metrics
import keras.backend as K
def mean_pred(y_true, y_pred):
    return K.mean(y_pred)
def false_rates(y_true, y_pred):
    false_neg = ...
    false_pos = ...
    return {
        'false_neg': false_neg,
        'false_pos': false_pos,
    }
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy', mean_pred, false_rates])
```
----
## Training
Keras models are trained on Numpy arrays of input data and labels. For training a model, you will typically use the `fit` function. [Read its documentation here](/models/sequential).
```python
# for a single-input model with 2 classes (binary):
model = Sequential()
model.add(Dense(1, input_dim=784, activation='sigmoid'))
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
# generate dummy data
import numpy as np
data = np.random.random((1000, 784))
labels = np.random.randint(2, size=(1000, 1))
# train the model, iterating on the data in batches
# of 32 samples
model.fit(data, labels, nb_epoch=10, batch_size=32)
```
```python
# for a multi-input model with 10 classes:
left_branch = Sequential()
left_branch.add(Dense(32, input_dim=784))
right_branch = Sequential()
right_branch.add(Dense(32, input_dim=784))
merged = Merge([left_branch, right_branch], mode='concat')
model = Sequential()
model.add(merged)
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['accuracy'])
# generate dummy data
import numpy as np
from keras.utils.np_utils import to_categorical
data_1 = np.random.random((1000, 784))
data_2 = np.random.random((1000, 784))
# these are integers between 0 and 9
labels = np.random.randint(10, size=(1000, 1))
# we convert the labels to a binary matrix of size (1000, 10)
# for use with categorical_crossentropy
labels = to_categorical(labels, 10)
# train the model
# note that we are passing a list of Numpy arrays as training data
# since the model has 2 inputs
model.fit([data_1, data_2], labels, nb_epoch=10, batch_size=32)
```
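The conversion performed on the labels above turns each integer class index into a row of zeros with a single one. A minimal plain-Python sketch of that encoding follows. `one_hot` is a hypothetical helper shown for illustration and is not the `to_categorical` implementation.

```python
def one_hot(labels, nb_classes):
    """Encode integer class labels as rows of 0s with a single 1."""
    rows = []
    for label in labels:
        row = [0.0] * nb_classes
        row[label] = 1.0  # mark the position of the class index
        rows.append(row)
    return rows


encoded = one_hot([0, 2, 9], 10)
for row in encoded:
    print(row)
```

Each row has one entry per class, which is the target format expected by `categorical_crossentropy`.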
----
## Examples
Here are a few examples to get you started!
In the examples folder, you will also find example models for real datasets:
- CIFAR10 small images classification: Convolutional Neural Network (CNN) with realtime data augmentation
- IMDB movie review sentiment classification: LSTM over sequences of words
- Reuters newswires topic classification: Multilayer Perceptron (MLP)
- MNIST handwritten digits classification: MLP & CNN
- Character-level text generation with LSTM
...and more.
### Multilayer Perceptron (MLP) for multi-class softmax classification:
```python
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
model = Sequential()
# Dense(64) is a fully-connected layer with 64 hidden units.
# in the first layer, you must specify the expected input data shape:
# here, 20-dimensional vectors.
model.add(Dense(64, input_dim=20, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(10, init='uniform'))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
optimizer=sgd,
metrics=['accuracy'])
model.fit(X_train, y_train,
nb_epoch=20,
batch_size=16)
score = model.evaluate(X_test, y_test, batch_size=16)
```
### Alternative implementation of a similar MLP:
```python
model = Sequential()
model.add(Dense(64, input_dim=20, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adadelta',
metrics=['accuracy'])
```
### MLP for binary classification:
```python
model = Sequential()
model.add(Dense(64, input_dim=20, init='uniform', activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
```
### VGG-like convnet:
```python
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import SGD
model = Sequential()
# input: 100x100 images with 3 channels -> (3, 100, 100) tensors.
# this applies 32 convolution filters of size 3x3 each.
model.add(Convolution2D(32, 3, 3, border_mode='valid', input_shape=(3, 100, 100)))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Convolution2D(64, 3, 3, border_mode='valid'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
# Note: Keras does automatic shape inference.
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(10))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
model.fit(X_train, Y_train, batch_size=32, nb_epoch=1)
```
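The shape inference mentioned in the comment can be followed by hand. The sketch below traces the spatial size through the convolution and pooling layers of this model using standard output-length arithmetic. `conv_output_length` and `pool_output_length` are hypothetical helpers written for this illustration. They assume stride-1 convolutions and non-overlapping pooling, as in the example.

```python
def conv_output_length(input_length, filter_size, border_mode='valid'):
    """Output length of a stride-1 convolution along one dimension."""
    if border_mode == 'same':
        return input_length
    if border_mode == 'valid':
        return input_length - filter_size + 1
    raise ValueError('Unknown border mode: %r' % (border_mode,))


def pool_output_length(input_length, pool_size):
    """Output length of non-overlapping pooling (stride equals pool size)."""
    return (input_length - pool_size) // pool_size + 1


size = 100                                # 100x100 input images
size = conv_output_length(size, 3)        # 98
size = conv_output_length(size, 3)        # 96
size = pool_output_length(size, 2)        # 48
size = conv_output_length(size, 3)        # 46
size = conv_output_length(size, 3)        # 44
size = pool_output_length(size, 2)        # 22
flattened = 64 * size * size              # 64 filters in the last conv layer
print(size, flattened)
```

Under these assumptions the `Flatten` layer hands 30976 features to `Dense(256)`, a number you never have to write out yourself.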
### Sequence classification with LSTM:
```python
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import LSTM
model = Sequential()
model.add(Embedding(max_features, 256, input_length=maxlen))
model.add(LSTM(output_dim=128, activation='sigmoid', inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=16, nb_epoch=10)
score = model.evaluate(X_test, Y_test, batch_size=16)
```
### Architecture for learning image captions with a convnet and a Gated Recurrent Unit:
(word-level embedding, caption of maximum length 16 words).
Note that getting this to work well will require using a bigger convnet, initialized with pre-trained weights.
```python
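from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten, Embedding
from keras.layers import GRU, TimeDistributed, RepeatVector, Merge
from keras.layers import Convolution2D, MaxPooling2D
max_caption_len = 16
vocab_size = 10000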
# first, let's define an image model that
# will encode pictures into 128-dimensional vectors.
# it should be initialized with pre-trained weights.
image_model = Sequential()
image_model.add(Convolution2D(32, 3, 3, border_mode='valid', input_shape=(3, 100, 100)))
image_model.add(Activation('relu'))
image_model.add(Convolution2D(32, 3, 3))
image_model.add(Activation('relu'))
image_model.add(MaxPooling2D(pool_size=(2, 2)))
image_model.add(Convolution2D(64, 3, 3, border_mode='valid'))
image_model.add(Activation('relu'))
image_model.add(Convolution2D(64, 3, 3))
image_model.add(Activation('relu'))
image_model.add(MaxPooling2D(pool_size=(2, 2)))
image_model.add(Flatten())
image_model.add(Dense(128))
# let's load the weights from a save file.
image_model.load_weights('weight_file.h5')
# next, let's define an RNN model that encodes sequences of words
# into sequences of 128-dimensional word vectors.
language_model = Sequential()
language_model.add(Embedding(vocab_size, 256, input_length=max_caption_len))
language_model.add(GRU(output_dim=128, return_sequences=True))
language_model.add(TimeDistributed(Dense(128)))
# let's repeat the image vector to turn it into a sequence.
image_model.add(RepeatVector(max_caption_len))
# the output of both models will be tensors of shape (samples, max_caption_len, 128).
# let's concatenate these 2 vector sequences.
model = Sequential()
model.add(Merge([image_model, language_model], mode='concat', concat_axis=-1))
# let's encode this vector sequence into a single vector
model.add(GRU(256, return_sequences=False))
# which will be used to compute a probability
# distribution over what the next word in the caption should be!
model.add(Dense(vocab_size))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
# "images" is a numpy float array of shape (nb_samples, nb_channels=3, width, height).
# "partial_captions" is a numpy integer array of shape (nb_samples, max_caption_len)
# containing word index sequences representing partial captions.
# "next_words" is a numpy float array of shape (nb_samples, vocab_size)
# containing a categorical encoding (0s and 1s) of the next word in the corresponding
# partial caption.
model.fit([images, partial_captions], next_words, batch_size=16, nb_epoch=100)
```
### Stacked LSTM for sequence classification
In this model, we stack 3 LSTM layers on top of each other,
making the model capable of learning higher-level temporal representations.
The first two LSTMs return their full output sequences, but the last one only returns
the last step in its output sequence, thus dropping the temporal dimension
(i.e. converting the input sequence into a single vector).
<img src="https://keras.io/img/regular_stacked_lstm.png" alt="stacked LSTM" style="width: 300px;"/>
```python
from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np
data_dim = 16
timesteps = 8
nb_classes = 10
# expected input data shape: (batch_size, timesteps, data_dim)
model = Sequential()
model.add(LSTM(32, return_sequences=True,
input_shape=(timesteps, data_dim))) # returns a sequence of vectors of dimension 32
model.add(LSTM(32, return_sequences=True)) # returns a sequence of vectors of dimension 32
model.add(LSTM(32)) # return a single vector of dimension 32
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# generate dummy training data
x_train = np.random.random((1000, timesteps, data_dim))
y_train = np.random.random((1000, nb_classes))
# generate dummy validation data
x_val = np.random.random((100, timesteps, data_dim))
y_val = np.random.random((100, nb_classes))
model.fit(x_train, y_train,
batch_size=64, nb_epoch=5,
validation_data=(x_val, y_val))
```
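The effect of `return_sequences` on the shapes flowing through the stack can be traced with a few lines of plain Python. `recurrent_output_shape` is a hypothetical helper used only to illustrate the rule described above.

```python
def recurrent_output_shape(input_shape, output_dim, return_sequences=False):
    """Output shape of a recurrent layer given a 3D input shape."""
    batch_size, timesteps, _ = input_shape
    if return_sequences:
        # one output vector per timestep
        return (batch_size, timesteps, output_dim)
    # only the last output: the temporal dimension is dropped
    return (batch_size, output_dim)


shape = (None, 8, 16)  # (batch_size, timesteps, data_dim)
shapes = [shape]
for return_sequences in (True, True, False):
    shape = recurrent_output_shape(shape, 32, return_sequences)
    shapes.append(shape)
print(shapes)
```

The first two layers must return sequences because the next LSTM needs a 3D input, while the last one can collapse the sequence into the single vector fed to the `Dense` classifier.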
### Same stacked LSTM model, rendered "stateful"
A stateful recurrent model is one for which the internal states (memories) obtained after processing a batch
of samples are reused as initial states for the samples of the next batch. This makes it possible to process longer sequences
while keeping computational complexity manageable.
[You can read more about stateful RNNs in the FAQ.](/faq/#how-can-i-use-stateful-rnns)
```python
from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np
data_dim = 16
timesteps = 8
nb_classes = 10
batch_size = 32
# expected input batch shape: (batch_size, timesteps, data_dim)
# note that we have to provide the full batch_input_shape since the network is stateful.
# the sample of index i in batch k is the follow-up for the sample i in batch k-1.
model = Sequential()
model.add(LSTM(32, return_sequences=True, stateful=True,
batch_input_shape=(batch_size, timesteps, data_dim)))
model.add(LSTM(32, return_sequences=True, stateful=True))
model.add(LSTM(32, stateful=True))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# generate dummy training data
x_train = np.random.random((batch_size * 10, timesteps, data_dim))
y_train = np.random.random((batch_size * 10, nb_classes))
# generate dummy validation data
x_val = np.random.random((batch_size * 3, timesteps, data_dim))
y_val = np.random.random((batch_size * 3, nb_classes))
model.fit(x_train, y_train,
batch_size=batch_size, nb_epoch=5,
validation_data=(x_val, y_val))
```
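The requirement that sample `i` of batch `k` continues sample `i` of batch `k-1` constrains how long sequences have to be cut into batches. The following plain-Python sketch shows one layout that satisfies it. `make_stateful_batches` is a hypothetical helper, not a Keras function, and it assumes that all sequences have the same length and that there is one long sequence per batch row.

```python
def make_stateful_batches(sequences, timesteps):
    """Split equally long sequences into consecutive chunks.

    Batch k holds chunk k of every sequence, so row i of batch k
    continues row i of batch k - 1.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError('All sequences must have the same length.')
    nb_batches = length // timesteps  # a trailing partial chunk is dropped
    return [[seq[k * timesteps:(k + 1) * timesteps] for seq in sequences]
            for k in range(nb_batches)]


# two long sequences -> batch size 2, chunks of 3 timesteps
sequences = [list(range(0, 6)), list(range(100, 106))]
batches = make_stateful_batches(sequences, timesteps=3)
for batch in batches:
    print(batch)
```

Row 0 of the second batch picks up exactly where row 0 of the first batch stopped, which is the alignment a stateful model relies on when it carries its states over.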
### Two merged LSTM encoders for classification over two parallel sequences
In this model, two input sequences are encoded into vectors by two separate LSTM modules.
These two vectors are then concatenated, and a fully connected network is trained on top of the concatenated representations.
<img src="https://keras.io/img/dual_lstm.png" alt="Dual LSTM" style="width: 600px;"/>
```python
from keras.models import Sequential
from keras.layers import Merge, LSTM, Dense
import numpy as np
data_dim = 16
timesteps = 8
nb_classes = 10
encoder_a = Sequential()
encoder_a.add(LSTM(32, input_shape=(timesteps, data_dim)))
encoder_b = Sequential()
encoder_b.add(LSTM(32, input_shape=(timesteps, data_dim)))
decoder = Sequential()
decoder.add(Merge([encoder_a, encoder_b], mode='concat'))
decoder.add(Dense(32, activation='relu'))
decoder.add(Dense(nb_classes, activation='softmax'))
decoder.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# generate dummy training data
x_train_a = np.random.random((1000, timesteps, data_dim))
x_train_b = np.random.random((1000, timesteps, data_dim))
y_train = np.random.random((1000, nb_classes))
# generate dummy validation data
x_val_a = np.random.random((100, timesteps, data_dim))
x_val_b = np.random.random((100, timesteps, data_dim))
y_val = np.random.random((100, nb_classes))
decoder.fit([x_train_a, x_train_b], y_train,
batch_size=64, nb_epoch=5,
validation_data=([x_val_a, x_val_b], y_val))
```
+165
@@ -0,0 +1,165 @@
# Keras: Deep Learning library for Theano and TensorFlow
## You have just found Keras.
Keras is a high-level neural networks library, written in Python and capable of running on top of either [TensorFlow](https://github.com/tensorflow/tensorflow) or [Theano](https://github.com/Theano/Theano). It was developed with a focus on enabling fast experimentation. *Being able to go from idea to result with the least possible delay is key to doing good research.*
Use Keras if you need a deep learning library that:
- Allows for easy and fast prototyping (through total modularity, minimalism, and extensibility).
- Supports both convolutional networks and recurrent networks, as well as combinations of the two.
- Supports arbitrary connectivity schemes (including multi-input and multi-output training).
- Runs seamlessly on CPU and GPU.
Read the documentation at [Keras.io](http://keras.io).
Keras is compatible with: __Python 2.7-3.5__.
------------------
## Guiding principles
- __Modularity.__ A model is understood as a sequence or a graph of standalone, fully-configurable modules that can be plugged together with as few restrictions as possible. In particular, neural layers, cost functions, optimizers, initialization schemes, activation functions, regularization schemes are all standalone modules that you can combine to create new models.
- __Minimalism.__ Each module should be kept short and simple. Every piece of code should be transparent upon first reading. No black magic: it hurts iteration speed and ability to innovate.
- __Easy extensibility.__ New modules are dead simple to add (as new classes and functions), and existing modules provide ample examples. Being able to easily create new modules allows for total expressiveness, making Keras suitable for advanced research.
- __Work with Python__. No separate model configuration files in a declarative format. Models are described in Python code, which is compact, easier to debug, and allows for ease of extensibility.
------------------
## Getting started: 30 seconds to Keras
The core data structure of Keras is a __model__, a way to organize layers. The main type of model is the [`Sequential`](http://keras.io/getting-started/sequential-model-guide) model, a linear stack of layers. For more complex architectures, you should use the [Keras functional API](http://keras.io/getting-started/functional-api-guide).
Here's the `Sequential` model:
```python
from keras.models import Sequential
model = Sequential()
```
Stacking layers is as easy as `.add()`:
```python
from keras.layers import Dense, Activation
model.add(Dense(output_dim=64, input_dim=100))
model.add(Activation("relu"))
model.add(Dense(output_dim=10))
model.add(Activation("softmax"))
```
Once your model looks good, configure its learning process with `.compile()`:
```python
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
```
If you need to, you can further configure your optimizer. A core principle of Keras is to make things reasonably simple, while allowing the user to be fully in control when they need to (the ultimate control being the easy extensibility of the source code).
```python
from keras.optimizers import SGD
model.compile(loss='categorical_crossentropy', optimizer=SGD(lr=0.01, momentum=0.9, nesterov=True))
```
You can now iterate on your training data in batches:
```python
model.fit(X_train, Y_train, nb_epoch=5, batch_size=32)
```
Alternatively, you can feed batches to your model manually:
```python
model.train_on_batch(X_batch, Y_batch)
```
Evaluate your performance in one line:
```python
loss_and_metrics = model.evaluate(X_test, Y_test, batch_size=32)
```
Or generate predictions on new data:
```python
classes = model.predict_classes(X_test, batch_size=32)
proba = model.predict_proba(X_test, batch_size=32)
```
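As a rough illustration of how the two prediction calls relate for a softmax output, the sketch below takes per-class probabilities and picks the most likely class for each sample. It assumes that class prediction amounts to an argmax over the probabilities; `classes_from_probabilities` is a hypothetical helper, not a Keras function.

```python
def classes_from_probabilities(proba):
    """Return the index of the largest probability in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in proba]


# two samples, three classes each (made-up numbers)
proba = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
classes = classes_from_probabilities(proba)
```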
Building a question answering system, an image classification model, a Neural Turing Machine, a word2vec embedder or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful?
For a more in-depth tutorial about Keras, you can check out:
- [Getting started with the Sequential model](http://keras.io/getting-started/sequential-model-guide)
- [Getting started with the functional API](http://keras.io/getting-started/functional-api-guide)
In the [examples folder](https://github.com/fchollet/keras/tree/master/examples) of the repository, you will find more advanced models: question-answering with memory networks, text generation with stacked LSTMs, etc.
------------------
## Installation
Keras uses the following dependencies:
- numpy, scipy
- pyyaml
- HDF5 and h5py (optional, required if you use model saving/loading functions)
- Optional but recommended if you use CNNs: cuDNN.
*When using the TensorFlow backend:*
- TensorFlow
- [See installation instructions](https://github.com/tensorflow/tensorflow#download-and-setup).
*When using the Theano backend:*
- Theano
- [See installation instructions](http://deeplearning.net/software/theano/install.html#install).
To install Keras, `cd` to the Keras folder and run the install command:
```sh
sudo python setup.py install
```
You can also install Keras from PyPI:
```sh
sudo pip install keras
```
------------------
## Switching from TensorFlow to Theano
By default, Keras will use TensorFlow as its tensor manipulation library. [Follow these instructions](http://keras.io/backend/) to configure the Keras backend.
------------------
## Support
You can ask questions and join the development discussion:
- On the [Keras Google group](https://groups.google.com/forum/#!forum/keras-users).
- On the [Keras Gitter channel](https://gitter.im/Keras-io/Lobby).
You can also post bug reports and feature requests in [GitHub issues](https://github.com/fchollet/keras/issues). Make sure to read [our guidelines](https://github.com/fchollet/keras/blob/master/CONTRIBUTING.md) first.
------------------
## Why this name, Keras?
Keras (κέρας) means _horn_ in Greek. It is a reference to a literary image from ancient Greek and Latin literature, first found in the _Odyssey_, where dream spirits (_Oneiroi_, singular _Oneiros_) are divided between those who deceive men with false visions, who arrive on Earth through a gate of ivory, and those who announce a future that will come to pass, who arrive through a gate of horn. It's a play on the words κέρας (horn) / κραίνω (fulfill), and ἐλέφας (ivory) / ἐλεφαίρομαι (deceive).
Keras was initially developed as part of the research effort of project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System).
>_"Oneiroi are beyond our unravelling --who can be sure what tale they tell? Not all that men look for comes to pass. Two gates there are that give passage to fleeting Oneiroi; one is made of horn, one of ivory. The Oneiroi that pass through sawn ivory are deceitful, bearing a message that will not be fulfilled; those that come out through polished horn have truth behind them, to be accomplished for men who see them."_ Homer, Odyssey 19. 562 ff (Shewring translation).
------------------
+50
@@ -0,0 +1,50 @@
## Usage of initializations
Initializations define the way to set the initial random weights of Keras layers.
The keyword arguments used for passing initializations to layers will depend on the layer. Usually it is simply `init`:
```python
model.add(Dense(64, init='uniform'))
```
## Available initializations
- __uniform__
- __lecun_uniform__: Uniform initialization scaled by the square root of the number of inputs (LeCun 98).
- __normal__
- __identity__: Use with square 2D layers (`shape[0] == shape[1]`).
- __orthogonal__: Use with square 2D layers (`shape[0] == shape[1]`).
- __zero__
- __glorot_normal__: Gaussian initialization scaled by fan_in + fan_out (Glorot 2010)
- __glorot_uniform__
- __he_normal__: Gaussian initialization scaled by fan_in (He et al., 2014)
- __he_uniform__
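To make the "scaled by" wording in the list above more concrete, here is a small sketch of the spread each scheme is commonly defined with. The formulas are the usual textbook ones and are stated here as assumptions rather than read from the Keras source; `init_scale` is a hypothetical helper.

```python
import math


def init_scale(name, fan_in, fan_out):
    """Return the spread parameter for a scheme (assumed formulas).

    For *_uniform schemes this is the half-width s of U(-s, s);
    for *_normal schemes it is the standard deviation.
    """
    if name == 'lecun_uniform':
        return math.sqrt(3.0 / fan_in)
    if name == 'glorot_normal':
        return math.sqrt(2.0 / (fan_in + fan_out))
    if name == 'glorot_uniform':
        return math.sqrt(6.0 / (fan_in + fan_out))
    if name == 'he_normal':
        return math.sqrt(2.0 / fan_in)
    if name == 'he_uniform':
        return math.sqrt(6.0 / fan_in)
    raise ValueError('unknown scheme: ' + name)


he_std = init_scale('he_normal', fan_in=8, fan_out=4)
```

Wider layers get a smaller spread, which keeps the variance of activations roughly stable from layer to layer.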
An initialization may be passed as a string (must match one of the available initializations above), or as a callable.
If a callable, then it must take two arguments: `shape` (shape of the variable to initialize) and `name` (name of the variable),
and it must return a variable (e.g. output of `K.variable()`):
```python
from keras import backend as K
import numpy as np
def my_init(shape, name=None):
    value = np.random.random(shape)
    return K.variable(value, name=name)

model.add(Dense(64, init=my_init))
```
You could also use functions from `keras.initializations` in this way:
```python
from keras import initializations
def my_init(shape, name=None):
    return initializations.normal(shape, scale=0.01, name=name)

model.add(Dense(64, init=my_init))
```
+27
@@ -0,0 +1,27 @@
# About Keras layers
All Keras layers have a number of methods in common:
- `layer.get_weights()`: returns the weights of the layer as a list of Numpy arrays.
- `layer.set_weights(weights)`: sets the weights of the layer from a list of Numpy arrays (with the same shapes as the output of `get_weights`).
- `layer.get_config()`: returns a dictionary containing the configuration of the layer. The layer can be reinstantiated from its config via:
```python
from keras.utils.layer_utils import layer_from_config
config = layer.get_config()
layer = layer_from_config(config)
```
If a layer has a single node (i.e. if it isn't a shared layer), you can get its input tensor, output tensor, input shape and output shape via:
- `layer.input`
- `layer.output`
- `layer.input_shape`
- `layer.output_shape`
If the layer has multiple nodes (see: [the concept of layer node and shared layers](/getting-started/functional-api-guide/#the-concept-of-layer-node)), you can use the following methods:
- `layer.get_input_at(node_index)`
- `layer.get_output_at(node_index)`
- `layer.get_input_shape_at(node_index)`
- `layer.get_output_shape_at(node_index)`
+35
@@ -0,0 +1,35 @@
# Writing your own Keras layers
For simple, stateless custom operations, you are probably better off using `layers.core.Lambda` layers. But for any custom operation that has trainable weights, you should implement your own layer.
Here is the skeleton of a Keras layer. There are only three methods you need to implement:
- `build(input_shape)`: this is where you will define your weights. Trainable weights should be added to the list `self.trainable_weights`. Other attributes of note are: `self.non_trainable_weights` (list) and `self.updates` (list of update tuples (tensor, new_tensor)). For an example of how to use `non_trainable_weights` and `updates`, see the code for the `BatchNormalization` layer. This method must set `self.built = True`, which can be done by calling `super([Layer], self).build()`.
- `call(x)`: this is where the layer's logic lives. Unless you want your layer to support masking, you only have to care about the first argument passed to `call`: the input tensor.
- `get_output_shape_for(input_shape)`: in case your layer modifies the shape of its input, you should specify here the shape transformation logic. This allows Keras to do automatic shape inference.
```python
from keras import backend as K
from keras.engine.topology import Layer
import numpy as np
class MyLayer(Layer):

    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(MyLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        input_dim = input_shape[1]
        # weight matrix mapping input_dim features to output_dim features
        initial_weight_value = np.random.random((input_dim, self.output_dim))
        self.W = K.variable(initial_weight_value)
        self.trainable_weights = [self.W]
        super(MyLayer, self).build(input_shape)  # be sure you call this somewhere!

    def call(self, x, mask=None):
        return K.dot(x, self.W)

    def get_output_shape_for(self, input_shape):
        return (input_shape[0], self.output_dim)
```
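The relationship between `call` and `get_output_shape_for` in the skeleton above can be checked without any backend. The following plain-Python sketch mirrors what the layer computes on nested lists; `dense_call` and `dense_output_shape` are hypothetical stand-ins, not Keras functions.

```python
def dense_call(x, W):
    """Matrix product: x is (batch, input_dim), W is (input_dim, output_dim)."""
    return [[sum(xi * wi for xi, wi in zip(row, col)) for col in zip(*W)]
            for row in x]


def dense_output_shape(input_shape, output_dim):
    """Shape inference: the batch axis is kept, the feature axis becomes output_dim."""
    return (input_shape[0], output_dim)


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]          # shape (3, 2)
W = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 1.0, 0.0]]  # shape (2, 4)
y = dense_call(x, W)
```

The shape of `y` matches what `dense_output_shape((3, 2), 4)` reports, which is the contract shape inference relies on.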
The existing Keras layers provide ample examples of how to implement almost anything. Never hesitate to read the source code!
+51
@@ -0,0 +1,51 @@
## Usage of metrics
A metric is a function that is used to judge the performance of your model. Metric functions are to be supplied in the `metrics` parameter when a model is compiled.
A metric function is similar to an [objective function](/objectives), except that the results from evaluating a metric are not used when training the model.
You can either pass the name of an existing metric, or pass a Theano/TensorFlow symbolic function (see [Custom metrics](#custom-metrics)).
#### Arguments
- __y_true__: True labels. Theano/TensorFlow tensor.
- __y_pred__: Predictions. Theano/TensorFlow tensor of the same shape as y_true.
#### Returns
Single tensor value representing the mean of the output array across all
datapoints.
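
As an illustration of "the mean of the output array across all datapoints", here is a plain-Python sketch of an accuracy-style metric. It works on nested lists rather than backend tensors, and `categorical_accuracy` here is a simplified stand-in written for this example.

```python
def categorical_accuracy(y_true, y_pred):
    """Fraction of rows where argmax(y_pred) equals argmax(y_true)."""
    def argmax(row):
        return max(range(len(row)), key=row.__getitem__)

    # one 0/1 score per datapoint, then the mean over all datapoints
    matches = [1.0 if argmax(t) == argmax(p) else 0.0
               for t, p in zip(y_true, y_pred)]
    return sum(matches) / len(matches)


y_true = [[0, 1], [1, 0], [0, 1], [1, 0]]
y_pred = [[0.2, 0.8], [0.6, 0.4], [0.7, 0.3], [0.9, 0.1]]
acc = categorical_accuracy(y_true, y_pred)
```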
----
## Available metrics
{{autogenerated}}
----
## Custom metrics
Custom metrics can be defined and passed via the compilation step. The
function would need to take `(y_true, y_pred)` as arguments and return
either a single tensor value or a dict `metric_name -> metric_value`.
```python
# for custom metrics
import keras.backend as K
def mean_pred(y_true, y_pred):
    return K.mean(y_pred)

def false_rates(y_true, y_pred):
    false_neg = ...
    false_pos = ...
    return {
        'false_neg': false_neg,
        'false_pos': false_pos,
    }

model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy', mean_pred, false_rates])
```
+33
@@ -0,0 +1,33 @@
# About Keras models
There are two types of models available in Keras: [the Sequential model](/models/sequential) and [the Model class used with functional API](/models/model).
These models have a number of methods in common:
- `model.summary()`: prints a summary representation of your model.
- `model.get_config()`: returns a dictionary containing the configuration of the model. The model can be reinstantiated from its config via:
```python
config = model.get_config()
model = Model.from_config(config)
# or, for Sequential:
model = Sequential.from_config(config)
```
- `model.get_weights()`: returns a list of all weight tensors in the model, as Numpy arrays.
- `model.set_weights(weights)`: sets the values of the weights of the model, from a list of Numpy arrays. The arrays in the list should have the same shape as those returned by `get_weights()`.
- `model.to_json()`: returns a representation of the model as a JSON string. Note that the representation does not include the weights, only the architecture. You can reinstantiate the same model (with reinitialized weights) from the JSON string via:
```python
from keras.models import model_from_json
json_string = model.to_json()
model = model_from_json(json_string)
```
- `model.to_yaml()`: returns a representation of the model as a YAML string. Note that the representation does not include the weights, only the architecture. You can reinstantiate the same model (with reinitialized weights) from the YAML string via:
```python
from keras.models import model_from_yaml
yaml_string = model.to_yaml()
model = model_from_yaml(yaml_string)
```
- `model.save_weights(filepath)`: saves the weights of the model as a HDF5 file.
- `model.load_weights(filepath, by_name=False)`: loads the weights of the model from a HDF5 file (created by `save_weights`). By default, the architecture is expected to be unchanged. To load weights into a different architecture (with some layers in common), use `by_name=True` to load only those layers with the same name.
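
The idea behind `by_name=True` can be sketched with plain dictionaries: only layers whose names appear in the saved file receive weights, and every other layer keeps what it had. This is an illustration under that assumption, not the actual HDF5 loading code; `load_by_name` and the layer names are hypothetical.

```python
def load_by_name(target_layers, saved_weights):
    """Copy weights for layers whose names match.

    Both arguments map layer name -> list of weights. Returns the names that
    were loaded; unmatched layers are left untouched.
    """
    loaded = []
    for name in target_layers:
        if name in saved_weights:
            target_layers[name] = saved_weights[name]
            loaded.append(name)
    return loaded


saved = {'dense_1': [1.0, 2.0], 'dense_2': [3.0]}
new_model = {'dense_1': [0.0, 0.0], 'dense_3': [9.0]}
loaded = load_by_name(new_model, saved)
```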
+32
@@ -0,0 +1,32 @@
# Model class API
In the functional API, given an input tensor and output tensor, you can instantiate a `Model` via:
```python
from keras.models import Model
from keras.layers import Input, Dense
a = Input(shape=(32,))
b = Dense(32)(a)
model = Model(input=a, output=b)
```
This model will include all layers required in the computation of `b` given `a`.
In the case of multi-input or multi-output models, you can use lists as well:
```python
model = Model(input=[a1, a2], output=[b1, b2, b3])
```
For a detailed introduction of what `Model` can do, read [this guide to the Keras functional API](/getting-started/functional-api-guide).
## Useful attributes of Model
- `model.layers` is a flattened list of the layers comprising the model graph.
- `model.inputs` is the list of input tensors.
- `model.outputs` is the list of output tensors.
## Methods
{{autogenerated}}
+14
@@ -0,0 +1,14 @@
# The Sequential model API
To get started, read [this guide to the Keras Sequential model](/getting-started/sequential-model-guide).
## Useful attributes of Model
- `model.layers` is a list of the layers added to the model.
----
## Sequential model methods
{{autogenerated}}
+40
@@ -0,0 +1,40 @@
## Usage of objectives
An objective function (or loss function, or optimization score function) is one of the two parameters required to compile a model:
```python
model.compile(loss='mean_squared_error', optimizer='sgd')
```
You can either pass the name of an existing objective, or pass a Theano/TensorFlow symbolic function that returns a scalar for each data-point and takes the following two arguments:
- __y_true__: True labels. Theano/TensorFlow tensor.
- __y_pred__: Predictions. Theano/TensorFlow tensor of the same shape as y_true.
The actual optimized objective is the mean of the output array across all datapoints.
For a few examples of such functions, check out the [objectives source](https://github.com/fchollet/keras/blob/master/keras/objectives.py).
## Available objectives
- __mean_squared_error__ / __mse__
- __mean_absolute_error__ / __mae__
- __mean_absolute_percentage_error__ / __mape__
- __mean_squared_logarithmic_error__ / __msle__
- __squared_hinge__
- __hinge__
- __binary_crossentropy__: Also known as logloss.
- __categorical_crossentropy__: Also known as multiclass logloss. __Note__: using this objective requires that your labels are binary arrays of shape `(nb_samples, nb_classes)`.
- __sparse_categorical_crossentropy__: As above but accepts sparse labels. __Note__: this objective still requires that your labels have the same number of dimensions as your outputs; you may need to add a length-1 dimension to the shape of your labels, e.g. with `np.expand_dims(y, -1)`.
- __kullback_leibler_divergence__ / __kld__: Information gain from a predicted probability distribution Q to a true probability distribution P. Gives a measure of difference between both distributions.
- __poisson__: Mean of `(predictions - targets * log(predictions))`
- __cosine_proximity__: The opposite (negative) of the mean cosine proximity between predictions and targets.
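
To show how an objective returns one scalar per datapoint and how the optimized value is the mean of those scalars, here is a plain-Python sketch of two of the listed objectives. It uses nested lists instead of backend tensors; the function bodies are simplified illustrations, and `objective_value` is a hypothetical helper.

```python
def mean_squared_error(y_true, y_pred):
    """One scalar per datapoint: the mean squared difference over its outputs."""
    return [sum((p - t) ** 2 for t, p in zip(ts, ps)) / len(ts)
            for ts, ps in zip(y_true, y_pred)]


def mean_absolute_error(y_true, y_pred):
    """One scalar per datapoint: the mean absolute difference over its outputs."""
    return [sum(abs(p - t) for t, p in zip(ts, ps)) / len(ts)
            for ts, ps in zip(y_true, y_pred)]


def objective_value(per_sample_losses):
    """The quantity that is actually optimized: the mean over all datapoints."""
    return sum(per_sample_losses) / len(per_sample_losses)


y_true = [[1.0, 0.0], [0.0, 1.0]]
y_pred = [[1.0, 0.0], [1.0, 1.0]]
mse_value = objective_value(mean_squared_error(y_true, y_pred))
mae_value = objective_value(mean_absolute_error(y_true, y_pred))
```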
**Note**: when using the `categorical_crossentropy` objective, your targets should be in categorical format (e.g. if you have 10 classes, the target for each sample should be a 10-dimensional vector that is all-zeros except for a 1 at the index corresponding to the class of the sample). In order to convert *integer targets* into *categorical targets*, you can use the Keras utility `to_categorical`:
```python
from keras.utils.np_utils import to_categorical
categorical_labels = to_categorical(int_labels, nb_classes=None)
```
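For readers who want to see what the conversion produces, the following plain-Python sketch builds the same kind of one-hot rows from integer labels. `to_one_hot` is a hypothetical stand-in written for this example; inferring the class count from the largest label when `nb_classes` is `None` is an assumption about the intended behaviour.

```python
def to_one_hot(int_labels, nb_classes=None):
    """Convert integer class labels into one-hot rows."""
    if nb_classes is None:
        nb_classes = max(int_labels) + 1  # assumed: infer from the largest label
    return [[1 if j == label else 0 for j in range(nb_classes)]
            for label in int_labels]


one_hot = to_one_hot([0, 2, 1])
```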
+44
@@ -0,0 +1,44 @@
## Usage of optimizers
An optimizer is one of the two arguments required for compiling a Keras model:
```python
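from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.optimizers import SGD

model = Sequential()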
model.add(Dense(64, init='uniform', input_dim=10))
model.add(Activation('tanh'))
model.add(Activation('softmax'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
```
You can either instantiate an optimizer before passing it to `model.compile()`, as in the above example, or you can pass it by name. In the latter case, the default parameters for the optimizer will be used.
```python
# pass optimizer by name: default parameters will be used
model.compile(loss='mean_squared_error', optimizer='sgd')
```
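To give a feel for what the `lr`, `decay`, `momentum` and `nesterov` arguments control, here is a scalar sketch of one common formulation of the SGD update. It is an assumption about the update rule, written for illustration, and is not copied from the Keras optimizer source; `sgd_step` is a hypothetical helper.

```python
def sgd_step(param, grad, velocity, iteration,
             lr=0.01, decay=0.0, momentum=0.0, nesterov=False):
    """One scalar SGD update; returns (new_param, new_velocity)."""
    lr_t = lr / (1.0 + decay * iteration)  # time-based learning-rate decay
    new_velocity = momentum * velocity - lr_t * grad
    if nesterov:
        # look ahead along the freshly updated velocity
        new_param = param + momentum * new_velocity - lr_t * grad
    else:
        new_param = param + new_velocity
    return new_param, new_velocity


p, v = sgd_step(param=1.0, grad=2.0, velocity=0.0, iteration=0, lr=0.1)
```

With `momentum=0.0` and `decay=0.0` this reduces to the plain rule `param - lr * grad`.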
---
## Parameters common to all Keras optimizers
The parameters `clipnorm` and `clipvalue` can be used with all optimizers to control gradient clipping:
```python
# all parameter gradients will be clipped to
# a maximum norm of 1.
sgd = SGD(lr=0.01, clipnorm=1.)
```
```python
# all parameter gradients will be clipped to
# a maximum value of 0.5 and
# a minimum value of -0.5.
sgd = SGD(lr=0.01, clipvalue=0.5)
```
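The difference between the two clipping modes is easiest to see on a small gradient vector. The sketch below is plain Python and assumes the norm is taken over the single vector passed in; how gradients are grouped when the norm is computed inside Keras is not covered here. Both helper names are hypothetical.

```python
import math


def clip_by_norm(grad, clipnorm):
    """Rescale grad so that its L2 norm does not exceed clipnorm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm > clipnorm:
        return [g * clipnorm / norm for g in grad]
    return list(grad)


def clip_by_value(grad, clipvalue):
    """Clamp every component of grad to [-clipvalue, clipvalue]."""
    return [max(-clipvalue, min(clipvalue, g)) for g in grad]


clipped_norm = clip_by_norm([3.0, 4.0], clipnorm=1.0)
clipped_value = clip_by_value([3.0, -4.0, 0.2], clipvalue=0.5)
```

Norm clipping preserves the direction of the gradient, while value clipping can change it because each component is clamped independently.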
---
{{autogenerated}}
+192
@@ -0,0 +1,192 @@
## ImageDataGenerator
```python
keras.preprocessing.image.ImageDataGenerator(featurewise_center=False,
samplewise_center=False,
featurewise_std_normalization=False,
samplewise_std_normalization=False,
zca_whitening=False,
rotation_range=0.,
width_shift_range=0.,
height_shift_range=0.,
shear_range=0.,
zoom_range=0.,
channel_shift_range=0.,
fill_mode='nearest',
cval=0.,
horizontal_flip=False,
vertical_flip=False,
rescale=None,
dim_ordering=K.image_dim_ordering())
```
Generate batches of tensor image data with real-time data augmentation. The data will be looped over (in batches) indefinitely.
- __Arguments__:
- __featurewise_center__: Boolean. Set input mean to 0 over the dataset.
- __samplewise_center__: Boolean. Set each sample mean to 0.
- __featurewise_std_normalization__: Boolean. Divide inputs by std of the dataset.
- __samplewise_std_normalization__: Boolean. Divide each input by its std.
- __zca_whitening__: Boolean. Apply ZCA whitening.
- __rotation_range__: Int. Degree range for random rotations.
- __width_shift_range__: Float (fraction of total width). Range for random horizontal shifts.
- __height_shift_range__: Float (fraction of total height). Range for random vertical shifts.
- __shear_range__: Float. Shear Intensity (Shear angle in counter-clockwise direction as radians)
- __zoom_range__: Float or [lower, upper]. Range for random zoom. If a float, `[lower, upper] = [1-zoom_range, 1+zoom_range]`.
- __channel_shift_range__: Float. Range for random channel shifts.
- __fill_mode__: One of {"constant", "nearest", "reflect" or "wrap"}. Points outside the boundaries of the input are filled according to the given mode.
- __cval__: Float or Int. Value used for points outside the boundaries when `fill_mode = "constant"`.
- __horizontal_flip__: Boolean. Randomly flip inputs horizontally.
- __vertical_flip__: Boolean. Randomly flip inputs vertically.
- __rescale__: rescaling factor. Defaults to None. If None or 0, no rescaling is applied,
otherwise we multiply the data by the value provided (before applying
any other transformation).
- __dim_ordering__: One of {"th", "tf"}.
"tf" mode means that the images should have shape `(samples, width, height, channels)`,
"th" mode means that the images should have shape `(samples, channels, width, height)`.
It defaults to the `image_dim_ordering` value found in your
Keras config file at `~/.keras/keras.json`.
If you never set it, then it will be "tf".
- __Methods__:
- __fit(X)__: Compute the internal data stats related to the data-dependent transformations, based on an array of sample data.
Only required if `featurewise_center`, `featurewise_std_normalization` or `zca_whitening` is enabled.
- __Arguments__:
- __X__: sample data.
- __augment__: Boolean (default: False). Whether to fit on randomly augmented samples.
- __rounds__: int (default: 1). If augment, how many augmentation passes over the data to use.
- __seed__: int (default: None). Random seed.
- __flow(X, y)__: Takes numpy data & label arrays, and generates batches of augmented/normalized data. Yields batches indefinitely, in an infinite loop.
- __Arguments__:
- __X__: data.
- __y__: labels.
- __batch_size__: int (default: 32).
- __shuffle__: boolean (default: True).
- __seed__: int (default: None).
- __save_to_dir__: None or str (default: None). This allows you to optionally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing).
- __save_prefix__: str (default: `''`). Prefix to use for filenames of saved pictures (only relevant if `save_to_dir` is set).
- __save_format__: one of "png", "jpeg" (only relevant if `save_to_dir` is set). Default: "jpeg".
- __yields__: Tuples of `(x, y)` where `x` is a numpy array of image data and `y` is a numpy array of corresponding labels.
The generator loops indefinitely.
- __flow_from_directory(directory)__: Takes the path to a directory, and generates batches of augmented/normalized data. Yields batches indefinitely, in an infinite loop.
- __Arguments__:
- __directory__: path to the target directory. It should contain one subdirectory per class,
and the subdirectories should contain PNG or JPG images. See [this script](https://gist.github.com/fchollet/0830affa1f7f19fd47b06d4cf89ed44d) for more details.
- __target_size__: tuple of integers, default: `(256, 256)`. The dimensions to which all images found will be resized.
- __color_mode__: one of "grayscale", "rgb". Default: "rgb". Whether the images will be converted to have 1 or 3 color channels.
- __classes__: optional list of class subdirectories (e.g. `['dogs', 'cats']`). Default: None. If not provided, the list of classes will be automatically inferred (and the order of the classes, which will map to the label indices, will be alphanumeric).
- __class_mode__: one of "categorical", "binary", "sparse" or None. Default: "categorical". Determines the type of label arrays that are returned: "categorical" will be 2D one-hot encoded labels, "binary" will be 1D binary labels, "sparse" will be 1D integer labels. If None, no labels are returned (the generator will only yield batches of image data, which is useful to use `model.predict_generator()`, `model.evaluate_generator()`, etc.).
- __batch_size__: size of the batches of data (default: 32).
- __shuffle__: whether to shuffle the data (default: True)
- __seed__: optional random seed for shuffling and transformations.
- __save_to_dir__: None or str (default: None). This allows you to optionally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing).
- __save_prefix__: str. Prefix to use for filenames of saved pictures (only relevant if `save_to_dir` is set).
- __save_format__: one of "png", "jpeg" (only relevant if `save_to_dir` is set). Default: "jpeg".
- __Examples__:
Example of using `.flow(X, y)`:
```python
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
datagen = ImageDataGenerator(
featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True)
# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(X_train)
# fits the model on batches with real-time data augmentation:
model.fit_generator(datagen.flow(X_train, Y_train, batch_size=32),
samples_per_epoch=len(X_train), nb_epoch=nb_epoch)
# here's a more "manual" example
for e in range(nb_epoch):
    print('Epoch %d' % e)
    batches = 0
    for X_batch, Y_batch in datagen.flow(X_train, Y_train, batch_size=32):
        loss = model.train_on_batch(X_batch, Y_batch)
        batches += 1
        if batches >= len(X_train) / 32:
            # we need to break the loop by hand because
            # the generator loops indefinitely
            break
```
Example of using `.flow_from_directory(directory)`:
```python
train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
'data/train',
target_size=(150, 150),
batch_size=32,
class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
'data/validation',
target_size=(150, 150),
batch_size=32,
class_mode='binary')
model.fit_generator(
train_generator,
samples_per_epoch=2000,
nb_epoch=50,
validation_data=validation_generator,
nb_val_samples=800)
```
Example of transforming images and masks together.
```python
# we create two instances with the same arguments
data_gen_args = dict(featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=90.,
width_shift_range=0.1,
height_shift_range=0.1,
zoom_range=0.2)
image_datagen = ImageDataGenerator(**data_gen_args)
mask_datagen = ImageDataGenerator(**data_gen_args)
# Provide the same seed and keyword arguments to the fit and flow methods
seed = 1
image_datagen.fit(images, augment=True, seed=seed)
mask_datagen.fit(masks, augment=True, seed=seed)
image_generator = image_datagen.flow_from_directory(
'data/images',
class_mode=None,
seed=seed)
mask_generator = mask_datagen.flow_from_directory(
'data/masks',
class_mode=None,
seed=seed)
# combine generators into one which yields image and masks
train_generator = zip(image_generator, mask_generator)
model.fit_generator(
train_generator,
samples_per_epoch=2000,
nb_epoch=50)
```
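
Why `fit` has to run before `flow` when the featurewise options are on can be seen in a small plain-Python sketch: the statistics are computed once over the dataset and then reused for every sample. For simplicity the sketch uses a single scalar mean and standard deviation over all values, which is an assumption and may differ from the axes the generator actually reduces over. Both helper names are hypothetical.

```python
import math


def fit_featurewise(samples):
    """Compute a dataset-wide mean and (population) standard deviation."""
    values = [v for sample in samples for v in sample]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return mean, std


def standardize(sample, mean, std, eps=1e-7):
    """Center a sample with the dataset mean, then divide by the dataset std."""
    return [(v - mean) / (std + eps) for v in sample]


samples = [[0.0, 2.0], [4.0, 6.0]]   # two tiny "images" of two pixels each
mean, std = fit_featurewise(samples)
normalized = [standardize(s, mean, std) for s in samples]
```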
@@ -4,26 +4,29 @@
keras.preprocessing.sequence.pad_sequences(sequences, maxlen=None, dtype='int32')
```
Transform a list of `nb_samples sequences` (lists of scalars) into a 2D numpy array of shape `(nb_samples, nb_timesteps)`. `nb_timesteps` is either the `maxlen` argument if provided, or the length of the longest sequence otherwise. Sequences that are shorter than `nb_timesteps` are padded with zeros at the end.
Transform a list of `nb_samples sequences` (lists of scalars) into a 2D Numpy array of shape `(nb_samples, nb_timesteps)`. `nb_timesteps` is either the `maxlen` argument if provided, or the length of the longest sequence otherwise. Sequences that are shorter than `nb_timesteps` are padded with zeros at the end.
- __Return__: 2D numpy array of shape `(nb_samples, nb_timesteps)`.
- __Return__: 2D Numpy array of shape `(nb_samples, nb_timesteps)`.
- __Arguments__:
- __sequences__: List of lists of int or float.
- __maxlen__: None or int. Maximum sequence length, longer sequences are truncated and shorter sequences are padded with zeros at the end.
- __dtype__: datatype of the numpy array returned.
- __dtype__: datatype of the Numpy array returned.
- __padding__: 'pre' or 'post', pad either before or after each sequence.
- __truncating__: 'pre' or 'post', remove values from sequences longer than `maxlen`, either at the beginning or at the end of the sequence.
- __value__: float, the value used to pad the sequences.
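
The interaction of `maxlen`, `padding`, `truncating` and `value` can be sketched in plain Python. The helper `pad` below is hypothetical and returns nested lists rather than a Numpy array; it takes `padding` and `truncating` as explicit arguments so that it makes no claim about the library defaults.

```python
def pad(sequences, maxlen, padding, truncating, value=0):
    """Pad or truncate each sequence to a common length."""
    if maxlen is None:
        maxlen = max(len(s) for s in sequences)
    out = []
    for s in sequences:
        if len(s) > maxlen:
            # 'pre' drops items from the start, 'post' drops them from the end
            s = s[-maxlen:] if truncating == 'pre' else s[:maxlen]
        fill = [value] * (maxlen - len(s))
        out.append(fill + list(s) if padding == 'pre' else list(s) + fill)
    return out


padded_post = pad([[1, 2, 3], [4]], maxlen=2, padding='post', truncating='post')
padded_pre = pad([[1, 2, 3], [4]], maxlen=2, padding='pre', truncating='pre')
```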
---
## skipgrams
```python
keras.preprocessing.sequence.skipgrams(sequence, vocabulary_size,
window_size=4, negative_samples=1., shuffle=True,
keras.preprocessing.sequence.skipgrams(sequence, vocabulary_size,
window_size=4, negative_samples=1., shuffle=True,
categorical=False, sampling_table=None)
```
Transforms a sequence of word indexes (list of int) into couples of the form:
Transforms a sequence of word indexes (list of int) into couples of the form:
- (word, word in the same window), with label 1 (positive samples).
- (word, random word from the vocabulary), with label 0 (negative samples).
@@ -31,8 +34,8 @@ Transforms a sequence of word indexes (list of int) into couples of the form:
Read more about Skipgram in this gnomic paper by Mikolov et al.: [Efficient Estimation of Word Representations in
Vector Space](http://arxiv.org/pdf/1301.3781v3.pdf)
- __Return__: tuple `(couples, labels)`.
- `couples` is a list of 2-elements lists of int: `[word_index, other_word_index]`.
- __Return__: tuple `(couples, labels)`.
- `couples` is a list of 2-elements lists of int: `[word_index, other_word_index]`.
- `labels` is a list of 0 and 1, where 1 indicates that `other_word_index` was found in the same window as `word_index`, and 0 indicates that `other_word_index` was random.
- if categorical is set to True, the labels are categorical, i.e. 1 becomes [0,1], and 0 becomes [1, 0].
@@ -43,7 +46,7 @@ Vector Space](http://arxiv.org/pdf/1301.3781v3.pdf)
- __negative_samples__: float >= 0. 0 for no negative (=random) samples. 1 for same number as positive samples. etc.
- __shuffle__: boolean. Whether to shuffle the samples.
- __categorical__: boolean. Whether to make the returned labels categorical.
- __sampling_table__: Numpy array of shape `(vocabulary_size,)` where `sampling_table[i]` is the probability of sampling the word with index i (assumed to be i-th most common word in the dataset).
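As a rough illustration of how positive and negative couples are formed, here is a minimal pure-Python sketch. `skipgrams_sketch` and its `seed` parameter are hypothetical, it omits `shuffle`, `categorical` and `sampling_table`, and it assumes that index 0 is reserved so random words are drawn from `1..vocabulary_size - 1`. It is not the library code.

```python
import random


def skipgrams_sketch(sequence, vocabulary_size, window_size=4,
                     negative_samples=1., seed=None):
    """Build (couples, labels) from a list of word indexes (illustration only)."""
    rng = random.Random(seed)
    couples, labels = [], []
    for i, word in enumerate(sequence):
        start = max(0, i - window_size)
        end = min(len(sequence), i + window_size + 1)
        for j in range(start, end):
            if j != i:
                # Positive sample: a word found in the same window.
                couples.append([word, sequence[j]])
                labels.append(1)
    nb_positive = len(labels)
    nb_negative = int(nb_positive * negative_samples)
    for k in range(nb_negative):
        # Negative sample: reuse a word from a positive couple and pair it
        # with a random word from the vocabulary (assumed range 1..size-1).
        word = couples[k % nb_positive][0]
        couples.append([word, rng.randint(1, vocabulary_size - 1)])
        labels.append(0)
    return couples, labels


couples, labels = skipgrams_sketch([1, 2, 3], vocabulary_size=10,
                                   window_size=1, seed=0)
print(labels)  # [1, 1, 1, 1, 0, 0, 0, 0]
```

With `negative_samples=1.` the sketch produces as many negative couples as positive ones, which matches the description of that argument above.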
---
@@ -56,7 +59,7 @@ keras.preprocessing.sequence.make_sampling_table(size, sampling_factor=1e-5)
Used for generating the `sampling_table` argument for `skipgrams`. `sampling_table[i]` is the probability of sampling the i-th most common word in a dataset (more common words should be sampled less frequently, for balance).
- __Return__: Numpy array of shape `(size,)`.
- __Arguments__:
- __size__: size of the vocabulary considered.
+40
@@ -0,0 +1,40 @@
## Usage of regularizers
Regularizers allow you to apply penalties on layer parameters or layer activity during optimization. These penalties are incorporated in the loss function that the network optimizes.
The penalties are applied on a per-layer basis. The exact API will depend on the layer, but the layers `Dense`, `TimeDistributedDense`, `MaxoutDense`, `Convolution1D` and `Convolution2D` have a unified API.
These layers expose 3 keyword arguments:
- `W_regularizer`: instance of `keras.regularizers.WeightRegularizer`
- `b_regularizer`: instance of `keras.regularizers.WeightRegularizer`
- `activity_regularizer`: instance of `keras.regularizers.ActivityRegularizer`
## Example
```python
from keras.regularizers import l2, activity_l2
model.add(Dense(64, input_dim=64, W_regularizer=l2(0.01), activity_regularizer=activity_l2(0.01)))
```
## Available penalties
```python
keras.regularizers.WeightRegularizer(l1=0., l2=0.)
```
```python
keras.regularizers.ActivityRegularizer(l1=0., l2=0.)
```
## Shortcuts
These are shortcut functions available in `keras.regularizers`.
- __l1__(l=0.01): L1 weight regularization penalty, also known as LASSO
- __l2__(l=0.01): L2 weight regularization penalty, also known as weight decay, or Ridge
- __l1l2__(l1=0.01, l2=0.01): L1-L2 weight regularization penalty, also known as ElasticNet
- __activity_l1__(l=0.01): L1 activity regularization
- __activity_l2__(l=0.01): L2 activity regularization
- __activity_l1l2__(l1=0.01, l2=0.01): L1+L2 activity regularization
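To make the effect of these penalties concrete, the sketch below computes an L1/L2 term and adds it to a loss value. It assumes the usual LASSO/Ridge definitions (weighted sum of absolute values and of squares); `weight_penalty`, the example weights and `data_loss` are hypothetical names and numbers, not part of the Keras API.

```python
def weight_penalty(weights, l1=0., l2=0.):
    """L1/L2 penalty over a flat list of weights (illustration only)."""
    l1_term = l1 * sum(abs(w) for w in weights)   # LASSO-style term
    l2_term = l2 * sum(w * w for w in weights)    # Ridge / weight-decay term
    return l1_term + l2_term


weights = [0.5, -1.0, 2.0]
data_loss = 0.25  # hypothetical loss value before regularization

# The penalty is simply added to the loss that the optimizer minimizes.
total_loss = data_loss + weight_penalty(weights, l2=0.01)
print(total_loss)
```

An activity penalty follows the same pattern, except that it would be computed on a layer's output rather than on its weights.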
+45
@@ -0,0 +1,45 @@
# Wrappers for the Scikit-Learn API
You can use `Sequential` Keras models (single-input only) as part of your Scikit-Learn workflow via the wrappers found in the `keras.wrappers.scikit_learn` module.
There are two wrappers available:
`keras.wrappers.scikit_learn.KerasClassifier(build_fn=None, **sk_params)`, which implements the Scikit-Learn classifier interface,
`keras.wrappers.scikit_learn.KerasRegressor(build_fn=None, **sk_params)`, which implements the Scikit-Learn regressor interface.
### Arguments
- __build_fn__: callable function or class instance
- __sk_params__: model parameters & fitting parameters
`build_fn` should construct, compile and return a Keras model, which
will then be used to fit/predict. One of the following
three values could be passed to `build_fn`:
1. A function
2. An instance of a class that implements the `__call__` method
3. None. This means you implement a class that inherits from either
`KerasClassifier` or `KerasRegressor`. The `__call__` method of the
present class will then be treated as the default `build_fn`.
`sk_params` takes both model parameters and fitting parameters. Legal model
parameters are the arguments of `build_fn`. Note that like all other
estimators in scikit-learn, `build_fn` should provide default values for
its arguments, so that you could create the estimator without passing any
values to `sk_params`.
`sk_params` could also accept parameters for calling `fit`, `predict`,
`predict_proba`, and `score` methods (e.g., `nb_epoch`, `batch_size`).
Fitting (predicting) parameters are selected in the following order:
1. Values passed to the dictionary arguments of
`fit`, `predict`, `predict_proba`, and `score` methods
2. Values passed to `sk_params`
3. The default values of the `keras.models.Sequential`
`fit`, `predict`, `predict_proba` and `score` methods
When using scikit-learn's `grid_search` API, legal tunable parameters are
those you could pass to `sk_params`, including fitting parameters.
In other words, you could use `grid_search` to search for the best
`batch_size` or `nb_epoch` as well as the model parameters.
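The selection order described above can be sketched as a small dictionary merge. This is an illustration only: `resolve_params` is a hypothetical helper, the default values in `fit_defaults` are assumed for the example, and `hidden_units` stands for an arbitrary `build_fn` argument.

```python
def resolve_params(method_defaults, sk_params, call_kwargs):
    """Pick each parameter from the highest-priority source that sets it."""
    # 3. Start from the method's own default values (lowest priority).
    resolved = dict(method_defaults)
    # 2. Override with values passed to `sk_params`, ignoring entries that
    #    this method does not accept (e.g. model parameters for build_fn).
    resolved.update({k: v for k, v in sk_params.items()
                     if k in method_defaults})
    # 1. Values passed directly to the method call win (highest priority).
    resolved.update(call_kwargs)
    return resolved


fit_defaults = {'batch_size': 32, 'nb_epoch': 10}   # assumed defaults
sk_params = {'nb_epoch': 20, 'hidden_units': 64}    # hypothetical wrapper arguments
fit_params = resolve_params(fit_defaults, sk_params, {'batch_size': 128})
print(fit_params)
```

Here `batch_size` comes from the call itself, `nb_epoch` from `sk_params`, and `hidden_units` is left out because it is a model parameter rather than a fitting parameter.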
+25
@@ -0,0 +1,25 @@
## Model visualization
The `keras.utils.visualize_util` module provides utility functions to plot
a Keras model (using graphviz).
This will plot a graph of the model and save it to a file:
```python
from keras.utils.visualize_util import plot
plot(model, to_file='model.png')
```
`plot` takes two optional arguments:
- `show_shapes` (defaults to False) controls whether output shapes are shown in the graph.
- `show_layer_names` (defaults to True) controls whether layer names are shown in the graph.
You can also directly obtain the `pydot.Graph` object and render it yourself,
for example to show it in an IPython notebook:
```python
from IPython.display import SVG
from keras.utils.visualize_util import model_to_dot
SVG(model_to_dot(model).create(prog='dot', format='svg'))
```
+97
@@ -0,0 +1,97 @@
# Keras examples directory
[addition_rnn.py](addition_rnn.py)
Implementation of sequence to sequence learning for performing addition of two numbers (as strings).
[antirectifier.py](antirectifier.py)
Demonstrates how to write custom layers for Keras.
[babi_memnn.py](babi_memnn.py)
Trains a memory network on the bAbI dataset for reading comprehension.
[babi_rnn.py](babi_rnn.py)
Trains a two-branch recurrent network on the bAbI dataset for reading comprehension.
[cifar10_cnn.py](cifar10_cnn.py)
Trains a simple deep CNN on the CIFAR10 small images dataset.
[conv_filter_visualization.py](conv_filter_visualization.py)
Visualization of the filters of VGG16, via gradient ascent in input space.
[conv_lstm.py](conv_lstm.py)
Demonstrates the use of a convolutional LSTM network.
[deep_dream.py](deep_dream.py)
Deep Dreams in Keras.
[image_ocr.py](image_ocr.py)
Trains a convolutional stack followed by a recurrent stack and a CTC logloss function to perform optical character recognition (OCR).
[imdb_bidirectional_lstm.py](imdb_bidirectional_lstm.py)
Trains a Bidirectional LSTM on the IMDB sentiment classification task.
[imdb_cnn.py](imdb_cnn.py)
Demonstrates the use of Convolution1D for text classification.
[imdb_cnn_lstm.py](imdb_cnn_lstm.py)
Trains a convolutional stack followed by a recurrent stack network on the IMDB sentiment classification task.
[imdb_fasttext.py](imdb_fasttext.py)
Trains a FastText model on the IMDB sentiment classification task.
[imdb_lstm.py](imdb_lstm.py)
Trains an LSTM on the IMDB sentiment classification task.
[lstm_benchmark.py](lstm_benchmark.py)
Compares different LSTM implementations on the IMDB sentiment classification task.
[lstm_text_generation.py](lstm_text_generation.py)
Generates text from Nietzsche's writings.
[mnist_cnn.py](mnist_cnn.py)
Trains a simple convnet on the MNIST dataset.
[mnist_hierarchical_rnn.py](mnist_hierarchical_rnn.py)
Trains a Hierarchical RNN (HRNN) to classify MNIST digits.
[mnist_irnn.py](mnist_irnn.py)
Reproduction of the IRNN experiment with pixel-by-pixel sequential MNIST in "A Simple Way to Initialize Recurrent Networks of Rectified Linear Units" by Le et al.
[mnist_mlp.py](mnist_mlp.py)
Trains a simple deep multi-layer perceptron on the MNIST dataset.
[mnist_net2net.py](mnist_net2net.py)
Reproduction of the Net2Net experiment with MNIST in "Net2Net: Accelerating Learning via Knowledge Transfer".
[mnist_siamese_graph.py](mnist_siamese_graph.py)
Trains a Siamese multi-layer perceptron on pairs of digits from the MNIST dataset.
[mnist_sklearn_wrapper.py](mnist_sklearn_wrapper.py)
Demonstrates how to use the sklearn wrapper.
[mnist_swwae.py](mnist_swwae.py)
Trains a Stacked What-Where AutoEncoder built on residual blocks on the MNIST dataset.
[mnist_transfer_cnn.py](mnist_transfer_cnn.py)
Transfer learning toy example.
[neural_doodle.py](neural_doodle.py)
Neural doodle.
[neural_style_transfer.py](neural_style_transfer.py)
Neural style transfer.
[pretrained_word_embeddings.py](pretrained_word_embeddings.py)
Loads pre-trained word embeddings (GloVe embeddings) into a frozen Keras Embedding layer, and uses it to train a text classification model on the 20 Newsgroup dataset.
[reuters_mlp.py](reuters_mlp.py)
Trains and evaluates a simple MLP on the Reuters newswire topic classification task.
[stateful_lstm.py](stateful_lstm.py)
Demonstrates how to use stateful RNNs to model long sequences efficiently.
[variational_autoencoder.py](variational_autoencoder.py)
Demonstrates how to build a variational autoencoder.
[variational_autoencoder_deconv.py](variational_autoencoder_deconv.py)
Demonstrates how to build a variational autoencoder with Keras using deconvolution layers.
+168
@@ -0,0 +1,168 @@
# -*- coding: utf-8 -*-
'''An implementation of sequence to sequence learning for performing addition
Input: "535+61"
Output: "596"
Padding is handled by using a repeated sentinel character (space)
Input may optionally be inverted, shown to increase performance in many tasks in:
"Learning to Execute"
http://arxiv.org/abs/1410.4615
and
"Sequence to Sequence Learning with Neural Networks"
http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf
Theoretically it introduces shorter term dependencies between source and target.
Two digits inverted:
+ One layer LSTM (128 HN), 5k training examples = 99% train/test accuracy in 55 epochs
Three digits inverted:
+ One layer LSTM (128 HN), 50k training examples = 99% train/test accuracy in 100 epochs
Four digits inverted:
+ One layer LSTM (128 HN), 400k training examples = 99% train/test accuracy in 20 epochs
Five digits inverted:
+ One layer LSTM (128 HN), 550k training examples = 99% train/test accuracy in 30 epochs
'''
from __future__ import print_function
from keras.models import Sequential
from keras.engine.training import slice_X
from keras.layers import Activation, TimeDistributed, Dense, RepeatVector, recurrent
import numpy as np
from six.moves import range
class CharacterTable(object):
    '''
    Given a set of characters:
    + Encode them to a one hot integer representation
    + Decode the one hot integer representation to their character output
    + Decode a vector of probabilities to their character output
    '''
    def __init__(self, chars, maxlen):
        self.chars = sorted(set(chars))
        self.char_indices = dict((c, i) for i, c in enumerate(self.chars))
        self.indices_char = dict((i, c) for i, c in enumerate(self.chars))
        self.maxlen = maxlen

    def encode(self, C, maxlen=None):
        maxlen = maxlen if maxlen else self.maxlen
        X = np.zeros((maxlen, len(self.chars)))
        for i, c in enumerate(C):
            X[i, self.char_indices[c]] = 1
        return X

    def decode(self, X, calc_argmax=True):
        if calc_argmax:
            X = X.argmax(axis=-1)
        return ''.join(self.indices_char[x] for x in X)


class colors:
    ok = '\033[92m'
    fail = '\033[91m'
    close = '\033[0m'
# Parameters for the model and dataset
TRAINING_SIZE = 50000
DIGITS = 3
INVERT = True
# Try replacing this with recurrent.GRU or recurrent.SimpleRNN
RNN = recurrent.LSTM
HIDDEN_SIZE = 128
BATCH_SIZE = 128
LAYERS = 1
MAXLEN = DIGITS + 1 + DIGITS
chars = '0123456789+ '
ctable = CharacterTable(chars, MAXLEN)
questions = []
expected = []
seen = set()
print('Generating data...')
while len(questions) < TRAINING_SIZE:
    f = lambda: int(''.join(np.random.choice(list('0123456789')) for i in range(np.random.randint(1, DIGITS + 1))))
    a, b = f(), f()
    # Skip any addition questions we've already seen
    # Also skip any such that X+Y == Y+X (hence the sorting)
    key = tuple(sorted((a, b)))
    if key in seen:
        continue
    seen.add(key)
    # Pad the data with spaces such that it is always MAXLEN
    q = '{}+{}'.format(a, b)
    query = q + ' ' * (MAXLEN - len(q))
    ans = str(a + b)
    # Answers can be of maximum size DIGITS + 1
    ans += ' ' * (DIGITS + 1 - len(ans))
    if INVERT:
        query = query[::-1]
    questions.append(query)
    expected.append(ans)
print('Total addition questions:', len(questions))
print('Vectorization...')
# Use the builtin bool: the np.bool alias is not available in recent NumPy releases
X = np.zeros((len(questions), MAXLEN, len(chars)), dtype=bool)
y = np.zeros((len(questions), DIGITS + 1, len(chars)), dtype=bool)
for i, sentence in enumerate(questions):
    X[i] = ctable.encode(sentence, maxlen=MAXLEN)
for i, sentence in enumerate(expected):
    y[i] = ctable.encode(sentence, maxlen=DIGITS + 1)
# Shuffle (X, y) in unison as the later parts of X will almost all be larger digits
indices = np.arange(len(y))
np.random.shuffle(indices)
X = X[indices]
y = y[indices]
# Explicitly set apart 10% for validation data that we never train over
# Integer division keeps split_at usable as a slice index on Python 3 as well
split_at = len(X) - len(X) // 10
(X_train, X_val) = (slice_X(X, 0, split_at), slice_X(X, split_at))
(y_train, y_val) = (y[:split_at], y[split_at:])
print(X_train.shape)
print(y_train.shape)
print('Build model...')
model = Sequential()
# "Encode" the input sequence using an RNN, producing an output of HIDDEN_SIZE
# note: in a situation where your input sequences have a variable length,
# use input_shape=(None, nb_feature).
model.add(RNN(HIDDEN_SIZE, input_shape=(MAXLEN, len(chars))))
# For the decoder's input, we repeat the encoded input for each time step
model.add(RepeatVector(DIGITS + 1))
# The decoder RNN could be multiple layers stacked or a single layer
for _ in range(LAYERS):
    model.add(RNN(HIDDEN_SIZE, return_sequences=True))
# For each step of the output sequence, decide which character should be chosen
model.add(TimeDistributed(Dense(len(chars))))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
# Train the model each generation and show predictions against the validation dataset
for iteration in range(1, 200):
    print()
    print('-' * 50)
    print('Iteration', iteration)
    model.fit(X_train, y_train, batch_size=BATCH_SIZE, nb_epoch=1,
              validation_data=(X_val, y_val))
    ###
    # Select 10 samples from the validation set at random so we can visualize errors
    for i in range(10):
        ind = np.random.randint(0, len(X_val))
        rowX, rowy = X_val[np.array([ind])], y_val[np.array([ind])]
        preds = model.predict_classes(rowX, verbose=0)
        q = ctable.decode(rowX[0])
        correct = ctable.decode(rowy[0])
        guess = ctable.decode(preds[0], calc_argmax=False)
        print('Q', q[::-1] if INVERT else q)
        print('T', correct)
        # Green check mark for a correct guess, red cross otherwise
        print(colors.ok + '☑' + colors.close if correct == guess else colors.fail + '☒' + colors.close, guess)
        print('---')
+104
@@ -0,0 +1,104 @@
'''The example demonstrates how to write custom layers for Keras.
We build a custom activation layer called 'Antirectifier',
which modifies the shape of the tensor that passes through it.
We need to specify two methods: `get_output_shape_for` and `call`.
Note that the same result can also be achieved via a Lambda layer.
Because our custom layer is written with primitives from the Keras
backend (`K`), our code can run both on TensorFlow and Theano.
'''
from __future__ import print_function
from keras.models import Sequential
from keras.layers import Dense, Dropout, Layer, Activation
from keras.datasets import mnist
from keras import backend as K
from keras.utils import np_utils
class Antirectifier(Layer):
    '''This is the combination of a sample-wise
    L2 normalization with the concatenation of the
    positive part of the input with the negative part
    of the input. The result is a tensor of samples that are
    twice as large as the input samples.

    It can be used in place of a ReLU.

    # Input shape
        2D tensor of shape (samples, n)

    # Output shape
        2D tensor of shape (samples, 2*n)

    # Theoretical justification
        When applying ReLU, assuming that the distribution
        of the previous output is approximately centered around 0.,
        you are discarding half of your input. This is inefficient.

        Antirectifier returns all-positive outputs like ReLU,
        without discarding any data.

        Tests on MNIST show that Antirectifier makes it possible to train
        networks with half as many parameters yet with classification
        accuracy comparable to an equivalent ReLU-based network.
    '''

    def get_output_shape_for(self, input_shape):
        shape = list(input_shape)
        assert len(shape) == 2  # only valid for 2D tensors
        shape[-1] *= 2
        return tuple(shape)

    def call(self, x, mask=None):
        x -= K.mean(x, axis=1, keepdims=True)
        x = K.l2_normalize(x, axis=1)
        pos = K.relu(x)
        neg = K.relu(-x)
        return K.concatenate([pos, neg], axis=1)
# global parameters
batch_size = 128
nb_classes = 10
nb_epoch = 40
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
# build the model
model = Sequential()
model.add(Dense(256, input_shape=(784,)))
model.add(Antirectifier())
model.add(Dropout(0.1))
model.add(Dense(256))
model.add(Antirectifier())
model.add(Dropout(0.1))
model.add(Dense(10))
model.add(Activation('softmax'))
# compile the model
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# train the model
model.fit(X_train, Y_train,
batch_size=batch_size, nb_epoch=nb_epoch,
verbose=1, validation_data=(X_test, Y_test))
# next, compare with an equivalent network
# with 2x bigger Dense layers and ReLU
+210
@@ -0,0 +1,210 @@
'''Trains a memory network on the bAbI dataset.
References:
- Jason Weston, Antoine Bordes, Sumit Chopra, Tomas Mikolov, Alexander M. Rush,
"Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks",
http://arxiv.org/abs/1502.05698
- Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, Rob Fergus,
"End-To-End Memory Networks",
http://arxiv.org/abs/1503.08895
Reaches 98.6% accuracy on task 'single_supporting_fact_10k' after 120 epochs.
Time per epoch: 3s on CPU (core i7).
'''
from __future__ import print_function
from keras.models import Sequential
from keras.layers.embeddings import Embedding
from keras.layers import Activation, Dense, Merge, Permute, Dropout
from keras.layers import LSTM
from keras.utils.data_utils import get_file
from keras.preprocessing.sequence import pad_sequences
from functools import reduce
import tarfile
import numpy as np
import re
def tokenize(sent):
    '''Return the tokens of a sentence including punctuation.

    >>> tokenize('Bob dropped the apple. Where is the apple?')
    ['Bob', 'dropped', 'the', 'apple', '.', 'Where', 'is', 'the', 'apple', '?']
    '''
    # Raw string and no trailing '?': a pattern that can match the empty
    # string splits between every character on Python 3.7+.
    return [x.strip() for x in re.split(r'(\W+)', sent) if x.strip()]
def parse_stories(lines, only_supporting=False):
    '''Parse stories provided in the bAbi tasks format

    If only_supporting is true, only the sentences that support the answer are kept.
    '''
    data = []
    story = []
    for line in lines:
        line = line.decode('utf-8').strip()
        nid, line = line.split(' ', 1)
        nid = int(nid)
        if nid == 1:
            story = []
        if '\t' in line:
            q, a, supporting = line.split('\t')
            q = tokenize(q)
            substory = None
            if only_supporting:
                # Only select the related substory
                supporting = map(int, supporting.split())
                substory = [story[i - 1] for i in supporting]
            else:
                # Provide all the substories
                substory = [x for x in story if x]
            data.append((substory, q, a))
            story.append('')
        else:
            sent = tokenize(line)
            story.append(sent)
    return data
def get_stories(f, only_supporting=False, max_length=None):
    '''Given a file name, read the file, retrieve the stories, and then convert the sentences into a single story.

    If max_length is supplied, any stories longer than max_length tokens will be discarded.
    '''
    data = parse_stories(f.readlines(), only_supporting=only_supporting)
    flatten = lambda data: reduce(lambda x, y: x + y, data)
    data = [(flatten(story), q, answer) for story, q, answer in data if not max_length or len(flatten(story)) < max_length]
    return data
def vectorize_stories(data, word_idx, story_maxlen, query_maxlen):
    X = []
    Xq = []
    Y = []
    for story, query, answer in data:
        x = [word_idx[w] for w in story]
        xq = [word_idx[w] for w in query]
        y = np.zeros(len(word_idx) + 1)  # let's not forget that index 0 is reserved
        y[word_idx[answer]] = 1
        X.append(x)
        Xq.append(xq)
        Y.append(y)
    return (pad_sequences(X, maxlen=story_maxlen),
            pad_sequences(Xq, maxlen=query_maxlen), np.array(Y))
try:
    path = get_file('babi-tasks-v1-2.tar.gz', origin='https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz')
except:
    print('Error downloading dataset, please download it manually:\n'
          '$ wget http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz\n'
          '$ mv tasks_1-20_v1-2.tar.gz ~/.keras/datasets/babi-tasks-v1-2.tar.gz')
    raise
tar = tarfile.open(path)
challenges = {
# QA1 with 10,000 samples
'single_supporting_fact_10k': 'tasks_1-20_v1-2/en-10k/qa1_single-supporting-fact_{}.txt',
# QA2 with 10,000 samples
'two_supporting_facts_10k': 'tasks_1-20_v1-2/en-10k/qa2_two-supporting-facts_{}.txt',
}
challenge_type = 'single_supporting_fact_10k'
challenge = challenges[challenge_type]
print('Extracting stories for the challenge:', challenge_type)
train_stories = get_stories(tar.extractfile(challenge.format('train')))
test_stories = get_stories(tar.extractfile(challenge.format('test')))
vocab = sorted(reduce(lambda x, y: x | y, (set(story + q + [answer]) for story, q, answer in train_stories + test_stories)))
# Reserve 0 for masking via pad_sequences
vocab_size = len(vocab) + 1
story_maxlen = max(map(len, (x for x, _, _ in train_stories + test_stories)))
query_maxlen = max(map(len, (x for _, x, _ in train_stories + test_stories)))
print('-')
print('Vocab size:', vocab_size, 'unique words')
print('Story max length:', story_maxlen, 'words')
print('Query max length:', query_maxlen, 'words')
print('Number of training stories:', len(train_stories))
print('Number of test stories:', len(test_stories))
print('-')
print('Here\'s what a "story" tuple looks like (input, query, answer):')
print(train_stories[0])
print('-')
print('Vectorizing the word sequences...')
word_idx = dict((c, i + 1) for i, c in enumerate(vocab))
inputs_train, queries_train, answers_train = vectorize_stories(train_stories, word_idx, story_maxlen, query_maxlen)
inputs_test, queries_test, answers_test = vectorize_stories(test_stories, word_idx, story_maxlen, query_maxlen)
print('-')
print('inputs: integer tensor of shape (samples, max_length)')
print('inputs_train shape:', inputs_train.shape)
print('inputs_test shape:', inputs_test.shape)
print('-')
print('queries: integer tensor of shape (samples, max_length)')
print('queries_train shape:', queries_train.shape)
print('queries_test shape:', queries_test.shape)
print('-')
print('answers: binary (1 or 0) tensor of shape (samples, vocab_size)')
print('answers_train shape:', answers_train.shape)
print('answers_test shape:', answers_test.shape)
print('-')
print('Compiling...')
# embed the input sequence into a sequence of vectors
input_encoder_m = Sequential()
input_encoder_m.add(Embedding(input_dim=vocab_size,
output_dim=64,
input_length=story_maxlen))
input_encoder_m.add(Dropout(0.3))
# output: (samples, story_maxlen, embedding_dim)
# embed the question into a sequence of vectors
question_encoder = Sequential()
question_encoder.add(Embedding(input_dim=vocab_size,
output_dim=64,
input_length=query_maxlen))
question_encoder.add(Dropout(0.3))
# output: (samples, query_maxlen, embedding_dim)
# compute a 'match' between input sequence elements (which are vectors)
# and the question vector sequence
match = Sequential()
match.add(Merge([input_encoder_m, question_encoder],
mode='dot',
dot_axes=[2, 2]))
match.add(Activation('softmax'))
# output: (samples, story_maxlen, query_maxlen)
# embed the input into a single vector with size = story_maxlen:
input_encoder_c = Sequential()
input_encoder_c.add(Embedding(input_dim=vocab_size,
output_dim=query_maxlen,
input_length=story_maxlen))
input_encoder_c.add(Dropout(0.3))
# output: (samples, story_maxlen, query_maxlen)
# sum the match vector with the input vector:
response = Sequential()
response.add(Merge([match, input_encoder_c], mode='sum'))
# output: (samples, story_maxlen, query_maxlen)
response.add(Permute((2, 1))) # output: (samples, query_maxlen, story_maxlen)
# concatenate the match vector with the question vector,
# and do logistic regression on top
answer = Sequential()
answer.add(Merge([response, question_encoder], mode='concat', concat_axis=-1))
# the original paper uses a matrix multiplication for this reduction step.
# we choose to use a RNN instead.
answer.add(LSTM(32))
# one regularization layer -- more would probably be needed.
answer.add(Dropout(0.3))
answer.add(Dense(vocab_size))
# we output a probability distribution over the vocabulary
answer.add(Activation('softmax'))
answer.compile(optimizer='rmsprop', loss='categorical_crossentropy',
metrics=['accuracy'])
# Note: you could use the functional API (`Model`) to avoid feeding the same input twice
answer.fit([inputs_train, queries_train, inputs_train], answers_train,
batch_size=32,
nb_epoch=120,
validation_data=([inputs_test, queries_test, inputs_test], answers_test))
+211
@@ -0,0 +1,211 @@
'''Trains two recurrent neural networks based upon a story and a question.
The resulting merged vector is then queried to answer a range of bAbI tasks.
The results are comparable to those for an LSTM model provided in Weston et al.:
"Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks"
http://arxiv.org/abs/1502.05698
Task Number | FB LSTM Baseline | Keras QA
--- | --- | ---
QA1 - Single Supporting Fact | 50 | 100.0
QA2 - Two Supporting Facts | 20 | 50.0
QA3 - Three Supporting Facts | 20 | 20.5
QA4 - Two Arg. Relations | 61 | 62.9
QA5 - Three Arg. Relations | 70 | 61.9
QA6 - Yes/No Questions | 48 | 50.7
QA7 - Counting | 49 | 78.9
QA8 - Lists/Sets | 45 | 77.2
QA9 - Simple Negation | 64 | 64.0
QA10 - Indefinite Knowledge | 44 | 47.7
QA11 - Basic Coreference | 72 | 74.9
QA12 - Conjunction | 74 | 76.4
QA13 - Compound Coreference | 94 | 94.4
QA14 - Time Reasoning | 27 | 34.8
QA15 - Basic Deduction | 21 | 32.4
QA16 - Basic Induction | 23 | 50.6
QA17 - Positional Reasoning | 51 | 49.1
QA18 - Size Reasoning | 52 | 90.8
QA19 - Path Finding | 8 | 9.0
QA20 - Agent's Motivations | 91 | 90.7
For the resources related to the bAbI project, refer to:
https://research.facebook.com/researchers/1543934539189348
Notes:
- With default word, sentence, and query vector sizes, the GRU model achieves:
- 100% test accuracy on QA1 in 20 epochs (2 seconds per epoch on CPU)
- 50% test accuracy on QA2 in 20 epochs (16 seconds per epoch on CPU)
In comparison, the Facebook paper achieves 50% and 20% for the LSTM baseline.
- The task does not traditionally parse the question separately. This likely
improves accuracy and is a good example of merging two RNNs.
- The word vector embeddings are not shared between the story and question RNNs.
- See how the accuracy changes given 10,000 training samples (en-10k) instead
of only 1000. 1000 was used in order to be comparable to the original paper.
- Experiment with GRU, LSTM, and JZS1-3 as they give subtly different results.
- The length and noise (i.e. 'useless' story components) impact the ability for
LSTMs / GRUs to provide the correct answer. Given only the supporting facts,
these RNNs can achieve 100% accuracy on many tasks. Memory networks and neural
networks that use attentional processes can efficiently search through this
noise to find the relevant statements, improving performance substantially.
This becomes especially obvious on QA2 and QA3, both far longer than QA1.
'''
from __future__ import print_function
from functools import reduce
import re
import tarfile
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.utils.data_utils import get_file
from keras.layers.embeddings import Embedding
from keras.layers import Dense, Merge, Dropout, RepeatVector
from keras.layers import recurrent
from keras.models import Sequential
from keras.preprocessing.sequence import pad_sequences
def tokenize(sent):
    '''Return the tokens of a sentence including punctuation.

    >>> tokenize('Bob dropped the apple. Where is the apple?')
    ['Bob', 'dropped', 'the', 'apple', '.', 'Where', 'is', 'the', 'apple', '?']
    '''
    # Raw string and no trailing '?': a pattern that can match the empty
    # string splits between every character on Python 3.7+.
    return [x.strip() for x in re.split(r'(\W+)', sent) if x.strip()]
def parse_stories(lines, only_supporting=False):
    '''Parse stories provided in the bAbi tasks format

    If only_supporting is true, only the sentences that support the answer are kept.
    '''
    data = []
    story = []
    for line in lines:
        line = line.decode('utf-8').strip()
        nid, line = line.split(' ', 1)
        nid = int(nid)
        if nid == 1:
            story = []
        if '\t' in line:
            q, a, supporting = line.split('\t')
            q = tokenize(q)
            substory = None
            if only_supporting:
                # Only select the related substory
                supporting = map(int, supporting.split())
                substory = [story[i - 1] for i in supporting]
            else:
                # Provide all the substories
                substory = [x for x in story if x]
            data.append((substory, q, a))
            story.append('')
        else:
            sent = tokenize(line)
            story.append(sent)
    return data
def get_stories(f, only_supporting=False, max_length=None):
    '''Given a file name, read the file, retrieve the stories, and then convert the sentences into a single story.

    If max_length is supplied, any stories longer than max_length tokens will be discarded.
    '''
    data = parse_stories(f.readlines(), only_supporting=only_supporting)
    flatten = lambda data: reduce(lambda x, y: x + y, data)
    data = [(flatten(story), q, answer) for story, q, answer in data if not max_length or len(flatten(story)) < max_length]
    return data
def vectorize_stories(data, word_idx, story_maxlen, query_maxlen):
    X = []
    Xq = []
    Y = []
    for story, query, answer in data:
        x = [word_idx[w] for w in story]
        xq = [word_idx[w] for w in query]
        y = np.zeros(len(word_idx) + 1)  # let's not forget that index 0 is reserved
        y[word_idx[answer]] = 1
        X.append(x)
        Xq.append(xq)
        Y.append(y)
    return pad_sequences(X, maxlen=story_maxlen), pad_sequences(Xq, maxlen=query_maxlen), np.array(Y)
RNN = recurrent.LSTM
EMBED_HIDDEN_SIZE = 50
SENT_HIDDEN_SIZE = 100
QUERY_HIDDEN_SIZE = 100
BATCH_SIZE = 32
EPOCHS = 40
print('RNN / Embed / Sent / Query = {}, {}, {}, {}'.format(RNN, EMBED_HIDDEN_SIZE, SENT_HIDDEN_SIZE, QUERY_HIDDEN_SIZE))
try:
    path = get_file('babi-tasks-v1-2.tar.gz', origin='https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz')
except:
    print('Error downloading dataset, please download it manually:\n'
          '$ wget http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz\n'
          '$ mv tasks_1-20_v1-2.tar.gz ~/.keras/datasets/babi-tasks-v1-2.tar.gz')
    raise
tar = tarfile.open(path)
# Default QA1 with 1000 samples
# challenge = 'tasks_1-20_v1-2/en/qa1_single-supporting-fact_{}.txt'
# QA1 with 10,000 samples
# challenge = 'tasks_1-20_v1-2/en-10k/qa1_single-supporting-fact_{}.txt'
# QA2 with 1000 samples
challenge = 'tasks_1-20_v1-2/en/qa2_two-supporting-facts_{}.txt'
# QA2 with 10,000 samples
# challenge = 'tasks_1-20_v1-2/en-10k/qa2_two-supporting-facts_{}.txt'
train = get_stories(tar.extractfile(challenge.format('train')))
test = get_stories(tar.extractfile(challenge.format('test')))
vocab = sorted(reduce(lambda x, y: x | y, (set(story + q + [answer]) for story, q, answer in train + test)))
# Reserve 0 for masking via pad_sequences
vocab_size = len(vocab) + 1
word_idx = dict((c, i + 1) for i, c in enumerate(vocab))
story_maxlen = max(map(len, (x for x, _, _ in train + test)))
query_maxlen = max(map(len, (x for _, x, _ in train + test)))
X, Xq, Y = vectorize_stories(train, word_idx, story_maxlen, query_maxlen)
tX, tXq, tY = vectorize_stories(test, word_idx, story_maxlen, query_maxlen)
print('vocab = {}'.format(vocab))
print('X.shape = {}'.format(X.shape))
print('Xq.shape = {}'.format(Xq.shape))
print('Y.shape = {}'.format(Y.shape))
print('story_maxlen, query_maxlen = {}, {}'.format(story_maxlen, query_maxlen))
print('Build model...')
sentrnn = Sequential()
sentrnn.add(Embedding(vocab_size, EMBED_HIDDEN_SIZE,
input_length=story_maxlen))
sentrnn.add(Dropout(0.3))
qrnn = Sequential()
qrnn.add(Embedding(vocab_size, EMBED_HIDDEN_SIZE,
input_length=query_maxlen))
qrnn.add(Dropout(0.3))
qrnn.add(RNN(EMBED_HIDDEN_SIZE, return_sequences=False))
qrnn.add(RepeatVector(story_maxlen))
model = Sequential()
model.add(Merge([sentrnn, qrnn], mode='sum'))
model.add(RNN(EMBED_HIDDEN_SIZE, return_sequences=False))
model.add(Dropout(0.3))
model.add(Dense(vocab_size, activation='softmax'))
model.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])
print('Training')
model.fit([X, Xq], Y, batch_size=BATCH_SIZE, nb_epoch=EPOCHS, validation_split=0.05)
loss, acc = model.evaluate([tX, tXq], tY, batch_size=BATCH_SIZE)
print('Test loss / test accuracy = {:.4f} / {:.4f}'.format(loss, acc))
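The vectorization done by `vectorize_stories` can be sketched without Keras. This is a minimal pure-Python illustration, not the script's own code: `pad_left` and `vectorize` are hypothetical helpers, and `pad_left` assumes the default `pad_sequences` behaviour (zeros added at the front, overlong sequences cut at the front). The story and query tokens are made up for the example.

```python
def pad_left(seq, maxlen):
    """Left-pad with 0, keeping only the last `maxlen` items (assumed default)."""
    seq = seq[-maxlen:]
    return [0] * (maxlen - len(seq)) + seq


def vectorize(story, query, answer, word_idx, story_maxlen, query_maxlen):
    """Turn one (story, query, answer) triple into index lists and a one-hot answer."""
    x = pad_left([word_idx[w] for w in story], story_maxlen)
    xq = pad_left([word_idx[w] for w in query], query_maxlen)
    y = [0] * (len(word_idx) + 1)  # slot 0 is reserved for padding
    y[word_idx[answer]] = 1
    return x, xq, y


# Made-up sample, tokenized the way the bAbI stories are
story = ['Mary', 'moved', 'to', 'the', 'bathroom', '.']
query = ['Where', 'is', 'Mary', '?']
answer = 'bathroom'

vocab = sorted(set(story + query + [answer]))
word_idx = {w: i + 1 for i, w in enumerate(vocab)}  # indices start at 1

x, xq, y = vectorize(story, query, answer, word_idx,
                     story_maxlen=8, query_maxlen=5)
print(x)
print(xq)
print(y)
```

Index 0 never appears in `word_idx`, so it stays free to act as the padding value.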
+65 -78
View File
@@ -1,35 +1,38 @@
from __future__ import absolute_import
'''Train a simple deep CNN on the CIFAR10 small images dataset.
GPU run command:
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python cifar10_cnn.py
It gets down to 0.65 test logloss in 25 epochs, and down to 0.55 after 50 epochs.
(it's still underfitting at that point, though).
Note: the data was pickled with Python 2, and some encoding issues might prevent you
from loading it in Python 3. You might have to load it in Python 2,
save it in a different format, load it in Python 3 and repickle it.
'''
from __future__ import print_function
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.optimizers import SGD, Adadelta, Adagrad
from keras.utils import np_utils, generic_utils
from six.moves import range
'''
Train a (fairly simple) deep CNN on the CIFAR10 small images dataset.
GPU run command:
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python cifar10_cnn.py
It gets down to 0.65 test logloss in 25 epochs, and down to 0.55 after 50 epochs.
(it's still underfitting at that point, though).
Note: the data was pickled with Python 2, and some encoding issues might prevent you
from loading it in Python 3. You might have to load it in Python 2,
save it in a different format, load it in Python 3 and repickle it.
'''
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import SGD
from keras.utils import np_utils
batch_size = 32
nb_classes = 10
nb_epoch = 200
data_augmentation = True
# the data, shuffled and split between tran and test sets
# input image dimensions
img_rows, img_cols = 32, 32
# the CIFAR10 images are RGB
img_channels = 3
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
@@ -39,85 +42,69 @@ Y_test = np_utils.to_categorical(y_test, nb_classes)
model = Sequential()
model.add(Convolution2D(32, 3, 3, 3, border_mode='full'))
model.add(Convolution2D(32, 3, 3, border_mode='same',
input_shape=X_train.shape[1:]))
model.add(Activation('relu'))
model.add(Convolution2D(32, 32, 3, 3))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(poolsize=(2, 2)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Convolution2D(64, 32, 3, 3, border_mode='full'))
model.add(Convolution2D(64, 3, 3, border_mode='same'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 64, 3, 3))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(poolsize=(2, 2)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(64*8*8, 512, init='normal'))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(512, nb_classes, init='normal'))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
# let's train the model using SGD + momentum (how original).
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
model.compile(loss='categorical_crossentropy',
optimizer=sgd,
metrics=['accuracy'])
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
if not data_augmentation:
print("Not using data augmentation or normalization")
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
X_train /= 255
X_test /= 255
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=10)
score = model.evaluate(X_test, Y_test, batch_size=batch_size)
print('Test score:', score)
print('Not using data augmentation.')
model.fit(X_train, Y_train,
batch_size=batch_size,
nb_epoch=nb_epoch,
validation_data=(X_test, Y_test),
shuffle=True)
else:
print("Using real time data augmentation")
print('Using real-time data augmentation.')
# this will do preprocessing and realtime data augmentation
datagen = ImageDataGenerator(
featurewise_center=True, # set input mean to 0 over the dataset
samplewise_center=False, # set each sample mean to 0
featurewise_std_normalization=True, # divide inputs by std of the dataset
samplewise_std_normalization=False, # divide each input by its std
zca_whitening=False, # apply ZCA whitening
rotation_range=20, # randomly rotate images in the range (degrees, 0 to 180)
width_shift_range=0.2, # randomly shift images horizontally (fraction of total width)
height_shift_range=0.2, # randomly shift images vertically (fraction of total height)
horizontal_flip=True, # randomly flip images
vertical_flip=False) # randomly flip images
featurewise_center=False, # set input mean to 0 over the dataset
samplewise_center=False, # set each sample mean to 0
featurewise_std_normalization=False, # divide inputs by std of the dataset
samplewise_std_normalization=False, # divide each input by its std
zca_whitening=False, # apply ZCA whitening
rotation_range=0, # randomly rotate images in the range (degrees, 0 to 180)
width_shift_range=0.1, # randomly shift images horizontally (fraction of total width)
height_shift_range=0.1, # randomly shift images vertically (fraction of total height)
horizontal_flip=True, # randomly flip images
vertical_flip=False) # randomly flip images
# compute quantities required for featurewise normalization
# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(X_train)
for e in range(nb_epoch):
print('-'*40)
print('Epoch', e)
print('-'*40)
print("Training...")
# batch train with realtime data augmentation
progbar = generic_utils.Progbar(X_train.shape[0])
for X_batch, Y_batch in datagen.flow(X_train, Y_train):
loss = model.train(X_batch, Y_batch)
progbar.add(X_batch.shape[0], values=[("train loss", loss)])
print("Testing...")
# test time!
progbar = generic_utils.Progbar(X_test.shape[0])
for X_batch, Y_batch in datagen.flow(X_test, Y_test):
score = model.test(X_batch, Y_batch)
progbar.add(X_batch.shape[0], values=[("test loss", score)])
# fit the model on the batches generated by datagen.flow()
model.fit_generator(datagen.flow(X_train, Y_train,
batch_size=batch_size),
samples_per_epoch=X_train.shape[0],
nb_epoch=nb_epoch,
validation_data=(X_test, Y_test))
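The move from a hand-written batch loop to `fit_generator` relies on the generator yielding batches indefinitely, while `samples_per_epoch` decides where one epoch ends. The sketch below mimics that bookkeeping in plain Python. `batch_generator` and `run_epochs` are hypothetical stand-ins, and the exact cut-off rule used inside Keras is an assumption here.

```python
def batch_generator(samples, batch_size):
    """Yield batches forever, restarting from the top after each full pass."""
    while True:
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]


def run_epochs(generator, samples_per_epoch, nb_epoch):
    """Draw batches until each epoch has seen `samples_per_epoch` samples.

    Returns the number of batches consumed in each epoch.
    """
    batches_per_epoch = []
    for _ in range(nb_epoch):
        seen, n_batches = 0, 0
        while seen < samples_per_epoch:
            batch = next(generator)
            seen += len(batch)
            n_batches += 1
        batches_per_epoch.append(n_batches)
    return batches_per_epoch


data = list(range(10))  # ten made-up samples
counts = run_epochs(batch_generator(data, batch_size=4),
                    samples_per_epoch=len(data), nb_epoch=2)
print(counts)
```

With ten samples and a batch size of four, each epoch takes two full batches and one partial batch.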
+132
View File
@@ -0,0 +1,132 @@
'''Visualization of the filters of VGG16, via gradient ascent in input space.
This script can run on CPU in a few minutes (with the TensorFlow backend).
Results example: http://i.imgur.com/4nj4KjN.jpg
'''
from __future__ import print_function
from scipy.misc import imsave
import numpy as np
import time
from keras.applications import vgg16
from keras import backend as K
# dimensions of the generated pictures for each filter.
img_width = 128
img_height = 128
# the name of the layer we want to visualize
# (see model definition at keras/applications/vgg16.py)
layer_name = 'block5_conv1'
# util function to convert a tensor into a valid image
def deprocess_image(x):
# normalize tensor: center on 0., ensure std is 0.1
x -= x.mean()
x /= (x.std() + 1e-5)
x *= 0.1
# clip to [0, 1]
x += 0.5
x = np.clip(x, 0, 1)
# convert to RGB array
x *= 255
if K.image_dim_ordering() == 'th':
x = x.transpose((1, 2, 0))
x = np.clip(x, 0, 255).astype('uint8')
return x
# build the VGG16 network with ImageNet weights
model = vgg16.VGG16(weights='imagenet', include_top=False)
print('Model loaded.')
model.summary()
# this is the placeholder for the input images
input_img = model.input
# get the symbolic outputs of each "key" layer (we gave them unique names).
layer_dict = dict([(layer.name, layer) for layer in model.layers[1:]])
def normalize(x):
# utility function to normalize a tensor by its root-mean-square value (a rescaled L2 norm)
return x / (K.sqrt(K.mean(K.square(x))) + 1e-5)
kept_filters = []
for filter_index in range(0, 200):
# we only scan through the first 200 filters,
# but there are actually 512 of them
print('Processing filter %d' % filter_index)
start_time = time.time()
# we build a loss function that maximizes the activation
# of the nth filter of the layer considered
layer_output = layer_dict[layer_name].output
if K.image_dim_ordering() == 'th':
loss = K.mean(layer_output[:, filter_index, :, :])
else:
loss = K.mean(layer_output[:, :, :, filter_index])
# we compute the gradient of the input picture wrt this loss
grads = K.gradients(loss, input_img)[0]
# normalization trick: we normalize the gradient
grads = normalize(grads)
# this function returns the loss and grads given the input picture
iterate = K.function([input_img], [loss, grads])
# step size for gradient ascent
step = 1.
# we start from a gray image with some random noise
if K.image_dim_ordering() == 'th':
input_img_data = np.random.random((1, 3, img_width, img_height))
else:
input_img_data = np.random.random((1, img_width, img_height, 3))
input_img_data = (input_img_data - 0.5) * 20 + 128
# we run gradient ascent for 20 steps
for i in range(20):
loss_value, grads_value = iterate([input_img_data])
input_img_data += grads_value * step
print('Current loss value:', loss_value)
if loss_value <= 0.:
# some filters get stuck to 0, we can skip them
break
# decode the resulting input image
if loss_value > 0:
img = deprocess_image(input_img_data[0])
kept_filters.append((img, loss_value))
end_time = time.time()
print('Filter %d processed in %ds' % (filter_index, end_time - start_time))
# we will stitch the best 64 filters on an 8 x 8 grid.
n = 8
# the filters that have the highest loss are assumed to be better-looking.
# we will only keep the top 64 filters.
kept_filters.sort(key=lambda x: x[1], reverse=True)
kept_filters = kept_filters[:n * n]
# build a black picture with enough space for
# our 8 x 8 filters of size 128 x 128, with a 5px margin in between
margin = 5
width = n * img_width + (n - 1) * margin
height = n * img_height + (n - 1) * margin
stitched_filters = np.zeros((width, height, 3))
# fill the picture with our saved filters
for i in range(n):
for j in range(n):
img, loss = kept_filters[i * n + j]
stitched_filters[(img_width + margin) * i: (img_width + margin) * i + img_width,
(img_height + margin) * j: (img_height + margin) * j + img_height, :] = img
# save the result to disk
imsave('stitched_filters_%dx%d.png' % (n, n), stitched_filters)
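The gradient-ascent loop with a normalized gradient can be sketched on a toy problem. This is an assumption-laden illustration, not the VGG16 computation: the "filter response" is a plain weighted mean, its gradient is written by hand instead of coming from `K.gradients`, and `weights` is a hypothetical stand-in for a filter.

```python
import math


def normalize(grads):
    """Scale a gradient by its root-mean-square, like `normalize` in the script."""
    rms = math.sqrt(sum(g * g for g in grads) / len(grads))
    return [g / (rms + 1e-5) for g in grads]


def gradient_ascent(weights, x, steps=20, step=1.0):
    """Maximize the toy response mean(w * x) with normalized-gradient steps."""
    n = len(x)
    for _ in range(steps):
        # derivative of mean(w * x) with respect to each x_i is w_i / n
        grads = normalize([w / n for w in weights])
        x = [xi + step * g for xi, g in zip(x, grads)]
    loss = sum(w * xi for w, xi in zip(weights, x)) / n
    return x, loss


weights = [0.5, -1.0, 2.0]  # hypothetical stand-in for a filter
x_final, loss = gradient_ascent(weights, [0.0, 0.0, 0.0])
print(x_final, loss)
```

Normalizing keeps the update size comparable from step to step, so a fixed `step` of 1 behaves sensibly even when the raw gradient is tiny.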
+142
View File
@@ -0,0 +1,142 @@
""" This script demonstrates the use of a convolutional LSTM network.
This network is used to predict the next frame of an artificially
generated movie which contains moving squares.
"""
from keras.models import Sequential
from keras.layers.convolutional import Convolution3D
from keras.layers.convolutional_recurrent import ConvLSTM2D
from keras.layers.normalization import BatchNormalization
import numpy as np
import pylab as plt
# We create a layer which takes as input movies of shape
# (n_frames, width, height, channels) and returns a movie
# of identical shape.
seq = Sequential()
seq.add(ConvLSTM2D(nb_filter=40, nb_row=3, nb_col=3,
input_shape=(None, 40, 40, 1),
border_mode='same', return_sequences=True))
seq.add(BatchNormalization())
seq.add(ConvLSTM2D(nb_filter=40, nb_row=3, nb_col=3,
border_mode='same', return_sequences=True))
seq.add(BatchNormalization())
seq.add(ConvLSTM2D(nb_filter=40, nb_row=3, nb_col=3,
border_mode='same', return_sequences=True))
seq.add(BatchNormalization())
seq.add(ConvLSTM2D(nb_filter=40, nb_row=3, nb_col=3,
border_mode='same', return_sequences=True))
seq.add(BatchNormalization())
seq.add(Convolution3D(nb_filter=1, kernel_dim1=1, kernel_dim2=3,
kernel_dim3=3, activation='sigmoid',
border_mode='same', dim_ordering='tf'))
seq.compile(loss='binary_crossentropy', optimizer='adadelta')
# Artificial data generation:
# Generate movies with 3 to 7 moving squares inside.
# The squares are 4x4 or 6x6 pixels (half-width w is 2 or 3),
# which move linearly over time.
# For convenience we first create movies with bigger width and height (80x80)
# and at the end we select a 40x40 window.
def generate_movies(n_samples=1200, n_frames=15):
row = 80
col = 80
noisy_movies = np.zeros((n_samples, n_frames, row, col, 1), dtype=np.float64)
shifted_movies = np.zeros((n_samples, n_frames, row, col, 1),
dtype=np.float64)
for i in range(n_samples):
# Add 3 to 7 moving squares
n = np.random.randint(3, 8)
for j in range(n):
# Initial position
xstart = np.random.randint(20, 60)
ystart = np.random.randint(20, 60)
# Direction of motion
directionx = np.random.randint(0, 3) - 1
directiony = np.random.randint(0, 3) - 1
# Size of the square
w = np.random.randint(2, 4)
for t in range(n_frames):
x_shift = xstart + directionx * t
y_shift = ystart + directiony * t
noisy_movies[i, t, x_shift - w: x_shift + w,
y_shift - w: y_shift + w, 0] += 1
# Make it more robust by adding noise.
# The idea is that if during inference,
# the value of the pixel is not exactly one,
# we need to train the network to be robust and still
# consider it as a pixel belonging to a square.
if np.random.randint(0, 2):
noise_f = (-1)**np.random.randint(0, 2)
noisy_movies[i, t,
x_shift - w - 1: x_shift + w + 1,
y_shift - w - 1: y_shift + w + 1,
0] += noise_f * 0.1
# Shift the ground truth by 1
x_shift = xstart + directionx * (t + 1)
y_shift = ystart + directiony * (t + 1)
shifted_movies[i, t, x_shift - w: x_shift + w,
y_shift - w: y_shift + w, 0] += 1
# Cut to a 40x40 window
noisy_movies = noisy_movies[::, ::, 20:60, 20:60, ::]
shifted_movies = shifted_movies[::, ::, 20:60, 20:60, ::]
noisy_movies[noisy_movies >= 1] = 1
shifted_movies[shifted_movies >= 1] = 1
return noisy_movies, shifted_movies
# Train the network
noisy_movies, shifted_movies = generate_movies(n_samples=1200)
seq.fit(noisy_movies[:1000], shifted_movies[:1000], batch_size=10,
nb_epoch=300, validation_split=0.05)
# Testing the network on one movie
# feed it with the first 7 positions and then
# predict the new positions
which = 1004
track = noisy_movies[which][:7, ::, ::, ::]
for j in range(16):
new_pos = seq.predict(track[np.newaxis, ::, ::, ::, ::])
new = new_pos[::, -1, ::, ::, ::]
track = np.concatenate((track, new), axis=0)
# And then compare the predictions
# to the ground truth
track2 = noisy_movies[which][::, ::, ::, ::]
for i in range(15):
fig = plt.figure(figsize=(10, 5))
ax = fig.add_subplot(121)
if i >= 7:
ax.text(1, 3, 'Predictions !', fontsize=20, color='w')
else:
ax.text(1, 3, 'Initial trajectory', fontsize=20)
toplot = track[i, ::, ::, 0]
plt.imshow(toplot)
ax = fig.add_subplot(122)
plt.text(1, 3, 'Ground truth', fontsize=20)
toplot = track2[i, ::, ::, 0]
if i >= 2:
toplot = shifted_movies[which][i - 1, ::, ::, 0]
plt.imshow(toplot)
plt.savefig('%i_animate.png' % (i + 1))
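The test loop above is an autoregressive rollout: the network sees the track so far, only its newest predicted frame is kept, and that frame is appended before the next call. A minimal sketch of the same pattern, where `rollout` is a hypothetical helper and `shift_right` is a hypothetical stand-in for `seq.predict` that treats each frame as a single position:

```python
def rollout(predict_next, seed_frames, n_steps):
    """Repeatedly predict, keep the last predicted frame, and extend the track."""
    track = list(seed_frames)
    for _ in range(n_steps):
        predicted = predict_next(track)  # one output frame per input frame
        track.append(predicted[-1])      # only the newest frame is kept
    return track


def shift_right(track):
    """Stand-in model: every frame is a position that moves by +1 per step."""
    return [pos + 1 for pos in track]


# 7 seed frames and 16 predicted frames, as in the script
track = rollout(shift_right, seed_frames=[0, 1, 2, 3, 4, 5, 6], n_steps=16)
print(len(track), track[-1])
```

Because each prediction is fed back in, errors made early in the rollout carry over into later frames.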
+209
View File
@@ -0,0 +1,209 @@
'''Deep Dreaming in Keras.
Run the script with:
```
python deep_dream.py path_to_your_base_image.jpg prefix_for_results
```
e.g.:
```
python deep_dream.py img/mypic.jpg results/dream
```
It is preferable to run this script on GPU, for speed.
If running on CPU, prefer the TensorFlow backend (much faster).
Example results: http://i.imgur.com/FX6ROg9.jpg
'''
from __future__ import print_function
from keras.preprocessing.image import load_img, img_to_array
import numpy as np
from scipy.misc import imsave
from scipy.optimize import fmin_l_bfgs_b
import time
import argparse
from keras.applications import vgg16
from keras import backend as K
from keras.layers import Input
parser = argparse.ArgumentParser(description='Deep Dreams with Keras.')
parser.add_argument('base_image_path', metavar='base', type=str,
help='Path to the image to transform.')
parser.add_argument('result_prefix', metavar='res_prefix', type=str,
help='Prefix for the saved results.')
args = parser.parse_args()
base_image_path = args.base_image_path
result_prefix = args.result_prefix
# dimensions of the generated picture.
img_width = 600
img_height = 600
# path to the model weights file.
weights_path = 'vgg16_weights.h5'
# some settings we found interesting
saved_settings = {
'bad_trip': {'features': {'block4_conv1': 0.05,
'block4_conv2': 0.01,
'block4_conv3': 0.01},
'continuity': 0.1,
'dream_l2': 0.8,
'jitter': 5},
'dreamy': {'features': {'block5_conv1': 0.05,
'block5_conv2': 0.02},
'continuity': 0.1,
'dream_l2': 0.02,
'jitter': 0},
}
# the settings we will use in this experiment
settings = saved_settings['dreamy']
# util function to open, resize and format pictures into appropriate tensors
def preprocess_image(image_path):
img = load_img(image_path, target_size=(img_width, img_height))
img = img_to_array(img)
img = np.expand_dims(img, axis=0)
img = vgg16.preprocess_input(img)
return img
# util function to convert a tensor into a valid image
def deprocess_image(x):
if K.image_dim_ordering() == 'th':
x = x.reshape((3, img_width, img_height))
x = x.transpose((1, 2, 0))
else:
x = x.reshape((img_width, img_height, 3))
# Remove zero-center by mean pixel
x[:, :, 0] += 103.939
x[:, :, 1] += 116.779
x[:, :, 2] += 123.68
# 'BGR'->'RGB'
x = x[:, :, ::-1]
x = np.clip(x, 0, 255).astype('uint8')
return x
if K.image_dim_ordering() == 'th':
img_size = (3, img_width, img_height)
else:
img_size = (img_width, img_height, 3)
# this will contain our generated image
dream = Input(batch_shape=(1,) + img_size)
# build the VGG16 network with our placeholder
# the model will be loaded with pre-trained ImageNet weights
model = vgg16.VGG16(input_tensor=dream,
weights='imagenet', include_top=False)
print('Model loaded.')
# get the symbolic outputs of each "key" layer (we gave them unique names).
layer_dict = dict([(layer.name, layer) for layer in model.layers])
# continuity loss util function
def continuity_loss(x):
assert K.ndim(x) == 4
if K.image_dim_ordering() == 'th':
a = K.square(x[:, :, :img_width - 1, :img_height - 1] -
x[:, :, 1:, :img_height - 1])
b = K.square(x[:, :, :img_width - 1, :img_height - 1] -
x[:, :, :img_width - 1, 1:])
else:
a = K.square(x[:, :img_width - 1, :img_height-1, :] -
x[:, 1:, :img_height - 1, :])
b = K.square(x[:, :img_width - 1, :img_height-1, :] -
x[:, :img_width - 1, 1:, :])
return K.sum(K.pow(a + b, 1.25))
# define the loss
loss = K.variable(0.)
for layer_name in settings['features']:
# add the L2 norm of the features of a layer to the loss
assert layer_name in layer_dict.keys(), 'Layer ' + layer_name + ' not found in model.'
coeff = settings['features'][layer_name]
x = layer_dict[layer_name].output
shape = layer_dict[layer_name].output_shape
# we avoid border artifacts by only involving non-border pixels in the loss
if K.image_dim_ordering() == 'th':
loss -= coeff * K.sum(K.square(x[:, :, 2: shape[2] - 2, 2: shape[3] - 2])) / np.prod(shape[1:])
else:
loss -= coeff * K.sum(K.square(x[:, 2: shape[1] - 2, 2: shape[2] - 2, :])) / np.prod(shape[1:])
# add continuity loss (gives image local coherence, can result in an artful blur)
loss += settings['continuity'] * continuity_loss(dream) / np.prod(img_size)
# add image L2 norm to loss (prevents pixels from taking very high values, makes image darker)
loss += settings['dream_l2'] * K.sum(K.square(dream)) / np.prod(img_size)
# feel free to further modify the loss as you see fit, to achieve new effects...
# compute the gradients of the dream wrt the loss
grads = K.gradients(loss, dream)
outputs = [loss]
if type(grads) in {list, tuple}:
outputs += grads
else:
outputs.append(grads)
f_outputs = K.function([dream], outputs)
def eval_loss_and_grads(x):
x = x.reshape((1,) + img_size)
outs = f_outputs([x])
loss_value = outs[0]
if len(outs[1:]) == 1:
grad_values = outs[1].flatten().astype('float64')
else:
grad_values = np.array(outs[1:]).flatten().astype('float64')
return loss_value, grad_values
# this Evaluator class makes it possible
# to compute loss and gradients in one pass
# while retrieving them via two separate functions,
# "loss" and "grads". This is done because scipy.optimize
# requires separate functions for loss and gradients,
# but computing them separately would be inefficient.
class Evaluator(object):
def __init__(self):
self.loss_value = None
self.grad_values = None
def loss(self, x):
assert self.loss_value is None
loss_value, grad_values = eval_loss_and_grads(x)
self.loss_value = loss_value
self.grad_values = grad_values
return self.loss_value
def grads(self, x):
assert self.loss_value is not None
grad_values = np.copy(self.grad_values)
self.loss_value = None
self.grad_values = None
return grad_values
evaluator = Evaluator()
# run scipy-based optimization (L-BFGS) over the pixels of the generated image
# so as to minimize the loss
x = preprocess_image(base_image_path)
for i in range(5):
print('Start of iteration', i)
start_time = time.time()
# add a random jitter to the initial image. This will be reverted at decoding time
random_jitter = (settings['jitter'] * 2) * (np.random.random(img_size) - 0.5)
x += random_jitter
# run L-BFGS for at most 7 function evaluations
x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(),
fprime=evaluator.grads, maxfun=7)
print('Current loss value:', min_val)
# decode the dream and save it
x = x.reshape(img_size)
x -= random_jitter
img = deprocess_image(np.copy(x))
fname = result_prefix + '_at_iteration_%d.png' % i
imsave(fname, img)
end_time = time.time()
print('Image saved as', fname)
print('Iteration %d completed in %ds' % (i, end_time - start_time))
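The `Evaluator` caching pattern can be tried out on a toy objective. In this sketch `loss_and_grad` and `minimize` are hypothetical: `minimize` is a plain gradient-descent stand-in for `fmin_l_bfgs_b`, and it is assumed to call the loss function and then the gradient function once at each point.

```python
def loss_and_grad(x):
    """Toy objective (x - 3)^2 and its derivative, computed in one pass."""
    return (x - 3.0) ** 2, 2.0 * (x - 3.0)


class Evaluator(object):
    """Compute loss and gradient together, hand them out separately."""

    def __init__(self):
        self.loss_value = None
        self.grad_value = None
        self.n_evals = 0  # counts the joint evaluations actually performed

    def loss(self, x):
        assert self.loss_value is None
        self.loss_value, self.grad_value = loss_and_grad(x)
        self.n_evals += 1
        return self.loss_value

    def grads(self, x):
        assert self.loss_value is not None
        grad = self.grad_value
        self.loss_value = None  # clear the cache for the next point
        self.grad_value = None
        return grad


def minimize(loss_fn, grad_fn, x, steps=50, lr=0.1):
    """Stand-in optimizer: asks for the loss, then the gradient, at each point."""
    for _ in range(steps):
        loss_fn(x)
        x -= lr * grad_fn(x)
    return x


evaluator = Evaluator()
x_min = minimize(evaluator.loss, evaluator.grads, x=0.0)
print(x_min, evaluator.n_evals)
```

The assertions inside `loss` and `grads` fail fast if an optimizer calls them in any other order, which is the main caveat of this design.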
+470
View File
@@ -0,0 +1,470 @@
'''This example uses a convolutional stack followed by a recurrent stack
and a CTC logloss function to perform optical character recognition
of generated text images. I have no evidence of whether it actually
learns general shapes of text, or just is able to recognize all
the different fonts thrown at it...the purpose is more to demonstrate CTC
inside of Keras. Note that the font list may need to be updated
for the particular OS in use.
This starts off with 4 letter words. After 10 or so epochs, CTC
learns translational invariance, so longer words and groups of words
with spaces are gradually fed in. This gradual increase in difficulty
is handled using the TextImageGenerator class which is both a generator
class for test/train data and a Keras callback class. Every 10 epochs
the wordlist that the generator draws from increases in difficulty.
The table below shows normalized edit distance values. Theano uses
a slightly different CTC implementation, so some Theano-specific
hyperparameter tuning would be needed to get it to match Tensorflow.
Epoch | Norm. ED (TF) | Norm. ED (TH)
------|---------------|--------------
10    | 0.072         | 0.272
20    | 0.032         | 0.115
30    | 0.024         | 0.098
40    | 0.023         | 0.108
This requires cairo and editdistance packages:
pip install cairocffi
pip install editdistance
Due to the use of a dummy loss function, Theano requires the following flags:
on_unused_input='ignore'
Created by Mike Henry
https://github.com/mbhenry/
'''
import os
import itertools
import re
import datetime
import cairocffi as cairo
import editdistance
import numpy as np
from scipy import ndimage
import pylab
from keras import backend as K
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.layers import Input, Layer, Dense, Activation, Flatten
from keras.layers import Reshape, Lambda, merge, Permute, TimeDistributed
from keras.models import Model
from keras.layers.recurrent import GRU
from keras.optimizers import SGD
from keras.utils import np_utils
from keras.utils.data_utils import get_file
from keras.preprocessing import image
import keras.callbacks
OUTPUT_DIR = "image_ocr"
np.random.seed(55)
# this creates larger "blotches" of noise which look
# more realistic than just adding gaussian noise
# assumes greyscale with pixels ranging from 0 to 1
def speckle(img):
severity = np.random.uniform(0, 0.6)
blur = ndimage.gaussian_filter(np.random.randn(*img.shape) * severity, 1)
img_speck = (img + blur)
img_speck[img_speck > 1] = 1
img_speck[img_speck <= 0] = 0
return img_speck
# paints the string in a random location within the bounding box
# also uses a random font, a slight random rotation,
# and a random amount of speckle noise
def paint_text(text, w, h):
surface = cairo.ImageSurface(cairo.FORMAT_RGB24, w, h)
with cairo.Context(surface) as context:
context.set_source_rgb(1, 1, 1) # White
context.paint()
# this font list works in CentOS 7
fonts = ['Century Schoolbook', 'Courier', 'STIX', 'URW Chancery L', 'FreeMono']
context.select_font_face(np.random.choice(fonts), cairo.FONT_SLANT_NORMAL,
np.random.choice([cairo.FONT_WEIGHT_BOLD, cairo.FONT_WEIGHT_NORMAL]))
context.set_font_size(40)
box = context.text_extents(text)
if box[2] > w or box[3] > h:
raise IOError('Could not fit string into image. Max char count is too large for given image width.')
# teach the RNN translational invariance by
# fitting text box randomly on canvas, with some room to rotate
border_w_h = (10, 16)
max_shift_x = w - box[2] - border_w_h[0]
max_shift_y = h - box[3] - border_w_h[1]
top_left_x = np.random.randint(0, int(max_shift_x))
top_left_y = np.random.randint(0, int(max_shift_y))
context.move_to(top_left_x - int(box[0]), top_left_y - int(box[1]))
context.set_source_rgb(0, 0, 0)
context.show_text(text)
buf = surface.get_data()
a = np.frombuffer(buf, np.uint8)
a.shape = (h, w, 4)
a = a[:, :, 0] # grab single channel
a = a.astype(np.float32) / 255
a = np.expand_dims(a, 0)
a = speckle(a)
a = image.random_rotation(a, 3 * (w - top_left_x) / w + 1)
return a
def shuffle_mats_or_lists(matrix_list, stop_ind=None):
ret = []
assert all([len(i) == len(matrix_list[0]) for i in matrix_list])
len_val = len(matrix_list[0])
if stop_ind is None:
stop_ind = len_val
assert stop_ind <= len_val
a = list(range(stop_ind))  # a real list, so it can be shuffled in place on Python 3
np.random.shuffle(a)
a += list(range(stop_ind, len_val))
for mat in matrix_list:
if isinstance(mat, np.ndarray):
ret.append(mat[a])
elif isinstance(mat, list):
ret.append([mat[i] for i in a])
else:
raise TypeError('shuffle_mats_or_lists only supports '
'numpy.array and list objects')
return ret
def text_to_labels(text, num_classes):
ret = []
for char in text:
if char >= 'a' and char <= 'z':
ret.append(ord(char) - ord('a'))
elif char == ' ':
ret.append(26)
return ret
# only a-z and space; probably not too difficult
# to expand to uppercase and symbols
def is_valid_str(in_str):
search = re.compile(r'[^a-z\ ]').search
return not bool(search(in_str))
# Uses generator functions to supply train/test with
# data. Image renderings of text are created on the fly
# each time with random perturbations
class TextImageGenerator(keras.callbacks.Callback):
def __init__(self, monogram_file, bigram_file, minibatch_size,
img_w, img_h, downsample_width, val_split,
absolute_max_string_len=16):
self.minibatch_size = minibatch_size
self.img_w = img_w
self.img_h = img_h
self.monogram_file = monogram_file
self.bigram_file = bigram_file
self.downsample_width = downsample_width
self.val_split = val_split
self.blank_label = self.get_output_size() - 1
self.absolute_max_string_len = absolute_max_string_len
def get_output_size(self):
return 28
# num_words can be independent of the epoch size due to the use of generators
# as max_string_len grows, num_words can grow
def build_word_list(self, num_words, max_string_len=None, mono_fraction=0.5):
assert max_string_len <= self.absolute_max_string_len
assert num_words % self.minibatch_size == 0
assert (self.val_split * num_words) % self.minibatch_size == 0
self.num_words = num_words
self.string_list = []
self.max_string_len = max_string_len
self.Y_data = np.ones([self.num_words, self.absolute_max_string_len]) * -1
self.X_text = []
self.Y_len = [0] * self.num_words
# monogram file is sorted by frequency in English speech
with open(self.monogram_file, 'rt') as f:
for line in f:
if len(self.string_list) == int(self.num_words * mono_fraction):
break
word = line.rstrip()
if max_string_len == -1 or max_string_len is None or len(word) <= max_string_len:
self.string_list.append(word)
# bigram file contains common word pairings in English speech
with open(self.bigram_file, 'rt') as f:
lines = f.readlines()
for line in lines:
if len(self.string_list) == self.num_words:
break
columns = line.lower().split()
word = columns[0] + ' ' + columns[1]
if is_valid_str(word) and \
(max_string_len == -1 or max_string_len is None or len(word) <= max_string_len):
self.string_list.append(word)
if len(self.string_list) != self.num_words:
raise IOError('Could not pull enough words from supplied monogram and bigram files. ')
for i, word in enumerate(self.string_list):
self.Y_len[i] = len(word)
self.Y_data[i, 0:len(word)] = text_to_labels(word, self.get_output_size())
self.X_text.append(word)
self.Y_len = np.expand_dims(np.array(self.Y_len), 1)
self.cur_val_index = self.val_split
self.cur_train_index = 0
# each time an image is requested from train/val/test, a new random
# painting of the text is performed
def get_batch(self, index, size, train):
if K.image_dim_ordering() == 'th':
X_data = np.ones([size, 1, self.img_h, self.img_w])
else:
X_data = np.ones([size, self.img_h, self.img_w, 1])
labels = np.ones([size, self.absolute_max_string_len])
input_length = np.zeros([size, 1])
label_length = np.zeros([size, 1])
source_str = []
for i in range(0, size):
# Mix in some blank inputs. This seems to be important for
# achieving translational invariance
if train and i > size - 4:
if K.image_dim_ordering() == 'th':
X_data[i, 0, :, :] = paint_text('', self.img_w, self.img_h)
else:
X_data[i, :, :, 0] = paint_text('', self.img_w, self.img_h)
labels[i, 0] = self.blank_label
input_length[i] = self.downsample_width
label_length[i] = 1
source_str.append('')
else:
if K.image_dim_ordering() == 'th':
X_data[i, 0, :, :] = paint_text(self.X_text[index + i], self.img_w, self.img_h)
else:
X_data[i, :, :, 0] = paint_text(self.X_text[index + i], self.img_w, self.img_h)
labels[i, :] = self.Y_data[index + i]
input_length[i] = self.downsample_width
label_length[i] = self.Y_len[index + i]
source_str.append(self.X_text[index + i])
inputs = {'the_input': X_data,
'the_labels': labels,
'input_length': input_length,
'label_length': label_length,
'source_str': source_str # used for visualization only
}
outputs = {'ctc': np.zeros([size])} # dummy data for dummy loss function
return (inputs, outputs)
def next_train(self):
while 1:
ret = self.get_batch(self.cur_train_index, self.minibatch_size, train=True)
self.cur_train_index += self.minibatch_size
if self.cur_train_index >= self.val_split:
self.cur_train_index = self.cur_train_index % self.minibatch_size
(self.X_text, self.Y_data, self.Y_len) = shuffle_mats_or_lists(
[self.X_text, self.Y_data, self.Y_len], self.val_split)
yield ret
def next_val(self):
while 1:
ret = self.get_batch(self.cur_val_index, self.minibatch_size, train=False)
self.cur_val_index += self.minibatch_size
if self.cur_val_index >= self.num_words:
self.cur_val_index = self.val_split + self.cur_val_index % self.minibatch_size
yield ret
def on_train_begin(self, logs={}):
# translational invariance seems to be the hardest thing
# for the RNN to learn, so start with <= 4 letter words.
self.build_word_list(16000, 4, 1)
def on_epoch_begin(self, epoch, logs={}):
# After 10 epochs, translational invariance should be learned
# so start feeding longer words and eventually multiple words with spaces
if epoch == 10:
self.build_word_list(32000, 8, 1)
if epoch == 20:
self.build_word_list(32000, 8, 0.6)
if epoch == 30:
self.build_word_list(64000, 12, 0.5)
# the actual loss calc occurs here despite it not being
# an internal Keras loss function
def ctc_lambda_func(args):
    y_pred, labels, input_length, label_length = args
    # the 2 is critical here since the first couple outputs of the RNN
    # tend to be garbage:
    y_pred = y_pred[:, 2:, :]
    return K.ctc_batch_cost(labels, y_pred, input_length, label_length)
# For a real OCR application, this should be beam search with a dictionary
# and language model. For this example, best path is sufficient.
def decode_batch(test_func, word_batch):
    out = test_func([word_batch])[0]
    ret = []
    for j in range(out.shape[0]):
        out_best = list(np.argmax(out[j, 2:], 1))
        out_best = [k for k, g in itertools.groupby(out_best)]
        # 26 is space, 27 is CTC blank char
        outstr = ''
        for c in out_best:
            if c >= 0 and c < 26:
                outstr += chr(c + ord('a'))
            elif c == 26:
                outstr += ' '
        ret.append(outstr)
    return ret
class VizCallback(keras.callbacks.Callback):
    def __init__(self, test_func, text_img_gen, num_display_words=6):
        self.test_func = test_func
        self.output_dir = os.path.join(
            OUTPUT_DIR, datetime.datetime.now().strftime('%A, %d. %B %Y %I.%M%p'))
        self.text_img_gen = text_img_gen
        self.num_display_words = num_display_words
        os.makedirs(self.output_dir)
    def show_edit_distance(self, num):
        num_left = num
        mean_norm_ed = 0.0
        mean_ed = 0.0
        while num_left > 0:
            word_batch = next(self.text_img_gen)[0]
            num_proc = min(word_batch['the_input'].shape[0], num_left)
            decoded_res = decode_batch(self.test_func, word_batch['the_input'][0:num_proc])
            for j in range(0, num_proc):
                edit_dist = editdistance.eval(decoded_res[j], word_batch['source_str'][j])
                mean_ed += float(edit_dist)
                mean_norm_ed += float(edit_dist) / len(word_batch['source_str'][j])
            num_left -= num_proc
        mean_norm_ed = mean_norm_ed / num
        mean_ed = mean_ed / num
        print('\nOut of %d samples: Mean edit distance: %.3f Mean normalized edit distance: %0.3f'
              % (num, mean_ed, mean_norm_ed))
    def on_epoch_end(self, epoch, logs={}):
        self.model.save_weights(os.path.join(self.output_dir, 'weights%02d.h5' % epoch))
        self.show_edit_distance(256)
        word_batch = next(self.text_img_gen)[0]
        res = decode_batch(self.test_func, word_batch['the_input'][0:self.num_display_words])
        for i in range(self.num_display_words):
            pylab.subplot(self.num_display_words, 1, i + 1)
            if K.image_dim_ordering() == 'th':
                the_input = word_batch['the_input'][i, 0, :, :]
            else:
                the_input = word_batch['the_input'][i, :, :, 0]
            pylab.imshow(the_input, cmap='Greys_r')
            pylab.xlabel('Truth = \'%s\' Decoded = \'%s\'' % (word_batch['source_str'][i], res[i]))
        fig = pylab.gcf()
        fig.set_size_inches(10, 12)
        pylab.savefig(os.path.join(self.output_dir, 'e%02d.png' % epoch))
        pylab.close()
# Input Parameters
img_h = 64
img_w = 512
nb_epoch = 50
minibatch_size = 32
words_per_epoch = 16000
val_split = 0.2
val_words = int(words_per_epoch * (val_split))
# Network parameters
conv_num_filters = 16
filter_size = 3
pool_size_1 = 4
pool_size_2 = 2
time_dense_size = 32
rnn_size = 512
time_steps = img_w // (pool_size_1 * pool_size_2)
if K.image_dim_ordering() == 'th':
    input_shape = (1, img_h, img_w)
else:
    input_shape = (img_h, img_w, 1)
fdir = os.path.dirname(get_file('wordlists.tgz',
origin='http://www.isosemi.com/datasets/wordlists.tgz', untar=True))
img_gen = TextImageGenerator(monogram_file=os.path.join(fdir, 'wordlist_mono_clean.txt'),
bigram_file=os.path.join(fdir, 'wordlist_bi_clean.txt'),
minibatch_size=32,
img_w=img_w,
img_h=img_h,
downsample_width=img_w // (pool_size_1 * pool_size_2) - 2,
val_split=words_per_epoch - val_words)
act = 'relu'
input_data = Input(name='the_input', shape=input_shape, dtype='float32')
inner = Convolution2D(conv_num_filters, filter_size, filter_size, border_mode='same',
activation=act, name='conv1')(input_data)
inner = MaxPooling2D(pool_size=(pool_size_1, pool_size_1), name='max1')(inner)
inner = Convolution2D(conv_num_filters, filter_size, filter_size, border_mode='same',
activation=act, name='conv2')(inner)
inner = MaxPooling2D(pool_size=(pool_size_2, pool_size_2), name='max2')(inner)
conv_to_rnn_dims = ((img_h // (pool_size_1 * pool_size_2)) * conv_num_filters, img_w // (pool_size_1 * pool_size_2))
inner = Reshape(target_shape=conv_to_rnn_dims, name='reshape')(inner)
inner = Permute(dims=(2, 1), name='permute')(inner)
# cuts down input size going into RNN:
inner = TimeDistributed(Dense(time_dense_size, activation=act, name='dense1'))(inner)
# Two layers of bidirectional GRUs
# GRU seems to work as well, if not better than LSTM:
gru_1 = GRU(rnn_size, return_sequences=True, name='gru1')(inner)
gru_1b = GRU(rnn_size, return_sequences=True, go_backwards=True, name='gru1_b')(inner)
gru1_merged = merge([gru_1, gru_1b], mode='sum')
gru_2 = GRU(rnn_size, return_sequences=True, name='gru2')(gru1_merged)
gru_2b = GRU(rnn_size, return_sequences=True, go_backwards=True)(gru1_merged)
# transforms RNN output to character activations:
inner = TimeDistributed(Dense(img_gen.get_output_size(), name='dense2'))(merge([gru_2, gru_2b], mode='concat'))
y_pred = Activation('softmax', name='softmax')(inner)
Model(input=[input_data], output=y_pred).summary()
labels = Input(name='the_labels', shape=[img_gen.absolute_max_string_len], dtype='float32')
input_length = Input(name='input_length', shape=[1], dtype='int64')
label_length = Input(name='label_length', shape=[1], dtype='int64')
# Keras doesn't currently support loss funcs with extra parameters
# so CTC loss is implemented in a lambda layer
loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name="ctc")([y_pred, labels, input_length, label_length])
lr = 0.03
# clipnorm seems to speed up convergence
clipnorm = 5
sgd = SGD(lr=lr, decay=3e-7, momentum=0.9, nesterov=True, clipnorm=clipnorm)
model = Model(input=[input_data, labels, input_length, label_length], output=[loss_out])
# the loss calc occurs elsewhere, so use a dummy lambda func for the loss
model.compile(loss={'ctc': lambda y_true, y_pred: y_pred}, optimizer=sgd)
# captures output of softmax so we can decode the output during visualization
test_func = K.function([input_data], [y_pred])
viz_cb = VizCallback(test_func, img_gen.next_val())
model.fit_generator(generator=img_gen.next_train(), samples_per_epoch=(words_per_epoch - val_words),
nb_epoch=nb_epoch, validation_data=img_gen.next_val(), nb_val_samples=val_words,
callbacks=[viz_cb, img_gen])
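The `decode_batch` helper in this file does best-path (greedy) CTC decoding: take the argmax at every time step, collapse consecutive repeats, then drop the blank. Below is a minimal NumPy-only sketch of that idea on a hand-made matrix. The names `best_path_decode` and `one_hot_path` and the toy inputs are illustrative assumptions, not part of the commit; the alphabet (0-25 letters, 26 space, 27 blank) follows the comment in `decode_batch`. Unlike the example script, the sketch does not skip the first two time steps.

```python
import itertools
import numpy as np

BLANK = 27  # per the comment in decode_batch: 26 is space, 27 is the CTC blank


def best_path_decode(probs):
    """Greedy decode of one (time_steps, num_classes) score matrix."""
    best = np.argmax(probs, axis=1).tolist()
    # collapse runs of identical predictions into a single symbol
    collapsed = [k for k, _ in itertools.groupby(best)]
    chars = []
    for c in collapsed:
        if 0 <= c < 26:
            chars.append(chr(c + ord('a')))
        elif c == 26:
            chars.append(' ')
        # any other index (the blank) is dropped
    return ''.join(chars)


def one_hot_path(path, num_classes=28):
    """Hypothetical helper: build a near-one-hot matrix for a class path."""
    probs = np.full((len(path), num_classes), 0.01)
    probs[np.arange(len(path)), path] = 0.9
    return probs / probs.sum(axis=1, keepdims=True)


# a, a, blank, a, b, b: the blank keeps the two 'a' runs apart
toy = one_hot_path([0, 0, BLANK, 0, 1, 1])
decoded = best_path_decode(toy)
print(decoded)
```

The blank between the two runs of `a` is what lets CTC emit a doubled letter; without it the repeats would collapse into one.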
+47
View File
@@ -0,0 +1,47 @@
'''Train a Bidirectional LSTM on the IMDB sentiment classification task.
Output after 4 epochs on CPU: ~0.8146
Time per epoch on CPU (Core i7): ~150s.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Embedding, LSTM, Input, Bidirectional
from keras.datasets import imdb
max_features = 20000
maxlen = 100 # cut texts after this number of words (among top max_features most common words)
batch_size = 32
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
print("Pad sequences (samples x time)")
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)
y_train = np.array(y_train)
y_test = np.array(y_test)
model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen))
model.add(Bidirectional(LSTM(64)))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
# try using different optimizers and different optimizer configs
model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])
print('Train...')
model.fit(X_train, y_train,
batch_size=batch_size,
nb_epoch=4,
validation_data=[X_test, y_test])
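The `maxlen` comment says texts are cut after that many words. With the default arguments of `sequence.pad_sequences`, both padding and truncation happen at the front, so the last `maxlen` tokens of a long review are the ones kept. The sketch below mimics that default in plain NumPy; `pad_pre` is a hypothetical stand-in written for illustration, not the Keras implementation.

```python
import numpy as np


def pad_pre(sequences, maxlen, value=0):
    """Left-pad / left-truncate integer sequences to a fixed length."""
    out = np.full((len(sequences), maxlen), value, dtype='int32')
    for row, seq in enumerate(sequences):
        if len(seq) == 0:
            continue  # an empty sequence stays all padding
        trunc = seq[-maxlen:]            # keep the last maxlen tokens
        out[row, -len(trunc):] = trunc   # right-align, padding on the left
    return out


padded = pad_pre([[1, 2, 3], [4, 5, 6, 7, 8, 9]], maxlen=4)
print(padded)
```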
+78
View File
@@ -0,0 +1,78 @@
'''This example demonstrates the use of Convolution1D for text classification.
Gets to 0.89 test accuracy after 2 epochs.
90s/epoch on Intel i5 2.4Ghz CPU.
10s/epoch on Tesla K40 GPU.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import Convolution1D, GlobalMaxPooling1D
from keras.datasets import imdb
from keras import backend as K
# set parameters:
max_features = 5000
maxlen = 400
batch_size = 32
embedding_dims = 50
nb_filter = 250
filter_length = 3
hidden_dims = 250
nb_epoch = 2
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
print('Pad sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)
print('Build model...')
model = Sequential()
# we start off with an efficient embedding layer which maps
# our vocab indices into embedding_dims dimensions
model.add(Embedding(max_features,
embedding_dims,
input_length=maxlen,
dropout=0.2))
# we add a Convolution1D, which will learn nb_filter
# word group filters of size filter_length:
model.add(Convolution1D(nb_filter=nb_filter,
filter_length=filter_length,
border_mode='valid',
activation='relu',
subsample_length=1))
# we use max pooling:
model.add(GlobalMaxPooling1D())
# We add a vanilla hidden layer:
model.add(Dense(hidden_dims))
model.add(Dropout(0.2))
model.add(Activation('relu'))
# We project onto a single unit output layer, and squash it with a sigmoid:
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.fit(X_train, y_train,
batch_size=batch_size,
nb_epoch=nb_epoch,
validation_data=(X_test, y_test))
+77
View File
@@ -0,0 +1,77 @@
'''Train a recurrent convolutional network on the IMDB sentiment
classification task.
Gets to 0.8498 test accuracy after 2 epochs. 41s/epoch on K520 GPU.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import LSTM
from keras.layers import Convolution1D, MaxPooling1D
from keras.datasets import imdb
# Embedding
max_features = 20000
maxlen = 100
embedding_size = 128
# Convolution
filter_length = 5
nb_filter = 64
pool_length = 4
# LSTM
lstm_output_size = 70
# Training
batch_size = 30
nb_epoch = 2
'''
Note:
batch_size is highly sensitive.
Only 2 epochs are needed as the dataset is very small.
'''
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
print('Pad sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)
print('Build model...')
model = Sequential()
model.add(Embedding(max_features, embedding_size, input_length=maxlen))
model.add(Dropout(0.25))
model.add(Convolution1D(nb_filter=nb_filter,
filter_length=filter_length,
border_mode='valid',
activation='relu',
subsample_length=1))
model.add(MaxPooling1D(pool_length=pool_length))
model.add(LSTM(lstm_output_size))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
print('Train...')
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=nb_epoch,
validation_data=(X_test, y_test))
score, acc = model.evaluate(X_test, y_test, batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)
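For this model it helps to know how many time steps actually reach the LSTM. A small arithmetic sketch, assuming `border_mode='valid'` for both layers and a pooling stride equal to `pool_length` (the helper names are hypothetical):

```python
def conv1d_valid_length(length, filter_length, stride=1):
    """Output length of a 1D convolution without padding."""
    return (length - filter_length) // stride + 1


def pool1d_valid_length(length, pool_length, stride=None):
    """Output length of 1D max pooling without padding."""
    stride = pool_length if stride is None else stride
    return (length - pool_length) // stride + 1


maxlen = 100
filter_length = 5
pool_length = 4

after_conv = conv1d_valid_length(maxlen, filter_length)
after_pool = pool1d_valid_length(after_conv, pool_length)
print(after_conv, after_pool)
```

Under these assumptions the convolution shortens each review from 100 to 96 steps and the pooling layer to 24, so the LSTM unrolls over roughly a quarter of the original sequence length.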
+136
View File
@@ -0,0 +1,136 @@
'''This example demonstrates the use of fasttext for text classification
Based on Joulin et al's paper:
Bags of Tricks for Efficient Text Classification
https://arxiv.org/abs/1607.01759
Results on IMDB datasets with uni and bi-gram embeddings:
Uni-gram: 0.8813 test accuracy after 5 epochs. 8s/epoch on i7 cpu.
Bi-gram : 0.9056 test accuracy after 5 epochs. 2s/epoch on GTX 980M gpu.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Embedding
from keras.layers import GlobalAveragePooling1D
from keras.datasets import imdb
def create_ngram_set(input_list, ngram_value=2):
    """
    Extract a set of n-grams from a list of integers.
    >>> create_ngram_set([1, 4, 9, 4, 1, 4], ngram_value=2)
    {(4, 9), (4, 1), (1, 4), (9, 4)}
    >>> create_ngram_set([1, 4, 9, 4, 1, 4], ngram_value=3)
    {(1, 4, 9), (4, 9, 4), (9, 4, 1), (4, 1, 4)}
    """
    return set(zip(*[input_list[i:] for i in range(ngram_value)]))
def add_ngram(sequences, token_indice, ngram_range=2):
    """
    Augment the input list of lists (sequences) by appending n-gram values.
    Example: adding bi-gram
    >>> sequences = [[1, 3, 4, 5], [1, 3, 7, 9, 2]]
    >>> token_indice = {(1, 3): 1337, (9, 2): 42, (4, 5): 2017}
    >>> add_ngram(sequences, token_indice, ngram_range=2)
    [[1, 3, 4, 5, 1337, 2017], [1, 3, 7, 9, 2, 1337, 42]]
    Example: adding tri-gram
    >>> sequences = [[1, 3, 4, 5], [1, 3, 7, 9, 2]]
    >>> token_indice = {(1, 3): 1337, (9, 2): 42, (4, 5): 2017, (7, 9, 2): 2018}
    >>> add_ngram(sequences, token_indice, ngram_range=3)
    [[1, 3, 4, 5, 1337], [1, 3, 7, 9, 2, 1337, 2018]]
    """
    new_sequences = []
    for input_list in sequences:
        new_list = input_list[:]
        for i in range(len(new_list)-ngram_range+1):
            for ngram_value in range(2, ngram_range+1):
                ngram = tuple(new_list[i:i+ngram_value])
                if ngram in token_indice:
                    new_list.append(token_indice[ngram])
        new_sequences.append(new_list)
    return new_sequences
# Set parameters:
# ngram_range = 2 will add bi-grams features
ngram_range = 1
max_features = 20000
maxlen = 400
batch_size = 32
embedding_dims = 50
nb_epoch = 5
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
print('Average train sequence length: {}'.format(np.mean(list(map(len, X_train)), dtype=int)))
print('Average test sequence length: {}'.format(np.mean(list(map(len, X_test)), dtype=int)))
if ngram_range > 1:
    print('Adding {}-gram features'.format(ngram_range))
    # Create set of unique n-gram from the training set.
    ngram_set = set()
    for input_list in X_train:
        for i in range(2, ngram_range+1):
            set_of_ngram = create_ngram_set(input_list, ngram_value=i)
            ngram_set.update(set_of_ngram)
    # Dictionary mapping n-gram token to a unique integer.
    # Integer values are greater than max_features in order
    # to avoid collision with existing features.
    start_index = max_features + 1
    token_indice = {v: k+start_index for k, v in enumerate(ngram_set)}
    indice_token = {token_indice[k]: k for k in token_indice}
    # max_features is the highest integer that could be found in the dataset.
    max_features = np.max(list(indice_token.keys())) + 1
    # Augmenting X_train and X_test with n-grams features
    X_train = add_ngram(X_train, token_indice, ngram_range)
    X_test = add_ngram(X_test, token_indice, ngram_range)
    print('Average train sequence length: {}'.format(np.mean(list(map(len, X_train)), dtype=int)))
    print('Average test sequence length: {}'.format(np.mean(list(map(len, X_test)), dtype=int)))
print('Pad sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)
print('Build model...')
model = Sequential()
# we start off with an efficient embedding layer which maps
# our vocab indices into embedding_dims dimensions
model.add(Embedding(max_features,
embedding_dims,
input_length=maxlen))
# we add a GlobalAveragePooling1D, which will average the embeddings
# of all words in the document
model.add(GlobalAveragePooling1D())
# We project onto a single unit output layer, and squash it with a sigmoid:
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.fit(X_train, y_train,
batch_size=batch_size,
nb_epoch=nb_epoch,
validation_data=(X_test, y_test))
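The n-gram step in this script assigns every bi-gram seen in the training set a fresh integer id above the word vocabulary, then enlarges `max_features` so the `Embedding` layer has a row for each new id. A toy-sized sketch of that bookkeeping follows. The vocabulary size of 10 and the use of `sorted()` are assumptions made here for a reproducible illustration; the script itself enumerates the set directly, so its ids depend on set iteration order.

```python
def create_ngram_set(input_list, ngram_value=2):
    """Same one-liner as in the example: all n-grams of a list, as a set."""
    return set(zip(*[input_list[i:] for i in range(ngram_value)]))


train = [[1, 3, 4, 5], [1, 3, 7, 9, 2]]
max_features = 10  # toy vocabulary size (assumption)

ngram_set = set()
for seq in train:
    ngram_set.update(create_ngram_set(seq, ngram_value=2))

# ids start above the word vocabulary so they cannot collide with word ids
start_index = max_features + 1
token_indice = {ngram: i + start_index
                for i, ngram in enumerate(sorted(ngram_set))}

# the embedding must now cover the largest n-gram id as well
new_max_features = max(token_indice.values()) + 1
print(len(ngram_set), new_max_features)
```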
+32 -45
View File
@@ -1,48 +1,36 @@
from __future__ import absolute_import
'''Trains a LSTM on the IMDB sentiment classification task.
The dataset is actually too small for LSTM to be of any advantage
compared to simpler, much faster methods such as TF-IDF + LogReg.
Notes:
- RNNs are tricky. Choice of batch size is important,
choice of loss and optimizer is critical, etc.
Some configurations won't converge.
- LSTM loss decrease patterns during training can be quite different
from what you see with CNNs/MLPs/etc.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.preprocessing import sequence
from keras.optimizers import SGD, RMSprop, Adagrad
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM, GRU
from keras.layers import Dense, Dropout, Activation, Embedding
from keras.layers import LSTM, SimpleRNN, GRU
from keras.datasets import imdb
'''
Train a LSTM on the IMDB sentiment classification task.
max_features = 20000
maxlen = 80 # cut texts after this number of words (among top max_features most common words)
batch_size = 32
The dataset is actually too small for LSTM to be of any advantage
compared to simpler, much faster methods such as TF-IDF+LogReg.
Notes:
- RNNs are tricky. Choice of batch size is important,
choice of loss and optimizer is critical, etc.
Most configurations won't converge.
- LSTM loss decrease during training can be quite different
from what you see with CNNs/MLPs/etc. It's more or less a sigmoid
instead of an inverse exponential.
GPU command:
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python imdb_lstm.py
250s/epoch on GPU (GT 650M), vs. 400s/epoch on CPU (2.4Ghz Core i7).
'''
max_features=20000
maxlen = 100 # cut texts after this number of words (among top max_features most common words)
batch_size = 16
print("Loading data...")
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features, test_split=0.2)
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
print("Pad sequences (samples x time)")
print('Pad sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
@@ -50,21 +38,20 @@ print('X_test shape:', X_test.shape)
print('Build model...')
model = Sequential()
model.add(Embedding(max_features, 256))
model.add(LSTM(256, 128)) # try using a GRU instead, for fun
model.add(Dropout(0.5))
model.add(Dense(128, 1))
model.add(Embedding(max_features, 128, dropout=0.2))
model.add(LSTM(128, dropout_W=0.2, dropout_U=0.2)) # try using a GRU instead, for fun
model.add(Dense(1))
model.add(Activation('sigmoid'))
# try using different optimizers and different optimizer configs
model.compile(loss='binary_crossentropy', optimizer='adam', class_mode="binary")
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
print("Train...")
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=5, validation_split=0.1, show_accuracy=True)
score = model.evaluate(X_test, y_test, batch_size=batch_size)
print('Train...')
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=15,
validation_data=(X_test, y_test))
score, acc = model.evaluate(X_test, y_test,
batch_size=batch_size)
print('Test score:', score)
classes = model.predict_classes(X_test, batch_size=batch_size)
acc = np_utils.accuracy(classes, y_test)
print('Test accuracy:', acc)
-124
View File
@@ -1,124 +0,0 @@
from __future__ import absolute_import
from __future__ import print_function
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.normalization import BatchNormalization
from keras.layers.advanced_activations import PReLU
from keras.utils import np_utils, generic_utils
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler
'''
This demonstrates how to reach a score of 0.4890 (local validation)
on the Kaggle Otto challenge, with a deep net using Keras.
Compatible Python 2.7-3.4
Recommended to run on GPU:
Command: THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python kaggle_otto_nn.py
On EC2 g2.2xlarge instance: 19s/epoch. 6-7 minutes total training time.
Best validation score at epoch 21: 0.4881
Try it at home:
- with/without BatchNormalization (BatchNormalization helps!)
- with ReLU or with PReLU (PReLU helps!)
- with smaller layers, larger layers
- with more layers, fewer layers
- with different optimizers (SGD+momentum+decay is probably better than Adam!)
Get the data from Kaggle: https://www.kaggle.com/c/otto-group-product-classification-challenge/data
'''
np.random.seed(1337) # for reproducibility
def load_data(path, train=True):
df = pd.read_csv(path)
X = df.values.copy()
if train:
np.random.shuffle(X) # https://youtu.be/uyUXoap67N8
X, labels = X[:, 1:-1].astype(np.float32), X[:, -1]
return X, labels
else:
X, ids = X[:, 1:].astype(np.float32), X[:, 0].astype(str)
return X, ids
def preprocess_data(X, scaler=None):
if not scaler:
scaler = StandardScaler()
scaler.fit(X)
X = scaler.transform(X)
return X, scaler
def preprocess_labels(labels, encoder=None, categorical=True):
if not encoder:
encoder = LabelEncoder()
encoder.fit(labels)
y = encoder.transform(labels).astype(np.int32)
if categorical:
y = np_utils.to_categorical(y)
return y, encoder
def make_submission(y_prob, ids, encoder, fname):
with open(fname, 'w') as f:
f.write('id,')
f.write(','.join([str(i) for i in encoder.classes_]))
f.write('\n')
for i, probs in zip(ids, y_prob):
probas = ','.join([i] + [str(p) for p in probs.tolist()])
f.write(probas)
f.write('\n')
print("Wrote submission to file {}.".format(fname))
print("Loading data...")
X, labels = load_data('train.csv', train=True)
X, scaler = preprocess_data(X)
y, encoder = preprocess_labels(labels)
X_test, ids = load_data('test.csv', train=False)
X_test, _ = preprocess_data(X_test, scaler)
nb_classes = y.shape[1]
print(nb_classes, 'classes')
dims = X.shape[1]
print(dims, 'dims')
print("Building model...")
model = Sequential()
model.add(Dense(dims, 512, init='glorot_uniform'))
model.add(PReLU((512,)))
model.add(BatchNormalization((512,)))
model.add(Dropout(0.5))
model.add(Dense(512, 512, init='glorot_uniform'))
model.add(PReLU((512,)))
model.add(BatchNormalization((512,)))
model.add(Dropout(0.5))
model.add(Dense(512, 512, init='glorot_uniform'))
model.add(PReLU((512,)))
model.add(BatchNormalization((512,)))
model.add(Dropout(0.5))
model.add(Dense(512, nb_classes, init='glorot_uniform'))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer="adam")
print("Training model...")
model.fit(X, y, nb_epoch=20, batch_size=128, validation_split=0.15)
print("Generating submission...")
proba = model.predict_proba(X_test)
make_submission(proba, ids, encoder, fname='keras-otto.csv')
+83
View File
@@ -0,0 +1,83 @@
'''Compare LSTM implementations on the IMDB sentiment classification task.
consume_less='cpu' preprocesses input to the LSTM which typically results in
faster computations at the expense of increased peak memory usage as the
preprocessed input must be kept in memory.
consume_less='mem' does away with the preprocessing, meaning that it might take
a little longer, but should require less peak memory.
consume_less='gpu' concatenates the input, output and forget gate's weights
into one, large matrix, resulting in faster computation time as the GPU can
utilize more cores, at the expense of reduced regularization because the same
dropout is shared across the gates.
Note that the relative performance of the different `consume_less` modes
can vary depending on your device, your model and the size of your data.
'''
import time
import numpy as np
import matplotlib.pyplot as plt
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Embedding, Dense, LSTM
from keras.datasets import imdb
max_features = 20000
max_length = 80
embedding_dim = 256
batch_size = 128
epochs = 10
modes = ['cpu', 'mem', 'gpu']
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
X_train = sequence.pad_sequences(X_train, max_length)
X_test = sequence.pad_sequences(X_test, max_length)
# Compile and train different models while measuring performance.
results = []
for mode in modes:
    print('Testing mode: consume_less="{}"'.format(mode))
    model = Sequential()
    model.add(Embedding(max_features, embedding_dim, input_length=max_length, dropout=0.2))
    model.add(LSTM(embedding_dim, dropout_W=0.2, dropout_U=0.2, consume_less=mode))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    start_time = time.time()
    history = model.fit(X_train, y_train,
                        batch_size=batch_size,
                        nb_epoch=epochs,
                        validation_data=(X_test, y_test))
    average_time_per_epoch = (time.time() - start_time) / epochs
    results.append((history, average_time_per_epoch))
# Compare models' accuracy, loss and elapsed time per epoch.
plt.style.use('ggplot')
ax1 = plt.subplot2grid((2, 2), (0, 0))
ax1.set_title('Accuracy')
ax1.set_ylabel('Validation Accuracy')
ax1.set_xlabel('Epochs')
ax2 = plt.subplot2grid((2, 2), (1, 0))
ax2.set_title('Loss')
ax2.set_ylabel('Validation Loss')
ax2.set_xlabel('Epochs')
ax3 = plt.subplot2grid((2, 2), (0, 1), rowspan=2)
ax3.set_title('Time')
ax3.set_ylabel('Seconds')
for mode, result in zip(modes, results):
    ax1.plot(result[0].epoch, result[0].history['val_acc'], label=mode)
    ax2.plot(result[0].epoch, result[0].history['val_loss'], label=mode)
ax1.legend()
ax2.legend()
ax3.bar(np.arange(len(results)), [x[1] for x in results],
tick_label=modes, align='center')
plt.tight_layout()
plt.show()
+104
View File
@@ -0,0 +1,104 @@
'''Example script to generate text from Nietzsche's writings.
At least 20 epochs are required before the generated text
starts sounding coherent.
It is recommended to run this script on GPU, as recurrent
networks are quite computationally intensive.
If you try this script on new data, make sure your corpus
has at least ~100k characters. ~1M is better.
'''
from __future__ import print_function
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.layers import LSTM
from keras.optimizers import RMSprop
from keras.utils.data_utils import get_file
import numpy as np
import random
import sys
path = get_file('nietzsche.txt', origin="https://s3.amazonaws.com/text-datasets/nietzsche.txt")
text = open(path).read().lower()
print('corpus length:', len(text))
chars = sorted(list(set(text)))
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))
# cut the text in semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))
print('Vectorization...')
X = np.zeros((len(sentences), maxlen, len(chars)), dtype=np.bool)
y = np.zeros((len(sentences), len(chars)), dtype=np.bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        X[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1
# build the model: a single LSTM
print('Build model...')
model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))
optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)
# train the model, output generated text after each iteration
for iteration in range(1, 60):
    print()
    print('-' * 50)
    print('Iteration', iteration)
    model.fit(X, y, batch_size=128, nb_epoch=1)
    start_index = random.randint(0, len(text) - maxlen - 1)
    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print()
        print('----- diversity:', diversity)
        generated = ''
        sentence = text[start_index: start_index + maxlen]
        generated += sentence
        print('----- Generating with seed: "' + sentence + '"')
        sys.stdout.write(generated)
        for i in range(400):
            x = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x[0, t, char_indices[char]] = 1.
            preds = model.predict(x, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]
            generated += next_char
            sentence = sentence[1:] + next_char
            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()
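The `diversity` values passed to `sample` act as a softmax temperature: the predicted distribution is raised to the power `1 / temperature` and renormalised before a character is drawn. The sketch below isolates that rescaling step without the random draw. `apply_temperature` and the three-class `base` distribution are illustrative assumptions, not part of the script.

```python
import numpy as np


def apply_temperature(preds, temperature=1.0):
    """Rescale a probability vector the same way sample() does before drawing."""
    preds = np.asarray(preds).astype('float64')
    logits = np.log(preds) / temperature
    exp_logits = np.exp(logits)
    return exp_logits / np.sum(exp_logits)


base = [0.5, 0.3, 0.2]  # toy next-character distribution (assumption)
for t in [0.2, 0.5, 1.0, 1.2]:
    print(t, np.round(apply_temperature(base, t), 3))
```

Temperatures below 1 concentrate mass on the most likely character, which tends to give repetitive but safe text; temperatures above 1 flatten the distribution and make unusual characters more likely.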
+314
View File
@@ -0,0 +1,314 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Train an Auxiliary Classifier Generative Adversarial Network (ACGAN) on the
MNIST dataset. See https://arxiv.org/abs/1610.09585 for more details.
You should start to see reasonable images after ~5 epochs, and good images
by ~15 epochs. You should use a GPU, as the convolution-heavy operations are
very slow on the CPU. Prefer the TensorFlow backend if you plan on iterating, as
the compilation time can be a blocker using Theano.
Timings:
Hardware | Backend | Time / Epoch
-------------------------------------------
CPU | TF | 3 hrs
Titan X (maxwell) | TF | 4 min
Titan X (maxwell) | TH | 7 min
Consult https://github.com/lukedeo/keras-acgan for more information and
example output
"""
from __future__ import print_function
from collections import defaultdict
import cPickle as pickle
from PIL import Image
from six.moves import range
import keras.backend as K
from keras.datasets import mnist
from keras.layers import Input, Dense, Reshape, Flatten, Embedding, merge, Dropout
from keras.layers.advanced_activations import LeakyReLU
from keras.layers.convolutional import UpSampling2D, Convolution2D
from keras.models import Sequential, Model
from keras.optimizers import Adam
from keras.utils.generic_utils import Progbar
import numpy as np
np.random.seed(1337)
K.set_image_dim_ordering('th')
def build_generator(latent_size):
# we will map a pair of (z, L), where z is a latent vector and L is a
# label drawn from P_c, to image space (..., 1, 28, 28)
cnn = Sequential()
cnn.add(Dense(1024, input_dim=latent_size, activation='relu'))
cnn.add(Dense(128 * 7 * 7, activation='relu'))
cnn.add(Reshape((128, 7, 7)))
# upsample to (..., 14, 14)
cnn.add(UpSampling2D(size=(2, 2)))
cnn.add(Convolution2D(256, 5, 5, border_mode='same',
activation='relu', init='glorot_normal'))
# upsample to (..., 28, 28)
cnn.add(UpSampling2D(size=(2, 2)))
cnn.add(Convolution2D(128, 5, 5, border_mode='same',
activation='relu', init='glorot_normal'))
# take a channel axis reduction
cnn.add(Convolution2D(1, 2, 2, border_mode='same',
activation='tanh', init='glorot_normal'))
# this is the z space commonly referred to in GAN papers
latent = Input(shape=(latent_size, ))
# this will be our label
image_class = Input(shape=(1,), dtype='int32')
# 10 classes in MNIST
cls = Flatten()(Embedding(10, latent_size,
init='glorot_normal')(image_class))
# hadamard product between z-space and a class conditional embedding
h = merge([latent, cls], mode='mul')
fake_image = cnn(h)
return Model(input=[latent, image_class], output=fake_image)
def build_discriminator():
# build a relatively standard conv net, with LeakyReLUs as suggested in
# the reference paper
cnn = Sequential()
cnn.add(Convolution2D(32, 3, 3, border_mode='same', subsample=(2, 2),
input_shape=(1, 28, 28)))
cnn.add(LeakyReLU())
cnn.add(Dropout(0.3))
cnn.add(Convolution2D(64, 3, 3, border_mode='same', subsample=(1, 1)))
cnn.add(LeakyReLU())
cnn.add(Dropout(0.3))
cnn.add(Convolution2D(128, 3, 3, border_mode='same', subsample=(2, 2)))
cnn.add(LeakyReLU())
cnn.add(Dropout(0.3))
cnn.add(Convolution2D(256, 3, 3, border_mode='same', subsample=(1, 1)))
cnn.add(LeakyReLU())
cnn.add(Dropout(0.3))
cnn.add(Flatten())
image = Input(shape=(1, 28, 28))
features = cnn(image)
# first output (name=generation) is whether or not the discriminator
# thinks the image that is being shown is fake, and the second output
# (name=auxiliary) is the class that the discriminator thinks the image
# belongs to.
fake = Dense(1, activation='sigmoid', name='generation')(features)
aux = Dense(10, activation='softmax', name='auxiliary')(features)
return Model(input=image, output=[fake, aux])
if __name__ == '__main__':
# batch and latent size taken from the paper
nb_epochs = 50
batch_size = 100
latent_size = 100
# Adam parameters suggested in https://arxiv.org/abs/1511.06434
adam_lr = 0.0002
adam_beta_1 = 0.5
# build the discriminator
discriminator = build_discriminator()
discriminator.compile(
optimizer=Adam(lr=adam_lr, beta_1=adam_beta_1),
loss=['binary_crossentropy', 'sparse_categorical_crossentropy']
)
# build the generator
generator = build_generator(latent_size)
generator.compile(optimizer=Adam(lr=adam_lr, beta_1=adam_beta_1),
loss='binary_crossentropy')
latent = Input(shape=(latent_size, ))
image_class = Input(shape=(1,), dtype='int32')
# get a fake image
fake = generator([latent, image_class])
# we only want to be able to train generation for the combined model
discriminator.trainable = False
fake, aux = discriminator(fake)
combined = Model(input=[latent, image_class], output=[fake, aux])
combined.compile(
optimizer=Adam(lr=adam_lr, beta_1=adam_beta_1),
loss=['binary_crossentropy', 'sparse_categorical_crossentropy']
)
discriminator.trainable = True
# get our mnist data, and force it to be of shape (..., 1, 28, 28) with
# range [-1, 1]
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = (X_train.astype(np.float32) - 127.5) / 127.5
X_train = np.expand_dims(X_train, axis=1)
X_test = (X_test.astype(np.float32) - 127.5) / 127.5
X_test = np.expand_dims(X_test, axis=1)
nb_train, nb_test = X_train.shape[0], X_test.shape[0]
train_history = defaultdict(list)
test_history = defaultdict(list)
for epoch in range(nb_epochs):
print('Epoch {} of {}'.format(epoch + 1, nb_epochs))
nb_batches = int(X_train.shape[0] / batch_size)
progress_bar = Progbar(target=nb_batches)
epoch_gen_loss = []
epoch_disc_loss = []
for index in range(nb_batches):
progress_bar.update(index)
# generate a new batch of noise
noise = np.random.uniform(-1, 1, (batch_size, latent_size))
# get a batch of real images
image_batch = X_train[index * batch_size:(index + 1) * batch_size]
label_batch = y_train[index * batch_size:(index + 1) * batch_size]
# sample some labels from p_c
sampled_labels = np.random.randint(0, 10, batch_size)
# generate a batch of fake images, using the generated labels as a
# conditioner. We reshape the sampled labels to be
# (batch_size, 1) so that we can feed them into the embedding
# layer as a length one sequence
generated_images = generator.predict(
[noise, sampled_labels.reshape((-1, 1))], verbose=0)
X = np.concatenate((image_batch, generated_images))
y = np.array([1] * batch_size + [0] * batch_size)
aux_y = np.concatenate((label_batch, sampled_labels), axis=0)
# see if the discriminator can figure itself out...
epoch_disc_loss.append(discriminator.train_on_batch(X, [y, aux_y]))
# make new noise. we generate 2 * batch size here such that we have
# the generator optimize over an identical number of images as the
# discriminator
noise = np.random.uniform(-1, 1, (2 * batch_size, latent_size))
sampled_labels = np.random.randint(0, 10, 2 * batch_size)
# we want to fix the discriminator and let the generator train to
# trick it
discriminator.trainable = False
# For the generator, we want all the {fake, not-fake} labels to say
# not-fake
trick = np.ones(2 * batch_size)
epoch_gen_loss.append(combined.train_on_batch(
[noise, sampled_labels.reshape((-1, 1))], [trick, sampled_labels]))
discriminator.trainable = True
print('\nTesting for epoch {}:'.format(epoch + 1))
# evaluate the testing loss here
# generate a new batch of noise
noise = np.random.uniform(-1, 1, (nb_test, latent_size))
# sample some labels from p_c and generate images from them
sampled_labels = np.random.randint(0, 10, nb_test)
generated_images = generator.predict(
[noise, sampled_labels.reshape((-1, 1))], verbose=False)
X = np.concatenate((X_test, generated_images))
y = np.array([1] * nb_test + [0] * nb_test)
aux_y = np.concatenate((y_test, sampled_labels), axis=0)
# see if the discriminator can figure itself out...
discriminator_test_loss = discriminator.evaluate(
X, [y, aux_y], verbose=False)
discriminator_train_loss = np.mean(np.array(epoch_disc_loss), axis=0)
# make new noise
noise = np.random.uniform(-1, 1, (2 * nb_test, latent_size))
sampled_labels = np.random.randint(0, 10, 2 * nb_test)
trick = np.ones(2 * nb_test)
generator_test_loss = combined.evaluate(
[noise, sampled_labels.reshape((-1, 1))],
[trick, sampled_labels], verbose=False)
generator_train_loss = np.mean(np.array(epoch_gen_loss), axis=0)
# generate an epoch report on performance
train_history['generator'].append(generator_train_loss)
train_history['discriminator'].append(discriminator_train_loss)
test_history['generator'].append(generator_test_loss)
test_history['discriminator'].append(discriminator_test_loss)
print('{0:<22s} | {1:4s} | {2:15s} | {3:5s}'.format(
'component', *discriminator.metrics_names))
print('-' * 65)
ROW_FMT = '{0:<22s} | {1:<4.2f} | {2:<15.2f} | {3:<5.2f}'
print(ROW_FMT.format('generator (train)',
*train_history['generator'][-1]))
print(ROW_FMT.format('generator (test)',
*test_history['generator'][-1]))
print(ROW_FMT.format('discriminator (train)',
*train_history['discriminator'][-1]))
print(ROW_FMT.format('discriminator (test)',
*test_history['discriminator'][-1]))
# save weights every epoch
generator.save_weights(
'params_generator_epoch_{0:03d}.hdf5'.format(epoch), True)
discriminator.save_weights(
'params_discriminator_epoch_{0:03d}.hdf5'.format(epoch), True)
# generate some digits to display
noise = np.random.uniform(-1, 1, (100, latent_size))
sampled_labels = np.array([
[i] * 10 for i in range(10)
]).reshape(-1, 1)
# get a batch to display
generated_images = generator.predict(
[noise, sampled_labels], verbose=0)
# arrange them into a grid
img = (np.concatenate([r.reshape(-1, 28)
for r in np.split(generated_images, 10)
], axis=-1) * 127.5 + 127.5).astype(np.uint8)
Image.fromarray(img).save(
'plot_epoch_{0:03d}_generated.png'.format(epoch))
pickle.dump({'train': train_history, 'test': test_history},
open('acgan-history.pkl', 'wb'))
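The grid-building step near the end of the ACGAN script can be hard to follow in one expression. Below is a minimal NumPy sketch of the same idea, assuming generator output of shape (100, 1, 28, 28) with values in [-1, 1]. The helper names `to_uint8` and `make_grid` and the constant-valued `fake_images` are illustrative stand-ins, not part of the original example.

```python
import numpy as np


def to_uint8(images):
    """Map float images in [-1, 1] to uint8 pixel values in [0, 255]."""
    return (images * 127.5 + 127.5).astype(np.uint8)


def make_grid(images, nb_columns=10):
    """Tile (N, 1, 28, 28) images into one 2D array, one chunk per column."""
    # np.split yields nb_columns chunks of consecutive images; reshaping a
    # chunk to (-1, 28) stacks its images vertically.
    columns = [chunk.reshape(-1, 28) for chunk in np.split(images, nb_columns)]
    # Joining along the last axis places the chunks side by side.
    return to_uint8(np.concatenate(columns, axis=-1))


# Hypothetical stand-in for generator output: image k is filled with one value.
values = np.linspace(-1.0, 1.0, 100).astype(np.float32)
fake_images = np.ones((100, 1, 28, 28), dtype=np.float32)
fake_images = fake_images * values[:, None, None, None]

grid = make_grid(fake_images)
print(grid.shape, grid.dtype)
```

With 100 images the grid is 280 x 280 pixels, and the tile at row `r`, column `c` is image `10 * c + r`. Because the script samples labels as ten consecutive copies of each digit, each column of the saved PNG shows a single class.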
+82
View File
@@ -0,0 +1,82 @@
'''Trains a simple convnet on the MNIST dataset.
Gets to 99.25% test accuracy after 12 epochs
(there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils
from keras import backend as K
batch_size = 128
nb_classes = 10
nb_epoch = 12
# input image dimensions
img_rows, img_cols = 28, 28
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
pool_size = (2, 2)
# convolution kernel size
kernel_size = (3, 3)
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
if K.image_dim_ordering() == 'th':
X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
input_shape = (1, img_rows, img_cols)
else:
X_train = X_train.reshape(X_train.shape[0], img_rows, img_cols, 1)
X_test = X_test.reshape(X_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
model = Sequential()
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
border_mode='valid',
input_shape=input_shape))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adadelta',
metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
verbose=1, validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
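As a rough check on the convnet above, the feature-map sizes can be worked out by hand. The sketch below assumes stride-1 'valid' convolutions for both layers (the second layer relies on the library default) and non-overlapping pooling. The helpers `conv_valid` and `pool` are illustrative names, not Keras functions.

```python
def conv_valid(size, kernel):
    """Output length of a stride-1 'valid' convolution along one axis."""
    return size - kernel + 1


def pool(size, window):
    """Output length of non-overlapping max pooling along one axis."""
    return size // window


img_rows, img_cols = 28, 28
nb_filters = 32
kernel_size = (3, 3)
pool_size = (2, 2)

rows, cols = img_rows, img_cols
for _ in range(2):  # two stacked 3x3 'valid' convolutions
    rows = conv_valid(rows, kernel_size[0])
    cols = conv_valid(cols, kernel_size[1])
rows, cols = pool(rows, pool_size[0]), pool(cols, pool_size[1])

# Size of the vector produced by Flatten(), and the weights in Dense(128).
flattened = nb_filters * rows * cols
dense_params = flattened * 128 + 128
print(rows, cols, flattened, dense_params)
```

Under these assumptions the spatial size goes 28 -> 26 -> 24 -> 12, so `Flatten()` yields 4608 features and the first Dense layer holds most of the model's weights.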
+87
View File
@@ -0,0 +1,87 @@
"""This is an example of using Hierarchical RNN (HRNN) to classify MNIST digits.
HRNNs can learn across multiple levels of temporal hierarchy over a complex sequence.
Usually, the first recurrent layer of an HRNN encodes a sentence (e.g. of word vectors)
into a sentence vector. The second recurrent layer then encodes a sequence of
such vectors (encoded by the first layer) into a document vector. This
document vector is considered to preserve both the word-level and
sentence-level structure of the context.
# References
- [A Hierarchical Neural Autoencoder for Paragraphs and Documents](https://web.stanford.edu/~jurafsky/pubs/P15-1107.pdf)
Encodes paragraphs and documents with HRNN.
Results have shown that HRNN outperforms standard
RNNs and may play some role in more sophisticated generation tasks like
summarization or question answering.
- [Hierarchical recurrent neural network for skeleton based action recognition](http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7298714)
Achieved state-of-the-art results on skeleton based action recognition with 3 levels
of bidirectional HRNN combined with fully connected layers.
In the MNIST example below, the first LSTM layer encodes every
column of pixels of shape (28, 1) into a column vector of shape (128,). The second LSTM
layer then encodes these 28 column vectors, of shape (28, 128), into an image vector
representing the whole image. A final Dense layer is added for prediction.
After 5 epochs: train acc: 0.9858, val acc: 0.9864
"""
from __future__ import print_function
from keras.datasets import mnist
from keras.models import Sequential, Model
from keras.layers import Input, Dense, TimeDistributed
from keras.layers import LSTM
from keras.utils import np_utils
# Training parameters.
batch_size = 32
nb_classes = 10
nb_epochs = 5
# Embedding dimensions.
row_hidden = 128
col_hidden = 128
# The data, shuffled and split between train and test sets.
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# Reshapes data to 4D for Hierarchical RNN.
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# Converts class vectors to binary class matrices.
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
row, col, pixel = X_train.shape[1:]
# 4D input.
x = Input(shape=(row, col, pixel))
# Encodes a row of pixels using TimeDistributed Wrapper.
encoded_rows = TimeDistributed(LSTM(output_dim=row_hidden))(x)
# Encodes columns of encoded rows.
encoded_columns = LSTM(col_hidden)(encoded_rows)
# Final predictions and model.
prediction = Dense(nb_classes, activation='softmax')(encoded_columns)
model = Model(input=x, output=prediction)
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# Training.
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epochs,
verbose=1, validation_data=(X_test, Y_test))
# Evaluation.
scores = model.evaluate(X_test, Y_test, verbose=0)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])
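To make the two-level structure concrete, here is a small sketch that counts the weights in each stage of the hierarchical model. It assumes the standard LSTM parameterisation (four gates, each with input weights, recurrent weights and a bias). The helpers `lstm_params` and `dense_params` are illustrative, not Keras functions.

```python
def lstm_params(input_dim, units):
    """Weights in a standard LSTM: 4 gates with input, recurrent and bias terms."""
    return 4 * (input_dim * units + units * units + units)


def dense_params(input_dim, units):
    """Weights in a fully connected layer with a bias."""
    return input_dim * units + units


row, col, pixel = 28, 28, 1
row_hidden = col_hidden = 128
nb_classes = 10

# TimeDistributed applies one shared LSTM to each of the 28 slices, so the
# first stage is counted once: (28, 28, 1) -> (28, 128).
row_encoder = lstm_params(pixel, row_hidden)
# The second LSTM reads the 28 encoded vectors: (28, 128) -> (128,).
col_encoder = lstm_params(row_hidden, col_hidden)
# The classifier maps the image vector to class scores: (128,) -> (10,).
classifier = dense_params(col_hidden, nb_classes)

total = row_encoder + col_encoder + classifier
print(row_encoder, col_encoder, classifier, total)
```

Because the first-level LSTM is shared across slices, it costs only 66,560 weights even though it runs 28 times per image.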
+70
View File
@@ -0,0 +1,70 @@
'''This is a reproduction of the IRNN experiment
with pixel-by-pixel sequential MNIST in
"A Simple Way to Initialize Recurrent Networks of Rectified Linear Units"
by Quoc V. Le, Navdeep Jaitly, Geoffrey E. Hinton
arXiv:1504.00941v2 [cs.NE] 7 Apr 2015
http://arxiv.org/pdf/1504.00941v2.pdf
Optimizer is replaced with RMSprop which yields more stable and steady
improvement.
Reaches 0.93 train/test accuracy after 900 epochs
(which roughly corresponds to 1687500 steps in the original paper.)
'''
from __future__ import print_function
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers import SimpleRNN
from keras.initializations import normal, identity
from keras.optimizers import RMSprop
from keras.utils import np_utils
batch_size = 32
nb_classes = 10
nb_epochs = 200
hidden_units = 100
learning_rate = 1e-6
clip_norm = 1.0
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(X_train.shape[0], -1, 1)
X_test = X_test.reshape(X_test.shape[0], -1, 1)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
print('Evaluate IRNN...')
model = Sequential()
model.add(SimpleRNN(output_dim=hidden_units,
init=lambda shape, name: normal(shape, scale=0.001, name=name),
inner_init=lambda shape, name: identity(shape, scale=1.0, name=name),
activation='relu',
input_shape=X_train.shape[1:]))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
rmsprop = RMSprop(lr=learning_rate)
model.compile(loss='categorical_crossentropy',
optimizer=rmsprop,
metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epochs,
verbose=1, validation_data=(X_test, Y_test))
scores = model.evaluate(X_test, Y_test, verbose=0)
print('IRNN test score:', scores[0])
print('IRNN test accuracy:', scores[1])
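The key idea of the IRNN is the initialization: identity recurrent weights combined with ReLU let the hidden state pass through a step unchanged when no input arrives. The NumPy sketch below illustrates that property for a single step. The function `irnn_step` and the toy shapes are assumptions made for illustration and do not reproduce the Keras `SimpleRNN` internals.

```python
import numpy as np


def irnn_step(x_t, h_prev, W, U, b):
    """One step of a ReLU recurrent layer: h_t = relu(x_t W + h_prev U + b)."""
    return np.maximum(0.0, x_t.dot(W) + h_prev.dot(U) + b)


rng = np.random.RandomState(1337)
input_dim, hidden_units = 1, 100

# Small Gaussian input weights, identity recurrent weights, zero bias.
W = rng.normal(0.0, 0.001, size=(input_dim, hidden_units))
U = np.eye(hidden_units)
b = np.zeros(hidden_units)

# A non-negative hidden state (as produced by a previous ReLU) and zero input.
h = np.abs(rng.normal(size=(4, hidden_units)))
x_zero = np.zeros((4, input_dim))

h_next = irnn_step(x_zero, h, W, U, b)
print(np.allclose(h_next, h))

# The docstring's step count: 900 epochs over 60000 samples in batches of 32.
steps = 900 * 60000 // 32
print(steps)
```

The second print reproduces the 1687500 steps quoted in the docstring, which is how the epoch count relates to the step count in the paper.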
+60
View File
@@ -0,0 +1,60 @@
'''Trains a simple deep NN on the MNIST dataset.
Gets to 98.40% test accuracy after 20 epochs
(there is *a lot* of margin for parameter tuning).
2 seconds per epoch on a K520 GPU.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD, Adam, RMSprop
from keras.utils import np_utils
batch_size = 128
nb_classes = 10
nb_epoch = 20
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
model = Sequential()
model.add(Dense(512, input_shape=(784,)))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(10))
model.add(Activation('softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',
optimizer=RMSprop(),
metrics=['accuracy'])
history = model.fit(X_train, Y_train,
batch_size=batch_size, nb_epoch=nb_epoch,
verbose=1, validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
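The preprocessing in this example (flatten, scale to [0, 1], one-hot encode) can be sketched in plain NumPy. The functions `flatten_and_scale` and `to_one_hot` below are illustrative stand-ins: they mimic what the script does with `reshape`, `/= 255` and `np_utils.to_categorical`, but are not the Keras implementation.

```python
import numpy as np


def to_one_hot(labels, nb_classes):
    """Turn integer labels of shape (N,) into an (N, nb_classes) 0/1 matrix."""
    labels = np.asarray(labels, dtype=int)
    one_hot = np.zeros((labels.shape[0], nb_classes), dtype=np.float32)
    one_hot[np.arange(labels.shape[0]), labels] = 1.0
    return one_hot


def flatten_and_scale(images):
    """Flatten (N, 28, 28) uint8 images to (N, 784) floats in [0, 1]."""
    flat = images.reshape(images.shape[0], -1).astype('float32')
    return flat / 255


# Hypothetical stand-in for two MNIST samples: one all-white, one all-black.
images = np.full((2, 28, 28), 255, dtype=np.uint8)
images[1] = 0
labels = np.array([3, 7])

X = flatten_and_scale(images)
Y = to_one_hot(labels, 10)
print(X.shape, Y.shape)
```

Reshaping with `-1` rather than a hard-coded 60000 or 10000 keeps the helper usable on subsets of the data.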
+384
View File
@@ -0,0 +1,384 @@
'''This is an implementation of Net2Net experiment with MNIST in
'Net2Net: Accelerating Learning via Knowledge Transfer'
by Tianqi Chen, Ian Goodfellow, and Jonathon Shlens
arXiv:1511.05641v4 [cs.LG] 23 Apr 2016
http://arxiv.org/abs/1511.05641
Notes
- What:
+ Net2Net is a group of methods to transfer knowledge from a teacher neural
net to a student net, so that the student net can be trained faster than
from scratch.
+ The paper discussed two specific methods of Net2Net, i.e. Net2WiderNet
and Net2DeeperNet.
+ Net2WiderNet replaces a model with an equivalent wider model that has
more units in each hidden layer.
+ Net2DeeperNet replaces a model with an equivalent deeper model.
+ Both are based on the idea of 'function-preserving transformations of
neural nets'.
- Why:
+ Enable fast exploration of multiple neural nets in experimentation and
design process, by creating a series of wider and deeper models with
transferable knowledge.
+ Enable 'lifelong learning system' by gradually adjusting model complexity
to data availability, and reusing transferable knowledge.
Experiments
- Teacher model: a basic CNN model trained on MNIST for 3 epochs.
- Net2WiderNet experiment:
+ Student model has a wider Conv2D layer and a wider FC layer.
+ Comparison of 'random-padding' vs 'net2wider' weight initialization.
+ With both methods, student model should immediately perform as well as
teacher model, but 'net2wider' is slightly better.
- Net2DeeperNet experiment:
+ Student model has an extra Conv2D layer and an extra FC layer.
+ Comparison of 'random-init' vs 'net2deeper' weight initialization.
+ Starting performance of 'net2deeper' is better than 'random-init'.
- Hyper-parameters:
+ SGD with momentum=0.9 is used for training teacher and student models.
+ Learning rate adjustment: it's suggested to reduce learning rate
to 1/10 for student model.
+ Addition of noise in 'net2wider' is used to break weight symmetry
and thus enable full capacity of student models. It is optional
when a Dropout layer is used.
Results
- Tested with 'Theano' backend and 'th' image_dim_ordering.
- Running on GPU GeForce GTX 980M
- Performance Comparisons - validation loss values during first 3 epochs:
(1) teacher_model: 0.075 0.041 0.041
(2) wider_random_pad: 0.036 0.034 0.032
(3) wider_net2wider: 0.032 0.030 0.030
(4) deeper_random_init: 0.061 0.043 0.041
(5) deeper_net2deeper: 0.032 0.031 0.029
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337)
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dense, Flatten
from keras.optimizers import SGD
from keras.utils import np_utils
from keras.datasets import mnist
input_shape = (1, 28, 28) # image shape
nb_class = 10 # number of classes
# load and pre-process data
def preprocess_input(x):
return x.reshape((-1, ) + input_shape) / 255.
def preprocess_output(y):
return np_utils.to_categorical(y)
(train_x, train_y), (validation_x, validation_y) = mnist.load_data()
train_x, validation_x = map(preprocess_input, [train_x, validation_x])
train_y, validation_y = map(preprocess_output, [train_y, validation_y])
print('Loading MNIST data...')
print('train_x shape:', train_x.shape, 'train_y shape:', train_y.shape)
print('validation_x shape:', validation_x.shape,
'validation_y shape:', validation_y.shape)
# knowledge transfer algorithms
def wider2net_conv2d(teacher_w1, teacher_b1, teacher_w2, new_width, init):
'''Get initial weights for a wider conv2d layer with a bigger nb_filter,
by 'random-padding' or 'net2wider'.
# Arguments
teacher_w1: `weight` of conv2d layer to become wider,
of shape (nb_filter1, nb_channel1, kh1, kw1)
teacher_b1: `bias` of conv2d layer to become wider,
of shape (nb_filter1, )
teacher_w2: `weight` of next connected conv2d layer,
of shape (nb_filter2, nb_channel2, kh2, kw2)
new_width: new `nb_filter` for the wider conv2d layer
init: initialization algorithm for new weights,
either 'random-pad' or 'net2wider'
'''
assert teacher_w1.shape[0] == teacher_w2.shape[1], (
'successive layers from teacher model should have compatible shapes')
assert teacher_w1.shape[0] == teacher_b1.shape[0], (
'weight and bias from same layer should have compatible shapes')
assert new_width > teacher_w1.shape[0], (
'new width (nb_filter) should be bigger than the existing one')
n = new_width - teacher_w1.shape[0]
if init == 'random-pad':
new_w1 = np.random.normal(0, 0.1, size=(n, ) + teacher_w1.shape[1:])
new_b1 = np.ones(n) * 0.1
new_w2 = np.random.normal(0, 0.1, size=(
teacher_w2.shape[0], n) + teacher_w2.shape[2:])
elif init == 'net2wider':
index = np.random.randint(teacher_w1.shape[0], size=n)
factors = np.bincount(index)[index] + 1.
new_w1 = teacher_w1[index, :, :, :]
new_b1 = teacher_b1[index]
new_w2 = teacher_w2[:, index, :, :] / factors.reshape((1, -1, 1, 1))
else:
raise ValueError('Unsupported weight initializer: %s' % init)
student_w1 = np.concatenate((teacher_w1, new_w1), axis=0)
if init == 'random-pad':
student_w2 = np.concatenate((teacher_w2, new_w2), axis=1)
elif init == 'net2wider':
# add small noise to break symmetry, so that student model will have
# full capacity later
noise = np.random.normal(0, 5e-2 * new_w2.std(), size=new_w2.shape)
student_w2 = np.concatenate((teacher_w2, new_w2 + noise), axis=1)
student_w2[:, index, :, :] = new_w2
student_b1 = np.concatenate((teacher_b1, new_b1), axis=0)
return student_w1, student_b1, student_w2
def wider2net_fc(teacher_w1, teacher_b1, teacher_w2, new_width, init):
'''Get initial weights for a wider fully connected (dense) layer
with a bigger nout, by 'random-padding' or 'net2wider'.
# Arguments
teacher_w1: `weight` of fc layer to become wider,
of shape (nin1, nout1)
teacher_b1: `bias` of fc layer to become wider,
of shape (nout1, )
teacher_w2: `weight` of next connected fc layer,
of shape (nin2, nout2)
new_width: new `nout` for the wider fc layer
init: initialization algorithm for new weights,
either 'random-pad' or 'net2wider'
'''
assert teacher_w1.shape[1] == teacher_w2.shape[0], (
'successive layers from teacher model should have compatible shapes')
assert teacher_w1.shape[1] == teacher_b1.shape[0], (
'weight and bias from same layer should have compatible shapes')
assert new_width > teacher_w1.shape[1], (
'new width (nout) should be bigger than the existing one')
n = new_width - teacher_w1.shape[1]
if init == 'random-pad':
new_w1 = np.random.normal(0, 0.1, size=(teacher_w1.shape[0], n))
new_b1 = np.ones(n) * 0.1
new_w2 = np.random.normal(0, 0.1, size=(n, teacher_w2.shape[1]))
elif init == 'net2wider':
index = np.random.randint(teacher_w1.shape[1], size=n)
factors = np.bincount(index)[index] + 1.
new_w1 = teacher_w1[:, index]
new_b1 = teacher_b1[index]
new_w2 = teacher_w2[index, :] / factors[:, np.newaxis]
else:
raise ValueError('Unsupported weight initializer: %s' % init)
student_w1 = np.concatenate((teacher_w1, new_w1), axis=1)
if init == 'random-pad':
student_w2 = np.concatenate((teacher_w2, new_w2), axis=0)
elif init == 'net2wider':
# add small noise to break symmetry, so that student model will have
# full capacity later
noise = np.random.normal(0, 5e-2 * new_w2.std(), size=new_w2.shape)
student_w2 = np.concatenate((teacher_w2, new_w2 + noise), axis=0)
student_w2[index, :] = new_w2
student_b1 = np.concatenate((teacher_b1, new_b1), axis=0)
return student_w1, student_b1, student_w2
def deeper2net_conv2d(teacher_w):
'''Get initial weights for a deeper conv2d layer by 'net2deeper'.
# Arguments
teacher_w: `weight` of previous conv2d layer,
of shape (nb_filter, nb_channel, kh, kw)
'''
nb_filter, nb_channel, kh, kw = teacher_w.shape
student_w = np.zeros((nb_filter, nb_filter, kh, kw))
for i in range(nb_filter):
student_w[i, i, (kh - 1) // 2, (kw - 1) // 2] = 1. # integer index of the kernel centre
student_b = np.zeros(nb_filter)
return student_w, student_b
def copy_weights(teacher_model, student_model, layer_names):
'''Copy weights from teacher_model to student_model,
for layers with names listed in layer_names
'''
for name in layer_names:
weights = teacher_model.get_layer(name=name).get_weights()
student_model.get_layer(name=name).set_weights(weights)
# methods to construct teacher_model and student_models
def make_teacher_model(train_data, validation_data, nb_epoch=3):
'''Train a simple CNN as teacher model.
'''
model = Sequential()
model.add(Conv2D(64, 3, 3, input_shape=input_shape,
border_mode='same', name='conv1'))
model.add(MaxPooling2D(name='pool1'))
model.add(Conv2D(64, 3, 3, border_mode='same', name='conv2'))
model.add(MaxPooling2D(name='pool2'))
model.add(Flatten(name='flatten'))
model.add(Dense(64, activation='relu', name='fc1'))
model.add(Dense(nb_class, activation='softmax', name='fc2'))
model.compile(loss='categorical_crossentropy',
optimizer=SGD(lr=0.01, momentum=0.9),
metrics=['accuracy'])
train_x, train_y = train_data
history = model.fit(train_x, train_y, nb_epoch=nb_epoch,
validation_data=validation_data)
return model, history
def make_wider_student_model(teacher_model, train_data,
validation_data, init, nb_epoch=3):
'''Train a wider student model based on teacher_model,
with either 'random-pad' (baseline) or 'net2wider'
'''
new_conv1_width = 128
new_fc1_width = 128
model = Sequential()
# a wider conv1 compared to teacher_model
model.add(Conv2D(new_conv1_width, 3, 3, input_shape=input_shape,
border_mode='same', name='conv1'))
model.add(MaxPooling2D(name='pool1'))
model.add(Conv2D(64, 3, 3, border_mode='same', name='conv2'))
model.add(MaxPooling2D(name='pool2'))
model.add(Flatten(name='flatten'))
# a wider fc1 compared to teacher model
model.add(Dense(new_fc1_width, activation='relu', name='fc1'))
model.add(Dense(nb_class, activation='softmax', name='fc2'))
# The weights for other layers need to be copied from teacher_model
# to student_model, except for widened layers
# and their immediate downstreams, which will be initialized separately.
# For this example there are no other layers that need to be copied.
w_conv1, b_conv1 = teacher_model.get_layer('conv1').get_weights()
w_conv2, b_conv2 = teacher_model.get_layer('conv2').get_weights()
new_w_conv1, new_b_conv1, new_w_conv2 = wider2net_conv2d(
w_conv1, b_conv1, w_conv2, new_conv1_width, init)
model.get_layer('conv1').set_weights([new_w_conv1, new_b_conv1])
model.get_layer('conv2').set_weights([new_w_conv2, b_conv2])
w_fc1, b_fc1 = teacher_model.get_layer('fc1').get_weights()
w_fc2, b_fc2 = teacher_model.get_layer('fc2').get_weights()
new_w_fc1, new_b_fc1, new_w_fc2 = wider2net_fc(
w_fc1, b_fc1, w_fc2, new_fc1_width, init)
model.get_layer('fc1').set_weights([new_w_fc1, new_b_fc1])
model.get_layer('fc2').set_weights([new_w_fc2, b_fc2])
model.compile(loss='categorical_crossentropy',
optimizer=SGD(lr=0.001, momentum=0.9),
metrics=['accuracy'])
train_x, train_y = train_data
history = model.fit(train_x, train_y, nb_epoch=nb_epoch,
validation_data=validation_data)
return model, history
def make_deeper_student_model(teacher_model, train_data,
validation_data, init, nb_epoch=3):
'''Train a deeper student model based on teacher_model,
with either 'random-init' (baseline) or 'net2deeper'
'''
model = Sequential()
model.add(Conv2D(64, 3, 3, input_shape=input_shape,
border_mode='same', name='conv1'))
model.add(MaxPooling2D(name='pool1'))
model.add(Conv2D(64, 3, 3, border_mode='same', name='conv2'))
# add another conv2d layer to make original conv2 deeper
if init == 'net2deeper':
prev_w, _ = model.get_layer('conv2').get_weights()
new_weights = deeper2net_conv2d(prev_w)
model.add(Conv2D(64, 3, 3, border_mode='same',
name='conv2-deeper', weights=new_weights))
elif init == 'random-init':
model.add(Conv2D(64, 3, 3, border_mode='same', name='conv2-deeper'))
else:
raise ValueError('Unsupported weight initializer: %s' % init)
model.add(MaxPooling2D(name='pool2'))
model.add(Flatten(name='flatten'))
model.add(Dense(64, activation='relu', name='fc1'))
# add another fc layer to make original fc1 deeper
if init == 'net2deeper':
# net2deeper for fc layer with relu, is just an identity initializer
model.add(Dense(64, init='identity',
activation='relu', name='fc1-deeper'))
elif init == 'random-init':
model.add(Dense(64, activation='relu', name='fc1-deeper'))
else:
raise ValueError('Unsupported weight initializer: %s' % init)
model.add(Dense(nb_class, activation='softmax', name='fc2'))
# copy weights for other layers
copy_weights(teacher_model, model, layer_names=[
'conv1', 'conv2', 'fc1', 'fc2'])
model.compile(loss='categorical_crossentropy',
optimizer=SGD(lr=0.001, momentum=0.9),
metrics=['accuracy'])
train_x, train_y = train_data
history = model.fit(train_x, train_y, nb_epoch=nb_epoch,
validation_data=validation_data)
return model, history
# experiments setup
def net2wider_experiment():
'''Benchmark performances of
(1) a teacher model,
(2) a wider student model with `random_pad` initializer
(3) a wider student model with `Net2WiderNet` initializer
'''
train_data = (train_x, train_y)
validation_data = (validation_x, validation_y)
print('\nExperiment of Net2WiderNet ...')
print('\nbuilding teacher model ...')
teacher_model, _ = make_teacher_model(train_data,
validation_data,
nb_epoch=3)
print('\nbuilding wider student model by random padding ...')
make_wider_student_model(teacher_model, train_data,
validation_data, 'random-pad',
nb_epoch=3)
print('\nbuilding wider student model by net2wider ...')
make_wider_student_model(teacher_model, train_data,
validation_data, 'net2wider',
nb_epoch=3)
def net2deeper_experiment():
'''Benchmark performances of
(1) a teacher model,
(2) a deeper student model with `random_init` initializer
(3) a deeper student model with `Net2DeeperNet` initializer
'''
train_data = (train_x, train_y)
validation_data = (validation_x, validation_y)
print('\nExperiment of Net2DeeperNet ...')
print('\nbuilding teacher model ...')
teacher_model, _ = make_teacher_model(train_data,
validation_data,
nb_epoch=3)
print('\nbuilding deeper student model by random init ...')
make_deeper_student_model(teacher_model, train_data,
validation_data, 'random-init',
nb_epoch=3)
print('\nbuilding deeper student model by net2deeper ...')
make_deeper_student_model(teacher_model, train_data,
validation_data, 'net2deeper',
nb_epoch=3)
# run the experiments
net2wider_experiment()
net2deeper_experiment()
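The claim that Net2WiderNet is function-preserving can be checked numerically on a toy network. The sketch below follows the same replication and rescaling steps as `wider2net_fc` but leaves out the symmetry-breaking noise so that the outputs match exactly. The names `widen_fc` and `forward` and the toy layer sizes are assumptions made for illustration.

```python
import numpy as np


def relu(x):
    return np.maximum(0.0, x)


def forward(x, w1, b1, w2, b2):
    """Two dense layers with a ReLU in between."""
    return relu(x.dot(w1) + b1).dot(w2) + b2


def widen_fc(w1, b1, w2, new_width, rng):
    """Net2WiderNet for a dense layer, without the symmetry-breaking noise."""
    n = new_width - w1.shape[1]
    # Pick existing hidden units to replicate.
    index = rng.randint(w1.shape[1], size=n)
    # A unit copied k times ends up with k + 1 instances in total.
    factors = np.bincount(index)[index] + 1.
    new_w2 = w2[index, :] / factors[:, np.newaxis]
    student_w1 = np.concatenate((w1, w1[:, index]), axis=1)
    student_b1 = np.concatenate((b1, b1[index]), axis=0)
    student_w2 = np.concatenate((w2, new_w2), axis=0)
    # The original units share their outgoing weight with their copies.
    student_w2[index, :] = new_w2
    return student_w1, student_b1, student_w2


rng = np.random.RandomState(1337)
w1, b1 = rng.normal(size=(5, 4)), rng.normal(size=4)
w2, b2 = rng.normal(size=(4, 3)), rng.normal(size=3)
x = rng.normal(size=(8, 5))

s_w1, s_b1, s_w2 = widen_fc(w1, b1, w2, new_width=7, rng=rng)
teacher_out = forward(x, w1, b1, w2, b2)
student_out = forward(x, s_w1, s_b1, s_w2, b2)
print(np.allclose(teacher_out, student_out))
```

Dividing the outgoing weights by the number of instances is what keeps the sum into the next layer unchanged. With the noise from the original script added back, the match would only be approximate.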
-54
View File
@@ -1,54 +0,0 @@
from __future__ import absolute_import
from __future__ import print_function
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.regularizers import l2, l1
from keras.constraints import maxnorm
from keras.optimizers import SGD, Adam, RMSprop
from keras.utils import np_utils
import numpy as np
'''
Train a simple deep NN on the MNIST dataset.
'''
batch_size = 64
nb_classes = 10
nb_epoch = 20
np.random.seed(1337) # for reproducibility
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000,784)
X_test = X_test.reshape(10000,784)
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
X_train /= 255
X_test /= 255
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
model = Sequential()
model.add(Dense(784, 128))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(128, 128))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(128, 10))
model.add(Activation('softmax'))
rms = RMSprop()
model.compile(loss='categorical_crossentropy', optimizer=rms)
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch, show_accuracy=True, verbose=2, validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test, show_accuracy=True, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
+130
View File
@@ -0,0 +1,130 @@
'''Train a Siamese MLP on pairs of digits from the MNIST dataset.
It follows Hadsell-et-al.'06 [1] by computing the Euclidean distance on the
output of the shared network and by optimizing the contrastive loss (see paper
for more details).
[1] "Dimensionality Reduction by Learning an Invariant Mapping"
http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf
Gets to 99.5% test accuracy after 20 epochs.
3 seconds per epoch on a Titan X GPU
'''
from __future__ import absolute_import
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
import random
from keras.datasets import mnist
from keras.models import Sequential, Model
from keras.layers import Dense, Dropout, Input, Lambda
from keras.optimizers import SGD, RMSprop
from keras import backend as K
def euclidean_distance(vects):
x, y = vects
return K.sqrt(K.sum(K.square(x - y), axis=1, keepdims=True))
def eucl_dist_output_shape(shapes):
shape1, shape2 = shapes
return (shape1[0], 1)
def contrastive_loss(y_true, y_pred):
'''Contrastive loss from Hadsell-et-al.'06
http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf
'''
margin = 1
return K.mean(y_true * K.square(y_pred) + (1 - y_true) * K.square(K.maximum(margin - y_pred, 0)))
def create_pairs(x, digit_indices):
'''Positive and negative pair creation.
Alternates between positive and negative pairs.
'''
pairs = []
labels = []
n = min([len(digit_indices[d]) for d in range(10)]) - 1
for d in range(10):
for i in range(n):
z1, z2 = digit_indices[d][i], digit_indices[d][i+1]
pairs += [[x[z1], x[z2]]]
inc = random.randrange(1, 10)
dn = (d + inc) % 10
z1, z2 = digit_indices[d][i], digit_indices[dn][i]
pairs += [[x[z1], x[z2]]]
labels += [1, 0]
return np.array(pairs), np.array(labels)
def create_base_network(input_dim):
'''Base network to be shared (eq. to feature extraction).
'''
seq = Sequential()
seq.add(Dense(128, input_shape=(input_dim,), activation='relu'))
seq.add(Dropout(0.1))
seq.add(Dense(128, activation='relu'))
seq.add(Dropout(0.1))
seq.add(Dense(128, activation='relu'))
return seq
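def compute_accuracy(predictions, labels):
    '''Score predictions with a fixed threshold on distances.
    Pairs with a distance below 0.5 are predicted "similar"; the returned value
    is the fraction of those pairs whose label is 1 (truly similar).
    '''
    return labels[predictions.ravel() < 0.5].mean()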
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
input_dim = 784
nb_epoch = 20
# create training+test positive and negative pairs
digit_indices = [np.where(y_train == i)[0] for i in range(10)]
tr_pairs, tr_y = create_pairs(X_train, digit_indices)
digit_indices = [np.where(y_test == i)[0] for i in range(10)]
te_pairs, te_y = create_pairs(X_test, digit_indices)
# network definition
base_network = create_base_network(input_dim)
input_a = Input(shape=(input_dim,))
input_b = Input(shape=(input_dim,))
# because we re-use the same instance `base_network`,
# the weights of the network
# will be shared across the two branches
processed_a = base_network(input_a)
processed_b = base_network(input_b)
distance = Lambda(euclidean_distance, output_shape=eucl_dist_output_shape)([processed_a, processed_b])
model = Model(input=[input_a, input_b], output=distance)
# train
rms = RMSprop()
model.compile(loss=contrastive_loss, optimizer=rms)
model.fit([tr_pairs[:, 0], tr_pairs[:, 1]], tr_y,
validation_data=([te_pairs[:, 0], te_pairs[:, 1]], te_y),
batch_size=128,
nb_epoch=nb_epoch)
# compute final accuracy on training and test sets
pred = model.predict([tr_pairs[:, 0], tr_pairs[:, 1]])
tr_acc = compute_accuracy(pred, tr_y)
pred = model.predict([te_pairs[:, 0], te_pairs[:, 1]])
te_acc = compute_accuracy(pred, te_y)
print('* Accuracy on training set: %0.2f%%' % (100 * tr_acc))
print('* Accuracy on test set: %0.2f%%' % (100 * te_acc))
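The contrastive loss used by the Siamese example can be checked by hand on a few pairs. The following is a minimal, dependency-free sketch of the same formula as `contrastive_loss` above, assuming label 1 marks a similar pair and label 0 a dissimilar one (as produced by `create_pairs`); the sample labels and distances are made up for illustration.

```python
def contrastive_loss(labels, distances, margin=1.0):
    """Mean contrastive loss over pairs.

    labels: 1 for a similar pair, 0 for a dissimilar pair.
    distances: predicted Euclidean distance for each pair.
    """
    total = 0.0
    for y, d in zip(labels, distances):
        # similar pairs are penalized by their squared distance
        similar_term = y * d ** 2
        # dissimilar pairs are penalized only while closer than the margin
        dissimilar_term = (1 - y) * max(margin - d, 0.0) ** 2
        total += similar_term + dissimilar_term
    return total / len(labels)


# hypothetical toy values, chosen so the arithmetic is exact
labels = [1, 1, 0, 0]
distances = [0.0, 0.5, 0.25, 2.0]
print(contrastive_loss(labels, distances))  # 0.203125
```

The last pair contributes nothing: it is dissimilar and already farther apart than the margin, so only pairs that violate the margin drive the gradient.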
+94
View File
@@ -0,0 +1,94 @@
'''Example of how to use sklearn wrapper
Builds simple CNN models on MNIST and uses sklearn's GridSearchCV to find best model
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.grid_search import GridSearchCV
nb_classes = 10
# input image dimensions
img_rows, img_cols = 28, 28
# load training data and do basic data normalization
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
# convert class vectors to binary class matrices
y_train = np_utils.to_categorical(y_train, nb_classes)
y_test = np_utils.to_categorical(y_test, nb_classes)
def make_model(dense_layer_sizes, nb_filters, nb_conv, nb_pool):
'''Creates model comprised of 2 convolutional layers followed by dense layers
dense_layer_sizes: List of layer sizes. This list has one number for each layer
nb_filters: Number of convolutional filters in each convolutional layer
nb_conv: Convolutional kernel size
nb_pool: Size of pooling area for max pooling
'''
model = Sequential()
model.add(Convolution2D(nb_filters, nb_conv, nb_conv,
border_mode='valid',
input_shape=(1, img_rows, img_cols)))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, nb_conv, nb_conv))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.25))
model.add(Flatten())
for layer_size in dense_layer_sizes:
model.add(Dense(layer_size))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adadelta',
metrics=['accuracy'])
return model
dense_size_candidates = [[32], [64], [32, 32], [64, 64]]
my_classifier = KerasClassifier(make_model, batch_size=32)
validator = GridSearchCV(my_classifier,
param_grid={'dense_layer_sizes': dense_size_candidates,
# nb_epoch is avail for tuning even when not
# an argument to model building function
'nb_epoch': [3, 6],
'nb_filters': [8],
'nb_conv': [3],
'nb_pool': [2]},
scoring='log_loss',
n_jobs=1)
validator.fit(X_train, y_train)
print('The parameters of the best model are: ')
print(validator.best_params_)
# validator.best_estimator_ returns sklearn-wrapped version of best model.
# validator.best_estimator_.model returns the (unwrapped) keras model
best_model = validator.best_estimator_.model
metric_names = best_model.metrics_names
metric_values = best_model.evaluate(X_test, y_test)
for metric, value in zip(metric_names, metric_values):
print(metric, ': ', value)
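To see how many models the grid search above has to consider, the parameter grid can be expanded by hand. This is only a conceptual sketch of the enumeration step, not scikit-learn's implementation; `expand_grid` is a hypothetical helper, and the number of fits per candidate (one per cross-validation fold) is not modeled here.

```python
from itertools import product

# same grid as passed to GridSearchCV in the script above
param_grid = {
    'dense_layer_sizes': [[32], [64], [32, 32], [64, 64]],
    'nb_epoch': [3, 6],
    'nb_filters': [8],
    'nb_conv': [3],
    'nb_pool': [2],
}


def expand_grid(grid):
    """Return one dict per combination of parameter values."""
    keys = sorted(grid)
    value_lists = [grid[k] for k in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


candidates = expand_grid(param_grid)
print(len(candidates))  # 8
print(candidates[0])
```

Only `dense_layer_sizes` and `nb_epoch` have more than one value, so the grid contains 4 x 2 = 8 candidate configurations.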
+167
View File
@@ -0,0 +1,167 @@
'''Trains a stacked what-where autoencoder built on residual blocks on the
MNIST dataset. It exemplifies two influential methods that have been developed
in the past few years.
The first is the idea of properly "unpooling." During any max pool, the
exact location (the "where") of the maximal value in a pooled receptive field
is lost; however, it can be very useful in the overall reconstruction of an
input image. Therefore, if the "where" is handed from the encoder
to the corresponding decoder layer, features being decoded can be "placed" in
the right location, allowing for reconstructions of much higher fidelity.
References:
[1]
"Visualizing and Understanding Convolutional Networks"
Matthew D Zeiler, Rob Fergus
https://arxiv.org/abs/1311.2901v3
[2]
"Stacked What-Where Auto-encoders"
Junbo Zhao, Michael Mathieu, Ross Goroshin, Yann LeCun
https://arxiv.org/abs/1506.02351v8
The second idea exploited here is that of residual learning. Residual blocks
ease the training process by allowing skip connections that give the network
the ability to be as linear (or non-linear) as the data sees fit. This allows
for much deeper networks to be easily trained. The residual element seems to
be advantageous in the context of this example as it allows a nice symmetry
between the encoder and decoder. Normally, in the decoder, the final
projection to the space where the image is reconstructed is linear, however
this does not have to be the case for a residual block as the degree to which
its output is linear or non-linear is determined by the data it is fed.
However, in order to cap the reconstruction in this example, a hard sigmoid is
applied as the final activation because we know the MNIST digits are mapped to [0,1].
References:
[3]
"Deep Residual Learning for Image Recognition"
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
https://arxiv.org/abs/1512.03385v1
[4]
"Identity Mappings in Deep Residual Networks"
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
https://arxiv.org/abs/1603.05027v3
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.datasets import mnist
from keras.models import Model
from keras.layers import Activation, merge
from keras.layers import UpSampling2D, Convolution2D, MaxPooling2D
from keras.layers import Input, BatchNormalization
import matplotlib.pyplot as plt
import keras.backend as K
def convresblock(x, nfeats=8, ksize=3, nskipped=2):
''' The proposed residual block from [4]'''
y0 = Convolution2D(nfeats, ksize, ksize, border_mode='same')(x)
y = y0
for i in range(nskipped):
y = BatchNormalization(mode=0, axis=1)(y)
y = Activation('relu')(y)
y = Convolution2D(nfeats, ksize, ksize, border_mode='same')(y)
return merge([y0, y], mode='sum')
def getwhere(x):
''' Calculate the "where" mask that contains switches indicating which
index contained the max value when MaxPooling2D was applied. Using the
gradient of the sum is a nice trick to keep everything high level.'''
y_prepool, y_postpool = x
return K.gradients(K.sum(y_postpool), y_prepool)
# input image dimensions
img_rows, img_cols = 28, 28
# the data, shuffled and split between train and test sets
(X_train, _), (X_test, _) = mnist.load_data()
X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# The size of the kernel used for the MaxPooling2D
pool_size = 2
# The total number of feature maps at each layer
nfeats = [8, 16, 32, 64, 128]
# The sizes of the pooling kernel at each layer
pool_sizes = np.array([1, 1, 1, 1, 1]) * pool_size
# The convolution kernel size
ksize = 3
# Number of epochs to train for
nb_epoch = 5
# Batch size during training
batch_size = 128
if pool_size == 2:
# if using a 5 layer net of pool_size = 2
X_train = np.pad(X_train, [[0, 0], [0, 0], [2, 2], [2, 2]],
mode='constant')
X_test = np.pad(X_test, [[0, 0], [0, 0], [2, 2], [2, 2]], mode='constant')
nlayers = 5
elif pool_size == 3:
# if using a 3 layer net of pool_size = 3
X_train = X_train[:, :, :-1, :-1]
X_test = X_test[:, :, :-1, :-1]
nlayers = 3
else:
import sys
sys.exit("Script supports pool_size of 2 and 3.")
# Shape of input to train on (note that model is fully convolutional however)
input_shape = X_train.shape[1:]
# The final list of the size of axis=1 for all layers, including input
nfeats_all = [input_shape[0]] + nfeats
# First build the encoder, all the while keeping track of the "where" masks
img_input = Input(shape=input_shape)
# We push the "where" masks to the following list
wheres = [None] * nlayers
y = img_input
for i in range(nlayers):
y_prepool = convresblock(y, nfeats=nfeats_all[i + 1], ksize=ksize)
y = MaxPooling2D(pool_size=(pool_sizes[i], pool_sizes[i]))(y_prepool)
wheres[i] = merge([y_prepool, y], mode=getwhere,
output_shape=lambda x: x[0])
# Now build the decoder, and use the stored "where" masks to place the features
for i in range(nlayers):
ind = nlayers - 1 - i
y = UpSampling2D(size=(pool_sizes[ind], pool_sizes[ind]))(y)
y = merge([y, wheres[ind]], mode='mul')
y = convresblock(y, nfeats=nfeats_all[ind], ksize=ksize)
# Use hard_sigmoid to clip range of reconstruction
y = Activation('hard_sigmoid')(y)
# Define the model and its mean square error loss, and compile it with Adam
model = Model(img_input, y)
model.compile('adam', 'mse')
# Fit the model
model.fit(X_train, X_train, validation_data=(X_test, X_test),
batch_size=batch_size, nb_epoch=nb_epoch)
# Plot
X_recon = model.predict(X_test[:25])
X_plot = np.concatenate((X_test[:25], X_recon), axis=1)
X_plot = X_plot.reshape((5, 10, input_shape[-2], input_shape[-1]))
X_plot = np.vstack([np.hstack(x) for x in X_plot])
plt.figure()
plt.axis('off')
plt.title('Test Samples: Originals/Reconstructions')
plt.imshow(X_plot, interpolation='none', cmap='gray')
plt.savefig('reconstructions.png')
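The "where" mask and the unpooling step used in the script above can be illustrated on a single small feature map. The sketch below is a plain-Python stand-in, not the Keras code: it finds the argmax of each pooling window directly instead of using `K.gradients`, and it assumes a 2D input whose sides are divisible by the pool size. On ties it keeps the first maximum, which may differ from what a backend's gradient produces. The function names are hypothetical.

```python
def max_pool_with_where(x, pool=2):
    """Max-pool a 2D list and record where each maximum came from."""
    rows, cols = len(x), len(x[0])
    pooled = [[0.0] * (cols // pool) for _ in range(rows // pool)]
    where = [[0.0] * cols for _ in range(rows)]
    for r in range(0, rows, pool):
        for c in range(0, cols, pool):
            # collect (value, row, col) for every cell of this window
            window = [(x[i][j], i, j)
                      for i in range(r, r + pool)
                      for j in range(c, c + pool)]
            value, i_max, j_max = max(window, key=lambda t: t[0])
            pooled[r // pool][c // pool] = value
            where[i_max][j_max] = 1.0  # switch marking the max location
    return pooled, where


def unpool(pooled, where, pool=2):
    """Nearest-neighbour upsampling followed by masking with `where`."""
    return [[pooled[i // pool][j // pool] * where[i][j]
             for j in range(len(where[0]))]
            for i in range(len(where))]


x = [[1, 5, 2, 0],
     [3, 4, 8, 1],
     [0, 2, 1, 1],
     [9, 6, 3, 7]]
pooled, where = max_pool_with_where(x)
print(pooled)                 # [[5, 8], [9, 7]]
print(unpool(pooled, where))  # maxima restored at their original positions
```

This mirrors the decoder loop above, where `UpSampling2D` repeats each pooled value over its window and the element-wise product with the stored mask keeps it only at the position the encoder selected.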
+127
View File
@@ -0,0 +1,127 @@
'''Transfer learning toy example:
1- Train a simple convnet on the first 5 digits [0..4] of the MNIST dataset.
2- Freeze convolutional layers and fine-tune dense layers
for the classification of digits [5..9].
Run on GPU: THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python mnist_transfer_cnn.py
Gets to 99.8% test accuracy after 5 epochs
for the first five digits classifier
and 99.2% for the last five digits after transfer + fine-tuning.
'''
from __future__ import print_function
import numpy as np
import datetime
np.random.seed(1337) # for reproducibility
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils
from keras import backend as K
now = datetime.datetime.now
batch_size = 128
nb_classes = 5
nb_epoch = 5
# input image dimensions
img_rows, img_cols = 28, 28
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
pool_size = 2
# convolution kernel size
kernel_size = 3
if K.image_dim_ordering() == 'th':
input_shape = (1, img_rows, img_cols)
else:
input_shape = (img_rows, img_cols, 1)
def train_model(model, train, test, nb_classes):
X_train = train[0].reshape((train[0].shape[0],) + input_shape)
X_test = test[0].reshape((test[0].shape[0],) + input_shape)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(train[1], nb_classes)
Y_test = np_utils.to_categorical(test[1], nb_classes)
model.compile(loss='categorical_crossentropy',
optimizer='adadelta',
metrics=['accuracy'])
t = now()
model.fit(X_train, Y_train,
batch_size=batch_size, nb_epoch=nb_epoch,
verbose=1,
validation_data=(X_test, Y_test))
print('Training time: %s' % (now() - t))
score = model.evaluate(X_test, Y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# create two datasets: one with digits below 5 and one with 5 and above
X_train_lt5 = X_train[y_train < 5]
y_train_lt5 = y_train[y_train < 5]
X_test_lt5 = X_test[y_test < 5]
y_test_lt5 = y_test[y_test < 5]
X_train_gte5 = X_train[y_train >= 5]
y_train_gte5 = y_train[y_train >= 5] - 5 # make classes start at 0 for
X_test_gte5 = X_test[y_test >= 5] # np_utils.to_categorical
y_test_gte5 = y_test[y_test >= 5] - 5
# define two groups of layers: feature (convolutions) and classification (dense)
feature_layers = [
Convolution2D(nb_filters, kernel_size, kernel_size,
border_mode='valid',
input_shape=input_shape),
Activation('relu'),
Convolution2D(nb_filters, kernel_size, kernel_size),
Activation('relu'),
MaxPooling2D(pool_size=(pool_size, pool_size)),
Dropout(0.25),
Flatten(),
]
classification_layers = [
Dense(128),
Activation('relu'),
Dropout(0.5),
Dense(nb_classes),
Activation('softmax')
]
# create complete model
model = Sequential(feature_layers + classification_layers)
# train model for 5-digit classification [0..4]
train_model(model,
(X_train_lt5, y_train_lt5),
(X_test_lt5, y_test_lt5), nb_classes)
# freeze feature layers and rebuild model
for l in feature_layers:
l.trainable = False
# transfer: train dense layers for new classification task [5..9]
train_model(model,
(X_train_gte5, y_train_gte5),
(X_test_gte5, y_test_gte5), nb_classes)
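The data preparation for the two tasks above relies on shifting the labels of digits 5..9 down to 0..4, so that both tasks can share the same 5-way softmax. A minimal sketch of that split in plain Python follows; `split_by_label` and `to_one_hot` are hypothetical helpers standing in for the NumPy boolean indexing and `np_utils.to_categorical` used in the script, and the sample data is made up.

```python
def split_by_label(samples, labels, threshold=5):
    """Split into (label < threshold) and (label >= threshold) subsets.

    Labels of the second subset are shifted so that they start at 0.
    """
    low = [(x, y) for x, y in zip(samples, labels) if y < threshold]
    high = [(x, y - threshold) for x, y in zip(samples, labels)
            if y >= threshold]
    return low, high


def to_one_hot(label, nb_classes):
    """Binary class vector with a single 1 at index `label`."""
    return [1 if i == label else 0 for i in range(nb_classes)]


# hypothetical sample names paired with their digit labels
samples = ['a', 'b', 'c', 'd']
labels = [0, 7, 4, 9]
low, high = split_by_label(samples, labels)
print(low)                      # [('a', 0), ('c', 4)]
print(high)                     # [('b', 2), ('d', 4)]
print(to_one_hot(high[0][1], 5))  # [0, 0, 1, 0, 0]
```

Without the shift, a label such as 7 would not fit into a 5-element class vector.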
+366
View File
@@ -0,0 +1,366 @@
'''Neural doodle with Keras
Script Usage:
# Arguments:
```
--nlabels: # of regions (colors) in mask images
--style-image: image to learn style from
--style-mask: semantic labels for style image
--target-mask: semantic labels for target image (your doodle)
--content-image: optional image to learn content from
--target-image-prefix: path prefix for generated target images
```
# Example 1: doodle using a style image, style mask
and target mask.
```
python neural_doodle.py --nlabels 4 --style-image Monet/style.png \
--style-mask Monet/style_mask.png --target-mask Monet/target_mask.png \
--target-image-prefix generated/monet
```
# Example 2: doodle using a style image, style mask,
target mask and an optional content image.
```
python neural_doodle.py --nlabels 4 --style-image Renoir/style.png \
--style-mask Renoir/style_mask.png --target-mask Renoir/target_mask.png \
--content-image Renoir/creek.jpg \
--target-image-prefix generated/renoir
```
References:
[Dmitry Ulyanov's blog on fast-neural-doodle](http://dmitryulyanov.github.io/feed-forward-neural-doodle/)
[Torch code for fast-neural-doodle](https://github.com/DmitryUlyanov/fast-neural-doodle)
[Torch code for online-neural-doodle](https://github.com/DmitryUlyanov/online-neural-doodle)
[Paper Texture Networks: Feed-forward Synthesis of Textures and Stylized Images](http://arxiv.org/abs/1603.03417)
[Discussion on parameter tuning](https://github.com/fchollet/keras/issues/3705)
Resources:
Example images can be downloaded from
https://github.com/DmitryUlyanov/fast-neural-doodle/tree/master/data
'''
from __future__ import print_function
import time
import argparse
import numpy as np
from scipy.optimize import fmin_l_bfgs_b
from scipy.misc import imread, imsave
from keras import backend as K
from keras.layers import Input, Convolution2D, MaxPooling2D, AveragePooling2D
from keras.models import Model
from keras.preprocessing.image import load_img, img_to_array
from keras.applications import vgg19
# Command line arguments
parser = argparse.ArgumentParser(description='Keras neural doodle example')
parser.add_argument('--nlabels', type=int,
help='number of semantic labels'
' (regions in different colors)'
' in style_mask/target_mask')
parser.add_argument('--style-image', type=str,
help='path to image to learn style from')
parser.add_argument('--style-mask', type=str,
help='path to semantic mask of style image')
parser.add_argument('--target-mask', type=str,
help='path to semantic mask of target image')
parser.add_argument('--content-image', type=str, default=None,
help='path to optional content image')
parser.add_argument('--target-image-prefix', type=str,
help='path prefix for generated results')
args = parser.parse_args()
style_img_path = args.style_image
style_mask_path = args.style_mask
target_mask_path = args.target_mask
content_img_path = args.content_image
target_img_prefix = args.target_image_prefix
use_content_img = content_img_path is not None
nb_labels = args.nlabels
nb_colors = 3 # RGB
# determine image sizes based on target_mask
ref_img = imread(target_mask_path)
img_nrows, img_ncols = ref_img.shape[:2]
total_variation_weight = 50.
style_weight = 1.
content_weight = 0.1 if use_content_img else 0
content_feature_layers = ['block5_conv2']
# To get better generation qualities, use more conv layers for style features
style_feature_layers = ['block1_conv1', 'block2_conv1', 'block3_conv1',
'block4_conv1', 'block5_conv1']
# helper functions for reading/processing images
def preprocess_image(image_path):
img = load_img(image_path, target_size=(img_nrows, img_ncols))
img = img_to_array(img)
img = np.expand_dims(img, axis=0)
img = vgg19.preprocess_input(img)
return img
def deprocess_image(x):
if K.image_dim_ordering() == 'th':
x = x.reshape((3, img_nrows, img_ncols))
x = x.transpose((1, 2, 0))
else:
x = x.reshape((img_nrows, img_ncols, 3))
# Remove zero-center by mean pixel
x[:, :, 0] += 103.939
x[:, :, 1] += 116.779
x[:, :, 2] += 123.68
# 'BGR'->'RGB'
x = x[:, :, ::-1]
x = np.clip(x, 0, 255).astype('uint8')
return x
def kmeans(xs, k):
assert xs.ndim == 2
try:
from sklearn.cluster import k_means
_, labels, _ = k_means(xs.astype("float64"), k)
except ImportError:
from scipy.cluster.vq import kmeans2
_, labels = kmeans2(xs, k, missing='raise')
return labels
def load_mask_labels():
    '''Load both target and style masks.
    A mask image (nr x nc) with m labels/colors will be loaded
    as a 4D boolean tensor: (1, m, nr, nc) for 'th' or (1, nr, nc, m) for 'tf'
    '''
    target_mask_img = load_img(target_mask_path,
                               target_size=(img_nrows, img_ncols))
    target_mask_img = img_to_array(target_mask_img)
    style_mask_img = load_img(style_mask_path,
                              target_size=(img_nrows, img_ncols))
    style_mask_img = img_to_array(style_mask_img)
    if K.image_dim_ordering() == 'th':
        mask_vecs = np.vstack([style_mask_img.reshape((3, -1)).T,
                               target_mask_img.reshape((3, -1)).T])
    else:
        mask_vecs = np.vstack([style_mask_img.reshape((-1, 3)),
                               target_mask_img.reshape((-1, 3))])
    # cluster the pixel colors of both masks jointly so that labels are shared
    labels = kmeans(mask_vecs, nb_labels)
    style_mask_label = labels[:img_nrows *
                              img_ncols].reshape((img_nrows, img_ncols))
    target_mask_label = labels[img_nrows *
                               img_ncols:].reshape((img_nrows, img_ncols))
    stack_axis = 0 if K.image_dim_ordering() == 'th' else -1
    # one boolean plane per label (region)
    style_mask = np.stack([style_mask_label == r for r in range(nb_labels)],
                          axis=stack_axis)
    target_mask = np.stack([target_mask_label == r for r in range(nb_labels)],
                           axis=stack_axis)
    return (np.expand_dims(style_mask, axis=0),
            np.expand_dims(target_mask, axis=0))
# Create tensor variables for images
if K.image_dim_ordering() == 'th':
shape = (1, nb_colors, img_nrows, img_ncols)
else:
shape = (1, img_nrows, img_ncols, nb_colors)
style_image = K.variable(preprocess_image(style_img_path))
target_image = K.placeholder(shape=shape)
if use_content_img:
content_image = K.variable(preprocess_image(content_img_path))
else:
content_image = K.zeros(shape=shape)
images = K.concatenate([style_image, target_image, content_image], axis=0)
# Create tensor variables for masks
raw_style_mask, raw_target_mask = load_mask_labels()
style_mask = K.variable(raw_style_mask.astype("float32"))
target_mask = K.variable(raw_target_mask.astype("float32"))
masks = K.concatenate([style_mask, target_mask], axis=0)
# index constants for images and masks variables
STYLE, TARGET, CONTENT = 0, 1, 2
# Build image model, mask model and use layer outputs as features
# image model as VGG19
image_model = vgg19.VGG19(include_top=False, input_tensor=images)
# mask model as a series of pooling
mask_input = Input(tensor=masks, shape=(None, None, None), name="mask_input")
x = mask_input
for layer in image_model.layers[1:]:
name = 'mask_%s' % layer.name
if 'conv' in layer.name:
x = AveragePooling2D((3, 3), strides=(
1, 1), name=name, border_mode="same")(x)
elif 'pool' in layer.name:
x = AveragePooling2D((2, 2), name=name)(x)
mask_model = Model(mask_input, x)
# Collect features from image_model and mask_model
image_features = {}
mask_features = {}
for img_layer, mask_layer in zip(image_model.layers, mask_model.layers):
if 'conv' in img_layer.name:
assert 'mask_' + img_layer.name == mask_layer.name
layer_name = img_layer.name
img_feat, mask_feat = img_layer.output, mask_layer.output
image_features[layer_name] = img_feat
mask_features[layer_name] = mask_feat
# Define loss functions
def gram_matrix(x):
assert K.ndim(x) == 3
features = K.batch_flatten(x)
gram = K.dot(features, K.transpose(features))
return gram
def region_style_loss(style_image, target_image, style_mask, target_mask):
'''Calculate style loss between style_image and target_image,
for one common region specified by their (boolean) masks
'''
assert 3 == K.ndim(style_image) == K.ndim(target_image)
assert 2 == K.ndim(style_mask) == K.ndim(target_mask)
if K.image_dim_ordering() == 'th':
masked_style = style_image * style_mask
masked_target = target_image * target_mask
nb_channels = K.shape(style_image)[0]
else:
masked_style = K.permute_dimensions(
style_image, (2, 0, 1)) * style_mask
masked_target = K.permute_dimensions(
target_image, (2, 0, 1)) * target_mask
nb_channels = K.shape(style_image)[-1]
s = gram_matrix(masked_style) / K.mean(style_mask) / nb_channels
c = gram_matrix(masked_target) / K.mean(target_mask) / nb_channels
return K.mean(K.square(s - c))
def style_loss(style_image, target_image, style_masks, target_masks):
    '''Calculate style loss between style_image and target_image,
    in all regions.
    '''
    assert 3 == K.ndim(style_image) == K.ndim(target_image)
    assert 3 == K.ndim(style_masks) == K.ndim(target_masks)
    loss = K.variable(0)
    # accumulate the loss over every mask label (region)
    for i in range(nb_labels):
        if K.image_dim_ordering() == 'th':
            style_mask = style_masks[i, :, :]
            target_mask = target_masks[i, :, :]
        else:
            style_mask = style_masks[:, :, i]
            target_mask = target_masks[:, :, i]
        loss += region_style_loss(style_image,
                                  target_image, style_mask, target_mask)
    return loss
def content_loss(content_image, target_image):
return K.sum(K.square(target_image - content_image))
def total_variation_loss(x):
assert 4 == K.ndim(x)
if K.image_dim_ordering() == 'th':
a = K.square(x[:, :, :img_nrows - 1, :img_ncols - 1] -
x[:, :, 1:, :img_ncols - 1])
b = K.square(x[:, :, :img_nrows - 1, :img_ncols - 1] -
x[:, :, :img_nrows - 1, 1:])
else:
a = K.square(x[:, :img_nrows - 1, :img_ncols - 1, :] -
x[:, 1:, :img_ncols - 1, :])
b = K.square(x[:, :img_nrows - 1, :img_ncols - 1, :] -
x[:, :img_nrows - 1, 1:, :])
return K.sum(K.pow(a + b, 1.25))
# Overall loss is the weighted sum of content_loss, style_loss and tv_loss
# Each individual loss uses features from image/mask models.
loss = K.variable(0)
for layer in content_feature_layers:
content_feat = image_features[layer][CONTENT, :, :, :]
target_feat = image_features[layer][TARGET, :, :, :]
loss += content_weight * content_loss(content_feat, target_feat)
for layer in style_feature_layers:
style_feat = image_features[layer][STYLE, :, :, :]
target_feat = image_features[layer][TARGET, :, :, :]
style_masks = mask_features[layer][STYLE, :, :, :]
target_masks = mask_features[layer][TARGET, :, :, :]
sl = style_loss(style_feat, target_feat, style_masks, target_masks)
loss += (style_weight / len(style_feature_layers)) * sl
loss += total_variation_weight * total_variation_loss(target_image)
loss_grads = K.gradients(loss, target_image)
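# Evaluator class for computational efficiency: loss and gradients are
# computed in a single pass and cached between the two optimizer callbacks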
outputs = [loss]
if type(loss_grads) in {list, tuple}:
outputs += loss_grads
else:
outputs.append(loss_grads)
f_outputs = K.function([target_image], outputs)
def eval_loss_and_grads(x):
if K.image_dim_ordering() == 'th':
x = x.reshape((1, 3, img_nrows, img_ncols))
else:
x = x.reshape((1, img_nrows, img_ncols, 3))
outs = f_outputs([x])
loss_value = outs[0]
if len(outs[1:]) == 1:
grad_values = outs[1].flatten().astype('float64')
else:
grad_values = np.array(outs[1:]).flatten().astype('float64')
return loss_value, grad_values
class Evaluator(object):

    def __init__(self):
        self.loss_value = None
        self.grad_values = None

    def loss(self, x):
        # compute loss and gradients together, cache the gradients
        assert self.loss_value is None
        loss_value, grad_values = eval_loss_and_grads(x)
        self.loss_value = loss_value
        self.grad_values = grad_values
        return self.loss_value

    def grads(self, x):
        # return the cached gradients and reset the cache
        assert self.loss_value is not None
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values
evaluator = Evaluator()
# Generate images by iterative optimization
if K.image_dim_ordering() == 'th':
x = np.random.uniform(0, 255, (1, 3, img_nrows, img_ncols)) - 128.
else:
x = np.random.uniform(0, 255, (1, img_nrows, img_ncols, 3)) - 128.
for i in range(50):
print('Start of iteration', i)
start_time = time.time()
x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(),
fprime=evaluator.grads, maxfun=20)
print('Current loss value:', min_val)
# save current generated image
img = deprocess_image(x.copy())
fname = target_img_prefix + '_at_iteration_%d.png' % i
imsave(fname, img)
end_time = time.time()
print('Image saved as', fname)
print('Iteration %d completed in %ds' % (i, end_time - start_time))
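The per-region style loss in the script above compares Gram matrices of masked feature maps, normalized by mask coverage and channel count. The following plain-Python sketch reproduces that arithmetic on tiny hand-made inputs; it assumes each feature map is given as a list of channels, each channel a flat list of pixel values, and that each mask covers at least one pixel. The input values are invented for illustration.

```python
def gram_matrix(features):
    """Channel-by-channel inner products of flattened feature maps."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]


def region_style_loss(style_feats, target_feats, style_mask, target_mask):
    """Mean squared difference between normalized, masked Gram matrices."""
    nb_channels = len(style_feats)

    def masked_gram(feats, mask):
        # zero out pixels outside the region, then normalize the Gram
        # matrix by the fraction of pixels covered and the channel count
        masked = [[v * m for v, m in zip(channel, mask)] for channel in feats]
        coverage = sum(mask) / float(len(mask))
        return [[g / coverage / nb_channels for g in row]
                for row in gram_matrix(masked)]

    s = masked_gram(style_feats, style_mask)
    c = masked_gram(target_feats, target_mask)
    diffs = [(a - b) ** 2 for row_s, row_c in zip(s, c)
             for a, b in zip(row_s, row_c)]
    return sum(diffs) / len(diffs)


# two channels, two pixels each; the masks cover every pixel
style = [[1.0, 2.0], [3.0, 4.0]]
blank = [[0.0, 0.0], [0.0, 0.0]]
full_mask = [1.0, 1.0]
print(gram_matrix(style))                                      # [[5.0, 11.0], [11.0, 25.0]]
print(region_style_loss(style, style, full_mask, full_mask))   # 0.0
print(region_style_loss(style, blank, full_mask, full_mask))   # 55.75
```

Dividing by the mask coverage keeps small regions from being under-weighted relative to large ones, since a Gram matrix grows with the number of pixels that contribute to it.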
+261
View File
@@ -0,0 +1,261 @@
'''Neural style transfer with Keras.
Run the script with:
```
python neural_style_transfer.py path_to_your_base_image.jpg path_to_your_reference.jpg prefix_for_results
```
e.g.:
```
python neural_style_transfer.py img/tuebingen.jpg img/starry_night.jpg results/my_result
```
It is preferable to run this script on GPU, for speed.
Example result: https://twitter.com/fchollet/status/686631033085677568
# Details
Style transfer consists of generating an image
with the same "content" as a base image, but with the
"style" of a different picture (typically artistic).
This is achieved through the optimization of a loss function
that has 3 components: "style loss", "content loss",
and "total variation loss":
- The total variation loss imposes local spatial continuity between
the pixels of the combination image, giving it visual coherence.
- The style loss is where the deep learning kicks in: it is defined
using a deep convolutional neural network. Precisely, it consists of a sum of
L2 distances between the Gram matrices of the representations of
the combination image and the style reference image, extracted from
different layers of a convnet (trained on ImageNet). The general idea
is to capture color/texture information at different spatial
scales (fairly large scales --defined by the depth of the layer considered).
- The content loss is an L2 distance between the features of the base
image (extracted from a deep layer) and the features of the combination image,
keeping the generated image close enough to the original one.
# References
- [A Neural Algorithm of Artistic Style](http://arxiv.org/abs/1508.06576)
'''
from __future__ import print_function
from keras.preprocessing.image import load_img, img_to_array
from scipy.misc import imsave
import numpy as np
from scipy.optimize import fmin_l_bfgs_b
import time
import argparse
from keras.applications import vgg16
from keras import backend as K
parser = argparse.ArgumentParser(description='Neural style transfer with Keras.')
parser.add_argument('base_image_path', metavar='base', type=str,
help='Path to the image to transform.')
parser.add_argument('style_reference_image_path', metavar='ref', type=str,
help='Path to the style reference image.')
parser.add_argument('result_prefix', metavar='res_prefix', type=str,
help='Prefix for the saved results.')
args = parser.parse_args()
base_image_path = args.base_image_path
style_reference_image_path = args.style_reference_image_path
result_prefix = args.result_prefix
# these are the weights of the different loss components
total_variation_weight = 1.
style_weight = 1.
content_weight = 0.025
# dimensions of the generated picture.
img_nrows = 400
img_ncols = 400
assert img_ncols == img_nrows, 'Due to the use of the Gram matrix, width and height must match.'
# util function to open, resize and format pictures into appropriate tensors
def preprocess_image(image_path):
img = load_img(image_path, target_size=(img_nrows, img_ncols))
img = img_to_array(img)
img = np.expand_dims(img, axis=0)
img = vgg16.preprocess_input(img)
return img
# util function to convert a tensor into a valid image
def deprocess_image(x):
if K.image_dim_ordering() == 'th':
x = x.reshape((3, img_nrows, img_ncols))
x = x.transpose((1, 2, 0))
else:
x = x.reshape((img_nrows, img_ncols, 3))
# Remove zero-center by mean pixel
x[:, :, 0] += 103.939
x[:, :, 1] += 116.779
x[:, :, 2] += 123.68
# 'BGR'->'RGB'
x = x[:, :, ::-1]
x = np.clip(x, 0, 255).astype('uint8')
return x
# get tensor representations of our images
base_image = K.variable(preprocess_image(base_image_path))
style_reference_image = K.variable(preprocess_image(style_reference_image_path))
# this will contain our generated image
if K.image_dim_ordering() == 'th':
combination_image = K.placeholder((1, 3, img_nrows, img_ncols))
else:
combination_image = K.placeholder((1, img_nrows, img_ncols, 3))
# combine the 3 images into a single Keras tensor
input_tensor = K.concatenate([base_image,
style_reference_image,
combination_image], axis=0)
# build the VGG16 network with our 3 images as input
# the model will be loaded with pre-trained ImageNet weights
model = vgg16.VGG16(input_tensor=input_tensor,
weights='imagenet', include_top=False)
print('Model loaded.')
# get the symbolic outputs of each "key" layer (we gave them unique names).
outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])
# compute the neural style loss
# first we need to define 4 util functions
# the gram matrix of an image tensor (feature-wise outer product)
def gram_matrix(x):
assert K.ndim(x) == 3
if K.image_dim_ordering() == 'th':
features = K.batch_flatten(x)
else:
features = K.batch_flatten(K.permute_dimensions(x, (2, 0, 1)))
gram = K.dot(features, K.transpose(features))
return gram
# the "style loss" is designed to maintain
# the style of the reference image in the generated image.
# It is based on the gram matrices (which capture style) of
# feature maps from the style reference image
# and from the generated image
def style_loss(style, combination):
assert K.ndim(style) == 3
assert K.ndim(combination) == 3
S = gram_matrix(style)
C = gram_matrix(combination)
channels = 3
size = img_nrows * img_ncols
return K.sum(K.square(S - C)) / (4. * (channels ** 2) * (size ** 2))
# an auxiliary loss function
# designed to maintain the "content" of the
# base image in the generated image
def content_loss(base, combination):
return K.sum(K.square(combination - base))
# the 3rd loss function, total variation loss,
# designed to keep the generated image locally coherent
def total_variation_loss(x):
assert K.ndim(x) == 4
if K.image_dim_ordering() == 'th':
a = K.square(x[:, :, :img_nrows-1, :img_ncols-1] - x[:, :, 1:, :img_ncols-1])
b = K.square(x[:, :, :img_nrows-1, :img_ncols-1] - x[:, :, :img_nrows-1, 1:])
else:
a = K.square(x[:, :img_nrows-1, :img_ncols-1, :] - x[:, 1:, :img_ncols-1, :])
b = K.square(x[:, :img_nrows-1, :img_ncols-1, :] - x[:, :img_nrows-1, 1:, :])
return K.sum(K.pow(a + b, 1.25))
# combine these loss functions into a single scalar
loss = K.variable(0.)
layer_features = outputs_dict['block4_conv2']
base_image_features = layer_features[0, :, :, :]
combination_features = layer_features[2, :, :, :]
loss += content_weight * content_loss(base_image_features,
combination_features)
feature_layers = ['block1_conv1', 'block2_conv1',
'block3_conv1', 'block4_conv1',
'block5_conv1']
for layer_name in feature_layers:
layer_features = outputs_dict[layer_name]
style_reference_features = layer_features[1, :, :, :]
combination_features = layer_features[2, :, :, :]
sl = style_loss(style_reference_features, combination_features)
loss += (style_weight / len(feature_layers)) * sl
loss += total_variation_weight * total_variation_loss(combination_image)
# get the gradients of the generated image wrt the loss
grads = K.gradients(loss, combination_image)
outputs = [loss]
if type(grads) in {list, tuple}:
outputs += grads
else:
outputs.append(grads)
f_outputs = K.function([combination_image], outputs)
def eval_loss_and_grads(x):
if K.image_dim_ordering() == 'th':
x = x.reshape((1, 3, img_nrows, img_ncols))
else:
x = x.reshape((1, img_nrows, img_ncols, 3))
outs = f_outputs([x])
loss_value = outs[0]
if len(outs[1:]) == 1:
grad_values = outs[1].flatten().astype('float64')
else:
grad_values = np.array(outs[1:]).flatten().astype('float64')
return loss_value, grad_values
# this Evaluator class makes it possible
# to compute loss and gradients in one pass
# while retrieving them via two separate functions,
# "loss" and "grads". This is done because scipy.optimize
# requires separate functions for loss and gradients,
# but computing them separately would be inefficient.
class Evaluator(object):
    def __init__(self):
        self.loss_value = None
        # same attribute name as the one read and reset in loss() and grads()
        self.grad_values = None
    def loss(self, x):
        assert self.loss_value is None
        loss_value, grad_values = eval_loss_and_grads(x)
        self.loss_value = loss_value
        self.grad_values = grad_values
        return self.loss_value
    def grads(self, x):
        assert self.loss_value is not None
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values
evaluator = Evaluator()
# run scipy-based optimization (L-BFGS) over the pixels of the generated image
# so as to minimize the neural style loss
if K.image_dim_ordering() == 'th':
x = np.random.uniform(0, 255, (1, 3, img_nrows, img_ncols)) - 128.
else:
x = np.random.uniform(0, 255, (1, img_nrows, img_ncols, 3)) - 128.
for i in range(10):
print('Start of iteration', i)
start_time = time.time()
x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(),
fprime=evaluator.grads, maxfun=20)
print('Current loss value:', min_val)
# save current generated image
img = deprocess_image(x.copy())
fname = result_prefix + '_at_iteration_%d.png' % i
imsave(fname, img)
end_time = time.time()
print('Image saved as', fname)
print('Iteration %d completed in %ds' % (i, end_time - start_time))
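Note on the style loss: the docstring describes it as a distance between Gram matrices. The sketch below is a plain-Python illustration of that idea, not the Keras backend code used in the script. The nested-list layout (one flattened feature map per channel) and the toy values are assumptions made for the example; the scaling factor mirrors the `4 * channels^2 * size^2` denominator in `style_loss`.

```python
def gram_matrix(features):
    """Gram matrix of flattened feature maps: one row of `features` per channel."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]


def style_loss(style, combination, channels, size):
    """Sum of squared Gram differences, scaled like the script's style_loss."""
    s = gram_matrix(style)
    c = gram_matrix(combination)
    total = sum((s[i][j] - c[i][j]) ** 2
                for i in range(len(s)) for j in range(len(s)))
    return total / (4.0 * (channels ** 2) * (size ** 2))


# Two channels, each flattened to four spatial positions (toy values).
style = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
# Same activations as `style`, rearranged spatially.
shuffled = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
# Different channel statistics: everything in the first channel.
flat = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]

print(gram_matrix(style))                               # [[2.0, 0.0], [0.0, 2.0]]
print(style_loss(style, shuffled, channels=2, size=4))  # 0.0
print(style_loss(style, flat, channels=2, size=4))      # 0.03125
```

The `shuffled` case gives a loss of zero even though the pixels moved, which is the point of using Gram matrices: they keep channel co-activation statistics and discard spatial arrangement.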
+144
View File
@@ -0,0 +1,144 @@
'''This script loads pre-trained word embeddings (GloVe embeddings)
into a frozen Keras Embedding layer, and uses it to
train a text classification model on the 20 Newsgroup dataset
(classification of newsgroup messages into 20 different categories).
GloVe embedding data can be found at:
http://nlp.stanford.edu/data/glove.6B.zip
(source page: http://nlp.stanford.edu/projects/glove/)
20 Newsgroup data can be found at:
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.html
'''
from __future__ import print_function
import os
import numpy as np
np.random.seed(1337)
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.utils.np_utils import to_categorical
from keras.layers import Dense, Input, Flatten
from keras.layers import Conv1D, MaxPooling1D, Embedding
from keras.models import Model
import sys
BASE_DIR = ''
GLOVE_DIR = BASE_DIR + '/glove.6B/'
TEXT_DATA_DIR = BASE_DIR + '/20_newsgroup/'
MAX_SEQUENCE_LENGTH = 1000
MAX_NB_WORDS = 20000
EMBEDDING_DIM = 100
VALIDATION_SPLIT = 0.2
# first, build index mapping words in the embeddings set
# to their embedding vector
print('Indexing word vectors.')
embeddings_index = {}
f = open(os.path.join(GLOVE_DIR, 'glove.6B.100d.txt'))
for line in f:
values = line.split()
word = values[0]
coefs = np.asarray(values[1:], dtype='float32')
embeddings_index[word] = coefs
f.close()
print('Found %s word vectors.' % len(embeddings_index))
# second, prepare text samples and their labels
print('Processing text dataset')
texts = [] # list of text samples
labels_index = {} # dictionary mapping label name to numeric id
labels = [] # list of label ids
for name in sorted(os.listdir(TEXT_DATA_DIR)):
path = os.path.join(TEXT_DATA_DIR, name)
if os.path.isdir(path):
label_id = len(labels_index)
labels_index[name] = label_id
for fname in sorted(os.listdir(path)):
if fname.isdigit():
fpath = os.path.join(path, fname)
if sys.version_info < (3,):
f = open(fpath)
else:
f = open(fpath, encoding='latin-1')
texts.append(f.read())
f.close()
labels.append(label_id)
print('Found %s texts.' % len(texts))
# finally, vectorize the text samples into a 2D integer tensor
tokenizer = Tokenizer(nb_words=MAX_NB_WORDS)
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))
data = pad_sequences(sequences, maxlen=MAX_SEQUENCE_LENGTH)
labels = to_categorical(np.asarray(labels))
print('Shape of data tensor:', data.shape)
print('Shape of label tensor:', labels.shape)
# split the data into a training set and a validation set
indices = np.arange(data.shape[0])
np.random.shuffle(indices)
data = data[indices]
labels = labels[indices]
nb_validation_samples = int(VALIDATION_SPLIT * data.shape[0])
x_train = data[:-nb_validation_samples]
y_train = labels[:-nb_validation_samples]
x_val = data[-nb_validation_samples:]
y_val = labels[-nb_validation_samples:]
print('Preparing embedding matrix.')
# prepare embedding matrix
nb_words = min(MAX_NB_WORDS, len(word_index))
embedding_matrix = np.zeros((nb_words + 1, EMBEDDING_DIM))
for word, i in word_index.items():
if i > MAX_NB_WORDS:
continue
embedding_vector = embeddings_index.get(word)
if embedding_vector is not None:
# words not found in embedding index will be all-zeros.
embedding_matrix[i] = embedding_vector
# load pre-trained word embeddings into an Embedding layer
# note that we set trainable = False so as to keep the embeddings fixed
embedding_layer = Embedding(nb_words + 1,
EMBEDDING_DIM,
weights=[embedding_matrix],
input_length=MAX_SEQUENCE_LENGTH,
trainable=False)
print('Training model.')
# train a 1D convnet with global maxpooling
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
x = Conv1D(128, 5, activation='relu')(embedded_sequences)
x = MaxPooling1D(5)(x)
x = Conv1D(128, 5, activation='relu')(x)
x = MaxPooling1D(5)(x)
x = Conv1D(128, 5, activation='relu')(x)
x = MaxPooling1D(35)(x)
x = Flatten()(x)
x = Dense(128, activation='relu')(x)
preds = Dense(len(labels_index), activation='softmax')(x)
model = Model(sequence_input, preds)
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['acc'])
# happy learning!
model.fit(x_train, y_train, validation_data=(x_val, y_val),
nb_epoch=2, batch_size=128)
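Note on the embedding matrix: the lookup step that fills `embedding_matrix` can be sketched without NumPy or the GloVe files. The helper name `build_embedding_matrix`, the toy vocabulary and the two-dimensional vectors below are all made up for illustration; only the logic (row 0 unused, unknown words left as zeros, indices above the cap skipped) follows the script.

```python
def build_embedding_matrix(word_index, embeddings_index, max_words, dim):
    """Return a (nb_words + 1) x dim table; row i holds the vector of word i."""
    nb_words = min(max_words, len(word_index))
    matrix = [[0.0] * dim for _ in range(nb_words + 1)]
    for word, i in word_index.items():
        if i > max_words:
            # Words ranked below the vocabulary cap get no row at all.
            continue
        vector = embeddings_index.get(word)
        if vector is not None:
            # Words missing from the pre-trained set stay all-zeros.
            matrix[i] = list(vector)
    return matrix


# Hypothetical tokenizer output (index 0 is reserved) and pre-trained vectors.
word_index = {'the': 1, 'cat': 2, 'zyzzyva': 3, 'sat': 4}
embeddings_index = {'the': [0.1, 0.2], 'cat': [0.3, 0.4], 'sat': [0.5, 0.6]}

matrix = build_embedding_matrix(word_index, embeddings_index, max_words=3, dim=2)
for row in matrix:
    print(row)
```

Here `zyzzyva` keeps a zero row because it has no pre-trained vector, and `sat` is dropped because its index exceeds the cap.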
+23 -22
View File
@@ -1,26 +1,22 @@
-from __future__ import absolute_import
+'''Trains and evaluates a simple MLP
+on the Reuters newswire topic classification task.
+'''
 from __future__ import print_function
 import numpy as np
 np.random.seed(1337) # for reproducibility
 from keras.datasets import reuters
 from keras.models import Sequential
-from keras.layers.core import Dense, Dropout, Activation
-from keras.layers.normalization import BatchNormalization
+from keras.layers import Dense, Dropout, Activation
 from keras.utils import np_utils
 from keras.preprocessing.text import Tokenizer
-'''
-Train and evaluate a simple MLP on the Reuters newswire topic classification task.
-GPU run command:
-THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python examples/reuters_mlp.py
-CPU run command:
-python examples/reuters_mlp.py
-'''
 max_words = 1000
 batch_size = 32
+nb_epoch = 5
-print("Loading data...")
+print('Loading data...')
 (X_train, y_train), (X_test, y_test) = reuters.load_data(nb_words=max_words, test_split=0.2)
 print(len(X_train), 'train sequences')
 print(len(X_test), 'test sequences')
@@ -28,30 +24,35 @@ print(len(X_test), 'test sequences')
 nb_classes = np.max(y_train)+1
 print(nb_classes, 'classes')
-print("Vectorizing sequence data...")
+print('Vectorizing sequence data...')
 tokenizer = Tokenizer(nb_words=max_words)
-X_train = tokenizer.sequences_to_matrix(X_train, mode="binary")
-X_test = tokenizer.sequences_to_matrix(X_test, mode="binary")
+X_train = tokenizer.sequences_to_matrix(X_train, mode='binary')
+X_test = tokenizer.sequences_to_matrix(X_test, mode='binary')
 print('X_train shape:', X_train.shape)
 print('X_test shape:', X_test.shape)
-print("Convert class vector to binary class matrix (for use with categorical_crossentropy)")
+print('Convert class vector to binary class matrix (for use with categorical_crossentropy)')
 Y_train = np_utils.to_categorical(y_train, nb_classes)
 Y_test = np_utils.to_categorical(y_test, nb_classes)
 print('Y_train shape:', Y_train.shape)
 print('Y_test shape:', Y_test.shape)
-print("Building model...")
+print('Building model...')
 model = Sequential()
-model.add(Dense(max_words, 256, init='normal'))
+model.add(Dense(512, input_shape=(max_words,)))
 model.add(Activation('relu'))
 model.add(Dropout(0.5))
-model.add(Dense(256, nb_classes, init='normal'))
+model.add(Dense(nb_classes))
 model.add(Activation('softmax'))
-model.compile(loss='categorical_crossentropy', optimizer='adam')
+model.compile(loss='categorical_crossentropy',
+              optimizer='adam',
+              metrics=['accuracy'])
-history = model.fit(X_train, Y_train, nb_epoch=4, batch_size=batch_size, verbose=1, show_accuracy=True, validation_split=0.1)
-score = model.evaluate(X_test, Y_test, batch_size=batch_size, verbose=1, show_accuracy=True)
+history = model.fit(X_train, Y_train,
+                    nb_epoch=nb_epoch, batch_size=batch_size,
+                    verbose=1, validation_split=0.1)
+score = model.evaluate(X_test, Y_test,
+                       batch_size=batch_size, verbose=1)
 print('Test score:', score[0])
 print('Test accuracy:', score[1])
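Note on the vectorization step: the script turns each newswire (a list of word indices) into a fixed-length binary vector before feeding it to the MLP. The following is a rough plain-Python stand-in for what `sequences_to_matrix(..., mode='binary')` is used for here; the helper name `sequences_to_binary_matrix` and the sample sequences are hypothetical, and the real Tokenizer method returns a NumPy array rather than nested lists.

```python
def sequences_to_binary_matrix(sequences, nb_words):
    """One row per sequence; column j is 1.0 if word index j occurs in it."""
    matrix = [[0.0] * nb_words for _ in sequences]
    for row, seq in zip(matrix, sequences):
        for index in seq:
            if index < nb_words:
                # Repeated words still give 1.0; out-of-range indices are ignored.
                row[index] = 1.0
    return matrix


# Hypothetical encoded documents with a vocabulary capped at 5 indices.
sequences = [[1, 3, 3], [0, 2, 7]]
for row in sequences_to_binary_matrix(sequences, nb_words=5):
    print(row)
```

Word order and counts are discarded, which is why a plain dense network with `input_shape=(max_words,)` is enough for this task.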
-215
View File
@@ -1,215 +0,0 @@
'''
We loop over words in a dataset, and for each word, we look at a context window around the word.
We generate pairs of (pivot_word, other_word_from_same_context) with label 1,
and pairs of (pivot_word, random_word) with label 0 (skip-gram method).
We use the layer WordContextProduct to learn embeddings for the word couples,
and compute a proximity score between the embeddings (= p(context|word)),
trained with our positive and negative labels.
We then use the weights computed by WordContextProduct to encode words
and demonstrate that the geometry of the embedding space
captures certain useful semantic properties.
Read more about skip-gram in this particularly gnomic paper by Mikolov et al.:
http://arxiv.org/pdf/1301.3781v3.pdf
Note: you should run this on GPU, otherwise training will be quite slow.
On an EC2 GPU instance, expect 3 hours per 10e6 comments (~10e8 words) per epoch with dim_proj=256.
Should be much faster on a modern GPU.
GPU command:
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python skipgram_word_embeddings.py
Dataset: 5,845,908 Hacker News comments.
Obtain the dataset at:
https://mega.co.nz/#F!YohlwD7R!wec0yNO86SeaNGIYQBOR0A
(HNCommentsAll.1perline.json.bz2)
'''
from __future__ import absolute_import
from __future__ import print_function
import numpy as np
import theano
import six.moves.cPickle
import os, re, json
from keras.preprocessing import sequence, text
from keras.optimizers import SGD, RMSprop, Adagrad
from keras.utils import np_utils, generic_utils
from keras.models import Sequential
from keras.layers.embeddings import WordContextProduct, Embedding
from six.moves import range
from six.moves import zip
max_features = 50000 # vocabulary size: top 50,000 most common words in data
skip_top = 100 # ignore top 100 most common words
nb_epoch = 1
dim_proj = 256 # embedding space dimension
save = True
load_model = False
load_tokenizer = False
train_model = True
save_dir = os.path.expanduser("~/.keras/models")
model_load_fname = "HN_skipgram_model.pkl"
model_save_fname = "HN_skipgram_model.pkl"
tokenizer_fname = "HN_tokenizer.pkl"
data_path = os.path.expanduser("~/")+"HNCommentsAll.1perline.json"
# text preprocessing utils
html_tags = re.compile(r'<.*?>')
to_replace = [('&#x27;', "'")]
hex_tags = re.compile(r'&.*?;')
def clean_comment(comment):
c = str(comment.encode("utf-8"))
c = html_tags.sub(' ', c)
for tag, char in to_replace:
c = c.replace(tag, char)
c = hex_tags.sub(' ', c)
return c
def text_generator(path=data_path):
f = open(path)
for i, l in enumerate(f):
comment_data = json.loads(l)
comment_text = comment_data["comment_text"]
comment_text = clean_comment(comment_text)
if i % 10000 == 0:
print(i)
yield comment_text
f.close()
# model management
if load_tokenizer:
print('Load tokenizer...')
tokenizer = six.moves.cPickle.load(open(os.path.join(save_dir, tokenizer_fname), 'rb'))
else:
print("Fit tokenizer...")
tokenizer = text.Tokenizer(nb_words=max_features)
tokenizer.fit_on_texts(text_generator())
if save:
print("Save tokenizer...")
if not os.path.exists(save_dir):
os.makedirs(save_dir)
six.moves.cPickle.dump(tokenizer, open(os.path.join(save_dir, tokenizer_fname), "wb"))
# training process
if train_model:
if load_model:
print('Load model...')
model = six.moves.cPickle.load(open(os.path.join(save_dir, model_load_fname), 'rb'))
else:
print('Build model...')
model = Sequential()
model.add(WordContextProduct(max_features, proj_dim=dim_proj, init="uniform"))
model.compile(loss='mse', optimizer='rmsprop')
sampling_table = sequence.make_sampling_table(max_features)
for e in range(nb_epoch):
print('-'*40)
print('Epoch', e)
print('-'*40)
progbar = generic_utils.Progbar(tokenizer.document_count)
samples_seen = 0
losses = []
for i, seq in enumerate(tokenizer.texts_to_sequences_generator(text_generator())):
# get skipgram couples for one text in the dataset
couples, labels = sequence.skipgrams(seq, max_features, window_size=4, negative_samples=1., sampling_table=sampling_table)
if couples:
# one gradient update per sentence (one sentence = a few 1000s of word couples)
X = np.array(couples, dtype="int32")
loss = model.train(X, labels)
losses.append(loss)
if len(losses) % 100 == 0:
progbar.update(i, values=[("loss", np.mean(losses))])
losses = []
samples_seen += len(labels)
print('Samples seen:', samples_seen)
print("Training completed!")
if save:
print("Saving model...")
if not os.path.exists(save_dir):
os.makedirs(save_dir)
six.moves.cPickle.dump(model, open(os.path.join(save_dir, model_save_fname), "wb"))
print("It's test time!")
# recover the embedding weights trained with skipgram:
weights = model.layers[0].get_weights()[0]
# we no longer need this
del model
weights[:skip_top] = np.zeros((skip_top, dim_proj))
norm_weights = np_utils.normalize(weights)
word_index = tokenizer.word_index
reverse_word_index = dict([(v, k) for k, v in list(word_index.items())])
word_index = tokenizer.word_index
def embed_word(w):
i = word_index.get(w)
if (not i) or (i<skip_top) or (i>=max_features):
return None
return norm_weights[i]
def closest_to_point(point, nb_closest=10):
proximities = np.dot(norm_weights, point)
tups = list(zip(list(range(len(proximities))), proximities))
tups.sort(key=lambda x: x[1], reverse=True)
return [(reverse_word_index.get(t[0]), t[1]) for t in tups[:nb_closest]]
def closest_to_word(w, nb_closest=10):
i = word_index.get(w)
if (not i) or (i<skip_top) or (i>=max_features):
return []
return closest_to_point(norm_weights[i].T, nb_closest)
''' the results in comments below were for:
5.8M HN comments
dim_proj = 256
nb_epoch = 2
optimizer = rmsprop
loss = mse
max_features = 50000
skip_top = 100
negative_samples = 1.
window_size = 4
and frequency subsampling of factor 10e-5.
'''
words = ["article", # post, story, hn, read, comments
"3", # 6, 4, 5, 2
"two", # three, few, several, each
"great", # love, nice, working, looking
"data", # information, memory, database
"money", # company, pay, customers, spend
"years", # ago, year, months, hours, week, days
"android", # ios, release, os, mobile, beta
"javascript", # js, css, compiler, library, jquery, ruby
"look", # looks, looking
"business", # industry, professional, customers
"company", # companies, startup, founders, startups
"after", # before, once, until
"own", # personal, our, having
"us", # united, country, american, tech, diversity, usa, china, sv
"using", # javascript, js, tools (lol)
"here", # hn, post, comments
]
for w in words:
res = closest_to_word(w)
print('====', w)
for r in res:
print(r)
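Note on the removed script: its docstring describes building (pivot, context) couples with label 1 and (pivot, random word) couples with label 0. For reference, that pairing scheme can be sketched as below. This is a simplified illustration with a hypothetical `skipgram_pairs` helper, not the `sequence.skipgrams` implementation: it omits the frequency-based sampling table and shuffling, and a random negative may occasionally coincide with a true context word.

```python
import random


def skipgram_pairs(sequence, vocabulary_size, window_size=2, seed=0):
    """Return (couples, labels): true context pairs (1) plus random pairs (0)."""
    rng = random.Random(seed)
    couples, labels = [], []
    for i, pivot in enumerate(sequence):
        start = max(0, i - window_size)
        end = min(len(sequence), i + window_size + 1)
        for j in range(start, end):
            if j != i:
                couples.append((pivot, sequence[j]))
                labels.append(1)
    # One negative sample per positive couple, reusing the same pivot word.
    nb_positive = len(couples)
    for k in range(nb_positive):
        pivot = couples[k][0]
        couples.append((pivot, rng.randint(1, vocabulary_size - 1)))
        labels.append(0)
    return couples, labels


couples, labels = skipgram_pairs([1, 2, 3], vocabulary_size=10, window_size=1)
for couple, label in zip(couples, labels):
    print(couple, label)
```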
+83
View File
@@ -0,0 +1,83 @@
'''Example script showing how to use stateful RNNs
to model long sequences efficiently.
'''
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, LSTM
# since we are using a stateful RNN, tsteps can be set to 1
tsteps = 1
batch_size = 25
epochs = 25
# number of elements ahead that are used to make the prediction
lahead = 1
def gen_cosine_amp(amp=100, period=1000, x0=0, xn=50000, step=1, k=0.0001):
"""Generates an absolute cosine time series with the amplitude
exponentially decreasing
Arguments:
amp: amplitude of the cosine function
period: period of the cosine function
x0: initial x of the time series
xn: final x of the time series
step: step of the time series discretization
k: exponential rate
"""
cos = np.zeros(((xn - x0) * step, 1, 1))
for i in range(len(cos)):
idx = x0 + i * step
cos[i, 0, 0] = amp * np.cos(2 * np.pi * idx / period)
cos[i, 0, 0] = cos[i, 0, 0] * np.exp(-k * idx)
return cos
print('Generating Data')
cos = gen_cosine_amp()
print('Input shape:', cos.shape)
expected_output = np.zeros((len(cos), 1))
for i in range(len(cos) - lahead):
expected_output[i, 0] = np.mean(cos[i + 1:i + lahead + 1])
print('Output shape')
print(expected_output.shape)
print('Creating Model')
model = Sequential()
model.add(LSTM(50,
batch_input_shape=(batch_size, tsteps, 1),
return_sequences=True,
stateful=True))
model.add(LSTM(50,
return_sequences=False,
stateful=True))
model.add(Dense(1))
model.compile(loss='mse', optimizer='rmsprop')
print('Training')
for i in range(epochs):
print('Epoch', i, '/', epochs)
model.fit(cos,
expected_output,
batch_size=batch_size,
verbose=1,
nb_epoch=1,
shuffle=False)
model.reset_states()
print('Predicting')
predicted_output = model.predict(cos, batch_size=batch_size)
print('Plotting Results')
plt.subplot(2, 1, 1)
plt.plot(expected_output)
plt.title('Expected')
plt.subplot(2, 1, 2)
plt.plot(predicted_output)
plt.title('Predicted')
plt.show()
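Note on the training targets: each target is the mean of the next `lahead` values of the series. A small standard-library sketch of that data preparation follows; the function names `damped_cosine` and `lookahead_targets` are hypothetical, and flat lists stand in for the `(samples, 1, 1)` arrays used in the script.

```python
import math


def damped_cosine(n, amp=100.0, period=1000.0, k=0.0001):
    """Cosine wave whose amplitude decays exponentially with the index."""
    return [amp * math.cos(2 * math.pi * i / period) * math.exp(-k * i)
            for i in range(n)]


def lookahead_targets(series, lahead):
    """Target i is the mean of the `lahead` values that follow position i."""
    targets = [0.0] * len(series)
    for i in range(len(series) - lahead):
        window = series[i + 1:i + lahead + 1]
        targets[i] = sum(window) / len(window)
    return targets


print(lookahead_targets([1.0, 2.0, 3.0, 4.0, 5.0], lahead=2))
print(damped_cosine(3))
```

As in the script, the last `lahead` targets are left at zero because there is nothing ahead of them to average.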
+101
View File
@@ -0,0 +1,101 @@
'''This script demonstrates how to build a variational autoencoder with Keras.
Reference: "Auto-Encoding Variational Bayes" https://arxiv.org/abs/1312.6114
'''
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
from keras.layers import Input, Dense, Lambda
from keras.models import Model
from keras import backend as K
from keras import objectives
from keras.datasets import mnist
batch_size = 100
original_dim = 784
latent_dim = 2
intermediate_dim = 256
nb_epoch = 50
epsilon_std = 1.0
x = Input(batch_shape=(batch_size, original_dim))
h = Dense(intermediate_dim, activation='relu')(x)
z_mean = Dense(latent_dim)(h)
z_log_var = Dense(latent_dim)(h)
def sampling(args):
z_mean, z_log_var = args
epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0.,
std=epsilon_std)
return z_mean + K.exp(z_log_var / 2) * epsilon
# note that "output_shape" isn't necessary with the TensorFlow backend
z = Lambda(sampling, output_shape=(latent_dim,))([z_mean, z_log_var])
# we instantiate these layers separately so as to reuse them later
decoder_h = Dense(intermediate_dim, activation='relu')
decoder_mean = Dense(original_dim, activation='sigmoid')
h_decoded = decoder_h(z)
x_decoded_mean = decoder_mean(h_decoded)
def vae_loss(x, x_decoded_mean):
xent_loss = original_dim * objectives.binary_crossentropy(x, x_decoded_mean)
kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
return xent_loss + kl_loss
vae = Model(x, x_decoded_mean)
vae.compile(optimizer='rmsprop', loss=vae_loss)
# train the VAE on MNIST digits
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
vae.fit(x_train, x_train,
shuffle=True,
nb_epoch=nb_epoch,
batch_size=batch_size,
validation_data=(x_test, x_test))
# build a model to project inputs on the latent space
encoder = Model(x, z_mean)
# display a 2D plot of the digit classes in the latent space
x_test_encoded = encoder.predict(x_test, batch_size=batch_size)
plt.figure(figsize=(6, 6))
plt.scatter(x_test_encoded[:, 0], x_test_encoded[:, 1], c=y_test)
plt.colorbar()
plt.show()
# build a digit generator that can sample from the learned distribution
decoder_input = Input(shape=(latent_dim,))
_h_decoded = decoder_h(decoder_input)
_x_decoded_mean = decoder_mean(_h_decoded)
generator = Model(decoder_input, _x_decoded_mean)
# display a 2D manifold of the digits
n = 15 # figure with 15x15 digits
digit_size = 28
figure = np.zeros((digit_size * n, digit_size * n))
# linearly spaced coordinates on the unit square were transformed through the inverse CDF (ppf) of the Gaussian
# to produce values of the latent variables z, since the prior of the latent space is Gaussian
grid_x = norm.ppf(np.linspace(0.05, 0.95, n))
grid_y = norm.ppf(np.linspace(0.05, 0.95, n))
for i, yi in enumerate(grid_x):
for j, xi in enumerate(grid_y):
z_sample = np.array([[xi, yi]])
x_decoded = generator.predict(z_sample)
digit = x_decoded[0].reshape(digit_size, digit_size)
figure[i * digit_size: (i + 1) * digit_size,
j * digit_size: (j + 1) * digit_size] = digit
plt.figure(figsize=(10, 10))
plt.imshow(figure, cmap='Greys_r')
plt.show()
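Note on `sampling` and `vae_loss`: the two pieces specific to a VAE are the reparameterized sample and the KL term. The sketch below reproduces both for a single latent vector using only the standard library, as an illustration rather than the backend code. The function names are hypothetical; the formulas follow the script, with `z_log_var` read as a log-variance.

```python
import math
import random


def sample_latent(z_mean, z_log_var, rng, epsilon_std=1.0):
    """Reparameterization: z = mean + std * eps, with std = exp(log_var / 2)."""
    return [m + math.exp(lv / 2.0) * rng.gauss(0.0, epsilon_std)
            for m, lv in zip(z_mean, z_log_var)]


def kl_to_standard_normal(z_mean, z_log_var):
    """KL divergence between N(mean, exp(log_var)) and N(0, 1), summed over dims."""
    return -0.5 * sum(1.0 + lv - m ** 2 - math.exp(lv)
                      for m, lv in zip(z_mean, z_log_var))


rng = random.Random(0)
print(sample_latent([0.0, 0.0], [0.0, 0.0], rng))
print(kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))  # zero: already the prior
print(kl_to_standard_normal([1.0, 0.0], [0.0, 0.0]))  # 0.5
```

The KL term vanishes when the encoder outputs the prior itself and grows as the mean moves away from zero, which is what pulls the latent codes toward the origin in the scatter plot.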
+173
View File
@@ -0,0 +1,173 @@
'''This script demonstrates how to build a variational autoencoder
with Keras and deconvolution layers.
Reference: "Auto-Encoding Variational Bayes" https://arxiv.org/abs/1312.6114
'''
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
from keras.layers import Input, Dense, Lambda, Flatten, Reshape
from keras.layers import Convolution2D, Deconvolution2D
from keras.models import Model
from keras import backend as K
from keras import objectives
from keras.datasets import mnist
# input image dimensions
img_rows, img_cols, img_chns = 28, 28, 1
# number of convolutional filters to use
nb_filters = 64
# convolution kernel size
nb_conv = 3
batch_size = 100
if K.image_dim_ordering() == 'th':
original_img_size = (img_chns, img_rows, img_cols)
else:
original_img_size = (img_rows, img_cols, img_chns)
latent_dim = 2
intermediate_dim = 128
epsilon_std = 1.0
nb_epoch = 5
x = Input(batch_shape=(batch_size,) + original_img_size)
conv_1 = Convolution2D(img_chns, 2, 2, border_mode='same', activation='relu')(x)
conv_2 = Convolution2D(nb_filters, 2, 2,
border_mode='same', activation='relu',
subsample=(2, 2))(conv_1)
conv_3 = Convolution2D(nb_filters, nb_conv, nb_conv,
border_mode='same', activation='relu',
subsample=(1, 1))(conv_2)
conv_4 = Convolution2D(nb_filters, nb_conv, nb_conv,
border_mode='same', activation='relu',
subsample=(1, 1))(conv_3)
flat = Flatten()(conv_4)
hidden = Dense(intermediate_dim, activation='relu')(flat)
z_mean = Dense(latent_dim)(hidden)
z_log_var = Dense(latent_dim)(hidden)
def sampling(args):
    z_mean, z_log_var = args
    epsilon = K.random_normal(shape=(batch_size, latent_dim),
                              mean=0., std=epsilon_std)
    # z_log_var is a log-variance, so the standard deviation is exp(z_log_var / 2)
    return z_mean + K.exp(z_log_var / 2) * epsilon
# note that "output_shape" isn't necessary with the TensorFlow backend
# so you could write `Lambda(sampling)([z_mean, z_log_var])`
z = Lambda(sampling, output_shape=(latent_dim,))([z_mean, z_log_var])
# we instantiate these layers separately so as to reuse them later
decoder_hid = Dense(intermediate_dim, activation='relu')
decoder_upsample = Dense(nb_filters * 14 * 14, activation='relu')
if K.image_dim_ordering() == 'th':
output_shape = (batch_size, nb_filters, 14, 14)
else:
output_shape = (batch_size, 14, 14, nb_filters)
decoder_reshape = Reshape(output_shape[1:])
decoder_deconv_1 = Deconvolution2D(nb_filters, nb_conv, nb_conv,
output_shape,
border_mode='same',
subsample=(1, 1),
activation='relu')
decoder_deconv_2 = Deconvolution2D(nb_filters, nb_conv, nb_conv,
output_shape,
border_mode='same',
subsample=(1, 1),
activation='relu')
if K.image_dim_ordering() == 'th':
output_shape = (batch_size, nb_filters, 29, 29)
else:
output_shape = (batch_size, 29, 29, nb_filters)
decoder_deconv_3_upsamp = Deconvolution2D(nb_filters, 2, 2,
output_shape,
border_mode='valid',
subsample=(2, 2),
activation='relu')
decoder_mean_squash = Convolution2D(img_chns, 2, 2,
border_mode='valid',
activation='sigmoid')
hid_decoded = decoder_hid(z)
up_decoded = decoder_upsample(hid_decoded)
reshape_decoded = decoder_reshape(up_decoded)
deconv_1_decoded = decoder_deconv_1(reshape_decoded)
deconv_2_decoded = decoder_deconv_2(deconv_1_decoded)
x_decoded_relu = decoder_deconv_3_upsamp(deconv_2_decoded)
x_decoded_mean_squash = decoder_mean_squash(x_decoded_relu)
def vae_loss(x, x_decoded_mean):
# NOTE: binary_crossentropy expects a batch_size by dim
# for x and x_decoded_mean, so we MUST flatten these!
x = K.flatten(x)
x_decoded_mean = K.flatten(x_decoded_mean)
xent_loss = img_rows * img_cols * objectives.binary_crossentropy(x, x_decoded_mean)
kl_loss = - 0.5 * K.mean(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
return xent_loss + kl_loss
vae = Model(x, x_decoded_mean_squash)
vae.compile(optimizer='rmsprop', loss=vae_loss)
vae.summary()
# train the VAE on MNIST digits
(x_train, _), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_train = x_train.reshape((x_train.shape[0],) + original_img_size)
x_test = x_test.astype('float32') / 255.
x_test = x_test.reshape((x_test.shape[0],) + original_img_size)
print('x_train.shape:', x_train.shape)
vae.fit(x_train, x_train,
shuffle=True,
nb_epoch=nb_epoch,
batch_size=batch_size,
validation_data=(x_test, x_test))
# build a model to project inputs on the latent space
encoder = Model(x, z_mean)
# display a 2D plot of the digit classes in the latent space
x_test_encoded = encoder.predict(x_test, batch_size=batch_size)
plt.figure(figsize=(6, 6))
plt.scatter(x_test_encoded[:, 0], x_test_encoded[:, 1], c=y_test)
plt.colorbar()
plt.show()
# build a digit generator that can sample from the learned distribution
decoder_input = Input(shape=(latent_dim,))
_hid_decoded = decoder_hid(decoder_input)
_up_decoded = decoder_upsample(_hid_decoded)
_reshape_decoded = decoder_reshape(_up_decoded)
_deconv_1_decoded = decoder_deconv_1(_reshape_decoded)
_deconv_2_decoded = decoder_deconv_2(_deconv_1_decoded)
_x_decoded_relu = decoder_deconv_3_upsamp(_deconv_2_decoded)
_x_decoded_mean_squash = decoder_mean_squash(_x_decoded_relu)
generator = Model(decoder_input, _x_decoded_mean_squash)
# display a 2D manifold of the digits
n = 15 # figure with 15x15 digits
digit_size = 28
figure = np.zeros((digit_size * n, digit_size * n))
# linearly spaced coordinates on the unit square were transformed through the inverse CDF (ppf) of the Gaussian
# to produce values of the latent variables z, since the prior of the latent space is Gaussian
grid_x = norm.ppf(np.linspace(0.05, 0.95, n))
grid_y = norm.ppf(np.linspace(0.05, 0.95, n))
for i, yi in enumerate(grid_x):
for j, xi in enumerate(grid_y):
z_sample = np.array([[xi, yi]])
z_sample = np.tile(z_sample, batch_size).reshape(batch_size, 2)
x_decoded = generator.predict(z_sample, batch_size=batch_size)
digit = x_decoded[0].reshape(digit_size, digit_size)
figure[i * digit_size: (i + 1) * digit_size,
j * digit_size: (j + 1) * digit_size] = digit
plt.figure(figsize=(10, 10))
plt.imshow(figure, cmap='Greys_r')
plt.show()
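The grid construction in the example above maps evenly spaced probabilities through the Gaussian inverse CDF, so the sampled latent points follow the prior instead of being uniformly spaced. The following is a minimal sketch of that idea only. It swaps in the standard library's `statistics.NormalDist.inv_cdf` for `scipy.stats.norm.ppf`, and `latent_grid` is a hypothetical helper name that does not appear in the example.

```python
from statistics import NormalDist


def latent_grid(n, lo=0.05, hi=0.95):
    """Return n latent coordinates at evenly spaced quantiles of N(0, 1).

    Hypothetical helper: NormalDist.inv_cdf stands in for scipy's norm.ppf.
    Requires n >= 2.
    """
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    step = (hi - lo) / (n - 1)
    # Evenly spaced probabilities become unevenly spaced z values:
    # dense near 0, sparse in the tails.
    return [std_normal.inv_cdf(lo + i * step) for i in range(n)]


grid = latent_grid(15)
print(grid[0], grid[7], grid[-1])
```

The middle coordinate sits at the median of the prior (approximately 0), and the end points are symmetric around it.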
+18
@@ -0,0 +1,18 @@
from __future__ import absolute_import
from . import backend
from . import datasets
from . import engine
from . import layers
from . import preprocessing
from . import utils
from . import wrappers
from . import callbacks
from . import constraints
from . import initializations
from . import metrics
from . import models
from . import objectives
from . import optimizers
from . import regularizers
__version__ = '1.1.2'
+37 -16
@@ -1,34 +1,55 @@
from __future__ import absolute_import
import theano
import theano.tensor as T
import types
from . import backend as K
from .utils.generic_utils import get_from_module
def softmax(x):
return T.nnet.softmax(x)
ndim = K.ndim(x)
if ndim == 2:
return K.softmax(x)
elif ndim == 3:
e = K.exp(x - K.max(x, axis=-1, keepdims=True))
s = K.sum(e, axis=-1, keepdims=True)
return e / s
else:
raise ValueError('Cannot apply softmax to a tensor '
'that is not 2D or 3D. '
'Here, ndim=' + str(ndim))
def elu(x, alpha=1.0):
return K.elu(x, alpha)
def time_distributed_softmax(x):
xshape = x.shape
X = x.reshape((xshape[0] * xshape[1], xshape[2]))
return T.nnet.softmax(X).reshape(xshape)
def softplus(x):
return T.nnet.softplus(x)
return K.softplus(x)
def softsign(x):
return K.softsign(x)
def relu(x, alpha=0., max_value=None):
return K.relu(x, alpha=alpha, max_value=max_value)
def relu(x):
return (x + abs(x)) / 2.0
def tanh(x):
return T.tanh(x)
return K.tanh(x)
def sigmoid(x):
return T.nnet.sigmoid(x)
return K.sigmoid(x)
def hard_sigmoid(x):
return T.nnet.hard_sigmoid(x)
return K.hard_sigmoid(x)
def linear(x):
return x
from .utils.generic_utils import get_from_module
def get(identifier):
return get_from_module(identifier, globals(), 'activation function')
if identifier is None:
return linear
return get_from_module(identifier, globals(), 'activation function')
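The 3D branch of `softmax` in this diff subtracts the per-row maximum before exponentiating, which keeps `exp` from overflowing without changing the result. The sketch below illustrates that trick on plain Python lists. It is an illustration under that assumption, not the backend implementation, and `softmax_last_axis` is a hypothetical name.

```python
import math


def softmax_last_axis(rows):
    """Apply a numerically stable softmax to each row independently."""
    out = []
    for row in rows:
        m = max(row)
        # Shifting by the row maximum leaves the ratios unchanged
        # but keeps every exponent <= 0.
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


probs = softmax_last_axis([[1.0, 2.0, 3.0], [1000.0, 1000.0, 1000.0]])
print(probs)
```

Without the shift, `math.exp(1000.0)` would raise `OverflowError`; with it, the second row comes out as a uniform distribution.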
+5
@@ -0,0 +1,5 @@
from .vgg16 import VGG16
from .vgg19 import VGG19
from .resnet50 import ResNet50
from .inception_v3 import InceptionV3
from .xception import Xception
+86
@@ -0,0 +1,86 @@
import numpy as np
from .. import backend as K
TAGS = ['rock', 'pop', 'alternative', 'indie', 'electronic',
'female vocalists', 'dance', '00s', 'alternative rock', 'jazz',
'beautiful', 'metal', 'chillout', 'male vocalists',
'classic rock', 'soul', 'indie rock', 'Mellow', 'electronica',
'80s', 'folk', '90s', 'chill', 'instrumental', 'punk',
'oldies', 'blues', 'hard rock', 'ambient', 'acoustic',
'experimental', 'female vocalist', 'guitar', 'Hip-Hop',
'70s', 'party', 'country', 'easy listening',
'sexy', 'catchy', 'funk', 'electro', 'heavy metal',
'Progressive rock', '60s', 'rnb', 'indie pop',
'sad', 'House', 'happy']
def librosa_exists():
try:
__import__('librosa')
except ImportError:
return False
else:
return True
def preprocess_input(audio_path, dim_ordering='default'):
'''Reads an audio file and outputs a Mel-spectrogram.
'''
if dim_ordering == 'default':
dim_ordering = K.image_dim_ordering()
assert dim_ordering in {'tf', 'th'}
if librosa_exists():
import librosa
else:
raise RuntimeError('Librosa is required to process audio files.\n' +
'Install it via `pip install librosa` \nor visit ' +
'http://librosa.github.io/librosa/ for details.')
# mel-spectrogram parameters
SR = 12000
N_FFT = 512
N_MELS = 96
HOP_LEN = 256
DURA = 29.12
src, sr = librosa.load(audio_path, sr=SR)
n_sample = src.shape[0]
n_sample_wanted = int(DURA * SR)
# zero-pad short signals; keep the center segment of long ones
if n_sample < n_sample_wanted: # if too short
src = np.hstack((src, np.zeros((n_sample_wanted - n_sample,))))
elif n_sample > n_sample_wanted: # if too long
# integer division: slice indices must be ints (true division yields floats on Python 3)
src = src[(n_sample - n_sample_wanted) // 2:
(n_sample + n_sample_wanted) // 2]
logam = librosa.logamplitude
melgram = librosa.feature.melspectrogram
x = logam(melgram(y=src, sr=SR, hop_length=HOP_LEN,
n_fft=N_FFT, n_mels=N_MELS) ** 2,
ref_power=1.0)
if dim_ordering == 'th':
x = np.expand_dims(x, axis=0)
elif dim_ordering == 'tf':
x = np.expand_dims(x, axis=3)
return x
def decode_predictions(preds, top_n=5):
'''Decode the output of a music tagger model.
# Arguments
preds: 2-dimensional numpy array
top_n: integer in [0, 50], number of items to show
'''
assert len(preds.shape) == 2 and preds.shape[1] == 50
results = []
for pred in preds:
result = zip(TAGS, pred)
result = sorted(result, key=lambda x: x[1], reverse=True)
results.append(result[:top_n])
return results
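`preprocess_input` above forces every clip to a fixed number of samples: short signals are zero-padded at the end and long ones are cut down to their center segment. Here is a minimal sketch of that length-fitting step on a plain list, assuming integer (floor) division for the slice bounds. `fit_length` is a hypothetical helper and is not part of the module.

```python
def fit_length(samples, wanted):
    """Zero-pad or center-crop a sequence to exactly `wanted` items."""
    n = len(samples)
    if n < wanted:
        # Too short: append zeros at the end.
        return list(samples) + [0.0] * (wanted - n)
    if n > wanted:
        # Too long: keep the middle `wanted` items.
        start = (n - wanted) // 2
        return list(samples[start:start + wanted])
    return list(samples)


print(fit_length([1, 2, 3], 5))
print(fit_length(list(range(10)), 4))
```

Because `n - wanted` and `n + wanted` always share the same parity, slicing from `(n - wanted) // 2` to `(n + wanted) // 2` yields the same `wanted` items as `start:start + wanted`.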
+51
@@ -0,0 +1,51 @@
import numpy as np
import json
from ..utils.data_utils import get_file
from .. import backend as K
CLASS_INDEX = None
CLASS_INDEX_PATH = 'https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json'
def preprocess_input(x, dim_ordering='default'):
if dim_ordering == 'default':
dim_ordering = K.image_dim_ordering()
assert dim_ordering in {'tf', 'th'}
if dim_ordering == 'th':
# 'RGB'->'BGR'
x = x[:, ::-1, :, :]
# Zero-center by mean pixel
x[:, 0, :, :] -= 103.939
x[:, 1, :, :] -= 116.779
x[:, 2, :, :] -= 123.68
else:
# 'RGB'->'BGR'
x = x[:, :, :, ::-1]
# Zero-center by mean pixel
x[:, :, :, 0] -= 103.939
x[:, :, :, 1] -= 116.779
x[:, :, :, 2] -= 123.68
return x
def decode_predictions(preds, top=5):
global CLASS_INDEX
if len(preds.shape) != 2 or preds.shape[1] != 1000:
raise ValueError('`decode_predictions` expects '
'a batch of predictions '
'(i.e. a 2D array of shape (samples, 1000)). '
'Found array with shape: ' + str(preds.shape))
if CLASS_INDEX is None:
fpath = get_file('imagenet_class_index.json',
CLASS_INDEX_PATH,
cache_subdir='models')
with open(fpath) as f:
CLASS_INDEX = json.load(f)
results = []
for pred in preds:
top_indices = pred.argsort()[-top:][::-1]
result = [tuple(CLASS_INDEX[str(i)]) + (pred[i],) for i in top_indices]
result.sort(key=lambda x: x[2], reverse=True)
results.append(result)
return results
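The image `preprocess_input` above does two things per pixel: it reverses the channel order from RGB to BGR, then subtracts a fixed per-channel mean. The sketch below applies the same two steps to a single pixel so the order of operations is easy to check. The mean values are the ones in the diff; `preprocess_pixel` and `BGR_MEANS` are hypothetical names used for illustration only.

```python
# Per-channel means in BGR order, as used in the diff above.
BGR_MEANS = (103.939, 116.779, 123.68)


def preprocess_pixel(rgb):
    """Convert one (R, G, B) pixel to zero-centered (B, G, R)."""
    bgr = rgb[::-1]  # 'RGB' -> 'BGR'
    return tuple(c - m for c, m in zip(bgr, BGR_MEANS))


# A pixel equal to the mean color maps to the origin.
print(preprocess_pixel((123.68, 116.779, 103.939)))
```

Note that the means are subtracted after the flip, so index 0 of the mean tuple applies to the blue channel.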
+312
@@ -0,0 +1,312 @@
# -*- coding: utf-8 -*-
'''Inception V3 model for Keras.
Note that the ImageNet weights provided are from a model that had not fully converged.
Inception v3 should be able to reach 6.9% top-5 error, but our model
only gets to 7.8% (same as a fully-converged ResNet 50).
For comparison, VGG16 only gets to 9.9%, quite a bit worse.
Also note that the input image size for this model differs from that of
the VGG16 and ResNet models (299x299 instead of 224x224), and that the input preprocessing function
is also different (it is the same as for Xception).
# Reference:
- [Rethinking the Inception Architecture for Computer Vision](http://arxiv.org/abs/1512.00567)
'''
from __future__ import print_function
from __future__ import absolute_import
import warnings
from ..models import Model
from ..layers import Flatten, Dense, Input, BatchNormalization, merge
from ..layers import Convolution2D, MaxPooling2D, AveragePooling2D
from ..utils.layer_utils import convert_all_kernels_in_model
from ..utils.data_utils import get_file
from .. import backend as K
from .imagenet_utils import decode_predictions
TH_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/inception_v3_weights_th_dim_ordering_th_kernels.h5'
TF_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/inception_v3_weights_tf_dim_ordering_tf_kernels.h5'
TH_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/inception_v3_weights_th_dim_ordering_th_kernels_notop.h5'
TF_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5'
def conv2d_bn(x, nb_filter, nb_row, nb_col,
border_mode='same', subsample=(1, 1),
name=None):
'''Utility function to apply conv + BN.
'''
if name is not None:
bn_name = name + '_bn'
conv_name = name + '_conv'
else:
bn_name = None
conv_name = None
if K.image_dim_ordering() == 'th':
bn_axis = 1
else:
bn_axis = 3
x = Convolution2D(nb_filter, nb_row, nb_col,
subsample=subsample,
activation='relu',
border_mode=border_mode,
name=conv_name)(x)
x = BatchNormalization(axis=bn_axis, name=bn_name)(x)
return x
def InceptionV3(include_top=True, weights='imagenet',
input_tensor=None):
'''Instantiate the Inception v3 architecture,
optionally loading weights pre-trained
on ImageNet. Note that when using TensorFlow,
for best performance you should set
`image_dim_ordering="tf"` in your Keras config
at ~/.keras/keras.json.
The model and the weights are compatible with both
TensorFlow and Theano. The dimension ordering
convention used by the model is the one
specified in your Keras config file.
Note that the default input image size for this model is 299x299.
# Arguments
include_top: whether to include the fully-connected
layer at the top of the network.
weights: one of `None` (random initialization)
or "imagenet" (pre-training on ImageNet).
input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
to use as image input for the model.
# Returns
A Keras model instance.
'''
if weights not in {'imagenet', None}:
raise ValueError('The `weights` argument should be either '
'`None` (random initialization) or `imagenet` '
'(pre-training on ImageNet).')
# Determine proper input shape
if K.image_dim_ordering() == 'th':
if include_top:
input_shape = (3, 299, 299)
else:
input_shape = (3, None, None)
else:
if include_top:
input_shape = (299, 299, 3)
else:
input_shape = (None, None, 3)
if input_tensor is None:
img_input = Input(shape=input_shape)
else:
if not K.is_keras_tensor(input_tensor):
img_input = Input(tensor=input_tensor, shape=input_shape)
else:
img_input = input_tensor
if K.image_dim_ordering() == 'th':
channel_axis = 1
else:
channel_axis = 3
x = conv2d_bn(img_input, 32, 3, 3, subsample=(2, 2), border_mode='valid')
x = conv2d_bn(x, 32, 3, 3, border_mode='valid')
x = conv2d_bn(x, 64, 3, 3)
x = MaxPooling2D((3, 3), strides=(2, 2))(x)
x = conv2d_bn(x, 80, 1, 1, border_mode='valid')
x = conv2d_bn(x, 192, 3, 3, border_mode='valid')
x = MaxPooling2D((3, 3), strides=(2, 2))(x)
# mixed 0, 1, 2: 35 x 35 x 256
for i in range(3):
branch1x1 = conv2d_bn(x, 64, 1, 1)
branch5x5 = conv2d_bn(x, 48, 1, 1)
branch5x5 = conv2d_bn(branch5x5, 64, 5, 5)
branch3x3dbl = conv2d_bn(x, 64, 1, 1)
branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
branch_pool = AveragePooling2D(
(3, 3), strides=(1, 1), border_mode='same')(x)
branch_pool = conv2d_bn(branch_pool, 32, 1, 1)
x = merge([branch1x1, branch5x5, branch3x3dbl, branch_pool],
mode='concat', concat_axis=channel_axis,
name='mixed' + str(i))
# mixed 3: 17 x 17 x 768
branch3x3 = conv2d_bn(x, 384, 3, 3, subsample=(2, 2), border_mode='valid')
branch3x3dbl = conv2d_bn(x, 64, 1, 1)
branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3,
subsample=(2, 2), border_mode='valid')
branch_pool = MaxPooling2D((3, 3), strides=(2, 2))(x)
x = merge([branch3x3, branch3x3dbl, branch_pool],
mode='concat', concat_axis=channel_axis,
name='mixed3')
# mixed 4: 17 x 17 x 768
branch1x1 = conv2d_bn(x, 192, 1, 1)
branch7x7 = conv2d_bn(x, 128, 1, 1)
branch7x7 = conv2d_bn(branch7x7, 128, 1, 7)
branch7x7 = conv2d_bn(branch7x7, 192, 7, 1)
branch7x7dbl = conv2d_bn(x, 128, 1, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 128, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 128, 1, 7)
branch7x7dbl = conv2d_bn(branch7x7dbl, 128, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 1, 7)
branch_pool = AveragePooling2D((3, 3), strides=(1, 1), border_mode='same')(x)
branch_pool = conv2d_bn(branch_pool, 192, 1, 1)
x = merge([branch1x1, branch7x7, branch7x7dbl, branch_pool],
mode='concat', concat_axis=channel_axis,
name='mixed4')
# mixed 5, 6: 17 x 17 x 768
for i in range(2):
branch1x1 = conv2d_bn(x, 192, 1, 1)
branch7x7 = conv2d_bn(x, 160, 1, 1)
branch7x7 = conv2d_bn(branch7x7, 160, 1, 7)
branch7x7 = conv2d_bn(branch7x7, 192, 7, 1)
branch7x7dbl = conv2d_bn(x, 160, 1, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 160, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 160, 1, 7)
branch7x7dbl = conv2d_bn(branch7x7dbl, 160, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 1, 7)
branch_pool = AveragePooling2D(
(3, 3), strides=(1, 1), border_mode='same')(x)
branch_pool = conv2d_bn(branch_pool, 192, 1, 1)
x = merge([branch1x1, branch7x7, branch7x7dbl, branch_pool],
mode='concat', concat_axis=channel_axis,
name='mixed' + str(5 + i))
# mixed 7: 17 x 17 x 768
branch1x1 = conv2d_bn(x, 192, 1, 1)
branch7x7 = conv2d_bn(x, 192, 1, 1)
branch7x7 = conv2d_bn(branch7x7, 192, 1, 7)
branch7x7 = conv2d_bn(branch7x7, 192, 7, 1)
branch7x7dbl = conv2d_bn(x, 160, 1, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 1, 7)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 1, 7)
branch_pool = AveragePooling2D((3, 3), strides=(1, 1), border_mode='same')(x)
branch_pool = conv2d_bn(branch_pool, 192, 1, 1)
x = merge([branch1x1, branch7x7, branch7x7dbl, branch_pool],
mode='concat', concat_axis=channel_axis,
name='mixed7')
# mixed 8: 8 x 8 x 1280
branch3x3 = conv2d_bn(x, 192, 1, 1)
branch3x3 = conv2d_bn(branch3x3, 320, 3, 3,
subsample=(2, 2), border_mode='valid')
branch7x7x3 = conv2d_bn(x, 192, 1, 1)
branch7x7x3 = conv2d_bn(branch7x7x3, 192, 1, 7)
branch7x7x3 = conv2d_bn(branch7x7x3, 192, 7, 1)
branch7x7x3 = conv2d_bn(branch7x7x3, 192, 3, 3,
subsample=(2, 2), border_mode='valid')
branch_pool = AveragePooling2D((3, 3), strides=(2, 2))(x)
x = merge([branch3x3, branch7x7x3, branch_pool],
mode='concat', concat_axis=channel_axis,
name='mixed8')
# mixed 9, 10: 8 x 8 x 2048
for i in range(2):
branch1x1 = conv2d_bn(x, 320, 1, 1)
branch3x3 = conv2d_bn(x, 384, 1, 1)
branch3x3_1 = conv2d_bn(branch3x3, 384, 1, 3)
branch3x3_2 = conv2d_bn(branch3x3, 384, 3, 1)
branch3x3 = merge([branch3x3_1, branch3x3_2],
mode='concat', concat_axis=channel_axis,
name='mixed9_' + str(i))
branch3x3dbl = conv2d_bn(x, 448, 1, 1)
branch3x3dbl = conv2d_bn(branch3x3dbl, 384, 3, 3)
branch3x3dbl_1 = conv2d_bn(branch3x3dbl, 384, 1, 3)
branch3x3dbl_2 = conv2d_bn(branch3x3dbl, 384, 3, 1)
branch3x3dbl = merge([branch3x3dbl_1, branch3x3dbl_2],
mode='concat', concat_axis=channel_axis)
branch_pool = AveragePooling2D(
(3, 3), strides=(1, 1), border_mode='same')(x)
branch_pool = conv2d_bn(branch_pool, 192, 1, 1)
x = merge([branch1x1, branch3x3, branch3x3dbl, branch_pool],
mode='concat', concat_axis=channel_axis,
name='mixed' + str(9 + i))
if include_top:
# Classification block
x = AveragePooling2D((8, 8), strides=(8, 8), name='avg_pool')(x)
x = Flatten(name='flatten')(x)
x = Dense(1000, activation='softmax', name='predictions')(x)
# Create model
model = Model(img_input, x)
# load weights
if weights == 'imagenet':
if K.image_dim_ordering() == 'th':
if include_top:
weights_path = get_file('inception_v3_weights_th_dim_ordering_th_kernels.h5',
TH_WEIGHTS_PATH,
cache_subdir='models',
md5_hash='b3baf3070cc4bf476d43a2ea61b0ca5f')
else:
weights_path = get_file('inception_v3_weights_th_dim_ordering_th_kernels_notop.h5',
TH_WEIGHTS_PATH_NO_TOP,
cache_subdir='models',
md5_hash='79aaa90ab4372b4593ba3df64e142f05')
model.load_weights(weights_path)
if K.backend() == 'tensorflow':
warnings.warn('You are using the TensorFlow backend, yet you '
'are using the Theano '
'image dimension ordering convention '
'(`image_dim_ordering="th"`). '
'For best performance, set '
'`image_dim_ordering="tf"` in '
'your Keras config '
'at ~/.keras/keras.json.')
convert_all_kernels_in_model(model)
else:
if include_top:
weights_path = get_file('inception_v3_weights_tf_dim_ordering_tf_kernels.h5',
TF_WEIGHTS_PATH,
cache_subdir='models',
md5_hash='fe114b3ff2ea4bf891e9353d1bbfb32f')
else:
weights_path = get_file('inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5',
TF_WEIGHTS_PATH_NO_TOP,
cache_subdir='models',
md5_hash='2f3609166de1d967d1a481094754f691')
model.load_weights(weights_path)
if K.backend() == 'theano':
convert_all_kernels_in_model(model)
return model
def preprocess_input(x):
x /= 255.
x -= 0.5
x *= 2.
return x
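The spatial sizes quoted in the `mixed` comments above (35 x 35, 17 x 17, 8 x 8) follow from the 299 x 299 input by simple arithmetic. The sketch below traces them, assuming the pooling layers without an explicit `border_mode` use unpadded ('valid') windows and that 'same' convolutions with stride 1 keep the size. `valid_out` and `stem_output_size` are hypothetical helpers, not part of the model file.

```python
def valid_out(size, kernel, stride=1):
    """Output length of an unpadded ('valid') convolution or pooling window."""
    return (size - kernel) // stride + 1


def stem_output_size(size=299):
    """Trace the spatial size through the layers before the first mixed block."""
    size = valid_out(size, 3, 2)  # 3x3 conv, stride 2, valid
    size = valid_out(size, 3)     # 3x3 conv, valid
    # The 3x3 'same' conv with stride 1 leaves the size unchanged.
    size = valid_out(size, 3, 2)  # 3x3 max pool, stride 2
    size = valid_out(size, 1)     # 1x1 conv, valid
    size = valid_out(size, 3)     # 3x3 conv, valid
    size = valid_out(size, 3, 2)  # 3x3 max pool, stride 2
    return size


mixed0_size = stem_output_size()            # input to mixed 0, 1, 2
mixed3_size = valid_out(mixed0_size, 3, 2)  # stride-2 reduction in mixed 3
mixed8_size = valid_out(mixed3_size, 3, 2)  # stride-2 reduction in mixed 8
print(mixed0_size, mixed3_size, mixed8_size)
```

This prints `35 17 8`, which matches the sizes in the block comments and explains the 8 x 8 average pooling in the classification block.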

Some files were not shown because too many files were changed in this diff.