Comparar commits

...

1194 Commits

Autor SHA1 Mensagem Data
Francois Chollet f78d417f94 Update test deps. 2017-03-14 08:33:44 -07:00
Kosuke Kusano a18f9f5755 fix character type in loading weights. (#5762) 2017-03-14 08:31:59 -07:00
Francois Chollet 7f58b6fbe7 Update docs. 2017-03-13 20:14:58 -07:00
Francois Chollet fe8e8fd43f Merge branch 'keras-2' of github.com:fchollet/keras into keras-2 2017-03-13 14:15:39 -07:00
Daniel Seichter 966ec311d1 Fix learning_phase int check (#5749) 2017-03-13 13:51:28 -07:00
Francois Chollet 1f87cff493 Merge branch 'master' into keras-2 2017-03-13 12:30:52 -07:00
Francois Chollet e826c55981 Merge branch 'master' into keras-2 2017-03-13 12:28:23 -07:00
Francois Chollet 1b0c056975 Add docstring. 2017-03-13 12:18:02 -07:00
Francois Chollet 4c56f32624 Merge branch 'keras-2' of github.com:fchollet/keras into keras-2 2017-03-13 11:08:07 -07:00
Francois Chollet 859ffebfcc Improve docs 2017-03-13 11:07:52 -07:00
sadreamer b377edf974 Added stack 'axis' argument (#5743) 2017-03-13 10:17:57 -07:00
Francois Chollet 397e3336d1 Merge branch 'keras-2' of github.com:fchollet/keras into keras-2 2017-03-13 10:16:37 -07:00
Francois Chollet 7841b7fcd7 Add missing losses.md file. 2017-03-13 10:15:48 -07:00
Rizky Luthfianto 85a61ba92b [ci skip] Update Keras version in writing-your-own-keras-layers.md (#5741) 2017-03-12 23:34:15 -07:00
Francois Chollet 37e8234bd1 Merge branch 'keras-2' of github.com:fchollet/keras into keras-2 2017-03-12 19:43:47 -07:00
Francois Chollet 640d1db4e9 Update docs. 2017-03-12 19:43:29 -07:00
Arpit Agarwal 3916e62666 merge layer reference and initializers name corrected (#5736)
* merge layer reference and initializers name corrected

* Initializations changed to Initializers
2017-03-12 17:42:16 -07:00
Alexandre Kirszenberg 92d1cf8599 Missing parentheses in activations.md (#5735) 2017-03-12 17:18:32 -07:00
Francois Chollet e9222523ba Update docs 2017-03-12 16:13:57 -07:00
Francois Chollet 53c0faa553 Update some warning messages. 2017-03-12 15:01:32 -07:00
Francois Chollet 9655493978 More compatibility fixes. 2017-03-12 14:44:12 -07:00
Francois Chollet ac1afb3e5f Datasets legacy compatiblity. 2017-03-12 14:36:49 -07:00
Francois Chollet d704bea3b2 Fixes and improvements. 2017-03-12 14:12:43 -07:00
Francois Chollet 151b5f9778 Legacy conversion improvements 2017-03-12 13:35:25 -07:00
Abhai Kollara Dilip f420d89864 Legacy interfaces for fit_generator, evaluate_generator, predict_generator (#5725)
* Added legacy interfaces for fit_generator, evaluate_generator, predict_generator

* Added tests + PEP8 fix

* Import numpy directly

* PEP8 Fix

* PEP8 Fix
2017-03-12 12:46:00 -07:00
Yorwba b43caf7b49 Add merge mode 'max' where it was missing (fixes #3486) (#5729) 2017-03-12 11:52:23 -07:00
Fariz Rahman 7e071cd7df Bug fix : Model.from_config (#5730) 2017-03-12 11:28:15 -07:00
Francois Chollet 6e289d7186 Fix same issue for convlstm 2017-03-12 01:55:40 -08:00
Francois Chollet 7374442a1c Fix ridiculous LSTM bias initialization issue. 2017-03-12 01:55:40 -08:00
Francois Chollet 98af4eb39f Slight refactoring 2017-03-12 01:55:40 -08:00
Fred Schroeder 11107c5f2c Add SpatialDropout/Lambda API conversion interfaces. (#5726) 2017-03-12 01:51:28 -08:00
Fariz Rahman 28c208deab typo fix (#5727) 2017-03-12 01:51:05 -08:00
Francois Chollet 2082bafd18 Fix recurrent layers issue 2017-03-11 23:23:12 -08:00
Francois Chollet a00eef21c9 Fix LSTM weight loading (temp) 2017-03-11 22:11:33 -08:00
Francois Chollet d04d05442c Fix legacy recurrent layer API conversion 2017-03-11 21:25:31 -08:00
Francois Chollet d5a0737a01 Fix BN 2017-03-11 21:18:54 -08:00
Francois Chollet e71cbccc2a Fix legacy weight loading issue 2017-03-11 20:53:48 -08:00
Francois Chollet fa4aba7a9b Draft version of legacy weight converter 2017-03-11 20:43:03 -08:00
Francois Chollet 8b8ffe0ea4 Fix issue with conv3d conversion interface. 2017-03-11 19:44:44 -08:00
Francois Chollet aa826d684d Finish updating examples. 2017-03-11 19:44:29 -08:00
Fred Schroeder 84711475f8 Api conversions/zeropadding cropping (#5723)
* Add ZeroPadding2/3D and Cropping2/3D API conversion interfaces.

* Add ZeroPadding2/3D and Cropping2/3D API conversion interfaces.
2017-03-11 18:43:31 -08:00
Francois Chollet 7696a13995 Fix issues in layer conversion interfaces. 2017-03-11 18:11:59 -08:00
Francois Chollet fbc8697366 Add support for legacy atrous layers. 2017-03-11 17:21:55 -08:00
Francois Chollet a783ceae2a Finish up layer conversion interfaces. 2017-03-11 17:02:10 -08:00
Francois Chollet 21bf90cbf5 Fix bug in convlstm. 2017-03-11 16:12:53 -08:00
Francois Chollet 214a54d40e Finish converting conv layers. 2017-03-11 16:02:42 -08:00
Francois Chollet 16feb385e5 Add conversion interfaces for more conv layers. 2017-03-11 15:49:22 -08:00
Francois Chollet d51fc7659e Update a couple examples. 2017-03-11 15:03:43 -08:00
Francois Chollet 109b11016a Conversion interface for conv2d layer. 2017-03-11 15:03:43 -08:00
Francois Chollet 0c7b9d57c1 Support for legacy constraints 2017-03-11 15:03:43 -08:00
Fred Schroeder 4cf4aa1efe Add UpSampling*D API conversion interface. (#5719)
* Add UpSampling*D API conversion interface. Also modified generate_legacy_interface to allow value conversions in positional args.

* Removed positional keyword_conversion modification
2017-03-11 14:24:30 -08:00
Francois Chollet 070037d449 Merge branch 'keras-2' of github.com:fchollet/keras into keras-2 2017-03-11 13:47:11 -08:00
Francois Chollet ee04a7c77a Merge branch 'Joshua-Chin-specify-state' into keras-2 2017-03-11 13:46:48 -08:00
Francois Chollet da21c15180 Simplify setting initial_state in RNNs, allow numpy values for reset_states. 2017-03-11 13:46:22 -08:00
Francois Chollet 6fef95a9fd Merge branch 'specify-state' of https://github.com/Joshua-Chin/keras into Joshua-Chin-specify-state 2017-03-11 13:01:29 -08:00
Rizky Luthfianto 7f1cdbfef6 Add API conversion interface for Embedding layer. (#5702)
* Add API conversion interface for Embedding layer.

* Fix warning message.
2017-03-11 12:58:12 -08:00
piper e723c3baf1 Add API conversion interface for both Pooling3D and Global Pooling (#5707)
* Add API conversion interface for both AvgPooling3D and MaxPooling3D

* Add API conversion interface for 'Global pooling'
2017-03-11 12:55:15 -08:00
Abhai Kollara Dilip d12d171553 Fix BatchNormalization (#5709)
* Batch norm fix

* Undo refactoring
2017-03-11 12:53:43 -08:00
leereeves d3f4397797 Fix typos in docs/templates/datasets.md (#5706)
* Fix typos in docs/templates/datasets.md

* Update docs.

* Update docs.
2017-03-11 12:53:16 -08:00
jarfo fd9acf6d73 Do not assume outputs.dtype is equal to inputs.dtype in rnn() (tensorflow_backend.py) (#5715)
* Update tensorflow_backend.py

* Add TimeDistributed tests of Dense and Embedding layers with batch_input_size
2017-03-11 12:08:30 -08:00
piper 81aa60a7cf Fixes value of 'pool_size' argument of unit test about 'legacy_pooling2d_support' (#5701)
* Fixs value of 'pool_size' argument of unit test about 'legacy_pooling2d_support'

* change layer name in unit test for 'legacy_pooling2d_support'
2017-03-10 23:35:38 -08:00
Francois Chollet 118027fcae Fix BN issues. 2017-03-10 23:09:14 -08:00
Joshua Chin 16343b3261 Pass initial_state as a keyword argument. 2017-03-10 23:45:20 -05:00
Francois Chollet eaa827584f Update Inception V3 application. 2017-03-10 20:37:56 -08:00
Francois Chollet 045283bc68 Merge branch 'jihobak-pool2d-api-conversion-interface' into keras-2 2017-03-10 13:12:25 -08:00
Francois Chollet 93cf06c4c6 Fixes in legacy interfaces for 2d pooling layers. 2017-03-10 13:12:11 -08:00
Francois Chollet 54d9a5184d Merge branch 'pool2d-api-conversion-interface' of https://github.com/jihobak/keras into jihobak-pool2d-api-conversion-interface 2017-03-10 13:00:31 -08:00
Francois Chollet 2fc0d063d1 PEP8 fixes. 2017-03-10 12:01:23 -08:00
piper 195cbdfb3f fix unit test about 'legacy_pooling2d_support' 2017-03-11 04:46:49 +09:00
piper 43c0c82501 change conversion list for 'legacy_pooling2d_support' 2017-03-11 04:24:29 +09:00
piper ea55cda984 Merge branch 'keras-2' of https://github.com/fchollet/keras into pool2d-api-conversion-interface 2017-03-11 04:20:22 +09:00
Francois Chollet 6f4a86407d Handle fact that TF bias_add doesn’t support NCHW yet 2017-03-10 10:15:48 -08:00
Abhai Kollara Dilip 7e6ccb8a48 Legacy API support of LSTM, GRU, SimpleRNN (#5688)
* Legacy API interface for SimpleRNN, GRU, LSTM

* Test for recurrent layer legacy interfaces

* Fixed import

* Fixed issues with test and recurrent layers

* Preprocessor arguement for legacy generator + LSTM kwarg conversion fix

* Warning message fix
2017-03-10 10:07:00 -08:00
Fariz Rahman 152d896a77 Shape inference for dot (#5675)
* Shape inference for dot

* Remove dependency on numpy

* Check for scalar x (ndim=0)
2017-03-10 10:02:05 -08:00
Fariz Rahman ea7847328f sum -> add (#5698) 2017-03-10 10:01:47 -08:00
Francois Chollet 89e1ce5fa1 Merge branch 'keras-2' of github.com:fchollet/keras into keras-2 2017-03-10 09:26:38 -08:00
Francois Chollet b2a992e08f Use bias_add where possible, for efficiency in TF. 2017-03-10 09:26:24 -08:00
Fred Schroeder 2ad58e1dcc Add GaussianDropout API conversion interface. (#5689) 2017-03-10 08:14:08 -08:00
piper 6cc11224fb fix typo in unit test 2017-03-10 15:21:23 +09:00
piper a9549eb632 Merge branch 'keras-2' of https://github.com/fchollet/keras into pool2d-api-conversion-interface 2017-03-10 15:17:52 +09:00
Fred Schroeder e645a18e83 Add GaussianNoise API conversion interface. (#5686) 2017-03-09 22:05:49 -08:00
piper 4b75473690 change 'list(value_conversions.keys())' to 'value_conversions') 2017-03-10 13:28:01 +09:00
piper 8697cd9bcd Merge branch 'keras-2' of https://github.com/fchollet/keras into pool2d-api-conversion-interface 2017-03-10 13:23:56 +09:00
Joseph Jin-Chuan Tang 9f7ed932f1 Fix the missing link to TF installation (#5685) 2017-03-09 20:16:12 -08:00
Fred Schroeder 78eedad77a Add PReLU API conversion interface. (#5681)
* Add PReLU API conversion interface.

* Removed  argument from prelu legacy support test.
2017-03-09 20:12:44 -08:00
piper 76b4db7767 change list of tuples to dic 2017-03-10 12:54:46 +09:00
piper 080aa7a6ad Add API conversion interface for both AvgPooling2D and MaxPooling2D 2017-03-10 11:46:11 +09:00
piper 0744a25778 update 'generate_legacy_interface' to deal with change of conversions of argument value 2017-03-10 11:38:55 +09:00
Angelos Katharopoulos 5374cec3c5 Added dtype to map_fn (#5658)
* Add a dtype paramater to the map_fn backend function

* Update the map test to include the dtype parameter

Also update foldl and foldr to use variables for future proofing.
2017-03-09 15:27:59 -08:00
Francois Chollet c6c9373954 Abstract legacy interface support. 2017-03-09 14:53:59 -08:00
Rizky Luthfianto 74cf6df153 Fix the *Pooling1D layer conversion interface (#5677) 2017-03-09 10:36:53 -08:00
Rizky Luthfianto 42b18506c5 Adapt the layer conversion interface for both AvgPooling1D and MaxPooling1D (#5674) 2017-03-09 08:51:40 -08:00
Abhai Kollara Dilip 46a3f9443d Shape inference for tile and gather (#5635)
* Added tile shape inference

* Added shape inference for gather

* Added test for gather

* Fixed test_gather

* PEP8 fix

* Fixed gather test
2017-03-08 19:58:57 -08:00
piper 8f6d12f457 Add API conversion interface for MaxPooling1D layer (#5667) 2017-03-08 19:57:58 -08:00
Michael Oliver 6419d52543 GaussianDropout Bugfix (#5665)
Noise should be multiplicative for GaussianDropout
2017-03-08 17:24:42 -08:00
piper 7207c78e10 Fixed argument for 'convert_legacy_kwrags' (#5642) 2017-03-07 18:47:23 -08:00
Francois Chollet 19bef655af More abstraction in conversion interfaces. 2017-03-07 15:40:22 -08:00
Abhai Kollara Dilip 1606dc6a11 Add API conversion interface for Dropout layer (#5629)
* Add API conversion interface for Dropout layer

* Fixed API-conversion for Dropout

* Fixed warning message

* Added another test case

* Fixed warning message

* Whitespace fix
2017-03-07 15:21:14 -08:00
Ariel Rokem 0b2c044d48 Add documentation link in sequential model. (#5636) 2017-03-07 14:14:27 -08:00
Francois Chollet e91dc42842 lesser -> less 2017-03-07 13:35:53 -08:00
Francois Chollet adb15756ff Make deserialization case-insensitive for built-in optimizers. 2017-03-06 18:43:39 -08:00
Abhai Kollara Dilip e99eac292d Shape inference for Theano backend (#5618)
* Added shape inference for flatten

* Added shape inference for repeat_elements

* Added shape inference + minute optimisation for batch_flatten

* Added shape inference for squeeze

* Added shape inference for permute_dimensions

* Added shape inference for repeat

* Added shape inference for expand_dims

* Added shape inference for spatial_2d_padding

* Removind #TODO tag + whitespace fix

* Added shape inference for transpose

* Added shape inference for batch_dot

* Minor fix in batch_dot shape inference

* PEP8 Fix

* Shape inference tests to check_single_tensor_operation and check_double_tensor_operation

* PEP8 Fix + Added test for tile

* Fixed flatten shape inference

* Fixed squeeze shape inference

* Added batch_flatten to test_shape_operations
2017-03-06 16:52:55 -08:00
Frédéric Bastien 3b660145a7 Disable Theano profiler for function used only once. (#5622) 2017-03-06 13:31:49 -08:00
Francois Chollet 5516c8fb42 Add API conversion interface for Dense layer. 2017-03-06 12:36:23 -08:00
Francois Chollet f9202817f3 Fix activity regularization in RNNs. 2017-03-05 13:56:34 -08:00
Francois Chollet d72514d6cc Fix activity_regularizer in convlstm. 2017-03-04 17:46:09 -08:00
Francois Chollet 5f94aef668 Simplify tests. 2017-03-04 17:00:01 -08:00
Joshua Chin 5dbb6121cd Added test_specify_state. 2017-03-04 16:27:21 -05:00
Joshua Chin e45bce14b7 Fixed statefulness 2017-03-04 16:27:03 -05:00
Davy Song 77ea18e8f7 Fix typo
* spelling mistake trining -> training

* Update reuters.py

* Update imdb.py
2017-03-03 17:46:14 -08:00
Hiroya Chiba 65ce238f03 skip newsgroup header (#5585) 2017-03-03 17:45:54 -08:00
Joshua Chin a214f4e64d specify initial states symbolically 2017-03-03 20:14:22 -05:00
Francois Chollet 5b0967a08f Add resnet50 test. 2017-03-03 15:05:42 -08:00
Francois Chollet 136f1b4292 Add short aliases for global pooling layers. 2017-03-03 11:51:54 -08:00
Abhai Kollara Dilip ada7aa12bf Added deconv length inference for full padding (#5580) 2017-03-02 12:31:22 -08:00
Daniel Høyer Iversen 7da568d59f io utils: raise indexError for invalid key (#5560) 2017-03-01 13:01:22 -08:00
Paul Fitzpatrick 2b10c68980 make pydot optional (#5567)
Currently, `import keras` will fail if pydot is not installed
(for example, in a fresh virtualenv without the `keras[visualize]`
option).  This commit delays the check for pydot until it is
actually used, in line with the treatment of a PIL.Image dependency
in keras/preprocessing/image.py
2017-03-01 13:01:08 -08:00
Francois Chollet 5d0cb10949 Make utils globally importable & update examples. 2017-02-28 14:41:30 -08:00
Francois Chollet 0e9ac3dae0 visualize_util -> vis_utils 2017-02-28 14:41:07 -08:00
Spotlight0xff 4e50446279 Updated VAE examples (MLP and ConvNet) to the new API (#5552)
* update VAE examples (MLP and ConvNet) to the new API

* renamed objectives to metrics for xent_loss
* std -> stddev for random_normal
* adjusted arguments to Conv2D/Deconv2D
* fixed typo (filterss -> filters)

* change to conv2dtranspose and tuple strides
2017-02-28 10:21:05 -08:00
Abhai Kollara Dilip 50d83f7ab8 Fixed PReLU for Theano backend (#5550) 2017-02-28 09:09:44 -08:00
Abhai Kollara Dilip 91bab91f3c Added negative axes conversion (#5554) 2017-02-28 09:05:20 -08:00
Francois Chollet fbe7873fc0 sum -> add 2017-02-28 09:04:00 -08:00
Francois Chollet 8d34c7ed3c Fix typo. 2017-02-28 09:00:35 -08:00
Hannah Vivian Shaw eec61d9d49 Update several examples to work with the new API (#5548)
* Update mnist_transfer_cnn for new API

* Update mnist_siamese_graph.py for new API

* Refactor example a little bit for clarity

* Update mnist_irnn.py for new API

* Fix variable name

* Update mnist_heirarchial_rnn.py for new api

* Fix a few api calls i missed

* Update mnist_acgan.py for new API

* Fix variable name

* Update imdb_cnn for new API

* Update benchmark.py to work with new API

* PEP8 fix

* Change filter_length to kernel_size

* Update imdb_cnn_lstm.py for new API

* PEP8 indentation fix
2017-02-27 18:53:41 -08:00
Fariz Rahman 38a6dae44a Multiply -> Product (#5546)
* Multiply -> Product

* Update merge_test.py

* Update merge.py

* Update merge_test.py

* Update merge.py

* Update merge.py
2017-02-27 18:51:43 -08:00
Abhai Kollara Dilip 6d8cf2fd6e Added error message + PEP8 Fix (#5525)
* Added error messages

* Fixed PEP8 Issue : Extra whitespace

* Minor changes
2017-02-26 21:31:17 -08:00
fchollet 4b5984b8ce Relax optimizer tests. 2017-02-26 20:36:05 -08:00
fchollet 9683998b37 Fix PEP8 error. 2017-02-26 20:35:55 -08:00
Achal Shah ad21188a13 Theano fix for saving models (#5497)
backend support for saving models

clean code

simplify backend check

simplify backend check bug fix
2017-02-26 20:12:06 -08:00
Abhai Kollara Dilip 8b2f06de49 Fixed issues with conv2d in Theano backend (#5528)
* Added padding test case for AveragePooling1D

* Fixed issues with pool2d

* Minor change

* Minor change
2017-02-26 20:10:59 -08:00
Francois Chollet a853789795 Bug fix. 2017-02-26 15:19:17 -08:00
Bas Veeling 4487510c90 Causal Dilated Convolutions (#5489)
* Fixes temporal_padding bug and test_utils input_data param.

* Added 'causal' convolutions padding option to Conv1D.
2017-02-26 15:18:44 -08:00
Francois Chollet 4992707d57 Add comment. 2017-02-26 14:44:46 -08:00
Francois Chollet 7f5be01da9 Style fixes. 2017-02-26 14:33:04 -08:00
Francois Chollet c8282437a7 Remove unused import. 2017-02-26 13:52:06 -08:00
Francois Chollet 7ba07aa8ce Fix docstring. 2017-02-26 13:20:29 -08:00
Francois Chollet 579cc22cda Fixes. 2017-02-26 12:43:19 -08:00
Francois Chollet b666ef18a1 Update layer utils 2017-02-26 12:07:40 -08:00
Francois Chollet 4101b5fdb2 Remove legacy support code from TF backend. 2017-02-26 11:19:21 -08:00
Francois Chollet 173a40ddc1 Update setup.py. 2017-02-26 11:19:02 -08:00
Francois Chollet cad4e4e8cc Better coverage of TB callback test. 2017-02-26 11:18:50 -08:00
Francois Chollet a164addd8d Add names in optimizer variables. 2017-02-26 11:18:31 -08:00
Francois Chollet f623b2ab57 Rm legacy TF support from TensorBoard callback. 2017-02-25 11:29:03 -08:00
Francois Chollet 1193427fd3 Bug fix. 2017-02-25 11:03:21 -08:00
Francois Chollet 1ac23c2556 Merge branch 'keras-2' of github.com:fchollet/keras into keras-2 2017-02-25 10:39:32 -08:00
Francois Chollet f516cc6dde Fix applications. 2017-02-25 10:39:08 -08:00
Francois Chollet e1a283e9d1 Add backend info in saved models. 2017-02-25 10:38:44 -08:00
Francois Chollet 99fa71a916 Remove music tagger application. 2017-02-25 10:38:16 -08:00
Achal Shah 53b9f9c440 Updated num_classes to nb_class (#5498) 2017-02-24 20:34:44 -08:00
Francois Chollet 3cfd306c76 Fix some style issues. 2017-02-24 13:37:51 -08:00
Francois Chollet 6cb1fe9490 Fix typos. 2017-02-24 11:03:51 -08:00
Francois Chollet 825d58e2bb Fix initializers. 2017-02-24 10:46:07 -08:00
Francois Chollet c81713367e Refactor locally connected layer code. 2017-02-23 17:49:05 -08:00
Francois Chollet 0c8e6319cf Remove some legacy features. 2017-02-23 13:53:54 -08:00
Francois Chollet 5ae158e8b0 Update fit_generator APIs. 2017-02-23 12:46:00 -08:00
Francois Chollet e90b0713f2 Improve ability to fit models w/o external data. 2017-02-23 11:27:54 -08:00
Francois Chollet 5566c9db1d PEP8 fixes. 2017-02-23 10:50:55 -08:00
Francois Chollet 8fb8accb05 Minimal refactoring of compile. 2017-02-23 10:47:11 -08:00
Francois Chollet 547640587a Bugfix with sparse loss compat validation checks. 2017-02-22 21:13:53 -08:00
Francois Chollet d046cea1a1 Allows fitting models without input data, targets. 2017-02-22 21:05:32 -08:00
Francois Chollet 0f308d4b72 Final TF docstrings touch-ups. 2017-02-22 16:03:57 -08:00
Francois Chollet 8e25f8b1c2 Further TF backend docstring fixes. 2017-02-22 15:57:08 -08:00
Francois Chollet bf1d29c8c3 Further docstring fixes in TF backend. 2017-02-22 15:48:53 -08:00
Francois Chollet 8d317c9867 Backend fixes. 2017-02-22 15:21:33 -08:00
Francois Chollet c81f9447c7 Further TF backend docstring fixes. 2017-02-22 14:30:41 -08:00
Francois Chollet aeb22266c9 Backend style fixes. 2017-02-22 14:12:38 -08:00
Francois Chollet 0599ade6da Docstring fixes. 2017-02-22 13:53:12 -08:00
Francois Chollet db22fdf9af Further fixes and improvements. 2017-02-22 13:46:25 -08:00
Francois Chollet 9e25cb7d13 Small docstring fixes. 2017-02-22 12:57:50 -08:00
Francois Chollet cb567d27ea Small fix. 2017-02-22 12:51:33 -08:00
Francois Chollet c15d869775 Small docstring fixes. 2017-02-22 12:49:53 -08:00
Francois Chollet 70e5cd1620 Further fixes and improvements. 2017-02-22 12:44:53 -08:00
Francois Chollet 309f13ad8c Further docstring fixes. 2017-02-22 12:28:08 -08:00
Francois Chollet a277dabc14 Further docstring fixes. 2017-02-22 12:17:25 -08:00
Francois Chollet b361e30146 Further style improvements. 2017-02-22 12:07:26 -08:00
Francois Chollet 2ab5295cee Further docstring fixes. 2017-02-22 11:39:12 -08:00
Francois Chollet cb3e82051f Further docstring fixes. 2017-02-22 11:23:42 -08:00
Francois Chollet 5ef7ce3f07 Further fixes and improvements. 2017-02-22 10:45:57 -08:00
Francois Chollet 4f735c4b0d Further docstring fixes. 2017-02-22 10:27:11 -08:00
Francois Chollet 4828bde0ce Further docstring fixes. 2017-02-22 10:13:30 -08:00
Francois Chollet b03b2b6140 Further docstring fixes. 2017-02-21 20:15:14 -08:00
fchollet dab228bc56 Docstring fixes. 2017-02-21 19:56:08 -08:00
Francois Chollet 197fb886cb Add missing legacy support file. 2017-02-21 19:51:24 -08:00
Francois Chollet b7116b991d Correct imports 2017-02-21 19:51:08 -08:00
Francois Chollet 79192f8358 Further fixes. 2017-02-21 16:58:41 -08:00
Francois Chollet 13f271c7d6 Add unit test for custom objects scope. 2017-02-21 16:41:38 -08:00
Francois Chollet 1422baaa8d Style fixes. 2017-02-21 16:41:27 -08:00
Francois Chollet ac2a7254c2 Remove useless statement in Dense. 2017-02-21 16:41:17 -08:00
Francois Chollet 26e6df8a98 Remove legacy code in TF backend. 2017-02-21 16:41:07 -08:00
Francois Chollet ac4b365e0b Further style fixes. 2017-02-21 15:46:44 -08:00
Francois Chollet 00dc75116f Add missing docstrings in image preprocessing. 2017-02-21 15:31:58 -08:00
Pokey Rule 7e7d2ed1f9 Fix deprecated scoring function (#5452)
`'log_loss'` is deprecated; replaced by `'neg_log_loss'`
2017-02-21 15:07:20 -08:00
Vasilis Vryniotis 3cde2c1f8a Speed up load_image method. (#5468) 2017-02-21 15:06:18 -08:00
Francois Chollet 10012ae9c6 Further style fixes. 2017-02-21 15:04:57 -08:00
Francois Chollet 97db7e2666 Further style fixes. 2017-02-21 14:45:25 -08:00
Francois Chollet 3a57f184de Style fixes in applications. 2017-02-21 14:39:46 -08:00
Francois Chollet fb21b1bad8 Further docstring fixes. 2017-02-21 14:30:25 -08:00
Francois Chollet 59e67fd049 Add some docstrings in Optimizers. 2017-02-21 14:22:13 -08:00
Francois Chollet 67b43e3b0d Further style fixes. 2017-02-21 14:22:05 -08:00
Francois Chollet b257dad10c Add docstrings to merge layers. 2017-02-21 14:07:06 -08:00
Francois Chollet 7dc8a32f89 Style fixes. 2017-02-21 14:06:56 -08:00
Francois Chollet 701355a720 Further style fixes. 2017-02-21 13:28:30 -08:00
Francois Chollet 871809bb85 Further style fixes. 2017-02-21 13:14:10 -08:00
Francois Chollet 2c0809c125 Further style fixes. 2017-02-21 12:54:40 -08:00
Francois Chollet af0d18f616 Make sure metrics are properly masked. 2017-02-21 11:48:28 -08:00
Francois Chollet 16e8b93d38 Remove filter dilation from SeparableConv2D. 2017-02-20 20:38:59 -08:00
Francois Chollet f46ccff056 Work on legacy compatibility. 2017-02-20 20:35:11 -08:00
Francois Chollet 1bfa665eed Add some unit tests. 2017-02-20 18:37:55 -08:00
Francois Chollet 0564b940ec Fix up initializers, constraints. 2017-02-20 16:27:59 -08:00
Francois Chollet 5bb86288e9 Remove data_format from initializers. 2017-02-20 16:20:49 -08:00
Francois Chollet ebe84eb3a1 layer_from_config -> layers.deserialize 2017-02-20 16:02:43 -08:00
Francois Chollet b95cf4a1b1 Improve legacy Sequential model support. 2017-02-20 15:29:19 -08:00
Francois Chollet 451d74c56d Various fixes and improvements. 2017-02-19 22:42:39 -08:00
Francois Chollet 55e0347667 Update Travis config. 2017-02-19 22:41:38 -08:00
fchollet 803e2869c7 Update a number of example scripts. 2017-02-19 19:24:32 -08:00
Francois Chollet e092fa8b57 Refactor print_summary. 2017-02-19 17:13:31 -08:00
Francois Chollet f0e0527591 Convert applications to new API. 2017-02-19 17:13:22 -08:00
Vasilis Vryniotis a1c4bc0c96 Speed up preprocessing by avoiding unnecessary random transforms. (#5440) 2017-02-19 13:20:42 -08:00
Francois Chollet a60ee821a2 Add Boston housing dataset. 2017-02-18 19:18:54 -08:00
Francois Chollet b0500764a8 Move datasets to npz/json. 2017-02-18 19:18:29 -08:00
Francois Chollet 82f9e2358c Further fixes and improvements. 2017-02-17 16:32:37 -08:00
Francois Chollet e1a4aea4be Add backwards pass test to layer test suite 2017-02-17 13:31:40 -08:00
Francois Chollet 45a10bc6d7 Done with convolutional recurrent layer 2017-02-17 13:31:22 -08:00
Francois Chollet 10c76237ab Bug fixes 2017-02-17 13:09:04 -08:00
Francois Chollet d663fda862 Fix up layers unit tests with TF. 2017-02-16 15:26:58 -08:00
Francois Chollet 36ac91f057 Progress on unit tests 2017-02-15 17:16:36 -08:00
Francois Chollet 18cce177a8 Fix Theano learning phase. 2017-02-15 12:51:13 -08:00
Gijs van Tulder 2562172adc Fix shuffling of output_shape in Theano conv2d_transpose. (#5382) 2017-02-15 12:44:24 -08:00
Francois Chollet 10bfb1c565 Integration tests passing. 2017-02-14 16:08:30 -08:00
Francois Chollet d282871393 API switch for conv_recurrent layers. 2017-02-13 21:10:01 -08:00
Francois Chollet 03a7eb89e2 Advance API switch (BN; RNNs). 2017-02-13 16:55:19 -08:00
Francois Chollet a856451243 Merge branch 'keras-2' of https://github.com/fchollet/keras into keras-2 2017-02-12 13:07:51 -08:00
Francois Chollet bc92fb32c0 Unbundle Merge layer. 2017-02-12 13:07:31 -08:00
fchollet e1a5fbf7f6 Fixes in conv layers. 2017-02-10 19:45:30 -08:00
fchollet dc11aa99c0 Make utils importable by absolute path 2017-02-10 18:16:25 -08:00
fchollet ca23406974 Update 2 mnist examples 2017-02-10 18:16:11 -08:00
Francois Chollet 011c1faeb4 Learning_phase refactoring. 2017-02-10 17:04:30 -08:00
Francois Chollet b118cef26f Misc progress 2017-02-10 15:43:50 -08:00
Francois Chollet 46649e5d97 Theano conv fixes. 2017-02-10 14:02:18 -08:00
Francois Chollet 4fa7e5d454 Prepare PyPI release. 2017-02-10 08:59:58 -08:00
Francois Chollet 6710396aca Update pooling layers API 2017-02-09 20:41:11 -08:00
Francois Chollet 3fe89b4f7b Fixes in conv layers. 2017-02-09 19:00:03 -08:00
Francois Chollet 023331ec2a Progress on early draft of Keras 2 2017-02-09 17:16:25 -08:00
Francois Chollet 97f327317f fix merge conflicts 2017-02-08 11:04:30 -08:00
jialing3 14f35ab055 Update audio_conv_utils.py (#5301)
integer division for python3
2017-02-07 14:40:34 -08:00
SnowGushiGit 9777b51ee2 fix typo for load array (#5315) 2017-02-07 14:40:13 -08:00
t-vi 434545a11f restore tensorflow 1.0rc1 compatibility (#5317) 2017-02-07 14:39:47 -08:00
Fabian-Robert Stöter e58f0be8f0 Simplify unit test requirements installation (#4942)
* pep8

* revert changes

* add shape inference to Reshape()

* first attempt to simplify test requirements

* change contrib manual

* missing comma in setup
2017-02-06 18:02:28 -08:00
Pat York f85695cb7b FAQ updates (#5281)
* FAQ updates

* spelling/grammar

* more detail about a Batch

* Update FAQ
2017-02-06 15:29:45 -08:00
Francois Chollet 66f2613416 Refactor theano backend conv postprocessing 2017-02-06 14:46:22 -08:00
Francois Chollet 793232fe76 Increase default logging period 5x 2017-02-06 14:28:03 -08:00
karimpedia 3ad7463b60 Correction of an error phrase for a better feedback. (#5215) 2017-02-06 14:10:55 -08:00
Eric Xihui Lin 5ea9f5bdd1 fixed naming of bidirectional layer when loaded from saved model/config (#4883) 2017-02-06 14:06:11 -08:00
Shusen Liu 757b3ed1b0 fix mnist_sklearn_wrapper.py error when default image_dim_ordering=tf (#5130)
* fix mnist_sklearn_wrapper.py error when default image_dim_ordering=tf

as title, mnist_sklearn_wrapper.py assumes image_dim_ordering to be "th"
                                                                                                                                                                                                                      Test Plan:                                                                                                                                                                                                        KERAS_BACKEND=theano condapython examples/mnist_sklearn_wrapper.py                                                                                                                                                KERAS_BACKEND=tensorflow condapython examples/mnist_sklearn_wrapper.py

* 2nd commit

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Blame Revision:
2017-02-06 14:04:53 -08:00
Francois Chollet c13f890972 Merge branch 'master' of https://github.com/fchollet/keras 2017-02-06 14:04:01 -08:00
Cristian Gratie 99ee2fb09a add MinMaxNorm constraint (#5139)
* constrains norm inside an interval
* includes rate parameter to control how the constraint is enforced
2017-02-06 13:50:28 -08:00
Francois Chollet d009ac8fba Normalize imports in layers/local.py. 2017-02-06 13:42:44 -08:00
Francois Chollet 5c384a1bca Make img preprocessing use the default float type. 2017-02-06 13:42:33 -08:00
Francois Chollet 0f450fe265 Merge branch 'master' of https://github.com/fchollet/keras 2017-02-06 13:33:50 -08:00
Francois Chollet c2a7c69f9d Update doc re: layer reconstruction 2017-02-06 13:30:40 -08:00
Gijs van Tulder 676e227b47 Convolution3D: do not fix input_shape. (Fixes #5108.) (#5177) 2017-02-06 13:16:32 -08:00
t-vi 1de4bf1b59 achieve compatibility with tensorflow 1.0rc1 (#5296) 2017-02-06 13:04:13 -08:00
Todd Gardner 7016e8f1d9 Always evaluate callables if learning phase is set, solves #5268 (#5269) 2017-02-06 12:25:09 -08:00
Yves Raimond 0f4fec30f0 Handling ndim == 2 in TF batch_dot (#5280)
* Handling ndim == 2 in TF batch_dot

* Adding support for swapped axes when ndim==2
2017-02-06 12:22:04 -08:00
Olexa Bilaniuk ff1f796032 Fixed duplicate line in keras/engine/training.py. (#5266)
The following occurs twice in different places, so far harmlessly but
still redundantly:

self.optimizer = optimizers.get(optimizer)
2017-02-06 11:21:59 -08:00
Minkoo Seo f1a95869eb Fix off by one error in WE example script
Tokenizer returns sequence values in the range of [0, nb_words). In this
example, MAX_NB_WORDS is 20000 and the data's min value is 19999. There
is no need to use 'nb_words + 1'.
2017-02-06 10:53:33 -08:00
Luke Yeager 4cd3d284e9 Fix link to TF install docs in README (#5256) 2017-02-02 12:40:18 -08:00
Minkoo Seo 0f4be6d17b Style fixes in example script 'addition_rnn'.
Adds more explanation. Make the code fit 80 column and intuitive.
2017-01-31 11:57:28 -08:00
Francois Chollet 6ce428511a Merge branch 'master' into keras-2 2017-01-30 14:27:41 -08:00
skwbc ab3b93e8dd fix: CSVLogger raises ValueError when it is reused. (#5202)
Reuse of `CSVLogger` object raises `ValueError: I/O operation on closed file.`
because in `on_train_end` method `self.csv_file` is closed
but `self.writer` is not reset to `None`.
2017-01-27 16:03:55 -08:00
Michael Oliver 3d400116b9 change abs to K.abs (#5200)
Remove ambiguity of `abs` call
2017-01-26 22:03:49 -08:00
Francois Chollet f41d5f021a PEP8 fix. 2017-01-26 15:32:33 -08:00
Charles-Emmanuel 7174a09d3c Correct cast function spec formatting (#5190) 2017-01-26 13:26:17 -08:00
Burak Benligiray 180fa47123 fixed typo in documentation: 'BNP' -> 'BMP' (#5194) 2017-01-26 13:21:58 -08:00
Ben 1585b8dd4e Fix documentation of K.rnn unroll parameter (#5192) 2017-01-26 12:11:58 -08:00
Francois Chollet 200b193282 fix merge conflicts 2017-01-24 11:44:53 -08:00
David Schneider c07d0e6448 improve error message for malformed validation arguments in fit (#5141)
* improve error message for incorrect validation arguments

* make formatting consistent per comments in #5141
2017-01-24 10:38:59 -08:00
Max Horn aabda13a10 Bugfix CSVLogger separator and append behaviour, added tests (#5143) 2017-01-24 09:43:57 -08:00
Mohanson dcb9fac577 Bugfix: Fix the expression of pictures (#5153)
The expression of  pictures should be (img_height, img_width, 3) or (3, img_height, img_width), not (img_width, img_height, 3) or (3, img_width, img_height).
2017-01-24 09:39:25 -08:00
François Chollet 2b674827c3 Update Travis tests to Python 3.5. (#5150) 2017-01-24 09:38:00 -08:00
fchollet a5a775b79f Merge branch 'master' of ssh://github.com/fchollet/keras 2017-01-23 21:36:32 -08:00
fchollet 9736056a60 Fix issue in mnist siamese graph example. 2017-01-23 21:36:07 -08:00
Francois Chollet 8a53df6338 Update "data_format" renaming in backend docstrings. 2017-01-23 16:46:33 -08:00
Francois Chollet 6aa5730ad5 Merge branch 'master' into keras-2 2017-01-23 16:43:29 -08:00
Keunwoo Choi d739b3c2cd add raises in tf_backend docstrings` (#5144) 2017-01-23 15:16:21 -08:00
Yves Raimond c751f81d0d Checking that ndim is >= 3 for batch_dot in TensorFlow backend (#5132)
* Checking that ndim is >= 3 for TF batch_dot

* Checking error is thrown if ndim < 3 in batch_dot for TF

* Integration test raising error when ndim < 3 on TF

* Using TF backend during shape test

* Fix style issue
2017-01-23 15:12:36 -08:00
Pat York d675907654 Unbroadcast initial states in Theano backend (RNNs) (#5126)
* Unbroadcast initial states in Theano backend (RNNs)

* Propagate fix to Masking=Ture conditional branch

* Fix case of no states
2017-01-23 15:07:42 -08:00
Botty Dimanov 72f1ce4ed4 Fix Sckit-learn API get_parameters bug (#5121)
Add **params to BaseWrapper.get_params(self,_) to fix TypeError
2017-01-22 13:11:58 -08:00
reynoldscem 5d38b04415 Add reference to slides for RMSprop (#5120)
* Add reference to slides for RMSprop

I thought as the docs cite other methods, it would be good to provide a citation for this optimiser. Hinton mentions to 'keep citing the slides' for this method. 

I chose the title of the lecture in question, if there's a better title I'm sure that can be used instead, but I think the citation should be there.

* Whitespace violation
2017-01-21 13:03:46 -08:00
Ben d7e0621ed3 Fix custom_objects for regularizers and other issues (#5012)
* Fix custom_objects for regularizers and other issues

* Add custom_object_scope

* Update optimizers and Lambda to respect global_custom_objects

* Add generic_utils to mkdocs and add tests for custom_object_scope

* Fix elif statement in optimizers.py

* Clean up generic_utils.py docstrings
2017-01-21 13:03:15 -08:00
Arun Das c57d1a3219 Update audio_conv_utils.py (#5111)
Corrected a typo in line 65. This was giving `TypeError: mel() got an unexpected keyword argument 'hop_lengthgth'` error. Please verify and merge.
Thanks.
2017-01-20 16:47:51 -08:00
Nicholas Cullen d87148c56b fix deconv2d None error (#5093) 2017-01-20 11:00:32 -08:00
Rajath Kumar M P 15338cc3da TimeDistributedDense deprecated, doc change (#5097) 2017-01-20 10:45:45 -08:00
Francois Chollet 262e5751f4 Prepare 1.2.1 PyPI release. 2017-01-19 15:58:53 -08:00
Francois Chollet e153e560a1 PEP8 fix. 2017-01-19 13:43:47 -08:00
Yann Henon 445aecdeb7 change rounding mode in theano backend to match tensorflow backend (#5089)
* changed rounding mode in theano to match tensorflow and updated docs

* pep8 fix

* Style fix in docstring.
2017-01-19 12:57:29 -08:00
Luke Yeager 57429d1567 Cleanup syntax and teardown for multithreading (#5069)
* Cleanup syntax and teardown for multithreading

* Test error-handling in multithreaded generators

* Update docstrings as per reviewer's comments
2017-01-19 12:54:21 -08:00
Francois Chollet fff781cf15 fix merge conflict 2017-01-18 15:17:26 -08:00
Zhen Wang 8f8d97e615 Small fix to array_to_img function (#5070)
Small fix to array_to_img function, it won't change the value of x when set scale=True now.
2017-01-18 14:42:13 -08:00
Francois Chollet c243f39ce5 Handle Stack/pack TF API change in TF 1.0 2017-01-18 14:31:58 -08:00
Francois Chollet f2fe51a9d2 Finish renaming initializations -> initializers 2017-01-18 10:13:52 -08:00
Francois Chollet a56b1a5518 Remove batchwise metrics 2017-01-17 17:42:45 -08:00
Francois Chollet 1c630c3e3c Remove legacy code from regularizers 2017-01-17 17:42:29 -08:00
Francois Chollet 32cb83408a Update .gitignore 2017-01-17 17:41:50 -08:00
Francois Chollet d8911c2885 Rewrite initializers 2017-01-17 17:41:40 -08:00
Javier Dehesa 55487f33b1 Added dtype parameter to zeros_like and ones_like (#5062)
* Fixed checking input masks in Layer.compute_mask

* Added dtype parameter to zeros_like and ones_like

* Fix existing docstring for ones_like and zeros_like
2017-01-17 13:29:21 -08:00
Petr Baudis 1c6db08158 Model generators: Make sure all threads finish when stop is requested (#5049)
Beforehand, slow generators could have caused race conditions and
crashes with 'ValueError: generator already executing', e.g. if
a validation generator filling up the queue took longer than a single
epoch that elapsed meanwhile.
2017-01-16 17:14:16 -08:00
Pat York e54d7951f2 Make output_shape parameter respect dim_ordering (#5002)
Affects DeConvolution2D
2017-01-16 10:05:35 -08:00
Mohanson 82ca6d4185 Fix a warning on Python3 (#5042)
* Fix a warning on python3

In Python3, 50000 / 10 = 5000.0. This will result in a warning from numpy:
VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future.

* Use // instead
2017-01-14 21:33:04 -08:00
Gijs van Tulder f3c60dc571 Check for filter_dilation support in Theano conv2d. (#5029) 2017-01-13 19:18:24 -08:00
valentindey c4166a9efc corrected version (#5027)
the version tag should be 1.2.0, afaik 1.1.3 doesn't exist
Btw. https://keras.io/layers/writing-your-own-keras-layers/ shows an older version of this page.
2017-01-13 19:17:38 -08:00
Francois Chollet 5adce5266f Convert "dim_ordering" to "data_format". 2017-01-13 15:39:04 -08:00
Francois Chollet 3d176e926f Further style fixes in layers. 2017-01-12 18:12:26 -08:00
Shashank 3a7cd05b48 adding option to specify classes in all models in applications (#4947) 2017-01-12 18:01:57 -08:00
Francois Chollet 8ef4a3da52 Style fixes in layer wrappers. 2017-01-12 17:41:07 -08:00
Francois Chollet 1b7800aceb Further style fixes. 2017-01-12 17:32:08 -08:00
Francois Chollet b5746331f6 Further docstring fixes. 2017-01-12 16:43:39 -08:00
Francois Chollet 3e933ca0ed Style fixes in datasets. 2017-01-12 16:38:55 -08:00
Francois Chollet 53e541f7bf Further docstring fixes. 2017-01-12 16:10:34 -08:00
Francois Chollet fbc9a18f0a Merge branch 'master' of https://github.com/fchollet/keras 2017-01-12 15:01:42 -08:00
Francois Chollet 8a50f5dfc8 Further style fixes. 2017-01-12 15:00:35 -08:00
Javier Dehesa c3c634f4b1 Fixed checking input masks in Layer.compute_mask (#5015) 2017-01-12 13:22:02 -08:00
Michael Dietz 710d8e4dd3 Fix ModelCheckpoint acc monitoring 2017-01-12 13:20:58 -08:00
Francois Chollet 887576b113 Style fixes in preprocessing/image.py. 2017-01-12 11:47:11 -08:00
Francois Chollet 2ad3544b01 Merge branch 'master' of https://github.com/fchollet/keras 2017-01-12 11:32:11 -08:00
Francois Chollet 68bde67d0a Add urlretrieve docstring 2017-01-12 11:31:49 -08:00
Francois Chollet 0edecdd09e sp -> np 2017-01-12 11:31:33 -08:00
Francois Chollet 5d97657375 Fix handling of padding for np arrays in sequence.py 2017-01-12 11:31:16 -08:00
Charles-Emmanuel cf8947da79 [Documentation] "random_uniform" initializer doesn't exists. (#5016)
* "random_uniform" initializer doesn't exists.

The following line raises :`ValueError: Invalid initialization: random`
because "random_uniform" is just "uniform"
```
self.W = self.add_weight(shape=(input_shape[1], self.output_dim),
                                 initializer='uniform',
                                 trainable=True)
```

* shape parameter missing in build call

super(MyLayer, self).build(input_shape
2017-01-12 10:32:51 -08:00
Jack Hessel c6bf7558b2 Potential deconv model saving fix? (#4999)
Adding this cast to a tuple seems to fix this issue.
2017-01-11 23:10:21 -08:00
Pat York 429e253fb6 Allow dim_ordering as a parameter to all Initialization (#5000) 2017-01-11 23:06:58 -08:00
Pat York e5529d98fe Allow (n, 0) croppings on Cropping2d and 3d (#4941)
* Added support for 0 cropping on second axis

Prevents an invalid `x[:-0]` slice

* Added Cropping layer tests with no cropping
2017-01-11 17:53:51 -08:00
Francois Chollet 6e03136116 Style fixes in sequence/text preprocessing. 2017-01-11 16:55:22 -08:00
Francois Chollet 4973fe3069 Fix base filters in one_hot 2017-01-11 16:33:41 -08:00
Francois Chollet cfa1f7c3bc Further style fixes. 2017-01-11 16:24:19 -08:00
Francois Chollet 538d368396 Fix issue with clip in Theano backend 2017-01-11 16:23:49 -08:00
Francois Chollet 590a5a5382 Fix issue with TF clip. 2017-01-11 15:47:44 -08:00
Francois Chollet fa585c5151 Further style fixes in utils module. 2017-01-11 15:47:35 -08:00
Francois Chollet 7ae2f84783 Revert cuDNN check with Theano backend 2017-01-11 15:15:58 -08:00
Francois Chollet 088dbe6866 Remove unused util function 2017-01-11 15:07:00 -08:00
Francois Chollet 6fb7ba721c Fix issue with clip max_value. 2017-01-11 15:03:02 -08:00
Francois Chollet 7aa3114d9f Further style fixes. 2017-01-11 15:02:52 -08:00
Francois Chollet 8bfd851133 Merge branch 'master' of https://github.com/fchollet/keras 2017-01-11 14:33:13 -08:00
Francois Chollet 9120a7251d Further style fixes. 2017-01-11 14:32:35 -08:00
jrao1 fdb9561ade Theano cudnn code now throws Exception when it is not available, need… (#4832)
* Theano cudnn code now throws Exception when it is not available, need to catch this

* Revert "Theano cudnn code now throws Exception when it is not available, need to catch this"

This reverts commit 2d107a6a9aca469d545d6ee31624c4b530c7ea0a.

* use dnn_available to check if cudnn is available

* Fix pep8 error
2017-01-11 14:09:49 -08:00
Francois Chollet a5ec992b1f Remove dependency on np.inf 2017-01-11 14:05:19 -08:00
Francois Chollet 2c432ffeb3 Further docstring fixes. 2017-01-11 14:04:30 -08:00
Francois Chollet 0ab4b647f8 Further docstring fixes. 2017-01-11 12:00:35 -08:00
Francois Chollet 9f4734cbf1 Fix docstrings in preprocessing/text.py 2017-01-11 11:54:47 -08:00
Francois Chollet ac1a09c787 Stricter PEP8 linting config for Travis. 2017-01-11 11:44:56 -08:00
Francois Chollet c10945f53a PEP8 fixes in tests. 2017-01-11 11:40:57 -08:00
Francois Chollet 309f586424 PEP8 fixes in examples. 2017-01-11 11:39:58 -08:00
Francois Chollet 1f5455e29e Fix a number of PEP8 errors. 2017-01-11 11:28:52 -08:00
Francois Chollet a90af6f22e Docstring style fixes. 2017-01-11 11:20:20 -08:00
Keunwoo Choi 38719480a8 documentation - TF backend (complete) (#4764)
* documentation: tf backend, complete.

* few fixes

* pep8

* fixed according to review.

* documentation: tf backend, complete.

* few fixes

* pep8

* fixed according to review.
2017-01-11 08:57:49 -08:00
David Vetrano aa18604fec add tf version detection for new summary functions (#4989) 2017-01-11 08:56:01 -08:00
Dr. Kashif Rasul 875bc59ecf tensorflow 0.12 fixes (#4815)
* initial tensorflow 0.12 fixes

see #4805

* fixed indents for pep8

* added tests for clipnorm and clipvalues

* updated travis to tf 0.12.1

* batch_matmul removed

even though the tests don’t fail on travis… they fail locally…

* make changes work with TF 0.11

* move statement outside of if
2017-01-10 18:57:48 -08:00
Francois Chollet 89f0527f31 README style fix 2017-01-10 18:16:34 -08:00
Francois Chollet 8c0c3774e6 Add noise_shape and seed to Dropout layer API. 2017-01-10 18:14:14 -08:00
Francois Chollet 9c93d8ec06 Update Theano backend. 2017-01-10 18:13:50 -08:00
Francois Chollet 1ccad186fd Update TF backend. 2017-01-10 18:13:38 -08:00
Ben e8cd940cf8 Faster implementation of to_categorical (#4983) 2017-01-10 17:18:17 -08:00
Francois Chollet c39546ee10 Fix some core layer docstrings. 2017-01-10 17:13:57 -08:00
Francois Chollet 8f75744379 Further style fixes (still incomplete). 2017-01-10 15:34:16 -08:00
Francois Chollet ea47e6de27 Style fixes in optimizers. 2017-01-10 14:44:00 -08:00
Francois Chollet a6525be4fc Further style fixes. 2017-01-10 14:43:48 -08:00
Francois Chollet 833c0b23f5 Make set_params/set_model callback methods public. 2017-01-10 14:43:30 -08:00
Francois Chollet a04d968422 Further style fixes. 2017-01-10 14:18:44 -08:00
Francois Chollet 7b261704cf Style fixes in test_convolutional. 2017-01-10 11:39:55 -08:00
Francois Chollet 97b0f9f6e4 Style fixes in constraints. 2017-01-10 11:39:40 -08:00
Francois Chollet 3071e0de2f Further style fixes in callbacks. 2017-01-10 11:39:27 -08:00
Francois Chollet fe72033b2e Further style fixes in callbacks. 2017-01-10 10:42:05 -08:00
Junwei Pan b57b9d3f8e Add Docstring for SReLU; Docstring Style Fix (#4979) 2017-01-10 09:12:11 -08:00
Minkoo Seo 50b4f7fad5 Update recurrent.py. (#4970)
Add explanation on shuffle=False.
2017-01-09 16:13:16 -08:00
Junwei Pan 6b05aebc0c Reference Style Fix (#4972) 2017-01-09 16:11:00 -08:00
Francois Chollet 5863fc74b1 Minor style fixes in callbacks. 2017-01-09 13:59:12 -08:00
sukolsak 293940600b Touch up BN docstring in TF backend (#4960) 2017-01-08 15:34:48 -08:00
Junwei Pan f0369909d0 Style Fix (#4923)
* Style Fix

* Style Fix
2017-01-08 15:34:06 -08:00
nzw 9db82605d2 Add Xception (#4961) 2017-01-08 14:50:26 -08:00
Junwei Pan c0d95fd6c2 Remove unused imports and unused variables (#4930) 2017-01-06 18:25:03 +01:00
Fabian-Robert Stöter 150e0fa8a6 Support shape inference for Reshape layer (#4928)
* pep8

* revert changes

* add shape inference to Reshape()

* add another shape test
2017-01-06 18:23:23 +01:00
Junwei Pan 45ad509611 Add References (#4924) 2017-01-06 18:21:50 +01:00
Luke de Oliveira cbefd323be Deconvolution2D fix. (#4933)
* upgrade K.deconv to handle None dim in batch_size

* add test for K.deconv2d which verifies behavior
2017-01-06 18:19:47 +01:00
nzw 0f0d837178 Fix Bidirectional docstring (#4936) 2017-01-06 18:09:34 +01:00
Gijs van Tulder a6c9227372 Fix problem with old Theano batch normalization and GPU (#4917)
* Fix problem with old Theano batch normalization and GPU (see #4881).

* Fix test.
2017-01-05 12:13:43 +01:00
Ozan Çağlayan f6b804263a callbacks: Maximize fmeasure as well when mode==auto (#4902) 2017-01-05 00:20:42 +01:00
Eric Xihui Lin b5df1c6170 for K.dot: use integer shape if available (#4899) 2017-01-05 00:17:42 +01:00
Ozan Çağlayan 44bf298ec3 docs: fix typo inputs_shape -> input_shape (#4901) 2017-01-05 00:16:44 +01:00
Junwei Pan 5d575a3eff Style Fix (#4912) 2017-01-05 00:16:06 +01:00
Minkoo Seo e63372e41f Update sequence.md. (#4905)
* Update sequence.md.

Fix value added for padding and add explanation on truncation.

* fix typo
2017-01-05 00:15:38 +01:00
fchollet 9ee0c8e634 Fix init dim_ordering in conv layers 2017-01-05 00:07:40 +01:00
Daniël Heres 431c76abc4 Add 'one' initialization to docs (#4893)
* Add 'one' initialization

Initialization with one was missing from the docs.

* Update initializations.md
2017-01-02 16:38:07 +01:00
Ke Ding 21023f7f9c make tf backend's sparse crossentropy more robust to unkonw dimensions (#4867) 2016-12-31 15:47:05 +01:00
Rishikesh 1746ac463a Adding mnist_acgan.py example link in README (#4876) 2016-12-30 16:34:20 +01:00
Stanislaw Jastrzebski f573a86b42 Fix #4846 (#4856) 2016-12-30 10:19:01 +01:00
Michael Oliver 0e18cb3efa Fix for Issue #4851 (#4855)
* Fix for Issue #4851

I didn't catch when my original documentation was changed by @fchollet (overall for the better) but introducing this bug: https://github.com/fchollet/keras/issues/4851

My original docs only included the dimensions of the parameters (no batch dim) and were correct, but I think its better to change the functions to reflect the current docs.
My original docs were:
```Python
shared_axes: the axes along with to share parameters for
 -                     the activation function. For example if the
 -                     incoming feature maps from a 2D convolution
 -                     has dimensions 16x32x32 and you wish to share
 -                     parameters across space so that each feature
 -                     maps only has one set of parameters, set
 -                     shared_axes = [1, 2]
```

* Make PEP8 compliant

Add spaces around subtraction
2016-12-30 10:18:08 +01:00
Francois Chollet 50f7f03f6b Fix PEP8 issue 2016-12-28 22:50:50 +01:00
Francois Chollet 3d4a48b120 Fix Theano learning phase inference issue 2016-12-28 22:49:21 +01:00
Francois Chollet ffe013033e Remove deprecated regularizers from wrappers 2016-12-28 21:31:49 +01:00
Francois Chollet 00cbeecf6c Add shape partial inference to Theano reshape 2016-12-28 21:31:32 +01:00
Ander 737bea8f39 Cropping1D, zero right padding (#4860) 2016-12-28 21:10:12 +01:00
Dieuwke Hupkes c2e36f369b Fix kullback_leibler_divergence (#4800)
The kullback_leibler_divergence metric in metrics.py returned an output
with dimensionality N-1 (where N is the dimensionality of the target).
Add mean after sum to fix this, such that always a scalar is returned.
2016-12-22 13:13:05 -08:00
Mike Henry 883f74ca41 Image ocr fixes (#4790)
* Added CTC to Theano and Tensorflow backend along with image OCR example

* Fixed python style issues, made data files remote, and made code more idiomatic to Keras

* Fixed a couple more style issues brought up in the original PR

* Reverted wrappers.py

* Fixed potential training-on-validation issue and removed unused imports

* Fixed PEP8 issue

* Remaining PEP8 issues fixed

* Fixed failure to learn issue on image_ocr.py found on newer versions of Keras

* Minor tweaks before submitting fixed image_ocr.py example

* Removed TimeDistributed usage and fixed quote inconsistencies

* Fixed PEP8 issues

* Fixed issue where loading weights does not work for start_epoch < 10

* Switched to using initial_epoch
2016-12-22 13:09:00 -08:00
Francois Chollet d8b226f26b Merge branch 'master' of https://github.com/fchollet/keras 2016-12-21 17:08:42 -08:00
Francois Chollet c4f3155d19 Fix re: saving/loading models with frozen layers. 2016-12-21 17:08:30 -08:00
Ryan 72c7716902 refactored convert_kernel (#4720) 2016-12-21 15:59:03 -08:00
Francois Chollet 1bc79f66f9 Merge branch 'master' of https://github.com/fchollet/keras 2016-12-20 15:08:47 -08:00
Francois Chollet 2b3eae5f08 Deprecate unused argument. 2016-12-20 15:08:42 -08:00
Francois Chollet 497cff9772 Simple style fixes. 2016-12-20 15:08:26 -08:00
konstantinbarkalov fdb20dbc7e Add 'initial_epoch' argument to Sequential fit() and fit_generator(). (#4779)
* Add 'initial_epoch' argument to Sequential.fit() and Sequential.fit_generator() wrappers.

* typo fix
2016-12-20 15:07:26 -08:00
Francois Chollet 942ed44fdd Merge branch 'master' of https://github.com/fchollet/keras 2016-12-20 13:42:08 -08:00
Francois Chollet 8bc3f4d916 Adjust documentation. 2016-12-20 13:39:32 -08:00
Francois Chollet dcbc2b933a Minor style fixes. 2016-12-20 13:39:25 -08:00
Batchu Venkat Vishal d0b4779071 Maintain aspect ratio of content image (#4769) 2016-12-20 12:33:24 -08:00
Francois Chollet 12d068f675 Prepare PyPI release. 2016-12-19 15:34:08 -08:00
Francois Chollet 070609cbac Merge branch 'master' of https://github.com/fchollet/keras 2016-12-19 15:12:45 -08:00
Francois Chollet 6b1bf7d917 Style fixes in TF backend. 2016-12-19 15:12:33 -08:00
llcao fefb70b217 fix a bug in evaluating accuracy (#2736)
* fix accuracy computation in MNIST siamese graph example

previous code:
```
def compute_accuracy(predictions, labels):
    '''Compute classification accuracy with a fixed threshold on distances.
    '''
    return labels[predictions.ravel() < 0.5].mean()
```
is not accuracy over all the samples, but over samples with negative prediction.

* add space around  "=="

follow François's suggestion
2016-12-19 14:54:22 -08:00
Batchu Venkat Vishal 48d8853cad Allow to pass optional arguments to style transfer (#4768) 2016-12-19 14:31:07 -08:00
Francois Chollet 2a3d4722c2 Use int_shape where appropriate. 2016-12-19 14:29:22 -08:00
Francois Chollet d137d00182 Merge branch 'master' of https://github.com/fchollet/keras 2016-12-19 14:17:50 -08:00
Francois Chollet 9333179ad9 Add ability to test layers on dynamic shapes. 2016-12-19 14:12:29 -08:00
linxihui 0c842391d3 Fix issue with TF dot with dynamic shapes. 2016-12-19 14:12:04 -08:00
Gijs van Tulder 1fcb74f218 Use custom_objects to deserialize Lambda functions. (#4770) 2016-12-19 12:15:06 -08:00
Francois Chollet 1278bf9cfa Slight style fixes. 2016-12-19 12:05:28 -08:00
Francois Chollet 18e5b75f67 Generalize Dense to nD tensors. 2016-12-19 12:05:15 -08:00
Francois Chollet 766572b5b8 Add int_shape to Theano backend. 2016-12-19 12:00:24 -08:00
Keunwoo Choi 1de4d7cfba update croppings (#4765) 2016-12-19 11:15:18 -08:00
Jack Hessel 04107252f2 updated the issue template markdown with a bit more information (#4750)
* updated the issue template with a bit more information

* Update ISSUE_TEMPLATE.md

De-emphasized the google group, added StackOverflow.

* Update ISSUE_TEMPLATE.md
2016-12-17 10:45:00 -08:00
Michael Oliver 5f0e0d6c38 Fix issue #3568: allow sharing of activation function parameters along specified axes (#4141)
* allow ability to share activation parameters along specified axes

* add tests

* change to shared_axes and remove TF dummy broadcast function

* update tests to shared_axes

* Update docstrings in advanced activations
2016-12-16 17:07:10 -08:00
Julien Phalip 79406f111b Make sure that changes to the global floatx are effectively taken into account by the backend. (#4739) 2016-12-16 16:59:01 -08:00
Francois Chollet 30fa61d457 Use tf.select instead of tf.where (compat TF 0.11) 2016-12-16 16:37:55 -08:00
Francois Chollet 914d976801 Fix compatibility with TF 0.11. 2016-12-16 16:15:02 -08:00
Francois Chollet 839d4f108e Merge branch 'master' of https://github.com/fchollet/keras 2016-12-16 15:44:10 -08:00
Francois Chollet 5e73db6c00 Handle recent TF API changes. 2016-12-16 15:43:58 -08:00
Charles-Emmanuel e9b8424839 Theano backend consistancy (#4748)
* Theano backend consistancy

ones_like and zeros_like don't have the name parameter in their signature for theano backend so it can trigger an error if it is used.

* pep8

pep8
2016-12-16 13:52:31 -08:00
Francois Chollet e3dd5d7ca5 Tmp bugfix 2016-12-16 12:53:16 -08:00
Francois Chollet 2752a58730 Tmp bug fix 2016-12-16 12:43:11 -08:00
Francois Chollet 69eb5752ce Update README. 2016-12-16 12:40:55 -08:00
Francois Chollet 9090704f1d Update the TF backend documentation. 2016-12-16 12:38:11 -08:00
Keunwoo Choi fa9f863dbf Documentation - tensorflow_backend (#4677)
* documentation - tensorflow_backend.py - first 1/3

* pep8, add periods.

* documentation - tensorflow_backend.py

* pep8

* * fix the confusion of variable/tensor.
* remove `var` as a variable name. `kvar` is used instead.
* new lines between multiple items under # Arguments
2016-12-16 11:42:39 -08:00
Bohumír Zámečník 4b1b706aa4 Fix a typo in Cropping3D docs. (#4742)
saptio -> spatio
2016-12-16 10:26:24 -08:00
Francois Chollet 5e75b8506c Uniquify updates in Theano backend functions. 2016-12-15 17:02:35 -08:00
Francois Chollet bc9f341165 Update docs wrt custom layer writing. 2016-12-15 14:05:09 -08:00
PPACI b40b8a00e4 Now properly set self.y to None in NumpyArrayIterator (#4732) 2016-12-15 10:05:50 -08:00
Su Tang d811048887 Add python3 support for some examples (#4715) 2016-12-14 23:07:21 -08:00
fchollet 0ba2626bd2 Further style fixes 2016-12-14 22:39:18 -08:00
fchollet c6eea03c8d Style fixes in engine. 2016-12-14 22:09:05 -08:00
Francois Chollet 0fd0218ef0 Various style fixes. 2016-12-14 15:22:49 -08:00
Francois Chollet 5e1a5d07c4 Style fixes in layers/pooling.py. 2016-12-14 15:08:13 -08:00
Francois Chollet 518fa3aa44 Style fixes in layers/local.py. 2016-12-14 15:04:34 -08:00
Francois Chollet ef1da479ec Display count of trainable params in summary. 2016-12-14 14:25:06 -08:00
Alejandro Dubrovsky 3f12d7ae44 Add a 'period' variable to ModelCheckpoint to save every period epochs (#4687)
* Add a 'period' variable to ModelCheckpoint to save every period epochs

* Update callbacks.py
2016-12-14 14:02:23 -08:00
Gijs van Tulder c4579a9c43 Fix dimshuffle problem in old Theano batch_normalization. (Fixes #4697.) (#4713) 2016-12-14 13:55:34 -08:00
François Chollet ff62eb251b Refactor regularizers and add add_weight method. (#4703)
* Refactor regularizers, introduce layer.add_weight

* Fix BN add_update syntax

* Fix eigenvalue regularizer

* Style fixes.
2016-12-14 13:41:24 -08:00
Francois Chollet 2b336756b6 Set default backend to TF on windows. 2016-12-13 19:49:49 -08:00
Francois Chollet 0f0d8be884 Further style fixes. 2016-12-13 19:49:36 -08:00
Francois Chollet 3f3e0aa90e Style fixes in models.py. 2016-12-13 19:23:16 -08:00
Francois Chollet c0ee5b859c Further style fixes in TF backend. 2016-12-13 19:14:44 -08:00
Francois Chollet edae178532 Style fixes in utils. 2016-12-13 19:13:04 -08:00
Francois Chollet a0a0308061 Style fixes in sklearn wrapper 2016-12-13 18:58:45 -08:00
Francois Chollet 74329d0c1d Style fixes in backend/tensorflow_backend.py. 2016-12-13 18:55:55 -08:00
Francois Chollet 5777355972 Fix style issues in backend/theano_backend.py. 2016-12-13 18:37:23 -08:00
Francois Chollet 0272587c29 Fix style issues in core.py. 2016-12-13 18:36:25 -08:00
Francois Chollet 22d3c8810c Style fixes in engine/topology.py. 2016-12-13 17:44:07 -08:00
Francois Chollet 4aa8aa100b Style fixes in training.py. 2016-12-13 15:47:12 -08:00
Francois Chollet bd404b1c88 Add TF exception to SWWAE example 2016-12-13 13:01:58 -08:00
Francois Chollet bed17efae8 Merge branch 'master' of https://github.com/fchollet/keras 2016-12-13 12:53:41 -08:00
Jan Zikes 8d0199ed42 Replaced old tensorboard calls by calls compatible with tensorflow 0.12 (#4700)
* Added update towards tensorflow 0.12 (replaced deprecated calls by new ones)

* added backwards compatibility with tensorflow<=0.11.0
2016-12-13 11:17:44 -08:00
fchollet 9f33f8af5f Add names to Keras application models 2016-12-12 19:17:41 -08:00
Francois Chollet 7c4f033c6a Change optimizer in CIFAR10 example. 2016-12-12 16:02:43 -08:00
Francois Chollet 7e2e7a5e5a TB callback: close writer on train end. 2016-12-12 16:01:03 -08:00
Francois Chollet 909fbd19ea Update answer to validation split Q in FAQ 2016-12-12 15:56:56 -08:00
Francois Chollet 2b27ab1c9e Merge branch 'master' of https://github.com/fchollet/keras 2016-12-12 15:05:47 -08:00
Francois Chollet d244d38047 Backwards compatibility fix 2016-12-12 15:05:35 -08:00
Gijs van Tulder 2a0b112d08 Use Theano's abstract interface for batch normalization (#4595)
* Simplify BatchNormalization code.

* Make Theano's K.batch_normalization similar to TensorFlow.

* Change default batch normalization epsilon to 1e-3.

* Use Theano's new batch normalization interface.
2016-12-12 14:17:59 -08:00
Francois Chollet d9657b70c0 Merge branch 'master' of https://github.com/fchollet/keras 2016-12-12 23:09:19 +01:00
Francois Chollet e8939f43a6 Fix BN reuse issue in TF backend 2016-12-12 23:09:04 +01:00
sunil-at-gh 8e587fb17a Fix for issue #4640: MaxPooling1D with border_mode='same' caused h_pad to take illegal value -1 in theano_backend:pool2d. (#4683)
Added test case.
2016-12-12 14:08:04 -08:00
dathinab 757ae95cca Fixes #4690, check for bool dtype in K.mean (th0.9) (#4691)
(It also adds a docstring, like in the tf backend)
2016-12-12 13:56:05 -08:00
Leszek cc6e65d145 recurse directories in DicrectoryIterator (#4657)
* recurse directories in DicrectoryIterator

rebased and squashed, again

* added prose about new behaviour of flow_from_directory

also, about BMP files.
2016-12-11 09:57:28 +01:00
Francois Chollet df464c103e Remove pypi version badge 2016-12-11 09:22:22 +01:00
Keunwoo Choi d517b55576 Documentation - Add # Argument # Return # Example in backend/common.py (#4668)
* Add # Argument # Return # Example in backend/common.py

* * pep8
* debug (a mistake by me)

* * Remove useless docstrings, remove `None` returns
* Add bullet points

* argument→arguments, return→returns, example→examples

* Update common.py

* Update common.py
2016-12-11 03:38:20 +01:00
Daniel Angelov 4c1353c188 Added Convolution3D as a layer supporting the regularizers and constraints API. (#4673) 2016-12-10 19:15:38 +01:00
Francois Chollet 4871208f02 String cleanup 2016-12-10 11:33:09 +01:00
Francois Chollet 08566f22c7 Allow 4-channel images in preprocessing/image.py 2016-12-10 11:21:53 +01:00
Francois Chollet ea7b37a42a Merge branch 'master' of https://github.com/fchollet/keras 2016-12-10 11:18:35 +01:00
Adam Wentz 302eef7bad Add arange to both backends (#4350) 2016-12-10 11:15:00 +01:00
Gijs van Tulder 825adad18d Theano: deconv2d should also shuffle filter_shape. (#4631) 2016-12-10 09:58:32 +01:00
Francois Chollet 4491212da4 Remove unnecessary import 2016-12-09 12:16:51 +01:00
Francois Chollet 52e2f3ed64 Slight style fixes in example 2016-12-09 11:59:35 +01:00
Leszek 1de4fe0ba8 changed order of tests for theano (#4643)
shortest first. if they fail, stop testing.
2016-12-08 13:09:43 +01:00
Guy Hadash e1208f5b9f Adding preprocess function to ImageDataGenerator (#4620)
* addition option for preprocess function in ImageDataGenerator

* fixes according to comment
2016-12-08 10:34:22 +01:00
Yongsheng Xu 0a8ac44617 change dockerfile to fit the latest version of tensorflow and cuda (#4642) 2016-12-08 10:26:19 +01:00
Shaofan Lai 1fd2108bcf fix load_model so that it can load customized optimizer (#4625) 2016-12-07 11:13:36 +01:00
Pat York ad5e29a2b7 Added JSON content headers (#4603) 2016-12-05 09:18:01 +01:00
Francois Chollet 93b7dd9915 Close file after writing in TB callback. 2016-12-04 23:25:30 +01:00
Francois Chollet 9256b76226 Merge branch 'master' of https://github.com/fchollet/keras 2016-12-04 22:54:00 +01:00
Francois Chollet fbd12f7d44 Fix issue with LSTM dynamic masking. 2016-12-04 22:53:45 +01:00
Shijie Wu 90c4895a7a Fix document in convolutional_recurrent.py (#4586)
* Fix mismatch of interface and document in convolutional_recurrent.py

* Update document of convolutional_recurrent.py to match the interface
2016-12-04 22:34:05 +01:00
Francois Chollet 6dfa8b1d60 Update FAQ entry about intermediate output display 2016-12-04 19:11:17 +01:00
Francois Chollet 5430844453 Fixes to image preprocessing utils 2016-12-04 18:57:15 +01:00
Francois Chollet 9dd06082e7 Merge branch 'master' of https://github.com/fchollet/keras 2016-12-04 17:15:10 +01:00
Francois Chollet cb4f93913e Fix shape validation issue with imagenet_utils 2016-12-04 17:14:57 +01:00
ηzw 149946c706 Fix typos (#4591) 2016-12-04 04:05:34 -08:00
Francois Chollet 78988b5cd6 Fix serialized Merge output shape loading 2016-12-04 13:02:36 +01:00
Gijs van Tulder a081e049db Use Theano's IfElse instead of Switch for in_train_phase. (#4579)
This should allow Theano to perform lazy evaluation.
2016-12-02 16:25:11 -08:00
Francois Chollet 68af216772 Fix issue with custom Application input_tensor 2016-12-02 14:35:50 -08:00
Keunwoo Choi b4a532e970 adds keras.json details in backend document (#4560) 2016-11-30 23:18:48 -08:00
Francois Chollet 3bf913dc35 Unflake optimizer test 2016-11-30 21:11:53 -08:00
Taras Boiko 55163b5999 Warn when output shape not specified for Lambda on Theano (#4543) 2016-11-30 20:47:44 -08:00
Brett Naul 9bfbe6ae3e Fix deprecation warning in to_categorical (#4547) 2016-11-30 20:46:15 -08:00
Amane Suzuki b23e873e0f Fix small typo (#4559)
* Fix small typo

* Fix small typo
2016-11-30 20:44:54 -08:00
Fariz Rahman 79ec9b8079 ACGAN : Remove lines with no effect (#4503)
* Remove lines with no effect

* pep8

* Update mnist_acgan.py
2016-11-29 13:22:34 -08:00
Javier Dehesa 24d6cca275 Enforce shape invariance for states in RNN loop (#4536)
Fixes some shape invariance errors arising sometimes when building RNNs.
2016-11-29 11:31:14 -08:00
lebavarois 83b90c172c change to new tables api (#4526) 2016-11-28 13:35:05 -08:00
fchollet 57f2f11005 count_params should report count for all weights. 2016-11-26 21:08:21 -08:00
fchollet bf502be578 Merge branch 'master' of ssh://github.com/fchollet/keras 2016-11-26 20:08:46 -08:00
fchollet dfeca151a2 Allow to set input shape in CV models. 2016-11-26 20:08:41 -08:00
Francois Chollet 2ddd2bd557 Prepare new PyPI release 2016-11-25 20:52:27 -08:00
Ken Chatfield b2aebb30bf Don't add another header line to CSV logger when appending to an existing file (#4426) 2016-11-25 14:49:50 -08:00
Fariz Rahman 0a9c0ca461 Sequential : Fix trainable arg (#4509) 2016-11-25 11:59:05 -08:00
fchollet c0b32a9a04 Remove reference to legacy Graph model in tests. 2016-11-25 01:20:10 -08:00
fchollet 703d5a1298 Add dynamic trainability lightweight test 2016-11-24 23:59:51 -08:00
fchollet c5cc96a4f4 Saner way to collect trainable weights 2016-11-24 23:59:33 -08:00
fchollet de256cb5d5 Make sure ImageNet predictions are sorted 2016-11-24 23:30:00 -08:00
fchollet ce814302ac Remove support for legacy Graph model 2016-11-24 23:29:45 -08:00
Fariz Rahman 628bc6e03e ACGAN: Remove unnecessary dimension in label input (#4501) 2016-11-24 20:21:56 -08:00
Thomas Pinetz dfb606bb19 Fix border_mode = same for pooling layers documentation. (#4341) 2016-11-24 12:42:59 -08:00
Dontloo 88f3b3f75e fixed variational autoencoder visualization for Gaussian latent space (#4423) 2016-11-23 14:08:19 -08:00
Marzuk Kamal 773d4ce8cb def fbeta_score(y_true, y_pred, beta=1) (#4492)
set the default value of beta=1
2016-11-23 13:25:29 -08:00
Fariz Rahman 509d6d8235 Merge : Serialize output mask; Enable user arguments for callable mode (#4445)
* Update topology.py

* Update topology.py

* Update topology.py

* white space fix

* indentation fix

* add tests

* fix all tests

* add arguments arg to merge

* space after period

* add test with arguments

* add test with arguments for lambda layer too

* pep8 fixes

* fix tf test

* try fixing tf test; again

* bug fix

* finally
2016-11-23 13:24:54 -08:00
Taras Boiko 7bd5c862a2 Correctly check the output dimension for None instead of target (#4458) 2016-11-23 13:21:48 -08:00
Angelos Katharopoulos 2878f60634 Add map, foldl, foldr to the backend (#4461) 2016-11-23 13:21:13 -08:00
Luke de Oliveira 50fdb87888 adding mnist acgan example (#4475) 2016-11-23 13:19:29 -08:00
Gijs van Tulder dad7790ec3 Model summary: separate columns with a space. (#4469) 2016-11-23 11:06:30 -08:00
Marzuk Kamal 709bc5e15a tf.global_variables and tf.variables_initializer (#4490)
tf.all_variables and tf.initialize_variables are replaced by tf.global_variables and tf.variables_initializer for the future version of tensorflow
2016-11-23 11:06:10 -08:00
Ken Chatfield 06cc6d7fea Add initial epoch argument to fit functions (#4429)
* Added initial_epoch argument to fit functions in trainer

* Added unit test

* PEP8 fixes
2016-11-19 21:51:57 -08:00
EdwardRaff 97484ec9c1 Finishing Colincsl's SpatialDropout1D (#4416)
* Added SpatialDropout1D

This is a straightforward modification of SpatialDropout2D but for 1D data.

* Added SpatialDropout1D to docs

* SpatialDropout1D test

* Fixed indent issue

* Combined TF and TH dimension conditions

Use the same 1D dimensions for TensorFlow and Theano in SpatialDropout1D.

* trailing whitespace

* Removed dim_ordering variable

* Removing dim_ordering values

removing dim_ordering values as requested
2016-11-19 12:30:05 -08:00
Taras Boiko 6b04add932 Check all output dimensions for compatibility (#4420) 2016-11-19 10:10:08 -08:00
Yu Kobayashi 04ea01f385 Bug fix of Bidirectional(LSTM(..., stateful=True)) (#4424)
* Bug fix of Bidirectional(LSTM(..., stateful=True)) https://github.com/fchollet/keras/issues/4421

* Add Recurrent.from_config() test
2016-11-18 12:19:42 -08:00
Yu Kobayashi 8653060ae6 Update Travis TensorFlow to 0.11.0 (#4367) 2016-11-17 09:55:39 -08:00
Francois Chollet 8df3effa5f Merge branch 'shareable_bn' 2016-11-16 19:07:06 -08:00
Francois Chollet 771010f43b Add shareable BN (per-datastream updates). 2016-11-16 19:06:46 -08:00
Carl Thomé 8d20bac7fa Remove extraneous batch_input_shape (#4393) 2016-11-16 18:59:03 -08:00
Francois Chollet c4c4fac1ae Make BN shareable (not yet working) 2016-11-15 05:16:40 -08:00
Francois Chollet 016d85c9e6 Minor style fixes 2016-11-14 15:09:58 -08:00
Francois Chollet 3ab29205fc Merge branch 'master' of https://github.com/fchollet/keras 2016-11-14 15:08:04 -08:00
Francois Chollet fdd150eb4d Minor style fixes 2016-11-14 15:07:51 -08:00
Anton Chernyavski 789a2be8d9 Fix get_layer() by index (#4376) 2016-11-14 09:47:27 -08:00
Francois Chollet ae7ef37c1b Merge branch 'master' of https://github.com/fchollet/keras 2016-11-09 20:57:43 -08:00
Francois Chollet 94fba3d8f0 Fix Theano tests 2016-11-09 20:57:30 -08:00
Yu Kobayashi 6ac9af0a5a Fix the load_model() bug by sorting weights by names (#4338) 2016-11-09 20:36:45 -08:00
Francois Chollet e916f748db Fix Theano tests 2016-11-09 20:33:42 -08:00
Francois Chollet 92e8a20761 Remove unused set_input method 2016-11-09 18:34:09 -08:00
Francois Chollet cb3de665d1 Simplify tests 2016-11-09 18:01:19 -08:00
Francois Chollet 49a5cdf76d Improve error message 2016-11-09 18:01:06 -08:00
Francois Chollet 08a090de43 Merge branch 'master' of https://github.com/fchollet/keras 2016-11-09 17:33:49 -08:00
Francois Chollet fa3b17cd96 Minor code cleanup 2016-11-09 17:33:31 -08:00
Ken Chatfield 5266fdacf1 Bugfix to CIFAR pickle reading code in Python 3 (#4319) 2016-11-09 17:14:36 -08:00
nagachika b74c5953f0 Print EarlyStopping verbose message on_train_end. (#4332)
The message print on_epoch_end would be overwritten by ProgbarLogger.
2016-11-09 16:35:22 -08:00
Yu Kobayashi 00e8d20eae Theano tile() expects Python int, so casting from numpy.int32 to Python int. (#4330) 2016-11-09 16:23:22 -08:00
Gijs van Tulder e8e63e307e Theano: try not to use the old pool_* interface. (#4321) 2016-11-09 16:22:37 -08:00
Uwe Schmidt 7db6de848a Fix for issue #3965 (#4333)
* Fixes issue with resize_images and partially-definded tensors

Disclaimer: I haven't tested this with `dim_ordering == 'th'`

* PEP8 syntax
2016-11-09 16:21:37 -08:00
Matt Gardner 8360ef3a5a Add documentation to set self.built = True in MyLayer.build() (#4315)
* Added documentation to set self.built = True in MyLayer.build()

* Update writing-your-own-keras-layers.md
2016-11-07 18:19:27 -08:00
Francois Chollet d32b8fa4bd Further code cleanup 2016-11-07 17:27:41 -08:00
Francois Chollet c95c32e473 Improve docstrings 2016-11-07 15:36:57 -08:00
Francois Chollet 02fe371839 Merge branch 'master' of https://github.com/fchollet/keras 2016-11-07 12:46:54 -08:00
Francois Chollet b7b7c2ea94 Normalize default argument values 2016-11-07 12:46:41 -08:00
Francois Chollet 105dd031dd Documentation improvements 2016-11-07 12:46:18 -08:00
Joshua Loyal 4fa289166a allow for learning rate dtypes returned by numpy (#4304) 2016-11-07 10:33:11 -08:00
Carl Thomé a8bbcf611f ConvLSTM2D docstring spelling (#4306)
* Spelling

* "convolutionnal" spelling
2016-11-06 12:05:20 -08:00
Francois Chollet d5030b1f8c Add conv_lstm to examples/README 2016-11-05 15:30:33 -07:00
Francois Chollet f127b2f81d Merge branch 'imodpasteur-rebasedconvV1' 2016-11-05 13:46:02 -07:00
Francois Chollet 9d4087a1e9 Style fixes 2016-11-05 13:45:50 -07:00
Francois Chollet fd326ddf1b Merge branch 'rebasedconvV1' of https://github.com/imodpasteur/keras into imodpasteur-rebasedconvV1 2016-11-05 13:32:03 -07:00
Francois Chollet 7f42253f46 Add basic support for TF optimizers, part deux 2016-11-05 13:26:03 -07:00
Francois Chollet 18d7e5e6e4 Style fixes 2016-11-05 13:22:18 -07:00
Francois Chollet 6610880fd4 Merge branch 'master' of https://github.com/fchollet/keras 2016-11-05 13:21:38 -07:00
Arbona 11b73ae6b4 Tf dynamic 2016-11-04 21:20:30 +01:00
Carl Thomé 2b51317be8 Refactor F-score into precision and recall metrics (#4276)
* Refactor f-score into precision and recall metrics

* Docstring consistency

* Add docstring for fmeasure

* Added precision, recall, f-measure tests
2016-11-03 20:28:04 -07:00
Francois Chollet 650c2c8cf9 Add basic support for TF optimizers 2016-11-03 11:38:00 -07:00
Igor Macedo Quintanilha 49386e8da4 Bug fix when target is a SparseTensor. (#4200)
* Bug fix when target is a SparseTensor.
Check for sparsity when creating target placeholder.
Remove shape argument when creating sparse placeholder.

* Fixed ndim behavior for sparse tensor

* Fix sparse variable instantiation.

* Bug fix
2016-11-03 10:04:40 -07:00
Thang Bui 71494ffdbc changed VAE sampling variance to 1 (#4211)
* Update variational_autoencoder.py

fixed sampling bug

* Update variational_autoencoder_deconv.py

fixed variance bug
2016-11-02 15:58:32 -07:00
Francois Chollet a9b6bef062 Improve dynamic TF RNN implementation. 2016-11-02 11:51:29 -07:00
Francois Chollet 4840e435f7 Improve RNN error messages 2016-11-02 10:47:46 -07:00
Arbona 531147c877 Fix review 2016-11-02 12:08:31 +01:00
Francois Chollet 61c21ef9ee Imagenet predictions sorting fix 2016-11-01 17:39:39 -07:00
Francois Chollet 058e54061b Style fixes 2016-11-01 17:39:23 -07:00
Francois Chollet 32be731194 Some backend refactoring 2016-11-01 16:52:25 -07:00
Francois Chollet 9bf55395f1 Simplify 1D pooling implementation 2016-11-01 16:51:54 -07:00
Francois Chollet 114b82a212 Minor TF backend improvements 2016-11-01 15:26:01 -07:00
manelbaradad 7d143370d8 BUG: Deconvolution2D output shape not correctly referenced (#4251) 2016-11-01 11:24:54 -07:00
Gijs van Tulder bc6880fa34 Enable full convolution with the Theano backend. (#4250) 2016-11-01 11:03:50 -07:00
Francois Chollet c6d2ccd453 Prepare 1.1.1 release. 2016-10-31 13:12:59 -07:00
Francois Chollet cdab739471 Merge branch 'master' of https://github.com/fchollet/keras 2016-10-31 13:11:49 -07:00
Taras Boiko fee03bd5a6 Use six for wrapping in keras_test (#4235)
This will allow parameterized tests to work correctly in both 2.7 and
3.4
2016-10-31 10:51:32 -07:00
Aloïs Gruson 6fd2d43bfe Fix Theano Cudnn BatchNorm when axis!=1 (#3968)
* fix batch_norm when axis!=1

* fix dimshuffle for all backends

* moving cudnn bn fix to theano backend

* fix pep8

* dont use cudnn when bn axis is non broadcastable, ie dim=1
2016-10-28 10:51:32 -07:00
Arbona 40fd415409 Changed name example 2016-10-27 10:46:54 +02:00
Laurent Gautier 9c7020f7e7 Only allow the addition to Sequential objects of layers that are instances of Layer (#4184)
* Check that the added object is an instance of class Layer

* Update models.py

* Fix ValueError error message
2016-10-26 11:02:10 -07:00
Sean 556399cc48 Add more util docs (#4154)
* Add more util docs

* Leave out single use utils
2016-10-26 10:40:33 -07:00
Ramanan Balakrishnan bef888c2d8 add new min_delta parameter in EarlyStopping to stop in cases of minimal improvements (#4202) 2016-10-26 10:39:52 -07:00
Stefan Wunsch a89dabe0cd Enhance doc about usage of sample weights in validation data tuple (#4199) 2016-10-26 10:18:59 -07:00
Alexander Rakhlin 80fbbc3a6a Bug fix in zca_whitening (#4181)
When calculating 'sigma' denominator is # of instances (axis=0), not dimensionality (axis=1)

Proof:
http://ufldl.stanford.edu/wiki/index.php/Implementing_PCA/Whitening
http://ufldl.stanford.edu/wiki/index.php/Exercise:PCA_and_Whitening
Ng uses 2nd dim in denominator because his matrix is features x instances
2016-10-25 10:40:03 -07:00
Carl Thomé 7a6ee934e1 Display wrapped layers in graph visualization (#4169)
* Display wrapped layers in graph visualization

* Check parent class instead of class's module

* Check instance instead for brevity

* More consistent naming
2016-10-25 09:40:14 -07:00
Arbona 8b11f13507 Changed name 2016-10-25 17:45:28 +02:00
Francois Chollet 4401120ca6 Style fixes 2016-10-24 15:49:38 -07:00
Michael Dietz 8dd61c1dc4 Fixed https://github.com/fchollet/keras/issues/4048 : in TensorBoard callback which fails when it is not the only callback (specifically when another cbk is ReduceLROnPlateau). (#4159) 2016-10-24 15:13:39 -07:00
Roberto de Moura Estevão Filho 6849589430 Fix LiL sparse matrix on Tensorflow (#4173)
LiL sparse matrices would not work correctly due to dtype being
different. Using the sparse_coo data fixes it.
2016-10-24 13:33:45 -07:00
Jaye 4cd83631ee Update imdb_cnn.py to use GlobalMaxPooling1D (#4164) 2016-10-24 09:25:08 -07:00
Felix Sonntag 028aae19bf Fixes for Python 3 (#4121)
* Fixed weights.sort for Python 3

In Python 3 weights.sort could throw a TypeError exception, if the
names are all None

* Fixed _flattened_layers under Python 3

If self.layers is empty, an IndexError appears when accessing it. So
it’s necessary to check if it’s non-empty first

* Fixed weight sorting for Theano backend

* Added missing import statement

* Improved backend handling for weight calculation

* Simplified weight sorting and backend check

* Changed behavior of weights sorting

* Removed unnecessary import
2016-10-23 09:01:16 -07:00
jarfo 41741c38e5 Keep shape of the initial (dummy) state (#4146)
tensorflow breaks if the shape of the state changes
https://github.com/fchollet/keras/issues/4008
2016-10-22 20:23:02 -07:00
Thomas Boquet 3feca20c59 + multiprocessing in legacy - unused imports (#4139) 2016-10-21 14:58:28 -07:00
Johan Pauwels f1bc3c03ed Make build_fn argument of sckit-learn wrappers accept class methods (#4107) 2016-10-20 15:33:56 -07:00
Fariz Rahman 66e5944799 Fix Merge layer docstring (#4132) 2016-10-20 15:23:10 -07:00
Francois Chollet 6ffa6f39e6 Fix typo in Merge layer docstring. 2016-10-19 14:10:17 -07:00
Francois Chollet 94ee8e1570 Add Xception model to keras.applications. 2016-10-19 14:06:07 -07:00
happygds 3e95633b1f manually terminate threads process returned by generator_queue() (#4101)
* manually terminate threads process returned by `generator_queue()`

Recently I custum a video sequence DataGenerator (based on ImageDataGenerator) for experiment. When I use model.fit_generator as following:
>history = model.fit_generator(train_data_generator, samples_per_epoch=train_data_generator.nb_sample,
                              nb_epoch=nb_epoch, verbose=1, callbacks=[early_stopping, model_checkpoint],
                              validation_data=test_data_generator, nb_val_samples=test_data_generator.nb_sample,
                              max_q_size=10, nb_worker=8, pickle_safe=True)
I found that the validation process consumes much longer time than training despite it contains less data.
I read the code and changed the `self.evaluate_generator()` (line 1482) in `fit_generator' to use a multiprocessing approach as training process did. However, the memory usage quikly increases and it only last for a few epoches. 
Through analysis, I think it is caused by the processes weren't freed after the `evaluate_generator` accomplished. Thus I suggest returning `generator_threads` from function `generator_queue()` and manually terminate these threads in `fit_generator`, `evaluate_generator`, `predict_generator`.

* stastify the PEP style

* correct the PEP8's E128 error
2016-10-18 20:34:50 -07:00
Ramanan Balakrishnan 70ebb15a33 Add documentation about metrics functions (#4024)
* Add documentation about metrics functions

* Add docstrings to metrics.py and auto-generate the docs from these strings
2016-10-18 19:57:42 -07:00
Gijs van Tulder d745d9ee96 Use Theano's pool_3d function. (#4065) 2016-10-16 22:27:15 -07:00
Abishek Bhat b89a93faae Remove unused imports. (#4083) 2016-10-16 21:58:35 -07:00
Vijay Vasudevan 044071f0d5 Switch use of TF cond function to use public function. (#4064)
* Switch use of TF cond function to use public function.

Prior to newer TFs, cond was unavailable and thus was being
imported via private module namespaces.

Newer TFs expose tf.cond as the public interface.  There
are plans to remove private module namespace access so
this fixes keras to first try accessing through the public
namespace, and then going through the private one for older
versions of TF.

* PEP8 fix
2016-10-14 14:27:15 -07:00
ηzw 79c1331432 Remove unused import statement (#4053) 2016-10-14 09:16:56 -07:00
Jayanth Koushik 86f28494a5 Return decay from get_config of all optimizers (#4052) 2016-10-13 15:25:50 -07:00
Yu Kobayashi d53a1cd0c0 Python 3 support of image_ocr.py (#4049)
I fixed to support Python 3.
2016-10-13 13:53:35 -07:00
Arbona 2c96373a41 remove another useless check 2016-10-13 21:30:01 +02:00
Arbona 731e1bb206 remove a useless check 2016-10-13 21:28:51 +02:00
Arbona c1a72b3644 More test and fixed dropout 2016-10-13 20:58:01 +02:00
fchollet e52740f09a Add Gitter link to README 2016-10-12 20:11:43 -07:00
fchollet 5dd8c5c10c Padding style fixes. 2016-10-12 18:02:39 -07:00
Dmitry Lukovkin 169c0896d6 Make ZeroPadding2D optionally asymmetric (#3595)
* Make ZeroPadding2D and ZeroPadding1D optionally asymmetric

* Make padding argument polymorphic.
Add test case for asymmetric padding.
Remove excessive imports.

* Fix layer config saving.

* Duck typing (as soon as test passes tuple as a list)

* Doc update

* Set padding value for the missing keys to 0.
Raise exception if unexpected keys are found in the padding dict.

* Add test for ZeroPadding1D
2016-10-12 17:48:57 -07:00
ftence 1bc0468ada Applied imagenet mean pixel on BGR instead of RGB. (#4027) 2016-10-12 16:59:56 -07:00
Gijs van Tulder 9a411f367d Use Theano's new theano.nnet.conv3d interface. (#4039) 2016-10-12 16:57:50 -07:00
Jayanth Koushik 6074a18ec4 Fixed typo in Adamax (#4043)
Fixed a typo in Adamax which prevented it from using explicit decay.
2016-10-12 16:57:22 -07:00
Arbona 0e7f3e04b0 pep fixed 2016-10-12 22:11:22 +02:00
Arbona 53552b1d6e Various fix 2016-10-12 22:00:55 +02:00
Taras Boiko d7d1db5d79 Test AveragePooling2D in test_average_pooling2d (#4034) 2016-10-12 08:21:21 -07:00
Fariz Rahman 9d7a2338b4 imdb fasttext speedup (#4026)
* imdb fasttext speedup

* Lambda -> GlobalAveragePooling1D
2016-10-11 11:01:11 -07:00
Taras Boiko 6e42b0e4a7 Added ability to return more than one metric from a function (#3907) 2016-10-11 10:54:02 -07:00
Gijs van Tulder ef7911310d Use Theano's cuDNN batch normalization for training. (#4023) 2016-10-11 10:52:07 -07:00
Ramanan Balakrishnan 999f402829 add KL divergence to metrics (#4025) 2016-10-11 10:50:44 -07:00
Bas Veeling 85c2d28e99 ReduceLROnPlateau fix for cooldown=0 (Fixes #3991) (#4011) 2016-10-10 13:18:58 -07:00
Arbona 6b7421c448 Various fix 2016-10-09 10:46:04 +02:00
fchollet 7df184d3aa Style touch-ups 2016-10-08 15:53:24 -07:00
Abishek Bhat 197005a791 Correct metrics usage in getting started guide. (#3993)
As the code
[here](https://github.com/fchollet/keras/blob/master/keras/engine/training.py#L662) suggests whenever a model is compiled with `metrics = [name_of_the_metric_function]` works, however, the documenation suggests that `accuracy` is the only supported string representation.
2016-10-07 23:34:21 -07:00
Ramanan Balakrishnan 52ee2380e4 Add top-k classification accuracy metrics (#3987)
* add categorical accuracy metric which tracks over top-k predictions

* remove top_k_categorical_accuracy from being tested together with other all_metrics

* fix in_top_k to work with batches. correct metrics.py and test_metrics.py appropriately

* style fixes for documentation on in_top_k function

* default to k=5 for top_k_categorical_accuracy metric
2016-10-07 23:32:19 -07:00
Anish Shah 530eff62e5 [issue #3942] Add GlobalMaxPooling3D and GlobalAveragePooling3D (#3983) 2016-10-07 15:06:19 -07:00
Francois Chollet 4de7eaa6a8 Update docs 2016-10-06 15:38:01 -07:00
Francois Chollet 8281988842 Style fixes 2016-10-06 15:01:17 -07:00
Francois Chollet 4ed7138685 Style fixes 2016-10-06 14:55:22 -07:00
Carl Thomé 6689189819 Add F-score metric to metrics.py (#3895)
* Added optional path argument

* Added optional field name argument

* Added LambdaCallback callback

* Fixed on_epoch_begin assignment

* Match default signatures

* Whitespace

* Test LambdaCallback examples

* Only test process termination

* Imports

* Fixed test

* Wait on process to terminate

* Add zero threshold and set F measure to zero if no true samples exist

* Reduce zero threshold

* Flip thresholded non-zero count

* Add F measure test

* Updated test

* Remove lambda, simplify

* Whitespace

* Update docstring

* Update test

* Whitespace
2016-10-06 14:53:53 -07:00
Emad El-Haraty 0ce7e4976a Descriptions of examples as a README.md file, allowing for easier browsing in github (#3982) 2016-10-06 11:17:22 -07:00
Hengkai Guo 6b18a908b8 Fix shape inference error for newly version Tensorflow in ctc_label_dense_to_sparse (#3955) 2016-10-04 11:21:31 -07:00
Gunnar Läthén 570fdf31c5 Python3 fix for deserialization of closures (#3961) 2016-10-04 11:16:44 -07:00
Seonghyeon Nam 929669bd1b Remove a print message when using global pooling (#3963) 2016-10-04 11:15:16 -07:00
Roberto de Moura Estevão Filho 240fd5b68e Fix control_flow_ops import (#3948)
* Fix control_flow_ops import

Old access was not working on new version of tensorflow. This should
work for all versions.

* Fix identation
2016-10-03 09:42:16 -07:00
Arbona 1d0d79f61a Various fix 2016-10-03 11:43:24 +02:00
Arbona b5dddeb419 Removed notebook and added example in python 2016-10-03 10:45:53 +02:00
Andre Simpelo 9194052a94 Fixed dead link in batch norm documentation (#3937)
Fixed dead link for the references in the Batch Normalization documentation
2016-10-01 20:37:42 -07:00
fchollet e0d871b7dc Restructure docs for Applications module 2016-10-01 15:19:12 -07:00
Sean c455a19f8e Change HDF5Matrix so start and end are optional (#3933) 2016-10-01 12:55:31 -07:00
Francois Chollet d864512631 Fix flaky test 2016-10-01 00:37:21 -07:00
Sean 6ee5d61c91 HDF5Matrix documentation (#3931) 2016-10-01 00:14:39 -07:00
fchollet 04df170bea Merge branch 'master' of ssh://github.com/fchollet/keras 2016-10-01 00:11:45 -07:00
fchollet 5f58a6d2ca Support all backends, dim orderings for music CRNN 2016-10-01 00:11:39 -07:00
Yu Yin ffff5e99aa Fix summary param counting problem (#3661) (#3884)
* Fix summary param counting problem (#3661)

* ...recursively

* Fix default parameter
2016-09-30 22:15:10 -07:00
Francois Chollet 8fab33c245 Make deconv VAE compatible with both dim orderings 2016-09-30 16:26:50 -07:00
Eder Santana 3bf8964355 Keras is TF first. Fix TH first example (#3914)
* Keras is TF first. Fix TH first example

* Use K.set_image_dim_ordering('th')
2016-09-29 10:57:08 -07:00
JM Arbona a3697d097d Added recurrent convolutionnal layer 2016-09-29 10:18:24 +02:00
Thomas Boquet 51c85dd8d6 Bypass shape inference in deconv2d and use the output shape provided by the user (#3838)
* bypass shape inference in deconv2d

* * more doc in deconv layer

* more deconv layers in var autoencoder example

* * typo doc

* replicate deconv example with with paper's params

* replicate example with paper's params

* typo doc

* + relus in the deconv

* typo in var autoencodeur example

* + mult by ndim

* style fixes

* pep8
2016-09-28 13:40:44 -07:00
Nithish deva Divakar 31f41b9822 typos (#3869)
Added missing numpy imports in examples
2016-09-28 12:30:36 -07:00
M Clark 458576bbe7 List files in alphabetical order (#3871)
`os.listdir` to `sorted(os.listdir)` for alphabetical order instead of arbitrary order. Following PR#3751 this allows mask and images with the same name to be read together.
2016-09-28 12:30:21 -07:00
Yu Yin e3a64cc8a7 Choose format according to filename when plotting (#3883) 2016-09-28 11:43:23 -07:00
Francois Chollet 9045616bda Revert adadelta lr 2016-09-27 10:50:35 -07:00
Francois Chollet 25dbe8097f Update adadelta default learning rate 2016-09-27 09:56:58 -07:00
fchollet fb6a2941b9 Fix typos 2016-09-24 22:19:32 -07:00
fchollet ed131973ef Fix music tagger application 2016-09-24 22:12:22 -07:00
Keunwoo Choi 43060d8c7d add audio models: audio_convnet and audio_conv_rnn (#3718)
* add audio models: audio_convnet and audio_conv_rnn

* add audio models: audio_convnet and audio_conv_rnn

* remove white spaces at the end of lines

* add audio_conv_utils.py, update applications.md

* remove useless line in example in application.md

* remove useless line in example in application.md

* rename models (MusicTaggerCNN,CRNN), BN mode=0 weights

* pep8

* remove MusicTaggerCNN, add include_top argument

* update to follow pep8
2016-09-24 19:53:47 -07:00
fchollet d5f1250a8b Update imagenet prediction decoding utilities 2016-09-24 11:46:41 -07:00
Bas Veeling 4c01c0c4d7 ReduceLROnPlateau Callback and CSVLogger Callback (#3780)
* ReduceLROnPlateau Callback and CSVLogger Callback

* Added documentation and cleanup.

* Added examples.

* Added test for ReduceLROnPlateau()

* Minor changes to naming.

* Added epsilon for lr comparison.

* Fix sensitivity issue

* PEP8
2016-09-23 21:16:19 -07:00
danstowell af28101af1 Functional API guide: fix variable names "loss"->"output" (#3856)
Some of the variable names in this guide were misleadingly named. The outputs were named as `*_loss` implying that they held loss values, whereas they in fact held the outputs. It rather confused me; I believe my proposed naming is clearer.
2016-09-23 08:59:36 -07:00
Flynn, Michael D 56aa9f364a Add cropping layers to documentation (#3853)
* Correct documentation for Cropping3D layer

* Add Cropping layers to documentation
2016-09-22 20:46:22 -07:00
Taras Boiko f0d9867d09 Changed ELU implementation to use native ops (#3845) 2016-09-22 11:08:21 -07:00
Carl Thomé cfc9b4d41d LambdaCallback (#3760)
* Added optional path argument

* Added optional field name argument

* Added LambdaCallback callback

* Fixed on_epoch_begin assignment

* Match default signatures

* Whitespace

* Test LambdaCallback examples

* Only test process termination

* Imports

* Fixed test

* Wait on process to terminate
2016-09-22 09:19:51 -07:00
Fariz Rahman de66211afb Set theano as default backend for windows users (#3831)
* Set theano as default backend for windows users

* Update __init__.py
2016-09-21 21:12:06 -07:00
M Clark 414d5f0978 make ImageDataGenerator behaviour fully seedable/repeatable (#3751)
* make ImageDataGenerator behaviour fully seedable/repeatable

This makes ImageDataGenerator fully seedable.
- the seed argument in fit is now used
- the seed argument in flow and flow_from_directory now effects
transforms
- added example to docs of transforming images and masks together
- added test of using two seeded streams at once

* implemented requested changes

- PEP8
- explicit names
- classes=None
- remove test
2016-09-21 21:11:39 -07:00
Fariz Rahman 99bd066f38 TimeDistributed : unroll RNN when using TF backend (#3835)
* TimeDistributed : unroll RNN when using TF backend

TF dynamic rnn not working with ndim > 3

* Update wrappers.py

* Update wrappers.py
2016-09-21 17:31:46 -07:00
ηzw 82a22b20fc Update default dim_ordering (#3832)
* Update default dim_ordering

* Update default dim_ordering
2016-09-21 11:32:08 -07:00
Francois Chollet 25ed701dbd Merge branch 'master' of https://github.com/fchollet/keras 2016-09-20 21:40:07 -07:00
Francois Chollet 875c521413 Update deep dream example 2016-09-20 21:39:51 -07:00
kuza55 7b8363632e Attempted fix for #3801 (#3827) 2016-09-20 14:57:08 -07:00
kuza55 06f18fa1b9 Matthews Correlation fix and test (#3822) 2016-09-20 09:19:00 -07:00
Taras Boiko 54fc646537 Split multitest in test_recurrent (#3818) 2016-09-20 08:43:42 -07:00
Francois Chollet b2e3780e8c Prepare PyPI release 2016-09-19 13:18:22 -07:00
Francois Chollet 0b04ac3117 Fix TF RNN dynamic behavior 2016-09-19 11:01:33 -07:00
Francois Chollet 90d0eb9b88 Regularizers style fixes 2016-09-18 15:27:45 -07:00
fchollet f2aa89f443 Freeze list of trainable weights at compile time 2016-09-18 10:41:37 -07:00
kuza55 2a319c7255 Add exception when trying to reuse regularizers (#3803)
My reading of regularizers is that they cannot be reused, but it doesn't actually fail in any way and seems like it results in only regularizing the last layer. Having an exception prevent this would probably improve the ergonomics.
2016-09-17 20:22:26 -07:00
Francois Chollet 4fb3f1b3f3 Make TF dynamic RNN work without states. 2016-09-16 17:15:18 -07:00
Furiously Curious 072d33599b Added Gitter channel badge (#3744)
* Added Gitter channel badge

Assigned @fchollet as channel admin on Gitter

* Link fix
2016-09-15 18:10:06 -07:00
Seonghyeon Nam 56f3c85b87 Fix ValueError(ndim of gamma and beta) of batch normalization when using Theano (#3740)
* Fix ndim mismatch error when using theano

* Change keras backend call
2016-09-15 18:09:02 -07:00
Francois Chollet 8b42fff90e Fix flaky test 2016-09-14 15:15:00 -07:00
Francois Chollet 1dc5d43d32 Remove deprecated resnet50 example 2016-09-14 15:03:26 -07:00
Francois Chollet ee2d08ff79 Fix activity regularization for wrapper layers 2016-09-14 15:02:05 -07:00
Francois Chollet 305b3bed74 Finalize streamlining of conv1d. 2016-09-14 14:39:47 -07:00
Francois Chollet 9f6acd960c Simplify Conv1D ops. 2016-09-14 14:18:15 -07:00
Flynn, Michael D 672890b1c8 Add AtrousConvolution1D to convolutional layers (#3763)
* Add `AtrousConvolution1D` to convolutional layers

* Add test for `AtrousConvolution1D` layer

* Add AtrousConvolution1D to docs
2016-09-14 11:40:04 -07:00
Francois Chollet c58bcc2c02 Fix deconv test 2016-09-13 16:56:39 -07:00
Francois Chollet 82318263a1 Set default backend to TF 2016-09-13 16:24:43 -07:00
Francois Chollet d90e1db50b Revert default backend to TH 2016-09-13 15:37:38 -07:00
Francois Chollet 8af0264a77 Set TensorFlow as default backend for new installs 2016-09-13 15:19:13 -07:00
Junwei Pan 8193287e08 Update docoment for callbacks.py: add the specification of auto mode (#3758) 2016-09-13 08:52:12 -07:00
fchollet 13bd33e73f Merge branch 'master' of ssh://github.com/fchollet/keras 2016-09-10 22:54:51 -07:00
fchollet b2e8d5ab7c Add support for LR decay in all optimizers 2016-09-10 12:34:05 -07:00
Ardalan a375cb322f fastText: adding n-gram embeddings for higher test_set accuracy (#3733)
* adding bi-gram embeddings for better test accuracy

* - add arbitrary n-gram range
- fix typos

* - fixing white spaces

* - add comment
2016-09-10 10:35:15 -07:00
dolaameng d9c4d8a76a update examples/neural_doodle.py based on issues #3731 (#3741) 2016-09-10 10:24:39 -07:00
kuza55 79edae58d5 Initial Sparse Matrix Support (#3695)
* Minimal SparseTensor support for TensorFlow

* Basic Theano support for Sparse dot product

* Sparse Input for Both + Sparse Concat for TF

* Fixed issue with _keras_shape for sparse Inputs

* pep8

* Cleanup + Theano concat (untested)

* Bug fix & pep8

* Fix Theano concat

* Bugfix & simplification

* Next step: Unit tests

* Basic unit test for sparse dot; TF works, TH fails

* Fix KTH is_sparse

* pep8

* more tests, sparse KTH.eval, pep8

* sparse model test

* address code review comments

* make sparse boolean in K.placeholder

* skip sparse tests when TH.sparse import fails

* pep8

* pep8

* fixed flakey test, auto-dense in KTH.eval

* fixed some more len/shape issues for fit_generator

* fixed some more len/shape issues for prediction

* Added better exceptions when theano.sparse fails to import

* betterer

* pep8
2016-09-09 16:26:37 -07:00
iampat 6675776640 Fix a small typo in help files (#3728)
impoprt --> import
2016-09-08 17:35:38 -07:00
dolaameng 40685c3b2a add examples/neural_doodle.py (#3724) 2016-09-08 10:15:57 -07:00
Francois Chollet 25874ceab2 Update TD wrapper 2016-09-07 19:32:26 -07:00
Tim Shi 4b2093ef67 allow output size different from state size (#3709) 2016-09-07 15:52:06 -07:00
kuza55 9bc2e60fd5 TensorBoard callback improvements (#3656)
* TensorBoard callback improvements

* Removed name improvement in TensorBoard callback

* Fix variables broken by removing name fixups

* Update callbacks.py
2016-09-07 12:59:08 -07:00
antonmbk 685ce7573d Added stacked what where autoencoder. (#3616)
* Added stacked what where autoencoder.

SWWAE uses residual blocks. Trains fast. Creates very good reconstructions.

* Added newline at end for PEP8

* Went through PEP8 errors and corrected all (except for the imports which following the numpy seed, but this should be ok).  Also, for the pool_size of 2, we halved the number of features maps and the number of epochs, and it still trains a net that can very nicely reconstruct the input.

* Added spaces arround - and + when they are used as binary operators (more PEP8).

* In decoder, the index of the features and pool size and wheres are all equal to nlayers-1-i, so set ind variable to this value and passed it to them.

* With ind variable in decoder, don't need two lines for the upsampling layer.

* Added title to plot, got rid of ticks on plot.

* PEP8 for * binary operator. Corrected some grammar issues in the docstring.
2016-09-07 11:05:41 -07:00
dolaameng f5ad1c5753 fix bug in neural_style_transfer example for image_dim_ordering=tf (#3715)
* fix bug in neural_style_transfer example for image_dim_ordering=tf

* fix PEP8 mixed space and tab
2016-09-07 10:57:23 -07:00
Francois Chollet cc92025fdc Make examples agnostic to image_dim_ordering 2016-09-06 15:53:56 -07:00
Fariz Rahman f05cd95fad Dot/cos merge : bug fix (#3708) 2016-09-06 15:04:26 -07:00
kuza55 4325843ef0 Add Matthews correlation coefficient to metrics (#3689)
* Add Matthews correlation coefficient to metrics

I needed this for a Kaggle competition and it seemed useful in general so I thought I'd contribute it back.

* Enabled test for matthews metric

* Remove unnecessary cast garbage

* Addresses code review comments

* Renamed to matthews_corrcoef to be consistent with sklearn

* Update test_metrics.py

* pep8

* rename to mathews_correlation

* Update metrics.py

* Fixed typo
2016-09-06 13:42:56 -07:00
Arel Cordero 607635d2ce Optionally load weights by name (#3488)
* Adding feature to load_weights by name

Squashed commit of the following:

commit fd47e763855c34ed78d26ee441d83e0e63f08119
Author: Arel Cordero <arel@ditto.us.com>
Date:   Thu Aug 18 16:02:14 2016 +0000

    typo

commit d0b06c03080131c55ab4777064a196ff339ad7df
Author: Arel Cordero <arel@ditto.us.com>
Date:   Thu Aug 18 15:52:35 2016 +0000

    update documentation for "load_weights"

commit 844cfc2e8c9c6f267799a22ed54ac4d75807c5ab
Author: Arel Cordero <arel@ditto.us.com>
Date:   Thu Aug 18 02:42:10 2016 +0000

    batch updating weights

commit f361a70da4b40b961f1af9c8f1c3cd26273d0cad
Author: Arel Cordero <arel@ditto.us.com>
Date:   Thu Aug 18 02:29:17 2016 +0000

    removing pudb line

commit 738de4c371503626b4c9dbae6428fb279b368a76
Author: Arel Cordero <arel@ditto.us.com>
Date:   Wed Aug 17 19:56:51 2016 +0000

    adding unit tests for loading weights by name

commit cb0971b3cfe62452ab445e4034098cab2be3031b
Author: Arel Cordero <arel@ditto.us.com>
Date:   Tue Aug 16 23:45:32 2016 +0000

    cleaning up code based on comments

commit ef08fd2c9f5d3c65359cbdf5b090e08733a518de
Author: Arel Cordero <arel@ditto.us.com>
Date:   Tue Aug 16 04:50:46 2016 +0000

    debugging

commit 0d74f0e997960886b1044c26001de6cd6ad90bb9
Author: Arel Cordero <arel@ditto.us.com>
Date:   Tue Aug 16 04:15:43 2016 +0000

    optionally load model by name

* changed random file names to use tempfile module

* clean up documentation strings

* clarifying documentation
2016-09-06 11:42:31 -07:00
Abishek Bhat b8fddc862e Add missing Softmax activation memnn. (#3706)
The implementation of bAbi [End to End Memory
Network](https://arxiv.org/pdf/1503.08895v5.pdf) in the example
seems to be missing the Softmax Layer.

Quoting the paper.

> The query q is also embedded (again, in the simplest case via another embedding matrix
B with the same dimensions as A) to obtain an internal state u. In the embedding space, we compute
the match between u and each memory m<sub>i</sub> by taking the inner product followed by a softmax.

Also, the question encoder
[here](https://github.com/fchollet/keras/blob/0df0177437ce672d654db6d7edfdc653aaf67533/examples/babi_memnn.py#L186) seems to sum over the probabilities and the question vector as suggestted in the original paper.

> Output memory representation: Each x<sub>i</sub> has a corresponding output vector c<sub>i</sub> (given in the
simplest case by another embedding matrix C). The response vector from the memory o is then a
sum over the transformed inputs c<sub>i</sub> , weighted by the probability vector from the input.

I tried running the model(with and without the intermediate softmax)
against _Single Supporting Fact_ en-10k dataset and found that the
network the intermediate softmax trained a lot faster(95% at 100epoch) than the former(67% at 100epoch).

Network without the Softmax activation in the Input Memory Representation at epoch=100
======================================================================================
```
Iteration 10
Train on 10000 samples, validate on 1000 samples
Epoch 1/10
10000/10000 [==============================] - 8s - loss: 0.0549 - acc: 0.9819 - val_loss: 1.8088 - val_acc: 0.6470
Epoch 2/10
10000/10000 [==============================] - 6s - loss: 0.0612 - acc: 0.9802 - val_loss: 1.7839 - val_acc: 0.6650
Epoch 3/10
10000/10000 [==============================] - 6s - loss: 0.0542 - acc: 0.9812 - val_loss: 1.7595 - val_acc: 0.6750
Epoch 4/10
10000/10000 [==============================] - 6s - loss: 0.0538 - acc: 0.9826 - val_loss: 1.8198 - val_acc: 0.6670
Epoch 5/10
10000/10000 [==============================] - 6s - loss: 0.0590 - acc: 0.9790 - val_loss: 1.7891 - val_acc: 0.6650
Epoch 6/10
10000/10000 [==============================] - 6s - loss: 0.0548 - acc: 0.9803 - val_loss: 1.7682 - val_acc: 0.6790
Epoch 7/10
10000/10000 [==============================] - 6s - loss: 0.0455 - acc: 0.9841 - val_loss: 1.8394 - val_acc: 0.6730
Epoch 8/10
10000/10000 [==============================] - 6s - loss: 0.0559 - acc: 0.9797 - val_loss: 1.7764 - val_acc: 0.6650
Epoch 9/10
10000/10000 [==============================] - 6s - loss: 0.0488 - acc: 0.9835 - val_loss: 1.7711 - val_acc: 0.6620
Epoch 10/10
10000/10000 [==============================] - 6s - loss: 0.0502 - acc: 0.9834 - val_loss: 1.8225 - val_acc: 0.6700
```

Network with Softmax Activation in the Input Memory Representation at epoch=100
===============================================================================

```
Iteration 10
Train on 10000 samples, validate on 1000 samples
Epoch 1/10
10000/10000 [==============================] - 6s - loss: 0.0084 - acc: 0.9972 - val_loss: 0.2426 - val_acc: 0.9520
Epoch 2/10
10000/10000 [==============================] - 7s - loss: 0.0152 - acc: 0.9946 - val_loss: 0.2063 - val_acc: 0.9560
Epoch 3/10
10000/10000 [==============================] - 6s - loss: 0.0104 - acc: 0.9969 - val_loss: 0.2010 - val_acc: 0.9540
Epoch 4/10
10000/10000 [==============================] - 6s - loss: 0.0163 - acc: 0.9959 - val_loss: 0.2023 - val_acc: 0.9580
Epoch 5/10
10000/10000 [==============================] - 6s - loss: 0.0136 - acc: 0.9962 - val_loss: 0.2007 - val_acc: 0.9560
Epoch 6/10
10000/10000 [==============================] - 6s - loss: 0.0152 - acc: 0.9953 - val_loss: 0.1989 - val_acc: 0.9570
Epoch 7/10
10000/10000 [==============================] - 7s - loss: 0.0085 - acc: 0.9969 - val_loss: 0.2113 - val_acc: 0.9490
Epoch 8/10
10000/10000 [==============================] - 7s - loss: 0.0116 - acc: 0.9972 - val_loss: 0.2346 - val_acc: 0.9500
Epoch 9/10
10000/10000 [==============================] - 7s - loss: 0.0106 - acc: 0.9970 - val_loss: 0.2052 - val_acc: 0.9550
Epoch 10/10
10000/10000 [==============================] - 7s - loss: 0.0132 - acc: 0.9963 - val_loss: 0.2114 - val_acc: 0.9500
```
2016-09-06 11:33:11 -07:00
dolaameng 0df0177437 make image parameters more consistent (#3672)
* change of variable names in examples/neural_transfer_style for consistency

* add docstring to keras.preprocessing.image.load_img()
2016-09-06 10:29:26 -07:00
Pedro S f90cbcd1e3 Added regularization option to BatchNormalization layer (#3671)
* Added regularization option to BatchNormalization layer

* Update normalization.py

* Added regularization to BN test

* Fixed identation

* Removed trailing whitespace and refixed identation
2016-09-02 08:15:51 -07:00
Fariz Rahman 870d7f7f93 Lambda layer : Allow multiple inputs (#3668) 2016-09-01 13:48:21 -07:00
Roberto de Moura Estevão Filho 799bec66a2 CTC import compatibility with tensorflow 0.10 (#3650)
* CTC import compatibility with tensorflow 0.10
Try except clause to import ctc_loss in new path on tensorflow 0.10.

* Fixed ctc_decode and added tests for tensorflow.
ctc_decode when using beam search decoder has been fixed to conform with
tensorflow API. Function documentation has been updated to reflect the
changes. Two tests, for greedy and beam search decoding, have also been
added to test_backends.py.

* Fix pep8 styling.

* Fixed styling on long lines on ctc_decode tests.
2016-09-01 11:33:11 -07:00
Matt 2321fbbc1d Fix Batch Norm compatibility with 3D inputs (#3666)
* Fix Batch Norm compatibility with 3D inputs

the theano backend now uses dnn_batch_normalization which only supports
up to 4-dimensional input. This breaks any 5-d layers such as 3D
convolutions.

* using intermediate variable
2016-09-01 10:22:23 -07:00
gw0 48ae7217e4 Fix TensorFlow RNN backwards support. (#3662) 2016-09-01 05:56:42 -07:00
Francois Chollet 6f54b233f1 Fix Theano input shape inference in InputLayer 2016-08-31 21:49:43 -07:00
ηzw 1bf1055395 Update docs for SpatialDropouts (#3652) 2016-08-31 13:55:21 -07:00
gw0 6417d90d5c Fix #2814 lambda function serialization and deserialization (#3639)
* Remove old-style function attributes.

* Fix lambda function serialization and deserialization.
2016-08-31 10:05:05 -07:00
fchollet c939cebf0d Theano rnn fix when input_dim = 1 2016-08-30 18:35:22 -07:00
kuza55 7ae36d132a Write TensorBoard Histograms with Tensor names (#3635)
Resolves the acute symptoms in https://github.com/fchollet/keras/issues/3357

Doesn't address the question of having a better __repr__ since that is a much wider change.
2016-08-30 16:55:38 -07:00
Francois Chollet c478409dad Fix weight constraint sharing issue 2016-08-30 12:54:42 -07:00
Frédéric Bastien 109441a708 Small speed up by preventing transfer to CPU or copy on the CPU just to get the shape. (#3631) 2016-08-30 12:41:47 -07:00
Fariz Rahman b267e8293d Update sequential-model-guide.md (#3630) 2016-08-30 09:43:41 -07:00
Francois Chollet 3a4c683d5c Update download path for babi dataset 2016-08-29 13:03:36 -07:00
Sean Löfgren d5649da5f8 update pytest config for pep8 tests (#3617) (#3619) 2016-08-29 11:20:39 -07:00
Fariz Rahman 9c28d21b4f Fix lambda layer docstring (#3604) 2016-08-29 10:44:19 -07:00
kuza55 9e58b8237b Enable colocate_gradients_with_ops=True (#3620)
By default TensorFlow allocates all gradient matricies on gpu:0, which makes it pretty much impossible to do parallelize a large model.

colocate_gradients_with_ops puts these matricies next to the operations, allowing you to split your model across multiple GPUs. I ran into this issue myself and this fixed it for me.

I think it's also meant to set gradient computations to be done on the device where the operations are stored, but my belief about that comes from https://github.com/tensorflow/tensorflow/issues/2441

I'm not sure why this isn't the default in TF, so I'm not sure if this should be behind a flag or something, but having to make my own patches to keras to do multi-GPU training seems like the wrong answer.
2016-08-29 10:44:01 -07:00
Francois Chollet b184c76205 Update docs 2016-08-28 14:29:40 -07:00
fchollet 065fb2a74c Add global pooling layers 2016-08-28 14:22:15 -07:00
Francois Chollet a0a0d42630 Fix example in doc 2016-08-28 13:20:51 -07:00
Francois Chollet 756153899a Merge branch 'master' of https://github.com/fchollet/keras 2016-08-28 13:10:47 -07:00
Francois Chollet e02554412f Fix example in doc 2016-08-28 13:09:33 -07:00
ηzw ca37e806b9 Fix docs (#3609)
* Fix typo

* Fix typo

* Fix docstring

* Remove the unnecessary augument in docstring
2016-08-28 11:22:16 -07:00
Francois Chollet fe0347dbf0 Update docs 2016-08-28 02:33:50 -07:00
Francois Chollet 4984c5fc7c Update documentation 2016-08-28 02:03:14 -07:00
Francois Chollet f605769af9 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-28 01:11:08 -07:00
Francois Chollet 534f6b7975 Remove flaky test 2016-08-28 01:10:53 -07:00
ηzw ee8fd78383 Fix docstring in Locally-connected Layers (#3607) 2016-08-28 01:07:58 -07:00
Francois Chollet fbc4f37037 Example touch-up 2016-08-27 20:28:03 -07:00
Francois Chollet f23f2ff2c9 Add keras.applications, refactor 2 convnet scripts 2016-08-27 20:27:49 -07:00
Francois Chollet d0659327bd Prepare 1.0.8 release. 2016-08-27 17:04:58 -07:00
Nithish deva Divakar 88b301f182 Update io_utils.py (#3577)
* Update io_utils.py

Fix for wrong input dimension when using HDF5matrix for loading data

* Update io_utils.py
2016-08-27 09:45:31 -07:00
Yanush Viktor d5e16807d2 Update image.md (#3600)
Information in docs about optional parameter `shuffle` is not correct. In the code
https://github.com/fchollet/keras/blob/master/keras/preprocessing/image.py#L263
it's `True` by default, but `False` in the docs.
2016-08-27 09:45:10 -07:00
Fariz Rahman 79a2bcd05f TF dynamic RNN : Allow sequences of ndim > 3 (#3603)
Docstring says "at least 3D", but current code is hard coded for 3D input.
2016-08-27 09:44:55 -07:00
dibule e582f9dcac Bug fix, batch_size set instead of default one (#3590) 2016-08-26 14:30:57 -07:00
Carl Thomé 4cefd6136b Add pydot-ng dependency (#3593) 2016-08-26 14:30:25 -07:00
Francois Chollet 1cf04b7a10 Update documentation 2016-08-26 14:22:56 -07:00
Francois Chollet ad3db301f2 Update RNN docstring 2016-08-25 12:34:35 -07:00
Francois Chollet e15eb40317 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-25 11:46:41 -07:00
Felix Lau 8459d0403c Fix Cropping3D InputSpec to be dim=5 (#3570) 2016-08-25 10:29:52 -07:00
François Chollet 08014eea36 Add support for dynamic RNNs in TensorFlow. (#3474)
* Add support for dynamic RNNs in TensorFlow.

* Fix return states

* Add support for go_backwards in dynamic TF RNNs

* Currently broken: TF RNN dropout, go_backwards

* Finalize dynamic RNNs in TF

* Remove unnecessary comment

* Comment out added test

* Comment out functional guide test
2016-08-24 20:47:51 -07:00
Francois Chollet c0b27108d0 Whitespace fix 2016-08-24 16:15:38 -07:00
Francois Chollet 40612facf3 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-24 15:16:39 -07:00
Francois Chollet e418fc6937 Add get_variable_shape backend method 2016-08-24 15:16:22 -07:00
cyrusmaher 045d442fcd Update scikit_learn.py (#3567)
Sklearn (e.g. the Bagging estimator) expects a value error if a parameter isn't recognized.
2016-08-24 14:19:25 -07:00
Dmitry Lukovkin 2370f9f1db Docs update for Deconvolution2D layer (#3549)
* Fix exception message for Deconvolution2D

* Docs update for Deconvolution2D layer (#3540)

* Corrections to Deconvolution2D docs

* References formatted as Markdown links
* Blank lines added
2016-08-24 13:16:02 -07:00
Jurim Lee 519d1e7420 bug fix : cnn tensorflow backend error (#3558) 2016-08-24 08:57:31 -07:00
Francois Chollet 82afc713d0 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-24 04:15:34 -07:00
Francois Chollet a1e9a8addd data_utils update 2016-08-24 04:15:20 -07:00
Fariz Rahman f59736a06c Bug fix : RNNs (#3550) 2016-08-23 17:03:17 -07:00
Francois Chollet d187059596 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-23 14:30:21 -07:00
Francois Chollet 7d2f0b1ba8 Fix trainable_weight lists 2016-08-23 14:29:49 -07:00
Francois Chollet 6a6d939dea Style fix 2016-08-23 14:29:27 -07:00
Keunwoo Choi 090b8b7f99 add cropping1d/2d/3d layers (#3509)
* add cropping1d/2d/3d layers

* fix PEP8 issue, fix incorrect doc strings

* add example code on Cropping2D

* fix init/get_config of crop1d/3d, add test codes for cropping1d/2d/3d

* fix test code - PEP8

It doesnt pass test (only in cropping2d and basic_test), but my laptop setting is not correct (it doesnt pass some other existing layes as well), so committing to test it in a correct way.

* change to follow PEP8 again

* update test_convolutonal.py for PEP8, test code to us K.image_dim_ordering()

* PEP8 for test_convolutional.py - indentation

* fix typo. add assert to check cropping lengths
2016-08-22 19:20:47 -07:00
Dominic Breuker e4dda27de1 allow KerasClassifier.score to accept sparse labels (#3534) 2016-08-22 15:19:26 -07:00
EdwardRaff f2786d9d80 Update theano_backend.py (#3532)
fix batch norm causing NaN for many datasets
2016-08-20 12:50:23 -07:00
Katherine Crowson c6daa24e3c Use Theano's CuDNN implementation of batch normalization (#3529) 2016-08-19 20:03:55 -07:00
François Chollet 2f4eed1f0f Remove lock in fit_generator (#3528) 2016-08-19 16:09:20 -07:00
Francois Chollet acc5c45feb Remove lock in fit_generator 2016-08-19 11:21:47 -07:00
Rodrigo Prado 00c3335071 added language sh to install instructions (#3520) 2016-08-19 08:36:51 -07:00
Ardalan 65e4c8e76e Update lstm_text_generation.py (#3516)
updating comment
2016-08-18 14:03:26 -07:00
Francois Chollet d4f5dff8ee Style fixes 2016-08-18 13:51:45 -07:00
dolaameng 8d9cb782fb add example mnist_net2net.py (#3503)
* add example mnist_net2net.py

* change of mnist_net2net.py based on 1st comments

* typo fixed in examples/mnist_net2net.py
2016-08-18 13:07:16 -07:00
fchollet 02ff1d4462 Fix style in example 2016-08-17 20:00:49 -07:00
Francois Chollet 007d2c2e25 Fix theano tests 2016-08-17 17:39:24 -07:00
Francois Chollet 3bf7637986 Style fixes 2016-08-17 16:32:52 -07:00
Francois Chollet 33ff9dbce2 Speed up bidirectional wrapper tests 2016-08-17 16:30:31 -07:00
Fariz Rahman f25e894558 Bidirectional Wrapper (#3495)
* Add Bidirectional Wrapper

* Fix example

* Update wrappers.py

* Add reverse op

* Update tensorflow_backend.py

* Update wrappers.py

* Update test_wrappers.py

* bug fix

* Update test_wrappers.py

* Update test_wrappers.py

* bug fix

* Add test for reverse op

* Enable reverse along multiple axes

* Update theano_backend.py

* Update theano_backend.py

* Update test_wrappers.py

* Speed up tests

* Validate merge_mode arg, Add None mode

* Update test_wrappers.py

* Update test_wrappers.py

* Add properties; reverse -> backward

* Bug fix

* Resolve naming conflict

* Whitespace fix

* Update imdb_bidirectional_lstm.py

* Fix imports
2016-08-17 16:27:06 -07:00
Junwei Pan 52c1a7456f Add a reference paper for Adagrad (#3473)
Add a reference paper for Adagrad
2016-08-17 13:31:24 -07:00
Junwei Pan b2392413fa Fix a issue when only specify one dot_axes for in the Merge layer in the dot mode (#3470)
* Upload examples/imdb_fasttext.py which implement the fasttext model

* Remove Dropout and unnecessary imports

* Remove Dropout and unnecessary imports

* Remove Dropout and unnecessary imports

* Fix a issue when only specify one dot_axes for in the Merge layer

* Fix a issue when only specify one dot_axes for in the Merge layer
2016-08-17 13:30:45 -07:00
Frederico Tommasi Caroli 1941eaabe0 Using locks instead of ignoring ValuErrors in generators (#3501) 2016-08-17 11:20:51 -07:00
ηzw 3d5bf9753f Fix typo (#3505) 2016-08-17 11:18:52 -07:00
fchollet a4d191d4f9 Only write config file if it didn't exist 2016-08-16 20:53:57 -07:00
Reid Sanders dad54ec211 Minor imbd dataset documentation update, added docstring to reuters (#3494)
* Updated dataset documentation to reflect removal of test_split argument
from imbd dataset. Added docstring to reuters dataset load_data.

* Updated imbd and reuters examples in dataset docs to reflect all
available arguments with current default values.
2016-08-16 14:14:32 -07:00
Francois Chollet b525f5f4d7 Make h5py optional again 2016-08-16 13:31:15 -07:00
Mike Henry e8190a8d8d Added support for CTC in both Theano and Tensorflow along with image OCR example. (#3436)
* Added CTC to Theano and Tensorflow backend along with image OCR example

* Fixed python style issues, made data files remote, and made code more idiomatic to Keras

* Fixed a couple more style issues brought up in the original PR

* Reverted wrappers.py

* Fixed potential training-on-validation issue and removed unused imports

* Fixed PEP8 issue

* Remaining PEP8 issues fixed
2016-08-16 13:25:26 -07:00
Francois Chollet 4e155139ca Style fixes. 2016-08-16 11:17:44 -07:00
stas-sl 458edeed9a implement spatial dropout (#3463) 2016-08-16 11:16:18 -07:00
Katherine Crowson 04d785f4bf Allow multiprocessing on OS X (#3389) 2016-08-14 15:43:43 -07:00
fchollet 28d9c0c511 Style fixes in example. 2016-08-14 15:41:04 -07:00
Antonie Lin 91310971b9 mnist hierarchical rnn example (#3460) 2016-08-14 15:40:13 -07:00
Francois Chollet 5d2acf4897 Merge branch 'master' of https://github.com/fchollet/keras 2016-08-13 11:54:04 -07:00
Francois Chollet dc98019d49 Add get_layer to sequential models 2016-08-13 11:53:48 -07:00
ηzw b008bb35cc Style fixes (#3462) 2016-08-13 11:22:01 -07:00
Junwei Pan 46d5b197e0 Implement a fasttext example (#3446)
* Upload examples/imdb_fasttext.py which implement the fasttext model

* Remove Dropout and unnecessary imports

* Remove Dropout and unnecessary imports

* Remove Dropout and unnecessary imports
2016-08-12 14:36:35 -07:00
Francois Chollet 2c510530b1 Update FAQ 2016-08-10 16:44:35 -07:00
Francois Chollet ec6eda77ad Style fixes in tests 2016-08-10 16:44:29 -07:00
Yann Henon 4805e5856b Upsampling layer fix to work when input shape is None (#3429) 2016-08-09 14:47:23 -07:00
Fariz Rahman 55447cbb3d Bug fix : squeeze (#3433) 2016-08-09 13:59:47 -07:00
Francois Chollet 69d5139b8c [breaking change] Weight ordering now topological. 2016-08-09 10:20:00 -07:00
Francois Chollet 89f1e05147 Functional Model weight loading in layer order 2016-08-08 18:53:36 -07:00
Francois Chollet bc779df8b7 Avoid creating new ops in TF to load weights 2016-08-08 16:13:40 -07:00
Francois Chollet e3c260e7d3 Fix test flakes 2016-08-08 13:00:53 -07:00
Francois Chollet 0af7e004c7 Fix test flakiness in TF 2016-08-08 12:28:35 -07:00
Francois Chollet 447445388e Merge branch 'master' of https://github.com/fchollet/keras 2016-08-08 11:23:38 -07:00
Francois Chollet b2c66816d7 Prepare new PyPI release. 2016-08-08 11:23:25 -07:00
Martin Thoma b6f81c6cc3 Fix documentation of the 1D max pooling layer (#3419) 2016-08-08 11:04:54 -07:00
Yad Faeq 98b289630a Word embdedding example updated (#3417)
* Added Convolution1D instead of Conv1D, which is depreceated

* updated rest of the example using Conv1D

* Python3 fails to decode utf-8 data, thus using encoding='latin-1'

* added condition for Encoding line 65-67

* Conv1D reverted back to the way it was
2016-08-08 10:59:31 -07:00
fchollet d68c0bd795 Update optimizers docs 2016-08-07 19:31:25 -07:00
Santiago Castro 5afda71f74 Fix scikit-learn python file name in docs (#3413) 2016-08-06 19:29:27 -07:00
Ramon de Oliveira 1b08a8d675 format Nadam references (#3404) 2016-08-05 16:41:34 -07:00
Moussa Taifi b508ab64bd Fix documentation for ModelCheckpoint callback params (#3408) 2016-08-05 16:41:15 -07:00
Richard Higgins 84f435e24b all zero arrays no longer get divided by zero in the process of being turned into images (#3401)
Update image.py
2016-08-05 09:10:52 -07:00
Fariz Rahman 984ad34a61 One hot op (#3353)
* One hot op

* tf too

* Update theano_backend.py

* Use built-in theano op

* Update theano_backend.py

* Add test

* Update test_backends.py

* Update test_backends.py

* Generalize for nD tensors

* Fix docstring on TF backend

* Update theano_backend.py

* Update theano_backend.py
2016-08-04 14:16:36 -07:00
Ondřej Filip ad3231c29a Docs update for Merge layer (#3392)
In Merge layer dot_axes param affects also 'cos' mode
2016-08-04 08:54:10 -07:00
fchollet c3d20bbc53 Merge branch 'master' of ssh://github.com/fchollet/keras 2016-08-03 22:11:12 -07:00
fchollet f9c03f183f Make trainable_weights dynamic 2016-08-03 22:02:45 -07:00
Tsukasa ŌMOTO 046a3c8a28 Fix YAML serialization for Advanced Activations (#3391)
Fix https://github.com/fchollet/keras/issues/2871#issuecomment-237365465

The problem occurs because PyYAML can't recognize numpy's data types.
2016-08-03 20:41:52 -07:00
Francois Chollet 05883934f1 Unify BN behavior across backends (fix) 2016-08-03 13:22:21 -07:00
Francois Chollet 97d2a73dd3 Unify BN behavior in TF and Theano 2016-08-03 12:39:24 -07:00
Francois Chollet 5367a44acb Further style fixes in example 2016-08-03 11:54:15 -07:00
Francois Chollet 1deaf71388 revert unnecessary changes in example script 2016-08-03 11:11:35 -07:00
Francois Chollet 99f564e972 Style fixes and small bug fixes 2016-08-03 11:08:48 -07:00
Zhengtao Wang c725f8d354 add resnet50 example (#3266)
* add resnet50 example

* fix PEP8 problems

* fix PEP8 problem....again...

This script get 10 points in PEP8 tests on my computer but.....

* add resnet 50

* fix problem caused by interrupted git push

* fix PEP8 problem..again!

* update weights links and remove load_weights

* fix pep8!

* remove skimage dependency, rename the file

* fix pep8...

* update

* support tf dim_ordering

* fix PEP8 problem
2016-08-03 10:51:18 -07:00
Shuhei Iitsuka 257ace722c Correct the reconstruction loss (#3383) 2016-08-02 21:45:31 -07:00
Wei Ouyang 0cd9d46828 remove keyword for cifar10.load_data (#3379) 2016-08-02 08:20:00 -07:00
Chris Caruso cef9e28a6c Added to rnn statefulness doc concerning func api (#3375)
Added correct information for how to enable rnn statefulness on functional models with 1 or more Input layers.
2016-08-01 20:21:42 -07:00
Francois Chollet 6c42da2abf Fix legacy model loading 2016-08-01 16:26:01 -07:00
Francois Chollet a9fc2bed49 Allow to call load_weights on model save file 2016-07-31 22:58:44 -07:00
Francois Chollet 1855c49d1f Merge branch 'master' of https://github.com/fchollet/keras 2016-07-31 17:45:43 -07:00
Francois Chollet cce65ce34d Update docs. 2016-07-31 17:45:32 -07:00
Jan Wilken Dörrie 70866c0154 Compare versions through pkg_resources.parse_version (#3355) 2016-07-31 10:45:03 -07:00
dolaameng d06e3753b0 update examples/neural_style_transfer.py (#3347) 2016-07-29 12:00:37 -07:00
Fariz Rahman cb4de1f859 Embedding layer - cast float to int implicitly (#3342)
* Embedding layer - cast float to int implicitly

* Cast only if not int
2016-07-29 11:11:46 -07:00
Jérémie b6d23b2e2d remove usage of tf.assign() in part of tensorflow backend (#3316) (#3320)
* remove usage of tf.assign() in the tensorflow backend (#3316)

Usage of the tf.assign() function in the set_value() and batch_set_values() functions creates new nodes on the Tensorflow graph which can eventually overflow the memory.
Therefore, the function has been rewritten using placeholders and feed_dict to avoid allocating additional memory.

* Correction to the set_value() function

Change to the set_value() function that had a bug when the variable "value" was a float.
The *1. dummy multiplication was added to avoid having to deal with tf.float32_ref dtypes.

* update set_value() of the tensorflow backend

Removal of the *1. dummy multiplication, replacement with a split() to avoid creating a new operation in the graph.

* fix to have session.run() called once in batch_set_value()

Rewriting of the batch_set_value() to avoid multiple calls to session.run() to improve speed.
2016-07-28 09:29:57 -07:00
Pradeep Dasigi 6a8815de0c Masked and non-masked merge bug fix (#3218)
* Masked and non-masked merge bug fix

* Masked merge concat logic with an expanded loop

* Cast mask of ones for unmasked input in merge to uint8
2016-07-27 17:50:49 -07:00
Francois Chollet e0179bad2f Refresh IMDB dataset 2016-07-27 17:30:04 -07:00
Francois Chollet 8778add0d6 Clean up test files 2016-07-27 12:09:40 -07:00
Francois Chollet facc823612 Update FAQ 2016-07-27 11:15:04 -07:00
Francois Chollet b91854ea9d Fix flaky test 2016-07-27 10:27:47 -07:00
Francois Chollet 05abe814ac Add keras version in model save files 2016-07-27 10:22:49 -07:00
François Chollet ea561ba6d8 Add model saving functionality (#3314)
* Add model saving functionality

* Update model saving functionality

* Fix py3 bytes/str issue

* Fix tests
2016-07-26 20:45:28 -07:00
Mostafa Abdulhamid df84c69676 Docker image for test and experiment Keras (#3035)
* Docker image for test and experiment Keras

 - Docker image with CUDA support on ubuntu 14.04
 - nvidia-docker script to forward the GPU to the container
 - MakeFile to simplify docker commands for build, run, test, ..etc
 - Add useful tools like jupyter notebook, ipdb, sklearn for experiments

* update nvidia-docker plugin

* use .theanorc in Dockerfile

* Add tensorflow to the docker image

* update Docker image to cuDNN v5

* test fixes

* move docker to sub directory

* README for docker

* Fix typos

* Add visualization to Dockerfile
2016-07-26 18:29:56 -07:00
Sadegh 3726aba2ee use of period fixed, default period set to 1000 (#3312) 2016-07-25 20:41:22 -07:00
Francois Chollet f6bcaffe4a Style fixes 2016-07-25 10:42:37 -07:00
yaringal c689b52dd1 Implemented transposed (de-) convolutions into Keras (#3251)
* theano backend now supports transposed convolutions

* working deconv

* new example file with deconv vae

* merged with #3273, fixed based on comments, pep8 tested

* test fix

* passes theano test

* start fixing deconv test

* fix deconv layer tests

* fix the right test

sorry, I "fixed" the wrong test last time

* clean up

* replace with_None with fixed_batch_size

* with_None --> fixed_batch_size

* comment edit

* fixed comments online
2016-07-25 10:33:03 -07:00
fchollet 09d75a4347 Style fixes 2016-07-24 14:20:36 -07:00
Rui Shu 59bd247603 Fix VAE example (#3220)
A number of changes:
1. Switch from Lambda to merge, otherwise code will not run.
2. Rename z_log_std to z_log_var in order for the objective function to make sense
3. Adjust reparameterization trick to reflect use of z_log_var, not z_log_std
4. Remove epsilon_std, since (standard) VAE uses isotropic gaussian prior.
5. Re-balance the weighting of KL and reconstruction terms
6. Use adam instead of rmsprop
7. Increase hidden unit size to improve model
8. Increase batch size to speed up training
2016-07-24 14:09:34 -07:00
dolaameng f221ef952f make examples/pretrained_word_embeddings.py more memory efficient (#3289)
* make examples/pretrained_word_embeddings.py more memory efficient

* make examples/pretrained_word_embeddings.py more memory efficient

* rename NB_WORDS to nb_words as it is not a global constant
2016-07-23 10:14:28 -07:00
Sebastian N. Fernandez d3c75e1d34 adding print_tensor (#3285) 2016-07-22 12:48:44 -07:00
Felix Lau 3aab55d29f Add Tensorflow support for UpSampling3D and ZeroPadding3D (#3274) 2016-07-22 12:47:59 -07:00
Fariz Rahman f9ef72c38a Logical ops (#3272)
* Logical ops

* Update tensorflow_backend.py

* Add tests

* Update tensorflow_backend.py

* lesser->less

* Update test_backends.py

* less->lesser

* less->lesser

* Update test_backends.py
2016-07-21 20:47:02 -07:00
Francois Chollet 108159ed17 Merge branch 'master' of https://github.com/fchollet/keras 2016-07-21 15:10:10 -07:00
Francois Chollet defa1283c4 Include iterations in optimizer weights 2016-07-21 15:04:30 -07:00
Francois Chollet 2788b60fe6 Revert behavior of regularizers 2016-07-21 15:04:09 -07:00
Olivier Grisel 7e70e1768f Small python3 compat fix in backend doc (#3275) 2016-07-21 10:31:52 -07:00
Kai Li 896ba77061 update residual connection example (#3278) 2016-07-21 10:30:19 -07:00
Francois Chollet c034262b78 Removed test for deprecated graph model 2016-07-20 16:12:49 -07:00
Francois Chollet b7edcf6eea Change behavior of regularizers: use mean, not sum 2016-07-20 15:52:47 -07:00
Francois Chollet 23e1ad2df7 Allow overriding learning phase 2016-07-20 15:48:52 -07:00
Francois Chollet 0a3939883a Style fix 2016-07-19 16:29:27 -07:00
Francois Chollet 3c8f91ee3d Style fix 2016-07-19 16:11:33 -07:00
Francois Chollet efa5b04797 Style fix 2016-07-19 16:09:07 -07:00
Francois Chollet 2da66ed009 Py3 fix 2016-07-19 14:56:50 -07:00
Francois Chollet 2ac6811362 Remove deprecated graph model test 2016-07-19 14:53:50 -07:00
Francois Chollet 74c51f213c Fix flaky test 2016-07-19 14:52:56 -07:00
Francois Chollet 4302d8060d Fix image resizing in preprocessing/image 2016-07-19 14:30:43 -07:00
Francois Chollet 576cf8978b Merge branch 'master' of https://github.com/fchollet/keras 2016-07-19 14:22:48 -07:00
Francois Chollet 3533912016 Native initializations, updates 2016-07-19 14:22:34 -07:00
ekerazha cf9922ff1d Fix broken imdb_cnn example (#3244)
* Fix broken imdb_cnn example

* Update imdb_cnn fix
2016-07-19 12:18:59 -07:00
Zhengtao Wang 4fa65fbb2f add doc for layers/normalization/BatchNormalization (#3248)
* add doc for layers/normalization/BatchNormalization

* fix PEP8 problem

* Update normalization.py
2016-07-19 12:18:39 -07:00
Kai Li f502ee2338 Update sequential-model-guide.md (#3257) 2016-07-19 12:18:25 -07:00
Francois Chollet 7a56925176 Even faster tests 2016-07-19 11:57:57 -07:00
Francois Chollet 0a108b3fb2 Fix tests maybe? 2016-07-19 11:31:22 -07:00
Francois Chollet 381a108e6d Revert Travis config 2016-07-19 11:17:32 -07:00
Francois Chollet 726c9fc8a6 Update tests 2016-07-19 10:49:44 -07:00
Francois Chollet 946ccd3228 Speed up tests (especially with TF) 2016-07-19 10:05:30 -07:00
Francois Chollet 8e1ebbfc11 Get rid of coveralls 2016-07-19 00:40:21 -07:00
Francois Chollet cc0e60c101 TF backend performance improvements 2016-07-19 00:30:05 -07:00
Francois Chollet ff3f00d845 Style fix in example 2016-07-18 23:48:56 -07:00
Francois Chollet 40195c2fa2 TF backend: group update ops, add clear_session 2016-07-18 23:48:56 -07:00
Francois Chollet 7f7300b8cb Minor style fixes 2016-07-18 23:48:56 -07:00
Fariz Rahman 1b158ff4ed Bug fix + test - Sequential.pop() (#3252)
* Bug fix - Sequential.pop()

* Add test for Sequential.pop()

* Update test_sequential_model.py
2016-07-18 18:30:13 -07:00
stas-sl b686b85b52 Use symbolic shape in tensorflow version of batch_flatten (#3253) 2016-07-18 15:37:10 -07:00
Francois Chollet 8fa82ae5cb Merge branch 'master' of https://github.com/fchollet/keras 2016-07-16 17:52:56 -07:00
Francois Chollet 0d5289141e Add pre-trained word embeddings example 2016-07-16 17:51:17 -07:00
Francois Chollet 01d5e7bc47 Fix up a few example 2016-07-16 17:47:52 -07:00
Fariz Rahman cfbaec60c7 Simplify output shape inference for dot/cos merge (#3233)
* Simplify output shape inference for dot/cos merge

* Add extra dim if rank is 1

* Explain output shape inference logic in doc string

* Update tensorflow_backend.py

* Update theano_backend.py

* Update tensorflow_backend.py

* Update tensorflow_backend.py

* Update theano_backend.py

* Update topology.py

* example

* Update theano_backend.py

* Update theano_backend.py

* Update tensorflow_backend.py

* Update tensorflow_backend.py

* Update tensorflow_backend.py

* Update tensorflow_backend.py

* Update theano_backend.py

* Update tensorflow_backend.py

* typo fix

* Update theano_backend.py
2016-07-16 16:35:39 -07:00
Francois Chollet f3e7245910 Use native Theano BN 2016-07-16 13:42:41 -07:00
Francois Chollet 892d9fae84 Prepare 1.0.6 release 2016-07-16 12:25:03 -07:00
Francois Chollet 796f895f01 Start to nativify variable initialization in TF 2016-07-16 11:19:41 -07:00
Francois Chollet 489bb4eb10 Update docs for pooling 2016-07-16 10:28:35 -07:00
Francois Chollet 8f458066bb Update docs for initializations 2016-07-16 10:28:18 -07:00
Francois Chollet 5dd7454260 Merge branch 'master' of https://github.com/fchollet/keras 2016-07-15 17:36:21 -07:00
Francois Chollet 571db82371 Add variable initialization override option in TF 2016-07-15 17:36:00 -07:00
fchollet d971e0cca5 Style fixes in SeparableConv2D 2016-07-14 18:22:33 -07:00
Fariz Rahman fde0aac733 Convnet aliases (#3226)
* Convnet aliases

Not very important, but kinda handy.

* Update convolutional.py

* pep8
2016-07-14 18:16:00 -07:00
Maruan b9d904c12f Add masking support to BatchNormalization layer. (#3228) 2016-07-14 17:02:56 -07:00
Eder Santana aa2ec42da6 stop gradients (#3221)
* stop gradients

* fix stop grad test

* stop gradients
2016-07-14 16:13:55 -07:00
Francois Chollet d90d473104 Add pooling layers page in docs 2016-07-14 15:27:46 -07:00
Francois Chollet 5a1e63990a Refactor batch norm 2016-07-14 15:05:48 -07:00
Francois Chollet e836c10c6f Fast BN for TF 2016-07-14 14:13:06 -07:00
Francois Chollet 47c09d9557 Add SeparableConv2D layer (TF only) 2016-07-14 11:22:27 -07:00
Francois Chollet b35b943364 Add support for None activations 2016-07-14 04:38:53 -07:00
Francois Chollet ca467cc50e Add support for input tensors in InputLayer 2016-07-14 04:30:21 -07:00
Francois Chollet 51f7cf0367 Doc formatting fix 2016-07-14 04:20:12 -07:00
Francois Chollet 642eaca618 Doc formatting fix 2016-07-13 12:38:45 -07:00
Francois Chollet 55e5680535 Update doc generation script 2016-07-13 12:35:11 -07:00
Francois Chollet 52ea31b65c Add FAQ entry on pre-trained models 2016-07-13 12:34:58 -07:00
Wei Ouyang b3a26a5b30 Add AtrousConv2D layer for dilated convolution (#3183) 2016-07-13 08:28:23 -07:00
Dave Challis 98974efa5f Fix to error message in exception (#3213)
Was incorrectly reporting the `loss` argument instead of the `loss_weights` argument when an exception related to loss_weights was thrown.
2016-07-13 08:27:03 -07:00
fchollet b6a776b242 Add Sequential.pop() method 2016-07-12 20:17:03 -07:00
Francois Chollet 1ea3f44f06 Fix dtype consistency issue in random_binomial 2016-07-12 17:05:47 -07:00
Michael Oliver 64e1320ca0 add dtype to zeros and ones allocations (#3187) 2016-07-09 10:10:15 -07:00
Francois Chollet 6e0b50fbdc Merge branch 'master' of https://github.com/fchollet/keras 2016-07-08 18:44:37 -07:00
Francois Chollet 22502a8fe8 Style fixes in regularizers 2016-07-08 18:44:23 -07:00
Jason Yosinski a78ad01bb4 Create initial_state tensor filled with zeros without use of K.zeros (#3123)
* Create initial_state tensor filled with zeros without use of K.zeros

* minor PEP8 fix
2016-07-06 12:47:14 -07:00
Carl Thomé 729e802e85 Added optional field name argument to RemoteMonitor callback (#3157)
* Added optional path argument

* Added optional field name argument
2016-07-06 09:47:03 -07:00
Joshua Chin 3ffff6d579 fix get_output_shape_for in Merge, when mode is callable (#3144) 2016-07-05 09:29:40 -07:00
Francois Chollet 6e5f97fca5 Merge branch 'master' of https://github.com/fchollet/keras 2016-07-04 14:21:13 -07:00
Francois Chollet eaff5bdfd7 Style touch-ups in TF backend 2016-07-04 14:20:54 -07:00
Francois Chollet 28819d36a4 Less frequent dataset tests 2016-07-04 14:17:35 -07:00
lucasmoura f9a4f6f306 Use defaultdict for _UID_PREFIXES (#3087)
The method get_uid on common.py first check if a prefix is in _UID_PREFIXED dict
and if it is not, a variable is added to the dict.

However, using a defaultdict, this check is no longer necessary.
2016-07-04 13:26:44 -07:00
Dmytro Mishkin 27c83c693d Added 'max' operation to Merge layer (#3128)
* Added 'max' operation to Merge layer. It allows to implement convolutional maxout with two (or more) convoluion layers and one Merge.

* Added 'max' to merge test
2016-07-04 11:03:07 -07:00
Eder Santana d106908a57 fix docs bugs (#3142)
* fix docs bugs

* fix docs bugs
2016-07-04 11:02:47 -07:00
Brian McMahan b4adce34dc Lambda output shape (#2680)
* updating the info for lambda

* updated lambda doc a bit more

made it more readable and stuff
2016-07-03 21:57:46 -07:00
Thibault de Boissière 3927505d1a Add multiprocessing for fit generator (#3049)
* Add multiprocessing for fit generator

* Change maxproc to nb_worker and update documentation

* Simplify multiprocessing test, clarify doc replace maxproc by nb_worker

* Replace maxproc by nb_worker in test

* Replace maxproc by nb_worker in test

* Update the doc: specify non picklable arguments should not be used with multiprocessing

* Add multiprocessing as an option with the pickle_safe argument
2016-07-03 21:50:01 -07:00
fchollet ede79f818e Add MIT license badge to README 2016-07-03 21:19:05 -07:00
Francois Chollet 742ac53262 Merge branch 'locally_connected' 2016-07-03 20:52:54 -07:00
Francois Chollet ee17ccc374 Add tests for locally connected layers 2016-07-03 20:51:58 -07:00
joelthchao 835a02c037 locally-connected layer
add unittest, fix output shape

PEP8

flatten weight, improve example

update docstring, remove cifar10 Alex exmaple

improve docstring, remove duplicate func

parallel by batch_dot

fix theano batch_dot

dim_ordering unit test, theano only use dot

dim_ordering unit test

Update locally connected layers
2016-07-03 20:48:22 -07:00
François Chollet ee8ff00a2a New conv ops (#3134)
* New function signature for conv2d in backend

* Clean up stuff

* Touch-up TF deconv op

* More cleanup

* Support for TF 3D conv/pool

* Move pooling layers to their own file

* Update TF version in Travis config

* Fix conv3d tests
2016-07-03 20:33:21 -07:00
fchollet 229f13a864 Lambda should not support masking implicitly 2016-07-02 15:12:46 -07:00
Rompei 0d60d637af TimeDistributedDense -> TimeDistributed(Dense()) in doc example 2016-07-02 12:12:54 -07:00
Francois Chollet c20e34a8b0 Prevent image_dim_ordering from being overwritten 2016-07-02 10:24:48 -07:00
Fariz Rahman 8d3f39852a Validate dot_axes argument in cos mode and fix output shape (#3116)
* Validate dot_axes argument in cos mode

* Update topology.py

* Update topology.py
2016-07-01 11:41:13 -07:00
Carl Thomé aa45dee5a4 Added optional path argument (#3118) 2016-07-01 10:56:17 -07:00
fchollet 885e6e621b Style fix in test 2016-06-30 23:22:06 -07:00
fchollet dc122c31ef Fix masking test 2016-06-30 23:19:51 -07:00
fchollet 3bc80d3db4 Remove unnecessary assert 2016-06-30 23:04:12 -07:00
Francois Chollet 439f2f3b2b Fix issue with multi-io + BatchNorm mask computing 2016-06-30 22:19:17 -07:00
Joshua Chin a1610eb274 model should use binary accuracy for binary crossentropy loss (#3098) 2016-06-28 12:45:34 -07:00
Francois Chollet 6b90eff03c Fix flaky test 2016-06-27 12:16:54 -07:00
Francois Chollet 4d404d1a54 Prepare 1.0.5 PyPI release 2016-06-27 12:01:20 -07:00
Francois Chollet fffa6a80ca Add attribute caching for flattened_layers 2016-06-27 11:51:28 -07:00
Francois Chollet 3f6b38b34f Fix duplicated updates issue 2016-06-27 11:51:12 -07:00
Francois Chollet 8166a55761 Fix flaky test 2016-06-27 11:50:56 -07:00
Shota 89f6d374e9 Fix typo (#3070) 2016-06-25 12:01:20 -07:00
M.Yasoob Ullah Khalid ☺ 9b3c2cf348 A small typo (#3067) 2016-06-24 19:04:28 -07:00
Francois Chollet 76406dd0c2 Remove unnecessary space 2016-06-23 16:17:36 -07:00
Francois Chollet d266b75423 Small fixes in text gen example 2016-06-23 16:17:24 -07:00
Francois Chollet 5bb5eb1657 Fix flaky test 2016-06-23 12:08:25 -07:00
Benjamin Bolte 258cf3b0f7 Support for masking in merged layers (#2413)
* added masking to merge layer (#2413)

* added documentation, fixed stylistic issues

* removed casting

* changed to using K.all
2016-06-23 12:03:55 -07:00
Utkarsh Upadhyay cdb5b09cd7 fix: Sort subdirs before mapping them to classes. (#3052)
The documentation says that [1]: 

> If [classes are] not provided, the list of classes will be automatically inferred (and the order of the classes, which will map to the label indices, will be alphanumeric).

However, the code was adding classes in the order `os.listdir` returned them. This commit alphanumerically sorts the sub-directories before mapping them to label indices.

[1] http://keras.io/preprocessing/image/
2016-06-23 11:32:40 -07:00
MikeAmy cb3469215a Moved epoch_logs = {} before batch loop to avoid UnboundLocalError. (#3019) 2016-06-20 08:18:01 -07:00
ηzw 703c2925b3 Add comment for a note of caution (#3024) 2016-06-19 12:36:05 -07:00
lucasmoura b6ca3ef051 Avoid double key lookup on callback.py (#3018)
On method on_epoch_end, to add new keys to the history dict, first it is
verified if a key is not on the history dict and if that is the case, a new key
is created on the history dict with an empty list as value.

However, this operation search for a key twice in the dict. This same behavior
can be achieved in a single step using dict setdefault method.
2016-06-19 10:48:53 -07:00
André Artelt b235f91cc7 doc: fix example for recurrent layer (#3022) 2016-06-19 10:39:23 -07:00
jkleint 3513472467 Allow re-use of EarlyStopping callback objects. (#3000)
An EarlyStopping callback object has internal state variables to tell it
when it has reached its stopping point.  These were initialized in __init__(),
so attempting to re-use the same object resulted in immediate stopping. This
prevents (for example) performing early stopping during cross-validation with
the scikit-learn wrapper.

This patch initializes the variables in on_train_begin(), so they are re-set
for each training fold.  Tests included.
2016-06-18 15:04:30 -07:00
Carlos Bentes 60e0c96f6c Fix typo in training (#3014) 2016-06-18 10:42:38 -07:00
Tsukasa ŌMOTO e37df7ca85 Fix json serialization in Lambda layer (#3012)
Fix #2582
Fix #3001
2016-06-17 21:26:45 -07:00
Tsukasa ŌMOTO a13a35fe52 Fix json serialization in merge layer with lamda output shape (#3011)
Fix #3008
2016-06-17 21:26:29 -07:00
Zhengping Che f421600218 fix wrong calls of __init__ in callbacks (#2999) 2016-06-16 14:51:28 -07:00
Francois Chollet 2a4492d74f Fix typo in docs 2016-06-16 10:56:34 -07:00
Tsukasa ŌMOTO 10368d867f Fix TF-IDF in Python 2 (#2992)
Fix #2974
2016-06-16 08:49:17 -07:00
Tsukasa ŌMOTO 936360020c Fix tf-idf again (#2986)
Fix 53aaa842ed
Fix #2974
2016-06-15 09:11:01 -07:00
ηzw c53c64d7fa Fix get_word_index (#2981) 2016-06-14 18:59:10 -07:00
Francois Chollet 6b122ba25f Allow arbitrary output shapes for custom losses 2016-06-14 16:40:58 -07:00
Francois Chollet 6501b587c0 Clarify use of two-branch models 2016-06-14 12:12:21 -07:00
Tsukasa ŌMOTO 53aaa842ed Fix tf-idf (#2980)
Fix #2974
2016-06-14 11:39:07 -07:00
Francois Chollet dc569e952d Merge branch 'master' of https://github.com/fchollet/keras 2016-06-13 12:06:45 -07:00
Francois Chollet c9aee4126e Fix issue with cascade of Merge layers 2016-06-13 12:05:48 -07:00
githubnemo 7397f4b0d0 Resolve #2960 (#2961)
* Resolve #2960

Introduce `K.var` so that the standard deviation computation can
be made numerically stable. Instead of

	K.std(x)

the user is able to write

	K.sqrt(K.var(x) + self.epsilon)

avoiding a division by zero in the gradient computation of `sqrt`.

* Fix typos
2016-06-12 21:16:21 -07:00
Rompei 3b83a1b1ac Fix initial variable in Evaluator. (#2955) 2016-06-12 14:50:56 -07:00
fchollet 8f8e4574dc Fix issue with Sequential deserialization 2016-06-12 11:04:09 -07:00
fchollet c30432a665 Nadam optimizer style fixes 2016-06-11 16:49:56 -07:00
Ilya Kulikov c4c2d8bbd4 Nadam optimizer and test for it added (#2764)
* Nadam optimizer and test for it added

* pep8 fix

* add comment in docstring and one more pep8 fix
2016-06-11 16:30:05 -07:00
Francois Chollet e49ba233f9 Convolution1D: apply activation after reshape 2016-06-10 15:05:34 -07:00
mpt5 cea6c1821f Update visualization.md (#2942)
* Update visualization.md

Added show_layer_names argument and its default value to docs

* Update visualization.md
2016-06-09 21:50:32 -07:00
Shaun Harker ab4bf447c6 Fix 1D convolution layers under Theano backend (#2938)
This issue is due to an unexpected loss of dimensionality when
composing the backend tensor operations "reshape" and "squeeze"
when there are dimensions of length 1.

For example, using a Theano backend the following fails with a
complaint about dimension mismatch:

UpSampling1D(2)(MaxPooling1D(2)(Reshape((2,1))(Input(shape=(2,)))))

The issue arises due to the conflict of two behaviors specific
to the Theano backend:

-   Reshape uses Theano's reshape function. Theano's reshape
    automatically makes dimensions with length 1 "broadcastable"

-   MaxPooling1D's implementation class _Pooling1D has a call method
    which uses a dummy dimension which it has to remove. The manner
    in which this dummy method is removed it to call "squeeze(x, axis)"
    from the backend. The squeeze implementation tells Theano to make
    the dummy dimension broadcastable, and then calls Theano's "squeeze",
    which removes ALL the broadcastable dimensions; not just the dummy
    dimension, but also the length 1 dimension flagged as broadcastable
    by reshape. This causes the problem observed above. This behavior
    is distinct from the behavior of the TensorFlow backend, which
    removes only the requested dimension.

This PR addresses this issue in two ways:

First, it introduces a test which checks the composition of "reshape"
and "squeeze" to make sure we get the same result using both Theano
and TensorFlow backends.

Second, it changes the implementation of squeeze(x,axis) so that the
Theano backend should behave similarly to the TensorFlow backend. With
this change the introduced test passes and the above example works.
2016-06-09 12:00:50 -07:00
Oswaldo Ludwig 4e0c8cf25b Eigenvalue Decay regularization (#2846)
* Update regularizers.py

I included a new regularizer named Eigenvalue Decay to the deep learning practitioner that aims at maximum-margin learning. This version approximates the dominant eigenvalue by a soft function given by the power method. For details, see:
Oswaldo Ludwig. "Deep learning with Eigenvalue Decay regularizer." ArXiv eprint arXiv:1604.06985 [cs.LG], (2016). https://www.researchgate.net/publication/301648136_Deep_Learning_with_Eigenvalue_Decay_Regularizer

The syntax for Eigenvalue Decay is similar to the other Keras weight regularizers, e.g.:

 model.add(Dense(100, W_regularizer=EigenvalueRegularizer(0.0005)))

* Example with Eigenvalue Decay regularization.

An example from Keras including regularization with Eigenvalue Decay. After training, you have to save the trained weights, create/compile a similar model without Eingenvalue Decay and save this model. Then, you can use your trained weights with this model, see lines 123-153 of  	CIFAR10_with_Eigenvalue_Decay.py (This is still an open issue).
This example yields a gain in the accuracy by the use of Eigenvalue Decay of 2.71% (averaged over 10 runs).

* Update CIFAR10_with_Eigenvalue_Decay.py

* Update CIFAR10_with_Eigenvalue_Decay.py

* Update CIFAR10_with_Eigenvalue_Decay.py

* Update regularizers.py

* Update regularizers.py

* Delete CIFAR10_with_Eigenvalue_Decay.py

* Update test_regularizers.py

* Update regularizers.py

* Update test_regularizers.py

* Update regularizers.py

* Update regularizers.py

I needed another reading in Keras backend...

* Issue to get shape of a tensor.

Issue to get shape of a tensor in the class EigenvalueRegularizer: the type returned for shape is different for Theano backend (Theano tensor type) and TF backend (TF TensorShape).

* Update regularizers.py

* Update regularizers.py

* Update regularizers.py

* Update regularizers.py

* Update regularizers.py

* Update regularizers.py

* Update regularizers.py
2016-06-08 12:20:25 -07:00
Francois Chollet e4b3a052a4 Small style fixes 2016-06-08 11:56:05 -07:00
Ziheng Jiang 825beb42c4 fix bug: rename duplicated loss name (#2842)
* rename duplicated loss name

* make python3 happy

* rewritten code to make it easy to read
2016-06-08 11:51:34 -07:00
Tammy Yang c8d605db55 Make DirectoryIterator case insensitive (#2932)
* make DirectoryIterator case insensitive

* Also need to make filename case insensitive while appending it into self.filenames
2016-06-08 11:27:19 -07:00
Ryo ASAKURA f6ecab58cb Fix description about parameter output_shape for function merge (#2933) 2016-06-08 11:26:56 -07:00
Tsukasa ŌMOTO d7e39347b9 Add mode=2 option to the docstring in BatchNormalization (#2919)
Fix a tiny typo.
2016-06-07 23:09:03 -07:00
Colin Rofls 25c10af596 fix 2852 (#2927) 2016-06-07 16:48:13 -07:00
Francois Chollet ded23f14c7 Fix typo in docs 2016-06-07 15:51:29 -07:00
Michael Crawford b0d52d930a Fix predict_proba method of KerasClassifier to return probabilites for both classes in case of binary classification. issue:2864 (#2924) 2016-06-07 11:48:04 -07:00
jakeleeme af5c5b6a55 Spellcheck source files (#2907) 2016-06-06 13:29:25 -07:00
ηzw ce51e19970 Fix typos in image preprocessing docs (#2906) 2016-06-06 13:28:39 -07:00
Francois Chollet 97e31b6090 Cleanup docs autogen script 2016-06-06 10:18:09 -07:00
Francois Chollet 62053e68e2 Prepare 1.0.4 PyPI release 2016-06-06 10:17:47 -07:00
Francois Chollet 489c07e748 Docs adjustment 2016-06-06 10:15:22 -07:00
fchollet 604ea8d68a Fix PEP8 BS 2016-06-05 20:41:19 -07:00
fchollet fd3cfb196b Allow no layer names in plot() 2016-06-05 20:20:14 -07:00
fchollet b71f6ba864 Allow absence of labels in flow() 2016-06-05 20:19:55 -07:00
fchollet dfc128b89a Fix some py3 generator issue 2016-06-05 14:52:10 -07:00
fchollet 34b8b57c2f Update image preprocessing docs 2016-06-05 14:01:28 -07:00
fchollet 3bba409d9e Improve docstring in preprocessing/image 2016-06-05 14:01:11 -07:00
fchollet 7869cdccec Merge branch 'master' of ssh://github.com/fchollet/keras 2016-06-05 13:39:55 -07:00
fchollet 0e18e345b0 Refactor ImageDataGenerator, add directory support 2016-06-05 10:24:54 -07:00
fchollet e5b99c7512 Tiny fixes in Sequential methods 2016-06-05 10:24:20 -07:00
aaditya prakash 7d4c85018a MaxoutDense no activation; incorrect docs (#2895)
Since MaxoutDense does not have activation it might be misleading to include "activation" as one of the arguments in the function docs.
2016-06-03 23:45:24 -07:00
fchollet edbec2dbc9 Remove bit of deprecated code 2016-06-03 23:13:46 -07:00
fchollet 973ece9809 Make dim_ordering a global default 2016-06-03 23:13:11 -07:00
lorenzoritter 90f441a6a0 fixed formatting error in the docstring (#2797)
* fixed formatting error in the docstring

* fixed formatting error in TimeDistributedDense of core.py
2016-06-02 19:07:34 -07:00
Andrew Stromnov 5a71090476 limit progress bar update rate (#2860)
* limit progress bar update rate

Limit progress bar update rate in verbose=1 mode. This patch allows to
reduce terminal I/O throughput while keeping reasonable high visual
update rate (defaults to 100 refreshes per second). It helps greatly
when working with large but simple data sets with small batches, which
leads to millions of relatively useless screen updates per second. Also
it helps to keep network traffic at reasonable rates, which
exceptionally useful within laggy networking conditions when using
keras over telnet/ssh, and improve web browser responsibility when
using keras within Jupyter Notebook.

* add docstrings for 'interval' and 'force' arguments
2016-06-02 13:06:37 -07:00
matthewmok 76cae0ec44 fix bug: change seed range for RandomStreams in Theano (#2865)
* bug fixed, numpy randint only output positive numbers ranging from 1 to 10e6

* Update theano_backend.py

changed style and numpy randint range

* Update theano_backend.py

removed extra spaces
2016-06-02 13:05:51 -07:00
talpay 273f0dda9d Added objective: Kullback Leibler Divergence (#2872)
* Added objective: Kullback Leibler Divergence

* KLD: Clip at 1
2016-06-02 10:23:00 -07:00
Tsukasa ŌMOTO 882b5a1d89 Fix YAML serialization when using Regularizers (#2883)
Fix #2871
2016-06-01 21:40:16 -07:00
ηzw 8c84ad1a86 fix typo (#2881)
* fix typo

* Update scikit-learn-api.md
2016-06-01 21:39:46 -07:00
Francois Chollet 80bfec7253 Fix JSON deserialization issue 2016-05-30 22:36:57 -07:00
fchollet 91b930298b Make Merge output_shape consistent with lambda 2016-05-30 20:50:59 -07:00
Tsukasa ŌMOTO 9c56b91548 Fix json serialization in merge layer (#2854)
Fix #2818
2016-05-30 20:30:07 -07:00
fchollet 7b5bab83f4 Merge branch 'master' of ssh://github.com/fchollet/keras 2016-05-29 15:36:21 -07:00
fchollet c5e2116ead Fix typo in doc 2016-05-29 15:36:14 -07:00
mittagessen d9db73a791 s/TimeDistributedDense/TimeDistribute(Dense(.../g (#2843) 2016-05-29 14:12:57 -07:00
Kumar Ayush 01ece4ef7b added required import line (#2839) 2016-05-27 23:56:51 -07:00
Francois Chollet 601f3e7cdb BN only uses learning phase in mode 0 2016-05-27 21:36:08 -07:00
Francois Chollet a9ca2c547f Merge branch 'master' of https://github.com/fchollet/keras 2016-05-27 21:32:53 -07:00
Francois Chollet 594cbed03b Small changes in mask caching 2016-05-27 21:32:43 -07:00
Monami Sharma 3938a905a1 Default values corrected for featurewise_std_normalization and featurewise_center (#2831)
For ImageDataGenerator, False is the default value for for featurewise_std_normalization and featurewise_center.
2016-05-27 08:59:10 -07:00
Francois Chollet 88d523e01b Add stateless batchnorm mode 2016-05-26 16:01:24 -07:00
Francois Chollet 0419fe67fc Merge branch 'master' of https://github.com/fchollet/keras 2016-05-25 16:46:21 -07:00
Francois Chollet 33ddeb5cbe Change way node depth is computed for shared layer 2016-05-25 16:46:06 -07:00
Colin Rofls 1f5d5b391b correctly serialize loss function (#2806) 2016-05-24 21:27:57 -07:00
fchollet 198c515208 Simplify imports in README 2016-05-23 23:59:34 -07:00
fchollet 5156673e17 Fix serialization issue with nested Sequential 2016-05-21 18:11:51 -07:00
fchollet 1529c9c438 Clarify error message 2016-05-21 17:01:39 -07:00
fchollet 32b10a8832 Fix first axis dim validation in multi-input model 2016-05-21 16:06:19 -07:00
fchollet 24501d4361 Fix ActivityReg layer 2016-05-21 15:46:32 -07:00
fchollet bf0c08e24a Add FAQ entry about layer freezing 2016-05-21 13:53:23 -07:00
RyosukeHonda 34d8cce6bc Fixed typo (#2770)
Fixed the year from "7 Apr 201" to "7 Apr 2015".
2016-05-21 10:12:45 -07:00
Joshua Loyal f0bfc24adc Correction to fan_out initializaiton (#2252)
* account for receptive field size in fan_out

* added test for conv layer initializations

* removed old reference to kernel_size
2016-05-19 22:11:41 -07:00
Xingdi (Eric) Yuan 2f8acfe4bf changeable print_summary (#2761)
* use changeable print_summary

* minor
2016-05-19 12:40:06 -07:00
gw0 e2fb8b2786 Add download error suggestion for babi_rnn.py and babi_memnn.py. (#2752) 2016-05-19 10:20:36 -07:00
Francois Chollet ebbc4d9fb8 Fix TB callback with non-standard TF version nums 2016-05-18 10:28:12 -07:00
Francois Chollet 8fc5b90e9a Update bibtex entry 2016-05-16 15:04:49 -07:00
mat kelcey 8a717f5b6c rename z_log_sigma to z_log_std to match z_mean (which is not z_mu) (#2729) 2016-05-16 11:08:00 -07:00
Colin Rofls aa91994166 save keras version & compile args when serializing models (#2690)
* save keras version & compile args when serializing models

* renamed prepare_config -> _updated_config + cleaner implementation
2016-05-15 20:18:31 -07:00
Joel 2091347a71 Fix zero division in merge mode='cos' (#2725)
* fix cos zero division

* use backend epsilon
2016-05-15 20:16:46 -07:00
Mikhail Korobov e0ed174f2c Input: proper error message for missing "shape" argument (#2727) 2016-05-15 15:15:14 -07:00
Francois Chollet 8c2a573ebf Prepare 1.0.3 release 2016-05-15 13:13:19 -07:00
Francois Chollet d7ff7cde92 Add VAE example 2016-05-14 12:06:23 -07:00
Francois Chollet 15d0b0ea08 Add K.tile test 2016-05-14 12:06:02 -07:00
Francois Chollet 3695bc2db5 Remove references to "join" merge mode 2016-05-13 11:06:08 -07:00
Francois Chollet a08995a90d Fix common LaTeX encoding issue 2016-05-12 12:03:20 -07:00
Tsukasa ŌMOTO aea00258e7 Update the reference of Batch Normalization (#2700)
We should refer the paper accepted in ICML 2015, instead of arXiv.
2016-05-12 09:54:46 -07:00
fchollet b581eb3f27 Update RMSprop 2016-05-11 21:35:11 -07:00
Francois Chollet 610ccba9f5 Normalize layer imports in examples 2016-05-11 18:45:37 -07:00
Francois Chollet d5ae6f32dd Fix flaky test 2016-05-11 18:01:01 -07:00
Francois Chollet 5308033936 Update RMSprop, Adagrad, Adadelta 2016-05-11 17:20:27 -07:00
Francois Chollet e2abb5ef2c Fix merge conflicts 2016-05-11 16:07:43 -07:00
Francois Chollet 1b11b4eeb6 Fix shape inference issue with TF.resize_images 2016-05-11 16:06:03 -07:00
Dieuwke Hupkes 39357b3045 Update documentation docstring Embedding (#2693)
From the documentation it is not entirely clear that if mask_zero is set
to True, the input_dim argument should be equal to the size of the
vocabulary + 2, as index 0 cannot be used anymore.

(This behaviour seems a bit strange, as it has as a consequence that the
first column of the weights of the embeddings will never be used or
updated. The resulting network thus has a redundant set of parameters).
2016-05-11 14:10:58 -07:00
Kai Sasaki ed7a5a1418 Residual connection should have the same dimension in case of no projection matrix (#2688) 2016-05-10 21:18:39 -07:00
Kyle McDonald ae682a71f9 functional API intermediate output doc in faq (#2682) 2016-05-10 08:22:53 -07:00
Brian McMahan 8327b37a0b fixed shape typo (#2679)
* fixed shape typo

* pep8
2016-05-09 22:17:12 -07:00
fchollet 973b5570aa Style touch-up 2016-05-06 20:37:46 -07:00
fchollet 7cb41fc5cc Fix weight saving issue 2016-05-06 20:37:35 -07:00
Tsukasa ŌMOTO 595d67ad7d Fix initialization of index_array (#2590)
index_array should be initialized when self.batch_index is zero.
2016-05-06 18:13:11 -07:00
François Chollet bb626c120e Revert "Revert "remove unused import statement in keras dir"" (#2647) 2016-05-06 13:32:43 -07:00
Xingdi (Eric) Yuan ba8fefa8ec Faster GRU (#2633)
* add a simple named entity recognition example

add a simple named entity recognition example

* add fast version of GRU

add fast version of GRU

* remove useless stuff
2016-05-06 11:10:46 -07:00
François Chollet 4b24f6d7b1 Revert "remove unused import statement in keras dir" (#2641) 2016-05-05 23:22:28 -07:00
ηzw 1c460e1e08 remove unused import statement in keras dir (#2638)
* remove unused import statement in keras dir

* rewrite import graph statement
2016-05-05 21:33:04 -07:00
Colin Rofls 7b4e157356 fixed docs for Sequential.get_config, and added a more helpful (#2635)
exception to `model_from_config`.
2016-05-05 15:24:52 -07:00
Dr. Kashif Rasul 5749f1b971 fix soft sign deprecation warning (#2623)
and backward compatible
2016-05-05 13:02:37 -07:00
Francois Chollet 3c57aff85b Style fixes 2016-05-05 11:17:25 -07:00
Carl Thomé 18504bcc86 Faster LSTM (#2523)
* Faster LSTM

* PEP8

* RNN dropout fix

* PEP

* PEP

* Less code duplication

* LSTM benchmark example

* PEP

* Test implementation modes

* Go through Keras backend
2016-05-05 11:01:48 -07:00
Francois Chollet d8864bfe48 Allow use of predict without compilation 2016-05-05 08:24:12 -07:00
Nic Eggert 078b20169b Add batch_get_value to backends (#2615)
* Add function to get multiple values at once

* Change to match existing batch_set_value

* Fix typo
2016-05-04 17:13:17 -07:00
Francois Chollet 5f7e78df65 Improve optimizer configuration 2016-05-04 14:18:06 -07:00
Francois Chollet fc470db7ab Fix typos in layer writing guide 2016-05-03 11:29:50 -07:00
jingzhehu f576f37801 one line fix for TensorBoard callback issue (#2574)
* one line fix for TensorBoard callback issue

Ref: https://github.com/fchollet/keras/issues/2570

* handle SummaryWriter based on tensorflow version

code contributed by @bnaul

https://github.com/bnaul/keras/commit/e04ce5e37ec234debaea8c6482ef90be1f
88286d
2016-05-03 10:51:43 -07:00
Francois Chollet b74118a766 Fix typo in documentation 2016-05-02 15:59:04 -07:00
Brian McMahan 1c7a0248b9 updated for list check bug in predict/predict_on_batch (#2585)
* updated for list check bug in predict/predict_on_batch

* pep fix

I think that's going to be the only pep complain..
2016-05-02 15:33:25 -07:00
Francois Chollet 36a829c20d Add doc page about writing custom layers. 2016-05-02 14:16:09 -07:00
chentingpc 33af75aa39 fix activity regularizer so it can deal with multiple inbound nodes as well (#2573) 2016-05-01 16:36:31 -07:00
jpeg729 844420425e Added softsign activation function (#2097) 2016-04-30 18:29:33 -07:00
Francois Chollet da57a530f9 "total_loss" -> "loss" 2016-04-30 16:38:23 -07:00
fchollet 1f17013949 Misc fixes 2016-04-30 15:09:35 -07:00
fchollet f18899cb36 Merge branch 'master' of https://github.com/commaai/keras into commaai-master 2016-04-30 14:09:56 -07:00
Sasank Chilamkurthy 877f946e24 Improved docs of ImageDataGenerator (#2565) 2016-04-30 11:53:44 -07:00
Francois Chollet a981a8c42c Make bias optional everywhere 2016-04-29 16:54:39 -07:00
Francois Chollet 5467107fc9 Prepare 1.0.2 PyPI release 2016-04-29 10:39:52 -07:00
Gijs van Tulder ad3107073b Re-raise exceptions to preserve stack trace (#2350) 2016-04-28 12:38:36 -07:00
Francois Chollet 8d62f4da6c Minor UX fix 2016-04-27 17:34:33 -07:00
Joel 3779b8a008 Fix test_image path non-exist error in ci-travis (#2531)
* correct inception_v3 network

* store test images in class attribute

* PEP8
2016-04-27 11:35:31 -07:00
Francois Chollet 6ec5e48969 Style touch-ups 2016-04-27 10:53:54 -07:00
fchollet bfa5ca553d Fix docstring 2016-04-27 09:20:19 -07:00
Francois Chollet c9f7d970e9 Style fixes in preprocessing/image 2016-04-26 15:24:05 -07:00
Sasank Chilamkurthy f26ce6e236 Rewriting image augmenter (#2446)
* Much better image data augmentor

* removed unnecessary functions

* shift origin to centre of the image for homographies

* init commit

* change to zoom_range

* Added scikit-image to extras_require in setup.py

* add zoom_range test, exception for invalid zoom_range

* add scikit-image to dependency

* fix fit and retain old functions for unit test

* use ndi insteadskimage in random_transform

* removed buggy code in random_rotations, shears etc  and replaced it with todos.

* remove sci-image, implement ndimage based methods, refactor random_transform

* random_zoom, array_to_img consider dim_ordering

* add random_channel_shift, support fill_mode and cval

* image doc, update test_image, PEP8

* fix channel shift clip

* fix doc, refine code

* detail explain of zoom range

* check coding style
2016-04-26 15:21:14 -07:00
Brian McMahan b001e36f18 adding a disable_b boolean to Dense (#2512)
* adding a disable_b boolean to Dense

* changing 'disable_b' to 'bias' 

Changing the name of the boolean & flipping its behavior so that the default is True and when set to False the bias is not used.

* integrating bias flag fully

changed the bias flag to affect the creation of the self.b variable as well as the output calculation

* fixing a blank line to appease pep8
2016-04-26 14:25:00 -07:00
Francois Chollet 9abb6ef723 Add TF graph management warning 2016-04-26 13:02:39 -07:00
Francois Chollet bfbdbb05bc Add root imports 2016-04-26 13:02:11 -07:00
Francois Chollet bd2bd51b5d Fix typo in README 2016-04-25 19:06:31 -07:00
Francois Chollet 4e547a31ed Improve TF session & variable management 2016-04-25 18:49:19 -07:00
Francois Chollet de8d0defcd Fix PEP8 2016-04-25 18:48:23 -07:00
gw0 344437c491 Fix plot with show_shapes and multiple inputs/outputs. (#2421) 2016-04-25 15:29:16 -07:00
George Hotz ed365e94fd Added simple support for returning a multitarget loss 2016-04-25 14:46:03 -07:00
TobyPDE 5910278ca8 Fixed minor typo in getting-started/sequential-model-guide (#2499) 2016-04-25 09:14:18 -07:00
fchollet 18841fa58d Fix build 2016-04-24 22:18:02 -07:00
Carl Thomé 6fb4e0e441 Add cos and sin to backend (#2493) 2016-04-24 21:43:57 -07:00
Francois Chollet 39051ef3ca Add model_from_config in models.py 2016-04-24 14:33:27 -07:00
fchollet 1f4084870b Add new metrics and metrics tests 2016-04-24 12:10:47 -07:00
fchollet 00e9d5b219 Update regularizer tests 2016-04-24 10:27:45 -07:00
fchollet 7f93747602 Remove outdated comment 2016-04-24 10:27:45 -07:00
Kai Li a7156b8c27 Update antirectifier.py (#2485) 2016-04-24 09:34:20 -07:00
fchollet b1e47f7741 Fix PEP8 2016-04-23 13:55:20 -07:00
Ke Ding 59f8d6ca22 add weights for SGD optimizer (#2478) 2016-04-23 13:33:14 -07:00
Rich P. I. Lewis 5f4019d980 fixed Merge Layer functional API (#2460)
* fixed Merge Layer functional API

* moved test to layers/test_core
2016-04-23 13:32:45 -07:00
Ke Ding f84389da08 fix a benign but wrong range number in GRU's get_constants (#2475) 2016-04-22 18:55:38 -07:00
Jiyuan Qian 63c1757df5 fix accuracy with sparse_categorical_crossentropy (#2471) 2016-04-22 10:41:14 -07:00
Joel d6ab850f45 correct inception_v3 network (#2472) 2016-04-22 10:38:19 -07:00
graham 61dd53e262 allows python3.5 to build alongside < 3.5 (#2457) 2016-04-21 15:31:25 -07:00
Francois Chollet 423a633b5b Update merge tests 2016-04-21 09:44:44 -07:00
Colin Rofls 256d4ef71b clarified usage of sparse_categorical_crossentropy (#2450)
- addressess #2444
2016-04-21 09:36:39 -07:00
Philip Bachman ad49962ba9 fix layer/node topo sort problem (#2433)
* fix layer/node topo sort problem

* fix to only iterate over valid layer/node keys
2016-04-20 21:38:23 -07:00
Brian McMahan 4680d70a78 fixing the constants thing in theano rnn (#2429) 2016-04-20 11:17:05 -07:00
Dapid ee7f056779 DOC: models should be compiled upon loading (#2428) 2016-04-20 10:23:26 -07:00
Francois Chollet 66c8d7baf2 Merge branch 'master' of https://github.com/fchollet/keras 2016-04-20 09:43:12 -07:00
Francois Chollet 9f929999d1 Fix Travis concurrent directory creation issue 2016-04-20 09:43:01 -07:00
fchollet 24f96262ec Add additional input data validation check 2016-04-20 08:53:40 -07:00
Brian McMahan 0e6e7a41f4 adding built check inside TimeDistributed (#2426) 2016-04-20 08:41:27 -07:00
Dan Becker 5cac088d98 Add scikit_learn wrapper example (#2388)
* Add scikit_learn wrapper example

* Extract and evaluate best model in examples/mnist_sklearn_wrapper.py
2016-04-19 21:50:50 -07:00
Francois Chollet 85f0448fee Make merge work with pure TF/TH tensors 2016-04-19 18:46:54 -07:00
Francois Chollet 106c0b753a Merge branch 'master' of https://github.com/fchollet/keras 2016-04-19 11:57:30 -07:00
Francois Chollet c525e634dc Fix loss compatibility validation 2016-04-19 11:57:19 -07:00
Eder Santana c398c0891b add eye to backened (#2407) 2016-04-19 11:38:21 -07:00
Francois Chollet 5ab48ac5d4 Update imagedatagenerator 2016-04-19 11:19:12 -07:00
Jeffery Ye ba29cd8e46 set input_length before reshape (#2410) 2016-04-19 10:49:47 -07:00
chardmeier b61235b77f Fixed typo. (#2401) 2016-04-19 10:20:13 -07:00
Francois Chollet 0ed00e38f0 Add inception v3 example 2016-04-18 21:51:43 -07:00
Francois Chollet 36eef0dd9a Add reset function to ImageDataGenerator 2016-04-18 21:51:43 -07:00
fchollet 1904194c7a Fix wrapper learning phase 2016-04-18 20:07:30 -07:00
Francois Chollet 7ce144881a Fix stateful unrolled RNNs in Theano 2016-04-18 17:09:20 -07:00
Eder Santana 55159cf451 Update topology.py (#2373) 2016-04-17 14:21:31 -07:00
Francois Chollet 7a12fd0f85 1.0.1 release 2016-04-16 14:11:34 -07:00
Steven Hugg 9d60126661 fixed TensorBoard callback (#2363) 2016-04-16 13:56:56 -07:00
Francois Chollet e341e73c6a Fix Dropout in RNNs 2016-04-16 09:08:15 -07:00
Francois Chollet 5dad3786f6 Add batch_set_value for faster TF weight loading 2016-04-15 22:40:54 -07:00
Francois Chollet ed0cd2c60d Add TF/TH kernel conversion util 2016-04-15 22:37:41 -07:00
Kai Li 2eea3a4c5d Update model.md (#2348) 2016-04-15 08:21:46 -07:00
Gijs van Tulder 090a46763e Fix support for custom metrics functions (#2351) 2016-04-15 08:21:16 -07:00
Dan Becker c4ed82cdf6 Fix typo in docs. loss_weight should be loss_weights (#2343) 2016-04-14 21:53:33 -07:00
Dan Becker b32248d615 Change error message in standardize_input_data (#2338) 2016-04-14 19:18:53 -07:00
Francois Chollet ca7437502b Shape inference fix for Embedding 2016-04-14 15:44:36 -07:00
Francois Chollet fe9b797a46 Style fixes in preprocessing/image 2016-04-14 15:44:24 -07:00
Francois Chollet b8059aeaba Fix test_image unit test 2016-04-14 13:38:07 -07:00
Daniele Bonadiman 85f80714c2 Max Over Time in imdb_cnn.py (#2320)
* Max Over Time in imdb_cnn.py

Following this issue https://github.com/fchollet/keras/issues/2296 i propose this PR.
The mayor optimisation a part of the Max over time are:

- Dropout in the Embedding layer.
- Longer input sequences (400 instead of 100), made possible from the speedup of the Max Over Time.
- Adam optimizer.

Overall it takes 90 to 100 sec per epoch on my laptop CPU and in two epochs it reaches 0.885 accuracy that is a 5 points improvement over the previous implementation. Moreover it requires less memory (300k parameters vs 3M+) since the number of parameters do not depend  by the length of the input sequence anymore.

* Update imdb_cnn.py
2016-04-14 13:22:06 -07:00
fchollet 2cc9ebf28b Add set_learning_phase in TF backend. 2016-04-14 08:32:04 -07:00
fchollet cb5d69c769 Fix case where output_shape in Merge is tuple 2016-04-13 21:20:21 -07:00
Francois Chollet 3b961a6b7b Fix Graph generator methods 2016-04-13 19:30:11 -07:00
Francois Chollet cba3ea9d90 Fix PEP8 2016-04-13 18:32:18 -07:00
Thomas Boquet 57ea065db7 Added learning phase to callbacks (#2297) (#2303)
* added learning phase to callbacks (#2297)

* cleaned imports

* replaced tabs by spaces

* added case where uses_learning_phase is False

* fixed pep8 blank line bug
2016-04-13 18:00:49 -07:00
Ben Cook 4f5f88b9ba Expose max_q_size and other generator_queue args (#2300)
* [#2287] expose generator_queue args

* [#2287] only expose max_q_size
2016-04-13 18:00:25 -07:00
Francois Chollet c1c2b330a1 Fix "trainable" argument 2016-04-13 17:58:05 -07:00
Francois Chollet 05e1d8e5f4 Fix siamese example 2016-04-13 12:05:42 -07:00
Francois Chollet 3cbca7bdba Fix validation_split 2016-04-13 11:39:16 -07:00
Francois Chollet 3e3c210f1d Update preprocessing/image documentation 2016-04-13 10:15:26 -07:00
Dan Becker cb65139aa8 Allow 'tf' ordering in ImageDataGenerator (#2291) 2016-04-13 10:14:59 -07:00
Francois Chollet 1206120d10 Fix callback issue with Sequential model 2016-04-12 15:22:05 -07:00
Francois Chollet 80ebe80138 Callback style fix 2016-04-12 15:21:02 -07:00
Francois Chollet fa1d6b478e Fix generators methods when passing data as dicts 2016-04-12 13:51:45 -07:00
Francois Chollet 345413fb8c Merge branch 'master' of https://github.com/fchollet/keras 2016-04-12 12:52:05 -07:00
Kyle McDonald 66ebd2a843 corrected parameter for learning_phase (#2284) 2016-04-12 10:18:42 -07:00
fchollet d50f469c09 Fix for invalid dropout values 2016-04-12 09:41:29 -07:00
Charles-Emmanuel 26714bc635 Small typo (#2282)
Small type
2016-04-12 09:23:20 -07:00
Pasquale Minervini 30208ae08b Fix for issue #2276 (#2277) 2016-04-12 09:20:20 -07:00
Francois Chollet 6ea3188971 Fix typo 2016-04-11 22:47:18 -07:00
Kyle McDonald 1db555a530 remove creation of np.asarray in to_categorical (#2268) 2016-04-11 22:36:37 -07:00
Francois Chollet 0772210dea Better error messages for Sequential 2016-04-11 16:41:08 -07:00
194 arquivos alterados com 33732 adições e 13075 exclusões
+1
Ver Arquivo
@@ -8,6 +8,7 @@ keras/datasets/data/*
keras/datasets/temp/*
docs/site/*
docs/theme/*
docs/sources/*
tags
Keras.egg-info
+15 -18
Ver Arquivo
@@ -3,18 +3,18 @@ dist: trusty
language: python
matrix:
include:
- python: 3.4
env: KERAS_BACKEND=theano
- python: 3.4
- python: 2.7
env: KERAS_BACKEND=tensorflow TEST_MODE=PEP8
- python: 2.7
env: KERAS_BACKEND=tensorflow TEST_MODE=INTEGRATION_TESTS
- python: 2.7
env: KERAS_BACKEND=tensorflow
- python: 3.5
env: KERAS_BACKEND=tensorflow
- python: 2.7
env: KERAS_BACKEND=theano
- python: 2.7
env: KERAS_BACKEND=tensorflow
- python: 2.7
env: KERAS_BACKEND=theano TEST_MODE=INTEGRATION_TESTS
- python: 2.7
env: KERAS_BACKEND=theano TEST_MODE=PEP8
- python: 3.5
env: KERAS_BACKEND=theano
install:
# code below is taken from http://conda.pydata.org/docs/travis.html
# We do this conditionally because it saves us some downloading if the
@@ -34,29 +34,26 @@ install:
- conda create -q -n test-environment python=$TRAVIS_PYTHON_VERSION numpy scipy matplotlib pandas pytest h5py
- source activate test-environment
- pip install pytest-cov python-coveralls pytest-xdist coverage==3.7.1 #we need this version of coverage for coveralls.io to work
- pip install pep8 pytest-pep8
- pip install git+git://github.com/Theano/Theano.git
# install PIL for preprocessing tests
- if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
conda install pil;
elif [[ "$TRAVIS_PYTHON_VERSION" == "3.4" ]]; then
elif [[ "$TRAVIS_PYTHON_VERSION" == "3.5" ]]; then
conda install Pillow;
fi
- python setup.py install
- pip install -e .[tests]
# install TensorFlow
- if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.1-cp27-none-linux_x86_64.whl;
elif [[ "$TRAVIS_PYTHON_VERSION" == "3.4" ]]; then
pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.1-cp34-none-linux_x86_64.whl;
fi
- pip install tensorflow
# command to run tests
script:
# run keras backend init to initialize backend config
- python -c "import keras.backend"
# create dataset directory to avoid concurrent directory creation at runtime
- mkdir ~/.keras/datasets
# set up keras backend
- sed -i -e 's/"backend":[[:space:]]*"[^"]*/"backend":\ "'$KERAS_BACKEND'/g' ~/.keras/keras.json;
- echo -e "Running tests with the following config:\n$(cat ~/.keras/keras.json)"
+1 -1
Ver Arquivo
@@ -43,7 +43,7 @@ We love pull requests. Here's a quick guide:
4. Write tests. Your code should have full unit test coverage. If you want to see your PR merged promptly, this is crucial.
5. Run our test suite locally. It's easy: from the Keras folder, simply run: `py.test tests/`.
- You will need to install `pytest`, `coveralls`, `pytest-cov`, `pytest-xdist`: `pip install pytest pytest-cov python-coveralls pytest-xdist pep8 pytest-pep8`
- You will need to install the test requirements as well: `pip install -e .[tests]`.
6. Make sure all tests are passing:
- with the Theano backend, on Python 2.7 and Python 3.5
+6 -2
Ver Arquivo
@@ -1,9 +1,13 @@
Please make sure that the boxes below are checked before you submit your issue. Thank you!
Please make sure that the boxes below are checked before you submit your issue. If your issue is an implementation question, please ask your question on [StackOverflow](http://stackoverflow.com/questions/tagged/keras) or [join the Keras Slack channel](https://keras-slack-autojoin.herokuapp.com/) and ask there instead of filing a GitHub issue.
Thank you!
- [ ] Check that you are up-to-date with the master branch of Keras. You can update with:
pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps
- [ ] If running on TensorFlow, check that you are up-to-date with the latest version. The installation instructions can be found [here](https://www.tensorflow.org/get_started/os_setup).
- [ ] If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with:
pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps
- [ ] Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).
- [ ] Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).
+48 -38
Ver Arquivo
@@ -1,18 +1,17 @@
# Keras: Deep Learning library for Theano and TensorFlow
# Keras: Deep Learning library for TensorFlow and Theano
[![Build Status](https://travis-ci.org/fchollet/keras.svg?branch=master)](https://travis-ci.org/fchollet/keras)
[![PyPI version](https://badge.fury.io/py/keras.svg)](https://badge.fury.io/py/keras)
[![license](https://img.shields.io/github/license/mashape/apistatus.svg?maxAge=2592000)](https://github.com/fchollet/keras/blob/master/LICENSE)
## You have just found Keras.
Keras is a minimalist, highly modular neural networks library, written in Python and capable of running on top of either [TensorFlow](https://github.com/tensorflow/tensorflow) or [Theano](https://github.com/Theano/Theano). It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.
Keras is a high-level neural networks API, written in Python and capable of running on top of either [TensorFlow](https://github.com/tensorflow/tensorflow) or [Theano](https://github.com/Theano/Theano). It was developed with a focus on enabling fast experimentation. *Being able to go from idea to result with the least possible delay is key to doing good research.*
Use Keras if you need a deep learning library that:
- allows for easy and fast prototyping (through total modularity, minimalism, and extensibility).
- supports both convolutional networks and recurrent networks, as well as combinations of the two.
- supports arbitrary connectivity schemes (including multi-input and multi-output training).
- runs seamlessly on CPU and GPU.
- Allows for easy and fast prototyping (through user friendliness, modularity, and extensibility).
- Supports both convolutional networks and recurrent networks, as well as combinations of the two.
- Runs seamlessly on CPU and GPU.
Read the documentation at [Keras.io](http://keras.io).
@@ -24,11 +23,11 @@ Keras is compatible with: __Python 2.7-3.5__.
## Guiding principles
- __User friendliness.__ Keras is an API designed for human beings, not machines. It puts user experience front and center. Keras follows best practices for reducing cognitive load: it offers consistent & simple APIs, it minimizes the number of user actions required for common use cases, and it provides clear and actionable feedback upon user error.
- __Modularity.__ A model is understood as a sequence or a graph of standalone, fully-configurable modules that can be plugged together with as little restrictions as possible. In particular, neural layers, cost functions, optimizers, initialization schemes, activation functions, regularization schemes are all standalone modules that you can combine to create new models.
- __Minimalism.__ Each module should be kept short and simple. Every piece of code should be transparent upon first reading. No black magic: it hurts iteration speed and ability to innovate.
- __Easy extensibility.__ New modules are dead simple to add (as new classes and functions), and existing modules provide ample examples. To be able to easily create new modules allows for total expressiveness, making Keras suitable for advanced research.
- __Easy extensibility.__ New modules are simple to add (as new classes and functions), and existing modules provide ample examples. To be able to easily create new modules allows for total expressiveness, making Keras suitable for advanced research.
- __Work with Python__. No separate models configuration files in a declarative format. Models are described in Python code, which is compact, easier to debug, and allows for ease of extensibility.
@@ -38,9 +37,9 @@ Keras is compatible with: __Python 2.7-3.5__.
## Getting started: 30 seconds to Keras
The core data structure of Keras is a __model__, a way to organize layers. The main type of model is the [`Sequential`](http://keras.io/getting-started/sequential-model-guide) model, a linear stack of layers. For more complex architectures, you should use the [Keras function API](http://keras.io/getting-started/functional-api-guide).
The core data structure of Keras is a __model__, a way to organize layers. The simplest type of model is the [`Sequential`](http://keras.io/getting-started/sequential-model-guide) model, a linear stack of layers. For more complex architectures, you should use the [Keras functional API](http://keras.io/getting-started/functional-api-guide), which allows to build arbitrary graphs of layers.
Here's the `Sequential` model:
Here is the `Sequential` model:
```python
from keras.models import Sequential
@@ -51,47 +50,54 @@ model = Sequential()
Stacking layers is as easy as `.add()`:
```python
from keras.layers.core import Dense, Activation
from keras.layers import Dense, Activation
model.add(Dense(output_dim=64, input_dim=100))
model.add(Activation("relu"))
model.add(Dense(output_dim=10))
model.add(Activation("softmax"))
model.add(Dense(units=64, input_dim=100))
model.add(Activation('relu'))
model.add(Dense(units=10))
model.add(Activation('softmax'))
```
Once your model looks good, configure its learning process with `.compile()`:
```python
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.compile(loss='categorical_crossentropy',
optimizer='sgd',
metrics=['accuracy'])
```
If you need to, you can further configure your optimizer. A core principle of Keras is to make things reasonably simple, while allowing the user to be fully in control when they need to (the ultimate control being the easy extensibility of the source code).
```python
from keras.optimizers import SGD
model.compile(loss='categorical_crossentropy', optimizer=SGD(lr=0.01, momentum=0.9, nesterov=True))
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.SGD(lr=0.01, momentum=0.9, nesterov=True))
```
You can now iterate on your training data in batches:
```python
model.fit(X_train, Y_train, nb_epoch=5, batch_size=32)
# x_train and y_train are Numpy arrays --just like in the Scikit-Learn API.
model.fit(x_train, y_train, epochs=5, batch_size=32)
```
Alternatively, you can feed batches to your model manually:
```python
model.train_on_batch(X_batch, Y_batch)
model.train_on_batch(x_batch, y_batch)
```
Evaluate your performance in one line:
```python
loss_and_metrics = model.evaluate(X_test, Y_test, batch_size=32)
loss_and_metrics = model.evaluate(x_test, y_test, batch_size=128)
```
Or generate predictions on new data:
```python
classes = model.predict_classes(X_test, batch_size=32)
proba = model.predict_proba(X_test, batch_size=32)
classes = model.predict(x_test, batch_size=128)
```
Building a question answering system, an image classification model, a Neural Turing Machine, a word2vec embedder or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful?
Building a question answering system, an image classification model, a Neural Turing Machine, or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful?
For a more in-depth tutorial about Keras, you can check out:
@@ -109,45 +115,49 @@ In the [examples folder](https://github.com/fchollet/keras/tree/master/examples)
Keras uses the following dependencies:
- numpy, scipy
- pyyaml
- yaml
- HDF5 and h5py (optional, required if you use model saving/loading functions)
- Optional but recommended if you use CNNs: cuDNN.
*When using the TensorFlow backend:*
- TensorFlow
- [See installation instructions](https://www.tensorflow.org/install/).
*When using the Theano backend:*
- Theano
- [See installation instructions](http://deeplearning.net/software/theano/install.html#install).
*When using the TensorFlow backend:*
- TensorFlow
- [See installation instructions](https://github.com/tensorflow/tensorflow#download-and-setup).
To install Keras, `cd` to the Keras folder and run the install command:
```
```sh
sudo python setup.py install
```
You can also install Keras from PyPI:
```
```sh
sudo pip install keras
```
------------------
## Switching from Theano to TensorFlow
## Switching from TensorFlow to Theano
By default, Keras will use Theano as its tensor manipulation library. [Follow these instructions](http://keras.io/backend/) to configure the Keras backend.
By default, Keras will use TensorFlow as its tensor manipulation library. [Follow these instructions](http://keras.io/backend/) to configure the Keras backend.
------------------
## Support
You can ask questions and join the development discussion on the [Keras Google group](https://groups.google.com/forum/#!forum/keras-users).
You can ask questions and join the development discussion:
You can also post bug reports and feature requests in [Github issues](https://github.com/fchollet/keras/issues). Make sure to read [our guidelines](https://github.com/fchollet/keras/blob/master/CONTRIBUTING.md) first.
- On the [Keras Google group](https://groups.google.com/forum/#!forum/keras-users).
- On the [Keras Slack channel](https://kerasteam.slack.com). Use [this link](https://keras-slack-autojoin.herokuapp.com/) to request an invitation to the channel.
You can also post **bug reports and feature requests** (only) in [Github issues](https://github.com/fchollet/keras/issues). Make sure to read [our guidelines](https://github.com/fchollet/keras/blob/master/CONTRIBUTING.md) first.
------------------
+46
Ver Arquivo
@@ -0,0 +1,46 @@
FROM nvidia/cuda:8.0-cudnn5-devel
ENV CONDA_DIR /opt/conda
ENV PATH $CONDA_DIR/bin:$PATH
RUN mkdir -p $CONDA_DIR && \
echo export PATH=$CONDA_DIR/bin:'$PATH' > /etc/profile.d/conda.sh && \
apt-get update && \
apt-get install -y wget git libhdf5-dev g++ graphviz && \
wget --quiet https://repo.continuum.io/miniconda/Miniconda3-3.9.1-Linux-x86_64.sh && \
echo "6c6b44acdd0bc4229377ee10d52c8ac6160c336d9cdd669db7371aa9344e1ac3 *Miniconda3-3.9.1-Linux-x86_64.sh" | sha256sum -c - && \
/bin/bash /Miniconda3-3.9.1-Linux-x86_64.sh -f -b -p $CONDA_DIR && \
rm Miniconda3-3.9.1-Linux-x86_64.sh
ENV NB_USER keras
ENV NB_UID 1000
RUN useradd -m -s /bin/bash -N -u $NB_UID $NB_USER && \
mkdir -p $CONDA_DIR && \
chown keras $CONDA_DIR -R && \
mkdir -p /src && \
chown keras /src
USER keras
# Python
ARG python_version=3.5.2
ARG tensorflow_version=0.12.0rc0-cp35-cp35m
RUN conda install -y python=${python_version} && \
pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-${tensorflow_version}-linux_x86_64.whl && \
pip install git+git://github.com/Theano/Theano.git && \
pip install ipdb pytest pytest-cov python-coveralls coverage==3.7.1 pytest-xdist pep8 pytest-pep8 pydot_ng && \
conda install Pillow scikit-learn notebook pandas matplotlib nose pyyaml six h5py && \
pip install git+git://github.com/fchollet/keras.git && \
conda clean -yt
ADD theanorc /home/keras/.theanorc
ENV PYTHONPATH='/src/:$PYTHONPATH'
WORKDIR /src
EXPOSE 8888
CMD jupyter notebook --port=8888 --ip=0.0.0.0
+26
Ver Arquivo
@@ -0,0 +1,26 @@
help:
@cat Makefile
DATA?="${HOME}/Data"
GPU?=0
DOCKER_FILE=Dockerfile
DOCKER=GPU=$(GPU) nvidia-docker
BACKEND=tensorflow
TEST=tests/
SRC=$(shell dirname `pwd`)
build:
docker build -t keras --build-arg python_version=3.5 -f $(DOCKER_FILE) .
bash: build
$(DOCKER) run -it -v $(SRC):/src -v $(DATA):/data --env KERAS_BACKEND=$(BACKEND) keras bash
ipython: build
$(DOCKER) run -it -v $(SRC):/src -v $(DATA):/data --env KERAS_BACKEND=$(BACKEND) keras ipython
notebook: build
$(DOCKER) run -it -v $(SRC):/src -v $(DATA):/data --net=host --env KERAS_BACKEND=$(BACKEND) keras
test: build
$(DOCKER) run -it -v $(SRC):/src -v $(DATA):/data --env KERAS_BACKEND=$(BACKEND) keras py.test $(TEST)
+58
Ver Arquivo
@@ -0,0 +1,58 @@
# Using Keras via Docker
This directory contains `Dockerfile` to make it easy to get up and running with
Keras via [Docker](http://www.docker.com/).
## Installing Docker
General installation instructions are
[on the Docker site](https://docs.docker.com/installation/), but we give some
quick links here:
* [OSX](https://docs.docker.com/installation/mac/): [docker toolbox](https://www.docker.com/toolbox)
* [ubuntu](https://docs.docker.com/installation/ubuntulinux/)
## Running the container
We are using `Makefile` to simplify docker commands within make commands.
Build the container and start a jupyter notebook
$ make notebook
Build the container and start an iPython shell
$ make ipython
Build the container and start a bash
$ make bash
For GPU support install NVidia drivers (ideally latest) and
[nvidia-docker](https://github.com/NVIDIA/nvidia-docker). Run using
$ make notebook GPU=0 # or [ipython, bash]
Switch between Theano and TensorFlow
$ make notebook BACKEND=theano
$ make notebook BACKEND=tensorflow
Mount a volume for external data sets
$ make DATA=~/mydata
Prints all make tasks
$ make help
You can change Theano parameters by editing `/docker/theanorc`.
Note: If you would have a problem running nvidia-docker you may try the old way
we have used. But it is not recommended. If you find a bug in the nvidia-docker report
it there please and try using the nvidia-docker as described above.
$ export CUDA_SO=$(\ls /usr/lib/x86_64-linux-gnu/libcuda.* | xargs -I{} echo '-v {}:{}')
$ export DEVICES=$(\ls /dev/nvidia* | xargs -I{} echo '--device {}:{}')
$ docker run -it -p 8888:8888 $CUDA_SO $DEVICES gcr.io/tensorflow/tensorflow:latest-gpu
+5
Ver Arquivo
@@ -0,0 +1,5 @@
[global]
floatX = float32
optimizer=None
device = gpu
+131 -138
Ver Arquivo
@@ -39,7 +39,8 @@ Index
Text preprocessing
Sequence preprocessing
Objectives
Losses
Metrics
Optimizers
Activations
Callbacks
@@ -53,12 +54,23 @@ Scikit-learn API
'''
from __future__ import print_function
from __future__ import unicode_literals
import re
import inspect
import os
import shutil
import sys
if sys.version[0] == '2':
reload(sys)
sys.setdefaultencoding('utf8')
from keras.layers import convolutional
import keras
from keras import utils
from keras import layers
from keras import initializers
from keras.layers import pooling
from keras.layers import local
from keras.layers import recurrent
from keras.layers import core
from keras.layers import noise
@@ -70,17 +82,28 @@ from keras import optimizers
from keras import callbacks
from keras import models
from keras.engine import topology
from keras import objectives
from keras import losses
from keras import metrics
from keras import backend
from keras import constraints
from keras import activations
from keras import regularizers
from keras.utils import data_utils
from keras.utils import io_utils
from keras.utils import layer_utils
from keras.utils import np_utils
from keras.utils import generic_utils
EXCLUDE = {
'Optimizer',
'Wrapper',
'get_session',
'set_session',
'CallbackList',
'serialize',
'deserialize',
'get',
}
PAGES = [
@@ -98,6 +121,7 @@ PAGES = [
models.Sequential.predict_on_batch,
models.Sequential.fit_generator,
models.Sequential.evaluate_generator,
models.Sequential.predict_generator,
],
},
{
@@ -112,46 +136,64 @@ PAGES = [
models.Model.predict_on_batch,
models.Model.fit_generator,
models.Model.evaluate_generator,
models.Model.predict_generator,
models.Model.get_layer,
]
},
{
'page': 'layers/core.md',
'classes': [
core.Dense,
core.Activation,
core.Dropout,
core.Flatten,
core.Reshape,
core.Permute,
core.RepeatVector,
topology.Merge,
core.Lambda,
core.ActivityRegularization,
core.Masking,
core.Highway,
core.MaxoutDense,
core.TimeDistributedDense,
layers.Dense,
layers.Activation,
layers.Dropout,
layers.Flatten,
layers.Reshape,
layers.Permute,
layers.RepeatVector,
layers.Lambda,
layers.ActivityRegularization,
layers.Masking,
],
},
{
'page': 'layers/convolutional.md',
'classes': [
convolutional.Convolution1D,
convolutional.Convolution2D,
convolutional.Convolution3D,
convolutional.MaxPooling1D,
convolutional.MaxPooling2D,
convolutional.MaxPooling3D,
convolutional.AveragePooling1D,
convolutional.AveragePooling2D,
convolutional.AveragePooling3D,
convolutional.UpSampling1D,
convolutional.UpSampling2D,
convolutional.UpSampling3D,
convolutional.ZeroPadding1D,
convolutional.ZeroPadding2D,
convolutional.ZeroPadding3D,
layers.Conv1D,
layers.Conv2D,
layers.SeparableConv2D,
layers.Conv2DTranspose,
layers.Conv3D,
layers.Cropping1D,
layers.Cropping2D,
layers.Cropping3D,
layers.UpSampling1D,
layers.UpSampling2D,
layers.UpSampling3D,
layers.ZeroPadding1D,
layers.ZeroPadding2D,
layers.ZeroPadding3D,
],
},
{
'page': 'layers/pooling.md',
'classes': [
pooling.MaxPooling1D,
pooling.MaxPooling2D,
pooling.MaxPooling3D,
pooling.AveragePooling1D,
pooling.AveragePooling2D,
pooling.AveragePooling3D,
pooling.GlobalMaxPooling1D,
pooling.GlobalAveragePooling1D,
pooling.GlobalMaxPooling2D,
pooling.GlobalAveragePooling2D,
],
},
{
'page': 'layers/local.md',
'classes': [
local.LocallyConnected1D,
local.LocallyConnected2D,
],
},
{
@@ -183,12 +225,42 @@ PAGES = [
'page': 'layers/noise.md',
'all_module_classes': [noise],
},
{
'page': 'layers/merge.md',
'classes': [
layers.Add,
layers.Multiply,
layers.Average,
layers.Maximum,
layers.Concatenate,
layers.Dot,
],
'functions': [
layers.add,
layers.multiply,
layers.average,
layers.maximum,
layers.concatenate,
layers.dot,
]
},
{
'page': 'layers/wrappers.md',
'all_module_classes': [wrappers],
},
{
'page': 'metrics.md',
'all_module_functions': [metrics],
},
{
'page': 'losses.md',
'all_module_functions': [losses],
},
{
'page': 'initializers.md',
'all_module_functions': [initializers],
'all_module_classes': [initializers],
},
{
'page': 'optimizers.md',
'all_module_classes': [optimizers],
@@ -197,10 +269,20 @@ PAGES = [
'page': 'callbacks.md',
'all_module_classes': [callbacks],
},
{
'page': 'activations.md',
'all_module_functions': [activations],
},
{
'page': 'backend.md',
'all_module_functions': [backend],
},
{
'page': 'utils.md',
'all_module_functions': [utils],
'classes': [utils.CustomObjectScope,
utils.HDF5Matrix]
},
]
ROOT = 'http://keras.io/'
@@ -248,10 +330,8 @@ def get_function_signature(function, method=True):
for a in args:
st += str(a) + ', '
for a, v in kwargs:
if type(v) == str:
if isinstance(v, str):
v = '\'' + v + '\''
elif type(v) == unicode:
v = 'u\'' + v + '\''
st += str(a) + '=' + str(v) + ', '
if kwargs or args:
return st[:-2] + ')'
@@ -330,6 +410,7 @@ def process_function_docstring(docstring):
print('Cleaning up existing sources directory.')
if os.path.exists('sources'):
shutil.rmtree('sources')
print('Populating sources directory with templates.')
for subdir, dirs, fnames in os.walk('templates'):
for fname in fnames:
@@ -341,6 +422,14 @@ for subdir, dirs, fnames in os.walk('templates'):
new_fpath = fpath.replace('templates', 'sources')
shutil.copy(fpath, new_fpath)
# Take care of index page.
readme = open('../README.md').read()
index = open('templates/index.md').read()
index = index.replace('{{autogenerated}}', readme[readme.find('##'):])
f = open('sources/index.md', 'w')
f.write(index)
f.close()
print('Starting autogeneration.')
for page_data in PAGES:
blocks = []
@@ -394,7 +483,11 @@ for page_data in PAGES:
docstring = function.__doc__
if docstring:
subblocks.append(process_function_docstring(docstring))
blocks.append('\n\n'.join(subblocks))
blocks.append('\n\n'.join(subblocks))
if not blocks:
raise RuntimeError('Found no content for page ' +
page_data['page'])
mkdown = '\n----\n\n'.join(blocks)
# save module page.
@@ -414,103 +507,3 @@ for page_data in PAGES:
if not os.path.exists(subdir):
os.makedirs(subdir)
open(path, 'w').write(mkdown)
# covered_so_far = set()
# for module, module_name in MODULES:
# class_pages = []
# for name in dir(module):
# if name in SKIP:
# continue
# if name[0] == '_':
# continue
# module_member = getattr(module, name)
# if module_member in covered_so_far:
# continue
# if inspect.isclass(module_member):
# cls = module_member
# if cls.__module__ == module_name:
# try:
# class_signature = get_function_signature(cls.__init__)
# class_signature = class_signature.replace('__init__', cls.__name__)
# except:
# # in case the class inherits from object and does not
# # define __init__
# class_signature = module_name + '.' + cls.__name__ + '()'
# functions = []
# functions_not_defined_here = []
# for name in dir(cls):
# if name in SKIP:
# continue
# if name[0] == '_':
# continue
# cls_member = getattr(cls, name)
# if inspect.isfunction(cls_member):
# function = cls_member
# signature = inspect.getargspec(function)
# defaults = signature.defaults
# args = signature.args[1:]
# if defaults:
# kwargs = zip(args[-len(defaults):], defaults)
# args = args[:-len(defaults)]
# else:
# kwargs = []
# defined_by = get_earliest_class_that_defined_member(function.__name__, cls)
# if cls == defined_by:
# functions.append(function)
# else:
# functions_not_defined_here.append((function, defined_by))
# blocks = []
# blocks.append('<span style="float:right;">' + class_to_source_link(cls) + '</span>')
# blocks.append('# ' + cls.__name__ + '\n')
# blocks.append(code_snippet(class_signature))
# docstring = cls.__doc__
# if docstring:
# blocks.append(process_class_docstring(docstring))
# if cls.__name__ in INCLUDE_functionS_FOR:
# if functions or functions_not_defined_here:
# blocks.append('### functions\n')
# for function in functions:
# signature = get_function_signature(function)
# signature = signature.replace(module_name + '.', '')
# blocks.append(code_snippet(signature))
# docstring = function.__doc__
# if docstring:
# blocks.append(process_function_docstring(docstring))
# for function, defined_by in functions_not_defined_here:
# signature = get_function_signature(function)
# function_module_name = function.__module__
# signature = signature.replace(function_module_name + '.', '')
# link = '[' + defined_by.__name__ + '](' + class_to_docs_link(defined_by) + ')'
# blocks.append(code_snippet(signature))
# blocks.append('Defined by ' + link + '.\n')
# mkdown = '\n'.join(blocks)
# class_pages.append((id(cls), mkdown))
# covered_so_far.add(module_member)
# class_pages.sort(key=lambda x: x[0])
# class_pages = [x[1] for x in class_pages]
# module_page = '\n----\n\n'.join(class_pages)
# # save module page.
# # Either insert content into existing page,
# # or create page otherwise
# path = 'sources/' + module_name.replace('.', '/')[6:] + '.md'
# if os.path.exists(path):
# template = open(path).read()
# assert '{{autogenerated}}' in template, ('Template found for ' + path +
# ' but missing {{autogenerated}} tag.')
# module_page = template.replace('{{autogenerated}}', module_page)
# print('...inserting autogenerated content into template:', path)
# else:
# print('...creating new page with autogenerated content:', path)
# subdir = os.path.dirname(path)
# if not os.path.exists(subdir):
# os.makedirs(subdir)
# open(path, 'w').write(module_page)
+9 -8
Ver Arquivo
@@ -1,15 +1,14 @@
site_name: Keras Documentation
theme: readthedocs
theme_dir: theme
docs_dir: sources
repo_url: http://github.com/fchollet/keras
site_url: http://keras.io/
# theme_dir: theme
site_description: 'Documentation for Keras, the Python Deep Learning library.'
dev_addr: '0.0.0.0:8000'
google_analytics: ['UA-61785484-1', 'keras.io']
pages:
- Home: index.md
- Getting started:
@@ -24,8 +23,11 @@ pages:
- About Keras layers: layers/about-keras-layers.md
- Core Layers: layers/core.md
- Convolutional Layers: layers/convolutional.md
- Pooling Layers: layers/pooling.md
- Locally-connected Layers: layers/local.md
- Recurrent Layers: layers/recurrent.md
- Embedding Layers: layers/embeddings.md
- Merge Layers: layers/merge.md
- Advanced Activations Layers: layers/advanced-activations.md
- Normalization Layers: layers/normalization.md
- Noise layers: layers/noise.md
@@ -35,18 +37,17 @@ pages:
- Sequence Preprocessing: preprocessing/sequence.md
- Text Preprocessing: preprocessing/text.md
- Image Preprocessing: preprocessing/image.md
- Objectives: objectives.md
- Losses: losses.md
- Metrics: metrics.md
- Optimizers: optimizers.md
- Activations: activations.md
- Callbacks: callbacks.md
- Datasets: datasets.md
- Applications: applications.md
- Backend: backend.md
- Initializations: initializations.md
- Initializers: initializers.md
- Regularizers: regularizers.md
- Constraints: constraints.md
- Visualization: visualization.md
- Scikit-learn API: scikit-learn-api.md
- Utils: utils.md
+10 -17
Ver Arquivo
@@ -4,38 +4,31 @@
Activations can either be used through an `Activation` layer, or through the `activation` argument supported by all forward layers:
```python
from keras.layers.core import Activation, Dense
from keras.layers import Activation, Dense
model.add(Dense(64))
model.add(Activation('tanh'))
```
is equivalent to:
This is equivalent to:
```python
model.add(Dense(64, activation='tanh'))
```
You can also pass an element-wise Theano/TensorFlow function as an activation:
You can also pass an element-wise Tensorflow/Theano function as an activation:
```python
from keras import backend as K
def tanh(x):
return K.tanh(x)
model.add(Dense(64, activation=tanh))
model.add(Activation(tanh))
model.add(Dense(64, activation=K.tanh))
model.add(Activation(K.tanh))
```
## Available activations
- __softmax__: Softmax applied across inputs last dimension. Expects shape either `(nb_samples, nb_timesteps, nb_dims)` or `(nb_samples, nb_dims)`.
- __softplus__
- __relu__
- __tanh__
- __sigmoid__
- __hard_sigmoid__
- __linear__
{{autogenerated}}
## On Advanced Activations
## On "Advanced Activations"
Activations that are more complex than a simple Theano/TensorFlow function (eg. learnable activations, configurable activations, etc.) are available as [Advanced Activation layers](layers/advanced-activations.md), and can be found in the module `keras.layers.advanced_activations`. These include PReLU and LeakyReLU.
Activations that are more complex than a simple Tensorflow/Theano function (eg. learnable activations, which maintain a state) are available as [Advanced Activation layers](layers/advanced-activations.md), and can be found in the module `keras.layers.advanced_activations`. These include `PReLU` and `LeakyReLU`.
+455
Ver Arquivo
@@ -0,0 +1,455 @@
# Applications
Keras Applications are deep learning models that are made available alongside pre-trained weights.
These models can be used for prediction, feature extraction, and fine-tuning.
Weights are downloaded automatically when instantiating a model. They are stored at `~/.keras/models/`.
## Available models
### Models for image classification with weights trained on ImageNet:
- [Xception](#xception)
- [VGG16](#vgg16)
- [VGG19](#vgg19)
- [ResNet50](#resnet50)
- [InceptionV3](#inceptionv3)
All of these architectures (except Xception) are compatible with both TensorFlow and Theano, and upon instantiation the models will be built according to the image data format set in your Keras configuration file at `~/.keras/keras.json`. For instance, if you have set `image_data_format=tf`, then any model loaded from this repository will get built according to the TensorFlow data format convention, "Width-Height-Depth".
The Xception model is only available for TensorFlow, due to its reliance on `SeparableConvolution` layers.
-----
## Usage examples for image classification models
### Classify ImageNet classes with ResNet50
```python
from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np
model = ResNet50(weights='imagenet')
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = model.predict(x)
# decode the results into a list of tuples (class, description, probability)
# (one such list for each sample in the batch)
print('Predicted:', decode_predictions(preds, top=3)[0])
# Predicted: [(u'n02504013', u'Indian_elephant', 0.82658225), (u'n01871265', u'tusker', 0.1122357), (u'n02504458', u'African_elephant', 0.061040461)]
```
### Extract features with VGG16
```python
from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
import numpy as np
model = VGG16(weights='imagenet', include_top=False)
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
features = model.predict(x)
```
### Extract features from an arbitrary intermediate layer with VGG19
```python
from keras.applications.vgg19 import VGG19
from keras.preprocessing import image
from keras.applications.vgg19 import preprocess_input
from keras.models import Model
import numpy as np
base_model = VGG19(weights='imagenet')
model = Model(input=base_model.input, output=base_model.get_layer('block4_pool').output)
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
block4_pool_features = model.predict(x)
```
### Fine-tune InceptionV3 on a new set of classes
```python
from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K
# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)
# this is the model we will train
model = Model(input=base_model.input, output=predictions)
# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
layer.trainable = False
# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
# train the model on the new data for a few epochs
model.fit_generator(...)
# at this point, the top layers are well trained and we can start fine-tuning
# convolutional layers from inception V3. We will freeze the bottom N layers
# and train the remaining top layers.
# let's visualize layer names and layer indices to see how many layers
# we should freeze:
for i, layer in enumerate(base_model.layers):
print(i, layer.name)
# we chose to train the top 2 inception blocks, i.e. we will freeze
# the first 172 layers and unfreeze the rest:
for layer in model.layers[:172]:
layer.trainable = False
for layer in model.layers[172:]:
layer.trainable = True
# we need to recompile the model for these modifications to take effect
# we use SGD with a low learning rate
from keras.optimizers import SGD
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy')
# we train our model again (this time fine-tuning the top 2 inception blocks
# alongside the top Dense layers
model.fit_generator(...)
```
### Build InceptionV3 over a custom input tensor
```python
from keras.applications.inception_v3 import InceptionV3
from keras.layers import Input
# this could also be the output a different Keras model or layer
input_tensor = Input(shape=(224, 224, 3)) # this assumes K.image_data_format() == 'channels_last'
model = InceptionV3(input_tensor=input_tensor, weights='imagenet', include_top=True)
```
-----
# Documentation for individual models
- [Xception](#xception)
- [VGG16](#vgg16)
- [VGG19](#vgg19)
- [ResNet50](#resnet50)
- [InceptionV3](#inceptionv3)
-----
## Xception
```python
keras.applications.xception.Xception(include_top=True, weights='imagenet', input_tensor=None, input_shape=None)
```
Xception V1 model, with weights pre-trained on ImageNet.
On ImageNet, this model gets to a top-1 validation accuracy of 0.790
and a top-5 validation accuracy of 0.945.
Note that this model is only available for the TensorFlow backend,
due to its reliance on `SeparableConvolution` layers. Additionally it only supports
the data format "channels_last" (height, width, channels).
The default input size for this model is 299x299.
### Arguments
- include_top: whether to include the fully-connected layer at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
- input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(299, 299, 3)`.
It should have exactly 3 inputs channels,
and width and height should be no smaller than 71.
E.g. `(150, 150, 3)` would be one valid value.
- pooling: Optional pooling mode for feature extraction
when `include_top` is `False`.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
be applied.
- classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
if no `weights` argument is specified.
### Returns
A Keras model instance.
### References
- [Xception: Deep Learning with Depthwise Separable Convolutions](https://arxiv.org/abs/1610.02357)
### License
These weights are trained by ourselves and are released under the MIT license.
-----
## VGG16
```python
keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', input_tensor=None, input_shape=None)
```
VGG16 model, with weights pre-trained on ImageNet.
This model is available for both the Theano and TensorFlow backend, and can be built both
with "channels_first" data format (channels, height, width) or "channels_last" data format (height, width, channels).
The default input size for this model is 224x224.
### Arguments
- include_top: whether to include the 3 fully-connected layers at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
- input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(224, 224, 3)` (with `channels_last` data format)
or `(3, 224, 244)` (with `channels_first` data format).
It should have exactly 3 inputs channels,
and width and height should be no smaller than 48.
E.g. `(200, 200, 3)` would be one valid value.
- pooling: Optional pooling mode for feature extraction
when `include_top` is `False`.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
be applied.
- classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
if no `weights` argument is specified.
### Returns
A Keras model instance.
### References
- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556): please cite this paper if you use the VGG models in your work.
### License
These weights are ported from the ones [released by VGG at Oxford](http://www.robots.ox.ac.uk/~vgg/research/very_deep/) under the [Creative Commons Attribution License](https://creativecommons.org/licenses/by/4.0/).
-----
## VGG19
```python
keras.applications.vgg19.VGG19(include_top=True, weights='imagenet', input_tensor=None, input_shape=None)
```
VGG19 model, with weights pre-trained on ImageNet.
This model is available for both the Theano and TensorFlow backend, and can be built both
with "channels_first" data format (channels, height, width) or "channels_last" data format (height, width, channels).
The default input size for this model is 224x224.
### Arguments
- include_top: whether to include the 3 fully-connected layers at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
- input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(224, 224, 3)` (with `channels_last` data format)
or `(3, 224, 244)` (with `channels_first` data format).
It should have exactly 3 inputs channels,
and width and height should be no smaller than 48.
E.g. `(200, 200, 3)` would be one valid value.
- pooling: Optional pooling mode for feature extraction
when `include_top` is `False`.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
be applied.
- classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
if no `weights` argument is specified.
### Returns
A Keras model instance.
### References
- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)
### License
These weights are ported from the ones [released by VGG at Oxford](http://www.robots.ox.ac.uk/~vgg/research/very_deep/) under the [Creative Commons Attribution License](https://creativecommons.org/licenses/by/4.0/).
-----
## ResNet50
```python
keras.applications.resnet50.ResNet50(include_top=True, weights='imagenet', input_tensor=None, input_shape=None)
```
ResNet50 model, with weights pre-trained on ImageNet.
This model is available for both the Theano and TensorFlow backend, and can be built both
with "channels_first" data format (channels, height, width) or "channels_last" data format (height, width, channels).
The default input size for this model is 224x224.
### Arguments
- include_top: whether to include the fully-connected layer at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
- input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(224, 224, 3)` (with `channels_last` data format)
or `(3, 224, 244)` (with `channels_first` data format).
It should have exactly 3 inputs channels,
and width and height should be no smaller than 197.
E.g. `(200, 200, 3)` would be one valid value.
- pooling: Optional pooling mode for feature extraction
when `include_top` is `False`.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
be applied.
- classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
if no `weights` argument is specified.
### Returns
A Keras model instance.
### References
- [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)
### License
These weights are ported from the ones [released by Kaiming He](https://github.com/KaimingHe/deep-residual-networks) under the [MIT license](https://github.com/KaimingHe/deep-residual-networks/blob/master/LICENSE).
-----
## InceptionV3
```python
keras.applications.inception_v3.InceptionV3(include_top=True, weights='imagenet', input_tensor=None, input_shape=None)
```
Inception V3 model, with weights pre-trained on ImageNet.
This model is available for both the Theano and TensorFlow backend, and can be built both
with "channels_first" data format (channels, height, width) or "channels_last" data format (height, width, channels).
The default input size for this model is 299x299.
### Arguments
- include_top: whether to include the fully-connected layer at the top of the network.
- weights: one of `None` (random initialization) or "imagenet" (pre-training on ImageNet).
- input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model.
- input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(299, 299, 3)` (with `channels_last` data format)
or `(3, 299, 299)` (with `channels_first` data format).
It should have exactly 3 inputs channels,
and width and height should be no smaller than 139.
E.g. `(150, 150, 3)` would be one valid value.
- pooling: Optional pooling mode for feature extraction
when `include_top` is `False`.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
be applied.
- classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
if no `weights` argument is specified.
### Returns
A Keras model instance.
### References
- [Rethinking the Inception Architecture for Computer Vision](http://arxiv.org/abs/1512.00567)
### License
These weights are released under [the Apache License](https://github.com/tensorflow/models/blob/master/LICENSE).
+37 -6
Ver Arquivo
@@ -4,10 +4,12 @@
Keras is a model-level library, providing high-level building blocks for developing deep learning models. It does not handle itself low-level operations such as tensor products, convolutions and so on. Instead, it relies on a specialized, well-optimized tensor manipulation library to do so, serving as the "backend engine" of Keras. Rather than picking one single tensor library and making the implementation of Keras tied to that library, Keras handles the problem in a modular way, and several different backend engines can be plugged seamlessly into Keras.
At this time, Keras has two backend implementations available: the **Theano** backend and the **TensorFlow** backend.
At this time, Keras has two backend implementations available: the **TensorFlow** backend and the **Theano** backend.
- [Theano](http://deeplearning.net/software/theano/) is an open-source symbolic tensor manipulation framework developed by LISA/MILA Lab at Université de Montréal.
- [TensorFlow](http://www.tensorflow.org/) is an open-source symbolic tensor manipulation framework developed by Google, Inc.
- [Theano](http://deeplearning.net/software/theano/) is an open-source symbolic tensor manipulation framework developed by LISA/MILA Lab at Université de Montréal.
In the future, we are likely to add more backend options. Go ask Microsoft about how their CNTK backend project is doing.
----
@@ -19,9 +21,16 @@ If you have run Keras at least once, you will find the Keras configuration file
If it isn't there, you can create it.
It probably looks like this:
The default configuration file looks like this:
`{"epsilon": 1e-07, "floatx": "float32", "backend": "theano"}`
```
{
"image_data_format": "channels_last",
"epsilon": 1e-07,
"floatx": "float32",
"backend": "tensorflow"
}
```
Simply change the field `backend` to either `"theano"` or `"tensorflow"`, and Keras will use the new configuration next time you run any Keras code.
@@ -29,13 +38,35 @@ You can also define the environment variable ``KERAS_BACKEND`` and this will
override what is defined in your config file :
```bash
KERAS_BACKEND=tensorflow python -c "from keras import backend; print backend._BACKEND"
KERAS_BACKEND=tensorflow python -c "from keras import backend"
Using TensorFlow backend.
tensorflow
```
----
## keras.json details
```
{
"image_data_format": "channels_last",
"epsilon": 1e-07,
"floatx": "float32",
"backend": "tensorflow"
}
```
You can change these settings by editing `~/.keras/keras.json`.
* `image_data_format`: string, either `"channels_last"` or `"channels_first"`. It specifies which data format convention Keras will follow. (`keras.backend.image_data_format()` returns it.)
- For 2D data (e.g. image), `"channels_last"` assumes `(rows, cols, channels)` while `"channels_first"` assumes `(channels, rows, cols)`.
- For 3D data, `"channels_last"` assumes `(conv_dim1, conv_dim2, conv_dim3, channels)` while `"channels_first"` assumes `(channels, conv_dim1, conv_dim2, conv_dim3)`.
* `epsilon`: float, a numeric fuzzing constant used to avoid dividing by zero in some operations.
* `floatx`: string, `"float16"`, `"float32"`, or `"float64"`. Default float precision.
* `backend`: string, `"tensorflow"` or `"theano"`.
----
## Using the abstract Keras backend to write new code
If you want the Keras modules you write to be compatible with both Theano and TensorFlow, you have to write them via the abstract Keras backend API. Here's an intro.
+3 -3
Ver Arquivo
@@ -1,6 +1,6 @@
## Usage of callbacks
A callback is a set of functions to be applied at given stages of the training procedure. You can use callbacks to get a view on internal states and statistics of the model during training. You can pass a list of callbacks (as the keyword argument `callbacks`) to the `.fit()` method of the `Sequential` model. The relevant methods of the callbacks will then be called at each stage of the training.
A callback is a set of functions to be applied at given stages of the training procedure. You can use callbacks to get a view on internal states and statistics of the model during training. You can pass a list of callbacks (as the keyword argument `callbacks`) to the `.fit()` method of the `Sequential` or `Model` classes. The relevant methods of the callbacks will then be called at each stage of the training.
---
@@ -41,7 +41,7 @@ model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
history = LossHistory()
model.fit(X_train, Y_train, batch_size=128, nb_epoch=20, verbose=0, callbacks=[history])
model.fit(X_train, Y_train, batch_size=128, epochs=20, verbose=0, callbacks=[history])
print history.losses
# outputs
@@ -66,7 +66,7 @@ model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
saves the model weights after each epoch if the validation loss decreased
'''
checkpointer = ModelCheckpoint(filepath="/tmp/weights.hdf5", verbose=1, save_best_only=True)
model.fit(X_train, Y_train, batch_size=128, nb_epoch=20, verbose=0, validation_data=(X_test, Y_test), callbacks=[checkpointer])
model.fit(X_train, Y_train, batch_size=128, epochs=20, verbose=0, validation_data=(X_test, Y_test), callbacks=[checkpointer])
```
+7 -7
Ver Arquivo
@@ -2,21 +2,21 @@
Functions from the `constraints` module allow setting constraints (eg. non-negativity) on network parameters during optimization.
The penalties are applied on a per-layer basis. The exact API will depend on the layer, but the layers `Dense`, `TimeDistributedDense`, `MaxoutDense`, `Convolution1D` and `Convolution2D` have a unified API.
The penalties are applied on a per-layer basis. The exact API will depend on the layer, but the layers `Dense`, `Convolution1D`, `Convolution2D` and `Convolution3D` have a unified API.
These layers expose 2 keyword arguments:
- `W_constraint` for the main weights matrix
- `b_constraint` for the bias.
- `kernel_constraint` for the main weights matrix
- `bias_constraint` for the bias.
```python
from keras.constraints import maxnorm
model.add(Dense(64, W_constraint = maxnorm(2)))
model.add(Dense(64, kernel_constraint=max_norm(2.)))
```
## Available constraints
- __maxnorm__(m=2): maximum-norm constraint
- __nonneg__(): non-negativity constraint
- __unitnorm__(): unit-norm constraint, enforces the matrix to have unit norm along the last axis
- __max_norm__(m=2): maximum-norm constraint
- __non_neg__(): non-negativity constraint
- __unit_norm__(): unit-norm constraint, enforces the matrix to have unit norm along the last axis
+76 -27
Ver Arquivo
@@ -9,13 +9,14 @@ Dataset of 50,000 32x32 color training images, labeled over 10 categories, and 1
```python
from keras.datasets import cifar10
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
```
- __Return:__
- __Returns:__
- 2 tuples:
- __X_train, X_test__: uint8 array of RGB image data with shape (nb_samples, 3, 32, 32).
- __y_train, y_test__: uint8 array of category labels (integers in range 0-9) with shape (nb_samples,).
- __x_train, x_test__: uint8 array of RGB image data with shape (num_samples, 3, 32, 32).
- __y_train, y_test__: uint8 array of category labels (integers in range 0-9) with shape (num_samples,).
---
@@ -28,18 +29,19 @@ Dataset of 50,000 32x32 color training images, labeled over 100 categories, and
```python
from keras.datasets import cifar100
(X_train, y_train), (X_test, y_test) = cifar100.load_data(label_mode='fine')
(x_train, y_train), (x_test, y_test) = cifar100.load_data(label_mode='fine')
```
- __Return:__
- __Returns:__
- 2 tuples:
- __X_train, X_test__: uint8 array of RGB image data with shape (nb_samples, 3, 32, 32).
- __y_train, y_test__: uint8 array of category labels with shape (nb_samples,).
- __x_train, x_test__: uint8 array of RGB image data with shape (num_samples, 3, 32, 32).
- __y_train, y_test__: uint8 array of category labels with shape (num_samples,).
- __Arguments:__
- __label_mode__: "fine" or "coarse".
---
## IMDB Movie reviews sentiment classification
@@ -53,25 +55,33 @@ As a convention, "0" does not stand for a specific word, but instead is used to
```python
from keras.datasets import imdb
(X_train, y_train), (X_test, y_test) = imdb.load_data(path="imdb.pkl",
nb_words=None,
(x_train, y_train), (x_test, y_test) = imdb.load_data(path="imdb_full.pkl",
num_words=None,
skip_top=0,
maxlen=None,
test_split=0.1)
seed=113,
start_char=1,
oov_char=2,
index_from=3)
```
- __Return:__
- __Returns:__
- 2 tuples:
- __X_train, X_test__: list of sequences, which are lists of indexes (integers). If the nb_words argument was specific, the maximum possible index value is nb_words-1. If the maxlen argument was specified, the largest possible sequence length is maxlen.
- __x_train, x_test__: list of sequences, which are lists of indexes (integers). If the num_words argument was specific, the maximum possible index value is num_words-1. If the maxlen argument was specified, the largest possible sequence length is maxlen.
- __y_train, y_test__: list of integer labels (1 or 0).
- __Arguments:__
- __path__: if you do have the data locally (at `'~/.keras/datasets/' + path`), if will be downloaded to this location (in cPickle format).
- __nb_words__: integer or None. Top most frequent words to consider. Any less frequent word will appear as 0 in the sequence data.
- __path__: if you do not have the data locally (at `'~/.keras/datasets/' + path`), it will be downloaded to this location.
- __num_words__: integer or None. Top most frequent words to consider. Any less frequent word will appear as 0 in the sequence data.
- __skip_top__: integer. Top most frequent words to ignore (they will appear as 0s in the sequence data).
- __maxlen__: int. Maximum sequence length. Any longer sequence will be truncated.
- __test_split__: float. Fraction of the dataset to be used as test data.
- __seed__: int. Seed for reproducible data shuffling.
- __start_char__: char. The start of a sequence will be marked with this character.
Set to 1 because 0 is usually the padding character.
- __oov_char__: char. words that were cut out because of the `num_words`
or `skip_top` limit will be replaced with this character.
- __index_from__: int. Index actual words with this index and higher.
---
@@ -84,14 +94,20 @@ Dataset of 11,228 newswires from Reuters, labeled over 46 topics. As with the IM
```python
from keras.datasets import reuters
(X_train, y_train), (X_test, y_test) = reuters.load_data(path="reuters.pkl",
nb_words=None,
(x_train, y_train), (x_test, y_test) = reuters.load_data(path="reuters.pkl",
num_words=None,
skip_top=0,
maxlen=None,
test_split=0.1)
test_split=0.2,
seed=113,
start_char=1,
oov_char=2,
index_from=3)
```
The specifications are the same as that of the IMDB dataset.
The specifications are the same as that of the IMDB dataset, with the addition of:
- __test_split__: float. Fraction of the dataset to be used as test data.
This dataset also makes available the word index used for encoding the sequences:
@@ -99,12 +115,15 @@ This dataset also makes available the word index used for encoding the sequences
word_index = reuters.get_word_index(path="reuters_word_index.pkl")
```
- __Return:__ A dictionary where key are words (str) and values are indexes (integer). eg. `word_index["giraffe"]` might return `1234`.
- __Returns:__ A dictionary where key are words (str) and values are indexes (integer). eg. `word_index["giraffe"]` might return `1234`.
- __Arguments:__
- __path__: if you do have the index file locally (at `'~/.keras/datasets/' + path`), if will be downloaded to this location (in cPickle format).
- __path__: if you do not have the index file locally (at `'~/.keras/datasets/' + path`), it will be downloaded to this location.
---
## MNIST database of handwritten digits
Dataset of 60,000 28x28 grayscale images of the 10 digits, along with a test set of 10,000 images.
@@ -114,14 +133,44 @@ Dataset of 60,000 28x28 grayscale images of the 10 digits, along with a test set
```python
from keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
(x_train, y_train), (x_test, y_test) = mnist.load_data()
```
- __Return:__
- __Returns:__
- 2 tuples:
- __X_train, X_test__: uint8 array of grayscale image data with shape (nb_samples, 28, 28).
- __y_train, y_test__: uint8 array of digit labels (integers in range 0-9) with shape (nb_samples,).
- __x_train, x_test__: uint8 array of grayscale image data with shape (num_samples, 28, 28).
- __y_train, y_test__: uint8 array of digit labels (integers in range 0-9) with shape (num_samples,).
- __Arguments:__
- __path__: if you do have the index file locally (at `'~/.keras/datasets/' + path`), if will be downloaded to this location (in cPickle format).
- __path__: if you do not have the index file locally (at `'~/.keras/datasets/' + path`), it will be downloaded to this location.
---
## Boston housing price regression dataset
Dataset taken from the StatLib library which is maintained at Carnegie Mellon University.
Samples contain 13 attributes of houses at different locations around the Boston suburbs in the late 1970s.
Targets are the median values of the houses at a location (in k$).
### Usage:
```python
from keras.datasets import boston_housing
(x_train, y_train), (x_test, y_test) = boston_housing.load_data()
```
- __Arguments:__
- __path__: path where to cache the dataset locally
(relative to ~/.keras/datasets).
- __seed__: Random seed for shuffling the data
before computing the test split.
- __test_split__: fraction of the data to reserve as test set.
- __Returns:__
Tuple of Numpy arrays: `(x_train, y_train), (x_test, y_test)`.
+189 -27
Ver Arquivo
@@ -2,15 +2,19 @@
- [How should I cite Keras?](#how-should-i-cite-keras)
- [How can I run Keras on GPU?](#how-can-i-run-keras-on-gpu)
- [What does \["sample", "batch", "epoch"\] mean?](#what-does-sample-batch-epoch-mean)
- [How can I save a Keras model?](#how-can-i-save-a-keras-model)
- [Why is the training loss much higher than the testing loss?](#why-is-the-training-loss-much-higher-than-the-testing-loss)
- [How can I visualize the output of an intermediate layer?](#how-can-i-visualize-the-output-of-an-intermediate-layer)
- [How can I obtain the output of an intermediate layer?](#how-can-i-obtain-the-output-of-an-intermediate-layer)
- [How can I use Keras with datasets that don't fit in memory?](#how-can-i-use-keras-with-datasets-that-dont-fit-in-memory)
- [How can I interrupt training when the validation loss isn't decreasing anymore?](#how-can-i-interrupt-training-when-the-validation-loss-isnt-decreasing-anymore)
- [How is the validation split computed?](#how-is-the-validation-split-computed)
- [Is the data shuffled during training?](#is-the-data-shuffled-during-training)
- [How can I record the training / validation loss / accuracy at each epoch?](#how-can-i-record-the-training-validation-loss-accuracy-at-each-epoch)
- [How can I "freeze" layers?](#how-can-i-freeze-keras-layers)
- [How can I use stateful RNNs?](#how-can-i-use-stateful-rnns)
- [How can I remove a layer from a Sequential model?](#how-can-i-remove-a-layer-from-a-sequential-model)
- [How can I use pre-trained models in Keras?](#how-can-i-use-pre-trained-models-in-keras)
---
@@ -20,18 +24,20 @@ Please cite Keras in your publications if it helps your research. Here is an exa
```
@misc{chollet2015keras,
author = {Chollet, François},
title = {Keras},
year = {2015},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/fchollet/keras}}
title={Keras},
author={Chollet, Fran\c{c}ois},
year={2015},
publisher={GitHub},
howpublished={\url{https://github.com/fchollet/keras}},
}
```
---
### How can I run Keras on GPU?
If you are running on the TensorFlow backend, your code will automatically run on GPU if any available GPU is detected.
If you are running on the Theano backend, you can use one of the following methods:
Method 1: use Theano flags.
@@ -52,11 +58,50 @@ theano.config.floatX = 'float32'
---
### What does \["sample", "batch", "epoch"\] mean?
Below are some common definitions that are necessary to know and understand to correctly utilize Keras:
- **Sample**: one element of a dataset.
- *Example:* one image is a **sample** in a convolutional network
- *Example:* one audio file is a **sample** for a speech recognition model
- **Batch**: a set of *N* samples. The samples in a **batch** are processed independently, in parallel. If training, a batch results in only one update to the model.
- A **batch** generally approximates the distribution of the input data better than a single input. The larger the batch, the better the approximation; however, it is also true that the batch will take longer to processes and will still result in only one update. For inference (evaluate/predict), it is recommended to pick a batch size that is as large as you can afford without going out of memory (since larger batches will usually result in faster evaluating/prediction).
- **Epoch**: an arbitrary cutoff, generally defined as "one pass over the entire dataset", used to separate training into distinct phases, which is useful for logging and periodic evaluation.
- When using `evaluation_data` or `evaluation_split` with the `fit` method of Keras models, evaluation will be run at the end of every **epoch**.
- Within Keras, there is the ability to add [callbacks](https://keras.io/callbacks/) specifically designed to be run at the end of an **epoch**. Examples of these are learning rate changes and model checkpointing (saving).
---
### How can I save a Keras model?
*It is not recommended to use pickle or cPickle to save a Keras model.*
If you only need to save the architecture of a model, and not its weights, you can do:
You can use `model.save(filepath)` to save a Keras model into a single HDF5 file which will contain:
- the architecture of the model, allowing to re-create the model
- the weights of the model
- the training configuration (loss, optimizer)
- the state of the optimizer, allowing to resume training exactly where you left off.
You can then use `keras.models.load_model(filepath)` to reinstantiate your model.
`load_model` will also take care of compiling the model using the saved training configuration
(unless the model was never compiled in the first place).
Example:
```python
from keras.models import load_model
model.save('my_model.h5') # creates a HDF5 file 'my_model.h5'
del model # deletes the existing model
# returns a compiled model
# identical to the previous one
model = load_model('my_model.h5')
```
If you only need to save the **architecture of a model**, and not its weights or its training configuration, you can do:
```python
# save as JSON
@@ -66,6 +111,8 @@ json_string = model.to_json()
yaml_string = model.to_yaml()
```
The generated JSON / YAML files are human-readable and can be manually edited if needed.
You can then build a fresh model from this data:
```python
@@ -74,10 +121,11 @@ from keras.models import model_from_json
model = model_from_json(json_string)
# model reconstruction from YAML
from keras.models import model_from_yaml
model = model_from_yaml(yaml_string)
```
If you need to save the weights of a model, you can do so in HDF5 with the code below.
If you need to save the **weights of a model**, you can do so in HDF5 with the code below.
Note that you will first need to install HDF5 and the Python library h5py, which do not come bundled with Keras.
@@ -85,21 +133,37 @@ Note that you will first need to install HDF5 and the Python library h5py, which
model.save_weights('my_model_weights.h5')
```
Assuming you have code for instantiating your model, you can then load the weights you saved into a model with the same architecture:
Assuming you have code for instantiating your model, you can then load the weights you saved into a model with the *same* architecture:
```python
model.load_weights('my_model_weights.h5')
```
This leads us to a way to save and reconstruct models from only serialized data:
```python
json_string = model.to_json()
open('my_model_architecture.json', 'w').write(json_string)
model.save_weights('my_model_weights.h5')
If you need to load weights into a *different* architecture (with some layers in common), for instance for fine-tuning or transfer-learning, you can load weights by *layer name*:
# elsewhere...
model = model_from_json(open('my_model_architecture.json').read())
model.load_weights('my_model_weights.h5')
```python
model.load_weights('my_model_weights.h5', by_name=True)
```
For example:
```python
"""
Assume original model looks like this:
model = Sequential()
model.add(Dense(2, input_dim=3, name="dense_1"))
model.add(Dense(3, name="dense_2"))
...
model.save_weights(fname)
"""
# new model
model = Sequential()
model.add(Dense(2, input_dim=3, name="dense_1")) # will be loaded
model.add(Dense(10, name="new_dense")) # will not be loaded
# load weights from first model; will only affect the first layer, dense_1.
model.load_weights(fname, by_name=True)
```
---
@@ -112,9 +176,22 @@ Besides, the training loss is the average of the losses over each batch of train
---
### How can I visualize the output of an intermediate layer?
### How can I obtain the output of an intermediate layer?
You can build a Keras function that will return the output of a certain layer given a certain input, for example:
One simple way is to create a new `Model` that will output the layers that you are interested in:
```python
from keras.models import Model
model = ... # create the original model
layer_name = 'my_layer'
intermediate_layer_model = Model(inputs=model.input,
outputs=model.get_layer(layer_name).output)
intermediate_output = intermediate_layer_model.predict(data)
```
Alternatively, you can build a Keras function that will return the output of a certain layer given a certain input, for example:
```python
from keras import backend as K
@@ -134,10 +211,10 @@ to pass the learning phase flag to your function:
get_3rd_layer_output = K.function([model.layers[0].input, K.learning_phase()],
[model.layers[3].output])
# output in train mode
layer_output = get_3rd_layer_output([X, 1])[0]
# output in test mode = 0
layer_output = get_3rd_layer_output([X, 0])[0]
# output in test mode
# output in train mode = 1
layer_output = get_3rd_layer_output([X, 1])[0]
```
@@ -147,7 +224,7 @@ layer_output = get_3rd_layer_output([X, 1])[0]
You can do batch training using `model.train_on_batch(X, y)` and `model.test_on_batch(X, y)`. See the [models documentation](/models/sequential).
Alternatively, you can write a generator that yields batches of training data and use the method `model.fit_generator(data_generator, samples_per_epoch, nb_epoch)`.
Alternatively, you can write a generator that yields batches of training data and use the method `model.fit_generator(data_generator, samples_per_epoch, epochs)`.
You can see batch training in action in our [CIFAR10 example](https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py).
@@ -169,8 +246,9 @@ Find out more in the [callbacks documentation](/callbacks).
### How is the validation split computed?
If you set the `validation_split` argument in `model.fit` to e.g. 0.1, then the validation data used will be the *last 10%* of the data. If you set it to 0.25, it will be the last 25% of the data, etc.
If you set the `validation_split` argument in `model.fit` to e.g. 0.1, then the validation data used will be the *last 10%* of the data. If you set it to 0.25, it will be the last 25% of the data, etc. Note that the data isn't shuffled before extracting the validation split, so the validation is literally just the *last* x% of samples in the input you passed.
The same validation set is used for all epochs (within a same call to `fit`).
---
@@ -194,6 +272,40 @@ print(hist.history)
---
### How can I "freeze" Keras layers?
To "freeze" a layer means to exclude it from training, i.e. its weights will never be updated. This is useful in the context of fine-tuning a model, or using fixed embeddings for a text input.
You can pass a `trainable` argument (boolean) to a layer constructor to set a layer to be non-trainable:
```python
frozen_layer = Dense(32, trainable=False)
```
Additionally, you can set the `trainable` property of a layer to `True` or `False` after instantiation. For this to take effect, you will need to call `compile()` on your model after modifying the `trainable` property. Here's an example:
```python
x = Input(shape=(32,))
layer = Dense(32)
layer.trainable = False
y = layer(x)
frozen_model = Model(x, y)
# in the model below, the weights of `layer` will not be updated during training
frozen_model.compile(optimizer='rmsprop', loss='mse')
layer.trainable = True
trainable_model = Model(x, y)
# with this model the weights of the layer will be updated during training
# (which will also affect the above model since it uses the same layer instance)
trainable_model.compile(optimizer='rmsprop', loss='mse')
frozen_model.fit(data, labels) # this does NOT update the weights of `layer`
trainable_model.fit(data, labels) # this updates the weights of `layer`
```
---
### How can I use stateful RNNs?
Making a RNN stateful means that the states for the samples of each batch will be reused as initial states for the samples in the next batch.
@@ -205,7 +317,7 @@ When using stateful RNNs, it is therefore assumed that:
To use statefulness in RNNs, you need to:
- explicitly specify the batch size you are using, by passing a `batch_input_shape` argument to the first layer in your model. It should be a tuple of integers, e.g. `(32, 10, 16)` for a 32-samples batch of sequences of 10 timesteps with 16 features per timestep.
- explicitly specify the batch size you are using, by passing a `batch_size` argument to the first layer in your model. E.g. `batch_size=32` for a 32-samples batch of sequences of 10 timesteps with 16 features per timestep.
- set `stateful=True` in your RNN layer(s).
To reset the states accumulated:
@@ -221,7 +333,7 @@ X # this is our input data, of shape (32, 21, 16)
# we will feed it to our model in sequences of length 10
model = Sequential()
model.add(LSTM(32, batch_input_shape=(32, 10, 16), stateful=True))
model.add(LSTM(32, input_shape=(10, 16), batch_size=32, stateful=True))
model.add(Dense(16, activation='softmax'))
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
@@ -241,3 +353,53 @@ model.layers[0].reset_states()
Notes that the methods `predict`, `fit`, `train_on_batch`, `predict_classes`, etc. will *all* update the states of the stateful layers in a model. This allows you to do not only stateful training, but also stateful prediction.
---
### How can I remove a layer from a Sequential model?
You can remove the last added layer in a Sequential model by calling `.pop()`:
```python
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=784))
model.add(Dense(32, activation='relu'))
print(len(model.layers)) # "2"
model.pop()
print(len(model.layers)) # "1"
```
---
### How can I use pre-trained models in Keras?
Code and pre-trained weights are available for the following image classification models:
- Xception
- VGG16
- VGG19
- ResNet50
- Inception v3
They can be imported from the module `keras.applications`:
```python
from keras.applications.xception import Xception
from keras.applications.vgg16 import VGG16
from keras.applications.vgg19 import VGG19
from keras.applications.resnet50 import ResNet50
from keras.applications.inception_v3 import InceptionV3
model = VGG16(weights='imagenet', include_top=True)
```
For a few simple usage examples, see [the documentation for the Applications module](/applications).
For a detailed example of how to use such a pre-trained model for feature extraction or for fine-tuning, see [this blog post](http://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html).
The VGG16 model is also the basis for several Keras example scripts:
- [Style transfer](https://github.com/fchollet/keras/blob/master/examples/neural_style_transfer.py)
- [Feature visualization](https://github.com/fchollet/keras/blob/master/examples/conv_filter_visualization.py)
- [Deep dream](https://github.com/fchollet/keras/blob/master/examples/deep_dream.py)
+86 -85
Ver Arquivo
@@ -8,7 +8,7 @@ Let's start with something simple.
-----
## First example: fully connected network
## First example: a densely-connected network
The `Sequential` model is probably a better choice to implement such a network, but it helps to start with something really simple.
@@ -20,7 +20,7 @@ The `Sequential` model is probably a better choice to implement such a network,
from keras.layers import Input, Dense
from keras.models import Model
# this returns a tensor
# This returns a tensor
inputs = Input(shape=(784,))
# a layer instance is callable on a tensor, and returns a tensor
@@ -28,9 +28,9 @@ x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)
# this creates a model that includes
# This creates a model that includes
# the Input layer and three Dense layers
model = Model(input=inputs, output=predictions)
model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['accuracy'])
@@ -45,7 +45,7 @@ With the functional API, it is easy to re-use trained models: you can treat any
```python
x = Input(shape=(784,))
# this works, and returns the 10-way softmax we defined above.
# This works, and returns the 10-way softmax we defined above.
y = model(x)
```
@@ -54,11 +54,11 @@ This can allow, for instance, to quickly create models that can process *sequenc
```python
from keras.layers import TimeDistributed
# input tensor for sequences of 20 timesteps,
# Input tensor for sequences of 20 timesteps,
# each containing a 784-dimensional vector
input_sequences = Input(shape=(20, 784))
# this applies our previous model to every timestep in the input sequences.
# This applies our previous model to every timestep in the input sequences.
# the output of the previous model was a 10-way softmax,
# so the output of the layer below will be a sequence of 20 vectors of size 10.
processed_sequences = TimeDistributed(model)(input_sequences)
@@ -75,7 +75,7 @@ The model will also be supervised via two loss functions. Using the main loss fu
Here's what our model looks like:
<img src="http://s3.amazonaws.com/keras.io/img/multi-input-multi-output-graph.png" alt="multi-input-multi-output-graph" style="width: 400px;"/>
<img src="https://s3.amazonaws.com/keras.io/img/multi-input-multi-output-graph.png" alt="multi-input-multi-output-graph" style="width: 400px;"/>
Let's implement it with the functional API.
@@ -83,18 +83,18 @@ The main input will receive the headline, as a sequence of integers (each intege
The integers will be between 1 and 10,000 (a vocabulary of 10,000 words) and the sequences will be 100 words long.
```python
from keras.layers import Input, Embedding, LSTM, Dense, merge
from keras.layers import Input, Embedding, LSTM, Dense
from keras.models import Model
# headline input: meant to receive sequences of 100 integers, between 1 and 10000.
# note that we can name any layer by passing it a "name" argument.
# Headline input: meant to receive sequences of 100 integers, between 1 and 10000.
# Note that we can name any layer by passing it a "name" argument.
main_input = Input(shape=(100,), dtype='int32', name='main_input')
# this embedding layer will encode the input sequence
# This embedding layer will encode the input sequence
# into a sequence of dense 512-dimensional vectors.
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)
# a LSTM will transform the vector sequence into a single vector,
# A LSTM will transform the vector sequence into a single vector,
# containing information about the entire sequence
lstm_out = LSTM(32)(x)
```
@@ -102,44 +102,44 @@ lstm_out = LSTM(32)(x)
Here we insert the auxiliary loss, allowing the LSTM and Embedding layer to be trained smoothly even though the main loss will be much higher in the model.
```python
auxiliary_loss = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)
auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)
```
At this point, we feed into the model our auxiliary input data by concatenating it with the LSTM output:
```python
auxiliary_input = Input(shape=(5,), name='aux_input')
x = merge([lstm_out, auxiliary_input], mode='concat')
x = keras.layers.concatenate([lstm_out, auxiliary_input])
# we stack a deep fully-connected network on top
# We stack a deep densely-connected network on top
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
# and finally we add the main logistic regression layer
main_loss = Dense(1, activation='sigmoid', name='main_output')(x)
# And finally we add the main logistic regression layer
main_output = Dense(1, activation='sigmoid', name='main_output')(x)
```
This defines a model with two inputs and two outputs:
```python
model = Model(input=[main_input, auxiliary_input], output=[main_loss, auxiliary_loss])
model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])
```
We compile the model and assign a weight of 0.2 to the auxiliary loss.
To specify different `loss_weight` or `loss` for each different output, you can use a list or a dictionary.
To specify different `loss_weights` or `loss` for each different output, you can use a list or a dictionary.
Here we pass a single loss as the `loss` argument, so the same loss will be used on all outputs.
```python
model.compile(optimizer='rmsprop', loss='binary_crossentropy',
loss_weight=[1., 0.2])
loss_weights=[1., 0.2])
```
We can train the model by passing it lists of input arrays and target arrays:
```python
model.fit([headline_data, additional_data], [labels, labels],
nb_epoch=50, batch_size=32)
epochs=50, batch_size=32)
```
Since our inputs and outputs are named (we passed them a "name" argument),
@@ -148,12 +148,12 @@ We could also have compiled the model via:
```python
model.compile(optimizer='rmsprop',
loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
loss_weight={'main_output': 1., 'aux_output': 0.2})
loss_weights={'main_output': 1., 'aux_output': 0.2})
# and trained it via:
# And trained it via:
model.fit({'main_input': headline_data, 'aux_input': additional_data},
{'main_output': labels, 'aux_output': labels},
nb_epoch=50, batch_size=32)
epochs=50, batch_size=32)
```
-----
@@ -166,12 +166,13 @@ Let's consider a dataset of tweets. We want to build a model that can tell wheth
One way to achieve this is to build a model that encodes two tweets into two vectors, concatenates the vectors and adds a logistic regression of top, outputting a probability that the two tweets share the same author. The model would then be trained on positive tweet pairs and negative tweet pairs.
Because the problem is symetric, the mechanism that encodes the first tweet should be reused (weights and all) to encode the second tweet. Here we use a shared LSTM layer to encode the tweets.
Because the problem is symmetric, the mechanism that encodes the first tweet should be reused (weights and all) to encode the second tweet. Here we use a shared LSTM layer to encode the tweets.
Let's build this with the functional API. We will take as input for a tweet a binary matrix of shape `(140, 256)`, i.e. a sequence of 140 vectors of size 256, where each dimension in the 256-dimensional vector encodes the presence/absence of a character (out of an alphabet of 256 frequent characters).
```python
from keras.layers import Input, LSTM, Dense, merge
import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model
tweet_a = Input(shape=(140, 256))
@@ -181,31 +182,31 @@ tweet_b = Input(shape=(140, 256))
To share a layer across different inputs, simply instantiate the layer once, then call it on as many inputs as you want:
```python
# this layer can take as input a matrix
# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)
# when we reuse the same layer instance
# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)
# we can then concatenate the two vectors:
merged_vector = merge([encoded_a, encoded_b], mode='concat', concat_axis=-1)
# We can then concatenate the two vectors:
merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)
# and add a logistic regression on top
predictions = Dense(1, activation='sigmoid')(merged_vector)
# And add a logistic regression on top
predictions = Dense(1, activation='sigmoid')(merged_vector)
# we define a trainable model linking the
# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(input=[tweet_a, tweet_b], output=predictions)
model = Model(inputs=[tweet_a, tweet_b], outputs=predictions)
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
model.fit([data_a, data_b], labels, nb_epoch=10)
model.fit([data_a, data_b], labels, epochs=10)
```
Let's pause to take a look at how to read the shared layer's output or output shape.
@@ -255,16 +256,16 @@ assert lstm.get_output_at(1) == encoded_b
Simple enough, right?
The same is true for the properties `input_shape` and `output_shape`: as long as the layer has only one node, or as long as all nodes have the same input/output shape, then the notion of "layer output/input shape" is well defined, and that one shape will be returned by `layer.output_shape`/`layer.input_shape`. But if, for instance, you apply a same `Convolution2D` layer to an input of shape `(3, 32, 32)`, and then to an input of shape `(3, 64, 64)`, the layer will have multiple input/output shapes, and you will have to fetch them by specifying the index of the node they belong to:
The same is true for the properties `input_shape` and `output_shape`: as long as the layer has only one node, or as long as all nodes have the same input/output shape, then the notion of "layer output/input shape" is well defined, and that one shape will be returned by `layer.output_shape`/`layer.input_shape`. But if, for instance, you apply a same `Conv2D` layer to an input of shape `(3, 32, 32)`, and then to an input of shape `(3, 64, 64)`, the layer will have multiple input/output shapes, and you will have to fetch them by specifying the index of the node they belong to:
```python
a = Input(shape=(3, 32, 32))
b = Input(shape=(3, 64, 64))
conv = Convolution2D(16, 3, 3, border_mode='same')
conv = Conv2D(16, (3, 3), padding='same')
conved_a = conv(a)
# only one input so far, the following will work:
# Only one input so far, the following will work:
assert conv.input_shape == (None, 3, 32, 32)
conved_b = conv(b)
@@ -284,20 +285,20 @@ Code examples are still the best way to get started, so here are a few more.
For more information about the Inception architecture, see [Going Deeper with Convolutions](http://arxiv.org/abs/1409.4842).
```python
from keras.layers import merge, Convolution2D, MaxPooling2D, Input
from keras.layers import Conv2D, MaxPooling2D, Input
input_img = Input(shape=(3, 256, 256))
tower_1 = Convolution2D(64, 1, 1, border_mode='same', activation='relu')(input_img)
tower_1 = Convolution2D(64, 3, 3, border_mode='same', activation='relu')(tower_1)
tower_1 = Conv2D(64, (1, 1), padding='same', activation='relu')(input_img)
tower_1 = Conv2D(64, (3, 3), padding='same', activation='relu')(tower_1)
tower_2 = Convolution2D(64, 1, 1, border_mode='same', activation='relu')(input_img)
tower_2 = Convolution2D(64, 5, 5, border_mode='same', activation='relu')(tower_2)
tower_2 = Conv2D(64, (1, 1), padding='same', activation='relu')(input_img)
tower_2 = Conv2D(64, (5, 5), padding='same', activation='relu')(tower_2)
tower_3 = MaxPooling2D((3, 3), strides=(1, 1), border_mode='same')(input_img)
tower_3 = Convolution2D(64, 1, 1, border_mode='same', activation='relu')(tower_3)
tower_3 = MaxPooling2D((3, 3), strides=(1, 1), padding='same')(input_img)
tower_3 = Conv2D(64, (1, 1), padding='same', activation='relu')(tower_3)
output = merge([tower_1, tower_2, tower_3], mode='concat', concat_axis=1)
output = keras.layers.concatenate([tower_1, tower_2, tower_3], axis=1)
```
### Residual connection on a convolution layer
@@ -305,14 +306,14 @@ output = merge([tower_1, tower_2, tower_3], mode='concat', concat_axis=1)
For more information about residual networks, see [Deep Residual Learning for Image Recognition](http://arxiv.org/abs/1512.03385).
```python
from keras.layers import merge, Convolution2D, Input
from keras.layers import Conv2D, Input
# input tensor for a 3-channel 256x256 image
x = Input(shape=(3, 256, 256))
# 3x3 conv with 16 output channels
y = Convolution2D(16, 3, 3, border_mode='same')
# 3x3 conv with 3 output channels (same as input channels)
y = Conv2D(3, (3, 3), padding='same')(x)
# this returns x + y.
z = merge([x, y], mode='sum')
z = keras.layers.add([x, y])
```
### Shared vision model
@@ -320,27 +321,27 @@ z = merge([x, y], mode='sum')
This model re-uses the same image-processing module on two inputs, to classify whether two MNIST digits are the same digit or different digits.
```python
from keras.layers import merge, Convolution2D, MaxPooling2D, Input, Dense, Flatten
from keras.layers import Conv2D, MaxPooling2D, Input, Dense, Flatten
from keras.models import Model
# first, define the vision modules
# First, define the vision modules
digit_input = Input(shape=(1, 27, 27))
x = Convolution2D(64, 3, 3)(digit_input)
x = Convolution2D(64, 3, 3)(x)
x = Conv2D(64, (3, 3))(digit_input)
x = Conv2D(64, (3, 3))(x)
x = MaxPooling2D((2, 2))(x)
out = Flatten()(x)
vision_model = Model(digit_input, out)
# then define the tell-digits-apart model
# Then define the tell-digits-apart model
digit_a = Input(shape=(1, 27, 27))
digit_b = Input(shape=(1, 27, 27))
# the vision model will be shared, weights and all
# The vision model will be shared, weights and all
out_a = vision_model(digit_a)
out_b = vision_model(digit_b)
concatenated = merge([out_a, out_b], mode='concat')
concatenated = keras.layers.concatenate([out_a, out_b])
out = Dense(1, activation='sigmoid')(concatenated)
classification_model = Model([digit_a, digit_b], out)
@@ -353,46 +354,46 @@ This model can select the correct one-word answer when asked a natural-language
It works by encoding the question into a vector, encoding the image into a vector, concatenating the two, and training on top a logistic regression over some vocabulary of potential answers.
```python
from keras.layers import Convolution2D, MaxPooling2D, Flatten
from keras.layers import Input, LSTM, Embedding, Dense, merge
from keras.layers import Conv2D, MaxPooling2D, Flatten
from keras.layers import Input, LSTM, Embedding, Dense
from keras.models import Model, Sequential
# first, let's define a vision model using a Sequential model.
# this model will encode an image into a vector.
# First, let's define a vision model using a Sequential model.
# This model will encode an image into a vector.
vision_model = Sequential()
vision_model.add(Convolution2D(64, 3, 3, activation='relu', border_mode='same', input_shape=(3, 224, 224)))
vision_model.add(Convolution2D(64, 3, 3, activation='relu'))
vision_model.add(Conv2D(64, (3, 3) activation='relu', padding='same', input_shape=(3, 224, 224)))
vision_model.add(Conv2D(64, (3, 3), activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Convolution2D(128, 3, 3, activation='relu', border_mode='same'))
vision_model.add(Convolution2D(128, 3, 3, activation='relu'))
vision_model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
vision_model.add(Conv2D(128, (3, 3), activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Convolution2D(256, 3, 3, activation='relu', border_mode='same'))
vision_model.add(Convolution2D(256, 3, 3, activation='relu'))
vision_model.add(Convolution2D(256, 3, 3, activation='relu'))
vision_model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
vision_model.add(Conv2D(256, (3, 3), activation='relu'))
vision_model.add(Conv2D(256, (3, 3), activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Flatten())
# now let's get a tensor with the output of our vision model:
# Now let's get a tensor with the output of our vision model:
image_input = Input(shape=(3, 224, 224))
encoded_image = vision_model(image_input)
# next, let's define a language model to encode the question into a vector.
# each question will be at most 100 word long,
# Next, let's define a language model to encode the question into a vector.
# Each question will be at most 100 word long,
# and we will index words as integers from 1 to 9999.
question_input = Input(shape=(100,), dtype='int32')
embedded_question = Embedding(input_dim=10000, output_dim=256, input_length=100)(question_input)
encoded_question = LSTM(256)(embedded_question)
# let's concatenate the question vector and the image vector:
merged = merge([encoded_question, encoded_image], mode='concat')
# Let's concatenate the question vector and the image vector:
merged = keras.layers.concatenate([encoded_question, encoded_image])
# and let's train a logistic regression over 1000 words on top:
# And let's train a logistic regression over 1000 words on top:
output = Dense(1000, activation='softmax')(merged)
# this is our final model:
vqa_model = Model(input=[image_input, question_input], output=output)
# This is our final model:
vqa_model = Model(inputs=[image_input, question_input], outputs=output)
# the next stage would be training this model on actual data.
# The next stage would be training this model on actual data.
```
### Video question answering model
@@ -403,19 +404,19 @@ Now that we have trained our image QA model, we can quickly turn it into a video
from keras.layers import TimeDistributed
video_input = Input(shape=(100, 3, 224, 224))
# this is our video encoded via the previously trained vision_model (weights are reused)
# This is our video encoded via the previously trained vision_model (weights are reused)
encoded_frame_sequence = TimeDistributed(vision_model)(video_input) # the output will be a sequence of vectors
encoded_video = LSTM(256)(encoded_frame_sequence) # the output will be a vector
# this is a model-level representation of the question encoder, reusing the same weights as before:
question_encoder = Model(input=question_input, output=encoded_question)
# This is a model-level representation of the question encoder, reusing the same weights as before:
question_encoder = Model(inputs=question_input, outputs=encoded_question)
# let's use it to encode the question:
# Let's use it to encode the question:
video_question_input = Input(shape=(100,), dtype='int32')
encoded_video_question = question_encoder(video_question_input)
# and this is our video question answering model:
merged = merge([encoded_video, encoded_video_question], mode='concat')
# And this is our video question answering model:
merged = keras.layers.concatenate([encoded_video, encoded_video_question])
output = Dense(1000, activation='softmax')(merged)
video_qa_model = Model(input=[video_input, video_question_input], output=output)
video_qa_model = Model(inputs=[video_input, video_question_input], outputs=output)
```
+98 -276
Ver Arquivo
@@ -6,6 +6,7 @@ You can create a `Sequential` model by passing a list of layer instances to the
```python
from keras.models import Sequential
from keras.layers import Dense, Activation
model = Sequential([
Dense(32, input_dim=784),
@@ -29,167 +30,101 @@ model.add(Activation('relu'))
The model needs to know what input shape it should expect. For this reason, the first layer in a `Sequential` model (and only the first, because following layers can do automatic shape inference) needs to receive information about its input shape. There are several possible ways to do this:
- pass an `input_shape` argument to the first layer. This is a shape tuple (a tuple of integers or `None` entries, where `None` indicates that any positive integer may be expected). In `input_shape`, the batch dimension is not included.
- pass instead a `batch_input_shape` argument, where the batch dimension is included. This is useful for specifying a fixed batch size (e.g. with stateful RNNs).
- some 2D layers, such as `Dense`, support the specification of their input shape via the argument `input_dim`, and some 3D temporal layers support the arguments `input_dim` and `input_length`.
- Pass an `input_shape` argument to the first layer. This is a shape tuple (a tuple of integers or `None` entries, where `None` indicates that any positive integer may be expected). In `input_shape`, the batch dimension is not included.
- Some 2D layers, such as `Dense`, support the specification of their input shape via the argument `input_dim`, and some 3D temporal layers support the arguments `input_dim` and `input_length`.
- If you ever need to specify a fixed batch size for your inputs (this is useful for stateful recurrent networks), you can pass a `batch_size` argument to a layer. If you pass both `batch_size=32` and `input_shape=(6, 8)` to a layer, it will then expect every batch of inputs to have the batch shape `(32, 6, 8)`.
As such, the following three snippets are strictly equivalent:
As such, the following snippets are strictly equivalent:
```python
model = Sequential()
model.add(Dense(32, input_shape=(784,)))
```
```python
model = Sequential()
model.add(Dense(32, batch_input_shape=(None, 784)))
# note that batch dimension is "None" here,
# so the model will be able to process batches of any size.
```
```python
model = Sequential()
model.add(Dense(32, input_dim=784))
```
And so are the following three snippets:
```python
model = Sequential()
model.add(LSTM(32, input_shape=(10, 64)))
```
```python
model = Sequential()
model.add(LSTM(32, batch_input_shape=(None, 10, 64)))
```
```python
model = Sequential()
model.add(LSTM(32, input_length=10, input_dim=64))
```
----
## The Merge layer
Multiple `Sequential` instances can be merged into a single output via a `Merge` layer. The output is a layer that can be added as first layer in a new `Sequential` model. For instance, here's a model with two separate input branches getting merged:
```python
from keras.layers import Merge
left_branch = Sequential()
left_branch.add(Dense(32, input_dim=784))
right_branch = Sequential()
right_branch.add(Dense(32, input_dim=784))
merged = Merge([left_branch, right_branch], mode='concat')
final_model = Sequential()
final_model.add(merged)
final_model.add(Dense(10, activation='softmax'))
```
<img src="http://s3.amazonaws.com/keras.io/img/two_branches_sequential_model.png" alt="two branch Sequential" style="width: 400px;"/>
The `Merge` layer supports a number of pre-defined modes:
- `sum` (default): element-wise sum
- `concat`: tensor concatenation. You can specify the concatenation axis via the argument `concat_axis`.
- `mul`: element-wise multiplication
- `ave`: tensor average
- `dot`: dot product. You can specify which axes to reduce along via the argument `dot_axes`.
- `cos`: cosine proximity between vectors in 2D tensors.
You can also pass a function as the `mode` argument, allowing for arbitrary transformations:
```python
merged = Merge([left_branch, right_branch], mode=lambda x, y: x - y)
```
Now you know enough to be able to define *almost* any model with Keras. For complex models that cannot be expressed via `Sequential` and `Merge`, you can use [the functional API](/getting-started/functional-api-guide).
----
## Compilation
Before training a model, you need to configure the learning process, which is done via the `compile` method. It receives three arguments:
- an optimizer. This could be the string identifier of an existing optimizer (such as `rmsprop` or `adagrad`), or an instance of the `Optimizer` class. See: [optimizers](/optimizers).
- a loss function. This is the objective that the model will try to minimize. If can be the string identifier of an existing loss function (such as `categorical_crossentropy` or `mse`), or it can be an objective function. See: [objectives](/objectives).
- a list of metrics. For any classification problem you will want to set this to `metrics=['accuracy']`. A metric could be the string identifier of an existing metric (only `accuracy` is supported at this point), or a custom metric function.
- An optimizer. This could be the string identifier of an existing optimizer (such as `rmsprop` or `adagrad`), or an instance of the `Optimizer` class. See: [optimizers](/optimizers).
- A loss function. This is the objective that the model will try to minimize. It can be the string identifier of an existing loss function (such as `categorical_crossentropy` or `mse`), or it can be an objective function. See: [losses](/losses).
- A list of metrics. For any classification problem you will want to set this to `metrics=['accuracy']`. A metric could be the string identifier of an existing metric or a custom metric function.
```python
# for a multi-class classification problem
# For a multi-class classification problem
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['accuracy'])
# for a binary classification problem
# For a binary classification problem
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
# for a mean squared error regression problem
# For a mean squared error regression problem
model.compile(optimizer='rmsprop',
loss='mse')
# For custom metrics
import keras.backend as K
def mean_pred(y_true, y_pred):
return K.mean(y_pred)
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy', mean_pred])
```
----
## Training
Keras models are trained on Numpy arrays of input data and labels. For training a model, you will typically use the `fit` function. [Read its documentation here](/models/sequential).
Keras models are trained on Numpy arrays of input data and labels. For training a model, you will typically use the `fit` function. [Read its documentation here](/models/sequential).
```python
# for a single-input model with 2 classes (binary):
# For a single-input model with 2 classes (binary classification):
model = Sequential()
model.add(Dense(1, input_dim=784, activation='softmax'))
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
# generate dummy data
# Generate dummy data
import numpy as np
data = np.random.random((1000, 784))
labels = np.random.randint(2, size=(1000, 1))
# train the model, iterating on the data in batches
# of 32 samples
model.fit(data, labels, nb_epoch=10, batch_size=32)
# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)
```
```python
# for a multi-input model with 10 classes:
left_branch = Sequential()
left_branch.add(Dense(32, input_dim=784))
right_branch = Sequential()
right_branch.add(Dense(32, input_dim=784))
merged = Merge([left_branch, right_branch], mode='concat')
# For a single-input model with 10 classes (categorical classification):
model = Sequential()
model.add(merged)
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['accuracy'])
# generate dummy data
# Generate dummy data
import numpy as np
from keras.utils.np_utils import to_categorical
data_1 = np.random.random((1000, 784))
data_2 = np.random.random((1000, 784))
data = np.random.random((1000, 784))
labels = np.random.randint(10, size=(1000, 10))
# these are integers between 0 and 9
labels = np.random.randint(10, size=(1000, 1))
# we convert the labels to a binary matrix of size (1000, 10)
# for use with categorical_crossentropy
labels = to_categorical(labels, 10)
# Convert labels to categorical one-hot encoding
binary_labels = keras.utils.to_categorical(labels, num_classes=10)
# train the model
# note that we are passing a list of Numpy arrays as training data
# since the model has 2 inputs
model.fit([data_1, data_2], labels, nb_epoch=10, batch_size=32)
# Train the model, iterating on the data in batches of 32 samples
model.fit(data, binary_labels, epochs=10, batch_size=32)
```
----
@@ -199,7 +134,7 @@ model.fit([data_1, data_2], labels, nb_epoch=10, batch_size=32)
Here are a few examples to get you started!
In the examples folder, you will also find example models for real datasets:
In the [examples folder](https://github.com/fchollet/keras/tree/master/examples), you will also find example models for real datasets:
- CIFAR10 small images classification: Convolutional Neural Network (CNN) with realtime data augmentation
- IMDB movie review sentiment classification: LSTM over sequences of words
@@ -221,47 +156,29 @@ model = Sequential()
# Dense(64) is a fully-connected layer with 64 hidden units.
# in the first layer, you must specify the expected input data shape:
# here, 20-dimensional vectors.
model.add(Dense(64, input_dim=20, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(10, init='uniform'))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
optimizer=sgd,
metrics=['accuracy'])
model.fit(X_train, y_train,
nb_epoch=20,
batch_size=16)
score = model.evaluate(X_test, y_test, batch_size=16)
```
### Alternative implementation of a similar MLP:
```python
model = Sequential()
model.add(Dense(64, input_dim=20, activation='relu'))
model.add(Dense(64, activation='relu', input_dim=20))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
optimizer='adadelta',
optimizer=sgd,
metrics=['accuracy'])
model.fit(x_train, y_train,
epochs=20,
batch_size=128)
score = model.evaluate(x_test, y_test, batch_size=128)
```
### MLP for binary classification:
```python
model = Sequential()
model.add(Dense(64, input_dim=20, init='uniform', activation='relu'))
model.add(Dense(64, input_dim=20, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
@@ -277,40 +194,32 @@ model.compile(loss='binary_crossentropy',
```python
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import SGD
model = Sequential()
# input: 100x100 images with 3 channels -> (3, 100, 100) tensors.
# this applies 32 convolution filters of size 3x3 each.
model.add(Convolution2D(32, 3, 3, border_mode='valid', input_shape=(3, 100, 100)))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(3, 100, 100)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Convolution2D(64, 3, 3, border_mode='valid'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
# Note: Keras does automatic shape inference.
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
model.add(Dense(10))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
model.fit(X_train, Y_train, batch_size=32, nb_epoch=1)
model.fit(x_train, y_train, batch_size=32, epochs=10)
```
@@ -318,89 +227,50 @@ model.fit(X_train, Y_train, batch_size=32, nb_epoch=1)
```python
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Dense, Dropout
from keras.layers import Embedding
from keras.layers import LSTM
model = Sequential()
model.add(Embedding(max_features, 256, input_length=maxlen))
model.add(LSTM(output_dim=128, activation='sigmoid', inner_activation='hard_sigmoid'))
model.add(Embedding(max_features, output_dim=256))
model.add(LSTM(128))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=16, nb_epoch=10)
score = model.evaluate(X_test, Y_test, batch_size=16)
model.fit(x_train, y_train, batch_size=16, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=16)
```
### Architecture for learning image captions with a convnet and a Gated Recurrent Unit:
(word-level embedding, caption of maximum length 16 words).
Note that getting this to work well will require using a bigger convnet, initialized with pre-trained weights.
### Sequence classification with 1D convolutions:
```python
max_caption_len = 16
vocab_size = 10000
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Embedding
from keras.layers import Conv1D, GlobalAveragePooling1D
# first, let's define an image model that
# will encode pictures into 128-dimensional vectors.
# it should be initialized with pre-trained weights.
image_model = Sequential()
image_model.add(Convolution2D(32, 3, 3, border_mode='valid', input_shape=(3, 100, 100)))
image_model.add(Activation('relu'))
image_model.add(Convolution2D(32, 3, 3))
image_model.add(Activation('relu'))
image_model.add(MaxPooling2D(pool_size=(2, 2)))
image_model.add(Convolution2D(64, 3, 3, border_mode='valid'))
image_model.add(Activation('relu'))
image_model.add(Convolution2D(64, 3, 3))
image_model.add(Activation('relu'))
image_model.add(MaxPooling2D(pool_size=(2, 2)))
image_model.add(Flatten())
image_model.add(Dense(128))
# let's load the weights from a save file.
image_model.load_weights('weight_file.h5')
# next, let's define a RNN model that encodes sequences of words
# into sequences of 128-dimensional word vectors.
language_model = Sequential()
language_model.add(Embedding(vocab_size, 256, input_length=max_caption_len))
language_model.add(GRU(output_dim=128, return_sequences=True))
language_model.add(TimeDistributedDense(128))
# let's repeat the image vector to turn it into a sequence.
image_model.add(RepeatVector(max_caption_len))
# the output of both models will be tensors of shape (samples, max_caption_len, 128).
# let's concatenate these 2 vector sequences.
model = Sequential()
model.add(Merge([image_model, language_model], mode='concat', concat_axis=-1))
# let's encode this vector sequence into a single vector
model.add(GRU(256, return_sequences=False))
# which will be used to compute a probability
# distribution over what the next word in the caption should be!
model.add(Dense(vocab_size))
model.add(Activation('softmax'))
model.add(Conv1D(64, 3, activation='relu', input_shape=(seq_length, 100)))
model.add(Conv1D(64, 3, activation='relu'))
model.add(MaxPooling1D((3, 3)))
model.add(Conv1D(128, 3, activation='relu'))
model.add(Conv1D(128, 3, activation='relu'))
model.add(GlobalAveragePooling1D())
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# "images" is a numpy float array of shape (nb_samples, nb_channels=3, width, height).
# "captions" is a numpy integer array of shape (nb_samples, max_caption_len)
# containing word index sequences representing partial captions.
# "next_words" is a numpy float array of shape (nb_samples, vocab_size)
# containing a categorical encoding (0s and 1s) of the next word in the corresponding
# partial caption.
model.fit([images, partial_captions], next_words, batch_size=16, nb_epoch=100)
model.fit(x_train, y_train, batch_size=16, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=16)
```
### Stacked LSTM for sequence classification
In this model, we stack 3 LSTM layers on top of each other,
@@ -410,7 +280,7 @@ The first two LSTMs return their full output sequences, but the last one only re
the last step in its output sequence, thus dropping the temporal dimension
(i.e. converting the input sequence into a single vector).
<img src="http://keras.io/img/regular_stacked_lstm.png" alt="stacked LSTM" style="width: 300px;"/>
<img src="https://keras.io/img/regular_stacked_lstm.png" alt="stacked LSTM" style="width: 300px;"/>
```python
from keras.models import Sequential
@@ -419,7 +289,7 @@ import numpy as np
data_dim = 16
timesteps = 8
nb_classes = 10
num_classes = 10
# expected input data shape: (batch_size, timesteps, data_dim)
model = Sequential()
@@ -433,16 +303,16 @@ model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# generate dummy training data
# Generate dummy training data
x_train = np.random.random((1000, timesteps, data_dim))
y_train = np.random.random((1000, nb_classes))
y_train = np.random.random((1000, num_classes))
# generate dummy validation data
# Generate dummy validation data
x_val = np.random.random((100, timesteps, data_dim))
y_val = np.random.random((100, nb_classes))
y_val = np.random.random((100, num_classes))
model.fit(x_train, y_train,
batch_size=64, nb_epoch=5,
batch_size=64, epochs=5,
validation_data=(x_val, y_val))
```
@@ -462,11 +332,11 @@ import numpy as np
data_dim = 16
timesteps = 8
nb_classes = 10
num_classes = 10
batch_size = 32
# expected input batch shape: (batch_size, timesteps, data_dim)
# note that we have to provide the full batch_input_shape since the network is stateful.
# Expected input batch shape: (batch_size, timesteps, data_dim)
# Note that we have to provide the full batch_input_shape since the network is stateful.
# the sample of index i in batch k is the follow-up for the sample i in batch k-1.
model = Sequential()
model.add(LSTM(32, return_sequences=True, stateful=True,
@@ -479,63 +349,15 @@ model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# generate dummy training data
# Generate dummy training data
x_train = np.random.random((batch_size * 10, timesteps, data_dim))
y_train = np.random.random((batch_size * 10, nb_classes))
y_train = np.random.random((batch_size * 10, num_classes))
# generate dummy validation data
# Generate dummy validation data
x_val = np.random.random((batch_size * 3, timesteps, data_dim))
y_val = np.random.random((batch_size * 3, nb_classes))
y_val = np.random.random((batch_size * 3, num_classes))
model.fit(x_train, y_train,
batch_size=batch_size, nb_epoch=5,
batch_size=batch_size, epochs=5,
validation_data=(x_val, y_val))
```
### Two merged LSTM encoders for classification over two parallel sequences
In this model, two input sequences are encoded into vectors by two separate LSTM modules.
These two vectors are then concatenated, and a fully connected network is trained on top of the concatenated representations.
<img src="http://keras.io/img/dual_lstm.png" alt="Dual LSTM" style="width: 600px;"/>
```python
from keras.models import Sequential
from keras.layers import Merge, LSTM, Dense
import numpy as np
data_dim = 16
timesteps = 8
nb_classes = 10
encoder_a = Sequential()
encoder_a.add(LSTM(32, input_shape=(timesteps, data_dim)))
encoder_b = Sequential()
encoder_b.add(LSTM(32, input_shape=(timesteps, data_dim)))
decoder = Sequential()
decoder.add(Merge([encoder_a, encoder_b], mode='concat'))
decoder.add(Dense(32, activation='relu'))
decoder.add(Dense(nb_classes, activation='softmax'))
decoder.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# generate dummy training data
x_train_a = np.random.random((1000, timesteps, data_dim))
x_train_b = np.random.random((1000, timesteps, data_dim))
y_train = np.random.random((1000, nb_classes))
# generate dummy validation data
x_val_a = np.random.random((100, timesteps, data_dim))
x_val_b = np.random.random((100, timesteps, data_dim))
y_val = np.random.random((100, nb_classes))
decoder.fit([x_train_a, x_train_b], y_train,
batch_size=64, nb_epoch=5,
validation_data=([x_val_a, x_val_b], y_val))
```
+1 -159
Ver Arquivo
@@ -1,161 +1,3 @@
# Keras: Deep Learning library for Theano and TensorFlow
## You have just found Keras.
Keras is a minimalist, highly modular neural networks library, written in Python and capable of running on top of either [TensorFlow](https://github.com/tensorflow/tensorflow) or [Theano](https://github.com/Theano/Theano). It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.
Use Keras if you need a deep learning library that:
- allows for easy and fast prototyping (through total modularity, minimalism, and extensibility).
- supports both convolutional networks and recurrent networks, as well as combinations of the two.
- supports arbitrary connectivity schemes (including multi-input and multi-output training).
- runs seamlessly on CPU and GPU.
Read the documentation at [Keras.io](http://keras.io).
Keras is compatible with: __Python 2.7-3.5__.
------------------
## Guiding principles
- __Modularity.__ A model is understood as a sequence or a graph of standalone, fully-configurable modules that can be plugged together with as little restrictions as possible. In particular, neural layers, cost functions, optimizers, initialization schemes, activation functions, regularization schemes are all standalone modules that you can combine to create new models.
- __Minimalism.__ Each module should be kept short and simple. Every piece of code should be transparent upon first reading. No black magic: it hurts iteration speed and ability to innovate.
- __Easy extensibility.__ New modules are dead simple to add (as new classes and functions), and existing modules provide ample examples. To be able to easily create new modules allows for total expressiveness, making Keras suitable for advanced research.
- __Work with Python__. No separate models configuration files in a declarative format. Models are described in Python code, which is compact, easier to debug, and allows for ease of extensibility.
------------------
## Getting started: 30 seconds to Keras
The core data structure of Keras is a __model__, a way to organize layers. The main type of model is the [`Sequential`](http://keras.io/getting-started/sequential-model-guide) model, a linear stack of layers. For more complex architectures, you should use the [Keras function API](http://keras.io/getting-started/functional-api-guide).
Here's the `Sequential` model:
```python
from keras.models import Sequential
model = Sequential()
```
Stacking layers is as easy as `.add()`:
```python
from keras.layers.core import Dense, Activation
model.add(Dense(output_dim=64, input_dim=100))
model.add(Activation("relu"))
model.add(Dense(output_dim=10))
model.add(Activation("softmax"))
```
Once your model looks good, configure its learning process with `.compile()`:
```python
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
```
If you need to, you can further configure your optimizer. A core principle of Keras is to make things reasonably simple, while allowing the user to be fully in control when they need to (the ultimate control being the easy extensibility of the source code).
```python
from keras.optimizers import SGD
model.compile(loss='categorical_crossentropy', optimizer=SGD(lr=0.01, momentum=0.9, nesterov=True))
```
You can now iterate on your training data in batches:
```python
model.fit(X_train, Y_train, nb_epoch=5, batch_size=32)
```
Alternatively, you can feed batches to your model manually:
```python
model.train_on_batch(X_batch, Y_batch)
```
Evaluate your performance in one line:
```python
loss_and_metrics = model.evaluate(X_test, Y_test, batch_size=32)
```
Or generate predictions on new data:
```python
classes = model.predict_classes(X_test, batch_size=32)
proba = model.predict_proba(X_test, batch_size=32)
```
Building a question answering system, an image classification model, a Neural Turing Machine, a word2vec embedder or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful?
For a more in-depth tutorial about Keras, you can check out:
- [Getting started with the Sequential model](http://keras.io/getting-started/sequential-model-guide)
- [Getting started with the functional API](http://keras.io/getting-started/functional-api-guide)
In the [examples folder](https://github.com/fchollet/keras/tree/master/examples) of the repository, you will find more advanced models: question-answering with memory networks, text generation with stacked LSTMs, etc.
------------------
## Installation
Keras uses the following dependencies:
- numpy, scipy
- pyyaml
- HDF5 and h5py (optional, required if you use model saving/loading functions)
- Optional but recommended if you use CNNs: cuDNN.
*When using the Theano backend:*
- Theano
- [See installation instructions](http://deeplearning.net/software/theano/install.html#install).
*When using the TensorFlow backend:*
- TensorFlow
- [See installation instructions](https://github.com/tensorflow/tensorflow#download-and-setup).
To install Keras, `cd` to the Keras folder and run the install command:
```
sudo python setup.py install
```
You can also install Keras from PyPI:
```
sudo pip install keras
```
------------------
## Switching from Theano to TensorFlow
By default, Keras will use Theano as its tensor manipulation library. [Follow these instructions](http://keras.io/backend/) to configure the Keras backend.
------------------
## Support
You can ask questions and join the development discussion on the [Keras Google group](https://groups.google.com/forum/#!forum/keras-users).
You can also post bug reports and feature requests in [Github issues](https://github.com/fchollet/keras/issues). Make sure to read [our guidelines](https://github.com/fchollet/keras/blob/master/CONTRIBUTING.md) first.
------------------
## Why this name, Keras?
Keras (κέρας) means _horn_ in Greek. It is a reference to a literary image from ancient Greek and Latin literature, first found in the _Odyssey_, where dream spirits (_Oneiroi_, singular _Oneiros_) are divided between those who deceive men with false visions, who arrive to Earth through a gate of ivory, and those who announce a future that will come to pass, who arrive through a gate of horn. It's a play on the words κέρας (horn) / κραίνω (fulfill), and ἐλέφας (ivory) / ἐλεφαίρομαι (deceive).
Keras was initially developed as part of the research effort of project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System).
>_"Oneiroi are beyond our unravelling --who can be sure what tale they tell? Not all that men look for comes to pass. Two gates there are that give passage to fleeting Oneiroi; one is made of horn, one of ivory. The Oneiroi that pass through sawn ivory are deceitful, bearing a message that will not be fulfilled; those that come out through polished horn have truth behind them, to be accomplished for men who see them."_ Homer, Odyssey 19. 562 ff (Shewring translation).
------------------
{{autogenerated}}
-23
Ver Arquivo
@@ -1,23 +0,0 @@
## Usage of initializations
Initializations define the probability distribution used to set the initial random weights of Keras layers.
The keyword arguments used for passing initializations to layers will depend on the layer. Usually it is simply `init`:
```python
model.add(Dense(64, init='uniform'))
```
## Available initializations
- __uniform__
- __lecun_uniform__: Uniform initialization scaled by the square root of the number of inputs (LeCun 98).
- __normal__
- __identity__: Use with square 2D layers (`shape[0] == shape[1]`).
- __orthogonal__: Use with square 2D layers (`shape[0] == shape[1]`).
- __zero__
- __glorot_normal__: Gaussian initialization scaled by fan_in + fan_out (Glorot 2010)
- __glorot_uniform__
- __he_normal__: Gaussian initialization scaled by fan_in (He et al., 2014)
- __he_uniform__
+43
Ver Arquivo
@@ -0,0 +1,43 @@
## Usage of initializers
Initializations define the way to set the initial random weights of Keras layers.
The keyword arguments used for passing initializers to layers will depend on the layer. Usually it is simply `kernel_initializer` and `bias_initializer`:
```python
model.add(Dense(64,
kernel_initializer='random_uniform',
bias_initializer='zeros'))
```
## Available initializers
The following built-in initializers are available as part of the `keras.initializers` module:
{{autogenerated}}
An initializer may be passed as a string (must match one of the available initializers above), or as a callable:
```python
from keras import initializers
model.add(Dense(64, kernel_initializer=initializers.random_normal(stddev=0.01)))
# also works; will use the default parameters.
model.add(Dense(64, kernel_initializer='random_normal'))
```
## Using custom initializers
If passing a custom callable, then it must take the argument `shape` (shape of the variable to initialize) and `dtype` (dtype of generated values):
```python
from keras import backend as K
def my_init(shape, dtype=None):
return K.random_normal(shape, dtype=dtype)
model.add(Dense(64, init=my_init))
```
+12 -2
Ver Arquivo
@@ -5,11 +5,21 @@ All Keras layers have a number of methods in common:
- `layer.get_weights()`: returns the weights of the layer as a list of Numpy arrays.
- `layer.set_weights(weights)`: sets the weights of the layer from a list of Numpy arrays (with the same shapes as the output of `get_weights`).
- `layer.get_config()`: returns a dictionary containing the configuration of the layer. The layer can be reinstantiated from its config via:
```python
from keras.utils.layer_utils import layer_from_config
layer = Dense(32)
config = layer.get_config()
reconstructed_layer = Dense.from_config(config)
```
Or:
```python
from keras import layers
config = layer.get_config()
layer = layer_from_config(config)
layer = layers.deserialize({'class_name': layer.__class__.__name__,
'config': config})
```
If a layer has a single node (i.e. if it isn't a shared layer), you can get its input tensor, output tensor, input shape and output shape via:
+36
Ver Arquivo
@@ -0,0 +1,36 @@
# Writing your own Keras layers
For simple, stateless custom operations, you are probably better off using `layers.core.Lambda` layers. But for any custom operation that has trainable weights, you should implement your own layer.
Here is the skeleton of a Keras layer, **as of Keras 2.0** (if you have an older version, please upgrade). There are only three methods you need to implement:
- `build(input_shape)`: this is where you will define your weights. This method must set `self.built = True`, which can be done by calling `super([Layer], self).build()`.
- `call(x)`: this is where the layer's logic lives. Unless you want your layer to support masking, you only have to care about the first argument passed to `call`: the input tensor.
- `compute_output_shape(input_shape)`: in case your layer modifies the shape of its input, you should specify here the shape transformation logic. This allows Keras to do automatic shape inference.
```python
from keras import backend as K
from keras.engine.topology import Layer
import numpy as np
class MyLayer(Layer):
def __init__(self, output_dim, **kwargs):
self.output_dim = output_dim
super(MyLayer, self).__init__(**kwargs)
def build(self, input_shape):
# Create a trainable weight variable for this layer.
self.kernel = self.add_weight(shape=(input_shape[1], self.output_dim),
initializer='uniform',
trainable=True)
super(MyLayer, self).build(input_shape) # Be sure to call this somewhere!
def call(self, x):
return K.dot(x, self.kernel)
def compute_output_shape(self, input_shape):
return (input_shape[0], self.output_dim)
```
The existing Keras layers provide examples of how to implement almost anything. Never hesitate to read the source code!
+37
Ver Arquivo
@@ -0,0 +1,37 @@
## Usage of loss functions
A loss function (or objective function, or optimization score function) is one of the two parameters required to compile a model:
```python
model.compile(loss='mean_squared_error', optimizer='sgd')
```
```python
from keras import losses
model.compile(loss=losses.mean_squared_error, optimizer='sgd')
```
You can either pass the name of an existing loss function, or pass a TensorFlow/Theano symbolic function that returns a scalar for each data-point and takes the following two arguments:
- __y_true__: True labels. TensorFlow/Theano tensor.
- __y_pred__: Predictions. TensorFlow/Theano tensor of the same shape as y_true.
The actual optimized objective is the mean of the output array across all datapoints.
For a few examples of such functions, check out the [losses source](https://github.com/fchollet/keras/blob/master/keras/losses.py).
## Available loss functions
{{autogenerated}}
----
**Note**: when using the `categorical_crossentropy` loss, your targets should be in categorical format (e.g. if you have 10 classes, the target for each sample should be a 10-dimensional vector that is all-zeros expect for a 1 at the index corresponding to the class of the sample). In order to convert *integer targets* into *categorical targets*, you can use the Keras utility `to_categorical`:
```python
from keras.utils.np_utils import to_categorical
categorical_labels = to_categorical(int_labels, num_classes=None)
```
+56
Ver Arquivo
@@ -0,0 +1,56 @@
## Usage of metrics
A metric is a function that is used to judge the performance of your model. Metric functions are to be supplied in the `metrics` parameter when a model is compiled.
```python
model.compile(loss='mean_squared_error',
optimizer='sgd',
metrics=['mae', 'acc'])
```
```python
from keras import metrics
model.compile(loss='mean_squared_error',
optimizer='sgd',
metrics=[metrics.mae, metrics.categorical_accuracy])
```
A metric function is similar to an [objective function](/objectives), except that the results from evaluating a metric are not used when training the model.
You can either pass the name of an existing metric, or pass a Theano/TensorFlow symbolic function (see [Custom metrics](#custom-metrics)).
#### Arguments
- __y_true__: True labels. Theano/TensorFlow tensor.
- __y_pred__: Predictions. Theano/TensorFlow tensor of the same shape as y_true.
#### Returns
Single tensor value representing the mean of the output array across all
datapoints.
----
## Available metrics
{{autogenerated}}
----
## Custom metrics
Custom metrics can be passed at the compilation step. The
function would need to take `(y_true, y_pred)` as arguments and return
a single tensor value.
```python
import keras.backend as K
def mean_pred(y_true, y_pred):
return K.mean(y_pred)
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy', mean_pred])
```
+1 -1
Ver Arquivo
@@ -30,4 +30,4 @@ yaml_string = model.to_yaml()
model = model_from_yaml(yaml_string)
```
- `model.save_weights(filepath)`: saves the weights of the model as a HDF5 file.
- `model.load_weights(filepath)`: loads the weights of the model from a HDF5 file (created by `save_weights`).
- `model.load_weights(filepath, by_name=False)`: loads the weights of the model from a HDF5 file (created by `save_weights`). By default, the architecture is expected to be unchanged. To load weights into a different architecture (with some layers in common), use `by_name=True` to load only those layers with the same name.
+5 -5
Ver Arquivo
@@ -1,6 +1,6 @@
# Model class API
In the functional API, given an input tensor and output tensor, you can instantiate a `Model` via:
In the functional API, given some input tensor(s) and output tensor(s), you can instantiate a `Model` via:
```python
from keras.models import Model
@@ -8,15 +8,15 @@ from keras.layers import Input, Dense
a = Input(shape=(32,))
b = Dense(32)(a)
model = Model(input=a, output=b)
model = Model(inputs=a, outputs=b)
```
This model will include all layers required in the computation of `a` given `b`.
This model will include all layers required in the computation of `b` given `a`.
In the case of multi-input or multi-output models, you can use lists as well:
```python
model = Model(input=[a1, a2], output=[b1, b3, b3])
model = Model(inputs=[a1, a2], outputs=[b1, b3, b3])
```
For a detailed introduction of what `Model` can do, read [this guide to the Keras functional API](/getting-started/functional-api-guide).
@@ -29,4 +29,4 @@ For a detailed introduction of what `Model` can do, read [this guide to the Kera
## Methods
{{autogenerated}}
{{autogenerated}}
-30
Ver Arquivo
@@ -1,30 +0,0 @@
## Usage of objectives
An objective function (or loss function, or optimization score function) is one of the two parameters required to compile a model:
```python
model.compile(loss='mean_squared_error', optimizer='sgd')
```
You can either pass the name of an existing objective, or pass a Theano/TensorFlow symbolic function that returns a scalar for each data-point and takes the following two arguments:
- __y_true__: True labels. Theano/TensorFlow tensor.
- __y_pred__: Predictions. Theano/TensorFlow tensor of the same shape as y_true.
The actual optimized objective is the mean of the output array across all datapoints.
For a few examples of such functions, check out the [objectives source](https://github.com/fchollet/keras/blob/master/keras/objectives.py).
## Available objectives
- __mean_squared_error__ / __mse__
- __mean_absolute_error__ / __mae__
- __mean_absolute_percentage_error__ / __mape__
- __mean_squared_logarithmic_error__ / __msle__
- __squared_hinge__
- __hinge__
- __binary_crossentropy__: Also known as logloss.
- __categorical_crossentropy__: Also known as multiclass logloss. __Note__: using this objective requires that your labels are binary arrays of shape `(nb_samples, nb_classes)`.
- __poisson__: mean of `(predictions - targets * log(predictions))`
- __cosine_proximity__: the opposite (negative) of the mean cosine proximity between predictions and targets.
+27 -2
Ver Arquivo
@@ -4,12 +4,14 @@
An optimizer is one of the two arguments required for compiling a Keras model:
```python
from keras import optimizers
model = Sequential()
model.add(Dense(64, init='uniform', input_dim=10))
model.add(Dense(64, init='uniform', input_shape=(10,)))
model.add(Activation('tanh'))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
sgd = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
```
@@ -22,4 +24,27 @@ model.compile(loss='mean_squared_error', optimizer='sgd')
---
## Parameters common to all Keras optimizers
The parameters `clipnorm` and `clipvalue` can be used with all optimizers to control gradient clipping:
```python
from keras import optimizers
# All parameter gradients will be clipped to
# a maximum norm of 1.
sgd = optimizers.SGD(lr=0.01, clipnorm=1.)
```
```python
from keras import optimizers
# All parameter gradients will be clipped to
# a maximum value of 0.5 and
# a minimum value of -0.5.
sgd = optimizers.SGD(lr=0.01, clipvalue=0.5)
```
---
{{autogenerated}}
+137 -19
Ver Arquivo
@@ -2,55 +2,105 @@
## ImageDataGenerator
```python
keras.preprocessing.image.ImageDataGenerator(featurewise_center=True,
keras.preprocessing.image.ImageDataGenerator(featurewise_center=False,
samplewise_center=False,
featurewise_std_normalization=True,
featurewise_std_normalization=False,
samplewise_std_normalization=False,
zca_whitening=False,
rotation_range=0.,
width_shift_range=0.,
height_shift_range=0.,
shear_range=0.,
zoom_range=0.,
channel_shift_range=0.,
fill_mode='nearest',
cval=0.,
horizontal_flip=False,
vertical_flip=False)
vertical_flip=False,
rescale=None,
data_format=K.image_data_format())
```
Generate batches of tensor image data with real-time data augmentation. The data will be looped over (in batches) indefinitely.
- __Arguments__:
- __featurewise_center__: Boolean. Set input mean to 0 over the dataset.
- __featurewise_center__: Boolean. Set input mean to 0 over the dataset, feature-wise.
- __samplewise_center__: Boolean. Set each sample mean to 0.
- __featurewise_std_normalization__: Boolean. Divide inputs by std of the dataset.
- __featurewise_std_normalization__: Boolean. Divide inputs by std of the dataset, feature-wise.
- __samplewise_std_normalization__: Boolean. Divide each input by its std.
- __zca_whitening__: Boolean. Apply ZCA whitening.
- __rotation_range__: Int. Degree range for random rotations.
- __width_shift_range__: Float (fraction of total width). Range for random horizontal shifts.
- __height_shift_range__: Float (fraction of total height). Range for random vertical shifts.
- __shear_range__: Float. Shear Intensity (Shear angle in counter-clockwise direction as radians)
- __zoom_range__: Float or [lower, upper]. Range for random zoom. If a float, `[lower, upper] = [1-zoom_range, 1+zoom_range]`.
- __channel_shift_range__: Float. Range for random channel shifts.
- __fill_mode__: One of {"constant", "nearest", "reflect" or "wrap"}. Points outside the boundaries of the input are filled according to the given mode.
- __cval__: Float or Int. Value used for points outside the boundaries when `fill_mode = "constant"`.
- __horizontal_flip__: Boolean. Randomly flip inputs horizontally.
- __vertical_flip__: Boolean. Randomly flip inputs vertically.
- __rescale__: rescaling factor. Defaults to None. If None or 0, no rescaling is applied,
otherwise we multiply the data by the value provided (before applying
any other transformation).
- _data_format_: One of {"channels_first", "channels_last"}.
"channels_last" mode means that the images should have shape `(samples, height, width, channels)`,
"channels_first" mode means that the images should have shape `(samples, channels, height, width)`.
It defaults to the `image_data_format` value found in your
Keras config file at `~/.keras/keras.json`.
If you never set it, then it will be "channels_last".
- __Methods__:
- __fit(X)__: Required if featurewise_center or featurewise_std_normalization or zca_whitening. Compute necessary quantities on some sample data.
- __fit(X)__: Compute the internal data stats related to the data-dependent transformations, based on an array of sample data.
Only required if featurewise_center or featurewise_std_normalization or zca_whitening.
- __Arguments__:
- __X__: sample data.
- __X__: sample data. Should have rank 4.
In case of grayscale data,
the channels axis should have value 1, and in case
of RGB data, it should have value 3.
- __augment__: Boolean (default: False). Whether to fit on randomly augmented samples.
- __rounds__: int (default: 1). If augment, how many augmentation passes over the data to use.
- __flow(X, y)__:
- __seed__: int (default: None). Random seed.
- __flow(X, y)__: Takes numpy data & label arrays, and generates batches of augmented/normalized data. Yields batches indefinitely, in an infinite loop.
- __Arguments__:
- __X__: data.
- __X__: data. Should have rank 4.
In case of grayscale data,
the channels axis should have value 1, and in case
of RGB data, it should have value 3.
- __y__: labels.
- __batch_size__: int (default: 32).
- __shuffle__: boolean (defaut: False).
- __save_to_dir__: None or str. This allows you to optimally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing).
- __save_prefix__: str. Prefix to use for filenames of saved pictures.
- __save_format__: one of "png", jpeg".
- __shuffle__: boolean (defaut: True).
- __seed__: int (default: None).
- __save_to_dir__: None or str (default: None). This allows you to optimally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing).
- __save_prefix__: str (default: `''`). Prefix to use for filenames of saved pictures (only relevant if `save_to_dir` is set).
- __save_format__: one of "png", "jpeg" (only relevant if `save_to_dir` is set). Default: "jpeg".
- __yields__: Tuples of `(x, y)` where `x` is a numpy array of image data and `y` is a numpy array of corresponding labels.
The generator loops indefinitely.
- __flow_from_directory(directory)__: Takes the path to a directory, and generates batches of augmented/normalized data. Yields batches indefinitely, in an infinite loop.
- __Arguments__:
- __directory__: path to the target directory. It should contain one subdirectory per class.
Any PNG, JPG or BMP images inside each of the subdirectories directory tree will be included in the generator.
See [this script](https://gist.github.com/fchollet/0830affa1f7f19fd47b06d4cf89ed44d) for more details.
- __target_size__: tuple of integers, default: `(256, 256)`. The dimensions to which all images found will be resized.
- __color_mode__: one of "grayscale", "rbg". Default: "rgb". Whether the images will be converted to have 1 or 3 color channels.
- __classes__: optional list of class subdirectories (e.g. `['dogs', 'cats']`). Default: None. If not provided, the list of classes will be automatically inferred (and the order of the classes, which will map to the label indices, will be alphanumeric).
- __class_mode__: one of "categorical", "binary", "sparse" or None. Default: "categorical". Determines the type of label arrays that are returned: "categorical" will be 2D one-hot encoded labels, "binary" will be 1D binary labels, "sparse" will be 1D integer labels. If None, no labels are returned (the generator will only yield batches of image data, which is useful to use `model.predict_generator()`, `model.evaluate_generator()`, etc.).
- __batch_size__: size of the batches of data (default: 32).
- __shuffle__: whether to shuffle the data (default: True)
- __seed__: optional random seed for shuffling and transformations.
- __save_to_dir__: None or str (default: None). This allows you to optimally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing).
- __save_prefix__: str. Prefix to use for filenames of saved pictures (only relevant if `save_to_dir` is set).
- __save_format__: one of "png", "jpeg" (only relevant if `save_to_dir` is set). Default: "jpeg".
- __follow_links__: whether to follow symlinks inside class subdirectories (default: False).
- __Examples__:
Example of using `.flow(X, y)`:
- __Example__:
```python
(X_train, y_train), (X_test, y_test) = cifar10.load_data(test_split=0.1)
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
Y_train = np_utils.to_categorical(y_train, num_classes)
Y_test = np_utils.to_categorical(y_test, num_classes)
datagen = ImageDataGenerator(
featurewise_center=True,
@@ -66,10 +116,10 @@ datagen.fit(X_train)
# fits the model on batches with real-time data augmentation:
model.fit_generator(datagen.flow(X_train, Y_train, batch_size=32),
samples_per_epoch=len(X_train), nb_epoch=nb_epoch)
samples_per_epoch=len(X_train), epochs=epochs)
# here's a more "manual" example
for e in range(nb_epoch):
for e in range(epochs):
print 'Epoch', e
batches = 0
for X_batch, Y_batch in datagen.flow(X_train, Y_train, batch_size=32):
@@ -80,3 +130,71 @@ for e in range(nb_epoch):
# the generator loops indefinitely
break
```
Example of using `.flow_from_directory(directory)`:
```python
train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
'data/train',
target_size=(150, 150),
batch_size=32,
class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
'data/validation',
target_size=(150, 150),
batch_size=32,
class_mode='binary')
model.fit_generator(
train_generator,
samples_per_epoch=2000,
epochs=50,
validation_data=validation_generator,
num_val_samples=800)
```
Example of transforming images and masks together.
```python
# we create two instances with the same arguments
data_gen_args = dict(featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=90.,
width_shift_range=0.1,
height_shift_range=0.1,
zoom_range=0.2)
image_datagen = ImageDataGenerator(**data_gen_args)
mask_datagen = ImageDataGenerator(**data_gen_args)
# Provide the same seed and keyword arguments to the fit and flow methods
seed = 1
image_datagen.fit(images, augment=True, seed=seed)
mask_datagen.fit(masks, augment=True, seed=seed)
image_generator = image_datagen.flow_from_directory(
'data/images',
class_mode=None,
seed=seed)
mask_generator = mask_datagen.flow_from_directory(
'data/masks',
class_mode=None,
seed=seed)
# combine generators into one which yields image and masks
train_generator = zip(image_generator, mask_generator)
model.fit_generator(
train_generator,
samples_per_epoch=2000,
epochs=50)
```
+12 -11
Ver Arquivo
@@ -1,17 +1,18 @@
## pad_sequences
```python
keras.preprocessing.sequence.pad_sequences(sequences, maxlen=None, dtype='int32')
keras.preprocessing.sequence.pad_sequences(sequences, maxlen=None, dtype='int32',
padding='pre', truncating='pre', value=0.)
```
Transform a list of `nb_samples sequences` (lists of scalars) into a 2D numpy array of shape `(nb_samples, nb_timesteps)`. `nb_timesteps` is either the `maxlen` argument if provided, or the length of the longest sequence otherwise. Sequences that are shorter than `nb_timesteps` are padded with zeros at the end.
Transform a list of `num_samples` sequences (lists of scalars) into a 2D Numpy array of shape `(num_samples, num_timesteps)`. `num_timesteps` is either the `maxlen` argument if provided, or the length of the longest sequence otherwise. Sequences that are shorter than `num_timesteps` are padded with `value` at the end. Sequences longer than `num_timesteps` are truncated so that it fits the desired length. Position where padding or truncation happens is determined by `padding` or `truncating`, respectively.
- __Return__: 2D numpy array of shape `(nb_samples, nb_timesteps)`.
- __Return__: 2D Numpy array of shape `(num_samples, num_timesteps)`.
- __Arguments__:
- __sequences__: List of lists of int or float.
- __maxlen__: None or int. Maximum sequence length, longer sequences are truncated and shorter sequences are padded with zeros at the end.
- __dtype__: datatype of the numpy array returned.
- __dtype__: datatype of the Numpy array returned.
- __padding__: 'pre' or 'post', pad either before or after each sequence.
- __truncating__: 'pre' or 'post', remove values from sequences larger than maxlen either in the beginning or in the end of the sequence
- __value__: float, value to pad the sequences to the desired value.
@@ -21,12 +22,12 @@ Transform a list of `nb_samples sequences` (lists of scalars) into a 2D numpy ar
## skipgrams
```python
keras.preprocessing.sequence.skipgrams(sequence, vocabulary_size,
window_size=4, negative_samples=1., shuffle=True,
keras.preprocessing.sequence.skipgrams(sequence, vocabulary_size,
window_size=4, negative_samples=1., shuffle=True,
categorical=False, sampling_table=None)
```
Transforms a sequence of word indexes (list of int) into couples of the form:
Transforms a sequence of word indexes (list of int) into couples of the form:
- (word, word in the same window), with label 1 (positive samples).
- (word, random word from the vocabulary), with label 0 (negative samples).
@@ -34,8 +35,8 @@ Transforms a sequence of word indexes (list of int) into couples of the form:
Read more about Skipgram in this gnomic paper by Mikolov et al.: [Efficient Estimation of Word Representations in
Vector Space](http://arxiv.org/pdf/1301.3781v3.pdf)
- __Return__: tuple `(couples, labels)`.
- `couples` is a list of 2-elements lists of int: `[word_index, other_word_index]`.
- __Return__: tuple `(couples, labels)`.
- `couples` is a list of 2-elements lists of int: `[word_index, other_word_index]`.
- `labels` is a list of 0 and 1, where 1 indicates that `other_word_index` was found in the same window as `word_index`, and 0 indicates that `other_word_index` was random.
- if categorical is set to True, the labels are categorical, ie. 1 becomes [0,1], and 0 becomes [1, 0].
@@ -46,7 +47,7 @@ Vector Space](http://arxiv.org/pdf/1301.3781v3.pdf)
- __negative_samples__: float >= 0. 0 for no negative (=random) samples. 1 for same number as positive samples. etc.
- __shuffle__: boolean. Whether to shuffle the samples.
- __categorical__: boolean. Whether to make the returned labels categorical.
- __sampling_table__: numpy array of shape `(vocabulary_size,)` where `sampling_table[i]` is the probability of sampling the word with index i (assumed to be i-th most common word in the dataset).
- __sampling_table__: Numpy array of shape `(vocabulary_size,)` where `sampling_table[i]` is the probability of sampling the word with index i (assumed to be i-th most common word in the dataset).
---
@@ -59,7 +60,7 @@ keras.preprocessing.sequence.make_sampling_table(size, sampling_factor=1e-5)
Used for generating the `sampling_table` argument for `skipgrams`. `sampling_table[i]` is the probability of sampling the word i-th most common word in a dataset (more common words should be sampled less frequently, for balance).
- __Return__: numpy array of shape `(size,)`.
- __Return__: Numpy array of shape `(size,)`.
- __Arguments__:
- __size__: size of the vocabulary considered.
+4 -4
Ver Arquivo
@@ -33,14 +33,14 @@ One-hot encode a text into a list of word indexes in a vocabulary of size n.
## Tokenizer
```python
keras.preprocessing.text.Tokenizer(nb_words=None, filters=base_filter(),
keras.preprocessing.text.Tokenizer(num_words=None, filters=base_filter(),
lower=True, split=" ")
```
Class for vectorizing texts, or/and turning texts into sequences (=list of word indexes, where the word of rank i in the dataset (starting at 1) has index i).
- __Arguments__: Same as `text_to_word_sequence` above.
- __nb_words__: None or int. Maximum number of words to work with (if set, tokenization will be restricted to the top nb_words most common words in the dataset).
- __num_words__: None or int. Maximum number of words to work with (if set, tokenization will be restricted to the top num_words most common words in the dataset).
- __Methods__:
@@ -57,7 +57,7 @@ Class for vectorizing texts, or/and turning texts into sequences (=list of word
- __Return__: yield one sequence per input text.
- __texts_to_matrix(texts)__:
- __Return__: numpy array of shape `(len(texts), nb_words)`.
- __Return__: numpy array of shape `(len(texts), num_words)`.
- __Arguments__:
- __texts__: list of texts to vectorize.
- __mode__: one of "binary", "count", "tfidf", "freq" (default: "binary").
@@ -67,7 +67,7 @@ Class for vectorizing texts, or/and turning texts into sequences (=list of word
- __sequences__: list of sequences to train on.
- __sequences_to_matrix(sequences)__:
- __Return__: numpy array of shape `(len(sequences), nb_words)`.
- __Return__: numpy array of shape `(len(sequences), num_words)`.
- __Arguments__:
- __sequences__: list of sequences to vectorize.
- __mode__: one of "binary", "count", "tfidf", "freq" (default: "binary").
+23 -17
Ver Arquivo
@@ -2,39 +2,45 @@
Regularizers allow to apply penalties on layer parameters or layer activity during optimization. These penalties are incorporated in the loss function that the network optimizes.
The penalties are applied on a per-layer basis. The exact API will depend on the layer, but the layers `Dense`, `TimeDistributedDense`, `MaxoutDense`, `Convolution1D` and `Convolution2D` have a unified API.
The penalties are applied on a per-layer basis. The exact API will depend on the layer, but the layers `Dense`, `Conv1D`, `Conv2D` and `Conv3D` have a unified API.
These layers expose 3 keyword arguments:
- `W_regularizer`: instance of `keras.regularizers.WeightRegularizer`
- `b_regularizer`: instance of `keras.regularizers.WeightRegularizer`
- `activity_regularizer`: instance of `keras.regularizers.ActivityRegularizer`
- `kernel_regularizer`: instance of `keras.regularizers.Regularizer`
- `bias_regularizer`: instance of `keras.regularizers.Regularizer`
- `activity_regularizer`: instance of `keras.regularizers.Regularizer`
## Example
```python
from keras.regularizers import l2, activity_l2
model.add(Dense(64, input_dim=64, W_regularizer=l2(0.01), activity_regularizer=activity_l2(0.01)))
model.add(Dense(64, input_dim=64,
kernel_regularizer=l2(0.01),
activity_regularizer=activity_l2(0.01)))
```
## Available penalties
```python
keras.regularizers.WeightRegularizer(l1=0., l2=0.)
keras.regularizers.l1(0.)
keras.regularizers.l2(0.)
keras.regularizers.l1_l2(0.)
```
## Developing new regularizers
Any function that takes in a weight matrix and returns a loss contribution tensor can be used as a regularizer, e.g.:
```python
keras.regularizers.ActivityRegularizer(l1=0., l2=0.)
from keras import backend as K
def l1_reg(weight_matrix):
return 0.01 * K.sum(K.abs(weight_matrix))
model.add(Dense(64, input_dim=64,
kernel_regularizer=l1_reg)
```
## Shortcuts
These are shortcut functions available in `keras.regularizers`.
- __l1__(l=0.01): L1 weight regularization penalty, also known as LASSO
- __l2__(l=0.01): L2 weight regularization penalty, also known as weight decay, or Ridge
- __l1l2__(l1=0.01, l2=0.01): L1-L2 weight regularization penalty, also known as ElasticNet
- __activity_l1__(l=0.01): L1 activity regularization
- __activity_l2__(l=0.01): L2 activity regularization
- __activity_l1l2__(l1=0.01, l2=0.01): L1+L2 activity regularization
Alternatively, you can write your regularizers in an object-oriented way;
see the [keras/regularizers.py](https://github.com/fchollet/keras/blob/master/keras/regularizers.py) module for examples.
+7 -7
Ver Arquivo
@@ -1,12 +1,12 @@
# Wrappers for the Sciki-Learn API
# Wrappers for the Scikit-Learn API
You can use `Sequential` Keras models (single-input only) as part of your Scikit-Learn workflow via the wrappers found at `keras.wrappers.sklearn.py`.
You can use `Sequential` Keras models (single-input only) as part of your Scikit-Learn workflow via the wrappers found at `keras.wrappers.scikit_learn.py`.
There are two wrappers available:
`keras.wrappers.sklearn.KerasClassifier(build_fn=None, **sk_params)`, which implements the sklearn classifier interface,
`keras.wrappers.scikit_learn.KerasClassifier(build_fn=None, **sk_params)`, which implements the Scikit-Learn classifier interface,
`keras.wrappers.sklearn.KerasRegressor(build_fn=None, **sk_params)`, which implements the sklearn regressor interface.
`keras.wrappers.scikit_learn.KerasRegressor(build_fn=None, **sk_params)`, which implements the Scikit-Learn regressor interface.
### Arguments
@@ -25,12 +25,12 @@ present class will then be treated as the default build_fn.
`sk_params` takes both model parameters and fitting parameters. Legal model
parameters are the arguments of `build_fn`. Note that like all other
estimators in scikit-learn, 'build_fn' should provide defalult values for
estimators in scikit-learn, 'build_fn' should provide default values for
its arguments, so that you could create the estimator without passing any
values to `sk_params`.
`sk_params` could also accept parameters for calling `fit`, `predict`,
`predict_proba`, and `score` methods (e.g., `nb_epoch`, `batch_size`).
`predict_proba`, and `score` methods (e.g., `epochs`, `batch_size`).
fitting (predicting) parameters are selected in the following order:
1. Values passed to the dictionary arguments of
@@ -42,4 +42,4 @@ fitting (predicting) parameters are selected in the following order:
When using scikit-learn's `grid_search` API, legal tunable parameters are
those you could pass to `sk_params`, including fitting parameters.
In other words, you could use `grid_search` to search for the best
`batch_size` or `nb_epoch` as well as the model parameters.
`batch_size` or `epochs` as well as the model parameters.
+6 -5
Ver Arquivo
@@ -1,18 +1,19 @@
## Model visualization
The `keras.utils.visualize_util` module provides utility functions to plot
a Keras model (using graphviz).
The `keras.utils.vis_utils` module provides utility functions to plot
a Keras model (using `graphviz`).
This will plot a graph of the model and save it to a file:
```python
from keras.utils.visualize_util import plot
plot(model, to_file='model.png')
from keras.utils import plot_model
plot_model(model, to_file='model.png')
```
`plot` takes one optional arguments:
`plot_model` takes two optional arguments:
- `show_shapes` (defaults to False) controls whether output shapes are shown in the graph.
- `show_layer_names` (defaults to True) controls whether layer names are shown in the graph.
You can also directly obtain the `pydot.Graph` object and render it yourself,
for example to show it in an ipython notebook :
+100
Ver Arquivo
@@ -0,0 +1,100 @@
# Keras examples directory
[addition_rnn.py](addition_rnn.py)
Implementation of sequence to sequence learning for performing addition of two numbers (as strings).
[antirectifier.py](antirectifier.py)
Demonstrates how to write custom layers for Keras.
[babi_memnn.py](babi_memnn.py)
Trains a memory network on the bAbI dataset for reading comprehension.
[babi_rnn.py](babi_rnn.py)
Trains a two-branch recurrent network on the bAbI dataset for reading comprehension.
[cifar10_cnn.py](cifar10_cnn.py)
Trains a simple deep CNN on the CIFAR10 small images dataset.
[conv_filter_visualization.py](conv_filter_visualization.py)
Visualization of the filters of VGG16, via gradient ascent in input space.
[conv_lstm.py](conv_lstm.py)
Demonstrates the use of a convolutional LSTM network.
[deep_dream.py](deep_dream.py)
Deep Dreams in Keras.
[image_ocr.py](image_ocr.py)
Trains a convolutional stack followed by a recurrent stack and a CTC logloss function to perform optical character recognition (OCR).
[imdb_bidirectional_lstm.py](imdb_bidirectional_lstm.py)
Trains a Bidirectional LSTM on the IMDB sentiment classification task.
[imdb_cnn.py](imdb_cnn.py)
Demonstrates the use of Convolution1D for text classification.
[imdb_cnn_lstm.py](imdb_cnn_lstm.py)
Trains a convolutional stack followed by a recurrent stack network on the IMDB sentiment classification task.
[imdb_fasttext.py](imdb_fasttext.py)
Trains a FastText model on the IMDB sentiment classification task.
[imdb_lstm.py](imdb_lstm.py)
Trains a LSTM on the IMDB sentiment classification task.
[lstm_benchmark.py](lstm_benchmark.py)
Compares different LSTM implementations on the IMDB sentiment classification task.
[lstm_text_generation.py](lstm_text_generation.py)
Generates text from Nietzsche's writings.
[mnist_acgan.py](mnist_acgan.py)
Implementation of AC-GAN ( Auxiliary Classifier GAN ) on the MNIST dataset
[mnist_cnn.py](mnist_cnn.py)
Trains a simple convnet on the MNIST dataset.
[mnist_hierarchical_rnn.py](mnist_hierarchical_rnn.py)
Trains a Hierarchical RNN (HRNN) to classify MNIST digits.
[mnist_irnn.py](mnist_irnn.py)
Reproduction of the IRNN experiment with pixel-by-pixel sequential MNIST in "A Simple Way to Initialize Recurrent Networks of Rectified Linear Units" by Le et al.
[mnist_mlp.py](mnist_mlp.py)
Trains a simple deep multi-layer perceptron on the MNIST dataset.
[mnist_net2net.py](mnist_net2net.py)
Reproduction of the Net2Net experiment with MNIST in "Net2Net: Accelerating Learning via Knowledge Transfer".
[mnist_siamese_graph.py](mnist_siamese_graph.py)
Trains a Siamese multi-layer perceptron on pairs of digits from the MNIST dataset.
[mnist_sklearn_wrapper.py](mnist_sklearn_wrapper.py)
Demonstrates how to use the sklearn wrapper.
[mnist_swwae.py](mnist_swwae.py)
Trains a Stacked What-Where AutoEncoder built on residual blocks on the MNIST dataset.
[mnist_transfer_cnn.py](mnist_transfer_cnn.py)
Transfer learning toy example.
[neural_doodle.py](neural_doodle.py)
Neural doodle.
[neural_style_transfer.py](neural_style_transfer.py)
Neural style transfer.
[pretrained_word_embeddings.py](pretrained_word_embeddings.py)
Loads pre-trained word embeddings (GloVe embeddings) into a frozen Keras Embedding layer, and uses it to train a text classification model on the 20 Newsgroup dataset.
[reuters_mlp.py](reuters_mlp.py)
Trains and evaluate a simple MLP on the Reuters newswire topic classification task.
[stateful_lstm.py](stateful_lstm.py)
Demonstrates how to use stateful RNNs to model long sequences efficiently.
[variational_autoencoder.py](variational_autoencoder.py)
Demonstrates how to build a variational autoencoder.
[variational_autoencoder_deconv.py](variational_autoencoder_deconv.py)
Demonstrates how to build a variational autoencoder with Keras using deconvolution layers.
+93 -62
Ver Arquivo
@@ -23,42 +23,47 @@ Four digits inverted:
Five digits inverted:
+ One layer LSTM (128 HN), 550k training examples = 99% train/test accuracy in 30 epochs
'''
from __future__ import print_function
from keras.models import Sequential
from keras.engine.training import slice_X
from keras.layers.core import Activation, TimeDistributedDense, RepeatVector
from keras.layers import recurrent
from keras import layers
import numpy as np
from six.moves import range
class CharacterTable(object):
'''
Given a set of characters:
"""Given a set of characters:
+ Encode them to a one hot integer representation
+ Decode the one hot integer representation to their character output
+ Decode a vector of probabilties to their character output
'''
def __init__(self, chars, maxlen):
+ Decode a vector of probabilities to their character output
"""
def __init__(self, chars):
"""Initialize character table.
# Arguments
chars: Characters that can appear in the input.
"""
self.chars = sorted(set(chars))
self.char_indices = dict((c, i) for i, c in enumerate(self.chars))
self.indices_char = dict((i, c) for i, c in enumerate(self.chars))
self.maxlen = maxlen
def encode(self, C, maxlen=None):
maxlen = maxlen if maxlen else self.maxlen
X = np.zeros((maxlen, len(self.chars)))
def encode(self, C, num_rows):
"""One hot encode given string C.
# Arguments
num_rows: Number of rows in the returned one hot encoding. This is
used to keep the # of rows for each data the same.
"""
x = np.zeros((num_rows, len(self.chars)))
for i, c in enumerate(C):
X[i, self.char_indices[c]] = 1
return X
x[i, self.char_indices[c]] = 1
return x
def decode(self, X, calc_argmax=True):
def decode(self, x, calc_argmax=True):
if calc_argmax:
X = X.argmax(axis=-1)
return ''.join(self.indices_char[x] for x in X)
x = x.argmax(axis=-1)
return ''.join(self.indices_char[x] for x in x)
class colors:
@@ -66,104 +71,130 @@ class colors:
fail = '\033[91m'
close = '\033[0m'
# Parameters for the model and dataset
# Parameters for the model and dataset.
TRAINING_SIZE = 50000
DIGITS = 3
INVERT = True
# Try replacing GRU, or SimpleRNN
RNN = recurrent.LSTM
HIDDEN_SIZE = 128
BATCH_SIZE = 128
LAYERS = 1
MAXLEN = DIGITS + 1 + DIGITS
# Maximum length of input is 'int + int' (e.g., '345+678'). Maximum length of
# int is DIGITS.
MAxLEN = DIGITS + 1 + DIGITS
# All the numbers, plus sign and space for padding.
chars = '0123456789+ '
ctable = CharacterTable(chars, MAXLEN)
ctable = CharacterTable(chars)
questions = []
expected = []
seen = set()
print('Generating data...')
while len(questions) < TRAINING_SIZE:
f = lambda: int(''.join(np.random.choice(list('0123456789')) for i in range(np.random.randint(1, DIGITS + 1))))
f = lambda: int(''.join(np.random.choice(list('0123456789'))
for i in range(np.random.randint(1, DIGITS + 1))))
a, b = f(), f()
# Skip any addition questions we've already seen
# Also skip any such that X+Y == Y+X (hence the sorting)
# Also skip any such that x+Y == Y+x (hence the sorting).
key = tuple(sorted((a, b)))
if key in seen:
continue
seen.add(key)
# Pad the data with spaces such that it is always MAXLEN
# Pad the data with spaces such that it is always MAxLEN.
q = '{}+{}'.format(a, b)
query = q + ' ' * (MAXLEN - len(q))
query = q + ' ' * (MAxLEN - len(q))
ans = str(a + b)
# Answers can be of maximum size DIGITS + 1
# Answers can be of maximum size DIGITS + 1.
ans += ' ' * (DIGITS + 1 - len(ans))
if INVERT:
# Reverse the query, e.g., '12+345 ' becomes ' 543+21'. (Note the
# space used for padding.)
query = query[::-1]
questions.append(query)
expected.append(ans)
print('Total addition questions:', len(questions))
print('Vectorization...')
X = np.zeros((len(questions), MAXLEN, len(chars)), dtype=np.bool)
x = np.zeros((len(questions), MAxLEN, len(chars)), dtype=np.bool)
y = np.zeros((len(questions), DIGITS + 1, len(chars)), dtype=np.bool)
for i, sentence in enumerate(questions):
X[i] = ctable.encode(sentence, maxlen=MAXLEN)
x[i] = ctable.encode(sentence, MAxLEN)
for i, sentence in enumerate(expected):
y[i] = ctable.encode(sentence, maxlen=DIGITS + 1)
y[i] = ctable.encode(sentence, DIGITS + 1)
# Shuffle (X, y) in unison as the later parts of X will almost all be larger digits
# Shuffle (x, y) in unison as the later parts of x will almost all be larger
# digits.
indices = np.arange(len(y))
np.random.shuffle(indices)
X = X[indices]
x = x[indices]
y = y[indices]
# Explicitly set apart 10% for validation data that we never train over
split_at = len(X) - len(X) / 10
(X_train, X_val) = (slice_X(X, 0, split_at), slice_X(X, split_at))
(y_train, y_val) = (y[:split_at], y[split_at:])
# Explicitly set apart 10% for validation data that we never train over.
split_at = len(x) - len(x) // 10
(x_train, x_val) = x[:split_at], x[split_at:]
(y_train, y_val) = y[:split_at], y[split_at:]
print(X_train.shape)
print('Training Data:')
print(x_train.shape)
print(y_train.shape)
print('Validation Data:')
print(x_val.shape)
print(y_val.shape)
# Try replacing GRU, or SimpleRNN.
RNN = layers.LSTM
HIDDEN_SIZE = 128
BATCH_SIZE = 128
LAYERS = 1
print('Build model...')
model = Sequential()
# "Encode" the input sequence using an RNN, producing an output of HIDDEN_SIZE
# note: in a situation where your input sequences have a variable length,
# use input_shape=(None, nb_feature).
model.add(RNN(HIDDEN_SIZE, input_shape=(MAXLEN, len(chars))))
# For the decoder's input, we repeat the encoded input for each time step
model.add(RepeatVector(DIGITS + 1))
# The decoder RNN could be multiple layers stacked or a single layer
# "Encode" the input sequence using an RNN, producing an output of HIDDEN_SIZE.
# Note: In a situation where your input sequences have a variable length,
# use input_shape=(None, num_feature).
model.add(RNN(HIDDEN_SIZE, input_shape=(MAxLEN, len(chars))))
# As the decoder RNN's input, repeatedly provide with the last hidden state of
# RNN for each time step. Repeat 'DIGITS + 1' times as that's the maximum
# length of output, e.g., when DIGITS=3, max output is 999+999=1998.
model.add(layers.RepeatVector(DIGITS + 1))
# The decoder RNN could be multiple layers stacked or a single layer.
for _ in range(LAYERS):
# By setting return_sequences to True, return not only the last output but
# all the outputs so far in the form of (num_samples, timesteps,
# output_dim). This is necessary as TimeDistributed in the below expects
# the first dimension to be the timesteps.
model.add(RNN(HIDDEN_SIZE, return_sequences=True))
# For each of step of the output sequence, decide which character should be chosen
model.add(TimeDistributedDense(len(chars)))
model.add(Activation('softmax'))
# Apply a dense layer to the every temporal slice of an input. For each of step
# of the output sequence, decide which character should be chosen.
model.add(layers.TimeDistributed(layers.Dense(len(chars))))
model.add(layers.Activation('softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.summary()
# Train the model each generation and show predictions against the validation dataset
# Train the model each generation and show predictions against the validation
# dataset.
for iteration in range(1, 200):
print()
print('-' * 50)
print('Iteration', iteration)
model.fit(X_train, y_train, batch_size=BATCH_SIZE, nb_epoch=1,
validation_data=(X_val, y_val))
###
# Select 10 samples from the validation set at random so we can visualize errors
model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=1,
validation_data=(x_val, y_val))
# Select 10 samples from the validation set at random so we can visualize
# errors.
for i in range(10):
ind = np.random.randint(0, len(X_val))
rowX, rowy = X_val[np.array([ind])], y_val[np.array([ind])]
preds = model.predict_classes(rowX, verbose=0)
q = ctable.decode(rowX[0])
ind = np.random.randint(0, len(x_val))
rowx, rowy = x_val[np.array([ind])], y_val[np.array([ind])]
preds = model.predict_classes(rowx, verbose=0)
q = ctable.decode(rowx[0])
correct = ctable.decode(rowy[0])
guess = ctable.decode(preds[0], calc_argmax=False)
print('Q', q[::-1] if INVERT else q)
print('T', correct)
print(colors.ok + '' + colors.close if correct == guess else colors.fail + '' + colors.close, guess)
if correct == guess:
print(colors.ok + '' + colors.close, end=" ")
else:
print(colors.fail + '' + colors.close, end=" ")
print(guess)
print('---')
+33 -32
Ver Arquivo
@@ -2,7 +2,7 @@
We build a custom activation layer called 'Antirectifier',
which modifies the shape of the tensor that passes through it.
We need to specify two methods: `output_shape` and `get_output`.
We need to specify two methods: `get_output_shape_for` and `call`.
Note that the same result can also be achieved via a Lambda layer.
@@ -11,14 +11,14 @@ backend (`K`), our code can run both on TensorFlow and Theano.
'''
from __future__ import print_function
import keras
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Layer, Activation
from keras import layers
from keras.datasets import mnist
from keras import backend as K
from keras.utils import np_utils
class Antirectifier(Layer):
class Antirectifier(layers.Layer):
'''This is the combination of a sample-wise
L2 normalization with the concatenation of the
positive part of the input with the negative part
@@ -45,50 +45,51 @@ class Antirectifier(Layer):
with twice less parameters yet with comparable
classification accuracy as an equivalent ReLU-based network.
'''
def get_output_shape_for(self, input_shape):
def compute_output_shape(self, input_shape):
shape = list(input_shape)
assert len(shape) == 2 # only valid for 2D tensors
shape[-1] *= 2
return tuple(shape)
def call(self, x, mask=None):
x -= K.mean(x, axis=1, keepdims=True)
x = K.l2_normalize(x, axis=1)
pos = K.relu(x)
neg = K.relu(-x)
def call(self, inputs):
inputs -= K.mean(inputs, axis=1, keepdims=True)
inputs = K.l2_normalize(inputs, axis=1)
pos = K.relu(inputs)
neg = K.relu(-inputs)
return K.concatenate([pos, neg], axis=1)
# global parameters
batch_size = 128
nb_classes = 10
nb_epoch = 40
num_classes = 10
epochs = 40
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
(x_train, y_train), (x_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
# build the model
model = Sequential()
model.add(Dense(256, input_shape=(784,)))
model.add(layers.Dense(256, input_shape=(784,)))
model.add(Antirectifier())
model.add(Dropout(0.1))
model.add(Dense(256))
model.add(layers.Dropout(0.1))
model.add(layers.Dense(256))
model.add(Antirectifier())
model.add(Dropout(0.1))
model.add(Dense(10))
model.add(Activation('softmax'))
model.add(layers.Dropout(0.1))
model.add(layers.Dense(10))
model.add(layers.Activation('softmax'))
# compile the model
model.compile(loss='categorical_crossentropy',
@@ -96,9 +97,9 @@ model.compile(loss='categorical_crossentropy',
metrics=['accuracy'])
# train the model
model.fit(X_train, Y_train,
batch_size=batch_size, nb_epoch=nb_epoch,
verbose=1, validation_data=(X_test, Y_test))
model.fit(x_train, y_train,
batch_size=batch_size, epochs=epochs,
verbose=1, validation_data=(x_test, y_test))
# next, compare with an equivalent network
# with2x bigger Dense layers and ReLU
+12 -6
Ver Arquivo
@@ -12,12 +12,12 @@ References:
Reaches 98.6% accuracy on task 'single_supporting_fact_10k' after 120 epochs.
Time per epoch: 3s on CPU (core i7).
'''
from __future__ import print_function
from keras.models import Sequential
from keras.layers.embeddings import Embedding
from keras.layers.core import Activation, Dense, Merge, Permute, Dropout
from keras.layers.recurrent import LSTM
from keras.layers import Activation, Dense, Merge, Permute, Dropout
from keras.layers import LSTM
from keras.utils.data_utils import get_file
from keras.preprocessing.sequence import pad_sequences
from functools import reduce
@@ -94,8 +94,13 @@ def vectorize_stories(data, word_idx, story_maxlen, query_maxlen):
pad_sequences(Xq, maxlen=query_maxlen), np.array(Y))
path = get_file('babi-tasks-v1-2.tar.gz',
origin='http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz')
try:
path = get_file('babi-tasks-v1-2.tar.gz', origin='https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz')
except:
print('Error downloading dataset, please download it manually:\n'
'$ wget http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz\n'
'$ mv tasks_1-20_v1-2.tar.gz ~/.keras/datasets/babi-tasks-v1-2.tar.gz')
raise
tar = tarfile.open(path)
challenges = {
@@ -168,6 +173,7 @@ match = Sequential()
match.add(Merge([input_encoder_m, question_encoder],
mode='dot',
dot_axes=[2, 2]))
match.add(Activation('softmax'))
# output: (samples, story_maxlen, query_maxlen)
# embed the input into a single vector with size = story_maxlen:
input_encoder_c = Sequential()
@@ -200,5 +206,5 @@ answer.compile(optimizer='rmsprop', loss='categorical_crossentropy',
# Note: you could use a Graph model to avoid repeat the input twice
answer.fit([inputs_train, queries_train, inputs_train], answers_train,
batch_size=32,
nb_epoch=120,
epochs=120,
validation_data=([inputs_test, queries_test, inputs_test], answers_test))
+39 -36
Ver Arquivo
@@ -12,7 +12,7 @@ QA2 - Two Supporting Facts | 20 | 50.0
QA3 - Three Supporting Facts | 20 | 20.5
QA4 - Two Arg. Relations | 61 | 62.9
QA5 - Three Arg. Relations | 70 | 61.9
QA6 - Yes/No Questions | 48 | 50.7
QA6 - yes/No Questions | 48 | 50.7
QA7 - Counting | 49 | 78.9
QA8 - Lists/Sets | 45 | 77.2
QA9 - Simple Negation | 64 | 64.0
@@ -62,13 +62,12 @@ import re
import tarfile
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.utils.data_utils import get_file
from keras.layers.embeddings import Embedding
from keras.layers.core import Dense, Merge, Dropout, RepeatVector
from keras import layers
from keras.layers import recurrent
from keras.models import Sequential
from keras.models import Model
from keras.preprocessing.sequence import pad_sequences
@@ -125,28 +124,34 @@ def get_stories(f, only_supporting=False, max_length=None):
def vectorize_stories(data, word_idx, story_maxlen, query_maxlen):
X = []
Xq = []
Y = []
xs = []
xqs = []
ys = []
for story, query, answer in data:
x = [word_idx[w] for w in story]
xq = [word_idx[w] for w in query]
y = np.zeros(len(word_idx) + 1) # let's not forget that index 0 is reserved
y[word_idx[answer]] = 1
X.append(x)
Xq.append(xq)
Y.append(y)
return pad_sequences(X, maxlen=story_maxlen), pad_sequences(Xq, maxlen=query_maxlen), np.array(Y)
xs.append(x)
xqs.append(xq)
ys.append(y)
return pad_sequences(xs, maxlen=story_maxlen), pad_sequences(xqs, maxlen=query_maxlen), np.array(ys)
RNN = recurrent.LSTM
EMBED_HIDDEN_SIZE = 50
SENT_HIDDEN_SIZE = 100
QUERY_HIDDEN_SIZE = 100
QUERy_HIDDEN_SIZE = 100
BATCH_SIZE = 32
EPOCHS = 40
print('RNN / Embed / Sent / Query = {}, {}, {}, {}'.format(RNN, EMBED_HIDDEN_SIZE, SENT_HIDDEN_SIZE, QUERY_HIDDEN_SIZE))
print('RNN / Embed / Sent / Query = {}, {}, {}, {}'.format(RNN, EMBED_HIDDEN_SIZE, SENT_HIDDEN_SIZE, QUERy_HIDDEN_SIZE))
path = get_file('babi-tasks-v1-2.tar.gz', origin='http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz')
try:
path = get_file('babi-tasks-v1-2.tar.gz', origin='https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz')
except:
print('Error downloading dataset, please download it manually:\n'
'$ wget http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz\n'
'$ mv tasks_1-20_v1-2.tar.gz ~/.keras/datasets/babi-tasks-v1-2.tar.gz')
raise
tar = tarfile.open(path)
# Default QA1 with 1000 samples
# challenge = 'tasks_1-20_v1-2/en/qa1_single-supporting-fact_{}.txt'
@@ -166,40 +171,38 @@ word_idx = dict((c, i + 1) for i, c in enumerate(vocab))
story_maxlen = max(map(len, (x for x, _, _ in train + test)))
query_maxlen = max(map(len, (x for _, x, _ in train + test)))
X, Xq, Y = vectorize_stories(train, word_idx, story_maxlen, query_maxlen)
tX, tXq, tY = vectorize_stories(test, word_idx, story_maxlen, query_maxlen)
x, xq, y = vectorize_stories(train, word_idx, story_maxlen, query_maxlen)
tx, txq, ty = vectorize_stories(test, word_idx, story_maxlen, query_maxlen)
print('vocab = {}'.format(vocab))
print('X.shape = {}'.format(X.shape))
print('Xq.shape = {}'.format(Xq.shape))
print('Y.shape = {}'.format(Y.shape))
print('x.shape = {}'.format(x.shape))
print('xq.shape = {}'.format(xq.shape))
print('y.shape = {}'.format(y.shape))
print('story_maxlen, query_maxlen = {}, {}'.format(story_maxlen, query_maxlen))
print('Build model...')
sentrnn = Sequential()
sentrnn.add(Embedding(vocab_size, EMBED_HIDDEN_SIZE,
input_length=story_maxlen))
sentrnn.add(Dropout(0.3))
sentence = layers.Input(shape=(story_maxlen,), dtype='int32')
encoded_sentence = layers.Embedding(vocab_size, EMBED_HIDDEN_SIZE)(sentence)
encoded_sentence = layers.Dropout(0.3)(encoded_sentence)
qrnn = Sequential()
qrnn.add(Embedding(vocab_size, EMBED_HIDDEN_SIZE,
input_length=query_maxlen))
qrnn.add(Dropout(0.3))
qrnn.add(RNN(EMBED_HIDDEN_SIZE, return_sequences=False))
qrnn.add(RepeatVector(story_maxlen))
question = layers.Input(shape=(query_maxlen,), dtype='int32')
encoded_question = layers.Embedding(vocab_size, EMBED_HIDDEN_SIZE)(question)
encoded_question = layers.Dropout(0.3)(encoded_question)
encoded_question = RNN(EMBED_HIDDEN_SIZE)(encoded_question)
encoded_question = layers.RepeatVector(story_maxlen)(encoded_question)
model = Sequential()
model.add(Merge([sentrnn, qrnn], mode='sum'))
model.add(RNN(EMBED_HIDDEN_SIZE, return_sequences=False))
model.add(Dropout(0.3))
model.add(Dense(vocab_size, activation='softmax'))
merged = layers.add([encoded_sentence, encoded_question])
merged = RNN(EMBED_HIDDEN_SIZE)(merged)
merged = layers.Dropout(0.3)(merged)
preds = layers.Dense(vocab_size, activation='softmax')(merged)
model = Model([sentence, question], preds)
model.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])
print('Training')
model.fit([X, Xq], Y, batch_size=BATCH_SIZE, nb_epoch=EPOCHS, validation_split=0.05)
loss, acc = model.evaluate([tX, tXq], tY, batch_size=BATCH_SIZE)
model.fit([x, xq], y, batch_size=BATCH_SIZE, epochs=EPOCHS, validation_split=0.05)
loss, acc = model.evaluate([tx, txq], ty, batch_size=BATCH_SIZE)
print('Test loss / test accuracy = {:.4f} / {:.4f}'.format(loss, acc))
+41 -48
Ver Arquivo
@@ -1,58 +1,53 @@
'''Train a simple deep CNN on the CIFAR10 small images dataset.
GPU run command:
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python cifar10_cnn.py
GPU run command with Theano backend (with TensorFlow, the GPU is automatically used):
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatx=float32 python cifar10_cnn.py
It gets down to 0.65 test logloss in 25 epochs, and down to 0.55 after 50 epochs.
(it's still underfitting at that point, though).
Note: the data was pickled with Python 2, and some encoding issues might prevent you
from loading it in Python 3. You might have to load it in Python 2,
save it in a different format, load it in Python 3 and repickle it.
'''
from __future__ import print_function
import keras
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.optimizers import SGD
from keras.utils import np_utils
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
batch_size = 32
nb_classes = 10
nb_epoch = 200
num_classes = 10
epochs = 200
data_augmentation = True
# input image dimensions
img_rows, img_cols = 32, 32
# the CIFAR10 images are RGB
# The CIFAR10 images are RGB.
img_channels = 3
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# The data, shuffled and split between train and test sets:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
# Convert class vectors to binary class matrices.
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = Sequential()
model.add(Convolution2D(32, 3, 3, border_mode='same',
input_shape=(img_channels, img_rows, img_cols)))
model.add(Conv2D(32, (3, 3), padding='same',
input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Convolution2D(64, 3, 3, border_mode='same'))
model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 3, 3))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
@@ -61,31 +56,29 @@ model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Dense(num_classes))
model.add(Activation('softmax'))
# let's train the model using SGD + momentum (how original).
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
# Let's train the model using RMSprop
model.compile(loss='categorical_crossentropy',
optimizer=sgd,
optimizer='rmsprop',
metrics=['accuracy'])
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
if not data_augmentation:
print('Not using data augmentation.')
model.fit(X_train, Y_train,
model.fit(x_train, y_train,
batch_size=batch_size,
nb_epoch=nb_epoch,
validation_data=(X_test, Y_test),
epochs=epochs,
validation_data=(x_test, y_test),
shuffle=True)
else:
print('Using real-time data augmentation.')
# this will do preprocessing and realtime data augmentation
# This will do preprocessing and realtime data augmentation:
datagen = ImageDataGenerator(
featurewise_center=False, # set input mean to 0 over the dataset
samplewise_center=False, # set each sample mean to 0
@@ -98,13 +91,13 @@ else:
horizontal_flip=True, # randomly flip images
vertical_flip=False) # randomly flip images
# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(X_train)
# Compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied).
datagen.fit(x_train)
# fit the model on the batches generated by datagen.flow()
model.fit_generator(datagen.flow(X_train, Y_train,
batch_size=batch_size),
samples_per_epoch=X_train.shape[0],
nb_epoch=nb_epoch,
validation_data=(X_test, Y_test))
# Fit the model on the batches generated by datagen.flow().
model.fit_generator(datagen.flow(x_train, y_train,
batch_size=batch_size),
steps_per_epoch=x_train.shape[0] // batch_size,
epochs=epochs,
validation_data=(x_test, y_test))
+26 -75
Ver Arquivo
@@ -3,34 +3,26 @@
This script can run on CPU in a few minutes (with the TensorFlow backend).
Results example: http://i.imgur.com/4nj4KjN.jpg
Before running this script, download the weights for the VGG16 model at:
https://drive.google.com/file/d/0Bz7KyqmuGsilT0J5dmRCM0ROVHc/view?usp=sharing
(source: https://gist.github.com/baraldilorenzo/07d7802847aaad0a35d3)
and make sure the variable `weights_path` in this script matches the location of the file.
'''
from __future__ import print_function
from scipy.misc import imsave
import numpy as np
import time
import os
import h5py
from keras.models import Sequential
from keras.layers import Convolution2D, ZeroPadding2D, MaxPooling2D
from keras.applications import vgg16
from keras import backend as K
# dimensions of the generated pictures for each filter.
img_width = 128
img_height = 128
# path to the model weights file.
weights_path = 'vgg16_weights.h5'
# the name of the layer we want to visualize (see model definition below)
layer_name = 'conv5_1'
# the name of the layer we want to visualize
# (see model definition at keras/applications/vgg16.py)
layer_name = 'block5_conv1'
# util function to convert a tensor into a valid image
def deprocess_image(x):
# normalize tensor: center on 0., ensure std is 0.1
x -= x.mean()
@@ -43,70 +35,22 @@ def deprocess_image(x):
# convert to RGB array
x *= 255
x = x.transpose((1, 2, 0))
if K.image_data_format() == 'channels_first':
x = x.transpose((1, 2, 0))
x = np.clip(x, 0, 255).astype('uint8')
return x
# build the VGG16 network
model = Sequential()
model.add(ZeroPadding2D((1, 1), batch_input_shape=(1, 3, img_width, img_height)))
first_layer = model.layers[-1]
# this is a placeholder tensor that will contain our generated images
input_img = first_layer.input
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_2'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_2'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
# load the weights of the VGG16 networks
# (trained on ImageNet, won the ILSVRC competition in 2014)
# note: when there is a complete match between your model definition
# and your weight savefile, you can simply call model.load_weights(filename)
assert os.path.exists(weights_path), 'Model weights not found (see "weights_path" variable in script).'
f = h5py.File(weights_path)
for k in range(f.attrs['nb_layers']):
if k >= len(model.layers):
# we don't look at the last (fully-connected) layers in the savefile
break
g = f['layer_{}'.format(k)]
weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
model.layers[k].set_weights(weights)
f.close()
# build the VGG16 network with ImageNet weights
model = vgg16.VGG16(weights='imagenet', include_top=False)
print('Model loaded.')
model.summary()
# this is the placeholder for the input images
input_img = model.input
# get the symbolic outputs of each "key" layer (we gave them unique names).
layer_dict = dict([(layer.name, layer) for layer in model.layers])
layer_dict = dict([(layer.name, layer) for layer in model.layers[1:]])
def normalize(x):
@@ -124,7 +68,10 @@ for filter_index in range(0, 200):
# we build a loss function that maximizes the activation
# of the nth filter of the layer considered
layer_output = layer_dict[layer_name].output
loss = K.mean(layer_output[:, filter_index, :, :])
if K.image_data_format() == 'channels_first':
loss = K.mean(layer_output[:, filter_index, :, :])
else:
loss = K.mean(layer_output[:, :, :, filter_index])
# we compute the gradient of the input picture wrt this loss
grads = K.gradients(loss, input_img)[0]
@@ -139,7 +86,11 @@ for filter_index in range(0, 200):
step = 1.
# we start from a gray image with some random noise
input_img_data = np.random.random((1, 3, img_width, img_height)) * 20 + 128.
if K.image_data_format() == 'channels_first':
input_img_data = np.random.random((1, 3, img_width, img_height))
else:
input_img_data = np.random.random((1, img_width, img_height, 3))
input_img_data = (input_img_data - 0.5) * 20 + 128
# we run gradient ascent for 20 steps
for i in range(20):
+141
Ver Arquivo
@@ -0,0 +1,141 @@
""" This script demonstrates the use of a convolutional LSTM network.
This network is used to predict the next frame of an artificially
generated movie which contains moving squares.
"""
from keras.models import Sequential
from keras.layers.convolutional import Conv3D
from keras.layers.convolutional_recurrent import ConvLSTM2D
from keras.layers.normalization import BatchNormalization
import numpy as np
import pylab as plt
# We create a layer which take as input movies of shape
# (n_frames, width, height, channels) and returns a movie
# of identical shape.
seq = Sequential()
seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
input_shape=(None, 40, 40, 1),
padding='same', return_sequences=True))
seq.add(BatchNormalization())
seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
padding='same', return_sequences=True))
seq.add(BatchNormalization())
seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
padding='same', return_sequences=True))
seq.add(BatchNormalization())
seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
padding='same', return_sequences=True))
seq.add(BatchNormalization())
seq.add(Conv3D(filters=1, kernel_size=(3, 3, 3),
activation='sigmoid',
padding='same', data_format='channels_last'))
seq.compile(loss='binary_crossentropy', optimizer='adadelta')
# Artificial data generation:
# Generate movies with 3 to 7 moving squares inside.
# The squares are of shape 1x1 or 2x2 pixels,
# which move linearly over time.
# For convenience we first create movies with bigger width and height (80x80)
# and at the end we select a 40x40 window.
def generate_movies(n_samples=1200, n_frames=15):
row = 80
col = 80
noisy_movies = np.zeros((n_samples, n_frames, row, col, 1), dtype=np.float)
shifted_movies = np.zeros((n_samples, n_frames, row, col, 1),
dtype=np.float)
for i in range(n_samples):
# Add 3 to 7 moving squares
n = np.random.randint(3, 8)
for j in range(n):
# Initial position
xstart = np.random.randint(20, 60)
ystart = np.random.randint(20, 60)
# Direction of motion
directionx = np.random.randint(0, 3) - 1
directiony = np.random.randint(0, 3) - 1
# Size of the square
w = np.random.randint(2, 4)
for t in range(n_frames):
x_shift = xstart + directionx * t
y_shift = ystart + directiony * t
noisy_movies[i, t, x_shift - w: x_shift + w,
y_shift - w: y_shift + w, 0] += 1
# Make it more robust by adding noise.
# The idea is that if during inference,
# the value of the pixel is not exactly one,
# we need to train the network to be robust and still
# consider it as a pixel belonging to a square.
if np.random.randint(0, 2):
noise_f = (-1)**np.random.randint(0, 2)
noisy_movies[i, t,
x_shift - w - 1: x_shift + w + 1,
y_shift - w - 1: y_shift + w + 1,
0] += noise_f * 0.1
# Shift the ground truth by 1
x_shift = xstart + directionx * (t + 1)
y_shift = ystart + directiony * (t + 1)
shifted_movies[i, t, x_shift - w: x_shift + w,
y_shift - w: y_shift + w, 0] += 1
# Cut to a 40x40 window
noisy_movies = noisy_movies[::, ::, 20:60, 20:60, ::]
shifted_movies = shifted_movies[::, ::, 20:60, 20:60, ::]
noisy_movies[noisy_movies >= 1] = 1
shifted_movies[shifted_movies >= 1] = 1
return noisy_movies, shifted_movies
# Train the network
noisy_movies, shifted_movies = generate_movies(n_samples=1200)
seq.fit(noisy_movies[:1000], shifted_movies[:1000], batch_size=10,
epochs=300, validation_split=0.05)
# Testing the network on one movie
# feed it with the first 7 positions and then
# predict the new positions
which = 1004
track = noisy_movies[which][:7, ::, ::, ::]
for j in range(16):
new_pos = seq.predict(track[np.newaxis, ::, ::, ::, ::])
new = new_pos[::, -1, ::, ::, ::]
track = np.concatenate((track, new), axis=0)
# And then compare the predictions
# to the ground truth
track2 = noisy_movies[which][::, ::, ::, ::]
for i in range(15):
fig = plt.figure(figsize=(10, 5))
ax = fig.add_subplot(121)
if i >= 7:
ax.text(1, 3, 'Predictions !', fontsize=20, color='w')
else:
ax.text(1, 3, 'Inital trajectory', fontsize=20)
toplot = track[i, ::, ::, 0]
plt.imshow(toplot)
ax = fig.add_subplot(122)
plt.text(1, 3, 'Ground truth', fontsize=20)
toplot = track2[i, ::, ::, 0]
if i >= 2:
toplot = shifted_movies[which][i - 1, ::, ::, 0]
plt.imshow(toplot)
plt.savefig('%i_animate.png' % (i + 1))
+83 -97
Ver Arquivo
@@ -9,23 +9,23 @@ e.g.:
python deep_dream.py img/mypic.jpg results/dream
```
It is preferrable to run this script on GPU, for speed.
It is preferable to run this script on GPU, for speed.
If running on CPU, prefer the TensorFlow backend (much faster).
Example results: http://i.imgur.com/FX6ROg9.jpg
'''
from __future__ import print_function
from scipy.misc import imread, imresize, imsave
from keras.preprocessing.image import load_img, img_to_array
import numpy as np
from scipy.misc import imsave
from scipy.optimize import fmin_l_bfgs_b
import time
import argparse
import h5py
import os
from keras.models import Sequential
from keras.layers.convolutional import Convolution2D, ZeroPadding2D, MaxPooling2D
from keras.applications import vgg16
from keras import backend as K
from keras.layers import Input
parser = argparse.ArgumentParser(description='Deep Dreams with Keras.')
parser.add_argument('base_image_path', metavar='base', type=str,
@@ -38,22 +38,19 @@ base_image_path = args.base_image_path
result_prefix = args.result_prefix
# dimensions of the generated picture.
img_width = 600
img_height = 600
# path to the model weights file.
weights_path = 'vgg16_weights.h5'
img_width = 600
# some settings we found interesting
saved_settings = {
'bad_trip': {'features': {'conv4_1': 0.05,
'conv4_2': 0.01,
'conv4_3': 0.01},
'bad_trip': {'features': {'block4_conv1': 0.05,
'block4_conv2': 0.01,
'block4_conv3': 0.01},
'continuity': 0.1,
'dream_l2': 0.8,
'jitter': 5},
'dreamy': {'features': {'conv5_1': 0.05,
'conv5_2': 0.02},
'dreamy': {'features': {'block5_conv1': 0.05,
'block5_conv2': 0.02},
'continuity': 0.1,
'dream_l2': 0.02,
'jitter': 0},
@@ -61,85 +58,63 @@ saved_settings = {
# the settings we will use in this experiment
settings = saved_settings['dreamy']
# util function to open, resize and format pictures into appropriate tensors
def preprocess_image(image_path):
img = imresize(imread(image_path), (img_width, img_height))
img = img.transpose((2, 0, 1)).astype('float64')
# util function to open, resize and format pictures
# into appropriate tensors
img = load_img(image_path, target_size=(img_height, img_width))
img = img_to_array(img)
img = np.expand_dims(img, axis=0)
img = vgg16.preprocess_input(img)
return img
# util function to convert a tensor into a valid image
def deprocess_image(x):
x = x.transpose((1, 2, 0))
# util function to convert a tensor into a valid image
if K.image_data_format() == 'channels_first':
x = x.reshape((3, img_height, img_width))
x = x.transpose((1, 2, 0))
else:
x = x.reshape((img_height, img_width, 3))
# Remove zero-center by mean pixel
x[:, :, 0] += 103.939
x[:, :, 1] += 116.779
x[:, :, 2] += 123.68
# 'BGR'->'RGB'
x = x[:, :, ::-1]
x = np.clip(x, 0, 255).astype('uint8')
return x
# build the VGG16 network
model = Sequential()
model.add(ZeroPadding2D((1, 1), batch_input_shape=(1, 3, img_width, img_height)))
first_layer = model.layers[-1]
# this is a placeholder tensor that will contain our generated images
dream = first_layer.input
if K.image_data_format() == 'channels_first':
img_size = (3, img_height, img_width)
else:
img_size = (img_height, img_width, 3)
# this will contain our generated image
dream = Input(batch_shape=(1,) + img_size)
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_2'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_2'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
# load the weights of the VGG16 networks
# (trained on ImageNet, won the ILSVRC competition in 2014)
# note: when there is a complete match between your model definition
# and your weight savefile, you can simply call model.load_weights(filename)
assert os.path.exists(weights_path), 'Model weights not found (see "weights_path" variable in script).'
f = h5py.File(weights_path)
for k in range(f.attrs['nb_layers']):
if k >= len(model.layers):
# we don't look at the last (fully-connected) layers in the savefile
break
g = f['layer_{}'.format(k)]
weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
model.layers[k].set_weights(weights)
f.close()
# build the VGG16 network with our placeholder
# the model will be loaded with pre-trained ImageNet weights
model = vgg16.VGG16(input_tensor=dream,
weights='imagenet', include_top=False)
print('Model loaded.')
# get the symbolic outputs of each "key" layer (we gave them unique names).
layer_dict = dict([(layer.name, layer) for layer in model.layers])
# continuity loss util function
def continuity_loss(x):
# continuity loss util function
assert K.ndim(x) == 4
a = K.square(x[:, :, :img_width-1, :img_height-1] - x[:, :, 1:, :img_height-1])
b = K.square(x[:, :, :img_width-1, :img_height-1] - x[:, :, :img_width-1, 1:])
if K.image_data_format() == 'channels_first':
a = K.square(x[:, :, :img_height - 1, :img_width - 1] -
x[:, :, 1:, :img_width - 1])
b = K.square(x[:, :, :img_height - 1, :img_width - 1] -
x[:, :, :img_height - 1, 1:])
else:
a = K.square(x[:, :img_height - 1, :img_width - 1, :] -
x[:, 1:, :img_width - 1, :])
b = K.square(x[:, :img_height - 1, :img_width - 1, :] -
x[:, :img_height - 1, 1:, :])
return K.sum(K.pow(a + b, 1.25))
# define the loss
@@ -151,12 +126,15 @@ for layer_name in settings['features']:
x = layer_dict[layer_name].output
shape = layer_dict[layer_name].output_shape
# we avoid border artifacts by only involving non-border pixels in the loss
loss -= coeff * K.sum(K.square(x[:, :, 2: shape[2]-2, 2: shape[3]-2])) / np.prod(shape[1:])
if K.image_data_format() == 'channels_first':
loss -= coeff * K.sum(K.square(x[:, :, 2: shape[2] - 2, 2: shape[3] - 2])) / np.prod(shape[1:])
else:
loss -= coeff * K.sum(K.square(x[:, 2: shape[1] - 2, 2: shape[2] - 2, :])) / np.prod(shape[1:])
# add continuity loss (gives image local coherence, can result in an artful blur)
loss += settings['continuity'] * continuity_loss(dream) / (3 * img_width * img_height)
loss += settings['continuity'] * continuity_loss(dream) / np.prod(img_size)
# add image L2 norm to loss (prevents pixels from taking very high values, makes image darker)
loss += settings['dream_l2'] * K.sum(K.square(dream)) / (3 * img_width * img_height)
loss += settings['dream_l2'] * K.sum(K.square(dream)) / np.prod(img_size)
# feel free to further modify the loss as you see fit, to achieve new effects...
@@ -164,14 +142,16 @@ loss += settings['dream_l2'] * K.sum(K.square(dream)) / (3 * img_width * img_hei
grads = K.gradients(loss, dream)
outputs = [loss]
if type(grads) in {list, tuple}:
if isinstance(grads, (list, tuple)):
outputs += grads
else:
outputs.append(grads)
f_outputs = K.function([dream], outputs)
def eval_loss_and_grads(x):
x = x.reshape((1, 3, img_width, img_height))
x = x.reshape((1,) + img_size)
outs = f_outputs([x])
loss_value = outs[0]
if len(outs[1:]) == 1:
@@ -180,16 +160,21 @@ def eval_loss_and_grads(x):
grad_values = np.array(outs[1:]).flatten().astype('float64')
return loss_value, grad_values
# this Evaluator class makes it possible
# to compute loss and gradients in one pass
# while retrieving them via two separate functions,
# "loss" and "grads". This is done because scipy.optimize
# requires separate functions for loss and gradients,
# but computing them separately would be inefficient.
class Evaluator(object):
"""Loss and gradients evaluator.
This Evaluator class makes it possible
to compute loss and gradients in one pass
while retrieving them via two separate functions,
"loss" and "grads". This is done because scipy.optimize
requires separate functions for loss and gradients,
but computing them separately would be inefficient.
"""
def __init__(self):
self.loss_value = None
self.grads_values = None
self.grad_values = None
def loss(self, x):
assert self.loss_value is None
@@ -207,25 +192,26 @@ class Evaluator(object):
evaluator = Evaluator()
# run scipy-based optimization (L-BFGS) over the pixels of the generated image
# Run scipy-based optimization (L-BFGS) over the pixels of the generated image
# so as to minimize the loss
x = preprocess_image(base_image_path)
for i in range(5):
print('Start of iteration', i)
start_time = time.time()
# add a random jitter to the initial image. This will be reverted at decoding time
random_jitter = (settings['jitter'] * 2) * (np.random.random((3, img_width, img_height)) - 0.5)
# Add a random jitter to the initial image.
# This will be reverted at decoding time
random_jitter = (settings['jitter'] * 2) * (np.random.random(img_size) - 0.5)
x += random_jitter
# run L-BFGS for 7 steps
# Run L-BFGS for 7 steps
x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(),
fprime=evaluator.grads, maxfun=7)
print('Current loss value:', min_val)
# decode the dream and save it
x = x.reshape((3, img_width, img_height))
# Decode the dream and save it
x = x.reshape(img_size)
x -= random_jitter
img = deprocess_image(x)
img = deprocess_image(np.copy(x))
fname = result_prefix + '_at_iteration_%d.png' % i
imsave(fname, img)
end_time = time.time()
+488
Ver Arquivo
@@ -0,0 +1,488 @@
'''This example uses a convolutional stack followed by a recurrent stack
and a CTC logloss function to perform optical character recognition
of generated text images. I have no evidence of whether it actually
learns general shapes of text, or just is able to recognize all
the different fonts thrown at it...the purpose is more to demonstrate CTC
inside of Keras. Note that the font list may need to be updated
for the particular OS in use.
This starts off with 4 letter words. For the first 12 epochs, the
difficulty is gradually increased using the TextImageGenerator class
which is both a generator class for test/train data and a Keras
callback class. After 20 epochs, longer sequences are thrown at it
by recompiling the model to handle a wider image and rebuilding
the word list to include two words separated by a space.
The table below shows normalized edit distance values. Theano uses
a slightly different CTC implementation, hence the different results.
Norm. ED
Epoch | TF | TH
------------------------
10 0.027 0.064
15 0.038 0.035
20 0.043 0.045
25 0.014 0.019
This requires cairo and editdistance packages:
pip install cairocffi
pip install editdistance
Created by Mike Henry
https://github.com/mbhenry/
'''
import os
import itertools
import re
import datetime
import cairocffi as cairo
import editdistance
import numpy as np
from scipy import ndimage
import pylab
from keras import backend as K
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.layers import Input, Dense, Activation
from keras.layers import Reshape, Lambda, merge
from keras.models import Model
from keras.layers.recurrent import GRU
from keras.optimizers import SGD
from keras.utils.data_utils import get_file
from keras.preprocessing import image
import keras.callbacks
OUTPUT_DIR = 'image_ocr'
np.random.seed(55)
# this creates larger "blotches" of noise which look
# more realistic than just adding gaussian noise
# assumes greyscale with pixels ranging from 0 to 1
def speckle(img):
severity = np.random.uniform(0, 0.6)
blur = ndimage.gaussian_filter(np.random.randn(*img.shape) * severity, 1)
img_speck = (img + blur)
img_speck[img_speck > 1] = 1
img_speck[img_speck <= 0] = 0
return img_speck
# paints the string in a random location the bounding box
# also uses a random font, a slight random rotation,
# and a random amount of speckle noise
def paint_text(text, w, h, rotate=False, ud=False, multi_fonts=False):
surface = cairo.ImageSurface(cairo.FORMAT_RGB24, w, h)
with cairo.Context(surface) as context:
context.set_source_rgb(1, 1, 1) # White
context.paint()
# this font list works in Centos 7
if multi_fonts:
fonts = ['Century Schoolbook', 'Courier', 'STIX', 'URW Chancery L', 'FreeMono']
context.select_font_face(np.random.choice(fonts), cairo.FONT_SLANT_NORMAL,
np.random.choice([cairo.FONT_WEIGHT_BOLD, cairo.FONT_WEIGHT_NORMAL]))
else:
context.select_font_face('Courier', cairo.FONT_SLANT_NORMAL, cairo.FONT_WEIGHT_BOLD)
context.set_font_size(25)
box = context.text_extents(text)
border_w_h = (4, 4)
if box[2] > (w - 2 * border_w_h[1]) or box[3] > (h - 2 * border_w_h[0]):
raise IOError('Could not fit string into image. Max char count is too large for given image width.')
# teach the RNN translational invariance by
# fitting text box randomly on canvas, with some room to rotate
max_shift_x = w - box[2] - border_w_h[0]
max_shift_y = h - box[3] - border_w_h[1]
top_left_x = np.random.randint(0, int(max_shift_x))
if ud:
top_left_y = np.random.randint(0, int(max_shift_y))
else:
top_left_y = h // 2
context.move_to(top_left_x - int(box[0]), top_left_y - int(box[1]))
context.set_source_rgb(0, 0, 0)
context.show_text(text)
buf = surface.get_data()
a = np.frombuffer(buf, np.uint8)
a.shape = (h, w, 4)
a = a[:, :, 0] # grab single channel
a = a.astype(np.float32) / 255
a = np.expand_dims(a, 0)
if rotate:
a = image.random_rotation(a, 3 * (w - top_left_x) / w + 1)
a = speckle(a)
return a
def shuffle_mats_or_lists(matrix_list, stop_ind=None):
ret = []
assert all([len(i) == len(matrix_list[0]) for i in matrix_list])
len_val = len(matrix_list[0])
if stop_ind is None:
stop_ind = len_val
assert stop_ind <= len_val
a = range(stop_ind)
np.random.shuffle(a)
a += range(stop_ind, len_val)
for mat in matrix_list:
if isinstance(mat, np.ndarray):
ret.append(mat[a])
elif isinstance(mat, list):
ret.append([mat[i] for i in a])
else:
raise TypeError('shuffle_mats_or_lists only supports '
'numpy.array and list objects')
return ret
def text_to_labels(text, num_classes):
ret = []
for char in text:
if char >= 'a' and char <= 'z':
ret.append(ord(char) - ord('a'))
elif char == ' ':
ret.append(26)
return ret
# only a-z and space..probably not to difficult
# to expand to uppercase and symbols
def is_valid_str(in_str):
search = re.compile(r'[^a-z\ ]').search
return not bool(search(in_str))
# Uses generator functions to supply train/test with
# data. Image renderings are text are created on the fly
# each time with random perturbations
class TextImageGenerator(keras.callbacks.Callback):
def __init__(self, monogram_file, bigram_file, minibatch_size,
img_w, img_h, downsample_factor, val_split,
absolute_max_string_len=16):
self.minibatch_size = minibatch_size
self.img_w = img_w
self.img_h = img_h
self.monogram_file = monogram_file
self.bigram_file = bigram_file
self.downsample_factor = downsample_factor
self.val_split = val_split
self.blank_label = self.get_output_size() - 1
self.absolute_max_string_len = absolute_max_string_len
def get_output_size(self):
return 28
# num_words can be independent of the epoch size due to the use of generators
# as max_string_len grows, num_words can grow
def build_word_list(self, num_words, max_string_len=None, mono_fraction=0.5):
assert max_string_len <= self.absolute_max_string_len
assert num_words % self.minibatch_size == 0
assert (self.val_split * num_words) % self.minibatch_size == 0
self.num_words = num_words
self.string_list = [''] * self.num_words
tmp_string_list = []
self.max_string_len = max_string_len
self.Y_data = np.ones([self.num_words, self.absolute_max_string_len]) * -1
self.X_text = []
self.Y_len = [0] * self.num_words
# monogram file is sorted by frequency in english speech
with open(self.monogram_file, 'rt') as f:
for line in f:
if len(tmp_string_list) == int(self.num_words * mono_fraction):
break
word = line.rstrip()
if max_string_len == -1 or max_string_len is None or len(word) <= max_string_len:
tmp_string_list.append(word)
# bigram file contains common word pairings in english speech
with open(self.bigram_file, 'rt') as f:
lines = f.readlines()
for line in lines:
if len(tmp_string_list) == self.num_words:
break
columns = line.lower().split()
word = columns[0] + ' ' + columns[1]
if is_valid_str(word) and \
(max_string_len == -1 or max_string_len is None or len(word) <= max_string_len):
tmp_string_list.append(word)
if len(tmp_string_list) != self.num_words:
raise IOError('Could not pull enough words from supplied monogram and bigram files. ')
# interlace to mix up the easy and hard words
self.string_list[::2] = tmp_string_list[:self.num_words // 2]
self.string_list[1::2] = tmp_string_list[self.num_words // 2:]
for i, word in enumerate(self.string_list):
self.Y_len[i] = len(word)
self.Y_data[i, 0:len(word)] = text_to_labels(word, self.get_output_size())
self.X_text.append(word)
self.Y_len = np.expand_dims(np.array(self.Y_len), 1)
self.cur_val_index = self.val_split
self.cur_train_index = 0
# each time an image is requested from train/val/test, a new random
# painting of the text is performed
def get_batch(self, index, size, train):
# width and height are backwards from typical Keras convention
# because width is the time dimension when it gets fed into the RNN
if K.image_data_format() == 'channels_first':
X_data = np.ones([size, 1, self.img_w, self.img_h])
else:
X_data = np.ones([size, self.img_w, self.img_h, 1])
labels = np.ones([size, self.absolute_max_string_len])
input_length = np.zeros([size, 1])
label_length = np.zeros([size, 1])
source_str = []
for i in range(0, size):
# Mix in some blank inputs. This seems to be important for
# achieving translational invariance
if train and i > size - 4:
if K.image_data_format() == 'channels_first':
X_data[i, 0, 0:self.img_w, :] = self.paint_func('')[0, :, :].T
else:
X_data[i, 0:self.img_w, :, 0] = self.paint_func('',)[0, :, :].T
labels[i, 0] = self.blank_label
input_length[i] = self.img_w // self.downsample_factor - 2
label_length[i] = 1
source_str.append('')
else:
if K.image_data_format() == 'channels_first':
X_data[i, 0, 0:self.img_w, :] = self.paint_func(self.X_text[index + i])[0, :, :].T
else:
X_data[i, 0:self.img_w, :, 0] = self.paint_func(self.X_text[index + i])[0, :, :].T
labels[i, :] = self.Y_data[index + i]
input_length[i] = self.img_w // self.downsample_factor - 2
label_length[i] = self.Y_len[index + i]
source_str.append(self.X_text[index + i])
inputs = {'the_input': X_data,
'the_labels': labels,
'input_length': input_length,
'label_length': label_length,
'source_str': source_str # used for visualization only
}
outputs = {'ctc': np.zeros([size])} # dummy data for dummy loss function
return (inputs, outputs)
def next_train(self):
while 1:
ret = self.get_batch(self.cur_train_index, self.minibatch_size, train=True)
self.cur_train_index += self.minibatch_size
if self.cur_train_index >= self.val_split:
self.cur_train_index = self.cur_train_index % 32
(self.X_text, self.Y_data, self.Y_len) = shuffle_mats_or_lists(
[self.X_text, self.Y_data, self.Y_len], self.val_split)
yield ret
def next_val(self):
while 1:
ret = self.get_batch(self.cur_val_index, self.minibatch_size, train=False)
self.cur_val_index += self.minibatch_size
if self.cur_val_index >= self.num_words:
self.cur_val_index = self.val_split + self.cur_val_index % 32
yield ret
def on_train_begin(self, logs={}):
self.build_word_list(16000, 4, 1)
self.paint_func = lambda text: paint_text(text, self.img_w, self.img_h,
rotate=False, ud=False, multi_fonts=False)
def on_epoch_begin(self, epoch, logs={}):
# rebind the paint function to implement curriculum learning
if epoch >= 3 and epoch < 6:
self.paint_func = lambda text: paint_text(text, self.img_w, self.img_h,
rotate=False, ud=True, multi_fonts=False)
elif epoch >= 6 and epoch < 9:
self.paint_func = lambda text: paint_text(text, self.img_w, self.img_h,
rotate=False, ud=True, multi_fonts=True)
elif epoch >= 9:
self.paint_func = lambda text: paint_text(text, self.img_w, self.img_h,
rotate=True, ud=True, multi_fonts=True)
if epoch >= 21 and self.max_string_len < 12:
self.build_word_list(32000, 12, 0.5)
# the actual loss calc occurs here despite it not being
# an internal Keras loss function
def ctc_lambda_func(args):
y_pred, labels, input_length, label_length = args
# the 2 is critical here since the first couple outputs of the RNN
# tend to be garbage:
y_pred = y_pred[:, 2:, :]
return K.ctc_batch_cost(labels, y_pred, input_length, label_length)
# For a real OCR application, this should be beam search with a dictionary
# and language model. For this example, best path is sufficient.
def decode_batch(test_func, word_batch):
out = test_func([word_batch])[0]
ret = []
for j in range(out.shape[0]):
out_best = list(np.argmax(out[j, 2:], 1))
out_best = [k for k, g in itertools.groupby(out_best)]
# 26 is space, 27 is CTC blank char
outstr = ''
for c in out_best:
if c >= 0 and c < 26:
outstr += chr(c + ord('a'))
elif c == 26:
outstr += ' '
ret.append(outstr)
return ret
class VizCallback(keras.callbacks.Callback):
def __init__(self, run_name, test_func, text_img_gen, num_display_words=6):
self.test_func = test_func
self.output_dir = os.path.join(
OUTPUT_DIR, run_name)
self.text_img_gen = text_img_gen
self.num_display_words = num_display_words
if not os.path.exists(self.output_dir):
os.makedirs(self.output_dir)
def show_edit_distance(self, num):
num_left = num
mean_norm_ed = 0.0
mean_ed = 0.0
while num_left > 0:
word_batch = next(self.text_img_gen)[0]
num_proc = min(word_batch['the_input'].shape[0], num_left)
decoded_res = decode_batch(self.test_func, word_batch['the_input'][0:num_proc])
for j in range(0, num_proc):
edit_dist = editdistance.eval(decoded_res[j], word_batch['source_str'][j])
mean_ed += float(edit_dist)
mean_norm_ed += float(edit_dist) / len(word_batch['source_str'][j])
num_left -= num_proc
mean_norm_ed = mean_norm_ed / num
mean_ed = mean_ed / num
print('\nOut of %d samples: Mean edit distance: %.3f Mean normalized edit distance: %0.3f'
% (num, mean_ed, mean_norm_ed))
def on_epoch_end(self, epoch, logs={}):
self.model.save_weights(os.path.join(self.output_dir, 'weights%02d.h5' % (epoch)))
self.show_edit_distance(256)
word_batch = next(self.text_img_gen)[0]
res = decode_batch(self.test_func, word_batch['the_input'][0:self.num_display_words])
if word_batch['the_input'][0].shape[0] < 256:
cols = 2
else:
cols = 1
for i in range(self.num_display_words):
pylab.subplot(self.num_display_words // cols, cols, i + 1)
if K.image_data_format() == 'channels_first':
the_input = word_batch['the_input'][i, 0, :, :]
else:
the_input = word_batch['the_input'][i, :, :, 0]
pylab.imshow(the_input.T, cmap='Greys_r')
pylab.xlabel('Truth = \'%s\'\nDecoded = \'%s\'' % (word_batch['source_str'][i], res[i]))
fig = pylab.gcf()
fig.set_size_inches(10, 13)
pylab.savefig(os.path.join(self.output_dir, 'e%02d.png' % (epoch)))
pylab.close()
def train(run_name, start_epoch, stop_epoch, img_w):
# Input Parameters
img_h = 64
words_per_epoch = 16000
val_split = 0.2
val_words = int(words_per_epoch * (val_split))
# Network parameters
conv_filterss = 16
filter_size = 3
pool_size = 2
time_dense_size = 32
rnn_size = 512
if K.image_data_format() == 'channels_first':
input_shape = (1, img_w, img_h)
else:
input_shape = (img_w, img_h, 1)
fdir = os.path.dirname(get_file('wordlists.tgz',
origin='http://www.isosemi.com/datasets/wordlists.tgz', untar=True))
img_gen = TextImageGenerator(monogram_file=os.path.join(fdir, 'wordlist_mono_clean.txt'),
bigram_file=os.path.join(fdir, 'wordlist_bi_clean.txt'),
minibatch_size=32,
img_w=img_w,
img_h=img_h,
downsample_factor=(pool_size ** 2),
val_split=words_per_epoch - val_words
)
act = 'relu'
input_data = Input(name='the_input', shape=input_shape, dtype='float32')
inner = Convolution2D(conv_filterss, filter_size, filter_size, border_mode='same',
activation=act, init='he_normal', name='conv1')(input_data)
inner = MaxPooling2D(pool_size=(pool_size, pool_size), name='max1')(inner)
inner = Convolution2D(conv_filterss, filter_size, filter_size, border_mode='same',
activation=act, init='he_normal', name='conv2')(inner)
inner = MaxPooling2D(pool_size=(pool_size, pool_size), name='max2')(inner)
conv_to_rnn_dims = (img_w // (pool_size ** 2), (img_h // (pool_size ** 2)) * conv_filterss)
inner = Reshape(target_shape=conv_to_rnn_dims, name='reshape')(inner)
# cuts down input size going into RNN:
inner = Dense(time_dense_size, activation=act, name='dense1')(inner)
# Two layers of bidirecitonal GRUs
# GRU seems to work as well, if not better than LSTM:
gru_1 = GRU(rnn_size, return_sequences=True, init='he_normal', name='gru1')(inner)
gru_1b = GRU(rnn_size, return_sequences=True, go_backwards=True, init='he_normal', name='gru1_b')(inner)
gru1_merged = merge([gru_1, gru_1b], mode='sum')
gru_2 = GRU(rnn_size, return_sequences=True, init='he_normal', name='gru2')(gru1_merged)
gru_2b = GRU(rnn_size, return_sequences=True, go_backwards=True, init='he_normal', name='gru2_b')(gru1_merged)
# transforms RNN output to character activations:
inner = Dense(img_gen.get_output_size(), init='he_normal',
name='dense2')(merge([gru_2, gru_2b], mode='concat'))
y_pred = Activation('softmax', name='softmax')(inner)
Model(input=[input_data], output=y_pred).summary()
labels = Input(name='the_labels', shape=[img_gen.absolute_max_string_len], dtype='float32')
input_length = Input(name='input_length', shape=[1], dtype='int64')
label_length = Input(name='label_length', shape=[1], dtype='int64')
# Keras doesn't currently support loss funcs with extra parameters
# so CTC loss is implemented in a lambda layer
loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name='ctc')([y_pred, labels, input_length, label_length])
# clipnorm seems to speeds up convergence
sgd = SGD(lr=0.02, decay=1e-6, momentum=0.9, nesterov=True, clipnorm=5)
model = Model(input=[input_data, labels, input_length, label_length], output=[loss_out])
# the loss calc occurs elsewhere, so use a dummy lambda func for the loss
model.compile(loss={'ctc': lambda y_true, y_pred: y_pred}, optimizer=sgd)
if start_epoch > 0:
weight_file = os.path.join(OUTPUT_DIR, os.path.join(run_name, 'weights%02d.h5' % (start_epoch - 1)))
model.load_weights(weight_file)
# captures output of softmax so we can decode the output during visualization
test_func = K.function([input_data], [y_pred])
viz_cb = VizCallback(run_name, test_func, img_gen.next_val())
model.fit_generator(generator=img_gen.next_train(), samples_per_epoch=(words_per_epoch - val_words),
epochs=stop_epoch, validation_data=img_gen.next_val(), num_val_samples=val_words,
callbacks=[viz_cb, img_gen], initial_epoch=start_epoch)
if __name__ == '__main__':
run_name = datetime.datetime.now().strftime('%Y:%m:%d:%H:%M:%S')
train(run_name, 0, 20, 128)
# increase to wider images and start at epoch 20. The learned weights are reloaded
train(run_name, 20, 25, 512)
+20 -33
Ver Arquivo
@@ -6,56 +6,43 @@ Time per epoch on CPU (Core i7): ~150s.
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.preprocessing import sequence
from keras.models import Model
from keras.layers import Dense, Dropout, Embedding, LSTM, Input, merge
from keras.models import Sequential
from keras.layers import Dense, Dropout, Embedding, LSTM, Bidirectional
from keras.datasets import imdb
max_features = 20000
maxlen = 100 # cut texts after this number of words (among top max_features most common words)
# cut texts after this number of words
# (among top max_features most common words)
maxlen = 100
batch_size = 32
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features,
test_split=0.2)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')
print("Pad sequences (samples x time)")
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)
y_train = np.array(y_train)
y_test = np.array(y_test)
# this is the placeholder tensor for the input sequences
sequence = Input(shape=(maxlen,), dtype='int32')
# this embedding layer will transform the sequences of integers
# into vectors of size 128
embedded = Embedding(max_features, 128, input_length=maxlen)(sequence)
# apply forwards LSTM
forwards = LSTM(64)(embedded)
# apply backwards LSTM
backwards = LSTM(64, go_backwards=True)(embedded)
# concatenate the outputs of the 2 LSTMs
merged = merge([forwards, backwards], mode='concat', concat_axis=-1)
after_dp = Dropout(0.5)(merged)
output = Dense(1, activation='sigmoid')(after_dp)
model = Model(input=sequence, output=output)
model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen))
model.add(Bidirectional(LSTM(64)))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
# try using different optimizers and different optimizer configs
model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])
print('Train...')
model.fit(X_train, y_train,
model.fit(x_train, y_train,
batch_size=batch_size,
nb_epoch=4,
validation_data=[X_test, y_test])
epochs=4,
validation_data=[x_test, y_test])
+36 -39
Ver Arquivo
@@ -1,67 +1,64 @@
'''This example demonstrates the use of Convolution1D for text classification.
Gets to 0.835 test accuracy after 2 epochs. 100s/epoch on K520 GPU.
Gets to 0.89 test accuracy after 2 epochs.
90s/epoch on Intel i5 2.4Ghz CPU.
10s/epoch on Tesla K40 GPU.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.embeddings import Embedding
from keras.layers.convolutional import Convolution1D, MaxPooling1D
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import Conv1D, GlobalMaxPooling1D
from keras.datasets import imdb
# set parameters:
max_features = 5000
maxlen = 100
maxlen = 400
batch_size = 32
embedding_dims = 100
nb_filter = 250
filter_length = 3
embedding_dims = 50
filters = 250
kernel_size = 3
hidden_dims = 250
nb_epoch = 2
epochs = 2
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features,
test_split=0.2)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')
print('Pad sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)
print('Build model...')
model = Sequential()
# we start off with an efficient embedding layer which maps
# our vocab indices into embedding_dims dimensions
model.add(Embedding(max_features, embedding_dims, input_length=maxlen))
model.add(Dropout(0.25))
model.add(Embedding(max_features,
embedding_dims,
input_length=maxlen))
model.add(Dropout(0.2))
# we add a Convolution1D, which will learn nb_filter
# we add a Convolution1D, which will learn filters
# word group filters of size filter_length:
model.add(Convolution1D(nb_filter=nb_filter,
filter_length=filter_length,
border_mode='valid',
activation='relu',
subsample_length=1))
# we use standard max pooling (halving the output of the previous layer):
model.add(MaxPooling1D(pool_length=2))
# We flatten the output of the conv layer,
# so that we can add a vanilla dense layer:
model.add(Flatten())
model.add(Conv1D(filters,
kernel_size,
padding='valid',
activation='relu',
strides=1))
# we use max pooling:
model.add(GlobalMaxPooling1D())
# We add a vanilla hidden layer:
model.add(Dense(hidden_dims))
model.add(Dropout(0.25))
model.add(Dropout(0.2))
model.add(Activation('relu'))
# We project onto a single unit output layer, and squash it with a sigmoid:
@@ -69,9 +66,9 @@ model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
optimizer='adam',
metrics=['accuracy'])
model.fit(X_train, y_train,
model.fit(x_train, y_train,
batch_size=batch_size,
nb_epoch=nb_epoch,
validation_data=(X_test, y_test))
epochs=epochs,
validation_data=(x_test, y_test))
+24 -27
Ver Arquivo
@@ -4,34 +4,31 @@ classification task.
Gets to 0.8498 test accuracy after 2 epochs. 41s/epoch on K520 GPU.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM, GRU, SimpleRNN
from keras.layers.convolutional import Convolution1D, MaxPooling1D
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import LSTM
from keras.layers import Conv1D, MaxPooling1D
from keras.datasets import imdb
# Embedding
max_features = 20000
maxlen = 100
embedding_size = 128
# Convolution
filter_length = 3
nb_filter = 64
pool_length = 2
kernel_size = 5
filters = 64
pool_size = 4
# LSTM
lstm_output_size = 70
# Training
batch_size = 30
nb_epoch = 2
epochs = 2
'''
Note:
@@ -40,27 +37,27 @@ Only 2 epochs are needed as the dataset is very small.
'''
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features, test_split=0.2)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')
print('Pad sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)
print('Build model...')
model = Sequential()
model.add(Embedding(max_features, embedding_size, input_length=maxlen))
model.add(Dropout(0.25))
model.add(Convolution1D(nb_filter=nb_filter,
filter_length=filter_length,
border_mode='valid',
activation='relu',
subsample_length=1))
model.add(MaxPooling1D(pool_length=pool_length))
model.add(Conv1D(filters,
kernel_size,
padding='valid',
activation='relu',
strides=1))
model.add(MaxPooling1D(pool_size=pool_size))
model.add(LSTM(lstm_output_size))
model.add(Dense(1))
model.add(Activation('sigmoid'))
@@ -70,8 +67,8 @@ model.compile(loss='binary_crossentropy',
metrics=['accuracy'])
print('Train...')
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=nb_epoch,
validation_data=(X_test, y_test))
score, acc = model.evaluate(X_test, y_test, batch_size=batch_size)
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
validation_data=(x_test, y_test))
score, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)
+135
Ver Arquivo
@@ -0,0 +1,135 @@
'''This example demonstrates the use of fasttext for text classification
Based on Joulin et al's paper:
Bags of Tricks for Efficient Text Classification
https://arxiv.org/abs/1607.01759
Results on IMDB datasets with uni and bi-gram embeddings:
Uni-gram: 0.8813 test accuracy after 5 epochs. 8s/epoch on i7 cpu.
Bi-gram : 0.9056 test accuracy after 5 epochs. 2s/epoch on GTx 980M gpu.
'''
from __future__ import print_function
import numpy as np
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Embedding
from keras.layers import GlobalAveragePooling1D
from keras.datasets import imdb
def create_ngram_set(input_list, ngram_value=2):
"""
Extract a set of n-grams from a list of integers.
>>> create_ngram_set([1, 4, 9, 4, 1, 4], ngram_value=2)
{(4, 9), (4, 1), (1, 4), (9, 4)}
>>> create_ngram_set([1, 4, 9, 4, 1, 4], ngram_value=3)
[(1, 4, 9), (4, 9, 4), (9, 4, 1), (4, 1, 4)]
"""
return set(zip(*[input_list[i:] for i in range(ngram_value)]))
def add_ngram(sequences, token_indice, ngram_range=2):
"""
Augment the input list of list (sequences) by appending n-grams values.
Example: adding bi-gram
>>> sequences = [[1, 3, 4, 5], [1, 3, 7, 9, 2]]
>>> token_indice = {(1, 3): 1337, (9, 2): 42, (4, 5): 2017}
>>> add_ngram(sequences, token_indice, ngram_range=2)
[[1, 3, 4, 5, 1337, 2017], [1, 3, 7, 9, 2, 1337, 42]]
Example: adding tri-gram
>>> sequences = [[1, 3, 4, 5], [1, 3, 7, 9, 2]]
>>> token_indice = {(1, 3): 1337, (9, 2): 42, (4, 5): 2017, (7, 9, 2): 2018}
>>> add_ngram(sequences, token_indice, ngram_range=3)
[[1, 3, 4, 5, 1337], [1, 3, 7, 9, 2, 1337, 2018]]
"""
new_sequences = []
for input_list in sequences:
new_list = input_list[:]
for i in range(len(new_list) - ngram_range + 1):
for ngram_value in range(2, ngram_range + 1):
ngram = tuple(new_list[i:i + ngram_value])
if ngram in token_indice:
new_list.append(token_indice[ngram])
new_sequences.append(new_list)
return new_sequences
# Set parameters:
# ngram_range = 2 will add bi-grams features
ngram_range = 1
max_features = 20000
maxlen = 400
batch_size = 32
embedding_dims = 50
epochs = 5
print('Loading data...')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')
print('Average train sequence length: {}'.format(np.mean(list(map(len, x_train)), dtype=int)))
print('Average test sequence length: {}'.format(np.mean(list(map(len, x_test)), dtype=int)))
if ngram_range > 1:
print('Adding {}-gram features'.format(ngram_range))
# Create set of unique n-gram from the training set.
ngram_set = set()
for input_list in x_train:
for i in range(2, ngram_range + 1):
set_of_ngram = create_ngram_set(input_list, ngram_value=i)
ngram_set.update(set_of_ngram)
# Dictionary mapping n-gram token to a unique integer.
# Integer values are greater than max_features in order
# to avoid collision with existing features.
start_index = max_features + 1
token_indice = {v: k + start_index for k, v in enumerate(ngram_set)}
indice_token = {token_indice[k]: k for k in token_indice}
# max_features is the highest integer that could be found in the dataset.
max_features = np.max(list(indice_token.keys())) + 1
# Augmenting x_train and x_test with n-grams features
x_train = add_ngram(x_train, token_indice, ngram_range)
x_test = add_ngram(x_test, token_indice, ngram_range)
print('Average train sequence length: {}'.format(np.mean(list(map(len, x_train)), dtype=int)))
print('Average test sequence length: {}'.format(np.mean(list(map(len, x_test)), dtype=int)))
print('Pad sequences (samples x time)')
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)
print('Build model...')
model = Sequential()
# we start off with an efficient embedding layer which maps
# our vocab indices into embedding_dims dimensions
model.add(Embedding(max_features,
embedding_dims,
input_length=maxlen))
# we add a GlobalAveragePooling1D, which will average the embeddings
# of all words in the document
model.add(GlobalAveragePooling1D())
# We project onto a single unit output layer, and squash it with a sigmoid:
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.fit(x_train, y_train,
batch_size=batch_size,
epochs=epochs,
validation_data=(x_test, y_test))
+16 -26
Ver Arquivo
@@ -1,8 +1,6 @@
'''Trains a LSTM on the IMDB sentiment classification task.
The dataset is actually too small for LSTM to be of any advantage
compared to simpler, much faster methods such as TF-IDF+LogReg.
compared to simpler, much faster methods such as TF-IDF + LogReg.
Notes:
- RNNs are tricky. Choice of batch size is important,
@@ -13,15 +11,11 @@ Some configurations won't converge.
from what you see with CNNs/MLPs/etc.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.preprocessing import sequence
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM, SimpleRNN, GRU
from keras.layers import Dense, Embedding
from keras.layers import LSTM
from keras.datasets import imdb
max_features = 20000
@@ -29,23 +23,21 @@ maxlen = 80 # cut texts after this number of words (among top max_features most
batch_size = 32
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features,
test_split=0.2)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')
print('Pad sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)
print('Build model...')
model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen, dropout=0.2))
model.add(LSTM(128, dropout_W=0.2, dropout_U=0.2)) # try using a GRU instead, for fun
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.add(Embedding(max_features, 128))
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))
# try using different optimizers and different optimizer configs
model.compile(loss='binary_crossentropy',
@@ -53,11 +45,9 @@ model.compile(loss='binary_crossentropy',
metrics=['accuracy'])
print('Train...')
print(X_train.shape)
print(y_train.shape)
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=15,
validation_data=(X_test, y_test))
score, acc = model.evaluate(X_test, y_test,
model.fit(x_train, y_train, batch_size=batch_size, epochs=15,
validation_data=(x_test, y_test))
score, acc = model.evaluate(x_test, y_test,
batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)
+88
Ver Arquivo
@@ -0,0 +1,88 @@
'''Compare LSTM implementations on the IMDB sentiment classification task.
implementation=0 preprocesses input to the LSTM which typically results in
faster computations at the expense of increased peak memory usage as the
preprocessed input must be kept in memory.
implementation=1 does away with the preprocessing, meaning that it might take
a little longer, but should require less peak memory.
implementation=2 concatenates the input, output and forget gate's weights
into one, large matrix, resulting in faster computation time as the GPU can
utilize more cores, at the expense of reduced regularization because the same
dropout is shared across the gates.
Note that the relative performance of the different `consume_less` modes
can vary depending on your device, your model and the size of your data.
'''
import time
import numpy as np
import matplotlib.pyplot as plt
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Embedding, Dense, LSTM, Dropout
from keras.datasets import imdb
max_features = 20000
max_length = 80
embedding_dim = 256
batch_size = 128
epochs = 10
modes = [0, 1, 2]
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=max_features)
X_train = sequence.pad_sequences(X_train, max_length)
X_test = sequence.pad_sequences(X_test, max_length)
# Compile and train different models while meauring performance.
results = []
for mode in modes:
print('Testing mode: implementation={}'.format(mode))
model = Sequential()
model.add(Embedding(max_features, embedding_dim,
input_length=max_length))
model.add(Dropout(0.2))
model.add(LSTM(embedding_dim,
dropout=0.2,
recurrent_dropout=0.2,
implementation=mode))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
start_time = time.time()
history = model.fit(X_train, y_train,
batch_size=batch_size,
epochs=epochs,
validation_data=(X_test, y_test))
average_time_per_epoch = (time.time() - start_time) / epochs
results.append((history, average_time_per_epoch))
# Compare models' accuracy, loss and elapsed time per epoch.
plt.style.use('ggplot')
ax1 = plt.subplot2grid((2, 2), (0, 0))
ax1.set_title('Accuracy')
ax1.set_ylabel('Validation Accuracy')
ax1.set_xlabel('Epochs')
ax2 = plt.subplot2grid((2, 2), (1, 0))
ax2.set_title('Loss')
ax2.set_ylabel('Validation Loss')
ax2.set_xlabel('Epochs')
ax3 = plt.subplot2grid((2, 2), (0, 1), rowspan=2)
ax3.set_title('Time')
ax3.set_ylabel('Seconds')
for mode, result in zip(modes, results):
ax1.plot(result[0].epoch, result[0].history['val_acc'], label=mode)
ax2.plot(result[0].epoch, result[0].history['val_loss'], label=mode)
ax1.legend()
ax2.legend()
ax3.bar(np.arange(len(results)), [x[1] for x in results],
tick_label=modes, align='center')
plt.tight_layout()
plt.show()
+17 -15
Ver Arquivo
@@ -12,18 +12,19 @@ has at least ~100k characters. ~1M is better.
from __future__ import print_function
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM
from keras.layers import Dense, Activation
from keras.layers import LSTM
from keras.optimizers import RMSprop
from keras.utils.data_utils import get_file
import numpy as np
import random
import sys
path = get_file('nietzsche.txt', origin="https://s3.amazonaws.com/text-datasets/nietzsche.txt")
path = get_file('nietzsche.txt', origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
text = open(path).read().lower()
print('corpus length:', len(text))
chars = set(text)
chars = sorted(list(set(text)))
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))
@@ -47,31 +48,32 @@ for i, sentence in enumerate(sentences):
y[i, char_indices[next_chars[i]]] = 1
# build the model: 2 stacked LSTM
# build the model: a single LSTM
print('Build model...')
model = Sequential()
model.add(LSTM(512, return_sequences=True, input_shape=(maxlen, len(chars))))
model.add(Dropout(0.2))
model.add(LSTM(512, return_sequences=False))
model.add(Dropout(0.2))
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
def sample(a, temperature=1.0):
def sample(preds, temperature=1.0):
# helper function to sample an index from a probability array
a = np.log(a) / temperature
a = np.exp(a) / np.sum(np.exp(a))
return np.argmax(np.random.multinomial(1, a, 1))
preds = np.asarray(preds).astype('float64')
preds = np.log(preds) / temperature
exp_preds = np.exp(preds)
preds = exp_preds / np.sum(exp_preds)
probas = np.random.multinomial(1, preds, 1)
return np.argmax(probas)
# train the model, output generated text after each iteration
for iteration in range(1, 60):
print()
print('-' * 50)
print('Iteration', iteration)
model.fit(X, y, batch_size=128, nb_epoch=1)
model.fit(X, y, batch_size=128, epochs=1)
start_index = random.randint(0, len(text) - maxlen - 1)
+315
Ver Arquivo
@@ -0,0 +1,315 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Train an Auxiliary Classifier Generative Adversarial Network (ACGAN) on the
MNIST dataset. See https://arxiv.org/abs/1610.09585 for more details.
You should start to see reasonable images after ~5 epochs, and good images
by ~15 epochs. You should use a GPU, as the convolution-heavy operations are
very slow on the CPU. Prefer the TensorFlow backend if you plan on iterating,
as the compilation time can be a blocker using Theano.
Timings:
Hardware | Backend | Time / Epoch
-------------------------------------------
CPU | TF | 3 hrs
Titan X (maxwell) | TF | 4 min
Titan X (maxwell) | TH | 7 min
Consult https://github.com/lukedeo/keras-acgan for more information and
example output
"""
from __future__ import print_function
from collections import defaultdict
try:
import cPickle as pickle
except ImportError:
import pickle
from PIL import Image
from six.moves import range
import keras.backend as K
from keras.datasets import mnist
from keras import layers
from keras.layers import Input, Dense, Reshape, Flatten, Embedding, Dropout
from keras.layers.advanced_activations import LeakyReLU
from keras.layers.convolutional import UpSampling2D, Conv2D
from keras.models import Sequential, Model
from keras.optimizers import Adam
from keras.utils.generic_utils import Progbar
import numpy as np
np.random.seed(1337)
K.set_image_data_format('channels_first')
def build_generator(latent_size):
# we will map a pair of (z, L), where z is a latent vector and L is a
# label drawn from P_c, to image space (..., 1, 28, 28)
cnn = Sequential()
cnn.add(Dense(1024, input_dim=latent_size, activation='relu'))
cnn.add(Dense(128 * 7 * 7, activation='relu'))
cnn.add(Reshape((128, 7, 7)))
# upsample to (..., 14, 14)
cnn.add(UpSampling2D(size=(2, 2)))
cnn.add(Conv2D(256, 5, padding='same',
activation='relu',
kernel_initializer='glorot_normal'))
# upsample to (..., 28, 28)
cnn.add(UpSampling2D(size=(2, 2)))
cnn.add(Conv2D(128, 5, padding='same',
activation='relu',
kernel_initializer='glorot_normal'))
# take a channel axis reduction
cnn.add(Conv2D(1, 2, padding='same',
activation='tanh',
kernel_initializer='glorot_normal'))
# this is the z space commonly refered to in GAN papers
latent = Input(shape=(latent_size, ))
# this will be our label
image_class = Input(shape=(1,), dtype='int32')
# 10 classes in MNIST
cls = Flatten()(Embedding(10, latent_size,
embeddings_initializer='glorot_normal')(image_class))
# hadamard product between z-space and a class conditional embedding
h = layers.multiply([latent, cls])
fake_image = cnn(h)
return Model([latent, image_class], fake_image)
def build_discriminator():
# build a relatively standard conv net, with LeakyReLUs as suggested in
# the reference paper
cnn = Sequential()
cnn.add(Conv2D(32, 3, padding='same', strides=2,
input_shape=(1, 28, 28)))
cnn.add(LeakyReLU())
cnn.add(Dropout(0.3))
cnn.add(Conv2D(64, 3, padding='same', strides=2))
cnn.add(LeakyReLU())
cnn.add(Dropout(0.3))
cnn.add(Conv2D(128, 3, padding='same', strides=2))
cnn.add(LeakyReLU())
cnn.add(Dropout(0.3))
cnn.add(Conv2D(256, 3, padding='same', strides=1))
cnn.add(LeakyReLU())
cnn.add(Dropout(0.3))
cnn.add(Flatten())
image = Input(shape=(1, 28, 28))
features = cnn(image)
# first output (name=generation) is whether or not the discriminator
# thinks the image that is being shown is fake, and the second output
# (name=auxiliary) is the class that the discriminator thinks the image
# belongs to.
fake = Dense(1, activation='sigmoid', name='generation')(features)
aux = Dense(10, activation='softmax', name='auxiliary')(features)
return Model(image, [fake, aux])
if __name__ == '__main__':
# batch and latent size taken from the paper
epochs = 50
batch_size = 100
latent_size = 100
# Adam parameters suggested in https://arxiv.org/abs/1511.06434
adam_lr = 0.0002
adam_beta_1 = 0.5
# build the discriminator
discriminator = build_discriminator()
discriminator.compile(
optimizer=Adam(lr=adam_lr, beta_1=adam_beta_1),
loss=['binary_crossentropy', 'sparse_categorical_crossentropy']
)
# build the generator
generator = build_generator(latent_size)
generator.compile(optimizer=Adam(lr=adam_lr, beta_1=adam_beta_1),
loss='binary_crossentropy')
latent = Input(shape=(latent_size, ))
image_class = Input(shape=(1,), dtype='int32')
# get a fake image
fake = generator([latent, image_class])
# we only want to be able to train generation for the combined model
discriminator.trainable = False
fake, aux = discriminator(fake)
combined = Model([latent, image_class], [fake, aux])
combined.compile(
optimizer=Adam(lr=adam_lr, beta_1=adam_beta_1),
loss=['binary_crossentropy', 'sparse_categorical_crossentropy']
)
# get our mnist data, and force it to be of shape (..., 1, 28, 28) with
# range [-1, 1]
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = (X_train.astype(np.float32) - 127.5) / 127.5
X_train = np.expand_dims(X_train, axis=1)
X_test = (X_test.astype(np.float32) - 127.5) / 127.5
X_test = np.expand_dims(X_test, axis=1)
num_train, num_test = X_train.shape[0], X_test.shape[0]
train_history = defaultdict(list)
test_history = defaultdict(list)
for epoch in range(epochs):
print('Epoch {} of {}'.format(epoch + 1, epochs))
num_batches = int(X_train.shape[0] / batch_size)
progress_bar = Progbar(target=num_batches)
epoch_gen_loss = []
epoch_disc_loss = []
for index in range(num_batches):
progress_bar.update(index)
# generate a new batch of noise
noise = np.random.uniform(-1, 1, (batch_size, latent_size))
# get a batch of real images
image_batch = X_train[index * batch_size:(index + 1) * batch_size]
label_batch = y_train[index * batch_size:(index + 1) * batch_size]
# sample some labels from p_c
sampled_labels = np.random.randint(0, 10, batch_size)
# generate a batch of fake images, using the generated labels as a
# conditioner. We reshape the sampled labels to be
# (batch_size, 1) so that we can feed them into the embedding
# layer as a length one sequence
generated_images = generator.predict(
[noise, sampled_labels.reshape((-1, 1))], verbose=0)
X = np.concatenate((image_batch, generated_images))
y = np.array([1] * batch_size + [0] * batch_size)
aux_y = np.concatenate((label_batch, sampled_labels), axis=0)
# see if the discriminator can figure itself out...
epoch_disc_loss.append(discriminator.train_on_batch(X, [y, aux_y]))
# make new noise. we generate 2 * batch size here such that we have
# the generator optimize over an identical number of images as the
# discriminator
noise = np.random.uniform(-1, 1, (2 * batch_size, latent_size))
sampled_labels = np.random.randint(0, 10, 2 * batch_size)
# we want to train the genrator to trick the discriminator
# For the generator, we want all the {fake, not-fake} labels to say
# not-fake
trick = np.ones(2 * batch_size)
epoch_gen_loss.append(combined.train_on_batch(
[noise, sampled_labels.reshape((-1, 1))],
[trick, sampled_labels]))
print('\nTesting for epoch {}:'.format(epoch + 1))
# evaluate the testing loss here
# generate a new batch of noise
noise = np.random.uniform(-1, 1, (num_test, latent_size))
# sample some labels from p_c and generate images from them
sampled_labels = np.random.randint(0, 10, num_test)
generated_images = generator.predict(
[noise, sampled_labels.reshape((-1, 1))], verbose=False)
X = np.concatenate((X_test, generated_images))
y = np.array([1] * num_test + [0] * num_test)
aux_y = np.concatenate((y_test, sampled_labels), axis=0)
# see if the discriminator can figure itself out...
discriminator_test_loss = discriminator.evaluate(
X, [y, aux_y], verbose=False)
discriminator_train_loss = np.mean(np.array(epoch_disc_loss), axis=0)
# make new noise
noise = np.random.uniform(-1, 1, (2 * num_test, latent_size))
sampled_labels = np.random.randint(0, 10, 2 * num_test)
trick = np.ones(2 * num_test)
generator_test_loss = combined.evaluate(
[noise, sampled_labels.reshape((-1, 1))],
[trick, sampled_labels], verbose=False)
generator_train_loss = np.mean(np.array(epoch_gen_loss), axis=0)
# generate an epoch report on performance
train_history['generator'].append(generator_train_loss)
train_history['discriminator'].append(discriminator_train_loss)
test_history['generator'].append(generator_test_loss)
test_history['discriminator'].append(discriminator_test_loss)
print('{0:<22s} | {1:4s} | {2:15s} | {3:5s}'.format(
'component', *discriminator.metrics_names))
print('-' * 65)
ROW_FMT = '{0:<22s} | {1:<4.2f} | {2:<15.2f} | {3:<5.2f}'
print(ROW_FMT.format('generator (train)',
*train_history['generator'][-1]))
print(ROW_FMT.format('generator (test)',
*test_history['generator'][-1]))
print(ROW_FMT.format('discriminator (train)',
*train_history['discriminator'][-1]))
print(ROW_FMT.format('discriminator (test)',
*test_history['discriminator'][-1]))
# save weights every epoch
generator.save_weights(
'params_generator_epoch_{0:03d}.hdf5'.format(epoch), True)
discriminator.save_weights(
'params_discriminator_epoch_{0:03d}.hdf5'.format(epoch), True)
# generate some digits to display
noise = np.random.uniform(-1, 1, (100, latent_size))
sampled_labels = np.array([
[i] * 10 for i in range(10)
]).reshape(-1, 1)
# get a batch to display
generated_images = generator.predict(
[noise, sampled_labels], verbose=0)
# arrange them into a grid
img = (np.concatenate([r.reshape(-1, 28)
for r in np.split(generated_images, 10)
], axis=-1) * 127.5 + 127.5).astype(np.uint8)
Image.fromarray(img).save(
'plot_epoch_{0:03d}_generated.png'.format(epoch))
pickle.dump({'train': train_history, 'test': test_history},
open('acgan-history.pkl', 'wb'))
+38 -45
Ver Arquivo
@@ -6,69 +6,62 @@ Gets to 99.25% test accuracy after 12 epochs
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.utils import np_utils
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
batch_size = 128
nb_classes = 10
nb_epoch = 12
num_classes = 10
epochs = 12
# input image dimensions
img_rows, img_cols = 28, 28
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
nb_pool = 2
# convolution kernel size
nb_conv = 3
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
(x_train, y_train), (x_test, y_test) = mnist.load_data()
X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
if K.image_data_format() == 'channels_first':
x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
input_shape = (1, img_rows, img_cols)
else:
x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = Sequential()
model.add(Convolution2D(nb_filters, nb_conv, nb_conv,
border_mode='valid',
input_shape=(1, img_rows, img_cols)))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, nb_conv, nb_conv))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adadelta',
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
verbose=1, validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test, verbose=0)
print('Test score:', score[0])
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
verbose=1, validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
+88
Ver Arquivo
@@ -0,0 +1,88 @@
"""This is an example of using Hierarchical RNN (HRNN) to classify MNIST digits.
HRNNs can learn across multiple levels of temporal hiearchy over a complex sequence.
Usually, the first recurrent layer of an HRNN encodes a sentence (e.g. of word vectors)
into a sentence vector. The second recurrent layer then encodes a sequence of
such vectors (encoded by the first layer) into a document vector. This
document vector is considered to preserve both the word-level and
sentence-level structure of the context.
# References
- [A Hierarchical Neural Autoencoder for Paragraphs and Documents](https://arxiv.org/abs/1506.01057)
Encodes paragraphs and documents with HRNN.
Results have shown that HRNN outperforms standard
RNNs and may play some role in more sophisticated generation tasks like
summarization or question answering.
- [Hierarchical recurrent neural network for skeleton based action recognition](http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7298714)
Achieved state-of-the-art results on skeleton based action recognition with 3 levels
of bidirectional HRNN combined with fully connected layers.
In the below MNIST example the first LSTM layer first encodes every
column of pixels of shape (28, 1) to a column vector of shape (128,). The second LSTM
layer encodes then these 28 column vectors of shape (28, 128) to a image vector
representing the whole image. A final Dense layer is added for prediction.
After 5 epochs: train acc: 0.9858, val acc: 0.9864
"""
from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Model
from keras.layers import Input, Dense, TimeDistributed
from keras.layers import LSTM
# Training parameters.
batch_size = 32
num_classes = 10
epochs = 5
# Embedding dimensions.
row_hidden = 128
col_hidden = 128
# The data, shuffled and split between train and test sets.
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Reshapes data to 4D for Hierarchical RNN.
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# Converts class vectors to binary class matrices.
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
row, col, pixel = x_train.shape[1:]
# 4D input.
x = Input(shape=(row, col, pixel))
# Encodes a row of pixels using TimeDistributed Wrapper.
encoded_rows = TimeDistributed(LSTM(row_hidden))(x)
# Encodes columns of encoded rows.
encoded_columns = LSTM(col_hidden)(encoded_rows)
# Final predictions and model.
prediction = Dense(num_classes, activation='softmax')(encoded_columns)
model = Model(x, prediction)
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# Training.
model.fit(x_train, y_train,
batch_size=batch_size, epochs=epochs,
verbose=1, validation_data=(x_test, y_test))
# Evaluation.
scores = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])
+27 -27
Ver Arquivo
@@ -3,7 +3,7 @@ with pixel-by-pixel sequential MNIST in
"A Simple Way to Initialize Recurrent Networks of Rectified Linear Units"
by Quoc V. Le, Navdeep Jaitly, Geoffrey E. Hinton
arXiv:1504.00941v2 [cs.NE] 7 Apr 201
arxiv:1504.00941v2 [cs.NE] 7 Apr 2015
http://arxiv.org/pdf/1504.00941v2.pdf
Optimizer is replaced with RMSprop which yields more stable and steady
@@ -15,56 +15,56 @@ Reaches 0.93 train/test accuracy after 900 epochs
from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.initializations import normal, identity
from keras.layers.recurrent import SimpleRNN
from keras.layers import Dense, Activation
from keras.layers import SimpleRNN
from keras import initializers
from keras.optimizers import RMSprop
from keras.utils import np_utils
batch_size = 32
nb_classes = 10
nb_epochs = 200
num_classes = 10
epochs = 200
hidden_units = 100
learning_rate = 1e-6
clip_norm = 1.0
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
(x_train, y_train), (x_test, y_test) = mnist.load_data()
X_train = X_train.reshape(X_train.shape[0], -1, 1)
X_test = X_test.reshape(X_test.shape[0], -1, 1)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
x_train = x_train.reshape(x_train.shape[0], -1, 1)
x_test = x_test.reshape(x_test.shape[0], -1, 1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
print('Evaluate IRNN...')
model = Sequential()
model.add(SimpleRNN(output_dim=hidden_units,
init=lambda shape, name: normal(shape, scale=0.001, name=name),
inner_init=lambda shape, name: identity(shape, scale=1.0, name=name),
model.add(SimpleRNN(hidden_units,
kernel_initializer=initializers.RandomNormal(stddev=0.001),
recurrent_initializer=initializers.Identity(gain=1.0),
activation='relu',
input_shape=X_train.shape[1:]))
model.add(Dense(nb_classes))
input_shape=x_train.shape[1:]))
model.add(Dense(num_classes))
model.add(Activation('softmax'))
rmsprop = RMSprop(lr=learning_rate)
model.compile(loss='categorical_crossentropy',
optimizer=rmsprop,
metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epochs,
verbose=1, validation_data=(X_test, Y_test))
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
verbose=1, validation_data=(x_test, y_test))
scores = model.evaluate(X_test, Y_test, verbose=0)
scores = model.evaluate(x_test, y_test, verbose=0)
print('IRNN test score:', scores[0])
print('IRNN test accuracy:', scores[1])
+24 -29
Ver Arquivo
@@ -6,45 +6,40 @@ Gets to 98.40% test accuracy after 20 epochs
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD, Adam, RMSprop
from keras.utils import np_utils
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop
batch_size = 128
nb_classes = 10
nb_epoch = 20
num_classes = 10
epochs = 20
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
(x_train, y_train), (x_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = Sequential()
model.add(Dense(512, input_shape=(784,)))
model.add(Activation('relu'))
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dropout(0.2))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10))
model.add(Activation('softmax'))
model.add(Dense(10, activation='softmax'))
model.summary()
@@ -52,9 +47,9 @@ model.compile(loss='categorical_crossentropy',
optimizer=RMSprop(),
metrics=['accuracy'])
history = model.fit(X_train, Y_train,
batch_size=batch_size, nb_epoch=nb_epoch,
verbose=1, validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test, verbose=0)
print('Test score:', score[0])
history = model.fit(x_train, y_train,
batch_size=batch_size, epochs=epochs,
verbose=1, validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
+386
Ver Arquivo
@@ -0,0 +1,386 @@
'''This is an implementation of Net2Net experiment with MNIST in
'Net2Net: Accelerating Learning via Knowledge Transfer'
by Tianqi Chen, Ian Goodfellow, and Jonathon Shlens
arXiv:1511.05641v4 [cs.LG] 23 Apr 2016
http://arxiv.org/abs/1511.05641
Notes
- What:
+ Net2Net is a group of methods to transfer knowledge from a teacher neural
net to a student net,so that the student net can be trained faster than
from scratch.
+ The paper discussed two specific methods of Net2Net, i.e. Net2WiderNet
and Net2DeeperNet.
+ Net2WiderNet replaces a model with an equivalent wider model that has
more units in each hidden layer.
+ Net2DeeperNet replaces a model with an equivalent deeper model.
+ Both are based on the idea of 'function-preserving transformations of
neural nets'.
- Why:
+ Enable fast exploration of multiple neural nets in experimentation and
design process,by creating a series of wider and deeper models with
transferable knowledge.
+ Enable 'lifelong learning system' by gradually adjusting model complexity
to data availability,and reusing transferable knowledge.
Experiments
- Teacher model: a basic CNN model trained on MNIST for 3 epochs.
- Net2WiderNet exepriment:
+ Student model has a wider Conv2D layer and a wider FC layer.
+ Comparison of 'random-padding' vs 'net2wider' weight initialization.
+ With both methods, student model should immediately perform as well as
teacher model, but 'net2wider' is slightly better.
- Net2DeeperNet experiment:
+ Student model has an extra Conv2D layer and an extra FC layer.
+ Comparison of 'random-init' vs 'net2deeper' weight initialization.
+ Starting performance of 'net2deeper' is better than 'random-init'.
- Hyper-parameters:
+ SGD with momentum=0.9 is used for training teacher and student models.
+ Learning rate adjustment: it's suggested to reduce learning rate
to 1/10 for student model.
+ Addition of noise in 'net2wider' is used to break weight symmetry
and thus enable full capacity of student models. It is optional
when a Dropout layer is used.
Results
- Tested with 'Theano' backend and 'channels_first' image_data_format.
- Running on GPU GeForce GTX 980M
- Performance Comparisons - validation loss values during first 3 epochs:
(1) teacher_model: 0.075 0.041 0.041
(2) wider_random_pad: 0.036 0.034 0.032
(3) wider_net2wider: 0.032 0.030 0.030
(4) deeper_random_init: 0.061 0.043 0.041
(5) deeper_net2deeper: 0.032 0.031 0.029
'''
from __future__ import print_function
from six.moves import xrange
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dense, Flatten
from keras.optimizers import SGD
from keras.datasets import mnist
if keras.backend.image_data_format() == 'channels_first':
input_shape = (1, 28, 28) # image shape
else:
input_shape = (28, 28, 1) # image shape
num_class = 10 # number of class
# load and pre-process data
def preprocess_input(x):
return x.reshape((-1, ) + input_shape) / 255.
def preprocess_output(y):
return keras.utils.to_categorical(y)
(train_x, train_y), (validation_x, validation_y) = mnist.load_data()
train_x, validation_x = map(preprocess_input, [train_x, validation_x])
train_y, validation_y = map(preprocess_output, [train_y, validation_y])
print('Loading MNIST data...')
print('train_x shape:', train_x.shape, 'train_y shape:', train_y.shape)
print('validation_x shape:', validation_x.shape,
'validation_y shape', validation_y.shape)
# knowledge transfer algorithms
def wider2net_conv2d(teacher_w1, teacher_b1, teacher_w2, new_width, init):
'''Get initial weights for a wider conv2d layer with a bigger filters,
by 'random-padding' or 'net2wider'.
# Arguments
teacher_w1: `weight` of conv2d layer to become wider,
of shape (filters1, num_channel1, kh1, kw1)
teacher_b1: `bias` of conv2d layer to become wider,
of shape (filters1, )
teacher_w2: `weight` of next connected conv2d layer,
of shape (filters2, num_channel2, kh2, kw2)
new_width: new `filters` for the wider conv2d layer
init: initialization algorithm for new weights,
either 'random-pad' or 'net2wider'
'''
assert teacher_w1.shape[0] == teacher_w2.shape[1], (
'successive layers from teacher model should have compatible shapes')
assert teacher_w1.shape[0] == teacher_b1.shape[0], (
'weight and bias from same layer should have compatible shapes')
assert new_width > teacher_w1.shape[0], (
'new width (filters) should be bigger than the existing one')
n = new_width - teacher_w1.shape[0]
if init == 'random-pad':
new_w1 = np.random.normal(0, 0.1, size=(n, ) + teacher_w1.shape[1:])
new_b1 = np.ones(n) * 0.1
new_w2 = np.random.normal(0, 0.1, size=(
teacher_w2.shape[0], n) + teacher_w2.shape[2:])
elif init == 'net2wider':
index = np.random.randint(teacher_w1.shape[0], size=n)
factors = np.bincount(index)[index] + 1.
new_w1 = teacher_w1[index, :, :, :]
new_b1 = teacher_b1[index]
new_w2 = teacher_w2[:, index, :, :] / factors.reshape((1, -1, 1, 1))
else:
raise ValueError('Unsupported weight initializer: %s' % init)
student_w1 = np.concatenate((teacher_w1, new_w1), axis=0)
if init == 'random-pad':
student_w2 = np.concatenate((teacher_w2, new_w2), axis=1)
elif init == 'net2wider':
# add small noise to break symmetry, so that student model will have
# full capacity later
noise = np.random.normal(0, 5e-2 * new_w2.std(), size=new_w2.shape)
student_w2 = np.concatenate((teacher_w2, new_w2 + noise), axis=1)
student_w2[:, index, :, :] = new_w2
student_b1 = np.concatenate((teacher_b1, new_b1), axis=0)
return student_w1, student_b1, student_w2
def wider2net_fc(teacher_w1, teacher_b1, teacher_w2, new_width, init):
'''Get initial weights for a wider fully connected (dense) layer
with a bigger nout, by 'random-padding' or 'net2wider'.
# Arguments
teacher_w1: `weight` of fc layer to become wider,
of shape (nin1, nout1)
teacher_b1: `bias` of fc layer to become wider,
of shape (nout1, )
teacher_w2: `weight` of next connected fc layer,
of shape (nin2, nout2)
new_width: new `nout` for the wider fc layer
init: initialization algorithm for new weights,
either 'random-pad' or 'net2wider'
'''
assert teacher_w1.shape[1] == teacher_w2.shape[0], (
'successive layers from teacher model should have compatible shapes')
assert teacher_w1.shape[1] == teacher_b1.shape[0], (
'weight and bias from same layer should have compatible shapes')
assert new_width > teacher_w1.shape[1], (
'new width (nout) should be bigger than the existing one')
n = new_width - teacher_w1.shape[1]
if init == 'random-pad':
new_w1 = np.random.normal(0, 0.1, size=(teacher_w1.shape[0], n))
new_b1 = np.ones(n) * 0.1
new_w2 = np.random.normal(0, 0.1, size=(n, teacher_w2.shape[1]))
elif init == 'net2wider':
index = np.random.randint(teacher_w1.shape[1], size=n)
factors = np.bincount(index)[index] + 1.
new_w1 = teacher_w1[:, index]
new_b1 = teacher_b1[index]
new_w2 = teacher_w2[index, :] / factors[:, np.newaxis]
else:
raise ValueError('Unsupported weight initializer: %s' % init)
student_w1 = np.concatenate((teacher_w1, new_w1), axis=1)
if init == 'random-pad':
student_w2 = np.concatenate((teacher_w2, new_w2), axis=0)
elif init == 'net2wider':
# add small noise to break symmetry, so that student model will have
# full capacity later
noise = np.random.normal(0, 5e-2 * new_w2.std(), size=new_w2.shape)
student_w2 = np.concatenate((teacher_w2, new_w2 + noise), axis=0)
student_w2[index, :] = new_w2
student_b1 = np.concatenate((teacher_b1, new_b1), axis=0)
return student_w1, student_b1, student_w2
def deeper2net_conv2d(teacher_w):
'''Get initial weights for a deeper conv2d layer by net2deeper'.
# Arguments
teacher_w: `weight` of previous conv2d layer,
of shape (filters, num_channel, kh, kw)
'''
filters, num_channel, kh, kw = teacher_w.shape
student_w = np.zeros((filters, filters, kh, kw))
for i in xrange(filters):
student_w[i, i, (kh - 1) / 2, (kw - 1) / 2] = 1.
student_b = np.zeros(filters)
return student_w, student_b
def copy_weights(teacher_model, student_model, layer_names):
'''Copy weights from teacher_model to student_model,
for layers with names listed in layer_names
'''
for name in layer_names:
weights = teacher_model.get_layer(name=name).get_weights()
student_model.get_layer(name=name).set_weights(weights)
# methods to construct teacher_model and student_models
def make_teacher_model(train_data, validation_data, epochs=3):
'''Train a simple CNN as teacher model.
'''
model = Sequential()
model.add(Conv2D(64, 3, input_shape=input_shape,
padding='same', name='conv1'))
model.add(MaxPooling2D(2, name='pool1'))
model.add(Conv2D(64, 3, padding='same', name='conv2'))
model.add(MaxPooling2D(2, name='pool2'))
model.add(Flatten(name='flatten'))
model.add(Dense(64, activation='relu', name='fc1'))
model.add(Dense(num_class, activation='softmax', name='fc2'))
model.compile(loss='categorical_crossentropy',
optimizer=SGD(lr=0.01, momentum=0.9),
metrics=['accuracy'])
train_x, train_y = train_data
history = model.fit(train_x, train_y, epochs=epochs,
validation_data=validation_data)
return model, history
def make_wider_student_model(teacher_model, train_data,
validation_data, init, epochs=3):
'''Train a wider student model based on teacher_model,
with either 'random-pad' (baseline) or 'net2wider'
'''
new_conv1_width = 128
new_fc1_width = 128
model = Sequential()
# a wider conv1 compared to teacher_model
model.add(Conv2D(new_conv1_width, 3, input_shape=input_shape,
padding='same', name='conv1'))
model.add(MaxPooling2D(2, name='pool1'))
model.add(Conv2D(64, 3, padding='same', name='conv2'))
model.add(MaxPooling2D(2, name='pool2'))
model.add(Flatten(name='flatten'))
# a wider fc1 compared to teacher model
model.add(Dense(new_fc1_width, activation='relu', name='fc1'))
model.add(Dense(num_class, activation='softmax', name='fc2'))
# The weights for other layers need to be copied from teacher_model
# to student_model, except for widened layers
# and their immediate downstreams, which will be initialized separately.
# For this example there are no other layers that need to be copied.
w_conv1, b_conv1 = teacher_model.get_layer('conv1').get_weights()
w_conv2, b_conv2 = teacher_model.get_layer('conv2').get_weights()
new_w_conv1, new_b_conv1, new_w_conv2 = wider2net_conv2d(
w_conv1, b_conv1, w_conv2, new_conv1_width, init)
model.get_layer('conv1').set_weights([new_w_conv1, new_b_conv1])
model.get_layer('conv2').set_weights([new_w_conv2, b_conv2])
w_fc1, b_fc1 = teacher_model.get_layer('fc1').get_weights()
w_fc2, b_fc2 = teacher_model.get_layer('fc2').get_weights()
new_w_fc1, new_b_fc1, new_w_fc2 = wider2net_fc(
w_fc1, b_fc1, w_fc2, new_fc1_width, init)
model.get_layer('fc1').set_weights([new_w_fc1, new_b_fc1])
model.get_layer('fc2').set_weights([new_w_fc2, b_fc2])
model.compile(loss='categorical_crossentropy',
optimizer=SGD(lr=0.001, momentum=0.9),
metrics=['accuracy'])
train_x, train_y = train_data
history = model.fit(train_x, train_y, epochs=epochs,
validation_data=validation_data)
return model, history
def make_deeper_student_model(teacher_model, train_data,
validation_data, init, epochs=3):
'''Train a deeper student model based on teacher_model,
with either 'random-init' (baseline) or 'net2deeper'
'''
model = Sequential()
model.add(Conv2D(64, 3, input_shape=input_shape,
padding='same', name='conv1'))
model.add(MaxPooling2D(2, name='pool1'))
model.add(Conv2D(64, 3, padding='same', name='conv2'))
# add another conv2d layer to make original conv2 deeper
if init == 'net2deeper':
prev_w, _ = model.get_layer('conv2').get_weights()
new_weights = deeper2net_conv2d(prev_w)
model.add(Conv2D(64, 3, padding='same',
name='conv2-deeper', weights=new_weights))
elif init == 'random-init':
model.add(Conv2D(64, 3, padding='same', name='conv2-deeper'))
else:
raise ValueError('Unsupported weight initializer: %s' % init)
model.add(MaxPooling2D(2, name='pool2'))
model.add(Flatten(name='flatten'))
model.add(Dense(64, activation='relu', name='fc1'))
# add another fc layer to make original fc1 deeper
if init == 'net2deeper':
# net2deeper for fc layer with relu, is just an identity initializer
model.add(Dense(64, init='identity',
activation='relu', name='fc1-deeper'))
elif init == 'random-init':
model.add(Dense(64, activation='relu', name='fc1-deeper'))
else:
raise ValueError('Unsupported weight initializer: %s' % init)
model.add(Dense(num_class, activation='softmax', name='fc2'))
# copy weights for other layers
copy_weights(teacher_model, model, layer_names=[
'conv1', 'conv2', 'fc1', 'fc2'])
model.compile(loss='categorical_crossentropy',
optimizer=SGD(lr=0.001, momentum=0.9),
metrics=['accuracy'])
train_x, train_y = train_data
history = model.fit(train_x, train_y, epochs=epochs,
validation_data=validation_data)
return model, history
# experiments setup
def net2wider_experiment():
'''Benchmark performances of
(1) a teacher model,
(2) a wider student model with `random_pad` initializer
(3) a wider student model with `Net2WiderNet` initializer
'''
train_data = (train_x, train_y)
validation_data = (validation_x, validation_y)
print('\nExperiment of Net2WiderNet ...')
print('\nbuilding teacher model ...')
teacher_model, _ = make_teacher_model(train_data,
validation_data,
epochs=3)
print('\nbuilding wider student model by random padding ...')
make_wider_student_model(teacher_model, train_data,
validation_data, 'random-pad',
epochs=3)
print('\nbuilding wider student model by net2wider ...')
make_wider_student_model(teacher_model, train_data,
validation_data, 'net2wider',
epochs=3)
def net2deeper_experiment():
'''Benchmark performances of
(1) a teacher model,
(2) a deeper student model with `random_init` initializer
(3) a deeper student model with `Net2DeeperNet` initializer
'''
train_data = (train_x, train_y)
validation_data = (validation_x, validation_y)
print('\nExperiment of Net2DeeperNet ...')
print('\nbuilding teacher model ...')
teacher_model, _ = make_teacher_model(train_data,
validation_data,
epochs=3)
print('\nbuilding deeper student model by random init ...')
make_deeper_student_model(teacher_model, train_data,
validation_data, 'random-init',
epochs=3)
print('\nbuilding deeper student model by net2deeper ...')
make_deeper_student_model(teacher_model, train_data,
validation_data, 'net2deeper',
epochs=3)
# run the experiments
net2wider_experiment()
net2deeper_experiment()
+23 -17
Ver Arquivo
@@ -13,13 +13,12 @@ Gets to 99.5% test accuracy after 20 epochs.
from __future__ import absolute_import
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
import random
from keras.datasets import mnist
from keras.models import Sequential, Model
from keras.layers import Dense, Dropout, Input, Lambda
from keras.optimizers import SGD, RMSprop
from keras.optimizers import RMSprop
from keras import backend as K
@@ -28,12 +27,18 @@ def euclidean_distance(vects):
return K.sqrt(K.sum(K.square(x - y), axis=1, keepdims=True))
def eucl_dist_output_shape(shapes):
shape1, shape2 = shapes
return (shape1[0], 1)
def contrastive_loss(y_true, y_pred):
'''Contrastive loss from Hadsell-et-al.'06
http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf
'''
margin = 1
return K.mean(y_true * K.square(y_pred) + (1 - y_true) * K.square(K.maximum(margin - y_pred, 0)))
return K.mean(y_true * K.square(y_pred) +
(1 - y_true) * K.square(K.maximum(margin - y_pred, 0)))
def create_pairs(x, digit_indices):
@@ -45,7 +50,7 @@ def create_pairs(x, digit_indices):
n = min([len(digit_indices[d]) for d in range(10)]) - 1
for d in range(10):
for i in range(n):
z1, z2 = digit_indices[d][i], digit_indices[d][i+1]
z1, z2 = digit_indices[d][i], digit_indices[d][i + 1]
pairs += [[x[z1], x[z2]]]
inc = random.randrange(1, 10)
dn = (d + inc) % 10
@@ -74,22 +79,22 @@ def compute_accuracy(predictions, labels):
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
input_dim = 784
nb_epoch = 20
epochs = 20
# create training+test positive and negative pairs
digit_indices = [np.where(y_train == i)[0] for i in range(10)]
tr_pairs, tr_y = create_pairs(X_train, digit_indices)
tr_pairs, tr_y = create_pairs(x_train, digit_indices)
digit_indices = [np.where(y_test == i)[0] for i in range(10)]
te_pairs, te_y = create_pairs(X_test, digit_indices)
te_pairs, te_y = create_pairs(x_test, digit_indices)
# network definition
base_network = create_base_network(input_dim)
@@ -103,9 +108,10 @@ input_b = Input(shape=(input_dim,))
processed_a = base_network(input_a)
processed_b = base_network(input_b)
distance = Lambda(euclidean_distance)([processed_a, processed_b])
distance = Lambda(euclidean_distance,
output_shape=eucl_dist_output_shape)([processed_a, processed_b])
model = Model(input=[input_a, input_b], output=distance)
model = Model([input_a, input_b], distance)
# train
rms = RMSprop()
@@ -113,7 +119,7 @@ model.compile(loss=contrastive_loss, optimizer=rms)
model.fit([tr_pairs[:, 0], tr_pairs[:, 1]], tr_y,
validation_data=([te_pairs[:, 0], te_pairs[:, 1]], te_y),
batch_size=128,
nb_epoch=nb_epoch)
epochs=epochs)
# compute final accuracy on training and test sets
pred = model.predict([tr_pairs[:, 0], tr_pairs[:, 1]])
+102
Ver Arquivo
@@ -0,0 +1,102 @@
'''Example of how to use sklearn wrapper
Builds simple CNN models on MNIST and uses sklearn's GridSearchCV to find best model
'''
from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.wrappers.scikit_learn import KerasClassifier
from keras import backend as K
from sklearn.grid_search import GridSearchCV
num_classes = 10
# input image dimensions
img_rows, img_cols = 28, 28
# load training data and do basic data normalization
(x_train, y_train), (x_test, y_test) = mnist.load_data()
if K.image_data_format() == 'channels_first':
x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
input_shape = (1, img_rows, img_cols)
else:
x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
def make_model(dense_layer_sizes, filters, kernel_size, pool_size):
'''Creates model comprised of 2 convolutional layers followed by dense layers
dense_layer_sizes: List of layer sizes.
This list has one number for each layer
filters: Number of convolutional filters in each convolutional layer
kernel_size: Convolutional kernel size
pool_size: Size of pooling area for max pooling
'''
model = Sequential()
model.add(Conv2D(filters, kernel_size,
padding='valid',
input_shape=input_shape))
model.add(Activation('relu'))
model.add(Conv2D(filters, kernel_size))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Dropout(0.25))
model.add(Flatten())
for layer_size in dense_layer_sizes:
model.add(Dense(layer_size))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adadelta',
metrics=['accuracy'])
return model
dense_size_candidates = [[32], [64], [32, 32], [64, 64]]
my_classifier = KerasClassifier(make_model, batch_size=32)
validator = GridSearchCV(my_classifier,
param_grid={'dense_layer_sizes': dense_size_candidates,
# epochs is avail for tuning even when not
# an argument to model building function
'epochs': [3, 6],
'filters': [8],
'kernel_size': [3],
'pool_size': [2]},
scoring='neg_log_loss',
n_jobs=1)
validator.fit(x_train, y_train)
print('The parameters of the best model are: ')
print(validator.best_params_)
# validator.best_estimator_ returns sklearn-wrapped version of best model.
# validator.best_estimator_.model returns the (unwrapped) keras model
best_model = validator.best_estimator_.model
metric_names = best_model.metrics_names
metric_values = best_model.evaluate(x_test, y_test)
for metric, value in zip(metric_names, metric_values):
print(metric, ': ', value)
+175
Ver Arquivo
@@ -0,0 +1,175 @@
'''Trains a stacked what-where autoencoder built on residual blocks on the
MNIST dataset. It exemplifies two influential methods that have been developed
in the past few years.
The first is the idea of properly 'unpooling.' During any max pool, the
exact location (the 'where') of the maximal value in a pooled receptive field
is lost, however it can be very useful in the overall reconstruction of an
input image. Therefore, if the 'where' is handed from the encoder
to the corresponding decoder layer, features being decoded can be 'placed' in
the right location, allowing for reconstructions of much higher fidelity.
References:
[1]
'Visualizing and Understanding Convolutional Networks'
Matthew D Zeiler, Rob Fergus
https://arxiv.org/abs/1311.2901v3
[2]
'Stacked What-Where Auto-encoders'
Junbo Zhao, Michael Mathieu, Ross Goroshin, Yann LeCun
https://arxiv.org/abs/1506.02351v8
The second idea exploited here is that of residual learning. Residual blocks
ease the training process by allowing skip connections that give the network
the ability to be as linear (or non-linear) as the data sees fit. This allows
for much deep networks to be easily trained. The residual element seems to
be advantageous in the context of this example as it allows a nice symmetry
between the encoder and decoder. Normally, in the decoder, the final
projection to the space where the image is reconstructed is linear, however
this does not have to be the case for a residual block as the degree to which
its output is linear or non-linear is determined by the data it is fed.
However, in order to cap the reconstruction in this example, a hard softmax is
applied as a bias because we know the MNIST digits are mapped to [0,1].
References:
[3]
'Deep Residual Learning for Image Recognition'
Kaiming He, xiangyu Zhang, Shaoqing Ren, Jian Sun
https://arxiv.org/abs/1512.03385v1
[4]
'Identity Mappings in Deep Residual Networks'
Kaiming He, xiangyu Zhang, Shaoqing Ren, Jian Sun
https://arxiv.org/abs/1603.05027v3
'''
from __future__ import print_function
import numpy as np
from keras.datasets import mnist
from keras.models import Model
from keras.layers import Activation
from keras.layers import UpSampling2D, Conv2D, MaxPooling2D
from keras.layers import Input, BatchNormalization
import matplotlib.pyplot as plt
import keras.backend as K
from keras import layers
def convresblock(x, nfeats=8, ksize=3, nskipped=2):
''' The proposed residual block from [4]'''
y0 = Conv2D(nfeats, ksize, padding='same')(x)
y = y0
for i in range(nskipped):
y = BatchNormalization(axis=1)(y)
y = Activation('relu')(y)
y = Conv2D(nfeats, ksize, padding='same')(y)
return layers.add([y0, y])
def getwhere(x):
''' Calculate the 'where' mask that contains switches indicating which
index contained the max value when MaxPool2D was applied. Using the
gradient of the sum is a nice trick to keep everything high level.'''
y_prepool, y_postpool = x
return K.gradients(K.sum(y_postpool), y_prepool)
if K.backend() == 'tensorflow':
raise RuntimeError('This example can only run with the '
'Theano backend for the time being, '
'because it requires taking the gradient '
'of a gradient, which isn\'t '
'supported for all TF ops.')
# This example assume 'channels_first' data format.
K.set_image_data_format('channels_first')
# input image dimensions
img_rows, img_cols = 28, 28
# the data, shuffled and split between train and test sets
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# The size of the kernel used for the MaxPooling2D
pool_size = 2
# The total number of feature maps at each layer
nfeats = [8, 16, 32, 64, 128]
# The sizes of the pooling kernel at each layer
pool_sizes = np.array([1, 1, 1, 1, 1]) * pool_size
# The convolution kernel size
ksize = 3
# Number of epochs to train for
epochs = 5
# Batch size during training
batch_size = 128
if pool_size == 2:
# if using a 5 layer net of pool_size = 2
x_train = np.pad(x_train, [[0, 0], [0, 0], [2, 2], [2, 2]],
mode='constant')
x_test = np.pad(x_test, [[0, 0], [0, 0], [2, 2], [2, 2]], mode='constant')
nlayers = 5
elif pool_size == 3:
# if using a 3 layer net of pool_size = 3
x_train = x_train[:, :, :-1, :-1]
x_test = x_test[:, :, :-1, :-1]
nlayers = 3
else:
import sys
sys.exit('Script supports pool_size of 2 and 3.')
# Shape of input to train on (note that model is fully convolutional however)
input_shape = x_train.shape[1:]
# The final list of the size of axis=1 for all layers, including input
nfeats_all = [input_shape[0]] + nfeats
# First build the encoder, all the while keeping track of the 'where' masks
img_input = Input(shape=input_shape)
# We push the 'where' masks to the following list
wheres = [None] * nlayers
y = img_input
for i in range(nlayers):
y_prepool = convresblock(y, nfeats=nfeats_all[i + 1], ksize=ksize)
y = MaxPooling2D(pool_size=(pool_sizes[i], pool_sizes[i]))(y_prepool)
wheres[i] = layers.Lambda(getwhere, output_shape=lambda x: x[0])([y_prepool, y])
# Now build the decoder, and use the stored 'where' masks to place the features
for i in range(nlayers):
ind = nlayers - 1 - i
y = UpSampling2D(size=(pool_sizes[ind], pool_sizes[ind]))(y)
y = layers.multiply([y, wheres[ind]])
y = convresblock(y, nfeats=nfeats_all[ind], ksize=ksize)
# Use hard_simgoid to clip range of reconstruction
y = Activation('hard_sigmoid')(y)
# Define the model and it's mean square error loss, and compile it with Adam
model = Model(img_input, y)
model.compile('adam', 'mse')
# Fit the model
model.fit(x_train, x_train, validation_data=(x_test, x_test),
batch_size=batch_size, epochs=epochs)
# Plot
x_recon = model.predict(x_test[:25])
x_plot = np.concatenate((x_test[:25], x_recon), axis=1)
x_plot = x_plot.reshape((5, 10, input_shape[-2], input_shape[-1]))
x_plot = np.vstack([np.hstack(x) for x in x_plot])
plt.figure()
plt.axis('off')
plt.title('Test Samples: Originals/Reconstructions')
plt.imshow(x_plot, interpolation='none', cmap='gray')
plt.savefig('reconstructions.png')
+49 -48
Ver Arquivo
@@ -12,107 +12,108 @@ and 99.2% for the last five digits after transfer + fine-tuning.
'''
from __future__ import print_function
import numpy as np
import datetime
np.random.seed(1337) # for reproducibility
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.utils import np_utils
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
now = datetime.datetime.now
batch_size = 128
nb_classes = 5
nb_epoch = 5
num_classes = 5
epochs = 5
# input image dimensions
img_rows, img_cols = 28, 28
# number of convolutional filters to use
nb_filters = 32
filters = 32
# size of pooling area for max pooling
nb_pool = 2
pool_size = 2
# convolution kernel size
nb_conv = 3
kernel_size = 3
if K.image_data_format() == 'channels_first':
input_shape = (1, img_rows, img_cols)
else:
input_shape = (img_rows, img_cols, 1)
def train_model(model, train, test, nb_classes):
X_train = train[0].reshape(train[0].shape[0], 1, img_rows, img_cols)
X_test = test[0].reshape(test[0].shape[0], 1, img_rows, img_cols)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
def train_model(model, train, test, num_classes):
x_train = train[0].reshape((train[0].shape[0],) + input_shape)
x_test = test[0].reshape((test[0].shape[0],) + input_shape)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(train[1], nb_classes)
Y_test = np_utils.to_categorical(test[1], nb_classes)
y_train = keras.utils.to_categorical(train[1], num_classes)
y_test = keras.utils.to_categorical(test[1], num_classes)
model.compile(loss='categorical_crossentropy',
optimizer='adadelta',
metrics=['accuracy'])
t = now()
model.fit(X_train, Y_train,
batch_size=batch_size, nb_epoch=nb_epoch,
model.fit(x_train, y_train,
batch_size=batch_size, epochs=epochs,
verbose=1,
validation_data=(X_test, Y_test))
validation_data=(x_test, y_test))
print('Training time: %s' % (now() - t))
score = model.evaluate(X_test, Y_test, verbose=0)
score = model.evaluate(x_test, y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# create two datasets one with digits below 5 and one with 5 and above
X_train_lt5 = X_train[y_train < 5]
x_train_lt5 = x_train[y_train < 5]
y_train_lt5 = y_train[y_train < 5]
X_test_lt5 = X_test[y_test < 5]
x_test_lt5 = x_test[y_test < 5]
y_test_lt5 = y_test[y_test < 5]
X_train_gte5 = X_train[y_train >= 5]
y_train_gte5 = y_train[y_train >= 5] - 5 # make classes start at 0 for
X_test_gte5 = X_test[y_test >= 5] # np_utils.to_categorical
x_train_gte5 = x_train[y_train >= 5]
y_train_gte5 = y_train[y_train >= 5] - 5
x_test_gte5 = x_test[y_test >= 5]
y_test_gte5 = y_test[y_test >= 5] - 5
# define two groups of layers: feature (convolutions) and classification (dense)
feature_layers = [
Convolution2D(nb_filters, nb_conv, nb_conv,
border_mode='valid',
input_shape=(1, img_rows, img_cols)),
Conv2D(filters, kernel_size,
padding='valid',
input_shape=input_shape),
Activation('relu'),
Convolution2D(nb_filters, nb_conv, nb_conv),
Conv2D(filters, kernel_size),
Activation('relu'),
MaxPooling2D(pool_size=(nb_pool, nb_pool)),
MaxPooling2D(pool_size=pool_size),
Dropout(0.25),
Flatten(),
]
classification_layers = [
Dense(128),
Activation('relu'),
Dropout(0.5),
Dense(nb_classes),
Dense(num_classes),
Activation('softmax')
]
# create complete model
model = Sequential()
for l in feature_layers + classification_layers:
model.add(l)
model = Sequential(feature_layers + classification_layers)
# train model for 5-digit classification [0..4]
train_model(model,
(X_train_lt5, y_train_lt5),
(X_test_lt5, y_test_lt5), nb_classes)
(x_train_lt5, y_train_lt5),
(x_test_lt5, y_test_lt5), num_classes)
# freeze feature layers and rebuild model
for l in feature_layers:
@@ -120,5 +121,5 @@ for l in feature_layers:
# transfer: train dense layers for new classification task [5..9]
train_model(model,
(X_train_gte5, y_train_gte5),
(X_test_gte5, y_test_gte5), nb_classes)
(x_train_gte5, y_train_gte5),
(x_test_gte5, y_test_gte5), num_classes)
+366
Ver Arquivo
@@ -0,0 +1,366 @@
'''Neural doodle with Keras
Script Usage:
# Arguments:
```
--nlabels: # of regions (colors) in mask images
--style-image: image to learn style from
--style-mask: semantic labels for style image
--target-mask: semantic labels for target image (your doodle)
--content-image: optional image to learn content from
--target-image-prefix: path prefix for generated target images
```
# Example 1: doodle using a style image, style mask
and target mask.
```
python neural_doodle.py --nlabels 4 --style-image Monet/style.png \
--style-mask Monet/style_mask.png --target-mask Monet/target_mask.png \
--target-image-prefix generated/monet
```
# Example 2: doodle using a style image, style mask,
target mask and an optional content image.
```
python neural_doodle.py --nlabels 4 --style-image Renoir/style.png \
--style-mask Renoir/style_mask.png --target-mask Renoir/target_mask.png \
--content-image Renoir/creek.jpg \
--target-image-prefix generated/renoir
```
References:
[Dmitry Ulyanov's blog on fast-neural-doodle](http://dmitryulyanov.github.io/feed-forward-neural-doodle/)
[Torch code for fast-neural-doodle](https://github.com/DmitryUlyanov/fast-neural-doodle)
[Torch code for online-neural-doodle](https://github.com/DmitryUlyanov/online-neural-doodle)
[Paper Texture Networks: Feed-forward Synthesis of Textures and Stylized Images](http://arxiv.org/abs/1603.03417)
[Discussion on parameter tuning](https://github.com/fchollet/keras/issues/3705)
Resources:
Example images can be downloaded from
https://github.com/DmitryUlyanov/fast-neural-doodle/tree/master/data
'''
from __future__ import print_function
import time
import argparse
import numpy as np
from scipy.optimize import fmin_l_bfgs_b
from scipy.misc import imread, imsave
from keras import backend as K
from keras.layers import Input, AveragePooling2D
from keras.models import Model
from keras.preprocessing.image import load_img, img_to_array
from keras.applications import vgg19
# Command line arguments
parser = argparse.ArgumentParser(description='Keras neural doodle example')
parser.add_argument('--nlabels', type=int,
help='number of semantic labels'
' (regions in differnet colors)'
' in style_mask/target_mask')
parser.add_argument('--style-image', type=str,
help='path to image to learn style from')
parser.add_argument('--style-mask', type=str,
help='path to semantic mask of style image')
parser.add_argument('--target-mask', type=str,
help='path to semantic mask of target image')
parser.add_argument('--content-image', type=str, default=None,
help='path to optional content image')
parser.add_argument('--target-image-prefix', type=str,
help='path prefix for generated results')
args = parser.parse_args()
style_img_path = args.style_image
style_mask_path = args.style_mask
target_mask_path = args.target_mask
content_img_path = args.content_image
target_img_prefix = args.target_image_prefix
use_content_img = content_img_path is not None
num_labels = args.nlabels
num_colors = 3 # RGB
# determine image sizes based on target_mask
ref_img = imread(target_mask_path)
img_nrows, img_ncols = ref_img.shape[:2]
total_variation_weight = 50.
style_weight = 1.
content_weight = 0.1 if use_content_img else 0
content_feature_layers = ['block5_conv2']
# To get better generation qualities, use more conv layers for style features
style_feature_layers = ['block1_conv1', 'block2_conv1', 'block3_conv1',
'block4_conv1', 'block5_conv1']
# helper functions for reading/processing images
def preprocess_image(image_path):
img = load_img(image_path, target_size=(img_nrows, img_ncols))
img = img_to_array(img)
img = np.expand_dims(img, axis=0)
img = vgg19.preprocess_input(img)
return img
def deprocess_image(x):
if K.image_data_format() == 'channels_first':
x = x.reshape((3, img_nrows, img_ncols))
x = x.transpose((1, 2, 0))
else:
x = x.reshape((img_nrows, img_ncols, 3))
# Remove zero-center by mean pixel
x[:, :, 0] += 103.939
x[:, :, 1] += 116.779
x[:, :, 2] += 123.68
# 'BGR'->'RGB'
x = x[:, :, ::-1]
x = np.clip(x, 0, 255).astype('uint8')
return x
def kmeans(xs, k):
assert xs.ndim == 2
try:
from sklearn.cluster import k_means
_, labels, _ = k_means(xs.astype('float64'), k)
except ImportError:
from scipy.cluster.vq import kmeans2
_, labels = kmeans2(xs, k, missing='raise')
return labels
def load_mask_labels():
'''Load both target and style masks.
A mask image (nr x nc) with m labels/colors will be loaded
as a 4D boolean tensor: (1, m, nr, nc) for 'channels_first' or (1, nr, nc, m) for 'channels_last'
'''
target_mask_img = load_img(target_mask_path,
target_size=(img_nrows, img_ncols))
target_mask_img = img_to_array(target_mask_img)
style_mask_img = load_img(style_mask_path,
target_size=(img_nrows, img_ncols))
style_mask_img = img_to_array(style_mask_img)
if K.image_data_format() == 'channels_first':
mask_vecs = np.vstack([style_mask_img.reshape((3, -1)).T,
target_mask_img.reshape((3, -1)).T])
else:
mask_vecs = np.vstack([style_mask_img.reshape((-1, 3)),
target_mask_img.reshape((-1, 3))])
labels = kmeans(mask_vecs, num_labels)
style_mask_label = labels[:img_nrows *
img_ncols].reshape((img_nrows, img_ncols))
target_mask_label = labels[img_nrows *
img_ncols:].reshape((img_nrows, img_ncols))
stack_axis = 0 if K.image_data_format() == 'channels_first' else -1
style_mask = np.stack([style_mask_label == r for r in xrange(num_labels)],
axis=stack_axis)
target_mask = np.stack([target_mask_label == r for r in xrange(num_labels)],
axis=stack_axis)
return (np.expand_dims(style_mask, axis=0),
np.expand_dims(target_mask, axis=0))
# Create tensor variables for images
if K.image_data_format() == 'channels_first':
shape = (1, num_colors, img_nrows, img_ncols)
else:
shape = (1, img_nrows, img_ncols, num_colors)
style_image = K.variable(preprocess_image(style_img_path))
target_image = K.placeholder(shape=shape)
if use_content_img:
content_image = K.variable(preprocess_image(content_img_path))
else:
content_image = K.zeros(shape=shape)
images = K.concatenate([style_image, target_image, content_image], axis=0)
# Create tensor variables for masks
raw_style_mask, raw_target_mask = load_mask_labels()
style_mask = K.variable(raw_style_mask.astype('float32'))
target_mask = K.variable(raw_target_mask.astype('float32'))
masks = K.concatenate([style_mask, target_mask], axis=0)
# index constants for images and tasks variables
STYLE, TARGET, CONTENT = 0, 1, 2
# Build image model, mask model and use layer outputs as features
# image model as VGG19
image_model = vgg19.VGG19(include_top=False, input_tensor=images)
# mask model as a series of pooling
mask_input = Input(tensor=masks, shape=(None, None, None), name='mask_input')
x = mask_input
for layer in image_model.layers[1:]:
name = 'mask_%s' % layer.name
if 'conv' in layer.name:
x = AveragePooling2D((3, 3), strides=(
1, 1), name=name, border_mode='same')(x)
elif 'pool' in layer.name:
x = AveragePooling2D((2, 2), name=name)(x)
mask_model = Model(mask_input, x)
# Collect features from image_model and task_model
image_features = {}
mask_features = {}
for img_layer, mask_layer in zip(image_model.layers, mask_model.layers):
if 'conv' in img_layer.name:
assert 'mask_' + img_layer.name == mask_layer.name
layer_name = img_layer.name
img_feat, mask_feat = img_layer.output, mask_layer.output
image_features[layer_name] = img_feat
mask_features[layer_name] = mask_feat
# Define loss functions
def gram_matrix(x):
assert K.ndim(x) == 3
features = K.batch_flatten(x)
gram = K.dot(features, K.transpose(features))
return gram
def region_style_loss(style_image, target_image, style_mask, target_mask):
'''Calculate style loss between style_image and target_image,
for one common region specified by their (boolean) masks
'''
assert 3 == K.ndim(style_image) == K.ndim(target_image)
assert 2 == K.ndim(style_mask) == K.ndim(target_mask)
if K.image_data_format() == 'channels_first':
masked_style = style_image * style_mask
masked_target = target_image * target_mask
num_channels = K.shape(style_image)[0]
else:
masked_style = K.permute_dimensions(
style_image, (2, 0, 1)) * style_mask
masked_target = K.permute_dimensions(
target_image, (2, 0, 1)) * target_mask
num_channels = K.shape(style_image)[-1]
s = gram_matrix(masked_style) / K.mean(style_mask) / num_channels
c = gram_matrix(masked_target) / K.mean(target_mask) / num_channels
return K.mean(K.square(s - c))
def style_loss(style_image, target_image, style_masks, target_masks):
'''Calculate style loss between style_image and target_image,
in all regions.
'''
assert 3 == K.ndim(style_image) == K.ndim(target_image)
assert 3 == K.ndim(style_masks) == K.ndim(target_masks)
loss = K.variable(0)
for i in xrange(num_labels):
if K.image_data_format() == 'channels_first':
style_mask = style_masks[i, :, :]
target_mask = target_masks[i, :, :]
else:
style_mask = style_masks[:, :, i]
target_mask = target_masks[:, :, i]
loss += region_style_loss(style_image,
target_image, style_mask, target_mask)
return loss
def content_loss(content_image, target_image):
return K.sum(K.square(target_image - content_image))
def total_variation_loss(x):
assert 4 == K.ndim(x)
if K.image_data_format() == 'channels_first':
a = K.square(x[:, :, :img_nrows - 1, :img_ncols - 1] -
x[:, :, 1:, :img_ncols - 1])
b = K.square(x[:, :, :img_nrows - 1, :img_ncols - 1] -
x[:, :, :img_nrows - 1, 1:])
else:
a = K.square(x[:, :img_nrows - 1, :img_ncols - 1, :] -
x[:, 1:, :img_ncols - 1, :])
b = K.square(x[:, :img_nrows - 1, :img_ncols - 1, :] -
x[:, :img_nrows - 1, 1:, :])
return K.sum(K.pow(a + b, 1.25))
# Overall loss is the weighted sum of content_loss, style_loss and tv_loss
# Each individual loss uses features from image/mask models.
loss = K.variable(0)
for layer in content_feature_layers:
content_feat = image_features[layer][CONTENT, :, :, :]
target_feat = image_features[layer][TARGET, :, :, :]
loss += content_weight * content_loss(content_feat, target_feat)
for layer in style_feature_layers:
style_feat = image_features[layer][STYLE, :, :, :]
target_feat = image_features[layer][TARGET, :, :, :]
style_masks = mask_features[layer][STYLE, :, :, :]
target_masks = mask_features[layer][TARGET, :, :, :]
sl = style_loss(style_feat, target_feat, style_masks, target_masks)
loss += (style_weight / len(style_feature_layers)) * sl
loss += total_variation_weight * total_variation_loss(target_image)
loss_grads = K.gradients(loss, target_image)
# Evaluator class for computing efficiency
outputs = [loss]
if isinstance(loss_grads, (list, tuple)):
outputs += loss_grads
else:
outputs.append(loss_grads)
f_outputs = K.function([target_image], outputs)
def eval_loss_and_grads(x):
if K.image_data_format() == 'channels_first':
x = x.reshape((1, 3, img_nrows, img_ncols))
else:
x = x.reshape((1, img_nrows, img_ncols, 3))
outs = f_outputs([x])
loss_value = outs[0]
if len(outs[1:]) == 1:
grad_values = outs[1].flatten().astype('float64')
else:
grad_values = np.array(outs[1:]).flatten().astype('float64')
return loss_value, grad_values
class Evaluator(object):
def __init__(self):
self.loss_value = None
self.grads_values = None
def loss(self, x):
assert self.loss_value is None
loss_value, grad_values = eval_loss_and_grads(x)
self.loss_value = loss_value
self.grad_values = grad_values
return self.loss_value
def grads(self, x):
assert self.loss_value is not None
grad_values = np.copy(self.grad_values)
self.loss_value = None
self.grad_values = None
return grad_values
evaluator = Evaluator()
# Generate images by iterative optimization
if K.image_data_format() == 'channels_first':
x = np.random.uniform(0, 255, (1, 3, img_nrows, img_ncols)) - 128.
else:
x = np.random.uniform(0, 255, (1, img_nrows, img_ncols, 3)) - 128.
for i in range(50):
print('Start of iteration', i)
start_time = time.time()
x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(),
fprime=evaluator.grads, maxfun=20)
print('Current loss value:', min_val)
# save current generated image
img = deprocess_image(x.copy())
fname = target_img_prefix + '_at_iteration_%d.png' % i
imsave(fname, img)
end_time = time.time()
print('Image saved as', fname)
print('Iteration %d completed in %ds' % (i, end_time - start_time))
+92 -89
Ver Arquivo
@@ -1,10 +1,5 @@
'''Neural style transfer with Keras.
Before running this script, download the weights for the VGG16 model at:
https://drive.google.com/file/d/0Bz7KyqmuGsilT0J5dmRCM0ROVHc/view?usp=sharing
(source: https://gist.github.com/baraldilorenzo/07d7802847aaad0a35d3)
and make sure the variable `weights_path` in this script matches the location of the file.
Run the script with:
```
python neural_style_transfer.py path_to_your_base_image.jpg path_to_your_reference.jpg prefix_for_results
@@ -13,9 +8,15 @@ e.g.:
```
python neural_style_transfer.py img/tuebingen.jpg img/starry_night.jpg results/my_result
```
Optional parameters:
```
--iter, To specify the number of iterations the style transfer takes place (Default is 10)
--content_weight, The weight given to the content loss (Default is 0.025)
--style_weight, The weight given to the style loss (Default is 1.0)
--tv_weight, The weight given to the total variation loss (Default is 1.0)
```
It is preferrable to run this script on GPU, for speed.
If running on CPU, prefer the TensorFlow backend (much faster).
It is preferable to run this script on GPU, for speed.
Example result: https://twitter.com/fchollet/status/686631033085677568
@@ -34,7 +35,7 @@ the pixels of the combination image, giving it visual coherence.
- The style loss is where the deep learning keeps in --that one is defined
using a deep convolutional neural network. Precisely, it consists in a sum of
L2 distances betwen the Gram matrices of the representations of
L2 distances between the Gram matrices of the representations of
the base image and the style reference image, extracted from
different layers of a convnet (trained on ImageNet). The general idea
is to capture color/texture information at different spatial
@@ -49,16 +50,14 @@ keeping the generated image close enough to the original one.
'''
from __future__ import print_function
from scipy.misc import imread, imresize, imsave
from keras.preprocessing.image import load_img, img_to_array
from scipy.misc import imsave
import numpy as np
from scipy.optimize import fmin_l_bfgs_b
import time
import os
import argparse
import h5py
from keras.models import Sequential
from keras.layers.convolutional import Convolution2D, ZeroPadding2D, MaxPooling2D
from keras.applications import vgg16
from keras import backend as K
parser = argparse.ArgumentParser(description='Neural style transfer with Keras.')
@@ -68,33 +67,56 @@ parser.add_argument('style_reference_image_path', metavar='ref', type=str,
help='Path to the style reference image.')
parser.add_argument('result_prefix', metavar='res_prefix', type=str,
help='Prefix for the saved results.')
parser.add_argument('--iter', type=int, default=10, required=False,
help='Number of iterations to run.')
parser.add_argument('--content_weight', type=float, default=0.025, required=False,
help='Content weight.')
parser.add_argument('--style_weight', type=float, default=1.0, required=False,
help='Style weight.')
parser.add_argument('--tv_weight', type=float, default=1.0, required=False,
help='Total Variation weight.')
args = parser.parse_args()
base_image_path = args.base_image_path
style_reference_image_path = args.style_reference_image_path
result_prefix = args.result_prefix
weights_path = 'vgg16_weights.h5'
iterations = args.iter
# these are the weights of the different loss components
total_variation_weight = 1.
style_weight = 1.
content_weight = 0.025
total_variation_weight = args.tv_weight
style_weight = args.style_weight
content_weight = args.content_weight
# dimensions of the generated picture.
img_width = 400
img_height = 400
assert img_height == img_width, 'Due to the use of the Gram matrix, width and height must match.'
width, height = load_img(base_image_path).size
img_nrows = 400
img_ncols = int(width * img_nrows / height)
# util function to open, resize and format pictures into appropriate tensors
def preprocess_image(image_path):
img = imresize(imread(image_path), (img_width, img_height))
img = img.transpose((2, 0, 1)).astype('float64')
img = load_img(image_path, target_size=(img_nrows, img_ncols))
img = img_to_array(img)
img = np.expand_dims(img, axis=0)
img = vgg16.preprocess_input(img)
return img
# util function to convert a tensor into a valid image
def deprocess_image(x):
x = x.transpose((1, 2, 0))
if K.image_data_format() == 'channels_first':
x = x.reshape((3, img_nrows, img_ncols))
x = x.transpose((1, 2, 0))
else:
x = x.reshape((img_nrows, img_ncols, 3))
# Remove zero-center by mean pixel
x[:, :, 0] += 103.939
x[:, :, 1] += 116.779
x[:, :, 2] += 123.68
# 'BGR'->'RGB'
x = x[:, :, ::-1]
x = np.clip(x, 0, 255).astype('uint8')
return x
@@ -103,7 +125,10 @@ base_image = K.variable(preprocess_image(base_image_path))
style_reference_image = K.variable(preprocess_image(style_reference_image_path))
# this will contain our generated image
combination_image = K.placeholder((1, 3, img_width, img_height))
if K.image_data_format() == 'channels_first':
combination_image = K.placeholder((1, 3, img_nrows, img_ncols))
else:
combination_image = K.placeholder((1, img_nrows, img_ncols, 3))
# combine the 3 images into a single Keras tensor
input_tensor = K.concatenate([base_image,
@@ -111,60 +136,9 @@ input_tensor = K.concatenate([base_image,
combination_image], axis=0)
# build the VGG16 network with our 3 images as input
first_layer = ZeroPadding2D((1, 1))
first_layer.set_input(input_tensor, shape=(3, 3, img_width, img_height))
model = Sequential()
model.add(first_layer)
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(64, 3, 3, activation='relu'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
# load the weights of the VGG16 networks
# (trained on ImageNet, won the ILSVRC competition in 2014)
# note: when there is a complete match between your model definition
# and your weight savefile, you can simply call model.load_weights(filename)
assert os.path.exists(weights_path), 'Model weights not found (see "weights_path" variable in script).'
f = h5py.File(weights_path)
for k in range(f.attrs['nb_layers']):
if k >= len(model.layers):
# we don't look at the last (fully-connected) layers in the savefile
break
g = f['layer_{}'.format(k)]
weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
model.layers[k].set_weights(weights)
f.close()
# the model will be loaded with pre-trained ImageNet weights
model = vgg16.VGG16(input_tensor=input_tensor,
weights='imagenet', include_top=False)
print('Model loaded.')
# get the symbolic outputs of each "key" layer (we gave them unique names).
@@ -174,9 +148,14 @@ outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])
# first we need to define 4 util functions
# the gram matrix of an image tensor (feature-wise outer product)
def gram_matrix(x):
assert K.ndim(x) == 3
features = K.batch_flatten(x)
if K.image_data_format() == 'channels_first':
features = K.batch_flatten(x)
else:
features = K.batch_flatten(K.permute_dimensions(x, (2, 0, 1)))
gram = K.dot(features, K.transpose(features))
return gram
@@ -185,38 +164,50 @@ def gram_matrix(x):
# It is based on the gram matrices (which capture style) of
# feature maps from the style reference image
# and from the generated image
def style_loss(style, combination):
assert K.ndim(style) == 3
assert K.ndim(combination) == 3
S = gram_matrix(style)
C = gram_matrix(combination)
channels = 3
size = img_width * img_height
size = img_nrows * img_ncols
return K.sum(K.square(S - C)) / (4. * (channels ** 2) * (size ** 2))
# an auxiliary loss function
# designed to maintain the "content" of the
# base image in the generated image
def content_loss(base, combination):
return K.sum(K.square(combination - base))
# the 3rd loss function, total variation loss,
# designed to keep the generated image locally coherent
def total_variation_loss(x):
assert K.ndim(x) == 4
a = K.square(x[:, :, :img_width-1, :img_height-1] - x[:, :, 1:, :img_height-1])
b = K.square(x[:, :, :img_width-1, :img_height-1] - x[:, :, :img_width-1, 1:])
if K.image_data_format() == 'channels_first':
a = K.square(x[:, :, :img_nrows - 1, :img_ncols - 1] - x[:, :, 1:, :img_ncols - 1])
b = K.square(x[:, :, :img_nrows - 1, :img_ncols - 1] - x[:, :, :img_nrows - 1, 1:])
else:
a = K.square(x[:, :img_nrows - 1, :img_ncols - 1, :] - x[:, 1:, :img_ncols - 1, :])
b = K.square(x[:, :img_nrows - 1, :img_ncols - 1, :] - x[:, :img_nrows - 1, 1:, :])
return K.sum(K.pow(a + b, 1.25))
# combine these loss functions into a single scalar
loss = K.variable(0.)
layer_features = outputs_dict['conv4_2']
layer_features = outputs_dict['block4_conv2']
base_image_features = layer_features[0, :, :, :]
combination_features = layer_features[2, :, :, :]
loss += content_weight * content_loss(base_image_features,
combination_features)
feature_layers = ['conv1_1', 'conv2_1', 'conv3_1', 'conv4_1', 'conv5_1']
feature_layers = ['block1_conv1', 'block2_conv1',
'block3_conv1', 'block4_conv1',
'block5_conv1']
for layer_name in feature_layers:
layer_features = outputs_dict[layer_name]
style_reference_features = layer_features[1, :, :, :]
@@ -229,14 +220,19 @@ loss += total_variation_weight * total_variation_loss(combination_image)
grads = K.gradients(loss, combination_image)
outputs = [loss]
if type(grads) in {list, tuple}:
if isinstance(grads, (list, tuple)):
outputs += grads
else:
outputs.append(grads)
f_outputs = K.function([combination_image], outputs)
def eval_loss_and_grads(x):
x = x.reshape((1, 3, img_width, img_height))
if K.image_data_format() == 'channels_first':
x = x.reshape((1, 3, img_nrows, img_ncols))
else:
x = x.reshape((1, img_nrows, img_ncols, 3))
outs = f_outputs([x])
loss_value = outs[0]
if len(outs[1:]) == 1:
@@ -251,7 +247,10 @@ def eval_loss_and_grads(x):
# "loss" and "grads". This is done because scipy.optimize
# requires separate functions for loss and gradients,
# but computing them separately would be inefficient.
class Evaluator(object):
def __init__(self):
self.loss_value = None
self.grads_values = None
@@ -274,15 +273,19 @@ evaluator = Evaluator()
# run scipy-based optimization (L-BFGS) over the pixels of the generated image
# so as to minimize the neural style loss
x = np.random.uniform(0, 255, (1, 3, img_width, img_height))
for i in range(10):
if K.image_data_format() == 'channels_first':
x = np.random.uniform(0, 255, (1, 3, img_nrows, img_ncols)) - 128.
else:
x = np.random.uniform(0, 255, (1, img_nrows, img_ncols, 3)) - 128.
for i in range(iterations):
print('Start of iteration', i)
start_time = time.time()
x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(),
fprime=evaluator.grads, maxfun=20)
print('Current loss value:', min_val)
# save current generated image
img = deprocess_image(x.reshape((3, img_width, img_height)))
img = deprocess_image(x.copy())
fname = result_prefix + '_at_iteration_%d.png' % i
imsave(fname, img)
end_time = time.time()
+148
Ver Arquivo
@@ -0,0 +1,148 @@
'''This script loads pre-trained word embeddings (GloVe embeddings)
into a frozen Keras Embedding layer, and uses it to
train a text classification model on the 20 Newsgroup dataset
(classication of newsgroup messages into 20 different categories).
GloVe embedding data can be found at:
http://nlp.stanford.edu/data/glove.6B.zip
(source page: http://nlp.stanford.edu/projects/glove/)
20 Newsgroup data can be found at:
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.html
'''
from __future__ import print_function
import os
import sys
import numpy as np
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.utils import to_categorical
from keras.layers import Dense, Input, Flatten
from keras.layers import Conv1D, MaxPooling1D, Embedding
from keras.models import Model
BASE_DIR = ''
GLOVE_DIR = BASE_DIR + '/glove.6B/'
TEXT_DATA_DIR = BASE_DIR + '/20_newsgroup/'
MAX_SEQUENCE_LENGTH = 1000
MAX_NB_WORDS = 20000
EMBEDDING_DIM = 100
VALIDATION_SPLIT = 0.2
# first, build index mapping words in the embeddings set
# to their embedding vector
print('Indexing word vectors.')
embeddings_index = {}
f = open(os.path.join(GLOVE_DIR, 'glove.6B.100d.txt'))
for line in f:
values = line.split()
word = values[0]
coefs = np.asarray(values[1:], dtype='float32')
embeddings_index[word] = coefs
f.close()
print('Found %s word vectors.' % len(embeddings_index))
# second, prepare text samples and their labels
print('Processing text dataset')
texts = [] # list of text samples
labels_index = {} # dictionary mapping label name to numeric id
labels = [] # list of label ids
for name in sorted(os.listdir(TEXT_DATA_DIR)):
path = os.path.join(TEXT_DATA_DIR, name)
if os.path.isdir(path):
label_id = len(labels_index)
labels_index[name] = label_id
for fname in sorted(os.listdir(path)):
if fname.isdigit():
fpath = os.path.join(path, fname)
if sys.version_info < (3,):
f = open(fpath)
else:
f = open(fpath, encoding='latin-1')
t = f.read()
i = t.find('\n\n') # skip header
if 0 < i:
t = t[i:]
texts.append(t)
f.close()
labels.append(label_id)
print('Found %s texts.' % len(texts))
# finally, vectorize the text samples into a 2D integer tensor
tokenizer = Tokenizer(num_words=MAX_NB_WORDS)
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))
data = pad_sequences(sequences, maxlen=MAX_SEQUENCE_LENGTH)
labels = to_categorical(np.asarray(labels))
print('Shape of data tensor:', data.shape)
print('Shape of label tensor:', labels.shape)
# split the data into a training set and a validation set
indices = np.arange(data.shape[0])
np.random.shuffle(indices)
data = data[indices]
labels = labels[indices]
num_validation_samples = int(VALIDATION_SPLIT * data.shape[0])
x_train = data[:-num_validation_samples]
y_train = labels[:-num_validation_samples]
x_val = data[-num_validation_samples:]
y_val = labels[-num_validation_samples:]
print('Preparing embedding matrix.')
# prepare embedding matrix
num_words = min(MAX_NB_WORDS, len(word_index))
embedding_matrix = np.zeros((num_words, EMBEDDING_DIM))
for word, i in word_index.items():
if i >= MAX_NB_WORDS:
continue
embedding_vector = embeddings_index.get(word)
if embedding_vector is not None:
# words not found in embedding index will be all-zeros.
embedding_matrix[i] = embedding_vector
# load pre-trained word embeddings into an Embedding layer
# note that we set trainable = False so as to keep the embeddings fixed
embedding_layer = Embedding(num_words,
EMBEDDING_DIM,
weights=[embedding_matrix],
input_length=MAX_SEQUENCE_LENGTH,
trainable=False)
print('Training model.')
# train a 1D convnet with global maxpooling
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
x = Conv1D(128, 5, activation='relu')(embedded_sequences)
x = MaxPooling1D(5)(x)
x = Conv1D(128, 5, activation='relu')(x)
x = MaxPooling1D(5)(x)
x = Conv1D(128, 5, activation='relu')(x)
x = MaxPooling1D(35)(x)
x = Flatten()(x)
x = Dense(128, activation='relu')(x)
preds = Dense(len(labels_index), activation='softmax')(x)
model = Model(sequence_input, preds)
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['acc'])
# happy learning!
model.fit(x_train, y_train, validation_data=(x_val, y_val),
epochs=10, batch_size=128)
+25 -26
Ver Arquivo
@@ -1,59 +1,58 @@
'''Trains and evaluate a simple MLP
on the Reuters newswire topic classification task.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
import numpy as np
import keras
from keras.datasets import reuters
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.normalization import BatchNormalization
from keras.utils import np_utils
from keras.layers import Dense, Dropout, Activation
from keras.preprocessing.text import Tokenizer
max_words = 1000
batch_size = 32
nb_epoch = 5
epochs = 5
print('Loading data...')
(X_train, y_train), (X_test, y_test) = reuters.load_data(nb_words=max_words, test_split=0.2)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
(x_train, y_train), (x_test, y_test) = reuters.load_data(num_words=max_words,
test_split=0.2)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')
nb_classes = np.max(y_train)+1
print(nb_classes, 'classes')
num_classes = np.max(y_train) + 1
print(num_classes, 'classes')
print('Vectorizing sequence data...')
tokenizer = Tokenizer(nb_words=max_words)
X_train = tokenizer.sequences_to_matrix(X_train, mode='binary')
X_test = tokenizer.sequences_to_matrix(X_test, mode='binary')
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)
tokenizer = Tokenizer(num_words=max_words)
x_train = tokenizer.sequences_to_matrix(x_train, mode='binary')
x_test = tokenizer.sequences_to_matrix(x_test, mode='binary')
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)
print('Convert class vector to binary class matrix (for use with categorical_crossentropy)')
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
print('Y_train shape:', Y_train.shape)
print('Y_test shape:', Y_test.shape)
print('Convert class vector to binary class matrix '
'(for use with categorical_crossentropy)')
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
print('y_train shape:', y_train.shape)
print('y_test shape:', y_test.shape)
print('Building model...')
model = Sequential()
model.add(Dense(512, input_shape=(max_words,)))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Dense(num_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
history = model.fit(X_train, Y_train,
nb_epoch=nb_epoch, batch_size=batch_size,
history = model.fit(x_train, y_train,
epochs=epochs, batch_size=batch_size,
verbose=1, validation_split=0.1)
score = model.evaluate(X_test, Y_test,
score = model.evaluate(x_test, y_test,
batch_size=batch_size, verbose=1)
print('Test score:', score[0])
print('Test accuracy:', score[1])
+10 -12
Ver Arquivo
@@ -5,8 +5,7 @@ from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers.core import Dense
from keras.layers.recurrent import LSTM
from keras.layers import Dense, LSTM
# since we are using stateful rnn tsteps can be set to 1
@@ -17,7 +16,7 @@ epochs = 25
lahead = 1
def gen_cosine_amp(amp=100, period=25, x0=0, xn=50000, step=1, k=0.0001):
def gen_cosine_amp(amp=100, period=1000, x0=0, xn=50000, step=1, k=0.0001):
"""Generates an absolute cosine time series with the amplitude
exponentially decreasing
@@ -32,12 +31,12 @@ def gen_cosine_amp(amp=100, period=25, x0=0, xn=50000, step=1, k=0.0001):
cos = np.zeros(((xn - x0) * step, 1, 1))
for i in range(len(cos)):
idx = x0 + i * step
cos[i, 0, 0] = amp * np.cos(idx / (2 * np.pi * period))
cos[i, 0, 0] = amp * np.cos(2 * np.pi * idx / period)
cos[i, 0, 0] = cos[i, 0, 0] * np.exp(-k * idx)
return cos
print('Generating Data')
print('Generating Data...')
cos = gen_cosine_amp()
print('Input shape:', cos.shape)
@@ -45,17 +44,16 @@ expected_output = np.zeros((len(cos), 1))
for i in range(len(cos) - lahead):
expected_output[i, 0] = np.mean(cos[i + 1:i + lahead + 1])
print('Output shape')
print(expected_output.shape)
print('Output shape:', expected_output.shape)
print('Creating Model')
print('Creating Model...')
model = Sequential()
model.add(LSTM(50,
batch_input_shape=(batch_size, tsteps, 1),
input_shape=(tsteps, 1),
batch_size=batch_size,
return_sequences=True,
stateful=True))
model.add(LSTM(50,
batch_input_shape=(batch_size, tsteps, 1),
return_sequences=False,
stateful=True))
model.add(Dense(1))
@@ -68,14 +66,14 @@ for i in range(epochs):
expected_output,
batch_size=batch_size,
verbose=1,
nb_epoch=1,
epochs=1,
shuffle=False)
model.reset_states()
print('Predicting')
predicted_output = model.predict(cos, batch_size=batch_size)
print('Ploting Results')
print('Plotting Results')
plt.subplot(2, 1, 1)
plt.plot(expected_output)
plt.title('Expected')
+101
Ver Arquivo
@@ -0,0 +1,101 @@
'''This script demonstrates how to build a variational autoencoder with Keras.
Reference: "Auto-Encoding Variational Bayes" https://arxiv.org/abs/1312.6114
'''
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
from keras.layers import Input, Dense, Lambda
from keras.models import Model
from keras import backend as K
from keras import metrics
from keras.datasets import mnist
batch_size = 100
original_dim = 784
latent_dim = 2
intermediate_dim = 256
epochs = 50
epsilon_std = 1.0
x = Input(batch_shape=(batch_size, original_dim))
h = Dense(intermediate_dim, activation='relu')(x)
z_mean = Dense(latent_dim)(h)
z_log_var = Dense(latent_dim)(h)
def sampling(args):
z_mean, z_log_var = args
epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0.,
stddev=epsilon_std)
return z_mean + K.exp(z_log_var / 2) * epsilon
# note that "output_shape" isn't necessary with the TensorFlow backend
z = Lambda(sampling, output_shape=(latent_dim,))([z_mean, z_log_var])
# we instantiate these layers separately so as to reuse them later
decoder_h = Dense(intermediate_dim, activation='relu')
decoder_mean = Dense(original_dim, activation='sigmoid')
h_decoded = decoder_h(z)
x_decoded_mean = decoder_mean(h_decoded)
def vae_loss(x, x_decoded_mean):
xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
return xent_loss + kl_loss
vae = Model(x, x_decoded_mean)
vae.compile(optimizer='rmsprop', loss=vae_loss)
# train the VAE on MNIST digits
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
vae.fit(x_train, x_train,
shuffle=True,
epochs=epochs,
batch_size=batch_size,
validation_data=(x_test, x_test))
# build a model to project inputs on the latent space
encoder = Model(x, z_mean)
# display a 2D plot of the digit classes in the latent space
x_test_encoded = encoder.predict(x_test, batch_size=batch_size)
plt.figure(figsize=(6, 6))
plt.scatter(x_test_encoded[:, 0], x_test_encoded[:, 1], c=y_test)
plt.colorbar()
plt.show()
# build a digit generator that can sample from the learned distribution
decoder_input = Input(shape=(latent_dim,))
_h_decoded = decoder_h(decoder_input)
_x_decoded_mean = decoder_mean(_h_decoded)
generator = Model(decoder_input, _x_decoded_mean)
# display a 2D manifold of the digits
n = 15 # figure with 15x15 digits
digit_size = 28
figure = np.zeros((digit_size * n, digit_size * n))
# linearly spaced coordinates on the unit square were transformed through the inverse CDF (ppf) of the Gaussian
# to produce values of the latent variables z, since the prior of the latent space is Gaussian
grid_x = norm.ppf(np.linspace(0.05, 0.95, n))
grid_y = norm.ppf(np.linspace(0.05, 0.95, n))
for i, yi in enumerate(grid_x):
for j, xi in enumerate(grid_y):
z_sample = np.array([[xi, yi]])
x_decoded = generator.predict(z_sample)
digit = x_decoded[0].reshape(digit_size, digit_size)
figure[i * digit_size: (i + 1) * digit_size,
j * digit_size: (j + 1) * digit_size] = digit
plt.figure(figsize=(10, 10))
plt.imshow(figure, cmap='Greys_r')
plt.show()
+179
Ver Arquivo
@@ -0,0 +1,179 @@
'''This script demonstrates how to build a variational autoencoder
with Keras and deconvolution layers.
Reference: "Auto-Encoding Variational Bayes" https://arxiv.org/abs/1312.6114
'''
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
from keras.layers import Input, Dense, Lambda, Flatten, Reshape
from keras.layers import Conv2D, Conv2DTranspose
from keras.models import Model
from keras import backend as K
from keras import metrics
from keras.datasets import mnist
# input image dimensions
img_rows, img_cols, img_chns = 28, 28, 1
# number of convolutional filters to use
filters = 64
# convolution kernel size
num_conv = 3
batch_size = 100
if K.image_data_format() == 'channels_first':
original_img_size = (img_chns, img_rows, img_cols)
else:
original_img_size = (img_rows, img_cols, img_chns)
latent_dim = 2
intermediate_dim = 128
epsilon_std = 1.0
epochs = 5
x = Input(batch_shape=(batch_size,) + original_img_size)
conv_1 = Conv2D(img_chns,
kernel_size=(2, 2),
padding='same', activation='relu')(x)
conv_2 = Conv2D(filters,
kernel_size=(2, 2),
padding='same', activation='relu',
strides=(2, 2))(conv_1)
conv_3 = Conv2D(filters,
kernel_size=num_conv,
padding='same', activation='relu',
strides=1)(conv_2)
conv_4 = Conv2D(filters,
kernel_size=num_conv,
padding='same', activation='relu',
strides=1)(conv_3)
flat = Flatten()(conv_4)
hidden = Dense(intermediate_dim, activation='relu')(flat)
z_mean = Dense(latent_dim)(hidden)
z_log_var = Dense(latent_dim)(hidden)
def sampling(args):
z_mean, z_log_var = args
epsilon = K.random_normal(shape=(batch_size, latent_dim),
mean=0., stddev=epsilon_std)
return z_mean + K.exp(z_log_var) * epsilon
# note that "output_shape" isn't necessary with the TensorFlow backend
# so you could write `Lambda(sampling)([z_mean, z_log_var])`
z = Lambda(sampling, output_shape=(latent_dim,))([z_mean, z_log_var])
# we instantiate these layers separately so as to reuse them later
decoder_hid = Dense(intermediate_dim, activation='relu')
decoder_upsample = Dense(filters * 14 * 14, activation='relu')
if K.image_data_format() == 'channels_first':
output_shape = (batch_size, filters, 14, 14)
else:
output_shape = (batch_size, 14, 14, filters)
decoder_reshape = Reshape(output_shape[1:])
decoder_deconv_1 = Conv2DTranspose(filters,
kernel_size=num_conv,
padding='same',
strides=1,
activation='relu')
decoder_deconv_2 = Conv2DTranspose(filters, num_conv,
padding='same',
strides=1,
activation='relu')
if K.image_data_format() == 'channels_first':
output_shape = (batch_size, filters, 29, 29)
else:
output_shape = (batch_size, 29, 29, filters)
decoder_deconv_3_upsamp = Conv2DTranspose(filters,
kernel_size=(3, 3),
strides=(2, 2),
padding='valid',
activation='relu')
decoder_mean_squash = Conv2D(img_chns,
kernel_size=2,
padding='valid',
activation='sigmoid')
hid_decoded = decoder_hid(z)
up_decoded = decoder_upsample(hid_decoded)
reshape_decoded = decoder_reshape(up_decoded)
deconv_1_decoded = decoder_deconv_1(reshape_decoded)
deconv_2_decoded = decoder_deconv_2(deconv_1_decoded)
x_decoded_relu = decoder_deconv_3_upsamp(deconv_2_decoded)
x_decoded_mean_squash = decoder_mean_squash(x_decoded_relu)
def vae_loss(x, x_decoded_mean):
# NOTE: binary_crossentropy expects a batch_size by dim
# for x and x_decoded_mean, so we MUST flatten these!
x = K.flatten(x)
x_decoded_mean = K.flatten(x_decoded_mean)
xent_loss = img_rows * img_cols * metrics.binary_crossentropy(x, x_decoded_mean)
kl_loss = - 0.5 * K.mean(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
return xent_loss + kl_loss
vae = Model(x, x_decoded_mean_squash)
vae.compile(optimizer='rmsprop', loss=vae_loss)
vae.summary()
# train the VAE on MNIST digits
(x_train, _), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_train = x_train.reshape((x_train.shape[0],) + original_img_size)
x_test = x_test.astype('float32') / 255.
x_test = x_test.reshape((x_test.shape[0],) + original_img_size)
print('x_train.shape:', x_train.shape)
vae.fit(x_train, x_train,
shuffle=True,
epochs=epochs,
batch_size=batch_size,
validation_data=(x_test, x_test))
# build a model to project inputs on the latent space
encoder = Model(x, z_mean)
# display a 2D plot of the digit classes in the latent space
x_test_encoded = encoder.predict(x_test, batch_size=batch_size)
plt.figure(figsize=(6, 6))
plt.scatter(x_test_encoded[:, 0], x_test_encoded[:, 1], c=y_test)
plt.colorbar()
plt.show()
# build a digit generator that can sample from the learned distribution
decoder_input = Input(shape=(latent_dim,))
_hid_decoded = decoder_hid(decoder_input)
_up_decoded = decoder_upsample(_hid_decoded)
_reshape_decoded = decoder_reshape(_up_decoded)
_deconv_1_decoded = decoder_deconv_1(_reshape_decoded)
_deconv_2_decoded = decoder_deconv_2(_deconv_1_decoded)
_x_decoded_relu = decoder_deconv_3_upsamp(_deconv_2_decoded)
_x_decoded_mean_squash = decoder_mean_squash(_x_decoded_relu)
generator = Model(decoder_input, _x_decoded_mean_squash)
# display a 2D manifold of the digits
n = 15 # figure with 15x15 digits
digit_size = 28
figure = np.zeros((digit_size * n, digit_size * n))
# linearly spaced coordinates on the unit square were transformed through the inverse CDF (ppf) of the Gaussian
# to produce values of the latent variables z, since the prior of the latent space is Gaussian
grid_x = norm.ppf(np.linspace(0.05, 0.95, n))
grid_y = norm.ppf(np.linspace(0.05, 0.95, n))
for i, yi in enumerate(grid_x):
for j, xi in enumerate(grid_y):
z_sample = np.array([[xi, yi]])
z_sample = np.tile(z_sample, batch_size).reshape(batch_size, 2)
x_decoded = generator.predict(z_sample, batch_size=batch_size)
digit = x_decoded[0].reshape(digit_size, digit_size)
figure[i * digit_size: (i + 1) * digit_size,
j * digit_size: (j + 1) * digit_size] = digit
plt.figure(figsize=(10, 10))
plt.imshow(figure, cmap='Greys_r')
plt.show()
+21 -1
Ver Arquivo
@@ -1 +1,21 @@
__version__ = '1.0.0'
from __future__ import absolute_import
from . import activations
from . import applications
from . import backend
from . import datasets
from . import engine
from . import layers
from . import preprocessing
from . import utils
from . import wrappers
from . import callbacks
from . import constraints
from . import initializers
from . import metrics
from . import models
from . import losses
from . import optimizers
from . import regularizers
__version__ = '2.0.0'
+34 -7
Ver Arquivo
@@ -1,5 +1,7 @@
from __future__ import absolute_import
import six
from . import backend as K
from .utils.generic_utils import deserialize_keras_object
def softmax(x):
@@ -11,14 +13,23 @@ def softmax(x):
s = K.sum(e, axis=-1, keepdims=True)
return e / s
else:
raise Exception('Cannot apply softmax to a tensor that is not 2D or 3D. ' +
'Here, ndim=' + str(ndim))
raise ValueError('Cannot apply softmax to a tensor '
'that is not 2D or 3D. '
'Here, ndim=' + str(ndim))
def elu(x, alpha=1.0):
return K.elu(x, alpha)
def softplus(x):
return K.softplus(x)
def softsign(x):
return K.softsign(x)
def relu(x, alpha=0., max_value=None):
return K.relu(x, alpha=alpha, max_value=max_value)
@@ -36,12 +47,28 @@ def hard_sigmoid(x):
def linear(x):
'''
The function returns the variable that is passed in, so all types work.
'''
return x
from .utils.generic_utils import get_from_module
def serialize(activation):
return activation.__name__
def deserialize(name, custom_objects=None):
return deserialize_keras_object(name,
module_objects=globals(),
custom_objects=custom_objects,
printable_module_name='activation function')
def get(identifier):
return get_from_module(identifier, globals(), 'activation function')
if identifier is None:
return linear
if isinstance(identifier, six.string_types):
identifier = str(identifier)
return deserialize(identifier)
elif callable(identifier):
return identifier
else:
raise ValueError('Could not interpret '
'activation function identifier:', identifier)
+5
Ver Arquivo
@@ -0,0 +1,5 @@
from .vgg16 import VGG16
from .vgg19 import VGG19
from .resnet50 import ResNet50
from .inception_v3 import InceptionV3
from .xception import Xception
+138
Ver Arquivo
@@ -0,0 +1,138 @@
import json
from ..utils.data_utils import get_file
from .. import backend as K
CLASS_INDEX = None
CLASS_INDEX_PATH = 'https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json'
def preprocess_input(x, data_format=None):
"""Preprocesses a tensor encoding a batch of images.
# Arguments
x: input Numpy tensor, 4D.
data_format: data format of the image tensor.
# Returns
Preprocessed tensor.
"""
if data_format is None:
data_format = K.image_data_format()
assert data_format in {'channels_last', 'channels_first'}
if data_format == 'channels_first':
# 'RGB'->'BGR'
x = x[:, ::-1, :, :]
# Zero-center by mean pixel
x[:, 0, :, :] -= 103.939
x[:, 1, :, :] -= 116.779
x[:, 2, :, :] -= 123.68
else:
# 'RGB'->'BGR'
x = x[:, :, :, ::-1]
# Zero-center by mean pixel
x[:, :, :, 0] -= 103.939
x[:, :, :, 1] -= 116.779
x[:, :, :, 2] -= 123.68
return x
def decode_predictions(preds, top=5):
"""Decodes the prediction of an ImageNet model.
# Arguments
preds: Numpy tensor encoding a batch of predictions.
top: integer, how many top-guesses to return.
# Returns
A list of lists of top class prediction tuples
`(class_name, class_description, score)`.
One list of tuples per sample in batch input.
# Raises
ValueError: in case of invalid shape of the `pred` array
(must be 2D).
"""
global CLASS_INDEX
if len(preds.shape) != 2 or preds.shape[1] != 1000:
raise ValueError('`decode_predictions` expects '
'a batch of predictions '
'(i.e. a 2D array of shape (samples, 1000)). '
'Found array with shape: ' + str(preds.shape))
if CLASS_INDEX is None:
fpath = get_file('imagenet_class_index.json',
CLASS_INDEX_PATH,
cache_subdir='models')
CLASS_INDEX = json.load(open(fpath))
results = []
for pred in preds:
top_indices = pred.argsort()[-top:][::-1]
result = [tuple(CLASS_INDEX[str(i)]) + (pred[i],) for i in top_indices]
result.sort(key=lambda x: x[2], reverse=True)
results.append(result)
return results
def _obtain_input_shape(input_shape,
default_size,
min_size,
data_format,
include_top):
"""Internal utility to compute/validate an ImageNet model's input shape.
# Arguments
input_shape: either None (will return the default network input shape),
or a user-provided shape to be validated.
default_size: default input width/height for the model.
min_size: minimum input width/height accepted by the model.
data_format: image data format to use.
include_top: whether the model is expected to
be linked to a classifier via a Flatten layer.
# Returns
An integer shape tuple (may include None entries).
# Raises
ValueError: in case of invalid argument values.
"""
if data_format == 'channels_first':
default_shape = (3, default_size, default_size)
else:
default_shape = (default_size, default_size, 3)
if include_top:
if input_shape is not None:
if input_shape != default_shape:
raise ValueError('When setting`include_top=True`, '
'`input_shape` should be ' + str(default_shape) + '.')
input_shape = default_shape
else:
if data_format == 'channels_first':
if input_shape is not None:
if len(input_shape) != 3:
raise ValueError('`input_shape` must be a tuple of three integers.')
if input_shape[0] != 3:
raise ValueError('The input must have 3 channels; got '
'`input_shape=' + str(input_shape) + '`')
if ((input_shape[1] is not None and input_shape[1] < min_size) or
(input_shape[2] is not None and input_shape[2] < min_size)):
raise ValueError('Input size must be at least ' +
str(min_size) + 'x' + str(min_size) + ', got '
'`input_shape=' + str(input_shape) + '`')
else:
input_shape = (3, None, None)
else:
if input_shape is not None:
if len(input_shape) != 3:
raise ValueError('`input_shape` must be a tuple of three integers.')
if input_shape[-1] != 3:
raise ValueError('The input must have 3 channels; got '
'`input_shape=' + str(input_shape) + '`')
if ((input_shape[0] is not None and input_shape[0] < min_size) or
(input_shape[1] is not None and input_shape[1] < min_size)):
raise ValueError('Input size must be at least ' +
str(min_size) + 'x' + str(min_size) + ', got '
'`input_shape=' + str(input_shape) + '`')
else:
input_shape = (None, None, 3)
return input_shape
+393
Ver Arquivo
@@ -0,0 +1,393 @@
# -*- coding: utf-8 -*-
"""Inception V3 model for Keras.
Note that the input image format for this model is different than for
the VGG16 and ResNet models (299x299 instead of 224x224),
and that the input preprocessing function is also different (same as Xception).
# Reference
- [Rethinking the Inception Architecture for Computer Vision](http://arxiv.org/abs/1512.00567)
"""
from __future__ import print_function
from __future__ import absolute_import
import warnings
from ..models import Model
from .. import layers
from ..layers import Activation
from ..layers import Dense
from ..layers import Input
from ..layers import BatchNormalization
from ..layers import Conv2D
from ..layers import MaxPooling2D
from ..layers import AveragePooling2D
from ..layers import GlobalAveragePooling2D
from ..layers import GlobalMaxPooling2D
from ..engine.topology import get_source_inputs
from ..utils.layer_utils import convert_all_kernels_in_model
from ..utils.data_utils import get_file
from .. import backend as K
from .imagenet_utils import decode_predictions
from .imagenet_utils import _obtain_input_shape
WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.5/inception_v3_weights_tf_dim_ordering_tf_kernels.h5'
WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.5/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5'
def conv2d_bn(x,
filters,
num_row,
num_col,
padding='same',
strides=(1, 1),
name=None):
"""Utility function to apply conv + BN.
# Arguments
x: input tensor.
filters: filters in `Conv2D`.
num_row: height of the convolution kernel.
num_col: width of the convolution kernel.
padding: padding mode in `Conv2D`.
strides: strides in `Conv2D`.
name: name of the ops; will become `name + '_conv'`
for the convolution and `name + '_bn'` for the
batch norm layer.
# Returns
Output tensor after applying `Conv2D` and `BatchNormalization`.
"""
if name is not None:
bn_name = name + '_bn'
conv_name = name + '_conv'
else:
bn_name = None
conv_name = None
if K.image_data_format() == 'channels_first':
bn_axis = 1
else:
bn_axis = 3
x = Conv2D(
filters, (num_row, num_col),
strides=strides,
padding=padding,
use_bias=False,
name=conv_name)(x)
x = BatchNormalization(axis=bn_axis, scale=False, name=bn_name)(x)
x = Activation('relu', name=name)(x)
return x
def InceptionV3(include_top=True,
weights='imagenet',
input_tensor=None,
input_shape=None,
pooling=None,
classes=1000):
"""Instantiates the Inception v3 architecture.
Optionally loads weights pre-trained
on ImageNet. Note that when using TensorFlow,
for best performance you should set
`image_data_format="channels_last"` in your Keras config
at ~/.keras/keras.json.
The model and the weights are compatible with both
TensorFlow and Theano. The data format
convention used by the model is the one
specified in your Keras config file.
Note that the default input image size for this model is 299x299.
# Arguments
include_top: whether to include the fully-connected
layer at the top of the network.
weights: one of `None` (random initialization)
or "imagenet" (pre-training on ImageNet).
input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
to use as image input for the model.
input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(299, 299, 3)` (with `channels_last` data format)
or `(3, 299, 299)` (with `channels_first` data format).
It should have exactly 3 inputs channels,
and width and height should be no smaller than 139.
E.g. `(150, 150, 3)` would be one valid value.
pooling: Optional pooling mode for feature extraction
when `include_top` is `False`.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
be applied.
classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
if no `weights` argument is specified.
# Returns
A Keras model instance.
# Raises
ValueError: in case of invalid argument for `weights`,
or invalid input shape.
"""
if weights not in {'imagenet', None}:
raise ValueError('The `weights` argument should be either '
'`None` (random initialization) or `imagenet` '
'(pre-training on ImageNet).')
if weights == 'imagenet' and include_top and classes != 1000:
raise ValueError('If using `weights` as imagenet with `include_top`'
' as true, `classes` should be 1000')
# Determine proper input shape
input_shape = _obtain_input_shape(
input_shape,
default_size=299,
min_size=139,
data_format=K.image_data_format(),
include_top=include_top)
if input_tensor is None:
img_input = Input(shape=input_shape)
else:
img_input = Input(tensor=input_tensor, shape=input_shape)
if K.image_data_format() == 'channels_first':
channel_axis = 1
else:
channel_axis = 3
x = conv2d_bn(img_input, 32, 3, 3, strides=(2, 2), padding='valid')
x = conv2d_bn(x, 32, 3, 3, padding='valid')
x = conv2d_bn(x, 64, 3, 3)
x = MaxPooling2D((3, 3), strides=(2, 2))(x)
x = conv2d_bn(x, 80, 1, 1, padding='valid')
x = conv2d_bn(x, 192, 3, 3, padding='valid')
x = MaxPooling2D((3, 3), strides=(2, 2))(x)
# mixed 0, 1, 2: 35 x 35 x 256
branch1x1 = conv2d_bn(x, 64, 1, 1)
branch5x5 = conv2d_bn(x, 48, 1, 1)
branch5x5 = conv2d_bn(branch5x5, 64, 5, 5)
branch3x3dbl = conv2d_bn(x, 64, 1, 1)
branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
branch_pool = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(x)
branch_pool = conv2d_bn(branch_pool, 32, 1, 1)
x = layers.concatenate(
[branch1x1, branch5x5, branch3x3dbl, branch_pool],
axis=channel_axis,
name='mixed0')
# mixed 1: 35 x 35 x 256
branch1x1 = conv2d_bn(x, 64, 1, 1)
branch5x5 = conv2d_bn(x, 48, 1, 1)
branch5x5 = conv2d_bn(branch5x5, 64, 5, 5)
branch3x3dbl = conv2d_bn(x, 64, 1, 1)
branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
branch_pool = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(x)
branch_pool = conv2d_bn(branch_pool, 64, 1, 1)
x = layers.concatenate(
[branch1x1, branch5x5, branch3x3dbl, branch_pool],
axis=channel_axis,
name='mixed1')
# mixed 2: 35 x 35 x 256
branch1x1 = conv2d_bn(x, 64, 1, 1)
branch5x5 = conv2d_bn(x, 48, 1, 1)
branch5x5 = conv2d_bn(branch5x5, 64, 5, 5)
branch3x3dbl = conv2d_bn(x, 64, 1, 1)
branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
branch_pool = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(x)
branch_pool = conv2d_bn(branch_pool, 64, 1, 1)
x = layers.concatenate(
[branch1x1, branch5x5, branch3x3dbl, branch_pool],
axis=channel_axis,
name='mixed2')
# mixed 3: 17 x 17 x 768
branch3x3 = conv2d_bn(x, 384, 3, 3, strides=(2, 2), padding='valid')
branch3x3dbl = conv2d_bn(x, 64, 1, 1)
branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
branch3x3dbl = conv2d_bn(
branch3x3dbl, 96, 3, 3, strides=(2, 2), padding='valid')
branch_pool = MaxPooling2D((3, 3), strides=(2, 2))(x)
x = layers.concatenate(
[branch3x3, branch3x3dbl, branch_pool], axis=channel_axis, name='mixed3')
# mixed 4: 17 x 17 x 768
branch1x1 = conv2d_bn(x, 192, 1, 1)
branch7x7 = conv2d_bn(x, 128, 1, 1)
branch7x7 = conv2d_bn(branch7x7, 128, 1, 7)
branch7x7 = conv2d_bn(branch7x7, 192, 7, 1)
branch7x7dbl = conv2d_bn(x, 128, 1, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 128, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 128, 1, 7)
branch7x7dbl = conv2d_bn(branch7x7dbl, 128, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 1, 7)
branch_pool = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(x)
branch_pool = conv2d_bn(branch_pool, 192, 1, 1)
x = layers.concatenate(
[branch1x1, branch7x7, branch7x7dbl, branch_pool],
axis=channel_axis,
name='mixed4')
# mixed 5, 6: 17 x 17 x 768
for i in range(2):
branch1x1 = conv2d_bn(x, 192, 1, 1)
branch7x7 = conv2d_bn(x, 160, 1, 1)
branch7x7 = conv2d_bn(branch7x7, 160, 1, 7)
branch7x7 = conv2d_bn(branch7x7, 192, 7, 1)
branch7x7dbl = conv2d_bn(x, 160, 1, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 160, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 160, 1, 7)
branch7x7dbl = conv2d_bn(branch7x7dbl, 160, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 1, 7)
branch_pool = AveragePooling2D(
(3, 3), strides=(1, 1), padding='same')(x)
branch_pool = conv2d_bn(branch_pool, 192, 1, 1)
x = layers.concatenate(
[branch1x1, branch7x7, branch7x7dbl, branch_pool],
axis=channel_axis,
name='mixed' + str(5 + i))
# mixed 7: 17 x 17 x 768
branch1x1 = conv2d_bn(x, 192, 1, 1)
branch7x7 = conv2d_bn(x, 192, 1, 1)
branch7x7 = conv2d_bn(branch7x7, 192, 1, 7)
branch7x7 = conv2d_bn(branch7x7, 192, 7, 1)
branch7x7dbl = conv2d_bn(x, 192, 1, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 1, 7)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 7, 1)
branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 1, 7)
branch_pool = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(x)
branch_pool = conv2d_bn(branch_pool, 192, 1, 1)
x = layers.concatenate(
[branch1x1, branch7x7, branch7x7dbl, branch_pool],
axis=channel_axis,
name='mixed7')
# mixed 8: 8 x 8 x 1280
branch3x3 = conv2d_bn(x, 192, 1, 1)
branch3x3 = conv2d_bn(branch3x3, 320, 3, 3,
strides=(2, 2), padding='valid')
branch7x7x3 = conv2d_bn(x, 192, 1, 1)
branch7x7x3 = conv2d_bn(branch7x7x3, 192, 1, 7)
branch7x7x3 = conv2d_bn(branch7x7x3, 192, 7, 1)
branch7x7x3 = conv2d_bn(
branch7x7x3, 192, 3, 3, strides=(2, 2), padding='valid')
branch_pool = MaxPooling2D((3, 3), strides=(2, 2))(x)
x = layers.concatenate(
[branch3x3, branch7x7x3, branch_pool], axis=channel_axis, name='mixed8')
# mixed 9: 8 x 8 x 2048
for i in range(2):
branch1x1 = conv2d_bn(x, 320, 1, 1)
branch3x3 = conv2d_bn(x, 384, 1, 1)
branch3x3_1 = conv2d_bn(branch3x3, 384, 1, 3)
branch3x3_2 = conv2d_bn(branch3x3, 384, 3, 1)
branch3x3 = layers.concatenate(
[branch3x3_1, branch3x3_2], axis=channel_axis, name='mixed9_' + str(i))
branch3x3dbl = conv2d_bn(x, 448, 1, 1)
branch3x3dbl = conv2d_bn(branch3x3dbl, 384, 3, 3)
branch3x3dbl_1 = conv2d_bn(branch3x3dbl, 384, 1, 3)
branch3x3dbl_2 = conv2d_bn(branch3x3dbl, 384, 3, 1)
branch3x3dbl = layers.concatenate(
[branch3x3dbl_1, branch3x3dbl_2], axis=channel_axis)
branch_pool = AveragePooling2D(
(3, 3), strides=(1, 1), padding='same')(x)
branch_pool = conv2d_bn(branch_pool, 192, 1, 1)
x = layers.concatenate(
[branch1x1, branch3x3, branch3x3dbl, branch_pool],
axis=channel_axis,
name='mixed' + str(9 + i))
if include_top:
# Classification block
x = GlobalAveragePooling2D(name='avg_pool')(x)
x = Dense(classes, activation='softmax', name='predictions')(x)
else:
if pooling == 'avg':
x = GlobalAveragePooling2D()(x)
elif pooling == 'max':
x = GlobalMaxPooling2D()(x)
# Ensure that the model takes into account
# any potential predecessors of `input_tensor`.
if input_tensor is not None:
inputs = get_source_inputs(input_tensor)
else:
inputs = img_input
# Create model.
model = Model(inputs, x, name='inception_v3')
# load weights
if weights == 'imagenet':
if K.image_data_format() == 'channels_first':
if K.backend() == 'tensorflow':
warnings.warn('You are using the TensorFlow backend, yet you '
'are using the Theano '
'image data format convention '
'(`image_data_format="channels_first"`). '
'For best performance, set '
'`image_data_format="channels_last"` in '
'your Keras config '
'at ~/.keras/keras.json.')
if include_top:
weights_path = get_file(
'inception_v3_weights_tf_dim_ordering_tf_kernels.h5',
WEIGHTS_PATH,
cache_subdir='models',
md5_hash='9a0d58056eeedaa3f26cb7ebd46da564')
else:
weights_path = get_file(
'inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5',
WEIGHTS_PATH_NO_TOP,
cache_subdir='models',
md5_hash='bcbd6486424b2319ff4ef7d526e38f63')
model.load_weights(weights_path)
if K.backend() == 'theano':
convert_all_kernels_in_model(model)
return model
def preprocess_input(x):
x /= 255.
x -= 0.5
x *= 2.
return x
+284
Ver Arquivo
@@ -0,0 +1,284 @@
# -*- coding: utf-8 -*-
"""ResNet50 model for Keras.
# Reference:
- [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)
Adapted from code contributed by BigMoyan.
"""
from __future__ import print_function
from __future__ import absolute_import
import warnings
from ..layers import Input
from .. import layers
from ..layers import Dense
from ..layers import Activation
from ..layers import Flatten
from ..layers import Conv2D
from ..layers import MaxPooling2D
from ..layers import ZeroPadding2D
from ..layers import AveragePooling2D
from ..layers import GlobalAveragePooling2D
from ..layers import GlobalMaxPooling2D
from ..layers import BatchNormalization
from ..models import Model
from .. import backend as K
from ..engine.topology import get_source_inputs
from ..utils import layer_utils
from ..utils.data_utils import get_file
from .imagenet_utils import decode_predictions
from .imagenet_utils import preprocess_input
from .imagenet_utils import _obtain_input_shape
WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels.h5'
WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5'
def identity_block(input_tensor, kernel_size, filters, stage, block):
"""The identity block is the block that has no conv layer at shortcut.
# Arguments
input_tensor: input tensor
kernel_size: defualt 3, the kernel size of middle conv layer at main path
filters: list of integers, the filterss of 3 conv layer at main path
stage: integer, current stage label, used for generating layer names
block: 'a','b'..., current block label, used for generating layer names
# Returns
Output tensor for the block.
"""
filters1, filters2, filters3 = filters
if K.image_data_format() == 'channels_last':
bn_axis = 3
else:
bn_axis = 1
conv_name_base = 'res' + str(stage) + block + '_branch'
bn_name_base = 'bn' + str(stage) + block + '_branch'
x = Conv2D(filters1, (1, 1), name=conv_name_base + '2a')(input_tensor)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2a')(x)
x = Activation('relu')(x)
x = Conv2D(filters2, kernel_size,
padding='same', name=conv_name_base + '2b')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2b')(x)
x = Activation('relu')(x)
x = Conv2D(filters3, (1, 1), name=conv_name_base + '2c')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2c')(x)
x = layers.add([x, input_tensor])
x = Activation('relu')(x)
return x
def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
"""conv_block is the block that has a conv layer at shortcut
# Arguments
input_tensor: input tensor
kernel_size: defualt 3, the kernel size of middle conv layer at main path
filters: list of integers, the filterss of 3 conv layer at main path
stage: integer, current stage label, used for generating layer names
block: 'a','b'..., current block label, used for generating layer names
# Returns
Output tensor for the block.
Note that from stage 3, the first conv layer at main path is with strides=(2,2)
And the shortcut should have strides=(2,2) as well
"""
filters1, filters2, filters3 = filters
if K.image_data_format() == 'channels_last':
bn_axis = 3
else:
bn_axis = 1
conv_name_base = 'res' + str(stage) + block + '_branch'
bn_name_base = 'bn' + str(stage) + block + '_branch'
x = Conv2D(filters1, (1, 1), strides=strides,
name=conv_name_base + '2a')(input_tensor)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2a')(x)
x = Activation('relu')(x)
x = Conv2D(filters2, kernel_size, padding='same',
name=conv_name_base + '2b')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2b')(x)
x = Activation('relu')(x)
x = Conv2D(filters3, (1, 1), name=conv_name_base + '2c')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2c')(x)
shortcut = Conv2D(filters3, (1, 1), strides=strides,
name=conv_name_base + '1')(input_tensor)
shortcut = BatchNormalization(axis=bn_axis, name=bn_name_base + '1')(shortcut)
x = layers.add([x, shortcut])
x = Activation('relu')(x)
return x
def ResNet50(include_top=True, weights='imagenet',
input_tensor=None, input_shape=None,
pooling=None,
classes=1000):
"""Instantiates the ResNet50 architecture.
Optionally loads weights pre-trained
on ImageNet. Note that when using TensorFlow,
for best performance you should set
`image_data_format="channels_last"` in your Keras config
at ~/.keras/keras.json.
The model and the weights are compatible with both
TensorFlow and Theano. The data format
convention used by the model is the one
specified in your Keras config file.
# Arguments
include_top: whether to include the 3 fully-connected
layers at the top of the network.
weights: one of `None` (random initialization)
or "imagenet" (pre-training on ImageNet).
input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
to use as image input for the model.
input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(224, 224, 3)` (with `channels_last` data format)
or `(3, 224, 244)` (with `channels_first` data format).
It should have exactly 3 inputs channels,
and width and height should be no smaller than 197.
E.g. `(200, 200, 3)` would be one valid value.
pooling: Optional pooling mode for feature extraction
when `include_top` is `False`.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
be applied.
classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
if no `weights` argument is specified.
# Returns
A Keras model instance.
# Raises
ValueError: in case of invalid argument for `weights`,
or invalid input shape.
"""
if weights not in {'imagenet', None}:
raise ValueError('The `weights` argument should be either '
'`None` (random initialization) or `imagenet` '
'(pre-training on ImageNet).')
if weights == 'imagenet' and include_top and classes != 1000:
raise ValueError('If using `weights` as imagenet with `include_top`'
' as true, `classes` should be 1000')
# Determine proper input shape
input_shape = _obtain_input_shape(input_shape,
default_size=224,
min_size=197,
data_format=K.image_data_format(),
include_top=include_top)
if input_tensor is None:
img_input = Input(shape=input_shape)
else:
if not K.is_keras_tensor(input_tensor):
img_input = Input(tensor=input_tensor, shape=input_shape)
else:
img_input = input_tensor
if K.image_data_format() == 'channels_last':
bn_axis = 3
else:
bn_axis = 1
x = ZeroPadding2D((3, 3))(img_input)
x = Conv2D(64, (7, 7), strides=(2, 2), name='conv1')(x)
x = BatchNormalization(axis=bn_axis, name='bn_conv1')(x)
x = Activation('relu')(x)
x = MaxPooling2D((3, 3), strides=(2, 2))(x)
x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')
x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
x = identity_block(x, 3, [128, 128, 512], stage=3, block='b')
x = identity_block(x, 3, [128, 128, 512], stage=3, block='c')
x = identity_block(x, 3, [128, 128, 512], stage=3, block='d')
x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b')
x = identity_block(x, 3, [256, 256, 1024], stage=4, block='c')
x = identity_block(x, 3, [256, 256, 1024], stage=4, block='d')
x = identity_block(x, 3, [256, 256, 1024], stage=4, block='e')
x = identity_block(x, 3, [256, 256, 1024], stage=4, block='f')
x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')
x = AveragePooling2D((7, 7), name='avg_pool')(x)
if include_top:
x = Flatten()(x)
x = Dense(classes, activation='softmax', name='fc1000')(x)
else:
if pooling == 'avg':
x = GlobalAveragePooling2D()(x)
elif pooling == 'max':
x = GlobalMaxPooling2D()(x)
# Ensure that the model takes into account
# any potential predecessors of `input_tensor`.
if input_tensor is not None:
inputs = get_source_inputs(input_tensor)
else:
inputs = img_input
# Create model.
model = Model(inputs, x, name='resnet50')
# load weights
if weights == 'imagenet':
if include_top:
weights_path = get_file('resnet50_weights_tf_dim_ordering_tf_kernels.h5',
WEIGHTS_PATH,
cache_subdir='models',
md5_hash='a7b3fe01876f51b976af0dea6bc144eb')
else:
weights_path = get_file('resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5',
WEIGHTS_PATH_NO_TOP,
cache_subdir='models',
md5_hash='a268eb855778b3df3c7506639542a6af')
model.load_weights(weights_path)
if K.backend() == 'theano':
layer_utils.convert_all_kernels_in_model(model)
if K.image_data_format() == 'channels_first':
if include_top:
maxpool = model.get_layer(name='avg_pool')
shape = maxpool.output_shape[1:]
dense = model.get_layer(name='fc1000')
layer_utils.convert_dense_weights_data_format(dense, shape, 'channels_first')
if K.backend() == 'tensorflow':
warnings.warn('You are using the TensorFlow backend, yet you '
'are using the Theano '
'image data format convention '
'(`image_data_format="channels_first"`). '
'For best performance, set '
'`image_data_format="channels_last"` in '
'your Keras config '
'at ~/.keras/keras.json.')
return model
+189
Ver Arquivo
@@ -0,0 +1,189 @@
# -*- coding: utf-8 -*-
"""VGG16 model for Keras.
# Reference
- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)
"""
from __future__ import print_function
from __future__ import absolute_import
import warnings
from ..models import Model
from ..layers import Flatten
from ..layers import Dense
from ..layers import Input
from ..layers import Conv2D
from ..layers import MaxPooling2D
from ..layers import GlobalAveragePooling2D
from ..layers import GlobalMaxPooling2D
from ..engine.topology import get_source_inputs
from ..utils import layer_utils
from ..utils.data_utils import get_file
from .. import backend as K
from .imagenet_utils import decode_predictions
from .imagenet_utils import preprocess_input
from .imagenet_utils import _obtain_input_shape
WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'
def VGG16(include_top=True, weights='imagenet',
input_tensor=None, input_shape=None,
pooling=None,
classes=1000):
"""Instantiates the VGG16 architecture.
Optionally loads weights pre-trained
on ImageNet. Note that when using TensorFlow,
for best performance you should set
`image_data_format="channels_last"` in your Keras config
at ~/.keras/keras.json.
The model and the weights are compatible with both
TensorFlow and Theano. The data format
convention used by the model is the one
specified in your Keras config file.
# Arguments
include_top: whether to include the 3 fully-connected
layers at the top of the network.
weights: one of `None` (random initialization)
or "imagenet" (pre-training on ImageNet).
input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
to use as image input for the model.
input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(224, 224, 3)` (with `channels_last` data format)
or `(3, 224, 244)` (with `channels_first` data format).
It should have exactly 3 inputs channels,
and width and height should be no smaller than 48.
E.g. `(200, 200, 3)` would be one valid value.
pooling: Optional pooling mode for feature extraction
when `include_top` is `False`.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
be applied.
classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
if no `weights` argument is specified.
# Returns
A Keras model instance.
# Raises
ValueError: in case of invalid argument for `weights`,
or invalid input shape.
"""
if weights not in {'imagenet', None}:
raise ValueError('The `weights` argument should be either '
'`None` (random initialization) or `imagenet` '
'(pre-training on ImageNet).')
if weights == 'imagenet' and include_top and classes != 1000:
raise ValueError('If using `weights` as imagenet with `include_top`'
' as true, `classes` should be 1000')
# Determine proper input shape
input_shape = _obtain_input_shape(input_shape,
default_size=224,
min_size=48,
data_format=K.image_data_format(),
include_top=include_top)
if input_tensor is None:
img_input = Input(shape=input_shape)
else:
if not K.is_keras_tensor(input_tensor):
img_input = Input(tensor=input_tensor, shape=input_shape)
else:
img_input = input_tensor
# Block 1
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
# Block 2
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)
# Block 3
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)
# Block 4
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)
# Block 5
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
if include_top:
# Classification block
x = Flatten(name='flatten')(x)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(classes, activation='softmax', name='predictions')(x)
else:
if pooling == 'avg':
x = GlobalAveragePooling2D()(x)
elif pooling == 'max':
x = GlobalMaxPooling2D()(x)
# Ensure that the model takes into account
# any potential predecessors of `input_tensor`.
if input_tensor is not None:
inputs = get_source_inputs(input_tensor)
else:
inputs = img_input
# Create model.
model = Model(inputs, x, name='vgg16')
# load weights
if weights == 'imagenet':
if include_top:
weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5',
WEIGHTS_PATH,
cache_subdir='models')
else:
weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',
WEIGHTS_PATH_NO_TOP,
cache_subdir='models')
model.load_weights(weights_path)
if K.backend() == 'theano':
layer_utils.convert_all_kernels_in_model(model)
if K.image_data_format() == 'channels_first':
if include_top:
maxpool = model.get_layer(name='block5_pool')
shape = maxpool.output_shape[1:]
dense = model.get_layer(name='fc1')
layer_utils.convert_dense_weights_data_format(dense, shape, 'channels_first')
if K.backend() == 'tensorflow':
warnings.warn('You are using the TensorFlow backend, yet you '
'are using the Theano '
'image data format convention '
'(`image_data_format="channels_first"`). '
'For best performance, set '
'`image_data_format="channels_last"` in '
'your Keras config '
'at ~/.keras/keras.json.')
return model
+192
Ver Arquivo
@@ -0,0 +1,192 @@
# -*- coding: utf-8 -*-
"""VGG19 model for Keras.
# Reference
- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)
"""
from __future__ import print_function
from __future__ import absolute_import
import warnings
from ..models import Model
from ..layers import Flatten
from ..layers import Dense
from ..layers import Input
from ..layers import Conv2D
from ..layers import MaxPooling2D
from ..layers import GlobalAveragePooling2D
from ..layers import GlobalMaxPooling2D
from ..engine.topology import get_source_inputs
from ..utils import layer_utils
from ..utils.data_utils import get_file
from .. import backend as K
from .imagenet_utils import decode_predictions
from .imagenet_utils import preprocess_input
from .imagenet_utils import _obtain_input_shape
WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_tf_dim_ordering_tf_kernels.h5'
WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5'
def VGG19(include_top=True, weights='imagenet',
input_tensor=None, input_shape=None,
pooling=None,
classes=1000):
"""Instantiates the VGG19 architecture.
Optionally loads weights pre-trained
on ImageNet. Note that when using TensorFlow,
for best performance you should set
`image_data_format="channels_last"` in your Keras config
at ~/.keras/keras.json.
The model and the weights are compatible with both
TensorFlow and Theano. The data format
convention used by the model is the one
specified in your Keras config file.
# Arguments
include_top: whether to include the 3 fully-connected
layers at the top of the network.
weights: one of `None` (random initialization)
or "imagenet" (pre-training on ImageNet).
input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
to use as image input for the model.
input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(224, 224, 3)` (with `channels_last` data format)
or `(3, 224, 244)` (with `channels_first` data format).
It should have exactly 3 inputs channels,
and width and height should be no smaller than 48.
E.g. `(200, 200, 3)` would be one valid value.
pooling: Optional pooling mode for feature extraction
when `include_top` is `False`.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
be applied.
classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
if no `weights` argument is specified.
# Returns
A Keras model instance.
# Raises
ValueError: in case of invalid argument for `weights`,
or invalid input shape.
"""
if weights not in {'imagenet', None}:
raise ValueError('The `weights` argument should be either '
'`None` (random initialization) or `imagenet` '
'(pre-training on ImageNet).')
if weights == 'imagenet' and include_top and classes != 1000:
raise ValueError('If using `weights` as imagenet with `include_top`'
' as true, `classes` should be 1000')
# Determine proper input shape
input_shape = _obtain_input_shape(input_shape,
default_size=224,
min_size=48,
data_format=K.image_data_format(),
include_top=include_top)
if input_tensor is None:
img_input = Input(shape=input_shape)
else:
if not K.is_keras_tensor(input_tensor):
img_input = Input(tensor=input_tensor, shape=input_shape)
else:
img_input = input_tensor
# Block 1
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
# Block 2
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)
# Block 3
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv4')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)
# Block 4
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv4')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)
# Block 5
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv4')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
if include_top:
# Classification block
x = Flatten(name='flatten')(x)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(classes, activation='softmax', name='predictions')(x)
else:
if pooling == 'avg':
x = GlobalAveragePooling2D()(x)
elif pooling == 'max':
x = GlobalMaxPooling2D()(x)
# Ensure that the model takes into account
# any potential predecessors of `input_tensor`.
if input_tensor is not None:
inputs = get_source_inputs(input_tensor)
else:
inputs = img_input
# Create model.
model = Model(inputs, x, name='vgg19')
# load weights
if weights == 'imagenet':
if include_top:
weights_path = get_file('vgg19_weights_tf_dim_ordering_tf_kernels.h5',
WEIGHTS_PATH,
cache_subdir='models')
else:
weights_path = get_file('vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5',
WEIGHTS_PATH_NO_TOP,
cache_subdir='models')
model.load_weights(weights_path)
if K.backend() == 'theano':
layer_utils.convert_all_kernels_in_model(model)
if K.image_data_format() == 'channels_first':
if include_top:
maxpool = model.get_layer(name='block5_pool')
shape = maxpool.output_shape[1:]
dense = model.get_layer(name='fc1')
layer_utils.convert_dense_weights_data_format(dense, shape, 'channels_first')
if K.backend() == 'tensorflow':
warnings.warn('You are using the TensorFlow backend, yet you '
'are using the Theano '
'image data format convention '
'(`image_data_format="channels_first"`). '
'For best performance, set '
'`image_data_format="channels_last"` in '
'your Keras config '
'at ~/.keras/keras.json.')
return model
+266
Ver Arquivo
@@ -0,0 +1,266 @@
# -*- coding: utf-8 -*-
"""Xception V1 model for Keras.
On ImageNet, this model gets to a top-1 validation accuracy of 0.790
and a top-5 validation accuracy of 0.945.
Do note that the input image format for this model is different than for
the VGG16 and ResNet models (299x299 instead of 224x224),
and that the input preprocessing function
is also different (same as Inception V3).
Also do note that this model is only available for the TensorFlow backend,
due to its reliance on `SeparableConvolution` layers.
# Reference
- [Xception: Deep Learning with Depthwise Separable Convolutions](https://arxiv.org/abs/1610.02357)
"""
from __future__ import print_function
from __future__ import absolute_import
import warnings
from ..models import Model
from .. import layers
from ..layers import Dense
from ..layers import Input
from ..layers import BatchNormalization
from ..layers import Activation
from ..layers import Conv2D
from ..layers import SeparableConv2D
from ..layers import MaxPooling2D
from ..layers import GlobalAveragePooling2D
from ..layers import GlobalMaxPooling2D
from ..engine.topology import get_source_inputs
from ..utils.data_utils import get_file
from .. import backend as K
from .imagenet_utils import decode_predictions
from .imagenet_utils import _obtain_input_shape
TF_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.4/xception_weights_tf_dim_ordering_tf_kernels.h5'
TF_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.4/xception_weights_tf_dim_ordering_tf_kernels_notop.h5'
def Xception(include_top=True, weights='imagenet',
input_tensor=None, input_shape=None,
pooling=None,
classes=1000):
"""Instantiates the Xception architecture.
Optionally loads weights pre-trained
on ImageNet. This model is available for TensorFlow only,
and can only be used with inputs following the TensorFlow
data format `(width, height, channels)`.
You should set `image_data_format="channels_last"` in your Keras config
located at ~/.keras/keras.json.
Note that the default input image size for this model is 299x299.
# Arguments
include_top: whether to include the fully-connected
layer at the top of the network.
weights: one of `None` (random initialization)
or "imagenet" (pre-training on ImageNet).
input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
to use as image input for the model.
input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(299, 299, 3)`.
It should have exactly 3 inputs channels,
and width and height should be no smaller than 71.
E.g. `(150, 150, 3)` would be one valid value.
pooling: Optional pooling mode for feature extraction
when `include_top` is `False`.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
be applied.
classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
if no `weights` argument is specified.
# Returns
A Keras model instance.
# Raises
ValueError: in case of invalid argument for `weights`,
or invalid input shape.
RuntimeError: If attempting to run this model with a
backend that does not support separable convolutions.
"""
if weights not in {'imagenet', None}:
raise ValueError('The `weights` argument should be either '
'`None` (random initialization) or `imagenet` '
'(pre-training on ImageNet).')
if weights == 'imagenet' and include_top and classes != 1000:
raise ValueError('If using `weights` as imagenet with `include_top`'
' as true, `classes` should be 1000')
if K.backend() != 'tensorflow':
raise RuntimeError('The Xception model is only available with '
'the TensorFlow backend.')
if K.image_data_format() != 'channels_last':
warnings.warn('The Xception model is only available for the '
'input data format "channels_last" '
'(width, height, channels). '
'However your settings specify the default '
'data format "channels_first" (channels, width, height). '
'You should set `image_data_format="channels_last"` in your Keras '
'config located at ~/.keras/keras.json. '
'The model being returned right now will expect inputs '
'to follow the "channels_last" data format.')
K.set_image_data_format('channels_last')
old_data_format = 'channels_first'
else:
old_data_format = None
# Determine proper input shape
input_shape = _obtain_input_shape(input_shape,
default_size=299,
min_size=71,
data_format=K.image_data_format(),
include_top=include_top)
if input_tensor is None:
img_input = Input(shape=input_shape)
else:
if not K.is_keras_tensor(input_tensor):
img_input = Input(tensor=input_tensor, shape=input_shape)
else:
img_input = input_tensor
x = Conv2D(32, (3, 3), strides=(2, 2), use_bias=False, name='block1_conv1')(img_input)
x = BatchNormalization(name='block1_conv1_bn')(x)
x = Activation('relu', name='block1_conv1_act')(x)
x = Conv2D(64, (3, 3), use_bias=False, name='block1_conv2')(x)
x = BatchNormalization(name='block1_conv2_bn')(x)
x = Activation('relu', name='block1_conv2_act')(x)
residual = Conv2D(128, (1, 1), strides=(2, 2),
padding='same', use_bias=False)(x)
residual = BatchNormalization()(residual)
x = SeparableConv2D(128, (3, 3), padding='same', use_bias=False, name='block2_sepconv1')(x)
x = BatchNormalization(name='block2_sepconv1_bn')(x)
x = Activation('relu', name='block2_sepconv2_act')(x)
x = SeparableConv2D(128, (3, 3), padding='same', use_bias=False, name='block2_sepconv2')(x)
x = BatchNormalization(name='block2_sepconv2_bn')(x)
x = MaxPooling2D((3, 3), strides=(2, 2), padding='same', name='block2_pool')(x)
x = layers.add([x, residual])
residual = Conv2D(256, (1, 1), strides=(2, 2),
padding='same', use_bias=False)(x)
residual = BatchNormalization()(residual)
x = Activation('relu', name='block3_sepconv1_act')(x)
x = SeparableConv2D(256, (3, 3), padding='same', use_bias=False, name='block3_sepconv1')(x)
x = BatchNormalization(name='block3_sepconv1_bn')(x)
x = Activation('relu', name='block3_sepconv2_act')(x)
x = SeparableConv2D(256, (3, 3), padding='same', use_bias=False, name='block3_sepconv2')(x)
x = BatchNormalization(name='block3_sepconv2_bn')(x)
x = MaxPooling2D((3, 3), strides=(2, 2), padding='same', name='block3_pool')(x)
x = layers.add([x, residual])
residual = Conv2D(728, (1, 1), strides=(2, 2),
padding='same', use_bias=False)(x)
residual = BatchNormalization()(residual)
x = Activation('relu', name='block4_sepconv1_act')(x)
x = SeparableConv2D(728, (3, 3), padding='same', use_bias=False, name='block4_sepconv1')(x)
x = BatchNormalization(name='block4_sepconv1_bn')(x)
x = Activation('relu', name='block4_sepconv2_act')(x)
x = SeparableConv2D(728, (3, 3), padding='same', use_bias=False, name='block4_sepconv2')(x)
x = BatchNormalization(name='block4_sepconv2_bn')(x)
x = MaxPooling2D((3, 3), strides=(2, 2), padding='same', name='block4_pool')(x)
x = layers.add([x, residual])
for i in range(8):
residual = x
prefix = 'block' + str(i + 5)
x = Activation('relu', name=prefix + '_sepconv1_act')(x)
x = SeparableConv2D(728, (3, 3), padding='same', use_bias=False, name=prefix + '_sepconv1')(x)
x = BatchNormalization(name=prefix + '_sepconv1_bn')(x)
x = Activation('relu', name=prefix + '_sepconv2_act')(x)
x = SeparableConv2D(728, (3, 3), padding='same', use_bias=False, name=prefix + '_sepconv2')(x)
x = BatchNormalization(name=prefix + '_sepconv2_bn')(x)
x = Activation('relu', name=prefix + '_sepconv3_act')(x)
x = SeparableConv2D(728, (3, 3), padding='same', use_bias=False, name=prefix + '_sepconv3')(x)
x = BatchNormalization(name=prefix + '_sepconv3_bn')(x)
x = layers.add([x, residual])
residual = Conv2D(1024, (1, 1), strides=(2, 2),
padding='same', use_bias=False)(x)
residual = BatchNormalization()(residual)
x = Activation('relu', name='block13_sepconv1_act')(x)
x = SeparableConv2D(728, (3, 3), padding='same', use_bias=False, name='block13_sepconv1')(x)
x = BatchNormalization(name='block13_sepconv1_bn')(x)
x = Activation('relu', name='block13_sepconv2_act')(x)
x = SeparableConv2D(1024, (3, 3), padding='same', use_bias=False, name='block13_sepconv2')(x)
x = BatchNormalization(name='block13_sepconv2_bn')(x)
x = MaxPooling2D((3, 3), strides=(2, 2), padding='same', name='block13_pool')(x)
x = layers.add([x, residual])
x = SeparableConv2D(1536, (3, 3), padding='same', use_bias=False, name='block14_sepconv1')(x)
x = BatchNormalization(name='block14_sepconv1_bn')(x)
x = Activation('relu', name='block14_sepconv1_act')(x)
x = SeparableConv2D(2048, (3, 3), padding='same', use_bias=False, name='block14_sepconv2')(x)
x = BatchNormalization(name='block14_sepconv2_bn')(x)
x = Activation('relu', name='block14_sepconv2_act')(x)
if include_top:
x = GlobalAveragePooling2D(name='avg_pool')(x)
x = Dense(classes, activation='softmax', name='predictions')(x)
else:
if pooling == 'avg':
x = GlobalAveragePooling2D()(x)
elif pooling == 'max':
x = GlobalMaxPooling2D()(x)
# Ensure that the model takes into account
# any potential predecessors of `input_tensor`.
if input_tensor is not None:
inputs = get_source_inputs(input_tensor)
else:
inputs = img_input
# Create model.
model = Model(inputs, x, name='xception')
# load weights
if weights == 'imagenet':
if include_top:
weights_path = get_file('xception_weights_tf_dim_ordering_tf_kernels.h5',
TF_WEIGHTS_PATH,
cache_subdir='models')
else:
weights_path = get_file('xception_weights_tf_dim_ordering_tf_kernels_notop.h5',
TF_WEIGHTS_PATH_NO_TOP,
cache_subdir='models')
model.load_weights(weights_path)
if old_data_format:
K.set_image_data_format(old_data_format)
return model
def preprocess_input(x):
x /= 255.
x -= 0.5
x *= 2.
return x
+26 -9
Ver Arquivo
@@ -7,8 +7,10 @@ from .common import epsilon
from .common import floatx
from .common import set_epsilon
from .common import set_floatx
from .common import get_uid
from .common import cast_to_floatx
from .common import image_data_format
from .common import set_image_data_format
from .common import is_keras_tensor
_keras_base_dir = os.path.expanduser('~')
if not os.access(_keras_base_dir, os.W_OK):
@@ -18,34 +20,42 @@ _keras_dir = os.path.join(_keras_base_dir, '.keras')
if not os.path.exists(_keras_dir):
os.makedirs(_keras_dir)
_BACKEND = 'theano'
# Default backend: TensorFlow.
_BACKEND = 'tensorflow'
_config_path = os.path.expanduser(os.path.join(_keras_dir, 'keras.json'))
if os.path.exists(_config_path):
_config = json.load(open(_config_path))
_floatx = _config.get('floatx', floatx())
assert _floatx in {'float16', 'float32', 'float64'}
_epsilon = _config.get('epsilon', epsilon())
assert type(_epsilon) == float
assert isinstance(_epsilon, float)
_backend = _config.get('backend', _BACKEND)
assert _backend in {'theano', 'tensorflow'}
_image_data_format = _config.get('image_data_format',
image_data_format())
assert _image_data_format in {'channels_last', 'channels_first'}
set_floatx(_floatx)
set_epsilon(_epsilon)
set_image_data_format(_image_data_format)
_BACKEND = _backend
else:
# save config file, for easy edition
# save config file
if not os.path.exists(_config_path):
_config = {'floatx': floatx(),
'epsilon': epsilon(),
'backend': _BACKEND}
'backend': _BACKEND,
'image_data_format': image_data_format()}
with open(_config_path, 'w') as f:
# add new line in order for bash 'cat' display the content correctly
f.write(json.dumps(_config) + '\n')
f.write(json.dumps(_config, indent=4))
if 'KERAS_BACKEND' in os.environ:
_backend = os.environ['KERAS_BACKEND']
assert _backend in {'theano', 'tensorflow'}
_BACKEND = _backend
# import backend
if _BACKEND == 'theano':
sys.stderr.write('Using Theano backend.\n')
from .theano_backend import *
@@ -53,4 +63,11 @@ elif _BACKEND == 'tensorflow':
sys.stderr.write('Using TensorFlow backend.\n')
from .tensorflow_backend import *
else:
raise Exception('Unknown backend: ' + str(_BACKEND))
raise ValueError('Unknown backend: ' + str(_BACKEND))
def backend():
"""Publicly accessible method
for determining the current backend.
"""
return _BACKEND
+181 -14
Ver Arquivo
@@ -3,43 +3,210 @@ import numpy as np
# the type of float to use throughout the session.
_FLOATX = 'float32'
_EPSILON = 10e-8
_UID_PREFIXES = {}
_IMAGE_DATA_FORMAT = 'channels_last'
def epsilon():
"""Returns the value of the fuzz
factor used in numeric expressions.
# Returns
A float.
# Example
```python
>>> keras.backend.epsilon()
1e-08
```
"""
return _EPSILON
def set_epsilon(e):
"""Sets the value of the fuzz
factor used in numeric expressions.
# Arguments
e: float. New value of epsilon.
# Example
```python
>>> from keras import backend as K
>>> K.epsilon()
1e-08
>>> K.set_epsilon(1e-05)
>>> K.epsilon()
1e-05
```
"""
global _EPSILON
_EPSILON = e
def floatx():
'''Returns the default float type, as a string
"""Returns the default float type, as a string
(e.g. 'float16', 'float32', 'float64').
'''
# Returns
String, the current default float type.
# Example
```python
>>> keras.backend.floatx()
'float32'
```
"""
return _FLOATX
def set_floatx(floatx):
"""Sets the default float type.
# Arguments
String: 'float16', 'float32', or 'float64'.
# Example
```python
>>> from keras import backend as K
>>> K.floatx()
'float32'
>>> K.set_floatx('float16')
>>> K.floatx()
'float16'
```
"""
global _FLOATX
if floatx not in {'float16', 'float32', 'float64'}:
raise Exception('Unknown floatx type: ' + str(floatx))
floatx = str(floatx)
_FLOATX = floatx
raise ValueError('Unknown floatx type: ' + str(floatx))
_FLOATX = str(floatx)
def cast_to_floatx(x):
'''Cast a Numpy array to floatx.
'''
"""Cast a Numpy array to the default Keras float type.
# Arguments
x: Numpy array.
# Returns
The same Numpy array, cast to its new type.
# Example
```python
>>> from keras import backend as K
>>> K.floatx()
'float32'
>>> arr = numpy.array([1.0, 2.0], dtype='float64')
>>> arr.dtype
dtype('float64')
>>> new_arr = K.cast_to_floatx(arr)
>>> new_arr
array([ 1., 2.], dtype=float32)
>>> new_arr.dtype
dtype('float32')
```
"""
return np.asarray(x, dtype=_FLOATX)
def get_uid(prefix=''):
if prefix not in _UID_PREFIXES:
_UID_PREFIXES[prefix] = 1
return 1
def image_data_format():
"""Returns the default image data format
convention ('channels_first' or 'channels_last').
# Returns
A string, either `'channels_first'` or `'channels_last'`
# Example
```python
>>> keras.backend.image_data_format()
'channels_first'
```
"""
return _IMAGE_DATA_FORMAT
def set_image_data_format(data_format):
"""Sets the value of the data format convention.
# Arguments
data_format: string. `'channels_first'` or `'channels_last'`.
# Example
```python
>>> from keras import backend as K
>>> K.image_data_format()
'channels_first'
>>> K.set_image_data_format('channels_last')
>>> K.image_data_format()
'channels_last'
```
"""
global _IMAGE_DATA_FORMAT
if data_format not in {'channels_last', 'channels_first'}:
raise ValueError('Unknown data_format:', data_format)
_IMAGE_DATA_FORMAT = str(data_format)
def is_keras_tensor(x):
"""Returns whether `x` is a Keras tensor.
# Arguments
x: a potential tensor.
# Returns
A boolean: whether the argument is a Keras tensor.
# Examples
```python
>>> from keras import backend as K
>>> np_var = numpy.array([1, 2])
>>> K.is_keras_tensor(np_var)
False
>>> keras_var = K.variable(np_var)
>>> K.is_keras_tensor(keras_var) # A variable is not a Tensor.
False
>>> keras_placeholder = K.placeholder(shape=(2, 4, 5))
>>> K.is_keras_tensor(keras_placeholder) # A placeholder is a Tensor.
True
```
"""
if hasattr(x, '_keras_shape'):
return True
else:
_UID_PREFIXES[prefix] += 1
return _UID_PREFIXES[prefix]
return False
# Legacy methods
def set_image_dim_ordering(dim_ordering):
"""Sets the value of the image data format.
# Arguments
data_format: string. `'channels_first'` or `'channels_last'`.
# Example
```python
>>> from keras import backend as K
>>> K.image_data_format()
'channels_first'
>>> K.set_image_data_format('channels_last')
>>> K.image_data_format()
'channels_last'
```
"""
global _IMAGE_DATA_FORMAT
if dim_ordering not in {'tf', 'th'}:
raise ValueError('Unknown dim_ordering:', dim_ordering)
if dim_ordering == 'th':
data_format = 'channels_first'
else:
data_format = 'channels_last'
_IMAGE_DATA_FORMAT = data_format
def image_dim_ordering():
"""Legacy getter for data format.
"""
if _IMAGE_DATA_FORMAT == 'channels_first':
return 'th'
else:
return 'tf'
Diferenças do arquivo suprimidas por serem muito extensas Carregar Diff
Diferenças do arquivo suprimidas por serem muito extensas Carregar Diff
+609 -166
Ver Arquivo
Diferenças do arquivo suprimidas por serem muito extensas Carregar Diff
+142 -55
Ver Arquivo
@@ -1,93 +1,180 @@
from __future__ import absolute_import
import six
from . import backend as K
from .utils.generic_utils import serialize_keras_object
from .utils.generic_utils import deserialize_keras_object
class Constraint(object):
def __call__(self, p):
return p
def __call__(self, w):
return w
def get_config(self):
return {'name': self.__class__.__name__}
return {}
class MaxNorm(Constraint):
'''Constrain the weights incident to each hidden unit to have a norm less than or equal to a desired value.
"""MaxNorm weight constraint.
Constrains the weights incident to each hidden unit
to have a norm less than or equal to a desired value.
# Arguments
m: the maximum norm for the incoming weights.
axis: integer, axis along which to calculate weight norms. For instance,
in a `Dense` layer the weight matrix has shape (input_dim, output_dim),
set `axis` to `0` to constrain each weight vector of length (input_dim).
In a `MaxoutDense` layer the weight tensor has shape (nb_feature, input_dim, output_dim),
set `axis` to `1` to constrain each weight vector of length (input_dim),
i.e. constrain the filters incident to the `max` operation.
In a `Convolution2D` layer with the Theano backend, the weight tensor
has shape (nb_filter, stack_size, nb_row, nb_col), set `axis` to `[1,2,3]`
to constrain the weights of each filter tensor of size (stack_size, nb_row, nb_col).
In a `Convolution2D` layer with the TensorFlow backend, the weight tensor
has shape (nb_row, nb_col, stack_size, nb_filter), set `axis` to `[0,1,2]`
to constrain the weights of each filter tensor of size (nb_row, nb_col, stack_size).
axis: integer, axis along which to calculate weight norms.
For instance, in a `Dense` layer the weight matrix
has shape `(input_dim, output_dim)`,
set `axis` to `0` to constrain each weight vector
of length `(input_dim,)`.
In a `Convolution2D` layer with `data_format="channels_last"`,
the weight tensor has shape
`(rows, cols, input_depth, output_depth)`,
set `axis` to `[0, 1, 2]`
to constrain the weights of each filter tensor of size
`(rows, cols, input_depth)`.
# References
- [Dropout: A Simple Way to Prevent Neural Networks from Overfitting Srivastava, Hinton, et al. 2014](http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf)
'''
def __init__(self, m=2, axis=0):
self.m = m
"""
def __init__(self, max_value=2, axis=0):
self.max_value = max_value
self.axis = axis
def __call__(self, p):
norms = K.sqrt(K.sum(K.square(p), axis=self.axis, keepdims=True))
desired = K.clip(norms, 0, self.m)
p = p * (desired / (K.epsilon() + norms))
return p
def __call__(self, w):
norms = K.sqrt(K.sum(K.square(w), axis=self.axis, keepdims=True))
desired = K.clip(norms, 0, self.max_value)
w *= (desired / (K.epsilon() + norms))
return w
def get_config(self):
return {'name': self.__class__.__name__,
'm': self.m,
return {'max_value': self.max_value,
'axis': self.axis}
class NonNeg(Constraint):
'''Constrain the weights to be non-negative.
'''
def __call__(self, p):
p *= K.cast(p >= 0., K.floatx())
return p
"""Constrains the weights to be non-negative.
"""
def __call__(self, w):
w *= K.cast(w >= 0., K.floatx())
return w
class UnitNorm(Constraint):
'''Constrain the weights incident to each hidden unit to have unit norm.
"""Constrains the weights incident to each hidden unit to have unit norm.
# Arguments
axis: integer, axis along which to calculate weight norms. For instance,
in a `Dense` layer the weight matrix has shape (input_dim, output_dim),
set `axis` to `0` to constrain each weight vector of length (input_dim).
In a `MaxoutDense` layer the weight tensor has shape (nb_feature, input_dim, output_dim),
set `axis` to `1` to constrain each weight vector of length (input_dim),
i.e. constrain the filters incident to the `max` operation.
In a `Convolution2D` layer with the Theano backend, the weight tensor
has shape (nb_filter, stack_size, nb_row, nb_col), set `axis` to `[1,2,3]`
to constrain the weights of each filter tensor of size (stack_size, nb_row, nb_col).
In a `Convolution2D` layer with the TensorFlow backend, the weight tensor
has shape (nb_row, nb_col, stack_size, nb_filter), set `axis` to `[0,1,2]`
to constrain the weights of each filter tensor of size (nb_row, nb_col, stack_size).
'''
axis: integer, axis along which to calculate weight norms.
For instance, in a `Dense` layer the weight matrix
has shape `(input_dim, output_dim)`,
set `axis` to `0` to constrain each weight vector
of length `(input_dim,)`.
In a `Convolution2D` layer with `data_format="channels_last"`,
the weight tensor has shape
`(rows, cols, input_depth, output_depth)`,
set `axis` to `[0, 1, 2]`
to constrain the weights of each filter tensor of size
`(rows, cols, input_depth)`.
"""
def __init__(self, axis=0):
self.axis = axis
def __call__(self, p):
return p / (K.epsilon() + K.sqrt(K.sum(K.square(p), axis=self.axis, keepdims=True)))
def __call__(self, w):
return w / (K.epsilon() + K.sqrt(K.sum(K.square(w),
axis=self.axis,
keepdims=True)))
def get_config(self):
return {'name': self.__class__.__name__,
return {'axis': self.axis}
class MinMaxNorm(Constraint):
"""MinMaxNorm weight constraint.
Constrains the weights incident to each hidden unit
to have the norm between a lower bound and an upper bound.
# Arguments
min_value: the minimum norm for the incoming weights.
max_value: the maximum norm for the incoming weights.
rate: rate for enforcing the constraint: weights will be
rescaled to yield
`(1 - rate) * norm + rate * norm.clip(min_value, max_value)`.
Effectively, this means that rate=1.0 stands for strict
enforcement of the constraint, while rate<1.0 means that
weights will be rescaled at each step to slowly move
towards a value inside the desired interval.
axis: integer, axis along which to calculate weight norms.
For instance, in a `Dense` layer the weight matrix
has shape `(input_dim, output_dim)`,
set `axis` to `0` to constrain each weight vector
of length `(input_dim,)`.
In a `Convolution2D` layer with `dim_ordering="tf"`,
the weight tensor has shape
`(rows, cols, input_depth, output_depth)`,
set `axis` to `[0, 1, 2]`
to constrain the weights of each filter tensor of size
`(rows, cols, input_depth)`.
"""
def __init__(self, min_value=0.0, max_value=1.0, rate=1.0, axis=0):
self.min_value = min_value
self.max_value = max_value
self.rate = rate
self.axis = axis
def __call__(self, w):
norms = K.sqrt(K.sum(K.square(w), axis=self.axis, keepdims=True))
desired = (self.rate * K.clip(norms, self.min_value, self.max_value) +
(1 - self.rate) * norms)
w *= (desired / (K.epsilon() + norms))
return w
def get_config(self):
return {'min_value': self.min_value,
'max_value': self.max_value,
'rate': self.rate,
'axis': self.axis}
maxnorm = MaxNorm
nonneg = NonNeg
unitnorm = UnitNorm
# Aliases.
from .utils.generic_utils import get_from_module
def get(identifier, kwargs=None):
return get_from_module(identifier, globals(), 'constraint',
instantiate=True, kwargs=kwargs)
max_norm = MaxNorm
non_neg = NonNeg
unit_norm = UnitNorm
min_max_norm = MinMaxNorm
# Legacy aliases.
maxnorm = max_norm
nonneg = non_neg
unitnorm = unit_norm
def serialize(constraint):
return serialize_keras_object(constraint)
def deserialize(config, custom_objects=None):
return deserialize_keras_object(config,
module_objects=globals(),
custom_objects=custom_objects,
printable_module_name='constraint')
def get(identifier):
if identifier is None:
return None
if isinstance(identifier, dict):
return deserialize(identifier)
elif isinstance(identifier, six.string_types):
config = {'class_name': str(identifier), 'config': {}}
return deserialize(config)
elif callable(identifier):
return identifier
else:
raise ValueError('Could not interpret constraint identifier:',
identifier)
+8
Ver Arquivo
@@ -0,0 +1,8 @@
from __future__ import absolute_import
from . import mnist
from . import imdb
from . import reuters
from . import cifar10
from . import cifar100
from . import boston_housing
+34
Ver Arquivo
@@ -0,0 +1,34 @@
from ..utils.data_utils import get_file
import numpy as np
def load_data(path='boston_housing.npz', seed=113, test_split=0.2):
"""Loads the Boston Housing dataset.
# Arguments
path: path where to cache the dataset locally
(relative to ~/.keras/datasets).
seed: Random seed for shuffling the data
before computing the test split.
test_split: fraction of the data to reserve as test set.
# Returns
Tuple of Numpy arrays: `(x_train, y_train), (x_test, y_test)`.
"""
assert 0 <= test_split < 1
path = get_file(path, origin='https://s3.amazonaws.com/keras-datasets/boston_housing.npz')
f = np.load(path)
x = f['x']
y = f['y']
f.close()
np.random.seed(seed)
np.random.shuffle(x)
np.random.seed(seed)
np.random.shuffle(y)
x_train = np.array(x[:int(len(x) * (1 - test_split))])
y_train = np.array(y[:int(len(x) * (1 - test_split))])
x_test = np.array(x[int(len(x) * (1 - test_split)):])
y_test = np.array(y[int(len(x) * (1 - test_split)):])
return (x_train, y_train), (x_test, y_test)
+15 -5
Ver Arquivo
@@ -2,21 +2,31 @@
from __future__ import absolute_import
import sys
from six.moves import cPickle
from six.moves import range
def load_batch(fpath, label_key='labels'):
"""Internal utility for parsing CIFAR data.
# Arguments
fpath: path the file to parse.
label_key: key for label data in the retrieve
dictionary.
# Returns
A tuple `(data, labels)`.
"""
f = open(fpath, 'rb')
if sys.version_info < (3,):
d = cPickle.load(f)
else:
d = cPickle.load(f, encoding="bytes")
d = cPickle.load(f, encoding='bytes')
# decode utf8
d_decoded = {}
for k, v in d.items():
del(d[k])
d[k.decode("utf8")] = v
d_decoded[k.decode('utf8')] = v
d = d_decoded
f.close()
data = d["data"]
data = d['data']
labels = d[label_key]
data = data.reshape(data.shape[0], 3, 32, 32)
+19 -9
Ver Arquivo
@@ -1,30 +1,40 @@
from __future__ import absolute_import
from .cifar import load_batch
from ..utils.data_utils import get_file
from .. import backend as K
import numpy as np
import os
def load_data():
dirname = "cifar-10-batches-py"
origin = "http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz"
"""Loads CIFAR10 dataset.
# Returns
Tuple of Numpy arrays: `(x_train, y_train), (x_test, y_test)`.
"""
dirname = 'cifar-10-batches-py'
origin = 'http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz'
path = get_file(dirname, origin=origin, untar=True)
nb_train_samples = 50000
num_train_samples = 50000
X_train = np.zeros((nb_train_samples, 3, 32, 32), dtype="uint8")
y_train = np.zeros((nb_train_samples,), dtype="uint8")
x_train = np.zeros((num_train_samples, 3, 32, 32), dtype='uint8')
y_train = np.zeros((num_train_samples,), dtype='uint8')
for i in range(1, 6):
fpath = os.path.join(path, 'data_batch_' + str(i))
data, labels = load_batch(fpath)
X_train[(i-1)*10000:i*10000, :, :, :] = data
y_train[(i-1)*10000:i*10000] = labels
x_train[(i - 1) * 10000: i * 10000, :, :, :] = data
y_train[(i - 1) * 10000: i * 10000] = labels
fpath = os.path.join(path, 'test_batch')
X_test, y_test = load_batch(fpath)
x_test, y_test = load_batch(fpath)
y_train = np.reshape(y_train, (len(y_train), 1))
y_test = np.reshape(y_test, (len(y_test), 1))
return (X_train, y_train), (X_test, y_test)
if K.image_data_format() == 'channels_last':
x_train = x_train.transpose(0, 2, 3, 1)
x_test = x_test.transpose(0, 2, 3, 1)
return (x_train, y_train), (x_test, y_test)
+23 -10
Ver Arquivo
@@ -1,28 +1,41 @@
from __future__ import absolute_import
from .cifar import load_batch
from ..utils.data_utils import get_file
from .. import backend as K
import numpy as np
import os
def load_data(label_mode='fine'):
if label_mode not in ['fine', 'coarse']:
raise Exception('label_mode must be one of "fine" "coarse".')
"""Loads CIFAR100 dataset.
dirname = "cifar-100-python"
origin = "http://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz"
# Arguments
label_mode: one of "fine", "coarse".
# Returns
Tuple of Numpy arrays: `(x_train, y_train), (x_test, y_test)`.
# Raises
ValueError: in case of invalid `label_mode`.
"""
if label_mode not in ['fine', 'coarse']:
raise ValueError('label_mode must be one of "fine" "coarse".')
dirname = 'cifar-100-python'
origin = 'http://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz'
path = get_file(dirname, origin=origin, untar=True)
nb_test_samples = 10000
nb_train_samples = 50000
fpath = os.path.join(path, 'train')
X_train, y_train = load_batch(fpath, label_key=label_mode+'_labels')
x_train, y_train = load_batch(fpath, label_key=label_mode + '_labels')
fpath = os.path.join(path, 'test')
X_test, y_test = load_batch(fpath, label_key=label_mode+'_labels')
x_test, y_test = load_batch(fpath, label_key=label_mode + '_labels')
y_train = np.reshape(y_train, (len(y_train), 1))
y_test = np.reshape(y_test, (len(y_test), 1))
return (X_train, y_train), (X_test, y_test)
if K.image_data_format() == 'channels_last':
x_train = x_train.transpose(0, 2, 3, 1)
x_test = x_test.transpose(0, 2, 3, 1)
return (x_train, y_train), (x_test, y_test)
-4
Ver Arquivo
@@ -1,4 +0,0 @@
from ..utils.data_utils import *
import warnings
warnings.warn('data_utils has been moved to keras.utils.data_utils.')
+99 -37
Ver Arquivo
@@ -1,69 +1,131 @@
from __future__ import absolute_import
from six.moves import cPickle
import gzip
from ..utils.data_utils import get_file
from six.moves import zip
import numpy as np
import json
import warnings
def load_data(path="imdb.pkl", nb_words=None, skip_top=0,
maxlen=None, test_split=0.2, seed=113,
start_char=1, oov_char=2, index_from=3):
def load_data(path='imdb.npz', num_words=None, skip_top=0,
maxlen=None, seed=113,
start_char=1, oov_char=2, index_from=3, **kwargs):
"""Loads the IMDB dataset.
path = get_file(path, origin="https://s3.amazonaws.com/text-datasets/imdb.pkl")
# Arguments
path: where to cache the data (relative to `~/.keras/dataset`).
num_words: max number of words to include. Words are ranked
by how often they occur (in the training set) and only
the most frequent words are kept
skip_top: skip the top N most frequently occuring words
(which may not be informative).
maxlen: truncate sequences after this length.
seed: random seed for sample shuffling.
start_char: The start of a sequence will be marked with this character.
Set to 1 because 0 is usually the padding character.
oov_char: words that were cut out because of the `num_words`
or `skip_top` limit will be replaced with this character.
index_from: index actual words with this index and higher.
if path.endswith(".gz"):
f = gzip.open(path, 'rb')
else:
f = open(path, 'rb')
# Returns
Tuple of Numpy arrays: `(x_train, y_train), (x_test, y_test)`.
X, labels = cPickle.load(f)
# Raises
ValueError: in case `maxlen` is so low
that no input sequence could be kept.
Note that the 'out of vocabulary' character is only used for
words that were present in the training set but are not included
because they're not making the `num_words` cut here.
Words that were not seen in the training set but are in the test set
have simply been skipped.
"""
# Legacy support
if 'nb_words' in kwargs:
warnings.warn('The `nb_words` argument in `load_data` '
'has been renamed `num_words`.')
num_words = kwargs.pop('nb_words')
if kwargs:
raise TypeError('Unrecognized keyword arguments: ' + str(kwargs))
path = get_file(path,
origin='https://s3.amazonaws.com/text-datasets/imdb.npz')
f = np.load(path)
x_train = f['x_train']
labels_train = f['y_train']
x_test = f['x_test']
labels_test = f['y_test']
f.close()
np.random.seed(seed)
np.random.shuffle(X)
np.random.shuffle(x_train)
np.random.seed(seed)
np.random.shuffle(labels)
np.random.shuffle(labels_train)
np.random.seed(seed * 2)
np.random.shuffle(x_test)
np.random.seed(seed * 2)
np.random.shuffle(labels_test)
xs = np.concatenate([x_train, x_test])
labels = np.concatenate([labels_train, labels_test])
if start_char is not None:
X = [[start_char] + [w + index_from for w in x] for x in X]
xs = [[start_char] + [w + index_from for w in x] for x in xs]
elif index_from:
X = [[w + index_from for w in x] for x in X]
xs = [[w + index_from for w in x] for x in xs]
if maxlen:
new_X = []
new_xs = []
new_labels = []
for x, y in zip(X, labels):
for x, y in zip(xs, labels):
if len(x) < maxlen:
new_X.append(x)
new_xs.append(x)
new_labels.append(y)
X = new_X
xs = new_xs
labels = new_labels
if not X:
raise Exception('After filtering for sequences shorter than maxlen=' +
str(maxlen) + ', no sequence was kept. '
'Increase maxlen.')
if not nb_words:
nb_words = max([max(x) for x in X])
if not xs:
raise ValueError('After filtering for sequences shorter than maxlen=' +
str(maxlen) + ', no sequence was kept. '
'Increase maxlen.')
if not num_words:
num_words = max([max(x) for x in xs])
# by convention, use 2 as OOV word
# reserve 'index_from' (=3 by default) characters: 0 (padding), 1 (start), 2 (OOV)
# reserve 'index_from' (=3 by default) characters:
# 0 (padding), 1 (start), 2 (OOV)
if oov_char is not None:
X = [[oov_char if (w >= nb_words or w < skip_top) else w for w in x] for x in X]
xs = [[oov_char if (w >= num_words or w < skip_top) else w for w in x] for x in xs]
else:
nX = []
for x in X:
new_xs = []
for x in xs:
nx = []
for w in x:
if (w >= nb_words or w < skip_top):
if w >= num_words or w < skip_top:
nx.append(w)
nX.append(nx)
X = nX
new_xs.append(nx)
xs = new_xs
X_train = np.array(X[:int(len(X) * (1 - test_split))])
y_train = np.array(labels[:int(len(X) * (1 - test_split))])
x_train = np.array(xs[:len(x_train)])
y_train = np.array(labels[:len(x_train)])
X_test = np.array(X[int(len(X) * (1 - test_split)):])
y_test = np.array(labels[int(len(X) * (1 - test_split)):])
x_test = np.array(xs[len(x_train):])
y_test = np.array(labels[len(x_train):])
return (X_train, y_train), (X_test, y_test)
return (x_train, y_train), (x_test, y_test)
def get_word_index(path='imdb_word_index.json'):
"""Retrieves the dictionary mapping word indices back to words.
# Arguments
path: where to cache the data (relative to `~/.keras/dataset`).
# Returns
The word index dictionary.
"""
path = get_file(path,
origin='https://s3.amazonaws.com/text-datasets/imdb_word_index.json')
f = open(path)
data = json.load(f)
f.close()
return data
+16 -16
Ver Arquivo
@@ -1,22 +1,22 @@
# -*- coding: utf-8 -*-
import gzip
from ..utils.data_utils import get_file
from six.moves import cPickle
import sys
import numpy as np
def load_data(path="mnist.pkl.gz"):
path = get_file(path, origin="https://s3.amazonaws.com/img-datasets/mnist.pkl.gz")
def load_data(path='mnist.npz'):
"""Loads the MNIST dataset.
if path.endswith(".gz"):
f = gzip.open(path, 'rb')
else:
f = open(path, 'rb')
if sys.version_info < (3,):
data = cPickle.load(f)
else:
data = cPickle.load(f, encoding="bytes")
# Arguments
path: path where to cache the dataset locally
(relative to ~/.keras/datasets).
# Returns
Tuple of Numpy arrays: `(x_train, y_train), (x_test, y_test)`.
"""
path = get_file(path, origin='https://s3.amazonaws.com/img-datasets/mnist.npz')
f = np.load(path)
x_train = f['x_train']
y_train = f['y_train']
x_test = f['x_test']
y_test = f['y_test']
f.close()
return data # (X_train, y_train), (X_test, y_test)
return (x_train, y_train), (x_test, y_test)
+79 -32
Ver Arquivo
@@ -1,67 +1,114 @@
# -*- coding: utf-8 -*-
from __future__ import absolute_import
from ..utils.data_utils import get_file
from six.moves import cPickle
from six.moves import zip
import numpy as np
import json
import warnings
def load_data(path="reuters.pkl", nb_words=None, skip_top=0,
def load_data(path='reuters.npz', num_words=None, skip_top=0,
maxlen=None, test_split=0.2, seed=113,
start_char=1, oov_char=2, index_from=3):
start_char=1, oov_char=2, index_from=3, **kwargs):
"""Loads the Reuters newswire classification dataset.
path = get_file(path, origin="https://s3.amazonaws.com/text-datasets/reuters.pkl")
f = open(path, 'rb')
X, labels = cPickle.load(f)
f.close()
# Arguments
path: where to cache the data (relative to `~/.keras/dataset`).
num_words: max number of words to include. Words are ranked
by how often they occur (in the training set) and only
the most frequent words are kept
skip_top: skip the top N most frequently occuring words
(which may not be informative).
maxlen: truncate sequences after this length.
test_split: Fraction of the dataset to be used as test data.
seed: random seed for sample shuffling.
start_char: The start of a sequence will be marked with this character.
Set to 1 because 0 is usually the padding character.
oov_char: words that were cut out because of the `num_words`
or `skip_top` limit will be replaced with this character.
index_from: index actual words with this index and higher.
# Returns
Tuple of Numpy arrays: `(x_train, y_train), (x_test, y_test)`.
Note that the 'out of vocabulary' character is only used for
words that were present in the training set but are not included
because they're not making the `num_words` cut here.
Words that were not seen in the training set but are in the test set
have simply been skipped.
"""
# Legacy support
if 'nb_words' in kwargs:
warnings.warn('The `nb_words` argument in `load_data` '
'has been renamed `num_words`.')
num_words = kwargs.pop('nb_words')
if kwargs:
raise TypeError('Unrecognized keyword arguments: ' + str(kwargs))
path = get_file(path, origin='https://s3.amazonaws.com/text-datasets/reuters.npz')
npzfile = np.load(path)
xs = npzfile['x']
labels = npzfile['y']
npzfile.close()
np.random.seed(seed)
np.random.shuffle(X)
np.random.shuffle(xs)
np.random.seed(seed)
np.random.shuffle(labels)
if start_char is not None:
X = [[start_char] + [w + index_from for w in x] for x in X]
xs = [[start_char] + [w + index_from for w in x] for x in xs]
elif index_from:
X = [[w + index_from for w in x] for x in X]
xs = [[w + index_from for w in x] for x in xs]
if maxlen:
new_X = []
new_xs = []
new_labels = []
for x, y in zip(X, labels):
for x, y in zip(xs, labels):
if len(x) < maxlen:
new_X.append(x)
new_xs.append(x)
new_labels.append(y)
X = new_X
xs = new_xs
labels = new_labels
if not nb_words:
nb_words = max([max(x) for x in X])
if not num_words:
num_words = max([max(x) for x in xs])
# by convention, use 2 as OOV word
# reserve 'index_from' (=3 by default) characters: 0 (padding), 1 (start), 2 (OOV)
# reserve 'index_from' (=3 by default) characters:
# 0 (padding), 1 (start), 2 (OOV)
if oov_char is not None:
X = [[oov_char if (w >= nb_words or w < skip_top) else w for w in x] for x in X]
xs = [[oov_char if (w >= num_words or w < skip_top) else w for w in x] for x in xs]
else:
nX = []
for x in X:
new_xs = []
for x in xs:
nx = []
for w in x:
if (w >= nb_words or w < skip_top):
if w >= num_words or w < skip_top:
nx.append(w)
nX.append(nx)
X = nX
new_xs.append(nx)
xs = new_xs
X_train = X[:int(len(X) * (1 - test_split))]
y_train = labels[:int(len(X) * (1 - test_split))]
x_train = np.array(xs[:int(len(xs) * (1 - test_split))])
y_train = np.array(labels[:int(len(xs) * (1 - test_split))])
X_test = X[int(len(X) * (1 - test_split)):]
y_test = labels[int(len(X) * (1 - test_split)):]
x_test = np.array(xs[int(len(xs) * (1 - test_split)):])
y_test = np.array(labels[int(len(xs) * (1 - test_split)):])
return (X_train, y_train), (X_test, y_test)
return (x_train, y_train), (x_test, y_test)
def get_word_index(path="reuters_word_index.pkl"):
path = get_file(path, origin="https://s3.amazonaws.com/text-datasets/reuters_word_index.pkl")
f = open(path, 'rb')
return cPickle.load(f)
def get_word_index(path='reuters_word_index.json'):
"""Retrieves the dictionary mapping word indices back to words.
# Arguments
path: where to cache the data (relative to `~/.keras/dataset`).
# Returns
The word index dictionary.
"""
path = get_file(path, origin='https://s3.amazonaws.com/text-datasets/reuters_word_index.json')
f = open(path)
data = json.load(f)
f.close()
return data
-2
Ver Arquivo
@@ -4,7 +4,5 @@ from .topology import InputSpec
from .topology import Input
from .topology import InputLayer
from .topology import Layer
from .topology import Merge
from .topology import merge
from .topology import get_source_inputs
from .training import Model
+1913 -1345
Ver Arquivo
Diferenças do arquivo suprimidas por serem muito extensas Carregar Diff
+1341 -757
Ver Arquivo
Diferenças do arquivo suprimidas por serem muito extensas Carregar Diff
-107
Ver Arquivo
@@ -1,107 +0,0 @@
from __future__ import absolute_import
import numpy as np
from . import backend as K
def get_fans(shape, dim_ordering='th'):
if len(shape) == 2:
fan_in = shape[0]
fan_out = shape[1]
elif len(shape) == 4 or len(shape) == 5:
# assuming convolution kernels (2D or 3D).
# TH kernel shape: (depth, input_depth, ...)
# TF kernel shape: (..., input_depth, depth)
if dim_ordering == 'th':
fan_in = np.prod(shape[1:])
fan_out = shape[0]
elif dim_ordering == 'tf':
fan_in = np.prod(shape[:-1])
fan_out = shape[-1]
else:
raise Exception('Invalid dim_ordering: ' + dim_ordering)
else:
# no specific assumptions
fan_in = np.sqrt(np.prod(shape))
fan_out = np.sqrt(np.prod(shape))
return fan_in, fan_out
def uniform(shape, scale=0.05, name=None):
return K.variable(np.random.uniform(low=-scale, high=scale, size=shape),
name=name)
def normal(shape, scale=0.05, name=None):
return K.variable(np.random.normal(loc=0.0, scale=scale, size=shape),
name=name)
def lecun_uniform(shape, name=None, dim_ordering='th'):
''' Reference: LeCun 98, Efficient Backprop
http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf
'''
fan_in, fan_out = get_fans(shape, dim_ordering=dim_ordering)
scale = np.sqrt(3. / fan_in)
return uniform(shape, scale, name=name)
def glorot_normal(shape, name=None, dim_ordering='th'):
''' Reference: Glorot & Bengio, AISTATS 2010
'''
fan_in, fan_out = get_fans(shape, dim_ordering=dim_ordering)
s = np.sqrt(2. / (fan_in + fan_out))
return normal(shape, s, name=name)
def glorot_uniform(shape, name=None, dim_ordering='th'):
fan_in, fan_out = get_fans(shape, dim_ordering=dim_ordering)
s = np.sqrt(6. / (fan_in + fan_out))
return uniform(shape, s, name=name)
def he_normal(shape, name=None, dim_ordering='th'):
''' Reference: He et al., http://arxiv.org/abs/1502.01852
'''
fan_in, fan_out = get_fans(shape, dim_ordering=dim_ordering)
s = np.sqrt(2. / fan_in)
return normal(shape, s, name=name)
def he_uniform(shape, name=None, dim_ordering='th'):
fan_in, fan_out = get_fans(shape, dim_ordering=dim_ordering)
s = np.sqrt(6. / fan_in)
return uniform(shape, s, name=name)
def orthogonal(shape, scale=1.1, name=None):
''' From Lasagne. Reference: Saxe et al., http://arxiv.org/abs/1312.6120
'''
flat_shape = (shape[0], np.prod(shape[1:]))
a = np.random.normal(0.0, 1.0, flat_shape)
u, _, v = np.linalg.svd(a, full_matrices=False)
# pick the one with the correct shape
q = u if u.shape == flat_shape else v
q = q.reshape(shape)
return K.variable(scale * q[:shape[0], :shape[1]], name=name)
def identity(shape, scale=1, name=None):
if len(shape) != 2 or shape[0] != shape[1]:
raise Exception('Identity matrix initialization can only be used '
'for 2D square matrices.')
else:
return K.variable(scale * np.identity(shape[0]), name=name)
def zero(shape, name=None):
return K.zeros(shape, name=name)
def one(shape, name=None):
return K.ones(shape, name=name)
from .utils.generic_utils import get_from_module
def get(identifier, **kwargs):
return get_from_module(identifier, globals(),
'initialization', kwargs=kwargs)
+468
Ver Arquivo
@@ -0,0 +1,468 @@
from __future__ import absolute_import
import numpy as np
import six
from . import backend as K
from .utils.generic_utils import serialize_keras_object
from .utils.generic_utils import deserialize_keras_object
class Initializer(object):
"""Initializer base class: all initializers inherit from this class.
"""
def __call__(self, shape, dtype=None):
raise NotImplementedError
def get_config(self):
return {}
@classmethod
def from_config(cls, config):
return cls(**config)
class Zeros(Initializer):
"""Initializer that generates tensors initialized to 0."""
def __call__(self, shape, dtype=None):
return K.constant(0, shape=shape, dtype=dtype)
class Ones(Initializer):
"""Initializer that generates tensors initialized to 1."""
def __call__(self, shape, dtype=None):
return K.constant(1, shape=shape, dtype=dtype)
class Constant(Initializer):
"""Initializer that generates tensors initialized to a constant value.
# Arguments
value: float; the value of the generator tensors.
"""
def __init__(self, value=0):
self.value = value
def __call__(self, shape, dtype=None):
return K.constant(self.value, shape=shape, dtype=dtype)
def get_config(self):
return {'value': self.value}
class RandomNormal(Initializer):
"""Initializer that generates tensors with a normal distribution.
# Arguments
mean: a python scalar or a scalar tensor. Mean of the random values
to generate.
stddev: a python scalar or a scalar tensor. Standard deviation of the
random values to generate.
seed: A Python integer. Used to seed the random generator.
"""
def __init__(self, mean=0., stddev=0.05, seed=None):
self.mean = mean
self.stddev = stddev
self.seed = seed
def __call__(self, shape, dtype=None):
return K.random_normal(shape, self.mean, self.stddev,
dtype=dtype, seed=self.seed)
def get_config(self):
return {
'mean': self.mean,
'stddev': self.stddev,
'seed': self.seed
}
class RandomUniform(Initializer):
"""Initializer that generates tensors with a uniform distribution.
# Arguments
minval: A python scalar or a scalar tensor. Lower bound of the range
of random values to generate.
maxval: A python scalar or a scalar tensor. Upper bound of the range
of random values to generate. Defaults to 1 for float types.
seed: A Python integer. Used to seed the random generator.
"""
def __init__(self, minval=-0.05, maxval=0.05, seed=None):
self.minval = minval
self.maxval = maxval
self.seed = seed
def __call__(self, shape, dtype=None):
return K.random_uniform(shape, self.minval, self.maxval,
dtype=dtype, seed=self.seed)
def get_config(self):
return {
'minval': self.minval,
'maxval': self.maxval,
'seed': self.seed,
}
class TruncatedNormal(Initializer):
"""Initializer that generates a truncated normal distribution.
These values are similar to values from a `random_normal_initializer`
except that values more than two standard deviations from the mean
are discarded and re-drawn. This is the recommended initializer for
neural network weights and filters.
# Arguments
mean: a python scalar or a scalar tensor. Mean of the random values
to generate.
stddev: a python scalar or a scalar tensor. Standard deviation of the
random values to generate.
seed: A Python integer. Used to seed the random generator.
"""
def __init__(self, mean=0., stddev=0.05, seed=None):
self.mean = mean
self.stddev = stddev
self.seed = seed
def __call__(self, shape, dtype=None):
return K.truncated_normal(shape, self.mean, self.stddev,
dtype=dtype, seed=self.seed)
def get_config(self):
return {
'mean': self.mean,
'stddev': self.stddev,
'seed': self.seed
}
class VarianceScaling(Initializer):
"""Initializer capable of adapting its scale to the shape of weights.
With `distribution="normal"`, samples are drawn from a truncated normal
distribution centered on zero, with `stddev = sqrt(scale / n)` where n is:
- number of input units in the weight tensor, if mode = "fan_in"
- number of output units, if mode = "fan_out"
- average of the numbers of input and output units, if mode = "fan_avg"
With `distribution="uniform"`,
samples are drawn from a uniform distribution
within [-limit, limit], with `limit = sqrt(3 * scale / n)`.
# Arguments
scale: Scaling factor (positive float).
mode: One of "fan_in", "fan_out", "fan_avg".
distribution: Random distribution to use. One of "normal", "uniform".
seed: A Python integer. Used to seed the random generator.
# Raises
ValueError: In case of an invalid value for the "scale", mode" or
"distribution" arguments.
"""
def __init__(self, scale=1.0,
mode='fan_in',
distribution='normal',
seed=None):
if scale <= 0.:
raise ValueError('`scale` must be a positive float. Got:', scale)
mode = mode.lower()
if mode not in {'fan_in', 'fan_out', 'fan_avg'}:
raise ValueError('Invalid `mode` argument: '
'expected on of {"fan_in", "fan_out", "fan_avg"} '
'but got', mode)
distribution = distribution.lower()
if distribution not in {'normal', 'uniform'}:
raise ValueError('Invalid `distribution` argument: '
'expected one of {"normal", "uniform"} '
'but got', distribution)
self.scale = scale
self.mode = mode
self.distribution = distribution
self.seed = seed
def __call__(self, shape, dtype=None):
fan_in, fan_out = _compute_fans(shape)
scale = self.scale
if self.mode == 'fan_in':
scale /= max(1., fan_in)
elif self.mode == 'fan_out':
scale /= max(1., fan_out)
else:
scale /= max(1., float(fan_in + fan_out) / 2)
if self.distribution == 'normal':
stddev = np.sqrt(scale)
return K.truncated_normal(shape, 0., stddev,
dtype=dtype, seed=self.seed)
else:
limit = np.sqrt(3. * scale)
return K.random_uniform(shape, -limit, limit,
dtype=dtype, seed=self.seed)
def get_config(self):
return {
'scale': self.scale,
'mode': self.mode,
'distribution': self.distribution,
'seed': self.seed
}
class Orthogonal(Initializer):
"""Initializer that generates a random orthogonal matrix.
# Arguments
gain: Multiplicative factor to apply to the orthogonal matrix.
seed: A Python integer. Used to seed the random generator.
# References
Saxe et al., http://arxiv.org/abs/1312.6120
"""
def __init__(self, gain=1., seed=None):
self.gain = gain
self.seed = seed
def __call__(self, shape, dtype=None):
num_rows = 1
for dim in shape[:-1]:
num_rows *= dim
num_cols = shape[-1]
flat_shape = (num_rows, num_cols)
if self.seed is not None:
np.random.seed(self.seed)
a = np.random.normal(0.0, 1.0, flat_shape)
u, _, v = np.linalg.svd(a, full_matrices=False)
# Pick the one with the correct shape.
q = u if u.shape == flat_shape else v
q = q.reshape(shape)
return self.gain * q[:shape[0], :shape[1]]
def get_config(self):
return {
'gain': self.gain,
'seed': self.seed
}
class Identity(Initializer):
"""Initializer that generates the identity matrix.
Only use for square 2D matrices.
# Arguments
gain: Multiplicative factor to apply to the identity matrix.
"""
def __init__(self, gain=1.):
self.gain = gain
def __call__(self, shape, dtype=None):
if len(shape) != 2 or shape[0] != shape[1]:
raise ValueError('Identity matrix initializer can only be used '
'for 2D square matrices.')
else:
return self.gain * np.identity(shape[0])
def get_config(self):
return {
'gain': self.gain
}
def lecun_uniform(seed=None):
"""LeCun uniform initializer.
It draws samples from a uniform distribution within [-limit, limit]
where `limit` is `sqrt(3 / fan_in)`
where `fan_in` is the number of input units in the weight tensor.
# Arguments
seed: A Python integer. Used to seed the random generator.
# Returns
An initializer.
# References
LeCun 98, Efficient Backprop,
http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf
"""
return VarianceScaling(scale=1.,
mode='fan_in',
distribution='uniform',
seed=seed)
def glorot_normal(seed=None):
"""Glorot normal initializer, also called Xavier normal initializer.
It draws samples from a truncated normal distribution centered on 0
with `stddev = sqrt(2 / (fan_in + fan_out))`
where `fan_in` is the number of input units in the weight tensor
and `fan_out` is the number of output units in the weight tensor.
# Arguments
seed: A Python integer. Used to seed the random generator.
# Returns
An initializer.
# References
Glorot & Bengio, AISTATS 2010
http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf
"""
return VarianceScaling(scale=1.,
mode='fan_avg',
distribution='normal',
seed=seed)
def glorot_uniform(seed=None):
"""Glorot uniform initializer, also called Xavier uniform initializer.
It draws samples from a uniform distribution within [-limit, limit]
where `limit` is `sqrt(6 / (fan_in + fan_out))`
where `fan_in` is the number of input units in the weight tensor
and `fan_out` is the number of output units in the weight tensor.
# Arguments
seed: A Python integer. Used to seed the random generator.
# Returns
An initializer.
# References
Glorot & Bengio, AISTATS 2010
http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf
"""
return VarianceScaling(scale=1.,
mode='fan_avg',
distribution='uniform',
seed=seed)
def he_normal(seed=None):
"""He normal initializer.
It draws samples from a truncated normal distribution centered on 0
with `stddev = sqrt(2 / fan_in)`
where `fan_in` is the number of input units in the weight tensor.
# Arguments
seed: A Python integer. Used to seed the random generator.
# Returns
An initializer.
# References
He et al., http://arxiv.org/abs/1502.01852
"""
return VarianceScaling(scale=2.,
mode='fan_in',
distribution='normal',
seed=seed)
def he_uniform(seed=None):
"""He uniform variance scaling initializer.
It draws samples from a uniform distribution within [-limit, limit]
where `limit` is `sqrt(6 / fan_in)`
where `fan_in` is the number of input units in the weight tensor.
# Arguments
seed: A Python integer. Used to seed the random generator.
# Returns
An initializer.
# References
He et al., http://arxiv.org/abs/1502.01852
"""
return VarianceScaling(scale=2.,
mode='fan_in',
distribution='uniform',
seed=seed)
# Compatibility aliases
zero = zeros = Zeros
one = ones = Ones
constant = Constant
uniform = random_uniform = RandomUniform
normal = random_normal = RandomNormal
truncated_normal = TruncatedNormal
identity = Identity
orthogonal = Orthogonal
# Utility functions
def _compute_fans(shape, data_format='channels_last'):
"""Computes the number of input and output units for a weight shape.
# Arguments
shape: Integer shape tuple.
data_format: Image data format to use for convolution kernels.
Note that all kernels in Keras are standardized on the
`channels_last` ordering (even when inputs are set
to `channels_first`).
# Returns
A tuple of scalars, `(fan_in, fan_out)`.
# Raises
ValueError: in case of invalid `data_format` argument.
"""
if len(shape) == 2:
fan_in = shape[0]
fan_out = shape[1]
elif len(shape) in {3, 4, 5}:
# Assuming convolution kernels (1D, 2D or 3D).
# TH kernel shape: (depth, input_depth, ...)
# TF kernel shape: (..., input_depth, depth)
if data_format == 'channels_first':
receptive_field_size = np.prod(shape[2:])
fan_in = shape[1] * receptive_field_size
fan_out = shape[0] * receptive_field_size
elif data_format == 'channels_last':
receptive_field_size = np.prod(shape[:2])
fan_in = shape[-2] * receptive_field_size
fan_out = shape[-1] * receptive_field_size
else:
raise ValueError('Invalid data_format: ' + data_format)
else:
# No specific assumptions.
fan_in = np.sqrt(np.prod(shape))
fan_out = np.sqrt(np.prod(shape))
return fan_in, fan_out
def serialize(initializer):
return serialize_keras_object(initializer)
def deserialize(config, custom_objects=None):
return deserialize_keras_object(config,
module_objects=globals(),
custom_objects=custom_objects,
printable_module_name='initializer')
def get(identifier):
if isinstance(identifier, dict):
return deserialize(identifier)
elif isinstance(identifier, six.string_types):
config = {'class_name': str(identifier), 'config': {}}
return deserialize(config)
elif callable(identifier):
return identifier
else:
raise ValueError('Could not interpret initializer identifier:',
identifier)

Alguns arquivos não foram exibidos porque demasiados arquivos foram alterados neste diff Mostrar Mais