Layers
The layer classes implement the different types of neural network layers. The Layer class is the base class of all layers and has to be inherited by any new layer.
All the neural network layers currently supported by yadll are listed below.
Layer (incoming[, name])
    Base class of any neural network layer.
InputLayer (input_shape[, input])
    Input layer of the data; it has no parameters, it just shapes the data as the input of the network.
ReshapeLayer (incoming[, output_shape])
    Reshape the incoming layer to the output_shape.
FlattenLayer (incoming[, n_dim])
    Reshape the incoming layer back to a flat shape.
Activation (incoming[, activation])
    Apply an activation function to the previous layer.
DenseLayer (incoming, n_units[, W, b, ...])
    Fully connected neural network layer.
UnsupervisedLayer (incoming, n_units, ...)
    Base class for all unsupervised layers.
LogisticRegression (incoming, n_class[, W, ...])
    Dense layer with softmax activation.
Dropout (incoming[, corruption_level])
    Dropout layer.
Dropconnect (incoming, n_units[, ...])
    DropConnect layer.
PoolLayer (incoming, pool_size[, stride, ...])
    Pooling layer, default is max pooling.
ConvLayer (incoming[, image_shape, ...])
    Convolutional layer.
ConvPoolLayer (incoming, pool_size[, ...])
    Convolutional and pooling layer.
AutoEncoder (incoming, n_units, hyperparameters)
    Autoencoder.
RBM (incoming, n_units, hyperparameters[, W, ...])
    Restricted Boltzmann Machine.
BatchNormalization (incoming[, axis, alpha, ...])
    Normalize the input layer over each mini-batch.
RNN (incoming, n_units[, n_out, activation, ...])
    Recurrent Neural Network.
LSTM (incoming, n_units[, peepholes, ...])
    Long Short Term Memory.
GRU (incoming, n_units[, activation, ...])
    Gated Recurrent Unit.
Detailed description
class yadll.layers.Layer(incoming, name=None, **kwargs)
    Layer is the base class of any neural network layer. It has to be subclassed by any kind of layer.

    Parameters:
        incoming : a Layer, a list of Layers or a tuple of int
            The incoming layer, a list of incoming layers or the shape of the input layer.
        name : string, optional
            The layer name. The default name is the class name plus an instantiation number, e.g. 'DenseLayer 3'.

    get_output(**kwargs)
        Return the output of this layer.
        Raises: NotImplementedError
            This method has to be overridden by any new layer implementation.

    get_params()
        Return the Theano shared variables representing the parameters of this layer.
        Returns: a list of Theano shared variables that parametrize the layer.
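As a rough illustration of this interface, a new layer only needs to override get_output (and get_params when it owns trainable parameters). The sketch below assumes the incoming layer is reachable through an input_layer attribute; that attribute name, and the ScaleLayer class itself, are illustrative rather than part of the documented API.

    from yadll.layers import Layer

    class ScaleLayer(Layer):
        """Hypothetical layer that multiplies its input by a fixed scalar."""
        def __init__(self, incoming, scale=2.0, **kwargs):
            super(ScaleLayer, self).__init__(incoming, **kwargs)
            self.scale = scale

        def get_params(self):
            # no trainable parameters
            return []

        def get_output(self, **kwargs):
            # assumes the incoming layer is stored as self.input_layer;
            # this attribute name is an assumption, not documented above
            return self.input_layer.get_output(**kwargs) * self.scale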
class yadll.layers.InputLayer(input_shape, input=None, **kwargs)
    Input layer of the data; it has no parameters, it just shapes the data as the input of the network. An InputLayer is always the first layer of any network.
class yadll.layers.ReshapeLayer(incoming, output_shape=None, **kwargs)
    Reshape the incoming layer to the output_shape.
class yadll.layers.DenseLayer(incoming, n_units, W=glorot_uniform, b=constant, activation=tanh, l1=None, l2=None, **kwargs)
    Fully connected neural network layer.
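A minimal usage sketch, chaining an InputLayer into a DenseLayer with the constructor arguments documented above. The 784/500 sizes and the l2 value are illustrative, and the tanh import assumes it lives in yadll.activations, as the defaults suggest.

    from yadll.layers import InputLayer, DenseLayer
    from yadll.activations import tanh  # assumed location, matching the defaults above

    # 28x28 images flattened to 784 features; the batch size is left free
    l_in = InputLayer(input_shape=(None, 784), name='input')
    l_hid = DenseLayer(incoming=l_in, n_units=500, activation=tanh,
                       l2=1e-4, name='hidden 1')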
class yadll.layers.Activation(incoming, activation=linear, **kwargs)
    Apply an activation function to the previous layer.
class yadll.layers.UnsupervisedLayer(incoming, n_units, hyperparameters, **kwargs)
    Base class for all unsupervised layers. Unsupervised layers are pre-trained against their own input.
class yadll.layers.LogisticRegression(incoming, n_class, W=constant, activation=softmax, **kwargs)
    Dense layer with softmax activation.

    References
        [R5757] http://deeplearning.net/tutorial/logreg.html
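Continuing the sketch above, a softmax classification head is simply a LogisticRegression layer stacked on the last hidden layer. The class count of 10 is illustrative; wiring the layers into a trainable network is handled elsewhere in yadll and not shown here.

    from yadll.layers import LogisticRegression

    l_out = LogisticRegression(incoming=l_hid, n_class=10, name='output')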
class yadll.layers.Dropconnect(incoming, n_units, corruption_level=0.5, **kwargs)
    DropConnect layer.
class yadll.layers.PoolLayer(incoming, pool_size, stride=None, ignore_border=True, pad=(0, 0), mode='max', **kwargs)
    Pooling layer, default is max pooling.
class yadll.layers.ConvLayer(incoming, image_shape=None, filter_shape=None, W=glorot_uniform, border_mode='valid', subsample=(1, 1), l1=None, l2=None, pool_scale=None, **kwargs)
    Convolutional layer.
class yadll.layers.ConvPoolLayer(incoming, pool_size, image_shape=None, filter_shape=None, b=constant, activation=tanh, **kwargs)
    Convolutional and pooling layer.

    References
        [R5959] http://deeplearning.net/tutorial/lenet.html
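A LeNet-style sketch built from the constructors documented above. The image_shape and filter_shape tuples are assumed to follow the usual Theano conv2d conventions, (batch, channels, height, width) and (n_filters, channels, filter_height, filter_width); the concrete sizes are illustrative only.

    from yadll.layers import InputLayer, ConvPoolLayer, FlattenLayer, DenseLayer

    batch_size = 128
    l_in = InputLayer(input_shape=(batch_size, 1, 28, 28))
    l_conv1 = ConvPoolLayer(incoming=l_in, pool_size=(2, 2),
                            image_shape=(batch_size, 1, 28, 28),
                            filter_shape=(20, 1, 5, 5))
    l_flat = FlattenLayer(incoming=l_conv1)   # back to (batch, features)
    l_hid = DenseLayer(incoming=l_flat, n_units=500)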
class yadll.layers.AutoEncoder(incoming, n_units, hyperparameters, corruption_level=0.0, W=(glorot_uniform, {'gain': sigmoid}), b_prime=constant, sigma=None, contraction_level=None, **kwargs)
    Autoencoder.

    References
        [R6161] http://deeplearning.net/tutorial/dA.html
class yadll.layers.RBM(incoming, n_units, hyperparameters, W=glorot_uniform, b_hidden=constant, activation=sigmoid, **kwargs)
    Restricted Boltzmann Machines.

    References
        [R6363] http://deeplearning.net/tutorial/rbm.html
class yadll.layers.BatchNormalization(incoming, axis=-2, alpha=0.1, epsilon=1e-05, has_beta=True, **kwargs)
    Normalize the input layer over each mini-batch according to [R6565]:

    \[\begin{split}\hat{x} &= \frac{x - E[x]}{\sqrt{Var[x] + \epsilon}}\\
    y &= \gamma * \hat{x} + \beta\end{split}\]

    Warning
        When a BatchNormalization layer is used, the batch size has to be given at compile time. You can no longer use None as the first dimension, and prediction has to be made with the same batch size.

    References
        [R6565] http://jmlr.org/proceedings/papers/v37/ioffe15.pdf
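The equations translate directly into the following numpy sketch of the training-time transformation. This is an illustration only, not the layer's Theano implementation; running averages, the alpha parameter and the axis handling are omitted.

    import numpy as np

    def batch_norm(x, gamma, beta, epsilon=1e-5):
        # statistics are taken over the mini-batch axis
        mean = x.mean(axis=0)
        var = x.var(axis=0)
        x_hat = (x - mean) / np.sqrt(var + epsilon)
        return gamma * x_hat + beta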
class yadll.layers.RNN(incoming, n_units, n_out=None, activation=sigmoid, last_only=True, grad_clipping=0, go_backwards=False, allow_gc=False, **kwargs)
    Recurrent Neural Network

    \[h_t = \sigma(x_t.W + h_{t-1}.U + b)\]

    References
        [R6769] http://deeplearning.net/tutorial/rnnslu.html
        [R6869] https://arxiv.org/pdf/1602.06662.pdf
        [R6969] https://arxiv.org/pdf/1511.06464.pdf
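For illustration, the recurrence above written in numpy; the layer itself runs it symbolically (presumably via theano.scan), and the sigmoid matches the default activation.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def rnn_forward(x, W, U, b, h0):
        # x: (n_time_steps, n_dim), W: (n_dim, n_units), U: (n_units, n_units)
        h = h0
        outputs = []
        for x_t in x:
            h = sigmoid(x_t.dot(W) + h.dot(U) + b)
            outputs.append(h)
        return np.stack(outputs)  # (n_time_steps, n_units)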
class yadll.layers.LSTM(incoming, n_units, peepholes=False, tied_i_f=False, activation=tanh, last_only=True, grad_clipping=0, go_backwards=False, allow_gc=False, **kwargs)
    Long Short Term Memory

    \[\begin{split}i_t &= \sigma(x_t.W_i + h_{t-1}.U_i + b_i)\\
    f_t &= \sigma(x_t.W_f + h_{t-1}.U_f + b_f)\\
    \tilde{C_t} &= \tanh(x_t.W_c + h_{t-1}.U_c + b_c)\\
    C_t &= f_t * C_{t-1} + i_t * \tilde{C_t}\\
    o_t &= \sigma(x_t.W_o + h_{t-1}.U_o + b_o)\\
    h_t &= o_t * \tanh(C_t)\end{split}\]

    with peephole connections:

    \[\begin{split}i_t &= \sigma(x_t.W_i + h_{t-1}.U_i + C_{t-1}.P_i + b_i)\\
    f_t &= \sigma(x_t.W_f + h_{t-1}.U_f + C_{t-1}.P_f + b_f)\\
    \tilde{C_t} &= \tanh(x_t.W_c + h_{t-1}.U_c + b_c)\\
    C_t &= f_t * C_{t-1} + i_t * \tilde{C_t}\\
    o_t &= \sigma(x_t.W_o + h_{t-1}.U_o + C_t.P_o + b_o)\\
    h_t &= o_t * \tanh(C_t)\end{split}\]

    with tied forget and input gates:

    \[C_t = f_t * C_{t-1} + (1 - f_t) * \tilde{C_t}\]

    Parameters:
        incoming : a Layer
            The incoming layer with an output_shape = (n_batches, n_time_steps, n_dim).
        n_units : int
            n_hidden = n_input_gate = n_forget_gate = n_cell_gate = n_output_gate = n_units. All gates have the same number of units.
        n_out : int
            Number of output units.
        peepholes : boolean, default is False
            Use peephole connections.
        tied_i_f : boolean, default is False
            Tie the input and forget gates.
        activation : yadll.activations function, default is yadll.activations.tanh
            Activation function.
        last_only : boolean, default is True
            Set to True if you only need the last element of the output sequence; Theano will optimize the graph.

    References
        [R7377] http://deeplearning.net/tutorial/lstm.html
        [R7477] http://christianherta.de/lehre/dataScience/machineLearning/neuralNetworks/LSTM.php
        [R7577] http://people.idsia.ch/~juergen/lstm/
        [R7677] http://colah.github.io/posts/2015-08-Understanding-LSTMs/
        [R7777] https://arxiv.org/pdf/1308.0850v5.pdf
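One time step of the basic (non-peephole) equations above, written in numpy for illustration. The W, U, b dictionaries grouping the gate parameters are a notational convenience, not the layer's actual parameter layout.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def lstm_step(x_t, h_prev, c_prev, W, U, b):
        i_t = sigmoid(x_t.dot(W['i']) + h_prev.dot(U['i']) + b['i'])
        f_t = sigmoid(x_t.dot(W['f']) + h_prev.dot(U['f']) + b['f'])
        c_tilde = np.tanh(x_t.dot(W['c']) + h_prev.dot(U['c']) + b['c'])
        c_t = f_t * c_prev + i_t * c_tilde
        o_t = sigmoid(x_t.dot(W['o']) + h_prev.dot(U['o']) + b['o'])
        h_t = o_t * np.tanh(c_t)
        return h_t, c_t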
class yadll.layers.GRU(incoming, n_units, activation=tanh, last_only=True, grad_clipping=0, go_backwards=False, allow_gc=False, **kwargs)
    Gated Recurrent Unit

    \[\begin{split}z_t &= \sigma(x_t.W_z + h_{t-1}.U_z + b_z)\\
    r_t &= \sigma(x_t.W_r + h_{t-1}.U_r + b_r)\\
    \tilde{h_t} &= \tanh(x_t.W_h + (r_t*h_{t-1}).U_h + b_h)\\
    h_t &= (1 - z_t) * h_{t-1} + z_t * \tilde{h_t}\end{split}\]

    References
        [R8385] http://deeplearning.net/tutorial/lstm.html
        [R8485] https://arxiv.org/pdf/1412.3555.pdf
        [R8585] http://jmlr.org/proceedings/papers/v37/jozefowicz15.pdf
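And the GRU equations as a single numpy step, again for illustration only; the grouped W, U, b dictionaries are a notational convenience.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def gru_step(x_t, h_prev, W, U, b):
        z_t = sigmoid(x_t.dot(W['z']) + h_prev.dot(U['z']) + b['z'])
        r_t = sigmoid(x_t.dot(W['r']) + h_prev.dot(U['r']) + b['r'])
        h_tilde = np.tanh(x_t.dot(W['h']) + (r_t * h_prev).dot(U['h']) + b['h'])
        return (1.0 - z_t) * h_prev + z_t * h_tilde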