yadll.layers

Layers

The layer classes each implement one type of neural network layer. The Layer class is the base class of all layers and has to be inherited by any new layer.

All the neural network layers currently supported by yadll:

Layer(incoming[, name]) Layer is the base class of any neural network layer.
InputLayer(input_shape[, input]) Input layer of the data, it has no parameters, it just shapes the data as the input for any network.
ReshapeLayer(incoming[, output_shape]) Reshape the incoming layer to the output_shape.
FlattenLayer(incoming[, n_dim]) Reshape the incoming layer back to a flat shape
Activation(incoming[, activation]) Apply activation function to previous layer
DenseLayer(incoming, n_units[, W, b, ...]) Fully connected neural network layer
UnsupervisedLayer(incoming, n_units, ...) Base class for all unsupervised layers.
LogisticRegression(incoming, n_class[, W, ...]) Dense layer with softmax activation
Dropout(incoming[, corruption_level]) Dropout layer
Dropconnect(incoming, n_units[, ...]) DropConnect layer
PoolLayer(incoming, pool_size[, stride, ...]) Pooling layer, default is maxpooling
ConvLayer(incoming[, image_shape, ...]) Convolutional layer
ConvPoolLayer(incoming, pool_size[, ...]) Convolutional and pooling layer
AutoEncoder(incoming, n_units, hyperparameters) Autoencoder
RBM(incoming, n_units, hyperparameters[, W, ...]) Restricted Boltzmann Machines
BatchNormalization(incoming[, axis, alpha, ...]) Normalize the input layer over each mini-batch
RNN(incoming, n_units[, n_out, activation, ...]) Recurrent Neural Network
LSTM(incoming, n_units[, peepholes, ...]) Long Short Term Memory
GRU(incoming, n_units[, activation, ...]) Gated Recurrent Unit

Detailed description

class yadll.layers.Layer(incoming, name=None, **kwargs)[source]

Layer is the base class of any neural network layer. It has to be subclassed by any kind of layer.

Parameters:

incoming : a Layer, a list of Layers or a tuple of int

The incoming layer, a list of incoming layers or the shape of the input layer

name : string, optional

The layer name. The default name is the class name plus the instantiation number, e.g. ‘DenseLayer 3’

get_output(**kwargs)[source]

Return the output of this layer

Raises:

NotImplementedError

This method has to be overridden by any new layer implementation.

get_params()[source]

Theano shared variables representing the parameters of this layer.

Returns: list of Theano shared variables that parametrize the layer

get_reguls()[source]

Theano expression representing the sum of the regularization terms of this layer.

Returns: Theano expression representing the sum of the regularization terms of this layer

output_shape[source]

Compute the output shape of this layer given the input shape.

Returns: a tuple representing the shape of the output of this layer.

Notes

This method has to be overridden by any new layer implementation; otherwise it returns the input shape.
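
Defining a new layer amounts to subclassing Layer and overriding get_output (and output_shape when the shape changes). The following is a minimal, hypothetical sketch; the attribute names input_layer and input_shape are assumptions about how the base class stores the incoming layer and its shape, so check the actual Layer implementation before relying on them.

    from yadll.layers import Layer

    class ScaleLayer(Layer):
        """Hypothetical layer that multiplies its input by a constant factor."""

        def __init__(self, incoming, factor=2.0, **kwargs):
            super(ScaleLayer, self).__init__(incoming, **kwargs)
            self.factor = factor

        def get_output(self, **kwargs):
            # Symbolic output of the previous layer, scaled element-wise.
            # 'input_layer' is an assumed attribute name for the incoming layer.
            return self.factor * self.input_layer.get_output(**kwargs)

        @property
        def output_shape(self):
            # Element-wise operation: the shape is unchanged.
            # 'input_shape' is likewise an assumed attribute name.
            return self.input_shape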

class yadll.layers.InputLayer(input_shape, input=None, **kwargs)[source]

Input layer of the data: it has no parameters, it just shapes the data as the input of the network. An InputLayer is always the first layer of any network.

class yadll.layers.ReshapeLayer(incoming, output_shape=None, **kwargs)[source]

Reshape the incoming layer to the output_shape.

class yadll.layers.FlattenLayer(incoming, n_dim=2, **kwargs)[source]

Reshape the incoming layer back to a flat shape

class yadll.layers.DenseLayer(incoming, n_units, W=<function glorot_uniform>, b=<function constant>, activation=<function tanh>, l1=None, l2=None, **kwargs)[source]

Fully connected neural network layer
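
As a brief usage sketch, a dense layer can be stacked on an input layer by passing it as incoming, using only the signatures documented on this page (the model/training API is not shown; the free batch dimension None and the concrete sizes are illustrative assumptions):

    from yadll.layers import InputLayer, DenseLayer
    from yadll.activations import tanh

    # Input: mini-batches of 784-dimensional vectors, batch size left free as None.
    l_in = InputLayer(input_shape=(None, 784))

    # Fully connected layer with 500 units, tanh activation and L2 regularization.
    l_hid = DenseLayer(incoming=l_in, n_units=500, activation=tanh, l2=1e-4)

    print(l_hid.output_shape)  # expected: (None, 500)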

class yadll.layers.Activation(incoming, activation=<function linear>, **kwargs)[source]

Apply activation function to previous layer

class yadll.layers.UnsupervisedLayer(incoming, n_units, hyperparameters, **kwargs)[source]

Base class for all unsupervised layers. Unsupervised layers are pre-trained against their own input.

class yadll.layers.LogisticRegression(incoming, n_class, W=<function constant>, activation=<function softmax>, **kwargs)[source]

Dense layer with softmax activation

References

[R5757] http://deeplearning.net/tutorial/logreg.html

class yadll.layers.Dropout(incoming, corruption_level=0.5, **kwargs)[source]

Dropout layer

class yadll.layers.Dropconnect(incoming, n_units, corruption_level=0.5, **kwargs)[source]

DropConnect layer

class yadll.layers.PoolLayer(incoming, pool_size, stride=None, ignore_border=True, pad=(0, 0), mode='max', **kwargs)[source]

Pooling layer, default is maxpooling

class yadll.layers.ConvLayer(incoming, image_shape=None, filter_shape=None, W=<function glorot_uniform>, border_mode='valid', subsample=(1, 1), l1=None, l2=None, pool_scale=None, **kwargs)[source]

Convolutional layer

class yadll.layers.ConvPoolLayer(incoming, pool_size, image_shape=None, filter_shape=None, b=<function constant>, activation=<function tanh>, **kwargs)[source]

Convolutional and pooling layer

References

[R5959] http://deeplearning.net/tutorial/lenet.html
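
The sketch below builds a LeNet-style convolution + pooling block with the signature above. The 4-D conventions image_shape = (batch, channels, height, width) and filter_shape = (n_filters, channels, filter_height, filter_width) follow the Theano LeNet tutorial cited in [R5959] and are an assumption here, not something stated on this page:

    from yadll.layers import InputLayer, ConvPoolLayer

    # A fixed batch of 128 grayscale 28x28 images.
    l_in = InputLayer(input_shape=(128, 1, 28, 28))

    # 20 filters of size 5x5 followed by 2x2 max pooling.
    l_convpool = ConvPoolLayer(incoming=l_in,
                               pool_size=(2, 2),
                               image_shape=(128, 1, 28, 28),
                               filter_shape=(20, 1, 5, 5))
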
class yadll.layers.AutoEncoder(incoming, n_units, hyperparameters, corruption_level=0.0, W=(<function glorot_uniform>, {'gain': <function sigmoid>}), b_prime=<function constant>, sigma=None, contraction_level=None, **kwargs)[source]

Autoencoder

References

[R6161] http://deeplearning.net/tutorial/dA.html

class yadll.layers.RBM(incoming, n_units, hyperparameters, W=<function glorot_uniform>, b_hidden=<function constant>, activation=<function sigmoid>, **kwargs)[source]

Restricted Boltzmann Machines

References

[R6363] http://deeplearning.net/tutorial/rbm.html

class yadll.layers.BatchNormalization(incoming, axis=-2, alpha=0.1, epsilon=1e-05, has_beta=True, **kwargs)[source]

Normalize the input layer over each mini-batch according to [R6565]:

\[\begin{split}\hat{x} &= \frac{x - E[x]}{\sqrt{Var[x] + \epsilon}}\\ y &= \gamma * \hat{x} + \beta\end{split}\]

Warning

When a BatchNormalization layer is used, the batch size has to be given at compile time: you can no longer use None as the first dimension, and prediction has to be made with the same batch size.
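
As a plain NumPy illustration of the formula above (not the layer's actual Theano implementation), each feature is normalized over the mini-batch before the learned scale gamma and shift beta are applied:

    import numpy as np

    def batch_norm(x, gamma, beta, epsilon=1e-5):
        # x has shape (batch_size, n_features); statistics are taken over the batch axis.
        mean = x.mean(axis=0)
        var = x.var(axis=0)
        x_hat = (x - mean) / np.sqrt(var + epsilon)
        return gamma * x_hat + beta

    x = np.random.randn(32, 100)   # one mini-batch
    y = batch_norm(x, gamma=np.ones(100), beta=np.zeros(100))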

References

[R6565] http://jmlr.org/proceedings/papers/v37/ioffe15.pdf

class yadll.layers.RNN(incoming, n_units, n_out=None, activation=<function sigmoid>, last_only=True, grad_clipping=0, go_backwards=False, allow_gc=False, **kwargs)[source]

Recurrent Neural Network

\[h_t = \sigma(x_t.W + h_{t-1}.U + b)\]

References

[R6769] http://deeplearning.net/tutorial/rnnslu.html
[R6869] https://arxiv.org/pdf/1602.06662.pdf
[R6969] https://arxiv.org/pdf/1511.06464.pdf
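
The recurrence above can be written out in plain NumPy for a single sequence (an illustration only; the layer itself builds a symbolic Theano graph):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def rnn_forward(x, W, U, b):
        # x: (n_time_steps, n_dim), W: (n_dim, n_units), U: (n_units, n_units), b: (n_units,)
        h = np.zeros(U.shape[0])
        outputs = []
        for x_t in x:                          # h_t = sigmoid(x_t.W + h_{t-1}.U + b)
            h = sigmoid(x_t @ W + h @ U + b)
            outputs.append(h)
        return np.stack(outputs)               # (n_time_steps, n_units)

    h_seq = rnn_forward(np.random.randn(20, 10),
                        W=np.random.randn(10, 16) * 0.1,
                        U=np.random.randn(16, 16) * 0.1,
                        b=np.zeros(16))
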
class yadll.layers.LSTM(incoming, n_units, peepholes=False, tied_i_f=False, activation=<function tanh>, last_only=True, grad_clipping=0, go_backwards=False, allow_gc=False, **kwargs)[source]

Long Short Term Memory

\[\begin{split}i_t &= \sigma(x_t.W_i + h_{t-1}.U_i + b_i)\\ f_t &= \sigma(x_t.W_f + h_{t-1}.U_f + b_f)\\ \tilde{C_t} &= \tanh(x_t.W_c + h_{t-1}.U_c + b_c)\\ C_t &= f_t * C_{t-1} + i_t * \tilde{C_t}\\ o_t &= \sigma(x_t.W_o + h_{t-1}.U_o + b_o)\\ h_t &= o_t * \tanh(C_t)\\ \text{with peephole connections:}\\ i_t &= \sigma(x_t.W_i + h_{t-1}.U_i + C_{t-1}.P_i + b_i)\\ f_t &= \sigma(x_t.W_f + h_{t-1}.U_f + C_{t-1}.P_f + b_f)\\ \tilde{C_t} &= \tanh(x_t.W_c + h_{t-1}.U_c + b_c)\\ C_t &= f_t * C_{t-1} + i_t * \tilde{C_t}\\ o_t &= \sigma(x_t.W_o + h_{t-1}.U_o + C_t.P_o + b_o)\\ h_t &= o_t * \tanh(C_t)\\ \text{with tied forget and input gates:}\\ C_t &= f_t * C_{t-1} + (1 - f_t) * \tilde{C_t}\end{split}\]

Parameters:

incoming : a Layer

The incoming layer with an output_shape = (n_batches, n_time_steps, n_dim)

n_units : int

n_hidden = n_input_gate = n_forget_gate = n_cell_gate = n_output_gate = n_units; all gates have the same number of units

n_out : int

number of output units

peepholes : boolean, default is False

Use peephole connections.

tied_i_f : boolean, default is False

Tie the input and forget gates.

activation : yadll.activations function, default is yadll.activations.tanh

Activation function.

last_only : boolean, default is True

Set to True if you only need the last element of the output sequence; Theano will optimize the graph.

References

[R7377] http://deeplearning.net/tutorial/lstm.html
[R7477] http://christianherta.de/lehre/dataScience/machineLearning/neuralNetworks/LSTM.php
[R7577] http://people.idsia.ch/~juergen/lstm/
[R7677] http://colah.github.io/posts/2015-08-Understanding-LSTMs/
[R7777] https://arxiv.org/pdf/1308.0850v5.pdf
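
A minimal construction sketch, using only the signature and the (n_batches, n_time_steps, n_dim) input convention stated above; training and compilation through the rest of the yadll API are not shown, and the concrete sizes are illustrative assumptions:

    from yadll.layers import InputLayer, LSTM

    # Sequences of 50 time steps with 10 features each; batch size left free.
    l_in = InputLayer(input_shape=(None, 50, 10))

    # LSTM with 128 units per gate, peephole connections, returning only the last h_T.
    l_lstm = LSTM(incoming=l_in, n_units=128, peepholes=True, last_only=True)
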
class yadll.layers.GRU(incoming, n_units, activation=<function tanh>, last_only=True, grad_clipping=0, go_backwards=False, allow_gc=False, **kwargs)[source]

Gated Recurrent Unit

\[\begin{split}z_t &= \sigma(x_t.W_z + h_{t-1}.U_z + b_z)\\ r_t &= \sigma(x_t.W_r + h_{t-1}.U_r + b_r)\\ \tilde{h_t} &= \tanh(x_t.W_h + (r_t*h_{t-1}).U_h + b_h)\\ h_t &= (1 - z_t) * h_{t-1} + z_t * \tilde{h_t}\end{split}\]

References

[R8385] http://deeplearning.net/tutorial/lstm.html
[R8485] https://arxiv.org/pdf/1412.3555.pdf
[R8585] http://jmlr.org/proceedings/papers/v37/jozefowicz15.pdf