Data
yadll.data.normalize(x)
    Normalization: scale data to [0, 1].

    \[z = (x - min(x)) / (max(x) - min(x))\]

    Parameters: x : numpy array
    Returns: z, min, max
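yadll's actual implementation may differ in detail; a minimal NumPy sketch consistent with the documented formula and return values:

```python
import numpy as np

def normalize(x):
    # Scale data to [0, 1]: z = (x - min(x)) / (max(x) - min(x)).
    # The min and max are returned so the same scaling can be reapplied later.
    x_min, x_max = np.min(x), np.max(x)
    z = (x - x_min) / (x_max - x_min)
    return z, x_min, x_max
```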
yadll.data.standardize(x, epsilon=1e-06)
    Standardization: scale to mean=0 and std=1.

    \[z = (x - mean(x)) / std(x)\]

    Parameters: x : numpy array
    Returns: z, mean, std
yadll.data.apply_standardize(x, x_mean, x_std)
    Apply standardization to data, given a mean and std computed elsewhere (typically on the training set).
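The two functions above pair naturally: `standardize` computes the statistics, `apply_standardize` reuses them on new data. A sketch under the assumption that `epsilon` guards against division by a zero std (the library's internals may differ):

```python
import numpy as np

def standardize(x, epsilon=1e-6):
    # Scale to mean 0 and std 1: z = (x - mean(x)) / std(x).
    # epsilon (assumed role) avoids division by zero for constant inputs.
    x_mean, x_std = np.mean(x), np.std(x)
    z = (x - x_mean) / (x_std + epsilon)
    return z, x_mean, x_std

def apply_standardize(x, x_mean, x_std):
    # Reuse training-set statistics on validation/test data.
    return (x - x_mean) / x_std
```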
yadll.data.one_hot_encoding(arr, N=None)
    One hot encoding of a vector of integer categorical variables in the range [0..N].
    You can provide the highest category N, otherwise max(arr) is used.

    Parameters: arr : numpy array
        array of integers in the range [0, N]
    N : int, optional
        highest category
    Returns: one hot encoding, e.g. [0, 1, 0, 0]

    Examples

    >>> a = np.asarray([1, 0, 3])
    >>> one_hot_encoding(a)
    array([[ 0.,  1.,  0.,  0.],
           [ 1.,  0.,  0.,  0.],
           [ 0.,  0.,  0.,  1.]])
    >>> one_hot_encoding(a, 5)
    array([[ 0.,  1.,  0.,  0.,  0.,  0.],
           [ 1.,  0.,  0.,  0.,  0.,  0.],
           [ 0.,  0.,  0.,  1.,  0.,  0.]])
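A sketch matching the doctest behaviour above (N + 1 columns, inferring N from `max(arr)` when omitted); not necessarily the library's exact code:

```python
import numpy as np

def one_hot_encoding(arr, N=None):
    # Encode integers in [0, N] as rows with a single 1.
    if N is None:
        N = np.max(arr)
    mat = np.zeros((len(arr), N + 1))
    # Integer fancy indexing: set column arr[i] of row i to 1.
    mat[np.arange(len(arr)), arr] = 1.
    return mat
```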
yadll.data.one_hot_decoding(mat)
    Decoding of a one hot matrix.

    Parameters: mat : numpy matrix
        one hot matrix
    Returns: vector of decoded values

    Examples

    >>> a = np.asarray([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]])
    >>> one_hot_decoding(a)
    array([1, 0, 3])
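Decoding is the inverse operation: the position of the 1 in each row is the original label, which `np.argmax` recovers directly. A minimal sketch:

```python
import numpy as np

def one_hot_decoding(mat):
    # The column index of the maximum (the single 1) in each row
    # is the original integer category.
    return np.argmax(mat, axis=1)
```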
class yadll.data.Data(data, preprocessing=None, shared=True, borrow=True, cast_y=False)
    Data container.

    data is made of train_set, valid_set and test_set, where each set is a pair set_x, set_y.

    Parameters: data : string
        data file name (with path)
    shared : bool
        Theano shared variable
    borrow : bool
        Theano borrowable variable
    cast_y : bool
        cast y to intX

    Examples

    Load data

    >>> yadll.data.Data('data/mnist/mnist.pkl.gz')

    Methods

    dataset : return the dataset as Theano shared variables
        [(train_set_x, train_set_y), (valid_set_x, valid_set_y), (test_set_x, test_set_y)]
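The `mnist.pkl.gz` file in the example is a gzipped pickle holding the three sets described above. `Data` additionally wraps them in Theano shared variables, which this sketch omits; it only illustrates the assumed on-disk layout with a hypothetical `load_dataset` helper:

```python
import gzip
import pickle

def load_dataset(path):
    # Load a gzipped pickle containing (train_set, valid_set, test_set),
    # each set being a (set_x, set_y) pair -- the mnist.pkl.gz layout.
    # Note: unpickling the original Python 2 MNIST file on Python 3
    # may require pickle.load(f, encoding='latin1').
    with gzip.open(path, 'rb') as f:
        train_set, valid_set, test_set = pickle.load(f)
    return train_set, valid_set, test_set
```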