Convolutional RNN

CGRU

class mdgru.model.crnn.cgru.CGRUCell(myshape, num_units, kw)[source]

Bases: mdgru.model.crnn.CRNNCell

Convolutional gated recurrent unit.

This class processes n-d data along the last dimension using a gated recurrent unit, which applies (n-1)-d convolutions along that last dimension to gather context from the input and the last state when producing the new state. The property defaults contains default values for all properties of a CGRUCell that are the same within one MDGRU.

Parameters:
  • add_x_bn – Enables batch normalization on inputs for gates
  • kw – dict containing the following options:

  • add_x_bn [default: False] Add batch normalization at the gates input
  • add_h_bn [default: False] Add batch normalization at the gates state
  • add_a_bn [default: False] Add batch normalization at the candidates input and state
  • resgrux [default: False] Add residual learning to the input of each cgru
  • resgruh [default: False] Add residual learning to the state of each cgru
  • put_r_back [default: False] Move the reset gate to the location the original GRU applies it at
  • use_dropconnect_on_state [default: False] Apply dropconnect on the candidate weights as well
  • min_mini_batch [default: None] Number of iteration batches to average over
  • istraining [default: constant boolean tensor (Const:0)]
  • gate [default: sigmoid]
Parameters:
  • add_h_bn – Enables batch normalization on last state for gates
  • add_a_bn – Enables batch normalization for the candidate (both input and last state)
  • resgrux – Enables residual learning on weighted input
  • resgruh – Enables residual learning on weighted previous output / state
  • use_dropconnect_on_state – Should dropconnect be used also for the candidate computation?
  • put_r_back – Use reset gate r’s position of original gru formulation, which complicates computation.
  • min_mini_batch – Emulation
  • gate – Defines activation function to be used for gates
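The kw option handling above can be sketched in plain Python. The helper below is a hypothetical illustration of the filter-by-defaults pattern the docstring describes (the name resolve_options and the trimmed _defaults contents are assumptions, not mdgru's actual code):

```python
# Hypothetical sketch of the kw/defaults option handling described above
# (illustrative names; not mdgru's actual implementation).

_defaults = {
    'add_x_bn': {'value': False, 'help': 'Add batch normalization at the gates input'},
    'add_h_bn': {'value': False, 'help': 'Add batch normalization at the gates state'},
    'put_r_back': {'value': False, 'help': 'Move the reset gate to its original GRU location'},
    'min_mini_batch': {'value': None, 'help': 'Number of iteration batches to average over'},
}

def resolve_options(kw):
    """Resolve options: kw entries override defaults; keys not in _defaults are ignored."""
    resolved = {}
    for name, spec in _defaults.items():
        default = spec['value'] if isinstance(spec, dict) else spec
        resolved[name] = kw.get(name, default)
    return resolved

opts = resolve_options({'add_x_bn': True, 'unrelated': 123})
# opts['add_x_bn'] is True (overridden); opts['add_h_bn'] keeps its default False.
```

This mirrors how defaults are said to "filter valid arguments": only keys known to the cell survive into the resolved options.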
_add_inbound_node(input_tensors, output_tensors, arguments=None)

Internal method to create an inbound node for the layer.

Parameters:
  • input_tensors – list of input tensors.
  • output_tensors – list of output tensors.
  • arguments – dictionary of keyword arguments that were passed to the call method of the layer at the call that created the node.
_add_variable_with_custom_getter(name, shape=None, dtype=tf.float32, initializer=None, getter=None, overwrite=False, **kwargs_for_getter)

Restore-on-create for a variable to be saved with this Checkpointable.

If the user has requested that this object or another Checkpointable which depends on this object be restored from a checkpoint (deferred loading before variable object creation), initializer may be ignored and the value from the checkpoint used instead.

Parameters:
  • name – A name for the variable. Must be unique within this object.
  • shape – The shape of the variable.
  • dtype – The data type of the variable.
  • initializer – The initializer to use. Ignored if there is a deferred restoration left over from a call to _restore_from_checkpoint_position.
  • getter – The getter to wrap which actually fetches the variable.
  • overwrite – If True, disables unique name and type checks.
  • **kwargs_for_getter – Passed to the getter.
Returns:

The new variable object.

Raises:

ValueError – If the variable name is not unique.

_assert_input_compatibility(inputs)

Checks compatibility between the layer and provided inputs.

This checks that the tensor(s) inputs verify the input assumptions of the layer (if any). If not, a clear and actionable exception gets raised.

Parameters:inputs – input tensor or list of input tensors.
Raises:ValueError – in case of mismatch between the provided inputs and the expectations of the layer.
_checkpoint_dependencies

All dependencies of this object.

May be overridden to include conditional dependencies.

Returns:A list of CheckpointableReference objects indicating named Checkpointable dependencies which should be saved along with this object.
_convlinear(args, output_size, bias, bias_start=0.0, scope=None, dropconnectx=None, dropconnecth=None, dropconnectxmatrix=None, dropconnecthmatrix=None, strides=None, orthogonal_init=True)

Computes the convolution of current input and previous output or state (args[0] and args[1]).

The two tensors contained in args are convolved with their respective filters. Due to TensorFlow's RNN library, spatial dimensions are collapsed and have to be restored before convolution. Also, dropconnect matrices are applied to the weights. If specified, a bias is generated and returned as well.

Parameters:
  • args (2-Tuple of ndarrays) – Current input and last output
  • output_size (int) – Number of output channels (separate from myshapes[1][-1], as sometimes this value differs)
  • bias (bool) – Flag if bias should be used
  • bias_start (float) – Initial value for the bias
  • scope (str) – Override standard “ConvLinear” scope
  • dropconnectx – Flag if dropconnect should be applied on input weights
  • dropconnecth – Flag if dropconnect should be applied on state weights
  • dropconnectxmatrix – Dropconnect matrix for input weights
  • dropconnecthmatrix – Dropconnect matrix for state weights
  • strides – Strides to be applied to the input convolution
  • orthogonal_init – Flag if orthogonal initialization should be performed for the state weights
Returns:

2-tuple of results for state and input, 3-tuple additionally including a bias if requested

_convolution(data, convolution_filter, filter_shape=None, strides=None, is_circular_convolution=False)

Convolves data and convolution_filter, using circular convolution if required.

_default_crnn_activation()
_defaults = {'add_a_bn': {'value': False, 'help': 'Add batch normalization at the candidates input and state'}, 'add_h_bn': {'value': False, 'help': 'Add batch normalization at the gates state'}, 'add_x_bn': {'value': False, 'help': 'Add batch normalization at the gates input'}, 'gate': <function sigmoid>, 'istraining': <tf.Tensor 'Const:0' shape=() dtype=bool>, 'min_mini_batch': {'value': None, 'help': 'Number of iteration batches to average over'}, 'put_r_back': {'value': False, 'help': 'Move the reset gate to the location the original GRU applies it at'}, 'resgruh': {'value': False, 'help': 'Add residual learning to the state of each cgru'}, 'resgrux': {'value': False, 'help': 'Add residual learning to the input of each cgru'}, 'use_dropconnect_on_state': {'value': False, 'help': 'Apply dropconnect on the candidate weights as well'}}
_deferred_dependencies

A dictionary with deferred dependencies.

Stores restorations for other Checkpointable objects on which this object may eventually depend. May be overridden by sub-classes (e.g. Optimizers use conditional dependencies based on the current graph, and so need separate management of deferred dependencies too).

Returns:A dictionary mapping from local name to a list of _CheckpointPosition objects.
_gather_saveables_for_checkpoint()

Returns a dictionary of values to checkpoint with this object.

Keys in the returned dictionary are local to this object and in a separate namespace from dependencies. Values may either be SaveableObject factories or variables easily converted to SaveableObjects (as in tf.train.Saver's var_list constructor argument).

SaveableObjects have a name set, which Checkpointable needs to generate itself. So rather than returning SaveableObjects directly, this method should return a dictionary of callables which take name arguments and return SaveableObjects with that name.

If this object may also be passed to the global-name-based tf.train.Saver, the returned callables should have a default value for their name argument (i.e. be callable with no arguments).

Returned values must be saved only by this object; if any value may be shared, it should instead be a dependency. For example, variable objects save their own values with the key VARIABLE_VALUE_KEY, but objects which reference variables simply add a dependency.

Returns:The dictionary mapping attribute names to SaveableObject factories described above. For example: {VARIABLE_VALUE_KEY: lambda name="global_name_for_this_object": SaveableObject(name=name, …)}
_get_dropconnect(shape, keep_rate, name)

Creates factors to be applied to filters to achieve either Bernoulli or Gaussian dropconnect.
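The Bernoulli variant described above can be illustrated in plain Python: a per-weight factor is sampled and scaled by 1/keep_rate so the expected factor stays 1; a Gaussian variant instead draws factors around 1. This is a sketch of the idea only; the scaling and the Gaussian variance chosen here are illustrative assumptions, not mdgru's actual code:

```python
import random

def bernoulli_dropconnect_factors(n, keep_rate, rng=None):
    """Per-weight factors: 0 with probability (1 - keep_rate), else 1/keep_rate,
    so the expected factor is 1 and weight magnitudes are preserved on average."""
    rng = rng or random.Random(0)
    return [(1.0 / keep_rate) if rng.random() < keep_rate else 0.0 for _ in range(n)]

def gaussian_dropconnect_factors(n, keep_rate, rng=None):
    """Gaussian variant: factors drawn around 1; the variance used here,
    (1 - keep_rate) / keep_rate, matches the Bernoulli mask's variance (an assumption)."""
    rng = rng or random.Random(0)
    sigma = ((1.0 - keep_rate) / keep_rate) ** 0.5
    return [rng.gauss(1.0, sigma) for _ in range(n)]

factors = bernoulli_dropconnect_factors(1000, keep_rate=0.8)
# every factor is either 0.0 or 1/0.8; roughly 80% of the weights are kept
```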

_get_node_attribute_at_index(node_index, attr, attr_name)

Private utility to retrieve an attribute (e.g. inputs) from a node.

This is used to implement the methods:
  • get_input_shape_at
  • get_output_shape_at
  • get_input_at

etc…

Parameters:
  • node_index – Integer index of the node from which to retrieve the attribute.
  • attr – Exact node attribute name.
  • attr_name – Human-readable attribute name, for error messages.
Returns:

The layer’s attribute attr at the node of index node_index.

Raises:
  • RuntimeError – If the layer has no inbound nodes, or if called in Eager mode.
  • ValueError – If the index provided does not match any node.
_get_weights_h(filtershape, dtype, name, orthogonal_init=True)

Return weights for output convolution.

_get_weights_x(filtershape, dtype, name)

Return weights for input convolution.

_handle_deferred_dependencies(name, checkpointable)

Pop and load any deferred checkpoint restores into checkpointable.

This method does not add a new dependency on checkpointable, but it does check if any outstanding/deferred dependencies have been queued waiting for this dependency to be added (matched based on name). If so, checkpointable and its dependencies are restored. The restorations are considered fulfilled and so are deleted.

_track_checkpointable is more appropriate for adding a normal/unconditional dependency, and includes handling for deferred restorations. This method allows objects such as Optimizer to use the same restoration logic while managing conditional dependencies themselves, by overriding _checkpoint_dependencies and _lookup_dependency to change the object’s dependencies based on the context it is saved/restored in (a single optimizer instance can have state associated with multiple graphs).

Parameters:
  • name – The name of the dependency within this object (self), used to match checkpointable with values saved in a checkpoint.
  • checkpointable – The Checkpointable object to restore (inheriting from CheckpointableBase).
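The name-matched deferred-restore pattern described above can be sketched in a few lines. This is purely illustrative (a toy data structure, not TensorFlow's Checkpointable machinery):

```python
# Minimal sketch of name-matched deferred restoration as described above
# (illustrative only; TensorFlow's Checkpointable machinery is more involved).

class DeferredRestores:
    def __init__(self):
        self._deferred = {}  # local name -> queued restore payloads

    def queue(self, name, payload):
        """Record a restoration for a dependency that does not exist yet."""
        self._deferred.setdefault(name, []).append(payload)

    def handle(self, name, obj):
        """Pop any restores queued under `name` and apply them to `obj`;
        fulfilled restorations are deleted, matching the behavior above."""
        for payload in self._deferred.pop(name, []):
            obj.update(payload)  # stand-in for the actual restore logic

kernel = {}
restores = DeferredRestores()
restores.queue('kernel', {'value': 3})  # checkpoint read before variable creation
restores.handle('kernel', kernel)       # variable appears; restore is fulfilled
# kernel now holds the checkpointed value and the queue entry is gone
```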
_init_set_name(name)
_lookup_dependency(name)

Look up a dependency by name.

May be overridden to include conditional dependencies.

Parameters:name – The local name of the dependency.
Returns:A Checkpointable object, or None if no dependency by this name was found.
_make_unique_name(name_uid_map=None, avoid_names=None, namespace='', zero_based=False)
_maybe_initialize_checkpointable()

Initialize dependency management.

Not __init__, since most objects will forget to call it.

_name_scope_name(current_variable_scope)

Determines op naming for the Layer.

_paddata(data, fshape)

Pads spatial dimensions of data, such that a convolution of size fshape results in a circular convolution
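The idea is easiest to see in one dimension: wrap-padding the signal by half the filter size makes a subsequent valid convolution equivalent to a circular convolution. The sketch below is a pure-Python 1-d illustration (assuming an odd filter size and correlation-style convolution), not the library's n-d implementation:

```python
def pad_circular_1d(data, fsize):
    """Wrap-pad a 1-d sequence so a valid convolution with a filter of odd
    size fsize equals a circular convolution of the original sequence."""
    half = fsize // 2
    return data[-half:] + data + data[:half]

def conv_valid_1d(data, filt):
    """Plain valid convolution (correlation form, as in tf.nn.convolution)."""
    n, k = len(data), len(filt)
    return [sum(data[i + j] * filt[j] for j in range(k)) for i in range(n - k + 1)]

signal = [1.0, 2.0, 3.0, 4.0]
filt = [0.25, 0.5, 0.25]
circular = conv_valid_1d(pad_circular_1d(signal, len(filt)), filt)
# circular has the same length as signal, and circular[0] uses signal[-1]
# as its left neighbour: 0.25*4 + 0.5*1 + 0.25*2 = 2.0
```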

_preload_simple_restoration(name, shape)

Return a dependency’s value for restore-on-create.

Note the restoration is not deleted; if for some reason preload is called and then not assigned to the variable (for example because a custom getter overrides the initializer), the assignment will still happen once the variable is tracked (determined based on checkpoint.restore_uid).

Parameters:
  • name – The object-local name of the dependency holding the variable’s value.
  • shape – The shape of the variable being loaded into.
Returns:

A callable for use as a variable's initializer/initial_value, or None if one should not be set (either because there was no variable with this name in the checkpoint or because it needs more complex deserialization). Any non-trivial deserialization will happen when the variable object is tracked.

_restore_from_checkpoint_position(checkpoint_position)

Restore this object and its dependencies (may be deferred).

_rnn_get_variable(getter, *args, **kwargs)
_set_scope(scope=None)
_single_restoration_from_checkpoint_position(checkpoint_position, visit_queue)

Restore this object, and either queue its dependencies or defer them.

_tf_api_names = ('nn.rnn_cell.RNNCell',)
_track_checkpointable(checkpointable, name, overwrite=False)

Declare a dependency on another Checkpointable object.

Indicates that checkpoints for this object should include variables from checkpointable.

Variables in a checkpoint are mapped to Checkpointables based on the names provided when the checkpoint was written. To avoid breaking existing checkpoints when modifying a class, neither variable names nor dependency names (the names passed to _track_checkpointable) may change.

Parameters:
  • checkpointable – A Checkpointable which this object depends on.
  • name – A local name for checkpointable, used for loading checkpoints into the correct objects.
  • overwrite – Boolean, whether silently replacing dependencies is OK. Used for __setattr__, where throwing an error on attribute reassignment would be inappropriate.
Returns:

checkpointable, for convenience when declaring a dependency and assigning to a member variable in one statement.

Raises:
  • TypeError – If checkpointable does not inherit from Checkpointable.
  • ValueError – If another object is already tracked by this name.
activity_regularizer

Optional regularizer function for the output of this layer.

add_loss(losses, inputs=None)

Add loss tensor(s), potentially dependent on layer inputs.

Some losses (for instance, activity regularization losses) may be dependent on the inputs passed when calling a layer. Hence, when reusing the same layer on different inputs a and b, some entries in layer.losses may be dependent on a and some on b. This method automatically keeps track of dependencies.

The get_losses_for method allows retrieving the losses relevant to a specific set of inputs.

Note that add_loss is not supported when executing eagerly. Instead, variable regularizers may be added through add_variable. Activity regularization is not supported directly (but such losses may be returned from Layer.call()).

Parameters:
  • losses – Loss tensor, or list/tuple of tensors.
  • inputs – If anything other than None is passed, it signals the losses are conditional on some of the layer’s inputs, and thus they should only be run where these inputs are available. This is the case for activity regularization losses, for instance. If None is passed, the losses are assumed to be unconditional, and will apply across all dataflows of the layer (e.g. weight regularization losses).
Raises:

RuntimeError – If called in Eager mode.
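The conditional/unconditional bookkeeping described above can be sketched with a small tracker. This is an illustration of the pattern only, not Keras's internal data structures:

```python
# Sketch of conditional vs. unconditional loss bookkeeping as described above
# (an illustration of the pattern, not Keras's internals).

class LossTracker:
    def __init__(self):
        self._unconditional = []  # e.g. weight regularization losses
        self._conditional = {}    # id(inputs) -> losses depending on those inputs

    def add_loss(self, loss, inputs=None):
        """inputs=None marks the loss as unconditional, mirroring add_loss above."""
        if inputs is None:
            self._unconditional.append(loss)
        else:
            self._conditional.setdefault(id(inputs), []).append(loss)

    def get_losses_for(self, inputs):
        """Return only the losses relevant to this specific set of inputs."""
        if inputs is None:
            return list(self._unconditional)
        return list(self._conditional.get(id(inputs), []))

a = object()
tracker = LossTracker()
tracker.add_loss('weight_l2')               # unconditional
tracker.add_loss('activity_a', inputs=a)    # conditional on input a
# tracker.get_losses_for(a) returns only the loss that depends on a
```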

add_update(updates, inputs=None)

Add update op(s), potentially dependent on layer inputs.

Weight updates (for instance, the updates of the moving mean and variance in a BatchNormalization layer) may be dependent on the inputs passed when calling a layer. Hence, when reusing the same layer on different inputs a and b, some entries in layer.updates may be dependent on a and some on b. This method automatically keeps track of dependencies.

The get_updates_for method allows retrieving the updates relevant to a specific set of inputs.

This call is ignored in Eager mode.

Parameters:
  • updates – Update op, or list/tuple of update ops.
  • inputs – If anything other than None is passed, it signals the updates are conditional on some of the layer’s inputs, and thus they should only be run where these inputs are available. This is the case for BatchNormalization updates, for instance. If None, the updates will be taken into account unconditionally, and you are responsible for making sure that any dependency they might have is available at runtime. A step counter might fall into this category.
add_variable(name, shape, dtype=None, initializer=None, regularizer=None, trainable=True, constraint=None, partitioner=None)

Adds a new variable to the layer, or gets an existing one; returns it.

Parameters:
  • name – variable name.
  • shape – variable shape.
  • dtype – The type of the variable. Defaults to self.dtype or float32.
  • initializer – initializer instance (callable).
  • regularizer – regularizer instance (callable).
  • trainable – whether the variable should be part of the layer’s “trainable_variables” (e.g. variables, biases) or “non_trainable_variables” (e.g. BatchNorm mean, stddev). Note, if the current variable scope is marked as non-trainable then this parameter is ignored and any added variables are also marked as non-trainable.
  • constraint – constraint instance (callable).
  • partitioner – (optional) partitioner instance (callable). If provided, when the requested variable is created it will be split into multiple partitions according to partitioner. In this case, an instance of PartitionedVariable is returned. Available partitioners include tf.fixed_size_partitioner and tf.variable_axis_size_partitioner. For more details, see the documentation of tf.get_variable and the “Variable Partitioners and Sharding” section of the API guide.
Returns:

The created variable. Usually either a Variable or ResourceVariable instance. If partitioner is not None, a PartitionedVariable instance is returned.

Raises:

RuntimeError – If called with partitioned variable regularization and eager execution is enabled.

apply(inputs, *args, **kwargs)

Apply the layer on an input.

This simply wraps self.__call__.

Parameters:
  • inputs – Input tensor(s).
  • *args – additional positional arguments to be passed to self.call.
  • **kwargs – additional keyword arguments to be passed to self.call.
Returns:

Output tensor(s).

build(_)
call(inputs, **kwargs)

The logic of the layer lives here.

Parameters:
  • inputs – input tensor(s).
  • **kwargs – additional keyword arguments.
Returns:

Output tensor(s).

compute_output_shape(input_shape)

Computes the output shape of the layer given the input shape.

Parameters:

input_shape – A (possibly nested tuple of) TensorShape. It need not be fully defined (e.g. the batch size may be unknown).

Returns:

A (possibly nested tuple of) TensorShape.

Raises:
  • TypeError – if input_shape is not a (possibly nested tuple of) TensorShape.
  • ValueError – if input_shape is incomplete or is incompatible with the layer.
count_params()

Count the total number of scalars composing the weights.

Returns:An integer count.
Raises:ValueError – if the layer isn’t yet built (in which case its weights aren’t yet defined).
dtype
get_input_at(node_index)

Retrieves the input tensor(s) of a layer at a given node.

Parameters:node_index – Integer, index of the node from which to retrieve the attribute. E.g. node_index=0 will correspond to the first time the layer was called.
Returns:A tensor (or list of tensors if the layer has multiple inputs).
Raises:RuntimeError – If called in Eager mode.
get_input_shape_at(node_index)

Retrieves the input shape(s) of a layer at a given node.

Parameters:node_index – Integer, index of the node from which to retrieve the attribute. E.g. node_index=0 will correspond to the first time the layer was called.
Returns:A shape tuple (or list of shape tuples if the layer has multiple inputs).
Raises:RuntimeError – If called in Eager mode.
get_losses_for(inputs)

Retrieves losses relevant to a specific set of inputs.

Parameters:inputs – Input tensor or list/tuple of input tensors.
Returns:List of loss tensors of the layer that depend on inputs.
Raises:RuntimeError – If called in Eager mode.
get_output_at(node_index)

Retrieves the output tensor(s) of a layer at a given node.

Parameters:node_index – Integer, index of the node from which to retrieve the attribute. E.g. node_index=0 will correspond to the first time the layer was called.
Returns:A tensor (or list of tensors if the layer has multiple outputs).
Raises:RuntimeError – If called in Eager mode.
get_output_shape_at(node_index)

Retrieves the output shape(s) of a layer at a given node.

Parameters:node_index – Integer, index of the node from which to retrieve the attribute. E.g. node_index=0 will correspond to the first time the layer was called.
Returns:A shape tuple (or list of shape tuples if the layer has multiple outputs).
Raises:RuntimeError – If called in Eager mode.
get_updates_for(inputs)

Retrieves updates relevant to a specific set of inputs.

Parameters:inputs – Input tensor or list/tuple of input tensors.
Returns:List of update ops of the layer that depend on inputs.
Raises:RuntimeError – If called in Eager mode.
graph
inbound_nodes

Deprecated, do NOT use! Only for compatibility with external Keras.

input

Retrieves the input tensor(s) of a layer.

Only applicable if the layer has exactly one input, i.e. if it is connected to one incoming layer.

Returns:

Input tensor or list of input tensors.

Raises:
  • AttributeError – if the layer is connected to more than one incoming layer.
input_shape

Retrieves the input shape(s) of a layer.

Only applicable if the layer has exactly one input, i.e. if it is connected to one incoming layer, or if all inputs have the same shape.

Returns:

Input shape, as an integer shape tuple (or list of shape tuples, one tuple per input tensor).

losses

Losses which are associated with this Layer.

Note that when executing eagerly, getting this property evaluates regularizers. When using graph execution, variable regularization ops have already been created and are simply returned here.

Returns:A list of tensors.
name
non_trainable_variables
non_trainable_weights
outbound_nodes

Deprecated, do NOT use! Only for compatibility with external Keras.

output

Retrieves the output tensor(s) of a layer.

Only applicable if the layer has exactly one output, i.e. if it is connected to one incoming layer.

Returns:

Output tensor or list of output tensors.

output_shape

Retrieves the output shape(s) of a layer.

Only applicable if the layer has one output, or if all outputs have the same shape.

Returns:

Output shape, as an integer shape tuple (or list of shape tuples, one tuple per output tensor).

output_size
scope_name
state_size
trainable_variables
trainable_weights
updates
variables

Returns the list of all layer variables/weights.

Returns:A list of variables.
weights

Returns the list of all layer variables/weights.

Returns:A list of variables.
zero_state(batch_size, dtype)

Return zero-filled state tensor(s).

Parameters:
  • batch_size – int, float, or unit Tensor representing the batch size.
  • dtype – the data type to use for the state.
Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size, state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size, s] for each s in state_size.
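The structure-mirroring behavior described above can be sketched with plain Python lists standing in for tensors (an illustration of the contract, not the TensorFlow implementation):

```python
def zero_state(batch_size, state_size):
    """Build zero-filled state(s) mirroring the structure of state_size
    (pure-Python sketch; the real method returns tensors)."""
    if isinstance(state_size, int):
        return [[0.0] * state_size for _ in range(batch_size)]
    # nested list/tuple: recurse, preserving the structure
    nested = [zero_state(batch_size, s) for s in state_size]
    return tuple(nested) if isinstance(state_size, tuple) else nested

flat = zero_state(2, 3)          # a single [2, 3] block of zeros
nested = zero_state(2, (3, 5))   # a tuple of [2, 3] and [2, 5] blocks
```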

Module contents

class mdgru.model.crnn.CRNNCell(myshape, num_units, kw)[source]

Bases: tensorflow.python.ops.rnn_cell_impl.LayerRNNCell

Base convolutional RNN cell; implements common functions and serves as an abstract class.

Property defaults contains default values for all properties of a CRNNCell that are the same for one MDGRU and is used to filter valid arguments.

Parameters:
  • myshape (list) – Contains shape information on the input tensor.
  • num_units (int) – Defines number of output channels.
  • activation (tensorflow activation function) – Can be used to override tanh as activation function.
  • periodic_convolution_x (bool) – Enables circular convolution for the input
  • periodic_convolution_h (bool) – Enables circular convolution for the last output / state
  • dropconnectx (tensorflow placeholder or None) – keep rate of dropconnect regularization on weights connecting to the input
  • dropconnecth (tensorflow placeholder or None) – keep rate of dropconnect regularization on weights connecting to the previous state / output
  • use_bernoulli (bool) – Decides whether Bernoulli or Gaussian distributions should be used for the dropconnect weight distributions
_add_inbound_node(input_tensors, output_tensors, arguments=None)

Internal method to create an inbound node for the layer.

Parameters:
  • input_tensors – list of input tensors.
  • output_tensors – list of output tensors.
  • arguments – dictionary of keyword arguments that were passed to the call method of the layer at the call that created the node.
_add_variable_with_custom_getter(name, shape=None, dtype=tf.float32, initializer=None, getter=None, overwrite=False, **kwargs_for_getter)

Restore-on-create for a variable to be saved with this Checkpointable.

If the user has requested that this object or another Checkpointable which depends on this object be restored from a checkpoint (deferred loading before variable object creation), initializer may be ignored and the value from the checkpoint used instead.

Parameters:
  • name – A name for the variable. Must be unique within this object.
  • shape – The shape of the variable.
  • dtype – The data type of the variable.
  • initializer – The initializer to use. Ignored if there is a deferred restoration left over from a call to _restore_from_checkpoint_position.
  • getter – The getter to wrap which actually fetches the variable.
  • overwrite – If True, disables unique name and type checks.
  • **kwargs_for_getter – Passed to the getter.
Returns:

The new variable object.

Raises:

ValueError – If the variable name is not unique.

_assert_input_compatibility(inputs)

Checks compatibility between the layer and provided inputs.

This checks that the tensor(s) inputs verify the input assumptions of the layer (if any). If not, a clear and actionable exception gets raised.

Parameters:inputs – input tensor or list of input tensors.
Raises:ValueError – in case of mismatch between the provided inputs and the expectations of the layer.
_checkpoint_dependencies

All dependencies of this object.

May be overridden to include conditional dependencies.

Returns:A list of CheckpointableReference objects indicating named Checkpointable dependencies which should be saved along with this object.
_convlinear(args, output_size, bias, bias_start=0.0, scope=None, dropconnectx=None, dropconnecth=None, dropconnectxmatrix=None, dropconnecthmatrix=None, strides=None, orthogonal_init=True)[source]

Computes the convolution of current input and previous output or state (args[0] and args[1]).

The two tensors contained in args are convolved with their respective filters. Due to TensorFlow's RNN library, spatial dimensions are collapsed and have to be restored before convolution. Also, dropconnect matrices are applied to the weights. If specified, a bias is generated and returned as well.

Parameters:
  • args (2-Tuple of ndarrays) – Current input and last output
  • output_size (int) – Number of output channels (separate from myshapes[1][-1], as sometimes this value differs)
  • bias (bool) – Flag if bias should be used
  • bias_start (float) – Initial value for the bias
  • scope (str) – Override standard “ConvLinear” scope
  • dropconnectx – Flag if dropconnect should be applied on input weights
  • dropconnecth – Flag if dropconnect should be applied on state weights
  • dropconnectxmatrix – Dropconnect matrix for input weights
  • dropconnecthmatrix – Dropconnect matrix for state weights
  • strides – Strides to be applied to the input convolution
  • orthogonal_init – Flag if orthogonal initialization should be performed for the state weights
Returns:

2-tuple of results for state and input, 3-tuple additionally including a bias if requested

_convolution(data, convolution_filter, filter_shape=None, strides=None, is_circular_convolution=False)[source]

Convolves data and convolution_filter, using circular convolution if required.

_default_crnn_activation()[source]
_defaults = {'crnn_activation': <function tanh>, 'dropconnecth': None, 'dropconnectx': None, 'periodic_convolution_h': False, 'periodic_convolution_x': False, 'use_bernoulli': {'value': False, 'help': 'Use bernoulli or gaussian distribution for dropconnect'}}
_deferred_dependencies

A dictionary with deferred dependencies.

Stores restorations for other Checkpointable objects on which this object may eventually depend. May be overridden by sub-classes (e.g. Optimizers use conditional dependencies based on the current graph, and so need separate management of deferred dependencies too).

Returns:A dictionary mapping from local name to a list of _CheckpointPosition objects.
_gather_saveables_for_checkpoint()

Returns a dictionary of values to checkpoint with this object.

Keys in the returned dictionary are local to this object and in a separate namespace from dependencies. Values may either be SaveableObject factories or variables easily converted to SaveableObjects (as in tf.train.Saver's var_list constructor argument).

SaveableObjects have a name set, which Checkpointable needs to generate itself. So rather than returning SaveableObjects directly, this method should return a dictionary of callables which take name arguments and return SaveableObjects with that name.

If this object may also be passed to the global-name-based tf.train.Saver, the returned callables should have a default value for their name argument (i.e. be callable with no arguments).

Returned values must be saved only by this object; if any value may be shared, it should instead be a dependency. For example, variable objects save their own values with the key VARIABLE_VALUE_KEY, but objects which reference variables simply add a dependency.

Returns:The dictionary mapping attribute names to SaveableObject factories described above. For example: {VARIABLE_VALUE_KEY: lambda name="global_name_for_this_object": SaveableObject(name=name, …)}
_get_dropconnect(shape, keep_rate, name)[source]

Creates factors to be applied to filters to achieve either Bernoulli or Gaussian dropconnect.

_get_node_attribute_at_index(node_index, attr, attr_name)

Private utility to retrieve an attribute (e.g. inputs) from a node.

This is used to implement the methods:
  • get_input_shape_at
  • get_output_shape_at
  • get_input_at

etc…

Parameters:
  • node_index – Integer index of the node from which to retrieve the attribute.
  • attr – Exact node attribute name.
  • attr_name – Human-readable attribute name, for error messages.
Returns:

The layer’s attribute attr at the node of index node_index.

Raises:
  • RuntimeError – If the layer has no inbound nodes, or if called in Eager mode.
  • ValueError – If the index provided does not match any node.
_get_weights_h(filtershape, dtype, name, orthogonal_init=True)[source]

Return weights for output convolution.

_get_weights_x(filtershape, dtype, name)[source]

Return weights for input convolution.

_handle_deferred_dependencies(name, checkpointable)

Pop and load any deferred checkpoint restores into checkpointable.

This method does not add a new dependency on checkpointable, but it does check if any outstanding/deferred dependencies have been queued waiting for this dependency to be added (matched based on name). If so, checkpointable and its dependencies are restored. The restorations are considered fulfilled and so are deleted.

_track_checkpointable is more appropriate for adding a normal/unconditional dependency, and includes handling for deferred restorations. This method allows objects such as Optimizer to use the same restoration logic while managing conditional dependencies themselves, by overriding _checkpoint_dependencies and _lookup_dependency to change the object’s dependencies based on the context it is saved/restored in (a single optimizer instance can have state associated with multiple graphs).

Parameters:
  • name – The name of the dependency within this object (self), used to match checkpointable with values saved in a checkpoint.
  • checkpointable – The Checkpointable object to restore (inheriting from CheckpointableBase).
_init_set_name(name)
_lookup_dependency(name)

Look up a dependency by name.

May be overridden to include conditional dependencies.

Parameters:name – The local name of the dependency.
Returns:A Checkpointable object, or None if no dependency by this name was found.
_make_unique_name(name_uid_map=None, avoid_names=None, namespace='', zero_based=False)
_maybe_initialize_checkpointable()

Initialize dependency management.

Not __init__, since most objects will forget to call it.

_name_scope_name(current_variable_scope)

Determines op naming for the Layer.

_paddata(data, fshape)[source]

Pads spatial dimensions of data, such that a convolution of size fshape results in a circular convolution.
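As a rough 1-D illustration of the padding described above (the helper name is illustrative, not mdgru's implementation): wrap-padding the signal by half the filter length makes a subsequent "valid" sliding product behave like a circular convolution. Note that np.correlate computes correlation, which coincides with convolution for the symmetric filter used here.

```python
import numpy as np

def pad_circular_1d(data, flen):
    """Wrap-pad data so a 'valid' filtering of length flen acts circularly."""
    half = flen // 2
    return np.pad(data, half, mode="wrap")

x = np.array([1.0, 2.0, 3.0, 4.0])
f = np.array([0.25, 0.5, 0.25])          # symmetric averaging filter
padded = pad_circular_1d(x, len(f))      # [4, 1, 2, 3, 4, 1]
out = np.correlate(padded, f, mode="valid")
# Output has the same length as the input, with wrap-around at the edges.
assert np.allclose(out, [2.0, 2.0, 3.0, 3.0])
```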

_preload_simple_restoration(name, shape)

Return a dependency’s value for restore-on-create.

Note the restoration is not deleted; if for some reason preload is called and then not assigned to the variable (for example because a custom getter overrides the initializer), the assignment will still happen once the variable is tracked (determined based on checkpoint.restore_uid).

Parameters:
  • name – The object-local name of the dependency holding the variable’s value.
  • shape – The shape of the variable being loaded into.
Returns:

A callable for use as a variable’s initializer/initial_value, or None if one should not be set (either because there was no variable with this name in the checkpoint or because it needs more complex deserialization). Any non-trivial deserialization will happen when the variable object is tracked.

_restore_from_checkpoint_position(checkpoint_position)

Restore this object and its dependencies (may be deferred).

_rnn_get_variable(getter, *args, **kwargs)
_set_scope(scope=None)
_single_restoration_from_checkpoint_position(checkpoint_position, visit_queue)

Restore this object, and either queue its dependencies or defer them.

_tf_api_names = ('nn.rnn_cell.RNNCell',)
_track_checkpointable(checkpointable, name, overwrite=False)

Declare a dependency on another Checkpointable object.

Indicates that checkpoints for this object should include variables from checkpointable.

Variables in a checkpoint are mapped to Checkpointable objects based on the names provided when the checkpoint was written. To avoid breaking existing checkpoints when modifying a class, neither variable names nor dependency names (the names passed to _track_checkpointable) may change.

Parameters:
  • checkpointable – A Checkpointable which this object depends on.
  • name – A local name for checkpointable, used for loading checkpoints into the correct objects.
  • overwrite – Boolean, whether silently replacing dependencies is OK. Used for __setattr__, where throwing an error on attribute reassignment would be inappropriate.
Returns:

checkpointable, for convenience when declaring a dependency and assigning to a member variable in one statement.

Raises:
  • TypeError – If checkpointable does not inherit from Checkpointable.
  • ValueError – If another object is already tracked by this name.
activity_regularizer

Optional regularizer function for the output of this layer.

add_loss(losses, inputs=None)

Add loss tensor(s), potentially dependent on layer inputs.

Some losses (for instance, activity regularization losses) may be dependent on the inputs passed when calling a layer. Hence, when reusing the same layer on different inputs a and b, some entries in layer.losses may be dependent on a and some on b. This method automatically keeps track of dependencies.

The get_losses_for method allows retrieving the losses relevant to a specific set of inputs.

Note that add_loss is not supported when executing eagerly. Instead, variable regularizers may be added through add_variable. Activity regularization is not supported directly (but such losses may be returned from Layer.call()).

Parameters:
  • losses – Loss tensor, or list/tuple of tensors.
  • inputs – If anything other than None is passed, it signals the losses are conditional on some of the layer’s inputs, and thus they should only be run where these inputs are available. This is the case for activity regularization losses, for instance. If None is passed, the losses are assumed to be unconditional, and will apply across all dataflows of the layer (e.g. weight regularization losses).
Raises:

RuntimeError – If called in Eager mode.
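The conditional/unconditional split described above amounts to bookkeeping: unconditional losses apply everywhere, while conditional losses are tied to the inputs they were created for. A toy sketch of that behavior in plain Python (not the TensorFlow implementation, and the class name is illustrative):

```python
class LossTracker:
    """Toy sketch of add_loss / get_losses_for bookkeeping."""

    def __init__(self):
        self._unconditional = []      # e.g. weight regularization
        self._conditional = []        # (loss, inputs) pairs, e.g. activity reg.

    def add_loss(self, loss, inputs=None):
        if inputs is None:
            self._unconditional.append(loss)
        else:
            self._conditional.append((loss, inputs))

    def get_losses_for(self, inputs):
        if inputs is None:
            return list(self._unconditional)
        return [l for l, i in self._conditional if i is inputs]

a, b = object(), object()             # stand-ins for two input tensors
t = LossTracker()
t.add_loss(0.1)                       # unconditional
t.add_loss(0.2, inputs=a)             # conditional on input a
assert t.get_losses_for(a) == [0.2]
assert t.get_losses_for(None) == [0.1]
```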

add_update(updates, inputs=None)

Add update op(s), potentially dependent on layer inputs.

Weight updates (for instance, the updates of the moving mean and variance in a BatchNormalization layer) may be dependent on the inputs passed when calling a layer. Hence, when reusing the same layer on different inputs a and b, some entries in layer.updates may be dependent on a and some on b. This method automatically keeps track of dependencies.

The get_updates_for method allows retrieving the updates relevant to a specific set of inputs.

This call is ignored in Eager mode.

Parameters:
  • updates – Update op, or list/tuple of update ops.
  • inputs – If anything other than None is passed, it signals the updates are conditional on some of the layer’s inputs, and thus they should only be run where these inputs are available. This is the case for BatchNormalization updates, for instance. If None, the updates will be taken into account unconditionally, and you are responsible for making sure that any dependency they might have is available at runtime. A step counter might fall into this category.
add_variable(name, shape, dtype=None, initializer=None, regularizer=None, trainable=True, constraint=None, partitioner=None)

Adds a new variable to the layer, or gets an existing one; returns it.

Parameters:
  • name – variable name.
  • shape – variable shape.
  • dtype – The type of the variable. Defaults to self.dtype or float32.
  • initializer – initializer instance (callable).
  • regularizer – regularizer instance (callable).
  • trainable – whether the variable should be part of the layer’s “trainable_variables” (e.g. variables, biases) or “non_trainable_variables” (e.g. BatchNorm mean, stddev). Note, if the current variable scope is marked as non-trainable then this parameter is ignored and any added variables are also marked as non-trainable.
  • constraint – constraint instance (callable).
  • partitioner – (optional) partitioner instance (callable). If provided, when the requested variable is created it will be split into multiple partitions according to partitioner. In this case, an instance of PartitionedVariable is returned. Available partitioners include tf.fixed_size_partitioner and tf.variable_axis_size_partitioner. For more details, see the documentation of tf.get_variable and the “Variable Partitioners and Sharding” section of the API guide.
Returns:

The created variable. Usually either a Variable or ResourceVariable instance. If partitioner is not None, a PartitionedVariable instance is returned.

Raises:

RuntimeError – If called with partitioned variable regularization and eager execution is enabled.
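The "adds a new variable ... or gets an existing one" behavior is a get-or-create pattern. A toy sketch of those semantics (illustrative only, not the TensorFlow implementation; names are hypothetical):

```python
import numpy as np

class VarStore:
    """Toy get-or-create store mirroring add_variable semantics."""

    def __init__(self):
        self._vars = {}

    def add_variable(self, name, shape, initializer=None, trainable=True):
        if name not in self._vars:
            # Only initialize on first request; later calls reuse the value.
            init = initializer if initializer is not None else np.zeros
            self._vars[name] = {"value": init(shape), "trainable": trainable}
        return self._vars[name]["value"]

store = VarStore()
w1 = store.add_variable("kernel", (3, 3))
w2 = store.add_variable("kernel", (3, 3))   # second call returns the same array
assert w1 is w2
```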

apply(inputs, *args, **kwargs)

Apply the layer on an input.

This simply wraps self.__call__.

Parameters:
  • inputs – Input tensor(s).
  • *args – additional positional arguments to be passed to self.call.
  • **kwargs – additional keyword arguments to be passed to self.call.
Returns:

Output tensor(s).

build(_)
call(inputs, **kwargs)

The logic of the layer lives here.

Parameters:
  • inputs – input tensor(s).
  • **kwargs – additional keyword arguments.
Returns:

Output tensor(s).

compute_output_shape(input_shape)

Computes the output shape of the layer given the input shape.

Parameters:

input_shape – A (possibly nested tuple of) TensorShape. It need not be fully defined (e.g. the batch size may be unknown).

Returns:

A (possibly nested tuple of) TensorShape.

Raises:
  • TypeError – if input_shape is not a (possibly nested tuple of) TensorShape.
  • ValueError – if input_shape is incomplete or is incompatible with the layer.
count_params()

Count the total number of scalars composing the weights.

Returns:An integer count.
Raises:ValueError – if the layer isn’t yet built (in which case its weights aren’t yet defined).
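Counting the scalars composing the weights reduces to summing the products of the weight shapes. A minimal sketch (the shapes are hypothetical examples, not taken from CGRUCell):

```python
import numpy as np

def count_params(weight_shapes):
    """Total number of scalars across all weight arrays."""
    return int(sum(np.prod(s) for s in weight_shapes))

# e.g. a 3x3 kernel with 4 input / 8 output channels, plus a bias of 8:
# 3*3*4*8 + 8 = 296
assert count_params([(3, 3, 4, 8), (8,)]) == 296
```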
dtype
get_input_at(node_index)

Retrieves the input tensor(s) of a layer at a given node.

Parameters:node_index – Integer, index of the node from which to retrieve the attribute. E.g. node_index=0 will correspond to the first time the layer was called.
Returns:A tensor (or list of tensors if the layer has multiple inputs).
Raises:RuntimeError – If called in Eager mode.
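The node_index semantics above (one inbound node per call, so a layer shared across several inputs has several nodes) can be sketched as simple bookkeeping. This toy class only mimics the indexing, not the real layer machinery:

```python
class SharedLayer:
    """Toy layer recording one inbound node per call."""

    def __init__(self):
        self._inbound = []            # inputs of each call, in call order

    def __call__(self, x):
        self._inbound.append(x)       # one new node per call
        return x

    def get_input_at(self, node_index):
        return self._inbound[node_index]

layer = SharedLayer()
layer("a")                            # node 0: first time the layer was called
layer("b")                            # node 1: second call, on a different input
assert layer.get_input_at(0) == "a"
assert layer.get_input_at(1) == "b"
```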
get_input_shape_at(node_index)

Retrieves the input shape(s) of a layer at a given node.

Parameters:node_index – Integer, index of the node from which to retrieve the attribute. E.g. node_index=0 will correspond to the first time the layer was called.
Returns:A shape tuple (or list of shape tuples if the layer has multiple inputs).
Raises:RuntimeError – If called in Eager mode.
get_losses_for(inputs)

Retrieves losses relevant to a specific set of inputs.

Parameters:inputs – Input tensor or list/tuple of input tensors.
Returns:List of loss tensors of the layer that depend on inputs.
Raises:RuntimeError – If called in Eager mode.
get_output_at(node_index)

Retrieves the output tensor(s) of a layer at a given node.

Parameters:node_index – Integer, index of the node from which to retrieve the attribute. E.g. node_index=0 will correspond to the first time the layer was called.
Returns:A tensor (or list of tensors if the layer has multiple outputs).
Raises:RuntimeError – If called in Eager mode.
get_output_shape_at(node_index)

Retrieves the output shape(s) of a layer at a given node.

Parameters:node_index – Integer, index of the node from which to retrieve the attribute. E.g. node_index=0 will correspond to the first time the layer was called.
Returns:A shape tuple (or list of shape tuples if the layer has multiple outputs).
Raises:RuntimeError – If called in Eager mode.
get_updates_for(inputs)

Retrieves updates relevant to a specific set of inputs.

Parameters:inputs – Input tensor or list/tuple of input tensors.
Returns:List of update ops of the layer that depend on inputs.
Raises:RuntimeError – If called in Eager mode.
graph
inbound_nodes

Deprecated, do NOT use! Only for compatibility with external Keras.

input

Retrieves the input tensor(s) of a layer.

Only applicable if the layer has exactly one input, i.e. if it is connected to one incoming layer.

Returns:

Input tensor or list of input tensors.

Raises:
  • AttributeError – if the layer is connected to more than one incoming layer.
input_shape

Retrieves the input shape(s) of a layer.

Only applicable if the layer has exactly one input, i.e. if it is connected to one incoming layer, or if all inputs have the same shape.

Returns:

Input shape, as an integer shape tuple (or list of shape tuples, one tuple per input tensor).

Raises:AttributeError – if the layer has no defined input shape.
losses

Losses which are associated with this Layer.

Note that when executing eagerly, getting this property evaluates regularizers. When using graph execution, variable regularization ops have already been created and are simply returned here.

Returns:A list of tensors.
name
non_trainable_variables
non_trainable_weights
outbound_nodes

Deprecated, do NOT use! Only for compatibility with external Keras.

output

Retrieves the output tensor(s) of a layer.

Only applicable if the layer has exactly one output, i.e. if it is connected to one incoming layer.

Returns:

Output tensor or list of output tensors.

Raises:AttributeError – if the layer is connected to more than one incoming layer.
output_shape

Retrieves the output shape(s) of a layer.

Only applicable if the layer has one output, or if all outputs have the same shape.

Returns:

Output shape, as an integer shape tuple (or list of shape tuples, one tuple per output tensor).

Raises:AttributeError – if the layer has no defined output shape.
output_size
scope_name
state_size
trainable_variables
trainable_weights
updates
variables

Returns the list of all layer variables/weights.

Returns:A list of variables.
weights

Returns the list of all layer variables/weights.

Returns:A list of variables.
zero_state(batch_size, dtype)

Return zero-filled state tensor(s).

Parameters:
  • batch_size – int, float, or unit Tensor representing the batch size.
  • dtype – the data type to use for the state.
Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size, state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size, s] for each s in state_size.
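The shape logic above can be sketched with numpy, covering both the flat and the nested case (a sketch of the described behavior, not the TensorFlow implementation):

```python
import numpy as np

def zero_state(batch_size, state_size):
    """Build zero-filled state(s) matching a (possibly nested) state_size."""
    if isinstance(state_size, (list, tuple)):
        # Preserve the nesting structure, recursing into each element.
        return type(state_size)(zero_state(batch_size, s) for s in state_size)
    return np.zeros((batch_size, state_size))

# int state_size -> a single [batch_size, state_size] tensor of zeros
assert zero_state(4, 16).shape == (4, 16)
# nested state_size -> the same nesting of [batch_size, s] tensors
s = zero_state(4, (16, 32))
assert s[0].shape == (4, 16) and s[1].shape == (4, 32)
```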