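- # Imports and detailed layer arguments are omitted
- input = Input(shape=(3, 3, 1))  # model input layer
- x = Conv2D(...)(input)  # intermediate layer
- x = BatchNormalization(...)(x)  # intermediate layer
- output = LeakyReLU(...)(x)  # model output layer
- model = Model(input, output)  # build the model from input and output
In the example above we build a single-input, single-output model. We first create an input tensor input of shape (3, 3, 1), stack a small convolutional network on top of it to obtain the output tensor output, and finally call Model(input, output) to complete the model. At this point a natural question arises: how does Keras construct the network graph from nothing more than a call to Model()?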
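The InputSpec class specifies the expected rank (ndim), data type (dtype), shape, and other properties of the input tensor(s) of each layer in the network. When a layer is being built, its InputSpec is used to check that the layer's input tensor(s) are compatible.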
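To make the compatibility check concrete, here is a minimal sketch in plain Python. ToyInputSpec and check_input are hypothetical names invented for this illustration; they only imitate the idea of an input spec and are not the Keras implementation.

```python
class ToyInputSpec:
    """Toy stand-in for an input spec (hypothetical, for illustration only)."""

    def __init__(self, ndim=None, dtype=None, axes=None):
        self.ndim = ndim          # expected rank, including the batch axis
        self.dtype = dtype        # expected data type name
        self.axes = axes or {}    # {axis index: required size}


def check_input(spec, shape, dtype):
    """Raise ValueError if a tensor with this shape/dtype violates the spec."""
    if spec.ndim is not None and len(shape) != spec.ndim:
        raise ValueError('expected ndim=%d, found ndim=%d'
                         % (spec.ndim, len(shape)))
    if spec.dtype is not None and dtype != spec.dtype:
        raise ValueError('expected dtype=%s, found dtype=%s'
                         % (spec.dtype, dtype))
    for axis, size in spec.axes.items():
        # None means the size of that axis is unknown, so it cannot be checked.
        if shape[axis] is not None and shape[axis] != size:
            raise ValueError('expected axis %d to have size %d, found %s'
                             % (axis, size, shape[axis]))


# A 2D convolution expects rank-4 input with a fixed channel count.
spec = ToyInputSpec(ndim=4, axes={-1: 3})
check_input(spec, (None, 32, 32, 3), 'float32')   # compatible, no error

try:
    check_input(spec, (None, 32, 32), 'float32')  # rank 3 instead of 4
    rank_error = None
except ValueError as exc:
    rank_error = str(exc)
```

The second call fails because the tensor has rank 3 while the spec requires rank 4, which is the kind of mismatch such a check is meant to catch before any computation runs.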
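The Container class starts from the output tensor(s) of the model's final output layer(s) and, using the node information stored on each layer, recursively walks backwards to find every layer in the model and assemble the overall network graph (somewhat like recursively following the pointers of a linked list in C). The functional Model class mentioned above is a subclass of Container.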
1. __init__(): Defines custom layer attributes, and creates layer state variables that do not depend on input shapes, using add_weight().
2. build(self, input_shape): This method can be used to create weights that depend on the shape(s) of the input(s), using add_weight(). __call__() will automatically build the layer (if it has not been built yet) by calling build().
build() creates the weights that depend on the input shape, i.e. the weights of the layer. When a layer is created, its build() method creates these weights by calling add_weight(). In the base class Layer, build() is essentially empty and only marks the layer as built:
- class Layer(object):
-     """ Rest of the code omitted """
-     def build(self, input_shape):
-         """Creates the layer weights.
-         Must be implemented on all layers that have weights.
-         # Arguments
-             input_shape: Keras tensor (future input to layer)
-                 or list/tuple of Keras tensors to reference
-                 for weight shape computations.
-         """
-         self.built = True
Subclasses of Layer that have weights therefore need to implement build() themselves. Taking class _Conv(Layer) as an example, its build() method looks like this:
- class _Conv(Layer):
-     """ Rest of the code omitted """
-     def build(self, input_shape):
-         if self.data_format == 'channels_first':
-             channel_axis = 1
-         else:
-             channel_axis = -1
-         if input_shape[channel_axis] is None:
-             raise ValueError('The channel dimension of the inputs '
-                              'should be defined. Found `None`.')
-         input_dim = input_shape[channel_axis]
-         kernel_shape = self.kernel_size + (input_dim, self.filters)
-         self.kernel = self.add_weight(shape=kernel_shape,
-                                       initializer=self.kernel_initializer,
-                                       name='kernel',
-                                       regularizer=self.kernel_regularizer,
-                                       constraint=self.kernel_constraint)
-         if self.use_bias:
-             self.bias = self.add_weight(shape=(self.filters,),
-                                         initializer=self.bias_initializer,
-                                         name='bias',
-                                         regularizer=self.bias_regularizer,
-                                         constraint=self.bias_constraint)
-         else:
-             self.bias = None
-         # Set input spec.
-         self.input_spec = InputSpec(ndim=self.rank + 2,
-                                     axes={channel_axis: input_dim})
-         self.built = True
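
As a worked example of the shape arithmetic in build() above, the following standalone sketch reproduces only the kernel_shape computation. conv_kernel_shape is a hypothetical helper written for this illustration, and the concrete sizes are made-up values.

```python
def conv_kernel_shape(kernel_size, input_shape, filters,
                      data_format='channels_last'):
    """Mirror the kernel_shape computation from _Conv.build() (sketch only)."""
    channel_axis = 1 if data_format == 'channels_first' else -1
    input_dim = input_shape[channel_axis]
    if input_dim is None:
        raise ValueError('The channel dimension of the inputs '
                         'should be defined. Found `None`.')
    # Spatial kernel dimensions, then input channels, then output filters.
    return tuple(kernel_size) + (input_dim, filters)


# 3x3 kernel, RGB input (3 channels), 16 filters.
last = conv_kernel_shape((3, 3), (None, 32, 32, 3), filters=16)
first = conv_kernel_shape((3, 3), (None, 3, 32, 32), filters=16,
                          data_format='channels_first')
print(last)   # (3, 3, 3, 16)
print(first)  # (3, 3, 3, 16)
```

Both data formats yield the same kernel shape because only the position of the channel axis in the input differs. This is also why the weights cannot be created in __init__(): the channel count is only known once an input shape is available.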

3. call(self, *args, **kwargs): Called in __call__ after making sure build() has been called. call() performs the logic of applying the layer to the input tensors (which should be passed in as argument). Two reserved keyword arguments you can optionally use in call() are: training (boolean, whether the call is in inference mode or training mode) and mask (boolean tensor encoding masked timesteps in the input, used in RNN layers).
Inside __call__, call() is invoked after build() has run; it processes the layer's input tensors and returns the corresponding output tensors. As with build(), the base class Layer provides no real implementation of call() and simply returns its inputs:
- class Layer(object):
-     """ Rest of the code omitted """
-     def call(self, inputs, **kwargs):
-         """This is where the layer's logic lives.
-         # Arguments
-             inputs: Input tensor, or list/tuple of input tensors.
-             **kwargs: Additional keyword arguments.
-         # Returns
-             A tensor or list/tuple of tensors.
-         """
-         return inputs
Subclasses of Layer therefore have to provide the actual implementation. Again taking class _Conv(Layer) as an example, its call() method looks like this:
- class _Conv(Layer):
-     """ Rest of the code omitted """
-     def call(self, inputs):
-         if self.rank == 1:
-             """ This part of the code is omitted """
-         if self.rank == 2:
-             outputs = K.conv2d(
-                 inputs,
-                 self.kernel,
-                 strides=self.strides,
-                 padding=self.padding,
-                 data_format=self.data_format,
-                 dilation_rate=self.dilation_rate)
-         if self.rank == 3:
-             """ This part of the code is omitted """
-         if self.use_bias:
-             outputs = K.bias_add(
-                 outputs,
-                 self.bias,
-                 data_format=self.data_format)
-         if self.activation is not None:
-             return self.activation(outputs)
-         return outputs
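
The interplay between __call__(), build() and call() described above can be sketched without Keras. ToyLayer and Scale below are hypothetical classes, and plain Python lists stand in for tensors; the sketch only illustrates the "build lazily on first call, then run call()" pattern and is not the real Layer code.

```python
class ToyLayer:
    """Toy base layer: builds itself lazily on the first call (sketch only)."""

    def __init__(self):
        self.built = False
        self.weights = {}

    def add_weight(self, name, shape, value=0.0):
        # The real add_weight creates a backend variable; here it is a flat list.
        size = 1
        for dim in shape:
            size *= dim
        self.weights[name] = [value] * size
        return self.weights[name]

    def build(self, input_shape):
        self.built = True

    def call(self, inputs):
        return inputs

    def __call__(self, inputs):
        if not self.built:                    # build only once
            self.build((len(inputs),))
        return self.call(inputs)


class Scale(ToyLayer):
    """Multiplies each input element by a per-element weight."""

    def __init__(self):
        super().__init__()
        self.build_calls = 0

    def build(self, input_shape):
        self.scale = self.add_weight('scale', input_shape, value=2.0)
        self.build_calls += 1
        self.built = True

    def call(self, inputs):
        return [w * x for w, x in zip(self.scale, inputs)]


layer = Scale()
out1 = layer([1.0, 2.0, 3.0])   # triggers build(), then call()
out2 = layer([4.0, 5.0, 6.0])   # already built, only call() runs
print(out1, out2, layer.build_calls)
```

The weight is created during the first call, when the input size is known, and reused afterwards, which is why build_calls stays at 1.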

4. get_config(self): Returns a dictionary containing the configuration used to initialize this layer. If the keys differ from the arguments in __init__, then override from_config(self) as well. This method is used when saving the layer or a model that contains this layer.
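A minimal sketch of the get_config()/from_config() round trip, using a hypothetical ToyDense class rather than a real Keras layer. It assumes the config keys match the __init__ arguments exactly, so from_config can simply forward the dictionary to the constructor.

```python
class ToyDense:
    """Hypothetical layer used only to illustrate the config round trip."""

    def __init__(self, units, use_bias=True, name='dense'):
        self.units = units
        self.use_bias = use_bias
        self.name = name

    def get_config(self):
        # Everything needed to re-create the layer, but not its weights.
        return {'units': self.units,
                'use_bias': self.use_bias,
                'name': self.name}

    @classmethod
    def from_config(cls, config):
        # Works because the config keys match the __init__ arguments.
        return cls(**config)


original = ToyDense(8, use_bias=False, name='fc1')
config = original.get_config()
clone = ToyDense.from_config(config)
print(config)
```

The clone has the same hyperparameters as the original but is a distinct object. The configuration describes how to construct the layer, not the state it has accumulated.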
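The network graph of the model shown in the figure consists of seven layers, Layer A through Layer G. We assume that Layer F is a concatenate layer rather than a shared layer; models containing shared layers are not discussed here. Each layer has one Node bound to it. The input layer of the network is Layer A (created with the InputLayer() described above) and the output layer is Layer G. IN1 is the input tensor of Layer A, and OUT1 is the output tensor of Layer A, which is also the input tensor of Layer B and Layer C (IN2 and IN3). The same pattern applies to the other layers.
A node takes no part in the computation; it is only a bridge that records the relationships between layers and tensors. We use node6 of Layer F to show how a Node links Layer E, Layer D, and Layer G. The other nodes work the same way.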
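First, you may have noticed that among these attributes only outbound_layer is singular, while all the others are plural. This is because outbound_layer is the layer that turns the input tensors into the output tensors. As the figure shows, the layer that converts IN6 into OUT6 is Layer F itself, and a node has exactly one such layer.
inbound_layers holds the inbound layers of node6, and there can be several of them. In the figure, node6 has two inbound layers, Layer E and Layer D, so inbound_layers = [Layer E, Layer D].
node_indices has one entry per inbound layer: node_indices[i] is the index, within the _inbound_nodes of inbound_layers[i], of the node that produced the i-th input tensor. A layer has two or more nodes only when it is shared. Since we only consider non-shared layers here, every layer has exactly one node, so every entry of node_indices is 0.
tensor_indices likewise has one entry per inbound layer: tensor_indices[i] is the index of the i-th input tensor within the output_tensors of the node that produced it, which matters for layers with multiple outputs. We assume that every layer in the figure has a single output, so every entry of node6's tensor_indices is 0.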
- class Node(object):
-     def __init__(self, outbound_layer,
-                  inbound_layers, node_indices, tensor_indices,
-                  input_tensors, output_tensors,
-                  input_masks, output_masks,
-                  input_shapes, output_shapes,
-                  arguments=None):
-         """ Part of the code omitted here """
-         # Add nodes to all layers involved.
-         for layer in inbound_layers:
-             if layer is not None:
-                 layer._outbound_nodes.append(self)
-         outbound_layer._inbound_nodes.append(self)
-         """ Part of the code omitted here """
-         """
-         Each time a layer is connected to some new input,
-         a node is added to `layer._inbound_nodes`.
-         Each time the output of a layer is used by another layer,
-         a node is added to `layer._outbound_nodes`.
-         """
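If the explanation above still feels abstract, take Layer F in the figure as an example. The outputs of two layers are connected to the input of Layer F, so Layer F's _inbound_nodes = [Node6]. Note that this is not [Node4, Node5]: a layer's inbound nodes (_inbound_nodes) must not be confused with a node's inbound layers (inbound_layers). Layer F's output OUT6 is used by Layer G as its input, so Layer F's _outbound_nodes = [Node7].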
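The bookkeeping for Layer F can be reproduced with a few lines of plain Python. ToyLayer and ToyNode are hypothetical stand-ins that keep only the two append operations from Node.__init__, and only the part of the figure around Layer F is wired up.

```python
class ToyLayer:
    def __init__(self, name):
        self.name = name
        self._inbound_nodes = []    # nodes whose output is produced by this layer
        self._outbound_nodes = []   # nodes that consume this layer's output


class ToyNode:
    def __init__(self, outbound_layer, inbound_layers):
        self.outbound_layer = outbound_layer
        self.inbound_layers = inbound_layers
        # Same registration logic as in Node.__init__ above.
        for layer in inbound_layers:
            if layer is not None:
                layer._outbound_nodes.append(self)
        outbound_layer._inbound_nodes.append(self)


layer_d, layer_e, layer_f, layer_g = (ToyLayer(n) for n in 'DEFG')

# node6: Layer F consumes the outputs of Layer E and Layer D.
node6 = ToyNode(outbound_layer=layer_f, inbound_layers=[layer_e, layer_d])
# node7: Layer G consumes the output of Layer F.
node7 = ToyNode(outbound_layer=layer_g, inbound_layers=[layer_f])

print([n is node6 for n in layer_f._inbound_nodes])    # [True]
print([n is node7 for n in layer_f._outbound_nodes])   # [True]
```

Layer F ends up with a single inbound node (node6) even though that node has two inbound layers, and node6 is also registered as an outbound node of both Layer E and Layer D.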
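The three parameters are easiest to explain with an example: for OUT6 of Layer F in the figure, inbound_layer is Layer F, node_index is 0, and tensor_index is 0. In short, the _keras_history attribute of a tensor records where that tensor came from.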
input_tensors[i] = inbound_layers[i]._inbound_nodes[node_indices[i]].output_tensors[tensor_indices[i]]
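The relation above can be checked on a small toy graph. ToyTensor, ToyLayer and ToyNode are hypothetical classes that imitate only the _keras_history bookkeeping, and the wiring (A feeding D and E, which both feed F) is a simplified assumption rather than the exact figure.

```python
class ToyTensor:
    def __init__(self, name):
        self.name = name
        self._keras_history = None   # (layer, node_index, tensor_index)


class ToyNode:
    def __init__(self, outbound_layer, input_tensors, output_tensors):
        self.outbound_layer = outbound_layer
        self.input_tensors = input_tensors
        self.output_tensors = output_tensors
        # Derive the per-input bookkeeping from each tensor's history.
        histories = [t._keras_history for t in input_tensors]
        self.inbound_layers = [h[0] for h in histories]
        self.node_indices = [h[1] for h in histories]
        self.tensor_indices = [h[2] for h in histories]
        outbound_layer._inbound_nodes.append(self)


class ToyLayer:
    def __init__(self, name):
        self.name = name
        self._inbound_nodes = []

    def __call__(self, *inputs):
        output = ToyTensor('OUT_' + self.name)
        ToyNode(self, list(inputs), [output])
        # The node just created is the last entry of _inbound_nodes.
        output._keras_history = (self, len(self._inbound_nodes) - 1, 0)
        return output


layer_a, layer_d, layer_e, layer_f = (ToyLayer(n) for n in 'ADEF')
out_a = layer_a()                  # input layer: no inbound tensors
out_d = layer_d(out_a)
out_e = layer_e(out_a)
out_f = layer_f(out_e, out_d)      # F merges E and D

node6 = layer_f._inbound_nodes[0]
recovered = [
    node6.inbound_layers[i]
         ._inbound_nodes[node6.node_indices[i]]
         .output_tensors[node6.tensor_indices[i]]
    for i in range(len(node6.inbound_layers))
]
print([t.name for t in recovered])   # ['OUT_E', 'OUT_D']
```

Each recovered tensor is the very same object as the corresponding entry of node6.input_tensors, which is what allows the graph to be walked backwards from any tensor.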
- class Container(Layer):
-     """ Part of the code omitted """
-     def __init__(...):
-         ......
-         def build_map_of_graph(tensor, finished_nodes, nodes_in_progress,
-                                layer=None, node_index=None, tensor_index=None):
-             """Builds a map of the graph of layers.
-             This recursively updates the map `layer_indices`,
-             the list `nodes_in_decreasing_depth` and the set `container_nodes`.
-             # Arguments
-                 tensor: Some tensor in a graph.
-                 finished_nodes: Set of nodes whose subgraphs have been traversed
-                     completely. Useful to prevent duplicated work.
-                 nodes_in_progress: Set of nodes that are currently active on the
-                     recursion stack. Useful to detect cycles.
-                 layer: Layer from which `tensor` comes from. If not provided,
-                     will be obtained from `tensor._keras_history`.
-                 node_index: Node index from which `tensor` comes from.
-                 tensor_index: Tensor_index from which `tensor` comes from.
-             # Raises
-                 RuntimeError: if a cycle is detected.
-             """
-             if not layer or node_index is None or tensor_index is None:
-                 layer, node_index, tensor_index = tensor._keras_history
-             node = layer._inbound_nodes[node_index]
-             # Prevent cycles.
-             if node in nodes_in_progress:
-                 raise RuntimeError(
-                     'The tensor ' + str(tensor) + ' at layer "' +
-                     layer.name + '" is part of a cycle.')
-             # Don't repeat work for shared subgraphs
-             if node in finished_nodes:
-                 return
-             # Update container_nodes.
-             container_nodes.add(self._node_key(layer, node_index))
-             # Store the traversal order for layer sorting.
-             if layer not in layer_indices:
-                 layer_indices[layer] = len(layer_indices)
-             nodes_in_progress.add(node)
-             # Propagate to all previous tensors connected to this node.
-             for i in range(len(node.inbound_layers)):
-                 x = node.input_tensors[i]
-                 layer = node.inbound_layers[i]
-                 node_index = node.node_indices[i]
-                 tensor_index = node.tensor_indices[i]
-                 build_map_of_graph(x, finished_nodes, nodes_in_progress,
-                                    layer, node_index, tensor_index)
-             finished_nodes.add(node)
-             nodes_in_progress.remove(node)
-             nodes_in_decreasing_depth.append(node)
-         finished_nodes = set()
-         nodes_in_progress = set()
-         for x in self.outputs:
-             build_map_of_graph(x, finished_nodes, nodes_in_progress)
-         ......
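
The recursion in build_map_of_graph is a depth-first, post-order walk from the outputs back to the inputs, with one set to skip finished subgraphs and another to detect cycles. The sketch below keeps only that skeleton. ToyNode, its parents attribute, and collect_nodes are hypothetical, and the seven-node wiring is an assumed approximation of the figure.

```python
class ToyNode:
    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)   # nodes that produce this node's inputs


def collect_nodes(output_nodes):
    """Walk back from the outputs; return nodes in post-order (inputs first)."""
    ordered = []
    finished = set()       # subgraph fully traversed, no need to revisit
    in_progress = set()    # currently on the recursion stack

    def visit(node):
        if node in in_progress:
            raise RuntimeError('Node "%s" is part of a cycle.' % node.name)
        if node in finished:
            return
        in_progress.add(node)
        for parent in node.parents:
            visit(parent)
        finished.add(node)
        in_progress.remove(node)
        ordered.append(node)

    for node in output_nodes:
        visit(node)
    return ordered


a = ToyNode('A')
b = ToyNode('B', [a])
c = ToyNode('C', [a])
d = ToyNode('D', [b])
e = ToyNode('E', [c])
f = ToyNode('F', [e, d])
g = ToyNode('G', [f])

order = [n.name for n in collect_nodes([g])]
print(order)   # ['A', 'C', 'E', 'B', 'D', 'F', 'G']
```

Node A is reached twice (through C and through B) but is recorded only once thanks to the finished set. A node is appended only after everything it depends on has been appended.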

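That covers how a functional model is built. Now take a look at the flow diagram of a sequential model, Sequential(Model): doesn't the structure feel much simpler?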
