
Passing output of 3DCNN layer to LSTM layer



Problem background:

While trying to learn recurrent neural networks (RNNs), I am training an automatic lip-reading model using a 3DCNN + LSTM. I tried out code for this that I found on Kaggle.


  from tensorflow.keras.models import Sequential
  from tensorflow.keras.layers import (Conv3D, MaxPooling3D, Reshape,
                                       LSTM, Dropout, Flatten, Dense)

  model = Sequential()
  # 1st layer group
  model.add(Conv3D(32, (3, 3, 3), strides=1, input_shape=(22, 100, 100, 1), activation='relu', padding='valid'))
  model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
  model.add(Conv3D(64, (3, 3, 3), activation='relu', strides=1))
  model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
  model.add(Conv3D(128, (3, 3, 3), activation='relu', strides=1))
  model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
  shape = model.get_output_shape_at(0)
  model.add(Reshape((shape[-1], shape[1]*shape[2]*shape[3])))
  # LSTMs - recurrent network layer
  model.add(LSTM(32, return_sequences=True))
  model.add(Dropout(.5))
  model.add(Flatten())
  # FC layers group
  model.add(Dense(2048, activation='relu'))
  model.add(Dropout(.5))
  model.add(Dense(1024, activation='relu'))
  model.add(Dropout(.5))
  model.add(Dense(10, activation='softmax'))
  model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])
  model.summary()

However, it returns the following error:

       11 model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
       12
  ---> 13 shape = model.get_output_shape_at(0)
       14 model.add(Reshape((shape[-1],shape[1]*shape[2]*shape[3])))
       15
  RuntimeError: The layer sequential_2 has never been called and thus has no defined output shape.

From my understanding, the author of the code was trying to get the output shape of the last convolutional block and reshape it so it could be forwarded to the LSTM layer. I found a similar post, following which I made the change below, and the error was fixed.


  shape = model.layers[-1].output_shape
  # shape = model.get_output_shape_at(0)

Still, I am confused about what the code does to forward the input from the CNN layers to the LSTM layer. Any help in understanding the above is appreciated. Thank you!


Solution:

When you build the model from top to bottom, the inputs flow through the graph from top to bottom. You are getting this error because this function cannot be called in eager mode, and TensorFlow 2.0 runs fully in eager mode: only after you fit the model and train it for at least one epoch can you use model.get_output_shape_at(0). Before that, use model.layers[-1].output_shape instead.


The CNN layers extract features locally, and the LSTM then learns from those features sequentially. Combining Conv layers with an LSTM is a good approach, but I would recommend directly using tf.keras.layers.ConvLSTM3D. Check it here: https://www.tensorflow.org/api_docs/python/tf/keras/layers/ConvLSTM3D


  import tensorflow as tf
  from tensorflow.keras.models import Sequential
  from tensorflow.keras.layers import (Conv3D, MaxPooling3D, Reshape,
                                       LSTM, Dropout, Flatten, Dense)

  tf.keras.backend.clear_session()
  model = Sequential()
  # 1st layer group
  model.add(Conv3D(32, (3, 3, 3), strides=1, input_shape=(22, 100, 100, 1), activation='relu', padding='valid'))
  model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
  model.add(Conv3D(64, (3, 3, 3), activation='relu', strides=1))
  model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
  model.add(Conv3D(128, (3, 3, 3), activation='relu', strides=1))
  model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
  shape = model.layers[-1].output_shape
  model.add(Reshape((shape[-1], shape[1]*shape[2]*shape[3])))
  # LSTMs - recurrent network layer
  model.add(LSTM(32, return_sequences=True))
  model.add(Dropout(.5))
  model.add(Flatten())
  # FC layers group
  model.add(Dense(2048, activation='relu'))
  model.add(Dropout(.5))
  model.add(Dense(1024, activation='relu'))
  model.add(Dropout(.5))
  model.add(Dense(10, activation='softmax'))
  model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])
  model.summary()
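To make concrete what the Reshape actually hands to the LSTM, the layer shapes can be traced by hand. The sketch below is plain Python (no TensorFlow needed); it assumes 'valid' padding for every Conv3D and floor division for every MaxPooling3D, matching the model above:

```python
def conv3d_valid(spatial, k=3):
    # 'valid' Conv3D with a k x k x k kernel, stride 1: each dim shrinks by k - 1
    return tuple(d - k + 1 for d in spatial)

def pool3d(spatial, p=2):
    # MaxPooling3D with pool size and stride p: floor division per dim
    return tuple(d // p for d in spatial)

spatial = (22, 100, 100)                 # (frames, height, width); channels tracked separately
spatial = pool3d(conv3d_valid(spatial))  # Conv3D(32)  + pool -> (10, 49, 49)
spatial = pool3d(conv3d_valid(spatial))  # Conv3D(64)  + pool -> (4, 23, 23)
spatial = pool3d(conv3d_valid(spatial))  # Conv3D(128) + pool -> (1, 10, 10)
channels = 128

# model.layers[-1].output_shape == (None, 1, 10, 10, 128), so
# Reshape((shape[-1], shape[1]*shape[2]*shape[3])) targets (128, 100):
timesteps = channels                              # the LSTM sees 128 "timesteps"
features = spatial[0] * spatial[1] * spatial[2]   # each with 1*10*10 = 100 features

print(spatial, timesteps, features)  # (1, 10, 10) 128 100
```

Note that this Reshape treats the 128 channel maps as the LSTM's time axis and the flattened spatial grid as its feature axis; if you wanted the LSTM to step over the remaining frame dimension instead, those two axes would have to be swapped.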
