Implementing MNIST Classification with a Multi-Layer RNN-LSTM Network, and Common Pitfalls

1 Introduction

Recurrent neural networks (RNNs) emerged in the 1980s; an early precursor is the Hopfield network, an interconnected network usable as associative memory proposed by the American physicist J.J. Hopfield in 1982. RNNs are a class of neural networks specialized for processing and predicting sequence data. The network structure is shown below:

Figure: RNN network structure

In 1997, Sepp Hochreiter and Jürgen Schmidhuber proposed Long Short-Term Memory (LSTM), which mitigates the long-term dependency problem. LSTMs are widely applied to text classification, speech recognition, machine translation, dialogue systems, image captioning, and similar tasks. The LSTM structure is shown below:

Figure: LSTM network structure

This post again uses the MNIST dataset for its experiments. For an introduction to MNIST and how to set it up, see the earlier post 使用TensorFlow实现MNIST数据集分类 (Implementing MNIST classification with TensorFlow).

The RNN reads each image row by row: at every time step it consumes one row of 28 pixels, giving 28 time steps (28 rows) per image. The output at the final time step aggregates the information from all earlier steps, so only that last output is used to classify the image. The data conversion looks like this:

Figure: data conversion format
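
To make the figure concrete, here is a minimal standalone sketch of the conversion (the variable names are illustrative, not from the code below): each flat 784-pixel image is reshaped into a 28-step sequence of 28-pixel rows.

import numpy as np

# A dummy batch of 50 flattened MNIST images, shape [batch_size, 784]
batch_images = np.zeros((50, 784), dtype=np.float32)
# [batch_size, 784] ==> [batch_size, time_size=28, input_size=28]:
# row t of each image becomes the input at time step t
sequence = batch_images.reshape(-1, 28, 28)
print(sequence.shape)  # (50, 28, 28)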

2 Single-layer RNN-LSTM network

The data flow is as follows:

Figure: single-layer RNN-LSTM data flow

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Load the dataset
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
# LSTM cell input dimension: one image row of 28 pixels per time step
input_size = 28
# Sequence length: 28 time steps, i.e. each prediction consumes 28 rows
time_size = 28
# Number of hidden units
hidden_size = 100
# 10 classes
class_num = 10
# 50 samples per batch
batch_size = 50
# Number of training batches per epoch
batch_num = mnist.train.num_examples // batch_size

x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
weights = tf.Variable(tf.truncated_normal([hidden_size, class_num], stddev=0.1))
biases = tf.Variable(tf.constant(0.1, shape=[class_num]))

# Define the RNN-LSTM network
def RNN_LSTM(x, weights, biases):
    # [batch_size, time_size*input_size] ==> [batch_size, time_size, input_size]
    inputs = tf.reshape(x, [-1, time_size, input_size])
    # Define the basic LSTM cell
    lstm_cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size, forget_bias=1.0, state_is_tuple=True)
    outputs, state = tf.nn.dynamic_rnn(lstm_cell, inputs, dtype=tf.float32, time_major=False)
    # Output projection from the last time step
    results = tf.matmul(outputs[:, -1, :], weights) + biases
    return results

y_ = RNN_LSTM(x, weights, biases)
# Cross-entropy loss
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=y_, labels=y))
# Optimize with the Adam optimizer
train = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# Variable initialization
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    test_feed = {x: mnist.test.images, y: mnist.test.labels}
    for epoch in range(6):
        # Training
        for batch in range(batch_num):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            sess.run(train, feed_dict={x: batch_x, y: batch_y})
        # Evaluation on the test set
        acc = sess.run(accuracy, feed_dict=test_feed)
        print("Iter " + str(epoch) + ", Testing Accuracy =", acc)
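
A note on the return values here (see the dynamic_rnn reference in Section 5): with time_major=False, outputs has shape [batch_size, time_size, hidden_size] and state holds the final LSTM state tuple, so outputs[:, -1, :] is exactly the last time step's hidden output that feeds the classification layer.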

Figure: single-layer RNN-LSTM run results

3 Multi-layer RNN-LSTM network

The data flow is as follows:

Figure: multi-layer RNN-LSTM data flow

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Load the dataset
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
# LSTM cell input dimension: one image row of 28 pixels per time step
input_size = 28
# Sequence length: 28 time steps, i.e. each prediction consumes 28 rows
time_size = 28
# Number of hidden units per layer
hidden_size = 100
# Number of LSTM layers
layer_num = 2
# 10 classes
class_num = 10
# 50 samples per batch
batch_size = 50
# Number of training batches per epoch
batch_num = mnist.train.num_examples // batch_size

x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
weights = {'in': tf.Variable(tf.truncated_normal([input_size, hidden_size], stddev=0.1)),
           'out': tf.Variable(tf.truncated_normal([hidden_size, class_num]))}
biases = {'in': tf.Variable(tf.constant(0.1, shape=[hidden_size])),
          'out': tf.Variable(tf.constant(0.1, shape=[class_num]))}

# Define the multi-layer RNN-LSTM network
def RNN_LSTM(x, weights, biases):
    # [batch_size, time_size*input_size] ==> [batch_size*time_size, input_size]
    x = tf.reshape(x, [-1, input_size])
    # Input projection: map each 28-dim row to hidden_size dims (see Section 4.1)
    inputs = tf.matmul(x, weights["in"]) + biases["in"]
    # [batch_size*time_size, hidden_size] ==> [batch_size, time_size, hidden_size]
    inputs = tf.reshape(inputs, [-1, time_size, hidden_size])
    # Define the basic LSTM cell
    lstm_cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size, forget_bias=1.0, state_is_tuple=True)
    # Stack layer_num LSTM layers (note: this reuses one cell object; see the caveat below)
    mlstm_cell = tf.contrib.rnn.MultiRNNCell([lstm_cell] * layer_num, state_is_tuple=True)
    outputs, state = tf.nn.dynamic_rnn(mlstm_cell, inputs, dtype=tf.float32, time_major=False)
    # Output projection from the last time step
    results = tf.matmul(outputs[:, -1, :], weights["out"]) + biases["out"]
    return results

y_ = RNN_LSTM(x, weights, biases)
# Cross-entropy loss
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=y_, labels=y))
# Optimize with the Adam optimizer
train = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# Variable initialization
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    test_feed = {x: mnist.test.images, y: mnist.test.labels}
    for epoch in range(6):
        # Training
        for batch in range(batch_num):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            sess.run(train, feed_dict={x: batch_x, y: batch_y})
        # Evaluation on the test set
        acc = sess.run(accuracy, feed_dict=test_feed)
        print("Iter " + str(epoch) + ", Testing Accuracy =", acc)
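
One caveat with the listing above: [lstm_cell] * layer_num passes the same BasicLSTMCell object for every layer, so the layers share a single set of gate variables, and newer TensorFlow 1.x releases reject this pattern with a variable-scope reuse error. If you hit that, a commonly used fix is to create a fresh cell instance per layer:

    mlstm_cell = tf.contrib.rnn.MultiRNNCell(
        [tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size, forget_bias=1.0, state_is_tuple=True)
         for _ in range(layer_num)],
        state_is_tuple=True)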

Figure: multi-layer RNN-LSTM run results

4 Common pitfalls

The single-layer RNN-LSTM network rarely goes wrong; the errors below mainly arise in the multi-layer RNN-LSTM network.

4.1 Skipping the input-layer dimension transform

Error message:

ValueError: Dimensions must be equal, but are 200 and 128 for 'rnn/while/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/MatMul_1' (op: 'MatMul') with input shapes: [?,200], [128,400].

Inside the LSTM there are forget, input, and output gates, and a cell's weights and biases are shared across all time steps; with [lstm_cell]*layer_num they are shared across the two layers as well. Without the input-layer dimension transform, the concatenated vector entering the first layer's gates is 28+100 = 128 dimensional (the input row joined with the hidden state), while the second layer's is 100+100 = 200 dimensional, so the [128, 400] gate matrix built for the first layer cannot multiply the second layer's [?, 200] input. Mapping each 28-dimensional row to 100 dimensions before the network makes both layers' inputs 200-dimensional.
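
As a standalone sketch of where those numbers come from (a hypothetical snippet, not part of the post's code): BasicLSTMCell allocates a single gate matrix of shape [input_dim + hidden_size, 4*hidden_size], i.e. [28+100, 4*100] = [128, 400] for the first layer, while the second layer's concatenated input is 100+100 = 200 wide.

import tensorflow as tf

hidden_size = 100
lstm_cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size)
# Raw 28-pixel rows, with no input projection to hidden_size
inputs = tf.placeholder(tf.float32, [None, 28, 28])
mlstm_cell = tf.contrib.rnn.MultiRNNCell([lstm_cell] * 2, state_is_tuple=True)
# Depending on the TF 1.x version, building this fails either with the shape
# error above ([?,200] vs [128,400]) or with a variable-scope reuse error
outputs, state = tf.nn.dynamic_rnn(mlstm_cell, inputs, dtype=tf.float32)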

4.2 Mismatched batch_size between training and prediction

Many blog posts and videos take the following line

outputs, state = tf.nn.dynamic_rnn(mlstm_cell, inputs, dtype=tf.float32, time_major=False)

and write it as:

# Initialize the state with zeros
init_state = mlstm_cell.zero_state(batch_size, dtype=tf.float32)
outputs, state = tf.nn.dynamic_rnn(mlstm_cell, inputs, initial_state=init_state, time_major=False)

This bakes batch_size into the RNN-LSTM graph. When the training batch_size and the prediction batch_size then disagree (a huge pitfall), it triggers the following error:

InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [10000,100] vs. shape[1] = [50,100]
[[node rnn/while/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/concat (defined at G:/Anaconda/Spyder/lstm.py:44) ]]

Here 10000 is the batch_size of the test set. If init_state is kept, there are two workarounds:

(1) Make the test-set batch_size match the training batch_size

# Evaluation, batched at the same batch_size as training
test_batch_num = mnist.test.num_examples // batch_size
total_acc = 0.0
for batch in range(test_batch_num):
    batch_x, batch_y = mnist.test.next_batch(batch_size)
    total_acc += sess.run(accuracy, feed_dict={x: batch_x, y: batch_y})
acc = total_acc / test_batch_num
print("Iter " + str(epoch) + ", Testing Accuracy =", acc)
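
Since MNIST's test set has 10,000 images and batch_size is 50, test_batch_num is exactly 200 and the averaged accuracy covers the whole test set once; if the test-set size were not evenly divisible, mnist.test.next_batch would wrap around and count some samples twice.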

(2) Define batch_size as a placeholder

.................
# 50 samples per training batch
train_batch_size = 50
# Number of training batches per epoch
batch_num = mnist.train.num_examples // train_batch_size
# batch_size becomes a scalar placeholder, fed at run time
batch_size = tf.placeholder(tf.int32, [])
.................
with tf.Session() as sess:
    sess.run(init)
    test_feed = {x: mnist.test.images, y: mnist.test.labels, batch_size: mnist.test.num_examples}
    for epoch in range(6):
        # Training
        for batch in range(batch_num):
            batch_x, batch_y = mnist.train.next_batch(train_batch_size)
            sess.run(train, feed_dict={x: batch_x, y: batch_y, batch_size: train_batch_size})
        # Evaluation on the full test set
        acc = sess.run(accuracy, feed_dict=test_feed)
        print("Iter " + str(epoch) + ", Testing Accuracy =", acc)
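
For completeness, the graph-side line this pairs with (a sketch, using mlstm_cell and inputs as defined in Section 3): zero_state accepts a scalar tensor just as well as a Python int, so the state's leading dimension is resolved at run time from whatever value is fed.

# batch_size is now a scalar int32 placeholder, not a Python constant
init_state = mlstm_cell.zero_state(batch_size, dtype=tf.float32)
outputs, state = tf.nn.dynamic_rnn(mlstm_cell, inputs, initial_state=init_state, time_major=False)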

5 References

tf.nn.dynamic_rnn返回值详解

TensorFlow入门(五)多层 LSTM 通俗易懂版

tensorflow使用多层RNN(lstm)预测手写数字实现部分细节及踩坑总结

LSTM的训练和测试长度(batch_size)不一样报错的解决方案
