
Neural Network Implementation in Code (Tested on a Handwritten Digit Recognition Dataset)

Contents

I. Introduction

II. Network Architecture

III. Implementation

        1. Imports

        2. The class

        3. Training function

        4. Weight matrix initialization

        5. Unrolling the matrices into a vector

        6. Rolling the vector back into matrices

        7. Gradient descent

                7.1 Cost function

                        7.1.1 Forward propagation

                7.2 Backpropagation

        8. Prediction function

IV. Full Code

V. Handwritten Digit Recognition

I. Introduction

        Readers should already know the basics of neural networks; for background, see the earlier article Neural Networks (deep learning, computer vision, score function, loss function, forward propagation, backpropagation, activation functions).

        This article walks through the logic and code of a from-scratch neural network implementation, then tests it on handwritten digit recognition; the results are broadly what one expects from such a network.

II. Network Architecture

        Things to think through first:

        1. The input data: feature values (for handwritten digit recognition, the pixels: 784 features)

        2. The shapes of the weight matrices W1, W2, W3

        3. Forward propagation

        4. The activation function (sigmoid here)

        5. Backpropagation

        6. Bias terms

        7. The loss, $\hat{y}-y$

        8. How much W1, W2, W3 each contribute to the loss, given by:

$$
\begin{aligned}
\delta^{(4)} &= a^{(4)} - y\\
\delta^{(3)} &= (\Theta^{(3)})^{T}\,\delta^{(4)} \ast g'(z^{(3)})\\
\delta^{(2)} &= (\Theta^{(2)})^{T}\,\delta^{(3)} \ast g'(z^{(2)})
\end{aligned}
$$

        There is no $\delta^{(1)}$: that is the input layer, which we cannot change. Here $g'$ is the sigmoid gradient,

$$
g'(z) = \frac{\partial}{\partial z}\,g(z) = g(z)\,\big(1 - g(z)\big), \quad \text{where } g(z) = \frac{1}{1 + e^{-z}}
$$

        The algorithm flow (simplified version):

$$
\begin{aligned}
&\text{for } i = 1 \text{ to } m:\\
&\quad \text{set } a^{(1)} = x^{(i)}\\
&\quad \text{perform forward propagation to compute } a^{(l)} \text{ for } l = 2, 3, \dots, L\\
&\quad \text{using } y^{(i)}, \text{ compute } \delta^{(L)} = a^{(L)} - y^{(i)}\\
&\quad \text{compute } \delta^{(L-1)}, \delta^{(L-2)}, \dots, \delta^{(2)}\\
&\quad \Delta_{ij}^{(l)} := \Delta_{ij}^{(l)} + a_{j}^{(l)}\,\delta_{i}^{(l+1)} \quad \big(\text{vectorized: } \Delta^{(l)} := \Delta^{(l)} + \delta^{(l+1)}\,(a^{(l)})^{T}\big)
\end{aligned}
$$
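        To make the matrix shapes concrete before writing any class code, here is a minimal sketch (my own illustration, using the [784, 25, 10] layout adopted below; the +1 on each input count is the bias term):

import numpy as np

layers = [784, 25, 10]
# One weight matrix per layer transition; the extra column holds the bias term
thetas = [np.zeros((layers[i + 1], layers[i] + 1)) for i in range(len(layers) - 1)]
print([theta.shape for theta in thetas])  # [(25, 785), (10, 26)]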

         

III. Implementation

         1. Imports

import numpy as np
from Neural_Network_Lab.utils.features import prepare_for_training
from Neural_Network_Lab.utils.hypothesis import sigmoid, sigmoid_gradient

           The utils package wraps the data preprocessing helpers and the sigmoid function:

  1. """Add polynomial features to the features set"""
  2. import numpy as np
  3. from .normalize import normalize
  4. def generate_polynomials(dataset, polynomial_degree, normalize_data=False):
  5. """变换方法:
  6. x1, x2, x1^2, x2^2, x1*x2, x1*x2^2, etc.
  7. """
  8. features_split = np.array_split(dataset, 2, axis=1)
  9. dataset_1 = features_split[0]
  10. dataset_2 = features_split[1]
  11. (num_examples_1, num_features_1) = dataset_1.shape
  12. (num_examples_2, num_features_2) = dataset_2.shape
  13. if num_examples_1 != num_examples_2:
  14. raise ValueError('Can not generate polynomials for two sets with different number of rows')
  15. if num_features_1 == 0 and num_features_2 == 0:
  16. raise ValueError('Can not generate polynomials for two sets with no columns')
  17. if num_features_1 == 0:
  18. dataset_1 = dataset_2
  19. elif num_features_2 == 0:
  20. dataset_2 = dataset_1
  21. num_features = num_features_1 if num_features_1 < num_examples_2 else num_features_2
  22. dataset_1 = dataset_1[:, :num_features]
  23. dataset_2 = dataset_2[:, :num_features]
  24. polynomials = np.empty((num_examples_1, 0))
  25. for i in range(1, polynomial_degree + 1):
  26. for j in range(i + 1):
  27. polynomial_feature = (dataset_1 ** (i - j)) * (dataset_2 ** j)
  28. polynomials = np.concatenate((polynomials, polynomial_feature), axis=1)
  29. if normalize_data:
  30. polynomials = normalize(polynomials)[0]
  31. return polynomials
import numpy as np


def generate_sinusoids(dataset, sinusoid_degree):
    """sin(x)."""
    num_examples = dataset.shape[0]
    sinusoids = np.empty((num_examples, 0))

    for degree in range(1, sinusoid_degree + 1):
        sinusoid_features = np.sin(degree * dataset)
        sinusoids = np.concatenate((sinusoids, sinusoid_features), axis=1)

    return sinusoids

 

  1. """Normalize features"""
  2. import numpy as np
  3. def normalize(features):
  4. features_normalized = np.copy(features).astype(float)
  5. # 计算均值
  6. features_mean = np.mean(features, 0)
  7. # 计算标准差
  8. features_deviation = np.std(features, 0)
  9. # 标准化操作
  10. if features.shape[0] > 1:
  11. features_normalized -= features_mean
  12. # 防止除以0
  13. features_deviation[features_deviation == 0] = 1
  14. features_normalized /= features_deviation
  15. return features_normalized, features_mean, features_deviation

          Data preprocessing:

  1. """Prepares the dataset for training"""
  2. import numpy as np
  3. from .normalize import normalize
  4. from .generate_sinusoids import generate_sinusoids
  5. from .generate_polynomials import generate_polynomials
  6. def prepare_for_training(data, polynomial_degree=0, sinusoid_degree=0, normalize_data=True):
  7. # 计算样本总数
  8. num_examples = data.shape[0]
  9. data_processed = np.copy(data)
  10. # 预处理
  11. features_mean = 0
  12. features_deviation = 0
  13. data_normalized = data_processed
  14. if normalize_data:
  15. (
  16. data_normalized,
  17. features_mean,
  18. features_deviation
  19. ) = normalize(data_processed)
  20. data_processed = data_normalized
  21. # 特征变换sinusoidal
  22. if sinusoid_degree > 0:
  23. sinusoids = generate_sinusoids(data_normalized, sinusoid_degree)
  24. data_processed = np.concatenate((data_processed, sinusoids), axis=1)
  25. # 特征变换polynomial
  26. if polynomial_degree > 0:
  27. polynomials = generate_polynomials(data_normalized, polynomial_degree, normalize_data)
  28. data_processed = np.concatenate((data_processed, polynomials), axis=1)
  29. # 加一列1
  30. data_processed = np.hstack((np.ones((num_examples, 1)), data_processed))
  31. return data_processed, features_mean, features_deviation
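        As a quick sanity check (my own example, not part of the original code), a 3x2 input gains a leading bias column of ones after preprocessing:

import numpy as np
from Neural_Network_Lab.utils.features import prepare_for_training

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
X_prepared, mean, deviation = prepare_for_training(X)
print(X_prepared.shape)  # (3, 3): a bias column plus the 2 normalized features
print(X_prepared[:, 0])  # [1. 1. 1.]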

        The sigmoid function:

import numpy as np


def sigmoid(matrix):
    """Applies sigmoid function to NumPy matrix"""
    return 1 / (1 + np.exp(-matrix))
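        The imports above also reference sigmoid_gradient, which the original listing does not show. From the formula $g'(z) = g(z)(1-g(z))$ in section II, a matching implementation would be (a sketch; it assumes sigmoid is in scope, e.g. defined in the same module):

# Assumes sigmoid (defined above) is in scope or imported from the same package.
def sigmoid_gradient(matrix):
    """Derivative of the sigmoid: g'(z) = g(z) * (1 - g(z))."""
    return sigmoid(matrix) * (1 - sigmoid(matrix))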

         2. The class

        The multilayer perceptron is initialized with the data, the labels, the network layout (a list such as [784, 25, 10], meaning 784 input neurons, 25 hidden neurons, and 10 output neurons), and a flag controlling whether the data is normalized.

class MultilayerPerceptron:
    def __init__(self, data, labels, layers, normalize_data=False):
        data_processed = prepare_for_training(data, normalize_data=normalize_data)[0]
        self.data = data_processed
        self.labels = labels
        self.layers = layers  # e.g. [784, 25, 10]
        self.normalize_data = normalize_data
        self.thetas = MultilayerPerceptron.thetas_init(layers)

         3. Training function

        Given a number of iterations and a learning rate, run gradient descent to update the weight matrices, returning the final weights and the loss history. Matrices are awkward to update directly, so we flatten them into a single vector first.

def train(self, max_iterations=1000, alpha=0.1):
    # Flatten the weight matrices into one vector so they are easy to update
    unrolled_theta = MultilayerPerceptron.thetas_unroll(self.thetas)
    (optimized_theta, cost_history) = MultilayerPerceptron.gradient_descent(
        self.data, self.labels, unrolled_theta, self.layers, max_iterations, alpha)
    self.thetas = MultilayerPerceptron.thetas_roll(optimized_theta, self.layers)
    return self.thetas, cost_history

        4. Weight matrix initialization

        The network layout determines each matrix's shape; we store the matrices in a dict.

@staticmethod
def thetas_init(layers):
    num_layers = len(layers)
    thetas = {}  # dict: key = layer index, value = weight matrix
    for layer_index in range(num_layers - 1):
        # For [784, 25, 10] this runs twice, giving matrices of 25x785 and 10x26
        in_count = layers[layer_index]
        out_count = layers[layer_index + 1]
        # Small random initial values; the +1 column accounts for the bias term,
        # one bias per output neuron
        thetas[layer_index] = np.random.rand(out_count, in_count + 1) * 0.05
    return thetas

        5. Unrolling the matrices into a vector

        Flatten the weight matrices and concatenate them into one vector.

@staticmethod
def thetas_unroll(thetas):
    # Concatenate all weight matrices into a single vector
    num_theta_layers = len(thetas)
    unrolled_theta = np.array([])
    for theta_layer_index in range(num_theta_layers):
        unrolled_theta = np.hstack((unrolled_theta, thetas[theta_layer_index].flatten()))
    return unrolled_theta

         6. Rolling the vector back into matrices

        Forward propagation needs matrix multiplications, so we convert the vector back into matrices.

@staticmethod
def thetas_roll(unrolled_theta, layers):
    num_layers = len(layers)
    thetas = {}
    unrolled_shift = 0
    for layer_index in range(num_layers - 1):
        in_count = layers[layer_index]
        out_count = layers[layer_index + 1]

        thetas_width = in_count + 1
        thetas_height = out_count
        thetas_volume = thetas_width * thetas_height

        start_index = unrolled_shift
        end_index = unrolled_shift + thetas_volume
        layer_theta_unrolled = unrolled_theta[start_index:end_index]
        thetas[layer_index] = layer_theta_unrolled.reshape((thetas_height, thetas_width))
        unrolled_shift = unrolled_shift + thetas_volume
    return thetas
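        A quick way to convince yourself that thetas_unroll and thetas_roll are inverses of each other (my own check, not part of the original code):

import numpy as np

layers = [784, 25, 10]
thetas = MultilayerPerceptron.thetas_init(layers)
roundtrip = MultilayerPerceptron.thetas_roll(MultilayerPerceptron.thetas_unroll(thetas), layers)
print(all(np.array_equal(thetas[i], roundtrip[i]) for i in thetas))  # True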

         7. Gradient descent

                1. Use the cost function to compute the loss

                2. Compute the gradients

                3. Update the parameters

                So the first thing we need is the cost function.

                7.1 Cost function

                        Computing the loss means running one forward pass, so forward propagation comes first.

                        7.1.1 Forward propagation

@staticmethod
def feedforward_propagation(data, thetas, layers):
    num_layers = len(layers)
    num_examples = data.shape[0]
    in_layer_activation = data  # input layer

    # Compute layer by layer
    for layer_index in range(num_layers - 1):
        theta = thetas[layer_index]
        out_layer_activation = sigmoid(np.dot(in_layer_activation, theta.T))
        # The raw result is num_examples x 25; prepend the bias column
        # so it becomes num_examples x 26
        out_layer_activation = np.hstack((np.ones((num_examples, 1)), out_layer_activation))
        in_layer_activation = out_layer_activation

    # Return the output layer without the bias column
    return in_layer_activation[:, 1:]

                The cost function:

@staticmethod
def cost_function(data, labels, thetas, layers):
    num_layers = len(layers)
    num_examples = data.shape[0]
    num_labels = layers[-1]

    # One forward pass
    predictions = MultilayerPerceptron.feedforward_propagation(data, thetas, layers)
    # One-hot encode each example's label
    bitwise_labels = np.zeros((num_examples, num_labels))
    for example_index in range(num_examples):
        bitwise_labels[example_index][labels[example_index][0]] = 1
    # Predictions are probabilities, e.g. y = 7 -> [0,0,0,0,0,0,0,1,0,0]:
    # we want a high probability at the correct position and low everywhere else
    bit_set_cost = np.sum(np.log(predictions[bitwise_labels == 1]))
    bit_not_set_cost = np.sum(np.log(1 - predictions[bitwise_labels == 0]))
    cost = (-1 / num_examples) * (bit_set_cost + bit_not_set_cost)
    return cost
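                In formula form, this is the standard (unregularized) cross-entropy cost over $m$ examples and $K$ output classes:

$$
J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\left[y_{k}^{(i)}\log\hat{y}_{k}^{(i)} + \big(1-y_{k}^{(i)}\big)\log\big(1-\hat{y}_{k}^{(i)}\big)\right]
$$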

                7.2 Backpropagation

                To update the weight matrices during gradient descent we need backpropagation; applying the formulas from section II gives the gradients.

@staticmethod
def back_propagation(data, labels, thetas, layers):
    num_layers = len(layers)
    (num_examples, num_features) = data.shape
    num_label_types = layers[-1]

    deltas = {}  # accumulated gradient for each layer
    # Initialization: 25x785 and 10x26
    for layer_index in range(num_layers - 1):
        in_count = layers[layer_index]
        out_count = layers[layer_index + 1]
        deltas[layer_index] = np.zeros((out_count, in_count + 1))

    for example_index in range(num_examples):
        layers_inputs = {}
        layers_activations = {}
        layers_activation = data[example_index, :].reshape((num_features, 1))
        layers_activations[0] = layers_activation
        # Forward pass, layer by layer
        for layer_index in range(num_layers - 1):
            layer_theta = thetas[layer_index]  # current weights: 25x785, then 10x26
            layer_input = np.dot(layer_theta, layers_activation)  # 25x1, then 10x1
            # Apply the activation and prepend the bias unit
            layers_activation = np.vstack((np.array([[1]]), sigmoid(layer_input)))
            layers_inputs[layer_index + 1] = layer_input              # next layer's pre-activation
            layers_activations[layer_index + 1] = layers_activation  # next layer's activation
        output_layer_activation = layers_activation[1:, :]

        delta = {}
        # One-hot label
        bitwise_label = np.zeros((num_label_types, 1))
        bitwise_label[labels[example_index][0]] = 1
        # Error at the output layer: prediction minus ground truth
        delta[num_layers - 1] = output_layer_activation - bitwise_label

        # Walk backwards through L-1, L-2, ..., 2
        for layer_index in range(num_layers - 2, 0, -1):
            layer_theta = thetas[layer_index]
            next_delta = delta[layer_index + 1]
            layer_input = layers_inputs[layer_index]
            layer_input = np.vstack((np.array([[1]]), layer_input))
            # Backpropagation formula (g' is the sigmoid gradient)
            delta[layer_index] = np.dot(layer_theta.T, next_delta) * sigmoid_gradient(layer_input)
            # Drop the bias component
            delta[layer_index] = delta[layer_index][1:, :]

        # Accumulate the gradients
        for layer_index in range(num_layers - 1):
            layer_delta = np.dot(delta[layer_index + 1], layers_activations[layer_index].T)
            deltas[layer_index] = deltas[layer_index] + layer_delta  # 25x785, then 10x26

    # Average over all examples
    for layer_index in range(num_layers - 1):
        deltas[layer_index] = deltas[layer_index] * (1 / num_examples)
    return deltas
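        To verify the analytic gradients, here is a minimal numerical gradient check (my own sketch, not part of the original code; it assumes the complete MultilayerPerceptron class from section IV and a tiny random dataset whose first column is the bias):

import numpy as np

layers = [4, 3, 2]
data = np.hstack((np.ones((5, 1)), np.random.rand(5, 4)))  # 5 examples: bias column + 4 features
labels = np.random.randint(0, 2, (5, 1))
unrolled = MultilayerPerceptron.thetas_unroll(MultilayerPerceptron.thetas_init(layers))

# Analytic gradient from backpropagation
analytic = MultilayerPerceptron.thetas_unroll(MultilayerPerceptron.back_propagation(
    data, labels, MultilayerPerceptron.thetas_roll(unrolled, layers), layers))

# Numerical gradient from central differences
epsilon = 1e-4
numeric = np.zeros_like(unrolled)
for i in range(len(unrolled)):
    shift = np.zeros_like(unrolled)
    shift[i] = epsilon
    cost_plus = MultilayerPerceptron.cost_function(
        data, labels, MultilayerPerceptron.thetas_roll(unrolled + shift, layers), layers)
    cost_minus = MultilayerPerceptron.cost_function(
        data, labels, MultilayerPerceptron.thetas_roll(unrolled - shift, layers), layers)
    numeric[i] = (cost_plus - cost_minus) / (2 * epsilon)

print(np.max(np.abs(analytic - numeric)))  # should be tiny, on the order of 1e-8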

        Performing one gradient step:

@staticmethod
def gradient_step(data, labels, optimized_theta, layers):
    theta = MultilayerPerceptron.thetas_roll(optimized_theta, layers)
    # Backpropagation
    thetas_rolled_gradients = MultilayerPerceptron.back_propagation(data, labels, theta, layers)
    thetas_unrolled_gradients = MultilayerPerceptron.thetas_unroll(thetas_rolled_gradients)
    return thetas_unrolled_gradients

         The full gradient descent loop:

@staticmethod
def gradient_descent(data, labels, unrolled_theta, layers, max_iterations, alpha):
    # 1. compute the loss  2. compute the gradients  3. update the parameters
    optimized_theta = unrolled_theta  # current theta vector
    cost_history = []                 # loss history
    for i in range(max_iterations):
        if i % 10 == 0:
            print("Current iteration:", i)
        cost = MultilayerPerceptron.cost_function(
            data, labels, MultilayerPerceptron.thetas_roll(optimized_theta, layers), layers)
        cost_history.append(cost)
        theta_gradient = MultilayerPerceptron.gradient_step(data, labels, optimized_theta, layers)
        optimized_theta = optimized_theta - alpha * theta_gradient
    return optimized_theta, cost_history

        8. Prediction function

        Feed the test data through one forward pass to get the predictions.

def predict(self, data):
    data_processed = prepare_for_training(data, normalize_data=self.normalize_data)[0]
    num_examples = data_processed.shape[0]
    predictions = MultilayerPerceptron.feedforward_propagation(data_processed, self.thetas, self.layers)
    return np.argmax(predictions, axis=1).reshape((num_examples, 1))

IV. Full Code

import numpy as np
from Neural_Network_Lab.utils.features import prepare_for_training
from Neural_Network_Lab.utils.hypothesis import sigmoid, sigmoid_gradient


class MultilayerPerceptron:
    def __init__(self, data, labels, layers, normalize_data=False):
        data_processed = prepare_for_training(data, normalize_data=normalize_data)[0]
        self.data = data_processed
        self.labels = labels
        self.layers = layers  # e.g. [784, 25, 10]
        self.normalize_data = normalize_data
        self.thetas = MultilayerPerceptron.thetas_init(layers)

    def predict(self, data):
        data_processed = prepare_for_training(data, normalize_data=self.normalize_data)[0]
        num_examples = data_processed.shape[0]
        predictions = MultilayerPerceptron.feedforward_propagation(data_processed, self.thetas, self.layers)
        return np.argmax(predictions, axis=1).reshape((num_examples, 1))

    def train(self, max_iterations=1000, alpha=0.1):
        # Flatten the weight matrices into one vector so they are easy to update
        unrolled_theta = MultilayerPerceptron.thetas_unroll(self.thetas)
        (optimized_theta, cost_history) = MultilayerPerceptron.gradient_descent(
            self.data, self.labels, unrolled_theta, self.layers, max_iterations, alpha)
        self.thetas = MultilayerPerceptron.thetas_roll(optimized_theta, self.layers)
        return self.thetas, cost_history

    @staticmethod
    def gradient_descent(data, labels, unrolled_theta, layers, max_iterations, alpha):
        # 1. compute the loss  2. compute the gradients  3. update the parameters
        optimized_theta = unrolled_theta  # current theta vector
        cost_history = []                 # loss history
        for i in range(max_iterations):
            if i % 10 == 0:
                print("Current iteration:", i)
            cost = MultilayerPerceptron.cost_function(
                data, labels, MultilayerPerceptron.thetas_roll(optimized_theta, layers), layers)
            cost_history.append(cost)
            theta_gradient = MultilayerPerceptron.gradient_step(data, labels, optimized_theta, layers)
            optimized_theta = optimized_theta - alpha * theta_gradient
        return optimized_theta, cost_history

    @staticmethod
    def gradient_step(data, labels, optimized_theta, layers):
        theta = MultilayerPerceptron.thetas_roll(optimized_theta, layers)
        # Backpropagation
        thetas_rolled_gradients = MultilayerPerceptron.back_propagation(data, labels, theta, layers)
        thetas_unrolled_gradients = MultilayerPerceptron.thetas_unroll(thetas_rolled_gradients)
        return thetas_unrolled_gradients

    @staticmethod
    def back_propagation(data, labels, thetas, layers):
        num_layers = len(layers)
        (num_examples, num_features) = data.shape
        num_label_types = layers[-1]

        deltas = {}  # accumulated gradient for each layer
        # Initialization: 25x785 and 10x26
        for layer_index in range(num_layers - 1):
            in_count = layers[layer_index]
            out_count = layers[layer_index + 1]
            deltas[layer_index] = np.zeros((out_count, in_count + 1))

        for example_index in range(num_examples):
            layers_inputs = {}
            layers_activations = {}
            layers_activation = data[example_index, :].reshape((num_features, 1))
            layers_activations[0] = layers_activation
            # Forward pass, layer by layer
            for layer_index in range(num_layers - 1):
                layer_theta = thetas[layer_index]  # current weights: 25x785, then 10x26
                layer_input = np.dot(layer_theta, layers_activation)  # 25x1, then 10x1
                # Apply the activation and prepend the bias unit
                layers_activation = np.vstack((np.array([[1]]), sigmoid(layer_input)))
                layers_inputs[layer_index + 1] = layer_input              # next layer's pre-activation
                layers_activations[layer_index + 1] = layers_activation  # next layer's activation
            output_layer_activation = layers_activation[1:, :]

            delta = {}
            # One-hot label
            bitwise_label = np.zeros((num_label_types, 1))
            bitwise_label[labels[example_index][0]] = 1
            # Error at the output layer: prediction minus ground truth
            delta[num_layers - 1] = output_layer_activation - bitwise_label

            # Walk backwards through L-1, L-2, ..., 2
            for layer_index in range(num_layers - 2, 0, -1):
                layer_theta = thetas[layer_index]
                next_delta = delta[layer_index + 1]
                layer_input = layers_inputs[layer_index]
                layer_input = np.vstack((np.array([[1]]), layer_input))
                # Backpropagation formula (g' is the sigmoid gradient)
                delta[layer_index] = np.dot(layer_theta.T, next_delta) * sigmoid_gradient(layer_input)
                # Drop the bias component
                delta[layer_index] = delta[layer_index][1:, :]

            # Accumulate the gradients
            for layer_index in range(num_layers - 1):
                layer_delta = np.dot(delta[layer_index + 1], layers_activations[layer_index].T)
                deltas[layer_index] = deltas[layer_index] + layer_delta  # 25x785, then 10x26

        # Average over all examples
        for layer_index in range(num_layers - 1):
            deltas[layer_index] = deltas[layer_index] * (1 / num_examples)
        return deltas

    @staticmethod
    def cost_function(data, labels, thetas, layers):
        num_layers = len(layers)
        num_examples = data.shape[0]
        num_labels = layers[-1]

        # One forward pass
        predictions = MultilayerPerceptron.feedforward_propagation(data, thetas, layers)
        # One-hot encode each example's label
        bitwise_labels = np.zeros((num_examples, num_labels))
        for example_index in range(num_examples):
            bitwise_labels[example_index][labels[example_index][0]] = 1
        # Predictions are probabilities: high at the correct position, low elsewhere
        bit_set_cost = np.sum(np.log(predictions[bitwise_labels == 1]))
        bit_not_set_cost = np.sum(np.log(1 - predictions[bitwise_labels == 0]))
        cost = (-1 / num_examples) * (bit_set_cost + bit_not_set_cost)
        return cost

    @staticmethod
    def feedforward_propagation(data, thetas, layers):
        num_layers = len(layers)
        num_examples = data.shape[0]
        in_layer_activation = data  # input layer
        # Compute layer by layer
        for layer_index in range(num_layers - 1):
            theta = thetas[layer_index]
            out_layer_activation = sigmoid(np.dot(in_layer_activation, theta.T))
            # Prepend the bias column: num_examples x 25 -> num_examples x 26
            out_layer_activation = np.hstack((np.ones((num_examples, 1)), out_layer_activation))
            in_layer_activation = out_layer_activation
        # Return the output layer without the bias column
        return in_layer_activation[:, 1:]

    @staticmethod
    def thetas_roll(unrolled_theta, layers):
        num_layers = len(layers)
        thetas = {}
        unrolled_shift = 0
        for layer_index in range(num_layers - 1):
            in_count = layers[layer_index]
            out_count = layers[layer_index + 1]

            thetas_width = in_count + 1
            thetas_height = out_count
            thetas_volume = thetas_width * thetas_height

            start_index = unrolled_shift
            end_index = unrolled_shift + thetas_volume
            layer_theta_unrolled = unrolled_theta[start_index:end_index]
            thetas[layer_index] = layer_theta_unrolled.reshape((thetas_height, thetas_width))
            unrolled_shift = unrolled_shift + thetas_volume
        return thetas

    @staticmethod
    def thetas_unroll(thetas):
        # Concatenate all weight matrices into a single vector
        num_theta_layers = len(thetas)
        unrolled_theta = np.array([])
        for theta_layer_index in range(num_theta_layers):
            unrolled_theta = np.hstack((unrolled_theta, thetas[theta_layer_index].flatten()))
        return unrolled_theta

    @staticmethod
    def thetas_init(layers):
        num_layers = len(layers)
        thetas = {}  # dict: key = layer index, value = weight matrix
        for layer_index in range(num_layers - 1):
            # For [784, 25, 10] this runs twice: matrices of 25x785 and 10x26
            in_count = layers[layer_index]
            out_count = layers[layer_index + 1]
            # Small random initial values; the +1 column is the bias term
            thetas[layer_index] = np.random.rand(out_count, in_count + 1) * 0.05
        return thetas

V. Handwritten Digit Recognition

        The dataset (readers can track down the download themselves; I won't post a link >_<) contains ten thousand samples. The first column is the label; each remaining column is a pixel value, 28*28 = 784 pixels in total.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import math

from Neural_Network_Lab.Multilayer_Perceptron import MultilayerPerceptron

data = pd.read_csv('../Neural_Network_Lab/data/mnist-demo.csv')

# Display some samples
numbers_to_display = 25
num_cells = math.ceil(math.sqrt(numbers_to_display))
plt.figure(figsize=(10, 10))
for plot_index in range(numbers_to_display):
    digit = data[plot_index:plot_index + 1].values
    digit_label = digit[0][0]
    digit_pixels = digit[0][1:]
    image_size = int(math.sqrt(digit_pixels.shape[0]))
    frame = digit_pixels.reshape((image_size, image_size))
    plt.subplot(num_cells, num_cells, plot_index + 1)
    plt.imshow(frame, cmap='Greys')
    plt.title(digit_label)
plt.subplots_adjust(wspace=0.5, hspace=0.5)
plt.show()

# 80/20 train/test split
train_data = data.sample(frac=0.8)
test_data = data.drop(train_data.index)
train_data = train_data.values
test_data = test_data.values

num_training_examples = 8000
X_train = train_data[:num_training_examples, 1:]
y_train = train_data[:num_training_examples, [0]]
X_test = test_data[:, 1:]
y_test = test_data[:, [0]]

layers = [784, 25, 10]
normalize_data = True
max_iterations = 500
alpha = 0.1

multilayerperceptron = MultilayerPerceptron(X_train, y_train, layers, normalize_data)
(thetas, cost_history) = multilayerperceptron.train(max_iterations, alpha)

plt.plot(range(len(cost_history)), cost_history)
plt.xlabel('Gradient steps')
plt.ylabel('cost')
plt.show()

y_train_predictions = multilayerperceptron.predict(X_train)
y_test_predictions = multilayerperceptron.predict(X_test)

train_p = np.sum(y_train_predictions == y_train) / y_train.shape[0] * 100
test_p = np.sum(y_test_predictions == y_test) / y_test.shape[0] * 100
print("Training set accuracy:", train_p)
print("Test set accuracy:", test_p)

# Visualize test predictions: green = correct, red = wrong
numbers_to_display = 64
num_cells = math.ceil(math.sqrt(numbers_to_display))
plt.figure(figsize=(15, 15))
for plot_index in range(numbers_to_display):
    digit_label = y_test[plot_index, 0]
    digit_pixels = X_test[plot_index, :]
    predicted_label = y_test_predictions[plot_index][0]
    image_size = int(math.sqrt(digit_pixels.shape[0]))
    frame = digit_pixels.reshape((image_size, image_size))
    plt.subplot(num_cells, num_cells, plot_index + 1)
    color_map = 'Greens' if predicted_label == digit_label else 'Reds'
    plt.imshow(frame, cmap=color_map)
    plt.title(predicted_label)
    plt.tick_params(axis='both', which='both', bottom=False, left=False, labelbottom=False)
plt.subplots_adjust(wspace=0.5, hspace=0.5)
plt.show()

         With 8000 training samples, 2000 test samples, and 500 iterations:

         The accuracy here is not high; readers can tune the setup themselves, e.g. change the iteration count or the network layout.
