
Machine Learning IV: Handwritten Digit Recognition with a Neural Network (MATLAB)

Contents

1. Overview
   1.1 Parameters
2. Loading and Visualizing Data
   2.1 Loading the data
   2.2 The displayData function
3. Loading Parameters
4. Computing the Cost via Feedforward
   4.1 sigmoid.m
   4.2 sigmoidGradient.m
   4.3 nnCostFunction.m
   4.4 Code walkthrough and notes
       Part 1: Feedforward
       Part 2: Computing gradients and updating Theta
       Part 3: Regularization
5. Testing the Regularized Cost Function
6. Testing the Sigmoid Gradient
7. Initializing Parameters
   7.1 randInitializeWeights.m
8. Checking Backpropagation with Regularization
   8.1 checkNNGradients.m
   8.2 computeNumericalGradient.m
9. Training the Model
10. Visualizing the Weights
11. Prediction
   11.1 predict.m


1. Overview

This article gives a module-by-module introduction to handwritten digit recognition with a neural network. For the Exercise 4 project itself, the following files need to be completed; each is described in some detail below for reference.

nnCostFunction

sigmoidGradient

randInitializeWeights

A neural network is essentially built by stitching together several logistic regression models, so you should be comfortable with logistic regression before studying neural networks. The relationship is easiest to see in a figure (shown in the original post): a simplified network model is visibly just a combination of the five logistic regression models drawn above it.
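
To make the same point with a formula (my notation, not taken from the original figure): each hidden unit computes exactly what a single logistic regression model computes,

    a_j^{(2)} = g\bigl(\theta_j^{T} x\bigr), \qquad g(z) = \frac{1}{1 + e^{-z}},

so a layer is simply a stack of logistic regression units that share the same input.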

1.1 Parameters

Parameter            Meaning                          Value
input_layer_size     number of input-layer units      400
hidden_layer_size    number of hidden-layer units     25
num_labels           number of output-layer units     10
m                    number of training examples      5000

2. Loading and Visualizing Data

2.1 Loading the data

%% Machine Learning Online Class - Exercise 4 Neural Network Learning
%     sigmoidGradient.m
%     randInitializeWeights.m
%     nnCostFunction.m

%% Initialization
clear ; close all; clc

%% Setup the parameters you will use for this exercise
input_layer_size  = 400;   % 20x20 Input Images of Digits
hidden_layer_size = 25;    % 25 hidden units
num_labels = 10;           % 10 labels, from 1 to 10
                           % (note that we have mapped "0" to label 10)

%% =========== Part 1: Loading and Visualizing Data =============
%  We start the exercise by first loading and visualizing the dataset.
%  You will be working with a dataset that contains handwritten digits.
%

% Load Training Data
fprintf('Loading and Visualizing Data ...\n')

load('ex4data1.mat');
m = size(X, 1);   % number of training examples (rows of X)

% Randomly select 100 data points to display
sel = randperm(size(X, 1));
sel = sel(1:100);

displayData(X(sel, :));

fprintf('Program paused. Press enter to continue.\n');
pause;

2.2 The displayData function

The displayData.m helper produces the data visualization shown in the exercise: it lays a set of examples (rows of a 2-D array) out on a grid and displays them as one image. The effect is easiest to judge from the resulting figure; interested readers can study the details below.

function [h, display_array] = displayData(X, example_width)
%DISPLAYDATA Display 2D data in a nice grid
%   [h, display_array] = DISPLAYDATA(X, example_width) displays 2D data
%   stored in X in a nice grid. It returns the figure handle h and the
%   displayed array if requested.

% Set example_width automatically if not passed in
if ~exist('example_width', 'var') || isempty(example_width)
    example_width = round(sqrt(size(X, 2)));
end

% Gray Image
colormap(gray);

% Compute rows, cols
[m, n] = size(X);
example_height = (n / example_width);

% Compute number of items to display
display_rows = floor(sqrt(m));
display_cols = ceil(m / display_rows);

% Between images padding
pad = 1;

% Setup blank display
display_array = - ones(pad + display_rows * (example_height + pad), ...
                       pad + display_cols * (example_width + pad));

% Copy each example into a patch on the display array
curr_ex = 1;
for j = 1:display_rows
    for i = 1:display_cols
        if curr_ex > m
            break;
        end
        % Copy the patch
        % Get the max value of the patch
        max_val = max(abs(X(curr_ex, :)));
        display_array(pad + (j - 1) * (example_height + pad) + (1:example_height), ...
                      pad + (i - 1) * (example_width + pad) + (1:example_width)) = ...
            reshape(X(curr_ex, :), example_height, example_width) / max_val;
        curr_ex = curr_ex + 1;
    end
    if curr_ex > m
        break;
    end
end

% Display Image
h = imagesc(display_array, [-1 1]);

% Do not show axis
axis image off

drawnow;
end

3. Loading Parameters

This exercise uses a three-layer network, so two weight matrices, Theta1 and Theta2, are needed. Because the layers have different numbers of units, the two matrices have different sizes; they are therefore "unrolled" into a single column vector nn_params, which is convenient to store and pass around and can later be restored with reshape.

Matrix   Meaning                                        Size
Theta1   weights between the input and hidden layers    hidden_layer_size × (input_layer_size + 1)
Theta2   weights between the hidden and output layers   num_labels × (hidden_layer_size + 1)

Note: the "+1" in both sizes comes from the bias unit added to the input layer and to the hidden layer, respectively.

%% ================ Part 2: Loading Parameters ================
% In this part of the exercise, we load some pre-initialized
% neural network parameters.

fprintf('\nLoading Saved Neural Network Parameters ...\n')

% Load the weights into variables Theta1 and Theta2
load('ex4weights.mat');

% Unroll parameters
nn_params = [Theta1(:) ; Theta2(:)];
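
A quick sanity check (a minimal sketch, not part of the exercise scripts; it assumes Theta1, Theta2 and the layer sizes above are already in the workspace) shows that unrolling with (:) and restoring with reshape are exact inverses, since both follow MATLAB's column-major order:

% Minimal sketch: verify the unroll / reshape round trip.
T1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
             hidden_layer_size, (input_layer_size + 1));
T2 = reshape(nn_params((1 + hidden_layer_size * (input_layer_size + 1)):end), ...
             num_labels, (hidden_layer_size + 1));
fprintf('Round-trip error: %g\n', ...
        max(abs([T1(:) - Theta1(:); T2(:) - Theta2(:)])));   % prints 0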

4. Computing the Cost via Feedforward

%% ================ Part 3: Compute Cost (Feedforward) ================
%  To the neural network, you should first start by implementing the
%  feedforward part of the neural network that returns the cost only. You
%  should complete the code in nnCostFunction.m to return cost. After
%  implementing the feedforward to compute the cost, you can verify that
%  your implementation is correct by verifying that you get the same cost
%  as us for the fixed debugging parameters.
%
%  We suggest implementing the feedforward cost *without* regularization
%  first so that it will be easier for you to debug. Later, in part 4, you
%  will get to implement the regularized cost.
%
fprintf('\nFeedforward Using Neural Network ...\n')

% Weight regularization parameter (we set this to 0 here).
lambda = 0;

% Arguments: unrolled weights, input layer size, hidden layer size,
% number of labels, training inputs X, labels y, regularization parameter
J = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, ...
                   num_labels, X, y, lambda);

fprintf(['Cost at parameters (loaded from ex4weights): %f '...
         '\n(this value should be about 0.287629)\n'], J);

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

4.1 sigmoid.m

The sigmoid function is fundamental in machine learning; see the logistic regression article for a detailed introduction.

function g = sigmoid(z)
%SIGMOID Compute sigmoid function
%   J = SIGMOID(z) computes the sigmoid of z.

g = 1.0 ./ (1.0 + exp(-z));
end

4.2 sigmoidGradient.m

This is the derivative of the sigmoid function; the derivation is straightforward using the chain rule.
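
For reference, applying the chain rule to g(z) = (1 + e^{-z})^{-1} gives

    g'(z) = \frac{e^{-z}}{(1 + e^{-z})^{2}}
          = \frac{1}{1 + e^{-z}} \cdot \Bigl(1 - \frac{1}{1 + e^{-z}}\Bigr)
          = g(z)\,\bigl(1 - g(z)\bigr),

which is exactly what the code below computes element-wise.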

function g = sigmoidGradient(z)
%SIGMOIDGRADIENT returns the gradient of the sigmoid function
%evaluated at z
%   g = SIGMOIDGRADIENT(z) computes the gradient of the sigmoid function
%   evaluated at z. This should work regardless if z is a matrix or a
%   vector. In particular, if z is a vector or matrix, you should return
%   the gradient for each element.

g = zeros(size(z));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the gradient of the sigmoid function evaluated at
%               each value of z (z can be a matrix, vector or scalar).

g = sigmoid(z) .* (1 - sigmoid(z));

% =============================================================
end

4.3 nnCostFunction.m

function [J, grad] = nnCostFunction(nn_params, ...
                                    input_layer_size, ...
                                    hidden_layer_size, ...
                                    num_labels, ...
                                    X, y, lambda)
%NNCOSTFUNCTION Implements the neural network cost function for a two layer
%neural network which performs classification
%   [J grad] = NNCOSTFUNCTON(nn_params, hidden_layer_size, num_labels, ...
%   X, y, lambda) computes the cost and gradient of the neural network. The
%   parameters for the neural network are "unrolled" into the vector
%   nn_params and need to be converted back into the weight matrices.
%
%   The returned parameter grad should be a "unrolled" vector of the
%   partial derivatives of the neural network.

% Reshape nn_params back into the parameters Theta1 and Theta2, the weight
% matrices for our 2 layer neural network.
% Theta1 connects the input layer to the hidden layer and has size
% hidden_layer_size x (input_layer_size + 1); the extra column holds the bias weights.
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
                 hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
                 num_labels, (hidden_layer_size + 1));

% Setup some useful variables
m = size(X, 1);

% ====================== YOUR CODE HERE ======================
% Part 1: Feedforward the neural network and return the cost in the
%         variable J. After implementing Part 1, you can verify that your
%         cost function computation is correct by verifying the cost
%         computed in ex4.m

X  = [ones(m, 1) X]';               % add bias units; X is now (input_layer_size + 1) x m
a1 = X;                             % 401 x 5000
z2 = Theta1 * a1;                   % hidden-layer input: (25 x 401) * (401 x 5000) ==> 25 x 5000
a2 = sigmoid(z2);                   % hidden-layer activation, 25 x 5000
a2 = [ones(1, size(a2, 2)); a2];    % add bias unit, 26 x 5000
z3 = Theta2 * a2;                   % output-layer input: (10 x 26) * (26 x 5000) ==> 10 x 5000
h_theta = sigmoid(z3);              % output-layer activation, 10 x 5000

% y arrives as a 5000 x 1 label vector; convert it to a one-hot matrix of
% size num_labels x m by indexing into an identity matrix.
% (A for-loop with switch-case also works, but is less flexible.)
y_number = size(y);
matrix = zeros(num_labels, y_number(1));
A = eye(num_labels);
for i = 1:y_number(1)
    matrix(:, i) = A(y(i), :);
end
y = matrix;

% Cross-entropy cost, averaged over the m examples
J = -sum(sum(log(h_theta) .* y + log(1 - h_theta) .* (1 - y), 2)) / m;

% Part 2: Implement the backpropagation algorithm to compute the gradients
%         Theta1_grad and Theta2_grad, the partial derivatives of the cost
%         function with respect to Theta1 and Theta2. After implementing
%         Part 2, you can check the implementation by running checkNNGradients.
%
% Note: The label vector y passed into the function contains values from
%       1..K; it has already been mapped above into a binary matrix.

derta3 = h_theta - y;                                   % 10 x 5000
% Apply the sigmoid gradient to z2 (with a dummy bias row so the sizes match);
% the bias row of derta2 is removed below.
derta2 = Theta2' * derta3 .* sigmoidGradient([ones(1, size(z2, 2)); z2]);   % 26 x 5000

Theta2_grad = derta3 * a2';          % 10 x 26, gradient of the hidden-to-output weights

% The bias unit added to layer 2 has no connection back to layer 1,
% so its row must be removed from derta2 before computing Theta1_grad.
derta2 = derta2(2:end, :);           % 25 x 5000
Theta1_grad = derta2 * a1';          % 25 x 401, gradient of the input-to-hidden weights

% (Optional) a small local gradient step; fmincg performs the actual
% parameter updates, so this only affects the local copies of Theta.
alpha = 0.00003;
Theta2 = Theta2 - alpha .* Theta2_grad;
Theta1 = Theta1 - alpha .* Theta1_grad;

% Part 3: Implement regularization with the cost function and gradients.
%         The regularization gradients are computed separately and added to
%         Theta1_grad and Theta2_grad from Part 2; the bias column is not regularized.
Theta2_reg  = Theta2_grad(:, 2:end) + lambda .* Theta2(:, 2:end);
Theta2_grad = (1/m) .* [Theta2_grad(:, 1) Theta2_reg];
Theta1_reg  = Theta1_grad(:, 2:end) + lambda .* Theta1(:, 2:end);
Theta1_grad = (1/m) .* [Theta1_grad(:, 1) Theta1_reg];

% Add the regularization term to the cost (bias columns excluded)
reg1 = sum(sum(Theta1(:, 2:end) .^ 2));
reg2 = sum(sum(Theta2(:, 2:end) .^ 2));
reg  = (lambda / (2*m)) * (reg1 + reg2);
J = J + reg;

% =========================================================================

% Unroll gradients
grad = [Theta1_grad(:) ; Theta2_grad(:)];
end

4.4 Code walkthrough and notes

Part 1: Feedforward

As noted in the overview, a neural network model is really several logistic regression models joined end to end. Feedforward can therefore be understood as solving a sequence of logistic regression models, layer by layer, until the final prediction is produced. Vectorizing the computation keeps the code short and makes it easier to locate mistakes from the intermediate results.

To avoid dimension errors, it helps to check the matrix sizes below while writing the code. (If you are already fluent with feedforward, you will of course not make the classic beginner mistake of mismatched matrix dimensions.)

Matrix   Size
X        (input_layer_size + 1) × m
Z2       hidden_layer_size × m
A2       (hidden_layer_size + 1) × m
Z3       num_labels × m

The formulas are as follows. (The figure in the original post writes them out for a four-layer network, but the same idea applies directly to the three-layer network used here; since the figure is not reproduced, the equations are restated below for reference.)
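
Restated in my notation for this three-layer network (x is one input image with a bias term prepended, g is the sigmoid):

    a^{(1)} = [1;\, x], \qquad
    z^{(2)} = \Theta^{(1)} a^{(1)}, \qquad
    a^{(2)} = [1;\, g(z^{(2)})], \qquad
    z^{(3)} = \Theta^{(2)} a^{(2)}, \qquad
    h_\Theta(x) = a^{(3)} = g(z^{(3)})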

Note that the labels y are given as a single column vector, but the cost computation needs y as a matrix of size num_labels × m, so the label vector must first be converted into one-hot columns.

I implemented this conversion in two ways: a for loop with a switch-case, and a for loop indexing into an identity (eye) matrix. The first is simpler to follow; the second is more flexible and needs less code (a minimal sketch of it follows).
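
A minimal sketch of the eye-matrix approach (the variable names I and Y are mine, for illustration); it can even be written without a loop by indexing columns of the identity matrix directly:

% Convert a label vector y (m x 1, values 1..num_labels) into a one-hot
% matrix Y of size num_labels x m.
I = eye(num_labels);
Y = I(:, y);     % column i of Y is the one-hot encoding of y(i)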

Finally, the cost is computed from the quantities obtained above. The cost function is essentially a measure of error; unlike an ordinary error function, it is obtained from the error through a nonlinear transformation and describes the error in probabilistic terms. The value obtained matches the reference value, so the code is correct.
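
For reference, the (unregularized) cost that the code computes, with K = num_labels, is

    J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K}
        \Bigl[ y^{(i)}_{k} \log\bigl(h_\Theta(x^{(i)})\bigr)_{k}
             + \bigl(1 - y^{(i)}_{k}\bigr) \log\bigl(1 - h_\Theta(x^{(i)})\bigr)_{k} \Bigr].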

Part 2: Computing gradients and updating Theta

Taking partial derivatives of the cost J and applying the chain rule gives the backpropagation formulas, so the gradients of Theta1 and Theta2 can be computed in vectorized form, and updated values of Theta can be obtained in the same pass.
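
Written out for this network (my notation; g' is the sigmoid gradient, ⊙ is element-wise multiplication, and the bias row of δ^{(2)} is discarded before forming the Theta1 gradient):

    \delta^{(3)} = a^{(3)} - y, \qquad
    \delta^{(2)} = \bigl(\Theta^{(2)}\bigr)^{T} \delta^{(3)} \odot g'\bigl(z^{(2)}\bigr)

    \frac{\partial J}{\partial \Theta^{(2)}} = \frac{1}{m}\, \delta^{(3)} \bigl(a^{(2)}\bigr)^{T}, \qquad
    \frac{\partial J}{\partial \Theta^{(1)}} = \frac{1}{m}\, \delta^{(2)} \bigl(a^{(1)}\bigr)^{T}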

Part 3: Regularization

A regularization term is added to the original cost to prevent overfitting caused by high-order terms; in effect, the regularization parameter λ weakens the influence of large weights on the model.

Note that the bias weights are not regularized. My own intuition, using regularized linear regression as an analogy: the bias is a constant term that only shifts the curve up or down and has no effect on how much the curve bends, so the corresponding weight does not affect the curvature either and there is no need to penalize it.
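
With the bias columns excluded (only the columns with index k ≥ 2 of each Theta are penalized), the regularized cost and gradients used in the code are

    J_{reg}(\Theta) = J(\Theta) + \frac{\lambda}{2m}
        \Bigl( \sum_{j}\sum_{k \ge 2} \bigl(\Theta^{(1)}_{jk}\bigr)^{2}
             + \sum_{j}\sum_{k \ge 2} \bigl(\Theta^{(2)}_{jk}\bigr)^{2} \Bigr),
    \qquad
    \frac{\partial J_{reg}}{\partial \Theta^{(l)}_{jk}}
        = \frac{\partial J}{\partial \Theta^{(l)}_{jk}} + \frac{\lambda}{m}\,\Theta^{(l)}_{jk}
        \quad (k \ge 2).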

5. Testing the Regularized Cost Function

The computed value essentially matches the reference value, so the regularized cost function is correct.

%% =============== Part 4: Implement Regularization ===============
%  Once your cost function implementation is correct, you should now
%  continue to implement the regularization with the cost.
%
fprintf('\nChecking Cost Function (w/ Regularization) ... \n')

% Weight regularization parameter (we set this to 1 here).
lambda = 1;

J = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, ...
                   num_labels, X, y, lambda);

fprintf(['Cost at parameters (loaded from ex4weights): %f '...
         '\n(this value should be about 0.383770)\n'], J);

fprintf('Program paused. Press enter to continue.\n');
pause;

6. Testing the Sigmoid Gradient

The test result matches the expected values.

%% ================ Part 5: Sigmoid Gradient ================
%  Before you start implementing the neural network, you will first
%  implement the gradient for the sigmoid function. You should complete the
%  code in the sigmoidGradient.m file.
%
fprintf('\nEvaluating sigmoid gradient...\n')

g = sigmoidGradient([-1 -0.5 0 0.5 1]);
fprintf('Sigmoid gradient evaluated at [-1 -0.5 0 0.5 1]:\n ');
fprintf('%f ', g);
fprintf('\n\n');

fprintf('Program paused. Press enter to continue.\n');
pause;

7. Initializing Parameters

%% ================ Part 6: Initializing Parameters ================
%  In this part of the exercise, you will be starting to implement a two
%  layer neural network that classifies digits. You will start by
%  implementing a function to initialize the weights of the neural network
%  (randInitializeWeights.m)

fprintf('\nInitializing Neural Network Parameters ...\n')

initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size);
initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels);

% Unroll parameters
initial_nn_params = [initial_Theta1(:) ; initial_Theta2(:)];

7.1 randInitializeWeights.m

This function initializes the weight matrices, including the extra column for the bias unit. The important point is not just the matrix size: the initial Theta values must not all be zero (otherwise symmetry is never broken), so the function uses rand() to draw values uniformly from the interval ±INIT_EPSILON.
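
The code below uses the common heuristic for the width of this interval,

    \epsilon_{init} = \sqrt{\frac{6}{L_{in} + L_{out}}}, \qquad
    W_{ij} \sim \mathcal{U}\bigl(-\epsilon_{init},\ \epsilon_{init}\bigr),

so layers with more connections get a smaller initial range.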

function W = randInitializeWeights(L_in, L_out)
%RANDINITIALIZEWEIGHTS Randomly initialize the weights of a layer with L_in
%incoming connections and L_out outgoing connections
%   W = RANDINITIALIZEWEIGHTS(L_in, L_out) randomly initializes the weights
%   of a layer with L_in incoming connections and L_out outgoing
%   connections.
%
%   Note that W should be set to a matrix of size(L_out, 1 + L_in) as
%   the first column of W handles the "bias" terms

% ====================== YOUR CODE HERE ======================
% Instructions: Initialize W randomly so that we break the symmetry while
%               training the neural network.
%
% Note: The first column of W corresponds to the parameters for the bias unit

INIT_EPSILON = sqrt(6 / (L_in + L_out));
W = rand(L_out, L_in + 1) * (2 * INIT_EPSILON) - INIT_EPSILON;

% =========================================================================
end

8. Checking Backpropagation with Regularization

Gradient checking is run with regularization enabled; the cost debug_J corresponding to the chosen lambda is also computed to judge whether the regularization is implemented correctly, and the cost at lambda = 3 is given as a reference. The computed value essentially matches the reference value.

%% =============== Part 8: Implement Regularization ===============
%  Once your backpropagation implementation is correct, you should now
%  continue to implement the regularization with the cost and gradient.
%
fprintf('\nChecking Backpropagation (w/ Regularization) ... \n')

% Check gradients by running checkNNGradients
lambda = 3;
checkNNGradients(lambda);

% Also output the costFunction debugging values
debug_J = nnCostFunction(nn_params, input_layer_size, ...
                         hidden_layer_size, num_labels, X, y, lambda);

fprintf(['\n\nCost at (fixed) debugging parameters (w/ lambda = %f): %f ' ...
         '\n(for lambda = 3, this value should be about 0.576051)\n\n'], lambda, debug_J);

fprintf('Program paused. Press enter to continue.\n');
pause;

8.1 checkNNGradients.m

checkNNGradients performs gradient checking: for the given lambda it builds a small test network (3-5-3), computes the analytical gradients grad from the backpropagation code, compares them with the numerically computed gradients numgrad, and thereby judges whether the cost and gradient implementation is reasonable.
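
The numerical gradient used for this comparison is the central difference (see computeNumericalGradient.m below), applied to each unrolled parameter in turn:

    \frac{\partial J}{\partial \theta_{p}} \approx
        \frac{J(\theta + \epsilon\, e_{p}) - J(\theta - \epsilon\, e_{p})}{2\epsilon},
    \qquad \epsilon = 10^{-4},

where e_p is the unit vector selecting the p-th parameter.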

function checkNNGradients(lambda)
%CHECKNNGRADIENTS Creates a small neural network to check the
%backpropagation gradients
%   CHECKNNGRADIENTS(lambda) Creates a small neural network to check the
%   backpropagation gradients, it will output the analytical gradients
%   produced by your backprop code and the numerical gradients (computed
%   using computeNumericalGradient). These two gradient computations should
%   result in very similar values.
%
if ~exist('lambda', 'var') || isempty(lambda)
    lambda = 0;
end

input_layer_size = 3;
hidden_layer_size = 5;
num_labels = 3;
m = 5;

% We generate some 'random' test data
% debugInitializeWeights(a, b) returns an a x (b + 1) matrix
Theta1 = debugInitializeWeights(hidden_layer_size, input_layer_size);   % 5 x 4
Theta2 = debugInitializeWeights(num_labels, hidden_layer_size);         % 3 x 6
% Reusing debugInitializeWeights to generate X
X = debugInitializeWeights(m, input_layer_size - 1);                    % 5 x 3
y = 1 + mod(1:m, num_labels)';   % 5 x 1 label vector (converted to one-hot inside nnCostFunction)

% Unroll parameters
nn_params = [Theta1(:) ; Theta2(:)];   % 38 x 1

% Short hand for cost function
costFunc = @(p) nnCostFunction(p, input_layer_size, hidden_layer_size, ...
                               num_labels, X, y, lambda);

[cost, grad] = costFunc(nn_params);
numgrad = computeNumericalGradient(costFunc, nn_params);

% Visually examine the two gradient computations. The two columns
% you get should be very similar.
disp([numgrad grad]);
fprintf(['The above two columns you get should be very similar.\n' ...
         '(Left-Your Numerical Gradient, Right-Analytical Gradient)\n\n']);

% Evaluate the norm of the difference between two solutions.
% If you have a correct implementation, and assuming you used EPSILON = 0.0001
% in computeNumericalGradient.m, then diff below should be less than 1e-9
diff = norm(numgrad - grad) / norm(numgrad + grad);

fprintf(['If your backpropagation implementation is correct, then \n' ...
         'the relative difference will be small (less than 1e-9). \n' ...
         '\nRelative Difference: %g\n'], diff);
end

8.2 computeNumericalGradient.m

function numgrad = computeNumericalGradient(J, theta)
%COMPUTENUMERICALGRADIENT Computes the gradient using "finite differences"
%and gives us a numerical estimate of the gradient.
%   numgrad = COMPUTENUMERICALGRADIENT(J, theta) computes the numerical
%   gradient of the function J around theta. Calling y = J(theta) should
%   return the function value at theta.

% Notes: The following code implements numerical gradient checking, and
%        returns the numerical gradient. It sets numgrad(i) to (a numerical
%        approximation of) the partial derivative of J with respect to the
%        i-th input argument, evaluated at theta. (i.e., numgrad(i) should
%        be (approximately) the partial derivative of J with respect
%        to theta(i).)
%
numgrad = zeros(size(theta));
perturb = zeros(size(theta));
e = 1e-4;
for p = 1:numel(theta)
    % Set perturbation vector
    perturb(p) = e;
    loss1 = J(theta - perturb);
    loss2 = J(theta + perturb);
    % Compute Numerical Gradient
    numgrad(p) = (loss2 - loss1) / (2*e);
    perturb(p) = 0;
end
end

9. Training the Model

The model is trained with fmincg, which works much like MATLAB's fminunc. I have not analysed its source code, so I will not say much about it here: once the cost function and the other modules are complete, calling fmincg is all that is needed to train the model. Readers interested in fmincg can consult other references.

%% =================== Part 8: Training NN ===================
%  You have now implemented all the code necessary to train a neural
%  network. To train your neural network, we will now use "fmincg", which
%  is a function which works similarly to "fminunc". Recall that these
%  advanced optimizers are able to train our cost functions efficiently as
%  long as we provide them with the gradient computations.
%
fprintf('\nTraining Neural Network... \n')

%  After you have completed the assignment, change the MaxIter to a larger
%  value to see how more training helps.
options = optimset('MaxIter', 50);

%  You should also try different values of lambda
lambda = 1;

% Create "short hand" for the cost function to be minimized
costFunction = @(p) nnCostFunction(p, ...
                                   input_layer_size, ...
                                   hidden_layer_size, ...
                                   num_labels, X, y, lambda);

% Now, costFunction is a function that takes in only one argument (the
% neural network parameters)
[nn_params, cost] = fmincg(costFunction, initial_nn_params, options);

% Obtain Theta1 and Theta2 back from nn_params
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
                 hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
                 num_labels, (hidden_layer_size + 1));

fprintf('Program paused. Press enter to continue.\n');
pause;

10. Visualizing the Weights

%% ================= Part 9: Visualize Weights =================
%  You can now "visualize" what the neural network is learning by
%  displaying the hidden units to see what features they are capturing in
%  the data.
fprintf('\nVisualizing Neural Network... \n')

displayData(Theta1(:, 2:end));

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

11. Prediction

Using the input data and the trained weights, the network's predicted outputs are computed and compared with the true labels to obtain the digit-recognition accuracy of the model.

%% ================= Part 10: Implement Predict =================
%  After training the neural network, we would like to use it to predict
%  the labels. You will now implement the "predict" function to use the
%  neural network to predict the labels of the training set. This lets
%  you compute the training set accuracy.

pred = predict(Theta1, Theta2, X);

fprintf('\nTraining Set Accuracy: %f\n', mean(double(pred == y)) * 100);

11.1 predict.m

predict uses the trained weight matrices to compute, for each image, the probability assigned to each of the ten labels, takes the label with the highest probability as the predicted digit, and compares the predictions with the true y to compute the accuracy.

The accuracy obtained in this run is 88.58%, i.e. 4429 of the 5000 images are recognized correctly, so there is still room for improvement. Note that the accuracy differs slightly from run to run, but it generally stays above 85%, which is a reasonable result.

function p = predict(Theta1, Theta2, X)
%PREDICT Predict the label of an input given a trained neural network
%   p = PREDICT(Theta1, Theta2, X) outputs the predicted label of X given the
%   trained weights of a neural network (Theta1, Theta2)

% Useful values
m = size(X, 1);
num_labels = size(Theta2, 1);

% You need to return the following variables correctly
p = zeros(size(X, 1), 1);

h1 = sigmoid([ones(m, 1) X] * Theta1');
h2 = sigmoid([ones(m, 1) h1] * Theta2');
[dummy, p] = max(h2, [], 2);

% =========================================================================
end
