This article gives a modular walkthrough of using a neural network to recognize handwritten digits. For the Exercise 4 project itself, the following modules need to be completed, and each is described in some detail below for reference:
nnCostFunction
sigmoidGradient
randInitializeWeights
A neural network is essentially assembled from multiple logistic regression units, so you should be comfortable with logistic regression before studying neural networks. The figure illustrates the relationship between logistic regression and a neural network: the simplified network model is clearly a combination of the five logistic regression models shown above it.
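To make the correspondence concrete (standard notation from the course; the figure itself is not reproduced here), each hidden unit of the network computes exactly the hypothesis of a logistic regression applied to the previous layer's activations:

$$h_\theta(x) = g(\theta^{T}x), \qquad a_j^{(2)} = g\!\left(\Theta^{(1)}_{j,:}\,a^{(1)}\right), \qquad g(z) = \frac{1}{1+e^{-z}}$$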
Parameter | Meaning | Value |
--- | --- | --- |
input_layer_size | number of input-layer units | 400 |
hidden_layer_size | number of hidden-layer units | 25 |
num_labels | number of output-layer units | 10 |
m | number of training examples | 5000 |
- %% Machine Learning Online Class - Exercise 4 Neural Network Learning
-
- % sigmoidGradient.m
- % randInitializeWeights.m
- % nnCostFunction.m
-
- %% Initialization
- clear ; close all; clc
-
- %% Setup the parameters you will use for this exercise
- input_layer_size = 400; % 20x20 Input Images of Digits
- hidden_layer_size = 25; % 25 hidden units
- num_labels = 10; % 10 labels, from 1 to 10
- % (note that we have mapped "0" to label 10)
-
- %% =========== Part 1: Loading and Visualizing Data =============
- % We start the exercise by first loading and visualizing the dataset.
- % You will be working with a dataset that contains handwritten digits.
- %
-
- % Load Training Data
- fprintf('Loading and Visualizing Data ...\n')
-
- load('ex4data1.mat');
- m = size(X, 1); % number of rows, i.e. the number of training examples
-
- % Randomly select 100 data points to display
- sel = randperm(size(X, 1));
- sel = sel(1:100);
-
- displayData(X(sel, :));
-
- fprintf('Program paused. Press enter to continue.\n');
- pause;
-
The displayData.m function can be used to produce the data visualization shown below from the original dataset. Its effect is clear from the result: it lays out a 2D array of examples in a grid. Readers who are interested can study the code in detail.
- function [h, display_array] = displayData(X, example_width)
- %DISPLAYDATA Display 2D data in a nice grid
- % [h, display_array] = DISPLAYDATA(X, example_width) displays 2D data
- % stored in X in a nice grid. It returns the figure handle h and the
- % displayed array if requested.
-
- % Set example_width automatically if not passed in
- if ~exist('example_width', 'var') || isempty(example_width)
- example_width = round(sqrt(size(X, 2)));
- end
-
- % Gray Image
- colormap(gray);
-
- % Compute rows, cols
- [m n] = size(X);
- example_height = (n / example_width);
-
- % Compute number of items to display
- display_rows = floor(sqrt(m));
- display_cols = ceil(m / display_rows);
-
- % Between images padding
- pad = 1;
-
- % Setup blank display
- display_array = - ones(pad + display_rows * (example_height + pad), ...
- pad + display_cols * (example_width + pad));
-
- % Copy each example into a patch on the display array
- curr_ex = 1;
- for j = 1:display_rows
- for i = 1:display_cols
- if curr_ex > m,
- break;
- end
- % Copy the patch
-
- % Get the max value of the patch
- max_val = max(abs(X(curr_ex, :)));
- display_array(pad + (j - 1) * (example_height + pad) + (1:example_height), ...
- pad + (i - 1) * (example_width + pad) + (1:example_width)) = ...
- reshape(X(curr_ex, :), example_height, example_width) / max_val;
- curr_ex = curr_ex + 1;
- end
- if curr_ex > m,
- break;
- end
- end
-
- % Display Image
- h = imagesc(display_array, [-1 1]);
-
- % Do not show axis
- axis image off
-
- drawnow;
-
- end
Since this exercise uses a three-layer network, two weight matrices, Theta1 and Theta2, are needed; because the three layers have different numbers of units, the two matrices also have different sizes. They are therefore "unrolled" into a single column vector nn_params (via [Theta1(:); Theta2(:)]) so they can be stored and passed around easily, and later recovered with the reshape function.
Matrix | Meaning | Size |
--- | --- | --- |
Theta1 | weight matrix between the input layer and the hidden layer | hidden_layer_size × (input_layer_size + 1) |
Theta2 | weight matrix between the hidden layer and the output layer | num_labels × (hidden_layer_size + 1) |

Note: the "+1" in the sizes of Theta1 and Theta2 comes from the bias unit added to the input layer and to the hidden layer, respectively.
- %% ================ Part 2: Loading Parameters ================
- % In this part of the exercise, we load some pre-initialized
- % neural network parameters.
-
- fprintf('\nLoading Saved Neural Network Parameters ...\n')
-
- % Load the weights into variables Theta1 and Theta2
- load('ex4weights.mat');
-
- % Unroll parameters
- nn_params = [Theta1(:) ; Theta2(:)];
- %% ================ Part 3: Compute Cost (Feedforward) ================
- % To the neural network, you should first start by implementing the
- % feedforward part of the neural network that returns the cost only. You
- % should complete the code in nnCostFunction.m to return cost. After
- % implementing the feedforward to compute the cost, you can verify that
- % your implementation is correct by verifying that you get the same cost
- % as us for the fixed debugging parameters.
- %
- % We suggest implementing the feedforward cost *without* regularization
- % first so that it will be easier for you to debug. Later, in part 4, you
- % will get to implement the regularized cost.
- %
- fprintf('\nFeedforward Using Neural Network ...\n')
-
- % Weight regularization parameter (we set this to 0 here).
- lambda = 0;
-
- % nnCostFunction arguments: unrolled weights [Theta1(:); Theta2(:)], input layer size,
- % hidden layer size, number of labels, inputs X, true labels y, regularization parameter lambda
- J = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, ...
- num_labels, X, y, lambda);
-
- fprintf(['Cost at parameters (loaded from ex4weights): %f '...
- '\n(this value should be about 0.287629)\n'], J);
-
- fprintf('\nProgram paused. Press enter to continue.\n');
- pause;
The sigmoid is a very important function in machine learning; a detailed introduction can be found in the logistic regression article.
- function g = sigmoid(z)
- %SIGMOID Compute sigmoid function
- % g = SIGMOID(z) computes the sigmoid of z.
-
- g = 1.0 ./ (1.0 + exp(-z));
- end
This part computes the derivative of the sigmoid function. The derivation is not difficult: applying the chain rule to the composite function gives the result directly.
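For reference, the derivation that the code below implements (a standard result for the sigmoid defined above):

$$g(z) = \frac{1}{1+e^{-z}}, \qquad g'(z) = \frac{e^{-z}}{\left(1+e^{-z}\right)^{2}} = g(z)\,\bigl(1-g(z)\bigr)$$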
- function g = sigmoidGradient(z)
- %SIGMOIDGRADIENT returns the gradient of the sigmoid function
- %evaluated at z (the derivative of the sigmoid function)
- % g = SIGMOIDGRADIENT(z) computes the gradient of the sigmoid function
- % evaluated at z. This should work regardless if z is a matrix or a
- % vector. In particular, if z is a vector or matrix, you should return
- % the gradient for each element.
-
- g = zeros(size(z));
-
- % ================== YOUR CODE HERE (sigmoid derivative) ==================
- % Instructions: Compute the gradient of the sigmoid function evaluated at
- % each value of z (z can be a matrix, vector or scalar).
-
- g=sigmoid(z).*(1-sigmoid(z));
-
- % =============================================================
-
- end
- function [J grad] = nnCostFunction(nn_params, ...
- input_layer_size, ...
- hidden_layer_size, ...
- num_labels, ...
- X, y, lambda)
- %NNCOSTFUNCTION Implements the neural network cost function for a two layer
- %neural network which performs classification
- % [J grad] = NNCOSTFUNCTION(nn_params, input_layer_size, hidden_layer_size, num_labels, ...
- % X, y, lambda) computes the cost and gradient of the neural network. The
- % parameters for the neural network are "unrolled" into the vector
- % nn_params and need to be converted back into the weight matrices.
- %
- % The returned parameter grad should be a "unrolled" vector of the
- % partial derivatives of the neural network.
-
-
- % Reshape nn_params back into the parameters Theta1 and Theta2, the weight matrices
- % for our 2 layer neural network
- Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
- hidden_layer_size, (input_layer_size + 1));
- % Theta1 is the weight matrix between the input layer and the first hidden layer;
- % its size is (number of hidden units) x (number of input units + 1 bias unit).
- % reshape(v, r, c): v is the vector to reshape, r the number of rows, c the number of columns of the result
-
- Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
- num_labels, (hidden_layer_size + 1));
-
- % Setup some useful variables
- m = size(X, 1);
-
- % ====================== YOUR CODE HERE ======================
- % Instructions: You should complete the code by working through the
- % following parts.
- %
- % Part 1: Feedforward the neural network and return the cost in the
- % variable J. After implementing Part 1, you can verify that your
- % cost function computation is correct by verifying the cost
- % computed in ex4.m
-
- X = [ones(m,1) X]';            % add the bias term; X was (m x input_layer_size), now ((input_layer_size+1) x m)
- a1 = X;                        % 401x5000
- z2 = Theta1*a1;                % hidden-layer input z2: (hidden_layer_size x m)
- % (25x401) x (401x5000) ==> 25x5000
- % Theta1 is (hidden_layer_size x (input_layer_size+1))
-
- a2 = sigmoid(z2);              % hidden-layer output: 25x5000 (hidden_layer_size x m)
- a2 = [ones(1,size(a2,2));a2];  % add the bias row: 26x5000 ((hidden_layer_size+1) x m)
- z3 = Theta2*a2;                % output-layer input: (10x26) x (26x5000) ==> 10x5000 (num_labels x m)
- h_theta = sigmoid(z3);         % output-layer output: 10x5000
- % y is 5000x1; it must be converted to a (num_labels x m) matrix
- [y_number] = size(y);          % y_number(1) = m, the number of training examples
-
- % Convert the label vector y (values 1..num_labels) into a one-hot matrix
- matrix = zeros(num_labels,y_number(1));
- A = eye(num_labels);
- for i = 1:y_number(1)
-     matrix(:,i) = A(y(i),:);   % row y(i) of the identity matrix becomes column i
- end
- y = matrix;
- % (a switch-case version is also possible, but it is less flexible)
-
- % sum the cross-entropy term over all labels and all examples
- J = -sum(sum(log(h_theta).*y+log(1-h_theta).*(1-y),2))/m;
-
- %
- % Part 2: Implement the backpropagation algorithm to compute the gradients
- % Theta1_grad and Theta2_grad. You should return the partial derivatives of
- % the cost function with respect to Theta1 and Theta2 in Theta1_grad and
- % Theta2_grad, respectively. After implementing Part 2, you can check
- % that your implementation is correct by running checkNNGradients
- %
- % Note: The vector y passed into the function is a vector of labels
- % containing values from 1..K. You need to map this vector into a
- % binary vector of 1's and 0's to be used with the neural network
- % cost function.
- %
- % Hint: We recommend implementing backpropagation using a for-loop
- % over the training examples if you are implementing it for the
- % first time.
- %
- derta3 = (h_theta - y);                        % output-layer error: 10x5000
-
- derta2 = (Theta2)'*derta3;                     % (10x26)' x (10x5000) ==> 26x5000
- derta2 = derta2(2:end,:).*sigmoidGradient(z2); % drop the bias row, then multiply by g'(z2): 25x5000
- % Note: the bias unit added to the hidden layer has no connection back to the
- % input layer, so its error term must be removed before computing the gradient;
- % Theta1_grad is therefore 25x401.
-
- Theta2_grad = derta3*(a2)';                    % output-layer-to-hidden-layer gradient: 10x26
- Theta1_grad = derta2*(a1)';                    % hidden-layer-to-input-layer gradient: 25x401
-
- % Take one small gradient-descent step on the weights as well
- alpha = 0.00003;
- Theta2 = Theta2 - alpha.*Theta2_grad;
- Theta1 = Theta1 - alpha.*Theta1_grad;
-
- % Part 3: Implement regularization with the cost function and gradients.
- %
- % Hint: You can implement this around the code for
- % backpropagation. That is, you can compute the gradients for
- % the regularization separately and then add them to Theta1_grad
- % and Theta2_grad from Part 2.
- Theta2_reg = Theta2_grad(:,2:end)+lambda.*Theta2(:,2:end);
- Theta2_grad = (1/m).*[Theta2_grad(:,1) Theta2_reg];
-
- Theta1_reg = Theta1_grad(:,2:end)+lambda.*Theta1(:,2:end);
- Theta1_grad = (1/m).*[Theta1_grad(:,1) Theta1_reg];
- % add the regularization term to the cost
- reg1 = sum(sum(Theta1(:,2:end).^2));
- reg2 = sum(sum(Theta2(:,2:end).^2));
-
- reg = (lambda/(2*m))*(reg1+reg2);
- J = J +reg;
- % =========================================================================
-
- % Unroll gradients
- grad = [Theta1_grad(:) ; Theta2_grad(:)];
-
-
- end
As mentioned earlier, a neural network model is essentially a chain of logistic regression models connected end to end. Forward propagation can therefore be understood as repeatedly solving a logistic regression model, advancing layer by layer until the final prediction is produced; the detailed steps are given below. To simplify the computation, a vectorized implementation is used: it keeps the code simple, and the intermediate results also make it easier to locate where an error occurs.
To avoid mistakes, the table below can be used to check the matrix dimensions while writing the code. Of course, if you are already very familiar with forward propagation, you will not make the mistake beginners often do: mismatched dimensions in a matrix multiplication.
Quantity | Matrix size |
--- | --- |
X (with bias row) | (input_layer_size + 1) × m |
Z2 | hidden_layer_size × m |
A2 (with bias row) | (hidden_layer_size + 1) × m |
Z3 | num_labels × m |
The formulas are as follows. The figure below shows the equations for a four-layer network, but the same principle applies to this project; it is provided for reference only.
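For the three-layer network used in this exercise, the vectorized forward pass implemented in nnCostFunction.m is:

$$a^{(1)} = \begin{bmatrix}1\\ x\end{bmatrix},\quad z^{(2)} = \Theta^{(1)}a^{(1)},\quad a^{(2)} = \begin{bmatrix}1\\ g\bigl(z^{(2)}\bigr)\end{bmatrix},\quad z^{(3)} = \Theta^{(2)}a^{(2)},\quad h_\Theta(x) = g\bigl(z^{(3)}\bigr)$$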
Note that the output labels y are given as a single column vector, but when evaluating the cost function y needs to be a matrix of size num_labels × m. The label data therefore has to be processed as follows:
I used two approaches for this step: the first is a for loop with a switch-case structure, the other a for loop combined with an identity (eye) matrix. The first is logically simple and easy to understand; the second is more flexible and needs less code (a fully vectorized variant is sketched below).
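As an aside, the loop can also be avoided entirely by indexing into the identity matrix with the label vector; a minimal sketch (the names I and Y below are illustrative, not from the original code):

- % Vectorized one-hot encoding: column i of Y is the unit vector for label y(i)
- I = eye(num_labels);   % num_labels x num_labels identity matrix
- Y = I(:, y);           % num_labels x m, assuming y contains integer labels 1..num_labels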
Finally, the cost is computed from the quantities obtained above. The cost function is essentially a way of describing the error; unlike an ordinary error function, it is obtained by passing the error through a nonlinear transformation and describes the error in probabilistic terms. The result of this exercise matches the reference value, so the code is correct.
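Written out, the unregularized cost computed by the code above is the cross-entropy cost (with K = num_labels):

$$J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\left[y_k^{(i)}\log\bigl(h_\Theta(x^{(i)})\bigr)_k + \bigl(1-y_k^{(i)}\bigr)\log\Bigl(1-\bigl(h_\Theta(x^{(i)})\bigr)_k\Bigr)\right]$$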
Taking the partial derivatives of the cost J and applying the chain rule gives the backpropagation formulas (*). Using vectorization, the gradients with respect to Theta1 and Theta2 can then be computed, and the updated Theta values can be obtained at the same time.
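In vectorized form, the backpropagation step corresponds to the following (the bias row of $(\Theta^{(2)})^{T}\delta^{(3)}$ is discarded before forming $\delta^{(2)}$, since the bias unit has no incoming connections):

$$\delta^{(3)} = a^{(3)} - y,\qquad \delta^{(2)} = \bigl(\Theta^{(2)}\bigr)^{T}\delta^{(3)} \odot g'\bigl(z^{(2)}\bigr),\qquad \frac{\partial J}{\partial \Theta^{(2)}} = \frac{1}{m}\,\delta^{(3)}\bigl(a^{(2)}\bigr)^{T},\qquad \frac{\partial J}{\partial \Theta^{(1)}} = \frac{1}{m}\,\delta^{(2)}\bigl(a^{(1)}\bigr)^{T}$$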
A regularization term is added to the original cost function to prevent the overfitting caused by overly complex high-order terms; in essence, the regularization parameter λ weakens the influence of those terms on the model.
Note that the bias weights are not regularized. My own intuition, taking regularized linear regression as an example: the bias is a constant term, so it only shifts the curve up or down and has no effect on how much the curve bends; the corresponding weight therefore has no effect on the curve's complexity and does not need to be penalized.
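The regularized cost used here is the unregularized cost plus a penalty on all weights except the first (bias) column of each Theta:

$$J_{\text{reg}}(\Theta) = J(\Theta) + \frac{\lambda}{2m}\left[\sum_{j}\sum_{k\ge 2}\bigl(\Theta^{(1)}_{j,k}\bigr)^{2} + \sum_{j}\sum_{k\ge 2}\bigl(\Theta^{(2)}_{j,k}\bigr)^{2}\right]$$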
The test result essentially matches the reference value, so the regularized cost function is correct.
- %% =============== Part 4: Implement Regularization ===============
- % Once your cost function implementation is correct, you should now
- % continue to implement the regularization with the cost.
- %
-
- fprintf('\nChecking Cost Function (w/ Regularization) ... \n')
-
- % Weight regularization parameter (we set this to 1 here).
- lambda = 1;
-
- J = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, ...
- num_labels, X, y, lambda);
-
- fprintf(['Cost at parameters (loaded from ex4weights): %f '...
- '\n(this value should be about 0.383770)\n'], J);
-
- fprintf('Program paused. Press enter to continue.\n');
- pause;
The test result is accurate.
- %% ================ Part 5: Sigmoid Gradient ================
- % Before you start implementing the neural network, you will first
- % implement the gradient for the sigmoid function. You should complete the
- % code in the sigmoidGradient.m file.
- %
-
- fprintf('\nEvaluating sigmoid gradient...\n')
-
- g = sigmoidGradient([-1 -0.5 0 0.5 1]);
- fprintf('Sigmoid gradient evaluated at [-1 -0.5 0 0.5 1]:\n ');
- fprintf('%f ', g);
- fprintf('\n\n');
-
- fprintf('Program paused. Press enter to continue.\n');
- pause;
- %% ================ Part 6: Initializing Parameters ================
- % In this part of the exercise, you will be starting to implement a two
- % layer neural network that classifies digits. You will start by
- % implementing a function to initialize the weights of the neural network
- % (randInitializeWeights.m)
-
- fprintf('\nInitializing Neural Network Parameters ...\n')
-
- initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size);
- initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels);
-
- % Unroll parameters
- initial_nn_params = [initial_Theta1(:) ; initial_Theta2(:)];
7.1、The randInitializeWeights.m function
Initializing the weight parameters means creating random starting values for Theta1 and Theta2, including the extra column of weights for the bias unit. Note that this function does more than just set the matrix sizes: when assigning the initial values, Theta must not be initialized to all zeros (otherwise the symmetry between hidden units is never broken during training), so the function uses rand() to restrict the initial values to the range ±INIT_EPSILON.
- function W = randInitializeWeights(L_in, L_out)
- %RANDINITIALIZEWEIGHTS Randomly initialize the weights of a layer with L_in
- %incoming connections and L_out outgoing connections
- % W = RANDINITIALIZEWEIGHTS(L_in, L_out) randomly initializes the weights
- % of a layer with L_in incoming connections and L_out outgoing
- % connections.
- %
- % Note that W should be set to a matrix of size(L_out, 1 + L_in) as
- % the first column of W handles the "bias" terms
-
- % ====================== YOUR CODE HERE ======================
- % Instructions: Initialize W randomly so that we break the symmetry while
- % training the neural network.
- %
- % Note: The first column of W corresponds to the parameters for the bias unit
- %
- INIT_EPSILON = sqrt(6/(L_in+L_out));
- W = rand(L_out,L_in+1)*(2*INIT_EPSILON) - INIT_EPSILON;
- % =========================================================================
- end
By varying lambda and computing the corresponding cost debug_J, the correctness of the model's regularization can be checked; the cost for lambda = 3 is given as a reference, and the computed value essentially matches it.
- %% =============== Part 8: Implement Regularization ===============
- % Once your backpropagation implementation is correct, you should now
- % continue to implement the regularization with the cost and gradient.
- %
-
- fprintf('\nChecking Backpropagation (w/ Regularization) ... \n')
-
- % Check gradients by running checkNNGradients
- lambda = 3;
- checkNNGradients(lambda);
-
- % Also output the costFunction debugging values
- debug_J = nnCostFunction(nn_params, input_layer_size, ...
- hidden_layer_size, num_labels, X, y, lambda);
-
- fprintf(['\n\nCost at (fixed) debugging parameters (w/ lambda = %f): %f ' ...
- '\n(for lambda = 3, this value should be about 0.576051)\n\n'], lambda, debug_J);
-
- fprintf('Program paused. Press enter to continue.\n');
- pause;
The purpose of checkNNGradients is gradient checking. Given the lambda passed in, it computes the gradients of a small test model (a 3-5-3 neural network) and compares the analytical gradient grad produced by backpropagation with the numerically computed gradient numgrad, which tells us whether the cost function's gradient computation is reasonable.
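The numerical gradient in computeNumericalGradient.m is the centered finite difference with $\varepsilon = 10^{-4}$, where $e_i$ is the i-th unit vector:

$$\text{numgrad}(i) = \frac{J(\theta + \varepsilon e_i) - J(\theta - \varepsilon e_i)}{2\varepsilon} \approx \frac{\partial J}{\partial \theta_i}$$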
- function checkNNGradients(lambda)
- %CHECKNNGRADIENTS Creates a small neural network to check the
- %backpropagation gradients
- % CHECKNNGRADIENTS(lambda) Creates a small neural network to check the
- % backpropagation gradients, it will output the analytical gradients
- % produced by your backprop code and the numerical gradients (computed
- % using computeNumericalGradient). These two gradient computations should
- % result in very similar values.
- %
-
- if ~exist('lambda', 'var') || isempty(lambda)
- lambda = 0;
- end
-
- input_layer_size = 3;
- hidden_layer_size = 5;
- num_labels = 3;
- m = 5;
- % X = (m * input_layer_size);
- % Y = (m *num_labels);
-
- % We generate some 'random' test data
- %debugInitializeWeights(a,b)---->(a,b+1)
- Theta1 = debugInitializeWeights(hidden_layer_size, input_layer_size);%5 * 4
- Theta2 = debugInitializeWeights(num_labels, hidden_layer_size);%3 * 6
- % Reusing debugInitializeWeights to generate X
- X = debugInitializeWeights(m, input_layer_size - 1); % 5 * 3
-
- y = 1 + mod(1:m, num_labels)'; % 5 x 1 column of labels in 1..num_labels (expanded to a num_labels x m matrix inside nnCostFunction)
-
- % Unroll parameters
- nn_params = [Theta1(:) ; Theta2(:)];% 38*1
-
- % Short hand for cost function
- costFunc = @(p) nnCostFunction(p, input_layer_size, hidden_layer_size, ...
- num_labels, X, y, lambda);
-
- [cost, grad] = costFunc(nn_params);
- numgrad = computeNumericalGradient(costFunc, nn_params);
-
- % Visually examine the two gradient computations. The two columns
- % you get should be very similar.
- disp([numgrad grad]);
- fprintf(['The above two columns you get should be very similar.\n' ...
- '(Left-Your Numerical Gradient, Right-Analytical Gradient)\n\n']);
-
- % Evaluate the norm of the difference between two solutions.
- % If you have a correct implementation, and assuming you used EPSILON = 0.0001
- % in computeNumericalGradient.m, then diff below should be less than 1e-9
- diff = norm(numgrad-grad)/norm(numgrad+grad);
-
- fprintf(['If your backpropagation implementation is correct, then \n' ...
- 'the relative difference will be small (less than 1e-9). \n' ...
- '\nRelative Difference: %g\n'], diff);
-
- end
- function numgrad = computeNumericalGradient(J, theta)
- %COMPUTENUMERICALGRADIENT Computes the gradient using "finite differences"
- %and gives us a numerical estimate of the gradient.
- % numgrad = COMPUTENUMERICALGRADIENT(J, theta) computes the numerical
- % gradient of the function J around theta. Calling y = J(theta) should
- % return the function value at theta.
-
- % Notes: The following code implements numerical gradient checking, and
- % returns the numerical gradient.It sets numgrad(i) to (a numerical
- % approximation of) the partial derivative of J with respect to the
- % i-th input argument, evaluated at theta. (i.e., numgrad(i) should
- % be the (approximately) the partial derivative of J with respect
- % to theta(i).)
- %
-
- numgrad = zeros(size(theta));
- perturb = zeros(size(theta));
- e = 1e-4;
- for p = 1:numel(theta)
- % Set perturbation vector
- perturb(p) = e;
- loss1 = J(theta - perturb);
- loss2 = J(theta + perturb);
- % Compute Numerical Gradient
- numgrad(p) = (loss2 - loss1) / (2*e);
- perturb(p) = 0;
- end
-
- end
The model is trained with the fmincg function (provided with the exercise; it works much like MATLAB's fminunc). I have not analysed its source code, so I do not understand this part in depth; I only know that once the cost function and the other modules are complete, calling this function trains the model. I will not say more about fmincg here; interested readers can consult the relevant material on their own.
- %% =================== Part 8: Training NN ===================
- % You have now implemented all the code necessary to train a neural
- % network. To train your neural network, we will now use "fmincg", which
- % is a function which works similarly to "fminunc". Recall that these
- % advanced optimizers are able to train our cost functions efficiently as
- % long as we provide them with the gradient computations.
- %
- fprintf('\nTraining Neural Network... \n')
-
- % After you have completed the assignment, change the MaxIter to a larger
- % value to see how more training helps.
- options = optimset('MaxIter', 50);
-
- % You should also try different values of lambda
- lambda = 1;
-
- % Create "short hand" for the cost function to be minimized
- costFunction = @(p) nnCostFunction(p, ...
- input_layer_size, ...
- hidden_layer_size, ...
- num_labels, X, y, lambda);
-
- % Now, costFunction is a function that takes in only one argument (the
- % neural network parameters)
- [nn_params, cost] = fmincg(costFunction, initial_nn_params, options);
-
- % Obtain Theta1 and Theta2 back from nn_params
- Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
- hidden_layer_size, (input_layer_size + 1));
-
- Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
- num_labels, (hidden_layer_size + 1));
-
- fprintf('Program paused. Press enter to continue.\n');
- pause;
- %% ================= Part 9: Visualize Weights =================
- % You can now "visualize" what the neural network is learning by
- % displaying the hidden units to see what features they are capturing in
- % the data.
-
- fprintf('\nVisualizing Neural Network... \n')
-
- displayData(Theta1(:, 2:end));
-
- fprintf('\nProgram paused. Press enter to continue.\n');
- pause;
Using the input data and the trained weights, the predicted outputs are computed and compared with the actual labels to obtain the model's digit-recognition accuracy.
- %% ================= Part 10: Implement Predict =================
- % After training the neural network, we would like to use it to predict
- % the labels. You will now implement the "predict" function to use the
- % neural network to predict the labels of the training set. This lets
- % you compute the training set accuracy.
-
- pred = predict(Theta1, Theta2, X);
-
- fprintf('\nTraining Set Accuracy: %f\n', mean(double(pred == y)) * 100);
In effect, the trained weight matrices are used to compute, for each image, the probability of it belonging to each of the ten labels; the label with the highest probability is taken as the predicted digit, and this prediction is compared with the actual y to compute the accuracy.
The accuracy obtained in this run is 88.580000%, i.e. only 4429 of the 5000 images are recognized correctly, so there is still room for improvement. It is worth noting that the accuracy differs slightly from run to run, but it is basically stable above 85%, which is reasonably satisfactory.
- function p = predict(Theta1, Theta2, X)
- %PREDICT Predict the label of an input given a trained neural network
- % p = PREDICT(Theta1, Theta2, X) outputs the predicted label of X given the
- % trained weights of a neural network (Theta1, Theta2)
-
- % Useful values
- m = size(X, 1);
- num_labels = size(Theta2, 1);
-
- % You need to return the following variables correctly
- p = zeros(size(X, 1), 1);
-
- h1 = sigmoid([ones(m, 1) X] * Theta1');
- h2 = sigmoid([ones(m, 1) h1] * Theta2');
- [dummy, p] = max(h2, [], 2);
-
- % =========================================================================
- end