For a training example $(\boldsymbol{x}_k, \boldsymbol{y}_k)$, suppose the neural network outputs $\hat{\boldsymbol{y}}_k = (\hat{y}_1^k, \hat{y}_2^k, \ldots, \hat{y}_l^k)$, that is,

$$\hat{y}_j^k = f\left(\beta_j - \theta_j\right),$$

where $\beta_j$ is the input received by the $j$-th output neuron, $\theta_j$ is its threshold, and $f$ is the sigmoid activation function. The mean squared error of the network on $(\boldsymbol{x}_k, \boldsymbol{y}_k)$ is then

$$E_k = \frac{1}{2} \sum_{j=1}^{l} \left(\hat{y}_j^k - y_j^k\right)^2.$$
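To make the notation concrete, here is a minimal numpy sketch of $E_k$ for a single example with $l = 3$ outputs (the vectors are made up for illustration):

```python
import numpy as np

# Hypothetical network output and ground truth for one example k (l = 3)
y_hat_k = np.array([0.8, 0.1, 0.3])
y_k     = np.array([1.0, 0.0, 0.0])

# E_k = 1/2 * sum_j (y_hat_j - y_j)^2
E_k = 0.5 * np.sum((y_hat_k - y_k) ** 2)
print(E_k)  # 0.5 * (0.04 + 0.01 + 0.09) ≈ 0.07
```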
The BP algorithm is based on the gradient descent strategy: it adjusts the parameters in the direction of the negative gradient of the objective. For the error $E_k$, given a learning rate $\eta$, we have

$$\Delta w_{hj} = -\eta \frac{\partial E_k}{\partial w_{hj}}.$$
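As a one-dimensional illustration of the strategy (the objective here is made up for the sketch, not the network's error):

```python
# Minimize E(w) = (w - 3)^2 by gradient descent; dE/dw = 2(w - 3)
w, eta = 0.0, 0.1
for _ in range(50):
    grad = 2 * (w - 3)
    w -= eta * grad   # delta_w = -eta * dE/dw
print(w)  # converges toward the minimum at w = 3
```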
Note that $w_{hj}$ first affects the input $\beta_j$ of the $j$-th output neuron, which in turn affects its output $\hat{y}_j^k$, which finally affects $E_k$. By the chain rule,

$$\frac{\partial E_k}{\partial w_{hj}} = \frac{\partial E_k}{\partial \hat{y}_j^k} \cdot \frac{\partial \hat{y}_j^k}{\partial \beta_j} \cdot \frac{\partial \beta_j}{\partial w_{hj}}.$$
Since

$$\beta_j = \sum_{h=1}^{q} w_{hj} b_h,$$

where $b_h$ is the output of the $h$-th hidden neuron and $q$ is the number of hidden neurons, $\beta_j$ is a linear function of $w_{hj}$ with slope $b_h$, so naturally

$$\frac{\partial \beta_j}{\partial w_{hj}} = b_h.$$
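With all three factors in hand, the chain rule can be checked numerically. A small sketch with made-up values for $b_h$, $w_{hj}$, $\theta_j$, and $y_j^k$, comparing the analytic derivative against a finite difference:

```python
import numpy as np

def f(x):                            # sigmoid activation
    return 1 / (1 + np.exp(-x))

# Made-up quantities for one output neuron j (q = 3 hidden neurons)
b = np.array([0.5, 0.2, 0.9])        # hidden outputs b_h
w = np.array([0.1, -0.4, 0.3])       # weights w_hj into neuron j
theta_j, y_j, h = 0.05, 1.0, 1       # threshold, target, index h

def E(w_vec):
    beta_j = np.dot(w_vec, b)        # beta_j = sum_h w_hj * b_h
    return 0.5 * (f(beta_j - theta_j) - y_j) ** 2

# Chain rule: dE/dw_hj = (y_hat - y) * y_hat * (1 - y_hat) * b_h
beta_j = np.dot(w, b)
y_hat = f(beta_j - theta_j)
chain = (y_hat - y_j) * y_hat * (1 - y_hat) * b[h]

# Finite-difference estimate of the same derivative
eps = 1e-6
w_plus, w_minus = w.copy(), w.copy()
w_plus[h] += eps
w_minus[h] -= eps
numeric = (E(w_plus) - E(w_minus)) / (2 * eps)
print(chain, numeric)                # the two values agree closely
```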
Define the output-layer gradient term

$$\begin{aligned} g_j &= -\frac{\partial E_k}{\partial \hat{y}_j^k} \cdot \frac{\partial \hat{y}_j^k}{\partial \beta_j} \\ &= -\left(\hat{y}_j^k - y_j^k\right) f'\left(\beta_j - \theta_j\right) \\ &= \hat{y}_j^k\left(1 - \hat{y}_j^k\right)\left(y_j^k - \hat{y}_j^k\right), \end{aligned}$$

where the last step uses the property of the sigmoid function that $f'(x) = f(x)(1 - f(x))$.
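A quick numerical check of the sigmoid identity used in that last step, at an arbitrary made-up point:

```python
import numpy as np

def f(x):
    return 1 / (1 + np.exp(-x))

# f'(x) should equal f(x) * (1 - f(x))
x, eps = 0.7, 1e-6
numeric = (f(x + eps) - f(x - eps)) / (2 * eps)
print(numeric, f(x) * (1 - f(x)))   # both ≈ 0.2217
```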
Combining the expressions above gives

$$\Delta w_{hj} = \eta g_j b_h.$$
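For a whole layer this update has an outer-product structure, which is what the numpy implementation below exploits. A sketch with made-up vectors, assuming $q = 3$ hidden outputs and $l = 2$ output neurons:

```python
import numpy as np

eta = 0.1
b = np.array([0.5, 0.2, 0.9])   # hidden-layer outputs b_h (q = 3)
g = np.array([0.03, -0.01])     # output-layer gradient terms g_j (l = 2)

# delta_W[h, j] = eta * g_j * b_h for all h, j at once
delta_W = eta * np.outer(b, g)  # q x l matrix
print(delta_W)
```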
Similarly, for the output-layer thresholds,

$$\Delta \theta_j = -\eta g_j.$$
By the same reasoning, the hidden-layer gradient term is

$$\begin{aligned} e_h &= -\frac{\partial E_k}{\partial b_h} \cdot \frac{\partial b_h}{\partial \alpha_h} \\ &= -\sum_{j=1}^{l} \frac{\partial E_k}{\partial \beta_j} \cdot \frac{\partial \beta_j}{\partial b_h} f'\left(\alpha_h - \gamma_h\right) \\ &= f'\left(\alpha_h - \gamma_h\right) \sum_{j=1}^{l} w_{hj} g_j \\ &= b_h\left(1 - b_h\right) \sum_{j=1}^{l} w_{hj} g_j, \end{aligned}$$

where $\alpha_h$ is the input received by the $h$-th hidden neuron, $\gamma_h$ is its threshold, and the last step again uses $f'(x) = f(x)(1 - f(x))$. The remaining updates follow:

$$\Delta v_{ih} = \eta e_h x_i, \qquad \Delta \gamma_h = -\eta e_h,$$

where $v_{ih}$ is the weight from the $i$-th input neuron to the $h$-th hidden neuron and $x_i$ is the $i$-th component of the input.
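The hidden-layer term vectorizes the same way. A sketch with made-up arrays, reusing the shapes from the previous snippet:

```python
import numpy as np

eta = 0.1
x = np.array([0.3, 0.7])                 # inputs x_i (d = 2)
b = np.array([0.5, 0.2, 0.9])            # hidden outputs b_h (q = 3)
g = np.array([0.03, -0.01])              # output-layer terms g_j (l = 2)
W = np.array([[0.1, -0.2],
              [0.4,  0.3],
              [-0.5, 0.2]])              # weights w_hj, shape q x l

# e_h = b_h(1 - b_h) * sum_j w_hj g_j, for all h at once
e = b * (1 - b) * (W @ g)                # length-q vector

delta_V = eta * np.outer(x, e)           # delta_V[i, h] = eta * e_h * x_i
delta_gamma = -eta * e                   # threshold updates
print(e, delta_V, delta_gamma, sep="\n")
```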
The derivation above translates into the following numpy implementation (weight matrices only; the thresholds $\theta_j$ and $\gamma_h$ are omitted for simplicity):

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
data = iris.data
target = iris.target


class NeuralNetwork:
    def __init__(self, in_size, o_size, h_size):
        # Layer sizes: in_size inputs, h_size hidden units, o_size outputs
        self.in_size = in_size
        self.o_size = o_size
        self.h_size = h_size
        # Weight matrices; thresholds (theta_j, gamma_h) are omitted
        self.W1 = np.random.randn(in_size, h_size)  # n x q matrix (v_ih)
        self.W2 = np.random.randn(h_size, o_size)   # q x l matrix (w_hj)

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    # Map a continuous output in [0, 1] to a discrete class label;
    # only meaningful for a single-output network (o_size = 1)
    def ref(self, x):
        if x <= (1 / 3):
            return 0
        elif x <= (2 / 3):
            return 1
        else:
            return 2

    # X is an m x n matrix of inputs
    def forward(self, X):
        self.z2 = np.dot(X, self.W1)          # m x q
        self.act2 = self.sigmoid(self.z2)     # hidden outputs b_h
        self.z3 = np.dot(self.act2, self.W2)  # m x l
        # Keep the output continuous here; discretizing before backprop
        # would destroy the gradient information
        self.y_hat = self.sigmoid(self.z3)
        return self.y_hat

    # y is an m x l matrix of targets in [0, 1]
    def backward(self, X, y, y_hat, learning_rate):
        # Output-layer gradient term: g_j = y_hat(1 - y_hat)(y - y_hat)
        Grd_1 = y_hat * (1 - y_hat) * (y - y_hat)     # m x l
        # Output-layer update: delta w_hj = eta * g_j * b_h
        Delta_W2 = np.dot(self.act2.T, Grd_1)         # q x l
        # Hidden-layer gradient term: e_h = b_h(1 - b_h) * sum_j w_hj g_j
        Grd_2 = np.dot(Grd_1, self.W2.T) * self.act2 * (1 - self.act2)  # m x q
        # Hidden-layer update: delta v_ih = eta * e_h * x_i
        Delta_W1 = np.dot(X.T, Grd_2)                 # n x q
        # Update the weights
        self.W1 += learning_rate * Delta_W1
        self.W2 += learning_rate * Delta_W2

    def train(self, X, y, learning_rate, num_epochs):
        # Check that the shapes match
        if X.shape[0] != y.shape[0]:
            raise ValueError("X and y must have the same number of rows")
        for i in range(1, num_epochs + 1):
            y_hat = self.forward(X)
            self.backward(X, y, y_hat, learning_rate)
            # Report the mean squared error
            loss = np.mean((y - y_hat) ** 2)
            print(f"loss = {loss}, epochs/num_epochs: {i}/{num_epochs}")

    def predict(self, X):
        # Discretize the continuous outputs into class labels
        vec_rule = np.vectorize(self.ref)
        return vec_rule(self.forward(X))
```
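As a quick usage sketch, the class can be trained on the iris data loaded at the top of the script. The single-output design with `ref` suggests encoding the three class labels as 0, 0.5, and 1; this encoding and the hyperparameters below are my assumptions, not part of the original:

```python
from sklearn.model_selection import train_test_split

np.random.seed(0)
X_train, X_test, y_train, y_test = train_test_split(
    data, target, test_size=0.3, random_state=0)

# Assumed encoding: scale labels {0, 1, 2} into {0, 0.5, 1} so a single
# sigmoid output can regress them and ref() can decode predictions
y_train_enc = (y_train / 2).reshape(-1, 1)  # m x 1 targets

net = NeuralNetwork(in_size=4, o_size=1, h_size=8)
net.train(X_train, y_train_enc, learning_rate=0.01, num_epochs=500)

y_pred = net.predict(X_test).ravel()
print("accuracy:", np.mean(y_pred == y_test))
```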
Note: some of the formulas above are from Zhou Zhihua's watermelon book (Machine Learning).