The four fundamental equations of backpropagation (BP):
$$
\begin{aligned}
\delta_i^{(L)} &= \nabla_{y_i} Cost \cdot \sigma'\big(logit_i^{(L)}\big) \\
\delta_i^{(l)} &= \sum_j \delta_j^{(l+1)} w_{ji}^{(l+1)} \,\sigma'\big(logit_i^{(l)}\big) \\
\frac{\partial Cost}{\partial bias_i^{(l)}} &= \delta_i^{(l)} \\
\frac{\partial Cost}{\partial w_{ij}^{(l)}} &= \delta_i^{(l)} h_j^{(l-1)}
\end{aligned}
$$
Here, the superscript $(l)$ denotes the $l$-th layer (there are $L$ layers in total), and $i, j$ index the neurons within a layer.
The goal of the backpropagation equations is to obtain $\frac{\partial Cost}{\partial bias_i^{(l)}}$ and $\frac{\partial Cost}{\partial w_{ij}^{(l)}}$.
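Before the derivation, a minimal forward-pass sketch may help pin down the notation. This is a hypothetical numpy setup, assuming $h^{(l)} = \sigma(logit^{(l)})$ with a sigmoid activation $\sigma$; the layer sizes and variable names are illustrative, not from the original text:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative 3-layer network: input -> hidden -> output.
sizes = [4, 5, 3]
rng = np.random.default_rng(0)
W = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]  # W[l][i, j] = w_ij^(l)
b = [rng.standard_normal(m) for m in sizes[1:]]                           # b[l][i]   = bias_i^(l)

def forward(x):
    """Return the logits and activations of every layer."""
    h = x                      # h^(0) is the input
    logits, hs = [], [x]
    for Wl, bl in zip(W, b):
        z = Wl @ h + bl        # logit_i^(l) = sum_j w_ij^(l) h_j^(l-1) + bias_i^(l)
        h = sigmoid(z)         # h^(l) = sigma(logit^(l))
        logits.append(z)
        hs.append(h)
    return logits, hs
```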
During the derivation,
$$
\frac{\partial Cost}{\partial bias_i^{(l)}} = \frac{\partial Cost}{\partial logit_i^{(l)}} \cdot \frac{\partial logit_i^{(l)}}{\partial bias_i^{(l)}}, \qquad
\frac{\partial Cost}{\partial w_{ij}^{(l)}} = \frac{\partial Cost}{\partial logit_i^{(l)}} \cdot \frac{\partial logit_i^{(l)}}{\partial w_{ij}^{(l)}}
$$
we find that both expressions require $\frac{\partial Cost}{\partial logit_i^{(l)}}$.
Now, since

$$
logit_i^{(l)} = w_{ij}^{(l)} h_j^{(l-1)} + \sum_{k\ne j} w_{ik}^{(l)} h_{k}^{(l-1)} + bias_i^{(l)},
$$
it follows that

$$
\frac{\partial logit_i^{(l)}}{\partial bias_i^{(l)}} = 1, \qquad
\frac{\partial logit_i^{(l)}}{\partial w_{ij}^{(l)}} = h_j^{(l-1)}.
$$
The only remaining problem is to compute $\frac{\partial Cost}{\partial logit_i^{(l)}}$, which can be done recursively.
To keep the formulas compact, we write $\frac{\partial Cost}{\partial logit_i^{(l)}}$ as $\delta_i^{(l)}$. Then
$$
\delta_i^{(l)} = \frac{\partial Cost}{\partial logit_i^{(l)}}
= \sum_j \frac{\partial Cost}{\partial logit_j^{(l+1)}} \cdot \frac{\partial logit_j^{(l+1)}}{\partial logit_i^{(l)}}
= \sum_j \delta_j^{(l+1)} \cdot \frac{\partial logit_j^{(l+1)}}{\partial logit_i^{(l)}}
$$

Since $logit_j^{(l+1)} = \sum_k w_{jk}^{(l+1)} \sigma\big(logit_k^{(l)}\big) + bias_j^{(l+1)}$ (using $h_k^{(l)} = \sigma(logit_k^{(l)})$), we have $\frac{\partial logit_j^{(l+1)}}{\partial logit_i^{(l)}} = w_{ji}^{(l+1)} \sigma'\big(logit_i^{(l)}\big)$, which yields the second of the four equations, $\delta_i^{(l)} = \sum_j \delta_j^{(l+1)} w_{ji}^{(l+1)} \sigma'(logit_i^{(l)})$. At the output layer, applying the chain rule to the cost directly gives the first equation, $\delta_i^{(L)} = \nabla_{y_i} Cost \cdot \sigma'(logit_i^{(L)})$.
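Putting the four equations together, a minimal backward pass might look like the following, continuing the sketch above. It assumes a squared-error cost $Cost = \frac{1}{2}\|h^{(L)} - y\|^2$, so $\nabla_{y_i} Cost = h_i^{(L)} - y_i$; all names are illustrative:

```python
def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def backward(x, y):
    """Compute dCost/dbias and dCost/dW for every layer via the four BP equations."""
    logits, hs = forward(x)
    L = len(W)
    grad_b = [None] * L
    grad_W = [None] * L

    # BP1: delta^(L) = (h^(L) - y) * sigma'(logit^(L))   (squared-error cost)
    delta = (hs[-1] - y) * sigmoid_prime(logits[-1])
    for l in range(L - 1, -1, -1):
        grad_b[l] = delta                    # BP3: dCost/dbias_i^(l) = delta_i^(l)
        grad_W[l] = np.outer(delta, hs[l])   # BP4: dCost/dw_ij^(l)  = delta_i^(l) h_j^(l-1)
        if l > 0:
            # BP2: delta_i^(l) = sum_j delta_j^(l+1) w_ji^(l+1) * sigma'(logit_i^(l))
            delta = (W[l].T @ delta) * sigmoid_prime(logits[l - 1])
    return grad_b, grad_W

# Illustrative usage (x, y are arbitrary):
x = rng.standard_normal(sizes[0])
y = rng.standard_normal(sizes[-1])
grad_b, grad_W = backward(x, y)
```

A finite-difference check, perturbing a single weight by $\pm\varepsilon$ and comparing the resulting cost change against the corresponding entry of `grad_W`, is a quick way to confirm the recursion.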