[Figure: schematic diagram of a neuron cell]
The Sigmoid activation function and its derivative:

$$Sigmoid(x) = \frac{1}{1+e^{-x}}$$

$$Sigmoid'(x) = Sigmoid(x) * (1 - Sigmoid(x))$$
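As a quick sanity check (a minimal sketch, not part of the original derivation; the helper names are illustrative), the derivative identity can be verified numerically against a central finite difference:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_de(x):
    return sigmoid(x) * (1 - sigmoid(x))

# Compare the analytic derivative with a central finite difference on a grid.
x = np.linspace(-5, 5, 11)
eps = 1e-6
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
print(np.max(np.abs(numeric - sigmoid_de(x))))  # tiny, on the order of 1e-10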
Notation:

X : input vector
Y_true : the true result corresponding to X
Y_pred : the result predicted for X
Yi : one component of the vector
h_in : hidden-layer input
h_out : hidden-layer output (after the activation function)
h_w : hidden-layer weight matrix
output_in : output-layer input
output_out : output-layer output (after the activation function)
output_w : output-layer weight matrix

For a given training example d, the error function is:
$$Error = \frac{1}{2} * \sum_{i \in output}(Yi_{true}-Yi_{pred})^2$$
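As a worked instance with made-up numbers: for two output nodes with $Y_{true} = (1, 0)$ and $Y_{pred} = (0.8, 0.3)$,

$$Error = \frac{1}{2}\left((1-0.8)^2 + (0-0.3)^2\right) = \frac{1}{2}(0.04 + 0.09) = 0.065$$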
From this, the partial derivative of the total error with respect to the output-layer weight $w_{35}$ is:
$$\frac{\partial{E}}{\partial{w_{35}}} = \frac{\partial{E}}{\partial{output5\_out}} * \frac{\partial{output5\_out}}{\partial{output5\_in}} * \frac{\partial{output5\_in}}{\partial{w_{35}}}$$
Expanding each term:
$$\frac{\partial{E}}{\partial{output5\_out}} = -(Y5_{true} - output5\_out)$$
$$\frac{\partial{output5\_out}}{\partial{output5\_in}} = Sigmoid(output5\_in)*(1-Sigmoid(output5\_in))$$
Since $output5\_in = h3\_out * w_{35} + h4\_out * w_{45}$:

$$\frac{\partial{output5\_in}}{\partial{w_{35}}} = h3\_out$$
The final partial derivative is:
$$\frac{\partial{E}}{\partial{w_{35}}} = -(Y5_{true} - output5\_out) * Sigmoid(output5\_in)*(1-Sigmoid(output5\_in)) * h3\_out$$
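As a sanity check on this formula (a minimal sketch with made-up numbers; the names mirror the notation above and are not from the original), the analytic gradient for $w_{35}$, which sits at ow[0, 0] here, can be compared with a numerical finite-difference gradient in a tiny 2-2-2 network:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Tiny 2-2-2 network with made-up weights and inputs (hypothetical values).
x = np.array([[0.5], [0.1]])                  # input column vector (x1, x2)
y_true = np.array([[1.0], [0.0]])             # targets (Y5_true, Y6_true)
hw = np.array([[0.1, 0.2], [0.3, 0.4]])       # hidden-layer weight matrix
ow = np.array([[0.5, 0.6], [0.7, 0.8]])       # output-layer weights; ow[0, 0] is w_35

def forward(ow_mat):
    h_out = sigmoid(np.dot(hw, x))            # hidden-layer output
    output_in = np.dot(ow_mat, h_out)         # output-layer input
    return h_out, output_in, sigmoid(output_in)

def error(ow_mat):
    _, _, output_out = forward(ow_mat)
    return 0.5 * np.sum((y_true - output_out) ** 2)

# Analytic gradient for w_35, straight from the formula above.
h_out, output_in, output_out = forward(ow)
analytic = (-(y_true[0, 0] - output_out[0, 0])
            * sigmoid(output_in[0, 0]) * (1 - sigmoid(output_in[0, 0]))
            * h_out[0, 0])

# Numeric gradient by central finite difference on the same weight.
eps = 1e-6
ow_plus, ow_minus = ow.copy(), ow.copy()
ow_plus[0, 0] += eps
ow_minus[0, 0] -= eps
numeric = (error(ow_plus) - error(ow_minus)) / (2 * eps)

print(analytic, numeric)  # the two values agree to roughly 1e-9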
In general, the partial derivative for a hidden-layer weight has the form:
$$\frac{\partial{E}}{\partial{w_{13}}} = \frac{\partial{E}}{\partial{h3\_out}} * \frac{\partial{h3\_out}}{\partial{h3\_in}} * \frac{\partial{h3\_in}}{\partial{w_{13}}}$$
Expanding (since $h3\_in = x_1 * w_{13} + x_2 * w_{23}$, the last factor is $x_1$):
$$\frac{\partial{E}}{\partial{w_{13}}} = \frac{\partial{E}}{\partial{h3\_out}} * Sigmoid(h3\_in) * (1 - Sigmoid(h3\_in)) * x_1$$
Looking closely, we only need to differentiate through the nodes that the weight $w_{13}$ directly influences, i.e. the downstream nodes of $h3$:
$$\frac{\partial{E}}{\partial{h3\_out}} = \sum_{i \in Downstream(h3)} \frac{\partial{E}}{\partial{outputi\_out}} * \frac{\partial{outputi\_out}}{\partial{outputi\_in}} * \frac{\partial{outputi\_in}}{\partial{h3\_out}}$$
Expanding each term:
$$\frac{\partial{E}}{\partial{h3\_out}} = \sum_{i \in Downstream(h3)} -(Yi_{true} - outputi\_out) * Sigmoid(outputi\_in) * (1 - Sigmoid(outputi\_in)) * w_{3i}$$
Each weight is then adjusted in the direction that reduces the error:

$$w_{35} = w_{35} - \Delta w_{35}$$
Define the learning rate $r$; then:

$$w_{35} = w_{35} - r * \frac{\partial{E}}{\partial{w_{35}}}$$
Output-layer weight updates (substituting the gradient from above; the two minus signs cancel, giving a plus):
$$w_{35} = w_{35} + r * (Y5_{true} - output5\_out) * Sigmoid(output5\_in)*(1-Sigmoid(output5\_in)) * h3\_out$$
$$w_{45} = w_{45} + r * (Y5_{true} - output5\_out) * Sigmoid(output5\_in)*(1-Sigmoid(output5\_in)) * h4\_out$$
$$w_{36} = w_{36} + r * (Y6_{true} - output6\_out) * Sigmoid(output6\_in)*(1-Sigmoid(output6\_in)) * h3\_out$$
$$w_{46} = w_{46} + r * (Y6_{true} - output6\_out) * Sigmoid(output6\_in)*(1-Sigmoid(output6\_in)) * h4\_out$$
Simplifying with matrix operations:

$$ow += r * \begin{pmatrix} Y5_{true} - Y5_{pred} \\ Y6_{true} - Y6_{pred} \end{pmatrix} \cdot Sigmoid'\begin{pmatrix} output5\_in \\ output6\_in \end{pmatrix} \times \begin{pmatrix} h3\_out & h4\_out \end{pmatrix}$$
Simplifying further (vectors are column vectors by default; $\cdot$ denotes elementwise multiplication and $\times$ matrix multiplication):

$$ow += r * (Y_{true} - Y_{pred}) \cdot Sigmoid'(output\_in) \times h\_out.T$$
Let:

$$output\_errors = (Y_{true} - Y_{pred})$$
Then:

$$ow += r * (output\_errors \cdot Sigmoid'(output\_in)) \times h\_out.T$$
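A minimal NumPy sketch of this update with made-up numbers (the $\cdot$ is elementwise multiplication, the $\times$ an outer product; shapes are for the 2-2-2 example):

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_de(x):
    return sigmoid(x) * (1 - sigmoid(x))

r = 0.1
h_out = np.array([[0.6], [0.4]])       # (2, 1) hidden-layer outputs (h3_out, h4_out)
output_in = np.array([[0.2], [0.3]])   # (2, 1) output-layer inputs
y_true = np.array([[1.0], [0.0]])      # (2, 1) targets
output_errors = y_true - sigmoid(output_in)   # (2, 1)

# Elementwise scaling, then an outer product with h_out.T:
delta_ow = r * np.dot(output_errors * sigmoid_de(output_in), h_out.T)
print(delta_ow.shape)  # (2, 2): one increment per output-layer weight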
Hidden-layer weight updates:
$$w_{13} = w_{13} + r * Sigmoid'(h3\_in) * x_1 * \sum_{i \in Downstream(h3)} (Yi_{true} - outputi\_out) * Sigmoid'(outputi\_in) * w_{3i}$$
$$w_{23} = w_{23} + r * Sigmoid'(h3\_in) * x_2 * \sum_{i \in Downstream(h3)} (Yi_{true} - outputi\_out) * Sigmoid'(outputi\_in) * w_{3i}$$

$$w_{14} = w_{14} + r * Sigmoid'(h4\_in) * x_1 * \sum_{i \in Downstream(h4)} (Yi_{true} - outputi\_out) * Sigmoid'(outputi\_in) * w_{4i}$$
$$w_{24} = w_{24} + r * Sigmoid'(h4\_in) * x_2 * \sum_{i \in Downstream(h4)} (Yi_{true} - outputi\_out) * Sigmoid'(outputi\_in) * w_{4i}$$
Simplifying with matrices (writing the update for the weight from input $x_i$ to hidden node $j$, and noting $outputk\_out = Yk_{pred}$):

$$hw_{ij} += r * Sigmoid'(hj\_in) * x_i * \sum_{k \in Downstream(hj)} (Yk_{true} - Yk_{pred}) * Sigmoid'(outputk\_in) * w_{jk}$$
Let (vectors are column vectors by default):

$$hidden\_errors = output\_w.T \times ((Y_{true} - Y_{pred}) \cdot Sigmoid'(output\_in))$$
Substituting (vectors are column vectors by default):

$$hw += r * (hidden\_errors \cdot Sigmoid'(h\_in)) \times X.T$$
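Again as a minimal NumPy sketch with made-up numbers (names mirror the formulas; not from the original code):

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_de(x):
    return sigmoid(x) * (1 - sigmoid(x))

r = 0.1
X = np.array([[0.5], [0.1]])                    # (2, 1) input column vector
h_in = np.array([[0.07], [0.19]])               # (2, 1) hidden-layer inputs
output_in = np.array([[0.2], [0.3]])            # (2, 1) output-layer inputs
output_w = np.array([[0.5, 0.6], [0.7, 0.8]])   # (2, 2) output weight matrix
output_errors = np.array([[0.4], [-0.6]])       # (2, 1) Y_true - Y_pred

# Propagate the scaled output error backwards through output_w.T ...
hidden_errors = np.dot(output_w.T, output_errors * sigmoid_de(output_in))   # (2, 1)
# ... then take the outer product with the input vector:
delta_hw = r * np.dot(hidden_errors * sigmoid_de(h_in), X.T)                # (2, 2)
print(delta_hw.shape)  # (2, 2): one increment per hidden-layer weight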
# Author: NineSun
# Date: 2019-10-12
import numpy as np

# Sigmoid activation function
def activate(x):
    return 1 / (1 + np.exp(-x))

# Derivative of the activation: Sigmoid'(x) = Sigmoid(x) * (1 - Sigmoid(x))
def activate_de(x):
    return activate(x) * (1 - activate(x))

class NeuralNetwork:
    # Constructor: set up the network
    def __init__(self):
        # Hidden-layer weight matrix
        self.hw = np.random.normal(0.0, 1.0, (2, 2))
        # Output-layer weight matrix
        self.ow = np.random.normal(0.0, 1.0, (2, 2))
        # Learning rate
        self.r = 0.1
        # Number of training epochs
        self.epoch = 100

    # Predict the result for an input vector x
    def predict(self, x):
        # Convert the input to a column vector
        x = np.array(x, ndmin=2).T
        # Hidden-layer input
        h_in = np.dot(self.hw, x)
        # Hidden-layer output
        h_out = activate(h_in)
        # Output-layer input
        output_in = np.dot(self.ow, h_out)
        # Output-layer output
        output_out = activate(output_in)
        return output_out

    # Train the network
    # x_data: input samples
    # ytrue_data: target outputs
    def train(self, x_data, ytrue_data):
        # Make `epoch` passes over the data
        for i in range(self.epoch):
            # Train once on each sample
            for data in zip(x_data, ytrue_data):
                self.train_once(data[0], data[1])

    # Train on a single sample (x, ytrue)
    def train_once(self, x, ytrue):
        '''
        :param x: input row vector
        :param ytrue: target row vector
        :return:
        '''
        # Convert the input and target to column vectors
        x = np.array(x, ndmin=2).T
        ytrue = np.array(ytrue, ndmin=2).T
        # Forward pass: compute Ypred (output_out)
        h_in = np.dot(self.hw, x)
        h_out = activate(h_in)
        output_in = np.dot(self.ow, h_out)
        output_out = activate(output_in)
        # Output-layer error
        output_errors = ytrue - output_out
        cal_tmp = output_errors * activate_de(output_in)
        # Hidden-layer error (computed with the pre-update output weights)
        hidden_errors = np.dot(self.ow.T, cal_tmp)
        # Update the output-layer weights
        self.ow += self.r * np.dot(cal_tmp, h_out.T)
        # Update the hidden-layer weights
        self.hw += self.r * np.dot(hidden_errors * activate_de(h_in), x.T)

# 1. Prepare the data
in_data = [
    [165, 55],
    [160, 53],
    [175, 55],
    [163, 55],
    [173, 49],
    [163, 56],
    [180, 77],
    [155, 54],
    [176, 79],
    [161, 49],
    [180, 60],
    [168, 57],
    [172, 69],
    [166, 50],
    [172, 90],
    [163, 51],
    [175, 70],
    [164, 53],
    [160, 45],
    [160, 56],
    [182, 70],
    [180, 74],
    [185, 73],
    [165, 55],
    [185, 75]
]
# One target label per input sample
out_data = [
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1],
    [1, 0]
]
# Test data
in_test = [
    [165, 55],
    [160, 53],
    [175, 55],
    [163, 55],
    [173, 49]
]
out_test = [
    [0, 1],
    [0, 1],
    [1, 0],
    [0, 1],
    [1, 0],
]
# 2. Build the network
nn = NeuralNetwork()
# 3. Train
nn.train(in_data, out_data)
# 4. Predict
for test in zip(in_test, out_test):
    ypred = nn.predict(test[0])
    print("in:", test[0], "ytrue:", test[1])
    print("ypred:", ypred)
This code can be run as-is.
Two possible extensions: let the number of neurons in the input, hidden, and output layers be specified freely;
and save the network structure, including the learning rate, the number of training epochs, and all the weight matrices, to a file, so that later runs can load it and predict directly without retraining.
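For the save-to-file idea, one possible sketch using NumPy's npz format (the helpers save_network and load_network are illustrative, not part of the original):

import numpy as np

# Persist the network structure: learning rate, epoch count, both weight matrices.
def save_network(nn, path):
    np.savez(path, hw=nn.hw, ow=nn.ow, r=nn.r, epoch=nn.epoch)

# Restore a trained network so it can predict immediately, without retraining.
def load_network(nn, path):
    data = np.load(path)
    nn.hw, nn.ow = data["hw"], data["ow"]
    nn.r, nn.epoch = float(data["r"]), int(data["epoch"])

# Usage after training:
#   save_network(nn, "nn_weights.npz")
#   load_network(nn, "nn_weights.npz")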