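- import torch
- import torch.nn.functional as F
- from torch import optim
-
- # Each weight has shape (out_features, in_features); biases start at zero.
- w1, b1 = torch.randn(200, 784, requires_grad=True),\
-          torch.zeros(200, requires_grad=True)
- w2, b2 = torch.randn(200, 200, requires_grad=True),\
-          torch.zeros(200, requires_grad=True)
- w3, b3 = torch.randn(10, 200, requires_grad=True),\
-          torch.zeros(10, requires_grad=True)
-
- def forward(x):
-     # x: (batch, 784) -> (batch, 200)
-     x = x @ w1.t() + b1
-     x = F.relu(x)
-     # (batch, 200) -> (batch, 200)
-     x = x @ w2.t() + b2
-     x = F.relu(x)
-     # (batch, 200) -> (batch, 10); ReLU is applied to the output layer as well
-     x = x @ w3.t() + b3
-     x = F.relu(x)
-     return x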
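Note the argument order when initializing w1: in torch.randn(200, 784), 784 is the input dimension and 200 is the output dimension, so the shape is (out, in), the reverse of what you might expect. This is why forward multiplies by w1.t() rather than by w1.
Also, initializing the weights from the default Gaussian (torch.randn, standard deviation 1) sometimes gives poor results, in which case a better initialization method is needed.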
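To illustrate the note above on default Gaussian initialization, here is a minimal sketch in plain Python (no PyTorch required) that compares the spread of one hidden unit's pre-activation under two weight scales. It assumes unit-variance inputs, which real MNIST pixels are not, and it uses He-style scaling (standard deviation sqrt(2 / fan_in)) purely as one common alternative; the source does not say which method the author prefers. The helper name preactivation_std is hypothetical. In PyTorch, torch.nn.init.kaiming_normal_ applies this kind of scaling to a tensor in place.

```python
import math
import random


def preactivation_std(fan_in, weight_std, trials=500, seed=0):
    """Empirical std of z = sum_i w_i * x_i for a single unit with fan_in inputs."""
    rng = random.Random(seed)
    zs = []
    for _ in range(trials):
        z = 0.0
        for _ in range(fan_in):
            x = rng.gauss(0.0, 1.0)         # assumption: unit-variance input
            w = rng.gauss(0.0, weight_std)  # weight drawn from N(0, weight_std^2)
            z += w * x
        zs.append(z)
    mean = sum(zs) / trials
    return math.sqrt(sum((z - mean) ** 2 for z in zs) / trials)


fan_in = 784  # input size of the first layer in the network above

# Weights with std 1, as torch.randn produces; theory gives sqrt(fan_in) = 28.
plain_std = preactivation_std(fan_in, weight_std=1.0)

# He-style scaling; theory gives sqrt(2), independent of fan_in.
he_std = preactivation_std(fan_in, weight_std=math.sqrt(2.0 / fan_in))

print(f"std-1 weights:     std(z) = {plain_std:.2f}")
print(f"He-scaled weights: std(z) = {he_std:.2f}")
```

With standard deviation 1, the pre-activation scale grows with the square root of the layer width, so activations (and gradients) can become very large after a few layers. Scaling the weights by the fan-in keeps that scale roughly constant from layer to layer.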
- optimizer = optim.SGD([w1, b1, w2, b2, w3, b3], lr=learning_rate)
- cr