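A lot of the code out there has problems, but I am going to leave that alone: I am not here to debug it. I want to keep my eyes on my own goal, which is to lay out a general path through reinforcement learning so that I, and anyone who reads these posts, can avoid a few detours while learning. So I start with Q-learning and Sarsa. That basic code does not depend on any framework, so it mostly works. Deep reinforcement learning, however, relies on TensorFlow or PyTorch, and both frameworks have gone through two major versions, so compatibility between older and newer code is poor and earlier work is hard to reuse.
I have collected plenty of study code, and much of it simply does not run. Since I have already spent a lot of effort on it, I now want to record the debugging process directly. Enough talk: of all this code, how much actually works? Let's test it.
Sure enough, running code6-2 fails: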
C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\site-packages\gym\envs\registration.py:556: UserWarning: WARN: The environment CartPole-v0 is out of date. You should consider upgrading to version `v1`.
f"The environment {id} is out of date. You should consider "
C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\site-packages\numpy\lib\shape_base.py:591: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
a = asanyarray(a)
Traceback (most recent call last):
File "C:/Python program/Deep-Reiforcement-Learning-main/code/Code6/code6-2 DQN-2015算法求解倒立摆问题代码.py", line 220, in <module>
agent.train() #训练智能体
File "C:/Python program/Deep-Reiforcement-Learning-main/code/Code6/code6-2 DQN-2015算法求解倒立摆问题代码.py", line 145, in train
action = self.egreedy_action(state)
File "C:/Python program/Deep-Reiforcement-Learning-main/code/Code6/code6-2 DQN-2015算法求解倒立摆问题代码.py", line 70, in egreedy_action
state = torch.from_numpy(np.expand_dims(state,0))
TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.
Presumably the environment version is wrong. Fine, I change it to v1.
Run it again:
C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\site-packages\numpy\lib\shape_base.py:591: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
a = asanyarray(a)
Traceback (most recent call last):
File "C:/Python program/Deep-Reiforcement-Learning-main/code/Code6/code6-2 DQN-2015算法求解倒立摆问题代码.py", line 220, in <module>
agent.train() #训练智能体
File "C:/Python program/Deep-Reiforcement-Learning-main/code/Code6/code6-2 DQN-2015算法求解倒立摆问题代码.py", line 145, in train
action = self.egreedy_action(state)
File "C:/Python program/Deep-Reiforcement-Learning-main/code/Code6/code6-2 DQN-2015算法求解倒立摆问题代码.py", line 70, in egreedy_action
state = torch.from_numpy(np.expand_dims(state,0))
TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.
Process finished with exit code 1
Still an error. OK, one more change:
state = torch.from_numpy(np.expand_dims(state, 0)).float()
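Then I run it, and it still fails. That is not surprising: .float() is only applied to whatever torch.from_numpy returns, so it cannot help with an array that torch.from_numpy itself rejects.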
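Judging from the ragged-sequence warning in the traceback, a likely culprit is not the environment id but the gym API itself: from gym 0.26 on, `env.reset()` returns an `(observation, info)` tuple and `env.step()` returns five values, so a `state` taken straight from `reset()` ends up as a `numpy.object_` array. The script's `train()` method is not shown here, so the following is only a sketch under that assumption; `unpack_reset` and `unpack_step` are hypothetical helper names, not part of gym or of the original code.

```python
import numpy as np


def unpack_reset(result):
    """Return the observation from env.reset() under either gym API."""
    # Assumption: newer gym returns (observation, info_dict),
    # older gym returns the observation alone.
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
        result = result[0]
    return np.asarray(result, dtype=np.float32)


def unpack_step(result):
    """Return (next_state, reward, done) from env.step() under either gym API."""
    if len(result) == 5:
        # Newer API: (obs, reward, terminated, truncated, info)
        obs, reward, terminated, truncated, _info = result
        done = terminated or truncated
    else:
        # Older API: (obs, reward, done, info)
        obs, reward, done, _info = result
    return np.asarray(obs, dtype=np.float32), float(reward), bool(done)
```

With helpers like these, the training loop would call `state = unpack_reset(env.reset())` and `next_state, reward, done = unpack_step(env.step(action))`. Whether that alone is enough to get code6-2 running is untested here.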
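Then I try the other scripts, and none of them runs either. Never mind.
I almost forgot: I debugged this repository just a few days ago. Practically all of its deep reinforcement learning code fails to run because of version problems, so I am giving up on it.
Plotting: using matplotlib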
import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt

# 200 evenly spaced sample points in [-5, 5]
x = torch.linspace(-5, 5, 200)
x_np = x.numpy()

# Evaluate each activation function and convert to NumPy for plotting
y_relu = torch.relu(x).numpy()
y_sigmoid = torch.sigmoid(x).numpy()
y_tanh = torch.tanh(x).numpy()
y_softplus = F.softplus(x).numpy()

# Plot the four curves on a 2x2 grid
plt.figure(1, figsize=(8, 6))

plt.subplot(221)
plt.plot(x_np, y_relu, c='red', label='relu')
plt.ylim((-1, 5))
plt.legend(loc='best')

plt.subplot(222)
plt.plot(x_np, y_sigmoid, c='red', label='sigmoid')
plt.ylim((-0.2, 1.2))
plt.legend(loc='best')

plt.subplot(223)
plt.plot(x_np, y_tanh, c='red', label='tanh')
plt.ylim((-1.2, 1.2))
plt.legend(loc='best')

plt.subplot(224)
plt.plot(x_np, y_softplus, c='red', label='softplus')
plt.ylim((-0.2, 6))
plt.legend(loc='best')

plt.show()
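Neural networks are usually said to need a GPU, although a network as small as the one below also trains on the CPU (the code never moves anything to a GPU):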
import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt
from torch.utils.tensorboard import SummaryWriter

log_dirs = "logs/regre"
writer = SummaryWriter(log_dir=log_dirs)

x = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1)  # x data (tensor), shape=(100, 1)
y = x.pow(2) + 0.2 * torch.rand(x.size())  # noisy y data (tensor), shape=(100, 1)

class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        super(Net, self).__init__()
        self.hidden = torch.nn.Linear(n_feature, n_hidden)  # hidden layer
        self.predict = torch.nn.Linear(n_hidden, n_output)  # output layer

    def forward(self, x):
        x = F.relu(self.hidden(x))  # activation function for hidden layer
        x = self.predict(x)  # linear output
        return x

net = Net(n_feature=1, n_hidden=10, n_output=1)  # define the network
print(net)  # net architecture

optimizer = torch.optim.SGD(net.parameters(), lr=0.2)
loss_func = torch.nn.MSELoss()  # mean squared error loss for regression

plt.ion()  # interactive mode, so the figure updates during training

for t in range(200):
    prediction = net(x)  # input x and predict based on x
    loss = loss_func(prediction, y)  # must be (1. nn output, 2. target)

    optimizer.zero_grad()  # clear gradients for next train
    loss.backward()  # backpropagation, compute gradients
    optimizer.step()  # apply gradients

    # log the loss against the step t so every point gets its own x position
    writer.add_scalar("Loss", loss.item(), t)

    if t % 5 == 0:
        # plot and show learning process
        plt.cla()
        plt.scatter(x.numpy(), y.numpy())
        plt.plot(x.numpy(), prediction.detach().numpy(), 'r-', lw=5)
        plt.text(0.5, 0, 'Loss=%.4f' % loss.item(), fontdict={'size': 20, 'color': 'red'})
        plt.pause(0.1)

writer.close()
plt.ioff()
plt.show()
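The code above is a regression example with a neural network, to which I also added TensorBoard logging so the training can be inspected. The result is as follows:
How to use TensorBoard in PyTorch
1. Import the required library: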
from torch.utils.tensorboard import SummaryWriter
2. Create a SummaryWriter object:
writer = SummaryWriter()
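3. During training, write the metrics you want to record (loss, accuracy, and so on) to the SummaryWriter object:
# inside the training loop, e.g. once per epoch or once per iteration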
writer.add_scalar('Loss', loss, global_step)
writer.add_scalar('Accuracy', accuracy, global_step)
4. After training finishes, close the SummaryWriter object:
writer.close()
5. Start the TensorBoard server:
In a terminal, start the TensorBoard server on the chosen directory with the following command:
tensorboard --logdir=logs
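6. View the TensorBoard visualizations in a browser:
Open a browser and go to the TensorBoard server address (http://localhost:6006 by default) to see the training process and model performance visualized.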
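If the page comes up empty, one thing worth checking is whether the directory passed to `--logdir` actually contains event files, which TensorBoard writes with names beginning `events.out.tfevents.`. The snippet below is a small sketch for that check; `find_event_files` is a hypothetical helper name and uses only the standard library.

```python
import os


def find_event_files(logdir="logs"):
    """Return the paths of TensorBoard event files found anywhere under logdir."""
    found = []
    for root, _dirs, files in os.walk(logdir):
        for name in files:
            # SummaryWriter names its output files events.out.tfevents.<...>
            if name.startswith("events.out.tfevents."):
                found.append(os.path.join(root, name))
    return sorted(found)


if __name__ == "__main__":
    for path in find_event_files("logs"):
        print(path)
```

An empty list usually means the writer logged somewhere else, or the command was started from a different working directory than the training script.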
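logs is a directory path that you define yourself in your code. It holds the TensorBoard log and event files, which the TensorBoard server reads and visualizes.
Typically you make logs a subdirectory of your project root, or create a new directory in any other suitable place. The path can be absolute or relative.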
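Here is an example that defines the logs directory as a subdirectory of the project root (taken here to be the current working directory):
import os
logs_dir = os.path.join(os.getcwd(), 'logs')
writer = SummaryWriter(log_dir=logs_dir)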
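Since TensorBoard shows each subdirectory of the log directory as a separate run, it can be handy to give every training run its own folder rather than writing everything into one place. The following is a minimal sketch of that idea; `make_run_dir` and the timestamp naming scheme are assumptions made for illustration, not something the original code does.

```python
import os
import time


def make_run_dir(base_dir="logs", run_name=None):
    """Create and return a sub-directory of base_dir for one training run."""
    if run_name is None:
        # Assumed naming scheme: one folder per start time, e.g. run-20240101-120000
        run_name = time.strftime("run-%Y%m%d-%H%M%S")
    run_dir = os.path.abspath(os.path.join(base_dir, run_name))
    os.makedirs(run_dir, exist_ok=True)
    return run_dir
```

It would be used as `writer = SummaryWriter(log_dir=make_run_dir("logs"))`, after which `tensorboard --logdir=logs` lists the runs side by side.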
And that's it.