
Paddle Exercise (2): Model Training and Prediction with the Basic API


Model training:

The code for training the model is shown below.

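The loss function used here is mean squared error (MSE), the most common choice for linear regression. It measures the difference between the house prices the model predicts and the actual prices.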
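As a quick illustration of the metric, here is a minimal NumPy sketch of mean squared error. The function name `mse` and the sample prices are made up for this example; the training code itself relies on `F.mse_loss`.

```python
import numpy as np

def mse(y_pred, y_true):
    """Mean squared error: the average of the squared differences."""
    y_pred = np.asarray(y_pred, dtype=np.float64)
    y_true = np.asarray(y_true, dtype=np.float64)
    return float(np.mean((y_pred - y_true) ** 2))

# Hypothetical predicted and actual prices (unit: $1000)
predicted = [24.0, 20.0, 30.0]
actual = [22.0, 21.0, 33.0]

# Differences are 2, -1 and -3, so the squared errors are 4, 1 and 9
print(mse(predicted, actual))
```

Because the errors are squared, a single large miss raises the loss far more than several small ones.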

The loss is minimized with gradient descent, here Paddle's SGD optimizer applied to mini-batches.
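To make the update rule concrete, the following is a small NumPy sketch of gradient descent for a one-feature linear model under MSE loss. The helper `gradient_descent_step`, the toy data, and the learning rate are assumptions chosen for illustration; they are not taken from the housing example.

```python
import numpy as np

def gradient_descent_step(w, b, x, y, lr):
    """One gradient-descent update for y ~ w * x + b under MSE loss."""
    error = w * x + b - y
    grad_w = 2.0 * np.mean(error * x)  # derivative of MSE with respect to w
    grad_b = 2.0 * np.mean(error)      # derivative of MSE with respect to b
    return w - lr * grad_w, b - lr * grad_b

# Toy data generated from y = 2x + 1 (illustrative, not the housing data)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

w, b = 0.0, 0.0
for _ in range(2000):
    w, b = gradient_descent_step(w, b, x, y, lr=0.05)

print(round(float(w), 3), round(float(b), 3))
```

On this toy data the parameters approach w = 2 and b = 1. In the training code, `cost.backward()` computes the gradients and `optimizer.step()` applies the same kind of update to the model parameters.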

    import matplotlib
    import numpy as np
    import paddle
    import paddle.nn.functional as F

    # housing_data, BATCH_SIZE, Regressor, draw_train_process, train_nums and
    # train_costs are assumed to be defined as in Paddle Exercise (1).

    # Split the data into training and test sets at a ratio of 8:2
    ratio = 0.8
    offset = int(housing_data.shape[0] * ratio)
    train_data = housing_data[:offset]
    test_data = housing_data[offset:]

    y_preds = []
    labels_list = []

    def train(model):
        print('start training ... ')
        # Switch the model to training mode
        model.train()
        EPOCH_NUM = 500
        train_num = 0
        optimizer = paddle.optimizer.SGD(learning_rate=0.001, parameters=model.parameters())
        for epoch_id in range(EPOCH_NUM):
            # Shuffle the training data in place before each epoch
            np.random.shuffle(train_data)
            # Split the training data into mini-batches of 20 samples each
            mini_batches = [train_data[k: k+BATCH_SIZE] for k in range(0, len(train_data), BATCH_SIZE)]
            for batch_id, data in enumerate(mini_batches):
                # The first 13 columns are features, the last column is the label
                features_np = np.array(data[:, :13], np.float32)
                labels_np = np.array(data[:, -1:], np.float32)
                features = paddle.to_tensor(features_np)
                labels = paddle.to_tensor(labels_np)
                # Forward pass
                y_pred = model(features)
                cost = F.mse_loss(y_pred, label=labels)
                # float() handles both a 0-D loss and a 1-element loss tensor
                train_cost = float(cost)
                # Backward pass
                cost.backward()
                # Minimize the loss by updating the parameters
                optimizer.step()
                # Clear the gradients
                optimizer.clear_grad()
                if batch_id % 30 == 0 and epoch_id % 50 == 0:
                    print("Pass:%d,Cost:%0.5f" % (epoch_id, train_cost))
                train_num = train_num + BATCH_SIZE
                train_nums.append(train_num)
                train_costs.append(train_cost)

    model = Regressor()
    train(model)
    matplotlib.use('TkAgg')
    # %matplotlib inline  (use this instead in a Jupyter notebook)
    draw_train_process(train_nums, train_costs)
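The split and mini-batch logic in the training code can be tried in isolation. The sketch below uses random numbers as a stand-in for `housing_data`, assuming 506 rows (the size of the Boston housing dataset) with 13 feature columns and 1 label column. The helper `make_mini_batches` is a hypothetical name introduced here.

```python
import numpy as np

def make_mini_batches(data, batch_size, rng):
    """Shuffle the rows of `data` in place, then cut them into consecutive batches."""
    rng.shuffle(data)
    return [data[k:k + batch_size] for k in range(0, len(data), batch_size)]

rng = np.random.default_rng(0)
# Stand-in for housing_data: 506 rows, 13 features plus 1 label column
fake_data = rng.random((506, 14))

# 8:2 split, as in the training code
offset = int(fake_data.shape[0] * 0.8)
train_part, test_part = fake_data[:offset], fake_data[offset:]

batches = make_mini_batches(train_part, batch_size=20, rng=rng)
print(len(train_part), len(test_part), len(batches), len(batches[-1]))
```

With these sizes the final batch holds only 4 rows, because 404 is not a multiple of 20. The training loop still adds `BATCH_SIZE` to `train_num` for that batch, so the sample count on the plot's x-axis is a slight overestimate.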

To run this code successfully, refer to the code that loads the housing.data dataset at the start of my Paddle Exercise (1):

Paddle Exercise (1): Predicting Boston House Prices with Linear Regression (Vertira's blog, CSDN) https://blog.csdn.net/Vertira/article/details/122171950

Output:

    start training ...
    Pass:0,Cost:507.42090
    Pass:50,Cost:47.54215
    Pass:100,Cost:83.45570
    Pass:150,Cost:86.61785
    Pass:200,Cost:32.05870
    Pass:250,Cost:15.67683
    Pass:300,Cost:23.19898
    Pass:350,Cost:48.89576
    Pass:400,Cost:56.87611
    Pass:450,Cost:33.11672

Next, run prediction with the trained model:

    # Prepare the data for prediction
    INFER_BATCH_SIZE = 100
    infer_features_np = np.array([data[:13] for data in test_data]).astype("float32")
    infer_labels_np = np.array([data[-1] for data in test_data]).astype("float32")
    infer_features = paddle.to_tensor(infer_features_np)
    infer_labels = paddle.to_tensor(infer_labels_np)
    # Run the model on all test features at once
    fetch_list = model(infer_features)
    sum_cost = 0
    for i in range(INFER_BATCH_SIZE):
        infer_result = fetch_list[i][0]
        ground_truth = infer_labels[i]
        if i % 10 == 0:
            print("No.%d: infer result is %.2f,ground truth is %.2f" % (i, float(infer_result), float(ground_truth)))
        # Accumulate the squared error of each sample
        cost = paddle.pow(infer_result - ground_truth, 2)
        sum_cost += cost
    mean_loss = sum_cost / INFER_BATCH_SIZE
    print("Mean loss is:", mean_loss.numpy())
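The loop above indexes predictions and labels one sample at a time, which sidesteps a shape mismatch: the model output is a column of shape (N, 1), while the labels have shape (N,). A vectorized version has to align the shapes first. The sketch below shows this in NumPy with made-up values; `mean_squared_error_first_n` is a hypothetical helper, not part of Paddle.

```python
import numpy as np

def mean_squared_error_first_n(preds, labels, n):
    """Average squared error over the first n samples, with both inputs flattened to 1-D."""
    preds = np.asarray(preds, dtype=np.float32).reshape(-1)[:n]
    labels = np.asarray(labels, dtype=np.float32).reshape(-1)[:n]
    return float(np.mean((preds - labels) ** 2))

# Illustrative values only: predictions as an (N, 1) column, labels as (N,)
preds = np.array([[11.0], [5.0], [14.0], [16.0]])
labels = np.array([8.0, 7.0, 11.0, 12.0])

# Subtracting without flattening broadcasts to an N x N matrix, not N errors
print((preds - labels).shape)
print(mean_squared_error_first_n(preds, labels, n=4))
```

Flattening both arrays before subtracting keeps the result at one error per sample.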

Prediction results:

    No.0: infer result is 11.91,ground truth is 8.50
    No.10: infer result is 5.13,ground truth is 7.00
    No.20: infer result is 14.46,ground truth is 11.70
    No.30: infer result is 16.31,ground truth is 11.70
    No.40: infer result is 13.45,ground truth is 10.80
    No.50: infer result is 15.81,ground truth is 14.90
    No.60: infer result is 18.66,ground truth is 21.40
    No.70: infer result is 15.23,ground truth is 13.80
    No.80: infer result is 17.96,ground truth is 20.60
    No.90: infer result is 21.41,ground truth is 24.50

Plot the results:

    def plot_pred_ground(pred, ground):
        plt.figure()
        plt.title("Prediction vs. Ground truth", fontsize=24)
        plt.xlabel("ground truth price(unit:$1000)", fontsize=14)
        plt.ylabel("predict price", fontsize=14)
        # Scatter plot of predictions against ground truth; alpha sets the transparency
        plt.scatter(ground, pred, alpha=0.5)
        # Red reference line y = x, where a perfect prediction would fall
        plt.plot(ground, ground, c='red')
        plt.show()

    # Convert the prediction tensor to a NumPy array before plotting
    plot_pred_ground(fetch_list.numpy(), infer_labels_np)

Likes, bookmarks, and follows are welcome.
