The previous note on optimizers introduced the concept of the learning rate, but the learning rate does not stay fixed throughout training: it is usually set relatively large at the start and then gradually decreased as training progresses. This note gives a brief introduction to the learning-rate scheduling strategies in PyTorch.
_LRScheduler (torch.optim.lr_scheduler._LRScheduler) is the base class that all of the concrete learning-rate scheduler classes below inherit from.
Main attributes:
optimizer: the optimizer whose learning rate is being scheduled.
last_epoch: records the current epoch count; scheduling is done per epoch, not per iteration.
base_lrs: records the initial learning rate of each parameter group.
Main methods:
step(): update the learning rate; it should be called once per epoch, after optimizer.step().
get_lr(): compute the learning rate for the current epoch; each concrete scheduler overrides this (newer versions also provide get_last_lr() to read back the most recent value).
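To see how these pieces fit together, here is a minimal, hypothetical subclass sketch (the class name HalveEveryN and its halving rule are made up for illustration), assuming the _LRScheduler name used by older PyTorch releases (newer ones also expose it as LRScheduler):

import torch
import torch.optim as optim
from torch.optim.lr_scheduler import _LRScheduler

class HalveEveryN(_LRScheduler):
    """Hypothetical scheduler: halve every base learning rate every `n` epochs."""

    def __init__(self, optimizer, n, last_epoch=-1):
        self.n = n
        super().__init__(optimizer, last_epoch)

    def get_lr(self):
        # base_lrs and last_epoch are provided by the base class
        return [base_lr * 0.5 ** (self.last_epoch // self.n)
                for base_lr in self.base_lrs]

w = torch.randn(1, requires_grad=True)
optimizer = optim.SGD([w], lr=0.1)
scheduler = HalveEveryN(optimizer, n=10)

for epoch in range(30):
    optimizer.step()                     # parameter update first
    scheduler.step()                     # then the learning-rate update, once per epoch
    print(epoch, scheduler.get_last_lr())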
StepLR. Function: adjusts the learning rate at equal intervals.
torch.optim.lr_scheduler.StepLR(optimizer, step_size, gamma=0.1, last_epoch=-1)
The parameters are as follows:
step_size: the adjustment interval, i.e. the number of epochs between learning-rate updates.
gamma: the multiplicative decay factor (default 0.1).
last_epoch: the index of the last epoch, used when resuming training (default -1).
Update rule: lr = lr * gamma, applied once every step_size epochs.
An example:
# Import modules, set hyperparameters, create the weight data
import torch
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt

torch.manual_seed(1)

LR = 0.1
iteration = 10
max_epoch = 200

weights = torch.randn((1), requires_grad=True)
target = torch.zeros((1))

optimizer = optim.SGD([weights], lr=LR, momentum=0.9)

# StepLR: decay the learning rate every 50 epochs
scheduler_lr = optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.1)  # set the LR decay policy

lr_list, epoch_list = list(), list()
for epoch in range(max_epoch):
    # Get the current lr: new versions use get_last_lr(), old versions get_lr(); see the UserWarning
    lr_list.append(scheduler_lr.get_last_lr())
    epoch_list.append(epoch)

    for i in range(iteration):
        loss = torch.pow((weights - target), 2)
        loss.backward()

        optimizer.step()
        optimizer.zero_grad()

    scheduler_lr.step()

plt.plot(epoch_list, lr_list, label="Step LR Scheduler")
plt.xlabel("Epoch")
plt.ylabel("Learning rate")
plt.legend()
plt.show()
The resulting curve is shown in the figure below:
As the figure shows, the learning rate drops to 0.1 of its previous value every 50 epochs.
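For reference, the StepLR schedule can also be written in closed form; a small sketch using the same constants as above (LR = 0.1, gamma = 0.1, step_size = 50):

LR, gamma, step_size = 0.1, 0.1, 50

def step_lr(epoch):
    # learning rate after `epoch` scheduler steps under StepLR
    return LR * gamma ** (epoch // step_size)

print(step_lr(0), step_lr(49), step_lr(50), step_lr(100))
# -> 0.1, 0.1, 0.01, 0.001 (up to floating-point rounding)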
MultiStepLR. Function: adjusts the learning rate at user-specified milestones.
torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones, gamma=0.1, last_epoch=-1)
The parameters are as follows:
milestones: a list of epoch indices at which the learning rate is decayed.
gamma: the multiplicative decay factor (default 0.1).
last_epoch: the index of the last epoch (default -1).
Update rule: lr = lr * gamma, applied at each milestone.
An example:
milestones = [50, 125, 160]
scheduler_lr = optim.lr_scheduler.MultiStepLR(optimizer, milestones=milestones, gamma=0.1)

lr_list, epoch_list = list(), list()
for epoch in range(max_epoch):
    lr_list.append(scheduler_lr.get_last_lr())
    epoch_list.append(epoch)

    for i in range(iteration):
        loss = torch.pow((weights - target), 2)
        loss.backward()

        optimizer.step()
        optimizer.zero_grad()

    scheduler_lr.step()

plt.plot(epoch_list, lr_list, label="Multi Step LR Scheduler\nmilestones:{}".format(milestones))
plt.xlabel("Epoch")
plt.ylabel("Learning rate")
plt.legend()
plt.show()
The result is shown in the figure below:
As the figure shows, the learning rate drops to 0.1 of its previous value at the epochs we specified: 50, 125 and 160.
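The MultiStepLR curve can be checked the same way: gamma is applied once for every milestone already passed, which bisect makes explicit (same milestones and gamma as above):

from bisect import bisect_right

LR, gamma = 0.1, 0.1
milestones = [50, 125, 160]

def multi_step_lr(epoch):
    # bisect_right counts how many milestones are <= epoch
    return LR * gamma ** bisect_right(milestones, epoch)

print(multi_step_lr(49), multi_step_lr(50), multi_step_lr(125), multi_step_lr(160))
# -> 0.1, 0.01, 0.001, 0.0001 (up to floating-point rounding)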
ExponentialLR. Function: adjusts the learning rate by exponential decay.
torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma, last_epoch=-1)
Only one parameter needs attention:
gamma: the base of the exponential decay, usually a value slightly below 1 (0.95 in the example below).
Update rule: lr = initial_lr * gamma ** epoch, i.e. the learning rate is multiplied by gamma once per epoch.
An example:
gamma = 0.95
scheduler_lr = optim.lr_scheduler.ExponentialLR(optimizer, gamma=gamma)

lr_list, epoch_list = list(), list()
for epoch in range(max_epoch):
    lr_list.append(scheduler_lr.get_last_lr())
    epoch_list.append(epoch)

    for i in range(iteration):
        loss = torch.pow((weights - target), 2)
        loss.backward()

        optimizer.step()
        optimizer.zero_grad()

    scheduler_lr.step()

plt.plot(epoch_list, lr_list, label="Exponential LR Scheduler\ngamma:{}".format(gamma))
plt.xlabel("Epoch")
plt.ylabel("Learning rate")
plt.legend()
plt.show()
The learning-rate curve is shown in the figure below:
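Because the decay is geometric, the closed form shows directly how quickly the learning rate shrinks; a quick check with the same LR and gamma as above:

LR, gamma = 0.1, 0.95

for epoch in (0, 20, 50, 100):
    # learning rate after `epoch` scheduler steps under ExponentialLR
    print(epoch, LR * gamma ** epoch)
# gamma = 0.95 roughly halves the learning rate every 14 epochs (0.95 ** 14 ≈ 0.49)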
CosineAnnealingLR. Function: adjusts the learning rate along a cosine curve, periodically.
torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max, eta_min=0, last_epoch=-1)
The parameters are as follows:
T_max: the half period of the cosine, i.e. the number of epochs it takes the learning rate to go from its initial value down to eta_min.
eta_min: the minimum learning rate (default 0).
last_epoch: the index of the last epoch (default -1).
Update rule: eta_t = eta_min + (eta_max - eta_min) * (1 + cos(pi * T_cur / T_max)) / 2, where eta_max is the initial learning rate and T_cur is the current epoch count.
t_max = 50
scheduler_lr = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=t_max, eta_min=0.)

lr_list, epoch_list = list(), list()
for epoch in range(max_epoch):
    lr_list.append(scheduler_lr.get_last_lr())
    epoch_list.append(epoch)

    for i in range(iteration):
        loss = torch.pow((weights - target), 2)
        loss.backward()

        optimizer.step()
        optimizer.zero_grad()

    scheduler_lr.step()

plt.plot(epoch_list, lr_list, label="CosineAnnealingLR Scheduler\nT_max:{}".format(t_max))
plt.xlabel("Epoch")
plt.ylabel("Learning rate")
plt.legend()
plt.show()
The learning-rate curve is shown in the figure below:
T_max is set to 50, so the learning rate falls over epochs 0-50, rises back over epochs 50-100, and so on.
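This shape follows from the cosine formula above; a small sketch evaluating it with eta_max = 0.1, eta_min = 0 and T_max = 50 as in the example (PyTorch computes the value recursively, but the curve is the same):

import math

eta_max, eta_min, T_max = 0.1, 0.0, 50

def cosine_lr(epoch):
    # eta_t = eta_min + (eta_max - eta_min) * (1 + cos(pi * epoch / T_max)) / 2
    return eta_min + (eta_max - eta_min) * (1 + math.cos(math.pi * epoch / T_max)) / 2

for e in (0, 25, 50, 75, 100):
    print(e, round(cosine_lr(e), 4))
# -> 0.1 at epoch 0, 0.05 at 25, 0.0 at 50, back up to 0.05 at 75 and 0.1 at 100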
ReduceLROnPlateau. Function: monitors a metric and reduces the learning rate when the metric stops improving.
torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=10, verbose=False, threshold=0.0001, threshold_mode='rel', cooldown=0, min_lr=0, eps=1e-08)
The parameters are as follows:
mode: 'min' or 'max', depending on whether the monitored metric should decrease (e.g. a loss) or increase (e.g. an accuracy).
factor: the multiplicative factor applied when the learning rate is reduced (plays the role of gamma above, default 0.1).
patience: how many epochs with no improvement to tolerate before reducing the learning rate (default 10).
threshold / threshold_mode: how large a change counts as an improvement.
cooldown: how many epochs to wait after a reduction before monitoring resumes (default 0).
min_lr: a lower bound on the learning rate.
eps: if a reduction would change the learning rate by less than eps, it is skipped.
verbose: if True, print a message every time the learning rate is reduced.
An example:
loss_value = 0.5
accuracy = 0.9

factor = 0.1
mode = "min"
patience = 10
cooldown = 10
min_lr = 1e-4
verbose = True

scheduler_lr = optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=factor, mode=mode,
                                                    patience=patience, cooldown=cooldown,
                                                    min_lr=min_lr, verbose=verbose)

for epoch in range(max_epoch):
    for i in range(iteration):
        # train(...)
        optimizer.step()
        optimizer.zero_grad()

    # if epoch == 5:
    #     loss_value = 0.4

    # ReduceLROnPlateau.step() takes the monitored metric as its argument
    scheduler_lr.step(loss_value)
The output is as follows:
Epoch 12: reducing learning rate of group 0 to 1.0000e-02.
Epoch 33: reducing learning rate of group 0 to 1.0000e-03.
Epoch 54: reducing learning rate of group 0 to 1.0000e-04.
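Since loss_value never improves, the first reduction comes once the patience of 10 epochs is used up, and later ones follow roughly every patience + cooldown epochs (12, 33, 54 above); the schedule cannot go below the min_lr of 1e-4, which is why no further messages appear. Uncommenting the `if epoch == 5` block delays the first reduction, because the improved loss resets the patience counter. A tiny standalone sketch (with made-up metric values and a short patience) shows the same mechanics:

import torch
import torch.optim as optim

w = torch.zeros(1, requires_grad=True)
opt = optim.SGD([w], lr=0.1)
sched = optim.lr_scheduler.ReduceLROnPlateau(opt, mode="min", factor=0.1, patience=2)

# The metric improves for three epochs and then stalls; the LR is cut once
# the number of epochs without improvement exceeds patience (2).
for metric in [1.0, 0.9, 0.8, 0.8, 0.8, 0.8, 0.8]:
    opt.step()
    sched.step(metric)
    print(metric, opt.param_groups[0]["lr"])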
LambdaLR. Function: a custom adjustment strategy defined by user-supplied functions.
torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda, last_epoch=-1)
There is essentially one parameter to set:
lr_lambda: a function, or a list of functions (one per parameter group), mapping the epoch index to a multiplicative factor that is applied to the initial learning rate.
An example:
lr_init = 0.1

weights_1 = torch.randn((6, 3, 5, 5))
weights_2 = torch.ones((5, 5))

optimizer = optim.SGD([
    {'params': [weights_1]},
    {'params': [weights_2]}], lr=lr_init)

# One lambda per parameter group: a step-wise decay and an exponential decay
lambda1 = lambda epoch: 0.1 ** (epoch // 20)
lambda2 = lambda epoch: 0.95 ** epoch

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=[lambda1, lambda2])

lr_list, epoch_list = list(), list()
for epoch in range(max_epoch):
    for i in range(iteration):
        # train(...)
        optimizer.step()
        optimizer.zero_grad()

    scheduler.step()

    # get_last_lr() in new versions (calling get_lr() here triggers a UserWarning in older ones)
    lr_list.append(scheduler.get_last_lr())
    epoch_list.append(epoch)

    print('epoch:{:5d}, lr:{}'.format(epoch, scheduler.get_last_lr()))

plt.plot(epoch_list, [i[0] for i in lr_list], label="lambda 1")
plt.plot(epoch_list, [i[1] for i in lr_list], label="lambda 2")
plt.xlabel("Epoch")
plt.ylabel("Learning Rate")
plt.title("LambdaLR")
plt.legend()
plt.show()
The result is shown in the figure below: one group's learning rate drops to 0.1 of its value every 20 epochs, while the other decays exponentially with base 0.95: