
Key Concepts in Neural Network Training: Epoch, Batch, Batch Size, and Iteration


The key concepts involved in training a neural network model are as follows:

  • Epoch: one complete pass of the entire dataset through the neural network, forward and back. In other words, every training sample has gone through one forward pass and one backward pass. An epoch is the process of training on all training samples exactly once.
  • Batch: the training set is divided into a number of batches. Each batch contains a subset of the training samples, and the network is trained on one batch at a time.
  • Batch size: the number of training samples in each batch. The batch size affects how well and how quickly the model is optimized, as well as memory utilization and memory requirements.
  • Iteration: training on one batch is one iteration, and each iteration updates the model's parameters once. (The loop sketch after this list shows how these three levels nest.)
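
To make the nesting concrete, here is a minimal, framework-agnostic sketch in Python. Nothing is actually trained; the dataset size and batch size are hypothetical placeholders, and only the loop structure matters:

```python
# Minimal sketch of how epochs, batches, and iterations nest.
# No real model is involved; only the loop structure matters here.
num_samples = 1000   # hypothetical dataset size
batch_size  = 50
num_epochs  = 3      # kept small so the printout stays short

iteration = 0
for epoch in range(1, num_epochs + 1):
    # One epoch: walk over the whole dataset once, one batch at a time.
    batch_starts = range(0, num_samples, batch_size)
    for start in batch_starts:
        # One batch = samples[start : start + batch_size].
        # The forward pass, loss, backward pass, and parameter update go here;
        # each pass through this inner body is one iteration.
        iteration += 1
    print(f"epoch {epoch}: {len(batch_starts)} batches, {iteration} iterations so far")
```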

Multiple epochs are needed because a single pass over the training set is not enough for the model to fit and converge: we work with a finite dataset and optimize it with an iterative procedure, gradient descent, so the data has to be presented to the network repeatedly. As the number of epochs increases, the weights are updated more and more times, and the model typically moves from underfitting toward overfitting. The number of epochs therefore matters: with too few epochs the model may fail to converge to a good solution, while with too many it may overfit and generalize poorly. The appropriate number of epochs differs from dataset to dataset and should be chosen with a validation set or cross-validation, for example via early stopping as sketched below.
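
A common way to choose the number of epochs with a validation set is early stopping: keep training while the validation loss improves and stop once it has stalled for a few epochs. Below is a minimal sketch; the sequence of validation losses is made up and stands in for a real model evaluated after each epoch:

```python
# Minimal early-stopping sketch for choosing the number of epochs.
# val_losses is a made-up, purely illustrative sequence of per-epoch validation losses.
val_losses = [0.90, 0.72, 0.61, 0.55, 0.53, 0.54, 0.52, 0.53, 0.55, 0.56, 0.58]

patience = 3                      # stop after 3 epochs without improvement
best_loss, best_epoch, bad_epochs = float("inf"), 0, 0
for epoch, val_loss in enumerate(val_losses, start=1):
    if val_loss < best_loss:
        best_loss, best_epoch, bad_epochs = val_loss, epoch, 0   # improvement: reset counter
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break

print(f"stop at epoch {epoch}, keep the weights from epoch {best_epoch} "
      f"(val loss {best_loss:.2f})")
```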

The relationship between these concepts can be expressed with the following formulas (a small helper that computes both quantities follows the list):

  • Within one epoch: number of batches = number of training samples / batch_size
  • Total number of iterations = number of batches per epoch × number of epochs
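
These formulas translate directly into a few lines of Python. The helper below is only an illustration; whether the division rounds down or up depends on how the final, possibly smaller, batch is handled, which is exactly the question addressed in the next section:

```python
import math

def batches_per_epoch(num_samples: int, batch_size: int, drop_last: bool = False) -> int:
    """Number of batches in one epoch.

    With drop_last=True the final incomplete batch is discarded (round down);
    otherwise it is kept as a smaller batch (round up).
    """
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)

def total_iterations(num_samples: int, batch_size: int, num_epochs: int,
                     drop_last: bool = False) -> int:
    """Total parameter updates = batches per epoch x number of epochs."""
    return batches_per_epoch(num_samples, batch_size, drop_last) * num_epochs

print(total_iterations(1000, 50, 10))   # 20 batches/epoch * 10 epochs = 200 iterations
```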

If the number of training samples is not evenly divisible by batch_size, there are several ways to handle it (a short, framework-independent sketch of the padding and drop_last options follows this list):

  • Adjust batch_size so that it divides the number of training samples evenly. For example, with 1000 training samples you can choose a batch_size of 50, 20, 10, and so on.
  • Adjust the number of training samples so that it is divisible by batch_size. For example, with a batch_size of 35 you can use 700, 1050, 1400, ... training samples.
  • Check the actual size of each input batch and pad it when it does not match the preset batch_size. For example, with a batch_size of 35 and a final batch of only 25 samples, the remaining 10 slots can be filled with zeros or other values.
  • Use the drop_last option to discard the final batch when it is smaller than batch_size. For example, with a batch_size of 35 and a final batch of only 25 samples, that batch is simply skipped and does not take part in training.
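
The padding and drop_last options can be illustrated with a small, framework-independent batching function (real data loaders such as paddle.io.DataLoader or torch.utils.data.DataLoader expose an equivalent drop_last flag):

```python
def make_batches(samples, batch_size, drop_last=False, pad_value=None):
    """Split `samples` into batches of `batch_size`.

    drop_last=True  -> discard the final incomplete batch
    pad_value set   -> pad the final incomplete batch up to batch_size
    otherwise       -> keep the final batch at its smaller size
    """
    batches = []
    for start in range(0, len(samples), batch_size):
        batch = list(samples[start:start + batch_size])
        if len(batch) < batch_size:
            if drop_last:
                continue                                   # ignore the short batch
            if pad_value is not None:                      # pad up to batch_size
                batch += [pad_value] * (batch_size - len(batch))
        batches.append(batch)
    return batches

data = list(range(95))                    # 95 samples, batch_size 35 -> 2 full batches + 25 left
print(len(make_batches(data, 35)))                        # 3 batches (last one has 25 samples)
print(len(make_batches(data, 35, drop_last=True)))        # 2 batches (short batch discarded)
print(len(make_batches(data, 35, pad_value=0)[-1]))       # 35 (last batch padded with zeros)
```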

A concrete example is shown in the training log below, where the number of training samples is 2922, batch_size = 16, and the total number of epochs is 100. This means that:

  • The training samples are split into about 182 batches (2922 / 16 ≈ 182.6), each full batch containing 16 samples; the 10 leftover samples either form a smaller final batch or are dropped, depending on the drop_last setting.
  • Training on one batch is one iteration.
  • Finishing all of the batches once is one epoch.
  • Training for 100 epochs therefore amounts to roughly 18200 iterations (182 × 100).
  • The model's parameters are updated once per iteration. (A quick check of these numbers follows, and after it the actual training log.)
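
As a quick check of the numbers above (assuming the final incomplete batch is dropped, as in the drop_last option discussed earlier):

```python
num_samples, batch_size, num_epochs = 2922, 16, 100

batches = num_samples // batch_size        # 182 full batches per epoch
leftover = num_samples % batch_size        # 10 samples left over
iterations = batches * num_epochs          # 18200 parameter updates in total

print(batches, leftover, iterations)       # 182 10 18200
```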
[03/29 14:56:26] DALI is not installed, you can improve performance if use DALI
[03/29 14:56:27] DATASET : 
[03/29 14:56:27]     batch_size : 16
[03/29 14:56:27]     num_workers : 0
[03/29 14:56:27]     test : 
[03/29 14:56:27]         file_path : /home/aistudio/data/data104924/test_A_data.npy
[03/29 14:56:27]         format : SkeletonDataset
[03/29 14:56:27]         test_mode : True
[03/29 14:56:27]     test_batch_size : 1
[03/29 14:56:27]     test_num_workers : 0
[03/29 14:56:27]     train : 
[03/29 14:56:27]         file_path : /home/aistudio/data/data104925/train_data.npy
[03/29 14:56:27]         format : SkeletonDataset
[03/29 14:56:27]         label_path : /home/aistudio/data/data104925/train_label.npy
[03/29 14:56:27] ------------------------------------------------------------
[03/29 14:56:27] INFERENCE : 
[03/29 14:56:27]     name : STGCN_Inference_helper
[03/29 14:56:27]     num_channels : 2
[03/29 14:56:27]     person_nums : 1
[03/29 14:56:27]     vertex_nums : 25
[03/29 14:56:27]     window_size : 1000
[03/29 14:56:27] ------------------------------------------------------------
[03/29 14:56:27] METRIC : 
[03/29 14:56:27]     name : SkeletonMetric
[03/29 14:56:27]     out_file : submission2.csv
[03/29 14:56:27] ------------------------------------------------------------
[03/29 14:56:27] MIX : 
[03/29 14:56:27]     alpha : 0.2
[03/29 14:56:27]     name : Mixup
[03/29 14:56:27] ------------------------------------------------------------
[03/29 14:56:27] MODEL : 
[03/29 14:56:27]     backbone : 
[03/29 14:56:27]         name : AGCN
[03/29 14:56:27]     framework : RecognizerGCN
[03/29 14:56:27]     head : 
[03/29 14:56:27]         ls_eps : 0.1
[03/29 14:56:27]         name : STGCNHead
[03/29 14:56:27]         num_classes : 30
[03/29 14:56:27] ------------------------------------------------------------
[03/29 14:56:27] OPTIMIZER : 
[03/29 14:56:27]     learning_rate : 
[03/29 14:56:27]         cosine_base_lr : 0.05
[03/29 14:56:27]         iter_step : True
[03/29 14:56:27]         max_epoch : 100
[03/29 14:56:27]         name : CustomWarmupCosineDecay
[03/29 14:56:27]         warmup_epochs : 10
[03/29 14:56:27]         warmup_start_lr : 0.005
[03/29 14:56:27]     momentum : 0.9
[03/29 14:56:27]     name : Momentum
[03/29 14:56:27]     weight_decay : 
[03/29 14:56:27]         name : L2
[03/29 14:56:27]         value : 0.0001
[03/29 14:56:27] ------------------------------------------------------------
[03/29 14:56:27] PIPELINE : 
[03/29 14:56:27]     test : 
[03/29 14:56:27]         sample : 
[03/29 14:56:27]             name : AutoPadding
[03/29 14:56:27]             window_size : 1000
[03/29 14:56:27]         transform : 
[03/29 14:56:27]             SkeletonNorm : None
[03/29 14:56:27]     train : 
[03/29 14:56:27]         sample : 
[03/29 14:56:27]             name : AutoPadding
[03/29 14:56:27]             window_size : 1000
[03/29 14:56:27]         transform : 
[03/29 14:56:27]             SkeletonNorm : None
[03/29 14:56:27] ------------------------------------------------------------
[03/29 14:56:27] epochs : 100
[03/29 14:56:27] log_interval : 10
[03/29 14:56:27] model_name : AGCN
W0329 14:56:27.170261   942 device_context.cc:404] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 10.1
W0329 14:56:27.176800   942 device_context.cc:422] device: 0, cuDNN Version: 7.6.
[03/29 14:56:30] Loading data, it will take some moment...
[03/29 14:56:33] Data Loaded!
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/layer/norm.py:641: UserWarning: When training, we now always track global mean and variance.
  "When training, we now always track global mean and variance.")
[03/29 14:56:34] epoch:[  1/100] train step:0    loss: 3.37317 lr: 0.005000 top1: 0.00000 top5: 0.31250 batch_cost: 1.36950 sec, reader_cost: 0.42939 sec, ips: 11.68313 instance/sec.
[03/29 14:56:41] epoch:[  1/100] train step:10   loss: 3.40027 lr: 0.005241 top1: 0.00000 top5: 0.12378 batch_cost: 0.69300 sec, reader_cost: 0.00034 sec, ips: 23.08807 instance/sec.
[03/29 14:56:48] epoch:[  1/100] train step:20   loss: 3.23057 lr: 0.005481 top1: 0.06250 top5: 0.27953 batch_cost: 0.70447 sec, reader_cost: 0.00030 sec, ips: 22.71226 instance/sec.
[03/29 14:56:55] epoch:[  1/100] train step:30   loss: 3.03959 lr: 0.005722 top1: 0.12500 top5: 0.49993 batch_cost: 0.68937 sec, reader_cost: 0.00030 sec, ips: 23.20952 instance/sec.
[03/29 14:57:02] epoch:[  1/100] train step:40   loss: 3.19273 lr: 0.005962 top1: 0.12500 top5: 0.26837 batch_cost: 0.70059 sec, reader_cost: 0.00035 sec, ips: 22.83794 instance/sec.
[03/29 14:57:09] epoch:[  1/100] train step:50   loss: 2.99275 lr: 0.006203 top1: 0.12500 top5: 0.49463 batch_cost: 0.71244 sec, reader_cost: 0.00027 sec, ips: 22.45799 instance/sec.
[03/29 14:57:16] epoch:[  1/100] train step:60   loss: 3.09189 lr: 0.006443 top1: 0.18422 top5: 0.30922 batch_cost: 0.70627 sec, reader_cost: 0.00045 sec, ips: 22.65417 instance/sec.
[03/29 14:57:23] epoch:[  1/100] train step:70   loss: 3.27638 lr: 0.006684 top1: 0.16767 top5: 0.44052 batch_cost: 0.70927 sec, reader_cost: 0.00037 sec, ips: 22.55844 instance/sec.
[03/29 14:57:30] epoch:[  1/100] train step:80   loss: 2.94396 lr: 0.006924 top1: 0.05825 top5: 0.52851 batch_cost: 0.70284 sec, reader_cost: 0.00029 sec, ips: 22.76487 instance/sec.
[03/29 14:57:37] epoch:[  1/100] train step:90   loss: 2.87026 lr: 0.007165 top1: 0.18725 top5: 0.68664 batch_cost: 0.69787 sec, reader_cost: 0.00033 sec, ips: 22.92705 instance/sec.
[03/29 14:57:44] epoch:[  1/100] train step:100  loss: 3.12629 lr: 0.007405 top1: 0.18207 top5: 0.49457 batch_cost: 0.68661 sec, reader_cost: 0.00032 sec, ips: 23.30293 instance/sec.
[03/29 14:57:51] epoch:[  1/100] train step:110  loss: 3.00636 lr: 0.007646 top1: 0.18595 top5: 0.49767 batch_cost: 0.71403 sec, reader_cost: 0.00033 sec, ips: 22.40810 instance/sec.
[03/29 14:57:58] epoch:[  1/100] train step:120  loss: 2.92559 lr: 0.007886 top1: 0.29620 top5: 0.54352 batch_cost: 0.69685 sec, reader_cost: 0.00027 sec, ips: 22.96031 instance/sec.
[03/29 14:58:05] epoch:[  1/100] train step:130  loss: 2.81274 lr: 0.008127 top1: 0.18393 top5: 0.67501 batch_cost: 0.67208 sec, reader_cost: 0.00030 sec, ips: 23.80659 instance/sec.
[03/29 14:58:12] epoch:[  1/100] train step:140  loss: 3.02827 lr: 0.008367 top1: 0.06250 top5: 0.65827 batch_cost: 0.70593 sec, reader_cost: 0.00041 sec, ips: 22.66514 instance/sec.
[03/29 14:58:19] epoch:[  1/100] train step:150  loss: 2.95272 lr: 0.008608 top1: 0.18750 top5: 0.43750 batch_cost: 0.69523 sec, reader_cost: 0.00032 sec, ips: 23.01397 instance/sec.
[03/29 14:58:26] epoch:[  1/100] train step:160  loss: 3.34270 lr: 0.008848 top1: 0.06250 top5: 0.33044 batch_cost: 0.68350 sec, reader_cost: 0.00032 sec, ips: 23.40885 instance/sec.
[03/29 14:58:33] epoch:[  1/100] train step:170  loss: 3.28831 lr: 0.009089 top1: 0.00000 top5: 0.34511 batch_cost: 0.70497 sec, reader_cost: 0.00031 sec, ips: 22.69596 instance/sec.
[03/29 14:58:39] epoch:[  1/100] train step:180  loss: 2.88748 lr: 0.009330 top1: 0.06249 top5: 0.56248 batch_cost: 0.49910 sec, reader_cost: 0.00030 sec, ips: 32.05783 instance/sec.
[03/29 14:58:40] END epoch:1   train loss_avg: 3.11624  top1_avg: 0.10158 top5_avg: 0.43874 avg_batch_cost: 0.50068 sec, avg_reader_cost: 0.00083 sec, batch_cost_sum: 127.13062 sec, avg_ips: 22.90558 instance/sec.
[03/29 14:58:41] epoch:[  2/100] train step:0    loss: 3.00062 lr: 0.009378 top1: 0.06250 top5: 0.53927 batch_cost: 1.06469 sec, reader_cost: 0.27380 sec, ips: 15.02790 instance/sec.
[03/29 14:58:48] epoch:[  2/100] train step:10   loss: 2.91025 lr: 0.009618 top1: 0.12500 top5: 0.49546 batch_cost: 0.70174 sec, reader_cost: 0.00029 sec, ips: 22.80040 instance/sec.
[03/29 14:58:55] epoch:[  2/100] train step:20   loss: 2.79556 lr: 0.009859 top1: 0.06216 top5: 0.68475 batch_cost: 0.69660 sec, reader_cost: 0.00030 sec, ips: 22.96883 instance/sec.
[03/29 14:59:02] epoch:[  2/100] train step:30   loss: 3.19926 lr: 0.010099 top1: 0.12497 top5: 0.43742 batch_cost: 0.70330 sec, reader_cost: 0.00032 sec, ips: 22.75002 instance/sec.
[03/29 14:59:09] epoch:[  2/100] train step:40   loss: 2.77705 lr: 0.010340 top1: 0.12500 top5: 0.68748 batch_cost: 0.70972 sec, reader_cost: 0.00028 sec, ips: 22.54421 instance/sec.
[03/29 14:59:16] epoch:[  2/100] train step:50   loss: 2.77065 lr: 0.010580 top1: 0.18750 top5: 0.62500 batch_cost: 0.70323 sec, reader_cost: 0.00031 sec, ips: 22.75215 instance/sec.
[03/29 14:59:23] epoch:[  2/100] train step:60   loss: 2.92242 lr: 0.010821 top1: 0.22620 top5: 0.67861 batch_cost: 0.71235 sec, reader_cost: 0.00032 sec, ips: 22.46083 instance/sec.
[03/29 14:59:30] epoch:[  2/100] train step:70   loss: 2.85897 lr: 0.011061 top1: 0.06250 top5: 0.43750 batch_cost: 0.71474 sec, reader_cost: 0.00032 sec, ips: 22.38568 instance/sec.
[03/29 14:59:37] epoch:[  2/100] train step:80   loss: 2.88646 lr: 0.011302 top1: 0.10735 top5: 0.58087 batch_cost: 0.68660 sec, reader_cost: 0.00032 sec, ips: 23.30327 instance/sec.
[03/29 14:59:44] epoch:[  2/100] train step:90   loss: 2.52885 lr: 0.011542 top1: 0.43549 top5: 0.74666 batch_cost: 0.69556 sec, reader_cost: 0.00036 sec, ips: 23.00295 instance/sec.
[03/29 14:59:51] epoch:[  2/100] train step:100  loss: 3.04526 lr: 0.011783 top1: 0.10791 top5: 0.60207 batch_cost: 0.69577 sec, reader_cost: 0.00035 sec, ips: 22.99623 instance/sec.
[03/29 14:59:58] epoch:[  2/100] train step:110  loss: 3.41966 lr: 0.012023 top1: 0.05607 top5: 0.35570 batch_cost: 0.71025 sec, reader_cost: 0.00035 sec, ips: 22.52737 instance/sec.
[03/29 15:00:05] epoch:[  2/100] train step:120  loss: 2.62268 lr: 0.012264 top1: 0.00000 top5: 0.62490 batch_cost: 0.71275 sec, reader_cost: 0.00030 sec, ips: 22.44821 instance/sec.
[03/29 15:00:12] epoch:[  2/100] train step:130  loss: 2.74196 lr: 0.012505 top1: 0.12488 top5: 0.62464 batch_cost: 0.71282 sec, reader_cost: 0.00028 sec, ips: 22.44601 instance/sec.
[03/29 15:00:19] epoch:[  2/100] train step:140  loss: 2.67099 lr: 0.012745 top1: 0.12456 top5: 0.68618 batch_cost: 0.70457 sec, reader_cost: 0.00035 sec, ips: 22.70898 instance/sec.
[03/29 15:00:27] epoch:[  2/100] train step:150  loss: 2.98392 lr: 0.012986 top1: 0.12501 top5: 0.68747 batch_cost: 0.68798 sec, reader_cost: 0.00027 sec, ips: 23.25660 instance/sec.
[03/29 15:00:33] epoch:[  2/100] train step:160  loss: 2.81657 lr: 0.013226 top1: 0.11762 top5: 0.66905 batch_cost: 0.68583 sec, reader_cost: 0.00033 sec, ips: 23.32943 instance/sec.
[03/29 15:00:40] epoch:[  2/100] train step:170  loss: 2.96144 lr: 0.013467 top1: 0.15183 top5: 0.65183 batch_cost: 0.69816 sec, reader_cost: 0.00031 sec, ips: 22.91731 instance/sec.
[03/29 15:00:47] epoch:[  2/100] train step:180  loss: 2.80823 lr: 0.013707 top1: 0.12349 top5: 0.55721 batch_cost: 0.50249 sec, reader_cost: 0.00030 sec, ips: 31.84135 instance/sec.
[03/29 15:00:47] END epoch:2   train loss_avg: 2.90696  top1_avg: 0.13409 top5_avg: 0.57791 avg_batch_cost: 0.50287 sec, avg_reader_cost: 0.00091 sec, batch_cost_sum: 127.36288 sec, avg_ips: 22.86381 instance/sec.
[03/29 15:00:48] epoch:[  3/100] train step:0    loss: 2.57081 lr: 0.013755 top1: 0.18569 top5: 0.73736 batch_cost: 1.04606 sec, reader_cost: 0.24835 sec, ips: 15.29544 instance/sec.
[03/29 15:00:56] epoch:[  3/100] train step:10   loss: 2.78410 lr: 0.013996 top1: 0.11309 top5: 0.87793 batch_cost: 0.71015 sec, reader_cost: 0.00031 sec, ips: 22.53035 instance/sec.
[03/29 15:01:03] epoch:[  3/100] train step:20   loss: 2.64895 lr: 0.014236 top1: 0.18610 top5: 0.62150 batch_cost: 0.70124 sec, reader_cost: 0.00033 sec, ips: 22.81670 instance/sec.
[03/29 15:01:10] epoch:[  3/100] train step:30   loss: 2.83884 lr: 0.014477 top1: 0.12238 top5: 0.67834 batch_cost: 0.68523 sec, reader_cost: 0.00036 sec, ips: 23.34983 instance/sec.
[03/29 15:01:17] epoch:[  3/100] train step:40   loss: 2.60631 lr: 0.014717 top1: 0.25000 top5: 0.68749 batch_cost: 0.71406 sec, reader_cost: 0.00031 sec, ips: 22.40718 instance/sec.
[03/29 15:01:24] epoch:[  3/100] train step:50   loss: 3.04034 lr: 0.014958 top1: 0.06223 top5: 0.49946 batch_cost: 0.70691 sec, reader_cost: 0.00030 sec, ips: 22.63366 instance/sec.
[03/29 15:01:31] epoch:[  3/100] train step:60   loss: 2.85118 lr: 0.015198 top1: 0.18750 top5: 0.62500 batch_cost: 0.71714 sec, reader_cost: 0.00030 sec, ips: 22.31077 instance/sec.
[03/29 15:01:38] epoch:[  3/100] train step:70   loss: 2.69562 lr: 0.015439 top1: 0.12411 top5: 0.61785 batch_cost: 0.69712 sec, reader_cost: 0.00027 sec, ips: 22.95171 instance/sec.
[03/29 15:01:45] epoch:[  3/100] train step:80   loss: 2.49378 lr: 0.015680 top1: 0.18657 top5: 0.80831 batch_cost: 0.70739 sec, reader_cost: 0.00029 sec, ips: 22.61835 instance/sec.
[03/29 15:01:52] epoch:[  3/100] train step:90   loss: 2.80020 lr: 0.015920 top1: 0.17714 top5: 0.58875 batch_cost: 0.70922 sec, reader_cost: 0.00032 sec, ips: 22.56012 instance/sec.
[03/29 15:01:59] epoch:[  3/100] train step:100  loss: 2.55284 lr: 0.016161 top1: 0.18247 top5: 0.73659 batch_cost: 0.70368 sec, reader_cost: 0.00031 sec, ips: 22.73755 instance/sec.
[03/29 15:02:06] epoch:[  3/100] train step:110  loss: 2.57980 lr: 0.016401 top1: 0.18632 top5: 0.74295 batch_cost: 0.68889 sec, reader_cost: 0.00033 sec, ips: 23.22577 instance/sec.
[03/29 15:02:13] epoch:[  3/100] train step:120  loss: 3.05257 lr: 0.016642 top1: 0.07077 top5: 0.55423 batch_cost: 0.69157 sec, reader_cost: 0.00029 sec, ips: 23.13575 instance/sec.
[03/29 15:02:20] epoch:[  3/100] train step:130  loss: 2.43921 lr: 0.016882 top1: 0.24921 top5: 0.74790 batch_cost: 0.68415 sec, reader_cost: 0.00032 sec, ips: 23.38664 instance/sec.
[03/29 15:02:27] epoch:[  3/100] train step:140  loss: 2.86336 lr: 0.017123 top1: 0.12808 top5: 0.61576 batch_cost: 0.68897 sec, reader_cost: 0.00029 sec, ips: 23.22317 instance/sec.
[03/29 15:02:34] epoch:[  3/100] train step:150  loss: 2.95862 lr: 0.017363 top1: 0.06250 top5: 0.62466 batch_cost: 0.68853 sec, reader_cost: 0.00036 sec, ips: 23.23777 instance/sec.
[03/29 15:02:41] epoch:[  3/100] train step:160  loss: 2.79597 lr: 0.017604 top1: 0.18589 top5: 0.61935 batch_cost: 0.70793 sec, reader_cost: 0.00030 sec, ips: 22.60106 instance/sec.
[03/29 15:02:48] epoch:[  3/100] train step:170  loss: 2.72396 lr: 0.017844 top1: 0.18750 top5: 0.56250 batch_cost: 0.71050 sec, reader_cost: 0.00028 sec, ips: 22.51925 instance/sec.
[03/29 15:02:54] epoch:[  3/100] train step:180  loss: 2.67971 lr: 0.018085 top1: 0.18737 top5: 0.56224 batch_cost: 0.50566 sec, reader_cost: 0.00029 sec, ips: 31.64151 instance/sec.
[03/29 15:02:55] END epoch:3   train loss_avg: 2.80823  top1_avg: 0.15651 top5_avg: 0.62372 avg_batch_cost: 0.50625 sec, avg_reader_cost: 0.00081 sec, batch_cost_sum: 127.48577 sec, avg_ips: 22.84176 instance/sec.
[03/29 15:02:56] epoch:[  4/100] train step:0    loss: 2.74616 lr: 0.018133 top1: 0.11287 top5: 0.63898 batch_cost: 1.06761 sec, reader_cost: 0.25935 sec, ips: 14.98679 instance/sec.
[03/29 15:03:03] epoch:[  4/100] train step:10   loss: 2.40084 lr: 0.018373 top1: 0.24963 top5: 0.74913 batch_cost: 0.68534 sec, reader_cost: 0.00034 sec, ips: 23.34615 instance/sec.
[03/29 15:03:10] epoch:[  4/100] train step:20   loss: 2.44602 lr: 0.018614 top1: 0.43689 top5: 0.87402 batch_cost: 0.70286 sec, reader_cost: 0.00037 sec, ips: 22.76406 instance/sec.
[03/29 15:03:17] epoch:[  4/100] train step:30   loss: 2.96187 lr: 0.018855 top1: 0.18146 top5: 0.37097 batch_cost: 0.70473 sec, reader_cost: 0.00043 sec, ips: 22.70365 instance/sec.
[03/29 15:03:24] epoch:[  4/100] train step:40   loss: 3.21780 lr: 0.019095 top1: 0.00000 top5: 0.47439 batch_cost: 0.70200 sec, reader_cost: 0.00033 sec, ips: 22.79206 instance/sec.
[03/29 15:03:31] epoch:[  4/100] train step:50   loss: 2.94762 lr: 0.019336 top1: 0.11307 top5: 0.61589 batch_cost: 0.69474 sec, reader_cost: 0.00030 sec, ips: 23.03016 instance/sec.
[03/29 15:03:38] epoch:[  4/100] train step:60   loss: 2.65449 lr: 0.019576 top1: 0.17900 top5: 0.70751 batch_cost: 0.69572 sec, reader_cost: 0.00032 sec, ips: 22.99764 instance/sec.
[03/29 15:03:45] epoch:[  4/100] train step:70   loss: 2.66846 lr: 0.019817 top1: 0.18302 top5: 0.60706 batch_cost: 0.67541 sec, reader_cost: 0.00029 sec, ips: 23.68942 instance/sec.
[03/29 15:03:52] epoch:[  4/100] train step:80   loss: 2.69306 lr: 0.020057 top1: 0.16884 top5: 0.68467 batch_cost: 0.70518 sec, reader_cost: 0.00037 sec, ips: 22.68939 instance/sec.
[03/29 15:03:59] epoch:[  4/100] train step:90   loss: 2.84809 lr: 0.020298 top1: 0.12186 top5: 0.73589 batch_cost: 0.69095 sec, reader_cost: 0.00029 sec, ips: 23.15662 instance/sec.
[03/29 15:04:07] epoch:[  4/100] train step:100  loss: 2.60389 lr: 0.020538 top1: 0.18067 top5: 0.71927 batch_cost: 0.70607 sec, reader_cost: 0.00029 sec, ips: 22.66076 instance/sec.
[03/29 15:04:14] epoch:[  4/100] train step:110  loss: 2.32916 lr: 0.020779 top1: 0.18750 top5: 0.93750 batch_cost: 0.70722 sec, reader_cost: 0.00038 sec, ips: 22.62384 instance/sec.
[03/29 15:04:21] epoch:[  4/100] train step:120  loss: 2.80766 lr: 0.021019 top1: 0.10466 top5: 0.51165 batch_cost: 0.69675 sec, reader_cost: 0.00036 sec, ips: 22.96384 instance/sec.
[03/29 15:04:28] epoch:[  4/100] train step:130  loss: 2.40375 lr: 0.021260 top1: 0.18750 top5: 0.81250 batch_cost: 0.71240 sec, reader_cost: 0.00030 sec, ips: 22.45939 instance/sec.
[03/29 15:04:35] epoch:[  4/100] train step:140  loss: 2.38751 lr: 0.021500 top1: 0.18695 top5: 0.87298 batch_cost: 0.69577 sec, reader_cost: 0.00035 sec, ips: 22.99601 instance/sec.
[03/29 15:04:42] epoch:[  4/100] train step:150  loss: 2.78807 lr: 0.021741 top1: 0.31084 top5: 0.64473 batch_cost: 0.70501 sec, reader_cost: 0.00038 sec, ips: 22.69477 instance/sec.
[03/29 15:04:49] epoch:[  4/100] train step:160  loss: 2.93646 lr: 0.021981 top1: 0.06250 top5: 0.47930 batch_cost: 0.69499 sec, reader_cost: 0.00040 sec, ips: 23.02196 instance/sec.
[03/29 15:04:56] epoch:[  4/100] train step:170  loss: 3.06834 lr: 0.022222 top1: 0.11915 top5: 0.47077 batch_cost: 0.68760 sec, reader_cost: 0.00029 sec, ips: 23.26927 instance/sec.
[03/29 15:05:02] epoch:[  4/100] train step:180  loss: 2.60863 lr: 0.022462 top1: 0.37459 top5: 0.68677 batch_cost: 0.50258 sec, reader_cost: 0.00035 sec, ips: 31.83548 instance/sec.
[03/29 15:05:03] END epoch:4   train loss_avg: 2.77788  top1_avg: 0.17965 top5_avg: 0.64434 avg_batch_cost: 0.50238 sec, avg_reader_cost: 0.00111 sec, batch_cost_sum: 127.76674 sec, avg_ips: 22.79153 instance/sec.
[03/29 15:05:04] epoch:[  5/100] train step:0    loss: 2.46436 lr: 0.022511 top1: 0.28157 top5: 0.68814 batch_cost: 1.07273 sec, reader_cost: 0.27534 sec, ips: 14.91516 instance/sec.
[03/29 15:05:11] epoch:[  5/100] train step:10   loss: 2.96157 lr: 0.022751 top1: 0.12500 top5: 0.52542 batch_cost: 0.70304 sec, reader_cost: 0.00030 sec, ips: 22.75834 instance/sec.
[03/29 15:05:18] epoch:[  5/100] train step:20   loss: 2.88344 lr: 0.022992 top1: 0.11494 top5: 0.61705 batch_cost: 0.69354 sec, reader_cost: 0.00030 sec, ips: 23.07000 instance/sec.
[03/29 15:05:25] epoch:[  5/100] train step:30   loss: 2.45653 lr: 0.023232 top1: 0.29804 top5: 0.77996 batch_cost: 0.69325 sec, reader_cost: 0.00033 sec, ips: 23.07964 instance/sec.
[03/29 15:05:32] epoch:[  5/100] train step:40   loss: 3.29705 lr: 0.023473 top1: 0.15335 top5: 0.37500 batch_cost: 0.69653 sec, reader_cost: 0.00031 sec, ips: 22.97103 instance/sec.
[03/29 15:05:39] epoch:[  5/100] train step:50   loss: 2.55334 lr: 0.023713 top1: 0.06249 top5: 0.62494 batch_cost: 0.68533 sec, reader_cost: 0.00038 sec, ips: 23.34626 instance/sec.
[03/29 15:05:46] epoch:[  5/100] train step:60   loss: 2.56069 lr: 0.023954 top1: 0.31244 top5: 0.81236 batch_cost: 0.69861 sec, reader_cost: 0.00036 sec, ips: 22.90247 instance/sec.
[03/29 15:05:53] epoch:[  5/100] train step:70   loss: 2.41631 lr: 0.024194 top1: 0.25000 top5: 0.81250 batch_cost: 0.69859 sec, reader_cost: 0.00034 sec, ips: 22.90331 instance/sec.
[03/29 15:06:00] epoch:[  5/100] train step:80   loss: 2.43879 lr: 0.024435 top1: 0.18728 top5: 0.74922 batch_cost: 0.74415 sec, reader_cost: 0.00030 sec, ips: 21.50117 instance/sec.
[03/29 15:06:07] epoch:[  5/100] train step:90   loss: 2.63529 lr: 0.024675 top1: 0.16135 top5: 0.74275 batch_cost: 0.72602 sec, reader_cost: 0.00034 sec, ips: 22.03793 instance/sec.
[03/29 15:06:14] epoch:[  5/100] train step:100  loss: 2.63362 lr: 0.024916 top1: 0.12347 top5: 0.67526 batch_cost: 0.68853 sec, reader_cost: 0.00030 sec, ips: 23.23782 instance/sec.
[03/29 15:06:21] epoch:[  5/100] train step:110  loss: 2.80297 lr: 0.025156 top1: 0.17319 top5: 0.58206 batch_cost: 0.70664 sec, reader_cost: 0.00028 sec, ips: 22.64226 instance/sec.
[03/29 15:06:28] epoch:[  5/100] train step:120  loss: 2.28605 lr: 0.025397 top1: 0.32967 top5: 0.87536 batch_cost: 0.69289 sec, reader_cost: 0.00030 sec, ips: 23.09163 instance/sec.
[03/29 15:06:35] epoch:[  5/100] train step:130  loss: 2.56156 lr: 0.025637 top1: 0.28904 top5: 0.57807 batch_cost: 0.70462 sec, reader_cost: 0.00030 sec, ips: 22.70713 instance/sec.
[03/29 15:06:43] epoch:[  5/100] train step:140  loss: 2.36903 lr: 0.025878 top1: 0.35608 top5: 0.89967 batch_cost: 0.72537 sec, reader_cost: 0.00034 sec, ips: 22.05778 instance/sec.
[03/29 15:06:50] epoch:[  5/100] train step:150  loss: 3.05911 lr: 0.026119 top1: 0.07252 top5: 0.72376 batch_cost: 0.70057 sec, reader_cost: 0.00030 sec, ips: 22.83870 instance/sec.
[03/29 15:06:57] epoch:[  5/100] train step:160  loss: 2.48905 lr: 0.026359 top1: 0.17872 top5: 0.53908 batch_cost: 0.69123 sec, reader_cost: 0.00038 sec, ips: 23.14719 instance/sec.
[03/29 15:07:04] epoch:[  5/100] train step:170  loss: 2.36892 lr: 0.026600 top1: 0.43746 top5: 0.68744 batch_cost: 0.69701 sec, reader_cost: 0.00031 sec, ips: 22.95531 instance/sec.
[03/29 15:07:10] epoch:[  5/100] train step:180  loss: 2.25287 lr: 0.026840 top1: 0.31048 top5: 0.86828 batch_cost: 0.50450 sec, reader_cost: 0.00030 sec, ips: 31.71447 instance/sec.
[03/29 15:07:10] END epoch:5   train loss_avg: 2.61208  top1_avg: 0.22569 top5_avg: 0.71417 avg_batch_cost: 0.50811 sec, avg_reader_cost: 0.00085 sec, batch_cost_sum: 127.79206 sec, avg_ips: 22.78702 instance/sec.
[03/29 15:07:12] epoch:[  6/100] train step:0    loss: 2.85189 lr: 0.026888 top1: 0.05819 top5: 0.66162 batch_cost: 1.07792 sec, reader_cost: 0.25637 sec, ips: 14.84345 instance/sec.
[03/29 15:07:19] epoch:[  6/100] train step:10   loss: 2.59118 lr: 0.027129 top1: 0.18718 top5: 0.68653 batch_cost: 0.70914 sec, reader_cost: 0.00033 sec, ips: 22.56264 instance/sec.
[03/29 15:07:26] epoch:[  6/100] train step:20   loss: 2.44648 lr: 0.027369 top1: 0.25000 top5: 0.68750 batch_cost: 0.70485 sec, reader_cost: 0.00040 sec, ips: 22.69974 instance/sec.
[03/29 15:07:33] epoch:[  6/100] train step:30   loss: 2.46000 lr: 0.027610 top1: 0.46711 top5: 0.64991 batch_cost: 0.70593 sec, reader_cost: 0.00032 sec, ips: 22.66514 instance/sec.
[03/29 15:07:40] epoch:[  6/100] train step:40   loss: 2.37045 lr: 0.027850 top1: 0.24869 top5: 0.87019 batch_cost: 0.70518 sec, reader_cost: 0.00033 sec, ips: 22.68922 instance/sec.
[03/29 15:07:47] epoch:[  6/100] train step:50   loss: 2.65947 lr: 0.028091 top1: 0.24880 top5: 0.62141 batch_cost: 0.70351 sec, reader_cost: 0.00032 sec, ips: 22.74322 instance/sec.
[03/29 15:07:54] epoch:[  6/100] train step:60   loss: 2.13924 lr: 0.028331 top1: 0.37500 top5: 0.87500 batch_cost: 0.70757 sec, reader_cost: 0.00029 sec, ips: 22.61270 instance/sec.
[03/29 15:08:01] epoch:[  6/100] train step:70   loss: 2.57030 lr: 0.028572 top1: 0.36615 top5: 0.85731 batch_cost: 0.71513 sec, reader_cost: 0.00032 sec, ips: 22.37341 instance/sec.
[03/29 15:08:08] epoch:[  6/100] train step:80   loss: 2.69033 lr: 0.028812 top1: 0.22627 top5: 0.62818 batch_cost: 0.70214 sec, reader_cost: 0.00031 sec, ips: 22.78755 instance/sec.
[03/29 15:08:15] epoch:[  6/100] train step:90   loss: 3.14676 lr: 0.029053 top1: 0.12500 top5: 0.37500 batch_cost: 0.71345 sec, reader_cost: 0.00028 sec, ips: 22.42612 instance/sec.
[03/29 15:08:22] epoch:[  6/100] train step:100  loss: 2.57400 lr: 0.029294 top1: 0.27330 top5: 0.71450 batch_cost: 0.69552 sec, reader_cost: 0.00034 sec, ips: 23.00439 instance/sec.
[03/29 15:08:29] epoch:[  6/100] train step:110  loss: 2.26502 lr: 0.029534 top1: 0.25000 top5: 0.87500 batch_cost: 0.68348 sec, reader_cost: 0.00028 sec, ips: 23.40957 instance/sec.
[03/29 15:08:36] epoch:[  6/100] train step:120  loss: 2.16912 lr: 0.029775 top1: 0.43747 top5: 0.87494 batch_cost: 0.68997 sec, reader_cost: 0.00028 sec, ips: 23.18951 instance/sec.
[03/29 15:08:43] epoch:[  6/100] train step:130  loss: 2.42919 lr: 0.030015 top1: 0.55789 top5: 0.86671 batch_cost: 0.70373 sec, reader_cost: 0.00032 sec, ips: 22.73591 instance/sec.
[03/29 15:08:50] epoch:[  6/100] train step:140  loss: 2.84289 lr: 0.030256 top1: 0.11736 top5: 0.72842 batch_cost: 0.70333 sec, reader_cost: 0.00050 sec, ips: 22.74904 instance/sec.
[03/29 15:08:57] epoch:[  6/100] train step:150  loss: 2.73676 lr: 0.030496 top1: 0.19671 top5: 0.55427 batch_cost: 0.68943 sec, reader_cost: 0.00035 sec, ips: 23.20769 instance/sec.
[03/29 15:09:04] epoch:[  6/100] train step:160  loss: 2.25768 lr: 0.030737 top1: 0.18646 top5: 0.87034 batch_cost: 0.68771 sec, reader_cost: 0.00030 sec, ips: 23.26553 instance/sec.
[03/29 15:09:11] epoch:[  6/100] train step:170  loss: 2.91737 lr: 0.030977 top1: 0.27496 top5: 0.69988 batch_cost: 0.69979 sec, reader_cost: 0.00029 sec, ips: 22.86403 instance/sec.
[03/29 15:09:18] epoch:[  6/100] train step:180  loss: 2.27804 lr: 0.031218 top1: 0.37494 top5: 0.87487 batch_cost: 0.50157 sec, reader_cost: 0.00033 sec, ips: 31.89988 instance/sec.
[03/29 15:09:18] END epoch:6   train loss_avg: 2.55839  top1_avg: 0.23381 top5_avg: 0.75519 avg_batch_cost: 0.50127 sec, avg_reader_cost: 0.00096 sec, batch_cost_sum: 127.49788 sec, avg_ips: 22.83960 instance/sec.
[03/29 15:09:19] epoch:[  7/100] train step:0    loss: 2.48977 lr: 0.031266 top1: 0.24999 top5: 0.74997 batch_cost: 1.03012 sec, reader_cost: 0.27165 sec, ips: 15.53212 instance/sec.
[03/29 15:09:26] epoch:[  7/100] train step:10   loss: 2.42310 lr: 0.031506 top1: 0.06237 top5: 0.87359 batch_cost: 0.71256 sec, reader_cost: 0.00036 sec, ips: 22.45424 instance/sec.
[03/29 15:09:33] epoch:[  7/100] train step:20   loss: 2.01837 lr: 0.031747 top1: 0.49996 top5: 0.93744 batch_cost: 0.69950 sec, reader_cost: 0.00040 sec, ips: 22.87335 instance/sec.
[03/29 15:09:40] epoch:[  7/100] train step:30   loss: 2.30209 lr: 0.031987 top1: 0.37500 top5: 0.81250 batch_cost: 0.70529 sec, reader_cost: 0.00035 sec, ips: 22.68578 instance/sec.
[03/29 15:09:47] epoch:[  7/100] train step:40   loss: 2.87034 lr: 0.032228 top1: 0.15818 top5: 0.69135 batch_cost: 0.69888 sec, reader_cost: 0.00028 sec, ips: 22.89368 instance/sec.
[03/29 15:09:54] epoch:[  7/100] train step:50   loss: 2.26913 lr: 0.032468 top1: 0.47789 top5: 0.79039 batch_cost: 0.70312 sec, reader_cost: 0.00043 sec, ips: 22.75566 instance/sec.
[03/29 15:10:01] epoch:[  7/100] train step:60   loss: 2.41152 lr: 0.032709 top1: 0.31139 top5: 0.81027 batch_cost: 0.70262 sec, reader_cost: 0.00034 sec, ips: 22.77177 instance/sec.
[03/29 15:10:08] epoch:[  7/100] train step:70   loss: 2.78548 lr: 0.032950 top1: 0.18713 top5: 0.62388 batch_cost: 0.68241 sec, reader_cost: 0.00029 sec, ips: 23.44620 instance/sec.
[03/29 15:10:15] epoch:[  7/100] train step:80   loss: 2.31574 lr: 0.033190 top1: 0.30918 top5: 0.93153 batch_cost: 0.71623 sec, reader_cost: 0.00031 sec, ips: 22.33933 instance/sec.
[03/29 15:10:22] epoch:[  7/100] train step:90   loss: 2.50991 lr: 0.033431 top1: 0.16940 top5: 0.82673 batch_cost: 0.71050 sec, reader_cost: 0.00037 sec, ips: 22.51945 instance/sec.
[03/29 15:10:29] epoch:[  7/100] train step:100  loss: 3.06481 lr: 0.033671 top1: 0.13734 top5: 0.57484 batch_cost: 0.68461 sec, reader_cost: 0.00030 sec, ips: 23.37083 instance/sec.
[03/29 15:10:36] epoch:[  7/100] train step:110  loss: 2.33617 lr: 0.033912 top1: 0.31196 top5: 0.87391 batch_cost: 0.70370 sec, reader_cost: 0.00033 sec, ips: 22.73691 instance/sec.
[03/29 15:10:43] epoch:[  7/100] train step:120  loss: 2.57602 lr: 0.034152 top1: 0.18742 top5: 0.74960 batch_cost: 0.69058 sec, reader_cost: 0.00030 sec, ips: 23.16905 instance/sec.
[03/29 15:10:50] epoch:[  7/100] train step:130  loss: 2.93582 lr: 0.034393 top1: 0.12500 top5: 0.47862 batch_cost: 0.72508 sec, reader_cost: 0.00031 sec, ips: 22.06649 instance/sec.
[03/29 15:10:57] epoch:[  7/100] train step:140  loss: 2.41478 lr: 0.034633 top1: 0.31250 top5: 0.68749 batch_cost: 0.71846 sec, reader_cost: 0.00030 sec, ips: 22.26971 instance/sec.
[03/29 15:11:05] epoch:[  7/100] train step:150  loss: 2.37827 lr: 0.034874 top1: 0.28179 top5: 0.86073 batch_cost: 0.70377 sec, reader_cost: 0.00029 sec, ips: 22.73455 instance/sec.
[03/29 15:11:12] epoch:[  7/100] train step:160  loss: 2.14180 lr: 0.035114 top1: 0.24969 top5: 0.93648 batch_cost: 0.70252 sec, reader_cost: 0.00033 sec, ips: 22.77517 instance/sec.
[03/29 15:11:19] epoch:[  7/100] train step:170  loss: 2.74523 lr: 0.035355 top1: 0.32059 top5: 0.59682 batch_cost: 0.69936 sec, reader_cost: 0.00028 sec, ips: 22.87797 instance/sec.
[03/29 15:11:25] epoch:[  7/100] train step:180  loss: 2.27444 lr: 0.035595 top1: 0.31150 top5: 0.74599 batch_cost: 0.50434 sec, reader_cost: 0.00027 sec, ips: 31.72433 instance/sec.
[03/29 15:11:25] END epoch:7   train loss_avg: 2.51643  top1_avg: 0.25764 top5_avg: 0.76191 avg_batch_cost: 0.50491 sec, avg_reader_cost: 0.00087 sec, batch_cost_sum: 127.37218 sec, avg_ips: 22.86214 instance/sec.
[03/29 15:11:27] epoch:[  8/100] train step:0    loss: 2.44114 lr: 0.035643 top1: 0.06245 top5: 0.99927 batch_cost: 1.07130 sec, reader_cost: 0.26484 sec, ips: 14.93513 instance/sec.
[03/29 15:11:34] epoch:[  8/100] train step:10   loss: 2.46820 lr: 0.035884 top1: 0.18750 top5: 0.81250 batch_cost: 0.71347 sec, reader_cost: 0.00030 sec, ips: 22.42572 instance/sec.
[03/29 15:11:41] epoch:[  8/100] train step:20   loss: 2.56442 lr: 0.036125 top1: 0.24996 top5: 0.74983 batch_cost: 0.68839 sec, reader_cost: 0.00031 sec, ips: 23.24254 instance/sec.
[03/29 15:11:48] epoch:[  8/100] train step:30   loss: 2.50068 lr: 0.036365 top1: 0.29695 top5: 0.69938 batch_cost: 0.70363 sec, reader_cost: 0.00032 sec, ips: 22.73921 instance/sec.
[03/29 15:11:55] epoch:[  8/100] train step:40   loss: 2.38977 lr: 0.036606 top1: 0.31250 top5: 0.68750 batch_cost: 0.69178 sec, reader_cost: 0.00030 sec, ips: 23.12873 instance/sec.
[03/29 15:12:02] epoch:[  8/100] train step:50   loss: 2.20239 lr: 0.036846 top1: 0.43552 top5: 0.87188 batch_cost: 0.69028 sec, reader_cost: 0.00032 sec, ips: 23.17887 instance/sec.
[03/29 15:12:09] epoch:[  8/100] train step:60   loss: 2.08576 lr: 0.037087 top1: 0.37495 top5: 0.93741 batch_cost: 0.70834 sec, reader_cost: 0.00032 sec, ips: 22.58799 instance/sec.
[03/29 15:12:16] epoch:[  8/100] train step:70   loss: 2.49606 lr: 0.037327 top1: 0.40172 top5: 0.64457 batch_cost: 0.69600 sec, reader_cost: 0.00026 sec, ips: 22.98838 instance/sec.
[03/29 15:12:23] epoch:[  8/100] train step:80   loss: 2.30188 lr: 0.037568 top1: 0.18589 top5: 0.92145 batch_cost: 0.71637 sec, reader_cost: 0.00032 sec, ips: 22.33491 instance/sec.
[03/29 15:12:30] epoch:[  8/100] train step:90   loss: 2.07550 lr: 0.037808 top1: 0.43109 top5: 0.98902 batch_cost: 0.71585 sec, reader_cost: 0.00037 sec, ips: 22.35118 instance/sec.
[03/29 15:12:37] epoch:[  8/100] train step:100  loss: 2.33710 lr: 0.038049 top1: 0.23549 top5: 0.81698 batch_cost: 0.70628 sec, reader_cost: 0.00037 sec, ips: 22.65400 instance/sec.
[03/29 15:12:44] epoch:[  8/100] train step:110  loss: 2.65618 lr: 0.038289 top1: 0.18741 top5: 0.49972 batch_cost: 0.70001 sec, reader_cost: 0.00039 sec, ips: 22.85681 instance/sec.
[03/29 15:12:51] epoch:[  8/100] train step:120  loss: 2.16781 lr: 0.038530 top1: 0.43629 top5: 0.93468 batch_cost: 0.69094 sec, reader_cost: 0.00029 sec, ips: 23.15701 instance/sec.
[03/29 15:12:58] epoch:[  8/100] train step:130  loss: 2.27023 lr: 0.038770 top1: 0.24993 top5: 0.87484 batch_cost: 0.69355 sec, reader_cost: 0.00034 sec, ips: 23.06956 instance/sec.
[03/29 15:13:05] epoch:[  8/100] train step:140  loss: 2.58522 lr: 0.039011 top1: 0.12500 top5: 0.83108 batch_cost: 0.70103 sec, reader_cost: 0.00033 sec, ips: 22.82362 instance/sec.
[03/29 15:13:12] epoch:[  8/100] train step:150  loss: 2.25197 lr: 0.039251 top1: 0.31206 top5: 0.99869 batch_cost: 0.71337 sec, reader_cost: 0.00036 sec, ips: 22.42887 instance/sec.
[03/29 15:13:19] epoch:[  8/100] train step:160  loss: 2.27115 lr: 0.039492 top1: 0.37476 top5: 0.74962 batch_cost: 0.70206 sec, reader_cost: 0.00030 sec, ips: 22.79006 instance/sec.
[03/29 15:13:26] epoch:[  8/100] train step:170  loss: 3.12523 lr: 0.039732 top1: 0.08068 top5: 0.51419 batch_cost: 0.72897 sec, reader_cost: 0.00031 sec, ips: 21.94865 instance/sec.
[03/29 15:13:33] epoch:[  8/100] train step:180  loss: 2.67082 lr: 0.039973 top1: 0.20059 top5: 0.66765 batch_cost: 0.50181 sec, reader_cost: 0.00034 sec, ips: 31.88471 instance/sec.
[03/29 15:13:33] END epoch:8   train loss_avg: 2.48675  top1_avg: 0.25453 top5_avg: 0.77283 avg_batch_cost: 0.50221 sec, avg_reader_cost: 0.00091 sec, batch_cost_sum: 127.51958 sec, avg_ips: 22.83571 instance/sec.
[03/29 15:13:34] epoch:[  9/100] train step:0    loss: 2.12341 lr: 0.040021 top1: 0.42191 top5: 0.85421 batch_cost: 1.06815 sec, reader_cost: 0.26195 sec, ips: 14.97913 instance/sec.
[03/29 15:13:41] epoch:[  9/100] train step:10   loss: 2.51564 lr: 0.040262 top1: 0.36636 top5: 0.85772 batch_cost: 0.70063 sec, reader_cost: 0.00031 sec, ips: 22.83657 instance/sec.
[03/29 15:13:48] epoch:[  9/100] train step:20   loss: 3.25764 lr: 0.040502 top1: 0.15691 top5: 0.43883 batch_cost: 0.69377 sec, reader_cost: 0.00034 sec, ips: 23.06234 instance/sec.
[03/29 15:13:55] epoch:[  9/100] train step:30   loss: 2.48742 lr: 0.040743 top1: 0.36467 top5: 0.84401 batch_cost: 0.69796 sec, reader_cost: 0.00035 sec, ips: 22.92406 instance/sec.
[03/29 15:14:02] epoch:[  9/100] train step:40   loss: 2.45128 lr: 0.040983 top1: 0.18750 top5: 0.87500 batch_cost: 0.70433 sec, reader_cost: 0.00030 sec, ips: 22.71656 instance/sec.
[03/29 15:14:10] epoch:[  9/100] train step:50   loss: 2.33907 lr: 0.041224 top1: 0.24772 top5: 0.80623 batch_cost: 0.70531 sec, reader_cost: 0.00029 sec, ips: 22.68511 instance/sec.
[03/29 15:14:17] epoch:[  9/100] train step:60   loss: 2.26202 lr: 0.041464 top1: 0.25000 top5: 0.93750 batch_cost: 0.71487 sec, reader_cost: 0.00039 sec, ips: 22.38162 instance/sec.
[03/29 15:14:24] epoch:[  9/100] train step:70   loss: 2.24093 lr: 0.041705 top1: 0.24323 top5: 0.79558 batch_cost: 0.71356 sec, reader_cost: 0.00027 sec, ips: 22.42273 instance/sec.
[03/29 15:14:31] epoch:[  9/100] train step:80   loss: 2.38097 lr: 0.041945 top1: 0.36257 top5: 0.83259 batch_cost: 0.70034 sec, reader_cost: 0.00027 sec, ips: 22.84605 instance/sec.
[03/29 15:14:38] epoch:[  9/100] train step:90   loss: 1.97282 lr: 0.042186 top1: 0.68402 top5: 0.87036 batch_cost: 0.72411 sec, reader_cost: 0.00029 sec, ips: 22.09601 instance/sec.
[03/29 15:14:45] epoch:[  9/100] train step:100  loss: 2.76908 lr: 0.042426 top1: 0.24999 top5: 0.68749 batch_cost: 0.70043 sec, reader_cost: 0.00027 sec, ips: 22.84327 instance/sec.
[03/29 15:14:52] epoch:[  9/100] train step:110  loss: 2.18999 lr: 0.042667 top1: 0.37494 top5: 0.87483 batch_cost: 0.71288 sec, reader_cost: 0.00032 sec, ips: 22.44418 instance/sec.
[03/29 15:14:59] epoch:[  9/100] train step:120  loss: 2.14022 lr: 0.042907 top1: 0.18746 top5: 0.93736 batch_cost: 0.68995 sec, reader_cost: 0.00033 sec, ips: 23.19009 instance/sec.
[03/29 15:15:06] epoch:[  9/100] train step:130  loss: 2.31431 lr: 0.043148 top1: 0.25000 top5: 0.81250 batch_cost: 0.68054 sec, reader_cost: 0.00033 sec, ips: 23.51077 instance/sec.
[03/29 15:15:13] epoch:[  9/100] train step:140  loss: 2.34000 lr: 0.043389 top1: 0.24998 top5: 0.81242 batch_cost: 0.71420 sec, reader_cost: 0.00031 sec, ips: 22.40262 instance/sec.
[03/29 15:15:20] epoch:[  9/100] train step:150  loss: 2.33155 lr: 0.043629 top1: 0.29174 top5: 0.91698 batch_cost: 0.70514 sec, reader_cost: 0.00043 sec, ips: 22.69059 instance/sec.
[03/29 15:15:27] epoch:[  9/100] train step:160  loss: 2.34514 lr: 0.043870 top1: 0.30793 top5: 0.80107 batch_cost: 0.69373 sec, reader_cost: 0.00035 sec, ips: 23.06369 instance/sec.
[03/29 15:15:34] epoch:[  9/100] train step:170  loss: 2.52198 lr: 0.044110 top1: 0.38779 top5: 0.72551 batch_cost: 0.67962 sec, reader_cost: 0.00030 sec, ips: 23.54274 instance/sec.
[03/29 15:15:40] epoch:[  9/100] train step:180  loss: 2.46512 lr: 0.044351 top1: 0.12352 top5: 0.80437 batch_cost: 0.50476 sec, reader_cost: 0.00033 sec, ips: 31.69841 instance/sec.
[03/29 15:15:40] END epoch:9   train loss_avg: 2.47134  top1_avg: 0.26852 top5_avg: 0.77996 avg_batch_cost: 0.50409 sec, avg_reader_cost: 0.00078 sec, batch_cost_sum: 127.37165 sec, avg_ips: 22.86223 instance/sec.
[03/29 15:15:42] epoch:[ 10/100] train step:0    loss: 2.96107 lr: 0.044399 top1: 0.11891 top5: 0.54424 batch_cost: 1.10587 sec, reader_cost: 0.26607 sec, ips: 14.46825 instance/sec.
[03/29 15:15:49] epoch:[ 10/100] train step:10   loss: 2.23709 lr: 0.044639 top1: 0.24998 top5: 0.81245 batch_cost: 0.69632 sec, reader_cost: 0.00033 sec, ips: 22.97799 instance/sec.
[03/29 15:15:56] epoch:[ 10/100] train step:20   loss: 2.52222 lr: 0.044880 top1: 0.26982 top5: 0.66465 batch_cost: 0.70230 sec, reader_cost: 0.00032 sec, ips: 22.78237 instance/sec.
[03/29 15:16:03] epoch:[ 10/100] train step:30   loss: 2.21324 lr: 0.045120 top1: 0.31150 top5: 0.81031 batch_cost: 0.68848 sec, reader_cost: 0.00031 sec, ips: 23.23970 instance/sec.
[03/29 15:16:10] epoch:[ 10/100] train step:40   loss: 2.25340 lr: 0.045361 top1: 0.37500 top5: 0.81250 batch_cost: 0.70449 sec, reader_cost: 0.00032 sec, ips: 22.71134 instance/sec.
[03/29 15:16:17] epoch:[ 10/100] train step:50   loss: 2.23045 lr: 0.045601 top1: 0.37443 top5: 0.81135 batch_cost: 0.68609 sec, reader_cost: 0.00031 sec, ips: 23.32069 instance/sec.
[03/29 15:16:24] epoch:[ 10/100] train step:60   loss: 2.96213 lr: 0.045842 top1: 0.16072 top5: 0.52233 batch_cost: 0.71334 sec, reader_cost: 0.00030 sec, ips: 22.42967 instance/sec.
[03/29 15:16:31] epoch:[ 10/100] train step:70   loss: 2.16059 lr: 0.046082 top1: 0.29633 top5: 0.95552 batch_cost: 0.70037 sec, reader_cost: 0.00028 sec, ips: 22.84511 instance/sec.
[03/29 15:16:38] epoch:[ 10/100] train step:80   loss: 2.69990 lr: 0.046323 top1: 0.22429 top5: 0.74821 batch_cost: 0.71538 sec, reader_cost: 0.00037 sec, ips: 22.36570 instance/sec.
[03/29 15:16:45] epoch:[ 10/100] train step:90   loss: 2.66676 lr: 0.046564 top1: 0.21580 top5: 0.69281 batch_cost: 0.71799 sec, reader_cost: 0.00036 sec, ips: 22.28433 instance/sec.
[03/29 15:16:52] epoch:[ 10/100] train step:100  loss: 2.41354 lr: 0.046804 top1: 0.24999 top5: 0.87497 batch_cost: 0.70958 sec, reader_cost: 0.00035 sec, ips: 22.54862 instance/sec.
[03/29 15:16:59] epoch:[ 10/100] train step:110  loss: 2.38536 lr: 0.047045 top1: 0.41409 top5: 0.88482 batch_cost: 0.71506 sec, reader_cost: 0.00030 sec, ips: 22.37582 instance/sec.
[03/29 15:17:06] epoch:[ 10/100] train step:120  loss: 2.52977 lr: 0.047285 top1: 0.30121 top5: 0.72742 batch_cost: 0.69705 sec, reader_cost: 0.00041 sec, ips: 22.95383 instance/sec.
[03/29 15:17:13] epoch:[ 10/100] train step:130  loss: 2.14806 lr: 0.047526 top1: 0.43750 top5: 0.93750 batch_cost: 0.71028 sec, reader_cost: 0.00031 sec, ips: 22.52634 instance/sec.
[03/29 15:17:20] epoch:[ 10/100] train step:140  loss: 2.85185 lr: 0.047766 top1: 0.17275 top5: 0.69797 batch_cost: 0.71956 sec, reader_cost: 0.00034 sec, ips: 22.23584 instance/sec.
[03/29 15:17:27] epoch:[ 10/100] train step:150  loss: 2.76896 lr: 0.048007 top1: 0.14693 top5: 0.69080 batch_cost: 0.69619 sec, reader_cost: 0.00034 sec, ips: 22.98232 instance/sec.
[03/29 15:17:34] epoch:[ 10/100] train step:160  loss: 2.35859 lr: 0.048247 top1: 0.44712 top5: 0.79190 batch_cost: 0.69901 sec, reader_cost: 0.00031 sec, ips: 22.88967 instance/sec.
[03/29 15:17:41] epoch:[ 10/100] train step:170  loss: 2.37111 lr: 0.048488 top1: 0.12403 top5: 0.93071 batch_cost: 0.69868 sec, reader_cost: 0.00032 sec, ips: 22.90018 instance/sec.
[03/29 15:17:48] epoch:[ 10/100] train step:180  loss: 2.12248 lr: 0.048728 top1: 0.30482 top5: 0.91446 batch_cost: 0.50354 sec, reader_cost: 0.00030 sec, ips: 31.77483 instance/sec.
[03/29 15:17:48] END epoch:10  train loss_avg: 2.47666  top1_avg: 0.26287 top5_avg: 0.76981 avg_batch_cost: 0.50302 sec, avg_reader_cost: 0.00086 sec, batch_cost_sum: 127.64865 sec, avg_ips: 22.81262 instance/sec.
[03/29 15:17:49] epoch:[ 11/100] train step:0    loss: 2.50177 lr: 0.048776 top1: 0.22239 top5: 0.73659 batch_cost: 1.07179 sec, reader_cost: 0.26938 sec, ips: 14.92834 instance/sec.
[03/29 15:17:57] epoch:[ 11/100] train step:10   loss: 2.30357 lr: 0.048763 top1: 0.18749 top5: 0.81245 batch_cost: 0.70079 sec, reader_cost: 0.00032 sec, ips: 22.83136 instance/sec.
[03/29 15:18:04] epoch:[ 11/100] train step:20   loss: 2.46937 lr: 0.048750 top1: 0.18685 top5: 0.74804 batch_cost: 0.70470 sec, reader_cost: 0.00029 sec, ips: 22.70458 instance/sec.
[03/29 15:18:11] epoch:[ 11/100] train step:30   loss: 2.50641 lr: 0.048736 top1: 0.18749 top5: 0.81246 batch_cost: 0.70733 sec, reader_cost: 0.00034 sec, ips: 22.62037 instance/sec.
[03/29 15:18:18] epoch:[ 11/100] train step:40   loss: 2.98757 lr: 0.048723 top1: 0.11354 top5: 0.55625 batch_cost: 0.70535 sec, reader_cost: 0.00029 sec, ips: 22.68367 instance/sec.
[03/29 15:18:25] epoch:[ 11/100] train step:50   loss: 2.14526 lr: 0.048709 top1: 0.49998 top5: 0.87497 batch_cost: 0.70087 sec, reader_cost: 0.00035 sec, ips: 22.82889 instance/sec.
[03/29 15:18:32] epoch:[ 11/100] train step:60   loss: 2.45612 lr: 0.048695 top1: 0.29120 top5: 0.75925 batch_cost: 0.70693 sec, reader_cost: 0.00029 sec, ips: 22.63315 instance/sec.
[03/29 15:18:39] epoch:[ 11/100] train step:70   loss: 2.28251 lr: 0.048681 top1: 0.24959 top5: 0.87390 batch_cost: 0.70870 sec, reader_cost: 0.00030 sec, ips: 22.57653 instance/sec.
[03/29 15:18:46] epoch:[ 11/100] train step:80   loss: 2.96637 lr: 0.048667 top1: 0.32829 top5: 0.59408 batch_cost: 0.68660 sec, reader_cost: 0.00036 sec, ips: 23.30312 instance/sec.
[03/29 15:18:53] epoch:[ 11/100] train step:90   loss: 2.45896 lr: 0.048654 top1: 0.12500 top5: 0.74997 batch_cost: 0.70486 sec, reader_cost: 0.00029 sec, ips: 22.69938 instance/sec.
[03/29 15:19:00] epoch:[ 11/100] train step:100  loss: 2.16265 lr: 0.048640 top1: 0.24781 top5: 0.80647 batch_cost: 0.70654 sec, reader_cost: 0.00029 sec, ips: 22.64549 instance/sec.
[03/29 15:19:07] epoch:[ 11/100] train step:110  loss: 2.00276 lr: 0.048625 top1: 0.24999 top5: 0.87496 batch_cost: 0.70011 sec, reader_cost: 0.00034 sec, ips: 22.85354 instance/sec.
[03/29 15:19:14] epoch:[ 11/100] train step:120  loss: 2.34420 lr: 0.048611 top1: 0.18488 top5: 0.92878 batch_cost: 0.70190 sec, reader_cost: 0.00026 sec, ips: 22.79522 instance/sec.
[03/29 15:19:21] epoch:[ 11/100] train step:130  loss: 1.88093 lr: 0.048597 top1: 0.37461 top5: 0.99908 batch_cost: 0.71069 sec, reader_cost: 0.00028 sec, ips: 22.51349 instance/sec.
[03/29 15:19:28] epoch:[ 11/100] train step:140  loss: 2.42606 lr: 0.048583 top1: 0.29166 top5: 0.76561 batch_cost: 0.70561 sec, reader_cost: 0.00032 sec, ips: 22.67538 instance/sec.
[03/29 15:19:35] epoch:[ 11/100] train step:150  loss: 2.17910 lr: 0.048568 top1: 0.30683 top5: 0.85230 batch_cost: 0.72002 sec, reader_cost: 0.00028 sec, ips: 22.22168 instance/sec.
[03/29 15:19:42] epoch:[ 11/100] train step:160  loss: 2.05450 lr: 0.048554 top1: 0.56250 top5: 0.93750 batch_cost: 0.70115 sec, reader_cost: 0.00037 sec, ips: 22.81957 instance/sec.
[03/29 15:19:49] epoch:[ 11/100] train step:170  loss: 2.73806 lr: 0.048540 top1: 0.27783 top5: 0.78202 batch_cost: 0.71803 sec, reader_cost: 0.00029 sec, ips: 22.28324 instance/sec.
[03/29 15:19:55] epoch:[ 11/100] train step:180  loss: 2.40456 lr: 0.048525 top1: 0.30303 top5: 0.85370 batch_cost: 0.50031 sec, reader_cost: 0.00030 sec, ips: 31.98040 instance/sec.
[03/29 15:19:56] END epoch:11  train loss_avg: 2.44419  top1_avg: 0.27968 top5_avg: 0.78469 avg_batch_cost: 0.50109 sec, avg_reader_cost: 0.00085 sec, batch_cost_sum: 127.39963 sec, avg_ips: 22.85721 instance/sec.
[03/29 15:19:57] epoch:[ 12/100] train step:0    loss: 2.34270 lr: 0.048522 top1: 0.37288 top5: 0.80785 batch_cost: 1.06039 sec, reader_cost: 0.27052 sec, ips: 15.08877 instance/sec.
[03/29 15:20:04] epoch:[ 12/100] train step:10   loss: 2.24573 lr: 0.048507 top1: 0.25000 top5: 0.81250 batch_cost: 0.69120 sec, reader_cost: 0.00034 sec, ips: 23.14829 instance/sec.
[03/29 15:20:11] epoch:[ 12/100] train step:20   loss: 2.13167 lr: 0.048493 top1: 0.36477 top5: 0.91875 batch_cost: 0.69624 sec, reader_cost: 0.00028 sec, ips: 22.98063 instance/sec.
[03/29 15:20:18] epoch:[ 12/100] train step:30   loss: 2.04649 lr: 0.048478 top1: 0.30890 top5: 0.86599 batch_cost: 0.69645 sec, reader_cost: 0.00043 sec, ips: 22.97363 instance/sec.
[03/29 15:20:25] epoch:[ 12/100] train step:40   loss: 2.23277 lr: 0.048463 top1: 0.18750 top5: 0.93749 batch_cost: 0.69281 sec, reader_cost: 0.00032 sec, ips: 23.09439 instance/sec.
[03/29 15:20:32] epoch:[ 12/100] train step:50   loss: 2.26164 lr: 0.048448 top1: 0.24998 top5: 0.87493 batch_cost: 0.72817 sec, reader_cost: 0.00029 sec, ips: 21.97301 instance/sec.
[03/29 15:20:39] epoch:[ 12/100] train step:60   loss: 2.25266 lr: 0.048433 top1: 0.36681 top5: 0.80022 batch_cost: 0.70896 sec, reader_cost: 0.00042 sec, ips: 22.56815 instance/sec.
[03/29 15:20:46] epoch:[ 12/100] train step:70   loss: 2.47188 lr: 0.048418 top1: 0.17410 top5: 0.66070 batch_cost: 0.70223 sec, reader_cost: 0.00028 sec, ips: 22.78453 instance/sec.
[03/29 15:20:54] epoch:[ 12/100] train step:80   loss: 2.20363 lr: 0.048403 top1: 0.37496 top5: 0.87487 batch_cost: 0.73253 sec, reader_cost: 0.00029 sec, ips: 21.84199 instance/sec.
[03/29 15:21:01] epoch:[ 12/100] train step:90   loss: 2.41825 lr: 0.048388 top1: 0.30803 top5: 0.74196 batch_cost: 0.69405 sec, reader_cost: 0.00029 sec, ips: 23.05312 instance/sec.
[03/29 15:21:08] epoch:[ 12/100] train step:100  loss: 2.84730 lr: 0.048372 top1: 0.23188 top5: 0.64128 batch_cost: 0.70100 sec, reader_cost: 0.00030 sec, ips: 22.82440 instance/sec.
[03/29 15:21:15] epoch:[ 12/100] train step:110  loss: 2.14583 lr: 0.048357 top1: 0.37411 top5: 0.93572 batch_cost: 0.71812 sec, reader_cost: 0.00030 sec, ips: 22.28042 instance/sec.
[03/29 15:21:22] epoch:[ 12/100] train step:120  loss: 2.24149 lr: 0.048342 top1: 0.56249 top5: 0.93749 batch_cost: 0.70294 sec, reader_cost: 0.00031 sec, ips: 22.76158 instance/sec.
[03/29 15:21:29] epoch:[ 12/100] train step:130  loss: 2.24358 lr: 0.048326 top1: 0.31225 top5: 0.93691 batch_cost: 0.68991 sec, reader_cost: 0.00028 sec, ips: 23.19149 instance/sec.
[03/29 15:21:36] epoch:[ 12/100] train step:140  loss: 2.29778 lr: 0.048311 top1: 0.18639 top5: 0.86275 batch_cost: 0.71049 sec, reader_cost: 0.00036 sec, ips: 22.51966 instance/sec.
[03/29 15:21:43] epoch:[ 12/100] train step:150  loss: 2.29803 lr: 0.048295 top1: 0.18732 top5: 0.81182 batch_cost: 0.69540 sec, reader_cost: 0.00027 sec, ips: 23.00829 instance/sec.
[03/29 15:21:50] epoch:[ 12/100] train step:160  loss: 2.74029 lr: 0.048279 top1: 0.27945 top5: 0.58836 batch_cost: 0.74125 sec, reader_cost: 0.00038 sec, ips: 21.58526 instance/sec.
[03/29 15:21:57] epoch:[ 12/100] train step:170  loss: 2.17319 lr: 0.048263 top1: 0.31250 top5: 1.00000 batch_cost: 0.70173 sec, reader_cost: 0.00035 sec, ips: 22.80064 instance/sec.
[03/29 15:22:03] epoch:[ 12/100] train step:180  loss: 2.51104 lr: 0.048248 top1: 0.32995 top5: 0.81536 batch_cost: 0.50048 sec, reader_cost: 0.00034 sec, ips: 31.96899 instance/sec.
[03/29 15:22:04] END epoch:12  train loss_avg: 2.40610  top1_avg: 0.28942 top5_avg: 0.79410 avg_batch_cost: 0.50189 sec, avg_reader_cost: 0.00104 sec, batch_cost_sum: 127.84539 sec, avg_ips: 22.77751 instance/sec.
[03/29 15:22:05] epoch:[ 13/100] train step:0    loss: 2.62288 lr: 0.048244 top1: 0.21024 top5: 0.72155 batch_cost: 1.05660 sec, reader_cost: 0.27271 sec, ips: 15.14284 instance/sec.
[03/29 15:22:12] epoch:[ 13/100] train step:10   loss: 2.80555 lr: 0.048228 top1: 0.14584 top5: 0.75001 batch_cost: 0.70311 sec, reader_cost: 0.00046 sec, ips: 22.75603 instance/sec.
[03/29 15:22:19] epoch:[ 13/100] train step:20   loss: 2.68370 lr: 0.048213 top1: 0.21862 top5: 0.74974 batch_cost: 0.68662 sec, reader_cost: 0.00029 sec, ips: 23.30264 instance/sec.
[03/29 15:22:26] epoch:[ 13/100] train step:30   loss: 1.89672 lr: 0.048196 top1: 0.42875 top5: 0.98105 batch_cost: 0.71042 sec, reader_cost: 0.00035 sec, ips: 22.52175 instance/sec.
[03/29 15:22:33] epoch:[ 13/100] train step:40   loss: 2.25122 lr: 0.048180 top1: 0.30972 top5: 0.80231 batch_cost: 0.72042 sec, reader_cost: 0.00034 sec, ips: 22.20935 instance/sec.
[03/29 15:22:40] epoch:[ 13/100] train step:50   loss: 2.09944 lr: 0.048164 top1: 0.56250 top5: 0.81250 batch_cost: 0.70081 sec, reader_cost: 0.00031 sec, ips: 22.83072 instance/sec.
[03/29 15:22:47] epoch:[ 13/100] train step:60   loss: 2.68522 lr: 0.048148 top1: 0.14655 top5: 0.73167 batch_cost: 0.69789 sec, reader_cost: 0.00030 sec, ips: 22.92612 instance/sec.
[03/29 15:22:54] epoch:[ 13/100] train step:70   loss: 2.16664 lr: 0.048132 top1: 0.48979 top5: 0.91853 batch_cost: 0.72295 sec, reader_cost: 0.00031 sec, ips: 22.13148 instance/sec.
[03/29 15:23:01] epoch:[ 13/100] train step:80   loss: 2.33765 lr: 0.048115 top1: 0.43749 top5: 0.81249 batch_cost: 0.69810 sec, reader_cost: 0.00031 sec, ips: 22.91927 instance/sec.
[03/29 15:23:08] epoch:[ 13/100] train step:90   loss: 2.73813 lr: 0.048099 top1: 0.22141 top5: 0.78026 batch_cost: 0.71928 sec, reader_cost: 0.00029 sec, ips: 22.24458 instance/sec.
[03/29 15:23:15] epoch:[ 13/100] train step:100  loss: 2.02671 lr: 0.048082 top1: 0.43676 top5: 0.99852 batch_cost: 0.72687 sec, reader_cost: 0.00034 sec, ips: 22.01223 instance/sec.
[03/29 15:23:22] epoch:[ 13/100] train step:110  loss: 2.81072 lr: 0.048065 top1: 0.25000 top5: 0.67396 batch_cost: 0.68568 sec, reader_cost: 0.00038 sec, ips: 23.33464 instance/sec.
[03/29 15:23:29] epoch:[ 13/100] train step:120  loss: 2.44214 lr: 0.048049 top1: 0.44948 top5: 0.85667 batch_cost: 0.69001 sec, reader_cost: 0.00035 sec, ips: 23.18806 instance/sec.
[03/29 15:23:36] epoch:[ 13/100] train step:130  loss: 2.36915 lr: 0.048032 top1: 0.18594 top5: 0.86722 batch_cost: 0.70887 sec, reader_cost: 0.00033 sec, ips: 22.57108 instance/sec.
[03/29 15:23:44] epoch:[ 13/100] train step:140  loss: 2.00205 lr: 0.048015 top1: 0.49937 top5: 0.87415 batch_cost: 0.68674 sec, reader_cost: 0.00032 sec, ips: 23.29840 instance/sec.
[03/29 15:23:51] epoch:[ 13/100] train step:150  loss: 2.50445 lr: 0.047998 top1: 0.25000 top5: 0.81250 batch_cost: 0.72046 sec, reader_cost: 0.00028 sec, ips: 22.20794 instance/sec.
[03/29 15:23:58] epoch:[ 13/100] train step:160  loss: 2.01490 lr: 0.047981 top1: 0.43750 top5: 0.81250 batch_cost: 0.70387 sec, reader_cost: 0.00026 sec, ips: 22.73160 instance/sec.
[03/29 15:24:05] epoch:[ 13/100] train step:170  loss: 3.32327 lr: 0.047964 top1: 0.06250 top5: 0.58697 batch_cost: 0.70310 sec, reader_cost: 0.00032 sec, ips: 22.75645 instance/sec.
[03/29 15:24:11] epoch:[ 13/100] train step:180  loss: 2.75158 lr: 0.047947 top1: 0.21913 top5: 0.66768 batch_cost: 0.50247 sec, reader_cost: 0.00030 sec, ips: 31.84261 instance/sec.
[03/29 15:24:12] END epoch:13  train loss_avg: 2.36569  top1_avg: 0.31106 top5_avg: 0.81313 avg_batch_cost: 0.50191 sec, avg_reader_cost: 0.00084 sec, batch_cost_sum: 127.57903 sec, avg_ips: 22.82507 instance/sec.
[03/29 15:24:13] epoch:[ 14/100] train step:0    loss: 2.13885 lr: 0.047944 top1: 0.23971 top5: 0.95370 batch_cost: 1.07979 sec, reader_cost: 0.26358 sec, ips: 14.81771 instance/sec.
[03/29 15:24:20] epoch:[ 14/100] train step:10   loss: 2.80225 lr: 0.047927 top1: 0.23444 top5: 0.62132 batch_cost: 0.70389 sec, reader_cost: 0.00030 sec, ips: 22.73072 instance/sec.
[03/29 15:24:27] epoch:[ 14/100] train step:20   loss: 2.21788 lr: 0.047909 top1: 0.31250 top5: 0.75000 batch_cost: 0.69107 sec, reader_cost: 0.00031 sec, ips: 23.15234 instance/sec.
[03/29 15:24:34] epoch:[ 14/100] train step:30   loss: 2.24101 lr: 0.047892 top1: 0.30933 top5: 0.80553 batch_cost: 0.71783 sec, reader_cost: 0.00032 sec, ips: 22.28937 instance/sec.
[03/29 15:24:41] epoch:[ 14/100] train step:40   loss: 2.69538 lr: 0.047875 top1: 0.29935 top5: 0.68012 batch_cost: 0.68265 sec, reader_cost: 0.00030 sec, ips: 23.43822 instance/sec.
[03/29 15:24:48] epoch:[ 14/100] train step:50   loss: 2.37307 lr: 0.047857 top1: 0.41123 top5: 0.83098 batch_cost: 0.69714 sec, reader_cost: 0.00030 sec, ips: 22.95105 instance/sec.
[03/29 15:24:55] epoch:[ 14/100] train step:60   loss: 2.31909 lr: 0.047840 top1: 0.12500 top5: 0.91260 batch_cost: 0.69593 sec, reader_cost: 0.00034 sec, ips: 22.99079 instance/sec.
[03/29 15:25:02] epoch:[ 14/100] train step:70   loss: 2.88072 lr: 0.047822 top1: 0.27741 top5: 0.61431 batch_cost: 0.72125 sec, reader_cost: 0.00030 sec, ips: 22.18380 instance/sec.
[03/29 15:25:09] epoch:[ 14/100] train step:80   loss: 1.82962 lr: 0.047805 top1: 0.49984 top5: 0.99976 batch_cost: 0.72015 sec, reader_cost: 0.00028 sec, ips: 22.21755 instance/sec.
[03/29 15:25:17] epoch:[ 14/100] train step:90   loss: 2.23941 lr: 0.047787 top1: 0.37480 top5: 0.87464 batch_cost: 0.69583 sec, reader_cost: 0.00036 sec, ips: 22.99406 instance/sec.
[03/29 15:25:23] epoch:[ 14/100] train step:100  loss: 2.20035 lr: 0.047769 top1: 0.37470 top5: 0.87417 batch_cost: 0.69534 sec, reader_cost: 0.00027 sec, ips: 23.01034 instance/sec.
[03/29 15:25:30] epoch:[ 14/100] train step:110  loss: 2.75755 lr: 0.047751 top1: 0.11804 top5: 0.81934 batch_cost: 0.68740 sec, reader_cost: 0.00034 sec, ips: 23.27604 instance/sec.
[03/29 15:25:37] epoch:[ 14/100] train step:120  loss: 2.41216 lr: 0.047733 top1: 0.38373 top5: 0.77515 batch_cost: 0.69971 sec, reader_cost: 0.00034 sec, ips: 22.86665 instance/sec.
[03/29 15:25:45] epoch:[ 14/100] train step:130  loss: 2.14232 lr: 0.047715 top1: 0.31249 top5: 0.99996 batch_cost: 0.70626 sec, reader_cost: 0.00031 sec, ips: 22.65456 instance/sec.
[03/29 15:25:51] epoch:[ 14/100] train step:140  loss: 3.02582 lr: 0.047697 top1: 0.24372 top5: 0.68750 batch_cost: 0.68608 sec, reader_cost: 0.00034 sec, ips: 23.32094 instance/sec.
[03/29 15:25:58] epoch:[ 14/100] train step:150  loss: 2.09214 lr: 0.047679 top1: 0.37369 top5: 0.87108 batch_cost: 0.70625 sec, reader_cost: 0.00031 sec, ips: 22.65485 instance/sec.
[03/29 15:26:06] epoch:[ 14/100] train step:160  loss: 2.13291 lr: 0.047661 top1: 0.37499 top5: 0.99999 batch_cost: 0.69103 sec, reader_cost: 0.00027 sec, ips: 23.15376 instance/sec.
[03/29 15:26:13] epoch:[ 14/100] train step:170  loss: 2.17193 lr: 0.047643 top1: 0.31250 top5: 0.93750 batch_cost: 0.70904 sec, reader_cost: 0.00031 sec, ips: 22.56582 instance/sec.
[03/29 15:26:19] epoch:[ 14/100] train step:180  loss: 3.14274 lr: 0.047624 top1: 0.18750 top5: 0.59400 batch_cost: 0.50291 sec, reader_cost: 0.00033 sec, ips: 31.81512 instance/sec.
[03/29 15:26:19] END epoch:14  train loss_avg: 2.40408  top1_avg: 0.30748 top5_avg: 0.80169 avg_batch_cost: 0.50137 sec, avg_reader_cost: 0.00107 sec, batch_cost_sum: 127.67140 sec, avg_ips: 22.80855 instance/sec.
[03/29 15:26:21] epoch:[ 15/100] train step:0    loss: 2.10868 lr: 0.047621 top1: 0.29461 top5: 0.96780 batch_cost: 1.11929 sec, reader_cost: 0.27810 sec, ips: 14.29474 instance/sec.
[03/29 15:26:28] epoch:[ 15/100] train step:10   loss: 2.11238 lr: 0.047602 top1: 0.37500 top5: 0.87500 batch_cost: 0.71041 sec, reader_cost: 0.00039 sec, ips: 22.52221 instance/sec.
[03/29 15:26:35] epoch:[ 15/100] train step:20   loss: 1.98316 lr: 0.047584 top1: 0.55776 top5: 0.93158 batch_cost: 0.70243 sec, reader_cost: 0.00030 sec, ips: 22.77807 instance/sec.
[03/29 15:26:42] epoch:[ 15/100] train step:30   loss: 2.07229 lr: 0.047565 top1: 0.31113 top5: 0.99699 batch_cost: 0.72595 sec, reader_cost: 0.00032 sec, ips: 22.04013 instance/sec.
[03/29 15:26:49] epoch:[ 15/100] train step:40   loss: 2.30547 lr: 0.047547 top1: 0.37489 top5: 0.93724 batch_cost: 0.70239 sec, reader_cost: 0.00031 sec, ips: 22.77949 instance/sec.
[03/29 15:26:56] epoch:[ 15/100] train step:50   loss: 1.78268 lr: 0.047528 top1: 0.49999 top5: 0.87499 batch_cost: 0.70733 sec, reader_cost: 0.00027 sec, ips: 22.62018 instance/sec.
[03/29 15:27:03] epoch:[ 15/100] train step:60   loss: 2.87478 lr: 0.047509 top1: 0.22292 top5: 0.59792 batch_cost: 0.70234 sec, reader_cost: 0.00030 sec, ips: 22.78096 instance/sec.
[03/29 15:27:10] epoch:[ 15/100] train step:70   loss: 2.52452 lr: 0.047490 top1: 0.27055 top5: 0.81823 batch_cost: 0.68102 sec, reader_cost: 0.00034 sec, ips: 23.49428 instance/sec.
[03/29 15:27:17] epoch:[ 15/100] train step:80   loss: 2.84215 lr: 0.047472 top1: 0.21708 top5: 0.61830 batch_cost: 0.69927 sec, reader_cost: 0.00031 sec, ips: 22.88088 instance/sec.
[03/29 15:27:24] epoch:[ 15/100] train step:90   loss: 2.11277 lr: 0.047453 top1: 0.37473 top5: 0.87446 batch_cost: 0.70377 sec, reader_cost: 0.00031 sec, ips: 22.73455 instance/sec.
[03/29 15:27:31] epoch:[ 15/100] train step:100  loss: 2.68333 lr: 0.047434 top1: 0.27151 top5: 0.73872 batch_cost: 0.70957 sec, reader_cost: 0.00037 sec, ips: 22.54895 instance/sec.
[03/29 15:27:38] epoch:[ 15/100] train step:110  loss: 2.21230 lr: 0.047414 top1: 0.43026 top5: 0.92544 batch_cost: 0.70381 sec, reader_cost: 0.00030 sec, ips: 22.73329 instance/sec.
[03/29 15:27:45] epoch:[ 15/100] train step:120  loss: 2.17136 lr: 0.047395 top1: 0.37245 top5: 0.86939 batch_cost: 0.70774 sec, reader_cost: 0.00029 sec, ips: 22.60725 instance/sec.
[03/29 15:27:52] epoch:[ 15/100] train step:130  loss: 2.63410 lr: 0.047376 top1: 0.41518 top5: 0.69792 batch_cost: 0.69438 sec, reader_cost: 0.00036 sec, ips: 23.04203 instance/sec.
[03/29 15:28:00] epoch:[ 15/100] train step:140  loss: 1.95653 lr: 0.047357 top1: 0.43738 top5: 0.99973 batch_cost: 0.71293 sec, reader_cost: 0.00033 sec, ips: 22.44254 instance/sec.
[03/29 15:28:07] epoch:[ 15/100] train step:150  loss: 2.01001 lr: 0.047338 top1: 0.31214 top5: 0.87413 batch_cost: 0.68462 sec, reader_cost: 0.00028 sec, ips: 23.37066 instance/sec.
[03/29 15:28:13] epoch:[ 15/100] train step:160  loss: 2.06148 lr: 0.047318 top1: 0.43743 top5: 0.87480 batch_cost: 0.68099 sec, reader_cost: 0.00029 sec, ips: 23.49515 instance/sec.
[03/29 15:28:20] epoch:[ 15/100] train step:170  loss: 2.06328 lr: 0.047299 top1: 0.31217 top5: 0.99900 batch_cost: 0.68033 sec, reader_cost: 0.00032 sec, ips: 23.51808 instance/sec.
[03/29 15:28:27] epoch:[ 15/100] train step:180  loss: 2.57521 lr: 0.047279 top1: 0.17190 top5: 0.69540 batch_cost: 0.50064 sec, reader_cost: 0.00030 sec, ips: 31.95936 instance/sec.
[03/29 15:28:27] END epoch:15  train loss_avg: 2.33631  top1_avg: 0.33432 top5_avg: 0.82754 avg_batch_cost: 0.49997 sec, avg_reader_cost: 0.00083 sec, batch_cost_sum: 127.51969 sec, avg_ips: 22.83569 instance/sec.

In the output log, a line such as epoch:[ 1/100] train step:180 loss: 2.88748 is read as follows:

  • epoch:[ 1/100] means the current epoch is the 1st of the 100 epochs.
  • train step:180 means the log line was printed at step 180 of that epoch (steps are numbered from 0).
  • train step only reaches 180 because, within one epoch, number of training samples / batch_size = 2922 / 16 = 182.625, i.e. 182 complete batches plus 10 leftover samples that do not fill a batch. Each epoch therefore runs 182 full iterations (steps 0 to 181), and every iteration updates the model parameters and produces the loss and other metrics.
  • A log line is printed every 10 batches, recording the loss and other metrics after that iteration. Within each epoch the logged steps therefore run train step:0, train step:10, and so on in increments of 10, up to train step:180, the last multiple of 10 before the epoch ends; a minimal sketch reproducing these counts is given below.
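The step counts above can be reproduced with a few lines of Python. This is an illustrative sketch only, using the numbers from this article's example (2922 samples, batch_size 16, one log line every 10 steps); the variable names num_samples, batch_size and log_interval are assumptions made for the example and are not configuration keys of the training framework.

```python
import math

# Assumed values from this article's example; names are hypothetical.
num_samples = 2922
batch_size = 16
log_interval = 10

full_batches = num_samples // batch_size                        # 182 complete batches per epoch
leftover = num_samples % batch_size                             # 10 samples left over, not a full batch
batches_if_partial_kept = math.ceil(num_samples / batch_size)   # 183 if the partial batch is also trained

# Steps are numbered from 0, so an epoch of 182 batches runs steps 0..181.
logged_steps = [s for s in range(full_batches) if s % log_interval == 0]

print(full_batches, leftover, batches_if_partial_kept)  # 182 10 183
print(logged_steps[0], logged_steps[-1])                # 0 180 -> matches "train step:0" .. "train step:180"
```

Note that even if the incomplete final batch were kept (183 steps, 0 to 182), the last logged step would still be train step:180, since 180 is the last multiple of the 10-step log interval in either case.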