The relevant concepts in neural network model training are as follows:
The reason for running multiple epochs is that a single pass over the training set is not enough; the data has to be shown to the network repeatedly before the model fits and converges. We work with a finite dataset and optimize learning with an iterative procedure, gradient descent. As the number of epochs increases, the number of weight updates in the network increases with it, and the model moves from underfitting toward overfitting. The number of epochs therefore matters a great deal: with too few epochs the model may fail to converge to a good solution, while with too many it may overfit and its generalization ability drops. The appropriate number of epochs differs from dataset to dataset and should be chosen with a validation set or cross-validation.
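One common way to pick the epoch count with a validation set is early stopping. Below is a minimal, framework-agnostic sketch (not taken from the training run shown later); the simulated val_losses list stands in for whatever validation metric your own training loop produces.

```python
# Minimal early-stopping sketch (illustrative only): choose the number of epochs
# by watching a validation metric instead of fixing it in advance.

def choose_num_epochs(val_losses, patience=3):
    """Return the 1-based epoch with the best validation loss, stopping once
    the loss has not improved for `patience` consecutive epochs."""
    best_loss, best_epoch, bad_epochs = float("inf"), 0, 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # training longer would likely just overfit
    return best_epoch, best_loss

# Simulated validation losses: improving at first, then degrading (overfitting).
val_losses = [3.1, 2.6, 2.2, 2.0, 1.9, 1.95, 2.05, 2.2, 2.4]
print(choose_num_epochs(val_losses))  # (5, 1.9)
```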
The relationship between these concepts can be expressed with the following formulas:
total number of batches = number of training samples / batch_size
total number of iterations = total number of batches × total number of epochs
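Plugging in concrete numbers (the same ones used in the example further down, counting only full batches as iterations), the arithmetic looks like this:

```python
# Relationship between samples, batch_size, batches and iterations (minimal sketch).
num_samples = 2922   # training samples in the example below
batch_size = 16
num_epochs = 100

batches_per_epoch = num_samples // batch_size        # 182 full batches
leftover = num_samples % batch_size                  # 10 samples left over
total_iterations = batches_per_epoch * num_epochs    # 18200 parameter updates

print(batches_per_epoch, leftover, total_iterations)  # 182 10 18200
```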
If the number of training samples is not evenly divisible by batch_size, there are several ways to handle it:
- Adjust batch_size so that it divides the number of training samples. For example, with 1000 training samples, a batch_size of 50, 20, or 10 works.
- Adjust the number of training samples so that it is divisible by batch_size. For example, with a batch_size of 35, use 700, 1050, 1400, etc. training samples.
- Check the batch size of the incoming data and pad it when it does not match the preset batch_size. For example, with a batch_size of 35, if the last batch contains only 25 samples, the remaining 10 slots can be filled with zeros or some other value.
- Use the drop_last option to discard the final batch that does not reach batch_size. For example, with a batch_size of 35, if the last batch contains only 25 samples, it can simply be skipped and excluded from training. The last two options are illustrated in the sketch below.
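The padding and drop_last behaviors can be sketched without any particular framework; the batchify helper here is purely illustrative (real loaders such as paddle.io.DataLoader or torch.utils.data.DataLoader expose a drop_last argument for the same purpose).

```python
# Illustrative batching helper showing the "pad the last batch" and "drop_last" options.
def batchify(samples, batch_size, drop_last=False, pad_value=None):
    batches = []
    for start in range(0, len(samples), batch_size):
        batch = samples[start:start + batch_size]
        if len(batch) < batch_size:                  # the final, incomplete batch
            if drop_last:
                break                                # option: ignore it entirely
            if pad_value is not None:
                batch = batch + [pad_value] * (batch_size - len(batch))  # option: pad it
        batches.append(batch)
    return batches

samples = list(range(60))                                   # e.g. 60 samples, batch_size 35
print(len(batchify(samples, 35)))                           # 2 (last batch has 25 samples)
print(len(batchify(samples, 35, drop_last=True)))           # 1 (incomplete batch dropped)
print(len(batchify(samples, 35, pad_value=0)[-1]))          # 35 (padded with zeros)
```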
A concrete example is shown below, with 2922 training samples, batch_size = 16, and 100 epochs in total. The resulting training log is:
[03/29 14:56:26] DALI is not installed, you can improve performance if use DALI
[03/29 14:56:27] DATASET :
[03/29 14:56:27] batch_size : 16
[03/29 14:56:27] num_workers : 0
[03/29 14:56:27] test :
[03/29 14:56:27] file_path : /home/aistudio/data/data104924/test_A_data.npy
[03/29 14:56:27] format : SkeletonDataset
[03/29 14:56:27] test_mode : True
[03/29 14:56:27] test_batch_size : 1
[03/29 14:56:27] test_num_workers : 0
[03/29 14:56:27] train :
[03/29 14:56:27] file_path : /home/aistudio/data/data104925/train_data.npy
[03/29 14:56:27] format : SkeletonDataset
[03/29 14:56:27] label_path : /home/aistudio/data/data104925/train_label.npy
[03/29 14:56:27] ------------------------------------------------------------
[03/29 14:56:27] INFERENCE :
[03/29 14:56:27] name : STGCN_Inference_helper
[03/29 14:56:27] num_channels : 2
[03/29 14:56:27] person_nums : 1
[03/29 14:56:27] vertex_nums : 25
[03/29 14:56:27] window_size : 1000
[03/29 14:56:27] ------------------------------------------------------------
[03/29 14:56:27] METRIC :
[03/29 14:56:27] name : SkeletonMetric
[03/29 14:56:27] out_file : submission2.csv
[03/29 14:56:27] ------------------------------------------------------------
[03/29 14:56:27] MIX :
[03/29 14:56:27] alpha : 0.2
[03/29 14:56:27] name : Mixup
[03/29 14:56:27] ------------------------------------------------------------
[03/29 14:56:27] MODEL :
[03/29 14:56:27] backbone :
[03/29 14:56:27] name : AGCN
[03/29 14:56:27] framework : RecognizerGCN
[03/29 14:56:27] head :
[03/29 14:56:27] ls_eps : 0.1
[03/29 14:56:27] name : STGCNHead
[03/29 14:56:27] num_classes : 30
[03/29 14:56:27] ------------------------------------------------------------
[03/29 14:56:27] OPTIMIZER :
[03/29 14:56:27] learning_rate :
[03/29 14:56:27] cosine_base_lr : 0.05
[03/29 14:56:27] iter_step : True
[03/29 14:56:27] max_epoch : 100
[03/29 14:56:27] name : CustomWarmupCosineDecay
[03/29 14:56:27] warmup_epochs : 10
[03/29 14:56:27] warmup_start_lr : 0.005
[03/29 14:56:27] momentum : 0.9
[03/29 14:56:27] name : Momentum
[03/29 14:56:27] weight_decay :
[03/29 14:56:27] name : L2
[03/29 14:56:27] value : 0.0001
[03/29 14:56:27] ------------------------------------------------------------
[03/29 14:56:27] PIPELINE :
[03/29 14:56:27] test :
[03/29 14:56:27] sample :
[03/29 14:56:27] name : AutoPadding
[03/29 14:56:27] window_size : 1000
[03/29 14:56:27] transform :
[03/29 14:56:27] SkeletonNorm : None
[03/29 14:56:27] train :
[03/29 14:56:27] sample :
[03/29 14:56:27] name : AutoPadding
[03/29 14:56:27] window_size : 1000
[03/29 14:56:27] transform :
[03/29 14:56:27] SkeletonNorm : None
[03/29 14:56:27] ------------------------------------------------------------
[03/29 14:56:27] epochs : 100
[03/29 14:56:27] log_interval : 10
[03/29 14:56:27] model_name : AGCN
W0329 14:56:27.170261 942 device_context.cc:404] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 10.1
W0329 14:56:27.176800 942 device_context.cc:422] device: 0, cuDNN Version: 7.6.
[03/29 14:56:30] Loading data, it will take some moment...
[03/29 14:56:33] Data Loaded!
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/layer/norm.py:641: UserWarning: When training, we now always track global mean and variance. "When training, we now always track global mean and variance.")
[03/29 14:56:34] epoch:[ 1/100] train step:0 loss: 3.37317 lr: 0.005000 top1: 0.00000 top5: 0.31250 batch_cost: 1.36950 sec, reader_cost: 0.42939 sec, ips: 11.68313 instance/sec. 
[03/29 14:56:41] epoch:[ 1/100] train step:10 loss: 3.40027 lr: 0.005241 top1: 0.00000 top5: 0.12378 batch_cost: 0.69300 sec, reader_cost: 0.00034 sec, ips: 23.08807 instance/sec. [03/29 14:56:48] epoch:[ 1/100] train step:20 loss: 3.23057 lr: 0.005481 top1: 0.06250 top5: 0.27953 batch_cost: 0.70447 sec, reader_cost: 0.00030 sec, ips: 22.71226 instance/sec. [03/29 14:56:55] epoch:[ 1/100] train step:30 loss: 3.03959 lr: 0.005722 top1: 0.12500 top5: 0.49993 batch_cost: 0.68937 sec, reader_cost: 0.00030 sec, ips: 23.20952 instance/sec. [03/29 14:57:02] epoch:[ 1/100] train step:40 loss: 3.19273 lr: 0.005962 top1: 0.12500 top5: 0.26837 batch_cost: 0.70059 sec, reader_cost: 0.00035 sec, ips: 22.83794 instance/sec. [03/29 14:57:09] epoch:[ 1/100] train step:50 loss: 2.99275 lr: 0.006203 top1: 0.12500 top5: 0.49463 batch_cost: 0.71244 sec, reader_cost: 0.00027 sec, ips: 22.45799 instance/sec. [03/29 14:57:16] epoch:[ 1/100] train step:60 loss: 3.09189 lr: 0.006443 top1: 0.18422 top5: 0.30922 batch_cost: 0.70627 sec, reader_cost: 0.00045 sec, ips: 22.65417 instance/sec. [03/29 14:57:23] epoch:[ 1/100] train step:70 loss: 3.27638 lr: 0.006684 top1: 0.16767 top5: 0.44052 batch_cost: 0.70927 sec, reader_cost: 0.00037 sec, ips: 22.55844 instance/sec. [03/29 14:57:30] epoch:[ 1/100] train step:80 loss: 2.94396 lr: 0.006924 top1: 0.05825 top5: 0.52851 batch_cost: 0.70284 sec, reader_cost: 0.00029 sec, ips: 22.76487 instance/sec. [03/29 14:57:37] epoch:[ 1/100] train step:90 loss: 2.87026 lr: 0.007165 top1: 0.18725 top5: 0.68664 batch_cost: 0.69787 sec, reader_cost: 0.00033 sec, ips: 22.92705 instance/sec. [03/29 14:57:44] epoch:[ 1/100] train step:100 loss: 3.12629 lr: 0.007405 top1: 0.18207 top5: 0.49457 batch_cost: 0.68661 sec, reader_cost: 0.00032 sec, ips: 23.30293 instance/sec. [03/29 14:57:51] epoch:[ 1/100] train step:110 loss: 3.00636 lr: 0.007646 top1: 0.18595 top5: 0.49767 batch_cost: 0.71403 sec, reader_cost: 0.00033 sec, ips: 22.40810 instance/sec. [03/29 14:57:58] epoch:[ 1/100] train step:120 loss: 2.92559 lr: 0.007886 top1: 0.29620 top5: 0.54352 batch_cost: 0.69685 sec, reader_cost: 0.00027 sec, ips: 22.96031 instance/sec. [03/29 14:58:05] epoch:[ 1/100] train step:130 loss: 2.81274 lr: 0.008127 top1: 0.18393 top5: 0.67501 batch_cost: 0.67208 sec, reader_cost: 0.00030 sec, ips: 23.80659 instance/sec. [03/29 14:58:12] epoch:[ 1/100] train step:140 loss: 3.02827 lr: 0.008367 top1: 0.06250 top5: 0.65827 batch_cost: 0.70593 sec, reader_cost: 0.00041 sec, ips: 22.66514 instance/sec. [03/29 14:58:19] epoch:[ 1/100] train step:150 loss: 2.95272 lr: 0.008608 top1: 0.18750 top5: 0.43750 batch_cost: 0.69523 sec, reader_cost: 0.00032 sec, ips: 23.01397 instance/sec. [03/29 14:58:26] epoch:[ 1/100] train step:160 loss: 3.34270 lr: 0.008848 top1: 0.06250 top5: 0.33044 batch_cost: 0.68350 sec, reader_cost: 0.00032 sec, ips: 23.40885 instance/sec. [03/29 14:58:33] epoch:[ 1/100] train step:170 loss: 3.28831 lr: 0.009089 top1: 0.00000 top5: 0.34511 batch_cost: 0.70497 sec, reader_cost: 0.00031 sec, ips: 22.69596 instance/sec. [03/29 14:58:39] epoch:[ 1/100] train step:180 loss: 2.88748 lr: 0.009330 top1: 0.06249 top5: 0.56248 batch_cost: 0.49910 sec, reader_cost: 0.00030 sec, ips: 32.05783 instance/sec. [03/29 14:58:40] END epoch:1 train loss_avg: 3.11624 top1_avg: 0.10158 top5_avg: 0.43874 avg_batch_cost: 0.50068 sec, avg_reader_cost: 0.00083 sec, batch_cost_sum: 127.13062 sec, avg_ips: 22.90558 instance/sec. 
[03/29 14:58:41] epoch:[ 2/100] train step:0 loss: 3.00062 lr: 0.009378 top1: 0.06250 top5: 0.53927 batch_cost: 1.06469 sec, reader_cost: 0.27380 sec, ips: 15.02790 instance/sec. [03/29 14:58:48] epoch:[ 2/100] train step:10 loss: 2.91025 lr: 0.009618 top1: 0.12500 top5: 0.49546 batch_cost: 0.70174 sec, reader_cost: 0.00029 sec, ips: 22.80040 instance/sec. [03/29 14:58:55] epoch:[ 2/100] train step:20 loss: 2.79556 lr: 0.009859 top1: 0.06216 top5: 0.68475 batch_cost: 0.69660 sec, reader_cost: 0.00030 sec, ips: 22.96883 instance/sec. [03/29 14:59:02] epoch:[ 2/100] train step:30 loss: 3.19926 lr: 0.010099 top1: 0.12497 top5: 0.43742 batch_cost: 0.70330 sec, reader_cost: 0.00032 sec, ips: 22.75002 instance/sec. [03/29 14:59:09] epoch:[ 2/100] train step:40 loss: 2.77705 lr: 0.010340 top1: 0.12500 top5: 0.68748 batch_cost: 0.70972 sec, reader_cost: 0.00028 sec, ips: 22.54421 instance/sec. [03/29 14:59:16] epoch:[ 2/100] train step:50 loss: 2.77065 lr: 0.010580 top1: 0.18750 top5: 0.62500 batch_cost: 0.70323 sec, reader_cost: 0.00031 sec, ips: 22.75215 instance/sec. [03/29 14:59:23] epoch:[ 2/100] train step:60 loss: 2.92242 lr: 0.010821 top1: 0.22620 top5: 0.67861 batch_cost: 0.71235 sec, reader_cost: 0.00032 sec, ips: 22.46083 instance/sec. [03/29 14:59:30] epoch:[ 2/100] train step:70 loss: 2.85897 lr: 0.011061 top1: 0.06250 top5: 0.43750 batch_cost: 0.71474 sec, reader_cost: 0.00032 sec, ips: 22.38568 instance/sec. [03/29 14:59:37] epoch:[ 2/100] train step:80 loss: 2.88646 lr: 0.011302 top1: 0.10735 top5: 0.58087 batch_cost: 0.68660 sec, reader_cost: 0.00032 sec, ips: 23.30327 instance/sec. [03/29 14:59:44] epoch:[ 2/100] train step:90 loss: 2.52885 lr: 0.011542 top1: 0.43549 top5: 0.74666 batch_cost: 0.69556 sec, reader_cost: 0.00036 sec, ips: 23.00295 instance/sec. [03/29 14:59:51] epoch:[ 2/100] train step:100 loss: 3.04526 lr: 0.011783 top1: 0.10791 top5: 0.60207 batch_cost: 0.69577 sec, reader_cost: 0.00035 sec, ips: 22.99623 instance/sec. [03/29 14:59:58] epoch:[ 2/100] train step:110 loss: 3.41966 lr: 0.012023 top1: 0.05607 top5: 0.35570 batch_cost: 0.71025 sec, reader_cost: 0.00035 sec, ips: 22.52737 instance/sec. [03/29 15:00:05] epoch:[ 2/100] train step:120 loss: 2.62268 lr: 0.012264 top1: 0.00000 top5: 0.62490 batch_cost: 0.71275 sec, reader_cost: 0.00030 sec, ips: 22.44821 instance/sec. [03/29 15:00:12] epoch:[ 2/100] train step:130 loss: 2.74196 lr: 0.012505 top1: 0.12488 top5: 0.62464 batch_cost: 0.71282 sec, reader_cost: 0.00028 sec, ips: 22.44601 instance/sec. [03/29 15:00:19] epoch:[ 2/100] train step:140 loss: 2.67099 lr: 0.012745 top1: 0.12456 top5: 0.68618 batch_cost: 0.70457 sec, reader_cost: 0.00035 sec, ips: 22.70898 instance/sec. [03/29 15:00:27] epoch:[ 2/100] train step:150 loss: 2.98392 lr: 0.012986 top1: 0.12501 top5: 0.68747 batch_cost: 0.68798 sec, reader_cost: 0.00027 sec, ips: 23.25660 instance/sec. [03/29 15:00:33] epoch:[ 2/100] train step:160 loss: 2.81657 lr: 0.013226 top1: 0.11762 top5: 0.66905 batch_cost: 0.68583 sec, reader_cost: 0.00033 sec, ips: 23.32943 instance/sec. [03/29 15:00:40] epoch:[ 2/100] train step:170 loss: 2.96144 lr: 0.013467 top1: 0.15183 top5: 0.65183 batch_cost: 0.69816 sec, reader_cost: 0.00031 sec, ips: 22.91731 instance/sec. [03/29 15:00:47] epoch:[ 2/100] train step:180 loss: 2.80823 lr: 0.013707 top1: 0.12349 top5: 0.55721 batch_cost: 0.50249 sec, reader_cost: 0.00030 sec, ips: 31.84135 instance/sec. 
[03/29 15:00:47] END epoch:2 train loss_avg: 2.90696 top1_avg: 0.13409 top5_avg: 0.57791 avg_batch_cost: 0.50287 sec, avg_reader_cost: 0.00091 sec, batch_cost_sum: 127.36288 sec, avg_ips: 22.86381 instance/sec. [03/29 15:00:48] epoch:[ 3/100] train step:0 loss: 2.57081 lr: 0.013755 top1: 0.18569 top5: 0.73736 batch_cost: 1.04606 sec, reader_cost: 0.24835 sec, ips: 15.29544 instance/sec. [03/29 15:00:56] epoch:[ 3/100] train step:10 loss: 2.78410 lr: 0.013996 top1: 0.11309 top5: 0.87793 batch_cost: 0.71015 sec, reader_cost: 0.00031 sec, ips: 22.53035 instance/sec. [03/29 15:01:03] epoch:[ 3/100] train step:20 loss: 2.64895 lr: 0.014236 top1: 0.18610 top5: 0.62150 batch_cost: 0.70124 sec, reader_cost: 0.00033 sec, ips: 22.81670 instance/sec. [03/29 15:01:10] epoch:[ 3/100] train step:30 loss: 2.83884 lr: 0.014477 top1: 0.12238 top5: 0.67834 batch_cost: 0.68523 sec, reader_cost: 0.00036 sec, ips: 23.34983 instance/sec. [03/29 15:01:17] epoch:[ 3/100] train step:40 loss: 2.60631 lr: 0.014717 top1: 0.25000 top5: 0.68749 batch_cost: 0.71406 sec, reader_cost: 0.00031 sec, ips: 22.40718 instance/sec. [03/29 15:01:24] epoch:[ 3/100] train step:50 loss: 3.04034 lr: 0.014958 top1: 0.06223 top5: 0.49946 batch_cost: 0.70691 sec, reader_cost: 0.00030 sec, ips: 22.63366 instance/sec. [03/29 15:01:31] epoch:[ 3/100] train step:60 loss: 2.85118 lr: 0.015198 top1: 0.18750 top5: 0.62500 batch_cost: 0.71714 sec, reader_cost: 0.00030 sec, ips: 22.31077 instance/sec. [03/29 15:01:38] epoch:[ 3/100] train step:70 loss: 2.69562 lr: 0.015439 top1: 0.12411 top5: 0.61785 batch_cost: 0.69712 sec, reader_cost: 0.00027 sec, ips: 22.95171 instance/sec. [03/29 15:01:45] epoch:[ 3/100] train step:80 loss: 2.49378 lr: 0.015680 top1: 0.18657 top5: 0.80831 batch_cost: 0.70739 sec, reader_cost: 0.00029 sec, ips: 22.61835 instance/sec. [03/29 15:01:52] epoch:[ 3/100] train step:90 loss: 2.80020 lr: 0.015920 top1: 0.17714 top5: 0.58875 batch_cost: 0.70922 sec, reader_cost: 0.00032 sec, ips: 22.56012 instance/sec. [03/29 15:01:59] epoch:[ 3/100] train step:100 loss: 2.55284 lr: 0.016161 top1: 0.18247 top5: 0.73659 batch_cost: 0.70368 sec, reader_cost: 0.00031 sec, ips: 22.73755 instance/sec. [03/29 15:02:06] epoch:[ 3/100] train step:110 loss: 2.57980 lr: 0.016401 top1: 0.18632 top5: 0.74295 batch_cost: 0.68889 sec, reader_cost: 0.00033 sec, ips: 23.22577 instance/sec. [03/29 15:02:13] epoch:[ 3/100] train step:120 loss: 3.05257 lr: 0.016642 top1: 0.07077 top5: 0.55423 batch_cost: 0.69157 sec, reader_cost: 0.00029 sec, ips: 23.13575 instance/sec. [03/29 15:02:20] epoch:[ 3/100] train step:130 loss: 2.43921 lr: 0.016882 top1: 0.24921 top5: 0.74790 batch_cost: 0.68415 sec, reader_cost: 0.00032 sec, ips: 23.38664 instance/sec. [03/29 15:02:27] epoch:[ 3/100] train step:140 loss: 2.86336 lr: 0.017123 top1: 0.12808 top5: 0.61576 batch_cost: 0.68897 sec, reader_cost: 0.00029 sec, ips: 23.22317 instance/sec. [03/29 15:02:34] epoch:[ 3/100] train step:150 loss: 2.95862 lr: 0.017363 top1: 0.06250 top5: 0.62466 batch_cost: 0.68853 sec, reader_cost: 0.00036 sec, ips: 23.23777 instance/sec. [03/29 15:02:41] epoch:[ 3/100] train step:160 loss: 2.79597 lr: 0.017604 top1: 0.18589 top5: 0.61935 batch_cost: 0.70793 sec, reader_cost: 0.00030 sec, ips: 22.60106 instance/sec. [03/29 15:02:48] epoch:[ 3/100] train step:170 loss: 2.72396 lr: 0.017844 top1: 0.18750 top5: 0.56250 batch_cost: 0.71050 sec, reader_cost: 0.00028 sec, ips: 22.51925 instance/sec. 
[03/29 15:02:54] epoch:[ 3/100] train step:180 loss: 2.67971 lr: 0.018085 top1: 0.18737 top5: 0.56224 batch_cost: 0.50566 sec, reader_cost: 0.00029 sec, ips: 31.64151 instance/sec. [03/29 15:02:55] END epoch:3 train loss_avg: 2.80823 top1_avg: 0.15651 top5_avg: 0.62372 avg_batch_cost: 0.50625 sec, avg_reader_cost: 0.00081 sec, batch_cost_sum: 127.48577 sec, avg_ips: 22.84176 instance/sec. [03/29 15:02:56] epoch:[ 4/100] train step:0 loss: 2.74616 lr: 0.018133 top1: 0.11287 top5: 0.63898 batch_cost: 1.06761 sec, reader_cost: 0.25935 sec, ips: 14.98679 instance/sec. [03/29 15:03:03] epoch:[ 4/100] train step:10 loss: 2.40084 lr: 0.018373 top1: 0.24963 top5: 0.74913 batch_cost: 0.68534 sec, reader_cost: 0.00034 sec, ips: 23.34615 instance/sec. [03/29 15:03:10] epoch:[ 4/100] train step:20 loss: 2.44602 lr: 0.018614 top1: 0.43689 top5: 0.87402 batch_cost: 0.70286 sec, reader_cost: 0.00037 sec, ips: 22.76406 instance/sec. [03/29 15:03:17] epoch:[ 4/100] train step:30 loss: 2.96187 lr: 0.018855 top1: 0.18146 top5: 0.37097 batch_cost: 0.70473 sec, reader_cost: 0.00043 sec, ips: 22.70365 instance/sec. [03/29 15:03:24] epoch:[ 4/100] train step:40 loss: 3.21780 lr: 0.019095 top1: 0.00000 top5: 0.47439 batch_cost: 0.70200 sec, reader_cost: 0.00033 sec, ips: 22.79206 instance/sec. [03/29 15:03:31] epoch:[ 4/100] train step:50 loss: 2.94762 lr: 0.019336 top1: 0.11307 top5: 0.61589 batch_cost: 0.69474 sec, reader_cost: 0.00030 sec, ips: 23.03016 instance/sec. [03/29 15:03:38] epoch:[ 4/100] train step:60 loss: 2.65449 lr: 0.019576 top1: 0.17900 top5: 0.70751 batch_cost: 0.69572 sec, reader_cost: 0.00032 sec, ips: 22.99764 instance/sec. [03/29 15:03:45] epoch:[ 4/100] train step:70 loss: 2.66846 lr: 0.019817 top1: 0.18302 top5: 0.60706 batch_cost: 0.67541 sec, reader_cost: 0.00029 sec, ips: 23.68942 instance/sec. [03/29 15:03:52] epoch:[ 4/100] train step:80 loss: 2.69306 lr: 0.020057 top1: 0.16884 top5: 0.68467 batch_cost: 0.70518 sec, reader_cost: 0.00037 sec, ips: 22.68939 instance/sec. [03/29 15:03:59] epoch:[ 4/100] train step:90 loss: 2.84809 lr: 0.020298 top1: 0.12186 top5: 0.73589 batch_cost: 0.69095 sec, reader_cost: 0.00029 sec, ips: 23.15662 instance/sec. [03/29 15:04:07] epoch:[ 4/100] train step:100 loss: 2.60389 lr: 0.020538 top1: 0.18067 top5: 0.71927 batch_cost: 0.70607 sec, reader_cost: 0.00029 sec, ips: 22.66076 instance/sec. [03/29 15:04:14] epoch:[ 4/100] train step:110 loss: 2.32916 lr: 0.020779 top1: 0.18750 top5: 0.93750 batch_cost: 0.70722 sec, reader_cost: 0.00038 sec, ips: 22.62384 instance/sec. [03/29 15:04:21] epoch:[ 4/100] train step:120 loss: 2.80766 lr: 0.021019 top1: 0.10466 top5: 0.51165 batch_cost: 0.69675 sec, reader_cost: 0.00036 sec, ips: 22.96384 instance/sec. [03/29 15:04:28] epoch:[ 4/100] train step:130 loss: 2.40375 lr: 0.021260 top1: 0.18750 top5: 0.81250 batch_cost: 0.71240 sec, reader_cost: 0.00030 sec, ips: 22.45939 instance/sec. [03/29 15:04:35] epoch:[ 4/100] train step:140 loss: 2.38751 lr: 0.021500 top1: 0.18695 top5: 0.87298 batch_cost: 0.69577 sec, reader_cost: 0.00035 sec, ips: 22.99601 instance/sec. [03/29 15:04:42] epoch:[ 4/100] train step:150 loss: 2.78807 lr: 0.021741 top1: 0.31084 top5: 0.64473 batch_cost: 0.70501 sec, reader_cost: 0.00038 sec, ips: 22.69477 instance/sec. [03/29 15:04:49] epoch:[ 4/100] train step:160 loss: 2.93646 lr: 0.021981 top1: 0.06250 top5: 0.47930 batch_cost: 0.69499 sec, reader_cost: 0.00040 sec, ips: 23.02196 instance/sec. 
[03/29 15:04:56] epoch:[ 4/100] train step:170 loss: 3.06834 lr: 0.022222 top1: 0.11915 top5: 0.47077 batch_cost: 0.68760 sec, reader_cost: 0.00029 sec, ips: 23.26927 instance/sec. [03/29 15:05:02] epoch:[ 4/100] train step:180 loss: 2.60863 lr: 0.022462 top1: 0.37459 top5: 0.68677 batch_cost: 0.50258 sec, reader_cost: 0.00035 sec, ips: 31.83548 instance/sec. [03/29 15:05:03] END epoch:4 train loss_avg: 2.77788 top1_avg: 0.17965 top5_avg: 0.64434 avg_batch_cost: 0.50238 sec, avg_reader_cost: 0.00111 sec, batch_cost_sum: 127.76674 sec, avg_ips: 22.79153 instance/sec. [03/29 15:05:04] epoch:[ 5/100] train step:0 loss: 2.46436 lr: 0.022511 top1: 0.28157 top5: 0.68814 batch_cost: 1.07273 sec, reader_cost: 0.27534 sec, ips: 14.91516 instance/sec. [03/29 15:05:11] epoch:[ 5/100] train step:10 loss: 2.96157 lr: 0.022751 top1: 0.12500 top5: 0.52542 batch_cost: 0.70304 sec, reader_cost: 0.00030 sec, ips: 22.75834 instance/sec. [03/29 15:05:18] epoch:[ 5/100] train step:20 loss: 2.88344 lr: 0.022992 top1: 0.11494 top5: 0.61705 batch_cost: 0.69354 sec, reader_cost: 0.00030 sec, ips: 23.07000 instance/sec. [03/29 15:05:25] epoch:[ 5/100] train step:30 loss: 2.45653 lr: 0.023232 top1: 0.29804 top5: 0.77996 batch_cost: 0.69325 sec, reader_cost: 0.00033 sec, ips: 23.07964 instance/sec. [03/29 15:05:32] epoch:[ 5/100] train step:40 loss: 3.29705 lr: 0.023473 top1: 0.15335 top5: 0.37500 batch_cost: 0.69653 sec, reader_cost: 0.00031 sec, ips: 22.97103 instance/sec. [03/29 15:05:39] epoch:[ 5/100] train step:50 loss: 2.55334 lr: 0.023713 top1: 0.06249 top5: 0.62494 batch_cost: 0.68533 sec, reader_cost: 0.00038 sec, ips: 23.34626 instance/sec. [03/29 15:05:46] epoch:[ 5/100] train step:60 loss: 2.56069 lr: 0.023954 top1: 0.31244 top5: 0.81236 batch_cost: 0.69861 sec, reader_cost: 0.00036 sec, ips: 22.90247 instance/sec. [03/29 15:05:53] epoch:[ 5/100] train step:70 loss: 2.41631 lr: 0.024194 top1: 0.25000 top5: 0.81250 batch_cost: 0.69859 sec, reader_cost: 0.00034 sec, ips: 22.90331 instance/sec. [03/29 15:06:00] epoch:[ 5/100] train step:80 loss: 2.43879 lr: 0.024435 top1: 0.18728 top5: 0.74922 batch_cost: 0.74415 sec, reader_cost: 0.00030 sec, ips: 21.50117 instance/sec. [03/29 15:06:07] epoch:[ 5/100] train step:90 loss: 2.63529 lr: 0.024675 top1: 0.16135 top5: 0.74275 batch_cost: 0.72602 sec, reader_cost: 0.00034 sec, ips: 22.03793 instance/sec. [03/29 15:06:14] epoch:[ 5/100] train step:100 loss: 2.63362 lr: 0.024916 top1: 0.12347 top5: 0.67526 batch_cost: 0.68853 sec, reader_cost: 0.00030 sec, ips: 23.23782 instance/sec. [03/29 15:06:21] epoch:[ 5/100] train step:110 loss: 2.80297 lr: 0.025156 top1: 0.17319 top5: 0.58206 batch_cost: 0.70664 sec, reader_cost: 0.00028 sec, ips: 22.64226 instance/sec. [03/29 15:06:28] epoch:[ 5/100] train step:120 loss: 2.28605 lr: 0.025397 top1: 0.32967 top5: 0.87536 batch_cost: 0.69289 sec, reader_cost: 0.00030 sec, ips: 23.09163 instance/sec. [03/29 15:06:35] epoch:[ 5/100] train step:130 loss: 2.56156 lr: 0.025637 top1: 0.28904 top5: 0.57807 batch_cost: 0.70462 sec, reader_cost: 0.00030 sec, ips: 22.70713 instance/sec. [03/29 15:06:43] epoch:[ 5/100] train step:140 loss: 2.36903 lr: 0.025878 top1: 0.35608 top5: 0.89967 batch_cost: 0.72537 sec, reader_cost: 0.00034 sec, ips: 22.05778 instance/sec. [03/29 15:06:50] epoch:[ 5/100] train step:150 loss: 3.05911 lr: 0.026119 top1: 0.07252 top5: 0.72376 batch_cost: 0.70057 sec, reader_cost: 0.00030 sec, ips: 22.83870 instance/sec. 
[03/29 15:06:57] epoch:[ 5/100] train step:160 loss: 2.48905 lr: 0.026359 top1: 0.17872 top5: 0.53908 batch_cost: 0.69123 sec, reader_cost: 0.00038 sec, ips: 23.14719 instance/sec. [03/29 15:07:04] epoch:[ 5/100] train step:170 loss: 2.36892 lr: 0.026600 top1: 0.43746 top5: 0.68744 batch_cost: 0.69701 sec, reader_cost: 0.00031 sec, ips: 22.95531 instance/sec. [03/29 15:07:10] epoch:[ 5/100] train step:180 loss: 2.25287 lr: 0.026840 top1: 0.31048 top5: 0.86828 batch_cost: 0.50450 sec, reader_cost: 0.00030 sec, ips: 31.71447 instance/sec. [03/29 15:07:10] END epoch:5 train loss_avg: 2.61208 top1_avg: 0.22569 top5_avg: 0.71417 avg_batch_cost: 0.50811 sec, avg_reader_cost: 0.00085 sec, batch_cost_sum: 127.79206 sec, avg_ips: 22.78702 instance/sec. [03/29 15:07:12] epoch:[ 6/100] train step:0 loss: 2.85189 lr: 0.026888 top1: 0.05819 top5: 0.66162 batch_cost: 1.07792 sec, reader_cost: 0.25637 sec, ips: 14.84345 instance/sec. [03/29 15:07:19] epoch:[ 6/100] train step:10 loss: 2.59118 lr: 0.027129 top1: 0.18718 top5: 0.68653 batch_cost: 0.70914 sec, reader_cost: 0.00033 sec, ips: 22.56264 instance/sec. [03/29 15:07:26] epoch:[ 6/100] train step:20 loss: 2.44648 lr: 0.027369 top1: 0.25000 top5: 0.68750 batch_cost: 0.70485 sec, reader_cost: 0.00040 sec, ips: 22.69974 instance/sec. [03/29 15:07:33] epoch:[ 6/100] train step:30 loss: 2.46000 lr: 0.027610 top1: 0.46711 top5: 0.64991 batch_cost: 0.70593 sec, reader_cost: 0.00032 sec, ips: 22.66514 instance/sec. [03/29 15:07:40] epoch:[ 6/100] train step:40 loss: 2.37045 lr: 0.027850 top1: 0.24869 top5: 0.87019 batch_cost: 0.70518 sec, reader_cost: 0.00033 sec, ips: 22.68922 instance/sec. [03/29 15:07:47] epoch:[ 6/100] train step:50 loss: 2.65947 lr: 0.028091 top1: 0.24880 top5: 0.62141 batch_cost: 0.70351 sec, reader_cost: 0.00032 sec, ips: 22.74322 instance/sec. [03/29 15:07:54] epoch:[ 6/100] train step:60 loss: 2.13924 lr: 0.028331 top1: 0.37500 top5: 0.87500 batch_cost: 0.70757 sec, reader_cost: 0.00029 sec, ips: 22.61270 instance/sec. [03/29 15:08:01] epoch:[ 6/100] train step:70 loss: 2.57030 lr: 0.028572 top1: 0.36615 top5: 0.85731 batch_cost: 0.71513 sec, reader_cost: 0.00032 sec, ips: 22.37341 instance/sec. [03/29 15:08:08] epoch:[ 6/100] train step:80 loss: 2.69033 lr: 0.028812 top1: 0.22627 top5: 0.62818 batch_cost: 0.70214 sec, reader_cost: 0.00031 sec, ips: 22.78755 instance/sec. [03/29 15:08:15] epoch:[ 6/100] train step:90 loss: 3.14676 lr: 0.029053 top1: 0.12500 top5: 0.37500 batch_cost: 0.71345 sec, reader_cost: 0.00028 sec, ips: 22.42612 instance/sec. [03/29 15:08:22] epoch:[ 6/100] train step:100 loss: 2.57400 lr: 0.029294 top1: 0.27330 top5: 0.71450 batch_cost: 0.69552 sec, reader_cost: 0.00034 sec, ips: 23.00439 instance/sec. [03/29 15:08:29] epoch:[ 6/100] train step:110 loss: 2.26502 lr: 0.029534 top1: 0.25000 top5: 0.87500 batch_cost: 0.68348 sec, reader_cost: 0.00028 sec, ips: 23.40957 instance/sec. [03/29 15:08:36] epoch:[ 6/100] train step:120 loss: 2.16912 lr: 0.029775 top1: 0.43747 top5: 0.87494 batch_cost: 0.68997 sec, reader_cost: 0.00028 sec, ips: 23.18951 instance/sec. [03/29 15:08:43] epoch:[ 6/100] train step:130 loss: 2.42919 lr: 0.030015 top1: 0.55789 top5: 0.86671 batch_cost: 0.70373 sec, reader_cost: 0.00032 sec, ips: 22.73591 instance/sec. [03/29 15:08:50] epoch:[ 6/100] train step:140 loss: 2.84289 lr: 0.030256 top1: 0.11736 top5: 0.72842 batch_cost: 0.70333 sec, reader_cost: 0.00050 sec, ips: 22.74904 instance/sec. 
[03/29 15:08:57] epoch:[ 6/100] train step:150 loss: 2.73676 lr: 0.030496 top1: 0.19671 top5: 0.55427 batch_cost: 0.68943 sec, reader_cost: 0.00035 sec, ips: 23.20769 instance/sec. [03/29 15:09:04] epoch:[ 6/100] train step:160 loss: 2.25768 lr: 0.030737 top1: 0.18646 top5: 0.87034 batch_cost: 0.68771 sec, reader_cost: 0.00030 sec, ips: 23.26553 instance/sec. [03/29 15:09:11] epoch:[ 6/100] train step:170 loss: 2.91737 lr: 0.030977 top1: 0.27496 top5: 0.69988 batch_cost: 0.69979 sec, reader_cost: 0.00029 sec, ips: 22.86403 instance/sec. [03/29 15:09:18] epoch:[ 6/100] train step:180 loss: 2.27804 lr: 0.031218 top1: 0.37494 top5: 0.87487 batch_cost: 0.50157 sec, reader_cost: 0.00033 sec, ips: 31.89988 instance/sec. [03/29 15:09:18] END epoch:6 train loss_avg: 2.55839 top1_avg: 0.23381 top5_avg: 0.75519 avg_batch_cost: 0.50127 sec, avg_reader_cost: 0.00096 sec, batch_cost_sum: 127.49788 sec, avg_ips: 22.83960 instance/sec. [03/29 15:09:19] epoch:[ 7/100] train step:0 loss: 2.48977 lr: 0.031266 top1: 0.24999 top5: 0.74997 batch_cost: 1.03012 sec, reader_cost: 0.27165 sec, ips: 15.53212 instance/sec. [03/29 15:09:26] epoch:[ 7/100] train step:10 loss: 2.42310 lr: 0.031506 top1: 0.06237 top5: 0.87359 batch_cost: 0.71256 sec, reader_cost: 0.00036 sec, ips: 22.45424 instance/sec. [03/29 15:09:33] epoch:[ 7/100] train step:20 loss: 2.01837 lr: 0.031747 top1: 0.49996 top5: 0.93744 batch_cost: 0.69950 sec, reader_cost: 0.00040 sec, ips: 22.87335 instance/sec. [03/29 15:09:40] epoch:[ 7/100] train step:30 loss: 2.30209 lr: 0.031987 top1: 0.37500 top5: 0.81250 batch_cost: 0.70529 sec, reader_cost: 0.00035 sec, ips: 22.68578 instance/sec. [03/29 15:09:47] epoch:[ 7/100] train step:40 loss: 2.87034 lr: 0.032228 top1: 0.15818 top5: 0.69135 batch_cost: 0.69888 sec, reader_cost: 0.00028 sec, ips: 22.89368 instance/sec. [03/29 15:09:54] epoch:[ 7/100] train step:50 loss: 2.26913 lr: 0.032468 top1: 0.47789 top5: 0.79039 batch_cost: 0.70312 sec, reader_cost: 0.00043 sec, ips: 22.75566 instance/sec. [03/29 15:10:01] epoch:[ 7/100] train step:60 loss: 2.41152 lr: 0.032709 top1: 0.31139 top5: 0.81027 batch_cost: 0.70262 sec, reader_cost: 0.00034 sec, ips: 22.77177 instance/sec. [03/29 15:10:08] epoch:[ 7/100] train step:70 loss: 2.78548 lr: 0.032950 top1: 0.18713 top5: 0.62388 batch_cost: 0.68241 sec, reader_cost: 0.00029 sec, ips: 23.44620 instance/sec. [03/29 15:10:15] epoch:[ 7/100] train step:80 loss: 2.31574 lr: 0.033190 top1: 0.30918 top5: 0.93153 batch_cost: 0.71623 sec, reader_cost: 0.00031 sec, ips: 22.33933 instance/sec. [03/29 15:10:22] epoch:[ 7/100] train step:90 loss: 2.50991 lr: 0.033431 top1: 0.16940 top5: 0.82673 batch_cost: 0.71050 sec, reader_cost: 0.00037 sec, ips: 22.51945 instance/sec. [03/29 15:10:29] epoch:[ 7/100] train step:100 loss: 3.06481 lr: 0.033671 top1: 0.13734 top5: 0.57484 batch_cost: 0.68461 sec, reader_cost: 0.00030 sec, ips: 23.37083 instance/sec. [03/29 15:10:36] epoch:[ 7/100] train step:110 loss: 2.33617 lr: 0.033912 top1: 0.31196 top5: 0.87391 batch_cost: 0.70370 sec, reader_cost: 0.00033 sec, ips: 22.73691 instance/sec. [03/29 15:10:43] epoch:[ 7/100] train step:120 loss: 2.57602 lr: 0.034152 top1: 0.18742 top5: 0.74960 batch_cost: 0.69058 sec, reader_cost: 0.00030 sec, ips: 23.16905 instance/sec. [03/29 15:10:50] epoch:[ 7/100] train step:130 loss: 2.93582 lr: 0.034393 top1: 0.12500 top5: 0.47862 batch_cost: 0.72508 sec, reader_cost: 0.00031 sec, ips: 22.06649 instance/sec. 
[03/29 15:10:57] epoch:[ 7/100] train step:140 loss: 2.41478 lr: 0.034633 top1: 0.31250 top5: 0.68749 batch_cost: 0.71846 sec, reader_cost: 0.00030 sec, ips: 22.26971 instance/sec. [03/29 15:11:05] epoch:[ 7/100] train step:150 loss: 2.37827 lr: 0.034874 top1: 0.28179 top5: 0.86073 batch_cost: 0.70377 sec, reader_cost: 0.00029 sec, ips: 22.73455 instance/sec. [03/29 15:11:12] epoch:[ 7/100] train step:160 loss: 2.14180 lr: 0.035114 top1: 0.24969 top5: 0.93648 batch_cost: 0.70252 sec, reader_cost: 0.00033 sec, ips: 22.77517 instance/sec. [03/29 15:11:19] epoch:[ 7/100] train step:170 loss: 2.74523 lr: 0.035355 top1: 0.32059 top5: 0.59682 batch_cost: 0.69936 sec, reader_cost: 0.00028 sec, ips: 22.87797 instance/sec. [03/29 15:11:25] epoch:[ 7/100] train step:180 loss: 2.27444 lr: 0.035595 top1: 0.31150 top5: 0.74599 batch_cost: 0.50434 sec, reader_cost: 0.00027 sec, ips: 31.72433 instance/sec. [03/29 15:11:25] END epoch:7 train loss_avg: 2.51643 top1_avg: 0.25764 top5_avg: 0.76191 avg_batch_cost: 0.50491 sec, avg_reader_cost: 0.00087 sec, batch_cost_sum: 127.37218 sec, avg_ips: 22.86214 instance/sec. [03/29 15:11:27] epoch:[ 8/100] train step:0 loss: 2.44114 lr: 0.035643 top1: 0.06245 top5: 0.99927 batch_cost: 1.07130 sec, reader_cost: 0.26484 sec, ips: 14.93513 instance/sec. [03/29 15:11:34] epoch:[ 8/100] train step:10 loss: 2.46820 lr: 0.035884 top1: 0.18750 top5: 0.81250 batch_cost: 0.71347 sec, reader_cost: 0.00030 sec, ips: 22.42572 instance/sec. [03/29 15:11:41] epoch:[ 8/100] train step:20 loss: 2.56442 lr: 0.036125 top1: 0.24996 top5: 0.74983 batch_cost: 0.68839 sec, reader_cost: 0.00031 sec, ips: 23.24254 instance/sec. [03/29 15:11:48] epoch:[ 8/100] train step:30 loss: 2.50068 lr: 0.036365 top1: 0.29695 top5: 0.69938 batch_cost: 0.70363 sec, reader_cost: 0.00032 sec, ips: 22.73921 instance/sec. [03/29 15:11:55] epoch:[ 8/100] train step:40 loss: 2.38977 lr: 0.036606 top1: 0.31250 top5: 0.68750 batch_cost: 0.69178 sec, reader_cost: 0.00030 sec, ips: 23.12873 instance/sec. [03/29 15:12:02] epoch:[ 8/100] train step:50 loss: 2.20239 lr: 0.036846 top1: 0.43552 top5: 0.87188 batch_cost: 0.69028 sec, reader_cost: 0.00032 sec, ips: 23.17887 instance/sec. [03/29 15:12:09] epoch:[ 8/100] train step:60 loss: 2.08576 lr: 0.037087 top1: 0.37495 top5: 0.93741 batch_cost: 0.70834 sec, reader_cost: 0.00032 sec, ips: 22.58799 instance/sec. [03/29 15:12:16] epoch:[ 8/100] train step:70 loss: 2.49606 lr: 0.037327 top1: 0.40172 top5: 0.64457 batch_cost: 0.69600 sec, reader_cost: 0.00026 sec, ips: 22.98838 instance/sec. [03/29 15:12:23] epoch:[ 8/100] train step:80 loss: 2.30188 lr: 0.037568 top1: 0.18589 top5: 0.92145 batch_cost: 0.71637 sec, reader_cost: 0.00032 sec, ips: 22.33491 instance/sec. [03/29 15:12:30] epoch:[ 8/100] train step:90 loss: 2.07550 lr: 0.037808 top1: 0.43109 top5: 0.98902 batch_cost: 0.71585 sec, reader_cost: 0.00037 sec, ips: 22.35118 instance/sec. [03/29 15:12:37] epoch:[ 8/100] train step:100 loss: 2.33710 lr: 0.038049 top1: 0.23549 top5: 0.81698 batch_cost: 0.70628 sec, reader_cost: 0.00037 sec, ips: 22.65400 instance/sec. [03/29 15:12:44] epoch:[ 8/100] train step:110 loss: 2.65618 lr: 0.038289 top1: 0.18741 top5: 0.49972 batch_cost: 0.70001 sec, reader_cost: 0.00039 sec, ips: 22.85681 instance/sec. [03/29 15:12:51] epoch:[ 8/100] train step:120 loss: 2.16781 lr: 0.038530 top1: 0.43629 top5: 0.93468 batch_cost: 0.69094 sec, reader_cost: 0.00029 sec, ips: 23.15701 instance/sec. 
[03/29 15:12:58] epoch:[ 8/100] train step:130 loss: 2.27023 lr: 0.038770 top1: 0.24993 top5: 0.87484 batch_cost: 0.69355 sec, reader_cost: 0.00034 sec, ips: 23.06956 instance/sec. [03/29 15:13:05] epoch:[ 8/100] train step:140 loss: 2.58522 lr: 0.039011 top1: 0.12500 top5: 0.83108 batch_cost: 0.70103 sec, reader_cost: 0.00033 sec, ips: 22.82362 instance/sec. [03/29 15:13:12] epoch:[ 8/100] train step:150 loss: 2.25197 lr: 0.039251 top1: 0.31206 top5: 0.99869 batch_cost: 0.71337 sec, reader_cost: 0.00036 sec, ips: 22.42887 instance/sec. [03/29 15:13:19] epoch:[ 8/100] train step:160 loss: 2.27115 lr: 0.039492 top1: 0.37476 top5: 0.74962 batch_cost: 0.70206 sec, reader_cost: 0.00030 sec, ips: 22.79006 instance/sec. [03/29 15:13:26] epoch:[ 8/100] train step:170 loss: 3.12523 lr: 0.039732 top1: 0.08068 top5: 0.51419 batch_cost: 0.72897 sec, reader_cost: 0.00031 sec, ips: 21.94865 instance/sec. [03/29 15:13:33] epoch:[ 8/100] train step:180 loss: 2.67082 lr: 0.039973 top1: 0.20059 top5: 0.66765 batch_cost: 0.50181 sec, reader_cost: 0.00034 sec, ips: 31.88471 instance/sec. [03/29 15:13:33] END epoch:8 train loss_avg: 2.48675 top1_avg: 0.25453 top5_avg: 0.77283 avg_batch_cost: 0.50221 sec, avg_reader_cost: 0.00091 sec, batch_cost_sum: 127.51958 sec, avg_ips: 22.83571 instance/sec. [03/29 15:13:34] epoch:[ 9/100] train step:0 loss: 2.12341 lr: 0.040021 top1: 0.42191 top5: 0.85421 batch_cost: 1.06815 sec, reader_cost: 0.26195 sec, ips: 14.97913 instance/sec. [03/29 15:13:41] epoch:[ 9/100] train step:10 loss: 2.51564 lr: 0.040262 top1: 0.36636 top5: 0.85772 batch_cost: 0.70063 sec, reader_cost: 0.00031 sec, ips: 22.83657 instance/sec. [03/29 15:13:48] epoch:[ 9/100] train step:20 loss: 3.25764 lr: 0.040502 top1: 0.15691 top5: 0.43883 batch_cost: 0.69377 sec, reader_cost: 0.00034 sec, ips: 23.06234 instance/sec. [03/29 15:13:55] epoch:[ 9/100] train step:30 loss: 2.48742 lr: 0.040743 top1: 0.36467 top5: 0.84401 batch_cost: 0.69796 sec, reader_cost: 0.00035 sec, ips: 22.92406 instance/sec. [03/29 15:14:02] epoch:[ 9/100] train step:40 loss: 2.45128 lr: 0.040983 top1: 0.18750 top5: 0.87500 batch_cost: 0.70433 sec, reader_cost: 0.00030 sec, ips: 22.71656 instance/sec. [03/29 15:14:10] epoch:[ 9/100] train step:50 loss: 2.33907 lr: 0.041224 top1: 0.24772 top5: 0.80623 batch_cost: 0.70531 sec, reader_cost: 0.00029 sec, ips: 22.68511 instance/sec. [03/29 15:14:17] epoch:[ 9/100] train step:60 loss: 2.26202 lr: 0.041464 top1: 0.25000 top5: 0.93750 batch_cost: 0.71487 sec, reader_cost: 0.00039 sec, ips: 22.38162 instance/sec. [03/29 15:14:24] epoch:[ 9/100] train step:70 loss: 2.24093 lr: 0.041705 top1: 0.24323 top5: 0.79558 batch_cost: 0.71356 sec, reader_cost: 0.00027 sec, ips: 22.42273 instance/sec. [03/29 15:14:31] epoch:[ 9/100] train step:80 loss: 2.38097 lr: 0.041945 top1: 0.36257 top5: 0.83259 batch_cost: 0.70034 sec, reader_cost: 0.00027 sec, ips: 22.84605 instance/sec. [03/29 15:14:38] epoch:[ 9/100] train step:90 loss: 1.97282 lr: 0.042186 top1: 0.68402 top5: 0.87036 batch_cost: 0.72411 sec, reader_cost: 0.00029 sec, ips: 22.09601 instance/sec. [03/29 15:14:45] epoch:[ 9/100] train step:100 loss: 2.76908 lr: 0.042426 top1: 0.24999 top5: 0.68749 batch_cost: 0.70043 sec, reader_cost: 0.00027 sec, ips: 22.84327 instance/sec. [03/29 15:14:52] epoch:[ 9/100] train step:110 loss: 2.18999 lr: 0.042667 top1: 0.37494 top5: 0.87483 batch_cost: 0.71288 sec, reader_cost: 0.00032 sec, ips: 22.44418 instance/sec. 
[03/29 15:14:59] epoch:[ 9/100] train step:120 loss: 2.14022 lr: 0.042907 top1: 0.18746 top5: 0.93736 batch_cost: 0.68995 sec, reader_cost: 0.00033 sec, ips: 23.19009 instance/sec. [03/29 15:15:06] epoch:[ 9/100] train step:130 loss: 2.31431 lr: 0.043148 top1: 0.25000 top5: 0.81250 batch_cost: 0.68054 sec, reader_cost: 0.00033 sec, ips: 23.51077 instance/sec. [03/29 15:15:13] epoch:[ 9/100] train step:140 loss: 2.34000 lr: 0.043389 top1: 0.24998 top5: 0.81242 batch_cost: 0.71420 sec, reader_cost: 0.00031 sec, ips: 22.40262 instance/sec. [03/29 15:15:20] epoch:[ 9/100] train step:150 loss: 2.33155 lr: 0.043629 top1: 0.29174 top5: 0.91698 batch_cost: 0.70514 sec, reader_cost: 0.00043 sec, ips: 22.69059 instance/sec. [03/29 15:15:27] epoch:[ 9/100] train step:160 loss: 2.34514 lr: 0.043870 top1: 0.30793 top5: 0.80107 batch_cost: 0.69373 sec, reader_cost: 0.00035 sec, ips: 23.06369 instance/sec. [03/29 15:15:34] epoch:[ 9/100] train step:170 loss: 2.52198 lr: 0.044110 top1: 0.38779 top5: 0.72551 batch_cost: 0.67962 sec, reader_cost: 0.00030 sec, ips: 23.54274 instance/sec. [03/29 15:15:40] epoch:[ 9/100] train step:180 loss: 2.46512 lr: 0.044351 top1: 0.12352 top5: 0.80437 batch_cost: 0.50476 sec, reader_cost: 0.00033 sec, ips: 31.69841 instance/sec. [03/29 15:15:40] END epoch:9 train loss_avg: 2.47134 top1_avg: 0.26852 top5_avg: 0.77996 avg_batch_cost: 0.50409 sec, avg_reader_cost: 0.00078 sec, batch_cost_sum: 127.37165 sec, avg_ips: 22.86223 instance/sec. [03/29 15:15:42] epoch:[ 10/100] train step:0 loss: 2.96107 lr: 0.044399 top1: 0.11891 top5: 0.54424 batch_cost: 1.10587 sec, reader_cost: 0.26607 sec, ips: 14.46825 instance/sec. [03/29 15:15:49] epoch:[ 10/100] train step:10 loss: 2.23709 lr: 0.044639 top1: 0.24998 top5: 0.81245 batch_cost: 0.69632 sec, reader_cost: 0.00033 sec, ips: 22.97799 instance/sec. [03/29 15:15:56] epoch:[ 10/100] train step:20 loss: 2.52222 lr: 0.044880 top1: 0.26982 top5: 0.66465 batch_cost: 0.70230 sec, reader_cost: 0.00032 sec, ips: 22.78237 instance/sec. [03/29 15:16:03] epoch:[ 10/100] train step:30 loss: 2.21324 lr: 0.045120 top1: 0.31150 top5: 0.81031 batch_cost: 0.68848 sec, reader_cost: 0.00031 sec, ips: 23.23970 instance/sec. [03/29 15:16:10] epoch:[ 10/100] train step:40 loss: 2.25340 lr: 0.045361 top1: 0.37500 top5: 0.81250 batch_cost: 0.70449 sec, reader_cost: 0.00032 sec, ips: 22.71134 instance/sec. [03/29 15:16:17] epoch:[ 10/100] train step:50 loss: 2.23045 lr: 0.045601 top1: 0.37443 top5: 0.81135 batch_cost: 0.68609 sec, reader_cost: 0.00031 sec, ips: 23.32069 instance/sec. [03/29 15:16:24] epoch:[ 10/100] train step:60 loss: 2.96213 lr: 0.045842 top1: 0.16072 top5: 0.52233 batch_cost: 0.71334 sec, reader_cost: 0.00030 sec, ips: 22.42967 instance/sec. [03/29 15:16:31] epoch:[ 10/100] train step:70 loss: 2.16059 lr: 0.046082 top1: 0.29633 top5: 0.95552 batch_cost: 0.70037 sec, reader_cost: 0.00028 sec, ips: 22.84511 instance/sec. [03/29 15:16:38] epoch:[ 10/100] train step:80 loss: 2.69990 lr: 0.046323 top1: 0.22429 top5: 0.74821 batch_cost: 0.71538 sec, reader_cost: 0.00037 sec, ips: 22.36570 instance/sec. [03/29 15:16:45] epoch:[ 10/100] train step:90 loss: 2.66676 lr: 0.046564 top1: 0.21580 top5: 0.69281 batch_cost: 0.71799 sec, reader_cost: 0.00036 sec, ips: 22.28433 instance/sec. [03/29 15:16:52] epoch:[ 10/100] train step:100 loss: 2.41354 lr: 0.046804 top1: 0.24999 top5: 0.87497 batch_cost: 0.70958 sec, reader_cost: 0.00035 sec, ips: 22.54862 instance/sec. 
[03/29 15:16:59] epoch:[ 10/100] train step:110 loss: 2.38536 lr: 0.047045 top1: 0.41409 top5: 0.88482 batch_cost: 0.71506 sec, reader_cost: 0.00030 sec, ips: 22.37582 instance/sec. [03/29 15:17:06] epoch:[ 10/100] train step:120 loss: 2.52977 lr: 0.047285 top1: 0.30121 top5: 0.72742 batch_cost: 0.69705 sec, reader_cost: 0.00041 sec, ips: 22.95383 instance/sec. [03/29 15:17:13] epoch:[ 10/100] train step:130 loss: 2.14806 lr: 0.047526 top1: 0.43750 top5: 0.93750 batch_cost: 0.71028 sec, reader_cost: 0.00031 sec, ips: 22.52634 instance/sec. [03/29 15:17:20] epoch:[ 10/100] train step:140 loss: 2.85185 lr: 0.047766 top1: 0.17275 top5: 0.69797 batch_cost: 0.71956 sec, reader_cost: 0.00034 sec, ips: 22.23584 instance/sec. [03/29 15:17:27] epoch:[ 10/100] train step:150 loss: 2.76896 lr: 0.048007 top1: 0.14693 top5: 0.69080 batch_cost: 0.69619 sec, reader_cost: 0.00034 sec, ips: 22.98232 instance/sec. [03/29 15:17:34] epoch:[ 10/100] train step:160 loss: 2.35859 lr: 0.048247 top1: 0.44712 top5: 0.79190 batch_cost: 0.69901 sec, reader_cost: 0.00031 sec, ips: 22.88967 instance/sec. [03/29 15:17:41] epoch:[ 10/100] train step:170 loss: 2.37111 lr: 0.048488 top1: 0.12403 top5: 0.93071 batch_cost: 0.69868 sec, reader_cost: 0.00032 sec, ips: 22.90018 instance/sec. [03/29 15:17:48] epoch:[ 10/100] train step:180 loss: 2.12248 lr: 0.048728 top1: 0.30482 top5: 0.91446 batch_cost: 0.50354 sec, reader_cost: 0.00030 sec, ips: 31.77483 instance/sec. [03/29 15:17:48] END epoch:10 train loss_avg: 2.47666 top1_avg: 0.26287 top5_avg: 0.76981 avg_batch_cost: 0.50302 sec, avg_reader_cost: 0.00086 sec, batch_cost_sum: 127.64865 sec, avg_ips: 22.81262 instance/sec. [03/29 15:17:49] epoch:[ 11/100] train step:0 loss: 2.50177 lr: 0.048776 top1: 0.22239 top5: 0.73659 batch_cost: 1.07179 sec, reader_cost: 0.26938 sec, ips: 14.92834 instance/sec. [03/29 15:17:57] epoch:[ 11/100] train step:10 loss: 2.30357 lr: 0.048763 top1: 0.18749 top5: 0.81245 batch_cost: 0.70079 sec, reader_cost: 0.00032 sec, ips: 22.83136 instance/sec. [03/29 15:18:04] epoch:[ 11/100] train step:20 loss: 2.46937 lr: 0.048750 top1: 0.18685 top5: 0.74804 batch_cost: 0.70470 sec, reader_cost: 0.00029 sec, ips: 22.70458 instance/sec. [03/29 15:18:11] epoch:[ 11/100] train step:30 loss: 2.50641 lr: 0.048736 top1: 0.18749 top5: 0.81246 batch_cost: 0.70733 sec, reader_cost: 0.00034 sec, ips: 22.62037 instance/sec. [03/29 15:18:18] epoch:[ 11/100] train step:40 loss: 2.98757 lr: 0.048723 top1: 0.11354 top5: 0.55625 batch_cost: 0.70535 sec, reader_cost: 0.00029 sec, ips: 22.68367 instance/sec. [03/29 15:18:25] epoch:[ 11/100] train step:50 loss: 2.14526 lr: 0.048709 top1: 0.49998 top5: 0.87497 batch_cost: 0.70087 sec, reader_cost: 0.00035 sec, ips: 22.82889 instance/sec. [03/29 15:18:32] epoch:[ 11/100] train step:60 loss: 2.45612 lr: 0.048695 top1: 0.29120 top5: 0.75925 batch_cost: 0.70693 sec, reader_cost: 0.00029 sec, ips: 22.63315 instance/sec. [03/29 15:18:39] epoch:[ 11/100] train step:70 loss: 2.28251 lr: 0.048681 top1: 0.24959 top5: 0.87390 batch_cost: 0.70870 sec, reader_cost: 0.00030 sec, ips: 22.57653 instance/sec. [03/29 15:18:46] epoch:[ 11/100] train step:80 loss: 2.96637 lr: 0.048667 top1: 0.32829 top5: 0.59408 batch_cost: 0.68660 sec, reader_cost: 0.00036 sec, ips: 23.30312 instance/sec. [03/29 15:18:53] epoch:[ 11/100] train step:90 loss: 2.45896 lr: 0.048654 top1: 0.12500 top5: 0.74997 batch_cost: 0.70486 sec, reader_cost: 0.00029 sec, ips: 22.69938 instance/sec. 
[03/29 15:19:00] epoch:[ 11/100] train step:100 loss: 2.16265 lr: 0.048640 top1: 0.24781 top5: 0.80647 batch_cost: 0.70654 sec, reader_cost: 0.00029 sec, ips: 22.64549 instance/sec. [03/29 15:19:07] epoch:[ 11/100] train step:110 loss: 2.00276 lr: 0.048625 top1: 0.24999 top5: 0.87496 batch_cost: 0.70011 sec, reader_cost: 0.00034 sec, ips: 22.85354 instance/sec. [03/29 15:19:14] epoch:[ 11/100] train step:120 loss: 2.34420 lr: 0.048611 top1: 0.18488 top5: 0.92878 batch_cost: 0.70190 sec, reader_cost: 0.00026 sec, ips: 22.79522 instance/sec. [03/29 15:19:21] epoch:[ 11/100] train step:130 loss: 1.88093 lr: 0.048597 top1: 0.37461 top5: 0.99908 batch_cost: 0.71069 sec, reader_cost: 0.00028 sec, ips: 22.51349 instance/sec. [03/29 15:19:28] epoch:[ 11/100] train step:140 loss: 2.42606 lr: 0.048583 top1: 0.29166 top5: 0.76561 batch_cost: 0.70561 sec, reader_cost: 0.00032 sec, ips: 22.67538 instance/sec. [03/29 15:19:35] epoch:[ 11/100] train step:150 loss: 2.17910 lr: 0.048568 top1: 0.30683 top5: 0.85230 batch_cost: 0.72002 sec, reader_cost: 0.00028 sec, ips: 22.22168 instance/sec. [03/29 15:19:42] epoch:[ 11/100] train step:160 loss: 2.05450 lr: 0.048554 top1: 0.56250 top5: 0.93750 batch_cost: 0.70115 sec, reader_cost: 0.00037 sec, ips: 22.81957 instance/sec. [03/29 15:19:49] epoch:[ 11/100] train step:170 loss: 2.73806 lr: 0.048540 top1: 0.27783 top5: 0.78202 batch_cost: 0.71803 sec, reader_cost: 0.00029 sec, ips: 22.28324 instance/sec. [03/29 15:19:55] epoch:[ 11/100] train step:180 loss: 2.40456 lr: 0.048525 top1: 0.30303 top5: 0.85370 batch_cost: 0.50031 sec, reader_cost: 0.00030 sec, ips: 31.98040 instance/sec. [03/29 15:19:56] END epoch:11 train loss_avg: 2.44419 top1_avg: 0.27968 top5_avg: 0.78469 avg_batch_cost: 0.50109 sec, avg_reader_cost: 0.00085 sec, batch_cost_sum: 127.39963 sec, avg_ips: 22.85721 instance/sec. [03/29 15:19:57] epoch:[ 12/100] train step:0 loss: 2.34270 lr: 0.048522 top1: 0.37288 top5: 0.80785 batch_cost: 1.06039 sec, reader_cost: 0.27052 sec, ips: 15.08877 instance/sec. [03/29 15:20:04] epoch:[ 12/100] train step:10 loss: 2.24573 lr: 0.048507 top1: 0.25000 top5: 0.81250 batch_cost: 0.69120 sec, reader_cost: 0.00034 sec, ips: 23.14829 instance/sec. [03/29 15:20:11] epoch:[ 12/100] train step:20 loss: 2.13167 lr: 0.048493 top1: 0.36477 top5: 0.91875 batch_cost: 0.69624 sec, reader_cost: 0.00028 sec, ips: 22.98063 instance/sec. [03/29 15:20:18] epoch:[ 12/100] train step:30 loss: 2.04649 lr: 0.048478 top1: 0.30890 top5: 0.86599 batch_cost: 0.69645 sec, reader_cost: 0.00043 sec, ips: 22.97363 instance/sec. [03/29 15:20:25] epoch:[ 12/100] train step:40 loss: 2.23277 lr: 0.048463 top1: 0.18750 top5: 0.93749 batch_cost: 0.69281 sec, reader_cost: 0.00032 sec, ips: 23.09439 instance/sec. [03/29 15:20:32] epoch:[ 12/100] train step:50 loss: 2.26164 lr: 0.048448 top1: 0.24998 top5: 0.87493 batch_cost: 0.72817 sec, reader_cost: 0.00029 sec, ips: 21.97301 instance/sec. [03/29 15:20:39] epoch:[ 12/100] train step:60 loss: 2.25266 lr: 0.048433 top1: 0.36681 top5: 0.80022 batch_cost: 0.70896 sec, reader_cost: 0.00042 sec, ips: 22.56815 instance/sec. [03/29 15:20:46] epoch:[ 12/100] train step:70 loss: 2.47188 lr: 0.048418 top1: 0.17410 top5: 0.66070 batch_cost: 0.70223 sec, reader_cost: 0.00028 sec, ips: 22.78453 instance/sec. [03/29 15:20:54] epoch:[ 12/100] train step:80 loss: 2.20363 lr: 0.048403 top1: 0.37496 top5: 0.87487 batch_cost: 0.73253 sec, reader_cost: 0.00029 sec, ips: 21.84199 instance/sec. 
[03/29 15:21:01] epoch:[ 12/100] train step:90 loss: 2.41825 lr: 0.048388 top1: 0.30803 top5: 0.74196 batch_cost: 0.69405 sec, reader_cost: 0.00029 sec, ips: 23.05312 instance/sec. [03/29 15:21:08] epoch:[ 12/100] train step:100 loss: 2.84730 lr: 0.048372 top1: 0.23188 top5: 0.64128 batch_cost: 0.70100 sec, reader_cost: 0.00030 sec, ips: 22.82440 instance/sec. [03/29 15:21:15] epoch:[ 12/100] train step:110 loss: 2.14583 lr: 0.048357 top1: 0.37411 top5: 0.93572 batch_cost: 0.71812 sec, reader_cost: 0.00030 sec, ips: 22.28042 instance/sec. [03/29 15:21:22] epoch:[ 12/100] train step:120 loss: 2.24149 lr: 0.048342 top1: 0.56249 top5: 0.93749 batch_cost: 0.70294 sec, reader_cost: 0.00031 sec, ips: 22.76158 instance/sec. [03/29 15:21:29] epoch:[ 12/100] train step:130 loss: 2.24358 lr: 0.048326 top1: 0.31225 top5: 0.93691 batch_cost: 0.68991 sec, reader_cost: 0.00028 sec, ips: 23.19149 instance/sec. [03/29 15:21:36] epoch:[ 12/100] train step:140 loss: 2.29778 lr: 0.048311 top1: 0.18639 top5: 0.86275 batch_cost: 0.71049 sec, reader_cost: 0.00036 sec, ips: 22.51966 instance/sec. [03/29 15:21:43] epoch:[ 12/100] train step:150 loss: 2.29803 lr: 0.048295 top1: 0.18732 top5: 0.81182 batch_cost: 0.69540 sec, reader_cost: 0.00027 sec, ips: 23.00829 instance/sec. [03/29 15:21:50] epoch:[ 12/100] train step:160 loss: 2.74029 lr: 0.048279 top1: 0.27945 top5: 0.58836 batch_cost: 0.74125 sec, reader_cost: 0.00038 sec, ips: 21.58526 instance/sec. [03/29 15:21:57] epoch:[ 12/100] train step:170 loss: 2.17319 lr: 0.048263 top1: 0.31250 top5: 1.00000 batch_cost: 0.70173 sec, reader_cost: 0.00035 sec, ips: 22.80064 instance/sec. [03/29 15:22:03] epoch:[ 12/100] train step:180 loss: 2.51104 lr: 0.048248 top1: 0.32995 top5: 0.81536 batch_cost: 0.50048 sec, reader_cost: 0.00034 sec, ips: 31.96899 instance/sec. [03/29 15:22:04] END epoch:12 train loss_avg: 2.40610 top1_avg: 0.28942 top5_avg: 0.79410 avg_batch_cost: 0.50189 sec, avg_reader_cost: 0.00104 sec, batch_cost_sum: 127.84539 sec, avg_ips: 22.77751 instance/sec. [03/29 15:22:05] epoch:[ 13/100] train step:0 loss: 2.62288 lr: 0.048244 top1: 0.21024 top5: 0.72155 batch_cost: 1.05660 sec, reader_cost: 0.27271 sec, ips: 15.14284 instance/sec. [03/29 15:22:12] epoch:[ 13/100] train step:10 loss: 2.80555 lr: 0.048228 top1: 0.14584 top5: 0.75001 batch_cost: 0.70311 sec, reader_cost: 0.00046 sec, ips: 22.75603 instance/sec. [03/29 15:22:19] epoch:[ 13/100] train step:20 loss: 2.68370 lr: 0.048213 top1: 0.21862 top5: 0.74974 batch_cost: 0.68662 sec, reader_cost: 0.00029 sec, ips: 23.30264 instance/sec. [03/29 15:22:26] epoch:[ 13/100] train step:30 loss: 1.89672 lr: 0.048196 top1: 0.42875 top5: 0.98105 batch_cost: 0.71042 sec, reader_cost: 0.00035 sec, ips: 22.52175 instance/sec. [03/29 15:22:33] epoch:[ 13/100] train step:40 loss: 2.25122 lr: 0.048180 top1: 0.30972 top5: 0.80231 batch_cost: 0.72042 sec, reader_cost: 0.00034 sec, ips: 22.20935 instance/sec. [03/29 15:22:40] epoch:[ 13/100] train step:50 loss: 2.09944 lr: 0.048164 top1: 0.56250 top5: 0.81250 batch_cost: 0.70081 sec, reader_cost: 0.00031 sec, ips: 22.83072 instance/sec. [03/29 15:22:47] epoch:[ 13/100] train step:60 loss: 2.68522 lr: 0.048148 top1: 0.14655 top5: 0.73167 batch_cost: 0.69789 sec, reader_cost: 0.00030 sec, ips: 22.92612 instance/sec. [03/29 15:22:54] epoch:[ 13/100] train step:70 loss: 2.16664 lr: 0.048132 top1: 0.48979 top5: 0.91853 batch_cost: 0.72295 sec, reader_cost: 0.00031 sec, ips: 22.13148 instance/sec. 
[03/29 15:23:01] epoch:[ 13/100] train step:80 loss: 2.33765 lr: 0.048115 top1: 0.43749 top5: 0.81249 batch_cost: 0.69810 sec, reader_cost: 0.00031 sec, ips: 22.91927 instance/sec. [03/29 15:23:08] epoch:[ 13/100] train step:90 loss: 2.73813 lr: 0.048099 top1: 0.22141 top5: 0.78026 batch_cost: 0.71928 sec, reader_cost: 0.00029 sec, ips: 22.24458 instance/sec. [03/29 15:23:15] epoch:[ 13/100] train step:100 loss: 2.02671 lr: 0.048082 top1: 0.43676 top5: 0.99852 batch_cost: 0.72687 sec, reader_cost: 0.00034 sec, ips: 22.01223 instance/sec. [03/29 15:23:22] epoch:[ 13/100] train step:110 loss: 2.81072 lr: 0.048065 top1: 0.25000 top5: 0.67396 batch_cost: 0.68568 sec, reader_cost: 0.00038 sec, ips: 23.33464 instance/sec. [03/29 15:23:29] epoch:[ 13/100] train step:120 loss: 2.44214 lr: 0.048049 top1: 0.44948 top5: 0.85667 batch_cost: 0.69001 sec, reader_cost: 0.00035 sec, ips: 23.18806 instance/sec. [03/29 15:23:36] epoch:[ 13/100] train step:130 loss: 2.36915 lr: 0.048032 top1: 0.18594 top5: 0.86722 batch_cost: 0.70887 sec, reader_cost: 0.00033 sec, ips: 22.57108 instance/sec. [03/29 15:23:44] epoch:[ 13/100] train step:140 loss: 2.00205 lr: 0.048015 top1: 0.49937 top5: 0.87415 batch_cost: 0.68674 sec, reader_cost: 0.00032 sec, ips: 23.29840 instance/sec. [03/29 15:23:51] epoch:[ 13/100] train step:150 loss: 2.50445 lr: 0.047998 top1: 0.25000 top5: 0.81250 batch_cost: 0.72046 sec, reader_cost: 0.00028 sec, ips: 22.20794 instance/sec. [03/29 15:23:58] epoch:[ 13/100] train step:160 loss: 2.01490 lr: 0.047981 top1: 0.43750 top5: 0.81250 batch_cost: 0.70387 sec, reader_cost: 0.00026 sec, ips: 22.73160 instance/sec. [03/29 15:24:05] epoch:[ 13/100] train step:170 loss: 3.32327 lr: 0.047964 top1: 0.06250 top5: 0.58697 batch_cost: 0.70310 sec, reader_cost: 0.00032 sec, ips: 22.75645 instance/sec. [03/29 15:24:11] epoch:[ 13/100] train step:180 loss: 2.75158 lr: 0.047947 top1: 0.21913 top5: 0.66768 batch_cost: 0.50247 sec, reader_cost: 0.00030 sec, ips: 31.84261 instance/sec. [03/29 15:24:12] END epoch:13 train loss_avg: 2.36569 top1_avg: 0.31106 top5_avg: 0.81313 avg_batch_cost: 0.50191 sec, avg_reader_cost: 0.00084 sec, batch_cost_sum: 127.57903 sec, avg_ips: 22.82507 instance/sec. [03/29 15:24:13] epoch:[ 14/100] train step:0 loss: 2.13885 lr: 0.047944 top1: 0.23971 top5: 0.95370 batch_cost: 1.07979 sec, reader_cost: 0.26358 sec, ips: 14.81771 instance/sec. [03/29 15:24:20] epoch:[ 14/100] train step:10 loss: 2.80225 lr: 0.047927 top1: 0.23444 top5: 0.62132 batch_cost: 0.70389 sec, reader_cost: 0.00030 sec, ips: 22.73072 instance/sec. [03/29 15:24:27] epoch:[ 14/100] train step:20 loss: 2.21788 lr: 0.047909 top1: 0.31250 top5: 0.75000 batch_cost: 0.69107 sec, reader_cost: 0.00031 sec, ips: 23.15234 instance/sec. [03/29 15:24:34] epoch:[ 14/100] train step:30 loss: 2.24101 lr: 0.047892 top1: 0.30933 top5: 0.80553 batch_cost: 0.71783 sec, reader_cost: 0.00032 sec, ips: 22.28937 instance/sec. [03/29 15:24:41] epoch:[ 14/100] train step:40 loss: 2.69538 lr: 0.047875 top1: 0.29935 top5: 0.68012 batch_cost: 0.68265 sec, reader_cost: 0.00030 sec, ips: 23.43822 instance/sec. [03/29 15:24:48] epoch:[ 14/100] train step:50 loss: 2.37307 lr: 0.047857 top1: 0.41123 top5: 0.83098 batch_cost: 0.69714 sec, reader_cost: 0.00030 sec, ips: 22.95105 instance/sec. [03/29 15:24:55] epoch:[ 14/100] train step:60 loss: 2.31909 lr: 0.047840 top1: 0.12500 top5: 0.91260 batch_cost: 0.69593 sec, reader_cost: 0.00034 sec, ips: 22.99079 instance/sec. 
[03/29 15:25:02] epoch:[ 14/100] train step:70 loss: 2.88072 lr: 0.047822 top1: 0.27741 top5: 0.61431 batch_cost: 0.72125 sec, reader_cost: 0.00030 sec, ips: 22.18380 instance/sec. [03/29 15:25:09] epoch:[ 14/100] train step:80 loss: 1.82962 lr: 0.047805 top1: 0.49984 top5: 0.99976 batch_cost: 0.72015 sec, reader_cost: 0.00028 sec, ips: 22.21755 instance/sec. [03/29 15:25:17] epoch:[ 14/100] train step:90 loss: 2.23941 lr: 0.047787 top1: 0.37480 top5: 0.87464 batch_cost: 0.69583 sec, reader_cost: 0.00036 sec, ips: 22.99406 instance/sec. [03/29 15:25:23] epoch:[ 14/100] train step:100 loss: 2.20035 lr: 0.047769 top1: 0.37470 top5: 0.87417 batch_cost: 0.69534 sec, reader_cost: 0.00027 sec, ips: 23.01034 instance/sec. [03/29 15:25:30] epoch:[ 14/100] train step:110 loss: 2.75755 lr: 0.047751 top1: 0.11804 top5: 0.81934 batch_cost: 0.68740 sec, reader_cost: 0.00034 sec, ips: 23.27604 instance/sec. [03/29 15:25:37] epoch:[ 14/100] train step:120 loss: 2.41216 lr: 0.047733 top1: 0.38373 top5: 0.77515 batch_cost: 0.69971 sec, reader_cost: 0.00034 sec, ips: 22.86665 instance/sec. [03/29 15:25:45] epoch:[ 14/100] train step:130 loss: 2.14232 lr: 0.047715 top1: 0.31249 top5: 0.99996 batch_cost: 0.70626 sec, reader_cost: 0.00031 sec, ips: 22.65456 instance/sec. [03/29 15:25:51] epoch:[ 14/100] train step:140 loss: 3.02582 lr: 0.047697 top1: 0.24372 top5: 0.68750 batch_cost: 0.68608 sec, reader_cost: 0.00034 sec, ips: 23.32094 instance/sec. [03/29 15:25:58] epoch:[ 14/100] train step:150 loss: 2.09214 lr: 0.047679 top1: 0.37369 top5: 0.87108 batch_cost: 0.70625 sec, reader_cost: 0.00031 sec, ips: 22.65485 instance/sec. [03/29 15:26:06] epoch:[ 14/100] train step:160 loss: 2.13291 lr: 0.047661 top1: 0.37499 top5: 0.99999 batch_cost: 0.69103 sec, reader_cost: 0.00027 sec, ips: 23.15376 instance/sec. [03/29 15:26:13] epoch:[ 14/100] train step:170 loss: 2.17193 lr: 0.047643 top1: 0.31250 top5: 0.93750 batch_cost: 0.70904 sec, reader_cost: 0.00031 sec, ips: 22.56582 instance/sec. [03/29 15:26:19] epoch:[ 14/100] train step:180 loss: 3.14274 lr: 0.047624 top1: 0.18750 top5: 0.59400 batch_cost: 0.50291 sec, reader_cost: 0.00033 sec, ips: 31.81512 instance/sec. [03/29 15:26:19] END epoch:14 train loss_avg: 2.40408 top1_avg: 0.30748 top5_avg: 0.80169 avg_batch_cost: 0.50137 sec, avg_reader_cost: 0.00107 sec, batch_cost_sum: 127.67140 sec, avg_ips: 22.80855 instance/sec. [03/29 15:26:21] epoch:[ 15/100] train step:0 loss: 2.10868 lr: 0.047621 top1: 0.29461 top5: 0.96780 batch_cost: 1.11929 sec, reader_cost: 0.27810 sec, ips: 14.29474 instance/sec. [03/29 15:26:28] epoch:[ 15/100] train step:10 loss: 2.11238 lr: 0.047602 top1: 0.37500 top5: 0.87500 batch_cost: 0.71041 sec, reader_cost: 0.00039 sec, ips: 22.52221 instance/sec. [03/29 15:26:35] epoch:[ 15/100] train step:20 loss: 1.98316 lr: 0.047584 top1: 0.55776 top5: 0.93158 batch_cost: 0.70243 sec, reader_cost: 0.00030 sec, ips: 22.77807 instance/sec. [03/29 15:26:42] epoch:[ 15/100] train step:30 loss: 2.07229 lr: 0.047565 top1: 0.31113 top5: 0.99699 batch_cost: 0.72595 sec, reader_cost: 0.00032 sec, ips: 22.04013 instance/sec. [03/29 15:26:49] epoch:[ 15/100] train step:40 loss: 2.30547 lr: 0.047547 top1: 0.37489 top5: 0.93724 batch_cost: 0.70239 sec, reader_cost: 0.00031 sec, ips: 22.77949 instance/sec. [03/29 15:26:56] epoch:[ 15/100] train step:50 loss: 1.78268 lr: 0.047528 top1: 0.49999 top5: 0.87499 batch_cost: 0.70733 sec, reader_cost: 0.00027 sec, ips: 22.62018 instance/sec. 
[03/29 15:27:03] epoch:[ 15/100] train step:60 loss: 2.87478 lr: 0.047509 top1: 0.22292 top5: 0.59792 batch_cost: 0.70234 sec, reader_cost: 0.00030 sec, ips: 22.78096 instance/sec. [03/29 15:27:10] epoch:[ 15/100] train step:70 loss: 2.52452 lr: 0.047490 top1: 0.27055 top5: 0.81823 batch_cost: 0.68102 sec, reader_cost: 0.00034 sec, ips: 23.49428 instance/sec. [03/29 15:27:17] epoch:[ 15/100] train step:80 loss: 2.84215 lr: 0.047472 top1: 0.21708 top5: 0.61830 batch_cost: 0.69927 sec, reader_cost: 0.00031 sec, ips: 22.88088 instance/sec. [03/29 15:27:24] epoch:[ 15/100] train step:90 loss: 2.11277 lr: 0.047453 top1: 0.37473 top5: 0.87446 batch_cost: 0.70377 sec, reader_cost: 0.00031 sec, ips: 22.73455 instance/sec. [03/29 15:27:31] epoch:[ 15/100] train step:100 loss: 2.68333 lr: 0.047434 top1: 0.27151 top5: 0.73872 batch_cost: 0.70957 sec, reader_cost: 0.00037 sec, ips: 22.54895 instance/sec. [03/29 15:27:38] epoch:[ 15/100] train step:110 loss: 2.21230 lr: 0.047414 top1: 0.43026 top5: 0.92544 batch_cost: 0.70381 sec, reader_cost: 0.00030 sec, ips: 22.73329 instance/sec. [03/29 15:27:45] epoch:[ 15/100] train step:120 loss: 2.17136 lr: 0.047395 top1: 0.37245 top5: 0.86939 batch_cost: 0.70774 sec, reader_cost: 0.00029 sec, ips: 22.60725 instance/sec. [03/29 15:27:52] epoch:[ 15/100] train step:130 loss: 2.63410 lr: 0.047376 top1: 0.41518 top5: 0.69792 batch_cost: 0.69438 sec, reader_cost: 0.00036 sec, ips: 23.04203 instance/sec. [03/29 15:28:00] epoch:[ 15/100] train step:140 loss: 1.95653 lr: 0.047357 top1: 0.43738 top5: 0.99973 batch_cost: 0.71293 sec, reader_cost: 0.00033 sec, ips: 22.44254 instance/sec. [03/29 15:28:07] epoch:[ 15/100] train step:150 loss: 2.01001 lr: 0.047338 top1: 0.31214 top5: 0.87413 batch_cost: 0.68462 sec, reader_cost: 0.00028 sec, ips: 23.37066 instance/sec. [03/29 15:28:13] epoch:[ 15/100] train step:160 loss: 2.06148 lr: 0.047318 top1: 0.43743 top5: 0.87480 batch_cost: 0.68099 sec, reader_cost: 0.00029 sec, ips: 23.49515 instance/sec. [03/29 15:28:20] epoch:[ 15/100] train step:170 loss: 2.06328 lr: 0.047299 top1: 0.31217 top5: 0.99900 batch_cost: 0.68033 sec, reader_cost: 0.00032 sec, ips: 23.51808 instance/sec. [03/29 15:28:27] epoch:[ 15/100] train step:180 loss: 2.57521 lr: 0.047279 top1: 0.17190 top5: 0.69540 batch_cost: 0.50064 sec, reader_cost: 0.00030 sec, ips: 31.95936 instance/sec. [03/29 15:28:27] END epoch:15 train loss_avg: 2.33631 top1_avg: 0.33432 top5_avg: 0.82754 avg_batch_cost: 0.49997 sec, avg_reader_cost: 0.00083 sec, batch_cost_sum: 127.51969 sec, avg_ips: 22.83569 instance/sec.
In the output log, a line such as epoch:[ 1/100] train step:180 loss: 2.88748 is read as follows:
- epoch:[ 1/100] means the current epoch is the 1st of 100 epochs.
- train step:180 means this log entry was produced at the 180th batch of that epoch.
- The train step only goes up to about 180 because, within one epoch, number of training samples / batch_size = 2922 / 16 = 182.625. That is, each epoch contains 182 full batches (the remaining 10 samples do not fill a batch, and how they are handled depends on the options discussed above). Each batch is one iteration: the model parameters are updated and metrics such as the loss can be computed.
- A log entry is written every 10 batches (log_interval : 10 in the configuration), recording the loss and other metrics after that iteration. So within each epoch the logged steps run from train step:0, train step:10, and so on in increments of 10, up to train step:180.
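The logged step indices can be reproduced with the same arithmetic (a small sketch; the index of the very last step depends on whether the incomplete batch is kept, but the last logged step is 180 either way):

```python
num_samples, batch_size, log_interval = 2922, 16, 10

full_batches = num_samples // batch_size                   # 182 full batches -> step indices 0..181
logged_steps = list(range(0, full_batches, log_interval))  # steps at which a log line is printed
print(logged_steps[:3], logged_steps[-1])                  # [0, 10, 20] 180
```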