Implement dropout (inverted dropout)
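A minimal numpy sketch of inverted dropout for one layer's activations (the function name and shapes are assumptions; `a` holds one example per column and `keep_prob` is the probability of keeping a unit). Dividing by `keep_prob` keeps the expected activation unchanged, which is exactly why no extra scaling is needed at test time:

```python
import numpy as np

def inverted_dropout(a, keep_prob, rng=np.random.default_rng(0)):
    """Apply inverted dropout to activations a (shape: units x batch)."""
    # Keep each unit independently with probability keep_prob.
    d = rng.random(a.shape) < keep_prob
    a = a * d
    # Scale up so E[a] is unchanged; this is the "inverted" part,
    # and it is what lets test-time forward passes skip dropout entirely.
    return a / keep_prob
```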
Making predictions at test time
Data augmentation
Early stopping
Why normalize inputs?
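A minimal sketch of the normalization itself, assuming the course convention that `X` holds one example per column. Zero-centering and unit-scaling each feature makes the cost surface rounder, so gradient descent can take larger steps:

```python
import numpy as np

def normalize_inputs(X, eps=1e-8):
    """Zero-center and unit-scale each feature (each row of X)."""
    mu = X.mean(axis=1, keepdims=True)       # per-feature mean
    sigma2 = X.var(axis=1, keepdims=True)    # per-feature variance
    return (X - mu) / np.sqrt(sigma2 + eps)  # eps guards against division by zero
```

Note that the same mu and sigma2 computed on the training set should also be applied to test data, rather than recomputing them at test time.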
Training with mini-batch gradient descent
Choosing your mini-batch size
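The course's guidance: with a small training set (roughly m ≤ 2000) plain batch gradient descent is fine; otherwise pick a mini-batch size that is a power of 2, typically 64, 128, 256, or 512. A minimal sketch of one shuffled epoch under the same one-example-per-column convention (the generator name is an assumption):

```python
import numpy as np

def minibatches(X, Y, batch_size=64, rng=np.random.default_rng(0)):
    """Yield shuffled mini-batches for one epoch; X, Y hold one example per column."""
    m = X.shape[1]
    perm = rng.permutation(m)                 # reshuffle examples each epoch
    X, Y = X[:, perm], Y[:, perm]
    for k in range(0, m, batch_size):         # the last batch may be smaller
        yield X[:, k:k + batch_size], Y[:, k:k + batch_size]
```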
$V_t = \beta V_{t-1} + (1 - \beta)\theta_t$
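A minimal sketch of that exponentially weighted average update, including the bias correction $V_t / (1 - \beta^t)$ the course introduces to fix the low estimates during the first few steps:

```python
def ewa(thetas, beta=0.9):
    """Exponentially weighted average of a sequence, with bias correction."""
    v, out = 0.0, []
    for t, theta in enumerate(thetas, start=1):
        v = beta * v + (1 - beta) * theta     # V_t = beta * V_{t-1} + (1-beta) * theta_t
        out.append(v / (1 - beta ** t))       # bias-corrected estimate for early t
    return out
```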
Root Mean Square prop (RMSprop)
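Written out in the same notation as the average above, RMSprop keeps an exponentially weighted average of the squared gradients and divides the update by its square root (the squaring is element-wise, and $\varepsilon \approx 10^{-8}$ prevents division by zero):

$$S_{dW} = \beta_2\, S_{dW} + (1 - \beta_2)\, dW^{2}, \qquad W := W - \alpha\, \frac{dW}{\sqrt{S_{dW}} + \varepsilon}$$

The effect is to damp oscillations along directions with large gradients while allowing faster progress along flat directions.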
Hyperparameters
Try random values: Don’t use a grid
Coarse to fine
Picking hyperparameters at random
Appropriate scale for hyperparameters
Hyperparameters for exponentially weighted averages
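A minimal sketch tying the last few points together: sample settings at random rather than on a grid, draw the learning rate on a log scale, and sample the exponentially-weighted-average $\beta$ through $1 - \beta$ (the ranges below are the course's examples, $\alpha \in [10^{-4}, 1]$ and $\beta \in [0.9, 0.999]$):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_hyperparameters():
    """Draw one random hyperparameter setting on appropriate scales."""
    r = -4 * rng.random()              # r uniform in [-4, 0]
    alpha = 10 ** r                    # learning rate, log-uniform over [1e-4, 1]
    r = -2 * rng.random() - 1          # r uniform in [-3, -1]
    beta = 1 - 10 ** r                 # EWA beta, log-uniform over [0.9, 0.999]
    return alpha, beta
```

Sampling $1 - \beta$ on a log scale matters because the averaging window $\approx 1/(1-\beta)$ is far more sensitive to changes in $\beta$ near 1 than near 0.9.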
Re-test hyperparameters occasionally
How to search hyperparameters?
Implement Batch Norm
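A minimal numpy sketch of the batch-norm forward step for one layer, assuming `z` holds one example per column and `gamma`, `beta` are the learnable per-unit scale and shift:

```python
import numpy as np

def batch_norm_forward(z, gamma, beta, eps=1e-8):
    """Normalize z across the mini-batch, then scale and shift."""
    mu = z.mean(axis=1, keepdims=True)            # per-unit batch mean
    sigma2 = z.var(axis=1, keepdims=True)         # per-unit batch variance
    z_norm = (z - mu) / np.sqrt(sigma2 + eps)     # zero mean, unit variance
    return gamma * z_norm + beta                  # learnable scale and shift
```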
Working with mini-batches
Loss function
One spot in the assignment:

tf.keras.losses.categorical_crossentropy(tf.transpose(labels), tf.transpose(logits), from_logits=True)

Passing from_logits=True hands the raw logits straight to the loss, which applies the softmax internally in a numerically stable way; this is the form used in the assignment's reference answer.
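For context, a self-contained sketch of that call. The shapes here are assumptions: labels and logits stored column-wise with one example per column, which is what the two transposes in the assignment's call suggest:

```python
import tensorflow as tf

# Hypothetical data: 6 classes, 2 examples, stored column-wise (classes x batch).
logits = tf.constant([[2.0, -1.0], [0.5, 0.3], [1.2, 0.0],
                      [-0.7, 2.2], [0.1, -0.5], [0.0, 1.0]])
labels = tf.transpose(tf.one_hot([3, 0], depth=6))   # also (classes x batch)

# from_logits=True: the loss applies a numerically stable softmax internally,
# so the logits are passed in raw rather than pre-softmaxed.
loss = tf.keras.losses.categorical_crossentropy(
    tf.transpose(labels), tf.transpose(logits), from_logits=True)
print(loss)  # one cross-entropy value per example
```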