
Sigmoid forward propagation, loss computation, and backward-propagation gradient calculation


CS231n only gives the gradient derivation for softmax, so I tried writing the sigmoid version myself. The backward pass below only works for binary classification, and the labels must be a plain list of the form [0, 1, 0, 1, ...]. When I checked the backward gradient with CS231n's numerical gradient-check function, the numerical gradient and my analytic gradient always differed by a factor of 2. It turned out that the loss counts both the 1-label term and the 0-label term for every sample, so each sample is effectively counted twice, which amounts to 2N data points (see the short derivation below).
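To see where that factor of 2 comes from: with one-hot targets y of shape (N, 2) and probabilities p = sigmoid(x), the loss below averages the element-wise binary cross-entropy over all 2N entries, so the gradient is normalised by 2N rather than N:

$$L = \frac{1}{2N}\sum_{i=1}^{N}\sum_{k=1}^{2}\Bigl[-y_{ik}\log p_{ik} - (1 - y_{ik})\log(1 - p_{ik})\Bigr], \qquad \frac{\partial L}{\partial x_{ik}} = \frac{p_{ik} - y_{ik}}{2N}$$

which is exactly what dx = da*probs*(1-probs)/N/2 computes in the code, since dL/dp multiplied by p(1-p) simplifies to p - y.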

import numpy as np

import keras.backend as K  # used only to evaluate the TensorFlow tensors in the comparison below
import tensorflow as tf


def eval_numerical_gradient(f, x, verbose=True, h=0.00001):
    """
    a naive implementation of numerical gradient of f at x
    - f should be a function that takes a single argument
    - x is the point (numpy array) to evaluate the gradient at
    """

    fx = f(x) # evaluate function value at original point
    grad = np.zeros_like(x)
    # iterate over all indexes in x
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:

        # evaluate function at x+h
        ix = it.multi_index
        oldval = x[ix]
        x[ix] = oldval + h # increment by h
        fxph = f(x) # evaluate f(x + h)
        x[ix] = oldval - h
        fxmh = f(x) # evaluate f(x - h)
        x[ix] = oldval # restore

        # compute the partial derivative with centered formula
        grad[ix] = (fxph - fxmh) / (2 * h) # the slope
        if verbose:
            print(ix, grad[ix])
        it.iternext() # step to next dimension

    return grad
def rel_error(x, y):
    return np.max(np.abs(x - y) / (np.maximum(1e-8, np.abs(x) + np.abs(y))))
def sigmoid_forward(Z):
    A = 1 / (1 + np.exp(-Z))
    cache = Z
    return A, cache
def sigmoid_loss(x, y):  ####### binary classification only (use softmax_loss for multi-class); y is a vector of integer labels, not a one-hot matrix
    probs, _ = sigmoid_forward(x)
    N = x.shape[0]
    y_ = np.zeros((N,2))
    y_[np.arange(N),y] = 1  # one-hot encode the labels
    loss = -y_*np.log(probs)-(1-y_)*np.log(1-probs)  # element-wise binary cross-entropy
    loss = np.mean(loss)  # mean over all 2N entries

    da = -(y_/probs-(1-y_)/(1-probs))  # dL/d(probs)
    dx = da*probs * (1 - probs)/N/2  ######## = (probs - y_)/(2N): both the 1-label and the 0-label terms are counted, so the effective sample count is 2N
    ### differs from the gradient in Andrew Ng's videos, where each label is only counted once
    return loss,dx


num_classes,num_inputs=2,5
x=0.001*np.random.randn(num_inputs,num_classes)
y=np.random.randint(num_classes,size=num_inputs)

loss_sig,_= sigmoid_loss(x,y)
print('loss of my sigmoid:',loss_sig)
print()

########## TensorFlow version, for comparison
probs, _ = sigmoid_forward(x)
N = x.shape[0]
y_ = np.zeros((N,2))
y_[np.arange(N),y] = 1
####### tf.nn.sigmoid_cross_entropy_with_logits expects one-hot labels; logits are the raw scores x (not probabilities), since the sigmoid is applied internally
loss_tf = tf.nn.sigmoid_cross_entropy_with_logits(labels=y_, logits=x)
loss_tf_ave = tf.reduce_mean(loss_tf)
print("the loss tf  is: ", K.eval(loss_tf))
print("the loss  tf ave is: ", K.eval(loss_tf_ave))
####### gradient check
dx_num = eval_numerical_gradient(lambda x:sigmoid_loss(x,y)[0],x,verbose=False)
loss,dx=sigmoid_loss(x,y)
print('\ntesting sigmoid_loss:')
print('loss:',loss)
print('dx error:',rel_error(dx_num,dx))


Output:

loss of my sigmoid: 0.693255238734

the loss tf  is:  [[ 0.69340774  0.69344562]
 [ 0.69317672  0.6938127 ]
 [ 0.692981    0.69248962]
 [ 0.6933821   0.6935182 ]
 [ 0.69347767  0.69286103]]
the loss  tf ave is:  0.693255238734

testing sigmoid_loss:
loss: 0.693255238734
dx error: 8.05752159213e-11
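For reference, the comparison above goes through keras.backend's K.eval, which assumes the TF 1.x graph/session style. On TensorFlow 2.x, where eager execution is the default, a minimal sketch of the same check (the same tf.nn.sigmoid_cross_entropy_with_logits call, just evaluated with .numpy(); variable names mirror the code above) could look like this:

import numpy as np
import tensorflow as tf  # assumes TensorFlow 2.x, eager execution

num_classes, num_inputs = 2, 5
x = 0.001 * np.random.randn(num_inputs, num_classes)
y = np.random.randint(num_classes, size=num_inputs)

# one-hot targets, built the same way as above
y_ = np.zeros((num_inputs, num_classes))
y_[np.arange(num_inputs), y] = 1

# labels are one-hot, logits are the raw scores (the sigmoid is applied internally)
loss_tf = tf.nn.sigmoid_cross_entropy_with_logits(labels=y_, logits=x)
print("the loss tf is: ", loss_tf.numpy())
print("the loss tf ave is: ", tf.reduce_mean(loss_tf).numpy())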
