CS224n_2019_Assignment2: word2vec coding solution

@[TOC](CS 224n (2019) Assignment #2 coding solutions)

Preface

A2 is about implementing the forward and backward propagation of word2vec. To get the coding part of A2 completely right, it is well worth working through the handwritten gradient derivations in the written part first: this assignment does not call a ready-made package, you implement the math formulas in code yourself, and trying to write them from thin air makes mistakes very easy.

The final result of running this program should be: Gradient Check Pass: pass

Assignment details

Environment setup

The environment can be set up with either the Anaconda Prompt or PyCharm.

1) Anaconda Prompt: change into the a2 directory and run the commands below to create the new environment automatically. The advantage is convenience; the drawback is that it pulls in many packages and takes up some disk space:

conda env create -f env.yml
conda activate a2

After finishing a2, you can deactivate the environment:

conda deactivate

2) PyCharm: create the environment in the a2 directory and install the required packages by hand. The advantage is that only a few packages are needed to finish the assignment, and anyone familiar with PyCharm will find it easy to set up (a suggested minimal install follows below).
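If you take the PyCharm route, my rough guess at the minimal install, based on the starter code's imports rather than the official env.yml, is numpy for the math and matplotlib for the word-vector plot in run.py:

pip install numpy matplotlib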

( a ) word2vec model

sigmoid function:

import numpy as np

def sigmoid(x):
    """
    Compute the sigmoid function for the input here.
    Arguments:
    x -- A scalar or numpy array.
    Return:
    s -- sigmoid(x)
    """

    ### YOUR CODE HERE
    s = 1/(1 + np.exp(-x))
    ### END YOUR CODE

    return s
naiveSoftmaxLossAndGradient function:

The general flow is: first compute y_hat, then the loss, then the gradient of each parameter. Remember what the professor says: always check vector shapes!
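For reference, these are the formulas from the written part that the code implements. With $\hat{y} = \mathrm{softmax}(Uv_c)$ and $y$ the one-hot vector for the outside word $o$:

$$
J = -\log \hat{y}_o, \qquad
\frac{\partial J}{\partial v_c} = U^\top(\hat{y} - y), \qquad
\frac{\partial J}{\partial U} = (\hat{y} - y)\,v_c^\top
$$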

def naiveSoftmaxLossAndGradient(
    centerWordVec,
    outsideWordIdx,
    outsideVectors,
    dataset
):
    """ Naive Softmax loss & gradient function for word2vec models

    Implement the naive softmax loss and gradients between a center word's 
    embedding and an outside word's embedding. This will be the building block
    for our word2vec models.

    Arguments:
    centerWordVec -- numpy ndarray, center word's embedding
                    (v_c in the pdf handout)
    outsideWordIdx -- integer, the index of the outside word
                    (o of u_o in the pdf handout)
    outsideVectors -- outside vectors (rows of matrix) for all words in vocab
                      (U in the pdf handout)
    dataset -- needed for negative sampling, unused here.

    Return:
    loss -- naive softmax loss
    gradCenterVec -- the gradient with respect to the center word vector
                     (dJ / dv_c in the pdf handout)
    gradOutsideVecs -- the gradient with respect to all the outside word vectors
                    (dJ / dU)
    """

    ### YOUR CODE HERE

    ### Please use the provided softmax function (imported earlier in this file)
    ### This numerically stable implementation helps you avoid issues pertaining
    ### to integer overflow.

    gradCenterVec = np.zeros(centerWordVec.shape)   #(V,)
    gradOutsideVecs = np.zeros(outsideVectors.shape)   #(N,V)
    loss = 0.0

    # forward
    # forward pass: y_hat = softmax(U v_c), the predicted distribution over the vocab
    y_hat = softmax(np.dot(outsideVectors, centerWordVec))   # shape (N,)

    # loss for the true outside word o
    loss = -np.log(y_hat[outsideWordIdx])

    # backward pass: reuse y_hat as (y_hat - y), where y is one-hot at outsideWordIdx
    y_hat[outsideWordIdx] -= 1.0
    gradCenterVec = np.dot(outsideVectors.T, y_hat)      # U^T (y_hat - y)
    gradOutsideVecs = np.outer(y_hat, centerWordVec)     # (y_hat - y) v_c^T

    ### END YOUR CODE

    return loss, gradCenterVec, gradOutsideVecs
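To convince yourself the gradients match the formulas before running the assignment's full sanity checks, a finite-difference check on v_c is enough. The sketch below assumes it runs in the same file as the function above; this softmax is a stand-in for the numerically stable one the assignment imports, and all the test data is made up:

import numpy as np

def softmax(x):
    # Stand-in for the assignment's numerically stable softmax.
    e = np.exp(x - np.max(x))
    return e / e.sum()

rng = np.random.default_rng(0)
N, D = 5, 3                              # vocab size, embedding dimension
U = rng.normal(size=(N, D))              # outside vectors (one row per word)
v_c = rng.normal(size=D)                 # center word vector
o = 2                                    # index of the true outside word

loss, dv_c, dU = naiveSoftmaxLossAndGradient(v_c, o, U, dataset=None)

# Central-difference approximation of dJ/dv_c, one coordinate at a time.
eps = 1e-6
num_grad = np.zeros_like(v_c)
for i in range(D):
    step = np.zeros_like(v_c)
    step[i] = eps
    lp, _, _ = naiveSoftmaxLossAndGradient(v_c + step, o, U, None)
    lm, _, _ = naiveSoftmaxLossAndGradient(v_c - step, o, U, None)
    num_grad[i] = (lp - lm) / (2 * eps)

print(np.max(np.abs(num_grad - dv_c)))   # should be on the order of 1e-9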