1) Bag-of-words: average the word vectors
2) TF-IDF weighted average
3) SIF (smooth inverse frequency) weighted average
That is, the MLE (maximum likelihood estimate) is approximately a weighted average of the vectors of the words in the sentence. Note that for more frequent words w, the weight a/(p(w) + a) is smaller, so this naturally leads to a down-weighting of the frequent words.
To estimate c_s, we estimate the direction c_0 by computing the first principal component of the c̃_s's for a set of sentences. In other words, the final sentence embedding is obtained by subtracting the projection of the c̃_s's onto their first principal component.
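A minimal NumPy sketch of the SIF recipe above, assuming each sentence is a list of tokens, word_vectors maps each word to its vector, and word_probs holds the unigram probabilities p(w); the function name and the default a = 1e-3 are illustrative choices, not the authors' reference code:

```python
import numpy as np

def sif_embeddings(sentences, word_vectors, word_probs, a=1e-3):
    """SIF: per-sentence weighted average of word vectors, followed by
    removal of each embedding's projection onto the first principal component."""
    dim = len(next(iter(word_vectors.values())))
    emb = np.zeros((len(sentences), dim))
    for i, sent in enumerate(sentences):
        words = [w for w in sent if w in word_vectors]
        if not words:
            continue
        # Weight a/(p(w) + a): more frequent words get smaller weights.
        weights = np.array([a / (word_probs[w] + a) for w in words])
        vecs = np.array([word_vectors[w] for w in words])
        emb[i] = weights @ vecs / len(words)
    # The first principal component of the raw embeddings estimates the
    # common direction c0; subtract each embedding's projection onto it.
    c0 = np.linalg.svd(emb, full_matrices=False)[2][0]
    return emb - np.outer(emb @ c0, c0)
```

Setting every weight to 1 (and skipping the projection removal) recovers the plain bag-of-words average of method 1, while replacing the a/(p(w) + a) weights with TF-IDF scores gives method 2.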
1) Encoder: run an RNN/LSTM over the sequence and take the hidden vector at the final position; with a two-layer model, concatenate the two resulting hidden vectors (a sketch follows this list)
RNNs using long short-term memory (LSTM) capture long-distance dependencies and have also been used for modeling sentences (Tai et al., 2015).
2) BERT: the output at the [CLS] position is taken as the sentence vector
3) Skip-thought vectors: the skip-thought model of (Kiros et al., 2015) tries to reconstruct the surrounding sentences from the encoded one and treats the hidden parameters as its vector representation
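As a sketch of option 1, here is a minimal PyTorch encoder; the class name LSTMSentenceEncoder, the vocabulary size, and all dimensions are illustrative assumptions rather than a reference implementation. It returns the final hidden state, concatenating the two layers' final states when num_layers=2.

```python
import torch
import torch.nn as nn

class LSTMSentenceEncoder(nn.Module):
    """Encode a sentence as the LSTM's hidden state at the end of the
    sequence; with two layers, the two final hidden vectors are concatenated."""
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            num_layers=num_layers, batch_first=True)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) -> h_n: (num_layers, batch, hidden_dim)
        _, (h_n, _) = self.lstm(self.embed(token_ids))
        # Concatenate the per-layer final hidden states into one sentence vector.
        return torch.cat(list(h_n), dim=-1)  # (batch, num_layers * hidden_dim)

# Usage on a toy batch of 4 "sentences" of length 12:
encoder = LSTMSentenceEncoder(vocab_size=10000)
sentence_vecs = encoder(torch.randint(0, 10000, (4, 12)))  # shape (4, 512)
```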