
Representing Sentences as Vectors: Unsupervised Sentence Representation Learning (Sentence Embedding)

Author: 天景科技苑 | 2024-07-14 20:05:54


This article mainly serves as my own study notes. If it infringes any rights, please contact me and it will be removed.

Original link:

[Part 1]

References

Le and Mikolov - 2014 - Distributed representations of sentences and documents
Li and Hovy - 2014 - A Model of Coherence Based on Distributed Sentence Representation
Kiros et al. - 2015 - Skip-Thought Vectors
Hill et al. - 2016 - Learning Distributed Representations of Sentences from Unlabelled Data
Arora et al. - 2016 - A simple but tough-to-beat baseline for sentence embeddings
Pagliardini et al. - 2017 - Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features
Logeswaran and Lee - 2018 - An efficient framework for learning sentence representations

[Part 2]

For a more detailed introduction, see the paper authors' post on the Google AI Blog (a Chinese version is also available).

5. Summary

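Learning sentence embeddings with supervised methods can be summarized in two steps:
- First, choose the supervised training data and design a model framework that contains a sentence encoder;
- Second, choose (or design) the specific sentence encoder, such as DAN, an LSTM-based encoder, a CNN-based encoder, or a Transformer.
The quality of a sentence embedding is usually determined jointly by the training data and the encoder. A more complex encoder is not necessarily better; the choice should weigh the downstream task, available compute, time cost, and other factors.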
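
To make the "simple encoder" end of this trade-off concrete, here is a minimal sketch of a DAN-style (deep averaging network) encoder: word vectors are averaged and passed through one feed-forward layer. The class name `ToyDAN`, the helper functions, the dimensions, and the random untrained weights are all assumptions for illustration; in a real supervised setup the embeddings and layer weights would be learned from the chosen training data.

```python
import math
import random


def average(vectors, dim):
    """Element-wise mean of word vectors; zeros if no known words."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def dense_tanh(x, weights, bias):
    """One feed-forward layer: tanh(W x + b)."""
    return [
        math.tanh(sum(w * xj for w, xj in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


class ToyDAN:
    """Hypothetical DAN-style encoder with random (untrained) parameters."""

    def __init__(self, vocab, embed_dim=8, out_dim=4, seed=0):
        emb_rng = random.Random(seed)
        layer_rng = random.Random(seed + 1)
        self.embed_dim = embed_dim
        # Word embedding table: one random vector per vocabulary word.
        self.embeddings = {
            w: [emb_rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
            for w in vocab
        }
        # Single hidden layer; a real DAN typically stacks several.
        self.weights = [
            [layer_rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
            for _ in range(out_dim)
        ]
        self.bias = [0.0] * out_dim

    def encode(self, sentence):
        """Average the known word vectors, then apply the dense layer."""
        tokens = sentence.lower().split()
        vectors = [self.embeddings[t] for t in tokens if t in self.embeddings]
        pooled = average(vectors, self.embed_dim)
        return dense_tanh(pooled, self.weights, self.bias)


if __name__ == "__main__":
    model = ToyDAN(["the", "cat", "sat", "dog"])
    a = model.encode("the cat sat")
    b = model.encode("the dog sat")
    print(len(a), cosine(a, b))
```

Because the encoder only averages word vectors, it ignores word order: "the cat sat" and "sat the cat" map to the same embedding up to floating-point rounding. That is part of the cost of choosing a cheap encoder over an LSTM- or Transformer-based one.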

References

Wieting et al. - 2015 - Towards universal paraphrastic sentence embeddings
Conneau et al. - 2017 - Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
Cer et al. - 2018 - Universal Sentence Encoder
Google AI - 2018 - Advances in Semantic Textual Similarity

