Comparing the similarity of mixed Chinese-English text
pip install sentence-transformers
from sentence_transformers.util import cos_sim
from sentence_transformers import SentenceTransformer as SBert
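The pretrained models are hosted at: https://public.ukp.informatik.tu-darmstadt.de/reimers/sentence-transformers/v0.2/
Find the model named paraphrase-multilingual-MiniLM-L12-v2 in that listing and click it to download, then load it from the local path: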
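# Raw string: in a normal string literal, "\x" in "\xxxx" is an invalid escape and raises a SyntaxError
model = SBert(r"C:\Users\xxxx\Downloads\paraphrase-multilingual-MiniLM-L12-v2")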
sentences1 = "xxxxx1"
sentences2 = "xxxxxx2"

# Compute an embedding for each sentence
embeddings1 = model.encode(sentences1)
embeddings2 = model.encode(sentences2)

# Compute the cosine similarity between the two embeddings
cosine_scores = cos_sim(embeddings1, embeddings2)
print(cosine_scores)
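To make the score easier to interpret, here is a minimal sketch of the cosine similarity computation itself, written in plain Python. The function name `cosine_similarity` and the three toy vectors are illustrative assumptions, not part of sentence-transformers; real embeddings from the model have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for sentence embeddings (made-up values)
vec_a = [1.0, 2.0, 3.0]
vec_b = [2.0, 4.0, 6.0]   # points in the same direction as vec_a
vec_c = [-3.0, 0.0, 1.0]  # orthogonal to vec_a

print(cosine_similarity(vec_a, vec_b))  # close to 1: same direction
print(cosine_similarity(vec_a, vec_c))  # close to 0: unrelated direction
```

Scores near 1 indicate that two sentences point in nearly the same direction in embedding space, and scores near 0 indicate little relation.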
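Note that each input sentence is subject to a 512-token limit; anything beyond the limit is truncated before encoding. The limit in effect for a loaded model can be read from model.max_seq_length, which may be lower than 512.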
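One possible way to work within the token limit is to split a long token sequence into windows that each fit, then encode the windows separately. The sketch below is an assumption for illustration, not something the original post or the library prescribes: `split_into_windows`, `max_len`, and `overlap` are hypothetical names, and the integer list stands in for token ids that would really come from the model's tokenizer.

```python
def split_into_windows(tokens, max_len, overlap=0):
    """Split a token list into windows of at most max_len tokens.

    Consecutive windows share `overlap` tokens so that context at the
    boundaries is not lost entirely. (Hypothetical helper.)
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    step = max_len - overlap
    windows = []
    for start in range(0, len(tokens), step):
        windows.append(tokens[start:start + max_len])
        # Stop once the current window reaches the end of the sequence
        if start + max_len >= len(tokens):
            break
    return windows


# Placeholder token ids; a real run would use the model's tokenizer output
tokens = list(range(10))
windows = split_into_windows(tokens, max_len=4, overlap=1)
for window in windows:
    print(window)
```

How the per-window embeddings are then combined (for example by averaging them, or by taking the highest pairwise similarity) is a separate design choice that depends on the task.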
Reference: https://blog.csdn.net/yuanzhoulvpi/article/details/121755062