This article uses BERT-Base, Chinese: Chinese Simplified and Traditional, 12-layer, 768-hidden, 12-heads, 110M parameters.
All pre-trained models: https://github.com/google-research/bert#pre-trained-models
1. Install the bert-serving server and client
pip install bert-serving-server #server
pip install bert-serving-client #client
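2. Start the service
#bert-serving-start -model_dir /project/bert/chinese-L-12_H768_A-12 -num_worker=4 # -model_dir is the path to the unzipped model; -num_worker sets how many workers (and hence GPUs) are used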
bert-serving-start -model_dir /project/bert/chinese-L-12_H768_A-12
from bert_serving.client import BertClient
bc = BertClient()
sen_emb = bc.encode(["今天你感觉好些了吗"])  # returns an ndarray (or List[List[float]])
print(sen_emb.shape)
print(sen_emb)
# Import the BERT client
from bert_serving.client import BertClient
import numpy as np


class SimilarModel:
    def __init__(self):
        # ip defaults to the local machine; if the BERT service is deployed
        # on another server, pass that server's IP instead
        self.bert_client = BertClient()

    def close_bert(self):
        self.bert_client.close()

    def get_sentence_vec(self, sentence):
        '''
        Get the sentence vector from BERT
        :param sentence: the sentence to encode
        :return: the vector for that sentence
        '''
        return self.bert_client.encode([sentence])[0]

    def cos_similar(self, sen_a_vec, sen_b_vec):
        '''
        Compute the cosine similarity of two sentence vectors
        :param sen_a_vec: vector of the first sentence
        :param sen_b_vec: vector of the second sentence
        :return: cosine similarity as a float
        '''
        vector_a = np.asarray(sen_a_vec, dtype=float)
        vector_b = np.asarray(sen_b_vec, dtype=float)
        num = float(np.dot(vector_a, vector_b))
        denom = np.linalg.norm(vector_a) * np.linalg.norm(vector_b)
        cos = num / denom
        return cos


if __name__ == '__main__':
    # Pick the sentence in candidates that is closest to sentence_a
    candidates = ['为什么天空是蔚蓝色的', '太空为什么是黑的?', '天空怎么是蓝色的', '明天去爬山如何']
    sentence_a = '天空为什么是蓝色的'
    bert_client = SimilarModel()
    max_cos_similar = 0
    most_similar_sentence = ''
    # The query sentence never changes, so encode it once before the loop
    sentence_a_vec = bert_client.get_sentence_vec(sentence_a)
    for sentence_b in candidates:
        sentence_b_vec = bert_client.get_sentence_vec(sentence_b)
        cos_sim = bert_client.cos_similar(sentence_a_vec, sentence_b_vec)
        # Print the candidate sentence together with its similarity score
        print(sentence_b, cos_sim)
        if cos_sim > max_cos_similar:
            max_cos_similar = cos_sim
            most_similar_sentence = sentence_b
    print('Most similar sentence:', most_similar_sentence)
    bert_client.close_bert()
Output:
为什么天空是蔚蓝色的 0.9817470751381709
太空为什么是黑的? 0.9311994519047564
天空怎么是蓝色的 0.9746721649911299
明天去爬山如何 0.8408672630644225
Most similar sentence: 为什么天空是蔚蓝色的
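The same ranking can also be done in one vectorized step rather than one sentence at a time. The following is only a minimal sketch, not part of the original code: the helper name rank_by_cosine is hypothetical, and toy 3-dimensional vectors stand in for the 768-dimensional BERT embeddings so that it runs without a server. With the service running, a single bc.encode call on a list of sentences should supply the real vectors.

```python
import numpy as np


def rank_by_cosine(query_vec, candidate_vecs):
    """Hypothetical helper: rank candidate vectors by cosine similarity to a query.

    Returns (order, scores), where order lists candidate indices from most
    to least similar and scores[i] is the cosine similarity of candidate i.
    """
    query = np.asarray(query_vec, dtype=float)
    cands = np.asarray(candidate_vecs, dtype=float)
    # Scale every vector to unit length so a dot product equals the cosine.
    query = query / np.linalg.norm(query)
    cands = cands / np.linalg.norm(cands, axis=1, keepdims=True)
    scores = cands @ query
    # argsort sorts ascending, so negate the scores to get descending order.
    order = np.argsort(-scores)
    return order, scores


# Toy vectors (an assumption for illustration, not real BERT output).
query = [1.0, 0.0, 0.0]
candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction as the query, different length
    [1.0, 1.0, 0.0],  # 45 degrees away from the query
]
order, scores = rank_by_cosine(query, candidates)
print(order, scores)
```

Because every vector is normalized first, the candidate that points in the same direction as the query ranks first regardless of its length, which is the property the cos_similar method above relies on.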