
Computing cosine similarity in PyTorch (torch.cosine_similarity)

In PyTorch, torch.cosine_similarity computes the cosine similarity between two vectors or tensors. Let's first look at how the PyTorch source defines this function:

```python
class CosineSimilarity(Module):
    r"""Returns cosine similarity between :math:`x_1` and :math:`x_2`, computed along dim.

    .. math ::
        \text{similarity} = \dfrac{x_1 \cdot x_2}{\max(\Vert x_1 \Vert _2 \cdot \Vert x_2 \Vert _2, \epsilon)}.

    Args:
        dim (int, optional): Dimension where cosine similarity is computed. Default: 1
        eps (float, optional): Small value to avoid division by zero.
            Default: 1e-8

    Shape:
        - Input1: :math:`(\ast_1, D, \ast_2)` where D is at position `dim`
        - Input2: :math:`(\ast_1, D, \ast_2)`, same shape as the Input1
        - Output: :math:`(\ast_1, \ast_2)`

    Examples::

        >>> input1 = torch.randn(100, 128)
        >>> input2 = torch.randn(100, 128)
        >>> cos = nn.CosineSimilarity(dim=1, eps=1e-6)
        >>> output = cos(input1, input2)
    """
    __constants__ = ['dim', 'eps']

    def __init__(self, dim=1, eps=1e-8):
        super(CosineSimilarity, self).__init__()
        self.dim = dim
        self.eps = eps

    def forward(self, x1, x2):
        return F.cosine_similarity(x1, x2, self.dim, self.eps)
```

As shown above, the function takes four parameters:

  • x1 and x2: the tensors whose cosine similarity is to be computed;
  • dim: the dimension along which the cosine similarity is computed;
  • eps: a small value used to avoid division by zero.
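The effect of dim and eps can be seen in a short sketch (the tensor shapes below are chosen for illustration, not taken from the article):

```python
import torch

# two batches of vectors; dim=1 compares them row by row
a = torch.randn(4, 8)
b = torch.randn(4, 8)
row_sim = torch.cosine_similarity(a, b, dim=1)  # shape: (4,)

# the same call with dim=0 compares the tensors column by column
col_sim = torch.cosine_similarity(a, b, dim=0)  # shape: (8,)

# eps keeps the denominator away from zero: a zero-norm vector
# yields similarity 0.0 instead of NaN
zero = torch.zeros(1, 8)
safe = torch.cosine_similarity(zero, b[:1], dim=1)
print(row_sim.shape, col_sim.shape, safe)
```

Changing dim therefore changes both which axis is reduced and the shape of the result.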

Let's look at an example:

```python
import torch

# torch.rand already returns a float tensor; wrapping it in
# torch.FloatTensor as in the original snippet is redundant
x = torch.rand(10)
print('x', x)
y = torch.rand(10)
print('y', y)
# both inputs are 1-D, so the similarity is computed along dim=0
similarity = torch.cosine_similarity(x, y, dim=0)
print('similarity', similarity)
```
```
x tensor([0.2817, 0.6858, 0.1820, 0.7357, 0.7625, 0.3569, 0.4781, 0.8485, 0.1385,
        0.5654])
y tensor([0.3366, 0.8959, 0.7776, 0.2475, 0.9202, 0.2845, 0.7284, 0.8150, 0.2577,
        0.0085])
similarity tensor(0.8502)
```

Let's look at another example: given one tensor, compute the cosine similarity between it and each of several other tensors, then normalize the resulting similarities:

```python
import torch

def get_att_dis(target, behaviored):
    attention_distribution = []
    for i in range(behaviored.size(0)):
        # cosine similarity between each row and the given target
        attention_score = torch.cosine_similarity(target, behaviored[i].view(1, -1))
        attention_distribution.append(attention_score)
    # concatenate the per-row scores into one tensor
    # (torch.Tensor(list_of_tensors), as in the original snippet, is deprecated)
    attention_distribution = torch.cat(attention_distribution)
    return attention_distribution / torch.sum(attention_distribution, 0)  # normalize

a = torch.rand(1, 10)
print('a', a)
b = torch.rand(3, 10)
print('b', b)
similarity = get_att_dis(target=a, behaviored=b)
print('similarity', similarity)
```
```
a tensor([[0.9255, 0.2194, 0.8370, 0.5346, 0.5152, 0.4645, 0.4926, 0.9882, 0.2783,
         0.9258]])
b tensor([[0.6874, 0.4054, 0.5739, 0.8017, 0.9861, 0.0154, 0.8513, 0.8427, 0.6669,
         0.0694],
        [0.1720, 0.6793, 0.7764, 0.4583, 0.8167, 0.2718, 0.9686, 0.9301, 0.2421,
         0.0811],
        [0.2336, 0.4783, 0.5576, 0.6518, 0.9943, 0.6766, 0.0044, 0.7935, 0.2098,
         0.0719]])
similarity tensor([0.3448, 0.3318, 0.3234])
```
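The explicit Python loop above is not actually necessary: torch.cosine_similarity broadcasts its inputs, so a (1, 10) tensor can be compared against every row of a (3, 10) tensor in a single call. A minimal sketch of this vectorized alternative:

```python
import torch

a = torch.rand(1, 10)
b = torch.rand(3, 10)

# broadcasting: (1, 10) against (3, 10) compares a with every row of b
sim = torch.cosine_similarity(a, b, dim=1)  # shape: (3,)
normalized = sim / sim.sum(dim=0)           # same normalization as get_att_dis
print('similarity', normalized)
```

On large batches this avoids Python-level iteration entirely and lets PyTorch compute all rows at once.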

To be continued...
