BLEU Evaluation Metric

Author: Gausst松鼠会 | 2024-04-06 02:38:22


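BLEU is an evaluation metric for machine translation. It can be split into several metrics by n-gram order; the common ones are BLEU-1, BLEU-2, BLEU-3, and BLEU-4, where the number is the count of consecutive words in each n-gram. BLEU-1 measures word-level accuracy, while higher-order BLEU scores also reflect the fluency of the sentence.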

BLEU is typically used to measure how similar a set of machine-generated translations (candidates) is to a set of human translations (references).
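For reference, the per-order quantity is usually written as a clipped ("modified") n-gram precision. The notation below is an assumption taken from the standard BLEU definition, not from this post: $c$ is a candidate, $r$ ranges over its references, $G_n(c)$ is the set of distinct n-grams in $c$, and $\mathrm{Count}_x(g)$ is how often n-gram $g$ occurs in sentence $x$.

```latex
p_n = \frac{\sum_{g \in G_n(c)} \min\bigl(\mathrm{Count}_c(g),\ \max_{r} \mathrm{Count}_r(g)\bigr)}
           {\sum_{g \in G_n(c)} \mathrm{Count}_c(g)}
```

The full BLEU score in that definition additionally combines $p_1, \dots, p_N$ with a geometric mean and a brevity penalty. This post reports only the per-order precisions.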

An example:

candidate: The cat sat on the mat.
reference: The cat is on the mat.

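Of the 6 unigrams in the candidate, {the, cat, sat, on, the, mat}, 5 appear in the reference, so BLEU-1 = 5/6 ≈ 0.83.

Of the 5 bigrams in the candidate, {the cat, cat sat, sat on, on the, the mat}, 3 appear in the reference, so BLEU-2 = 3/5 = 0.6.

Of the 4 trigrams in the candidate, {the cat sat, cat sat on, sat on the, on the mat}, 1 appears in the reference, so BLEU-3 = 1/4 = 0.25.

Of the 3 four-grams in the candidate, {the cat sat on, cat sat on the, sat on the mat}, 0 appear in the reference, so BLEU-4 = 0/3 = 0.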
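The four ratios above can be reproduced with a short Python sketch. This is a minimal illustration rather than a complete BLEU implementation: it handles a single reference and omits the brevity penalty and the geometric mean over orders. The helper names `tokenize`, `ngrams`, and `ngram_precision` are hypothetical. The tokenization (lowercasing and dropping the period) is an assumption chosen to match the example.

```python
from collections import Counter


def tokenize(sentence):
    # Assumption: lowercase and drop periods, as the worked example implies.
    return sentence.lower().replace(".", "").split()


def ngrams(tokens, n):
    # All contiguous n-grams of the token list, as tuples.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(candidate, reference, n):
    # Fraction of candidate n-grams that also occur in the reference,
    # with each n-gram's count clipped to its count in the reference.
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return matched / total


candidate = tokenize("The cat sat on the mat.")
reference = tokenize("The cat is on the mat.")

scores = [ngram_precision(candidate, reference, n) for n in range(1, 5)]
for n, score in enumerate(scores, start=1):
    print(f"{n}-gram precision: {score:.2f}")
```

Run as written, this prints 0.83, 0.60, 0.25, and 0.00 for orders 1 through 4, matching the hand calculation. Clipping makes no difference in this example, because "the" occurs twice in both sentences, but it prevents a candidate from scoring well by repeating one matching word.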
