The Transformer's Multi-Head Attention has no way of telling where each token embedding sits in the sequence. Attention Is All You Need therefore adds a sinusoidal position embedding to the input, defined as:
$$PE_{(pos,\,2i)} = \sin\!\left(pos / 10000^{2i/d_{\text{model}}}\right)$$

$$PE_{(pos,\,2i+1)} = \cos\!\left(pos / 10000^{2i/d_{\text{model}}}\right)$$
where pos is the position of the word in the sequence and i indexes the sine/cosine pairs, i = 0, 1, ..., d_model/2 − 1. With d_model = 512, the positional encoding of the first word (pos = 1) can therefore be written as:
$$PE(1) = \left[\sin\!\left(1/10000^{0/512}\right),\ \cos\!\left(1/10000^{0/512}\right),\ \sin\!\left(1/10000^{2/512}\right),\ \cos\!\left(1/10000^{2/512}\right),\ \dots\right]$$
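These entries can be checked numerically straight from the formula. Below is a minimal sketch (the helper name pe_entry and the printed indices are ours, not from the original post) that evaluates the first few dimensions of PE(1) for d_model = 512:

```python
import numpy as np

def pe_entry(pos, dim, d_model):
    """One positional-encoding value: sin for even dims (2i), cos for odd dims (2i+1)."""
    i = dim // 2
    angle = pos / np.power(10000.0, (2 * i) / np.float32(d_model))
    return np.sin(angle) if dim % 2 == 0 else np.cos(angle)

# First four dimensions of PE(1) with d_model = 512:
# sin(1/10000^(0/512)), cos(1/10000^(0/512)), sin(1/10000^(2/512)), cos(1/10000^(2/512))
print([float(pe_entry(1, d, 512)) for d in range(4)])
```

The complete NumPy implementation below builds the encoding for every position at once and visualizes the resulting matrix: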
```python
import numpy as np
import matplotlib.pyplot as plt

def get_angles(pos, i, d_model):
    # Rate 1 / 10000^(2i / d_model); i//2 pairs each sin dimension with its cos dimension.
    angle_rates = 1 / np.power(10000, (2 * (i // 2)) / np.float32(d_model))
    return pos * angle_rates

def positional_encoding(position, d_model):
    angle_rads = get_angles(np.arange(position)[:, np.newaxis],
                            np.arange(d_model)[np.newaxis, :],
                            d_model)
    # apply sin to even indices in the array; 2i
    angle_rads[:, 0::2] = np.sin(angle_rads[:, 0::2])
    # apply cos to odd indices in the array; 2i+1
    angle_rads[:, 1::2] = np.cos(angle_rads[:, 1::2])
    pos_encoding = angle_rads[np.newaxis, ...]
    return pos_encoding

tokens = 10
dimensions = 64

pos_encoding = positional_encoding(tokens, dimensions)
print(pos_encoding.shape)  # (1, 10, 64)

plt.figure(figsize=(12, 8))
plt.pcolormesh(pos_encoding[0], cmap='viridis')
plt.xlabel('Embedding Dimensions')
plt.xlim((0, dimensions))
plt.ylim((tokens, 0))
plt.ylabel('Token Position')
plt.colorbar()
plt.show()
```
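In the Transformer these encodings are added element-wise to the token embeddings before the first attention layer. A minimal usage sketch, reusing pos_encoding, tokens, and dimensions from the block above and using random dummy embeddings (not part of the original post):

```python
# Dummy token embeddings of shape (batch, seq_len, d_model); in a real model
# these come from an embedding lookup, here they are random placeholders.
batch, seq_len, d_model = 2, tokens, dimensions
embeddings = np.random.randn(batch, seq_len, d_model).astype(np.float32)

# pos_encoding has shape (1, tokens, dimensions), so it broadcasts over the batch.
embedded_with_position = embeddings + pos_encoding[:, :seq_len, :]
print(embedded_with_position.shape)  # (2, 10, 64)
```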