The line `gcn_out = self.gcn(A_hat, D_hat, X)` is in fact the core operation of a graph convolutional network (GCN) layer. Concretely, this step is computed from the basic graph convolution formula:
$$H^{(l+1)} = \sigma\left( \hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2} H^{(l)} W^{(l)} \right)$$
In this formula, $\hat{A}$ is the adjacency matrix with self-loops added, $\hat{D}$ is its degree matrix, $H^{(l)}$ is the node feature matrix at layer $l$ (with $H^{(0)} = X$), $W^{(l)}$ is the layer's trainable weight matrix, and $\sigma$ is the activation function. The computation can be broken down into the following steps (a code sketch follows the list):
Adjacency matrix and degree matrix: given the graph's adjacency matrix $A$, first add self-loops to obtain $\hat{A} = A + I$, where $I$ is the identity matrix. Then compute the degree matrix $\hat{D}$ of $\hat{A}$, whose diagonal entries are $\hat{D}_{ii} = \sum_j \hat{A}_{ij}$.
Normalized adjacency matrix: next, compute $\hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2}$, which symmetrically normalizes the adjacency matrix so that the convolution does not change the scale of the features.
Graph convolution: finally, multiply the normalized adjacency matrix by the input feature matrix, multiply the result by the weight matrix $W^{(l)}$, and apply the activation function $\sigma$ to obtain the output feature matrix $H^{(l+1)}$.
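As a concrete illustration, here is a minimal sketch of what such a `gcn` module could look like, assuming PyTorch, assuming `A_hat` and `D_hat` are dense tensors, and using ReLU as the activation $\sigma$ for concreteness; the class name `GCNLayer` and its constructor arguments are illustrative, not taken from the original code:

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph convolution: H' = sigma(D_hat^-1/2 A_hat D_hat^-1/2 H W)."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Linear(in_features, out_features, bias=False)

    def forward(self, A_hat, D_hat, X):
        # D_hat^{-1/2}: inverse square root of the diagonal degree matrix
        d_inv_sqrt = torch.diag(torch.diagonal(D_hat).pow(-0.5))
        # Symmetrically normalized adjacency: D_hat^-1/2 A_hat D_hat^-1/2
        a_norm = d_inv_sqrt @ A_hat @ d_inv_sqrt
        # Propagate features, apply the learnable weights, then the activation
        return torch.relu(a_norm @ self.weight(X))

# Usage sketch: A is the raw adjacency matrix, X the node feature matrix.
# A_hat = A + torch.eye(A.size(0))
# D_hat = torch.diag(A_hat.sum(dim=1))
# gcn_out = GCNLayer(X.size(1), 64)(A_hat, D_hat, X)
```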
In a Temporal Convolutional Network (TCN), the key operations are the convolution, the activation function, dropout, and the skip connection. The inference formulas for a TemporalBlock in a TCN are as follows:
$$y^{(1)} = \text{ReLU}(\text{Dropout}(\text{Chomp}(\text{Conv1d}(x, W_1))))$$
$$y^{(2)} = \text{ReLU}(\text{Dropout}(\text{Chomp}(\text{Conv1d}(y^{(1)}, W_2))))$$
$$\text{res} = \begin{cases} x, & \text{if } n_{\text{inputs}} = n_{\text{outputs}} \\ \text{Conv1d}(x, W_{\text{downsample}}), & \text{otherwise} \end{cases}$$
$$\text{output} = \text{ReLU}(y^{(2)} + \text{res})$$
Putting these together, the inference formula for the TemporalBlock is:
$$\text{output} = \text{ReLU}\Big(\text{ReLU}\big(\text{Dropout}(\text{Chomp}(\text{Conv1d}(\text{ReLU}(\text{Dropout}(\text{Chomp}(\text{Conv1d}(x, W_1)))), W_2)))\big) + \text{res}\Big)$$
where: $x$ is the input sequence; $W_1$ and $W_2$ are the weights of the two 1D convolutions; $\text{Chomp}$ removes the extra padded time steps at the end of the sequence so that the convolution remains causal; $\text{Dropout}$ randomly zeroes activations during training; $W_{\text{downsample}}$ is a $1 \times 1$ convolution applied on the residual branch when the number of input channels $n_{\text{inputs}}$ differs from the number of output channels $n_{\text{outputs}}$; and $\text{res}$ is the residual (skip) connection.
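A minimal PyTorch-style sketch of a TemporalBlock that follows these formulas is given below; the class names, the causal-padding choice `(kernel_size - 1) * dilation`, and the default dropout rate are assumptions, since the original text does not specify them:

```python
import torch.nn as nn

class Chomp1d(nn.Module):
    """Trim the trailing padding so the convolution remains causal."""
    def __init__(self, chomp_size):
        super().__init__()
        self.chomp_size = chomp_size

    def forward(self, x):
        return x[:, :, :-self.chomp_size].contiguous()

class TemporalBlock(nn.Module):
    def __init__(self, n_inputs, n_outputs, kernel_size, dilation, dropout=0.2):
        super().__init__()
        padding = (kernel_size - 1) * dilation  # assumed causal padding
        self.branch = nn.Sequential(
            # y1 = ReLU(Dropout(Chomp(Conv1d(x, W1))))
            nn.Conv1d(n_inputs, n_outputs, kernel_size,
                      padding=padding, dilation=dilation),
            Chomp1d(padding), nn.Dropout(dropout), nn.ReLU(),
            # y2 = ReLU(Dropout(Chomp(Conv1d(y1, W2))))
            nn.Conv1d(n_outputs, n_outputs, kernel_size,
                      padding=padding, dilation=dilation),
            Chomp1d(padding), nn.Dropout(dropout), nn.ReLU(),
        )
        # res = x if n_inputs == n_outputs else Conv1d(x, W_downsample)
        self.downsample = (nn.Conv1d(n_inputs, n_outputs, 1)
                           if n_inputs != n_outputs else nn.Identity())
        self.relu = nn.ReLU()

    def forward(self, x):
        res = self.downsample(x)
        return self.relu(self.branch(x) + res)  # output = ReLU(y2 + res)
```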
In a TransformerBlock, the key operations are multi-head self-attention, the feed-forward network, layer normalization, and the skip connection. The inference formulas for a TransformerBlock are as follows:
$$\text{Attention}(Q, K, V) = \text{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right) V$$
where $Q = K = V = x$ and $d_k$ is the dimension of the keys. The multi-head self-attention output is:
$$\text{attn\_output} = \text{MultiHeadAttention}(x, x, x)$$
$$x_1 = \text{LayerNorm}(x + \text{Dropout}(\text{attn\_output}))$$
$$\text{ff\_output} = \text{Linear}_2(\text{Dropout}(\text{ReLU}(\text{Linear}_1(x_1))))$$
$$\text{output} = \text{LayerNorm}(x_1 + \text{Dropout}(\text{ff\_output}))$$
To summarize, the inference formulas for the TransformerBlock are:
$$\text{attn\_output} = \text{MultiHeadAttention}(x, x, x)$$
$$x_1 = \text{LayerNorm}(x + \text{Dropout}(\text{attn\_output}))$$
$$\text{ff\_output} = \text{Linear}_2(\text{Dropout}(\text{ReLU}(\text{Linear}_1(x_1))))$$
$$\text{output} = \text{LayerNorm}(x_1 + \text{Dropout}(\text{ff\_output}))$$
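A minimal PyTorch-style sketch of a TransformerBlock matching these formulas is shown below; the constructor arguments (`d_model`, `n_heads`, `d_ff`, `dropout`) and the use of `nn.MultiheadAttention` with `batch_first=True` are assumptions for illustration, not details from the original text:

```python
import torch.nn as nn

class TransformerBlock(nn.Module):
    def __init__(self, d_model, n_heads, d_ff, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads,
                                          dropout=dropout, batch_first=True)
        # Feed-forward network: Linear_2(Dropout(ReLU(Linear_1(.))))
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff),   # Linear_1
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(d_ff, d_model),   # Linear_2
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.drop1 = nn.Dropout(dropout)
        self.drop2 = nn.Dropout(dropout)

    def forward(self, x):
        # attn_output = MultiHeadAttention(x, x, x)
        attn_output, _ = self.attn(x, x, x)
        # x1 = LayerNorm(x + Dropout(attn_output))
        x1 = self.norm1(x + self.drop1(attn_output))
        # ff_output = Linear_2(Dropout(ReLU(Linear_1(x1))))
        ff_output = self.ff(x1)
        # output = LayerNorm(x1 + Dropout(ff_output))
        return self.norm2(x1 + self.drop2(ff_output))
```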