Process Sequences!
Sequential processing of non-sequential data
$h_t = f_W(h_{t-1}, x_t)$
The new state $h_t$ is computed by applying $f_W$ to the old state $h_{t-1}$ and the input $x_t$.
$y_t = f_{W_y}(h_t)$
The output $y_t$ is obtained by applying another function $f_{W_y}$ to $h_t$.
The same function and the same set of parameters are used at every time step.
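As a rough illustration of the recurrence, here is a minimal NumPy sketch. The notes do not say what $f_W$ and $f_{W_y}$ are, so the tanh layer, the linear readout, and all names (`rnn_step`, `rnn_output`, `rnn_forward`, `W_hh`, `W_xh`, `W_hy`) are assumptions made for the example.

```python
import numpy as np

def rnn_step(h_prev, x, W_hh, W_xh, b_h):
    # h_t = f_W(h_{t-1}, x_t); f_W is assumed to be a tanh layer here.
    return np.tanh(W_hh @ h_prev + W_xh @ x + b_h)

def rnn_output(h, W_hy, b_y):
    # y_t = f_{W_y}(h_t); assumed to be a plain linear readout here.
    return W_hy @ h + b_y

def rnn_forward(xs, h0, W_hh, W_xh, b_h, W_hy, b_y):
    # Apply the same function with the same parameters at every time step.
    h = h0
    hs, ys = [], []
    for x in xs:
        h = rnn_step(h, x, W_hh, W_xh, b_h)
        hs.append(h)
        ys.append(rnn_output(h, W_hy, b_y))
    return np.stack(hs), np.stack(ys)

# Toy sizes, chosen only for illustration.
rng = np.random.default_rng(0)
T, D, H, O = 6, 3, 4, 2  # time steps, input, hidden, output dimensions
W_hh = rng.normal(scale=0.5, size=(H, H))
W_xh = rng.normal(scale=0.5, size=(H, D))
W_hy = rng.normal(scale=0.5, size=(O, H))
b_h, b_y = np.zeros(H), np.zeros(O)
xs = rng.normal(size=(T, D))

hs, ys = rnn_forward(xs, np.zeros(H), W_hh, W_xh, b_h, W_hy, b_y)
```

Note that the loop body never changes: only `h` and `x` differ from step to step, which is what lets one set of weights handle a sequence of any length.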
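Backpropagation through time takes too much memory for long sequences, because the hidden states of every time step must be kept for the backward pass.
Instead, run backpropagation over truncated chunks of the sequence: carry the hidden state forward across chunks, but backpropagate only within the current chunk.
This makes training feasible.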
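The chunked scheme might be sketched as follows in NumPy, with the gradients written out by hand. This is an illustration under assumptions, not a reference implementation: the tanh cell, the squared-error loss, the plain SGD update, and the names `chunk_forward_backward` and `train_truncated_bptt` are all invented for the example.

```python
import numpy as np

def chunk_forward_backward(xs, ts, h0, W_hh, W_xh, W_hy):
    # Forward pass over ONE chunk, caching only this chunk's hidden states.
    hs, dys, loss = [h0], [], 0.0
    for x, t in zip(xs, ts):
        h = np.tanh(W_hh @ hs[-1] + W_xh @ x)
        dy = W_hy @ h - t              # gradient of 0.5 * ||y - t||^2 w.r.t. y
        loss += 0.5 * float(dy @ dy)
        hs.append(h)
        dys.append(dy)

    # Backward pass, from the last step of the chunk back to its first step.
    dW_hh = np.zeros_like(W_hh)
    dW_xh = np.zeros_like(W_xh)
    dW_hy = np.zeros_like(W_hy)
    dh_next = np.zeros_like(h0)
    for k in reversed(range(len(xs))):
        h, h_prev = hs[k + 1], hs[k]
        dW_hy += np.outer(dys[k], h)
        dh = W_hy.T @ dys[k] + dh_next
        da = dh * (1.0 - h ** 2)       # backprop through tanh
        dW_hh += np.outer(da, h_prev)
        dW_xh += np.outer(da, xs[k])
        dh_next = W_hh.T @ da
    # The gradient w.r.t. h0 (dh_next) is dropped here: that is the truncation.
    return loss, (dW_hh, dW_xh, dW_hy), hs[-1]

def train_truncated_bptt(xs, ts, W_hh, W_xh, W_hy, chunk_len=4, lr=0.01):
    h = np.zeros(W_hh.shape[0])
    losses = []
    for start in range(0, len(xs), chunk_len):
        stop = start + chunk_len
        # The hidden state value is carried forward; gradients are not.
        loss, grads, h = chunk_forward_backward(
            xs[start:stop], ts[start:stop], h, W_hh, W_xh, W_hy)
        for W, dW in zip((W_hh, W_xh, W_hy), grads):
            W -= lr * dW               # in-place SGD step after every chunk
        losses.append(loss)
    return losses
```

With this layout the number of cached hidden states is bounded by `chunk_len` rather than by the full sequence length, at the cost of ignoring dependencies that reach back further than one chunk.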
A very detailed explanation of LSTM:
https://blog.csdn.net/qian99/article/details/88628383