Can be implemented using counts with smoothing
Can be implemented using feed-forward neural networks
Problem: limited context
E.g. Generate sentences using a trigram model:
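To make the count-based approach concrete, here is a minimal sketch of a trigram model with add-k smoothing and sampling-based generation; the toy corpus, the smoothing constant k, and the helper names are illustrative assumptions rather than anything from the original notes.

```python
from collections import defaultdict
import random

# Toy corpus and add-k constant are illustrative assumptions.
corpus = [["<s>", "<s>", "a", "cow", "eats", "grass", "</s>"]]
k = 0.1
vocab = {w for sent in corpus for w in sent}

trigram_counts = defaultdict(lambda: defaultdict(float))
for sent in corpus:
    for w1, w2, w3 in zip(sent, sent[1:], sent[2:]):
        trigram_counts[(w1, w2)][w3] += 1

def prob(w1, w2, w3):
    # Add-k smoothed trigram probability P(w3 | w1, w2).
    counts = trigram_counts[(w1, w2)]
    return (counts[w3] + k) / (sum(counts.values()) + k * len(vocab))

def generate(max_len=10):
    # Sample one word at a time, conditioning on the previous two words only;
    # this fixed window is exactly the "limited context" problem.
    context, out = ("<s>", "<s>"), []
    for _ in range(max_len):
        words = list(vocab)
        weights = [prob(context[0], context[1], w) for w in words]
        w = random.choices(words, weights=weights)[0]
        if w == "</s>":
            break
        out.append(w)
        context = (context[1], w)
    return " ".join(out)

print(generate())
```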
Allow representation of arbitrarily sized inputs
Core idea: process the input sequence one element at a time by applying a recurrence formula
Uses a state vector to represent the context that has been processed so far
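A minimal sketch of the recurrence, assuming the common tanh formulation s_t = tanh(W_s s_{t-1} + W_x x_t + b); the dimensions and weight names are illustrative.

```python
import numpy as np

# Illustrative dimensions; the recurrence below is one common RNN formulation.
d_in, d_state = 4, 8
rng = np.random.default_rng(0)
W_x = rng.normal(size=(d_state, d_in))     # input-to-state weights
W_s = rng.normal(size=(d_state, d_state))  # state-to-state weights
b = np.zeros(d_state)

def rnn_step(s_prev, x_t):
    # One application of the recurrence formula: the new state mixes the
    # previous state (the processed context) with the current input, so the
    # state vector can summarise an arbitrarily long prefix of the sequence.
    return np.tanh(W_s @ s_prev + W_x @ x_t + b)

s = np.zeros(d_state)                      # initial state
for x_t in rng.normal(size=(5, d_in)):     # process the inputs one at a time
    s = rnn_step(s, x_t)
```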
RNN Neuron:
RNN States:
Activation:
RNN Unrolled:
Training RNN:
E.g. of the unrolled equation:
The current input is the current word (e.g. eats) mapped to an embedding
The previous state contains information about the previous words (e.g. a and cow)
The output is the next word (e.g. grass)
Training:
Vocabulary: [a, cow, eats, grass]
Training example: a cow eats grass
Training process:
Losses:
Total loss:
Generation:
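To make the training and generation steps concrete, here is a small sketch of an RNN language model on the toy vocabulary [a, cow, eats, grass]; the PyTorch model, hyperparameters, and greedy decoding loop are illustrative assumptions. It computes one loss per time step, sums them into the total loss, and then generates by feeding each predicted word back in as the next input.

```python
import torch
import torch.nn as nn

# Toy setup; sizes and training details are illustrative assumptions.
vocab = ["<s>", "a", "cow", "eats", "grass", "</s>"]
stoi = {w: i for i, w in enumerate(vocab)}
ids = torch.tensor([stoi[w] for w in ["<s>", "a", "cow", "eats", "grass", "</s>"]])

emb = nn.Embedding(len(vocab), 16)
rnn = nn.RNN(16, 32, batch_first=True)
out = nn.Linear(32, len(vocab))
params = list(emb.parameters()) + list(rnn.parameters()) + list(out.parameters())
opt = torch.optim.SGD(params, lr=0.1)
loss_fn = nn.CrossEntropyLoss(reduction="none")

for _ in range(100):
    x, y = ids[:-1], ids[1:]                 # predict each next word
    h, _ = rnn(emb(x).unsqueeze(0))          # states for every position
    logits = out(h.squeeze(0))
    losses = loss_fn(logits, y)              # one loss per time step
    total_loss = losses.sum()                # total loss = sum of step losses
    opt.zero_grad()
    total_loss.backward()
    opt.step()

# Generation: feed the model's own prediction back in as the next input.
ids_gen = [stoi["<s>"]]
h = None
for _ in range(6):
    x = emb(torch.tensor([[ids_gen[-1]]]))
    o, h = rnn(x, h)
    next_id = out(o[0, -1]).argmax().item()  # greedy choice of the next word
    if vocab[next_id] == "</s>":
        break
    ids_gen.append(next_id)
print(" ".join(vocab[i] for i in ids_gen[1:]))
```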
Problems of RNN:
An RNN has the capability to model infinite context, but in practice it cannot capture long-range dependencies because of vanishing gradients
Vanishing gradient: gradients from later steps diminish quickly during backpropagation, so earlier inputs do not receive much of an update
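A small sketch of the effect, assuming a plain tanh RNN in PyTorch: the gradient of the final state with respect to the first input shrinks rapidly as the sequence grows, so early inputs barely influence the update (the exact numbers depend on the random weights).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=8, hidden_size=8)  # simple tanh RNN

for T in [5, 20, 50]:
    x = torch.randn(T, 1, 8, requires_grad=True)   # sequence of length T
    out, h = rnn(x)
    # Gradient of the final state w.r.t. the first input: repeated
    # multiplication by tanh' and the recurrent weights makes it shrink.
    grad = torch.autograd.grad(out[-1].sum(), x)[0][0]
    print(T, grad.norm().item())
```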
LSTM is introduced to solve the vanishing gradient problem
Core idea: have memory cells that preserve gradients across time; access to the memory cells is controlled by gates
Gates: for each input, a gate decides how much of the new input to write to the memory cell and how much of the current cell content to forget
Comparison between a simple RNN and an LSTM:
A gate g is a vector; each element of g has a value between 0 and 1, and a sigmoid function is used to produce g
g is multiplied component-wise with a vector v to determine how much of the information in v to keep
Forget gate: controls how much information to forget in the memory cell
E.g. Given "The cats that the boy", predict the next word "likes"
The memory cell is storing noun information (cats)
The cell should now forget cats and store boy to correctly predict the singular verb likes
Input gate: controls how much new information to put into the memory cell
The candidate state is the new distilled information to be added to the memory cell
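Putting the pieces together, here is a minimal sketch of one LSTM cell update with the forget gate, input gate, and candidate ("new distilled information"); the output gate is part of the standard formulation and is included for completeness, and all dimensions and weight names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

d = 8  # illustrative state size; the weights below are random stand-ins
rng = np.random.default_rng(0)
W_f, W_i, W_o, W_c = (rng.normal(size=(d, 2 * d)) for _ in range(4))

def lstm_step(c_prev, h_prev, x_t):
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W_f @ z)         # forget gate: how much of c_prev to keep
    i = sigmoid(W_i @ z)         # input gate: how much new information to add
    c_tilde = np.tanh(W_c @ z)   # candidate: the new distilled information
    c = f * c_prev + i * c_tilde # memory cell update
    o = sigmoid(W_o @ z)         # output gate (standard formulation)
    h = o * np.tanh(c)
    return c, h

c, h = np.zeros(d), np.zeros(d)
for x_t in rng.normal(size=(5, d)):
    c, h = lstm_step(c, h, x_t)
```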
Shakespeare Generator:
Wikipedia Generator:
Code Generator
Text Classification
Sequence Labeling: e.g. POS tagging
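A sketch of how the same recurrent backbone serves both uses, assuming the common PyTorch pattern: classify from the final state for text classification, or from every step's output for sequence labeling; the sizes and layers are illustrative.

```python
import torch
import torch.nn as nn

# Illustrative sizes: vocabulary of 1000 words, 2 classes, 10 POS tags.
emb = nn.Embedding(1000, 32)
lstm = nn.LSTM(32, 64, batch_first=True)
clf = nn.Linear(64, 2)

tokens = torch.randint(0, 1000, (1, 12))   # one sentence of 12 word ids
_, (h_n, _) = lstm(emb(tokens))            # h_n: final hidden state
logits = clf(h_n[-1])                      # text classification from the final state

# Sequence labeling (e.g. POS tagging): classify every time step instead.
outputs, _ = lstm(emb(tokens))
tag_logits = nn.Linear(64, 10)(outputs)    # one tag distribution per token
```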
Peephole connections: allow the gates to look at the cell state
Gated recurrent unit (GRU): a simplified variant with only 2 gates and no memory cell
Multi-layer LSTM
Bidirectional LSTM
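Both of these variants map directly onto constructor arguments in common libraries; a sketch assuming PyTorch's nn.LSTM, with illustrative sizes:

```python
import torch
import torch.nn as nn

# Stacked (multi-layer) and bidirectional LSTM in one module.
lstm = nn.LSTM(input_size=32, hidden_size=64,
               num_layers=2,        # multi-layer LSTM
               bidirectional=True,  # bidirectional LSTM
               batch_first=True)

x = torch.randn(1, 12, 32)          # batch of 1, sequence length 12
output, (h_n, c_n) = lstm(x)
# output: (1, 12, 128), forward and backward states concatenated
# h_n:    (4, 1, 64), i.e. 2 layers x 2 directions
```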