
HMM (2): Forward and Backward Probabilities

Forward and Backward Probabilities

1. Relationship between the forward and backward probabilities
   (1) Forward probability: $\alpha_t(i) = P(o_1, o_2, \cdots, o_t, i_t = q_i \mid \lambda)$
   (2) Backward probability: $\beta_t(i) = P(o_{t+1}, o_{t+2}, \cdots, o_T \mid i_t = q_i, \lambda)$
   (3) Relationship (the third step uses the fact that, given the state at time $t$, the observations before and after $t$ are conditionally independent):
$$
\begin{aligned}
P(i_t = q_i, O \mid \lambda) &= P(O \mid i_t = q_i, \lambda)\, P(i_t = q_i \mid \lambda) \\
&= P(o_1, \cdots, o_t, o_{t+1}, \cdots, o_T \mid i_t = q_i, \lambda)\, P(i_t = q_i \mid \lambda) \\
&= P(o_1, \cdots, o_t \mid i_t = q_i, \lambda)\, P(o_{t+1}, \cdots, o_T \mid i_t = q_i, \lambda)\, P(i_t = q_i \mid \lambda) \\
&= P(o_1, \cdots, o_t, i_t = q_i \mid \lambda)\, P(o_{t+1}, \cdots, o_T \mid i_t = q_i, \lambda) \\
&= \alpha_t(i)\, \beta_t(i)
\end{aligned}
$$
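   To make the relation concrete, here is a minimal NumPy sketch of the forward and backward recursions, using a hypothetical 2-state, 2-symbol model (not from the original post); for every $t$, $\sum_i \alpha_t(i)\beta_t(i)$ evaluates to the same value $P(O \mid \lambda)$:

```python
import numpy as np

def forward(pi, A, B, obs):
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                        # alpha_1(i) = pi_i * b_i(o_1)
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]    # alpha_t(j) = [sum_i alpha_{t-1}(i) a_ij] * b_j(o_t)
    return alpha

def backward(A, B, obs):
    T, N = len(obs), A.shape[0]
    beta = np.zeros((T, N))
    beta[T - 1] = 1.0                                   # beta_T(i) = 1
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])  # beta_t(i) = sum_j a_ij b_j(o_{t+1}) beta_{t+1}(j)
    return beta

# Hypothetical toy model and observation sequence (symbols coded as 0/1).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.5], [0.1, 0.9]])
obs = [0, 1, 1]

alpha, beta = forward(pi, A, B, obs), backward(A, B, obs)
# Each entry equals P(O | lambda); the value is identical for every t.
print((alpha * beta).sum(axis=1))
```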
2. Probability of a single state
   Given the model $\lambda$ and the observation sequence $O$, the probability of being in state $q_i$ at time $t$ is denoted $\gamma_t(i) = P(i_t = q_i \mid O, \lambda)$.
   From the definitions of the forward and backward probabilities:
$$
P(i_t = q_i, O \mid \lambda) = \alpha_t(i)\, \beta_t(i)
$$
$$
\gamma_t(i) = P(i_t = q_i \mid O, \lambda) = \frac{P(i_t = q_i, O \mid \lambda)}{P(O \mid \lambda)} = \frac{\alpha_t(i)\, \beta_t(i)}{P(O \mid \lambda)} = \frac{\alpha_t(i)\, \beta_t(i)}{\sum_{j=1}^{N} \alpha_t(j)\, \beta_t(j)}
$$

   Meaning of $\gamma$:
   At each time $t$, choose the state $i_t^*$ that is most likely at that time, giving a state sequence $I^* = \{i_1^*, i_2^*, \cdots, i_T^*\}$, which is taken as the prediction.
   Given the model and the observation sequence, the probability of being in state $q_i$ at time $t$ is:
$$
\gamma_t(i) = \frac{\alpha_t(i)\, \beta_t(i)}{P(O \mid \lambda)} = \frac{\alpha_t(i)\, \beta_t(i)}{\sum_{j=1}^{N} \alpha_t(j)\, \beta_t(j)}
$$
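   Continuing the sketch above (reusing `alpha` and `beta`), $\gamma_t(i)$ is obtained by normalising each row of $\alpha_t(i)\beta_t(i)$; the per-time-step argmax gives the sequence $I^*$ described above (note this is not the Viterbi path, which maximises the joint probability of the whole state sequence):

```python
# gamma_t(i) = alpha_t(i) * beta_t(i) / P(O | lambda), row-wise normalisation.
gamma = alpha * beta
gamma /= gamma.sum(axis=1, keepdims=True)   # each row of gamma now sums to 1
best_states = gamma.argmax(axis=1)          # i_t^* = argmax_i gamma_t(i), per-time-step prediction
print(gamma, best_states)
```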
3. Probability of a pair of states
   Given the model $\lambda$ and the observation sequence $O$, the probability of being in state $q_i$ at time $t$ and in state $q_j$ at time $t+1$ is:
$$
\xi_t(i, j) = P(i_t = q_i, i_{t+1} = q_j \mid O, \lambda) = \frac{P(i_t = q_i, i_{t+1} = q_j, O \mid \lambda)}{P(O \mid \lambda)} = \frac{P(i_t = q_i, i_{t+1} = q_j, O \mid \lambda)}{\sum_{i=1}^{N} \sum_{j=1}^{N} P(i_t = q_i, i_{t+1} = q_j, O \mid \lambda)}
$$
where
$$
P(i_t = q_i, i_{t+1} = q_j, O \mid \lambda) = \alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)
$$
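   A short continuation of the same sketch, computing $\xi_t(i, j)$ for all $t$ from the expression above:

```python
# xi_t(i, j) for t = 1 .. T-1, built from
# alpha_t(i) * a_ij * b_j(o_{t+1}) * beta_{t+1}(j), then normalised by P(O | lambda).
T, N = len(obs), len(pi)
xi = np.zeros((T - 1, N, N))
for t in range(T - 1):
    xi[t] = alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1]
    xi[t] /= xi[t].sum()                    # sum over all i, j of xi_t(i, j) = 1
```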

4. Expectations
   Expected number of times state $i$ occurs under observation $O$:
$$
\sum_{t=1}^{T} \gamma_t(i)
$$
   Expected number of transitions from state $i$ to state $j$ under observation $O$:
$$
\sum_{t=1}^{T-1} \xi_t(i, j)
$$
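   With `gamma` and `xi` from the sketches above, these expected counts are just sums over time:

```python
# Expected counts used later by the Baum-Welch re-estimation step.
expected_visits = gamma.sum(axis=0)       # sum_{t=1}^{T} gamma_t(i): expected times in state i
expected_transitions = xi.sum(axis=0)     # sum_{t=1}^{T-1} xi_t(i, j): expected i -> j transitions
```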
5. Learning algorithms:
   If the training data contains both observation sequences and the corresponding state sequences, learning the HMM is straightforward supervised learning; if the training data contains only observation sequences, learning requires the EM algorithm and is unsupervised.
   Suppose the given training data consists of $S$ observation sequences of the same length together with the corresponding state sequences
$\{(O_1, I_1), (O_2, I_2), \ldots, (O_S, I_S)\}$. Then, by the Bernoulli law of large numbers ("the limit of the frequency is the probability"), the HMM parameters can be estimated directly from counts.
   (1) Supervised learning:
     Initial probability: $\hat{\pi}_i = \frac{|q_i|}{\sum_i |q_i|}$
     Transition probability: $\hat{a}_{ij} = \frac{|q_{ij}|}{\sum_{j=1}^{N} |q_{ij}|}$
     Emission probability: $\hat{b}_{ik} = \frac{|s_{ik}|}{\sum_{k=1}^{M} |s_{ik}|}$
     where $|q_i|$, $|q_{ij}|$ and $|s_{ik}|$ count, respectively, how often state $q_i$ appears as an initial state, how often a transition from $q_i$ to $q_j$ occurs, and how often symbol $k$ is observed while in state $q_i$ in the training data.
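   As a rough illustration (the function name, data layout, and integer coding below are assumptions, not taken from the original post), the supervised estimates are plain normalised counts over the labelled training pairs:

```python
import numpy as np

def supervised_estimate(pairs, N, M):
    """Count-based estimates from a list of (obs, states) pairs with
    integer-coded symbols (0..M-1) and states (0..N-1); every count is
    assumed to be nonzero for this sketch."""
    pi_count = np.zeros(N)
    a_count = np.zeros((N, N))
    b_count = np.zeros((N, M))
    for obs, states in pairs:
        pi_count[states[0]] += 1                        # count initial states
        for t in range(len(states) - 1):
            a_count[states[t], states[t + 1]] += 1      # count i -> j transitions
        for o, s in zip(obs, states):
            b_count[s, o] += 1                          # count symbol k emitted from state i
    # "frequency converges to probability": normalise each count table
    return (pi_count / pi_count.sum(),
            a_count / a_count.sum(axis=1, keepdims=True),
            b_count / b_count.sum(axis=1, keepdims=True))
```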
   (2) Baum-Welch algorithm
     Write all observed data as $O = (o_1, o_2, \ldots, o_T)$ and all hidden data as $I = (i_1, i_2, \ldots, i_T)$; the complete data is $(O, I) = (o_1, o_2, \ldots, o_T, i_1, i_2, \ldots, i_T)$, and the complete-data log-likelihood is $\ln P(O, I \mid \lambda)$.
     Let $\bar{\lambda}$ be the current estimate of the HMM parameters and $\lambda$ the parameters to be maximized.
$$
Q(\lambda, \bar{\lambda}) = \sum_{I} \big(\ln P(O, I \mid \lambda)\big)\, P(I \mid O, \bar{\lambda}) = \sum_{I} \ln P(O, I \mid \lambda)\, \frac{P(O, I \mid \bar{\lambda})}{P(O \mid \bar{\lambda})} \propto \sum_{I} \ln P(O, I \mid \lambda)\, P(O, I \mid \bar{\lambda})
$$

     The EM procedure:
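     As a minimal sketch of one EM (Baum-Welch) iteration for a single observation sequence, reusing `forward` and `backward` from the sketch in section 1; the re-estimation formulas used here are the standard ones, $\pi_i = \gamma_1(i)$, $a_{ij} = \sum_t \xi_t(i,j) / \sum_t \gamma_t(i)$, $b_j(k) = \sum_{t: o_t = k} \gamma_t(j) / \sum_t \gamma_t(j)$, stated as an assumption rather than taken from the original figure:

```python
def baum_welch_step(pi, A, B, obs):
    # E-step: forward/backward, then gamma_t(i) and xi_t(i, j).
    alpha, beta = forward(pi, A, B, obs), backward(A, B, obs)
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    T, N, M = len(obs), A.shape[0], B.shape[1]
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1]
        xi[t] /= xi[t].sum()
    # M-step: standard re-estimation from the expected counts.
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(M):
        new_B[:, k] = gamma[np.array(obs) == k].sum(axis=0)
    new_B /= gamma.sum(axis=0)[:, None]
    return new_pi, new_A, new_B
```

     Iterating `baum_welch_step` until the parameters stop changing yields a local maximum of the observed-data likelihood $P(O \mid \lambda)$.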
