
Gaussian-Bernoulli Restricted Boltzmann Machine

On the derivation of the non-binary case of the restricted Boltzmann machine (RBM) in machine learning:
http://blog.51cto.com/13345387/1971665
Restricted Boltzmann Machine:
http://www.cnblogs.com/neopenx/p/4399336.html

The two articles above explain the non-binary restricted Boltzmann machine in considerable detail.

1. Differences in the energy function

(1) Binary-Binary

$$E(v,h) = -\sum_{j=1}^{n}\sum_{i=1}^{m} w_{ij} h_j v_i - \sum_{i=1}^{m} a_i v_i - \sum_{j=1}^{n} b_j h_j$$
(2) Gaussian-Binary

$$E(v,h) = -\sum_{j=1}^{n}\sum_{i=1}^{m} w_{ij} h_j \frac{v_i}{\sigma_i} + \sum_{i=1}^{m} \frac{\left( v_i - a_i \right)^2}{2\sigma_i^2} - \sum_{j=1}^{n} b_j h_j$$

(Note the quadratic term enters with a plus sign; with a minus sign the resulting distribution would not be normalizable.)
(3) Gaussian-Gaussian

This variant is very unstable, so I have not looked into it.
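
To make the two tractable cases concrete, here is a minimal NumPy sketch (my own illustration, not code from the linked articles) that evaluates both energies for a single configuration; all names are illustrative:

import numpy as np

def energy_binary_binary(v, h, W, a, b):
    """E(v,h) = -h'W'v - a'v - b'h for binary v and h."""
    return -h @ W.T @ v - a @ v - b @ h

def energy_gaussian_binary(v, h, W, a, b, sigma):
    """Gaussian visible units with per-unit stddev sigma, binary hidden units."""
    return (-h @ W.T @ (v / sigma)
            + np.sum((v - a) ** 2 / (2.0 * sigma ** 2))
            - b @ h)

rng = np.random.default_rng(0)
m, n = 4, 3                                  # visible and hidden unit counts
W = rng.normal(scale=0.1, size=(m, n))       # w_ij couples v_i with h_j
a, b = np.zeros(m), np.zeros(n)              # visible and hidden biases
h = rng.integers(0, 2, size=n).astype(float)

v_bin = rng.integers(0, 2, size=m).astype(float)
v_real = rng.normal(size=m)                  # Gaussian units take real values
print(energy_binary_binary(v_bin, h, W, a, b))
print(energy_gaussian_binary(v_real, h, W, a, b, sigma=np.ones(m)))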

2. Implementation

We know that the weight update is:

$$\Delta w_{ij} = \epsilon \left( \left\langle v_i h_j \right\rangle_{\mathrm{data}} - \left\langle v_i h_j \right\rangle_{\mathrm{model}} \right)$$
The first term, the expectation under the data distribution, can be computed directly from the input using the conditionals:

$$p\left( H_j = 1 \mid v \right) = \sigma\left( \sum_{i=1}^{m} w_{ij} v_i + b_j \right)$$

$$p\left( V_i = 1 \mid h \right) = \sigma\left( \sum_{j=1}^{n} w_{ij} h_j + a_i \right)$$
The second term, the expectation under the model distribution, is hard to compute. Hinton proposed contrastive divergence (CD), which simplifies the weight update; it relies on Gibbs sampling.
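
To make this concrete, here is a minimal NumPy sketch of one CD-1 update for the binary-binary case (my own illustration; the names do not come from the repo quoted below):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, a, b, lr=0.1, rng=np.random.default_rng(0)):
    """One CD-1 weight update for a binary-binary RBM.

    v0: (batch, m) data; W: (m, n) weights; a: (m,) visible bias; b: (n,) hidden bias.
    """
    # Positive phase: hidden probabilities and sampled states driven by the data.
    h0_probs = sigmoid(v0 @ W + b)
    h0_states = (rng.random(h0_probs.shape) < h0_probs).astype(float)

    # One Gibbs step: reconstruct the visibles, then re-infer the hiddens.
    v1_probs = sigmoid(h0_probs @ W.T + a)
    h1_probs = sigmoid(v1_probs @ W + b)

    # <v h>_data - <v h>_model, with the model term approximated by CD-1.
    positive = v0.T @ h0_states
    negative = v1_probs.T @ h1_probs
    return W + lr * (positive - negative) / v0.shape[0]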

The only difference between a binary unit and a Gaussian unit is how it is activated; both are activated stochastically. For a binary unit, a sigmoid gives the activation probability, a uniform random number in [0, 1) is drawn, and the unit turns on (outputs 1) if the probability exceeds that random number. A Gaussian unit draws its value differently: from a Gaussian distribution whose mean is the unit's linear activation and whose standard deviation is fixed in advance, typically a small value such as 0.01.
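
In isolation, the two sampling rules look like this (a minimal illustrative sketch, not the repo's code):

import numpy as np
rng = np.random.default_rng(0)

pre_activation = np.array([-1.0, 0.0, 2.0])   # W'h + bias for three units

# Binary unit: squash to a probability, then threshold against a uniform draw.
prob = 1.0 / (1.0 + np.exp(-pre_activation))
binary_state = (rng.random(3) < prob).astype(float)        # 0/1 outputs

# Gaussian unit: draw from a normal centred on the linear activation,
# with a small fixed standard deviation (0.01 here, as in the text).
gaussian_value = rng.normal(loc=pre_activation, scale=0.01)
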
Now the actual code:

    def gibbs_sampling_step(self, visible, n_features):
        """Perform one step of gibbs sampling.
        :param visible: activations of the visible units
        :param n_features: number of features
        :return: tuple(hidden probs, hidden states, visible probs,
                       new hidden probs, new hidden states)
        """
        hprobs, hstates = self.sample_hidden_from_visible(visible)
        vprobs = self.sample_visible_from_hidden(hprobs, n_features)
        hprobs1, hstates1 = self.sample_hidden_from_visible(vprobs)

        return hprobs, hstates, vprobs, hprobs1, hstates1
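
Note that the reconstruction step is driven by hprobs rather than the sampled hstates. Driving the negative phase with probabilities instead of binary states reduces sampling noise, although some treatments recommend using the binary states for this step.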
    def sample_visible_from_hidden(self, hidden, n_features):
        """Sample the visible units from the hidden units.

        This is the Negative phase of the Contrastive Divergence algorithm.
        :param hidden: activations of the hidden units
        :param n_features: number of features
        :return: visible probabilities
        """
        visible_activation = tf.add(
            tf.matmul(hidden, tf.transpose(self.W)),
            self.bv_
        )

        if self.visible_unit_type == 'bin':
            vprobs = tf.nn.sigmoid(visible_activation)

        elif self.visible_unit_type == 'gauss':
            vprobs = tf.truncated_normal(
                (1, n_features), mean=visible_activation, stddev=self.stddev)

        else:
            vprobs = None

        return vprobs
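
A detail worth noting in the Gaussian branch: tf.truncated_normal re-draws any value that falls more than two standard deviations from the mean, so reconstructions stay close to the deterministic activation. Also, the sample shape is (1, n_features) while the mean is batch-sized; TensorFlow broadcasts them, which means every row in the batch shares the same underlying noise draw, a quirk of this implementation rather than a requirement of the algorithm.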
    def sample_hidden_from_visible(self, visible):
        """Sample the hidden units from the visible units.
        This is the Positive phase of the Contrastive Divergence algorithm.
        :param visible: activations of the visible units
        :return: tuple(hidden probabilities, hidden binary states)
        """
        hprobs = tf.nn.sigmoid(tf.add(tf.matmul(visible, self.W), self.bh_))
        hstates = utilities.sample_prob(hprobs, self.hrand)

        return hprobs, hstates
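
utilities.sample_prob is not shown here; presumably it compares hprobs against the pre-drawn uniform noise in self.hrand and returns the resulting 0/1 states, i.e. the binary thresholding rule described above.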
    def compute_positive_association(self, visible,
                                     hidden_probs, hidden_states):
        """Compute positive associations between visible and hidden units.

        :param visible: visible units
        :param hidden_probs: hidden units probabilities
        :param hidden_states: hidden units states
        :return: positive association = dot(visible.T, hidden)
        """
        if self.visible_unit_type == 'bin':
            positive = tf.matmul(tf.transpose(visible), hidden_states)

        elif self.visible_unit_type == 'gauss':
            positive = tf.matmul(tf.transpose(visible), hidden_probs)

        else:
            positive = None

        return positive
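
For context, during training these methods plausibly fit together into a single CD-1 weight update along these lines (a hedged sketch, not the repo's actual training loop; self.input_data, self.learning_rate, and w_update are assumed names):

# Hypothetical wiring of the methods above into one CD-1 weight update.
hprobs0, hstates0, vprobs, hprobs1, hstates1 = \
    self.gibbs_sampling_step(self.input_data, n_features)

positive = self.compute_positive_association(
    self.input_data, hprobs0, hstates0)
negative = tf.matmul(tf.transpose(vprobs), hprobs1)

batch_size = tf.cast(tf.shape(self.input_data)[0], tf.float32)
w_update = self.W.assign_add(
    self.learning_rate * (positive - negative) / batch_size)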

The source code above is from: Deep-Learning-TensorFlow
