How are Generative Stochastic Networks trained?

Answer by Yoshua Bengio:

There are many ways that neural networks can represent a conditional probability distribution. In the experiments reported in the paper, it is done with classical sigmoid output units, each of which represents the probability that an output variable (here the i-th bit X_i to be reconstructed) takes the value 1 or 0. In that case, the random bits X_i are assumed to be conditionally independent, given X_tilde. You can choose other kinds of distribution (any parametric distribution, by making its parameters a function of the neural network outputs; here the distribution is a factorized Bernoulli, each bit with probability p_i = sigmoid(a_i), where a_i are the pre-sigmoid outputs of the neural net).
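As a minimal sketch (not the paper's code) of what this parameterization means, the snippet below takes some hypothetical pre-sigmoid outputs a_i, turns them into per-bit probabilities p_i = sigmoid(a_i), samples the bits independently, and evaluates the factorized-Bernoulli log-likelihood of an observed binary vector:

```python
import torch

torch.manual_seed(0)

n_bits = 8
a = torch.randn(n_bits)        # a_i: pre-sigmoid outputs of the neural net (placeholder values)
p = torch.sigmoid(a)           # p_i = sigmoid(a_i): probability that bit X_i = 1
x_sample = torch.bernoulli(p)  # each bit sampled independently, given the (implicit) X_tilde

# Log-likelihood of an observed binary vector x under this factorized Bernoulli:
x = torch.randint(0, 2, (n_bits,)).float()
log_lik = (x * torch.log(p) + (1 - x) * torch.log(1 - p)).sum()
```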

The neural net is trained as usual, by back-propagating the gradient of the log-likelihood of the outputs (equivalently, the cross-entropy in the above example). The only difference from ordinary neural nets (but similarly to dropout) is that noise is injected into the neural net (into the inputs, and possibly the hidden units as well).
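The following is a minimal, hypothetical training step in that spirit, not the paper's exact setup: the input is corrupted (here with salt-and-pepper noise, one possible choice), the network maps the corrupted X_tilde to pre-sigmoid outputs a_i, and the negative log-likelihood of the clean bits (binary cross-entropy) is back-propagated as usual. The network sizes and noise level are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
n_bits, n_hidden = 8, 16

# A small reconstruction network: X_tilde -> pre-sigmoid outputs a_i.
net = nn.Sequential(nn.Linear(n_bits, n_hidden), nn.ReLU(), nn.Linear(n_hidden, n_bits))
opt = torch.optim.SGD(net.parameters(), lr=0.1)

x = torch.randint(0, 2, (32, n_bits)).float()  # a batch of binary training vectors

# Inject noise: replace each bit with a random value with probability 0.3 (one choice of corruption).
mask = (torch.rand_like(x) < 0.3).float()
x_tilde = (1 - mask) * x + mask * torch.randint(0, 2, x.shape).float()

a = net(x_tilde)  # pre-sigmoid outputs a_i, conditioned on the corrupted input
# Negative log-likelihood of x under factorized Bernoulli(sigmoid(a)) == cross-entropy.
loss = F.binary_cross_entropy_with_logits(a, x)
opt.zero_grad()
loss.backward()
opt.step()
```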

