Approximation Methods For Bayesian Inference

There are two main families of methods for approximating intractable posterior distributions: deterministic methods, such as variational inference, and stochastic methods, such as sampling.

Deterministic methods

Variational methods originate in the calculus of variations, the mathematical field concerned with finding the function that optimizes a given numerical quantity. One common variational method is variational Expectation-Maximization (EM). Standard EM is used to obtain maximum-likelihood estimates

\[\hat{\boldsymbol{z}} = \mathop{\arg\max}_{\boldsymbol{z}} p(\boldsymbol{x} |\boldsymbol{z}).\]

But when the model grows complicated, computing the posterior $p(\boldsymbol{z}|\boldsymbol{x})$ in the E step becomes computationally intractable. Variational Bayes approximates the posterior by removing some dependencies between the variables of the model, yielding a tractable distribution $q(\boldsymbol z)$ over the space of hidden variables $\boldsymbol z$. The variational EM approach approximates the posterior with a parameterized family of tractable distributions [Bishop06].
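A standard way to see why optimizing over $q$ makes sense (following [Bishop06]): for any distribution $q(\boldsymbol z)$, the log evidence decomposes as

\[\ln p(\boldsymbol{x}) = \underbrace{\mathbb{E}_{q}\!\left[\ln \frac{p(\boldsymbol{x}, \boldsymbol{z})}{q(\boldsymbol{z})}\right]}_{\mathcal{L}(q)} + \mathrm{KL}\!\left(q(\boldsymbol{z}) \,\|\, p(\boldsymbol{z}|\boldsymbol{x})\right).\]

Since the KL divergence is non-negative, $\mathcal{L}(q)$ is a lower bound on $\ln p(\boldsymbol{x})$, and maximizing it over the tractable family is equivalent to minimizing the KL divergence between $q(\boldsymbol z)$ and the true posterior.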

Stochastic methods

Gibbs sampling is a Markov chain Monte Carlo algorithm [Geman84] that repeatedly picks one hidden variable $z_i$ at random and samples it from its distribution conditioned on all the other hidden variables $\boldsymbol{z}_{-i}$ and the observation $\boldsymbol x$:

initialize z randomly
repeat until convergence:
    pick i randomly
    draw z_i from p(z_i | z_-i, x)
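As a concrete illustration, here is a minimal sketch of a Gibbs sampler for a toy target: a zero-mean bivariate normal with correlation $\rho$, for which each full conditional is itself Gaussian, $z_1 \mid z_2 \sim \mathcal{N}(\rho z_2,\, 1-\rho^2)$ and symmetrically for $z_2$. The function name and the systematic left-to-right sweep (a common variant of the random-scan step in the pseudocode above) are choices for this example, not from the original text.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_iters=5000, seed=0):
    """Gibbs sampler for a zero-mean, unit-variance bivariate normal
    with correlation rho. Each full conditional p(z_i | z_-i) is
    Gaussian: z1 | z2 ~ N(rho * z2, 1 - rho^2), and symmetrically."""
    rng = random.Random(seed)
    z1, z2 = 0.0, 0.0            # initialize z (here deterministically)
    sd = math.sqrt(1 - rho ** 2)  # conditional standard deviation
    samples = []
    for _ in range(n_iters):
        # draw each z_i from p(z_i | z_-i), sweeping both coordinates
        z1 = rng.gauss(rho * z2, sd)
        z2 = rng.gauss(rho * z1, sd)
        samples.append((z1, z2))
    return samples
```

After enough iterations the empirical correlation of the collected pairs approaches $\rho$; in a real model one would also discard an initial burn-in portion of the chain before using the samples.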

We use "convergence" loosely here: the Gibbs chain does not converge to a fixed state, and will eventually visit every possible state, possibly only after an exponential number of iterations. In practice, we usually declare convergence once the state stops changing much over a reasonable number of iterations.