Online Learning for Latent Dirichlet Allocation
Matthew D. Hoffman, David M. Blei and Francis Bach
NIPS 2010
Presented by Lingbo Li
Latent Dirichlet Allocation (LDA)
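1) Draw each topic $\beta_k \sim \mathrm{Dirichlet}(\eta)$, for $k = 1, \dots, K$
2) For each document $d$:
   1) Draw topic proportions $\theta_d \sim \mathrm{Dirichlet}(\alpha)$
   2) For each word $i$:
      1) Draw a topic assignment $z_{di} \sim \mathrm{Multinomial}(\theta_d)$
      2) Draw the word $w_{di} \sim \mathrm{Multinomial}(\beta_{z_{di}})$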
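The generative process above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`sample_dirichlet`, `sample_categorical`, `generate_corpus`) are hypothetical, both Dirichlet priors are taken to be symmetric, and every document is given the same fixed length for simplicity.

```python
import random


def sample_dirichlet(concentration, dim, rng):
    """Draw from a symmetric Dirichlet by normalizing Gamma(concentration, 1) draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(dim)]
    total = sum(draws)
    return [x / total for x in draws]


def sample_categorical(probs, rng):
    """Draw an index with probability given by probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_corpus(num_docs, doc_len, num_topics, vocab_size, alpha, eta, seed=0):
    """Sample topics and documents following the LDA generative process.

    Assumption: fixed document length and symmetric priors (illustrative only).
    """
    rng = random.Random(seed)
    # 1) Draw each topic as a distribution over the vocabulary.
    topics = [sample_dirichlet(eta, vocab_size, rng) for _ in range(num_topics)]
    corpus = []
    for _ in range(num_docs):
        # 2.1) Draw this document's topic proportions.
        theta = sample_dirichlet(alpha, num_topics, rng)
        words = []
        for _ in range(doc_len):
            z = sample_categorical(theta, rng)      # 2.2.1) topic assignment
            w = sample_categorical(topics[z], rng)  # 2.2.2) word from that topic
            words.append(w)
        corpus.append(words)
    return topics, corpus


topics, corpus = generate_corpus(
    num_docs=3, doc_len=5, num_topics=2, vocab_size=4, alpha=0.5, eta=0.5
)
```

Words are represented as integer vocabulary indices. Very small concentration values can make the Gamma draws underflow, so a production sampler would typically work in log space.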
Batch variational Bayes for LDA
For a collection of documents, infer:
• Per-word topic assignment $z$
• Per-document topic proportion $\theta$
• Topic distributions $\beta$
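The true posterior $p(z, \theta, \beta \mid w, \alpha, \eta)$
is approximated by a fully factorized distribution $q(z, \theta, \beta)$,
chosen by optimizing the evidence lower bound (ELBO)
over the variational parameters $\phi$, $\gamma$, $\lambda$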
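For reference, the fully factorized variational family and the bound being optimized can be written out as below. The notation ($\phi$, $\gamma$, $\lambda$, $\mathcal{L}$) follows the Hoffman, Blei and Bach paper; it is restated here as a reading aid rather than copied from the slide.

```latex
\begin{align*}
q(z_{di} = k) &= \phi_{d w_{di} k}, \\
q(\theta_d)   &= \mathrm{Dirichlet}(\theta_d; \gamma_d), \\
q(\beta_k)    &= \mathrm{Dirichlet}(\beta_k; \lambda_k), \\
\log p(w \mid \alpha, \eta) &\ge \mathcal{L}(w, \phi, \gamma, \lambda)
  = \mathbb{E}_q[\log p(w, z, \theta, \beta \mid \alpha, \eta)]
  - \mathbb{E}_q[\log q(z, \theta, \beta)].
\end{align*}
```

Batch variational Bayes alternates between updating the per-document parameters $\phi$, $\gamma$ with $\lambda$ held fixed (E step) and updating $\lambda$ given $\phi$ (M step).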
Online variational inference for LDA
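• Mini-batches: process $S$ documents per update instead of one, and average their sufficient statistics when forming the intermediate topic estimate $\tilde{\lambda}$
• Hyperparameter estimation: $\alpha$ and $\eta$ can also be fit online, using Newton-Raphson-style steps scaled by the step size $\rho_t$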
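A minimal sketch of one mini-batch topic update follows, assuming the E step has already produced, for each document in the batch, a K x W matrix of expected counts (word count times responsibility). The function names and the nested-list data layout are hypothetical; the update rule itself, a mini-batch estimate $\tilde{\lambda} = \eta + \frac{D}{S}\sum_s n_{ts}\phi_{ts}$ blended into $\lambda$ with weight $\rho_t = (\tau_0 + t)^{-\kappa}$, is the one described in the paper.

```python
def step_size(t, tau0, kappa):
    """Step size rho_t = (tau0 + t) ** (-kappa) for update number t (starting at 0)."""
    return (tau0 + t) ** (-kappa)


def online_update(lam, batch_sstats, eta, corpus_size, rho):
    """Blend the current topics with the estimate from one mini-batch.

    lam          : K x W nested list, current variational topic parameters
    batch_sstats : list of S matrices (K x W), one per document, holding the
                   expected counts n * phi from the E step (assumed precomputed)
    eta          : symmetric Dirichlet prior on topics
    corpus_size  : D, the (assumed known) number of documents in the corpus
    rho          : step size for this update
    """
    batch_size = len(batch_sstats)
    scale = corpus_size / batch_size  # pretend the batch is the whole corpus
    new_lam = []
    for k, row in enumerate(lam):
        new_row = []
        for w, old_value in enumerate(row):
            summed = sum(doc[k][w] for doc in batch_sstats)
            lam_tilde = eta + scale * summed
            new_row.append((1.0 - rho) * old_value + rho * lam_tilde)
        new_lam.append(new_row)
    return new_lam


# Toy example: one topic, two vocabulary words, a mini-batch of two documents.
lam = [[1.0, 1.0]]
batch = [[[1.0, 0.0]], [[0.0, 1.0]]]
updated = online_update(lam, batch, eta=0.5, corpus_size=10, rho=0.5)
```

With a mini-batch size of one this reduces to the single-document update; larger batches reduce the noise in $\tilde{\lambda}$ at the cost of fewer updates per document seen.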
Analysis of convergence
• Multiply the gradients by the inverse of an appropriate
positive definite matrix H to speed up stochastic
gradient algorithms.
• H: the Fisher information matrix of the variational
distribution q
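As a reading aid, the preconditioned step and the usual step-size conditions can be written as follows. Here $\ell$ denotes the per-document contribution to the variational objective and $\tilde{\lambda}$ the topic estimate computed from the current document or mini-batch; these symbols follow the paper and are not taken from the slide itself.

```latex
\begin{align*}
\lambda &\leftarrow \lambda + \rho_t H^{-1} \nabla_{\lambda} \ell
         = (1 - \rho_t)\,\lambda + \rho_t\,\tilde{\lambda}, \\
\rho_t  &= (\tau_0 + t)^{-\kappa}, \qquad \kappa \in (0.5, 1], \quad \tau_0 \ge 0, \\
\sum_{t=0}^{\infty} \rho_t &= \infty, \qquad \sum_{t=0}^{\infty} \rho_t^2 < \infty.
\end{align*}
```

The two summation conditions are the standard requirements for stochastic approximation to converge, and the restriction on $\kappa$ is what guarantees them for this schedule.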
Use perplexity on held-out data as a measure of model fit.
The per-document variational parameters $\gamma_i$ and $\phi_i$ for held-out documents
are fit using the E step in Algorithm 2.
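Perplexity is the exponential of the negative average per-word log-likelihood on the held-out set. The sketch below assumes the per-document log-likelihoods (or variational lower bounds on them) have already been computed; the function name and argument names are hypothetical.

```python
import math


def perplexity(doc_log_likelihoods, doc_token_counts):
    """Held-out perplexity: exp(-(total log-likelihood) / (total token count)).

    If lower bounds on the log-likelihood are supplied instead of exact
    values, the result is an upper bound on the true perplexity.
    """
    total_log_likelihood = sum(doc_log_likelihoods)
    total_tokens = sum(doc_token_counts)
    return math.exp(-total_log_likelihood / total_tokens)


# Two documents of two tokens each, every token assigned probability 0.25.
example = perplexity([2 * math.log(0.25), 2 * math.log(0.25)], [2, 2])
```

Lower perplexity means the model assigns higher probability to the held-out words; a model that predicts each word uniformly over V choices has perplexity V.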
Evaluating learning parameters
• Two corpora: 352,549 documents from the journal Nature, and
100,000 documents from the English version of Wikipedia.
• For each corpus, set aside a 1,000-document test set and a separate
1,000-document validation set.
• Run online LDA for five hours on the remaining documents from each
corpus for a range of settings of the learning parameters $\kappa$, $\tau_0$, and mini-batch size $S$.
Compare batch and online on fixed corpora:
True online