Bayesian Nonparametric Matrix Factorization for
Recorded Music
Authors: Matthew D. Hoffman, David M. Blei, Perry R. Cook
Princeton University, Department of Computer Science,
35 Olden St., Princeton, NJ 08540, USA
Reading Group Presenter:
Shujie Hou
Cognitive Radio Institute
Friday, October 15, 2010
Outline
■ Introduction
■ Terminology
■ Problem Statement and Contribution of This Paper
■ GaP-NMF Model (Gamma Process Nonnegative Matrix Factorization)
■ Variational Inference
■ Definition
■ Variational Objective Function
■ Coordinate Ascent Optimization
■ Other Approaches
■ Evaluation
Terminology (1)
■ Nonparametric Statistics:
□ The term non-parametric is not meant to imply that such models
completely lack parameters but that the number and nature of
the parameters are flexible and not fixed in advance.
■ Nonnegative Matrix Factorization:
□ Non-negative matrix factorization (NMF) is a group of algorithms
in multivariate analysis and linear algebra in which a matrix is
factorized into (usually) two matrices whose elements are all
greater than or equal to 0:
$X \approx WH$
The above two definitions are cited from Wikipedia
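As a concrete illustration (a minimal sketch using Lee & Seung's classic
multiplicative updates for squared Euclidean error, not this paper's method):

```python
# Basic NMF: factorize a nonnegative M x N matrix X as X ~ W @ H
# with W (M x K) and H (K x N) kept elementwise nonnegative.
import numpy as np

def nmf(X, K, n_iter=200, eps=1e-10):
    M, N = X.shape
    rng = np.random.default_rng(0)
    W = rng.random((M, K))
    H = rng.random((K, N))
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)  # multiplicative update for H
        W *= (X @ H.T) / (W @ H @ H.T + eps)  # multiplicative update for W
    return W, H
```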
Terminology (2)
■ Variational Inference:
□ Variational inference approximates the posterior distribution with
a simpler distribution, whose parameters are optimized to be
close to the true posterior.
■ Mean-field Variational Inference:
□ In mean-field variational inference, each variable is given an
independent distribution, usually of the same family as its prior.
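In symbols, for latent variables $z_1, \ldots, z_J$, the mean-field family
takes the fully factorized form (a standard formulation, written out for
concreteness):

$$ q(z_1, \ldots, z_J) = \prod_{j=1}^{J} q_j(z_j) $$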
Outline
■ Introduction
■ Terminology
■ Problem Statement and Contribution of This Paper
■ GaP-NMF Model
■ Variational Inference
■ Definition
■ Variational Objective Function
■ Coordinate Ascent Optimization
■ Other Approaches
■ Evaluation
Problem Statement and Contribution
■ Research Topic:
□ Breaking audio spectrograms into separate sources of sound
using latent-variable decompositions, e.g., matrix factorization.
■ A potential problem:
□ The number of latent variables must be specified in advance,
which is not always possible.
■ Contribution of this paper
□ The paper develops Gamma Process Nonnegative Matrix
Factorization (GaP-NMF), a Bayesian nonparametric approach
to decompose spectrograms.
Outline
■ Introduction
■ Terminology
■ Problem Statement and Contribution of This Paper
■ GaP-NMF Model
■ Variational Inference
■ Definition
■ Variational Objective Function
■ Coordinate Ascent Optimization
■ Other Approaches
■ Evaluation
Dataset for the GaP-NMF Model
■ What is given is an M-by-N matrix $X$,
in which $X_{mn}$ is the power of the audio signal at time window n and
frequency bin m.
If the number of latent variables is specified in advance:
■ Assume the audio signal is composed of K static sound
sources. The problem is to decompose $X \approx WH$,
in which $W$ is an M-by-K matrix and $H$ is a K-by-N matrix. Cell $W_{mk}$ is the
average amount of energy source k exhibits at frequency m;
cell $H_{kn}$ is the gain of source k at time n.
■ The problem is then solved by standard NMF algorithms.
GaP-NMF Model
If the number of latent variables is not specified in advance:
■ GaP-NMF assumes that the data is drawn according to the
following generative process (truncated at L components):
$$ W_{ml} \sim \mathrm{Gamma}(a, a) \qquad H_{ln} \sim \mathrm{Gamma}(b, b) $$
$$ \theta_l \sim \mathrm{Gamma}(\alpha/L,\, \alpha c) \qquad X_{mn} \sim \mathrm{Exponential}\Big(\textstyle\sum_l \theta_l W_{ml} H_{ln}\Big) $$
□ $\theta_l$ is the overall gain of the corresponding source l.
□ $\alpha$ is used to control the number of latent variables.
Based on the formula that $X_{mn} \sim \mathrm{Exponential}((WH)_{mn})$
for power spectrograms (Abdallah & Plumbley (2004))
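A minimal sketch of drawing synthetic data from this truncated generative
process (hyperparameter names follow the equations above; illustrative
only, not the authors' code):

```python
# Sample from the truncated GaP-NMF generative process.
# NumPy's gamma sampler is parameterized by (shape, scale),
# so a Gamma(shape, rate) draw uses scale = 1 / rate.
import numpy as np

def sample_gap_nmf(M, N, L, a, b, alpha, c, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.gamma(a, 1.0 / a, size=(M, L))                  # W_ml ~ Gamma(a, a)
    H = rng.gamma(b, 1.0 / b, size=(L, N))                  # H_ln ~ Gamma(b, b)
    theta = rng.gamma(alpha / L, 1.0 / (alpha * c), size=L) # theta_l ~ Gamma(alpha/L, alpha*c)
    mean = np.einsum("ml,l,ln->mn", W, theta, H)            # sum_l theta_l W_ml H_ln
    X = rng.exponential(mean)                               # X_mn ~ Exponential with this mean
    return X, W, H, theta
```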
GaP-NMF Model
(Kingman, 1993)
■ The number of nonzero $\theta_l$ is the number of latent
variables K.
■ As L increases towards infinity, the number of nonzero $\theta_l$,
expressed by K, remains finite with probability one and obeys:
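To see the finiteness claim empirically, one can draw $\theta$ from the
truncated prior for increasing L and count how many components exceed a
small threshold $\epsilon$ (an illustrative simulation, not from the paper):

```python
# As L grows, the number of theta_l above a fixed threshold stabilizes,
# illustrating that the effective number of sources K stays finite.
import numpy as np

rng = np.random.default_rng(0)
alpha, c, eps = 1.0, 1.0, 1e-3
for L in (10, 100, 1000, 10000):
    theta = rng.gamma(alpha / L, 1.0 / (alpha * c), size=L)
    print(L, int(np.sum(theta > eps)))
```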
Outline
■ Introduction
■ Terminology
■ Problem Statement and Contribution of This Paper
■ GaP-NMF Model
■ Variational Inference
■ Definition
■ Variational Objective Function
■ Coordinate Ascent Optimization
■ Other Approaches
■ Evaluation
Definition of Variational Inference
■ Variational inference approximates the posterior distribution with a
simpler distribution, whose parameters are optimized to be close to
the true posterior.
■ Under this paper's conditions:
□ The posterior distribution $p(W, H, \theta \mid X)$ is what is measured.
□ A variational distribution q, assumed with free parameters,
approximates the posterior distribution.
□ The free parameters are adjusted so that q stays close to the posterior.
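In symbols, adjusting the free parameters solves the standard variational
objective (written out for concreteness):

$$ q^{*} = \arg\min_{q} \; D_{\mathrm{KL}}\big( q(W, H, \theta) \,\big\|\, p(W, H, \theta \mid X) \big) $$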
Outline
■ Introduction
■ Terminology
■ Problem Statement and Contribution of This Paper
■ GaP-NMF Model
■ Variational Inference
■ Definition
■ Variational Objective Function
■ Coordinate Ascent Optimization
■ Other Approaches
■ Evaluation
Variational Objective Function
■ Assume each variable obeys the following Generalized
Inverse-Gaussian (GIG) family:
$$ \mathrm{GIG}(y;\, \gamma, \rho, \tau) = \frac{(\rho/\tau)^{\gamma/2}}{2\,K_\gamma(2\sqrt{\rho\tau})}\; y^{\gamma-1} \exp(-\rho y - \tau/y) $$
□ When $\tau = 0$, it is the Gamma family.
□ $K_\gamma$ denotes a modified Bessel function of the second kind.
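The coordinate-ascent updates later need expectations such as $\mathbb{E}[y]$
and $\mathbb{E}[1/y]$ under a GIG distribution. A minimal sketch using the
standard GIG moment identities, assuming the parameterization above (not
the authors' code):

```python
# Expectations of y and 1/y under GIG(gamma, rho, tau), with density
# proportional to y^(gamma-1) * exp(-rho*y - tau/y).
import numpy as np
from scipy.special import kv  # modified Bessel function of the second kind

def gig_expectations(gamma, rho, tau):
    if tau == 0:  # Gamma(gamma, rho) special case; E[1/y] needs gamma > 1
        return gamma / rho, rho / (gamma - 1.0)
    s = 2.0 * np.sqrt(rho * tau)
    e_y = np.sqrt(tau / rho) * kv(gamma + 1, s) / kv(gamma, s)
    e_inv_y = np.sqrt(rho / tau) * kv(gamma - 1, s) / kv(gamma, s)
    return e_y, e_inv_y
```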
Deduction (1)
■ From Jordan et al., 1999:
$$ \log p(X) \;\ge\; \mathbb{E}_q[\log p(X, W, H, \theta)] \;-\; \mathbb{E}_q[\log q(W, H, \theta)] $$
■ The difference between the left and right sides is the
Kullback-Leibler divergence between the true posterior and
the variational distribution q.
■ Kullback-Leibler divergence: for probability distributions P
and Q of a discrete random variable, their KL divergence is
defined to be
$$ D_{\mathrm{KL}}(P \,\|\, Q) = \sum_i P(i) \log \frac{P(i)}{Q(i)} $$
Deduction (2)
■ The remaining intractable expectations are bounded from below
using Jensen's inequality.
Objective Function
■ $\mathcal{L} = \mathbb{E}_q[\log p(X, W, H, \theta)] - \mathbb{E}_q[\log q(W, H, \theta)]$
■ The objective function becomes this bound, made tractable by the
Jensen step above.
■ Maximize the objective function defined above with respect to the
corresponding parameters.
■ The optimal variational distributions are obtained.
■ Because these three distributions are independent,
$q(W)\,q(H)\,q(\theta)$ approximates $p(W, H, \theta \mid X)$.
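For concreteness, one Jensen-type bound of the kind used here (applied to
reciprocal-of-sum terms via the convex function $1/x$): for weights
$\phi_l \ge 0$ with $\sum_l \phi_l = 1$,

$$ \frac{1}{\sum_l y_l} = \frac{1}{\sum_l \phi_l \,(y_l/\phi_l)} \;\le\; \sum_l \phi_l \cdot \frac{\phi_l}{y_l} \;=\; \sum_l \frac{\phi_l^2}{y_l} $$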
Outline
■ Introduction
■ Terminology
■ Problem Statement and Contribution of This Paper
■ GaP-NMF Model
■ Variational Inference
■ Definition
■ Variational Objective Function
■ Coordinate Ascent Optimization
■ Other Approaches
■ Evaluation
Coordinate Ascent Algorithm (1)
■ Setting the derivative of the objective function with respect to the
variational parameters to zero yields:
■ Similarly:
Coordinate Ascent Algorithm (2)
■ Using Lagrange multipliers, the bound parameters
become:
■ The bound parameters and variational parameters are then updated
according to Equations 14, 15, 16, 17, and 18, ultimately
reaching a local optimum of the bound.
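In pseudocode form, the overall loop looks like the sketch below. The
helpers update_W, update_H, update_theta, and elbo are hypothetical
stand-ins for the paper's update equations (Eqs. 14-18), not the
authors' code:

```python
# Skeleton of coordinate ascent on the variational bound: each update
# can only increase the bound, so the loop converges to a local optimum.
def fit_gap_nmf(X, params, update_W, update_H, update_theta, elbo,
                max_iter=1000, tol=1e-5):
    last = -float("inf")
    for _ in range(max_iter):
        params = update_W(X, params)       # q(W) parameters
        params = update_H(X, params)       # q(H) parameters
        params = update_theta(X, params)   # q(theta) parameters
        bound = elbo(X, params)            # lower bound on log p(X)
        if bound - last < tol:
            break
        last = bound
    return params
```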
Outline
■ Introduction
■ Terminology
■ Problem Statement and Contribution of This Paper
■ GaP-NMF Model
■ Variational Inference
■ Definition
■ Variational Objective Function
■ Coordinate Ascent Optimization
■ Other Approaches
■ Evaluation
Other Approaches
■ Finite Bayesian model (also called GIG-NMF).
■ Finite non-Bayesian model.
■ EU-Nonnegative Matrix Factorization.
■ KL-Nonnegative Matrix Factorization.
Outline
■ Introduction
■ Terminology
■ Problem Statement and Contribution of This Paper
■ GaP-NMF Model
■ Variational Inference
■ Definition
■ Variational Objective Function
■ Coordinate Ascent Optimization
■ Other Approaches
■ Evaluation
Evaluation on Synthetic Data (1)
■ The data is generated according to the following model:
Evaluation on Synthetic Data (2)
Evaluation on Recorded Music
Conclusion
■ The GaP-NMF model is capable of determining the number of
latent sources automatically.
■ The key step of the paper is to use a variational distribution to
approximate the posterior distribution.
■ GaP-NMF works well for analyzing and processing
recorded music, and it may be applicable to other types of audio.
■ Thank you!