Song-level Multi-pitch Tracking by
Heavily Constrained Clustering
Zhiyao Duan, Jinyu Han and Bryan Pardo
EECS Dept., Northwestern Univ.
Interactive Audio Lab, http://music.cs.northwestern.edu
For presentation in ICASSP 2010, Dallas, Texas, USA.
Multi-pitch Estimation & Tracking Task
• Given polyphonic music played by several
monophonic harmonic instruments (number of instruments known)
• Estimate a pitch trajectory for each instrument
Northwestern University, Interactive Audio Lab. http://music.cs.northwestern.edu
Potential Applications
• Automatic music transcription
• Harmonic source separation
• Other applications
– Melody-based music search
– Chord recognition
– Source localization
– Music education
– ……
The 2-stage Standard Approach
• Stage 1: Multi-pitch Estimation (MPE): estimate
pitches in each single time frame
– Z. Duan, B. Pardo and C. Zhang, “Multiple Fundamental Frequency
Estimation by Modeling Spectral Peaks and Non-peak Regions”, IEEE Trans.
Audio Speech Language Process., in press.
• Stage 2: Multi-pitch Tracking (MPT): connect pitch
estimates across frames into pitch trajectories
[Figure: pitch estimates across frames; axes: time vs. frequency]
State of the Art of MPT
• What existing MPT methods do
– Form short pitch trajectories within a note
(note-level), according to the local time-frequency
proximity of pitch estimates
• Our contribution
– Form long pitch trajectories through multiple
notes (song-level) using a new constrained
clustering algorithm
Try Clustering by Timbre
• Each trajectory is a cluster of pitch estimates
• One cluster per instrument
• Clustering principle: maintain timbre
consistency in each cluster
Timbre Feature of Pitch Estimates
• Harmonic structure: relative amplitudes
of first 50 harmonics
[Figure: "Harmonic Structure"; amplitude (dB, 0 to 100) versus harmonic number (0 to 50), shown next to a time-frequency plot]
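A minimal sketch of how such a feature could be read off a spectrum, assuming a dB magnitude spectrum is already available for the frame. The function name, the nearest-bin lookup, the normalization to the strongest harmonic, and the floor value are illustrative assumptions, not details given on the slide.

```python
def harmonic_structure(spectrum_db, f0_hz, sample_rate, n_fft,
                       n_harmonics=50, floor_db=-100.0):
    """Return amplitudes (dB) of the first n_harmonics of f0_hz,
    relative to the strongest in-range harmonic (assumed normalization)."""
    bin_hz = sample_rate / n_fft  # frequency spacing between FFT bins
    amps = []
    for h in range(1, n_harmonics + 1):
        k = int(round(h * f0_hz / bin_hz))  # nearest bin to the h-th harmonic
        # Harmonics beyond the available bins are marked as missing.
        amps.append(spectrum_db[k] if k < len(spectrum_db) else None)
    in_range = [a for a in amps if a is not None]
    if not in_range:
        raise ValueError("f0_hz lies above the highest spectrum bin")
    peak = max(in_range)
    # Missing harmonics get the floor value (an assumption of this sketch).
    return [floor_db if a is None else max(a - peak, floor_db) for a in amps]


# Toy spectrum with 1 Hz bins: energy only at 100 Hz and 200 Hz.
spectrum = [0.0] * 501
spectrum[100], spectrum[200] = 60.0, 40.0
feature = harmonic_structure(spectrum, f0_hz=100.0, sample_rate=1000, n_fft=1000)
```

The result is always a vector of length `n_harmonics`, so every pitch estimate maps to a point in the same 50-dimensional space regardless of its fundamental frequency.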
Minimize This Objective Function
$$f(\Pi) = \sum_{k=1}^{K} \sum_{i \in T_k} \lVert x_i - c_k \rVert^2$$
• $K$: number of clusters
• $\Pi$: a partition into $K$ clusters
• $T_k$: all pitch estimates in the k-th cluster
• $c_k$: center of the k-th cluster
• $x_i$: the 50-d harmonic structure of the i-th pitch estimate
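The objective can be evaluated directly from a label assignment. The sketch below assumes the cluster center is the mean of the cluster's feature vectors (the usual K-means choice; the slide only says "center"), and the function names are illustrative.

```python
def cluster_center(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def objective(features, labels):
    """Sum of squared distances from each point to its cluster center."""
    clusters = {}
    for x, k in zip(features, labels):
        clusters.setdefault(k, []).append(x)
    total = 0.0
    for pts in clusters.values():
        c = cluster_center(pts)  # assumed: center = cluster mean
        for x in pts:
            total += sum((xd - cd) ** 2 for xd, cd in zip(x, c))
    return total


# Toy 2-d features standing in for the 50-d harmonic structures.
features = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]]
good = objective(features, [0, 0, 1])  # nearby points share a cluster
bad = objective(features, [0, 1, 1])   # a distant point joins the wrong cluster
```

A lower value means the harmonic structures inside each cluster are more alike, which is the timbre-consistency criterion.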
Objective Function Is Not Enough
Add Pitch-locality Constraints
• Must-link: pitch estimates close in both time and
frequency should be in the same cluster
• Cannot-link: simultaneous pitches should not be in
the same cluster (only for monophonic instruments)
[Figure: pitch estimates with link constraints; axes: time vs. frequency]
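A small sketch of how the two constraint types could be generated from pitch estimates. The thresholds `max_frame_gap` and `max_semitones` are hypothetical values chosen for illustration; the slide does not state how "close" is defined.

```python
def build_constraints(estimates, max_frame_gap=1, max_semitones=0.3):
    """estimates: list of (frame_index, midi_pitch) tuples.
    Returns (must_links, cannot_links) as lists of index pairs."""
    must_links, cannot_links = [], []
    for i in range(len(estimates)):
        for j in range(i + 1, len(estimates)):
            (ti, pi), (tj, pj) = estimates[i], estimates[j]
            if ti == tj:
                # Simultaneous pitches: a monophonic instrument plays only one.
                cannot_links.append((i, j))
            elif abs(ti - tj) <= max_frame_gap and abs(pi - pj) <= max_semitones:
                # Close in both time and frequency (thresholds are assumptions).
                must_links.append((i, j))
    return must_links, cannot_links


estimates = [(0, 60.0), (0, 67.0), (1, 60.1), (1, 67.2), (5, 60.0)]
must_links, cannot_links = build_constraints(estimates)
```

The pairwise loop is quadratic in the number of estimates; it is kept this way for readability rather than speed.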
Properties of Our Problem
• Objective: timbre consistency
• Constraints: pitch locality
• Previous constrained clustering algorithms do
not apply due to the following properties:
– Inconsistent constraints:
pitch estimates are sometimes erroneous,
which may make the constraints unsatisfiable
– Heavily constrained:
nearly every pitch estimate is involved in at least one
constraint
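One kind of inconsistency can be detected mechanically: a cannot-link between two points that a chain of must-links forces into the same cluster. The sketch below uses a union-find structure for this check; it is an illustration of the "inconsistent constraints" property, not part of the presented algorithm, and it does not catch every form of unsatisfiability (for example, more than K mutually cannot-linked points).

```python
def find(parent, i):
    """Return the representative of i's group, compressing the path."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def conflicting_cannot_links(n_points, must_links, cannot_links):
    """Cannot-links whose endpoints are joined by a chain of must-links."""
    parent = list(range(n_points))
    for a, b in must_links:
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[ra] = rb  # merge the two must-link groups
    return [(a, b) for a, b in cannot_links
            if find(parent, a) == find(parent, b)]


# Points 0-1-2 are chained by must-links, yet 0 and 2 are cannot-linked.
conflicts = conflicting_cannot_links(
    4, must_links=[(0, 1), (1, 2)], cannot_links=[(0, 2), (0, 3)])
```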
The Proposed Clustering Algorithm
$\Pi_n$: clustering in the n-th iteration;
$C_n$: {all constraints satisfied by $\Pi_n$};
1. Start from an initial clustering $\Pi_0$, which satisfies $C_0$, a
subset of all constraints; n = 1;
2. Find a new clustering $\Pi_n$ that decreases the objective f
and also satisfies $C_{n-1}$;
3. $C_n$ = {all constraints satisfied by $\Pi_n$};
4. Increment n and repeat steps 2-4 until the objective can (nearly)
no longer be decreased;
$C_0 \subseteq C_1 \subseteq \cdots \subseteq C$
$f(\Pi_0) > f(\Pi_1) > \cdots > f(\Pi)$
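The loop structure can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: candidate clusterings come from a pluggable `neighbors` function, and the toy demo uses single-point label flips on 1-d data as a stand-in for the move actually used in the talk. All names are hypothetical.

```python
def satisfied(constraints, labels):
    """Subset of (kind, i, j) constraints that the labeling satisfies."""
    out = set()
    for kind, i, j in constraints:
        same = labels[i] == labels[j]
        if (kind == "must") == same:
            out.add((kind, i, j))
    return out


def constrained_descent(labels, constraints, objective, neighbors, tol=1e-9):
    """Accept a candidate only if it keeps every currently satisfied
    constraint satisfied and lowers the objective."""
    current = list(labels)
    c_set = satisfied(constraints, current)
    f_cur = objective(current)
    improved = True
    while improved:
        improved = False
        for cand in neighbors(current):
            cand_set = satisfied(constraints, cand)
            f_cand = objective(cand)
            if c_set <= cand_set and f_cand < f_cur - tol:
                current, c_set, f_cur = list(cand), cand_set, f_cand
                improved = True
                break
    return current, c_set, f_cur


def sse_1d(values, labels):
    """Toy objective: within-cluster sum of squares for scalar features."""
    total = 0.0
    for k in set(labels):
        pts = [v for v, l in zip(values, labels) if l == k]
        mean = sum(pts) / len(pts)
        total += sum((p - mean) ** 2 for p in pts)
    return total


def single_flips(labels, n_clusters=2):
    """Stand-in neighborhood: change the label of one point at a time."""
    for i in range(len(labels)):
        for k in range(n_clusters):
            if k != labels[i]:
                cand = list(labels)
                cand[i] = k
                yield cand


values = [0.0, 0.1, 5.0, 5.1]
constraints = [("cannot", 0, 2), ("must", 2, 3)]
final, final_set, final_f = constrained_descent(
    [0, 1, 0, 1], constraints, lambda l: sse_1d(values, l), single_flips)
```

In the toy run the initial labeling satisfies neither constraint; the satisfied set only grows and the objective only falls from one accepted step to the next.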
Initial Clustering
• Trivial one
– $\Pi_0$: a random partition
– $C_0$: constraints satisfied by $\Pi_0$; may be empty
• A more informative one for MPT
– $\Pi_0$: label pitches according to pitch order in each
frame: highest, second-highest, third-highest, fourth-highest, …
– $C_0$: will contain all cannot-links
[Figure: pitch estimates labeled by pitch order in each frame; axes: time vs. frequency]
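The pitch-order initialization is simple to write down. A minimal sketch, assuming pitch estimates arrive as `(frame, pitch)` pairs; the function name and input layout are illustrative.

```python
from collections import defaultdict


def init_by_pitch_order(estimates):
    """estimates: list of (frame_index, pitch) tuples.
    Returns one label per estimate: 0 for the highest pitch in its frame,
    1 for the second-highest, and so on."""
    by_frame = defaultdict(list)
    for idx, (frame, pitch) in enumerate(estimates):
        by_frame[frame].append((pitch, idx))
    labels = [None] * len(estimates)
    for items in by_frame.values():
        # Sort each frame's pitches from highest to lowest.
        for rank, (_, idx) in enumerate(sorted(items, reverse=True)):
            labels[idx] = rank
    return labels


estimates = [(0, 60.0), (0, 72.0), (0, 67.0), (1, 71.0), (1, 59.0)]
labels = init_by_pitch_order(estimates)
```

Because two estimates in the same frame never receive the same rank, every cannot-link between simultaneous pitches is satisfied by construction.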
Find A New Clustering
• 1. Satisfy current constraints: $C_0 \subseteq C_1 \subseteq \cdots \subseteq C$
• 2. Decrease the objective function: $f(\Pi_0) > f(\Pi_1) > \cdots > f(\Pi)$
[Figure: graphs over points numbered 1 to 8; legend: satisfied cannot-link, satisfied must-link, unsatisfied cannot-link]
• Swap set: A connected subgraph between two clusters.
• Traverse all swap sets until finding a new clustering that
decreases the objective function
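One possible reading of a swap set, sketched below, is the set of points in two chosen clusters that are connected through currently satisfied constraints; exchanging the two labels on all of them at once keeps those constraints satisfied. This interpretation, the function names, and the `(kind, i, j)` constraint layout are assumptions made for illustration.

```python
from collections import deque


def swap_set(seed, labels, satisfied, a, b):
    """Points in clusters a or b reachable from seed (itself in a or b)
    through satisfied constraints, found by breadth-first search."""
    adj = {}
    for _, i, j in satisfied:
        if labels[i] in (a, b) and labels[j] in (a, b):
            adj.setdefault(i, []).append(j)
            adj.setdefault(j, []).append(i)
    seen, queue = {seed}, deque([seed])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def apply_swap(labels, points, a, b):
    """Exchange labels a and b on the given points only."""
    new = list(labels)
    for p in points:
        new[p] = b if labels[p] == a else a
    return new


labels = [0, 0, 1, 1, 2]
satisfied = [("must", 0, 1), ("cannot", 1, 2)]
component = swap_set(0, labels, satisfied, a=0, b=1)
swapped = apply_swap(labels, component, a=0, b=1)
```

After the swap, points 0 and 1 still share a label and point 2 still differs from point 1, so both constraints remain satisfied while the clustering itself has changed.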
Algorithm Review
$\Pi_k$: partition of points into clusters
$S_k$: feasible solution space under constraints $C_k$
Experiments
• Data set
– 10 J.S. Bach chorales (quartets, played by violin,
clarinet, saxophone and bassoon)
– Each instrument is recorded individually, then mixed
• Ground-truth pitch trajectories
– Use YIN on monophonic tracks before mixing
• Input pitch estimates
– Our previous work in [1]
– Input accuracy: 70.0 ± 3.1%
[1] Zhiyao Duan, Bryan Pardo and Changshui Zhang, “Multiple Fundamental Frequency Estimation by
Modeling Spectral Peaks and Non-peak Regions”, IEEE Trans. Audio Speech Language Process., in press.
Overall Multi-pitch Tracking Results
Mean % of correct pitch estimates
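The slide does not define how a pitch estimate is judged correct. As a rough sketch only, the percentage could be computed as below, assuming an estimate counts as correct when it lies within half a semitone of the ground-truth pitch of the same instrument in the same frame. The tolerance, the dictionary layout, and the choice of denominator are all assumptions.

```python
def tracking_accuracy(estimated, reference, tol_semitones=0.5):
    """estimated, reference: dicts mapping (instrument, frame) -> MIDI pitch.
    Returns the percentage of estimated pitches that match the reference."""
    if not estimated:
        return 0.0
    correct = sum(
        1 for key, pitch in estimated.items()
        if key in reference and abs(pitch - reference[key]) <= tol_semitones
    )
    return 100.0 * correct / len(estimated)


estimated = {("violin", 0): 72.1, ("violin", 1): 72.0,
             ("bassoon", 0): 48.0, ("bassoon", 1): 60.0}
reference = {("violin", 0): 72.0, ("violin", 1): 74.0,
             ("bassoon", 0): 48.2}
accuracy = tracking_accuracy(estimated, reference)
```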
Among Correctly Estimated Pitches
An Example
[Figure: "Ground-truth Pitch Trajectories"; pitch (MIDI number, 40 to 90) versus time (0 to 25 seconds)]
An Example
[Figure: "Our Results"; pitch (MIDI number, 40 to 90) versus time (0 to 25 seconds)]
Conclusion
• Formulate the song-level Multi-pitch
Tracking problem as a constrained
clustering problem
– Objective: timbre consistency
– Constraints: pitch locality
• Existing constrained clustering algorithms
do not apply due to problem properties
• Propose a new constrained clustering
algorithm
• Experimental results are promising
Thank you!