
A Wavelet-Based Approach
to the Discovery of Themes
and Motives in Melodies
Gissel Velarde and David Meredith
Aalborg University
Department of Architecture, Design & Media Technology
EuroMAC, September 2014
We present
• A computational method submitted to the
MIREX 2014 Discovery of Repeated Themes &
Sections task
• The results on the monophonic version of the
JKU Patterns Development Database
Ground Truth: Bach’s Fugue BWV 889
Ground Truth: Chopin’s Mazurka Op. 24, No. 4
The idea behind the method
• In the context of pattern discovery in
monophonic pieces:
– With a good melodic structure in terms of
segments, it should be possible to gather similar
segments into clusters and rank their salience
within the piece.
Considerations
• “a good melodic structure in terms of
segments”
– Is considered to be closer to the ground truth
analysis (See Collins, 2014)
• It specifies certain segments or patterns
• These patterns can be overlapping and hierarchical
Considerations
• We also consider other aspects of the
problem,
– representation,
– segmentation,
– measuring similarity,
– clustering of segments and
– ranking segments according to salience
The method
• The method
– Follows and extends our approach to melodic
segmentation and classification based on filtering
with the Haar wavelet (Velarde, Weyde and Meredith, 2013)
– Uses the idea of computing a similarity matrix for “window connectivity information” from a generic
motif discovery algorithm for sequential data
(Jensen, Styczynski, Rigoutsos and Stephanopoulos, 2006)
Wavelet transform
Haar Wavelet
The mother wavelet is:
$$\psi(t) = \begin{cases} 1, & \text{if } 0 \le t < 1/2 \\ -1, & \text{if } 1/2 \le t < 1 \\ 0, & \text{otherwise} \end{cases}$$
A family of functions is obtained
by translations and dilations of
the mother wavelet:
$$\psi_{u,s}(t) = \frac{1}{2^{s/2}} \, \psi\!\left(\frac{t - 2^{s} u}{2^{s}}\right)$$
The wavelet coefficients of the pitch
vector v for scale s and shift u are
defined as the inner product:
$$\langle v, \psi_{u,s} \rangle = \int v(t) \, \psi_{u,s}(t) \, dt$$
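As an illustration of the inner product above, the following is a minimal sketch of a discrete Haar filter applied to a sampled pitch signal. It assumes the wavelet is sampled over `scale` samples and evaluated at every sample shift; the function names `haar_wavelet` and `haar_coefficients` are hypothetical and are not taken from the slides.

```python
import math

def haar_wavelet(scale):
    """Sampled Haar wavelet: positive on the first half, negative on the
    second half, scaled by 1/sqrt(scale) (assumed normalization)."""
    half = scale // 2
    norm = 1.0 / math.sqrt(scale)
    return [norm] * half + [-norm] * (scale - half)

def haar_coefficients(v, scale):
    """Inner product of the pitch vector v with the wavelet at every
    shift u where the wavelet fits entirely inside the signal."""
    psi = haar_wavelet(scale)
    return [sum(v[u + t] * psi[t] for t in range(scale))
            for u in range(len(v) - scale + 1)]

# Toy pitch signal in MIDI pitch: up a major third, then back down.
v = [60, 60, 64, 64, 60, 60]
coeffs = haar_coefficients(v, scale=4)
print(coeffs)  # [-4.0, 0.0, 4.0]
```

With this sign convention a rising contour gives a negative coefficient and a falling contour a positive one, and the zero-crossing between them marks the turning point of the melody.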
Representation (Velarde et al. 2013)
New representation
First stage Segmentation (Velarde et al. 2013)
New segmentation
Segmentation
• First stage segmentation: constant segmentation, wavelet zero-crossings or modulus maxima
• Comparison: distance matrix given a measure
• Concatenation: the distance matrix is binarized given a threshold, and contiguous similar diagonal segments are concatenated
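The segmentation, comparison and concatenation steps can be sketched roughly as follows. This is an assumption-laden illustration rather than the authors' implementation: it uses constant segmentation and the city-block distance, treats a distance at or below the threshold as "similar", and applies no normalization. All function names are hypothetical.

```python
def constant_segments(signal, length):
    """First stage: cut the signal into non-overlapping segments of equal length."""
    return [signal[i:i + length] for i in range(0, len(signal) - length + 1, length)]

def city_block(a, b):
    """City-block (L1) distance between two equal-length segments."""
    return sum(abs(x - y) for x, y in zip(a, b))

def distance_matrix(segments):
    """Comparison: pairwise distances between all segments."""
    return [[city_block(a, b) for b in segments] for a in segments]

def binarize(dist, threshold):
    """Mark a pair of segments as similar when its distance is within the threshold."""
    return [[d <= threshold for d in row] for row in dist]

def diagonal_runs(similar):
    """Concatenation: return (i, j, run_length) for each maximal run of similar
    pairs along a diagonal above the main diagonal. A run of length n means that
    segments i..i+n-1 repeat as segments j..j+n-1."""
    n = len(similar)
    runs = []
    for offset in range(1, n):
        i = 0
        while i + offset < n:
            if similar[i][i + offset]:
                start = i
                while i + offset < n and similar[i][i + offset]:
                    i += 1
                runs.append((start, start + offset, i - start))
            else:
                i += 1
    return runs

# Toy signal: segments 0-1 are repeated exactly as segments 3-4.
signal = [60, 62, 64, 65, 67, 67, 60, 62, 64, 65]
segments = constant_segments(signal, length=2)
runs = diagonal_runs(binarize(distance_matrix(segments), threshold=0))
print(runs)  # [(0, 3, 2)]
```

The single run says that the two-segment unit starting at segment 0 recurs at segment 3, so the two short segments are merged into one longer pattern with two occurrences.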
Comparison, Clustering, Ranking
• Comparison: distance matrix given a measure
• Clustering: agglomerative clusters from an agglomerative hierarchical cluster tree
• Ranking criterion: sum of the length of occurrences
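A rough sketch of the clustering and ranking steps is given below. The slides do not state the linkage method, so average linkage is an assumption here, and the naive quadratic implementation is only meant for illustration. The names `agglomerate` and `rank_clusters` are hypothetical.

```python
def agglomerate(dist, n_clusters):
    """Naive agglomerative clustering on a precomputed distance matrix.
    Average linkage is assumed; the source does not specify the linkage."""
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        pairs = [(linkage(clusters[p], clusters[q]), p, q)
                 for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        _, p, q = min(pairs)
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters

def rank_clusters(clusters, lengths):
    """Rank clusters by the summed length of their occurrences, largest first."""
    return sorted(clusters, key=lambda c: sum(lengths[i] for i in c), reverse=True)

# Toy data: six segments described by one feature value each.
values = [0, 1, 10, 11, 12, 30]
dist = [[abs(a - b) for b in values] for a in values]
lengths = [4, 4, 2, 2, 2, 3]  # hypothetical segment lengths

clusters = agglomerate(dist, n_clusters=3)
ranked = rank_clusters(clusters, lengths)
print(ranked)  # [[0, 1], [2, 3, 4], [5]]
```

Here the cluster of two long occurrences (total length 8) outranks the cluster of three short ones (total length 6), which is the effect of ranking by summed occurrence length rather than by occurrence count.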
Parameter combinations
We tested the following parameter combinations:
• MIDI pitch
• Sampling rate: 16 samples per qn
• Representation:
– normalized pitch signal, wavelet coefficients, wavelet coefficients modulus
• Scale representation at 1 qn
• Segmentation:
– constant segmentation, zero crossings, modulus maxima
• Scale segmentation at 1 and 4 qn
• Threshold for concatenation: 0, 0.1, 1
• Distances:
– city-block, Euclidean, DTW
• Agglomerative clusters from an agglomerative hierarchical cluster tree
• Number of clusters: 7
• Ranking criterion: Sum of the length of occurrences
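To make the size of this search space concrete, the sketch below samples a melody at 16 samples per quarter note and enumerates the listed options. It assumes that every listed option was crossed with every other, which the slide does not state explicitly; `sample_pitch_signal` and `GRID` are hypothetical names, and notes are assumed to be given as (duration in qn, MIDI pitch) pairs without rests.

```python
from itertools import product

SAMPLES_PER_QN = 16  # sampling rate given above

def sample_pitch_signal(notes, samples_per_qn=SAMPLES_PER_QN):
    """Turn (duration_in_qn, midi_pitch) notes into a sampled pitch signal."""
    signal = []
    for duration, pitch in notes:
        signal.extend([pitch] * round(duration * samples_per_qn))
    return signal

# Options that vary; the fixed settings (7 clusters, ranking criterion,
# representation scale of 1 qn) are left out.
GRID = {
    "representation": ["normalized pitch signal", "wavelet coefficients",
                       "wavelet coefficients modulus"],
    "segmentation": ["constant", "zero crossings", "modulus maxima"],
    "segmentation_scale_qn": [1, 4],
    "concatenation_threshold": [0, 0.1, 1],
    "distance": ["city-block", "Euclidean", "DTW"],
}

# Full cross product of the options (an assumption about how they were combined).
combinations = [dict(zip(GRID, values)) for values in product(*GRID.values())]

signal = sample_pitch_signal([(1, 60), (0.5, 62)])
print(len(signal), len(combinations))  # 24 162
```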
Evaluation
• As described at MIREX 2014:Discovery of Repeated
Themes & Sections
– establishment precision, establishment recall, and
establishment F1 score;
– occurrence precision, occurrence recall, and occurrence F1
score;
– three-layer precision, three-layer recall, and three-layer F1
score;
– runtime, first five target proportion and first five precision;
– standard precision, recall, and F1 score;
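Of these measures, the standard precision, recall and F1 score are the simplest to illustrate. The sketch below assumes the usual reading of the task description: a ground-truth pattern counts as discovered only if a returned pattern equals it up to translation, with P = k / n_Q and R = k / n_P. Patterns are represented as lists of (onset, pitch) points, and all function names are hypothetical.

```python
def normalize(pattern):
    """Translate a pattern of (onset, pitch) points so that its first point
    sits at the origin, making translated copies compare equal."""
    points = sorted(pattern)
    onset0, pitch0 = points[0]
    return frozenset((onset - onset0, pitch - pitch0) for onset, pitch in points)

def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def standard_scores(ground_truth, discovered):
    """k = number of ground-truth patterns matched exactly up to translation."""
    found = {normalize(q) for q in discovered}
    k = sum(1 for p in ground_truth if normalize(p) in found)
    precision = k / len(discovered)   # k / n_Q
    recall = k / len(ground_truth)    # k / n_P
    return precision, recall, f1_score(precision, recall)

ground_truth = [
    [(0, 60), (1, 62), (2, 64)],
    [(0, 67), (1, 65)],
]
discovered = [
    [(8, 65), (9, 67), (10, 69)],  # translation of the first ground-truth pattern
    [(4, 60), (5, 60)],
    [(0, 72), (2, 71)],
]
precision, recall, f1 = standard_scores(ground_truth, discovered)
```

In this toy case one of two ground-truth patterns is matched by one of three returned patterns, giving a precision of 1/3 and a recall of 1/2. The same formulas with k = 1, n_P = 3 and n_Q = 7 give P = 0.14, R = 0.33 and F1 = 0.2.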
Results
• On the JKU Patterns Development Database, monophonic version:
– J. S. Bach, Fugue BWV 889,
– Beethoven's Sonata Op. 2, No. 1, Movement 3,
– Chopin's Mazurka Op. 24, No. 4,
– Gibbons's Silver Swan, and
– Mozart's Sonata K.282, Movement 2.
• We selected the best combinations according to
representation and segmentation.
Results
Fig 1. Mean F1 score: mean(F1_est, F1_occ(c=.75), F1_3, F1_occ(c=.5)).
Results
Fig 2. Standard F1 score
Results
Fig 3. Mean Runtime per piece.
Our MIREX Submissions VM1 and VM2
Combinations selected based on
– mean F1 score: mean(F1_est, F1_occ(c=.75), F1_3, F1_occ (c=.5))
– standard F1 score
• VM1 differs from VM2 in the following parameters:
– Normalized pitch signal representation,
– Constant segmentation at the scale of 1 qn,
– Threshold for concatenation 0.1.
• VM2 differs from VM1 in the following parameters:
– Wavelet coefficients representation filtered at the scale of 1 qn
– Modulus maxima segmentation at the scale of 4 qn
– Threshold for concatenation 1
Our MIREX Submissions
Piece | n_P | n_Q | P_est | R_est | F1_est | P_occ(c=.75) | R_occ(c=.75) | F1_occ(c=.75) | P_3 | R_3 | F1_3 | Runtime (s) | FFTP_est | FFP | P_occ(c=.5) | R_occ(c=.5) | F1_occ(c=.5) | P | R | F1
Bach | 3 | 7 | 0.87 | 0.95 | 0.91 | 0.63 | 0.72 | 0.67 | 0.51 | 0.65 | 0.57 | 8.5 | 0.95 | 0.6 | 0.63 | 0.72 | 0.67 | 0.14 | 0.33 | 0.2
Beethoven | 7 | 7 | 0.92 | 0.92 | 0.92 | 0.98 | 0.98 | 0.98 | 0.86 | 0.91 | 0.88 | 31 | 0.76 | 0.8 | 0.89 | 0.93 | 0.91 | 0.57 | 0.57 | 0.57
Chopin | 4 | 7 | 0.53 | 0.86 | 0.66 | 0.66 | 0.86 | 0.75 | 0.48 | 0.7 | 0.57 | 34.2 | 0.68 | 0.47 | 0.46 | 0.83 | 0.6 | 0 | 0 | 0
Gibbons | 8 | 7 | 0.95 | 0.95 | 0.95 | 0.66 | 0.93 | 0.77 | 0.85 | 0.79 | 0.82 | 17.76 | 0.77 | 0.79 | 0.66 | 0.93 | 0.77 | 0.29 | 0.25 | 0.27
Mozart | 9 | 7 | 0.92 | 0.79 | 0.85 | 0.82 | 0.96 | 0.88 | 0.79 | 0.69 | 0.73 | 23.61 | 0.67 | 0.73 | 0.72 | 0.92 | 0.81 | 0.57 | 0.44 | 0.5
mean | 6.2 | 7 | 0.84 | 0.89 | 0.86 | 0.75 | 0.89 | 0.81 | 0.7 | 0.75 | 0.71 | 23.01 | 0.77 | 0.68 | 0.67 | 0.87 | 0.75 | 0.31 | 0.32 | 0.31
SD | 2.59 | 0 | 0.17 | 0.07 | 0.12 | 0.15 | 0.11 | 0.12 | 0.19 | 0.1 | 0.14 | 10.34 | 0.11 | 0.14 | 0.15 | 0.09 | 0.12 | 0.26 | 0.22 | 0.23
Table 1. Results of VM1 on the JKU Patterns Development Database.
Piece | n_P | n_Q | P_est | R_est | F1_est | P_occ(c=.75) | R_occ(c=.75) | F1_occ(c=.75) | P_3 | R_3 | F1_3 | Runtime (s) | FFTP_est | FFP | P_occ(c=.5) | R_occ(c=.5) | F1_occ(c=.5) | P | R | F1
Bach | 3 | 7 | 0.56 | 0.65 | 0.6 | 0.89 | 0.43 | 0.58 | 0.39 | 0.41 | 0.4 | 5.07 | 0.59 | 0.37 | 0.56 | 0.46 | 0.5 | 0 | 0 | 0
Beethoven | 7 | 7 | 0.9 | 0.9 | 0.9 | 0.79 | 0.89 | 0.84 | 0.82 | 0.86 | 0.84 | 5.54 | 0.67 | 0.75 | 0.83 | 0.9 | 0.86 | 0 | 0 | 0
Chopin | 4 | 7 | 0.58 | 0.86 | 0.69 | 0.69 | 0.83 | 0.75 | 0.53 | 0.78 | 0.64 | 5.83 | 0.65 | 0.44 | 0.67 | 0.65 | 0.66 | 0 | 0 | 0
Gibbons | 8 | 7 | 0.92 | 0.88 | 0.9 | 0.79 | 0.84 | 0.82 | 0.81 | 0.73 | 0.77 | 2.22 | 0.7 | 0.76 | 0.72 | 0.69 | 0.71 | 0.14 | 0.13 | 0.13
Mozart | 9 | 7 | 0.83 | 0.71 | 0.77 | 0.93 | 0.93 | 0.93 | 0.77 | 0.63 | 0.69 | 5.7 | 0.56 | 0.68 | 0.84 | 0.88 | 0.86 | 0 | 0 | 0
mean | 6.2 | 7 | 0.76 | 0.8 | 0.77 | 0.82 | 0.78 | 0.78 | 0.66 | 0.68 | 0.67 | 4.87 | 0.63 | 0.6 | 0.72 | 0.71 | 0.72 | 0.03 | 0.03 | 0.03
SD | 2.59 | 0 | 0.17 | 0.11 | 0.13 | 0.09 | 0.2 | 0.13 | 0.19 | 0.17 | 0.17 | 1.51 | 0.06 | 0.18 | 0.12 | 0.18 | 0.15 | 0.06 | 0.06 | 0.06
Table 2. Results of VM2 on the JKU Patterns Development Database.
• Three-layer F1 (χ²(1)=1.8, p=0.1797): no significant difference
• Standard F1 (χ²(1)=4, p=0.045): VM1 preferred
• Runtime (χ²(1)=5, p=0.0253): VM2 preferred
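The slides do not name the statistical test. The reported statistics are, however, consistent with a Friedman test that ranks the two submissions on each of the five pieces, and the sketch below makes that assumption explicit. It does not handle ties, so it covers the three-layer F1 and runtime comparisons but not the standard F1 comparison, where both submissions score 0 on Chopin. The per-piece values are the F1_3 and Runtime columns of Tables 1 and 2; the function name is hypothetical.

```python
import math

def friedman_two_algorithms(a, b):
    """Friedman chi-square statistic and p-value for two algorithms measured
    on the same pieces (assumed test; ties are not handled in this sketch)."""
    if any(x == y for x, y in zip(a, b)):
        raise ValueError("ties are not handled in this sketch")
    n, k = len(a), 2
    # Rank the two values within each piece (1 = smaller, 2 = larger).
    rank_sum_a = sum(1 if x < y else 2 for x, y in zip(a, b))
    rank_sum_b = 3 * n - rank_sum_a
    statistic = (12 / (n * k * (k + 1))
                 * (rank_sum_a ** 2 + rank_sum_b ** 2)
                 - 3 * n * (k + 1))
    statistic = max(statistic, 0.0)  # guard against rounding below zero
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Order of pieces: Bach, Beethoven, Chopin, Gibbons, Mozart.
vm1_runtime = [8.5, 31, 34.2, 17.76, 23.61]
vm2_runtime = [5.07, 5.54, 5.83, 2.22, 5.7]
vm1_f1_3 = [0.57, 0.88, 0.57, 0.82, 0.73]
vm2_f1_3 = [0.4, 0.84, 0.64, 0.77, 0.69]

stat_rt, p_rt = friedman_two_algorithms(vm1_runtime, vm2_runtime)
stat_3l, p_3l = friedman_two_algorithms(vm1_f1_3, vm2_f1_3)
print(round(stat_rt, 2), round(p_rt, 4))
print(round(stat_3l, 2), round(p_3l, 4))
```

Under these assumptions the runtime comparison gives a statistic of 5 (VM2 is faster on all five pieces) and the three-layer F1 comparison gives 1.8 (VM1 is better on four pieces, VM2 on one), in line with the values reported above.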
Example: Bach's Fugue BWV 889 prototypical pattern
Observations
• Of the parameters tested, the segmentation stage
makes the most difference to the results
– In the first stage segmentation
• The size of the scale affects the results for standard
measures and runtimes
– In the first comparison
• Zero-crossings segmentation works best with DTW
• DTW is much more expensive to compute
Observations
• In the comparison (after segmentation), City-block is
dominant
• DTW in the comparison after segmentation is not in
the best combinations
– Maybe because there is no ritardando or accelerando in this
dataset and/or representation
• For standard measures and a smaller segmentation
scale
– Pitch signal works better than wavelet representation
• For non-standard measures and a larger
segmentation scale
– Modulus maxima performs slightly better than zero-crossings and constant segmentation
Conclusions
• Our novel wavelet-based method outperforms the
methods reported by Meredith (2013) and Nieto &
Farbood (2013) on the monophonic version of the
JKU PDD training dataset, scoring higher on
precision, recall and F1 score, and reporting faster
runtimes.
Conclusions
• Of the parameters tested, the segmentation stage makes the
most difference to the results
• A small scale for first stage segmentation should be
preferable for higher values of the standard measures,
and a large scale should be preferable for shorter
runtimes.
• City-block should be preferable after segmentation
References
[1] T. Collins. MIREX 2014 competition: Discovery of repeated themes and
sections, 2014. http://www.music-ir.org/mirex/wiki/2014:Discovery_of_Repeated_Themes_%26_Sections. Accessed on
12 May 2014.
[2] K. Jensen, M. Styczynski, I. Rigoutsos and G. Stephanopoulos. “A generic
motif discovery algorithm for sequential data”, Bioinformatics, 22:1, pp. 21-28, 2006.
[3] D. Meredith. “COSIATEC and SIATECCompress: Pattern discovery by
geometric compression”, Competition on Discovery of Repeated Themes and Sections,
MIREX 2013, Curitiba, Brazil, 2013.
[4] O. Nieto and M. Farbood. “Discovering Musical Patterns Using Audio
Structural Segmentation Techniques”, Competition on Discovery of Repeated Themes
and Sections, MIREX 2013, Curitiba, Brazil, 2013.
[5] G. Velarde, T. Weyde and D. Meredith. “An approach to melodic
segmentation and classification based on filtering with the Haar-wavelet”, Journal of
New Music Research, 42:4, pp. 325-345, 2013.