presentation - Jessica Keup

Jessica Keup
Problem Statement and Goal
• Genetic algorithms (GAs) to create music
– With programmatic fitness, ineffective music
– With human input, fitness bottleneck
• Way to solve fitness bottleneck?
Relevance and Significance
• Creativity/collaboration for musical novices
• Potential solution for GA fitness bottlenecks
• Potential use for crowdsourcing
Introduction - Literature Review - Methodology - Results - Conclusion
Research Question
Q: “When music that is created by a GA trained by a
crowdsourced group is compared to music created by a GA
trained by a small group, is the crowdsourced music more
effective?”
A: Two instances of the same musical GA were run under those
two training conditions, and composers and musical laypeople
reviewed the results. Song effectiveness was about the same
overall.
Computer Music
• Composition, performance, analysis, sound processing, sound production
• Search problem with no optimal solution
• GA suitability
• First, with programmatic fitness only
• Next, with human evaluation as fitness
• Recurring bottleneck problem
Fitness Bottleneck and Workarounds
• Identified the problem:
– GenJam – Biles (1994)
– Audioserve - Yee-King (2000)
– SBEAT3 - Unemi (2002)
• Attempted a solution:
– Constructive Adaptive User Interface (CAUI) - Legaspi et al. (2007)
– Gartland-Jones and Copley (2003)
– Unehara and Onisawa (2003)
• Composition, Feedback, and Evolution Framework – Fu et al. (2009)
Crowdsourcing
• Outsourcing to collective online intelligence
• Pros
– around-the-clock
– inexpensive
– fast
– wisdom of crowd
• Cons
– untrustworthiness
– lack of skill
– ethics of outsourcing
• Marketplaces such as
Darwin Tunes
• Crowdsourced compositional GA – MacCallum and Leroi
• Evolectronica: Survival of the Funkiest
• 641 generations of evolution
• Not mTurk, not a formalized study
Music Information Retrieval Evaluation eXchange (MIREX)
• Urbano, Morato, Marrero, & Martin (2010) used mTurk
• Crowdsourced ratings of music similarity
• expert-level results on 2,810 rankings for $70.25
GA choice - Melodycomposition
• Considered code from VARIATIONS, master’s thesis, Spieldose, and CAUI
• Melodycomposition – Craane on code.google.com
• Uses Java Genetic Algorithms Package (JGAP)
[F#:7:QUARTER]
[A#:4:QUARTER]
[F#:6:EIGTH]
• Modifications:
– 2 melodies (SA)
– Additional fitness
– Interaction with mTurk
– Removal of GUI
– Database persistence
– # generations (11 & 200)
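The bracketed note tokens shown on this slide can be read mechanically. The following is a minimal Python sketch, not code from melodycomposition or JGAP: the names `Note` and `parse_note` are hypothetical, and treating the middle field as an octave number is an assumption. The duration label is kept verbatim, including the spelling that appears on the slide.

```python
import re
from typing import NamedTuple


class Note(NamedTuple):
    pitch: str     # pitch name, e.g. "F#"
    octave: int    # ASSUMPTION: the middle field is an octave number
    duration: str  # duration label, kept exactly as written in the token


# Matches tokens shaped like [F#:7:QUARTER]
NOTE_PATTERN = re.compile(r"\[([A-G][#b]?):(\d+):([A-Z_]+)\]")


def parse_note(token: str) -> Note:
    """Split one bracketed token into its three fields."""
    match = NOTE_PATTERN.fullmatch(token.strip())
    if match is None:
        raise ValueError(f"unrecognized note token: {token!r}")
    pitch, octave, duration = match.groups()
    return Note(pitch, int(octave), duration)


melody = [parse_note(t) for t in ("[F#:7:QUARTER]", "[A#:4:QUARTER]", "[F#:6:EIGTH]")]
```

A structured form like this would make it easier to apply rule-based checks to a melody before or alongside human ratings.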
Genre and Programmatic Fitness
• Chorale-like genre
– Instrumental
– 2-part (soprano/bass)
• List of fitness guidelines in addition to human ratings
– After Large Skip
– Consecutive Skips
– Global Pitch Distribution
– Interval
– Parallel Motion
– Proportion Notes/Rests
– Range
– Repeating Notes
– Scale
– Strong Beats
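To illustrate how guidelines like these might combine with human ratings, here is a rough Python sketch of two of them (Range and Repeating Notes) blended with a rating. It is not the fitness code used in the study: the function names, the thresholds, the equal weighting, and the use of MIDI-style pitch numbers are all assumptions made for illustration.

```python
from typing import Sequence


def range_score(pitches: Sequence[int], max_span: int = 12) -> float:
    """1.0 if the melody stays within max_span semitones, lower as it exceeds it."""
    if not pitches:
        return 0.0
    span = max(pitches) - min(pitches)
    return 1.0 if span <= max_span else max_span / span


def repeating_notes_score(pitches: Sequence[int], max_run: int = 3) -> float:
    """Penalize each note that extends a run of identical pitches past max_run."""
    if not pitches:
        return 0.0
    penalized = 0
    run = 1
    for prev, cur in zip(pitches, pitches[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_run:
            penalized += 1
    return 1.0 - penalized / len(pitches)


def combined_fitness(pitches: Sequence[int], human_rating: float,
                     rule_weight: float = 0.5) -> float:
    """Blend the rule average with a human rating in [0, 1] (hypothetical weighting)."""
    rules = (range_score(pitches) + repeating_notes_score(pitches)) / 2
    return rule_weight * rules + (1 - rule_weight) * human_rating
```

Keeping each guideline as its own function would make it straightforward to adjust individual rule weights later.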
Prototype and Task Setup
• Modification of melodycomposition
• Interaction with mTurk Java API
• Webpage for participants, with php and JavaScript to appear on mTurk
• MySQL database and Ubuntu server
• IRB approval from Nova
• IRB approval from ETSU
• Cycle: Generate songs → Post mTurk HITs → Send results to GA → Calculate fitness → Selection and mutation
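The cycle of generating songs, collecting human ratings, computing fitness, and applying selection and mutation could be sketched roughly as follows. This is a generic Python illustration, not the modified melodycomposition code: `evolve`, `rate_songs`, and `mutate` are hypothetical names, and `rate_songs` stands in for posting HITs to mTurk and waiting for results.

```python
import random


def evolve(population, rate_songs, mutate, generations, keep=0.5, rng=None):
    """Run a simple rating-driven GA loop and return the final population."""
    rng = rng or random.Random(0)
    for _ in range(generations):
        # Stand-in for "post HITs, collect results, calculate fitness"
        ratings = rate_songs(population)
        ranked = [song for _, song in
                  sorted(zip(ratings, population), key=lambda pair: pair[0], reverse=True)]
        # Selection: keep the best-rated fraction unchanged
        survivors = ranked[: max(1, int(len(ranked) * keep))]
        # Mutation: refill the population with altered copies of survivors
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(len(population) - len(survivors))]
        population = survivors + children
    return population


# Toy stand-ins so the loop can run without any human raters
def toy_rating(songs):
    return [-abs(sum(song) - 30) for song in songs]


def toy_mutate(song, rng):
    child = list(song)
    child[rng.randrange(len(child))] += rng.choice((-1, 1))
    return child


start = [[i, i, i, i] for i in range(6)]
final = evolve(start, toy_rating, toy_mutate, generations=11)
```

Because every call to `rate_songs` would mean waiting on people, the number of generations is limited by rater time rather than by computation.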
Training GAs
                  Control   Test
Generations       11        200
Participants      11        154*
Listening Tasks   275       5,000
Songs             825       15,000
• Recruitment
• Consent
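The training figures on this slide imply some simple ratios. The short Python check below uses only the reported numbers. The dictionary keys and the `ratios` helper are hypothetical names, and reading the ratios as "tasks per generation" and "songs per task" is an interpretation rather than something the slide states.

```python
conditions = {
    "control": {"generations": 11, "listening_tasks": 275, "songs": 825},
    "test": {"generations": 200, "listening_tasks": 5000, "songs": 15000},
}


def ratios(condition):
    """Return (listening tasks per generation, songs per listening task)."""
    tasks_per_generation = condition["listening_tasks"] / condition["generations"]
    songs_per_task = condition["songs"] / condition["listening_tasks"]
    return tasks_per_generation, songs_per_task


for name, condition in conditions.items():
    per_generation, per_task = ratios(condition)
    print(f"{name}: {per_generation:g} tasks per generation, {per_task:g} songs per task")
```

Both conditions work out to 25 listening tasks per generation and 3 songs per task, so the two GAs appear to differ in number of generations and raters rather than in per-generation workload.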
Evaluation by Reviewers and Composers
               Reviewers   Composers
Participants   8           8
Songs          10          10
• Steps: Recruitment, Consent, Instructions, Ratings, Questions
• Ratings
– Reviewers: Like? Artistically Effective? Similar?
– Composers: Interesting? Creative? Artistically Effective? Chorale-like?
• Questions
– Reviewers: What emotions? What was memorable?
– Composers: What was memorable? What were shortcomings?
Music
• Small control group songs: 1, 2, 3, 4, 5
– Composers said: randomness, dissonance, lack of coherence, lack of shape, and atonality
– Reviewers said: curiosity, suspense, dissonance, ballet, storytelling, syncopation, mystery, anxiety, awkward rhythms, and too much distance between the bass and soprano
• Large test group songs: 6, 7, 8, 9, 10
– Reviewers said: darkness, lack of flow, mystery, curiosity, happiness, ballads, major 3rds, and the need for tempo variance
– Composers said: atonal, contrapuntal, too sustained, dissonance, randomness, and lack of shape
Difference between Reviewers’/Composers’ Test minus Control Effectiveness
t test
            N    Mean   St. Dev.   Min     Q1      Median   Q3     Max
Combined    16   .015   6.19       -8.00   -3.75   -0.67    0.75   19.00
Reviewers   8    .013   3.77       -8.00   -4.00   -2.50    1.75   19.00
Composers   8    .017   8.24       -4.67   -1.17   -0.33    0.00   8.67
Combined Reviewer Ratings of All Music
Combined Composer Ratings of All Music
Reviewers’/Composers’ Artistic Effectiveness Ratings
Paired t test
             N    Mean    St.Dev.   SE Mean
Reviewers    10   35.50   4.65      1.47
Composers    10   26.60   4.67      1.48
Difference   10   8.90    4.65      1.47
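The SE Mean column follows from the reported standard deviations and N. The Python sketch below recomputes it and derives a paired t statistic from the rounded summary values. The slide itself does not report t, so the resulting figure is only an illustration under rounding, and the function names are hypothetical.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)


def paired_t(mean_diff: float, sd_diff: float, n: int) -> float:
    """Paired t statistic: mean difference divided by its standard error."""
    return mean_diff / standard_error(sd_diff, n)


se_reviewers = standard_error(4.65, 10)
se_composers = standard_error(4.67, 10)
t_difference = paired_t(8.90, 4.65, 10)  # degrees of freedom would be n - 1 = 9

print(f"SE reviewers: {se_reviewers:.2f}")
print(f"SE composers: {se_composers:.2f}")
print(f"t for the difference: {t_difference:.2f}")
```

With these inputs the recomputed standard errors agree with the SE Mean values on the slide to two decimal places.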
Implications
• Test music slightly better overall, but not statistically significant
• Null hypothesis not rejected
Recommendations
• Fine-tune rules in programmatic fitness function
• Change rule weights
• Avoid premature convergence (mutation rate?)
• Compare to 200 generations of programmatic fitness only
• Use TurKit
• Use preference judgments instead of best/middle/worst
• Use voting or limit HITs to one-per-worker