Jessica Keup

Problem Statement and Goal
• Genetic algorithms (GAs) to create music
  – With programmatic fitness, ineffective music
  – With human input, fitness bottleneck
• Way to solve fitness bottleneck?

Relevance and Significance
• Creativity/collaboration for musical novices
• Potential solution for GA fitness bottlenecks
• Potential use for crowdsourcing

Introduction - Literature Review - Methodology - Results - Conclusion

Research Question
Q: “When music that is created by a GA trained by a crowdsourced group is compared to music created by a GA trained by a small group, is the crowdsourced music more effective?”
A: Two instances of the same musical GA were run under those two training conditions, and composers and musical laypeople reviewed the results. Song effectiveness was about the same overall.

Computer Music
• Composition, performance, analysis, sound processing, sound production
• Search problem with no optimal solution
• GA suitability
• First, with programmatic fitness only
• Next, with human evaluation as fitness
• Recurring bottleneck problem

Fitness Bottleneck and Workarounds
• GenJam - Biles (1994)
• Audioserve - Yee-King (2000)
• SBEAT3 - Unemi (2002) identified the problem
• Constructive Adaptive User Interface (CAUI) - Legaspi et al. (2007)
• Gartland-Jones and Copley (2003)
• Unehara and Onisawa (2003) attempted a solution
• Composition, Feedback, and Evolution Framework - Fu et al. (2009)

Crowdsourcing
• Outsourcing to collective online intelligence
• Pros
  – around-the-clock
  – inexpensive
  – fast
  – wisdom of crowd
• Cons
  – untrustworthiness
  – lack of skill
  – ethics of outsourcing
• Marketplaces such as

Darwin Tunes
• Crowdsourced compositional GA - MacCallum and Leroi
• Evolectronica: Survival of the Funkiest
• 641 generations of evolution
• Not mTurk, not a formalized study

Music Information Retrieval Evaluation eXchange (MIREX)
• Urbano, Morato, Marrero, & Martin (2010) used mTurk
• Crowdsourced ratings of music similarity
• Expert-level results on 2,810 rankings for $70.25

GA choice - Melodycomposition
• Considered code from VARIATIONS, master’s thesis, Spieldose, and CAUI
• Melodycomposition - Craane on code.google.com
• Uses Java Genetic Algorithms Package (JGAP)
  [F#:7:QUARTER] [A#:4:QUARTER] [F#:6:EIGTH]
• Modifications:
  – 2 melodies (SA)
  – Additional fitness
  – Interaction with mTurk
  – Removal of GUI
  – Database persistence
  – # generations (11 & 200)

Genre and Programmatic Fitness
• Chorale-like genre
  – Instrumental
  – 2-part (soprano/bass)
• List of fitness guidelines in addition to human ratings
  – After Large Skip
  – Consecutive Skips
  – Global Pitch Distribution
  – Interval
  – Parallel Motion
  – Proportion Notes/Rests
  – Range
  – Repeating Notes
  – Scale
  – Strong Beats

Prototype and Task Setup
• Modification of melodycomposition
• Interaction with mTurk Java API
• Webpage for participants, with PHP and JavaScript, to appear on mTurk
• MySQL database and Ubuntu server
• IRB approval from Nova
• IRB approval from ETSU
[Diagram of the GA training cycle; boxes: Generate songs, Selection and mutation, Calculate fitness, Post mTurk HITs, Send results to GA]

Training GAs
                   Control   Test
Generations        11        200
Participants       11        154*
Listening Tasks    275       5,000
Songs              825       15,000
Recruitment, Consent

Evaluation by Reviewers and Composers
                   Reviewers   Composers
Participants       8           8
Songs              10          10
Recruitment, Consent, Instructions
Ratings: Like? Artistically Effective? Similar? Interesting? Creative? Artistically Effective? Chorale-like?
Questions: What emotions? What was memorable? What was memorable? What were shortcomings?

Music
• Small control group songs: 1 2 3 4 5
  – Composers said: randomness, dissonance, lack of coherence, lack of shape, and atonality
  – Reviewers said: curiosity, suspense, dissonance, ballet, storytelling, syncopation, mystery, anxiety, awkward rhythms, and too much distance between the bass and soprano
• Large test group songs: 6 7 8 9 10
  – Reviewers said: darkness, lack of flow, mystery, curiosity, happiness, ballads, major 3rds, and the need for tempo variance
  – Composers said: atonal, contrapuntal, too sustained, dissonance, randomness, and lack of shape

Difference between Reviewers’/Composers’ Test minus Control Effectiveness
t test
             N    Mean   St. Dev.   Min     Q1      Median   Q3     Max
Combined     16   .015   6.19       -8.00   -3.75   -0.67    0.75   19.00
Reviewers    8    .013   3.77       -8.00   -4.00   -2.50    1.75   19.00
Composers    8    .017   8.24       -4.67   -1.17   -0.33    0.00   8.67

[Chart: Combined Reviewer Ratings of All Music]

[Chart: Combined Composer Ratings of All Music]

Reviewers’/Composers’ Artistic Effectiveness Ratings
Paired t test
             N    Mean    St. Dev.   SE Mean
Reviewers    10   35.50   4.65       1.47
Composers    10   26.60   4.67       1.48
Difference   10   8.90    4.65       1.47

Implications
• Test music slightly better overall, but not statistically significant
• Null hypothesis not rejected

Recommendations
• Fine-tune rules in programmatic fitness function
• Change rule weights
• Avoid premature convergence (mutation rate?)
• Compare to 200 generations of programmatic fitness only
• Use TurKit
• Use preference judgments instead of best/middle/worst
• Use voting or limit HITs to one per worker
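
Sketch: blending rule-based and human fitness
The slides combine programmatic fitness guidelines with human ratings and recommend changing the rule weights, but they do not say how the two were combined. The Python sketch below is one possible reading, not the study's Java/JGAP code. Assumptions: genes read as [pitch:octave:duration]; the two rules are simplified stand-ins for "After Large Skip" and "Repeating Notes"; the best/middle/worst labels map to 1.0/0.5/0.0; and the names Note, parse_melody, skip_score, repeat_score, and fitness are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass

# Semitones above C for each pitch name (sharps only, as in the slide's genes).
PITCH_CLASS = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
               "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

# Assumed mapping from the best/middle/worst labels to a 0..1 score.
HUMAN_SCORE = {"best": 1.0, "middle": 0.5, "worst": 0.0}


@dataclass(frozen=True)
class Note:
    pitch: str     # e.g. "F#"
    octave: int    # e.g. 7
    duration: str  # e.g. "QUARTER" (kept as an opaque label here)

    def midi(self) -> int:
        """MIDI-style number, so intervals can be measured in semitones."""
        return 12 * (self.octave + 1) + PITCH_CLASS[self.pitch]


def parse_melody(text: str) -> list[Note]:
    """Parse genes written like "[F#:7:QUARTER] [A#:4:QUARTER]"."""
    notes = []
    for gene in text.split():
        pitch, octave, duration = gene.strip("[]").split(":")
        notes.append(Note(pitch, int(octave), duration))
    return notes


def _intervals(notes: list[Note]) -> list[int]:
    """Absolute size of each step between consecutive notes, in semitones."""
    return [abs(b.midi() - a.midi()) for a, b in zip(notes, notes[1:])]


def skip_score(notes: list[Note], max_skip: int = 7) -> float:
    """Share of steps no larger than max_skip semitones (1.0 = no large skips)."""
    steps = _intervals(notes)
    return sum(s <= max_skip for s in steps) / len(steps) if steps else 1.0


def repeat_score(notes: list[Note]) -> float:
    """Share of steps that change pitch (1.0 = no immediately repeated notes)."""
    steps = _intervals(notes)
    return sum(s != 0 for s in steps) / len(steps) if steps else 1.0


def fitness(notes: list[Note], human_label: str,
            weights: dict[str, float], human_weight: float = 0.5) -> float:
    """Weighted average of rule scores, blended with the human rating."""
    rules = {"skip": skip_score(notes), "repeat": repeat_score(notes)}
    total = sum(weights[name] for name in rules)
    programmatic = sum(weights[name] * score for name, score in rules.items()) / total
    return human_weight * HUMAN_SCORE[human_label] + (1 - human_weight) * programmatic


if __name__ == "__main__":
    melody = parse_melody("[F#:7:QUARTER] [A#:4:QUARTER] [F#:6:EIGTH]")
    print(fitness(melody, "best", {"skip": 1.0, "repeat": 1.0}))
```

Under these assumptions, changing a rule's weight shifts only the programmatic half of the score, so the effect of re-weighting can be examined without collecting new human ratings.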