Songsmith: Using Machine Learning to Help People Make Music

Dan Morris, Ian Simon, Sumit Basu,
and the MSR Advanced Development Team
Microsoft Research
Computational User Experiences (CUE) group
In general: HCI + (sensors, devices, machine learning, health, physiology)
Using physiological signals for input
Health and wellness
Creativity support tools

What is Songsmith?
You, singing...
Your music
“Automatic accompaniment generation for vocal melodies”
Songsmith
High-Risk Live Demo
(What other kind of live demo is
there?)
Who is Songsmith for?
Today’s Talk





Overview and demo
How Songsmith works
Exposing machine learning parameters
What are people doing with Songsmith?
Creativity support tools @ Microsoft Research
Songsmith: 5,000-Foot Overview
[Figure labels: chords G, Amin, C, Daug; pitch classes C, C#, D, D#, E, F, F#, G, G#, A, A#, B]
Chords from Melody

Songsmith’s core: Hidden Markov Model
[Diagram: Song start → Chord 1 → Chord 2 → Chord 3 → Chord 4 → Song end]
Hidden states (chords)
Observations (note vectors)
HMMs in 5 Minutes or Less
What does an HMM do for me?
What does an HMM want from me?
States
Observations
HMMs in 5 Minutes or Less
Possible states:
C major
C minor
C diminished
…
Transition probabilities:
P(C Major | A Minor)?
P(C Major | D Major)?
…
P(F# Major | F# Major)?
Observation probabilities:
P([melody snippet] | A Minor)?
P([melody snippet] | D Major)?
…
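For reference, the pieces listed on this slide fit together in the standard HMM factorization below. This is the textbook form rather than something spelled out on the slide; it writes c_i for the hidden chord at position i and x_i for the observed note vector, and treats P(c_1) as the probability of the first chord following the "song start" state.

```latex
P(c_{1:n}, x_{1:n}) \;=\; P(c_1) \prod_{i=2}^{n} P(c_i \mid c_{i-1}) \; \prod_{i=1}^{n} P(x_i \mid c_i)
```

The first product uses the transition probabilities and the second uses the observation probabilities.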
Building our HMM
Things my HMM wants from me:
Possible states
Transition probabilities
Observation probabilities
Observations
Finding the Probabilities
Data-driven
Not heuristic-driven

Training Data
~300 lead sheets (vocal melodies with chords)
Processing the Database
Convert all chords to five basic triads
Transpose every song into the same key
Count the number of transitions from every chord to every other chord (transition probabilities):

            C Major   C Minor   C# Major   C# Minor   D Major   D Minor   …
C Major       472        22        …          …          …         …      …
C Minor         9       314        …          …          …         …      …
C# Major        …         …        …          …          …         …      …
…
(Further counts visible on the slide, positions not recoverable: 35, 50, 76, 189, 0, 44, 39, 71.)

Count the total duration of each melody note occurring while each chord is playing (observation probabilities):

Notes played over C Major:
C   C#   D   D#   E   F   F#   G   G#   A   A#   B
(Values visible on the slide, positions not recoverable: 0.2, 0.15, 0.1, 0.25, 0.05, 0.)
…
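The two counting steps on this slide can be sketched roughly as follows. This is a minimal illustration, not Songsmith's code: it assumes each song has already been reduced to basic triads and transposed into a common key, and the data layout (a song as a list of `(chord, notes)` pairs, notes as `(pitch_class, duration)`), the `START`/`END` markers, and the toy songs are all hypothetical.

```python
from collections import defaultdict

def count_statistics(songs):
    """Count chord-to-chord transitions and per-chord melody note durations.

    songs: list of songs; each song is a list of (chord, notes) pairs, and
    notes is a list of (pitch_class, duration) with pitch_class 0 = C.
    """
    transition_counts = defaultdict(lambda: defaultdict(float))
    note_durations = defaultdict(lambda: [0.0] * 12)
    for song in songs:
        previous_chord = "START"
        for chord, notes in song:
            transition_counts[previous_chord][chord] += 1.0
            for pitch_class, duration in notes:
                note_durations[chord][pitch_class % 12] += duration
            previous_chord = chord
        transition_counts[previous_chord]["END"] += 1.0
    return transition_counts, note_durations

def to_probabilities(transition_counts, note_durations):
    """Normalize each row of counts so that it sums to 1."""
    transition_probs = {}
    for prev, row in transition_counts.items():
        total = sum(row.values())
        transition_probs[prev] = {nxt: n / total for nxt, n in row.items()}
    observation_probs = {}
    for chord, hist in note_durations.items():
        total = sum(hist)
        observation_probs[chord] = [d / total for d in hist] if total else hist[:]
    return transition_probs, observation_probs

# Hypothetical toy database of two very short songs.
toy_songs = [
    [("C Major", [(0, 2.0), (4, 2.0)]),
     ("G Major", [(7, 3.0), (2, 1.0)]),
     ("C Major", [(0, 4.0)])],
    [("C Major", [(4, 4.0)]),
     ("A Minor", [(9, 2.0), (0, 2.0)])],
]
counts, durations = count_statistics(toy_songs)
transition_probs, observation_probs = to_probabilities(counts, durations)
```

A real database would also need smoothing so that chord pairs never seen in training do not end up with probability zero; that step is left out here.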
Observations: what did the user sing?
Input:
Hard!
Building a Pitch Histogram
FFT
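One way to read "FFT, then pitch histogram" is sketched below. It is an assumption-heavy toy, not Songsmith's pitch tracker: it takes the strongest frequency bin in each frame as the sung pitch, folds that pitch onto the twelve pitch classes, and counts frames. A plain DFT stands in for the FFT so the sketch needs no third-party library, and the function names, frame size, and test tones are all hypothetical.

```python
import cmath
import math

def dominant_frequency(frame, sample_rate):
    """Return the frequency (Hz) of the strongest DFT bin, ignoring DC."""
    n = len(frame)
    best_bin, best_power = 1, 0.0
    for k in range(1, n // 2):
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate / n

def pitch_class(frequency_hz):
    """Map a frequency to a pitch class 0..11 (0 = C), using A4 = 440 Hz."""
    midi_note = round(69 + 12 * math.log2(frequency_hz / 440.0))
    return midi_note % 12

def pitch_histogram(samples, sample_rate, frame_size=256):
    """Fraction of frames whose dominant pitch falls in each pitch class."""
    histogram = [0.0] * 12
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        histogram[pitch_class(dominant_frequency(frame, sample_rate))] += 1.0
    total = sum(histogram)
    return [h / total for h in histogram] if total else histogram

# Synthetic input: two frames of A (440 Hz), then two frames of C (523.25 Hz).
SAMPLE_RATE = 8000
tone_a = [math.sin(2 * math.pi * 440.0 * t / SAMPLE_RATE) for t in range(512)]
tone_c = [math.sin(2 * math.pi * 523.25 * t / SAMPLE_RATE) for t in range(512)]
histogram = pitch_histogram(tone_a + tone_c, SAMPLE_RATE)
```

Real singing has strong harmonics, vibrato, and noise, so picking the loudest bin is much cruder than what a working system would need, which is presumably why the slide says "Hard!".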
Building our HMM

Things my HMM wants from me:




Possible states
Transition probabilities
Observation probabilities
Observations

Run Viterbi algorithm, get “best” sequence of chords… thank you, HMM!
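For readers who have not seen it, the Viterbi step can be sketched as below. This is the generic textbook algorithm in log space, not Songsmith's implementation; the two-chord model and all of its probabilities are made-up numbers for illustration.

```python
import math

def viterbi(observation_log_probs, transition_log_probs, start_log_probs):
    """Return the most likely state sequence as a list of state indices.

    observation_log_probs[t][s] = log P(x_t | state s)
    transition_log_probs[p][s]  = log P(state s | previous state p)
    start_log_probs[s]          = log P(first state is s)
    """
    n_states = len(start_log_probs)
    scores = [start_log_probs[s] + observation_log_probs[0][s]
              for s in range(n_states)]
    backpointers = []
    for t in range(1, len(observation_log_probs)):
        new_scores, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda p: scores[p] + transition_log_probs[p][s])
            pointers.append(best_prev)
            new_scores.append(scores[best_prev]
                              + transition_log_probs[best_prev][s]
                              + observation_log_probs[t][s])
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def logs(values):
    return [math.log(v) for v in values]

# Hypothetical two-chord model.
chords = ["C Major", "A Minor"]
start = logs([0.5, 0.5])
transitions = [logs([0.7, 0.3]), logs([0.4, 0.6])]
observations = [logs([0.9, 0.1]), logs([0.8, 0.2]), logs([0.1, 0.9])]
best_path = [chords[s] for s in viterbi(observations, transitions, start)]
```

With these numbers the melody's last segment fits A Minor so much better that the best path ends there despite the preference for staying on C Major.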

One hitch: key determination…
Today’s Talk





Overview and demo
How Songsmith works
Exposing machine learning parameters
What are people doing with Songsmith?
Creativity support tools @ Microsoft Research
So we can choose optimal chords... so what?
My demo took 10 seconds… is that fun?
What would a songwriter do next?
How can we build creative exploration into a learning-driven system?

UI: Exposing Learning Parameters
There are always hard-coded “magic numbers” in machine learning
Machine learning also uses lots of learned parameters
Can we let users control those numbers?

A Bad User Interface
Songsmith: A fun way to make music
(if you have a PhD in math and/or computer science)
[Mock-up: two grids of raw numbers over pitch classes C, C#, D, D#, E, …, one labeled “Transition matrix (edit me!)” and one labeled “Expected pitch histograms (edit me!)”]
Observation weight:
Conjugate prior:
Frequency smoothing:
UI: Exposing Learning Parameters

Can we let users intuitively control those numbers?
Exposing Model Parameters
in Songsmith
The “Happy Factor”
Happy Factor: Implementation
Partition database into two databases (major and minor songs) using clustering
Build separate transition probability matrix for each database
When we actually run our HMM, blend the two transition matrices together according to user input…

The “Happy Factor”
$0 \le \alpha \le 1$
$\log P(c_i \mid c_{i-1}) = \alpha \log P_{\mathrm{maj}}(c_i \mid c_{i-1}) + (1 - \alpha) \log P_{\mathrm{min}}(c_i \mid c_{i-1})$
Bonus question: what’s wrong with this equation?
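The blend on this slide can be sketched as below, with the slider value called `alpha` here. The function name, the `renormalize` flag, and the two tiny matrices are assumptions for illustration, and the sketch assumes every probability is strictly positive (for example after smoothing) so that the logarithms are defined.

```python
import math

def blend_transitions(p_major, p_minor, alpha, renormalize=True):
    """Blend two transition matrices in log space.

    p_major[i][j] and p_minor[i][j] are P(chord j | chord i) learned from the
    major and minor song databases. alpha = 1 gives the major matrix and
    alpha = 0 gives the minor matrix.
    """
    blended = []
    for row_major, row_minor in zip(p_major, p_minor):
        row = [math.exp(alpha * math.log(a) + (1.0 - alpha) * math.log(b))
               for a, b in zip(row_major, row_minor)]
        if renormalize:
            total = sum(row)
            row = [value / total for value in row]
        blended.append(row)
    return blended

# Hypothetical two-chord matrices.
p_major = [[0.9, 0.1], [0.6, 0.4]]
p_minor = [[0.1, 0.9], [0.3, 0.7]]
halfway_raw = blend_transitions(p_major, p_minor, 0.5, renormalize=False)
halfway = blend_transitions(p_major, p_minor, 0.5)
```

Without renormalization, the first row of `halfway_raw` sums to 0.6 rather than 1, so a weighted sum of log probabilities is not itself a log probability distribution. That may be what the bonus question is hinting at, though the slide does not say.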
Another “Happy Factor” Example
Happy
Sad
The “Jazz Factor”
Jazz Factor: Implementation

When running our HMM, we need to make chords match the voice and each other

Computing how well each chord fits at a given position:
k • log( P(this chord | what the user sang) ) + (1-k) • log( P(this chord | the previous chord) )

Just put k on a slider!
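A small sketch of what moving that slider does: the same two candidate chords are scored with a high and a low k, and the winner flips. The function names and the candidate probabilities are made up for illustration.

```python
import math

def chord_fit(p_melody, p_transition, k):
    """Weighted log score from the slide: k weights the melody term and
    (1 - k) weights the previous-chord term."""
    return k * math.log(p_melody) + (1.0 - k) * math.log(p_transition)

def best_chord(candidates, k):
    """candidates maps chord name -> (melody probability, transition probability)."""
    return max(candidates, key=lambda name: chord_fit(*candidates[name], k))

# Hypothetical candidates: one fits the voice, the other fits the previous chord.
candidates = {
    "fits the melody": (0.6, 0.1),
    "fits the progression": (0.3, 0.5),
}
melody_heavy = best_chord(candidates, k=0.9)
progression_heavy = best_chord(candidates, k=0.1)
```

With k = 0.9 the chord that matches the sung notes wins; with k = 0.1 the chord that follows more naturally from the previous chord wins.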
Chord Locking

Global sliders are very coarse

Chords can be “locked” by the user
1
P ci | ci 1   
0
ci  Clock
ci  Clock
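In log space, the locking rule becomes "score 0 for the locked chord, negative infinity for everything else". The sketch below applies that to a single position; the function names and the three-chord scores are hypothetical, and a full system would apply the same override inside the Viterbi pass.

```python
import math

def locked_transition_log_prob(chord, locked_chord, unlocked_log_prob):
    """Return log P(chord | previous chord), overridden when a lock is set.

    With no lock the learned value is used. With a lock, the locked chord
    gets log(1) = 0 and every other chord gets log(0) = -infinity.
    """
    if locked_chord is None:
        return unlocked_log_prob
    return 0.0 if chord == locked_chord else -math.inf

def pick_chord(observation_log_probs, transition_log_probs, locked_chord=None):
    """Pick the best chord for one position from per-chord log scores."""
    def score(chord):
        return observation_log_probs[chord] + locked_transition_log_prob(
            chord, locked_chord, transition_log_probs[chord])
    return max(observation_log_probs, key=score)

# Hypothetical per-chord scores for one position.
observation_log_probs = {"C Major": math.log(0.6),
                         "F Major": math.log(0.3),
                         "G Major": math.log(0.1)}
transition_log_probs = {"C Major": math.log(0.5),
                        "F Major": math.log(0.2),
                        "G Major": math.log(0.3)}
free_choice = pick_chord(observation_log_probs, transition_log_probs)
locked_choice = pick_chord(observation_log_probs, transition_log_probs,
                           locked_chord="F Major")
```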
“Suggested Chords”
Songwriters will often explore “chord substitutions”…
…but we’re assuming our audience doesn’t know that much music theory…

Expose suboptimal marginal probabilities at each node as “suggestions”
$$L_i = k \log P(x_i \mid c_i) + (1 - k) \log P(c_i \mid c_{i-1}) + (1 - k) \log P(c_{i+1} \mid c_i)$$
Interactive Machine Learning

Songsmith is one example of IML
Roughly: moving what used to be in the domain of ML experts into users’ hands

Related work: image classification
Fogarty et al., CHI 2008: CueFlik
Fails and Olsen, IUI 2003: Crayons

Why?
Harness end-user knowledge
Use ML as a tool for data exploration
Use ML as a tool for creative expression
Today’s Talk





Overview and demo
How Songsmith works
Exposing machine learning parameters
What are people doing with Songsmith?
Creativity support tools @ Microsoft Research
Dynamic Mapping of Physical Controls for Tabletop Groupware
(Fiebrink, CHI 2009)
One project, two problems:
1. Direct-touch, tabletop input is great for collaboration… but suffers from serious precision issues.
2. Working on music alone is boring.
Dynamic Mapping of Physical Controls for Tabletop Groupware
(Fiebrink, CHI 2009)
Incorporate high-precision controllers into a tabletop environment
Evaluate in a collaborative audio-editing app

Data-Driven Exploration of Musical Chord Sequences
(Nichols, IUI 2009)
Problem: let people create and explore music by moving around in a reduced-dimensionality space
Genres, artists make for intuitive labels

Data-Driven Exploration of Musical Chord Sequences
(Nichols, IUI 2009)

Why isn’t this easy?

Solution(s):
Divergence-maximizing clustering, PCA
Future work?

Make Songsmith freakin’ amazing, and port it to a bajillion platforms, and make an amazing community Web site, and build it into audio hosts… etc…

Other applications of machine learning in creativity support tools…
Writing? Painting? Web Design? CAD? Music?
Songsmith
research.microsoft.com/songsmith
dan@microsoft.com
Live/Google: songsmith