Online Pen-Based Music Symbol Recognition
Kian-Chin Lee, Somnuk Phon-Amnuaisuk
Music Informatics Research Group,
Centre for Artificial Intelligence and Intelligent Computing
Multimedia University, Cyberjaya, Malaysia.
{kclee, somnuk.amnuaisuk}@mmu.edu.my
Abstract
This study proposes freehand input and editing of
musical notes using digital ink. As the user writes, pen
strokes are captured and preprocessed as primitives, and
then categorized and recognized using Hidden Markov
models. The recognized strokes are then clustered into
meaningful musical notation through structural analysis.
Motivation and Objective
The objective of this study, thus, is to propose a system that
would allow a user to write musical notes with an electronic
pen or a stylus on a touch-screen monitor or a digital
writing pad, in a manner similar to writing on paper with a
pen or a pencil. The system will translate the digital ink
pattern into the correct note representation.
The benefits of doing so are manifold. Since the proposed
system is pen-based and totally freehand, it will enable
music composers to write music scores electronically in
exactly the same way as they write music on paper. Users
need not learn any special gestures for input. This gives them
a great amount of freedom and flexibility, so they
can concentrate on the process of composing music itself
rather than being confined by the many rules imposed by
other input methods.
Other extended uses of this system include instant playback
of a musical composition when linked to a music
performance system, remote learning of music composition,
and music composition on mobile devices.
Summary of Literature Survey
The main methods of musical notation input can be
grouped into these three categories:
(i) Non-notation based: Music is entered into a computer
from an electronic instrument connected to the
computer through the MIDI interface.
(ii) Off-line based (OMR, optical music recognition):
Music sheets are scanned optically and the musical
information is extracted later using various techniques
or procedures such as image processing and off-line
symbol recognition.
(iii) Pen-based: There are three main approaches to using
pen-based devices for music recognition: point-and-click,
gesture-based, and symbol recognition (George 2005).
In the first category, a piece of music is played on a MIDI
instrument and the musical information is then transferred
to the computer to be represented electronically. This requires
the users to be competent players of the instrument. The
playing must be flawless, because every mistake in the pitch
or timing of a note will be faithfully captured by the computer.
The major weakness of this method stems from the difference
between music in its notated form and in its performed form:
a piece of music is never played exactly as it is notated, since
a good performer should have freedom of interpretation when
playing the piece, and this expressive timing introduces
variation in musical time. For example, when a note is played
staccato (a dot above the note indicating that the note should
be shortened to half its written length, the second half replaced
with silence), the system would have difficulty recognizing
it as a staccato note rather than as a shorter note followed by
a rest. Likewise, it would be hard to correctly recognize notes
with dynamics markings (varying degrees of loudness or
softness) or expression marks (indications in a musical score
of where the composer wishes changes in dynamics, tempo
(quicker or slower), or mood (e.g. sadder or more joyful)).
Also, the left-hand and right-hand parts of the musical
information from a MIDI keyboard are very hard to
detect and recognize separately (Somnuk 2004).
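The staccato ambiguity described above can be made concrete with a
small sketch. It is only an illustration: the score encoding, the
function name performed_events, and the resolution of 480 ticks per
quarter note are assumptions made here, and the halving rule is the
staccato definition quoted in the text.

```python
def performed_events(score, ticks_per_quarter=480):
    """Return (onset, sounding_duration) pairs, i.e. what a MIDI
    recorder would capture. Each score item is (kind, quarters, staccato)."""
    events, t = [], 0
    for kind, quarters, staccato in score:
        length = int(quarters * ticks_per_quarter)
        if kind == "note":
            # Staccato: sound for half the written length, then silence.
            sounding = length // 2 if staccato else length
            events.append((t, sounding))
        t += length  # rests only advance time
    return events

# Two different notations (hypothetical encodings):
staccato_quarter = [("note", 1, True)]
eighth_plus_rest = [("note", 0.5, False), ("rest", 0.5, False)]

# Both produce the same captured timing, so the recorded data alone
# cannot tell which notation the performer was reading.
same = performed_events(staccato_quarter) == performed_events(eighth_plus_rest)
```

Under these assumptions the two notations yield identical events, which
is why a transcription from MIDI data has to guess between them.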
Off-line or OMR methods are more suitable for printed
music recognition; the recognition rate would be lower on
freehand-written musical notation.
For the (on-line) pen-based category, the point-and-click
method works by selecting (using a mouse, keyboard, and/or
stylus) musical symbols from pull-down menus or palettes
and placing them onto the screen based on the current
cursor position. Example applications include Finale and
Sibelius. As pointed out by many authors, this method is slow
and highly unnatural (Somnuk 2004; George 2005).
There have been some gesture-based techniques developed
in recent years to represent musical symbols. The first
such system is Presto by Anstice et al. (Anstice 1996),
which uses different shorthand gestures to represent musical
symbols. For example, a dot represents a filled note with a
stem; drawing a line over a stem adds a tail to it. The work
on the Presto system was carried on by Ng et al. (Ng 1998)
to create Presto2. Areas of improvement include better
note beaming, an improved gesture set and recognition
algorithms, improved editing functions, and improved
audible and visual feedback.
Evaluation of the Presto system claimed that the gesture-based
system could be approximately three times as fast as
other methods of music data entry reported in the literature.
Its drawback, however, is that users have to learn
many different gestures to input different musical symbols;
also, as we have pointed out, this method is not natural
to a composer.
Forsberg et al. (Forsberg 1998) designed The Music Notepad,
which also uses gesture-based input for entering common
music notation. In this work, four classes of gestural
operations are available; each corresponds to one of the four
buttons of the stylus. The tip of the stylus is used to draw
marking gestures. The lower button performs direct
manipulation operations for changing note pitches and the
graphical placement of symbols. The second lowest button
slides the “music paper” across the display screen. The eraser
button plays back all or selected regions of the musical
notation. The Music Notepad integrated a set of gestures for
very basic note and rest entry from the Structured Sound
Synthesis Project (SSSP) by Buxton et al. (Buxton 1979). Like
Presto, The Music Notepad requires its users to learn many
predefined gestures and is not a natural way to compose music.
All methods mentioned have strengths and weaknesses
befitting their respective nature. Among these, we find
freehand input and editing of musical notes using digital ink
to be the most natural form of interface between computers
and composers.
Problem Formulation
Traditionally, musicians and composers write music using
pen and paper. It has always been a challenge for both
musicians and computer scientists to enable direct input of
musical notes into a computer. Numerous methods have been
tried to ease the process of musical notation input to a
computer: OMR (Optical Music Recognition), MIDI
devices, GUIs (graphical user interfaces using windows,
icons, menus, and point-and-click), pen- and gesture-based
systems, etc. With the advancement of personal computers
and handwriting recognition techniques, it occurred to both
musicians/composers and researchers that writing music
directly onto a computer is the logical next step in the
evolution of music writing.
Research Methodology
We divide the process of recognition of musical notation
into four main stages:
[Figure: Online Pen-Based Musical Notation Recognition.
Input -> Digitization -> Digital Ink -> Symbol Recognition ->
Symbol List -> Structural Analysis -> Internal Hierarchical
Structure -> Interpretation -> Final Result]
Musical notation recognition begins by considering how the
data is presented. The digitization step transforms the
digital ink input into a static or dynamic representation.
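As a minimal sketch of the two kinds of representation, assume a
stroke arrives as a list of (x, y) pen samples in writing order. The
function names, the eight-direction quantization, and the grid cell
size below are illustrative assumptions; the text does not specify
which features the system uses.

```python
import math

def direction_codes(stroke, n_dirs=8):
    """Dynamic representation: the sequence of quantized pen
    directions, preserving the order in which the ink was written."""
    codes = []
    step = 2 * math.pi / n_dirs
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        if (x0, y0) == (x1, y1):
            continue  # skip repeated samples (pen did not move)
        angle = math.atan2(y1 - y0, x1 - x0) % (2 * math.pi)
        codes.append(int(round(angle / step)) % n_dirs)
    return codes

def occupied_cells(stroke, cell=10):
    """Static representation: the set of grid cells that contain a
    pen sample. Writing order and timing are discarded."""
    return {(int(x // cell), int(y // cell)) for x, y in stroke}

# A hypothetical stroke: move along +x, then along +y.
stroke = [(0, 0), (10, 0), (10, 10)]
dynamic = direction_codes(stroke)   # code 0 is +x, code 2 is +y
static = occupied_cells(stroke)
```

The dynamic form keeps the temporal order that on-line recognition can
exploit, while the static form resembles what an off-line (OMR) system
would see after scanning.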
The next step is symbol recognition where a classifier is
employed to assign labels to symbols. We will apply the
Hidden Markov model for this purpose.
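One common way to use Hidden Markov models as a classifier is to keep
one model per symbol class and pick the class whose model gives the
observed sequence the highest likelihood, computed with the forward
algorithm (Rabiner 1989). The sketch below assumes discrete
observations (0 = pen moving right, 1 = pen moving down); the two toy
models and all their probabilities are invented for illustration and
are not trained parameters from this study.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM via the forward algorithm.
    start[s], trans[p][s] and emit[s][o] are probabilities."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)

def classify(obs, models):
    """Return the name of the model with the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))

# Hypothetical two-state, left-to-right models (start, trans, emit).
left_to_right = [[0.5, 0.5], [0.0, 1.0]]
models = {
    # Both states mostly emit "down".
    "vertical_stroke": ([1.0, 0.0], left_to_right, [[0.1, 0.9], [0.1, 0.9]]),
    # First state mostly emits "down", second mostly emits "right".
    "down_then_right": ([1.0, 0.0], left_to_right, [[0.1, 0.9], [0.9, 0.1]]),
}

label = classify([1, 1, 0, 0], models)  # down, down, right, right
```

For longer sequences the products become very small, so a practical
implementation would typically work with scaled or log probabilities.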
At the structural analysis step, the list of recognized
symbols is grouped into an internal hierarchical
structure suitable for interpretation and processing by a
computer program.
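A very simple form of such grouping can be sketched as follows,
assuming each recognized symbol carries a label and a position on the
page. The Symbol and Group types, the labels, and the horizontal-gap
rule are hypothetical; a real structural analysis would also need
vertical relations, beams, and staff context.

```python
from dataclasses import dataclass, field

@dataclass
class Symbol:
    label: str   # e.g. "notehead", "stem", "flag" (illustrative labels)
    x: float
    y: float

@dataclass
class Group:
    symbols: list = field(default_factory=list)

def group_symbols(symbols, max_gap=15.0):
    """Cluster symbols left to right: a symbol joins the current group
    if it lies within max_gap of the previous symbol, else starts a new one."""
    groups = []
    for sym in sorted(symbols, key=lambda s: s.x):
        if groups and sym.x - groups[-1].symbols[-1].x <= max_gap:
            groups[-1].symbols.append(sym)
        else:
            groups.append(Group([sym]))
    return groups

recognized = [
    Symbol("stem", 14, 25), Symbol("notehead", 10, 40),
    Symbol("notehead", 60, 30), Symbol("stem", 64, 15), Symbol("flag", 68, 5),
]
groups = group_symbols(recognized)
labels = [[s.label for s in g.symbols] for g in groups]
```

Here the flat symbol list becomes two groups, one per written note,
which is the first level of a hierarchy that later stages can extend
to measures and staves.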
The last step is interpretation. In this step, the internal
hierarchical structure is processed and interpreted to obtain
a final result. This result can be a musical expression
represented in MIDI or any of its equivalents ready for
playback or for printing.
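As an illustration of the interpretation step, the sketch below turns
notes given as (staff step, length in quarter notes) into MIDI-like
events. It assumes a treble clef with no accidentals or key signature,
where step 0 is the bottom staff line (E4, MIDI note 64) and each step
moves one line or space upward; the function names and the 480 ticks
per quarter note are assumptions for the example.

```python
# Semitone offsets of C, D, E, F, G, A, B above C.
DIATONIC_OFFSETS = [0, 2, 4, 5, 7, 9, 11]

def staff_step_to_midi(step):
    """Map a treble-clef staff step to a MIDI note number
    (step 0 = bottom line = E4 = 64, middle C = 60)."""
    number = 2 + step                 # E is diatonic degree 2 above C
    octave = 4 + number // 7
    degree = number % 7
    return 12 * (octave + 1) + DIATONIC_OFFSETS[degree]

def interpret(notes, ticks_per_quarter=480):
    """Convert (staff_step, quarters) pairs into
    (midi_pitch, start_tick, duration_ticks) events played in sequence."""
    events, t = [], 0
    for step, quarters in notes:
        duration = int(quarters * ticks_per_quarter)
        events.append((staff_step_to_midi(step), t, duration))
        t += duration
    return events

# E4 quarter, G4 quarter, B4 half note.
events = interpret([(0, 1), (2, 1), (4, 2)])
```

The resulting event list could then be handed to a MIDI library for
playback or to a notation engine for printing.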
References
1. George, S.E. (2005). Pen-Based Input for On-Line
Handwritten Music Notation. In George S.E. (Ed.),
Visual Perception of Music Notation: On-Line and
Off-Line Recognition (pp. 128–160). PA: IRM Press.
2. Somnuk Phon-Amnuaisuk (2004). Challenges and
Potentials in Freehand Music Editing Using Pen and
Digital Ink. In Music Network Open Workshop
(co-located with WEDELMUSIC 2004 Conference),
Universitat Pompeu Fabra, Barcelona, Spain.
September 15-16.
3. Anstice J., Bell T., Cockburn A., and Setchell M.
(1996). The Design of a Pen-Based Musical Input
System. In Proceedings of the 6th Australian
Conference on Computer-Human Interaction
(OzCHI '96). Pages 260-267. IEEE Computer Society.
4. Ng E., Bell T., Cockburn A. (1998). Improvements to a
Pen-Based Musical Input System. OzCHI '98: The
Australian Conference on Computer-Human
Interaction. Adelaide, Australia. 29 November to 4
December, 1998. Pages 239-252. IEEE Press.
5. Forsberg, Andrew, Dieterich M., and Zeleznik R.
(1998). "The Music Notepad", Proceedings of UIST
'98, ACM SIGGRAPH.
6. Buxton, W., Sniderman, R., Reeves, W., Patel, S. &
Baecker, R. (1979). The Evolution of the SSSP Score
Editing Tools. Computer Music Journal 3(4), 14-25
(reprinted in: Buxton, W., Sniderman, R., Reeves, W.,
Patel, S. & Baecker, R. (1985). The Evolution of the
SSSP Score Editing Tools. In Roads, C. & Strawn, J.
(1985). Foundations of Computer Music. MIT Press,
Cambridge MA, 376-402.)
7. Rabiner L.R. (1989). A Tutorial on Hidden Markov
Models and Selected Applications in Speech
Recognition. Proc. IEEE, 77(2), 257-286.
8. Bunke H., Roth M., and Schukat-Talamazzini E.G.
(1995). "Off-Line Cursive Handwriting Recognition
Using Hidden Markov Models," Pattern Recognition,
vol. 28, no. 9, pp. 1399-1413, Sept. 1995.
9. Tapia, E. (2004). Understanding Mathematics: A
System for the Recognition of On-Line Handwritten
Mathematical Expressions. Dissertation zur Erlangung
des akademischen Grades eines Doktors der
Naturwissenschaften im Fachbereich Mathematik und
Informatik der Freien Universität Berlin. Berlin.