Online Pen-Based Music Symbol Recognition

Kian-Chin Lee, Somnuk Phon-Amnuaisuk
Music Informatics Research Group, Centre for Artificial Intelligence and Intelligent Computing
Multimedia University, Cyberjaya, Malaysia.
{kclee, somnuk.amnuaisuk}@mmu.edu.my

Abstract

This study proposes freehand input and editing of musical notes using digital ink. As the user writes, pen strokes are captured and preprocessed as primitives, and then categorized and recognized using Hidden Markov models. The recognized strokes are then clustered into meaningful musical notation through structural analysis.

Motivation and Objective

The objective of this study is to propose a system that allows a user to write musical notes with an electronic pen or stylus on a touch-screen monitor or digital writing pad, in a manner similar to writing on paper with a pen or pencil. The system will translate the digital ink pattern into the correct note representation.

The benefits are manifold. Since the proposed system is pen-based and totally freehand, it will enable music composers to write a music score electronically in exactly the same way as they write music on paper. Users need not learn any special gestures for input. This gives them a great amount of freedom and flexibility, so they can concentrate on the process of composing music itself rather than being confined by the many rules imposed by other input methods. Other extended uses of this system include instant playback of a composition when linked to a music performance system, remote learning of music composition, and music composition on mobile devices.

Summary of Literature Survey

The main methods of musical notation input can be grouped into three categories:

(i) Non-notation based: Music is entered into a computer from an electronic instrument connected to the computer through the MIDI interface.
(ii) Off-line based (OMR, optical music recognition): Music sheets are scanned optically and the musical information is extracted later using techniques such as image processing and off-line symbol recognition.
(iii) Pen-based: There are three main approaches to using pen-based devices for music recognition: point-and-click, gesture-based, and symbol recognition (George 2005).

In the first category, a piece of music is played on a MIDI instrument and the musical information is then transferred to the computer to be represented electronically. This requires the users to be competent players of the instrument. The playing must be flawless, for every mistake in a note or its timing will be faithfully captured by the computer. The major weakness of this method stems from the difference between music in its notated form and in its performed form: a piece of music is never played exactly as it is notated, since a good performer should have freedom of interpretation when playing the piece, and this expressive timing introduces variation in musical time. For example, when a note is played staccato (a dot above the note indicating that it should be shortened to half its written length, with the second half replaced by silence), the system would have difficulty recognizing it as a staccato note rather than as a note followed by a rest. Likewise, it would be hard to recognize correctly notes with dynamics markings (varying degrees of loudness or softness) or expression marks (indications in a musical score where the composer wishes changes in dynamics, tempo (quicker or slower), or mood (e.g. sadder, more joyful)). Also, the left-hand and right-hand parts of the musical information from a MIDI keyboard are very hard to detect and recognize separately (Somnuk 2004).

Off-line or OMR methods are more suitable for printed music recognition; the recognition rate would be lower on freehand-written musical notation.
For the (on-line) pen-based category, the point-and-click method works by selecting (using mouse, keyboard, and/or stylus) musical symbols from pull-down menus or palettes and placing them onto the screen at the current cursor position. Example applications are Finale and Sibelius. As pointed out by many authors, this method is slow and highly unnatural (Somnuk 2004) (George 2005).

Several gesture-based techniques have been developed in recent years to represent musical symbols. The first such system is Presto by Anstice et al. (Anstice 1996), which uses different shorthand gestures to represent musical symbols. For example, a dot represents a filled note with a stem; drawing a line over a stem adds a tail to it. The work on the Presto system was carried on by Ng et al. (Ng 1998) to create Presto2. Areas of improvement include better note beaming, an improved gesture set and recognition algorithms, improved editing functions, and improved audible and visual feedback.

Evaluation of the Presto system claimed that the gesture-based system could be approximately three times as fast as other methods of music data entry reported in the literature. Its drawback, however, is that users have to learn many different gestures to input different musical symbols; and, as we have pointed out, this method is not natural to a composer.

On the other hand, Forsberg et al. (Forsberg 1998) designed The Music Notepad, which also uses gesture-based input for entering common music notation. In this work, four classes of gestural operations are available; each corresponds to one of the four buttons of the stylus. The tip of the stylus is used to draw marking gestures. The lower button performs direct manipulation operations for changing note pitches and the graphical placement of symbols. The second lowest button slides the “music paper” across the display screen. The eraser button plays back all or selected regions of the musical notation. The Music Notepad integrated a set of gestures for very basic note and rest entry from the Structured Sound Synthesis Project (SSSP) by Buxton et al. (Buxton 1979). Like Presto, The Music Notepad requires its users to learn many predefined gestures and is not a natural way to compose music.

Problem Formulation

Traditionally, musicians and composers write music using pen and paper. Enabling direct input of musical notes into a computer has always been a challenge to both musicians and computer scientists. Numerous methods have been tried to ease the process of musical notation input: OMR (Optical Music Recognition), MIDI devices, GUIs (Graphical User Interfaces using windows, icons, menus, and point-and-click), pen- and gesture-based systems, etc. With the advancement of personal computers and handwriting recognition techniques, it occurred to musicians, composers, and researchers alike that writing music directly onto a computer is the logical next step in the evolution of music writing. All the methods mentioned have strengths and weaknesses befitting their respective nature. Among these, we find freehand input and editing of musical notes using digital ink to be the most natural form of interface between computers and composers.

Research Methodology

We divide the process of recognition of musical notation into four main stages:

[Figure: Online Pen-Based Musical Notation Recognition. Labels in the figure: Input, Digitization, Digital Ink, Symbol Recognition, Symbol List, Structural Analysis, Internal Hierarchical Structure, Interpretation, Final Result.]

Musical notation recognition begins by considering how the data is presented. The digitization step transforms the digital ink input into a static or dynamic representation. The next step is symbol recognition, where a classifier is employed to assign labels to symbols. We will apply the Hidden Markov model for this purpose.
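
The symbol recognition step can be illustrated with a minimal sketch. It assumes that each stroke is quantized into eight direction codes and that each symbol class has its own discrete Hidden Markov model, scored with the forward algorithm; the label with the highest likelihood wins. The feature choice, the symbol names "stem" and "corner", and all hand-set probabilities below are illustrative assumptions, not the parameters of the proposed system, which would be trained from sample strokes (Rabiner 1989).

```python
import math

def direction_codes(points, n_dirs=8):
    """Quantize a stroke (list of (x, y) samples) into direction codes.
    Assumed convention: 0 = +x, 2 = +y, 4 = -x, 6 = -y."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if (x0, y0) == (x1, y1):
            continue  # repeated samples carry no direction
        angle = math.atan2(y1 - y0, x1 - x0) % (2 * math.pi)
        codes.append(int(round(angle / (2 * math.pi / n_dirs))) % n_dirs)
    return codes

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    if not obs:
        return float("-inf")
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_p += math.log(scale)
    return log_p

def classify(points, models):
    """Return the label of the model that best explains the stroke."""
    obs = direction_codes(points)
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

def peaked(main, n_dirs=8, p=0.86):
    """Emission row that favours one direction code (hypothetical values)."""
    return [p if d == main else (1 - p) / (n_dirs - 1) for d in range(n_dirs)]

# Hypothetical two-state, left-to-right models: (start, transitions, emissions).
LEFT_TO_RIGHT = [[0.8, 0.2], [0.0, 1.0]]
MODELS = {
    "stem": ([1.0, 0.0], LEFT_TO_RIGHT, [peaked(6), peaked(6)]),    # straight down
    "corner": ([1.0, 0.0], LEFT_TO_RIGHT, [peaked(6), peaked(0)]),  # down, then right
}

down_stroke = [(0, 10), (0, 8), (0, 6), (0, 4), (0, 2), (0, 0)]
l_stroke = [(0, 6), (0, 4), (0, 2), (0, 0), (2, 0), (4, 0), (6, 0)]

print(classify(down_stroke, MODELS))  # stem
print(classify(l_stroke, MODELS))     # corner
```

Scoring every model and taking the maximum keeps the classifier simple; adding a symbol class only requires adding one more entry to the model table.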
At the structural analysis step, the list of recognized symbols is grouped into an internal hierarchical structure suitable for interpretation and processing by a computer program. The last step is interpretation. In this step, the internal hierarchical structure is processed and interpreted to obtain a final result. This result can be a musical expression represented in MIDI or an equivalent format, ready for playback or printing.

References

1. George, S.E. (2005). Pen-Based Input for On-Line Handwritten Music Notation. In George, S.E. (Ed.), Visual Perception of Music Notation: On-Line and Off-Line Recognition (pp. 128-160). PA: IRM Press.
2. Somnuk Phon-Amnuaisuk (2004). Challenges and Potentials in Freehand Music Editing Using Pen and Digital Ink. In Music Network Open Workshop (co-located with WEDELMUSIC 2004 Conference), Universitat Pompeu Fabra, Barcelona, Spain. September 15-16.
3. Anstice J., Bell T., Cockburn A., and Setchell M. (1996). The Design of a Pen-Based Musical Input System. In Proceedings of the 6th Australian Conference on Computer-Human Interaction (OZCHI'96), pp. 260-267. IEEE Computer Society.
4. Ng E., Bell T., and Cockburn A. (1998). Improvements to a Pen-Based Musical Input System. In OzCHI'98: The Australian Conference on Computer-Human Interaction, Adelaide, Australia, 29 November to 4 December 1998, pp. 239-252. IEEE Press.
5. Forsberg A., Dieterich M., and Zeleznik R. (1998). The Music Notepad. In Proceedings of UIST '98, ACM SIGGRAPH.
6. Buxton W., Sniderman R., Reeves W., Patel S., and Baecker R. (1979). The Evolution of the SSSP Score Editing Tools. Computer Music Journal 3(4), 14-25. Reprinted in Roads, C. and Strawn, J. (Eds.) (1985), Foundations of Computer Music, pp. 376-402. MIT Press, Cambridge MA.
7. Rabiner L.R. (1989). A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proceedings of the IEEE 77(2), 257-286.
8. Bunke H., Roth M., and Schukat-Talamazzini E.G. (1995). Off-Line Cursive Handwriting Recognition Using Hidden Markov Models. Pattern Recognition 28(9), 1399-1413.
9. Tapia, E. (2004). Understanding Mathematics: A System for the Recognition of On-Line Handwritten Mathematical Expressions. Doctoral dissertation (Doctor of Natural Sciences), Department of Mathematics and Computer Science, Freie Universität Berlin. Berlin.