From: AAAI Technical Report SS-93-01. Compilation copyright © 1993, AAAI (www.aaai.org). All rights reserved. EXPERIENCE-BASED MUSIC COMPOSITION CARL WESCOTT and ROBERT LEVINSON Department of Computerand Information Sciences University of California, Santa Cruz Santa Cruz CA 95064 USA 1 Overview 2 "If a complex structure is completely unredundant - if no aspect of its structure can be inferred from any other - then it is its ownsimplest description. Wecan exhibit it but we cannot describe it by a simpler mechanism." - Herbert Simon, The Architecture of Complexity, p. 478. Humancreativity produces new forms, or patterns, from previous experience, in such diverse activities as composingmusic, writing prose, and playing chess. We believe that humans have domain-independent pattern-based neurobiological mechanisms that provide the foundations for creativity, and propose that systems which combine and alter simple patterns to model complex systems can effectively simulate creative intelligence. To study these mechanisms and experience-based creativity, Smurph, System for Musicomposition Using Repeated Pattern Hierarchies, has been developed. Smurph is intelligent software that learns to composemusic based on discovery of repeated musical patterns and its creative combinations and variations of those patterns. The goal is for Smurphto learn to compose original works of music entirely from its experience of ’listening’ to music and its analyses thereof, with virtually no music-theoretic knowledge or heuristics. This paper summarizes related research, describes the design of our experience-based chess and music programs, expounds on the patterns, hierarchical structures, and linguistic properties of music, and concludes by outlining ongoing research. 119 Related Research "[Musical fragments] that please me I retain in memory, and am accustomed, as I have been told, to humthem to myself. If I continue in this way, it soon occurs to me how I mayturn this or that morsel to account, so as to make a good dish of it, that is to say, agreeably to the rules of counterpoint, to the peculiarities of the various instruments, etc. All this fires mysoul, and, provided I amnot disturbed, mysubject enlarges itself, becomes methodized and defined, and the whole, though it be long, stands almost complete and finished in my mind, so that I can survey it, like a fine picture or a beautiful statue, at a glance. Nor do I hear in nay imagination the parts successively, but I hear them, as it were, all at once." Wolfgang AmadeusMozart, in Vernon, Creativity, p. 55. The original computer music compositions were stochastic music composed using random numbers or processes [Hiller & Isaacson, 1959] [Xenakis, 1971]. Often, these aleatoric devices were interactive compositional tools to assist the humancomposer, such a.s Koenig’s mentor systems Project One and Project Two [Laske, 1981]. Another approach has been to use formal grammars to model and generate music [Laske, 1973] [Bernstein, 1976] [Smoliar, 1976] [Winograd, 1968] [Holtzman, 1981] [Lerdahl & Jackendoff, 1983] [Roads, 1978]. More sophisticated music composition programs have been written that make use of rules based on the music-theoretic heuristics of traditional counterpoint species [Schottstaedt, 1984] [Ebcioglu, 1986]. These systems attempt to minimize penalties for broken rules using backtracking to compose an original piece of music or harmonize an existing one phrase-by-phrase. More recently, there have been several neural net implementations of compositional algorithms [Todd, 1991] [Mozer, 1991] [Lewis, 1991] [Kohonen, Laine, Tiits, & Torkkola, 1991] and algorithms to learn and produce well-formed melodies [Underwood, 1992] [Gilbert, 1993]. Someresearchers are studying the related field of real-time musical accompaniment [Dannenberg, 1984] [Vercoe, 1984] [Baird, Blevins, &Zahler, 1989]. Our computational music research builds on the work of David Cope, whose Experiments in Musical Intelligence (EMI) software has created pieces in the style of Bach, Mozart, Joplin, Gershwin, and Bartok, amongothers. In many respects, Cope combines some of the above techniques: EMI uses an Augmented Transition Network (ATN) grammaras well as a rule-base to describe the syntax of music. EMIbuilds a dictionary of signature phrases by listening to a composer’s works, and then composesmusic using these stylistic patterns and its rule-base [Cope, 1991a]. Smurph does not use transformational grammarsnor musical heuristics. Instead, it utilizes only its probabilistic analyses of input music to determine the most suitable patterns to follow its current output. It stores repeated patterns in a hierarchical database, and combines and alters them to form new musical phrases. 3 Morph & Smurph "While Babbage dreamt of creating a chess or tic-tac-toe automaton, [Lady Ada Lovelace] suggested that his Engine, with pitches and harmonies coded into its spinning cylinders, ’might compose elaborate and scientific pieces of music of any degree of complexity or extent.’ " - Douglas Hofstadter, Godel, Escher, Bach, p. 25. Morph, an experience-based, adaptive, pattern-oriented program that learns to play chess by watching or playing games, has already been built [Levinson, 1991]. Morph was developed based on our APS (Adaptive-Predictive Search) paradigm for machine learning. Starting with little a priori knowledge, Morphstores the patterns that it finds on the chess board in a hierarchical database, along with its estimated weight for each 120 pattern ranging from 0 to 1, with 0 indicating a sure loss and 1 indicating a win. Unlike most intelligent chess programs, Morph does not search the tree of possible movesfor future consequences of its actions. Instead, Morphevaluates the patterns that it finds in its possible legal movesand chooses the movewith the best possible combination of patterns. After the game is over, Morph adds and deletes patterns and adjusts its patternweights using temporal difference learning [Sutton, 1988] and the simple feedback of whether it won, drew, or lost the game. To study the application of similar principles to another domain besides chess, music composition was chosen, since: ¯ Music has relatively simple, discrete patterns that can be sequentially analyzed and unambiguously represented. * Due to the hierarchical structure of music, each note can be interpreted on manydifferent levels, as part of a bar, motive, phrase, section, or theme. ¯ Music composition, like winning chess, is often cited as an exampleof intelligent and creative behavior. ¯ There are digitally stored musical databases that can easily be accessed on-line. ¯ Psychological aspects of human music composition have been widely studied and published. Like Morph, Smurph stores musical patterns that it recognizes in a hierarchical database using a more-general-than operator. Rather than storing weights with these patterns, though, Smurph keeps track of the probabilities of occurrence of these patterns, modeling music as the result of a multiple-order Markovinformation source. Thus, when composing, Smurph combines these smaller patterns to produce music that has some of the style or signatures of the input music pieces that it has listened to. Smurph’s patterns currently use only the melodyof a musical piece, but Smurphwill eventually analyze all the voices in the input data stream. The input music is all transposed to the key of C, and rhythms are represented only in a temporally relative manner. Thus, two eighth notes followed by a quarter note are perceived to be identical to two quarter notes and a half note, assuming of course that the pitches mapone-to-one after transposition. Smurph’s database routines are written in C++ and interface with Macintosh MIDILISP code to parse and output MIDIevents to and from the computer. 4 unary operators like Generalize, Invert, and Re?)erse. 5 The Hierarchical Structure of Music Patterns "One of the most interesting aspects of the world is that it can be considered to be made up of patterns. A pattern is essentially an arrangement. It is characterized by the order of the elements of which it is made, rather than by the intrinsic nature of these elements." - Norbert Wiener, in Gonzalez & Thomason, Syntactic Pattern Recognition: An Introduction, p. 1. The easiest way to form complex patterns from simple ones is to group the smaller patterns together. Experimental evidence shows that the Gestalt principles of temporal proximity and similarity are important determinants of grouping in music. Grouping objects together improves perception and recognition visually as well as aurally: "Regular, symmetrical, simple shapes will be more readily perceived, appear more stable, and be better remembered than those which are not. Thus, for instance, conjunct pitch sequences (the law of proximity), continuing timbres (the law of similarity), cyclic formal structures (the law of return)all help to facilitate perception, learning, and understanding." [Deutsch, 1982] Musical groups or phrases can also be altered by the simple procedures of retrograde, inversion, and transposition. "Tonal melodies are often generated, either consciously or unconsciously, from one or more motives (gestures less than a phrase long). These typically fall so naturally into the logic of the line as not to be distinguishable. Motives are varied by transposition (different levels of the scale), by inversion (intervals in the opposite direction), and by extension or truncation." [Cope, 1991a] Traditional fugues consist almost entirely of multivoice variations on their melodic themes. Marvin Minsky believes that the most significant aspect of these similar phrases, motives, and rhythms is the higher level differences between otherwise identical objects. [Minsky & Laske, 1992] Currently Smurph stores only patterns that actually occur in the music that it listens to, but mechanisms are being added so that variants can be produced by using binary pattern transforms such as Most CommonSubpattern, and 121 "As a piece of music unfolds, its rhythmic structure is perceived not as a series of discrete independent units strung together in a mechanical, additive way like beads, but as an organic process in which smaller rhythmic motives, while possessing a shape and structure of their own, also function as integral parts of a larger rhythmic organization." - Grosvenor Cooper & Leonard Meyer, The Rhythmic Structure of Music, p. 2. According to Heinrich Schenker, the originator of modern harmonic analysis, music can be understood on three principal levels: foreground, middleground, and background. Schenker advocates a universal deep structure for music based on his formalisms not unlike Chomsky’s. This structure has two components, a fundamental melodic line and a bass line, which may be discovered by reeursively removing extraneous musical notes from a piece to reveal its harmonic and contrapuntal skeleton. The structure of music is the bridge between the composer and the listener. Music’s various aspects - pitch, intensity, duration, timbre, texture, and harmony - are interwoven at many different levels in an architectonic fashion. [Narmour, 1990] Experimental evidence shows that humans retain pitch information in their memories at different levels of abstraction. [Deutseh, 1982] Like pitch, the rhythms in a piece of music can be analyzed on multiple hierarchical levels. [Cooper & Meyer, 1960] Each note is analyzed in terms of its contextual tonal hierarchies, [Krumhansl, 1990] offering the musical stability which leads to emotion and meaning in music. [Meyer, 1956] Smurph’s hierarchical storing and analysis of patterns is consistent with the multilevel hierarchical structures in many forms of Western tonal music. The classical sonata, for example, is a hierarchy of subunits, and the "subunits in one branch are generally found in identical or varied form in other branches. For example, the themes from the exposition are also heard in the development, recapulation, and coda. but in different orders, keys, and tonalities. Similarly, the motives from one theme also appear throughout the movement,but transposed, inverted, stated in retrograde, diminished or augmented, or otherwise altered. Although the formal structure of a musical composition varies widely from one composer or style to another, the hierarchical nature of the form remains nearly universal." [Bateman, 1980] 6 Music as a Language able at all stages of the compositional process, it reflects music’s potential ambiguities through its stochastic generator. Also, rather than having to rewrite its grammar to create music in another style, Smurphsimply digests, analyzes, and emulates new input music. 7 "Just as letters are combined into words, words into sentences, sentences into paragraphs, and so on, so in music individual tones becomegrouped into motives, motives into phrases, phrases into periods, etc. This is a familiar concept in the analysis of harmonic and melodic structure." - Grosvenor Cooper and Leonard Meyer, The Rhythmic Structure of Music, p. 2. This multilevel hierarchical structure allows music to be viewed as a language. Besides the more formal approaches to describing music in terms of languages’ semantics, parts of speech, transformations, and musical grammars, [Bernstein, 1976] it is widely acknowledgedthat the hierarchical structure of music lends itself well to another analogy. Music groups itself into phrases, motives, and section, just as language is composed of words, sentences, and paragraphs. Also, like language, music creates tension and relief through its use of dissonance and cadences. As a language, it possesses the same universal deep structure that Chomskydiscusses in linguistics. [Chomsky, 1965] Music can be discussed in terms of the abstract expressions of formal language theory. Lerdahl and Jackendoff concentrate on the hierarchical aspects of music. They employ grouping analysis of musical sections, phrases, and motives as well as metrical analysis to break downhierarchies like two 3-note groups that are part of a larger 6-note group. They also use time-span reduction and prolongational reduction to isolate pitch hierarchies. [Lerdahl & Jackendoff, 1983] There have been attempts to compose music based on formal grammars, but one problem with this approach is that it can never be "adequate to mirror the total ambiguity which is so important in music, of the boundary between new figures and variations of an old one." [Lidov & Gabura, 1973] Smurph captures the essence of the structure of music without using formal grammars to output music. Since its patterns are avail122 Domain-Independent Creativity One of the primary high-level goals of this research is to explore to what extent creativity can be achieved in individual domains using principles that are essentially domain-independent. In the case of Smurph the goal then is for Smurph to exhibit the musical characteristics of those that have formal music training, without explicitly being given music theory as in knowledge-basedcomposition systems. Smurph’s expertise must come from its experience in "listening to" well-formed music. The first primitive version of Smurphwas based strictly on its hidden Markovmodel. As was expected, if Smurph used a high order of probabilistic synthesis, redundantrepetition resulted. If the order of synthesis was too low, the resulting output note sequences were mostly random and not typical of the input pieces’ style. Smurph’sfirst compositions contained coherent phrases, but lacked some of the basic elements which characterize Western tonal music. Since each note in its compositions was determined by probability, the overall structure which permeates most music was missing. Repetition, selfreference, and theme and variation occured only infrequently. The concept of modulating between keys was completely alien to Smurph. Musical tension and release were rare, and particularly glaring was the absence of definitive endings. Rather than encoding music-theoretic heuristics into the system, Smurphwas directed to improve its playing through other means. Smurph now stores the patterns of its compositions in a separate database, and occasionally consults this database to develop recurring musical phrases and create theme and variation. Smurph produces endings to its pieces by using the endings of the pieces it has listened to. Starting from the last note, usually the tonic or a fifth above it, Snmrph works backwards to write a bar or two of the final phrase which sometimes combines the tension and release of the traditional V7-I cadence with the patterns of the melody. In giving Smurphthe capability of generating music that follows someof the rules of music theory, we are actually giving it the potential to produce creative music based on generalizations of these qualities. If Smurphcan learn the laws of traditional music theory then it can break them. Thus Smurph by judiciously extending the boundaries of Western music may eventually be truly creative. [Boden, 1990] 8 Prediction are successful in other related domains. This approach may lead to plausible move generators for chess, organic synthesis, and automatic theorem proving, each being an area in which humans have shown superiority to computers. Finally, mappings of pattern structures across multiple problem domains will be explored, such as in music composition based on chess patterns. Our presentation furnishes specific results, including the creative output of the system, further details on its inner workings, and a critical analysis of its strengths and weaknesses. "Prediction is very difficult, especially of the future." - Niels Bohr 10 Acknowledgements The authors are indepted to David Cope and Kevin Karplus for valuable discussions and suggestions. References Like speech recognition systems and adaptive coding techniques, Smurphuses the context of input it has seen so far to predict upcomingpatterns. Humans,whether they realize it or not, also predict upcomingsections of music as they listen to it. In fact, the emotional aspect of music is mainly based on expectancy and surprise [Meyer, 1956] [Narmour, 1990]. Furthermore, experimental evidence shows that even children can predict whena piece of music is finished [Serafine, 1988]. 9 Conclusions and Ongoing Research Smurph creates original motives in the style of the input compositions. Whereas some previous experiments in statistical analysis of melodies [Olson, 1967] have only calculated zero through third-order analyses, Smurph produces higher-order analyses. Part of Smurph’schallenge is using the right order of synthesis and combining multiple orders of probabilistic information, and herein ties part of its creativity and our challenge as researchers. WhenSmurph sounds good enough in subjective listening test, it will be evaluated as in a double-blind Turing test. Also, Smurph will itself be given a test to determine which of two input pieces is Mozart’s (or another composer’s), after listening to selected pieces by that composer. By creating a computer program which learns a creative art through experience in another domain besides chess, the hypothesis that the basis for creativity is domain-independent is being tested. Webelieve that problem solving and the arts mayutilize similar mechanismsto recognize, alter, and combine patterns. Based on this theory, we hope to build intelligent systems which [Apostel, Sabbe, and Vandamme,1986] L. Apostel, H. Sabbe, and F. Vandamme,eds. Reason, Emotion, and Music: Towardsa commonstructure for the arts, sciences, and philosophies, based on a conceptual frameworkfor the description of music. Ghent, Belgium, 1986. [Baird, Blevins, &Zahler, 1989] B. Baird, D. Blevins, and N. Zahler. The artificially intelligent computer performer on the MacintoshII and a pattern matching algorithm for real-time interactive performance. In Proceedings of the 1989 International Computer Music Conference. The MIT Press, Cambridge, 1992. [Balaban, Ebcioglu, and Laske, 1992] Mira Balaban, KemalEbcioglu, and Otto Laske, eds. Understanding Music with AI: Perspectives on Music Cognition. The MITPress, Cambridge,1992. [Bateman, 1980] Wayne Bateman. Introduction to Computer Music. John Wiley & Sons, NewYork, 1980. [Bean, 1961] Calvert Bean, Jr. Information Theory Applied to the Analysis o] a Particular FormalProcess in Tonal Music. Dissertation. University of Illinois, Urbana,1961. [Bernstein, 1976] Leonard Bernstein. The Unanswered Question: Six Talks at Harvard. Harvard University Press, Cambridge,1976. [Boden, 1990] Margaret Boden. The Creative Mind: Myths and Mechanisms.Weidenfeld and Nicholson, London, 1990. [Chomsky,1965] NoamChomsky.Aspects of the Theory of Syntax. The MITPress, Cambridge,1965. [Chowning and Roads, 1985] John Chowning and Curtis Roads. John Chowningon Composition. In Composersand the Computer. William Kaufmann, Los Altos, 1985. 123 [Cooper & Meyer, 1960] Grosvenor Cooper and Leonard B. Meyer. The Rhythmic Structure of Music. The University of Chicago Press, Chicago, 1960. [Cope, 1988] David Cope. Music: The Universal Language. In Proceedings of the American Association o] Artificial Intelligence Workshop on Music and AL AAAI, Menlo Park, 1988. [Cope, 1991a] David Cope. Computers and Musical Style. A-R Editions, Madison, 1991. Volume 6 of the Computer Music and Digital Audio Series. [Cope, 1991b] David Cope. On Algorithmic Representation of Musical Style. In Musical Intelligence, Balaban, Ebcioglu, and Laske, eds. AAAIPress, Menlo Park, 1991. [Dannenberg, 1984] R. B. Dannenberg. An on-line algorithm for real-time accompaniment. In Proceedings o] the 1984 International Computer Music Conference. [Deutsch, 1982] Diana Deutsch. Grouping Mechanisms in Music. In The Psychology of Music, Deutsch, editor. Academic Press, NewYork, 1982. [Ebcioglu, 1986] Kemal Ebcioglu. An Expert System for the Harmonization of Chorales in the Style of J. S. Bach. Dissertation, 1986. [Gilbert, 1993] Stephen Gilbert. Training an HMM to Generate Appropriate Melodies. Unpublished, January 11, 1993. [Gould & Levinson, 1992] Jeff Gould and Robert Levinson. Experience-based Adaptive Search. To appear in Machine Learning: A Multi-Strategy Approach, Morgan Kauffman, 1992, vol. 4. [Lerdahl & Jackendoff, 1982] Fred Lerdahl and Ray Jazkendoff. A Grammatical Parallel Between Music and Language. In Music, Mind, and Brain: The Neuropsychology of Music, edited by Manfred Clynes, Plenum Press, NewYork, 1982. [Lerdahl & Jackendoff, 1983] Fred Lerdahl and Ray Jackendoff. A Generative Theory of Tonal Music. The MIT Press, Cambridge, 1983. [Levinson, 1991] Robert Levinson. Experience-based creativity. In Proceedings of Symposium on AI, Reasoning, and Creativittj, Kluwer Academic Press, 1991. To appear in the Kluwer book A1 and Creativity. Also appears as UCSCTecli Report 91-37. [Lewis, 1991] J. P. Lewis. Creation by Refinement and the Problem of Algorithmic Music Composition. In Music and Conn~ctionism, Todd & Loy, eds., The MIT Press, Cambridge, 1991. [Lidov & Gabura, 1973] David Lidov and A. James Gabura. A Melody-Writing Algorithm Using a Formal Language Model. In Computer Studies in the Humanities and Verbal Behavior, 4 (1973): 138-148. [Mandelbrot, 1983] Benoit Mandelbrot. The Fractal Geometry of Nature. W.H. Freeman and Company, NewYork, 1983. [Meyer, 1956] Leonard B. Meyer. Emotion and Meaning in Music. The University of Chicago Press, Chicago, 1956. [Meyer, 1967] Leonard B. Meyer. Music, the Arts, and ldeas: Patterns and Predictions in TwentiethCentury Culture. The University of Chicago Press, Chicago, 1967. [Minsky & Laske, 1992] Marvin Minsky and Otto Laske. A Conversation with Marvin Minsky. In A1 Magazine, Fall 1992, pp. 31-45. [Hiller & Isaacson, 1959] Lejaren Hiller and L. Isaacson. Experimental Music. McGraw-Hill, 1959. [I-Iofstadter, 1979] Douglas R. Hofstadter. Godei, Eschef, Bach: An Eternal Golden Braid. Random House, New York, 1979. [I’Ioltzman, 1981] S. Holtzman. Using Generative Grammars for Music Composition. In Computer Music Journal, 5, 1 (Spring 1981): 51-64. [Mozer, 1991] Michael Mozer. Connectionist Music Composition Based on Melodic, Stylistic, and Psycliophysical Constraints. In Music and Connectionism, Todd & Loy, eds., The MIT Press, Cambridge, 1991. [Narmour, 1990] Eugene Narmour. The Analysis and Cognition of Basic Music Structures: The Implication-Realization Model. The University of Chicago Press, Chicago, 1990. [Kohonen, Lalne, Tiits, & Torkkola, 1991] Teuvo Kohonen, Pauli Laine, Kalev Tiits, and Kari Torkkola. A Nonheuristic Automatic Composing Method. In Music and Connectionism, Todd & Loy, eds., The MIT Press, Cambridge, 1991. [Olson, 1967] Harry F. Olson. Music, Physics, and Engineering. Dover Publications, NewYork, 1967. Cognitive [Krumhansl, 1990] Carol Krumhansl. Foundations o] Musical Pitch. Oxford University Press, Oxford, 1990. [Pickover, 1990] Clifford A. Pickover. Computers, Pattern, Chaos, and Beauty. St. Martin’s Press. New York, 1990. [Laske, 1973] Otto Laske. In Search of a Generative Grammar for Music. In Perspectives o] New Music 12, 1 (Fall-Winter 1973): 351-378. [Roads, 1978] Curtis Roads. Composing Grammars. Computer Music Association, San Francisco, 1978. [Laske, 1981] Otto Laske. Music and Mind: An Artificial Intelligence Perspective. San Francisco, Computer Music Association, 1981. 124 [ScliiUinger, 1948] Joseph Schilfinger. The Mathematical Basis of the Arts. Philosophical Library, New York, 1948. [Schottstaedt, 1984] Bill Schottstaedt. Automatic Species Counterpoint. Report No. STAN-M-19, Center for Computer Research in Music and Acoustics, 1984. [Serafine, 1988] Mary Serafine. Music as Cognition: The Development of Thought in Sound. Columbia University Press, NewYork, 1988. [Smoliar, 1976] S. Smoliar. Music programs: An Approach to Music Theory Through Computational Linguistics. Journal of Music Theory. Yale University Press, NewHaven, 1976. [Sutton, 1988] Richard S. Sutton. Learning to predict by the methods of temporal differences. Machine Learning, 3(1):9-44, August 1988. [Todd, 1991] Peter M. Todd. A Connectionist Approach to Algorithmic Composition. In Music and Connectionism, Todd & Loy, eds., The MIT Press, Cambridge, 1991. [Underwood, 1992] Rebecca C. Underwood. ChopinNET: A sequential neural network for learning to reproduce melodies. Unpublished, June 5, 1992. [Vercoe, 1984] B. Vercoe. The Synthetic performer in the context of live performance. In Proceedings of the 1984 International Computer Music Conference. [Winograd, 1968] Terry Winograd. Linguistics and Computer Analysis of Tonal Harmony. In Journal of Music Theory 12 (Spring), 2-49. Yale University Press, NewHaven, 1968. [Xenakis, 1971] Iannis Xenakis. Formalized Music. Indiana University Press, Bloomington, 1971. 125