IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS, VOL. 48, NO. 6, JUNE 2018

Modeling and Analysis of Indian Carnatic Music Using Category Theory

Sarala Padi, Spencer Breiner, Eswaran Subrahmanian, and Ram D. Sriram, Fellow, IEEE

Abstract—This paper presents a category theoretic ontology of Carnatic music. Our goals here are twofold. First, we will demonstrate the power and flexibility of conceptual modeling techniques based on a branch of mathematics called category theory (CT), using the structure of Carnatic music as an example. Second, we describe a platform for collaboration and research sharing in this area. The construction of this platform uses formal methods of CT (colimits) to merge our Carnatic ontology with a generic model of music information retrieval tasks. The latter model allows us to integrate multiple analytical methods, such as hidden Markov models, machine learning algorithms, and other data mining techniques like clustering, bagging, etc., in the analysis of a variety of different musical features. Furthermore, the framework facilitates the storage of musical performances based on the proposed ontology, making them available for additional analysis and integration. The proposed framework is extensible, allowing future work in the area of rāga recognition to build on our results, thereby facilitating collaborative research. Generally speaking, the methods presented here are intended as an exemplar for designing collaborative frameworks supporting reproducibility of computational analysis and simulation.

Index Terms—Categorical framework for Carnatic music, categorical structure for rāga, category theory (CT), ontology.

I. INTRODUCTION

Around the world, a tremendous number of audio music files are accessed every day for personal use, research, and analysis purposes. The area of music information retrieval (MIR) studies techniques and methods for helping users find the music files they want.
Thus, MIR concentrates on archival methods, providing metadata, annotations, search algorithms, and analysis of music to help users filter an ocean of musical content. Especially as listening habits migrate from hard drives to the cloud, MIR is becoming an increasingly critical area of research [1]. There are two general classes of MIR: 1) content-based and 2) metadata-based [2], [3]. The former involves retrieval based on musical content, which a user might provide by humming or singing. However, the vast majority of MIR is metadata-based; we retrieve music segments by searching for text related to the segment, such as its title or artist. In this paper, we will focus exclusively on metadata-based MIR. Although metadata-based retrieval is convenient for users and easy to implement, the metadata on which these searches depend must be expressive enough and accurate enough to support retrieval tasks. This makes metadata creation and evaluation one of the most significant challenges in metadata-based MIR. In many cases, metadata collection requires explicit supervision to ensure accurate results.

Manuscript received May 12, 2016; revised August 29, 2016; accepted November 4, 2016. Date of publication January 31, 2017; date of current version May 15, 2018. This paper was recommended by Associate Editor L. Sheremetov. The authors are with the Department of ITL, National Institute of Standards and Technology, Gaithersburg, MD 20899 USA (e-mail: sarala.padi@nist.gov). This paper has supplementary downloadable multimedia material available at http://ieeexplore.ieee.org provided by the authors. This includes a brief introduction to category theory. This material is 116 KB in size. Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TSMC.2016.2631130
It may also be difficult to maintain a consistent format (including the choice of data fields) in our music databases, especially when these grow rapidly. One way to ease, though not eliminate, these challenges is through the use of domain modeling. If we have some rules about the way that different pieces of metadata relate to one another, then we can evaluate new data against these rules to recognize some incorrect inferences. Similarly, modeling the structure of a musical domain can allow us to cross-check related inferences. There is a significant body of literature modeling Western classical music, including tempo estimation, beat tracking, instrumental music segmentation, instrument classification, and transcription [4]–[9]. By comparison, there has been little effort to analyze Indian classical music and, correspondingly, most MIR technologies have not been applied in this context [10]–[12]. Here, we try to address this gap. Indian classical music has two main traditions, namely, Carnatic and Hindustani. Carnatic music is popular in the southern part of India, while Hindustani is popular in the north. Though these two traditions are similar in certain respects, in this paper we choose to focus only on Carnatic music, leaving the extension to Hindustani music and the relationship between the two branches for future work. A reader interested in learning more about Hindustani music should consult [13], [14]. A song in Carnatic music is composed in a specific melody (rāga) but, unlike the case in Western classical music, the rhythm can be rendered differently by different musicians [13], [15]. This is primarily because Carnatic music has been handed down from teacher to student in an oral tradition. Consequently, Carnatic music typically does not have standardized notations as are available in Western music, and what notation does exist varies from school to school. This adds an additional difficulty for the archiving and analysis of Indian classical music. © 2017 IEEE.
Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Despite all this flexibility, Carnatic music is still constrained by rules, and our goal here is to model these rules using a mathematical discipline called category theory (CT). CT provides a precise mathematical representation for these aspects of Carnatic music so that their properties and characteristics can be clearly elucidated. This will allow us to check MIR metadata for consistency with these characteristics. There are, of course, many different modeling languages and techniques; CT is noteworthy for its focus on the relationships between informational entities, rather than the entities themselves. Because of this, categorical models are largely agnostic with respect to specific choices of representational form or formalism. The success of these methods is demonstrated by the impressive breadth of their applications, which have provided insights into fields as varied as music modeling [16], computer science [17]–[20], engineering [21]–[23], theoretical physics [24], and the biosciences [25], [26]. Importantly for our purposes, CT models are closely related to database schemas, a fact which facilitates MIR activities such as storage, retrieval, and analysis. We hope that this paper will find interested readers in several different areas. First of all, we hope that researchers interested in Carnatic music will be able to use and extend our model to assist in their own work, and thereby facilitate collaboration in this area. More generally, we offer this example as a fairly detailed study in the development and application of category theoretic models, of interest in any area of collaborative science.
In the interest of these readers, we do not assume familiarity with CT methods, and include a short introduction in the supplementary material. However, we also hope that experienced practitioners of CT may enjoy this paper, and find in it some small progress toward a methodology of applied CT, an area of study still largely undeveloped. This paper begins in Section II, where we describe the individual building blocks and structure of Carnatic music that will be used for MIR tasks. In Section III, we translate this description into a categorical model. The basis of this translation is provided as supplementary material giving a brief introduction to CT; readers without a CT background are strongly advised to read the supplemental material before proceeding to Section III. Section IV broadens the scope of our models by merging the individual building blocks into a single conceptual framework to model Carnatic music as a whole. Finally, we discuss some of the ways that categorical methods can facilitate collaboration and group research.

II. INTRODUCTION TO CARNATIC MUSIC

Indian classical music is one of the world's oldest musical traditions. It has developed over centuries, and has been influenced by many religions and cultures. The present system of Indian music is based on two pillars: 1) rāga and 2) tāla. The first, rāga, is the melodic component of the music, corresponding to the "mode" or "scale" in Western classical music. The tāla, by contrast, describes the rhythmic component of the music. Indian classical music contains two traditions, Carnatic and Hindustani, associated with southern and northern India, respectively. Though both traditions involve the concepts of rāga and tāla, the interpretations of these notions vary based on the performance practices of the two traditions [27]. In this paper, we focus on the Carnatic tradition, leaving Hindustani music for future analysis.

Fig. 1. Basic svaras that make up a melody in the Carnatic music tradition.
The most basic component in Carnatic music is the svara, which is roughly analogous to a note in Western music. However, as we will see, the relationship between svara and pitch is more complicated than in Western music. In addition, these svaras determine the articulation of a note, the way that it is sung or played. Fundamentally, there are seven svaras in Carnatic music, namely, Sadja (S), Rishaba (R), Ghandhara (G), Madhyama (M), Panchama (P), Daivatha (D), and Nishada (N) [13]. As shown in Fig. 1, the first note, Sadja, is the tonic, meaning that the frequency of the other svaras is measured relative to the pitch¹ of the Sadja. Thus, the frequency of the Panchama svara is always one and a half times the frequency of the Sadja. However, the remaining svaras (R, G, M, D, and N) may each occur in one of two or three variations. Rishaba, for example, may have a frequency ratio of 16/15, 9/8, or 6/5. We refer to these variations as R1, R2, and R3, respectively. Moreover, some different svaras may share the same pitch frequency: G2 also has a relative pitch frequency of 6/5. Altogether there are 16 svara variations spread across 12 pitch frequencies (svarastanas). The fixed svaras, S and P, are called shudh svaras. The situation is similar to that in Western classical music, where E and B are pure notes in the sense that they do not have sharp and flat modifications, whereas the remaining notes, C, D, F, G, and A, may be modulated by sharp and flat notations (C♯, D♭, etc.). It is also worth noting that when the S svara in Carnatic music is fixed to C major in Western classical music, then the svaras S, R, G, M, P, D, and N correspond to the seven notes of the C major scale: C, D, E, F, G, A, and B [28]. Table I lists the pitch ratios associated with each svara. See [10], [29] for discussions of the allowed frequencies for the 12 svarastanas and their frequency ratios relative to the tonic.
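The relative-frequency scheme above can be sketched in a few lines of Python. Only the ratios actually stated in the text are included (S = 1, R1 = 16/15, R2 = 9/8, R3 = 6/5, G2 = 6/5, P = 3/2); Table I lists the full set of 16, so this table is deliberately partial.

```python
from fractions import Fraction

# Frequency ratios relative to the tonic (Sadja), for the svaras whose
# ratios are stated in the text; Table I gives all 16 variations.
FREQ_RATIO = {
    "S":  Fraction(1, 1),   # tonic
    "R1": Fraction(16, 15),
    "R2": Fraction(9, 8),
    "R3": Fraction(6, 5),
    "G2": Fraction(6, 5),   # same pitch as R3, distinguished by articulation
    "P":  Fraction(3, 2),   # always one and a half times the tonic
}

def svara_frequency(svara: str, tonic_hz: float) -> float:
    """Absolute frequency (in Hz) of a svara, given the tonic pitch."""
    return float(FREQ_RATIO[svara]) * tonic_hz
```

For a tonic at 240 Hz, for instance, the Panchama sits at 360 Hz, while R3 and G2 share a ratio yet remain distinct svaras.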
Notice that some svaras share the same relative pitch; these are nonetheless distinguished by their articulation. This is displayed in Fig. 1, which shows the grouping of svaras by pitch (vertical groupings) and articulation (horizontal groupings).

¹Here, we use "pitch" to refer to the frequency of a tone (in hertz), and we use these two terms interchangeably. In later sections, pitch will be the named feature which measures the frequency in Hz for rāga recognition tasks.

TABLE I. CARNATIC MUSIC SVARAS AND THEIR FREQUENCY RATIOS

Fig. 2. Svara sequence (both ascending and descending) and its generating regular expression for a Melakartha rāga system.

A. Rāga

The melodic component of a piece of Carnatic music is called its rāga. Rāgas are constructed by grouping together the svaras introduced above in different ways, and each of these groupings is associated with one of nine corresponding emotions [15]. The rāgas are developed to elicit these emotions and, in fact, the word "rāga" is derived from the Sanskrit word for "color" or "passion." A rāga is uniquely determined by a sequence of svaras. Additionally, this sequence can be decomposed into an aarohana (ascending) sequence followed by an avarohana (descending) sequence. In the aarohana sequence the pitch tends to increase, beginning at the tonic and ending again at the tonic, one full scale higher (i.e., at twice the frequency of the initial tonic). The avarohana sequence, on the other hand, tends to decrease, beginning at the high tonic and ending at the low. Rāgas are broadly grouped into two classes, namely, Melakartha and Janya rāgas. While both classes contain aarohana and avarohana sequences, the Melakartha rāgas are more restrictive in the sequences they allow. Rāgas may be further classified as sampoorna (complete) or asampoorna.
A sampoorna rāga includes exactly seven svaras in both the aarohana and the avarohana sequence. Any other rāga is asampoorna. Therefore, all Melakartha rāgas are sampoorna rāgas, while Janya rāgas may be either sampoorna or asampoorna. In Melakartha rāgas, both the aarohana and avarohana sequences are required to contain exactly one note from each svara class (i.e., the horizontal groupings shown in Fig. 1). Moreover, in the aarohana sequence the pitch of each svara must be higher than the one before. Thus, e.g., the svara R2 can be followed by G2 or G3, but not by G1. Similar (but reversed) requirements hold for the avarohana sequence. This means that the frequency of notes in a rāga gradually increases from the tonic to its middle note (one full scale higher), and then gradually descends to return to the tonic at the end of the rāga. Fig. 2(a) gives a finite state machine and a regular expression which will produce all of the aarohana sequences in Melakartha rāgas. By contrast, Janya rāgas are much more flexible. Their sequences usually contain fewer than seven notes, although sometimes they may have seven or more. A given sequence in a Janya rāga may also repeat certain svaras, and need not be strictly ascending or descending. For a comparison of Melakartha and Janya rāgas, see Fig. 2(b). There are further restrictions on Melakartha rāgas which Janya rāgas do not share. For Melakartha rāgas, both the aarohana and avarohana sequences must contain exactly the same svaras, and these sequences must be exactly the reverse of one another. Thus, altogether there are 72 possible Melakartha rāgas constructed from the allowed combinations of svaras indicated in Fig. 2. There is, however, an important relationship between the two types of rāgas: every Janya rāga is derived from a Melakartha rāga. Starting from a Melakartha rāga, one usually drops or adds a small number of svaras to arrive at an associated Janya rāga.
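The Melakartha aarohana rules just described (one svara per class, in class order, with strictly rising pitch, ignoring the final return to the upper tonic) can be checked mechanically. The svarastana indices below are an assumption drawn from the standard 12-position layout suggested by Fig. 1, since the figure itself is not reproduced here; overlapping indices (e.g., R3 and G2) mark svaras that share a pitch.

```python
# Assumed svarastana (pitch position) for each svara variant, 0-11,
# following the standard 12-tone layout of Fig. 1.
POSITION = {
    "S": 0, "R1": 1, "R2": 2, "R3": 3,
    "G1": 2, "G2": 3, "G3": 4,
    "M1": 5, "M2": 6, "P": 7,
    "D1": 8, "D2": 9, "D3": 10,
    "N1": 9, "N2": 10, "N3": 11,
}
CLASS_ORDER = ["S", "R", "G", "M", "P", "D", "N"]

def is_melakartha_arohana(seq):
    """One svara from each class, in class order, with strictly rising pitch."""
    if len(seq) != 7:
        return False
    if [s.rstrip("123") for s in seq] != CLASS_ORDER:
        return False
    pitches = [POSITION[s] for s in seq]
    return all(a < b for a, b in zip(pitches, pitches[1:]))
```

Under these indices, R2 followed by G1 fails the strictly-increasing test, matching the rule stated above.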
This relationship is also reflected in an additional characteristic of rāgas: the emotion associated with a rāga. Traditionally, each rāga is associated with one of nine emotions, including Bhakti (ritual or devotional) and Viram (bravery or fury). A thorough discussion is given in [30]. Here, the important observation is that a Janya rāga shares the same emotion as the Melakartha rāga from which it was derived.

B. Tāla

In the Carnatic music tradition, tāla and rāga are concepts of equal importance in rendering a composition. Tāla defines the rhythmic structure or framework within which music is performed. Literally, tāla means "clap," and it is used to maintain the time, which in turn determines the rhythmic structure or pattern of a musical composition. Generally, there are no special instruments to maintain the tāla in an Indian classical music performance. It is usually maintained by the musician by tapping the hand on the lap or by using both hands, as in clapping. As shown in Table II, seven tālas are characterized using three different notations, namely, laghu (I), drutam (O), and anudrutam (U).²

TABLE II. TĀLA NAMES AND CORRESPONDING STRUCTURES (PATTERNS) IN THE CARNATIC MUSIC TRADITION. THE NOTATIONS IN THE TABLE ARE: laghu—I, drutam—O, AND anudrutam—U

TABLE III. JATI NAMES AND NUMBER OF BEATS FOR laghu (I) SPECIFIED IN THE tāla PATTERN

TABLE IV. LIST OF GATIS AND THE NUMBER OF NOTES PER BEAT

Any piece of Carnatic music has a fixed cycle called an avartana, and each cycle is divided into basic time intervals called units or aksharas. The number of aksharas in a cycle is determined by two pieces of information: 1) a tāla and 2) a jati. A tāla is built up as a combination of three different clapping styles. Anudrutam (indicated using U) consists of a single beat counted with an ordinary (palm-to-palm) clap.
Drutam (O) consists of two beats: one ordinary clap followed by an "empty" clap, placing the back of one hand into the palm of the other. Finally, laghu (I) consists of a variable number of beats; the first is an ordinary clap, followed by counting the remaining beats on the fingers of one hand. There are seven different tālas, each corresponding to a different sequence of these components. Thus, for example, the Jampa tāla corresponds to the sequence IUO, while the Ata tāla corresponds to IIOO. A list of all these tālas, together with their associated sequences, is given in Table II. The variability of the laghu component of a tāla is determined by a second characteristic of Carnatic rhythm: the jati. There are five traditional options for jati, each of which determines a different number of beats for the laghu component of a tāla. So the Tisra jati involves three beats for each laghu component, while Misra involves seven. These various jatis, along with their numbers of beats, are indicated in Table III. The rhythm of a Carnatic composition is determined by a tāla and a jati, a pair which we call a tāla structure. Any combination is valid, so there are 7 × 5 = 35 tāla structures. From a tāla structure we can compute the number of beats in a full musical cycle. For example, the Ata tāla (with pattern IIOO) in the Tisra jati would consist of 3 + 3 + 2 + 2 = 10 beats per cycle, whereas the same tāla in the Misra jati would consist of 7 + 7 + 2 + 2 = 18 beats per cycle.

²laghu indicates one beat of the palm followed by counting the fingers; drutam indicates one beat of the palm and turning it over; and anudrutam indicates just one beat of the palm. The number of aksharas (beats) for drutam is 2, for anudrutam is 1, and for laghu depends on the type of jati.

Fig. 3. Illustration of tāla structure in the Carnatic music tradition.

The final component of Carnatic rhythm is called its gati. This determines the number of notes which are played or sung for each beat (akshara).
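The beat arithmetic above, together with the gati's notes-per-beat multiplier, can be sketched directly. Only the jati/gati counts named in the text (Tisra = 3, Kanda = 5, Misra = 7) are included; Tables III and IV give all five.

```python
# Beat counts per tala component: anudrutam (U) and drutam (O) are fixed,
# while laghu (I) depends on the jati. Partial tables; see Tables III and IV.
JATI_BEATS = {"Tisra": 3, "Kanda": 5, "Misra": 7}
GATI_NOTES = {"Tisra": 3, "Kanda": 5, "Misra": 7}  # same names, same counts

def beats_per_cycle(pattern: str, jati: str) -> int:
    """Aksharas in one avartana, e.g. Ata (IIOO) in Tisra jati gives 10."""
    laghu = JATI_BEATS[jati]
    return sum({"U": 1, "O": 2, "I": laghu}[c] for c in pattern)

def notes_per_cycle(pattern: str, jati: str, gati: str) -> int:
    """Notes in one avartana: beats per cycle times notes per beat."""
    return beats_per_cycle(pattern, jati) * GATI_NOTES[gati]
```

With this, the worked example of Fig. 3 comes out as beats_per_cycle("OI", "Kanda") == 7 and notes_per_cycle("OI", "Kanda", "Tisra") == 21.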
Combining this with the number of aksharas per cycle, we can also compute the number of notes per cycle. One point of potential confusion arises from the fact that the gati names are the same as the jati names, although the two can vary independently. In addition, the number of notes per akshara for a given gati is the same as the number of beats per laghu (I) for the jati of the same name (see Table IV). Thus, a performance in Tisra gati will have three notes per beat, regardless of its jati. A full example of this rhythmic structure is illustrated in Fig. 3 for the Rupaka tāla in Kanda jati with Tisra gati. Because Rupaka tāla has the pattern OI, the entire cycle is divided into two parts: 1) drutam (O) and 2) laghu (I). As always, drutam has a fixed number of beats (2), while the number of beats for laghu is determined by the Kanda jati (5), making seven in total. Finally, the Tisra gati entails three notes per akshara, meaning that the full cycle will consist of 21 notes.

C. Performance in Carnatic Music

In the Carnatic music tradition, a performance can be vocal or instrumental. In the case of vocal performances, the lead musician is a vocalist; in the case of instrumental performances, the lead musician is an instrumentalist (violin, veena, mridangam, etc.). In Carnatic music performances, the lead musician is generally accompanied by instruments, namely, violin, mridangam, ghatam, morsing, and tanpura. The tanpura is used to maintain the basic pitch throughout the concert, usually referred to as the tonic of the performance. All these instruments are tuned to the basic pitch of the lead performer. In general, any concert or musical performance is a combined effort of the lead musician and the various instrumentalists accompanying the lead performer. Therefore, a piece of music is usually

Fig. 5. Methods and features used for the rāga recognition task in the Carnatic music tradition.

Fig. 4.
Research tasks in the MIR community for Indian classical music.

tagged with the song name, rāga name, tāla name, and composer name, along with the names of all the musicians involved in a performance. This data is referred to as metadata; it is critical for MIR, and providing it is the main archival goal in MIR. The following section gives an overview of current research on archival methods and retrieval of Carnatic music.

1) Research in Carnatic Music (State of the Art): This section gives an overview of the state of research in both the Carnatic and the Hindustani musical traditions. In particular, the various methods and features used in the literature to recognize the rāga of a given music segment are described fully to help the reader understand the categorical models discussed in later sections. Fig. 4 illustrates the state of research useful for MIR purposes in Indian classical music, an area which has grown substantially in the last ten years. One aspect of this research, CompMusic,³ is a research project focused on a wide variety of musical traditions, including Indian classical music, Chinese music, Turkish music, etc. "Dunya," a culture-specific toolbox developed under this project, includes tools for music segmentation, feature extraction, metadata extraction, and metadata-based retrieval of music segments in a wide variety of musical traditions [31]–[38]. Apart from the CompMusic project, other efforts have focused on analyzing Carnatic and Hindustani music for various processing tasks (both vocal and instrumental) which are useful for MIR [39]–[44]. For most MIR-related tasks, metadata plays a dominant role; for example, a composer's name is among the most useful metadata for information retrieval. Thus, composer identification is one of the most critical tasks among those given in Fig. 4. In Carnatic music, composers often have a unique signature (mudra) which they weave into their compositions.
Thus, one possible way to identify the composer of a music segment is to identify its mudra, which occurs near the end of a composition. As yet, no one has developed a technique or feature to find the mudra of a composition segment for either Carnatic or Hindustani music. The different borders around the tasks in Fig. 4 distinguish them into three classes: 1) those which have not yet been attempted; 2) those which are in progress; and 3) those which have been successfully completed. Fig. 5 elaborates on the rāga recognition task by giving an overview of the features and methods used for identifying the rāga of a music segment. Such a task involves a particular method which is used to analyze a particular feature in hopes of identifying the rāga of a music segment. For example, two recognition procedures considered in this paper are the application of hidden Markov models (the method HMMs) [45] to mel frequency cepstral coefficient features [46], and the application of the K-nearest neighbor method [47] to the pitch feature [48]. Many other features (pitch class distributions and pitch class dyad distributions) and methods (support vector machines and linear discriminant analysis) are available for these recognition tasks. At present, there is no common platform used to store, share, and integrate the results obtained from these tasks. This state of affairs is due to the lack of a common model of Carnatic music that is needed to integrate these studies. The categorical model that we propose will allow users to develop and work with such common database collections and to easily share their results with a larger group. In a broad sense, the model proposed in this paper provides the complete structure necessary for a group of researchers to work together, share their data and results, and store the data in a collective database.

³http://compmusic.upf.edu/
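The feature-plus-method pattern just described (e.g., K-nearest neighbor applied to pitch features) can be illustrated with a minimal sketch. The training vectors, their values, and the rāga labels below are purely illustrative placeholders, not data from any of the cited studies; in practice the vectors would be pitch features extracted from audio.

```python
import math
from collections import Counter

# Hypothetical training data: (pitch-feature vector, raga label) pairs.
# Vectors and labels here are illustrative only.
TRAIN = [
    ([0.9, 0.1, 0.0], "Mohanam"),
    ([0.8, 0.2, 0.1], "Mohanam"),
    ([0.1, 0.8, 0.4], "Kalyani"),
    ([0.2, 0.9, 0.5], "Kalyani"),
]

def knn_raga(query, k=3):
    """Label a query vector by majority vote among its k nearest neighbors."""
    nearest = sorted(TRAIN, key=lambda ex: math.dist(query, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

A query vector close to the first two training examples would be labeled "Mohanam" by majority vote; swapping in another method (SVM, LDA) or another feature keeps the same task shape, which is what the categorical model exploits.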
The proposed categorical framework is intended to facilitate a platform for collaborative research, and it can be adapted to other domains as well.

III. CATEGORY THEORETIC REPRESENTATION OF CARNATIC MUSIC

Knowledge representation plays a fundamental role in describing any domain problem in which a computer must interpret and process data for further analysis. In many real-world applications, modeling the objects and relationships in a domain constitutes a major contribution to such knowledge representation. The most important criterion for representing any domain problem is that the knowledge should be properly defined so that it is consistent with the problem domain. Therefore, the primary challenges of knowledge representation are as follows.

1) Choosing the problem to solve.
2) Representing the problem and providing the required knowledge to solve it.
3) Validating the appropriateness of the knowledge before solving the problem.
4) Solving the problem computationally and evaluating the correctness of the solution.
5) Representing the knowledge comprehensively, so that a computer can solve the problem for a particular domain.

There are many models for knowledge representation. One class of such models is called ontologies, which provide an explicit specification and abstraction of knowledge [49]–[52] in either human-readable form or a formal language. These typically begin by specifying the types of entities which occur in the problem domain, and are then supplemented with a collection of rules that the entities are expected to obey. The most popular approaches to ontologies rely on a family of Web ontology languages (OWL) [53]–[55]. In MIR, these have been applied to model music for segmentation and the extraction of semantic information [56], [57].
The OWL approach is based on description logic and is often used in conjunction with other technologies, such as the resource description framework and the SPARQL query language, and can also be mapped to relational database schemas for storage and retrieval [58], [59]. One shortcoming of existing ontological approaches is a lack of extensibility. It can be difficult to modify an existing ontology in order to extend or debug its domain. In particular, it is difficult to relate different ontologies to one another, particularly in cases where neither ontology embeds into the other. A lack of formal relationships between ontologies also leads to difficulties when migrating data from one ontology to another. Similar issues arise for ontological integration. Domain-specific OWL ontologies are often specialized from large and complex "upper" ontologies, e.g., [60]. The advantage is that this method can align multiple small ontologies derived from the same upper ontology. The disadvantage is that these upper ontologies can be difficult for developers to write and debug, and for users to navigate. A more modular approach, based on identification of overlap between small ontologies, depends on a concrete representation of that overlap. In this paper, we investigate an alternative approach to ontologies based on CT. CT was developed in the 1940s by Eilenberg and MacLane [61] to study the relationship between two areas of mathematics: 1) topology and 2) algebra. Subsequent research has led to applications throughout mathematics as well as in theoretical physics and computer science. In the 1990s, Rosebrugh and Wood [62] showed that a database schema can be viewed as a category, and a database instance as a functor on this category [63]. More recently, Spivak [64] has used this point of view to apply the methods of CT to problems of data management.
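The schema-as-category view just described can be made concrete in a toy rendering: a schema is a set of objects and typed arrows, and an instance (a functor into sets) assigns a set of rows to each object and a total function to each arrow. The objects, arrow names, and rows below are illustrative, not part of the cited formalisms.

```python
# Toy schema: objects and arrows (arrow name -> (source, target)).
OBJECTS = {"Performance", "Raga"}
ARROWS = {"has_raga": ("Performance", "Raga")}

# Toy instance: a set of rows for each object, a function for each arrow.
rows = {
    "Performance": {"perf1", "perf2"},
    "Raga": {"Mohanam", "Kalyani"},
}
maps = {
    "has_raga": {"perf1": "Mohanam", "perf2": "Kalyani"},
}

def is_valid_instance(rows, maps):
    """Check that each arrow is a total function from source rows to target rows."""
    for name, (src, tgt) in ARROWS.items():
        f = maps[name]
        if set(f) != rows[src]:               # defined on every source row
            return False
        if not set(f.values()) <= rows[tgt]:  # image lands in the target set
            return False
    return True
```

Instance validity here is just functoriality in miniature; richer schemas would also require composite arrows to commute.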
Generally speaking, this line of work provides a dictionary of categorical interpretations for the standard vocabulary of databases, such as schemas, instances, queries, updates, and data migration. This work has also been implemented in the functorial query language, software for building and analyzing databases from the categorical perspective [65]. There is also a broad and substantial literature on the use of CT in formal modeling. One line of inquiry following on this database work can be found in the work of Johnson and Rosebrugh [63], who provide a categorical interpretation of entity–attribute–relation diagrams. This work was further elaborated by Diskin [66], [67] and Rutle et al. [68], with a particular emphasis on software engineering. Most importantly, whereas earlier work was largely theoretical, much of this more recent work has been implemented and is directly informed by engineering practice. Along broadly similar lines, CT can be used to understand object-oriented class modeling as found, for example, in the Unified Modeling Language (UML). Although it is ubiquitous in (especially the early stages of) software engineering, UML lacks well-defined semantics. Both Diskin [69] and Padi et al. [70] have associated (fragments of) the UML class diagram syntax with constructions in CT. Turning this relationship around, CT straightforwardly inherits many of the methods used in object-oriented class modeling. The formal mathematics of CT allows us to sidestep some of the difficulties of other ontological representations. For example, one advantage of categorical ontologies is that we can express their inter-relationships using maps called functors. More generally, a diagram of functors can describe sophisticated relationships between families of ontologies, which can then be merged together using a construction called a colimit. Kan extensions provide a formal method for data migration between schemas.
These powerful formal methods allow us to manipulate categorical ontologies in a way that is both correct and mathematically justified [71]–[73]. The previous section described some fundamental aspects of Carnatic music. In this section, we translate that description into a category-theoretic ontology. Our supplementary materials provide a brief introduction to the methods of CT (as well as our notation); we strongly advise those without a background in CT to study that material before proceeding. In our categorical model, each object represents a set of entities, and each arrow represents a function between these sets. In diagrams, an object will be represented as a box containing text which describes a typical element of the set; for example, a box labeled "a svara" represents the set of svaras (notes) in Carnatic music. For readability, we use corner braces rather than full boxes when discussing objects in the text: e.g., ⌜a svara⌝. As discussed in Section II, a Carnatic melody is built up from 16 basic notes, spread across 12 pitches and 7 articulations. Each of these will become an object in our categorical representation. Since each of these objects represents a single individual, they can all be modeled by terminal objects in our category. We then group these notes by articulation, as in the horizontal groupings of Fig. 1. As shown in Figs. 6 and 7, categorically, these groupings take the form of coproducts, allowing us to represent the objects Rishabham, Gandharam, Dhaivata, Nishada, Madhyama, Sadja, and Panchama as finite sets (also called enumerations).

Fig. 6. Categorical representation of notes in the Carnatic music tradition.

Fig. 7. Categorical representation of notes with respect to isomorphism.

Fig. 10. Universal mapping property (UMP) of the product to compare frequency ratios between svaras.

Fig. 11. Illustrating the universal mapping property of svaras shown in Fig. 10 with an example.

Fig. 8.
Projections from an M-arohana sequence to its svaras.
Fig. 9. Cartesian product formed from individual svara objects.
A. Modeling Rāgas
Let ⌜a svara⌝ denote the coproduct of the seven objects Sadja, Rish., etc., as described in Table I. All together this will be a coproduct of 16 terminal objects (2 × 1 element + 1 × 2 elements + 4 × 3 elements). Table I also defines a function freq_ratio : ⌜a svara⌝ → Q.4 As a minor abuse of notation, we will use the same name freq_ratio to denote the composition of this map with the coproduct injections, e.g., Rish. → ⌜a svara⌝ → Q. These svara objects are then connected into two sequences, 1) the arohana and 2) the avarohana, to form a rāga. Here, we must distinguish between Melakartha and Janya rāgas, as their mathematical structure is quite different. For a Melakartha rāga, we can identify its Sadja svara, its Rishabham svara, etc. Mathematically, this corresponds to a family of functions from a sequence object into each of the svara objects, as displayed in Fig. 8 for the Melakartha arohana (M-arohana) sequence. Together, these functions define a single map into the product object (see Fig. 9). Indeed, the arohana sequence is completely determined by the choice of these seven svaras, so the map M-arohana sequence → Svara product will be a monomorphism (i.e., an injection); we can say that these arrows are jointly monic. However, not every choice of svaras is allowed as an M-arohana sequence; the rules for identifying those which are acceptable were described in Section II. We can model these rules using truth functions and pullbacks. One of these rules says that the frequency of the Rishabham svara is less than that of the Gandharam svara (as displayed in Table I). To say this categorically we make use of a truth function less_than : Q × Q → {True, False}, defined by the rule less_than(p, q) = True if p < q, and False otherwise.
4 Here Q, the rational numbers, is the set of fractions.
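The coproduct structure of ⌜a svara⌝ can be made concrete in a few lines of Python. The frequency ratios below are illustrative just-intonation values, not the exact entries of Table I (which is not reproduced in the text); the enharmonic coincidences (e.g., R2 and G1 sharing a pitch) are what reduce the 16 svara names to 12 pitches.

```python
from fractions import Fraction

# The coproduct "a svara" is the disjoint union of the seven articulation
# objects, each a finite set of variants (16 terminal objects in total).
SVARAS = {
    "Sadja":     ["S"],
    "Rishabham": ["R1", "R2", "R3"],
    "Gandharam": ["G1", "G2", "G3"],
    "Madhyama":  ["M1", "M2"],
    "Panchama":  ["P"],
    "Dhaivata":  ["D1", "D2", "D3"],
    "Nishada":   ["N1", "N2", "N3"],
}

# freq_ratio : a svara -> Q.  Illustrative just-intonation values,
# NOT the exact entries of Table I.
FREQ_RATIO = {
    "S":  Fraction(1, 1),   "R1": Fraction(16, 15), "R2": Fraction(9, 8),
    "R3": Fraction(6, 5),   "G1": Fraction(9, 8),   "G2": Fraction(6, 5),
    "G3": Fraction(5, 4),   "M1": Fraction(4, 3),   "M2": Fraction(45, 32),
    "P":  Fraction(3, 2),   "D1": Fraction(8, 5),   "D2": Fraction(5, 3),
    "D3": Fraction(9, 5),   "N1": Fraction(5, 3),   "N2": Fraction(9, 5),
    "N3": Fraction(15, 8),
}

# The coproduct injections place each articulation's variants inside
# "a svara"; composing with freq_ratio restricts the map, as in the text.
a_svara = [v for variants in SVARAS.values() for v in variants]
assert len(a_svara) == 16                       # 16 basic notes
assert len(set(FREQ_RATIO.values())) == 12      # spread across 12 pitches
```

The two assertions record exactly the "16 notes on 12 pitches" count from Section II.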
Using the universal mapping property of the product, we can use this to define a truth function on Rish. × Gand., as in Fig. 10. Notice that the object Q appears twice in the preceding diagram. Usually when this occurs it means that the same object is involved in two or more different arrows; we write it multiple times for readability, even though it really is the same object. The best way to understand such diagrams is by tracing an element through the diagram; since different functions give different values, this allows us to see why the object appears twice. For example, we might trace the pair (R1, G1) through the diagram in Fig. 11. Since (R1, G1) maps to True, this is an allowable transition in Melakartha rāgas. If we traced (R2, G1) through the diagram we would end up at False, so this pair is not allowed. Distinguishing these cases requires the use of the pullback operation. Note that there is a map True : 1 → {True, False} which sends the unique element of 1 to True. As discussed in the supplementary section on CT, pulling our truth function back along this map will yield a subset of Rish. × Gand. consisting of only those pairs which map to the value True. We denote this pullback by Rish. < Gand. (see Fig. 12).
Fig. 12. Defining frequency-ordered pairs of svaras using a pullback property.
Fig. 13. Representation of the M-arohana sequence as a Cartesian product.
Fig. 14. Commutative diagram expressing the reversal symmetry of Melakartha rāgas.
Fig. 15. Illustrating the relationship between Melakartha rāga and Janya rāga.
Fig. 16. Using the UMP of the coproduct to define the type of a rāga.
Fig. 17. An additional relationship between Janya and Melakartha rāgas.
Similar reasoning applies to the restriction between the Dhaivata and Nishada svaras. This gives us the final definition shown in Fig. 13. Next, we turn to the relationship between the arohana and avarohana sequences. In Melakartha rāgas, each is the reverse of the other.
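In the category of sets, this pullback is simply a subset comprehension: the pairs of Rish. × Gand. whose image under the truth function is True. The sketch below uses illustrative frequency ratios (placeholders, not the entries of Table I), chosen so that R2 and G1 share a pitch, matching the traced examples above.

```python
from fractions import Fraction

# Truth function less_than : Q x Q -> {True, False}.
def less_than(p, q):
    return p < q

# Illustrative frequency ratios for the Rishabham and Gandharam variants
# (placeholder values, not the exact entries of Table I).
freq = {"R1": Fraction(16, 15), "R2": Fraction(9, 8), "R3": Fraction(6, 5),
        "G1": Fraction(9, 8),   "G2": Fraction(6, 5), "G3": Fraction(5, 4)}

RISH, GAND = ["R1", "R2", "R3"], ["G1", "G2", "G3"]

# The pullback of the truth function along True : 1 -> {True, False}
# carves out the subset Rish < Gand of the product Rish x Gand.
rish_lt_gand = {(r, g) for r in RISH for g in GAND
                if less_than(freq[r], freq[g])}

assert ("R1", "G1") in rish_lt_gand      # traces to True: allowable pair
assert ("R2", "G1") not in rish_lt_gand  # 9/8 < 9/8 is False: not allowed
```

Tracing any pair through the diagram of Fig. 11 amounts to evaluating this comprehension's predicate at that pair.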
Using products and coproducts, one can define a list operator ∗ which acts on objects in the category; given a set A, A∗ is the set of lists whose elements come from A. In particular, each arohana sequence is a list of svaras, so we will have a monic arrow from an M-arohana sequence into a list of svaras. The advantage of this point of view is that list objects come equipped with a variety of operations. In particular, we may reverse any sequence, corresponding to a map A∗ → A∗. We can then specify the relationship between the arohana and avarohana in Melakartha rāgas as an equation between the two paths, such that the diagram in Fig. 14 commutes. As discussed in Section II, this categorical model says that every Melakartha rāga is a sampoorna rāga, a fact expressed by the commutativity of that diagram. Next consider Janya rāgas; these are much less structured than Melakartha rāgas and, consequently, there is much less that we can say about them at a categorical level. It is true that Janya rāgas contain an arohana and an avarohana sequence. We can also classify which Janya rāgas are sampoorna using an analysis very similar to the one above. There is also a relationship between the two types of rāgas: every Janya rāga is derived from a particular Melakartha rāga. In our category, this corresponds to the map provided in Fig. 15. In general, a rāga may be either a Melakartha rāga or a Janya rāga (but not both), so we can represent the object ⌜a rāga⌝ as the coproduct of the ⌜a Melakartha rāga⌝ and ⌜a Janya rāga⌝ objects. We also find it useful to represent this information as a typing map. Consider the two-element set {J-type, M-type} ≅ 1 + 1. Using the fact that any object has a unique map to the terminal object, the universal property of the coproduct ⌜a rāga⌝ yields our typing map (see Fig. 16).
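The reversal symmetry of Fig. 14 is easy to state and check at the level of sets: following the arohana and then the reverse map must agree with the avarohana directly. The rāga below is an unnamed illustrative example, not a specific entry from the Melakartha scheme.

```python
# The list object A* comes equipped with a reverse map A* -> A*.
def reverse(seq):
    """reverse : A* -> A*"""
    return list(reversed(seq))

# Arohana and avarohana of an illustrative Melakartha-style raga
# (seven svaras; the upper Sadja is omitted for simplicity).
arohana   = ["S", "R1", "G2", "M1", "P", "D1", "N2"]
avarohana = ["N2", "D1", "P", "M1", "G2", "R1", "S"]

# Commutativity of the square in Fig. 14: "arohana, then reverse"
# agrees with "avarohana" directly.
assert reverse(arohana) == avarohana
```

Since the check uses every svara, it also witnesses that this example is sampoorna, mirroring the remark in the text.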
One final piece of information usually associated with a rāga is its emotion; historically, this is usually classified into one of nine categories. This corresponds to an arrow has : ⌜a rāga⌝ → ⌜an emotion⌝. In addition, this emotion is the same for a Janya rāga and the Melakartha rāga from which it is derived, corresponding to a final condition requiring that the two paths ⌜a Janya rāga⌝ → ⌜an emotion⌝ in the diagram in Fig. 17 agree. The above descriptions can be regarded as a recipe for defining a category which models the relationships between pieces and types of Carnatic rāgas. One begins by assembling all of the objects and arrows referred to above into a single graph (which we omit here for reasons of space). Next, one assembles a list of declared path equations and, finally, a list of all categorical constructions (i.e., products, pullbacks, monic arrows, coproducts, and finite sets) specified in our description. Mathematically, these lists of categorical constructions are called a sketch for our model, and there are well-known methods for generating a category from these basic constructions (see [74]); a complete description is beyond the scope of this paper, but it is close in spirit to the categories of Examples 2 and 2.1 in the supplementary section.
B. Modeling Rhythm Structure
As discussed in Section II, in order to describe a piece of music we must specify both its melody and its rhythm. In Carnatic music the rāga provides the first, while the second is determined by what we call a Carnatic rhythm structure. This can be broken down into three pieces: 1) the tāla; 2) the jati; and 3) the gati; the first two pieces determine a tāla structure.
Fig. 18. A tāla determines a sequence (list) of beats (of type O, U, or I).
Fig. 21. Categorical representation of an instance of a tāla structure.
Fig. 22. Categorical representation of Carnatic music rhythm structure.
Fig. 23.
Jatis and gatis with the same name also have related beat counts.
Fig. 19. A jati associates each type of beat (O, U, or I) with a natural number, its beat count.
Fig. 20. Categorical representation of tāla structure.
As explained in Section II-B, there are a fixed number of tālas (seven), of jatis (five), and of gatis (five), so each of these can be represented as a finite set (i.e., a coproduct of terminal objects). Each of these varies independently, so the ⌜a Carnatic rhythm structure⌝ object will be a product of these three (and consequently a coproduct of 7 × 5 × 5 = 175 terminal objects). We also distinguish the object of pairs (tāla, jati), which we call a tāla structure. The structural pattern associated with each tāla (as displayed in Table II) can be represented as a function, as shown in Fig. 18. Here, we use the “Kleene star” A∗ to denote the set of lists with elements from A. As given on the left-hand side of the same table, the number of laghu (I) beats associated with each jati determines a function from ⌜a jati⌝ to the natural numbers. More usefully, because we know that the numbers of beats associated with drutam (O) and anudrutam (U) are always the same (two and one, respectively), each jati determines a function {O, U, I} → N. For example, the Misra jati (I = 7) would be associated with the function in Fig. 19. As discussed in the supplementary material, applying this function entry by entry determines a related map between lists, i.e., an element of (N∗)^({O, U, I}∗); this is essentially the functoriality of the list operator. This shows that a tāla structure determines both an element of {O, U, I}∗ and a function in (N∗)^({O, U, I}∗). Since one is a function and the other is an input, we can pair these together and apply the evaluation function to obtain a list of natural numbers. Summing the resulting list, we obtain the number of beats per cycle associated with a tāla structure, as shown in Fig. 20.
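The beats-per-cycle computation of Fig. 20 can be traced in code: the tāla gives a list in {O, U, I}∗, the jati a function {O, U, I} → N, functoriality maps it over the list, and summation gives the total. The tāla patterns below are the standard ones from common references; since Table II is not reproduced in the text, treat them as an assumption rather than a quotation of the paper.

```python
# Standard Carnatic tala patterns (assumed, cf. Table II):
# O = drutam, U = anudrutam, I = laghu.
TALA = {
    "Dhruva":  ["I", "O", "I", "I"],
    "Matya":   ["I", "O", "I"],
    "Rupaka":  ["O", "I"],
    "Jhampa":  ["I", "U", "O"],
    "Triputa": ["I", "O", "O"],
    "Ata":     ["I", "I", "O", "O"],
    "Eka":     ["I"],
}

def jati_fn(laghu_count):
    # Drutam (O) and anudrutam (U) always count 2 and 1 beats;
    # the jati only fixes the length of the laghu (I).
    return {"O": 2, "U": 1, "I": laghu_count}

def beats_per_cycle(tala, jati):
    counts = jati_fn(jati)                       # {O, U, I} -> N
    per_beat = [counts[b] for b in TALA[tala]]   # functoriality + evaluation
    return sum(per_beat)                         # summing the resulting list

# Misra jati (I = 7) paired with Triputa tala:
assert beats_per_cycle("Triputa", 7) == 7 + 2 + 2  # 11 beats per cycle
```

Multiplying the result by the gati's notes-per-beat count then yields the notes-per-cycle value of Fig. 22.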
Just as we did for rāgas, we can examine this diagram by tracking an element through it (Fig. 21) [although this is made more difficult by the fact that a function List{O, U, I} → List(N) contains an infinite number of values]. Finally, we can combine the beats/cycle information generated by a tāla structure with the notes/beat information given by the gati in order to determine the total number of notes/cycle in the segment (Fig. 22). Although we do not need to use it, we should also note that there is an isomorphism between ⌜a gati⌝ and ⌜a jati⌝, implicit in Tables III and IV, and that this isomorphism commutes with the relevant counts associated with these objects, as shown in Fig. 23.
IV. SINGLE CONCEPTUAL MODEL FOR ANALYSIS
In this section, we describe a method of integrating the categorical models just defined (and others) into a single conceptual framework. We have already seen applications of limits and colimits inside a category. Now, we employ some of the same ideas to study relationships between categories.
Fig. 24. Categorical representation for analysis of Carnatic music.
First, consider the simple category in Fig. 24, which contains metadata related to a musical segment. A raw file5 records a Carnatic music segment, and this segment has five relevant pieces of metadata, including its title, artist, and composer. This is acceptable as a structure for storing data, but for analysis and validation we would like to integrate it with the categorical models from the previous sections. We can do this using pushouts of categories. Let R denote the categorical ontology for rāgas which we developed in the previous section, and M the metadata category presented in Fig. 24. Notice that both of these categories contain an object called ⌜a rāga⌝.6 We would like to glue these two categories together by identifying their rāga objects.
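At the level of object sets, this gluing is a pushout of sets: disjoint union followed by identification of the two chosen rāga objects. The object names below are illustrative stand-ins for the two categories, abbreviated to their objects.

```python
# Object sets of the metadata category M and the raga ontology R,
# abbreviated for illustration (names are stand-ins, not the full models).
M_objects = {"a raw file", "a music segment", "a title", "an artist",
             "a composer", "a raga (M)"}
R_objects = {"a raga (R)", "a Melakartha raga", "a Janya raga", "an emotion"}

# The two functors from 1 pick out the object to be identified in each.
raga_M, raga_R = "a raga (M)", "a raga (R)"

def pushout(left, right, glue_left, glue_right, glued_name):
    """Disjoint union of the two object sets, with the two chosen objects
    identified; nothing is added beyond what comes from left and right."""
    return (left - {glue_left}) | (right - {glue_right}) | {glued_name}

RM = pushout(M_objects, R_objects, raga_M, raga_R, "a raga")
assert "a raga" in RM
assert len(RM) == len(M_objects) + len(R_objects) - 1  # one identification
```

The full construction also glues arrows and respects composition; this sketch only shows the "smallest category containing both, with the shared piece identified once" intuition from the universal property.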
In CT, functors are used to relate structures in one category to analogous constructions in another [75]. Technically, these functors must preserve the categorical structures (e.g., products and coproducts) included in our model. Let 1 denote the category which contains only a single object and its identity arrow. Because of its particularly simple form, a functor from 1 into some other category C is determined by the choice of a single object in C. In particular, the ⌜a rāga⌝ objects in R and M define two functors ragaM : 1 → M and ragaR : 1 → R. Consider the pushout diagram in Fig. 25. The functors into RM show that this composite category contains both M and R as subcategories. Moreover, the two paths 1 → RM agree, meaning that the rāga objects in the two categories have been identified. By its universal property, RM is the smallest category satisfying these two properties: it contains nothing except the objects and arrows which come from R and M. This gives a formal description of our intuitive goal: we have glued our models together along their common piece. In particular, RM is fully specified by the categories M and R and the two maps ragaM and ragaR (see Fig. 25).
Similarly, let T denote the categorical ontology for tāla structures; exactly the same sort of argument allows us to integrate M and T. Moreover, these two pushouts form the legs of a third diagram, allowing us to iterate the pushout procedure. This yields a new category integrating both ontologies into our metadata category. Of course, given ontologies for the other metadata, such as database specifications for artist and composer data, we could glue these into our framework in exactly the same fashion (see Fig. 26).
5 We limit our attention to raw files that are Carnatic music segments.
6 Note, however, that unlike some other ontological approaches, it is not important that the naming of these objects agree.
Fig. 25. Integrating rāga and metadata models using pushout.
Fig. 26. Integrating rāga, tāla, and metadata models through iterated pushouts.
Fig. 27. Categorical representation of a task.
Fig. 28. The task type object is a finite set consisting of five tasks.
Fig. 29. Each piece of MIR metadata addresses a specific task type.
Fig. 30. Categorical model for an experiment evaluation.
Fig. 31. A commutative diagram expressing coherence between the task and result of an experiment.
Finally, we need to evaluate our model under various experimental conditions. Fig. 27 shows the categorical representation of a task. The object ⌜a task⌝ is a product consisting of three pieces of information: 1) a feature to be used; 2) a method of analysis; and 3) the type of information to be identified. The last of these, ⌜a task type⌝, is a finite set consisting of the elements shown in Fig. 28. If we let ⌜meta-data for MIR⌝ denote the corresponding coproduct, then attached to this object there is a typing map constructed in the same fashion as the rāga typing map introduced in Section III (see Fig. 29).
Next, we have the notion of an experiment, as depicted in Fig. 30. Roughly speaking, an experiment is an instance of a particular task which analyzes a raw file to generate a single result. This result can be mapped to ⌜meta-data for MIR⌝, corresponding to an arrow (see Fig. 32). There is an obvious consistency check here, which can be represented as a commutative diagram: the type of the result of an experiment is the same as the type of the task which the experiment performed. This is shown diagrammatically in Fig. 31. As shown in Fig. 30, we can also keep track of some additional information, such as the researcher and the date on which the experiment was conducted.
This categorical model allows a group of people to perform experiments on a given music segment with different methods and features, and different users can store, retrieve, and update the results. The framework can thus be used by a group of people for collaborative analysis and implementation. Notice that the models of an experiment and a task may be glued together along their overlap using the same colimit methods discussed above. Moreover, in any particular domain, the feature parameters, method parameters, and results objects may be expanded in a similar way to represent structured data.
Fig. 32 illustrates the complete categorical model for the analysis of Carnatic music with experimental evaluation, including the tasks performed on music segments. As suggested by the figure, an experiment is performed on a raw file which contains a recording of a Carnatic music segment. Every experiment involves a task which is performed using features and methods with appropriate parameters; these parameters vary with the type of feature and type of method used for a particular task. The idea behind this categorical model is that any user can execute the model and store his or her outputs for a particular task using different features and methods. These outputs can then be accessed or cross-verified by other people who work on the same music segment. The model shares information between a group of people and allows them to collaborate on research using the same music databases. This collaborative platform allows users to share a common model for modeling music for any processing task, to share code, and to use existing results for analysis and verification. The platform creates a common workspace where anyone can use a shared database and perform analysis in a consistent manner.
Notice, in particular, the dotted line, which represents a ground truth. Every file under consideration records a particular piece of Carnatic music, so there is such a function. However, this information is, in general, not available to us.
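When the ground truth of a segment is known, it can be compared against experimental results. The records and rāga names below are hypothetical illustrations; the function simply measures how often the two paths agree for a given task type.

```python
# Hypothetical experiment records: each carries the experimental result and,
# where available, the ground truth of the recorded segment.
experiments = [
    {"segment": "seg1", "task_type": "raga recognition",
     "result": "Kalyani", "ground_truth": "Kalyani"},
    {"segment": "seg2", "task_type": "raga recognition",
     "result": "Todi", "ground_truth": "Sankarabharanam"},
]

def path_agreement(experiments, task_type):
    """Fraction of experiments (of one task type) where the path through the
    result agrees with the path through the ground truth."""
    relevant = [e for e in experiments if e["task_type"] == task_type]
    agree = sum(e["result"] == e["ground_truth"] for e in relevant)
    return agree / len(relevant)

assert path_agreement(experiments, "raga recognition") == 0.5
```

In categorical terms this is the failure of the two paths to ⌜meta-data for MIR⌝ to commute, quantified as an accuracy score.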
In some sense, our goal is to identify (the metadata of) the recorded segment based on characteristics of the file. Alternatively, in situations where we do know the ground truth of a music segment, this provides us with additional paths from ⌜an experiment⌝ to ⌜meta-data for MIR⌝ (see Fig. 33). These alternate paths can help to verify our metadata. Ideally, the paths should agree (for any particular task type), but in practical experimental evaluations this cannot be expected in all cases. One path is derived from the ground truth about the music segment, and the other from the experimental evaluation; comparing the results of the two paths lets us judge the accuracy of our experimental methods. For example, Fig. 35 provides an instance of the task model for the rāga recognition task, using the HMM method applied to the pitch feature. It also gives an example of one element in the model, traced through the various maps. Other tasks may involve different parameters, but this basic model provides a framework enabling end users to integrate various pieces of information when implementing other tasks such as tāla recognition, singer identification, song name identification, etc.
V. USE CASES
This section provides experimental evaluation and database schemas for the categorical models discussed in Section IV. A database is a collection of data that is stored and organized so that the end user can easily and quickly access, manage, and update it. In particular, relational database models involve the storage of data in tables (also called relations). In the relational model, each column of a table is called an attribute and each row is called an entity or record. Each record is uniquely determined by an identifier called a primary key. Columns of other tables which refer to this primary key are called foreign keys, and these can be regarded as arrows in a category.
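The "arrows as foreign keys" correspondence can be sketched mechanically: objects become tables, and each arrow out of an object becomes a foreign-key column on that object's table. The schema fragment below is a reduced illustration, not the full model of Fig. 36, and the generated SQL is one plausible rendering rather than the paper's implementation.

```python
# A reduced categorical schema: objects plus arrows (name, source, target).
schema = {
    "objects": ["raga", "segment", "artist"],
    "arrows": [("has_raga", "segment", "raga"),
               ("has_artist", "segment", "artist")],
}

def table_name(obj):
    return obj.replace(" ", "_")

def to_ddl(schema):
    """Render each object as a table and each arrow as a foreign-key column."""
    fks = {}
    for name, src, tgt in schema["arrows"]:
        fks.setdefault(src, []).append((name, tgt))
    stmts = []
    for obj in schema["objects"]:
        cols = ["id INTEGER PRIMARY KEY"]
        for col, tgt in fks.get(obj, []):
            cols.append(f"{col} INTEGER REFERENCES {table_name(tgt)}(id)")
        stmts.append(f"CREATE TABLE {table_name(obj)} ({', '.join(cols)});")
    return stmts

ddl = to_ddl(schema)
assert any("REFERENCES raga" in s for s in ddl)  # has_raga became a FK
```

Here the arrow `has_raga : segment → raga` surfaces as a foreign-key column on the `segment` table, which is exactly the correspondence described above.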
In fact, Spivak has shown [64], [65] that (finitely presented) categories can be converted into database specifications (and vice versa), so in defining our categorical models we have already provided a database model. The fact that categorical models can be translated into database schemas facilitates the storage, analysis, and fine-tuning of the data based on our models. The resulting database provides users access to the data for further analysis and processing tasks. See [64], [65] for an explanation of how a categorical model can be represented as a database schema along with its integrity constraints. Via this approach we can use our existing models of rāga, tāla, etc., to define our underlying database models. In fact, in most cases we do not want to store all of the data associated with our model. For example, the database schema depicted in Fig. 36 includes only some of the objects and arrows from the detailed model in Section III; this might be the only data needed for metadata-based retrieval purposes. The decision of which data to store and which to throw away can be encoded as a functor from a reduced database schema into the full categorical model, sending each table in Fig. 36 to the object of the categorical model which has the same name. Based on this mapping, there is a canonical way to migrate data from the complex schema into the simpler one. These are called Δ-migrations, and are discussed in [65].
Fig. 32. Single conceptual model for analysis of Carnatic music.
Fig. 33. Categorical representation of an experimental analysis of a music segment.
Fig. 34. Categorical representation of rāga task.
Fig. 35. Categorical representation of rāga task with example elements.
Fig. 36. Database model for the Carnatic music conceptual structure.
Fig. 37 depicts a generic model for the implementation and analysis of music segments.
This generic model can be visualized as a three-layered structure. The first layer concentrates on a categorical representation of the characteristics of Carnatic music. This layer also details the methods, features, and tasks which are used to perform a particular analysis. While implementing such a task, we may pull characteristics of the music from its corresponding models. After a task concludes, its results are stored in the database of the second layer. As explained in Fig. 36, the second layer of the generic model is a database schema which mediates between our category-theoretic ontology and the query processing system presented to users. This layer allows us to create a database schema based on (a fragment of) the CT models provided in the first layer, so that the end user can access the required information from the database through the query processing system. The last layer of the model supports query analysis for information retrieval purposes. This layer also integrates the processing of results based on different methods, which may be evaluated on different features. It allows the end user to access or retrieve information about desired music segments; the end user can also find the methods and features used for a particular task for further analysis. This layer extracts the data from the database for any query analysis. In this way, the unified conceptual model allows a group of people to share results and to obtain complete details about the tasks, features, and results associated with a music segment in the database. We can say that this model allows people to conduct collaborative research.
Fig. 37. Graphical interpretation of generic model for analysis of Carnatic music.
VI. CONCLUSION
In the area of MIR for Carnatic music, developing metadata for music segments is an important and challenging task. Presently, some efforts have been made to provide metadata for music segments, but these efforts have been somewhat disparate, making it difficult to compare their results. This paper discusses a categorical approach to modeling Carnatic music for various processing tasks in MIR applications. Despite the complexity of traditional Carnatic music and its basis in rāga and tāla structures, we have formally characterized these complexities using the categorical approach. The conceptual model proposed in this paper provides a means to model any music segment for finding metadata for information retrieval purposes. With respect to collaborative research, the proposed categorical model provides a common platform that allows users or researchers to share data and verify results on a common database. This paper also addresses the mapping of categorical structures into database schemas for storage and retrieval purposes. The framework developed in this paper will allow a group of people to collaboratively conduct analysis using a categorical model on a common corpus of music segments. This paper provides a framework to formalize the relationship between Carnatic music structure, prior research, tasks previously attempted, and the methods and features used for different music processing tasks. The general model proposed in this paper can be adapted to other domains by replacing our model of Carnatic music with models applicable to other domain problems. As future work, the proposed categorical model can be extended to characterize other musical traditions, including Western classical music and Hindustani music.
ACKNOWLEDGMENT
The authors would like to thank D. Spivak, who helped to improve the CT models defined in this paper, especially the models for tāla structure. The authors also thank R. Wisnesky and S.
Adhinarayanan for their valuable comments and suggestions for improving the consistency and readability of this paper.
REFERENCES
[1] N. Orio, “Music retrieval: A tutorial and review,” Found. Trends Inf. Retrieval, vol. 1, no. 1, pp. 1–90, 2006.
[2] M. A. Casey et al., “Content-based music information retrieval: Current directions and future challenges,” Proc. IEEE, vol. 96, no. 4, pp. 668–696, Apr. 2008.
[3] R. Typke, F. Wiering, and R. C. Veltkamp, “A survey of music information retrieval systems,” in Proc. ISMIR, London, U.K., 2005, pp. 153–160.
[4] E. D. Scheirer, “Tempo and beat analysis of acoustic musical signals,” J. Acoust. Soc. America, vol. 103, no. 1, pp. 588–601, 1998.
[5] M. A. Alonso, G. Richard, and B. David, “Tempo and beat estimation of musical signals,” in Proc. ISMIR, Barcelona, Spain, 2004, pp. 158–163.
[6] P. Herrera-Boyer, G. Peeters, and S. Dubnov, “Automatic classification of musical instrument sounds,” J. New Music Res., vol. 32, no. 1, pp. 3–21, 2003.
[7] T. Kitahara, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, “Instrument identification in polyphonic music: Feature weighting to minimize influence of sound overlaps,” EURASIP J. Appl. Signal Process., vol. 1, no. 1, p. 155, 2007.
[8] B. L. Sturm, M. Morvidone, and L. Daudet, “Musical instrument identification using multiscale mel-frequency cepstral coefficients,” in Proc. 18th Eur. Signal Process. Conf., Aalborg, Denmark, Aug. 2010, pp. 477–481.
[9] M. Marolt, “SONIC: Transcription of polyphonic piano music with neural networks,” in Proc. Workshop Current Res. Directions Comput. Music, Barcelona, Spain, 2001, pp. 217–224.
[10] H. G. Ranjani, S. Arthi, and T. V. Sreenivas, “Carnatic music analysis: Shadja, swara identification and raga verification in alapana using stochastic models,” in Proc. IEEE Workshop Appl. Signal Process. Audio Acoust. (WASPAA), New Paltz, NY, USA, Oct. 2011, pp. 29–32.
[11] R. Sridhar and T. V.
Geetha, “Raga identification of Carnatic music for music information retrieval,” Int. J. Recent Trends Eng., vol. 1, no. 1, pp. 571–574, 2009.
[12] R. Sridhar and T. V. Geetha, “Music information retrieval of Carnatic songs based on Carnatic music singer identification,” in Proc. Int. Conf. Comput. Elect. Eng., 2008, pp. 407–411.
[13] L. Pesch, The Oxford Illustrated Companion to South Indian Classical Music. New Delhi, India: Oxford Univ. Press, 2009.
[14] R. M. Rangayyan, “An introduction to the classical music of India,” Dept. Elect. Comput. Eng., Univ. Calgary, Calgary, AB, Canada.
[15] T. M. Krishna, A Southern Music: The Karnatik Story. Noida, India: HarperCollins, 2013.
[16] M. Andreatta, A. Ehresmann, R. Guitart, and G. Mazzola, “Towards a categorical theory of creativity for music, discourse, and cognition,” in Mathematics and Computation in Music. Heidelberg, Germany: Springer, 2013, pp. 19–37.
[17] E. Dennis-Jones and D. E. Rydeheard, “Categorical ML—Category-theoretic modular programming,” Formal Aspects Comput., vol. 5, no. 4, pp. 337–366, 1993.
[18] K. Williamson, M. Healy, and R. Barker, “Industrial applications of software synthesis via category theory—Case studies using Specware,” Autom. Softw. Eng., vol. 8, no. 1, pp. 7–30, 2001.
[19] R. Tate, M. Stepp, and S. Lerner, “Generating compiler optimizations from proofs,” ACM SIGPLAN Notices, vol. 45, no. 1, pp. 389–402, 2010.
[20] J. C. Reynolds, “Using category theory to design implicit conversions and generic operators,” in Semantics-Directed Compiler Generation. New York, NY, USA: Springer, 1980, pp. 211–258.
[21] Z. Diskin and T. Maibaum, “Category theory and model-driven engineering: From formal semantics to design patterns and beyond,” in Model-Driven Engineering of Information Systems: Principles, Techniques, and Practice. Toronto, ON, Canada: Apple Acad. Press, 2014, p. 173.
[22] T. Giesa, D. I. Spivak, and M. J.
Buehler, “Category theory based solution for the building block replacement problem in materials design,” Adv. Eng. Mater., vol. 14, no. 9, pp. 810–817, 2012.
[23] D. I. Spivak, T. Giesa, E. Wood, and M. J. Buehler, “Category theoretic analysis of hierarchical protein materials and social networks,” PLoS ONE, vol. 6, no. 9, pp. 1–15, Sep. 2011.
[24] J. C. Baez and A. Lauda, “A prehistory of n-categorical physics,” in Deep Beauty: Understanding the Quantum World Through Mathematical Innovation, 2009, pp. 13–128.
[25] J.-C. Letelier, J. Soto-Andrade, F. G. Abarzua, A. Cornish-Bowden, and M. Luz Cárdenas, “Organizational invariance and metabolic closure: Analysis in terms of (M,R) systems,” J. Theor. Biol., vol. 238, no. 4, pp. 949–961, 2006.
[26] D. Ellerman, “Determination through universals: An application of category theory in the life sciences,” unpublished paper. [Online]. Available: https://arxiv.org/abs/1305.6958
[27] V. G. Paranjape, “Indian music and aesthetics,” J. Music Acad. Madras, vol. XXVIII, pp. 68–71, 1957.
[28] P. Lavezzoli, The Dawn of Indian Music in the West. New York, NY, USA: Bhairavi, Continuum, 2006.
[29] A. Bellur, V. Ishwar, X. Serra, and H. A. Murthy, “A knowledge based signal processing approach to tonic identification in Indian classical music,” in Proc. 2nd CompMusic Workshop, Istanbul, Turkey, 2012, pp. 113–116.
[30] G. K. Koduri and B. Indurkhya, “A behavioral study of emotions in South Indian classical music and its implications in music recommendation systems,” in Proc. ACM Workshop Soc. Adapt. Pers. Multimedia Interact. Access, Florence, Italy, 2010, pp. 55–60.
[31] P. Sarala, V. Ishwar, A. Bellur, and H. A. Murthy, “Applause identification and its relevance to archival of Carnatic music,” in Proc. Workshop Comput. Music, Istanbul, Turkey, Jul. 2012, pp. 66–71.
[32] V. Ishwar, A. Bellur, and H. A. Murthy, “Motivic analysis and its relevance to raga identification in Carnatic music,” in Proc. Workshop Comput.
Music, Istanbul, Turkey, Jul. 2012, pp. 153–157.
[33] S. Gulati, J. Salamon, and X. Serra, “A two-stage approach for tonic identification in Indian art music,” in Proc. Workshop Comput. Music, Istanbul, Turkey, Jul. 2012, pp. 119–127.
[34] G. K. Koduri, J. Serrà, and X. Serra, “Characterization of intonation in Karnataka music by parametrizing context-based svara distributions,” in Proc. Workshop Comput. Music, Istanbul, Turkey, Jul. 2012, pp. 128–132.
[35] J. C. Ross and P. Rao, “Detection of raga-characteristic phrases from Hindustani classical music audio,” in Proc. Workshop Comput. Music, Istanbul, Turkey, Jul. 2012, pp. 133–138.
[36] A. Vidwans, K. K. Ganguli, and P. Rao, “Classification of Indian classical vocal styles from melodic contours,” in Proc. Workshop Comput. Music, Istanbul, Turkey, Jul. 2012, pp. 139–146.
[37] B. Bozkurt, “Features for analysis of Makam music,” in Proc. Workshop Comput. Music, Istanbul, Turkey, Jul. 2012, pp. 61–65.
[38] A. Srinivasamurthy, S. Subramanian, T. Gregoire, and P. Chordia, “A beat tracking approach to complete description of rhythm in Indian classical music,” in Proc. Workshop Comput. Music, Istanbul, Turkey, Jul. 2012, pp. 72–78.
[39] P. Sarala and H. A. Murthy, “Cent filter-banks and its relevance to identifying the main song in Carnatic music,” in Proc. Comput. Music Multidiscipl. Res. (CMMR), Marseilles, France, Oct. 2013, pp. 659–681.
[40] V. Ishwar, S. Dutta, A. Bellur, and H. A. Murthy, “Motif spotting in an alapana in Carnatic music,” in Proc. Int. Soc. Music Inf. Retrieval (ISMIR), Curitiba, Brazil, Nov. 2013, pp. 499–504.
[41] A. Bellur and H. A. Murthy, “A novel application of group delay function for identifying tonic in Carnatic music,” in Proc. EUSIPCO, Marrakesh, Morocco, 2013, pp. 1–5.
[42] J. C. Ross, T. P. Vinutha, and P. Rao, “Detecting melodic motifs from audio for Hindustani classical music,” in Proc. Int. Soc. Music Inf. Retrieval (ISMIR), Porto, Portugal, Oct. 2012, pp. 193–198.
[43] P.
Sarala and H. A. Murthy, “Inter and intra item segmentation of continuous audio recordings of Carnatic music for archival,” in Proc. Int. Soc. Music Inf. Retrieval (ISMIR), Curitiba, Brazil, Nov. 2013, pp. 487–492. [44] J. Kuriakose, J. C. Kumar, P. Sarala, H. A. Murthy, and U. K. Sivaraman, “Akshara transcription of mrudangam strokes in Carnatic music,” in Proc. 21st IEEE Nat. Conf. Commun. (NCC), Mumbai, India, 2015, pp. 1–6. [45] CUED. (2002). HTK Speech Recognition Toolkit. [Online]. Available: http://htk.eng.cam.ac.uk [46] B. Logan, “Mel frequency cepstral coefficients for music modeling,” in Proc. Int. Symp. Music Inf. Retrieval, Plymouth, MA, USA, 2000. [47] M.-L. Zhang and Z.-H. Zhou, “ML-KNN: A lazy learning approach to multi-label learning,” Pattern Recognit., vol. 40, no. 7, pp. 2038–2048, 2007. [48] D. Gerhard, “Pitch extraction and fundamental frequency: History and current techniques,” Dept. Comput. Sci., Univ. Regina, Regina, SK, Canada, Tech. Rep. TR-CS 2003-6, 2003. PADI et al.: MODELING AND ANALYSIS OF INDIAN CARNATIC MUSIC USING CT [49] B. Chandrasekaran, J. R. Josephson, and V. R. Benjamins, “What are ontologies, and why do we need them?” IEEE Intell. Syst., vol. 14, no. 1, pp. 20–26, Jan. 1999. [50] P. M. Simons, Parts: A Study in Ontology. Oxford, U.K.: Oxford Univ. Press, 1987. [51] D. Fensel, Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce, 2nd ed. Heidelberg, Germany: Springer-Verlag, 2003. [52] D. Fensel, “Ontology-based knowledge management,” Computer, vol. 35, no. 11, pp. 56–59, 2002. [53] G. Antoniou and F. Van Harmelen, “Web ontology language: Owl,” in Handbook on Ontologies. Heidelberg, Germany: Springer, 2004, pp. 67–92. [54] G. Antoniou, E. Franconi, and F. Van Harmelen, “Introduction to semantic Web ontology languages,” in Reasoning Web. Heidelberg, Germany: Springer, 2005, pp. 1–21. [55] D. L. McGuinness and F. Van Harmelen, “Owl Web ontology language overview,” W3C Recommendation, vol. 10, no. 
10, 2004. [Online]. Available: https://www.w3.org/TR/owl-features/ [56] S. Song, M. Kim, S. Rho, and E. Hwang, “Music ontology for mood and situation reasoning to support music retrieval and recommendation,” in Proc. 3rd Int. Conf. Digit. Soc. (ICDS), Cancún, Mexico, Feb. 2009, pp. 304–309. [57] B. Fields, K. Page, D. De Roure, and T. Crawford, “The segment ontology: Bridging music-generic and domain-specific,” in Proc. IEEE Int. Conf. Multimedia Expo (ICME), Barcelona, Spain, 2011, pp. 1–6. [58] A. Gali, C. X. Chen, K. T. Claypool, and R. Uceda-Sosa, “From ontology to relational databases,” in Conceptual Modeling for Advanced Application Domains. Heidelberg, Germany: Springer, 2004, pp. 278–289. [59] B. Motik, I. Horrocks, and U. Sattler, “Bridging the gap between owl and relational databases,” Web Semantics Sci. Services Agents World Wide Web, vol. 7, no. 2, pp. 74–89, 2009. [60] A. Gangemi, N. Guarino, C. Masolo, A. Oltramari, and L. Schneider, “Sweetening Ontologies with DOLCE,” in Proc. Int. Conf. Knowl. Eng. Knowl. Manag., Sigüenza, Spain, 2002, pp. 166–181. [61] S. Eilenberg and S. MacLane, “General theory of natural equivalences,” Trans. Amer. Math. Soc., vol. 58, no. 2, pp. 231–294, 1945. [62] R. Rosebrugh and R. J. Wood, “Relational databases and indexed categories,” in Proc. Int. Category Theory Meeting CMS Conf., vol. 13. 1992, pp. 391–407. [63] M. Johnson and R. Rosebrugh, “Sketch data models, relational schema and data specifications,” Electron. Notes Theor. Comput. Sci., vol. 61, pp. 51–63, Jan. 2002. [64] D. I. Spivak, Category Theory for the Sciences. Cambridge, MA, USA: MIT Press, 2014. [65] H. Forssell, H. R. Gylterud, and D. I. Spivak, “Type theoretical databases,” in Proc. Int. Symp. Logic. Found. Comput. Sci., 2016, pp. 117–129. [66] Z. Diskin, “Generalized sketches as an algebraic graph-based framework for semantic modeling and database design,” Faculty Phys. Math., Univ. Latvia, Riga, Latvia, Tech. Rep. M-97, 1997. [67] Z. Diskin and U. 
Wolter, “A diagrammatic logic for object-oriented visual modeling,” Electron. Notes Theor. Comput. Sci., vol. 203, no. 6, pp. 19–41, 2008. [68] A. Rutle, A. Rossini, Y. Lamo, and U. Wolter, “A category-theoretical approach to the formalisation of version control in MDE,” in Proc. Int. Conf. Fundam. Approaches Softw. Eng., York, U.K., 2009, pp. 64–78. [69] Z. Diskin, “Mathematics of UML,” in Practical Foundations of Business System Specifications. Amsterdam, The Netherlands: Springer, 2003, pp. 145–178. [70] S. Padi, S. Breiner, E. Subrahmanian, and R. D. Sriram, “Category theoretical approaches for deconstructing the semantics of UML class diagrams,” in preparation. [71] I. Cafezeiro and E. H. Haeusler, “Semantic interoperability via category theory,” in Proc. 26th Int. Conf. Conceptual Model. Tuts. Posters Panels Ind. Contribut. (ER), vol. 83. Auckland, New Zealand, 2007, pp. 197–202. [72] F. McNeill, A. Bundy, and C. Walton, “Diagnosing and repairing ontological mismatches,” in Proc. Stairs 2nd Starting Ai Res. Symp., vol. 109. Amsterdam, The Netherlands, 2004, p. 241. [73] F. McNeill, “Dynamic ontology refinement,” Ph.D. dissertation, Div. Informat., Univ. at Edinburgh, Edinburgh, U.K., 2006. [74] M. Barr and C. Wells, Eds., Category Theory for Computing Science, 2nd ed. Hertfordshire, U.K.: Prentice-Hall, 1995. [75] S. M. Lane, Categories for the Working Mathematician (Graduate Texts in Mathematics), vol. 5. New York, NY, USA: Springer-Verlag, 1971. 981 Sarala Padi received the Ph.D. degree from the Indian Institute of Technology Madras, Chennai, India, in 2014. She is currently a Guest Researcher with the National Institute of Standards and Technology, Gaithersburg, MD, USA. Her area of specializations are machine learning and signal processing techniques for efficient music information retrieval and automatic indexing purposes, text mining, and NLP methods for processing of text documents for various applications. 
Her current research interests include applying category theory to modeling and knowledge representation, and applying deep learning models to medical image analysis.

Spencer Breiner received the master’s and Ph.D. degrees from Carnegie Mellon University, Pittsburgh, PA, USA, in 2010 and 2013, respectively. He joined the U.S. National Institute of Standards and Technology, Gaithersburg, MD, USA, in 2015 under a Post-Doctoral Grant from the National Research Council. His current research interests include applications of the mathematical field of category theory to real-world problems.

Eswaran Subrahmanian received the Ph.D. degree from Carnegie Mellon University, Pittsburgh, PA, USA, in 1987. He is a Research Professor with the Institute for Complex Engineered Systems and the Department of Engineering and Public Policy, Carnegie Mellon University, Pittsburgh, PA, USA. He has published over 100 refereed papers and has co-edited three books, on empirical studies in design, design engineering, and the organizational implications of knowledge management. He also co-authored a book setting the global research agenda for ICT in development, and has co-edited three special issues in the areas of design theory, engineering informatics, and annotations in engineering design. His current research interests include design theory, design support systems, information modeling for engineering, collaborative engineering, and engineering education. Dr. Subrahmanian was a recipient of the Steven Fenves Award for Systems Engineering at CMU. He is a Distinguished Scientist of the Association for Computing Machinery, a Fellow of the American Association for the Advancement of Science, and a member of the Design Society.

Ram D. Sriram (S’82–M’85–SM’00–F’17) was on the engineering faculty of the Massachusetts Institute of Technology, Cambridge, MA, USA, from 1986 to 1994, where he was instrumental in setting up the Intelligent Engineering Systems Laboratory.
He is currently the Chief of the Software and Systems Division at the National Institute of Standards and Technology, Gaithersburg, MD, USA. He is a Distinguished Alumnus of the Indian Institute of Technology and of Carnegie Mellon University, Pittsburgh, PA, USA. He has authored or co-authored nearly 250 papers, books, and reports. His current research interests include developing knowledge-based expert systems, natural language interfaces, machine learning, object-oriented software development, life-cycle product and process models, geometric modelers, object-oriented databases for industrial applications, health care informatics, bioinformatics, and bioimaging. Dr. Sriram was a recipient of the NSF’s Presidential Young Investigator Award in 1989, the ASME Design Automation Award in 2011, the ASME CIE Distinguished Service Award in 2014, and the Washington Academy of Sciences’ Distinguished Career in Engineering Sciences Award in 2015.