Temporal type constructors for computer music programming
Thesis committee:
  Roger Dannenberg (Chair)
  Guy Blelloch
  Robert Harper
  Perry Cook, Princeton University

0 Computer music programming
Subdomains:
  Digital signal processing.
  Response to asynchronous events.
  Representations of musical and sonic structure.
Example applications:
  Synthesize audio from a musical score.
  Abstract features from audio; alter features.
  Transform audio to compress it.

1 Analysis of audio
[Figure labels: analyze, (modify), resynthesize; abstract, render; amplitude, frequency; pp, f.]

2 The goals
Computer music programming should be
  expressive: programs are clear and concise.
  general: programs fall within the expressive range.

3 The current tradeoff
[Figure: axes labelled expressive and general. Points: unit-generator programming (Csound), low-level programming (C++), and the promised land.]

4 What we have now
“Unit generator” programming (Csound).
  User configures black-box audio processors.
  Can’t express new DSP or new kinds of data.
  New kind of data: spectral frames, for example.
Low-level programming (C++).
  Cumbersome without a computer music library.
  Libraries don’t support new kinds of data, and don’t give much benefit for new DSP.

5 What do we need?
Write arbitrary DSP in a high-level language.
  No more writing unit generators in C.
Types higher and lower than “audio stream”.
  higher: analysis frames for a new representation.
  lower: access to individual samples for new DSP.

6 My proposal
Temporal type constructors.
  Proposed set: event, vector, infinite vector.
  Enable a pure applicative programming style.
Through temporal type constructors, computer music programming can be both expressive and general.

7 A taste of the results
Chronic is a prototype system using this idea.
The FOF synthesis algorithm can’t be written in Csound.
  C implementation is 235 lines, and awkward.
  Chronic implementation is 34 lines, and closer to our idea of the algorithm.

8 Outline
Temporal type constructors.
Code examples in Chronic.
Related work.
Chronic internals.
Future work.
Conclusions.

9 Temporal type constructors
'a event: a timestamped 'a, e.g. as a pair ('a, time).
'a vec: a finite vector of 'a, i.e. an array of elements.
'a ivec: an infinite vector of 'a, i.e. a time-indexed stream.
time: integer sample count.

10 Digital audio stream
audio = sample ivec
(float might be chosen as the sample type.)
[Figure: a row of samples S.]

11 Multi-channel audio stream
multi_audio = sample ivec vec
[Figure: several rows of samples S.]

12 Short-time spectrum data
spectra = complex vec ivec
[Figure: a sequence of vectors of complex values C.]

13 A chord sequence
chordseq = pitch vec event vec
[Figure: vectors of pitches P, each attached to a timestamp @.]

14 Musical-keyboard events
MIDI = (pitch * velocity) event ivec
[Figure: pitch/velocity pairs P V, each attached to a timestamp @.]

15 Gestural musical events
violin = (pitch vec * bowing vec) event ivec
[Figure: vectors of pitches P paired with vectors of bowing values B, each pair attached to a timestamp @.]

16 Explicit vs. implicit time
Implicit (Csound): out = -in
  Code runs in a context holding the current time:
    for (t=0; ; ++t) out[t] = -in[t]
  Looped unavoidably; hope it’s what you want.
Explicit (Chronic): out = map (λx. -x) in
  out and in are of type float ivec, i.e. they are explicitly temporal data.
Explicit model, with map, subsumes implicit.

17 Explicit time
Time information is built into the data.
Code can stand outside of time,
  vs. operating within some implicit “now”.
Advantages:
  A strictly more powerful model of time.
    Implicit time can do delay, but can’t do the inverse.
  Types are more tractable than code.
    The FOF example will show how this works.

18 Chronic
Built inside O’Caml as a set of libraries.
core:
  E     event
  V     vec
  IV    ivec
  EV    event vec
  EIV   event ivec
library:
  L     float and other
  LV    vec
  …

19 A couple of IV functions
  IV.iterate (fun x -> x +. 0.5) 1.
    yields [| 1.; 1.5; 2.; 2.5; ... |]
  (* y = IV.map succ (IV.delay 2 y) *)
  IV.delay_rec 2 [| 0; 5 |] (IV.map succ)
    yields [| 0; 5; 1; 6; ... |]

20 A couple of library functions
  let fs = 44100. (* sampling frequency *)
  LV.osc_sine 1000 (220./.fs) 0.25
    (* 1000 samples of 220-Hz cosine *)
  LIV.para_eq (1200./.fs) 12. 0.5 x
    (* filter x to boost a 0.5-octave band around 1200 Hz by 12 dB *)

21 Examples built with Chronic
FOF synthesis.
Computer-music scores.
Two reverberators.
An FFT-based pitch shifter.

22 FOF synthesis
Makes a sound with a peak in its spectrum.
[Figure: spectrum, level against frequency, with the pitch and the peak marked.]

23 The FOF waveform
Series of enveloped sine-wave ‘grains’.
[Figure: waveform with the intervals 1 / Fpitch and 1 / Fpeak marked.]

24 A fitting data type
grains: float vec event ivec
[Figure: vectors of floats F, each attached to a timestamp @.]

25 Skeleton of FOF code
  let fof (f_pitch: float ivec) phase0 f_peak bandwidth db
          rise fall dur (risefalltab: float vec) =
    let grain_times = LIV.phasor_wrap f_pitch phase0 in   (* float ivec *)
    let fgrain t =                                        (* float -> float vec event *)
      ... (* miracle occurs *)
      in grain @@ (int t) in
    let grains = IV.map fgrain grain_times in             (* float vec event ivec *)
    EIV.vfold (+.) 0. grains                              (* float ivec *)

26 The missing piece
  let fgrain t =
    let sine = LV.osc_sine dur f_peak (~-.(frac t) *. f_peak) in
    let kenv = exp(~-.pi*.bandwidth) in
    let env = V.iterate (fun x -> kenv *. x) 1.0 dur in
    let smooth_ph = EV.pwl_list 0.0
      [(rise, 1.); (dur-1-fall, 1.); (dur-1, 0.)] in
    let smooth = LV.tablei risefalltab smooth_ph in
    let ampl = L.db_to_amp db in
    let grain = V.map3 (fun x y z -> ampl *. x*.y*.z) sine env smooth
    in grain @@ (int t)

27 FOF in Chronic vs. in C
What I showed you was slightly simplified.
  Less time-varying control, no “octaviation”.
This was 19 lines; full FOF is 34.
Csound’s FOF in C is 235.
More importantly, it’s unintuitive.
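
Note on the last line of the FOF skeleton: EIV.vfold (+.) 0. sums timestamped grains into one stream. The following is a minimal sketch of that idea in plain OCaml, not Chronic's implementation. The record type event and the function vfold_prefix are hypothetical names introduced here, and the sketch computes only a finite prefix of the output from a finite list of grains, whereas Chronic's version consumes and produces infinite vectors.

```ocaml
(* Hypothetical stand-in for a timestamped value: payload plus start time
   in samples. Not Chronic's own event type. *)
type 'a event = { value : 'a; time : int }

(* Combine timestamped vectors into the first [len] elements of the output.
   Every output element starts as [zero]; each grain element is folded in
   with [op] at index (grain start time + offset within the grain).
   Elements that fall outside 0 .. len-1 are ignored. *)
let vfold_prefix (op : 'a -> 'a -> 'a) (zero : 'a) (len : int)
    (grains : 'a array event list) : 'a array =
  let out = Array.make len zero in
  List.iter
    (fun g ->
      Array.iteri
        (fun i x ->
          let t = g.time + i in
          if t >= 0 && t < len then out.(t) <- op out.(t) x)
        g.value)
    grains;
  out

(* Two overlapping integer "grains", starting at times 2 and 4. *)
let two_grains =
  vfold_prefix ( + ) 0 8
    [ { value = [| 1; 1; 1; 1 |]; time = 2 };
      { value = [| 2; 2; 2; 2 |]; time = 4 } ]

let () =
  Array.iter (fun x -> Printf.printf "%d " x) two_grains;
  print_newline ()
```

The loop here runs over grains, and the position of each grain in the output comes from its timestamp rather than from a notion of "now".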
28 FOF in C: what goes wrong?
Can’t represent grains.
Can’t stand outside of time.
  Has to loop over output samples, and think
  “What is the set of active grains right now? Are some dying? Are new ones born? Which envelopes are in their rise phase? entering fall phase? …”
  You don’t want to think that way about FOF.
Want to loop over grains, not samples.

[Slide background: the C source code.]

  #include "cs.h"                 /*              UGENS7.C        */
  #include "ugens7.h"
  #include <math.h>

  /* loosely based on code of Michael Clarke, University of Huddersfield */

  #define FZERO   (0.0f)
  #define FONE    (1.0f)

  static int newpulse(FOFS *, OVRLAP *, float *, float *, float *);

  void fofset0(FOFS *p, int flag)
  {
      if ((p->ftp1 = ftfind(p->ifna)) != NULL &&
          (p->ftp2 = ftfind(p->ifnb)) != NULL) {
          OVRLAP *ovp, *nxtovp;
          long   olaps;
          p->durtogo = (long)(*p->itotdur * esr);
          if (*p->iphs == FZERO)                  /* if fundphs zero,  */
              p->fundphs = MAXLEN;                /*   trigger new FOF */
          else p->fundphs = (long)(*p->iphs * fmaxlen) & PMASK;
          if ((olaps = (long)*p->iolaps) <= 0) {
              initerror("illegal value for iolaps");
              return;
          }
          auxalloc((long)olaps * sizeof(OVRLAP), &p->auxch);
          ovp = &p->basovrlap;
          nxtovp = (OVRLAP *) p->auxch.auxp;
          do {
              ovp->nxtact = NULL;
              ovp->nxtfree = nxtovp;              /* link the ovlap spaces */
              ovp = nxtovp++;
          } while (--olaps);
          ovp->nxtact = NULL;
          ovp->nxtfree = NULL;
          p->fofcount = -1;
          p->prvband = FZERO;
          p->expamp = FONE;
          p->prvsmps = 0;
          p->preamp = FONE;
          p->xincod  = (p->XINCODE & 0x7) ? 1 : 0;
          p->ampcod  = (p->XINCODE & 0x2) ? 1 : 0;
          p->fundcod = (p->XINCODE & 0x1) ? 1 : 0;
          p->formcod = (p->XINCODE & 0x4) ? 1 : 0;
          if (flag)
              p->fmtmod = (*p->ifmode == FZERO) ? 0 : 1;
      }
      p->foftype = flag;
  }

  void fofset(FOFS *p)
  {
      fofset0(p, 1);
  }

  void fofset2(FOFS *p)
  {
      fofset0(p, 0);
  }

  void fof(FOFS *p)
  {
      OVRLAP *ovp;
      FUNC   *ftp1, *ftp2;
      float  *ar, *amp, *fund, *form;
      long   nsmps = ksmps, fund_inc, form_inc;
      float  v1, fract ,*ftab;

      if (p->auxch.auxp==NULL) {                  /* RWD fix */
          initerror("fof: not initialized");
          return;
      }
      ar = p->ar;
      amp = p->xamp;
      fund = p->xfund;
      form = p->xform;
      ftp1 = p->ftp1;
      ftp2 = p->ftp2;
      fund_inc = (long)(*fund * sicvt);
      form_inc = (long)(*form * sicvt);
      do {
          if (p->fundphs & MAXLEN) {                      /* if phs has wrapped */
              p->fundphs &= PMASK;
              if ((ovp = p->basovrlap.nxtfree) == NULL)
                  perferror("FOF needs more overlaps");
              if (newpulse(p, ovp, amp, fund, form)) {    /* init new fof */
                  ovp->nxtact = p->basovrlap.nxtact;      /* & link into  */
                  p->basovrlap.nxtact = ovp;              /* actlist      */
                  p->basovrlap.nxtfree = ovp->nxtfree;
              }
          }
          *ar = FZERO;
          ovp = &p->basovrlap;
          while (ovp->nxtact != NULL) {                   /* perform cur actlist: */
              float  result;
              OVRLAP *prvact = ovp;
              ovp = ovp->nxtact;                          /* formant waveform */
              fract = PFRAC1(ovp->formphs);               /* from JMC Fog*/
              ftab = ftp1->ftable + (ovp->formphs >> ftp1->lobits);/*JMC Fog*/
  /*          printf("\n ovp->formphs = %ld, ", ovp->formphs); */ /* TEMP JMC*/
              v1 = *ftab++;                               /*JMC Fog*/
              result = v1 + (*ftab - v1) * fract;         /*JMC Fog*/
  /*          result = *(ftp1->ftable + (ovp->formphs >> ftp1->lobits) ); */
              if (p->foftype) {
                  if (p->fmtmod)
                      ovp->formphs += form_inc;           /* inc phs on mode */
                  else ovp->formphs += ovp->forminc;
              }
              else {
  #define kgliss ifmode
                  /* float ovp->glissbas = kgliss / grain length.
                     ovp->sampct is incremented each sample. We add
                     glissbas * sampct to the pitch of grain at each a-rate
                     pass (ovp->formphs is the index into ifna; ovp->forminc
                     is the stepping factor that decides pitch) */
                  ovp->formphs += (long)(ovp->forminc + ovp->glissbas * ovp->sampct++);
              }
              ovp->formphs &= PMASK;
              if (ovp->risphs < MAXLEN) {                 /* formant ris envlp */
                  result *= *(ftp2->ftable + (ovp->risphs >> ftp2->lobits) );
                  ovp->risphs += ovp->risinc;
              }
              if (ovp->timrem <= ovp->dectim) {           /* formant dec envlp */
                  result *= *(ftp2->ftable + (ovp->decphs >> ftp2->lobits) );
                  if ((ovp->decphs -= ovp->decinc) < 0)
                      ovp->decphs = 0;
              }
              *ar += (result * ovp->curamp);              /* add wavfrm to out */
              if (--ovp->timrem)                          /* if fof not expird */
                  ovp->curamp *= ovp->expamp;             /* apply bw exp dec */
              else {
                  prvact->nxtact = ovp->nxtact;           /* else rm frm activ */
                  ovp->nxtfree = p->basovrlap.nxtfree;    /* & ret spc to free */
                  p->basovrlap.nxtfree = ovp;
                  ovp = prvact;
              }
          }
          p->fundphs += fund_inc;
          if (p->xincod) {
              if (p->ampcod)  amp++;
              if (p->fundcod) fund_inc = (long)(*++fund * sicvt);
              if (p->formcod) form_inc = (long)(*++form * sicvt);
          }
          p->durtogo--;
          ar++;
      } while (--nsmps);
  }

  static int newpulse(FOFS *p, OVRLAP *ovp, float *amp, float *fund, float *form)
  {
      float octamp = *amp, oct;
      long  rismps, newexp = 0;

      if ((ovp->timrem = (long)(*p->kdur * esr)) > p->durtogo)  /* ringtime */
          return(0);
      if ((oct = *p->koct) > FZERO) {                     /* octaviation */
          long ioct = (long)oct, bitpat = ~(-1L << ioct);
          if (bitpat & ++p->fofcount)
              return(0);
          if ((bitpat += 1) & p->fofcount)
              octamp *= (FONE + ioct - oct);
      }
      if (*fund == FZERO)                                 /* formant phs */
          ovp->formphs = 0;
      else ovp->formphs = (long)(p->fundphs * *form / *fund) & PMASK;
      ovp->forminc = (long)(*form * sicvt);
      if (*p->kband != p->prvband) {                      /* bw: exp dec */
          p->prvband = *p->kband;
          p->expamp = (float)exp((double)(*p->kband * mpidsr));
          newexp = 1;
      }
      /* Init grain rise ftable phase. Negative kform values make the
         kris (ifnb) initial index go negative and crash csound.
         So insert another if-test with compensating code. */
      if (*p->kris >= onedsr && *form != FZERO) {         /* init fnb ris */
          if (*form < FZERO && ovp->formphs != 0)
              ovp->risphs = (long)((MAXLEN - ovp->formphs) / -*form / *p->kris);
          else ovp->risphs = (long)(ovp->formphs / *form / *p->kris);
          ovp->risinc = (long)(sicvt / *p->kris);
          rismps = MAXLEN / ovp->risinc;
      }
      else {
          ovp->risphs = MAXLEN;
          rismps = 0;
      }
      if (newexp || rismps != p->prvsmps) {               /* if new params */
          if (p->prvsmps = rismps)                        /* redo preamp */
              p->preamp = (float)pow(p->expamp, -rismps);
          else p->preamp = FONE;
      }
      ovp->curamp = octamp * p->preamp;                   /* set startamp */
      ovp->expamp = p->expamp;
      if ((ovp->dectim = (long)(*p->kdec * esr)) > 0)     /* fnb dec */
          ovp->decinc = (long)(sicvt / *p->kdec);
      ovp->decphs = PMASK;
      if (!p->foftype) {
          /* Make fof take k-rate phase increment:
             Add current iphs to initial form phase */
          ovp->formphs += (long)(*p->iphs * fmaxlen);     /* krate phs */
          ovp->formphs &= PMASK;
          /* Set up grain gliss increment: ovp->glissbas will be added to
             ovp->forminc at each pass in fof2. Thus glissbas must be
             equal to kgliss / grain playing time. Also make it harmonic,
             so integer kgliss can represent octaves (ie pow() call). */
          ovp->glissbas = ovp->forminc * (float)pow(2.0, (double)*p->kgliss);
          /* glissbas should be diff of start & end pitch*/
          ovp->glissbas -= ovp->forminc;
          ovp->glissbas /= ovp->timrem;
          ovp->sampct = 0;   /* Must be reset in case ovp was used before */
      }
      return(1);
  }

  static int rngflg=0;

29 Computer music scores
Construct a score, and synthesize from it:
  type note = float * (float vec) (* dB, Hz *)
  score: note event vec
  synth_beep: note -> float vec
  let sound = EV.vfold (+.) 0. (V.map (E.lift synth_beep) score)
Hierarchical structure.
  type 'a element = Note of 'a | Riff of 'a element event vec
Measure event timestamps in fractional beats.
  Tempo-map from beats to samples.

30 The components of a pitch shifter
[Diagram: three overlapping components labelled pitch shifter, sinusoidal analyzer, and spectral modifier, drawn over the chain of stages and types below.]
  float ivec
    overlapped FFT
  complex vec ivec
    correct frequencies
  (float * float) vec ivec
    rescale frequencies
  (float * float) vec ivec
    compute spectrum
  complex vec ivec
    overlapped IFFT
  float ivec
  f: complex vec ivec -> complex vec ivec

31 Related work
Languages with temporal type constructors.
Languages with atomic signals and events.
  Events with explicit time.
  Events in implicit time.
  Events not first-class.
Languages with signals only.
Languages with events only.

32 Fran
Elliott and Hudak, 1997. “Functional reactive animation”
Used to define objects’ trajectories, etc.
  Animation, not video: no frames or pixels.
'a Behavior is Time -> 'a.
'a Event is a time-sorted stream of Time * 'a.
Time is continuous.

33 Continuous versus discrete time
Animation is continuous change.
DSP is discrete.
  Digital filters are based on unit delays.
  The FFT relies on discrete time and frequency.
A delay line can’t hold a continuous-time signal.
  “delay x by 1” is λt. (x (t-1)).
  Feedback delay involves x (t-1), x (t-2), x (t-3), …
So: two different ways of programming.

34 ALDiSP
Freericks, 1996.
For digital signal processing.
stream: demand-driven (like ivec).
pipe: producer-driven.
  A pipe is a channel for asynchronous events.
  Event timing is implicit.
Representing temporal data is not the goal.

35 Signals and events
Atomicity of signals precludes general DSP.
Some languages have events with explicit time:
  Arctic (Dannenberg et al., 1986): applicative programming for reactive systems.
  SuperCollider (McCartney, 1996): scores are lazy lists of particular events.
Some have events in implicit time.
Or events that are not first-class: a score sublanguage.

36 Inside Chronic
Everything besides ivecs is pretty easy.
The properties of a good ivec.
Chronic’s ivec implementation.
Phases: building and computation.
A little benchmark on static dataflow.

37 Desirable properties of an ivec
Correct asymptotic space and time use.
Block computation.
Consumer control of block length.
Efficient fan-out to multiple consumers.
In-place update.

38 Chronic’s ivec implementation
An ivec is a reference to an ivec_dat.
An ivec_dat is an object.
  Has method compute (upto: time) -> unit
  Writes output into a shared buffer.
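
Note: the object model just described (an object with a compute method that writes into a buffer shared by all consumers) can be illustrated with a small sketch. This is an assumption-laden toy, not Chronic's code: the private method fill and the classes iterate_dat and map_dat as written here are hypothetical, and the buffer keeps the whole history, so it does not have the correct asymptotic space use that the real implementation aims for.

```ocaml
(* Toy demand-driven infinite vector. compute upto makes samples
   0 .. upto-1 available in buf; consumers read them through get_buf. *)
class virtual ['a] ivec_dat =
  object (self)
    val mutable buf : 'a array = [||]

    (* Produce the samples for times from .. upto-1 (hypothetical hook). *)
    method private virtual fill : int -> int -> 'a array

    method compute (upto : int) : unit =
      let have = Array.length buf in
      if upto > have then buf <- Array.append buf (self#fill have upto)

    method get_buf : 'a array = buf
  end

(* x0, f x0, f (f x0), ... computed a block at a time. *)
class ['a] iterate_dat (f : 'a -> 'a) (x0 : 'a) =
  object
    inherit ['a] ivec_dat
    val mutable next = x0

    method private fill from upto =
      Array.init (upto - from) (fun _ ->
          let v = next in
          next <- f next;
          v)
  end

(* Applies f to each sample of src; asks src for a block first. *)
class ['a, 'b] map_dat (f : 'a -> 'b) (src : 'a ivec_dat) =
  object
    inherit ['b] ivec_dat

    method private fill from upto =
      src#compute upto;
      let input = src#get_buf in
      Array.init (upto - from) (fun i -> f input.(from + i))
  end

(* Fan-out: two consumers read the same upstream buffer. *)
let evens = new iterate_dat (fun x -> x + 2) 0
let squares = new map_dat (fun x -> x * x) evens
let halves = new map_dat (fun x -> x / 2) evens

let () =
  squares#compute 4;   (* forces evens up to time 4 *)
  halves#compute 2;    (* reuses the samples evens already holds *)
  Array.iter (fun x -> Printf.printf "%d " x) squares#get_buf;
  print_newline ()
```

The consumer chooses the block length through the argument to compute, and the inner loop of each block is an ordinary array loop.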

39 The building phase
  let evens = IV.iterate (fun x -> x+2) 0
  let powtwo = IV.iterate (fun x -> x*2) 1
  let powfour = IV.peekiv evens powtwo (* index into powtwo by evens *)
[Diagram: powfour refers to a peekiv_dat node. Its t input is evens, an iterate_dat with f: λx. x+2 and x0: 0. Its x input is powtwo, an iterate_dat with f: λx. x*2 and x0: 1.]

40 The computation phase
Demand-driven dataflow:
[Diagram: a consumer asks peekiv for elements [0], [1] and receives 1, 4. peekiv asks its t input (iterate, f: λx. x+2, x0: 0, holding [0, 2, 4, 6, …]) for [0], [1] and receives 0, 2. It then asks its x input (iterate, f: λx. x*2, x0: 1, holding [1, 2, 4, 8, …]) for [0], [2] and receives 1, 4.]

41 Function calls are expensive
  function call:   IV.map2 (+.) x y
  inlined:         IV.add2 x y
  C++ inlined:     for (i=0; i<len; ++i) z[i] = x[i]+y[i];
Relative times for optimal block length (256):
  map2   9.3
  add2   1.0
  C++    0.36

42 Future work
Comprehensions.
Sampling rates.
Arbitrary feedback delay.
Lazy vectors.
Real-time?

43 vec and ivec comprehensions
Instead of IV.map2 (fun xi yi -> xi + 2*yi) x y,
write { xi + 2*yi: xi in x, yi in y }
or just { x + 2*y }
More readable.
Can generate specialized code.
Accomplish with camlp4 preprocessor?
  or with C++ template tricks?

44 Signals with sampling rates
Now you can use signals of differing rates, but you get no checking of rate mismatches.
  Audio signal: 44100 Hz.
  Control signal: 1000 Hz.
Incorporate sampling rate into sig, isig types.

45 Conclusions
Unify computer-music sublanguages.
Think and program outside of time.
If we construct types, we can take them apart.

46 Unify sublanguages
Csound has three separate languages: event placement, signal routing, and DSP (“score”, “orchestra”, and C).
The divisions cut across useful interaction.
Nyquist unifies the event and routing levels.
Chronic unifies all three.

47 Stand outside of time
Program in time: logical time advances as the program runs.
  An event’s time is “now”.
Out of time: all time is explicitly in the data.
  The program’s execution is atemporal.
  Allows vfold (in FOF code) to be factored out.
Out-of-time is often the way we think about an algorithm.
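
Note on the "Signals with sampling rates" slide: one possible way to put a sampling rate into a signal type is a phantom type parameter. The sketch below is only an illustration of that general technique and is not part of Chronic. The tags hz44100 and hz1000, the type signal, and the functions audio, control, add and upsample are all hypothetical, and the signal is a finite array rather than an ivec.

```ocaml
(* Hypothetical rate tags: abstract types used only as labels. *)
type hz44100
type hz1000

(* The parameter 'rate does not appear in the representation; it only
   records the sampling rate in the type. *)
type 'rate signal = Signal of float array

let audio (xs : float array) : hz44100 signal = Signal xs
let control (xs : float array) : hz1000 signal = Signal xs

(* Sample-wise sum. Both arguments must carry the same rate tag, so
   add (audio a) (control c) is rejected by the type checker. *)
let add (Signal x : 'r signal) (Signal y : 'r signal) : 'r signal =
  Signal (Array.map2 ( +. ) x y)

(* Explicit conversion from control rate to audio rate: each output
   sample repeats the most recent control sample (no interpolation). *)
let upsample (Signal c : hz1000 signal) : hz44100 signal =
  let n = Array.length c in
  Signal
    (Array.init (n * 44100 / 1000) (fun i ->
         c.(min (n - 1) (i * 1000 / 44100))))

let () =
  let (Signal mixed) = add (audio [| 1.; 2. |]) (audio [| 0.5; 0.5 |]) in
  Array.iter (fun x -> Printf.printf "%g " x) mixed;
  print_newline ()
```

With this encoding a rate mismatch becomes a compile-time type error, and mixing rates requires an explicit conversion such as upsample.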

48 Deconstructing constructed types
Traditional computer music languages make the audio signal an atomic type, a black box.
  Then there is no notion of an audio sample.
Other types: spectra, LPC frames, …?
  Add them as more atomic types?
  Add a corresponding suite of unit generators, too.
A constructed type is no longer a black box.

49 Questions we can now address
How can computer music languages support writing new DSP and new representations?
How can libraries for low-level languages support new DSP and new representations?
How can we build better tools for researching music and DSP algorithms?

50 Summary
Temporal type constructors lead to a better way of doing computer music programming.

51

52 Synthesis from a score
  let motif = base bend . bend
  let bleep = pitch . pitch osc filter base
  let shorten = λx. timescale 0.5 x
  let double = sequence [motif, motif]
  let zeno = sequence (iterate 10 shorten motif)
  let score = double zeno zeno double zeno zeno double
  let audio = apply bleep score

53 What’s wrong?
Csound aims to be expressive, high-level: audio signal, not audio sample, is atomic.
Csound types go no higher and no lower.
  Higher: stream of frames of analysis data.
    Can’t express a new analysis system.
  Lower: access to individual samples.
    Can’t express new DSP.

54 Music languages (Csound)
New DSP generally cannot be expressed.
  No access to individual elements of audio data.
  Recursive delay is restricted.
Code is really scalar, mapped over time.
  Time is factored out, unavailable.
Can’t construct new types.

55 Low-level languages (C++)
It is hard to write a good support library.
  Most assume all data is synchronous signals.
  Infinite data is awkward.
Libraries don’t help new DSP code much.
  Fine-grained primitives are hard to identify.

56 Implementations
Chronic is a prototype implementation of this style of programming.
One possible future use: a framework for
  developing computer music algorithms.
  analyzing and manipulating sonic data.
Similar niche to Matlab.
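
Note on the "Synthesis from a score" slide: the combinators timescale, sequence and iterate used there can be sketched in plain OCaml. This is a guess at their meaning under simple assumptions, not the definitions behind the slide: a score is taken to be a list of events timestamped in beats plus a total duration, and the record types event and score, the helper note, and the pitch numbers are hypothetical.

```ocaml
(* Hypothetical representation: events timestamped in beats, plus the
   total length of the score in beats. *)
type 'a event = { value : 'a; time : float }
type 'a score = { events : 'a event list; dur : float }

(* A single event at time 0 lasting [dur] beats. *)
let note (v : 'a) (dur : float) : 'a score =
  { events = [ { value = v; time = 0.0 } ]; dur }

(* Stretch or compress a score in time by the factor k. *)
let timescale (k : float) (s : 'a score) : 'a score =
  { events = List.map (fun e -> { e with time = k *. e.time }) s.events;
    dur = k *. s.dur }

(* Play the parts one after another. *)
let sequence (parts : 'a score list) : 'a score =
  List.fold_left
    (fun acc s ->
      let shifted =
        List.map (fun e -> { e with time = e.time +. acc.dur }) s.events
      in
      { events = acc.events @ shifted; dur = acc.dur +. s.dur })
    { events = []; dur = 0.0 }
    parts

(* [x; f x; f (f x); ...], n elements in all. *)
let rec iterate (n : int) (f : 'a -> 'a) (x : 'a) : 'a list =
  if n <= 0 then [] else x :: iterate (n - 1) f (f x)

let motif = sequence [ note 60 1.0; note 67 1.0 ]
let shorten x = timescale 0.5 x
let double = sequence [ motif; motif ]
let zeno = sequence (iterate 10 shorten motif)

let () =
  Printf.printf "double: %g beats, zeno: %d events in %g beats\n"
    double.dur (List.length zeno.events) zeno.dur
```

Because each repetition in zeno is half as long as the one before, its total length stays below twice the length of motif however many repetitions are requested.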

57 Something implicit time can’t do
The implicit model can represent delay:
  out = temp; temp = in
It cannot represent the inverse operation.
  Undesired delay leaks out, breaking modularity.
Explicit time supports the inverse:
  out = drop 1 in
  drop 2 [1 0 6 6] = [6 6]
Explicit time, with map, subsumes implicit.

58 Why this matters
A DSP operation may add undesired delay.
In explicit time, this can be removed.
In implicit time, the delay leaks out.
  Must delay other signals to keep them aligned.
  A signal’s delayedness is not part of its type.

59 A couple of EIV functions
  EIV.pwl 3. [| 4.@@2; 1.@@5 |]
    yields [| 3.; 3.5; 4.; 3.; 2.; 1.; ... |]  (times 0 1 2 3 4 5)
  EIV.vfold (+) 0 [| [| 1; 1; 1; 1 |] @@ 2; [| 2; 2; 2; 2 |] @@ 4 |]
    yields [| 0; 0; 1; 1; 3; 3; 2; 2; ... |]
[Figure: on a time axis from 0 to 7, the vector [1 1 1 1] starts at time 2 and the vector [2 2 2 2] starts at time 4.]

60 Two reverberators
Based on feedback-delay structures.
  Moorer: filtered comb.
  Gardner: nested allpass.
Feedback delay, y = delay (f y):
[Diagram labels: x, g, +, D, y; x, f, (f y), D, y.]
  let f x y = IV.map2 (+.) x (IV.map (fun y -> g *. y) y)
  let echo length x = IV.delayz_rec2 length f x

61 A complication
Can’t access the inside of a feedback delay.
Kludge: duplicate part of it instead.
[Diagram labels: X, Y, D, N1, N2, lowpass, g, 0.5, 0.5, IV.delayz_rec2.]

62 Feedback delay: a comparison
In low-level languages, you have to maintain grungy delay queues.
In computer music languages, you often can’t represent feedback delay.
In Chronic: high-level representation of feedback loops, but not arbitrary flow graphs.

63 Why not just a stream?
  type 'a ivec = Ivec of (unit -> 'a * 'a ivec)
Has

64 An ivec is an ivec_dat ref
  class ['a] ivec_dat
    mutable                                 in-place
    method get_buf () -> 'a vec             fan-out from buf
    method compute (upto: time) -> unit     control of block length
      (* side effect: writes to buf *)
    method seek (upto: time)
  type 'a ivec = 'a ivec_dat ref
compute 10; use buf.(0) to buf.(9); compute 20; use buf.(0) to buf.(9); …

65 Subclassing ivec_dat
  class ['a, 'b] map_dat (f: 'a -> 'b) (x: 'a ivec) =
  object
    inherit ['b] ivec_dat
    method compute_hook
      (* call !x#compute; use !x#get_buf (); write to buf *)
  let map (f: 'a -> 'b) x =
    ref ((new map_dat f x) :> ('b ivec_dat))

66

67 The components of a pitch shifter
  float ivec
    overlapped FFT
  complex vec ivec
    correct frequencies
  (float * float) vec ivec
    rescale frequencies
  (float * float) vec ivec
    compute spectrum
  complex vec ivec
    overlapped IFFT
  float ivec

68 pitch shifter
  float ivec -> float ivec

69 sinusoidal analyzer
  float ivec -> (float * float) vec ivec

70 spectral modifier
  float ivec -> complex vec ivec
  f: complex vec ivec -> complex vec ivec
  complex vec ivec -> float ivec

71 Reusing the components
Sinusoidal analyser:
  overlapped FFT
  correct frequencies
  output: (float * float) vec ivec
Spectral manipulator:
  overlapped FFT
  apply function f
  overlapped IFFT
  f: complex vec ivec -> complex vec ivec

72
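
Note on the "Why not just a stream?" slide: the stream type shown there is enough to write iterate, map and the drop operation from the explicit-time slides. The sketch below uses that type with function definitions that are assumptions of this note, not Chronic's. It produces one element per closure call, which is presumably part of the reason the block-computing ivec_dat design was chosen instead.

```ocaml
(* The stream type from the slide: forcing the thunk yields one element
   together with the rest of the stream. *)
type 'a ivec = Ivec of (unit -> 'a * 'a ivec)

(* x, f x, f (f x), ... *)
let rec iterate (f : 'a -> 'a) (x : 'a) : 'a ivec =
  Ivec (fun () -> (x, iterate f (f x)))

(* Apply f to every element. *)
let rec map (f : 'a -> 'b) (Ivec next : 'a ivec) : 'b ivec =
  Ivec
    (fun () ->
      let x, rest = next () in
      (f x, map f rest))

(* The stream without its first n elements: the inverse of an
   n-sample delay. *)
let rec drop (n : int) (s : 'a ivec) : 'a ivec =
  if n <= 0 then s
  else
    let (Ivec next) = s in
    let _, rest = next () in
    drop (n - 1) rest

(* The first n elements as a list, for inspection. *)
let rec take (n : int) (Ivec next : 'a ivec) : 'a list =
  if n <= 0 then []
  else
    let x, rest = next () in
    x :: take (n - 1) rest

let () =
  let ramp = iterate (fun x -> x +. 0.5) 1. in
  List.iter (fun x -> Printf.printf "%g " x) (take 4 ramp);
  print_newline ()
```

Here out = map (fun x -> -. x) input and out = drop 1 input are ordinary expressions over whole streams, with no loop over "now".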