Temporal type constructors for computer music programming

Thesis committee: Roger Dannenberg (Chair), Guy Blelloch, Robert Harper, Perry Cook (Princeton University)

Computer music programming
• Subdomains:
  - Digital signal processing.
  - Response to asynchronous events.
  - Representations of musical and sonic structure.
• Example applications:
  - Synthesize audio from a musical score.
  - Abstract features from audio; alter features.
  - Transform audio to compress it.

Analysis of audio
[Diagram labels: analyze, (modify), resynthesize; amplitude, frequency; abstract, render; pp, f.]

The goals
• Computer music programming should be
  - expressive: programs are clear and concise.
  - general: programs fall within the expressive range.

The current tradeoff
[Chart with axes "expressive" and "general": unit-generator programming (Csound) is placed toward expressive, low-level programming (C++) toward general, and "the promised land" where both are high.]

What we have now
• "Unit generator" programming (Csound).
  - User configures black-box audio processors.
  - Can't express new DSP or new kinds of data.
  - New kind of data: spectral frames, for example.
• Low-level programming (C++).
  - Cumbersome without a computer music library.
  - Libraries don't support new kinds of data, and don't give much benefit for new DSP.

What do we need?
• Write arbitrary DSP in a high-level language.
• No more writing unit generators in C.
• Types higher and lower than "audio stream".
  - higher: analysis frames for a new representation.
  - lower: access to individual samples for new DSP.

My proposal
Temporal type constructors.
• Proposed set: event, vector, infinite vector.
• Enable a pure applicative programming style.
Through temporal type constructors, computer music programming can be both expressive and general.

A taste of the results
• Chronic is a prototype system using this idea.
• The FOF synthesis algorithm can't be written in Csound.
• C implementation is 235 lines, and awkward.
• Chronic implementation is 34 lines, and closer to our idea of the algorithm.

Outline
• Temporal type constructors.
• Code examples in Chronic.
• Related work.
• Chronic internals.
• Future work.
• Conclusions.

Temporal type constructors
• α event: timestamped α, e.g. as a pair (α, time).
• α vec: finite vector of α, an array of α elements.
• α ivec: infinite vector of α, a time-indexed stream.
• time: integer sample count.

Digital audio stream
• audio = sample ivec
  (float might be chosen as the sample type.)
[Diagram: a row of samples S.]

Multi-channel audio stream
• multi_audio = sample ivec vec
[Diagram: several rows of samples S, one row per channel.]

Short-time spectrum data
• spectra = complex vec ivec
[Diagram: a sequence of columns of complex values C.]

A chord sequence
• chordseq = pitch vec event vec
[Diagram: groups of pitches P, each group stamped with a time @.]

Musical-keyboard events
• MIDI = (pitch × velocity) event ivec
[Diagram: (P, V) pairs, each stamped with a time @.]

Gestural musical events
• violin = (pitch vec × bowing vec) event ivec
[Diagram: vectors of pitches P paired with vectors of bowings B, each pair stamped with a time @.]

Explicit vs. implicit time
• Implicit (Csound): out = -in
  - Code runs in a context holding the current time:
      for (t=0; ; ++t) out[t] = -in[t]
  - Looped unavoidably; hope it's what you want.
• Explicit (Chronic): out = map (λx. -x) in
  - out and in are of type float ivec.
  - i.e. they are explicitly temporal data.
• Explicit model, with map, subsumes implicit.

Explicit time
• Time information is built into the data.
• Code can stand outside of time.
  - vs. operating within some implicit "now".
• Advantages:
  - A strictly more powerful model of time.
    Implicit time can do delay, but can't do the inverse.
  - Types are more tractable than code.
    The FOF example will show how this works.

Chronic
• Built inside O'Caml as a set of libraries.
  core:    E (α event), V (α vec), IV (α ivec), EV (α event vec), EIV (α event ivec)
  library: L (float and other), LV (α vec), …

A couple of IV functions
  IV.iterate (fun x -> x +. 0.5) 1.
    gives [| 1.; 1.5; 2.; 2.5; ... |]
  (* y = IV.map succ (IV.delay 2 y) *)
  IV.delay_rec 2 [| 0; 5 |] (IV.map succ)
    gives [| 0; 5; 1; 6; ... |]

A couple of library functions
  let fs = 44100. (* sampling frequency *)
  LV.osc_sine 1000 (220./.fs) 0.25
    (* 1000 samples of 220-Hz cosine *)
  LIV.para_eq (1200./.fs) 12. 0.5 x
    (* filter x to boost a 0.5-octave band around 1200 Hz by 12 dB *)

Examples built with Chronic
• FOF synthesis.
• Computer-music scores.
• Two reverberators.
• An FFT-based pitch shifter.

FOF synthesis
• Makes a sound with a peak in its spectrum.
[Plot: level against frequency, with the pitch and the spectral peak marked.]

The FOF waveform
• Series of enveloped sine-wave 'grains'.
[Waveform plot with two intervals marked: 1 / Fpitch and 1 / Fpeak.]

A fitting data type
• grains: float vec event ivec
[Diagram: vectors of floats F, each stamped with a time @.]

Skeleton of FOF code
  let fof (f_pitch: float ivec) phase0 f_peak bandwidth db
          rise fall dur (risefalltab: float vec) =
    let grain_times = LIV.phasor_wrap f_pitch phase0 in   (* float ivec *)
    let fgrain t = ... (* miracle occurs *) in
                   grain @@ (int t) in                    (* float -> float vec event *)
    let grains = IV.map fgrain grain_times in             (* float vec event ivec *)
    EIV.vfold (+.) 0. grains                              (* float ivec *)

The missing piece
  let fgrain t =
    let sine = LV.osc_sine dur f_peak (~-.(frac t) *. f_peak) in
    let kenv = exp(~-.pi*.bandwidth) in
    let env = V.iterate (fun x -> kenv *. x) 1.0 dur in
    let smooth_ph = EV.pwl_list 0.0
        [(rise, 1.); (dur-1-fall, 1.); (dur-1, 0.)] in
    let smooth = LV.tablei risefalltab smooth_ph in
    let ampl = L.db_to_amp db in
    let grain = V.map3 (fun x y z -> ampl *. x*.y*.z) sine env smooth in
    grain @@ (int t)

FOF in Chronic vs. in C
• What I showed you was slightly simplified.
  - Less time-varying control, no "octaviation".
  - This was 19 lines; full FOF is 34.
• Csound's FOF in C is 235.
• More importantly, it's unintuitive.
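
The last step of the FOF skeleton, summing timestamped grains into one signal, can be tried out in plain OCaml. The following is only a sketch of that idea, not Chronic's code: the `event` record, `at` (standing in for the slides' `@@`), and `vfold_finite` are hypothetical names, and a finite array of a chosen length stands in for the infinite `ivec`.

```ocaml
(* Hypothetical stand-ins for the deck's types; not Chronic's definitions. *)
type 'a event = { value : 'a; time : int }  (* time = integer sample count *)

(* Plays the role of the slides' [x @@ t]. *)
let at value time = { value; time }

(* Overlap-add: mix each timestamped vector into an output of length [len],
   combining overlapping elements with [f], starting from [zero].
   Samples that fall outside the output are ignored. *)
let vfold_finite f zero len events =
  let out = Array.make len zero in
  List.iter
    (fun { value; time } ->
      Array.iteri
        (fun i x ->
          let t = time + i in
          if t >= 0 && t < len then out.(t) <- f out.(t) x)
        value)
    events;
  out

(* A toy decaying grain, repeated every 2 samples so that grains overlap. *)
let grain = [| 1.0; 0.5; 0.25 |]
let grains = List.map (at grain) [ 0; 2; 4 ]

let signal = vfold_finite ( +. ) 0. 7 grains
(* signal = [| 1.; 0.5; 1.25; 0.5; 1.25; 0.5; 0.25 |] *)
```

The loop here runs over grains, and each grain carries its own start time; no code asks which grains are active at a given sample.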

FOF in C: what goes wrong?
[Slide background: the C source of Csound's FOF unit generator (UGENS7.C, "loosely based on code of Michael Clarke, University of Huddersfield"), shown as a wall of code containing fofset0, fofset, fofset2, fof and newpulse. The column order was lost in extraction.]
• Can't represent grains.
• Can't stand outside of time.
  - Has to loop over output samples, and think "What is the set of active grains right now? Are some dying? Are new ones born? Which envelopes are in their rise phase? entering fall phase? …"
• You don't want to think that way about FOF.
• Want to loop over grains, not samples.

Computer music scores
• Construct a score, and synthesize from it:
    type note = float * (float vec) (* dB, Hz *)
    score: note event vec
    synth_beep: note -> float vec
    let sound = EV.vfold (+.) 0. (V.map (E.lift synth_beep) score)
• Hierarchical structure.
    type 'a element = Note of 'a | Riff of 'a element event vec
• Measure event timestamps in fractional beats.
• Tempo-map from beats to samples.
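
A hierarchical score with timestamps in beats can be flattened and tempo-mapped in a few lines. The sketch below is an illustration under stated assumptions rather than Chronic's code: it uses a list where the slide uses `vec`, it assumes that each nested timestamp is relative to the start of its enclosing riff, and `flatten` and `beats_to_samples` (a constant-tempo map) are hypothetical names.

```ocaml
(* Hypothetical event record; time is measured in fractional beats. *)
type 'a event = { value : 'a; time : float }

(* The slide's hierarchical element type, with a list in place of vec. *)
type 'a element = Note of 'a | Riff of 'a element event list

(* Flatten a nested score into timestamped notes.
   Assumption: a nested time is relative to the enclosing riff's start. *)
let rec flatten offset (e : 'a element event) : 'a event list =
  let t = offset +. e.time in
  match e.value with
  | Note n -> [ { value = n; time = t } ]
  | Riff es -> List.concat_map (flatten t) es

(* Constant-tempo map from beats to a sample count. *)
let beats_to_samples ~bpm ~fs beats =
  int_of_float (Float.round (beats *. 60. /. bpm *. fs))

(* A two-note motif, played twice. *)
let motif =
  Riff [ { value = Note "C"; time = 0. }; { value = Note "E"; time = 1. } ]

let score =
  { value = Riff [ { value = motif; time = 0. }; { value = motif; time = 2. } ];
    time = 0. }

let flat = flatten 0. score

(* Note names paired with onset times in samples at 120 bpm, 44100 Hz. *)
let onsets =
  List.map
    (fun e -> (e.value, beats_to_samples ~bpm:120. ~fs:44100. e.time))
    flat
(* onsets = [("C", 0); ("E", 22050); ("C", 44100); ("E", 66150)] *)
```

Once flattened, the notes have the shape of the slide's `note event vec`, so a synthesis function could be mapped over them and the results mixed.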

The components of a pitch shifter
[Diagram: three nested components, the sinusoidal analyzer, the spectral modifier (parameterized by f: complex vec ivec -> complex vec ivec) and the pitch shifter, drawn over this pipeline:]
  float ivec
  -> overlapped FFT        -> complex vec ivec
  -> correct frequencies   -> (float * float) vec ivec
  -> rescale frequencies   -> (float * float) vec ivec
  -> compute spectrum      -> complex vec ivec
  -> overlapped IFFT       -> float ivec

Related work
• Languages with temporal type constructors.
• Languages with atomic signals and events.
  - Events with explicit time.
  - Events in implicit time.
  - Events not first-class.
• Languages with signals only.
• Languages with events only.

Fran
• Elliott and Hudak, 1997.
• "Functional reactive animation"
  - Used to define objects' trajectories, etc.
  - Animation, not video: no frames or pixels.
• α Behavior is Time -> α.
• α Event is time-sorted stream of Time * α.
• Time is continuous.

Continuous versus discrete time
• Animation is continuous change.
• DSP is discrete.
  - Digital filters are based on unit delays.
  - The FFT relies on discrete time and frequency.
• A delay line can't hold a continuous-time signal.
  - So "delay x by 1" is λt. (x (t-1)).
  - Feedback delay involves x (t-1), x (t-2), x (t-3), …
• Two different ways of programming.

ALDiSP
• Freericks, 1996. For digital signal processing.
• α stream: demand-driven (like ivec).
• α pipe: producer-driven.
  - A pipe is a channel for asynchronous events.
  - Event timing is implicit.
• Representing temporal data is not the goal.

Signals and events
• Atomicity of signals precludes general DSP.
• Some languages have events with explicit time:
  - Arctic (Dannenberg et al., 1986): applicative programming for reactive systems.
  - SuperCollider (McCartney, 1996): scores are lazy lists of particular events.
• Some have events in implicit time.
• Or events not first-class: score sublanguage.

Inside Chronic
• Everything besides ivecs is pretty easy.
• The properties of a good ivec.
• Chronic's ivec implementation.
• Phases: building and computation.
• A little benchmark on static dataflow.

Desirable properties of an ivec
• Correct asymptotic space and time use.
• Block computation.
  - Consumer control of block length.
• Efficient fan-out to multiple consumers.
• In-place update.

Chronic's ivec implementation
• An ivec is a reference to an ivec_dat.
• An ivec_dat is an object.
  - Has method compute (upto: time) -> unit
  - Writes output into a shared buffer.

The building phase
  let evens = IV.iterate (fun x -> x+2) 0
  let powtwo = IV.iterate (fun x -> x*2) 1
  let powfour = IV.peekiv evens powtwo
    (* index into powtwo by evens *)
[Diagram: powfour is a peekiv_dat with fields t: evens and x: powtwo. evens is an iterate_dat with f: λx. x+2 and x0: 0. powtwo is an iterate_dat with f: λx. x*2 and x0: 1.]

The computation phase
• Demand-driven dataflow:
[Diagram: peekiv is asked for [0], [1] and answers 1, 4. To do so it asks its t input (iterate, f: λx. x+2, x0: 0, stream [0, 2, 4, 6, …]) for [0], [1] and receives 0, 2, then asks its x input (iterate, f: λx. x*2, x0: 1, stream [1, 2, 4, 8, …]) for [0], [2] and receives 1, 4.]

Function calls are expensive
• function call: IV.map2 (+.) x y
• inlined: IV.add2 x y
• C++ inlined: for (i=0; i<len; ++i) z[i] = x[i]+y[i];
• Relative times for optimal block length (256):
    map2   9.3
    add2   1.0
    C++    0.36

Future work
• Comprehensions.
• Sampling rates.
• Arbitrary feedback delay.
• Lazy vectors.
• Real-time?

vec and ivec comprehensions
• Instead of
    IV.map2 (fun xi yi -> xi + 2*yi) x y,
  write
    { xi + 2*yi: xi in x, yi in y }
  or just
    { x + 2*y }
• More readable.
• Can generate specialized code.
• Accomplish with camlp4 preprocessor?
  - or with C++ template tricks?

Signals with sampling rates
• Now you can use signals of differing rates, but you get no checking of rate mismatches.
  - Audio signal: 44100 Hz.
  - Control signal: 1000 Hz.
• Incorporate sampling rate into sig, isig types.

Conclusions
• Unify computer-music sublanguages.
• Think and program outside of time.
• If we construct types, we can take them apart.

Unify sublanguages
• Csound has three separate languages: event placement, signal routing, and DSP ("score", "orchestra", and C).
• The divisions cut across useful interaction.
• Nyquist unifies the event and routing levels.
• Chronic unifies all three.

Stand outside of time
• Program in time: logical time advances as the program runs. An event's time is "now".
• Out of time: all time is explicitly in the data. The program's execution is atemporal.
  - Allows vfold (in FOF code) to be factored out.
• Out-of-time is often the way we think about an algorithm.

Deconstructing constructed types
• Traditional computer music languages make the audio signal an atomic type, a black box.
  - Then there is no notion of an audio sample.
• Other types: spectra, LPC frames, …?
  - Add them as more atomic types?
  - Add a corresponding suite of unit generators, too.
• A constructed type is no longer a black box.

Questions we can now address
• How can computer music languages support writing new DSP and new representations?
• How can libraries for low-level languages support new DSP and new representations?
• How can we build better tools for researching music and DSP algorithms?

Summary
• Temporal type constructors lead to a better way of doing computer music programming.

Synthesis from a score
  let motif = λ base bend . [figure: a phrase drawn from base and bend]
  let bleep = λ pitch . [figure: pitch feeding osc, then filter]
  let shorten = λ x . timescale 0.5 x
  let double = sequence [motif, motif]
  let zeno = sequence (iterate 10 shorten motif)
  let score = [figure: an arrangement of double and zeno parts]
  let audio = apply bleep score

What's wrong?
• Csound aims to be expressive, high-level: audio signal, not audio sample, is atomic.
• Csound types go no higher and no lower.
  - Higher: stream of frames of analysis data.
    Can't express a new analysis system.
  - Lower: access to individual samples.
    Can't express new DSP.

Music languages (Csound)
• New DSP generally cannot be expressed.
  - No access to individual elements of audio data.
  - Recursive delay is restricted.
• Code is really scalar, mapped over time.
  - Time is factored out, unavailable.
• Can't construct new types.

Low-level languages (C++)
• It is hard to write a good support library.
  - Most assume all data is synchronous signals.
  - Infinite data is awkward.
• Libraries don't help new DSP code much.
  - Fine-grained primitives are hard to identify.

Implementations
• Chronic is a prototype implementation of this style of programming.
• One possible future use: a framework for
  - developing computer music algorithms.
  - analyzing and manipulating sonic data.
• Similar niche to Matlab.

Something implicit time can't do
• The implicit model can represent delay:
    out = temp; temp = in
• It cannot represent the inverse operation.
  - Undesired delay leaks out, breaking modularity.
• Explicit time supports the inverse:
    out = drop 1 in
    drop 2 [1 0 6 6] = [6 6]
• Explicit time, with map, subsumes implicit.

Why this matters
• A DSP operation may add undesired delay.
  - In explicit time, this can be removed.
  - In implicit time, the delay leaks out.
    Must delay other signals to keep them aligned.
• A signal's delayedness is not part of its type.

A couple of EIV functions
  EIV.pwl 3. [| 4.@@2; 1.@@5 |]
    gives [| 3.; 3.5; 4.; 3.; 2.; 1.; ... |]   (indices 0 to 5)
  EIV.vfold (+) 0 [| [| 1; 1; 1; 1 |] @@ 2; [| 2; 2; 2; 2 |] @@ 4 |]
    gives [| 0; 0; 1; 1; 3; 3; 2; 2; ... |]   (indices 0 to 7; [1 1 1 1] covers 2 to 5, [2 2 2 2] covers 4 to 7)

Two reverberators
• Based on feedback-delay structures.
  - Moorer: filtered comb. Gardner: nested allpass.
• Feedback delay, y = delay (f y):
  [Diagrams: a loop in which y is fed through f and a delay D, and an echo in which input x is added to y scaled by g and then delayed by D.]
    let f x y = IV.map2 (+.) x (IV.map (fun y -> g *. y) y)
    let echo length x = IV.delayz_rec2 length f x

A complication
• Can't access the inside of a feedback delay.
• Kludge: duplicate part of it instead.
[Diagram labels: X, Y, D, N1, N2, lowpass, g, 0.5, 0.5, IV.delayz_rec2; N1 and N2 each appear twice.]

Feedback delay: a comparison
• In low-level languages:
  - you have to maintain grungy delay queues.
• In computer music languages:
  - you often can't represent feedback delay.
• In Chronic:
  - high-level representation of feedback loops,
  - but not arbitrary flow graphs.

Why not just a stream?
  type 'a ivec = Ivec of (unit -> 'a * 'a ivec)
• Has [rest of slide lost in extraction]

An ivec is an ivec_dat ref
  class ['a] ivec_dat                        (mutable: in-place)
    method get_buf () -> 'a vec              (fan-out from buf)
    method compute (upto: time) -> unit      (control of block length)
      (* side effect: writes to buf *)
    method seek (upto: time)
  type 'a ivec = 'a ivec_dat ref
  Use: compute 10; use buf.(0) to buf.(9); compute 20; use buf.(0) to buf.(9); …

Subclassing ivec_dat
  class ['a, 'b] map_dat (f: 'a -> 'b) (x: 'a ivec) =
    object
      inherit ['b] ivec_dat
      method compute_hook
        (* call !x#compute; use !x#get_buf (); write to buf *)
  let map (f: 'a -> 'b) x =
    ref ((new map_dat f x) :> ('b ivec_dat))

The components of a pitch shifter
  float ivec
  -> overlapped FFT        -> complex vec ivec
  -> correct frequencies   -> (float * float) vec ivec
  -> rescale frequencies   -> (float * float) vec ivec
  -> compute spectrum      -> complex vec ivec
  -> overlapped IFFT       -> float ivec

pitch shifter
  float ivec -> float ivec

sinusoidal analyzer
  float ivec -> (float * float) vec ivec

spectral modifier
  float ivec -> complex vec ivec -> (f: complex vec ivec -> complex vec ivec) -> complex vec ivec -> float ivec

Reusing the components
• Sinusoidal analyser: overlapped FFT, then correct frequencies.
  - output: (float * float) vec ivec
• Spectral manipulator: overlapped FFT, then apply function f, then overlapped IFFT.
  - f: complex vec ivec -> complex vec ivec
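
Three ideas from the deck, explicit time with `map`, `drop` as the inverse of delay, and the plain-stream definition shown under "Why not just a stream?", can be tried together in standard OCaml. This is a minimal sketch built on that one-line stream type, not Chronic's object-based `ivec`: the functions `iterate`, `map`, `delay`, `drop` and `take` are written here for illustration, with signatures that are assumptions rather than Chronic's API. It makes no attempt at the block computation, fan-out, or in-place update listed as desirable ivec properties, and it recomputes elements every time a stream is traversed.

```ocaml
(* The plain-stream candidate for ivec, as written on the slide. *)
type 'a ivec = Ivec of (unit -> 'a * 'a ivec)

(* x, f x, f (f x), ... *)
let rec iterate f x = Ivec (fun () -> (x, iterate f (f x)))

(* Apply f to every element; the code never mentions the current time. *)
let rec map f (Ivec next) =
  Ivec (fun () ->
      let x, rest = next () in
      (f x, map f rest))

(* Delay by n samples, filling the gap with [pad]. *)
let rec delay n pad s =
  if n <= 0 then s else Ivec (fun () -> (pad, delay (n - 1) pad s))

(* Discard the first n samples: the inverse of delay. *)
let rec drop n (Ivec next as s) =
  if n <= 0 then s else drop (n - 1) (snd (next ()))

(* Read the first n samples into a list, for inspection. *)
let rec take n (Ivec next) =
  if n <= 0 then []
  else
    let x, rest = next () in
    x :: take (n - 1) rest

let evens = iterate (fun x -> x + 2) 0      (* 0, 2, 4, 6, ... *)
let negated = map (fun x -> -x) evens       (* the slides' out = -in *)
let restored = drop 2 (delay 2 0 evens)     (* delay, then undo it *)

let () =
  List.iter (Printf.printf "%d ") (take 4 negated);
  print_newline ()
```

Because the whole stream is a value, `drop` can look ahead in a way that a per-sample loop holding only the current time cannot.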
