The General Temporal Workbench (formerly General Multimedia Workbench)
A Universal System for Exploring Time-based Phenomena
Donald Byrd
School of Informatics & Jacobs School of Music
Indiana University/Bloomington
2 Jan. 2010 1

Introduction: Time Is of the Essence
• Gandini Juggling’s Mozart (Symphony no. 25, 1st mvmt) performance
• “When you hear music, after it’s over, it’s gone, in the air. You can never capture it again.” (Eric Dolphy, 1964)
• Likewise for all complex temporal phenomena
• …and timescale can be microseconds or millions of years
• What if you really want to think about what happened (or, for creative art, what you want to happen)?
• Need a way to “freeze” it
  – Playing a recording over & over isn’t enough!
• Obvious answer: visualization. But what’s the best way?
• …and is visualization the only answer?
10 Sep. 2009 2

Motivation: We Have Big Problems
• Long-standing, difficult problems in all fields
• …plus deluge of data in many fields
  – Even arts & humanities are getting lots of hard data
  – Promises to help, but not much help so far!
• What we need is insight, not data; how to get there?
  – Widely recognized as an important goal
• The cross-discipline argument
  – Problems in all fields have much in common => a general system could be very valuable, if it’s possible
  – A general system is possible
• The cross-creativity/analysis argument
  – Problems of creators & analysts in a field have much in common => system for creation & analysis also very valuable, if possible
  – A system for both is also possible
3 July 2009 3

Examples
• Create/teach performers/rehearse/study a multimedia show
  – Gandini Juggling’s Mozart Symphony
  – …or marching band, or dance w/ music & lighting effects, etc.
• Study Hendrix’s Star-Spangled Banner (VAT vs.
published transcription)
• Look for patterns in patient’s medical history (Lifelines)
• Research on embodied language acquisition (Chen Yu)
• Study or learn role in opera/musical (GTW simulation)
• Organize ethnomusicology field research (EVIA AWB)
• Study world events (JFK assassination, Salem Witchcraft Trials)
2 Jan. 2010 4

Jimi Hendrix’s Woodstock Star-Spangled Banner
(a) Timeline overview w/ labeled segments (Variations Timeliner)
(b) In music notation, guitar tablature, & words (published transcription)
28 Feb. 2009 5

Timelines (1)
Applications of “timelines” in a broad sense
  – audio editing (a few hundred millisec. to a few min.)
  – juggling (a few seconds; vertically oriented)
  – video & motion data of two participants interacting in lab (seconds)
  – “bubble” diagrams of structure of pieces of music (minutes)
  – movie/video of animal behavior, interview, show, etc. (minutes to hours)
  – video annotation (hours)
  – weather (hours to days)
  – appointment calendar (week or month; 2-D)
  – the assassination of President Kennedy (a few days)
  – Salem witchcraft accusations (a month)
  – personal history: medical, criminal justice, etc. (years)
  – dinosaurs (tens of millions of years)
28 Mar. 2009 6

Timelines (2)
Assassination of President Kennedy (SIMILE Timeline)
12 Mar. 2009 7

Concrete & Abstract Forms in Different Fields
• Symbolic forms in music vs. text (& other areas)…

Which came first? | Symbolic form is… (Seeger, 1958) | Text | Music
Symbolic | Prescriptive | script | Score/performance parts, “sheet music”
Real-time (performance, speech, etc.) | Descriptive | transcript | transcription

• Real-time = low level (concrete)
• Symbolic = high level (abstract)
• Very high-level: segmentation w/ labels
• Ex: Hendrix Star-Spangled Banner
• Concrete & abstract forms are useful for all temporal phenomena
14 Apr.
2009 8

Different Fields Have Much in Common
• From music to remote disciplines in small steps, for three aspects of music
• Steps shown aren’t unique; many paths are possible
• All are complex enough that no one way of “looking” at it can capture everything
• For all, people often want to compare two or more instances of the phenomenon
15 Apr. 2009 9

Solution 1: Better Human/Machine Partnerships
Above figure is from Yu et al. (2008), slightly modified
• Integrate info visualization & analysis/data mining (Shneiderman 2002; Yu et al. 2008) => closed loop: use visual perception to generate hypotheses for analysis; present results of analysis visually
  – Cf. browsing vs. searching dichotomy; HCIR, visual analytics
• …or substitute synthesis for analysis, e.g., for composers
7 Apr. 2009 10

Solution 2: Allow All Sensory Modes
• Visual: visualization is most generally useful, but not the only answer
• Auditory: sound is central for many applications
  – sonification is surely valuable for some non-audio phenomena
• Tactile: important for the blind
• Other (olfactory): important for ??
• Don’t rule out support for all sensory modalities
14 Mar. 2009 11

Solution 3: Don’t Reinvent (or Do Without) the Wheel!
• Problems of all temporal phenomena have much in common
• …but people rarely share ideas or software across disciplines
  – Issue of “disjoint technical vocabulary/literature” (cf. Swanson 1988)
  – Value of exploratory search (cf. Jeremy Pickens, etc.)
• Idea: a “General Temporal Workbench” (GTW)
  – Formerly General Music/General Multimedia Workbench (GMW)
  – Supports multiple: coordinated, editable, interactive visualizations & sonifications (eventually “tactilizations”, “olfactizations”?)
  – …of multiple instances
  – …of any combination of temporal phenomena
  – …plus data mining & analysis (Solution 1)!
• For creative applications or analytical applications?
  – Both: the design is neutral
  – NB: the question is misleading, since insightful analysis involves creativity
15 May 2009 12

Use Existing Timeline Software
• There’s an endless variety of timelines
  – Orientation: horizontal, vertical, 2D, other
  – Spacing: linear, logarithmic, piecewise linear, etc.
  – Multiscale coordinated
  – Playability: audio, video, Flash, etc.
• Useful generic “timeliners” can go far beyond the basics
  – Ex: SIMILE Timeline & Timeplot
• Even doing “the basics” well isn’t that easy
  – Ex: axis tick marks & numbers for them
• Can consider all time-domain displays as variations on timelines
• But what is there besides time-domain displays?
11 Sep. 2009 13

Use Existing Frequency Domain Viewers
• Frequency domain = patterns in time domain
  – Example: economic cycles; Kondratiev’s theory of periodic collapses of capitalist economy :-)
  – Time/frequency domain (hybrid) more useful than pure
• Visualization example: spectrogram (via Fourier analysis)
• Very well-known in hard sciences, less in soft sciences
• …almost unheard of in arts & humanities, & by public
  – Exception: computer music
• Are periodic changes in cultural areas plausible?
  – Politics: U.S. House of Reps. elections correlate w/ franked mail
  – Direct experience important => periods of 1-2 generations(?)
  – Higher education population turnover => periods of 4 years(?)
• …or periodic changes of blood glucose for diabetics?
  – Type-1 diabetics do constant self-medicating => need user-friendly tool
10 Jul 2009 14

What Fields are Candidates for a GTW? (1)
• What fields can really benefit from synergy of “not reinventing the wheel”?
• Relevant features
  1. Complex enough that no one way of “looking” at it can capture everything, i.e., needs multimodal access
  2. People often want to compare two or more instances of the phenomenon
  3. (less important) Specialized graphical notation(s) are widely used for symbolic form
19 Feb. 2009 15

What Fields are Candidates for a GTW?
(2)
• How many fields have at least Features 1 & 2?
• An unusual example: juggling (Juggling Lab)
  – No single way of looking at it can capture all the information
    • Standard: video and/or animated stick figures
    • Optional: notation (“siteswap”), timeline showing paths of balls
  – People often want to compare versions of a trick
  – General human & animal movement is really complex
• Conclusion: all non-trivial fields have 1 & 2; very many have all three.
  – Speculation: area w/ over (say) 100K person-years of serious interest probably has enough complexity to have Features 1 & 2
  – Speculation: area w/ over (say) 500K person-years of serious interest probably has enough complexity to have all three
19 Feb. 2009 16

HCI: Multiple “Visualizations” Can be Great or Worthless
• Parable of blind men & elephant
• Point of multiple visualizations: let the user put the pieces together & “see” big picture
  – The more different the visualizations involved, the better…
  – But the more different the visualizations, the greater the danger of user getting confused!
  – Ease of navigating between is critical => need coordination
  – Often helpful to have a (small) overview on screen
  – Ex: viewing modes in PowerPoint
  – Ex: “Scrollbar with confetti” (Byrd 1998) gives overview with (v. often) no additional screen space
• Similar principles apply to sonifications, etc.
1 Mar. 2009 17

The Ultimate Music-and-More System (1)
• If system could do “everything” with music, should be useful for lots besides music!
• Not just useful for many domains, allows synergy/leverage
  – Related to “abstraction”, “modularizing”, “factoring”
  – = breaking problem down into separate parts
  – Cf. high-school algebra
  – Result: no need to reinvent the wheel
• Example: timelines
• Example: apply frequency-domain approaches & software in many fields
• But could a program do that? Is this practical?
26 Mar.
2009 18

The Ultimate Music-and-More System (2)
• Practical iff it can be broken down to independent chunks
• Modular design (in layers) is vital
• Architecture plan for GTW
  I. Completely general framework: no knowledge of domain
  II. Generic domain-specific modules for file I/O, support for low-level modules, maybe “automatic” alignment
  III. API for user-written analyzers & “visualizers”
18 Feb. 2009 19

Architecture for a General Visualizer/Analyzer (1)
• Configuration = software + UI (windows, etc.)
• Software for common audio & video uses…
15 Mar. 2009 20

Architecture for a General Visualizer/Analyzer (2)
For sequential art & movies based on sequential art (John Walsh)…
15 Mar. 2009 21

GTW “Screenshot” 1
• Scenario: music-informatics researcher (or ethnomusicologist) comparing two audio-segmentation algorithms
• …or composer comparing input & output of synthesis programs
20 Apr. 2009 22

GTW “Screenshot” 2 (& “Demo”)
• Scenario: singer (or conductor) comparing videos of performances to learn role in opera, musical, etc.
• …or stage director, choreographer, or lighting designer comparing previous versions to own ideas
• …or scholar studying performances (perhaps juggling w/ music!)
4 Apr. 2009 23

What’s Special About the GTW?
• Design (& planned implementation) for our “solutions”
  – Better human/machine partnerships: tight coupling of visualization & analysis
  – Don’t reinvent wheel: any presentations of anything temporal in any combination
• Factors, roughly from most to least fundamental:
  1. Architecture separates framework from assumptions about use (domain knowledge)
  2. Support for rapidly changing focus between very different visualizations at vastly different scales
     – Can automatically adjust (in own windows) which visualizations take screen space & sizes/layouts of interfacing programs’ windows
     – Support for showing relationships between features in different views
  3. Configuration files set up internals and UI; experts can create for each use case
  4. Designed to support comparing “similar” documents
     – Doesn’t assume consistency between coordinated representations
  5. Can act as “slave” (client for, e.g., SEASR/Meandre, Max/MSP, Pd) or master
  6. Fully multimodal: presentation in non-visual form (sonification, Braille) on same basis as visualization
  7. Analysis modules can communicate w/ presentation modules
16 Mar. 2009 24

The Truth: The GTW Can’t Do Everything
• …but it can enable YOU to!
• Catch: needs technology for your field
  – Level II (domain knowledge)
  – Level III (user interaction)
• Vast majority of users aren’t technology experts
• Solution: user communities
  – Enables experts in each field
• Something like an operating system
  – Not so hard because much more synergy (=> less new work per field) than now
2 Jan. 2010 25

Similar Tools for Non-Temporal Phenomena
• Existing, very general tools for other situations
  – Network Workbench (Katy Börner/IU SLIS): visualize/explore networks
  – Google Maps API: visualize/explore “space” (surface of the earth)
  – Both have proven very useful
  – But many phenomena have temporal and non-temporal aspects
    • Temporal & network: artificial life, computer games, studying software (debugging, etc.): traversal => temporal form
    • Temporal & spatial: folksong style vs. region of origin, art or general history, etc.
      – Cf. Timemap (Google), Salem Witchcraft Accusations webpage
    • All three: public health (as in epidemiology)
  – GTW could be used with other tools
2 June 2009 26

Getting Off the Ground
• Working on prototype, based on EVIA Annotators Workbench; also Variations, CIShell, Chen Yu’s system for “visual mining of multimedia data” (all from IUB)
• Other possible open-source starting points
  – Sonic Visualiser, SyncPlayer, SIMILE Timeline, etc.
• Connections to general tools for nontemporal visualization
  – Network Workbench (Katy Börner/IU SLIS): networks
  – Google Maps API: “space” (surface of the earth)
• Connections to other general tools
  – Meandre for SEASR (UIUC): humanities/social science research
  – Max/MSP, Pd: musical audio
2 Jan. 2010 27

Conclusions
• How do I know applications are realistic?
  – Many probably aren’t, but many, many possibilities exist!
  – Have ca. 30 usage scenarios, ca. half written/endorsed by experts
  – Some examples (all by experts)
    • Ruth Stone: ethnomusicology field work
    • Amar Flood: nanoscience/nanotechnology
    • Larry Yaeger: artificial life
    • Elaine Chew: annotating video of computer-aided musical improvisation
    • John Walsh: comic books/movies
    • Philip Gossett: musicology
  – Personal knowledge/experience for a few
2 Jan. 2010 28

End
• Thanks to Geoff Chirgwin, Will Cowan, Allen Winold, Paul Sturm, &…
• THE END
20 Feb. 2009 29

Extra Slides
• Following slides are just in case…
rev. 18 Feb. 2009 30

Good Design for Music Can Be Good for Many Things
• Cf. “Why Studying Music is Both Difficult and Important” (Byrd 2009)
  1. Music is an art => people use elements in unusual ways
  2. Music is a performing art => performances & symbolic representations
  3. Much music has complex synchronization requirements
  4. Music involves many different instruments, often in groups. Leads to:
     – Arrangements/transcriptions for other instruments
     – Versions for players with different levels of skill
     – Notation may represent sounds or actions
  5. Music is often combined with text via singing, narration, etc.
  6. Music is extremely popular, so:
     – Some works exist in many versions, arrangements for different ensembles, etc.
     – Handling challenges is important, even on purely economic grounds
rev. 18 Feb.
2009 31

HCI: Searching, Browsing, & Visualization
• Visualization is essential for browsing, merely helpful for searching
• In browsing, user finds everything; the computer just helps
• Browsing is obviously good because it gives user more control, but few systems emphasize it. Why?
  – “Users are not likely to be pleasantly surprised to find that the library has something but that it has to be obtained in a slow or inconvenient way. Nearly all items will come from a search, and we do not know well how to browse in a remote library.” (Lesk, p. 163)
    • For “and”, read “as long as”
• Searching is more natural on computer, browsing in real world
  – Effective browsing takes very fast computers, which are widely available now
  – Effective browsing has subtle UI demands
• Cf. HCIR, visual analytics, visual searching, etc.
7 Mar. 2009 32

Why juggling? Who cares?
• A surprising domain, but realistic
  – Features 1 & 2 apply
  – Feature 3 applies in part: has established (though not graphic) notation, “siteswap”
• Many juggling programs available
• GTW framework has support for:
  – Control of tempo, including pausing or going backwards
  – UI for (temporal, not spatial) zooming in on details
  – Synchronization of multiple videos and/or animations
  – Framework for auto. synchronization
  – Framework for combining independent visualizations
• Animal motion in general is much more complex => more need for GTW!
  – Ex: dancing (Labanotation, etc.)

Structure in Basic Representations of Music & Audio
Audio: no explicit structure
MIDI: simple, regular, well-defined structure
Western Music Notation: very complex, irregular structure; some parts well-defined, some not, and what’s well-defined isn’t well-defined
10 Feb. 09 34

Basic Representations of Music & Audio
1. Audio (e.g., CD, MP3): like speech
2. Time-stamped Events (e.g., MIDI file): like unformatted text
3.
Music Notation: like text with complex formatting
• Time scales of graphs: #1, milliseconds; #2 & 3, seconds
• Essential difference among forms: “knowledge representation” = explicit structure
10 Feb. 09 35

“Isn’t it a mistake to use music notation this way?”
• Chris Raphael’s question about Hendrix transcription
• It’s obviously useful: easy to find phrases, “Taps”, etc.
• …but seriously misleading in places
• But CMWN is “always” misleading!
• Is it useful enough to justify danger of misleading?
• Knowledge representation has inevitable bias (Davis et al. 1993); notation has more bias (Wiggins et al. 1993)
• Fundamental issue of transcription in ethnomusicology
• Conclusion: use it, but be careful
  – Cf. my “Logician General’s Warning” on classification
  – …in fact, transcribing requires classifying constantly
12 Feb. 09 36

Sequential-Art/Movie: The Hard Goodbye (1)
• From Frank Miller’s “Sin City” series
• John Walsh (SLIS): want to compare comics, movies of them, etc.
18 Feb. 2009 37

Sequential-Art/Movie: The Hard Goodbye (2)
• From Frank Miller’s “Sin City” series
• John Walsh (SLIS): want to compare comics, movies of them, etc.
18 Feb. 2009 38

Types of Visualizations of Music (and more)
• Is visualization static or dynamic?
  – Dynamic = time represented by time
  – Static = time represented by space
• What features are visualized?
• What basic representation? Audio, symbolic, both?
  – Easy to generalize to plays (score = script) & other text phenomena, dance, etc.
18 Feb. 2009 39

Types of Visualizations of Music (and more)
• Hendrix example uses coordinated visualizations
  – Generalization of parallel, aligned, synchronized, etc.
• How are multiple visualizations coordinated?
  1. Parallel panes of a single window
  2. Superimposed in a single window
  3. Separate coordinated windows
• Forms 1 & 2 apply directly to audio (incl. sonification)
• Easy to interpolate between forms 1 & 2
  – Categories in the real world are rarely discrete
26 Feb.
2009 40

The Ultimate Music System
• Original goal: visualizer that can do anything with music
  – Handle any no. & combination of visualizations
  – Static visualizations: audio, any kind of notation, structural diagrams, etc.
  – Dynamic visualizations: video, etc.
  – Automatic (or near-automatic) synchronization
  – Support OS-level technologies (QuickTime, etc.)
  – Easy-to-learn UI allowing high degree of control
    • User may want frequent extreme zoom changes => help with
• If it could do all that, should be useful for lots (domains with >=2 Features) besides music!
20 May 08 41
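
Sketch: coordinated views under zoom
One way to picture the goal above (any number of visualizations, kept in step through frequent extreme zoom changes) is a single shared time window that every view observes. The Python below is only a minimal sketch under that assumption. `TimeWindow`, `Coordinator`, `attach`, and `zoom` are hypothetical names chosen for illustration; they are not taken from the GTW prototype or from any existing timeline library.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class TimeWindow:
    """Visible span of the shared timeline, in seconds (hypothetical type)."""
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


# A "view" is any callable that redraws (or re-sonifies) for a given window.
View = Callable[[TimeWindow], None]


@dataclass
class Coordinator:
    """Hypothetical hub: owns one time window and pushes it to every view."""
    window: TimeWindow
    views: List[View] = field(default_factory=list)

    def attach(self, view: View) -> None:
        """Register a view and show it the current window immediately."""
        self.views.append(view)
        view(self.window)

    def zoom(self, factor: float, center: float) -> None:
        """Shrink (factor > 1) or grow (factor < 1) the visible duration.

        The time `center` keeps its relative position in the window, so the
        point the user is looking at does not jump during the zoom.
        """
        if factor <= 0:
            raise ValueError("factor must be positive")
        old = self.window
        frac = (center - old.start) / old.duration
        new_duration = old.duration / factor
        new_start = center - frac * new_duration
        self.window = TimeWindow(new_start, new_start + new_duration)
        # Every attached view receives the same window, which is what
        # keeps otherwise independent presentations coordinated.
        for view in self.views:
            view(self.window)


if __name__ == "__main__":
    hub = Coordinator(TimeWindow(0.0, 100.0))
    hub.attach(lambda w: print(f"waveform pane: {w.start:.1f} to {w.end:.1f} s"))
    hub.attach(lambda w: print(f"notation pane: {w.start:.1f} to {w.end:.1f} s"))
    hub.zoom(factor=10, center=50.0)
```

Because a view is just a callable here, a sonification or other non-visual presentation could in principle be attached on the same basis as a visualization; a real system would also need alignment between representations, which this sketch leaves out.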