

Sponsoring Committee: Dr. Ronald Sadoff, Chair
Dr. Tae Hong Park
Dr. Martin Scherzinger
THE HYPERORCHESTRA: A STUDY OF A VIRTUAL MUSICAL
ENSEMBLE IN FILM MUSIC THAT
TRANSCENDS REALITY
Sergi Casanelles
Program in Composition / Scoring for Film and Multimedia
Department of Music and Performing Arts Professions
Submitted in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy in the
Steinhardt School of Culture, Education, and Human Development
New York University
2015
ProQuest Number: 3729807
All rights reserved.
Published by ProQuest LLC (2015). Copyright of the Dissertation is held by the Author.
Copyright © 2015 Sergi Casanelles
ACKNOWLEDGEMENTS
I would like to thank Dr. Sadoff for his mentorship during my
doctoral studies, as well as my committee members Dr. Park and Dr.
Scherzinger. I would also like to express my gratitude to Dr. Gilbert for
assisting me in the first stages of my research and to Dr. MacFarlane for
his thoughts on McLuhan’s theories.
I would like to thank Dr. Kulezic-Wilson and Dr. Greene for their
comments on a forthcoming book chapter on Hyperorchestration. Their
suggestions helped me improve the present study.
TABLE OF CONTENTS

ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES

CHAPTER

I INTRODUCTION

II MUSIC ANALYSIS FROM MOVIE SOUNDTRACKS
   Introduction
   Movies as Multimodal Experiences…
   …and a Multimodal Approach to Music Analysis
   The Lord of the Rings: Creating Middle-earth with a Global Musical Ensemble
   The Social Network: Sound Quality as Emotional Signifiers
   Inception (2010): Challenging Reality and the Impossible Orchestra
   The Expanded Orchestra
   An Impossible Crescendo
   Expanding the Sound Palette with Synthesizers
   Man of Steel (2013): Expanding the Hyperinstruments
   Gravity (2013): Scoring the Soundlessness of Outer Space
   Interstellar (2014): The Church Organ and the Integrated Score

III PHILOSOPHICAL APPROACHES TO HYPERREALITY
   On The Matrix
   Baudrillard and the Hyperreal
   The Three Orders of Simulacra
   The Hyperreal
   McLuhan's Theory of the Media
   The Spoken and the Written Word
   Media and Simulacra
   A Process of Virtualization
   The Hyperreal Society

IV CINEMA ONTOLOGIES AND HYPERREALITY
   Introduction
   Narratives and Virtual Reality
   Imagination and Virtual Reality
   Digital Cinema and the Ontology of Cinema
   Digital Cinema and Indexicality
   Authorship and CGI: Gollum's Case
   The Case of Animation
   Prince's Perceptual Realism
   Cinema and Hyperreality

V FILM DIEGESIS, REALISM AND THE PERCEIVED REALITY
   Introduction
   The Myth of the Perfect Illusion
   Theories of the Film World
   Defining the Diegesis and a Semantic World for a Movie
   A Clockwork Orange
   Cinematic Realities in Etienne Souriau's Model
   Realism, Verisimilitude and the Diegetic World
   Building the Diegesis
   The Diegesis and the Hyperreal
   The Filmic World
   Aesthetic Realism

VI MUSIC, HYPERREALITY AND THE HYPERORCHESTRA
   Introduction
   Recorded Music and Musical Notation
   Musical Notation
   The Piano Concerto Recording
   The Studio Recording and the Acousmatic Sound
   Synthesized Music and the Musical Instrument
   Sampled Instruments
   CGI Actors and Sample Libraries
   Sample Libraries and Hyperreality
   The Hyperorchestra
   Ontological Approaches for the Hyperorchestra
   The Hyperorchestra and the Live Concert

VII MIDI, SAMPLE LIBRARIES AND MOCK-UPS
   Introduction
   Musical Instrument Digital Interface (MIDI)
   The Effect of MIDI
   Beyond the Mock-Up: Overview of the Evolution of Sample Libraries
   Technical Generalities of Sample Libraries
   Dynamic Layering and Crossfading
   Round Robin
   Legato Transitions and Crossfading
   Multiple Performance Techniques
   Sound Perspectives
   Connotation and Cultural Codes
   Replicating the Orchestra
   Analyzing EastWest's Hollywood Orchestra
   Orchestral Ensembles and Coded Orchestral Libraries
   Sample Libraries and World Instruments
   Epic Percussion Libraries
   Sample Libraries as a Blueprint for Screen Music Scoring Practices
   Hybrid Libraries
   Additional Considerations on Sample Libraries

VIII AESTHETIC COMPOSITIONAL FRAMEWORKS
   Introduction
   Sound and Music in the Hyperreal
   The Recording Framework
   The Contemporary Framework for Audiovisual Music Creation
   Hyperinstruments
   A Framework for the Hyperorchestra

IX HYPERORCHESTRATION
   Introduction
   Traditional Orchestration: an Overview
   The Spectral Movement
   Music Software Tools
   Defining Hyperorchestration
   Mixing as a Hyperorchestration Tool
   Mahler's First Symphony
   Defining Mixing
   Defining a Sound Perspective
   Sound Processing Tools
   Virtual Space Design
   Sound Processing and Aural Fidelity
   Creation of Hyperinstruments
   Hyperinstrumental Orchestration
   Augmenting and Expanding the Orchestral Sections
   Incorporating New Instruments
   Hyperorchestral Combination

X CONCLUSIONS
   The Hyperorchestra and Hyperreality
   Interaction with the Moviemaking Process
   Sound Sculpting: Integrating with the Rest of the Soundtrack
   Composition as Sound Design
   Evolution and Expansion Possibilities

BIBLIOGRAPHY

APPENDIX A
LIST OF FIGURES

1. The research process for the hyperorchestra. The process starts at the top left and ends at the bottom right.
2. Philosophical framework for the defining of an ontology for the hyperorchestra. The figure shows three main areas that will be addressed in different chapters: postmodern philosophy and semiotics, ontology for digital cinema and film diegesis, and ontology for music and recorded music.
3. Graphical representation showing how music is a subset of sound, which is in turn a subset of mechanical waves.
4. Tuning of the strings of the Hardanger fiddle. The first measure shows the tuning of the strings that are actually played, whereas the second measure refers to the sympathetic strings, which are not played.
5. Rohan's Theme (transcription, reduced score). This is the main theme of the second movie, The Two Towers, and appears multiple times throughout the movie.
6. Transcription/sketch of the opening track of The Social Network (2010).
7. Transcription of Inception's (2010) main chord structure. This sequence of chords constitutes the root for most of the movie's soundtrack.
8. Progressive crescendo written utilizing traditional musical score notation.
9. Progressive crescendo written utilizing Logic's piano roll. The top frame is used to write the notes, whereas the bottom frame is used to write continuous data. In this figure, a value of 0 in the lower area means the lowest possible dynamic and the highest value (127) means the highest possible dynamic. The top numbers are the measure and submeasure.
10. Photo taken during the drum sessions (Zimmer, 2013, p. 13).
11. Transcription of the melodic line for the main theme of Man of Steel (2013).
12. Transcription of Williams's main theme melody line for Superman (1978). The figure shows only the beginning of the theme.
13. Graphic representation of direct and reversed sound waveforms.
14. Graphic representation of different amplitudes in sine waves.
15. Screenshots of the title credit and establishing shot of Gravity (2013).
16. Score sketch showing the intervallic content of the organ part in the track S.T.A.Y. from Interstellar's (2014) soundtrack.
17. Abstract example of a directed graph. It does not refer to any specific narrative.
18. Graphical model describing performance, based on David Fincher's approach to performance and acting as described by Prince (2012, p. 102).
19. Graphic summarizing the different roles that contribute to generating a performance in audiovisual media, divided between physical and virtual processes.
20. Graphic representation of the framework for the generation of the diegesis and the filmic world.
21. Three different crescendo representations.
22. Representation of a crescendo utilizing traditional musical notation.
23. Schematic for an octave of a musical keyboard.
24. General classification of synthesized and sampled instruments. Pure synthesizers refer to instruments that create their sounds purely from sound synthesis. Hybrid synthesizers are synthesizers that also employ one or more (transformed) samples in order to produce their sounds. Sample libraries are designed by creating computer programs that utilize a set of samples to generate virtual instruments. The last typology, the recording, refers to any other typology of music recording.
25. Graphical schematic representing MIDI communication.
26. MIDI communication and human interaction.
27. Screenshot of the Logic Pro X piano roll window.
28. Conceptual graphical representation of the structure of a virtual instrument inside a sampler. It receives MIDI input that is used to decide which sound samples to trigger, and in which amounts, as output sound. In this example, CC1 is used to decide the mix of vibrato samples, whereas CC11 is used to decide the mix of dynamics; the combination of these two values determines the amount of signal that each sample contributes to the final result. In addition, another set of samples is triggered on special occasions. For instance, when a Note Off message is received, the sampler triggers a note-release sound. When the sampler detects two simultaneous notes (assuming the virtual instrument is a monophonic legato instrument), it triggers a legato transition between both notes, followed by the corresponding mix of samples for the last note played.
29. Hypothetical example of dynamic crossfading. The figure shows how the mix of each sample varies dynamically depending on the CC value. The percentage refers to the amount of signal from that layer that will go to the final mix. For instance, a CC value of 1 will output almost no sound, all of it coming from the piano (p) sample, because the output should represent the quietest sound the instrument can make. At values around 60, the sound should become close to an mp dynamic, which is why most of the sound comes from the mp dynamic layer. These values vary for each CC number, dynamically mixing all the dynamic layers accordingly.
30. Graphical representation of Hollywood Orchestra's input parameters for a string ensemble sustained sound (Rogers, Phoenix, Bergersen & Murphy, 2009).
31. Musical score representation of the string position possibilities for the violin ensemble in EastWest's Hollywood Orchestra (Rogers, Phoenix, Bergersen & Murphy, 2009, p. 23). The score shows which notes are played on which string depending on the finger position the composer has selected.
32. Tunings for EastWest's Ra (EastWest Sounds, 2008).
33. Visual representation of the main principles of the Attack, Decay, Sustain and Release (ADSR) model.
34. Graphical representation of the sound wave of a timpani hit, with the ADSR labels superimposed.
35. Music in the hyperreal. The graphic shows how sound sources from the physical world are transported to the virtual area for processing. Once this happens, music becomes hyperrealistic.
36. Graphic visualization of the processes involved in a traditional movie scoring composition process.
37. Graphical visualization of a framework for contemporary music scoring. As it is a nonlinear process, there is no specific linear set of steps; instead, the DAW becomes the core of the process.
38. Graphical representation of the hyperinstrumental design framework. It progresses from top to bottom.
39. Graphical representation of a conceptual framework for the hyperorchestra. Its main purpose is to show that the hyperorchestra is the result of an attitude that focuses on the sounding result in addition to a process of generation of meaning.
40. Score sketch of Mahler's First Symphony (mm. 17-25).
41. Amplitude distribution within the frequency spectrum. All three graphics represent a situation in which the sound output utilizes the maximum amplitude available.
42. Hyperinstrument model revisited.
43. Visual representation of the piano mixes in The Social Network (2010). The composers utilized three different microphone positions at different distances from the source.
44. Mixing perspectives with different panning. Although the source is just one piano, the sounding result generates the impression that the piano is in multiple locations at the same time, thus becoming multidimensional.
45. Waveform visualization of the effects of compression on a timpani hit.
46. Sketch of the effects of a compressor.
47. Stereo and basic surround speaker configurations.
48. Representation of multidimensional spaces. Even though only one piano was recorded, the result becomes multidimensional due to the mixing process, as the piano seems to emanate from different spaces.
49. Two-dimensional virtual configuration.
50. Simple velocity mapping.
51. Alternate velocity mapping.
CHAPTER I
INTRODUCTION
This dissertation will define a new approach to the creation of
contemporary music for audiovisual media, which I call the
“hyperorchestra,” a term derived from the concept of hyperreality as
defined by Jean Baudrillard (1994) and Umberto Eco (1990), among
others. Thus, the term hyperorchestra is the portmanteau of hyperreal
and orchestra, which implies a musical ensemble that inhabits
hyperreality.
My process for creating the term was twofold. First, it involved
research on the concept of film diegesis, or the imaginary world in which
the movie takes place. This is a pivotal concept in screen music
scholarship that has produced many problematic approaches. Providing
a definition for the diegesis is complex because it interacts with the
concept of reality. In addition, numerous narratives of contemporary
movies (e.g. The Matrix (1999), Inception (2010)) have challenged the
commonly accepted definition of what reality is. These narratives were
connected in time with the development of computer tools and digital
processes that were introduced in the course of creating a movie.
The second approach originates from the study of one of these
digital tools: the contemporary sample libraries for music creation. These
software products, created by companies such as EastWest Sounds,
Native Instruments and Vienna Symphonic Library contain a collection of
digital recording samples (the sample library) that are mapped into a set
of virtual instruments employing a piece of software called a sampler and
using a set of programming scripts that generate each virtual instrument.
Although all of these elements are constituent parts of these products,
nowadays they are simply referred to as sample libraries. In addition, the
composer does not interact directly with the samples, which are
frequently compressed and encrypted, but with the set of virtual
instruments that these libraries offer. Some of these instruments are able
to create music that sounds realistic even though it is produced virtually.
Moreover, they can create music that, while sounding realistic, could
not be replicated by an ensemble of performers in a concert hall.
Consequently, the music that is created using these tools could be
associated with the definition of hyperreality, or "models of a real without
origin in reality" (Baudrillard, 1994, p. 3). Similarly, the film diegesis might
also be associated with hyperreality when assessing the perceived
realism of the different scenes in a movie.
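The mapping described above (recorded samples organized into a virtual instrument that the composer plays through MIDI) can be illustrated with a minimal sketch. All names, layers and velocity ranges below are illustrative assumptions, not the design of any actual commercial sampler:

```python
# Minimal sketch of how a sample library's virtual instrument maps
# MIDI input to recorded samples. The file names and velocity ranges
# are hypothetical, chosen only to illustrate the principle.

# Each dynamic layer pairs a MIDI velocity range with a recorded sample.
DYNAMIC_LAYERS = [
    (1, 40, "violin_sustain_p.wav"),     # soft playing -> piano sample
    (41, 90, "violin_sustain_mf.wav"),   # medium -> mezzo-forte sample
    (91, 127, "violin_sustain_ff.wav"),  # loud -> fortissimo sample
]

def sample_for_velocity(velocity: int) -> str:
    """Select the recorded sample triggered by a MIDI Note On velocity."""
    if not 1 <= velocity <= 127:
        raise ValueError("a MIDI Note On velocity must be between 1 and 127")
    for low, high, sample in DYNAMIC_LAYERS:
        if low <= velocity <= high:
            return sample
    raise AssertionError("the layers above cover the full 1-127 range")

# A gentle note triggers the quiet layer; a hard one the loud layer.
print(sample_for_velocity(30))   # violin_sustain_p.wav
print(sample_for_velocity(120))  # violin_sustain_ff.wav
```

Real sample libraries go far beyond this lookup (crossfading between layers, round-robin alternation, legato transitions), but the composer interacts with exactly this kind of abstraction rather than with the samples themselves.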
On these grounds, this study is a philosophical investigation that
aims to provide a comprehensive definition of the hyperorchestra and of
the music produced by employing hyperorchestral models. This requires
providing a definition of, and ontology[1] for, the hyperorchestra, which
implies analyzing what constitutes the basis of hyperorchestral music
beyond the utilization of sample libraries. In addition, I have included two
related concepts: the hyperinstrument and hyperorchestration, both
derive from the hyperorchestra. The first aims to describe the different
virtual instruments that constitute a hyperorchestra. The second aims to
provide a set of principles for hyperorchestral writing, in a parallel manner
to what orchestration means for the traditional orchestra.
As a culturally grounded phenomenon, a theory for the
hyperorchestra needs to draw from ontological and aesthetic concepts.
In terms of ontology, I intend to elucidate the ontological implications of
the hyperorchestra and how they differ from traditional means of music
making. Furthermore, I will describe how the new ensemble integrates
into a wider framework for the hyperreal. I will pay special attention to the
relationship between the hyperreal and contemporary cinema, as well as
to how it has altered the definition of realism. Thus, I will address the
following problems:
[1] By ontology, one refers to the philosophical investigation related to the essence of being.
- How does the hyperorchestra relate to the ontological changes produced in contemporary movies, especially those caused by social and technological changes?
- How does the utilization of the hyperorchestra help to transform listeners' concepts of what is realistic and what is not?
- What are the ontological implications of the hyperorchestra? Should hyperorchestral music be considered ontologically different from music created by physical means?
In answering these questions, I will describe how the
hyperorchestra becomes an integral part of contemporary society and
narrative cinema, which is, nowadays, its most prominent cultural
manifestation. Inquiring into the transformation of the concept of realism
is crucial when the concept of reality dissolves. In discussing realism in
cinema, along with how the world of the movie is created, I will outline a
model for interpreting how society engages with the cinematic experience
and its music. With the results of these findings, I will be able to provide
an ontological definition for the hyperorchestra, grounded in how music
becomes a part of the hyperreal.
The ontological inquiry will reveal how the models for the
hyperorchestra are culturally rooted. In other words, the hyperorchestra is
not simply a set of objective modes and techniques of music making, as
it also involves an aesthetic intent. Thus, it is not possible to define the
hyperorchestra just by describing its ontology: it is imperative to provide
an aesthetic investigation, which should be grounded in contemporary
screen music practices. This inquiry will address the following problems:
- What are the aesthetic consequences of writing music for the hyperorchestra? How does composing for the hyperorchestra affect the overall aesthetic of screen music?
- What are the specific aesthetic elements that hyperorchestral music has incorporated thus far?
The research process is outlined in the following figure:
Figure 1. The research process for the hyperorchestra. The process starts
at the top left and ends at the bottom right.
The figure above highlights how the origin of the definition of the
hyperorchestra comes from the observation of contemporary screen
music practices and their new tools. As a result, the study is divided into
three main sections. It will begin by analyzing a selection of contemporary
movies that will serve as source material for the discussion. These movies
are relevant because they emphasize different aspects of the hyperorchestra. In
order to analyze them properly, I will also present an analytical framework
to approach musical analysis beyond pure score analysis. The resulting
analytical framework will emphasize the importance of focusing on the
sounding aspect of the music in tandem with its cultural implications. This
will add to the more traditional approach of analyzing music in terms of its
harmony, melodic lines, instrumentation and form. For instance, the
microphone placement (and consequently the sound result) of the
recorded piano line in the main theme from The Social Network (2010)
becomes at least as important as the melody itself in terms of providing
content for the scenes in which it is present.
Chapters III-VI utilize philosophical research in order to discern the
ontology of the hyperreal, with an emphasis on cinema and music. The
objective is to provide a theoretical framework to properly define the
hyperorchestra and its utilization in contemporary screen music. The
following figure summarizes the different elements that will constitute the
theoretical background used to define the hyperorchestra:
Figure 2. Philosophical framework for the defining of an ontology for the
hyperorchestra. The figure shows three main areas that will be addressed
in different chapters: postmodern philosophy and semiotics, ontology for
digital cinema and film diegesis, and ontology for music and recorded
music.
Chapter III is focused on describing an approach to the hyperreal
that draws mainly from Baudrillard’s and McLuhan’s philosophies.
Although Baudrillard discusses art in some of his works, his definition of
hyperreality aims to describe a global model of society. Consequently, I
will interpret Baudrillard’s definition of hyperreality in terms of a theory for
the media that will serve to discuss how cinema and screen music relate
with an artistic approach to hyperreality. As McLuhan’s focus was on
defining an encompassing theory for the media, and his work influenced
Baudrillard’s philosophy, a discussion of some of his theories is
appropriate. As a result, I will propose a model for defining how the media
and the arts might interact with the concept of hyperreality. Baudrillard
defined the hyperreal as a means to criticize a model of society, whereas
I intend to define the principles of hyperreality as an opportunity to
expand the artistic possibilities of music creation. Therefore, this model
will extend and, to some degree, modify Baudrillard’s approach to
hyperreality.
After defining the basis of what I refer to as hyperreality, Chapters
IV and V will focus on cinema. In Chapter IV, I will discuss the ontology of
cinema and how it engages with a hyperreal framework. For the present
study, I will focus on narrative movies that follow a widely known narrative
structure, which I will refer to as “cinema”. As my discussion
concentrates on contemporary cinema, I will generally try to avoid the
term “film”, as it implies the utilization of a physical film reel, which is
barely used nowadays. The chapter will focus on two main concepts.
First, I will describe a theory for the narrative that brings cinema closer to
hyperreality. Second, I will argue against an ontology for cinema that
associates an indexical property to the medium, utilizing Stephen Prince's
(2012) concept of “perceptual realism”.
In analyzing some of the most relevant theoretical grounds of
understanding the cinematic experience, I intend to connect cinema to
the concept of hyperreality that I described in Chapter III. I will argue
that, as an artistic product of the 20th century, cinema has strong
connections with an artistic approach to hyperreality. In other words,
cinema is an especially well-suited art to contribute to the generation of
hyperreality. In addition, a discussion involving the role of art in
generating hyperreality will serve as the foundation for a discussion of
how music interacts with the hyperreal.
Chapter V will describe how the world in which the movie happens
(the diegesis) is generated, how it connects with the concept of realism,
and how this process might be linked to the hyperreal. This is an
important discussion because, as I will argue, the diegesis is an imagined
world that might be connected with hyperreality. In addition, inquiring
about what the diegesis is, exactly, has become a key element of screen
music scholarship. From the perspective of screen music analysis,
describing the diegesis is essential in order to comprehend the role of
music for the screen, especially considering that the characters cannot
hear most of the movie's music: it is an element that is not normally
heard in the world of the movie. Therefore, I will
describe in detail the role of music in creating the diegesis, which will also
serve to describe the conceptual framework where the hyperorchestra
acts in the movie.
Chapter VI will focus on music in the hyperreal, which will serve to
define an ontology for the hyperorchestra. The chapter will build on the
concepts from the previous chapters in order to provide a framework for
understanding how music integrates into the hyperreal and how the
hyperorchestra engages with contemporary audiovisual practices.
Therefore, the chapter will complete the analysis of the relationship
between artistic expression and hyperreality from the point of view of
music and the technological innovations of the past decades. A key
element of the chapter will be the exploration of the ontological
consequences of recording music, along with its implications in terms of
the creative process. The utilization of sound recording allows for the
creation of music that transcends the traditional live experience of an
ensemble playing music together.
The last three chapters will define the aesthetic basis of the
hyperorchestra. First, in Chapter VII, I will describe how MIDI and sample
libraries have become tools for music composition that have transcended
the traditional musical model grounded in the notated score. Due to the
rapid evolution of technology, sample libraries have become a very
recent, but integral, tool for defining the hyperorchestra. In addition, their
novelty justifies a detailed examination of their principles.
Finally, Chapters VIII and IX will propose an aesthetic framework
for understanding the implications of the hyperorchestra in musical
aesthetics and in the processes surrounding composition for audiovisual
media. The final section draws on the examples and findings from the
analytical section in order to shape an aesthetic framework that is
commensurate with contemporary practices.
CHAPTER II
MUSIC ANALYSIS FROM MOVIE SOUNDTRACKS
Introduction
In this chapter, I will analyze a selection of music from
contemporary Hollywood movies. I intend to cover a wide but contained
range of diverse approaches to recent screen music writing, in terms of
both the resources employed to generate meaning and the music's
aesthetic functionality. The examples will
highlight a series of aesthetic approaches and creational techniques that
will serve as source material to describe the principles that define the
hyperorchestra. All the examples are from popular movies.
As an artistic form, all music has an associated aesthetic status
within a given culture at a given time, as a product of consensus among
the people. In other words, the status of a piece of music in a given
society depends on the opinion of the people of this society. This is an
important definition, as it highlights that the appreciation of a piece of art
does not only depend on its intrinsic qualities, but also on its interaction
with the values of the culture that experiences it. The aesthetic status is a
result of the sensory stimuli caused by the music in tandem with a set of
cultural values. The music that I will analyze in this chapter shares its
creators' aesthetic intent to push the boundaries of what was
considered screen music (in terms of style) when the movies were made.
Each of the examples discussed in this chapter will illustrate a different
mode of aesthetic expansion, thus representing the varied possibilities of
musical evolution offered by the hyperorchestra.
The movie and music selections presented here are meant to
highlight specific aspects that will serve as examples for shaping a
hyperorchestral framework. Thus, the collection does not intend to
become a representative sample of the most relevant movie scores of
recent years, as this would be a completely subjective endeavor, but aims
to present a variety of movie scores that have challenged and expanded
the established aesthetics and processes of screen music in different
ways.
In the music for The Lord of the Rings trilogy (2001-2003), the
movie orchestra extends to encompass instruments and cultural
traditions from around the world, creating one of the most complete
multicultural ensembles ever presented. Although the music mainly
comes from purely acoustic recordings of ensembles, the cultural
implications and the wide range of the instrumental resources present in
the score go beyond the traditional model of the Western orchestra. The
result is a blend of diverse musical traditions that are usually culturally
segregated. The second analysis will continue to focus on acoustic
instruments, even though the music for The Social Network (2010) is
mainly electronic. In the analysis of this movie’s music, I will focus on the
different versions of the main theme’s piano melody. The versions of the
theme are processed differently in order to alter its meaning and the
score is a remarkable example to illustrate how slight alterations of the
sound of one instrument can also alter the narrative meaning of the
music. The music for Inception (2010) might be considered archetypical
as a definition of the basic premise of the hyperorchestra. Its impossibly
massive brass ensemble is one of the clearest examples of how to
generate a verisimilar brass sound that could not be produced by
physical means alone. Similarly, the music for Man of Steel (2013)
serves as an excellent example of the concept and the possibilities of a
hyperinstrument. The careful music and sound design in Gravity (2013)
make the movie incredibly thought-provoking for a screen music analysis.
In this case, I will mainly concentrate on the nature and implications of
utilizing reversed sound, an extremely effective technique widely used in
contemporary musical practice and a key element of the movie's score.
Finally, I will discuss Christopher Nolan's
decision to use the pipe organ in Interstellar (2014), which is an
extraordinary example of the depth of music’s role in the transmission of
profound ideology and in the creation of the movie’s ontology beyond the
audiovisual world.
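The reversed-sound technique mentioned above amounts, at the signal level, to playing a recording backward, that is, reversing the order of its samples. A hedged sketch, using a made-up mono waveform rather than any audio from the actual Gravity score:

```python
# Illustration of the reversed-sound technique: reversing the order of
# a waveform's samples plays the recording backward. The "hit" below is
# a hypothetical six-sample mono waveform, invented for this example.

def reverse_audio(samples: list[float]) -> list[float]:
    """Return the waveform played backward (last sample first)."""
    return samples[::-1]

# A percussive hit: sharp attack followed by a fade-out.
hit = [0.9, 0.6, 0.4, 0.2, 0.1, 0.0]

# Reversed, the decay becomes a swell that builds into the attack,
# producing the characteristic "rushing toward the listener" effect.
swell = reverse_audio(hit)
print(swell)  # [0.0, 0.1, 0.2, 0.4, 0.6, 0.9]
```

The musical interest of the technique lies in this inversion of the envelope: a natural decay, heard backward, becomes an unnatural crescendo into a sudden cutoff.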
The analyzed examples will serve as the source material for
defining an aesthetic of the hyperorchestra and the hyperinstruments that
I will develop in the final chapters. Before this, I will preface the analyses
by describing the analytical approach and the main analytical tools that I
will employ.
Movies as Multimodal Experiences…
Although a movie’s mode of delivery is audiovisual, as a cultural
object a movie is a multimodal experience. In Chapter V, I will
offer a model for the movie world that contains its multiple levels of
meaning. Thus, the audiovisual material becomes just a part of the
information employed to generate the complete meaning of a movie as a
cultural object. In a similar manner, music is part of the audiovisual
content, although its full meaning resides in a broader level of
signification.
By including a referential, or semantic, level of analysis which is
based on how the music codifies meaning for a given scene, Sadoff
(2012) stresses the importance of a multifaceted analysis of screen music
in An Eclectic Methodology for Analyzing Film Music. As a consequence,
the musical content of a movie might be described as the sum of 1) the
music that appears in the audiovisual material; 2) its referential meaning;
3) the assumed musical background to comprehend the referential
meaning; and 4) any assumed meaning in the narrative. For example,
Kubrick assumes in A Clockwork Orange (1971) that the audience will
possess certain knowledge of Beethoven’s music beyond what is stated in
the movie’s narrative. Therefore, music that appears in a movie might
serve diverse functions on several levels of signification beyond its
relationship with the rest of the elements of the audiovisual track.
Conceiving the movie as a multimodal cultural manifestation is key in
order to unfold the intricate set of functions that music might serve.
…and a Multimodal Approach to Music Analysis
In tandem with a comprehensive approach to movies as
encompassing cultural entities, I will analyze the music from the selected
scenes utilizing a diverse set of perspectives. As I will argue in Chapter VI,
the musical content that the written score can portray is significantly
limited. Therefore, a musical analysis based solely on the score’s content
cannot properly describe all the musical elements present in the musical
piece. Sometimes, the graphical content of the musical score is not even
significant for the analysis of a given piece of music. For example, the
score for a musical passage that harmonically contains only a single
chord will probably be unnecessary for its analysis. Instead, other
musical elements that cannot be represented in the score might be much
more useful for describing the musical properties of the piece.
As a consequence, it is essential to separate the structural
functions in music from the musical device that has been traditionally
attached to them. Tonal harmony, for instance, has customarily provided
structure to the musical discourse by generating moments of tension and
release. However, it is possible to structure a musical discourse by other
means without the need for this type of harmonic change. More
importantly, this flexibility allows musical devices, such as harmony, to be
employed to fulfill other functions. It is especially important to note how
the expanded sound possibilities that are a product of the studio and of
digital sound manipulation offer new ways to generate the abovementioned
structural functions of music. In a similar manner to how tonal
harmony had a role in structuring the musical discourse, the rhythmic
discourse has traditionally been note-centered. Thus, in order to generate
a rhythmic texture, different notes were required. However, by using
digital manipulation of the sound, it is possible to achieve rhythmic
motion by other means, such as dynamic panning or adding a timed
tremolo sound processing effect.
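As a hedged illustration of this point, the following sketch (my own, using NumPy; the text names no particular software, and the function names are hypothetical) turns a single static tone into a rhythmic texture purely through a timed tremolo and dynamic panning, with no additional notes:

```python
import numpy as np

SR = 44100  # sample rate in Hz

def sustained_tone(freq=220.0, seconds=2.0):
    """A single static note: no rhythmic content of its own."""
    t = np.linspace(0, seconds, int(SR * seconds), endpoint=False)
    return np.sin(2 * np.pi * freq * t), t

def rhythmicize(mono, t, tremolo_hz=4.0, pan_hz=1.0):
    """Generate rhythmic motion from a static tone using only a timed
    tremolo and dynamic (moving) stereo panning."""
    # Timed tremolo: a square-shaped amplitude pulse at tremolo_hz,
    # alternating between full level and silence.
    tremolo = 0.5 * (1 + np.sign(np.sin(2 * np.pi * tremolo_hz * t)))
    pulsed = mono * tremolo
    # Dynamic panning: the sound sweeps between channels (equal-power law).
    pan = 0.5 * (1 + np.sin(2 * np.pi * pan_hz * t))  # 0 = left, 1 = right
    left = pulsed * np.cos(pan * np.pi / 2)
    right = pulsed * np.sin(pan * np.pi / 2)
    return np.stack([left, right], axis=1)

tone, t = sustained_tone()
stereo = rhythmicize(tone, t)
```

The resulting signal contains a single pitch throughout, yet the alternation of pulses and the continuous movement across the stereo field impose a perceptible rhythmic grid that, in score notation, would be invisible.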
Beyond Sound
A multimodal musical analysis relies on two main perspectives: the
overall sonic analysis and the analysis of the music as a cultural
manifestation beyond its sound qualities. This second perspective is
especially important when discussing music for the screen, because
understanding the cultural qualities of a piece of music is generally
significant when it comes to discerning how the piece interacts with the
rest of the elements of the movie. For example, the levels of referential
analysis described in Sadoff's (2012) Eclectic Methodology, the music
and filmic codes, and the textual analysis are useful frameworks for
approaching this analytical perspective.
Because music is a cultural construct, the referential associations with a
piece of music are culturally specific. This means that the interpretation
will vary as the culture itself evolves. A film from the 1930s will almost
surely be interpreted differently today than it was understood when it
premiered. Moreover, the interpretation also depends on the particular
cultural framework of each spectator. For instance, Kubrick
utilized a piece from Rossini's La Gazza Ladra in A Clockwork Orange
(1971). If a spectator first views A Clockwork Orange and then watches
another audiovisual work with a scene that utilizes the same music,
their perception of the second scene will inevitably be influenced by
Kubrick's film. Similarly, if the spectator is, for some reason, emotionally
attached to a concert performance of La Gazza Ladra, any utilization of
the piece in a movie will connect with the memories of that concert
event.2 Given these processes, the perception of the music in a
movie might differ greatly between audience members. Accordingly,
any musical analysis that acknowledges the cultural references of a piece
of music will consequently incorporate a degree of subjectivity. In order
to become as informative as possible, an analysis of the referential
content of a piece of music requires a proper acknowledgment of how its
content emanates from a given cultural framework. In the case of
analyzing a movie that utilizes La Gazza Ladra after Kubrick’s usage,
Kubrick's utilization of the piece must be described. In addition, it is
necessary to assess the cultural relevance of Kubrick's scene in order to
evaluate its cultural impact. For instance, Kubrick's films greatly influence
filmmakers and are, at the same time, popular and relevant cultural items.
2 This is the well-known phenomenon of "Darling, they are playing our
tune" (Juslin & Västfjäll, 2008, p. 567).
I will utilize a semiotic framework in order to discuss this
perspective of the musical analysis, which will revolve around the musical
code. Although most musical codifications are connotative (the
meaning is culturally processed), denotation (the literal
meaning of the sign) also plays an important role in the creation of
meaning for the movies. For instance, a flute melody might denote the
sound of birds (which may ultimately connote nature). Similarly, a set of
percussion hits might denote gunshots. Further, there may actually be
gunshot sounds in the music, as in the opening credits of The Good, the
Bad and the Ugly (1966). These sounds would, likewise, directly denote a
gunshot. In the expanded sonic world of the hyperorchestra, denotation
acquires a special significance. Composers utilize an array of sounds that
carry an attached denotational level of signification that might be
employed in the generation of meaning. Music is also denotative when
helping to provide structure to a movie. The musical theme functions as a
recognizable musical instance, besides its possible connotations, that
serves to mark different moments during the movie. If the theme is
associated with a particular narrative element (e.g. leitmotiv), its presence
will sonically denote the presence of the associated narrative item.
Musical codes that are connotative interact with the movie at different
levels of meaning. This is important to recognize in order to analyze their
functions at levels beyond the audiovisual: contributing to building the
diegesis, aiding the transmission of a philosophical idea, or presenting a
particular authorial aesthetic world. The proper interpretation of connoted
meaning associated with a piece of music will rely on the audience’s
knowledge and access to the cultural items to which the music is
referring. From this viewpoint, each spectator will generate a different
approach to the connoted meaning, which will depend on his or her
previous experiences and decoding capabilities for the audiovisual
information. Assuming that each spectator will interpret the connoted
information differently does not necessarily imply the absence of
common ground for the interpretation of artworks or narratives. One of
the advantages of a society sharing a cultural background is precisely its
ability to share tools for the decoding of these messages.
Verisimilitude Analysis
This rather specific perspective of music analysis lies between the
sonic and the beyond-sound modes of musical analysis. From a
hyperorchestral point of view, inquiring about the degree of verisimilitude
is informative and will help to complement the analytical findings.
Assessing the degree of verisimilitude of a piece of music involves
critically listening to the sound and comparing it to the cultural model for
music realism. A highly unreal sound might generate specific connoted
meaning when it interacts with an adequate narrative. On the other hand,
a highly verisimilar sound will tend to become more transparent when
analyzed from this angle. This is because unknown sounds, or
sounds that are not present in physical reality, generally become
more noticeable and, therefore, increasingly significant in terms of their
narrative role.
Sonic Analysis
As I mentioned before, the analysis of the aural properties of the
music needs to include a broad set of perspectives beyond traditional
music analytical tools. Before describing the different approaches to a
sonic analysis of music, it is worth inquiring into how to properly define
music. It seems clear that music is a subset of sound, which is at the
same time a subset of mechanical waves (see Figure 3).
Figure 3. Graphical representation showing how music is a subset of
sound, which is in turn a subset of mechanical waves.
More interestingly, sound is, by definition, the mechanical waves
that the human ear can capture and the human brain can process. Thus,
the distinction between which mechanical waves are considered sound
and which are not is a combination of the physiological limitations of the
human ear and the brain’s ability to process sound inputs. Consequently,
the differences between what is and what is not sound will depend solely
on what each person is able to hear. As a subset of sound, music is
defined arbitrarily. There is no specific physical or physiological
property that defines music as a cohesive subset. Instead, the sounds
that are considered music are defined arbitrarily according to cultural
background. Thus, the borders that separate music from sound are
flexible, variable and blurred. They evolve over time and vary depending
on the cultural background of each listener. Moreover, the borders might
further mutate depending on the musical style. As I argued before, in
screen music, there is a tendency to merge the concepts of music and
sound, overturning the cultural constraints that impede music from
entirely embracing the full range of “sonic” possibilities available for
music making.
In calling this set of perspectives “sonic”, I am emphasizing an
analytical standpoint that incorporates sound as a main source material
for the process of musical analysis. From this viewpoint, the analysis
becomes partially phenomenological as it focuses purely on the sound
experienced, instead of attempting to discern how the sound is created
by the combination of diverse preexisting musical structures.
Nevertheless, utilizing this approach does not disregard the traditional
theoretical approaches for analyzing classical music, which, when
appropriate, become similarly useful in order to fully portray the whole
picture of a particular piece. In other words, a composer might generate
meaning in a piece of music through the utilization of leitmotivs or
specific harmonic progressions. In addition, this same music could
suggest additional meaning due to its pure sonic qualities beyond
traditional musical structural frameworks.
A sonic analysis allows for the highlighting, when necessary, of
aspects that go beyond traditional musical characteristics of pitch,
harmony, melody, and instrumentation. For example, a flute might be
presented with different amounts of reverb, or performed with diverse
amounts of air sound. In both cases, they are musical features that might
become paramount in describing the musical characteristics of the piece
and its associated meaning.
The Lord of the Rings: Creating Middle-earth
with a Global Musical Ensemble
The music for The Lord of the Rings trilogy (2001-2003) employed
a wide set of instruments from around the world, which expanded the
Western orchestra into a global musical ensemble. Although the music
was generated almost entirely from recording sessions, the mix of diverse
cultural traditions transcends what could have been achieved by using
Western musical models alone, which is why the music for The Lord of the
Rings is a good example of one of the elements of hyperorchestral
writing. I will focus the analysis on an interpretation of how music, created
by an extended and culturally eclectic sound palette, using a diverse set
of instruments from around the world, served to shape the audience’s
perception of the different cultures of Tolkien’s Middle-earth.
Director Peter Jackson describes how he envisioned the musical
approach for the movies: “More so to just scoring the film, I wanted the
music to reflect Tolkien, I wanted the music to also bring the world of
Middle-earth to life” (Jackson, 2001b, 00:39). By emphasizing that he
aimed for more than just screen music writing, Jackson implies that he
was expecting the music to act on multiple levels of signification beyond
the narrative. This is not unexpected considering that one of the
greatest challenges of the movie trilogy was to audiovisually recreate
Middle-earth: although the imaginary world was described in great detail
by Tolkien in the books, it had never before been fully constructed with
that amount of detail at the audiovisual level.
One of the complexities of creating the diegesis for Middle-earth is
the apparent lack of connection with our world. In Tolkien’s imaginary
land, there are several different cultures of humans, which are apparently
unrelated to any recognizable human culture, in addition to diverse
fictional races extracted from mythological tales, such as Hobbits, Elves
and Dwarfs, which have prominent roles in the story. Each of these
different cultures has an extensive and thorough background story. Even
considering the extended length of The Lord of the Rings movie trilogy,3
there is an unavoidable process of condensation and transformation of
the contents of the book, in order to adapt it for an audiovisual mode of
delivery.
On one hand, sometimes there is not enough space to provide the
entire background that is included in the books. On the other hand, a
written description naturally has a greater level of ambiguity and
vagueness,4 whereas an audiovisual depiction of the same situation will
need to be much more explicit. By using diverse strategies and
approaches, the representation in the movies of the cultures of
Middle-earth drew on diverse realities from the physical world, both in
the present and from the past. For instance, Rohan’s culture, which is a
human civilization, was inspired by European Nordic culture. Similarly, the
Elves of Lothlorien had an Eastern African and Indian essence (Jackson,
2002). The Hobbits and their land, the Shire, are reminiscent of Great
Britain. The insular lifestyle of the Hobbits, who live isolated from the rest
of Middle-earth, relates to the fact that Britain is actually an island.5
3 The extended versions of the movies last for a total of almost twelve
hours.
4 The writer expects the reader to use the imagination to build the world
and recreate the story in his mind.
It is in this context that the role and function of the music ought to
be analyzed. As Peter Jackson describes when explaining the role of
composer Howard Shore in the movies:
So he is doing two jobs at the same time: one is underscoring the
film, as providing an emotional link, bridge between the movie and
the audience, bringing the audience in; but is doing in such a way
which also is telling you a lot about the cultures of this world. (sic)
(Jackson, 2001b, 02:26)
By referring to “underscoring”, Jackson is describing what is
regularly attributed to nondiegetic music, which is music that functions
narratively in order to support the audiovisuals. In addition, Shore’s
design of the music assisted in providing a context for the cultures. For
example, one of the main instruments for Rohan is the Hardanger fiddle,6
which Shore discovered while studying Nordic music as an inspiration: “it
was part of the research for The Two Towers, looking towards northern
European sounds and thinking about the Viking, Nordic culture” (Adams,
2005, p. 40).
5 Although it is not the topic of this discussion, some of these references
might also be implied, more or less specifically, in the books. However, in
the movies there was a need to decide and establish these influences in
order to properly recreate the cultures, which left much less room for
ambiguity.
Doug Adams (2005) describes the instrument in The Music
of the Lord of the Rings Films: The Two Towers – The Annotated Score:
Often referred to as the national instrument of Norway, the
Hardanger Fiddle was thought to have been invented in the mid
1600s. The tone is bracing and emphatic, but moderate at the
same time. In Norwegian culture the instrument was used to relate
history and lore, and it functions much the same in the music of
Rohan. […] When the Rohan culture is introduced, it is proud but
sorrowful—a once great civilization beset by a failing king and
unending assaults. (Adams, 2005, p. 40)
The Hardanger fiddle provides a particular sound for Rohan by
drawing associations with the Nordic culture that served as a reference.
One of the main characteristics of this type of fiddle is that it has
sympathetic strings, which provide an open and somewhat colder sound
to the instrument, due to the unfingered, and, thus, vibrato-less sound
that emanates from the resonance of the strings that are not being
bowed. In order to fully achieve the type of sound that makes the
instrument singular, the composer needs to take into account the
characteristics of the instrument, while at the same time the performer of
the Hardanger fiddle needs to employ specific performance practices that
are anchored in the cultural tradition of the instrument in Norway.
6 It is a Norwegian bowed stringed instrument, similar to the violin, that
has a set of sympathetic strings that are not bowed.
In terms
of the musical content, in order to take advantage of the particular sound
of the instrument, the music written for the Hardanger fiddle should
maximize the resonant power of the sympathetic strings, which are
shown in Figure 4.
Figure 4. Tuning of the strings of the Hardanger fiddle. The first measure
shows the tuning of the strings that are actually played, whereas the
second measure refers to the sympathetic strings, which are not played.
The following musical transcription (Figure 5) of the beginning of
Rohan’s theme illustrates how the melody, and consequently the
harmony, is written by taking the instrument into consideration, as all the
important notes of the melody are part of the sympathetic strings:
Figure 5. Rohan’s Theme (transcription, reduced score). This is the main
theme of the second movie, The Two Towers, and appears multiple times
throughout the movie.
The Hardanger fiddle is one of several examples in the score that
utilizes instruments outside of the symphonic orchestra in order to feed
some of the cultural connotations of the instrument into the score. In
addition, this approach to music scoring and orchestration might
generate sonic associations between the different cultures of Middle-earth.
The music for the Shire, which employs several Celtic instruments,
also has an (Irish) fiddle as one of the main instruments, which creates a
musical connection with Rohan. Generating a cultural association through
music was purposefully executed when creating the music for Gollum. In
addition to the fiddle and other Irish instruments, the music for the Shire
frequently incorporates a dulcimer for the harmonic accompaniment.
Physically speaking, the dulcimer is a diatonic instrument made of a
series of strings attached to a wooden body, which are hit with metallic
hammers. In terms of sound, it is not far removed from the sound of a
harpsichord. Shore uses a cimbalom in the music for Gollum, because
the cimbalom is an instrument developed from the dulcimer, with the
added possibility of playing the full chromatic scale and with an extended
instrumental range.7 As Gollum was once a Hobbit, an instrument that
precedes the dulcimer seemed appropriate. Moreover, the music written
for the cimbalom, associated with Gollum, is highly chromatic, reinforcing
the concept of how the instrument “evolved” from the dulcimer (Jackson,
2002).
Technical evolution and its conflict with nature and traditional living
habits is a key theme for the trilogy. As Buhler states, “the Lord of the
Rings can be understood as a parable of modernization, whose icon is
the machine” (Buhler, 2006, p. 233). Buhler also quotes Tolkien’s letter in
which the author discussed the allegorical content of the story. Tolkien
refers to the allegory of the machine as being closely related to magic:
By the last [the Machine (or Magic)] I intend all use of external
plans or devices (or apparatus) instead of developments of the
inherent inner powers or talents – or even the use of these talents
with the corrupted motive of the dominating: bulldozing the real
world, or coercing other wills. The Machine is our more obvious
form though more closely related to Magic than is usually
recognized. (Carpenter, 1981, pp. 145-146)
From this point of view, the cimbalom is an almost mechanized
dulcimer, or a dulcimer that has gone through a process of development
that provoked an inevitable process of corruption from its original
diatonic mode, which parallels Gollum’s degradation through the magic of
the ring.
7 The cimbalom is a Hungarian instrument, which has also been used,
quite prominently, in the Hungarian-inspired score for the movie Sherlock
Holmes (2010).
In fact, the good cultures of Middle-earth are clearly pre-industrial,
with no specific reference to any machinery that helps their
production. Instead, Isengard, which is aligned with evil, represents, for
Shore, the industrial age (Jackson, 2001). As Buhler states:
Appropriately, the music is repetitive and impressively menacing,
grinding continuously like a giant, infernal machine, engine of
modern industry run amok. And it is hard not to extend the allegory
to the Orcs and Uruk-hai, who become by analogy the unwashed
proletariat reproduced only to feed the ravenous machine that
would consume the world. (Buhler, 2006, p. 245)
The accelerated building of an army in Isengard symbolizes the
power and dangers of the machine. Saruman and his workers bulldozed
Middle-earth in order to quickly create a militia of Uruk-hais, who are an
improved version of the Orcs. They employ the division of labor in order
to grow new soldiers and make swords and armor. The music is
predominantly in a 5/4 meter, which is reinforced by an unevenly
accented percussive ostinato.8 Therefore, the strong accents of this ostinato are
used to connote militarism at the same time that the unevenness of the
5/4 meter expresses the dangers of the industrial age.
8 The accent is on the first and fourth beats of the five-beat pattern.
Even more relevant are the instruments that constitute the
ostinato, especially the taiko drums and the distressed piano (grand
piano hit with metal chains). The taiko drums are a popular instrument
from Japan. However, the utilization of these drums in the music for
Isengard is unrelated to any intention of connecting Japanese culture with
the culture of Isengard and the Orcs. Rather, the utilization of the taiko
drums as war-like drums is similar to how the popular Japanese music
ensemble Kodo employs these instruments. The utilization of the taiko
drum in the music for Isengard marks another approach to the usage of
non-orchestral instruments, severed from their geographical
connotations. Instead, it is the particular performance mode of the drums,
when they are used as an instrument of war,9 that connects the taiko to the
warriors of Isengard.10 Hence, the taiko in the music for Isengard is
stripped of its cultural connotations and is subsequently employed by
utilizing the denoted meaning of one of its cultural functionalities. Thus,
the taiko does not even refer to any type of Japanese war scenario; it
merely denotes war as a universal concept.
The “distressed” piano represents yet another approach. In the
score, a distressed piano is a grand piano that is hit on the soundboard
using metallic chains, a performance technique commonly known in
contemporary Western classical music as an extended instrumental
technique.
9 Taiko drums were used for several functions beyond war.
10 Since the release of The Lord of the Rings, the taiko drum has been a
regular instrument used for epic battle scenes in other movies.
Generating new sounds with standard instruments from the
Western musical canon is achieved by applying new performance
techniques to the instruments. Another
way of physically expanding the sound is by using everyday tools and
objects, such as the anvil, which was similarly used to create the rhythmic
pattern of the music for Isengard. In both cases, the output produced by
these instrumental sounds ties to Isengard’s culture more denotationally
than connotationally. The sound of an anvil hit by a hammer, or of the
metallic soundboard of a grand piano hit by metallic chains, has a direct,
almost physical link to the activities of the Orcs and Uruk-hais of
Isengard.11 This mode of instrument pairing with a culture is paralleled by
the use of a series of wooden instruments, such as the marimba, for the
music of the Ents of Fangorn, thus denoting their wooden nature.
The music for The Lord of the Rings movies presents one of the
broadest sets of sounds achievable by using physical means alone. This
extensive sound palette, along with its intricate multicultural
connotations, is employed in order to signify a very diverse range of
cultures that populate the fictional Middle-earth. In addition to sound
11 They forged swords and other weapons by using anvils and similar
tools.
variety, the instruments provide meaning that helps to create the diegetic
world for each culture by suggesting cultural connotations with existing
cultures from our world, or through the functional aspects of the
instruments, which act as a denotation of what they represent.
The different cultures of Middle-earth are constructed through the
amalgamation of different bits of information from existing human
cultures in addition to newly invented material. From this perspective,
Middle-earth is a hyperreal product of a globalized earth, almost a literal
global village that combines elements of the pre-modern cultures of our
existing world. The music written for the movie trilogy is hyperorchestral,
as it continuously blends different cultural traditions in order to generate
different levels of signification.
The Social Network: Sound Quality as Emotional Signifiers
The music for The Social Network (2010), composed by Trent
Reznor and Atticus Ross, is mainly electronic. However, I will focus on
one of the few acoustic sounds present on the score: the piano part that
plays the main theme. I will focus on this because it is an outstanding
example of how specific sound qualities can have a great influence on the
emotional content of a musical element. During the movie, the musical
idea appears in three different sonic variations that differ mainly in their
instruments’ sound qualities. The harmonies, the notes and the tempo are
broadly the same in all three versions. The twelve notes of the melody are
performed by a solo piano.
Director David Fincher described his impressions of the piano part
in the context of the musical track: “It was kind of astounding, because it
seemed to talk about this loneliness, the piano was this lonely, so
childlike, and yet it had this seething anger vitriol that was sort of
bubbling under it” (Fincher, 2011, 10:52). In fact, Fincher explains that he
chose the piece from among a set of tracks of musical ideas sent to
him by composers as starting material, before even watching the movie.
It was the director who selected the track that ultimately acted as the
opening and main theme of the movie. Reznor (Fincher, 2011) elaborates
on the musical qualities of the track, in reference to how it changed the
overall mood of the movie:
What I liked about that piece was it felt, the melody felt grand to
me, it felt bold and melancholy-tragic-important in some sense.
But, to be honest with you, I wasn’t sure how dark or isolated
David [Fincher] wanted this movie to come across. Because seeing
a real rough cut with other music in there, tempted in, it really
made the whole film different, it felt much more casual, (…) a
college movie about kids making stuff and screwing each other
over. But with that other thing in there, with our thing, it suddenly
felt like something is going on beneath the surface here. It gets the
attention in a vulnerability that I think it changed the whole tone of
the movie. (11:08)
The music includes a synth drone that creates a pedal note that
acts as one of the pivotal elements of the piece. It has a tremolo-like
sound, which was created by recording and transforming the sound
produced by an electric guitar. The tone of the note is mixed with the
sound of the metallic string. It is presented differently in the left and right
channels: in the left channel, the sound is direct and clear, whereas in the
right channel, the sound has been equalized in order to appear as though
it was captured from a distance. Further, an additional sense of motion is
created by unevenly changing the volume of the left and right parts of the
sound, which generates a particular movement in the panning that cannot
be directly associated with a physical object moving around the stereo field. The
result provides a constant non-static background that is unsettling.
During the piece, other sounds are added, thus complementing the drone
both below and above the original pitch. On the top of that sound, there
is the piano melody, which floats freely over the soundscape that the
drone generates (Figure 6). The melody revolves around six of the seven
tones of the D major scale, excluding the seventh, whereas the drone is
pitched to middle D. The tempo is just an approximation, as the music
does not have a precise meter.
Figure 6. Transcription/sketch of the opening track of The Social Network
(2010)
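The construction of the drone described above can be sketched in code. The following Python fragment (my own illustration using NumPy; the composers' actual tools are not documented here, and the drone pitch, filter, and modulation rates are assumptions) models a sustained D-pedal whose left channel is direct while the right channel is dulled as if captured from a distance, with each channel's volume drifting at an unrelated rate so that the apparent position moves in a way no single physical source could produce:

```python
import numpy as np

SR = 44100  # sample rate in Hz

def drone(freq=146.83, seconds=3.0):
    """A sustained pedal tone on a D, standing in for the
    guitar-derived drone described in the text."""
    t = np.linspace(0, seconds, int(SR * seconds), endpoint=False)
    return np.sin(2 * np.pi * freq * t), t

def one_pole_lowpass(x, alpha=0.02):
    """Crude 'distance' EQ: a one-pole low-pass that dulls the sound,
    as if it were captured far from the source."""
    y = np.empty_like(x)
    acc = 0.0
    for i, s in enumerate(x):
        acc += alpha * (s - acc)
        y[i] = acc
    return y

def unsettled_stereo(mono, t):
    """Left channel direct and clear; right channel 'distant'; the two
    channel volumes drift at unrelated rates, producing panning motion
    that cannot correspond to one object moving in space."""
    left_gain = 0.6 + 0.4 * np.sin(2 * np.pi * 0.31 * t)   # slow drift
    right_gain = 0.6 + 0.4 * np.sin(2 * np.pi * 0.47 * t)  # different rate
    left = mono * left_gain
    right = one_pole_lowpass(mono) * right_gain
    return np.stack([left, right], axis=1)

mono, t = drone()
stereo = unsettled_stereo(mono, t)
```

Because the two gain curves never align, the image wanders without ever tracing a coherent trajectory, which is one plausible reading of why the texture registers as a constant, non-static, unsettling background.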
These two elements together, the synth drone and the piano, are
what portray the anger and loneliness mentioned by Fincher. The director
explains that after listening to this track, he envisioned a triptych of the
evolution of Mark Zuckerberg’s character (portrayed by Jesse Eisenberg),
which would include modified versions of the same track (Fincher, 2011).
In the opening scene, the piece of music starts just after Zuckerberg’s
girlfriend breaks up with him for being arrogant and derogatory. The
music appears again at minute 49 when, during one of the depositions,
Zuckerberg informs the lawyer that he does not have his full attention.
The theme appears for a third and final time at 1 hour 43 minutes when
Facebook cofounder, Eduardo Saverin (Andrew Garfield), explains in
another deposition that his shares of the company were diluted to just
0.03%, which symbolizes Zuckerberg’s betrayal. These three moments
can be interpreted to connect the three main themes of the movie:
Zuckerberg’s ex-girlfriend, his best and now ex-friend (Saverin), and his
first, and also former, partners (the Winklevoss twins).
The movie creates a fictional Zuckerberg that resembles a modern
version of Citizen Kane (1941). He is portrayed at the beginning of the
movie as an immature, insecure, intellectually brilliant, and potentially
obnoxious person. The movie shows how immaturity and insecurity might
trigger immoral behavior, which is exaggerated to facilitate the fast-paced
evolution of his project. In some ways, it seems that Zuckerberg’s
character was not able to evolve and mature as a person at the same rate
that his product, Facebook, was advancing in its development.
It is in this context that the musical track successfully intertwines
with the different levels of the movie’s meaning. Fincher acknowledges
how the two main elements of the music, the synthesized part and the
piano melody, are great vessels to express the bitterness of the
character, as well as the immaturity of someone who is still partially a
child. The music aids in establishing two possible explanations for the
actions of Zuckerberg’s character: his immaturity and an inherent
bitterness toward the rest of the world. Each time that the theme
reappears, the piano, which represents his innocence, is recorded from a
greater distance, progressively dissipating.
In addition, the variations in the music for this triptych help to
reveal a process of personal detachment. Zuckerberg’s character
disconnects from who he was at the beginning due to the power and
influence of his new position in society. It forms a parallel with the
encompassing concept of the movie surrounding the impact of Facebook
on the social lives of its users. In the final scene, Zuckerberg is shown
adding his ex-girlfriend as a friend on Facebook and refreshing the
screen, waiting for her acceptance. From this perspective, the Facebook
profile page might act as a veiled portrayal of reality, similar to the
“veiled” piano created by the microphone placement on the musical
track.
From the musical side, it is remarkable what effect a different piano
sound can have on the meaning that emanates from the piece of music,
simply as a result of microphone placement. The different
sound denotes distance from the source, which connotes detachment, as
already stated. In parallel, the piano is an instrument that generally
connotes childhood or intimacy. Both meanings interact in the piece,
which demonstrates the importance of diverse aspects of the sound in
the creation of meaning. This concept could also
be applied to the synthesizer track; its portrayal of unsettledness is
partly denotational, due to its sound characteristics, and partly
connotational, due to its references (at least for Fincher) to the dissonant
music in The Shining (1980). In addition, its electronic nature helps to
signify youth culture, which correspondingly helps to generate the
diegesis from the beginning of the movie. Although the score has many
relevant elements, I have intentionally only focused on how the different
versions of the same track connected with the main conceptual arcs of
the movie, and how the evolution was portrayed in music by just altering
the microphone mixing.
Inception (2010): Challenging Reality and the Impossible Orchestra
The movie Inception (2010), directed by Christopher Nolan,
suggests that dreams may be, in certain ways, equivalent to hyperreality.
This is because dreams are not a representation of the physical world
although they feel perceptually realistic when we are experiencing them.
The music, composed by Hans Zimmer, is an archetypical example of an
expanded orchestral sound beyond what would be possible to achieve
just by physical means, thus making it hyperorchestral. In this analysis, I
will focus on how the hyperorchestral sound of an expanded screen
music orchestra is created and how employing hyperorchestral
procedures provides meaning for the narrative and the philosophical content
of the movie.
The plot revolves around a group of dream specialists who
artificially create and share a dream with a person in order to steal
information from them. In order to accomplish that, they craft an artificial
world that could be described as the diegesis of the shared dream. In his
search for a new “architect”, or the expert who builds these worlds, Cobb
(portrayed by Leonardo di Caprio), the main character, remarks that what
he is offering is “the chance to build cathedrals, entire cities, things that
never existed, things that couldn’t exist in the real world”. In fact, in the
dream, the architect can create a reproduction of the physical world (even
though Cobb advises against that), an invented reality that would be
feasible in the physical world, or something that goes beyond what could
be possible in our physical reality. With this premise, Inception challenges
our definition of reality. Moreover, the movie goes beyond questioning
whether we might be living in a dream by asking if that question even
matters. Assuming that what we experience in a dream is equivalent to
what we can experience in the physical world, the differences between
dreams and reality become blurred.
From this point of view, Inception’s concept is akin to the main
themes in The Matrix (1999).12 In both cases, the created world (the
computer simulation in The Matrix and the dream world in Inception) is so
close to a physical reality that the two become nearly indistinguishable.
The targets of the dream-stealing team have to be tricked into believing
that they are experiencing a real event, in a similar manner to the
inhabitants of the Matrix.

12. In The Matrix trilogy, there is also an architect, who is the program (and
a character) responsible for creating the simulated world where humanity
is enslaved.

Although both movies engage with the concept
of hyperreality, Inception presents the topic from a different (and perhaps
more ambiguous) perspective, compared to how this is treated in The
Matrix. In Inception, there are no evil machines that have enslaved humanity,
created a hyperreality for the enslaved humans to inhabit virtually, and
must therefore be destroyed. Instead, there is a team of humans
who are responsible for creating dreams and the dream world.
Hyperreality in Inception is not portrayed as one of the dangers of our era
as presented in The Matrix but rather, as a personal, albeit dangerous
choice available to humanity.
In this general context, the score for Inception maintains
coherence with the narrative and the ideology of the movie. The
orchestral part of the ensemble consists primarily of brass and
strings. In addition, there is a formidable percussion complement
(synthetic and recorded), electric guitar and various synthesizers. In the
scene when Cobb trains the new architect (00:29:05), Ariadne (portrayed
by Ellen Page) learns how to design a world and how to transform it in
ways that transcend what can be achieved in reality. The scene presents
one of the most astonishing images of the movie, whereby Ariadne
decides to fold the world she created. In doing so, the people who inhabit
43
the world, which are projections of Cobb’s subconscious, become
aggressive toward her because Cobb’s subconscious perceives that the
changes being made to his dream are being carried out by someone else.
At some point, Ariadne recreates a Parisian bridge, which infuriates
Cobb. He tells her that she should never recreate something that exists in
reality or she would be unable to discern what is real from what is not.
Then, after Ariadne asks him if this was what happened to him, a woman
(Cobb’s projection of his deceased wife) kills her and she wakes up.
The music13 for the scene begins when Ariadne starts to
manipulate the world in a manner that would not be physically possible;
in Ariadne’s words: “My question is, what happens when you start
messing with the physics of the world?” The harmonic and melodic
structure of the music at this moment contains only a repeated
progression of four chords14 (Figure 7).
Figure 7. Transcription of Inception’s (2010) main chord structure. This
sequence of chords constitutes the root for most of the movie’s
soundtrack.
13. The music for the scene corresponds to the music in the track "Radical
Notion" from Inception's original soundtrack album (Zimmer, 2010).
14. In fact, this four-chord progression constitutes one of the main musical
elements throughout the whole score.
Instead of being the musical core of the cue, the harmonic and
melodic progression becomes the ground (harmonically diverse and
fluctuating) upon which the music is constructed. Orchestration,
dynamics and rhythm become the core musical elements. The music at
the beginning is created as a distended dynamic expansion built on the
first and third chords and, conversely, as a contraction on the second and
fourth. The main elements in this dynamic expansion are brass and string
sounds.
The Expanded Orchestra
The brass section sounds verisimilar even though this kind of
sound could not be produced in reality. The amount of brass necessary
would be immense; furthermore, it is not clear which type of hall would
make it possible to render such a clear attack from a massive ensemble with
such minimal reverberation and delay. When adding the
string section to the musical mix, imagining a physical performance for
the whole ensemble becomes even more problematic. The strings are
playing at a reasonably soft dynamic, judging by their tonal color. It is not
possible to even estimate the number of string instruments that would be
required to balance that amount of brass, especially when the strings are
playing at a soft dynamic level.
As a result, the music sounds verisimilar in terms of an orchestral
sound, as the sounds per se have their origins in a live recording of
orchestral instruments. However, the music also sounds, at the same
time, like a maximized version of reality, similar to the way that
computer-edited photographs seem to idealize the human body or nature.
An Impossible Crescendo
In addition, the dynamic crescendo is carefully measured in order
to enhance one’s emotional reaction to the scene. It begins as a slow
crescendo although it becomes quite pronounced and ends abruptly.
This produces the sensation for the audience that the music is louder
than what could be produced by physical means in the real world. As I
will discuss in Chapter VI, the musical notation is vague when attempting
to precisely notate the amount of crescendo (or decrescendo) in a
passage. Figure 8 depicts an attempt to notate the degree of crescendo
in the chords of the music track in Radical Notion as precisely as possible
using standard musical score notation. Figure 9 shows a view generated
by the sequencer’s MIDI editor, which reveals a much closer approach to
the actual dynamic level of the instrument part.
Carefully calculated crescendos are a fundamental tool to generate
expressivity within an audiovisual sequence. The possibility of
manipulating, with precision, the increase in the dynamics at any moment
becomes a tool for synchronizing the music with the content of the
scene. Figure 8 shows the best way that a score can practically represent
the detail of a crescendo, whereas Figure 9 portrays the possibilities of
utilizing the piano roll to precisely shape a crescendo.
Figure 8. Progressive crescendo written utilizing a traditional musical
score notation.
Figure 9. Progressive crescendo written utilizing Logic's piano roll. The
top frame is used to write the notes, whereas the bottom frame is used to
write continuous data. In this figure, a value of 0 in the lower
area represents the lowest possible dynamic and a value of 127 the
highest. The numbers along the top mark the measures and their
subdivisions.
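The degree of control the piano roll offers can be illustrated with a short sketch. The helper below is hypothetical (it is not part of Logic or any sequencer); it computes a crescendo as MIDI-style values in the 0-127 range, with a curve that stays shallow at first and surges at the end, the shape that is difficult to convey in standard notation:

```python
def crescendo(steps, start=10, end=127, curve=2.0):
    """Return `steps` controller values rising from `start` to `end`.
    A curve > 1 keeps the rise slow at first and steep at the end,
    approximating the 'impossible' late surge described above."""
    values = []
    for i in range(steps):
        x = i / (steps - 1)                  # normalized position 0..1
        v = start + (end - start) * (x ** curve)
        values.append(round(v))
    return values
```

Each value would be written into the lower (continuous-data) frame of the piano roll at evenly spaced ticks; changing `curve` reshapes the entire dynamic arc without touching the notes.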
Expanding the Sound Palette with Synthesizers
The remainder of the ensemble for this scene consists mainly of a
set of synthesizers that can be divided into three instrumental
approaches. First, there is a set of synths that emulate brass-like sounds.
Second, there are lower ranged thunder-like sounds that are an integral
part of the entire soundtrack. Last, there are airy synth sounds that are
normally appended as a tail near the close of the decrescendos. They
fully integrate into the soundscape without being recognized as
extraneous elements. The brass-like synthesizers may be perceived as an
extension of the dominant brass sound of the expanding chords.
Similarly, the airy sounds integrate into the whole soundscape, thus
functioning as a sonic reaction to the decrescendos. The lower-range
synthesizers become especially apparent when Cobb’s projections begin
to be hostile to Ariadne, once they perceive that their world is being
changed.
By using all the musical tools described above, the composers are
able to integrate with and contribute to the scene by generating musical
sonorities that are not led by a melodic or a functional harmonic
structure. This is significant, as varied soundscapes tend to be less
intrusive than a melody or a complex harmonic progression. In the scene,
Ariadne is not only learning how to be the architect of a dream world but
to discover the possibilities it offers when compared to the physical
world. Cobb knows that after her first design of a dream world, the
physical world will no longer be enough to satisfy her. Thus, discovering
that she can fold the world becomes a crucial moment for her. However,
based on previous scenes, the audience is already accustomed to
spectacular manipulations of the dream world. Therefore, it becomes
challenging to express the emotional shock that Ariadne is feeling during
that moment to the audience. This is why the music in the scene needs to
convey the intensity of her process of discovering the possibilities of the
hyperreal.
Composer Hans Zimmer and his team at his music production
company, Remote Control Productions, musically emphasize the intensity
of the situation by using the musical tools previously analyzed15:
crescendos that seem unreal and airy sounds that reinforce the
otherworldly sensation. The brass-like synthesized sounds become more
evident as the characters walk through the manipulated dream world, and
in some measure the music is able to mimic the manipulation. The very
low-register synthesized sounds are employed when Cobb's projections
start to detect Ariadne's manipulations. Using a very low range of
frequencies, which is difficult to produce using physical instruments, is
a potent signifier of Cobb's subconscious, which fuels these projections
and their hostility.

15. In Hurwitz (2011), there is a description of the collaborative approach to
music creation that is used by Remote Control Productions.
Finally, the utilization of impossible ensembles reveals a process of
score recording that is nonlinear. The recording sessions in traditional
orchestral screen music were normally linear; they were the culmination
of the process and all the elements of the cues were recorded at once. In
this environment, such linearity is clearly no longer possible.
Man of Steel (2013): Expanding the Hyperinstruments
One of the most prominent features in the music for Man of Steel
(2013), also composed by Hans Zimmer, is the utilization of sophisticated
hyperinstruments. I define a hyperinstrument as the virtual formulation of
an instrument whose sound, even though it seems realistic, is not
regularly produced by physical means alone. The piano in The Social
Network should be considered a hyperinstrument, as its sound color
adapts to the necessities of the narrative. This definition will be greatly
expanded upon and discussed in Chapter VI. In Man of Steel, one of the
most relevant hyperinstruments employed in the movie is the drum
ensemble, which is a conceptual evolution of the massive brass
ensemble from Inception that I will also describe. In Man of Steel
(2013), the music team recorded a rhythmic track by bringing together
twelve of the top drummers in LA who played together in a recording
session in the same room (WaterTowerMusic, 2013a). More importantly,
the drummers came from different musical traditions and styles. During
the session, the drummers were asked to begin each take of the
recording in unison by following a rhythmic pattern given by the
composer. After establishing the pattern, they were asked to introduce
rhythmic variations in accordance with their own personal drumming
style, thus allowing the initial material to freely evolve. The genesis of the
hyperinstrumental conception of a multicultural drum ensemble arises
from the difference between the solo string and the string ensemble. The
initial intention was to apply the concept of a sectional sound to the
drums, which is an instrument that is typically associated with a soloistic
mode of performance practice. As Zimmer (WaterTowerMusic, 2013a)
points out:
I’ve used drums before in scores, but if you have one drummer it
sounds a bit cheesy. It is a little bit if you have one solo violin, it’s
always sort of right in your face, but if you have a string session it
sounds beautiful. I thought, what if we could get the twelve
greatest drummers and melt them into one giant machine of
energy. And we did that! (1:04).
However, there is one element that substantially differentiates this
example from a string section. Whereas the string ensemble aims to
create a homogenous sound, this drum orchestra16 incorporates the
individual cultural traditions of its performers. In addition, the drums were
recorded using individual close microphones for each drum set, in
addition to the general microphones placed in the center of the stage
(Figure 10).
Figure 10. Photo taken during the drum sessions (Zimmer, 2013, p. 13)
16. This is how it is referred to in the soundtrack album's liner notes
(Zimmer, 2013).
As a consequence, in addition to the individualities of each
drummer’s performance style, the sound of the drum orchestra is also the
result of mixing the individual microphones with the general microphones
placed in the middle, instead of simply capturing, as is typically done with
a string section, the sound emanating from the combination of the
instruments using a set of general microphones.
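The mixing described here can be expressed as a weighted sum of microphone signals. This is a simplified sketch under the assumption that all signals are time-aligned arrays of equal length; a real mix would also compensate for the delay between the close and general microphone positions and process each channel individually:

```python
import numpy as np

def mix_drum_orchestra(close_mics, room_mics, close_gain=1.0, room_gain=0.5):
    """Blend per-drummer close microphones with general room microphones.
    Raising close_gain preserves the definition of each solo drum sound;
    the room microphones contribute the blended ensemble image."""
    close_sum = close_gain * np.sum(close_mics, axis=0)  # sum over mics
    room_sum = room_gain * np.sum(room_mics, axis=0)
    return close_sum + room_sum
```

The key point is that the engineer controls the two gains independently, something impossible when only general microphones capture the combined acoustic sound, as is typical for a string section.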
Hence, the resulting sound of Zimmer’s ensemble differs from the
sound that a drum ensemble would produce, and be perceived, in an
acoustic environment. Moreover, the signature sound for action drums
originates in the processed sounds created by the close drum recordings.
The sound produced by this drum orchestra is noteworthy. It preserves
the intensity and definition of a solo drum sound but incorporates slight
sound variations that generate the sectional sound. The resulting sound
might be qualified as "tribal" precisely because of these variations, which
signal that multiple performers are playing the same sequence at the
same time. Further, the increased power of the sound becomes
associated with having multiple performers instead of being a product of
sound processing. The score aimed to have an earthly feeling, connected
to the notion of humankind, instead of focusing on portraying a
superhero. As Zimmer (Rolling Stone, 2013) states:
My inspiration for the music came from trying to celebrate all that
is good and kind in the people of America's heartland, without
cleverness or cynicism. Just from the heart. I wanted the epic
sound of the fields and farms stretching past the horizon, of the
wind humming in the telephone wires. The music is less about the
icon that Superman is, and more about the outsider with
extraordinary powers and his struggle to become a part of
humanity. (Par. 2)
Therefore, a drum sound produced by an ensemble of performers
better depicts the roots of humanity and the planet Earth when compared
to a single processed sound. The accumulation of the slight differences
that generate an ensemble sound relates to the creation of humanity as a
whole. In contrast, the destruction of the planet Krypton at the beginning
of the movie is scored mainly by using a solo violin, in order to connect
the scene with the individuality of Superman’s mother’s death. The simple
melodic structure of the main theme of the movie serves as a channel to
signify “America’s heartland” (Figure 11).
Figure 11. Transcription of the melodic line for the main theme of Man
of Steel (2013)
The melody consists of a set of wide intervals (4th, 5th, 6th and
7th), in a similar manner to the music used in the opening of the original
Superman (1978) movie, composed by John Williams (Figure 12).
Figure 12. Transcription of Williams's main theme melody line for
Superman (1978). The figure shows only the beginning of the theme.
In both cases, the wide open intervals reference the ideal of small-town
American life represented in Aaron Copland's music, which has
become one of the standard codifications in music that aims to
symbolize American values. The score for Man of Steel has other
noteworthy elements that help to define how a messianic mythical hero
views America. In a similar approach to the drum orchestra, an ensemble
of steel guitars was recorded. In addition, the score includes music
produced by the very particular instruments designed by Chas Smith.
Smith creates his instruments using only “junk, surplus, and stuff left over
from jobs” (WaterTowerMusic, 2013b), which, in practice, means that
they regularly become large metallic-sounding sculptures. Similar to the
ensembles, the sound sculptures generate a sound that has a sense of
physicality attached, despite the fact that the range of sounds is far
removed from any ordinary orchestral sound.
Moreover, once the sounds that emanate from these sound
sculptures are recorded or sampled, they are properly processed in order
to generate regular pitches, generating a new hyperinstrument that
preserves the Western scale system, yet remains connected to its source
materials. At the same time, it generates a rich spectral soundscape. In
fact, this process also applies to the ensembles described above, which
were recorded and then sampled in order to create material to construct
the score. Therefore, the process of recording these instruments
highlights a practice of music creation that has abandoned the sense of a
performative linearity. The sessions generated a set of recordings that
would act as samples, or music snippets, for the creation of the musical
pieces of the soundtrack. Thus, they become custom-made sample
libraries and loops that are designed to specifically serve a particular
movie.
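The pitch processing described here can be approximated by resampling each recording at equal-temperament ratios. The sketch below is a naive illustration of the principle only (it ignores the duration change and timbral artifacts that a real sampler would correct for), not the actual tools used by Zimmer's team:

```python
import numpy as np

def repitch(samples, semitones):
    """Shift a recorded sample by `semitones` via naive resampling.
    Positive values raise the pitch (and, in this naive form, shorten
    the sound). The ratio follows twelve-tone equal temperament, which
    is what keeps the hyperinstrument within the Western scale system."""
    ratio = 2.0 ** (semitones / 12.0)
    n_out = int(len(samples) / ratio)
    idx = np.arange(n_out) * ratio               # fractional read positions
    return np.interp(idx, np.arange(len(samples)), samples)
```

Mapping one metallic-sculpture recording across the keyboard this way yields a playable, pitched instrument that still carries the spectral signature of its source material.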
The analysis of music for this movie highlights an approach to
composition based on the creation of sounds that have an attached
meaning that interact with the meaning that is suggested in the movie. In
addition, it demonstrates a nonlinear process of scoring that reveals the
principles of a completely new aesthetic for music creation, which will be
detailed in Chapters VIII and IX.
Gravity (2013): Scoring the Soundlessness of Outer Space
One of the most remarkable elements in Gravity (2013) is its
treatment of sound, as the movie attempts to reproduce the impossibility
of sound propagation in outer space. This is clearly stated in the opening
text of the movie:
At 600 km above planet Earth the temperature
fluctuates between +258 and -148 degrees Fahrenheit
There is nothing to carry sound
No air pressure
No oxygen
Life in space is impossible. (Cuarón, 2013)
This does not mean that sound is not possible in space when
using human technology. Inside the space shuttle and inside the
spacesuits there is obviously air and, thus, the possibility of sound
propagation. Moreover, electromagnetic waves, such as radio waves,
differ from mechanical waves, such as sound, in that they do propagate in a
vacuum. Therefore, it is possible to transmit sound (although not
mechanically) between different air-filled spaces by using radio transmission.
Any impact received by these spaces will produce sound inside them, as
the impact will generate mechanical waves in the air of those spaces.
Making a movie that attempts to portray the absence of sound
transmission in the outer space is challenging in terms of sound design
and sound perspective. It also involves establishing a set of moviemaking
decisions. The first decision involves determining the relationship
between the visual perspective and the sound perspective, which will
usually be different. If the camera were positioned in outer space, there
would be no sound from this perspective. However, the sound in the
movie normally mirrors the sound perceived by the characters, allowing
for some artistic license.
In this aural set-up, the music for Gravity needs to fit aesthetically
by properly interacting with the other elements of the soundtrack. The
score has many relevant elements and scoring techniques, from among
which I will focus on two: the extensive utilization of reversed sounds and
the absence of percussion instruments. Reversing a sound is a common
studio processing technique that consists of reading the recording
backwards.
Figure 13 shows an original sound (a timpani loud hit) and its
reversed version. The effect works best on sounds that have a strong
attack and initial decay with a longer release.
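In digital terms the operation is trivial: the sample buffer is read back to front. A minimal sketch, assuming the recording is already loaded as a NumPy array of samples, with a synthetic stand-in for the timpani hit:

```python
import numpy as np

def reverse_sound(samples):
    """Reverse a recording in time: the long decay becomes a gradual
    crescendo, and the original sharp attack becomes an abrupt cutoff."""
    return samples[::-1].copy()

# A toy 'timpani hit': instantaneous attack followed by exponential decay.
hit = np.exp(-np.linspace(0.0, 5.0, 1000))
swell = reverse_sound(hit)   # slow rise that ends in a sudden cut
```

The reversed envelope rises monotonically to its peak and then simply stops, precisely the behavior that, as discussed below, vibrating physical materials cannot produce.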
Figure 13. Graphic representation of direct and reversed sound
waveforms.
When reversed, the sound has a slow paced crescendo at the
beginning that culminates with a very strong increase in the dynamic that
is suddenly cut (if the attack is short). The result is highly dramatic,
employing material that was recorded acoustically and yet becomes a
sound that cannot be reproduced in a physical reality. Reversed sounds
that are generated from a sound with a fast and strong attack are dramatic, as
they produce an almost impossible crescendo and a very sudden release.
Generating a crescendo that increases the amplitude of the sound as
quickly as what a reversed sound can produce is challenging. This is due
to the natural properties of the musical instruments and, more generally,
to the properties of any material that is able to produce sounds. In both
cases, the physical material would need to change its vibratory state,
which takes some time. Figure 14 shows the difference in vibration for
two sounds with equal frequency but with different amplitude.
Figure 14. Graphic representation of different amplitudes in sine waves
As shown in the figure above, in order for an instrument to sound
louder, it needs to vibrate at a higher amplitude. The vibrating materials,
such as strings, or the air in the case of wind instruments, need time to
adapt. Thus, creating a fast crescendo is difficult for most instruments.
Moreover, suddenly cutting the sound just after reaching the loudest
point is not possible. This would involve physically stopping the
instrument from continuing to vibrate and cutting any reverberation in the
room.
The reversed sound is generally very effective when synchronized
with a fade to black editing technique, especially when used after
revealing something that is visually striking. In Gravity, the editing effect
that I just described also appears reversed in the opening titles. First, the
movie shows the text quoted above on a black background. Then, the
music generates an extremely wide crescendo that ends with an
ensemble reversed sound. Finally, there is an image of the planet Earth
from outer space shown in total silence, which serves as an establishing
shot for the movie (Figure 15).
Figure 15. Screenshots for the title credit and establishing shot for Gravity
(2013).
In addition to the dramatic effect created by a swelling crescendo
after the text that states that “Life in space is impossible” (Cuarón, 2013),
the total silence that follows, in conjunction with a shot of Earth reminds
us that, in space, the physical laws of our planet do not apply. Therefore,
the impossible sudden cut of the reversed sound, without any
reverberation, emphasizes the necessary detachment from the everyday
rules of life in humanity’s natural environment. The impossible nature of
the reversed sound helps to generate the diegesis of outer space that
wholly differs from the physical rules of the ground. The sound qualities of
the reversed sound also physically connect with the space debris
travelling around Earth at twenty thousand miles per hour, far faster
than any familiar terrestrial projectile, including bullets. The
succession of reversed sounds in the score also provides an aural
context to represent the high-speed debris, which, due to its extreme
velocity, is difficult to accurately represent visually.
The reversed sound is also a tool that helps the composer,
Steven Price, overcome the absence of percussion, as requested by
director Alfonso Cuarón: “You can’t use percussion. It’s a cliché; we can’t
do that” (Rosenbloom, 2013). The clear attack and extremely rapid decay
of the reversed sound serves as a creative replacement for percussion. In
addition, there are other techniques that Price (Rosenbloom, 2013)
employs to replace some of the effects of percussion:
The sound will move all around you – sort of attack you almost.
You feel as overwhelmed as she is, hopefully. With that are these
feelings of heartbeats and breaths, and a lot of the immediate
human side of things comes from these pulsations rather than
rhythms in the score. It sometimes works as a heartbeat, other
times it complements the sound design heartbeats that were there.
Sometimes the pulses that are there are accompanying breaths.
And always, we were very careful to get the tempo so that it felt
appropriate to the state that she was in. (You mentioned the rule)
The concept of creating pulsations to replace the musical effect of
percussion integrates well with the sonic nature of reversed sounds,
which works in conjunction with the heartbeat-type sound design. The
absence of both percussion and any actual sound caused by the objects
hitting each other also detaches the diegesis of outer space from
terrestrial life. As Ryan (Sandra Bullock) states when she loses a screw
from the Hubble telescope, “In my basement lab things usually fall to the
floor”; and when things fall to the floor, they produce a percussive sound.
In using these two devices, the score integrates with the diegetic ideas of
the director, helping to create a very particular soundscape. The reversed
sound itself should be conceived as a hyperinstrument. Interestingly, it
derives all of its material from physical recordings but inverts it
temporally, and reversing time is something that humans cannot
accomplish without the help of technology.
Interstellar (2014): The Church Organ and the Integrated Score
Christopher Nolan’s Interstellar (2014) is a groundbreaking movie
in many different regards. I will focus on two different features that I will
discuss jointly because they are interrelated. In addition, I will describe
how these two elements aid in the creation of the complex cultural entity
that is Interstellar. The first aspect relates to an integrated approach to
the music conception as an intrinsic part of the movie as a whole. The
second delineates the utilization of the church organ and how employing
the instrument intertwines with the movie’s philosophical ideas and its
narrative. Nolan (Zimmer, 2014a) describes his initial concept about how
music would integrate within the movie as follows:
To me the music has to be a fundamental ingredient, not a
condiment to be sprinkled on the finished meal. To this end, I
called Hans [Zimmer] before I’d even started work on
INTERSTELLAR and proposed a radical new approach to our
collaboration. (…) He [Zimmer] understood that what I wanted to
do was turn his usual process inside out, giving his musical and
emotional instincts free reign, so that the seed from which the
score would eventually grow would be fused with the narrative at
its earliest stage. (p. 3)
Thus, the music for Interstellar is not aimed at complementing an
audiovisual track or a narrative. Instead, the music not only becomes part
of the meaning of the movie from its inception, but also generates
emotional meaning. Interstellar is a superb example of a movie conceived
as a multimodal medium, transcending the purely classical Hollywood
narrative-driven movie paradigm. The utilization of a church organ as the
featured instrument for the score is closely associated with this particular
approach to movie making. In fact, it was Nolan’s (Lowder, 2014) idea to
use an organ in the score:
I really wanted them to use the church organ, and I will submit the
case strongly for some feeling of religiosity to it, even though the
film is not religious. But, the organ, the architectural cathedrals and
all of that, they represent mankind’s attempt to portray the
mystical or the metaphysical, what’s beyond us, beyond the realm
of the everyday. (1:57)
How Zimmer (2014a) defines his approach to the organ, at least in
the liner notes of the soundtrack album, is significant, as it suggests a
related but notably different perspective:
Church organs have evolved over hundreds of years and stand as
an example of our restless scientific ingenuity to come up with
technological solutions. By the 17th century, the pipe organ was
the most complex man-made device: a distinction it kept until the
invention of the telephone exchange. A vast, crazy, dizzyingly
complex maze of pipes, levers and keyboards surrounding the
organist - sitting like an astronaut in his chair - with switches and
pedals at his feet. The enormous power of air within its bellows
pushing through thousands of huge pipes creates a sound
sometimes so low and powerful it’s like a fist punching into your
solar plexus and, at other times, the sound is something so
beautiful and fragile that it feels like a children’s choir. These were
the first digital keyboards with the mind of a synthesizer. And, of
course, this was the perfect metaphorical instrument for writing the
music for INTERSTELLAR. (p. 4)
Taken together, Nolan's and Zimmer's thoughts imply that the organ
was simultaneously the best device created by mankind to represent the
mystical and the most complex piece of technology of its time. The
duality between metaphysics and humanity's inherent need to incessantly
develop new technologies constitutes the two main concepts that govern
Interstellar's ideology. Thus, Nolan's and Zimmer's viewpoints are
complementary when applied to the creation of the movie. Still, the
organ ties Interstellar to another element (a movie), which inevitably
serves as a background and reference for the movie: Kubrick’s 2001: A
Space Odyssey (1968). Interstellar might be understood as either a
commentary on, or a conceptual revisit of, the concepts developed in
2001. The sound of the church organ could become a signifier for the
connection between both movies.
Let us focus on 2001’s ending: Bowman (portrayed by Keir Dullea)
has transmuted into a baby, floating in outer space, observing Earth.
Meanwhile, Richard Strauss’ opening to Also Sprach Zarathustra is
playing, which concludes with a grandiloquent C major chord played with
a pipe organ. The movie ends with this chord resonating over a black
screen. In fact, Strauss’ decision to employ an extremely coded Christian
instrument (the organ) in a piece of music inspired by an iconic anti-Christian novel (Nietzsche's Also Sprach Zarathustra) is surprising. The
reason that the pipe organ successfully integrates with Strauss’ piece is,
as Nolan pointed out, because in terms of coding, the organ is the
musical continuation of a Gothic cathedral. Hence, they both produced
the impression of God by creating an aural and visual space that was
elevated beyond human scale. Thus, the organ highlights the fact that it
generates its sacred meaning by transcending what is natural for
humanity. This implies that what humans call sacred, spiritual or religious
is, by definition, something that transcends humanity. The organ and the
Gothic cathedral are the best representations of this transcendence in a
framework based on limited human perception. In other words, the pipe
organ was the best musical representation of transcendence in Western
culture through the use of physical means alone. In Strauss’ piece, the
organ works because of the transcendent nature of Zarathustra, who
believes in the death of God and who breaks the established principles of
Western morality. Considering the range of possible meanings that
Western societies might associate with the organ, it is worth asking if the
instrument acts as a religious-inducing device, or if it is just a signifier for
a rationalistic definition of transcendence in Interstellar. Although Nolan
stated that Interstellar is not a religious movie, I believe that the answer
that Nolan and his team propose is that, from a purely humanistic point of
view, it really does not matter.
Moreover, the organ becomes increasingly relevant in shaping the
scene where Cooper (portrayed by Matthew McConaughey) is located in
the neo-Gothic three-dimensional representation of a five-dimensional
space. This 3D projection of a 5D space acts as a sort of augmented
cathedral (I would call it a hyper-cathedral, as it is built by using a set of
signifiers from a cathedral in a hyperrealistic model), and it is appropriate
that the music contains an augmented organ (a hyper-organ, or an
example of a hyperinstrument based on the organ).17 The organ is
augmented by synthesizers that interact with the sound it produces. In
that particular moment in the movie, the organ is mainly playing a
minimalistic pattern based on the following intervals, which is not that
much different in conception from the thematic material used in Man of
Steel, as previously described (Figure 16).
Figure 16. Score sketch that shows the intervallic content of the organ
part in the track S.T.A.Y. from Interstellar's (2014) soundtrack.
In tandem with this material in the physical organ sounds,
undulating synthesizers18 sustain notes of a minor chord that corresponds
to the suggested harmony in the pattern shown in the previous figure.
Therefore, the synth sounds share a set of bass frequencies similar to the
ones that emanate from the organ’s recording, thus creating the
impression that the physical organ is fluctuating in a much larger space,
as the synth sounds virtually resonate for a longer period of time. In fact,
the synthesizers are using a timbre that is sonically close to the sound of
17 This moment corresponds with the track S.T.A.Y. from the soundtrack album (Zimmer, 2014b).
18 Synthesizers that employ a tremolo processing effect in their amplitude.
the organ in that precise moment. From this perspective, it can be argued
that adding the sound of these synthesizers produces the effect of a
multidimensional reverb that would transcend the acoustics of a three-dimensional space, as the resulting sound is the combination of two
different spaces.
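The "undulating" character of these synthesizers comes from tremolo, that is, periodic modulation of amplitude. As a minimal, hypothetical sketch of that principle only (the function names, rate, and depth values below are illustrative assumptions, not Zimmer's actual processing chain), a tremolo multiplies each sample of a sustained tone by a slowly oscillating gain:

```python
import math

def tremolo(samples, rate_hz, depth, sample_rate=44100):
    """Apply amplitude tremolo: scale each sample by a sinusoidal
    gain that oscillates between (1 - depth) and 1 at rate_hz."""
    out = []
    for n, s in enumerate(samples):
        lfo = math.sin(2 * math.pi * rate_hz * n / sample_rate)
        gain = 1.0 - depth * (0.5 + 0.5 * lfo)  # stays within [1 - depth, 1]
        out.append(s * gain)
    return out

# A sustained low tone standing in for a synth note that doubles the
# organ's bass register (illustrative values, one second of A2).
sr = 44100
tone = [math.sin(2 * math.pi * 110.0 * n / sr) for n in range(sr)]
undulating = tremolo(tone, rate_hz=4.0, depth=0.5, sample_rate=sr)
```

A rate of a few hertz with moderate depth yields the slow pulsing described above; much deeper settings approach a gating effect rather than an undulation.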
As mentioned above, Nolan's (metaphysical) and Zimmer's
(technological) approaches to the ontology of the pipe organ delineate
the two main conflicting themes of the movie.
Interstellar contrasts the principles of human essence with the
metaphysical problems that derive from the accelerated pace of
technological evolution in contemporary society. 2001’s premise was that
the essence of humanity was tied to the ability to kill another human for
survival. In other words, humans became human when they began to use
violence against one another. This is demonstrated at the beginning, in
the “Dawn of Man” sequence, in which the primates that touched the
monolith evolve and learn how to use tools to kill the rival members of
their species. By contrast, Interstellar's premise about the ontology of
humankind is much more humanistic: in Nolan’s movie, human essence is
linked to the ability to love other humans.
Moreover, Interstellar reinforces the idea that love cannot be
explained as a mechanism for the survival of the species. Humans are
humans when they love other specific humans, not when they love
humanity as a whole. Hence, there is a discrepancy between human
essence and the need to secure the survival of the species. As a
consequence, humans use technological progress as a means to secure
human endurance, as is stated in the movie by McConaughey’s
character, Cooper:
COOPER
We’ll find something.
Some new technology… We always have.
Therefore, love is what defines humanity but technological
progress is what allows humanity to endure. In fact, violence might be
one of the results of the conflict between these two facets. Moreover, the
tension between the essence of human beings and their need for survival
generates, at the present time, fears that constitute the main themes
of Interstellar. For instance, a hypothetical technologically advanced
human might evolve to reach a stage at which it is able to interact in a
multidimensional space where time is just another dimension. However,
this collides with some of the principles of love, as there is a temporality
associated with having children or loving someone. To love is to accept
the precise consequences and the actions of the loved ones. It is in the
act of accepting their actions that humanity, irrationally, is able to love. In
a multidimensional space, where time and possibility are additional
dimensions, this is no longer possible because there are no decisions:
there are only sets of possibilities. Interstellar approaches these fears
from a positive humanistic perspective.
As has been mentioned, Interstellar references the movie 2001, as
a product of a shared cultural background. 2001 is part
of Interstellar because the movie is a postmodern cultural entity that
necessarily goes beyond the story. The audiovisual material that
constitutes the movie becomes a vessel to transmit a complex piece of
culture with numerous connections that transcend the strict narrative of
the movie. This new form of movie making allows for innovative means of
storytelling that go beyond the story. For example, the plot
of Interstellar jumps from the initial set-up at the farm to space. The movie
does not show the process of Cooper getting ready for the space trip.
This narrative strategy produces a certain degree of discomfort in the
audience, which might have expected a much smoother transition. From
the point of view of the storytelling, this is problematic. However, this
discomfort is precisely the emotion that the movie needed to transmit to
better portray how the main character had to confront the situation of
leaving his children in order to allow them the chance to survive. Showing
the process of preparation would have reduced the audience’s
discomfort, which would have defeated the purpose. Therefore,
Interstellar not only goes beyond classical narrative cinema but also
beyond the idea of what cinema is. It is not just a movie anymore; it is an
experience that encompasses several aspects of human culture. It
assumes that the spectators have access to the vast resources that the
internet offers, which will help to reveal part of the complexities of the
plot, while at the same time engaging in a fluid dialogue with several other
cultural entities. From this perspective, Interstellar is akin to Gravity,
which similarly revolves around the idea of an audiovisual experience that
depicts human anxiety in the environment of outer space. In this
approach to movie making, music acquires a status that is aligned with
the rest of the elements of the movie. Nolan (Zimmer, 2014a) explained
the initial process of the creation of the soundtrack, which started before
the script was even completed:
I asked him to give me one day of his time. I’d give him an
envelope with one page - a page explaining the fable at the heart
of my next project. The page would contain no information as to
genre or specifics of plot, merely lay out the heart of the movie-to-be. Hans would open the envelope, read it, start writing and at the
end of the day he’d play me whatever he’d accomplished. That
would be the basis of our score.
I listened to Day One countless times as I worked on the script,
and as we shot. It served as my emotional anchor, just as it serves
as the emotional anchor for the entire complex and thrilling score
that Hans went on to create almost two years later. (p. 3)
According to Nolan, the letter contained some lines of dialogue
and some ideas that were at the heart of the movie. For Zimmer, the letter
focused on the relationship between a father and his son (Lowder, 2014).
Nevertheless, this piece of music served Nolan well in the process of
creating the script, which then became inexorably linked to the musical
content of the piece. This approach helps to conceive of a movie as a
means to discuss an ontology for humanity, becoming a sophisticated
piece of philosophical thought that employs diverse mediums in order to
generate meaning. For Nolan, music helped to create the concept of love
that would have otherwise been difficult to portray in its depth had he
used only dialogue or visual cues.
CHAPTER III
PHILOSOPHICAL APPROACHES TO HYPERREALITY
In order to properly describe the hyperorchestra and the
hyperorchestral processes involved in contemporary screen music
creation, I will begin by discussing their philosophical grounds. This chapter
intends to provide a definition for hyperreality based on Baudrillard’s
(1994) and McLuhan’s (1964/1994) philosophies. However, I do not intend
to provide a comprehensive discussion of the theories of both authors.
Instead, I will draw on their thoughts in order to develop a model for
hyperreality that becomes applicable to cinema and its music. I will start
with yet another movie example, as its narrative attempted to portray
Baudrillard’s concepts of hyperreality.
On The Matrix
The Matrix (1999) has become a cultural icon for postmodern
movies that engage with the concept of hyperreality. The movie is set in a
dystopian future Earth, where humans exist while connected to a
computer-simulated reality in which they are enslaved by an aristocracy
of sentient machines that exploit them as a biological
energy source. At the beginning of the movie, Jean Baudrillard’s book
Simulacra and Simulation (1994) appears on-screen, acting as a secret
container in which Neo, the protagonist, hides disks containing illegal
computer data and software. This is just one of the numerous allusions to
Baudrillard’s ideas that appear, both implicitly and explicitly, in the movie.
By far, the most salient of these references is the existence of a
simulacrum, or a computer-simulated reality, which humankind, unaware
of their enslaved status, virtually inhabits.
Baudrillard believed that the movie did not properly represent his
ideas (Genosco & Bryx, 2004). As will be discussed later in this thesis, the
central point of Baudrillard’s thought is the absence of the real in
contemporary Western societies, which have become symbolic systems
with no difference between the real (the referent) and its representation
(Baudrillard, 1994; Chan, 2008). For Baudrillard, The Matrix failed to
portray the lack of differentiation between the real and its simulation.
Instead, the movie presented an opposition between a dystopian real (but
real, nonetheless) and a hyperreal (the simulated virtual world called the
Matrix). From this standpoint, Baudrillard believed that the dichotomy
between the worlds portrayed in The Matrix was closer to Plato’s allegory
of the cave than his concept of the disappearance of the real (Genosco &
Bryx, 2004). This would still hold true even when considering that the real
in The Matrix is the sunless ruins of a lost human civilization, which
Morpheus, paraphrasing Baudrillard, describes as the desert of the real.
However, the desert of the real is portrayed literally in the world of The
Matrix (by showing the deserted ruins of cities), whereas the concept is
allegorical in Baudrillard’s work. For Baudrillard, the desert of the real
meant the absence of a real, or the inability of Western civilization to
distinguish reality. Hence, Baudrillard was not imagining a real where
Western civilization was a literal desert of ruins, as it is portrayed in the
movie.
Yet, one may speculate that the real and the simulated in the
movie are metaphorical instead of literal. The movie is not depicting an
imaginary world that inaccurately resembles Baudrillard’s ideology, but it
is employing an audiovisual narrative (and its created world) in order to
discuss a philosophical idea. The real in the movie is not its depiction of
the real, but it is part of a metaphor that aims to describe Baudrillard’s
idea of hyperreality. Similarly, the Matrix (as a virtual-simulated world) is
not the representation of the hyperreal, but a metaphor for Baudrillard’s
model. The purely audiovisual representation and its philosophical
meaning gain, within this perspective, different layers of signification
embedded into the same artwork. Hence, conceiving the world in The
Matrix as a metaphor to express Baudrillard’s thought seems to be the
most coherent approach. Moreover, understanding the world depicted in
The Matrix as a metaphor better justifies the utilization of questionable
physical and biological axioms. For example, humans are not a
productive system to generate energy, as they require more energy to
survive than they are able to actually produce. Moreover, actively using
the human brain requires a greater amount of energy when compared to
an inactive brain, which negates the positive effect of having conscious
humans connected to the Matrix.
Consequently, analyzing the movie according to its apparent and
literal meaning is problematic as the movie is referentially false.
Nevertheless, if a movie is a fictional narrative, it is afforded a greater
degree of freedom in terms of its narrative presentation. With this
example, I intended to highlight the degree of sophistication that
audiovisual narratives have achieved and the associated difficulty in
analyzing them.
I have chosen The Matrix for its level of audiovisual and narrative
complexity, and for its engagement with the concept of hyperreality. Both
features will become key in order to discuss, in the following chapters,
how the concept of hyperreality serves to describe contemporary movies
and their music. Before reaching that point, I will analyze some central
concepts of Baudrillard’s philosophy of the hyperreal, in conjunction with
Marshall McLuhan’s theory of the media (McLuhan, 1964/1994).
McLuhan’s theories influenced Baudrillard and they are useful when
trying to engage with the concept of hyperreality in cinema. In Chapter IV,
I will concentrate on the relationship between the notion of hyperreality
and the ontologies of cinema. Chapter V will specifically focus on how the
world of the movie is generated and its relationship with the concept of
realism. I will provide an explanation of how to interpret the different
levels of meaning of a movie, as I hinted in this introductory discussion on
The Matrix. Furthermore, I will offer a framework for analyzing the movie
while respecting its varied meanings and artistic value. Chapter VI will
focus on sound and music for the screen and its relationship to
hyperreality. In addition, I will define what I mean by hyperorchestra in
terms of ontology, by employing the concepts developed in the first three
chapters.
Baudrillard and the Hyperreal
Baudrillard’s concept of hyperreality is frequently employed in
contemporary culture beyond The Matrix. Several movies have portrayed,
in diverse manners, some of the axioms of Baudrillard’s ideology. For
instance, in his critical interview on The Matrix, Baudrillard (Genosco &
Bryx, 2004) mentions the following movies, which interact, in his opinion,
with the concept of the hyperreal: The Truman Show (1998), Minority
Report (2002) and Mulholland Drive (2001). In addition,
Inception (2010) and The Thirteenth Floor (1999) are equally good
examples. In the field of philosophy and critical thought, Umberto Eco
incorporates the concept of hyperreality into his essay Travels in
Hyperreality (Eco, 1986). However, Eco approaches hyperreality from a
perspective that may seem closer to how it is portrayed in some of these
movies as opposed to how Baudrillard describes it.
Hence, an inquiry into Baudrillard’s philosophy of the hyperreal
and the stages of simulacra becomes necessary before further discussion
of its implications can take place. This will serve to clarify Baudrillard’s
position on the concept. It is worthwhile to begin by analyzing
Baudrillard’s (1994) assessment of Disneyland:
Disneyland is presented as imaginary in order to make us believe
that the rest is real, whereas all of Los Angeles and the America
that surrounds it are no longer real, but belong to the hyperreal
order and to the [third] order of simulation. (p. 12)
For Baudrillard (1994), even though Disneyland “is a perfect model
of all the entangled orders of simulacra” (p. 12), the park acts as a
mechanism that masks the loss of reality in the contemporary world. In
other words, by admiring its aesthetic hyperreality, society is able to
acknowledge it as a perfect fake and forget that their reality is the true
fake. On the other hand, Eco’s (1990) description of the hyperreality of
Disneyland differs from Baudrillard’s approach:
Disneyland is more hyperrealistic than the wax museum, precisely
because the latter still tries to make us believe that what we are
seeing reproduces reality absolutely, whereas Disneyland makes it
clear that within its magic enclosure it is fantasy that is absolutely
reproduced. […] Disneyland can permit itself to present its
reconstructions as masterpieces of falsification. (p. 43)
Once the “total fake” is admitted, in order to be enjoyed it must
seem totally real. […] When there is a fake – hippopotamus,
dinosaur, sea serpent – it is not so much because it wouldn’t be
possible to have the real equivalent but because the public is
meant to admire the perfection of the fake and its obedience to the
program. […] Disneyland tells us that technology can give us more
reality than nature can. (pp. 43-44)
Eco’s account of Disneyland’s features highlights a distinctive
point of view. For Eco, Disneyland is actually hyperrealistic and society
enjoys it because it is able to produce a flawless (yet artificial) nature.
However, from Eco’s viewpoint, this does not negate the existence of
reality outside of Disneyland.
The Three Orders of Simulacra
In order to better discern what hyperreality is in Baudrillard's
terms, in this section I will examine his principal propositions
regarding the orders of simulacra and the hyperreal. I will attempt to
provide a distinct general picture of his ideology, which may have been
blurred by other interpretations, such as Eco’s description of
Disneyland’s hyperrealism. In Simulacra and Simulation (1994),
Baudrillard describes three different orders of simulacra that correlate
with different stages of modern human evolution (Baudrillard, 1993, p. 50;
Baudrillard, 1994, p. 121). Etymologically speaking, a simulacrum19 is an
image or a representation of an object. Baudrillard’s writing on the stages
of the simulacra is purposefully enigmatic and somewhat ambiguous.
Nevertheless, Baudrillard defines the first order of simulacra as the stage
of human evolution when representations are based on imitation or
counterfeit (Baudrillard, 1994, p. 121). In this order, images are naturalist
and they attempt to become a reproduction of the world. Still, images are
not directly linked to the world but they act as an arbitrary referential sign
of it (Pawlett, 2007, pp. 74-75).
Baudrillard utilizes Saussure's (1998) semiotic concepts, including
his definition of a sign. For Saussure, the sign comprises
two elements: the signifier (the form the sign takes) and the signified (the
concept that the sign represents). For example, the word “bird” acts as a
signifier of the concept of a “bird”, although it is not directly connected to
any specific element taken from reality. In fact, the word bird is useful as
it points out an abstract idea that can be applied not only to a particular
19 Simulacrum is the singular form of simulacra.
bird but also to an abstract version of it. The equivalent of a sign in the
physical world is called a referent. From this viewpoint, the link between
the referent and the sign is arbitrary. For instance, calling both an ostrich
and a nightingale birds is a convention, and thus is arbitrary. The case of
the former planet Pluto may help to further clarify this arbitrariness. Pluto
recently lost its status as a planet without any apparent change in its
physicality. This is because reality is socially constructed as the result of
imagining a model based on observations.
Consequently, the association that links the concept of what a
planet is to its referent is arbitrary, as it can change without the referent
actually changing at all.
Subsequently, there is a binary opposition between the notion of
the ‘world’ (or reality) and the ‘signs’ that humans construct to interact
with it (Pawlett, 2007, p. 75). Images are no different from other signs,
even though they aim to reproduce reality naturalistically. This is why
Baudrillard (1994) defines the first order of simulacra as the “imaginary of
the Utopia” (p. 121). According to Baudrillard (1993), the first order
appears with the Renaissance (p. 50), when the “bourgeois class
dismantled the fixed ranks and restricted exchanges of the feudal order
through the introduction of democratic parliamentary and legal
institutions” (Pawlett, 2007, p. 74). By breaking fixed ranks and norms,
the set of signs is no longer sacredly connected to a referent. In other
words, society learned that there was no divine order. Thus, the signified
part of a sign may vary depending on a change in fashion or social
mores.
In terms of art, Baudrillard emphasizes the development of stucco
during the Renaissance, a material that facilitated the imitation of nature
on walls. In addition, he mentions the importance of theatrical illusion. In
both cases, their naturalistic approach aims to provide an imitation or
counterfeit of nature (Baudrillard, 1993, pp. 50-52).
The beginning of the second order of simulacra corresponds with
the Industrial Revolution and is governed by the idea of production. In the
first order, the difference between the real and the simulacrum is still
presupposed (Baudrillard, 1993, p. 50) but this changes with the second
order:
The second-order simulacrum simplifies the problem by the
absorption of appearances, or by the liquidation of the real,
whichever you prefer. In any case it erects a reality without images,
without echo, without mirrors, without appearances: such indeed
is labour, such is the machine, such is the entire industrial system
of production in that it is radically opposed to the principle of
theatrical illusion. No more semblance or dissemblance, no more
God or Man, only an immanent logic of the principle of operativity.
(Baudrillard, 1993, p. 54)
This was made possible by the mass production associated with
the industrial era. An object might be reproduced on an industrial scale,
losing its attempt to be a counterfeit of its referent, as “serial production
gives way to generation through models” (Baudrillard, 1993, p. 56).
The third order of simulacra corresponds to the current code-governed society, where simulation is the dominant schema (Baudrillard,
1993, p. 50). It might be associated with a postmodern society, even
though Baudrillard did not use this term specifically. In this order, signs
become modeled signifiers, as signifiers are detached from what they
signify. Thus, the meaning of a signifier is not determined by what it
signifies but by its relations with other signifiers. Computer-generated
objects are the clearest example. The planet Pandora in Avatar (2009) not
only lacks a referent (as Pandora does not exist), but it also only acquires
its meaning by establishing relations with other concepts. Pandora is a
model of a utopian natural and preindustrial world that becomes
meaningful when contrasted to a dystopian version of Earth, an
environmentalist ideology, and a corpus of futuristic dystopian narratives
that precede the movie. Therefore, Pandora acquires its meaning through
its relationship with other models of a world.
Computer-generated images of Pandora and its inhabitants serve
as pristine examples of what Baudrillard defines as the third order of
simulacra. However, the third order should not be considered a digital
phenomenon alone. In fact, the miniatures employed for making the first
Star Wars (1977) movie were neither computer-generated nor did they
have a referent. Disneyland serves as another example. As stated,
Baudrillard argued that contemporary society as a whole was part of the
third order of simulacra, which becomes the hyperreal.
The Hyperreal
In the third order of simulacra, the simulation becomes “the
generation by models of a real without origin or reality: a hyperreal”
(Baudrillard, 1994, p. 1). The planet Pandora in Avatar (2009) is
hyperrealistic because it is created by a set of different models of real but
it does not have a real origin. Pandora cannot be, in contemporary
society, an equivalent to what Eden was for ancient Western society. This
is why defining Pandora as an idyllic place or as a modern rendition of
Eden is controversial. Pandora becomes ideal if the model for the ideal is
raw nature, but it does not if the model is based on cultural and
technological progress. For example, a person who loves cultural
activities would find life on Pandora disappointing. The inhabitants of
Pandora live in concert with nature. Their principal social activity is to
pray to a sort of shared consciousness. There are no signs of any musical
activity or any other artistic or leisure pursuits. Thus, the meaning of
Pandora becomes unstable, as it is dependent on the relationships based
on a symbolic system:
Without the stable equivalence of sign–referent and signifier–
signified, meaning becomes highly unstable, and binary
distinctions implode, reverse or become radically uncertain in their
meaning(s). (Pawlett, 2007, p. 77)
As a simulation, Pandora not only precedes any possible
experience of the real but also highlights how meaning is volatile. That
would not have been the case with Eden in a pre-modern society, which was a
symbol of a lost paradise and God’s power. In this kind of society, the
idea of paradise was as universal as the idea of God.
McLuhan’s Theory of the Media
Before continuing to scrutinize the notions and consequences of
the hyperreal, a discussion of McLuhan’s (1964/1994) theory of the
media, along with his famous statement “the medium is the message” (p.
7), will help to elucidate the concept of the hyperreal in contemporary
society. Baudrillard examines various concepts of McLuhan's thought in
several passages of his work, which indicates how greatly McLuhan
influenced his model of contemporary society. McLuhan develops his theory
of the media in his book Understanding Media (McLuhan, 1964/1994).
The beginning of the first chapter, aptly titled “The Medium is the
Message”, offers an explanation of his partially cryptic yet well-known
statement:
In a culture like ours, long accustomed to splitting and dividing all
things as a means of control, it is sometimes a bit of a shock to be
reminded that, in operational and practical fact, the medium is the
message. This is merely to say that the personal and social
consequences of any medium - that is, of any extension of
ourselves - result from the new scale that is introduced into our
affairs by each extension of ourselves, or by any new technology.
(McLuhan, 1964/1994, p. 7)
Further, he provides different examples to illustrate what he aimed
to describe. The railway example is especially eloquent:
The railway did not introduce movement or transportation or wheel
or road into human society, but it accelerated and enlarged the
scale of previous human functions, creating totally new kinds of
cities and new kinds of work and leisure. This happened whether
the railway functioned in a tropical or a northern environment and
is quite independent of the freight or content of the railway
medium. (McLuhan, 1964/1994, p. 8)
For McLuhan, the medium becomes any extension of ourselves.
For example, the hammer extends our arms and the wheel may extend
our legs (Federman, 2004). In addition, the message of a medium is “the
change of scale or pace or pattern that it introduces into human affairs”
(McLuhan, 1964/1994, p. 8). This is why the message of the railway lies in
how it transformed society by generating, for example, new models of
cities. McLuhan’s thoughts on what the medium and the message are
become ontologically relevant for understanding both terms, as they
acquire a broader meaning than is usually assumed. No less
striking than McLuhan's (1964/1994) famous quotation is the introduction
of the book (especially when considering that the text was written in
1964):
After three thousand years of explosion, by means of fragmentary
and mechanical technologies, the Western world is imploding.
During the mechanical ages we had extended our bodies in space.
Today, after more than a century of electric technology, we have
extended our central nervous system itself in a global embrace,
abolishing both space and time as far as our planet is concerned.
Rapidly, we approach the final phase of the extensions of man - the
technological simulation of consciousness, when the creative
process of knowing will be collectively and corporately extended to
the whole of human society, much as we have already extended
our senses and our nerves by the various media. (p. 3)
If the medium is an extension of our bodies, the evolution of
Western society may be described as a process of expansion (or
explosion) by different mediums. McLuhan argues that the world has
been imploding since the electric era because of instantaneous
communication: “As electrically contracted, the globe is no more than a
village” (McLuhan, 1964/1994, p. 5).
Assessing why McLuhan describes this process as an implosion
instead of a continuation of the expansion of our bodies is important to
our understanding of how this view relates to the concept of hyperreality.
Electricity allows for the achievement of a speed that is closer to the
speed of light, which is the maximum speed that is physically possible.
What comes after electricity will necessarily go beyond the physical
because the limits of the physical have already been reached. In that
sense, McLuhan labeled the process as an implosion, as it allowed for the
merging of social and political functions. Similarly, it virtually eliminated
the physical distance between people by bringing humanity together in a
global village. However, McLuhan’s process of implosion may also be
analyzed as a route of expansion that goes beyond the physical and the
real, thus driving further into the hyperreal. McLuhan termed the process
as an implosion because he focused on humankind and the extension of
their bodies. If railways extended the distance that a human could travel,
airplanes extended this distance further. With space
travel, humanity became able to travel beyond Earth. However, the
concept of the global village does not continue the expansion with the
same rationale. Instead, post-electrical development is focused on
connecting the consciousnesses of human beings. As this is not strictly
an extension of the body and, for McLuhan, consciousness precedes
any technology, he defined the process as an implosion.
Therefore, the process of implosion entails a disconnection from
the real, as the power of transformation of the new media is not focused
on the physical world. It is in this sense that McLuhan's concept of
implosion intersects with the concept of the hyperreal. McLuhan’s theory
of the media is closely related to a discussion of the evolution of
language in Western culture. By defining McLuhan’s different visions of
language, connections with Baudrillard’s orders of simulacra can be
made. This will lead to a further discussion of the relationship between
both theories, and it will allow the concept of the hyperreal to be
defined further.
The Spoken and the Written Word
“Language does for intelligence what the wheel does for the feet
and the body. It enables them to move from thing to thing with greater
ease and speed and ever less involvement” (McLuhan, 1964/1994, p. 89).
Furthermore, McLuhan (1964/1994) emphasizes that Western society is
rooted in the written word based on a phonetic alphabet (p. 89). The use
of such an alphabet is relevant because “the phonetically written word
sacrifices worlds of meaning and perception that were secured by forms
like the hieroglyph and the Chinese ideogram” (McLuhan, 1964/1994, p.
91). As a consequence, by using a phonetic alphabet, the visual and the
auditory experiences separate, thus giving Western individuals “an eye for
an ear” (McLuhan, 1964/1994, p. 84). It is as a result of this separation
that phonetic language is the key to civilized society. However, the power
to civilize humanity comes at a cost. By separating “both signs and
sound from their semantic and dramatic meanings” (McLuhan,
1964/1994, p. 87), there is a detachment “from the feelings or emotional
involvement that a nonliterate man or society would experience”
(McLuhan, 1964/1994, p. 79). Then, language becomes a technology that
enables a process of abstraction from reality. In using a phonetic written
language, the process of abstraction is even greater than using the
spoken word. By using a phonetic written language, all the emotions and
nuances that are natural in the oral language need to be incorporated in a
rational, step-by-step, description.
It is because of the social and psychological effects of detachment
produced by the utilization of the phonetic alphabet that contemporary
society has attempted to recover its contact with imagination and
emotion, in order to regain “wholeness” (McLuhan, 1964/1994, p. 89). In
addition, McLuhan (1964/1994) believed that “consciousness is not a
verbal process” (p. 89). Thus, he believed that the process of expanding
consciousness would bypass spoken and written language:
The computer, in short, promises by technology a Pentecostal
condition of universal understanding and unity. The next logical
step would seem to be, not to translate, but to by-pass languages
in favor of a general cosmic consciousness, which might be very
like the collective unconscious dreamt of by Bergson. (McLuhan,
1964/1994, p. 89)
However, prior to this transformation, he observed that
electric technology would threaten the ideology that emanated from the
phonetic alphabet, because by “extending our central nervous system,
electric technology seems to favor the inclusive and participational
spoken word over the specialist written word” (McLuhan, 1964/1994, p.
89).
He could not predict that, with the hypertext, written language
would evolve and maintain its status at the same time that instant written
communication (email or instant messages) would become an extremely
useful tool for online interactive communication. Still, both media
transformed the written language enough to corroborate, in a broad
sense, his statement. His belief that consciousness is separated from
language is, however, more controversial. Moreover, McLuhan’s
concepts do not seem to substantiate this belief. If language is one of the
first technologies of humanity, it is because it is essential for humans to
become humans. The importance that McLuhan gives to language and to
the phonetic alphabet challenges, in my understanding, his model of
consciousness. Therefore, McLuhan’s account of consciousness may
be interpreted metaphorically instead of literally.
Media and Simulacra
As noted above, it is relevant to discuss the links between
language and humankind in order to examine Baudrillard’s orders of
simulacra in relation to McLuhan’s thesis. By underlining the detachment
from perception and emotion that phonetic language produces, McLuhan
presents a similar argument to Saussure’s in his semiotic theory,
regarding the arbitrariness of the link between the sign and its referent, as
earlier described. Phonetic language is responsible for creating Western
civilized society by detaching the civilized human from reality. McLuhan
(1969) would qualify the perspective image, which constituted the basis
for the first order simulacra, as “specialist artifacts for enhancing human
perception” (p. 32). Even though the statement referred to art in
Western civilization in general, it might easily be applied to the
perspective image as well. However, there is a significant difference
between the perspective image and other media like the wheel or the
railway. The perspective image, or the drawing technique that allows a
person to create a three-dimensional representation utilizing a
two-dimensional surface, expands human senses virtually, thus providing a
detachment from its referent: the three-dimensional space is represented
by only two dimensions. From this viewpoint, the image behaves similarly
to phonetic language. The similarity between image and language in
terms of their detachment from the environment stresses another
connection: if language is essential to define Western civilization and
image is similar to language, then producing simulacra may be essential
to define Western civilization.
A Process of Virtualization
After scrutinizing how language shaped Western society and how
this process may be similar to artistic manifestations such as painting, I
argue that the history of Western society should be understood as a
process of virtualization. By doing so, I am generalizing the term
virtualization, which is normally associated with computer science, to
apply to any process that involves generating a simulation of reality.
McLuhan’s approach to the media as an extension of the human body
and mind should be understood as a process of virtualizing human
existence. If Western civilization developed due to its utilization of a
phonetic language, this means that it was born when humans were able
to detach from reality and construct their knowledge by virtualizing it.
In his book The Singularity is Near, Ray Kurzweil (2005) goes even
further in exploring the connections of virtualization:
The word “virtual” is somewhat unfortunate. It implies “not real,”
but the reality will be that a virtual body is just as real as a physical
body in all the ways that matter. Consider that the telephone is
auditory virtual reality. No one feels that his voice in this virtual-reality
environment is not a “real” voice. With my physical body
today, I don’t directly experience someone’s touch on my arm. My
brain receives processed signals initiated by nerve endings in my
arm, which wind their way through the spinal cord, through the
brain stem, and up to the insula regions. If my brain – or an AI’s
brain – receives comparable signals of someone’s virtual touch on
a virtual arm, there’s no discernible difference. (p. 203)
Kurzweil states that there is no direct connection between reality
and human senses as the process is already mediated by physiology.
Thus, the human body becomes the first medium, in McLuhan’s terms,
for humankind. In these terms, the hyperreal is no longer a product of
contemporary society. Instead, the ability to produce hyperreality may be
considered as a latent feature of the early stages of humanity. For
example, any dysfunction or limitation of the nervous system may result
in experiencing something without origin in reality. A person who comes
out of a dark place into bright daylight will most probably need some
time for their eyes to adjust to the new environment. During that time, the
information captured (perceived) by their eyes would provide incorrect (or
different from expected) information on the new environment. Similarly,
our inability to perceive ultra-violet light might result in a skin burn that
had no origin in the perceived reality. Even though these examples may
not constitute actual hyperreality, they highlight the fact that hyperreality
was an embryonic feature with its origins in humanity’s beginnings.
Further, this is congruent with the notion of virtualization. Any process of
virtualization (as it implies a detachment from reality) tends to generate
hyperreality. This is because if human beings, who are intelligent and
creative, perceive reality through mediation, it is probable that their
intelligence and creativity will attempt to modify the perceived reality.
Nevertheless, Baudrillard and McLuhan identify the industrial and
technological revolutions as two key moments of transformation. As has
been described, the Industrial Revolution is essential for Baudrillard’s
definition of the second order of simulacra. For McLuhan (1964/1994), the
printed book generated the modern world and the Industrial Revolution
as it “involves a principle of extension by homogenization that is the key
to understanding Western power” (pp. 170-178). The homogenization
produced by the printed word generated a further level of detachment: in
a society with print, the human became more abstract. From this
standpoint, modern science (which developed during the print era) might be
described as a form of abstraction of the written language. In other
words, modern science requires homogenization in order to develop, and
the printed word allowed the homogenization and structuring of language to
occur. McLuhan’s discussion of the separation of oral, written and printed
languages as distinct media highlights the substantial differences among
them, even though they might seem to be forms of the same medium.
This becomes clearer when McLuhan’s approach to linguistic
evolution is extended a step further to consider formal languages and
hypertext.
Computer languages are the most common form of formal
languages, which are languages defined by a finite number of symbols
and a finite and specific set of syntactic rules to combine them. Their
main feature is that they do not require human intelligence to be
processed. Content is generated by a closed set of grammatical rules.
Thus, they are the ultimate version of presenting knowledge through a set
of instructions. If the written language, as stated by McLuhan
(1964/1994), requires the construction of a coherent sequence to express
“what is quick and implicit in the spoken word” (p. 79), formal languages
require a closed grammar to produce the language with total abstraction.
If written language requires a large amount of text to describe a simple
event of the spoken word, formal languages require an extremely large
set of rules to process even the simplest form of written language.
Computer languages became the next step in the homogenization (and
simplification) of language, to the degree that they no longer require
human intellect in order to be executed. If the printed book enabled mass
education for the whole of Western society, formal languages allowed the
“education” of machines.
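As a minimal illustration of the closed, finite character of a formal language, consider the toy language of balanced parentheses, defined over an alphabet of two symbols by a fixed rule set. The recognizer below is a sketch added purely for illustration; it processes the language mechanically, with no human interpretation required:

```python
# A minimal formal language: strings of balanced parentheses.
# Alphabet: { "(", ")" }.  Grammar: S -> "" | "(" S ")" S
# Recognition follows only the closed rule set, never semantics.

def is_balanced(text: str) -> bool:
    """Return True if `text` belongs to the balanced-parentheses language."""
    depth = 0
    for symbol in text:
        if symbol == "(":
            depth += 1
        elif symbol == ")":
            depth -= 1
            if depth < 0:          # a ")" with no matching "("
                return False
        else:                      # symbol outside the alphabet
            return False
    return depth == 0

print(is_balanced("(()())"))  # True: derivable from the grammar
print(is_balanced("(()"))     # False: an unclosed "("
```

Even this trivial language requires an explicit, exhaustive rule for every case, which is the point made above: formal languages trade expressive nuance for total mechanical processability.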
Hypertext and the widely used Hypertext Markup Language
(HTML) are not formal languages, as they are a set of markups that serve
to format natural language texts. They are just an extension of printed
language. Even though HTML formalizes how to generate a title, for
example, this was already formalized with print. I will not attempt to
discuss the implementation details of HTML here, as I intend to focus on
the essential changes of the hypertext with regard to the previous
discussion. Broadly speaking, hypertext has provided a syntactical layer
between texts.20 If print allowed for the mass distribution of existing texts,
the hypertext allowed for the creation of relationships between them,
which were necessary in order to be able to continue to produce
knowledge. Although the librarian’s function may be precisely that (to
connect texts and organize them), it is clearly limited in its scope. Modern
libraries were created as a rudimentary syntactic system that paralleled
the segmentation and isolation of knowledge in the modern world. Books
are ordered in a hierarchical structure according to topics and subtopics
to facilitate their access. With hypertext, the linearity of this hierarchical
classification is abandoned. As described before, the written word
required defining a simple event of oral communication linearly
and causally. With hypertext, this process is transformed. Any texts can be
20. It will also become semantic, as I will argue below.
interconnected by complex networks of meaning. This becomes
especially clear when analyzing a tag system.
A simple tag system attaches different concepts to a text or to a
part of it. This current discussion may be tagged as “philosophy”,
“postmodern”, “postmodern philosophy”, “hyperreality”, “Baudrillard”,
“dissertation” etc. A tag system does not require any order or relation
between the tags. Tags may have different levels of meaning and they
may also be redundant.21 This does not prevent a tag system from being
more or less effective, much as the rules of language allow syntactically
correct sentences to have no semantic sense. However, even in a tag
system as chaotic as the example given above, tags may be useful when
attempting to find related texts by simply calculating their number of
shared tags. The use of tags highlights not only that there are
connections of meaning between texts but also that those connections
have their own meaning. An evolved version of the tag system, which will
clarify the previous statement, is found in social networks.22 The
revolution that social networks introduced to the system of personal
blogs and web pages is centered on creating meaningful connections by
making the process of tagging transparent to their users. Even though it
appears that the act of tagging someone as your friend is one of the key
21. Such as “postmodern philosophy” in the list supplied above.
22. Facebook (www.facebook.com) is the most popular example.
factors of a social network (when someone adds a friend it appears that
the person is tagging the target as ‘friend’), the actual friendship is
determined by the degree of interconnection in the system. Each time
that someone creates a connection with another person by “liking” their
status, the weight of the friendship tag increases. This also happens
when two people appear in the same photo or are in the same location.
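The two mechanisms just described, relatedness as a count of shared tags and a friendship tag whose weight grows with interaction, might be sketched as follows (the texts, tags, and user names are invented for illustration and do not describe any real system):

```python
# Toy sketch of the tag mechanisms discussed above.

def shared_tags(tags_a: set, tags_b: set) -> int:
    """Relatedness of two texts: the number of tags they share."""
    return len(tags_a & tags_b)

text_1 = {"philosophy", "postmodern", "hyperreality", "Baudrillard"}
text_2 = {"philosophy", "hyperreality", "McLuhan"}
print(shared_tags(text_1, text_2))  # 2: "philosophy" and "hyperreality"

# Each "like", shared photo or shared location strengthens the tie,
# so the friendship tag carries a weight rather than a yes/no value.
friendship_weight = {}

def register_interaction(pair: tuple) -> None:
    """Increase the weight of the friendship tag for a pair of users."""
    friendship_weight[pair] = friendship_weight.get(pair, 0) + 1

register_interaction(("ana", "ben"))      # Ana likes Ben's status
register_interaction(("ana", "ben"))      # they appear in the same photo
print(friendship_weight[("ana", "ben")])  # 2
```

The sketch makes the earlier claim concrete: the connection itself (here, the weighted pair) carries meaning, independently of any single tag.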
This brief example of some of the functionalities of a social
network carries two important consequences. First, hypertext allows a
language to include audiovisual material as well as written language.
Second, the connection itself has meaning. It is from this point of view
that the concept of implosion in McLuhan’s theory needs to be
understood. He believed that implosion would revert humanity to a stage
of development that is closer to the tribal era: pre-civilization, but in a
global village. From that angle, humans would regain what was lost
during the process of abstraction of the modern world. By creating
meaning for the connections between different texts, hypertext
incorporates the nuances of the spoken word. The consequences that
this approach has had in transforming the principles of Western
civilization are enormous, as McLuhan stated. An example of this
transformation, in terms of knowledge, might arise by analyzing the
accepted scholarly opinion of Wikipedia (www.wikipedia.com). Wikipedia
is widely rejected as a scholarly source because it is not peer-reviewed in
the traditional sense. In addition, its content may change at any time as a
result of an update being made by any user. In other words, Wikipedia
does not have an identifiable authority and cannot be printed. Wikipedia’s
authority does not emanate from traditional Western authority systems
but from the meaning produced by the connections of its contributors. The
authority of the review does not come from the authority of the peers but
from the power of the link. A change is accepted according to a
combination of the number of users that support the change and the
authoritative value of those users in terms of how their previous proposed
changes have been accepted. Thus, I would speculate that when
Wikipedia becomes accepted as a scholarly source, it will mark the
apogee in the process of deconstruction of the Western model of
civilization that McLuhan defined.
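The acceptance rule described above, a combination of the number of supporters and their track records, can be sketched as a toy model. This is my own illustration, not Wikipedia's actual algorithm; the "authority" measure, the threshold, and all numbers are hypothetical:

```python
# Toy model: a change is accepted when the summed authority of its
# supporters passes a threshold.  Authority here is the fraction of a
# user's past proposals that were accepted (a hypothetical measure).

def authority(accepted: int, proposed: int) -> float:
    """Track-record score in [0, 1]; newcomers get a neutral 0.5."""
    return accepted / proposed if proposed else 0.5

def change_accepted(supporters: list, threshold: float = 1.5) -> bool:
    """supporters: list of (accepted, proposed) track records."""
    return sum(authority(a, p) for a, p in supporters) >= threshold

# Two seasoned users and a newcomer back the change: 0.8 + 0.9 + 0.5.
print(change_accepted([(8, 10), (9, 10), (0, 0)]))  # True

# A single user with a weak record is not enough: 0.1 < 1.5.
print(change_accepted([(1, 10)]))  # False
```

The point the sketch captures is that authority emerges from the history of connections rather than from any designated peer.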
The previous definition of the hypertext clearly fits into
Baudrillard’s (1994) definition of hyperreality: “the generation by models
of a real without origin or reality: a hyperreal” (p. 1). Friendship, in a social
network as described above, is defined by models of real but not by an
actual “real” friendship. For Baudrillard, the third stage of simulacra does
not imply a partial return to a previous stage. The term implosion is used
in Baudrillard to symbolize the destruction of not only the referents but
also the signifiers. McLuhan’s focus on the phonetic language instead of
the image becomes key to understanding this difference and to link his
theory of the media to the process of virtualization. From this viewpoint,
hyperreality serves to reveal that the bases of Western knowledge and
science are not as robust as they seem. Baudrillard stated that the obvious
hyperrealism of Disneyland served to mask the absence of reality in the
rest of society. I would argue that it is precisely by being aware of a
hyperreality that humans realize that their structures of knowledge are not
as grounded in real referents as they think. Continuing with the example
of Wikipedia, the internet encyclopedia is not an example of the implosion
of the concepts of truth/falsehood or reality. Instead, Wikipedia indicates
that those concepts have never been stable. Moreover, Wikipedia is a
good example to demonstrate the manifestation of a social attitude:
humans in contemporary society are not only aware of the problem of the
arbitrariness of knowledge but they are also actively deciding on a model
in order to be able to create meaning in this situation.
The Hyperreal Society
In this thesis, I have provided a broad definition of hypertext, as it
seemed a good approximation of McLuhan’s idea of bypassing language.
Hypertext not only incorporates different audiovisual materials but also
creates meaning by connecting different pieces of text, thus bypassing
the necessary linearity of the written language. Meaning becomes
unstable, as it is not fixed. Hypertext returns to the individual the power to create
a personalized meaning, but at the same time, it provides a set of tools to
create important connections between individualized meanings that can
lead to a sense of common understanding.
In McLuhan’s terms, the media served as a mechanism to
augment the human body and senses at the cost of abstraction. With
hyperreality, the process of augmentation is achieved by returning to our
emotions and by accepting that natural human space is not found in its
relationship with the real, which is impossible to grasp, but in the building
of the hyperreal.
CHAPTER IV
CINEMA ONTOLOGIES AND HYPERREALITY
Introduction
There is no film anymore. Digital cinema has transformed the
physicality of the seventh art in such a significant manner that many
scholars have questioned whether it is still the same artistic medium.
David Bordwell describes this process in his book Pandora’s Digital Box
(2012):
The film is no longer a “film.” A movie now usually comes to a
theatre not on reels but on a matte-finish hard drive the size of a
big paperback. The drive houses a digital version of the movie,
along with alternative soundtracks in various languages and all
manner of copy-guarding encryption. Instead of lacing a print
through rollers and sprockets, the operator inserts the drive into a
server that “ingests” the “content.” (By now a movie has become
content, an undifferentiated item to be fed into a database.) The
server accesses the files only after a key, a long string of numbers
and letters unique to that server-projector combination, authorizes
the transfer. (pp. 7-8)
Beyond the purely technical aspects and their implications, the
disappearance of a well-defined physical link between the image and its
representation has brought to light relevant discussions regarding film
ontology. As I will argue, digital cinema is not a different artistic medium23
from classical filmed cinema, nor has it significantly altered
cinema as a medium in terms of ontology. Instead, the advent of digital
cinema has highlighted the weakness and instability of some
assumptions on the indexical properties of the photograph. By analyzing
the technological changes of digital cinema, I will discuss the most
prominent elements related to cinema’s ontology, in order to provide a
philosophical ground for discussing the interaction between cinema and
hyperreality.
Prior to commencing this ontological discussion, it is necessary to
limit the categories of moving images that will be analyzed. There are
numerous audiovisual manifestations, as well as abundant approaches to
cinema. In terms of this discussion, however, the term cinema will be
restricted to pertain to narrative cinema that broadly follows the classical
Hollywood model of storytelling as defined by David Bordwell (Bordwell,
2006). In discussing the characteristics of this widespread modality of
storytelling, I will create a connection between the concept of virtual
reality discussed in the previous chapter and narrative cinema.
23. In this context, medium refers to an artistic modality, regardless of its
physical mode of delivery. Therefore, cinema is a medium regardless of
whether the movie is projected in a cinema, watched on a TV screen or
even on a cell phone. Similarly, literature is a unique medium regardless
of whether the book is printed or if it is in an electronic version.
The second part of this chapter will focus on the relationship
between cinema and hyperreality. In this section, I will incorporate
Stephen Prince’s model of perceptual realism (Prince, 1996; 2010; 2012)
into the discussion, as it greatly serves as a tool for inquiring about the
connections between hyperreality and cinema.
Narratives and Virtual Reality
Cinema has become a widespread mode of audiovisual narrative
storytelling. However, not all audiovisual manifestations should be
considered cinema, even if they employ the same set of technologies. A
documentary, for example, will most probably use technologies similar to
those of a movie, and it will include some type of story or narrative. However,
labeling it as cinema would complicate the definition of the word for the
purposes of this argument. For example, documentaries need to be
perceived as faithful to what they are depicting, thus acting as a record
that proves that a series of events actually occurred. However, this is not
the case for a movie.
This present discussion will focus only on narrative cinema that
broadly follows the classical Hollywood form. Even though this may seem
specific, most movies would fall into this category, regardless of where
they were produced. David Bordwell (1985) discusses the main
characteristics of the classical Hollywood form in Narration in the Fiction
Film. In this text, Bordwell (1985) references the Russian terms fabula and
syuzhet to refer to what he later defines as story and plot, respectively.
The story constitutes all the events of the narrative, including the ones
that do not appear in the movie. By contrast, the plot is all the information
that the movie contains, including the elements that would not be
considered an event in the story. In terms of the story, the plot typically
contains only parts of the story, which are not necessarily shown in
chronological order (Bordwell & Thompson, 2012, pp. 80-82).
Of all modes, the classical one conforms the most closely to the
“canonic story” which story-comprehension researchers posit as
normal for our culture. In fabula terms, the reliance upon
character-centered causality and the definition of the action as the
attempt to achieve a goal are both salient features of the canonic
format. At the level of the syuzhet, the classical film respects the
canonic pattern of establishing an initial state of affairs which gets
violated and which must then be set right. (Bordwell, 1985, p. 157)
Thus, narrative cinema is a character-centered “chain of events
linked by cause and effect and occurring in time and space” (Bordwell &
Thompson, 2012, p. 79). In another book, Bordwell (2006) argues that
postclassical era movies (produced after 1960) still follow, with some
innovations, the classical form:
American films have changed enormously. They have become
sexier, more profane, and more violent; fart jokes and kung fu are
everywhere. The industry has metamorphosed into a corporate
behemoth, while new technologies have transformed production
and exhibition. And, to come to my central concern, over the same
decades some novel strategies of plot and style have risen to
prominence. Behind these strategies, however, stand principles
that are firmly rooted in the history of studio moviemaking. (p. 1)
For the purpose of this thesis, Bordwell’s remarks regarding
contemporary moviemaking practices serve to define a common ground
for cinema that would fit with the scope described above.24 As an
example, nonlinear movie narratives usually follow the general principles
of classical Hollywood moviemaking. For instance, in Pulp Fiction (1994)
the narrative is split into four stories that produce a complete solid
narrative at the end. As a result, once the audience finishes watching the
movie, a full-length story that merges all four stories cohesively is
revealed to them. Hence, Pulp Fiction’s plot structure is neither
conventional nor linear. However, each story has its main characters with
their objectives and intentions, and a chain of causal events constructs
the narrative. Moreover, the full narrative discovered at the end similarly
follows these parameters, linking all the characters and situations
together in a complex directed graph of causal events (Figure 17). I use
the term “directed graph” instead of “chain”, as it may be more
appropriate for defining Pulp Fiction’s whole narrative, because it implies
24. Similarly, this set of movies represents the majority of films that have
been produced and released commercially.
directionality. Nevertheless, this does not significantly affect the overall
definition of a Hollywood film narrative.
Figure 17. Abstract example of a directed graph. It does not refer to any
narrative specifically.
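The difference between a chain and a directed graph of causal events can be made concrete with a small sketch. The event names below are abstract placeholders (like Figure 17, they do not refer to any narrative specifically):

```python
# A chain allows exactly one successor per event; a directed graph lets
# one cause fan out to several effects, or several storylines converge
# on one event, while preserving directionality.

chain = {"e1": ["e2"], "e2": ["e3"], "e3": []}

graph = {
    "e1": ["e2"],
    "e2": ["e3", "e4"],  # one cause, two parallel effects
    "e3": ["e5"],
    "e4": ["e5"],        # two storylines converge on the same event
    "e5": [],
}

def max_successors(events: dict) -> int:
    """Largest number of direct effects any single event has."""
    return max(len(effects) for effects in events.values())

print(max_successors(chain))  # 1: strictly linear causality
print(max_successors(graph))  # 2: causality branches
```

The adjacency-list representation shows why the term matters for a narrative like Pulp Fiction's: once an event can have more than one successor or predecessor, the causal structure is no longer a chain, yet every edge still runs in one direction.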
From this perspective, narrative movies produce two important
outcomes. First, they generate an imaginary world in which the story
unfolds. Second, they generate an imaginary psychological world for
each of their main characters. These results constitute what is usually
described as the movie diegesis. Although this may be initially perceived
as a clear and simple concept, accurately analyzing the diegesis of a
movie (and how the diegesis is generated) becomes a complex task,
especially when referring to music. An in-depth definition of the diegesis
will be addressed in the following chapter. Now, I will focus on how this
set of movie worlds interacts with the concepts of virtualization and
virtual reality examined in the previous chapter.
Storytelling (in all of its forms) relies on creating an imaginary world
and a set of imaginary characters for a story to unfold. This still holds true
when the story recalls a “real life” event. Even in this situation, the story
needs to create a replica of this “real” world situation and its characters.
The opening of Fargo (1996) illustrates how “real life” storytelling does
not differ from fictional storytelling. At the beginning of the movie, it is
stated that the story is “true”:
THIS IS A TRUE STORY
The events depicted in this film
took place in Minnesota in 1987.
At the request of the survivors,
the names have been changed.
Out of respect for the dead, the
rest has been told exactly as it
occurred. (Fargo, 00:00:19)
The Coen brothers, who directed the movie, later stated that the
movie was fictional, even though it was inspired by a set of real life
events. They argued that by making the audience believe that they were
telling a true story, they were allowed some sort of freedom in terms of
the narrative that they would not have had otherwise (Heitmueller, 2005).
Fargo highlights that fictional and nonfictional stories do not differ in
terms of the creation of their imaginary worlds. They may differ, as the
Coens suggest, in terms of their associations and the audience’s
identification with them, but not in the necessity to create an imaginary
world for the story and an imaginary psychological world for each of the
characters. Therefore, imagination becomes a key element to
understanding the close relationship between humanity and storytelling.
Imagination and Virtual Reality
In his book, Sweet Anticipation, David Huron (2006) describes
what he terms the imagination response (p. 8). It is the first step in his
well-known theory on human behavior that is based on imagination,
tension, prediction, reaction and appraisal (ITPRA) and is described in the
text.25 He believes that the “imagination response is one of the principal
mechanisms in behavioral motivation” as “Imagining an outcome allows
us to feel some vicarious pleasure (or displeasure)—as though that
outcome has already happened” (p. 8). Hence, “we don’t simply think
about future possibilities; we feel future possibilities” (p. 8). As an
example, Huron (2006) states: “It is important to pause and smell the
roses—to relish the pleasures of the moment. But it is also crucial to take
the imaginative step of planting and nurturing those roses” (p. 9).

25 For Huron, these are the ordered steps for a human’s response to a
stimulus.
For Huron, imagination is essential for human survival and
evolution. Thus, humans constantly imagine possible outcomes for
different situations they believe they may encounter. Moreover, Huron
(2006) states that the process of imagination involves thinking and feeling
at the same time (p. 9). Furthermore, Huron argues that it is not possible
for healthy humans to only think about what they imagine without actually
feeling it. Therefore, by imagining a situation, humans experience the
emotional outcome before the situation happens, which may aid them in
shaping their actions (by planting roses, for example).
In looking at storytelling from this perspective, it becomes a
system that allows for the sharing of possible outcomes of different
events. This is why stories are normally rich in terms of mythical content,
which facilitates their application as a reference for a broad set of specific
situations. In accepting Huron’s findings, perceiving a story not only
involves a process of imagination but also a process of feeling the results
of each situation as if they had happened to the audience. Stories are not
only a form of entertainment but also a vessel to funnel the imagination
response in order to share it with other humans. In doing so, humans are
expanding their brains, as there is the possibility for them to receive an
imaginary situation, which originated in another human’s brain, and feel it
as if it were their own. In terms of McLuhan’s views, storytelling becomes
a robust first step towards a process of sharing a consciousness.
It is from this angle that the capacity to create psychological
worlds for the characters of a story becomes fundamental. Individuals
engage with stories by connecting with their specific situations due to
their characters. Their psychological world and the values that justify their
actions are key to establishing their behavioral patterns for the benefit of the
receivers of the story. In addition, by having a clear picture of the
relationship between their actions and the characters’ psychology,
individuals are able to better discern the connections between their own
psychology and the psychology of the characters portrayed in the story.
This is one of the reasons why a character-centric mode of
storytelling has become dominant. In this mode of storytelling, the
psychology of the characters is key to unfolding the events of the
narrative. Therefore, creating a consistent psychological profile for the
characters that are responsible for the main actions of the narrative
supports the establishment of a shared consciousness, as the events of a
narrative designed in such a manner better connect with the lived events
of its audience.
In addition to creating strong and complete psychological profiles
for its characters, a story that relates to shared mythical content is able to
include a wide set of situations. Consequently, the story may act as a
template. Each individual would unpack the content of some stories to
imagine specific situations, while modifying and complementing them
freely. As an added layer of complexity, an individual might have a set of
stories that would be applicable to a certain situation. Those stories might
also be contradictory or propose opposite resolutions. The individuals will
need to choose how to act26 and, by doing so, they will establish their
own behavioral principles.
Describing storytelling by employing Huron’s theory of imagination
from a cognitive psychology perspective reveals that any process of
imagination produces a virtual reality. Huron’s findings suggest that the
process of imagination is so powerful that it produces the same
emotional outcome as if it were experienced physically. By stating that
imagination makes us feel as if the event actually happened, Huron is
implying that the process of imagination generates alternate virtual
realities. In each of these virtual realities, the emotional pay-off is
equivalent to that of everyday physical experiences.
26 By choosing a particular story as a model, merging different stories
together, or imagining a completely new situation with different outcomes.
Narrative cinema, as a form of storytelling, does not deviate from
this proposed framework. In comparison to a written tale, cinema is
capable of operating within a whole set of audiovisual materials in
addition to language. As I will describe in the following chapter, music
may also help to reinforce a narrative concept, thus reshaping the
imaginary world (or the virtual reality) that the narrative portrays. However,
even though utilizing cinema affords the creator a richer set of tools
for narrating the story, it does not significantly differ from other modes of
storytelling in terms of its capacity to create a powerful virtual reality, as
this is achieved through the imagination.
Digital Cinema and the Ontology of Cinema
Digital cinema is not a different artistic medium from analog
cinema, nor does it significantly modify the definition of analog cinema in
terms of medium. Instead, digital cinema has assisted in elucidating
ontological27 questions on the properties of film and photography. In this
section, I will discuss the two most salient problems, which are the
concept of indexicality and authorship. In addition, I will inquire into how
these problems apply to animated movies, which will serve to further
clarify the present discussion.

27 In this discussion, the ontology of cinema will examine the properties of
cinema as a medium. For example, defining photography as a medium
distinct from painting should be considered an ontological question. In
order to properly demonstrate such a statement, a medium-specific
quality (some property not applicable to photography) needs to be found
in order to justify the separation of the two media in terms of ontology.
Digital Cinema and Indexicality
One of the key elements of Peirce’s semiotics (Atkin, 2013) is the
division of signs into three non-exclusive categories: icons, indexes and
symbols. A sign is an icon when it resembles its object; it is an index
when there is a factual connection with the object; finally, it is a symbol
when the relationship is arbitrary.28 Following this rationale, a painting and
a photograph should both be an icon of what they portray. However, it
was believed that only the photograph was an index of the reality that it
captured. This was believed to be true since the painter did not
necessarily need any reality in order to create a painting. Conversely, this
was not possible with a photograph, which requires a reality to capture.
Following this line of thought, the camera would become a device
capable of objectively capturing the reality in front of its lens, which acts
as a record of its existence: the chemical process that converts the light
emanating from the objects in front of the camera into a negative
becomes the physical proof of its indexicality. An equivalent logic may be
applied to film.

28 The categories are not exclusive: a sign may be classified according to
one, two or all of the categories at the same time.
With the invention of the digital camera, the indexicality of the new
device has been questioned. There is no film to capture the light from the
objects. Instead, a set of digital light sensors mediates and generates a
collection of binary code for the digital version of the picture. From this
perspective, the digital image is simply a collection of single colored dots
(the pixels) that produce the impression of what photography once was.
Due to the digital mediators and its pixelation (fragmentation of its surface
into discrete single colored dots), digital photography would lose its
indexical property.
Even before the first commercial digital camera was produced,
Friedrich Kittler underlined the weaknesses of attributing an indexical
property to the photographs or films produced by any type of camera. In
discussing why film is not indexically linked to reality, Kittler (1999) states:
“Instead of recording physical waves, generally speaking it only stores
their chemical effects on its negatives” (p. 119). By emphasizing the
chemical process, Kittler (1999) also reveals a further consequence: if the
chemical print were accepted as indexical, then digital photographs
should likewise be accepted as an indexical medium. This is because the
photographic negative has a finite number of molecules. Hence, this
implies that the negative also has a finite resolution, which should be
considered, in terms of indexicality, to be similar to the limitations of
pixelation. At most, the difference between analog and digital
photography is a problem of resolution (there may be more molecules in a
negative than pixels in a digital photograph), which does not justify a
separation in terms of ontology based on their indexical properties. Thus,
if the analog photograph were a medium capable of indexing reality, then
the digital photograph acquires the same indexical power as well.
So far, the purpose of the discussion has been to refute any
ontological differences between digital and analog photography based on
their indexical property. However, this does not mean that I accept that
photography is indexical per se, as I believe that the primary assumption
(the objectivity of the camera in terms of being able to capture reality) is
flawed, as I will argue below.
Stephen Prince (2012) challenges the assumed indexicality of
photography by observing that the camera does not capture an amount
of light that is comparable to the human eye. In this statement, Prince
challenges the camera’s indexicality capacity from the restricted point of
view of the perception of the human eye. He intentionally modifies the
definition of an index to apply only to the human-perceived reality instead
of reality itself, which is impossible to grasp. This restriction of the scope
of the indexical property was naïvely implied in the prior approach. For
example, a regular camera is not intended to capture the infrared waves
that are present in reality, as the human eye does not perceive them.
Nevertheless, Prince (2012) acknowledges that the camera fails to
become an index, even when only considering a human-centered
perception of reality. Therefore, the camera itself acts as a filter of
anything it captures:
Despite the important place that live-action-camera reality holds in
the theory and aesthetic practice of realistic filmmaking, the
camera is a poor instrument compared with the human eye. It is a
lossy instrument that fails to capture the full range of luminosity in
a scene or environment. The human eye can respond to a dynamic
range of 100,000:1 luminance levels; a camera, depending on the
speed of the lens, the latitude of the film, and the size of its digital
sensor, captures far fewer. Theories of cinema that take the
camera as an indexical agent accurately capturing what is before it
have neglected to consider the significance of this characteristic—
a single analog or digital image created in camera is a low-level
facsimile of the lighting environment from which it derives. (Prince,
2012, pp. 192-193)
Prince’s statement implies that a photograph may be unable
to capture, for example, the visual content of dark areas.
Subsequently, the photograph would fail to index those areas. For
instance, if a photograph were only able to capture the head of a
person talking at night but not the body (due to the lighting) this
would not be an index to prove that talking heads without bodies
exist. Similarly, in the absence of light, the camera fails to index
any reality.
Even when considering an ideal situation in terms of
lighting, Prince argues that a digital technology such as HDR (High
Dynamic Range) is much better at capturing the full
radiance of the environment. In order to register the full light
spectrum, the same environment is recorded at different light
sensitivities. After the recording process, the data from this set of
recordings is interpolated, creating a computer model.
criticism evidences the problems that arise as a result of defining
photography or cinema as an index of reality. In the previous
chapter, I argued that any process of perception (including
mechanical perception involving a camera) becomes mediated by
the artifact employed for its capture. The human eye, the chemical
components of the negative, and the optical transducer of a digital
camera act as mediators between reality and what they capture.
As a consequence, the indexical property needs to be further
diluted to acknowledge the interference of mediators in the link
between reality and photography.
Digital cinema (and photography) has also allowed the
incorporation of Computer-Generated Imagery (CGI) and computer image
processing into moving images. In terms of ontology, the utilization of
these techniques may become a dividing point regarding medium
specificity, even if digital cinema does not differ from traditional film in
terms of its capacity to become an index of reality. By adding objects that
are generated by a computer or by modifying the existing picture using
computer software (eliminating the facial imperfections in a photograph,
for example) digital cinema may differ significantly from analog film
production. Following this rationale, analog and digital cinema would
become ontologically diverse media due to the inability of analog cinema
to incorporate objects that were not produced physically. In other words,
even though the analog camera might not be able to become a strong
index of reality due to the problems discussed above, at least it portrays
what actually originates in the physical world. However, it is precisely
because of its inability to index the reality captured by the camera that a
computer modification of the image should not be considered a medium
specific property. For example, make-up may carry out the same function
as digital retouching, especially when assuming that part of the realistic
effect of make-up is possible due to the limitations of the camera. In other
words, make-up that seems natural or verisimilar in a movie might look
much more artificial when observed directly on set. Similarly, when a
character is portrayed as being injured, this does not require actually
injuring the actor: make-up techniques are able to produce a verisimilar
recreation of the injury. In addition, a full-size view of the planet Pandora
from Avatar should not be considered different from the planet Tatooine
in Star Wars (1977), as discussed in the previous chapter. Even though
Pandora was digitally created, and Tatooine was generated using a
miniature, both fictional planets are portrayed in their respective movies
despite the fact that they do not exist in the physical world. Star Wars is a
convenient example for this discussion as it allows for a comparison of
the original trilogy with the subsequent prequel. There are numerous
elements in the second trilogy that were digitally generated but were
created using physical special effects in the first. Nevertheless, those
elements are perceived as equivalents in terms of the narrative29 in both
trilogies, which reinforces the argument in favor of considering CGI
manipulation as a typology of visual effects instead of a new category of
audiovisual medium.
29 For instance, a spaceship created utilizing a miniature model
represents the same spaceship as one created using CGI.

Further, in accepting CGI and visual effects as part of the moviemaking
process, the indexical property of a product that utilizes those
techniques weakens even more, becoming almost non-existent. Thus,
the truth of a moving image (or a photograph) cannot rely on its
indexicality. Instead, it becomes a matter of its iconic value and its
symbolic attribution. Its iconic property (how the picture resembles the
object that it attempts to reproduce) is connected to its degree of fidelity,
resolution and clarity. A high-resolution image of a dark area will only
serve as an icon for the darkness of the moment, but it will not serve as
an icon for the objects that it could not capture. Hence, the iconic
property alone is not sufficient to ensure the truthfulness of a moving
image. As the indexical property is not usable, the possibility of delivering
verisimilar experiences (as moving images do) may only be attributed to a
symbolic process. This is only possible because the camera has become
the symbol in Western society of a device capable of documenting the
environment. This means that a movie is perceived as verisimilar due to
the conjunction of its iconic power (how well it resembles the reality it is
meant to represent) and the symbolic assumption that filmed events
portray reality. However, these properties may not be enough to produce
a verisimilar output. It is for this reason that trust in the author of the
image may become the strongest symbolic value in terms of delivering
verisimilitude.
Authorship and CGI: Gollum’s Case
In Liveness: Performance in a Mediatized Culture, Philip Auslander
(2008) discusses the implications of a CGI character like Gollum in The
Lord of the Rings (2001-2003) trilogy in terms of the performance’s
authorship. In addition to its genesis through computer processing,
Gollum’s performance was also generated using an actor (Andy Serkis),
who was recorded using motion and facial sensors. This virtual
performance challenges the definition of what acting and performance
really are, especially when considering its multiple ramifications:
Once created, a digital clone can undertake an infinite variety of
performances the actual performer never executed; such
performances can also be extrapolated from other forms of
information, such as motion capture data. Whether generated in a
special-effects studio or a live, interactive dance performance,
motion capture data can be stored and used to produce future
performances that were, in some sense, executed by the
performer but without that performer’s direct participation.
(Auslander, 2008, p. 170)
Auslander focuses his discussion on the legal implications in terms
of the copyright of the performance. Nevertheless, his thoughts on the
Gollum problem similarly challenge the ontology of performance and its
authorship. Hence, it becomes difficult to precisely define what
performance is and equally difficult to assign authorship to a given
performance. Stephen Prince (2012) provides a compelling answer
based on an idea coined by film director David Fincher, who advocates
differentiating between acting and performance:
On stage, performance and acting often are interchangeable. In
cinema, acting is a subset of performance. For our purposes, then,
acting is the ostensive behavior that occurs on set to portray
characters and story action. Performance is understood as the
subsequent manipulation of that behavior by filmmakers or by
actors and filmmakers. This distinction will enable us to recognize
the ways that cinema employs technology to mediate the actor’s
contribution, via such things as editing, music scoring, lighting,
makeup, and compositing. (p. 102)
The separation between acting and performance is radical. Acting
becomes just a part of the performance, contained in a finite set of
physical actions, while the performance is achieved in a much
broader sense. For example, when music helps to shape a character (the
Indiana Jones theme, for example), this music becomes part of the
performance of the character. This definition allows for the inclusion of
CGI as part of the performance as well. In addition, the model is flexible
enough to incorporate other frequently neglected elements, such as the
stunts, into the whole performance process (Figure 18).
In the graphical representation that appears in the following figure,
CGI encompasses more processes than virtual characters alone. Digital
retouching may act as virtual make-up, in a similar manner to how color
correction or artificial lighting relate to physical lighting or the selection of
a lens filter. Moreover, the model reveals that performance in audiovisual
media is a process that involves physical and virtual actions.
Figure 18. This graphical model describes performance based on David
Fincher's approach to performance and acting as described by Prince
(2012, p. 102).
From a conceptual point of view, even when editing was done by
physically cutting film stock, film editing and music scoring have always
been part of the virtual process of film performance, as they are part of
the postproduction stage. In this situation, the editor is not cutting film for
the sake of cutting it, but for the sake of what it represented. Thus, the
action of cutting film stock is conceptually physical but the action of
editing film shots has always involved a virtual framework. Defining the
process of editing as virtual does not preclude acknowledging the links
between editing and the set of physical processes in production (the
distribution and selection of takes, the instructions from the director and
his or her team, etc). Similarly, CGI usually incorporates data from the
physical world into its digital processes. Equally, music may incorporate
recordings. Figure 19 attempts to represent this approach by modifying
the previous model:
Figure 19. This graphic summarizes the different roles that contribute to
generating a performance in audiovisual media. The graphic is divided
between physical and virtual processes.
This deeper division of the performance process underlines further
connections regarding acting. For example, the instructions given by the
director and his/her team to the actors will have an effect on the acting
result. This is not exclusive to movies, as it could be generalized to other
arts. For example, music performance could be divided between the
physical act of playing the instrument and all the aesthetic decisions that
involve selecting a set of different playing actions. Influences from other
performers or performances, or the instruments being utilized become an
integral part of the performance. Following this line of thought, most
performances are the product of multiple authors, collectively referred
to as ‘influences’.
In light of this model, the concept of authorship dissolves. In terms
of ontology of the performance, this approach removes any possible
distinction between a physical character, like Aragorn, and a digitally
manipulated character, like Gollum in The Lord of the Rings. They are not
ontologically different, as they only differ in the degree of performance
attributable to the actor. Thus, Viggo Mortensen had a greater influence
on the overall performance of Aragorn than Serkis did in the performance
of Gollum. However, in both cases, their acting was only a part of the
overall performance process. The implications of this model are
significant due to the strong reliance on the concepts of authorship and
authenticity in modern Western culture. The previous discussion reveals
that both concepts are not as stable as people in Western civilization
might have assumed. Thus, authorship and authenticity hold a degree of
arbitrariness, which ultimately means that they are symbolic. For
example, assigning Viggo Mortensen as the sole author of the
performance of Aragorn30 might be the shared convention of our society.
The Case of Animation
Animated movies have traditionally been considered a separate
medium from “live action” movies. The distinction becomes extremely
problematic in contemporary cinema when considering the previous
discussion of CGI and performance. The argument for why the main
character from Wall-E (2008) is animated, but Gollum is not, is
complicated. The previous statements serve as a means to challenge this
assumption in terms of ontology. First, if photography and film are not
indexes of reality, they are not ontologically different from animation.
Second, the redefinition of the concept of performance dissolves the
differences between a “live character” and an “animated” one. Moreover,
Disney’s usage of anthropomorphism in his animated animals might be
understood as a pre-digital process of pseudo-motion capture. Disney’s
animators certainly did not have the technology to capture motion
digitally, but they achieved remarkable results by emulating human body
movements. Thus, it became a process of motion capture that was
achieved by using human perception alone. Similarly, even the images
generated in classical animation are influenced by the observation and
capture of diverse pieces of information from the physical world. In
considering this, the possible distinction between what is live and what is
animated becomes even thinner in terms of ontology.

30 Excluding the stunt doubles used in action scenes, the make-up artists
and costume designers, etc.
By this statement, I do not intend to defend the idea that there are
no differences between animated and live action movies. However, the
difference is not a matter of ontology but a matter of aesthetics. Similarly,
the discussion on the differences between film and digital cinema
belongs to the field of aesthetics instead of ontology. This is relevant, as
aesthetics are part of a shared symbolic system of society, which implies
a degree of arbitrariness. Imagining the different reactions of a 21st-
century audience watching King Kong (1933) and of the movie’s original
audience should elucidate the previous statement. Even though
the movie is exactly the same, current audiences will find the creature
more unrealistic than the original audiences.31 This change in the
perception of King Kong may only be attributed to an evolution in the
audience’s aesthetics, especially when considering that the movie is
exactly the same and that human perceptual abilities (in terms of what
human senses can perceive) have not significantly changed. Thus, the
appreciation of the unrealism in King Kong is purely aesthetic and,
therefore, a product of the shared symbolic system of society.

31 The New York Times review from the premiere confirms this
assumption: “Imagine a 50-foot beast with a girl in one paw climbing up
the outside of the Empire State Building, and after putting the girl on a
ledge, clutching at airplanes, the pilots of which are pouring bullets from
machine guns into the monster's body” (Hall, 1933).
Prince’s Perceptual Realism
Preliminary to a discussion on hyperreality and cinema, I will
describe a term coined by Stephen Prince for analyzing digital movies:
perceptual realism (Prince, 1996; Prince, 2010; Prince, 2012). The term is
relevant as it will further elucidate the distinction between ontology and
aesthetics. Perceptual realism refers to objects in a movie that, even
though referentially false, are perceived as realistic when depicted
within the world of the movie (Prince, 2012, p. 32). Prince (2012) argues
that due to their perceptually realistic condition, "they are able to compel
belief in the fictional world of the film in ways that traditional special
effects could not accomplish" (p. 33) and, therefore, "the more
comprehensive a scene in evoking perceptual realism, the likelier it is to
compel the spectator’s belief" (p. 33).
Prince utilizes the dinosaurs of Jurassic Park as an example of an
object that is perceptually realistic. Gollum, or the Na’vi characters from
Avatar (2009), are other examples that have already been discussed.
However, the dinosaurs of Jurassic Park are exemplary as the movie’s
diegesis is situated in present (as of 1993) times. The dinosaurs become
an abnormal element of a depicted world that looks like it did in 1993. In
addition, the movie combined physical models of dinosaurs with digitally
generated ones. Hence, the dinosaurs will help to exemplify a model for
perceptual realism that goes beyond CGI. Furthermore, the dinosaurs in
the movie are a strong example of perceptual realism, as they do not
currently coexist with humans and no human has ever interacted with a
dinosaur (Prince, 2012, p. 32).
In a related manner, Prince describes the scene in Forrest Gump
(1994) in which Tom Hanks interacts with President Kennedy as
perceptually realistic. In this case, the recording of President Kennedy is
real and it may still exist in the memories of the audience members. Tom
Hanks is similarly real and is a well-known actor. However, audiences
also know32 that President Kennedy and Tom Hanks never interacted.
Moreover, the scene could never have been filmed because of the age
difference between them.
32 At least the audiences that went to see the movie in 1994.

One of the main implications of Prince’s definition lies in
evidencing that perception has a key role in generating a model of the
real. The implications in terms of the construction of the world of the
movie will be explored in the following chapter. For now, Prince’s concept
signals how the limitations of the senses are relevant in terms of
perceiving a piece of art. In addition, Prince’s conceptual approach
associates the limitations of human perception in the movies with the
limits of the senses in perceiving the world. The aesthetics of digital 3D
exemplify this position. Prince argues that 2D (planar) cinema is “3D to
the extent that it replicates the monocular depth cues that observers
employ when viewing spatial layouts in the world at distances of six feet
or more” (p. 205). Thus, even though humans use both eyes to create a
three-dimensional view of the world, this capability tends to be restricted
only to the immediate space. This is the reason that planar movies or
paintings that use perspective techniques have spatial depth. Thus, it is
not the painting or the photograph that tricks the human eye into
believing that there is a third dimension to a planar area. Instead, it is the
human perception that "elects" to not perceive stereoscopically after a
certain distance, thereby making a planar representation of a landscape
and the landscape equivalent in terms of perception.
The manner in which human senses shape the perception of
reality does not only apply to three-dimensional perception in
cinema. Human perception experiences movement by seeing 24 or
more similar images per second. With less than 24 images per
second, movement may also be perceived but the illusion may
become apparent. Similarly, with more than 24 images per second,
a human may perceive a smoother sense of movement. The exact
number of frames per second required for a person to perceive
realistic movement depends on the limits of human senses and a
certain degree of aesthetics, as humans are accustomed to adjusting
their perceptual expectations to the environment. Similarly, a CGI
object will be perceived as realistic depending on the degree of
definition. The threshold will also be determined by a combination
of perceptual capabilities and aesthetics. The earlier King Kong
example illustrates this point, as it reveals how the perception of
verisimilitude varies depending on the aesthetic values of a
society.
In addition, Prince's (2012) definition of perceptual realism
(p. 32) provides the foundation for discussing the processes
involved in perceiving something as realistic even though it is
referentially false. The dinosaurs from Jurassic Park are a good
example: they were generated by employing different models of the
real, which were extracted from paleontological findings and from
inferred animal behavior. Similarly, even though sound does not
travel in outer space, it actually seems more realistic when it does,
as is the case in most movies. Space travel is not a widespread
human experience. Thus, the sound propagation quality is inferred
from other physical media like air or water. I would speculate that
even for audiences that are aware of this physical property of
sound, a chase scene in space is perceived as more realistic with
the sound effects of explosions, lasers and collisions than it is
without.
Thus, perceptual realism is the product of aesthetics and
the limitations of human senses. Perceiving a dinosaur from
Jurassic Park as realistic involves an aesthetic decision of
acceptance because dinosaurs do not exist in contemporary
reality. In addition, the previous discussion highlights the inability
of human senses to distinguish between a CGI object and an
image captured by a camera, once CGI reaches a certain degree
of resolution. Moreover, distinguishing an image captured by a camera
from something that has either been created digitally or been
manipulated becomes symbolic, as the indexical properties have been diluted.
It is from this perspective that Prince’s definition may be especially
interconnected with the discussions concerning hyperreality.
Cinema and Hyperreality
Prince's definition of the perceptual realism in the dinosaurs
of Jurassic Park is close to Baudrillard's definition of hyperreality:
"models of a real without origin in reality" (Baudrillard, 1994, p. 3).
Nevertheless, Prince's scope is mainly aesthetic, as he attempts to
describe a new aesthetic process that surfaced alongside digital
technologies. However, the implications of his definition of
perceptual realism may extend further when considering the
problems of indexicality and perception.
In terms of cinema as an index of reality, Prince contends
that digital cinema may be even more indexical than its analog
precedent. When describing HDR imaging (HDRi), he asserts that
the technology exemplifies how a digital process may achieve a
higher degree of indexicality. However, by stating
that indexicality may have different degrees, Prince transforms its
meaning: it is no longer a binary property that would signal whether the
medium acts as an index of the reality it portrays. In fact, deciding
the degree of indexicality of a given image requires an aesthetic
evaluation. For example, the increase of lightness captured using
HDRi should translate into a similar increase in the degree of
indexicality. In addition, one must assess to which degree
lightness contributes to the overall indexicality of the image. Based
on this model, evaluating the indexicality of a given picture
requires two elements. First, there needs to be an object against
which it can be compared. If the object that it will be compared
against is the human eye, then a precise definition of what “human
eye” means is necessary, as vision varies according to the
individual. Second, a decision must be made in order to assign
the contribution that each pictorial feature (lightness, color, etc.)
has to the final degree of indexicality.
By following Prince's approach to evaluation, indexicality
has become an aesthetic property, as its values are generated by
using a symbolic system. This implies that a process of
assessment based on cultural codifications mediates the link
between the moving image and reality. In addition, an object
generated using computer software may become indistinguishable
from an object captured by the camera. This especially holds true
if considering that CGI could incorporate captured elements in
order to generate its digital objects, which further complicates its
relationship with reality and indexicality. For example, different
sensors may be used to capture the precise amount of lighting in
an environment using HDRi technology. With a proper set of
sensors, it is possible to get a precise 3D radiance map of a space.
With this information, it is then possible to create computer-generated
lighting that, as Prince argues, would become indexical
of the radiance of the room33 (Prince, 2012, pp. 192-198). This
model of lighting can be used to digitally illuminate any space,
including an invented digital model. Hence, even with a culturally
mediated approach to indexicality, problems arise due to the
inability to distinguish, by using human senses alone, between
what was captured and what was artificially generated.
The implications of redefining what an index is in terms of
the relationship between cinema and hyperreality are concentrated
in three areas. First, McLuhan’s position that Western civilization
was given "an eye for an ear" (McLuhan, 1964/1994, p. 84), as it
became visually biased with the phonetic alphabet, still holds. This
is why an analysis of the indexicality of the moving image tends to
ignore the rest of the senses: reality is what is seen, but not what is
tasted, smelled or heard. However, it is also true that in terms of
sound, audiovisual media has reshaped the perceptual model.34
For example, if the soundtrack of an audiovisual sequence is not
coherent with the visual track, this may affect how the image is
perceived, as I will discuss in the following chapters.
33. This is similar, in terms of this discussion, to the motion capture sensors and the utilization of photographed textures to "paint" computer-generated objects.
34. As McLuhan argued.
Second, the modified definition of indexicality reveals how a
comparison with other symbols is needed in order to decide the
indexical value of a moving image. As a consequence, the
assumed direct connection between the real and the image is lost
as the link is established by applying a set of rules based on
cultural assumptions.
Third, CGI generates objects that have no apparent origin in
reality, even though they may integrate some information captured
from the world.
In the previous chapter, I argued that the miniatures used in
the original Star Wars trilogy were equivalent to CGI. In terms of
indexicality, filming those miniatures is as indexical of reality as is
using artificial lighting modeled after an HDRi capture. This
assumption implies that digital cinema has not transformed cinema
in terms of its indexicality. Instead, digital cinema has indicated
that the ontological supposition of indexicality was never there.
Therefore, a study of the relationship between cinema and reality
becomes a perceptual inquiry: an analysis of how audiences perceive a
moving image, and of their ability (or inability) to discern whether its
origin was in their perceived reality, is required.
Digital cinema has been able to further hide the illusion, as it
is able to work on a definition level that transcends the limits of
human senses. As a consequence, discerning how the world
depicted in the movie connects with the perceived world of the
spectator becomes a process that goes beyond perception.
Further expanding on this concept requires scrutinizing how the
world of the movie is created, which is the focus of the next
chapter.
CHAPTER V
FILM DIEGESIS, REALISM AND THE PERCEIVED REALITY
Introduction
The diegesis is the imaginary world in the viewer’s mind where the
actions of the movie (or any other narrative) happen. The term imaginary
is essential to this definition, since movie spectators do not actually see
the diegesis of a given movie through their eyes while watching it. Hence,
the audiovisual content of the movie is not the diegesis, as the diegesis
cannot be depicted. Therefore, from a purely perceptual point of view,
there is no diegesis.
Remarkably, screen music scholarship has been significantly
involved in the discussion on film diegesis (Cecchi, 2010; Gorbman, 1987;
Neumeyer, 2009; Smith, 2009; Stilwell, 2007; Winters, 2010; Yacavone,
2012). This is because the characters apparently cannot hear most of
a film's score. Music is not the only cinematographic device that the
characters are unable to interact with: for example, characters are equally
oblivious to movie editing techniques. Yet, edits and cuts are not
commonly mentioned when discussing film diegesis. Conversely, film
editing is paramount in theoretical approaches to film realism, although
music is barely acknowledged in these other sets of discussions. This is
noteworthy, since an analysis of the relationship between film and realism
is indispensable for examining the diegesis in depth. By highlighting
the apparent disassociation between diegetic and realism studies in film, I
am not questioning the importance of diegetic theories in screen music
scholarship. Describing the function of the musical underscore implies
comprehending, first, how and from where music interacts with the filmic
world. However, the predominance of screen music scholarship in the
definition of film diegesis has generated a misconception: the
assumption that music has a singular role in film that cannot be
paralleled by other filmic techniques.
In this chapter, I will discuss contemporary theories of film diegesis
along with Souriau's (1951, pp. 231-240) original formulation, in relationship
to realism and the perceived reality. In doing so, I will link the concept of
film diegesis with hyperreality and virtuality. As a result, I will build a new
framework to describe and analyze the film diegesis and its role in
experiencing a movie.
The Myth of the Perfect Illusion
Film critic André Bazin (Braudy & Cohen, 2009) states that film
pioneers, “in their imaginations, […] saw the cinema as a total and
complete representation of reality; they saw in a trice the reconstruction
of a perfect illusion of the outside world in sound, color and relief" (p.
165). Therefore, he argues that cinema was born out of a myth that he
defines as “the myth of total cinema” (p. 166). Bazin’s statements might
resonate with the scrutiny of the indexical property associated with film
from the previous chapter. However, Bazin’s theoretical approach
focuses on the degree of perfection of the representation instead of the
value of the representation as an index of reality. Thus, Bazin’s discussion
is mainly aesthetic instead of ontological.
The virtual world in The Matrix (1999) serves as an imaginary
implementation of what an almost perfect illusion would look like. It
requires a representation of a world that feeds information to the senses
of the enslaved humans with a level of definition indistinguishable from
the inputs of the physical world. In fact, the Matrix is not exactly a perfect
illusion because it has some glitches (mistakes in its code) that cause
awareness of the illusion. Cinema is clearly far from that illusory level. If
cinema were a means to generate perfect illusions, this would function
against its role as a narrative and artistic medium. For instance, time
management is fundamental in cinema. By employing time ellipsis, the
film narrative skips non-relevant moments. Likewise, reordering the
events of the narrative non-linearly35 is commonly utilized for narrative or
artistic purposes. The movie Pulp Fiction (1994) would change
significantly if it were not narrated in four separate stories that, at the end
of the film, generate an encompassing narrative. Similarly, Memento’s
(2000) narration employs a fragmented and reversed time structure in
order to better describe the perceptual disability of the main character,
which becomes a key feature of the story.
The example of Memento is relevant, as it exemplifies the
limitations of a model of realism based only on a pure audiovisual input,
which is implied in Bazin’s myth.36 This suggests that reality, for humanity,
goes beyond pure audiovisual (or even multi-sensorial) data, as reality
necessarily includes other aspects beyond purely perceptual data (e.g.,
human psychological states). In Memento, the main character, Leonard,
suffers from a type of amnesia that prevents him from storing new
memories. Being unable to store new information, his life is rooted in the
information of his short-term memory, which lasts for only a few minutes.
Creating a movie within this premise is challenging, especially when the
character’s amnesia has a central role in the plot. Most of the story is told
backwards in short fragments of time that would be equivalent to the
35. From a temporal perspective.
36. If cinema, which is an audiovisual medium, aims to become a perfect representation of reality, then reality becomes an audiovisual experience.
duration of Leonard's short-term memory. In doing so, each fragment is
cohesive in terms of Leonard’s awareness. Thus, he is able to recall all
the information portrayed in each of these sequences. If a scene had
lasted longer, at some point Leonard would no longer remember his
actions from the beginning of that same scene.
strategy of narration, director Christopher Nolan forces the spectators to
experience each scene in a similar manner to how Leonard would have
experienced it. Further, the order of these scenes is reversed regarding
the timeline: the first scene in the movie is the last, chronologically
speaking.37 By using these narratological techniques, the audience
experiences each scene without any prior context.38
More importantly, a conventional narration (linear in terms of time)
would fail to properly tell the story in its full range. First, a linear
narration would not adequately convey the inability of the character to
store memories, as it is challenging to audiovisually represent the
moment when the character stops remembering information. Second, the
story would lose its suspense, which is generated precisely by using
37. The movie also includes black-and-white scene inserts that are presented chronologically, although the events happen prior to the main story. In a classical narrative, these scenes would be considered flashbacks.
38. Nevertheless, it is important to specify that an audience member with no amnesia would store the information provided by the movie and ultimately reveal the whole story.
Leonard’s own disability. At the end of the movie,39 audiences discover
how Leonard decided to write a lie meant to manipulate his future
actions. He knew that he would forget about his forgery and would
accept what he wrote as a fact.
Memento shows that an audiovisual sequence does not capture
reality in its entirety, regardless of the level of sophistication. In fact, this
is also true for everyday human interactions. For instance, a short
interaction with a person who suffers from this type of amnesia would not
automatically provide enough information to conclude that the person
has this condition, unless it is explicitly stated. It would require
a longer interaction and some questioning in order to be fully aware of the
disability. However, movies rely on telling stories in which the psychology
of the characters is clear. Hence, it is because of its willingness to
provide a broader portrayal of the “reality” (compared to what would be
possible to discern by inspecting a situation with just human senses) that
narrative cinema cannot become a perfect illusion. Instead, narrative
cinema develops into a more powerful tool for transmitting meaning than
any perfect illusion would ever be able to achieve. In addition, cinema has
an added artistic value beyond its purely narrative capability. Quentin
Tarantino’s decision to divide the plot of Pulp Fiction into four different
39. Therefore, the first action (chronologically speaking).
stories is not only challenging in terms of screenwriting but it is also a
decision that adds artistic value to the movie. It is on these aspects that a
discussion on film diegesis should be rooted. In other words, defining
how the diegesis is created requires taking into consideration the fact
that pure sensory depiction does not fully represent “reality”, and that
cinema is an art form beyond its audiovisual representational capabilities.
Before defining a model based on these two viewpoints, I will first explore
selected relevant approaches extracted from recent scholarship, which
mainly focus on the theoretical models defined by Daniel Yacavone and
Benjamin Winters. In addition, I will revisit the model of the "filmic
world" proposed by Souriau, who coined the term diegesis in its application to film.
Theories of the Film World
There are numerous sources (Cecchi, 2010; Neumeyer, 2009;
Smith, 2009; Stilwell, 2007; Winters, 2010) that provide a historical review
of the theoretical evolution of the term diegesis in film. In this section, I
will summarize some of its most relevant topics, especially in reference to
Daniel Yacavone’s article Spaces, Gaps and Levels: From the Diegetic to
the Aesthetic in Film Theory (2012), in which the author discusses most of
the concepts presented in previous research. In this article, Yacavone
proposes a model of film diegesis based on an aesthetic framework that
includes a multilayered approach to movie comprehension.
Although Souriau (1951, pp. 231-240) initially adapted the term for
film, Gérard Genette established it in its modern meaning in
Narrative Discourse: an Essay in Method (Genette, 1980) by freely utilizing
and simplifying Souriau’s concept. In screen music, Claudia Gorbman
implemented Genette’s definition in her influential book Unheard
Melodies: Narrative Film Music (1987). Gorbman’s essay became pivotal
for most of the subsequent research in film diegesis, as it emanated from
the nascent field of screen music scholarship. For Gorbman and Genette,
defining the diegetic world was a means to establish a dichotomy
between the diegesis and the extra-diegesis. The narrator, and most of
the music, would be part of the extra-diegetic layer, in charge of providing
narratological meaning for the narrative. Thus, music would have a
narrative function by being part of the narration. In this reductive
dichotomous view of film experience, the diegesis became linked to the
audiovisual material of the film. If each filmic element is directly classified
as either diegetic or not, the visual material (along with the sound that
would be considered indexical to the visuals) becomes diegetic,
while the narrator, or the background music, is not considered
part of the world of the movie. Associating the diegesis with the visual
content of the film is problematic and restrictive, as I will argue below.
Nevertheless, this association became widespread in most of the
scholarship related to the diegesis published after Genette and Gorbman.
Stilwell’s Fantastical Gap (Stilwell, 2007) relaxed the dichotomy40
by adding a space between both areas. The Fantastical Gap evidenced
that music could cross between both spaces (diegesis and non-diegesis)
and reside in an area of ambiguity: the gap. Even though this is an
improvement over a dichotomous model, the gap is only meant for music,
which ultimately prolongs the association between the diegesis and the
visuals:
Stilwell does not pursue the many larger implications of the
presence of this sort of meaningful gap between the diegetic and
nondiegetic as one that the film and by extension the viewer may
figuratively occupy. Nor does she acknowledge that there are
many such gaps in films, ones which do not always involve music
or sound, or, indeed, narrative, and are of potential equal
significance in the experience of a film. (Yacavone, 2012, p. 33)
In addition, Stilwell’s approach perpetuates the assumption that
there is only one level of meaning in film, against which objects need to
be classified.41 As described in Memento’s plot discussion, the movie’s
singular narration technique empowers its audience to experience the
world within Leonard’s cognitive limitations. This technique alone
40. Diegetic vs. non-diegetic.
41. Being diegetic or not (or forming part of the gap).
generates a new level of meaning, in addition to the level portrayed by the
main narrative that describes the events of the story. Yacavone argues
that “such gaps can exist only because, as previously noted, there are
many more levels of meaning and experience at play in a film work than is
often acknowledged" (p. 33). Hence, a multi-leveled approach is
necessary in order to better define the diegesis.
Yacavone qualifies Winters’ approach in The Non-diegetic Fallacy:
Film, Music and Narrative Space (2010) as antirealist. By realism, both
authors accept a Bazinian approach that defines realism as a set of filmic
techniques that intend to mimic the “real world”. Therefore, as “real life”
is not underscored, music is regularly placed outside of the diegesis
unless it comes directly from the characters’ actions. However, there is
music that is routinely qualified as nondiegetic (Winters mentions the
Indiana Jones theme as one of the clearest examples) that seems to
belong to the world of the movie at the same level as its characters (in the
case of Indiana Jones, its theme is as important as the character’s attire).
In consequence, Winters argues against qualifying cinema as a realistic
experience. Yacavone (2012) elaborates on Winters' position by stating:
From the viewer’s perspective, this music is part of the same
presented reality – the same audio-visual, perceptual-imaginative,
spatial-temporal cinematic experience – as its characters, for
instance. Yet it is not, for these reasons (or any others), part of the
same narrated, represented, or fictional reality. (p. 23)
In order to provide a better foundation for this new theoretical
framework for the diegesis, Yacavone and Winters argue that Christian
Metz’s approach to the term, as it appeared in his seminal book Film
Language (1974),42 is more appropriate. Metz includes his theoretical
model in a discussion on film denotation because “the concept of
diegesis is as important for the film semiologist as the idea of art” (p. 97).
Diegesis, for Metz (1974),
designates the film's represented instance […], that is to say, the
sum of the film's denotation: the narration itself, but also the
fictional space and time dimensions implied in and by the
narrative, and consequently the characters, the landscapes, the
events, and other narrative elements, in so far as they are
considered in their denoted aspect. (p. 98)
Denotation is a semiotic term that refers to the obvious or literal
meaning of a sign. In contrast, connotation involves a meaning that is
culturally processed. However, denotation is often associated with a
widely shared cultural convention, making its distinction from connotation
vague. Yacavone simplifies the definition of denotation as "one thing
standing in for another, via the assignment of a relevant and conventional
label, a name, to some perceptual form" (Yacavone, 2012, p. 28). In
cinema, the denoted reality “always includes that which a film takes, or,
more appropriately, borrows from the ‘real world’, and implicitly, that
42. Similarly to Souriau, Metz's viewpoints have been ignored by most cinema and screen music scholars in favor of Genette's principles.
within it from which the film deviates or that it rejects” (p. 29). Thus, a
viewer “actualizes this inescapably referential level of a film on the basis
of the prior knowledge and experience that he or she brings to it,
including that of what is and is not fictional” (p. 29). The audiovisual
content of a movie becomes an incomplete rendition of the fictional world
it creates, which audiences fill with the information taken from their
perceived everyday reality. By acknowledging that the movie material
alone provides incomplete information for creating the world, this
theoretical framework opens the door to including elements that do not
apparently generate the world of the movie (commonly defined as
extra-diegetic) as contributors to its creation.
In addition, Yacavone (2012) stresses the importance of the
aesthetic level in cinema. This is because filmic elements may have a
function beyond being denotational, narratological or narrative. For
example, the decision to not use music or sound in a scene signals an
aesthetic decision instead of just a technical or a narrative choice. By
cutting the sound, the movie loses part of its apparent realism, but this
does not affect the level of realism of the diegesis.
Yacavone’s (2012) account of diegesis is innovative, as he merges
the key elements of the diegetic model in a cohesive framework that
places the diegesis in a multilayered interpretation of the filmic
experience. However, the resulting model still holds a vision of the
cinematic experience anchored to a model that conceives of the
audiovisual content of the movie as an illusion. This is why realist
accounts of the filmic experience are problematic for the author, who
ultimately chooses an approach that distances cinema from realism. I
argue that Yacavone’s problem is not in the concept of realism itself but
in the definition of reality. In fact, his definition of realism (as well as
Winters’) supports McLuhan’s critical description of Western society as
being visually biased.
However, approaching the diegesis within a visual bias is
problematic when intending to provide a comprehensive vision of the
worlds of the movie. For example, it is not possible to consider a piece of
music that transmits the character’s feelings (especially when those
feelings are not explicit in the audiovisual content) as a part of the
diegesis while maintaining a realistic approach. Conversely, if the music is
qualified as non-diegetic, the diegetic elements would become realistic at
the cost of neglecting the role of the music in shaping the diegesis.
Hence, a world where music shapes the feelings of people cannot be
considered realistic. This is why Winters distances himself from a realistic
position. For instance, in approaching the diegesis of Star Wars, Winters
(2010) argues:
To imagine the film-world of Star Wars … as one saturated with
the ‘sound’ of music (whether or not the characters hear it as
‘sound’) seems perfectly acceptable to me as a filmgoer. It even
seems possible that Luke’s engagement with the force allows him
to ‘hear’ and manipulate the film’s music. (p. 233)
However, a model of moviemaking that distances itself from
realism is problematic when attempting to interpret the audience’s
engagement with the narrative, or when attributing mythical content to
stories portrayed in movies. For example, it is challenging to describe a
horror scene without associating it with a realistic perception. Without
considering realism, an analysis would fail to explain why audiences are
scared or feel anguished. This is why I will propose a framework that
preserves the idea of realism. In doing so, the model will naturally engage
with the concept of hyperreality described in the previous chapters.
Defining the Diegesis and a Semantic World for a Movie
I began this chapter by stating that from a pure audiovisual point of
view, there is no diegesis. In the previous chapter, I argued that cinema
cannot be considered an index of reality. In addition, in this chapter I
have discussed how becoming a perfect illusion would defeat the
purpose of narrative cinema. Moreover, cinema is something significantly
more powerful in terms of meaning than a perfect illusion could ever
aspire to become. In order to approach what exactly the diegesis is and
how it is generated, Winters argued in favor of distancing cinema from
being a realist medium. As I have already stated, this approach fails to
describe how audience engagement is achieved within a non-realistic
medium.
I will argue that the problem is not located in the concept of
realism but in the concept of reality itself, along with how reality is
perceived and represented. Instead, cinema should be considered a
realistic medium, even when the audiovisual material that comprises the
movie may not be realistic. In order to ground the discussion in a wide range
of situations, I will begin by describing two relevant scenes in Kubrick’s A
Clockwork Orange (1971). The movie is one of the most exemplary cases
of a complex interaction between the audiovisual material and the
diegetic world, and it will later serve as an example for the definition of
the diegetic framework.
A Clockwork Orange
Stanley Kubrick’s A Clockwork Orange presents a variety of
different musical situations that challenge most of the current approaches
to film diegesis. I will focus on two types of scenes that portray two
important features in terms of the diegesis.
Fight Scenes
In the book on which the movie is based, the Nadsat43 language
helps to alleviate the violence at the beginning of the narrative by
obscuring the comprehension of what is written. However, employing
Nadsat with a similar purpose was not possible in the movie because the
audiovisual material would overpower the language. In addition, the
shorter length of the movie44 negates the possibility for audiences
to slowly learn it, as is the case for the book. Kubrick provided
a cinematic equivalent: montage sequences with classical music that
softened the extreme violence for some of the initial fighting scenes. In
one particular scene, the cinematic result is a choreographic fight
accompanied by Rossini’s waltz-like music. The scene begins when Alex
and his friends arrive at an old theater that is in ruins, where they find a
rival mob raping a girl. Music from Rossini’s La Gazza Ladra is heard
43. Nadsat is an argot invented by Burgess in his book A Clockwork Orange. The teenagers in the novel and the movie speak it. The words of the argot are derived from Russian, adapted to an English pronunciation (for example, droog means friend). In addition, there are some words and idioms that come from childhood expressions and from Cockney rhyming slang (for example, Appy Polly loggy for apology).
44. In comparison to the time it would take to read the whole book.
throughout the scene. However, the music does not come from the theater, and it is mixed louder than it would be if it came from the stage. The fight between the members of both gangs is choreographed to follow Rossini’s music. As a result, the scene portrays an artistic representation of a fight that eludes any apparent direct connection with realism, as the characters dance to the pace of a piece of music that they cannot realistically hear.
The Orchestral and Electronic Versions of Beethoven’s 9th Symphony
The simultaneous and interchangeable usage of two distinct
versions of Beethoven’s 9th Symphony (an orchestral performance and
Wendy Carlos’ electronic version employing the Moog synthesizer)
proves to be even more challenging when attempting to define its place
in the world of the movie. As the two versions are used interchangeably, a single object from the filmic world is presented in the aural track in two significantly distinct forms. In addition, the difference between the versions is significant enough to be readily noticed by the spectators, especially considering that the original piece of music is widely known in Western society.
Cinematic Realities in Etienne Souriau’s Model
Winters’ (2010) critique of film realism is grounded in the
assumption that the audiovisual material that the cinema presents to the
audience is a depiction of the world of the movie. Therefore, as the movie
world significantly differs from the real world (the movie world has
underscore music, for example), it cannot be considered realistic.
Gorbman (1987) attempted to resolve the issue by adding a level of
narration that was extraneous to the narrative. However, this added level
neglects the diegetic role of underscore music, which is the reason for
Winters’ argument in favor of recognizing cinema as a nonrealistic
medium.
Winters (2010) correctly points out that Étienne Souriau, who
originally coined the term for its use in film theory, placed the diegesis as
just one of the seven levels of reality in film.45 Souriau’s model is generally
not discussed in the aforementioned scholarship related to film diegesis,
although he is usually credited as the creator of the term. Claudia
Gorbman included a brief quote extracted from his definition of the term,
which is usually cited in other texts. This is surprising, as Souriau’s
definition of the diegesis cannot be separated from the rest of his model
of filmic levels. Hence, it is worth revisiting Souriau’s approach in order to
45. To be precise, Souriau qualifies them as kinds of reality: levels involved in the structure of the filmic universe (Souriau, 1951, p. 234).
better understand how the original conception of diegesis integrated into
his complete model for the filmic universe.
Souriau (1951) described his model in the article La structure de l’univers filmique et le vocabulaire de la filmologie (pp. 231-51). He later incorporated the model into the preface of L’Univers filmique (Souriau, 1953, pp. 5-10). Before describing these levels, Souriau differentiates two
main spaces, which he considers to be significantly different. This
differentiation, which has been ignored in the literature, is fundamental in
order to grasp Souriau’s theory for the filmic world. The following
paragraphs are a translation of his description of these two levels46:
On one hand, there is the screen, which frames all the visual
material in a rectangular fixed plane with constant dimensions and
position. Everything that is given to me to see, everything that my eyes
perceive is within this frame. This is a basic fact.
However, I also perceive a completely different space, which is infinitely wider and three-dimensional. I obtain it by processing the illusion, by employing cognitive, perceptual, reconstructive and
46. The following lines significantly revise the impression of Souriau’s point of view on the diegesis that is taken from Gorbman’s brief quote. Gorbman’s citation came from a short definition that he provided at the end of the text, intended as a partial summary of his viewpoints that assumes the rest of the text has already been read.
imaginational operations. […] In short, the film causes a whole
topography to emerge: the space where the story takes place.
These two spaces are clearly distinct. In order to avoid confusion, I
will give them two names without providing any particular justification.
They should be considered just as convenient labels. The first will be the
“screenic” space. The other, if you accept it, will be qualified as the
“diegetic” (from the Greek διήγησις, diegesis: narration, storytelling).
Therefore, two spaces: first, the screenic space, which includes the play of light and darkness, the shapes, the phenomena that are visible onscreen; second, the diegetic space, which is constructed only in the mind of the spectator (and assumed or built by the creator of the screenplay): a space in which all the presented events are supposed to happen, inhabited by the characters, from which I comprehend the scene represented before me. (Souriau, 1951, p. 233)
Souriau presents a dichotomy that differs from the model
described by Gorbman. For Souriau, the diegesis is an imagined space
that contrasts with the “reality” perceived, which is projected onscreen. A
realist approach emanates from Souriau’s thesis when he assumes that
the diegetic reality is an imagined world, a product of our cognitive
operations, where the characters live. His framework connects with the
imagination theory for storytelling discussed in the previous chapter. It
also implies that movies have multiple levels of signification when he
differentiates between the projected reality and the diegesis. It is from
this viewpoint that Souriau’s levels for the filmic universe emanate. Put
succinctly, Souriau (1951) proposes the following levels:
1. Afilmic reality: the reality that exists independently of the film.
2. Profilmic reality: the objective reality photographed by the
camera.
3. Filmographic reality: the film as a physical object, structured
by techniques such as editing.
4. Filmophanic reality: the film as projected on a screen, together with its sound.
5. The diegetic space: an imaginary world created by the
spectator in which the story is supposed to happen and the
characters are supposed to inhabit.
6. Spectatorial facts: the spectator’s subjective perception and
comprehension of a film, influenced by their psychology and
personality.
7. The creatorial level: anything that relates to the filmmaker’s
intentions as a creator. This includes extra-filmic choices,
such as casting an actor or using a song in order to promote
it. (pp. 233-51)
The profilmic, filmographic and filmophanic levels correspond to the overall screenic space defined above, although there are some nuances that might be relevant in specific analytical situations. In
addition, Souriau’s definitions need to be updated to align with current
practices. With the use of CGI, color correction, or motion capture, the camera is
no longer pivotal in the process of capturing audiovisual information.
Therefore, the profilmic reality should be reformulated as the set of
captured raw material from which the film is generated. The difference
between the filmographic and the filmophanic realities is slight and almost
irrelevant in a discussion on realism and diegesis. It has had, and retains, relevance in terms of archival analysis, because the filmographic level concentrates on the physical medium that stores the film.47 Metz provides
an example of the distinction between the two levels: “For example, 24
filmographic images correspond to one second of filmophanic time”
(Metz, 1984, p. 8). The filmographic level would distinguish a film at 24 frames per second (fps) from one at 48. For example, Peter
47. In the digital era, this would be equivalent to the properties of the format employed to store the movie digitally (resolution, type of compression, frame rate, sample rate, etc.).
Jackson’s The Hobbit (2012) was recorded at 48 fps. This case features one of the rare situations where discussing the filmographic level might become relevant in relation to the perception of realism. Thus, these three realities constitute the screenic level defined
above, although they may be recalled separately if necessary.
In addition to these three realities and the diegetic space, Souriau
outlines three additional levels that are of great relevance for their
interaction with the diegesis. Souriau (1951) acknowledges the
importance of the afilmic level, which he defines as the real and ordinary
world that exists independently outside the film (p. 234). He believes that
it is a strongly relevant level of reality to consider when analyzing a filmic
universe. He demonstrates this importance by defining realism as a
sincere expression of an afilmic universe. Similarly, Souriau states that the documentary cannot be defined except as an image of a reality that is part of the afilmic universe (p. 234). Applying Souriau’s definition of the
afilmic reality, this level becomes relevant as it serves as a model for the
assimilation of the screenic reality and the construction of the diegetic
space. The reconstructive process that aims for the creation of the mental
representation for the diegesis draws on the information extracted from
the afilmic reality.48 In addition, the afilmic level channels the intertextual
relationships between different narratives or, more generally, between a
movie and any piece of human knowledge.49 In acknowledging the
everyday world that film audiences inhabit, Souriau stresses the
importance that the information from this world has on comprehending
the film and constructing the diegetic space.
The creatorial level naturally interacts with the afilmic reality, which is, in practice, the experiential reality. It may include the philosophical,
theoretical or artistic intentions of the filmmakers. Employing elements
from Baudrillard’s ideas of the hyperreal in The Matrix is an example of
this level and how it draws from the afilmic reality. Baudrillard’s ideas are
significant in building the movie’s narrative, although they are mostly not
explicitly present either in the screenic reality or in the diegesis. Similarly,
the bullet-time effect is a creational device that is implemented in the
screenic reality only.
Finally, the spectatorial facts highlight that the film requires the
interpretation of its spectators, and that this process of interpretation is
not neutral: it is the product of the spectators’ afilmic background and
experiences. As Souriau emphasizes, everything that involves the subjective psychology of the viewer belongs to the spectatorial level (p. 238).
48. For example, there is no need to watch a person sleeping in a movie to infer that the person regularly sleeps.
49. Such as recognizing the actor who plays a character, or the interaction between Baudrillard’s philosophical views and The Matrix.
These experiences interact with the screenic reality and the creatorial
level in order to generate the diegesis. Furthermore, Souriau remarks that
the spectatorial facts extend beyond the duration of the filmophanic time
(the duration of the movie), as a particular movie might have an effect on
the spectator’s viewpoints. In terms of the diegetic space, the spectatorial
facts reveal that it is not a unique or objective model. Instead, the
creation of the diegesis depends on the subjectivity of each spectator.
The diegetic space contrasts with the screenic levels, which are objective in that they refer purely to the audiovisual content of the movie.
Hence, a key concept in Souriau’s model is the differentiation
between the film as an objective object and the subjective parts of the
filmic experience, which start with the creation of the diegesis. However,
most of the scholarship related to the diegesis assumes that the images projected on the screen constitute the diegesis
(Barham, 2014; Buhler, 2013; Cecchi, 2010; Gorbman, 1987; Kassabian,
2013; Neumeyer, 2009; Smith, 2009; Stilwell, 2007; Winters, 2010;
Winters, 2012; Yacavone, 2012). This vision is concomitant with the visual
bias of Western society described by McLuhan.
Winters’ reference to Souriau’s levels is an attempt to distance the diegesis from Gorbman’s narratological approach. However, he
states that, based on Souriau’s levels, “diegesis indicates the existence
of a unique filmic universe, peculiar to each movie” (Winters, 2010, p.
226). In other words, he asserts that there is a unique connection between the film and its diegesis, which is not Souriau’s vision, as the diegesis is imagined by the spectator.50 Furthermore, this viewpoint
implicitly defends a diegetic world tied to the screenic material.
It is within this conceptual framework that, referring to Souriau’s
levels and his conception of the filmic universe, Winters (2010) states:
More importantly still, nothing in this description justifies the
automatic exclusion of music from the diegesis, since the
presence of music in the space of the filmic universe might be
considered an aspect specific to a particular film, whether realist or
fantastic in its aesthetic. (p. 226)
However, this is not exactly Souriau’s point of view. In discussing
the music that accompanies film, Souriau (Metz, 1984) argues that it “is
only related to the filmic universe in a global, atmospheric or syncretic manner […]. The topical anchor is not inherent. We are rather dealing with
a moral relationship, a sympathetic relationship, with a more or less
expressive function” (p. 7). In Souriau’s vision, music is part of the
screenic (or filmophanic to be precise) space similar to where the editing
decisions are made (cuts, framing, etc.). For Souriau, there is no nondiegetic space because the diegesis does not interact at the same level as the filmic elements. Thus, a piece of music would only populate the diegesis imagined by a particular spectator if the spectator chooses.
50. Therefore, there are as many diegeses as spectators for a given movie.
This
does not imply that a piece of music absent from the imagined diegesis fails to contribute to the creation of the imagined world, just as a close-up might contribute to filling in some of the details of the diegesis without implying that the world is populated by close-ups.51
Therefore, stating that the theme from Indiana Jones is as diegetic
as his whip can only be explained by defending, as Winters does, an
antirealist view of the cinematic perspective. However, denying the
realistic experience of the movies becomes extremely problematic, as I
mentioned before, and as I will continue to argue in the following section.
Realism, Verisimilitude and the Diegetic World
I defend an interpretation of the movie experience that considers
realism as an integral part of the medium. Without it, the movie would not
be able to rely on the audience’s experiences in order to fill in the gaps
that the plot does not set out explicitly. It is important that spectators can
assume that the main laws of physics will apply to the characters and the
events of the movie, so there is no need for the movie to establish that, in its world, gravity acts exactly as it does in the physical world, for instance. Consequently, the movie’s content will only need to establish the laws of gravity if they differ from the laws in the physical world.
51. In this case, it is challenging to even attempt to describe what a world populated by close-ups would look like.
Similarly, audiences assume that most of the characters will adhere to
commonly accepted human behaviors and have a certain physiology.
This also applies to most aliens in movies such as Star Wars (1977), which are expected to have mostly humanoid features. In other words, the assumption that movies are generally realistic allows them to use elements from the afilmic reality level that complement the information provided by the filmic level. This perspective is coherent with Souriau’s levels and the
importance of the spectator’s subjectivity in constructing the film world.
However, Winters is correct in arguing that underscore music
might help to create the diegesis and the characters. Music can even
serve to establish a general mood for the scene as if it were “wallpaper”
(Winters, 2012). As stated before, being part of the creation of the
diegesis does not necessarily imply being part of the diegesis itself. For
instance, a film sequence consisting of different shots will similarly assist
in creating the diegesis, although the concept of a film edit or cut will not
be part of the diegesis. Yet, Winters does not consider that film editing
challenges the conception of the diegesis in a similar manner to music.
I propose a definition of the diegetic world that detaches the
diegesis from the screenic level, in accordance with Souriau’s model. Part
of the audiovisual material of the movie (the screenic content) will serve to
assist the creation of the diegesis. However, the entirety of the screenic
content will not necessarily be utilized for the diegesis, nor will it be
directly represented in the diegetic world. Similarly, the diegesis will not
be exclusively built based on the audiovisual input, as the spectators will
complement the inputs from the movie with their knowledge and
experiences of their perceived world. In accepting these premises,
cinema necessarily becomes a realistic medium, as it generates a
diegetic world based on the perceived reality of the spectator. However,
this does not imply that the screenic content is realistic, per se, or that
the diegesis is a replica of the perceived world. In fact, Prince’s
perceptual realism stresses a model of realism that is referentially false.
Thus, the dinosaurs of Jurassic Park are perceptually realistic even
though they are referentially false for the world in which the spectators
live.
For screen music, I would argue against considering underscore music in a similar manner. This is because, given the context, the dinosaurs in Jurassic Park are verisimilar, whereas underscore music is not. The dinosaurs have enough links with models of reality to be
considered plausible in the fictional world of the movie.52 This would hold true even for a character as far removed from the humanoid as Jabba the Hutt in Star Wars. Jabba becomes verisimilar with regard to the fiction
portrayed. Conversely, underscore music, film editing, voiceover
narration, or printed titles are entirely different in terms of verisimilitude.
This is because they belong to another level of signification.
If these premises are accepted, narrative cinema becomes a realistic medium by generating a diegetic world that is verisimilar, regardless of whether the objects of that world are referentially false. This
definition adjusts the conception of realism into a more manageable
model, as it focuses on the spectator’s perception and expectations of
the world and the fiction. In addition, it further distances the diegetic
world from the audiovisual material of the movie. Moreover, this approach
does not attempt to interact with reality, which is an abstract idea, but
with how reality is perceived by each individual. Realism engages with the subjective appreciation of each spectator rather than with an objective, abstract concept that cannot be defined.
52. Part of the movie’s narrative is dedicated to explaining and justifying how the imaginary scientists of Jurassic Park were able to create dinosaurs.
Building the Diegesis
I began the chapter by stressing the importance of the term
imaginary when discussing the diegesis. In Chapter IV, I described Huron’s model for human imagination and its importance, as it allows experiencing the outcome of a situation without having physically lived it. Imagination helps to reveal the importance of narrative
storytelling in human culture. Narratives are used as a means to imagine
the outcomes of a variety of situations. From this viewpoint, conceiving
the diegesis as an imaginary world does seem natural.
In order to define the process of building the diegesis, it is
important to take into consideration a multilayered model in accordance
with Yacavone’s and Souriau’s propositions. Generating a diegetic world
based on the spectator’s afilmic reality does not exclude the process of
appreciating non-realistic aspects of the movie. Examining the creation of
the diegesis of an opera or a movie musical should clarify this statement.
If their diegetic world were built as verisimilar, it would naturally exclude
the fact that the characters sing to each other instead of talking. Singing
is an essential part of those artistic manifestations, which is appreciated
on another level of signification (Souriau’s creatorial level, for example). I
believe that this is not different from being cognizant of philosophical
concepts suggested in a movie that generally would not be part of the
diegetic world.
Since imagination is crucial for human survival, the creation of the
diegesis may become a quasi-automatic process once the cultural codes
associated with moviemaking are absorbed. In fact, the process of
diegesis creation should not differ greatly from the process of perceiving unfamiliar spaces. People are not required to spot the restrooms in a restaurant in order to assume that the restaurant has them.53 Moreover, they can most probably find them without even asking the staff, by forming a mental representation of their probable location based on inputs from the environment and previous knowledge gained from having been in other restaurants.
Comparably, creating the diegetic world of a movie involves
incorporating information from the audiovisual material. This is achieved
by, first, properly decoding this material by using a diverse set of codes;
second, merging the audiovisual material with previous experiences from
the afilmic reality of the viewer; and third, if necessary, filling the gaps
using a process of hypothesizing. The last process highlights how the
diegesis is not only an imaginary entity but also a world that evolves
according to the new information gained during the unfolding of the
53. Having restrooms in a restaurant is not only a legal requirement but also a cultural convention.
movie’s events. From this perspective, information provided at the
screenic level will constantly be employed as a means to reveal or
reshape aspects of the diegetic world. This is how the screenic material dynamically connects with the diegesis: feeding in new information as the narrative unfolds.
The beginning of The Matrix serves as a good example, as it slowly
unveils the rules of the virtual world in which humanity is enslaved. It also illustrates how the creation of the diegesis draws greatly on the previous experiences of each audience member. The first scene shows Trinity (played by Carrie-Anne Moss) escaping from the police and the agents. At
the beginning, it is reasonable to expect that the diegetic world of the
movie is similar to the perceived reality.54 Later on, Trinity is seen
performing actions that defy the laws of gravity of our planet. The
spectators will necessarily reshape their rendering of the diegetic world in
order to accommodate these otherwise extraordinary abilities. At this
moment, it is probable that the spectators assume that Trinity, and the
agents, have some kind of superpower. Assuming that they have
superpowers implies previous knowledge of some iconic elements of
modern popular culture.55 This assumption is made solely by using
information from the afilmic reality and it does not involve, once the
54. At least, to the perceived reality of 1999.
55. Superhero movies or comics, primarily.
audience has witnessed Trinity’s impossible acrobatic movements, the
screenic material.
However, there are no clues to imply that she is a superhero in the
sense that Superman or Spiderman are. At the end of the opening scene,
Trinity mysteriously disappears inside a telephone booth after she
answers the phone. From the viewer’s perspective, the only information
they have about her disappearance comes from the agents who
acknowledge her escape. At this moment, the reason for her vanishing from the telephone booth remains uncertain56 and may generate a
different set of hypotheses57 for the plot and for the diegetic world. In
addition, the image track has a higher degree of green than expected, which may or may not be noticed by the audience.58 It also incorporates a new camera shooting style, the now well-known bullet-camera effect, which suspends Trinity in the air when she starts fighting.
However, I believe that the bullet-camera effect does not reshape the
diegesis by assuming that Trinity is able to stop time when she jumps, as
56. The moment when, and how, she disappears is not shown in the movie.
57. The concept of hypothesis-making for a movie narrative is thoroughly discussed in Bordwell’s book chapter The Viewer’s Activity (Bordwell, 1985, pp. 29-47). In this text, Bordwell surveys the principles of cognitive science as applied to the processing of narrative movies.
58. If spectators notice the green filter of the image, they might imagine a greener diegesis or they might perceive it as a creatorial device.
the shooting technique is perceived as an artistic cinematographic
device.
The beginning of The Matrix illustrates how the diegetic world is
produced by initially assuming that it is as close as possible to the
perceived world. The audiovisual (screenic) inputs that differ from this
perception are introduced in the diegesis as alterations to the assumed model. It is not until Trinity performs the physically impossible acrobatic attack that she ceases to be imagined as an ordinary human being.
When, later on, it is revealed that the world witnessed at the beginning of
the movie is just a computer simulation that allows for the transcending of
the laws of physics, Trinity will become, again, a regular human being. In
sum, the screenic material populates a preexisting set of templates for
the diegetic world, which exist as models of the spectator’s perceived
reality. This example highlights how spectators assume that the laws of
gravity should apply to the diegetic world without requiring an explicit
confirmation. Instead, the movie’s content will only need to be explicit on
the alterations to the assumed rules, as in the case of The Matrix.59
Furthermore, the case of opera or movie musicals reveals that
there are certain genre-specific rules that commonly apply. If the
spectator is aware of those rules, the diegetic world will be built by taking
59. In addition, revealing the transformation of the rules serves as a tool to generate suspense.
them into consideration. In Questions of Genre, Stephen Neale (1990) defined generic verisimilitude as the process by which an action becomes verisimilar due to the genre of the encompassing movie. For Neale,
singing in musicals is generically verisimilar. For the process of diegetic
building, acknowledging the existence of a generic verisimilitude implies
awareness of the generic conventions that govern a particular genre.
To further exemplify the process of constructing the diegesis, I will provide three additional cases that feature particularly relevant situations. It is common to use a printed title to state the location of a particular scene. It is also clear that employing printed titles is not meant
to imply a diegetic world containing flying letters. Instead, printed titles
have an equivalent function as an establishing shot or a musical idea that
signifies location.60 The information provided by the printed titles will
serve to assist the construction of the diegetic world. An establishing shot
of New York City (NYC) containing a shot of the Statue of Liberty and
another of the Empire State Building serves as a signifier of NYC only if
the spectators associate those images with NYC. If this is the case, they
will use their previous knowledge of NYC in order to quickly populate the
diegetic world. Similarly, an accordion tune will only assist in locating the
60. Such as using accordion music for Paris, or a duduk for a Middle-Eastern country.
diegesis in Paris if, first, the spectator is aware that an accordion codifies
Paris and, second, if the spectator has previous knowledge of the city.
The Indiana Jones main theme complements the information about
the character for the diegetic world in a similar manner. In addition, it has
its own artistic and stylistic value. Similarly, a fragment of underscore
music that evokes an ominous mood will reinforce a mysterious or
dangerous situation by providing the emotional content that the scene
would lack otherwise. In that particular situation, it is the emotional content of the music that enters the diegesis, rather than the music itself as an entity. According to Huron (2007, pp. 8-9), the objective of imagining is to
produce emotional outputs that are equivalent to experiencing the
situation. If so, the music in this scene would be responsible for
completing the emotional content, allowing the audient to fully imagine
the situation.
Lastly, let us examine the sound of a closing door. In a movie, this
sound is generally louder than expected when its encompassing action
(closing the door) has narrative importance. Conversely, the same sound
will be barely noticeable when the action is irrelevant in terms of narrative
building. However, it does not seem reasonable to infer that this
phenomenon implies a diegetic world where doors adapt their closing
sound to the importance of the action that made them close. In this case,
the varying volume of the sound of a closing door does not interact with the diegetic level. Instead, it acts as a narrational device that helps unfold the narrative.
The Diegesis and the Hyperreal
By linking the creation of the diegesis with Huron’s imagination
process, the diegetic world arises as a form of simulation. Further, as a
fictional world, the diegesis emanates from the utilization of models of
perceived reality. Thus, when analyzing how the diegetic world is
constructed, it is revealed that any diegetic space is hyperrealistic. From
a narrational theoretical framework, cinema should not significantly differ
from other means of storytelling in how its diegetic world is generated.
For example, the diegetic world of a novel would be generated by utilizing
the written text in a similar manner as the movie uses its screenic content.
However, McLuhan (1964/1994) pointed out the difficulties that written
language has in replicating oral communications, along with the amount
of effort necessary when attempting to portray what would otherwise be
simple nuances in oral expression. In addition, the creation of the
diegesis relies entirely on the reader’s previous experiences and their
capacity to imagine. Generating a diegetic model for Avatar’s Pandora in
a novel would depend entirely on the audiovisual experiences of the
readers, alongside their imaginations. As a consequence, a novel does not allow sharing models of the real that will surprise the reader. The diegetic model for a planet like Pandora based on a purely literary source will only draw from the reader’s imagination and experiences.
Instead, cinema is able to mimic perceptual inputs, in terms of
vision and sound. Literature lacks this mimicking capability, which
deepens the relationship between cinema and hyperreality. In the case of
Pandora, the movie Avatar is capable of describing an imaginary planet
by using models of the real with which the spectator might not be
familiar. Moreover, cinema is capable of employing these audiovisual
inputs without becoming a pure illusion. Underscore music is a relevant
example, as it is created by utilizing a product of Western culture (a
symphonic orchestra) without expecting to generate an illusion of its
presence (the impression that a symphonic orchestra is in front of us). The dinosaurs of
Jurassic Park, or an actor portraying a character, should be similarly
understood. For example, Harrison Ford is the source for creating the
character of Indiana Jones, without implying that an illusion of Ford’s
presence will emanate in the diegesis. The generation of the movie’s
diegesis will be focused on the character (Indiana Jones) and his world,
and it will not likely include the actor and his world. Following Souriau’s
(1951) model, identifying who the actor is could be part of either the
creatorial or the spectatorial level in relationship with the afilmic reality.
Similarly, it is not expected (although not impossible) that Harrison
Ford will be confused with Indiana Jones in his everyday life.
The dinosaurs of Jurassic Park differ from Indiana Jones because
they are created without a specific referent from the perceived reality.
Although models from the perceived reality are used, the final product is,
as Prince described, perceptually realistic. However, there is evidence of
the dinosaurs’ existence, which situates them as part of Earth’s past
history. The manner of how humanity may depict those dinosaurs is
influenced not only by the renditions generated by archeological findings
but also from their representation in movies like Jurassic Park. Thus, the
realistic model of a dinosaur may be mapped after its depiction in
cinema. In the case of gunshot sounds, the relation between perceived
reality and cinema becomes even more noteworthy. Thankfully, a
significant number of Western audiences have not been directly exposed
to war. Consequently, they do not have direct experience of how different
gunshots and missiles sound. Even in the case that they have heard live
gunshots, it is still highly probable that the source of the majority of the
gunshots that they have heard was cinema, television or videogames.
Therefore, it is reasonable to believe that the experiential model of a
gunshot sound is generated, for most people, by how it is depicted in the
movies and other media. It is from this viewpoint that movies can
influence the perception of events in the everyday lives of their
spectators, which is an instrument to fuel hyperreality.
The Filmic World
To conclude this discussion, I will define the filmic world once the
diegesis and its relationship with the real and the hyperreal have been
established. The filmic world encompasses most of Souriau’s creational
and spectatorial levels. It also acknowledges Yacavone’s position on a
multilevel model for analyzing cinema. Revisiting the first example from A
Clockwork Orange described above, there were two different renditions of
Beethoven’s 9th Symphony that related to the same diegetic object. This
duplication indicates that, during the process of generating the diegesis,
both versions are merged to signify Beethoven’s piece. This is relevant in
terms of the plot, as Alex, the main character, perceives them as the
same object. However, the difference between both versions is
noticeable. Hence, it is worth asking what function diverse versions
that signify the same diegetic object serve when the duplication
is unnecessary for creating the diegesis.
As previously stated, the diegetic world becomes the realistic layer
of meaning in a movie. In addition, there are other layers of meaning that
can be grouped into an overall filmic world. Therefore, movies have their
own aesthetic and style, which constitute a substantial part of their
cultural meaning. Employing Wendy Carlos’ synthesized version in A
Clockwork Orange is an aesthetic decision that aids in the production of
the overall world of the movie, beyond the diegesis. For example, the
futuristic and eerie environment of A Clockwork Orange is partially
depicted through the use of the Moog synthesizer instead of a classical
Western orchestra. Thus, utilizing the Moog synthesizer serves to shape
the diegesis at the same time that it contributes to the overall aesthetic of
the film.
Similarly, Pulp Fiction’s broken narrative is an aesthetic decision
that modifies the filmic world. The diegetic world would remain the same
if the movie were narrated linearly as a single story. However, the filmic
world would transform. Even though the story would not change, the
overall audiovisual narrative would be significantly altered if the plot were
rearranged. An important part of the value and originality of Pulp Fiction’s
narrative lies in creating four connected stories, along with the process
of slowly revealing the whole narrative and the relationships between the
characters. The movie is constantly changing the audience’s
expectations of the characters by showing them from different angles.
This situation exemplifies one of the main points in Kassabian’s (2013)
The End of the Diegesis as We Know It?, in which she states that “we are entering
a period in which diegesis is receding into the background in favor of
sensory experience as the primary organizing principle of audiovisual
forms” (p. 102). This recession of the narrative seems clear in a movie like
Gravity (2013), where the story is just a vessel to present an audiovisual
experience. In the case of Pulp Fiction, the importance of the experiential
part lies in the utilization of four different stories as a narration technique.
However, the narrative is still complex. Instead of a recession of the
narrative, I believe that there is an addition of other resources that enrich
the process of storytelling in its most general meaning.
Another resource that enhances the storytelling process is the
utilization of philosophical or theoretical subtext. In the case of The
Matrix, there are diverse allusions to Baudrillard, Plato, Christianity,
Buddhism and Western literary works, among others. However, the
fictional scientific theory underlying the possibility of creating dinosaurs
using fossilized DNA in Jurassic Park should be considered as part of the
diegetic world. This theory is not part of the subtext, as it becomes an
axiom for the diegesis and for the story. In relationship to the
philosophical standpoints proposed in the movie, the filmic world of
Jurassic Park includes a quasi-theological position in favor of limiting the
progress of scientific research that involves the artificial generation of life.
The story in The Matrix is significantly complex. The philosophical
references and the aesthetic decisions of using color filtering, martial arts
and the bullet-camera effect add to the overall filmic experience in
creating a similarly complex filmic world. The Matrix is an eloquent
example of a postmodern artistic product that draws from multiple
sources to create a complex artistic piece.
Figure 20 aims to illustrate the process of generation of both
entities (the diegesis and the filmic world) described above. As a
conceptual element, a movie is constituted by its screenic content (its
audiovisual material) and all the references from the afilmic reality
(philosophy, actors, locations, etc.). All the data is perceived and
decoded by the particular spectator. The spectators have their own
model for the perceived reality and they generate the diegesis using it
along with the inputs from the screenic content. They use their knowledge
of the world and its culture as a means to create the filmic world, which
also incorporates screenic content.
Figure 20. Graphic representation of the framework for the generation of
the diegesis and the filmic world.
Aesthetic Realism
By utilizing the framework discussed above, the concept of what a
realistic movie means becomes clearer. When the filmic world is rich and
dense in content, the authorial intention gains importance. Conversely,
when the elements of the filmic world are thin, the author becomes more
transparent. In addition, transparency relies on cultural conventions of
moviemaking, which become part of a shared aesthetic for films. This is
the reason that music that is apparently outside of the diegetic world and
then, suddenly, appears to emanate from the diegesis (or vice versa) has
drawn so much theoretical attention. These moments are a noticeable
aesthetic device similar to the example from A Clockwork Orange.
Contrariwise, editing cuts that respect the 180-degree rule or subtle
volume changes in the musical track in order to adapt the music to the
rest of the soundtrack are not generally perceived as significant gaps.
The diegetic world might be close to the perceived reality or it might be a
completely invented world. Star Wars is an imagined world whereas the
diegesis of Pulp Fiction seems closer to Western society of the 90s. The
diegetic world of Ben-Hur (1959) will be evaluated in terms of its shared
expectations on how life was during the era of the Roman Empire. On the
level of the narrative, the story might seem closer to everyday
experiences or it might be more imaginative. For example, the TV show
The Wire (2002-2008) described the drug world in Baltimore as a fictional
story that resembled a real-life experience. In Requiem for a Dream
(2000), Darren Aronofsky also portrays a drug world by developing a story
that is clearly fictional and which also uses a rich filmic world.
Aesthetic realism is an aesthetic position that is regularly qualified,
simply, as realism. In terms of the concepts defined above, a movie will
become aesthetically realistic when it has a diegetic world almost
equivalent to the perceived reality, the filmic world is as transparent as
possible (the movie follows the conventions of the cinematic medium),
and the narrative is perceived as similar to an everyday life experience.
An aesthetically realistic movie does not necessarily better engage with
the individual reality and psychology of the spectators. For example, the
world of The Wire will probably be unrelated to the world of most of its
spectators. For them, The Wire’s narrative has a primarily documentary
value instead of a purely fictional one. Conversely, the drug-world in
Requiem for a Dream is utilized as a means to generate myth, by pointing
to a wider concept of obsession and addiction. In the movie, heroin
addiction is equated to TV addiction or to the abuse of weight-loss pills.
By displaying a plethora of addictive acts, the movie scrutinizes where
the limits are between a healthy habit and an addiction. In these terms,
Requiem for a Dream is closer to the everyday life of its audiences, as the
opposition between habit and addiction is common in contemporary
Western society.
The introductory statement of Fargo should be analyzed from this
viewpoint. By stating that the plot was a “true story”, the directors aimed
to engage the spectators in a manner similar to watching a
documentary. Instead of generating mythology, the narrative served as a
counterfeit documentary to show how absurdly human beings sometimes
act. In these terms, an aesthetically realistic film that is acknowledged as
such has the power to significantly reshape the perceptions of the reality
of its spectators. By watching The Wire, audiences will create a model of
reality of the drug world and corruption in Baltimore that they may believe
as veridical, regardless of its actual status. Similarly, biased news reports
could have the same effect. Hence, as an audiovisual technique,
aesthetic realism might also be a means to fuel hyperreality into the
everyday lives of the audience members.
With the analysis of aesthetic realism, the relationship between the
concepts of reality, the hyperreal, the diegesis and the overall filmic world
becomes more explicit. This is possible by approaching the diegesis from
a postmodern perspective, which suggests that movies have different
layers of meaning that interact heterogeneously with each spectator,
dependent on their own knowledge and experiences. This chapter has
also served to introduce some musically related discussions, which is the
focus of the sixth chapter.
CHAPTER VI
MUSIC, HYPERREALITY AND THE HYPERORCHESTRA
Introduction
Philip Auslander (2008) states in Liveness: Performance in a
Mediatized Culture that there is no ontological difference between a live
cultural event and its mechanical reproduction in the form of live
broadcasts for the radio or television (pp. 52-63). This is because both
events exist “only in the present moment” (p. 50) and, thus, they are
ephemeral. In addition, he argues that Greek masks in classical Greek
theater may have acted as amplification devices, which would have
mediatized the performance in a similar manner as electric amplification
does. Even if this were not true (since the masks did not amplify the voices
of the actors) the stage and the auditorium routinely act as a medium to
amplify and modify the sound produced onstage. In light of this,
Auslander (2012) suggests that a live recording should be ontologically
similar to a live experience or a live broadcast, as they differ only in the
characteristics of the mediation. The possibility of repeating a live
recording is not inherent in the medium. Instead, it is the result of a
cultural practice. In other words, Auslander (2012) suggests that if a
viewer records a live television show, remains in isolation from other
inputs, watches it once just a couple of hours later, and then deletes the
show, in terms of its liveness, it is the equivalent of having watched the
show live. Hence, the ability to store a live-recorded event is not a
necessary property to define live broadcasting.
It is not in the recording technology, per se, but in the techniques
employed to record it, that a recording of a musical piece becomes
significantly different from a live rendition. In a musical record, different
pieces of musical information are recorded, selected, and mixed together,
in order to create a product that becomes a virtual live experience. In this
chapter, I will explore how music interacts with hyperreality. I will begin by
describing three different modes of virtualization: recorded music,
synthesized music, and sampled music. At the end of the chapter, I will define
the hyperorchestra in terms of ontology in its relationship to the different
modes of virtualization.
I will primarily describe the hyperorchestra as it is utilized in music
for visual media, as this is the medium in which the hyperorchestra is
currently hegemonic. However, I will also explore the foundations for the
hyperorchestra by looking at a wider range of musical experiences, which
will include music for the live concert stage.
Musicologist Nicholas Cook (2013a, 2013b) defends, as an
essential axiom for his approach to the musical phenomenon, an
analytical methodology that incorporates performance as the
fundamental process for musical expression. This position relegates the
written musical score to a secondary status.
Like others working in musical performance and multimedia, I have
attacked traditional musicological approaches for treating a part of
culture as if it were the whole. To analyze music as performance is
to critique a musicology of writing that treats performance as
essentially a supplement to a written text; to study performance as
a form of multimedia is to see it as a phenomenon that involves the
body and all its senses, not a depersonalized sound source. By
implication, we in this field contrast our work to a truncated,
narrow-minded musicology that reflects the autonomy-based
aesthetic ideologies of the past, rather than the performative reality
of music as a vital and unprecedentedly popular cultural practice in
today’s multimedia-oriented world. (Cook, 2013b, p. 53)
Traditional musicological approaches, following the nineteenth-century aesthetics of absolute instrumental music, force a
distinction between the “musical” and the “extra-musical,” where
the “musical” is essentially defined in terms of notation: everything
that Michelangeli does before he plays the first note, then, falls into
the category of the “extra-musical,” from which it is but a small
step (and one that musicologists and traditional aestheticians
readily make) to dismiss it as self-indulgence or showmanship.
(Cook, 2013a, pp. 72-73)
In both quotations, Cook defends the importance of incorporating
not only the resulting sound but also the audiovisual performance.
Following Cook’s proposals, I will argue against considering the score
alone as a valid source to describe the musical experience due to its
representational limitations. In addition, even when considering only
sound recordings, I will support Cook’s position of an audiovisual
approach to performance. Although there is no explicit visual content in
an audio recording, listeners may generate a mental representation of the
stage where the music was supposedly performed. In building the
hyperorchestra, the mental representation of the imagined physical space
is essential to preserve a sense of realism.
Recorded Music and Musical Notation
Any process of capturing sound, for the purposes of either
recording or broadcasting it, involves the utilization of a medium to
capture it: the microphone. More importantly, the process entails the
selection of the set of microphones used to record, their placement, and
their degree of contribution[61] to the final sound. This fact does not
necessarily challenge Auslander’s statement on the ontology of live
performances. A live recording might only utilize two microphones,[62] one
for each stereo channel, located in the middle of the hall. In this situation,
the microphone signals would not be mixed in any particular manner,
nor would the microphones be placed differently. As a consequence, in a purely
[61] The amount of signal from each microphone that will be sent to the final mix.
[62] It could even utilize just a single microphone, which would produce a monophonic result.
stereo recording, the process of mediation involves only the physical
means of how the microphone has captured the sound and transformed it
to an electric (and then digital) signal. Hence, employing multiple
microphones in diverse locations and mixing them to generate the
resulting sound is not a necessity of the medium (audio recording), but is
an aesthetic possibility. Moreover, the perceived sound that a live
audience hears in a live concert will differ depending on the acoustics of
the hall and the spectator’s specific seating location.[63]
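The notion of a “degree of contribution” (the amount of each microphone’s signal sent to the final mix) can be sketched as a weighted sum of signals. The following Python sketch is a hypothetical illustration with arbitrary gain values, not a description of any specific recording practice:

```python
# Sketch: mixing several microphone signals into one track.
# Each microphone contributes to the final result according to a gain
# (its "degree of contribution"). All values here are illustrative only.

def mix(signals, gains):
    """Weighted sum of equal-length signals, computed sample by sample."""
    assert len(signals) == len(gains)
    length = len(signals[0])
    return [
        sum(g * s[i] for s, g in zip(signals, gains))
        for i in range(length)
    ]

# Two-microphone "purely stereo" case: each channel is one microphone,
# passed through unprocessed (gain 1.0), with no relative balancing.
left_mic = [0.1, 0.2, 0.3]
right_mic = [0.1, 0.0, -0.1]
stereo = (mix([left_mic], [1.0]), mix([right_mic], [1.0]))

# Multi-microphone case: an aesthetic choice of balances between a
# close microphone and a hall-ambience microphone.
close_piano = [0.5, 0.4, 0.3]
hall_ambience = [0.05, 0.06, 0.05]
mono_mix = mix([close_piano, hall_ambience], [0.9, 0.4])
```

In the purely stereo case the gains do nothing, which mirrors the argument above: mixing multiple, differently weighted microphones is an aesthetic possibility of the medium, not a necessity.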
I believe that the process of capturing sound implies an aesthetical
intent and a set of technical decisions that must be made to fulfill it. For
example, in recording an orchestral piece, the sound engineer might
decide to recreate the sound from the conductor’s position. In order to
achieve this objective, the engineer might decide to either place two
microphones near the conductor’s position, or to recreate the position by
using multiple microphones.[64] The first possibility, to employ only
microphones placed near the conductor’s location, does not imply a
higher degree of fidelity to what the conductor would be hearing.
Microphones are not equivalent to human ears, as they capture sound
[63] Even imagining an impossible exact performance of a musical piece, the sound will vary depending on the concert hall where it is performed. Similarly, the sound will differ between the orchestra seats and the top floors of the hall.
[64] Which might also include a set of microphones from the conductor’s perspective.
differently. As a consequence, there might be some specific aural
qualities of the sound that a microphone would capture more closely to
human hearing if it were placed in a different location.[65] There are
different types of microphones that capture sound differently in terms of
their sensitivity to specific frequency ranges, the capturing area (cardioid,
directional),[66] or the physical process of how the microphone converts
sound to electricity (condenser, ribbon). Thus, a combination of diverse
microphones in different locations may produce, once properly mixed, a
closer representation of what conductors would hear from their location
onstage. In terms of ontology, the properties of the microphone highlight
its inability to become, just like the camera, an index of reality. This
holds true even if considering that microphones capture sound
continuously, which differs from a cinema camera that is only able to
record a reduced number of frames per second. It is true that a digital
recording is only able to capture a discrete amount of information per
second (e.g., 96 kHz, meaning 96,000 samples per second).
However, the captured information is utilized to generate a continuous
signal when speakers reproduce it. Thus, ears do not need to create an
illusion of sound, as the eye does for moving images, because the
[65] Or as a result of the combination of multiple microphone types and placements.
[66] For a further description of microphone typologies, see Owsinski (2013, pp. 3-51).
sound that the ears receive is already fully formed.[67] Nevertheless, the
impossibility of creating a microphone that is equivalent to a human ear
challenges its indexicality in the same manner as what happens with the
camera due to the impossibility of capturing the same range of light as
the eye.
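The distinction drawn above can be made concrete: at a 96 kHz sampling rate, one second of sound is captured as exactly 96,000 discrete values, which the playback chain later converts back into a continuous waveform. A minimal Python sketch (the 440 Hz test tone is an arbitrary, illustrative choice):

```python
import math

SAMPLE_RATE = 96_000  # samples per second (96 kHz)

def sample_sine(freq_hz, seconds, rate=SAMPLE_RATE):
    """Capture a continuous sine tone as a discrete list of samples."""
    n = int(rate * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]

# One second of a 440 Hz tone: a discrete list of 96,000 numbers,
# even though the sound it represents is continuous.
one_second = sample_sine(440.0, 1.0)
```

The discrete list is only a storage format; the loudspeaker reconstructs a continuous pressure wave from it, which is why the ear, unlike the eye with film frames, never has to assemble an illusion of continuity.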
The process of recording sound not only involves choosing a
sound perspective, but also deciding how to achieve the desired sound in
terms of the microphone combination. Moreover, picking a sound
perspective does not necessarily imply that it is actually replicating a
specific location in the concert hall. Instead, the recording may aim to
reproduce an ideal listening perspective that does not exactly correspond
to any concrete spot in the hall. Furthermore, the recording may attempt
to better capture the composer’s intentions of the sound of a piece,
which might distance it from emulating the resulting sound from a live
performance in a concert hall. Thus, any process of sound capturing
involves an aesthetic process, which generates the desired shape of the
recorded sound. This aesthetic process creates new prospects for
creatively shaping the resulting sound. More importantly, it alters the
central point of music performance from the score to the sound result.
[67] In other words, the eyes receive a discrete number of still images per second that the brain interprets as movement. In the case of sound, the ears already receive a sound wave that is created through its interaction with the speakers.
Musical Notation
In terms of McLuhan’s (1964/1994) approach to media, a musical
score is similar to phonetic writing. The score is an abstract notation
system that aims to translate an aural phenomenon that occurs in time
into a visual static representation. In addition, the symphonic orchestra
(and other ensembles) acts as a standardized ensemble that facilitates
notation. Standardized ensembles are helpful, as the score fails to
graphically show the sound differences of a set of instruments depending
on their position in the space, for example. If the string section of the
orchestra were placed at the back of the concert hall, the woodwinds in
the middle of the parterre and the brass onstage, the resulting sound
would greatly differ from a traditional placement of the orchestra.
However, a traditional musical score notation does not include the tools
to graphically reproduce those differences. The score serves only as a
partial visual representation of sound when very specific restrictions are
applied. Even in this strictly controlled environment, the score requires a
high degree of interpretation to act as a form of representation of the
sound depicted on its pages. Applying another set of constraints,
implementing harmonic, melodic and rhythmic rules further facilitates the
process of interpretation. By definition, the score is only able to depict a
limited amount of pitches[68] and rhythms that are based on a meter. In
addition, harmonic principles facilitate a theoretical understanding of the
sounds and their progressions that, in turn, facilitates imagining them.
Even when considering all these restrictions, the score fails to
provide a visual differentiation of timbral aspects, in order to
acknowledge that a trombone sounds different from a bassoon, for
example. The timbre differences of the instruments written in a score can
only be depicted by applying the acquired knowledge of the sound
properties of both instruments. Nevertheless, Western classical music
was born as a product of a score-centric vision of music, thus carrying its
limitations.
In exchange for its restrictions, the musical score offers diverse
advantages as a medium for music transmission. First, the score allows a
rapid diffusion of complex musical content. For instance, a group of
trained musicians is able to quickly perform a piece of music just by using
the score. In addition, the score allows the separation of the process of
writing music and the process of performing it, thereby facilitating a sort
of division of labor similar to other practices that emanated out of the
Industrial Revolution. Within the score paradigm, the nascent musical
[68] Mainly based on a 12-tone scale. Although it is possible to notate quartertones or even smaller intervals in the score, these new pitches are still subordinate to a 12-tone framework.
industry started to separate the labor of the composer and that of the
performer, at the same time that the musical production[69] became
standardized. This implies that any trained performer is able to play a
musical piece written by a composer. At the same time, these performers
are interchangeable, if needed, as they are detached from the process of
creating the musical piece.
In addition, the process of standardization involves discretizing
different musical features. For example, a continuous parameter like
dynamics is divided into a few steps (f, mf, mp, p); dynamic variations are
mainly represented as a unique transitional process expressed by terms
like crescendo or decrescendo. Figure 21 attempts to clarify the previous
statement, by graphically portraying three different possibilities for a
crescendo. In this diagram, the vertical axis represents the dynamic level,
whereas the horizontal axis represents time.
The first graphic refers to a linear and proportional crescendo,
representing a performance that increases the dynamic at a constant
rate, while the other two are non-linear. In the second representation, the increase of
dynamic will be slower at the beginning and more intense towards the
[69] In this situation, production refers to the act of creating musical performances.
end of the crescendo. The last example represents an increase of the
dynamic that incorporates smaller dynamic variations.
Figure 21. The graphic shows three different crescendo representations.
However, these three different processes of crescendo are
represented equally when using the notation provided in a musical score
to represent crescendo (Figure 22).
Figure 22. Representation of a crescendo utilizing traditional musical
notation.
Therefore, a performer will not be able to distinguish between
these different types of crescendo by reading a musical score, thus
creating a degree of ambiguity that will need to be resolved at the time of
the performance. A similar process is involved in the rest of the elements
of the score. The act of performing a musical piece written in a score
involves the aesthetic process of interpreting the contents of the score. In
realizing how vaguely Western musical notation represents dynamics,
another limitation of the representational capabilities of the score, in terms of
the sound result, becomes apparent.
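To make the ambiguity concrete, the three shapes from Figure 21 can be sketched as functions of normalized time (t from 0 to 1) that share the same start and end dynamics but trace different paths in between; a traditional score would notate all three with the same hairpin. The curve formulas below are hypothetical illustrations, not taken from the study:

```python
import math

def linear(t):
    """Constant-rate crescendo (first graphic)."""
    return t

def back_loaded(t):
    """Slow start, more intense towards the end (second graphic)."""
    return t ** 2

def with_swells(t):
    """Overall rise incorporating smaller dynamic variations (third graphic)."""
    return t + 0.05 * math.sin(6 * math.pi * t) * (t * (1 - t))

# All three begin and end at the same dynamic levels...
for shape in (linear, back_loaded, with_swells):
    assert abs(shape(0.0) - 0.0) < 1e-9
    assert abs(shape(1.0) - 1.0) < 1e-9

# ...yet differ in between, although a notated crescendo cannot
# distinguish them: the performer must resolve the ambiguity.
assert linear(0.5) != back_loaded(0.5)
```

The shared endpoints with divergent interiors restate the argument in miniature: the hairpin specifies only where the crescendo begins and ends, leaving the shape of the transition to interpretation.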
The necessity of interpreting the score to produce a musical
performance pinpoints an additional aspect of Western music practice: it
still relies on the oral tradition. Learning an instrument involves acquiring
technical knowledge through instruction. In the process of instruction, the
student learns the performance practices necessary to properly decode
200
the score. The nuances involved in the interpretation of a crescendo
exemplify the impossibility of acquiring this knowledge from a manual,
due to the absence of notation procedures that can specifically
describe it.[70] Therefore, these practices are acquired by aural
communication,[71] as they cannot be transmitted through written
knowledge. In terms of McLuhan (1964/1994), this is the type of
interaction that would precede phonetic language. It is from this angle
that the utilization of the score, with all its restrictions, may be analyzed.
The score becomes the only element capable of bringing part of the
musical practice as close as possible to the other forms of intellectual
knowledge that can be acquired by studying written information.
Consequently, the score confined the necessity for oral
transmission in music to a reduced set of situations.
The curated environment for music creation described above
encompasses standardized models for the orchestral ensemble, musical
language, the concert hall, and music notation. However, in the process
[70] Learning how to perform a crescendo implies assessing a different set of musical parameters to decide what would be the best shape to implement the dynamic variation. It does not necessarily require the performer's awareness of the exact dynamic evolution of the sound. Instead, aural communication serves as a tool to sonically assess when a particular performance of a crescendo fits the needs of a passage.
[71] For this example, I replaced the term oral with aural to acknowledge a communication system that involves the sound of the instrument being learned.
of the evolution of Western music, the framework adapted to incorporate
new sonic possibilities. The orchestra added new instruments and
instrumental techniques, and the harmonic language developed beyond
the structure of previously established harmonic transformations and
tonality. Similarly, the rhythmic complexity expanded, stressing the
limitations of what was possible to notate using the musical score. Each
of these amplifications relaxed the restricted environment of Western
music creation, which decreased the value of the score as a
representation of music. By losing its representational capabilities, the
score became a blueprint for musical performance.
With the introduction of audio recording, music
experienced a shift in its perspective. Audio recording transcends the
score and focuses on the process and manipulation of sound. A piece
might be recorded several times and subsequently edited, which
transforms the resulting output. The degree of the transformations
granted by the recording process does not necessarily involve an
interpretation of the score or the assumed musical structure. This
statement will be further clarified by analyzing the aesthetics of the
modern recordings of piano concertos.
The Piano Concerto Recording
The Western classical concerto involves a soloist and an entire
orchestra performing together. The soloist has a prominent role, even
though he or she is clearly outnumbered by the orchestra. I am using the
piano concerto as an example, but the process of recording this type of
musical expression could be generalized to any Western classical
concerto for any solo instrument. Most classical recordings follow an
aesthetic principle that aims to produce a sound that is similar to a live
experience. In other words, the recording should sound verisimilar in
order to be aesthetically accepted as a valid rendition of the classical
piece. However, the solo piano is recorded with a number of dedicated
microphones, which are invariably mixed louder than the orchestral
microphones. In doing so, listeners of the recording are able to hear the
solo piano in moments that would not have been audible within the
environs of a live concert hall performance. For example, the piano is
barely audible in most of the passages when the soloist is playing along
with the whole orchestra in a forte dynamic. In aesthetic terms, those
passages are challenging. When looking at the score and following the
established performance conventions for this type of piece, it is implied
that the intention of the composer is that the soloist should be heard,
even though it may be acoustically impossible if the full orchestra is
playing loud. By making the orchestra play quieter (a common solution
adopted by some conductors), the piano might be heard, although the
timbral result of the orchestra will no longer be forte, and this is
equally challenging to the intentions of the composer that are assumed
by looking at the score. In addition, those types of passages may even be
visually confusing for live audiences, as they watch the soloists playing
with great effort and strength without actually being able to hear them.
In a recording, it is possible to achieve the effect of distinctly
hearing the soloist at the same time that the orchestra is playing in a forte
or fortissimo dynamic. The result is a version of the piano concerto that
could not be generated solely by live acoustic means, yet still sounds
convincing and realistic. In aesthetic terms, a recording that mixes
the solo piano more loudly becomes an idealized version of the piece,
realizing the utopian notion that a single instrument, in the hands of
an exceptional artist, can overcome a hundred players, which is the
very premise of the concerto itself.
Utilizing McLuhan’s (1964/1994) definition of media, a musical
instrument is a medium that becomes an extension of the voice, as it is
able to produce sounds that would not be otherwise possible. Similarly,
the soloist in a piano concerto extends the musical instrument by
overpowering an entire orchestra. This romantic idea of a superhuman
collides with the acoustic limitations of live sound. However, with the aid
of the recording and selective mixing, this artistic endeavor is realizable in
a form that seems natural to the listener. In a live concert, a similar effect
could be achieved by amplifying the soloist. However, the process of
amplification, which is neither visually nor aurally neutral in a live
performance, works against the epic conception of the soloist as a
superhuman artist. Instead, in a recording, the artificial process is hidden,
at the same time that the verisimilitude of the musical experience is
preserved.
By recording the piano concerto within these principles, a process
of virtualization occurs. In capturing the performance’s sound from very
specific locations, and then mixing the captured sounds with a precise
aesthetic intention, the result, although grounded in simultaneous
captures of a real experience, transcends what the human senses would
have perceived. It is from this perspective that the sonic result of a piano
concerto recording might become hyperrealistic. The totality of the sound
of the recording originates at the same time and space in the physical
world. However, the resulting sound does not pertain to the same world,
as it is transformed beyond the possibilities of the real. The
transformation adheres to an aesthetic intention rooted in how the piece
should ideally sound. In addition, the conflict between the possibilities of
the physical world and the intentions of the musical creators72 stresses
the necessity for human artistic expressions to transcend the physical
limitations of the real. Furthermore, the recording of piano concertos
might reshape the audience's expectations of a live performance of the
same type of piece: as the acoustic model of the recording is convincing
and aesthetically coherent, audiences might expect the same balance in
the sound of a live performance.
The Studio Recording and the Acousmatic Sound
The analysis of the aesthetics of the piano concerto recording in its
relationship with a virtual reality pointed out how a recording that aimed
to reproduce reality altered some of its properties. In the case of
recordings that fully utilize the possibilities of the studio, this goes further.
The studio overrides the necessity for the performers to share the same
space at the same time. In a studio recording, each instrument might be
located in different isolation rooms, or they can be recorded at different
times. In addition, the rooms might be sonically treated in order to
minimize the early reflections or the overall reverberation. Hence, the set
72 The musical creators may include the composer, the performers, and
the recording team. This is important because some creative decisions
(the size of the ensemble, for example) are generally not specified in
the score and might affect the comparative loudness of the soloist and
the orchestra.
of microphones that captures each of the instruments receives a sound
that is mainly the sonic output of that single instrument, and converts
it into a malleable source material.
In terms of musical properties, the studio recording extends the
timbre, which is the equivalent of stating that it extends the sonic
possibilities of the music. It does not essentially alter the harmonic,
melodic or contrapuntal qualities of the music, however. From this
viewpoint, the techniques involved in a studio recording cannot usually be
expressed on a traditional musical score. It is from this perspective that
some of the concepts discussed in relationship to the studio recording
might resonate with some of the principles of musique concrète. In both
cases, the new medium (the studio) permits the extension of the sound
beyond the acoustic possibilities of the instruments73 playing in a physical
space. By transcending the necessity of a certain degree of fidelity (or
resemblance) to an aural model from the physical world, the music
created in the studio is sculpted from all sides without any preexisting
assumption. At the very least, the virtual stage and the virtual positions of
the sound sources on the stage are generated. A virtual stage implies
sounds that lack an identifiable visual source. This typology of sound was
labeled “acousmatic” by one of the founders of musique concrète, Pierre
73 In this case, an instrument is any physical object that is able to
produce sound.
Schaeffer (2007), in his book Treatise on Musical Objects. Acousmatic
describes a sound that one hears without seeing the causes that
originated it (p. 56). Strictly speaking, all recorded music should be
considered acousmatic, as it is reproduced without the listener seeing
the source. For
Schaeffer, an acousmatic situation breaks the symbolic connection
between the sound and its visual source (p. 56). Schaeffer precisely
qualifies the connection as symbolic instead of indexical, as he believes
that part of musical hearing involves visual information. By proposing an
acousmatic mode of hearing, Schaeffer argued in favor of focusing on the
pure properties of the sound, regardless of its visual cues. This is why he
named the process acousmatic, the term once employed to refer to
Pythagoras’ disciples, who listened in total silence to their mentor,
who spoke from behind a curtain.
However, the listener of a recording might visually imagine the
performers. This is why acousmatic should be considered an attitude:
a conscious decision to disregard the visual cues in order to focus
exclusively on the sound. The process of disconnecting from the visual
source becomes much easier when the sound is not clearly connected
with a physical object. For example, a compressed and distorted sound
from a guitar that is panned from left to right does not clearly represent a
physical instrumental experience: the sound that emanates from a guitar
is different from the sound being reproduced, and it is unfeasible
for the performer to physically move around the listener’s aural
field. Hence, an acousmatic attitude of hearing concentrates on the
sound of the music without a visual or cultural bias. In terms of
aesthetics, this is the most relevant innovation that recording, as a
technology, has added to music.
Nevertheless, an acousmatic attitude towards the pure sonic
properties of music is compatible with its cultural connections. In fact,
decoupling the sound from its source may even facilitate the creation of a
level of signification that connects with a cultural background. For
instance, Richard Wagner's design of Bayreuth's theater hid the
orchestra in order to force the audience to focus on the sound of the
music instead of being distracted by the visual cues of the performers.
Furthermore, the cultural model of an orchestral sound would still be
associated with orchestral music regardless of whether it is visible or not.
By hiding the source, audiences still recognize the main pattern of the
sound (orchestral music). Michel Chion (1994) argues that, contrary to
Schaeffer's assumption that an "acousmatic situation could encourage
reduced listening, in that it provokes one to separate oneself from causes
or effects" (p. 32), the listener will attempt to reveal the source of the
sound first:
Confronted with a sound from a loudspeaker that is presenting
itself without a visual calling card, the listener is led all the more
intently to ask, "What's that?" (i.e., "What is causing this sound?")
and to be attuned to the minutest clues (often interpreted wrong
anyway) that might help to identify the cause. (p. 32)
Chion’s focus is on cinema sound, which influences his perception
of Schaeffer’s concept of acousmatic sound. Chion (2009) rightly
remarks, in line with the previous discussion on the model of an
orchestral sound: “if the source has been seen, the acousmatic sound
carries along with it a mental visual representation” (p. 465). Chion’s
remarks are aligned with the concept of “source bonding” defined by
Denis Smalley (1994), which is: “the natural tendency to relate sounds to
supposed sources and causes, and to relate sounds to each other
because they appear to have shared or associated origins” (p. 37). For
Smalley (1994), source bonding is a deeply culturally embedded process
because, prior to electroacoustic music, the sounds had always had an
identifiable source (p. 37).
Nevertheless, Chion (1994) asserts that, in most cases in cinema,
the source of the sound is identifiable. Chion’s approach to the
term is practical and has been widely used. However, it deviates from the
aesthetic discussion intended by Schaeffer (2007). For Schaeffer, an
acousmatic sound broke the symbolic connection between the sound
and its source. For Chion, this is not the case in most situations, as the
symbolic connection is coded in the society that shares the cultural
background. Even though Chion’s assumption may be true for most of
the off-screen diegetic sounds of a movie, it becomes a reductionist
approach when analyzing the possibilities of music created in the studio.
Examining, once more, the example of a processed guitar sound that is
dynamically panned around the virtual soundstage, a symbolic
connection with a physical source may be found. A listener may identify a
guitar as the physical origin of the sound, while realizing that the dynamic
panning is not a product of the movement of the source (the guitarist)
during the performance. As a consequence, the symbolic connection
between the sound and its source becomes just a trace, a model of
reality that contributes to creating a more complex, acousmatic (and
hyperrealistic) sound model.
Hence, an acousmatic attitude (as defined by Schaeffer) towards
sound becomes a further step into hyperreal sound and music which
transcends pure physical models of sound. Moreover, when listeners
identify the guitar as the sound source in the previous example, they also
respond to an acousmatic attitude. In acknowledging that the physical
source of the sound, the guitar, acts as an element that contributes to the
shape of the final sound, the listener assumes a sonic attitude detached
from exclusively physical forms of sound generation. Therefore, a listener
who is able to incorporate the concept of the detachment of the sound
from its source demonstrates an acousmatic sensibility to sound.
When audiences experience audiovisual media, speakers generate
the sound. This is independent of the fact that the visual image might
represent the physical origin of the sound. In any case, a significant part
of the sounds in a movie are generated separately from the source shown
in the visuals. The sound of a door closing, if it is narratively important, is
usually magnified. The dialogue almost always has a predominant space.
Thus, the act of listening to audiovisual material usually becomes an
acousmatic experience. Remarkably, this even holds true in a
multi-camera recording of a live orchestra performance. In this
situation, the sound will not change depending on the camera angles
selected (a close-up of the oboe performer would not imply a change in
the mix of the oboe sound).
The studio recording has yet another relevant consequence. The
live concert rendering of a piece of music that originated in a studio
becomes the representation of the original sound, limited by the
possibilities of the physical world. It is assumed that the quality of the
music and the sound would not be equal to the recording. Yet, the value
of the live concert rests on the symbolic connections with the musicians
sharing the same physical space.
Synthesized Music and the Musical Instrument
Even though the introduction of the ability to synthesize music
through the use of electricity was a milestone in the evolution of music
during the 20th century, from the viewpoint of this current discussion,
sound synthesis mainly expanded on the new possibilities introduced by
studio recording. Synthesizers seem able to generate sound without the
need for a string (or another material) to vibrate. However, this is
not exactly true: synthesizers generate an electric signal that is
transduced into sound by the vibrating membrane of a loudspeaker. Thus,
synthesized music and sound are a form of acousmatic experience in
which the source is generated by using electricity.
Synthesizers are able to generate a new range of sounds that
cannot be produced by traditional instruments in the physical world. They
can create pure sine waves that repeat exactly through time. They sound
significantly different from any physically generated sound because this
degree of mathematical purity in the sound waves is not achievable by
physical objects. This fact highlights that the imperfections and
unevenness of the timbre are essential for a sound to be associated with
the physical world. In other words, natural sounds have a complex and
variable timbre. Thus, the regularity of synthesized sounds becomes
something beyond the physical world.
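This contrast between mathematical purity and physical unevenness can be sketched in a few lines of Python (an illustrative sketch of my own, not part of any synthesizer's actual implementation; the function names and the 1% jitter figure are assumptions chosen for the example). The first function produces a sine wave whose every cycle is identical; the second adds a tiny random drift in amplitude and phase, loosely imitating the irregularity of a physically produced tone:

```python
import math
import random

SAMPLE_RATE = 44100  # samples per second

def pure_sine(freq, seconds):
    """A mathematically exact sine wave: every cycle is identical."""
    n = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]

def jittered_sine(freq, seconds, jitter=0.01):
    """The same wave with a tiny random drift in amplitude and phase,
    loosely imitating the unevenness of a physically produced sound."""
    n = int(SAMPLE_RATE * seconds)
    phase, out = 0.0, []
    for _ in range(n):
        phase += 2 * math.pi * freq * (1 + random.uniform(-jitter, jitter)) / SAMPLE_RATE
        out.append((1 + random.uniform(-jitter, jitter)) * math.sin(phase))
    return out

# 441 Hz divides the sample rate evenly, so the pure tone repeats
# exactly every 100 samples; the jittered tone never does.
tone = pure_sine(441.0, 0.1)
```

The pure tone is exactly periodic, sample for sample; the jittered one drifts slightly on every cycle, which is precisely the irregularity that marks a sound as belonging to the physical world.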
The analysis of the synthesizer as a new musical instrument of the
20th century assists in clarifying its definition as a medium.
Conversely, the definition of a musical instrument will serve as a
means to properly identify the significance of the introduction of
synthesizers into Western culture. From the human perspective, the voice is the first
complex natural instrument, as it is able to produce a wide set of pitches
and dynamic levels. In addition, hands and other body parts are the
source of percussive and rhythmic sounds. In McLuhan’s (1964/1994)
framework for the media, musical instruments are a medium to extend
the musical capabilities of the human body by introducing new timbres,
pitches and the possibility of polyphony within a single instrument. It
becomes a medium that extends the human voice in a similar manner to
how the hammer amplifies the arm by making it stronger.
In this framework, wind instruments are a natural extension
of the voice, similar to a more elaborate form of whistling. Similarly,
percussive instruments are an extension of body percussion.
String instruments are, however, a more sophisticated technology.
They require a process of sound creation that is not found in a
natural environment. It cannot be considered a direct extension of
the human body in terms of organology. The sound of a plucked
string is also significantly different from what can be achieved by
the human voice. The process of bowing is even more complex, as
it further distances the instrument from the human voice.
From this brief analysis of different typologies of musical
instruments, the necessity to treat musical instruments as
technological devices arises. As noted, wind and percussive
instruments naturally amplify the human body. For the string
instruments, even though their mode of producing sound is less
similar to how the human body generates it, they are still a
technology that directly generates sound: a performer plucks a
string and the string produces sound. Thus, in all these cases,
there is direct physical contact between the source of the sound
production and the performer. However, this situation changes
with the introduction of the keyboard. The musical keyboard
virtualizes the process of performing music. By pressing a key, the
performer does not directly interact with the source of the sound.
Instead, the performer activates a mechanism that will produce the
sound. The keyboard becomes an additional level of mediation,
which also facilitates the standardization of music, as it is built
using the 12-tone division of the octave. In addition, the 12 notes
are divided between seven white keys and five black keys, which
suggest the seven-note scale system (Figure 23).
Figure 23. Schematic for an octave of a musical keyboard.
Thus, the musical keyboard is an interface that solidifies a
scale-based 12-tone musical framework, at the same time that it
detaches the
process of playing (by pressing its keys) from the actual production of the
sound. The first synthesizers, like the Hammond organ, used the musical
keyboard as the main interface for music performance, in addition to a
console of knobs designed for shaping the sound. In terms of musical
performance practices, playing a synthesizer is similar to playing another
keyboard instrument. Thus, the synthesizers did not innovate by adding a
new musical interface. However, the synthesizers did offer the possibility
of molding the sound they produced, in forms that were not exactly
possible with purely physical instruments. In fact, physical instruments
started to incorporate a set of extended techniques that, similarly,
focused on expanding the sound possibilities available for music creation.
Hence, the incorporation of synthesizers as a new set of musical
instruments highlights a new attitude in which sound variety becomes
more relevant. Their utilization of electricity should be understood as a
means to fulfill their goal of expanding the sound palette. Thus, extended
techniques for physical instruments and synthesized sounds obey a
similar artistic intention. Nevertheless, the introduction of electrical
devices as a means for sound generation allows for the creation of music
that was once impossible to achieve just by using physically generated
sounds.
Sampled Instruments
From a basic perspective, a sampled instrument might be
considered to have evolved from synthesizers. From this viewpoint,
sampled instruments utilize a short recording (sample) of a sound
produced in the physical world in order to generate a new synthesized
instrument. Thus, this instrument does not purely originate out of the
utilization of electric signals. Instead, a sampled instrument is the product
of processing recorded sounds captured from the physical world, in
addition to the creation of a computer program that generates a playable
virtual instrument from these samples. This simple transformation has
significant implications in terms of the ontology of musical instruments.
Capturing and processing a sound from the physical world adds a layer
of virtuality that did not exist in the synthesizers.
From the perspective of the sound they produce, sampled
instruments might be divided into two main approaches. First, there are
instruments using a sample as a source that is modified using synthesis
to produce the resulting sound (hybrid synthesizers). The intention of
these instruments is not to replicate or emulate a physical instrument by
virtualizing it but, instead, to create a new one by transforming a
physically generated sound sample.
Figure 24. General classification for synthesized and sampled
instruments. Pure synthesizers refer to instruments that create the
sounds purely from sound synthesis. Hybrid synthesizers are, as
described above, synthesizers that also employ one or more samples
(that are transformed) in order to produce the sounds. Sample libraries
are designed by creating computer programs that utilize a set of samples
to generate virtual instruments. The last typology, the recording,
refers to any other type of music recording.
The second group contains instruments that attempt to virtually
replicate a physical instrument (or ensemble) by using multiple samples of
the instrument to emulate it (Figure 24). In most cases, the sample (or set
of samples) that serves to produce the sound for the instruments of the
first group carries a certain degree of signification over to the newly
created instrument. For example, a virtual instrument that uses the sound
of a metallic trashcan when hit with a hammer will probably have some
sort of connection with the source of the sample in terms of its
signification even if the resulting sound has been modified using
synthesis. Thus, when the instrument appears in a piece of music, it will
probably bring some references of the cultural connotations of the
source. By incorporating meaning borrowed from the physical world, the
new instrument becomes hyperreal, as it integrates models from the real
without the sound having purely originated in the physical world. In the
trashcan sound example, the resulting sound may have been transformed
and the pitch shifted in order to provide different musical notes. Hence,
even though it might still preserve the connotations of hitting a trashcan,
the sound produced by the instrument could not be generated just by
physical means.
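A naive version of this pitch-shifting transformation, the way early hardware samplers transposed a single recording across the keyboard, can be sketched as follows (a simplified illustration under my own assumptions; actual sample libraries use far more sophisticated resampling and time-stretching, and `repitch` is a hypothetical name):

```python
import math

def repitch(sample, semitones):
    """Naive sampler-style transposition: read the stored recording at a
    different speed (with linear interpolation), which shifts pitch and
    duration together, as early hardware samplers did."""
    ratio = 2 ** (semitones / 12)  # equal-tempered frequency ratio
    out, pos = [], 0.0
    while pos < len(sample) - 1:
        i = int(pos)
        frac = pos - i
        out.append(sample[i] * (1 - frac) + sample[i + 1] * frac)
        pos += ratio
    return out

# One recorded hit (here a stand-in list of floats) can be transposed
# to supply different musical notes from a single sample:
hit = [math.sin(i / 5.0) for i in range(1000)]
minor_third_up = repitch(hit, 3)   # shorter and higher than the source
```

Even this crude transposition illustrates the point made above: the transposed sound preserves the timbral identity (and therefore the connotations) of the original source, yet it is no longer a sound that could have been produced by physical means alone.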
This second group of instruments constitutes the core of what is
commonly defined by screen composers as sample libraries,74 although
they really are virtual instruments that utilize extensively sampled sounds
in order to produce a realistic rendition of the instrument or ensemble that
they replicate. They expand the definition of a musical instrument even
further by treating ensembles as instruments. Even though this could be
equated with the actual treatment of the string sections as unified
instruments, with sample libraries this process extends to any possible
instrumental combination. Moreover, the selection and design of the
ensemble implies a degree of cultural signification. For example, a virtual
instrument from a sample library that represents an orchestral sound (or a
section of the orchestra) would accomplish that goal by following the
orchestration principles that are part of a cultural background. The result
will generate a soundscape coded within a particular cultural framework.
In addition, a sampled instrument from this second group may be
designed to reproduce specific musical gestures or rhythmic patterns
(runs, trills, percussion loops, etc.) that are also culturally coded.
Sample libraries challenge the definition of a musical instrument in
another manner, especially when they become a virtual version of a
74 In practice, the first group is regularly considered to just be
synthesizers. However, for the purposes of precisely defining these
instruments, the distinction was required.
physical instrument. This leads to the question of whether the sampled
instrument should be considered an attempt to imitate the real
instrument, which makes it, therefore, a counterfeit, or whether it should
be considered a new instrument that is culturally tied to its physical
counterpart. In order to fully comprehend what a sampled instrument with
these characteristics really is, it is worth analyzing how they compare to
actors and their computer-generated counterparts.
CGI Actors and Sample Libraries
In Chapter IV, I discussed what Auslander (2008) described as the
“Gollum Problem”. With the introduction of computer-generated
characters, along with motion capture devices to capture the movements
for those characters, the concept of the authorship of the act of
performance was challenged. In these terms, the virtual instruments in
sample libraries are closely related to CGI characters. They both use
samples from the physical world that are processed by computer models.
In addition, they both require a certain amount of programming that is
best achieved by experts who interact with specific interfaces. Thus, the
actor's performance is incorporated, using motion capture devices, into
the computer model, thus generating a visual moving image.
Programming the computer model could technically be achieved just by
using computer tools and without the interaction of an actor. However,
the expertise of an actor becomes key in providing life and verisimilitude
to the CGI character.75 In addition, an actor may become a specialist of
motion capture performance. This is the case of Andy Serkis, who is
responsible for the performance of Gollum in The Lord of the Rings
franchise, but also a few other well-known CGI characters, including
Caesar in both Rise of the Planet of the Apes (2011) and Dawn of the
Planet of the Apes (2014).
Music creation using sample libraries follows a similar process.
Sample libraries are created to allow complex performance programming
using the Musical Instrument Digital Interface (MIDI) protocol. However,
the introduction of this MIDI data is mainly achieved with a musical
keyboard and a set of faders. In addition, it is typically the composer (or
the assistant composer) who actually programs the libraries. Therefore,
there is often no specialized musical performer involved in the process
in the way that an actor is involved in the creation of a CGI character. This is
explained by several reasons. First, the composer, as a musician, has
some performance skills. Second, the input devices (keyboard and
faders) do not capture musical performance as naturally as motion
75 This is coherent with McLuhan’s thesis on the limitations of the
written language. Programming language, as an even more formal type of
written communication, requires a massive amount of data in order to
represent body movements.
capture sensors do for acting. For example, a violin performance might
be captured by simultaneously using a keyboard and some faders to
control dynamics, vibrato or bow change. However, a keyboard and a set
of faders are not able to capture a violin performance as naturally as
motion capture does. Third, the existence of diverse instruments that are
performed very differently, in addition to ensemble sounds, complicate
the task of designing capture mechanisms, as well as the logistics of
capturing the performance. For example, MIDI wind controllers do exist,
although they are rarely used. This is because they require a wind
performer to be fully effective.
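The kind of data this keyboard-and-faders workflow produces can be sketched directly from the MIDI 1.0 specification: three-byte Note On and Control Change messages. (The status bytes below are from the specification itself; the mapping of CC1 and CC11 to dynamics and vibrato is a common sample-library convention, not a requirement of the protocol, and the helper names are my own.)

```python
# Status bytes from the MIDI 1.0 specification (channel 1):
NOTE_ON = 0x90
CONTROL_CHANGE = 0xB0

def note_on(note, velocity):
    """3-byte Note On message, as sent by the keyboard."""
    return bytes([NOTE_ON, note & 0x7F, velocity & 0x7F])

def cc(controller, value):
    """3-byte Control Change message, as sent by a fader. CC1 (mod
    wheel) and CC11 (expression) are the controllers that many sample
    libraries map to dynamics and vibrato crossfades."""
    return bytes([CONTROL_CHANGE, controller & 0x7F, value & 0x7F])

# A sustained note whose dynamics swell as the fader rises:
stream = [note_on(60, 80)] + [cc(1, v) for v in range(40, 120, 10)]
```

The thinness of this data is the point: a key number, a velocity, and a stream of 7-bit fader values capture far less of a violin performance than a motion capture rig captures of an actor's body.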
In considering these limitations, sample libraries are generally
designed according to the assumption that the composer or the assistant
will act as the performer. This inevitably implies a simplification of the
inputs that the sampler is accepting, as there are performative nuances
that can only be realized by actual performers with fully functional
interfaces. Although some assistants have become MIDI programming
specialists that act as a liaison between the composer and the sample
libraries, their specialty is programming the library instead of actual
instrumental performance. Compared to CGI characters, MIDI
programmers are the equivalent of CGI programming specialists of the
virtual character’s movements. On the other hand, motion capture actors
enact performances of a wide variety of non-human creatures, a
challenge comparable to a musician performing an instrument for which
he or she has not been trained.
Sample libraries adapt to this situation not only by designing
interfaces for the composers and MIDI programmers but also by
incorporating predesigned performance elements that are reproduced
automatically. For example, instruments in sample libraries routinely
incorporate predesigned amounts of vibrato, instrument noises or breath.
In general, instruments in these libraries produce sound by following
some performance standard practices, most of which derive from what is
commonly known in the movie industry as “the Hollywood Sound”.
Hence, by employing these libraries, a set of fixed cultural conventions is
inevitably introduced to the music they create.76
By analyzing the similarities between CGI characters and sample
libraries, the relationship between the physical instrument and its
sampled counterpart becomes somewhat clearer. Motion capture actors
provide an actual performance of the virtually designed characters, which
results in an actual human performance. Similarly, music created with
sample libraries is also performed, even though the performance
capturing capabilities are not as extensive as motion capture for acting.
76 This will be explored in detail in the following chapter.
However, this lack of precision in the capturing process of the
performance is partially supplemented by inserting predesigned
performance practices. From this viewpoint, the actual performance is
neither synthetic nor computer-generated: it has been produced by
humans. In the case of musical instruments, they act as a medium to
transmit the musical ideas of the performer in a comparable manner to a
virtual instrument from a sample library.
In addition, in terms of narrative cinema, there is a supplementary
consideration to ponder: the importance of how both music and acting
support the narrative. For example, the character of Yoda in the Star
Wars franchise was portrayed using a puppet in Episodes I, V and VI,
and as a CGI character in Episodes II and III. As a
character, Yoda is the most powerful Jedi master that appears onscreen,
which implies that he is also the best lightsaber fighter. Yoda’s
supremacy as a lightsaber warrior would be difficult to portray through
the use of a puppet, or by employing physical means alone. Nevertheless,
this is not a significant problem when Yoda is represented using a CGI
character. Thus, from a narrative standpoint, Yoda is better portrayed by
a CGI character than by a physical puppet or actor. In a similar manner,
underscore music adheres to the narrative needs of the movie as its
primary function, which in several situations might be better achieved by
using sample libraries, as I described in Chapter IV.
Sample Libraries and Hyperreality
In Chapter IV, I discussed Prince’s (2012) definition of perceptual
realism and how it was connected with the idea of the hyperreal. CGI
characters are ordinarily a good example of perceptual realism. Similarly,
music produced using sample libraries interacts with the hyperreal.
Sample libraries are able to create music that, even though it sounds
realistic, cannot be produced by physical means alone. By using
recordings from the physical reality and transforming them into interactive
virtual instruments, sample libraries engage with various models of reality
that have been transformed into a virtual sphere. This approach differs
from what synthesizers added to music, which is related to sound
expansion. Instead, with sample libraries, composers are able to interact
with models of reality, and transform them beyond what would be
achievable in the physical world, but still retain their cultural value as
physical artifacts. This is why sample libraries have a fundamental role in
the definition and the emergence of the hyperorchestra, which I will
discuss in the following section.
The Hyperorchestra
So far, I have described the essential elements that constitute the
technological devices that facilitate and expand the process of music
creation. In analyzing these different technological inventions, the
capability of expanding the available soundscape arose as one of their
key components. This process of sound expansion crossed the
boundaries of the physical world by the introduction of musical
synthesizers, and it opened itself up to the hyperreal with the
incorporation of the virtual instruments from sample libraries. In
considering these implications of the process of sound expansion, I
define the hyperorchestra as a virtual ensemble capable of incorporating
all of these new means of music creation. The hyperorchestra inhabits
hyperreality, as it goes beyond the physical world, yet it remains realistic.
By defining the hyperorchestra,77 I intend to encapsulate all the
processes of music creation that transcend the limitations of the physical
world. For example, a recording of a piano concerto, as described before,
77. As mentioned in Chapter I, the term hyperorchestra was created by joining the words "hyperreal" and "orchestra". From this term, I generated related words such as hyperinstrument and hyperorchestration. Composer Tod Machover has used the term hyperinstrument to refer to "designing expanded musical instruments, using technology to give extra power and finesse to virtuosic performers" (Machover et al., n.d.), and hyperorchestra to produce similar results with an orchestral ensemble.
would use hyperorchestral techniques to produce a physically impossible
sonic result.
Ontological Approaches for the Hyperorchestra
A definition of the hyperorchestra should begin by scrutinizing its
ontology in terms of the differences between the new ensemble and the
physical orchestral ensembles that preceded it. The process of sound
expansion does not suffice to differentiate the hyperorchestra from
traditional orchestral ensembles in terms of ontology. For instance, the
orchestra has historically expanded the variety of sounds it could produce.78
Thus, the orchestra in the 18th-century classical era79 had a much more
restricted sound palette compared to the 20th-century orchestra
employed by Ravel or Stravinsky. The latter extended its sound
possibilities by introducing new instruments as well as new instrumental
techniques. For example, Mozart did not use a full woodwind choir and,
similarly, he would not ask the string players to utilize the sul ponticello
technique as both Ravel and Stravinsky did. However, both ensembles
are considered orchestras. Moreover, a symphonic orchestra that
78. As stated before, by the addition of new instruments, the utilization of new instrumental techniques, and the expansion of the musical language.
79. The ensemble utilized by composers like Mozart or Haydn.
includes synthesizers will still generally be considered a symphonic
orchestra.
Likewise, a virtual performance80 of a classical orchestral piece
that employs orchestral sample libraries becomes only partially
hyperorchestral. In this situation, sample libraries might be used as a
means to replicate reality instead of extending it. Ideally, the virtual
performance could become indistinguishable from a recording of the
piece performed by live musicians. However, even in this scenario,
referring to the virtual ensemble as an orchestra becomes problematic.
From a viewpoint purely derived from the performance praxis, the two
performances are different because one did not originate from the
performance of a physical ensemble. Conversely, an aesthetic evaluation
that would only contemplate the sound of both performances would
conclude (considering the ideal case that the sampled version would
sound comparable to the recorded version) that they are two equivalent
pieces.
A deeper inquiry into the process of how both performances are
created reveals the underlying complexity of assessing whether these two
pieces of music are ontologically different in terms of the ensemble that
performed them, or, if otherwise, they are not. A virtual instrument from a
80. A performance created virtually utilizing computer software.
sample library is created by using multiple short recordings of an
instrument or ensemble. Each sample regularly captures a note
performed in a specific dynamic and articulation. Other samples may be
recorded to capture transitions between notes, or special effects, such as
crescendos or trills. There is a performance intention during the process
of the recording of the library by the instrumental performers, guided by
an overall aesthetic that the producers of the sample library aim to
achieve for that particular instrument. In essence, another
musician uses these sets of recordings, aided by computer software (the
sampler), to create a performance of the piece.
A recording of a physical performance is similarly achieved by
recording samples of the performance that are later edited and mixed
together by an audio engineer. From this viewpoint, the difference
between the two performances might lie in three different aspects. First,
the length of the samples will significantly differ. It is expected that the
physical recording is built by using longer samples (takes) and it is even
possible that the whole piece is recorded in a single take. Second, the
physical orchestra will record all the instruments that are playing at the
same time. Third, the performers of the physical orchestra are aware that
they are actually performing the piece that is being recorded. The first
difference is weak, as it relies on a vague definition of length. Thus, the
difference between both performances would rely on an arbitrary fixed
length of the sample or take. Furthermore, it is even possible that, in a
particular instance, the length of one take from the physical recording is
shorter than the length of a particular sample from the sample library. The
second difference would not apply to a solo instrumental performance; in any case, the number of instruments does not seem to have a significant influence on the evaluation of the differences between the two performances. Moreover, this criterion would incorrectly differentiate between orchestral recordings that use more than one recording space at the same time.
By a process of elimination, the third difference becomes pivotal in
order to differentiate the virtual ensemble from the physical one.
Following this line of thought, even when considering that both
performances sound equivalent, the ensembles that produced them are not equivalent, because the physical performers recorded for the sampled version did not actually perform the piece. Yet, this rationale does not fully resolve the
question, as there is actually a person (either a composer or a MIDI
programmer) who produced the performance of the piece. This situation
is similar to the CGI actors discussed before. In that case, a distinction
made between the performance and the acting (Prince, 2012, p. 102)
provided a theoretical background to describe this new typology of
actors. Analogously, a differentiation could be introduced in music,
distinguishing between the acts of playing and performing. In the case of
the virtual orchestra, the instrumental players are the ones mainly
responsible for playing particular notes or gestures,81 while the
composer or MIDI programmer would later produce a performance using
computer tools.
Consequently, the distinction between a virtual and a physical
orchestra lies in the possibility to dislocate the process of performance.
Moreover, this becomes a central feature for hyperreal music, as it
implies surpassing the limitations of the physical world that would not
allow delaying the performance from the moment when the instrumental
players produce the musical sounds. Similarly, this is the reason that
recorded music becomes, potentially, hyperrealistic.82 The process of
recording implies a posterior process of editing and mixing that could be
considered part of the performance of the piece. This seems analogous
to the process of movie editing and acting. However, while editing is an essential and visible part of the filmmaking process, editing the musical recording generally remains transparent and goes unnoticed.
81. Even though there is some performance involved, as I described before, most of the performance of the piece would be produced later.
82. Similarly, the electric nature of synthesizers opens the door to the hyperreal.
Therefore, the hyperorchestra becomes a specific medium,
ontologically differentiated from the orchestra and other musical
ensembles, due to its capability to transcend the physical world,
achieving a result that could not be accomplished by physical means.
The discussion above highlighted the blurred area that separates the
orchestra from the hyperorchestra. When creating a virtual version of a
piece that sounds like the equivalent of a live rendition, reality is
surpassed not by the sonic result but by the process used to generate it.
Thus, based purely on aesthetic terms, the virtual performance is not
different from the physical one, as it has not expanded the aesthetic
possibilities offered in a performance employing purely physical means.
Thus, if replicating physical performances were all that the
hyperorchestra could offer, the new ensemble would not be relevant in
terms of musical aesthetics.83 Fortunately, the possibilities of going
beyond the physical world have an actual impact on the aesthetic
possibilities of the music produced with this new ensemble. The following
chapters will concentrate on scrutinizing the aesthetics of the
hyperorchestra in music for the screen. For the remainder of this chapter,
I will continue to explore the boundaries of the hyperorchestra by
examining hyperorchestral elements present in live performances. This
83. It would still be incredibly relevant in terms of the creative process, economics, and its cultural implications.
exploration will provide the grounds to examine the ontological
implications of the orchestra in a broader sense.
The Hyperorchestra and the Live Concert
Julia Wolfe’s piece With a Blue Dress On (2010) is scored for a solo
violin and a prerecorded track. More precisely, the recorded track
contains exactly four separate solo violin tracks (Wolfe, 2012). In fact, five
violin players can actually perform the piece without employing the
prerecorded track, although this transforms the staging and the
performance impact. Thus, the sonic result emanates exclusively from the
solo violin performances played either live or from a recording. When
staged as originally conceived, the performance of the piece creates the
impression of a soloist who is able to play multiple violin parts at once. It
is not always clear what is performed live and what is recorded. This is
achieved by employing a set of sounds that could not be produced by the
same performer at the same time and that incorporate visual ambiguity:
for the average spectator, it is not clear which passages are played live
and which are not.84
In terms of the aesthetic intention, utilizing a solo violin with a
prerecorded track of multiple violin lines is similar to amplifying the soloist
84. A trained violinist will probably be able to discern which passages the performer is playing live.
in a solo concerto, thus magnifying the soloist’s power over the whole
orchestra. In both cases, the result projects a figure of the performer
who transcends what is humanly possible. Similarly, the aesthetic effect
of these prerecorded tracks is comparable to the audiovisual effect
produced by Paganini when he performed left-hand pizzicatos in one of
the variations of his Caprice No. 24. In Paganini’s case, the violin seemed
to be playing alone when producing the left hand pizzicato notes, as the
performer’s bow was not producing all the sounds. Paganini expanded
the violin technique to the limits of what was physically possible, at the
same time that he was aware of the audiovisual effect that a technique
like the left hand pizzicato would have on his audience. For this particular
variation, the spectator’s experience varies significantly if the performer is
not seen playing. This situation was not exclusive to Paganini: the pianist
Franz Liszt also created a similarly deceptive technique, commonly
known as the third-hand technique. These virtuosos established the
grounds for what I call a hyperinstrument, a virtual formulation of an
instrument that, even though it sounds realistic, produces a sound that
could not be achieved by physical means alone. For Liszt and Paganini, this was
achieved by being able to play their instrument in a manner that none of
their colleagues was able to perform, all the while remaining purely
physical.
In the case of Wolfe’s piece, the concept of a hyperinstrument is
even more pertinent, as it is physically impossible for a solo violinist to
perform all the music of the piece at once. As a consequence, With a
Blue Dress On becomes a musical piece that requires an audiovisual
experience to fully unfold85 in the same manner as Paganini’s Caprice. As
discussed before, Nicholas Cook’s theories confront a common
approach in music theory that neglects the importance of the
performance by concentrating mainly on the written score (Cook, 2013b).
The cases described above are evidence that the performance of those
musical pieces is audiovisual. A recording will capture all the musical
elements of the performance, but will lose the visual cues that contribute
to generating the whole meaning of the piece. The role of the soloist in
these pieces resonates with the mythical idea of the hero, an individual
who is able to achieve something that seemed humanly impossible. This
may be one of the reasons that humans enjoy watching a soloist perform,
in a similar manner that they enjoy hearing a narrative about a hero. In the
case of these pieces, the visual part of the performance is significant in
order to fully achieve a mythical status, where the soloist becomes the
hero able to play what seems impossible.
85. Remarkably, when an ensemble of violins performs the piece, it creates a different audiovisual experience, yet generates an equivalent musical output.
A similar situation occurs in a number of pieces by Dutch
composer Louis Andriessen. In Hoketus, the composer asks to amplify all
the instruments in order to be able to balance the dynamics (Everett,
2007, pp. 68-69). By altering the dynamics between the instruments of
the ensemble, Andriessen creates a sound result that also could not be
achieved without amplification. By extending this technique (which is not
exclusive to the composer), one might be able to balance the sound of a
flute in a soft dynamic with the sound of a louder trumpet. In this case,
the performance can be reproduced in reality. However, it is the resulting
sound balance that is not possible with a solely physical performance on
a concert stage. In this situation, the concept of a hyperinstrument arises
as the result of altering its balance compared to the other instruments of
the ensemble. A flute playing piano that is heard at a dynamic similar to that of a forte trumpet becomes a hyperinstrument, as this balance transcends what would be achievable by physical means alone.
The third and last example is hypothetical. Let us imagine an
orchestra performing in a concert hall that utilizes sample libraries as a
means for sound extension. At a certain moment, a sample of massive
amounts of brass plays in conjunction with the other instruments of the
orchestra. In another situation, string samples double the physical strings
in order to make them sound louder when compared to the other sections
of the orchestra, without altering the dynamic tone. The result would be
purely hyperorchestral, even though it appears to be presented in a live
concert. Moreover, this hypothetical orchestra may incorporate any of the
other two sets of techniques previously described: all the instrumental
sections of the orchestra could be amplified in order to further control the
balance at the same time that a hypersoloist could perform a physically
impossible part.
In this last section, I complemented the examination of the
interaction between music and hyperreality as a matter of perception by
exploring hyperorchestral possibilities in live concerts, where their
application is still nascent, even though these techniques are already
common praxis in music for audiovisual media. By
exploring these examples, a shared objective of surpassing human
capabilities appeared. This objective becomes a fundamental aesthetic
principle for the hyperorchestra, as this new ensemble fulfills the
aesthetic need of music and sound expansion to transcend the physical
world. This will be explored in much more detail in the following chapters
that are dedicated to the aesthetic foundations of the hyperorchestra.
CHAPTER VII
MIDI, SAMPLE LIBRARIES AND MOCKUPS
Introduction
Sample libraries have become an indispensable tool for
contemporary music creation, at least for music written for visual
media. As I mentioned in Chapter I, in the screen music professional
community, the term sample libraries commonly refers to the set of
virtual instruments that utilize a normally extensive collection of
samples in order to produce
sound. For example, a virtual instrument that reproduces a violin
ensemble playing a short staccato will be built using several samples for
each pitch that the instrument is able to play. The samples are not
designed to be used alone (they are normally not accessible to their
users), but to contribute to the programming of the virtual instrument. For
instance, some of the samples in these libraries are only meaningful when
they are programmed in conjunction with other samples (e.g. the sound
of a legato transition).
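The internal organization described above can be sketched in code. The following Python fragment (all names and file-naming conventions are hypothetical, not those of any actual library) shows how a short-staccato violin-ensemble patch might map each playable pitch to several samples, covering dynamic layers and round-robin repetitions so that repeated notes do not sound identical:

```python
# Illustrative sketch of a sample map for a staccato violin-ensemble
# patch. Each playable pitch maps to several recorded samples, covering
# dynamic layers and round-robin repetitions.

from dataclasses import dataclass

@dataclass
class Sample:
    filename: str      # audio file holding the recording
    dynamic: str       # dynamic layer captured (e.g. "p", "mf", "ff")
    round_robin: int   # repetition index for alternating samples

def build_staccato_map(pitches, dynamics=("p", "mf", "ff"), robins=2):
    """Build a pitch -> list-of-samples map for a short-note patch."""
    return {
        pitch: [
            Sample(f"vln_stacc_{pitch}_{dyn}_rr{rr}.wav", dyn, rr)
            for dyn in dynamics
            for rr in range(robins)
        ]
        for pitch in pitches
    }

patch = build_staccato_map(range(55, 58))  # G3..A3 as MIDI note numbers
print(len(patch[55]))  # prints: 6 (3 dynamics x 2 round robins)
```

The sampler then selects one entry from the list at playback time, based on the incoming velocity and on which round robin was used last.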
Due to their central role in contemporary screen music composition
and because of their specificities, this chapter provides an in-depth
exploration of the principles of these virtual instruments. Sample
libraries are built utilizing the Musical Instrument Digital
Interface (MIDI) protocol. In fact, the introduction of MIDI in 1983
facilitated the utilization of computer-aided technology for creating music.
MIDI provided the grounds for establishing a new paradigm for music
creation, separated from the musical score and from purely improvised
music. Sample libraries utilize MIDI as a means to communicate with the
composer. Thus, the interaction between MIDI and the composer is
dependent on how MIDI was originally defined.
As a consequence, I will begin this chapter by describing and
analyzing MIDI as a technological interface, along with exploring how
MIDI’s implementation relates to the models of Western musical practice.
It will serve as a foundation to define the importance and the influence of
such an interface in contemporary music production. Later, for the sake
of clarity, I will provide a short historical overview of the evolution of
sample libraries, which will lead to a description of their main technical
aspects. The second part of the chapter is dedicated to the presentation
of a survey of the main categories of sample libraries, exemplified by
some of the libraries that are now widespread tools for contemporary
composers.
Musical Instrument Digital Interface (MIDI)
In Chapter VI, I described how musical instruments mediate music
production. I argued that they act as an interface between humans and
the production of sound. I also suggested that the musical keyboard
contributed to establishing the 12-tone system as the standard system
for Western music. One of the keys to the success of MIDI as a musical
technology, which has already lasted more than 30 years without any
significant change in its definition, is that its design afforded a great
degree of flexibility, while at the same time it allowed a natural and
practical implementation of the Western canonical musical system. Thus,
MIDI is a remarkably flexible interface that is also extraordinarily practical
when used as an implementation of Western musical practice. Formally,
MIDI is a communication protocol, which implies that it is an interface
that allows for interaction between other interfaces. From this viewpoint,
MIDI might be compared to the musical score, which acts as a
communication protocol between two humans. Similarly, there are some
elements of the musical keyboard, which acts as an interface between
the performer and the actual generation of the sound, that might relate
conceptually with the inherent principles of MIDI. In other words, a
keyboard facilitates the communication between a performer and a range
of instruments that utilize the keyboard (piano, harpsichord, organ) in a
similar manner that a myriad of different electronic musical devices are
designed to employ MIDI in order to facilitate its use.
However, the differences between MIDI and the musical keyboard,
as well as the score, are numerous. For instance, as an interface, the
keyboard is fairly objective: two identical events will regularly generate
two identical sounds. The keyboard is strongly charged with cultural
connotations, however. Its 12-tone structure, divided into seven white
keys and five black keys, suggests a seven-tone organization, which
might condition the music produced with it. Technically speaking, on a
keyboard instrument it is generally easier to perform music that follows
Western tonal principles than music that does not. Similarly, the keyboard
prevents a flexible approach to the sound that it produces, as it only
allows the performer to press and release the keys (for instance, a violin
allows the performer to utilize many more techniques). For example, if the
performer wishes to expand the sounds produced by the piano, they
need to bypass the keyboard and directly interact with the strings.
Although they can utilize the keyboard after the alteration of the strings,
any further change will require bypassing the keyboard once again.
Moreover, changing the tuning, which would expand the sounds
produced by the piano, is impractical, as it requires a significant amount
of time.
On the other hand, the musical score is ambiguous, which
generates different interpretations for most of the terms (dynamics,
articulations, etc.). Utilizing the score as an interface implies the
employment of a system for interpreting its symbols based on a set of
cultural practices. Even though the design of the musical score is
culturally biased towards a Western classical musical model, it allows for
a greater degree of flexibility when compared to the keyboard. It is
possible to notate extended techniques, for example, without significantly
altering the most traditional notation. Instead, MIDI allows a great degree
of flexibility in an implementation that can be coded objectively. As a
communication protocol, MIDI is comprised of different types of
messages that are designed to serve diverse purposes. Appendix A
provides a general overview of the most relevant MIDI messages for the
present discussion and how they function: the MIDI note and the
Continuous Controller (CC).
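The two message types just mentioned can be illustrated concretely. The sketch below encodes them in the raw three-byte form that the MIDI specification defines for channel messages: a status byte carrying the message type and channel, followed by two 7-bit data bytes:

```python
# The channel messages discussed in this chapter, encoded as the raw
# bytes the MIDI specification defines. The low nibble of the status
# byte carries the channel (0-15); data bytes are 7-bit (0-127).

def note_on(channel, note, velocity):
    """Note On: status 0x90 | channel, then note number and velocity."""
    return bytes([0x90 | channel, note & 0x7F, velocity & 0x7F])

def note_off(channel, note):
    """Note Off: status 0x80 | channel (release velocity fixed at 0 here)."""
    return bytes([0x80 | channel, note & 0x7F, 0])

def control_change(channel, controller, value):
    """Continuous Controller: status 0xB0 | channel, CC number, value."""
    return bytes([0xB0 | channel, controller & 0x7F, value & 0x7F])

# Middle C (note 60) at a moderate velocity on channel 1 (0 on the wire),
# plus CC1 (the modulation wheel, often repurposed for dynamics).
print(note_on(0, 60, 96).hex())        # prints: 903c60
print(control_change(0, 1, 64).hex())  # prints: b00140
```

Note that nothing in these bytes says what note 60 should sound like; that association is left entirely to the receiving device, which is precisely the flexibility discussed below.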
What is remarkable about MIDI, and is probably the reason for its
wide success, is how the design afforded this high degree of flexibility
without compromising the practicality of utilizing the protocol just for
simple Western musical standards. The conjunction of flexibility and
practicality for Western music is achieved by a design that preserves
enough elements from a Western musical framework without limiting the
possibilities of the interface too much. In fact, MIDI has been used to
control and trigger live events that are not exclusively musical, which
proves its flexibility.
As has already been described, MIDI is a communication protocol
that acts as an interface between other interfaces. Figure 25 provides
a preliminary (and simple) visual representation of this communication:
Figure 25. Graphical schematic to represent MIDI communication.
The model implies that, by utilizing input interfaces,86 it is possible
to map their events into MIDI messages. Therefore, what is commonly
known as a MIDI keyboard is a physical interface that converts a set of
inputs produced by a musical keyboard into a set of MIDI messages. The
86. The four input interfaces do not represent the totality of possible input devices, although they are a relevant sample. In the case of the mouse, it is regularly used in conjunction with a graphical interface in a computer DAW (Digital Audio Workstation), which I will describe later.
mappings that convert physical gestures into MIDI values are arbitrary
and do not necessarily follow any specific pattern, although it is expected
that all MIDI keyboards will similarly map their events into a conventional
set of MIDI messages. Moreover, different inputting devices are generally
better suited for specific MIDI events. For instance, the faders in a digital
mixer naturally map Continuous Controllers. Similarly, the output
interfaces will convert the MIDI messages they receive into sound
depending on an arbitrary mapping that associates a sound with a
particular set of MIDI information, in the manner that best fits the needs of
the virtual instrument.
In an object-oriented programming language such as Java (Oracle
Corporation, 1995), an interface is a general set of methods (functions)
that an entity (a class) might implement. For example, there is an interface
called Comparable (this mainly means that the class can be compared).
Classes that implement this interface are required to define a function
called compareTo, which serves as a means to compare instances of classes
implementing Comparable. This approach is useful to discern how MIDI interacts with
some of its implementations. For instance, a MIDI keyboard will
implement MIDI by associating each of its keys with a MIDI command
that generates “Note On” events when they are pressed, and “Note Off”
events when they are released. In addition, it generates a velocity
value based on how fast the note was pressed. The MIDI keyboard might
not provide data for any of the Continuous Controllers, which will be
assumed to have a default constant value. Further, it is worth remarking
that this implementation does not assume any particular pitch to be
associated with any particular key or any particular sound to a defined
velocity value.87 From a conceptual perspective, it is worth inquiring how
Western music implements MIDI. For Western music, each MIDI note
value will be associated with a specific tempered pitch from the 12-tone
system. Velocities would generally be used as a means to signify
dynamics. In addition, a CC might be used to represent dynamic
variations in tandem with velocity. This is how notation programs such as
Finale (MakeMusic Inc., 2013) have traditionally generated sound.88 The
implementation might become more specific and determine that velocity
will serve as a dynamic for only percussive or short sounds, and that
sustained sounds will employ a CC to map a varying dynamic not fixed to
a particular note.
87. Although the physical structure of the keyboard implies a 12-tone based musical model, this is the system that will work most organically when employing a musical keyboard.
88. Notation programs incorporate MIDI processing that translates score abbreviations, such as tremolos or trills, into a set of MIDI notes. This means that during playback they generate a new set of MIDI messages. For example, a tremolo assigned to a note will be translated into multiple repetitions of the note during playback.
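The Western implementation described above can be sketched as follows. The mapping of note numbers to equal-tempered pitches (A4 = note 69 = 440 Hz) is the standard MIDI convention; the velocity boundaries chosen for the dynamic markings are an illustrative choice, not part of the protocol:

```python
# Sketch of a "Western implementation" of MIDI: note numbers become
# 12-tone equal-tempered pitches, and velocity ranges become dynamics.

def note_to_frequency(note, a4=440.0):
    """Equal temperament: each of the 12 semitones is a factor 2**(1/12)."""
    return a4 * 2 ** ((note - 69) / 12)

# Illustrative velocity ceilings for each dynamic marking.
DYNAMICS = [(16, "pp"), (48, "p"), (80, "mf"), (112, "f"), (127, "ff")]

def velocity_to_dynamic(velocity):
    """Map a 0-127 velocity onto a conventional dynamic marking."""
    for ceiling, marking in DYNAMICS:
        if velocity <= ceiling:
            return marking
    return "ff"

print(round(note_to_frequency(69)))      # prints: 440 (A4)
print(round(note_to_frequency(60), 1))   # prints: 261.6 (middle C)
print(velocity_to_dynamic(96))           # prints: f
```

A more specific implementation, as noted above, would reserve velocity for short or percussive sounds and drive sustained dynamics from a CC lane instead.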
The effect of MIDI
Alexander Galloway (2012) succinctly describes interfaces as
“those mysterious zones of interaction that mediate between different
realities” (Preface, Par. 1), which implies that “interfaces are not things,
but rather processes that effect a result of whatever kind” (Preface, Par.
1). The effect of employing a Western musical system (as an interface)
has been described in the previous chapter. Here, I will attempt to
describe the effect that MIDI has on music generation and how this effect
has thus far been utilized by composers. Earlier, I began the discussion
by stating that MIDI usage was flexible. Moreover, I suggested that MIDI
is also mostly practical to encode music generated by using Western
practices. The flexibility of MIDI might imply that its effect is minimal,
although I believe that it cannot be overlooked. In addition, some of
MIDI's flexibility might work against user-friendliness. I will analyze the
effects of MIDI in three main areas. The first revolves around how a broad
definition clashes with the limited human capabilities in terms of
multidimensional thinking. It is difficult for a human to imagine a
multidimensional space that would be represented by several CCs.
Hence, imagining the combined effect that even five or six of these
controllers might generate becomes challenging. From a practical point of
view, this system forces interfaces, such as the software present in
sample libraries, to generate a layer to automatically negotiate some of
the sound parameters that would otherwise overwhelm their users.
The second area revolves around the centrality of the MIDI note in
the protocol and its consequences for how an instrument is defined.
Moreover, Continuous Controllers are independent of the notes. As a
consequence, virtual instruments that implement MIDI become a very
specific sound device, detached from a more organic approach to
instrument creation.89 For example, a violin tremolo will become a
different instrument than a pizzicato violin, although they might represent
the same physical instrument. This implies that a musical passage that
includes a violin playing a sustained sound and a left-hand pizzicato will
necessarily become two different instruments from a MIDI perspective.
Moreover, the necessity of polyphonic dynamic variation might force one
to employ two or more MIDI instruments in order to separately modify the
CCs that affect each of the lines. In employing MIDI, the music creator is
forced to detach from the physical source that produced the sound and
its cultural implications (such as a violinist performing several string
89. There are techniques that partially solve the instrumental-techniques problem by employing unused MIDI notes (key switches) or CCs in order to trigger different types of articulation sounds (pizzicato, tremolo, legato, etc.). However, at a conceptual level, these are still different instruments that are triggered together in the same MIDI instance by using a sort of switch.
techniques) and concentrate on the sound effect on its own. From this
viewpoint, MIDI forces its users to think virtually and purely sonically.
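The key-switch technique mentioned in the footnote can be sketched as follows. The particular key assignments are hypothetical (each library defines its own), but the mechanism is representative: unused low notes select which articulation subsequent notes will trigger, so several conceptually distinct instruments live inside one MIDI instance:

```python
# Illustrative key-switch sketch: notes 24-26 (below the playable range
# in this hypothetical patch) select the articulation; all other notes
# sound using whichever articulation was selected last.

KEY_SWITCHES = {24: "sustain", 25: "pizzicato", 26: "tremolo"}

def render(events):
    """Walk a stream of (note, velocity) events, interpreting key switches."""
    articulation = "sustain"  # default articulation on load
    played = []
    for note, velocity in events:
        if note in KEY_SWITCHES:
            articulation = KEY_SWITCHES[note]  # switch; produces no sound
        elif velocity > 0:
            played.append((note, articulation))
    return played

stream = [(60, 100), (25, 1), (62, 100), (24, 1), (64, 100)]
print(render(stream))
# prints: [(60, 'sustain'), (62, 'pizzicato'), (64, 'sustain')]
```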
The last area refers to a basic design decision: MIDI does not
transport sound, just messages. Sound processors, such as
reverberation effects, utilize an audio input to generate an audio output.
Thus, the MIDI protocol does not serve as the correct interface for such
musical devices. Therefore, sound processors (e.g. equalizers) require
another interface design in order to properly integrate into the workflow of
the digital music creation. This interface is integrated in the software that
commonly serves as the platform to negotiate with digital sound creation:
the Digital Audio Workstation (DAW). Hence, once the output MIDI
interface generates the sound, the user has the opportunity to further
modify and interact with it. This approach facilitates the integration of
recorded sounds without the need for the implementation of a MIDI layer
to interact with them. Figure 26 expands on the previous graphical model
to incorporate some refinements to the model as a consequence of this
fact. For the sake of clarity, I have only incorporated sample libraries as
an example of an output interface. The graphic shows the different layers
of mediation between the inputs from the composer and the people that
recorded and created the library. In addition, it shows the dual input
process that the composers are afforded in terms of MIDI inputting and
sound processing.
Figure 26. MIDI communication and human interaction.
To finalize the discussion on the implications of the utilization of
MIDI as an interface that interacts with the creative process, I will
describe the most widespread notation system used to work with MIDI data inside a Digital Audio Workstation (DAW), exemplified here by Logic Pro (Apple Inc., 2013). All the most common DAWs incorporate a variation of what, in Logic Pro, is called the piano roll.90
comparable to the musical score, utilizing the piano roll allows the
composer to be more specific in terms of the desired resulting sound.
However, this comes at the cost of a decreased level of readability and
the inability to write in vague terms (such as crescendo) that would be
later interpreted.
Figure 27. Screenshot of Logic Pro X piano roll window.
The piano roll (Figure 27) is divided into two sections that are tied to a temporal matrix: the note area, whose labels incorporate velocity values, and the Continuous Controller area, which is multidimensional (it shows the information of different CCs, one at a time). The piano roll view highlights the note-centric design of MIDI while affording the user multiple types of control.

90 More information on similar systems can be found in Pejrolo & DeRosa (2011, pp. 76-83).
Beyond the Mock-Up: Overview of the Evolution of Sample Libraries
The term mock-up has been ubiquitous in the screen music
industry for the past 15 years. It defines the computer simulation of music
for the movies. The book On the Track (2004), which was created by
professionals in the screen music industry in an attempt to describe their
practices, defined the term as the “electronic or acoustic audio
replications of the music (sometimes a blend of both) varying in quality
from rough demos to finely polished performances” (Karlin & Wright,
2004, p. 762). This definition dates back to 2004, which is important when
considering how quickly this has evolved. In fact, the relevance of this
definition nowadays is mostly historical. In a similar manner, composer
Hans Zimmer (Vary, 2013) discussed in an interview the incorporation of
the mock-up as a tool to communicate between the creative teams of a
movie, which started soon after his arrival in Hollywood in the late 1980s:
When I first came to Hollywood, most people were still writing
[music] on pieces of paper,” he says. “The first time a director
would actually get to hear something would be when the orchestra
was wheeled in, which I didn’t think was very efficient. I mean,
there’s a huge emotional distance playing somebody something on
a piano and shouting at them, ‘This is where the French horns
come in!’ as opposed to at least [playing] an imitation of the
French horns coming in. (Par. 16)
Karlin and Wright's definition includes the possibility of the
recording of certain instruments that would be included as part of the
mock-up. These instruments would generally be recorded in the
composer’s own studio and integrated into the electronic track. At the
heart of the mock-up process are the sample libraries. Their incredible
evolution over the past two decades directly governs the progression of
the concept of a mock-up.
As I outlined in Chapter VI, a sample library consists of a collection
of recordings from an instrument, group, or section, that are organized in
order to be able to virtually reproduce the sound of that instrument91 by
interpreting the MIDI information. As I will discuss below, there are
several techniques associated with the creation of a sample library. For
instance, each note might be recorded multiple times, in multiple
dynamics, with multiple levels of vibrato, multiple articulations and using
multiple techniques. In addition, the transition between two notes might
similarly be recorded. With this aggregate of information, a piece of
software called a sampler is built and scripted in order to generate the sounding result. The evolution of sample libraries has been exponential, similar to the pace at which computers have evolved.

91 In this instance, instrument refers to the specific sound entity defined before (e.g. violin pizzicato), not to a specific physical instrument.
Hans Zimmer has released some of his mock-ups, which are
integral to his process of screen music writing. Zimmer creates a musical
suite, inspired by the themes of the movie, which serves as the basis for
the discussion of the music with the movies’ creative team (Hurwitz,
2011). In his album More Music from the Motion Picture “Gladiator”, Hans
Zimmer released a track called The Gladiator Waltz (Zimmer, 2001).
Similarly, in the deluxe edition of the soundtrack album for The Man of
Steel (2013), the composer released a track called Man of Steel Hans’
Original Sketchbook (Zimmer, 2013). In comparing the recordings of both
pieces of music from the leading composer in creating highly realistic
mock-ups in the 2000s, the evolution that sample libraries have
experienced in a little more than ten years is apparent. In this evolution,
the mock-ups have become something more than just mock-ups. Now,
they generally constitute a significant part of the resulting hyperorchestral
sound (as they are part of the finished soundtrack along with other
recordings), which invalidates their qualification as simple mock-ups. As a
result, the mock-up has evolved from being ubiquitous during the screen
music composition process to permeating the final product, thus
generating a new sonic model for music that has become fundamental in
the creation of the hyperorchestra.
Technical Generalities of Sample Libraries
As mentioned, the sampler is the software responsible for
interpreting MIDI information, selecting and processing the sampled
sounds of the library, and generating the sound result. In this section, I
will define some of the most salient features that samplers incorporate in
order to generate sophisticated sound outputs. The figure below shows an
abstract graphical representation of a sampler that is playing a legato
string instrument. The information from CC1 (Continuous Controller
number 1) is employed to signify the amount of vibrato in the sound.
Similarly, CC11 controls the dynamic. In this case, the velocity has no
effect on the resulting sound, in a similar manner to the rest of the CCs
that are not programmed by the sampler. The theoretical sampler of
Figure 28 possesses a total of 12 different sounds for each note, in a grid
that comprises different dynamic and vibrato levels. To achieve a
particular sound from the pair of numbers received from CC1 and CC11,
the sampler mixes a set of these sounds in different proportions in order
to generate the final sound. When a note changes, a legato transition is
triggered to perform the actual sound of a note transitioning to another.
Similarly, the end of a line triggers a note release sound.
Figure 28. Conceptual graphical representation of the structure of a
virtual instrument inside a sampler. It receives MIDI inputs that are used
to decide which sound samples to trigger, and in which amount, as
output sound. In this example, CC1 is used to decide the mix of vibrato
samples, whereas CC11 is used to decide the mix of dynamics. The
combination of these two values will serve to decide the amount of signal
that each of the samples will contribute to the final result. In addition,
there is another set of samples triggered at special occasions. For
instance, when a Note Off message is received, the sampler will trigger a
note release sound. When the sampler detects two notes at the same
time (assuming that the virtual instrument is a monophonic legato
instrument), it will trigger a legato transition between both notes, followed
by the corresponding mix of samples for the last note that was played.
Following this general definition of a sampler, I will describe some
of the most common sampling techniques present in contemporary
sample libraries. They will serve to inform the previous discussion on
some of the most relevant sample libraries that will exemplify my
typology.
Dynamic Layering and Crossfading
In the design of sample libraries, crossfading is the technique of
employing a discrete set of sounds and mapping them onto an array of
numbers in order to generate a more realistic approach to the sound
produced at different musical dynamics (an instrument playing forte will
not only sound louder, it will sound timbrically different than when played
softer). An evolution of the crossfading technique, which might be defined
as musical dynamic layering,92 dynamically mixes different amounts of the
sounds recorded at different musical dynamic levels in order to provide a
much more varied timbre. In the previous figure, crossfading would imply
the selection of one of the sounds on the table depending on the values of the CCs, whereas dynamic layering would mix the different sounds depending on the CC values. The following hypothetical table (Figure 29) aims to clarify this process. The percentages express the amount of the original sound that will permeate into the resulting sound, in relationship with a CC value input.

92 The term crossfading is still being used when referring to this technique.
CC value    p       mp      f       ff
1           5%      0%      0%      0%
30          80%     10%     0%      0%
60          5%      70%     5%      0%
90          0%      15%     70%     5%
100         0%      5%      100%    15%
110         0%      0%      70%     50%
Figure 29. Hypothetical example of dynamic crossfading. The figure
shows how the mix of each of the samples dynamically varies depending
on the CC value. The percentage refers to the amount of the signal from
that layer that will go to the final mix. For instance, a CC value of 1 will
output almost no sound, all of it coming from the piano (p) sample. This is
because the output should represent the quietest sound possible in the
instrument. At values around 60, the sound should become close to an
mp dynamic. This is why most of the sound comes from the mp dynamic
layer. These values will vary for each CC number, dynamically mixing all
the dynamic layers accordingly.
The advantage of musical dynamic layering is that the sampler is able to simulate a much wider musical dynamic range than the one that was originally recorded, by mixing the recorded layers in different amounts. Thus, it creates smooth and timbrically varied dynamic transitions despite having recorded only four different dynamic states.
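The weighting logic behind dynamic layering can be sketched as linear interpolation between the CC anchor rows of the hypothetical Figure 29 table. A real sampler applies these weights to audio signals and its exact crossfade curves are proprietary; this sketch only illustrates the principle.

```python
# Sketch of dynamic layering: mix weights for four recorded dynamic
# layers, linearly interpolated between the CC anchor rows of the
# hypothetical Figure 29 table. A real sampler applies these weights
# to audio signals; this only illustrates the weighting logic.

# CC anchor -> weight of each dynamic layer (values from Figure 29)
TABLE = {
    1:   {"p": 0.05, "mp": 0.00, "f": 0.00, "ff": 0.00},
    30:  {"p": 0.80, "mp": 0.10, "f": 0.00, "ff": 0.00},
    60:  {"p": 0.05, "mp": 0.70, "f": 0.05, "ff": 0.00},
    90:  {"p": 0.00, "mp": 0.15, "f": 0.70, "ff": 0.05},
    100: {"p": 0.00, "mp": 0.05, "f": 1.00, "ff": 0.15},
    110: {"p": 0.00, "mp": 0.00, "f": 0.70, "ff": 0.50},
}

def dynamic_mix(cc):
    """Return the layer mix for a CC value by interpolating the table."""
    points = sorted(TABLE)
    cc = max(points[0], min(points[-1], cc))  # clamp to the table range
    lo = max(p for p in points if p <= cc)    # anchor row at or below cc
    hi = min(p for p in points if p >= cc)    # anchor row at or above cc
    if lo == hi:
        return dict(TABLE[lo])
    t = (cc - lo) / (hi - lo)
    return {layer: TABLE[lo][layer] + t * (TABLE[hi][layer] - TABLE[lo][layer])
            for layer in TABLE[lo]}
```

Every one of the 128 CC values therefore yields a distinct blend of the four recorded layers, which is what produces the timbral variety described above.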
By combining diverse CCs, the dynamic layering technique could
become multidimensional, as was the case in Figure 28. Each dimension
increases the required computational power exponentially as well as the
number of samples required, which might limit its generalization to more
than a few dimensions. More interestingly, by dynamically layering, it is
possible to achieve a wide variety of sounds that would not be possible
to achieve by a physical string player or a string section. With the current
definition of MIDI, dynamic layering generates 128 distinct dynamics,
each producing a different timbre for each note. Combining two CCs to
achieve a variety of dynamics and vibratos, the sampler is able to achieve
16,384 (2^14) different sounding states. This level of detail, which surpasses
what performers can consciously achieve, allows the virtual composer
that utilizes these tools to produce music that varies in a similar manner
as how, unconsciously, a physical musician produces music. At the same
time, it opens the door to a much more sophisticated palette of sound
variations that would not be achievable by physical instruments.
Round Robin
“Round robin” is a common computational technique that allows
the distribution of CPU time evenly among processes. In samplers, this
computational technique is adapted to be able to employ a pool of similar
sounds for the same note (Rogers, Phoenix, Bergersen & Murphy, 2009,
p. 24). The technique is intended to overcome an acoustic effect produced when the same sound is repeated multiple times in a short period of time: the brain recognizes that the same sample is being triggered repeatedly. This effect, commonly known as the machine gun effect, is sonically unpleasant in orchestral movie scores. Multiple samples of the same note
at the same dynamic are recorded to avoid repetition of the exact same
sample. Instead, the repetition occurs after eight, 11 or even 16 iterations.
Round robin is primarily used in short or percussive sounds, as they are
the ones that are more susceptible to being repeated at a similar dynamic
range. As a consequence, the same MIDI note will result in slightly
different sounds, depending on which sample is triggered from the round
robin chain.
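A round-robin chain can be sketched as a simple cyclic pool of alternate takes. The class and sample names below are hypothetical placeholders; as noted above, real pools hold eight or more takes.

```python
# Illustrative round-robin chain: cycle through alternate recordings
# of the same note so that rapid repetitions never trigger the exact
# same sample (avoiding the machine gun effect). Sample names are
# hypothetical placeholders; real pools hold eight or more takes.
class RoundRobin:
    def __init__(self, samples):
        self.samples = list(samples)  # alternate takes of one note
        self.index = 0

    def next_sample(self):
        """Return the next take, wrapping around when exhausted."""
        sample = self.samples[self.index]
        self.index = (self.index + 1) % len(self.samples)
        return sample

rr = RoundRobin(["staccato_C4_take1", "staccato_C4_take2",
                 "staccato_C4_take3"])
hits = [rr.next_sample() for _ in range(4)]
# only the fourth hit reuses the first take, after the pool is exhausted
```

This is why the same MIDI note yields slightly different sounds on each repetition: the incoming note chooses its sample by chain position, not by pitch alone.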
Although sustained sounds in sample library virtual instruments are
actually achieved by looping a short sample, they rely on the user’s
manipulation of the CCs in order to provide the necessary variety to
prevent the brain from recognizing the loop. However, this is not possible
in most of the short repeated note passages, which intentionally aim for a
similar dynamic across all of the notes. This is why round robin is
normally implemented in short-duration techniques such as staccato. For
instance, this is how short articulations are built using the program
EastWest Hollywood Strings (Rogers, Phoenix, Bergersen & Murphy,
2009, p. 38).
Legato Transitions and Crossfading
A key element that constitutes the legato effect is the sound
produced when the performer changes from one note to another. Legato
describes a musical practice that involves the performer connecting the
sound of two consecutive notes. As a consequence, the legato notes
lose their attack in favor of a transition sound between each of the
intervals. Realistically, this is only achievable between notes in certain
ranges in bowed and wind instruments. The nature of a percussive
instrument, such as the piano, does not offer the possibility of physical
legato. In the piano, each note needs to be attacked in order to produce
sound. However, pianists are required to play legato regularly in most of
the pieces in the Western piano literature. In practice, the pianists
simulate the legato by overlapping the notes, hoping to mask the attack
of the subsequent note by the sound of the previous. Thus, the piano (or
any other percussive instrument) does not produce a special legato
transition sound that needs to be specially recorded when creating a
realistic sample library.
If sample libraries only record the individual sound of each
possible note (even multiple times and in multiple variations), the legato
effect is lost. A common technique to simulate legato involves slightly
overlapping the consecutive notes in a legato line. The effect produces an
instant when both notes are being reproduced simultaneously in order to
simulate the legato transition sound. However, this technique often fails to even approximate the sound of the legato transition. This is
why sample libraries have incorporated recordings of the sound
produced when a note transitions to another. The number of sounds required within contemporary sample libraries is therefore extensive; it is
necessary to record the sound transition between each note playable by
the instrument with all the other notes, in order to properly reproduce the
transition between them in both directions. Generally, sample libraries
only record the transition between notes within an octave, which is a
realistic approximation of the possibilities of legato in most instruments. For
instance, there would be recordings of transitions from C3 to all the notes of the instrument’s register within an octave of it. Additionally, the
transitions might be recorded at different dynamic levels and with
different legato speeds.
The legato transition is triggered when the sampler receives an
overlapping of two MIDI notes. Instead of playing both notes at the same
time, the sampler crossfades (literally speaking) the currently sounding
sound with the appropriate transition that corresponds to the notes that
are being pressed. Then, the legato transition sound is crossfaded into
the sound of the second note. Therefore, legato instruments are
necessarily monophonic, as it is difficult to program a system that could
differentiate between overlapping due to a legato intention or due to
polyphony.
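The overlap rule described above can be sketched as follows. This is a conceptual illustration only; the class, event handling, and sample labels are hypothetical, and the log stands in for the samples a sampler would actually play.

```python
# Sketch of the overlap rule: in a monophonic legato instrument, a
# Note On that arrives while another note is still held is read as a
# legato transition, not a chord. Class and sample names are
# hypothetical; the log stands in for the samples a sampler would play.
class LegatoInstrument:
    def __init__(self):
        self.held = None  # currently sounding pitch, if any
        self.log = []     # samples that would be triggered, in order

    def note_on(self, pitch):
        if self.held is None:
            self.log.append(f"attack_{pitch}")
        else:
            # overlap detected: crossfade into the recorded transition,
            # then into the sustain of the new note
            self.log.append(f"transition_{self.held}_to_{pitch}")
            self.log.append(f"sustain_{pitch}")
        self.held = pitch

    def note_off(self, pitch):
        if self.held == pitch:  # only the sounding note can be released
            self.log.append(f"release_{pitch}")
            self.held = None

violin = LegatoInstrument()
violin.note_on(60)   # isolated attack
violin.note_on(62)   # overlaps the held 60: legato transition
violin.note_off(60)  # the superseded note is ignored
violin.note_off(62)  # end of the line: release sample
```

The sketch also shows why such instruments must be monophonic: any overlap is consumed as a transition, leaving no way to express a chord.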
Multiple Performance Techniques
It has already been implied that a sample library maps the performance techniques of a physical instrument or ensemble as separate virtual instruments. For example, a violin sample
library will regularly incorporate legato, sustain, staccato, pizzicato,
harmonics, tremolo etc. However, the performance techniques might
extend beyond what should be considered regular string techniques.
There might be different types of staccatos that respond to different
aesthetic intentions. Similarly, there might be different types of legatos.
Some of these performance techniques carry strong connoted meaning,
as I will describe below.
Sound Perspectives
The concept “sound perspectives”93 refers to the utilization of
different microphone positions in order to represent how a particular
instrument might be heard from different locations. This is important
because, as I will later analyze, the effect of microphone placement
significantly alters the recorded sound. Sample libraries generally contain
three or four different perspectives for the user to mix. They add an
additional layer of sound variation based on a hypothetical and
hyperrealistic sonic placement of the instrument.94 Each perspective has
the exact same amount of sample content. Therefore, a sample library with four
perspectives will multiply the amount of disk space needed for the library
by four. If all the perspectives are employed, the computational
requirements will also multiply by four.
The utilization of sound perspectives reveals how different spaces influence the generated sound of an object. These physical interactions between the space and the sound are too complex to be fully modeled digitally yet. This seems to be the reason that most of the sample libraries produced nowadays do not record “dry” sounds95 and, instead, present different sound perspectives to the user.

93 Here, I am only providing a general definition. Sound perspectives will be thoroughly analyzed when defining hyperinstruments.

94 It might not be possible to achieve the resulting sound in the physical world due to the total freedom of mixing different perspectives.
Connotation and Cultural Codes
As products created primarily by Western companies,
sample libraries routinely incorporate sounds from around the world that
are highly coded in Western cultural tradition:
Virtual instrument libraries constitute a broad, yet selective sonic
ethnography spanning popular, traditional, and world cultures.
They provide an expansive pool of sounds that all commercial
media composers draw from. Their file names reflect the practical,
prejudiced, and esoteric: “viola solo legato mp,” “tundra travel,”
“Singapore squeak,” and “Jihad” (an evolving soundscape
combining dark “Middle Eastern” timbres, a driving rhythmic loop,
and a male chorus chanting “Arabic” phonemes). […] Acoustic
libraries are packaged by related instruments: “Galaxy Pianos,”
“Ministry of Rock,” and “Symphonic Orchestra.” In the case of
non-Western instruments, they are often assembled as an
aggregate of culturally related sounds, such as “Silk” and “Desert
Winds,” which contain ethnic “eastern” instruments that may
encompass the music of entire continents. (Sadoff, 2013)
Hence, the importance of the connotations and their influence on
the composition process should not be overlooked. Sample libraries are
not ideologically neutral, as they provide an aesthetic that emanates from a particular cultural viewpoint.96 A tendency towards the design of connoted instruments is fairly common in current sample libraries. However, there are libraries that specifically draw from a clearly connoted meaning in order to generate highly coded instruments. The case of 8Dio’s vision for a string library in Adagio Violins (8Dio, 2011) is significant. Their legato sounds include the following:

- Extra Terrestrial Legato
- Perdition Legato
- Adagio Legato
- Schindler’s Legato
- Lost Legato
- Instinct Legato
- Village Legato

95 A dry sound is recorded in a studio that intends to cancel any possible reverberation effect as much as possible, by employing special sound treatment and microphone positions.
The approach to the design of their string library legatos expressly
differs from the widespread focus on legato transitions and multiple
crossfades. Instead, they provide a finite set of performance clichés that
are recorded integrally. They are aptly named to refer to the specific
scores of movies or television shows. Instead of recording fairly objective
legato transitions, the library provides the transition and the subsequent
note, which generates a more personal and expressive sound than just the brief moment of the transition. This process of sampling affords the ability to ask the performers to generate specifically coded performances. Adagio Violins uses the round robin technique to provide a varied amount of note performances. As a consequence, their instruments are much less malleable, although they offer a much richer experience when used within their fairly small connoted world.

96 This concept will be central in Chapters VIII and IX, when discussing an aesthetic for the hyperorchestra.
In the remainder of this chapter, I will describe and exemplify (with
a small selection of relevant libraries) what I believe are the five most
common approaches to sample library design, which correlate with their
intended connoted meaning. First, I will describe the libraries that attempt
to replicate orchestral instruments, followed by libraries such as the
Adagio Violins that intend to generate highly coded orchestral sounds.
Later on, I will describe the libraries that explore instruments outside the
Western orchestral canon, followed by a very specific type of libraries
dedicated to epic percussion. Although most of the drums in epic
percussion libraries come from outside the Western tradition, these
libraries have achieved a high degree of specificity, which justifies an
isolated analysis. To conclude, I will explore libraries dedicated to
generating hybrid synthesized virtual instruments that merge physical
sounds with electronic processing.97 These libraries do not intend to have any specific connection with physical instruments, although they still preserve some links due to the codification of their source sounds.

97 See Figure 24 in the previous chapter.
Replicating the Orchestra
One of the main design affordances of sample libraries has been the ability to reproduce the symphonic orchestra by individually generating
virtual versions of all its instruments and sections. This goal is partially
utopian, as it negotiates with an idealistic model of the orchestra as a
musical ensemble for Western culture. In fact, it is not possible to
establish a unique orchestral sound, and it is even less possible to define
a unique recorded orchestral sound. The size of the ensemble,98 the hall,
and the recording techniques employed are variables that have a major
effect on the final sound. In addition, the orchestra is constituted by a
diverse group of instruments that combine differently. For instance, a solo
horn performing a note sounds different to four horns playing the same
note. Moreover, four solo horns recorded individually, and then mixed,
sound different to recording the four horns together. The sound difference
is less significant when different instruments interact, however. A flute
and an oboe playing together do not sound significantly different to how
they sound when recorded separately, although they will not sound exactly the same. The difference between an orchestral sound for Mozart and one for John Williams is not only a matter of ensemble dimensions; it is also aesthetic. Each composer has an associated set of performance practices that affect the performing and recording techniques employed, which ultimately affects the resulting orchestral sound.

98 An orchestra for a Mozart symphony is vastly different from an orchestra needed to perform a Mahler symphony.
As a result, attempting to generate a sample library that
reproduces an idealistic model of the Western orchestra becomes a
chimeric enterprise. Hence, the production of a sample library in this
paradigm generally begins by selecting a particular overall aesthetic for
the orchestral sound. For practical reasons, most of the libraries aim to
replicate the Hollywood Orchestra, although this is not the only possible
approach. For example, the Vienna Symphonic Library (VSL, 2004) aimed
for a romantic concert orchestral dry sound. In having to choose
an encompassing aesthetic framework, the designers of sample libraries
negotiate between the desire to generate an objective and versatile
orchestral sound and the need to adhere to a codified set of principles.
Moreover, the concept of the Hollywood Orchestra is neither static nor
universal for the music written for audiovisual media. As Sadoff (2013)
asserts, a contemporary library such as EastWest’s Hollywood Strings,
which is now a section of their Hollywood Orchestra (EastWest Sounds,
2014), “no longer reflects the live sound aesthetics of the concert hall or
the Hollywood sound of earlier generations” (Reappropriating Genres and
Codes, Par. 1).
Furthermore, this sound is broadly modeled after iconic orchestral
pieces that appear in blockbuster movies, especially in their foremost
epic moments. Although this modeling decision does not imply the
impossibility of generating more delicate sounds (the sample library still
intends to be as objective and as versatile as possible), the overall
aesthetic bias of the makers of libraries such as Hollywood Orchestra
towards epic movies might still be noticed, especially in their sectional
sounds. This is why another sample library company, Spitfire Audio,
created a string library within the Hollywood orchestral paradigm, while
aiming for a much more intimate sound. Spitfire’s Sable Strings (Spitfire
Audio, 2012) recorded a small (16-player) string section that, although
preserving the recording principles of the Hollywood orchestral recording,
aspired to a higher level of definition in the sound. By employing a
sequencing technique known as layering, which is the manual version of
dynamic layering described above, a composer can mix both approaches
to Hollywood string sampling to generate a personal and dynamically
evolving sound (Spitfire Audio, 2012).
Analyzing EastWest’s Hollywood Orchestra
For the sake of precision, I will analyze one of the most common
libraries from this paradigm that aims to replicate the orchestra. However,
most of the concepts discussed below apply to the majority of similar
libraries (e.g. LA Scoring Strings, Spitfire BML, Cinesample Orchestral
sounds, etc.) released in recent years. First of all, the design decisions of
the library could easily be qualified as modernistic or tied to structuralism.
They assume that the resulting sound of an instrument or section can be
modeled by a finite set of parameters. For instance, string players
produce a bowed sound by deciding, in addition to the pitch, the amount
of pressure that they apply to the bow and the amount of vibrato they
produce by slightly moving the left-hand finger that is helping to produce
the pitch. A more sophisticated approach might introduce the speed of
the bow movement and bow changes. Thus, the premise for constructing
the library is that a set of finite parameters can effectively describe the
sound production of the instrument in a particular cultural performance
practice framework. Figure 30 represents how EastWest’s Hollywood
Orchestra generates a string section sustained sound by using a definite
series of inputs.
Figure 30. Graphical representation of Hollywood Orchestra’s input
parameters for a string ensemble sustained sound (Rogers, Phoenix,
Bergersen & Murphy, 2009).
Each pitch from the 12-tone Western scale is assigned to a MIDI
note. By employing a set of four MIDI notes that are not utilized for
pitches (the range of any physical instrument generally spans fewer than 128
semitones), the library adapts the performance to four different sets of
finger positions, which means that higher positions will employ the lower
strings for more pitches. CC11 generates musical dynamics by
representing the concept of bow pressure. CC1 is used to define the
amount of vibrato of the notes.99 This means that each note had to be recorded at several dynamic levels and with different amounts of vibrato. In part, this approach correctly represents the parameters of a string performance: the performer mainly controls the position and movement of the left-hand fingers and the bow. The performer is generally unable to maintain an exact amount of vibrato and bow pressure, which is what generates an expressive and varied sound performance. The absence of these nuances in the vibrato and dynamics would generate a performance closer to that of the Toyota Robot100 (DiagonalView, 2008). In fact, the Robot’s performance sounded closer to a digital rendition than it did to a live performance. Consequently, the success of this approach to sample libraries depends on the composer providing these small variations by varying the CC values, in order to generate the necessary timbral variations typical of a human performance.

99 The corresponding sound in brass instruments does not offer this variation, as it would not be a common practice in brass instruments. In their case, the amount of vibrato correlates with the dynamic, which can then be controlled with just one CC utilizing a one-dimensional array of samples.
There are certain aspects of the performance that are either
processed automatically or that are fixed during the recording process.
For example, the sustain sounds offer the opportunity (for the strings) to
utilize the round robin technique in order to automatically alternate
between up and down bows. Similarly, the possibility to select the actual
string that a note will be played on (finger positions) is fairly limited to
common practices, and it is decided automatically by the sampler depending on the position.

100 The Toyota Robot’s performance can be seen on YouTube: https://www.youtube.com/watch?v=EzjkBwZtxp4

Figure 31 describes which pitches will be performed on which strings for the violin section. In concordance with the overall concept of violin positions, each increase in the position will correspond to either a tone or a semitone played on a lower string:
Figure 31. Musical score representation of the string position possibilities
for the violin ensemble in EastWest’s Hollywood Orchestra. (Rogers,
Phoenix, Bergersen & Murphy, 2009, p. 23). The score shows which
notes are played on which string depending on the finger position that the
composer has selected.
Finger position 1 allows the composer to employ open strings that
are generally avoided unless they are specifically requested. The second
position might be the closest to a standard playing, whereas the third and
fourth will produce a slightly more intense sound. There is not an
option to vary the position where the bow touches the string. This is
because the placement of the bow tends to vary depending on which
pitches are played. When the composers do not wish for that to happen,
it is usually because they are employing an extended technique such as
sul tasto or sul ponticello.101 The library specifically incorporates instances
of these specific sounds, as well as a flautando, which is achieved by a
combination of a sul tasto position and a faster movement of the bow.
However, slightly different positioning of the bow as well as different
bowing speeds produce variations in the sound that the library does not
exactly offer. To be clear, each instance of the library is recorded as a
result of what would be the most common positioning and speed of the
bow considering the rest of the variables (pitch, dynamic, vibrato and
performance technique), which produces a non-customizable variation of
these parameters.
Legato instruments employ the legato technique previously
explained. They utilize velocity as a means to describe the speed of the
transition. When the velocity increases, the transition time between notes
becomes shorter. This is achieved by manipulating the legato transitions
(cutting them or time stretching them) instead of recording them multiple
101 In sul tasto, the bow is placed closer to the fingerboard. In sul ponticello, the bow is placed closer to the bridge.
times. Similarly, some instruments offer the possibility to create a
portamento effect102 in the transitions, which is regularly triggered at low
velocities. In this case, the portamento has actually been recorded
separately. In addition, some string legato instruments also offer
automatic bow change (similar to the round robin in the sustain
instruments).
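The velocity-to-transition mapping just described can be illustrated with a short sketch. The thresholds and durations below are assumptions made for the sake of the example, not values documented for any specific library:

```python
def legato_transition(velocity, recorded_ms=250.0,
                      min_ms=60.0, portamento_below=20):
    """Map note-on velocity to a legato transition, as described in the text:
    higher velocity -> shorter transition (the recorded transition sample is
    cut or time-stretched), while very low velocities trigger the separately
    recorded portamento samples. All numeric values are illustrative.
    """
    if velocity < portamento_below:
        return {"mode": "portamento", "duration_ms": recorded_ms}
    # Linearly interpolate: velocity 127 gives the shortest transition.
    t = (velocity - portamento_below) / (127 - portamento_below)
    duration = recorded_ms - t * (recorded_ms - min_ms)
    return {"mode": "legato", "duration_ms": round(duration, 1)}
```

The point of the sketch is that a single recorded transition is reshaped on the fly rather than re-recorded at every speed, which is precisely where the naturalness of the result can suffer.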
There are two sampling decisions that, similar to some of the
techniques for long notes, apply to the string section only. The first is the
possibility to create divisi.103 Instead of recording half of the section
playing, the producers decide to record the whole section utilizing
microphones placed on both sides of the section. Due to their placement,
these microphones capture half of the section more prominently.
The decision highlights the difference in sound between
recording half of the section or the section in its totality. If only half of the
section were playing, the result would be more accurate when the divisi
section played alone, but the ensemble sound would be partially lost
when both parts of the section played at the same time. The
producers of the library decided to preserve the ensemble sound,
102 The portamento effect implies a slower transition between legato notes.
103 When a string section plays divisi, the section is divided into two halves, which play different music. It is possible to have a divisi a 3, or even more, which implies dividing the section into more than two parts.
expecting that their divisi will be employed to actually play divisi (divided)
and not to generate a reduced ensemble (Stewart, 2010, Divide and
Conquer). Another controversial decision was to create the sordino104
effect by equalization and filtering, instead of recording the instruments
with sordino (Stewart, 2010).
The library offers a set of short articulations that include different
types of staccato, as well as particular techniques such as pizzicato. It
extensively uses round robin in order to provide a variety of sounds for
each dynamic range. In addition, wind instruments are recorded
employing the double-tonguing technique, which is how they would play
repeated fast staccato notes. Finally, the library offers five
microphone positions. It was recorded in Los Angeles at EastWest
Studios.105 The sound of the library voluntarily incorporates the sound of
the studio, even for the close positions. Thus, the close perspective in
this library still preserves the sound of the studio, although the
microphones are placed closer to the instruments. This approach differs
from other practices, where the close positions are recorded employing
microphones that try to avoid capturing the sound of the hall.
104 In the strings, the sordino (muted) sound is achieved by placing a device on the bridge.
105 http://www.eastweststudios.com/
The analysis of EastWest’s Hollywood Orchestra has brought out most of
the characteristics of this paradigm for sample library creation.
Their structuralist approach results in a product that attempts to be
generalist by allowing a high degree of modification for a controlled set of
parameters, which have been identified as highly influential in producing
the resulting sound. The number of features ensures the possibility of
working with a great amount of detail, albeit within a system that is still
practical. As a consequence of this degree of detail, it is possible to
achieve results that would not be feasible using a physical orchestra, for
instance, a very specific crescendo or a pizzicato passage that is played
by all the performers of the section together. The utilization of these
libraries feeds back into the codes associated with the Hollywood sound,
which becomes closely tied to the sound generated by these virtual
instruments. In their interaction with physical recordings, the live
musicians will cooperate and readjust their performance techniques to
either integrate with, or become closer to, the sound achieved using
sample libraries.
Moreover, these libraries offer instruments that are the product of
the recording of, for example, six horns together. In doing so, they
encourage a sound (six horns playing in unison) that was not regularly
present in conventional orchestral writing and that the orchestras might
need to emulate.
Orchestral Ensembles and Coded Orchestral Libraries
If the previous paradigm of sample libraries could be considered
structuralist, the libraries in this section might align with a post-structuralist perspective. They were built in the gaps that the previous
libraries left, attempting to provide a mix between a naturalistic
composing solution for specific settings and a naturalistic sound by
diminishing the amount of customization. The Extra Terrestrial legato
mentioned before in 8Dio’s Adagio Violins is a clear example of the first
approach. This virtual instrument attempts to sound more natural by
focusing on a very specific type of legato and string performance style,
which emulates the performance of the strings in John Williams’s music for
E.T. (1982). On the other hand, Spitfire Audio’s Albion (Spitfire Audio,
2011) includes a set of instrumental sections. For example, there is an
instrument called “Woodwinds Hi”, which was created by recording the
flute, clarinet and oboe playing together (within their registers). This
instrument attempts to simplify woodwind writing while also achieving
a more natural sound in the woodwind section by
recording the instruments together. When using this instrument, it is not
possible to select which woodwind instrument will perform at a given
time.
This approach to sample library design relies on how different
musical devices are codified to generate meaning. These libraries depend
heavily on cultural conventions that can be isolated in performance
practices or instrument combinations. In order to define a string legato
virtual instrument modeled on the performance in the movie E.T. (1982),
there should be specific instrumental practices applicable to the string
legato in the movie (or in a group of similar movies). When this happens,
the specific technique to perform legato might become codified. By using
the instrument, the composers are actively recognizing the codification
and integrating it into their discourse. Moreover, the usage of this specific
instrument is constrained to a very particular set of situations where the
codified meaning could apply.
In practice, if the previous model intended to create a virtual
instrument by sampling it in all its possible performing techniques, this
approach attempts instead to capture a varied set of specific performance practices.
Instead of attempting to create a model of a virtual instrument that could
be programmed to reproduce any possible performance technique, these
libraries present a varied set of techniques that serve very specific
purposes. Thus, instead of deeply sampling the instrument through
recording different types of vibrato, dynamics and transitions, the
designers will concentrate on sampling legato by employing a specific set
of particular performance practices. The advantage of this approach is
that the result should theoretically be more natural, as it has been created
from a single recording instead of through the union and merging of
several samples. For example, the different legatos in Adagio Violins were
created by performing the transition together with the arrival note. When
a legato note is played, the sample already contains the transition and the
subsequent note, instead of crossfading the transition to a sostenuto
sample of the arrival note. This will result in a performance that will
naturally react to the legato process in a manner that will correspond to
its aesthetic intent (the amount of portamento, the evolution of the vibrato,
etc.). Theoretically, a similar sound could be achieved by the proper
manipulation of the dynamics and vibrato from an instrument of the
previous group, although it is reasonable to expect a more artificial result.
As a consequence, the virtual instruments in this category do not attempt
to replicate a particular physical instrument. Instead, they attempt to
model a specific performance practice for either a single instrument, or a
group of instruments.
These two different approaches to sample library design highlight
the tension between a framework based on structuralist premises and
another that relies on a post-structuralist approach. The first method
assumes that it is possible to describe a physical process by employing a
discrete set of parameters that can be mapped onto a set of functions to
generate an output. The progress and evolution of the tools based on this
model lie in the exponential growth rate of technology, which will
theoretically surpass human capabilities in just a few years. However, this
method of modeling presents an inherent risk, which is eloquently
exemplified by Borges (1999) in his surrealist story On Exactitude in
Science106:
In that Empire, the Art of Cartography attained such Perfection that
the map of a single Province occupied the entirety of a City, and
the map of the Empire, the entirety of a Province. In time, those
Unconscionable Maps no longer satisfied, and the Cartographers
Guilds struck a Map of the Empire whose size was that of the
Empire, and which coincided point for point with it. (p. 325)
This parable highlights that the risk of constructing a model that
replicates reality is that one might end up with a replica instead of a
model. In the case of sample libraries, an increased complexity in the
programming of the library could go beyond what is possible to
conceptualize. A possible solution might be, as discussed in Chapter IV,
to approach the performance of music with these libraries in a similar
manner to how CGI actors provide the acting material for their virtual
106 Baudrillard uses this story at the beginning of Simulacra and Simulation (1994).
counterparts. Another alternative would lie in the utilization of Artificial
Intelligence as a middle layer between the virtual instrument and the
composer/performer. The round robin technique described above might
exemplify a simple solution of integrating a middle layer.
On the other hand, I qualified this second group of libraries as
post-structuralist because their approach starts with the premise that it is
not possible to create a model for human performance and, when this is
attempted, the result is emotionally flat due to the lack of variation. This
position is clear in the following comment, posted by the official 8Dio
YouTube account in response to a YouTube video that compared the
legato sounds of diverse sample libraries:
Comparing [8dio] Adagio [Strings] with other libraries only using X-Fades [Crossfades] is really counter-intuitive to the Adagio
concept. While Adagio certainly has traditional x-fades - the vast
majority of the concept is built around using dynamic articulations,
which is completely skipped in the comparison. […] Adagio is 90%
built around a massive selection of alternative legato articulation[s]
that are much more dynamic in nature.
Unfortunately you cannot make a video comparing these to the
others, since they don't have them. […] Real strings are capable of
so much more - and this everlasting notion of x-fade legato with
sustains covers all string needs is so far from the truth - and in
essence miscommunicating how real strings operate. […]
Adagio is the only library capable of this [offering varied types of
legato] and it [has] much more vital articulations [than] dead-boring
x-fade sustains. (8dioproductions, 2014)
Consequently, 8Dio’s approach for this library attempts to present
several different performance practices of legato. In other words, the
library is not trying to reproduce the legato effect as a technique but to
capture the different performance practices associated with the legato
technique. The result of this position is a product that presents the
composer with a varied fixed set of performance approaches. At first
sight, this design seems to restrict the possibilities of the composer, who
becomes bound to a fixed set of codified performances. However, the
composer has not regularly engaged with this level of detail, which is why
the Western score is not actually equipped to describe these varied
typologies of legato. Thus, these libraries assume a greater part of the
performer’s role compared to the libraries from the first group. The
increase in their role as performers implies that the libraries enforce a
compositional approach that is aware of the performance conventions,
utilizing them in order to convey specific codified meaning.
Sample Libraries and World Instruments
Sample libraries have allowed composers easy access to myriad
instruments from around the world. The overall design objectives of
libraries that include instruments outside of the Western canon are similar
to how Western orchestral libraries are created, although they necessarily
incorporate specific solutions due to the diversity of musical systems and
traditions.107 Moreover, they are designed with Western customers in
mind, who are not necessarily well-versed in the particular performance
practices of each of the sampled instruments. More importantly, these
libraries deliver a varied set of codified meanings that integrate into the
screen music orchestral musical discourse.
From a Western perspective, the essence of these instruments is
closely tied to a particular performance practice. For instance, a
shakuhachi, which is a Japanese flute, is not only the name of the
instrument but also the performance style. Furthermore, the set of
intonations that define how the shakuhachi is performed cannot be
mapped onto a 12-tone pitch structure. One of the solutions is to record
a diverse set of topical phrases, which generate an instrument that,
instead of delivering notes, delivers full musical phrases. In a certain way,
this approach to sample design expands the previous paradigm further,
by including not only a note with its transition but a fully constructed
musical phrase or gesture.108
In practical terms, a set of prerecorded phrases alone does not
fulfill the need of the potential users of these libraries. The scope for
107 Including popular practices from within Western countries.
108 There are libraries from the previous group that also provide fully formed textures and phrases within a Western orchestral style.
these predesigned musical phrases is necessarily restricted to limited
musical situations that permit their integration within the overall musical
discourse. Therefore, the designers of the non-classical Western sample
libraries are faced with the problem of creating a workable and flexible
instrument that still expresses its original performance practices. One of
the first challenges relates to how to employ the particular scale tunings
of different musical systems. In addition, the designers of these libraries
assume that the users will employ a MIDI keyboard as a musical input
tool. EastWest’s world library Ra (EastWest Sounds, 2008) introduced a
system with multiple tunings that could be applied to any instrument.
Figure 32 shows the great diversity of tunings available to all the
instruments in the library. In addition, the user is able to choose a root
note from the Western 12-tone system, which will become the pitch from
where the scale system will begin to form. In order to facilitate playing
these scale systems on a MIDI keyboard, the tunings respect the
structure of the keyboard in octaves. This means that some of the notes
in the 12-tone scale will be mapped to the same pitch. This provides the
opportunity of presenting varied sounds within the same pitch. In some
situations, a pentatonic scale will result in 12 different sounds mapping
only five different pitches.
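The octave-preserving mapping just described can be sketched as follows. The mapping rule used here (each key sounds the highest scale degree at or below it) is an assumption for illustration; the library's actual assignment may differ:

```python
def map_keyboard_to_scale(scale_steps, root_midi=60):
    """Map the 12 keys of one keyboard octave onto a scale, preserving
    the octave layout as described for EastWest's Ra: several keys end
    up sounding the same pitch, so a pentatonic scale can expose 12
    different samples over only 5 distinct pitches.
    """
    # scale_steps: semitone offsets of the scale within one octave,
    # e.g. a major pentatonic is [0, 2, 4, 7, 9].
    mapping = {}
    for key in range(12):
        degree = max(s for s in scale_steps if s <= key)
        mapping[root_midi + key] = root_midi + degree
    return mapping

pentatonic = map_keyboard_to_scale([0, 2, 4, 7, 9], root_midi=60)
# 12 keys, but only 5 distinct output pitches.
```

Because the keyboard layout is preserved, the redundant keys become an opportunity rather than a defect: each can carry a different recorded inflection of the same pitch.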
Figure 32. Tunings for EastWest’s Ra (EastWest Sounds, 2008)
In order to include diverse musical performances, these
instruments are regularly sampled with different performance techniques.
Technically speaking, the sampling approach does not differ from how
Western instruments are sampled (e.g. sampling the violin playing
pizzicato, legato, staccato, etc.). However, with the non-Western
instruments, these different techniques are frequently fundamental in
order to create a regular musical performance. In other words, although
the libraries utilize the same framework as their Western counterparts, the
sampled techniques generate a single unified mode of performance.109
This is why the variety of these performance techniques from an
instrument are integrated into a single virtual instrument with several key
switches (unused notes on the MIDI keyboard used to change between
virtual articulations).
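The key switch mechanism can be sketched as a small state machine in which notes below the playable range select an articulation instead of sounding. The split point and articulation names in this sketch are illustrative, not the actual mapping of any library:

```python
class KeyswitchInstrument:
    """Minimal sketch of the keyswitch mechanism: MIDI notes below the
    playable range silently change the active articulation; all other
    notes trigger a sample from the currently selected articulation.
    """
    def __init__(self, articulations, keyswitch_start=24):
        # Map consecutive low MIDI notes to articulation names.
        self.switches = {keyswitch_start + i: name
                         for i, name in enumerate(articulations)}
        self.current = articulations[0]

    def note_on(self, note):
        if note in self.switches:
            self.current = self.switches[note]   # silent: change state
            return None
        return (self.current, note)              # audible: play a sample

shaku = KeyswitchInstrument(["Sustain Vibrato", "Overblown 1", "Melody 1"])
shaku.note_on(25)          # selects "Overblown 1", produces no sound
event = shaku.note_on(72)  # -> ("Overblown 1", 72)
```

The design choice worth noting is that the articulation change is stateful: it affects every subsequent note until another keyswitch is pressed, which is what allows a single MIDI track to carry several performance modes.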
The shakuhachi virtual instrument present in EastWest’s Ra is a
good example. It integrates 14 different performance modes, which are
mapped to the lowest 14 MIDI note values110 (EastWest Sounds, 2008, p.
80):
- Sustain Vibrato
- Espressivo Vibrato
- Legato Vibrato
- Legato Non Vibrato
- Non Vibrato
- Overblown 2
- Overblown 1
- Spit 4RR
- Harmonic FX
- Trill
- Melody 1
- Melody 2
- Melody 3
- Melody 4

109 A Western violin might just play legato for an extended amount of time, for instance.
110 C0 – C#1 to be precise.
The first five articulations on the list reproduce a Western
perspective of instrumental performance. The following five are specific
performance modes of the instrument. It is important to highlight that
they generate varied sounds that do not necessarily correspond to the
pitch of the key pressed. For example, the overblown articulation will naturally raise the pitch
by an octave, as a consequence of overblowing. Finally, the last four
articulations are prerecorded melodies or motives. Their initial pitch
corresponds to the note pressed. As with any other instrument, the sound
will stop when the key is released, regardless of whether the phrase has
arrived at its end.
The design and implementation of sample libraries that employ
non-classical Western instruments generates two important
consequences that will affect the overall aesthetics of the music written
with them. First, the utilization of prerecorded phrases engages with
certain aspects of the culture of the mash-up. Miguel Mera (2013) briefly
describes the mash-up as follows:
In its most basic form a mashup (sometimes also called “bastard
pop”) is where two or more samples from different songs are
blended together to create a new track. (Par. 5)
Mashup is considered transformative and playful, delighting in
synchronic simultaneity and difference and actively demonstrating
that meaning is not fixed. (Mashup: Beyond Counterpoint?, Par. 5)
The second consequence relates to the generation of cross-cultural musical entities. The possibility of combining different tunings
with varied instruments blends two different cultural traditions into one
musical entity. In addition, a single library such as Ra provides a wide
variety of instruments from around the world, which become accessible
to the user of the library. These instruments can be mixed with Western
orchestral instruments. In addition, they can also be partially
decontextualized and used as if they were orchestral instruments. In the
shakuhachi example, this could be achieved by employing the first five
articulations on the list. Therefore, the possibilities in terms of aesthetics
and the generation of new codified meaning are vast, which explains the
popularity of those instruments in contemporary Hollywood practices.
Epic Percussion Libraries
In some ways, this group of instruments should be considered a
subset of world instruments, as the epic percussion libraries routinely
include drums from around the world. However, there are some
peculiarities that are worth a separate analysis. An epic percussion library
refers to a set of different drums and other percussion instruments that
serve to propel heavy drum-based action sequences. From this
perspective, these libraries act as an expanded drum set for
hyperorchestral music. The role of the drum set in the diverse genres of
rock and contemporary popular music is to generate a constant groove
that propels the music. Similarly, epic drums serve to provide a bed of
tension and action utilizing a variety of sounds that stimulate the listener.
It is important to remark that not all world percussion instruments will be
part of epic percussion libraries. For instance, the Japanese taiko drum
has become iconic for its ability to signify a battle scene. However, this is
not the case for the Indian tabla, whose performance practices
inextricably link the instrument to its cultural background.
In terms of sampling, percussion instruments are easier to create
than wind or bowed instruments. They do not generate legato or
a sustained sound that needs to be looped and modified. Creating a
sample library of a percussion instrument just requires recording each
percussive hit several times at several dynamic levels. Utilizing the round
robin technique is essential, as there will likely be continuous repetition of
the same drum hit. In addition, the drums sampled in these epic libraries
are generally processed, in order to achieve a dense or intense sound
result. Moreover, some of the instruments are created by mixing
recordings with synthesizers and intensive sound processing, in a similar
manner to how the hybrid instruments are created.
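The combination of dynamic layers and round robin described above can be sketched as follows. The layer boundaries and the number of recorded takes per layer are illustrative assumptions, not the specification of any actual library:

```python
import itertools

class PercussionSampler:
    """Sketch of the sampling scheme described in the text: each drum is
    recorded at several dynamic layers, with several round-robin takes per
    layer that are cycled so that repeated hits never reuse the same
    recording twice in a row.
    """
    def __init__(self, layers, takes_per_layer=4):
        # layers: list of (max_velocity, layer_name) pairs, ascending.
        self.layers = layers
        self.cycles = {name: itertools.cycle(range(takes_per_layer))
                       for _, name in layers}

    def trigger(self, velocity):
        # Pick the first dynamic layer whose ceiling covers the velocity,
        # then advance that layer's round-robin cycle.
        for max_vel, name in self.layers:
            if velocity <= max_vel:
                return (name, next(self.cycles[name]))
        last = self.layers[-1][1]
        return (last, next(self.cycles[last]))

taiko = PercussionSampler([(40, "p"), (90, "mf"), (127, "ff")])
hits = [taiko.trigger(110) for _ in range(5)]
# Same layer ("ff"), but the round robin cycles takes 0, 1, 2, 3, 0.
```

The sketch makes the earlier point concrete: without the cycling of takes, five identical hits would replay one recording five times, producing the machine-gun effect that round robin exists to avoid.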
Epic percussion originated from libraries that provided electronic
drum loops and beats. Quantum Leap’s StormDrum (2004) was one of
the first libraries of this kind. The following review of the product, written
at the time of its release, elucidates the rationale behind the creation of
this new set of sampled instruments:
If you’ve watched any of the more epic-styled Hollywood films
lately, no doubt you’ve noticed a musical trend that is taking hold
of the industry. Films such as the Lord of the Rings trilogy and
Gladiator feature original scores from notable screen composers,
and all films rely heavily on the use of what I affectionately call
“boomy” percussion. This trend of large, hard-hitting and, at times,
almost tribal percussion usage has crossed over into television,
music, and of course videogames. […]
For the contemporary composer and studio musician, it can be a
bit of a challenge to create these sounds with existing software,
and it is even more of a challenge to find and record the
instruments themselves. Anyone who has seen a live Taiko
performance can sympathize with the roadies who have to haul
those drums from place to place. Orchestral libraries will typically
provide bass drums, timpani, and possibly even toms of some kind
or another. However, none of these quite capture the sound of
those epic soundtracks from Hollywood composers. […]
Award winning East-West producer Nick Phoenix has set out to
solve this problem with a collection of samples created specifically
for those seeking all the ‘boom’ without the bulk and weight of a
Dragon Drum. Designed to give composers and musicians the
biggest, boomiest collection of percussion samples in one
complete package, Stormdrum allows you to get that big
Hollywood sound easily – with professional quality results. (Kirn,
2005)
The text highlights the influence of Gladiator (2000) and The Lord
of the Rings (2001) in shaping a new paradigm for screen music scoring
that included this type of non-orchestral percussion. For example, the
taiko drum was employed to create the percussive texture of Isengard in
The Lord of the Rings (Adams, 2010, p. 388) and, since then, it has
become standard in the scoring of battle-related scenes.
Sample Libraries as a Blueprint for Screen Music Scoring Practices
The example of StormDrum underlines how sample libraries react
to aesthetic practices in order to provide instruments that satisfy the
needs of the composers. From this viewpoint, sample libraries become a
blueprint for the screen music scoring practices at the time of their
release, while at the same time they provide the means to extend a
particular aesthetic to a wide range of practitioners. This becomes even
clearer in another epic percussion library: Spitfire’s Hans Zimmer
Percussion (Spitfire Audio, 2013). The library was designed under the
supervision of Hans Zimmer himself, in order to emulate the iconic drum
sounds from his scores. Therefore, the library becomes a snapshot of
Zimmer’s sound choice practices at the time. Although Zimmer’s movie
scores for a given time period might serve a similar purpose, the library is
able to capture very precise and concrete elements of the practice that
might not be evident in a fully mixed piece of music.
Hybrid Libraries
The last approach to sample library design is generally qualified as
hybrid, in the sense that it presents a set of instruments that are the
product of the combination of recorded sounds, sound synthesis and
sound processing. Their objective differs from all the previous paradigms,
with the exception of some instruments in the epic percussion category,
as they do not attempt to model a live instrument. Instead, these hybrid
libraries utilize sound recordings to generate new sounds via processing
or synthesis. Libraries such as Spectrasonics’ Omnisphere or 8Dio’s
Hybrid Tools are good examples of this approach.
The instruments resulting from this process of hybridization
generally retain some of the associated coded meaning that might be
attached to the source of the sound (e.g. a metallic stick hitting a pipe
evokes an industrial context). This meaning mutates depending on the
amount of transformation applied to the sample. As a result, the
instruments in these libraries fluctuate between new sound horizons and
connoted meaning from common elements of everyday life.
This approach allows the generation of fluent soundscapes that
evolve over time, creating a dynamic texture that results from a single
MIDI note. For instance, in Sample Logic’s Morphestra, there is a virtual
instrument called “Jihad”, which is “an evolving soundscape combining
dark ‘Middle Eastern’ timbres, a driving rhythmic loop, and a male chorus
chanting ‘Arabic’ phonemes” (Sadoff, 2013, Reappropriating Genres and
Codes).
Additional Considerations on Sample Libraries
The present discussion has highlighted a diverse set of design
approaches to sample libraries. The description of their design indicates
their importance in shaping the aesthetic of contemporary musical
practices, which will be described in subsequent chapters. In addition,
the utilization of these libraries allows for the creation of music that would
not necessarily be possible to produce by physical means. Furthermore,
the contrasting approach to sample design between the first two groups
elucidated some of the limitations of these virtual instruments. It is
reasonable to expect that the composers will adapt to the limitations of
the samples, which will also have an effect on their aesthetic attitudes.
From a broad perspective, adapting to the possibilities of an
instrument (physical or virtual) has always been the norm in music
composition. For instance, even though a violin can certainly produce a
very special sound when smashed with a hammer, this is not considered
a practical possibility when writing for a violin. Moreover, composers have
always had to adapt to the instrumental forces available to them. Sample
libraries virtualize the performance practices of the physical instruments,
which adds complexity. In adapting to the possibilities of the library,
composers might decide to not employ musical resources that would be
achievable in a physical performance if they do not translate
appropriately to the sample libraries that they possess. Nevertheless, this
situation might be the consequence of budget restrictions (e.g. the
composer cannot afford the cost of hiring a physical orchestra) or due to
lack of expertise.
CHAPTER VIII
AESTHETIC COMPOSITIONAL FRAMEWORKS
Introduction
In these two final chapters, I will address the process of
music creation for audiovisual media employing hyperorchestral
resources from an aesthetic viewpoint. The contents of the previous two
chapters, which were analyses of sample libraries and movie scores from
recent movies, will serve as the source material to outline an aesthetic for
the hyperorchestra. In Chapter VI, I described music in the hyperreal and
the concept of the hyperorchestra in terms of ontology. Although the line
that separated the traditional orchestra from the hyperorchestra was thin,
it was possible to establish an ontological distinction based on the
process with which the music was created. However, in order for the
hyperorchestra to be aesthetically differentiated from the traditional
orchestra, it required expanding the sound possibilities beyond what
the physical orchestra could achieve. The following chapters are
dedicated to an exploration of the aesthetics of the hyperorchestra and
how it transcends the musical possibilities of the physical world. During
the ontological scrutiny, the recording arose as a means to produce
hyperreal music, on the basis of its capacity to transform and virtualize
the sound. Thus, this present chapter will begin by focusing on the
process of recording music. More specifically, this chapter is dedicated
to proposing and describing musical frameworks that can be used to
write hyperorchestrally.
Music is an arbitrary, culturally defined subset of sound, and
composers have generally relied on frameworks for creating new pieces.
The score, the orchestral instruments, and an established set of
performance practices have served as compositional frameworks for
Western music creation. In conjunction with the score, each instrument
provides an established sound output based on the information
presented in the score and interpreted by the performer. In an equivalent
manner, all musical traditions reflect a musical framework based on the
boundaries and limitations on what sound can be considered
music. In terms of sounds, Western orchestral music expanded by adding
new instruments and by extending the techniques available to all the
instruments. The orchestra has barely evolved since the beginning of the
20th century. However, other musical styles that emanated from popular
music have expanded their sound using the recording studio and
electricity-driven devices. When adding new instruments to
the orchestra, the composers choose, implicitly, a new interface for music
298
production. This interface becomes, on its own, a new framework from
which to generate new music. The borders of what music is may naturally
expand. Therefore, utilizing new instruments that become new
frameworks for music creation extends the boundaries of what is
considered music and it does that quite smoothly. With non-physical
instruments, such as synthesizers, the process of expansion of the
boundaries becomes much more noticeable. Physical instruments
produce a limited and defined range of sounds (generally associated with
music), whereas the synthesizer has a series of wave generators that
create diverse sounds that may, or may not, be considered music. In
order for these sounds to become accepted as instruments in Western
culture, they must utilize musical frameworks that are part of that culture.
As an example, I will describe the Attack, Decay, Sustain and
Release (ADSR) envelope model, which is employed by almost all
synthesizers. These four states are assumed to be the different stages
that any musical note goes through during its lifecycle. All physical
sounds begin in a silent state. The physical body responsible for
producing the sound needs to start to vibrate in order to generate sound
waves, thus making the process gradual. Imagine playing a note on a
piano: at the moment when the hammer hits the string, there is no sound.
Once the string has been struck and the hammer no longer physically
touches it, the string starts to vibrate, increasing its vibration
amplitude, which is perceived as volume. This process is what is called
the attack. During the decay stage that follows, the note lowers its
amplitude (volume) until reaching a more stable state of vibration, as the
vibrating body recovers from the impact of the exciting device (the
hammer, in the piano). After this stage, there is a period of time when the
note stays at a similar amplitude, the sustain stage. Finally, the note ends
when the vibrating object returns to its non-vibrating state, which is the
release. This is regularly modeled
following a prototype similar to Figure 33.
Figure 33. Visual representation of the main principles of the Attack,
Decay, Sustain and Release (ADSR) model
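Since the ADSR envelope is a purely numerical model, it can also be sketched in a few lines of code. The following illustration is my own and is not drawn from any particular synthesizer; all parameter values are arbitrary. It builds a piecewise-linear ADSR envelope with NumPy and applies it to a sine tone:

```python
import numpy as np

def adsr_envelope(attack, decay, sustain_level, sustain_time, release, sr=44100):
    """Build a piecewise-linear ADSR amplitude envelope.

    attack, decay, sustain_time and release are durations in seconds;
    sustain_level is the amplitude (0..1) held during the sustain stage.
    """
    a = np.linspace(0.0, 1.0, int(sr * attack), endpoint=False)            # rise to peak
    d = np.linspace(1.0, sustain_level, int(sr * decay), endpoint=False)   # fall to sustain
    s = np.full(int(sr * sustain_time), sustain_level)                     # hold steady
    r = np.linspace(sustain_level, 0.0, int(sr * release))                 # fade to silence
    return np.concatenate([a, d, s, r])

# Apply the envelope to a 440 Hz sine tone (hypothetical stage durations)
env = adsr_envelope(attack=0.01, decay=0.05, sustain_level=0.7,
                    sustain_time=0.3, release=0.1)
t = np.arange(env.size) / 44100
note = np.sin(2 * np.pi * 440 * t) * env
```

Varying the four stage durations is enough to move the same raw oscillator from a percussive pluck (short attack, no sustain) toward a bowed or blown character (long attack and release), which is precisely the sense in which the model mediates between electric sound and instrumental behavior.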
The release stage might be triggered by either the forced end of
the note (the piano key is released and, therefore, the damper forces the
string to stop vibrating) or because the vibrating source does not vibrate
anymore (after a while, the piano string will stop vibrating). This template
for the lifespan of a musical note roughly models how musical notes
sound, especially for percussion instruments. In Figure 34, the ADSR
model is superimposed over a waveform of a timpani hit.
Figure 34. Graphical representation of the sound wave of a timpani hit,
with the ADSR labels superimposed.
Synthesized instruments do not require an attack, decay or release
stage. As the sound is created using electric signals and without a
physical vibrating medium, they can begin sounding at the desired
amplitude and they can be cut without a release. However, modeling
them through an ADSR envelope template brings them closer to how
physical instruments behave, stretching the boundaries of what is
considered music to a lesser degree. In other words, a synthesized sound without
ADSR might just be considered a sound (a beep), whereas a synthesized
sound with an ADSR envelope applied to it might become a musical note.
However, modeling the sound of a note in terms of a generic ADSR
opens the door to an expanded range of sound processes that go
beyond what would be natural or achievable with physical instruments.
Furthermore, it provides a framework for sound expansion that still
preserves the connection with some sort of physicality. Even with a
sound process such as reversing the sound, the result still resembles the
ADSR in terms of stages: a reversed sound has a very slow attack, a
non-existent decay, a sustain stage with a fast crescendo and an
extremely fast (and physically impossible) release.
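The reversed-sound case can be checked computationally as well. The sketch below, with hypothetical values of my own choosing, synthesizes a percussive note with an instant attack and an exponential decay (roughly the timpani shape), then reverses it; the resulting envelope swells slowly toward an abrupt ending, as described above:

```python
import numpy as np

sr = 44100
t = np.arange(int(sr * 1.0)) / sr

# A percussive note: instant attack followed by exponential decay
note = np.sin(2 * np.pi * 220 * t) * np.exp(-4 * t)

reversed_note = note[::-1]  # time-reversed sample

# Compare the envelope at the beginning and the end of the reversed sound
early = np.abs(reversed_note[: sr // 10]).max()   # first 100 ms: quiet
late = np.abs(reversed_note[-sr // 10:]).max()    # last 100 ms: loud
# early < late: the reversed sound swells toward its (physically
# impossible) instantaneous release
```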
The example of ADSR envelopes111 serves to reveal the necessity
of creating frameworks that define what music is, and how the flexibility
of those structures is key for expanding the boundaries of what is
considered music in a given culture. By establishing a virtual model
inspired by physical processes, but not restricted to them, the ADSR
model allows for a curated expansion of the musical boundaries that
feels culturally connected to the existing musical background.

111 The ADSR envelope should be considered one of several frameworks
that defined the creation of synthesized instruments.
Parallel to the introduction of non-physical instruments and sound
manipulation techniques, the boundaries of what music is in Western
culture have expanded as a result of globalization. However, each
musical tradition is attached to a particular cultural background, which
differs from the Western cultural tradition. Thus, there is a dual process
of, on the one hand, musical assimilation of practices from different
traditions and, on the other, cultural mixture, which generates stylistic
diversity. Focusing just on contemporary Western screen music, these
two distinct processes crystallize as follows. First, music for audiovisual
media becomes stylistically diverse. Second, the orchestral cinema
sound expands by incorporating instruments and practices from other
traditions. In conjunction with the employment of electronic and virtual
sounds and processes, the orchestrally rooted music for audiovisual
media is able to greatly expand aesthetically.
These processes and new models have become the substrata for
the creation of music in the hyperreal. It is from these substrata that I will
define, in this chapter, the aesthetic frameworks for hyperreal music.
First, I will provide a general model for music and hyperreality, which will
interconnect with a description of how recording music has affected the
process of music creation. After that, I will outline a model for the
hyperinstruments and their creation. Finally, I will propose a framework
for the hyperorchestra.
Sound and Music in the Hyperreal
With the exception of the music for the films in the silent era,
screen music has always been associated with a process of sound
recording. As described in Chapter VI, recording music is inherently tied
to a hyperreal approach, as music transcends its pure physicality to
become virtualized. Figure 35 provides an overview of how music
operates in hyperreality.
Figure 35. Music in the hyperreal. This graphic shows how sound sources
from the physical world are transported to the virtual area for processing.
Once this happens, music becomes hyperrealistic.
I included sound synthesis as a separate instance to acknowledge
the analog nature of the origins of synthesized music. However, I
differentiated it from what I called the physical world, following McLuhan
(1964/1994), as its mode of sound generation is based on electricity.
Although electricity is part of the physical world that humanity inhabits, its
revolutionary nature goes beyond pure physicality. Humanity has just
recently discovered how to produce, transform and transport electricity.
In other words, the sounds created by a synthesizer generally cannot be
reproduced in nature. In the model, the physical world includes the music
created and performed using physical instruments. There is a crossover
between the physical and the electrical that I did not include for the sake
of clarity, which involves electric instruments such as the electric guitar.
From the viewpoint of this framework, they are indeed physical
instruments that require electrical amplification. Therefore, they should be
considered mainly as physical instruments. Live performances, such as
the ones that accompanied films during the silent era, emanate from
physical processes only. The rest of the musical processes described in
the graphic involve a recording112 of some sort. The traditional recording
sessions brought the recorded music into the virtual, where it was edited,
processed and mixed. At the end of the process, the result became the
final product, which would ultimately enter the rerecording process.

112 Analog synthesizers could also be physically recorded. However, for
the sake of clarity I did not specify it in the graphic.

The
creation of sample libraries or hybrid synthesizers also involves recording,
processing and mixing, which then generates virtual instruments that are
autonomous from the physical reality. These instruments become similar
to synthesizers, either analog or digital.
In the virtual paradigm, music can be modified by editing,
processing and mixing. Editing involves selecting the takes or take
fragments that best represent the musical objective and generating a single
linear music track. Just by editing, it is possible to achieve sounding
results that could not be achieved by physical means. For instance, it is
possible to cut the time required for a string performer to change from
playing pizzicato to bowing. With mixing, it is possible to put together
music that was never performed at the same time, rearrange the volumes
of each of the instruments and, in a similar manner as processing, modify
the sound of the instruments by equalizing, compressing, etc. The
difference between the sound transformations achieved by mixing and by
processing is slight. However, by processing I mean creatively modifying
the sound of the instrument, which could imply a significant loosening of
its resemblance to what was recorded.
The Recording Framework
The previous model provides a general outlook on the interaction
between music and hyperreality, which is highly tied to the process of
recording. Figure 36 portrays a framework that aims to represent the
traditional process of music creation for cinema and its interaction with
hyperreality.
Figure 36. Graphic visualization of the processes involved in a traditional
movie scoring composition process.
First of all, it is important to remark that the process is fairly linear.
It begins with the conceptual step of music creation, which draws on
preexisting musical references from the director or the movie creative
team, such as a temp track, or from the dialogue between the director
and the composer during the spotting session. From this referential
background, the composer creates the music for the movie. Traditionally,
the composer will employ the classical Western tools for music creation,
which revolve around the score and include the available instruments and
a music-theoretical framework. Once the music is created, orchestrated
and edited, it is then performed and recorded. Any performance involves
the selection of specific instruments and performers for each of the
instrumental parts, which, together with the acoustics of the hall, will
generate an individual result. Although the process of recording would ideally
minimize any possible incidents, they might still happen. The recording
virtualizes the performance beginning with the selection and placement of
microphones and recording equipment. In addition, the multiple takes
and the recorded material generate a set of musical content that is purely
virtual. From this material, music is assembled in a hyperreal process that
involves, as aforementioned, editing, processing, mixing and mastering.
With the development of sample libraries, a step was added after
the conceptual stage, which involved the musical mock-up. The
composer would generate a digital simulation, which was an
approximation of the final sound of the score, once recorded. The
mock-up would be used as a communication tool, in a similar (but much
more specific) manner to the temp track. If the temp track served to
transmit the musical ideas of the director to the composer, the mock-up
serves to show how the music would ultimately sound before actually
recording it, which is a very expensive and time-consuming process.
The Contemporary Framework for Audiovisual Music Creation
As described in Chapter VII, the development of sample libraries
made it possible for some of the sounds they produce to permeate the
final musical product. This fact has transformed the way
that music is created and produced, thus generating a model (Figure 37)
that goes beyond a linear approach to the process of music scoring.
At the center of the process of music creation, there is the music
sequencer or Digital Audio Workstation (DAW). A DAW is computer
software that provides several functions. It works with MIDI in order to
generate hyperscores that contain all the flexibility provided by MIDI. In
addition, the DAW is able to integrate with diverse sample libraries in
order to generate sounds from the MIDI information. The MIDI part of the
DAW can also be used to generate sound with diverse synthesizers that
integrate with the software.
Figure 37. Graphical visualization of a framework for contemporary music
scoring. As it is a nonlinear process, there is no specific linear set of
steps. Instead, the DAW becomes the core of the process.
From the perspective of the DAW, synthesizers and sample
libraries are equivalent virtual devices: they both take MIDI information as
input in order to output sound. The MIDI “score” can follow the temporal standards of
the traditional score (tempo and time signature), which facilitates the
process of music composition and the possible transcription of part of
the musical content onto a traditional score that can be performed
physically. The DAW is also able to manage audio recordings, which can
be easily integrated with the MIDI information. The audio material can be
edited in a variety of ways. In addition, the DAW integrates with various
sound processors, allowing the manipulation and modification of the
sound that comes from either the audio samples or the sample libraries.
Finally, the DAW allows for the loading of a video file in order to properly
synchronize the music with an audiovisual track.
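The idea that the MIDI “score” follows the temporal standards of the traditional score can be sketched with a toy model. The names below are invented for illustration and do not correspond to any actual DAW’s API; the point is only that note events live on a beat grid, and a tempo maps beats onto clock time for synchronization with picture:

```python
from dataclasses import dataclass

@dataclass
class NoteEvent:
    """A minimal stand-in for one MIDI note in a DAW 'hyperscore'."""
    pitch: int          # MIDI note number (60 = middle C)
    velocity: int       # 0-127
    start_beat: float   # position on the score's beat grid
    length_beats: float

def beat_to_seconds(beat, bpm):
    """Map a beat position to clock time, as a DAW does at a fixed tempo."""
    return beat * 60.0 / bpm

# A two-note figure at quarter-note = 120
events = [NoteEvent(60, 96, 0.0, 1.0), NoteEvent(64, 96, 1.0, 0.5)]
onsets = [beat_to_seconds(e.start_beat, 120) for e in events]
# At 120 BPM each beat lasts half a second, so the onsets fall at 0.0 s and 0.5 s
```

Because the primary coordinates are beats rather than seconds, changing the tempo (or conforming the cue to a new cut of the movie) re-times every event without rewriting the musical content, which is the flexibility the hyperscore provides.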
At all times, there is a digital musical piece that can serve as a
demo or as a mock-up for the music that is being written. Parallel to that,
two processes might occur. First, there can be recording sessions in
order to create audio samples for the score. These audio samples might
even become custom-made sample libraries that are specific to the
movie. These samples integrate with the MIDI tracks by either being
placed as audio files in the DAW or by being converted into a sample
library, which will be operated through MIDI analogously to any other
library. The second process involves more traditional recording sessions.
All, or part, of the instrumental material might be recorded by employing
one or more recording sessions. The recording session might include the
entire orchestra or just individual instruments or sections, thereby
increasing the amount of flexibility in terms of mixing and editing. Once
recorded, the music will return to the DAW113 and it will integrate with the
rest of the digital musical elements.
Both procedures require the digital music to be properly arranged
and orchestrated for the physical instruments or devices that will perform
it. For the sampling part, this process also involves designing the content
to be sampled, in order to adequately generate the required sound
material. It is important to remark that recording is not limited to musical
instruments. Anything that produces sound could be recorded and
incorporated into a score. The final result might come entirely from the
content garnered during the recording sessions, which would generate
music equivalent, in terms of sound, to the traditional scoring model.
However, this does not exclude the fact that the process of music
creation has become mainly hyperreal.
By following this model, creating music is no longer exclusively
linear in terms of its production. The flexibility introduced by removing the
need for a streamlined process has several advantages in terms of how
composers adapt to the changes in the contemporary process of
moviemaking. Digital movie editing allows continuous editing until the
very end of the postproduction process. The possibility of creating music
in a similar manner (music that can be edited at any time) is greatly
advantageous.

113 Generally, the recording already takes place using a DAW.

Moreover, the adaptability of the music allows the process
of music creation to start long before there is any edited footage of the
movie. The flexibility of having the music as a file inside of a DAW allows
it to be easily adapted to the picture after the music is written. Moreover,
the possibility of duplicating the files enables multiple people to work on
the music at the same time, if it is necessary. Therefore, it is conceivable
to have a large team of digital music arrangers that, in a very short period
of time, is able to adapt the existing musical material to synchronize with
the latest cut of the movie. In other words, the final stages of the process
of movie scoring become scalable. Furthermore, using a digitized
framework affords increased flexibility in terms of the postproduction
process. The music is regularly delivered in different blocks, called stems,
that represent different musical sections. Thus, it is possible to
dynamically mix the stems to better integrate the music into the movie’s
soundtrack.
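In its simplest form, the stem delivery described above amounts to a gain-weighted sum of aligned audio blocks. The following sketch is purely illustrative; the stem names and gain values are hypothetical and do not describe any actual rerecording session:

```python
import numpy as np

def mix_stems(stems, gains):
    """Sum delivered stems into a single track, each scaled by its own gain.

    stems: dict of name -> mono signal (equal-length NumPy arrays)
    gains: dict of name -> linear gain applied at the rerecording stage
    """
    names = list(stems)
    out = np.zeros_like(stems[names[0]])
    for name in names:
        out += stems[name] * gains.get(name, 1.0)  # unity gain if unspecified
    return out

# Three hypothetical stems for one cue
n = 1000
stems = {"strings": np.ones(n) * 0.5,
         "percussion": np.ones(n) * 0.3,
         "synths": np.ones(n) * 0.2}

# Duck the percussion under dialogue; keep the rest at unity
mix = mix_stems(stems, {"strings": 1.0, "percussion": 0.25, "synths": 1.0})
```

In practice the gains vary over time (automation) and the stems are multichannel, but the principle is the same: the rerecording mixer can rebalance the musical sections against dialogue and effects without returning to the composer.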
Hyperinstruments
In Chapter VI, I briefly defined a hyperinstrument: a virtual
instrument that generates sound in a form that would not be possible
using regular physical means. When analyzing the piano for The Social
Network (2010), I stated that it should be considered a hyperinstrument,
as the sound is adapted to the narrative meaning. The appearance of a
model for a hyperinstrument is connected to the historical evolution of the
instruments in the Western orchestral model, and especially its lack of
change in almost a century. In fact, Western orchestral instruments have
evolved to a point that they cannot really evolve much further. In terms of
physical design, the orchestral flute has been essentially flawless for some
years already, and any further changes would necessarily transform the
instrument beyond what it is right now. In other words, physically
improving some aspects of its sound would probably have a negative
impact on some other aspect. The tradeoff between a desired improvement and the
loss of some specific quality114 could be acceptable if the music
associated with those instruments would evolve accordingly. However,
this has not been the case for the last century in Western orchestral
music. Orchestras have regularly performed pieces from past centuries
that require an orchestral model that would not change. Therefore, the
Western symphonic orchestra and its fixed repertoire have become a
stationary cultural structure, a symbolic musical model from the
pre-electric era that celebrates and perpetuates a cultural past. New
music for the orchestra has regularly needed to adapt to those
restrictions in order to be performed.

114 The sounding differences between the piano and the harpsichord are a
clear example of the tradeoff associated with any instrumental
development. In general terms, the piano has many more sounding
possibilities and musical resources, such as being able to play at different
dynamics. However, the technical development of the piano was achieved
at the expense of losing the delicate metallic sound of the harpsichord.
Instead of developing or expanding the instruments of the
symphonic orchestra from an organological point of view, composers
have envisioned means to produce new sounds with the same
instruments. These new strategies of sound production are commonly
called extended techniques. The name denotes techniques that extend
the standard or established practices of the given instrument. Hence, the
extended techniques are outside of the cultural framework for the
instruments of the Western symphonic orchestra. This means that most
instrumental performers do not regularly employ extended techniques
and, thus, the composer should not expect their technical proficiency in
this regard. From this viewpoint, an extended technique might be
considered a kind of predecessor of the hyperinstrument, as it generates a
sound that is outside of the established cultural framework, yet still
employs purely physical methods. Nevertheless, extended techniques do
focus on the sound produced by the instrument, relegating the
production of the pitch to secondary importance. For instance, the effect
of a string performer playing sul ponticello (placing the bow closer to the
bridge instead of its regular position) is mainly sonic. The intention of the
composer, when asking for sul ponticello, is to generate a very specific
sound that differs from the sound that the string instrument would
produce otherwise. The utilization of extended techniques denotes an
attitude to music that focuses on the sound produced and highlights its
importance in addition to pitch, rhythm and harmony. It is also similar to
the attitude that leads composers to create hyperinstruments.
Figure 38 models the definition of a hyperinstrument in a manner
similar to the models described above.
Figure 38. Graphical representation of the hyperinstrumental design
framework. It progresses from top to bottom.
There are two key elements that define hyperinstruments as
distinct entities, separate from regular instruments. First,
hyperinstruments are specific in terms of the sound. Second,
hyperinstruments regularly carry an associated meaning for the given
sound. Consequently, writing for hyperinstruments does not only involve
a specific technique, but also requires a particular attitude towards sound
creation. For example, selecting from among the different, but similar,
staccato articulations in diverse clarinet sample libraries in order to find
the precise sound that best fits the musical intention of a certain moment
denotes a hyperinstrumental intention. This attitude differs from
traditional scoring or composition, in which a clarinet staccato would
have just been written in the score, accepting as appropriate the result
that a sufficiently trained performer would provide. The composer would
never expect the performer to arrive at the recording session with several
clarinets from different manufacturers in order for him to select the sound
that best fits his views. Similarly, the composer would also have limited
influence over the specific microphone placement decided upon by the
recording engineers, which would also affect the final result of the
clarinet sound. Moreover, thinking from a hyperinstrumental standpoint
involves carefully selecting, when possible, the desired microphone
perspectives or the microphone placement, in order to
achieve the desired effect. The example from the piano in The Social
Network, to which I earlier referred, clarifies this point. The instrument
changed its sound in order to adapt to the narrative needs of the movie.
Moreover, the flexibility of the hyperinstrumental model allows for the
possibility to further transform the sound of the instrument in order to
achieve new sonic environments by adding sound processing.
At any of these stages (selection of the sound or instrument to
record, deciding the microphone perspective or placement, and choosing
a set of sound processors), there might be a meaning creation process
involved. In screen music, these new instruments are generally created in
order to convey a specific significance. For instance, for Nolan’s Dark
Knight trilogy (2005-2012), Zimmer and his team created a sample library
of Batman’s cape swish sound that would be integrated as a sort of
percussion instrument into the score. This hyperinstrument directly
denoted Batman by musicalizing the sound of one of his most iconic
gadgets. The reversed sounds that appear in Gravity (2013), analyzed in
the previous chapter, were used to signify a specific conception of
outer space and the absence of sound. Similarly, the hyperdrums utilized
in Man of Steel (2013) served as signifiers of humanity, thus
reinforcing a precise viewpoint of the myth of Superman. One of the
implications of creating instruments that are specifically customized for a
very specific purpose is that they lose the universality of the traditional
orchestral instruments, therefore becoming somewhat “liquid”, to use
Bauman’s approach to contemporary culture. As he states:
I use the term ‘liquid modernity’ here for the currently existing
shape of the modern condition, described by other authors as
‘postmodernity’, ‘late modernity’, ‘second’ or ‘hyper’ modernity.
What makes modernity ‘liquid’, and thus justifies the choice of
name, is its self-propelling, self- intensifying, compulsive and
obsessive ‘modernization’, as a result of which, like liquid, none of
the consecutive forms of social life is able to maintain its shape for
long. (Bauman, 2011, p. 11)
For Bauman, the concept of “liquid modernity” is a means to
express the postmodern contemporary condition. It is a concept that
defines the nature of hyperinstruments, especially when contrasted with
the established instruments of the Western orchestral tradition. They are
mostly single-use instances designed to satisfy a very particular need by
generating a custom tailored sound. Moreover, their ephemeral qualities
guarantee that the music will have a singular soundscape, distinct from
other movies. Composer Blake Neely (Folmann, 2014) describes this as
adding “ear candy”: the presence of a sounding element that becomes
unique and particular to the musical soundscape of the audiovisual piece:
I’m a big fan of mastering FX. They make everything pop. I’m a big
fan of filters, because they make any sound instantly unique. I am
religious about placing things in the stereo field. It’s a big field to
play in, so why put everything in full stereo? I position (pan) things
“around the room” for clarity and to help the mix come to life. And
lastly, I’m not finished with a cue until I’ve put some piece of “ear
candy” in it — something new, whether it’s a particular woodwind
voicing or a cool textural sound. Something that makes you want
to listen again.
In sum, in terms of music creation, a model for a hyperinstrument
becomes a specific instance of a general definition of an instrument. A
specific instance of a clarinet playing a certain type of staccato, recorded
and mixed with defined sound perspectives and processed employing
another set of concrete sound processors could be considered an
instance of an orchestral clarinet sound. However, a hyperinstrumental
attitude acknowledges the importance of a very specific sound in order to
produce meaning and a soundscape for the movie in which the music
integrates. Moreover, the possibilities offered by the hyperinstrumental
model go far beyond that, as discussed above. The hyperinstrument
might also generate sounds that are disconnected from any possible link
to a sound produced by physical means.
A Framework for the Hyperorchestra
Briefly, a framework for the hyperorchestra revolves around the
combination of hyperinstruments in the space defined in Figure 35, which
outlined the music in the hyperreal. The following, and final, chapter is
dedicated to the combination of hyperinstruments, which I refer to as
hyperorchestration. In this section, I will provide a framework with which
to approach the hyperorchestra, and from which to define the
hyperorchestration techniques that I will later discuss. As a virtual
ensemble, the hyperorchestra is less stable than the Western symphonic
orchestra, which is heavily grounded in a cultural background. In other
words, with a very basic set of orchestrational principles, a piece of music
written for the Western symphonic orchestra will sound reasonably
balanced and coherent.
However, let us imagine a hyperorchestra that includes an
instrument that connotes Japan, such as the shakuhachi, and another
instrument that connotes India, such as the tabla.115 Initially, the resulting
connoted meaning of the combination of those instruments is not
predictable. For instance, depending on the musical material that the
shakuhachi plays, the instrument might be assimilated into an Indian
bansuri flute, thus generating a mostly Indian musical result. Similarly, if
the rhythmic pattern of the tabla does not clearly associate the instrument
with an Indian performance practice, it might be assimilated into a small
Japanese taiko, shifting the sound toward a Japanese soundscape.
However, if both instruments are performed by clearly employing their
cultural performance practices, the sounding result (in terms of meaning)
is unpredictable.

115 India’s most iconic percussion instrument.

This problem does not only affect instruments from
different cultural traditions. For example, the piano in The Social Network
acquires its meaning only if properly combined with other sounds. If it
were combined with only an orchestral set of sounds, the effect of
different microphone placements would be much more difficult to
perceive, thus defeating the purpose of the different mixes that defined
this particular hyperinstrument. Thinking in terms of a hyperorchestra
involves a dual process of soundscape sculpting and the generation of
meaning (Figure 39).
Figure 39. Graphical representation of a conceptual framework for the
hyperorchestra. Its main purpose is to show that the hyperorchestra is
the result of an attitude that focuses on the sounding result in addition to
a process of generation of meaning.
These two main pillars are closely interrelated. For instance, the
virtual space where the music is placed could have an attached meaning.
This was the case in Interstellar (2014), where the cathedral in which the
organ was recorded contributed to its technological and religious
meaning in tandem with the sound of the pipe organ. Writing music using
a hyperorchestral model involves negotiating the generation of meaning
at the same time that the different sounds are distributed around the
spatial image and the sound spectrum, in order to generate the desired
soundscape.
Continuing with a metaphor related to the fine arts, the canvas for
building the sound for the hyperorchestra is, at least nowadays, the
humanly audible sound spectrum.116 Combining the different sounds in a
hyperorchestra involves deciding which space each one of the
hyperinstruments will occupy in the soundscape. This process might
interact with the definition of the hyperinstruments that are present in the
virtual ensemble. For instance, a filter might be required in order to
restrict the spectral range of a hyperinstrument. Hence, the process of
hyperorchestral scoring is similarly fluid and non-linear, as is the
contemporary process of movie scoring. Hyperorchestral decisions blend
and permeate the definitions of the hyperinstruments that make up the
virtual ensemble.

116 Composing and appreciating music that utilizes a sound spectrum that
goes beyond human perception would require humans to have a set of
bionic ears that would extend their aural capabilities. Even though this
might be plausible in the future, these types of devices do not currently
exist.
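The kind of spectral restriction mentioned above, where a filter confines a hyperinstrument to part of the spectrum so that the ensemble's sounds do not compete, can be illustrated with the simplest possible case: a first-order low-pass filter written directly in NumPy. This is a deliberately crude sketch of my own (the cutoff and sample rates are arbitrary), not the processing chain of any actual score:

```python
import numpy as np

def one_pole_lowpass(signal, cutoff_hz, sr=44100):
    """First-order (one-pole) low-pass filter: a crude way to confine a
    sound to the lower part of the spectrum."""
    # Coefficient from the standard RC-filter discretization
    dt = 1.0 / sr
    rc = 1.0 / (2 * np.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out = np.empty_like(signal)
    acc = 0.0
    for i, x in enumerate(signal):
        acc += alpha * (x - acc)  # exponential smoothing of the input
        out[i] = acc
    return out

sr = 8000
t = np.arange(sr) / sr
low = np.sin(2 * np.pi * 100 * t)     # component inside the allowed range
high = np.sin(2 * np.pi * 3000 * t)   # component the filter should attenuate
filtered = one_pole_lowpass(low + high, cutoff_hz=300, sr=sr)
# The 100 Hz component passes nearly intact; the 3000 Hz component is
# strongly attenuated, carving out spectral room for another instrument
```

A real mixing chain would use steeper, resonance-controlled filters, but the principle is the same: each hyperinstrument is assigned a region of the spectrum, and filtering enforces that assignment.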
Sound sculpting also involves a purely aesthetic set of decisions,
applied whenever possible. Frequently, there are diverse
means to achieve a desired meaning. The music for Gravity could have
included percussion and still retained a similar meaning. The decision to
not include any type of percussion, which in this case came from the
director, Cuarón, should be considered aesthetic. Similarly, the utilization
of reversed sounds is not the only possible technique that could have
portrayed the extraneous physical properties of outer space. Thus, the
decision to utilize reversed sounds was similarly based on aesthetic
grounds. These aesthetic decisions contribute to the creation of a unique
and particular soundscape for the audiovisual object in which they are
embedded.
Closely related to the aesthetics is the design of the virtual space
where the music will sound. As a virtual space, it does not necessarily
follow a purely three-dimensional design. In other words, the instruments
do not necessarily need to be placed in a single imaginary
three-dimensional hall. The piano in The Social Network changed its placement
dynamically. Similarly, a hyperinstrument that is the product of a mix of
324
diverse microphone perspectives would result in a multidimensional sonic
placement that would include the combination of the different locations in
the virtual space at the same time, and in different amounts of sound.
Similarly, the virtual space might be the combination of different threedimensional spaces, thus also becoming multidimensional. Designing a
virtual space with this amount of variation and possibilities is both
challenging and fundamental at the same time, considering that the
objective is to generate a cohesive sound that is aesthetically appealing.
In most of these instances, a very heterogeneous sound space might
decrease the verisimilitude of the sound, thus affecting the overall
meaning of the music.
In parallel with sound sculpting, the generation of meaning is a key
process in a hyperorchestral framework. As I have already described,
managing diverse meanings from different cultural traditions is one of the
main challenges for hyperorchestral writing. The analysis of the music for
The Lord of the Rings trilogy in the previous chapter serves as a
successful example of a widely multicultural-driven music creation
framework. Furthermore, the music for The Lord of the Rings reveals
some of the main features that allow for the presence of diverse cultural
traditions, while maintaining a cohesive message. First of all, the music
for Middle-earth targeted multiple cultures that each had their own
specific musical world. Second, Middle-earth is a fictional world. Third,
each of the cultures had an associated set of instruments that were
somewhat related. Therefore, the utilization of instruments from diverse
musical traditions in the music for The Lord of the Rings movies is
achieved by giving each of the cultures of Middle-earth its own music,
detached from any specific real-world cultural tradition. Within each
of these entities, the instruments were culturally related. Therefore, a
musical soundtrack that has several contained elements facilitates the
utilization of diverse instruments with different cultural backgrounds.
The sounds from which hyperinstruments are generated come from a wide
range of sources. Some of these sources are attached to common
objects that are part of everyday life. The cape sound from The Dark
Knight trilogy is an adequate example. Similarly, some hyperinstruments
present in Interstellar’s track, “Dust” (Zimmer, 2014), are modeled to
denote the sound of wind filled with dust. Thus, in addition to the
associations generated by the cultural traditions attached to the
instruments, meaning can be generated by the references through
sounds that are directly associated with human activities or experiences.
In this vast landscape of sound possibilities, it is relevant, in terms
of meaning, to assess the degree of verisimilitude of the resulting sound,
which can integrate with the narrative content of the movie. For instance,
the massive brass and impossible crescendos in Inception (2010)
sounded verisimilar although they carried a level of intensity that
seemed to extend beyond the physical world. From this angle, the level of
verisimilitude of the music is concomitant to the degree of verisimilitude
of the diegesis of the dreamed world, which is literally being folded by
Ariadne (Ellen Page). Managing how the degree of verisimilitude of the
music interacts with its material becomes important in order for the music
to provide meaningful content.
Finally, meaning can be generated in the hyperorchestra through
the traditional process of referential meaning. For instance, the sadness
that is frequently associated with the minor mode will permeate into a
hyperorchestral framework. However, the possibility to generate meaning
from a much wider range of perspectives, in conjunction with the chance
to create extremely rich and varied soundscapes, could alter the effect of
traditional referential meaning. A minor chord might easily be eclipsed by
an overly positive soundscape, or by a specific melodic structure
associated with a particular cultural tradition.
In the description of a framework for the hyperorchestra, I
highlighted the immense possibilities in terms of sound variability and
generation of meaning that it offers, while identifying the possible risks
that emanate from such a flexible device. The hyperorchestra should
similarly be considered a “liquid” cultural entity with incredible
expressive power, at the expense of having solid and established
foundations. In other words, creating music with the hyperorchestra
offers an enormously expanded range of musical prospects at the price
of losing the safety net that the Western symphonic orchestra model
offers. Now that the different models for working with music within the
hyperreality are defined, I will dedicate the next, and last, chapter to
analyzing some of the major techniques of hyperorchestration, alongside
describing how these techniques allow myriad ways of musical
expression.
CHAPTER IX
HYPERORCHESTRATION
Introduction
In the preface to the third edition of The Study of Orchestration
(2002), Samuel Adler begins by admitting that he failed, twenty
years before, when he attempted to predict the evolution of Western
orchestral music:
In 1979, I stated that music of the last quarter of the twentieth
century would be even more complex and even more experimental
than in the decades since World War II. New methods of notation
would be devised, new instruments would be invented, and
possibly even new concert spaces would be created to
accommodate the cataclysmic changes that I predicted would
occur. (p. ix)
He believed that “it is indeed an understatement to say that my
soothsaying was dead wrong” (Adler, 2002, p. ix) because, in his
experience, orchestral music became simpler during the last quarter of
the 20th century. Further, he believed that orchestration had followed a
similar path:
A similar situation exists in the realm of orchestration. Although new
notation and extended instrumental techniques were all the rage
from the mid-twentieth century through the middle 1970s, a more
traditional approach to the orchestra seems to have regained a
foothold, despite all of the previous focus on experimentation.
(Adler, 2002, p. ix)
The present analysis of the hyperorchestra serves to indicate that,
in reality, Adler was not entirely incorrect in his forecasts; they ultimately
took a few more years than expected to materialize. Digital Audio
Workstations, MIDI and sample libraries were developed during these two
decades, generating new methods of notation, new instruments, new
spaces and new complex modes of sound creation. Although Adler was
thinking about the physical world exclusively, he predicted an evolution
that could never have occurred, in practical terms, in that limited world.
However, everything that he described was eventually manifested in the
expansion of music through the digital realm, thus generating the
hyperorchestra. This chapter focuses on the extended orchestrational
techniques that apply when working with a hyperorchestra, which I define
as hyperorchestration.
Traditional Orchestration: An Overview
In order to establish the grounds for describing the processes
involved in hyperorchestration, I will begin by providing some general
principles that govern traditional orchestration. In addition, I will describe
the main concepts that define spectral music and its orchestrational
techniques, which, from a certain angle, provide hyperorchestral content
to the traditional orchestra. Continuing with the overview of Adler’s (2002)
seminal book: at the beginning of the instrumentation section of the
text, the author describes what the orchestra means in his
opinion:
The orchestra is certainly one of the noblest creations of Western
Civilization. The study of its intricacies will illumine many important
areas of music. After all, timbre and texture clarify the form as well
as the content of a host of compositions. Further, specific
orchestral colors and even the spacing of chords in the orchestral
fabric give special ‘personality’ to the music of composers from
the Classical period to our own time. (p. 3)
Adler articulates the established status of the orchestra as a
cultural institution for Western culture, as discussed in the previous
chapter. Moreover, orchestration is a main force for the creation of music,
especially in order to clarify and generate structure. Later in the text, at
the beginning of the ‘Orchestration’ section of the book, Adler (2002)
reminds his readers that:
Scoring for orchestra is thinking for orchestra. When dealing with a
composite instrument like the orchestra you must be completely
familiar with the character and quality of the orchestra’s
components: the range and limitations of each instrument as well
as how that instrument will sound alone and in combination with
other instruments. The timbre, strength, and texture of every
segment of the instrument’s range become crucial when you are
creating orchestral color combinations. (p. 547)
These two excerpts stress the interconnection between
instrumentation and orchestration. To orchestrate is to combine the sounds
of the instruments in order to create musical textures and timbres, which
will serve diverse objectives. For Adler, one of the main objectives of
orchestrating is to aid in the creation of a musical structure. In order to
properly combine the sound of the instruments, it is necessary to not only
know how to write for them but also to know which sound you can
expect to hear. In fact, the concept of a physical instrument is misleading,
as it might seem that its physical integrity would generate a cohesive set
of sounds. However, it is precisely because of the physical nature of the
instruments that their sounds tend to not be homogeneous, as the
physical elements that constitute the instrument react differently
depending on performance factors, such as pitch and dynamic. Thus, the
sound of a clarinet will differ in terms of timbre depending on the pitch
even when performed by the same performer using the same clarinet.
Similarly, a dynamic increase on a note will not only increase its volume
but will also modify its timbral characteristics. Moreover, beyond a
certain dynamic level, the sound begins to be significantly
distorted. In addition to the changes caused by pitch and dynamics,
each instrument can generally be played with a varied range of
techniques, which likewise generate a diverse assortment of sounds.
From a conceptual standpoint, orchestrating involves
acknowledging that each instrument is able to generate a collection of
different sounds, although this is limited to a contained scope (e.g. the
timbre of a clarinet when played in a loud dynamic level cannot be
achieved when playing in a soft dynamic). Orchestrating a musical piece
requires the utilization of these varied sounds in a manner that is possible
to perform. Although each instrument is able to generate multiple sounds,
it is generally only possible to generate one (or a few) at the same time.
Therefore, part of orchestrating is also managing the orchestral forces in
order to achieve the desired results in the best possible manner.
Moreover, orchestrating becomes planning which instrumental forces to
employ in order to shape a musical structure for the piece, at the same
time that it displays and clarifies the rest of the musical features of the
composition.
The Spectral Movement
The spectral movement is, in my opinion, the most relevant
compositional approach in the field of contemporary Western classical
music that focuses on the expansion of the sound from an organic point
of view. In Did You Say Spectral? (2000), Gérard Grisey, who is
considered one of the founders of the movement, reviewed the
emergence of the spectral movement, which “offered a formal
organization and sonic material that came directly from the physics of
sound, as discovered through science and microphonic access” (Grisey,
2000, p. 1). Grisey (2000) summarizes the main aspects of the spectral
attitude in comparison to other approaches to music:
What is radically different in spectral music is the attitude of the
composer faced with the cluster of forces that make up sounds
and faced with the time needed for their emergence. From its
beginnings, this music has been characterized by the hypnotic
power of slowness and by a virtual obsession with continuity,
thresholds, transience and dynamic forms. It is in radical
opposition to all sorts of formalism which refuse to include time
and entropy as the actual foundation of all musical dimensions. […]
Finally, it is sounds and their own materials which generate,
through projections or inductions, new musical forms. (pp. 1-2)
In a similar manner, Grisey’s colleague, Tristan Murail (Bruce &
Murail, 2000) describes the beginning of the movement in terms of their
exploration of sound:
I think that it is chiefly an attitude toward musical and sonic
phenomena, although it also entails a few techniques, of course.
We were trying to find a way out of the structuralist contradiction.
(…) We wanted to build something more sound (pun intended).
This was part of it, as was a curiosity about sounds. Also, at that
time, the information that we required was not as readily available
as it is today. Gérard Grisey and I had read books on acoustics
that were designed more for engineers than for musicians. There
we found rare information on spectra, sonograms, and such that
was very difficult to exploit. We also did our own experiments. For
example, we knew how to calculate the output of ring modulators
and, a little later, frequency modulation. Those things were,
theoretically, quite easy to manipulate. (p. 12)
In these two viewpoints of the founders of the spectral movement,
three main ideas arise. First, the movement began as a reaction to the
formalist or structuralist forms of musical composition that were a
product of the Darmstadt School after the end of the Second World War. Next, they
focused on the sound as the main material for music construction. In
order to study it, they needed to have technical knowledge and,
ultimately, this required the aid of technology, which served to reveal the
physical properties of the sound. Last, spectralism should primarily be
considered as an attitude towards music composition rather than a
school of composition, as it does not formulate a set of compositional
axioms that should be followed. Instead, the Spectral School suggests
paying closer attention to the sound itself and creating music by
considering its nature.
The close relationship between the spectral movement and the
development of technologies to analyze sound is also acknowledged by
Murail in the same interview: “I think, in fact, that there had been a
historic conjunction between an aesthetic movement, the spectral
movement, and the techniques, research, and software developed at the
[Institut de Recherche et Coordination Acoustique/Musique] IRCAM” (Bruce & Murail, 2000,
p. 13). In a review of the relationship between technology and music
creation in the area of spectral music, Daubresse and Assayag (2000)
state:
Their compositional techniques [of the Spectral composers] were
already sufficiently rich and sophisticated to make them at ease in
front of both analyses and synthesizers; they [The Spectral
composers] went from an acoustical and musical multi-representation to the programming of processes for the generation
of symbolic or sonic material. Manipulating timbre -- but also
traditional instruments -- with ease, freed from repetitive
calculation, they certainly gave synthesis some of its first proofs of
musical respectability. (p. 62)
One of the earliest developments that spectral composers utilized
to enrich the traditional orchestration of the symphonic orchestra was the
translation of some acoustical findings in order to generate new
soundscapes. The most well known of these techniques is frequently
referred to as orchestral synthesis:
Perhaps the most important idea emerging from early spectral
music (though it was presaged in other musics [sic]) was the idea
of instrumental (or orchestral) synthesis. Taking the concept of
additive synthesis, the building up of complex sounds from
elementary ones, and using it metaphorically as a basis for
creating instrumental sound colors (timbres), spectral composers
opened up a new approach to composition, harmony and
orchestration. The sound complexes built this way are
fundamentally different from the models on which they are based,
since each component is played by an instrument with its own
complex spectrum. Thus the result is not the original model, but a
new, much more complex structure inspired by that model. The
sounds created in this way keep something of the coherence and
quality that comes from the model while adding numerous
dimensions of instrumental and timbral richness and variety.
(Fineberg, 2000, p. 85)
Fineberg describes how the spectral composers employed the
acoustic processes that were derived from the concept of the Fourier
transform, which is, broadly, the mathematical process of dividing any
sound into a series of sine waves of different frequencies. They used the
Fourier transform to shape an orchestral sound that was the product of
additively inserting the different frequencies that constituted the harmonic
spectrum of a sound.117 As Fineberg points out, the result generates a
complex sound spectrum due to the physical nature of the orchestral
instruments, which do not behave simply as sine generators, but maintain
the overall approach to additive synthesis. From this viewpoint, the sound
produced through orchestral synthesis is purely acoustic, although it
could have not originated without the aid of non-physical means of sound
analysis, thus transforming its nature beyond the culturally expected
soundscapes for the music produced by the symphonic orchestra. In
outlining the consequences of the spectral attitude, Grisey listed the
following timbral aspects:118

- More 'ecological' approach to timbres, noises and intervals.
- Integration of harmony and timbre within a single entity.
- Integration of all sounds (from white noise to sinusoidal
sounds). (Grisey, 2000, pp. 1-2)

117 For instance, Grisey’s Partiels is modeled after the sound spectrum of
a low E performed by a trombone.
118 Grisey described seven timbral consequences, although these three
are the most salient.
All three features are aspects that define the process of
hyperorchestration. In essence, the concepts described by the spectral
movement resonate with the primary concepts of the hyperorchestra in
terms of sound building. Composing for a hyperorchestra is writing music
while considering the soundscape, which is the equivalent of having a
spectral attitude towards sound. However, they do differ in relation to
their generation of meaning. By using the hyperorchestra, composers
acknowledge the cultural anchors of the sounds it uses, which are
employed for the creation of meaning and emotional content. Negotiating
with the sound in the hyperorchestra involves much more than a timbral
development, as it includes the construction of a cohesive and expressive
layer of meaning. Therefore, the sounds acquire a level of signification
beyond their pure timbral characteristics. Nevertheless, the musical
paradigm that emanates from the spectral school greatly informs how the
process of hyperorchestration functions, at the same time that it provides
an additional link between the new ensemble and the traditional
orchestra.
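Fineberg's account of orchestral synthesis rests on the additive-synthesis idea that a complex tone can be built from sine-wave partials. A hedged sketch of that underlying idea follows (the fundamental alludes to the trombone low E behind Grisey's Partiels, but the partial amplitudes are invented for illustration):

```python
import math

def additive_sample(f0, partial_amps, t):
    """One sample of a tone built from sine partials at integer multiples of f0 (Hz)."""
    return sum(amp * math.sin(2 * math.pi * f0 * k * t)
               for k, amp in enumerate(partial_amps, start=1))

# Invented spectrum: a strong fundamental with progressively weaker partials.
spectrum = [1.0, 0.6, 0.45, 0.3, 0.2]
sample = additive_sample(41.2, spectrum, t=0.01)  # low E1 is roughly 41.2 Hz
```

In orchestral synthesis, each sine term would instead be assigned to a real instrument with its own complex spectrum, which is why, as Fineberg notes, the result is far richer than the analyzed model.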
Music Software Tools
The generation of music that contemplates the full sound spectrum
is tied to the utilization of technology that supports the process. Working
with the sound spectrum involves utilizing knowledge that is not directly
achievable just by pure observation of nature. Above all, the computer,
which is capable of integrating and virtualizing diverse equipment, has
become an indispensable piece of technology for music composition. This
fragment describes how spectral music requires computer-aided tools:
For composers, the computer has become, bit-by-bit, an
irreplaceable working tool; as important as the pencil and eraser. It
fills several functions, from printing out scores to playing back
synthesized sounds in real-time, and passing through the
development and use of compositional algorithms, simulating
orchestrations or even controlling synthesizers (virtual or external).
It is important to remember how much computers have profoundly
changed not only the daily work habits of musicians and scientists,
but also, indirectly, in the ways that they conceive of sonic
phenomena, speculate on their possible manipulations, formalize
and even express and communicate their ideas. These different
elements have allowed a giant qualitative leap in analysis,
synthesis and computer assisted composition. (Daubresse &
Assayag, 2000, p. 63)
This explanation of the central role of the computer is broadly
applicable to the process of writing music for the hyperorchestra. So far, I
have mentioned how the limitations of the score were overcome by the
utilization of MIDI with visual interfaces such as Logic Pro’s piano roll. I
also mentioned how Digital Audio Workstations (DAW) are at the center of
the process of music production for the audiovisual media, governing all
the different processes involved in creating the final score. The DAW
integrates and communicates with virtual instruments (sample libraries
and synthesizers) in order to produce sound from MIDI information.
Moreover, the sound outputted by these virtual instruments might be
further transformed by employing integrated sound processors. Finally,
the DAW also serves as a platform upon which to record and manage
sound. Hence, sound files inside of a DAW might be merged with sound
produced by utilizing MIDI data. The DAW thus becomes a truly
virtual version of the music studio for both the music composer and the
studio engineer.
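The MIDI information a DAW turns into sound is compact event data, not audio. As a rough illustration of how little a single note occupies at this level, the following hand-rolled sketch (not how any particular DAW stores its projects) assembles a complete one-note, format-0 Standard MIDI File byte by byte:

```python
import struct

def minimal_midi(pitch=60, duration_ticks=96):
    """Build a one-note, format-0 Standard MIDI File as raw bytes."""
    # Track data: each event is <delta-time, status, data1, data2>.
    # Delta times here fit in one byte, so duration_ticks must stay below 128.
    track = bytes([
        0x00, 0x90, pitch, 100,          # delta 0: note on, channel 1, velocity 100
        duration_ticks, 0x80, pitch, 0,  # after duration_ticks: note off
        0x00, 0xFF, 0x2F, 0x00,          # end-of-track meta event
    ])
    # Header chunk: 6 data bytes -> format 0, one track, 96 ticks per quarter note.
    header = b"MThd" + struct.pack(">IHHH", 6, 0, 1, 96)
    return header + b"MTrk" + struct.pack(">I", len(track)) + track

data = minimal_midi()  # 34 bytes: a middle C lasting one quarter note
```

A real sequence would use variable-length delta times and many more event types, but the principle is the same: the DAW stores instructions, and virtual instruments render them into sound.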
It is important to mention that there are different types of DAWs,
which focus on different parts of the process of music production. First,
there are the platforms usually known as notation software.119 They are
the pure virtual translation of the traditional composing studio for the
classical composer. They provide an interface for score writing with
extended editing capabilities, similar to the features of a word processor.
They allow a MIDI keyboard to be connected in order to produce sound
and to facilitate the process of note inputting. In addition, they provide
rudimentary playback, which might replace the need to use a piano to
reproduce and hear the music composed. Their greatest strength is that
they are the only software that is able to produce all sorts of
professionally edited musical scores. Therefore, they are the DAWs used
in most of the processes that involve the creation of the score.
Fortunately, they integrate with MIDI, allowing the creation of scores from
MIDI content and the creation of MIDI files from the scores written within
the program. Notation software is able to import MIDI sources from other
DAWs that could be used to generate a traditional score. However, the
differences between the expanded possibilities of MIDI and the limitations
of the Western musical score imply that some process of transformation
is needed in order to produce a playable score.

119 Finale and AVID’s Sibelius are the two main platforms for score
notation.
Second, there are the DAWs that are mostly focused on music
production using MIDI and existing sounds. Logic Pro, which I mentioned
before, is an example taken from this category.120 These are the DAWs to
which I mostly refer when discussing the process of hyperorchestration.
Finally, the last category mainly includes AVID’s ProTools, which is
hegemonic as the preferred platform in recording studios. ProTools is the
standard for music recording, mixing and mastering. DAWs, such as
Logic Pro, are also able to act as virtual recording studios, although they
are not integrated into the studio hardware in the same manner as
ProTools. However, from the perspective of a home recording studio,
Logic Pro and similar DAWs are valid tools for recording music. Similarly,
ProTools is able to work with MIDI and sample libraries, although its
editing capabilities are somewhat limited. Although their integrated score
editors are very limited, all these DAWs are able to generate musical
scores. The DAWs from the second category are the principal tool for
writing music for audiovisual media. Furthermore, they are the most
convenient to use when composing for the hyperorchestra. Finally, they
are the only ones capable of becoming the core of the scoring process,
as defined in the previous chapter. As a final remark, the utilization of
computer tools is key for a nonlinear approach to the process of music
creation and movie scoring, which is essential for the new paradigm of
music writing.

120 MOTU’s Digital Performer and Steinberg’s Cubase are the other two
main MIDI-oriented DAWs.
Defining Hyperorchestration
The definition of hyperorchestration derives from the principles of
traditional orchestration, which I discussed above. Western orchestration
builds upon the established model of the symphonic orchestra, which
serves as a means to stabilize several sound parameters. Thus, the study
of how timbre and texture are tools for musical expression and structure
involves learning, on the one hand, how to write for the instruments of the
orchestra and, on the other hand, how to combine them within the
restricted environment of the orchestral set-up. Instead of just studying
instrumental timbre, hyperorchestration acknowledges the full sound
spectrum, which is not far removed from how the spectral movement
addresses orchestration. From the point of view of the current discussion,
timbre just becomes a subset of the full sound spectrum. The concept of
timbre defines the spectral qualities of the diverse instruments in terms of
their general sound template. For instance, timbre will distinguish
between the sound qualities of a flute in comparison with the string
section. The differences will be evaluated by extracting the general
spectral characteristics of the sound produced by a generic flute (and a
generic string section) placed in its appropriate position within the
orchestral ensemble. These general characteristics become the sound
template for the orchestral flute, which can then be compared to similar
models for the rest of the instruments of the orchestra.
Hyperorchestration involves going beyond the previous definition
of timbre to evaluate the aural qualities of either a sound or a
hyperinstrument, which become individual and unique instances that
draw from physical instruments, synthesizers, recording techniques and
sound processing. Moreover, the placement of the instruments is not
limited to the assumed position in the orchestral ensemble. In fact, the
instruments are not limited to an established physical position at all.
Hyperorchestration implies considering the whole soundscape as the
canvas for sound generation. Finally, hyperorchestration extends the
denotative and connotative associations in the Western orchestra by
acknowledging that meaning is, in fact, one of the pillars of the
construction of a desired sound.
The rest of the chapter is divided into three sections, which will
delineate the different processes outlined in the definition of
hyperorchestration. First, I will define how mixing becomes a
hyperorchestration tool and the possibilities it offers. Then, I will explore
the principles that govern the creation of hyperinstruments and their role
as very specific instances that serve a unique purpose. Finally, I will
explore the combination of these hyperinstruments and the extended
possibilities they offer.
Mixing as a Hyperorchestration Tool
One of the key elements for the process of hyperorchestration is to
successfully negotiate the soundscape once music is liberated from the
restrictions of the standardized model of the symphonic orchestra. In
order to contextualize the paradigm shift that hyperorchestration offers
when compared to traditional orchestration, I will begin by analyzing the
beginning of Mahler’s First Symphony, which represents an attempt to
transcend the limitations of the orchestral sound, thus generating a new
soundscape for trumpets. In so doing, the trumpets also acquire new
meaning.
Mahler’s First Symphony
At the beginning of Mahler’s First Symphony, the composer asks
the trumpet players to perform as if they were very far away. In concert
performances of the piece, this is regularly achieved by temporarily
placing the trumpet players offstage.121 Besides the effect that such
spatial placement of the trumpets might have in terms of musical
signification (an awakening call that comes from far away), this
unconventional placement engenders a new sonority that generates a
new array of orchestral interactions. In terms of orchestration, this
unorthodox placement implies that the conventional principles of
orchestral trumpet scoring will not be applicable when the trumpets are
offstage. In the score, Mahler notated the trumpet part with a ppp
dynamic, which is softer than the pp written for the woodwinds that are
playing just before the trumpets’ entrance (Figure 40).
121 Mahler’s First Symphony is neither the only nor the first case of the
utilization of this technique in classical music. However, it is one of the
best-known and most popular examples.
Figure 40. Score sketch for Mahler's First Symphony (m. 17-25)
Examining the most common performance practices for this
beginning, one realizes that the ppp dynamic refers to the resulting
loudness of the trumpets, from the perspective of the concert stage.122 In
other words, the trumpet players are not performing a ppp dynamic; they
are performing in a much louder dynamic (mf or f) in order to compensate
for the attenuation of their sound, which is a product of their placement.

122 Or the conductor.
When the offstage trumpets interact with the clarinet, the clarinet
becomes significantly louder than the trumpets, although it is not
performing at a significantly louder dynamic level.
The loudness is not the only sound property altered when the
trumpets play offstage. The timbre of the trumpets is also altered by the
distance and by the walls between their location and the stage. Thus,
even though the performers would probably be able to play onstage more
quietly than the offstage sound, the result would be significantly different.
Consequently, performing the opening of the trumpet part onstage in a
quieter dynamic would significantly alter the sound that the composer
envisioned for the beginning of the symphony.
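The gap between the notated ppp and the trumpets' performed mf can be roughed out numerically. Under a free-field, inverse-square idealization (which deliberately ignores the walls and reverberation that also reshape the offstage timbre), sound pressure level falls by about 6 dB for every doubling of distance. The figures below are hypothetical, chosen only to show the order of magnitude:

```python
import math

def spl_at_distance(spl_at_1m, distance_m):
    """Free-field inverse-square attenuation of sound pressure level (dB SPL)."""
    return spl_at_1m - 20 * math.log10(distance_m)

# Hypothetical figures: a trumpet producing about 85 dB SPL at 1 m,
# heard from 16 m away through open doors.
perceived = spl_at_distance(85, 16)  # about 61 dB: soft-dynamic territory
```

On this rough model, a confidently played offstage mf reaches the hall at a level the listener reads as ppp, which is exactly the compensation described above.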
In light of this situation, it is necessary to examine other variations
made to the sound of an orchestral piece that are much more common, in
order to compare them to the situation presented by Mahler’s offstage
trumpets. In an orchestral performance, there are logistic decisions that
must be made that will affect the sound. For example, deciding to place
the second violins to the right side of the stage, or choosing the cymbal
size for a cymbal part will clearly modify the final sound. The sound of the
performance will also be influenced by elements that are directly related
to the performers. For example, the performers choose which instrument
they will use. Similarly, the performers’ specific instrumental
technique and their technical level of expertise will also condition the
resulting sound. Finally, the acoustics of the hall will also greatly affect
the result. However, none of these aspects is generally considered, in
Western culture, to be significant alterations of the sound of a piece of
music; at least not in a comparable manner to changing the placement of
the trumpets onstage for the beginning of Mahler’s symphony. This holds
true even when these alterations have a greater effect on the resulting
sound. In other words, a performance of the symphony in a cathedral will
still be considered a performance of the same piece as if it were played in
a concert hall. However, a performance in which the trumpets play
onstage all the time would be considered, at the very least, a modification
of the piece.
The implications of this irregular aesthetic assessment could lead
to a variety of discussions, although I will concentrate on one of the
repercussions: there are no solid ontological grounds on which to
quantify the degree of significance of any of these alterations if only
considering their influence on the resulting sound. As a consequence,
judging which alterations are acceptable and which are not becomes
symbolic; a product of the conventions used in Western classical music
as discussed at the beginning of the chapter. These conventions
establish when a sound modification alters the piece (changing the
placement of the trumpets, changing instruments) and when it does not
(performing the piece in different halls).
This musical example reveals the extent of the arbitrariness that
Western classical musical practices employ in order to consider a set of
performances to be equivalent to a common model. The absence of
solid sonic grounds for justifying these types of decisions opened the
door for the evolution of a hyperorchestral model that is inclusive in terms
of its acknowledgment of sonic variety.
Defining Mixing
Mahler’s decision on the trumpet placement at the beginning of his
symphony was one of the few methods at the composer’s disposal to
alter the musical mix of the orchestra, prior to the invention of the
recording. Thus, the true concept of sound mixing is interrelated with the
process of recording music. In its pure and original definition, mixing was
the process of joining together the different recorded elements of a song,
and altering them in order to create a sound image that was as close as
possible to the live sound that preceded the recording. Moylan (2014), in
his respected book Understanding and Crafting the Mix, defines the
process of mixing a song as follows123:
The mix is crafted to bring musical ideas and their sound qualities
together. The song is built during the mixing process by combining
the musical ideas, focusing on shaping the sound qualities of the
recording. The recording is the song, and it is the performance of
the piece of music. A successful mix will be constructed with a
returning focus on the materials of the song, and the message of
the music. The musical ideas that were captured in tracking are
now presented in the mix in ways that best deliver the story of the
text and the character of the music. (p. 418)
A key concept in Moylan’s definition is the importance of focusing,
during the mixing process, on the creator’s intended meaning for the
music. This is achieved by adapting the different pieces captured during
the recording (or created by the synthesizers) while putting them
together. Indeed, Moylan (2014) suggests that mixing is, in fact,
composing music:
The process of planning and shaping the mix is very similar to
composing. Sounds are put together in particular ways to best suit
the music. The mix is crafted through shaping the sound stage,
through combining sound sources at certain dynamic levels,
through structuring pitch density, and much more. How these
many aspects come together provides the overall characteristics
of the recording, as well as all its sonic details. Consistently
shaping the “sound” of recordings in certain ways leads some
recordists to develop their own personal, audible styles over time.
(p. 416)
123
His expertise is in popular songs and his book is focused on the
production of songs. However, his principles are broadly applicable to
different types of recordings.
If mixing involves using the different elements that were recorded
in order to generate a song, mixing is, in fact, part of the compositional
process. This is especially true when applying my model of music writing
in the hyperreal, presented in the previous chapter. However, in Moylan’s
framework, mixing is contained within the few processes and strategies
he briefly mentioned. Moreover, Moylan’s approach is broadly linear in
terms of its music production process, placing the mixing process at a
very specific stage of the production of the song. My proposed model for
music creation outlines a process that becomes nonlinear, in which the
techniques associated with mixing become an integral and intertwined
part of the entire creative process. In this environment, mixing evolves to
include a much broader approach to sound modification.
Therefore, mixing comprises a set of processes dedicated to
placing a specific sound or virtual instrument within the virtual
soundscape while shaping its sound spectrum. In other words, mixing is
sculpting the sound and placing it on the virtual canvas that is presented
to us. It also contributes to the creation of hyperinstruments and their
associated meaning, thus acting in a similar manner to how traditional
orchestration functions when creating textures and timbres. Mixing also
involves deciding how to distribute the total sound level (the maximum
amplitude of the final sound) between the frequency areas. Figure 41
includes three different graphics that represent these diverse allocations.
Figure 41. Amplitude distribution within the frequency spectrum. All three
graphics represent a situation in which the sound output utilizes the
maximum amplitude available.
All three graphics have roughly the same output stereo level.124 The
first one shows a rather homogenous division of the total sound level
124
It is important to note that sound amplitude is normally measured
in decibels, which is a logarithmic unit. Thus, the relationship in terms of
sound amplitude between -10dB and -20dB is exponential rather than linear.
across the frequency range. Graphics two and three show how the total
stereo level is filled by sound in the low and the high frequency range, respectively. In both
cases, the sounds in those frequencies are “allowed” to be louder, as
they are alone in filling the output levels. This process of deciding how to
distribute the total sound level available between the different frequency
ranges is similar to deciding how to distribute musical content among the
instruments of a musical ensemble.
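The allocation described above can be sketched numerically. The following Python fragment is a minimal illustration (the frequencies and amplitudes are arbitrary choices, not taken from Figure 41): when two frequency bands share the output, each can only use part of the available headroom, whereas a band that is alone may sit 6 dB louder while the overall peak remains at full scale.

```python
import numpy as np

SR = 48000                      # sample rate (Hz); all values are illustrative
t = np.arange(SR) / SR          # one second of time

def peak_dbfs(x):
    """Peak level relative to full scale (amplitude 1.0 = 0 dBFS)."""
    return 20 * np.log10(np.max(np.abs(x)))

# Two bands sharing the output: each can only use half the headroom
# (amplitude 0.5, i.e. -6 dB per band).
low = 0.5 * np.sin(2 * np.pi * 60 * t)
high = 0.5 * np.sin(2 * np.pi * 5000 * t)
shared = low + high             # the sum peaks near 0 dBFS

# The low band with the headroom to itself: it may be 6 dB louder,
# yet the output still peaks at the same full-scale level.
low_alone = 2 * low
```

The overall peak is roughly the same in both cases; what changes is how loud each frequency range is "allowed" to be within it.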
In the following sections I will describe the three main processes
involved in mixing when used as a hyperorchestration tool: the creation of
a sound perspective for a hyperinstrument, the utilization of sound
processing tools and the design of a virtual space in which to place the
music.
Defining a Sound Perspective
In the model for the design of hyperinstruments (Figure 42), I
included a stage that involves deciding the microphone placement and
the final sound perspective of the hyperinstrument. The main theme in
The Social Network (2010) exemplified the possibilities of employing the
sound perspective as a musical tool to create meaning.
Figure 42. Hyperinstrument model revisited.
The process of generating a perspective is also relevant because a
significant number of recent sample libraries have included diverse sound
perspectives.125 This implies that deciding upon, and creating, sound
perspectives has become one of the creative processes when writing
music using virtual instruments. Further, the model also includes the
process of deciding the microphone placement in a recording. During the
analysis of the hyper-drums present in Man of Steel (2013), I described
125
They are also called microphone positions, although the term might be
misleading. Each position does not necessarily reflect the sound captured
by a single microphone, but it might be the result of the sounds captured
by several microphones already mixed in order to create a perspective.
how the utilization of microphones placed close to each of the drums was
essential in order to produce the final sound. Recording non-linearly and
in isolated groups also encourages the utilization of multiple, specific
microphones with which to shape a particular perspective for the new
recordings, an approach that is aesthetically similar to deciding the
sound perspective in a virtual instrument. In any case, both processes
provide the composer or the mixer with access to several microphone
recordings (or already mixed perspectives) from which to generate a
specific perspective. Figure 43, which is reminiscent of Moylan’s
templates for stereo analysis (Moylan, 2014), represents the three
different mixes of the piano present in The Social Network (2010):
Figure 43. Visual representation of the piano mixes in The Social Network
(2010). The composers utilized three different microphone positions at
different distances from the source.
With only these three recordings, it is possible to generate multiple
customized perspectives by mixing the different microphone positions.
For example, one mix could be mainly created with perspective 1, with
some parts of perspective 3. The generation of a sound perspective could
be made even more sophisticated by individually altering the position in
the stereo field (panning) for some of the perspectives, and mixing them
together afterwards (Figure 44).
Figure 44. Mixing perspectives with different panning. Although the
source is just one piano, the resulting sound generates the impression
that the piano is in multiple locations at the same time, thus becoming
multidimensional.
Altering the positioning of the microphone perspectives captured in
the stereo field produces a significant result: it is no longer possible to
establish a single physical placement for the piano in a two-dimensional
stereo image. The position of the piano becomes multidimensional, with
each additional perspective generating a new dimension. Conceiving a
multidimensional stereo field surpasses the possibilities offered by the
physical world, thus becoming hyperreal.
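A rough sketch of this process, assuming three hypothetical mono perspectives of a single source (placeholder noise signals, not actual recordings from the score), might mix them into one stereo image with individual gains and constant-power pans:

```python
import numpy as np

def pan(mono, position):
    """Constant-power pan: position 0.0 = hard left, 1.0 = hard right."""
    theta = position * np.pi / 2
    return np.stack([mono * np.cos(theta), mono * np.sin(theta)])

def mix_perspectives(perspectives, gains, positions):
    """Sum several mono perspectives of one source into a stereo image,
    each with its own gain and pan position."""
    stereo = np.zeros((2, len(perspectives[0])))
    for signal, gain, position in zip(perspectives, gains, positions):
        stereo += gain * pan(signal, position)
    return stereo

# Placeholder signals standing in for close, mid, and room microphone
# perspectives of a single piano (hypothetical data, not the film's stems).
rng = np.random.default_rng(0)
n = 48000
close, mid, room = (0.1 * rng.standard_normal(n) for _ in range(3))

# Close perspective toward the left, room perspective toward the right:
# the one piano now occupies several places at once in the stereo field.
image = mix_perspectives([close, mid, room],
                         gains=[0.8, 0.5, 0.6],
                         positions=[0.1, 0.5, 0.9])
```

Because each perspective carries its own pan position, the single source no longer maps to a single point in the stereo field.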
Sound Perspectives in Spitfire’s Hans Zimmer Percussion (2013)
The British sample library production company Spitfire Audio has
provided one of the most groundbreaking additions in terms of sound
perspectives. In 2013-14, it released a revolutionary triptych of sample
libraries called Hans Zimmer Percussion, which was already mentioned in
Chapter VII. This is a set of sample libraries dedicated to epic drums,
which aims to mimic the iconic style of Hans Zimmer’s drum writing
during the first decade of the 21st century. On the product’s website, the
company describes the recording process:
Recorded at Air Studios via an unsurpassable signal chain. 96
rarefied microphones, into Neve Montserrat pre-amps, the world's
biggest Neve 88R desk (which was exhaustively re-gained at every
dynamic layer for optimum signal quality), via a dual chain to HDX
and Prism converters running at 192k. Well over 30 TB of raw
material from these sessions alone. (SpitfireAudio, 2013)
Therefore, each sound included in the library was recorded by
using 96 microphones. The microphones were made by a variety of
brands and according to various typologies, and they were placed in
different locations. Moreover, in most of the sample libraries, the sound
engineers would only use some of these microphones to generate three
or four different sound perspectives that would be included in the final
product. In this particular library, the producers decided to ask for
different sets of sound perspectives from the most renowned sound
engineers of the time, in addition to Hans Zimmer himself, who also
provided his particular perspective on the sounds. Furthermore, most of
the engineers are Zimmer’s regular collaborators.
As a result, the library provides multiple versions of similar
perspectives that nevertheless differ significantly from one another. From this
viewpoint, the library is exemplary in its ability to show the importance of
mixing in the process of hyperorchestration. Each of the perspectives
included in the library might serve to communicate different ideas. The
users are expected to mix and create their own singular perspective from
the set of a particular engineer. However, they are not just limited to
doing that. They might choose to combine different perspectives from
diverse engineers in order to create a very specific and singular sound
result. Moreover, they could further process each one of the
perspectives.126 The result will be a mix of the different aesthetic
conceptions of the engineers, in addition to the ideas added by the
composer.
126
For example, by altering the panning.
Sound Processing Tools
A second area of relevance for how mixing techniques have
become tools for hyperorchestration is the utilization of the multitude of
sound processing tools available. Most of these tools were developed as
a means of restoring how music was perceived before it was recorded.
For instance, equalization served to correct the deviation between the
perceived live sound of an instrument and the result produced by
microphone recording. In the practical book The Art of Mixing, David Gibson (2005)
succinctly describes the fundamental roots of sound processors and their
relationship with sound:
There are three components to sound: volume (or amplitude),
frequency, and time. That’s it. Therefore, every sound manipulator
used in the studio can be categorized as to whether it controls
volume, frequency, or time. (p. 75)
Following this, he classifies the different processors according to
six main groups from the viewpoint of how the effects interrelate with the
physical characteristics of the sound. These groups originate in the
combination of these three features: Volume, Frequency, Time, Volume
over Frequency, Frequency over Time, and Volume over Time. As my
approach is mainly aesthetic, it will differ from Gibson’s sound-property
based classification. Instead, I will describe the processing tools in four
main areas: Equalization or Frequency Control, Dynamic Control, Sound
Alteration and Virtual Space Design. I will treat the last area separately, as
it involves assorted concepts beyond the utilization of sound processors.
Equalization
The term equalization (EQ) is linked to the original purpose of these
processors when they were first designed. They were initially meant to
restore the original sound of the instrument that was transformed due to
the recording process. As microphones do not capture sound evenly
across frequency ranges, equalizers were meant to serve as a restoration
tool following the recording process. Gibson (2005) provides a broad
definition of equalizers and the complexity attached to their utilization
during the mixing process:
EQ is a change in the volume of a particular frequency of a sound,
similar to the bass and treble tone controls on a stereo. It is one of
the least understood aspects of recording and mixing probably
because there is such a large number of frequencies—from 20 to
20,000Hz. The real difficulty comes from the fact that boosting or
cutting the volume of any one of these frequencies depends on the
structure of the sound itself: Each one is different. (p. 89)
Broadly understood, an equalizer is able to alter the volume of the
restricted frequency range of the incoming sound signal (it pertains to the
volume over frequency category mentioned before). Generally, equalizers
are able to individually alter several frequency areas within the same
processor.127 Gibson (2005) also describes how the process of
equalization has evolved in tandem with what is considered a natural
sound:
In the beginning, the basic goal of using EQ was to make the
sound natural—just like it sounded in the room where the
instrument was. You can’t get any more natural than that, right?
The only problem is that natural ain’t natural any more. These days
natural is defined by what is currently on CDs and the radio. We
have become addicted to crisper, brighter, and cleaner, as well as
fatter, fuller, and bigger. Therefore, to make a sound natural can be
boring and unnaturally dull by today’s standards. What we hear on
the radio and on CDs these days is much brighter, crisper, and
bassier than the real thing. If it isn’t bright enough and doesn’t
have enough low-end, it won’t be considered right. (p. 195)
Still, Gibson’s approach to equalization is traditional, as it is mainly
focused on generating the accepted cultural standard for a natural sound.
Gibson’s definition reveals that the concept of natural sound is hyperreal,
as it is symbolically based on the cultural conventions of a given time and
in a given society. However, Gibson’s view is still contained in a rather
restricted aesthetic approach. As he states, “if you don’t EQ the
instruments based on these traditions, it is either considered to be wrong
or exceedingly creative” (Gibson, 2005, p. 194). This is, partially, because
his main focus is the production of popular sound recordings, with an
aesthetically restricted palette of sounds that conforms to their musical
127
See Gibson (2005, pp. 89-120) for a more detailed description of
equalizations and equalizers.
style. In another practical book on mixing, The Mixing Engineer’s
Handbook, Owsinski (2013) provides an aesthetically expanded
approach, when compared to Gibson’s, to equalization:
There are three primary goals when equalizing:
1. To make an instrument sound clearer and more defined
2. To make the instrument or mix bigger and larger than life
3. To make all the elements of a mix fit together better by
juggling frequencies so that each instrument has its own
predominant frequency range (p. 25)
Although Owsinski’s views are similarly focused on song
production, he offers some aesthetic insights that can be generalized to
apply to a broader range of hyperorchestration tools. Hence, Owsinski’s
triad of primary goals serves as a template for the main categories in
which equalization might be used hyperorchestrationally.
First, equalization might be used in order to modify the sound
properties of a hyperinstrument or a sound combination in order to alter it
within the confines of what is considered its natural sound. In traditional
orchestration, the possibilities to modify the sound of an instrument are
very limited. For instance, instead of a violin section, a solo violin might
be asked to perform a passage. Similarly, the section could play using a
mute or sul tasto. These orchestrational decisions will alter the string
sound to a certain degree: a solo violin will sound clearer and more intense
than a full section, at the expense of becoming the sound of just one
violin. In addition to these resources, hyperorchestration allows the mixer
to fine-tune the sounds by equalizing some of their frequencies, thus
producing, for example, either a clearer or more diffused sound. Diverse
mixing techniques could be applied to produce those effects. For
instance, Owsinski (2013) suggests that “more often than not, the lack of
definition of an instrument is because of too much lower midrange in
approximately the 400Hz to 800Hz area. This area adds a ‘boxy’ quality
to the sound” (p. 28). Following Owsinski’s recommendation, in order to
create a clearer instance of the instrument, the equalizer would lower the
volume within the 400 to 800Hz frequency range.
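As a hedged illustration of such a cut (using the widely circulated Audio EQ Cookbook peaking-biquad formulas; the 600 Hz center frequency, Q, and test signal are arbitrary choices, not Owsinski's exact settings), a 6 dB lower-midrange attenuation might be sketched as:

```python
import numpy as np
from scipy.signal import lfilter, freqz

def peaking_eq(fs, f0, gain_db, q=1.0):
    """Peaking-EQ biquad coefficients (Audio EQ Cookbook formulas).
    A negative gain_db cuts the band around f0; a positive one boosts it."""
    a_lin = 10 ** (gain_db / 40)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 + alpha * a_lin, -2 * np.cos(w0), 1 - alpha * a_lin])
    a = np.array([1 + alpha / a_lin, -2 * np.cos(w0), 1 - alpha / a_lin])
    return b / a[0], a / a[0]

fs = 48000
t = np.arange(fs) / fs
# A "boxy" 600 Hz component mixed with a brighter 3 kHz component.
x = np.sin(2 * np.pi * 600 * t) + np.sin(2 * np.pi * 3000 * t)

# Cut 6 dB around 600 Hz, in the spirit of the lower-midrange advice above.
b, a = peaking_eq(fs, 600, gain_db=-6.0)
y = lfilter(b, a, x)

# Response: about -6 dB at the center frequency, nearly flat at 3 kHz.
_, h = freqz(b, a, worN=[600.0, 3000.0], fs=fs)
response_db = 20 * np.log10(np.abs(h))
```

The boxy band is attenuated while content outside it is left essentially untouched, which is exactly the selective-volume behavior the definitions above describe.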
The second area in which equalization could be employed as a
hyperorchestration mechanism is as a means to stretch the verisimilitude
of an instrument, thus expanding its sound possibilities, which would
generally alter its meaning. In traditional orchestration, this approach can
be achieved by the accurate blending of different instruments, thus
generating a new sound result caused by the combination. In order to
increase the power of a string sound, a wide chord played by the
trombones could be added beneath the string section. Similarly, Ravel’s
Bolero is a well-known and eloquent example of how to generate new
instrumental sounds by combining the existing sounds within the
orchestra.
By employing equalization as a tool for transforming the sound of
the hyperinstrument to either push or surpass its verisimilitude as an
actual representation of a physical instrument, it is possible to further
tailor the sound of the instrument to satisfy particular expressive needs.
For instance, a region of the high frequency area could be filtered out (the
signal is completely muted in that area) in order to produce a sound that
is mellow and partially obscured. A sound transformation that only utilizes
equalization, but quite intensely, might completely transform the original
sound into something that barely resembles it.
Finally, equalization is useful in order to assign certain frequency
areas to each instrument, to enhance and better shape the final sound.
This is the approach that most resembles an extended mode of
orchestration. For each instrument, the hyperorchestrator should decide
which frequency ranges the instrument is allotted. When just one, or a
few, instruments occupy a frequency range, they will become clearer
and more present. In this process, it is essential to consider how the overtones
interact and their role in shaping the sound of the instrument. If some of
the frequencies of an instrument are cut and, therefore, some of its
overtones are missing, the timbre of the instrument might significantly
change. At the same time, sounds in the same frequency range might
mask each other, thus diluting their contribution to the creation of the
timbral content of the instrument.
In acoustic orchestration, the composer might decide not to use
brass in a passage where the flute is playing a main line, as a means to
prevent the overtones of the brass from masking the flute sound. With
hyperorchestration, it would be possible to filter the frequencies of the
brass section that collided with the flute line in order to preserve both
instruments, at the cost of modifying the timbre of the brass. However,
the timbre of the brass would have been altered even without EQ in its
interaction with the flute, thus making the process of frequency filtering
less streamlined, by offering multiple sounding solutions.
Dynamic Control
The process of equalization is a means to control, over time, the
amplitude in diverse frequency ranges. Processors dedicated to
dynamic control, by contrast, focus on reducing or expanding the
dynamic range. In other words, they regulate the amount of variation, in
terms of volume, between the quietest and the loudest moments of a
soundtrack.128 Figure 45 shows the results of utilizing a compressor (using
a waveform representation), which is the most common dynamic control
processor, in the recording of a single timpani hit.129
Figure 45. Waveform visualization of the effects of compression on a
timpani hit.
The compressor is able to reduce the dynamic range by
decreasing the volume of only the loudest sounds: anything that exceeds
an established threshold is attenuated.130 Then, the compressor can increase
the gain (volume) of the sound in proportion to the amount of gain
128
Multiband compressors are processors that control the dynamic
range within individual frequency ranges, thus falling between both groups
of processors. For the sake of clarity, I will not refer to them.
129
The sample recording is extracted from Spitfire’s Hans Zimmer
Percussion (HZ01) (Spitfire Audio, 2013). To be precise, it is the lowest
dynamic for a low C hit (C1) utilizing the mid tree microphone position mix
by Geoff Foster. Foster’s mixes are barely processed, as his approach
aims for a naturalistic sound.
130
The threshold and the proportion of volume reduction are the main
parameters present on compressors, which are modifiable by their users.
reduction, keeping the sound within the limits of the maximum amplitude
level (Figure 46).
Figure 46. Sketch of the effects of a compressor.
The result is a sound that has a decreased amount of dynamic
variation. In a percussion sound, such as the timpani hit, compression
allows the hit to seem to last longer, as it delays the decay stage. A
sound that is more constant in terms of its dynamic becomes
perceptually more present. In traditional orchestration, dynamic control is
only achieved by the utilization of dynamics in the score. For percussion
instruments like the timpani, this is not directly possible. This is one of the
reasons that composers developed and utilized timpani rolls (continually
hitting the timpani with two mallets) as a means to generate a constant
dynamic at the expense of greatly altering the sound. Another effect
produced by compression is generating a thicker sound, due to its
sustained constant dynamic. In traditional orchestration, this was
somewhat achieved by adding instruments to a particular sound. For
instance, if composers desired a denser French horn sound, they would
add the whole French horn section, playing in unison. The addition of
individual sounds, which are slightly detuned amongst themselves,
generates an ensemble sound that is, in sum, more constant both
dynamically and timbrically.131 That is why the string section became the
center of the orchestra, for its ability to sustain a constant flux of sound.
The utilization of dynamic control processors allows for the
extension of those orchestration principles, thus generating compelling
new sounds that can emanate from the individuality of a single instrument
sound alone. By compressing its sound, a single instrument might
become thicker than the whole ensemble.
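A minimal sketch of this behavior, assuming a bare static gain computer rather than a production compressor (no attack or release smoothing; the threshold and ratio values are arbitrary), can illustrate how compression lifts the tail of a decaying hit:

```python
import numpy as np

def compress(x, threshold_db=-20.0, ratio=4.0):
    """Static compressor sketch (no attack/release envelopes).
    Levels above the threshold are reduced by the given ratio, and
    make-up gain then restores the original peak level."""
    eps = 1e-12
    level_db = 20 * np.log10(np.abs(x) + eps)
    over_db = np.maximum(level_db - threshold_db, 0.0)
    gain_db = -over_db * (1.0 - 1.0 / ratio)   # attenuate only above threshold
    makeup_db = -np.min(gain_db)               # lift everything back up
    return x * 10 ** ((gain_db + makeup_db) / 20)

# A decaying envelope standing in for a single timpani hit: after
# compression, the quiet tail is raised, so the hit seems to last longer.
t = np.linspace(0.0, 1.0, 48000)
hit = np.exp(-6.0 * t)
squashed = compress(hit)
```

After make-up gain the peak is unchanged while the quiet decay is raised; the dynamic range shrinks, which is the "longer, thicker" effect described above.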
Sound Alteration
This last group of sound processing tools is wide in scope, yet
very specific in its objectives. The effects used by this group aim to
manipulate the sound by purposefully transforming it beyond its original
shape. Equalization and dynamic control maintain a certain correlation
131
There is a sound processor, commonly called chorus, which attempts
to replicate this effect by duplicating the signal and slightly detuning the
duplicated signals. Compression works differently, producing a more
present sound by reducing the dynamic range.
with the original sound that, when employing the processors in this
group, might be easily lost.
The clearest example is the utilization of distortion on electric
guitars. Generally, distorted electric guitars barely resemble the sound
produced by the actual instrument. In fact, a distorted electric guitar
sound has become a recognizable singular instrument, distinguished from
a guitar sound. The diverse processors involved in producing the
distorted electric guitar sound have served as the basis for developing a
full range of sound processors that can be applied to any sound. They fall
under a variety of categories according to Gibson’s (2005) distinctions:
- Frequency over Time: Vibrato generators, Flangers, Choruses
- Volume over Time: Tremolo generators
- Time: Delays
- Volume (p. 75)
A mix of these approaches, in addition to extreme equalization,
would generally create other commonly used distortion effects in the
electric guitar. Vibrato and tremolo generators are relevant examples
because they are able to produce contained and controlled sound
variation over time, making the sound of a hyperinstrument dynamic and,
thus, generating a rhythmic pulse that could potentially interact with the
rest of the audiovisual material. In addition, these effects derive from
physical instrumental models. Vibrato is a common technique for string
and some wind instruments, and it is essential for producing their most
common sound. A string vibrato is literally the slight alteration of the
frequency over time at a particular rate by moving the finger that is
pressing the string. This rate might also be considered a very low
frequency. For example, the string player might vibrate a note by slightly
moving the finger to the sides at a rate of around eight movements per
second. In a vibrato sound processor, this is modeled by using a low
frequency sinusoid wave called a Low Frequency Oscillator (LFO). LFOs
are physically equivalent to a sound wave, although they are not
perceptible by the human ear. This allows them to easily modulate the
sound by altering its frequency. The tremolo effect diverges further from
its physical counterpart. Tremolos are equivalent to vibratos applied over
the amplitude (volume) instead of the frequency. They generate volume
undulations, which might resemble the undulations produced by the bow
change on a string tremolo, by similarly using an LFO over a volume
controller. The nature of these effects highlights how they extend
traditional orchestrational principles beyond their physical
possibilities, and how they are applicable to a broad range of sounds.
The possibility of a fixed sound undulation over time also generates a
rhythmic and metric effect that would not be possible to achieve with a
long sound, thus making the utilization of note attacks non-essential for
the generation of meter and rhythmic patterns. Delay generators also
contribute to the creation of rhythmic and meter patterns by repeating a
sound at an established rate (or meter), also with an established
attenuation over time. Although they actually need a clear attack to
become noticeable, they are able to generate a complex metric pattern
with just a few single sounds.
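These LFO-based effects and the delay pattern can be sketched as follows (a minimal illustration; the carrier frequency, LFO rate, depth, and delay time are arbitrary values chosen for the example):

```python
import numpy as np

SR = 48000                       # sample rate; all values are illustrative
t = np.arange(SR) / SR           # one second

# Vibrato: a ~8 Hz LFO modulates the frequency (via the phase) of a
# 440 Hz carrier, mimicking a string player's finger movement.
lfo_rate = 8.0                   # LFO frequency in Hz
depth_hz = 6.0                   # peak frequency deviation
vibrato = np.sin(2 * np.pi * 440 * t
                 + (depth_hz / lfo_rate) * np.sin(2 * np.pi * lfo_rate * t))

# Tremolo: the same LFO shape applied to amplitude instead of frequency,
# undulating the volume between 0.5 and 1.0.
tremolo = np.sin(2 * np.pi * 440 * t) * (0.75 + 0.25 * np.sin(2 * np.pi * lfo_rate * t))

# Delay: repeating the sound at a fixed rate with attenuation turns a
# single attack into a metric pattern.
delay = SR // 4                  # 250 ms between repeats
echo = vibrato.copy()
for k in range(1, 4):
    echo[k * delay:] += vibrato[: SR - k * delay] * 0.5 ** k
```

The same LFO drives both effects; only the parameter it modulates (frequency versus amplitude) changes, which is precisely the relationship between vibrato and tremolo described above.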
Distortion-related effects highly alter the sound spectrum. In
addition, they have strong meaning attached due to their origins in
diverse styles of popular music. Moreover, the nature of a distorted
sound, which defies an aesthetic model based on sound clarity, does add
connotational meaning. Distortion might indicate an underground culture
or refer to the corrupted nature of an event or a place. However, their
meaning possibilities are evolving as they are utilized in an ever-increasing range of musical styles.
Virtual Space Design
Creating a virtual space from where the music will emanate
involves utilizing microphone perspectives, various sound processors
and, in addition, reverberation processors and spatial positioning tools
such as panning. The nature of the virtual space is, even in the most
frequent current approaches, two-dimensional. The stereo and the
surround speaker design generate an aural image that has no height. This
is because in both configurations, the speakers are placed at the same
height, generating a two-dimensional virtual plane. The main difference
between a surround design and the stereo is the possibility of placing
sounds behind or at the sides of the spectator, going beyond a model for
the virtual space that mimics the concert stage. In addition, the surround
speaker system reproduces, with increased precision, the reflections
produced by reverberations from the back of the hall. Figure 47
graphically shows the configurations of a stereo and a surround speaker
design.
In orchestral music recordings, the stereo attempts to replicate a
sound model similar to the experience of attending a concert. Surround
sound aims to reproduce a sound that is ambient, which might recreate
the sonic environment of different spaces. As I suggested before, by
employing a virtual space it is possible to go beyond the restricted
number of dimensions of the physical world.
Figure 47. Stereo and basic surround speaker configurations.
However, it is important to be aware that, with either stereo or
surround, it is not technically possible to generate a sense of aural
height; that only becomes possible with newer sound delivery designs
such as Dolby Atmos (Dolby, n.d.). Hence, the additional possible
dimensions do not enhance the representation of the physical world.
Instead, they extend its possibilities. The example described in Figure 44
generated a virtual space where a sound object is divided between
different locations, generating sounds from different positions. From the
audience point of view, the model would generate the following
instrument placement (Figure 48).
Figure 48. Representation of multidimensional spaces. Even though there
is only one piano recorded, the result becomes multidimensional due to
the mixing process, as the piano seems to emanate from different
spaces.
This particular sound distribution might generate the impression
that the piano is on the far right of a long rectangular area (Figure 49).
However, if the rest of the sounds assume a different virtual space,
similar to a regular stage, then the final result would still be
multidimensional, as it includes diverse sound spaces at the same time.
Moreover, the presence of another virtual space will dilute the
mental representation of the area from the previous figure. The design of
virtual spaces becomes a challenging process with myriad possibilities
for sound expansion.
Figure 49. Two-dimensional virtual configuration.
The traditional model of orchestral set-up, along with the principles
of orchestration, is grounded in a mostly physically stable positioning of
the ensemble. The principles that regulate the balance between
instruments, the difference in their power, or how they mask other
instruments all assume a fixed physical disposition. Thus, the
stereophonic orchestral sound is, in practice, a one-dimensional model in
terms of music composition.
Unlocking the space dimension is, from an aesthetic viewpoint,
extremely appealing. Mahler’s example outlines the immense possibilities
offered when expanding the sound dimensions, while highlighting the
associated challenges in terms of the necessity to redefine the
orchestration principles. Therefore, aside from examples similar to
Mahler’s, which are fairly exceptional in the most common symphonic
repertoire, there is no orchestrational equivalent to virtually expanding the
space. In addition to the possibilities of creating complex sounding
instruments, I will describe two main areas in which the virtual space
design becomes especially relevant in terms of the generation of
meaning: the utilization of reverberation effects, and the implications of
designing an evolved and sophisticated virtual space in terms of sound
cohesiveness.
Reverberation Processors
There are two main types of processors that generate
reverberation or, in other words, that allow placing a sound as if it were
performed inside of a space. The first set includes processors that
digitally process the sound in order to emulate reverberation. This is
mainly achieved by generating sound delays that replicate those that
would be created by a hall. The reverberation processors from the other
group are commonly known as convolution reverbs. The reverbs in this
second group are generated from a sound sample captured in a physical
space (impulse response), which captures the natural reverberation
generated inside a hall from the particular perspective of the location of
the microphone. It is similar to the HDR processes of capturing the
lighting of a hall that I described in Chapter IV (Prince, 2012, pp. 192-193).
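Both families of reverberation processors can be sketched in a few lines of Python with NumPy. This is a deliberately minimal illustration, not production DSP: the "algorithmic" version is a bare delay network, and the impulse response here is synthetic decaying noise standing in for a sample captured in a physical hall:

```python
import numpy as np

def algorithmic_reverb(dry, sr, delays_ms=(29.0, 37.0, 43.0), decay=0.4):
    """First family: emulate a hall by summing attenuated delayed copies."""
    wet = dry.copy()
    for d_ms in delays_ms:
        d = int(sr * d_ms / 1000.0)
        echo = np.zeros_like(dry)
        echo[d:] = dry[:-d] * decay
        wet += echo
    return wet

def convolution_reverb(dry, impulse_response):
    """Second family: convolve the dry sound with a hall's impulse response."""
    return np.convolve(dry, impulse_response)

sr = 44100
dry = np.zeros(sr)
dry[0] = 1.0  # a single click as the "performance"
rng = np.random.default_rng(0)
# Synthetic stand-in for a sampled impulse response: a decaying noise tail
ir = rng.normal(0.0, 0.1, sr) * np.exp(-np.linspace(0.0, 8.0, sr))
wet = convolution_reverb(dry, ir)
```

Feeding a single click through the convolution returns the impulse response itself, which is exactly why capturing one impulse in a hall is enough to "place" any later sound inside that space.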
The possibilities of convolution reverbs are somewhat parallel to
the addition of instruments from around the world, as described in the
analysis of The Lord of the Rings (2001-2003). They make it possible to
position instruments or sounds in places where they would not be
expected to be located. For instance, an Indian tabla could be virtually
placed inside of a European Gothic cathedral. Moreover, it is also
possible to place an instrumental ensemble whose size would exceed the
capacity of a particular room or hall. For example, it is possible to create
a convolution reverb that reproduces the reflections produced in a
bathroom, and then employ it for a full orchestral sound. Using
convolution reverbs has the potential to carry the meaning from their
original spaces into the hyperreal, where it will evolve and interact with
other meanings present in the diverse musical objects of the
hyperorchestra.
Cohesiveness
One of the most important aspects to consider when designing a
virtual space is to evaluate the resulting cohesiveness. While placing a
symphonic orchestra inside a bathroom could generate an interesting
sound, it will create a disjunction between the spatial representation of
the ensemble and that of the space that contains it. Similarly, the presence of multiple
two-dimensional virtual spaces interacting and generating a
multidimensional space will diminish the cohesiveness of the resulting
soundscape.
Much akin to most of the features of the aesthetics of the
hyperorchestra, a sound result that is more or less cohesive has
advantages and disadvantages, depending on the meaning and
functionality intended for the music. Generally, a non-cohesive sound
would make the music more apparent and noticeable, due to its
peculiarities. It will also tend to be associated with some degree of
non-realism, which might be related to the diegesis or to the aesthetics
of the whole film world. Therefore, from a hyperorchestral perspective, the
cohesiveness of the virtual sound stage becomes another variable to
evaluate in the process of sound and meaning creation.
Sound Processing and Aural Fidelity
Bordwell and Thompson (2012) define aural fidelity as “the extent
to which the sound is faithful to the source as we conceive it” (p. 583).
(Jeff Smith also discusses Bordwell and Thompson’s concept of aural
fidelity in relation to the creation of the diegesis; see Smith, 2009.)
The word “conceive” is key to comprehending that the concept does not
necessarily link a recorded sound source with a sound from the physical
world: “we do not know what laser guns sound like, but we accept the
whang they make in Return of the Jedi as plausible” (Bordwell &
Thompson, p. 283). Therefore, assessing the aural fidelity of a sound
involves analyzing its verisimilitude. The original purpose of most of the
techniques described in this section, dedicated to sound processing
tools, was to restore, as much as possible, the aural fidelity of the
recorded sounds. By using these tools, the recorded sound would
generate a verisimilar image of the music that would ideally become a
representation of a physical experience. From this angle, equalization
served to restore the original frequency spectrum before the recording.
Further, the utilization of a reverb would place the instruments inside a
hall. Similarly, mixing was originally conceived as the resulting process of
joining all the elements to recreate a live experience.
In the aesthetics for the hyperorchestra, these possibilities expand
beyond the pure objective of accurately representing a physical event.
Similar to the concept of cohesiveness, aural fidelity becomes another
dimension from which to generate meaning, instead of being the goal of
the whole process. Assessing the degree of fidelity of an instrument or a
sound becomes an aesthetic decision linked to the functions that the
sound is expected to fulfill, thus related to its intended meaning and
sound beauty.
Creation of Hyperinstruments
In this section, I will explore the aesthetic grounds for the creation
of hyperinstruments by utilizing some of the mixing techniques previously
analyzed, in addition to extrapolating the findings described in the screen
music analysis chapter. When describing the music for The Man of Steel
(2013), I stressed the importance of the creation of hyperinstruments
based on new ensemble sounds, in order to generate an appropriate
soundscape for a movie that revisited the figure of Superman from a
different angle. Thus, the aesthetic intent of creating these new
ensemble-based hyperinstruments obeyed the necessity of generating
meaning, which would connect with humanity in relationship to
Superman. Generating a hyperinstrument regularly begins with
conceptualizing the intended signification of the instrument, along with
defining the contributions of the instrument to the overall meaning of the
score.
Once a general objective has been established, the creation of a
hyperinstrument begins by choosing an instrumental instance (or a set
of instances). Selecting an instrument becomes a very specific
endeavor. It implies picking the particular sound that is adequate to
fulfill the purpose of the hyperinstrument. It might involve choosing a
precise virtual instrument
from a sample library collection, or deciding to record a specific musician
playing an instrument. Moreover, picking a sound source involves
deciding on a particular articulation or a performance practice.
During the process of choosing sound sources, the composer
could decide to amplify the instrument by merging diverse sources. In the
analysis of the music for Interstellar (2014), I discussed how, in the track
entitled S.T.A.Y. (Zimmer, 2014), the pipe organ merged with a
synthesizer sound, which created a hyper-organ that interacted with the
multidimensional space portrayed in the movie. Adding a synthesized
sound detached the pipe organ from its physical roots inside of a
cathedral, thereby expanding its possibilities beyond the
three-dimensional world where humanity regularly interacts. In tandem
with the
process of choosing an instrument, the composer could select a
particular sound perspective that might be the result of merging diverse
sound sources. In addition, other sounds, such as the synthesizer in
S.T.A.Y., could be added as a means to complement the shape of the
sound. Likewise, instrument merging might involve the utilization of two
or more sampled virtual instruments (or recordings). For instance, a
clarinet sound might be reshaped by adding a flute sound at a very soft
volume, which would slightly modify the tone of the clarinet.
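The clarinet-plus-flute merge described above is, at base, gain-weighted layering of time-aligned signals. Below is a minimal Python/NumPy sketch; the sine tones stand in for recorded instruments, and the -18 dB level for the shaping sound is an arbitrary illustrative choice, not a value prescribed by any practice cited here:

```python
import numpy as np

def merge_instruments(primary, secondary, secondary_gain_db=-18.0):
    """Layer a secondary sound under a primary one at a soft level."""
    gain = 10.0 ** (secondary_gain_db / 20.0)  # dB -> linear amplitude
    n = min(len(primary), len(secondary))
    return primary[:n] + gain * secondary[:n]

sr = 44100
t = np.linspace(0, 1.0, sr, endpoint=False)
clarinet = 0.5 * np.sin(2 * np.pi * 440 * t)   # stand-in for a clarinet note
flute = 0.5 * np.sin(2 * np.pi * 880 * t)      # stand-in flute, an octave up
hyper = merge_instruments(clarinet, flute)     # flute sits ~18 dB down
```

At -18 dB the secondary signal contributes only a small fraction of the primary's amplitude, which is the sense in which the added flute "slightly" reshapes the clarinet tone rather than sounding as a second voice.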
The creation of a hyperinstrument also involves deciding whether
to employ any of the sound processors previously outlined. In the case of
instruments that are the product of merging, this procedure also involves
deciding to apply these processors to the individual sounds, to the
resulting sound after the merge, or to both. A flute sound merged with a
clarinet might have a delay effect and a longer reverberation time, thus
generating an atmospheric sound surrounding the main clarinet sound.
The resulting hyperinstrument will have a particular sound and a
set of possibilities for generating meaning associated with it. The origin
of the sounds is key, as they greatly contribute to the hyperinstrument’s
expressive power. The cultural origin of the instrumental source or the
origin of the sound that shaped the instrument is the basis for the
transmission of extramusical content. Furthermore, the utilization of
specific processing techniques might attach additional connotations. For
instance, distortion might reinforce a menacing or disruptive
expressiveness. Similarly, a long reverb might be associated with
transcendence.
The conjunction of all these principles shapes a hyperinstrument,
which becomes a tailored and specific sound generator with a singular
sound, and with a set of connotations attached. The combination of a
focused path for meaning creation and a distinctive, original sonic
color becomes a powerful creative element for making music in
hyperreality.
Hyperinstrumental Orchestration
In this final section, I will explore the aesthetic principles that guide
the combination of hyperinstruments beyond the utilization of mixing
techniques. The following principles expand the traditional orchestration
techniques into the possibilities offered by the hyperorchestra, while
preserving their grounds. I will describe three main approaches that
revolve around the main traditional orchestrational practices. First, I will
explore how to expand the concept of instrumental sections. Second, I
will describe the incorporation of new instruments from diverse origins.
Last, I will discuss the process of combining extended sections and
instruments.
Augmenting and Expanding the Orchestral Sections
In the analysis of the music in Inception (2010), I argued that the
score was created by generating a massive brass section, which would
have to be matched with an even larger string section if it were performed
live. The colossal brass sound in Inception has already become iconic for
describing a process of expanding the original sections from the
traditional orchestra. Taking a different approach, the score in The Man of
Steel (2013) served as an example of the utilization of new orchestral
sections generated from creating ensembles from single instrument
instances. The hyperdrum ensemble became a section within the
percussion section of the hyperorchestral ensemble for the score. These
examples provide two different approaches that extend the concept of
orchestral sections.
One of the first implications revolves around an increased
independence and individuality for these new sections. They integrate
with the rest of the sounds of the hyperorchestra, yet they preserve their
own entity and a distinctive presence. The massive brass section in
Inception alone already defies the possibility of ever being physically
placed in an actual hall; for this reason, the brass section
becomes distinct. This is also a consequence of an approach to the
ensemble that abolishes the fixed structure of traditional orchestral
instrumental placement. The orchestral placement was designed to
create an already “mixed” and balanced disposition. For instance, the
brass section was traditionally placed sonically in the background,
supporting the strings. (In the traditional orchestra, the sections are
the strings, woodwinds, brass, and percussion. In addition, harp and
keyboard instruments such as the piano or the celeste are generally
considered isolated instruments, although they lean toward the
percussion section due to their sound properties.) Once mixing became
an orchestration tool, the fixed instrumental mix provided by the
traditional orchestra was no longer necessary. Therefore, the sections
in the hyperorchestra are autonomous
sound entities, and the composer has the freedom to make them evolve
and generate new sonic environments. The definition of a
section within an orchestra dissolves, which allows for a more flexible
approach to their construction.
It is in this context that the possibility to augment, extend and
create new sections appears. The process of expansion is grounded in
the principal idea that defines a section: a structured sound that
produces a cohesive output through the conjunction of diverse sound
sources. The hyperdrum ensemble generated its ensemble sound by
joining a set of recordings utilizing close microphones, in addition to
general room microphones. The sound retained the intensity and clarity of
an individual drum at the same time that it added the power of the
ensemble. Similarly, the string sections in the same score from The Man
of Steel were mixed by emphasizing the closely recorded sound of the
section, thus generating a sound of the string ensemble that also differed
from the string ensemble sound within an orchestra: it was a blend of
sectional power and soloistic intensity.
The individual treatment of each section allows for its expansion, in
terms of the number of performers, without compromising the clarity of
the output sound due to the dimensions of the hall. In live performances,
there is a necessary correlation between the number of performers and
the minimum reverberation, which is caused by the hall’s dimensions. It is
possible to place a single player in a small room, but this becomes
impossible with larger ensembles. In the hypothetical case of a
2000-piece orchestra, the hall that would be required to host the ensemble
would be enormous. Even when the sound of the section is produced
exclusively by live players, recording one section alone allows for the
possibility to increase the number of performers without the need to
record in a bigger place, thus maintaining a controlled amount of
reverberation. Moreover, recording techniques, such as using
microphones close to the instruments, allow for the disregard of most of
the reverberation sound. If the sound of the section is produced either
partially or in its totality by using sample libraries, the possibilities to
expand the dimensions of the ensemble escalate. Orchestral sample
libraries such as EastWest’s Hollywood Orchestra (EastWest Sounds,
2014), which I described in Chapter VII, regularly incorporate ensemble
instruments for the brass section. For instance, there is a full virtual
instrument (it includes a wide range of different articulations) that is
generated from the recording of six French horns in unison. Hence, a
four-note chord performed by this virtual ensemble is theoretically
created from a virtually formed group of 24 French horn players.
Consequently, the associated properties attributed to an
instrumental section are deconstructed when they become part of the
hyperorchestra. The correlation between ensemble size, reverberation,
and individualistic sound breaks down, allowing for a much more flexible
approach to shaping the final sound. A very large brass ensemble
could be contained in a virtual space that generates a reverberation
that would generally be attributed to an ensemble of a much smaller size.
Similarly, this ensemble might still preserve the intensity of a single
instrument recorded closely. Moreover, the virtualization of the
ensembles, produced either by non-linear recording sessions or by the
utilization of sample libraries, allows an ensemble to multiply the
articulations it plays simultaneously. A string
section might be sustaining a chord at the same time that the same string
section is playing a pizzicato musical passage. In this situation, the string
section becomes multidimensional, being able to produce multiple sound
articulations as a whole, all at once.
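This multidimensional behavior can be mocked up by rendering the two articulations as separate tracks for the same virtual section and summing them. In the Python/NumPy sketch below, sine-based placeholders stand in for sampled sustain and pizzicato patches; the pitches and tempo are arbitrary illustrative choices:

```python
import numpy as np

sr = 44100
t = np.linspace(0, 2.0, 2 * sr, endpoint=False)

# Articulation 1 -- sustained chord: steady sines stand in for legato strings
sustain = sum(0.2 * np.sin(2 * np.pi * f * t) for f in (220.0, 277.2, 329.6))

# Articulation 2 -- pizzicato: short decaying plucks on each beat at 120 bpm
pizz = np.zeros_like(t)
for beat in range(4):
    start = int(beat * 0.5 * sr)                       # one pluck per beat
    env = np.exp(-np.linspace(0.0, 10.0, sr // 8))     # fast decay envelope
    seg = 0.3 * np.sin(2 * np.pi * 440.0 * t[: sr // 8]) * env
    pizz[start:start + len(seg)] += seg

# One string section, two simultaneous articulations
section = sustain + pizz
```

Because the two tracks never share a physical group of players, nothing prevents the "same" section from sounding both at once; the sum is simply their superposition.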
The examples from The Man of Steel revealed the possibility of
creating new sections that did not exist in the traditional orchestral
ensemble. Aesthetically, they derive from the deconstruction of the
concept of an instrumental section. These new ensembles utilize some of
the properties associated with an instrumental section, but by
implementing them in new sound paradigms. The hyperdrums would not
be possible if all the music for the score had been recorded at once. Moreover,
recording them using only sectional microphones would not have
generated the particular sound intensity and power that the hyperdrums
in Man of Steel aimed for. The aesthetic revolution of instrumental section
generation has just recently been picked up by sample library makers,
who have begun to release a new group of virtual instruments that are
focused on the creation of these new ensembles that, although they are
rooted in the orchestral instruments, expand them far beyond their natural
place in the orchestra. 8Dio has just released Orchestral Grand
Ensembles (8Dio, 2015), which incorporates a separate microphone
position that captures each of the instruments of the ensemble in a close
perspective, thus mimicking the model designed by Zimmer and his
collaborators for the hyperdrums. In this new library, there are virtual
instruments for a four-piece piano ensemble, a seven-piece guitar
ensemble and a five-piece harp ensemble, among others.
Expanding the orchestral sections also comes from the
performance side. With the utilization of computer-aided technology, it is
possible to generate impossible crescendos, such as those described in
the analysis of Inception’s (2010) score. This is a good example to
describe how the performance element of the ensemble sound has also
been deconstructed to allow for humanly impossible gestures, generated
by the modification of the recorded sounds or by the utilization of
carefully programmed sample libraries. The possibility of total precision in
the case of rhythmically challenging short note passages, such as fast
string pizzicato passages, is another example of the possibilities in terms
of a computer-assisted (or computer-amplified) performance.
Incorporating New Instruments
The analysis of the music for The Lord of the Rings (2001-2003)
movies presented an extended ensemble produced by the addition of an
extensive collection of instruments from diverse cultural origins. Similarly,
the hyperdrum ensemble’s sound source is the drum kit, which originated
in the 20th century’s popular music scene. Moreover, incorporating
synthesizers or hyperinstruments adds sounds to the orchestra that are
likewise new.
In addition to the expansion in terms of meaning, adding these
new instruments accomplishes various orchestrational purposes.
Frequently, new instruments are treated as soloist instruments. They
occupy a central space in the soundscape by being recorded and mixed
as soloists. Moreover, the instrument does not need to be in the same
room or even be recorded at the same time, given a non-linear
process of music recording. The advantages of these modes
of production allow for the utilization of instruments as soloists that would
not have been properly balanced if they were performed live with an
orchestra.
The physical and cultural nature of those instruments generates
interactions within existing sections, at the same time that new
connections are being produced. A Japanese shakuhachi will blend with
other woodwind instruments (especially the flutes), as they all
share a similar timbre. At the same time, all the Japanese instruments
present in the score might form a virtual Japanese section within the
hyperorchestra, which will become a space of interaction by cultural
similitude. Therefore, the shakuhachi might blend with the woodwind
section, amplifying its range of sonorities, yet become integrated within
the ensemble, at the same time that it might become a separate
Japanese instance during other moments of the music. Synthesizers
integrate in a similar manner. They can become distinctive parts of the
ensemble, or even become the whole ensemble, at the same time that
they can integrate and blend with an instrumental section. The examples
in Inception are eloquent: airy-sounding synthesizers that become the tail
of a soft and high string sound produced with harmonics, strong and
timbrically dense low sounds that would complement the powerful brass
chords, or heartbeat-like pulsations that become isolated and prominent
sounds that signify anxiety. Thus, the incorporation of new instruments
adds, in addition to meaning, new sound dimensions that might serve as
a complement and as an amplification of existing sections, or as
generators of soloistic sounds that might become prominent in a similar
manner to the role of the soloist in any orchestral concerto.
Hyperorchestral Combination
This last section relates to the process of combining different
sections and instruments in terms of balance and sound. Balancing and
combining instruments and sections is one of the key focuses of
traditional orchestration. Adler’s proposal to divide musical elements into
three main categories is a well-accepted paradigm for organizing the
orchestration:
1. Foreground: the most important voice, usually the melody,
which the composer wants to be heard most prominently;
2. Middleground: countermelodies or important contrapuntal
material;
3. Background: accompaniment, either chordal or using
polyphonic or melodic figures. (Adler, 2002, p. 118).
Once it has been decided which musical items pertain to each
category, the role of the orchestrator is to create the balances and
combinations that best achieve the desired results, making the
foreground voice prominent, the middleground ideas distinctly heard but
in a secondary position, and the background ideas not prominent but
filling the soundscape. With the hyperorchestra, the process of combining
instruments and sections becomes much more flexible, akin to the rest of
the orchestrational elements discussed above. Volume mixing becomes
key in hyperorchestration in order to expand the possibilities of the
traditional orchestra. For instance, in order to balance the sound of a
brass section playing fortissimo with a string section playing in a
moderately soft dynamic, it is necessary to apply volume adjustments
between the sections. This process might create a collateral effect in
terms of the ensemble: the string section might seem bigger as it will be
mixed louder in comparison to the brass section, if the two need to
sound balanced. Thus, the process of instrumental combination interacts with
the reshaping and amplification of instrumental sections as previously
described.
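The balancing described here can be approximated as RMS level matching between stems. In the Python/NumPy sketch below, noise bursts stand in for the recorded brass and string stems, and the helper names are mine, not drawn from any mixing tool cited in this study:

```python
import numpy as np

def rms(x):
    """Root-mean-square level of a signal."""
    return float(np.sqrt(np.mean(x ** 2)))

def match_levels(stem, reference, target_ratio=1.0):
    """Scale `stem` so its RMS sits at target_ratio times the reference RMS."""
    gain = target_ratio * rms(reference) / rms(stem)
    return stem * gain, gain

rng = np.random.default_rng(1)
brass_ff = 0.8 * rng.normal(0.0, 1.0, 44100)   # loud (fortissimo) stem
strings_p = 0.1 * rng.normal(0.0, 1.0, 44100)  # soft (piano) stem
strings_balanced, gain = match_levels(strings_p, brass_ff)
```

The large gain needed to lift the soft strings to the brass level is precisely the "collateral effect" noted above: the louder mix can make the string section read as bigger than it was.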
In a similar manner, the sonic placement of the instruments and
sections will have an effect on their combination and their unity as a
section. For instance, in the case of a string section that plays, at the
same time, a sustained chord and a full pizzicato accompaniment, both
sounds will probably emanate from the same virtual location that
corresponds to the string section placement. It is in this situation that a
multidimensional section will be generated. If, on the contrary, the
sustained chord and the pizzicato passage were assigned to distinct
spaces in the soundscape, it would appear as if there were two different
string sections inside the ensemble.
In parallel, the success of the introduction of a new instrument,
such as the shakuhachi, depends on how its role is negotiated in terms of
its combination with the other instruments. If the new instrument needs to
have a soloistic role, the process of instrumental combination must take
this fact into consideration. Otherwise, not only might the singular sound
of the instrument be lost, but the associated meaning might also not
permeate properly.
As a final matter, when applying hyperinstrumental orchestration
techniques, composers must assess how their utilization affects the
perception of verisimilitude in the music they make. This is comparable to
the evaluation of aural fidelity when employing mixing tools for
hyperorchestration. Similarly, verisimilitude refers to the appearance of
reality instead of an attempt to portray realism. Thus, the evaluation of
how the utilization of hyperorchestral techniques reshapes the perception
of verisimilitude must focus solely on the resulting sound as experienced by
the audience, disregarding the actual technical details of its production.
Moreover, the process of evaluation needs to take into account the
desired level of verisimilitude to fulfill the needs of the movie in terms of
the added meaning. This assessment will allow for the permanent
cognizance of the implications of particular hyperorchestrational
procedures. Hyperorchestrating is a means of sound expansion and
development, which, like any similar process, involves balancing what is
gained against what is lost.
X CONCLUSIONS
In defining the hyperorchestra, I have provided a model that
describes a new approach to music creation that has emerged from the
process of virtualization and digitalization of conventional musical
practices, concomitant with similar changes to how audiences experience
music. In describing the hyperorchestra, I investigated the practical and
aesthetic consequences of its utilization in audiovisual media. I began this
dissertation by describing the concept of hyperreality with two primary
goals: to provide theoretical grounds for the new ensemble and to
integrate it within a wider social scope. The conclusions are divided
into five brief sections that encompass the diverse findings of this
study.
The Hyperorchestra and Hyperreality
A pivotal element of the philosophical investigation involved
scrutinizing the ontology of the hyperorchestra in terms of its distinctive
qualities as compared to traditional musical ensembles. McLuhan’s
(1964/1994) theory of media served as a tool in order to address the
ontological definition of the hyperorchestra. Employing McLuhan’s
definition of media, which focuses on how new devices serve to amplify
the human body and its interaction with human senses, allowed for the
creation of a streamlined context to define how humans interact with
music and sound. In addition, it also presented a challenge. Human
senses act as mediators of the musical experience. Similarly, a musical
instrument adds an additional layer of mediation. Establishing how the
hyperorchestra differs ontologically from these processes required a
detailed analysis. For instance, a virtual instrument that employs a sample
library could be interpreted as another layer of mediation, akin to a
physical instrument.
McLuhan (1964/1994) described the revolutionary aspects of the
electrical era, especially in contraposition to the previous stages of
human development, as he believed that they produced an implosion
instead of an expansion. This is because the technology that was made
possible by electricity allowed humanity to be virtually interconnected
(without the need for a physical connection), at the same time that it
generated a set of new media that were predominantly audiovisual.
McLuhan’s ideas served to establish two main premises for defining the
hyperorchestra as a new medium: the capability of recording music and
sound, and defining the Western musical score as a medium similar to
the phonetic alphabet. Recording music allows for a nonlinear approach
to music creation. It breaks the temporal boundary that is naturally
associated with a musical performance, allowing the chronological nature
of the timeline to become variable. Similarly, recording music severs the
connection between musical performance and a three-dimensional space
in which the sound unfolds. Microphone placement, mixing and
processing allow for the creation of custom made spaces that might not
even depict any analogous or specific physical space. The discussion on
how recordings of piano concertos tend to feature the piano sound,
producing a final result wherein the piano sounds louder and much more
distinct than it possibly could in a live performance, served as an
example.
Regarding the Western musical score, an open analysis of its
properties revealed its close relationship to phonetic language, in terms
of McLuhan’s theory. For McLuhan (1964/1994), with the introduction of
phonetic language, humans were given “an eye for an ear” (p. 84) as it
diminished “the role of the other senses of sound and touch and taste in
any literate culture” (p. 84). Similarly, the musical score arose as a cultural
product that neglected significant elements of sound, in order to provide
a streamlined framework for a reductionist approach to music production.
Hence, in exchange for rationalizing and restricting the process of music
creation, music was readily shared and performed, in a manner similar to
how the Industrial Revolution transformed the social structure of Western
civilization. Utilizing a musical score favored a compositional process that
largely highlighted pitch and a fixed temporal grid for depicting rhythm
and meter.
Although the processes of recording and producing open the door
for music into hyperreality, it is still possible to produce a sound recording
that aims for an aural result that attempts to mirror a live performance.
Similarly, a spectral approach to composing music allows for the
envisioning of the musical experience in terms of a soundscape, while still
continuing to utilize the traditional notated score. This is why the
description of the hyperorchestra requires an aesthetic inquiry, even
when only attempting to provide a definition. Music, as a product of
human culture, can only be defined when its aesthetics are taken into
consideration.
Utilizing a combination of 1) philosophical inquiry focused on the
ontology and sound properties of music, and 2) aesthetic inquiry that
analyzes how music functions as a symbolic product of a culture, it is
possible to define and describe the processes surrounding the
hyperorchestra. Moreover, this dual approach emphasizes the generation
of meaning as
one of the key features of this new ensemble. The hyperorchestra serves
as both a new way to produce music that incorporates a wide range of
sounds, and as a musical conception that introduces the generation of
meaning as one of its primary objectives. Within a hyperorchestral
framework, the music creators become cognizant of the symbolic nature
of music within a given culture, unfolding its power as a generator of
meaning and emotion. In other words, when writing for the
hyperorchestra, the composer is aware of, and utilizes, the cultural
framework in order to shape the intended meaning of the music. This
expansion in meaning and of the soundscape triggers a musical
experience that can interact with and combine diverse cultural traditions.
Similarly, it can produce results that transcend what would be achievable
by physical means alone. Therefore, the hyperorchestra positions the
generation of meaning as central throughout the creative and
compositional process rather than as a byproduct derived at the end,
and it provides a wide array of tools for building that meaning.
Interaction with the Moviemaking Process
In this thesis, I stated that a hyperorchestral process of music
creation is similar to the process of digital movie production. Moreover, I
defined diverse areas in which employing a hyperorchestral framework
actually benefits the integration of the score within the various levels of
meaning in a movie. A central area of interaction lies in how the movie
relates to realism, and how it becomes more or less verisimilar. I
proposed a model for the film world in which the diegesis becomes,
following Souriau’s (1951) original concept, an imaginary entity generated
by the audiences from the combination of the material from the movie
and the audience’s experiences of their world. Thus, I concluded that the
diegetic world was created as close as possible to the perceived world of
the audience, which afforded the movie creators the freedom to assume
that audiences would infer a great deal of information in the process of
‘world building’. Although music does not literally generate the world, the
perception of its contribution toward verisimilitude influences the
resultant diegesis.
This is the rationale by which I emphasized the capability of the
hyperorchestra to generate music that sounded verisimilar although it
could not be produced by physical means alone. Prince’s (2012)
definition of perceptual realism became useful in order to establish a
parallel with the visual entities within the movie. Thus, assessing the
resultant sound of the score in terms of its verisimilitude becomes
essential in order to properly shape the generated meaning of the music
and integrate it with the rest of the elements of the movie. The
model for the creation of the diegetic world was inclusive in terms of
describing how music could aid in shaping the diegesis. From this
perspective, the possibilities offered by the hyperorchestra, in terms of a
detailed approach to the generation of meaning, empower the screen
music with a far greater capacity for constructing the diegetic world. As
the music emanates from a combination of diverse sounds that embody
varied significations, it reinforces its contribution in suggesting to the
audiences how to generate the diegetic world. The analyses of the scores
for Man of Steel (2013) and The Lord of the Rings trilogy (2001-2003)
are useful examples to demonstrate how their music contains a level of
meaning that can provide rich content to the world of the movie.
Prince’s (2012) and Auslander’s (2008) discussions of the process
and implications of utilizing CGI provided context to link it with the music
production process within the hyperorchestra. Although this similar
approach to production does not necessarily imply a better integration of
the music within the movie, it provides a connection that might trigger
creativity in terms of interaction. This has already been the case with the
addition of the musical mock-up over the last two decades. With the
mock-up, the process of movie scoring started to become nonlinear, thus
adapting to the film editing process, which is inherently nonlinear.
Christopher Nolan’s utilization of the music for Interstellar (2014) as a
source for building the script (simultaneously incorporating the church
organ for its philosophical implications), exemplifies the possibilities that
this new framework affords in terms of interconnected creativity. In this
case, music is not only blended with the rest of the soundtrack and the
audiovisual track, but it also serves to generate the original script.
Sound Sculpting: Integrating with the Rest of the Soundtrack
With the hyperorchestra, sound sculpting becomes one of the
pivotal roles for music composition, i.e. composers focus on the sound
spectrum, shaping how the hyperinstruments interact with the diverse
frequency regions. This aesthetic approach conceives of music well
beyond pitch, and thus utilizes the full soundscape as a compositional
palette. Within a framework that conceives of music in terms of the sound
spectrum, the score will interact more fluidly with the other elements that
pertain to the movie soundtrack: dialogue, voiceover and sound effects.
In sum, all elements in the soundtrack are approached purely sonically,
without the mediation of the traditional score. For instance, a
hyperinstrument could filter out a certain frequency region in order to
better interact and integrate with the sound effects that are occurring at the
same time. Moreover, as a result of this approach, the boundaries
between the soundtrack’s elements could blend, if this is the desire of the
creators, thus facilitating the generation of a soundtrack that is far more
integrated. This integration allows for an increased set of creative
possibilities that is not otherwise possible when the various sonic
elements are isolated. Hence, the hyperorchestra facilitates the
integration of all the elements that constitute the soundtrack, therefore
generating new opportunities for interconnection.
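To make "filtering out a certain frequency region" concrete, the sketch below implements a standard notch (band-reject) equalizer using the widely used RBJ audio-EQ cookbook formulas. The scenario — carving 2 kHz out of a hyperinstrument so a simultaneous sound effect in that region stays intelligible — and all parameter values are hypothetical, not drawn from any particular score's mix:

```python
import math

def notch_coeffs(f0, fs, q=4.0):
    """RBJ audio-EQ cookbook coefficients for a notch at f0 Hz."""
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = [1 / a0, -2 * math.cos(w0) / a0, 1 / a0]          # feedforward
    a = [1.0, -2 * math.cos(w0) / a0, (1 - alpha) / a0]   # feedback
    return b, a

def biquad(x, b, a):
    """Direct-form I filtering of the sequence x."""
    y = []
    for n, xn in enumerate(x):
        yn = b[0] * xn
        if n >= 1:
            yn += b[1] * x[n - 1] - a[1] * y[n - 1]
        if n >= 2:
            yn += b[2] * x[n - 2] - a[2] * y[n - 2]
        y.append(yn)
    return y
```

Applied to a hyperinstrument's audio, energy at the notch frequency is suppressed while material elsewhere in the spectrum passes essentially unchanged, leaving that region of the soundscape free for the sound effect.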
Composition as Sound Design
The definition of the hyperorchestra provided in Chapter VI and the
examination of the musical experience from a postmodern perspective
revealed that composition is a culturally restricted subset of sound.
Musical composition implies designing sound combinations over time
that, when they are appropriate to the aesthetics of a given cultural
framework, produce music. It is within this context that the Western
score, the symphonic orchestra, and the principles of Western music
theory can be conceived of as containment tanks that produce creative
works within a restricted and controlled area. In terms of ontology, when
excluding cultural barriers, sound design and composition become
equivalent. Thus, a composer becomes a sound designer with culturally
limited tools and scope. With the introduction of a musical model based
on the hyperreal, the culturally loaded borders that contained musical
practices have blurred, allowing for a sound expansion that encompasses
a wide range of sounds, as well as the possibility to interconnect with
different cultural sound subsets.
Evolution and Expansion Possibilities
The final part of the conclusion provides a set of hypotheses
about the evolutionary possibilities in the aesthetics of the
hyperorchestra. I defined a framework that, while grounded in current
screen music practices, was flexible enough to encompass diverse
approaches.
One of the most prominent aesthetic limitations that I outlined is
the horizontal speaker design that does not utilize height. Moreover,
stereo is still the dominant mode of production for recorded music
outside of movie theaters. In purely narrative cinema, a stereo image
suffices to portray the majority of the information; the addition of
surround sound contributes essentially to the experiential part of the
movie. In other words, surround sound rarely serves a central narrative
role in most current movies. In terms of music, stereo audio systems remain
the dominant means by which music is delivered. This is also true
throughout the process of music creation, although there is an increasing
number of screen composers that are incorporating surround systems
into their studios. However, surround adds an additional layer of technical
complexity to the creative process, which restricts the degree to which it
is used. In addition, screen music is still primarily stereophonic, where
surround speakers fulfill only a secondary role. Conversely, the
introduction of technologies such as Dolby Atmos (Dolby) is unlocking
the parameter of height by utilizing a full three-dimensional array of
speakers. The potential of this technology is enormous in terms of the
experiential possibilities it unlocks, albeit with significant technical
complexities.
An aesthetic of moviemaking that continues to expand the
experiential means of expression that cinema offers will ultimately
engage with modes of sound production that increasingly utilize these
surround technologies. The degree of utilization will also rest on
various practical considerations, including the capacity to translate the
experience to a mode of listening for personal viewing on television or
hand-held devices. The model for the hyperorchestra, however, remains
valid when expanded to a three-dimensional surround experience. In
terms of virtual space design, it will unlock new dimensions for
instrumental placement that could trigger new sonic environments and
ultimately, the possibility to generate new meanings.
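As a toy illustration of what three-dimensional instrumental placement could mean computationally — an invented inverse-distance amplitude panner over a hypothetical cube of eight speakers, not Dolby Atmos's actual object renderer:

```python
import math

def place_source(pos, speakers, rolloff=2.0):
    """Toy 3D amplitude panning: weight each speaker by inverse
    distance to the virtual source position, then normalize so the
    gains sum to constant total power (illustrative only)."""
    weights = []
    for s in speakers:
        d = math.dist(pos, s)                      # Euclidean distance
        weights.append(1.0 / max(d, 1e-6) ** rolloff)
    norm = math.sqrt(sum(w * w for w in weights))
    return [w / norm for w in weights]

# Hypothetical cube of eight speakers, including a height layer (z = 1)
speakers = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
```

Raising the virtual source toward the ceiling (increasing its z coordinate) shifts energy toward the top speaker layer, which is the sense in which height becomes a new placement dimension for a hyperinstrument.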
In Chapter VI, I briefly described how the hyperorchestral model
could interact with purely concert music expressions. In myriad ways,
most contemporary popular music already employs hyperorchestral
concepts in live concerts, generating musical output that reflects varied
combinations of diverse sound processing. Similarly, electroacoustic
music has routinely employed multiple speaker systems to present
diverse musical experiences and sound experiments. The hyperorchestra
also offers these musical manifestations a widely extended range of
means for musical expression that might be applied in the future.
Finally, I described how, in Interstellar (2014), the process of movie
scoring became closely integrated with the actual process of movie
making. In creating movies that are increasingly experiential, where the
narrative becomes just another tool for expression, this level of
integration will likely become more common. Moreover, contemporary
processes of music creation have allowed for close collaborations
between individuals with diverse backgrounds. This triggers an
increasingly collaborative approach to writing music. From this
perspective, the hyperorchestra becomes a very powerful tool. Its
streamlined process of music production emanates from the necessity for
collaboration and integration of different musical sensitivities and
individuals.
BIBLIOGRAPHY
8dio. (2011). Adagio Violins [Computer Software]. Retrieved from
http://8dio.com/instrument-category/orchestral/#instrument/adagio-violins/
8dio. (2015). Acoustic Grand Ensembles (AGE) [Computer Software].
Retrieved from http://8dio.com/#instrument/acoustic-grand-ensembles-age-bundle-vst-au-axx/
8dioproductions. (2014). String Library Comparison - Part 1: Legato (Berlin
Strings, Hollywood Strings, LASS, Adagio) [Comment on a Video
File]. Retrieved from
https://www.youtube.com/watch?v=esSR5NBlyhk
Adams, D. (2005). The Music of the Lord of the Rings: The Two Towers -
The Annotated Score. Retrieved from http://www.lordoftherings-soundtrack.com/ttt_annotated_score.pdf
Adams, D. (2010). The Music of The Lord of the Rings Films: A
Comprehensive Account of Howard Shore’s Scores (Book and
Rarities CD). Alfred Music.
Adler, S. (2002). The Study of Orchestration (3rd ed.). W. W. Norton &
Company.
Altman, R. (1999). Film/Genre. London: British Film Institute.
Apple Inc. (2013). Logic Pro X [Computer Software]. Retrieved from
http://www.apple.com/logic-pro
Aronofsky, D. (Director) (2000). Requiem for a Dream. [Motion Picture]
United States: Artisan Entertainment.
Atkin, A. (2013). Peirce's Theory of Signs. In E. N. Zalta (Ed.), The Stanford
Encyclopedia of Philosophy. Retrieved from
http://plato.stanford.edu/archives/sum2013/entries/peirce-semiotics/
Auslander, P. (2008). Liveness: Performance in a mediatized culture
(2nd ed.). New York: Routledge.
Barham, J. (2014). Music and the moving image. In S. Downes (Ed.),
Aesthetics of Music: Musicological Perspectives (pp. 224-238).
Routledge.
Barthes, R. (1978). Image-Music-Text (S. Heath, Trans.). Hill and Wang.
Barthes, R. (2010). Camera Lucida: Reflections on Photography (Reprint
ed.). Hill and Wang.
Barthes, R. (2012). Mythologies: The Complete Edition, in a New
Translation. Hill and Wang.
Baudrillard, J. (1993) Symbolic Exchange and Death (Theory, Culture &
Society). Sage Publications.
Baudrillard, J. (1994). Simulacra and simulation. Ann Arbor: University of
Michigan Press.
Bauman, Z. (2011). Culture in a Liquid Modern World (1 ed.). Polity.
Bazin, André, & Truffaut, F. (1973). The French Renoir. In Jean Renoir (pp.
74-91). New York: Simon and Schuster.
Biancorosso, G. (2008). Whose Phenomenology of Music? David Huron’s
Theory of Expectation. Music and Letters, 89(3), 396-404.
doi:10.1093/ml/gcn015
Boltz, M. G. (2004). The cognitive processing of film and musical
soundtracks. Memory & Cognition, 32(7), 1194-1205.
doi:10.3758/BF03196892
Boltz, M. G., Schulkind, M., & Kantra, S. (1991). Effects of background
music on the remembering of filmed events. Memory & Cognition,
19(6), 593-606. doi:10.3758/BF03197154
Bordwell, D. (1985). Narration in the fiction film. Madison, Wis.: University of
Wisconsin Press.
Bordwell, D. (1997a). Against the Seventh Art: André Bazin and the
Dialectical Program. In On the history of film style (pp. 46-82).
Cambridge, Mass.: Harvard University Press.
Bordwell, D. (1997b). Defending and Defining the Seventh Art: The
standard Version of Stylistic History. In On the history of film style
(pp. 12-45). Cambridge, Mass.: Harvard University Press.
Bordwell, D. (2006). The way Hollywood tells it : story and style in modern
movies. Berkeley: University of California Press.
Bordwell, D. (2008). Cognitive Theory. In P. P. Livingston, Carl R. (Ed.),
Routledge Companion to Philosophy and Film (pp. 356-367).
Florence, KY, USA: Routledge.
Bordwell, D. (2012). Pandora’s Digital Box: Films, Files, and the Future of
Movies. Madison, Wisconsin: The Irvington Way Institute Press.
Retrieved from http://www.davidbordwell.net/books/pandora.php
Bordwell, D., & Thompson, K. (2012). Film art : an introduction (10th ed.
ed.). New York: McGraw-Hill.
Borges, J. L. (1999). Collected Fictions. Penguin Books.
Branigan, E. (1984). Character Reflection and Projection. In Point of view in
the cinema : a theory of narration and subjectivity in classical film
(pp. 122-138). Berlin ; New York: Mouton.
Branigan, E. (1992). Focalization. In Narrative comprehension and film (p.
xv, 325). London ; New York: Routledge.
Branigan, E. (2010). Soundtrack in Mind. Projections, 4(1), 41-67.
doi:10.3167/proj.2010.040104
Braudy, L., & Cohen, M. (2009). Film theory and criticism : introductory
readings (7th ed.). New York: Oxford University Press.
Brownrigg, M. (2003). Film Music and Film Genre. PhD. University of
Stirling.
Brownrigg, M. (2007). Hearing place: Film music, geography and ethnicity.
International Journal of Media and Cultural Politics International
Journal of Media and Cultural Politics, 3(3), 307-323.
Bruce, R., & Murail, T. (2000). An Interview with Tristan Murail. Computer
Music Journal, 24(1), 11-19. Retrieved from
http://www.jstor.org/stable/3681847
Buckland, W. (2000). Cognitive Semiotics of Film. West Nyack, NY, USA:
Cambridge University Press.
Buhler, J. (2013). Psychoanalysis, Apparatus Theory, and Subjectivity.
Oxford Handbooks Online. Retrieved 19 Oct. 2014, from
http://www.oxfordhandbooks.com/10.1093/oxfordhb/9780195328493.001.0001/oxfordhb-9780195328493-e-004
Buhler, J. (2006). Enchantments of Lord of the Rings: Soundtrack, Myth,
Language and Modernity. In E. Mathijs & Pomerance, Murray (Eds.),
From hobbits to Hollywood essays on Peter Jackson’s Lord of the
rings. Amsterdam; New York: Rodopi.
Cameron, J. A. (Director) (2009). Avatar. [Motion Picture] United States:
20th Century Fox.
Casanelles, S. (2013). Hyperorchestra hyperreality and Inception. Paper
presented at the Music and the Moving Image Conference, New
York, NY. New York.
Casanelles, S. (Forthcoming). Mixing as a Hyper-Orchestration Tool. In L.
Greene & D. Kulezic-Wilson (Eds.), Palgrave Handbook of Sound
Design and Music in Screen Media: Integrated Soundtracks.
Cecchi, A. (2010). Diegetic versus nondiegetic: a reconsideration of the
conceptual opposition as a contribution to the theory of audiovision.
Worlds of Audiovision. Retrieved from
http://www-5.unipv.it/wav/pdf/WAV_Cecchi_2010_eng.pdf
Chion, M. (1994). Audio-vision : sound on screen (Gorbman, C., Trans.).
New York: Columbia University Press.
Chion, M. (2009). Film, a Sound Art (Film and Culture Series) (Gorbman, C.,
Trans.). Columbia University Press.
Cohen, A. J. (1993). Associationism and musical soundtrack phenomena.
Contemporary Music Review, 9(1), 163-178.
doi:10.1080/07494469300640421
Cohen, A. J. (2011). Music as a source of emotion in film. In P. N. Juslin &
J. Sloboda (Eds.), Handbook of Music and Emotion: Theory,
Research, Applications (pp. 879-908). Oxford University Press, USA.
Coen, E., & Coen, J. (Directors) (1996). Fargo. [Motion Picture] United
States: MGM/UA Home Entertainment.
Cook, N. (2013a). Bridging the Unbridgeable? Empirical Musicology and
Interdisciplinary Performance Studies. In N. Cook & R. Pettengill
(Eds.), Taking It to the Bridge: Music as Performance (pp. 70-85).
University of Michigan Press.
Cook, N. (2013b). Beyond Music: Mashup, Multimedia Mentality, and
Intellectual Property. In J. Richardson, C. Gorbman, & C. Vernallis
(Eds.), The Oxford Handbook of New Audiovisual Aesthetics (Oxford
Handbooks) (pp. 53-76). Oxford University Press, USA.
Cooper, M., & Schoedsack, E. (Directors) (1933). King Kong. [Motion
Picture] United States: RKO Radio Pictures.
Coppola, F. F. (Director) (1979). Apocalypse Now. [Motion Picture] United
States: United Artists.
Cuarón, A. (Director) (2013). Gravity. [Motion Picture] United States: Warner
Bros. Pictures.
Currie, G. (1995). Image and Mind : Film, Philosophy and Cognitive
Science. Cambridge, GBR: Cambridge University Press.
Currie, G. (2000). Preserving the Traces: An Answer to Noël Carroll. Journal
of Aesthetics and Art Criticism, 58(3), 306-308.
Currie, G. (2008). The Nature of Fiction (1 ed.). Cambridge University Press.
Daubresse, E., & Assayag, G. (2000). Technology and Creation- The
Creative Evolution. Contemporary Music Review, 19(2), 61-80.
DiagonalView. (2008). Robot Violinist. Retrieved 2014 from
https://www.youtube.com/watch?v=EzjkBwZtxp4
Dolby. (n.d.) Atmos. Retrieved 2015 from
http://www.dolby.com/us/en/technologies/dolby-atmos.html
Donnelly, K. J. (2014). Occult Aesthetics: Synchronization in Sound Film
(Oxford Music/Media). Oxford University Press, USA.
Donner, R. (Director) (1978). Superman. [Motion Picture] United States:
Warner Bros.
EastWest Sounds. (2004). Quantum Leap StormDrum [Computer Software].
EastWest Sounds. (2008). Quantum Leap Ra [Computer Software].
Retrieved from http://www.soundsonline.com/Ra
EastWest Sounds. (2014). EastWest Quantum Leap Hollywood Orchestra
[Computer Software]. Retrieved from
http://www.soundsonline.com/Hollywood-Orchestra
Eco, U. (1990). Travels in Hyperreality (Harvest Book). Mariner Books.
Everett, Y. U. (2007). The Music of Louis Andriessen (Music in the Twentieth
Century). Cambridge University Press.
Felluga, D. (2003). The Matrix: Paradigm of Post-modernism or intellectual
poseur? (Part I). In G. Yeffeth (Ed.), Taking the red pill : science,
philosophy and religion in The Matrix (1st ed. ed.). Dallas, Tex.
Chicago: BenBella Books.
Ferrara, L. (1991). Philosophy and the Analysis of Music: Bridges to Musical
Sound, Form, and Reference. Praeger.
Fincher, D. (Director) (2010). The Social Network. [Motion Picture] United
States: Columbia Pictures.
Fincher, D. (Director) (2011). Trent Reznor, Atticus Ross, David Fincher On
the Score. [Motion Picture] US: Sony Pictures.
Galloway, A. R. (2012). The Interface Effect [Kindle version]. Polity.
Retrieved from amazon.com
Genette, G. (1980). Narrative discourse : an essay in method. Ithaca, N.Y.:
Cornell University Press.
Gibson, D. (2005). The Art of Mixing: A Visual Guide to Recording,
Engineering, and Production (2 ed.). Artistpro.
Gledhill, C. (2000). Rethinking Genre. In C. Gledhill & L. Williams (Eds.),
Reinventing Film Studies (pp. 221-243). Bloomsbury USA.
Gondry, M. (Director) (2004). Eternal Sunshine of the Spotless Mind.
[Motion Picture] United States: Focus Features.
Goodman, N. (1978). Ways of worldmaking. Indianapolis: Hackett Pub. Co.
Gorbman, C. (1980). Narrative Film Music. Yale French Studies, 60,
183-203. doi:10.2307/2930011
Gorbman, C. (1987). Unheard melodies : narrative film music. London
Bloomington: BFI Pub. Indiana University Press.
Gordon, A. (2003). The Matrix: Paradigm of Post-modernism or intellectual
poseur? (Part II). In G. Yeffeth (Ed.), Taking the red pill : science,
philosophy and religion in The Matrix (1st ed. ed.). Dallas, Tex.
Chicago: BenBella Books.
Grisey, G. (2000). Did You Say Spectral? Contemporary Music Review,
19(3), 1-3.
Hall, M. (1933). King Kong (1933): A Fantastic Film in Which a Monstrous
Ape Uses Automobiles for Missiles and Climbs a Skyscraper.
Retrieved 06/15/2014, from
http://www.nytimes.com/movie/review?res=9F03E3DC173BEF3ABC4B53DFB5668388629EDE
Hoover, T. (2009). Keeping Score : Interviews with Today’s Top Film,
Television, and Game Music Composers. Boston, MA, USA: Course
Technology.
Carpenter, H. (Ed.). (1981). The Letters of J.R.R. Tolkien. Boston: Houghton
Mifflin.
Hurwitz, M. (2011). Sound for Picture: Hans Zimmer’s Scoring Collective Composer Collaboration at Remote Control Productions. In J.
Wierzbicki, N. Platte, & C. Roust (Eds.), The Routledge Film Music
Sourcebook (pp. 254-257). Routledge.
Itzkoff, D. (2010). Hans Zimmer Extracts the Secrets of the ‘Inception’
Score. Retrieved 08-24-2013, from
http://artsbeat.blogs.nytimes.com/2010/07/28/hans-zimmer-extracts-the-secrets-of-the-inception-score/
Jackson, P. (Director) (2001a). The Lord of the Rings: The Fellowship of the
Ring. [Motion Picture] New Zealand: New Line Cinema.
Jackson, P. (Director) (2001b). Music for the Middle-Earth. In The Lord of
the Rings: The Fellowship of the Ring. [Motion Picture] New Zealand:
New Line Cinema.
Jackson, P. (Director) (2002a). The Lord of the Rings: The Two Towers.
[Motion Picture] New Zealand: New Line Cinema.
Jackson, P. (Director) (2002b). Music for the Middle-Earth. In The Lord of
the Rings: The Two Towers. [Motion Picture] New Zealand: New Line
Cinema.
Jackson, P. (Director) (2003). The Lord of the Rings: The Return of the
King. [Motion Picture] New Zealand: New Line Cinema.
Jackson, P. (Director) (2012). The Hobbit: An Unexpected Journey. [Motion
Picture] New Zealand: New Line Cinema.
Fineberg, J. (2000). Guide to the Basic Concepts and Techniques of
Spectral Music. Contemporary Music Review, 19(2), 81-113.
Folmann, T. (2014). Interview with Blake Neely. Retrieved from
http://8dio.com/blog/#blog/blakeneely/
Juslin, P. N., & Sloboda, J. (2011). Handbook of Music and Emotion:
Theory, Research, Applications (Reprint ed.). Oxford University
Press, USA.
Juslin, P. N., & Vastfjall, D. (2008). Emotional responses to music: The need
to consider underlying mechanisms. Behavioral and Brain Sciences,
31(5), 559-559.
Heitmueller, K. (2005). Rewind: What Part Of ‘Based On’ Don’t You
Understand? Retrieved 06/14/2014, from
http://www.mtv.com/news/1499898/rewind-what-part-of-based-on-dont-you-understand/
Karlin, F., & Wright, R. (2004). On the Track: A Guide to Contemporary Film
Scoring. Retrieved from http://www.ebrary.com
Kassabian, A. (2013). The End of Diegesis As We Know It? Oxford
Handbooks Online. doi:10.1093/oxfordhb/9780199733866.013.032
Kirn, P. (2005). East West Stormdrum Sample Library: In-Depth Review.
Retrieved from http://createdigitalmusic.com/2005/07/east-west-stormdrum-sample-library-in-depth-review/
Kittler, F. (1999). Gramophone, Film, Typewriter (Writing Science). Stanford
University Press.
Kivy, P. (1991). Opera Talk: A Philosophical ‘phantasie’. Cambridge Opera
Journal, 3(1), 63-77.doi:10.1017/S0954586700003372
Kivy, P. (2007). Music in the Movies: A Philosophical Inquiry. In Music,
language, and cognition : and other essays in the aesthetics of music
(pp. 62-90). Oxford New York: Clarendon Press Oxford University
Press.
Kosinski, J. (Director) (2010). Tron: Legacy. [Motion Picture] United States:
Walt Disney Pictures.
Kubrick, S. (Director) (1968). 2001: A Space Odyssey. [Motion Picture]
United States: MGM.
Kubrick, S. (Director) (1971). A Clockwork Orange. [Motion Picture] United
States: Warner Bros.
Kubrick, S. (Director) (1980). The Shining. [Motion Picture] United States:
Warner Bros.
Leone, S. (Director) (1966). The Good, the Bad and the Ugly. [Motion
Picture] Italy: Produzioni Europee Associati.
Lesage, J. (1976). S/Z and the Rules of the Game. Jump Cut: A Review of
Contemporary Media, 45-51.
Levinson, J. (1996). Film Music and Narrative Agency. In D. Bordwell & N.
Carroll (Eds.), Post-theory : reconstructing film studies (p. xvii, 564).
Madison: University of Wisconsin Press.
Lowder, J. B. (2014). How Interstellar’s Stunning Score Was Made.
Retrieved from
http://www.slate.com/blogs/browbeat/2014/11/18/making_interstellar_s_score_hans_zimmer_s_soundtrack_explored_in_exclusive.html
Lucas, G. (Director) (1977). Star Wars: Episode IV - A New Hope. [Motion
Picture] United States: 20th Century Fox.
Lynch, D. (Director) (2001). Mulholland Drive. [Motion Picture] United
States: Universal Pictures.
Lyotard, J.-F. (1984). The postmodern condition: A report on knowledge.
Minneapolis: University of Minnesota Press.
Machover et al. (n.d.) Hyperinstruments. Retrieved from
http://opera.media.mit.edu/projects/hyperinstruments.html
MakeMusic Inc. (2013). Finale 2014 [Computer Software]. Retrieved from
http://www.finalemusic.com
McLuhan, M. (1994). Understanding media : the extensions of man (1st MIT
Press ed. ed.). Cambridge, Mass.: MIT Press.
Mera, M. (2013). Inglo(u)rious Basterdization? Tarantino and the War Movie
Mashup. In C. Vernallis, A. Herzog, & J. Richardson (Eds.), The
Oxford Handbook of Sound and Image in Digital Media. Oxford
Handbooks Online. doi:10.1093/oxfordhb/9780199757640.013.030
Metz, C. (1991). Film language : a semiotics of the cinema (University of
Chicago Press ed. ed.). Chicago: University of Chicago Press.
Metz, C. (1984). A profile on Etienne Souriau. On Film, 12, 5-8.
Moylan, W. (2014). Understanding and Crafting the Mix : The Art of
Recording (3rd Edition). Independence, KY, USA: Focal Press.
Retrieved from http://www.ebrary.com
Neale, S. (1990). Questions of genre. Screen, 31(1), 45-66.
doi:10.1093/screen/31.1.45
Neale, S. (2000). Genre and Hollywood. London ; New York: Routledge.
Neumeyer, D. (2009). Diegetic/Nondiegetic: A Theoretical Model. Music and
the Moving Image, 2(1), 26-39. Retrieved from
http://www.jstor.org/stable/10.5406/musimoviimag.2.1.0026
Nolan, C. (Director) (2000). Memento. [Motion Picture] United States:
Summit Entertainment.
Nolan, C. (Director) (2005). Batman Begins. [Motion Picture] United States:
Warner Bros.
Nolan, C. (Director) (2008). The Dark Knight. [Motion Picture] United States:
Warner Bros.
Nolan, C. (Director) (2010). Inception. [Motion Picture] United States:
Warner Bros.
Nolan, C. (Director) (2012). The Dark Knight Rises. [Motion Picture] United
States: Warner Bros.
Nolan, C. (Director) (2014). Interstellar. [Motion Picture] United States:
Paramount Pictures.
Oracle Corporation. (1995). Java Programming Language [Computer
Software]. Retrieved from http://www.oracle.com/technetwork/java
Owsinski, B. (2013a). The Mixing Engineer’s Handbook (3 ed.). Cengage
Learning PTR.
Owsinski, B. (2013b). The Recording Engineer’s Handbook (3 ed.).
Cengage Learning PTR.
Pawlett, W. (2007). Jean Baudrillard: Against Banality (Key Sociologists).
Routledge.
Pejrolo, A., & DeRosa, R. (2007). Acoustic and MIDI Orchestration for the
Contemporary Composer: A Practical Guide to Writing and
Sequencing for the Studio Orchestra. Kidlington, GBR: Focal Press.
Retrieved from http://www.ebrary.com
Pejrolo, A., & DeRosa, R. (2011). Creative Sequencing Techniques for
Music Production: A Practical Guide to Pro Tools, Logic, Digital
Performer and Cubase. Retrieved from http://www.ebrary.com
Phoenix, N., Austin, T., & Pacemaker. (2011). Quantum Leap RA Virtual
Instrument User’s Manual. Retrieved 2014 from
http://www.soundsonline-forums.com/docs/QL_RA_Manual.pdf
Prince, S. (1996). True lies : perceptual realism, digital images, and film
theory. Film Quarterly, 49(2), 27-37. doi:10.2307/1213468
Prince, S. (2010). Through the Looking Glass: Philosophical Toys and
Digital Visual Effects. Projections, 4(2), 19-40.
doi:10.3167/proj.2010.040203
Prince, S. (2012). Digital Visual Effects in Cinema: The Seduction of Reality.
Rutgers University Press.
Reeves, M. (Director) (2014). Dawn of the Planet of the Apes. [Motion
Picture] United States: 20th Century Fox.
Rogers, D., Phoenix, N., Bergersen, T., & Murphy, S. (2009).
EastWest/Quantum Leap Hollywood Strings Virtual Instrument
User’s Manual. Retrieved 2014 from
http://www.soundsonline-forums.com/docs/EW-QL_Hollywood-Strings-Diamond_Manual.pdf
RollingStone. (2013). ‘Man of Steel’ Composer Hans Zimmer Celebrates
Mankind on ‘DNA’. Retrieved from
http://www.rollingstone.com/music/news/man-of-steel-composer-hans-zimmer-celebrates-mankind-on-dna-20130513
Rosenbloom, E. (2013). Film Music Friday: Steven Price on Gravity.
Retrieved from
http://www.ascap.com/playback/2013/10/wecreatemusic/fmf-steven-price-gravity.aspx
Rusnak, J. (Director) (1999). The Thirteenth Floor. [Motion Picture] United
States: Columbia Pictures.
Sadoff, R. H. (2006). The role of the music editor and the ‘temp track’ as
blueprint for the score, source music, and source music of films.
Popular Music, 25(2), 165-184. doi:10.1017/S0261143006000845
Sadoff, R. H. (2012). An Eclectic Methodology for Analyzing Film Music.
Music and the Moving Image, 5(2), 70-86.
doi:10.5406/musimoviimag.5.2.0070
Sadoff, R. H. (2013). Scoring for Film and Video Games: Collaborative
Practices and Digital Post-Production. In C. Vernallis, A. Herzog, &
J. Richardson (Eds.), The Oxford Handbook of Sound and Image in
Digital Media. Oxford Handbooks Online.
doi:10.1093/oxfordhb/9780199757640.013.039
Saussure, F. (1998). Nature of the Linguistic Sign. In D. H. Richter (Ed.),
The Critical Tradition: Classic Texts and Contemporary Trends
(pp. 832-835). Boston: Bedford/St. Martin’s Press.
Schaeffer, P. (2002). Traité des objets musicaux. Seuil.
Schaeffer, P. (2007). Tratado de los objetos musicales [Treatise on
Musical Objects] (Spanish ed.). Alianza Editorial.
Scheurer, T. E. (2008). Music and mythmaking in film: genre and the role of
the composer. McFarland.
Scorsese, M. (Director) (2011). Hugo. [Motion Picture] United States:
Paramount Pictures.
Scott, R. (Director) (2000). Gladiator. [Motion Picture] United States: DreamWorks
Pictures.
Sibielski, R. (2004). Postmodern Narrative or Narrative of the Postmodern?
History, Identity, and the Failure of Rationality as an Ordering
Principle in Memento. Literature and Psychology, 49(4), 82-100.
Smalley, D. (1994). Defining timbre - Refining timbre. Contemporary Music
Review, 10(2), 35-48. doi:10.1080/07494469400640281
Smith, J. (1996). Unheard Melodies? A Critique of Psychoanalytic Theories
of Film Music. In D. Bordwell & N. Carroll (Eds.), Post-theory:
Reconstructing film studies. Madison: University of Wisconsin
Press.
Smith, J. (2009). Bridging the Gap: Reconsidering the Border between
Diegetic and Nondiegetic Music. Music and the Moving Image,
2(1). Retrieved from
http://www.jstor.org/stable/10.5406/musimoviimag.2.1.0001
Snyder, Z. (Director) (2013). Man of Steel. [Motion Picture] United States:
Warner Bros.
Souriau, E. (1951). La structure de l’univers filmique et le vocabulaire de la
filmologie. Revue internationale de filmologie, 7/8, 231-240.
Souriau, E., & Agel, H. (Eds.). (1953). L’univers filmique. Paris: Flammarion.
Spielberg, S. (Director) (1982). E.T. the Extra-Terrestrial. [Motion Picture]
United States: Universal Pictures.
Spielberg, S. (Director) (1993). Jurassic Park. [Motion Picture] United
States: Universal Pictures.
Spielberg, S. (Director) (2002). Minority Report. [Motion Picture] United
States: 20th Century Fox.
Spielberg, S. (Director) (2005). Memoirs of a Geisha. [Motion Picture] United
States: Columbia Pictures.
Spitfire Audio. (2011). Albion [Computer Software]. Retrieved from
www.spitfireaudio.com
Spitfire Audio. (2012). Sable Strings [Computer Software]. Retrieved from
www.spitfireaudio.com
Spitfire Audio. (2013a). HZ01: Hans Zimmer Percussion [Computer
Software]. Retrieved from www.spitfireaudio.com
Spitfire Audio. (2013b). Hans Zimmer Percussion: London Ensembles.
Retrieved from http://www.spitfireaudio.com/hz-percussion-london-ensembles
Stanton, A. (Director) (2008). Wall-E. [Motion Picture] United States: Walt
Disney Studios.
Stewart, D. (2010). EWQL Hollywood Strings. Retrieved 2014 from
http://www.soundonsound.com/sos/sep10/articles/ewql-hollywoodstrings.htm
Stilwell, R. J. (2007). The Gap between diegetic and nondiegetic. In D.
Goldmark, L. Kramer, & R. Leppert (Eds.), Beyond the Soundtrack:
Representing Music in Cinema (1 ed.). University of California Press.
Tan, S.-L., Cohen, A. J., Lipscomb, S. D., & Kendall, R. A. (Eds.). (2013).
The Psychology of Music in Multimedia. Oxford University Press.
doi:10.1093/acprof:oso/9780199608157.001.0001
Tarantino, Q. (Director) (1994). Pulp Fiction. [Motion Picture] United States:
Miramax.
Thom, R. (2007). Acoustics of the Soul. Offscreen, 11(8-9). Retrieved from
http://www.offscreen.com/Sound_Issue/thom_diegesis.pdf
Vary, A. (2013). Inside The Mind (And Studio) Of Hollywood’s Music
Maestro. Retrieved 2014, from
http://www.buzzfeed.com/adambvary/hans-zimmer-film-composer-inside-his-studio
Vernallis, C., Herzog, A., & Richardson, J. (Eds.). (2013). The Oxford
Handbook of Sound and Image in Digital Media. Oxford Handbooks
Online. doi:10.1093/oxfordhb/9780199757640.001.0001
VSL. (2004). Vienna Symphonic Library [Computer Software]. Retrieved
from http://www.vsl.co.at/
Wachowski, A., & Wachowski, L. (Directors) (1999). The Matrix. [Motion
Picture] United States: Warner Bros.
WaterTowerMusic. (2013a). Man Of Steel Soundtrack - Percussion - Hans
Zimmer. Retrieved from
https://www.youtube.com/watch?v=QTOMIyynBPE
WaterTowerMusic. (2013b). Man Of Steel Soundtrack - Sculptural
Percussion - Hans Zimmer. Retrieved from
https://www.youtube.com/watch?v=RSFMh0KKl9c
Weir, P. (Director) (1998). The Truman Show. [Motion Picture] United
States: Paramount Pictures.
Welles, O. (Director) (1941). Citizen Kane. [Motion Picture] United States:
RKO Radio Pictures.
Wierzbicki, J., Platte, N., & Roust, C. (2011). The Routledge Film Music
Sourcebook. Routledge.
Winding Refn, N. (Director) (2011). Drive. [Motion Picture] United States:
FilmDistrict.
Winters, B. (2010). The non-diegetic fallacy: Film, music, and narrative
space. Music and Letters, 91(2), 224-244. doi:10.1093/ml/gcq019
Winters, B. (2012). Musical Wallpaper? Music, Sound, and the Moving
Image, 6(1), 39-54. doi:10.3828/msmi.2012.5
Wolfe, J. (2012). With a Blue Dress On. Retrieved 09/14/2014, from
http://juliawolfemusic.com/music/with-a-blue-dress-on
Wyatt, R. (Director) (2011). Rise of the Planet of the Apes. [Motion Picture]
United States: Twentieth Century Fox.
Wyler, W. (Director) (1959). Ben-Hur. [Motion Picture] United States: MGM.
Yacavone, D. (2008). Towards a Theory of Film Worlds. Film-Philosophy,
12(2), 83-108. Retrieved from http://www.film-philosophy.com/2008v12n2/yacavone.pdf
Yacavone, D. (2012). Spaces, Gaps, and Levels. Music, Sound, and the
Moving Image, 6(1), 21-37. doi:10.3828/msmi.2012.4
Yacavone, D. (2014). Film Worlds: A Philosophical Aesthetics of Cinema.
Columbia University Press.
Yeffeth, G. (Ed.). (2003). Taking the red pill: Science, philosophy and
religion in The Matrix (1st ed.). Dallas, TX: BenBella Books.
Zemeckis, R. (Director) (1994). Forrest Gump. [Motion Picture] United
States: Paramount Pictures.
Zimmer, H. (2001). The Gladiator Waltz. More Music from the Motion
Picture “Gladiator”. [iTunes]. Universal Classics Group. Retrieved
from https://itunes.apple.com/us/album/the-gladiator-waltz/id22580901?i=22580985
Zimmer, H. (2010). Radical Notion. Inception (Music from the Motion
Picture). [iTunes]. Warner Bros. Entertainment Inc. Retrieved from
https://itunes.apple.com/us/album/inception-music-from-motion/id380349905
Zimmer, H. (2013a). Digital Booklet. Man of Steel (Original Motion Picture
Soundtrack) Deluxe Edition. [iTunes]. WaterTower Music. Retrieved
from https://itunes.apple.com/us/album/man-steel-original-motion/id642515245
Zimmer, H. (2013b). Oil Rig. Man of Steel (Original Motion Picture
Soundtrack) Deluxe Edition. [iTunes]. WaterTower Music. Retrieved
from https://itunes.apple.com/us/album/man-steel-original-motion/id642515245
Zimmer, H. (2014a). Digital Booklet. Interstellar (Original Motion Picture
Soundtrack). [iTunes]. WaterTower Music. Retrieved from
https://itunes.apple.com/us/album/interstellar-original-motion/id944005211
Zimmer, H. (2014b). Dust. Interstellar (Original Motion Picture Soundtrack).
[iTunes]. WaterTower Music. Retrieved from
https://itunes.apple.com/us/album/dust/id944005211?i=944005218
Zimmer, H. (2014c). S.T.A.Y. Interstellar (Original Motion Picture
Soundtrack). [iTunes]. WaterTower Music. Retrieved from
https://itunes.apple.com/us/album/s.t.a.y./id944005211?i=944005232
Zone, R. (2012). 3-D Revolution: The History of Modern Stereoscopic
Cinema (1ST ed.). The University Press of Kentucky.
APPENDIX A
OVERVIEW OF THE PRINCIPAL MIDI MESSAGES
As a communication protocol, MIDI consists of different
types of messages. Paralleling the Western score and the Western
musical system, MIDI includes two main types of messages135: notes
and Continuous Controllers. Both types, however, can be implemented
with remarkable flexibility. By definition, a note is simply an event that
has a beginning and an end within a given temporal mapping. This
definition facilitates the implementation of Western musical notes, which
are precisely defined as timed pitches. Each MIDI
135
For more information, see Pejrolo & DeRosa, 2007, pp. 1-19. The
discussion that follows concentrates on the conceptual implications of
the MIDI protocol; thus, for the sake of clarity, specific descriptions of
its technical implementation are omitted unless necessary. This
discussion also analyzes MIDI in terms of its practical, objective
definition, which does not imply that the protocol was originally
designed as a mapping of the Western musical framework. In practice,
however, the protocol is far more flexible and is not inherently attached
to the Western tradition. From this viewpoint, it is important to remark
that although most of the names associated with aspects of the protocol
carry connoted meanings within the Western musical framework, the
implementation of the protocol only registers numbers. For instance,
naming Continuous Controller Number 1 “Modulation” does not affect
how this controller functions within the protocol. As I describe below,
this controller is nowadays regularly used to control either the dynamics
or the vibrato of an instrument.
instrument is allowed 128 different notes (2⁷), which are enough to map
the piano’s 88 keys. Strictly by definition, however, each note is just a
number in the interval 0-127. As mere numbers, MIDI notes have no
direct Western musical connotations, such as octave, scale, or pitch;
nevertheless, mapping the piano keys onto MIDI notes is straightforward,
as mentioned above. Similarly, there are 128 possible Continuous
Controllers (CC), each of which can register 128 different values at any
time. A Continuous Controller does not have a start or an end point.
Instead, a Continuous Controller event changes the value of the
controller at a given time, and that value remains in effect until the next
change.
The high number of possible controllers and values implies that, at each
moment, an instrument that maps 128 different notes also has 2⁴⁹
different possible states. In addition, each MIDI note has an associated
set of parameters, such as the velocity (which also has 128 possible
values). The velocity can easily map the dynamic range, as has been
common practice in digital instruments that operate with the MIDI
protocol, because velocity is captured on a MIDI keyboard by measuring
the speed at which a key is pressed. A higher speed implies a higher
velocity, which usually translates into a louder dynamic, paralleling how
the piano functions. Thus, defining dynamic tables for the velocity
parameter is simple. For example, velocity could map two dynamic
levels by the following association:
Velocity Value    Dynamic
0-63              piano
64-127            forte

Figure 50. Simple velocity mapping
In this hypothetical situation, a velocity value between 0 and 63
would result in a piano dynamic, and forte otherwise. Similarly, velocity
can easily map a wider range of dynamics, as is shown in this second
hypothetical mapping (Figure 51).
Velocity Value    Dynamic
0-20              pp
21-40             p
41-61             mp
62-83             mf
84-105            f
106-127           ff

Figure 51. Alternate velocity mapping
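The two hypothetical tables above can be sketched as a simple lookup. The following Python fragment is purely illustrative: the threshold values come from Figures 50 and 51, while the function and table names are arbitrary.

```python
# Hypothetical velocity-to-dynamic lookup tables; these mappings are
# arbitrary conventions, not part of the MIDI protocol itself. Each pair
# is (upper bound of the velocity range, dynamic marking).
SIMPLE = [(63, "piano"), (127, "forte")]
EXTENDED = [(20, "pp"), (40, "p"), (61, "mp"),
            (83, "mf"), (105, "f"), (127, "ff")]

def dynamic_for(velocity: int, table) -> str:
    """Return the dynamic marking for a MIDI velocity (0-127)."""
    if not 0 <= velocity <= 127:
        raise ValueError("MIDI velocity must be in the range 0-127")
    for upper_bound, dynamic in table:
        if velocity <= upper_bound:
            return dynamic
    raise AssertionError("unreachable: 127 always falls in the last range")

# dynamic_for(60, SIMPLE)   -> "piano"
# dynamic_for(90, EXTENDED) -> "f"
```

Because the boundaries are arbitrary, swapping one table for the other changes the musical result without changing a single byte of the underlying MIDI data.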
It is important to remark that these tables are fully arbitrary and
are not part of the protocol. Alternatively, the dynamic mapping might
simply be proportional to the velocity value (0 would be no sound and
127 the loudest possible dynamic). A similar rationale applies to the
Continuous Controllers. Although some of them are conventionally
named,136 the given names have no effect on the protocol, which,
ultimately, only registers numbers. One of these Continuous Controllers
might be used to represent the dynamics instead of the velocity, allowing
the instrument to alter its dynamic while a note is sounding. This is an
improvement over using velocity, which is a single number attached to
each MIDI note, especially for instruments that, unlike the piano, are
able to actively modify the dynamics during a note.137
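This use of a Continuous Controller alongside note messages can be illustrated with the raw bytes involved. The status bytes (0x90 for note-on, 0x80 for note-off, 0xB0 for Control Change) belong to the MIDI specification; the crescendo shaped on CC1 is an arbitrary example.

```python
# Raw three-byte MIDI channel messages. Velocity is fixed at note-on,
# but a stream of Control Change messages on CC1 ("Modulation") can
# keep reshaping the dynamic while the note sounds.

def note_on(channel: int, note: int, velocity: int) -> bytes:
    return bytes([0x90 | channel, note, velocity])

def note_off(channel: int, note: int) -> bytes:
    return bytes([0x80 | channel, note, 0])

def control_change(channel: int, controller: int, value: int) -> bytes:
    return bytes([0xB0 | channel, controller, value])

# A crescendo during a single held note (middle C on channel 1):
messages = [note_on(0, 60, 64)]
messages += [control_change(0, 1, v) for v in range(20, 121, 20)]
messages.append(note_off(0, 60))
```

Note that the protocol itself only sees the numbers: whether CC1 means dynamics, vibrato, or anything else is decided entirely by the receiving instrument.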
MIDI notes permit similar flexibility. They can be associated with
multiple tunings or sounds, with the sole limitation of a maximum of
128 instances. However, by utilizing a
136
CC1 is called Modulation, CC7 is called Volume, CC11 is called
Expression, etc.
137
Wind instruments or bowed strings, for example.
combination of a note and a Continuous Controller, it is possible to
significantly extend the number of sounds that the instrument can map.
For instance, it is feasible to create instruments that can generate more
than 128 different pitches. In this hypothetical implementation, each note
might be associated with a pitch area that would be adjusted by a
Continuous Controller, thus allowing 128 possible pitches for each of the
128 possible pitch areas. With the combination of the note value and just
one CC, the number of possible pitches could increase to 16,384.
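A minimal sketch of this hypothetical "pitch area" instrument, in which the note selects an area and a single CC selects a pitch within it, could look as follows (the function name and the choice of controller are invented for illustration):

```python
# Hypothetical extended-pitch instrument: the MIDI note chooses one of
# 128 pitch areas and a Continuous Controller chooses one of 128
# pitches inside that area, for 128 * 128 = 16,384 addressable pitches.
AREA_SIZE = 128

def pitch_index(note: int, cc_value: int) -> int:
    """Map a (note, CC value) pair to a unique pitch index in 0-16383."""
    assert 0 <= note <= 127 and 0 <= cc_value <= 127
    return note * AREA_SIZE + cc_value

# The highest addressable pitch:
# pitch_index(127, 127) + 1 == 16384
```

The instrument, not the protocol, decides how these 16,384 indices map onto actual frequencies, which is precisely why microtonal tunings fit the protocol without modification.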
Moreover, as described, a MIDI note is just a musical instance that
has a beginning and an end. A MIDI note can map a particular sound or
noise, unrelated to a pitch. Similarly, a MIDI note could trigger a
performance or a complex musical sequence. Furthermore, a MIDI note
might trigger multiple sounds at the same time, even the sound of an
entire orchestra.
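This note-as-trigger idea can be sketched as a dispatch table in which each note number is associated with an arbitrary set of sounds; every sample name below is invented for illustration.

```python
# Hypothetical dispatch table: a MIDI note may trigger one sound, a
# layered orchestral tutti, or nothing at all. The protocol only
# transmits the note number; the mapping lives in the instrument.
TRIGGERS = {
    36: ["taiko_hit.wav"],                     # a single percussive sound
    48: ["strings_C2.wav", "brass_C2.wav",     # a layered "orchestra"
         "choir_C2.wav", "timpani_C.wav"],     # on one key
}

def sounds_for_note(note: int) -> list:
    """Return every sound a given MIDI note should trigger (possibly none)."""
    return TRIGGERS.get(note, [])
```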