Hearing the Elephant down the hall
Rethinking Mise en scene, sound and space.
Overview.
The techno-cultural relationship between evolving technologies and the aesthetic understandings that define cinema practice consistently leads to a re-evaluation of how we define the cinematic medium.
This paper will explore the impact on notions of cinematic ‘space’ brought on simultaneously by the proliferation of multi-channel (surround-sound) audio reproduction and the aesthetics of immersive genre computer gaming.
It will be argued that the established principles of mise en scene composition do not allow for a holistic understanding of a new visual and spatial aesthetic and indeed that some elements of mise en scene need to be re-thought in the context of new technologies and the immersive experiences they bring. In particular this paper will explore the notion of a Macro-mise en scene compositional understanding; a framing sensibility that, through new technologies and aesthetics, embraces cinematic space beyond the framed image.
The paper will closely examine contemporary computer games such as Doom3 (2004) and Half Life 2 (2004) alongside the work of filmmaker Gus Van Sant, Elephant (2004) and Psycho (1998), as key examples of a newly engaged spatial aesthetic driven by sound and its relationship to space and viewership as influenced by computer game aesthetics.
1. Mise en scene, montage and the totality of the frame.
For the majority of the past one hundred years the making of cinema and the construction of meaning through the cinematic form (moving image, sound and music) have been understood through, and indeed governed by, the central pillar of the mise en scene. Regardless of genre, style or even the type of mechanical process involved in making cinema (be it live-action image capture through a photographic process, animation or various forms of rotoscoping both digital and analogue) the common trait shared by all is the framed border of the image. As such, drawing upon long-standing artistic traditions of frame-based composition and arrangement from the theatre, painting and photography, cinematic meaning has been built on the act of ‘placing in the scene’: arranging visual (and aural) elements within a framed perspective, a distinct and restricted window through which a viewer experiences the cinematic work.
In this manner, long-standing cinematic thinking dictates that the borders of the frame, and subsequently the unavoidably confined nature of the cinematic image (in comparison to how the human eye might perceive the world), are essential for meaning to be composed. This idea comes very much out of the Formalist notion of cinematic meaning; that “film’s specific property is its inability to perfectly imitate normal visual experience of reality” (Buckland. 1998. p23). On this view cinema is an art because it relies on the construction and auteurship of meaning by exploiting the limitations and constraints of its form and process. A painting or photograph cannot exist without a frame and subsequently neither can a film.
This perspective, however, seems to lean heavily towards a very static notion of cinema, a cinema in two dimensions and without a sense of space (much like its painting and photographic siblings). To accommodate a desire for a less static cinematic arrangement, and one that sets cinema apart as possessing unique properties, two elements combine: camera movement and montage.
Both of these technical and aesthetic techniques allow the viewer to experience a distinctly cinematic medium by providing what cinema’s parents, photography and painting, cannot. Both of these rather simple, but nonetheless powerful and defining, cinematic assets allow a film to present a very distinct notion of space in relation to a specific perspective or range of presented perspectives.
Montage is the process of sequencing visual elements; cutting from one shot to another. In simple terms the mise en scene could be described as creating meaning ‘inside’ the frame and montage as creating meaning ‘between’ the frames. The father of montage thinking, Sergei Eisenstein, described the powerful phenomenon of montage as “the fact that two film pieces of any kind, placed together, inevitably combine into a new concept, a new quality, arising out of the juxtaposition.” (Eisenstein, 1947. p4)
In the context of cinematic space the montage cut allows the viewer to be shifted instantly and, more importantly, without question and without breaking their visual acceptance, from one place, space, location or even time to another. The most common example is the reversal of a shot’s perspective so that the viewer/audience may see the scene from the opposite angle. This is standard practice for depicting dialogue scenes between two characters, where the mise en scene shifts continually in 180-degree movements between lines of dialogue. Likewise, in the definitive three-shot sequence, the frame cuts directly from the event being ‘seen’ to the reaction-shot of the person ‘seeing’.
In this manner montage treats individual cinematic frames as compositions complete unto themselves, each frame holding all the necessary information for that moment in time with additional information gained through an additive, accumulative effect of cutting and sequencing these complete compositions.
In essence the perspective opposed to montage is the movement of the camera within the duration of a single shot or take. Where montage is able to instantly shift the viewer to a new spatial location/perspective, the longer moving camera shot allows for a more realistic sense of distance to be conveyed to a viewer; a dolly or hand-held shot that moves from one room to another delivers a direct connection with the physical proximity of the two locations in realist terms rather than framed compositional ones. Simply put, this movement of the camera creates a shot that shifts in space and presents a continuous and fluid series of individual mise en scene framings or compositions that are constructed, to some degree, by the viewers themselves within the parameters of the moving frame.
Menard, in discussing Andre Bazin’s perspective on the moving camera, states that his “conception of cinema eschews from the interpretive potential of image content editing, including in-camera scene editing, and favours a form of meaning that originates within the spectator’s perceptual field.” (Menard, 2003.) In other words, the realist perspective championed by Bazin seeks a meaning that originates and exists, to a large extent, wholly within the framed perspective held by the viewer. This differs distinctly from the montage ‘cut’ in that these successive and continuous framings are presented seamlessly, in real-time and, to some small extent, real-space continuity. Meaning is derived from inside the frame, as opposed to montage where, as above, meaning is constructed between frames, augmented by a proactive mental exercise of sequence assembly on the part of the viewer.
In What is Cinema? (1967) Bazin gives a key example of the power of direct and tangible framed spatiality in the cinematic depiction of a boy carrying a lion cub and being stalked by the lion cub’s mother. Much of the sequence, according to Bazin, is shown in “banal montage” (p. 49, footnote). It is only in the final part of the sequence that the constructed nature of the sequence’s tension is relinquished to the reality of a single take where the lion, cub and child all occupy the same frame. “This single frame in which trickery is out of the question gives immediate and retroactive authenticity” (p. 49, footnote). According to Bazin, a continuation of the montage in place of this single, uncut, framed shot would not have changed the narrative meaning, since the story remains the same, but the scene would not have “unfolded before the camera in its spatial and physical reality” (p. 49, footnote) and so would have carried less emotive impact.
What remains common and central to the traditional and accepted understandings of these two related but distinct methodologies is that they both present compositions within a framed window in their Totality. There is little notion of a world or space outside of the framed borders. There is an unvoiced acceptance on the part of the viewer that all that is important in a scene will take place within the screen’s frame; “what is inside the frame is, ‘The Film’, what is outside of it is not.” (A-Soma, 2001)
Regardless of filmic style or genre the viewer’s truth remains that everything of importance or significance to their understanding of the film’s plot, narrative, meaning and characters will occur within that visual and bordered frame or, to the same ends, be shifted and brought into the frame, onto the stage, by the movement of the camera. Anything that the camera’s frame moves past, leaves visually behind or excludes from view is deemed not to be of importance or narrative significance. Moreover there is a perception that anything outside of the mise en scene frame doesn’t exist at all. Traditional thinking holds that the narrative world of the cinematic work exists Only within the mise en scene. The mise en scene is a vehicle for the telling of a specific and precisely constructed singular narrative or experience, and nothing else, no other possibility, exists outside that which is framed and composed. To the audience the story being told exists as a complete entity within the contents of the compositional frame.
2. The Macro-Mise en scene. – Sound in Space & the world beyond the frame.
Even taking into account the relatively short history of cinema as an artform, there have been remarkably few paradigm shifts, stylistic or technical, in its means and form. Cinema, for the most part, is made and viewed now much as it has ever been. The introduction of synchronous recorded sound and of colour are obviously the two most significant of these seminal changes over the past one hundred years, but there are other, more recent, shifts that, whilst more subtle and less obvious to the average movie-goer, have been nonetheless enormously impactful.
Far less well documented and yet arguably just as significant to our perception of what cinema is and how we perceive our experience of it, is the development of multi-channel sound; what is today best known as Surround-Sound.
The development of multi-channel sound has come over a long period of time; from the emergence of stereo sound (known originally as binaural sound) in the very early 1930s, through the seminal five-channel soundtrack of Fantasia (1940), and on to the established and standard 5.1, 7.1 and even 9.1 channel sound for both the cinema theatre and the home theatre systems of today.
Even the fledgling first step from mono sound (a single source of sound located directly behind the screen) to stereo (two separately placed sound sources, left and right) serves as a direct signifier of the substantial change in thinking about cinematic space that came as a result of this evolution in technology.
Sound beyond sight.
The CinemaScope biblical epic, The Robe (1953), was one of the first films to embrace the notion of a tangible and audience-aware space beyond the edges of the frame; a space indicated, constructed and conveyed in aural terms relative to the position of the audience in the theatre. The Robe is the first film where a line of dialogue in the screenplay marked as OS (meaning off-screen) is actually heard to emanate from a space ‘off-screen’, outside the borders of the mise en scene frame.
This simple predecessor to what is now arguably the cinematic standard, particularly with the widespread take-up of surround-sound home theatre systems, has led directly to a new audience perception of what comprises the cinematic space. “The end of the image is no longer the edge of the screen. We are completely immersed in a sound universe and feel as if we are actually in the space of the action, because we can hear the action surround us” (Yu, E. 2003).
Truppin, in the essay And then there was sound (1992), discusses this idea of off-screen sound and writes that “our tendency to attach a sound to an emitting source in the interests of coherence allows us to accept the existence of that which we cannot see. While the picture of the sound source may not be available the imagination steps in to provide a mental image of this sound source. In this way off-screen sound enlarges the film space beyond the borders of the screen.” (p. 236)
Obviously there is a correlation here between spatially placed sound and a desire to create a greater sense of linked spatial/visual realism. Andre Bazin has described the cinematic realism he holds so dear as residing “in the homogeneity of the space” (Bazin, A. 1967, p. 50, footnote). But he was speaking specifically of the visual mise en scene space, the framed space in deep focus. What about the space defined not by visual elements but by aural ones? How is our definition and understanding of mise en scene as a framing/composition technique challenged by an aural space that is bigger and more realistically holistic than the visual framed mise en scene can ever be?
Through traditional sound replication systems and practices, namely monaural sound, the space of a film’s action and events exists at a distance from the viewer, a space beyond the screen which, through light projection, presents itself as a window onto an alternate world. Once sound is expanded beyond the framed screen, however, first left and right in stereo and then to the sides and behind as surround, the audience is shifted from its traditional role and placed into the film’s environment. No longer does the viewer look through the screen window at rain falling in a removed place onto the characters; they sit in the rain, the same rain falling on the characters. The fundamental positioning and role of spectatorship on the part of the viewer is shifted.
Traditional mono sound recording, which employs just a single channel of sound emanating from a single spatial position (generally front and centre), can be said to present its sound grammar in just two distinct forms: the visual-specific and the visual-abstract. In the context of the visual-specific the camera directs the audience to the specific source of the sound; an action, event, object or character. The sound may be heard without visual connection but in order for it to take on meaningful significance for the viewer it must be visually framed; the source of the sound must be seen. For example the sound of a person screaming would generally cause the camera to quickly move or pan to see the source of the scream. This is common cinema technique and itself reflects the very human tendency to turn one’s attention to the source of a sound, not only to see the source but also to hear it better, as human ears are significantly more effective in a forward-facing position.
However, if the visible cinema frame, once it has focused on the source, then shifts away again by camera movement to a new subject or frame while the sound/event continues, there is an audience assumption that the sound/event is no longer of significance or importance to the scene (and generally it would be ‘mixed down’ to a very low background volume level by the sound designer. We’ll look at ideas of ‘mixing’ further on).
The traditional mise en scene privileges the frame contents as of utmost significance. The sound of events occurring or emanating outside the frame serves simply as a means to repopulate the visual frame, generally by prompting the camera to move. Once the source of these sounds is removed from the frame it is diminished in importance and audience/narrative relevance. Generally this is achieved technically by mixing the sound down in volume regardless of the sound source’s physical proximity to the viewer/camera’s position in the scene.
This direct visual framing of important sounds lends itself perfectly to mono sound because an object framed in the restricted mise en scene border forms no ‘realistic’ spatial clash with the front and centre placement of the sound reproduction source. Indeed it could be argued that much cinematography of the post-synchronous sound era prior to multi-channel sound reproduction has been dictated by an acceptance of the technical restrictions of single channel sound in mono.
The much-dissected shower scene from Hitchcock’s Psycho (1960) can be seen as one example where the no doubt gloriously effective cinematography is perhaps nonetheless influenced by the limitations of the sound placement or, at the very least, where the sound design hasn’t influenced the position or movement of the camera.
In this scene, where Marion is murdered, the camera continually shifts, mostly through cutting, to reframe in precise context individual visual moments, each one of which has an equally precise sound attached. In other words the direct sources of sound in the shower scene are always framed in the visual-specific, front and centre. When Marion takes off her shoes and they clip on the floor we see her feet and the shoes framed in close up. When the curtain is pulled over, we see the curtain centred in the mise en scene. When the water begins running we focus directly on the water coming from the shower head, front and centre and so realistically connected with the sound’s emanation. Rather than panning to a new position to reframe Marion washing herself in the shower, the camera cuts back and forth, often to reverse 180-degree angles, always with the core sounds (namely running water) maintained front and centre to the frame. When the character of Norman’s Mother (later revealed as Norman himself; my apologies if you haven’t seen the film!) enters the bathroom, the camera moves subtly to frame him also front and centre. Subsequently the key sound elements in this scene triggered by this character (the shower curtain being swiftly snapped back and the harsh string sounds, half music motif and half sound effect, attached to the knife) are also framed and heard front and centre. This scene draws virtually no event or action into the frame from beyond its border via sound. In this context the shower scene can be seen as very two-dimensional in its composition, with each visual element carrying specific sounds composed squarely and with specific visual connection, and with little need, or indeed opportunity, to create a more tangible concept of space due to the technical restrictions of mono sound.
Orson Welles in The Magnificent Ambersons (1942) experiments with, but ultimately struggles against, the restrictions of mono sound in placing the source of sounds in space. The scene in question involves two characters arguing in front of a door on a landing at the top of a set of stairs and a third, unseen character off-screen, yelling at the couple to be quiet. The female character twice moves inside the door and slams it shut whilst continuing to engage vocally in the argument. In an attempt to simulate the aural effect of this shift in the vocal source from visible to non-visible and back again, along with the acoustic restrictions of the closed door, Welles adds a large amount of artificial reverberation to the woman’s voice and lowers its mix level. This audio effect makes the voice seem further away and muffled. Likewise the sound of the man complaining off-screen of the noise is heard at a slightly lower mix level and with a similar, but more ‘echoy’, reverb to indicate a removed distance away from the framed scene.
For the first part of the scene this approach is more or less effective in simulating a sense of distance contained within a single framed take. But the reverberation can also be seen (and heard) as an aural sign-post, a signifier of distance and position rather than an effect borne out of spatial actuality. When the woman goes inside for good and the character of George moves away from the door, the camera follows to keep him and his voice front and centre, squarely framed in the mise en scene, and to bring into frame the figure of the man who was complaining of the noise. This places the door, and the woman’s voice beyond it, behind the camera/viewer perspective. Of course the sound in this case, being mono, forces the voice of the woman, still yelling out, to be spatially misplaced in relation to the mise en scene in order for George’s voice to stay true.
Here the movement of Welles’ camera is torn between the problematic nature of mono, non-spatially specific sound and the desire to avoid montage; a cut to a reverse angle showing the closed door, with the woman’s voice emanating from behind it, would have allowed the mono sound to continue to function with spatial accuracy. The result is neither one thing nor the other: a scene with interesting choices in regard to reverberation to signify distance, but with camera movement and composition that can reconcile themselves neither to the realism of spatial placement in regard to sound and space nor to the constructed, formalist technique of montage that would defy space altogether by cutting.
The converse of mono sound in the visual-specific is sound in the visual-abstract; that which is not connected visually to an action or event but rather is generally ambient or atmospheric, not demanding that the audience connect directly with it in a conscious, visual manner. Sound of this type is largely independent of ideas of diegesis (being either diegetic or non-diegetic) but remains unconnected to a visual, framed specific that is necessary for the viewer. Visually abstract sound is about tone and mood rather than a direct relation to narrative progression. Mono sound in the visual-abstract is tightly dictated by its mix level, as relative volume is the only tangible tool at the sound designer’s disposal to control the impact and significance of visually abstract sounds. A visually abstract sound played too prominently or loudly in the mix will demand viewer/camera attention.
Sound in mono can therefore be said to force the camera to be specific and complete, confined by the audience’s need for a clarity they cannot obtain from the sound’s single directional source.
In multi-channel sound however the camera is not confined to this audience visual clarity because the sound is now able to provide physically and spatially specific information, tied heavily to physical reality, as well as a viewer-based realism to the scene. This sound can be directly connected and diegetic to the visual aesthetic of the scene but need not be specifically connected to the image. Computer game sound designer Chia Chin Lee explains that “The viewers are sharing the same auditory world as the characters on-screen” (Lee CC, 1999) and so many aural elements exist beyond our sight, and subsequently the sight of the characters but none the less can be an integral and specific part of the scene.
A great deal of contemporary understanding about the mise en scene comes from the work of Bordwell and Thompson and their book Film Art (Bordwell & Thompson. 1989). They comment that “in most films those shapes (mise en scene elements and contents) also represent a three dimensional space in which action occurs. Since the image is flat the mise en scene must give us cues that will enable us to infer the three dimensionality of the scene.” (Bordwell, Thompson. 1989 pg 136)
The assumption in this understanding of cinema space, that it is derived via the framed, screen-based mise en scene, is that the three-dimensional world of the narrative exists beyond, on the other side of, the cinematic frame/screen and subsequently that the viewer does not share the same space as the action. Instead the screen image ‘infers’ three dimensionality based solely on light, dark and shadow. The audience observes from a removed ‘God’ position that is non-diegetic to the scene.
Space in this context is ‘representational’ rather than ‘actual’, as it implies that only the action takes place in the cinema space, not the process of viewing. Surround sound technology and the widespread act of spatial placement of sound around the viewer drives a new understanding of the act of spectatorship taking place within that same space.
This notion of a surrounding, realistic, aural space is one that is most at home in, and most integral to, contemporary 3D computer gaming, specifically the genre of games dedicated to creating an immersive, personal experience; otherwise known as First-Person Shooter games.
In gaming it is the ‘Space’ rather than the ‘Frame’ that is ‘composed’ by a game designer/director. This holds both aesthetically and practically, in terms of the actual process of designing game levels and complete 3D virtual spaces. Subsequently the visual frame in computer gaming is free to move over the more holistic ‘composition’ without negating or detracting from the importance of compositional elements outside of the visual frame. This can be referred to as the Macro-mise en scene. The audience may be viewing the narrative/story/experience through the hard borders of a visual frame but there is no illusion of the frame presenting any sense of a totality of vision, because a larger composition is constructed for the viewer through spatially specific sound. The viewer accepts that the frame is just a small part of the composed scene, not the scene in its entirety.
Where this idea clashes with cinema is that conventional film grammar presents us with a cinema sound that is largely representational rather than authentic. Sounds that are specifically necessary to the forward progression of story are delivered in a manner that allows the viewer to ‘accept’ that they exist but not in a way that demands or replicates an aural actuality. Soundscapes undergo a precise process of ‘mixing’ where volume levels of individual sounds are not only precisely set but also controlled over time; faded-in and faded-out with changing relative proportions in relation to other sounds in the scene.
As an example, in Psycho (1960) the sound of the car-horn beeping as the character of Marion attempts to get the attention of the Motel manager (Norman Bates) is a sound that exists in the cinema space to represent the act of Marion demanding the attention of the Motel manager and wanting to get out of the rain. As the camera cuts from Marion in the car to a removed upward angle of the house there is no change in the sound of the horn which continues to play. Its reverberation, echo or spatial location to either the car, the house or indeed the position of the viewer, remains unchanged. The horn sound is simply there to say ‘a horn is sounding with urgency’ rather than to have an actuality of a horn sound in space and time. The car horn is a representational signifier.
Computer Games and the new perspective.
This notion of sound as a signifier rather than a representation of actuality is distinctly different from, and opposed to, the aural aesthetic and grammar of 3D game sound, where sound most often exists in ‘actuality’ rather than simply as a signifier; important game sounds will continue regardless of the camera/user’s presence. Again the mise en scene, both aural and visual, doesn’t seek to present a totality of vision; there is no endeavour on the part of game creators, nor expectation on the part of game players/viewers, that the mise en scene is complete with all the information they need for the game’s forward narrative or experiential progression. The framed mise en scene is just one small element of a larger compositional space the viewer is engaging with, a space that exists in actuality and is based largely on a realist perspective rather than one of signifiers.
In First-Person Shooter (FPS) genre games (key examples being Doom3 (2004), Half Life 2 (2004), Quake3 (1999), etc) it is common for the player’s perspective/avatar to encounter another character (known as an NPC, non-player character) who will speak (direct address to camera in film terms), generally conveying important information for the narrative. Because the player/viewer directly controls the visual mise en scene they are free to turn the ‘camera’ (aka: their perspective view) away from the NPC. They can even walk away and explore another part of the space/room/environment. The sound of the NPC’s voice will continue regardless of the ‘camera’ shift, and the action of re-framing the mise en scene in no way removes the sound of the NPC speaking from the viewer’s accepted Macro-mise en scene. The movement of the framed mise en scene away from the direct source doesn’t diminish its importance to the ‘scene’ as it certainly would in traditional cinematic grammar.
Further examples can be seen in games such as Doom3 (and numerous others) which take place in long series of winding interior hallways, corridors and rooms. Here the game narrative, and the viewer/player’s progression through it, is in large part driven and guided by sound attached to specific places and environments. In filmic terms this would simply be called ‘room tone’, the ambient, atmospheric sound of the space based on its size, contents, purpose and so on. In gaming however this room tone becomes a crucial element of the spatially based narrative journey. The player/viewer navigates via sound, knows what is happening on the other side of a door by sound, knows if a creature/enemy is about to attack by sound and, in particular, knows the specific direction, in relation to their viewer/player position, of all of these via sound placement.
Many games of this genre present a labyrinth-like environment where often the only way to know if your avatar is going in the right direction is by the spatial arrangement of sound. As gaming sound designer Lee comments, “In the film world, coherence in sonic language is easier to achieve because the audio track is composed for a predictable, linear medium… in the game world, the audio remains the linear component, while the image becomes dynamic!” (Lee CC, 1999). In other words, under the gaming aesthetic, where the framed image is a completely dynamic and constantly (unpredictably) shifting element, it is the surround-sound design that becomes the linear driver of the viewer/player’s perception of cinematic space rather than the inferred three-dimensionality of screen-based light and shadow described by Bordwell and Thompson. The mise en scene frame loses its totality, its completeness, and becomes a dynamic portal to a larger composition rather than the composition itself.
Where gaming and Cinema meet.
Increasingly these notions of spatial awareness and aural specificity cross over from the gaming environment, where they are most at home, to cinema, where the tools and technology (home theatre and cinema surround sound systems, Dolby 5.1 etc) are much the standard but still largely under-utilised in anything but a superficial way. As Randy Thom states, “What passes for ‘great sound’ in films today is too often merely loud sound” (1999).
Yu comments on this spatial actuality through sound bleeding in relation to cinema; “We can also hear not only the sound space the character of the film is in but also what is outside. Sounds with character can get in through a window.” (Yu, E. 2003) This idea takes the notion of an immersive aural environment one step further. Not only is the audience placed into the same space as the characters, and so hears the environment around those characters, but it also hears what the character hears, which goes beyond the environment the camera/viewer/characters are placed in and takes in notions of space beyond the one the viewer is immersed in at any given time.
Gus Van Sant’s Elephant (2004) presents a cinematic example of a very effective paired relationship between camera technique and surround sound, one that can be said to derive directly from gaming sensibilities of spatial placement and the Macro-mise en scene: the composition of a larger space over which the camera is free to range and into which it can draw spatially specific but non-visual elements.
A particularly good example of this correlation in Elephant is the scene involving a meeting of a small group of students discussing homosexuality. In this scene sound and vision exist in the same composed space, the same Macro-mise en scene, but are not necessarily aligned in the same mise en scene frame.
The camera moves into the room tracking a particular student who is slightly late for the meeting. As she sits the camera moves to the centre of a circle of seated students and it continues to move, panning slowly around the room, from left to right, taking in a medium close up of each student in succession.
The discussion is not a boisterous one; each student (and the teacher), for the most part, is given time to speak and comment in turn whilst the camera continues to move slowly but without pausing. The key difference in the staging of this scene is not the fluid movement of the camera but the fact that the camera is not necessarily (indeed rarely) focused on the person who is actually speaking. The voice of each person is heard, and heard in surround spatial placement relative to their position in the circle (and the viewer), but the camera is independent of this aural composition. It moves across people before they speak, after they speak and indeed on to people who don’t speak at all (or at least not at the same time as the camera is on them). Only occasionally is the camera focused on the source of the sound.
With this style Van Sant creates a scene where sound and vision exist in the same scene, and both are equally prominent in the viewer’s experience of the scene and the scene’s narrative, but sound and vision do not exist in the same frame. The visual and aural frames are independent or, more accurately, the visual frame shows just one mobile part of the audience’s perceived Macro-mise en scene.
Van Sant takes this further in Elephant by using spatially placed sound to construct for the viewer a specific spatial awareness of the larger environment beyond the traditional mise en scene. The narrative of Elephant works in broken non-linear fashion, shifting both forwards and backwards, forcing the viewer to piece together the discontinuous moments in time. Moreover Van Sant shows several scenes a number of times over, each from a different perspective, following a different character or with a different focus. One particular example involves three characters who cross paths in the school hallway, with each version following a character into the hallway from a different direction. The sound placement and the sounds themselves are very specific in all three of these scenes and spatially truthful in their replication. In particular, in each version there is the sound of a car and its door opening and closing from the school car park. This car park is given an exact location in relation to the hallway through this sound, as are the music practice room and the more distant cafeteria where most students are.
Elephant follows no particular protagonist but rather an ensemble of individual characters (in many cases with no direct connection with each other) through this particular day in their school lives. Because the film ultimately involves the tragedy of a ‘Columbine’-like shooting at the school, the physical/spatial location of each of these student characters we are following (in many cases literally, with a Steadicam) within the school environment becomes crucial to the narrative and, in particular, to our sense of drama, as we know what’s around the corner, literally and spatially, before they do. This is a long-established tenet of the Horror and Thriller genres, taken further with surround sound.
The above example of the car door placed specifically in the surround sound array outside a certain hallway window becomes important because when we follow the character of John he moves from the hallway outside to the car park, where he crosses paths with the two student gunmen entering the school. Each time we see the scene in flashback from a different (even reverse) perspective, we are reminded that the gunmen are coming from that particular direction in relation to the hallway. From here the viewer, through sound location, much as a Doom3 player/viewer navigates by sound, is able to piece together not only the narrative as we traditionally understand it but also the space, and the characters’ relationship to space, that is so crucial to the drama of the film.
Van Sant in Elephant essentially composes a Macro-mise en scene, the entire school spatially, aurally and visually, and then manoeuvres a camera through this ‘composition’. For the audience the mise en scene is much larger than the frame and there is no illusion that the frame represents a complete ‘Totality’ of the scene, just a small section of it. This is a definitively game-like approach as the process of designing and ‘composing’ a game is one of creating an entire space where the player/viewer’s personal mise en scene frame will be just one incomplete window on the larger composition; a larger composition the viewer must be aware of as a whole outside of their visible frame.
As discussed earlier, in cinema the ‘mix’ of sound (the relative levels of volume between individual sounds within a scene) is an artificially imposed construction on the part of the sound designer. Sounds not necessary for narrative information, or those whose presence has already been signified (as discussed earlier in the context of representative sound), can then be mixed or faded down into the background, most often to an unnaturally low level. A prime example would be a scene in a nightclub where the music is at a very high amplified level, so much so that audible speech between two people would be impossible short of direct shouting into the ear.
Sound designer and filmmaker, Walter Murch, has commented that “(sound) layers are listed in order of importance, in somewhat the same way that you might arrange the instrumental groups in an orchestra.” (Murch 2000)
In this way traditional cinematic grammar in regard to sound is built around the idea of sound triggered, or mixed, by action; as above, the act of one character speaking to another would force the mix-down of other sounds, such as music, not specifically relevant to the conveyance of narrative in the mise en scene. In other words, once the sound has been signified it loses importance for the audience.
In gaming, however, the soundscape is, as with a game’s visual environment, composed in space and set universal to the Macro-mise en scene; an aural actuality of space and volume. Subsequently the ‘mix’ is constructed by the player/viewer in accordance with what is required for their particular understanding of the ‘narrative’. Thus, in gaming, crucial narrative information will take place in the larger Macro-mise en scene whether the player/viewer has framed the subject or source of this information or not. It exists in the space independent of the frame.
This approach stems as much from the pragmatics of producing sound for a 3D game as it does from an aesthetic choice based on the creation of realistic, immersive environments. A common technique for creating sound designs for 3D games and 3D animated films is to use virtual microphones and speakers that are ‘placed’ invisibly in the composed 3D space; this is known as 3D sound rendering. These virtual devices then permit sounds to be arranged not just in terms of traditional mix level but also in space, proximity and location. The movement and placement of virtual microphones then allows for sound position to be recorded; i.e. the level of a sound based on the distance at any given time of the virtual microphone to virtual speaker.
The effect of sound rendering as an audio design tool can be seen in virtually any FPS genre computer game where the player/viewer moves their perspective, via the avatar through whose eyes the player is seeing the environment, closer to an object that is emitting a sound. As the avatar gets closer the volume level of the sound emanating from the object grows louder, an effect produced by the game’s program code simply through an equation that calculates mix level based on the distance or proximity of the avatar to that object. The code then also calculates the sound’s placement in the five-channel array in accordance with the facing direction of the avatar.
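The two calculations described above (level from distance, placement from facing direction) can be sketched in a few lines. This is an illustrative sketch only and not the code of any particular game: the inverse-distance law, the constant-power pan between the two nearest speakers, the speaker angles and all function names are assumptions made for the example.

```python
import math

# Assumed speaker angles for a five-channel array, in degrees, measured
# clockwise from the direction the avatar is facing (illustrative values).
SPEAKER_ANGLES = {
    "surround_left": -110.0,
    "front_left": -30.0,
    "centre": 0.0,
    "front_right": 30.0,
    "surround_right": 110.0,
}


def distance_gain(distance, reference=1.0):
    """Assumed inverse-distance law: full level inside the reference
    distance, halving each time the distance doubles beyond it."""
    return reference / max(distance, reference)


def relative_angle(listener_pos, facing_deg, source_pos):
    """Angle of the source relative to the avatar's facing, in [-180, 180)."""
    dx = source_pos[0] - listener_pos[0]
    dy = source_pos[1] - listener_pos[1]
    bearing = math.degrees(math.atan2(dx, dy))  # 0 = +y axis, clockwise positive
    return (bearing - facing_deg + 180.0) % 360.0 - 180.0


def channel_gains(listener_pos, facing_deg, source_pos):
    """Per-speaker gains: distance sets the overall level, and the facing
    direction decides which two adjacent speakers share that level."""
    angle = relative_angle(listener_pos, facing_deg, source_pos)
    level = distance_gain(math.dist(listener_pos, source_pos))
    ordered = sorted(SPEAKER_ANGLES.items(), key=lambda item: item[1])
    gains = {name: 0.0 for name in SPEAKER_ANGLES}
    for i, (name_a, ang_a) in enumerate(ordered):
        name_b, ang_b = ordered[(i + 1) % len(ordered)]
        span = (ang_b - ang_a) % 360.0    # arc between the two speakers
        offset = (angle - ang_a) % 360.0  # source position along that arc
        if offset <= span:
            t = offset / span
            # Constant-power pan keeps perceived loudness steady across the arc.
            gains[name_a] = level * math.cos(t * math.pi / 2)
            gains[name_b] = level * math.sin(t * math.pi / 2)
            break
    return gains
```

In this sketch, calling channel_gains((0, 0), 0.0, (0, 2)) places a sound two units directly ahead almost entirely in the centre channel at half level; turning the avatar ninety degrees to the right with the same source moves the sound to the left-hand speakers without changing its overall level. The player/viewer, not a sound mixer, is what changes the ‘mix’.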
The development of this technology, and the aesthetic/artistic choices that go with it, can be linked to the films of Robert Altman and, in particular, to the theoretical analysis of Altman’s work by Phillip Brophy. Altman, in films such as The Player (1992), pioneered the use of long zoom lenses, multiple cameras and, particularly, radio microphones to remove the technicalities of production (particularly sound production: the boom and sound recordist) from the environs of the performance and from the actors themselves. This freed actors to engage with space in a more naturalistic and less technically cluttered or constructed manner.
By using radio microphones on each character in a scene and recording, at a distance, multiple voices simultaneously as independent tracks with minimal mixing in scenes where very often many characters talk over the top of each other, Brophy argues Altman was able to put onus, and a notion of a somewhat less passive involvement, back on the viewer to “sort out a scene’s significance without any readable cues.” (Brophy, 1987). Brophy argues that this use of multi-channel recording and radio microphones represents a change in cinema language from ‘focused’ to ‘unfocused’.
Brophy attests that Altman’s post-production choice of not focusing the sound by artificially mixing a specific ‘voice’ to be overtly more dominant than the others presents a soundscape in its totality that could be argued to be the aural equivalent of a visual master-shot. The viewer is then forced to sift and ‘edit’ for themselves through the sounds to pick up what is most significant and correlate that to the images presented, which, according to Brophy, most often begin in Altman films with close-ups and slow retractive zooms. In other words, finely focused images amid a scape of unfocused sound. Brophy argues this is somewhat the antithesis of traditional film/sound technique, where the sound is more often the first to focus, accompanied by a more total visual picture; an establishing or master-shot.
The aesthetic changes heralded by this approach, which date back to the eighties, can be seen as very much akin to the aesthetics of spatial, unfocused sound in totality discussed above in regard to 3D game environments. Where Altman sought to create a free-form naturalistic visual and aural space by removing the overt and constructed technicalities of production from the scene of performance, he was still greatly restricted from fully realising this goal because of the lack of spatial perception available to the audience.
The opening scene from The Player is an example of Altman’s technique at its best and yet despite its free-form use of space and sound in a single very long shot - in many ways the forerunner to the extraordinary single-take, entire film, Russian Ark (2002) - The Player is still a very traditional mise en scene film in regard to the audience perception of space. What’s important to the scene happens totally within the frame, the camera constantly moving to bring wholly into frame crucial visual elements for narrative and character. The space inside the mise en scene may be free moving and more open than traditional framing but there is very little sense of important space beyond the frame. The camera almost never reverses its spatial perspective through movement nor uses sound in audible specific placements through surround sound to immerse the viewer. In this sense Altman’s work may be seen as a first step towards a sense of the Macro-mise en scene in contemporary cinema but one that still required further development of the 3D gaming aesthetics and widespread take-up of surround sound systems as the cinematic norm, to drive the notion further.
Gus Van Sant’s Elephant can again be seen as an example of a film that follows on from precedents set by films such as The Player in terms of a less-focused sound design with more onus on the viewer’s interpretative process, but goes further than Altman by constructing its multi-layered sound in 3D space.
Elephant, in its numerous school hallway scenes, introduces an enormous range of sounds (doors, laughter, squeaky shoes on tiled floors, rattling chains, flushing toilets, etc) that are meaningful for the viewer in both abstract and specific terms (not just background Walla) but which have no tangible connection to the framed mise en scene. They are part of the Macro-mise en scene, which is fundamentally a part of the viewer’s ‘frame’ of understanding in experiencing the film. They are aurally specific in location and each, as the film unfolds, holds a greater or lesser degree of specific meaning that can be ‘assembled’ by the viewer for a more complete understanding of the film.
3D - Illusion and Actuality.
Equal to the changes in sound in fuelling a substantial shift in our perception of cinematic space, though perhaps not as obvious (as well as being the principal driving element behind the theories of Bazin and his fellow cinema Realists), was a fundamental change in the manufacture of film stock.
Film manufacturers Kodak and Agfa in 1938 released faster (up to 120 ASA), large grain, film stocks (Cousins M, 2004. p175) that reacted more quickly to light. This allowed for cinematographers to close down the apertures of their lenses without reducing the overall light level. Subsequently filmmakers were able to obtain an unprecedented depth-of-field and have framed subjects, both foreground and background, simultaneously in focus. Cinema embracing this new, in-focus, space was called ‘deep-focus’.
Prior to this development of deep-focus mise en scene the arrangement and composition of the cinematic frame was a substantially two-dimensional, very flat, affair. From the viewer’s perspective, prior to these faster film stocks, there was a direct perception of height and width, distance across the viewing pane (the X and Y axis of the screen), between characters and objects, but very little visual construction or direct meaningful awareness of depth (along the Z axis).
The ability to close down the camera’s aperture and have a deeper field of focus gave cinema audiences this awareness of depth and directors the ability to be more theatrical in their staging by having specifically discernable action in both fore and background.
As with Brophy, mentioned above in regard to sound, Andre Bazin commented extensively on the idea of audience pro-activity in the otherwise perceived passivity of cinematic spectatorship. Specifically Bazin focused on the deep-focus cinema of Orson Welles’ Citizen Kane (1941). “Whereas the camera lens, classically, had focused successively on different parts of the scene, the camera of Orson Welles takes in with equal sharpness the whole field of vision contained simultaneously within the dramatic field.” (Bazin. 1971 Pg28.) According to Bazin (and many others since) this prompts the viewer to effectively perform their own internal, mental edit of the scene to draw mise en scene elements of significance into focus for themselves. The idea that the whole field of vision is simultaneously matched with a total dramatic field implies that the parameters of the dramatic field are contained within the visual and don’t exist significantly outside of the visual.
This assessment of deep-focus cinema as taking in the whole of a ‘dramatic field’ can be viewed as fundamentally flawed in the light of a new, immersive, understanding of cinematic space; for certainly in forms such as 3D games and certain filmic examples discussed previously, the dramatic field extends well beyond the field of vision.
In gaming in particular, with examples numerous in FPS games such as Half Life 2, a great deal of the dramatic field takes place outside of visual terms. Important events in Half Life 2 continually play out on the other side of walls, in inaccessible spaces or environmental locations that the player/viewer is aware of but may never encounter or enter visually.
In this context deep-focus cinema, as held so dear by Bazin, certainly changed and expanded our understanding of cinematic space but it was a 3D of Illusion rather than a 3D of Actuality. Likewise, as discussed earlier, Bordwell and Thompson’s definition of mise en scene space is preoccupied with space perception coming via ‘depth cues’. These prescribe a representation, outside of actuality, of a space with volume and depth rather than composition of the viewer’s perception of that space as a shared occupant. As such Bordwell and Thompson’s definition of cinematic space is invariably restricted to notions of foreground and background (1998. p138), a visual construct linked closely to the technical phenomenon of depth of field, and allows no room for a notion of viewer immersion and spatial ‘sharing’ with the film that is so much a part of contemporary, gaming influenced, media technology and aesthetics.
When, through surround sound, the viewer is placed spatially in the scene as an ‘occupant’ sharing the same auditory space as the film, ideas of space being restricted to visual foreground and background are obsolete as they cannot account for all the space the viewer is acutely and specifically aware of and which the Macro-mise en scene is now free to include without necessarily visually framing in parameters of foreground and background.
Bordwell and Thompson do discuss, to some extent, the role of off-screen space and make specific reference to William Wyler’s film Jezebel (1938), where a character waves to friends off-screen and then the hand of one of those people enters the screen with a wine glass and the camera shifts to re-frame. (1998 p167)
But this perception of space is still very much in line with traditional film grammar that insists on important events being brought into frame to be relevant; that the ‘film’ only exists inside the mise en scene frame. Reference may be made to off-screen space but only as a means of bringing elements into the mise en scene that previously did not exist; the camera is subsequently forced to shift and re-frame to maintain its totality of vision.
Gaming Co-existence: Long-Take and Montage.
The two grand pillar-like extents of cinematic thinking invariably break down to the divergent ideas of cinema’s role as an artform expounded by Sergei Eisenstein on one side, in the form of Montage (i.e. the power of the cut as key cinematic tenet), and Andre Bazin on the other, in the realism of the Long-Take (i.e. the power of the un-interrupted shot).
These perspectives, Formalism in the former and Realism in the latter, can both be seen as tangible underpinnings of computer gaming design and practice, and yet neither analytical tool gives us a complete picture with which to understand this new sense of the Macro-mise en scene as a compositional paradigm.
The FPS genre of gaming as a whole, on the surface, seems to embody Bazinian thinking at its most pure in that the immersive, first-person perspective of the player/viewer’s mise en scene is able to naturally and organically present a continuous ‘long-take’ cinematic style of spatial realism free of the disjoined, constructed nature of montage. And yet this is certainly not to say that FPS genre gaming is built around a purely Realist aesthetic or that key ideas of Montage and formalism do not play a significant role.
A good example of fused Realist and Formalist aesthetics in gaming can be seen in games such as Star Wars Jedi Knight: Jedi Academy (2003). Much like any other FPS genre game, the narrative of Jedi Academy revolves around the player/viewer taking on an avatar (in this case a Jedi Knight from the canon of original Star Wars (1977, 1980, 1983) films).
From a Bazinian, Realist perspective a large proportion of the game plays out on a real-time basis in long continuous takes. Moreover, added to the cinematic notion of the long take is its partner, deep-focus; together the two are the cornerstone of much of the cinema Realist theory expounded by Bazin. Indeed, ideas of depth of field and deep-focus are more purely at home in 3D graphics than anywhere else, as depth of field is a construct that belongs solely to camera lenses. In a camera-less scene made up of computer generated, coloured pixels, there is no such thing as depth of field; focus is infinite and everything, no matter how close or far away from the framed perspective, is always in focus.
Potentially, at least in the short term, this complete lack of depth of field is problematic for our established cultural knowledge of film grammar. Viewership of cinema is very attached to the notion of a forced depth of field, as it has been central to our visual understanding of the captured photographic image since the invention of the still image camera. So much so that it is common practice to ‘fake’ or impose depth of field effects on 3D animated films in order to engage both with an established notion of what filmic cinema looks like and to invoke a heightened sense of realism, with the sub-conscious notion that the animated events have been captured on film as live events rather than constructed through the purely technical process of cel or keyframe-based animation.
This extends into gaming, as the implication is that if depth of field as a visual tenet is removed from gaming 3D animation we lose a connection with the gaming visuals (which at least for now aren’t photo-realistic) as live action elements and subsequently undermine their visual truth.
That said, Jedi Knight is arguably far more ‘cinematic’ than it is ‘realist’ in that it employs a wide variety of cinematic devices to deliberately break the continuity of the long-take, immersive, first-person perspective. These breaks are known in gaming terms as ‘cut scenes’. When the player/viewer completes a set task or game level, the continuity of proactively driven action is broken by a shift in perspective from first to third person. Here a ‘cut-scene’ plays out where the player/viewer is no longer in control of the avatar and indeed no longer viewing the mise en scene through the eyes of that character. Not only does this deliberate shift invoke Montage ideas of shifting and cutting camera angles but very often the virtual camera will move away from the human eye-line to shoot the cut-scene from non-human low and high angles in an instantly recognisable invocation of Hollywood cinema.
Taking this idea further, Jedi Knight (along with other games such as Max Payne (2001)) continually breaks from notions of real-time and first-person throughout the proactive play of the game. In Jedi Knight, when a light-sabre wielding enemy is defeated, the final killing blow, delivered on the part of the player/viewer’s avatar, is shown from a sweeping third person camera perspective and simultaneously in slow-motion. This is a purely cinematic and formalist imposed perspective; one that Eisenstein would argue shows cinema’s status as art because of its divergence from perceivable ‘reality’. Likewise Max Payne actually makes these un-realistic cinematic impositions a central part of the game. The player/viewer is able to deliberately enter into slow motion, sweeping camera, diving actions (viewed from the third-person and immediately recognisable as stock standard Hollywood moments) as a tactic for winning the game and defeating game enemies.
At the other end of the spectrum, the hugely successful Half Life 2 (2004) presents a game that would appear to represent a Realist perspective of cinematic viewership in its purest form. Half Life 2 is an FPS game much like any other but with one distinct difference; it is the first and (at the time of writing) only FPS game to present the entire game experience, from the moment of starting until game-over, in player controllable first-person. Apart from a simple still graphic shown whilst the game is loading data, Half Life 2 has no cut-scenes of any kind; its game narrative is presented entirely in first-person, real-time. There is only one shift in time, which moves the story forward one week, but this is not done via a cinematic cut but rather as a part of the game’s narrative; the player/viewer enters a machine that shifts them conveniently forward in time. The effect of this is the player/viewer’s acceptance of the time shift as truthful to the game narrative rather than cinematically imposed.
All this would imply that, if it can be argued FPS gaming holds at its core an ideal of immersive realism, then Half Life 2 is a game that confirms Bazinian ideas of long-take mise en scene as the best embodiment of cinematic realism. And yet Bazin’s ideas of the mise en scene; “The camera cannot see everything at once, but it makes sure not to lose any part of what it chooses to see.” (1972) do not allow for the new aesthetics of a mise en scene that doesn’t need to see everything.
As an example, the prologue of Half Life 2 is presented solely through the eyes of the character/avatar being controlled by the player/viewer, so from the very outset of the game the player is purely immersed proactively in this character’s perspective; no cuts, just a single long take. Another character, G-Man, has ‘you’ trapped, drugged and strapped to a chair. But, out of kilter with Bazin and most commonly held notions of the totality of the mise en scene, the player/viewer is in tangible (although limited) control during this prologue and so is able to look away and re-frame the mise en scene beyond what would otherwise be the focus and contents of the frame. This prologue is composed as a Macro-mise en scene where numerous elements exist, the character of the G-Man not least amongst them, in a space shared by the player/viewer. The ‘camera’ in this context isn’t required to frame the G-Man to have him be part of the mise en scene. The space is composed without necessarily requiring direct reference to the visual contents of the frame. With the gaming aesthetic the ‘camera’ doesn’t need to see everything or even try to see everything, nor is it required to see everything that is critical.
Again Van Sant’s Elephant can be seen as a substantial example of a film embracing this new aesthetic of Macro-mise en scene composition and a disjointed relationship between visual frame and aural frame.
The opening scene of Elephant is a lengthy, time-lapse shot of a cloudy sky rolling back, with the only solid features a phone pole and powerlines. The shot is obviously not in natural real-time, the clouds and light moving too swiftly; not so fast that the scene is overtly false but enough to invoke a sense of hyper-reality. The soundscape, however, is in real-time, being that of a substantial group of school students playing some form of field game on the ground on which, presumably, the camera also stands, looking up. As well as being in aural real-time, the sound of the students has elements of spatial placement that give the impression the game is happening around the upward view of the camera’s static frame, and subsequently around the viewer, who shares the same auditory space as the film.
There is some expectation that at any moment in this scene the camera will cut, or at least pan/tilt, to take in the action of the characters we can obviously hear as part of the scene off-screen, our long established cinematic grammar implying that what’s important shall be framed. But the camera never does move; Van Sant sets up from the outset a visual and aural style that implies that the visual frame in Elephant will never be complete, that it will only be a small window on to a bigger, more complete environment. Van Sant sets up a premise, one that is arguably very game-like, where for the viewer the image frame and the spatial/aural frame may not always match up.
This notion could have been established with a shot of the sky in real-time as opposed to the chosen time-lapse, but by opting for an obviously unrealistic image Van Sant pushes the notion of image/sound displacement just a small step further away from a purely Realist perspective, embracing Formalist notions that allow for a more cinematic experience while attempting to let go of none of its audience-accepted realism.
In other words, with this opening scene, Van Sant composes for the audience a conscious and specific Macro-mise en scene and visually frames one small aspect of it without losing a sense of the whole.
This ideal is carried throughout the film. The key example is the one cited earlier concerning the student meeting and the divorcement of image-frame and sound-frame, but it is also present in more subtle techniques. One of the few adult characters that Elephant allows into the frame is John’s father, who is shown in the scene immediately after the opening prologue, drunk and attempting to drive his son home. This scene, shot largely from the bonnet of the car looking in through the windscreen, never frames John and his father equally or in a compositionally balanced way. Traditional cinematography would have the camera placed at the centre of the car bonnet with the two characters equally framed at either side of the mise en scene to create a complete and total image. For the exchange of dialogue between the characters, cuts would have been made to frame each character wholly and individually. Neither of these things occurs. Instead the camera captures the scene from the corner of the car bonnet and, in an imbalanced manner, has John’s father occupy most of the space and John mostly cut off by the frame’s edge. This framing is most interesting because the audience is aware through numerous clues (not the least of which being that this scene opens with the title slide of ‘John’) that the character central to the story is not the father. Van Sant frames John almost entirely out of the mise en scene frame and yet it is John that is the focus, John that does most of the speaking and, empathetically, it is John the viewer is most concerned about. John is the central element of the audience’s Macro-mise en scene space and the localised visual framing is free to explore other elements and is ‘allowed’ to present an incomplete visual frame.
With Elephant Van Sant often appears to deliberately avoid usual and accepted mise en scene practice both visually and aurally. An obvious but effective example is the scene where Alex and Eric (the gunmen of the latter part of the film) are in their room after school and Alex plays Beethoven on the piano. This scene begins with a focus on Alex playing the piano in a lengthy shot that is mostly static. Visually the mise en scene centres on Alex and aurally on the diegetic music. As the scene goes on Eric enters and the camera explores the room without cutting, circling about and taking in all the elements of the room, finally zooming in on Eric on the bed playing computer games. Throughout this movement the sound of the piano is never ‘mixed down’ as cinematic grammar would usually dictate for any element that is no longer the focus of the scene. As with the earlier example of the car horn in Psycho, the sound would usually be representational rather than actual: once its representation is established, its continual presence as a dominant element of the mise en scene is no longer necessary. In this bedroom scene the piano, the music, both characters and the entire contents of the room form the Macro-mise en scene as vital and integral visual and aural elements, none of which are diminished or reduced in importance because they do not occupy the frame at any given time.
4. Case-Study. Psycho:
Mono versus Surround – Hitchcock and the Van Sant re-make.
The decision by Gus Van Sant to do a remake of the seminal Hitchcock thriller Psycho (Van Sant, 1998; Hitchcock, 1960) was, on its own, a staggeringly brazen thing to do. To remake this film, regarded for such a long time by so many as ‘perfect’, certainly drew a great deal of criticism. Mark Carpenter observed that “Not the least of the arrows it endured was the remarkable derision it received from the mainstream press” and that it is Van Sant’s “most eccentric and reviled (and very nearly forgotten) project”. (2004)
But Psycho (1998) was not just a remake; to go a huge step further and, as Van Sant did, produce a shot-by-shot replication of the original was widely regarded as outrageous, or at the very least as begging the obvious question of Why…?
Van Sant’s choice to do this may, or may not, have artistic or cultural merit depending on the polarised opinions of reviewers, filmmakers and academics alike, but having two, for all intents and purposes, identical versions of the same well crafted film gives a superb opportunity for cross examination. Apart from the inclusion of just a handful of new (and very short duration) shots by Van Sant, there are just three distinct differences between the original and the remake: Colour (the impact of which we’ll leave to other scholars), performances, which inevitably introduce their own nuances, and Surround Sound. In this context we have the perfect opportunity to examine the possible impact of surround sound on mise en scene framing and the notion of a Macro-mise en scene. In other words, we get to ask the question; ‘Would Hitchcock have shot Psycho differently if he’d had surround sound technology available…?’
The original Psycho was a film produced with an entirely mono soundtrack having just a single sound channel that was placed front and centre behind the screen. The remake by Van Sant was released both in theatrical surround sound for the cinema and as Dolby 5.1 for DVD. This seemingly simple shift allows for a great many creative and aesthetic differences to occur; either by deliberate intention or natural evolution of the process of mixing and arranging sound.
The first and most obvious example of surround sound in Van Sant’s Psycho comes via the musical score. Musical motifs and elements in the remake are reproduced fairly evenly across all channels in the surround field. This is fairly common practice in contemporary cinema sound design as it immediately and more succinctly invokes the non-diegetic properties of a musical score, which doesn’t exist in the same auditory world as the film space. By giving the music the non-specific spatial placement of all channels equally, the music is placed distinctly beyond cinema space.
This multi-channel reproduction renders the music immediately distinguishable from other sound elements, not by volume mix level or by its character but by its lack of spatial specificity. This makes the nature of the music score noticeably different from the original Hitchcock version of Psycho. In the original the music shares the same single channel as all other sounds (dialogue, foley, etc) and so the music, whilst still non-diegetic by definition, is perceived by the viewer as less removed from other sounds in the mix, less identifiable as a distinct element. When the music is placed into surround sound, subsequently reproduced everywhere (and so perceivably ‘nowhere’), it can be seen to act as a commentary on the film rather than, via mono in Hitchcock’s version, an intrinsic part of the scene.
Surround sound is most often, obviously, touted for its ability to be spatially specific in the reproduction of aural elements. What is often ignored or unnoticed in our contemporary understanding of surround sound is this ability to place sounds beyond spatial recognition and indeed remove them from space altogether via this technique above of emanating a sound, most often music, from all channels in equal balance.
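The distinction between a sound placed ‘somewhere’ and a sound placed ‘nowhere’ can be illustrated, in a loose way, as a set of per-channel gains. The following Python sketch is offered purely as an illustration and does not describe how either version of Psycho was actually mixed; the speaker angles, the function names (nowhere_gains, somewhere_gains) and the use of a constant-power pan between adjacent speakers are all assumptions introduced here.

```python
import math

# Hypothetical five-speaker layout: angle in degrees, 0 = centre of the screen.
# These angles are an assumption for illustration, not taken from any film's mix.
SPEAKERS = {"L": -30.0, "C": 0.0, "R": 30.0, "Ls": -110.0, "Rs": 110.0}


def nowhere_gains():
    """Equal power in every channel: the sound carries no directional cue."""
    gain = 1.0 / math.sqrt(len(SPEAKERS))
    return {name: gain for name in SPEAKERS}


def somewhere_gains(angle):
    """Constant-power pan between the two speakers either side of `angle`."""
    ring = sorted(SPEAKERS.items(), key=lambda item: item[1])  # ordered by angle
    gains = {name: 0.0 for name in SPEAKERS}
    for i in range(len(ring)):
        name_a, angle_a = ring[i]
        name_b, angle_b = ring[(i + 1) % len(ring)]
        span = (angle_b - angle_a) % 360.0    # arc between the adjacent pair
        offset = (angle - angle_a) % 360.0    # position of the sound along that arc
        if offset <= span:
            frac = offset / span
            gains[name_a] = math.cos(frac * math.pi / 2)
            gains[name_b] = math.sin(frac * math.pi / 2)
            break
    return gains
```

Under these assumed values a sound panned to 180 degrees (directly behind the viewer) is carried only by the two rear channels, whereas the ‘nowhere’ treatment gives every channel the same gain and so leaves the listener with no direction to attach the sound to.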
Central to all cinema production, both in the specifics of individual films and on the bigger scale of the canon of cinematic genres, is the ‘contract’ established with the audience in regard to the parameters the film will work within. This is most obvious in the case of narrative and genre forms such as Science Fiction where, in order for the audience to ‘buy into’ the otherwise infeasible elements of the story (aliens, flying saucers, time travel, ghosts, etc), certain paradigm truths must be established early on from within which the film’s narrative will operate.
These kinds of ‘contracts’ also exist on a production level as techno-aesthetic parameters. Star Wars: A New Hope (1977) opens with a large space craft rumbling overhead, engaging rear left/right channels and low-frequency (sub-woofer or LFE) effects and so sets up, from the very outset of the film, a contract with the audience to both accept and expect the spatial realism of multi-channel sound for the duration of the film.
In a film such as Psycho (Van Sant) this same notion of a techno-aesthetic contract is likewise established early with the placement of spatially specific dialogue and foley sounds in discrete channels relevant to the framed scene. With the audience then accepting that there is a degree of spatial truth about the film, any music element then brought aurally into the mise en scene via a particular ‘direction’ may be perceived as diegetic rather than the usual and accepted non-diegetic nature of a musical score. Without any changes to the script, performance, camera shot, movement or even audio mix level, this immediately sets Van Sant’s Psycho apart from the original because the ‘contract’ with the audience is fundamentally different.
As a result, the musical elements which, regardless of mono or surround sound, require a clear distinction from other sounds in a film, must be placed outside of diegetic acceptance. The only way to do this with surround sound, where the audience accepts and expects all sounds to come from ‘somewhere’, is to place the music ‘nowhere’.
One of the most interesting specific sound design choices made by Hitchcock was in relation to the ‘voices’ that play in the minds of his characters, Marion and Norman. For Marion this is most notable as she is driving towards California, before arriving at the Bates Motel, having stolen the money, when the voices of people she knows who are looking for her can be heard debating her motives and what they will do to find her.
Even in mono this sound is an interesting aural hybrid in that it fits with what Gorbman has referred to as ‘Meta-Diegetic’ sound (Gorbman 1976, cited in Milicevic 1994). To the viewer the voices are non-diegetic in that they have no visible or real-world source in the scene (much like a voice-over) and yet, to the character (Marion) the voices are distinctly diegetic to events that either have or will take place in her reality. Going further with Hitchcock’s play on the psychology of the characters and their respective guilt, the voices could equally be ‘real’ voices of events that ‘have’ happened for the viewer (an audio flashback); or they could be imagined voices inside the head of the character that are not connected to any reality.
In the Van Sant Psycho this scene with Marion in the car contains the first non-musical and not directly diegetic sound to obviously engage with the surround-sound spatial field. The voices are heard to emanate from, and pan between, all five channels much like musical elements. The effect of this is to expand the viewer’s perception of these sounds as hyper-real or ‘unnatural’ sounds operating in a ‘nowhere’ space much as the score does.
Much of Psycho hinges on the Aristotelian notion of the reversal of fortune (in other words the ‘big twist’) and so it isn’t until the film’s conclusion that we realise that the voice of Norman’s Mother is a meta-diegetic sound similar to that which we aurally observe with Marion. Until the twist is revealed we assume that Mother is speaking; only afterwards do we realise that her voice belonged to Norman’s imagination. This shift from the ‘reality’ of spatial placement of sound to the ‘unreality’ of a sound coming from all channels at once can be best seen in the scene where Marion listens through the window to Norman arguing with his Mother. Rather than spatially place the sound beyond the window from where Marion is presumably hearing it, Mother’s voice is heard through all five surround channels, whereas Norman’s, in response to his ‘Mother’s voice’, is heard in a more spatially specific form.
It is here that the shift to surround sound has potential impact on the unveiling of narrative in Psycho. In mono the sound of Marion’s meta-diegetic voices is aurally identical to any other voices and sounds in the film. Likewise there is nothing in Norman’s voice and the voice of his mother to aurally set them apart as distinct, certainly nothing to imply that one voice is real and the other not. However, in the surround sound version, the choice to make these voices emanate from all channels, thus removing them from a spatial reality, is an immediately recognisable clue for the viewer that these voices are not ‘real’. It could be argued that this choice is too much of a ‘give away’ for the viewer and that it potentially signifies the ‘unreality’ of the Mother’s voice too early in the narrative, potentially spoiling the reveal of the film’s ending.
Hitchcock’s Psycho is a film that has a great deal of purity in its visual framing and mise en scene focus. As is the hallmark of Hitchcock’s work there are no wasted compositional elements. Subsequently the soundtrack in mono, emanating from directly behind the screen, lends itself to this tight integration of aural and visual mise en scene where the confines of the frame are presented as a total entity, a complete picture, aurally and visually.
Arguably the cinematography and mise en scene technique in Hitchcock’s Psycho are influenced by the limitations of mono sound reproduction, presenting a film that has little awareness of three-dimensional space. Indeed Psycho doesn’t even engage to any large degree with deep-focus composition. Much of Psycho is composed in very precise, two-dimensional framings that seem to focus on a visual and compositional purity.
This approach, part aesthetic choice, part technological restriction (an approach not by any means unique to Psycho, but common to the vast majority of cinema up to this period) has a distinct effect on the placement and role of the viewer in regard to the drama playing out on screen. In reviewing the work of James Lastra, Sarah Kozloff has commented that there is a long prevailing tradition of sound perceived “less anthropomorphically and as more analogous to writing.” (Kozloff, S. 2002). In other words, the film (and in particular the sound) makes no pretence that the viewer is part of the scene or sharing the same sensory or auditory space as the space of the film. The viewer is nothing other than a removed observer, divorced from the film’s reality.
As an example, events such as the downpour of rain as Marion first arrives at the Bates Motel are flatly composed and produced, the viewer removed from the cinema space. The rain does not fall on the viewer; rather they see through the mise en scene’s portal to a moving composition of rain falling on a character. As Kozloff says, this is akin to the removed and remote experience of reading. Psycho seems to follow a premise of not placing the viewer ‘in’ the scene where they hear what the characters hear but rather providing a series of windows through which they see and hear specific, voyeuristic and filtered elements of what the characters see and hear.
Interestingly Van Sant’s remake of Psycho, despite the fact that surround sound now allows for a different relationship between the viewer and the spatial world of the film, very often chooses not to utilise the technology and subsequently maintains the removed position of the viewer. This can be seen as a choice based on an older aesthetic rather than a more contemporary one.
The above example of the rain scene illustrates this where, if the film were a newly produced original work, there would be little hesitation on the part of the sound designer to reproduce the sound of the rain falling heavily from all speakers in order to situate the viewer in the rain as Marion is in the rain, to share the same space. Even though this option is open to Van Sant he chooses not to engage with space in this way; there is no sense of the Macro-mise en scene that surround sound facilitates so readily.
There are two broad possibilities for this choice: either Van Sant is choosing to remain true to his self-imposed edict by going sound-for-sound as well as shot-for-shot, or, by going shot-for-shot visually, he found that the sound design was dictated, and even restricted, by Hitchcock’s mise en scene. In other words, Hitchcock’s purist cinematography, presenting a totality of the frame, didn’t allow or accommodate a shift in the viewer’s concept of space.
An example of this visual confinement of sound and space by the cinematography can be seen in the famous shower scene. Here Hitchcock uses a series of jump-cuts that shift the framed perspective (often 180 degrees) from looking in at Marion in the shower to looking out through the curtain, via a series of close-ups of the shower head and Marion’s face. The sound design for this scene (as with most of the original film) is quite complex with a precisely balanced arrangement and mix of sounds – water gurgling, splashing, dripping, musical elements etc. Mono sound dictates that sounds emanate from front and centre to the image so Hitchcock’s use of quickly moving cuts, that continually reverse the position and change the angle of the mise en scene, by default ensures that the single channel mono sound doesn’t appear spatially false. The cutting allows the sounds to naturally always occur front and centre to the frame.
If it could be imagined that this scene were redone using less hard cutting and perspective shifting and more fluid pans, tilts, pedestals and handheld camera, then the scene would lend itself readily to a spatial composition and arrangement of sound relative to the viewer as an occupant of the scene. However, since Van Sant makes the decision to capture the scene shot-for-shot as Hitchcock had, there is no room to embrace the new opportunities of audio spatiality. The camera doesn’t allow it.
This decision by Van Sant not to utilise the spatial possibilities of the surround sound field becomes most noticeable at the end of this shower scene with the famous matching transition from the spiralling water down the drain hole to the tracking back and twisting close-up of Marion’s eye. Once pulled away from the eye the camera turns 180 degrees to dolly out of the bathroom but the sound of the shower, which is still running and now located spatially behind the viewer/camera, does not shift to emanate from the rear speakers in the five channel array as one might expect in a contemporary film. The sound of the shower remains in the front speakers, spatially incorrect, and the film instead uses the traditional method of creating a faux spatial arrangement by mixing down the sound of the shower and adding a small amount of artificial reverb to invoke a sense of distance.
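The two treatments of the shower sound contrasted here can be set side by side as simple mix settings. The Python sketch below is illustrative only: the attenuation figure, the reverb amount, the channel names and the function names are all hypothetical and are not drawn from the actual mix of Van Sant’s film.

```python
import math


def db_to_gain(db):
    """Convert a level change in decibels to a linear gain factor."""
    return 10 ** (db / 20.0)


def representational_mix(attenuation_db=-12.0, reverb_send=0.2):
    """Traditional approach: the sound stays in the front channels and
    distance is only implied, by lowering its level and adding reverb.
    Both default values are assumed figures chosen for illustration."""
    dry = db_to_gain(attenuation_db)
    return {"L": dry, "C": dry, "R": dry, "Ls": 0.0, "Rs": 0.0,
            "reverb_send": reverb_send}


def spatial_mix():
    """Spatially 'actual' approach: once the camera has turned away,
    the sound is carried by the rear channels, behind the viewer."""
    rear = 1.0 / math.sqrt(2.0)  # equal power shared by the two rear speakers
    return {"L": 0.0, "C": 0.0, "R": 0.0, "Ls": rear, "Rs": rear,
            "reverb_send": 0.0}
```

In the first function nothing ever reaches the rear channels, so the shower can only be a representation of distance; in the second the front channels fall silent and the sound occupies the position it would hold in the physical space of the scene.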
Here was potentially an easy option for employing spatially correct sound that would not have conflicted with the shot-for-shot mise en scene composition. However it could perhaps be argued that for Van Sant to use surround sound now, even in one of the rare shots that allows it, would be too obvious in a film that has, up until this point, created a position for the viewer as removed from the world of the film, as a remote voyeur rather than an immersive observer.
Beyond the role of the camera and sound in constructing a sense (or lack thereof) of cinematic space we can also examine the changing process and nature of designing, building and constructing a cinematic scene in real and physical terms; i.e. the Set.
In making Psycho Hitchcock built a ‘complete’ (using the term loosely) film set for the Bates Motel consisting of the motel building of individual cabins and the infamous House where Norman and his ‘Mother’ live, along with a swamp-like pond where Norman dumps the bodies of his victims and a driveway/car-park where many of the characters (such as Marion and Arbogast) arrive at the Motel.
These four set elements provide the bulk of the film’s locales and in both versions of Psycho there is an implied close proximity between these. This is particularly so in the case of the house looking down on, as if surveying, the Motel from the hill as a metaphor for Norman’s Mother’s perceived invasive watchful eye.
Using the arguments and examples from earlier in this essay this may be seen as a similar practice to 3D game creation; the design of a macro-space into which the camera and action will be set, not dissimilar to Elephant in terms of the school being the composed macro-space.
Certainly it would have been feasible, perhaps even desirable, to shoot Psycho with the tenets and notions of the Macro-mise en scene in order to establish the Motel and its environs in their entirety as a tangible entity in the film, much as the school building is in Elephant. However, since Psycho was shot for mono sound with a unidirectional, single-source field, and subsequently the camera framing follows a very square, two-dimensional mise en scene with a clear totality of framing, there is, as a result, little to no sense of space in Psycho (either the original or the remake).
The spatial arrangement of the scene’s elements (motel, house, pond, car-park) in Psycho creates a spatiality (as before with sound) of ‘representation’ rather than ‘actuality’. The image of the House is a representation of a house on the hill that is at a distance and on higher ground, rather than a spatial and acoustic actuality of the house and its location. Likewise when Norman puts the car with the body inside into the pond there is nothing to tell the viewer what direction the pond is from the Motel or how far. Similarly, when Sam stands in the car-park calling out for Arbogast and the image cuts to Norman standing by the pond, there is a change in the acoustic of Sam’s voice, what Rick Altman describes as “an impression of auditory perspective… created by a change in volume and reverberation levels.” (1992. p60). This is standard practice for implying acoustic and physical distance but it is, as before, a representation or ‘impression’ of sound rather than an acoustic actuality. For the viewer in this scene (as with much of Psycho) the connection with the sound of Sam’s voice is not in the actuality of the viewer hearing it but in the context of the representation of another character hearing it. Altman refers to this specific tendency in film sound, which he argues has been dominant since the 1930s, as being “asked not to hear but to identify with someone who will hear for us. Instead of giving us freedom to move about the film’s space at will, this technique locates us in a very specific place – the body of a character who hears for us.” (1992. p60). What this indicates in the context of viewership is that there is an established tradition in cinema prior to surround sound and FPS computer gaming (and arguably immersive media generally) that places the viewer in a position that looks in on the cinematic space but does not aurally or spatially share it.
As a result of this abstract, representational arrangement of sound and physicality, Psycho (both versions) retains the old mise en scene grammar of the frame as cinematic totality. As the space around the Motel has no actuality for the viewer, they are not placed into the shared aural space of the film, there is no expectation that anything of importance to the film will occur outside the frame, or even that the film exists outside the frame. Here we perhaps see the symbiotic relationship between surround sound and cinematic technique. Despite the addition of a surround sound field of viewer aural immersion, the cinematography of Psycho (originally composed and designed with mono sound reproduction in mind and subsequently emulated precisely from the original into the remake) prevents the Van Sant version from engaging the notions of a Macro-mise en scene. The cinematography and soundscape present a mise en scene of aural and visual representation rather than one of spatial and acoustic actuality.
Second guessing Hitchcock.
Cinema is an art form driven as much by the evolution of the technology of its production as by cultural and aesthetic influences. In this regard there is room for speculation not just on what is possible now and into the future for the creation of cinema based on current technologies (such as surround sound and 3D virtual cameras) but also for conjecture on past works of cinema as to how they might have been created differently if their makers had had access to contemporary tools. A process such as this allows for examination of what parameters (technical and creative) dictated the production of a past work, influencing its style and form in a manner that can only be considered in hindsight.
In the specific context of what has been argued above, speculating on how Hitchcock himself might have created Psycho differently if he’d had access to surround sound production and reproduction systems, perhaps allows for a more tangible comprehension of the shift towards this idea of the Macro-Mise en scene.
As this is very early stage research and theoretical thinking on a new cinematic compositional aesthetic, rather than attempting to offer a full and authoritative conclusion I will instead endeavour to propose a series of questions and possibilities for how surround sound may have influenced the cinematography and construction of Psycho.
The first and most distinct shift that surround sound brings about organically, virtually without any external directorial choice, is to reposition spectatorship in the context of aural and spatial immersion. Hitchcock’s Psycho presents a mise en scene that is very much window-like; a constructed/dictated premise exists whereby the viewer does not share the same auditory space as the characters on screen, nor do they share the same physical space, but rather are given a voyeuristic and limited perspective on a given set of scenes and circumstances.
A key example of this is that explored earlier in regard to the audible and visible rain when Marion arrives at the Bates Motel. Here the sound of rain in Hitchcock’s Psycho emanates from a single centre front mono speaker source and as a result the audience is placed outside the scene, out of the rain, not in the same space as the characters. By introducing surround sound and having rain through all channels, as would be common surround sound design practice, the viewer’s perception of their ‘role’ and ‘position’ in viewing the film is fundamentally altered. A new and quite different ‘contract’ is established between the viewer and the film and this could subsequently have an impact on every facet of Psycho.
Psycho is a film that draws heavily on montage principles for building tension by precise cutting of images into sequence, a notable case in point being the famous shower scene. No frame is wasted in Psycho and each frame unto itself presents an array of information for the viewer, fully utilising the mise en scene.
On one hand this can certainly be considered to be the hand of a great filmmaker who is astutely aware of the craft at hand in regard to the two cornerstones of cinematic production, mise en scene and montage: a filmmaker able to exploit for dramatic purposes the strengths of both aesthetic understandings.
On the other hand, this precise completeness of framing, where the camera depicts in totality both the visual and aural elements, deliberately and directly delivering them to the viewer inside a framed window space, might also be seen as a necessary aesthetic imposition driven by a restricted sound field, one that doesn’t allow the viewer to perceive directly any spatially tangible elements outside of the frame.
In the shower scene when we hear the knife we see the knife. When we hear the shower curtain get pulled back by Mother (Norman) we see the curtain get pulled back. Almost no action that takes place in Psycho happens beyond the frame or is drawn aurally into the frame from beyond it. Psycho is a film that is very contained. This may well be one of the first elements to change if Hitchcock had had the ability to spatially position sound beyond the screen.
The shower scene, currently built around quick montage cuts and 180 degree reversals of position (at times almost jump-cuts) from outside the shower looking in, to inside the shower looking out, may be prompted into a more fluid and spatially aware composition through sound. With the ability to place sounds with precise and perceptible location outside of the frame the need, or aesthetic desire, to depict each movement or sound generating event is perhaps undermined or diminished. These events and actions are free to take place with total audience awareness without the need for the camera to frame them visually.
In this same regard Psycho is a film that uses confined, enclosed or isolated spaces as a means to build and enhance tension but because of the restricted nature of space available to a film in monaural sound, where the camera is in effect hamstrung in its ability to depict spatial location and direction, it has to circumvent a realistic or spatially accurate construction in favour of a representational one.
Here too Hitchcock may have made different choices had surround sound been available to him. The bulk of Psycho takes place in a single location (the Bates Motel) drawing in just a small handful of locales consisting principally of the motel cabins, the reception office and the family house on the hill. A great deal of the tension of the film is built around the proximity of these three. Norman insists Marion take the cabin closest to the office where he says it will be better if she needs anything but where it becomes apparent that the decision is more about him being able to watch over her. Likewise the house on the hill watches over the Motel and more specifically the spectre of Norman’s mother watches over him and everything he does.
This layered structure of proximity, voyeurism and a notion of ‘watching over’ is deliberately built into the fabric of Psycho and yet there is virtually no real sense of spatial direction and distance in tangible terms for the audience. When we see the house on the hill it is only with a fairly abstract notion of its direction from the motel. Likewise when Marion hears Norman arguing with his Mother and looks up at the house from her room window there is little sense of the spatial location of this or of the house’s actual distance from the Motel room window. The sound of the arguing and indeed the image of the house itself are, as a result, rendered representational of these dramatic devices rather than arising out of a spatial, physical actuality.
Arguably then, the use of surround sound, and the more mobile, game-like, camera movements surround sound inspires that allow for the camera to explore space without being beholden to a totalitarian sense of framing, would have allowed Hitchcock to greatly enhance the claustrophobic, paranoid tension of Psycho. If, as has been argued above, surround sound can be said to primarily serve as a way to, a) shift the audience into a shared space with the characters and b) allow for a mise en scene composition that stretches beyond the visible frame, then certainly these tools, and the techniques they introduce, could potentially enhance the drama of Psycho on a psychological level. A ‘contract’ is created that makes the viewer consistently aware of the ‘watchful eyes’ outside of their visible frame but with the realism that comes with a spatial specificity of direct aural location.
None of this is to say in any sense that these choices, centred on new technologies, would make Psycho a better film. Rather these observations are about looking at how cinema is a product of the tools available at the time of its making and, as a result, the aesthetic principles and governing theories built around films of a particular time require a constant process of re-examination as technologies evolve.
There are precious few films that have gone some way towards truly exploiting or engaging with the notion of a Macro-mise en scene and the use of surround sound as a creative spatial construct influencing and changing camera technique and our accepted notions of visual framing. Sadly surround sound is still largely limited to explosions and large scale Hollywood blockbuster effects. It is fair to say that a great deal of the truly inventive and forward thinking use of spatial sound, and indeed the embracement of spatiality and the Macro-mise en scene in general as a compositional tool, is coming from 3D computer gaming.
Where computer games have long borrowed aesthetic, cultural and technical influences from popular cinema it is now also fair to say that gaming and gaming aesthetics are having a profound impact on cinema itself and our perceptions of cinema’s language and form. Statistics alone stand as a substantial and solid indicator of a current and future shift in our perceptions of what popular media is. Spider-Man 2 (2004) grossed $40.4 million in its opening weekend and was hailed as a huge popular cinema box-office success for its studio producers. In contrast the release of the FPS genre Xbox console game Halo 2 drew $125 million in its first weekend of sales, far eclipsing Hollywood’s best efforts.
If sheer popularity, public take-up and entertainment saturation can be taken as one substantial indicator, gaming and the cinematic Macro-mise en scene aesthetics it brings to mainstream media of all kinds may well be seen as the new dominant visual language and discourse. In this regard the notion of the Macro-mise en scene as a central element of game design, encompassing immersive aural and visual constructs, becomes, by proxy, a central hub of all future media compositional thinking.
—
List of references
Altman, R (ed). 1992. Sound theory Sound practice. Routledge. New York.
A-Soma, 2001. Mise en scene and the off-screen space.
Accessed 8 June 2005
<www.soma.org.uk/Essay%202%20%20Off%20Screen%20Space/essay2offscreens.html>
Bazin, A. 1967. What is cinema? Volume 1. pp 50 (footnote).
University of California Press. Berkeley.
Bazin, A. 1971 p28. What is cinema? Volume 2.
University of California Press. Berkeley.
Bazin, A. 1972. Orson Welles: A Critical View. Harper and Row. New York.
Bordwell D & Thompson K, 1998. Film Art: an introduction. McGraw-Hill. New York.
Bridgett, R. Off Screen Sound in Interactive Media. Viewed 10th July 2005
<http://web.archive.org/web/20030204113842/http://www.sound-design.org.uk/off.htm>
Brophy, P. 1987. California Split. Accessed 3rd March 2005
<http://media-arts.rmit.edu.au/Phil_Brophy/MMAlec/CaliforniaSplit.html>
Buckland W. 1998. Film studies. Hodder Headline. London
Carpenter, M. 2004. Rip in the curtain; Gus Van Sant’s Psycho.
Viewed 13th June 2005.
<http://www.horschamp.qc.ca/new_offscreen/van_psycho.html>
Chion, M. Silence in the loudspeakers. Viewed 10th March 2005.
<http://www.frameworkonline.com/40mc.htm>
Citizen Kane. 1941. RKO Pictures. Dir. Orson Welles.
Cook P. 1985. The cinema book. (first edition) BFI publishing. London.
Cousins M. 2004. The story of film. Pavilion Books. London.
Dark Side of the Moon. 1972. Pink Floyd. Capitol Records.
Doom3. 2004. iD software. Published by Activision.
Dykhoff, K. 2003. About the perception of sound.
Viewed 10th March 2005.
<http://www.draminst.se/start/inenglish/articles/ljudartikel/>
Eisenstein S. 1947. The film sense. Harcourt Brace. London.
Elephant, 2004. HBO Films. USA. Dir. Gus Van Sant.
Fantasia. 1940. Disney. Disney. Dir. James Algar, Samuel Armstrong.
Gorbman, C. 1976. Teaching the Soundtrack.
‘Quarterly review of film studies’. Vol. 446-452
Half Life 2. 2004. Valve Corporation. Published by Sierra.
Iversen, G. 2003. Dissolving views: style, memory and space.
Viewed 3rd of March 2005.
<http://www.hf.ntnu.no/estetisk_teknologi/output/DISSOLVE.pdf>
Jezebel. 1938. Warner Bros. Dir. William Wyler
Jurassic Park. 1993. Universal Studios. Dir. Steven Spielberg.
Khalili, N. 2003. Walter Benjamin Revisited: The Work of Cinema in the Age
of Digital (Re)production. Viewed 13th June 2005.
<http://www.horschamp.qc.ca/new_offscreen/new_media.html>
Kozloff, S. 2002. Sound Technology and the American Cinema: Perception,
Representation, Modernity by James Lastra. Book review.
Viewed 9th June 2005.
<www.findarticles.com/p/articles/mi_m1070/is_3_55/ai_85465113>
First published ‘Film Quarterly’, Spring 2002.
Last action hero, The. 1993. Sony Pictures. Dir. J. McTiernan.
Lee CC. 1999. Special Feature: Sound and Immersion. 3D Sound Surge.
Accessed 6 June 2005
<www.3dsoundsurge.com/interviews/soundandimmersion.html>
Lisztomania. 1975. Facets Video. Dir. K. Russell.
The Lord of the Rings: The Two Towers. 2002. New Line. Dir. Peter Jackson.
Max Payne. 2001. Remedy Games. Published by Godgames and
Take2Games.
Magnificent Ambersons, The. 1942. RKO Pictures. Dir. Orson Welles.
Menard D. 2003. Toward a synthesis of cinema – a theory of the long take
moving camera, part 1. www.offscreen.com Accessed 12th June 2005.
<www.horschamp.qc.ca/new_offscreen/synthesis_theory.html>
Milicevic, M. 1994. Film sound beyond reality. Accessed 22nd June 2005.
<www.filmsound.org/articles/beyond.htm>
Murch W. 2000. ‘Dense clarity, clear density’. Volume Bed of Sound exhibition.
PS1 Museum of Modern Art New York. Accessed 9th June 2005
<www.ps1.org/cut/volume/murch.html>
The Player. 1992. New Line Cinema. Dir. Robert Altman
Psycho, 1998. Universal Studios. Dir. Gus Van Sant.
Psycho, 1960. Universal Studios. Dir. Alfred Hitchcock.
Quake III. 1999. id Software. Published by Activision.
The Robe. 1953. 20th Century Fox. Dir. Henry Koster.
Russian Ark. 2002. Wellspring Media. Dir. Aleksandr Sokurov.
Shrek. 2001. Dreamworks SKG. Dir. Andrew Adamson, Vicky Jenson
Spider-Man 2. 2004. Columbia Pictures (Sony). Dir. Sam Raimi.
Star Wars Jedi Knight: Jedi Academy. 2003. Lucasarts/Raven.
Published by Activision.
Star Wars Episode 1. 1999. 20th Century Fox. Dir. George Lucas.
Star Wars. Episode IV: A new hope. 1977. 20th Century Fox.
Dir. George Lucas
Star Wars. Episode V: The empire strikes back. 1980. 20th Century Fox.
Dir. Irvin Kershner.
Star Wars. Episode VI: Return of the Jedi. 1983. 20th Century Fox.
Dir. Richard Marquand
Thom, R. 1999. Designing a movie for sound. Viewed 10th July 2005
<www.filmsound.org/articles/designing_for_sound.htm>
Totaro, D. 2004. Psycho redux. Viewed 13th June 2005.
<http://www.horschamp.qc.ca/new_offscreen/psycho_van.html>
Toy Story. 1995. Pixar Studios. Dir. John Lasseter
Truppin A. 1992. ‘And then there was sound’. Sound theory Sound practice.
Altman R (ed). Routledge. New York.
Yu, E. 2003. Sounds of cinema: what do we really hear?, Perspectives.
Viewed 9th June 2005
<www.findarticles.com/p/articles/mi_m0412/is_2_31/ai_107041434>.
First published Journal of Popular Film and Television, Summer, 2003.

