The Unheard Voice in the Sound Film
Justin Horton
Cinema Journal, Volume 52, Number 4, Summer 2013, pp. 3-24 (Article)
Published by University of Texas Press
DOI: 10.1353/cj.2013.0031
For additional information about this article: http://muse.jhu.edu/journals/cj/summary/v052/52.4.horton.html
Access provided by Trent University (17 Mar 2015 20:13 GMT)

  • www.cmstudies.org 52 | No. 4 | Summer 2013

    © 2013 by the University of Texas Press

    Justin Horton is a PhD candidate in the Program in Moving Image Studies in the Department of Communication at Georgia State University. His research interests include film realism, sound studies, and embodiment. He is currently at work on a dissertation on subjectivity, affect, and out-of-body experience. His work has appeared in Cinephile and previously in Cinema Journal, and he won the SCMS Student Writing Award in 2012.

    The Unheard Voice in the Sound Film

    by JUSTIN HORTON

    Abstract: Much has been written on the disembodied voice in film. This article seeks to address the inverse: speaking characters whose words go unheard. Often thought rare, this phenomenon is quite common, and to address this blind spot, I propose the neologism the voice-out and identify ten categories thereof. In so doing, I illuminate a number of understudied sound-image relations within the cinema.

    Someone who sees without hearing is much more uneasy than someone who hears without seeing.

    Georg Simmel1

    There is only one element of the cinema . . . that remains constrained to perpetual clarity and stability, and that is dialogue. We seem to have to understand each and every word, from beginning to end, and not one word had better be skipped. Why? What would it matter if we lost three words of what the hero says? Yet this has remained almost taboo in films. We are only beginning to know; for, as we shall see, in sound film, there is a lot riding on these three lost words.

    Michel Chion2

    One of the most discussed film endings in recent memory occurs in Sofia Coppola's Lost in Translation (2003). In it, aging American actor Bob Harris (Bill Murray) abruptly exits his taxi and treks along the Tokyo streets in search of Charlotte (Scarlett Johansson), the young American woman with whom he has entered into a staid May-December romance, one unconsummated and unacknowledged by either of them. He finds her, embraces her, and whispers into

    1 Simmel, writing in 1911, suggests that public transportation brought about this curious condition, for it was under these circumstances that people had to look at one another for long minutes or even hours without speaking to one another. Georg Simmel, Mélanges de philosophie rélativiste: Contribution à la culture philosophique, trans. Alix Guillan (Paris: Felix Alcan, 1912), 26-27, quoted in Walter Benjamin, "On Some Motifs in Baudelaire," in The Writer of Modern Life: Essays on Charles Baudelaire (Cambridge, MA: Harvard University Press, 2006), 207.

    2 Michel Chion, Audio-Vision: Sound on Screen, trans. Claudia Gorbman (New York: Columbia University Press, 1994), 170.

    her ear. And we in the auditorium hear not one damn word of it. The pair exchange a brief kiss and a longing glance before walking away in opposite directions. Fin. In a much-praised film singled out for its performances, its cinematography, and its subtlety, it is this ending that has drawn the most commentary. As one critic professes, "I have never seen anything quite like it, in any movie."3

    More than a century after the invention of cinema and within a medium in which, to borrow a cliché, there is nothing new under the sun, why does this ending register as so startlingly different? Surely, it can't be the lack of definite resolution (will they or won't they?), for decades of art cinema ambiguity have conditioned us to accept and even to revel in a lack of narrative closure.4 I submit that in the case of Lost in Translation what is most jarring, what elicits such strong reaction pro and contra, is that we aren't permitted to hear Murray's lines. This film's climax, then, testifies to the fact that few cinematic conventions are as doggedly adhered to as the insistence on clarity of speech. Audiences, it seems, can tolerate fractured chronology, ellipses, or narrative irresolution, but the withholding or occlusion of the voice is experienced as an unusual and unsettling disruption. With this observation in mind, I contend that sound films, despite the still-prevalent bias toward the visual among spectators, critics, and scholars, hinge more often than not on the voice. This article thus seeks to consider the circumstances in which characters are seen speaking but nonetheless go unheard partially or completely by audiences.5 I am not, however, seeking out aberrant moments or mere exceptions that prove Michel Chion's rule in the epigraph to this article, for although Lost in Translation's briefly absent voice strikes us as unique most likely because of its placement at the climax of the film, the practice of withholding the voice is by no means rare. In fact, despite Chion's insistence, the unheard voice is rather prominent across multiple cinematic modes and can be found in a broad range of instances, from the common and conventional to the conspicuous and unnerving. Here, I consider the various forms inaudible speech may take and the effects and consequences of those variegated arrangements.

    The unheard voice of my title is meant to be taken in the literal sense and not as a metaphor for those who have been excluded from representation or participation within mass media (be it for race, gender, class, ethnicity, sexual preference, or what have you), though by no means do I mean to diminish such a clearly important project. In what follows, I examine briefly the history of narrative conventions that have resulted in the installation of the intelligibility of speech as orthodox within film practice. Next, I consider existing theory on the dynamic between the seen and the heard in film. While scholars have written a great deal about the relationship between hearing and seeing as it pertains to the disembodied voice, this study examines the

    3 Stephanie Zacharek, "Lost in Translation," review of Lost in Translation, by Sofia Coppola, Salon, September 12, 2003, http://www.salon.com/2003/09/12/translation/.

    4 Indeed, David Bordwell suggests that the appropriate strategy for watching an art film is to read for maximum ambiguity. See David Bordwell, "The Art Cinema as a Mode of Film Practice," Film Criticism 4, no. 1 (1979): 56-64.

    5 Unheard speech is not always by design of the filmmaker. Rick Altman reminds us that the space of exhibition factors as well, for poor acoustics in movie palaces rendered many lines inaudible. See Rick Altman, "The Material Heterogeneity of Recorded Sound," in Sound Theory, Sound Practice, ed. Rick Altman (New York: Routledge, 1991), 27-28.

    inverse situation, the unheard voice, a phenomenon that has not yet been adequately addressed within cinema studies. Finally, I propose a typology of the unheard voice in cinematic texts. I argue that this typology allows us to discuss absent voices with greater precision and to illuminate the complex interactions of sound and image that have been so central to the power and appeal of cinema and yet remain in a nascent state with regard to theory. Though my concern is largely with and my examples almost entirely from film, I believe that the findings of this study and my proposed model are applicable to other media, such as television and video games, which so often share narrative and stylistic conventions with the cinema.

    The Conventionality of Cinematic Speech. One arena of cinema studies that has extensively considered sound is history. The transition to the sound cinema and its implications in terms of film style have been well documented, as have subsequent technological innovations.6 Many artists of the era (often correctly) feared that the sophistication of the silent cinema and its freedom to move the camera and cut within the scene would be hampered by the limitations of early sound-recording technology. The puny range of microphones and the belief that speaking characters always must be within the frame to avoid confusing moviegoers combined to confine the complex visual syntax that had been mastered in the presound cinema. The silent cinema of the 1920s had by that time overcome its inherited debt to the stage by advancing beyond a tableau style. In its place was a cinema unmoored from any fixed perspective and free to assemble disparate shots into highly complex spatial and temporal configurations. Although sound offered a more perfect mimetic representation of the world, this was achieved, many believed, at the expense of visual expressivity, thus resulting in a sort of canned theater that abandoned the specifically cinematic style developed by early filmmakers. To maintain the illusion of talking pictures, practitioners insisted on always showing the moving lips of actors as visual assurance of synchronization and spatialization. Such a style was reinforced by the bulkiness of equipment and the necessity of recording all sound elements (e.g., score, sound effects, dialogue) during principal photography. After a period of initial awkwardness, though, filmmakers managed to integrate sound without disfiguring the visual sophistication developed during the silent era. Improved recording technologies, the implementation of post-dubbing, and experimentation with sound editing restored much of the visual flexibility of the prevoice cinema. Dialogue came to be regarded as an enriching element, complementing and strengthening the film when effectively and artfully deployed. In

    6 See, to name but a few, Richard Abel and Rick Altman, eds., The Sounds of Early Cinema (Bloomington: Indiana University Press, 2001); Donald Crafton, The Talkies: American Cinema's Transition to Sound, 1926-1931 (Berkeley: University of California Press, 1999); James Lastra, Sound Technology and the American Cinema: Perception, Representation, Modernity (New York: Columbia University Press, 2000); Rick Altman, Silent Film Sound (New York: Columbia University Press, 2004); Charles O'Brien, Cinema's Conversion to Sound: Technology and Film Style in France and the US (Bloomington: Indiana University Press, 2004); Gianluca Sergi, The Dolby Era: Film Sound in Contemporary Hollywood (Manchester, UK: Manchester University Press, 2005); Mark Kerins, Beyond Dolby (Stereo): Cinema in the Digital Sound Age (Bloomington: Indiana University Press, 2010).

    light of this refinement, filmmakers gradually eased away from always showing the source of speech. One pivotal innovation was the development of the voice-off, the presence of speech on the soundtrack from a character not framed within the visual field.7 Instead of relying on a two-shot during a conversation scene, for example, characters could be shot in individual close-ups. Upon hearing any voice not belonging to the framed speaker, the spectator would infer that it belonged to the other party depicted previously in the master shot but located now off-screen. In addition to unfettering the camera from speaking characters, the voice-off had profound implications for both film practice and film theory. First, upon realizing that audiences could comprehend the device of the voice-off, filmmakers were freed to experiment with other techniques, such as the sound bridge and voice-over narration. Second, the voice-off calls attention to off-screen space, which contributes greatly to a spectator's cognitive mapping of spatial and temporal orientation.8 Finally, and most crucial to my purposes, the voice-off became the subject of some of the earliest theoretical forays into the peculiarities of the voice in film sound. I take the resulting line of inquiry into the relationship between voice and image as my point of departure.

    Voices and Bodies, Presence and Absence. As Georg Simmel's epigraph that opens this article and my discussion of Lost in Translation both suggest, the phenomenon of seeing the source of an utterance and not hearing it is an unusual one, which makes the scant scholarly attention paid to it all the more remarkable.9 The inverse situation, however, of hearing a voice but not seeing its source, has received considerable scrutiny. Chion, in 1982, termed the unsourced speaking voice the acousmêtre.10 He describes this effect as an uncanny one, and to it he attributes a number of powers: ubiquity, panopticism, omniscience, and omnipotence.11 The acousmatic voice exercises its powers, argues Chion, by aurally haunting the screen until the moment its physical source is revealed, a revelation he calls de-acousmatization, a relinquishing of the voice's spectrality in exchange for assured (that is, visualized) corporeality. Mary Ann Doane also attributes an otherworldly quality to the disembodied voice,

    7 The term voice-off gained traction with the publication of Pascal Bonitzer's "The Silences of the Voice" in 1975. Bonitzer writes of the ideological implications of documentary narrators whose invisibility grants them godlike authority. Since that time, though, film studies has adopted the term voice-over for this unseen narrating voice while retaining voice-off to describe diegetic characters who are heard speaking but are not present within the frame. Pascal Bonitzer, "The Silences of the Voice," in Narrative, Apparatus, Ideology: A Film Theory Reader, ed. Philip Rosen (New York: Columbia University Press, 1986), 319-334.

    8 Chion, Audio-Vision, 9-20. Chion considers sound to be primarily geared toward temporalization of the image. Mary Ann Doane, on the contrary, suggests that the voice-off is first and foremost in service of the film's construction of space and [accounting] for [the] lost space of the diegesis that is not within the frame. Mary Ann Doane, "The Voice in the Cinema: The Articulation of Body and Space," in Film Sound: Theory and Practice, ed. Elisabeth Weis and John Belton (New York: Columbia University Press, 1985), 167.

    9 To my knowledge, until now, Michel Chion has written the only account of it, a cursory six pages in Audio-Vision. I return to Chion's theorization elsewhere in the article. Chion, Audio-Vision, 177-183.

    10 Chion's acousmêtre is perhaps his most noted theoretical contribution to the study of film sound and appears first in The Voice in Cinema and later, with little modification, in Audio-Vision. Michel Chion, The Voice in Cinema, ed. Claudia Gorbman (New York: Columbia University Press, 1999), 21-28.

    11 Chion, Voice in Cinema, 24.

    describing it as fantasmatic; she asks, echoing Chion, "Who can conceive of a voice without a body?"12

    I begin here with the disembodied voice (the opposite circumstance of my stated object of analysis) for two reasons: first, Chion and Doane demonstrate how disconcerting it is to experience speaker and utterance as unattached, so much so, in fact, that both resort to describing the phenomenon as somehow unearthly.13 Second, this body of literature highlights just how understudied the inverse of acousmatic sound is, for, as I've suggested, the unheard voice can be equally and perhaps even more unsettling. Most important, while the disembodied voice in the cinema evokes an ethereal realm, the embodied but unheard voice, I contend, forces the spectator to confront his or her own epistemological limitations. Both Doane and Chion connect the peculiar sway of the disembodied voice to the experience of the child in the womb, the universal, primordial first encounter with acousmatic sound: a child develops its sense of hearing in utero and therefore listens for several months before seeing is even possible.14 Though vision tends to be the most culturally privileged of the senses, sound is the one through which the outside world first registers.15 Moreover, after birth, our sense of hearing is the one we cannot turn off: we don't possess, as Steven Connor says, earlids.16 Conversely, vision is necessarily selective: one chooses which details she wishes to focus upon, and to look to one's left is to the exclusion of what is to one's right.17 Similarly, if faced with an unpleasant image, one may avert his eyes, shut out the data to prevent the brain from processing it. This is not the case with sound: there is no escaping it, for we hear in all directions simultaneously. Sound even traverses physical barriers, such as thin apartment walls or human hands; as anyone whose parents have ever instructed her to cover her ears to dam off distasteful language knows, palms can block only so much.18

    The experience of sound as a phenomenological constant contributes, I believe, to the unsettling response we often have when confronted with inaudible voices in film texts. Another factor, though, is, to apply a visual metaphor to a sonic context, the expectation of character speech behaving in accord with conventions of narrative

    12 Doane, Voice in the Cinema, 162.

    13 Chion, Voice in Cinema, 23; Doane, Voice in the Cinema, 162.

    14 Chion, Voice in Cinema, 17-18; Doane, Voice in the Cinema, 169-171.

    15 For a fascinating discussion of the cultural privilege accorded to vision (ocularcentrism) and the intellectual discourse surrounding it, see Martin Jay, Downcast Eyes: The Denigration of Vision in Twentieth-Century French Thought (Berkeley: University of California Press, 1994). Several film theorists grounded in phenomenological approaches have called attention to the hierarchical privileging of vision and of hearing (the distance senses) over the proximate senses of touch, taste, and smell and how the cinema activates these latter modalities despite its raw materials (images and sounds) playing to the top of the hierarchy. See Laura U. Marks, The Skin of the Film: Intercultural Cinema, Embodiment, and the Senses (Durham, NC: Duke University Press, 1999); Vivian Sobchack, Carnal Thoughts: Embodiment and Moving Image Culture (Berkeley: University of California Press, 2004); Jennifer M. Barker, The Tactile Eye: Touch and the Cinematic Experience (Berkeley: University of California Press, 2009).

    16 Steven Connor, Dumbstruck: A Cultural History of Ventriloquism (New York: Oxford University Press, 2000), 16-17.

    17 Chion, Audio-Vision, 182.

    18 One might here object and bring up recent headphone technologies that cancel outside noise. Although such devices are relatively effective at eliminating external sound, they tend to make one aware of her own internal bodily sounds, such as respiration or heartbeat. The exception, in this case, proves the rule.

    transparency. Films convey vast amounts of information to audiences in a number of ways: visual (the ticking down of a digital timer affixed to a bomb), nonverbal sound (off-screen police sirens alerting a villain that his capture is imminent), musical score (the swelling strings or tinkling piano notes that cue our tears), and verbal (character dialogue). The importance of dialogue (the most historically and institutionally privileged form of film sound19) as narrative engine cannot be overstated, for no matter how lush the visuals, fictional films from the 1930s onward have been, as Chion theorizes, voco- and verbocentric.20 Chion acknowledges the way in which spectators engage dialogue foremost as a hermeneutic channel: when hearing characters speak, [an audience member] will first seek the meaning of the words, moving on to interpret the other sounds only when [her] interest in meaning has been satisfied.21 In the classical model of cinema especially, filmgoers expect efficiency from dialogue: if the speech is present, it must be driving the story forward in some fashion, providing exposition, or otherwise aiding in character development. In this regard, classical narrative is not very tolerant of the extraneous: as the old screenwriting maxim tells us, a gun seen in the first act must go off in the third. What, then, are we to make of speech that is occluded or removed entirely? From a strictly narrative perspective, speech that is somehow concealed or out of spectator earshot must either be irrelevant or withheld for a specific end. Although he does not address the absent voice specifically, David Bordwell would likely describe withholdings of this latter type as intentional obfuscations on the part of the film's narration, designed to open a gap in the perceiver's knowledge. These fissures cue filmgoers to form a hypothesis, one that they will test against subsequent information in an attempt to fill in the gap. To utilize Bordwell's terms, a gap that is organized around unheard dialogue is flaunted by the narration. In his framework, the gap behaves in one of three ways: (1) it serves as a red herring, calculated to mislead the filmgoer; (2) the perceiver retrospectively will solve the riddle raised by the gap with later information; or (3) it will remain a permanent gap, the narration prohibiting its reconciliation.22

    The whispered line at the end of Lost in Translation qualifies as a permanent gap, and thus Bordwell's approach provides us a framework from which to consider how these missing words function narratively. And yet a narrative lens can account only partially for the peculiarity of the unheard voice. Still at stake, I think, is another, perhaps more fundamental disturbance. As we shall see below, obfuscated dialogue is paradoxical: in one sense, it conforms to our day-to-day experience of the world in which all that is spoken is not necessarily audible and all that is audible is not necessarily comprehensible; in another, the unheard voice speaks to the impossibility of reconciling sonic perspective with the spectatorial mastery of the visual realm afforded by continuity editing.

    19 Doane, Voice in the Cinema, 162.

    20 Ibid.; Chion, Audio-Vision, 6.

    21 Chion, Audio-Vision, 6.

    22 David Bordwell, Narration in the Fiction Film (Madison: University of Wisconsin Press, 1985), 55.

    Sound and Transcendental Subjectivity. In his essay "Ideological Effects of the Basic Cinematographic Apparatus," Jean-Louis Baudry writes of the cinema's ability to provide to the viewer an impossible array of vantage points on the profilmic event.23 In reality, if someone were to take a seat, say, at a crowded bar and watch fellow drinkers, that person's vision would be limited to his specific position and by his visual acuity. He may, as we've already noted, choose which portion of the barroom he wishes to view at any one moment. The continuity system of editing, though, grants the filmic spectator a series of viewpoints which he could not achieve in real life. For instance, consider the following firmly entrenched conventions that would likely occur in a speculative film of our pub patron: an assortment of people throughout the room framed from variegated distances (e.g., medium close-up, close-up), all motivated by his supposed shift of glance. With each new view, the audibility of the soundtrack fluctuates: in wider shots, we see people conversing, though individual sentences are indistinct, and the dull roar of the bar predominates. In others (corresponding to the bar patron's interest, we might be led to presume), fragments of dialogue are louder and more present. After a series of cuts, the camera settles itself on two women sitting at a booth. For a time far longer than the previous shots, the camera lingers, every word of their conversation audible and comprehensible. This scenario presents us with two problems, one relating to narrative and the other to realism. First, are we to attribute each new viewpoint to the narration or to the character? If we conclude that it is the narration, then the spectator in the movie theater has been granted an impossible liberation: she is at once placed in one position inside the bar, then another, then another, all in what appears to be an uninterrupted temporal linearity. This is, of course, a physical impossibility, even if she were an agent within the diegesis. Likewise, if we assume that the shots are tethered to the shifting attention of the character, then these points of view and audition, too, are impossible, for though our eyes may concentrate on any portion of the visual field before us, they do not zoom. Moreover, focusing one's attention on someone across the room does not make that person suddenly more audible, especially in the situation of a noisy tavern. For our present discussion, this modulation of sound to ensure access to or clarity of speech irrespective of the camera's position is of great significance. Sound, just like picture, is framed, or selected for our audition. Depending on the narrative circumstances, this sonic framing is subject to modulation. In other words, certain sounds (be they dialogue, ambient noises, or music) are highlighted on the soundtrack. The visual equivalents of framing and editing are quite apparent to us: a cut from one close-up to another literally transfigures the image. In many ways, our vision operates in such a manner: as our eyes move, what we see is necessarily altered. Sound, though,

    23 I consider Baudry's essay unquestionably an important one, despite the fact that I do not subscribe to his characterization of the docile, naive spectator; research into reception has taught us as much. I invoke it here only to illustrate how the mastery of vision enacted by the film apparatus functions similarly with regard to sound. My aim is not to cast sound continuity as an ideological trap. Rather, I want to draw attention to the paradoxical sonic perspective that comes with editing, for it will pay dividends later in this study. Jean-Louis Baudry, "Ideological Effects of the Basic Cinematographic Apparatus," in Film Theory and Criticism, 6th ed. (New York: Oxford University Press, 2004), 355-365.

    as we've already established, is omnipresent. We can attempt to ignore certain sounds, but we cannot eliminate them. Cinema, however, can. Imagine again our speculative example of the bar scene. As the space is established, the volume of the diegetic music (Motown over the bar's speaker system, let's say) might be quite high. That is, of course, until the narration seizes upon the conversation it wishes to privilege, at which point the song would be modulated down while the volume of the dialogue is notched upward. Within one long-take, static-camera shot, the soundtrack may shift its attention dozens of times without any visual signifier to mark the transition. Only the most attuned of filmgoers is likely to take notice of these modulations. Hence, filmmakers exploit our perceptual experience of sound as omnipresent and omnidirectional to guide our attention to particular portions of the narrative and mise-en-scène. James Lastra illuminates this situation in his Sound Technology and the American Cinema. With the coming of synchronized sound, practitioners generally fell into one of two philosophical camps about how sound should operate in the movies, and both bear the influence of antecedent technologies. One group, argues Lastra, aligned itself with telephony, seeking to maximize the clarity of speech over and above all other sounds. As with a telephone call, outside noises form a distraction from the communicative exchange, and the voice is abstracted from its larger sonic context to ensure audibility. The other group was disposed to aural fidelity, which Lastra equates with phonography. A recording of a musical performance might seek to render a faithful representation of the sound in its specific spatial context. Hence, certain sounds might be louder than others on the basis of the particularity of the space and the listener's (in this case, the microphone's) position within it.24

    These two competing dispositions became wedded in the film practices of the classical era. That is to say, dialogue was almost always given the telephonic treatment, but when the sonic attributes of the space were evoked, it was often done in accord with a phonographic, or point-of-audition, approach. Crucially, practitioners discovered they could move almost seamlessly between these models depending on the end sought (sonic realism or sound clarity) in a specific instance and on either's narrative expediency.25

    The selectivity of narrative presentation and its illusionistic realism should by now come as no surprise. But recall again the closing scene of Lost in Translation, in which spectators are denied audition of Bob's whisper. In a series of alternating medium close-ups, we see Bob's and Charlotte's faces as the former speaks and the latter listens. Were we, too, characters within the film's diegetic world standing at the same position as the camera relative to its subjects in this instance, we might expect to hear Murray's line. And yet, despite our proximity to the characters, we do not. Granted, if one were following Bordwell's narrative theory, this is simply a moment of highly restricted, self-conscious narration. In that this gap occurs at the film's most important narrative instant and during what is typically the most privileged and narratively forthcoming moment in the cinema (the end), this flaunted narrative withholding is keenly felt. I submit that this instance registers as either novel or frustrating not simply

    24 Lastra, Sound Technology, 138-139.

    25 Ibid., 139-143.

    for its narrative open-endedness but also because it amounts to a twofold violation of perceiver expectations: first, films generally frame the voice so that no (important) words are lost, and this one does not. Second, if we were standing at Charlotte's shoulder, we would hear that line, and yet we don't. The very least we can say about the ending of Lost in Translation is that it operates in accord with neither our conventional experience of the cinema nor our perceptual experience of reality. Lost in Translation is unique for the flagrancy with which it withholds the voice, and it is therefore a particularly illustrative and well-known example of unheard speech in the sound cinema. But the conspicuous absence of the voice in Coppola's film is but one form that this most unusual and yet common phenomenon takes. It is our task now to consider the effects and consequences of the unheard voice in its numerous permutations.

    The Voice-Out

    [T]he spoken word is most cinematic if the messages it conveys elude our grasp; if all that actually can be grasped is the sight of the speakers.

    Siegfried Kracauer26

    To fully consider the peculiarities of withheld cinematic speech, we must develop terms that address its many complexities and that might better grapple with the problem than what the current body of scholarship provides. Chion has presented the term emanation speech for just the phenomenon under consideration here.27 He writes:

    Emanation speech is speech which is not necessarily heard and understood fully, and in any case is not intimately tied to the heart of what might be called the narrative action. The effect of emanation speech arises from two situations. First, dialogue spoken by characters is not totally intelligible. Second, the director may direct the actors and use framing and editing in ways that run counter to the standard rules, avoiding emphasis on the articulations of the text, the play of questions and answers, important hesitations and words. Speech then becomes a kind of emanation from the characters, an aspect of themselves, like their silhouette is: significant but not essential to the mise-en-scène and action.28

    Chion breaks emanation speech into seven categories:29

    1. Rarefaction, wherein a filmmaker contrives a reason to attenuate the voice, such as placing a character behind a window or at a great distance from the camera

    26 Siegfried Kracauer, Theory of Film: The Redemption of Physical Reality (Princeton, NJ: Princeton University Press, 1997), 107.

    27 Chion, Audio-Vision, 177-183.

    28 Ibid., 176-177, Chion's emphasis.

    29 Ibid., 179-183.


    2. Proliferation and ad libs, or moments when characters speak over one another, thereby diminishing the overall clarity of their respective lines

    3. Multilingualism, in which a character speaks a language other than the dominant one of the audience, and the filmmaker provides no subtitles to convey the line's meaning(s)

    4. Narrative commentary over dialogue, wherein a nondiegetic narrator speaks over characters conversing in the diegetic realm

    5. Submerged speech, which occurs when characters are placed in environments that allow for external sounds to interrupt speech occasionally, such as nightclubs or beach scenes

    6. Loss of intelligibility, wherein filmmakers attempt to convey a character's fluctuations in aural attention by reducing the clarity of portions of speech30

    7. Decentering, wherein speech is heard but seems entirely unrelated to the visuals

    Chion's typology certainly addresses many of the situations under consideration here. However, there are a number of issues present that warrant a new set of terms. Foremost, Chion considers emanation speech to be that which is "not intimately tied" to narrative action.31 Our original example, Lost in Translation, clearly suggests otherwise, as the whispered lines are indeed the very resolution of the narrative; thus, these words, despite being unheard, are by no means irrelevant. Further, Coppola's film likewise points to one of the most glaring gaps in Chion's concept: whispers. Finally, Chion's framework, to my mind, does not adequately address why each of these instances is disconcerting. We need a theory that can account for more instances of unintelligible speech than those that Chion has identified, as well as one that engages the perceptual, experiential schema against which unheard dialogue works. To that end, I propose that we strike the term "emanation speech," primarily because it does not describe with necessary specificity the phenomenon under discussion. What speech that issues forth from a figure on-screen, for example, doesn't emanate from a character? What we are describing are the moments in which the voice to a greater or lesser extent becomes absent, as if it were, like a light switch, turned off or modulated down. In that regard, "voice-off" might be a more appropriate descriptor, though Bonitzer's sense of the term is now firmly entrenched within the discipline. Thus, I propose "voice-out" to describe any instance of character speech that a spectator cannot hear or comprehend as a result of sonic obfuscation, be that in the form of conspicuous manipulation on the part of the filmmaker, an intrusion of a contesting sonic contingency, a flubbed line, or what have you. The task at hand is to distinguish the array of instances in which the voice-out comes into play. When appropriate, I have retained

    30 Ibid., 182.

    31 Ibid., 177.


    Chion's categories, whereas in others I have either nuanced his terminology or substituted my own entirely.

    Each of the following categories is an attempt to account for the work done by the unheard voice, be that a narrative function, an attempt to align the audience with a character, or a self-conscious stylistic gesture. These categories are not modes of film sound; rather, they are strategies that filmmakers use to disrupt momentarily the predominant sound-image relations of the continuity system. My typology therefore attempts to determine the justifications for, and implications of, these deviations from convention, and at the same time to theorize how the unheard voice becomes a convention unto itself. Just as the image may ceaselessly shift between scales (from near to far) and perspective (subjective to objective) and move about space without the limitations of a corporeal agent in reality, so, too, can sound shift gears and operate according to different ontological assumptions at any given moment. The typology here attempts to categorize one particular form of such aural fluctuation. We begin as did cinema, with silence.

    1. Silent Cinema. Chion, concerned primarily with contemporary film, treats sound as a given, noting that emanation speech is the rarest and most cinematic of filmed speech.32 What he clearly ignores is the entire body of presound cinema, which is predicated upon the unheard voices of its speaking characters. Anyone who has ever screened silent films for uninitiated undergraduates no doubt has seen the look of consternation creep across their faces. In addition to the abstraction that is black-and-white cinematography, primitive narratives, and unusual acting styles, the lack of an actual voice to accompany the moving mouth of the figure on-screen is quite an unsettling experience to those of us steeped in sound texts.33 Think also of how much is said but never reconciled by the viewer in silent film; for instance, it is not uncommon to see a character speak what appear to be several lines of dialogue only to be presented with a short-sentence intertitle to substitute synecdochically a portion of the speech for all that was spoken.

    The silent cinema is an interesting case for a number of reasons. First, the film does not withhold the voice; it is without the voice.34 For that matter, there was no direct sound at all: the entire sound of the diegetic world is absent. Thus, unlike most of the other instances that follow, we cannot attribute the unheard voice to filmmaker contrivance or grant it narrative significance. It is, rather, a technological given that narratives must utilize to their benefit or orchestrate ways to work around. This distinction will come into play elsewhere in the argument.

    32 Ibid.

    33 Doane, however, is quite aware of the problem. She writes, "The uncanny effect of the silent film in the era of sound is in part linked to the separation . . . of an actor's speech from the image of his/her body." Doane, "Voice in the Cinema," 152. Rick Altman illuminates implicitly the peculiarity of watching silent films by noting how sound cinema functions in two distinct ways like a mirror: sound and image collude to give the illusion of unity to the onscreen human figures, which, in turn, assures the spectator of her own sense of wholeness. See Rick Altman, "Moving Lips: Cinema as Ventriloquism," Yale French Studies 60 (1980): 71-76.

    34 And yet it's not. As Donald Crafton notes, the coming of synchronized sound brought with it the problem of an actor's voice not matching the one a spectator imagined him to possess. Thus, the spectator mentally fills in the missing voice of the silent film. Crafton, Talkies, 509-513.


    Throughout this article, I have referred to the silent cinema as if it were a singular historical occurrence. However, there are actually two types of silent cinema: that which is technologically determined (that is, produced prior to the advent or widespread adoption of synchronized sound) and that which is produced after sync-sound technology had been embraced broadly. In the latter case, the lack of speech is an artistic choice on the part of the filmmakers. In this subcategory we can place figures like Stan Brakhage and Andy Warhol, directors who throughout their careers largely eschewed sound in their experimental works. For a film to fall into this subcategory, it must be completely silent for the whole of its duration. A shift from sound to silence is an altogether different animal and thus is dealt with separately later in the argument.

    2. Nested Voice-Out. When we see a character speaking yet do not hear his or her voice, it is not always a formal manipulation on the part of the filmmaker. Take as an example the scene in National Lampoon's Christmas Vacation (Jeremiah S. Chechik, 1989) in which Clark Griswold (Chevy Chase) finds himself trapped in the attic and occupies his time by watching 8mm home movies projected onto a wall. The technology of 8mm film is such that amateur users rarely attempt direct sound. Thus, as Clark watches his old footage, we see but do not hear speaking figures. The pastness of these images is bracketed off by the technological limitations of their source of capture.35 Therefore, multiple diegetic levels are present at once. More often than not, this approach is designed to activate nostalgia or to indicate the era of the images' recording.

    There are, however, more complex examples of the nested voice-out. Take Jean-Luc Godard's Vivre sa vie (My Life to Live; 1962), in which Nana (Anna Karina) watches The Passion of Joan of Arc (Carl Theodor Dreyer, 1928) in a Parisian cinema. A selection from Dreyer's film plays uninterrupted in excess of two minutes before we see Nana's tearful, affective response to the film: Karina mirroring Falconetti. Godard's incorporation of the older film in the newer one is no doubt a metacinematic commentary, a self-aware citation.

    Another variation of the nested voice-out can be found in some dubbed movies. Despite the Italians' early mastery of automated dialogue replacement (ADR), there are a number of instances from the neorealist period in which a character's lips are seen moving and yet we do not hear an accompanying voice. Most often, this is due to an error in the post-sync dubbing. Though rare within the Italian cinema of this era, such errors, in that they remind one of the alienating effects of the unheard voice, may disrupt briefly the spectator's engagement with the cinematic illusion. The loss of sync that can result from the practice of ADR likewise accounts at least in part for the sometimes humorous discrepancies between moving lips and speaking voices in poorly dubbed kung-fu movies and, for that matter, in Woody Allen's What's Up, Tiger Lily? (1966).

    3. Anti-Redundant Voice-Out. As moving images existed for some time without the aid (or burden, as early resisters of the talkie contended) of direct sound, we may think of sound as always already excessive in that the pantomime of bodies and gestures

    35 I'm speaking here, of course, of the source as implied by the narrative. The images we see very well could have been filmed with contemporary equipment and degraded to look less pristine.


    is often sufficient to convey the meaning and emotion of a scene without resorting to speech. We sense this quite frequently during action scenes when adversaries spot each other across a distance: a "Hey, asshole!" at that point would be superfluous given the work done by the shot-reverse shot cutting between glaring pairs of eyes (think of the showdown in The Good, the Bad, and the Ugly; Sergio Leone, 1966). Similarly, the visual cues of desire between two lustful characters speak volumes. Therefore, the voice in addition to very expressive facial gestures can be, at times, redundant. I'm thinking here of the climax of The Graduate (Mike Nichols, 1967), where Benjamin (Dustin Hoffman) disrupts the wedding ceremony. In this scene, we see but do not hear several wedding guests demonstratively cursing Benjamin in extreme close-up. Our memories of this famous scene might fool us into thinking that this silence is because Ben is separated from the gallery by a glass partition, and thus we don't hear because Ben cannot hear. However, a review of the scene shows that each of the close-ups of swearing mouths is cued by a glance from Elaine (Katharine Ross), not Ben, and Elaine certainly would be able to hear those maledictions. The combination of the prominent, askew framing; swift zooms; and the effusive emotions of the face is sufficient on its own to convey the meaning. Anything more would be gratuitous. One also sees such a phenomenon in sports broadcasts: one needn't excel at lip-reading to know unquestionably what an angry coach or player shouts.

    The anti-redundant voice-out also comes into play in texts that have been censored. Of course, the most common example of such a voice-out is when a speaking character is "bleeped out," that is, when a short, 1000 Hz beep tone is inserted into the sound mix in an effort to mask out an unwanted portion of speech, most often an expletive.36 Generally, though, the utterance or its meaning is never lost despite its obfuscation. In such cases, the exact word is redundant given the context of its speaking, so much so, in fact, that, at times, even the image of the spoken swear must be obscured. During the opening credit sequence of the television series Tosh.0 (Comedy Central, 2009-), for example, host Daniel Tosh's voice is entirely removed as he gestures emphatically toward the screen of a laptop computer. Suddenly, in midsentence, a pixelated black blotch appears over his mouth. Therefore, even when divorced from any specific context (we do not see the viral video to which Tosh is pointing, nor do we hear any portion of his sentence), both the voice and the image of lips moving must be removed to bracket off the expletive from signification. In a shot in which the specific word was already absent from the sound mix, the image nevertheless had to be censored, attesting to speech's superfluity at times when combined with the highly communicative human face.

    4. Proliferative Voice-Out. This category follows Chion's designation of proliferation and ad libs as emanation speech. The most notable case, as Chion's translator Claudia Gorbman points out, may well be the work of Robert Altman, whose experiments in overlapping dialogue reflect an attempt to render the chaos of sound that is multiple

    36 The 1000 Hz tone frequently accompanies the color bars during the leader segment of videos intended for broadcast or theatrical exhibition. The color bars serve as a standard image against which technicians adjust display levels, while, in a similar manner, the tone is used to gauge volume levels. As such, the latter is something of a degree-zero of sound since silence, as we've discussed, doesn't really exist.


    people speaking at once.37 Take Nashville (1975), in which characters do not obey the alternating speech patterns of Hollywood shot-reverse shot convention. The speech of characters in the foreground is frequently overpowered by other conversations in the background and those off to the side, only to be modulated moments later.

    5. Verisimilar Voice-Out. This category closely aligns with Chion's own rarefaction. We experience in our day-to-day lives moments in which a sudden noise (ambulance sirens, say) disrupts our ability to hear the speech of another. Thus, if a filmmaker deploys sound that drowns out the voice (assuming, of course, that the absent line isn't one absolutely essential to plot advancement or character development), it is for the sake of verisimilitude, or what Lastra might call phonographic, point-of-audition realism.38 Abbas Kiarostami often utilizes this approach, as in his Taste of Cherry (1997), when a conversation between a man in a car and a passerby is obscured by the sound of a barreling truck. Terrence Malick deploys the verisimilar voice-out in The Tree of Life (2011) in the early scene in which Brad Pitt's character receives the phone call informing him that his son has been killed. As Pitt shouts into the phone, his words are completely lost amid the noise of airplane machinery.

    The verisimilar voice-out also refers to moments in which the film presents to the ears of the perceivers in the audience sounds consistent with what a human of normal hearing capacity might hear from spatial coordinates roughly consistent with those of the camera's implied position within the diegetic space. A recent example occurs late in The Social Network (David Fincher, 2010), in a scene in which the camera is positioned outside the glass walls of an office as Eduardo Saverin (Andrew Garfield), inside, is given the news that he has been pushed out of the Facebook empire. Here, the spectator is positioned as an onlooker, an out-of-earshot witness to a backstabbing. Hence, the verisimilar voice-out breaks with the tendency of the cinema to violate sound perspective to maintain the centrality of the voice to narrative, that is, the cinema's verbocentrism.

    I separate this category from the proliferative voice-out above, though, to isolate the fact that, with proliferation, overlapping and/or a multiplicity of speakers is conspicuously modulated by the filmmakers. Altman's sound design, for example, is characterized by increased volume on the conversation of greatest importance. It therefore masquerades as a realistic approach while conforming to Chion's claims about emanation speech: it is "not intimately tied to . . . the narrative action."39 Therefore, unimportant lines may be lost in the mix, but crucial ones will be especially clear.

    By contrast, the verisimilar voice-out might arise from the actual presence of a real element that interferes with the clarity of dialogue during production and is retained subsequently for its correspondence with our real-life experiences. However, it may also be utilized as a contrivance to retard narrative information. For instance, in Taste of Cherry, Mr. Badii (Homayoun Ershadi) asks a young man to assist him in his suicide attempt, but the question is drowned out by outside ambient noise. This is a significant moment in the narrative, for it establishes Badii's central conflict. Kiarostami is

    37 Chion, Audio-Vision, 220n4.

    38 Lastra, Sound Technology.

    39 Chion, Audio-Vision, 177.


    therefore using the verisimilar voice-out to delay our knowledge of the protagonist's motivations, which only later become clear.

    6. Whispers. As I've already rehearsed, the whisper is a curious event in the cinema, for it often places the narrative tendency to foreground speech in tension with the moments in which the whisper is used as a device to withhold information. The question of whether this gap will be filled triggers a justifiable sense of anticipation, a fact that no doubt testifies to its cinematic staying power. Whispers may be wholly or partially unintelligible, such as in Lost in Translation, where we hear only a muffled rumble, or all sonic aspects of them may be off-limits. There remains more to be said about on-screen whispers, but that discussion is set to the side for the moment and taken up again toward the end of the article.

    7. Subjective Voice-Out. The subjective voice-out is one in which both sound and image are presented from the points of view and audition of a character. An instructive instance is taken from In the Company of Men (Neil LaBute, 1997). In the film, two men conspire to court separately yet concurrently a deaf coworker named Christine (Stacy Edwards), in order for both to reject her: the humiliation of one woman as retaliation against all the others who have rejected them. One of the men, though, falls in love with their dupe and pleads with her to forgive him. In this moment, we cut to a shot from her point of view of Howard's (Matt Malloy) plaintive face, his lips easily readable, his voice entirely absent. Here, we see and hear precisely as does the deaf woman. To differentiate between the subjective voice-out and other categories, we can adopt Edward Branigan's term "from the point," meaning that the shot corresponds to the (approximate) spatial position of a character.40 The key distinction as it pertains to the voice-out is that Branigan's theorization is largely concerned with visual points of view. The subjective voice-out, then, is when point of view and point of audition are implied to be fixed to the spatial coordinates of a character.

    8. Free Indirect Voice-Out. Similar to the subjective voice-out is the free indirect voice-out. Free indirect discourse, which was theorized first in literary studies and later applied to the cinema by V. N. Volosinov, Pier Paolo Pasolini, Gilles Deleuze, and others, is a mode of cinematic enunciation that disrupts the supposedly stable subjective-objective binary.41 As Pasolini described it, a free indirect shot is one that is colored by a character's subjectivity without being from the character's point of view.42 We see this frequently in films in which a character is experiencing an altered state of consciousness, say, intoxicated or punch-drunk. A close-up of said character might be

    40 Edward R. Branigan, Point of View in the Cinema: A Theory of Narration and Subjectivity in Classical Film (Berlin: de Gruyter Mouton, 1985), 78. I must stress the distinction between the subjective voice-out of In the Company of Men and our example of the verisimilar voice-out from The Social Network, for the former is clearly coded as being the subjective experience of an identified character. With The Social Network, the position from which we see and hear is not from the spatial coordinates of an actual character within the diegesis. Instead, we experience Eduardo's sacking as a hypothetical character might. The distinction is that in LaBute's film, we are tethered to what we know to be an actual character.

    41 Louis-Georges Schwartz, "Typewriter: Free Indirect Discourse in Deleuze's Cinema," SubStance 34, no. 3 (2005): 107-135.

    42 See Pier Paolo Pasolini, "The Cinema of Poetry," in Heretical Empiricism, trans. Louise K. Barnett (Bloomington: Indiana University Press, 1988), 167-186.


    photographed in wobbly handheld or with tracers or other aftereffects to suggest her inebriation. Thus, the character's subjective state seems to saturate or alter an otherwise objective shot. In a less conventional manner, Michelangelo Antonioni frequently deploys the free indirect by framing his characters against deserted landscapes in such a way that their psychological alienation seems to be reflected in the physical world around them. In each of these cases, though, this liminal space between subjective and objective is conveyed visually. Less often discussed is free indirect sound.43 A filmmaker may achieve this effect by presenting what (or how) the character hears while still keeping the character within the frame. Thus, the sound is subjective without being coded visually as such. A character's or characters' temporary loss or reduction of hearing is an increasingly common usage of the free indirect: a bomb goes off, and after the blast, the sound mix is manipulated in such a way that, despite occurring within a supposedly objective, third-person shot, voices and other direct sounds are muffled or a ringing is heard so as to convey the momentary trauma to the ears caused by the explosion. The filmgoer therefore hears as does a character, without the camera occupying his physical position within space.

    However, this sudden sonic transfiguration is clearly attributed to a conspicuous event (the bomb blast) within the narrative. A more subtle and interesting variation of this strategy occurs in All the Real Girls (David Gordon Green, 2003) during an argument between the two young lovers, Noel (Zooey Deschanel) and Paul (Paul Schneider). Upon the revelation that his girlfriend has been unfaithful, we cut to a shot of her emphatically explaining herself from Paul's visual point of view. As the shot begins, we hear her protests; midway through, her voice is completely and abruptly dropped from the sound mix while the other voice, sounds, and music remain. After this shot, Paul remarks, "I'm looking at you right now and I hear you talking and the words that are coming out of your mouth are like they are coming out of a stranger." In a fully subjective depiction, Deschanel's lines no doubt would be audible, for the man, in the literal sense, can in fact hear her. Rather, the film at this point is miming the character's mental state, removing a voice that he certainly attends to but suddenly cannot comprehend. This extraction of the woman's voice serves as a metaphor for his sudden estrangement from her.

    9. The Escorted Voice-Out. There are times in film in which portions of the soundtrack are drastically reduced or completely removed so that others may take the lead. Such up-and-down modulations are often at work in nightclub scenes, such as our hypothetical bar example above. Several scenes of this sort occur in Singles (Cameron Crowe, 1992), wherein the songs of on-stage grunge bands and the enthusiastic cheers from the crowds are dialed down to accommodate the dialogue during the film's numerous meet-cute scenarios. In moments such as these, music or elements of the mix may diminish character speech, but the voice typically returns to its hierarchical supremacy when crucial, narratively significant lines are uttered.

    43 Deleuze has offered the most thorough articulation of free indirect sound-image relations. See Gilles Deleuze, Cinema 2: The Time-Image (Minneapolis: University of Minnesota Press), 234-241.


    There are, however, moments in which this verbocentrism is reversed, when music or other sound effects are ratcheted up while voices are reduced in part or in whole.44 I group these under what I call the escorted voice-out, and though it is often deployed as a familiar trope, it possesses a disquieting power. Its most recognizable form is the time-condensing montage, such as when a romantic pair is seen engaging in a number of precious activities (e.g., cooking a meal, picnicking, snuggling on a sofa) or the training montage in sports films. Generally, a poppy tune takes the lead while the characters' voices are cut from the mix. Another familiar utilization of the escorted voice-out occurs in Crash (Paul Haggis, 2004), wherein the bigoted cop (Matt Dillon) rescues his former victim (Thandie Newton) from an overturned vehicle. As the car becomes engulfed in flames, direct sound is diminished, all screaming voices are removed, and the angelic musical accompaniment is pushed to the fore. Using a similar strategy but to vastly different ends is Paranoid Park (Gus Van Sant, 2007). In a striking scene, teenager Alex (Gabe Nevins) unceremoniously breaks up with his cheerleader girlfriend Jennifer (Taylor Momsen) behind the football stadium bleachers. The couple is framed within the shot, but their voices are gradually dialed down to silence, displaced by Nino Rota's theme from Juliet of the Spirits (Federico Fellini, 1965), which Van Sant deploys throughout the film to signal the boy's various spiritual awakenings.

    Of all the many forms the voice-out may take, the escorted variety is perhaps the one most frequently utilized in the cinema. For instance, as Chion notes, voice-over commentary often obscures or overtakes the speech of diegetic characters, as is the case in Goodfellas (Martin Scorsese, 1990).45 In these cases, though, the voice-out is not disruptive or alienating: with Goodfellas, the faith we place in the voice-over narrator assures us that nothing crucial to the narrative has been lost as a result of his or her intrusion. Likewise, in the case of Crash, the use of the escort functions similarly to the anti-redundant voice-out. A character screaming for help as her vehicle catches fire doesn't necessarily need the sound to convey the emotional thrust of the scene. The replacement of speech with music is a poetic device that alters our sense of the depicted moment: had Crash deployed a more traditionally realistic soundtrack with voices, the scene would have seemed to be structured around the immediate suspense of the rescue. With the music taking the lead, the scene registers differently: the rescue is a foregone conclusion and the scene goes from a thriller to one of redemption.

    The example from Paranoid Park, however, demonstrates how the device can achieve an uncanny effect precisely because it momentarily disrupts our ontological assumptions about the film at the same time that it shatters the film's illusion of a character's

    44 Paul Théberge has coined the term "diegetic silence" to describe moments in which one or more elements of the diegetic world (e.g., effects, dialogue) are removed from the sound mix and replaced by nondiegetic sounds. A diegetic silence is thus similar to my notion of the escorted voice-out, but, in that my theory privileges the voice, not all escorted voice-outs would qualify as diegetic silences, and vice versa. As do I, Théberge notes that the device is something of a cliché in Hollywood. Paul Théberge, "Almost Silent: The Interplay of Sound and Silence in Contemporary Cinema and Television," in Lowering the Boom: Critical Studies in Film Sound, ed. Jay Beck and Tony Grajeda (Urbana: University of Illinois Press, 2008), 57.

    45 Chion, Audio-Vision, 181.


    real embodiedness.46 Nothing within the narrative prior to this moment hints to us that the breakup was imminent or that our protagonist had even contemplated the split. We see the Nevins character approach the young woman and witness their conversation in silence. Only after Momsen's face begins to register her anger do we gather the meaning of the exchange. Toward the scene's end, Rota's theme fades away, and the voices return to prominence long enough to confirm that the two are no longer a couple. Here, a completely conventional device is deployed in a novel and quite compelling way.

    10. Abandoned Sound. Perhaps the rarest of deviations from sound cinema norms, abandoned sound is when all sounds (dialogue, direct sound, music, voice-over) are completely removed from the film, leaving the perceiver in a purely visual mode, as a spectator proper. Though abandoned sound is similar to the escorted voice-out, the two categories differ in that the latter typically drops all but one or two components of the soundtrack, whereas the former cuts all sound. Abandoned sound is, therefore, a radical strategy, for it places the spectator in a position of silence that even the supposedly silent cinema did not, as presound films were frequently accompanied by live or recorded music. In this regard, abandoned sound closely aligns with the subcategory of intentional silence discussed in relation to our first category, the technologically determined silent cinema. I've kept these two domains separate, however, since avant-garde works like those of Brakhage and others typically employ silence for the duration of the film, whereas abandoned sound achieves its startling effect by the sudden and necessarily brief evacuation of sound.

    Mike Figgis's 1995 film Leaving Las Vegas utilizes abandoned sound to great effect in the scene in which the protagonist Ben (Nicolas Cage) suffers a heart attack and all sound is withdrawn: the film unspools in complete silence. Figgis describes festival audiences as experiencing a tremendous discomfort during this moment, one in which the protection of this sound blanket of mush is removed.47 As we've already rehearsed, sound is omnipresent and inescapable. To unexpectedly find oneself without the sound of the film and to be left with only the ambient noise of the auditorium or one's living room is far more startling than its visual equivalent of the cut to black. If sound is the crutch that props up the cinematic experience (Rick Altman's ventriloquist), then its abandonment is the equivalent of snatching the prop from the perceiver, leaving him unsteady in vision alone. Perhaps philosopher Don Ihde puts it best: "The sudden absence of sound can disembody a scene. . . . [It] becomes eerie, a moving tableau that becomes more abstract and silent."48

    46 Robert Spadoni invokes Freud's notion of the uncanny in his book on Universal's early horror films. Freud, he writes, noted two instances in particular when the uncanny emerges, such as momentarily perceiving an inanimate object to be alive and, conversely, perceiving a living thing to be inanimate. Thus, for one to hear Momsen's voice abruptly slip away from audibility is to become aware suddenly of the film's artificiality, to recognize, following Altman's ventriloquism argument, that there is a difference between one's self and the self that he identified with on-screen. Robert Spadoni, Uncanny Bodies: The Coming of Sound Film and the Origins of the Horror Genre (Berkeley: University of California Press, 2007), 6, 30.

    47 Mike Figgis, "Silence: The Absence of Sound," in Soundscape: The School of Sound Lectures 1998–2001, ed. Larry Sider, Diane Freeman, and Jerry Sider (London: Wallflower Press, 2007), 2, quoted in Théberge, "Almost Silent," 53.

    48 Don Ihde, Listening and Voice: Phenomenologies of Sound (Albany: State University of New York Press, 2007), 83.


    The Four Zones of the Voice-Out. Now that we have categorized and distinguished the forms the voice-out may take, we may begin to group these categories according to shared sets of functions or motivations. If we place our categories within a circular model according to my ordering, certain similarities emerge.

    The first category, silent (era) cinema, owes its unheard voices not to a stylistic choice but to a technological limitation. Hence it is grouped by itself, for every other category features an absence or obscuring of the voice that is intentionally designed by the filmmaker. As a special case, the silent cinema forms the first anchor point on the voice-out model. For this reason, I set it to the side for the moment so that we can consider it more fully at the article's conclusion.

    The two subsequent categories, the nested and anti-redundant voice-outs, form the first of four zones within the model shown in Figure 1. For our purposes, zones group voice-out categories that share similar properties in terms of how they typically function within cinematic texts. The nested voice-out differs from every other category in that it reinscribes the technical limitations of the silent cinema into a film that does not have such limitations. For this reason, we cannot construe it as a withholding of speech in the narrative sense: watching a fictional character watching The Passion of Joan of Arc in Vivre sa vie does not subtract the voice. Rather, it perpetuates an already-existing absence. I pair the nested and anti-redundant voice-outs by virtue of the fact that, like the former, the latter's absence of the voice is not experienced as a void. No narrative information is withheld in the example from The Graduate. That isn't to say that the angry parents' lines are irrelevant. Instead, their audibility is simply unnecessary for our discernment of the scene. This inessential sound recalls our discussion of the title card in the silent cinema: much is said, little is directly conveyed, but nothing is missed. Together, these voice-outs form what I call the visual zone, and it is bracketed in Figure 1.

    The next two categories, the verisimilar and proliferative voice-outs, each, by and large, aspire to sonic realism. The former obscures the voice to suggest a realistic soundscape, though it does so in a way that ensures that important narrative information is not missed. The latter conforms to a point-of-audition model, seeking to replicate what one might hear were she to stand in the camera's position within the space of the film. Here, both types of voice-out in this range operate in a similar manner:

    Figure 1. The voice-out typology.


    they activate the voice-out to re-create conditions that one experiences in reality. Thus, together they form the realistic zone.

    The whisper, the next category in the model, is situated directly opposite the first anchor, the silent cinema. It, too, is a peculiar case. Therefore, it stands by itself and is designated as the second anchor point. Its significance is addressed below.

    The subjective zone comprises the subjective and free indirect voice-outs. Each is motivated by an attempt to use sound as a tool to depict or to investigate psychic interiority. The subjective voice-out is coupled with a point-of-view shot and is therefore coded entirely as subjective. The free indirect mode modifies this considerably, as it does not announce itself as a subjective depiction but rather subtly suggests a mutual penetration of subjective and objective realms.

    With the escorted voice-out and abandoned sound, the realistic soundscape of voices, music, and sound effects is disrupted as the film shifts to a stylized presentation in which, in the first case, one element, generally music, drowns out all other aural elements. In the second case, sound is completely removed. These two techniques are far and away the most obtrusive ones in the model. For this reason, I refer to them collectively as the self-conscious zone, the upper-left area in Figure 1. This zone is situated diagonally opposite the realistic zone, for both utilize similar techniques to vastly different ends.

    Not only does the self-conscious voice-out zone stand at a remove from objective sound; it also pushes us closer and closer to the silent cinema. As we move through this zone, we lose first the voice, then other portions of the soundtrack, before finally losing sound altogether. However, this tenth position in the model, abandoned sound, is not a pastiche or mere appropriation of silent cinema aesthetics, an instance of "speech in a dead language."49 Instead, it radicalizes even its precursor, for it forgoes the musical accompaniment and intertitles that guided and grounded audiences in the silent era. Hence, a film featuring abandoned sound becomes more silent than even the silent cinema was. As such, abandoned sound brings us closer to avant-garde practice than to a mere borrowing of silent film conventions. For this reason, our example from Leaving Las Vegas more closely aligns with the work of Stan Brakhage that I categorized as a subset of silent cinema; thus their proximity to each other in Figure 1. However, because films that use abandoned sound have, unlike silent cinema, the technological means to reproduce direct sound but elect not to as an artistic choice, the category of abandoned sound corresponds in many ways to the second category, the nested voice-out.

    Although all of the categories are interrelated by the absence or obfuscation of the voice, several share multiple points of contact that I think merit special attention. There are two points on the continuum that work to either unify or defy the boundaries of the zones: the silent cinema and the whisper. Silent cinema links the reinscription of silence characteristic of the nested voice-out with the evacuation of sound in the abandoned sound category: the three terms form a potential circuit unto themselves.

    49 Fredric Jameson, Postmodernism; or, The Cultural Logic of Late Capitalism (Durham, NC: Duke University Press, 1991), 17.


    Here, the tension between silent and sound cinema (or rather, one's expectations of the latter relative to the former) manifests itself in what I contend are some of the more interesting sound-image relations within the cinema. The whisper, located at the bottom half of the circle, works in a similar fashion to the silent cinema, as it, too, mediates between two otherwise-distinct regions: the subjective and the realistic. Often, the whisper both reinforces and violates the notion of point of audition. If we are to hear a character whisper in, say, a medium shot, either the film cuts closer to him or the line is pushed up in the sound mix, louder than it would be to an invisible auditor in those spatial coordinates. To turn up the volume on the whisper recalls the theatrical stage whisper wherein the supposedly quiet line is delivered to the back row.50

    A whisper in the cinema goes unheard in one of three cases: (1) we are aligned with a character who is not capable of hearing it (subjective voice-out); (2) we are positioned in such a way that, as a third party, were we to occupy that same point within the diegesis, the whispered words would likewise remain unintelligible (verisimilar voice-out); or (3) the character or the camera is positioned in such a way that either one should be able to hear it, but the narration withholds it nevertheless. The transgression or unification that the whisper achieves within the voice-out model occurs when the first and second options above collapse into one, which makes for shaky ontological ground. In other words, pinpointing which category to attribute the unheard or obscured line to becomes quite difficult, as if our illustrative circle had momentarily folded in on itself. And there, I argue, is where the peculiarity of the climax of Lost in Translation lies: pinned over the shoulder of two people whose love for each other is mutual but unconsummated, we cannot hear what they hear, or even what a proximate stranger on that Tokyo street might, or what a film that played according to convention would contrive to supply.51

    Conclusion: An Irrational Sway. The voice, as we've elaborated, is the most privileged component of the soundtrack, and for good reason. In the same way that the mimetic representation of the human figure on-screen maintains an irrational sway over us, the voice, too, carries the uncanny power to assure us that, somewhere, these apparitions on the screen have a corporeal existence. It is not surprising, then, that the voice has attracted a great deal of commentary.

    One such area of fascination is the disembodied voice: heard yet not sourced visually. This study has sought to uncover the inverse circumstance: the on-screen body whose speech we are denied. This, like the ghostly off-screen voice, can be an unsettling experience for filmgoers, for it may simultaneously violate our phenomenological experience of reality while defying the conventional sound-image relations

    50 Lastra, Sound Technology, 159–162.

    51 Contrast the conclusion of Brick (Rian Johnson, 2005), wherein the teenaged femme fatale whispers an unheard line into the protagonist's ear. Though not audible, the content of the line is so thoroughly implied in the preceding dialogue that we need not hear it. Laura's (Nora Zehetner) whisper is initially framed in a close two-shot before the film cuts to a long shot. Thus, it is a whispered voice-out that, as in Lost in Translation, occurs at the film's conclusion yet is not experienced as a violation of either narrative or spectator positioning norms. It therefore stabilizes the verbocentrism that Murray's whisper in Lost in Translation works to make fluid and uncertain.


    we have been conditioned to expect. Therefore, what I've termed the voice-out is a significant theoretical blind spot in media studies, one that this article has attempted to remedy by sketching a preliminary set of terms adequate to discuss the phenomenon with the precision it demands.

    The model I have laid out here, however, is by no means exhaustive or rigid. I do not mean to suggest that all moments of unheard speech can be wrangled neatly into one of my ten categories. Rather, the voice-out typology is intended as an analytic tool for the phenomenon of inaudible film speech: it is neither prescriptive nor definitive, and hence, it is not carved in stone. My aim is for this model to serve as a preliminary guide for the consideration of a largely unaddressed yet surprisingly common audiovisual practice. Each of the types invites refinement and elaboration, for there exist examples that complicate my typology (the recent film The Artist [Michel Hazanavicius, 2011], for instance, moves adroitly among my classifications), and there are likely varieties of unheard voices that I have yet to consider.52 What do we make of characters who are mute? How should we grapple with the unsure ontological realm of the voice in animated texts? New categories are therefore possible.

    Suffice it to say, the voice-out, as a device, achieves a variety of ends: it may add to a film's verisimilitude, depict character interiority, hark back to the silent cinema, or demarcate points of radical departure from classical norms of self-effacement. In short, the voice-out is, despite its supposed status as taboo, a commonly practiced yet little-studied tactic that is utilized in remarkably diverse ways.53 On the one hand, it runs counter to the predominant verbocentrism of the cinema by shaking free from the constraints of perpetual clarity that Chion suggests in the epigraph that opens this article; on the other hand, it violates our phenomenological experience of reality, of the relationship between character audibility and camera distance that Lastra describes as phonographic point of audition. The voice-out, particularly in its more unusual and unconventional applications, effectively mines the paradox that is the unheard voice in the sound film. It is my hope that, in granting the voice-out its due attention, we may further elucidate the complex and multifarious interactions of sound and image, cinema's raw materials that are ceaselessly combined to entertain, to enthrall, and to confound.

    Thanks to Greg M. Smith and Donald Crafton for their thoughtful comments on earlier versions of this article.

    52 Similarly, 2001: A Space Odyssey (Stanley Kubrick, 1968) is a film that testifies to the slipperiness of the voice-out. In the scene in which Poole is killed while outside the spacecraft, Kubrick removes all sound entirely. This would suggest, then, abandoned sound. However, given that there is no sound in space, it actually functions as a verisimilar voice-out. My thanks to Angelo Restivo for reminding me of this scene.

    53 Chion, Audio-Vision, 170.