Speak the Middle Tongue – Take the Forked Road: A Theory of the Voice – 1


Susan Gevirtz: areodrom orion


My talk addresses the use of ‘voice’ philosophically. My comments therefore pertain to film and all forms of sonic media in general. To get us in the spirit of my talk I begin with a few seconds from the Tuvan throat singer Borbannadir.


The grain of the voice, Barthes tells us,


To comprehend this definition, we must rigorously avoid a misunderstanding; that, to signify something is to communicate it. To ward off that misapprehension we must answer the following questions:


How are we to understand materiality and significance, and their relation? What does the italicization of significance connote? Barthes’ use of materiality is straightforward; it refers to “the sonic effects of the tongue, the glottis, the teeth, the mucous membranes, the nose;” [183] to the “body in the voice as it sings, the hand as it writes, the limb as it performs.” [188] These embodied forces determine the diction of enunciation, which constitutes its ‘grain,’ that allows us to recognize the identity of a speaker when he/she speaks.


Barthes theoretically allies the grain with the geno-song, a biological, materialist concept he transposes to music from Kristeva’s linguistic analog – the geno-text – individual works of pheno- or species-text exemplified by genres like romance or science fiction texts. The geno-song is defined thus:

[it] is the volume of the singing and speaking voice, the space where significations germinate ‘from within language and in its very materiality’; it forms a signifying play having nothing to do with communication, representation (of feelings), expression; it is that apex (or that depth) of production where the melody really works at the language – not at what it says, but the voluptuousness of its sound-signifiers, or its letters – where melody explores how the language works and identifies with that work. [182]

Significance, then, is italicized to warn us against mistaking it for communication, representation, or expression. It is what works at language but not at its meaning. Meaning, for Barthes, is a product of the reductive forces of the pheno-text by which culture enforces limits to understanding, to significance, by “reconcil[ing] the subject to what in music can be said: what is said about it, predicatively, by Institution, Criticism, Opinion,” by which he means the codes of langue that always precede and police the voice. [185] Significance of the grain, Barthes acknowledges in a passing parenthetical allusion to another of his foundational texts, derives its value when the text emerges in the work.

In order for a voice to have grain, to have non-communicative significance, it must break the codes of works, of pheno-songs and pheno-texts, and emerge into the text of the geno-song through listening to the relation of the body of the speaker, singer, or player. The relation, he tells us, also transcends individualistic subjectivity because it is erotic and physiological – it is not a psychological subject who sings or listens, the voice is not an expression of any subject, but its dissolution, produced outside of the laws of culture, beyond the valuations of ‘I like’ or ‘I don’t like.’ [188]

We are all familiar with these poststructuralist themes: text vs work, authorial death, corporeal jouissance, the politics of language. My intention here is to remind us of what the stakes were then, (and which I strongly believe are still relevant today), in putting significance beyond reductive codes that normalize meaning, and in relation to a decidedly non-subjective but still embodied materiality. Sound, music, voice, all have the critical capacity to de-subjectivize, to break free of the narrow circuits through which the codes of culture restrict most of us to ‘expressing’ only individuality. The politics of the grain of the voice aims to produce collective responses against liberal humanism’s ‘voice’ that atomizes the social through its infinite and repetitive broadcast of bankrupt individualist, subjectivist fantasies. While this claim may appear hyperbolic, it is what Barthes intended, and accurately represents the political aims of much poststructuralist thought – to free our voices from a double dominance; from the dominance of normalizing pheno-songs on the one hand, and on the other, from communications of reductive meaning, between singers and listeners.

I now turn briefly to Derrida’s analysis of ‘monolingualism’ as a means by which to define voice, not as a work but, by analogy, as a ‘text.’


Monolingualism opens by staging the scene of a dialogue about “performative contradictions.” The umbrella formulation of the contradiction reads: “I have only one language; it is not mine.” [1] The contradiction, thus formulated, may easily be justified. In Kristeva’s terms, the contradiction lies between pheno-text and geno-text; for Barthes’ analysis of the voice, between pheno-song and geno-song. I propose here another dichotomy between pheno-voice and geno-voice, which I will formulate in a moment. Each of these oppositions is a species of the Saussurean legacy; parole refers to having one language through one’s ability to speak it; while langue refers to the impossibility of ever possessing a language at all because it forever supersedes the capacity of speech. A second defense of Derrida’s performative contradiction lies in the dialogic; in the reciprocal bidirectionality of speaking/listening, in the inevitability of networks of co-interpolations; signifying networks in which speakers call listeners into being, and vice versa. Recognizing that the stakes of ‘giving voice to’ involve communication, but not necessarily significance – what is it that is given voice, for what purpose, and by what means? – we quickly find that a long series of antinomies unfolds here: truth and lie, confession and judgment, proof and construction, monologue and dialogue. This partial series demarks some of the undecidable parameters of “performative contradiction,” and therefore of what the act of giving voice signifies.


This series of antinomies also allows us to suggest initial definitions of pheno- and geno-voices. [1] The pheno-voice is an agent of communication whose role is to negotiate contradictions in the act of giving voice, in its performance of irresolvable contradictions whose function is not to generate, but to suspend, meaning. [2] The geno-voice arises in a particular network of interpolations as an advocate for some vector of signification; it is always a product of objective conditions both exogenous and endogenous to it. [3] This means that the geno-voice is an assemblage of all other voices (the text emerging in a work) that have responded to the call of a given performative contradiction.


A voice is an agency that advocates by calling for a particular significance for a given network of co-interpolations. It is a polygenetic agency, a call without a singular agent, without identity – polyphonic, polyglot, and polyatomic – a complex network of nodes and characteristics resonating along the vocal scale of the network of terms shown here on the paradigmatic axis.

To put this in the precise language of performative contradiction, a geno-voice is not more objective than subjective. In Barthes’ terms paraphrased above: it is produced by listening to the relation of the body of the speaker, singer, or player; with the additional qualification of plurality or relations among speakers, singers, and players. A voice interpolates values, and is therefore essentially, necessarily a sociopolitical agency called into being through its encounters with antinomies, with the Other that is both exogenous and endogenous to it. To recognize that we BOTH always and never speak only one language, that language is BOTH ours and not ours, is to know, as Gevirtz has so beautifully put it, that when we give voice to anything at all, it is to speak with the middle tongue and to take the forked road.

I now turn to some examples to illustrate this theory of the voice. In 1979, the Caribbean poet Edward Kamau Brathwaite gave a remarkable lecture at Harvard entitled, History of the Voice, published in 1980 in book form. The aim of this work was to show how Black Caribbean poetry emerged simultaneously with and through the musical geno-songs of Calypso and Reggae with the conscious intent of distinguishing itself from the dominance of British English taught in Jamaican schools. His lecture demonstrates in extraordinary detail how these geno-songs are based in the materiality of the Caribbean environment, like its weather, and from the polyglot patois of black Jamaican idiom, in order to resist the master’s poetic language exemplified by the Shakespearian metrical model of iambic pentameter. I’ll cite here only one of his many examples, in which he compares Shakespeare to his own poetry, to illustrate this.

He describes the difference between these lines in this way:

…not only is there a difference in syllabic or stress pattern, there is an important difference in shape of intonation. In the Shakespeare…, the voice travels in a single forward plane towards the horizon of its end. In the kaiso, after the skimming movement of the first line, we have a distinct variation. The voice dips and deepens to describe an intervallic pattern.

Barthes’ concept of the geno-song is readily apparent here; the “shape of intonation,” the dipping and deepening of the voice, and the emphasis on the “intervallic pattern,” are all terms which stress embodiment. Based on this type of polygenetic, poetic analysis, he then elaborates the concept of “nation language:”

First of all, it is from, as I’ve said, an oral tradition. The poetry, the culture itself, exists not in a dictionary but in the tradition of the spoken word. It is based as much on sound as it is on song. That is to say, the noise that it makes is part of the meaning, and if you ignore the noise (or what you would think of as noise, shall I say) then you lose part of the meaning. When it is written, you lose the sound or the noise, and therefore you lose part of the meaning.

[History of the Voice, 17]


Significance, meaning, is co-created in the intervallic spacetime between the material recitation of spoken/sung sound and song; it arises as much from noise as from written language. But nation language also demonstrates the polygenetic performances of contradictions that are the fundamental condition of monolingualism, what Brathwaite calls the “total expression” that determines the uniqueness of Caribbean poetry, that is based on:

oral tradition … [that] demands not only the griot but the audience to complete the community: the noise and sounds that the maker makes are responded to by the audience and are returned to him. Hence we have the creation of a continuum where meaning truly resides. And this total expression comes about because people be in the open air, because people live in conditions of poverty (‘unhouselled’) because they come from a historical experience where they had to rely on the very breath rather than on paraphernalia like books and museums and machines. They had to depend on immanence, the power within themselves, rather than the technology outside themselves. [original emphasis]


Total expression is performed in the antinomic space between listener and singer, between the voice and the call; that is to say, the singer is also and simultaneously a listener just as the listener is simultaneously a singer. Singer and listener reciprocally interpolate one another, but do so relative to the historically unique immanence of breath, poverty, and Jamaica’s material environment. It is the particular material conditions of its island state that call forth the particular vector of significance as both noise and language, and that give polygenetic voice to the particular Caribbean agency without agent of nation language. This polygenetic voice is a corporeal agency that embodies a diction recognizable as what Bourdieu has called the collective-individual, and Spivak has called the planetary subject – subjects that resist social atomization because they listen to the call of total expression, and learn to give voice to the network of co-interpolations.

We need a new term for such an agency without agent. Gevirtz has given us a profound image in which to locate it; giving voice with the polygenetic middle tongue so that we can take the forked roads between noise and meaning, between listening and speaking, between giving voice and calling, in order to complete the community of the geno-voice. For her, the model derives from “the virtual space of all the tele-technosciences, in the general dis-location to which our time is destined…”, as Derrida has described the event that governs communication today. Remember that significance doesn’t necessarily imply communication of linguistic meaning; that geno-songs and geno-voices aim to destabilize, to dislocate meaning in the networks of co-interpolation.

What I will next demonstrate, using the work of the American poet and performance artist, David Antin, is that giving voice to something in the tele-technoscientific virtual space of communication today takes the form, as Gevirtz’s poetry brilliantly models, of broadcast. As she puts it in a poem entitled, “Prosthesis:”

The Voice Speaks to its own mouth
and also from a speech external.

Antin has developed his work with voice in the medium he polemically and with deliberate understatement calls simply, talk. Antin does not write; he speaks improvisationally before live and radio audiences, records and then transcribes his talk verbatim; these talk transcriptions are then arranged on the page free of the formalities of proper written conventions such as punctuation and standardized paragraphing and sentence structure, making use of elliptical spaces between talk-fragments, then published as texts. As with nation language poetry, the immanence of the occasion of his talk performances is fundamental.

David Antin, Tuning, New Directions, 1977

Each published talk is preceded by a brief introductory written text that describes the circumstances of the talk, in a voice that is as direct and familiar as the talk-texts. These introductions are polysemic in that they reflect on the polygenetic significances of the talk mise en scène. Their function is to draw the reader into the occasion’s immanence, and they are analogous to Brathwaite’s notion of total expression; they historicize, locate, conjure up a past-present moment.

Antin, 1977

In 1977 he performed a talk entitled, Tuning, which may be thought of as his aesthetic manifesto. Radio broadcasting is clearly one of his intended paradigmatic registers, as in the phrases, tuning the radio, tuning a musical instrument, to tune in or out, and in the colloquial expression of song, a tune. His intent is to make an alliance with everyday practices of non-narrative conversation, as at a dinner table. He is resolutely not a storyteller; we might think of him as pursuing the forked paths between a radio or TV commentator and an essayist like Montaigne. His overall intent is, to use an awful neologism, to de-literature-ize, through the presencing act of talking, and by using the poetic equivalent of jazz improvisation; each talk piece is a consummate example of a geno-song, performed rigorously in a geno-voice.

Antin begins to tune the middle tongue, to tune into the forked zone of the antinomy, exogenous/endogenous; he ‘calls’ his talk into existence as talk and immediately tunes his listeners exogenously to be complicit in the freedom from expectation, and endogenously to another antinomy – the personal, contradictory zone of generosity versus “self”-indulgence. He immediately creates a dialog about ethics, drawing speaker and listener onto the same forked path, while putting them in the explorer’s frame of mind, the erotic frame of wanting to know. He suspends meaning, suggesting that ‘communication’ is necessarily a negotiation and never certain or resolved because it requires consensus.

Antin’s politics of tuning is similar to Barthes’ grain of the voice; its aim is to produce, as suggested above, a collective resistance to the liberal humanist ‘voice’ that atomizes the social through its infinite, repetitive broadcast of bankrupt subjectivist fantasies, which can never be generous, and only ever “self”-indulgent. Tuning is, therefore, a tuning to collective listening/speaking of polygenetic voices, to giving that amalgam a collective chance. An assessment that “seems only fair.” Talk becomes the preeminent ‘record’ of urgency because it’s immanent, and a demand for judgments that must be made now, because we listen to his call and are thus made complicit in what ‘we’ all desperately need to address. Listeners will be taken seriously because they are witnesses to a polygenetic vocal event to which they have been called to participate and judge. And because they listen, they are ethically responsible. Antin’s talks literally produce extra-legal courtrooms in which listener/responder voices will be heard, judged, and acted upon.

Antin’s aim has been to free our voices from a double dominance through listening to talk; from the dominance of normalizing pheno-songs on the one hand, and on the other, from communications of reductive meaning, between singers and listeners. His appeal is to “some sense of urgency out there           a passing police car           they     have an audience   they have an audience and a need          and they may respond to it badly                      but they have their sense of urgency”      He suppressed perhaps saying, ‘at least.’

Antin calls us to the urgency of what we do not yet know, but could; not to the unknowable event itself, which is just that – incomprehensible. Sight is overrated; sound is more singular and so a more reliable form of evidence. So talk is far more believable than writing because ‘we’ can experience it collectively and simultaneously, and agree or not, on what the voice says, even when, and perhaps especially when, we misunderstand. Antin’s method is to mediate these disparate ethical calls.

To round out my ‘talk’, I’ll now turn briefly to Samuel Beckett’s “Rough for Radio I,” written originally in French in 1961, but not published in English until 1976 as “Sketch for Radio Play” in Stereo Headphones, no. 7. This work has been largely ignored by Beckett critics, in part, because Beckett himself considered it surpassed by his later radio works, particularly by “Cascando,” 1962. “Rough for Radio I,” however, if less developed in aesthetic terms than his later works, is far more relevant to my theoretical discussion here, because it addresses what a first encounter with radio may have been like. I don’t have time here to discuss this short but complex work in much detail, and will highlight only a few of its elements. The two main characters are simply called HE and SHE. HE has invited SHE to his flat for reasons that are never made completely clear, but, as SHE says: “I have come to listen.” In the first part of the work, the dialogue between the two characters establishes the scene and state of HE’s mind – the flat is dark and cold, HE is troubled and responds irritably and with barely restrained hostility to SHE’s concern for him and interest in the event. As SHE says: HE has suffered her to come. The event that unfolds only very slowly and haltingly depicts SHE’s first encounter with both operating a radio, learning to tune it, (knobs must be twisted not pushed), and with its transmission only of voices on some channels, only music on others, and both simultaneously playing on still others. SHE’s response is of incomprehension and astonishment. SHE cannot understand the relation between the voices and music, cannot understand where they are, if they are together or not, why they cannot be seen or see each other, or whether or not they are in the same situations.

Beckett depicts HE as equally unable to comprehend the experience of listening to the disembodied voices and music; HE does not understand SHE’s question – “Are they in the same… situation?” But when she modifies her query to – “Are they… subject to the same… conditions?” – HE replies, “Yes, madam.” Beckett suggests that HE has been traumatized by listening to the radio, and that HE has become addicted to the experience of listening. SHE asks: “… you like that?” HE responds: “It is a need.” At this point, SHE leaves the flat.

In the second part of the radio play, HE makes two successive telephone calls to his doctor, but reaches only the latter’s secretary. The secretary’s voice is never heard; her comments can only be surmised from HE’s answers. During the two calls, the radio alternately plays voices and music simultaneously. The sonic effect is palimpsestic, yet riddled with the secretary’s silent responses and HE’s silent pauses as he listens to her, and to the radio. HE is very agitated, in a state of panic, describing his situation as “most urgent.” The radio music and voices become increasingly faint, and HE is now terrified that they will completely stop. “They’re ending,” he tells the secretary, and with terror shouts, “ENDING.” HE for a moment imagines that the voices and music will come together, then realizes that that is impossible. “…how could they meet?”, he asks. The secretary, after apparently asking: Isn’t that what all last gasps are like?, hangs up abruptly, or the connection is accidentally lost. But she then calls back immediately, the music and voice are heard together, though fading, and finally cease. Against this sonic background, HE’s replies to the secretary reveal that the doctor is unable to come until the next day because he has to attend to two births, one of which is breech.

What, then, are we to make of “Rough for Radio I”? My view is that it is Beckett’s critique of broadcast’s power to alienate, to create trauma, panic, and psychosis through its enforcement of isolation of individuals from each other. The disembodied voices produce a general sociocultural condition of disembodiment, nothing less than a psychotic historical rupture in the human condition. It is an allegory of social breakdown, the breakdown of relations between HE and SHE, between HE and the doctor, between HE and his wife who has left him; in general, the breakdown of human relations caused by the advent of the tele-technoscientific, virtual, sonic space of communication, represented by both radio and telephone. That the musicians and speakers are isolated from each other, will never be able to inhabit the same situation, never come together, is what explains why they are subject to the same conditions, those of alienation. This interpretation is reinforced at the work’s end by the now thankfully rare condition in which women once gave birth – confinement. The radio confines its listeners to their living rooms, just as the telephone opens up a global virtual space of sonic alienation, with speakers/listeners reduced to disembodied voices on either end of the telephone line, and just as the broadcast musicians and speakers are confined to separate channels. “Rough for Radio I” is a grim depiction of the death of the communities of nation language and its total expression that both Antin and Brathwaite work to restore. Antin’s urgency is powerfully figured here by two possible types of birth; will a post-broadcast humanity be born in its confined condition safely, or, literally inversely, by breech? Beckett no doubt intended breech to be understood homophonically – as much a breach of law, code, and most importantly, a breach of relations, as a breech birth, which equalizes the potentials of life and death.


“Rough for Radio I,” then, goes into this breach, as a work of broadcast about broadcast, in formal terms; but much more importantly, in affective terms, it is an attempt to produce a condition of immanence in listeners through identification with HE and SHE. In other words, Beckett’s work represents the alienation between the radio’s speaker/voices/music in order to produce an understanding in listeners that they are, actually, literally, those characters. HE and SHE are literally voices in their heads, and their material signification emerges there, in the reciprocal events of listening/hearing, and hearing/listening. In this sense, “Rough for Radio I” speaks with the middle tongue and takes the forked road between listening and giving voice, between HE and SHE, between patient and doctor, between music and voice, and finally, between the birth and death of humanity as it listens to and speaks of the historically inscribed conditions of broadcast modernity. Beckett’s remarkable allegory of immanent urgency is a consummate performance of contradictions that demonstrates our monolinguistic fate – that though we speak only one language, it can never belong to us.

By way of summary, I will end by citing a few lines from Gevirtz’s poem, “Prosthesis,” already referred to, which is, and better because realized rather than merely theoretically speculative, another revision of the concept with which I began, the grain of the voice:



