
Voice and Sound Processing:

Examples of Mise en Scene of Voice in Recorded Rock Music

Serge Lacasse

I. Introduction

We already know how important technology is in popular music. Not only has technology allowed us to create sound sources that never existed before (and that are therefore peculiar to the technology), such as the electric guitar or the synthesiser, but it has also provided artists with yet another musical instrument: the recording studio. First thought of as a means to capture and store sound, recording techniques have gradually become creative tools in their own right, which has in turn changed our perception of engineers and producers from 'technicians' to 'artists'.[1] As in electroacoustic music, the recording studio has become a place where artists literally shape sound. This new perspective has given rise to a wide range of aesthetic trends in sound recording, which can easily be identified as indicators of particular musical styles.

The emergence of recording aesthetics requires that the musicologist take into account sonic manifestations stemming from the use of technology. Unfortunately, traditional musicological training does not prepare the student for such new manifestations. Indeed, Richard Middleton points out that:

[traditional] musicological methods [...] tend to neglect or have difficulty with parameters which are not easily notated: non-standard pitch [...]; irregular rhythms [...]; not to mention new techniques developed in the recording studio, such as fuzz, wah-wah, phasing and reverberation.[2]

Susan McClary and Robert Walser go further when they write that:

[...] developing methods for getting at those overlooked dimensions requires not only noticing them, but also constructing a vocabulary and theoretical models with which to refer to them and to differentiate among them. [...] Yet inasmuch as popular music defies being explained in terms of harmonic structure and insists on these other parameters, the musicologist who wishes to make sense of this music must come to terms with these uncharted areas for which there is no shared critical apparatus or language.[3]

In previous research I have referred to these 'new parameters' as 'technological musical parameters'.[4] It is, however, misleading to equate technology with novelty, because a large number of effects produced with today's technology seem to derive from much older practices. For example, Iégor Reznikoff's research into the resonance of prehistoric caves containing rock paintings suggests that such resonance was used by our ancestors to add a sonic dimension to their rituals.[5] Antoni Gryzik suggests that this form of mise en scène of the human voice has continued throughout history:

In order to assume their god's invisibility, some peoples (such as Mayas and Egyptians) have developed a kind of auditory ritual; an invisible dramaturgy composed of sets of unusual reverberations, which allowed the priest to impress the audience without being seen.[6]

History abounds in such examples of vocal mise en scène, which occurred in many places long before the arrival of recording technology. But this is another topic that would need to be examined elsewhere.

The aim of today's paper is to present a way to approach some technological musical parameters. In order to do so, I will concentrate on the voice in rock music recordings. I will then try to show that when these parameters are taken as voice 'modulators', it becomes easier to understand the meanings that can emerge during the listening process. Moreover, the voice being the instrument which carries the lyrics, I will try to foreground the relationships between the sung words and some technological parameters. But first, I would like to clarify some notions concerning the mise en scène of the recorded voice.

II. Mise en scène of the Recorded Voice

A. Definitions

By 'vocal mise en scène' is meant the way in which a recorded voice is presented to someone hearing a recording. By 'vocal setting' is meant a particular mode of presenting such a recorded voice. For example, it would be possible to pan a voice to the far left with plenty of reverb; that would constitute a particular vocal setting. In a stereophonic listening situation, the listener would indeed have the impression that the voice is coming from the left-hand side and sounding in a large room. On the other hand, in 'mono' (as is still the case with most television sets) the listener would perceive the reverberation, but not the panning effect. In the context of this paper, I am assuming a stereophonic listening situation. This last remark shows how important it is to approach this question from the listener's perspective. Another important reason for insisting on the listener's perspective is of course that some sound treatment techniques (e.g. moderate equalisation or compression) are imperceptible to most listeners. I would now like to turn to more concrete illustrations.

B. Distortion

The first excerpt presents a voice with heavy distortion, and is taken from Nine Inch Nails's 'Heresy', from the album The Downward Spiral.[7]

Excerpt 1: Nine Inch Nails, 'Heresy', The Downward Spiral, INTD-92346, p1994 (00:50-01:08)

I would first like to point out that this type of vocal setting is not unique. We can hear it, for example, in King Crimson's '21st Century Schizoid Man' from their 1969 album In the Court of the Crimson King.

Excerpt 2: King Crimson, '21st Century Schizoid Man', In the Court of the Crimson King, SD 8245, p1969 (second verse)

There are numerous examples of voice with heavy distortion, but let's look at these two for now.

In both cases, distortion seems to support, and even to increase, the level of aggressiveness present in the lead vocal. Furthermore, this aggressiveness is manifest in the lyrics. Those of 'Heresy' read: 'God is dead/And no one cares/If there is a hell/I'll see you there'. For his part, King Crimson's singer describes the following scene: 'Blood rack, barbed wire/Politicians' funeral pyre/Innocents raped with napalm fire'. It is reasonable to argue that even non-English-speaking listeners would 'understand' the aggressiveness expressed by the lyrics with the help of the distortion. Of course, the singer's interpretation is also relevant, but the addition of distortion definitely supports this feeling of aggressiveness.

Results from a recent reception test have helped me determine how effectively distortion communicates a feeling of aggressiveness. The test was conducted on 128 subjects in Québec in April 1997. Listeners were mostly non-musicians aged between 19 and 25. During the test, subjects heard excerpts of a speaking voice. The excerpts all consisted of the same recording of the same voice, presented in eight different settings: with reverberation, with echo, with flanging, etc., and, of course, with distortion.[8] It is not surprising to learn that more than 90% of listeners perceived the voice with distortion as 'quite' or 'very' aggressive. On the other hand, the normal voice was considered 'soft' by 70% of the same respondents.[9] It seems, then, that this kind of vocal setting is quite 'meaningful'. Let's look now at other examples.

C. Echo

Next, I will discuss some cases of echo. But first, we need to make a distinction between 'echo' and 'reverberation', two phenomena that are often confused. Put in simple terms, echo is perceived as the repetition of an original sound event, while reverberation is perceived as a prolongation of that sound event. For musicians such as R. Murray Schafer, this difference is important:

Early sound engineers sought to carry over special acoustic properties [...] into the ziggurats of Babylon and the cathedrals and crypts of Christendom. Echo and reverberation accordingly carry a strong religious symbolism. But echo and reverberation do not imply the same type of enclosure, for while reverberation implies an enormous single room, echo (in which reflection is distinguishable as a repetition or partial repetition of the original sound) suggests the bouncing of sound off innumerable distant surfaces. It is thus the condition of the many-chambered palace and of the labyrinth. But echo suggests a still deeper mystery. [...] every reflection implies a doubling of the sound by its own ghost, hidden on the other side of the reflecting surface. This is the world of alter-egos, following and pacing the real world an instant later, mocking its follies.[10]

It is indeed this same echo which troubled Athanasius Kircher, a seventeenth-century scientist, who wrote in 1673: 'I have thus tried to immobilise this fleeting celestial being and to please him by talking to him [...]. But echo, used to woods and to solitude, carefree and not willing to be tamed, has easily eluded all my efforts'.[11]

Echo, then, has bewitched listeners for a very long time. But why should echo have such a great hold on listeners? Michel Imberty proposes a psychoanalytic explanation. According to him, echo reminds the listener of what Imberty calls 'this inaugural instant of time', i.e. the moment when the child experiences for the first time a separation from the mother. In other words, it is the moment when the subject experiences its first separation from the object: the child then understands that his voice is not his mother's. And, since echo 'is the remaining of something which is not there anymore, irremediably', it illustrates a duality: the nostalgia of a loss and, at the same time, the feeling of one's own existence.[12]

I was thus surprised to hear the following lyrics sung by Bono in the song 'Mofo' from U2's Pop album: 'Mother am I still your son/You know I've been waiting for so long to hear you say so/Mother you left and made me someone/Now I'm still a child but no one tells me No'. Interestingly enough, this passage is quite close to what Imberty is describing: this kind of ambivalence between the loss of something and the desire to become autonomous. But it is even more interesting to know that in the whole song it is the only time we can hear a clear echo on the voice.

Excerpt 3: U2, 'Mofo', Pop, 314 524 334-2, p1997 (2:35-3:16)

Of course, I am not saying that every time we hear echo in a pop song, we are in the presence of some sort of psychoanalytic illustration of a mother-son separation! All I'm trying to do is to foreground the expressive potential held by this particular kind of setting.

But we can also look at echo from a strictly musical point of view. Because of its temporal nature, echo becomes part of the rhythmical structure of the music. Usually, an engineer will set the echo so that its rate corresponds to the beat of the song (quarter notes, eighth notes, triplets, etc.). Indeed, if an engineer were to set the echo at some random rate, we would obtain some rhythmical 'dissonance'. This concern for rhythmic concordance, if you like, was already present in early rockabilly recordings, as is the case with Elvis Presley's 'Baby, Let's Play House', recorded in December 1954 in Sam Phillips's studio in Memphis.

Excerpt 4: Elvis Presley, 'Baby, Let's Play House', The Sun Sessions CD: Elvis Presley Commemorative Issue, 6414-2-R, p1955 (first verse)
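The rhythmic concordance described above comes down to simple arithmetic: an echo repeats in time with the music when its delay equals the duration of the chosen note value at the song's tempo. The following is only a minimal sketch of that relationship, assuming that the beat is a quarter note; the function names and the tempo of 120 beats per minute are hypothetical illustrations and are not taken from any of the recordings discussed here.

```python
# Illustrative sketch only: names and the example tempo are assumptions,
# not measurements from any recording cited in this paper.

# Note values expressed as fractions of one quarter-note beat.
NOTE_VALUES = {
    "quarter": 1.0,
    "eighth": 0.5,
    "eighth_triplet": 1.0 / 3.0,
}


def echo_delay_seconds(bpm: float, beats: float = 1.0) -> float:
    """Return the echo delay, in seconds, that lasts `beats` quarter-note beats."""
    if bpm <= 0:
        raise ValueError("tempo must be positive")
    # One beat lasts 60 / bpm seconds; scale by the chosen note value.
    return 60.0 / bpm * beats


def delay_table(bpm: float) -> dict:
    """Return the tempo-synchronised delay for each note value in NOTE_VALUES."""
    return {name: echo_delay_seconds(bpm, beats) for name, beats in NOTE_VALUES.items()}


if __name__ == "__main__":
    # At a hypothetical tempo of 120 beats per minute.
    for name, seconds in delay_table(120).items():
        print(f"{name}: {seconds * 1000:.1f} ms")
```

A delay that falls between these values would place the repetitions off the beat, which is the rhythmical 'dissonance' mentioned above.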

Middleton writes that:

Elvis Presley's early records, with their novel use of echo, may have represented a watershed in the abandoning of attempts to reproduce live performance in favour of a specifically studio sound; but the effect is used largely to intensify an old pop characteristic - 'star presence' - : Elvis becomes 'larger than life'.[13]

Here Middleton shows how sound manipulation can be aesthetically determining. The addition of echo is representative of a particular aesthetic trend. Moreover, as Middleton points out, the sound of echo intensifies Elvis's presence, as is quite perceptible in the excerpt we've just heard.

In some sections of a remix of Peter Gabriel's 'Kiss That Frog', we can hear quite clearly an echo whose rhythm influences the whole rhythmic structure. It is perhaps no coincidence that so much echo is present in dance music.

Excerpt 5: Peter Gabriel, 'Kiss that Frog (Mindblender Mix)', Kiss that Frog, PGSDF 10, p1993 (1:47-2:43)

It is worth noting that the standard version, which appears on the album Us, does not have echo on it. It is also interesting to note that some phrases have been cut from the remix, presumably in order to leave space for the echo to be properly heard.

III. Conclusion

I hope that this short presentation has shown that it can be important to take technological parameters into account when analysing a song. Of course, I have only talked about a very few cases of particular vocal settings, but I hope that this has been enough to foreground the vast expressive potential held by these parameters alone. From a musicological standpoint, then, I ask for an acknowledgement of these neglected musical parameters in future analyses of rock music.


Gryzik, Antoni (1984) Le rôle du son dans le récit cinématographique. Paris: Minard.

Imberty, Michel (1993) 'L'utopie du comblement : à propos de l'Adieu du Chant de la Terre de Gustav Mahler'. Les cahiers de l'IRCAM : utopies, 4: 53-62.

Jones, Steven G. (1987) 'Rock Formation: Popular Music and the Technology of Sound Recording'. Ph.D. thesis. Urbana: University of Illinois.

Kircher, Athanasius (1994) 'Nouveau traité de l'énergie phonique', in Les cahiers de l'IRCAM : espaces, 5: 15-28.

Lacasse, Serge (1999) 'La mise en scène de la voix en musique rock enregistrée : résultats préliminaires', in Les cahiers de la Société québécoise de recherche en musique 2, 1.

Lacasse, Serge (1995) 'Une analyse des rapports texte-musique dans "Digging in the Dirt" de Peter Gabriel'. M.A. thesis. Québec: Université Laval.

McClary, Susan and Robert Walser (1990) 'Start Making Sense!: Musicology Wrestles with Rock'. In On Record: Rock, Pop, and the Written Word, edited by Simon Frith and Andrew Goodwin. New York: Pantheon Books, pp. 277-92.

Middleton, Richard (1990) Studying Popular Music. Milton Keynes: Open University Press.

Moylan, William (1992) The Art of Recording: The Creative Resources of Music Production and Audio. New York: Van Nostrand Reinhold.

Osgood, Charles E., George J. Suci and Percy H. Tannenbaum (1961) The Measurement of Meaning. Urbana: University of Illinois Press.

Reznikoff, Iégor (1995) 'On the Sound Dimension of Prehistoric Painted Caves and Rocks', in Musical Signification: Essays in the Semiotic Theory and Analysis of Music, edited by Eero Tarasti. Berlin: Mouton de Gruyter, pp. 541-557.

Schafer, R. Murray (1977) The Tuning of the World. Toronto: McClelland and Stewart.

Shea, William F. (1990) 'The Role and Function of Technology in American Popular Music: 1945-1965'. Ph.D. thesis, University of Michigan.


[1] William Moylan, The Art of Recording: The Creative Resources of Music Production and Audio (New York: Van Nostrand Reinhold, 1992), p. 35. One can also refer to Steven G. Jones, 'Rock Formation: Popular Music and the Technology of Sound Recording', Ph.D. thesis (Urbana: University of Illinois, 1987); William F. Shea, 'The Role and Function of Technology in American Popular Music: 1945-1965', Ph.D. thesis (University of Michigan, 1990).

[2] Richard Middleton, Studying Popular Music (Milton Keynes: Open University Press, 1990), pp. 104-105.

[3] Susan McClary and Robert Walser, 'Start Making Sense!: Musicology Wrestles with Rock', On Record: Rock, Pop, and the Written Word, edited by Simon Frith and Andrew Goodwin (New York: Pantheon Books, 1990), p. 282.

[4] Serge Lacasse, 'Une analyse des rapports texte-musique dans "Digging in the Dirt" de Peter Gabriel', M.A. thesis (Québec: Université Laval, 1995).

[5] Iégor Reznikoff, 'On the Sound Dimension of Prehistoric Painted Caves and Rocks', Musical Signification: Essays in the Semiotic Theory and Analysis of Music, edited by Eero Tarasti (Berlin: Mouton de Gruyter, 1995), pp. 541-557.

[6] Antoni Gryzik, Le rôle du son dans le récit cinématographique (Paris: Minard, 1984), pp. 11-12. Translation is mine.

[7] Sound excerpts were played during the paper presentation. Readers are encouraged to listen to the suggested recordings in order to get a full appreciation of the argument.

[8] An excerpt of what listeners heard during the test was played during the presentation.

[9] The method used was Osgood's semantic differential as presented in Charles E. Osgood, George J. Suci and Percy H. Tannenbaum, The Measurement of Meaning (Urbana: University of Illinois Press, 1961). For a complete description of the test, see Serge Lacasse, 'La mise en scène de la voix en musique rock enregistrée : résultats préliminaires', Les cahiers de la Société québécoise de recherche en musique 2, 1 (1999).

[10] R. Murray Schafer, The Tuning of the World (Toronto: McClelland and Stewart, 1977), p. 218.

[11] Athanasius Kircher, 'Nouveau traité de l'énergie phonique', 1673, Les cahiers de l'IRCAM: espaces, 5 (1994): 17. Translation is mine.

[12] Michel Imberty, 'L'utopie du comblement : à propos de l'Adieu du Chant de la Terre de Gustav Mahler', Les cahiers de l'IRCAM: utopies, 4 (1993): 54-55.

[13] Middleton, p. 89. It should be noted that earlier uses of echo exist: Jimmie Rodgers, 'Blue Yodel No. 11' (1929), recorded and produced by Ralph Peer in Nashville, seems to represent the very first instance of the use of echo in recorded popular music.