When we speak, our vocal cords begin to vibrate and create a tone. In order to form words, we move 
muscles in our mouth to filter the sound—to articulate certain vowel or consonant sounds. Open your 
mouth and make a long open “aaaah” sound. Keep your vocal cords vibrating and try to change that 
“aaaah” sound to an “iiiii” sound. Play around with your mouth like Homer Dudley did back in 1928—
move to an open ‘e’ sound (like in the word “bet”) and then to an “oooooh” sound. Notice how all 
the work is done by the muscles in your mouth—your vocal cords are always oscillating at the same 
pitch! The human mouth is in fact an extremely expressive resonant filter. A vocoder seeks to capture 
the movements of the mouth and impart the tonal characteristics of the mouth shape onto a second 
sound (called the “carrier”). In essence, by using a vocoder you are swapping your vocal cords with the 
carrier. You can take any sound (a synthesizer, a drum machine, the sound of an aircraft taking off) and 
shape it with the dynamics of your mouth! 
The Wendy Carlos/Bob Moog vocoder, which is at the heart of Spectravox, was built around a few 
core elements: two 10-band fixed filter banks (with matching frequency bands), 10 envelope 
followers, and 10 voltage-controlled amplifiers. The sound of a human voice (called the PROGRAM) is 
sent to the first bank of 10 filters (the analysis filter bank). The output of each individual filter band is 
sent to an envelope follower, which tracks the amplitude of the voice signal in that particular frequency 
band. The output of each individual envelope follower essentially encodes the motion of a muscle in 
the mouth—the way that muscle is filtering the sound at any given moment. 
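The envelope follower stage can be sketched in a few lines of Python. This is a minimal illustration, not the Spectravox circuit: it assumes a full-wave rectifier followed by a one-pole low-pass filter, and the function name `envelope_follower`, the 30 Hz smoothing cutoff, and the test tone are all hypothetical choices made for the example.

```python
import math

def envelope_follower(signal, sample_rate, cutoff_hz=30.0):
    """Full-wave rectify the input, then smooth it with a one-pole low-pass."""
    # Smoothing coefficient for a one-pole low-pass at cutoff_hz (assumed value).
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    level = 0.0
    envelope = []
    for x in signal:
        # abs(x) is the rectifier; the update pulls level toward the rectified input.
        level += alpha * (abs(x) - level)
        envelope.append(level)
    return envelope

sample_rate = 8000
# Test tone: a 440 Hz sine that is loud for 0.25 s, then quiet for 0.25 s.
tone = [
    (1.0 if n < 2000 else 0.1) * math.sin(2.0 * math.pi * 440.0 * n / sample_rate)
    for n in range(4000)
]
env = envelope_follower(tone, sample_rate)
loud_level = env[1999]   # envelope at the end of the loud section
quiet_level = env[3999]  # envelope at the end of the quiet section
```

The audio-rate sine becomes a slowly moving, non-negative level: `loud_level` comes out roughly ten times larger than `quiet_level`, mirroring the change in amplitude of the tone. A lower cutoff gives a smoother but slower-reacting envelope.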
A second input (the CARRIER) is sent to the second bank (the “synthesis filter bank”). In this filter 
bank, however, the level of each individual filter is controlled by the envelopes from the analysis 
filter bank. The PROGRAM signal creates 10 dierent filter envelopes with the motions of the mouth, 
and those motions are used to control the shapes of the synthesis filter bank which is filtering the 
CARRIER. This is how a vocoder maps the motions of the mouth onto a completely dierent sound!
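The signal flow described above (analysis bank, envelope followers, synthesis bank with per-band level control) can be sketched digitally in Python. This is a rough software analogy under stated assumptions, not a model of the Spectravox hardware: the half-octave band centers starting at 200 Hz, the filter Q of 4, the 30 Hz envelope smoothing, and all function names are hypothetical. The band-pass filter uses the widely published "Audio EQ Cookbook" biquad form.

```python
import math

def bandpass(signal, center_hz, sample_rate, q=4.0):
    """Second-order band-pass (cookbook biquad, unity gain at center_hz)."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0            # b1 is zero for this filter
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def envelope_follower(signal, sample_rate, cutoff_hz=30.0):
    """Rectify, then smooth with a one-pole low-pass (assumed design)."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    level, out = 0.0, []
    for x in signal:
        level += alpha * (abs(x) - level)
        out.append(level)
    return out

def vocode(program, carrier, sample_rate, centers):
    """Impose the per-band envelopes of program onto the carrier."""
    length = min(len(program), len(carrier))
    output = [0.0] * length
    for f in centers:
        # Analysis side: isolate one band of the PROGRAM and track its level.
        env = envelope_follower(bandpass(program, f, sample_rate), sample_rate)
        # Synthesis side: the same band of the CARRIER, scaled by that level.
        band = bandpass(carrier, f, sample_rate)
        for n in range(length):
            output[n] += env[n] * band[n]       # plays the role of the VCA
    return output

def rms(samples):
    return math.sqrt(sum(v * v for v in samples) / len(samples))

sample_rate = 16000
# Ten hypothetical band centers, half an octave apart, from 200 Hz upward.
centers = [200.0 * 2.0 ** (k / 2.0) for k in range(10)]
# PROGRAM stand-in: an 800 Hz tone for 0.25 s, then 0.25 s of silence.
program = [
    math.sin(2.0 * math.pi * 800.0 * n / sample_rate) if n < 4000 else 0.0
    for n in range(8000)
]
# CARRIER: a harmonically rich 110 Hz sawtooth lasting the whole 0.5 s.
carrier = [2.0 * ((110.0 * n / sample_rate) % 1.0) - 1.0 for n in range(8000)]

output = vocode(program, carrier, sample_rate, centers)
first_rms = rms(output[:4000])   # while the PROGRAM is sounding
tail_rms = rms(output[6000:])    # well after the PROGRAM has stopped
```

Although the carrier plays continuously, the output is audible only while the program is active: once the program falls silent, every envelope decays toward zero and closes its band of the synthesis bank. A real voice would drive all ten envelopes independently, which is what shapes the carrier into recognizable vowels and consonants.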
FIGURE: The PROGRAM is connected to the analysis filter bank, which breaks the PROGRAM sound 
up into 10 frequency bands. Each band is sent to an ENVELOPE FOLLOWER which converts the audio 
in each PROGRAM band into a slow moving control voltage. These control voltages are then used to 
control the level of each filter band in a synthesis filter bank. A different sound (the CARRIER) is fed 
through the synthesis filter bank, thus mapping the timbral characteristics of the PROGRAM onto 
the CARRIER.
VOCODERS
UNDERSTANDING FILTER BANKS AND VOCODERS (Continued)
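[Block diagram: the PROGRAM feeds the ANALYSIS FILTER BANK (bands 1-10); each band drives one of the ENVELOPE FOLLOWERS (1-10), which control the matching bands (1-10) of the SYNTHESIS FILTER BANK; the CARRIER passes through the synthesis filter bank to the OUTPUT.]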