A Robot's Guide to Singing

Published: Apr 2, 2024

The human voice production system can be organized into three different subsystems:

  • Generator
  • Vibrator
  • Resonator

The generator provides the energy for the vibrator, which creates frequencies that travel through the resonator. We’ll look more closely at the role and mechanics of each subsystem below. Naturally there is interplay between these systems; however, for optimal navigation of an individual’s vocal output space, one should be precise about which subsystem’s variables to modify.

In addition, the whole system functions best when unimpeded by unrelated tissue and muscle. This is best accomplished by proper alignment of the skeletal system.

During vocalization there is a tendency to make modifications to the voice production system that have no effect on the desired output. These superfluous modifications can even have a negative impact, so it is important to be aware of extraneous actuations.

Generator

In the human body, the generator consists of all the structures below the larynx/vocal folds. The goals of this system are twofold:

  • Maximize lung volume to maximize the quantity of sound generating mass
  • Manage subglottal breath pressure to maximize efficiency of sound generating mass

Both goals can be accomplished by proper utilization of the skeletal-muscle system associated with the lungs.

It can be beneficial to imagine lung volume as a cylinder, where the intercostals and obliques modify the radius of said cylinder and the abdominal muscles and diaphragm manage the height. Naturally the analog reality of the skeletal-muscle system doesn’t perfectly align with this description, but it is a good mental model for simplifying the purpose of the generator subsystem.
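The cylinder model can be sketched in a few lines. This is only an illustration of the mental model, not a physiological simulation: the function name and the resting dimensions are hypothetical, chosen so the arithmetic is easy to follow.

```python
import math

def lung_cylinder_volume(radius_cm: float, height_cm: float) -> float:
    """Volume of the idealized lung cylinder, in cubic centimetres."""
    return math.pi * radius_cm ** 2 * height_cm

# Hypothetical resting dimensions, chosen only for illustration.
rest = lung_cylinder_volume(radius_cm=10.0, height_cm=20.0)

# Expand the radius by 10% (the intercostals and obliques in the model).
wider = lung_cylinder_volume(radius_cm=11.0, height_cm=20.0)

# Expand the height by 10% (the abdominal muscles and diaphragm in the model).
taller = lung_cylinder_volume(radius_cm=10.0, height_cm=22.0)

print(f"radius +10%: volume x{wider / rest:.2f}")
print(f"height +10%: volume x{taller / rest:.2f}")
```

Within this simplified model, volume grows with the square of the radius but only linearly with the height, so the same relative expansion buys more volume when applied to the radius.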

During typical autonomous breathing, the shoulders elevate during inhalation and settle during exhalation. This engages the upper trapezius, neck, and shoulder muscles, which can limit laryngeal mobility and stability.

For brevity’s sake, we won’t be exhaustive on the interplay between the vocal production systems. It is sufficient to be conscious of how the skeletal-muscle system can interact with multiple systems of voice production.

Vibrator

In the human body, the vibrator consists of the vocal folds, which are located in the larynx. The vocal folds function similarly to a reed in a woodwind instrument; however, instead of varying the length of the air column as one would on a clarinet, the human modifies the length and tension of the vocal folds. The goal of the vibrator subsystem is to generate frequencies for use by the resonator subsystem.
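As a rough way to reason about length and tension together, one can borrow the ideal string formula, f = sqrt(stress / density) / (2 * length). Treating a vocal fold as an ideal string is an assumption made here for illustration, not something this guide derives, and every number below is a placeholder rather than a measurement.

```python
import math

def string_pitch_hz(length_m: float, stress_pa: float, density_kg_m3: float) -> float:
    """Fundamental frequency of an ideal string: sqrt(stress / density) / (2 * length)."""
    return math.sqrt(stress_pa / density_kg_m3) / (2.0 * length_m)

# All values are illustrative placeholders, not physiological data.
DENSITY = 1040.0  # kg/m^3, assumed tissue density

# A shorter, slacker fold.
relaxed = string_pitch_hz(length_m=0.014, stress_pa=10_000.0, density_kg_m3=DENSITY)

# Stretching lengthens the fold a little but raises its stress a lot.
stretched = string_pitch_hz(length_m=0.016, stress_pa=40_000.0, density_kg_m3=DENSITY)

# Extra length alone, with stress held fixed, would lower the pitch.
longer_only = string_pitch_hz(length_m=0.016, stress_pa=10_000.0, density_kg_m3=DENSITY)

print(f"relaxed: {relaxed:.0f} Hz")
print(f"stretched: {stretched:.0f} Hz")
print(f"longer only: {longer_only:.0f} Hz")
```

Under this assumption, the pitch change that comes with stretching is driven by the rise in tension, which outweighs the effect of the added length.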

Elongating the vocal folds produces higher pitches; shortening them lowers the pitch. Thickening the vocal folds results in heavier registration, whilst thinning them lightens the registration. One can imagine thinner vocal folds producing near-sinusoidal waveforms, whilst thicker folds produce waveforms with stronger harmonics above the fundamental.
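One way to picture the contrast between a near-sinusoidal tone and a harmonically heavier one is to build each from a fundamental plus overtones whose amplitudes fall off at different rates. The power-law rolloff, the rolloff values, and the function names below are assumptions made for illustration; they are not a model of real vocal fold behavior.

```python
def harmonic_amplitudes(rolloff: float, n_harmonics: int = 20) -> list[float]:
    """Amplitude of harmonic k is 1 / k**rolloff, where k = 1 is the fundamental."""
    return [1.0 / k ** rolloff for k in range(1, n_harmonics + 1)]

def fundamental_energy_share(rolloff: float) -> float:
    """Fraction of the total signal energy carried by the fundamental."""
    energies = [a * a for a in harmonic_amplitudes(rolloff)]
    return energies[0] / sum(energies)

# Steep rolloff: overtones die away quickly, leaving an almost pure sine.
light = fundamental_energy_share(rolloff=4.0)

# Shallow rolloff: overtones stay strong, giving a heavier, buzzier tone.
heavy = fundamental_energy_share(rolloff=1.0)

print(f"light registration: {light:.1%} of energy in the fundamental")
print(f"heavy registration: {heavy:.1%} of energy in the fundamental")
```

In this sketch the pitch is identical in both cases; only the distribution of energy across the harmonics changes, which is the quality that registration describes.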

Contraction of the thyroarytenoid shortens and thickens the vocal folds. Conversely, contraction of the cricothyroid lengthens and thins the vocal folds. Together these form an agonist-antagonist pair, not dissimilar to the relationship between the quadriceps and the hamstrings.

There are vocal modalities beyond our typical range. We’ll speak only to falsetto, which is a vocal register above the normal range. The machinations for producing tones in the falsetto register differ from those of the normal register and vary between humans. The common property among the methods of production is that only the edges of the nearly or completely shut vocal folds vibrate. This differs from vibration in the normal vocal register, which occurs along the entire vocal fold. For the curious reader, other modalities include the vocal fry and whistle registers. It is beneficial to practice blending modalities for smooth transitions between them.

Note that the vocal folds have no sensory mechanisms. Damage cannot be felt; it can only be heard. As a corollary, no pain should be felt during singing.

Resonator

In the human body, the resonator consists of all the structures above the vocal folds/larynx. This includes the throat, tongue, mouth, and nasal cavity. The structure of the resonator creates the unique qualities of a voice and enables the distinct qualities of language. The goals of this system depend on the desired output qualities of the vocal production system.

We can optimize the functionality of the subsystem’s internal cavities by maximizing their volume. This gives us more options for modifications in the space. It can be accomplished in a myriad of ways. For now, we’ll note the most culturally familiar option: emulating a yawn by raising the soft palate and lowering the tongue and jaw (without substantially limiting laryngeal mobility). Note that we are maximizing our options for vocal production; during production we have the choice of anything within our capabilities.

Other prominent structures in the resonator subsystem are the lips and teeth, which are controlled by facial muscles and the jaw respectively.

It should be noted that we cannot accurately hear ourselves. Evaluation of tonal quality can only be performed by an external listener.

Disadvantages of the Human Creature

One drawback of managing the human vocal production system is that some necessary machinations are not directly accessible from the brain. Previous literature suggests that such machinations can be accessed through “mental imagery”, but the correlation between images and their mechanical output must be derived on a case-by-case basis.