When it comes to audio in our videos, most of us just plug in a microphone and hit the record button.
There is more to digital audio than squiggly lines on the timeline. A better understanding of the process will help you record and edit better sound with each project. It's time to dig in and crack the digital audio code. I promise to keep the math to a minimum.
As audio is piped into your camera or computer, it is digitized, a process that converts the sounds we hear into a string of ones and zeros. On playback, the computer reassembles the digits and converts them into something we can hear. Magic, right? Actually, it's more math than magic. To understand the math, we first need to know a couple of key phrases: sampling rate and bit depth. In digital video, the incoming audio is sampled 48,000 times per second (48kHz for short). Each sample is sort of like a snapshot of the sound at that instant in time. By sampling 48,000 times per second, we're assured of a very accurate depiction of our audio. By comparison, CD audio is sampled at 44.1kHz. Some fancy mathematics, commonly referred to as the Nyquist Theorem, explains the choice of sampling rate.
The simple version goes like this: to accurately reproduce a sound, you must sample at a rate at least double its highest frequency. Since human hearing is generally considered to run from 20 to 20,000Hz, we should sample at least 40,000 times per second. A 48kHz sampling rate provides that and more.
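For readers who like to see the arithmetic, here is a minimal Python sketch of the doubling rule. The function names are made up for illustration, and the second function assumes an idealized sampler with no input filtering, so treat it as a thought experiment rather than a model of any particular camera.

```python
def min_sample_rate(highest_freq_hz):
    """Lowest sampling rate suggested by the 'double the highest frequency' rule."""
    return 2 * highest_freq_hz


def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency a pure tone would appear as after idealized sampling.

    Tones at or below half the sampling rate come through unchanged.
    Tones above that limit fold back down to a lower, wrong frequency.
    """
    half_rate = sample_rate_hz / 2
    folded = freq_hz % sample_rate_hz
    return folded if folded <= half_rate else sample_rate_hz - folded


print(min_sample_rate(20_000))          # rate needed for the top of human hearing
print(alias_frequency(1_000, 48_000))   # well inside the limit: unchanged
print(alias_frequency(30_000, 48_000))  # above 24kHz: folds down to a lower tone
```

The last line shows why the rule matters: a tone above half the sampling rate does not simply disappear, it shows up as a different pitch that was never in the original sound.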
If your head's not swimming with too much information, let me complicate things a bit (pun intended). Each sample of audio has a 16-bit depth. Imagine a string of ones and zeros 16 digits long. With 16 bits, there are 65,536 possible gradations to represent the volume of each sample. So for each second of audio, 48,000 samples are taken, each with over 65,000 variants. Surely, that's enough to reconstruct clean, clear audio for your video. While audio for digital video exceeds the quality of CD audio, there are even higher sampling rates and bit depths used in professional audio recording. The new HD-DVD and Blu-ray discs use the Dolby Digital Plus codec, which supports 7.1 channels with sampling rates up to 96kHz with a 24-bit depth. The audio world is constantly changing!
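The bit-depth numbers are easy to check for yourself. This short Python sketch counts the available levels and turns a sample into a 16-bit code. The function names are illustrative, the signed-integer layout is an assumption based on how 16-bit audio is commonly stored, and the decibel figure is the usual rule of thumb of about 6dB per bit rather than a measured spec.

```python
import math


def levels(bit_depth):
    """Number of distinct values a sample of this bit depth can take."""
    return 2 ** bit_depth


def dynamic_range_db(bit_depth):
    """Rule-of-thumb span between the quietest and loudest level, in decibels."""
    return 20 * math.log10(2 ** bit_depth)


def quantize(value, bit_depth=16):
    """Map a sample in the range -1.0..1.0 to the nearest signed integer code."""
    max_code = 2 ** (bit_depth - 1) - 1
    min_code = -(2 ** (bit_depth - 1))
    code = round(value * max_code)
    return max(min_code, min(max_code, code))


print(levels(16))                      # 65536
print(levels(24))                      # 16777216
print(round(dynamic_range_db(16), 1))  # roughly 96 dB
print(quantize(1.0))                   # 32767, the largest positive 16-bit code
```

Going from 16 to 24 bits does not make the audio louder. It simply slices the same range into far finer steps.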
As good as digital video sound is, there are practical limits. For instance, digital audio is very easy to overload or distort. Back in the days of analog tape, you could go a little "past zero" on the meters and the audio would remain unaffected. In the digital world, once all 16 bits are set to "1," there is nowhere else to go. Any volume beyond that point is drastically distorted. Digital distortion isn't the soft buzz or fuzzy sound you get with analog equipment; it's a harsh, nasty sound you definitely don't want on your soundtrack. Some camcorders have Automatic Gain Control, or AGC, which guards against overload by setting the recording level for you. The tradeoff is audio that changes volume automatically, bringing down loud sounds or raising the volume on soft sounds, even bringing the air-conditioner and other background sounds up to the same level. If your camcorder has manual audio levels, learn to read and monitor the audio meters while recording.
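To see why digital overload is so unforgiving, here is a rough Python sketch that pushes a test tone 50 percent past full scale and then forces it back into the 16-bit range. Everything here is illustrative: the helper names are invented, the tone is synthetic, and the meter is a bare-bones peak reading in dBFS (decibels relative to full scale), not a model of any camcorder's meters.

```python
import math

MAX_CODE = 32767    # largest positive 16-bit signed sample
MIN_CODE = -32768   # largest negative 16-bit signed sample


def clip_16bit(sample):
    """Force a sample into the 16-bit range; anything beyond is flattened."""
    return max(MIN_CODE, min(MAX_CODE, sample))


def peak_dbfs(samples):
    """Peak level in dBFS: 0 is the ceiling, quieter audio reads negative."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")
    return 20 * math.log10(peak / 32768)


def sine(amplitude, count=48, freq=1000, rate=48000):
    """One cycle of a 1kHz test tone at 48kHz, as integer sample values."""
    return [round(amplitude * math.sin(2 * math.pi * freq * n / rate))
            for n in range(count)]


too_hot = sine(MAX_CODE * 1.5)             # 50 percent past full scale
recorded = [clip_16bit(s) for s in too_hot]
flattened = sum(1 for s in recorded if s in (MAX_CODE, MIN_CODE))

print(max(recorded), min(recorded))        # pinned at the 16-bit limits
print(flattened, "of", len(recorded), "samples were flattened")
print(round(peak_dbfs(sine(16384)), 1))    # a half-scale tone peaks near -6 dBFS
```

The flattened samples are the tops and bottoms of the wave being sheared off, which is the harsh sound described above. A tone that peaks around half scale leaves about 6dB of breathing room before that happens.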
Manual audio adjustments allow you to get optimum signal into the camera but also open the door to potential distortion. If you're using a mixer, make sure the mic levels are strong, but not too strong. When connecting the mixer to your camera, test the audio level by having your talent laugh loudly or speak in their strongest voice. Use this as your maximum level and don't change it. You'll come back with consistent, clean audio that is easy to edit in post.
16-bit, 48kHz stereo digital audio is about 11MB per minute. That's a pretty hefty file size, especially for music or dialog on longer projects. To save disk space, you may be tempted to use a compressed file format like MP3, AAC or WMA. While most editing software supports various compressed formats, you should resist the temptation. Compressed audio is great for distribution over the Internet or to load on your favorite music player, but it's not quite up to the standard of professional video.
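The 11MB figure falls straight out of the sampling rate and bit depth. Here is a quick Python sketch of the arithmetic, with an illustrative function name and the assumption that a megabyte is one million bytes and that file headers are small enough to ignore.

```python
def pcm_bytes_per_minute(sample_rate=48000, bit_depth=16, channels=2):
    """Size of one minute of uncompressed audio, ignoring file headers."""
    bytes_per_second = sample_rate * (bit_depth // 8) * channels
    return bytes_per_second * 60


video_audio = pcm_bytes_per_minute()                   # 16-bit, 48kHz stereo
cd_audio = pcm_bytes_per_minute(sample_rate=44100)     # CD audio for comparison

print(video_audio / 1_000_000, "MB per minute")        # 11.52
print(cd_audio / 1_000_000, "MB per minute")           # 10.584
print(video_audio * 60 / 1_000_000, "MB per hour")     # 691.2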
Compressed audio formats all use something called perceptual coding. In order to achieve their dramatic size reduction, perceptual coders analyze the audio and decide which portions of the audio can be thrown out, based on the idea that the human ear is not capable of hearing very high and very low frequencies. Compare an original CD track to a compressed version and you'll hear reduced highs, flabby lows and general weirdness in the middle. Another secret of compressed audio is the division of bit rate. A stereo track needs about twice the bit rate of a mono track.
So, your 128kbps MP3s are actually two 64kbps channels. While this may sound fine on the subway or in your car, a video project played on a home theater will quickly reveal the compression artifacts in the sound. If you must use compressed audio in your projects, use the highest practical bit rate.
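To put those bit rates in perspective, this Python sketch compares a compressed file with the uncompressed audio it came from. The function names are illustrative, and the even split between channels is a simplification: real encoders can share bits between left and right, so take the per-channel number as a ballpark.

```python
def pcm_kbps(sample_rate=48000, bit_depth=16, channels=2):
    """Bit rate of uncompressed audio in kilobits per second."""
    return sample_rate * bit_depth * channels / 1000


def per_channel_kbps(total_kbps, channels=2):
    """Bit rate left for each channel if the total is split evenly."""
    return total_kbps / channels


def compression_ratio(compressed_kbps, **pcm_settings):
    """How many times smaller the compressed stream is than the original."""
    return pcm_kbps(**pcm_settings) / compressed_kbps


print(pcm_kbps())                  # 1536.0 kbps for 16-bit, 48kHz stereo
print(per_channel_kbps(128))       # 64.0 kbps per channel
print(compression_ratio(128))      # 12.0, so about 11 of every 12 bits are gone
print(compression_ratio(320))      # 4.8 at a much higher bit rate
```

Seen this way, a 128kbps stereo file keeps roughly one bit in twelve, and raising the bit rate narrows that gap considerably.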
With all that said, there are good reasons to use compressed audio in your projects. For instance, many buy-out music vendors offer their tracks for download. If you've found the perfect track online and need it today, buy it. Just be aware that you're trading quality for convenience. Another ideal application is narration or voice-overs. With a couple of phone calls and an email or two, you can hire a professional announcer to record the voice-over for your next project. You email the script along with any notes; they record the script, convert the file to MP3 and email it back to you. Mixed with some music, a voice-only mono track encoded at anything above 128kbps is almost impossible to distinguish from the uncompressed version.
This has been pretty technical stuff, but now you know how to get the most from your audio. In short, capture clean, loud sound at the highest quality settings. Use compressed audio only when necessary and your video projects will sound their best on any playback system.
Contributing Editor Hal Robertson is a digital media producer and technology consultant.
This article was initially published in the October 2006 issue of Videomaker Magazine.