(Page 2 of 2 pages for this article  <  1 2)

Tuesday, August 19, 2008

Vacuum Packed

Compress audio files without losing quality? You can, if you measure them the right way.

Dawn of Delta

To understand how the better way works, think about what audio data actually means.

Here’s a simplified diagram of how a sound gets digitized. We’re looking at half a sine wave at about -3 dBFS. Five times during that half-wave, we measure its voltage (blue lines) on a scale between 0 and roughly 32,000. The results - numbers between 2,601 and 20,400 - are shown in the drawing.

image

Why ~32,000? Because 16-bit sound allows roughly 65,000 possible values. Half of those are reserved for negative numbers (the other half of the wave). Why only about 20,000 as our maximum voltage? So other sounds can be louder, up to 0 dBFS.
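If you'd like to check that arithmetic yourself, here is a minimal Python sketch. The helper name `dbfs` is mine, not part of any audio library, and it assumes the usual signed 16-bit layout with full scale at 32,767.

```python
import math

BITS = 16
total_values = 2 ** BITS            # every code a 16-bit word can hold
max_positive = 2 ** (BITS - 1) - 1  # top of the positive half-wave
min_negative = -(2 ** (BITS - 1))   # bottom of the negative half-wave

def dbfs(peak, full_scale=max_positive):
    """Level of a peak sample value relative to full scale, in dB (hypothetical helper)."""
    return 20 * math.log10(abs(peak) / full_scale)

print(total_values, max_positive, min_negative)
print(round(dbfs(20400), 1))  # level of the drawing's highest sample
```

By this formula the drawing's 20,400 peak lands near -4 dBFS, in the same neighborhood as the rounded figure quoted above, and comfortably below the 0 dBFS ceiling.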

Why only 5 measurements? To simplify the drawing. If this were our 1 kHz test file, there’d be 24 of them.

The highest numbers in our drawing (and their negative equivalents for the other half-wave) require 16 bits to store. If we could somehow make those numbers somewhat smaller, we could store them with fewer bits… saving file space.

It’s not difficult. Here’s the exact same wave. Only instead of noting the value of each sample by itself, we write down just the difference from the previous sample… mathematicians call it the delta.

image

Same wave, same numbers, just written differently. These smaller numbers will need fewer bits to store!
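The idea fits in a few lines of Python. This is only a sketch: the sample values are not the ones in the drawing but a half sine wave I generate for illustration, with an assumed peak of 20,400 and 24 samples (the count mentioned above for the 1 kHz test file). The function names are my own.

```python
import math

def delta_encode(samples):
    """Write each sample as its difference from the previous one (starting from 0)."""
    deltas, previous = [], 0
    for s in samples:
        deltas.append(s - previous)
        previous = s
    return deltas

def delta_decode(deltas):
    """Rebuild the original samples by adding the differences back up."""
    samples, total = [], 0
    for d in deltas:
        total += d
        samples.append(total)
    return samples

def bits_needed(value):
    """Bits required to hold a signed (two's-complement) integer."""
    magnitude = value if value >= 0 else ~value
    return magnitude.bit_length() + 1

# Illustrative half-wave: assumed peak of 20,400, 24 samples
half_wave = [round(20400 * math.sin(math.pi * (n + 0.5) / 24)) for n in range(24)]
deltas = delta_encode(half_wave)

print("widest sample:", max(bits_needed(s) for s in half_wave), "bits")
print("widest delta: ", max(bits_needed(d) for d in deltas), "bits")
print("round trip ok:", delta_decode(deltas) == half_wave)
```

Decoding is nothing more than a running total, which is why the two ways of writing the wave carry exactly the same information.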

It’s like if you gave walking directions in two different ways:

Turn right at your front door, go two blocks, turn left, go four blocks, right again one block and you’re there.

or

Turn right at your front door, go two blocks, turn left and keep going until you’re six blocks from home, right again until you’re seven blocks from home...

Both directions will get you to the same place, but the first version is simpler.

Back when desktop computers didn’t have enough power for psycho-acoustic algorithms, this is how audio data compression was done. You can still select it in most audio programs: QuickTime IMA, or Microsoft ADPCM. The delta measurements were arbitrarily limited to 4 bits instead of 16, for 1/4 the data.

Running our test files through IMA gives us the expected 75% reduction (plus a few bytes for overhead):

image

The only problem with this scheme was that it wasn’t necessarily lossless. Sometimes consecutive samples differ by more than a 4-bit number can express. In the case of sudden loud or high frequency sounds, delta numbers would lag behind the proper sample value for a fraction of a second. This would create a soft, short burst of noise around the signal.
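To see that lag in numbers, here is a deliberately simplified Python sketch. It is not the real IMA or Microsoft ADPCM algorithm (those adapt their step size as they go); it assumes a fixed step of 256 per code and 4-bit codes from -8 to 7, both values chosen by me purely for illustration.

```python
def clamp_dpcm(samples, step=256, lo=-8, hi=7):
    """Encode with 4-bit codes, each worth `step` units.

    Returns the codes and the signal a decoder would rebuild from them.
    Fixed step size is an assumption of this sketch, not how IMA works.
    """
    codes, decoded = [], []
    estimate = 0
    for s in samples:
        code = round((s - estimate) / step)  # the delta we would like to send
        code = max(lo, min(hi, code))        # ...squeezed into 4 bits
        estimate += code * step              # what the decoder ends up with
        codes.append(code)
        decoded.append(estimate)
    return codes, decoded

# Silence followed by a sudden loud jump
signal = [0, 0, 0] + [12000] * 8
codes, decoded = clamp_dpcm(signal)
errors = [s - d for s, d in zip(signal, decoded)]

for s, d, e in zip(signal, decoded, errors):
    print(f"wanted {s:6d}   got {d:6d}   error {e:6d}")
```

The decoded value needs several samples to climb up to the jump, and the error along the way is the short burst of noise. Even after it catches up, it settles a little off target, because the step size can't land exactly on the original value.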

As soon as computers were able to handle the more efficient and better sounding masking algorithms, delta encoding was mostly abandoned.

Delta is Ready...

But Moore’s Law still rules, and desktop computers keep getting more powerful.

Modern computers can look at a signal and predict the total delta between individual samples, no matter how big a jump. They’re fast enough to check the guesses, and go back and refine them until they’re accurate. They note the rules for that guesswork in the file, and voilà:

Reasonable shrinkage, with perfect recovery when you open the file. Here are the numbers for two implementations, Apple Lossless and FLAC (Free Lossless Audio Codec):

image Scroll down for an easier to read version.

Signals that are easier to predict will shrink more. That’s why the sine wave loses about 90% (much smaller and more accurate than the original delta method).
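Here is a rough Python sketch of why predictability matters. It uses one simple fixed rule (guess that the wave keeps moving the way it just moved) and stores only the miss. Real encoders choose among several predictors and then pack the misses efficiently, so treat this as an illustration rather than Apple Lossless or FLAC themselves. The 48 kHz sample rate, the 20,400 peak and the function names are my assumptions.

```python
import math
import random

def predict_encode(samples):
    """Store how far each sample lands from the guess 2*previous - one_before."""
    residuals = []
    for n, s in enumerate(samples):
        prev1 = samples[n - 1] if n >= 1 else 0
        prev2 = samples[n - 2] if n >= 2 else 0
        residuals.append(s - (2 * prev1 - prev2))
    return residuals

def predict_decode(residuals):
    """Make the same guess from already-decoded samples, then add the stored miss."""
    samples = []
    for n, r in enumerate(residuals):
        prev1 = samples[n - 1] if n >= 1 else 0
        prev2 = samples[n - 2] if n >= 2 else 0
        samples.append(r + 2 * prev1 - prev2)
    return samples

def mean_abs(values):
    """Average size of the numbers, ignoring sign."""
    return sum(abs(v) for v in values) / len(values)

# Ten cycles of a 1 kHz sine at an assumed 48 kHz sample rate
sine = [round(20400 * math.sin(2 * math.pi * 1000 * n / 48000)) for n in range(480)]
# Random noise at the same peak level: nothing to predict
rng = random.Random(0)
noise = [rng.randint(-20400, 20400) for _ in range(480)]

for name, signal in (("sine", sine), ("noise", noise)):
    residuals = predict_encode(signal)
    assert predict_decode(residuals) == signal  # perfect recovery
    print(name, "average sample:", round(mean_abs(signal)),
          " average miss:", round(mean_abs(residuals)))
```

For the sine wave the misses are tiny compared with the samples, so they need far fewer bits. For noise the guesses don't help at all, which is why unpredictable material shrinks the least.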

But even more complex signals, like our voice/music mix, can shrink 50%. That’s with absolutely no signal loss. When you open the compressed file, it’s a perfect clone of the original.

And unlike most mp3 files, silences don’t waste much space at all (since the delta remains 0 during a silence). So if you’re sending stems, or individual tracks with pauses, you see even greater shrinkage.
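A small Python sketch shows why long runs of zero are so cheap. The run-length step here is my own stand-in for whatever packing a real encoder uses, and the sample values are made up for illustration.

```python
from itertools import groupby

def deltas(samples):
    """Difference between each sample and the one before it (starting from 0)."""
    return [s - p for p, s in zip([0] + samples[:-1], samples)]

def run_length(values):
    """Collapse runs of identical values into (value, count) pairs."""
    return [(v, len(list(group))) for v, group in groupby(values)]

# A hypothetical track: a pause, a few samples of sound, another pause
track = [0] * 1000 + [500, 1200, 500, -300] + [0] * 1000
runs = run_length(deltas(track))

print(len(track), "samples become", len(runs), "entries")
print(runs)
```

Each pause, however long, collapses to a single entry, while the sound in between still takes one entry per sample.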

Eye Candy

I’ve thrown some numbers around in this article. They may be easier to grasp as a chart:

image

The files compressed with IMA (delta 4:1) are nicely shrunk. But remember, this is a lossy compression… sudden jumps in the waveform get noisy. The mixed track doesn’t shrink quite as much under Apple Lossless or FLAC, but it does end up considerably smaller than the Zipped version. And the process is as transparent - or lossless - as Zipping a Word doc before you email it.

FLAC is actually capable of greater compression, because its guesses can be fine-tuned. Here’s a typical FLAC control panel:

image

But Apple Lossless and the equivalent setting in Windows Media are a lot easier to use, and give results that are almost as good. So next time you have to get small, take a trip to the Delta.

Next Time: A couple of unintuitive shortcuts that can speed up sending audio-for-video in QuickTime.


Glad you liked it, Travis.  Spread the word.

There are other articles with a scientific bent in my Inside Track blog. They take some effort to write and illustrate, and I’m basically lazy, so they’re balanced with articles that are based on humor or on observations about our industry. But always with some kind of audio tutorial or tip. It’s what I do.

You might also like some of the tutorials on my own site, dplay.com.

Posted by Jay Rose  on  08/20  at  07:05 AM


Copyright 2008 ProVideo Coalition LLC