Variations on a Theme: Bit Crushing

Disclaimer: For any sticklers out there, there are some discrepancies between this article and exactly how the NES handles audio. Some details (sample rates, resulting bit depths) are also left out. This is intended as a basic overview.

[Audio clips: Original Clip, Crushing 1, Crushing 2, Crushing 3, Crushing 4, Crushing 5]

The sound of simple bit crushing exploded in popularity several years ago when “lo-fi” became all the rage. It’s almost always done the same way: a control sets the size of the quantization “steps,” turning the waveform into a blocky staircase and varying the amount of quantization distortion. The fewer bits used, the bigger the steps and the noisier the result.
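Here is a minimal sketch of that traditional approach, assuming samples are floats between -1.0 and 1.0. The name `bit_crush` and its `bits` parameter are made up for illustration and don't come from any particular plugin.

```python
def bit_crush(samples, bits):
    """Quantize samples in [-1.0, 1.0] to 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = 2.0 / levels  # width of one quantization step
    crushed = []
    for x in samples:
        # Find which step the sample falls in, clamping the top edge
        index = min(int((x + 1.0) / step), levels - 1)
        # Snap the sample to the center of that step
        crushed.append(-1.0 + (index + 0.5) * step)
    return crushed

# With 1 bit there are only two output levels
print(bit_crush([-1.0, -0.1, 0.1, 1.0], 1))  # [-0.5, -0.5, 0.5, 0.5]
```

Lowering `bits` widens `step`, which is the “bigger, steppier” effect described above.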

Modern digital audio signals are almost always PCM (pulse-code modulation) data. Each sample is essentially the sound pressure level at one exact point in time.

But that’s only one way to store and represent audio in the digital realm. What about an entirely different method?

The Nintendo NES designers didn’t care much about audio fidelity, but did want some type of digital audio playback. You’ve probably played a Nintendo game at some point that warned you, through a wall of half-intelligible fuzz, to “skate or die die die die*” or “double dibl.” It took special audio encoding to sound so terrible.

Instead of encoding the volume at each point in time, many NES games stored a sequence of volume differences. It’s a handy format known as Differential PCM (DPCM). If you have audio data that looks like this in PCM form:

1, 2, 4, 6, 3, 2

then in DPCM form, counting from a starting value of 0, it becomes:

+1, +1, +2, +2, -3, -1

To get the original data back, add the DPCM terms one at a time, keeping a running total. As a breakdown:

n[0] = 1

n[1] = n[0] + 1 = 2

n[2] = n[1] + 2 = 4

n[3] = n[2] + 2 = 6…
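The same round trip can be sketched in a few lines of Python. The function names `dpcm_encode` and `dpcm_decode` are illustrative, and the starting value of 0 is an assumption that matches the numbers above.

```python
from itertools import accumulate

def dpcm_encode(pcm, start=0):
    """Return the difference between each sample and the one before it."""
    deltas = []
    previous = start
    for sample in pcm:
        deltas.append(sample - previous)
        previous = sample
    return deltas

def dpcm_decode(deltas, start=0):
    """Rebuild the PCM values by keeping a running sum of the differences."""
    return [start + total for total in accumulate(deltas)]

pcm = [1, 2, 4, 6, 3, 2]
deltas = dpcm_encode(pcm)
print(deltas)               # [1, 1, 2, 2, -3, -1]
print(dpcm_decode(deltas))  # [1, 2, 4, 6, 3, 2]
```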

We get our original data back. The advantage here is that instead of caring about the largest value (6), the largest *difference* (3) is what matters. Most audio signals have relatively small differences compared to their highest and lowest values, so a high compression ratio is possible. Instead of needing a whole byte per sample, you could likely get away with a nibble (4 bits). Of course, the NES didn’t have that many bits to waste! It used 1-bit DPCM, where every sample can only say “up one step” or “down one step.” That same PCM stream of numbers, converted to 1-bit DPCM and back, goes like this:

DPCM: +1, +1, +1, +1, -1, -1

PCM: 1, 2, 3, 4, 3, 2

That’s not what we started with! This creates a distortion different from traditional bit crushing: it adds noise and filters the signal at the same time. Essentially, the 1-bit DPCM format “chases” the incoming audio one step at a time. When the input moves faster than one step per sample, the output can’t keep up, so high frequencies end up distorted into triangle waves.
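The “chasing” can be sketched as below. This follows the simplified scheme in this article rather than the actual NES hardware, and the names `encode_1bit` and `decode_1bit` are hypothetical. What to do when the input equals the running estimate is an arbitrary choice; this sketch steps down.

```python
def encode_1bit(pcm, start=0):
    """Emit +1 when the input is above the running estimate, otherwise -1."""
    bits = []
    estimate = start
    for sample in pcm:
        delta = 1 if sample > estimate else -1
        bits.append(delta)
        estimate += delta  # the encoder tracks what the decoder will rebuild
    return bits

def decode_1bit(bits, start=0):
    """Rebuild PCM by stepping up or down one unit per bit."""
    out = []
    value = start
    for delta in bits:
        value += delta
        out.append(value)
    return out

bits = encode_1bit([1, 2, 4, 6, 3, 2])
print(bits)               # [1, 1, 1, 1, -1, -1]
print(decode_1bit(bits))  # [1, 2, 3, 4, 3, 2]
```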

Hopefully we’ve all seen a sine wave at some point. Here is a low frequency sine wave that’s been run through 1-bit DPCM and reconstructed:

The Lowest Distortion Case

The wave is slow enough that the 1-bit +/- signal can track it pretty well. Notice that in the flat regions of the sine wave, the output switches back and forth between two values very quickly:

Flat Part Close Up

Each new value can only be the previous value plus one or the previous value minus one. There is no such thing as a constant value. Isn’t 1-bit great?
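To see this in numbers, here is a small sketch that feeds a constant input through the same kind of 1-bit tracker. The helper `track_1bit` is a made-up name that encodes and decodes in one pass, and it assumes the tracker steps down when the input equals the current value.

```python
def track_1bit(pcm, start=0):
    """Encode to 1-bit DPCM and decode again in one pass."""
    out = []
    value = start
    for sample in pcm:
        # Step toward the input; there is no option to stay put
        value += 1 if sample > value else -1
        out.append(value)
    return out

# A perfectly flat input of 5 never settles; it wobbles between 4 and 5
print(track_1bit([5] * 10))  # [1, 2, 3, 4, 5, 4, 5, 4, 5, 4]
```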

What happens when the sine wave is faster and gets harder to “chase”? The original wave is shown in red, and the 1-bit de/reconstruction is shown in blue.

Triangles!
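The triangle shape falls out of the one-step-per-sample limit. As a rough sketch, the code below runs a fast sine through a 1-bit tracker; the amplitude, period, and the `track_1bit` helper are all hypothetical values and names picked for illustration.

```python
import math

def track_1bit(pcm, start=0.0, step=1.0):
    """Encode to 1-bit DPCM and decode again in one pass."""
    out = []
    value = start
    for sample in pcm:
        value += step if sample > value else -step
        out.append(value)
    return out

# Hypothetical test tone: amplitude 20, one cycle every 16 samples
amplitude, period = 20.0, 16
sine = [amplitude * math.sin(2 * math.pi * n / period) for n in range(64)]
chased = track_1bit(sine)

# Steepest change in the sine per sample, about 7.85 here
peak_slope = 2 * math.pi * amplitude / period
print(peak_slope > 1.0)         # True: the sine outruns the 1-unit step
print(max(chased) < amplitude)  # True: the output never reaches the peak
```

Every output sample differs from the previous one by exactly one step, so the result is a series of straight ramps that turn around wherever the sine crosses them.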

Graphs are pretty hard to hear, so I included some audio demos of a drum loop. Each file uses 1-bit DPCM, with the step size decreasing from one file to the next.

*After listening some more, the Skate or Die 2 intro vocal sounds too clear to be using DPCM. It was possible to use actual 7-bit PCM data on the NES, but it was somewhat rare because of the overhead required both to play it back smoothly and to store it. For the highest quality recorded audio I’ve heard in an NES game, check out Big Bird’s Hide and Speak.
