August 03, 2007
Moon Audio/Visual

Yesterday I wondered what would happen if I stored the raw data from a pixmap image to a WAV file, converted that file to an MP3, then back to a WAV, then finally reconstituted it as a pixmap.
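The first and last steps of that round trip (pixmap bytes into a WAV container, and back out again) can be sketched with Python's standard `wave` module. This is only an illustration, not the script actually used: the function names are made up, and the choice of 8-bit mono samples is an assumption (only the 16 kHz rate is stated in the post).

```python
import io
import wave


def pixels_to_wav(raw: bytes, sample_rate: int = 16000) -> bytes:
    """Wrap raw pixmap bytes in a WAV header (assumed: mono, 8-bit unsigned)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)           # assumption: one channel
        w.setsampwidth(1)           # assumption: one byte per sample
        w.setframerate(sample_rate)
        w.writeframes(raw)          # each pixel byte becomes one audio sample
    return buf.getvalue()


def wav_to_pixels(wav_bytes: bytes) -> bytes:
    """Strip the WAV header and return the sample bytes as pixmap data."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.readframes(w.getnframes())
```

Without the MP3 step in between, the round trip is lossless, so any change in the final image comes from the lossy encoder alone.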

Here's the original photo I used (I can't remember what site I got this from, unfortunately [see update below]):

moon over lake michigan, original

You can hear what this image sounds like (16 kHz sample rate).

It turns out the answer is that there's no noticeable difference after one conversion cycle. To see any real difference I had to run multiple generations of MP3 compression, all at a constant 96 kbps. (To hear the audio artifacting that comes from repeated MP3 compression, check here.)
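A generation loop like that could be scripted roughly as follows. This is a sketch under assumptions, not the procedure actually used: it assumes the `lame` command-line encoder is installed (a commenter below mentions it), and the function names and the `genNN` file naming are hypothetical. It relies only on `lame -b <kbps>` for constant-bitrate encoding and `lame --decode` for decoding.

```python
import subprocess
from pathlib import Path


def lame_commands(src_wav, workdir, generations=20, kbps=96):
    """Build the encode/decode command pairs for repeated MP3 generations."""
    cmds = []
    current = Path(src_wav)
    for g in range(1, generations + 1):
        mp3 = Path(workdir) / f"gen{g:02d}.mp3"   # hypothetical naming scheme
        wav = Path(workdir) / f"gen{g:02d}.wav"
        cmds.append(["lame", "-b", str(kbps), str(current), str(mp3)])  # WAV -> MP3
        cmds.append(["lame", "--decode", str(mp3), str(wav)])           # MP3 -> WAV
        current = wav  # the next generation starts from this decoded file
    return cmds


def run_generations(src_wav, workdir, generations=20, kbps=96):
    """Run every command in order; raises if lame exits with an error."""
    Path(workdir).mkdir(parents=True, exist_ok=True)
    for cmd in lame_commands(src_wav, workdir, generations, kbps):
        subprocess.run(cmd, check=True)
```

One caveat: MP3 encoders add a little delay and padding, so the decoded WAV may not contain exactly the original number of samples and may need trimming before it is turned back into a pixmap.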

Here's the result after 20 generations:

moon over lake michigan, mp3ified

Gavin pointed out that it mostly just lost contrast. If you renormalize the histogram, this is what you get:

moon over lake michigan, mp3ified and normalized
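"Renormalizing the histogram" can mean a few things. One simple reading is a linear contrast stretch that maps the darkest value present to 0 and the brightest to 255. The sketch below shows that version; the function name is made up, and it is not necessarily the operation used for the image above.

```python
def stretch_contrast(pixels: bytes) -> bytes:
    """Linearly rescale byte values so they span the full 0..255 range."""
    if not pixels:
        return b""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return bytes(pixels)  # flat image: nothing to stretch
    span = hi - lo
    # Integer arithmetic avoids floating-point rounding surprises.
    return bytes((p - lo) * 255 // span for p in pixels)
```

For example, `stretch_contrast(bytes([100, 125, 150]))` spreads a narrow band of mid-greys across the full range.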

Now I want to see the result of compressing audio with JPEG, but I wouldn't be able to stand the disappointment if I spent a couple hours doing it myself and then discovered that again there was no interesting difference. So I guess that means it's your turn to contribute something for once.

Update: The photo is by my buddy Christopher Trott.

Update: There's more discussion on programming.reddit.com.

Posted by jjwiseman at August 03, 2007 03:55 PM
Comments

I think it would be much more interesting if you tried lower bitrates, and tried other algorithms. 96 kbps sounds bad to the trained ear but it really preserves quite a lot of the information. I suggest you try a much lower bitrate, like 24 kbps.

Posted by: Justin on August 3, 2007 07:28 PM

Awesome image, that photographer sure knows how to take cool photos!

Posted by: Christopher Trott on August 3, 2007 07:32 PM

Good experiment. Would you mind posting the "after" pictures as .PNG?
Thanks

Posted by: Ron on August 3, 2007 09:20 PM

So how much compression did you get (in the first pass)?

Posted by: Aaron on August 3, 2007 11:44 PM

The .raw file would appear to be 1 620 000 bytes and the mp3 file 1 621 476 bytes, so it's not all that surprising that there's little loss of information. Interesting that it's all in the dynamic range though; I guess now we have a nice visual proof that MP3 encoding significantly hurts dynamic range.

Incidentally, non-audio data turned into sound is generally referred to in the "scene" as 'binary noise'.

(Yes, this is a scene.)

Posted by: Ola on August 4, 2007 06:17 AM

Could you please tell us which tools were used and how? I'd like to see what happens if the lowest possible bitrate (8 kbps or so) is used.

Certainly the wav->mp3->wav conversion is simple with the lame encoder, but I don't know the best way to produce the pixmap and add the metadata that tools need to understand it.

If you have some scripts, it would be great if you could share them.

Posted by: killerstorm on August 4, 2007 07:47 AM

Oh, and once you mentioned it, I couldn't get it out of my head until I tried.

The answer to how JPEG compression affects sound is pretty much "adds noise". Illustrated at http://du-nez.blogspot.com/2007/08/wav-to-jpg-to-wav.html

Posted by: Ola on August 4, 2007 08:56 AM

what happens to the picture if you mix a voice over the picture audio?

Posted by: anonyfoo on August 4, 2007 06:30 PM

It would be much more appropriate to compress a greyscale image (one band of data), or to convert two components into a stereo MP3.

Posted by: Ivan Tikhonov on August 5, 2007 12:25 PM