MP3 & AAC: What you’re *not* hearing

Hello my friends. Having recently completed a series of college broadcast seminars (and with NAB underway in a matter of days), the question of ‘audio compression’ ALWAYS comes up. People always ask, “Is MP3 *really* that bad?” or “Isn’t iTunes’ original AAC better than standard MP3?” and so on and so forth. Then of course, there are many who also contend that MP3, AAC(+) and the like are simply good enough because ‘you’re not losing that much anyway’… Uh huh. Yeah, I don’t think so. 😉
So, in an effort to ‘visually’ demonstrate what you’re NOT hearing when you compress (using a variety of popular formats), I decided to pick a few cuts from one of my more recent FAME outings in Amsterdam…

The subject in question…Coltrane, with Elvin Jones/Jimmy Cobb on drums, 1959/1960

The Subject of Today's Lesson
Now, many may ask why I chose this particular record. Well, the benefit of well-recorded (and for that matter, remastered) Jazz albums is that you tend to have a lot of stuff going on in the ‘high end’…and in this case, you’ve got some nice, present hi-hats and sizzle cymbals. Sizzle Cymbals are the ones that have the little rivets drilled into them. Used like a ride, they literally ‘sizzle’ as you strike them. This provides a really ‘atmospheric’ kind of sound, but also one that tends to resonate for a long, long time (and has a really nice decay). As such, it also occupies a lot of that high-frequency space.
So, let’s take a look at what the upper register (approx 16k-22k) of this recording looks like UNCOMPRESSED. Sorry that these darned images are so small (limitations of my blog…anyone offering to help me ‘pimp’ my blog’s CSS?? Greg is probably furious with me right now! lol) but if I get requests, I’ll post larger frame size versions directly on my Photobucket page. Just comment me and let me know. This CD was ripped directly into Audition 3. We’re looking at the Spectral Frequency Display in the Edit View, with 16k-22k zoomed in.
Coltrane Uncompressed, 16-bit Stereo, 44.1kHz
Uncompressed Coltrane, 16k-22k
As you can see, there is a good deal of info above 16,000Hz. And, more importantly, you can truly SEE how strong (and present) that foot-closed hi-hat is (on the right channel, ie, the bottom of the image; top of the image represents left channel). There’s plenty of that sizzle-y ‘atmosphere’ as well, as represented by the reddish-purple color. It’s not high-amplitude…but it’s THERE…and it’s that very presence that gives the recording its ambience (an essential factor for good jazz recordings from this era).
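If you’d like to put a rough number on ‘how much lives up there’ rather than just eyeballing the colors, here’s a minimal Python sketch. To be clear, this is *not* how Audition draws its Spectral Frequency Display; it’s just a plain (slow) DFT over a short block of samples, and the function names (`dft_power`, `band_energy_fraction`) and the synthetic test tones are my own assumptions for illustration. It reports what fraction of a block’s energy falls between two frequencies.

```python
import cmath
import math


def dft_power(samples):
    """Power |X[k]|^2 for bins 0..N//2, via a plain O(N^2) DFT (fine for short blocks)."""
    n = len(samples)
    powers = []
    for k in range(n // 2 + 1):
        acc = 0j
        for i, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * i / n)
        powers.append(abs(acc) ** 2)
    return powers


def band_energy_fraction(samples, sample_rate, lo_hz, hi_hz):
    """Fraction of the block's spectral energy lying between lo_hz and hi_hz."""
    powers = dft_power(samples)
    bin_hz = sample_rate / len(samples)  # width of one DFT bin in Hz
    total = sum(powers)
    band = sum(p for k, p in enumerate(powers) if lo_hz <= k * bin_hz <= hi_hz)
    return band / total if total else 0.0


# Hypothetical stand-in for real audio: a loud 1 kHz 'fundamental' plus a
# quiet 18 kHz 'sizzle' at one tenth of the amplitude.
RATE = 44100
N = 441  # 100 Hz per bin, so both tones land exactly on a bin
signal = [
    math.sin(2 * math.pi * 1000 * i / RATE)
    + 0.1 * math.sin(2 * math.pi * 18000 * i / RATE)
    for i in range(N)
]

fraction = band_energy_fraction(signal, RATE, 16000, 22000)
print(f"Energy between 16k and 22k: {fraction:.2%}")
```

For this made-up signal the quiet tone carries about 1% of the energy, which is the point: the top of the spectrum can be low in amplitude and still very much there. Running the same measurement on a block from the CD rip and on the same block from an encode would be one way to compare them.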
So, now that you know what you’ll get from CD, let’s now take a look at the NEW 256Kbps AAC that you can get from iTunes (these are the newer iTunes Plus files– you can also rip directly into this format via iTunes).
AAC 256kbps, iTunes Plus
256K AAC , 16k-22k
Now, this is pretty difficult to see in these images, but what you’ve got is audio information that actually extends *nearly* to 19k on the average (best viewed on the left channel) with transient material (the initial attacks of the hi-hat) extended to approximately 20k (on the right channel). This, I must say, is pretty darn good, and for most ears you probably won’t be missing much. Granted, this is *not* lossless…and speaking as a mastering engineer, I can tell you that in certain passages, you do get some weird artifacting and aliasing (largely because of those difficult sizzle cymbals; you are, after all, sacrificing sizzle and ambience for smaller file sizes)…but on the whole, it’s a pretty sweet type of compression. As mentioned, I *only* started buying iTunes albums when iTunes Plus became available (and you’ll see why in just a moment). Still, it’s NOT replacing CDs for me.
So, I’ll give AAC 256 a B+/A- grade. Let’s see the popular MP3 format…
MP3 @ 192Kbps (slightly higher than most internet audio encodes)
192k MP3, 16k-22k
Now, here’s the deal with everything above 128Kbps in MP3. You basically get a flat-top, razor-edge at 16k…*but*, above 128 you *do* get transient materials that extend nearly up to 20.5kHz. So what does that mean? Well, it means that your ears hear the initial ‘attack’ of the hi-hat (in its hi-end glory) but any decay of said instrument is truncated, and all the high ambience is compressed away. What you will also notice (if you listen carefully, with phones, assuming no hearing loss or damage) is that you *will* begin to hear some swishy, phasey-type sounds in the high register, again, all because of the compression. I used to use 192 for reference files (back when broadband was still a luxury)…but now, I *never* use anything less than 224, and generally I’ll do 256Kbps if I’m sending someone an MP3.
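The ‘attack survives, decay gets chopped’ behaviour can be illustrated with a small sketch. This is a toy model under loud assumptions: the ‘hi-hat’ is just a decaying 18 kHz tone, and the ‘encode’ is simulated by zeroing everything after the first frame, which is far cruder than anything a real MP3 encoder does. The helper names (`goertzel_power`, `framewise_power`) are mine. It uses the Goertzel algorithm to measure the power at a single probe frequency, frame by frame.

```python
import math


def goertzel_power(frame, sample_rate, freq_hz):
    """Power |X[k]|^2 at the DFT bin nearest freq_hz (Goertzel algorithm)."""
    n = len(frame)
    k = round(n * freq_hz / sample_rate)  # nearest bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in frame:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def framewise_power(samples, sample_rate, freq_hz, frame_len):
    """Probe-frequency power for each consecutive, non-overlapping frame."""
    return [
        goertzel_power(samples[i:i + frame_len], sample_rate, freq_hz)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


RATE = 44100
FRAME = 441  # 10 ms frames; 18 kHz sits exactly on bin 180

# Toy 'hi-hat': an 18 kHz tone that rings out over four frames.
original = [
    math.exp(-i / 600.0) * math.sin(2 * math.pi * 18000 * i / RATE)
    for i in range(4 * FRAME)
]
# Toy 'encode': the attack frame is kept, the decay is thrown away.
truncated = original[:FRAME] + [0.0] * (3 * FRAME)

orig_power = framewise_power(original, RATE, 18000, FRAME)
trunc_power = framewise_power(truncated, RATE, 18000, FRAME)

for idx, (o, t) in enumerate(zip(orig_power, trunc_power)):
    print(f"frame {idx}: original {o:10.3f}   truncated {t:10.3f}")
```

In the ‘original’ the 18 kHz power tapers off frame by frame; in the ‘truncated’ version it is present for the attack and then simply gone, which is roughly what those razor-edged spectral views are showing.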
But…the *best* (not) is yet to come. Again, I know so many people who’ve ripped and done away with their CD collections…and they ripped everything into the native AAC format in iTunes. Well, check it out…
Old-school..AAC 128 (standard iTunes downloads)
128k AAC, 16k-22k
Sad, sad times, eh? What do you get? Well, for one thing, you’ll notice (aside from the channels inadvertently being swapped when I ripped this! Bizarre) that now you have your ‘ambience’ resonating no higher than approximately 17k, with your transient attacks only extending to around 18.1k. Period. Nothing above that. WHA?? I mean, come on people!! You will also hear (very clearly) lots of swishy/swirleys in the upper register…but that’s only half of the devastation of 128 AAC. ;( In short, if you’ve ripped your library in this format, you’ve thrown away more than just the sizzley-hissy high end. You’ve also lost a great deal of the ‘meat’ in the middle (the mid frequencies, where the primary fundamentals of everything live)… I’m crying tears right now!


So, here’s a brief look, side-by-side (or, top to bottom, I suppose) of what 0-18kHz looks like. Most importantly, take note of the middle of the file…you’ll notice how in the 256k version, it still looks fairly ‘solid’; same for the 192K MP3. But our friend AAC 128…Oy! Quel naufrage!
256Kbps AAC (Plus) (0-18kHz)
0-18k, AAC  256kbps
192Kbps MP3 (0-18kHz)
0-18k, MP3 192Kbps
128Kbps AAC Original (0-18kHz)
Very 'Hole-y', 0-18k, AAC 128kbps
For these, here are some direct links in 1024 so you can *really* see what’s going on here…
256Kbps AAC (Plus)
128Kbps AAC (Original)
So there you have it, my friends. Just look at that 128 version. Not only did you sacrifice a great deal in the high-end, but you’ve thrown away an ENORMOUS amount of sonic info in the middle…all in an effort to shrink it down.
Well, that might work for some…but not for me. It truly PAINS my ears to hear these encodes (digital satellite radio has a problem with low bitrate as well, but that’s for another time). But frankly, it’s not really about what I *think*…it’s the mere fact that these different compression schemes all affect your audio files differently; and if you want something as close to CD as possible, and *don’t* want to go the uncompressed route, there *are* options…and good sounding ones at that. And, if you have to compress (and generally, we do) now you know what to listen and look out for.
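For a sense of just how much ‘shrinking’ we’re talking about, here’s a quick back-of-the-envelope sketch. The assumptions: CD audio is 44.1kHz, 16-bit stereo; the encode is constant bitrate; ‘kbps’ means 1000 bits per second; and container/metadata overhead is ignored. The function names are mine, purely for illustration.

```python
CD_SAMPLE_RATE = 44_100  # samples per second, per channel
CD_BIT_DEPTH = 16        # bits per sample
CD_CHANNELS = 2          # stereo


def pcm_bits_per_second(rate=CD_SAMPLE_RATE, depth=CD_BIT_DEPTH, channels=CD_CHANNELS):
    """Data rate of uncompressed PCM audio in bits per second."""
    return rate * depth * channels


def size_ratio(encoded_kbps):
    """Approximate encoded size as a fraction of the uncompressed size."""
    return encoded_kbps * 1000 / pcm_bits_per_second()


for kbps in (256, 192, 128):
    print(f"{kbps} kbps: about {size_ratio(kbps):.1%} of the uncompressed size")
```

Uncompressed CD audio works out to 1,411,200 bits per second, so a 256 file is a bit under a fifth of the original size, and a 128 file is under a tenth. Whatever the encoder’s cleverness, the rest has to come from somewhere.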
Whoa…this could be my longest blog post ever. Hope you enjoyed the technically nerdy audio info. Audio Engineer Geeks Unite! We are *not* alone…at least, I hope not. 😉
Until next time, my compression-curious colleagues (nice alliteration there)…
Blog on.

13 thoughts on “MP3 & AAC: What you’re *not* hearing”

  1. Thanks Jason, I enjoyed it a lot. I’ve always found these waveform graphics kind of synesthetic and they usually show me that we can see much better than hear. And by we I mean all mankind based solely on my own experience : )
    Now really, I can spot a jpeg compression meters away, but I usually can’t tell the difference between audio codecs. Granted, I’m a photographer and I’ve been dealing with images for years, but I’m also a decent music amateur so I’m not totally deaf. Maybe I just never had the proper headphone.
    Maybe it’s a matter of education, as some of us might not know what to look (to hear) for. For example, sometimes I look at images and point out really obvious photoshop mistakes but it takes untrained people several seconds and sometimes even further explanation to get them to “see”. They can probably distinguish as many shades of grey as I can but they can’t spot a fake shadow because they are not looking for it.
    Perhaps if you posted the audio files so that we can hear them ourselves along with some guidelines like: “can you hear this glitch at the 11th second?” we could try to draw the line between size and quality accordingly. Like an audiophile crash course. I know I would certainly learn a lot!

    Hey Pedro. Thanks for the comments. Indeed, it *does* take a while to understand the parallel of what you’re seeing *and* what you’re NOT hearing. As you mentioned, it’s not unlike photographic imaging/restoration in that sense. I can honestly say that before I became entrenched in all things Photoshop, etc, I didn’t really understand ‘noise’ in images; I mean, obvious ‘grain’, sure…but digital noise from digital photography (and all the lovely edging, color spill and other things) I really didn’t quite ‘get’ until I became more familiar with the overall process and history behind it.
    Sound, however, is a bit trickier as it’s not always terribly obvious to the general listener, even *when* it’s pointed out. Not to mention, good speakers and/or headphones are *not* the norm. The general listening device is either earbuds (by no means representative of ‘flat response’ fidelity) and/or computer/desktop speakers (again, not representative of a true, flat frequency response)…so the listening medium as well as the environment all comes into play here. I suppose that’s one of the bases for introducing heavily compressed audio in the first place—most people just *don’t* hear it and won’t hear it (and therefore, don’t know what they’re missing)…BUT…once you *do* hear what’s happening, it really does become night and day.
    At the moment, I don’t have a way to post high-resolution files (sadly, YouTube is MONO, and pretty low bitrate) so it’ll have to wait until later. The best suggestion I can offer up is to try it yourself. You can export the MP3s through Soundbooth, and you can use its Spectral Frequency display to showcase the same things I showed in Audition (and, SB will open the iTunes files directly, but *not* purchased 128 AAC, as those are protected…so you’d have to rip something yourself at 128, then it’s fine)
    I’m thinking I might need to do a training day somewhere…what do you think?? Think anyone would be interested in something like that? 😉
    However, stay tuned to my series on Adobe TV entitled “Short & Suite”, as I’ll talk a lot about these techniques and how to understand to hear what you’re seeing….—JL

  2. I did a similar “study”… although I think it got misinterpreted quite a bit.
    My point was simply to demonstrate that there ARE differences between lossless and lossy compression. Whether it is noticeable or not wasn’t the point.
    As an audio engineer, I want to hear CDs exactly the way artists, producers, and engineers put it on CD. Just because I can’t tell something’s missing isn’t relevant.
    You can read my post here.

    Hey Nick. Yep, and that’s why I stated at the end…it’s merely the fact that you ARE throwing material away with different types of compression.
    The fact remains, for this musical cat, compression artifacts simply drive me nuts; I can’t really put it any other way. And the moment I hear it (and I’ve done many a blind test) I just can’t really enjoy a piece of music anymore. I was once offered someone’s entire catalog (on a hard drive)…literally 22,000+ songs. I listened to a small sample and instantly noticed the artifacting. A careful look showed the files to be 128-160Kbps on average (with a few 192; I found only one album that was actually 256)…I didn’t take any of it. It literally made me ‘twitch’ to listen to it (and this was largely classic bluegrass and jazz)
    Thanks for the comments, Nick. Keep the (uncompressed audio) CD alive!! 😉 —JL

  3. Also, some folks might criticize this study as they did mine.
    The folks at Hydrogen Audio go by ABX listening test theory… saying that statistical information means nothing… They say that what counts is which one sounds better to a listener in a double-blind listening test. Like I said, I want what engineers did, not necessarily what sounds better.. but usually, they’re the same.
    Lastly, lossy compression doesn’t always/only throw stuff away… Sometimes, it adds noise and other psycho-acoustical data to mask what is changing.

    I totally agree. See my response to the reader above. Listening environment, playback equipment, and ‘knowing’ what you’re hearing all matter. And indeed, the ‘additive’ elements of some compression codecs are a whole other ball of wax. No disputing that.
    As for Hydrogen saying statistical info means nothing — well, I’ll leave that opinion with them. I would simply say that if you took a well-recorded disc and ripped as uncompressed WAV or AIFF, and then did the exports into the formats I mentioned, the graphical (ie, spectral) display clearly showcases what has happened to the original audio, post compression…for *those* formats, with *those* attributes. And if you know what you’re listening for, you’ll hear the (potential) artifacts. Thanks, Nick. Good post, btw…and glad to see you’re an Audition user! 😉 —JL

  4. For the OSS enthusiasts, ogg vorbis? How does that stack up? 🙂

    Hey Kevin. I have some friends who are game composers, and they use Ogg all the time (as did I, for a while). I can only compare to a fixed bit-rate setting, which is all I really ever used (though in Ogg, you have the beauty of VBR, and *lots* of control over damping time, manual min/max bitrate setting, lowpass filtering, etc)…,
    But at 256k Fixed, the Ogg version was less than 1/5 the size of the uncompressed file, with ‘consistent’ audio extended all the way up to around 19.5k…and it sounded great. 😉 –JL

  5. The best suggestion I can offer up is to try it yourself.
    I kind of did a few years ago, with little rigor though. My conclusions were that I should probably get a better headphone. Do you have any suggestions up to US$100? (It should be a hobby, after all.)
    Don’t laugh, but I still hear most of my music from a huge, old, early 90’s stereo connected to my MacBookPro. As I still have some LPs that can only be played from it, it gets convenient. Can hardware like this decently reproduce uncompressed audio or am I not knowing what I’m missing? I don’t know if what I have is the equivalent to a Trinitron CRT, oldie but goodie, or some magenta casted garbage.
    Is there some kind of sound (frequency) where compression artifacts and losses are more evidently heard, for example a low sound like bass, a more midrange sound like piano or a high-pitched violin? Or is it not like that at all?

    However, stay tuned to my series on Adobe TV entitled “Short & Suite”, as I’ll talk a lot about these techniques and how to understand to hear what you’re seeing….

    Great, I will! Looking forward to it.
    —-
    Hey Pedro. While I’m usually not one for making recommendations, I can honestly say, without a doubt, that the best deal going for under $100 (USD) are the Sony MDR-7506 headphones. These are industry-standard cans, present in literally *every* studio I’ve ever worked in. They’re durable, they fold, they have a pretty representative frequency response (ie, something mixed properly on them will likely sound good everywhere) and I’m the proud owner of two, and have been for years. US Retail (at any Guitar Center/Musician’s Friend or elsewhere) is approximately $99.00. You can’t go wrong, and if you’ve been using the ‘alternatives’ (ie, earbuds, computer speakers) they’ll really ‘raise the bar’ for your listening experience. –JL

  6. Thank you for this. With the current trends in compression technology, and the companies standing behind it all (Apple’s iTunes and iPods, Microsoft and its MP3 players) I see sound reproduction continuing to degrade, and being accepted as such. Hopefully this is a short-term trend, and the thinking will be replaced in the near future. As to your comment “It truly PAINS my ears to hear these encodes,” I agree, and I actually believe certain compression techniques can increase the likelihood of hearing loss. The sharp, swishy sounds can actually hurt my ears and feel more painful than overly loud sounds. This alone will probably one day be found to be a leading cause of hearing loss in our youth. Thanks again and keep up the good work!

  7. Hi Jason
    I’m writing to you from San Jose, Costa Rica. I have no words to express my thankfulness for healing me with so many tips. Anyway..thank you! I have a question. Is it possible to work on every audio channel independently in Adobe Soundbooth? If so, could you tell me how, please? I record nats in channel one and voice in channel two, so I want to erase channel two to paste there the content of channel 2. Thanks…again!

    Hi Esteban! It’s nice to hear from you. Unfortunately, there isn’t any way to do what you’re describing in Soundbooth. However, you *can* Export both channels to independent mono files. Simply go to the File Menu and choose Export>Channels to Mono Files. This will then give you discrete, mono files, each containing the ‘separated’ Channel 1/Channel 2 audio. To do ‘exactly’ what you’re describing, however, can be easily done in Adobe Audition 3. 😉 Take care. —JL

  8. Came across your post as I was surfing for news about Apple’s announcement that the entire iTunes store will be going DRM-free, and I never thought that AAC files were better than MP3. Now I’ve learned a bunch more (and am ready to give up MP3 for good), but am wondering if I do make the digital-switch one day, what bitrate should I rip my CD collection to? (I don’t plan on tossing my CDs, though.)

    Hey John! Well, naturally I’m a believer in keeping the ‘hard copies’…always 😉 But the native bitrate for iTunes Plus files (and the ‘Higher Quality AAC’ option that is available to you in iTunes’ Import dialog) is a 256Kbps Stereo AAC file. As mentioned, it’s really quite good, while maintaining an excellent quality/file size ratio. Sure, mathematically lossless is better, (and uncompressed is king, IF you want to deal with the file sizes; all depends on how many discs you’ve got in your collection) but as I said…I’ve ‘bitten the bullet’ and have resorted to AAC256 and iTunes Plus. An ideal compromise, indeed, and one that I’m very happy with. All the best. –JL

  9. Jason, Thanks for taking the time to put this in pixels. It answered my question; “just how much information is leaking away when we compress?” and, “should I still buy the original CD’s or can I finally move my buying to the ether?”
    I conclude, sadly not. I know my ears can’t tell the difference between 256k AAC and wav but my head can and will always know that, despite mostly listening in the car, I’ll always have access to the original if I need it, and that’s not true of a collection of AAC albums.
    This is clearly more of a reflection of my state of mind and neurosis, perhaps what I should do is buy everything on iTunes and convince myself that one day, iTunes will switch to lossless and I’ll be able to replace all my purchases with bit-perfect copies.
    MPT

    Hey Michael. Thanks for the kind appreciation, and indeed, we’re *not* quite there yet, are we (in terms of ditching the ‘hard copies’)? Like you, I’m still a CD-buying guy. Oddly enough, even though I continue to purchase iTunes Plus albums, I’ve since purchased several of them on CD *after* buying them on iTunes. Bizarre? Perhaps. Collectors OCD/Neurotic-type behaviour? Kinda. But at the end of the day, there’s nothing like a real album cover and a thick booklet of liner notes, anecdotes, and producer/engineering/mastering credits. For me, that’s a truly integral part of the experience too. Yes, the uncompressed sound is the main attraction (though I’ve grown to appreciate the size trade-off of compressed material, simply for storage). Even then, I still *need* to know who recorded the stuff! (after all, I’m a recording engineer; that interests me). CDs won’t be going anywhere just yet; not in my house, and from the sound of it, not in yours either 😉 All the best. —JL

    1. Don’t know if anyone’s still listening but I thought I’d comment on my comment from seven years ago – shout out if anyone is still here!

      So looking back on my desire to keep originals, it looks like I finally traded a little loss for an easy life. Today I’ve migrated all of my CDs to AAC in the cloud and buy online first, CDs second.

      Online has given me access to music that I might never have looked at before and, more significantly, access to re-mastered versions of 80’s CDs where the quality, compressed or not, is an order of magnitude beyond the original badly mastered source.

      Also, how funny it is to reflect back on these discussions – we were fixated on the tradeoff between loss and file size. I have more storage on my phone today than on our office servers then, who cares about file sizes anymore when we have terabytes of data in the cloud…. Wait, did you say cloud, what’s that?

      MPT
