Audio Transcoding Basics

This post is more or less a preamble for an upcoming article. Transcoding means converting audio from one format to another, e.g. from WAV to MP3. There are lots of formats; some of them are lossless, others are lossy.

Lossless encoding means that no information is lost: decoding gives you something bit-for-bit identical to the source material. You may call it reversible if you like. It also means you can encode and decode it 100 times and still not lose a single bit.
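To make the "100 times" claim concrete, here is a minimal sketch in Python. Note the assumptions: zlib is a general-purpose compressor used here only as a stand-in for a lossless audio codec such as FLAC, and `make_pcm` and `roundtrip` are hypothetical helper names made up for this illustration.

```python
import math
import struct
import zlib


def make_pcm(num_samples: int = 4410, freq: float = 440.0, rate: int = 44100) -> bytes:
    """Synthesize a short sine tone as 16-bit little-endian PCM bytes."""
    samples = (
        int(32767 * 0.5 * math.sin(2 * math.pi * freq * n / rate))
        for n in range(num_samples)
    )
    return b"".join(struct.pack("<h", s) for s in samples)


def roundtrip(data: bytes, passes: int) -> bytes:
    """Compress and decompress the data repeatedly (zlib stands in for a lossless codec)."""
    for _ in range(passes):
        data = zlib.decompress(zlib.compress(data))
    return data


source = make_pcm()
result = roundtrip(source, 100)
print(result == source)  # True: every byte survives all 100 passes
```

A real lossless audio codec compresses audio far better than zlib does, but the reversibility property it demonstrates is the same.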

Lossy encoding, on the other hand, focuses on what content your ears will miss the least if taken away (this is the territory of psychoacoustics). In other words, it tries to keep the content that your ears can actually perceive and drops anything else that would probably go unheard due to the limitations of the human ear. Hence the name: you lose information. Once encoded, you cannot reproduce the source material anymore, only something similar to it. Every additional encoding pass loses more content. While generally worse in quality, lossy encoders have a big advantage over lossless ones: they produce (much) smaller files.
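The way loss piles up over repeated passes can be sketched with a toy model. This is emphatically not how a real codec works: the "lossy pass" below is just a smoothing filter that throws away some high-frequency detail, standing in for a psychoacoustic encoder. All names (`make_signal`, `toy_lossy_pass`, `generation_errors`) are hypothetical and exist only for this illustration.

```python
import math


def make_signal(n: int = 256) -> list[float]:
    """A low tone plus a quieter high tone, both periodic over the buffer."""
    return [
        math.sin(2 * math.pi * 2 * i / n) + 0.3 * math.sin(2 * math.pi * 40 * i / n)
        for i in range(n)
    ]


def toy_lossy_pass(samples: list[float]) -> list[float]:
    """Toy 'encode + decode': a circular 3-tap smoothing filter that discards detail."""
    n = len(samples)
    return [
        (samples[i - 1] + 2 * samples[i] + samples[(i + 1) % n]) / 4
        for i in range(n)
    ]


def rms_error(a: list[float], b: list[float]) -> float:
    """Root-mean-square difference between two equally long signals."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def generation_errors(passes: int) -> list[float]:
    """Error against the original source after each successive lossy pass."""
    source = make_signal()
    current = source
    errors = []
    for _ in range(passes):
        current = toy_lossy_pass(current)
        errors.append(rms_error(source, current))
    return errors


for generation, err in enumerate(generation_errors(5), start=1):
    print(f"generation {generation}: RMS error {err:.4f}")
```

In this toy model the error against the source grows with every generation, and nothing in the later generations lets you compute the original back.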

A very important thing to note: in the best case, quality after encoding will be just as good as it was before. You cannot improve quality by encoding; you can only preserve or worsen it.

It’s crucial to make a distinction between lossless and lossy transcoding. Transcoding is lossless when the destination format is lossless, and (guess what) it’s lossy when the destination format is lossy. But it also matters a lot whether the source material is lossless or lossy. Let’s take the four possibilities into account:

  • lossy -> lossless: this does no harm but doesn’t make any sense either. Remember, the material won’t get any better by putting it in a lossless format. You’ll have the exact same (lossy) audio content, only taking up much more disk space. It’s also a one-way ticket: if you lose the original, smaller lossy file, the only way back to a lossy format is to re-encode (see lossy -> lossy). So performing this kind of conversion is useless, a waste of time and space.

  • lossy -> lossy: this is the worst of the worst. Your source is already damaged, and now you damage it even more. You may want it to take up less disk space, but unfortunately lossy encoders can’t work miracles. If you use moderate encoder settings, the result will be only marginally smaller (but worse in quality), and if you use aggressive settings, the quality will be awful. Neither of those is desirable.

  • lossless -> lossless: this makes perfect sense and there are several use cases for this. You may get a new device which doesn’t support your old format. You may find a new codec (or a new release of your existing codec) which improves encoding efficiency and thus saves you some disk space without sacrificing any quality. You may want to merge several files into one and your application can only do that by re-encoding all of them as a single stream, and so on. This is absolutely safe to do as no content will be lost at all.

  • lossless -> lossy: this one’s fine, too. Portable devices are usually very limited in terms of storage space (no, 64 GB is not that much), and you want to carry as many tracks with you as possible, right? You may even want to store all your tracks only in MP3 because disk space is scarce. In that case, keep in mind that if you ever have to move to a different format, you’ll face the consequences mentioned above for lossy -> lossy.
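The four cases above boil down to a small lookup, sketched here in Python. The format sets are a deliberately short, illustrative selection, and `is_lossless` and `transcode_verdict` are hypothetical names invented for this example. WAV is treated as lossless on the assumption that it holds plain uncompressed PCM, which is the usual case.

```python
# Illustrative, non-exhaustive format lists (assumption: WAV holds plain PCM).
LOSSLESS_FORMATS = {"wav", "flac", "alac"}
LOSSY_FORMATS = {"mp3", "aac", "vorbis", "opus"}


def is_lossless(fmt: str) -> bool:
    """Return True for a known lossless format, False for a known lossy one."""
    key = fmt.lower()
    if key in LOSSLESS_FORMATS:
        return True
    if key in LOSSY_FORMATS:
        return False
    raise ValueError(f"unknown format: {fmt}")


def transcode_verdict(source: str, target: str) -> str:
    """Classify a transcode into one of the four cases described above."""
    src_lossless = is_lossless(source)
    dst_lossless = is_lossless(target)
    if src_lossless and dst_lossless:
        return "lossless -> lossless: safe, nothing is lost"
    if src_lossless:
        return "lossless -> lossy: fine, but keep the lossless source"
    if dst_lossless:
        return "lossy -> lossless: harmless but pointless, only wastes space"
    return "lossy -> lossy: avoid, quality degrades further"


print(transcode_verdict("FLAC", "MP3"))
print(transcode_verdict("MP3", "AAC"))
```

A real tool would have to look at the codec inside the file rather than trust a format name, since some containers can hold either kind of audio.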

Bottom line? Transcoding from lossy material should be avoided like the plague, but transcoding from lossless material is fine as long as you don’t lose the lossless version (be it the lossless source or a transcoded lossless result).