Digital Audio Formats

Many different digital audio formats are mentioned in my post on High Resolution Download Sites.  Here we discuss some of these formats so that you can better choose the ones that suit your needs and so you’ll know what to expect from your choices.

Digital audio exists either as a file stored on a disc or hard drive or as a bitstream, a sequence of bits being transmitted from one place or device to another.  Associated with most digital audio formats is a codec (COder-DECoder), which is an algorithm used to encode audio into the format and to decode it from that format.  The formats often have the same name as the associated codec.

Characteristics of Digital Audio Formats

The characteristics that differentiate digital audio formats include:

Resolution:  How many bits are used to encode each sample.  All things being equal, higher resolution affords higher sound quality, but requires bigger files and longer downloads.

Sampling frequency:  The rate at which the original audio signal is sampled.  All things being equal, higher sampling frequencies afford higher sound quality, but again require bigger files and longer downloads.

Number of channels:  Most music is 2-channel.  There is also multi-channel audio, of which 6 channel (5.1) and 8 channel (7.1) are the most common.  1-channel formats are also used for historical mono recordings.
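To give a feel for how resolution, sampling frequency and number of channels combine to set file size, here is a minimal Python sketch for uncompressed PCM audio.  The function names are my own, for illustration only, and the figures ignore file headers and tags.

```python
def pcm_bit_rate(bits_per_sample, sample_rate_hz, channels):
    """Uncompressed PCM bit rate in bits per second."""
    return bits_per_sample * sample_rate_hz * channels


def pcm_size_bytes(bits_per_sample, sample_rate_hz, channels, seconds):
    """Size of the raw audio data in bytes (no headers, no tags)."""
    return pcm_bit_rate(bits_per_sample, sample_rate_hz, channels) * seconds // 8


# One minute of stereo audio at two common settings.
cd = pcm_size_bytes(16, 44_100, 2, 60)       # CD quality
hires = pcm_size_bytes(24, 192_000, 2, 60)   # 24-bit / 192 kHz

print(f"CD quality:      {cd / 1e6:.1f} MB per minute")
print(f"24-bit/192 kHz:  {hires / 1e6:.1f} MB per minute")
```

One minute of CD-quality stereo comes to about 10.6 MB, while one minute of 24-bit/192 kHz stereo comes to about 69.1 MB, roughly six and a half times as much.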

Compression:  In some audio formats the original digital audio data is processed to reduce its size, producing smaller files and allowing quicker downloads.  This is compression and it can be lossless or lossy.  In lossless compression the original digital audio data can be fully recovered.  With lossy compression the original digital audio data cannot be fully recovered.  The data loss appears as a decrease in audio quality.

Lossy compression can produce smaller files than lossless compression.  However, the resulting decrease in sound quality is a serious compromise, and several factors are reducing the need for lossy compression, including:

  • Reduction in the cost and increase in the size of memories in computers, phones and portable music players.
  • Reduction in the cost and increase in the capacity of hard drives and solid state drives.
  • Increase in availability of broadband Internet and the bandwidth of WiFi networks.
  • Availability of lossless compression codecs.
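The difference between lossless and lossy compression can be illustrated with a toy Python sketch.  It is only an illustration under stated assumptions: the general-purpose zlib compressor stands in for a real lossless audio codec, and simply throwing away the low 8 bits of each sample stands in for a real lossy codec (real lossy codecs such as MP3 work very differently).

```python
import math
import struct
import zlib

# One second of a 440 Hz tone as 16-bit mono PCM sampled at 44.1 kHz.
rate = 44_100
samples = [round(20_000 * math.sin(2 * math.pi * 440 * n / rate))
           for n in range(rate)]
pcm = struct.pack(f"<{len(samples)}h", *samples)

# Lossless stand-in: compress, then decompress, and compare bit for bit.
packed = zlib.compress(pcm)
restored = zlib.decompress(packed)
lossless_ok = restored == pcm

# Lossy stand-in: discard the low 8 bits of every sample.
# The discarded information cannot be reconstructed afterwards.
coarse = [(s >> 8) << 8 for s in samples]
max_error = max(abs(a - b) for a, b in zip(samples, coarse))

print("original bytes:  ", len(pcm))
print("compressed bytes:", len(packed))
print("lossless round trip identical:", lossless_ok)
print("largest sample error after lossy step:", max_error)
```

The lossless round trip returns exactly the bytes that went in, whereas the lossy step leaves a permanent error in the samples.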

Tagging:  Some digital audio formats provide for the inclusion of metadata (data about the audio file), such as titles, artistes, composers, genres, album artwork, etc.

PCM and DSD

The post Analog … Digital … What’s it all about anyway? discusses sampling and quantisation.  The two main categories of digital audio formats, PCM and DSD, differ in how they address sampling and quantisation.

PCM:  Pulse code modulation (PCM) uses a multi-bit word to represent each sample.  Most digital audio formats use PCM.  Typical word lengths are 16 bits, 24 bits and 32 bits.  Two ‘families’ of sampling rates are frequently used in PCM.  The first uses multiples of the CD sampling rate, 44.1 kHz (44.1 kHz, 88.2 kHz, 176.4 kHz and 352.8 kHz).  The second uses multiples of 48 kHz (48 kHz, 96 kHz, 192 kHz and 384 kHz).
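As a small worked example, the two sampling rate families and the number of distinct values each word length can represent can be computed in a few lines of Python.  The variable names are mine and purely illustrative.

```python
BASE_RATES_HZ = (44_100, 48_000)   # CD rate and the 48 kHz base rate
MULTIPLIERS = (1, 2, 4, 8)

# Each family is the base rate multiplied by 1, 2, 4 and 8.
families = {base: [base * m for m in MULTIPLIERS] for base in BASE_RATES_HZ}

# A word of n bits can represent 2**n distinct sample values.
levels = {bits: 2 ** bits for bits in (16, 24, 32)}

for base, rates in families.items():
    print(base, "Hz family:", [f"{r / 1000:g} kHz" for r in rates])
for bits, count in levels.items():
    print(f"{bits}-bit word: {count:,} possible values")
```

A 16-bit word can take 65,536 different values and a 24-bit word 16,777,216.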

DSD:  Direct stream digital (DSD) represents each sample with 1 bit.  It was developed for use in SACDs.  DSD compensates for the low resolution of each sample by using much higher sampling frequencies than those used with PCM.  It is becoming more popular among audiophiles as DSD downloads appear and more DSD-capable DACs are introduced.

The base DSD sampling rate is 2.8224 MHz, which is 64 times the sampling rate of CD.  This is known as DSD64.  There are also double-rate DSD or DSD128 (5.6448 MHz sampling), quad-rate DSD or DSD256 (11.2896 MHz) and octuple-rate DSD or DSD512 (22.5792 MHz sampling).
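The DSD rates above follow directly from the CD sampling rate, as this short Python sketch shows.  It also compares the raw data rate per channel of DSD64 with that of CD-quality PCM.  The names used are illustrative only.

```python
CD_RATE_HZ = 44_100

# Each DSD rate is a fixed multiple of the CD sampling rate.
dsd_rates = {f"DSD{m}": CD_RATE_HZ * m for m in (64, 128, 256, 512)}

for name, rate in dsd_rates.items():
    print(f"{name}: {rate / 1e6:.4f} MHz")

# DSD uses 1 bit per sample, so bits per second per channel equals
# the sampling rate.  CD-quality PCM uses 16 bits per sample.
cd_bits_per_channel = 16 * CD_RATE_HZ
ratio = dsd_rates["DSD64"] / cd_bits_per_channel
print(f"DSD64 carries {ratio:g} times the raw data of CD per channel")
```

So although each DSD sample is only 1 bit, DSD64 still carries four times as much raw data per channel as CD.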

Within the digital audio camp the DSD vs PCM debate is as heated as is the analog audio vs digital audio debate.

Popular Digital Audio Formats

AIFF (Audio Interchange File Format) is an uncompressed PCM file format developed by Apple.  It is one of the most commonly used audio file formats on Apple Mac OS.  AIFF supports resolutions from 8 to 32 bits, sampling rates from 8kHz to 192kHz and from 1 to 8 channels.  It supports metadata tagging.

ALAC (Apple Lossless Audio Codec) is a lossless, compressed PCM file format developed by Apple.  It is effectively the Apple equivalent of FLAC.  ALAC supports bit depths of 16, 20, 24 or 32 bits, sampling rates from 1 Hz to 384 kHz, and from 1 to 8 channels.  It also supports metadata tagging.

DSD (Direct Stream Digital) is a lossless, uncompressed 1-bit file format.  It was developed by Sony and Philips for use in SACD.  Sampling rates range from DSD64 (2.8224 MHz) to DSD512 (22.5792 MHz), and up to at least 6 channels (5.1) are supported.  DSD supports metadata tagging.

DXD (Digital eXtreme Definition) is a lossless, uncompressed PCM format developed by Digital Audio Denmark and Merging Technologies.  It has a 24-bit depth and a 352.8 kHz sampling rate.  It was originally developed for editing and mastering DSD files for SACD.  It is now also being used as a very high resolution consumer audio format.

FLAC (Free Lossless Audio Codec) is an open, lossless, compressed PCM file format.  FLAC is widely used for CD quality and high resolution downloads.  FLAC supports bit depths of 8, 16, 20, 24 or 32 bits, sampling rates from 1 Hz to 655.35 kHz, and from 1 to 8 channels.  It supports metadata tagging.

MP3 (MPEG-1 or MPEG-2 Audio Layer III) is a lossy, compressed PCM file format developed by the Moving Picture Experts Group (MPEG).  MP3 is widely used in portable music players and for streaming low resolution audio over the Internet.  Its lossy compression limits its sound quality, but its small files make it attractive.  MP3 supports sample rates of 8 kHz to 48 kHz, bit rates of 8 kbps to 320 kbps and 1 or 2 channels.  It also supports metadata tagging.
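Because MP3 is described by its bit rate, file size is easy to estimate: bit rate times duration.  The Python sketch below assumes constant bit rate encoding and ignores tags and headers.  The function name is my own.

```python
def encoded_size_bytes(bit_rate_kbps, seconds):
    """Approximate size of a constant-bit-rate stream (1 kbps = 1000 bits/s)."""
    return bit_rate_kbps * 1000 * seconds // 8


# One minute of audio at two MP3 bit rates, next to uncompressed CD audio.
mp3_128 = encoded_size_bytes(128, 60)
mp3_320 = encoded_size_bytes(320, 60)
cd_pcm = 16 * 44_100 * 2 * 60 // 8   # 16-bit, 44.1 kHz, stereo

print(f"MP3 128 kbps: {mp3_128 / 1e6:.2f} MB per minute")
print(f"MP3 320 kbps: {mp3_320 / 1e6:.2f} MB per minute")
print(f"CD PCM:       {cd_pcm / 1e6:.2f} MB per minute")
```

Even at its highest bit rate of 320 kbps, an MP3 is less than a quarter of the size of the same music as uncompressed CD audio.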

WAV (Waveform Audio File Format) is a PCM file format developed by IBM and Microsoft.  Although WAV supports compression it is most commonly used uncompressed.  It is a common file format on Windows computers.  Its support for metadata tagging is limited and is not handled consistently by all players.
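For readers who like to experiment, uncompressed WAV files can be written and read with the wave module in Python's standard library.  The sketch below writes a tenth of a second of a 440 Hz tone to an in-memory buffer (rather than a file on disk, which is my choice for the example) and then reads back the format details.

```python
import io
import math
import struct
import wave

rate = 44_100

# 0.1 s of a 440 Hz tone as 16-bit little-endian mono samples.
frames = b"".join(
    struct.pack("<h", round(20_000 * math.sin(2 * math.pi * 440 * n / rate)))
    for n in range(rate // 10)
)

# Write the samples as an uncompressed PCM WAV.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)      # mono
    w.setsampwidth(2)      # 2 bytes = 16 bits per sample
    w.setframerate(rate)
    w.writeframes(frames)

# Read it back and report channels, resolution, sampling rate and length.
buf.seek(0)
with wave.open(buf, "rb") as r:
    info = (r.getnchannels(), r.getsampwidth() * 8,
            r.getframerate(), r.getnframes())
    read_back = r.readframes(r.getnframes())

print("channels, bits, rate, frames:", info)
print("audio data unchanged:", read_back == frames)
```

To write a real file instead, pass a file name such as "tone.wav" to wave.open in place of the buffer.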

WMA (Windows Media Audio) is a lossy, compressed PCM file format developed by Microsoft.  WMA is an alternative to MP3 and is used for streaming music in Windows.  It supports sample rates of 8 kHz to 48 kHz, bit rates of 8 kbps to 768 kbps and 2 channels.

© Wayne Butcher
