Transferring Digital Audio Using PC Soundcards

PC Musician By Martin Walker
Published July 1999

Figure 1: Modern PC audio editing software such as Wavelab has functions to check for digital integrity, as well as spotting stray glitches. There are a number of reasons why two files which should be the same may not be, and not all of them are obvious.

Most people expect that a digital input or output will simply provide a bit‑for‑bit copy of the original signal. After all, bits are bits, aren't they? Martin Walker sheds some light on the various things that can go wrong.

Musicians now tend to take for granted the multitude of possible software manipulations available to transform digital audio in ever‑more interesting or bizarre ways. However, most expect that when they transfer a digital audio file each bit of data in the original file will remain intact. This hasn't always been the perception: in the past, those more used to analogue recording would refuse to work from a digital copy, on the grounds that the original is always better!

Today, in many people's eyes digital audio is perfect. When they copy a track or complete album of songs from a DAT or Minidisc recorder to a PC they expect an identical version to appear on their hard drive, from where it can be written to a blank CD‑R disc using a CD writer. But while many such transfers are indeed bit‑for‑bit copies of the original, this isn't always the case. If you look carefully on most soundcard packaging you won't find a claim of bit‑for‑bit digital transfers. There may, for instance, be claims of 'highest possible audio quality', but no guarantees. This might seem ludicrous to seasoned professionals who take such things for granted.

Unexpected Manipulation

Figure 2: A professional soundcard such as Event's Layla provides comprehensive clock options, with two clock output choices (standard wordclock and superclock, which runs 256 times faster for faster sync'ing) and five possible inputs (Internal, Super, Word, MTC, or S/PDIF).

In some situations, digital audio may be changed without you realising it, simply due to the design of the software or hardware. As an example, S/PDIF sockets are being used to support 16‑, 20‑, and 24‑bit digital transfers. Given that so many PC soundcards offer internal signal paths of 24 bits or more it's fairly obvious that, once past the input socket, a 16‑bit signal is going to be treated as a 24‑bit signal. Seemingly, the sensible solution to this is to pad out the missing low‑end bits with zeros, so that when the same digital signal arrives at the S/PDIF output socket the top 16 bits are identical. In some cases this is what happens, but not always (more on this later).
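
The zero-padding idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, samples are treated as signed integers, and no particular soundcard is claimed to work this way. It places a 16-bit sample in the top of a 24-bit word and then recovers it, showing that the top 16 bits survive the round trip as long as nothing else touches the signal.

```python
def pad_16_to_24(sample_16):
    """Place a signed 16-bit sample in the top 16 bits of a 24-bit word."""
    return sample_16 << 8  # the 8 new low-end bits are all zero


def top_16_of_24(sample_24):
    """Recover the top 16 bits of a signed 24-bit word."""
    return sample_24 >> 8  # arithmetic shift, so the sign is preserved


# Illustrative sample values, including both 16-bit extremes.
original = [0, 1, -1, 12345, -32768, 32767]
round_trip = [top_16_of_24(pad_16_to_24(s)) for s in original]

print(round_trip == original)
```

If any processing between input and output alters the low 8 bits and the result is then rounded or dithered rather than simply truncated, the recovered 16 bits need no longer match.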

I came across another more basic example of how digital audio can be changed in an unexpected way when reviewing the Opcode DATport USB (Universal Serial Bus) digital interface recently. As you can read in the review starting on page 48, I was initially foxed when signals sent to its S/PDIF output emerged exactly 6dB down compared to the original file. Since this was such an exact figure it might suggest a 'missing bits' bug, but in fact there was nothing wrong at all: USB audio can be used to send signals to external 'digital' speakers, so Microsoft placed a software output level control in their standard mixer applet. This is fine if you just want to change the volume of low‑cost digital speakers, but for faithful bit‑for‑bit transfers of digital audio it is essential that this slider is at the very top (unity gain) position, since otherwise you will be throwing away resolution during playback.
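
The cost of a digital level control can be illustrated with a short Python sketch. It assumes 16-bit samples and a simple multiply-and-round gain stage, which is an assumption made for illustration and not a description of how the Microsoft mixer applet is actually coded; the function names are hypothetical.

```python
import math


def gain_from_db(db):
    """Convert a level change in dB to a linear multiplier."""
    return 10 ** (db / 20)


def apply_gain_16(sample, gain):
    """Scale one signed 16-bit sample and round back to an integer."""
    return int(round(sample * gain))


# Halving the amplitude corresponds to a drop of roughly 6dB.
half_db = 20 * math.log10(0.5)

# Count how many distinct output codes remain after halving every
# possible 16-bit input value.
outputs = {apply_gain_16(s, 0.5) for s in range(-32768, 32768)}

print(round(half_db, 2), len(outputs))
```

Only about half of the 65,536 possible input codes survive as distinct output values, so a 6dB cut costs roughly one bit of resolution, and the original samples cannot be recovered by turning the level back up afterwards.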

Where digital inputs are concerned, any level control in the digital domain will also result in reduced dynamic range, but this is potentially more serious, since it will affect your recordings. For this reason you might think that there would be no point in having digital input‑level controls, and in general this is true. However, some low‑cost soundcards have a level control for the analogue input that is in the digital domain. One example of this is Event's Darla (the company's Gina and Layla cards provide analogue input level controls). Event told me that Darla has a digital level control only to reduce the signal once it has passed through the A‑D converter, and it is essential that this control is permanently left full up for the best possible signal quality. Event's web site now states that "adjustments to record levels must be made either at the source or within the host application", so I suspect that the latest drivers ignore the control altogether.

Clocking On

Figure 3: Glitches can occur when audio is being grabbed from CDs, or when data is being read from DAT tapes with so many errors that the correction circuitry can't cope. Some, like this one, are obvious when displayed in a waveform editor. Others are easy to miss unless you track them down using sophisticated detection algorithms.

One important concern with digital transfers is the Master/Slave relationship, something I also mentioned in my review of the Opcode DATport. When transferring the contents of a DAT tape or Minidisc digitally using a PC soundcard, you will normally only obtain a bit‑for‑bit copy if the soundcard can act as a slave, faithfully locking onto and following the clock present in the S/PDIF signal arriving at the digital input.

If the soundcard can only be used as a Master it will certainly record a high‑quality digital signal, but this won't be an exact copy of the original, since the clocks are not locked together. Effectively, what you will be doing is freewheeling. You'll get a digital transfer which has not passed through either an A‑D or D‑A converter, and though it's likely to sound very similar much of the time, the bits won't be the same and the audio will be prone to occasional clicks and pops (see the 'Practical Tests' section for an example).

You don't normally see 'Master' or 'Slave' settings in soundcard utility software. In the case of budget soundcards there is rarely any option at all, and on more expensive cards the options are likely to be under the heading of 'Clock'. The Internal (master) clock is normally used when playing back existing recordings, or when recording analogue signals, and for optimum sound quality this must be as stable in frequency as possible. Any variation (or jitter) in the frequency will degrade the sound when it undergoes conversion — either from analogue to digital if you're recording, or from digital to analogue if you're monitoring the output signal. Once the conversion is done, this jitter cannot be removed from the signal, since it is essentially random in nature.

When you're recording from a digital input, the soundcard must be able to switch off its Internal clock, since the signal arriving at the S/PDIF input socket already has a clock element present. Your soundcard's External clock option needs to be enabled, so that the card becomes the slave and can lock itself to the incoming signal. This is known as a synchronous transfer. If your soundcard doesn't have such a switched option it is unlikely to be able to record bit‑for‑bit copies from a digital input socket.

Some professional recording studios slave their digital equipment to a separate Master clock with extremely low jitter. It's an ideal solution, but it relies on every other digital device being able to run as a slave, and not all can do this.

Sample Rate Conversion

Figure 4: The original (top trace) and copy (bottom trace) may look and sound identical, but Wavelab's File Comparer function revealed a difference between them: well‑behaved dither on the lowest bit. The middle trace shows this in a greatly amplified form.

The absence of an External clock option isn't necessarily a disaster. Several soundcards — such as the Emu APS and Soundblaster Live! — provide fixed‑frequency internal processing at 48kHz, and because of this design choice their digital output sockets are also at a fixed 48kHz. Since these cards cannot be slaved to an external digital signal, another approach is used: asynchronous sample rate conversion (SRC).

Essentially, this interpolates (oversamples) the input signal, by stuffing zeroes between the original samples, and then digitally low‑pass filters the result to effectively give a much higher sample rate. Then a second process known as decimation (or downsampling) reduces this rate. By choosing different integer values for each process you can convert one sample rate to another. For instance, you could convert incoming audio at 44.1kHz to 48kHz by multiplying by 160 and then dividing by 147.
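
The 160 and 147 figures fall straight out of reducing the ratio of the two sample rates to its lowest terms. The following Python sketch (the function name is a hypothetical one chosen for illustration) works out the integer factors and the intermediate sample rate reached after the interpolation stage; it does not implement the filtering itself.

```python
from fractions import Fraction


def src_factors(rate_in, rate_out):
    """Return (multiply_by, divide_by) for converting rate_in to rate_out."""
    ratio = Fraction(rate_out, rate_in)  # automatically reduced to lowest terms
    return ratio.numerator, ratio.denominator


up, down = src_factors(44100, 48000)

# Sample rate after interpolation and before decimation.
intermediate_rate = 44100 * up

print(up, down, intermediate_rate)  # 160 147 7056000
```

The intermediate rate of just over 7MHz gives some idea of why a real-time converter has to compromise on filter quality.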

Given that resampling takes place, should we worry about its effect on quality? Well, it is possible to build a perfect SRC, given sufficient processing power, but a more modest real‑time version, such as the one used in the 10K1 chip of the Emu APS and SB Live!, will produce a certain amount of 'pass‑band ripple' — a series of tiny bumps in the frequency response that are caused by the digital filtering. The Emu spec is designed to tolerances of +/‑0.5dB, and this is what gives rise to the 1dB anomalies that some people have measured on the S/PDIF input of the SB Live! soundcard. However, the most important thing to understand is that a digital signal that has been resampled is still normally likely to be in a far better state than an analogue signal that has been passed through an A‑D converter. The huge difference is that while resampling relies on a software algorithm, A‑D conversion passes the signal through extra hardware.

Emu remain convinced that a small amount of ripple is preferable to the extra noise and other anomalies which would inevitably result from using the analogue path, but acknowledge that it would be preferable to have zero pass‑band ripple. They have already developed DSP code that runs in the 10K1 effect engine and compensates for the ripple, and this will be available in a future driver release for the Emu APS card.

The APS also has a fader available for its digital input, but this is designed to give unity gain when raised to its top position, to make it easy to match levels during a digital transfer. The SB Live! has the disadvantage that its software mixer utility doesn't currently have a 'unity gain' point, making it more difficult to record a digital transfer at the same level as the original.

The Best Approach

Figure 5: If you don't slave your S/PDIF input to an incoming digital clock signal it will freewheel and occasionally miss samples altogether, giving rise to clicks. Here the original 1kHz sine wave shown at the top has been transferred with the clock switched to Internal (Master). You can clearly see a missed sample at the identical point in the copied file beneath.

Apart from the wide range of quality effects these soundcards provide, their main advantage is their ability to mix both analogue and digital signals in real time (see the 'Mix & Match' box). The reason why the Emu 10K1 DSP chip can do this is that each signal is converted to 48kHz on entering the system, and it can then be dealt with just like another voice in a sampler.

Emu are very open about the advantages and disadvantages of their fixed‑frequency approach, and their web site (www.emu.com/dtm/aps_home.html) contains an informative paper on just this subject. They encourage people to work exclusively at 48kHz (which gives slightly better audio quality anyway), arguing that most DAT tapes, along with DVD and Minidisc, are already at this sample rate, and that the entire audio chain, including their many DSP effects, benefits from the increase.

If you choose to record and play back your entire project with a 48kHz sample rate, any 44.1kHz signals imported via the S/PDIF input should only change by a tiny amount. Unfortunately, if you choose to stick with 44.1kHz for your recordings they will be converted to 48kHz as they enter the chip, and then automatically downsampled back to 44.1kHz before being saved on the hard drive, which means a second trip through an SRC.

I tried a digital transfer at 44.1kHz using the SB Live! (with two such passes through an SRC), and after a few tweaks of its recording level fader managed to achieve a reasonably close level match with the original. On replay this certainly sounded very similar to the original file, but I could still reliably identify the original every time (it was more transparent, and there was a slight harshness at the top end of the SB Live! version). This rules out the SB Live! as a stand‑alone solution for digital copying in a professional studio, but it is, after all, a low‑cost consumer soundcard, and is still excellent value for money considering the huge number of features it provides.

The secret with the Emu chip is to work entirely at 48kHz during a project. Resampling would then only be necessary once, at the end, to convert the master mix down to 44.1kHz before burning an audio CD. In this case audio degradation would not be quite such an issue, especially since the sample rate is being reduced rather than increased. Also, rather than relying on real‑time SRC algorithms, the conversion can be done off‑line using a high‑quality software‑based conversion utility. Emu have developed their own 'optimum sample‑rate converter' as a stand‑alone application for APS customers at no charge.

One final point for APS owners is that there is apparently a way to achieve bit‑for‑bit copies, although I haven't been able to test it myself. It only works with the Cubase VST ASIO drivers and 48kHz signals. First you need to set your DAT recorder (or other digital device) to slave to the fixed 48kHz clock signal from the APS digital output. Then you record using either of the two S/PDIF inputs of the card, but switch off the Econtrol mixer. This should ensure that no sample‑rate conversion takes place, and that the file recorded to your hard drive is identical to the original.

Disturbing The Flow

Even when you're using asynchronous transfers, or when the master/slave relationship has been set up correctly for a synchronous transfer, it is still possible to occasionally lose samples when transferring audio data between devices. Despite the error correction provided for the data coming off DAT tapes and CDs, once the signal is being sent down a cable to another digital input socket there is no further error correction involved. The data is simply streamed continuously (along with various subcodes), with no possibility to retry for any bits that disappear en route if you experience a power supply glitch, are using long, low‑quality cables, or have an intermittent contact somewhere.

Another potential problem when transferring from a medium such as DAT to a PC soundcard is the very clever error‑correction methods used by DAT recorders. Paul White and others have argued for some time that error readouts should be provided on all DAT machines, so that it's possible to see when a particular tape is reaching the end of its useful life, or when the heads need cleaning (this tends to be when the number of errors increases). There is certainly clever stuff going on in DAT recordings to help reduce the risk of the data being lost altogether, but anyone who has worked for a long time with a DAT recorder will tell you that you can experience strange noises when playing back some tapes. You may notice small ticks, pops, bursts of 'static' or even, in some cases, complete blanking of the audio output for several seconds if there's severe data corruption.

Checking For Problems

The effects listed above tend to be clearly audible, and most people would repeat a digital transfer if they heard any, but during a long digital transfer it is quite possible that some sections will contain interpolated data where a tiny section has been error‑corrected. If the DAT replay itself varies it can be difficult to establish whether a particular soundcard is, in fact, producing bit‑for‑bit copies.

When data corruption occurs, it can be in several forms. The most obvious is when sections of the waveform arrive late, leaving a gap of a few samples. These can often be spotted by examining the waveform closely (see Figure 3). Worse are those forms of corruption that repeat a section of the waveform so that the only visible sign is a sudden instantaneous change in sample value.

These kinds of glitches may be obvious to the ear or virtually inaudible, depending on the type of music concerned and where in the waveform the glitches occur. While you might think that 'inaudible' glitches are not worth bothering with, they are still a sign that something is wrong, and the potential exists for your valuable music to become irretrievably corrupted at some point. At the end of a long session you may not notice a few stray clicks, but sooner or later you will, and although there are various utilities that may be able to aid you in 'rebuilding' your waveforms, you will not be popular if you're running a commercial concern and your customers notice a problem before you do. Even digital grabs of tracks from audio CDs can be prone to glitches in some cases (see the 'Papering Over The Cracks' box).

Fortunately, some modern software has functions that can check for such corruption, and whenever you're initially testing a new piece of gear it's wise to check at least the first few transfers for digital integrity. Steinberg's Wavelab, for instance, has a Global Analysis function with a special page for glitch and clip errors. In the case of clipping it scans the data for several consecutive samples at maximum digital value. You can select how many samples, but the default is a sensible four.
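
A clip check of this kind is simple to sketch in Python. The version below is only an illustration of the general idea (runs of consecutive samples at full scale, with a default of four), assuming signed 16-bit samples; it is not Wavelab's actual algorithm, and the function name is hypothetical.

```python
def find_clipped_runs(samples, min_run=4, full_scale=32767):
    """Return (start_index, length) for each run of at least min_run
    consecutive samples sitting at positive or negative full scale."""
    runs = []
    run_start = None
    for i, s in enumerate(samples):
        clipped = s >= full_scale or s <= -full_scale - 1
        if clipped and run_start is None:
            run_start = i  # a new run of full-scale samples begins here
        elif not clipped and run_start is not None:
            if i - run_start >= min_run:
                runs.append((run_start, i - run_start))
            run_start = None
    # Handle a run that continues to the very end of the data.
    if run_start is not None and len(samples) - run_start >= min_run:
        runs.append((run_start, len(samples) - run_start))
    return runs


test_samples = [0, 32767, 32767, 32767, 32767, 100, 32767, 32767, 0]
print(find_clipped_runs(test_samples))
```

Here the four full-scale samples starting at index 1 are reported, while the later pair is ignored because it is shorter than the chosen run length.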

The glitch‑detection routine (like most such analytical algorithms) works by looking for sudden large jumps in sample value. Again, you can set your own Threshold (how big a jump in level is reported as a glitch). Automatic glitch detection is a useful tool, but it is always wise to zoom in on the waveform to check for yourself, as this sort of algorithm is not infallible and it is often easy to spot that a sudden jump in a waveform is legitimate.
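
As a rough illustration of threshold-based glitch detection, the Python sketch below flags any point where the sample value changes by more than a chosen amount from one sample to the next. The function name, the test signal and the threshold of 4000 are all assumptions made for this example, and real detectors are likely to be more sophisticated.

```python
import math


def find_jumps(samples, threshold):
    """Return the indices where the value jumps by more than threshold."""
    return [i for i in range(1, len(samples))
            if abs(samples[i] - samples[i - 1]) > threshold]


# A 1kHz sine wave sampled at 44.1kHz, peaking at 20000.
clean = [round(20000 * math.sin(2 * math.pi * 1000 * n / 44100))
         for n in range(200)]

# Simulate one missed sample near a zero crossing, where the slope is steepest.
glitched = clean[:44] + clean[45:]

print(find_jumps(clean, 4000), find_jumps(glitched, 4000))
```

The clean sine produces no reports, while the missing sample is flagged. Had the sample been dropped near a peak of the waveform, where successive values are close together, the jump would have been far smaller and would have slipped under the same threshold, which is one reason such routines cannot be relied on completely.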

Practical Tests

To give you more confidence in your digital transfers, a function such as Wavelab's File Comparer is invaluable. You simply select any two files and they are compared on a bit‑by‑bit basis. A very useful option is that of generating a 'Delta' file of the differences encountered; in the case of glitches, only these errors will appear in the Delta file. This procedure is also useful in determining just what an effects algorithm is doing to your signal. In the case of more subtle digital manipulation, you can also amplify the Delta file by a chosen number of dBs, to hear the difference more clearly. If you don't have Wavelab, you can perform a similar function by inverting the copied file (for instance, using the Invert/Flip process in Sonic Foundry's Sound Forge) and then mixing it with the original. The result will be a Delta file.
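
The invert-and-mix approach amounts to a sample-by-sample subtraction, as the following Python sketch shows. The function names and sample values are hypothetical, and the two files are assumed to be already aligned to the same starting sample.

```python
def delta(original, copy):
    """Subtract copy from original, sample by sample.

    This is the same as inverting the copy and mixing it with the original.
    """
    return [o - c for o, c in zip(original, copy)]


def amplify_db(samples, gain_db):
    """Apply a gain expressed in dB to a list of samples."""
    gain = 10 ** (gain_db / 20)
    return [s * gain for s in samples]


original = [0, 100, 200, 300, 200, 100]
copy = [0, 100, 201, 300, 199, 100]

diff = delta(original, copy)      # zero wherever the files agree
louder = amplify_db(diff, 60)     # 60dB is a factor of 1000

print(diff)
```

A bit-for-bit copy gives a Delta file consisting entirely of zeroes; anything else shows exactly where, and by how much, the copy departs from the original.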

I tested the digital transfer capabilities of my Event Gina card using this facility. First I played back a WAV file through its S/PDIF output and recorded it to DAT. Then I re‑recorded the audio back through the Gina S/PDIF input. After topping and tailing the transferred file carefully, to ensure that it started at exactly the same sample point as the original, I ran the File Comparer function. This informed me that there were thousands of differences between the two files, although to my ears they sounded identical. I then generated a Delta file, and this proved that the differences were at such a low level as to be virtually inaudible. Generating another with 60dB gain showed some very well‑behaved noise at a level of about –30dB. This shows that the difference between the original and copied files was originally at ‑90dB, a sure sign that low‑level dithering had been added to the lowest bit (see Figure 4).
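
The arithmetic behind that conclusion can be checked with a few lines of Python. The sketch assumes 16-bit audio with full scale taken as 32768, and the helper name is hypothetical.

```python
import math


def dbfs(amplitude, full_scale=32768):
    """Level of a given amplitude relative to digital full scale, in dB."""
    return 20 * math.log10(amplitude / full_scale)


measured_level_db = -30   # level of the amplified Delta file, from the text
applied_gain_db = 60      # gain added when generating the Delta file

# Removing the added gain gives the true level of the difference signal.
original_level_db = measured_level_db - applied_gain_db

# Level of a signal just one 16-bit step in amplitude.
lsb_level_db = dbfs(1)

print(original_level_db, round(lsb_level_db, 1))  # -90 -90.3
```

A difference signal at around ‑90dB is therefore about the size of the lowest bit of a 16‑bit file, which is what you would expect from low‑level dither.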

I had already proved that the Gina S/PDIF output provided bit‑for‑bit transfers when testing the Opcode DATport, and this suggested that the dithering was happening on the input side. A possible explanation for this is that the 16‑bit digital signal was being padded out with zeroes to an internal 24‑bit signal, but that when the signal was saved as 16‑bit to hard disk it was dithered down to 16‑bit. This sort of dithering is generally advantageous when dealing with 24‑bit signals, since with most dithering algorithms it is possible to achieve a couple of extra bits worth of resolution above that of a standard 16‑bit file. I would personally not be worried about this happening during digital transfers unless I expected to bounce backwards and forwards several times — it's not generally recommended to add dither more than once to a signal.
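
To see how dithering could alter a zero-padded signal, here is a hedged Python sketch. The text only offers dithering as a possible explanation and does not say which algorithm the Gina drivers use, so the triangular (TPDF) dither below, the function name and the fixed random seed are all assumptions made purely for illustration.

```python
import random


def dither_24_to_16(sample_24, rng):
    """Reduce a signed 24-bit sample to 16 bits with TPDF dither."""
    # Triangular dither: the difference of two uniform random values,
    # spanning just under +/-1 step at 16-bit resolution.
    noise = rng.random() - rng.random()
    value = round(sample_24 / 256 + noise)
    return max(-32768, min(32767, value))  # keep within the 16-bit range


rng = random.Random(0)  # fixed seed so the sketch is repeatable

original_16 = [rng.randint(-32768, 32767) for _ in range(1000)]
padded_24 = [s * 256 for s in original_16]          # zero-padded to 24 bits
saved_16 = [dither_24_to_16(s, rng) for s in padded_24]

differences = [a - b for a, b in zip(original_16, saved_16)]
changed = sum(1 for d in differences if d != 0)

print(changed, max(abs(d) for d in differences))
```

Even though the low 8 bits were all zero, a proportion of the saved samples differ from the originals, but never by more than one step of the lowest bit, which is consistent with a Delta file full of very low-level noise.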

I then tried the same digital input test with the Gina clock deliberately set to Internal, so that the copy would not be clock‑locked. It's easy to fool yourself into thinking that there is still no audible difference, and with much material you may not notice any change, even when zooming in and comparing the waveforms of original and copy more closely. However, one sure test is to use a high‑frequency tone (or a slowly decaying piano note), since any skipped samples will be immediately obvious in such a simple waveform. Sure enough, when I tried a 1kHz sine‑wave I could hear a regular 'tick' several times a second (see Figure 5), and this time the Delta file needed no added gain to allow me to hear the difference between the two. With music files, although I had adjusted the tracks to start at the same sample they immediately started to drift out of sync.
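
A back-of-envelope Python calculation shows why an unlocked transfer ticks at a regular rate. If the two clocks differ slightly in frequency, one sample is gained or lost each time the accumulated difference reaches a whole sample period. The 100 parts-per-million offset used below is a purely illustrative assumption and not a measured figure for any of the equipment mentioned; the function name is also hypothetical.

```python
def slips_per_second(sample_rate_hz, clock_offset_ppm):
    """Samples gained or lost per second between two unlocked clocks."""
    return sample_rate_hz * abs(clock_offset_ppm) / 1_000_000


rate = slips_per_second(44100, 100)

# Total drift, in samples, between original and copy after one minute.
drift_after_minute = rate * 60

print(round(rate, 2), round(drift_after_minute))
```

With these assumed figures there would be a handful of slipped samples every second, and the two files would be hundreds of samples apart after a minute, which fits both the regular tick and the drift described above.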

This brings us neatly back to a perennial problem when musicians attempt to run several soundcards side by side. If you are, for instance, using several soundcards to replay long tracks, each controlled by an application such as Cakewalk Pro Audio, Cubase VST, or Logic Audio, they will only stay in sync if they are all locked to a single clock. You can do this either by slaving them all to a single high‑quality Master clock (as mentioned earlier), or by using the first to provide the Master clock and then chaining the S/PDIF output of the first to the S/PDIF input of the second, and so on.

Digital Dilemmas

If you buy a soundcard with digital I/O specifically to transfer your DAT tapes to your PC you will probably be annoyed if they are changed in any way en route. However, semi‑pro musicians are only a tiny part of the consumer market, and their needs are unlikely to be very high on the agenda when low‑cost consumer soundcards are designed. If you buy a professional soundcard with digital I/O, the manufacturer knows what you're going to be doing with it, and bit‑for‑bit copying facilities are likely to be included. Despite reviewing a current total of 27 soundcards, I have yet to find one at less than £200 that provides these facilities. When I do, I'll let you know!

Mix & Match

You may be wondering why Emu designed their 10K1 chip (featured in their APS as well as the SB Live! and SB Value soundcards, as explained in the main text) to provide asynchronous sample‑rate conversion, rather than using a simpler Internal/External clock system. One of the main reasons is that digital outputs are now appearing on many devices, including digital mixers and effect units, hardware samplers, some MIDI keyboards and modules, CD players, and even some humble CD‑ROM drives.

In the case of high‑end devices, such as digital mixers and effect units, a digital input is likely to be fitted, and provision made for these devices to be driven from a Master clock. However, many of the other types of unit mentioned above can only work as a Master device, and this poses problems if you want to use more than one at a time. For instance, a CD player can only be used as a Master device, so to use one in conjunction with other digital equipment you would have to allow the CD player to provide the master clock for the rest of your studio (not generally a good idea).

Emu's answer is to provide each digital input with an asynchronous sample‑rate converter that can accept a digital stream at between 28kHz and 53kHz and convert it to the 48kHz sample rate that their card uses internally. This is the only way to mix digital signals at differing sample rates — say, the digital out of a sampler fixed at 44.1kHz and the 48kHz fixed output of a MIDI module. There are even benefits at consumer level, since game players can have music from the digital output of their CD‑ROM drive mixed in real time with internal sampler voices and WAV file playback. Thus, a design like this can overcome many frustrations when dealing with multiple digital devices.

Picture This

One thing that would really help people to understand a complex product such as a soundcard is a proper diagram of signal flow, like the equivalents found in nearly every hardware mixer manual — manufacturers please take note! To be fair, a few soundcard manufacturers do provide these, usually because the routing possibilities of their card are complex and potentially confusing, but at least users can follow the course of an input signal through the various switches and faders if they temporarily lose their output signal.

When it comes to digital manipulation it's even more difficult to obtain details, since in many cases not even technical support personnel have the necessary information. It's fair to say that most users don't need to know such technicalities as where and how internal signals are dithered down from their internal format before emerging from digital or analogue output sockets, but this information can sometimes be valuable when fault‑finding, and some sort of functional diagram would greatly help.

Papering Over The Cracks

Even when you're grabbing tracks from an audio CD (Digital Audio Extraction, or DAE) there is potential for digital corruption. Normally, when an audio CD is being read the information is simply passed into the CD player's buffer and read out by a highly accurate clock at an extremely steady rate (the accuracy of the clock determines the amount of traditional jitter in the final signal). The spindle speed of the CD player is adjusted to make sure that there is always some data in the buffer.

When data files are being extracted from a CD‑ROM, each block of data contains a header with sync information and a copy of the block's address, so that the drive can easily find the start and address of each block — vital for data storage. However, during digital audio extraction the PC has to grab a chunk of data, write this to the hard drive, and then return to exactly the same point in the data stream. Since the absolute position of the previous grabbed block in the data stream will be slightly uncertain, the PC may not always continue extracting at exactly the same place where it left off in the stream of audio data. If it restarts from a point a few bytes earlier in the stream, part of the waveform will be repeated; a few bytes later and there will be a gap in the waveform. Either way there will be a sudden discontinuity in the grabbed waveform, which will result in an audible click on playback.

One solution used by several software packages is to grab larger overlapping sections of audio. These sections are then moved backwards or forwards until the overlapping sections match exactly and are in sync. This technique is used by Adaptec's Easy CD Creator and Cequadrat's WinOnCD, amongst others, and is variously called 're‑sync'ing' or 'jitter correction'.
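
The overlap-matching idea can be sketched in Python as follows. This is a simplified illustration of the general technique and not the method used by any of the packages named; the function name, the block sizes and the test data are all hypothetical.

```python
def append_with_resync(written, new_block, overlap):
    """Append new_block to written, using an overlap to find the join.

    new_block is expected to begin a little before the end of written.
    The last `overlap` samples already written are searched for inside
    new_block, and only the samples following the match are appended.
    """
    tail = written[-overlap:]
    for start in range(len(new_block) - overlap + 1):
        if new_block[start:start + overlap] == tail:
            return written + new_block[start + overlap:]
    raise ValueError("no matching overlap found")


# A stand-in for the audio on the disc, with every sample value unique.
stream = list(range(100))

written = stream[:50]        # audio already extracted
new_block = stream[37:80]    # the drive restarted a few samples early

result = append_with_resync(written, new_block, overlap=10)
print(result == stream[:80])
```

The join comes out seamless even though the drive did not restart exactly where it was asked to. One caveat is that stretches of silence or exactly repeating audio can match in more than one place, so a practical implementation would need a longer overlap or some further check.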

It is always best, when first using a new CD‑ROM or CD‑R drive for DAE, to determine the maximum reliable extraction rate for your particular drive. Some drives may need jitter correction switched on, and although this will slow down the rate of extraction it's better than having glitches in your audio. Other drives may benefit from having their extraction rates 'capped' at a certain speed, rather than attempting to extract audio at the maximum speed available. You can read more about this subject in my 'Taking the Bits' feature in SOS August '98.