PERFECT COPIES, EVERY TIME?
Transferring Digital Audio Using PC Soundcards

Published in SOS July 1999
Technique : PC Musician

Most people expect that a digital input or output will simply provide a bit-for-bit copy of the original signal. After all, bits are bits, aren't they? Martin Walker sheds some light on the various things that can go wrong.
Today, in many people's eyes, digital audio is perfect. When they copy a track or complete album of songs from a DAT or Minidisc recorder to a PC they expect an identical version to appear on their hard drive, from where it can be written to a blank CD-R disc using a CD writer. But while many such transfers are indeed bit copies of the original, this isn't always the case. If you look carefully on most soundcard packaging you won't find a claim of bit-for-bit digital transfers. There may, for instance, be claims of 'highest possible audio quality', but no guarantees. This might seem ludicrous to seasoned professionals who take such things for granted.

Unexpected Manipulation

In some situations, digital audio may be changed without you realising it, simply due to the design of the software or hardware. As an example, S/PDIF sockets are being used to support 16-, 20-, and 24-bit digital transfers. Given that so many PC soundcards offer internal signal paths of 24 bits or more it's fairly obvious that, once past the input socket, a 16-bit signal is going to be treated as a 24-bit signal. Seemingly, the sensible solution to this is to pad out the missing low-end bits with zeros, so that when the same digital signal arrives at the S/PDIF output socket the top 16 bits are identical. In some cases this is what happens, but not always (more on this later).

I came across another more basic example of how digital audio can be changed in an unexpected way when reviewing the Opcode DATport USB (Universal Serial Bus) digital interface recently. As you can read [...]

Where digital inputs are concerned, any level control in the digital domain will also result in reduced dynamic range, but this is potentially more serious, since it will affect your recordings. For this reason you might think that there would be no point in having digital input-level controls, and in general this is true.
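To see why a level control in the digital domain costs resolution while zero-padding does not, consider the following minimal Python sketch. It is an illustration under stated assumptions only: the function names are hypothetical, samples are treated as plain signed integers, and nothing here describes how any particular soundcard or driver implements its fader.

```python
def pad_16_to_24(sample):
    """Place a 16-bit sample in the top 16 bits of a 24-bit word (low 8 bits zero)."""
    return sample << 8


def truncate_24_to_16(sample):
    """Keep only the top 16 bits of a 24-bit word."""
    return sample >> 8


def apply_gain(sample, gain):
    """Scale an integer sample and round back to an integer (a hypothetical digital fader)."""
    return int(round(sample * gain))


# A handful of arbitrary 16-bit sample values
samples = [-32768, -1235, -1, 0, 1, 777, 12345]

# Padding with zeros and later discarding them returns the original bits
round_trip = [truncate_24_to_16(pad_16_to_24(s)) for s in samples]
padding_is_lossless = round_trip == samples

# Halve the level (about 6dB down), then double it again
attenuated = [apply_gain(s, 0.5) for s in samples]
restored = [apply_gain(s, 2.0) for s in attenuated]
gain_is_lossless = restored == samples

print("padding lossless:", padding_is_lossless)
print("fader down and up lossless:", gain_is_lossless)
```

Once the fader has rounded away the low-order bits, turning the level back up cannot restore them, whereas the zero-padded samples survive the round trip unchanged.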
However, some low-cost soundcards have a level control for the analogue input that is in the digital domain. One example of this is Event's Darla (the company's Gina and Layla cards provide analogue input level controls). Event told me that Darla has a digital level control only to reduce the signal once it has passed through the A-D converter, and it is essential that this control is permanently left full up for the best possible signal quality. Event's web site now states that "adjustments to record levels must be made either at the source or within the host application", so I suspect that the latest drivers ignore the control altogether.

Clocking On

One important concern with digital transfers is the Master/Slave relationship -- something I also mentioned in my review of the Opcode DATport. When transferring the contents of a DAT tape or Minidisc digitally, [...]

If the soundcard can only be used as a Master it will certainly record a high-quality digital signal, but this won't be an exact copy of the original, since the clocks are not locked together. Effectively, what you will be doing is freewheeling. You'll get a digital transfer which has not passed through either an A-D or D-A converter, and though it's likely to sound very similar much of the time, the bits won't be the same and the audio will be prone to occasional clicks and pops (see the 'Practical Tests' section for an example).

You don't normally see 'Master' or 'Slave' settings in soundcard utility software. In the case of budget soundcards there is rarely any option at all, and on more expensive cards the options are likely to be under the heading of 'Clock'. The Internal (master) clock is normally used when playing back existing recordings, or when recording analogue signals, and for optimum sound quality this must be as stable in frequency as possible.
Any variation (or jitter) in the frequency will degrade the sound when it undergoes conversion -- either from analogue to digital if you're recording, or from digital to analogue if you're monitoring the output signal. Once the conversion is done, this jitter cannot be removed from the signal, since it is essentially random in nature. When you're recording from a digital input, the soundcard must be able to switch off its Internal clock, since the signal arriving at the S/PDIF input socket already has a clock element present. Your soundcard's [...]

Some professional recording studios slave their digital equipment to a separate Master clock with extremely low jitter. It's an ideal solution, but it relies on every other digital device being able to run as a slave, and not all can do this.

Sample Rate Conversion

The absence of an External clock option isn't necessarily a disaster. Several soundcards -- such as the Emu APS and Soundblaster Live! -- provide fixed-frequency internal processing at 48kHz, and because of this design choice their digital output sockets are also at a fixed 48kHz. Since these cards cannot be slaved to an external digital signal, another approach is used: asynchronous sample rate conversion (SRC). Essentially, this interpolates (oversamples) the input signal, by stuffing zeroes between the original samples, and then digitally low-pass filters the result to effectively give a much higher sample rate. Then a second process known as decimation (or downsampling) reduces this rate. By choosing different integer values for each process you can convert one sample rate to another. For instance, you could convert incoming audio at 44.1kHz to 48kHz by multiplying by 160 and then dividing by 147.

Given that resampling takes place, should we worry about its effect on quality?
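Before turning to that question, the 160 and 147 figures quoted above can be checked with a few lines of Python. This is only the arithmetic behind the ratio, not a sample-rate converter; the function name is hypothetical and the filtering stage is left out entirely.

```python
from math import gcd


def resample_ratio(source_rate, target_rate):
    """Return the smallest integer (interpolation, decimation) pair for a rate change."""
    common = gcd(source_rate, target_rate)
    return target_rate // common, source_rate // common


up, down = resample_ratio(44100, 48000)

# Rate of the zero-stuffed signal before it is filtered and decimated
intermediate_rate = 44100 * up

print("interpolate by", up, "then decimate by", down)
print("intermediate rate in Hz:", intermediate_rate)
print("output rate in Hz:", intermediate_rate // down)
```

The size of the intermediate rate gives some idea of why a real-time converter has to economise on its filtering.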
Well, it is possible to build a perfect SRC, given sufficient processing power, but a more modest real-time version, such as the one used in the 10K1 chip of the Emu APS and SB Live!, will produce a certain amount of 'pass-band ripple' -- a series of tiny bumps in the frequency response that are caused by the digital filtering. The Emu spec is [...]

In the case of high-end devices, such as digital mixers and effect units, a digital input is likely to be fitted, and provision made for these devices to be driven from a Master clock. However, many of the other types of unit mentioned above can only work as a Master device, and this poses problems if you want to use more than one at a time. For instance, a CD player can only be used as a Master device, so to use one in conjunction with other digital equipment you would have to allow the CD player to provide the master clock for the rest of your studio (not generally a good idea).

Emu's answer is to provide each digital input with an asynchronous sample-rate converter that can accept a digital stream at between 28kHz and 53kHz and convert it to the 48kHz sample rate that their card uses internally. This is the only way to mix digital signals at differing sample rates -- say, the digital out of a sampler fixed at 44.1kHz and the 48kHz fixed output of a MIDI module. There are even benefits at consumer level, since game players can have music from the digital output of their CD-ROM drive mixed in real time with internal sampler voices and WAV file playback. Thus, a design like this can overcome many frustrations when dealing with multiple digital devices.

Emu remain convinced that a small amount of ripple is preferable to the extra noise and other anomalies which would inevitably result from using the analogue path, but acknowledge that it would be preferable to have zero pass-band ripple.
They have already developed DSP code that runs in the 10K1 effect engine and compensates for the ripple, and this will be available in a future driver release for the Emu APS card. The APS also has a fader available for its digital input, but this is designed to give unity gain when raised to its top position, to make it easy to match levels during a digital transfer. The SB Live! has the disadvantage that its software mixer utility doesn't currently have a 'unity gain' point, making it more difficult to record a digital transfer at the same level as the original.

The Best Approach

Apart from the wide range of quality effects these soundcards provide, their main advantage is their ability to mix both analogue and digital signals in real time (see the 'Mix & Match' box). The reason why the Emu 10K1 DSP chip can do this is that each signal is converted to 48kHz on entering the system, and it can then be dealt with just like another voice in a sampler. Emu are very open about the advantages and disadvantages of their fixed-frequency approach, and their web site (www.emu.com/dtm/aps_home.html) contains an informative paper on just this subject. They encourage people to work exclusively at 48kHz (which gives slightly better audio quality anyway), arguing that most DAT tapes, along with DVD and Minidisc, are already at this sample rate, and that the entire [...]

If you choose to record and play back your entire project with a 48kHz sample rate, any 44.1kHz signals imported via the S/PDIF input should only change by a tiny amount. Unfortunately, if you choose to stick with 44.1kHz for your recordings they will be converted to 48kHz as they enter the chip, and then automatically downsampled back to 44.1kHz before being saved on the hard drive, which means a second trip through an SRC. I tried a digital transfer at 44.1kHz using the SB Live!
(with two such passes through an SRC), and after a few tweaks of its recording level fader managed to achieve a reasonably close level match with the original. On replay this certainly sounded very similar to the original file, but I could still reliably identify the original every time (it was more transparent, and there was a slight harshness at the top end of the SB Live! version). This rules out the SB Live! as a stand-alone solution for digital copying in a professional studio, but it is, after all, a low-cost consumer soundcard, and is still excellent value for money considering the huge number of features it provides.

The secret with the Emu chip is to work entirely at 48kHz during a project. Resampling would then only be necessary once, at the end, to convert the master mix down to 44.1kHz before burning an audio CD. In this case audio degradation would not be quite such an issue, especially since the sample rate is being reduced rather than increased. Also, rather than relying on real-time SRC algorithms, the conversion can be done off-line using a high-quality software-based conversion utility. Emu have developed their own 'optimum sample-rate converter' as a stand-alone application for APS customers at no charge. One final point for APS owners is that there is apparently a way to achieve bit-for-bit copies, although I [...]

When it comes to digital manipulation it's even more difficult to obtain details, since in many cases not even technical support personnel have the necessary information. It's fair to say that most users don't need to know such technicalities as where and how internal signals are dithered down from their internal format before emerging from digital or analogue output sockets, but this information can sometimes be valuable when fault-finding, and some sort of functional diagram would greatly help.
Disturbing The Flow

Even when you're using asynchronous transfers, or when the master/slave relationship has been set up correctly for a synchronous transfer, it is still possible to occasionally lose samples when transferring audio data between devices. Despite the error correction provided for the data coming off DAT tapes and CDs, once the signal is being sent down a cable to another digital input socket there is no further error correction involved. The data is simply streamed continuously (along with various subcodes), with no possibility to retry for any bits that disappear en route if you experience a power supply glitch, are using long, low-quality cables, or have an intermittent contact somewhere.

Another potential problem when transferring from a medium such as DAT to a PC soundcard lies in the very clever error-correction methods used by DAT recorders. Paul White and others have argued for some time that error readouts should be provided on all DAT machines, so that it's possible to see when a particular tape is reaching the end of its useful life, or when the heads need cleaning (this tends to be when the number of errors increases). There is certainly clever stuff going on in DAT recordings to help reduce the risk of the data being lost altogether, but anyone who has worked for a long time with a DAT recorder will tell you that you can experience strange noises when playing back some tapes. You may notice small ticks, pops, bursts of 'static' or even, in some cases, complete blanking of the audio output for several seconds if there's severe data corruption.

Checking For Problems

The effects listed above tend to be clearly audible, and most people would repeat a digital transfer if they [...]

When data corruption occurs, it can be in several forms. The most obvious is when sections of the waveform arrive late, leaving a gap of a few samples. These can often be spotted by examining the waveform closely (see Figure 3).
Worse are those forms of corruption that repeat a section of the waveform so that the only visible sign is a sudden instantaneous change in sample value. These kinds of glitches may be obvious to the ear or virtually inaudible, depending on the type of music concerned and where in the waveform the glitches occur. While you might think that 'inaudible' glitches are not worth bothering with, they are still a sign that something is wrong, and the potential exists for your valuable music to become irretrievably corrupted at some point. At the end of a long session you may not notice a few stray clicks, but sooner or later you will, and although there are various utilities that may be able to aid you in 'rebuilding' your waveforms, you will not be popular if you're running a commercial concern and your customers notice a problem before you do. Even digital grabs of tracks from audio CDs can be prone to glitches in some cases (see the 'Papering Over The Cracks' box).

Fortunately, some modern software has functions that can check for such corruption, and whenever you're initially testing a new piece of gear it's wise to check at least the first few transfers for digital integrity. Steinberg's Wavelab, for instance, has a Global Analysis function with a special page for glitch and clip errors. In the case of clipping it scans the data for several consecutive samples at maximum digital value. You can select how many samples, but the default is a sensible four. The glitch-detection routine (like most such analytical algorithms) works by looking for sudden large jumps in sample value. Again, you can set your own Threshold (how big a jump in level is reported as a glitch).

Practical Tests

To give you more confidence in your digital transfers, a function such as Wavelab's File Comparer is invaluable. You simply select any two files and they are compared on a bit-by-bit basis.
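The two checks described above (a run of consecutive full-scale samples, and a jump between neighbouring samples that exceeds a threshold) are simple enough to sketch in Python. This is not Wavelab's algorithm, whose internals are not described here; the function names, the threshold of 8000 and the test signals are all assumptions chosen for illustration, and 16-bit samples are assumed.

```python
import math

FULL_SCALE = (32767, -32768)  # assumed 16-bit limits


def find_clips(samples, min_run=4):
    """Return (start, length) for each run of at least min_run full-scale samples."""
    clips = []
    run_start = None
    for i, value in enumerate(samples):
        if value in FULL_SCALE:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_run:
                clips.append((run_start, i - run_start))
            run_start = None
    # A run may still be open when the data ends
    if run_start is not None and len(samples) - run_start >= min_run:
        clips.append((run_start, len(samples) - run_start))
    return clips


def find_glitches(samples, threshold=8000):
    """Return indices where the value jumps by more than threshold in one sample."""
    return [i for i in range(1, len(samples))
            if abs(samples[i] - samples[i - 1]) > threshold]


# One cycle of a 100Hz tone at 44.1kHz, well below full scale
tone = [int(round(20000 * math.sin(2 * math.pi * 100 * n / 44100)))
        for n in range(441)]

# Imitate a gap: 60 samples go missing from index 200 onwards
damaged = tone[:200] + tone[260:]

# A short clipped passage: five full-scale samples, then a run too short to report
clipped = [0, 32767, 32767, 32767, 32767, 32767, 100, 32767, 32767, 5]

print("glitches in clean tone:", find_glitches(tone))
print("glitches in damaged tone:", find_glitches(damaged))
print("clips:", find_clips(clipped))
```

As with the real thing, the threshold is a compromise: set it too low and loud high-frequency material will be reported as glitches, too high and small discontinuities will slip through.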
A very useful option is that of generating a 'Delta' file of the differences encountered; in the case of glitches, only these errors will appear in the Delta file. This procedure is also useful in determining just what an effects algorithm is doing to your signal. In the case of more subtle digital manipulation, you can also amplify the Delta file by a chosen number of dBs, to hear the difference more clearly. If you don't have Wavelab, you can perform a similar function by inverting the copied file (for instance, using the Invert/Flip process in Sonic Foundry's Sound Forge) and then mixing it with the original. The result will be a Delta file.

I tested the digital transfer capabilities of my Event Gina card using this facility. First I played back a WAV file through its S/PDIF output and recorded it to DAT. Then I re-recorded the audio back through the Gina S/PDIF input. After topping and tailing the transferred file carefully, to ensure that it started at exactly the same sample point as the original, I ran the File Comparer function. This informed me that there were thousands of differences between the two files, although to my ears they sounded identical. I then generated a Delta file, and this proved that the differences were at such a low level as to be virtually inaudible. Generating another with 60dB gain showed some very well-behaved noise [...]

When data files are being extracted from a CD-ROM, each block of data contains a header with sync information and a copy of the block's address, so that the drive can easily find the start and address of each block -- vital for data storage. However, during digital audio extraction the PC has to grab a chunk of data, write this to the hard drive, and then return to exactly the same point in the data stream.
Since the absolute position of the previous grabbed block in the data stream will be slightly uncertain, the PC may not always continue extracting at exactly the same place where it left off in the stream of audio data. If it restarts from a point a few bytes earlier in the stream, part of the waveform will be repeated; a few bytes later and there will be a gap in the waveform. Either way there will be a sudden discontinuity in the grabbed waveform, which will result in an audible click on playback.

One solution used by several software packages is to grab larger overlapping sections of audio. These sections are then moved backwards or forwards until the overlapping sections match exactly and are in sync. This technique is used by Adaptec's Easy CD Creator and Cequadrat's WinOnCD, amongst others, and is variously called 're-sync'ing' or 'jitter correction'. It is always best, when first using a new CD-ROM or CD-R drive for DAE, to determine the maximum reliable extraction rate for your particular drive. Some drives may need jitter correction switched on, and although this will slow down the rate of extraction it's better than having glitches in your audio. Other drives may benefit from having their extraction rates 'capped' at a certain speed, rather than attempting to extract audio at the maximum speed available. You can read more about this subject in my 'Taking the Bits' feature in SOS August '98.

I had already proved that the Gina S/PDIF output provided bit-for-bit transfers when testing the Opcode DATport, and this suggested that the dithering was happening on the input side. A possible explanation for this is that the 16-bit digital signal was being padded out with zeroes to an internal 24-bit signal, but that when the signal was saved as 16-bit to hard disk it was dithered down to 16-bit.
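The explanation suggested above can be imitated in a few lines of Python. This is a sketch under assumptions rather than a description of the Gina driver: it assumes triangular dither of plus or minus one 16-bit step, a fixed random seed so that the result is repeatable, and hypothetical function names.

```python
import random


def pad_16_to_24(sample):
    """Pad a 16-bit sample with eight zero bits to make a 24-bit word."""
    return sample << 8


def dither_24_to_16(sample_24, rng):
    """Add triangular dither of +/- one 16-bit step, then round to 16 bits."""
    noise = (rng.random() - rng.random()) * 256  # 256 units at 24 bits = one 16-bit step
    value = int(round((sample_24 + noise) / 256))
    return max(-32768, min(32767, value))  # stay within the 16-bit range


rng = random.Random(1999)  # arbitrary fixed seed
original = [rng.randint(-20000, 20000) for _ in range(1000)]

# A 16-bit signal padded to 24 bits and then dithered back down on saving
copy = [dither_24_to_16(pad_16_to_24(s), rng) for s in original]

changed = sum(1 for a, b in zip(original, copy) if a != b)
largest_error = max(abs(a - b) for a, b in zip(original, copy))

print("samples changed:", changed, "of", len(original))
print("largest error in 16-bit steps:", largest_error)
```

In this model a proportion of the samples no longer match the original, yet no sample is ever more than one least-significant bit out, which would be consistent with a file comparison reporting thousands of differences that are nonetheless virtually inaudible.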
This sort of dithering is generally advantageous when dealing with 24-bit signals, since with most dithering algorithms it is possible to achieve a couple of extra bits' worth of resolution above that of a standard 16-bit file. I would personally not be worried about this happening during digital transfers unless I expected to bounce backwards and forwards several times -- it's not generally recommended to add dither more than once to a signal.

I then tried the same digital input test with the Gina clock deliberately set to Internal, so that the copy would not be clock-locked. It's easy to fool yourself into thinking that there is still no audible difference, and with much material you may not notice any change, even when zooming in and comparing the waveforms of original and copy more closely. However, one sure test is to use a high-frequency tone (or a slowly decaying piano note), since any skipped samples will be immediately obvious in such a simple waveform. Sure enough, when I tried a 1kHz sine-wave I could hear a regular 'tick' several times a second (see Figure 5), and this time the Delta file needed no added gain to allow me to hear the difference between the two. With music files, although I had adjusted the tracks to start at the same sample, they immediately started to drift out of sync.

This brings us neatly back to a perennial problem when musicians attempt to run several soundcards side by side. If you are, for instance, using several soundcards to replay long tracks, each controlled by an application such as Cakewalk Pro Audio, Cubase VST, or Logic Audio, they will only stay in sync if they are all locked to a single clock. You can do this either by slaving them all to a single high-quality Master clock (as mentioned earlier), or by using the first to provide the Master clock and then chaining the S/PDIF output of the first to the S/PDIF input of the second, and so on.
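The unlocked-clock test described in this section can be imitated in software. The Python sketch below builds one second of a 1kHz tone, drops a single sample to stand in for a clock slip, and then forms a Delta signal by inverting the copy and mixing it with the original. Everything specific here is an assumption made for illustration: the function names, the tone level, the position of the dropped sample, and the 100 parts-per-million clock mismatch used in the last line.

```python
import math

RATE = 44100  # samples per second


def delta(original, copy):
    """Invert the copy and mix it with the original, sample by sample."""
    return [a - b for a, b in zip(original, copy)]


def samples_slipped_per_second(sample_rate, clock_offset_ppm):
    """Samples gained or lost each second when two clocks differ by the given ppm."""
    return sample_rate * clock_offset_ppm / 1_000_000


# One second of a 1kHz sine-wave
tone = [int(round(20000 * math.sin(2 * math.pi * 1000 * n / RATE)))
        for n in range(RATE)]

# Imitate one slip: the sample at index 10000 never arrives
copy = tone[:10000] + tone[10001:]

diff = delta(tone, copy)
first_difference = next(i for i, d in enumerate(diff) if d != 0)
peak_difference = max(abs(d) for d in diff)

print("copy matches the original up to sample", first_difference)
print("peak level of the Delta signal:", peak_difference)
print("slips per second at 100ppm:", samples_slipped_per_second(RATE, 100))
```

The Delta signal is silent up to the slip and then sits only some way below the level of the tone itself, so no added gain is needed to hear it. In a real unlocked transfer the slips keep accumulating, so the copy drifts further from the original as time goes on.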
Digital Dilemmas

If you buy a soundcard with digital I/O specifically to transfer your DAT tapes to your PC you will probably be annoyed if they are changed in any way en route. However, semi-pro musicians are only a tiny part of the consumer market, and their needs are unlikely to be very high on the agenda when low-cost consumer soundcards are designed. If you buy a professional soundcard with digital I/O, the manufacturer knows what you're going to be doing with it, and bit-for-bit copying facilities are likely to be included. Despite reviewing a current total of 27 soundcards, I have yet to find one at less than £200 that provides these facilities. When I do, I'll let you know!