QSound Labs: Right On Q

Interview | Manufacturer
By Paul White
Published November 1995

Canadian company QSound Labs were among the third‑party TDM developers present at Digiworld, the recent Digidesign‑hosted event in London. Paul White chatted to QSound's head of technical support, Scott Willing, about the latest developments in their 3D processing technology.

Over the past few years, it has become technically feasible to build sound processing systems capable of extending the stereo soundstage beyond the confines of the two loudspeakers used to reproduce conventional stereo. Although this has always been possible using multi‑speaker setups such as those used in cinemas, there's obviously a great attraction in being able to recreate wide spatial effects through any normal stereo speaker system. Although a number of manufacturers are working in this field, the two most prominent systems are Roland's RSS and QSound, manufactured by Canadian company QSound Labs.

At the recent Digiworld event, hosted by Digidesign and attended by many of their third‑party developers, I was fortunate enough to have the opportunity to discuss the QSound approach to spatial processing with Scott Willing, who is in charge of technical support for the company's software development group. Scott began by telling me something about the QSound method of 3D sound creation.

Dummy Head Theory And The QSound Approach

"The classic research into spatial hearing is based on what I refer to as the dummy head theory. Basically, you make recordings using an acoustic replica of a real head and then analyse these recordings to see what actually happens to sound received at the ears. As you say [see the '3D from Stereo' box], if sound is located to one side of the head, there are time arrival differences at the two ears. There's also something called the head‑related transfer function, which encompasses a couple of concepts, including the fact that the ears receive differently filtered versions of the sound. If you base a localisation synthesis technology on these principles, you can certainly get somewhere, but it's primarily only useful for binaural applications where you're listening over headphones. That's because you are doing an analysis of what happens at the ears, so you have to take those signals and reproduce them back at the ears."
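
To make the dummy-head idea concrete, here is a minimal sketch of binaural synthesis from first principles: a mono source is given an interaural time difference via Woodworth's approximation, and the far ear receives an attenuated, low-pass-filtered copy standing in for head shadowing. The head radius, attenuation, and filter coefficient are textbook-style assumptions, not QSound data, and real systems use measured HRTFs rather than this crude stand-in.

```python
import numpy as np

def binaural_sketch(mono, sr, azimuth_deg, head_radius=0.0875, c=343.0):
    """Crude binaural rendering from dummy-head principles: an interaural
    time difference (Woodworth's approximation) plus a one-pole low-pass
    standing in for head shadowing at the far ear. Illustrative only."""
    theta = np.radians(abs(azimuth_deg))              # 0 = front, 90 = full side
    itd = (head_radius / c) * (theta + np.sin(theta))  # Woodworth approximation
    delay = int(round(itd * sr))                      # ITD in whole samples

    # Far ear: delayed, attenuated, and smoothed (head shadow).
    far = np.concatenate([np.zeros(delay), mono])[:len(mono)] * 0.7
    shadowed = np.empty_like(far)
    acc, alpha = 0.0, 0.3                             # assumed filter coefficient
    for i, x in enumerate(far):                       # y[n] = y[n-1] + a*(x[n] - y[n-1])
        acc += alpha * (x - acc)
        shadowed[i] = acc

    near = mono
    return (shadowed, near) if azimuth_deg >= 0 else (near, shadowed)

sr = 44100
t = np.arange(sr) / sr
burst = (np.sin(2 * np.pi * 440 * t) * (t % 0.25 < 0.01)).astype(np.float32)
left, right = binaural_sketch(burst, sr, azimuth_deg=60)   # source to the right
```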

In a loudspeaker‑based system, both ears hear sound from both speakers, which is obviously quite unlike the headphone situation where there is very little crosstalk between channels. How can you compensate for this?

"One of the approaches has been to take binaural synthesis, and then attempt to translate this to loudspeaker listening by using crosstalk cancellation. This attempts to cancel the left channel information as it arrives at the right ear and vice versa, but this approach is fraught with side‑effects, particularly from a high‑fidelity standpoint. You can certainly achieve out‑of‑speaker localisation using crosstalk cancellation, but in essence, you're adding more out‑of‑phase components, which tends to be a bit brutal in terms of the timbral variation that takes place.
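
For readers wondering what crosstalk cancellation actually involves, the sketch below captures the general scheme Willing is criticising, though not any particular commercial implementation. The speaker-to-ear paths are modelled as a 2x2 matrix of transfer functions and inverted per frequency bin, with a regularisation term to limit the anti-phase boost; the path models themselves are toy assumptions.

```python
import numpy as np

n = 1024                                   # FFT length for the filter design
freqs = np.fft.rfftfreq(n, d=1 / 44100.0)  # analysis frequencies in Hz

# Toy speaker-to-ear transfer functions (assumed, not measured): the
# ipsilateral path is flat, while the contralateral path is attenuated
# and delayed by the extra travel around the head.
extra_delay = 0.00025                      # assumed 0.25 ms contralateral lag
H_ii = np.ones_like(freqs, dtype=complex)
H_ij = 0.8 * np.exp(-2j * np.pi * freqs * extra_delay)

beta = 0.01                                # regularisation limits inverse gain
C = np.empty((len(freqs), 2, 2), dtype=complex)
for k in range(len(freqs)):
    H = np.array([[H_ii[k], H_ij[k]],
                  [H_ij[k], H_ii[k]]])     # 2x2 path matrix at this bin
    # Regularised inverse: (H^H H + beta I)^-1 H^H
    C[k] = np.linalg.inv(H.conj().T @ H + beta * np.eye(2)) @ H.conj().T

# Feeding a binaural pair through C before the speakers approximately
# cancels each channel's leakage into the opposite ear, at the cost of
# the extra anti-phase energy the interview mentions.
```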

"Some scientists have been upset by the empirical nature of QSound's research, but we feel we used a better model — we included the human brain and hearing system as part of the test loop. The QSound game plan was to create localisation for human beings listening to two speakers, so we actually put human beings in front of speakers, took the basics of what was known about localisation, and then tested people with various sine burst and broadband signals to see how they actually perceived things. Over the course of the oft‑quoted half a million listening tests, a huge database was compiled that documented the subjective results of various input signals, phase relationships and the like on human listeners.

"One of the things that sets QSound apart is the enormous amount of work that went into the process of boiling down this huge database. A couple of people at Q have developed substantial skills — I would think unique in the world — in refinement and optimisation. We're talking about professionals with well‑trained ears, listening and tweaking in a lengthy and mind‑numbing interactive search for perfection. I'm not sure why they're still sane, but at the end of the day, I feel QSound does a better job of keeping all the frequency components of a localised signal in the same place. The benefits for high‑fidelity reproduction are obvious, but it also means that band‑limiting has little effect on the perceived placement. This is good news for guys working in multimedia — band‑limiting happens a lot there due to the low sample rates and cheap speakers used."

Were there any surprises in that some parameters were more important than you thought they might be?

"QSound is ultimately a filtering algorithm, and I find it surprising that the amplitude component is so important, as compared to just phase and delay. Rather than take the dummy head model and apply crosstalk cancellation, we used our database of processing versus perceived effect and created a system that put that theory into practice.
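
QSound's actual filter design is proprietary, but for comparison, the amplitude-only baseline is the familiar constant-power pan law sketched below; everything a localisation algorithm adds beyond gains like these (interaural delay, frequency-dependent filtering) is where out-of-speaker placement comes from.

```python
import numpy as np

def constant_power_pan(mono, pan):
    """Amplitude-only placement via the constant-power pan law.
    pan: -1.0 = hard left, 0.0 = centre, +1.0 = hard right."""
    angle = (pan + 1.0) * np.pi / 4.0      # map [-1, 1] onto [0, pi/2]
    return mono * np.cos(angle), mono * np.sin(angle)

signal = np.random.randn(44100).astype(np.float32)
left, right = constant_power_pan(signal, pan=0.5)   # halfway to the right
```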

"We're now optimising our algorithms for different horsepower platforms, so that we can provide integrated software solutions for people working in multimedia. We can also optimise or reach an effective compromise for various geometries — if somebody makes a little toy with speakers just eight inches apart, we can still get 180‑degree sound placement."
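
A little geometry (with an assumed listening distance, since none is quoted) shows how large a leap that claim is: speakers eight inches apart subtend only around 23 degrees at half a metre, which the processing must stretch to a 180-degree image.

```python
import math

separation = 8 * 0.0254   # eight inches, in metres
distance = 0.5            # assumed listening distance of half a metre
span = 2 * math.degrees(math.atan((separation / 2) / distance))
print(f"Physical speaker span: {span:.0f} degrees")   # roughly 23 degrees
```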

That Rising Feeling: Elevation And Rear Placement

You're still manipulating time, amplitude, and frequency response — so is the main difference between your system and something like Roland's RSS system simply that they use crosstalk cancellation and you don't?

"Essentially, yes, although we like to think that our techniques provide the most natural‑sounding localisation algorithms available. We also separate ourselves from some of our competitors in the sense that we don't claim to do things that we can't achieve reliably."

Does this include things like putting sounds behind the listener?

"Yes. There's a lot of psychology involved in both elevation and rear‑placement. Take sounds in motion, for example — if you do a pan from one side to the other and make it continuous, there are lots of people who will get the impression that the sound went all the way around, especially if you dip the level every other pass. There are tricks you can do to extend the usefulness of our basic technology, but we don't have a knob that says 'Rear' on it. Similarly, we don't have any elevation algorithms in any of our currently released products, because elevation is actually easier to achieve for headphone listening. We do have headphone algorithms that will be finding their way into commercial products in the near future. The bottom line with elevation algorithms is that just about everything I've heard that attempts to recreate elevation effects sounds filtered. In the real world, you're not conscious of elevated sounds being different.
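
The travelling-pan trick he describes is easy to sketch as automation curves. The shapes below are guesses at the general idea rather than anything from a QSound product: the pan sweeps continuously from side to side while the level dips on alternate passes, suggesting the sound travelled around behind the listener.

```python
import numpy as np

sr = 44100
t = np.arange(int(sr * 8.0)) / sr        # eight seconds: two full 'orbits'

orbit_phase = (2 * np.pi * t / 4.0) % (2 * np.pi)   # one orbit per 4 s
pan = np.sin(orbit_phase)                # continuous side-to-side sweep
behind = orbit_phase > np.pi             # the 'return' pass of each orbit
level = np.where(behind, 0.5, 1.0)       # assumed 6 dB dip on alternate passes

mono = np.sin(2 * np.pi * 220 * t).astype(np.float32)
angle = (pan + 1.0) * np.pi / 4.0        # constant-power pan law
left = mono * level * np.cos(angle)
right = mono * level * np.sin(angle)
```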

"There are good arguments that if you provide a consistent cue in an interactive environment such as a game, and link the cue to a visual effect, the listener eventually comes to perceive the cue as representing elevation, even if it's initially perceived as filtering. There's an awful lot of psychology involved. For example, if you produce helicopter sounds, or angelic vocals with some nice reverb on them, these things tend to be perceived as being higher up. There's a thunder roll on Roger Waters' Amused To Death album, and when I'm sitting in my living room, it crawls across the ceiling as far as I'm concerned."

What are the problems in putting a sound behind the listener? Is it because we use unconscious head movements to localise sounds coming from behind? If this isn't the case, it's difficult to understand how we differentiate between a sound that's directly in front and one that's directly behind, because in both instances, the sound arrives at both ears at the same time.

"For front‑to‑back differentiation, humans do tend to rely on moving the head after the sound is initially heard. It's interesting that with headphones, it's actually fairly easy to do rear placement, but very difficult to convincingly put something out in front. It certainly helps if the sound source is in motion when using speakers, and we've had experience in our research of being able to place limited ranges of frequencies directly behind, but it's not something we've been able to refine to the point where we'd feel comfortable putting it into a commercial product.

"One of the things that fascinates me about QSound is that if you place a sound outside the normal stereo field, you can turn your head and look at it: you'd expect that as soon as you moved your head the image would collapse, but in fact, it's quite stable."

Development And Marketing

You started off with a very secretive piece of hardware which could only be rented, rather like Aphex did with their Aural Exciter, but now you have low‑cost, software‑based implementations of your systems that almost anyone can use. How has your marketing game plan evolved since the first working system?

"The hardware box has advanced from the early days — we referred to the two originals as the refrigerator set. Now the hardware is a lot smaller, but it has fundamentally the same architecture based around a PC fitted with our own DSP cards. That has full automation with joystick control, and can be configured as eight channels automated via SMPTE, MTC, or direct MIDI control. People perceive the degree of stereo placement differently, but for most, we get around 180 degrees of positioning. I prefer to use the phrase 'well beyond the normal speaker field'.

"With software plug‑ins such as we have for Digidesign's Pro Tools hardware, the goal is to have QSound migrate to as many appropriate platforms as possible, and to make it far more accessible economically. I'd like to be able to offer a 2000‑dollar, 2U hardware‑based QSound system, but at the moment, the best way we see of making the technology available is through software plug‑ins.

"QSys TDM is a 4‑channel version of the Q System. The initial release lacks just one thing the Q System offers, and that's automation. Digidesign is not currently supporting automation within plug‑ins, and apparently it's going to require a rewrite of their automation system, but they have plans to introduce it some time next year. Depending on how many hoops you want to jump through, there are workarounds for automation. The simplest is to take a stereo output from the TDM module and print it to a couple of tape tracks — but then that's nobody's idea of true automation."

Will the software systems really allow us to do the same thing as their hardware counterparts?

"The algorithms used in our plug‑ins are exactly the same as in our hardware systems. We have done some serious optimisation for very low‑power systems such as budget PC soundcards, but the plug‑ins for use with Digidesign are mathematically identical to the hardware. What's more, because software can be upgraded far more easily than hardware, the software systems have some new features that haven't found their way into the hardware boxes just yet.

"Although we are building more hardware systems, we see the future as being mainly in software, and we have still to develop a system for a serious PC platform. It's hard to determine what sales figures for the various platforms are, but we're looking to support the most popular platforms first. Ultimately, we'd like QSound to be available to everyone, regardless of their operating environment."

I guess the games and multimedia markets have given you far greater potential for expansion than the mainstream music market?

"Yes, the multimedia market far exceeds the music business, but one interesting thing is that as multimedia developers become more sophisticated, they are starting to use the same sort of tools that we associate with professional audio. For example, a number of visitors to Digiworld have been multimedia developers — and this is ostensibly a pro audio show! There's a lot of overlap, and the developments we do in one area usually find their way into the others."

3D From Stereo: The Theory

When you first encounter a 3D system like RSS or QSound, it's difficult to comprehend how such a wide soundstage can be produced from just two speakers, until you realise that we perceive a complete 360-degree soundstage in real life using just two ears. The mechanics of stereo hearing have been discussed in SOS on many occasions (most recently in the article on 3D Mixing in November '94's SOS), but it's worth briefly recapping how our ears and brains extract directional information from the sounds we hear.

As sound travels at a finite velocity, anything arriving predominantly from one side of the listener will reach one ear before the other. The head acts as an obstruction between the sound source and the farther ear, so when the sound reaches this ear, it is lower in amplitude and spectrally altered. As the sound source moves, the interaural delays, spectral filtering, and amplitude differences change, so in theory, if you can reproduce these auditory cues electronically, you can create the same 3D effects over an ordinary pair of stereo speakers.
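
For a sense of scale (using textbook figures, not QSound measurements), the interaural delays involved are well under a millisecond even for sounds hard to one side:

```python
import math

head_radius = 0.0875   # typical adult head radius in metres (assumed)
c = 343.0              # speed of sound in air, m/s

# Woodworth's approximation for a source at a given azimuth (0 = front):
for azimuth_deg in (15, 45, 90):
    theta = math.radians(azimuth_deg)
    itd_ms = 1000 * (head_radius / c) * (theta + math.sin(theta))
    print(f"{azimuth_deg:3d} degrees -> ITD about {itd_ms:.2f} ms")
# 90 degrees gives roughly 0.66 ms, the largest delay the ears ever see.
```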

The Low‑End Theory: QXpander

You were saying that you had to come up with different versions of your algorithms to make use of the available computing power. Does that mean that your low‑cost QXpander software is somehow less powerful than your top‑end systems, or do you use the same algorithms?

"The underlying placement algorithms are identical, but it's important to make the distinction between the optimisations we do to make QSound work on very low horsepower platforms like PCs with soundcards, and the differences between Q1 [the mono‑in, stereo‑out processor] and QXpander [stereo‑in/stereo‑out], which is more of a signal processing and architecture difference. QXpander is really for processing stereo mixes — it enables you to get some QSound enhancement into a mix where the individual tracks are not available for processing. In that sense, it is more of a mastering tool.

"However, a lot of people are using QXpander for processing stereo submixes, which are then added back to a multitrack mix. The main difference is that QXpander contains a pre‑processor, which helps to keep the mono information from being destroyed by stereo expansion. One of the drawbacks to most stereo expansion algorithms is that you end up with a softening of the centre‑stage sound, and maybe a loss of low end. QXpander already addresses those things by including a separate control for the centre signal — when fully up, this keeps all the mono components from being processed. There's also a low‑frequency parametric EQ to compensate for any bass loss suffered when using lower settings of the centre control, and a dynamic compensator which effectively adjusts the centre parameter in real time, allowing you to apply more processing on those parts of the track where there is little or no critical centre information.
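
QXpander's internals aren't published, but the centre control described here behaves much like classic mid/side processing. The sketch below is a generic stand-in along those lines, with an adjustable degree of protection for the mono (mid) component; the parameter names and gain law are assumptions, not QXpander's.

```python
import numpy as np

def widen_with_centre_control(left, right, width=1.6, centre=1.0):
    """Generic mid/side width expansion with a centre-protection control.
    width:  side-channel gain; values above 1.0 widen the image.
    centre: 1.0 keeps the mono (mid) component untouched; lower values
            let the expansion soften the centre, as the text describes."""
    mid = 0.5 * (left + right)               # mono (centre) component
    side = 0.5 * (left - right)              # stereo-difference component
    mid_gain = centre + (1.0 - centre) / width
    out_mid, out_side = mid * mid_gain, side * width
    return out_mid + out_side, out_mid - out_side   # back to left/right

l = np.random.randn(44100).astype(np.float32)
r = np.random.randn(44100).astype(np.float32)
wide_l, wide_r = widen_with_centre_control(l, r, width=1.6, centre=1.0)
```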

"Another feature, designed to further improve the mono compatibility of the system, will soon be included in all our products, including QXpander 2.0, which is out in a couple of months. Quite simply, it's a high‑pass filter available before the QSound process: if you exclude some of the low end from the placement algorithm, you still get quite effective localisation while retaining very good mono compatibility. We give the user the ability to adjust this to suit the material being worked on."
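
That pre-process filter is easy to picture as a band split: only the highs pass through the placement stage, while the excluded lows feed both channels equally and so stay mono-safe. Here is a minimal sketch under those assumptions, with a plain constant-power pan standing in for the proprietary placement algorithm and an assumed default cutoff:

```python
import numpy as np
from scipy.signal import butter, lfilter

def place_highs_only(mono, sr, cutoff_hz=120.0, pan=0.7):
    """Band-split placement for mono compatibility: only the highs are
    localised (here with a plain constant-power pan standing in for the
    proprietary placement stage); the lows feed both channels equally."""
    b_hp, a_hp = butter(2, cutoff_hz / (sr / 2), btype="highpass")
    b_lp, a_lp = butter(2, cutoff_hz / (sr / 2), btype="lowpass")
    highs = lfilter(b_hp, a_hp, mono)
    lows = lfilter(b_lp, a_lp, mono)

    angle = (pan + 1.0) * np.pi / 4.0      # place the high band only
    return lows + highs * np.cos(angle), lows + highs * np.sin(angle)

sr = 44100
source = np.random.randn(sr).astype(np.float32)
left, right = place_highs_only(source, sr, cutoff_hz=120.0, pan=0.7)
```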

Does this work because most of the spatial 'glitter' in a track happens at mid and high frequencies?

"That's true, although if you're not concerned about mono compatibility, QSound is capable of placing low‑frequency sounds down to about 100Hz. You can paste a kick drum to the wall quite convincingly. On Julian Lennon's last album, Help Yourself, there's a track called 'Would You' which has an 808 kick drum right out to one side."

QSound In Use

"QSound works exceptionally well for spreading out reverb without upsetting the rest of the mix — it can really put the room ambience out in the room. I still really enjoy the Sting Soul Cages album, specifically for the use of QSound on reverb and delay, whereas the Madonna Immaculate Collection album uses it to spread incidental percussion. The system was also used on the recent Pink Floyd Pulse album to create a really big feel for a live record. Crowd sounds aren't often a large part of live albums in terms of technicalities, but on this album, you actually feel like you're sitting in the crowd. James Guthrie, who mixed that album, is really an expert at that sort of thing.

"Roger Waters has taken the science of using QSound for sound effects to a high art, and Amused To Death has all kinds of things like TVs off to one side and dripping taps. It's definitely an album to check out if you want to see what our system is capable of. There's a phone that rings on that record — it doesn't even sound like my phone, but I get up to answer it every time!"