Audibility and Musical Understanding of Phase Distortion
Professor David Wessel
What is Phase Distortion?
Any signal varying in amplitude with time can be understood as a sum of component sinusoids of different frequencies (cf. the Fourier transform). If one of these frequency components is delayed in time relative to the others, the waveform changes shape even though its frequency content does not, and we say the original signal has phase distortion.
(figure 1) Original waveform ====>
(figure 2) Waveform after FT
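As a rough illustration of the definition above, the sketch below builds a two-component tone (a fundamental plus a second harmonic at half amplitude) and shifts only the harmonic's phase. The amplitudes, the 90 degree shift, and the function names are assumptions chosen for the example, not values from the figures. The component amplitudes stay the same while the waveform's peak changes.

```python
import cmath
import math

def two_tone(phase_deg, n=400, cycles=2):
    """Fundamental plus half-amplitude second harmonic; only the harmonic is phase shifted."""
    ph = math.radians(phase_deg)
    return [
        math.sin(2 * math.pi * cycles * i / n)
        + 0.5 * math.sin(2 * math.pi * 2 * cycles * i / n + ph)
        for i in range(n)
    ]

def dft_amplitude(x, k):
    """Amplitude of DFT bin k, scaled so a unit-amplitude sine reads 1.0."""
    n = len(x)
    s = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
    return 2 * abs(s) / n

reference = two_tone(0)   # harmonic in its original phase
shifted = two_tone(90)    # harmonic shifted by 90 degrees (assumed value)

for name, wave in (("reference", reference), ("shifted", shifted)):
    # Bins 2 and 4 hold the fundamental and the second harmonic.
    print(name,
          "peak:", round(max(wave), 3),
          "amplitudes:", round(dft_amplitude(wave, 2), 3), round(dft_amplitude(wave, 4), 3))
```

Both versions report the same two component amplitudes, yet their peaks differ, which is the sense in which phase distortion alters the waveform without altering the spectrum's magnitude.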
Phase Distortion in Loudspeakers
Loudspeakers using multiple drivers typically limit the signal entering each driver to certain frequency bands using filters. Tweeters (drive units 1.5 inch or smaller) are given mostly high frequencies, woofers (typically 5 inches or larger up to 18 or more inches) are given low frequencies, and midranges (drive units larger than tweeters) are given the remainder, middle frequencies.
Electrically, filtering is accomplished with crossover networks. There are several common filter alignments: Bessel, Butterworth, and Linkwitz-Riley. The slope of a crossover filter is N times 6 dB per octave, where N, a positive integer, is the order of the filter (e.g. a 6 dB per octave filter is a 1st order crossover; 12 dB/oct. is 2nd order, etc.). All crossover filters alter the phase in some manner, such that, for example, a common 2nd order Butterworth filter will have its tweeter playing 180 degrees out of phase with the woofer.
Step Response in Loudspeakers
To measure the phase performance of a loudspeaker, we look at step response instead of impulse response for cleaner data. A step function would be a voltage which instantaneously rises from zero to a positive value and stays there (figure 3). The following diagrams use MLSSA (Maximum-Length Sequence System Analyzer), which excites the loudspeaker with a pseudo-random maximum-length sequence and averages across a number of samples.
(figure 3) Perfect Step Response
A good phase-accurate (aka transient perfect) loudspeaker produces a reasonable facsimile of the ideal step (figure 4).
(figure 4) Good Loudspeaker Step Response
A typical phase-inaccurate loudspeaker such as the 2nd order Butterworth mentioned above would look like figure 5.
(figure 5) Typical Loudspeaker Step Response
This typical phase-inaccurate step response is a measurement of the summed tweeter and woofer output. We see the tweeter initially rising with the input voltage (figure 6), but the woofer, lagging by 180 degrees, contributes the rapid fall immediately afterwards (figure 7), resulting in a totally mangled step response.
(figure 6) Tweeter Step Response from figure 5
(figure 7) Woofer Step Response from figure 5
Measuring Loudspeakers, Part Two
John Atkinson, December 1998
from preprint, "Loudspeakers: What Measurements Can Tell Us-And What They Can't Tell Us!,"
AES Preprint 4608
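The split of a step into a fast tweeter part and a slow woofer part can be sketched numerically. The example below is only an idealized model, not a reconstruction of the measured speakers: it assumes a 2 kHz crossover, a 48 kHz sample rate, and digital biquad approximations of 2nd order Butterworth filters (the widely used "Audio EQ Cookbook" formulas), with both drivers wired in the same polarity.

```python
import math

def butterworth2_biquads(fc, fs):
    """Return (lowpass, highpass) biquads as (b0, b1, b2, a1, a2), normalized by a0."""
    w0 = 2 * math.pi * fc / fs
    cos_w0, sin_w0 = math.cos(w0), math.sin(w0)
    q = 1 / math.sqrt(2)                 # Q of 0.7071 gives a Butterworth response
    alpha = sin_w0 / (2 * q)
    a0, a1, a2 = 1 + alpha, -2 * cos_w0, 1 - alpha

    def normalize(b):
        return tuple(v / a0 for v in b) + (a1 / a0, a2 / a0)

    lowpass = ((1 - cos_w0) / 2, 1 - cos_w0, (1 - cos_w0) / 2)
    highpass = ((1 + cos_w0) / 2, -(1 + cos_w0), (1 + cos_w0) / 2)
    return normalize(lowpass), normalize(highpass)

def step_response(coeffs, n):
    """Feed a unit step through one biquad and return the first n output samples."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for _ in range(n):
        x = 1.0                          # step input: 1.0 from sample 0 onward
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

# Assumed values: 2 kHz crossover, 48 kHz sample rate, 400 samples of output.
lp_coeffs, hp_coeffs = butterworth2_biquads(2000.0, 48000.0)
lp = step_response(lp_coeffs, 400)       # woofer branch: slow rise toward 1
hp = step_response(hp_coeffs, 400)       # tweeter branch: immediate spike, then decay
summed = [a + b for a, b in zip(lp, hp)]

print("first sample  tweeter:", round(hp[0], 3), "woofer:", round(lp[0], 3))
print("lowest point of the summed step:", round(min(summed), 3))
```

In this model the tweeter branch jumps almost to full height on the first sample while the woofer branch has barely moved, and the sum dips well below the step level before recovering, so the step is not reproduced as a clean edge.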
Is Phase Distortion Audible?
Phase is easily audible as it is being changed, such as with a component frequency in a complex waveform. Experiments with steady-state complex waveforms indicate no difference in timbre if the phase of a component waveform is altered.
Ohm's Phase Law in the 1800s claimed that the phase of a waveform has no effect on how the ear perceives it, and in typical circumstances, relative phase differences are difficult to perceive. One theory suggests that phase is altered by ordinary reverberant environments, being position-dependent, so our ears don't pay much attention to phase.
Despite these early beliefs, studies have since demonstrated that phase distortion is audible, however subtle and specific to certain circumstances. Even so, many people claim that previous research shows phase distortion is not audible. They have simply not read the more current research. Many loudspeaker designers are guilty in this regard. My assertion is:
Depends - Yes, phase distortion is audible under the right circumstances to certain people, but the consensus is that it is rather subtle at best, especially in relation to other forms of distortion. Phase is chaotic in reverberant environments, yes, but the direct sound is not affected by reverberation. In some situations, such as choral music heard in a cathedral from the back of the audience, phase is totally messed up, but in most other cases it still matters!
How Audible Is Phase Distortion?
The most comprehensive review of phase distortion research I've come across is by Daisuke Koya, "Aural Phase Distortion Detection", University of Miami, May 2000 (Master's thesis):
"Although not in large numbers, previous research in investigation of the audibility of phase distortion has proven that it is an audible phenomenon. Lipshitz et al.  has shown that on suitably chosen signals, even small midrange phase distortion can be clearly audible. Mathes and Miller  and Craig and Jeffress  showed that a simple two-component tone, consisting of a fundamental and second harmonic, changed in timbre as the phase of the second harmonic was varied relative to the fundamental. The above experiment was replicated by Lipshitz et al., with summed 200 and 400 Hz frequencies, presented double blind via loudspeakers resulting in a 100% accuracy score. An experiment involving polarity inversion of both loudspeaker channels resulted in an audibility confidence rating in excess of 99% with the two-component tone, although the effect was very subtle on music and speech. Cabot et al.  tested the audibility of phase shifts in two component octave complexes with fundamental and third-harmonic signals via headphones. The experiment demonstrated that phase shifts of harmonic complexes were detectable.
Another very simple experiment conducted by Lipshitz et al. was to demonstrate that the inner ear responds asymmetrically. Reversing the polarity of only one channel of a pair of headphones markedly produces an audible and oppressive effect on both monaural and stereophonic material. This effect predominantly affects frequency components below 1 kHz. Because reversal of polarity does not introduce dispersive or time-delay effects into the signal, but merely reverses compressions into rarefactions and vice versa, these audible effects are due only to the constant 180° phase shift that polarity reversal brings about. Since interaural cross-correlations do not occur before the olivary complexes to which the acoustic nerve bundles connect, it must be concluded that what is changed is the acoustic nerve output from the cochlea due to polarity reversal. This change owes to two factors: cochlear response to the opposite polarity half of the waveform, and the waveform having a shifted time relationship relative to the signal heard by the other ear. This reaffirms the half-wave rectifying nature of the inner ear."
 S. P. Lipshitz, M. Pocock, and J. Vanderkooy, "On the Audibility of Midrange Phase Distortion in Audio Systems," J. Audio Eng. Soc., vol. 30, pp. 580-595 (1982 Sep.).
 R. C. Mathes and R. L. Miller, "Phase Effects in Monaural Perception," J. Acoust. Soc. Am., vol. 19, pp. 780-797 (1947).
 J. H. Craig and L. A. Jeffress, "The Effect of Phase on the Quality of a Two-Component Tone," J. Acoust. Soc. Am., vol. 34, pp. 1752-1760 (1962).
 R. C. Cabot, M. G. Mino, D. A. Dorans, I. S. Tackel, and H. E. Breed, "Detection of Phase Shifts in Harmonically Related Tones," J. Audio Eng. Soc., vol. 24, pp. 568-571 (1976 Sep.).
 J. R. Ashley, "Group and Phase Delay Requirements for Loudspeaker Systems," Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (Denver, CO, 1980 Apr. 9-11), vol. 3, pp. 1030-1033 (1980).
Why So Much Disagreement?
Mainly, phase distortion is difficult to test and to prove. On the implementation side, phase-linear (accurate) source material and reproduction equipment are necessary:
1) minimalist mic'ing required - no multi-mic'ing or studio mic'ing
2) no mixing board analog equalization, because analog equalization filters alter phase!
3) preferably no reverb or additional post-processing
4) phase linear amplification required (any good quality amplifier)
5) phase linear speakers, e.g. crossover-less single driver speakers like Quad electrostats, or Stax electrostatic headphones
Double-blind ABX testing methodology
In the ABX test, a number of trials are given where one is to discern between two conditions, A and B. The participant knows what A is and what B is, but is given an X which is either A or B. An "ABX comparator" is a black box that administers the switching across trials. In a double-blind test, neither the administrator of the test nor the participant knows anything about the conditions, eliminating transferable bias.
Caveats with ABX
The odds are indeed stacked against the participant who must make the fine discrimination. However, one must keep in mind that demonstrating a null result (cannot discern) does not prove anything - only proving that one can in fact discern the difference is significant.
Achieving significance requires a good number of trials, but as the number of trials increases, subject fatigue may cause more errors, pulling the result away from significance.
There are lots of statistical issues with ABX testing! For a comprehensive treatise refer again to Daisuke Koya:
(figure 8) Step and phase response of a 4th order Linkwitz-Riley (LR) crossover at 100 Hz, drawn by Linkwitz himself.
(figure 9) 4th Order Linkwitz-Riley Square-wave Response
(figure 10) 4th Order Linkwitz-Riley Phase Response
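The behaviour shown in these Linkwitz-Riley plots can be checked with a short calculation. The sketch below assumes the textbook model of a 4th order LR crossover as two cascaded 2nd order Butterworth sections per branch, evaluated at the 100 Hz crossover point from figure 8; the function name and the test frequencies are illustrative choices.

```python
import cmath
import math

def lr4_branches(f, fc):
    """Complex response of the 4th order LR low-pass and high-pass branches at frequency f."""
    s = 1j * f / fc                        # normalized complex frequency
    d = s * s + math.sqrt(2) * s + 1       # 2nd order Butterworth denominator
    low = 1 / d**2                         # two cascaded Butterworth low-pass sections
    high = s**4 / d**2                     # two cascaded Butterworth high-pass sections
    return low, high

for f in (10.0, 100.0, 1000.0):
    low, high = lr4_branches(f, 100.0)
    total = low + high                     # acoustic sum of woofer and tweeter
    print(f, "Hz  |sum| =", round(abs(total), 6),
          " phase of sum (deg, wrapped) =", round(math.degrees(cmath.phase(total)), 1))
```

In this model the summed magnitude stays at 1 at every frequency (each branch is at half amplitude, 6 dB down, at the crossover point), while the phase of the sum passes through 180 degrees at the crossover frequency on its way to a full 360 degree rotation. The crossover is flat in amplitude but behaves as an all-pass filter in phase.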
The ABX Phase Distortion Challenge
Can I discern a 4th order Linkwitz-Riley filter that has 360 degrees of phase rotation between high and low pass?
Using the PCABX computer program:
PCABX loads two digitized samples (.wav files) and switches between them purely in software. I use a high fidelity PC based stereo for testing, playback, in addition to researching and writing this paper. My PC is nearly silent - no fan noise because it doesn't use any fans, built myself.
Celeron 566 .18 micron Coppermine processor
Seagate Barracuda IV 40 GB - 24 dB(A)
M-Audio Delta Series Audiophile 2496 outputting S/PDIF
Monarchy DIP mk2
MSB Link III DAC with upgrades
Power Modules Belles 150a - 100 watts Class AB MOSFET
DACT 10 kOhm stepped attenuator as "passive preamp"
Crossover-less dipole line arrays, 9 modified full range drivers per side in series-parallel
Monster HTS-2000 power line conditioner
Very low jitter digital output from PC hard drive/sound card source, especially after DIP reclocking yields clean output from MSB Link that enters the Belles 150a with minimal wiring (less than a foot of teflon-insulated pure silver). Belles 150a is a high current design heavily biased towards Class A and operates towards Class A range when driving 95 dB sensitive line arrays (estimated) at low-medium volumes. End result is a reasonably phase-linear, low noise and low distortion playback chain. Listening is done in medium-small Berkeley studio apartment, near field about 6 feet from speakers, speakers positioned 5 feet apart, placed against 2" acoustic foam on wall of room (to absorb dipole back wave). Test was performed at night with less background noise.
Result: 10 correct out of 13 trials for the "castanet" sample, which is significant at the p<0.05 level (a 4.6 percent chance of guessing that well or better). The task was discerning the unaltered reference from a version digitally processed with 4th order Linkwitz-Riley filters at 300 Hz and 3000 Hz (a common 3-way speaker configuration).
I CAN HEAR PHASE DISTORTION, at least in this "castanet" sample, on my system, in my familiar acoustical environment. Filtered sample is marginally different from reference with 6dB more noise at 85dB SNR, 0.0001 more THD at 0.00017.
Difference is subtle but noticeable, I believe related to the phase distortion and not to any spuriae in the sample presentation. Discerning requires a fair bit of concentration and rapid switching (repeated-music test not running-music).
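The significance figure for an ABX run comes from a one-sided binomial test against pure guessing (50 percent per trial). The sketch below shows that calculation; the function names and the 0.05 threshold default are assumptions for illustration, and this is not a description of what PCABX computes internally.

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of getting at least `correct` right out of `trials` by guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2**trials

def min_correct_for_significance(trials, alpha=0.05):
    """Smallest score whose guessing probability falls below alpha, or None if impossible."""
    for score in range(trials + 1):
        if abx_p_value(score, trials) < alpha:
            return score
    return None

print(round(abx_p_value(10, 13), 4))        # 0.0461
print(min_correct_for_significance(13))     # 10
```

The second helper also illustrates the trade-off with trial count: a short run leaves almost no room for error, while a longer run tolerates more misses but invites listener fatigue.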
Subjective Impression of Phase Distortion, discussion
4th order LR crossovers have always sounded "disjointed" to me - transients sound blurred, and high frequencies don't match up with low frequencies. This is most noticeable with B&W audiophile speakers, which all use 4th order LR crossovers. The DM603 series is especially horrible sounding because the LR crossover is relatively low at 1-2 kHz, whereas the DM303 series is not too bad because the LR crossover is at 4 kHz, almost out of the midrange frequencies. 360 degrees of phase rotation is pretty horrible.
In the castanet sample, I listened for a subjective feel of the running notes. In LR filtered sample, the notes feel like they're stumbling over each other, while in the non-filtered sample, they are fast but liquid, flowing. 360 degrees of phase rotation at 10 kHz is 0.1 milliseconds, which seems inconsequential, but it means the source at 10 kHz would be positioned 1.356 inches closer to you, and smeared in physical location a couple inches over its full frequency spectrum. From the above diagrams, at 2 kHz, 180 degrees of phase rotation is .25 milliseconds.
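The delay figures above follow from treating a phase rotation as a fraction of one period. A minimal sketch of that arithmetic is given below; the speed of sound of 343 m/s is an assumed round value, so the distance comes out slightly different from the 1.356 inches quoted, and the function names are made up for the example.

```python
SPEED_OF_SOUND_M_S = 343.0   # assumed value for air at room temperature
METERS_PER_INCH = 0.0254

def phase_to_delay_s(phase_deg, freq_hz):
    """Time delay equivalent to a phase rotation at a single frequency."""
    return (phase_deg / 360.0) / freq_hz

def delay_to_distance_in(delay_s, c=SPEED_OF_SOUND_M_S):
    """Distance sound travels in the given time, in inches."""
    return delay_s * c / METERS_PER_INCH

d_10k = phase_to_delay_s(360, 10_000)    # full rotation at 10 kHz
d_2k = phase_to_delay_s(180, 2_000)      # half rotation at 2 kHz
print("360 deg at 10 kHz:", d_10k * 1000, "ms, about",
      round(delay_to_distance_in(d_10k), 2), "inches")
print("180 deg at 2 kHz:", d_2k * 1000, "ms")
```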
Vijay Iyer describes how micro-timing in the single-digit milliseconds, controlled by musicians, can affect the emotional content of rhythmic music. Is it then surprising that sub-millisecond timing differences can be perceived? Ill-defined audiophile terminology such as "PRAT", or "pace, rhythm, attack, timing" may be due to these sub-millisecond crossover delays.
Steady-state timbre, congruent with previous research, does not sound much different between samples. The onset portion of the notes, however, is most important for timbre, and one could say that imperfect transient/step responses negatively affect the overall quality of timbre.
Perceived Location (Soundstaging) Effects
Second order crossovers with 180 degrees of phase difference between tweeter and midwoofer have been perceived by this author to have a "high frequency wrap-around" effect. Instruments with predominantly high frequency onsets such as cymbals or even plucked guitar are perceived to be located significantly in front of the speaker, to the point of being next to me. Oddly enough, the steady-state tone of the guitar, for instance, is located as normal at or behind the plane of the speakers. The pluck of the guitar "jumps out" at the listener.
An average human head has an interaural distance of 23 cm. This translates to a maximum interaural time difference (ITD) of roughly 0.69 ms between ears. In terms of phase, for an ITD of 0.5 ms (17 cm) and a frequency of 1 kHz the interaural phase difference (IPD) will be 180 degrees, and for a 500 Hz tone the IPD will be 90 degrees. Differences in phase up to one period (360 degrees) are used to locate sounds at angles relative to the head. With the travel time between the ears in the sub-millisecond range, the sub-millisecond delay due to crossovers intuitively seems to be significant.
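The relation between interaural time difference and interaural phase difference used in this paragraph is simple enough to sketch. The helper names are invented for the example and the speed of sound (343 m/s) is an assumption; with that value, a straight 23 cm path gives about 0.67 ms, a little under the 0.69 ms quoted, which depends on the speed of sound and path length assumed.

```python
def max_itd_ms(ear_distance_m, c=343.0):
    """Time for sound to travel straight across the given distance, in milliseconds."""
    return 1000.0 * ear_distance_m / c

def ipd_degrees(itd_ms, freq_hz):
    """Interaural phase difference for a pure tone, given the interaural time difference."""
    return 360.0 * freq_hz * itd_ms / 1000.0

print(round(max_itd_ms(0.23), 2), "ms across 23 cm")
print(ipd_degrees(0.5, 1000), "degrees at 1 kHz")
print(ipd_degrees(0.5, 500), "degrees at 500 Hz")
```

Above the frequency where the IPD reaches 360 degrees for a given ITD, the phase cue repeats and can no longer identify the angle unambiguously.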
A phase alteration of 180 degrees in high frequencies from a 2nd order crossover would not seem to have any effect on the angle of location, as both sound sources (loudspeakers) are equally affected, yielding no differences in phase between the two ears. However, altering high frequencies relative to low frequencies causes the high frequencies to be perceived closer to the listener while the low frequencies stay put. Why? Notably, 4th order LR crossovers do not exhibit this soundstaging effect, as they have 360 degrees of phase rotation, congruent with the theory stating that at greater than a full period IPD becomes ambiguous.
IPDs vary with frequency because the interaural distance is fixed while the wavelength changes. With a second order crossover, phase difference varies sigmoidally with frequency. A real world sound source that exhibited this phase behavior at the ear would be a very strange beast indeed. For a 2nd order crossover at 2 kHz, 200 Hz is in phase, 2 kHz is 90 degrees out of phase, and 20 kHz is 180 degrees out of phase.
(Figure 11) Phase differences between 2nd order Butterworth high- pass filter (white/top line) and its low-pass complement (yellow/bottom line)
2nd order crossover at Fc = 2 kHz
deg phase rotation (b) | (a*b) fractioned wavelength
8.6 m, 0 m
12.9 cm, -4.3 cm
4.3 cm, -4.3 cm
-4.275 cm, 1.425 cm
0 m, 0.85 cm
(bold component is greater in amplitude, thus component is heard over other due to amplitude masking; at the crossover frequency (2 kHz) tweeter and woofer cancel each other out, which is why some designers purposely connect the two Butterworth-filtered drivers out of phase)
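The phase values and the cancellation at the crossover frequency can be reproduced from the ideal 2nd order Butterworth transfer functions. The sketch below assumes textbook analog filters with a 2 kHz crossover and ignores driver and baffle effects; the function name is illustrative.

```python
import cmath
import math

def butterworth2(f, fc):
    """Complex response of ideal 2nd order Butterworth low-pass and high-pass at f."""
    s = 1j * f / fc                        # normalized complex frequency
    d = s * s + math.sqrt(2) * s + 1       # shared denominator
    return 1 / d, s * s / d                # (low-pass, high-pass)

FC = 2000.0                                # crossover frequency used in the table
for f in (200.0, 2000.0, 20000.0):
    low, high = butterworth2(f, FC)
    print(int(f), "Hz  woofer phase:", round(math.degrees(cmath.phase(low)), 1),
          " tweeter phase:", round(math.degrees(cmath.phase(high)), 1))

low, high = butterworth2(FC, FC)
print("same polarity sum at Fc:    ", round(abs(low + high), 6))
print("inverted tweeter sum at Fc: ", round(abs(low - high), 6))
```

In this idealized model the woofer branch moves from roughly 0 through -90 to nearly -180 degrees, the tweeter branch stays 180 degrees apart from it at every frequency, the same-polarity sum vanishes at the crossover frequency, and inverting one driver turns that null into a sum of about 1.41 (a 3 dB bump).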
From these rough calculations, we can see that very strange things are going on. At around the crossover frequency, the amount of phase rotation times the wavelength of the frequency equals a somewhat constant amount of delay, around 4 cm or 0.125 ms. 4 cm of distance would seem to not matter that much in terms of absolute distance to the sound source, but somehow its effects on the listener can be dramatic. I postulate that there is a synergy of effects of loudspeaker phase distortion with recordings that have certain phase cues encoded from the original acoustic space.
The dynamic response of a speaker is also significant, as some speakers are said to be more "forward" while others are said to be "laid back". Performers in a recording are said to sound like they are located in front of the plane of the speakers with "forward" speakers. "Forward" speakers are usually more dynamic, such as horns, with high efficiency and high dynamic range. Distance of a sound source is also perceived via amplitude, so perhaps the combination of a dynamic speaker and its phase distortion yields the more interesting "jumping out" location effects, most noticeable during dynamic amplitude onsets of notes.
It would be interesting to perform a more rigorous mathematical and empirical analysis of the effects of phase on spatial location perception. Something must be going on with the interaural phase difference, but in this 2nd order crossover example there are no obviously significant values in the phase calculations.
How Important is Phase Distortion? revisited
In loudspeaker design, crossover topologies and frequency points are chosen to match a particular set of drivers that were chosen for a price point, or vice versa. Some drivers, Seas Excel magnesium cone drivers in particular, have very low distortion in their preferred operating range but significant cone breakup (distortion) once it is exceeded. These kinds of drivers require a high slope crossover such as the 4th order LR, including perhaps tricky notch filters which themselves introduce phase distortion.
In the scheme of loudspeaker design, phase is usually and perhaps rightfully secondary to considerations such as flat amplitude/frequency response, off-axis power response, and others depending on the designer. Flat amplitude response, generally agreed to be the most important criterion in a system, is more easily achieved with higher order crossovers. However, reasonably linear frequency responses can be achieved with lower order crossovers that have more benign phase characteristics. Room responses typically play such havoc with frequency response that strict adherence to a flat FR at the cost of phase may not be a worthwhile sacrifice.
Some notable phase-coherent or nearly so commercial brands are:
(recently out of business)
(nice coaxial midrange-tweeter driver)
(looks boring, sounds great)
(transmission line bass)
(looks good, sounds good)
(Audience 40 model)
John Krevkovs has a number of DIY transient perfect speaker designs, notably a 1.5 order phase-coherent crossover albeit with low sensitivity:
Future Directions: Digital Filtering (FIR), DSP room acoustics correction
Finite Impulse Response
The input x_n and output y_n sequences of a digital filter both represent signals sampled at discrete, uniformly spaced time increments t_n. A finite impulse response (FIR) digital filter takes the N+1 most recent samples of x_n, multiplies them by N+1 coefficients, and sums the results to form y_n.
All a's (the feedback coefficients that a recursive filter applies to its past outputs) are zero for an FIR filter. FIR digital filters with symmetric coefficients have constant group delay, which, given enough computing power, can give you steep cutoff and perfect (linear) phase response! In other words, the best of both worlds, the holy grail of crossover design, is here!
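A minimal sketch of the FIR sum described above follows, together with a check of the constant group delay property. The five symmetric tap values are hypothetical and chosen only for illustration (a real crossover would use far more taps), and the function names are not from any particular library.

```python
import cmath

def fir_filter(x, b):
    """y[n] = sum over k of b[k] * x[n - k], using only samples that exist."""
    y = []
    for n in range(len(x)):
        y.append(sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0))
    return y

def frequency_response(b, omega):
    """Complex response of the FIR filter at omega radians per sample."""
    return sum(bk * cmath.exp(-1j * omega * k) for k, bk in enumerate(b))

raw_taps = [1, 2, 3, 2, 1]                 # hypothetical symmetric coefficients
total = sum(raw_taps)
taps = [t / total for t in raw_taps]       # normalize to unity gain at DC
delay = (len(taps) - 1) / 2                # expected group delay: N/2 = 2 samples

# The impulse response of an FIR filter is just its coefficient list.
print(fir_filter([1, 0, 0, 0, 0, 0, 0], taps))

# Undoing a pure delay of N/2 samples leaves a purely real response, so the
# phase is a straight line in frequency and every component is delayed equally.
for omega in (0.3, 1.0, 2.0):
    h = frequency_response(taps, omega) * cmath.exp(1j * omega * delay)
    print(omega, round(h.real, 4), round(abs(h.imag), 12))
```

Because every frequency is delayed by the same two samples, the filter shapes the amplitude response without smearing components relative to each other, which is the property that makes symmetric FIR designs attractive for crossovers.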
Meridian speakers use DSP crossovers in an all-digital but otherwise conventional loudspeaker with integrated amplifiers. Professional brands such as RANE offer digital crossovers, but they may not be phase-accurate FIR types, and component quality is questionable.
Home-brew digital FIR crossovers outputted from multi-channel pro-studio sound cards are in the works. Personal computers pushing 3 GHz of processing power can easily cope with large sample FIRs.
SoundEasy 5.0 Speaker Designer software reportedly has digital crossover functionality built-in.
DSP Room Acoustics Correction
Digital Signal Processing chips exist that, when inserted in low-level stages of your audio source, can sample pulse trains outputted from your audio system and proactively filter out first-reflections, low frequency room mode peaks, reverberations and most importantly phase deviations. Solutions from SigTech, TACT, and Perceptual Tech (in decreasing order of quality, flexibility and expense) already exist and are reportedly quite effective.
Art Ludwig's Sound Pages
Aural Phase Distortion Detection, Daisuke Koya, University of Miami
Boston Audio Society ABX Testing article
Linkwitz Labs phase distortion discussion
Measuring Loudspeakers, John Atkinson, Stereophile Magazine
PCABX Linkwitz-Riley Filter Simulation