Audibility and Musical Understanding of Phase Distortion

Andrew Hon

Music 108
Fall 2002
Professor David Wessel

What is Phase Distortion?

Any signal varying in amplitude with time can be understood as a sum of component sinusoids of different frequencies (cf. the Fourier transform). If one of these frequency components is delayed in time relative to the others, then we say the signal has phase distortion.

(figure 1) Original waveform  ====>

(figure 2) Waveform after FT

Phase Distortion in Loudspeakers

Loudspeakers using multiple drivers typically limit the signal entering each driver to certain frequency bands using filters. Tweeters (drive units 1.5 inch or smaller) are given mostly high frequencies, woofers (typically 5 inches or larger up to 18 or more inches) are given low frequencies, and midranges (drive units larger than tweeters) are given the remainder, middle frequencies.

Electrically, filtering is accomplished with crossover networks. There are several main filter topologies, including Bessel, Butterworth, and Linkwitz-Riley. The slope of a crossover filter is the filter order N times 6 dB per octave, where N is a positive integer (e.g. a 6 dB per octave filter is a 1st order crossover, 12 dB/oct. is 2nd order, etc.). All crossover filters alter the phase in some manner; for example, a common 2nd order Butterworth crossover will have its tweeter playing 180 degrees out of phase with the woofer.
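As a rough check of the two claims above (slope of N times 6 dB per octave, and 180 degrees between the drivers of a 2nd order Butterworth crossover), here is a minimal sketch using the textbook 2nd order Butterworth transfer functions. The function names and the 2 kHz crossover point are my own choices for illustration.

```python
import cmath
import math


def butterworth2_lowpass(f_hz, fc_hz):
    """Complex response of a 2nd order Butterworth low-pass at f_hz."""
    s = 1j * f_hz / fc_hz  # normalized complex frequency
    return 1 / (s * s + math.sqrt(2) * s + 1)


def butterworth2_highpass(f_hz, fc_hz):
    """Complex response of the matching 2nd order Butterworth high-pass."""
    s = 1j * f_hz / fc_hz
    return (s * s) / (s * s + math.sqrt(2) * s + 1)


def magnitude_db(h):
    return 20 * math.log10(abs(h))


def phase_difference_deg(f_hz, fc_hz):
    """Phase of the high-pass output relative to the low-pass output."""
    hp = math.degrees(cmath.phase(butterworth2_highpass(f_hz, fc_hz)))
    lp = math.degrees(cmath.phase(butterworth2_lowpass(f_hz, fc_hz)))
    return (hp - lp) % 360


fc = 2000.0  # assumed crossover frequency in Hz

# Slope well above the crossover: level change over one octave (10*fc to 20*fc).
octave_slope_db = (magnitude_db(butterworth2_lowpass(20 * fc, fc))
                   - magnitude_db(butterworth2_lowpass(10 * fc, fc)))
print(f"low-pass slope: {octave_slope_db:.1f} dB per octave")

for f in (200.0, 2000.0, 20000.0):
    print(f"{f:7.0f} Hz: tweeter leads woofer by {phase_difference_deg(f, fc):.1f} degrees")
```

The high-pass response is the low-pass response multiplied by s squared, which is a negative real number on the frequency axis, so the 180 degree difference holds at every frequency and not only at the crossover point.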

Step Response in Loudspeakers

To measure the phase performance of a loudspeaker, we look at the step response instead of the impulse response for cleaner data. A step function is a voltage that instantaneously rises from zero to a positive value and stays there (figure 3). The following diagrams were measured with MLSSA (Maximum-Length Sequence System Analyzer), which excites the speaker with a pseudo-random maximum-length sequence and averages across a number of measurements.

(figure 3) Perfect Step Response

A good phase-accurate (aka transient perfect) loudspeaker has a reasonable facsimile of the step response (figure 4).

(figure 4) Good Loudspeaker Step Response

A typical phase-inaccurate loudspeaker such as the 2nd order Butterworth mentioned above would look like figure 5.

(figure 5) Typical Loudspeaker Step Response

In this typical phase-inaccurate loudspeaker step response, which is a measurement summing the tweeter and woofer responses, we see the tweeter initially rising with the input voltage (figure 6), but the woofer, lagging by 180 degrees, contributes the rapid fall immediately afterwards (figure 7), resulting in a totally mangled step response.

(figure 6) Tweeter Step Response from figure 5

(figure 7) Woofer Step Response from figure 5

diagrams courtesy
Measuring Loudspeakers, Part Two
John Atkinson, December 1998
from preprint, "Loudspeakers: What Measurements Can Tell Us-And What They Can't Tell Us!,"
AES Preprint 4608

Is Phase Distortion Audible?

Phase is easily audible as it is being changed, such as with a component frequency in a complex waveform. Experiments with steady-state complex waveforms indicate no difference in timbre if the phase of a component waveform is altered.

Ohm's Phase Law in the 1800s claimed that the phase of a waveform has no effect on how the ear perceives it, and in typical circumstances, relative phase differences are difficult to perceive. One theory suggests that phase is altered by ordinary reverberant environments, being position-dependent, so our ears don't pay much attention to phase.

Despite these early beliefs, studies have demonstrated that phase distortion is audible, albeit subtly and only under certain circumstances. Nevertheless, many people claim that previous research shows phase distortion is not audible. They simply have not read the more current research. Many loudspeaker designers are guilty in this regard. My assertion is:

It depends. Yes, phase distortion is audible under the right circumstances to certain people, but the consensus is that it is rather subtle at best, especially in relation to other forms of distortion. Phase is chaotic in reverberant environments, yes, but the direct sound is not affected by reverberation. In some situations, such as choral music in a cathedral heard from the back of the audience, phase is totally messed up, but in most other cases it still matters!

How Audible Is Phase Distortion?

The most comprehensive review of phase distortion research I've come across is by Daisuke Koya, "Aural Phase Distortion Detection", University of Miami, May 2000 (Masters Thesis)

Koya states:

"Although not in large numbers, previous research in investigation of the audibility of phase distortion has proven that it is an audible phenomenon. Lipshitz et al. [7] has shown that on suitably chosen signals, even small midrange phase distortion can be clearly audible. Mathes and Miller [8] and Craig and Jeffress [9] showed that a simple two-component tone, consisting of a fundamental and second harmonic, changed in timbre as the phase of the second harmonic was varied relative to the fundamental. The above experiment was replicated by Lipshitz et al., with summed 200 and 400 Hz frequencies, presented double blind via loudspeakers resulting in a 100% accuracy score. An experiment involving polarity inversion of both loudspeaker channels resulted in an audibility confidence rating in excess of 99% with the two-component tone, although the effect was very subtle on music and speech. Cabot et al. [10] tested the audibility of phase shifts in two component octave complexes with fundamental and third-harmonic signals via headphones. The experiment demonstrated that phase shifts of harmonic complexes were detectable.

Another very simple experiment conducted by Lipshitz et al. was to demonstrate that the inner ear responds asymmetrically. Reversing the polarity of only one channel of a pair of headphones markedly produces an audible and oppressive effect on both monaural and stereophonic material. This effect predominantly affects frequency components below 1 kHz. Because reversal of polarity does not introduce dispersive or time-delay effects into the signal, but merely reverses compressions into rarefactions and vice versa, these audible effects are due only to the constant 180 phase shift that polarity reversal brings about. Since interaural cross-correlations do not occur before the olivary complexes to which the acoustic nerve bundles connect, it must be concluded that what is changed is the acoustic nerve output from the cochlea due to polarity reversal. This change owes to two factors: cochlear response to the opposite polarity half of the waveform, and the waveform having a shifted time relationship relative to the signal heard by the other ear. This reaffirms the half-wave rectifying nature of the inner ear."

[7] S. P. Lipshitz, M. Pocock, and J. Vanderkooy, "On the Audibility of Midrange Phase Distortion in Audio Systems," J. Audio Eng. Soc., vol. 30, pp. 580-595 (1982 Sep.).

[8] R. C. Mathes and R. L. Miller, "Phase Effects in Monaural Perception," J. Acoust. Soc. Am., vol. 19, pp. 780-797 (1947).

[9] J. H. Craig and L. A. Jeffress, "The Effect of Phase on the Quality of a Two-Component Tone," J. Acoust. Soc. Am., vol. 34, pp. 1752-1760 (1962).

[10] R. C. Cabot, M. G. Mino, D. A. Dorans, I. S. Tackel, and H. E. Breed, "Detection of Phase Shifts in Harmonically Related Tones," J. Audio Eng. Soc., vol. 24, pp. 568-571 (1976 Sep.).

[11] J. R. Ashley, "Group and Phase Delay Requirements for Loudspeaker Systems," Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (Denver, CO, 1980 Apr. 9-11), vol. 3, pp. 1030-1033 (1980).

Why So Much Disagreement?

Mainly, phase distortion is difficult to test and to prove. On the implementation side, phase-linear (accurate) source material and reproduction equipment are necessary:

1) minimalist mic'ing required - no multi-mic'ing or studio mic'ing
2) no mixing board analog equalization, because analog equalization filters alter phase!
3) preferably no reverb or additional post-processing
4) phase linear amplification required (any good quality amplifier)
5) phase linear speakers, e.g. crossover-less single driver speakers like Quad electrostats
6) or Stax electrostatic headphones

Double-blind ABX testing methodology

In the ABX test, a number of trials are given where one is to discern between two conditions, A and B. The participant knows what A is and what B is, but is given an X which is either A or B. An "ABX comparator" is a black box that administers the switching across trials. In a double-blind test, neither the administrator of the test nor the participant knows anything about the conditions, eliminating transferable bias.

Caveats with ABX

The odds are indeed stacked against the participant who must make the fine discrimination. However, one must keep in mind that demonstrating a null result (cannot discern) does not prove anything - only proving that one can in fact discern the difference is significant.

Achieving significance requires a good number of trials, but as the number of trials increases, subject fatigue may result in more errors, driving the result away from significance.

There are lots of statistical issues with ABX testing! For a comprehensive treatise, refer again to Daisuke Koya's thesis.

(figure 8) Step and phase response of a 4th order Linkwitz-Riley (LR) crossover at 100 Hz, drawn by Linkwitz himself.

(figure 9) 4th Order Linkwitz-Riley Square-wave Response

(figure 10) 4th Order Linkwitz-Riley Phase Response

The ABX Phase Distortion Challenge

Can I discern a 4th order Linkwitz-Riley filter that has 360 degrees of phase rotation between high and low pass?

Using the PCABX computer program:

PCABX loads two digitized samples (.wav files) and switches between them purely in software. I use a high fidelity PC-based stereo for testing and playback, as well as for researching and writing this paper. My PC, which I built myself, is nearly silent: it has no fans and therefore no fan noise.



  processor          Celeron 566, .18 micron Coppermine
  hard drive         Seagate Barracuda IV 40 GB - 24 dB(A)
  sound card         M-Audio Delta Series Audiophile 2496 outputting S/PDIF
  jitter reducer     Monarchy DIP mk2
  DAC                MSB Link III DAC with upgrades
  power amplifier    Power Modules Belles 150a - 100 watts Class AB MOSFET
  "passive preamp"   DACT 10 kOhm stepped attenuator
  speakers           crossover-less dipole line arrays, 9 modified full range drivers per side in series-parallel
  power conditioner  Monster HTS-2000 power line conditioner

Very low jitter digital output from PC hard drive/sound card source, especially after DIP reclocking yields clean output from MSB Link that enters the Belles 150a with minimal wiring (less than a foot of teflon-insulated pure silver). Belles 150a is a high current design heavily biased towards Class A and operates towards Class A range when driving 95 dB sensitive line arrays (estimated) at low-medium volumes. End result is a reasonably phase-linear, low noise and low distortion playback chain. Listening is done in medium-small Berkeley studio apartment, near field about 6 feet from speakers, speakers positioned 5 feet apart, placed against 2" acoustic foam on wall of room (to absorb dipole back wave). Test was performed at night with less background noise.


I scored 10 correct out of 13 trials on the "castanet" sample, which is significant at the p<0.05 level (a 4.6 percent chance of guessing that well), discerning the unaltered reference from a copy digitally processed with 4th order Linkwitz-Riley filters at 300 Hz and 3000 Hz (a common 3-way speaker configuration).
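The 4.6 percent figure can be reproduced with a one-sided binomial calculation. This is a minimal sketch that assumes each trial is an independent coin flip for a listener who is only guessing; the function name is my own.

```python
from math import comb


def abx_p_value(correct, trials):
    """Chance of getting at least `correct` of `trials` right by pure guessing."""
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials


p = abx_p_value(10, 13)
print(f"p = {p:.4f}")  # 378 / 8192, about 0.046
```

The same function shows how unforgiving short tests are: with 13 trials, 9 correct would not have reached the 0.05 level.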

I CAN HEAR PHASE DISTORTION, at least in this "castanet" sample, on my system, in my familiar acoustical environment. Filtered sample is marginally different from reference with 6dB more noise at 85dB SNR, 0.0001 more THD at 0.00017.

Difference is subtle but noticeable, I believe related to the phase distortion and not to any spuriae in the sample presentation. Discerning requires a fair bit of concentration and rapid switching (repeated-music test not running-music).

Subjective Impression of Phase Distortion, discussion

4th order LR crossovers have always sounded "disjointed" to me - transients sound blurred, and high frequencies don't match up with low frequencies. This is most noticeable with B&W audiophile speakers, which all use 4th order LR crossovers. The DM603 series is especially horrible sounding because the LR crossover is relatively low at 1-2kHz, whereas the DM303 series is not too bad because the LR crossover is at 4kHz, almost out of the midrange frequencies. 360 degrees of phase rotation is pretty horrible.

In the castanet sample, I listened for a subjective feel of the running notes. In LR filtered sample, the notes feel like they're stumbling over each other, while in the non-filtered sample, they are fast but liquid, flowing. 360 degrees of phase rotation at 10 kHz is 0.1 milliseconds, which seems inconsequential, but it means the source at 10 kHz would be positioned 1.356 inches closer to you, and smeared in physical location a couple inches over its full frequency spectrum. From the above diagrams, at 2 kHz, 180 degrees of phase rotation is .25 milliseconds.
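The arithmetic behind these figures is just phase as a fraction of one period, converted to time and then to distance. A minimal sketch, assuming a speed of sound of 344 m/s (my assumption; the exact value shifts the inch figure slightly):

```python
SPEED_OF_SOUND_M_S = 344.0  # assumed speed of sound in room-temperature air


def phase_to_delay_s(phase_deg, freq_hz):
    """Time delay equivalent to `phase_deg` of rotation at `freq_hz`."""
    return (phase_deg / 360.0) / freq_hz


def delay_to_distance_m(delay_s, speed_m_s=SPEED_OF_SOUND_M_S):
    """Distance sound travels during `delay_s`."""
    return delay_s * speed_m_s


for phase_deg, freq_hz in ((360.0, 10_000.0), (180.0, 2_000.0)):
    delay_s = phase_to_delay_s(phase_deg, freq_hz)
    inches = delay_to_distance_m(delay_s) / 0.0254
    print(f"{phase_deg:.0f} deg at {freq_hz:.0f} Hz: "
          f"{delay_s * 1e3:.3f} ms, {inches:.2f} in")
```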

Vijay Iyer describes how micro-timing in the single-digit milliseconds, controlled by musicians, can affect the emotional content of rhythmic music. Is it then surprising that sub-millisecond timing differences can be perceived? Ill-defined audiophile terminology such as "PRAT", or "pace, rhythm, attack, timing" may be due to these sub-millisecond crossover delays.

Steady-state timbre, congruent with previous research, does not sound much different between samples. The onset portion of the notes, however, is most important for timbre, and one could say that imperfect transient/step responses negatively affect the overall quality of timbre.

Perceived Location (Soundstaging) Effects

Second order crossovers with 180 degrees of phase difference between tweeter and midwoofer have been perceived by this author to have a "high frequency wrap-around" effect. Instruments with predominantly high frequency onsets such as cymbals or even plucked guitar are perceived to be located significantly in front of the speaker, to the point of being next to me. Oddly enough, the steady-state tone of the guitar, for instance, is located as normal at or behind the plane of the speakers. The pluck of the guitar "jumps out" at the listener.

An average human head has an interaural distance of 23 cm. This translates to a maximum interaural time difference of 0.69 ms between ears. In terms of phase, for an ITD of 0.5ms (17 cm) and a frequency of 1kHz the interaural phase difference will be 180 degrees and for a 500Hz tone the IPD will be 90 degrees. Differences in phase up to one period (360 degrees) are used to locate sounds at angles relative to the head. With the distance between ears in the sub-milliseconds, the sub-millisecond delay due to crossovers intuitively seems to be significant.
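The relation between interaural time difference and interaural phase difference used here can be sketched as follows. The straight-line path between the ears and the 344 m/s speed of sound are simplifying assumptions of mine, which is why the maximum ITD comes out near 0.67 ms rather than exactly the 0.69 ms quoted.

```python
SPEED_OF_SOUND_M_S = 344.0  # assumed speed of sound


def max_itd_s(ear_distance_m, speed_m_s=SPEED_OF_SOUND_M_S):
    """Largest interaural time difference for a straight path between the ears."""
    return ear_distance_m / speed_m_s


def ipd_deg(itd_s, freq_hz):
    """Interaural phase difference produced by a time difference at one frequency."""
    return 360.0 * itd_s * freq_hz


print(f"max ITD for 23 cm: {max_itd_s(0.23) * 1e3:.2f} ms")
for freq_hz in (500.0, 1000.0):
    print(f"ITD 0.5 ms at {freq_hz:.0f} Hz: IPD {ipd_deg(0.0005, freq_hz):.0f} degrees")
```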

A phase alteration of 180 degrees in high frequencies from a 2nd order crossover would not seem to have any effect on the angle of location, as both sound sources (loudspeakers) are equally affected, yielding no differences in phase between the two ears. However, altering high frequencies relative to low frequencies causes the high frequencies to be perceived closer to the listener while the low frequencies stay put. Why? Notably, 4th order LR crossovers do not exhibit this soundstaging effect, as they have 360 degrees of phase rotation, congruent with the theory stating that at greater than a full period IPD becomes ambiguous.

IPDs vary with frequency because the interaural distance is fixed while the wavelength changes. With a second order crossover, phase difference varies sigmoidally with frequency. A real world sound source that exhibited this phase behavior at the ear would be a very strange beast indeed. For a 2nd order crossover at 2 kHz, 200 Hz is in phase, 2 kHz is 90 degrees out of phase, and 20 kHz is 180 degrees out of phase.

(Figure 11) Phase differences between 2nd order Butterworth high-pass filter (white/top line) and its low-pass complement (yellow/bottom line)

2nd order crossover at Fc = 2 kHz

  frequency   wavelength (a)   phase rotation (b), deg   fraction of wavelength (a*b/360)
  200 Hz      1.72 m           180, 0                    0.86 m, 0 m
  1 kHz       34.4 cm          135, -45                  12.9 cm, -4.3 cm
  2 kHz       17.2 cm          90, -90                   4.3 cm, -4.3 cm
  3 kHz       11.4 cm          -135, 45                  -4.275 cm, 1.425 cm
  20 kHz      1.7 cm           0, 180                    0 m, 0.85 cm

  (bold component is greater in amplitude, thus component is heard over other due to amplitude masking; at the crossover frequency (2 kHz) tweeter and woofer cancel each other out, which is why some designers purposely connect the two Butterworth-filtered drivers out of phase)  

From these rough calculations, we can see that very strange things are going on. At around the crossover frequency, the amount of phase rotation times the wavelength of the frequency equals a somewhat constant amount of delay, around 4 cm or 0.125 ms. 4 cm of distance would seem to not matter that much in terms of absolute distance to the sound source, but somehow its effects on the listener can be dramatic. I postulate that there is a synergy of effects of loudspeaker phase distortion with recordings that have certain phase cues encoded from the original acoustic space.
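The rough calculations in the table can be reproduced as wavelength times the fraction of a cycle. This is a minimal sketch: the phase values are copied from the table rather than derived from a filter model, and 344 m/s is an assumed speed of sound that matches the 17.2 cm wavelength listed at 2 kHz.

```python
SPEED_OF_SOUND_M_S = 344.0  # assumed; gives the 17.2 cm wavelength at 2 kHz

# (frequency in Hz, first phase value, second phase value), copied from the table
ROWS = [
    (200.0, 180.0, 0.0),
    (1000.0, 135.0, -45.0),
    (2000.0, 90.0, -90.0),
    (3000.0, -135.0, 45.0),
    (20000.0, 0.0, 180.0),
]


def wavelength_m(freq_hz):
    return SPEED_OF_SOUND_M_S / freq_hz


def path_offset_m(freq_hz, phase_deg):
    """Fraction of a wavelength corresponding to `phase_deg` of rotation."""
    return wavelength_m(freq_hz) * phase_deg / 360.0


def offset_to_delay_ms(offset_m):
    return offset_m / SPEED_OF_SOUND_M_S * 1e3


for freq_hz, phase_a, phase_b in ROWS:
    a_cm = path_offset_m(freq_hz, phase_a) * 100
    b_cm = path_offset_m(freq_hz, phase_b) * 100
    print(f"{freq_hz:7.0f} Hz  wavelength {wavelength_m(freq_hz) * 100:6.1f} cm  "
          f"offsets {a_cm:7.2f} cm, {b_cm:7.2f} cm")

print(f"4.3 cm corresponds to {offset_to_delay_ms(0.043):.3f} ms")
```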

The dynamic response of a speaker is also significant, as some speakers are said to be more "forward" while others are said to be "laid back". Performers in a recording are said to sound like they are located in front of the plane of the speakers with "forward" speakers. "Forward" speakers are usually more dynamic, such as horns, with high efficiency and high dynamic range. Distance of a sound source is also perceived via amplitude, so perhaps the combination of a dynamic speaker and its phase distortion yields the more interesting "jumping out" location effects, most noticeable during dynamic amplitude onsets of notes.

It would be interesting to perform a more rigorous mathematical and empirical analysis of the effects of phase on spatial location perception. Something must be going on with the interaural phase difference, but in this 2nd order crossover example there are no obviously significant values in the phase calculations.

How Important is Phase Distortion? revisited

In loudspeaker design, crossover topologies and frequency points are chosen to match a particular set of drivers that were chosen for a price point, or vice versa. Some drivers have very low distortion in their preferred operating range, with significant cone breakup (distortion) once it is exceeded, Seas Excel magnesium cone drivers in particular. These kinds of drivers require a high slope crossover such as the 4th order LR, including perhaps tricky notch filters which themselves introduce phase distortion.

In the scheme of loudspeaker design, phase is usually and perhaps rightfully secondary to considerations such as flat amplitude/frequency response, off-axis power response, and others depending on the designer. Flat amplitude response, generally agreed to be the most important criterion in a system, is more easily achieved with higher order crossovers. However, reasonably linear frequency responses can be achieved with lower order crossovers that have more benign phase characteristics. Room responses typically play such havoc with frequency response that strict adherence to a flat FR at the cost of phase may not be a worthwhile sacrifice.

Some notable phase-coherent or nearly so commercial brands are:

  (recently out of business)
(nice coaxial midrange-tweeter driver)
(looks boring, sounds great)
(transmission line bass)
(looks good, sounds good)
(Audience 40 model)

John Kreskovsky has a number of DIY transient-perfect speaker designs, notably a 1.5 order phase-coherent crossover, albeit with low sensitivity.

Future Directions: Digital Filtering (FIR), DSP room acoustics correction

Finite Impulse Response

The input x_n and output y_n sequences of a digital filter both represent signals sampled at discrete, uniformly spaced time increments t_n. A finite impulse response (FIR) digital filter takes the N+1 most recent samples of x_n, multiplies them by N+1 coefficients, and sums the results to form y_n.

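In the usual notation for a general digital filter, the a coefficients weight previous output samples (feedback); all a's are zero for an FIR filter. FIR digital filters with symmetric coefficients have constant group delay, which, given enough computing power, can give you a steep cutoff and a perfect (linear) phase response! In other words, the best of both worlds, the holy grail of crossover design, is here!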
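To make the idea concrete, here is a minimal sketch of a linear-phase FIR crossover in plain Python. The design method (a Hamming-windowed sinc), the 101 taps, the 3 kHz crossover and the 44.1 kHz sample rate are all my own assumptions for illustration, not a description of any commercial product. The high-pass is formed by subtracting the low-pass from a pure delay, so the two outputs sum back to a delayed copy of the input.

```python
import math


def windowed_sinc_lowpass(num_taps, cutoff_hz, sample_rate_hz):
    """Symmetric (linear-phase) FIR low-pass coefficients; num_taps must be odd."""
    mid = (num_taps - 1) // 2
    fc = cutoff_hz / sample_rate_hz  # cutoff in cycles per sample
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))  # Hamming
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalize to unity gain at DC


def complementary_highpass(lowpass_taps):
    """High-pass coefficients: a pure delay of (N / 2) samples minus the low-pass."""
    mid = (len(lowpass_taps) - 1) // 2
    highpass = [-t for t in lowpass_taps]
    highpass[mid] += 1.0
    return highpass


def fir_filter(taps, x):
    """y[n] = sum over k of taps[k] * x[n - k], with no feedback terms."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, b in enumerate(taps):
            if n - k >= 0:
                acc += b * x[n - k]
        y.append(acc)
    return y


lp = windowed_sinc_lowpass(101, 3000.0, 44100.0)
hp = complementary_highpass(lp)

impulse = [1.0] + [0.0] * 199
woofer = fir_filter(lp, impulse)
tweeter = fir_filter(hp, impulse)
summed = [w + t for w, t in zip(woofer, tweeter)]

peak_index = max(range(len(summed)), key=lambda i: abs(summed[i]))
print(f"summed output is a single impulse delayed by {peak_index} samples")
```

The price of the perfect summed response is latency: every frequency is delayed by the same 50 samples (about 1.1 ms at 44.1 kHz in this sketch), and steeper filters need more taps and therefore more delay.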

Meridian speakers use DSP crossovers in an all-digital but otherwise conventional loudspeaker with integrated amplifiers. Professional brands such as RANE offer digital crossovers, but they may not be phase-accurate FIR types, and component quality is questionable.

Home-brew digital FIR crossovers outputted from multi-channel pro-studio sound cards are in the works. Personal computers pushing 3 GHz of processing power can easily cope with large sample FIRs.

SoundEasy 5.0 Speaker Designer software reportedly has digital crossover functionality built-in.

DSP Room Acoustics Correction

Digital Signal Processing chips exist that, when inserted in low-level stages of your audio source, can sample pulse trains outputted from your audio system and proactively filter out first-reflections, low frequency room mode peaks, reverberations and most importantly phase deviations. Solutions from SigTech, TACT, and Perceptual Tech (in decreasing order of quality, flexibility and expense) already exist and are reportedly quite effective.


Art Ludwig's Sound Pages

Aural Phase Distortion Detection, Daisuke Koya, University of Miami

Boston Audio Society ABX Testing article

Linkwitz Labs phase distortion discussion

Measuring Loudspeakers, John Atkinson, Stereophile Magazine

PCABX Linkwitz-Riley Filter Simulation






Crossover phase distortion


 - my preference for a low order crossover between woofer and midrange, because of its reduced phase distortion.

A: The waveform is undistorted, as with a 1st order 100 Hz Butterworth crossover, when lowpass and highpass outputs are added.

B: The 2nd order Linkwitz-Riley, 3rd order Butterworth and 1st order Butterworth crossovers each form a 1st order allpass when the polarity of the midrange is reversed. The waveform is distorted. The high frequency spectral content forms a sharp spike of opposite polarity to the input, followed by the low frequency content with the same polarity as the input.

C: The 4th order L-R crossover forms a 2nd order allpass when woofer and midrange outputs are added. Again, the waveform is distorted, though differently from the previous case.

The spectrum of the waveform, that is applied to the 100 Hz crossover, is shown below. It has a 50 Hz fundamental and harmonics at odd multiples thereof. Each spectral component is transmitted in the above crossovers with correct amplitude, but their relative phase is changed, particularly between the regions below and above the 100 Hz crossover frequency.
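The statement that amplitude is preserved while relative phase changes can be checked numerically. The following is a minimal sketch using textbook transfer functions for the 1st order Butterworth and 4th order Linkwitz-Riley crossovers at 100 Hz; the function names are illustrative and the test frequencies are the 50 Hz fundamental and its first odd harmonics, plus the crossover frequency itself.

```python
import cmath
import math


def bw1_sum(f_hz, fc_hz):
    """Low-pass plus high-pass output of a 1st order Butterworth crossover."""
    s = 1j * f_hz / fc_hz
    return 1 / (1 + s) + s / (1 + s)


def lr4_sum(f_hz, fc_hz):
    """Low-pass plus high-pass output of a 4th order Linkwitz-Riley crossover."""
    s = 1j * f_hz / fc_hz
    denominator = s * s + math.sqrt(2) * s + 1  # 2nd order Butterworth section
    lowpass = (1 / denominator) ** 2            # two cascaded low-pass sections
    highpass = (s * s / denominator) ** 2       # two cascaded high-pass sections
    return lowpass + highpass


fc = 100.0
for f in (50.0, 100.0, 150.0, 250.0, 350.0):
    for name, h in (("1st order BW", bw1_sum(f, fc)), ("4th order LR", lr4_sum(f, fc))):
        # cmath.phase wraps to +/-180 degrees, so large lags appear as positive angles.
        print(f"{f:5.0f} Hz  {name}: magnitude {abs(h):.3f}, "
              f"phase {math.degrees(cmath.phase(h)):7.1f} deg")
```

Both sums have a magnitude of one at every frequency, but only the 1st order sum also has zero phase; the 4th order sum is rotated by 180 degrees at the crossover frequency.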

The waveforms of A, B and C above are quite different and it would not be unreasonable to expect that they sound different. Yet, I have not found a signal for which I can hear a difference. This seems to confirm Ohm's acoustic law that we do not hear waveform distortion. At least it seems to apply to the phase distortion generated by typical allpass crossovers.








3-Way Active Crossover with Linear-Phase Response


The problems of common crossover networks are well known. The low-pass filter delays the signal, whereas the high-pass filter advances the phase of the signal that passes through it. Around the crossover frequency this creates several problems: 1) the signals from the two filters partly cancel; 2) the phase shift between the filters affects the on-axis response; 3) the radiation pattern depends on frequency. The crossover circuit presented here solves many of these problems and is based on research by S. Lipshitz and J. Vanderkooy published in the JAES (Journal of the Audio Engineering Society). A linear-phase crossover network uses a low-pass section together with a time-delay circuit and a subtraction circuit, which yields an output signal with a high-pass characteristic. The delay time is not constant over the entire frequency range, but it varies only slowly, and above all there are no phase differences between the two output signals, not even near the crossover frequency.

Part List

R1-16=100Kohms
R2-3-4-5=56Kohms
R6-27=37.5Kohms [33K+4.7K]
R8-9-12-13-14=10Kohms
R10-28=75Kohms (150K//150K)
R11-29=NC
R15=56.3Kohms
R17=12Kohms
R18-19-20-21-22=10Kohms
R23-24-25-26=37.5Kohms [33K+4.7K]
R30-31-32-33-34-35-36=10Kohms
R37-38-39-40-41-41=10Kohms
R42-43-44=47Kohms
R45-46=47 ohms
TR1-2-3-4=47Kohms trimmer or pot.
C1-34-35=2.2uF 100V MKT
C2-3-7-8-14-15-18=47nF 100V MKT
C4-5-6-9-10-11-16-17=10nF 100V MKT
C12-13-20-21-22=1nF 100V MKT
C19-23-24-30-31-32-33=47nF 100V MKT
C25-26-27-28-29=1nF 100V MKT
C36-37=1uF 100V MKT
C38-39=47uF 25V
IC1=TL071
IC2-3-4-5-6-7=TL072-NE5532

All resistors are 1/4W 1% metal film.


As the block diagram [Fig. 2] shows, the crossover circuit consists of two fourth-order (-24 dB/oct) low-pass filters, one in the low-frequency line and one for the high crossover frequency. Two time-delay units, T1 (for the low crossover frequency F1) and T2 (for the high crossover frequency F2), operate at the same frequencies and have the same phase characteristic as the corresponding low-pass section. Delay circuit T1 imitates the delay introduced by the low-pass filter LPF1, while T2 imitates the delay introduced by the low-pass filter LPF2 in the mid-frequency line. The signal from each low-pass filter is then subtracted (IC7A-B) from the delayed signal, giving a signal with the same characteristics as one that has passed through a high-pass filter. At the output of each line there is a trimmer for adjusting the level and the balance between the loudspeakers. The circuit is powered from a regulated +/-15V supply. With fourth-order Linkwitz networks, the outputs are at -6 dB at the crossover frequencies [Fig. 3].

Fig. 4 shows the basic circuits and the formulas needed to calculate the low-pass filters and the time-delay circuits. It also gives a worked example for crossover frequencies F1=200 Hz and F2=3 kHz, which will help with calculating and adapting the design to your needs. The circuit comes from a related article in Elektor magazine. More theoretical details can be found in that article and in the related JAES articles by S. Lipshitz and J. Vanderkooy.
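The formulas of Fig. 4 are not reproduced here, so the following is only a sketch of the delay-and-subtract principle under stated assumptions: the low-pass is modelled as an ideal 4th order Linkwitz-Riley filter, the delay unit as an ideal pure delay, and the delay time is set to the low-frequency group delay of that low-pass. The actual circuit values may be chosen differently.

```python
import cmath
import math


def lr4_lowpass(f_hz, fc_hz):
    """Ideal 4th order Linkwitz-Riley low-pass (two cascaded 2nd order Butterworth)."""
    s = 1j * f_hz / fc_hz
    section = 1 / (s * s + math.sqrt(2) * s + 1)
    return section * section


def low_frequency_delay_s(fc_hz):
    """Group delay of the low-pass above as frequency approaches zero."""
    return 2 * math.sqrt(2) / (2 * math.pi * fc_hz)


def derived_highpass(f_hz, fc_hz, delay_s):
    """Delayed input minus low-pass output, as done by the subtraction stage."""
    delayed = cmath.exp(-2j * math.pi * f_hz * delay_s)
    return delayed - lr4_lowpass(f_hz, fc_hz)


for name, fc in (("F1", 200.0), ("F2", 3000.0)):
    delay = low_frequency_delay_s(fc)
    print(f"{name} = {fc:.0f} Hz: assumed delay {delay * 1e3:.3f} ms")
    for f in (fc / 10, fc, fc * 10):
        level_db = 20 * math.log10(abs(derived_highpass(f, fc, delay)))
        print(f"    derived high-pass at {f:7.0f} Hz: {level_db:6.1f} dB")
```

Because the high-pass is defined as the delayed input minus the low-pass, the two outputs always add back to the delayed input, which is what makes the combined response linear-phase.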

  • A Family of Linear-Phase Crossover Networks of High Slope Derived by Time Delay. Vol. 31, Number 1, pp. 2 (1983). Authors: Stanley P. Lipshitz and John Vanderkooy. Abstract: The design of linear-phase crossover networks has until now necessitated the use of crossovers, at least one of whose outputs suffers from either frequency response ripple in the passband or low rolloff rate in the stopband. It may be desirable, at leas

  • Use of Frequency Overlap and Equalization to Produce High-Slope Linear-Phase Loudspeaker Crossover Networks. Vol. 33, Number 3, pp. 114 (1985). Authors: Stanley P. Lipshitz and John Vanderkooy. Abstract: It has been shown that linear-phase crossovers of high slope can be synthesized by subtracting a suitable low-pass output from a time-delayed version of the input signal. It would be nice to be able to avoid the expense of such an electronic time-delay

  • 2-Way Active Crossover with Linear Phase Response




Tech Design...
Examination of crossover induced  transient distortion.


1st order Butterworth: