Summarized from Shaffer, Withnell, Dhar, Lilly, Harmon and Goodman (2003),
Ear and Hearing, 24, 367-379
Lauren A. Shaffer1 and Sumitrajit Dhar2
1Department of Speech Pathology and Audiology, Ball State University, Muncie, IN 47306
2Department of Speech and Hearing Sciences, Indiana University, Bloomington, IN 47405
The clinical utility of otoacoustic emissions (OAEs) is limited
because hearing thresholds cannot be accurately or reliably predicted
from OAE test results. We recently published a paper that discusses
this limitation in light of current theories of how OAEs are generated.
The following summarizes some of the main points from the paper.
For further detail, additional figures, and analysis methods, the
reader is referred to Shaffer et al. (2003) (please see footnote 1).
Theory of the generation mechanisms and sources of OAEs
Over the last decade, a coherent theory has developed suggesting
that the different OAE types share common mechanisms of generation (Talmadge,
Tubis, Long & Piskorski, 1998; Zweig & Shera, 1995). Shera and Guinan
(1999) suggested a re-classification of OAEs based on two mechanisms of
generation, nonlinear distortion and linear coherent reflection. Nonlinear
distortion and linear coherent reflection can also be thought of in the terms
suggested by Kemp (1986), “wave-fixed” and “place-fixed”.
Nonlinear distortion is believed to arise from physiological
nonlinearities associated with the action of the cochlear amplifier as it
injects energy into the basilar membrane motion. Therefore, OAE energy arising
from nonlinear distortion is “fixed” to the traveling wave.
Reflection, on the other hand, may occur anywhere along the cochlear partition
where an irregularity causes energy to be turned around. There is still much
uncertainty about what types of irregularities actually cause reflection in the
cochlea. In fact, Kemp (2002) suggested a further branching of generation
terminology to distinguish between potentially passive mechanisms, such as an
irregularity in hair cell number, and active mechanisms, such as variation in
cochlear amplifier gain.
While the question of what causes reflection remains unanswered,
what is known is that the irregularities are randomly distributed (Zweig &
Shera, 1995) such that energy can be reflected at sites all along the cochlear
partition. Most of the reflected wavelets will occur with random phase
relations, causing the energy to be cancelled. However, energy that is reflected
under the peak of the traveling wave will have a coherent phase that allows the
positive summation of energy, creating a substantial reflection component
of the OAE that is recorded in the ear canal.
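The cancellation argument can be illustrated with a small numerical sketch. The Python fragment below is not taken from the paper. It assumes unit-amplitude wavelets, a hypothetical count of 1000 reflection sites, and a hypothetical phase spread of 0.1 radians either side of zero for wavelets reflected under the peak of the traveling wave.

```python
import cmath
import math
import random


def summed_magnitude(phases):
    """Magnitude of the sum of unit-amplitude wavelets with the given phases (radians)."""
    return abs(sum(cmath.exp(1j * p) for p in phases))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 1000                # hypothetical number of reflecting irregularities

# Wavelets with random phase relations (sites away from the traveling-wave peak).
random_phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
# Wavelets with nearly aligned phases (assumed spread of 0.1 rad either side of zero).
coherent_phases = [0.1 * rng.uniform(-1.0, 1.0) for _ in range(n)]

incoherent = summed_magnitude(random_phases)  # largely cancels, on the order of sqrt(n)
coherent = summed_magnitude(coherent_phases)  # adds up, close to n
```

Under these assumptions the randomly phased wavelets sum to a small fraction of what the same number of phase-aligned wavelets produce, which is the sense in which only the coherent reflection contributes substantially to the emission.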
One problem with the classification system suggested by Shera
and Guinan (1999) is that most emissions are not generated purely from one
mechanism, but are a mix of the two different generation mechanisms,
particularly as stimulus level is increased. For example, stimulus frequency
emissions, which are believed to arise from linear coherent reflection at low
stimulus levels, may involve both nonlinear distortion and linear coherent
reflection at higher stimulus levels (Long, Talmadge, & Thorpe, 2001; Shera
& Guinan, 1999). Distortion product emissions and high-level transient
evoked emissions also arise from a mix of nonlinear distortion and linear
coherent reflection (Shera & Guinan, 1999; Yates & Withnell, 1999).
The term ‘source’ can cause some confusion in a
discussion of OAE generation. The term is sometimes used synonymously with
generation mechanism. In Shaffer et al. (2003) we use the term
‘source’ to refer to the location or site where OAE energy arises
regardless of the mechanism that generates the emission. The 2f1-f2
DPOAE arises predominantly from two sources (e.g. Brown, Harris &
Beveridge, 1996; Gaskill & Brown, 1996; Kemp & Brown, 1983; Kummer,
Janssen & Arnold, 1995; Talmadge et al., 1996, 1997). Energy of the 2f1-f2
DPOAE first arises at the site on the basilar membrane where the
stimulus-induced traveling waves interact. For simplicity, we refer to this region as
the “overlap” region. The overlap region under many stimulus
conditions can be approximated as the characteristic frequency location of f2.
Energy arising in this location is generated predominantly by nonlinear
distortion. Thus, we refer to this energy as the nonlinear component. The
energy of the nonlinear component, which has a frequency of 2f1-f2,
travels bi-directionally, basally toward the ear canal, and apically toward the
characteristic frequency location of 2f1-f2 (CFdp).
At the CFdp location the energy undergoes linear coherent reflection.
This reflection component of the 2f1-f2 DPOAE
then travels basally toward the ear canal. At the stapes, some distortion
product energy will pass on to the middle ear, while some energy will be
reflected at this boundary causing a standing wave pattern to be developed in
the cochlea and having dramatic effects on the level of the emission recorded
in the ear canal (Talmadge et al., 1998; Dhar, Talmadge, Long & Tubis,
2002).
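As a worked example of the frequencies involved, the sketch below computes 2f1-f2 for one pair of primaries. The f2 value of 2000 Hz and the f2/f1 ratio of 1.22 are chosen here for illustration only and are not taken from the paper.

```python
def dp_frequency(f1_hz, f2_hz):
    """Frequency of the 2f1-f2 distortion product for primaries f1 < f2."""
    return 2.0 * f1_hz - f2_hz


f2 = 2000.0      # assumed higher-frequency primary (Hz)
ratio = 1.22     # assumed f2/f1 ratio, for illustration only
f1 = f2 / ratio  # lower-frequency primary, about 1639 Hz
fdp = dp_frequency(f1, f2)  # about 1279 Hz
```

Because 2f1-f2 is lower than both primaries, its characteristic frequency place (CFdp) lies apical to the overlap region near f2, which is why the nonlinear component must travel apically before it can be reflected.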
Systematic variation in the amplitude of the composite DPOAE
The 2f1-f2 DPOAE measured in the ear canal
is a vector sum of the nonlinear and reflection components that originate
from two different cochlear sources and can be thought of as a composite
of these different components. The energy from these different sources
interacts as it travels in the cochlea, arising first at the overlap
region, then being reflected at the CFdp region and again
at the middle ear boundary. The resulting interference pattern leads
to variation in the sound pressure level and phase of the composite
DPOAE that is measured in the ear canal. This variation in the sound
pressure level of the DPOAE is quasi-periodic with frequency and is
known as “fine structure” (see Figure 1 and footnote).
All evoked OAEs show evidence of amplitude and phase fine structure when observed
with fine frequency resolution. In fact, the commonality in the frequency
spacing of fine structure amplitude peaks across different emission
types (including the minimum spacing between spontaneous emission
peaks) provided early evidence of common mechanisms of generation
(reviewed in Talmadge et al., 1998).
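The vector sum can be made concrete with a two-phasor sketch. The component levels used below (10 and 8 dB, relative to an arbitrary reference) are hypothetical, and the function name is ours rather than the paper's.

```python
import cmath
import math


def composite_level_db(nl_db, refl_db, phase_diff_rad):
    """Level of the vector sum of two components with a given phase difference."""
    nl = 10.0 ** (nl_db / 20.0)
    refl = 10.0 ** (refl_db / 20.0) * cmath.exp(1j * phase_diff_rad)
    return 20.0 * math.log10(abs(nl + refl))


# Hypothetical component levels: nonlinear 10 dB, reflection 8 dB.
peak = composite_level_db(10.0, 8.0, 0.0)        # components in phase
valley = composite_level_db(10.0, 8.0, math.pi)  # components out of phase
depth = peak - valley                            # peak-to-valley depth in dB
```

With components only 2 dB apart, the peak-to-valley depth in this sketch approaches 19 dB. The closer the two components are in level, the deeper the valleys, which is consistent with the large amplitude variation described in Figure 1.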

Figure 1. Amplitude and phase fine structure for the 2f1-f2
DPOAE of a subject (KG) with normal hearing. Note that peak to valley
amplitude variation can be greater than 20 dB, and that valleys in the
amplitude fine structure sometimes drop into the noise floor. Talmadge, Long,
Tubis and Dhar (1999) used a phasor model to show that characteristic phase
patterns (ramp and sawtooth patterns) result from the interaction of the
nonlinear and reflection components of the 2f1-f2 DPOAE.
Which pattern arises depends on the relative strength of the two components,
which can vary with frequency. In these data, ramp patterns appear in the
frequency range from approximately 1500 to 2000 Hz and sawtooth patterns appear
above 2000 Hz.
Fine structure is not observed in clinical measurements of
DPOAEs because common protocols call for recording the DPOAEs at 1/3 octave
intervals. This resolution is too coarse to observe fine structure. The
typical spacing between amplitude peaks in the fine structure varies with
frequency, but in the mid audiometric frequencies ranges from approximately
100-200 Hz. Peak-to-valley variation in amplitude can be 20 dB or greater in
normal hearing subjects (He & Schmiedt, 1993). So, while fine structure is
not observed in clinical testing, its presence can greatly complicate the
interpretation of the clinical DP-gram and clinical DP input/output (I/O)
function (Heitmann, Waldmann & Plinkert, 1996).
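A quick calculation shows how coarse 1/3 octave sampling is relative to the fine structure. The sketch below assumes a test frequency of 2000 Hz as a representative mid-audiometric frequency; the 100 and 200 Hz spacings are the approximate values quoted above.

```python
def third_octave_step_hz(f_hz):
    """Frequency distance from f_hz to the next point 1/3 octave higher."""
    return f_hz * (2.0 ** (1.0 / 3.0) - 1.0)


step = third_octave_step_hz(2000.0)  # about 520 Hz at an assumed 2000 Hz

# Number of fine-structure periods spanned by a single clinical step,
# for peak spacings of 200 Hz and 100 Hz respectively.
periods_spanned = (step / 200.0, step / 100.0)
```

At this frequency a single clinical step spans roughly 2.6 to 5 fine-structure periods, so adjacent test points can land on a peak, a valley, or anywhere in between.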
The problem in interpreting individual test results is that
valleys or “dips” in the amplitude fine structure may drop below
the confidence intervals of normative data and can also drop below the noise
floor (see Figure 1). Such points will be interpreted as evidence of
dysfunction in the frequency region where the dip occurs, when, in fact, higher
resolution recordings would show that such dips are part of a normal pattern of
fine structure.
Problems equally arise in attempting to interpret DP I/O
functions. The peaks and valleys of amplitude fine structure shift with
stimulus level, causing changes in the shape and slope of the I/O function.
This problem is illustrated schematically in Figure 2. Shifts in fine
structure with stimulus level may partly explain why in recent studies
attempting to extrapolate thresholds from DPOAE I/O data, 30% and 60% of the
I/O functions did not meet the criteria for inclusion (Boege & Janssen,
2002; Gorga, Neely, Dorn & Hoover, 2003). Among normal hearing subjects,
the majority did not meet the slope criterion (Gorga et al., 2003). While fine
structure was not considered in these studies, it may well explain the
variability in I/O function slope among normal hearing subjects.

Figure 2. Schematic of shift in fine structure with increasing
stimulus level. The colors represent different L2 levels for a constant L1.
Resulting I/O functions will vary in slope depending on where in the fine
structure the sample is taken. Frequencies near a “dip” or on the
slope of fine structure can produce very different I/O functions. Frequency
shifts in DPOAE fine structure were first described by He and Schmiedt (1993).
The change in the shape or slope of I/O functions resulting from fine structure
shifts with level illustrates that I/O functions are affected by the
interaction of multiple source components.
At present the only way to deal with the clinical problems
associated with low-resolution recording is to record with high stimulus
resolution around any frequency in question to ascertain whether fine structure
is influencing the DP-gram or I/O function.
Predicting behavioral thresholds from the composite DPOAE
So, how can theory of generation mechanisms and cochlear sources
shed light on the limitations of the composite OAE in predicting behavioral
hearing thresholds? First, studies that have attempted to relate OAE amplitude
to hearing thresholds typically correlate a single emission frequency to a
single audiometric frequency (in the case of distortion products, the DPOAE
amplitude at a given f2 frequency is correlated to the behavioral
threshold at the same audiometric frequency). From generation theory and
experimental data, we now know that for DPOAEs and high-level TEOAEs there are
multiple cochlear sources that contribute to the emission (Avan, Bonfils, Loth
& Wit, 1993; Withnell, Yates & Kirk, 2000). Therefore, the amplitude of
the composite OAE at a single frequency represents the sensitivity of all the
cochlear sources that have contributed to the emission, not just the
sensitivity at a single frequency. This “mismatch” between the
cochlear locations that give rise to the emission and the emission test
frequency may be responsible for some of the variability observed in simple
correlations.
Amplitude fine structure also plays a role in limiting the
prediction of hearing thresholds from DPOAE level. When data from large
numbers of subjects are averaged to obtain normative values, the amplitude
variation is essentially low-pass filtered and the fine structure pattern is no
longer obvious in the averaged data. What is obvious in the averaged data,
however, is the large amplitude variance (and correspondingly large confidence
interval range) resulting from fine structure (Gorga, Neely, Ohlrich, Hoover,
Redner, & Peters, 1997). For this reason, fine structure amplitude
variation contributes to the overlap in amplitude distributions between normal
hearing and impaired ears (Gorga et al., 1997), and probably also contributes
to the tremendous variability seen in correlations of DPOAE level and hearing
thresholds.
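The effect of averaging can be sketched with simulated ears. Everything in the fragment below is an assumption made for illustration: the two-component model, the 8 ms reflection delay, the amplitude ratio of 0.5, the 200 ears, and the idea that ears differ only by a random phase offset in their fine structure. It is not the analysis used in the studies cited above.

```python
import cmath
import math
import random
import statistics

TAU_S = 0.008     # hypothetical reflection-component delay (s)
REFL_RATIO = 0.5  # hypothetical reflection/nonlinear amplitude ratio
N_EARS = 200      # hypothetical number of normal-hearing ears

freqs = [1500.0 + 10.0 * k for k in range(101)]  # 1500 to 2500 Hz


def dp_gram_db(phase_offset):
    """Composite level (dB re the nonlinear component) for one simulated ear."""
    levels = []
    for f in freqs:
        phase = 2.0 * math.pi * f * TAU_S + phase_offset
        levels.append(20.0 * math.log10(abs(1.0 + REFL_RATIO * cmath.exp(-1j * phase))))
    return levels


rng = random.Random(1)  # fixed seed so the sketch is repeatable
ears = [dp_gram_db(rng.uniform(0.0, 2.0 * math.pi)) for _ in range(N_EARS)]

# Group statistics at each frequency, as in a normative data set.
mean_curve = [statistics.mean(col) for col in zip(*ears)]
sd_curve = [statistics.stdev(col) for col in zip(*ears)]

single_ear_range = max(ears[0]) - min(ears[0])        # fine structure in one ear
mean_curve_range = max(mean_curve) - min(mean_curve)  # much flatter after averaging
typical_sd = statistics.mean(sd_curve)                # spread that remains
```

In this sketch the group mean is far smoother than any single ear, yet the standard deviation at each frequency stays at several dB. The fine structure has not disappeared; it has moved from the shape of the curve into the variance.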
“Single source” or “component” DP-grams and DP I/O functions
Two different methods have been used to separate the nonlinear
and reflection components of the composite DPOAE. Waldmann, Heitmann, &
Plinkert (1997) first conceptualized the “single-source” DPOAE
(sgDPOAE), later showing that a suppressor tone close in frequency to 2f1-f2
could be used to reduce the amplitude variation in fine structure (Heitmann,
Waldmann, Schnitzler, Plinkert & Zenner, 1998) by suppressing the
reflection component from the CFdp region.
The other technique used to separate the nonlinear and
reflection components involves digital signal processing methods that exploit
the observation that the phases of the two components exhibit different
behaviors as a function of frequency. In fact, the two components have quite
distinct phase behaviors. Because the nonlinear component is associated with
the stimulus traveling waves (wave-fixed), its phase is relatively invariant
with frequency. This behavior arises because the traveling wave exhibits an
approximately constant number of cycles of vibration to the peak regardless of
the frequency of the stimulus. The result is that the nonlinear distortion
produced always shares the same phase relation to the stimulus phase producing
a flat phase versus frequency function. The reflection component, on the other
hand, which arises from fixed locations (place-fixed) on the cochlear partition
has a rapidly rotating phase that accumulates across frequency. The resulting
phase versus frequency function has a steep slope. From this discussion, it may
be much clearer how two components, one with approximately constant phase and
one with rapidly changing phase, sum together to produce the quasi-periodic
pattern of amplitude and phase fine structure.
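The link between phase slope and fine-structure spacing can be sketched numerically. The fragment below assumes a nonlinear component with constant phase and a reflection component whose phase falls linearly with frequency, as if it carried a fixed delay. The 8 ms delay and the amplitudes are hypothetical values chosen so that the spacing falls in the 100-200 Hz range mentioned earlier.

```python
import cmath
import math

TAU_S = 0.008   # hypothetical delay of the reflection component (s)
NL_AMP = 1.0    # nonlinear component amplitude (arbitrary units)
REFL_AMP = 0.5  # reflection component amplitude (arbitrary units)


def composite(f_hz):
    """Sum of a flat-phase component and a component with steep phase slope."""
    nonlinear = NL_AMP  # phase held constant (taken as zero)
    reflection = REFL_AMP * cmath.exp(-2j * math.pi * f_hz * TAU_S)
    return nonlinear + reflection


freqs = [1500.0 + 5.0 * k for k in range(201)]  # 1500 to 2500 Hz in 5 Hz steps
levels = [20.0 * math.log10(abs(composite(f))) for f in freqs]

# Local maxima of the composite level are the fine-structure peaks.
peaks = [
    freqs[i]
    for i in range(1, len(freqs) - 1)
    if levels[i] > levels[i - 1] and levels[i] >= levels[i + 1]
]
spacings = [b - a for a, b in zip(peaks, peaks[1:])]
```

The peaks repeat every 1/TAU_S, which is 125 Hz here: each time the reflection component's phase completes one more cycle relative to the flat-phase component, the two come back into alignment. A constant delay is a simplification, so this sketch gives strictly periodic peaks rather than the quasi-periodic pattern seen in real ears.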
Stover, Neely, and Gorga (1996) first showed that inverse fast
Fourier transform (IFFT) of the DP-gram could be used to separate the
components in the time domain. Because of their phase properties, the
nonlinear and reflection components resolve as independent peaks when an IFFT
is applied to high resolution DP-gram data. Time-windowing can then be used to
isolate the component peaks, and FFT used to convert the time-windowed data
back into “component” DP-grams. Figure 3 illustrates the nonlinear
and reflection component DP-grams that result from using this analysis. Kalluri
and Shera (2001) showed that a suppression paradigm and the IFFT/time-windowing
analysis produce equivalent results in isolating the nonlinear component. For
further details about this analysis or about the possible errors associated with
the analysis, readers are referred to Kalluri and Shera (2001) and Shaffer et
al. (2003).
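The IFFT and time-windowing steps can be sketched on synthetic data. The fragment below is an idealized illustration, not the analysis used in the paper. It builds a complex DP-gram from two hypothetical components with fixed delays, chooses those delays to fall exactly on time bins so that the peaks do not smear, and uses a rectangular window with a hand-picked cutoff. A plain discrete Fourier transform stands in for the FFT so the example needs only the standard library.

```python
import cmath
import math


def dft(x, inverse=False):
    """Plain O(N^2) discrete Fourier transform; adequate for a small sketch."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for m in range(n):
        acc = sum(x[k] * cmath.exp(sign * 2j * math.pi * k * m / n) for k in range(n))
        out.append(acc / n if inverse else acc)
    return out


N, DF_HZ, F0_HZ = 128, 10.0, 1500.0  # hypothetical high-resolution DP-gram grid
BIN_S = 1.0 / (N * DF_HZ)            # time resolution after the inverse transform

NL_AMP, NL_DELAY_S = 1.0, 1 * BIN_S       # nonlinear component: short delay
REFL_AMP, REFL_DELAY_S = 0.5, 10 * BIN_S  # reflection component: long delay

freqs = [F0_HZ + k * DF_HZ for k in range(N)]
dp_gram = [
    NL_AMP * cmath.exp(-2j * math.pi * f * NL_DELAY_S)
    + REFL_AMP * cmath.exp(-2j * math.pi * f * REFL_DELAY_S)
    for f in freqs
]

# Step 1: inverse transform of the complex DP-gram gives a time-domain
# representation in which each component appears as a separate peak.
time_domain = dft(dp_gram, inverse=True)

# Step 2: time-window around each peak (rectangular window, assumed cutoff).
CUTOFF_BIN = 5
early = [v if i < CUTOFF_BIN else 0.0 for i, v in enumerate(time_domain)]
late = [v if i >= CUTOFF_BIN else 0.0 for i, v in enumerate(time_domain)]

# Step 3: forward transform of each windowed segment gives the component DP-grams.
nonlinear_gram = dft(early)
reflection_gram = dft(late)
```

In this idealized case the composite DP-gram ripples between amplitudes of 0.5 and 1.5, while the two recovered component DP-grams are flat at 1.0 and 0.5. Real data do not separate this cleanly: delays vary with frequency and the choice of window matters, which is one source of the analysis errors discussed by Kalluri and Shera (2001).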

Figure 3. The IFFT and time-windowing analysis was applied to the fine
structure data given in Figure 1. After IFFT, two dominant peaks result in the
time domain, the nonlinear component (blue) and the reflection component (light
purple). For the time domain data, axes are given at the top and to the
right. The dominant nonlinear and reflection component peaks were then
isolated by time-windowing and converted back into the frequency domain by
FFT. The nonlinear component DP-gram is shown in red and the reflection
component DP-gram is shown in green. Axes for the frequency domain data appear
at the bottom and to the left. Note that while some amplitude variation
remains in the component DP-grams, it does not show the periodicity of the
original fine structure suggesting that it is not related to two-source
interference. Such residual amplitude variation may come from the unmixing of
the sources during analysis (Kalluri & Shera, 2001) or may represent
natural variation in the components across frequency.
The appeal of a “single-source” or
“component” DP-gram is that limiting the DP-gram to a single
component removes the fine structure-related amplitude variation that arises
from multiple components. Therefore, normative data based on single-source
DP-grams may exhibit less amplitude variance. To the extent that fine
structure is responsible for the tremendous variability noted in correlations
of behavioral threshold to DPOAE level, correlations based on a component
DP-gram might improve threshold predictions. Toward this end, we correlated
DPOAE level obtained using a suppression paradigm to behavioral thresholds.
Results suggest that correlation coefficients, while statistically significant
at some frequencies, in general do not improve and are still quite variable
when suppression is used to remove the reflection component. A manuscript of
these findings and how they can be interpreted in light of generation theory
was recently submitted for review.
Conclusions
OAE generation theory suggests that DPOAEs arise from two
different generation mechanisms and may involve multiple generation sources.
The interaction of multiple components leads to amplitude variation in the
composite ear canal signal. Clinical interpretation of OAE test results is
complicated by the presence of this amplitude variation, which is known as fine
structure. Not only do multiple sources challenge the idea that the test frequency
is assessing the sensitivity of a single cochlear location, but the amplitude
variation resulting from multiple sources yields normative data with a large
range of variance causing overlap of response distributions between normal and
hearing impaired populations, and limiting the potential of correlational
analyses to consistently predict hearing thresholds. Methods that allow the
isolation of DPOAE components may have some clinical utility. Studies are
needed to determine whether component DP-grams and component DP-I/O functions
can be used to better predict behavioral thresholds.
References
Avan, P., Bonfils, P., Loth, D., & Wit, H.P. (1993).
Temporal patterns of transient-evoked otoacoustic emissions in normal and
impaired cochleae. Hearing Research, 70, 109-120.
Boege, P., & Janssen, T. (2002). Pure-tone threshold
estimation from extrapolated distortion product emission I/O functions in
normal and cochlear hearing loss ears. Journal of the Acoustical Society of America, 111, 1810-1818.
Brown, A.M., Harris, F.P., & Beveridge, H.A. (1996).
Two sources of acoustic distortion products from the human cochlea. Journal
of the Acoustical Society of America, 100,
3260-3267.
Dhar, S., Talmadge, C.L., Long, G.R., & Tubis, A. (2002). Multiple internal
reflections in the cochlea and their effect on DPOAE fine structure. Journal
of the Acoustical Society of America, 112, 2883-2897.
Gaskill, S.A., & Brown, A.M. (1996). Suppression of human acoustic distortion product: dual
origin of 2f1-f2. Journal of the
Acoustical Society of America, 100, 3268-3274.
Gorga, M.P., Neely, S.T., Ohlrich, B., Hoover, B., Redner,
J., & Peters, J. (1997). From the laboratory to clinic: A large scale
study of distortion product otoacoustic emissions in ears with normal hearing
and ears with hearing loss. Ear & Hearing, 18, 440-455.
He, N.H., & Schmiedt, R.A. (1993). Fine structure of the 2f1-f2 acoustic
distortion product: Changes with primary level. Journal of the Acoustical
Society of America, 94, 2659-2669.
Heitmann, J., Waldmann, B., & Plinkert, P.K. (1996).
Limitations in the use of distortion product otoacoustic emissions in objective
audiometry as the result of fine structure. European Archives of
Otorhinolaryngology, 253, 167-171.
Heitmann, J., Waldmann, B., Schnitzler, H.P., Plinkert,
P.K., & Zenner, H.P. (1998). Suppression of distortion product otoacoustic
emissions (DPOAE) near 2f1-f2 removes DP-gram fine structure - Evidence for a
second generator. Journal of the Acoustical Society of America, 103, 1527-1531.
Kalluri, R., & Shera, C.A. (2001). Distortion-product source unmixing: A test of
the two-mechanism model for DPOAE generation. Journal of the Acoustical
Society of America, 109, 622-637.
Kemp, D.T. (1986). Otoacoustic emissions, traveling waves
and cochlear mechanisms. Hearing Research, 22, 95-104.
Kemp, D. T. (2002). Exploring cochlear status with
otoacoustic emissions: The potential for new clinical applications. In M.S.
Robinette and T.J. Glattke (Eds), Otoacoustic Emissions:
Clinical Applications, New York: Thieme.
Kemp, D.T., & Brown, A.M. (1983). An integrated view
of the cochlear mechanical nonlinearities observable in the ear canal. In E. de
Boer & M.A. Viergever (Eds) Mechanics of Hearing (pp. 75-82). The Hague, The Netherlands: Martinus Nijhoff.
Kummer, P., Janssen, T., & Arnold, W. (1995). Suppression tuning characteristics of the 2f1-f2
distortion product otoacoustic emission in humans. Journal
of the Acoustical Society of America, 98,
197-210.
Long, G.R., Talmadge, C.L., & Thorpe, C.A. (2001). Experimental measurement of
level dependence of stimulus frequency otoacoustic emissions fine structure. Association for Research in Otolaryngology: Abstracts of the Twenty-fourth Midwinter
Meeting, 46, 13.
Shaffer, Withnell, Dhar, Lilly, Harmon, & Goodman (2003). Sources and Mechanisms of
DPOAE Generation: Implications for the Prediction of Auditory Sensitivity.
Ear and Hearing, 24, 367-379.
Shera, C.A., & Guinan, J.J. Jr. (1999).
Evoked otoacoustic emissions arise by two fundamentally
different mechanisms: A taxonomy for mammalian OAEs. Journal of the
Acoustical Society of America, 105, 782-798.
Stover, L.J., Neely, S.T., & Gorga, M.P. (1996).
Latency and multiple sources of distortion product emissions. Journal
of the Acoustical Society of America, 99,
1016-1024.
Talmadge, C.L., Long, G.R., Tubis, A., & Dhar, S.
(1999). Experimental confirmation of the two-source interference model for the
fine structure of distortion product otoacoustic emissions. Journal
of the Acoustical Society of America, 105,
275-292.
Talmadge, C.L., Tubis, A., Long, G.R., & Piskorski, P.
(1998). Modeling otoacoustic and hearing threshold fine structure. Journal
of the Acoustical Society of America, 104,
1517-1543.
Waldmann, B., Heitmann, J., & Plinkert, P. (1997). Distorsionsproducte (sgDPOAE):
Entwicklung eines neuen prazisionsme systems. Audiologische Akustic, 1,
22-31.
Withnell, R.H., Yates, G.K., & Kirk, D.L. (2000). Changes to
low-frequency components of the TEOAE following acoustic trauma to the base of
the cochlea. Hearing Research, 139, 1-12.
Yates, G.K., & Withnell, R.H. (1999). The role of
intermodulation distortion in transient-evoked otoacoustic emissions. Hearing
Research, 136, 49-64.
Zweig, G., & Shera, C. (1995). The origins of
periodicity in the spectrum of evoked otoacoustic emissions. Journal
of the Acoustical Society of America, 98,
2018-2047.