`Goodness of Fit` calculator

A ‘Goodness of Fit’ calculator
Mary Hostler, John Hostler, John Bamford, Helen Whitehouse.
HCD group, University of Manchester.
Introduction
The use of DSP hearing aids means that a wide range of options and settings is
available to the audiologist at fitting. These options include fast acting, nonlinear wide dynamic range compression, multiple channels, frequency shaping
in numerous bands, noise reduction algorithms, speech enhancement
algorithms, feedback suppression, multiple memories and multiple
microphones. One consequence of this flexibility is that at the verification stage
of the fitting process DSP hearing aids can be matched more closely to
prescribed targets (O’Donnell 2001). However, in order to achieve this,
audiologists have to make subjective judgements about a ‘good’ match to
targets, using real ear gain and sound pressure level measures, or coupler
measures incorporating Real Ear to Coupler Differences (RECDs).
Guidance for making these judgements is given in ‘good practice guidance
for adult hearing aid fittings and services’ by Gatehouse et al (2000). It is
recommended that tolerances of only +/- 5 dB at frequencies of 250, 500,
1000 and 2000 Hz and of +/- 8 dB at 3000 and 4000 Hz are acceptable. It is
also recommended that the slope in each octave should be within
+/- 5 dB/octave of the target slope. These ideas have been reflected in a few
studies that have proposed objective scaled measures of ‘goodness of fit’
(GoF). Hall and Rowson (Hall, 1997) developed a measure based on initial
proposals by Bamford (1997 unpublished). Their method used a calculation
of the difference in dB between the actual gain and the target values at four
frequencies of 250, 500, 1000 and 2000 Hz. In order to account for slope, a
doubling of the difference value was used where there was a change in
direction. This value was then subtracted from one hundred, giving a higher
score for a closer fit.
Hall’s method makes only a crude adjustment for slope differences and it
penalises ‘under-fit’ equally to ‘overshoot’, although in terms of access to the
Long Term Average Speech Spectrum (LTASS) the effects of these errors
are very different. ‘Under-fit’ reduces access to softer speech sounds
whereas the consequences of ‘overshoot’ depend on dynamic range and
‘headroom’ in the hearing aid output limitation settings. However, Hall’s GoF
measure was used, though not further developed, by Dighe (2001) when
investigating discrepancies in real ear measures of children fitted with
hearing aids chosen either by traditional or theoretical fitting methods.
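As an illustration only, the tolerance guidance quoted above from Gatehouse et al. can be expressed as a simple pass/fail check. The sketch below is in Python rather than any tool used in the studies cited; the function name, the dictionary layout and the choice of octave pairs (250-500, 500-1000, 1000-2000 and 2000-4000 Hz) are our assumptions, not part of the published guidance.

```python
# Level tolerances in dB at each frequency (Hz), as quoted in the text.
LEVEL_TOLERANCE_DB = {250: 5, 500: 5, 1000: 5, 2000: 5, 3000: 8, 4000: 8}
# Octave pairs used for the slope check (an assumption of this sketch).
OCTAVE_PAIRS = ((250, 500), (500, 1000), (1000, 2000), (2000, 4000))
SLOPE_TOLERANCE_DB = 5


def tolerance_failures(actual, target):
    """Return a list of the tolerance checks that a fitting fails.

    `actual` and `target` map frequency in Hz to gain in dB.
    """
    failures = []
    for freq, tolerance in LEVEL_TOLERANCE_DB.items():
        if abs(actual[freq] - target[freq]) > tolerance:
            failures.append(f"{freq} Hz level")
    for low, high in OCTAVE_PAIRS:
        actual_slope = actual[high] - actual[low]
        target_slope = target[high] - target[low]
        if abs(actual_slope - target_slope) > SLOPE_TOLERANCE_DB:
            failures.append(f"{low}-{high} Hz slope")
    return failures


# Hypothetical figures: every level is within tolerance, but the
# 2000-4000 Hz slope differs from the target slope by 7 dB.
target = {250: 10, 500: 15, 1000: 20, 2000: 25, 3000: 30, 4000: 30}
actual = {250: 10, 500: 15, 1000: 20, 2000: 25, 3000: 30, 4000: 23}
print(tolerance_failures(actual, target))
```

A check of this kind gives only a yes/no answer at each point, which is one reason a scaled GoF measure is of interest.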
In the course of our work on the Modernising Children’s Hearing Aid Services
(MCHAS) project we decided to try to develop an improved GoF measure,
with the intention of:

Providing a more objective measure of how well a hearing aid fitting
meets targets (relying less on the subjective judgement of individuals) and
thus facilitating hearing aid selection and verification procedures;

Investigating the differences between proprietary fittings and generic
(published) fitting targets;

Interpreting variance in outcome and benefit measures.
We recognised, of course, that ‘goodness of fit’ is an essentially contestable
concept, and one that can be defined in many ways. An essential part of our
work was to refine the concept so that a sensible numerical score could be
given. We made a number of assumptions, including:

A ‘good’ fit is one where the actual gain meets but does not greatly exceed
the target gain at each frequency;

The ‘match’ between actual and target gain is more crucial at some
frequencies than at others;

A ‘good’ fit would be expected to correlate well with other measures
(speech discrimination, aided audibility measures etc.), although this will
also depend on the validity of the targets prescribed by a particular fitting
procedure;

There would be a reasonable degree of agreement among experienced
professional audiologists as to what constitutes ‘goodness of fit’.
Developing a GoF calculator
We decided at the outset that the GoF measure should be calculated by a
spreadsheet program. We thought it would be ideal if the user could simply
type in the figures for target gain and actual gain at various frequencies and
have the spreadsheet calculate a GoF score automatically: this became the
design brief that was ultimately achieved. We worked towards it by an iterative
process. The earliest version simply tested the concept and set up a
spreadsheet in which target and actual gains were typed in and a score was
generated using arbitrary weightings and measures. These measures and
weightings were then refined and adjusted so as to produce a better match
with other measures referred to above. The current version of the calculator is
Mark IV, and the arithmetic implemented in the spreadsheet is explained in
Appendix 1.
As mentioned above, an important requirement for us was that scores
generated by the calculator would correlate with the subjective judgements of
experienced clinicians, which we assumed would themselves show a large
measure of agreement. In order to test this hypothesis, and to establish a
benchmark by which the calculated GoF scores could be verified, we used
representative data from 30 of the subjects participating in one of the MCHAS
studies and from one site. Hearing thresholds for each subject were provided
by the participating site and we calculated the DSL target gain values for each
of them. (DSL targets were used in this exercise, but the GoF calculator could
be used to rate the fitting to any prescription target). These target gain values
were accompanied by the actual hearing aid gain measures that had been
gathered by audiologists on site as data for the MCHAS study. GoF scores for
the closeness of the 30 hearing aid fittings to the targets were also generated
by the calculator. The fittings and targets, displayed both graphically and in
tabular form, were sent to 15 experienced paediatric audiologists in the UK,
USA and Canada. These individuals were requested to score each example
for ‘goodness of fit’ with respect to aided hearing for speech at normal
conversational levels, using a five point scale in which 1 = very good and 5 =
poor. However, no guidance was given to them on how ‘goodness of fit’ was to
be interpreted, since we wished to ascertain whether (as hypothesised) they
would agree on this.
Ratings on the 1 – 5 scale for all 30 fittings were received from all the
clinicians we contacted. Understandably there was greater agreement over
some fittings than over others, but the ratings varied by no more than one
scale point for 40% of the fittings and by no more than two for 90%. One
respondent’s ratings correlated with the average ratings for the whole group
at only 0.81, but for all the rest the correlation was between 0.91 and 0.96
(mean = 0.94), supporting our hypothesis that there would be a high degree
of unanimity among professionals. The GoF scores produced by the Mark IV
calculator correlated with the average of the clinicians’ ratings for each fitting
at 0.923.
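For readers who wish to repeat this kind of comparison on their own data, the sketch below shows one way of averaging the clinicians' ratings for each fitting and correlating the averages with calculated GoF scores. It assumes a Pearson product-moment correlation, which the text above does not specify, and all names and figures in it are hypothetical and are not the study data. Note that on the rating scale 1 = very good, whereas a higher GoF score is better, so the raw coefficient comes out negative and its magnitude is the figure of interest.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


def mean_rating_per_fitting(ratings_by_clinician):
    """Average the ratings for each fitting (rows = clinicians, columns = fittings)."""
    return [sum(column) / len(column) for column in zip(*ratings_by_clinician)]


# Hypothetical example: three clinicians rating three fittings (1 = very good).
ratings = [
    [1, 3, 5],
    [2, 3, 4],
    [1, 2, 5],
]
gof_scores = [0.9, 0.6, 0.2]  # hypothetical calculator output, 1 = best

r = pearson_r(mean_rating_per_fitting(ratings), gof_scores)
print(f"r = {r:.3f}, magnitude = {abs(r):.3f}")
```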
GoF and AAI
As noted above, one of our assumptions in developing the GoF calculator was
that the scores it produces would correlate with other measures as well. One
measure that particularly interested us was the Aided Audibility Index (AAI),
for which scores were available in the MCHAS study data. These scores had
been generated by a computer program called the Situational Hearing-Aid
Response Profile (SHARP) developed by Stelmachowicz, Lewis, Karasek &
Creutz (1994). The AAI in the SHARP program, like other Audibility Indices, is
based on signal audibility above hearing threshold in a number of frequency
bands, and utilises a frequency importance weighting which relates to the
contribution made by each frequency band to speech recognition. The
program calculates the AAI as a score between 0 and 1 (1 means that the
signal is fully audible) and is designed to accommodate non-linear systems
using gain information for a range of input levels across 13 different speech
spectra. We hypothesised that if the GoF scores from our calculator were
really an objective measure of ‘goodness of fit’, there would be a strong
correlation with the AAI scores for the same hearing aid fittings.
We calculated the GoF and AAI scores for 97 analogue hearing aid fittings
and 98 DSP nonlinear fittings and found that the correlation for the latter (with
AAI scores for speech at 1 metre) was significant beyond the p = 0.01 level.
Correlations between the AAI scores and the GoF scores for analogue
hearing aids were not significant, which accords with conclusions reached by
O’Donnell (2002) who found that DSP hearing aids are significantly better at
meeting fitting targets than aids using analogue technology.
Future developments
The present version of the GoF calculator (Mark IV) has been in use for the
past year and our colleagues are now planning significant improvements to it.
These developments are intended to overcome some of the limitations that
derive from the relatively simple algorithm that the calculator employs
(Appendix 1). The developments are planned with four main objectives in
mind:

To reflect the different prescriptive procedures that are used to generate
the fitting targets. Different procedures have differing rationales, and the
consequences of exceeding or under-achieving the target figure at a
particular frequency therefore depend on the importance given to the
target figure by the specific prescriptive procedure that is employed.

To take better account of the degree of hearing loss of the hearing aid
user. Clearly, the practical significance of exceeding or falling short of the
target gain will depend on the user’s residual dynamic range: for a
severe-profound hearing loss, for example, the consequences of falling short of
the target will be more serious than for a mild loss.

To reflect better the non-linearity of the latest DSP aids, which typically
apply differential gain to different levels of signal input. Ideally the
calculator should accept data for actual gain achieved in quiet, moderate
and loud conditions, and should compute from these an overall ‘goodness
of fit’.

To incorporate the saturation response to an input of 90 dB or more. At
present the calculator accepts data relating to gain (actual and target)
alone: ideally it would take account also of output levels and apply a
penalty when these reach or exceed the predicted ULLs.
Conclusion
The GoF calculator as it stands (Mark IV) generates scores that were found to
correlate strongly with the subjective judgements of experienced clinicians
when it was validated. It has a potential clinical use at the verification stage of
the hearing aid fitting process as a tool for providing a quick and simple
means to judge the closeness of a hearing aid fit to targets.
The calculator has proved easy to use. With further refinement and validation
the calculator could be utilised by clinicians and Teachers of the Deaf as a
quick means of rating a hearing aid fitting, with possible further uses in
research into different prescriptive targets and their relationship to hearing aid
benefit and satisfaction.
References
Dighe, A. (2001) Discrepancies in real ear measurements between two
groups of school children with sensori-neural hearing loss wearing hearing
aids chosen by a traditional or theoretical fitting method. Unpublished MSc
dissertation, University of Manchester.
Gatehouse, S., Stephens, S.D.G., Davis, A.C. and Bamford, J.M. (2001)
Good practice guidance for adult hearing aid fittings and services. BAAS
newsletter, issue 6.
Hall, R.L. (1997) The hearing aid outcome measures of self-perceived benefit,
satisfaction and use and their relationship with goodness of fit of a hearing
aid. Unpublished MSc dissertation, University of Manchester.
O’Donnell, J. (2001) Achieving DSL prescriptive targets with analogue and
digital hearing aids. Unpublished MSc Dissertation, University of Manchester.
Stelmachowicz, P., Lewis, D., Karasek, A., & Creutz, T. (1994) Situational
Hearing Aid Response Profile (SHARP version 2.0). Omaha, Neb.: Boys Town
National Research Hospital.
Appendix 1
HOW THE GOODNESS OF FIT CALCULATOR WORKS
Summary

The Mk IV calculator uses three measures to calculate ‘goodness of fit’. It
allocates ‘penalty points’ for badness of fit in respect of three features of
the fitting. It uses these points to calculate the three measures
independently of each other and then it combines the measures to
calculate a GOF score.

The three features measured are:
A) ‘Close fit’, for which ‘badness’ is the difference between the actual gain
(AG) and the target gain (TG) at each frequency.
B) ‘Similar shape’, for which ‘badness’ is the extent to which the shape of
the AG curve differs from the shape of the TG curve.
C) ‘Adequate gain’, for which ‘badness’ is the difference between the total
AG provided by the hearing aid and the total TG required.
A) How ‘close fit’ is measured

At each frequency, the difference between AG and TG is calculated.

Differences of less than 2 dB are awarded zero penalty points (at each
frequency).

Differences that exceed the following limits are awarded a maximum of 4
penalty points (at each frequency):
Maximum penalties awarded when AG differs from TG by more than these limits:

                                              500 Hz   1 kHz   2 kHz   4 kHz
AG is less than TG (the aid is underfitted)   15 dB    15 dB   20 dB   25 dB
AG is more than TG (the aid is overfitted)    20 dB    20 dB   25 dB   25 dB

Differences greater than 2 dB and less than the limits are awarded penalty
points pro rata, up to a maximum of 4 points (at each frequency).

The number of penalty points awarded is divided by the maximum (at each
frequency), and the mean of those scores constitutes measure A.
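The steps for measure A can be sketched in code as follows. This is an illustration in Python, not the spreadsheet arithmetic itself, and it rests on several assumptions of ours: that the limits read as 15, 15, 20 and 25 dB for underfitting and 20, 20, 25 and 25 dB for overfitting at 500 Hz, 1 kHz, 2 kHz and 4 kHz respectively; that a difference of exactly 2 dB earns no penalty; and that ‘pro rata’ means a straight line from 0 points at 2 dB to 4 points at the limit. All names are hypothetical.

```python
FREQS_HZ = (500, 1000, 2000, 4000)
# Limits in dB beyond which the maximum penalty applies (assumed reading).
UNDERFIT_LIMIT_DB = {500: 15, 1000: 15, 2000: 20, 4000: 25}
OVERFIT_LIMIT_DB = {500: 20, 1000: 20, 2000: 25, 4000: 25}
FREE_MARGIN_DB = 2.0   # differences up to this size earn no penalty
MAX_POINTS = 4.0       # maximum penalty points at each frequency


def close_fit_points(actual_db, target_db, freq_hz):
    """Penalty points (0 to 4) for the AG/TG difference at one frequency."""
    difference = actual_db - target_db
    limits = OVERFIT_LIMIT_DB if difference > 0 else UNDERFIT_LIMIT_DB
    limit = limits[freq_hz]
    size = abs(difference)
    if size <= FREE_MARGIN_DB:
        return 0.0
    if size >= limit:
        return MAX_POINTS
    # Assumed linear interpolation between the free margin and the limit.
    return MAX_POINTS * (size - FREE_MARGIN_DB) / (limit - FREE_MARGIN_DB)


def measure_a(actual, target):
    """Mean over frequencies of (penalty points / maximum points)."""
    scores = [
        close_fit_points(actual[f], target[f], f) / MAX_POINTS for f in FREQS_HZ
    ]
    return sum(scores) / len(scores)


# Hypothetical fitting: underfitted at 500 Hz and 2 kHz, overfitted at 4 kHz.
target = {500: 30, 1000: 30, 2000: 30, 4000: 30}
actual = {500: 21.5, 1000: 29, 2000: 10, 4000: 43.5}
print(measure_a(actual, target))  # 0.5
```

In this example the per-frequency scores are 0.5, 0, 1 and 0.5, so measure A is 0.5.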
B) How ‘similar shape’ is calculated

Two measures are used to calculate the similarity between the TG curve
and the AG curve. These are the ‘gradient’ and the ‘slope’ between the
gain figures at each adjacent pair of frequencies.

The gradient between each pair of figures is defined as the second minus
the first. Thus if the TG at 1 kHz is 22 dB and at 2 kHz it is 30 dB, the
gradient is +8 dB.

To measure the similarity of gradients, the gradient for each pair of AG
figures is subtracted from the gradient for the corresponding TG pair.

If the difference between the gradients is 15 dB or more, two penalty points
are awarded; if it is less, penalty points are awarded pro rata, up to a
maximum of two (for each pair).


The slope between each pair of figures is defined as follows:

‘up’ if the gradient is positive and greater than +1;

‘down’ if the gradient is negative and less than –1;

‘flat’ if it is neither.
The similarity of slopes is measured as follows:

if the slope between an adjacent pair of AG figures is the same as the
slope between the corresponding TG pair (e.g. both gradients are
positive and greater than +1), no penalty points are awarded;

if the two slopes are different, one penalty point is assigned if either
slope is ‘flat’ and two penalty points are allocated if one is ‘up’ and the
other is ‘down’.

The total number of penalty points awarded for both gradient and slope is
divided by the possible maximum: this constitutes measure B.
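The gradient and slope rules above can be sketched together as follows. Again this is an illustrative Python sketch rather than the spreadsheet itself. It assumes that the four frequencies are 500 Hz, 1 kHz, 2 kHz and 4 kHz (giving three adjacent pairs), that the size of the gradient difference is what counts regardless of sign, and that the ‘possible maximum’ is 4 points per pair (2 for gradient plus 2 for slope), i.e. 12 in total. The names are hypothetical.

```python
FREQS_HZ = (500, 1000, 2000, 4000)
GRADIENT_LIMIT_DB = 15.0
MAX_GRADIENT_POINTS = 2.0
MAX_SLOPE_POINTS = 2.0


def gradients(gain):
    """Second figure minus first for each adjacent pair of frequencies."""
    values = [gain[f] for f in FREQS_HZ]
    return [second - first for first, second in zip(values, values[1:])]


def slope(gradient_db):
    """Classify a gradient as 'up', 'down' or 'flat' using the +/- 1 dB rule."""
    if gradient_db > 1:
        return "up"
    if gradient_db < -1:
        return "down"
    return "flat"


def gradient_points(target_gradient, actual_gradient):
    """0 to 2 points, pro rata up to a gradient difference of 15 dB."""
    difference = abs(target_gradient - actual_gradient)
    return MAX_GRADIENT_POINTS * min(difference, GRADIENT_LIMIT_DB) / GRADIENT_LIMIT_DB


def slope_points(target_gradient, actual_gradient):
    """0 if the slopes match, 1 if either is flat, 2 if they are opposed."""
    target_slope, actual_slope = slope(target_gradient), slope(actual_gradient)
    if target_slope == actual_slope:
        return 0.0
    if "flat" in (target_slope, actual_slope):
        return 1.0
    return 2.0


def measure_b(actual, target):
    """Total gradient and slope points divided by the assumed maximum."""
    pairs = list(zip(gradients(target), gradients(actual)))
    total = sum(gradient_points(t, a) + slope_points(t, a) for t, a in pairs)
    return total / (len(pairs) * (MAX_GRADIENT_POINTS + MAX_SLOPE_POINTS))


# Hypothetical fitting whose gain curve flattens and then falls
# while the target curve keeps rising.
target = {500: 20, 1000: 22, 2000: 30, 4000: 35}
actual = {500: 20, 1000: 22, 2000: 22.5, 4000: 20}
print(measure_b(actual, target))
```

Here the three pairs earn 0, 2 and 3 points respectively, so measure B is 5/12.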
C) How ‘adequate gain’ is calculated

The total of the TG figures, for all four frequencies, is calculated. The total
of the AG figures is also calculated.

The difference between the two totals, divided by the TG total, constitutes
measure C.
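A minimal sketch of measure C is given below, in Python for illustration. We assume here that the size of the difference is used regardless of whether the total AG is above or below the total TG, and we cap the result at 1 so that the final score cannot fall below 0; neither point is stated explicitly above. The names are hypothetical.

```python
FREQS_HZ = (500, 1000, 2000, 4000)


def measure_c(actual, target):
    """Difference between total AG and total TG as a fraction of total TG."""
    target_total = sum(target[f] for f in FREQS_HZ)
    actual_total = sum(actual[f] for f in FREQS_HZ)
    if target_total <= 0:
        raise ValueError("total target gain must be positive")
    # Absolute difference and the cap at 1.0 are assumptions of this sketch.
    return min(abs(target_total - actual_total) / target_total, 1.0)


# Hypothetical fitting providing 80 dB in total against a target of 100 dB.
target = {500: 20, 1000: 25, 2000: 25, 4000: 30}
actual = {500: 15, 1000: 20, 2000: 20, 4000: 25}
print(measure_c(actual, target))
```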
How the GOF score is calculated

The three measures – A, B and C – are weighted in the ratio 3:1:1,
summed and then divided by 5. The result is subtracted from 1 in order to
produce a GOF score in the range between 1 (best) and 0 (worst).
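The final combination can be written in a few lines. The sketch below is illustrative Python with hypothetical names; it assumes that each of the three measures has already been calculated as a value between 0 and 1.

```python
def gof_score(measure_a, measure_b, measure_c):
    """Combine measures A, B and C in the ratio 3:1:1 into a score from 0 to 1."""
    for name, value in (("A", measure_a), ("B", measure_b), ("C", measure_c)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"measure {name} must lie between 0 and 1")
    weighted_badness = (3 * measure_a + measure_b + measure_c) / 5
    return 1.0 - weighted_badness


# Hypothetical measures: A = 0.5, B = 0.25, C = 0.25.
print(round(gof_score(0.5, 0.25, 0.25), 3))
```

Because measure A carries three fifths of the weight, a fitting that is close to target at every frequency scores well even if its shape or total gain is slightly off.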