Validity of LEQ as a predictor of the impact of aircraft noise

HACAN
Heathrow Association for the Control of Aircraft Noise
President: Professor Walter Holland CBE MD FRCP FFPHM
PO Box 339, Richmond, Surrey TW9 3RB
Tel: 0181 876 0455
Fax: 0181 878 0881
PROOF OF EVIDENCE
H. F. JONES
VALIDITY OF LEQ AS A PREDICTOR OF THE IMPACT
OF AIRCRAFT NOISE ON PEOPLE
June 1997
HAC 62
Validity of Leq ...
Personal Details
I hold an MA in mathematics from the University of Cambridge and a PhD in
Theoretical Physics from the University of London, and am currently a Senior
Lecturer in the Physics Department of Imperial College.
I have lived in Richmond since 1965 (when the number of ATMs was 180,000), and
have been a committee member of HACAN (originally KACAN) since 1971.
The occasion of my joining KACAN was when I took part in a guided tour of the
Richmond Green area given by the Richmond Society, and found the commentary
frequently inaudible because of aircraft noise. This was about the time when,
following the lengthening of the northern runway, all westerly landings were assigned
to the southern runway, giving a dramatic foretaste of life without alternation. It
seemed to me that the efforts of the Richmond Society on the ground were being
negated by this assault from the air.
Since then I have been involved in both the Fourth Terminal Inquiry and the first Fifth
Terminal Inquiry and have seen the number of ATMs rise to 427,000.
Summary
This paper is concerned with the validity of the noise index Leq as a predictor of
subjective disturbance in the population affected by aircraft noise at Heathrow at the
present time or in the future.
Both Leq and its predecessor NNI are based on social surveys, of which the most
recent was carried out in 1982, when the aircraft noise climate at Heathrow was very
different from the present one.
I question the validity of the analysis of that survey, noting in particular the large
uncertainties involved and the fact that the noise indices studied in depth in that
analysis are not the same as that actually adopted when the changeover to Leq was
implemented. As a consequence I cast doubt on the official assumption that 57 Leq
represents the "onset of community disturbance".
A wealth of subjective evidence suggests that the present Leq indexation is already at
variance with the true disturbance. It seems that, contrary to the Leq model, sheer
numbers of aircraft (greater than those experienced at any site in the 1982 study) cause
severe annoyance, even when the noise of some individual aircraft is reduced. Without
another survey to check and recalibrate it, extrapolation of the model further into the
future, to predict the effect of a Fifth Terminal, is not credible.
Table of Contents
1. Introduction
2. History. NNI
3. Change to Leq
4. Critique of Leq as a Predictive Tool
4.1 Doubling the Numbers
4.2 Offsetting Noise vs. Number
4.3 Spreading the Noise
4.4 Concentrating the Noise
5. Leq/NNI as a Snapshot
6. Critique of DR 8402
6.1 Multiple Regression Analysis
6.2 Unequal Sample Sizes/Response Rates
6.3 Extrapolation vs. Interpolation
6.4 The Ratio k
6.5 MRAs with 1-week Leq
6.6 3-month Leq and NNI
7. Critique of DORA 9023
7.1 16-hour vs. 24-hour Leq
7.2 Benchmark for High Annoyance
7.3 Benchmark for Onset of Community Disturbance
8. Leq as a Measure of Disturbance at the Present Time
8.1 Beyond the 57 Leq Contour
8.2 Night Flights
9. Conclusions
Definitions
References
Figures
Attachment 1
1. Introduction
1.1 The problem of (aircraft) noise is that it causes disturbance to people. Thus the
aim of any noise index must be to provide a valid measure of the subjective
annoyance experienced by the population affected. To quote Adams and McManus
(Noise and Noise Law, Wiley Chancery Law, p.55): "To be useful, there must be a
good correlation between the parameter selected and the subjective response to the
noise from the point of view of annoyance and noise intrusion". What is measured by
the Department of Transport is the geographical distribution of the average noise
energy, in the form of a particular variant of Leq, and the fundamental question is
whether this correlates sufficiently strongly with annoyance that it can be used as a
substitute for checking that annoyance directly.
1.2 In an ideal world such a check could be provided by conducting social surveys on
a regular basis, say every two years or so. In that way one could measure the extent of
aircraft noise disturbance based on the direct experience of the people affected and
identify such correlations as exist between that disturbance and the numbers of aircraft
and the average noise of each as these change over time. However, this has not been
considered a practicable proposition because of the expense and effort that would be
involved, although that could be considerably reduced by undertaking smaller surveys
in representative areas.
1.3 Instead, the modus operandi at British airports has been to conduct infrequent
large-scale surveys, to try to identify correlations at the time of those surveys
between subjective annoyance on the one hand and the number of aircraft and their
individual noise on the other. In between surveys the index thus devised, which is
described in terms of the physical characteristics of the noise, is updated by
measurements and calculations, and regarded as accurately reflecting the subjective
disturbance.
1.4 Out of this methodology have arisen two large edifices for the assessment of
aircraft noise disturbance, based in turn on NNI and Leq. Both incorporate an implicit
trade-off between noise and number of aircraft - that is to say that the index at any
location, and by inference the subjective annoyance, does not change if the number of
aircraft increases, provided that the average noise of each decreases by a
corresponding amount. Thus in Leq terms a doubling of numbers can be offset by a
barely perceptible¹ 3dB decrease in average loudness. This is the reason for the
shrinking of the Leq contours and so, it is implied, community disturbance in recent
years.
¹ Ref. 1, para. 11
1.5 The central point of this paper is that these edifices are built on very shaky
foundations. The Leq system, in particular, is based on a social survey which was
carried out in 1982, but is being applied today in vastly changed circumstances, and
even being relied upon to predict future disturbance in 2016. My criticism of this
system is two-fold. Firstly I expose some of the shortcomings of the analysis of the
social survey of 1982 in DR 8402 and DORA 9023, which led to the current system.
The second criticism is of a more general nature, namely that the index is being used
to extrapolate in time, over rather a long time scale, from the conditions of 1982. As
the length of the extrapolation increases, so inevitably will the uncertainties. My
conclusion is that the noise contour system as currently applied by the Department of
Transport cannot be relied on. Indeed there are many indications, from the analysis of
letters to the Inquiry, from individual testimony to the Inquiry, from complaints data,
from the distribution of HACAN membership etc., as detailed in another HACAN
submission, that it considerably underestimates the degree of the present level of
disturbance.
1.6 Assessing noise disturbance is by no means a simple matter, both from the point
of view of the human response and that of the measurement of the characteristics of
the noise environment. The human response is both a physical and a psychological
reaction to noise, which depends on many factors, including pitch, tone, intermittency,
ambient noise levels and socio-economic class, and varies a great deal
from one individual to another. The physical characteristics of the noise are also rather
complicated, as reflected by the number of different measures which are used, such as
dBA, dBD, PNdB, EPNdB, Leq of many different varieties, DNL, LAmax, LNP etc.
(See Adams and McManus, Attachment 1). These all involve some form of averaging,
with different weightings, over frequency and/or time - and their validity or
appropriateness can change with time, for example as the mix of the aircraft fleet
changes, both in size and in type of engines - from propellers to turbo-props, to jets, to
fan-jets etc.
1.7 There is clearly some attraction in extracting from this complicated situation a
single index, which at Heathrow has become (16-hour, 3-month) Leq, but its
limitations should be clearly recognized, and it should not be elevated to the status of
an icon. After all, contours of Leq are just that, contours of Leq. What are really
needed are contours of disturbance. The Leq methodology can help draw a contour of
disturbance at a particular time. If by interviewing representative samples of the
affected population a strong correlation between levels of disturbance and Leq is
established then it may be valid to draw a complete contour around an airport based
on Leq calculations, thus avoiding the cost of a much larger interview programme.
Again, changes in Leq may be able to act as a surrogate for changes in disturbance
over short periods of time provided that flight numbers and average noise levels
change by only modest amounts, but when these change significantly the index needs
to be recalibrated by asking people about their subjective experience of current noise
levels. Such a recalibration is long overdue at Heathrow.
1.8 The relevance of this to the Fifth Terminal proposals is two-fold. Firstly, we
believe, and will give evidence to show, that the Leq contours, as currently
interpreted, greatly underestimate the extent of the disturbance from the airport at the
present time. The population has been subject to a huge increase in the number of
operations since the commitment to limit annual numbers to 275,000 was abandoned
in 1985, and has therefore been deprived of the improvement in the noise climate
which would have resulted from the introduction of quieter engines with a fixed
number of operations. Instead, the Leq system has been used to claim, without further
verification, that the reduction in the average noise per aircraft has more than offset
the effect of increasing numbers. This is quite contrary to the experience of
HACAN members, for whom the large increase in numbers, even with somewhat
quieter aircraft, has resulted in more disturbance rather than less.
1.9 Secondly, the projections which are being made as far ahead as 2016 are on even
more shaky ground. As has been detailed in HAC 1, we are firmly of the belief that
the full utilization of T5 would require a very substantial increase in numbers. We do
not believe the claim that the effect of such an increase would be offset by a reduction
in the noise of individual aircraft, and look to a limit on numbers as the only sure way
of improving the situation.
1.10 Such a limit could take the form of an imposed limit, as the government
currently applies its powers in relation to Stansted, for example. Alternatively, and
less satisfactorily, there could be the practical operational limit of the capacity of
Heathrow’s runways with present restrictions on night flights maintained or improved,
and the alternation system maintained. It is widely accepted that this limit is around
475,000. In HACAN’s view, a Fifth Terminal designed to take an additional 30
million passengers per annum cannot possibly be accommodated within current
operational practices.
2. The History of NNI (the Noise and Number Index)
2.1 The index NNI was established as a result of the report of the Wilson
Committee(1) (1963).
    NNI = L + 15 log N - 80
where
    L = logarithmic average of peak noise levels (PNdB),
    N = number of aircraft per day.
(Roughly, PNdB = dBA + 13.)
[The average is taken over a 12-hour day (06.00 - 18.00 GMT) for the 3 months mid-June to mid-September, and aircraft with L below 80 PNdB are not counted.]
2.1.1 Insofar as it concerned aircraft noise, the report was based on a social survey(2)
(September 1961), with a sample size of 1909, combined with measurements of noise
exposure. The NNI index was put forward as the best correlation between annoyance
and exposure to aircraft noise at that time.
2.1.2 The following correspondences were subsequently adopted, although their basis
is not entirely clear:
35 NNI    "Onset of Community Disturbance"
45 NNI    "Moderate Annoyance"
55 NNI    "High Annoyance"
2.1.3 The only such correspondence which was explicitly suggested in the Wilson
report (p. 211) was that "exposure to aircraft noise reaches an unreasonable level in
the range 50 - 60 NNI." The further correspondences seem to be based on Figure 2 of
the report, reproduced here as Fig. 1, plotting average annoyance rating against NNI.
The rating "moderate" does seem to correspond roughly to 45 NNI, but it is interesting
to note that the rating "little" corresponds to 32 NNI rather than 35 NNI. The figure of
32 NNI is also picked out in Ref. 6 (para 18) as a level below which "very few people
find noise to be a major disamenity".
2.2 A critique was given in a KACAN paper(3) in 1969.
2.2.1 The main points made were that
(a) The NNI might well have to be reviewed whenever the situation changed
qualitatively, for example by the change in the nature of the fleet.
(b) The use of log N rather than N had not been well established. This has enormous
implications when the NNI is extrapolated to larger numbers of aircraft.
(c) The duration of the noise was not taken into account.
(d) It is rather unlikely that a single variable can adequately correlate with the whole
spectrum of noise exposure and annoyance.
2.2.2 To amplify this latter point it is worth reproducing a plot (Fig. 2) of selected
NNI levels against L and N, produced by P. Davies, former general secretary of
FHANG. From this figure it can be seen that the same level of 55 NNI can be
produced by such widely differing scenarios (all per NNI day) as:
400 flights at 96 PNdB,
150 flights at 102 PNdB (the night-time take-off limit),
46 flights at 110 PNdB (the day-time take-off limit),
or 1 Concorde at take-off (130 PNdB)
(e) In terms of subjective annoyance NNI attempts to represent an average, but there
is a wide variation in people's sensitivity to noise.
(f) The Wilson Committee emphasized the tentative nature of its conclusions, which,
however, were subsequently treated as definitive.
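The multi-flight scenarios quoted in 2.2.2 can be checked against the NNI formula of 2.1 with a short Python sketch. The function name `nni` is illustrative only, and the inputs are simply the flight numbers and peak levels listed above.

```python
import math

def nni(avg_peak_pndb, flights_per_day):
    """NNI = L + 15 log10(N) - 80, as defined in 2.1."""
    return avg_peak_pndb + 15 * math.log10(flights_per_day) - 80

# (N, L) pairs for the multi-flight scenarios listed in 2.2.2
scenarios = [(400, 96), (150, 102), (46, 110)]
nni_values = [round(nni(level, number), 1) for number, level in scenarios]

for (number, level), value in zip(scenarios, nni_values):
    print(f"{number} flights at {level} PNdB -> {value} NNI")
```

Each of these very different noise climates comes out within half a unit of 55 NNI, which is the point being made about a single-variable index.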
2.3 The index was monitored in the Second Survey of Aircraft Noise Annoyance
around Heathrow(4), carried out in September 1967 with a sample size of 4699 adults
and published in 1971.
2.3.1 The authors claimed to find no increase in annoyance since the earlier survey, in
spite of the increase in aircraft numbers, and speculated whether this was evidence of
acclimatization among the population affected.
2.3.2 Using multiple regression analysis to correlate various combinations of L and N
with the annoyance scale N/1 they suggested that the degree of correlation was very
insensitive to the precise value of the coefficient K in the combination L + K
log(N+1). But in fact there was a marginally better correlation using N itself, in the
form L + 0.1(N+1)-70.
2.4 As was pointed out in the KACAN response(5), the evidence for acclimatization
was far from conclusive: there had been significant population shifts in the
intervening period, and also the alternation system had been introduced by 1967.
Alternation, or half-day noise relief, is a measure for alleviating noise disturbance
which is very important to HACAN's members, and we will return to this point later
on (§ 4.4).
2.5 The status and validity of NNI was reviewed in a DORA paper(6) in 1981 which
broadly endorsed the index, though leaving the way open for future reviews in the
light of changing circumstances, in particular the trend to a larger number of
somewhat quieter aircraft. International comparisons were made in an Annex, from
which it is to be noted that the indices used in Germany and The Netherlands were
broadly similar to NNI, while those used in several other countries, in particular the
USA, were basically of the Leq type.
3. The Change to Leq
3.1 In 1985 a CAA paper, DR Report 8402, was published(7) which recommended a
change from NNI to Leq as a physical index which correlated better with subjective
annoyance. The paper was based on noise measurements together with a social survey
carried out in 1982 in which 2097 people were interviewed, at the three London
airports Heathrow, Gatwick and Luton, and also at Manchester and Aberdeen.
3.1.1 The principal difference between NNI and Leq (Equivalent Continuous Sound
Level) lies in the relative weighting of noise and number. Leq is defined as that
continuous noise level (dBA) which, over a specified period of time, would have the
same acoustic energy as the succession of discrete noise events. If all the events have
the same duration and noise level this reduces to
    Leq = L + 10 log N + const. (dBA).
3.1.2 Thus, compared with NNI, the weighting of log N is reduced from 15 to 10,
which clearly has important implications, to be discussed below.
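A minimal Python sketch of the definition in 3.1.1 may make the reduction to L + 10 log N + const. concrete. It rests on a simplifying assumption that is not in the reports themselves: each noise event is treated as a constant level lasting a fixed number of seconds. The level, duration and number of events below are hypothetical.

```python
import math

def leq(events, period_s):
    """Energy average over period_s of events given as (level_dBA, duration_s)."""
    energy = sum(duration * 10 ** (level / 10) for level, duration in events)
    return 10 * math.log10(energy / period_s)

PERIOD_S = 16 * 3600                      # a 16-hour day, in seconds
LEVEL, DURATION_S, N = 80.0, 20.0, 300    # hypothetical identical events

direct = leq([(LEVEL, DURATION_S)] * N, PERIOD_S)

# Closed form for identical events: L + 10 log N + const.
const = 10 * math.log10(DURATION_S / PERIOD_S)
closed_form = LEVEL + 10 * math.log10(N) + const

print(round(direct, 2), round(closed_form, 2))
```

Under this assumption the constant depends only on the event duration and the averaging period, so the number of events enters solely through the 10 log N term.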
3.2 After public consultation the paper was followed up(8) in 1990 by a further
report, DORA Report 9023, which contained detailed proposals for changing from
NNI to Leq. An important element of this report was the recommendation for setting
the benchmarks in Leq corresponding to the "onset of community disturbance",
"moderate disturbance" and "high disturbance" at 57 Leq, 63 Leq and 69 Leq
respectively. As already mentioned, these had previously been set in NNI terms at 35
NNI, 45 NNI and 55 NNI respectively.
4. Critique of Leq as a Predictive Tool
4.1 Doubling the Numbers
4.1.1 Suppose that the numbers of aircraft doubled while the noise of each remained
the same. Then the Leq would increase by approximately 3 ( = 10 log 2). But it will be
noted that the increments between the different levels of subjective annoyance have
been set at 6. So apparently the annoyance would only move half way to the next
benchmark, say from 57 Leq to 60 Leq. In fact the claim is that the numbers would
have to quadruple before the population became just moderately disturbed. Or again,
given that the current number of ATMs at Heathrow (427,000) is roughly equal to the
numbers at Gatwick, Stansted and Luton combined, according to the Leq model the
population around Heathrow would hardly notice if all the latter flights were
transferred to Heathrow. This seems so patently absurd that it calls into question the
whole concept of Leq as a tool for quantifying changes in the response of the
population over time.
4.1.2 In this regard NNI is equally implausible. Doubling the numbers would lead to
an increase in NNI of 4.5, and quadrupling to an increase of 9, compared with the
steps of 10 set between the benchmarks of "onset of disturbance", "moderate
annoyance" and "high annoyance".
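The arithmetic of 4.1.1 and 4.1.2 can be set out in a few lines of Python. This is only a sketch of the two formulae already given (coefficient 10 for Leq, 15 for NNI) together with the benchmark spacings of 6 and 10; the variable names are illustrative.

```python
import math

def index_change(k, factor):
    """Change in L + k log10(N) when N is multiplied by `factor` at fixed L."""
    return k * math.log10(factor)

LEQ_K, NNI_K = 10, 15
LEQ_STEP, NNI_STEP = 6, 10   # spacing of benchmarks: 57/63/69 Leq and 35/45/55 NNI

leq_double = index_change(LEQ_K, 2)    # about 3.0
nni_double = index_change(NNI_K, 2)    # about 4.5

# Factor by which numbers must grow to move up one full benchmark
leq_factor = 10 ** (LEQ_STEP / LEQ_K)  # about 4.0
nni_factor = 10 ** (NNI_STEP / NNI_K)  # about 4.6

print(round(leq_double, 2), round(nni_double, 2))
print(round(leq_factor, 2), round(nni_factor, 2))
```

On either index, then, numbers must roughly quadruple, with no change in the noise of each aircraft, before the model moves the population from one benchmark to the next.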
4.2 Offsetting Noise vs. Numbers
4.2.1 The actual trend which has occurred over the last few decades at Heathrow, and
is likely to continue for some time, is of increasing numbers of aircraft coupled with
decreasing noise levels of individual aircraft, although, as detailed in HAC 63, the
scope for further reductions in landing noise is severely limited. Leq does not change
if, say, the number of planes is doubled but they are each 3dB quieter. This sort of
trade-off explains how the noise contours have shrunk in recent years and are
predicted to do so in general, though even then growing in some areas, in the future.
But how much credence should we place in this shrinking when it goes against all the
evidence of increasing public protest and is based on the premise that a barely
perceptible change in perceived loudness can completely offset a doubling of
numbers?
4.2.2 The trade-off in the case of NNI is of the same general nature, although the
numbers are given somewhat more weighting. There a doubling of numbers could
apparently be offset by a 4.5dB reduction in average loudness.
4.3 Spreading the Noise
4.3.1 One of the alleviative measures most valued by local residents is the alternation
system, whereby aircraft land on one runway and take off on the other for half of the
day (07.00 - 15.00), the roles of the two runways being reversed for the other half of
the day (15.00 - 23.00), and the overall pattern being rotated on a weekly basis. The
result for the majority of residents is that they suffer aircraft noise for half of the day
but have a period of respite for the other half. In the alternative scenario of mixed
mode, which would marginally increase runway capacity, the same number of aircraft
would be spread continuously throughout the day, with a longer gap between aircraft
but no respite. It is absolutely unequivocal which of these scenarios the residents
prefer (see HAC 60), yet there would be no difference in Leq.
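That Leq cannot distinguish between the two scenarios follows directly from its definition as an energy average, and can be illustrated with a Python sketch. The hourly flight numbers, the level and the duration of each flyover are hypothetical, and every flyover is assumed identical.

```python
import math

def leq_16h(hourly_flights, level_dba=80.0, duration_s=20.0):
    """Energy-average level over the day, given the flights in each hour.

    Assumes every flyover has the same (hypothetical) level and duration.
    """
    period_s = len(hourly_flights) * 3600
    total_flights = sum(hourly_flights)
    energy = total_flights * duration_s * 10 ** (level_dba / 10)
    return 10 * math.log10(energy / period_s)

alternation = [40] * 8 + [0] * 8   # eight hours of overflights, eight of respite
mixed_mode = [20] * 16             # same daily total, spread with no respite

leq_alternation = leq_16h(alternation)
leq_mixed = leq_16h(mixed_mode)
print(round(leq_alternation, 2), round(leq_mixed, 2))
```

Only the daily total enters the calculation, so the half-day of respite which residents value so highly leaves the index unchanged.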
4.4 Concentrating the Noise
4.4.1 Suppose that the Government decided that runway 27L would always be used
for landing, and runway 27R always for take-off, as indeed was the threat when a third
parallel runway was considered at Heathrow. That means that for half the population
the numbers would be roughly doubled, whereas some would have the numbers
greatly reduced. Thus at a stroke roughly half the population would be removed from
the 57 Leq contour, so that one could claim that "the number of people affected by
aircraft noise" had been drastically reduced. But of course the population still affected
would be much worse off, with many more inside the higher contours. The shape of
the whole Leq "mountain" is important, not just the headline 57 Leq contour.
5. Leq/NNI as a Snapshot
5.1 How can the evident failure of NNI or Leq to tally with people's reaction to a
changing situation be reconciled with the weighty surveys on which they were based?
To answer this question we need to look at the methodology of those surveys. To be
specific let us just refer to the ANIS survey described in Ref. 7. A sample population
was chosen at 26 sites, selected to give a wide variation in numbers N and loudness L.
Then the subjective response of the sample was compared with the physical noise data
N and L, and a search was made for a single variable of the form L + k log N which
would have the best correlation with the subjective response. The claim was that a
particular form of Leq, which corresponds to k=10 in the above expression², was well
correlated with the subjective response, better than NNI, which corresponds to k=15.
This was the basis for the change from NNI to Leq.
5.2 However, it is very important to realize that what is being undertaken is a
snapshot of the situation at a particular time, and indeed over a limited range of the
variables L and N. Thus, if we accept the results of the study it would be reasonable to
use it by interpolation to estimate the subjective response of people other than those
sampled, with intermediate values of N and L, at that time. However, what is on much
more shaky ground is extrapolation to a future situation in which the typical values of
N or L lie outside the range covered in the study. As far as Heathrow is concerned the
present numbers of aircraft indeed go beyond those measured in the survey, as
detailed in the following section.
5.3 A graphical illustration of the pitfalls of extrapolation is given in Fig. 3. This is a
hypothetical example, but one not without relevance to noise indices such as NNI or
Leq. Two completely different functions of x are plotted, one logarithmic, one linear.
Nonetheless over a limited range of x, say from 200 to 500, they agree fairly well, and
if one is a good fit to some data within that range, the other will also be a reasonable
fit. However, when the two fits are extended (extrapolated) beyond that range, they
differ markedly. Yet we cannot with any confidence prefer one curve over the other,
or indeed any other curve which would give a reasonable fit to the data within the
limited range 200-500.
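The same point can be made numerically with a short Python sketch. The logarithmic curve 15 log x is a hypothetical choice, not the function plotted in Fig. 3, and the straight line is an ordinary least-squares fit to that curve over the range 200-500.

```python
import math

def log_curve(x):
    """Hypothetical logarithmic response, 15 log10(x)."""
    return 15 * math.log10(x)

# Sample the logarithmic curve within the limited range 200-500
xs = list(range(200, 501, 25))
ys = [log_curve(x) for x in xs]

# Ordinary least-squares straight line through those samples
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

def line(x):
    return intercept + slope * x

max_gap_inside = max(abs(line(x) - log_curve(x)) for x in xs)
gap_at_1000 = abs(line(1000) - log_curve(1000))
print(round(max_gap_inside, 2), round(gap_at_1000, 2))
```

Within the fitted range the two curves differ by well under one unit, but at x = 1000 they are several units apart, and nothing in the data from 200 to 500 could say which is right.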
5.4 The point that any noise measure has a limited range of validity in time was
explicitly acknowledged in the introduction to Ref. 4, one of whose objectives was "to
investigate whether the findings of the 1961 survey remain valid in 1967" and in
paragraph 19 of Ref. 6, which states "There is now a considerable amount of
experience of the usefulness and validity of the NNI for immediate control and short
term development, but less certainty about its use for those long term planning
purposes where some new circumstances need to be envisaged. ... There is therefore
an argument in favour of testing the Index to ensure that it can continue to be
representative of annoyance in these changing conditions." It was implicitly
acknowledged by the setting up of the ANIS study, to re-evaluate NNI in the light of
changing circumstances.
6. Critique of DR 8402
6.0.1 The paper DR 8402 (Ref. 7) was the basis for the changeover from NNI to Leq,
which was claimed to have a better correlation with subjective disturbance. The
methodology was to perform multiple regression analyses to see how the various
measures of subjective annoyance were correlated with the noise data at the time of
the ANIS study (1982).
² But see §6.6
6.1 Multiple Regression Analysis
6.1.1 Multiple regression analysis (MRA) involves finding that linear combination of
independent variables which best explains the variation in a given dependent variable.
In the present case the independent variables are for the most part noise data, and the
dependent variable is some measure of subjective annoyance. The subjective measures
primarily used in DR 8402 were AVOGAS, the average annoyance rating on the (old)
Guttman scale, ARCBOTH, the percentage of the sample population considering
aircraft noise to be the most bothersome noise, VMANN, the percentage very much
annoyed by noise in general and ARCNA, the percentage finding the levels of aircraft
noise not acceptable. The noise measures included average daily numbers of aircraft
above a certain noise threshold, average peak noise levels, again above various
thresholds, NNI, and various versions of Leq. The latter were averaged over various
periods (three months, 1 week, 24 hours and 16 hours, and also over various modes of
operation of the airport). It was found that some of the correlations were significantly
improved by including WORKAP, the percentage of the sample population whose
work was in some way connected with the airport, among the independent variables.
6.1.2 As a gauge of how good a fit is, the multiple correlation coefficient, R, has a
rather simple interpretation, namely that 100×R²% of the variation of the dependent
variable (in this case AVOGAS, ARCBOTH etc.) is explained by the fit. Thus, for
example, R=0.9 (R²=0.81) means that 81% of the variation is explained by the fit.
However, it is clear that this percentage can always be increased by bringing in more
independent variables, so the number of independent variables also affects the
significance of the fit. This can be partly taken into account by using an 'adjusted'
value of R, denoted by Ra, but strictly speaking the correct test involves a statistic F
derived from R, whose significance can then be read off from published tables.
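For orientation, the quantities mentioned in 6.1.2 can be computed as in the following Python sketch. It uses the standard textbook expressions for the adjusted R² and the F statistic; it is an assumption that DR 8402 used these forms, and the numbers of data points and independent variables below are hypothetical.

```python
def fit_summary(r, n_points, n_variables):
    """Return R^2, adjusted R^2 and the F statistic for a multiple regression.

    Standard textbook formulae; n_points and n_variables are supplied by the caller.
    """
    r2 = r * r
    dof = n_points - n_variables - 1          # residual degrees of freedom
    adjusted_r2 = 1 - (1 - r2) * (n_points - 1) / dof
    f_stat = (r2 / n_variables) / ((1 - r2) / dof)
    return r2, adjusted_r2, f_stat

# R = 0.9 as in the example above; 26 points and 2 variables are hypothetical
r2, adjusted_r2, f_stat = fit_summary(0.9, n_points=26, n_variables=2)
print(round(r2, 2), round(adjusted_r2, 3), round(f_stat, 1))

# The same R obtained with more independent variables is less impressive
_, adjusted_more, f_more = fit_summary(0.9, n_points=26, n_variables=5)
```

The adjusted value is always below R² itself, and both it and F fall as independent variables are added without any gain in R.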
6.2 Unequal Sample Size/Response Rates
6.2.1 A general reservation which applies to all of the MRAs performed in DR 8402
is that the dependent variables AVOGAS etc. are treated as single, exact numbers,
whereas they are in fact averages over a sample population, each with an individual
error, or variance. In such a situation the correct procedure is known as maximum
likelihood analysis, which places less emphasis on those data which have a large error
and more on those which are better determined. The quantity to be minimized in this
case is chi-squared (χ²) rather than 1 - R². If all the samples were of the same size and
all the response rates the same this more general analysis would reduce to MRA.
However, this is not the case - the sample sizes vary from 66 to 101 (Table C2) and
the response rates from 55% to 78.3% (Table 5.1), so the results obtained by MRA in
DR 8402 are indeed subject to the above criticism. That is, the fits taking into account
the sampling errors on AVOGAS etc. would actually differ from those obtained by the
simpler analysis.
6.2.2 Moreover, in the presence of sampling errors, even if these are all equal so that
the fits are the same, the F test of the simple MRA overestimates the significance of
the fits. That is because the simple MRA is just a fit to the central values of AVOGAS
etc., ignoring the fact that these central values have an uncertainty.
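The effect described in 6.2.1 can be illustrated with a Python sketch of a straight-line fit that minimises chi-squared, which for Gaussian errors is the maximum likelihood fit. The noise levels, annoyance scores and errors are invented for the illustration and are not taken from DR 8402.

```python
def weighted_line_fit(xs, ys, sigmas):
    """Straight line minimising chi-squared = sum(((y - a - b*x) / sigma) ** 2)."""
    ws = [1.0 / s ** 2 for s in sigmas]
    s_w = sum(ws)
    s_x = sum(w * x for w, x in zip(ws, xs))
    s_y = sum(w * y for w, y in zip(ws, ys))
    s_xx = sum(w * x * x for w, x in zip(ws, xs))
    s_xy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    delta = s_w * s_xx - s_x ** 2
    slope = (s_w * s_xy - s_x * s_y) / delta
    intercept = (s_xx * s_y - s_x * s_xy) / delta
    return intercept, slope

# Hypothetical site averages: noise index and mean annoyance score
noise = [55, 60, 65, 70]
annoyance = [1.0, 2.0, 2.5, 4.5]

# Equal errors reproduce the simple (unweighted) regression
_, slope_equal = weighted_line_fit(noise, annoyance, [1.0, 1.0, 1.0, 1.0])

# A poorly determined final point carries much less weight in the fit
_, slope_unequal = weighted_line_fit(noise, annoyance, [0.1, 0.1, 0.1, 1.0])
print(round(slope_equal, 3), round(slope_unequal, 3))
```

With equal errors the result is the ordinary regression line; once one point is known to be less well determined than the others, the fitted slope changes appreciably.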
6.3 Extrapolation vs. Interpolation
6.3.1 Again, a legitimate use of such analyses is to interpolate within the range of
noise and numbers covered by the survey, but extrapolation beyond that range is a
much more dangerous activity. In para. 8.30 of DR 8402 it was claimed that the data
set, which was designed to include areas of high numbers/low noise, was appropriate
for future conditions. However, the authors clearly did not anticipate the scale of the
growth in numbers which has occurred since then, which means that in many areas
present flight numbers now exceed the maximum which occurred in 1982.
6.3.2 At that time the number of air transport movements (ATMs) was about
250,000, and the numbers were subject to a limit of 275,000, but this was abandoned
in 1985 following the first Terminal 5 Inquiry partly on the grounds that the airport
had essentially reached saturation point as far as runway capacity was concerned. Far
from this being the case, the latest figures (January 1997) show a throughput of
427,000 ATMs.
6.3.3 In 1982 the sample area affected by the largest number of aircraft movements
was East Sheen, which suffered from a daily (24-hour) average of 319 movements in
worst mode (westerly landing on either runway). Today the corresponding figure
would be 420,000/365/2 = 575 movements, an 80% increase. Even those areas
affected by only a single runway are now subjected to 288 flights per day, close to the
maximum in 1982. Thus the extension to today's situation of fits set up on the basis of
the 1982 measurements does indeed involve a considerable amount of extrapolation.
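The figures in 6.3.3 follow from simple arithmetic, reproduced here as a Python sketch. The division by two is taken to reflect that half of all movements are landings, which is an interpretation of the expression above rather than something stated explicitly.

```python
ANNUAL_ATMS = 420_000     # annual figure used in 6.3.3
EAST_SHEEN_1982 = 319     # daily movements in worst mode, 1982 survey

# Assumed: half of all air transport movements are landings
landings_per_day = ANNUAL_ATMS / 365 / 2
increase_pct = (landings_per_day / EAST_SHEEN_1982 - 1) * 100

# Areas under the approach to only one of the two runways
single_runway_per_day = landings_per_day / 2

print(round(landings_per_day), round(increase_pct), round(single_runway_per_day))
```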
6.4 The Ratio k
6.4.1 Coming now to the actual MRAs performed in DR 8402, the basis for the
preference of Leq over NNI was an analysis which examined the correlation of
AVOGAS (the average annoyance level on the old Guttman scale) with noise, in the
form of average peak noise level L over various periods (3-month, 1-week, 24-hour
day, 16-hour day) and with various thresholds (80, 75, 70 dBA), and the
corresponding numbers N of aircraft (actually the logarithms of those numbers). The
aim of the analysis was to find the ratio k of the coefficient of log N to that of L in the
linear fit. The result from MRA1 was k=6.4 in a fit whose overall R² was 0.6667.
This seems to show that the coefficient k=15 which occurs in the NNI combination is
not validated, but it also casts doubt on the validity of the 3-month Leq which was
used in this analysis, for which the nominal k would be 10 if all the aircraft noise
events were identical. However, as discussed later, over the actual aircraft mix at the
time the coefficient is rather larger, of the order of 13. Moreover, given the value of R,
the fit is not a very good one, explaining only some 67% of the variation in AVOGAS
(64% if Ra² is used).
6.5 MRAs with 1-week Leq
6.5.1 The best fits (MR7) to the annoyance data are obtained by using 1-week 24-hour Leq (i.e. the 24-hour Leq in the week immediately preceding the interview) as
the independent variable, taking into account the percentage of the sample
(WORKAP) whose employment was connected with the airport. For some of the
dependent variables (AVOGAS and ARCBOTH) a jump at 57 Leq was also
introduced.
6.5.2 For example MR7B (VMANN vs. W1LQ24 and WORKAP), which is shown
as a graph of the adjusted VMANN against W1LQ24 in Fig. 9.4, reproduced here as
Fig. 4, gives R2 = 0.8402 (Ra2 = 0.8283), and thus explains 84% of the variation.
However, it is very important to note that in the production of noise contours the
three-month version of Leq (M3LQ) has to be used for obvious practical reasons, so
this correlation is not as useful as it might seem.
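The adjusted value Ra2 quoted alongside R2 is presumably the standard adjustment for the number of fitted variables, Ra2 = 1 - (1 - R2)(n - 1)/(n - p - 1). The sketch below applies that formula. The number of sample areas n is not given in this section; the value n = 30 used here is purely an assumption chosen for illustration, as is the function name.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Standard adjusted R-squared for n data points and p independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

N_AREAS = 30  # hypothetical number of sample areas; not stated in the text

# MR7B: VMANN vs. W1LQ24 and WORKAP (p = 2), R2 = 0.8402
print(round(adjusted_r2(0.8402, N_AREAS, 2), 3))

# MRA1: AVOGAS vs. L and log N (p = 2), R2 = 0.6667
print(round(adjusted_r2(0.6667, N_AREAS, 2), 3))
```

With this assumed n the results lie close to the quoted 0.8283 and 64%, but a different n would shift them slightly.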
6.6 3-month Leq and NNI
6.6.1 In fact, if one compares NNI with M3LQ24 (3-month, 24-hour Leq), they are
very highly correlated. This is shown in Fig. 9.1 of DR 8402, reproduced here as Fig.
5. The correlation coefficient r is not quoted, but in fact is 0.98647 (r2=0.97), which
means that the correlation is very good indeed (see footnote 3).
6.6.2 Another way of looking at this is an analysis, not performed in the paper, of
M3LQ versus M3L80 and LM3N80, namely the 3-month average of peak noise levels
with a threshold of 80dBA and the logarithm of the corresponding numbers. A similar
analysis can be performed for thresholds of 75 and 70dBA. The correlation is actually
better than any of the multiple correlations presented in the paper, with R2 = 0.979,
but the relative coefficient of log N to L is 12.9 rather than 10. As mentioned above,
the ratio 10 is what one would obtain if all the aircraft noise events were identical, but
this analysis shows that over the spread of aircraft types at the time of the survey the
coefficient was approximately 13, quite close to the 15 of NNI. This explains why,
over that data sample, the two are not so different. However, it is not at all clear that
this relation will stay the same as time progresses and the aircraft mix changes, and we
see again the possible dangers of extrapolation.
7. Critique of DORA 9023
7.1 16-hour Leq vs. 24-hour Leq
7.1.1 In DORA 9023, it was decided to use 16-hour Leq, i.e. Leq averaged over the
period 0700-2300 local time, rather than 24-hour Leq, for noise classification
purposes. It is therefore necessary to establish the relationship between 16-hour and
24-hour Leq, shown in Fig. 6, and between NNI and 16-hour Leq, shown in Fig. 7. To
a good approximation M3LQ16 = M3LQ24 +1.3 dBA under the conditions prevailing
in 1982.
7.1.2 The reason for the preference for 16-hour Leq was the perception, with which I
would concur, that night-time disturbance is a separate problem from day-time
disturbance. Moreover, the averaging process implicit in Leq is likely to be much less
appropriate for night-time disturbance because, in the middle of the night at least, the
disturbance is produced by well-separated individual noise events.
(Footnote 3: The linear relationship is of the approximate form NNI = 1.36 M3LQ24 - 41.)
7.1.3 However, the choice of 16-hour Leq has the inevitable result that any change in
the pattern of night-time disturbance, such as the recent reduction of the night quota
period, which caused a huge community reaction, is not reflected at all in the
published (day-time) Leq contours. This point is discussed in more detail in §8.2.
7.2 Benchmark for High Annoyance
7.2.1 The benchmark for high annoyance was eventually set at 69 Leq (16 hours),
which corresponds roughly to 67.4 Leq (24 hours). This seems not unreasonable,
always remembering that the analysis refers to the situation in 1982, given that at this
level of 24-hour Leq aircraft noise was found to be very much annoying to about 2/3
of the population, "not acceptable" to about 3/4 and the "most bothersome noise" to
9/10 (para 9.16, p. 62 of DR 8402). It corresponds to 51 NNI according to the
regression analysis of Fig. 7. By 1988, however, the relationship had changed(8), so
that 69 Leq (16 hours) then corresponded to 53.5 NNI. In either case the NNI value is
less than 55, so the upper benchmark seems to err on the safe side.
7.3 Benchmark for Onset of Community Annoyance
7.3.1 The benchmark for "Onset of Community Disturbance" was set at 57 (16 hours)
Leq, which would correspond to 55.5 Leq (24 hours), or 34.5 NNI. Part of the basis
for this decision, as discussed in para 2.4.2 of the paper, was the set of figures 1-5,
plotting various indicators of community disturbance against (3-month, 24-hour) Leq.
It is worth examining Figures 1 and 2, reproduced here as Figs. 8 and 9, in more
detail, since they correspond directly to a figure already given in DR 8402, Fig. 7.5
(here Fig. 10), which plots VMANN, the percentage of the population very much
annoyed by aircraft noise, against three-month, 24-hour Leq, M3LQ24.
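As a rough cross-check of the conversions quoted in 7.2.1 and 7.3.1, the approximate linear relationship between NNI and 3-month, 24-hour Leq given in footnote 3 (NNI = 1.36 M3LQ24 - 41) can be applied to the 24-hour values in the text. This is a sketch only: the text's own figures are attributed to the regression of Fig. 7, not to this formula, and the function name is illustrative.

```python
def nni_from_m3lq24(leq_24h: float) -> float:
    """Approximate NNI from 3-month, 24-hour Leq (footnote 3, 1982 data)."""
    return 1.36 * leq_24h - 41

# 24-hour equivalent of 69 Leq (16 hours), as quoted in 7.2.1
print(round(nni_from_m3lq24(67.4), 1))

# 24-hour equivalent of 57 Leq (16 hours), as quoted in 7.3.1
print(round(nni_from_m3lq24(55.5), 1))
```

The results, about 50.7 and 34.5 NNI, agree with the 51 and 34.5 NNI given in the text.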
7.3.2 It is noteworthy that this latter figure, which incorporates error bars, gives no
evidence of any threshold around 55 Leq. The stretched scale and the suggestive
shading in Fig. 1, and the aggregation of points in Fig. 2, of DORA 9023 are artefacts
which tend to give the misleading impression of such a threshold.
7.3.3 Two further points should be made in this connection. Firstly these two figures
make no allowance for WORKAP, the percentage of the population whose work is
associated with the airport. This latter was shown in DR 8402 (in conjunction with 1-week Leq) to be an important "confounding factor", which tends to mask the true
annoyance. A multiple regression analysis, not performed in DR 8402, of VMANN
vs. M3LQ24 and WORKAP gives a fitted value of very nearly 20% at 55.5 Leq (24
hours), corresponding to 57 Leq (16 hours), for the adjusted percentage very much
annoyed (Fig. 11).
7.3.4 Secondly all such fits have a substantial margin of uncertainty, i.e. there is a
large spread about the regression line. In this particular fit the standard deviation is
7.37. There is a 30% chance that the true value of VMANN differs from the regression
equation by more than one standard deviation, and a 5% chance that it differs by more than two
standard deviations, i.e. by about 15 percentage points. Thus it can be very misleading to draw a
regression as a line if one forgets the broad swathe of uncertainty on either side.
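The probabilities quoted in 7.3.4 are the usual tail probabilities of a normal distribution. A minimal sketch, assuming the residuals about the regression line are normally distributed (an assumption not stated explicitly in the text; the function name is illustrative):

```python
import math

def two_sided_tail(n_sd: float) -> float:
    """P(|Z| > n_sd) for a standard normal variable Z."""
    return math.erfc(n_sd / math.sqrt(2))

SD = 7.37  # standard deviation of the fit, in percentage points of VMANN

print(f"beyond 1 sd: {two_sided_tail(1):.1%}")
print(f"beyond 2 sd: {two_sided_tail(2):.1%}")
print(f"2 sd = {2 * SD:.1f} percentage points")
```

This gives roughly 32% and 5% for one and two standard deviations, with two standard deviations amounting to just under 15 percentage points.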
7.3.5 These points apply to each of the benchmarks, but are perhaps of particular
importance when one is trying to establish a "threshold of annoyance".
7.3.6 The ultimate criteria for the choice of scales were derived from comparisons
between average GAS scores, NNI and 3-month, 16-hour Leq in 1982, and also in
unpublished Leq measurements made in 1988. The three options for translating
between the old benchmarks in NNI and new benchmarks in 16-hour Leq, as
presented in a table at the bottom of p. 29 of DORA 9023 were:
1) The best fit between NNI and Leq in 1982, without reference to the data on
subjective annoyance.
2) The 1982 Leq values which corresponded to the same GAS scores which in the
original social survey of 1967 occurred at 35, 45 and 55 NNI respectively.
3) The best fit between NNI and Leq in 1988, again without reference to the data on
subjective annoyance.
7.3.7 Of these, the second seems the most reasonable. After all, it is subjective
annoyance which we are trying to gauge. However, it is important to note that the
correlation between AVOGAS and 3-month, 16-hour Leq is not particularly
impressive, especially when the WORKAP factor is not taken into account. (In DR
8402 the best correlation between AVOGAS and the noise metrics involved both
WORKAP and a step at 57 Leq (1 week, 24 hours).) The correlation coefficient r is
only 0.76 (r2 = 0.57), and correspondingly the standard deviation is large, as can be
seen visually from Fig. 12. Its actual value is 0.6. To put this number into context the
differences in AVOGAS between "low", "moderate" and "high" annoyance in 1967
were 0.82. Thus the error in this relation is of the order of the intervals between the
different categories. For a similar reason the reduction in AVOGAS at 35 NNI
between 1967 and 1982 is statistically insignificant.
7.3.8 It was ultimately decided to set the lower threshold at 57 Leq (3 months, 16
hours). From the point of view of AVOGAS this is not an unreasonable figure. But,
as we have stressed above, it is essentially impossible to locate the threshold, if indeed
one exists, with any precision. The reasons, again, are that the correlation between
AVOGAS and Leq is not very strong, the errors are large, and the confounding factor
of WORKAP has not been taken into consideration. Moreover, we should never forget
that the analysis was based on a social survey carried out in 1982, since when
conditions at Heathrow have changed dramatically.
7.3.9 The measure VMANN discussed above shows no threshold at all. Another
measure used extensively in DR 8402, but not used in DORA 9023, is ARCBOTH,
the percentage of the population considering aircraft noise to be the most bothersome
noise. When fitted against 3-month, 16-hour Leq, taking into account the WORKAP
factor, this measure seems to exhibit a step at about 60 Leq, as shown in Fig. 13.
However, the average adjusted percentage of the nine sample areas below this step is
some 36%, so it hardly represents a threshold of disturbance. At 57 Leq the fitted
value is 39%.
7.3.10 This section has unfortunately been rather technical, but it is important to
distinguish between all the different variants of Leq which have been used, and to
point out the large errors involved, bearing in mind that an error of just 3dB in setting
the threshold could "justify" a doubling of aircraft numbers! The authors of DORA
9023 themselves (p. 30) admit that "This kind of analysis has to be largely a matter of
judgement: there are statistical and methodological uncertainties and the numbers are
indicative rather than definitive". Nonetheless the figure of 57 Leq (3 months, 16
hours) has been elevated to the status of an icon, and the numbers inside the 57 Leq
contour are confidently equated by government ministers, and in BAA's statement of
case, to the numbers of people actually disturbed at any time, now or in the future.
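The remark about a 3 dB error follows from the logarithmic form of Leq: for identical noise events Leq rises by 10 log10 of the factor by which their number increases. A short sketch of the inverse calculation, under that identical-events assumption (the function name is illustrative):

```python
def traffic_factor(delta_db: float) -> float:
    """Factor by which the number of identical noise events can rise
    for an increase of delta_db in Leq."""
    return 10 ** (delta_db / 10)

# A 3 dB shift in the threshold corresponds to about twice the traffic.
print(round(traffic_factor(3), 2))
```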
8. Leq as a Measure of Disturbance at the Present Time
8.0.1 I have cast doubt on the validity of Leq for making predictions about future
levels of community disturbance from aircraft noise. In addition there are many
reasons, apart from those mentioned in the previous section, for believing that the 57
Leq contour seriously underestimates the current level of disturbance.
8.1 It is clear a priori that the 57 Leq contour is only one aspect of the whole noise
climate of the airport. As discussed above, it is very important to consider the
populations inside the contours of higher levels. It is equally important not to neglect
the lower levels. It is a myth to suppose that the disturbance ceases abruptly at 57 Leq:
there is obviously not a sharp cutoff. In a separate proof of evidence (HAC 64)
HACAN will show that there are large numbers of HACAN members, affiliated
amenity societies, complainants and registered objectors to this Inquiry lying outside
the 57 Leq contour.
8.2 The estimates for Leq are derived for a 16-hour day (07.00 - 23.00) during a three-month period in the summer. Thus they take no account of night flights, which are
known to be a huge cause of distress to residents and the most frequent cause of
complaint, nor of the noise in the "shoulder hours", which has increased dramatically
in recent years. To spell this out in a little more detail, the present night noise regime,
which dates from 1993, specifies two periods: the "night period" 23.00 - 07.00, when
a take-off noise limit of 102 PNdB and various other restrictions apply, and the "night
quota period", from 23.30 - 06.00, when the number of aircraft is restricted. The
"shoulder hours" are the night-time periods either side of the quota period, namely
23.00 - 23.30 and 06.00 - 07.00. The end of the period when a numerical limit applied
during the winter was brought forward from 06.30 to 06.00, and between 1991 and
1996 the number of flights between 06.00 and 07.00 increased from 7,301 to 11,924
(letter from NATS to Dr. J. Cavalla). None of these changes is reflected in any way in
the calculation of the 57 Leq contour.
9. Conclusions
9.1 The crucial question we have been addressing is the validity of Leq as an
objective measure of subjective disturbance, and in particular whether the 57 Leq
contour accurately delineates the boundary beyond which aircraft noise is not
perceived to be a significant problem.
9.2 The first issue is whether Leq is well-correlated with subjective disturbance. In
DR 8402 the best such correlations involved W1LQ24, the 1-week, 24-hour version of
Leq, and took account of the WORKAP factor. But the usefulness of such a
correlation is not at all clear, since for practical purposes one is forced to use the 3-month version of Leq.
9.3 This was indeed done in DORA 9023, and the picture was further complicated by
the eventual decision to use 16-hour Leq, which did not feature at all in the multiple
regression analyses carried out in DR 8402. Moreover, no account was taken of the
WORKAP factor, which had proved so important in the earlier document. Thus there
is an unfortunate mismatch between DR 8402, which finds good correlation using an
impractical version of Leq, and DORA 9023, which uses M3LQ16, which has only a
moderate correlation with subjective annoyance.
9.4 The second issue, the identification of 57 Leq (M3LQ16) with the onset of
community disturbance, is even more fraught. In many subjective measures there is no
clear threshold at all and, because the correlation is not particularly high, the errors are
very large, and yet in terms of population, the difference between 57 Leq and, say, 54
Leq is considerable. As quoted above, the authors of DORA 9023 acknowledged the
limitations of this analysis, but, as with NNI, such reservations have tended to be
ignored.
9.5 In any case, all of this discussion refers to a social survey conducted in 1982,
when conditions at Heathrow were very different. From many other indicators,
discussed in section 8, it seems clear that the 57 Leq contour does not now have the
significance attributed to it, if indeed it ever had. What is clearly needed, following
the precedents of 1967 and 1982, is another social survey to recalibrate the contours to
present conditions. In the absence of such a recalibration the Inquiry cannot place any
weight on Leq contours as currently interpreted.
Definitions
Decibel: L = 10 log10[(p/pref)**2].
A-weighting: The pressures at different frequencies are weighted differently, to
mimic the response of the human ear.
Typical sound levels (from Ref. 1):
65 dBA   Busy restaurant or canteen
69 dBA   Vacuum cleaner in home (at 10')
76 dBA   Inside compartment of suburban electric train
80 dBA   Ringing alarm at 2'
86 dBA   Printing press (medium size automatic)
92 dBA   Heavy diesel vehicle at 25'
The present take-off limits at Heathrow are 110 PNdB = 97 dBA during the day and
102 PNdB = 89 dBA at night.
Leq: Leq = 10 log10 <(pA/pref)**2>, where <...> denotes the average over some time period,
= 10 log10 <10**(LA/10)> = SEL + 10 log10 N - const.
M3LQ24: 3-month, 24-hour Leq
W1LQ24: 1-week, 24-hour Leq (measured in the preceding week)
M3LQ16: 3-month, 16-hour Leq (from 0700 hours to 2300 hours)
AVOGAS: average score on the old Guttman annoyance scale
ARCBOTH: percentage of the population finding aircraft noise the most bothersome
noise
ADJBOTH: ARCBOTH adjusted for the WORKAP factor
VMANN: percentage of the population finding aircraft noise very annoying
ADJVM: VMANN adjusted for the WORKAP factor
ARCNA: percentage of the population finding aircraft noise not acceptable
WORKAP: percentage of the population whose work is connected with the airport
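The definition of Leq given above can be illustrated with a short sketch. The first function takes the energy average of a set of equally weighted A-weighted levels. The second applies the form SEL + 10 log10 N - const for N events that all have the same SEL, taking the constant to be 10 log10 of the averaging period in seconds. That is the usual convention when SEL is normalised to one second, but it is an assumption here, since the definition above leaves the constant unspecified. Function names are illustrative.

```python
import math

def leq_from_levels(levels_db):
    """Energy average of equally weighted A-weighted levels, in dBA."""
    mean_energy = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)

def leq_identical_events(sel_db, n_events, period_s):
    """Leq over period_s seconds for n_events events of equal SEL.
    Assumes const = 10*log10(period in seconds)."""
    return sel_db + 10 * math.log10(n_events) - 10 * math.log10(period_s)

# The energy average is dominated by the louder level:
print(round(leq_from_levels([70, 60]), 1))
```

The average of 70 and 60 dBA comes out at about 67.4 dBA rather than the arithmetic mean of 65, which is why a few loud events can dominate Leq.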
References
1) Noise: Final Report. (The 'Wilson Report')
Cmnd. 2056, HMSO (1963). Reprinted 1971.
2) Aircraft Noise Annoyance Around London (Heathrow) Airport
SS 337 (1963).
3) The Noise and Number Index.
Kew Papers on Aircraft Noise, no. 3.
4) Second Survey of Aircraft Noise Annoyance around London
(Heathrow) Airport.
MIL Research Limited.
HMSO 1971.
5) The Second Survey of Aircraft Noise Annoyance around London
(Heathrow) Airport.
Kew Papers on Aircraft Noise, no. 4.
6) The Noise and Number Index.
DORA Communication 7907 (1981).
7) United Kingdom Aircraft Noise Index Study: main report.
P. Brooker, J. B. Critchley, D. J. Monkman and C. Richmond.
DR Report 8402 (1985).
8) The Use of Leq as an Aircraft Noise Index
J. B. Critchley and J. B. Ollerhead.
DORA Report 9023 (1990)
Figures
Fig. 1: Annoyance rating vs. NNI. (Ref. 1, p. 208)
Fig. 2: NNI levels vs. L and N. (P. Davies)
Fig. 3: Extrapolation vs. interpolation
Fig. 4: Adjusted percentage 'very much annoyed' vs. 1-week, 24-hour Leq. (Ref. 7, p. 108)
Fig. 5: NNI vs. 3-month, 24-hour Leq. (Ref. 7, p. 105)
Fig. 6: 3-month, 16-hour Leq vs. 3-month, 24-hour Leq. (constructed from data given in Ref. 7)
Fig. 7: NNI vs. 3-month, 16-hour Leq. (constructed from data given in Ref. 7)
Fig. 8: Percentage 'very much annoyed' vs. 3-month, 24-hour Leq. (Ref. 8, p. 35)
Fig. 9: Aggregated graph derived from the previous figure. (Ref. 8, p. 35)
Fig. 10: Percentage 'very much annoyed' vs. 3-month, 24-hour Leq. (Ref. 7, p. 102)
Fig. 11: Adjusted percentage 'very much annoyed' vs. 3-month, 24-hour Leq. (constructed from data given in Ref. 7)
Fig. 12: Average annoyance rating on the old Guttman scale vs. 3-month, 16-hour Leq. (constructed from data given in Ref. 7)
Fig. 13: Adjusted percentage finding aircraft noise to be the most bothersome noise vs. 3-month, 16-hour Leq. (constructed from data given in Ref. 7)