WORD-RECOGNITION COMPUTER PROGRAM
BERNARD GOLD
TECHNICAL REPORT 452
JUNE 15, 1966
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
RESEARCH LABORATORY OF ELECTRONICS
CAMBRIDGE, MASSACHUSETTS
The Research Laboratory of Electronics is an interdepartmental
laboratory in which faculty members and graduate students from
numerous academic departments conduct research.
The research reported in this document was made possible in part by support extended to the Massachusetts Institute of Technology, Research Laboratory of Electronics, by the JOINT SERVICES ELECTRONICS PROGRAMS (U.S. Army, U.S. Navy, and U.S. Air Force) under Contract No. DA36-039-AMC-03200(E); additional support was received from the National Science Foundation (Grant GK-835), the National Institutes of Health (Grant 5 PO1 MH-04737-06), and the National Aeronautics and Space Administration (Grant NsG-496).
Reproduction in whole or in part is permitted for any purpose of
the United States Government.
Qualified requesters may obtain copies of this report from DDC.
MASSACHUSETTS INSTITUTE
OF TECHNOLOGY
RESEARCH LABORATORY OF ELECTRONICS
Technical Report 452
June 15, 1966
WORD-RECOGNITION COMPUTER PROGRAM
Bernard Gold
(Manuscript received April 14, 1966)
Abstract
A word-recognition computer program has been designed and tested for a vocabulary
of 54 words and a population of 10 male speakers. The program performs the functions
of segmentation, measurements on the segments, and decision making. Out of the
540 words, 74 were incorrectly classified by the program.
TABLE OF CONTENTS

I. General Discussion
II. Synopsis of Present Work
III. Description of the Computer Facility
IV. Finding the Word Boundaries
V. Segmentation
VI. Description of Measurements
VII. Discussion of the Formant Measurements
VIII. Decision Making
IX. Preliminary Results
X. Future Plans
Appendix I. Word List
Acknowledgment
I. GENERAL DISCUSSION
Recent work has led to some results in automatic word recognition which are reasonably encouraging. The results of this work are recorded here. A good deal of the report
sets forth ideas and opinions about the scope of automatic speech recognition, technological aims, and other aspects of this many-faceted problem.
In speaking of automatic speech recognition, we must quickly delineate the problem.
To begin with, the general problem of conversing with a machine as if it were a human
is of such vast scope and such great complexity that it cannot seriously be considered to
be of immediate technological interest.
On the other hand, greatly constrained versions
of the problem are of interest, and first we shall enumerate and speculate about the consequences of these constraints.
A vital and far-reaching constraint is that of having the (speech recognition) machine
recognize only disconnected words, rather than connected speech. Thus, the speaker
utters a single word, the machine analyzes the word and tries to recognize the word by
comparison against a stored vocabulary. The game may be altered to allow the machine
to 'refuse to recognize' the word if it does not adequately match any of the words in the
stored vocabulary.
The ability to build a machine that recognizes words does not imply an ability to build
a machine to recognize connected speech. A word spoken in isolation can be very different from the same word spoken in a sentence. The same word spoken in different sentences may again be greatly changed. How to segment speech into words is, to begin
with, an unsolved problem. Thus, work on word recognition is not necessarily the correct path to work on speech recognition.
Why not view, however, the problem of word recognition as an aim in itself? For technological applications, recognition of words is probably as useful as recognizing connected speech. Even if one wanted to impress a multiword thought on a machine, the speaker should have little trouble in pausing sufficiently long between words to make automatic word segmentation a not too difficult matter.
Accepting, then, the restriction that a reasonable goal in speech recognition is that
of word recognition, we next inquire as to a feasible vocabulary size. In particular, we
should like to be able to make some good guesses as to how complex the machine
becomes as a function of vocabulary size. To make these guesses, however, requires
information on how many speakers the machine is expected to respond to. We know, for
example, that a digit recognizer responding only to 'his master's voice' is quite simple.
If you want the machine to respond to a wide variety of speakers, even the digit recognizer becomes appreciably more complex but the problem is still quite manageable.
Extending the vocabulary to approximately 50 words for a variety of speakers seems to
require an additional order of complexity, and it is to this step that the work described
here is devoted.
Given the acoustic signal representing the input word, the machine must make
measurements on the signal, the result being a set of numbers.
If n measurements are
performed, the set of n numbers derived can be thought of as a point in an n-dimensional
measurement space.
The position of the point 'defines' the word.
The ability of the
machine to recognize a word depends on (a) the compactness of that part of the space
containing all versions of the same word uttered by different speakers, and (b) the adequate diffusion of the other words over the space.
In our opinion, these attributes derive
primarily from the exact nature of the measurements chosen.
The argument has often
been advanced that the type of measurement is not really too important because by means
of statistical models one can derive transformations that will generate 'good' measurements from the original 'poor' measurements.
Although such arguments are valid for
many classification problems, we think that successful word recognition depends on
acoustic rather than statistical insights.
A spoken word is uniquely defined as a string of phonemes.
The natural approach
to word recognition would appear to be, first, to find the boundaries separating all of the phonemes and, second, to classify each phoneme.
The first problem is called
segmentation.
We would like to make the point that recognition of words from vocabularies of 20 or
more words (provided that a special monosyllabic vocabulary is not chosen) requires
some sort of segmentation.
Thus far, it seems to be true that rules for consistent seg-
mentation of a word into phonemes are very difficult to invent. In Section V the segmentation rules that were used for the present work are given. There the reader will see to
what extent our segmentation is phonemic and to what extent we have avoided some of the
more difficult segmentation problems.
A few comments will now be made on how to judge the effectiveness of a word recognizer.
For example, the word 'address' may be stressed on the second syllable by nine
out of ten speakers.
If the tenth speaker stresses the first syllable, should the machine
be blamed for an error?
Actually, a man listening to these 10 utterances might say that
the speaker had, in some sense, made an error.
The clear distinction between the two
ways of saying 'address' or 'advertisement' suggests that perhaps both versions of these
words be stored in the vocabulary.
This argument could, of course, be extended to say
that whenever the recognizer failed to make consistent measurements on the same word
spoken by different speakers, then that same word be redefined as two or more words in
the vocabulary. This extension has limitations, however, since words are more difficult
to recognize in a larger vocabulary.
Mainly in the interests of simplicity, no attempt has been made here to enlarge the
vocabulary for the purpose of encompassing different ways of saying the same word.
It
is estimated that an increase of perhaps 10 words to the present vocabulary of 54 would
have eliminated a large portion of the errors made. It seems more fruitful, at present,
to study the causes of these errors than to try to avoid them.
Having raised the question of speaker pronunciation, let us pursue this important
matter further. The word recognizer to be described in this report depends greatly on
2
having the same syllable stressed for every utterance of a given word.
In the vast
majority of cases, all speakers naturally stress the same syllable and the machine correctly notes this; in several words, however, stress is ambiguous. Although recognition
is often successful even then, this success is usually quite precarious.
Based on exper-
ience with this recognizer, it is thought that instructing the speaker to stress properly
would improve the performance.
Another worthwhile demand on the speaker is that he
explode his final p's, k's and t's, rather than swallow them, as speakers often do.
Finally, if close-talking microphones are used, avoidance of "breathiness" is desirable
to prevent low-frequency "wind" noise. Such noise may cause the recognizer to mistake
an unvoiced sound for a voiced sound.
Three areas for which word recognition may be a pertinent problem are speech bandwidth compression, voice-actuated mechanisms, and 'talking to computers'. It is well
appreciated that a word recognizer and word synthesizer combine to create a speech
bandwidth compression system yielding very close to the ultimate bandwidth.
The prob-
lem of preserving the speaker identity has not yet even been attempted. Furthermore,
the very small vocabulary of 50-100 words that might be available, at present, would
indeed limit the utility of such a device for speech communication.
Whether voice-actuated mechanisms would be beneficial is really impossible to know.
Perhaps one reason for wanting to build a word recognizer is to satisfy one's curiosity
about its effect on other people.
Talking to computers appears to be the most promising immediate application of a
word recognizer.
Many computer people argue that a typewriter input is more reliable and uses less storage. Such an argument was once very potent, but recent work wherein man and machine must both react quickly to each other has made it less convincing. One
could cite as a possible application of a word recognizer the area of computer-controlled
graphic communication.
Computer rooms are usually noisy places, hardly a suitable environment for a word
recognizer.
To minimize these disturbances to the word recognizer requires that
special attention be paid to the microphone that is used and exactly how it is mounted on
the speaker.
This particular problem has been avoided here; the words were recorded
in a fairly quiet environment (not an anechoic chamber).
Thus, we can only speculate
about possible difficulties and how to treat them.
In 'talking to computers', the speaker may very well want to keep his hands free for
such purposes as typing and for control of an oscilloscope display with a light pen.
Microphone mounts of the type worn by telephone operators are perhaps most practical
under these conditions; these microphones ought to have noise cancellation and close-talking features, if possible. If the resulting protection against spurious triggering of the recognizer is still insufficient, special circuitry might be necessary. Design of a satisfactory acoustic environment might well be one of the interesting 'tangential' problems of talking to computers.
II. SYNOPSIS OF PRESENT WORK
Presentation of this synopsis has been deferred to make clearer the chosen boundaries of the present work. First, recordings were made of 10 male speakers, each of whom read the same list of 54 words. The chosen words were computer-oriented; the list is given in Appendix I. Each word was analyzed by a 16-channel spectrum analyzer covering the 180-3000 cps band. Also, the words were passed through a pitch extractor and voicing detector. All processed words were passed through the word-recognition program, from which the results of the measurements were tabulated. The main function of the program was to segment the word, make 15 measurements after segmentation, and file these measurements for each of the 540 spoken words. The tabulation of all of these measurements was printed and, after rather lengthy inspection of the data, a decision algorithm was devised. All 540 words were then passed through this algorithm and the results tabulated and printed. The rest of this report describes, in more detail, the computer facility, segmentation, the measurements, and the decision algorithm, and discusses the results that were obtained.
III. DESCRIPTION OF THE COMPUTER FACILITY
For convenience of discussion, the various programs are divided into three compartments, called data gathering, compilation, and decision making. The first, data gathering, is shown in Fig. 1. It consists of a real-time read-in, into TX-2 core memory, of the spectrum, voicing decision, and voice fundamental frequency. All spectrum channels are sampled every 2.5 msec, and other information is sampled every 5 msec.
Fig. 1. Data-gathering mode of the word-recognition program.
The next step is compilation, whereby the digital tapes are played back into core
memory and the results of the measurements compiled.
Approximately one hour of
computer running time was needed to compile a complete inventory of test results on the
540 words.
The result is a three-dimensional matrix of numbers, obtained by per-
forming each measurement on every word and speaker.
In a 36-bit word-length computer, 2025 registers are needed to store these numbers as 9-bit quantities (15 measurements on 54 words by 10 speakers gives 8100 numbers, packed four to a register).
Finally, the decision-making program singles out one speaker's words and, by
matching measurements made on these words with similar measurements made on the
other 9 speakers, the program tries to recognize all words spoken by this speaker.
By
performing this operation on all of the speakers in turn, each time matching against the
other 9 speakers, a total score is finally obtained.
IV. FINDING THE WORD BOUNDARIES
Let us assume that the spectrum of the word, and the voicing decision associated with
the word, are in core memory.
Even at this early stage, minor difficulties arise in
determining the beginning and end of the word.
In order to read the word into memory,
the program must sense when the input signal exceeds a threshold. A threshold that is too low may trigger the read-in before the word is uttered. A threshold that is too high
may cause the weak 'f' in a word like 'four' to be missed, so that a word more like 'or'
will be read into memory.
Of course, careful monitoring can eliminate these troubles,
but great care consumes time and, besides, there is the tempting thought that in a limited vocabulary (containing 'four' but not 'or'), the garbled word might still be correctly
identified.
Our program was designed to eliminate noise of less than 40-msec duration that might have triggered the read-in program. This is done by measuring the average magnitude function with a running window of 17.5-msec width, as indicated in Fig. 2.
Fig. 2. Word boundary determination. (Speech level vs. time; a 17.5-msec scanning window locates the word beginning, and a 100-msec scanning window locates the word ending.)
For our purposes, 'magnitude function' is defined as the sum of the spectrum channels at any moment. Comparison with a fixed threshold determines the word beginning,
which is defined as the position of the scanning window when the threshold is first
exceeded as the window scans from left to right.
Figure 2 has been drawn to illustrate
that a spurious sample which triggered the read-in program could be eliminated.
To find the end of the word, the procedure is similar but the scan window must be
greatly widened.
A word like 'eight', if the final t is exploded, will contain an interval
of silence between the vowel and the t.
As seen in Fig. 2, the end of the word is made
coincident with the beginning of the window.
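As an aid to the reader, a minimal sketch of this boundary-finding procedure in modern terms follows. It assumes the spectrum is available as a NumPy array with one 16-channel row per 2.5-msec sample; the threshold value and the exact window arithmetic are illustrative, not taken from the program.

import numpy as np

def find_word_boundaries(frames, threshold=50.0, begin_window=7, end_window=40):
    """Locate the word boundaries from the magnitude function.

    frames: array of shape (num_samples, 16), one row per 2.5-msec sample.
    begin_window: 7 samples, roughly the 17.5-msec scanning window;
    end_window: 40 samples, roughly the 100-msec scanning window.
    Returns (begin, end) sample indices, or None if no word is found.
    """
    mag = frames.sum(axis=1)          # magnitude function: sum of channels
    n = len(mag)
    begin = None
    # The word begins where the short-window average first exceeds the
    # threshold; an isolated noise spike shorter than the window is skipped.
    for t in range(n - begin_window):
        if mag[t:t + begin_window].mean() > threshold:
            begin = t
            break
    if begin is None:
        return None
    # The word ends at the beginning of the first wide window whose average
    # stays below threshold, so the silent gap before an exploded final
    # 't' (as in 'eight') does not end the word prematurely.
    for t in range(begin + 1, n - end_window):
        if mag[t:t + end_window].mean() < threshold:
            return begin, t
    return begin, n - 1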
V. SEGMENTATION
By studying speech spectrograms and the speech magnitude function, several basic
segmentation cues emerge. These will now be discussed.
A sharp dip in the over-all magnitude function generally signifies a segment.
Thus,
in Fig. 3, the dip shown usually indicates a vowel-consonant-vowel sequence, although
the most accurate placement of the segment boundaries is not determined only from this
picture.
The cue is not perfect; for example, the speech level of the word 'zero' often
fluctuates, as shown in Fig. 4.
Fig. 3. Characteristic dip of the magnitude function during an intervocalic consonant.

Fig. 4. Magnitude function for "zero."
Cessation or initiation of voicing is another good cue which is quite well detectable
with a buzz-hiss detector of the type used in vocoders. The voicing pattern of a given
word is not unique. In the words 'divide', 'seven', 'number', the voiced fricatives and
voiced plosives may be detected as either voiced or unvoiced.
Many other such examples
can be found. As we shall see in a moment, however, this and the first segmentation cue
together serve as the primary segmentation determinants of the present program.
A rapid change in the over-all spectrum is another cue for segmentation. This 'change' we have defined by the formula

C = \frac{\sum_{i=5}^{16} |x_i - y_i|}{\sum_{i=5}^{16} (x_i + y_i)},     (1)

where x_i is the sum of 4 samples from spectrum channel i to the left of a reference time, and y_i is the sum of 4 samples from the same channel to the right of the reference. We see from Eq. 1 that C must be a positive number smaller than unity; large changes in spectrum are reflected as large values of C. Running the summation in (1) from the fifth rather than from the first spectrum channel proved a more sensitive measure of segmentation.
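A direct transcription of Eq. 1 might look as follows; the sketch assumes the same frame layout as before, with the four-sample spans and the channel range 5-16 taken from the text.

import numpy as np

def spectral_change(frames, t, span=4):
    """Eq. 1: normalized spectral change C at reference sample t (t >= span).

    x sums `span` samples of each channel to the left of t, y sums `span`
    samples to the right; only channels 5-16 enter the sums.
    """
    x = frames[t - span:t, 4:16].sum(axis=0)
    y = frames[t:t + span, 4:16].sum(axis=0)
    return np.abs(x - y).sum() / (x + y).sum()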
Based on the above-mentioned three cues, the specific segmentation algorithm that
was used can now be given.
1. Divide the word into voiced and unvoiced segments. To remove possible spurious segments, first delete all voiced segments of less than 20-msec duration and then all unvoiced segments of less than 20-msec duration.

2. Find the voiced segment having the greatest energy, defined as the product of the average magnitude function and segment duration.

3. Divide this largest segment further, if possible, by finding an "adequate dip." This "dip" is defined as the position of the smallest minimum of the magnitude function during the segment. Segment edges are ignored in searching for this dip. The dip is considered to be "adequate," and denotes a segment, if it is no greater than 1/3 of one of the maxima to either side and no greater than 0.7 of the other maximum. Given an "adequate dip," segment boundaries to both left and right are found by computing the largest value of C (in Eq. 1) between the dip and the two maxima.

4. If further division according to rule 3 succeeded, then the previous large segment obtained from rule 2 has now been fractured into three segments. The 'stressed segment' of these three is found by the location of the maximum magnitude function.

5. Further fracturing of the stressed segment into, at most, three segments is now possible. The stressed segment is scanned, and C is measured at each sample. All reference boundaries for which C exceeds a threshold are marked as boundaries. The program is constrained to allow only a "left segment" or "right segment" or both to be chopped off the stressed segment.
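The "adequate dip" test of rule 3 is the least obvious of these rules; a minimal sketch, assuming the magnitude function of the largest voiced segment is given as a NumPy array and that either assignment of the 1/3 and 0.7 ratios to the two flanking maxima is acceptable, might read:

import numpy as np

def adequate_dip(mag, edge=2):
    """Rule 3: locate an 'adequate dip', or return None.

    mag: magnitude function over the largest voiced segment; `edge`
    samples at each end are ignored in the search.  The dip qualifies
    if it is no greater than 1/3 of one flanking maximum and no greater
    than 0.7 of the other.
    """
    dip = edge + int(np.argmin(mag[edge:len(mag) - edge]))
    left_max, right_max = mag[:dip].max(), mag[dip + 1:].max()
    m = mag[dip]
    ok = (m <= left_max / 3 and m <= 0.7 * right_max) or \
         (m <= right_max / 3 and m <= 0.7 * left_max)
    return dip if ok else None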
The rules will now be illustrated for a few words. Figures 5, 6, and 7 show the spectrograms and the resultant segments obtained.

Fig. 5. Segmentation of "directive." (Spectrogram with the boundaries produced by Rules 1 and 3 marked.)

Fig. 6. Segmentation of "nine." (Spectrogram with the boundaries produced by Rules 1 through 4 and by Rule 5 marked.)

Fig. 7. Segmentation of "multiply." (Spectrogram with the boundaries produced by Rules 1 and 5 marked.)
a) directive
Rule 1 will result in dire ct ive
Rule 2 will choose dire as the large segment
Rule 3 will further segment to give di r e ct ive
Rule 4 will stress the third segment
Rule 5 does nothing
b) nine
Rules 1 through 4 do nothing
Rule 5 will yield n i ne
c) multiply
Rule 1 yields mul t i ply
Rule 2 chooses mul as the largest segment
Rules 3 and 4 do nothing
Rule 5 produces m ul t i pl y
VI. DESCRIPTION OF MEASUREMENTS
The measurement repertoire has been limited so that occasionally only a fraction of
a word is analyzed.
This can be most easily illustrated by an example.
If the word
'intersect' is operated on by the segmentation rules, the result will be
in      t      [er     s      e      ct]
p_{-4}  p_{-3}  p_{-2}  p_{-1}  p_0    p_1

Denoting the stressed segment e as p_0, we can define segments to the right of p_0 as p_1, p_2, etc., and segments to the left of p_0 as p_{-1}, p_{-2}, etc. The program analyzes only the bracketed part of the word 'intersect'. As many as two segments to the right and two segments to the left of p_0 are analyzed, the other information being ignored. The stressed segment p_0 may itself be subdivided by Rule 5, but this does not affect the treatment of the other segments.
We now enumerate the measurements.

1. Relative duration of any left segment within p_0.
2, 3, 4, and 5. Relative durations of p_{-2}, p_{-1}, p_1, and p_2.
6. Presence of a silence in p_{-1}.
7. Presence of a silence in p_1.
8, 9, 10. Spectral measurements performed on the middle third of p_0.
11. Spectral measurement on p_{-1}.
12, 13, 14, and 15. Measurements based on formant motion during segment p_0.

By relative duration is meant the ratio of the duration of a given segment to the duration of the entire word. Measurements 6 and 7 are strong cues for the recognition of p_{-1} and p_1 as voiceless plosives. A silence is considered to be present when the average speech level in a 20-msec scan window falls below a threshold.
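The silence test of measurements 6 and 7 reduces to a windowed threshold check; a sketch, with an illustrative threshold and the 20-msec window taken as 8 samples at 2.5 msec:

import numpy as np

def has_silence(mag, window=8, threshold=5.0):
    """True if any 20-msec scan window (8 samples at 2.5 msec) of the
    segment's magnitude function averages below the threshold."""
    return any(np.mean(mag[t:t + window]) < threshold
               for t in range(len(mag) - window + 1))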
Measurement 8 is defined as

M_8 = \frac{\sum_{i=1}^{8} q_i - \sum_{i=9}^{16} q_i}{\sum_{i=1}^{16} q_i},     (2)

where q_i is the average magnitude function of the middle third of the segment p_0. Similarly,
M_9 = \frac{\sum_{i=1}^{4} q_i - \sum_{i=5}^{8} q_i + \sum_{i=9}^{12} q_i - \sum_{i=13}^{16} q_i}{\sum_{i=1}^{16} q_i}     (3)
and
M_{10} = \frac{\sum_{i=1}^{4} q_i - \sum_{i=5}^{8} q_i - \sum_{i=9}^{12} q_i + \sum_{i=13}^{16} q_i}{\sum_{i=1}^{16} q_i}.     (4)
Measurement 11 is similar to M_8, except that the evaluation of q_i takes place over those 20 msec of p_{-1} having the greatest average magnitude function.

M_8 is a useful discriminant for segments with low first (F_1) and second (F_2) formants, such as in the words 'move', 'load', 'quarter'. M_9 tended to give consistently negative results for segments with high F_1 and low F_2, as in 'octal', 'half', 'add'. M_{10} gave similar results for these words, but also gave negative results for the diphthongs in 'divide', 'nine', 'binary', 'five'. M_{11} gave strong negative results for the 'S' sound.
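Using the channel groupings of Eqs. 2-4 as reconstructed above (the groupings in the original are partly illegible, so treat them as an assumption), the three spectral measurements reduce to normalized sums over blocks of channels:

import numpy as np

def spectral_measurements(q):
    """M8, M9, M10 from the 16 channel averages q of the middle third
    of p0, per Eqs. 2-4 as reconstructed above."""
    q = np.asarray(q, dtype=float)
    total = q.sum()
    m8 = (q[:8].sum() - q[8:].sum()) / total
    m9 = (q[:4].sum() - q[4:8].sum() + q[8:12].sum() - q[12:].sum()) / total
    m10 = (q[:4].sum() - q[4:8].sum() - q[8:12].sum() + q[12:].sum()) / total
    return m8, m9, m10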
VII. DISCUSSION OF THE FORMANT MEASUREMENTS
Formant information contains strong recognition cues and deserves special consideration. Two points are pertinent for the present problem. One is that the usual criterion of acceptability of a formant-tracking device, namely, that it control a formant synthesizer to produce acceptable vowel quality, does not apply. Second, formant measurements on the same vowel (in the same context) by different speakers are widely diffused. From this, it can be implied that (a) a speech-recognition "formant tracker" need not actually track formants as defined by tracing the center of the dark bars on a spectrogram; it can track some function of the spectrum which is perhaps closely related to formants and easier to measure, and (b) we should not depend too heavily on precise formant locations.

Chart 1. Attenuations defining the weights v_i and w_i.

Frequency (cps)   Db Attenuation in F_2 Measurement   Db Attenuation in F_3 Measurement
 900                     8                                   ∞
1080                     4                                   ∞
1260                     0                                   ∞
1440                     0                                   ∞
1620                     0                                   8
1800                     5                                   6
1980                    10                                   4
2160                    15                                   2
2340                    18                                   0
2520                    22                                   0
2700                     ∞                                   0
2880                     ∞                                   8
3060                     ∞                                  16

Note - This chart was obtained from a paper by J. N. Shearme, "Analysis of the Performance of an Automatic Formant Measuring System," presented at the Stockholm Speech Communication Seminar, 1962.
The definitions of formants F_1, F_2, and F_3 that are used here will now be given:

F_1 = \frac{\sum_{i=1}^{7} i x_i}{\sum_{i=1}^{7} x_i},  F_2 = \frac{\sum_{i=1}^{16} i v_i x_i}{\sum_{i=1}^{16} v_i x_i},  F_3 = \frac{\sum_{i=1}^{16} i w_i x_i}{\sum_{i=1}^{16} w_i x_i}.

Here, x_i is the i-th spectrum channel signal, and v_i and w_i are sets of weights which roughly correspond to the average frequency of occurrence of the formants in the i-th filter. Chart 1 shows the sets of weights in the F_2 and F_3 regions; weighting is uniform in the F_1 region.
The measurements 12, 13, 14, and 15 are now defined as: 12) the average value of F_1 over p_0; 13) the average value of F_2 over p_0; 14) the difference between F_2 near the beginning and near the end of p_0; and 15) the average of the absolute value of the time derivative of F_2. Thus, some information concerning formant values and some information about formant changes are included.
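The three definitions are weighted centroids over channel number, so a sketch is short. The conversion of Chart 1 attenuations to linear weights, and the mapping of chart frequencies onto channel numbers, are assumptions here:

import numpy as np

def db_to_weight(db):
    """Convert a Chart 1 attenuation (dB; np.inf for a blocked channel)
    into a linear amplitude weight."""
    return 0.0 if np.isinf(db) else 10.0 ** (-db / 20.0)

def formants(x, v, w):
    """Channel-number centroids F1, F2, F3 from one spectrum sample x
    (16 channels, as NumPy arrays).  v and w are the F2 and F3 weights
    derived from Chart 1; weighting is uniform over channels 1-7 for F1."""
    x = np.asarray(x, dtype=float)
    i = np.arange(1, 17)
    f1 = (i[:7] * x[:7]).sum() / x[:7].sum()
    f2 = (i * v * x).sum() / (v * x).sum()
    f3 = (i * w * x).sum() / (w * x).sum()
    return f1, f2, f3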
VIII. DECISION MAKING
Perhaps more time has been spent (by people purportedly working on speech recognition) in trying to devise statistical recognition models than in working closely with
actual data. We think that a good deal of this effort has been wasted, not because a
recognition model is unimportant, but because most of this work has been directed at
models that can be theoretically analyzed; for example, n-dimensional Gaussian models.
Speech data will not fit into predetermined formats and any attempt to squeeze them into
one will lower the machine's capabilities.
In the general discussion the framework of a word-recognition decision maker has
been loosely defined. Let us first somewhat formalize these ideas by constructing a
model and then describe how this model was changed to accommodate the data.
Given any one of the 15 measurements and any of the 54 words, an empirical range
for that measurement and word can be constructed. Thus, numbers obtained by applying
measurement 8 to each speaker saying the word 'load' varied from 67 to 92; let us then
define the range associated with the word 'load'
and measurement 8 as all numbers
greater than or equal to 67 and less than or equal to 92. In this way a range R_ij is defined for the i-th word and j-th measurement, and by inspection of the observed data numerical values can be assigned to every R_ij.
If we think of range as a line segment along one of the axes of a 15-dimensional
measurement space, then it is clear that any word is bounded within a 15-dimensional
"box."
Each "box" will contain all utterances of the same word.
Now, if these "boxes"
were mutually disjoint, the decision problem would be solved, at least to the extent that
all 54 words would be perfectly partitioned and thus distinguishable.
Let us be more precise about the phrase "mutually disjoint."
We mean that no two different words ever fall in the same box. Even if 14 measurements of a given word fell within the 14-dimensional projection of another word's box, if the 15th measurement did not fall into the range of the 15th measurement of the other word, the first word would not be wrongly classified as the other word.
As expected, this division of the measurement space into 'boxes' does not, for our
set of measurements, uniquely classify all utterances.
If the model were applied exactly
as described, the net result would be that many of the boxes would occupy such a large
proportion of the space that large numbers of coincident (nondisjoint) boxes would be present. For example, measurement 5 applied to the word 'cycle' covers a range, if nine out of the ten speakers are considered, from 11 to 20. This is a rather restricted region for a measurement that yields a number from 0 to approximately 70 over all 540 words; however, measurement 5 applied to the 10th speaker yields 47. Thus the range would be 11-47, much wider than before and much less capable of keeping the boxes disjoint.
This failure of the measurements to give numbers that are clustered together for all
utterances of the same word is sufficiently prevalent to make it necessary to modify the
range model.

Chart 2. Score-Probability Table.

Probability   Score
0              -20
1/9             -6
2/9             -4
3/9             -2
4/9              0
5/9              0
6/9              1
7/9              2
8/9              4
1                5
Presumably, such modification has to be probabilistic, and it is exactly
at this point that traps must be avoided.
We have very few samples of the effect of each
measurement on each word (10, to be exact), and double or triple that number would still
be small.
The kind of trouble caused is exemplified (and there are many more examples) by measurement 1, the relative duration of p_{-2}, applied to 'delete'. The numbers obtained vary from 15 to 43 for 9 of the 10 speakers. For the 10th speaker, the segmentation failed to uncover a p_{-2}, so that the measurement could not be made (if we like, we can say the result is zero). Now, from purely probabilistic considerations, it could be claimed that measurement 1 applied to 'delete' yielded a range from 15 to 43 with probability 0.9, and yielded the measurement 0 with probability 0.1. Inspection of that 10th utterance shows, however, that the stress was put on the first syllable rather than the second, and since we are unwilling to admit that 1 out of 10 speakers will say 'delete' that way, we would prefer to throw away that measurement.
Our procedure, then, is intuitive and based somewhat on personal judgments about each of the utterances. What is done is to find a range R_ij, based on 10 numbers, which may or may not encompass the actual numerical range of observed values, and which appears to be that region in which we expect the numbers to fall. Now, in assigning probabilities, when a given speaker's words are to be processed, he is first withdrawn from the population. Then, if m_1 of the data points falls within our chosen range, we assign a probability q_1 = m_1/9 to the range. If m_2 numbers were below the range, then the probability (associated with R_ij) that a subsequent number is below the range is q_2 = m_2/9, and similarly q_3 = m_3/9 is the probability that a number is above the range.
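In code, this leave-one-speaker-out bookkeeping is a small counting exercise; a sketch, with the chosen range supplied by hand as the text describes:

def range_probabilities(values, lo, hi):
    """Given the 9 remaining speakers' values for one (word, measurement)
    pair and a chosen range [lo, hi], return (q1, q2, q3): the fractions
    falling within, below, and above the range."""
    n = len(values)                      # 9 once one speaker is withdrawn
    m1 = sum(lo <= v <= hi for v in values)
    m2 = sum(v < lo for v in values)
    m3 = sum(v > hi for v in values)
    return m1 / n, m2 / n, m3 / n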
The model is now nearly complete. For each measurement and word, a range R_ij and three probabilities have been assigned. Based on these entities, a decision-making algorithm can now be described. An 'unknown' word appears. It is first matched with word 1 of the vocabulary as follows: Measurement 1 is made and it is determined whether this measurement falls within the range of word 1, below this range, or above it. Depending on the result, a probability q_1, q_2, or q_3 will be obtained. A partial score is then found from Chart 2 and saved. The total score is the sum of partial scores over all measurements. The unknown word is then matched in the same way with all other words in the vocabulary, and that word resulting in the greatest total score 'wins'; that is, the decision is made that the unknown word is that word in the vocabulary with the highest score.
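A sketch of this matching loop follows. The score table is keyed by ninths of probability, following Chart 2 as reconstructed above, and the optional threshold implements the 'no decision' variant discussed in Section IX; the data layout is an assumption.

# Chart 2 as a map from probability (in ninths) to partial score.
SCORE = {0: -20, 1: -6, 2: -4, 3: -2, 4: 0, 5: 0, 6: 1, 7: 2, 8: 4, 9: 5}

def total_score(unknown, word_model):
    """Sum of partial scores of an unknown word against one vocabulary word.

    unknown: the 15 measurement values (None where a measurement failed).
    word_model: 15 entries of the form ((lo, hi), q1, q2, q3).
    """
    score = 0
    for value, ((lo, hi), q1, q2, q3) in zip(unknown, word_model):
        if value is None:
            continue
        q = q1 if lo <= value <= hi else (q2 if value < lo else q3)
        score += SCORE[round(q * 9)]
    return score

def recognize(unknown, vocabulary, threshold=None):
    """Return the vocabulary word with the greatest total score, or None
    ('no decision') if a threshold is given and the best score fails to
    exceed it."""
    best_word, best = max(((w, total_score(unknown, m))
                           for w, m in vocabulary.items()),
                          key=lambda t: t[1])
    if threshold is not None and best <= threshold:
        return None
    return best_word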
IX. PRELIMINARY RESULTS
Our program correctly identified all but 74 of the 540 words in our catalogue, an average of approximately 86%. The errors are enumerated in the appendix, together with some supplementary data. The appendix shows that approximately 95% of the words were either correctly identified or finished second. Some of the remaining 5% included words that were either stressed wrongly, mispronounced, or not correctly read into the computer memory.
Fig. 8. (a) Probability of correct decision. (b) Probability of no decision as a function of threshold setting. (Per cent correct decision, 85-93, on the left ordinate; per cent no decision, 0-25, on the right; threshold, 0-56, on the abscissa.)
If 'no decision' is allowable, the performance of the program can be judged from
Fig. 8.
The abscissa represents a threshold setting that the highest score must exceed
for there to be a decision.
The two curves show how this threshold setting affects the
percentage of correct decisions, as well as that of no decisions.
X. FUTURE PLANS
At present, no specific plan for future work exists, mainly because the number of
possible directions is large enough to be a little confusing.
We end our report by citing
some of these possibilities.
Of great importance is the extension to more speakers.
If 10 new speakers are
added, will simple adjustments of the ranges suffice to maintain the over-all average?
How will female speech and foreign accents affect the decision making?
Can the ranges and three associated probabilities be found automatically?
Possible
enlargement of the vocabulary, in the future, may make this a very important question.
In the present work, we have avoided the problem of consistent segmenting of initial
and final voiced plosives, final voiced fricatives and nasals, and glides.
Clearly, we
must make progress along these lines, either in refining the segment measures or in
adapting the work of others to the problem.
Other problems that come to mind are further 'tuning' of the present program to improve the average by adjusting the ranges and probabilities and perhaps adding a few measurements; trying to improve the formant-tracking measurements; and experimenting with a speech signal of wider band, in the hope of finding measurement cues for distinguishing between plosives and fricatives.
Perhaps the most tempting direction is that of applying the technique to a practical
problem of 'talking to a computer'.
Appendix I. Word List
insert
delete
name
end
replace
scale
move
cycle
read
skip
binary
jump
save
address
core
overflow
directive
point
list
control
load
register
store
word
add
exchange
subtract
input
zero
output
one
make
two
intersect
three
compare
four
accumulate
five
memory
six
bite
seven
eight
quarter
half
nine
whole
multiply
unite
divide
decimal
number
octal
Acknowledgment
The work reported here was begun because of the joint interest of the author and
Professor K. N. Stevens, who has followed the progress
continuously,
contributing
valuable suggestions as well as insights on the nature of some of the more enigmatic
speech sounds.
Many of the thoughts presented here derive from conversations with him.
One result of the relative informality of the report is the absence of documentation.
The main purpose of the report was rapid dissemination of progress up to the present
time, rather than a more formal presentation.
Nevertheless, it is proper to acknowledge the important influence of previous work. Chief among these is the work of J. W. Forgie and Carma Forgie, who have made important advances in several areas of speech recognition. The author has also had the benefit of day-to-day contact, at Lincoln Laboratory, M.I.T., with J. W. Forgie. Notable among the large volume of other published material is that of George Hughes. Although no mention has been made in this report of the application of the linguistic concept of distinctive features to speech recognition, this approach, developed by Hughes, has greatly influenced the lines of investigation pursued here.
Many other significant papers have appeared over the years. An extensive bibliography appears in IEEE Spectrum of March, 1965.